create-walle 0.9.12 → 0.9.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. package/README.md +6 -1
  2. package/bin/create-walle.js +195 -30
  3. package/bin/mcp-inject.js +18 -53
  4. package/package.json +3 -1
  5. package/template/claude-task-manager/approval-agent.js +7 -0
  6. package/template/claude-task-manager/docs/session-standup-command-center-design.md +242 -0
  7. package/template/claude-task-manager/git-utils.js +111 -3
  8. package/template/claude-task-manager/lib/session-history.js +144 -16
  9. package/template/claude-task-manager/lib/session-standup.js +409 -0
  10. package/template/claude-task-manager/lib/standup-attention.js +200 -0
  11. package/template/claude-task-manager/lib/status-hooks.js +8 -2
  12. package/template/claude-task-manager/lib/update-telemetry.js +114 -0
  13. package/template/claude-task-manager/lib/walle-default-model.js +55 -0
  14. package/template/claude-task-manager/lib/walle-mcp-auto-config.js +62 -0
  15. package/template/claude-task-manager/lib/walle-supervisor.js +83 -19
  16. package/template/claude-task-manager/lib/worktree-cwd.js +82 -0
  17. package/template/claude-task-manager/providers/codex-mcp.js +104 -0
  18. package/template/claude-task-manager/providers/index.js +2 -0
  19. package/template/claude-task-manager/public/css/setup.css +2 -1
  20. package/template/claude-task-manager/public/css/walle.css +5 -0
  21. package/template/claude-task-manager/public/index.html +1596 -283
  22. package/template/claude-task-manager/public/js/session-search-utils.js +171 -1
  23. package/template/claude-task-manager/public/js/setup.js +62 -19
  24. package/template/claude-task-manager/public/js/stream-view.js +55 -6
  25. package/template/claude-task-manager/public/js/walle-session.js +73 -16
  26. package/template/claude-task-manager/public/js/walle.js +34 -2
  27. package/template/claude-task-manager/server.js +780 -177
  28. package/template/claude-task-manager/session-integrity.js +58 -15
  29. package/template/claude-task-manager/workers/approval-widget-validator.js +15 -5
  30. package/template/claude-task-manager/workers/state-detectors/codex.js +6 -0
  31. package/template/package.json +1 -1
  32. package/template/wall-e/agent.js +36 -7
  33. package/template/wall-e/api-walle.js +72 -20
  34. package/template/wall-e/coding/stream-processor.js +22 -2
  35. package/template/wall-e/coding-orchestrator.js +26 -6
  36. package/template/wall-e/eval/agent-runner.js +16 -4
  37. package/template/wall-e/eval/benchmark-generator.js +21 -1
  38. package/template/wall-e/eval/benchmarks/coding-agent.json +0 -596
  39. package/template/wall-e/eval/codex-cli-baseline.js +633 -0
  40. package/template/wall-e/eval/eval-orchestrator.js +3 -3
  41. package/template/wall-e/eval/harvester.js +15 -2
  42. package/template/wall-e/eval/run-agent-benchmarks.js +11 -3
  43. package/template/wall-e/eval/run-codex-cli-baseline.js +177 -0
  44. package/template/wall-e/lib/mcp-integration.js +220 -0
  45. package/template/wall-e/llm/ollama.js +47 -8
  46. package/template/wall-e/llm/ollama.plugin.json +1 -1
  47. package/template/wall-e/llm/tool-adapter.js +1 -0
  48. package/template/wall-e/loops/ingest.js +42 -8
  49. package/template/wall-e/mcp-server.js +272 -10
  50. package/template/wall-e/memory/ctm-session-context.js +910 -0
  51. package/template/wall-e/server.js +26 -1
  52. package/template/wall-e/skills/_bundled/scan-ctm-sessions/SKILL.md +20 -0
  53. package/template/wall-e/skills/_bundled/scan-ctm-sessions/run.js +43 -0
  54. package/template/wall-e/skills/skill-planner.js +52 -3
  55. package/template/wall-e/tools/builtin-middleware.js +55 -2
  56. package/template/wall-e/tools/shell-policy.js +1 -1
  57. package/template/wall-e/tools/slack-owner.js +104 -0
  58. package/template/wall-e/training/harvester.js +15 -2
  59. package/template/website/index.html +2 -2
  60. package/template/builder-journal.md +0 -17
package/README.md CHANGED
@@ -23,7 +23,7 @@ An always-on AI agent that learns from your Slack, email, calendar, and coding s
  - **Second Brain** — Automatically ingests your digital life into a searchable memory store with full-text search, knowledge extraction, and pattern detection
  - **Proactive Intelligence** — Surfaces time-sensitive items, suggests actions, and delivers morning briefings and weekly reflections without being asked
  - **Chat with Tools** — Talk to Wall-E in the browser — it can search your memories, look up people, run skills, and call external tools via MCP (Slack, Glean, etc.)
- - **19 Bundled Skills** — Morning briefing, weekly reflection, proactive alerts, Slack monitoring, email sync, calendar integration, coding agent, model training, model pricing sync, and more
+ - **20 Bundled Skills** — Morning briefing, weekly reflection, proactive alerts, Slack monitoring, email sync, calendar integration, coding agent, memory search, model training, model pricing sync, and more
  - **Multi-Model** — Works with Claude, GPT, Gemini, DeepSeek, and local models via Ollama, LM Studio, or MLX with smart routing
  - **Skill Management GUI** — Search, filter, create, edit, and monitor skills from the browser with rich cards, config forms, execution history, export/import, and pre-flight validation
  - **Multi-Device** — Share your brain across machines via Dropbox or iCloud
@@ -49,6 +49,11 @@ npx create-walle uninstall # Remove the auto-start service
  npx create-walle -v # Show version
  ```
  
+ Install, update, and start verify Node-native modules such as `better-sqlite3`
+ against the exact Node.js binary Wall-E will use. If you switch Node versions,
+ Wall-E automatically runs the needed `npm rebuild` before starting so CTM does
+ not crash with a `NODE_MODULE_VERSION` mismatch.
+
  ## Setup
  
  On first launch, the browser setup page guides you through:
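The `NODE_MODULE_VERSION` mentioned in the README addition is the native-addon ABI number of the running Node.js binary, which Node exposes as `process.versions.modules`. As a minimal sketch of why switching Node versions can invalidate compiled modules, the snippet below compares a recorded ABI number with the current one. The `abiChanged` helper and the idea of a recorded stamp are hypothetical and are not taken from the package.

```javascript
// Hypothetical helper: compares a previously recorded ABI number with the
// ABI of the Node.js binary that is running right now.
function abiChanged(recordedAbi, currentAbi = process.versions.modules) {
  return String(recordedAbi) !== String(currentAbi);
}

// process.versions.modules is the NODE_MODULE_VERSION of this binary.
console.log(`Node ${process.version} uses ABI ${process.versions.modules}`);

// A stamp that differs from the current ABI would call for `npm rebuild`.
console.log(abiChanged('0'));
```

The diff itself takes a more direct route: it tries to `require()` each native module under the target binary and rebuilds only the ones that fail to load.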
package/bin/create-walle.js CHANGED
@@ -17,37 +17,50 @@ const { injectMcpConfigs } = require('./mcp-inject');
  const TEMPLATE_DIR = path.join(__dirname, '..', 'template');
  const LABEL = 'com.walle.server';
  const INSTALL_PATH_FILE = path.join(process.env.HOME, '.walle', 'install-path');
+ const MANAGED_PACKAGE_DIRS = ['claude-task-manager', 'wall-e'];
+ const NATIVE_DEPENDENCIES = new Set([
+   'better-sqlite3',
+   'node-pty',
+   'sqlite-vec',
+   'tree-sitter-bash',
+ ]);
  
  // Files to preserve during update (user config, not code)
  const PRESERVE_ON_UPDATE = ['.env', 'wall-e/wall-e-config.json'];
  
  // ── CLI Router ──
  
- const command = process.argv[2] || '';
-
- if (command === 'install') {
-   install(process.argv[3]);
- } else if (command === 'update' || command === 'upgrade') {
-   update();
- } else if (command === 'start') {
-   start();
- } else if (command === 'stop') {
-   stop();
- } else if (command === 'status') {
-   status();
- } else if (command === 'logs') {
-   logs();
- } else if (command === 'uninstall') {
-   uninstall();
- } else if (command === '--help' || command === '-h' || command === 'help') {
-   usage();
- } else if (command === '--version' || command === '-v') {
-   const pkg = require('../package.json');
-   console.log(pkg.version);
- } else if (command && !command.startsWith('-')) {
-   install(command);
- } else {
-   usage();
+ if (require.main === module) {
+   main(process.argv.slice(2));
+ }
+
+ function main(argv = []) {
+   const command = argv[0] || '';
+
+   if (command === 'install') {
+     install(argv[1]);
+   } else if (command === 'update' || command === 'upgrade') {
+     update();
+   } else if (command === 'start') {
+     start();
+   } else if (command === 'stop') {
+     stop();
+   } else if (command === 'status') {
+     status();
+   } else if (command === 'logs') {
+     logs();
+   } else if (command === 'uninstall') {
+     uninstall();
+   } else if (command === '--help' || command === '-h' || command === 'help') {
+     usage();
+   } else if (command === '--version' || command === '-v') {
+     const pkg = require('../package.json');
+     console.log(pkg.version);
+   } else if (command && !command.startsWith('-')) {
+     install(command);
+   } else {
+     usage();
+   }
  }
  
  // ── Commands ──
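Moving the router into `main(argv)` behind a `require.main === module` guard means the file can be loaded with `require()` without running any command, which is presumably what makes the new test scripts possible. The sketch below illustrates that idea with a reduced router. The `handlers` parameter and the recorded strings are illustrative assumptions; the real `main` calls module-level functions directly.

```javascript
// Reduced router: takes argv as a parameter so a test can call it in-process.
// `handlers` is a hypothetical injection point used only in this sketch.
function main(argv = [], handlers = {}) {
  const command = argv[0] || '';
  if (command === 'install') return handlers.install(argv[1]);
  if (command === 'update' || command === 'upgrade') return handlers.update();
  if (command === '--version' || command === '-v') return handlers.version();
  // A bare word is treated as an install directory, as in the diff.
  if (command && !command.startsWith('-')) return handlers.install(command);
  return handlers.usage();
}

const calls = [];
const handlers = {
  install: (dir) => calls.push(`install:${dir}`),
  update: () => calls.push('update'),
  version: () => calls.push('version'),
  usage: () => calls.push('usage'),
};

main(['upgrade'], handlers);
main(['my-dir'], handlers);
main([], handlers);
console.log(calls); // [ 'update', 'install:my-dir', 'usage' ]
```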
@@ -267,6 +280,15 @@ function start() {
    const dir = findWalleDir();
    const port = readPort(dir);
    console.log(`\n Starting Wall-E from ${DIM}${dir}${RESET} on port ${port}...`);
+   try {
+     repairNativeDependencies(dir, { phase: 'start' });
+   } catch (err) {
+     console.error(`\n ${RED}Native module repair failed.${RESET}`);
+     console.error(` ${DIM}${err.message}${RESET}`);
+     console.error(` Try: ${BOLD}cd ${dir}/claude-task-manager && npm rebuild better-sqlite3 node-pty${RESET}`);
+     console.error(` ${BOLD}cd ${dir}/wall-e && npm rebuild better-sqlite3 sqlite-vec tree-sitter-bash${RESET}\n`);
+     process.exit(1);
+   }
    installService(dir, port);
    console.log(` ${GREEN}Running!${RESET} http://localhost:${port}\n`);
    printMcpResults(parseInt(readWallePort(dir)));
@@ -326,7 +348,7 @@ function stopQuiet(dir, port) {
    const wallePort = dir ? readWallePort(dir) : String(parseInt(port) + 1);
    for (const p of [port, wallePort]) {
      try {
-       const pids = execFileSync('lsof', ['-ti', ':' + p], { encoding: 'utf8', timeout: 3000 }).trim().split('\n').filter(Boolean);
+       const pids = execFileSync('lsof', ['-ti', '-sTCP:LISTEN', ':' + p], { encoding: 'utf8', timeout: 3000 }).trim().split('\n').filter(Boolean);
        for (const pid of pids) { try { process.kill(parseInt(pid), 'SIGTERM'); } catch {} }
      } catch {}
    }
@@ -464,14 +486,146 @@ done
  
  function npmInstall(dir) {
    try {
-     execFileSync('npm', ['install', '--loglevel=warn'], { cwd: path.join(dir, 'claude-task-manager'), stdio: 'inherit' });
-     execFileSync('npm', ['install', '--loglevel=warn'], { cwd: path.join(dir, 'wall-e'), stdio: 'inherit' });
-   } catch {
-     console.error(`\n ${RED}npm install failed.${RESET} Try manually:\n cd ${dir}/claude-task-manager && npm install\n cd ${dir}/wall-e && npm install\n`);
+     for (const relDir of MANAGED_PACKAGE_DIRS) {
+       runNpm(path.join(dir, relDir), ['install', '--loglevel=warn']);
+     }
+     repairNativeDependencies(dir, { phase: 'install' });
+   } catch (err) {
+     console.error(`\n ${RED}npm install failed.${RESET}`);
+     if (err && err.message) console.error(` ${DIM}${err.message}${RESET}`);
+     console.error(` Try manually with the same Node version used by Wall-E:\n cd ${dir}/claude-task-manager && npm install && npm rebuild better-sqlite3 node-pty\n cd ${dir}/wall-e && npm install && npm rebuild better-sqlite3 sqlite-vec tree-sitter-bash\n`);
      process.exit(1);
    }
  }
  
+ function repairNativeDependencies(walleDir, {
+   phase = 'install',
+   checkDependency = checkNativeDependency,
+   runNpmCommand = runNpm,
+   log = console.log,
+ } = {}) {
+   const repairs = [];
+   for (const relDir of MANAGED_PACKAGE_DIRS) {
+     const packageDir = path.join(walleDir, relDir);
+     const deps = nativeDependenciesForPackage(packageDir);
+     if (!deps.length) continue;
+
+     const failed = [];
+     for (const dep of deps) {
+       const check = checkDependency(packageDir, dep);
+       if (!check.ok) failed.push(dep);
+     }
+     if (!failed.length) continue;
+
+     log(` ${YELLOW}Rebuilding native modules for ${relDir}${RESET} ${DIM}(Node ${process.version}, ABI ${process.versions.modules})${RESET}`);
+     runNpmCommand(packageDir, ['rebuild', ...failed, '--loglevel=warn']);
+
+     const stillBroken = [];
+     for (const dep of failed) {
+       const check = checkDependency(packageDir, dep);
+       if (!check.ok) stillBroken.push(`${dep}: ${firstLine(check.error)}`);
+     }
+     if (stillBroken.length) {
+       throw new Error(`Native module repair failed in ${relDir} after ${phase}: ${stillBroken.join('; ')}`);
+     }
+     repairs.push(`${relDir}: ${failed.join(', ')}`);
+   }
+
+   if (repairs.length) {
+     log(` ${GREEN}Native modules rebuilt successfully${RESET} ${DIM}(${repairs.join('; ')})${RESET}`);
+   }
+   return repairs;
+ }
+
+ function nativeDependenciesForPackage(packageDir) {
+   const pkgPath = path.join(packageDir, 'package.json');
+   let pkg;
+   try {
+     pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
+   } catch {
+     return [];
+   }
+   const declared = {
+     ...(pkg.dependencies || {}),
+     ...(pkg.optionalDependencies || {}),
+   };
+   return Object.keys(declared).filter((name) => NATIVE_DEPENDENCIES.has(name));
+ }
+
+ function checkNativeDependency(packageDir, dependency) {
+   try {
+     execFileSync(process.execPath, ['-e', `require(${JSON.stringify(dependency)})`], {
+       cwd: packageDir,
+       encoding: 'utf8',
+       stdio: ['ignore', 'pipe', 'pipe'],
+       timeout: 15000,
+       env: npmChildEnv(),
+     });
+     return { ok: true };
+   } catch (err) {
+     const error = [err && err.message, err && err.stdout, err && err.stderr].filter(Boolean).join('\n');
+     return { ok: false, error };
+   }
+ }
+
+ function runNpm(cwd, args) {
+   const runner = resolveNpmRunner();
+   if (runner.type === 'node-cli') {
+     execFileSync(process.execPath, [runner.path, ...args], {
+       cwd,
+       stdio: 'inherit',
+       env: npmChildEnv(),
+     });
+     return;
+   }
+   execFileSync('npm', args, {
+     cwd,
+     stdio: 'inherit',
+     env: npmChildEnv(),
+   });
+ }
+
+ function resolveNpmRunner(env = process.env, execPath = process.execPath) {
+   const envPath = normalizeNpmExecPath(env.npm_execpath);
+   if (envPath && fs.existsSync(envPath)) return { type: 'node-cli', path: envPath };
+
+   for (const candidate of npmCliCandidates(execPath)) {
+     if (fs.existsSync(candidate)) return { type: 'node-cli', path: candidate };
+   }
+   return { type: 'bin', path: 'npm' };
+ }
+
+ function normalizeNpmExecPath(value) {
+   if (!value) return null;
+   const resolved = path.resolve(value);
+   if (path.basename(resolved) === 'npx-cli.js') {
+     return path.join(path.dirname(resolved), 'npm-cli.js');
+   }
+   return resolved;
+ }
+
+ function npmCliCandidates(execPath = process.execPath) {
+   const binDir = path.dirname(execPath);
+   return [
+     path.join(binDir, 'node_modules', 'npm', 'bin', 'npm-cli.js'),
+     path.join(binDir, '..', 'lib', 'node_modules', 'npm', 'bin', 'npm-cli.js'),
+     path.join(binDir, '..', 'libexec', 'lib', 'node_modules', 'npm', 'bin', 'npm-cli.js'),
+   ].map((candidate) => path.resolve(candidate));
+ }
+
+ function npmChildEnv() {
+   const nodeDir = path.dirname(process.execPath);
+   return {
+     ...process.env,
+     PATH: [nodeDir, process.env.PATH || ''].filter(Boolean).join(path.delimiter),
+     npm_config_update_notifier: 'false',
+   };
+ }
+
+ function firstLine(value) {
+   return String(value || '').split(/\r?\n/).find(Boolean) || 'unknown error';
+ }
+
  function stampVersion(dir) {
    try {
      const ver = require('../package.json').version;
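`repairNativeDependencies` follows a verify, rebuild, verify-again flow and accepts `checkDependency` and `runNpmCommand` as options, so the flow can be exercised without touching npm. The sketch below restates that flow in reduced form for a single package directory. The `repair` function, its `check` and `rebuild` options, and the stub state are simplifications written for this example, not the package's own code.

```javascript
// Reduced restatement of the verify -> rebuild -> verify loop.
// `check` and `rebuild` are injected stand-ins for the real probe and npm call.
function repair(deps, { check, rebuild }) {
  const failed = deps.filter((dep) => !check(dep).ok);
  if (!failed.length) return [];

  // Rebuild only the modules that failed to load.
  rebuild(failed);

  // Re-check the same modules; anything still failing is a hard error.
  const stillBroken = failed.filter((dep) => !check(dep).ok);
  if (stillBroken.length) {
    throw new Error(`Native module repair failed: ${stillBroken.join(', ')}`);
  }
  return failed;
}

// Stub state: better-sqlite3 is "broken" until a rebuild is requested.
const broken = new Set(['better-sqlite3']);
const rebuilt = repair(['better-sqlite3', 'node-pty'], {
  check: (dep) => ({ ok: !broken.has(dep) }),
  rebuild: (names) => names.forEach((name) => broken.delete(name)),
});
console.log(rebuilt); // [ 'better-sqlite3' ]
```

Rebuilding only the failed subset keeps the common case (nothing broken) free of any npm invocation, which matters because the same check now also runs on every `start`.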
@@ -568,3 +722,14 @@ function copyRecursive(src, dest) {
      fs.copyFileSync(src, dest);
    }
  }
+
+ module.exports = {
+   checkNativeDependency,
+   firstLine,
+   main,
+   nativeDependenciesForPackage,
+   normalizeNpmExecPath,
+   npmCliCandidates,
+   repairNativeDependencies,
+   resolveNpmRunner,
+ };
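Several of the newly exported helpers are pure functions, so their behavior can be checked in isolation. The sketch below copies `normalizeNpmExecPath` and `firstLine` from this file's diff so that it runs on its own; the npx path passed in is a made-up example, and only its basename matters for the rewrite.

```javascript
const path = require('node:path');

// Copied from the diff so the sketch is self-contained.
function normalizeNpmExecPath(value) {
  if (!value) return null;
  const resolved = path.resolve(value);
  if (path.basename(resolved) === 'npx-cli.js') {
    return path.join(path.dirname(resolved), 'npm-cli.js');
  }
  return resolved;
}

function firstLine(value) {
  return String(value || '').split(/\r?\n/).find(Boolean) || 'unknown error';
}

// When launched through npx, npm_execpath points at npx-cli.js; the helper
// swaps in the sibling npm-cli.js so `npm rebuild` can be run with it.
const fromNpx = normalizeNpmExecPath('/opt/node/lib/node_modules/npm/bin/npx-cli.js');
console.log(path.basename(fromNpx));                 // npm-cli.js
console.log(firstLine('\nError: boom\n  at stack')); // Error: boom
console.log(firstLine(''));                          // unknown error
```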
package/bin/mcp-inject.js CHANGED
@@ -1,60 +1,25 @@
  'use strict';
- const fs = require('fs');
- const path = require('path');
-
- const MCP_TARGETS = [
-   { tool: 'Claude Code', configPath: '.claude/mcp.json', detectDir: '.claude' },
-   { tool: 'Cursor', configPath: '.cursor/mcp.json', detectDir: '.cursor' },
-   { tool: 'Windsurf', configPath: '.codeium/windsurf/mcp_config.json', detectDir: '.codeium/windsurf' },
-   { tool: 'Claude Desktop', configPath: 'Library/Application Support/Claude/claude_desktop_config.json', detectDir: 'Library/Application Support/Claude' },
- ];
-
- /**
-  * Inject Wall-E MCP server config into detected AI tool config files.
-  * @param {number} wallePort - The Wall-E HTTP port
-  * @param {string} [homeDir] - Home directory override for testing
-  * @returns {Array<{tool: string, action: string, configPath?: string}>}
-  */
- function injectMcpConfigs(wallePort, homeDir) {
-   const home = homeDir || process.env.HOME;
-   const walleUrl = `http://localhost:${wallePort}/mcp`;
-   const results = [];
-
-   for (const target of MCP_TARGETS) {
-     const detectPath = path.join(home, target.detectDir);
-     const configPath = path.join(home, target.configPath);
-
-     if (!fs.existsSync(detectPath)) {
-       results.push({ tool: target.tool, action: 'not_installed' });
-       continue;
-     }
  
-     let config = {};
-     if (fs.existsSync(configPath)) {
-       try {
-         config = JSON.parse(fs.readFileSync(configPath, 'utf8'));
-       } catch {
-         config = {};
-       }
-     }
-
-     if (!config.mcpServers) config.mcpServers = {};
-
-     const existing = config.mcpServers['wall-e'];
-     if (existing && existing.url === walleUrl) {
-       results.push({ tool: target.tool, action: 'already_configured', configPath });
-       continue;
-     }
+ const path = require('path');
  
-     const action = existing ? 'updated' : 'added';
-     config.mcpServers['wall-e'] = { type: 'http', url: walleUrl };
+ const runtimePath = path.join(__dirname, '..', 'template', 'wall-e', 'lib', 'mcp-integration');
+ let runtime = null;
  
-     fs.mkdirSync(path.dirname(configPath), { recursive: true });
-     fs.writeFileSync(configPath, JSON.stringify(config, null, 2) + '\n');
-     results.push({ tool: target.tool, action, configPath });
+ function loadRuntime() {
+   if (runtime) return runtime;
+   try {
+     runtime = require(runtimePath);
+   } catch {
+     runtime = require('../../wall-e/lib/mcp-integration');
    }
-
-   return results;
+   return runtime;
  }
  
- module.exports = { injectMcpConfigs, MCP_TARGETS };
+ module.exports = {
+   get MCP_TARGETS() { return loadRuntime().MCP_TARGETS; },
+   detectMcpIntegrations: (...args) => loadRuntime().detectMcpIntegrations(...args),
+   ensureMcpIntegrations: (...args) => loadRuntime().ensureMcpIntegrations(...args),
+   injectMcpConfigs: (...args) => loadRuntime().injectMcpConfigs(...args),
+   testWallEMcpEndpoint: (...args) => loadRuntime().testWallEMcpEndpoint(...args),
+   wallEMcpUrl: (...args) => loadRuntime().wallEMcpUrl(...args),
+ };
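After this change `mcp-inject.js` is a thin facade: nothing is loaded until an export is first used, and the loaded module is cached afterwards. The sketch below shows the same lazy-facade pattern in a self-contained form. The `makeFacade` function, the `loaders` array, and the stub runtime are assumptions made for this example; the real file uses two hard-coded `require()` paths, and the real `wallEMcpUrl` signature is not shown in the diff.

```javascript
// Lazy facade: the runtime is resolved on first use and then cached.
// `loaders` stands in for the primary and fallback require() calls.
function makeFacade(loaders) {
  let runtime = null;
  let loads = 0;

  function loadRuntime() {
    if (runtime) return runtime;
    for (const load of loaders) {
      try {
        runtime = load();
        loads += 1;
        return runtime;
      } catch {
        // This location was not available; try the next candidate.
      }
    }
    throw new Error('no runtime could be loaded');
  }

  return {
    // A getter defers loading until the property is actually read.
    get MCP_TARGETS() { return loadRuntime().MCP_TARGETS; },
    wallEMcpUrl: (...args) => loadRuntime().wallEMcpUrl(...args),
    get loadCount() { return loads; },
  };
}

const facade = makeFacade([
  () => { throw new Error('primary path missing'); },
  // Stub runtime; the URL shape mirrors the one in the removed code.
  () => ({ MCP_TARGETS: ['Claude Code'], wallEMcpUrl: (port) => `http://localhost:${port}/mcp` }),
]);

console.log(facade.wallEMcpUrl(3457));
console.log(facade.MCP_TARGETS);
console.log(facade.loadCount);
```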
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "create-walle",
-   "version": "0.9.12",
+   "version": "0.9.14",
    "description": "CTM + Wall-E \u2014 AI coding dashboard and personal digital twin agent. Multi-agent terminal for Claude Code, Codex, Gemini, Aider, OpenCode, and more, plus prompt editor, task queue, and an agent that learns from Slack, email & calendar.",
    "bin": {
      "create-walle": "bin/create-walle.js"
@@ -12,6 +12,8 @@
    ],
    "scripts": {
      "build": "bash build.sh",
+     "test": "node --test tests/*.test.js",
+     "test:npx-e2e": "CREATE_WALLE_NPX_E2E=1 node --test tests/npx-install-update.e2e.test.js",
      "prepublishOnly": "bash build.sh"
    },
    "license": "MIT",
package/template/claude-task-manager/approval-agent.js CHANGED
@@ -356,6 +356,13 @@ function reviewWithHeuristics(context) {
    const tool = (context.toolName || '').replace(/^[⏺●\s]+/, '').toLowerCase();
    const warning = (context.warning || '').toLowerCase();
  
+   if (/^mcp\b/.test(tool)
+       && /\bwall-e\.walle_memory_status\b|\bwall-e\b[\s\S]*\bwalle_memory_status\b/.test(cmd)) {
+     return { decision: 'approve', reasoning: 'Read-only Wall-E memory status MCP call (heuristic)', riskLevel: 'low',
+       ruleLabel: 'Wall-E memory status', rulePattern: 'wall-e.*walle_memory_status',
+       ruleDescription: 'Read Wall-E MCP memory status' };
+   }
+
    // Low-risk tools — auto-approve immediately (before high-risk content check,
    // because Edit/Write diffs may contain code with "drop table" or "rm -rf" as
    // string literals — those are code content, not dangerous operations).
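The added rule auto-approves only when two patterns match together: the tool name starts with the word `mcp`, and the command text names `walle_memory_status` on the `wall-e` server. The sketch below isolates those two regular expressions to show which inputs pass. `isWalleMemoryStatusCall` is a hypothetical wrapper, and the sample strings are invented; how `cmd` is derived in `reviewWithHeuristics` is not visible in this hunk.

```javascript
// Same two patterns as the added heuristic, wrapped for experimentation.
// `tool` and `cmd` are assumed to be plain strings prepared by the caller.
function isWalleMemoryStatusCall(tool, cmd) {
  return /^mcp\b/.test(tool)
    && /\bwall-e\.walle_memory_status\b|\bwall-e\b[\s\S]*\bwalle_memory_status\b/.test(cmd);
}

console.log(isWalleMemoryStatusCall('mcp', 'wall-e.walle_memory_status({})'));      // true
console.log(isWalleMemoryStatusCall('mcp', 'wall-e (mcp)\nwalle_memory_status'));   // true
console.log(isWalleMemoryStatusCall('bash', 'wall-e.walle_memory_status'));         // false
console.log(isWalleMemoryStatusCall('mcp', 'wall-e.walle_delete'));                 // false
```

Because the second alternative uses `[\s\S]*`, the server name and the tool name may be separated by arbitrary text, including newlines, as long as `wall-e` comes first.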
package/template/claude-task-manager/docs/session-standup-command-center-design.md ADDED
@@ -0,0 +1,242 @@
+ # Session Standup Command Center Design
+
+ ## Problem
+
+ CTM already supports many long-lived sessions, but the operator cost rises when
+ multiple agents are active at once. Each terminal has its own context, and a new
+ session does not inherit the hard-won context from an existing one. Reusing
+ sessions is therefore necessary, but reuse creates a second problem: the user
+ must remember what every session is doing, whether it is blocked, and what the
+ next useful instruction should be.
+
+ The Standup Command Center is a central dashboard for that operating loop. It is
+ not a replacement for session terminals. It is a "manager view" over existing
+ sessions that answers:
+
+ - What is every active session doing?
+ - Which sessions need the user's attention?
+ - Which sessions are ready to review or finish?
+ - What should the user do next for each session?
+ - How can the user quickly deep dive and issue instructions without losing the
+   fleet-level view?
+
+ ## Research Signals
+
+ Open-source projects validate the need for a control-plane style UI, but CTM
+ should borrow the operating patterns rather than replatforming the runtime.
+
+ - Mission Control positions itself as an open-source dashboard for agent fleets,
+   tasks, costs, workflows, logs, memory, alerts, and quality gates:
+   https://github.com/builderz-labs/mission-control
+ - AutoGen Studio shows that multi-agent debugging benefits from explicit
+   sessions, observable messages, metrics, and reusable agent components, but it
+   is framed as a prototyping UI rather than a production app:
+   https://autogenhub.github.io/autogen/blog/2023/12/01/AutoGenStudio/
+ - OpenHands exposes a local GUI, REST API, CLI-compatible coding-agent workflow,
+   and cloud/enterprise surfaces for scaling coding agents:
+   https://github.com/OpenHands/OpenHands
+ - LangGraph's product framing emphasizes human-in-the-loop checks, persistent
+   memory, and streaming as first-class agent UX primitives:
+   https://www.langchain.com/langgraph
+
+ The design conclusion for CTM: do not add a manager-of-managers orchestration
+ runtime. CTM already has terminals, stream capture, recent-session history,
+ worktree state, approval detection, and review panels. The right abstraction is
+ a reusable dashboard projection over those signals, with deterministic manager
+ recommendations first and optional LLM annotations later.
+
+ ## Existing CTM Substrate
+
+ The feature should reuse these existing surfaces:
+
+ - `lib/session-stream.js`: summary, intent, progress, last prompt, stream status.
+ - `lib/status-hooks.js`: idle, busy, and waiting-input state transitions.
+ - `server.js` session payloads: active sessions, provider, cwd, branch, modified
+   time, capabilities, and worktree status.
+ - `/api/recent-sessions`: historical sessions and project filtering.
+ - `/api/stream/status` and `/api/sessions/:id/summary`: stream health and
+   per-session summary.
+ - Review tab and session deep links: structured transcript and review flows.
+
+ ## UX Model
+
+ The first screen in CTM Sessions should become the operator dashboard when no
+ terminal is selected. The dashboard is dense, operational, and scannable. It is
+ closer to a standup board than a marketing landing page.
+
+ Primary regions:
+
+ 1. Fleet header
+    - Total active sessions.
+    - Counts by lane.
+    - Last refresh time.
+    - Refresh action.
+    - New Session action.
+
+ 2. Attention ribbon
+    - Highest priority sessions needing input or approval.
+    - Shows only the action needed and evidence.
+    - Lets the user open the session directly.
+
+ 3. Lanes
+    - Needs User: approvals, questions, waiting input, or explicit blockers.
+    - Ready Review: worktree changes, completed work, or reviewable artifacts.
+    - Running: active output or recent work.
+    - Continue Later: idle, stale, or context-preserving sessions.
+
+ 4. Session card
+    - Title and agent/provider.
+    - State badge and age.
+    - Intent: one-line goal.
+    - Progress: one-line latest progress.
+    - Manager recommendation: what the user should do next.
+    - Evidence chips: waiting input, branch, worktree, last prompt, last activity.
+    - Actions: Open, Review when supported, and Send Instruction when a terminal
+      target is available.
+
+ 5. Deep dive behavior
+    - Open keeps the user's mental model simple: jump into the session terminal.
+    - Review opens the transcript/review panel when the agent supports it.
+    - Send Instruction targets the existing prompt/queue mechanism in a later
+      phase; the first implementation can focus the session and preserve the
+      command center as the resume surface.
+
+ ## Manager Recommendation Taxonomy
+
+ Recommendations must be conservative and explainable. They should be derived
+ from existing signals and include evidence. They should never claim certainty
+ from weak heuristics.
+
+ Recommended action kinds:
+
+ - `approval_needed`: approve, deny, or answer a tool prompt.
+ - `needs_input`: user instruction appears required.
+ - `review`: inspect completed work or worktree changes.
+ - `watch`: session is running; no action needed right now.
+ - `resume`: session is idle but can continue with more instruction.
+ - `investigate`: session appears failed or blocked.
+ - `archive`: work appears complete and inactive.
+
+ Lane priority:
+
+ 1. Failed or blocked signals.
+ 2. Waiting input or approval.
+ 3. Reviewable worktree/session output.
+ 4. Running or recently active sessions.
+ 5. Idle or stale sessions.
+
+ Each recommendation includes:
+
+ - `lane`: where it appears on the board.
+ - `actionKind`: stable machine-readable action.
+ - `actionLabel`: short CTA text.
+ - `recommendation`: short human-readable manager note.
+ - `confidence`: `high`, `medium`, or `low`.
+ - `evidence`: short strings tied to concrete signals.
+
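The lane priority and recommendation fields listed above can be read as a first-match classifier: check the highest-priority signal first and return as soon as one applies. The sketch below is one possible reading, not the contents of `lib/session-standup.js`. The input fields (`failed`, `blocked`, `status`, `dirtyFiles`, `unmergedCommits`), the confidence values, and the choice to place failed sessions in the Needs User lane are all assumptions made for illustration.

```javascript
// Builds a recommendation object with the six fields named in the design.
function rec(lane, actionKind, actionLabel, recommendation, confidence, evidence) {
  return { lane, actionKind, actionLabel, recommendation, confidence, evidence };
}

// First-match classifier following the documented lane priority.
// All input field names are hypothetical.
function classifySession(s) {
  if (s.failed || s.blocked) {
    return rec('needs_user', 'investigate', 'Investigate',
      'Session appears failed or blocked.', 'medium', ['failure or blocker signal']);
  }
  if (s.status === 'waiting_input') {
    return rec('needs_user', 'approval_needed', 'Respond',
      'Approval or input is waiting in the terminal.', 'high', ['waiting_input']);
  }
  if ((s.dirtyFiles || 0) > 0 || (s.unmergedCommits || 0) > 0) {
    return rec('ready_review', 'review', 'Review',
      'Worktree has changes to inspect.', 'medium',
      [`${s.dirtyFiles || 0} dirty files`, `${s.unmergedCommits || 0} unmerged commits`]);
  }
  if (s.status === 'busy') {
    return rec('running', 'watch', 'Watch',
      'Session is running; no action needed right now.', 'high', ['busy']);
  }
  return rec('continue_later', 'resume', 'Resume',
    'Session is idle but can continue with more instruction.', 'low', ['idle']);
}

// Waiting input outranks a dirty worktree, per the priority list.
console.log(classifySession({ status: 'waiting_input', dirtyFiles: 2 }).lane);
console.log(classifySession({ status: 'idle', dirtyFiles: 1 }).actionKind);
```

Keeping the classifier a pure function of the session payload matches the design's preference for deterministic, explainable recommendations.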
+ ## API Contract
+
+ Add `GET /api/sessions/standup`.
+
+ Response shape:
+
+ ```json
+ {
+   "generatedAt": "2026-04-30T12:00:00.000Z",
+   "counts": {
+     "total": 3,
+     "needs_user": 1,
+     "ready_review": 1,
+     "running": 1,
+     "continue_later": 0
+   },
+   "lanes": [
+     {
+       "id": "needs_user",
+       "title": "Needs User",
+       "sessions": []
+     }
+   ],
+   "sessions": [
+     {
+       "id": "ctm-session-id",
+       "agentSessionId": "agent-session-id",
+       "title": "Fix build failure",
+       "agent": "codex",
+       "provider": "openai",
+       "cwd": "/repo",
+       "branch": "fix-build",
+       "status": "waiting_input",
+       "lane": "needs_user",
+       "actionKind": "approval_needed",
+       "actionLabel": "Respond",
+       "recommendation": "Approval or input is waiting in the terminal.",
+       "confidence": "high",
+       "evidence": ["waiting_input", "last activity 2m ago"],
+       "intent": "Fix build failure",
+       "progress": "Waiting for command approval",
+       "lastActivity": "2026-04-30T11:58:00.000Z",
+       "worktree": {
+         "branch": "fix-build",
+         "dirtyFiles": 2,
+         "unmergedCommits": 1
+       },
+       "capabilities": {
+         "review": true,
+         "resume": true
+       }
+     }
+   ]
+ }
+ ```
+
+ The endpoint should be pure projection logic over active sessions first. Recent
+ sessions can be added as a later extension once the active-session dashboard is
+ stable.
+
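The response shape above can be produced by a pure function that groups already-classified sessions by lane and counts them. The sketch below is a guess at such a projection. `buildStandupSnapshot` and `LANES` are names invented for this example, and whether each lane's `sessions` array holds full session objects or only ids is not specified in the design (the example response shows it empty).

```javascript
// Lane ids and titles as named in the design document.
const LANES = [
  { id: 'needs_user', title: 'Needs User' },
  { id: 'ready_review', title: 'Ready Review' },
  { id: 'running', title: 'Running' },
  { id: 'continue_later', title: 'Continue Later' },
];

// Hypothetical builder: expects sessions that already carry a `lane` field.
// `now` is a parameter so the output is deterministic in tests.
function buildStandupSnapshot(sessions, now = new Date()) {
  const counts = { total: sessions.length };
  const lanes = LANES.map(({ id, title }) => {
    const inLane = sessions.filter((s) => s.lane === id);
    counts[id] = inLane.length;
    return { id, title, sessions: inLane };
  });
  return { generatedAt: now.toISOString(), counts, lanes, sessions };
}

const snapshot = buildStandupSnapshot(
  [
    { id: 'a', lane: 'needs_user' },
    { id: 'b', lane: 'ready_review' },
    { id: 'c', lane: 'running' },
  ],
  new Date('2026-04-30T12:00:00.000Z'),
);
console.log(JSON.stringify(snapshot.counts));
```

With the three sample sessions, the counts match the example response: one session in each of the first three lanes and none in `continue_later`.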
+ ## Implementation Phases
+
+ Phase 1: Design document
+
+ - Commit this design.
+
+ Phase 2: Backend projection
+
+ - Add a pure `lib/session-standup.js` module.
+ - Classify sessions into lanes and recommendations.
+ - Add unit tests for the classifier and snapshot builder.
+ - Add `GET /api/sessions/standup`.
+
+ Phase 3: Dashboard UI
+
+ - Replace the empty Sessions welcome panel with the Standup Command Center.
+ - Fetch `/api/sessions/standup`.
+ - Render fleet header, attention ribbon, lanes, cards, and actions.
+ - Keep the UI within the existing CTM visual language: compact, dark,
+   terminal-adjacent, and operational.
+
+ Phase 4: Verification and polish
+
+ - Run focused unit tests.
+ - Run an isolated CTM dev server on random ports, never the primary 3456/3457.
+ - Curl the new API.
+ - Drive the dashboard in a real browser and confirm it renders without layout
+   overlap or blank states.
+
+ Future phase: Manager annotations
+
+ - Add optional LLM-generated manager notes only after deterministic
+   recommendations are useful.
+ - Store annotation provenance and age.
+ - Do not let the LLM issue autonomous cross-session commands; it can propose
+   actions, not execute them.
+
+ ## Open Questions
+
+ - Should historical/recent sessions appear on the same board or in a secondary
+   "Resume Later" view?
+ - Should Send Instruction fan out to multiple sessions, or should the first
+   version stay one-session-at-a-time?
+ - Should manager annotations be cached per session turn, per stream summary, or
+   per dashboard refresh?
+