create-walle 0.9.13 → 0.9.15

Files changed (98)
  1. package/README.md +8 -3
  2. package/bin/create-walle.js +232 -32
  3. package/bin/mcp-inject.js +18 -53
  4. package/package.json +3 -1
  5. package/template/claude-task-manager/api-prompts.js +11 -2
  6. package/template/claude-task-manager/approval-agent.js +7 -0
  7. package/template/claude-task-manager/db.js +94 -75
  8. package/template/claude-task-manager/docs/session-standup-command-center-design.md +242 -0
  9. package/template/claude-task-manager/docs/session-tooltip-freshness-design.md +224 -0
  10. package/template/claude-task-manager/docs/session-ux-issue-review-2026-05-01.md +369 -0
  11. package/template/claude-task-manager/fuzzy-utils.js +10 -2
  12. package/template/claude-task-manager/git-utils.js +140 -10
  13. package/template/claude-task-manager/lib/agent-capabilities.js +1 -1
  14. package/template/claude-task-manager/lib/agent-presets.js +38 -5
  15. package/template/claude-task-manager/lib/codex-terminal-final.js +53 -0
  16. package/template/claude-task-manager/lib/ctm-session-context-api.js +222 -0
  17. package/template/claude-task-manager/lib/session-diagnostics.js +56 -0
  18. package/template/claude-task-manager/lib/session-history.js +309 -16
  19. package/template/claude-task-manager/lib/session-standup.js +409 -0
  20. package/template/claude-task-manager/lib/session-stream.js +253 -20
  21. package/template/claude-task-manager/lib/standup-attention.js +200 -0
  22. package/template/claude-task-manager/lib/status-hooks.js +8 -2
  23. package/template/claude-task-manager/lib/update-telemetry.js +114 -0
  24. package/template/claude-task-manager/lib/walle-ctm-history.js +49 -6
  25. package/template/claude-task-manager/lib/walle-default-model.js +55 -0
  26. package/template/claude-task-manager/lib/walle-mcp-auto-config.js +66 -0
  27. package/template/claude-task-manager/lib/walle-supervisor.js +86 -19
  28. package/template/claude-task-manager/lib/walle-transcript.js +1 -3
  29. package/template/claude-task-manager/lib/worktree-cwd.js +82 -0
  30. package/template/claude-task-manager/package.json +1 -0
  31. package/template/claude-task-manager/providers/codex-mcp.js +104 -0
  32. package/template/claude-task-manager/providers/index.js +2 -0
  33. package/template/claude-task-manager/public/css/setup.css +2 -1
  34. package/template/claude-task-manager/public/css/walle.css +71 -0
  35. package/template/claude-task-manager/public/index.html +2388 -429
  36. package/template/claude-task-manager/public/js/message-renderer.js +314 -35
  37. package/template/claude-task-manager/public/js/session-search-utils.js +185 -3
  38. package/template/claude-task-manager/public/js/session-status-precedence.js +125 -0
  39. package/template/claude-task-manager/public/js/setup.js +62 -19
  40. package/template/claude-task-manager/public/js/stream-view.js +396 -55
  41. package/template/claude-task-manager/public/js/terminal-restore-state.js +57 -0
  42. package/template/claude-task-manager/public/js/walle-session.js +234 -26
  43. package/template/claude-task-manager/public/js/walle.js +143 -2
  44. package/template/claude-task-manager/server.js +1402 -433
  45. package/template/claude-task-manager/session-integrity.js +77 -28
  46. package/template/claude-task-manager/workers/approval-widget-validator.js +15 -5
  47. package/template/claude-task-manager/workers/scrollback-worker.js +5 -6
  48. package/template/claude-task-manager/workers/state-detectors/codex.js +6 -0
  49. package/template/package.json +1 -1
  50. package/template/wall-e/agent-runners/claude-code.js +2 -0
  51. package/template/wall-e/agent.js +63 -8
  52. package/template/wall-e/api-walle.js +330 -52
  53. package/template/wall-e/brain.js +291 -42
  54. package/template/wall-e/chat.js +172 -15
  55. package/template/wall-e/coding/compaction-service.js +19 -5
  56. package/template/wall-e/coding/stream-processor.js +22 -2
  57. package/template/wall-e/coding/workspace-replay.js +1 -4
  58. package/template/wall-e/coding-orchestrator.js +250 -80
  59. package/template/wall-e/compat.js +0 -28
  60. package/template/wall-e/context/context-builder.js +3 -1
  61. package/template/wall-e/embeddings.js +2 -7
  62. package/template/wall-e/eval/agent-runner.js +30 -9
  63. package/template/wall-e/eval/benchmark-generator.js +21 -1
  64. package/template/wall-e/eval/benchmarks/chat-eval.json +66 -6
  65. package/template/wall-e/eval/benchmarks/coding-agent.json +0 -596
  66. package/template/wall-e/eval/cc-replay.js +1 -0
  67. package/template/wall-e/eval/codex-cli-baseline.js +633 -0
  68. package/template/wall-e/eval/debug-agent003.js +1 -0
  69. package/template/wall-e/eval/eval-orchestrator.js +3 -3
  70. package/template/wall-e/eval/run-agent-benchmarks.js +11 -3
  71. package/template/wall-e/eval/run-codex-cli-baseline.js +177 -0
  72. package/template/wall-e/eval/run-model-comparison.js +1 -0
  73. package/template/wall-e/eval/swebench-adapter.js +1 -0
  74. package/template/wall-e/evaluation/quorum-evaluator.js +0 -1
  75. package/template/wall-e/extraction/knowledge-extractor.js +1 -2
  76. package/template/wall-e/lib/mcp-integration.js +336 -0
  77. package/template/wall-e/llm/ollama.js +47 -8
  78. package/template/wall-e/llm/ollama.plugin.json +1 -1
  79. package/template/wall-e/llm/tool-adapter.js +1 -0
  80. package/template/wall-e/loops/ingest.js +42 -8
  81. package/template/wall-e/loops/initiative.js +87 -2
  82. package/template/wall-e/mcp-server.js +872 -19
  83. package/template/wall-e/memory/ctm-context-client.js +230 -0
  84. package/template/wall-e/memory/ctm-session-context.js +1376 -0
  85. package/template/wall-e/prompts/coding/memory-protocol.md +6 -0
  86. package/template/wall-e/server.js +30 -1
  87. package/template/wall-e/skills/_bundled/memory-search/SKILL.md +8 -0
  88. package/template/wall-e/skills/_bundled/scan-ctm-sessions/SKILL.md +20 -0
  89. package/template/wall-e/skills/_bundled/scan-ctm-sessions/run.js +43 -0
  90. package/template/wall-e/skills/_bundled/slack-mentions/run.js +471 -188
  91. package/template/wall-e/skills/skill-planner.js +86 -4
  92. package/template/wall-e/slack/socket-mode-listener.js +276 -0
  93. package/template/wall-e/telemetry.js +70 -2
  94. package/template/wall-e/tools/builtin-middleware.js +55 -2
  95. package/template/wall-e/tools/shell-policy.js +1 -1
  96. package/template/wall-e/tools/slack-owner.js +104 -0
  97. package/template/website/index.html +4 -4
  98. package/template/builder-journal.md +0 -17
@@ -2141,7 +2141,7 @@ function listSessionConversations({ search, limit, offset, hostname, allDevices
2141
2141
  }
2142
2142
  if (search) {
2143
2143
  sql += ' AND (title LIKE ? OR first_message LIKE ? OR project_path LIKE ? OR messages LIKE ?)';
2144
- const q = `%${search}%`;
2144
+ const q = `%${normalizeSessionSearchValue(search) || search}%`;
2145
2145
  params.push(q, q, q, q);
2146
2146
  }
2147
2147
  sql += ' ORDER BY imported_at DESC';
@@ -2154,6 +2154,14 @@ function getSessionConversation(sessionId) {
2154
2154
  return getDb().prepare('SELECT * FROM session_conversations WHERE ctm_session_id = ?').get(sessionId);
2155
2155
  }
2156
2156
 
2157
+ function normalizeSessionSearchValue(value) {
2158
+ return String(value || '')
2159
+ .trim()
2160
+ .toLowerCase()
2161
+ .replace(/(^|[\s([{])[$/]+(?=[a-z0-9_-])/g, '$1')
2162
+ .replace(/\s+/g, ' ');
2163
+ }
2164
+
2157
2165
  function updateSessionModel(sessionId, modelProvider, modelId) {
2158
2166
  getDb().prepare(
2159
2167
  'UPDATE session_conversations SET model_provider = ?, model_id = ? WHERE ctm_session_id = ?'
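To see what the new `normalizeSessionSearchValue` helper does to a query before it is wrapped in `%...%`, the sketch below copies the function from the hunk above and runs it on a few sample inputs. The wrapper name `buildLikePattern` and the sample strings are illustrative assumptions, not part of the package.

```javascript
// Copied from the hunk above: trims, lowercases, drops "$" or "/" sigils
// that sit at the start of a token, and collapses runs of whitespace.
function normalizeSessionSearchValue(value) {
  return String(value || '')
    .trim()
    .toLowerCase()
    .replace(/(^|[\s([{])[$/]+(?=[a-z0-9_-])/g, '$1')
    .replace(/\s+/g, ' ');
}

// Hypothetical wrapper mirroring the call site: fall back to the raw
// search string when normalization yields an empty string.
function buildLikePattern(search) {
  return `%${normalizeSessionSearchValue(search) || search}%`;
}

console.log(buildLikePattern('  /Review   $PR ')); // %review pr%
console.log(buildLikePattern('src/lib'));          // %src/lib%
console.log(buildLikePattern('(/deploy)'));        // %(deploy)%
```

A slash in the middle of a token (`src/lib`) survives because the pattern only strips sigils that follow the start of the string, whitespace, or an opening bracket.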
@@ -2215,10 +2223,25 @@ function checkpointWal(mode) {
2215
2223
  const m = (mode || 'PASSIVE').toUpperCase();
2216
2224
  if (!_VALID_CHECKPOINT_MODES.has(m)) return;
2217
2225
  try {
2218
- db.pragma(`wal_checkpoint(${m})`);
2226
+ return db.pragma(`wal_checkpoint(${m})`);
2219
2227
  } catch {}
2220
2228
  }
2221
2229
 
2230
+ function checkpointWalOrThrow(mode) {
2231
+ if (!db) throw new Error('Database not initialized');
2232
+ const m = (mode || 'PASSIVE').toUpperCase();
2233
+ if (!_VALID_CHECKPOINT_MODES.has(m)) throw new Error(`Invalid WAL checkpoint mode: ${mode}`);
2234
+ const rows = db.pragma(`wal_checkpoint(${m})`);
2235
+ const row = Array.isArray(rows) ? rows[0] : rows;
2236
+ const busy = Number(row?.busy ?? row?.[0] ?? 0);
2237
+ if (busy > 0) {
2238
+ const log = row?.log ?? row?.[1] ?? 'unknown';
2239
+ const checkpointed = row?.checkpointed ?? row?.[2] ?? 'unknown';
2240
+ throw new Error(`WAL checkpoint ${m} could not complete: busy=${busy}, log=${log}, checkpointed=${checkpointed}`);
2241
+ }
2242
+ return rows;
2243
+ }
2244
+
2222
2245
  // Gzip a .db file to .db.gz and remove the original.
2223
2246
  // Returns the actual output path (destPath if gzip succeeded, srcPath if it didn't).
2224
2247
  function _gzipBackup(srcPath, destPath) {
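The busy-row check in `checkpointWalOrThrow` can be exercised without a database. SQLite's `wal_checkpoint` pragma reports a single row with three columns (`busy`, `log`, `checkpointed`); the sketch below applies the same check to a result passed in as an argument. The helper name `assertCheckpointComplete` and the sample rows are assumptions for illustration.

```javascript
// Hypothetical helper: same busy-row logic as checkpointWalOrThrow, but it
// receives the pragma result instead of running the pragma itself.
function assertCheckpointComplete(rows, mode) {
  const row = Array.isArray(rows) ? rows[0] : rows;
  const busy = Number(row?.busy ?? row?.[0] ?? 0);
  if (busy > 0) {
    const log = row?.log ?? row?.[1] ?? 'unknown';
    const checkpointed = row?.checkpointed ?? row?.[2] ?? 'unknown';
    throw new Error(
      `WAL checkpoint ${mode} could not complete: busy=${busy}, log=${log}, checkpointed=${checkpointed}`
    );
  }
  return rows;
}

// A checkpoint that ran to completion passes through unchanged.
assertCheckpointComplete([{ busy: 0, log: 12, checkpointed: 12 }], 'TRUNCATE');

// A checkpoint that was blocked surfaces as an error.
try {
  assertCheckpointComplete([{ busy: 1, log: 12, checkpointed: 4 }], 'TRUNCATE');
} catch (err) {
  console.log(err.message);
}
```

This is the behavioral difference from `checkpointWal`, which swallows failures: callers such as `createBackupSync` and `restoreBackup` now stop instead of proceeding with a WAL that was not fully checkpointed.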
@@ -2234,41 +2257,12 @@ function _gzipBackup(srcPath, destPath) {
2234
2257
  }
2235
2258
 
2236
2259
  function createBackup(label) {
2237
- if (!db || !currentDbPath) throw new Error('Database not initialized');
2238
- checkpointWal('TRUNCATE');
2239
-
2240
- const now = new Date();
2241
- const ts = now.toISOString().replace(/[:.]/g, '-').slice(0, 19);
2242
- const tag = label ? `-${label.replace(/[^a-zA-Z0-9_-]/g, '')}` : '';
2243
- const tmpPath = path.join(BACKUP_DIR, `task-manager-${ts}${tag}.db`);
2244
- const backupName = `task-manager-${ts}${tag}.db.gz`;
2245
- const backupPath = path.join(BACKUP_DIR, backupName);
2246
-
2247
- // Use SQLite backup API, then gzip
2248
- db.backup(tmpPath).then(() => {
2249
- _gzipBackup(tmpPath, backupPath);
2250
- }).catch(() => {
2251
- fs.copyFileSync(currentDbPath, tmpPath);
2252
- _gzipBackup(tmpPath, backupPath);
2253
- });
2254
-
2255
- // Also copy images dir as a tarball if it has content
2256
- const imagesBackup = path.join(BACKUP_DIR, `images-${ts}${tag}.tar.gz`);
2257
- try {
2258
-
2259
- const imageFiles = fs.readdirSync(DEFAULT_IMAGES_DIR);
2260
- if (imageFiles.length > 0) {
2261
- require('child_process').spawnSync('tar', ['-czf', imagesBackup, '-C', path.dirname(DEFAULT_IMAGES_DIR), path.basename(DEFAULT_IMAGES_DIR)], { timeout: 30000 });
2262
- }
2263
- } catch {}
2264
-
2265
- cleanOldBackups();
2266
- return { backupName, backupPath, timestamp: now.toISOString() };
2260
+ return createBackupSync(label);
2267
2261
  }
2268
2262
 
2269
2263
  function createBackupSync(label) {
2270
2264
  if (!db || !currentDbPath) throw new Error('Database not initialized');
2271
- checkpointWal('TRUNCATE');
2265
+ checkpointWalOrThrow('TRUNCATE');
2272
2266
 
2273
2267
  const now = new Date();
2274
2268
  const ts = now.toISOString().replace(/[:.]/g, '-').slice(0, 19);
@@ -2328,7 +2322,7 @@ function restoreBackup(backupName) {
2328
2322
  createBackupSync('pre-restore');
2329
2323
 
2330
2324
  // Close current DB
2331
- checkpointWal('TRUNCATE');
2325
+ checkpointWalOrThrow('TRUNCATE');
2332
2326
  if (db) { db.close(); db = null; }
2333
2327
 
2334
2328
  // Decompress if needed, then copy over current DB
@@ -2874,7 +2868,7 @@ function updateStartupTaskBranch(sessionId, branch, worktreePath) {
2874
2868
 
2875
2869
  function updateStartupTaskCwd(sessionId, cwd) {
2876
2870
  getDb().prepare('UPDATE startup_tasks SET cwd = ?, worktree_path = ? WHERE ctm_session_id = ?')
2877
- .run(cwd || '', cwd && cwd.includes('.claude/worktrees/') ? cwd : null, sessionId);
2871
+ .run(cwd || '', cwd && /\/\.(?:claude|walle)\/worktrees\//.test(cwd) ? cwd : null, sessionId);
2878
2872
  flushWal();
2879
2873
  }
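The switch from `cwd.includes('.claude/worktrees/')` to a regular expression changes which paths are stored in `worktree_path`. A minimal sketch, using the pattern from the hunk above and a hypothetical helper name (`worktreePathFor`) with made-up sample paths:

```javascript
// Pattern taken from the updateStartupTaskCwd hunk above.
const WORKTREE_PATH_RE = /\/\.(?:claude|walle)\/worktrees\//;

// Hypothetical helper: returns the value that would be written to worktree_path.
function worktreePathFor(cwd) {
  return cwd && WORKTREE_PATH_RE.test(cwd) ? cwd : null;
}

console.log(worktreePathFor('/repo/.claude/worktrees/fix-build')); // /repo/.claude/worktrees/fix-build
console.log(worktreePathFor('/repo/.walle/worktrees/fix-build'));  // /repo/.walle/worktrees/fix-build
console.log(worktreePathFor('/repo/src'));                         // null
console.log(worktreePathFor('.claude/worktrees/fix-build'));       // null
```

The last case shows the one tightening: the new pattern requires a `/` before `.claude` or `.walle`, so a relative path that the old `includes` check accepted no longer counts as a worktree path.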
2880
2874
 
@@ -3593,24 +3587,15 @@ function upsertSession(id, data, opts) {
3593
3587
  }
3594
3588
 
3595
3589
  const agentId = data.agentSessionId;
3596
-
3597
- // Cross-tab claim guard: before writing agent_sessions, make sure this
3598
- // agent_session_id isn't already owned by a different CTM tab. This is the
3599
- // main defense against mass-spawn races where multiple relink paths compete.
3600
- if (agentId && agentId !== '__CLEAR__' && !allowReclaim) {
3601
- const existing = d.prepare(
3602
- 'SELECT ctm_session_id FROM agent_sessions WHERE agent_session_id = ?'
3603
- ).get(agentId);
3604
- if (existing && existing.ctm_session_id && existing.ctm_session_id !== id) {
3605
- const err = new Error(
3606
- `agent_session_id ${agentId} is already claimed by ctm_session ${existing.ctm_session_id} (refusing cross-tab claim for ${id})`
3607
- );
3608
- err.code = 'E_CROSS_TAB_CLAIM';
3609
- err.existingCtmSessionId = existing.ctm_session_id;
3610
- err.attemptedCtmSessionId = id;
3611
- err.agentSessionId = agentId;
3612
- throw err;
3613
- }
3590
+ function throwCrossTabClaim(existingCtmSessionId) {
3591
+ const err = new Error(
3592
+ `agent_session_id ${agentId} is already claimed by ctm_session ${existingCtmSessionId} (refusing cross-tab claim for ${id})`
3593
+ );
3594
+ err.code = 'E_CROSS_TAB_CLAIM';
3595
+ err.existingCtmSessionId = existingCtmSessionId;
3596
+ err.attemptedCtmSessionId = id;
3597
+ err.agentSessionId = agentId;
3598
+ throw err;
3614
3599
  }
3615
3600
  const ctmParams = {
3616
3601
  id,
@@ -3636,6 +3621,7 @@ function upsertSession(id, data, opts) {
3636
3621
  git_branch: data.gitBranch || '',
3637
3622
  user_msg_count: data.userMsgCount || 0,
3638
3623
  slug: data.slug || '',
3624
+ allow_reclaim: allowReclaim ? 1 : 0,
3639
3625
  } : null;
3640
3626
 
3641
3627
  // Wrap both upserts in a transaction to keep ctm_sessions + agent_sessions atomic
@@ -3656,13 +3642,21 @@ function upsertSession(id, data, opts) {
3656
3642
 
3657
3643
  // If agent session data provided, upsert agent_sessions
3658
3644
  if (agentParams) {
3659
- d.prepare(`
3645
+ if (!allowReclaim) {
3646
+ const existing = d.prepare(
3647
+ 'SELECT ctm_session_id FROM agent_sessions WHERE agent_session_id = ?'
3648
+ ).get(agentId);
3649
+ if (existing && existing.ctm_session_id && existing.ctm_session_id !== id) {
3650
+ throwCrossTabClaim(existing.ctm_session_id);
3651
+ }
3652
+ }
3653
+ const result = d.prepare(`
3660
3654
  INSERT INTO agent_sessions (agent_session_id, ctm_session_id, provider, project_path, jsonl_path,
3661
3655
  first_message, file_size, modified_at, hostname, model, git_branch, user_msg_count, slug)
3662
3656
  VALUES (@agent_session_id, @ctm_session_id, @provider, @project_path, @jsonl_path,
3663
3657
  @first_message, @file_size, @modified_at, @hostname, @model, @git_branch, @user_msg_count, @slug)
3664
3658
  ON CONFLICT(agent_session_id) DO UPDATE SET
3665
- ctm_session_id = COALESCE(NULLIF(excluded.ctm_session_id, ''), agent_sessions.ctm_session_id),
3659
+ ctm_session_id = excluded.ctm_session_id,
3666
3660
  provider = COALESCE(NULLIF(excluded.provider, ''), agent_sessions.provider),
3667
3661
  project_path = COALESCE(NULLIF(excluded.project_path, ''), agent_sessions.project_path),
3668
3662
  jsonl_path = COALESCE(NULLIF(excluded.jsonl_path, ''), agent_sessions.jsonl_path),
@@ -3675,10 +3669,20 @@ function upsertSession(id, data, opts) {
3675
3669
  user_msg_count = CASE WHEN excluded.user_msg_count > 0 THEN excluded.user_msg_count ELSE agent_sessions.user_msg_count END,
3676
3670
  slug = COALESCE(NULLIF(excluded.slug, ''), agent_sessions.slug),
3677
3671
  updated_at = datetime('now')
3672
+ WHERE @allow_reclaim = 1
3673
+ OR agent_sessions.ctm_session_id IS NULL
3674
+ OR agent_sessions.ctm_session_id = excluded.ctm_session_id
3678
3675
  `).run(agentParams);
3676
+ if (!allowReclaim && result.changes === 0) {
3677
+ const existing = d.prepare(
3678
+ 'SELECT ctm_session_id FROM agent_sessions WHERE agent_session_id = ?'
3679
+ ).get(agentId);
3680
+ throwCrossTabClaim(existing?.ctm_session_id || 'unknown');
3681
+ }
3679
3682
  }
3680
3683
  });
3681
- txn();
3684
+ if (typeof txn.immediate === 'function') txn.immediate();
3685
+ else txn();
3682
3686
  }
3683
3687
 
3684
3688
  function setSessionStar(id, starred) {
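The ownership rule that the reworked `upsertSession` enforces (an explicit pre-check, plus the `WHERE` clause on the `ON CONFLICT ... DO UPDATE`) can be modeled in memory. The sketch below is only a model: a `Map` stands in for the `agent_sessions` table, and `claimAgentSession` is a hypothetical name. It does not reproduce the transaction or the SQL itself.

```javascript
// In-memory model of the claim rule; the Map stands in for agent_sessions.
function claimAgentSession(owners, agentSessionId, ctmSessionId, { allowReclaim = false } = {}) {
  const current = owners.get(agentSessionId);
  // Mirrors the WHERE clause: reclaim allowed, no owner yet, or same owner.
  const mayWrite = allowReclaim || current == null || current === ctmSessionId;
  if (!mayWrite) {
    const err = new Error(
      `agent_session_id ${agentSessionId} is already claimed by ctm_session ${current} (refusing cross-tab claim for ${ctmSessionId})`
    );
    err.code = 'E_CROSS_TAB_CLAIM';
    err.existingCtmSessionId = current;
    err.attemptedCtmSessionId = ctmSessionId;
    err.agentSessionId = agentSessionId;
    throw err;
  }
  owners.set(agentSessionId, ctmSessionId);
}

const owners = new Map();
claimAgentSession(owners, 'agent-1', 'tab-a');            // first claim succeeds
claimAgentSession(owners, 'agent-1', 'tab-a');            // same owner may update
try {
  claimAgentSession(owners, 'agent-1', 'tab-b');          // different tab is refused
} catch (err) {
  console.log(err.code);                                  // E_CROSS_TAB_CLAIM
}
claimAgentSession(owners, 'agent-1', 'tab-b', { allowReclaim: true }); // explicit reclaim
console.log(owners.get('agent-1'));                       // tab-b
```

In the diff, the same condition is checked twice: once with a `SELECT` inside the transaction, and again by the conditional update, whose `result.changes === 0` triggers the same error when the row was not written.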
@@ -3757,13 +3761,17 @@ function getSessionTitleNew(id) {
3757
3761
  function getAllSessionsData() {
3758
3762
  return getDb().prepare(`
3759
3763
  SELECT c.*, a.agent_session_id, a.jsonl_path, a.first_message, a.file_size,
3760
- a.modified_at, a.hostname, a.model, a.git_branch, a.user_msg_count,
3761
- a.last_user_content, a.first_assistant_text, a.rename_name
3764
+ a.modified_at, a.hostname, a.model, a.git_branch,
3765
+ MAX(COALESCE(a.user_msg_count, 0), COALESCE(sc.user_msg_count, 0)) as user_msg_count,
3766
+ COALESCE(NULLIF(a.last_user_content, ''), sc.last_user_content) as last_user_content,
3767
+ COALESCE(NULLIF(a.first_assistant_text, ''), sc.first_assistant_text) as first_assistant_text,
3768
+ COALESCE(NULLIF(a.rename_name, ''), sc.rename_name) as rename_name
3762
3769
  FROM ctm_sessions c
3763
3770
  LEFT JOIN (
3764
3771
  SELECT *, ROW_NUMBER() OVER (PARTITION BY ctm_session_id ORDER BY modified_at DESC, created_at DESC) as rn
3765
3772
  FROM agent_sessions
3766
3773
  ) a ON a.ctm_session_id = c.id AND a.rn = 1
3774
+ LEFT JOIN session_conversations sc ON sc.ctm_session_id = a.agent_session_id
3767
3775
  `).all();
3768
3776
  }
3769
3777
 
@@ -3789,22 +3797,33 @@ function getAgentSession(agentSessionId) {
3789
3797
  */
3790
3798
  function deleteCtmSession(ctmSessionId) {
3791
3799
  const d = getDb();
3792
- // Collect JSONL paths before delete (for disk cleanup)
3793
- const agentRows = d.prepare('SELECT jsonl_path FROM agent_sessions WHERE ctm_session_id = ?').all(ctmSessionId);
3794
- const jsonlPaths = agentRows.map(r => r.jsonl_path).filter(Boolean);
3795
-
3796
- // CASCADE delete: deleting from ctm_sessions cascades to agent_sessions
3797
- d.prepare('DELETE FROM ctm_sessions WHERE id = ?').run(ctmSessionId);
3798
-
3799
- // Also clean up other child tables (these don't have FK constraints)
3800
- try { d.prepare('DELETE FROM startup_tasks WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession startup_tasks cleanup:', e.message); }
3801
- try { d.prepare('DELETE FROM scrollback_log WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession scrollback_log cleanup:', e.message); }
3802
- try { d.prepare('DELETE FROM session_conversations WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession session_conversations cleanup:', e.message); }
3803
- try { d.prepare('DELETE FROM session_messages WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession session_messages cleanup:', e.message); }
3804
- try { d.prepare('DELETE FROM session_analyses WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession session_analyses cleanup:', e.message); }
3805
- try { d.prepare('DELETE FROM prompt_queues WHERE ctm_session_id = ?').run(ctmSessionId); } catch (e) { console.error('[db] deleteSession prompt_queues cleanup:', e.message); }
3806
-
3807
- return jsonlPaths;
3800
+ const cleanupTables = [
3801
+ ['startup_tasks', 'ctm_session_id'],
3802
+ ['scrollback_log', 'ctm_session_id'],
3803
+ ['session_conversations', 'ctm_session_id'],
3804
+ ['session_messages', 'ctm_session_id'],
3805
+ ['session_analyses', 'ctm_session_id'],
3806
+ ['prompt_queues', 'ctm_session_id'],
3807
+ ];
3808
+ const txn = d.transaction(() => {
3809
+ // Collect JSONL paths before delete (for disk cleanup)
3810
+ const agentRows = d.prepare('SELECT jsonl_path FROM agent_sessions WHERE ctm_session_id = ?').all(ctmSessionId);
3811
+ const jsonlPaths = agentRows.map(r => r.jsonl_path).filter(Boolean);
3812
+
3813
+ // Also clean up child tables without FK constraints before the parent row is removed.
3814
+ for (const [table, idColumn] of cleanupTables) {
3815
+ const exists = d.prepare("SELECT 1 FROM sqlite_master WHERE type='table' AND name = ?").get(table);
3816
+ if (!exists) continue;
3817
+ const cols = d.prepare(`PRAGMA table_info(${table})`).all().map(c => c.name);
3818
+ const column = cols.includes(idColumn) ? idColumn : (cols.includes('session_id') ? 'session_id' : null);
3819
+ if (column) d.prepare(`DELETE FROM ${table} WHERE ${column} = ?`).run(ctmSessionId);
3820
+ }
3821
+
3822
+ // CASCADE delete: deleting from ctm_sessions cascades to agent_sessions.
3823
+ d.prepare('DELETE FROM ctm_sessions WHERE id = ?').run(ctmSessionId);
3824
+ return jsonlPaths;
3825
+ });
3826
+ return txn();
3808
3827
  }
3809
3828
 
3810
3829
  // Legacy compatibility: upsertSessionIndex is now a no-op (session_index dropped)
@@ -3833,7 +3852,7 @@ module.exports = {
3833
3852
  getSessionTitle, setSessionTitle, isSessionUserRenamed, getAllSessionTitles,
3834
3853
  createTemplate, listTemplates, getTemplate, deleteTemplate,
3835
3854
  trackPromptUsage, getPromptUsageStats,
3836
- checkpointWal, createBackup, createBackupSync, listBackups, restoreBackup, deleteBackup, startDailyBackup,
3855
+ checkpointWal, checkpointWalOrThrow, createBackup, createBackupSync, listBackups, restoreBackup, deleteBackup, startDailyBackup,
3837
3856
  saveQueue, loadQueue, loadAllQueues, deleteQueueDb,
3838
3857
  listPermRules, addPermRule, removePermRule, bulkSetPermRules, getPermRulesByProject,
3839
3858
  listAutoApprovals, upsertAutoApproval, toggleAutoApproval, deleteAutoApproval, getEnabledAutoApprovals,
@@ -0,0 +1,242 @@
1
+ # Session Standup Command Center Design
2
+
3
+ ## Problem
4
+
5
+ CTM already supports many long-lived sessions, but the operator cost rises when
6
+ multiple agents are active at once. Each terminal has its own context, and a new
7
+ session does not inherit the hard-won context from an existing one. Reusing
8
+ sessions is therefore necessary, but reuse creates a second problem: the user
9
+ must remember what every session is doing, whether it is blocked, and what the
10
+ next useful instruction should be.
11
+
12
+ The Standup Command Center is a central dashboard for that operating loop. It is
13
+ not a replacement for session terminals. It is a "manager view" over existing
14
+ sessions that answers:
15
+
16
+ - What is every active session doing?
17
+ - Which sessions need the user's attention?
18
+ - Which sessions are ready to review or finish?
19
+ - What should the user do next for each session?
20
+ - How can the user quickly deep dive and issue instructions without losing the
21
+ fleet-level view?
22
+
23
+ ## Research Signals
24
+
25
+ Open-source projects validate the need for a control-plane style UI, but CTM
26
+ should borrow the operating patterns rather than replatforming the runtime.
27
+
28
+ - Mission Control positions itself as an open-source dashboard for agent fleets,
29
+ tasks, costs, workflows, logs, memory, alerts, and quality gates:
30
+ https://github.com/builderz-labs/mission-control
31
+ - AutoGen Studio shows that multi-agent debugging benefits from explicit
32
+ sessions, observable messages, metrics, and reusable agent components, but it
33
+ is framed as a prototyping UI rather than a production app:
34
+ https://autogenhub.github.io/autogen/blog/2023/12/01/AutoGenStudio/
35
+ - OpenHands exposes a local GUI, REST API, CLI-compatible coding-agent workflow,
36
+ and cloud/enterprise surfaces for scaling coding agents:
37
+ https://github.com/OpenHands/OpenHands
38
+ - LangGraph's product framing emphasizes human-in-the-loop checks, persistent
39
+ memory, and streaming as first-class agent UX primitives:
40
+ https://www.langchain.com/langgraph
41
+
42
+ The design conclusion for CTM: do not add a manager-of-managers orchestration
43
+ runtime. CTM already has terminals, stream capture, recent-session history,
44
+ worktree state, approval detection, and review panels. The right abstraction is
45
+ a reusable dashboard projection over those signals, with deterministic manager
46
+ recommendations first and optional LLM annotations later.
47
+
48
+ ## Existing CTM Substrate
49
+
50
+ The feature should reuse these existing surfaces:
51
+
52
+ - `lib/session-stream.js`: summary, intent, progress, last prompt, stream status.
53
+ - `lib/status-hooks.js`: idle, busy, and waiting-input state transitions.
54
+ - `server.js` session payloads: active sessions, provider, cwd, branch, modified
55
+ time, capabilities, and worktree status.
56
+ - `/api/recent-sessions`: historical sessions and project filtering.
57
+ - `/api/stream/status` and `/api/sessions/:id/summary`: stream health and
58
+ per-session summary.
59
+ - Review tab and session deep links: structured transcript and review flows.
60
+
61
+ ## UX Model
62
+
63
+ The first screen in CTM Sessions should become the operator dashboard when no
64
+ terminal is selected. The dashboard is dense, operational, and scannable. It is
65
+ closer to a standup board than a marketing landing page.
66
+
67
+ Primary regions:
68
+
69
+ 1. Fleet header
70
+ - Total active sessions.
71
+ - Counts by lane.
72
+ - Last refresh time.
73
+ - Refresh action.
74
+ - New Session action.
75
+
76
+ 2. Attention ribbon
77
+ - Highest priority sessions needing input or approval.
78
+ - Shows only the action needed and evidence.
79
+ - Lets the user open the session directly.
80
+
81
+ 3. Lanes
82
+ - Needs User: approvals, questions, waiting input, or explicit blockers.
83
+ - Ready Review: worktree changes, completed work, or reviewable artifacts.
84
+ - Running: active output or recent work.
85
+ - Continue Later: idle, stale, or context-preserving sessions.
86
+
87
+ 4. Session card
88
+ - Title and agent/provider.
89
+ - State badge and age.
90
+ - Intent: one-line goal.
91
+ - Progress: one-line latest progress.
92
+ - Manager recommendation: what the user should do next.
93
+ - Evidence chips: waiting input, branch, worktree, last prompt, last activity.
94
+ - Actions: Open, Review when supported, and Send Instruction when a terminal
95
+ target is available.
96
+
97
+ 5. Deep dive behavior
98
+ - Open keeps the user's mental model simple: jump into the session terminal.
99
+ - Review opens the transcript/review panel when the agent supports it.
100
+ - Send Instruction targets the existing prompt/queue mechanism in a later
101
+ phase; the first implementation can focus the session and preserve the
102
+ command center as the resume surface.
103
+
104
+ ## Manager Recommendation Taxonomy
105
+
106
+ Recommendations must be conservative and explainable. They should be derived
107
+ from existing signals and include evidence. They should never claim certainty
108
+ from weak heuristics.
109
+
110
+ Recommended action kinds:
111
+
112
+ - `approval_needed`: approve, deny, or answer a tool prompt.
113
+ - `needs_input`: user instruction appears required.
114
+ - `review`: inspect completed work or worktree changes.
115
+ - `watch`: session is running; no action needed right now.
116
+ - `resume`: session is idle but can continue with more instruction.
117
+ - `investigate`: session appears failed or blocked.
118
+ - `archive`: work appears complete and inactive.
119
+
120
+ Lane priority:
121
+
122
+ 1. Failed or blocked signals.
123
+ 2. Waiting input or approval.
124
+ 3. Reviewable worktree/session output.
125
+ 4. Running or recently active sessions.
126
+ 5. Idle or stale sessions.
127
+
128
+ Each recommendation includes:
129
+
130
+ - `lane`: where it appears on the board.
131
+ - `actionKind`: stable machine-readable action.
132
+ - `actionLabel`: short CTA text.
133
+ - `recommendation`: short human-readable manager note.
134
+ - `confidence`: `high`, `medium`, or `low`.
135
+ - `evidence`: short strings tied to concrete signals.
136
+
137
+ ## API Contract
138
+
139
+ Add `GET /api/sessions/standup`.
140
+
141
+ Response shape:
142
+
143
+ ```json
144
+ {
145
+ "generatedAt": "2026-04-30T12:00:00.000Z",
146
+ "counts": {
147
+ "total": 3,
148
+ "needs_user": 1,
149
+ "ready_review": 1,
150
+ "running": 1,
151
+ "continue_later": 0
152
+ },
153
+ "lanes": [
154
+ {
155
+ "id": "needs_user",
156
+ "title": "Needs User",
157
+ "sessions": []
158
+ }
159
+ ],
160
+ "sessions": [
161
+ {
162
+ "id": "ctm-session-id",
163
+ "agentSessionId": "agent-session-id",
164
+ "title": "Fix build failure",
165
+ "agent": "codex",
166
+ "provider": "openai",
167
+ "cwd": "/repo",
168
+ "branch": "fix-build",
169
+ "status": "waiting_input",
170
+ "lane": "needs_user",
171
+ "actionKind": "approval_needed",
172
+ "actionLabel": "Respond",
173
+ "recommendation": "Approval or input is waiting in the terminal.",
174
+ "confidence": "high",
175
+ "evidence": ["waiting_input", "last activity 2m ago"],
176
+ "intent": "Fix build failure",
177
+ "progress": "Waiting for command approval",
178
+ "lastActivity": "2026-04-30T11:58:00.000Z",
179
+ "worktree": {
180
+ "branch": "fix-build",
181
+ "dirtyFiles": 2,
182
+ "unmergedCommits": 1
183
+ },
184
+ "capabilities": {
185
+ "review": true,
186
+ "resume": true
187
+ }
188
+ }
189
+ ]
190
+ }
191
+ ```
192
+
193
+ The endpoint should be pure projection logic over active sessions first. Recent
194
+ sessions can be added as a later extension once the active-session dashboard is
195
+ stable.
196
+
197
+ ## Implementation Phases
198
+
199
+ Phase 1: Design document
200
+
201
+ - Commit this design.
202
+
203
+ Phase 2: Backend projection
204
+
205
+ - Add a pure `lib/session-standup.js` module.
206
+ - Classify sessions into lanes and recommendations.
207
+ - Add unit tests for the classifier and snapshot builder.
208
+ - Add `GET /api/sessions/standup`.
209
+
210
+ Phase 3: Dashboard UI
211
+
212
+ - Replace the empty Sessions welcome panel with the Standup Command Center.
213
+ - Fetch `/api/sessions/standup`.
214
+ - Render fleet header, attention ribbon, lanes, cards, and actions.
215
+ - Keep the UI within the existing CTM visual language: compact, dark,
216
+ terminal-adjacent, and operational.
217
+
218
+ Phase 4: Verification and polish
219
+
220
+ - Run focused unit tests.
221
+ - Run an isolated CTM dev server on random ports, never the primary 3456/3457.
222
+ - Curl the new API.
223
+ - Drive the dashboard in a real browser and confirm it renders without layout
224
+ overlap or blank states.
225
+
226
+ Future phase: Manager annotations
227
+
228
+ - Add optional LLM-generated manager notes only after deterministic
229
+ recommendations are useful.
230
+ - Store annotation provenance and age.
231
+ - Do not let the LLM issue autonomous cross-session commands; it can propose
232
+ actions, not execute them.
233
+
234
+ ## Open Questions
235
+
236
+ - Should historical/recent sessions appear on the same board or in a secondary
237
+ "Resume Later" view?
238
+ - Should Send Instruction fan out to multiple sessions, or should the first
239
+ version stay one-session-at-a-time?
240
+ - Should manager annotations be cached per session turn, per stream summary, or
241
+ per dashboard refresh?
242
+