create-walle 0.9.21 → 0.9.23

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (500)
  1. package/README.md +27 -5
  2. package/package.json +2 -2
  3. package/template/CLAUDE.md +2 -2
  4. package/template/LICENSE +1 -1
  5. package/template/bin/ctm-dev-cleanup.js +24 -3
  6. package/template/bin/ctm-launch.sh +13 -0
  7. package/template/bin/dev.sh +156 -18
  8. package/template/bin/node-bin.sh +84 -0
  9. package/template/bin/pin-node.sh +51 -0
  10. package/template/claude-task-manager/api-prompts.js +1203 -182
  11. package/template/claude-task-manager/api-reviews.js +109 -15
  12. package/template/claude-task-manager/approval-agent.js +1360 -280
  13. package/template/claude-task-manager/bin/restart-ctm.sh +64 -23
  14. package/template/claude-task-manager/bin/storage-migration-supervisor.js +338 -0
  15. package/template/claude-task-manager/db.js +4417 -295
  16. package/template/claude-task-manager/docs/app-update-refresh-protocol.md +69 -0
  17. package/template/claude-task-manager/docs/approval-ai-refinement.md +138 -0
  18. package/template/claude-task-manager/docs/approval-rescue-loop.md +74 -0
  19. package/template/claude-task-manager/docs/codex-operational-warning-health.md +107 -0
  20. package/template/claude-task-manager/docs/codex-resume-state-guard-design.md +17 -12
  21. package/template/claude-task-manager/docs/codex-terminal-render-controller-handoff.md +311 -0
  22. package/template/claude-task-manager/docs/coding-agent-hooks-architecture.md +418 -0
  23. package/template/claude-task-manager/docs/conversation-import-freshness.md +20 -0
  24. package/template/claude-task-manager/docs/google-workspace-auth-health.md +77 -0
  25. package/template/claude-task-manager/docs/image-paste-ux.md +13 -0
  26. package/template/claude-task-manager/docs/ipad-web-preview.md +88 -0
  27. package/template/claude-task-manager/docs/main-loop-offload-architecture.md +66 -0
  28. package/template/claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md +274 -519
  29. package/template/claude-task-manager/docs/mobile-live-streaming.md +27 -5
  30. package/template/claude-task-manager/docs/mobile-remote-submission-lifecycle.md +69 -0
  31. package/template/claude-task-manager/docs/phone-access-design.md +53 -15
  32. package/template/claude-task-manager/docs/phone-passkey-identity.md +122 -0
  33. package/template/claude-task-manager/docs/phone-setup.md +3 -0
  34. package/template/claude-task-manager/docs/prompt-editing-tree-design.md +25 -1
  35. package/template/claude-task-manager/docs/remote-desktop-access-design.md +268 -0
  36. package/template/claude-task-manager/docs/restart-lifecycle-architecture.md +95 -0
  37. package/template/claude-task-manager/docs/runtime-work-control-plane.md +53 -0
  38. package/template/claude-task-manager/docs/session-interactive-wait-surfaces.md +38 -0
  39. package/template/claude-task-manager/docs/session-needs-you-dismissal.md +84 -0
  40. package/template/claude-task-manager/docs/session-render-state-management-design.md +91 -3
  41. package/template/claude-task-manager/docs/session-standup-command-center-design.md +25 -1
  42. package/template/claude-task-manager/docs/session-title-authority.md +32 -0
  43. package/template/claude-task-manager/docs/session-workspace-binding.md +33 -0
  44. package/template/claude-task-manager/docs/skill-intent-resolution-design.md +72 -0
  45. package/template/claude-task-manager/docs/walle-mcp-supervisor-health.md +86 -0
  46. package/template/claude-task-manager/docs/walle-relay-phone-access-design.md +24 -15
  47. package/template/claude-task-manager/docs/walle-session-history-hydration.md +114 -0
  48. package/template/claude-task-manager/docs/walle-session-input-queue.md +104 -0
  49. package/template/claude-task-manager/docs/walle-session-model-catalog.md +90 -0
  50. package/template/claude-task-manager/docs/walle-session-model-preferences.md +15 -6
  51. package/template/claude-task-manager/git-utils.js +897 -27
  52. package/template/claude-task-manager/lib/agent-capabilities.js +33 -0
  53. package/template/claude-task-manager/lib/agent-cli-cache.js +37 -7
  54. package/template/claude-task-manager/lib/agent-hooks-installer.js +26 -2
  55. package/template/claude-task-manager/lib/agent-presets.js +17 -1
  56. package/template/claude-task-manager/lib/all-sessions-query.js +108 -0
  57. package/template/claude-task-manager/lib/approval-ai-refinement.js +488 -0
  58. package/template/claude-task-manager/lib/approval-self-adapt.js +168 -0
  59. package/template/claude-task-manager/lib/async-semaphore.js +44 -0
  60. package/template/claude-task-manager/lib/auth-context.js +5 -0
  61. package/template/claude-task-manager/lib/auth-rate-limit.js +47 -4
  62. package/template/claude-task-manager/lib/auth-rules.js +29 -2
  63. package/template/claude-task-manager/lib/auto-approval-verifier.js +129 -16
  64. package/template/claude-task-manager/lib/background-llm.js +144 -17
  65. package/template/claude-task-manager/lib/branch-inventory.js +212 -0
  66. package/template/claude-task-manager/lib/claude-desktop-sessions.js +15 -3
  67. package/template/claude-task-manager/lib/coalesce-sync-frames.js +151 -0
  68. package/template/claude-task-manager/lib/codex-launch-health.js +762 -0
  69. package/template/claude-task-manager/lib/codex-transcript-pager.js +51 -0
  70. package/template/claude-task-manager/lib/codex-zst.js +124 -0
  71. package/template/claude-task-manager/lib/coding-agent-models.js +233 -30
  72. package/template/claude-task-manager/lib/connection-health.js +232 -0
  73. package/template/claude-task-manager/lib/conversation-blob-parser.js +42 -0
  74. package/template/claude-task-manager/lib/conversation-tail-merge.js +89 -26
  75. package/template/claude-task-manager/lib/ctm-session-context-api.js +39 -10
  76. package/template/claude-task-manager/lib/cursor-conversation-store.js +354 -0
  77. package/template/claude-task-manager/lib/db-owner-worker-client.js +315 -0
  78. package/template/claude-task-manager/lib/document-review.js +141 -6
  79. package/template/claude-task-manager/lib/escalation-review.js +152 -0
  80. package/template/claude-task-manager/lib/graceful-shutdown.js +159 -0
  81. package/template/claude-task-manager/lib/headless-term-service.js +678 -0
  82. package/template/claude-task-manager/lib/heavy-worker-fallback.js +38 -0
  83. package/template/claude-task-manager/lib/jsonl-conversation-parser.js +542 -0
  84. package/template/claude-task-manager/lib/jsonl-range-reader.js +112 -0
  85. package/template/claude-task-manager/lib/main-db-census.js +216 -0
  86. package/template/claude-task-manager/lib/message-pagination.js +106 -4
  87. package/template/claude-task-manager/lib/microsoft-dev-tunnel-setup.js +750 -26
  88. package/template/claude-task-manager/lib/mobile-auth-api.js +274 -7
  89. package/template/claude-task-manager/lib/mobile-auth-store.js +592 -10
  90. package/template/claude-task-manager/lib/mobile-notification-dispatcher.js +15 -0
  91. package/template/claude-task-manager/lib/model-overview-brain-fallback.js +311 -0
  92. package/template/claude-task-manager/lib/model-overview-cache.js +141 -0
  93. package/template/claude-task-manager/lib/models-health-routing-notice.js +126 -0
  94. package/template/claude-task-manager/lib/node-pin-guard.js +93 -0
  95. package/template/claude-task-manager/lib/perf-tracker.js +242 -6
  96. package/template/claude-task-manager/lib/permission-match.js +76 -0
  97. package/template/claude-task-manager/lib/permission-sync.js +133 -20
  98. package/template/claude-task-manager/lib/process-title.js +35 -0
  99. package/template/claude-task-manager/lib/prompt-executions-query.js +25 -0
  100. package/template/claude-task-manager/lib/prompt-index-disk-cache.js +44 -0
  101. package/template/claude-task-manager/lib/prompt-intent.js +132 -0
  102. package/template/claude-task-manager/lib/provider-user-context.js +34 -0
  103. package/template/claude-task-manager/lib/read-pool-client.js +313 -0
  104. package/template/claude-task-manager/lib/readpool-breaker.js +31 -0
  105. package/template/claude-task-manager/lib/recent-sessions-breaker.js +12 -0
  106. package/template/claude-task-manager/lib/remote-feedback-client.js +72 -0
  107. package/template/claude-task-manager/lib/remote-relay-protocol.js +37 -4
  108. package/template/claude-task-manager/lib/remote-relay-store.js +159 -0
  109. package/template/claude-task-manager/lib/remote-submission-observer.js +278 -0
  110. package/template/claude-task-manager/lib/restart-guard.js +109 -0
  111. package/template/claude-task-manager/lib/restore-interruption-detector.js +439 -0
  112. package/template/claude-task-manager/lib/restore-policy.js +13 -0
  113. package/template/claude-task-manager/lib/restore-resume-batch.js +74 -0
  114. package/template/claude-task-manager/lib/restore-runtime.js +68 -0
  115. package/template/claude-task-manager/lib/restore-storm.js +34 -0
  116. package/template/claude-task-manager/lib/resume-cwd.js +36 -0
  117. package/template/claude-task-manager/lib/resume-preflight.js +313 -0
  118. package/template/claude-task-manager/lib/runtime-work-registry.js +444 -0
  119. package/template/claude-task-manager/lib/sanitize-openai-auth.js +31 -0
  120. package/template/claude-task-manager/lib/scheduler.js +21 -1
  121. package/template/claude-task-manager/lib/scrollback-snapshot-store.js +159 -0
  122. package/template/claude-task-manager/lib/serial-task-queue.js +64 -0
  123. package/template/claude-task-manager/lib/server-listeners.js +239 -0
  124. package/template/claude-task-manager/lib/session-capture.js +42 -7
  125. package/template/claude-task-manager/lib/session-content-backfill.js +131 -0
  126. package/template/claude-task-manager/lib/session-history.js +388 -43
  127. package/template/claude-task-manager/lib/session-host-manager.js +287 -0
  128. package/template/claude-task-manager/lib/session-image-refs.js +209 -0
  129. package/template/claude-task-manager/lib/session-jobs.js +399 -59
  130. package/template/claude-task-manager/lib/session-prompt-index.js +137 -0
  131. package/template/claude-task-manager/lib/session-restore.js +53 -0
  132. package/template/claude-task-manager/lib/session-standup.js +123 -23
  133. package/template/claude-task-manager/lib/session-state-bus.js +14 -0
  134. package/template/claude-task-manager/lib/session-stream.js +64 -16
  135. package/template/claude-task-manager/lib/session-timeline-summary.js +260 -0
  136. package/template/claude-task-manager/lib/session-token-usage.js +494 -0
  137. package/template/claude-task-manager/lib/session-workspace-binding.js +356 -0
  138. package/template/claude-task-manager/lib/setup-network-config.js +9 -0
  139. package/template/claude-task-manager/lib/size-cap.js +45 -0
  140. package/template/claude-task-manager/lib/size-cap.test.js +62 -0
  141. package/template/claude-task-manager/lib/skill-autocomplete.js +180 -1
  142. package/template/claude-task-manager/lib/skill-intent-resolver.js +304 -0
  143. package/template/claude-task-manager/lib/sqlite-driver.js +19 -3
  144. package/template/claude-task-manager/lib/standup-attention.js +7 -3
  145. package/template/claude-task-manager/lib/status-authority.js +39 -0
  146. package/template/claude-task-manager/lib/status-hooks.js +4 -0
  147. package/template/claude-task-manager/lib/storage-migration.js +235 -0
  148. package/template/claude-task-manager/lib/structured-capture.js +298 -0
  149. package/template/claude-task-manager/lib/sync-io-census.js +163 -0
  150. package/template/claude-task-manager/lib/tailscale-setup.js +6 -0
  151. package/template/claude-task-manager/lib/terminal-activity-evidence.js +33 -0
  152. package/template/claude-task-manager/lib/terminal-choice.js +364 -0
  153. package/template/claude-task-manager/lib/terminal-control-sanitize.js +17 -0
  154. package/template/claude-task-manager/lib/terminal-fingerprint.js +48 -0
  155. package/template/claude-task-manager/lib/terminal-output-flush.js +84 -0
  156. package/template/claude-task-manager/lib/timeline-order.js +122 -0
  157. package/template/claude-task-manager/lib/transcript-store.js +348 -43
  158. package/template/claude-task-manager/lib/transport-security.js +84 -1
  159. package/template/claude-task-manager/lib/wait-state.js +184 -0
  160. package/template/claude-task-manager/lib/walle-client.js +47 -5
  161. package/template/claude-task-manager/lib/walle-ctm-history.js +564 -4
  162. package/template/claude-task-manager/lib/walle-external-actions.js +135 -16
  163. package/template/claude-task-manager/lib/walle-history-hydration.js +46 -0
  164. package/template/claude-task-manager/lib/walle-native-health.js +403 -0
  165. package/template/claude-task-manager/lib/walle-repair.js +701 -0
  166. package/template/claude-task-manager/lib/walle-session-cache.js +109 -0
  167. package/template/claude-task-manager/lib/walle-session-context.js +57 -21
  168. package/template/claude-task-manager/lib/walle-session-model-catalog.js +34 -0
  169. package/template/claude-task-manager/lib/walle-supervisor.js +539 -63
  170. package/template/claude-task-manager/lib/walle-transcript.js +52 -0
  171. package/template/claude-task-manager/lib/worktree-active-sync.js +11 -7
  172. package/template/claude-task-manager/lib/worktree-cwd.js +32 -1
  173. package/template/claude-task-manager/package.json +1 -1
  174. package/template/claude-task-manager/prompt-harvest.js +89 -66
  175. package/template/claude-task-manager/providers/claude-code.js +51 -3
  176. package/template/claude-task-manager/providers/cursor.js +140 -45
  177. package/template/claude-task-manager/public/css/reviews.css +551 -61
  178. package/template/claude-task-manager/public/css/setup.css +191 -0
  179. package/template/claude-task-manager/public/css/walle-session.css +865 -10
  180. package/template/claude-task-manager/public/css/walle.css +154 -0
  181. package/template/claude-task-manager/public/designs/ai-providers-consolidation-v2.html +830 -0
  182. package/template/claude-task-manager/public/index.html +18516 -2058
  183. package/template/claude-task-manager/public/ipad.html +363 -0
  184. package/template/claude-task-manager/public/js/document-review-links.js +301 -0
  185. package/template/claude-task-manager/public/js/image-normalize.js +69 -36
  186. package/template/claude-task-manager/public/js/message-renderer.js +1265 -77
  187. package/template/claude-task-manager/public/js/prompts.js +66 -29
  188. package/template/claude-task-manager/public/js/reviews.js +901 -133
  189. package/template/claude-task-manager/public/js/session-activity-utils.js +11 -1
  190. package/template/claude-task-manager/public/js/session-search-utils.js +94 -10
  191. package/template/claude-task-manager/public/js/session-status-precedence.js +23 -5
  192. package/template/claude-task-manager/public/js/setup.js +1273 -176
  193. package/template/claude-task-manager/public/js/stream-view.js +691 -73
  194. package/template/claude-task-manager/public/js/terminal-reconciler.js +210 -0
  195. package/template/claude-task-manager/public/js/walle-session.js +2455 -158
  196. package/template/claude-task-manager/public/js/walle.js +455 -28
  197. package/template/claude-task-manager/public/m/app.css +2909 -262
  198. package/template/claude-task-manager/public/m/app.js +6601 -398
  199. package/template/claude-task-manager/public/m/claim.html +224 -17
  200. package/template/claude-task-manager/public/m/index.html +117 -21
  201. package/template/claude-task-manager/public/m/sw.js +3 -1
  202. package/template/claude-task-manager/public/manifest.json +2 -2
  203. package/template/claude-task-manager/public/prompts.html +30 -14
  204. package/template/claude-task-manager/queue-engine.js +507 -28
  205. package/template/claude-task-manager/scripts/repair-claude-session-images.js +27 -8
  206. package/template/claude-task-manager/server.js +14341 -2197
  207. package/template/claude-task-manager/session-integrity.js +160 -18
  208. package/template/claude-task-manager/session-search-ranking.js +1 -0
  209. package/template/claude-task-manager/session-utils.js +25 -5
  210. package/template/claude-task-manager/workers/approval-blocklist.js +96 -6
  211. package/template/claude-task-manager/workers/approval-widget-validator.js +14 -8
  212. package/template/claude-task-manager/workers/conversation-import-worker.js +11 -50
  213. package/template/claude-task-manager/workers/db-owner-worker.js +386 -0
  214. package/template/claude-task-manager/workers/harvest-worker.js +9 -55
  215. package/template/claude-task-manager/workers/headless-term-worker.js +9 -530
  216. package/template/claude-task-manager/workers/read-pool-worker.js +387 -0
  217. package/template/claude-task-manager/workers/scrollback-worker.js +11 -72
  218. package/template/claude-task-manager/workers/session-host-process.js +146 -0
  219. package/template/claude-task-manager/workers/session-integrity-worker.js +10 -54
  220. package/template/claude-task-manager/workers/state-detectors/base.js +18 -1
  221. package/template/claude-task-manager/workers/state-detectors/claude-code.js +182 -9
  222. package/template/claude-task-manager/workers/state-detectors/codex.js +150 -2
  223. package/template/claude-task-manager/workers/state-detectors/cursor.js +127 -0
  224. package/template/claude-task-manager/workers/state-detectors/gemini.js +21 -0
  225. package/template/claude-task-manager/workers/state-detectors/index.js +29 -0
  226. package/template/claude-task-manager/workers/state-detectors/opencode.js +103 -0
  227. package/template/docs/design/markdown-review-pane.md +206 -0
  228. package/template/docs/designs/2026-05-17-portkey-gateway-provider-ux.md +129 -38
  229. package/template/docs/designs/2026-05-20-mobile-worktree-finish-command.md +27 -0
  230. package/template/docs/designs/2026-05-22-ai-configuration-consolidation.md +248 -0
  231. package/template/docs/designs/ai-configuration-consolidation-mock.html +812 -0
  232. package/template/docs/private-memory-and-pii-policy.md +69 -0
  233. package/template/package.json +2 -1
  234. package/template/scripts/check-private-data.js +201 -0
  235. package/template/shared/sqlite-owner-guard.js +30 -0
  236. package/template/shared/sqlite-owner-write-queue.js +225 -0
  237. package/template/shared/sqlite-storage-policy.js +111 -0
  238. package/template/shared/sqlite-write-lock.js +428 -0
  239. package/template/wall-e/agent-runners/claude-code.js +5 -0
  240. package/template/wall-e/agent.js +166 -22
  241. package/template/wall-e/api-walle.js +524 -70
  242. package/template/wall-e/auth/provider-flows.js +11 -1
  243. package/template/wall-e/bin/walle-mcp-stdio.js +341 -17
  244. package/template/wall-e/brain.js +1614 -141
  245. package/template/wall-e/chat/attachment-blocks.js +96 -0
  246. package/template/wall-e/chat/attachments.js +2 -1
  247. package/template/wall-e/chat/capability-resolver.js +7 -7
  248. package/template/wall-e/chat/context-messages.js +28 -0
  249. package/template/wall-e/chat/conversation-frame.js +630 -0
  250. package/template/wall-e/chat/provider-messages.js +125 -0
  251. package/template/wall-e/chat.js +1002 -233
  252. package/template/wall-e/coding/acceptance-contract.js +170 -0
  253. package/template/wall-e/coding/acp-adapter.js +1 -1
  254. package/template/wall-e/coding/agent-catalog.js +3 -0
  255. package/template/wall-e/coding/artifact-store.js +93 -0
  256. package/template/wall-e/coding/capability-router.js +120 -0
  257. package/template/wall-e/coding/coding-run-controller.js +423 -0
  258. package/template/wall-e/coding/compaction-service.js +157 -12
  259. package/template/wall-e/coding/frontend-verification.js +258 -0
  260. package/template/wall-e/coding/lifecycle-hooks.js +75 -0
  261. package/template/wall-e/coding/local-preview-contract.js +157 -0
  262. package/template/wall-e/coding/permission-service.js +57 -13
  263. package/template/wall-e/coding/prompt-bundle.js +19 -1
  264. package/template/wall-e/coding/prompt-section-registry.js +227 -0
  265. package/template/wall-e/coding/provider-compat.js +15 -0
  266. package/template/wall-e/coding/runtime-events.js +224 -0
  267. package/template/wall-e/coding/runtime-mode.js +3 -0
  268. package/template/wall-e/coding/side-git-snapshot.js +160 -4
  269. package/template/wall-e/coding/snapshot-service.js +143 -1
  270. package/template/wall-e/coding/stream-processor.js +388 -34
  271. package/template/wall-e/coding/task-tool.js +141 -4
  272. package/template/wall-e/coding/tool-execution-controller.js +365 -0
  273. package/template/wall-e/coding/tool-registry.js +43 -5
  274. package/template/wall-e/coding/user-hooks.js +217 -0
  275. package/template/wall-e/coding-orchestrator.js +1330 -221
  276. package/template/wall-e/coding-prompts.js +20 -4
  277. package/template/wall-e/context/context-builder.js +15 -2
  278. package/template/wall-e/decision/confidence.js +1 -1
  279. package/template/wall-e/docs/coding-acceptance-contract.md +41 -0
  280. package/template/wall-e/docs/external-action-controller.md +26 -6
  281. package/template/wall-e/docs/telemetry-lifecycle.md +8 -2
  282. package/template/wall-e/embeddings.js +591 -53
  283. package/template/wall-e/external-action-controller.js +12 -0
  284. package/template/wall-e/http/auth.js +1 -0
  285. package/template/wall-e/http/chat-api.js +46 -11
  286. package/template/wall-e/http/model-admin.js +836 -34
  287. package/template/wall-e/lib/boot-profile.js +88 -0
  288. package/template/wall-e/lib/event-loop-monitor.js +93 -0
  289. package/template/wall-e/lib/service-health.js +194 -0
  290. package/template/wall-e/llm/anthropic.js +130 -5
  291. package/template/wall-e/llm/client.js +266 -63
  292. package/template/wall-e/llm/default-fallback.js +382 -0
  293. package/template/wall-e/llm/health.js +19 -0
  294. package/template/wall-e/llm/message-guard.js +78 -0
  295. package/template/wall-e/llm/model-catalog.js +252 -1
  296. package/template/wall-e/llm/openai.js +26 -4
  297. package/template/wall-e/llm/portkey-sync.js +654 -0
  298. package/template/wall-e/llm/provider-error.js +30 -2
  299. package/template/wall-e/llm/registry.js +5 -1
  300. package/template/wall-e/llm/request-compat.js +67 -0
  301. package/template/wall-e/loops/backfill.js +79 -23
  302. package/template/wall-e/loops/brain-optimize.js +67 -0
  303. package/template/wall-e/loops/ingest.js +25 -10
  304. package/template/wall-e/loops/question-digest.js +160 -0
  305. package/template/wall-e/loops/reflect.js +6 -4
  306. package/template/wall-e/loops/think.js +39 -12
  307. package/template/wall-e/mcp-server.js +318 -36
  308. package/template/wall-e/memory/ctm-context-client.js +52 -14
  309. package/template/wall-e/memory/ctm-operational-context.js +237 -0
  310. package/template/wall-e/memory/ctm-prompt-executions-client.js +128 -0
  311. package/template/wall-e/memory/ctm-session-context.js +111 -63
  312. package/template/wall-e/prompts/coding/deepseek.txt +3 -0
  313. package/template/wall-e/prompts/coding/gemini.txt +6 -0
  314. package/template/wall-e/prompts/coding/gpt.txt +6 -0
  315. package/template/wall-e/prompts/coding/local.txt +7 -0
  316. package/template/wall-e/runtime/decision-hooks.js +115 -0
  317. package/template/wall-e/runtime/devbox-gateway.js +82 -8
  318. package/template/wall-e/runtime/prompt-manifest.js +86 -0
  319. package/template/wall-e/runtime/tool-executor.js +269 -0
  320. package/template/wall-e/runtime/tool-result-envelope.js +138 -0
  321. package/template/wall-e/runtime/transcript-projection.js +60 -0
  322. package/template/wall-e/runtime/walle-runtime.js +224 -0
  323. package/template/wall-e/scripts/db-optimize/migrate.js +162 -0
  324. package/template/wall-e/scripts/db-optimize/recall-eval.js +117 -0
  325. package/template/wall-e/server.js +15 -0
  326. package/template/wall-e/session-files.js +9 -0
  327. package/template/wall-e/skills/_bundled/google-calendar/run.js +1 -1
  328. package/template/wall-e/skills/_bundled/gws-workspace/run.js +1 -1
  329. package/template/wall-e/skills/_bundled/slack-mentions/run.js +76 -6
  330. package/template/wall-e/skills/claude-code-reader.js +7 -3
  331. package/template/wall-e/skills/script-skill-runner.js +10 -0
  332. package/template/wall-e/skills/skill-planner.js +38 -0
  333. package/template/wall-e/tools/builtin-middleware.js +19 -9
  334. package/template/wall-e/tools/local-tools.js +1428 -16
  335. package/template/wall-e/tools/permission-checker.js +73 -5
  336. package/template/wall-e/tools/question-manager.js +117 -7
  337. package/template/wall-e/training/harvester.js +12 -28
  338. package/template/wall-e/training/replay.js +25 -80
  339. package/template/website/index.html +10 -10
  340. package/template/wall-e/eval/ab-test.js +0 -203
  341. package/template/wall-e/eval/agent-runner.js +0 -772
  342. package/template/wall-e/eval/agent-scorer.js +0 -461
  343. package/template/wall-e/eval/aggregator.js +0 -414
  344. package/template/wall-e/eval/allowed-test-commands.js +0 -34
  345. package/template/wall-e/eval/benchmark-generator.js +0 -113
  346. package/template/wall-e/eval/benchmarks/chat-eval.json +0 -1662
  347. package/template/wall-e/eval/benchmarks/chat.json +0 -82
  348. package/template/wall-e/eval/benchmarks/coding-agent-real.json +0 -1
  349. package/template/wall-e/eval/benchmarks/coding-agent.json +0 -1581
  350. package/template/wall-e/eval/benchmarks/coding.json +0 -122
  351. package/template/wall-e/eval/benchmarks/memory-retrieval.json +0 -234
  352. package/template/wall-e/eval/benchmarks/reasoning.json +0 -82
  353. package/template/wall-e/eval/benchmarks/swebench-lite-30.json +0 -212
  354. package/template/wall-e/eval/benchmarks.js +0 -669
  355. package/template/wall-e/eval/cc-replay.js +0 -719
  356. package/template/wall-e/eval/chat-eval.js +0 -525
  357. package/template/wall-e/eval/check-keys.js +0 -15
  358. package/template/wall-e/eval/check-providers.js +0 -42
  359. package/template/wall-e/eval/codex-cli-baseline.js +0 -669
  360. package/template/wall-e/eval/coding-agent-real.js +0 -570
  361. package/template/wall-e/eval/context-compactor.js +0 -251
  362. package/template/wall-e/eval/debug-agent003.js +0 -68
  363. package/template/wall-e/eval/diagnostics.js +0 -216
  364. package/template/wall-e/eval/eval-orchestrator.js +0 -642
  365. package/template/wall-e/eval/evaluate.js +0 -202
  366. package/template/wall-e/eval/evaluator.js +0 -373
  367. package/template/wall-e/eval/exporter.js +0 -212
  368. package/template/wall-e/eval/fixtures/express-basic/package.json +0 -9
  369. package/template/wall-e/eval/fixtures/express-basic/server.js +0 -115
  370. package/template/wall-e/eval/fixtures/express-basic/test.js +0 -83
  371. package/template/wall-e/eval/fixtures/express-buggy/package.json +0 -9
  372. package/template/wall-e/eval/fixtures/express-buggy/server.js +0 -113
  373. package/template/wall-e/eval/fixtures/express-buggy/test.js +0 -83
  374. package/template/wall-e/eval/fixtures/express-buggy-items/package.json +0 -9
  375. package/template/wall-e/eval/fixtures/express-buggy-items/server.js +0 -112
  376. package/template/wall-e/eval/fixtures/express-buggy-items/test.js +0 -83
  377. package/template/wall-e/eval/fixtures/express-buggy-search/package.json +0 -9
  378. package/template/wall-e/eval/fixtures/express-buggy-search/server.js +0 -121
  379. package/template/wall-e/eval/fixtures/express-buggy-search/test.js +0 -83
  380. package/template/wall-e/eval/fixtures/express-rename-data/data.js +0 -34
  381. package/template/wall-e/eval/fixtures/express-rename-data/package.json +0 -9
  382. package/template/wall-e/eval/fixtures/express-rename-data/server.js +0 -97
  383. package/template/wall-e/eval/fixtures/express-rename-data/test.js +0 -88
  384. package/template/wall-e/eval/fixtures/express-xss/package.json +0 -12
  385. package/template/wall-e/eval/fixtures/express-xss/server.js +0 -90
  386. package/template/wall-e/eval/fixtures/express-xss/test.js +0 -67
  387. package/template/wall-e/eval/fixtures/express-xss/views/profile.ejs +0 -9
  388. package/template/wall-e/eval/fixtures/fullstack-app/config/default.js +0 -9
  389. package/template/wall-e/eval/fixtures/fullstack-app/config/test.js +0 -13
  390. package/template/wall-e/eval/fixtures/fullstack-app/package.json +0 -11
  391. package/template/wall-e/eval/fixtures/fullstack-app/public/css/style.css +0 -137
  392. package/template/wall-e/eval/fixtures/fullstack-app/public/index.html +0 -46
  393. package/template/wall-e/eval/fixtures/fullstack-app/public/js/app.js +0 -121
  394. package/template/wall-e/eval/fixtures/fullstack-app/public/js/auth.js +0 -71
  395. package/template/wall-e/eval/fixtures/fullstack-app/public/js/items.js +0 -80
  396. package/template/wall-e/eval/fixtures/fullstack-app/public/js/users.js +0 -46
  397. package/template/wall-e/eval/fixtures/fullstack-app/public/login.html +0 -45
  398. package/template/wall-e/eval/fixtures/fullstack-app/public/register.html +0 -38
  399. package/template/wall-e/eval/fixtures/fullstack-app/scripts/migrate.js +0 -23
  400. package/template/wall-e/eval/fixtures/fullstack-app/scripts/seed.js +0 -46
  401. package/template/wall-e/eval/fixtures/fullstack-app/server/db.js +0 -99
  402. package/template/wall-e/eval/fixtures/fullstack-app/server/index.js +0 -94
  403. package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/auth.js +0 -19
  404. package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/logger.js +0 -19
  405. package/template/wall-e/eval/fixtures/fullstack-app/server/router.js +0 -50
  406. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/auth.js +0 -69
  407. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/health.js +0 -23
  408. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/items.js +0 -88
  409. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/users.js +0 -75
  410. package/template/wall-e/eval/fixtures/fullstack-app/server/test.js +0 -198
  411. package/template/wall-e/eval/fixtures/fullstack-app/server/utils/response.js +0 -34
  412. package/template/wall-e/eval/fixtures/fullstack-app/server/utils/validate.js +0 -26
  413. package/template/wall-e/eval/fixtures/fullstack-app/server.js +0 -8
  414. package/template/wall-e/eval/fixtures/fullstack-app/test.js +0 -12
  415. package/template/wall-e/eval/fixtures/monorepo-basic/package.json +0 -8
  416. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/data.js +0 -58
  417. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/middleware.js +0 -46
  418. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/package.json +0 -8
  419. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/routes.js +0 -64
  420. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/server.js +0 -56
  421. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/test.js +0 -116
  422. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/commands.js +0 -61
  423. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/index.js +0 -62
  424. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/output.js +0 -43
  425. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/package.json +0 -11
  426. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/test.js +0 -44
  427. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/formatters.js +0 -43
  428. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/index.js +0 -12
  429. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/package.json +0 -5
  430. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/test.js +0 -55
  431. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/validators.js +0 -29
  432. package/template/wall-e/eval/fixtures/monorepo-basic/test.js +0 -46
  433. package/template/wall-e/eval/fixtures/node-cli/index.js +0 -78
  434. package/template/wall-e/eval/fixtures/node-cli/package.json +0 -10
  435. package/template/wall-e/eval/fixtures/node-cli/test.js +0 -57
  436. package/template/wall-e/eval/fixtures/node-typed/package.json +0 -8
  437. package/template/wall-e/eval/fixtures/node-typed/src/handlers.js +0 -31
  438. package/template/wall-e/eval/fixtures/node-typed/src/utils.js +0 -33
  439. package/template/wall-e/eval/fixtures/node-typed/test.js +0 -36
  440. package/template/wall-e/eval/fixtures/python-flask/app.py +0 -14
  441. package/template/wall-e/eval/fixtures/python-flask/requirements.txt +0 -2
  442. package/template/wall-e/eval/fixtures/python-flask/test_app.py +0 -25
  443. package/template/wall-e/eval/fixtures/wall-e-subset/brain.js +0 -105
  444. package/template/wall-e/eval/fixtures/wall-e-subset/eval/aggregator.js +0 -101
  445. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/chat.json +0 -20
  446. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/coding.json +0 -32
  447. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks.js +0 -64
  448. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/package.json +0 -6
  449. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/server.js +0 -31
  450. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/test.js +0 -18
  451. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/utils.js +0 -34
  452. package/template/wall-e/eval/fixtures/wall-e-subset/eval/runner.js +0 -104
  453. package/template/wall-e/eval/fixtures/wall-e-subset/eval/scorer.js +0 -73
  454. package/template/wall-e/eval/fixtures/wall-e-subset/eval/test.js +0 -134
  455. package/template/wall-e/eval/fixtures/wall-e-subset/llm/client.js +0 -99
  456. package/template/wall-e/eval/fixtures/wall-e-subset/llm/providers.js +0 -63
  457. package/template/wall-e/eval/fixtures/wall-e-subset/llm/test.js +0 -70
  458. package/template/wall-e/eval/fixtures/wall-e-subset/package.json +0 -10
  459. package/template/wall-e/eval/fixtures/wall-e-subset/test.js +0 -86
  460. package/template/wall-e/eval/harvester.js +0 -685
  461. package/template/wall-e/eval/head-to-head.js +0 -388
  462. package/template/wall-e/eval/humaneval-adapter.js +0 -321
  463. package/template/wall-e/eval/list-models.js +0 -31
  464. package/template/wall-e/eval/livecodebench-adapter.js +0 -291
  465. package/template/wall-e/eval/mail-integration.js +0 -443
  466. package/template/wall-e/eval/manifest.js +0 -186
  467. package/template/wall-e/eval/meta-harness/adapters/coding-agent.js +0 -57
  468. package/template/wall-e/eval/meta-harness/bootstrap-snapshot.js +0 -149
  469. package/template/wall-e/eval/meta-harness/candidate-store.js +0 -117
  470. package/template/wall-e/eval/meta-harness/cli.js +0 -86
  471. package/template/wall-e/eval/meta-harness/domain-spec.js +0 -154
  472. package/template/wall-e/eval/meta-harness/domains/coding-agent.domain.json +0 -84
  473. package/template/wall-e/eval/meta-harness/examples/env-bootstrap-candidate.js +0 -29
  474. package/template/wall-e/eval/meta-harness/experience-store.js +0 -174
  475. package/template/wall-e/eval/meta-harness/frontier.js +0 -96
  476. package/template/wall-e/eval/meta-harness/harness-interface.js +0 -90
  477. package/template/wall-e/eval/meta-harness/leakage-guard.js +0 -80
  478. package/template/wall-e/eval/meta-harness/optimizer.js +0 -207
  479. package/template/wall-e/eval/meta-harness/proposer-runner.js +0 -110
  480. package/template/wall-e/eval/meta-harness/reporting.js +0 -58
  481. package/template/wall-e/eval/meta-harness/telemetry.js +0 -27
  482. package/template/wall-e/eval/meta-harness/validation.js +0 -81
  483. package/template/wall-e/eval/promoter.js +0 -228
  484. package/template/wall-e/eval/provider-normalizer.js +0 -33
  485. package/template/wall-e/eval/replay.js +0 -395
  486. package/template/wall-e/eval/run-agent-benchmarks.js +0 -386
  487. package/template/wall-e/eval/run-codex-cli-baseline.js +0 -177
  488. package/template/wall-e/eval/run-coding-agent-real.js +0 -187
  489. package/template/wall-e/eval/run-eval.js +0 -435
  490. package/template/wall-e/eval/run-model-comparison.js +0 -142
  491. package/template/wall-e/eval/session-evaluator.js +0 -187
  492. package/template/wall-e/eval/session-miner.js +0 -207
  493. package/template/wall-e/eval/session-retrieval-benchmark.js +0 -150
  494. package/template/wall-e/eval/session-transcripts.js +0 -509
  495. package/template/wall-e/eval/shadow.js +0 -161
  496. package/template/wall-e/eval/swebench-adapter.js +0 -345
  497. package/template/wall-e/eval/swebench-docker.js +0 -192
  498. package/template/wall-e/eval/train.py +0 -320
  499. package/template/wall-e/eval/trainer.js +0 -232
  500. package/template/wall-e/eval/weekly-eval-loop.js +0 -241
@@ -0,0 +1,268 @@
1
+ # CTM Remote Desktop Access Design
2
+
3
+ Status: Design update from user feedback
4
+ Date: 2026-05-27
5
+ Owner: CTM / Wall-E
6
+ Related docs:
7
+ - `claude-task-manager/docs/phone-access-design.md`
8
+ - `claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md`
9
+ - `claude-task-manager/docs/mobile-live-streaming.md`
10
+
11
+ External references:
12
+ - Microsoft Dev Tunnels security:
13
+ `https://learn.microsoft.com/en-us/azure/developer/dev-tunnels/security`
14
+ - Microsoft Dev Tunnels CLI reference:
15
+ `https://learn.microsoft.com/en-us/azure/developer/dev-tunnels/cli-commands`
16
+ - Tailscale quickstart:
17
+ `https://tailscale.com/docs/how-to/quickstart`
18
+ - Tailscale access controls:
19
+ `https://tailscale.com/docs/features/access-control/acls`
20
+
21
+ ## Decisions
22
+
23
+ 1. **Microsoft Dev Tunnel is the default browser remote-desktop transport.**
24
+ It gives a second computer a normal browser URL without DNS setup, VPN setup,
25
+ or Cloudflare Tunnel. It supports two explicit access modes:
26
+ CTM-authenticated connect for phone-friendly pairing, and private
27
+ Microsoft/GitHub gate when the user wants provider auth before CTM loads.
28
+ Neither mode allows anonymous CTM app access.
29
+ 2. **Cloudflare Tunnel is out of scope.** Do not add it to the remote desktop
30
+ IA, default setup path, or MVP plan.
31
+ 3. **Tailscale is the secure private-network alternative.** It requires both
32
+ sides to be authenticated into the same tailnet, or an explicitly shared
33
+ machine accepted by an authenticated Tailscale user. CTM still applies its
34
+ own device auth after the network connection succeeds.
35
+ 4. **No temporary desktop mode.** A remote computer is either not paired or is a
36
+ trusted CTM client device. Avoid split policies such as "temporary desktop"
37
+ or "lite desktop" unless a later product requirement justifies them.
38
+ 5. **Remote desktop gets full CTM access after auth.** The UX should be almost
39
+ the same as using CTM locally: full dashboard, terminal tabs, conversation
40
+ tabs, prompt editor, model selectors, queue, setup, reviews, worktrees, and
41
+ raw terminal input. The difference is that remote access passes through
42
+ network identity, CTM device auth, origin checks, route/WS authorization,
43
+ passkey step-up for high-risk actions, and audit.
44
+ 6. **Mobile and desktop share one architecture.** The only difference is the
45
+ surface: `/m/` is a mobile UX and `/` is the desktop UX. Pairing, device
46
+ identity, scopes, WebAuthn credentials, origin binding, WebSocket tickets,
47
+ revocation, audit, and route/WS policy should be shared.
48
+
49
+ ## Product Shape
50
+
51
+ `Setup -> Access` should present:
52
+
53
+ 1. `Microsoft Dev Tunnel` - default remote computer access.
54
+ 2. `Tailscale` - private tailnet access.
55
+ 3. `Walle Remote` - typed mobile/relay control, not full desktop mirroring.
56
+
57
+ Do not show Cloudflare Tunnel in the remote desktop flow.
58
+
59
+ Each direct transport publishes two URLs:
60
+
61
+ ```text
62
+ desktop_url = <origin>/
63
+ mobile_url = <origin>/m/
64
+ ```
65
+
66
+ The same CTM instance, device store, auth rules, and WebSocket server back both
67
+ URLs. The surface only controls layout and UX.
68
+
69
+ ## Architecture
70
+
71
+ ```text
72
+ Remote browser
73
+ -> Transport identity
74
+ -> Microsoft Dev Tunnel CTM-authenticated connect
75
+ -> or Microsoft Dev Tunnel private browser access
76
+ -> or Tailscale tailnet identity + ACL/grant
77
+ -> CTM same-origin HTTPS/WS endpoint
78
+ -> CTM client-device token
79
+ -> CTM origin allowlist + WebSocket ticket
80
+ -> Auth rule registry for HTTP routes and WS message types
81
+ -> Passkey step-up for high-risk remote actions
82
+ -> Audit + revoke closes active WebSockets
83
+ ```
84
+
85
+ The current code already has most of this shape:
86
+
87
+ - `auth-context.js` resolves loopback versus remote device auth.
88
+ - `auth-rules.js` classifies HTTP routes and WS message types.
89
+ - `mobile-auth-store.js` stores device tokens, WebAuthn credentials, step-up
90
+ sessions, audit, and revocation.
91
+ - `transport-security.js` blocks wildcard binds and validates origins.
92
+ - `server.js` checks Cloudflare Access only when configured, validates CTM
93
+ device tokens, checks origins, enforces step-up, and closes WebSockets when a
94
+ device is revoked.
95
+
96
+ The needed architectural cleanup is to stop treating this as "mobile auth".
97
+ The shared layer should become `client-device` or `remote-client` internally:
98
+
99
+ ```text
100
+ client_device
101
+ id
102
+ title
103
+ surface_last_used: desktop | mobile
104
+ transport_last_used: microsoft-dev-tunnel | tailscale | loopback
105
+ scopes: read, respond, create, admin
106
+ origins / rp_ids
107
+ token hash
108
+ WebAuthn credentials
109
+ audit / revoke metadata
110
+ ```
111
+
112
+ DB migration can preserve existing table names initially, but the code and docs
113
+ should use the product term `client device`. `mobile-*` names should become
114
+ compatibility aliases, not the mental model.
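A minimal sketch of the `client_device` record above, assuming a SHA-256 token hash and in-memory objects. The function names (`createClientDevice`, `verifyDeviceToken`) and camelCase fields are hypothetical and are not taken from `mobile-auth-store.js`.

```javascript
const crypto = require('node:crypto');

// Scope names come from the design doc; everything else here is illustrative.
const SCOPES = ['read', 'respond', 'create', 'admin'];

function hashToken(token) {
  return crypto.createHash('sha256').update(token, 'utf8').digest('hex');
}

// Creates a device record that stores only the token hash, never the raw token.
function createClientDevice({ title, scopes, origins }) {
  const unknown = scopes.filter((s) => !SCOPES.includes(s));
  if (unknown.length > 0) throw new Error(`unknown scopes: ${unknown.join(', ')}`);
  const token = crypto.randomBytes(32).toString('base64url');
  const device = {
    id: crypto.randomUUID(),
    title,
    surfaceLastUsed: null, // 'desktop' | 'mobile'
    transportLastUsed: null, // 'microsoft-dev-tunnel' | 'tailscale' | 'loopback'
    scopes: [...scopes],
    origins: [...origins],
    tokenHash: hashToken(token),
    webauthnCredentials: [],
    revokedAt: null,
  };
  return { device, token };
}

// Compares hashes in constant time and rejects revoked devices.
function verifyDeviceToken(device, presentedToken) {
  if (device.revokedAt !== null) return false;
  const expected = Buffer.from(device.tokenHash, 'hex');
  const actual = Buffer.from(hashToken(presentedToken), 'hex');
  return expected.length === actual.length && crypto.timingSafeEqual(expected, actual);
}
```

The raw token is returned once at pairing time so it can be handed to the browser; only the hash needs to persist.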
115
+
116
+ ## Microsoft Dev Tunnel Default
117
+
118
+ Microsoft documents that, by default, hosting and connecting require
119
+ authentication with the same Microsoft, Entra, or GitHub account that created
120
+ the tunnel, and that anonymous access is an explicit opt-in. CTM should expose
121
+ that as an explicit private mode, not as a hidden failure state.
122
+
123
+ For phone access, CTM-authenticated mode is the better default: Dev Tunnels
124
+ allows anonymous `connect` to this one port so the browser can reach CTM, and
125
+ CTM then enforces client-device auth, origin checks, route/WS authorization,
126
+ passkey step-up, audit, and revocation.
127
+
128
+ Required flow:
129
+
130
+ 1. The Mac signs into `devtunnel` with Microsoft or GitHub.
131
+ 2. CTM creates or reuses a persistent tunnel ID.
132
+ 3. CTM creates port `3456` with protocol `http`.
133
+ 4. CTM applies the selected access mode:
134
+ - CTM-authenticated mode creates/verifies anonymous `connect` on this
135
+ tunnel/port.
136
+ - Private Microsoft gate mode resets/removes anonymous `connect` access.
137
+ 5. CTM starts `devtunnel host <tunnel-id>`.
138
+ 6. The remote computer opens the `desktop_url`.
139
+ 7. Microsoft/GitHub authentication happens before CTM loads only in private
140
+ Microsoft gate mode.
141
+ 8. After the request reaches CTM, CTM requires client-device pairing or an
142
+ existing CTM device token.
143
+ 9. The paired desktop gets full CTM access subject to CTM remote auth rules.
144
+
145
+ The remote browser should not receive a Dev Tunnel access token in a URL. If a
146
+ future headless/non-browser client needs tunnel access, use Microsoft’s
147
+ `X-Tunnel-Authorization` header flow, not CTM browser URLs.
148
+
149
+ Migration rule:
150
+
151
+ - If CTM detects anonymous Dev Tunnel access while the selected mode is private
152
+ Microsoft gate, show a security warning and offer `Reset to private`.
153
+ - If CTM-authenticated mode is selected, anonymous `connect` is the intended
154
+ transport state and should not be treated as stale.
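The migration rule can be sketched as a pure decision function. This is an illustration only: the mode strings, state names, and action names are assumptions, and detecting whether anonymous `connect` is enabled is left to the caller.

```javascript
// Hypothetical mode identifiers for the two access modes described above.
const ACCESS_MODES = Object.freeze({
  CTM_AUTHENTICATED: 'ctm-authenticated',
  PRIVATE_GATE: 'private-microsoft-gate',
});

// Decides how CTM should treat the observed anonymous-connect state.
function evaluateTunnelAccess(selectedMode, anonymousConnectEnabled) {
  if (selectedMode === ACCESS_MODES.PRIVATE_GATE) {
    // Anonymous access contradicts the private gate: warn and offer a reset.
    return anonymousConnectEnabled
      ? { state: 'stale-anonymous', warn: true, action: 'reset-to-private' }
      : { state: 'ok', warn: false, action: null };
  }
  if (selectedMode === ACCESS_MODES.CTM_AUTHENTICATED) {
    // Anonymous connect is the intended transport state in this mode.
    return anonymousConnectEnabled
      ? { state: 'ok', warn: false, action: null }
      : { state: 'missing-anonymous-connect', warn: false, action: 'create-anonymous-connect' };
  }
  throw new Error(`unknown access mode: ${selectedMode}`);
}
```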
155
+
156
+ ## Tailscale Alternative
157
+
158
+ Tailscale is not the default for this desktop flow because it requires both
159
+ machines to be in an authenticated Tailscale relationship. That can be:
160
+
161
+ - both devices logged into the same tailnet;
162
+ - the Mac shared to another authenticated Tailscale user;
163
+ - a managed device/auth-key flow for machines you control.
164
+
165
+ For CTM this is a clean security boundary:
166
+
167
+ ```text
168
+ Tailscale proves the remote machine/user can reach the Mac on port 3456.
169
+ CTM proves this browser is an approved CTM client device.
170
+ ```
171
+
172
+ CTM should still bind only to loopback or a specific Tailscale IP. Do not bind
173
+ to `0.0.0.0`. Tailscale ACLs/grants should restrict `3456` to the intended
174
+ user/device where possible.
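A sketch of the bind rule, assuming the Tailscale address is an IPv4 address from the 100.64.0.0/10 range that Tailscale assigns to nodes. The helper names are hypothetical and this is not the code in `transport-security.js`.

```javascript
const net = require('node:net');

// True for IPv4 addresses in 100.64.0.0/10 (second octet 64..127).
function isTailscaleIPv4(host) {
  if (!net.isIPv4(host)) return false;
  const [a, b] = host.split('.').map(Number);
  return a === 100 && b >= 64 && b <= 127;
}

// Accepts loopback or a specific Tailscale IP; rejects wildcard binds.
function isAllowedBindHost(host) {
  if (host === '' || host === '0.0.0.0' || host === '::') return false;
  if (host === '127.0.0.1' || host === '::1' || host === 'localhost') return true;
  return isTailscaleIPv4(host);
}
```

A caller would check `isAllowedBindHost(host)` before passing `host` to `server.listen(3456, host)`.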
175
+
176
+ ## Full Desktop Access Policy
177
+
178
+ There is one remote desktop policy: `trusted full desktop`.
179
+
180
+ Allowed after device pairing:
181
+
182
+ - load `/` desktop UI;
183
+ - open all session tabs;
184
+ - subscribe to terminal and conversation streams;
185
+ - type into terminal sessions;
186
+ - send Wall-E/coding messages;
187
+ - create sessions;
188
+ - use prompts, queues, reviews, model selectors, and setup pages.
189
+
190
+ Still protected:
191
+
192
+ - remote mutations must match `auth-rules.js`;
193
+ - routes/WS messages with `stepUp: remote` require a fresh passkey step-up;
194
+ - loopback-only routes remain loopback-only;
195
+ - `/hook/*`, `/v1/logs`, OAuth proxy routes, local telemetry ingress, and local
196
+ permission helper endpoints stay local-only;
197
+ - every remote mutation is audited by device, route/message type, origin, IP,
198
+ scope, and step-up state.
199
+
200
+ This gives local-like power without pretending a remote browser is loopback.
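The protections listed above could be combined into one decision per request, roughly as below. The rule and context shapes are assumptions for illustration; only the scope names and the `stepUp: remote` flag come from this document, and the real registry lives in `auth-rules.js`.

```javascript
function decision(allowed, reason) {
  return { allowed, reason };
}

// rule: { scope, stepUp?: 'remote', localOnly?: boolean }   (hypothetical shape)
// ctx:  { loopback, device, origin, now, stepUpExpiresAt }  (hypothetical shape)
function authorizeRemote(rule, ctx) {
  if (ctx.loopback) return decision(true, 'loopback');
  // Local-only routes stay local-only, whatever the remote device holds.
  if (rule.localOnly) return decision(false, 'local-only');
  const device = ctx.device;
  if (!device || device.revoked) return decision(false, 'unpaired-or-revoked');
  if (!device.origins.includes(ctx.origin)) return decision(false, 'origin-not-allowed');
  if (!device.scopes.includes(rule.scope)) return decision(false, 'missing-scope');
  // A missing or expired step-up session fails this comparison and is denied.
  if (rule.stepUp === 'remote' && !(ctx.stepUpExpiresAt > ctx.now)) {
    return decision(false, 'step-up-required');
  }
  return decision(true, 'ok');
}
```

The returned `reason` is the kind of value an audit entry could record alongside device, origin, IP, and scope.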
201
+
202
+ ## Pairing UX
203
+
204
+ Use one pairing model for phone and desktop:
205
+
206
+ ```text
207
+ Pair device
208
+ Choose surface: Desktop or Phone
209
+ Choose transport: Microsoft Dev Tunnel or Tailscale
210
+ Choose device title
211
+ Approve locally on the Mac
212
+ Register passkey for this origin
213
+ Issue CTM client-device token
214
+ ```
215
+
216
+ Pairing URLs:
217
+
218
+ ```text
219
+ desktop claim: <origin>/claim?claim=...&secret=...&surface=desktop
220
+ mobile claim: <origin>/m/claim?claim=...&secret=...&surface=mobile
221
+ ```
222
+
223
+ If a remote desktop opens `/` without a CTM token, it should show an auth gate
224
+ that can request pairing. Do not expose the dashboard until CTM auth succeeds.
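For illustration, the two claim URLs can be assembled with the standard `URL` API so the query values are encoded correctly. `buildClaimUrls` is a hypothetical helper name; the paths and parameter names follow the pairing URLs above.

```javascript
// Builds the desktop and mobile claim URLs for one pairing attempt.
function buildClaimUrls(origin, claim, secret) {
  const make = (path, surface) => {
    const url = new URL(path, origin);
    url.searchParams.set('claim', claim);
    url.searchParams.set('secret', secret);
    url.searchParams.set('surface', surface);
    return url.toString();
  };
  return {
    desktop: make('/claim', 'desktop'),
    mobile: make('/m/claim', 'mobile'),
  };
}
```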
225
+
226
+ ## UI Requirements
227
+
228
+ Desktop remote banner:
229
+
230
+ ```text
231
+ Remote Desktop
232
+ Microsoft Dev Tunnel · user@example.com · CTM device: Owner MacBook
233
+ Step-up: fresh / expired
234
+ [Step up] [Revoke this device] [Logout]
235
+ ```
236
+
237
+ The banner should be compact and dismissible for the current session, but a
238
+ security status indicator must remain visible when remote.
239
+
240
+ Setup Access should show:
241
+
242
+ - `Desktop URL`;
243
+ - `Phone URL`;
244
+ - tunnel account;
245
+ - CTM device status;
246
+ - last used device;
247
+ - revoke controls;
248
+ - stale anonymous-access warning when detected.
249
+
250
+ ## Test Plan
251
+
252
+ Required tests before implementation is accepted:
253
+
254
+ - Microsoft private tunnel flow does not call or recommend anonymous access
255
+ creation.
256
+ - Stale anonymous access detection surfaces a warning and reset action.
257
+ - Remote desktop opening `/` without CTM token shows auth/pairing gate, not the
258
+ dashboard.
259
+ - Remote desktop opening `/` with paired CTM token loads the same desktop UI.
260
+ - Remote WebSocket can read with `read` scope.
261
+ - Remote terminal `input` requires `respond` scope and fresh step-up when the
262
+ route registry requires it.
263
+ - Admin/session/setup mutations require `admin` or `create` as defined by
264
+ `auth-rules.js` and fresh step-up when remote.
265
+ - Revoke device closes active desktop WebSockets.
266
+ - Tailscale setup emits both `desktop_url` and `mobile_url`.
267
+ - Existing `/m/` phone flows still work from the same device/auth store.
268
+ - Loopback desktop behavior remains unchanged.
@@ -0,0 +1,95 @@
1
+ # CTM Restart Lifecycle Architecture
2
+
3
+ CTM restart must make the HTTP process replaceable quickly while preserving
4
+ browser-side tab state and terminal snapshots. The critical path is intentionally
5
+ small: acknowledge the restart request, stop accepting new connections, drain
6
+ or force-close network handles, checkpoint SQLite, stop child processes, and
7
+ exit for launchd.
8
+
9
+ ## Phases
10
+
11
+ 1. **Announce**
12
+ - `/api/restart/ctm` sends `server-restarting` to connected browsers before
13
+ any sockets are closed.
14
+ - Browsers should keep local tab, route, draft, and terminal cache state while
15
+ reconnecting.
16
+
17
+ 2. **Stop Intake**
18
+ - CTM calls `server.close()` for each listener so no new HTTP requests are
19
+ accepted.
20
+ - Idle HTTP keep-alive sockets are closed immediately.
21
+ - Remaining HTTP sockets are force-closed after a bounded grace window.
22
+
23
+ 3. **Close Upgraded Sockets**
24
+ - The WebSocket server runs in `noServer` mode, so HTTP listener close does
25
+ not own upgraded WebSocket connections.
26
+ - CTM closes each tracked WebSocket with service-restart code `1012`, waits a
27
+ short grace window for the close handshake, then terminates stragglers.
28
+
29
+ 4. **Stop Background Work**
30
+ - Scheduler, session capture, session stream, and JSONL watcher stop before
31
+ PTYs are killed.
32
+ - CTM DB-owner queued writes are drained with a timeout. Timed-out work is
33
+ logged with active/pending labels instead of blocking restart indefinitely.
34
+
35
+ 5. **Persist And Exit**
36
+ - CTM performs a passive WAL checkpoint, closes its SQLite connection, kills
37
+ PTYs, stops Wall-E, then exits non-zero so launchd starts the replacement.
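Phase 3 can be sketched as follows, assuming each tracked socket exposes `readyState`, `close(code, reason)`, and `terminate()` as sockets from the `ws` package do. The function name, the reason string, and the injectable timer are illustrative assumptions.

```javascript
const SERVICE_RESTART = 1012; // WebSocket close code for "service restart"
const WS_OPEN = 1;
const WS_CLOSED = 3;

// Sends a polite close to every open socket, then terminates stragglers
// once the grace window has elapsed. Resolves with the number terminated.
function closeTrackedSockets(sockets, graceMs, setTimer = setTimeout) {
  for (const ws of sockets) {
    if (ws.readyState === WS_OPEN) ws.close(SERVICE_RESTART, 'server-restarting');
  }
  return new Promise((resolve) => {
    setTimer(() => {
      let terminated = 0;
      for (const ws of sockets) {
        if (ws.readyState !== WS_CLOSED) {
          ws.terminate();
          terminated += 1;
        }
      }
      resolve(terminated);
    }, graceMs);
  });
}
```

Because the HTTP listener does not own upgraded connections, this step has to run over CTM's own set of tracked sockets rather than rely on `server.close()`.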
38
+
39
+ ## Startup Contract
40
+
41
+ The new process should listen before non-critical reconciliation work starts.
42
+ Session metadata and cached terminal state are the only startup-critical data.
43
+ Heavy work such as permission-file reconciliation, JSONL full scans, prompt
44
+ harvest, conversation import, FTS backfill, and title/model refresh belongs
45
+ behind timers or scheduler jobs with idle checks and bounded batches.
46
+
47
+ ## Restored Session Runtime State
48
+
49
+ Restart restore has two separate signals:
50
+
51
+ - **Runtime liveness**: CTM has spawned or is spawning a PTY so the tab can be
52
+ used again.
53
+ - **Semantic activity**: the agent produced new output, or the user submitted
54
+ new input that can produce new output.
55
+
56
+ The UI must not merge these signals. During startup, restored PTYs are marked
57
+ `resuming` and their initial replay output is quarantined for a bounded restore
58
+ window. That replay may hydrate the browser terminal, but it must not update
59
+ `lastActivity`, `lastPtyActivity`, SessionStream activity, or the live status
60
+ bus. If no real provider output arrives, the session naturally falls back to
61
+ `idle` when the restore window expires. If the user types into the session, the
62
+ restore quarantine ends immediately because any following output belongs to the
63
+ new turn.
64
+
65
+ This keeps CTM restart behavior predictable:
66
+
67
+ 1. Restart begins: existing sessions are not promoted to fresh running work.
68
+ 2. Restore starts: session status is `resuming`.
69
+ 3. Restore replay draws: terminal cache may update, activity freshness does not.
70
+ 4. No new work arrives: status becomes `idle`; last-output time remains old.
71
+ 5. User input or real provider work arrives: status can become `running` and
72
+ freshness advances from that real event.
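The sequence above can be modelled as a small state holder. This is a sketch under assumptions: the class name, millisecond timestamps passed in by the caller, and a string terminal cache are all illustrative and not CTM's actual data structures.

```javascript
class RestoredSession {
  constructor(now, restoreWindowMs, lastActivity) {
    this.status = 'resuming';
    this.quarantineUntil = now + restoreWindowMs;
    this.lastActivity = lastActivity; // stays old until a real event arrives
    this.terminalCache = '';
  }

  inQuarantine(now) {
    return this.status === 'resuming' && now < this.quarantineUntil;
  }

  // PTY bytes always hydrate the terminal cache, but replay bytes inside the
  // restore window must not advance freshness or promote the session.
  onPtyOutput(chunk, now) {
    this.terminalCache += chunk;
    if (this.inQuarantine(now)) return;
    this.status = 'running';
    this.lastActivity = now;
  }

  // User input ends the quarantine immediately: later output is a new turn.
  onUserInput(now) {
    this.quarantineUntil = now;
    this.status = 'running';
    this.lastActivity = now;
  }

  // Called periodically; an expired window with no real work means idle.
  tick(now) {
    if (this.status === 'resuming' && now >= this.quarantineUntil) this.status = 'idle';
  }
}
```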
73
+
74
+ `startup_tasks.status` is lifecycle metadata, not the UI source of truth.
75
+ Heartbeats may update ownership fields such as process id and heartbeat time,
76
+ but they must not rewrite a task status to `running`. Desktop, phone, and
77
+ standup cards should read the server live-status projection instead.
78
+
79
+ ## SQLite Contract
80
+
81
+ CTM owns `task-manager.db`; Wall-E owns `wall-e-brain.db`. CTM background writes
82
+ that are not request-critical should go through the CTM DB-owner queue so restart
83
+ can drain a single known queue. SQLite WAL allows concurrent readers with one
84
+ writer, but CTM still keeps application-level write ownership explicit so
85
+ background maintenance cannot race restart, backup, or recovery.
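A rough sketch of a single-owner write queue with a bounded drain, as described here and in the restart phases. `DbOwnerQueue` and its method names are hypothetical; the jobs would wrap SQLite writes in a real system.

```javascript
class DbOwnerQueue {
  constructor() {
    this.pending = []; // labels of jobs that have not started yet
    this.active = null; // label of the job currently running
    this.chain = Promise.resolve();
  }

  // Runs jobs strictly one at a time, in submission order.
  enqueue(label, fn) {
    this.pending.push(label);
    const run = this.chain.then(async () => {
      this.pending.splice(this.pending.indexOf(label), 1);
      this.active = label;
      try {
        return await fn();
      } finally {
        this.active = null;
      }
    });
    // Keep the chain alive even if a job rejects; the caller still sees `run`.
    this.chain = run.catch(() => {});
    return run;
  }

  // Waits for the queue to empty, but never longer than timeoutMs.
  async drain(timeoutMs) {
    let timer;
    const timeout = new Promise((resolve) => {
      timer = setTimeout(() => resolve('timeout'), timeoutMs);
    });
    const outcome = await Promise.race([this.chain.then(() => 'drained'), timeout]);
    clearTimeout(timer);
    return { outcome, active: this.active, pending: [...this.pending] };
  }
}
```

On `timeout`, the returned `active` and `pending` labels are what restart would log before moving on.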
86
+
87
+ ## Anti-Patterns
88
+
89
+ - Do not add long synchronous startup scans before `listen`.
90
+ - Do not rely on `server.close()` to close WebSocket clients.
91
+ - Do not leave background timers running during restart.
92
+ - Do not block process exit waiting for unbounded child-process or DB work.
93
+ - Do not read/write CTM and Wall-E SQLite files from both processes as peers.
94
+ - Do not let restore replay, heartbeats, or PTY attach bytes advance session
95
+ freshness or promote a restored session to `running`.
@@ -0,0 +1,53 @@
1
+ # Runtime Work Control Plane
2
+
3
+ CTM now treats agent work as runtime-owned state, not text inferred from the terminal.
4
+
5
+ ## Goals
6
+
7
+ - Show active Wall-E turns, tools, shells, queued work, and recent completions in the session UI.
8
+ - Keep performance isolation: large terminal/tool output stays in the worker/runtime that produced it.
9
+ - Give desktop and phone the same lightweight work-state projection.
10
+ - Leave a clean adapter surface for Codex/Claude PTY workers later.
11
+
12
+ ## Architecture
13
+
14
+ The authoritative runtime emits task lifecycle events. CTM stores only lightweight metadata in `runtime_tasks`:
15
+
16
+ - `ctm_session_id`, `agent_session_id`, `turn_id`
17
+ - `kind`: `turn`, `shell`, `tool`, `skill`, or `agent`
18
+ - `status`: `queued`, `running`, `stopping`, `completed`, `failed`, `cancelled`, or `stopped`
19
+ - labels, previews, provider/model, control capabilities, worker scope
20
+
21
+ Heavy artifacts are not stored in this table. Shell stdout remains in scrollback chunks or provider transcripts. Wall-E tool output remains in the Wall-E transcript/runtime stream. Task rows may point to those sources with `output_ref` later.
22
+
23
+ ## Worker Process Decision
24
+
25
+ The control plane should not run heavy work in the CTM main process.
26
+
27
+ - Wall-E-owned sessions keep tool execution in the Wall-E daemon/runtime.
28
+ - Claude/Codex terminal sessions keep PTY rendering and output capture in the session host/headless terminal workers.
29
+ - CTM main process owns the small task registry and broadcasts compact `work-state` frames.
30
+ - Future worker adapters should emit the same metadata events from their session worker instead of letting the browser scrape terminal text.
31
+
32
+ This gives isolation without fragmenting the UI contract.
33
+
34
+ ## UI Contract
35
+
36
+ Desktop Wall-E sessions render a compact workbar above the composer:
37
+
38
+ - chips for running work, shells, tools, skills, queued items, and isolation status
39
+ - a focused summary of the most recent active/recent task
40
+ - `Manage` opens task rows grouped into Active and Recent
41
+ - active turns expose `Stop`, which maps to the existing `walle-cancel` flow
42
+
43
+ Phone status rows consume `runtimeWork`/`runtime_work` from session payloads and show compact active-work chips.
44
+
45
+ ## Adapter Rule
46
+
47
+ Each runtime adapter should emit normalized lifecycle events:
48
+
49
+ 1. `startTurn`
50
+ 2. `tool_call` / `tool_result`
51
+ 3. `completeTurn` or `requestStop`
52
+
53
+ Adapters must never stream full stdout through `runtime_tasks`. If UI needs output, the task row should carry an `output_ref` to the existing transcript, scrollback, or artifact source.
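As an illustration of the adapter rule, an adapter-side normalizer might copy only lightweight fields into a `runtime_tasks` row and cap the preview. The input event shape, the `PREVIEW_MAX` limit, and the function name are assumptions; the `kind` and `status` values and column names come from the Architecture section.

```javascript
const KINDS = new Set(['turn', 'shell', 'tool', 'skill', 'agent']);
const STATUSES = new Set(['queued', 'running', 'stopping', 'completed', 'failed', 'cancelled', 'stopped']);
const PREVIEW_MAX = 200; // hypothetical cap on preview length

// Maps a runtime lifecycle event to a lightweight task row.
// Heavy fields such as full stdout are deliberately never copied.
function toRuntimeTaskRow(event) {
  if (!KINDS.has(event.kind)) throw new Error(`unknown kind: ${event.kind}`);
  if (!STATUSES.has(event.status)) throw new Error(`unknown status: ${event.status}`);
  const preview = typeof event.preview === 'string' ? event.preview.slice(0, PREVIEW_MAX) : '';
  return {
    ctm_session_id: event.ctmSessionId,
    agent_session_id: event.agentSessionId ?? null,
    turn_id: event.turnId ?? null,
    kind: event.kind,
    status: event.status,
    label: event.label ?? '',
    preview,
    output_ref: event.outputRef ?? null,
  };
}
```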
@@ -0,0 +1,38 @@
1
+ # Session Interactive Wait Surfaces
2
+
3
+ CTM models three different "waiting for the user" states. They must not share
4
+ one approval UI because they have different risk and action semantics.
5
+
6
+ ## States
7
+
8
+ - `approval`: the agent is asking permission to run a tool, edit files, or take
9
+ another side-effecting action. This is a security/control-plane decision.
10
+ - `choice`: the agent is asking the user to pick from a terminal menu, such as a
11
+ Claude `AskUserQuestion` planning question. This is normal conversation flow,
12
+ not permission.
13
+ - `input`: the agent is idle at a free-form composer.
14
+
15
+ ## UI Contract
16
+
17
+ - Only `approval` may show the desktop `Approval needed` banner with
18
+ Approve/Deny actions.
19
+ - `choice` must never show approval wording or Approve/Deny buttons. The live
20
+ terminal menu is the primary UI on desktop; mobile may render a tappable
21
+ `Terminal choice` card that sends `session.select_choice`.
22
+ - `input` should not surface an approval or choice card.
23
+
24
+ ## Control Plane
25
+
26
+ - `approval.respond` is high-risk and step-up gated on remote clients.
27
+ - `session.select_choice` is a medium-risk terminal interaction that verifies the
28
+ live menu before pressing Enter.
29
+ - Desktop must preserve the same separation: raw `1\r` approval shortcuts belong
30
+ only to approval prompts, not normal choices.
31
+
32
+ ## Regression Invariant
33
+
34
+ For any `waiting-for-input` event with `reason: "choice"`:
35
+
36
+ - no `.session-approval-banner` should be visible;
37
+ - text `Approval needed` should not be rendered;
38
+ - Approve/Deny buttons should not exist for that state.
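The separation can be expressed as a single mapping from wait reason to permitted UI, which also makes the regression invariant easy to assert. The function name and returned flag names are hypothetical.

```javascript
// Returns which wait-related UI elements a client may render.
function waitSurface(reason, client) {
  switch (reason) {
    case 'approval':
      return { approvalBanner: true, approveDenyButtons: true, choiceCard: false };
    case 'choice':
      // Never approval wording; only mobile gets the tappable choice card.
      return { approvalBanner: false, approveDenyButtons: false, choiceCard: client === 'mobile' };
    case 'input':
      return { approvalBanner: false, approveDenyButtons: false, choiceCard: false };
    default:
      throw new Error(`unknown wait reason: ${reason}`);
  }
}
```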
@@ -0,0 +1,84 @@
1
+ # Session Needs You Dismissal
2
+
3
+ ## Problem
4
+
5
+ CTM shows operator-attention sessions in several places: the desktop Active Sessions list, the Sessions overview/standup page, and the phone/iPad Status page. These surfaces used to mix labels such as "Waiting" and "Needs User", and mobile had a local-only dismiss behavior. That made the same session appear actionable on one surface and quiet on another.
6
+
7
+ The shared contract is now:
8
+
9
+ - User-facing attention label: `Needs You`.
10
+ - Backend lane/status source of truth: `needs_user` / `waiting_input`.
11
+ - Dismissal persistence: `startup_tasks`, so the choice survives refreshes and CTM restarts.
12
+ - Reset trigger: only a newer submitted user prompt/input marker for the same session.
13
+
14
+ ## Data Model
15
+
16
+ `startup_tasks` stores three attention fields:
17
+
18
+ - `attention_dismissed_at`: when the operator dismissed the current attention state.
19
+ - `attention_dismissed_input_at`: the latest user-prompt marker known when dismissed.
20
+ - `last_user_prompt_input_at`: latest known submitted prompt/input that can reset the dismissal.
21
+
22
+ The dismissal is active when:
23
+
24
+ ```text
25
+ attention_dismissed_at > 0
26
+ and (last_user_prompt_input_at is empty or last_user_prompt_input_at <= attention_dismissed_input_at)
27
+ ```
28
+
29
+ This intentionally ignores passive events: terminal redraws, session heartbeats, agent output, status polling, restore/resume transitions, and websocket reconnects.
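The predicate above translates directly into code. This sketch assumes the three fields are numeric timestamps and that "empty" means `null`, `undefined`, or `0`; the function name is hypothetical.

```javascript
// True while the operator's dismissal still applies to the session.
function isAttentionDismissalActive(task) {
  const dismissedAt = Number(task.attention_dismissed_at) || 0;
  if (dismissedAt <= 0) return false;
  const lastPrompt = Number(task.last_user_prompt_input_at) || 0;
  if (lastPrompt === 0) return true; // no prompt marker recorded yet
  const dismissedInput = Number(task.attention_dismissed_input_at) || 0;
  // Only a prompt newer than the one known at dismissal time resets it.
  return lastPrompt <= dismissedInput;
}
```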
30
+
31
+ ## API
32
+
33
+ Desktop and mobile both use:
34
+
35
+ ```http
36
+ POST /api/sessions/:sessionId/attention-dismiss
37
+ Content-Type: application/json
38
+
39
+ { "dismissed": true }
40
+ ```
41
+
42
+ The response includes both camelCase and snake_case marker fields:
43
+
44
+ - `attentionDismissedAt` / `attention_dismissed_at`
45
+ - `attentionDismissedInputAt` / `attention_dismissed_input_at`
46
+ - `lastUserPromptInputAt` / `last_user_prompt_input_at`
47
+ - `attentionDismissedActive` / `attention_dismissed_active`
48
+
49
+ Sending `{ "dismissed": false }` clears the dismissal.
50
+
51
+ ## Reset Semantics
52
+
53
+ The reset marker is written when CTM sees a real submitted user prompt for a non-blocking waiting state. The guard is deliberate:
54
+
55
+ - Pressing `1`, `2`, `y`, `n`, or Escape to answer an approval/choice prompt should not reset a dismissal.
56
+ - Running/idle flicker should not reset a dismissal.
57
+ - Agent output after dismissal should not reset a dismissal.
58
+ - A new user prompt that starts the session again should reset it.
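One way to express this guard is a predicate over input events. The event shape (`type`, `waitReason`, `text`) is entirely an assumption made for illustration; only the rules it encodes come from the list above.

```javascript
// Decides whether an event should write a new last_user_prompt_input_at marker.
function shouldRecordPromptMarker(event) {
  // Passive events (agent output, heartbeats, status flicker) never qualify.
  if (event.type !== 'user-input-submitted') return false;
  // Keystrokes answering approval or choice prompts are not new prompts.
  if (event.waitReason === 'approval' || event.waitReason === 'choice') return false;
  return typeof event.text === 'string' && event.text.trim().length > 0;
}
```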
59
+
60
+ ## UI Contract
61
+
62
+ Desktop:
63
+
64
+ - Active Sessions groups show `Needs You`.
65
+ - `Needs You` rows expose a compact `Dismiss` action.
66
+ - The Sessions overview card action area shows `Dismiss` on `needs_user` cards.
67
+ - The top attention banner can be dismissed with the same durable API.
68
+
69
+ Mobile/iPad:
70
+
71
+ - Status grouping uses the same backend lane and label.
72
+ - Local storage dismissal is only a fallback for older snapshots; server marker fields override it.
73
+ - A server-reported newer `lastUserPromptInputAt` clears any local fallback.
74
+
75
+ ## Testing
76
+
77
+ Important regression coverage:
78
+
79
+ - `tests/session-standup.test.js`: classifier demotes dismissed attention and resets only after newer user prompt markers.
80
+ - `tests/active-session-collapse.test.js`: desktop Active Sessions label/render harness.
81
+ - `tests/rendering/scenarios/standup-command-nav.spec.js`: overview card/banner dismiss actions.
82
+ - `tests/rendering/scenarios/mobile-pwa.spec.js`: mobile dismissal survives reload/output-only refresh and resets on newer prompt marker.
83
+
84
+ Do not reintroduce frontend-only dismissal state as the source of truth. Doing so would break cross-device consistency after reloads and CTM restarts.
@@ -49,6 +49,41 @@ If none of those changed, tab activation should be a cheap paint/focus operation
49
49
  show the existing terminal, fit if necessary, refresh the canvas, align helper
50
50
  textarea, preserve scroll intent, and avoid attach/snapshot/reflow.
51
51
 
52
+ ## TerminalRenderController Contract
53
+
54
+ The browser must have one mutation owner per terminal session:
55
+ `TerminalRenderController`. The server can classify output, coalesce obvious
56
+ transport bursts, and produce authoritative headless snapshots, but it cannot
57
+ own browser rendering because it does not know the active DOM, xterm parser
58
+ state, WebGL renderer state, split visibility, focused input, or scroll intent.
59
+
60
+ All xterm mutations should route through the controller:
61
+
62
+ - `term.write()` for live PTY bytes, snapshots, cursor repair, status cleanup,
63
+ and lifecycle banners;
64
+ - `term.refresh()` and `clearTextureAtlas()` for paint recovery;
65
+ - `fitAddon.fit()` and `term.resize()` for layout changes;
66
+ - `scrollToLine()` for follow-bottom and preserved-scroll restoration.
67
+
68
+ The controller is not just a wrapper. It is the ordering contract:
69
+
70
+ - all writes enter xterm through the controller, which records pending callbacks
71
+ and relies on xterm's FIFO parser for normal live output; barrier jobs such as
72
+ snapshot restore and renderer repair wait for the relevant write callbacks
73
+ before they declare the view settled;
74
+ - synchronized output frames are treated as atomic paint regions, so refresh,
75
+ fit, scroll, and repair jobs wait until the frame closes;
76
+ - snapshot restore is a barrier job with a restore epoch/source, and stale
77
+ snapshots are rejected or sanitized before replaying;
78
+ - active native input/composer display owns the screen until its lease settles;
79
+ - paint/fit/scroll requests are intents queued behind writes instead of direct
80
+ side effects from arbitrary timers;
81
+ - render tests wait on controller/readiness state or persistent semantic output,
82
+ not on transient text that a legitimate full-screen redraw can erase.
83
+
84
+ This makes the existing `SessionViewState` useful as the high-level state model
85
+ and moves low-level terminal side effects into a single place.
86
+
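
The ordering contract above can be sketched as a small queue that sits in front of xterm. This is a minimal illustration under stated assumptions, not CTM's implementation: the class name `TerminalRenderControllerSketch`, its method names, and the explicit `beginSynchronizedFrame()`/`endSynchronizedFrame()` calls are hypothetical. Only `term.write(data, callback)` and `term.refresh(start, end)` are real xterm.js calls. Writes pass straight through and rely on xterm's FIFO parser; paint, fit, and scroll requests are held as intents until no write callback is pending and no synchronized frame is open.

```js
// Hypothetical sketch: names and shape are assumptions, not CTM's actual code.
class TerminalRenderControllerSketch {
  constructor(term) {
    this.term = term;                    // an xterm.js Terminal (or a compatible fake)
    this.writeSeq = 0;                   // count of writes issued through the controller
    this.pendingWrites = 0;              // writes whose xterm callback has not fired yet
    this.openSynchronizedFrameDepth = 0; // > 0 while a synchronized output frame is open
    this.queuedIntents = [];             // paint/fit/scroll requests waiting for a quiet point
    this.lastWriteSource = null;
  }

  // Live bytes go straight to xterm; the controller only records the pending callback.
  write(data, source) {
    this.writeSeq += 1;
    this.pendingWrites += 1;
    this.lastWriteSource = source;
    this.term.write(data, () => {
      this.pendingWrites -= 1;
      this.flushIntents();
    });
  }

  // The caller is assumed to detect frame boundaries and report them here.
  beginSynchronizedFrame() {
    this.openSynchronizedFrameDepth += 1;
  }

  endSynchronizedFrame() {
    this.openSynchronizedFrameDepth = Math.max(0, this.openSynchronizedFrameDepth - 1);
    this.flushIntents();
  }

  // Intents are queued behind writes instead of running as direct side effects.
  requestIntent(name, run) {
    this.queuedIntents.push({ name, run });
    this.flushIntents();
  }

  isSettled() {
    return (
      this.pendingWrites === 0 &&
      this.openSynchronizedFrameDepth === 0 &&
      this.queuedIntents.length === 0
    );
  }

  flushIntents() {
    if (this.pendingWrites > 0 || this.openSynchronizedFrameDepth > 0) return;
    // Take the current batch so intents queued during a run are handled by their own flush.
    const intents = this.queuedIntents.splice(0);
    for (const intent of intents) intent.run(this.term);
  }
}
```

A caller would then request a repaint with something like `controller.requestIntent('refresh', (t) => t.refresh(0, t.rows - 1))` rather than calling `term.refresh()` from a timer. The request runs immediately when the terminal is quiet and is deferred otherwise.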
52
87
  ## Existing Useful Pieces
53
88
 
54
89
  The current code already has several strong building blocks.
@@ -132,6 +167,25 @@ If the current tuple equals `lastReadyTuple`, a tab switch must not replay a
132
167
  snapshot. It may still call xterm `refresh()` and align/focus because those are
133
168
  idempotent paint operations.
134
169
 
170
+ The render controller keeps a lower-level tuple for terminal mutation ordering:
171
+
172
+ ```js
173
+ {
174
+ writeSeq,
175
+ activeJob,
176
+ queuedJobs,
177
+ openSynchronizedFrameDepth,
178
+ lastWriteSource,
179
+ lastMutationSource,
180
+ lastSettledAt
181
+ }
182
+ ```
183
+
184
+ `SessionViewState` answers "is this visible view ready?" while
185
+ `TerminalRenderController` answers "which terminal mutation is allowed to run
186
+ next?" Both are required. A render ledger without mutation ownership can only
187
+ observe a race after it has already happened; it cannot prevent one.
188
+
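
As an illustration of how a render test might consume this tuple, the sketch below derives a "settled" predicate from it and polls until it holds. The helper names `isTerminalRenderSettled` and `waitForSettled` are hypothetical, and the source does not say whether `queuedJobs` is a count or an array, so the predicate accepts either. Treating a null `activeJob` as "no job running" is also an assumption.

```js
// Hypothetical helpers; field names come from the tuple above, semantics are assumed.
function isTerminalRenderSettled(state) {
  if (!state) return false;
  const queued = Array.isArray(state.queuedJobs)
    ? state.queuedJobs.length
    : Number(state.queuedJobs || 0);
  const frameDepth = Number(state.openSynchronizedFrameDepth || 0);
  return state.activeJob == null && queued === 0 && frameDepth === 0;
}

// Poll a state getter until the controller reports a settled queue or the timeout expires.
async function waitForSettled(getState, { timeoutMs = 2000, intervalMs = 25 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const state = getState();
    if (isTerminalRenderSettled(state)) return state;
    if (Date.now() >= deadline) {
      throw new Error('terminal render state did not settle in time');
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Waiting on a predicate like this, rather than on transient terminal text, keeps the test valid even when a legitimate full-screen redraw erases that text.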
135
189
  ## View Phases
136
190
 
137
191
  | Phase | Meaning | Exit condition |
@@ -323,27 +377,37 @@ these user-visible contracts:
323
377
 
324
378
  ## Implementation Plan
325
379
 
326
- Phase 1: Document and instrument
380
+ Phase 1: Document, instrument, and establish mutation ownership
327
381
 
328
382
  - Land this design document.
329
383
  - Add `window._ctmGetSessionViewState(id)` backed by existing flags and render
330
384
  ledger.
331
385
  - Add logging for overlay dismissal reason and blockers.
386
+ - Add `TerminalRenderController` as the single browser-side write queue for live
387
+ output, snapshots, cursor repair, stale-status cleanup, and lifecycle banners.
388
+ - Expose controller state from `window._ctmGetTerminalRenderState(id)` so render
389
+ tests can prove the queue has settled.
332
390
 
333
- Phase 2: Client restore coordinator
391
+ Phase 2: Client restore coordinator and snapshot barriers
334
392
 
335
393
  - Extend `terminal-restore-state.js` from one-shot decisions into a small pure
336
394
  reducer/helper for phases, epochs, and readiness.
337
395
  - Replace restart overlay dismissal on snapshot arrival with render-ledger
338
396
  readiness.
339
397
  - Replace hard 5-second loading dismissal with `ready` or `degraded`.
398
+ - Treat snapshot restore as a controller barrier: clear stale local queues,
399
+ reject stale restore epochs, and sanitize stale Codex transient `Working`
400
+ rows when an authoritative non-running status is newer than the busy frame.
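
The barrier decision described in the last bullet could be expressed as a pure function, which would fit the reducer/helper style proposed for `terminal-restore-state.js`. The sketch below is an assumption-heavy illustration: `decideSnapshotRestore`, the field names `restoreEpoch`, `status`, `statusUpdatedAt`, and `capturedAt`, and the literal status value `'running'` are all hypothetical. It only decides whether to replay and whether sanitizing is needed; how stale `Working` rows are actually detected and removed is left out.

```js
// Hypothetical decision helper; every field name here is an assumption.
function decideSnapshotRestore(current, snapshot) {
  // A snapshot from an older restore epoch must never be replayed.
  if (snapshot.restoreEpoch < current.restoreEpoch) {
    return { action: 'reject', reason: 'stale-epoch' };
  }
  // If an authoritative non-running status is newer than the captured frame,
  // any busy row inside the snapshot is stale and should be sanitized.
  const statusIsNewer =
    current.status !== 'running' && current.statusUpdatedAt > snapshot.capturedAt;
  return {
    action: 'replay',
    clearLocalQueues: true,
    sanitizeTransientWorkingRows: statusIsNewer,
  };
}
```

Keeping the decision pure makes it straightforward to unit-test the stale-epoch and stale-busy-row cases without a browser or a live terminal.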
340
401
 
341
- Phase 3: Tab activation idempotency
402
+ Phase 3: Paint, fit, scroll, and repair through controller intents
342
403
 
343
404
  - Formalize the hot-tab unchanged path using epoch tuples.
344
405
  - Make attach/snapshot/reflow decisions table-driven.
345
406
  - Keep existing blank-tab and real Codex tab-switch tests, then add stricter
346
407
  "zero server restore request when unchanged" assertions.
408
+ - Move force-paint, texture-atlas clear, fit/resize, scroll restore, cursor
409
+ repair, and blank-gap compaction behind controller intent helpers so they do
410
+ not run inside pending writes or synchronized frames.
347
411
 
348
412
  Phase 4: Hidden output, split panes, and resize
349
413
 
@@ -357,6 +421,30 @@ Phase 5: Server priority and stale response hygiene
357
421
  - Add request ids/markers to snapshot diagnostics and stale rejection.
358
422
  - Limit background cache warm concurrency.
359
423
 
424
+ ## Current Implementation Slice
425
+
426
+ The first production slice should cover the failures that motivated this design:
427
+
428
+ - controller-routed writes for live output, snapshot replay, cursor repair,
429
+ submitted prompt cleanup, transient Codex `Working` cleanup, blank-gap
430
+ compaction, and crash banners. Normal live bytes still rely on xterm's parser
431
+ FIFO; CTM does not add a second browser-side serialization layer that could
432
+ slow or reorder typing feedback;
433
+ - deferred refresh/texture-atlas clearing while writes or synchronized frames
434
+ are active;
435
+ - stale Codex busy-row cleanup before or immediately after idle snapshot restore;
436
+ - active-input ownership gates late snapshot paint/scroll follow-up so Claude and
437
+ Codex prompt drafts are not moved or repainted by stale restore callbacks;
438
+ - bounded Codex semantic prompt repair only rewrites an already visible blank
439
+ prompt row, and explicitly excludes provider image-reference path drafts so
440
+ compact `[Image #N]` paste UX never flashes local file paths in the terminal;
441
+ - no-retry render coverage for Codex working-status cleanup, synchronized redraw
442
+ streams, skill typing stability, stale snapshot typing, tab-switch
443
+ idempotency, and slash picker flows.
444
+
445
+ Remaining legacy direct paint/fit/scroll paths can then be migrated behind the
446
+ same controller without changing the external behavior again.
447
+
360
448
  ## Open Questions
361
449
 
362
450
  - Should CTM keep hot terminals for all recent sessions, or should the hot cache