create-walle 0.9.21 → 0.9.23

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (500)
  1. package/README.md +27 -5
  2. package/package.json +2 -2
  3. package/template/CLAUDE.md +2 -2
  4. package/template/LICENSE +1 -1
  5. package/template/bin/ctm-dev-cleanup.js +24 -3
  6. package/template/bin/ctm-launch.sh +13 -0
  7. package/template/bin/dev.sh +156 -18
  8. package/template/bin/node-bin.sh +84 -0
  9. package/template/bin/pin-node.sh +51 -0
  10. package/template/claude-task-manager/api-prompts.js +1203 -182
  11. package/template/claude-task-manager/api-reviews.js +109 -15
  12. package/template/claude-task-manager/approval-agent.js +1360 -280
  13. package/template/claude-task-manager/bin/restart-ctm.sh +64 -23
  14. package/template/claude-task-manager/bin/storage-migration-supervisor.js +338 -0
  15. package/template/claude-task-manager/db.js +4417 -295
  16. package/template/claude-task-manager/docs/app-update-refresh-protocol.md +69 -0
  17. package/template/claude-task-manager/docs/approval-ai-refinement.md +138 -0
  18. package/template/claude-task-manager/docs/approval-rescue-loop.md +74 -0
  19. package/template/claude-task-manager/docs/codex-operational-warning-health.md +107 -0
  20. package/template/claude-task-manager/docs/codex-resume-state-guard-design.md +17 -12
  21. package/template/claude-task-manager/docs/codex-terminal-render-controller-handoff.md +311 -0
  22. package/template/claude-task-manager/docs/coding-agent-hooks-architecture.md +418 -0
  23. package/template/claude-task-manager/docs/conversation-import-freshness.md +20 -0
  24. package/template/claude-task-manager/docs/google-workspace-auth-health.md +77 -0
  25. package/template/claude-task-manager/docs/image-paste-ux.md +13 -0
  26. package/template/claude-task-manager/docs/ipad-web-preview.md +88 -0
  27. package/template/claude-task-manager/docs/main-loop-offload-architecture.md +66 -0
  28. package/template/claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md +274 -519
  29. package/template/claude-task-manager/docs/mobile-live-streaming.md +27 -5
  30. package/template/claude-task-manager/docs/mobile-remote-submission-lifecycle.md +69 -0
  31. package/template/claude-task-manager/docs/phone-access-design.md +53 -15
  32. package/template/claude-task-manager/docs/phone-passkey-identity.md +122 -0
  33. package/template/claude-task-manager/docs/phone-setup.md +3 -0
  34. package/template/claude-task-manager/docs/prompt-editing-tree-design.md +25 -1
  35. package/template/claude-task-manager/docs/remote-desktop-access-design.md +268 -0
  36. package/template/claude-task-manager/docs/restart-lifecycle-architecture.md +95 -0
  37. package/template/claude-task-manager/docs/runtime-work-control-plane.md +53 -0
  38. package/template/claude-task-manager/docs/session-interactive-wait-surfaces.md +38 -0
  39. package/template/claude-task-manager/docs/session-needs-you-dismissal.md +84 -0
  40. package/template/claude-task-manager/docs/session-render-state-management-design.md +91 -3
  41. package/template/claude-task-manager/docs/session-standup-command-center-design.md +25 -1
  42. package/template/claude-task-manager/docs/session-title-authority.md +32 -0
  43. package/template/claude-task-manager/docs/session-workspace-binding.md +33 -0
  44. package/template/claude-task-manager/docs/skill-intent-resolution-design.md +72 -0
  45. package/template/claude-task-manager/docs/walle-mcp-supervisor-health.md +86 -0
  46. package/template/claude-task-manager/docs/walle-relay-phone-access-design.md +24 -15
  47. package/template/claude-task-manager/docs/walle-session-history-hydration.md +114 -0
  48. package/template/claude-task-manager/docs/walle-session-input-queue.md +104 -0
  49. package/template/claude-task-manager/docs/walle-session-model-catalog.md +90 -0
  50. package/template/claude-task-manager/docs/walle-session-model-preferences.md +15 -6
  51. package/template/claude-task-manager/git-utils.js +897 -27
  52. package/template/claude-task-manager/lib/agent-capabilities.js +33 -0
  53. package/template/claude-task-manager/lib/agent-cli-cache.js +37 -7
  54. package/template/claude-task-manager/lib/agent-hooks-installer.js +26 -2
  55. package/template/claude-task-manager/lib/agent-presets.js +17 -1
  56. package/template/claude-task-manager/lib/all-sessions-query.js +108 -0
  57. package/template/claude-task-manager/lib/approval-ai-refinement.js +488 -0
  58. package/template/claude-task-manager/lib/approval-self-adapt.js +168 -0
  59. package/template/claude-task-manager/lib/async-semaphore.js +44 -0
  60. package/template/claude-task-manager/lib/auth-context.js +5 -0
  61. package/template/claude-task-manager/lib/auth-rate-limit.js +47 -4
  62. package/template/claude-task-manager/lib/auth-rules.js +29 -2
  63. package/template/claude-task-manager/lib/auto-approval-verifier.js +129 -16
  64. package/template/claude-task-manager/lib/background-llm.js +144 -17
  65. package/template/claude-task-manager/lib/branch-inventory.js +212 -0
  66. package/template/claude-task-manager/lib/claude-desktop-sessions.js +15 -3
  67. package/template/claude-task-manager/lib/coalesce-sync-frames.js +151 -0
  68. package/template/claude-task-manager/lib/codex-launch-health.js +762 -0
  69. package/template/claude-task-manager/lib/codex-transcript-pager.js +51 -0
  70. package/template/claude-task-manager/lib/codex-zst.js +124 -0
  71. package/template/claude-task-manager/lib/coding-agent-models.js +233 -30
  72. package/template/claude-task-manager/lib/connection-health.js +232 -0
  73. package/template/claude-task-manager/lib/conversation-blob-parser.js +42 -0
  74. package/template/claude-task-manager/lib/conversation-tail-merge.js +89 -26
  75. package/template/claude-task-manager/lib/ctm-session-context-api.js +39 -10
  76. package/template/claude-task-manager/lib/cursor-conversation-store.js +354 -0
  77. package/template/claude-task-manager/lib/db-owner-worker-client.js +315 -0
  78. package/template/claude-task-manager/lib/document-review.js +141 -6
  79. package/template/claude-task-manager/lib/escalation-review.js +152 -0
  80. package/template/claude-task-manager/lib/graceful-shutdown.js +159 -0
  81. package/template/claude-task-manager/lib/headless-term-service.js +678 -0
  82. package/template/claude-task-manager/lib/heavy-worker-fallback.js +38 -0
  83. package/template/claude-task-manager/lib/jsonl-conversation-parser.js +542 -0
  84. package/template/claude-task-manager/lib/jsonl-range-reader.js +112 -0
  85. package/template/claude-task-manager/lib/main-db-census.js +216 -0
  86. package/template/claude-task-manager/lib/message-pagination.js +106 -4
  87. package/template/claude-task-manager/lib/microsoft-dev-tunnel-setup.js +750 -26
  88. package/template/claude-task-manager/lib/mobile-auth-api.js +274 -7
  89. package/template/claude-task-manager/lib/mobile-auth-store.js +592 -10
  90. package/template/claude-task-manager/lib/mobile-notification-dispatcher.js +15 -0
  91. package/template/claude-task-manager/lib/model-overview-brain-fallback.js +311 -0
  92. package/template/claude-task-manager/lib/model-overview-cache.js +141 -0
  93. package/template/claude-task-manager/lib/models-health-routing-notice.js +126 -0
  94. package/template/claude-task-manager/lib/node-pin-guard.js +93 -0
  95. package/template/claude-task-manager/lib/perf-tracker.js +242 -6
  96. package/template/claude-task-manager/lib/permission-match.js +76 -0
  97. package/template/claude-task-manager/lib/permission-sync.js +133 -20
  98. package/template/claude-task-manager/lib/process-title.js +35 -0
  99. package/template/claude-task-manager/lib/prompt-executions-query.js +25 -0
  100. package/template/claude-task-manager/lib/prompt-index-disk-cache.js +44 -0
  101. package/template/claude-task-manager/lib/prompt-intent.js +132 -0
  102. package/template/claude-task-manager/lib/provider-user-context.js +34 -0
  103. package/template/claude-task-manager/lib/read-pool-client.js +313 -0
  104. package/template/claude-task-manager/lib/readpool-breaker.js +31 -0
  105. package/template/claude-task-manager/lib/recent-sessions-breaker.js +12 -0
  106. package/template/claude-task-manager/lib/remote-feedback-client.js +72 -0
  107. package/template/claude-task-manager/lib/remote-relay-protocol.js +37 -4
  108. package/template/claude-task-manager/lib/remote-relay-store.js +159 -0
  109. package/template/claude-task-manager/lib/remote-submission-observer.js +278 -0
  110. package/template/claude-task-manager/lib/restart-guard.js +109 -0
  111. package/template/claude-task-manager/lib/restore-interruption-detector.js +439 -0
  112. package/template/claude-task-manager/lib/restore-policy.js +13 -0
  113. package/template/claude-task-manager/lib/restore-resume-batch.js +74 -0
  114. package/template/claude-task-manager/lib/restore-runtime.js +68 -0
  115. package/template/claude-task-manager/lib/restore-storm.js +34 -0
  116. package/template/claude-task-manager/lib/resume-cwd.js +36 -0
  117. package/template/claude-task-manager/lib/resume-preflight.js +313 -0
  118. package/template/claude-task-manager/lib/runtime-work-registry.js +444 -0
  119. package/template/claude-task-manager/lib/sanitize-openai-auth.js +31 -0
  120. package/template/claude-task-manager/lib/scheduler.js +21 -1
  121. package/template/claude-task-manager/lib/scrollback-snapshot-store.js +159 -0
  122. package/template/claude-task-manager/lib/serial-task-queue.js +64 -0
  123. package/template/claude-task-manager/lib/server-listeners.js +239 -0
  124. package/template/claude-task-manager/lib/session-capture.js +42 -7
  125. package/template/claude-task-manager/lib/session-content-backfill.js +131 -0
  126. package/template/claude-task-manager/lib/session-history.js +388 -43
  127. package/template/claude-task-manager/lib/session-host-manager.js +287 -0
  128. package/template/claude-task-manager/lib/session-image-refs.js +209 -0
  129. package/template/claude-task-manager/lib/session-jobs.js +399 -59
  130. package/template/claude-task-manager/lib/session-prompt-index.js +137 -0
  131. package/template/claude-task-manager/lib/session-restore.js +53 -0
  132. package/template/claude-task-manager/lib/session-standup.js +123 -23
  133. package/template/claude-task-manager/lib/session-state-bus.js +14 -0
  134. package/template/claude-task-manager/lib/session-stream.js +64 -16
  135. package/template/claude-task-manager/lib/session-timeline-summary.js +260 -0
  136. package/template/claude-task-manager/lib/session-token-usage.js +494 -0
  137. package/template/claude-task-manager/lib/session-workspace-binding.js +356 -0
  138. package/template/claude-task-manager/lib/setup-network-config.js +9 -0
  139. package/template/claude-task-manager/lib/size-cap.js +45 -0
  140. package/template/claude-task-manager/lib/size-cap.test.js +62 -0
  141. package/template/claude-task-manager/lib/skill-autocomplete.js +180 -1
  142. package/template/claude-task-manager/lib/skill-intent-resolver.js +304 -0
  143. package/template/claude-task-manager/lib/sqlite-driver.js +19 -3
  144. package/template/claude-task-manager/lib/standup-attention.js +7 -3
  145. package/template/claude-task-manager/lib/status-authority.js +39 -0
  146. package/template/claude-task-manager/lib/status-hooks.js +4 -0
  147. package/template/claude-task-manager/lib/storage-migration.js +235 -0
  148. package/template/claude-task-manager/lib/structured-capture.js +298 -0
  149. package/template/claude-task-manager/lib/sync-io-census.js +163 -0
  150. package/template/claude-task-manager/lib/tailscale-setup.js +6 -0
  151. package/template/claude-task-manager/lib/terminal-activity-evidence.js +33 -0
  152. package/template/claude-task-manager/lib/terminal-choice.js +364 -0
  153. package/template/claude-task-manager/lib/terminal-control-sanitize.js +17 -0
  154. package/template/claude-task-manager/lib/terminal-fingerprint.js +48 -0
  155. package/template/claude-task-manager/lib/terminal-output-flush.js +84 -0
  156. package/template/claude-task-manager/lib/timeline-order.js +122 -0
  157. package/template/claude-task-manager/lib/transcript-store.js +348 -43
  158. package/template/claude-task-manager/lib/transport-security.js +84 -1
  159. package/template/claude-task-manager/lib/wait-state.js +184 -0
  160. package/template/claude-task-manager/lib/walle-client.js +47 -5
  161. package/template/claude-task-manager/lib/walle-ctm-history.js +564 -4
  162. package/template/claude-task-manager/lib/walle-external-actions.js +135 -16
  163. package/template/claude-task-manager/lib/walle-history-hydration.js +46 -0
  164. package/template/claude-task-manager/lib/walle-native-health.js +403 -0
  165. package/template/claude-task-manager/lib/walle-repair.js +701 -0
  166. package/template/claude-task-manager/lib/walle-session-cache.js +109 -0
  167. package/template/claude-task-manager/lib/walle-session-context.js +57 -21
  168. package/template/claude-task-manager/lib/walle-session-model-catalog.js +34 -0
  169. package/template/claude-task-manager/lib/walle-supervisor.js +539 -63
  170. package/template/claude-task-manager/lib/walle-transcript.js +52 -0
  171. package/template/claude-task-manager/lib/worktree-active-sync.js +11 -7
  172. package/template/claude-task-manager/lib/worktree-cwd.js +32 -1
  173. package/template/claude-task-manager/package.json +1 -1
  174. package/template/claude-task-manager/prompt-harvest.js +89 -66
  175. package/template/claude-task-manager/providers/claude-code.js +51 -3
  176. package/template/claude-task-manager/providers/cursor.js +140 -45
  177. package/template/claude-task-manager/public/css/reviews.css +551 -61
  178. package/template/claude-task-manager/public/css/setup.css +191 -0
  179. package/template/claude-task-manager/public/css/walle-session.css +865 -10
  180. package/template/claude-task-manager/public/css/walle.css +154 -0
  181. package/template/claude-task-manager/public/designs/ai-providers-consolidation-v2.html +830 -0
  182. package/template/claude-task-manager/public/index.html +18516 -2058
  183. package/template/claude-task-manager/public/ipad.html +363 -0
  184. package/template/claude-task-manager/public/js/document-review-links.js +301 -0
  185. package/template/claude-task-manager/public/js/image-normalize.js +69 -36
  186. package/template/claude-task-manager/public/js/message-renderer.js +1265 -77
  187. package/template/claude-task-manager/public/js/prompts.js +66 -29
  188. package/template/claude-task-manager/public/js/reviews.js +901 -133
  189. package/template/claude-task-manager/public/js/session-activity-utils.js +11 -1
  190. package/template/claude-task-manager/public/js/session-search-utils.js +94 -10
  191. package/template/claude-task-manager/public/js/session-status-precedence.js +23 -5
  192. package/template/claude-task-manager/public/js/setup.js +1273 -176
  193. package/template/claude-task-manager/public/js/stream-view.js +691 -73
  194. package/template/claude-task-manager/public/js/terminal-reconciler.js +210 -0
  195. package/template/claude-task-manager/public/js/walle-session.js +2455 -158
  196. package/template/claude-task-manager/public/js/walle.js +455 -28
  197. package/template/claude-task-manager/public/m/app.css +2909 -262
  198. package/template/claude-task-manager/public/m/app.js +6601 -398
  199. package/template/claude-task-manager/public/m/claim.html +224 -17
  200. package/template/claude-task-manager/public/m/index.html +117 -21
  201. package/template/claude-task-manager/public/m/sw.js +3 -1
  202. package/template/claude-task-manager/public/manifest.json +2 -2
  203. package/template/claude-task-manager/public/prompts.html +30 -14
  204. package/template/claude-task-manager/queue-engine.js +507 -28
  205. package/template/claude-task-manager/scripts/repair-claude-session-images.js +27 -8
  206. package/template/claude-task-manager/server.js +14341 -2197
  207. package/template/claude-task-manager/session-integrity.js +160 -18
  208. package/template/claude-task-manager/session-search-ranking.js +1 -0
  209. package/template/claude-task-manager/session-utils.js +25 -5
  210. package/template/claude-task-manager/workers/approval-blocklist.js +96 -6
  211. package/template/claude-task-manager/workers/approval-widget-validator.js +14 -8
  212. package/template/claude-task-manager/workers/conversation-import-worker.js +11 -50
  213. package/template/claude-task-manager/workers/db-owner-worker.js +386 -0
  214. package/template/claude-task-manager/workers/harvest-worker.js +9 -55
  215. package/template/claude-task-manager/workers/headless-term-worker.js +9 -530
  216. package/template/claude-task-manager/workers/read-pool-worker.js +387 -0
  217. package/template/claude-task-manager/workers/scrollback-worker.js +11 -72
  218. package/template/claude-task-manager/workers/session-host-process.js +146 -0
  219. package/template/claude-task-manager/workers/session-integrity-worker.js +10 -54
  220. package/template/claude-task-manager/workers/state-detectors/base.js +18 -1
  221. package/template/claude-task-manager/workers/state-detectors/claude-code.js +182 -9
  222. package/template/claude-task-manager/workers/state-detectors/codex.js +150 -2
  223. package/template/claude-task-manager/workers/state-detectors/cursor.js +127 -0
  224. package/template/claude-task-manager/workers/state-detectors/gemini.js +21 -0
  225. package/template/claude-task-manager/workers/state-detectors/index.js +29 -0
  226. package/template/claude-task-manager/workers/state-detectors/opencode.js +103 -0
  227. package/template/docs/design/markdown-review-pane.md +206 -0
  228. package/template/docs/designs/2026-05-17-portkey-gateway-provider-ux.md +129 -38
  229. package/template/docs/designs/2026-05-20-mobile-worktree-finish-command.md +27 -0
  230. package/template/docs/designs/2026-05-22-ai-configuration-consolidation.md +248 -0
  231. package/template/docs/designs/ai-configuration-consolidation-mock.html +812 -0
  232. package/template/docs/private-memory-and-pii-policy.md +69 -0
  233. package/template/package.json +2 -1
  234. package/template/scripts/check-private-data.js +201 -0
  235. package/template/shared/sqlite-owner-guard.js +30 -0
  236. package/template/shared/sqlite-owner-write-queue.js +225 -0
  237. package/template/shared/sqlite-storage-policy.js +111 -0
  238. package/template/shared/sqlite-write-lock.js +428 -0
  239. package/template/wall-e/agent-runners/claude-code.js +5 -0
  240. package/template/wall-e/agent.js +166 -22
  241. package/template/wall-e/api-walle.js +524 -70
  242. package/template/wall-e/auth/provider-flows.js +11 -1
  243. package/template/wall-e/bin/walle-mcp-stdio.js +341 -17
  244. package/template/wall-e/brain.js +1614 -141
  245. package/template/wall-e/chat/attachment-blocks.js +96 -0
  246. package/template/wall-e/chat/attachments.js +2 -1
  247. package/template/wall-e/chat/capability-resolver.js +7 -7
  248. package/template/wall-e/chat/context-messages.js +28 -0
  249. package/template/wall-e/chat/conversation-frame.js +630 -0
  250. package/template/wall-e/chat/provider-messages.js +125 -0
  251. package/template/wall-e/chat.js +1002 -233
  252. package/template/wall-e/coding/acceptance-contract.js +170 -0
  253. package/template/wall-e/coding/acp-adapter.js +1 -1
  254. package/template/wall-e/coding/agent-catalog.js +3 -0
  255. package/template/wall-e/coding/artifact-store.js +93 -0
  256. package/template/wall-e/coding/capability-router.js +120 -0
  257. package/template/wall-e/coding/coding-run-controller.js +423 -0
  258. package/template/wall-e/coding/compaction-service.js +157 -12
  259. package/template/wall-e/coding/frontend-verification.js +258 -0
  260. package/template/wall-e/coding/lifecycle-hooks.js +75 -0
  261. package/template/wall-e/coding/local-preview-contract.js +157 -0
  262. package/template/wall-e/coding/permission-service.js +57 -13
  263. package/template/wall-e/coding/prompt-bundle.js +19 -1
  264. package/template/wall-e/coding/prompt-section-registry.js +227 -0
  265. package/template/wall-e/coding/provider-compat.js +15 -0
  266. package/template/wall-e/coding/runtime-events.js +224 -0
  267. package/template/wall-e/coding/runtime-mode.js +3 -0
  268. package/template/wall-e/coding/side-git-snapshot.js +160 -4
  269. package/template/wall-e/coding/snapshot-service.js +143 -1
  270. package/template/wall-e/coding/stream-processor.js +388 -34
  271. package/template/wall-e/coding/task-tool.js +141 -4
  272. package/template/wall-e/coding/tool-execution-controller.js +365 -0
  273. package/template/wall-e/coding/tool-registry.js +43 -5
  274. package/template/wall-e/coding/user-hooks.js +217 -0
  275. package/template/wall-e/coding-orchestrator.js +1330 -221
  276. package/template/wall-e/coding-prompts.js +20 -4
  277. package/template/wall-e/context/context-builder.js +15 -2
  278. package/template/wall-e/decision/confidence.js +1 -1
  279. package/template/wall-e/docs/coding-acceptance-contract.md +41 -0
  280. package/template/wall-e/docs/external-action-controller.md +26 -6
  281. package/template/wall-e/docs/telemetry-lifecycle.md +8 -2
  282. package/template/wall-e/embeddings.js +591 -53
  283. package/template/wall-e/external-action-controller.js +12 -0
  284. package/template/wall-e/http/auth.js +1 -0
  285. package/template/wall-e/http/chat-api.js +46 -11
  286. package/template/wall-e/http/model-admin.js +836 -34
  287. package/template/wall-e/lib/boot-profile.js +88 -0
  288. package/template/wall-e/lib/event-loop-monitor.js +93 -0
  289. package/template/wall-e/lib/service-health.js +194 -0
  290. package/template/wall-e/llm/anthropic.js +130 -5
  291. package/template/wall-e/llm/client.js +266 -63
  292. package/template/wall-e/llm/default-fallback.js +382 -0
  293. package/template/wall-e/llm/health.js +19 -0
  294. package/template/wall-e/llm/message-guard.js +78 -0
  295. package/template/wall-e/llm/model-catalog.js +252 -1
  296. package/template/wall-e/llm/openai.js +26 -4
  297. package/template/wall-e/llm/portkey-sync.js +654 -0
  298. package/template/wall-e/llm/provider-error.js +30 -2
  299. package/template/wall-e/llm/registry.js +5 -1
  300. package/template/wall-e/llm/request-compat.js +67 -0
  301. package/template/wall-e/loops/backfill.js +79 -23
  302. package/template/wall-e/loops/brain-optimize.js +67 -0
  303. package/template/wall-e/loops/ingest.js +25 -10
  304. package/template/wall-e/loops/question-digest.js +160 -0
  305. package/template/wall-e/loops/reflect.js +6 -4
  306. package/template/wall-e/loops/think.js +39 -12
  307. package/template/wall-e/mcp-server.js +318 -36
  308. package/template/wall-e/memory/ctm-context-client.js +52 -14
  309. package/template/wall-e/memory/ctm-operational-context.js +237 -0
  310. package/template/wall-e/memory/ctm-prompt-executions-client.js +128 -0
  311. package/template/wall-e/memory/ctm-session-context.js +111 -63
  312. package/template/wall-e/prompts/coding/deepseek.txt +3 -0
  313. package/template/wall-e/prompts/coding/gemini.txt +6 -0
  314. package/template/wall-e/prompts/coding/gpt.txt +6 -0
  315. package/template/wall-e/prompts/coding/local.txt +7 -0
  316. package/template/wall-e/runtime/decision-hooks.js +115 -0
  317. package/template/wall-e/runtime/devbox-gateway.js +82 -8
  318. package/template/wall-e/runtime/prompt-manifest.js +86 -0
  319. package/template/wall-e/runtime/tool-executor.js +269 -0
  320. package/template/wall-e/runtime/tool-result-envelope.js +138 -0
  321. package/template/wall-e/runtime/transcript-projection.js +60 -0
  322. package/template/wall-e/runtime/walle-runtime.js +224 -0
  323. package/template/wall-e/scripts/db-optimize/migrate.js +162 -0
  324. package/template/wall-e/scripts/db-optimize/recall-eval.js +117 -0
  325. package/template/wall-e/server.js +15 -0
  326. package/template/wall-e/session-files.js +9 -0
  327. package/template/wall-e/skills/_bundled/google-calendar/run.js +1 -1
  328. package/template/wall-e/skills/_bundled/gws-workspace/run.js +1 -1
  329. package/template/wall-e/skills/_bundled/slack-mentions/run.js +76 -6
  330. package/template/wall-e/skills/claude-code-reader.js +7 -3
  331. package/template/wall-e/skills/script-skill-runner.js +10 -0
  332. package/template/wall-e/skills/skill-planner.js +38 -0
  333. package/template/wall-e/tools/builtin-middleware.js +19 -9
  334. package/template/wall-e/tools/local-tools.js +1428 -16
  335. package/template/wall-e/tools/permission-checker.js +73 -5
  336. package/template/wall-e/tools/question-manager.js +117 -7
  337. package/template/wall-e/training/harvester.js +12 -28
  338. package/template/wall-e/training/replay.js +25 -80
  339. package/template/website/index.html +10 -10
  340. package/template/wall-e/eval/ab-test.js +0 -203
  341. package/template/wall-e/eval/agent-runner.js +0 -772
  342. package/template/wall-e/eval/agent-scorer.js +0 -461
  343. package/template/wall-e/eval/aggregator.js +0 -414
  344. package/template/wall-e/eval/allowed-test-commands.js +0 -34
  345. package/template/wall-e/eval/benchmark-generator.js +0 -113
  346. package/template/wall-e/eval/benchmarks/chat-eval.json +0 -1662
  347. package/template/wall-e/eval/benchmarks/chat.json +0 -82
  348. package/template/wall-e/eval/benchmarks/coding-agent-real.json +0 -1
  349. package/template/wall-e/eval/benchmarks/coding-agent.json +0 -1581
  350. package/template/wall-e/eval/benchmarks/coding.json +0 -122
  351. package/template/wall-e/eval/benchmarks/memory-retrieval.json +0 -234
  352. package/template/wall-e/eval/benchmarks/reasoning.json +0 -82
  353. package/template/wall-e/eval/benchmarks/swebench-lite-30.json +0 -212
  354. package/template/wall-e/eval/benchmarks.js +0 -669
  355. package/template/wall-e/eval/cc-replay.js +0 -719
  356. package/template/wall-e/eval/chat-eval.js +0 -525
  357. package/template/wall-e/eval/check-keys.js +0 -15
  358. package/template/wall-e/eval/check-providers.js +0 -42
  359. package/template/wall-e/eval/codex-cli-baseline.js +0 -669
  360. package/template/wall-e/eval/coding-agent-real.js +0 -570
  361. package/template/wall-e/eval/context-compactor.js +0 -251
  362. package/template/wall-e/eval/debug-agent003.js +0 -68
  363. package/template/wall-e/eval/diagnostics.js +0 -216
  364. package/template/wall-e/eval/eval-orchestrator.js +0 -642
  365. package/template/wall-e/eval/evaluate.js +0 -202
  366. package/template/wall-e/eval/evaluator.js +0 -373
  367. package/template/wall-e/eval/exporter.js +0 -212
  368. package/template/wall-e/eval/fixtures/express-basic/package.json +0 -9
  369. package/template/wall-e/eval/fixtures/express-basic/server.js +0 -115
  370. package/template/wall-e/eval/fixtures/express-basic/test.js +0 -83
  371. package/template/wall-e/eval/fixtures/express-buggy/package.json +0 -9
  372. package/template/wall-e/eval/fixtures/express-buggy/server.js +0 -113
  373. package/template/wall-e/eval/fixtures/express-buggy/test.js +0 -83
  374. package/template/wall-e/eval/fixtures/express-buggy-items/package.json +0 -9
  375. package/template/wall-e/eval/fixtures/express-buggy-items/server.js +0 -112
  376. package/template/wall-e/eval/fixtures/express-buggy-items/test.js +0 -83
  377. package/template/wall-e/eval/fixtures/express-buggy-search/package.json +0 -9
  378. package/template/wall-e/eval/fixtures/express-buggy-search/server.js +0 -121
  379. package/template/wall-e/eval/fixtures/express-buggy-search/test.js +0 -83
  380. package/template/wall-e/eval/fixtures/express-rename-data/data.js +0 -34
  381. package/template/wall-e/eval/fixtures/express-rename-data/package.json +0 -9
  382. package/template/wall-e/eval/fixtures/express-rename-data/server.js +0 -97
  383. package/template/wall-e/eval/fixtures/express-rename-data/test.js +0 -88
  384. package/template/wall-e/eval/fixtures/express-xss/package.json +0 -12
  385. package/template/wall-e/eval/fixtures/express-xss/server.js +0 -90
  386. package/template/wall-e/eval/fixtures/express-xss/test.js +0 -67
  387. package/template/wall-e/eval/fixtures/express-xss/views/profile.ejs +0 -9
  388. package/template/wall-e/eval/fixtures/fullstack-app/config/default.js +0 -9
  389. package/template/wall-e/eval/fixtures/fullstack-app/config/test.js +0 -13
  390. package/template/wall-e/eval/fixtures/fullstack-app/package.json +0 -11
  391. package/template/wall-e/eval/fixtures/fullstack-app/public/css/style.css +0 -137
  392. package/template/wall-e/eval/fixtures/fullstack-app/public/index.html +0 -46
  393. package/template/wall-e/eval/fixtures/fullstack-app/public/js/app.js +0 -121
  394. package/template/wall-e/eval/fixtures/fullstack-app/public/js/auth.js +0 -71
  395. package/template/wall-e/eval/fixtures/fullstack-app/public/js/items.js +0 -80
  396. package/template/wall-e/eval/fixtures/fullstack-app/public/js/users.js +0 -46
  397. package/template/wall-e/eval/fixtures/fullstack-app/public/login.html +0 -45
  398. package/template/wall-e/eval/fixtures/fullstack-app/public/register.html +0 -38
  399. package/template/wall-e/eval/fixtures/fullstack-app/scripts/migrate.js +0 -23
  400. package/template/wall-e/eval/fixtures/fullstack-app/scripts/seed.js +0 -46
  401. package/template/wall-e/eval/fixtures/fullstack-app/server/db.js +0 -99
  402. package/template/wall-e/eval/fixtures/fullstack-app/server/index.js +0 -94
  403. package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/auth.js +0 -19
  404. package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/logger.js +0 -19
  405. package/template/wall-e/eval/fixtures/fullstack-app/server/router.js +0 -50
  406. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/auth.js +0 -69
  407. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/health.js +0 -23
  408. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/items.js +0 -88
  409. package/template/wall-e/eval/fixtures/fullstack-app/server/routes/users.js +0 -75
  410. package/template/wall-e/eval/fixtures/fullstack-app/server/test.js +0 -198
  411. package/template/wall-e/eval/fixtures/fullstack-app/server/utils/response.js +0 -34
  412. package/template/wall-e/eval/fixtures/fullstack-app/server/utils/validate.js +0 -26
  413. package/template/wall-e/eval/fixtures/fullstack-app/server.js +0 -8
  414. package/template/wall-e/eval/fixtures/fullstack-app/test.js +0 -12
  415. package/template/wall-e/eval/fixtures/monorepo-basic/package.json +0 -8
  416. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/data.js +0 -58
  417. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/middleware.js +0 -46
  418. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/package.json +0 -8
  419. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/routes.js +0 -64
  420. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/server.js +0 -56
  421. package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/test.js +0 -116
  422. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/commands.js +0 -61
  423. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/index.js +0 -62
  424. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/output.js +0 -43
  425. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/package.json +0 -11
  426. package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/test.js +0 -44
  427. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/formatters.js +0 -43
  428. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/index.js +0 -12
  429. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/package.json +0 -5
  430. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/test.js +0 -55
  431. package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/validators.js +0 -29
  432. package/template/wall-e/eval/fixtures/monorepo-basic/test.js +0 -46
  433. package/template/wall-e/eval/fixtures/node-cli/index.js +0 -78
  434. package/template/wall-e/eval/fixtures/node-cli/package.json +0 -10
  435. package/template/wall-e/eval/fixtures/node-cli/test.js +0 -57
  436. package/template/wall-e/eval/fixtures/node-typed/package.json +0 -8
  437. package/template/wall-e/eval/fixtures/node-typed/src/handlers.js +0 -31
  438. package/template/wall-e/eval/fixtures/node-typed/src/utils.js +0 -33
  439. package/template/wall-e/eval/fixtures/node-typed/test.js +0 -36
  440. package/template/wall-e/eval/fixtures/python-flask/app.py +0 -14
  441. package/template/wall-e/eval/fixtures/python-flask/requirements.txt +0 -2
  442. package/template/wall-e/eval/fixtures/python-flask/test_app.py +0 -25
  443. package/template/wall-e/eval/fixtures/wall-e-subset/brain.js +0 -105
  444. package/template/wall-e/eval/fixtures/wall-e-subset/eval/aggregator.js +0 -101
  445. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/chat.json +0 -20
  446. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/coding.json +0 -32
  447. package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks.js +0 -64
  448. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/package.json +0 -6
  449. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/server.js +0 -31
  450. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/test.js +0 -18
  451. package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/utils.js +0 -34
  452. package/template/wall-e/eval/fixtures/wall-e-subset/eval/runner.js +0 -104
  453. package/template/wall-e/eval/fixtures/wall-e-subset/eval/scorer.js +0 -73
  454. package/template/wall-e/eval/fixtures/wall-e-subset/eval/test.js +0 -134
  455. package/template/wall-e/eval/fixtures/wall-e-subset/llm/client.js +0 -99
  456. package/template/wall-e/eval/fixtures/wall-e-subset/llm/providers.js +0 -63
  457. package/template/wall-e/eval/fixtures/wall-e-subset/llm/test.js +0 -70
  458. package/template/wall-e/eval/fixtures/wall-e-subset/package.json +0 -10
  459. package/template/wall-e/eval/fixtures/wall-e-subset/test.js +0 -86
  460. package/template/wall-e/eval/harvester.js +0 -685
  461. package/template/wall-e/eval/head-to-head.js +0 -388
  462. package/template/wall-e/eval/humaneval-adapter.js +0 -321
  463. package/template/wall-e/eval/list-models.js +0 -31
  464. package/template/wall-e/eval/livecodebench-adapter.js +0 -291
  465. package/template/wall-e/eval/mail-integration.js +0 -443
  466. package/template/wall-e/eval/manifest.js +0 -186
  467. package/template/wall-e/eval/meta-harness/adapters/coding-agent.js +0 -57
  468. package/template/wall-e/eval/meta-harness/bootstrap-snapshot.js +0 -149
  469. package/template/wall-e/eval/meta-harness/candidate-store.js +0 -117
  470. package/template/wall-e/eval/meta-harness/cli.js +0 -86
  471. package/template/wall-e/eval/meta-harness/domain-spec.js +0 -154
  472. package/template/wall-e/eval/meta-harness/domains/coding-agent.domain.json +0 -84
  473. package/template/wall-e/eval/meta-harness/examples/env-bootstrap-candidate.js +0 -29
  474. package/template/wall-e/eval/meta-harness/experience-store.js +0 -174
  475. package/template/wall-e/eval/meta-harness/frontier.js +0 -96
  476. package/template/wall-e/eval/meta-harness/harness-interface.js +0 -90
  477. package/template/wall-e/eval/meta-harness/leakage-guard.js +0 -80
  478. package/template/wall-e/eval/meta-harness/optimizer.js +0 -207
  479. package/template/wall-e/eval/meta-harness/proposer-runner.js +0 -110
  480. package/template/wall-e/eval/meta-harness/reporting.js +0 -58
  481. package/template/wall-e/eval/meta-harness/telemetry.js +0 -27
  482. package/template/wall-e/eval/meta-harness/validation.js +0 -81
  483. package/template/wall-e/eval/promoter.js +0 -228
  484. package/template/wall-e/eval/provider-normalizer.js +0 -33
  485. package/template/wall-e/eval/replay.js +0 -395
  486. package/template/wall-e/eval/run-agent-benchmarks.js +0 -386
  487. package/template/wall-e/eval/run-codex-cli-baseline.js +0 -177
  488. package/template/wall-e/eval/run-coding-agent-real.js +0 -187
  489. package/template/wall-e/eval/run-eval.js +0 -435
  490. package/template/wall-e/eval/run-model-comparison.js +0 -142
  491. package/template/wall-e/eval/session-evaluator.js +0 -187
  492. package/template/wall-e/eval/session-miner.js +0 -207
  493. package/template/wall-e/eval/session-retrieval-benchmark.js +0 -150
  494. package/template/wall-e/eval/session-transcripts.js +0 -509
  495. package/template/wall-e/eval/shadow.js +0 -161
  496. package/template/wall-e/eval/swebench-adapter.js +0 -345
  497. package/template/wall-e/eval/swebench-docker.js +0 -192
  498. package/template/wall-e/eval/train.py +0 -320
  499. package/template/wall-e/eval/trainer.js +0 -232
  500. package/template/wall-e/eval/weekly-eval-loop.js +0 -241
@@ -0,0 +1,69 @@
1
+ # CTM App Update Refresh Protocol
2
+
3
+ ## Problem
4
+
5
+ CTM is a long-lived browser app. A user can keep several CTM tabs open while
6
+ `create-walle update` replaces CTM and Wall-E code, restarts the server, and
7
+ serves newer HTML, CSS, and JavaScript. Existing browser tabs may reconnect to
8
+ the new server while still executing the old client bundle.
9
+
10
+ That is unsafe in both directions:
11
+
12
+ - silently keeping old code leaves users on stale UX after an update;
13
+ - blindly reloading every tab can destroy active terminal input, queue drafts,
14
+ screenshot edits, or mobile composer text.
15
+
16
+ ## Design
17
+
18
+ The server is the source of truth for the installed application identity. Each
19
+ WebSocket `hello` includes an app identity:
20
+
21
+ - `version`: the installed create-walle version;
22
+ - `components`: CTM and Wall-E package versions;
23
+ - `buildId`: a stable hash of key shipped files and package versions.
24
+
25
+ `buildId` is recomputed from file stats when version info is requested. It must
26
+ not be process-cached because update installers can replace static assets before
27
+ or during a server restart.
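
A minimal sketch of how such a `buildId` could be derived, assuming a SHA-256 over package versions plus per-file size and mtime. The helper names `collectFileStats` and `computeBuildId`, the 16-character truncation, and the choice of stat fields are illustrative assumptions, not CTM's actual implementation.

```javascript
const { createHash } = require('node:crypto');
const { statSync } = require('node:fs');

// Hypothetical helper: read size and mtime for each shipped file on every call
// (no process-level cache). Missing files are recorded explicitly so that a
// deleted asset also changes the resulting hash.
function collectFileStats(paths) {
  return paths.map((p) => {
    try {
      const st = statSync(p);
      return { path: p, size: st.size, mtimeMs: Math.floor(st.mtimeMs) };
    } catch (err) {
      return { path: p, missing: true };
    }
  });
}

// Hypothetical helper: hash package versions plus file stats. Inputs are
// sorted and normalized so the result does not depend on input order.
function computeBuildId(versions, fileStats) {
  const sortedVersions = Object.keys(versions)
    .sort()
    .map((name) => [name, versions[name]]);
  const files = [...fileStats]
    .sort((a, b) => (a.path < b.path ? -1 : a.path > b.path ? 1 : 0))
    .map((s) => [s.path, s.missing ? null : s.size, s.missing ? null : s.mtimeMs]);
  return createHash('sha256')
    .update(JSON.stringify({ versions: sortedVersions, files }))
    .digest('hex')
    .slice(0, 16); // truncation length is an arbitrary choice for this sketch
}
```

Because the stats are read on each request, a file replaced by the installer changes the identity even if the server process has not restarted yet.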
28
+
29
+ The client records the first app identity it sees for the current document. On
30
+ later WebSocket reconnects, if the server identity changes, the page is running
31
+ old code and must refresh before continuing normal operation.
32
+
33
+ ## UX Contract
34
+
35
+ When a changed app identity is detected:
36
+
37
+ 1. Broadcast `reload-required` to same-origin CTM tabs using `BroadcastChannel`.
38
+ 2. If the current tab is idle, reload immediately.
39
+ 3. If the current tab has active user work, show a sticky reload-required banner
40
+ and do not steal focus.
41
+ 4. The banner remains until the user clicks **Reload now** or the page is
42
+ refreshed.
43
+
44
+ Active user work includes focused xterm.js terminals, editable inputs,
45
+ contenteditable composers, open modals, an open queue panel, and screenshot editor
46
+ state.
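
The contract above can be sketched as a pure decision function. This is an illustration under stated assumptions: the function names, the activity field names, and the action strings are hypothetical, and the sketch assumes that comparing `version` and `buildId` is enough because `buildId` already covers package versions.

```javascript
// Compare the identity recorded at page load with a later `hello` payload.
function identityChanged(first, next) {
  return first.version !== next.version || first.buildId !== next.buildId;
}

// Hypothetical activity snapshot; the field names are illustrative only.
function hasActiveUserWork(activity) {
  return Boolean(
    activity.terminalFocused ||
      activity.editableFocused ||
      activity.modalOpen ||
      activity.queuePanelOpen ||
      activity.screenshotEditorActive
  );
}

// Returns the ordered actions for the current tab: nothing when the identity
// is unchanged, otherwise a broadcast followed by either an immediate reload
// (idle tab) or a sticky banner (tab with active user work).
function planReload(first, next, activity) {
  if (!identityChanged(first, next)) return [];
  const actions = ['broadcast-reload-required'];
  actions.push(hasActiveUserWork(activity) ? 'show-banner' : 'reload');
  return actions;
}
```

Keeping the decision separate from DOM and `BroadcastChannel` calls makes it straightforward to cover in focused render tests.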
47
+
48
+ ## Mobile/PWA
49
+
50
+ The mobile service worker already uses a network-first shell strategy and
51
+ activates new workers. That only updates future navigations; a currently open
52
+ mobile document still needs the same runtime identity handshake. Mobile uses the
53
+ same `hello` comparison and shows a compact refresh banner when a composer or
54
+ detail view is active.
55
+
56
+ ## Non-Goals
57
+
58
+ - No hard reload while a user is typing.
59
+ - No reload prompt for ordinary CTM restarts when the app identity is unchanged.
60
+ - No dependency on service workers for desktop refresh behavior.
61
+
62
+ ## Verification
63
+
64
+ Focused render tests should cover:
65
+
66
+ - an idle desktop tab auto-refreshes on server identity change;
67
+ - a focused terminal desktop tab shows the reload banner without losing focus;
68
+ - another open tab receives the reload-required event through `BroadcastChannel`;
69
+ - mobile shows the refresh banner instead of silently keeping stale code.
@@ -0,0 +1,138 @@
1
+ # Approval AI Refinement
2
+
3
+ CTM's approval system has two separate jobs:
4
+
5
+ 1. Detect and parse the active terminal approval prompt.
6
+ 2. Decide whether the requested action is safe to approve.
7
+
8
+ AI refinement only belongs to the first job. It must never become the approval
9
+ policy itself.
10
+
11
+ ## Decision model (2026 update): allow-by-default + denylist
12
+
13
+ CTM's shadow approver is **allow-by-default**: it auto-approves every detected
14
+ prompt EXCEPT commands on the denylist. The denylist is the dangerous-command
15
+ **blocklist** (`workers/approval-blocklist.js`) — ON by default and editable in
16
+ the Permission Manager / Shadow Approver panel (the "permissions tab"). This is
17
+ the single default gate.
18
+
19
+ `handleApprovalCheck` order: parse → dedup → **blocklist** (escalate if matched)
20
+ → **Permission Manager rules** (user deny → escalate; user allow → approve, skip
21
+ verifier) → **verifier** (medium+ risk, if not user-allowed) → **auto-approve**
22
+ (one-time keystroke). The AI reviewer (`reviewWithAI`) is no longer in the
23
+ default path.
24
+
25
+ - **Blocklist is the denylist, ON by default.** Seeded with catastrophic /
26
+ irreversible patterns (rm -rf, mkfs, dd, sudo-destructive, reboot/shutdown,
27
+ `> /etc|/usr|/var|/boot`, curl|bash, credential exfil, recursive chmod/chown,
28
+ `git push --force`, `drop/truncate table`, `chmod 777` on system paths,
29
+ destructive `find -exec`, `node -e`/`python -c` with dangerous syscalls). Edit
30
+ it in the panel to add/remove what should require approval.
31
+ - **Permission Manager rules are authoritative across providers.** The approver
32
+ honors the user's `perm_rules` (`Bash(node:*)` allow/deny) via
33
+ `lib/permission-match.js` — a **deny** match escalates; an **allow** match
34
+ (without `always_ask`) auto-approves and skips the verifier. These rules
35
+ otherwise only configure Claude Code's own settings.json, so honoring them here
36
+ makes them apply to Codex and every other provider too.
37
+ - **Verifier is ON by default (second opinion).** For commands the user has NOT
38
+ explicitly allowed, the LLM verifier runs on **medium+ risk** and can veto an
39
+ auto-approval (read-only/low-risk ops skip it). Provider-agnostic via
40
+ `lib/background-llm.js` (`callBackgroundLlm`); a configured
41
+ `auto_approval_verifier_command` overrides the built-in. Turn off via
42
+ `ctm_settings.auto_approval_verifier_enabled = false`.
43
+ - **One-time approvals only.** CTM sends the one-time "yes" keystroke and never
44
+ auto-selects the durable "allow all / don't ask again" option.
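
The ordering described above can be sketched as follows, assuming the prompt has already been parsed and deduplicated. `decideApproval`, the `deps` shape, the risk labels, and the result strings are hypothetical; the real `handleApprovalCheck` is likely asynchronous and more involved. How an allow rule with `always_ask` is handled is not spelled out above, so this sketch simply lets it fall through to the verifier.

```javascript
// Assumed risk labels in ascending order; "medium+" means index >= 1.
const RISK_ORDER = ['low', 'medium', 'high'];

function decideApproval(prompt, deps) {
  // 1. Blocklist (the denylist): any match escalates to the user.
  if (deps.blocklist.some((re) => re.test(prompt.command))) {
    return { action: 'escalate', reason: 'blocklist' };
  }

  // 2. Permission Manager rules: a deny escalates; an allow without
  //    always_ask approves and skips the verifier.
  const rule = deps.matchPermRule(prompt.command);
  if (rule && rule.effect === 'deny') {
    return { action: 'escalate', reason: 'user-deny' };
  }
  if (rule && rule.effect === 'allow' && !rule.always_ask) {
    return { action: 'approve-once', reason: 'user-allow' };
  }

  // 3. Verifier: only for medium+ risk, and it can veto the auto-approval.
  const mediumOrHigher = RISK_ORDER.indexOf(prompt.risk) >= 1;
  if (deps.verifierEnabled && mediumOrHigher && !deps.verify(prompt)) {
    return { action: 'escalate', reason: 'verifier-veto' };
  }

  // 4. Default: one-time approval, never the durable "allow all" option.
  return { action: 'approve-once', reason: 'default-allow' };
}
```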
45
+
46
+ ## Self-adaptation from user behavior
47
+
48
+ Beyond detector/parser gate-misses, CTM learns from the user's own corrections
49
+ (`lib/approval-self-adapt.js`):
50
+
51
+ - **Interrupt after auto-approve** (Ctrl-C right after a learned-rule
52
+ auto-approval) → the offending learned rule is retired.
53
+ - **Approve after escalation** (user manually approves a prompt CTM escalated) →
54
+ a positive signal is recorded for the history-scan / maintenance loop to
55
+ promote a narrow rule.
56
+ - A periodic **`approval-self-adapt`** scheduler job retires active detection
57
+ rules whose patterns no longer compile or are no longer safe, keeping the learned set healthy.
58
+
59
+ ## Problem
60
+
61
+ The shadow approver can miss a real approval prompt before policy review runs:
62
+
63
+ - the detector sees approval-shaped text but the structural gate rejects it;
64
+ - a provider parser fails to extract the command/tool;
65
+ - the terminal snapshot is partial or races with a redraw;
66
+ - the provider changed prompt wording.
67
+
68
+ Previously, the rescue path returned `unparsed` before AI could diagnose the
69
+ miss. That made refinement hide behind the parser even when the parser was the
70
+ broken component.
71
+
72
+ ## Flow
73
+
74
+ The refinement loop is:
75
+
76
+ 1. Record the raw missed approval observation.
77
+ 2. Ask AI to classify the miss and propose a narrow parser/detector rule.
78
+ 3. Save the candidate rule in `approval_ai_refinement_rules`, separate from
79
+ shipped approval rules and user-learned approval policy rules.
80
+ 4. Validate the rule locally against the same raw terminal frame:
81
+ - regexes compile and pass safety checks;
82
+ - the one-time Yes/Allow option is matched;
83
+ - command/tool context can be extracted;
84
+ - no durable "always allow" shortcut is selected.
85
+ 5. If validation passes, mark the rule `active`.
86
+ 6. Rerun the normal approval rescue path. This proves the new rule participates
87
+ in the same blocklist, policy, live-prompt preflight, keystroke, and
88
+ post-send verification as every other approval.
89
+ 7. If the rerun works, the rule remains active.
90
+ 8. If proposal, validation, or rerun fails, persist an open warning in
91
+ `approval_ai_refinement_warnings` and stop. The warning is queryable through
92
+ `/api/approval-ai-refinement/warnings` so it does not disappear if the user
93
+ misses a toast.
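
The local validation in step 4 could look roughly like the sketch below. The rule shape (`yesPattern`, `commandPattern`), the length cap used as a stand-in "safety check", and the durable-option wording test are all assumptions made for illustration; CTM's actual checks are not shown in this document.

```javascript
// Compile a pattern defensively. The length cap is only a placeholder for
// whatever regex safety checks the real validator applies.
function safeCompile(source) {
  if (typeof source !== 'string' || source.length === 0 || source.length > 200) {
    return null;
  }
  try {
    return new RegExp(source, 'i');
  } catch {
    return null;
  }
}

// Validate a candidate rule against the same raw terminal frame that was missed.
function validateRefinementRule(rule, frame) {
  const yes = safeCompile(rule.yesPattern);
  const command = safeCompile(rule.commandPattern);
  if (!yes || !command) return { ok: false, reason: 'pattern-rejected' };

  // The one-time Yes/Allow option must be matched on some line of the frame.
  const yesLine = frame.split('\n').find((line) => yes.test(line));
  if (!yesLine) return { ok: false, reason: 'yes-option-not-matched' };

  // The matched option must not be a durable "always allow" shortcut.
  if (/always|allow all|don't ask again/i.test(yesLine)) {
    return { ok: false, reason: 'durable-option-selected' };
  }

  // Command/tool context must be extractable via the first capture group.
  const m = command.exec(frame);
  if (!m || !m[1]) return { ok: false, reason: 'command-not-extracted' };

  return { ok: true, command: m[1].trim() };
}
```

A rule that passes here would only be marked `active`; it still has to survive the rerun through the normal rescue path in step 6.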
94
+
95
+ ## Storage Boundaries
96
+
97
+ AI-generated parser rules are intentionally not stored in `approval_rules`.
98
+
99
+ - `approval_rules`: user/AI learned approval policy rules for known command
100
+ signatures.
101
+ - `approval_rescue_patterns`: retry/cooldown bookkeeping for rescue attempts.
102
+ - `approval_ai_refinement_rules`: AI-generated parser/detector refinements.
103
+ - `approval_ai_refinement_warnings`: persistent warnings when refinement fails.
104
+
105
+ This keeps product-shipped rules, user policy, and AI-generated parsing
106
+ patches auditable and reversible independently.
107
+
108
+ ## Telemetry
109
+
110
+ When a refinement proposal is evaluated, CTM sends a privacy-safe telemetry
111
+ event named `ctm_approval_ai_refinement` through the Wall-E telemetry endpoint.
112
+
113
+ The event includes:
114
+
115
+ - provider id;
116
+ - source and gate reason;
117
+ - miss type;
118
+ - validation status;
119
+ - confidence;
120
+ - booleans for which rule components were proposed;
121
+ - a sanitized generated rule payload for detector/question/yes/command-pattern
122
+ structure.
123
+
124
+ The event does not include raw terminal text, command text, paths, or secrets.
125
+ Command-shaped anchors are redacted before upload. Product maintainers can
126
+ aggregate these validated rule shapes and use them to improve CTM's shipped
127
+ detectors later.
128
+
129
+ ## Safety Rules
130
+
131
+ Refinement rules are parser rules only:
132
+
133
+ - they cannot approve anything by themselves;
134
+ - they cannot select durable allow-all options;
135
+ - active rules are still subject to the dangerous-command blocklist;
136
+ - active rules are still subject to normal approval policy review;
137
+ - keystrokes are still guarded by live terminal preflight and verification;
138
+ - a failed rerun marks the AI rule failed and persists a warning.
@@ -0,0 +1,74 @@
1
+ # Approval Rescue Loop
2
+
3
+ > Defaults (2026): the approver is **allow-by-default** — it auto-approves
4
+ > everything except the dangerous-command blocklist (the denylist, ON by default,
5
+ > editable in the Permission Manager). The LLM verifier is ON by default for
6
+ > medium+ risk and provider-agnostic via `callBackgroundLlm`; auto-approvals send the
7
+ > one-time "yes" (never durable allow-all). See `approval-ai-refinement.md`.
8
+
9
+ CTM has two approval layers:
10
+
11
+ 1. The deterministic approval pipeline is the source of truth. The headless terminal detects provider-specific approval widgets, structural gates reject stale/non-widget text, and `approval-agent.js` decides whether to approve, escalate, or block.
12
+ 2. The rescue loop is a bounded observer for misses. It only runs when the deterministic pipeline sees approval-shaped text but rejects it before policy can act, such as a `gate-miss` from the headless terminal.
13
+
14
+ The rescue loop must not replace provider parsers. Its job is to recover one missed prompt, verify whether the intervention worked, and create durable evidence so the product can improve the deterministic path.
15
+
16
+ ## Flow
17
+
18
+ 1. Headless terminal emits a normal `approval-alert` when a prompt passes structural validation.
19
+ 2. If detection sees approval-shaped text but structural validation rejects it, the worker emits `gate-miss` with a redacted/capped screen tail, provider id when known, and the gate reason.
20
+ 3. Server records the observation, then asks `approval-agent` to evaluate a rescue candidate.
21
+ 4. The approval agent:
22
+ - parses the prompt using the provider parser, with a conservative generic numbered-approval fallback for unknown providers;
23
+ - checks the dangerous-command blocklist and normal heuristics;
24
+ - asks the background LLM, when available, whether this is an active approval prompt that should be tried once;
25
+ - sends only the one-time approval shortcut, never durable allow-all, for rescue attempts;
26
+ - verifies the result by checking output movement and whether the prompt disappeared.
27
+ 5. The rescue record is updated as success, failure, skipped stale, suppressed, or promoted.
28
+
29
+ ## Suppression
30
+
31
+ Rescue is intentionally stingy:
32
+
33
+ - A fingerprint covers provider, normalized command/context, and the gate reason.
34
+ - The same fingerprint is not retried during a cooldown.
35
+ - Repeated failures suppress future attempts for that fingerprint.
36
+ - User warnings are throttled per fingerprint, so one bad pattern does not flood the UI.
37
+ - A skipped stale prompt is recorded but not warned, because it usually means the user or provider already moved on.
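
A minimal sketch of the fingerprint and cooldown bookkeeping described above. The cooldown length, the failure threshold, the whitespace normalization, and the `RescueLedger` class are assumptions for illustration; the real state lives in `approval_rescue_patterns` rows rather than an in-memory map.

```javascript
const COOLDOWN_MS = 60000; // assumed value, not taken from CTM
const MAX_CONSECUTIVE_FAILURES = 3; // assumed value, not taken from CTM

// A fingerprint covers provider, normalized command/context, and gate reason.
function fingerprint({ provider, command, gateReason }) {
  const normalized = command.trim().replace(/\s+/g, ' ');
  return [provider || 'unknown', normalized, gateReason].join('|');
}

class RescueLedger {
  constructor() {
    this.rows = new Map();
  }

  // A fingerprint is retried only outside its cooldown and only while it has
  // not failed repeatedly.
  shouldAttempt(fp, now) {
    const row = this.rows.get(fp);
    if (!row) return true;
    if (row.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) return false;
    return now - row.lastAttemptAt >= COOLDOWN_MS;
  }

  record(fp, success, now) {
    const row = this.rows.get(fp) || {
      attempts: 0,
      successes: 0,
      failures: 0,
      consecutiveFailures: 0,
      lastAttemptAt: 0,
    };
    row.attempts += 1;
    if (success) {
      row.successes += 1;
      row.consecutiveFailures = 0;
    } else {
      row.failures += 1;
      row.consecutiveFailures += 1;
    }
    row.lastAttemptAt = now;
    this.rows.set(fp, row);
    return row;
  }
}
```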
38
+
39
+ ## Rule Promotion
40
+
41
+ A successful rescue does not automatically create a broad approval rule.
42
+
43
+ After success, the agent classifies why the deterministic path missed it:
44
+
45
+ - `structural_gate_miss`, `parser_bug`, or `race`: record the evidence and keep the deterministic code path as the fix target.
46
+ - `new_provider_pattern`: promote the exact rescue fingerprint into a deterministic rescue rule for future matching and telemetry.
47
+
48
+ The promotion decision is deterministic. If a known provider parser already detected the prompt and a structural gate rejected it, CTM records the rescue as `structural_gate_miss` even when the LLM labels it `new_provider_pattern`. The LLM can decide whether a one-shot rescue is safe; it cannot promote a known-provider gate bug into a durable rule.
49
+
50
+ Promotion is still bounded to the exact fingerprint. Shipping broader provider support requires adding or updating the provider parser, then covering it with regression tests.
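
The deterministic override described above can be sketched as a small pure function. The function name and input field names are hypothetical; only the diagnosis labels come from this document.

```javascript
// Decide the recorded diagnosis and whether the exact fingerprint may be
// promoted. The LLM label is advisory: a known-provider gate rejection is
// always recorded as a structural gate miss and never promoted.
function classifyRescue({ knownProviderDetected, structuralGateRejected, llmLabel }) {
  if (knownProviderDetected && structuralGateRejected) {
    return { diagnosis: 'structural_gate_miss', promote: false };
  }
  // Only `new_provider_pattern` leads to promotion; `parser_bug`, `race`, and
  // `structural_gate_miss` keep the deterministic code path as the fix target.
  return { diagnosis: llmLabel, promote: llmLabel === 'new_provider_pattern' };
}
```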
51
+
52
+ ## Telemetry Contract
53
+
54
+ The local DB stores `approval_rescue_patterns` rows with:
55
+
56
+ - fingerprint, provider, source, and gate reason;
57
+ - attempts, successes, failures, and consecutive failures;
58
+ - last decision, outcome, diagnosis, and cooldown;
59
+ - promoted rule metadata and last warning time.
60
+
61
+ This gives operators enough evidence to answer:
62
+
63
+ - Did CTM miss an approval?
64
+ - Did the rescue attempt work?
65
+ - Was the miss a new provider pattern or a bug in an existing detector/gate?
66
+ - Is CTM repeatedly trying and failing on the same pattern?
67
+
68
+ ## Provider Guidance
69
+
70
+ New coding agents should add a provider parser under `providers/` first. The rescue loop can help discover a new shape, but a provider is considered supported only after:
71
+
72
+ - parser detection and parse tests pass;
73
+ - headless terminal structural validation accepts the live widget;
74
+ - rescue telemetry is no longer needed for the common prompt shape.
@@ -0,0 +1,107 @@
1
+ # Codex Operational Warning Health
2
+
3
+ CTM treats Codex startup diagnostics as source problems, not rendering noise.
4
+ The raw Terminal view stays raw: CTM does not hide or rewrite provider output.
5
+ Low-value warnings are already collapsed in Review and Conversation renderers,
6
+ but Terminal remains the escape hatch for exact debugging.
7
+
8
+ ## Warning Classes
9
+
10
+ Codex warning floods usually come from three independent launch inputs:
11
+
12
+ - **Invalid generated skill metadata.** Codex validates every enabled plugin
13
+ skill before the session starts. A generated `SKILL.md` with missing
14
+ frontmatter or a `description` over 1024 characters causes repeated
15
+ `Skipped loading ... invalid SKILL.md` warnings.
16
+ - **Unhealthy MCP servers.** Enabled MCP servers are initialized by Codex at
17
+ startup. OAuth-backed remote MCPs such as Slack or enterprise gateway tools
18
+ can be enabled but expired or not logged in, which produces startup warnings
19
+ before the agent can do useful work.
20
+ - **Wrong OpenAI transport.** CTM can run with `OPENAI_API_KEY` and
21
+ `OPENAI_BASE_URL` in its own environment for setup and background LLM jobs.
22
+ Interactive Codex sessions should use Codex's own auth and a stable HTTPS
23
+ OpenAI transport unless the operator explicitly overrides the provider.
24
+
25
+ Wall-E is special: it is CTM's local memory/context server. CTM never disables
26
+ Wall-E automatically. If Wall-E fails to initialize, the failure remains visible
27
+ and is recorded in session diagnostics.
28
+
29
+ ## Launch-Time Repair
30
+
31
+ Before spawning a Codex session, CTM runs `codex-launch-health`:
32
+
33
+ 1. Scan the generated Codex plugin cache under `~/.codex/plugins/cache`.
34
+ 2. Repair only safe skill metadata issues. A file is rewritten only when all of these hold:
35
+ - parseable YAML frontmatter,
36
+ - present `description`,
37
+ - description length above the Codex limit.
38
+ 3. Preserve the original file as `SKILL.md.ctm-bak` before rewriting.
39
+ 4. Leave unrepairable files alone and record a diagnostic.
40
+
41
+ This repair is intentionally narrow. CTM does not rewrite user-authored skills
42
+ or infer missing frontmatter.
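
As a rough illustration of how narrow that check is, the sketch below classifies a `SKILL.md` body as clean, repairable, or unrepairable. It is an assumption-heavy simplification: `classifySkill` is a hypothetical name, the frontmatter is matched with a regular expression instead of a real YAML parser, and only a single-line `description:` scalar is understood. The 1024-character limit is the one named in the warning-class text above.

```javascript
const MAX_DESCRIPTION = 1024;

// Classify skill metadata without modifying it. Only an over-length
// description on otherwise valid frontmatter counts as repairable.
function classifySkill(text) {
  const m = /^---\r?\n([\s\S]*?)\r?\n---(\r?\n|$)/.exec(text);
  if (!m) return { status: 'unrepairable', reason: 'missing-frontmatter' };

  const line = m[1].split(/\r?\n/).find((l) => l.startsWith('description:'));
  if (!line) return { status: 'unrepairable', reason: 'missing-description' };

  const description = line.slice('description:'.length).trim();
  if (description.length === 0) {
    return { status: 'unrepairable', reason: 'missing-description' };
  }
  if (description.length <= MAX_DESCRIPTION) return { status: 'ok' };

  // How the real repair shortens the text is not specified here.
  return { status: 'repairable', reason: 'description-too-long' };
}
```

Files classified as unrepairable would be left alone and only recorded as a diagnostic.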
43
+
44
+ Generated plugin cache paths can change when Codex refreshes a plugin bundle.
45
+ The repair scheduler therefore only backs off after a real clean or repaired
46
+ scan. A launch with repair disabled does not count as a completed scan, and a
47
+ repairable write failure uses a short retry interval so the next Codex launch
48
+ can recover after transient filesystem or sync issues clear.
49
+
50
+ Set `CTM_CODEX_SKILL_REPAIR=0` to disable this repair path.
51
+
52
+ ## Codex Auth Guard
53
+
54
+ Before spawning Codex, CTM strips inherited `OPENAI_API_KEY` and
55
+ `OPENAI_BASE_URL` from interactive Codex child processes so CTM's provider env
56
+ cannot override the user's normal Codex login/config path.
57
+
58
+ CTM previously prepended this provider override by default:
59
+
60
+ ```text
61
+ -c model_provider="ctm-openai-https"
62
+ -c model_providers.ctm-openai-https.base_url="https://api.openai.com/v1"
63
+ -c model_providers.ctm-openai-https.requires_openai_auth=true
64
+ -c model_providers.ctm-openai-https.supports_websockets=false
65
+ ```
66
+
67
+ That override forces Codex onto the public OpenAI Responses API endpoint and can
68
+ break ChatGPT/Codex subscription auth with `api.responses.write` scope errors.
69
+ It is now opt-in only for operators who intentionally want API-provider
70
+ transport:
71
+
72
+ ```text
73
+ CTM_CODEX_HTTP_TRANSPORT=1
74
+ ```
75
+
76
+ The override is launch-local and does not mutate `~/.codex/config.toml`. Set
77
+ `CTM_KEEP_OPENAI_ENV_FOR_CODEX=1` only when you intentionally want Codex to
78
+ inherit CTM's OpenAI environment.
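
A minimal sketch of the guard, assuming the child environment and extra CLI arguments are built just before the spawn. `buildCodexEnv` and `buildTransportArgs` are hypothetical names; the environment variables and `-c` overrides are the ones listed above.

```javascript
// Build the environment for an interactive Codex child process. CTM's own
// OpenAI variables are stripped unless the operator opts in to inheriting them.
function buildCodexEnv(parentEnv) {
  const env = { ...parentEnv };
  if (parentEnv.CTM_KEEP_OPENAI_ENV_FOR_CODEX !== '1') {
    delete env.OPENAI_API_KEY;
    delete env.OPENAI_BASE_URL;
  }
  return env;
}

// Return the launch-local provider override only when explicitly requested.
// Nothing here touches ~/.codex/config.toml.
function buildTransportArgs(parentEnv) {
  if (parentEnv.CTM_CODEX_HTTP_TRANSPORT !== '1') return [];
  return [
    '-c', 'model_provider="ctm-openai-https"',
    '-c', 'model_providers.ctm-openai-https.base_url="https://api.openai.com/v1"',
    '-c', 'model_providers.ctm-openai-https.requires_openai_auth=true',
    '-c', 'model_providers.ctm-openai-https.supports_websockets=false',
  ];
}
```

Copying the parent environment before deleting keys keeps CTM's own process environment intact for setup and background LLM jobs.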
79
+
80
+ ## MCP Failure Feedback Loop
81
+
82
+ CTM observes raw Codex output for MCP startup/auth failures and stores a compact
83
+ failure record in the CTM settings table. On later Codex launches, CTM can add
84
+ session-local Codex config overrides for optional remote MCPs that recently
85
+ failed auth:
86
+
87
+ ```text
88
+ -c mcp_servers.slack.enabled=false
89
+ -c mcp_servers.ask-data-ai.enabled=false
90
+ ```
91
+
92
+ The override is per launch. It does not mutate `~/.codex/config.toml`, and it
93
+ expires after a short cooldown so a reconnect can take effect without manual DB
94
+ cleanup. Protected/local MCPs, including `wall-e`, are never disabled this way.
95
+
96
+ Set `CTM_CODEX_MCP_AUTO_DISABLE=0` to disable launch-local MCP overrides.
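
The feedback loop can be sketched as a function that turns recent failure records into launch-local arguments. The record shape (`server`, `failedAt`), the function name, and the caller-supplied cooldown are assumptions for illustration; only `wall-e` is named above as protected.

```javascript
// Servers that must never be auto-disabled. Only `wall-e` is named in the
// text; a real list may contain more protected/local MCPs.
const PROTECTED_MCP_SERVERS = new Set(['wall-e']);

// Turn recent auth-failure records into per-launch `-c` overrides. Records
// older than the cooldown are ignored, so a reconnect takes effect without
// manual cleanup.
function buildMcpOverrides(failures, now, cooldownMs, env = {}) {
  if (env.CTM_CODEX_MCP_AUTO_DISABLE === '0') return [];
  const args = [];
  for (const failure of failures) {
    if (PROTECTED_MCP_SERVERS.has(failure.server)) continue;
    if (now - failure.failedAt >= cooldownMs) continue;
    args.push('-c', `mcp_servers.${failure.server}.enabled=false`);
  }
  return args;
}
```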
97
+
98
+ ## UI Contract
99
+
100
+ - **Terminal:** exact provider stream; no warning folding or suppression.
101
+ - **Conversation/Review:** operational warnings collapse into a warning group
102
+ so the real answer remains easy to read.
103
+ - **Diagnostics:** launch repairs and MCP failures are recorded per CTM session
104
+ for debugging.
105
+
106
+ This preserves debuggability while fixing the recurring startup causes that make
107
+ the terminal unusable.
@@ -26,7 +26,7 @@ The observed failure mode:
26
26
 
27
27
  ## Contract
28
28
 
29
- Before spawning Codex for a resume or restart restore, CTM must prove this:
29
+ Before spawning Codex for a resume or restart restore, CTM should inspect this:
30
30
 
31
31
  ```text
32
32
  resume id == expected rollout filename id
@@ -34,17 +34,20 @@ Codex state row for resume id points to expected rollout path
34
34
  expected rollout file exists and its session metadata matches the resume id
35
35
  ```
36
36
 
37
- If the contract cannot be proven, CTM must not silently spawn Codex.
37
+ If the contract cannot be proven, CTM records diagnostics and may warn the UI,
38
+ but it must still spawn Codex. `state_5.sqlite` is Codex-owned provider state;
39
+ Codex owns repair prompts, migrations, and final resume behavior.
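
A minimal sketch of the inspect-only contract, assuming the three facts have already been gathered by read-only lookups. The function name, input fields, and problem labels are hypothetical. The key property is that the result never blocks the spawn.

```javascript
// Inspect the resume contract and report problems without mutating provider
// state. `spawn` is always true: Codex owns repair and final resume behavior.
function inspectResumeContract(facts) {
  const problems = [];
  if (facts.resumeId !== facts.rolloutFilenameId) {
    problems.push('resume-id-filename-mismatch');
  }
  if (facts.stateRowPath !== facts.expectedRolloutPath) {
    problems.push('state-row-path-mismatch');
  }
  if (!facts.rolloutExists || facts.rolloutSessionId !== facts.resumeId) {
    problems.push('rollout-missing-or-metadata-mismatch');
  }
  return { spawn: true, warn: problems.length > 0, problems };
}
```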
38
40
 
39
41
  ## Behavior
40
42
 
41
43
  ### Healthy State DB
42
44
 
43
45
  If `state_5.sqlite` is readable, passes an integrity check, and has a
44
- thread row whose rollout path differs from CTM's mapped JSONL, CTM may repair
45
- that single `threads.rollout_path` cell before spawning.
46
+ thread row whose rollout path differs from CTM's mapped JSONL, the normal CTM
47
+ resume path treats this as an inspect-only diagnostic. CTM must not mutate
48
+ `state_5.sqlite` before spawning Codex.
46
49
 
47
- The repair is intentionally narrow:
50
+ Manual repair tooling, if used, must be intentionally narrow:
48
51
 
49
52
  - backup `state_5.sqlite` first;
50
53
  - update only `threads.rollout_path` for the exact resume id;
@@ -55,15 +58,17 @@ The repair is intentionally narrow:
55
58
  ### Corrupt Or Unusable State DB
56
59
 
57
60
  If the DB is corrupt, unreadable, lacks the expected row, or cannot be repaired,
58
- CTM blocks the Codex spawn and records diagnostics. Blocking is safer than
59
- starting a provider process that will append to an unrelated transcript.
61
+ CTM records diagnostics and still launches `codex resume`. Blocking is not
62
+ allowed because it prevents Codex from showing its own repair prompt and turns
63
+ CTM into the owner of provider state.
60
64
 
61
- The UI/API should receive a clear error telling the user that Codex resume was
62
- blocked because provider state does not match CTM's session mapping.
65
+ The UI/API may receive a non-blocking warning telling the user that provider
66
+ state could not be verified and terminal output may temporarily diverge from
67
+ the Conversation tab.
63
68
 
64
69
  ### Diagnostics
65
70
 
66
- Every mismatch, repair, or blocked spawn must be recorded in:
71
+ Every mismatch, repair, or unverifiable state result must be recorded in:
67
72
 
68
73
  - the in-memory session diagnostics ring for immediate UI/debug access;
69
74
  - the durable CTM DB diagnostics table for post-restart investigation;
@@ -81,5 +86,5 @@ should not blindly move messages between JSONLs.
81
86
  - Do not make `session_index` authoritative again. It is legacy residue.
82
87
  - Do not assume Codex has a path-based resume flag; current Codex CLI help only
83
88
  exposes `codex resume [SESSION_ID]`.
84
- - Do not auto-repair a corrupt SQLite file in-place. First stop the bleeding by
85
- blocking wrong writes, then repair state with a dedicated tool.
89
+ - Do not auto-repair a corrupt SQLite file in-place.
90
+ - Do not block `codex resume`; let Codex own its provider database and repair UI.