create-walle 0.9.21 → 0.9.23
- package/README.md +27 -5
- package/package.json +2 -2
- package/template/CLAUDE.md +2 -2
- package/template/LICENSE +1 -1
- package/template/bin/ctm-dev-cleanup.js +24 -3
- package/template/bin/ctm-launch.sh +13 -0
- package/template/bin/dev.sh +156 -18
- package/template/bin/node-bin.sh +84 -0
- package/template/bin/pin-node.sh +51 -0
- package/template/claude-task-manager/api-prompts.js +1203 -182
- package/template/claude-task-manager/api-reviews.js +109 -15
- package/template/claude-task-manager/approval-agent.js +1360 -280
- package/template/claude-task-manager/bin/restart-ctm.sh +64 -23
- package/template/claude-task-manager/bin/storage-migration-supervisor.js +338 -0
- package/template/claude-task-manager/db.js +4417 -295
- package/template/claude-task-manager/docs/app-update-refresh-protocol.md +69 -0
- package/template/claude-task-manager/docs/approval-ai-refinement.md +138 -0
- package/template/claude-task-manager/docs/approval-rescue-loop.md +74 -0
- package/template/claude-task-manager/docs/codex-operational-warning-health.md +107 -0
- package/template/claude-task-manager/docs/codex-resume-state-guard-design.md +17 -12
- package/template/claude-task-manager/docs/codex-terminal-render-controller-handoff.md +311 -0
- package/template/claude-task-manager/docs/coding-agent-hooks-architecture.md +418 -0
- package/template/claude-task-manager/docs/conversation-import-freshness.md +20 -0
- package/template/claude-task-manager/docs/google-workspace-auth-health.md +77 -0
- package/template/claude-task-manager/docs/image-paste-ux.md +13 -0
- package/template/claude-task-manager/docs/ipad-web-preview.md +88 -0
- package/template/claude-task-manager/docs/main-loop-offload-architecture.md +66 -0
- package/template/claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md +274 -519
- package/template/claude-task-manager/docs/mobile-live-streaming.md +27 -5
- package/template/claude-task-manager/docs/mobile-remote-submission-lifecycle.md +69 -0
- package/template/claude-task-manager/docs/phone-access-design.md +53 -15
- package/template/claude-task-manager/docs/phone-passkey-identity.md +122 -0
- package/template/claude-task-manager/docs/phone-setup.md +3 -0
- package/template/claude-task-manager/docs/prompt-editing-tree-design.md +25 -1
- package/template/claude-task-manager/docs/remote-desktop-access-design.md +268 -0
- package/template/claude-task-manager/docs/restart-lifecycle-architecture.md +95 -0
- package/template/claude-task-manager/docs/runtime-work-control-plane.md +53 -0
- package/template/claude-task-manager/docs/session-interactive-wait-surfaces.md +38 -0
- package/template/claude-task-manager/docs/session-needs-you-dismissal.md +84 -0
- package/template/claude-task-manager/docs/session-render-state-management-design.md +91 -3
- package/template/claude-task-manager/docs/session-standup-command-center-design.md +25 -1
- package/template/claude-task-manager/docs/session-title-authority.md +32 -0
- package/template/claude-task-manager/docs/session-workspace-binding.md +33 -0
- package/template/claude-task-manager/docs/skill-intent-resolution-design.md +72 -0
- package/template/claude-task-manager/docs/walle-mcp-supervisor-health.md +86 -0
- package/template/claude-task-manager/docs/walle-relay-phone-access-design.md +24 -15
- package/template/claude-task-manager/docs/walle-session-history-hydration.md +114 -0
- package/template/claude-task-manager/docs/walle-session-input-queue.md +104 -0
- package/template/claude-task-manager/docs/walle-session-model-catalog.md +90 -0
- package/template/claude-task-manager/docs/walle-session-model-preferences.md +15 -6
- package/template/claude-task-manager/git-utils.js +897 -27
- package/template/claude-task-manager/lib/agent-capabilities.js +33 -0
- package/template/claude-task-manager/lib/agent-cli-cache.js +37 -7
- package/template/claude-task-manager/lib/agent-hooks-installer.js +26 -2
- package/template/claude-task-manager/lib/agent-presets.js +17 -1
- package/template/claude-task-manager/lib/all-sessions-query.js +108 -0
- package/template/claude-task-manager/lib/approval-ai-refinement.js +488 -0
- package/template/claude-task-manager/lib/approval-self-adapt.js +168 -0
- package/template/claude-task-manager/lib/async-semaphore.js +44 -0
- package/template/claude-task-manager/lib/auth-context.js +5 -0
- package/template/claude-task-manager/lib/auth-rate-limit.js +47 -4
- package/template/claude-task-manager/lib/auth-rules.js +29 -2
- package/template/claude-task-manager/lib/auto-approval-verifier.js +129 -16
- package/template/claude-task-manager/lib/background-llm.js +144 -17
- package/template/claude-task-manager/lib/branch-inventory.js +212 -0
- package/template/claude-task-manager/lib/claude-desktop-sessions.js +15 -3
- package/template/claude-task-manager/lib/coalesce-sync-frames.js +151 -0
- package/template/claude-task-manager/lib/codex-launch-health.js +762 -0
- package/template/claude-task-manager/lib/codex-transcript-pager.js +51 -0
- package/template/claude-task-manager/lib/codex-zst.js +124 -0
- package/template/claude-task-manager/lib/coding-agent-models.js +233 -30
- package/template/claude-task-manager/lib/connection-health.js +232 -0
- package/template/claude-task-manager/lib/conversation-blob-parser.js +42 -0
- package/template/claude-task-manager/lib/conversation-tail-merge.js +89 -26
- package/template/claude-task-manager/lib/ctm-session-context-api.js +39 -10
- package/template/claude-task-manager/lib/cursor-conversation-store.js +354 -0
- package/template/claude-task-manager/lib/db-owner-worker-client.js +315 -0
- package/template/claude-task-manager/lib/document-review.js +141 -6
- package/template/claude-task-manager/lib/escalation-review.js +152 -0
- package/template/claude-task-manager/lib/graceful-shutdown.js +159 -0
- package/template/claude-task-manager/lib/headless-term-service.js +678 -0
- package/template/claude-task-manager/lib/heavy-worker-fallback.js +38 -0
- package/template/claude-task-manager/lib/jsonl-conversation-parser.js +542 -0
- package/template/claude-task-manager/lib/jsonl-range-reader.js +112 -0
- package/template/claude-task-manager/lib/main-db-census.js +216 -0
- package/template/claude-task-manager/lib/message-pagination.js +106 -4
- package/template/claude-task-manager/lib/microsoft-dev-tunnel-setup.js +750 -26
- package/template/claude-task-manager/lib/mobile-auth-api.js +274 -7
- package/template/claude-task-manager/lib/mobile-auth-store.js +592 -10
- package/template/claude-task-manager/lib/mobile-notification-dispatcher.js +15 -0
- package/template/claude-task-manager/lib/model-overview-brain-fallback.js +311 -0
- package/template/claude-task-manager/lib/model-overview-cache.js +141 -0
- package/template/claude-task-manager/lib/models-health-routing-notice.js +126 -0
- package/template/claude-task-manager/lib/node-pin-guard.js +93 -0
- package/template/claude-task-manager/lib/perf-tracker.js +242 -6
- package/template/claude-task-manager/lib/permission-match.js +76 -0
- package/template/claude-task-manager/lib/permission-sync.js +133 -20
- package/template/claude-task-manager/lib/process-title.js +35 -0
- package/template/claude-task-manager/lib/prompt-executions-query.js +25 -0
- package/template/claude-task-manager/lib/prompt-index-disk-cache.js +44 -0
- package/template/claude-task-manager/lib/prompt-intent.js +132 -0
- package/template/claude-task-manager/lib/provider-user-context.js +34 -0
- package/template/claude-task-manager/lib/read-pool-client.js +313 -0
- package/template/claude-task-manager/lib/readpool-breaker.js +31 -0
- package/template/claude-task-manager/lib/recent-sessions-breaker.js +12 -0
- package/template/claude-task-manager/lib/remote-feedback-client.js +72 -0
- package/template/claude-task-manager/lib/remote-relay-protocol.js +37 -4
- package/template/claude-task-manager/lib/remote-relay-store.js +159 -0
- package/template/claude-task-manager/lib/remote-submission-observer.js +278 -0
- package/template/claude-task-manager/lib/restart-guard.js +109 -0
- package/template/claude-task-manager/lib/restore-interruption-detector.js +439 -0
- package/template/claude-task-manager/lib/restore-policy.js +13 -0
- package/template/claude-task-manager/lib/restore-resume-batch.js +74 -0
- package/template/claude-task-manager/lib/restore-runtime.js +68 -0
- package/template/claude-task-manager/lib/restore-storm.js +34 -0
- package/template/claude-task-manager/lib/resume-cwd.js +36 -0
- package/template/claude-task-manager/lib/resume-preflight.js +313 -0
- package/template/claude-task-manager/lib/runtime-work-registry.js +444 -0
- package/template/claude-task-manager/lib/sanitize-openai-auth.js +31 -0
- package/template/claude-task-manager/lib/scheduler.js +21 -1
- package/template/claude-task-manager/lib/scrollback-snapshot-store.js +159 -0
- package/template/claude-task-manager/lib/serial-task-queue.js +64 -0
- package/template/claude-task-manager/lib/server-listeners.js +239 -0
- package/template/claude-task-manager/lib/session-capture.js +42 -7
- package/template/claude-task-manager/lib/session-content-backfill.js +131 -0
- package/template/claude-task-manager/lib/session-history.js +388 -43
- package/template/claude-task-manager/lib/session-host-manager.js +287 -0
- package/template/claude-task-manager/lib/session-image-refs.js +209 -0
- package/template/claude-task-manager/lib/session-jobs.js +399 -59
- package/template/claude-task-manager/lib/session-prompt-index.js +137 -0
- package/template/claude-task-manager/lib/session-restore.js +53 -0
- package/template/claude-task-manager/lib/session-standup.js +123 -23
- package/template/claude-task-manager/lib/session-state-bus.js +14 -0
- package/template/claude-task-manager/lib/session-stream.js +64 -16
- package/template/claude-task-manager/lib/session-timeline-summary.js +260 -0
- package/template/claude-task-manager/lib/session-token-usage.js +494 -0
- package/template/claude-task-manager/lib/session-workspace-binding.js +356 -0
- package/template/claude-task-manager/lib/setup-network-config.js +9 -0
- package/template/claude-task-manager/lib/size-cap.js +45 -0
- package/template/claude-task-manager/lib/size-cap.test.js +62 -0
- package/template/claude-task-manager/lib/skill-autocomplete.js +180 -1
- package/template/claude-task-manager/lib/skill-intent-resolver.js +304 -0
- package/template/claude-task-manager/lib/sqlite-driver.js +19 -3
- package/template/claude-task-manager/lib/standup-attention.js +7 -3
- package/template/claude-task-manager/lib/status-authority.js +39 -0
- package/template/claude-task-manager/lib/status-hooks.js +4 -0
- package/template/claude-task-manager/lib/storage-migration.js +235 -0
- package/template/claude-task-manager/lib/structured-capture.js +298 -0
- package/template/claude-task-manager/lib/sync-io-census.js +163 -0
- package/template/claude-task-manager/lib/tailscale-setup.js +6 -0
- package/template/claude-task-manager/lib/terminal-activity-evidence.js +33 -0
- package/template/claude-task-manager/lib/terminal-choice.js +364 -0
- package/template/claude-task-manager/lib/terminal-control-sanitize.js +17 -0
- package/template/claude-task-manager/lib/terminal-fingerprint.js +48 -0
- package/template/claude-task-manager/lib/terminal-output-flush.js +84 -0
- package/template/claude-task-manager/lib/timeline-order.js +122 -0
- package/template/claude-task-manager/lib/transcript-store.js +348 -43
- package/template/claude-task-manager/lib/transport-security.js +84 -1
- package/template/claude-task-manager/lib/wait-state.js +184 -0
- package/template/claude-task-manager/lib/walle-client.js +47 -5
- package/template/claude-task-manager/lib/walle-ctm-history.js +564 -4
- package/template/claude-task-manager/lib/walle-external-actions.js +135 -16
- package/template/claude-task-manager/lib/walle-history-hydration.js +46 -0
- package/template/claude-task-manager/lib/walle-native-health.js +403 -0
- package/template/claude-task-manager/lib/walle-repair.js +701 -0
- package/template/claude-task-manager/lib/walle-session-cache.js +109 -0
- package/template/claude-task-manager/lib/walle-session-context.js +57 -21
- package/template/claude-task-manager/lib/walle-session-model-catalog.js +34 -0
- package/template/claude-task-manager/lib/walle-supervisor.js +539 -63
- package/template/claude-task-manager/lib/walle-transcript.js +52 -0
- package/template/claude-task-manager/lib/worktree-active-sync.js +11 -7
- package/template/claude-task-manager/lib/worktree-cwd.js +32 -1
- package/template/claude-task-manager/package.json +1 -1
- package/template/claude-task-manager/prompt-harvest.js +89 -66
- package/template/claude-task-manager/providers/claude-code.js +51 -3
- package/template/claude-task-manager/providers/cursor.js +140 -45
- package/template/claude-task-manager/public/css/reviews.css +551 -61
- package/template/claude-task-manager/public/css/setup.css +191 -0
- package/template/claude-task-manager/public/css/walle-session.css +865 -10
- package/template/claude-task-manager/public/css/walle.css +154 -0
- package/template/claude-task-manager/public/designs/ai-providers-consolidation-v2.html +830 -0
- package/template/claude-task-manager/public/index.html +18516 -2058
- package/template/claude-task-manager/public/ipad.html +363 -0
- package/template/claude-task-manager/public/js/document-review-links.js +301 -0
- package/template/claude-task-manager/public/js/image-normalize.js +69 -36
- package/template/claude-task-manager/public/js/message-renderer.js +1265 -77
- package/template/claude-task-manager/public/js/prompts.js +66 -29
- package/template/claude-task-manager/public/js/reviews.js +901 -133
- package/template/claude-task-manager/public/js/session-activity-utils.js +11 -1
- package/template/claude-task-manager/public/js/session-search-utils.js +94 -10
- package/template/claude-task-manager/public/js/session-status-precedence.js +23 -5
- package/template/claude-task-manager/public/js/setup.js +1273 -176
- package/template/claude-task-manager/public/js/stream-view.js +691 -73
- package/template/claude-task-manager/public/js/terminal-reconciler.js +210 -0
- package/template/claude-task-manager/public/js/walle-session.js +2455 -158
- package/template/claude-task-manager/public/js/walle.js +455 -28
- package/template/claude-task-manager/public/m/app.css +2909 -262
- package/template/claude-task-manager/public/m/app.js +6601 -398
- package/template/claude-task-manager/public/m/claim.html +224 -17
- package/template/claude-task-manager/public/m/index.html +117 -21
- package/template/claude-task-manager/public/m/sw.js +3 -1
- package/template/claude-task-manager/public/manifest.json +2 -2
- package/template/claude-task-manager/public/prompts.html +30 -14
- package/template/claude-task-manager/queue-engine.js +507 -28
- package/template/claude-task-manager/scripts/repair-claude-session-images.js +27 -8
- package/template/claude-task-manager/server.js +14341 -2197
- package/template/claude-task-manager/session-integrity.js +160 -18
- package/template/claude-task-manager/session-search-ranking.js +1 -0
- package/template/claude-task-manager/session-utils.js +25 -5
- package/template/claude-task-manager/workers/approval-blocklist.js +96 -6
- package/template/claude-task-manager/workers/approval-widget-validator.js +14 -8
- package/template/claude-task-manager/workers/conversation-import-worker.js +11 -50
- package/template/claude-task-manager/workers/db-owner-worker.js +386 -0
- package/template/claude-task-manager/workers/harvest-worker.js +9 -55
- package/template/claude-task-manager/workers/headless-term-worker.js +9 -530
- package/template/claude-task-manager/workers/read-pool-worker.js +387 -0
- package/template/claude-task-manager/workers/scrollback-worker.js +11 -72
- package/template/claude-task-manager/workers/session-host-process.js +146 -0
- package/template/claude-task-manager/workers/session-integrity-worker.js +10 -54
- package/template/claude-task-manager/workers/state-detectors/base.js +18 -1
- package/template/claude-task-manager/workers/state-detectors/claude-code.js +182 -9
- package/template/claude-task-manager/workers/state-detectors/codex.js +150 -2
- package/template/claude-task-manager/workers/state-detectors/cursor.js +127 -0
- package/template/claude-task-manager/workers/state-detectors/gemini.js +21 -0
- package/template/claude-task-manager/workers/state-detectors/index.js +29 -0
- package/template/claude-task-manager/workers/state-detectors/opencode.js +103 -0
- package/template/docs/design/markdown-review-pane.md +206 -0
- package/template/docs/designs/2026-05-17-portkey-gateway-provider-ux.md +129 -38
- package/template/docs/designs/2026-05-20-mobile-worktree-finish-command.md +27 -0
- package/template/docs/designs/2026-05-22-ai-configuration-consolidation.md +248 -0
- package/template/docs/designs/ai-configuration-consolidation-mock.html +812 -0
- package/template/docs/private-memory-and-pii-policy.md +69 -0
- package/template/package.json +2 -1
- package/template/scripts/check-private-data.js +201 -0
- package/template/shared/sqlite-owner-guard.js +30 -0
- package/template/shared/sqlite-owner-write-queue.js +225 -0
- package/template/shared/sqlite-storage-policy.js +111 -0
- package/template/shared/sqlite-write-lock.js +428 -0
- package/template/wall-e/agent-runners/claude-code.js +5 -0
- package/template/wall-e/agent.js +166 -22
- package/template/wall-e/api-walle.js +524 -70
- package/template/wall-e/auth/provider-flows.js +11 -1
- package/template/wall-e/bin/walle-mcp-stdio.js +341 -17
- package/template/wall-e/brain.js +1614 -141
- package/template/wall-e/chat/attachment-blocks.js +96 -0
- package/template/wall-e/chat/attachments.js +2 -1
- package/template/wall-e/chat/capability-resolver.js +7 -7
- package/template/wall-e/chat/context-messages.js +28 -0
- package/template/wall-e/chat/conversation-frame.js +630 -0
- package/template/wall-e/chat/provider-messages.js +125 -0
- package/template/wall-e/chat.js +1002 -233
- package/template/wall-e/coding/acceptance-contract.js +170 -0
- package/template/wall-e/coding/acp-adapter.js +1 -1
- package/template/wall-e/coding/agent-catalog.js +3 -0
- package/template/wall-e/coding/artifact-store.js +93 -0
- package/template/wall-e/coding/capability-router.js +120 -0
- package/template/wall-e/coding/coding-run-controller.js +423 -0
- package/template/wall-e/coding/compaction-service.js +157 -12
- package/template/wall-e/coding/frontend-verification.js +258 -0
- package/template/wall-e/coding/lifecycle-hooks.js +75 -0
- package/template/wall-e/coding/local-preview-contract.js +157 -0
- package/template/wall-e/coding/permission-service.js +57 -13
- package/template/wall-e/coding/prompt-bundle.js +19 -1
- package/template/wall-e/coding/prompt-section-registry.js +227 -0
- package/template/wall-e/coding/provider-compat.js +15 -0
- package/template/wall-e/coding/runtime-events.js +224 -0
- package/template/wall-e/coding/runtime-mode.js +3 -0
- package/template/wall-e/coding/side-git-snapshot.js +160 -4
- package/template/wall-e/coding/snapshot-service.js +143 -1
- package/template/wall-e/coding/stream-processor.js +388 -34
- package/template/wall-e/coding/task-tool.js +141 -4
- package/template/wall-e/coding/tool-execution-controller.js +365 -0
- package/template/wall-e/coding/tool-registry.js +43 -5
- package/template/wall-e/coding/user-hooks.js +217 -0
- package/template/wall-e/coding-orchestrator.js +1330 -221
- package/template/wall-e/coding-prompts.js +20 -4
- package/template/wall-e/context/context-builder.js +15 -2
- package/template/wall-e/decision/confidence.js +1 -1
- package/template/wall-e/docs/coding-acceptance-contract.md +41 -0
- package/template/wall-e/docs/external-action-controller.md +26 -6
- package/template/wall-e/docs/telemetry-lifecycle.md +8 -2
- package/template/wall-e/embeddings.js +591 -53
- package/template/wall-e/external-action-controller.js +12 -0
- package/template/wall-e/http/auth.js +1 -0
- package/template/wall-e/http/chat-api.js +46 -11
- package/template/wall-e/http/model-admin.js +836 -34
- package/template/wall-e/lib/boot-profile.js +88 -0
- package/template/wall-e/lib/event-loop-monitor.js +93 -0
- package/template/wall-e/lib/service-health.js +194 -0
- package/template/wall-e/llm/anthropic.js +130 -5
- package/template/wall-e/llm/client.js +266 -63
- package/template/wall-e/llm/default-fallback.js +382 -0
- package/template/wall-e/llm/health.js +19 -0
- package/template/wall-e/llm/message-guard.js +78 -0
- package/template/wall-e/llm/model-catalog.js +252 -1
- package/template/wall-e/llm/openai.js +26 -4
- package/template/wall-e/llm/portkey-sync.js +654 -0
- package/template/wall-e/llm/provider-error.js +30 -2
- package/template/wall-e/llm/registry.js +5 -1
- package/template/wall-e/llm/request-compat.js +67 -0
- package/template/wall-e/loops/backfill.js +79 -23
- package/template/wall-e/loops/brain-optimize.js +67 -0
- package/template/wall-e/loops/ingest.js +25 -10
- package/template/wall-e/loops/question-digest.js +160 -0
- package/template/wall-e/loops/reflect.js +6 -4
- package/template/wall-e/loops/think.js +39 -12
- package/template/wall-e/mcp-server.js +318 -36
- package/template/wall-e/memory/ctm-context-client.js +52 -14
- package/template/wall-e/memory/ctm-operational-context.js +237 -0
- package/template/wall-e/memory/ctm-prompt-executions-client.js +128 -0
- package/template/wall-e/memory/ctm-session-context.js +111 -63
- package/template/wall-e/prompts/coding/deepseek.txt +3 -0
- package/template/wall-e/prompts/coding/gemini.txt +6 -0
- package/template/wall-e/prompts/coding/gpt.txt +6 -0
- package/template/wall-e/prompts/coding/local.txt +7 -0
- package/template/wall-e/runtime/decision-hooks.js +115 -0
- package/template/wall-e/runtime/devbox-gateway.js +82 -8
- package/template/wall-e/runtime/prompt-manifest.js +86 -0
- package/template/wall-e/runtime/tool-executor.js +269 -0
- package/template/wall-e/runtime/tool-result-envelope.js +138 -0
- package/template/wall-e/runtime/transcript-projection.js +60 -0
- package/template/wall-e/runtime/walle-runtime.js +224 -0
- package/template/wall-e/scripts/db-optimize/migrate.js +162 -0
- package/template/wall-e/scripts/db-optimize/recall-eval.js +117 -0
- package/template/wall-e/server.js +15 -0
- package/template/wall-e/session-files.js +9 -0
- package/template/wall-e/skills/_bundled/google-calendar/run.js +1 -1
- package/template/wall-e/skills/_bundled/gws-workspace/run.js +1 -1
- package/template/wall-e/skills/_bundled/slack-mentions/run.js +76 -6
- package/template/wall-e/skills/claude-code-reader.js +7 -3
- package/template/wall-e/skills/script-skill-runner.js +10 -0
- package/template/wall-e/skills/skill-planner.js +38 -0
- package/template/wall-e/tools/builtin-middleware.js +19 -9
- package/template/wall-e/tools/local-tools.js +1428 -16
- package/template/wall-e/tools/permission-checker.js +73 -5
- package/template/wall-e/tools/question-manager.js +117 -7
- package/template/wall-e/training/harvester.js +12 -28
- package/template/wall-e/training/replay.js +25 -80
- package/template/website/index.html +10 -10
- package/template/wall-e/eval/ab-test.js +0 -203
- package/template/wall-e/eval/agent-runner.js +0 -772
- package/template/wall-e/eval/agent-scorer.js +0 -461
- package/template/wall-e/eval/aggregator.js +0 -414
- package/template/wall-e/eval/allowed-test-commands.js +0 -34
- package/template/wall-e/eval/benchmark-generator.js +0 -113
- package/template/wall-e/eval/benchmarks/chat-eval.json +0 -1662
- package/template/wall-e/eval/benchmarks/chat.json +0 -82
- package/template/wall-e/eval/benchmarks/coding-agent-real.json +0 -1
- package/template/wall-e/eval/benchmarks/coding-agent.json +0 -1581
- package/template/wall-e/eval/benchmarks/coding.json +0 -122
- package/template/wall-e/eval/benchmarks/memory-retrieval.json +0 -234
- package/template/wall-e/eval/benchmarks/reasoning.json +0 -82
- package/template/wall-e/eval/benchmarks/swebench-lite-30.json +0 -212
- package/template/wall-e/eval/benchmarks.js +0 -669
- package/template/wall-e/eval/cc-replay.js +0 -719
- package/template/wall-e/eval/chat-eval.js +0 -525
- package/template/wall-e/eval/check-keys.js +0 -15
- package/template/wall-e/eval/check-providers.js +0 -42
- package/template/wall-e/eval/codex-cli-baseline.js +0 -669
- package/template/wall-e/eval/coding-agent-real.js +0 -570
- package/template/wall-e/eval/context-compactor.js +0 -251
- package/template/wall-e/eval/debug-agent003.js +0 -68
- package/template/wall-e/eval/diagnostics.js +0 -216
- package/template/wall-e/eval/eval-orchestrator.js +0 -642
- package/template/wall-e/eval/evaluate.js +0 -202
- package/template/wall-e/eval/evaluator.js +0 -373
- package/template/wall-e/eval/exporter.js +0 -212
- package/template/wall-e/eval/fixtures/express-basic/package.json +0 -9
- package/template/wall-e/eval/fixtures/express-basic/server.js +0 -115
- package/template/wall-e/eval/fixtures/express-basic/test.js +0 -83
- package/template/wall-e/eval/fixtures/express-buggy/package.json +0 -9
- package/template/wall-e/eval/fixtures/express-buggy/server.js +0 -113
- package/template/wall-e/eval/fixtures/express-buggy/test.js +0 -83
- package/template/wall-e/eval/fixtures/express-buggy-items/package.json +0 -9
- package/template/wall-e/eval/fixtures/express-buggy-items/server.js +0 -112
- package/template/wall-e/eval/fixtures/express-buggy-items/test.js +0 -83
- package/template/wall-e/eval/fixtures/express-buggy-search/package.json +0 -9
- package/template/wall-e/eval/fixtures/express-buggy-search/server.js +0 -121
- package/template/wall-e/eval/fixtures/express-buggy-search/test.js +0 -83
- package/template/wall-e/eval/fixtures/express-rename-data/data.js +0 -34
- package/template/wall-e/eval/fixtures/express-rename-data/package.json +0 -9
- package/template/wall-e/eval/fixtures/express-rename-data/server.js +0 -97
- package/template/wall-e/eval/fixtures/express-rename-data/test.js +0 -88
- package/template/wall-e/eval/fixtures/express-xss/package.json +0 -12
- package/template/wall-e/eval/fixtures/express-xss/server.js +0 -90
- package/template/wall-e/eval/fixtures/express-xss/test.js +0 -67
- package/template/wall-e/eval/fixtures/express-xss/views/profile.ejs +0 -9
- package/template/wall-e/eval/fixtures/fullstack-app/config/default.js +0 -9
- package/template/wall-e/eval/fixtures/fullstack-app/config/test.js +0 -13
- package/template/wall-e/eval/fixtures/fullstack-app/package.json +0 -11
- package/template/wall-e/eval/fixtures/fullstack-app/public/css/style.css +0 -137
- package/template/wall-e/eval/fixtures/fullstack-app/public/index.html +0 -46
- package/template/wall-e/eval/fixtures/fullstack-app/public/js/app.js +0 -121
- package/template/wall-e/eval/fixtures/fullstack-app/public/js/auth.js +0 -71
- package/template/wall-e/eval/fixtures/fullstack-app/public/js/items.js +0 -80
- package/template/wall-e/eval/fixtures/fullstack-app/public/js/users.js +0 -46
- package/template/wall-e/eval/fixtures/fullstack-app/public/login.html +0 -45
- package/template/wall-e/eval/fixtures/fullstack-app/public/register.html +0 -38
- package/template/wall-e/eval/fixtures/fullstack-app/scripts/migrate.js +0 -23
- package/template/wall-e/eval/fixtures/fullstack-app/scripts/seed.js +0 -46
- package/template/wall-e/eval/fixtures/fullstack-app/server/db.js +0 -99
- package/template/wall-e/eval/fixtures/fullstack-app/server/index.js +0 -94
- package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/auth.js +0 -19
- package/template/wall-e/eval/fixtures/fullstack-app/server/middleware/logger.js +0 -19
- package/template/wall-e/eval/fixtures/fullstack-app/server/router.js +0 -50
- package/template/wall-e/eval/fixtures/fullstack-app/server/routes/auth.js +0 -69
- package/template/wall-e/eval/fixtures/fullstack-app/server/routes/health.js +0 -23
- package/template/wall-e/eval/fixtures/fullstack-app/server/routes/items.js +0 -88
- package/template/wall-e/eval/fixtures/fullstack-app/server/routes/users.js +0 -75
- package/template/wall-e/eval/fixtures/fullstack-app/server/test.js +0 -198
- package/template/wall-e/eval/fixtures/fullstack-app/server/utils/response.js +0 -34
- package/template/wall-e/eval/fixtures/fullstack-app/server/utils/validate.js +0 -26
- package/template/wall-e/eval/fixtures/fullstack-app/server.js +0 -8
- package/template/wall-e/eval/fixtures/fullstack-app/test.js +0 -12
- package/template/wall-e/eval/fixtures/monorepo-basic/package.json +0 -8
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/data.js +0 -58
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/middleware.js +0 -46
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/package.json +0 -8
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/routes.js +0 -64
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/server.js +0 -56
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/api/test.js +0 -116
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/commands.js +0 -61
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/index.js +0 -62
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/output.js +0 -43
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/package.json +0 -11
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/cli/test.js +0 -44
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/formatters.js +0 -43
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/index.js +0 -12
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/package.json +0 -5
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/test.js +0 -55
- package/template/wall-e/eval/fixtures/monorepo-basic/packages/shared/validators.js +0 -29
- package/template/wall-e/eval/fixtures/monorepo-basic/test.js +0 -46
- package/template/wall-e/eval/fixtures/node-cli/index.js +0 -78
- package/template/wall-e/eval/fixtures/node-cli/package.json +0 -10
- package/template/wall-e/eval/fixtures/node-cli/test.js +0 -57
- package/template/wall-e/eval/fixtures/node-typed/package.json +0 -8
- package/template/wall-e/eval/fixtures/node-typed/src/handlers.js +0 -31
- package/template/wall-e/eval/fixtures/node-typed/src/utils.js +0 -33
- package/template/wall-e/eval/fixtures/node-typed/test.js +0 -36
- package/template/wall-e/eval/fixtures/python-flask/app.py +0 -14
- package/template/wall-e/eval/fixtures/python-flask/requirements.txt +0 -2
- package/template/wall-e/eval/fixtures/python-flask/test_app.py +0 -25
- package/template/wall-e/eval/fixtures/wall-e-subset/brain.js +0 -105
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/aggregator.js +0 -101
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/chat.json +0 -20
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks/coding.json +0 -32
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/benchmarks.js +0 -64
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/package.json +0 -6
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/server.js +0 -31
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/test.js +0 -18
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/fixtures/simple-project/utils.js +0 -34
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/runner.js +0 -104
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/scorer.js +0 -73
- package/template/wall-e/eval/fixtures/wall-e-subset/eval/test.js +0 -134
- package/template/wall-e/eval/fixtures/wall-e-subset/llm/client.js +0 -99
- package/template/wall-e/eval/fixtures/wall-e-subset/llm/providers.js +0 -63
- package/template/wall-e/eval/fixtures/wall-e-subset/llm/test.js +0 -70
- package/template/wall-e/eval/fixtures/wall-e-subset/package.json +0 -10
- package/template/wall-e/eval/fixtures/wall-e-subset/test.js +0 -86
- package/template/wall-e/eval/harvester.js +0 -685
- package/template/wall-e/eval/head-to-head.js +0 -388
- package/template/wall-e/eval/humaneval-adapter.js +0 -321
- package/template/wall-e/eval/list-models.js +0 -31
- package/template/wall-e/eval/livecodebench-adapter.js +0 -291
- package/template/wall-e/eval/mail-integration.js +0 -443
- package/template/wall-e/eval/manifest.js +0 -186
- package/template/wall-e/eval/meta-harness/adapters/coding-agent.js +0 -57
- package/template/wall-e/eval/meta-harness/bootstrap-snapshot.js +0 -149
- package/template/wall-e/eval/meta-harness/candidate-store.js +0 -117
- package/template/wall-e/eval/meta-harness/cli.js +0 -86
- package/template/wall-e/eval/meta-harness/domain-spec.js +0 -154
- package/template/wall-e/eval/meta-harness/domains/coding-agent.domain.json +0 -84
- package/template/wall-e/eval/meta-harness/examples/env-bootstrap-candidate.js +0 -29
- package/template/wall-e/eval/meta-harness/experience-store.js +0 -174
- package/template/wall-e/eval/meta-harness/frontier.js +0 -96
- package/template/wall-e/eval/meta-harness/harness-interface.js +0 -90
- package/template/wall-e/eval/meta-harness/leakage-guard.js +0 -80
- package/template/wall-e/eval/meta-harness/optimizer.js +0 -207
- package/template/wall-e/eval/meta-harness/proposer-runner.js +0 -110
- package/template/wall-e/eval/meta-harness/reporting.js +0 -58
- package/template/wall-e/eval/meta-harness/telemetry.js +0 -27
- package/template/wall-e/eval/meta-harness/validation.js +0 -81
- package/template/wall-e/eval/promoter.js +0 -228
- package/template/wall-e/eval/provider-normalizer.js +0 -33
- package/template/wall-e/eval/replay.js +0 -395
- package/template/wall-e/eval/run-agent-benchmarks.js +0 -386
- package/template/wall-e/eval/run-codex-cli-baseline.js +0 -177
- package/template/wall-e/eval/run-coding-agent-real.js +0 -187
- package/template/wall-e/eval/run-eval.js +0 -435
- package/template/wall-e/eval/run-model-comparison.js +0 -142
- package/template/wall-e/eval/session-evaluator.js +0 -187
- package/template/wall-e/eval/session-miner.js +0 -207
- package/template/wall-e/eval/session-retrieval-benchmark.js +0 -150
- package/template/wall-e/eval/session-transcripts.js +0 -509
- package/template/wall-e/eval/shadow.js +0 -161
- package/template/wall-e/eval/swebench-adapter.js +0 -345
- package/template/wall-e/eval/swebench-docker.js +0 -192
- package/template/wall-e/eval/train.py +0 -320
- package/template/wall-e/eval/trainer.js +0 -232
- package/template/wall-e/eval/weekly-eval-loop.js +0 -241
|
@@ -31,11 +31,21 @@ session card and subsequent skill lookups include the session's agent and cwd.
|
|
|
31
31
|
|
|
32
32
|
The mobile shell keeps a capped localStorage snapshot for instant paint. The cap
|
|
33
33
|
is intentionally high enough for normal CTM dashboards, so users with hundreds
|
|
34
|
-
of sessions still get incremental refreshes.
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
34
|
+
of sessions still get incremental refreshes. A fresh complete snapshot sends
|
|
35
|
+
`?since=<revision>` and the server may return an incremental delta. A stale but
|
|
36
|
+
not expired snapshot still paints immediately as a preview, but intentionally
|
|
37
|
+
drops the revision so the server returns a full standup response. If the cached
|
|
38
|
+
snapshot was capped, the phone also drops the revision so the server fills in
|
|
39
|
+
the omitted sessions.
|
|
40
|
+
|
|
41
|
+
The snapshot is not an auth source and is not current status by itself. If the
|
|
42
|
+
phone boots with a cached snapshot but CTM reports missing/expired device auth,
|
|
43
|
+
the UI must label the rows as a cached preview, show a pairing action, and avoid
|
|
44
|
+
presenting cached running/waiting lanes as authoritative. When a phone pairing
|
|
45
|
+
claim completes, the claim page clears the cached snapshot and writes a
|
|
46
|
+
short-lived "force full standup" marker. The next `/m/#status` load consumes
|
|
47
|
+
that marker and requests `/api/sessions/standup` without `since`, so a re-paired
|
|
48
|
+
phone cannot keep stale status data through an incremental no-op response.
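
The snapshot rules above can be read as one decision: send `?since=<revision>` only when the snapshot is fresh, complete, and no forced full standup is pending. The following is a minimal sketch under stated assumptions, not the shipped mobile shell code. The field names (`revision`, `savedAt`, `capped`), the helper name, and both time thresholds are hypothetical.

```javascript
// Hypothetical thresholds; the real freshness and expiry windows are not stated in the doc.
const FRESH_MS = 60 * 1000;
const EXPIRE_MS = 24 * 60 * 60 * 1000;
const STANDUP_PATH = '/api/sessions/standup';

// Decide which URL to request and whether the cached rows may paint as a preview.
function planStandupRequest(snapshot, now, forceFull) {
  // A consumed "force full standup" marker (or no snapshot) always means a full request.
  if (!snapshot || forceFull) return { url: STANDUP_PATH, paintPreview: false };

  const age = now - snapshot.savedAt;
  if (age > EXPIRE_MS) return { url: STANDUP_PATH, paintPreview: false };

  const fresh = age <= FRESH_MS;
  const complete = !snapshot.capped;
  if (fresh && complete && snapshot.revision) {
    const since = encodeURIComponent(snapshot.revision);
    return { url: `${STANDUP_PATH}?since=${since}`, paintPreview: true };
  }
  // Stale or capped: still paint the preview, but drop the revision for a full response.
  return { url: STANDUP_PATH, paintPreview: true };
}
```

Dropping the revision rather than the whole snapshot keeps the instant paint while ensuring the server cannot answer with an incremental no-op.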
|
|
39
49
|
|
|
40
50
|
The standup response contract is:
|
|
41
51
|
|
|
@@ -46,6 +56,10 @@ The standup response contract is:
|
|
|
46
56
|
changed or newly added session cards;
|
|
47
57
|
- revisions must ignore volatile relative-age fields so a phone poll does not
|
|
48
58
|
download every session just because "last activity 1m ago" became "2m ago".
|
|
59
|
+
- session cards use `worktreeStatus` as the canonical local-work summary. The
|
|
60
|
+
phone cache may read legacy `worktree` cards, but it normalizes them back to
|
|
61
|
+
`worktreeStatus` before rendering or persisting so changed-file/commit chips
|
|
62
|
+
survive a reload before the next HTTP refresh finishes.
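
As an illustration of the `worktreeStatus` normalization described above, here is a small sketch. Only the two field names `worktree` and `worktreeStatus` come from the doc; the helper name and the card shape are assumptions.

```javascript
// Hypothetical normalizer: map a legacy `worktree` field onto the canonical
// `worktreeStatus` field before a card is rendered or persisted.
function normalizeSessionCard(card) {
  if (!card || typeof card !== 'object') return card;
  const { worktree, ...rest } = card;
  // The canonical field wins when both are present.
  if (rest.worktreeStatus === undefined && worktree !== undefined) {
    rest.worktreeStatus = worktree;
  }
  return rest;
}
```

Normalizing on both the read and the write path means a legacy card is rewritten in canonical form the first time it is persisted again.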
|
|
49
63
|
|
|
50
64
|
The key lifecycle messages are:
|
|
51
65
|
|
|
@@ -73,6 +87,14 @@ This avoids blocking resume or watch flows on large JSONL files. The HTTP
|
|
|
73
87
|
history path is still valuable for older transcript context, but it must not own
|
|
74
88
|
freshness.
|
|
75
89
|
|
|
90
|
+
The detail view also keeps a small, browser-local transcript tail per session.
|
|
91
|
+
That tail is a non-authoritative preview only: opening a session can show the
|
|
92
|
+
last known durable messages before WebSocket or HTTP history finishes, then the
|
|
93
|
+
canonical `/api/session/messages` response replaces cached rows while preserving
|
|
94
|
+
new live events that arrived during hydration. The cache is capped by session
|
|
95
|
+
count, message count, message text length, and age, and it is cleared when the
|
|
96
|
+
phone consumes a forced full-refresh marker after re-pairing.
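
The four caps on the transcript tail (session count, message count, text length, age) could be applied in one pruning pass. This is a sketch only: the cache shape, the helper name, and every limit value are assumptions, since the doc does not state them.

```javascript
// Hypothetical limits; the real values are not given in the doc.
const TAIL_LIMITS = {
  maxSessions: 20,
  maxMessages: 30,
  maxTextLength: 2000,
  maxAgeMs: 6 * 60 * 60 * 1000,
};

// cache: { [sessionId]: { savedAt: number, messages: [{ text, ... }] } }
function pruneTailCache(cache, now, limits = TAIL_LIMITS) {
  const kept = Object.entries(cache)
    .filter(([, tail]) => now - tail.savedAt <= limits.maxAgeMs) // age cap
    .sort((a, b) => b[1].savedAt - a[1].savedAt)                 // newest first
    .slice(0, limits.maxSessions);                               // session cap

  const pruned = {};
  for (const [sessionId, tail] of kept) {
    pruned[sessionId] = {
      savedAt: tail.savedAt,
      messages: tail.messages
        .slice(-limits.maxMessages)                              // keep the newest messages
        .map((m) => ({ ...m, text: String(m.text ?? '').slice(0, limits.maxTextLength) })),
    };
  }
  return pruned;
}
```

Clearing the cache after re-pairing is then just replacing it with an empty object when the forced full-refresh marker is consumed.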
|
|
97
|
+
|
|
76
98
|
Live events are deduplicated by the strongest available key:
|
|
77
99
|
|
|
78
100
|
1. `eventId`
|
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
# Mobile Remote Submission Lifecycle
|
|
2
|
+
|
|
3
|
+
Phone sends are split into two separate contracts:
|
|
4
|
+
|
|
5
|
+
1. **Transport delivery**: the phone delivered an authenticated remote message to CTM on the Mac.
|
|
6
|
+
2. **Agent acceptance**: CTM has evidence that the target Claude/Codex/Wall-E session accepted that message.
|
|
7
|
+
|
|
8
|
+
The phone must not treat transport delivery as final. PTY-based agents use real terminal line editors, so a successful write plus Enter only proves that CTM interacted with the terminal. It does not prove the agent added the prompt to its transcript.
|
|
9
|
+
|
|
10
|
+
## Data Model
|
|
11
|
+
|
|
12
|
+
`ctm_remote_submissions` is the durable backend ledger keyed by the remote message id.
|
|
13
|
+
|
|
14
|
+
Stored fields intentionally avoid a second full prompt copy:
|
|
15
|
+
|
|
16
|
+
- `message_id`, `session_id`, `device_id`, `message_type`
|
|
17
|
+
- `text_hash` for matching and diagnostics
|
|
18
|
+
- `text_preview` for UI status only
|
|
19
|
+
- `status`, `result_json`, timestamps, attempt count, and last error
|
|
20
|
+
|
|
21
|
+
The full prompt should live in the provider/session transcript once accepted. Before acceptance, the phone keeps its transport outbox locally until CTM acknowledges the remote envelope.
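
A ledger row that follows the "no second full prompt copy" rule might be built as sketched here. The column names `message_id`, `session_id`, `device_id`, `message_type`, `text_hash`, `text_preview`, `status`, and `result_json` come from the list above; the timestamp, attempt, and error column names, the SHA-256 choice, and the 80-character preview length are assumptions.

```javascript
const { createHash } = require('node:crypto');

const PREVIEW_CHARS = 80; // assumed preview length

// Build a ctm_remote_submissions-style row that never contains the full prompt text.
function buildLedgerRow({ messageId, sessionId, deviceId, messageType, text }, now = Date.now()) {
  return {
    message_id: messageId,
    session_id: sessionId,
    device_id: deviceId,
    message_type: messageType,
    // Hash for matching and diagnostics (algorithm is an assumption).
    text_hash: createHash('sha256').update(text, 'utf8').digest('hex'),
    // Short preview for UI status only.
    text_preview: text.length > PREVIEW_CHARS ? `${text.slice(0, PREVIEW_CHARS)}…` : text,
    status: 'received',
    result_json: null,
    created_at: now,     // hypothetical column name
    updated_at: now,     // hypothetical column name
    attempt_count: 0,    // hypothetical column name
    last_error: null,    // hypothetical column name
  };
}
```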
|
|
22
|
+
|
|
23
|
+
## Statuses
|
|
24
|
+
|
|
25
|
+
- `received`: CTM received the remote envelope.
|
|
26
|
+
- `typed_to_pty`: CTM wrote the text to the terminal.
|
|
27
|
+
- `observing`: CTM pressed Enter and is waiting for acceptance evidence.
|
|
28
|
+
- `durable_user_message_seen`: the prompt is present in CTM's session prompt/message index.
|
|
29
|
+
- `response_started`: the target session became busy after submission.
|
|
30
|
+
- `accepted_by_terminal`: the terminal advanced after Enter and the prompt is no longer visibly waiting.
|
|
31
|
+
- `blocked`: CTM refused to type because the session was busy, blocked by approval, or not ready.
|
|
32
|
+
- `failed`: CTM started delivery but could not complete it.
|
|
33
|
+
- `needs_attention`: CTM delivered bytes but could not prove acceptance, or the prompt still appears at the terminal prompt.
|
|
34
|
+
|
|
35
|
+
## Observation Rules
|
|
36
|
+
|
|
37
|
+
For Claude/Codex terminal sessions, CTM records a pre-submit terminal marker, writes text, presses Enter, then observes asynchronously:
|
|
38
|
+
|
|
39
|
+
- First preference: match the submitted text in the durable prompt/message index.
|
|
40
|
+
- Second preference: session status turns busy after the submit.
|
|
41
|
+
- Third preference: terminal output advances after Enter. If CTM recorded a
|
|
42
|
+
before-Enter terminal marker, advancement from the pre-typing marker is not
|
|
43
|
+
enough because it can be only the terminal echo of CTM typing the reply.
|
|
44
|
+
- Attention path: the submitted text still appears at the visible prompt after a grace period, or no acceptance signal arrives before timeout.
|
|
45
|
+
|
|
46
|
+
If the exact submitted prompt is still active and the terminal marker did not
|
|
47
|
+
move after the first Enter, CTM may press Enter one more time. This retry is
|
|
48
|
+
only for a visible stuck composer state; it is not a generic resend of the
|
|
49
|
+
remote message and it does not duplicate the phone envelope.
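
The preference order in the observation rules can be sketched as a pure classifier. This is an illustration, not CTM's observer: the observation object, its field names, and the idea of terminal markers as increasing numbers are all assumptions. Only the status strings come from the Statuses section.

```javascript
// obs fields (all hypothetical): durablePromptMatched, sessionBusyAfterSubmit,
// preTypingMarker, beforeEnterMarker, currentMarker, promptStillVisible,
// elapsedMs, graceMs, timeoutMs, enterPresses.
function classifyObservation(obs) {
  if (obs.durablePromptMatched) return 'durable_user_message_seen'; // first preference
  if (obs.sessionBusyAfterSubmit) return 'response_started';        // second preference

  // Third preference: compare against the before-Enter marker when one was recorded,
  // because movement since the pre-typing marker may only be the echo of CTM typing.
  const baseline = obs.beforeEnterMarker ?? obs.preTypingMarker;
  const advanced = obs.currentMarker > baseline;
  if (advanced && !obs.promptStillVisible) return 'accepted_by_terminal';

  // Attention path: prompt still visible after the grace period, or overall timeout.
  if (obs.promptStillVisible && obs.elapsedMs >= obs.graceMs) return 'needs_attention';
  if (obs.elapsedMs >= obs.timeoutMs) return 'needs_attention';
  return 'observing';
}

// One extra Enter is allowed only for a visibly stuck composer after the first Enter.
function shouldRetryEnter(obs) {
  return obs.promptStillVisible
    && obs.currentMarker === obs.beforeEnterMarker
    && obs.enterPresses === 1;
}
```

Keeping the retry check separate from classification makes it clear that the second Enter is a terminal-level nudge and never a resend of the phone envelope.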
|
|
50
|
+
|
|
51
|
+
Observation broadcasts `remote-submission-status` over the shared websocket and exposes reconnect state at:
|
|
52
|
+
|
|
53
|
+
`GET /api/remote/submissions?session_id=<id>`
|
|
54
|
+
|
|
55
|
+
Both CTM web and phone can consume the same status stream. The phone renders durable pending/attention rows in the timeline. The composer status is only transient feedback for the current text box action, so attention rows must not be mirrored into a second red warning below the input. Desktop can add the same affordance later without changing the backend contract.
|
|
56
|
+
|
|
57
|
+
## UX Rules
|
|
58
|
+
|
|
59
|
+
- Phone outbox removal means “Mac accepted the envelope,” not “agent accepted the prompt.”
|
|
60
|
+
- While observing, the composer may show “Delivered to Mac. Waiting for the agent to accept it...” as in-flight feedback.
|
|
61
|
+
- If attention is needed, show the backend reason in the remote-submission timeline row and keep the composer free for the next action.
|
|
62
|
+
- Once accepted, remove pending attention rows and refresh the transcript so the durable prompt/reply replaces optimistic UI.
|
|
63
|
+
- Do not hide a pending/attention row just because the phone rendered an optimistic local copy of the submitted prompt; only durable transcript evidence should resolve the remote-submission row.
|
|
64
|
+
|
|
65
|
+
## Regression Risks
|
|
66
|
+
|
|
67
|
+
- Do not retry an envelope after CTM acknowledged it. Retries use the same message id only for network ambiguity before a server response.
|
|
68
|
+
- Do not store full prompt text in the backend ledger.
|
|
69
|
+
- Do not change Claude/Codex copy/paste behavior; this lifecycle observes the existing text-then-Enter terminal contract.
|
|
@@ -15,21 +15,37 @@ from it without re-doing the research.
|
|
|
15
15
|
default-deny authorization, one-time device claims, secure-context rollout,
|
|
16
16
|
notification event sources, Cloudflare Access validation, and phone-side UX.
|
|
17
17
|
|
|
18
|
-
2026-05-09 Walle Relay update: the recommended long-term product path
|
|
19
|
-
the hosted Walle Relay remote-control model in
|
|
20
|
-
`claude-task-manager/docs/walle-relay-phone-access-design.md`.
|
|
21
|
-
|
|
18
|
+
2026-05-09 Walle Relay update: the recommended long-term product path was
|
|
19
|
+
initially framed as the hosted Walle Relay remote-control model in
|
|
20
|
+
`claude-task-manager/docs/walle-relay-phone-access-design.md`. That remains
|
|
21
|
+
useful for typed mobile relay workflows, but it is no longer the primary direct
|
|
22
|
+
browser-access architecture.
|
|
22
23
|
|
|
23
24
|
2026-05-10 Microsoft Dev Tunnel update: Microsoft Dev Tunnel is a quick
|
|
24
25
|
browser fallback that avoids VPN, DNS, and Cloudflare setup, but it is still a
|
|
25
26
|
tunnel transport rather than the Walle Relay product path. See
|
|
26
27
|
`claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md`.
|
|
27
28
|
|
|
28
|
-
2026-05-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
the
|
|
32
|
-
|
|
29
|
+
2026-05-30 correction (supersedes a brief 2026-05-19 "private by default"
|
|
30
|
+
experiment): Microsoft Dev Tunnel defaults to **app-gated** (CTM-authenticated)
|
|
31
|
+
access — the tunnel is reachable by URL and CTM's own device token + passkey +
|
|
32
|
+
step-up are the gate, behind an unguessable tunnel URL. Forcing the tunnel
|
|
33
|
+
"private" (interactive Microsoft/GitHub edge sign-in) bricks the phone, because
|
|
34
|
+
the PWA's `fetch()` + `WebSocket` cannot follow the cross-origin login redirect —
|
|
35
|
+
the symptom is "can't connect / times out". Private is now an explicit opt-in
|
|
36
|
+
only (UI selection, `CTM_MS_TUNNEL_ACCESS_MODE=private`, or
|
|
37
|
+
`options.allowPrivateAccess`); a private value coming only from persisted state is
|
|
38
|
+
migrated back to app-gated on the next start so the phone keeps working. For a
|
|
39
|
+
real edge sign-in that is PWA/WebSocket-compatible, use Cloudflare Access or
|
|
40
|
+
Tailscale instead. See
|
|
41
|
+
`claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md`.
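
The opt-in rule in the 2026-05-30 correction can be sketched as a small resolver. The environment variable `CTM_MS_TUNNEL_ACCESS_MODE` and the option `allowPrivateAccess` come from the note above; the function name, the input shape, and the `'app-gated'` mode string are assumptions.

```javascript
// Resolve the Dev Tunnel access mode. Private is honored only when explicitly
// requested; a private value that exists only in persisted state is migrated back.
function resolveTunnelAccessMode({ uiSelection, env = {}, options = {}, persisted } = {}) {
  const explicitPrivate =
    uiSelection === 'private' ||
    env.CTM_MS_TUNNEL_ACCESS_MODE === 'private' ||
    options.allowPrivateAccess === true;

  if (explicitPrivate) return { mode: 'private', migrated: false };
  // `migrated` tells the caller to rewrite persisted state on this start.
  return { mode: 'app-gated', migrated: persisted === 'private' };
}
```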
|
|
42
|
+
|
|
43
|
+
2026-05-27 unified remote-access update: direct browser access for both phone
|
|
44
|
+
and desktop uses the same CTM client-device architecture. Microsoft Dev Tunnel
|
|
45
|
+
private access is the default direct transport, Tailscale is the private-network
|
|
46
|
+
alternative, and Cloudflare Tunnel is out of scope. See
|
|
47
|
+
`claude-task-manager/docs/remote-desktop-access-design.md` and
|
|
48
|
+
`claude-task-manager/docs/microsoft-dev-tunnel-phone-access-design.md`.
|
|
33
49
|
|
|
34
50
|
---
|
|
35
51
|
|
|
@@ -272,7 +288,12 @@ violates G2 of the threat model.
|
|
|
272
288
|
doesn't see content), but device identity could be issued maliciously.
|
|
273
289
|
Mitigation: tailnet lock (signed device joins) is available if paranoid.
|
|
274
290
|
|
|
275
|
-
### 4.2 Option B — Cloudflare Tunnel + Cloudflare Access (
|
|
291
|
+
### 4.2 Legacy Option B — Cloudflare Tunnel + Cloudflare Access (DO NOT IMPLEMENT)
|
|
292
|
+
|
|
293
|
+
2026-05-27 decision: Cloudflare Tunnel is no longer part of the CTM remote
|
|
294
|
+
access plan. Keep this section only as historical threat-model context for old
|
|
295
|
+
deployments. New work should use private Microsoft Dev Tunnel by default or
|
|
296
|
+
Tailscale as the private-network alternative.
|
|
276
297
|
|
|
277
298
|
```
|
|
278
299
|
[Internet client]──HTTPS──[Cloudflare edge]──QUIC tunnel──[cloudflared on Mac]──[CTM:3456]
|
|
@@ -370,6 +391,18 @@ Loopback requests get all scopes but still go through the same route registry
|
|
|
370
391
|
so tests exercise one path. Remote requests must resolve to a non-revoked
|
|
371
392
|
device token before any `/api/*` route or WS message handler runs.
|
|
372
393
|
|
|
394
|
+
**Loopback-trust security invariant (`lib/transport-security.js` `isLoopbackRequest`).**
|
|
395
|
+
A managed tunnel (devtunnel) forwards remote clients to CTM over `127.0.0.1`, so a
|
|
396
|
+
loopback socket peer alone does NOT prove a request is on-box. Loopback full-admin
|
|
397
|
+
trust requires ALL of: (1) a loopback socket peer, (2) **no** proxy/tunnel
|
|
398
|
+
forwarding markers (`X-Forwarded-For/Host/Proto`, `Forwarded`, `X-Real-IP`,
|
|
399
|
+
`X-MS-Original-Host`, `X-Original-Host`, `CF-Connecting-IP`), (3) no non-loopback
|
|
400
|
+
browser `Origin`/`Referer`, and (4) a loopback or absent `Host`. A genuinely local
|
|
401
|
+
browser or CLI never sends forwarding markers; Azure Dev Tunnels (and every
|
|
402
|
+
standard proxy) always stamp them, so any tunnel-forwarded request fails closed
|
|
403
|
+
and must present a device token like any other remote client. This invariant is
|
|
404
|
+
the linchpin of the whole model — never relax it to "loopback socket ⇒ trusted".
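
To make the four conditions concrete, here is a hedged sketch of such a check. It is not the contents of `lib/transport-security.js`; the function and helper names are hypothetical, the request is assumed to look like a Node `http.IncomingMessage` (lowercased header names), and only exact `127.0.0.1`/`::1` peers are treated as loopback for brevity.

```javascript
const FORWARDING_HEADERS = [
  'x-forwarded-for', 'x-forwarded-host', 'x-forwarded-proto', 'forwarded',
  'x-real-ip', 'x-ms-original-host', 'x-original-host', 'cf-connecting-ip',
];
const LOOPBACK_HOSTS = new Set(['localhost', '127.0.0.1', '[::1]']);

function isLoopbackAddress(addr) {
  return addr === '127.0.0.1' || addr === '::1' || addr === '::ffff:127.0.0.1';
}

// Accepts "host:port" or a full URL and returns the hostname, or null if unparsable.
function hostnameOf(value) {
  try {
    return new URL(value.includes('://') ? value : `http://${value}`).hostname;
  } catch {
    return null;
  }
}

function looksLikeTrustedLoopback(req) {
  const headers = req.headers || {};
  // (1) loopback socket peer
  if (!isLoopbackAddress(req.socket && req.socket.remoteAddress)) return false;
  // (2) no proxy/tunnel forwarding markers
  if (FORWARDING_HEADERS.some((name) => headers[name] !== undefined)) return false;
  // (3) no non-loopback browser Origin/Referer
  for (const name of ['origin', 'referer']) {
    if (headers[name] !== undefined && !LOOPBACK_HOSTS.has(hostnameOf(headers[name]))) return false;
  }
  // (4) loopback or absent Host
  if (headers.host !== undefined && !LOOPBACK_HOSTS.has(hostnameOf(headers.host))) return false;
  return true;
}
```

Every branch returns `false` on doubt, so a request that carries any unexpected marker falls back to the device-token path instead of gaining admin trust.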
|
|
405
|
+
|
|
373
406
|
**New table**: `ctm_device_tokens`
|
|
374
407
|
```sql
|
|
375
408
|
CREATE TABLE ctm_device_tokens (
|
|
@@ -1055,11 +1088,12 @@ memory-search endpoint if we want free-text memory search in mobile.
|
|
|
1055
1088
|
|--------|-----------------------------------|---------------|---------|
|
|
1056
1089
|
| POST | `/api/auth/device-claims` | loopback only | Create one-time phone pairing claim |
|
|
1057
1090
|
| DELETE | `/api/auth/device-claims/:id` | loopback only | Cancel unclaimed QR |
|
|
1058
|
-
| POST | `/api/auth/claim/begin-passkey` | claim secret | Begin WebAuthn registration for a claim |
|
|
1059
|
-
| POST | `/api/auth/claim/finish` | claim secret | Finish
|
|
1091
|
+
| POST | `/api/auth/claim/begin-passkey` | claim secret | Begin WebAuthn registration or same-phone recovery for a claim |
|
|
1092
|
+
| POST | `/api/auth/claim/finish` | claim secret | Finish registration/recovery and mint device token cookie |
|
|
1060
1093
|
| GET | `/api/auth/me` | token | Current device label, scopes, step-up freshness |
|
|
1061
1094
|
| GET | `/api/auth/devices` | loopback only | List devices |
|
|
1062
1095
|
| DELETE | `/api/auth/devices/:id` | loopback or admin+step-up | Revoke |
|
|
1096
|
+
| POST | `/api/auth/device-duplicates/revoke` | loopback only | Keep one duplicate phone pairing and revoke the others |
|
|
1063
1097
|
| POST | `/api/auth/begin-step-up` | token | Begin WebAuthn assertion |
|
|
1064
1098
|
| POST | `/api/auth/finish-step-up` | token | Complete WebAuthn assertion, create step-up session |
|
|
1065
1099
|
| POST | `/api/auth/register-passkey` | token+step-up or loopback | Add replacement passkey |
|
|
@@ -1196,7 +1230,10 @@ with Face ID, agent resumes within 2 s.
|
|
|
1196
1230
|
**Acceptance**: phone is locked, agent reaches `waiting_input`, lock-screen
|
|
1197
1231
|
banner appears within 5 s; tap → unlocks → opens to detail.
|
|
1198
1232
|
|
|
1199
|
-
### Batch 7 (
|
|
1233
|
+
### Batch 7 (Removed) — Cloudflare Tunnel + Access fallback
|
|
1234
|
+
|
|
1235
|
+
2026-05-27 decision: do not implement this batch for CTM remote access. It is
|
|
1236
|
+
kept here only to explain old references in the risk register.
|
|
1200
1237
|
|
|
1201
1238
|
- Documentation + optional settings.
|
|
1202
1239
|
- Configure expected team domain, Access application AUD, allowed email, and
|
|
@@ -1230,7 +1267,7 @@ banner appears within 5 s; tap → unlocks → opens to detail.
|
|
|
1230
1267
|
| R14 | Idle hook creates noisy or false high-priority notifications | High-priority push only from `waiting_input`, approval detection, crash, or explicit task transition |
|
|
1231
1268
|
| R15 | Offline phone replays stale mutation after reconnect | Do not queue mutations offline; user must re-open action and step up again |
|
|
1232
1269
|
| R16 | Passkey registered on tailnet origin fails on Cloudflare origin | Store `rp_id`/origin per credential; require separate credential enrollment per origin |
|
|
1233
|
-
| R17 |
|
|
1270
|
+
| R17 | App-gated Dev Tunnel is reachable by URL (no edge sign-in) | The unguessable tunnel URL is not the gate — CTM's device token + per-origin passkey + step-up + default-deny route registry + rate-limit/lockout + audit are. For a stronger perimeter, opt into private (desktop browser only — breaks the phone PWA) or use Cloudflare Access / Tailscale |
|
|
1234
1271
|
| R18 | User mistakes passkey for Microsoft account verification | UI and docs say Microsoft/GitHub identity is checked by Dev Tunnels; CTM passkey only proves a registered credential for this origin |
|
|
1235
1272
|
|
|
1236
1273
|
### 10.2 Decisions from review
|
|
@@ -1370,7 +1407,8 @@ is the design's success.
|
|
|
1370
1407
|
- Tailscale docs — Funnel, MagicDNS, certs.
|
|
1371
1408
|
https://tailscale.com/docs/features/tailscale-funnel
|
|
1372
1409
|
https://tailscale.com/docs/how-to/set-up-https-certificates
|
|
1373
|
-
- Cloudflare Tunnel + Access
|
|
1410
|
+
- Legacy Cloudflare Tunnel + Access references for old threat-model context,
|
|
1411
|
+
not for new remote-access implementation.
|
|
1374
1412
|
https://developers.cloudflare.com/cloudflare-one/access-controls/applications/http-apps/self-hosted-public-app/
|
|
1375
1413
|
https://developers.cloudflare.com/cloudflare-one/access-controls/applications/http-apps/authorization-cookie/validating-json/
|
|
1376
1414
|
- SimpleWebAuthn (Node passkey lib). https://github.com/MasterKale/SimpleWebAuthn
|
|
@@ -0,0 +1,122 @@
|
|
|
1
|
+
# CTM Phone Passkey Identity
|
|
2
|
+
|
|
3
|
+
CTM phone pairing has one durable identity rule:
|
|
4
|
+
|
|
5
|
+
> The same phone on the same WebAuthn relying-party ID reuses the same CTM
|
|
6
|
+
> passkey and the same CTM device row. It does not create another passkey.
|
|
7
|
+
|
|
8
|
+
This document defines the pairing, recovery, origin, cleanup, and testing
|
|
9
|
+
contracts for that rule.
|
|
10
|
+
|
|
11
|
+
## Terms
|
|
12
|
+
|
|
13
|
+
- **Phone profile**: CTM's best local match for a physical phone, currently the
|
|
14
|
+
normalized device label plus the browser/device hint parsed from the user
|
|
15
|
+
agent, for example `Owner iPhone` plus `iPhone Safari`.
|
|
16
|
+
- **Origin**: Browser origin such as
|
|
17
|
+
`https://ctm-a8f39c-3456.usw2.devtunnels.ms`.
|
|
18
|
+
- **RP ID**: WebAuthn relying-party ID. In CTM it is the origin host.
|
|
19
|
+
- **Device token**: CTM's opaque cookie-backed token stored only as a hash in
|
|
20
|
+
`ctm_device_tokens`.
|
|
21
|
+
- **Passkey credential**: WebAuthn public-key credential stored in
|
|
22
|
+
`ctm_webauthn_credentials`.
|
|
23
|
+
|
|
24
|
+
## Pairing Decision
|
|
25
|
+
|
|
26
|
+
When a claim reaches `POST /api/auth/claim/begin-passkey`, CTM resolves the
|
|
27
|
+
claim origin and RP ID, then applies this order:
|
|
28
|
+
|
|
29
|
+
1. Find an active authorized device with the same phone profile and a passkey
|
|
30
|
+
credential for the same RP ID.
|
|
31
|
+
2. If one exists, return WebAuthn authentication options with `mode: "recover"`.
|
|
32
|
+
The phone confirms Face ID against the existing passkey.
|
|
33
|
+
3. On `POST /api/auth/claim/finish` with `mode: "recover"`, CTM rotates the
|
|
34
|
+
token hash on the existing device row, updates its scopes/label from the new
|
|
35
|
+
claim, records `claim_recover`, and sets a fresh `ctm_token` cookie.
|
|
36
|
+
4. No new `ctm_device_tokens` row and no new `ctm_webauthn_credentials` row are
|
|
37
|
+
created in this recovery path.
|
|
38
|
+
5. If no same-RP credential exists but the same phone profile has an active
|
|
39
|
+
credential for another RP ID, CTM returns
|
|
40
|
+
`phone_origin_rotation_required`. It does not create another passkey.
|
|
41
|
+
6. Only if there is no reusable same-phone identity does CTM return
|
|
42
|
+
`mode: "register"` and create a new device/passkey after WebAuthn
|
|
43
|
+
registration succeeds.
|
|
44
|
+
|
|
45
|
+
This preserves the user's expectation that re-pairing the same phone is a
|
|
46
|
+
reconnect, not an identity fork.
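
The decision order above can be sketched as a pure function over in-memory device records. This is illustrative only: the record shape (`active`, `phoneProfile`, `credentials[].rpId`) and the function name are assumptions, while the `mode` values and the `phone_origin_rotation_required` code come from the steps above.

```javascript
// devices: [{ id, active, phoneProfile, credentials: [{ rpId }] }] (hypothetical shape)
// claim:   { phoneProfile, rpId }
function decidePairing(devices, claim) {
  const samePhone = devices.filter((d) => d.active && d.phoneProfile === claim.phoneProfile);

  // Steps 1-2: same phone profile with a credential for the same RP ID means recover.
  const sameRp = samePhone.find((d) => d.credentials.some((c) => c.rpId === claim.rpId));
  if (sameRp) return { mode: 'recover', deviceId: sameRp.id };

  // Step 5: same phone profile, but only credentials for other RP IDs.
  if (samePhone.some((d) => d.credentials.length > 0)) {
    return { error: 'phone_origin_rotation_required' };
  }

  // Step 6: no reusable identity, so register a new device and passkey.
  return { mode: 'register' };
}
```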
|
|
47
|
+
|
|
48
|
+
## Origin Constraint
|
|
49
|
+
|
|
50
|
+
WebAuthn credentials and browser cookies are origin/RP scoped. CTM cannot make a
|
|
51
|
+
passkey created for one tunnel host work on another tunnel host.
|
|
52
|
+
|
|
53
|
+
Examples:
|
|
54
|
+
|
|
55
|
+
- Reopening `https://ctm-a8f39c-3456.usw2.devtunnels.ms/m/` uses the same RP
|
|
56
|
+
ID as the original pairing link from that host.
|
|
57
|
+
- `ctm-a8f39c-3456.usw2.devtunnels.ms` and
|
|
58
|
+
`ctm-b91e02-3456.usw2.devtunnels.ms` are different RP IDs.
|
|
59
|
+
|
|
60
|
+
The product consequence is intentional:
|
|
61
|
+
|
|
62
|
+
- Same phone, same RP ID: reuse the existing passkey.
|
|
63
|
+
- Same phone, different RP ID: stop and ask the user to open the stable phone
|
|
64
|
+
URL or explicitly revoke/replace the old phone pairing.
|
|
65
|
+
|
|
66
|
+
Stable Tailscale, Cloudflare, or persistent Dev Tunnel origins are therefore
|
|
67
|
+
part of the auth design, not just network convenience.
|
|
68
|
+
|
|
69
|
+
## Duplicate Cleanup
|
|
70
|
+
|
|
71
|
+
The recovery path prevents new duplicates, but older CTM versions may already
|
|
72
|
+
have created repeated phone rows. CTM handles those in two ways:
|
|
73
|
+
|
|
74
|
+
- After any successful new claim or claim recovery, CTM revokes active older
|
|
75
|
+
rows in the same duplicate group and records `device_revoke_duplicate`.
|
|
76
|
+
- `Setup -> Access -> Paired phones` exposes a `Keep newest` or `Keep this
|
|
77
|
+
phone` action for existing duplicate groups. It calls
|
|
78
|
+
`POST /api/auth/device-duplicates/revoke` and keeps the selected device.
|
|
79
|
+
|
|
80
|
+
Revoked devices remain in the database for audit, but setup hides removed rows
|
|
81
|
+
behind a compact "removed connections" note.
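
A minimal sketch of the "keep one, revoke the rest" behavior, assuming duplicate groups are defined by phone profile and that revocation is a timestamp on the row rather than a delete. The record fields and function name are hypothetical; the audit event name `device_revoke_duplicate` comes from the text above.

```javascript
// devices: [{ id, phoneProfile, revokedAt }] (hypothetical shape)
function revokeDuplicates(devices, keepId, now = Date.now()) {
  const keep = devices.find((d) => d.id === keepId && !d.revokedAt);
  if (!keep) throw new Error(`device ${keepId} is not active`);

  const audit = [];
  const updated = devices.map((d) => {
    const isDuplicate = d.id !== keepId && !d.revokedAt && d.phoneProfile === keep.phoneProfile;
    if (!isDuplicate) return d;
    audit.push({ event: 'device_revoke_duplicate', deviceId: d.id, keptDeviceId: keepId, at: now });
    // Rows stay in place for audit; only the revocation timestamp changes.
    return { ...d, revokedAt: now };
  });
  return { devices: updated, audit };
}
```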
|
|
82
|
+
|
|
83
|
+
## Short-Lived Artifact Retention
|
|
84
|
+
|
|
85
|
+
Pairing creates temporary rows that are not phone identities:
|
|
86
|
+
|
|
87
|
+
- `ctm_device_claims`
|
|
88
|
+
- `ctm_pairing_requests`
|
|
89
|
+
- `ctm_webauthn_challenges`
|
|
90
|
+
|
|
91
|
+
CTM prunes expired unclaimed claims, resolved/expired pairing requests, and old
|
|
92
|
+
WebAuthn challenges through `cleanupMobileAuthArtifacts()`. Claimed device
|
|
93
|
+
tokens, passkey credentials, and audit rows are not deleted by this cleanup.
|
|
94
|
+
|
|
95
|
+
## User-Facing UX
|
|
96
|
+
|
|
97
|
+
The phone claim page distinguishes the two flows:
|
|
98
|
+
|
|
99
|
+
- Registration: `Face ID & Pair`
|
|
100
|
+
- Recovery: `Face ID & Reconnect`
|
|
101
|
+
|
|
102
|
+
Settings distinguishes normal auth health from duplicate cleanup:
|
|
103
|
+
|
|
104
|
+
- Active paired phones show status, last seen time, origin, and duplicate notes.
|
|
105
|
+
- Duplicate actions revoke only the other active devices in the same phone
|
|
106
|
+
profile group.
|
|
107
|
+
- Different-origin same-phone conflicts require an explicit user decision
|
|
108
|
+
because CTM cannot prove the old RP credential on the new origin.
|
|
109
|
+
|
|
110
|
+
## Verification Contract
|
|
111
|
+
|
|
112
|
+
Tests should cover:
|
|
113
|
+
|
|
114
|
+
1. Same phone plus same RP returns `mode: "recover"` from
|
|
115
|
+
`/api/auth/claim/begin-passkey`.
|
|
116
|
+
2. Recovery finishes by rotating the existing device token hash.
|
|
117
|
+
3. Recovery leaves `ctm_device_tokens` and `ctm_webauthn_credentials` counts
|
|
118
|
+
unchanged.
|
|
119
|
+
4. Same phone plus different active RP returns
|
|
120
|
+
`phone_origin_rotation_required`.
|
|
121
|
+
5. Manual duplicate cleanup revokes only the non-kept active devices.
|
|
122
|
+
6. Expired WebAuthn challenges are pruned without deleting device identities.
|
|
@@ -73,6 +73,9 @@ name before running `tailscale cert`.
|
|
|
73
73
|
- HTTPS is enabled only when both `CTM_TLS_CERT` and `CTM_TLS_KEY` are present.
|
|
74
74
|
- Mutating HTTP routes and WebSocket connections reject mismatched browser
|
|
75
75
|
origins.
|
|
76
|
+
- Phone passkeys are reused by phone profile and WebAuthn RP ID. See
|
|
77
|
+
[Phone Passkey Identity](phone-passkey-identity.md) for the same-phone
|
|
78
|
+
recovery and duplicate-cleanup contract.
|
|
76
79
|
|
|
77
80
|
## Cloudflare Tunnel + Access Fallback
|
|
78
81
|
|
|
@@ -1,6 +1,9 @@
|
|
|
1
1
|
# Prompt Editing Tree Design
|
|
2
2
|
|
|
3
|
-
> Status:
|
|
3
|
+
> Status: staged implementation. Wall-E chat already has branch snapshots;
|
|
4
|
+
> CTM Wall-E coding sessions now use the same active-path snapshot contract for
|
|
5
|
+
> delete-from-prompt and edit/resend forks. The full immutable DAG schema below
|
|
6
|
+
> remains the longer-term target.
|
|
4
7
|
> Date: 2026-05-03
|
|
5
8
|
> Scope: Wall-E chat, CTM Wall-E sessions, CTM transcript/review/session-memory caches
|
|
6
9
|
|
|
@@ -38,6 +41,27 @@ review panel, resume path, and Wall-E memory tools will use the wrong branch.
|
|
|
38
41
|
provider CLIs can rewrite native history.
|
|
39
42
|
- Avoid destructive JSONL truncation as the default editing primitive.
|
|
40
43
|
|
|
44
|
+
## Implemented CTM Wall-E Coding Session Contract
|
|
45
|
+
|
|
46
|
+
The first implementation intentionally reuses Wall-E's existing
|
|
47
|
+
`chat_branches` sidecar instead of introducing a second CTM table.
|
|
48
|
+
|
|
49
|
+
- The browser keeps `messages` as the active visible path.
|
|
50
|
+
- Editing a user prompt saves the old suffix as a branch version, replaces the
|
|
51
|
+
visible suffix with the edited prompt, and resends from that point.
|
|
52
|
+
- Deleting a user prompt truncates the visible path before that prompt and
|
|
53
|
+
saves the truncated active path.
|
|
54
|
+
- Branch selection swaps the visible suffix without mutating message content.
|
|
55
|
+
- Every Wall-E coding send includes `contextMessages`, a compact active-path
|
|
56
|
+
history, so `wall-e/chat.js` does not build model context from abandoned
|
|
57
|
+
chronological rows.
|
|
58
|
+
- Review/session history reads prefer the durable branch snapshot when it
|
|
59
|
+
exists, then fall back to legacy CTM JSONL and Wall-E chat history.
|
|
60
|
+
|
|
61
|
+
This gives the user-visible and model-visible behavior now while keeping the
|
|
62
|
+
raw JSONL append-only. The later DAG migration can move the same semantics into
|
|
63
|
+
first-class message-parent tables without changing the product behavior.
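
The active-path contract can be illustrated with a tiny in-memory model. This is a sketch under assumptions, not the `chat_branches` schema: the state shape (`messages`, `branches`), the `forkIndex`/`suffix` fields, and the function names are hypothetical. Only `messages` as the active visible path and `contextMessages` as the compact history sent with each message come from the bullets above.

```javascript
// state: { messages: [{ role, text }], branches: [{ forkIndex, suffix }] }

// Edit: save the old suffix as a branch version, replace it with the edited prompt.
function editPrompt(state, index, newText) {
  return {
    messages: [...state.messages.slice(0, index), { role: 'user', text: newText }],
    branches: [...state.branches, { forkIndex: index, suffix: state.messages.slice(index) }],
  };
}

// Delete: truncate the visible path before the prompt.
function deletePrompt(state, index) {
  return { messages: state.messages.slice(0, index), branches: state.branches };
}

// Select: swap the visible suffix with a stored one; message objects are reused as-is.
function selectBranch(state, branchIndex) {
  const branch = state.branches[branchIndex];
  const branches = state.branches.slice();
  branches[branchIndex] = { forkIndex: branch.forkIndex, suffix: state.messages.slice(branch.forkIndex) };
  return { messages: [...state.messages.slice(0, branch.forkIndex), ...branch.suffix], branches };
}

// Compact active-path history to send as contextMessages with each message.
function buildContextMessages(state) {
  return state.messages.map(({ role, text }) => ({ role, text }));
}
```

Because every function returns a new state and never rewrites a stored message, the raw history can stay append-only while the visible path changes.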
|
|
64
|
+
|
|
41
65
|
## Non-Goals
|
|
42
66
|
|
|
43
67
|
- Do not implement this design in this document.
|