npm - create-walle - Versions diffs - 0.9.11 → 0.9.13 - Mend

create-walle 0.9.11 → 0.9.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (167)

package/README.md +3 -3
package/package.json +2 -2
package/template/bin/dev.sh +7 -1
package/template/bin/setup.js +53 -9
package/template/bin/sync-images.js +53 -0
package/template/builder-journal.md +17 -0
package/template/claude-task-manager/api-prompts.js +98 -13
package/template/claude-task-manager/api-reviews.js +82 -5
package/template/claude-task-manager/db.js +32 -5
package/template/claude-task-manager/docs/session-capture-foundation-design.md +1273 -0
package/template/claude-task-manager/lib/claude-desktop-sessions.js +696 -0
package/template/claude-task-manager/lib/coding-agent-models.js +49 -1
package/template/claude-task-manager/lib/session-capture.js +421 -0
package/template/claude-task-manager/lib/session-history.js +135 -15
package/template/claude-task-manager/lib/session-jobs.js +10 -5
package/template/claude-task-manager/lib/session-stream.js +87 -19
package/template/claude-task-manager/lib/setup-provider-config.js +115 -0
package/template/claude-task-manager/lib/walle-ctm-history.js +72 -0
package/template/claude-task-manager/lib/walle-session-context.js +61 -0
package/template/claude-task-manager/lib/walle-transcript.js +176 -0
package/template/claude-task-manager/public/css/setup.css +35 -8
package/template/claude-task-manager/public/css/walle-session.css +56 -0
package/template/claude-task-manager/public/css/walle.css +120 -0
package/template/claude-task-manager/public/index.html +814 -181
package/template/claude-task-manager/public/js/message-renderer.js +148 -19
package/template/claude-task-manager/public/js/reviews.js +120 -62
package/template/claude-task-manager/public/js/setup.js +75 -31
package/template/claude-task-manager/public/js/stream-view.js +115 -55
package/template/claude-task-manager/public/js/walle-session.js +84 -2
package/template/claude-task-manager/public/js/walle.js +308 -54
package/template/claude-task-manager/server.js +1092 -146
package/template/claude-task-manager/session-integrity.js +181 -54
package/template/claude-task-manager/session-utils.js +123 -41
package/template/claude-task-manager/workers/state-detectors/codex.js +5 -2
package/template/package.json +1 -1
package/template/wall-e/adapters/ctm.js +39 -18
package/template/wall-e/agent-runners/contract.js +17 -0
package/template/wall-e/agent-runners/index.js +22 -0
package/template/wall-e/agent-runtime/harness.js +212 -0
package/template/wall-e/agent-runtime/index.js +8 -0
package/template/wall-e/agent-runtime/registry.js +67 -0
package/template/wall-e/agent-runtime/session-store.js +179 -0
package/template/wall-e/agent-runtime/spawn.js +208 -0
package/template/wall-e/api-walle.js +174 -7
package/template/wall-e/brain.js +266 -28
package/template/wall-e/channels/policy.js +88 -0
package/template/wall-e/channels/registry.js +15 -1
package/template/wall-e/channels/reply-dispatcher.js +70 -0
package/template/wall-e/channels/session-bindings.js +51 -0
package/template/wall-e/chat/code-review-context.js +29 -0
package/template/wall-e/chat.js +188 -42
package/template/wall-e/coding/acp-adapter.js +188 -0
package/template/wall-e/coding/agent-catalog.js +129 -0
package/template/wall-e/coding/compaction-service.js +247 -0
package/template/wall-e/coding/execution-trace.js +3 -0
package/template/wall-e/coding/instruction-service.js +224 -0
package/template/wall-e/coding/model-message.js +67 -0
package/template/wall-e/coding/permission-rules-store.js +111 -0
package/template/wall-e/coding/permission-service.js +266 -0
package/template/wall-e/coding/prompt-bundle.js +67 -0
package/template/wall-e/coding/prompt-runtime.js +243 -0
package/template/wall-e/coding/provider-transform.js +188 -0
package/template/wall-e/coding/runtime-mode.js +132 -0
package/template/wall-e/coding/snapshot-service.js +155 -0
package/template/wall-e/coding/stream-processor.js +268 -0
package/template/wall-e/coding/task-tool.js +255 -0
package/template/wall-e/coding/tool-registry.js +361 -0
package/template/wall-e/coding/transcript-writer.js +143 -0
package/template/wall-e/coding/workspace-replay.js +324 -0
package/template/wall-e/coding-context.js +4 -22
package/template/wall-e/coding-orchestrator.js +307 -18
package/template/wall-e/coding-prompts.js +44 -3
package/template/wall-e/context/context-builder.js +43 -1
package/template/wall-e/context/topic-matcher.js +1 -1
package/template/wall-e/eval/agent-runner.js +59 -13
package/template/wall-e/eval/benchmarks/memory-retrieval.json +155 -57
package/template/wall-e/eval/benchmarks.js +100 -16
package/template/wall-e/eval/eval-orchestrator.js +218 -8
package/template/wall-e/eval/harvester.js +62 -5
package/template/wall-e/eval/head-to-head.js +23 -2
package/template/wall-e/eval/humaneval-adapter.js +30 -5
package/template/wall-e/eval/livecodebench-adapter.js +29 -5
package/template/wall-e/eval/manifest.js +186 -0
package/template/wall-e/eval/run-agent-benchmarks.js +66 -2
package/template/wall-e/eval/session-retrieval-benchmark.js +150 -0
package/template/wall-e/eval/session-transcripts.js +57 -4
package/template/wall-e/eval/swebench-adapter.js +109 -3
package/template/wall-e/evaluation/agent-router.js +53 -1
package/template/wall-e/evaluation/coding-quorum.js +48 -1
package/template/wall-e/evaluation/router.js +4 -2
package/template/wall-e/evaluation/tier-selector.js +11 -1
package/template/wall-e/extraction/contradiction.js +2 -2
package/template/wall-e/extraction/indexer.js +2 -1
package/template/wall-e/extraction/knowledge-extractor.js +2 -2
package/template/wall-e/hooks/cli.js +92 -0
package/template/wall-e/hooks/discovery.js +119 -0
package/template/wall-e/hooks/index.js +7 -0
package/template/wall-e/hooks/manifest.js +55 -0
package/template/wall-e/hooks/runtime.js +84 -0
package/template/wall-e/hooks/session-memory.js +225 -0
package/template/wall-e/http/auth.js +6 -2
package/template/wall-e/http/chat-api.js +54 -8
package/template/wall-e/integrations/claude-plugin/hooks/hooks.json +27 -0
package/template/wall-e/integrations/claude-plugin/hooks/walle-precompact-hook.sh +5 -0
package/template/wall-e/integrations/claude-plugin/hooks/walle-stop-hook.sh +5 -0
package/template/wall-e/integrations/codex-plugin/hooks/walle-hook.sh +7 -0
package/template/wall-e/integrations/codex-plugin/hooks.json +37 -0
package/template/wall-e/listening/calendar.js +3 -1
package/template/wall-e/llm/client.js +64 -10
package/template/wall-e/llm/google.js +39 -5
package/template/wall-e/llm/ollama.js +1 -1
package/template/wall-e/llm/ollama.plugin.json +1 -1
package/template/wall-e/llm/provider-availability.js +10 -0
package/template/wall-e/llm/provider-error.js +269 -0
package/template/wall-e/llm/tool-adapter.js +48 -12
package/template/wall-e/loops/boot.js +2 -1
package/template/wall-e/loops/initiative.js +2 -2
package/template/wall-e/loops/tasks.js +8 -47
package/template/wall-e/loops/workspace-prompts.js +20 -0
package/template/wall-e/mcp-server.js +442 -1
package/template/wall-e/memory/session-ingest-service.js +159 -0
package/template/wall-e/memory/source-indexer.js +289 -0
package/template/wall-e/plugins/discovery.js +83 -0
package/template/wall-e/plugins/manifest-loader.js +50 -10
package/template/wall-e/plugins/manifest-schema.js +69 -0
package/template/wall-e/plugins/model-catalog.js +55 -0
package/template/wall-e/prompts/coding/base.txt +2 -0
package/template/wall-e/prompts/coding/deepseek.txt +1 -0
package/template/wall-e/prompts/coding/memory-protocol.md +9 -0
package/template/wall-e/prompts/coding/plan.txt +1 -0
package/template/wall-e/runtime/execution-trace.js +220 -0
package/template/wall-e/security/audit.js +266 -0
package/template/wall-e/security/ssrf.js +236 -0
package/template/wall-e/session-files.js +303 -0
package/template/wall-e/skills/_bundled/slack-backfill/SKILL.md +3 -0
package/template/wall-e/skills/_bundled/slack-sync/SKILL.md +3 -0
package/template/wall-e/skills/internal-skill-registry.js +2 -2
package/template/wall-e/skills/script-skill-runner.js +143 -0
package/template/wall-e/skills/skill-executor.js +5 -6
package/template/wall-e/skills/skill-fallback.js +3 -1
package/template/wall-e/skills/skill-harness-registry.js +7 -8
package/template/wall-e/skills/skill-planner.js +52 -4
package/template/wall-e/skills/slack-ingest.js +11 -3
package/template/wall-e/sources/base.js +90 -0
package/template/wall-e/sources/builtin.js +33 -0
package/template/wall-e/sources/claude-code-jsonl.js +78 -0
package/template/wall-e/sources/codex-jsonl.js +125 -0
package/template/wall-e/sources/coding-session-utils.js +117 -0
package/template/wall-e/sources/contract-suite.js +59 -0
package/template/wall-e/sources/gemini-jsonl.js +85 -0
package/template/wall-e/sources/index.js +9 -0
package/template/wall-e/sources/jsonl-utils.js +181 -0
package/template/wall-e/sources/record-types.js +252 -0
package/template/wall-e/sources/registry.js +92 -0
package/template/wall-e/sources/transforms.js +100 -0
package/template/wall-e/sources/walle-jsonl.js +108 -0
package/template/wall-e/tools/coding-middleware.js +31 -1
package/template/wall-e/tools/file-tracker.js +25 -1
package/template/wall-e/tools/local-tools.js +75 -47
package/template/wall-e/tools/session-sharing.js +68 -1
package/template/wall-e/tools/shell-analyzer.js +1 -1
package/template/wall-e/tools/shell-policy.js +47 -0
package/template/wall-e/tools/snapshot.js +42 -0
package/template/wall-e/training/harvester.js +62 -5
package/template/wall-e/utils/repair.js +253 -1
package/template/website/index.html +3 -3
package/template/wall-e/skills/_bundled/slack-mentions/.watched-threads.json +0 -18

package/template/claude-task-manager/docs/session-capture-foundation-design.md ADDED

@@ -0,0 +1,1273 @@
+# Session Capture Foundation Design
+Date: 2026-04-28
+Status: design draft, source-code reuse pass applied
+Owner: CTM
+## Summary
+Build a shared `SessionCapture` foundation by promoting the existing
+`SessionStream`/session-history/status/approval machinery into a clearer
+provider-neutral contract. The codebase already captures most of the requested
+surface: live transcript events, persisted conversation messages, active-session
+status, hover summaries, prompt queues, approval decisions, and restart
+scrollback. The foundation should reuse those pieces first.
+This should not replace the existing terminal scrollback recorder or the
+existing provider-specific JSONL stream tailers. The foundation should sit above
+them as a normalization and projection layer:
+1. Capture from the richest source available for each provider.
+2. Normalize provider events into a small CTM event vocabulary.
+3. Maintain cheap in-memory live projections for UI and automation.
+4. Persist durable message/search/history data through existing tables.
+5. Let downstream features subscribe to the same substrate instead of each
+   feature scraping the terminal or re-parsing provider files independently.
+The key architectural decision is to make `SessionStream` the first-class live
+capture bus, make raw PTY bytes a fallback signal, and avoid adding new tables
+where existing tables already own the data.
+## Why This Matters
+Several features want the same facts:
+- What did the user ask recently?
+- What has the coding agent emitted recently?
+- Is the session running, waiting for input, waiting for approval, idle, exited,
+  or unknown?
+- Is there an approval request, and what exact command or action needs a
+  decision?
+- What is the recent work summary for a tooltip or active-session preview?
+- Should a monitor agent intervene because the session is stuck, looping,
+  blocked, failing, or drifting?
+Today these questions are answered by separate mechanisms:
+- PTY scrollback capture restores terminal output.
+- Provider-specific stream readers tail structured files where possible.
+- Hooks and telemetry provide partial status signals.
+- The approver still relies heavily on screen parsing.
+- UI status and summaries are computed through existing stream APIs.
+Those pieces are valuable, but they are not yet exposed as one reusable
+foundation. `SessionCapture` should be a thin contract over existing code first,
+not a duplicate recorder.
+## Current Codebase Findings
+### Existing Session And Scrollback Capture
+The codebase already captures raw terminal output for active sessions.
+Relevant files:
+- `claude-task-manager/server.js`
+- `claude-task-manager/workers/scrollback-worker.js`
+- `claude-task-manager/workers/headless-term-worker.js`
+- `claude-task-manager/lib/session-history.js`
+- `claude-task-manager/db.js`
+Observed behavior:
+- PTY output is observed in the server and fed unthrottled to the headless xterm
+  worker for current terminal state snapshots.
+- PTY output is also batched into `scrollback_log` for restart survival.
+- `scrollback_log` is explicitly cleared on normal session exit, so it is not a
+  long-term semantic capture store.
+- This is useful for restoring the terminal view and as a fallback, but raw
+  terminal bytes are expensive and ambiguous for semantic features.
+Implication:
+- Keep scrollback capture.
+- Do not use it as the primary semantic conversation source when structured
+  provider logs exist.
+- Use it for:
+  - visual restore,
+  - coarse activity heartbeat,
+  - fallback for providers without structured transcripts,
+  - debugging capture gaps.
+### Existing Structured Stream Layer
+The codebase already has a structured stream layer.
+Relevant files:
+- `claude-task-manager/lib/session-stream.js`
+- `claude-task-manager/lib/session-state-bus.js`
+- `claude-task-manager/lib/telemetry-receiver.js`
+- `claude-task-manager/public/js/stream-view.js`
+- `claude-task-manager/public/index.html`
+Observed behavior:
+- Provider session files are tailed and converted into events/status.
+- The server exposes stream status and session stream APIs.
+- The frontend subscribes to stream events over websocket.
+- The UI already uses stream summaries/status for active sessions in some
+  places.
+- `SessionStream` already has:
+  - `JsonlTailer` with byte offsets and partial-line handling,
+  - a per-agent ring buffer,
+  - CTM session ID to agent session ID mapping,
+  - Claude and Codex JSONL parsing,
+  - user prompt cache,
+  - debounced summary generation,
+  - `getRecentEvents`, `getSummary`, and `getAllStatuses`.
+- `stream-view.js` already consumes `/api/sessions/:id/summary`,
+  `/api/stream/status`, `stream-init`, `stream-event`, and `stream-status`.
+Implication:
+- `SessionCapture` should evolve out of this layer rather than start as an
+  unrelated subsystem.
+- The new foundation should preserve existing APIs while tightening vocabulary,
+  storage, projections, and downstream contracts.
+### Existing Durable Session Tables
+Relevant file:
+- `claude-task-manager/db.js`
+Existing durable data includes:
+- `ctm_sessions`
+- `agent_sessions`
+- `session_conversations`
+- `session_messages`
+- `session_messages_fts`
+- `session_analyses`
+- `session_analyses_fts`
+- `scrollback_log`
+- `startup_tasks`
+- `approval_decisions`
+- `approval_rules`
+- `permission_rules` / `perm_rules`
+- `prompt_queues`
+Implication:
+- There is already a durable conversation cache, message search index, session
+  identity model, active-session restore model, prompt queue store, and approval
+  audit log.
+- Do not add `session_live_state` or `session_turns` in the first implementation.
+- Do not duplicate `session_messages` for user/assistant conversation text.
+- The only plausible new durable table is an append-only capture/event table for
+  facts that existing tables do not own: status transitions, approval-request
+  lifecycle, tool calls/results, capture health, and monitor-agent annotations.
+### Existing Approval Path
+Relevant files:
+- `claude-task-manager/approval-agent.js`
+- `claude-task-manager/lib/session-jobs.js`
+- provider-specific approval parsers
+- `claude-task-manager/server.js`
+Observed behavior:
+- The approver has substantial logic for interpreting terminal screens and
+  provider-specific approval surfaces.
+- `approval_decisions` already audits approved/escalated decisions.
+- `approval_rules` already stores learned auto-approval rules and command
+  signatures.
+- This works as a compatibility strategy, but it couples approval automation to
+  screen rendering.
+Implication:
+- A normalized event like `approval.requested` should become the preferred input
+  to future monitor/automation, but the first version should reuse the existing
+  provider parsers and `approval_decisions` audit path.
+- Screen parsing should remain as fallback.
+### Source-Code Reuse Matrix
+| Need | Reuse first | Why |
+| --- | --- | --- |
+| CTM session identity | `ctm_sessions` | Existing root row for tab/session title, cwd, provider, starred state. |
+| Provider transcript identity | `agent_sessions` | Already maps provider session IDs to CTM IDs and stores provider, `jsonl_path`, model, branch, counts, slug. |
+| Active-session restore | `startup_tasks` | Already tracks live CTM tasks, command, cwd, model, provider type, chat/session IDs, worktree, branch. |
+| Live transcript stream | `SessionStream` | Already tails JSONL, normalizes Claude/Codex user/assistant/tool-result events, holds a ring, emits WS events/status, and builds summaries. |
+| File watching | `JsonlWatcher` + `session-jobs` reconciliation | Already combines `fs.watch`, periodic rescans, symlink hardening, compact `.bak` handling, and 10-minute reconciliation. |
+| Durable rendered conversation | `session_conversations` | Existing JSON message cache used by Review and Conversation views. |
+| Durable message search | `session_messages` + `session_messages_fts` | Existing per-message table and FTS index for deep search. Extend only if capture needs metadata columns. |
+| Historical AI analysis | `session_analyses` + FTS | Existing title/summary/topics/category table for completed/offline session analysis. |
+| Live terminal restore | `headless-term-worker` and `scrollback_log` | Existing xterm snapshot source and restart scrollback persistence. Do not use as semantic history. |
+| Status signals | `telemetry-receiver`, `status-hooks`, `SessionStream.getAllStatuses` | Existing hook/OTEL/stream status paths already feed active-session UI. |
+| Approval automation | `approval-agent`, provider parsers, `approval_decisions`, `approval_rules` | Existing parser/rule/decision pipeline should remain the execution path. |
+| Prompt queue input | `queue-engine`, `prompt_queues` | Existing queued prompts and auto-advance state can emit capture-side prompt-submitted events. |
+| Active-session UI | `stream-view.js`, `getSessionStatus`, `SessionActivityUtils` | Existing tooltip/status/grouping surface already consumes stream and authoritative status signals. |
+Reuse rule: add new schema only where the fact is not already represented above,
+or where overloading an existing table would break its current contract.
+## Online Research
+### Claude Code Transcripts
+Source: https://code.claude.com/docs/en/claude-directory
+Claude Code stores conversation history under the user-level Claude directory,
+including project-scoped transcript JSONL files. This means CTM can usually
+read a structured conversation stream without scraping terminal output.
+Design implication:
+- Prefer Claude transcript JSONL as the source for user prompts, assistant
+  responses, tool calls, tool results, and session IDs.
+- Tail incrementally by byte offset and inode.
+- Treat file rotation, compaction, and permission failures as recoverable.
+### Claude Code Hooks
+Source: https://code.claude.com/docs/en/hooks
+Claude Code supports lifecycle hooks around prompts, tool use, stop events, and
+related agent activity.
+Design implication:
+- Hooks are strong status and event signals.
+- They should not be the only source of truth because hooks can be disabled,
+  misconfigured, or fail, but they are useful for low-latency status changes and
+  approval/tool events.
+### Claude Agent SDK Sessions
+Source: https://code.claude.com/docs/en/agent-sdk/sessions
+Claude's SDK session model supports continuing and resuming sessions by ID.
+Design implication:
+- CTM should persist provider session IDs and map them to CTM session IDs.
+- Capture should keep the provider ID in every normalized event for replay and
+  cross-provider debugging.
+### Gemini CLI Save/Resume And Checkpointing
+Sources:
+- https://google-gemini.github.io/gemini-cli/docs/cli/commands.html
+- https://google-gemini.github.io/gemini-cli/docs/checkpointing.html
+Gemini CLI has explicit chat save/resume behavior and checkpointing features.
+Design implication:
+- Provider adapters must not assume every coding agent exposes a Claude-like
+  JSONL transcript.
+- The adapter boundary should allow:
+  - durable transcript tailing,
+  - exported/saved chat files,
+  - checkpoint metadata,
+  - PTY fallback.
+### GitHub Copilot CLI Chronicle
+Source: https://docs.github.com/en/copilot/concepts/agents/copilot-cli/chronicle
+GitHub Copilot CLI documents local session data and a session history concept.
+Design implication:
+- "Agent session chronicle" is a solved product pattern.
+- CTM should model a provider-neutral chronicle for coding-agent sessions, not a
+  provider-specific log parser API.
+### OpenTelemetry Logs And GenAI Events
+Sources:
+- https://opentelemetry.io/docs/specs/otel/logs/data-model/
+- https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/
+OpenTelemetry log records and GenAI semantic conventions provide useful prior
+art for event attributes, timestamps, trace/session correlation, and GenAI
+message events.
+Design implication:
+- Use OTel-inspired event fields:
+  - timestamp,
+  - severity,
+  - body,
+  - attributes,
+  - trace/span/session correlation where available.
+- Keep CTM's internal schema small, but align naming where it does not make the
+  system awkward.
+### Terminal Recording Prior Art
+Sources:
+- https://docs.asciinema.org/how-it-works/
+- https://docs.asciinema.org/manual/asciicast/v1/
+- https://github.com/microsoft/node-pty
+- https://www.npmjs.com/package/@xterm/addon-serialize/v/0.12.0
+asciinema records terminal sessions as timed output events. `node-pty` is the
+common primitive for spawning and observing terminal processes. xterm's
+serialize addon can serialize terminal buffer state.
+Design implication:
+- Terminal capture is a solved primitive for visual replay.
+- Semantic capture should not be built from ANSI parsing unless no structured
+  source is available.
+- If CTM wants later visual replay, raw PTY chunks plus periodic xterm
+  snapshots are appropriate.
+### File Watching Caveats
+Source: https://nodejs.org/api/fs.html
+Node's `fs.watch` has platform-specific caveats and can miss or coalesce events.
+Design implication:
+- Transcript tailers should combine watch notifications with periodic polling.
+- Store byte offsets.
+- Detect truncation and inode changes.
+- Make replay idempotent with provider event IDs or content hashes.
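A minimal sketch of the offset, truncation, and inode checks listed above; it is not taken from the package. `TailCursor`, `FileSnapshot`, `ReadPlan`, and `planRead` are hypothetical names, and the snapshot fields are assumed to come from an `fs.stat` call made on every watch notification and every polling tick.

```typescript
// Hypothetical types: what the tailer remembers, and what a fresh stat reports.
interface TailCursor {
  inode: number;  // inode seen on the previous read
  offset: number; // bytes already consumed from that file
}

interface FileSnapshot {
  inode: number; // e.g. fs.Stats.ino
  size: number;  // e.g. fs.Stats.size
}

interface ReadPlan {
  from: number;   // first byte to read
  to: number;     // read up to (not including) this byte
  reset: boolean; // true when the cursor had to restart from byte 0
}

// Decide which byte range to read next. Called identically from the
// fs.watch callback and from the polling timer, so a missed or coalesced
// watch event only delays the read until the next tick.
function planRead(cursor: TailCursor | undefined, file: FileSnapshot): ReadPlan {
  const reset =
    cursor === undefined ||       // first time this file is seen
    cursor.inode !== file.inode || // file was replaced or rotated
    file.size < cursor.offset;     // file was truncated or compacted
  const from = cursor !== undefined && !reset ? cursor.offset : 0;
  return { from, to: file.size, reset };
}
```

When `from` equals `to` there is nothing new to read. A reset re-reads lines the consumer may already have seen, which is why replay also needs stable event IDs.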
+## Goals
+1. Provide near-realtime normalized events for active coding sessions.
+2. Capture both user prompts and coding agent output.
+3. Support many concurrent sessions with low CPU and memory overhead.
+4. Make session status derivable from shared signals.
+5. Feed approver automation without screen scraping when possible.
+6. Generate rolling summaries and active-session tooltips.
+7. Enable monitor agents and future automation.
+8. Preserve current terminal restore behavior.
+9. Allow provider-specific adapters without leaking provider details into every
+   downstream feature.
+## Non-Goals
+1. Full visual terminal replay in the first implementation.
+2. Replacing every existing stream/status API at once.
+3. Perfect semantic parsing from raw PTY output.
+4. Capturing secrets or hidden terminal input.
+5. Running a summarizer on every output chunk.
+6. Building a distributed event bus unless local-process scaling becomes a real
+   bottleneck.
+## Proposed Architecture
+### High-Level Flow
+```text
+Existing provider files/hooks/PTY/input events
+        |
+        v
+Existing adapters:
+JsonlWatcher + SessionStream + telemetry-receiver + status-hooks + approver
+        |
+        v
+SessionCapture contract over SessionStream
+        |
+        +--> existing hot ring buffers
+        +--> existing session_conversations/session_messages projections
+        +--> optional capture_events for non-message lifecycle facts
+        +--> existing websocket/API stream
+        +--> existing active-session status/tooltip UI
+        +--> existing approver + future monitor agents
+```
+### Source Adapters
+Do not start by creating parallel adapters. Start by wrapping and extending the
+adapters that already exist.
+Initial reuse mapping:
+- `JsonlWatcher`
+  - Existing source of Claude JSONL `file-new` / `file-change` events.
+  - Already hardened against symlink/path traversal issues.
+- `SessionStream.JsonlTailer`
+  - Existing byte-offset incremental reader.
+  - Reuse it for live transcript tailing rather than adding another tailer.
+- `SessionStream._processEntry`
+  - Existing Claude user/assistant/tool-result normalizer.
+  - Extend it with normalized `kind`/`source` metadata instead of reparsing JSONL
+    somewhere else.
+- `SessionStream._processCodexEntry`
+  - Existing Codex user/assistant normalizer using `session-history.js`.
+  - Extend here for Codex capture rather than adding a second Codex parser.
+- `telemetry-receiver`
+  - Existing hook/OTEL fan-in for Claude, Codex, and Gemini status.
+  - Add capture-state notifications here if needed, but preserve the existing
+    `session.status` websocket contract.
+- `approval-agent` + provider parsers
+  - Existing screen-derived approval context parser and decision engine.
+  - Emit capture lifecycle events from this path; do not duplicate approval
+    parsing.
+- `queue-engine` and `handleInput`
+  - Existing paths that write user/queued prompts to PTY.
+  - These are useful low-latency prompt-submission signals before provider JSONL
+    catches up, but transcript events should still be the high-confidence source.
+Future adapters:
+- Gemini transcript/checkpoint adapter only if Gemini exposes a durable
+  conversation source CTM does not already ingest.
+- A provider-neutral adapter interface only after at least two providers need
+  different code paths that cannot live inside `SessionStream` cleanly.
+### Normalized Event Shape
+```ts
+type SessionCaptureEvent = {
+  id: string;
+  sessionId: string;          // CTM session id
+  provider: 'claude' | 'codex' | 'gemini' | 'unknown';
+  providerSessionId?: string;
+  source: 'transcript' | 'hook' | 'pty' | 'ctm-input' | 'telemetry';
+  kind:
+    | 'session.started'
+    | 'session.exited'
+    | 'turn.started'
+    | 'turn.completed'
+    | 'user.prompt'
+    | 'assistant.delta'
+    | 'assistant.message'
+    | 'tool.call'
+    | 'tool.result'
+    | 'approval.requested'
+    | 'approval.resolved'
+    | 'status.changed'
+    | 'error'
+    | 'heartbeat';
+  createdAt: string;          // ISO timestamp
+  observedAt: string;         // ISO timestamp when CTM observed it
+  sequence: number;           // per CTM session monotonic sequence
+  turnId?: string;
+  parentId?: string;
+  text?: string;
+  data?: Record<string, unknown>;
+  confidence: 'high' | 'medium' | 'low';
+};
+```
+Important rules:
+- `createdAt` comes from the provider if available.
+- `observedAt` always comes from CTM.
+- `sequence` is assigned by CTM after normalization.
+- `confidence` lets downstream users prefer transcript events over PTY fallback.
+- Large raw payloads should not be copied into every event. Store compact text
+  and structured metadata; keep raw payload references when needed.
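The timestamp, sequence, and confidence rules above can be sketched as follows. This is an illustration, not package code: `RawObservation`, `StampedEvent`, and `Sequencer` are hypothetical names, and the confidence assigned to `hook`, `telemetry`, and `ctm-input` is an assumption. The design only implies that transcript events rank above PTY fallback.

```typescript
type Source = 'transcript' | 'hook' | 'pty' | 'ctm-input' | 'telemetry';
type Confidence = 'high' | 'medium' | 'low';

// Hypothetical input: what an adapter hands over before CTM stamps it.
interface RawObservation {
  sessionId: string;
  source: Source;
  kind: string;
  providerTimestamp?: string; // ISO timestamp, when the provider supplies one
}

interface StampedEvent {
  sessionId: string;
  source: Source;
  kind: string;
  createdAt: string;
  observedAt: string;
  sequence: number;
  confidence: Confidence;
}

// Assumed mapping; only "transcript above pty" follows from the design.
const CONFIDENCE_BY_SOURCE: Record<Source, Confidence> = {
  transcript: 'high',
  hook: 'medium',
  telemetry: 'medium',
  'ctm-input': 'medium',
  pty: 'low',
};

class Sequencer {
  // Last sequence number issued per CTM session.
  private last = new Map<string, number>();

  stamp(raw: RawObservation, now: Date = new Date()): StampedEvent {
    const observedAt = now.toISOString(); // always CTM's clock
    const sequence = (this.last.get(raw.sessionId) ?? 0) + 1;
    this.last.set(raw.sessionId, sequence);
    return {
      sessionId: raw.sessionId,
      source: raw.source,
      kind: raw.kind,
      // Provider time when available, otherwise fall back to CTM's clock.
      createdAt: raw.providerTimestamp ?? observedAt,
      observedAt,
      sequence,
      confidence: CONFIDENCE_BY_SOURCE[raw.source],
    };
  }
}
```

Stamping after normalization keeps `sequence` monotonic per CTM session even when sources deliver events out of provider-time order.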
+### Event Identity And Idempotency
+Events need stable IDs because file watchers can replay lines.
+Preferred ID order:
+1. Provider event/message ID if available.
+2. Provider session ID plus transcript byte offset.
+3. Hash of provider session ID, source, kind, timestamp, and compact payload.
+The event bus should dedupe per session by event ID.
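One way the preferred ID order and per-session dedupe could look, as a sketch under assumptions rather than the package's implementation. `IdInput`, `deriveEventId`, and `SessionDeduper` are hypothetical names, the ID prefixes are invented for readability, and a 32-bit FNV-1a hash stands in for whichever hash CTM would actually use so the example needs no imports.

```typescript
// Hypothetical input: the fields the ID rules above refer to.
interface IdInput {
  providerEventId?: string;
  providerSessionId?: string;
  byteOffset?: number; // transcript byte offset of the source line
  source: string;
  kind: string;
  timestamp: string;
  payload: string;     // compact payload text
}

// 32-bit FNV-1a over UTF-16 code units; a dependency-free stand-in only.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

function deriveEventId(input: IdInput): string {
  // 1. Provider event/message ID if available.
  if (input.providerEventId) return `evt:${input.providerEventId}`;
  // 2. Provider session ID plus transcript byte offset.
  if (input.providerSessionId && input.byteOffset !== undefined) {
    return `off:${input.providerSessionId}:${input.byteOffset}`;
  }
  // 3. Hash of session ID, source, kind, timestamp, and compact payload.
  const parts = [
    input.providerSessionId ?? '',
    input.source,
    input.kind,
    input.timestamp,
    input.payload,
  ];
  return `hash:${fnv1a(parts.join('\u0000'))}`;
}

class SessionDeduper {
  private seen = new Map<string, Set<string>>();

  // Returns true the first time an ID is seen for a session, false on replay.
  accept(sessionId: string, eventId: string): boolean {
    let ids = this.seen.get(sessionId);
    if (!ids) {
      ids = new Set<string>();
      this.seen.set(sessionId, ids);
    }
    if (ids.has(eventId)) return false;
    ids.add(eventId);
    return true;
  }
}
```

The seen-ID set in this sketch grows without bound; a real event bus would likely cap it, for example to the IDs still present in the session's ring buffer.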
+### Hot In-Memory State
+Maintain a compact per-session live object:
+```ts
+type SessionLiveState = {
+  sessionId: string;
+  status: 'running' | 'waiting' | 'waiting_approval' | 'idle' | 'exited' | 'unknown';
+  provider: string;
+  providerSessionId?: string;
+  lastEventAt?: string;
+  lastUserPromptAt?: string;
+  lastAssistantOutputAt?: string;
+  activeTurnId?: string;
+  pendingApprovalId?: string;
+  recentEvents: RingBuffer<SessionCaptureEvent>;
+  recentText: RingBuffer<CompactTextChunk>;
+  recentTurns: RingBuffer<SessionTurn>;
+  summary?: SessionSummary;
+};
+```
+Default ring sizes:
+- recent events: 200 per session
+- recent text chunks: 64 KB per session
+- recent turns: 10 per session
+For 100 active sessions, these defaults cap the rings at about 20,000 events
+and 6.4 MB of recent text in total, which is small enough for one Node process
+if events are compact and raw PTY chunks are not duplicated.
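The `RingBuffer<T>` referenced in `SessionLiveState` is not defined in this document. A count-bounded version might look like the sketch below, which is an assumption about its shape and not the existing `SessionStream` ring. The text-chunk ring is bounded by bytes (64 KB), not by count, and would need an extra size check that this sketch leaves out.

```typescript
// Fixed-capacity buffer that overwrites its oldest entry once full.
class RingBuffer<T> {
  private items: T[] = [];
  private start = 0; // index of the oldest item once the buffer is full
  private readonly capacity: number;

  constructor(capacity: number) {
    if (capacity < 1) throw new RangeError('capacity must be at least 1');
    this.capacity = capacity;
  }

  push(item: T): void {
    if (this.items.length < this.capacity) {
      this.items.push(item);
      return;
    }
    // Full: replace the oldest slot and advance the start pointer.
    this.items[this.start] = item;
    this.start = (this.start + 1) % this.capacity;
  }

  // Items in insertion order, oldest first.
  toArray(): T[] {
    return this.items.slice(this.start).concat(this.items.slice(0, this.start));
  }

  get size(): number {
    return this.items.length;
  }
}

// Example: the default "recent turns" ring holds 10 entries per session.
const recentTurnIds = new RingBuffer<string>(10);
for (let i = 1; i <= 12; i++) recentTurnIds.push(`turn-${i}`);
// recentTurnIds.toArray() now starts at "turn-3" and ends at "turn-12".
```

Overwriting in place keeps memory flat per session and avoids the array shifting that a `push`/`shift` queue would do on every event.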
+### Durable Storage
+Reuse existing tables first.
+Keep using:
+- `ctm_sessions` for CTM tab/session identity.
+- `agent_sessions` for provider session identity, provider type, transcript
+  path, model, branch, file size, modified time, and user-message count.
+- `startup_tasks` for active process restore and live task metadata.
+- `session_conversations` for rendered durable conversation JSON.
+- `session_messages` and `session_messages_fts` for durable per-message search.
+- `session_analyses` and `session_analyses_fts` for completed-session summaries,
+  title/category/topic analysis, and search enrichment.
+- `approval_decisions` for approval audit records.
+- `approval_rules` and `perm_rules` / `permission_rules` for approval policy.
+- `prompt_queues` for queued prompt state.
+- `scrollback_log` only for restart scrollback, not semantic history.
+Do not add in phase 1:
+- `session_live_state`: `SessionStream`, `sessions`, frontend `_streamStatus`,
+  and authoritative `session.status` already form the live state projection.
+- `session_turns`: turns can be computed from `session_messages` and the
+  existing `SessionStream.userPromptCache` for the first tooltip/monitoring
+  use cases.
+- A second message table: `session_messages` already exists for per-message
+  persistence/search.
+Possible later addition:
+```sql
+CREATE TABLE IF NOT EXISTS session_capture_events (
+  id TEXT PRIMARY KEY,
+  ctm_session_id TEXT NOT NULL,
+  agent_session_id TEXT,
+  provider TEXT,
+  source TEXT NOT NULL,
+  kind TEXT NOT NULL,
+  sequence INTEGER NOT NULL,
+  parent_id TEXT,
+  provider_event_id TEXT,
+  created_at TEXT,
+  observed_at TEXT NOT NULL,
+  text TEXT,
+  data_json TEXT,
+  confidence TEXT NOT NULL,
+  inserted_at TEXT DEFAULT (datetime('now')),
+  UNIQUE(ctm_session_id, sequence)
+);
+CREATE INDEX IF NOT EXISTS idx_capture_events_session_kind_time
+  ON session_capture_events(ctm_session_id, kind, observed_at);
+```
+Only add this table when there is a real consumer for durable non-message
+events. Good candidates:
+- `approval.requested`
+- `approval.resolved`
+- `status.changed`
+- `tool.call`
+- `tool.result`
+- `session.started`
+- `session.exited`
+- monitor-agent annotations
+For user and assistant text, continue writing through `session_conversations`
+and `session_messages`; duplicating full message text into
+`session_capture_events` would increase storage and create two sources of truth.
+### Status Projection
+Session status is already a derived projection with multiple inputs:
+- `telemetry-receiver` emits authoritative `session.status` from hooks/OTEL.
+- `SessionStream` emits `stream-status` from JSONL and filtered PTY activity.
+- `status-hooks` owns an idle/busy/waiting-input state bus for user hooks.
+- The frontend `getSessionStatus` merges authoritative status, stream status,
+  and local PTY fallback.
+The capture foundation should consolidate vocabulary and expose the evidence
+behind the status, not replace all of those paths in one patch.
+Suggested status vocabulary:
+- `running`
+  - Recent assistant/tool output, active hook event, or active turn.
+- `waiting`
+  - Provider is waiting for user prompt and no approval is pending.
+- `waiting_approval`
+  - A high-confidence approval request is pending.
+- `idle`
+  - No recent output for a configurable window, process still alive, no known
+    prompt/approval wait state.
+- `exited`
+  - PTY/process ended.
+- `unknown`
+  - Session exists but capture has insufficient evidence.
+Priority order:
+1. exited
+2. waiting_approval
+3. running
+4. waiting
+5. idle
+6. unknown
+Recommended idle thresholds:
+- 30 seconds with no output while an active turn is open: still `running`.
+- 2 minutes with no output and no open turn: `idle`.
+- Provider-specific prompt-ready signal: `waiting`.
+Avoid deriving `waiting` solely from "no output"; that confuses long-running
+commands with user-wait states.
+Mapping to existing vocabulary:
+- `SessionStream.running` maps to capture `running`.
+- `SessionStream.waiting` maps to capture `waiting`.
+- `SessionStream.idle` maps to capture `idle`.
+- `SessionStateBus.busy` maps to capture `running`.
+- `SessionStateBus.waiting_input` maps to capture `waiting` or
+  `waiting_approval` when the source reason is an approval prompt.
+- `session.status working=true` maps to capture `running`.
+- `session.status working=false` maps to `waiting` or `idle` depending on prompt
+  evidence.
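The priority order and idle thresholds above can be sketched as a small pure reducer. This is an illustration under assumptions, not the existing `getSessionStatus` logic: the evidence field names (`processExited`, `pendingApproval`, `turnOpen`, `working`, `promptReady`, `lastOutputAt`) are hypothetical, and only the status names, their priority, and the 2-minute idle window come from this section.

```javascript
// Priority order from the section above: earlier entries win.
const STATUS_PRIORITY = ['exited', 'waiting_approval', 'running', 'waiting', 'idle', 'unknown'];

const IDLE_AFTER_MS = 2 * 60 * 1000; // 2 minutes with no output and no open turn

// Hypothetical evidence shape; none of these field names come from the codebase.
function collectCandidates(evidence, now) {
  const candidates = [];
  if (evidence.processExited) candidates.push('exited');
  if (evidence.pendingApproval) candidates.push('waiting_approval');
  // An open turn keeps the session `running` even when output is quiet.
  if (evidence.turnOpen || evidence.working) candidates.push('running');
  // `waiting` requires an explicit prompt-ready signal, never just silence.
  if (evidence.promptReady) candidates.push('waiting');
  const quietMs = evidence.lastOutputAt == null ? null : now - evidence.lastOutputAt;
  if (quietMs !== null && quietMs >= IDLE_AFTER_MS && !evidence.turnOpen) candidates.push('idle');
  return candidates;
}

function reduceStatus(evidence, now = Date.now()) {
  const candidates = collectCandidates(evidence, now);
  return STATUS_PRIORITY.find((status) => candidates.includes(status)) || 'unknown';
}
```

Keeping the reducer pure (evidence in, status out) makes the status transition tests in the migration plan straightforward to write.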
+### Turn Projection
+Turns are the user-facing unit for summaries and tooltips.
+A turn starts on:
+- `user.prompt`
+A turn may include:
+- assistant deltas/messages,
+- tool calls/results,
+- approvals,
+- errors,
+- status changes.
+A turn completes on:
+- provider stop event,
+- hook stop event,
+- next user prompt,
+- session exit,
+- timeout plus prompt-ready signal.
+Store:
+- prompt text,
+- compact assistant text,
+- tool call summary,
+- approval summary,
+- status,
+- started/completed timestamps.
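The turn rules above can be sketched as a reducer over normalized events. Only `user.prompt`, `tool.call`, `approval.requested`, and `session.exited` are event kinds named in this document; `provider.stop`, `hook.stop`, and `assistant.message`, along with the `observedAt` field, are illustrative assumptions. The "timeout plus prompt-ready signal" completion path is left out for brevity.

```javascript
// Event kinds that close an open turn (partly assumed names, see above).
const TURN_END_KINDS = new Set(['provider.stop', 'hook.stop', 'session.exited']);

function newTurn(event) {
  return {
    prompt: event.text || '',
    assistantText: '',
    toolCalls: 0,
    approvals: 0,
    status: 'open',
    startedAt: event.observedAt,
    completedAt: null,
  };
}

function reduceTurns(events) {
  const turns = [];
  let current = null;
  const close = (at) => {
    if (!current) return;
    current.status = 'completed';
    current.completedAt = at;
    current = null;
  };
  for (const event of events) {
    if (event.kind === 'user.prompt') {
      close(event.observedAt); // the next user prompt completes the previous turn
      current = newTurn(event);
      turns.push(current);
    } else if (!current) {
      continue; // ignore events that arrive before any prompt
    } else if (TURN_END_KINDS.has(event.kind)) {
      close(event.observedAt);
    } else if (event.kind === 'assistant.message') {
      current.assistantText += event.text || '';
    } else if (event.kind === 'tool.call') {
      current.toolCalls += 1;
    } else if (event.kind === 'approval.requested') {
      current.approvals += 1;
    }
  }
  return turns;
}
```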
+### API And Websocket Surface
+Keep existing stream APIs as the public compatibility surface. Add capture names
+only when a new consumer needs them.
+Existing endpoints to reuse:
+- `GET /api/stream/status`
+- `GET /api/sessions/:id/stream`
+- `GET /api/sessions/:id/summary`
+- `GET /api/session/messages`
+Existing websocket events to reuse:
+- `subscribe-stream`
+- `stream-init`
+- `stream-event`
+- `stream-status`
+- `session.status`
+- `waiting-for-input`
+- `approval-decision`
+Possible later additions:
+- `GET /api/sessions/:id/capture/events?after=<sequence>&limit=<n>` only after
+  `session_capture_events` exists.
+- `capture-event` only for non-message lifecycle events that do not fit the
+  existing stream-event contract.
+- Do not force the frontend migration into the first backend patch.
+## Downstream Consumers
+### Approver
+Current pain:
+- Approval automation has to inspect output directly.
+- Terminal rendering is not a stable API.
+New path:
+1. Provider transcript or hook emits `approval.requested`.
+2. Capture projection stores pending approval with command/action metadata.
+3. Approver consumes pending approval event.
+4. Approver sends decision through the existing provider-specific input path.
+5. Capture emits `approval.resolved`.
+Fallback:
+- If no structured approval event arrives but the screen parser finds an
+  approval prompt, emit a low-confidence `approval.requested` event sourced from
+  `pty`.
+This lets approval automation move to structured events without losing
+compatibility.
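A minimal sketch of the structured path plus the PTY fallback described above. `ApprovalTracker` and its method names are hypothetical and are not part of `approval-agent`. The `'high'`/`'low'` confidence labels and the `'transcript'` source value are also assumptions.

```javascript
// Hypothetical tracker; field names and confidence labels are assumptions.
class ApprovalTracker {
  constructor() {
    this.pending = new Map(); // sessionId -> pending approval.requested event
    this.emitted = [];        // stand-in for publishing to the capture bus
  }

  // Structured request from a provider transcript or hook.
  onStructuredRequest(sessionId, command) {
    this._request(sessionId, command, 'transcript', 'high');
  }

  // Screen-parser fallback: ignored when a request is already pending.
  onScreenPrompt(sessionId, command) {
    if (this.pending.has(sessionId)) return;
    this._request(sessionId, command, 'pty', 'low');
  }

  // Called after the approver has sent its decision to the provider.
  onDecision(sessionId, decision) {
    const request = this.pending.get(sessionId);
    if (!request) return;
    this.pending.delete(sessionId);
    this.emitted.push({ kind: 'approval.resolved', sessionId, decision, source: request.source });
  }

  _request(sessionId, command, source, confidence) {
    const event = { kind: 'approval.requested', sessionId, command, source, confidence };
    this.pending.set(sessionId, event);
    this.emitted.push(event);
  }
}
```

Consumers can then decide per policy whether a low-confidence, `pty`-sourced request is enough to act on.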
+### Active Sessions UI
+Use existing active-session state for:
+- status pill,
+- activity timestamp,
+- pending approval badge,
+- recent prompt preview,
+- tooltip summary,
+- "what is this session doing?" detail.
+The screenshot in the original request shows active sessions as the natural
+surface for this. The UI should not need to read raw terminal output to populate
+those affordances.
+Current implementation already covers much of this:
+- `stream-view.js` fetches `/api/sessions/:id/summary` on hover.
+- `getSessionStatus` uses authoritative status and stream status before PTY
+  fallback.
+- Active sessions are grouped by Running, Waiting, Idle, Exited.
+Recommended change:
+- Improve the existing summary/status payload and tooltip rather than adding a
+  new `session_live_state` table.
+### Rolling Summaries
+Generate summaries over the last N turns, default N = 5.
+Inputs:
+- prompt text,
+- assistant summary text,
+- tool calls,
+- approvals,
+- errors,
+- status.
+Trigger policy:
+- On completed turn.
+- On session exit.
+- On demand if summary missing.
+- Rate limit per session, for example no more than once every 30 seconds unless
+  a turn completes.
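One possible reading of this trigger policy, as a sketch. The trigger names and the `state` fields (`summary`, `lastSummaryAt`) are hypothetical. Whether session exit should also bypass the rate limit is not specified above, so here only a completed turn does.

```javascript
const MIN_INTERVAL_MS = 30 * 1000; // example rate limit from the policy above

// Hypothetical helper: decides whether to regenerate a session's summary.
function shouldSummarize(state, trigger, now = Date.now()) {
  if (trigger === 'turn.completed') return true; // bypasses the rate limit
  const elapsed = state.lastSummaryAt == null ? Infinity : now - state.lastSummaryAt;
  if (elapsed < MIN_INTERVAL_MS) return false;
  if (trigger === 'session.exited') return true;
  if (trigger === 'on-demand') return !state.summary; // only when missing
  return false;
}
```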
+Storage:
+- Keep live tooltip summary in `SessionStream.cachedSummary`.
+- For completed/offline analysis, reuse `session_analyses.summary`.
+- If live summaries need restart persistence later, first consider adding
+  `live_summary`, `live_summary_updated_at`, and `live_summary_source` to
+  `session_conversations` or `agent_sessions` before creating a new table.
+Recommended tooltip shape:
+```text
+Current: waiting for approval to run npm test
+Recent: implemented stream parser changes, fixed status projection, ran server tests
+Last prompt: "Add capture foundation design"
+```
+Keep this short. The active sessions UI needs a preview, not a full transcript.
+### Monitor Agent
+The monitor agent should subscribe to capture state/events instead of watching
+the terminal.
+Potential monitor behaviors:
+- Detect stuck sessions:
+  - active turn open,
+  - no output for threshold,
+  - no long-running command marker,
+  - process alive.
+- Detect repeated failures:
+  - repeated test failures,
+  - repeated same command,
+  - repeated edit/revert loops.
+- Detect approval backlog:
+  - pending approval too long,
+  - approval requested for risky command,
+  - multiple sessions blocked.
+- Detect user attention needs:
+  - provider asks a question,
+  - merge conflict appears,
+  - auth/login failure,
+  - rate limit,
+  - network failure.
+- Detect completion:
+  - task summary emitted,
+  - tests pass,
+  - no active turn,
+  - provider waiting.
+Actions should start conservative:
+- annotate session,
+- update tooltip/status,
+- notify user,
+- request approval,
+- queue a suggested next prompt.
+Autonomous intervention should require a separate policy layer.
+## Additional Product Opportunities
+The same capture foundation can support:
+- Cross-session command center:
+  - "show sessions waiting on me"
+  - "show sessions running tests"
+  - "show sessions with failures"
+- Global work feed:
+  - timeline of prompts, tool calls, approvals, completions.
+- Session search:
+  - search recent prompts, assistant output, commands, errors.
+- Smart resume:
+  - restore last five turns plus summary when reopening a session.
+- Automated handoff:
+  - create concise handoff note from capture history.
+- Cost/performance attribution:
+  - estimate expensive sessions by duration, tool volume, token metadata where
+    providers expose it.
+- Quality analytics:
+  - number of turns to completion,
+  - test-fix loops,
+  - approval latency,
+  - idle time.
+- Failure taxonomy:
+  - auth,
+  - missing dependency,
+  - test failure,
+  - type error,
+  - merge conflict,
+  - rate limit,
+  - provider crash.
+- Alerting:
+  - notify when a long job completes,
+  - notify when a session waits for input,
+  - notify when a session reaches a risky approval.
+- Dataset creation:
+  - collect sanitized prompt/output/tool sequences for internal evaluation.
+- Replay and audit:
+  - reconstruct what happened without needing terminal scrollback.
+- Better auto-titles:
+  - derive active-session names from last user intent and current activity.
+- Multi-agent coordination:
+  - let one supervisor understand what multiple coding agents are doing.
+## Efficiency Design
+### Principle 1: Prefer Structured Deltas
+Provider transcript lines are usually much smaller and more meaningful than PTY
+screen updates. Tail transcript files incrementally and store offsets.
+### Principle 2: Avoid Re-Parsing Per Consumer
+Adapters parse once, then publish normalized events. Consumers subscribe to the
+bus or query projections.
+### Principle 3: Keep Raw Payloads Out Of Hot Paths
+Do not store full raw provider messages in every in-memory projection. Keep:
+- compact text,
+- structured metadata,
+- pointer to raw source if needed,
+- rolling rings.
+### Principle 4: Batch Durable Writes
+Use small batch inserts for high-volume streams:
+- flush every 100 to 250 ms,
+- or every N events,
+- whichever comes first.
+Status updates can be coalesced because only the latest live projection matters.
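The "time or count, whichever comes first" flush rule can be sketched as follows. `BatchWriter` is a hypothetical name, and the `write` callback stands in for whatever batched insert the database layer provides. The default of 50 events is an arbitrary placeholder for N.

```javascript
// Hypothetical batcher: flushes on a size threshold or after a short delay.
class BatchWriter {
  constructor({ write, maxEvents = 50, maxDelayMs = 250 }) {
    this.write = write;           // function(events[]) performing one batched insert
    this.maxEvents = maxEvents;
    this.maxDelayMs = maxDelayMs;
    this.buffer = [];
    this.timer = null;
  }

  push(event) {
    this.buffer.push(event);
    if (this.buffer.length >= this.maxEvents) {
      this.flush(); // count threshold reached first
    } else if (!this.timer) {
      // Start the delay window on the first buffered event.
      this.timer = setTimeout(() => this.flush(), this.maxDelayMs);
    }
  }

  flush() {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.write(batch);
  }
}
```

Calling `flush()` on session exit or process shutdown would drain whatever is still buffered.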
+### Principle 5: Backpressure And Drop Policy
+For each session:
+- never drop durable high-value events:
+  - user prompts,
+  - assistant final messages,
+  - approvals,
+  - errors,
+  - status transitions,
+  - tool calls/results.
+- allow coalescing of:
+  - assistant deltas,
+  - heartbeat,
+  - raw PTY snippets.
+If a consumer is slow, it should resume from the existing stream snapshot plus
+`session_conversations`/`session_messages`. If the optional
+`session_capture_events` table is added later, non-message event consumers can
+resume from that table by sequence.
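A sketch of the coalescing half of this policy for a per-consumer queue. The kind names `assistant.delta`, `heartbeat`, and `pty.snippet` are assumed labels for the three coalescible categories listed above. For simplicity this version always merges adjacent coalescible events rather than only doing so under backpressure, and it never drops any other kind.

```javascript
// Assumed kind names for the coalescible categories listed above.
const COALESCIBLE = new Set(['assistant.delta', 'heartbeat', 'pty.snippet']);

function enqueue(queue, event) {
  const last = queue[queue.length - 1];
  if (COALESCIBLE.has(event.kind) && last && last.kind === event.kind) {
    if (event.kind === 'assistant.delta') {
      last.text += event.text; // concatenate adjacent deltas into one entry
    } else {
      queue[queue.length - 1] = { ...event }; // keep only the latest heartbeat/snippet
    }
    return queue;
  }
  // Durable kinds (prompts, approvals, errors, ...) are always appended.
  queue.push({ ...event });
  return queue;
}
```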
+### Principle 6: Lazy Summarization
+Summarization is the expensive part. It should run:
+- after turn completion,
+- with rate limits,
+- on compact turn text,
+- only for sessions visible in active UI or recently active unless explicitly
+  requested.
+### Principle 7: Watch Plus Poll
+Use file watch notifications for latency, but backstop with polling because
+watchers can miss events. Store offsets and detect truncation/rotation.
+Recommended default:
+- watch transcript directory where possible.
+- poll active transcript files every 500 to 1000 ms.
+- use exponential slowdown for idle/exited sessions.
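The offset and truncation handling can be sketched with plain `fs` calls. `readNewLines` and the `state` shape are hypothetical. The same function could be invoked from both a watcher callback and a polling timer, since repeated calls with no new data return an empty list. Truncation is detected only by the file shrinking, so a rotated file that is already larger than the stored offset would need an additional check (for example, comparing inode numbers), which is not shown.

```javascript
const fs = require('node:fs');

// Reads complete lines appended since `state.offset`.
// `state` starts as { offset: 0, partial: Buffer.alloc(0) }.
function readNewLines(filePath, state) {
  const size = fs.statSync(filePath).size;
  if (size < state.offset) {
    // File shrank: treat as truncation/rotation and restart from the top.
    state.offset = 0;
    state.partial = Buffer.alloc(0);
  }
  if (size === state.offset) return [];
  const buffer = Buffer.alloc(size - state.offset);
  const fd = fs.openSync(filePath, 'r');
  let bytesRead;
  try {
    bytesRead = fs.readSync(fd, buffer, 0, buffer.length, state.offset);
  } finally {
    fs.closeSync(fd);
  }
  state.offset += bytesRead;
  // Work on bytes so a multi-byte character split across reads stays intact.
  const chunk = Buffer.concat([state.partial, buffer.subarray(0, bytesRead)]);
  const lastNewline = chunk.lastIndexOf(0x0a);
  if (lastNewline === -1) {
    state.partial = chunk; // no complete line yet
    return [];
  }
  state.partial = chunk.subarray(lastNewline + 1);
  return chunk.subarray(0, lastNewline).toString('utf8').split('\n').filter(Boolean);
}
```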
+## Privacy And Safety
+Capture will contain user prompts, file paths, commands, errors, and possibly
+secrets accidentally pasted into a terminal.
+Rules:
+- Do not capture hidden password input.
+- Redact known secret patterns before summaries or monitor-agent prompts.
+- Keep raw event retention configurable.
+- Separate local-only raw capture from any exported telemetry.
+- Mark low-confidence PTY-derived text so automations avoid over-trusting it.
+- Allow per-session capture disablement if needed.
+Redaction should apply before:
+- summaries,
+- monitor-agent prompts,
+- notifications,
+- external exports.
+Durable local event storage can retain original text by default if CTM already
+stores local transcripts, but exports should use redacted text.
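A sketch of redaction applied at the export boundary, leaving locally stored text untouched. The three patterns are illustrative only and far from complete; `redact` and `toExport` are hypothetical names.

```javascript
// Illustrative patterns only; a real deployment would maintain its own list.
function redact(text) {
  return text
    // AWS access key ID shape: "AKIA" followed by 16 uppercase letters/digits.
    .replace(/\bAKIA[0-9A-Z]{16}\b/g, '[REDACTED]')
    // key=value or key: value assignments with a sensitive-looking key name.
    .replace(/\b(password|token|secret|api[_-]?key)(\s*[=:]\s*)\S+/gi, '$1$2[REDACTED]')
    // HTTP bearer credentials.
    .replace(/\bBearer\s+\S+/g, 'Bearer [REDACTED]');
}

// Returns a redacted copy for summaries, notifications, and exports;
// the original event object is not modified.
function toExport(event) {
  return { ...event, text: event.text == null ? event.text : redact(event.text) };
}
```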
+## Migration Plan
+### Phase 0: Inventory And Contracts
+- Document existing stream event shapes.
+- Document current session status values and frontend consumers.
+- Confirm provider transcript paths for Claude, Codex, and Gemini in this
+  installation.
+- Treat `SessionStream` as the live bus unless a concrete incompatibility is
+  found.
+- Treat `session_conversations` and `session_messages` as the durable message
+  projection.
+- Do not add schema until a consumer needs data that the existing tables cannot
+  represent.
+Deliverable:
+- design doc and implementation checklist.
+### Phase 1: Capture Core
+- Add a small `session-capture` module or wrapper that delegates to
+  `SessionStream` rather than competing with it.
+- Define a normalized event vocabulary as an adapter over existing
+  `stream-event` payloads.
+- Add `kind`, `source`, `confidence`, and provider metadata to emitted stream
+  events where possible without breaking the frontend.
+- Reuse existing ring buffers and `getRecentEvents`.
+- Reuse existing `/api/stream/status`, `/api/sessions/:id/stream`, and
+  websocket `stream-*` contracts.
+No UI rewrite yet.
+### Phase 2: Status Projection
+- Write a reducer that merges existing evidence:
+  - `SessionStream` status,
+  - `telemetry-receiver` `session.status`,
+  - `status-hooks` state,
+  - approval detection,
+  - prompt-ready fallback.
+- Keep existing frontend status behavior, but expose debug/evidence fields for
+  monitor agents.
+- Add tests for status transitions:
+  - `running`,
+  - `waiting`,
+  - `waiting_approval`,
+  - `idle`,
+  - `exited`.
+### Phase 3: Approver Integration
+- Emit normalized approval lifecycle events from the existing approver path.
+- Keep `approval-agent` as the execution and parsing owner.
+- Keep writing `approval_decisions`.
+- Consider `decision='pending'` only if the UI/monitor needs a durable pending
+  record; otherwise keep pending approval in hot capture state.
+- Add tests for approval request/resolution lifecycle.
+### Phase 4: Summaries And Tooltips
+- Extend existing `SessionStream.getSummary` from the current prompt-cache model
+  toward a last-five-prompt/turn model.
+- Reuse the existing `/api/sessions/:id/summary` endpoint and stream tooltip.
+- Reuse `session_analyses.summary` for completed/offline sessions.
+- Add rate limits and redaction.
+### Phase 5: Monitor Agent Substrate
+- Add capture subscription for monitor agent.
+- Start with read-only classification and notifications.
+- Add policy-gated actions later.
+### Phase 6: Provider Expansion And Hardening
+- Harden Codex and Gemini adapters.
+- Add replay from durable events on restart.
+- Add capture health dashboard:
+  - active adapters,
+  - lag,
+  - dropped/coalesced events,
+  - last event time,
+  - errors.
+Only in this phase should we revisit a durable `session_capture_events` table,
+and only if monitor/replay/approval consumers need non-message event history
+that cannot be reconstructed from transcripts plus existing tables.
+## Testing Strategy
+### Unit Tests
+- Event normalization per provider.
+- Event ID/idempotency logic.
+- Status reducer.
+- Turn reducer.
+- Summary input builder.
+- Redaction.
+### Integration Tests
+- Tail a fixture transcript and verify emitted events.
+- Simulate file truncation/rotation.
+- Simulate duplicate watcher notifications.
+- Simulate PTY fallback when transcript is absent.
+- Verify websocket subscribers receive ordered events.
+- Verify that a slow consumer can catch up by sequence.
+### Regression Tests
+- Existing session stream APIs still work.
+- Existing terminal scrollback restore still works.
+- Existing approver screen parsing still works as fallback.
+### Performance Tests
+Simulate:
+- 10 active sessions,
+- 50 active sessions,
+- 100 active sessions.
+Measure:
+- CPU while tailing idle files,
+- CPU while streaming output,
+- memory per session,
+- database write rate,
+- websocket fanout overhead,
+- summary invocation count.
+Target:
+- Near-zero CPU for idle sessions.
+- No per-consumer transcript parsing.
+- Bounded memory per session.
+- Durable writes batched under load.
+## Open Decisions
+1. Should durable non-message events get a new table, or can we stay entirely on
+   existing tables for now?
+   Recommendation: stay on existing tables for phase 1. Add
+   `session_capture_events` later only for non-message events such as approval
+   request lifecycle, status transitions, monitor annotations, and tool events.
+   Keep `session_messages` as the user/assistant text projection.
+2. Should summaries be generated locally by the coding-agent provider, by CTM's
+   chosen model, or by a lightweight heuristic first?
+   Recommendation: reuse `SessionStream`'s current cloud/local/fallback summary
+   tiers, but change the input from "last prompt cache only" toward the last five
+   prompt/assistant turns where available. Keep completed-session summaries in
+   `session_analyses`.
+3. How much raw assistant delta text should be stored?
+   Recommendation: keep final/rendered messages in `session_conversations` and
+   `session_messages`; keep deltas in the existing hot ring only. Do not persist
+   token-level deltas unless visual replay becomes a goal.
+4. How should monitor agents be allowed to act?
+   Recommendation: start read-only. Add action policies separately.
+5. Should capture run for exited sessions?
+   Recommendation: no active watchers after exit. Keep durable replay and final
+   summary generation through `session_conversations`, `session_messages`, and
+   `session_analyses`.
+## Main Risks
+### Provider Format Drift
+Provider transcript schemas can change.
+Mitigation:
+- Keep adapters small and fixture-tested.
+- Preserve unknown payload fields in `data`.
+- Use confidence levels.
+### Watcher Misses
+File watching can miss changes.
+Mitigation:
+- Watch plus poll.
+- Store offsets.
+- Detect truncation/rotation.
+- Make parsing idempotent.
+### Over-Capture
+Capturing everything can create privacy and performance problems.
+Mitigation:
+- Compact events.
+- Redact summaries/exports.
+- Bound hot buffers.
+- Configurable retention.
+### Status Misclassification
+Idle/running/waiting can be ambiguous.
+Mitigation:
+- Prefer explicit provider/hook signals.
+- Use priority reducer.
+- Expose confidence and "last evidence" in debug state.
+### Duplicate Systems
+Adding a new foundation could duplicate existing `session-stream.js`.
+Mitigation:
+- Build by evolving/wrapping the existing stream layer.
+- Keep old APIs as compatibility wrappers.
+- Move consumers gradually.
+## Recommended First Implementation
+The first patch should be deliberately small:
+1. Add a `session-capture` adapter module that wraps `SessionStream` and
+   existing status/approval signals.
+2. Add normalized event vocabulary fields to `SessionStream` events in a
+   backwards-compatible way.
+3. Add a status evidence reducer that consumes existing `stream-status`,
+   `session.status`, `waiting-for-input`, and approval signals.
+4. Reuse existing `/api/stream/status`, `/api/sessions/:id/stream`, and
+   `/api/sessions/:id/summary`; add no tables in the first patch.
+5. Add tests around the reducer and around event compatibility with
+   `stream-view.js`.
+Then:
+1. Extend `SessionStream.getSummary` to use the last five meaningful
+   prompt/assistant turns.
+2. Emit approval lifecycle events from the existing approver path.
+3. Teach monitor-agent code to consume the capture adapter.
+4. Add `session_capture_events` only if a durable monitor/replay use case needs
+   non-message event history.
+This lets the foundation prove value quickly without destabilizing terminal
+restore or provider-specific session handling.
+## Bottom Line
+The codebase already has more than raw ingredients: it has a live structured
+stream (`SessionStream`), durable conversation/message tables, FTS search,
+hook/OTEL state, active-session status UI, hover summaries, prompt queues,
+headless terminal snapshots, restart scrollback, and approval automation. The
+missing layer is a provider-neutral capture contract that reuses those systems
+and fills only the gaps.
+Build `SessionCapture` as that contract. Treat `SessionStream` as the live bus,
+structured provider transcripts and hooks as primary evidence, PTY output as
+fallback, existing message tables as durable conversation storage, and any new
+event table as a later, narrow store for non-message lifecycle facts.