npm - @exaudeus/workrail - Versions diffs - 3.67.0 → 3.68.0 - Mend

@exaudeus/workrail 3.67.0 → 3.68.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (140)

package/dist/application/services/compiler/template-registry.js +10 -1
package/dist/cli/commands/worktrain-init.js +1 -1
package/dist/console-ui/assets/{index-tOl8Vowf.js → index-CyzltI6D.js} +1 -1
package/dist/console-ui/index.html +1 -1
package/dist/coordinators/modes/full-pipeline.js +4 -4
package/dist/coordinators/modes/implement-shared.js +5 -5
package/dist/coordinators/modes/implement.js +4 -4
package/dist/coordinators/pr-review.js +4 -4
package/dist/daemon/workflow-runner.d.ts +1 -0
package/dist/daemon/workflow-runner.js +1 -0
package/dist/manifest.json +25 -25
package/dist/mcp/handlers/v2-workflow.js +1 -1
package/dist/mcp/workflow-protocol-contracts.js +2 -2
package/docs/authoring-v2.md +4 -4
package/docs/changelog-recent.md +3 -3
package/docs/configuration.md +1 -1
package/docs/design/adaptive-coordinator-context-candidates.md +1 -1
package/docs/design/adaptive-coordinator-context.md +1 -1
package/docs/design/adaptive-coordinator-routing-candidates.md +18 -18
package/docs/design/adaptive-coordinator-routing-review.md +1 -1
package/docs/design/adaptive-coordinator-routing.md +34 -34
package/docs/design/agent-cascade-protocol.md +2 -2
package/docs/design/console-daemon-separation-discovery.md +323 -0
package/docs/design/context-assembly-design-candidates.md +1 -1
package/docs/design/context-assembly-implementation-plan.md +1 -1
package/docs/design/context-assembly-layer.md +2 -2
package/docs/design/context-assembly-review-findings.md +1 -1
package/docs/design/coordinator-access-audit.md +293 -0
package/docs/design/coordinator-architecture-audit.md +62 -0
package/docs/design/coordinator-error-handling-audit.md +240 -0
package/docs/design/coordinator-testability-audit.md +426 -0
package/docs/design/daemon-architecture-discovery.md +1 -1
package/docs/design/daemon-console-separation-discovery.md +242 -0
package/docs/design/daemon-memory-audit.md +203 -0
package/docs/design/design-candidates-console-daemon-separation.md +256 -0
package/docs/design/design-candidates-discovery-loop-fix.md +141 -0
package/docs/design/design-review-findings-console-daemon-separation.md +106 -0
package/docs/design/design-review-findings-discovery-loop-fix.md +81 -0
package/docs/design/discovery-loop-fix-candidates.md +161 -0
package/docs/design/discovery-loop-fix-design-review.md +106 -0
package/docs/design/discovery-loop-fix-validation.md +258 -0
package/docs/design/discovery-loop-investigation-A.md +188 -0
package/docs/design/discovery-loop-investigation-B.md +287 -0
package/docs/design/exploration-workflow-candidates.md +205 -0
package/docs/design/exploration-workflow-design-review.md +166 -0
package/docs/design/exploration-workflow-discovery.md +443 -0
package/docs/design/ide-context-files-candidates.md +231 -0
package/docs/design/ide-context-files-design-review.md +85 -0
package/docs/design/ide-context-files.md +615 -0
package/docs/design/implementation-plan-discovery-loop-fix.md +199 -0
package/docs/design/implementation-plan-queue-poll-rotation.md +102 -0
package/docs/design/in-process-http-audit.md +190 -0
package/docs/design/layer3b-ghost-nodes-design-candidates.md +2 -2
package/docs/design/loadSessionNotes-candidates.md +108 -0
package/docs/design/loadSessionNotes-test-coverage-discovery.md +297 -0
package/docs/design/loadSessionNotes-test-coverage-session4.md +209 -0
package/docs/design/loadSessionNotes-test-coverage-v3.md +321 -0
package/docs/design/probe-session-design-candidates.md +261 -0
package/docs/design/probe-session-phase0.md +490 -0
package/docs/design/routines-guide.md +7 -7
package/docs/design/session-metrics-attribution-candidates.md +250 -0
package/docs/design/session-metrics-attribution-design-review.md +115 -0
package/docs/design/session-metrics-attribution-discovery.md +319 -0
package/docs/design/session-metrics-candidates.md +227 -0
package/docs/design/session-metrics-design-review.md +104 -0
package/docs/design/session-metrics-discovery.md +454 -0
package/docs/design/spawn-session-debug.md +202 -0
package/docs/design/trigger-validator-candidates.md +214 -0
package/docs/design/trigger-validator-review.md +109 -0
package/docs/design/trigger-validator-shaping-phase0.md +239 -0
package/docs/design/trigger-validator.md +454 -0
package/docs/design/v2-core-design-locks.md +2 -2
package/docs/design/workflow-extension-points.md +15 -15
package/docs/design/workflow-id-validation-at-startup.md +1 -1
package/docs/design/workflow-id-validation-implementation-plan.md +2 -2
package/docs/design/workflow-trigger-lifecycle-audit.md +175 -0
package/docs/design/worktrain-task-queue-candidates.md +5 -5
package/docs/design/worktrain-task-queue.md +4 -4
package/docs/discovery/coordinator-script-design.md +1 -1
package/docs/discovery/coordinator-ux-discovery.md +3 -3
package/docs/discovery/simulation-report.md +1 -1
package/docs/discovery/workflow-modernization-discovery.md +326 -0
package/docs/discovery/workflow-selection-for-discovery-tasks.md +33 -33
package/docs/discovery/worktrain-status-briefing.md +1 -1
package/docs/discovery/wr-discovery-goal-reframing.md +1 -1
package/docs/docker.md +1 -1
package/docs/ideas/backlog.md +227 -0
package/docs/ideas/third-party-workflow-setup-design-thinking.md +1 -1
package/docs/integrations/claude-code.md +5 -5
package/docs/integrations/firebender.md +1 -1
package/docs/plans/agentic-orchestration-roadmap.md +2 -2
package/docs/plans/mr-review-workflow-redesign.md +9 -9
package/docs/plans/ui-ux-workflow-design-candidates.md +4 -4
package/docs/plans/ui-ux-workflow-discovery.md +2 -2
package/docs/plans/workflow-categories-candidates.md +8 -8
package/docs/plans/workflow-categories-discovery.md +4 -4
package/docs/plans/workflow-modernization-design.md +430 -0
package/docs/plans/workflow-staleness-detection-candidates.md +11 -11
package/docs/plans/workflow-staleness-detection-review.md +4 -4
package/docs/plans/workflow-staleness-detection.md +9 -9
package/docs/plans/workrail-platform-vision.md +3 -3
package/docs/reference/agent-context-cleaner-snippet.md +1 -1
package/docs/reference/agent-context-guidance.md +4 -4
package/docs/reference/context-optimization.md +2 -2
package/docs/roadmap/now-next-later.md +2 -2
package/docs/roadmap/open-work-inventory.md +16 -16
package/docs/workflows.md +31 -31
package/package.json +1 -1
package/spec/workflow-tags.json +47 -47
package/workflows/adaptive-ticket-creation.json +16 -16
package/workflows/architecture-scalability-audit.json +22 -22
package/workflows/bug-investigation.agentic.v2.json +3 -3
package/workflows/classify-task-workflow.json +1 -1
package/workflows/coding-task-workflow-agentic.json +6 -6
package/workflows/cross-platform-code-conversion.v2.json +8 -8
package/workflows/document-creation-workflow.json +8 -8
package/workflows/documentation-update-workflow.json +8 -8
package/workflows/intelligent-test-case-generation.json +2 -2
package/workflows/learner-centered-course-workflow.json +2 -2
package/workflows/mr-review-workflow.agentic.v2.json +4 -4
package/workflows/personal-learning-materials-creation-branched.json +8 -8
package/workflows/presentation-creation.json +5 -5
package/workflows/production-readiness-audit.json +1 -1
package/workflows/relocation-workflow-us.json +31 -31
package/workflows/routines/context-gathering.json +1 -1
package/workflows/routines/design-review.json +1 -1
package/workflows/routines/execution-simulation.json +1 -1
package/workflows/routines/feature-implementation.json +3 -3
package/workflows/routines/final-verification.json +1 -1
package/workflows/routines/hypothesis-challenge.json +1 -1
package/workflows/routines/ideation.json +1 -1
package/workflows/routines/parallel-work-partitioning.json +3 -3
package/workflows/routines/philosophy-alignment.json +2 -2
package/workflows/routines/plan-analysis.json +1 -1
package/workflows/routines/plan-generation.json +1 -1
package/workflows/routines/tension-driven-design.json +6 -6
package/workflows/scoped-documentation-workflow.json +26 -26
package/workflows/ui-ux-design-workflow.json +14 -14
package/workflows/workflow-diagnose-environment.json +1 -1
package/workflows/workflow-for-workflows.json +1 -1

package/docs/design/session-metrics-attribution-candidates.md ADDED

@@ -0,0 +1,250 @@
+# Session Metrics Attribution -- Design Candidates
+*Raw investigative material for main agent synthesis. Not a final decision.*
+---
+## Problem Understanding
+**Core problem:** WorkRail sessions have `startGitSha` (captured at session start via `observation_recorded`) but no `endGitSha`. Without an end SHA, consumers cannot compute a bounded git diff for a completed session. The concurrent same-branch problem (two sessions on the same branch make commits in overlapping time windows) makes branch-scoped diff attribution unreliable.
+**Tensions (real tradeoffs):**
+1. **Attribution accuracy vs. schema complexity.** A `run_completed` event with `endGitSha` achieves engine-authoritative bounding of the session's diff range. The consumer-offload approach (no new event) achieves zero schema cost but breaks retrospective analytics (current HEAD != session end SHA for sessions completed days ago).
+2. **Agent compliance vs. architectural correctness.** The 'high' confidence tier (per-session disambiguation in same-branch concurrency) requires agents to self-report commit SHAs. An append-style event mechanism (`commit_sha_appended`) would solve the accumulation problem architecturally but doesn't fix non-compliance -- agents could still forget to report. The fundamental problem is social (agent convention), not structural.
+3. **Snapshot vs. query-time computation.** Snapshotting `agentCommitShas` from context into the `run_completed` event gives consumers a pre-computed, durable record that survives future context overwrites. Leaving SHAs only in `context_set` is fragile: context can theoretically be overwritten by a subsequent advance attempt.
+4. **Naive `observation_recorded` reuse vs. new event kind.** The dedupeKey pattern `observation_recorded:{sessionId}:{key}` means a second `git_head_sha` emit is silently discarded on replay. Adding `git_head_sha_end` to the observation key enum costs the same schema change as a new event kind but produces worse semantics (mixes pre/post-workflow observations; can't carry `captureConfidence`).
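The dedupe behavior described in tension 4 can be illustrated with a minimal sketch. It assumes an append-only log that drops any event whose dedupeKey is already present. `SketchEvent`, `appendWithDedupe`, and the `run_completed:...` key format are hypothetical names used for illustration, not WorkRail's actual API; only the `observation_recorded:{sessionId}:{key}` pattern comes from the text above.

```typescript
// Hypothetical event shape; only the observation dedupeKey pattern is taken from the notes.
interface SketchEvent {
  readonly kind: string;
  readonly dedupeKey: string;
  readonly value: string;
}

// Builds the key pattern quoted above: observation_recorded:{sessionId}:{key}
function observationDedupeKey(sessionId: string, key: string): string {
  return `observation_recorded:${sessionId}:${key}`;
}

// Assumed append semantics: an event whose dedupeKey is already present is dropped.
function appendWithDedupe(
  log: readonly SketchEvent[],
  event: SketchEvent,
): readonly SketchEvent[] {
  const seen = log.some((e) => e.dedupeKey === event.dedupeKey);
  return seen ? log : [...log, event];
}

const sessionId = 'sess_1';

const startObservation: SketchEvent = {
  kind: 'observation_recorded',
  dedupeKey: observationDedupeKey(sessionId, 'git_head_sha'),
  value: 'a'.repeat(40),
};

// Same session, same key: this produces the same dedupeKey as the start observation.
const endObservation: SketchEvent = {
  kind: 'observation_recorded',
  dedupeKey: observationDedupeKey(sessionId, 'git_head_sha'),
  value: 'b'.repeat(40),
};

// A distinct event kind gets a distinct key (hypothetical format), so it is kept.
const runCompleted: SketchEvent = {
  kind: 'run_completed',
  dedupeKey: `run_completed:${sessionId}:run_1`,
  value: 'b'.repeat(40),
};

const afterStart = appendWithDedupe([], startObservation);
const afterSecondObservation = appendWithDedupe(afterStart, endObservation);
const afterRunCompleted = appendWithDedupe(afterSecondObservation, runCompleted);
```

Under these assumptions the second `git_head_sha` observation never reaches the log, while an event of a different kind does, which is the argument for a new event kind.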
+**Likely seam:** `buildSuccessOutcome()` in `src/mcp/handlers/v2-advance-core/outcome-success.ts`, inside the `if (newEngineState.kind === 'complete')` guard. This is the only place where: (a) the new engine state is known to be `complete`, and (b) the session is under lock (atomic append). The `extraEventsToAppend` mechanism is already established for this purpose (5 other event types use it).
+**What makes this hard:**
+1. **DedupeKey collision.** A junior developer would try to reuse `observation_recorded` for endGitSha and discover that the second emit is silently discarded as a duplicate on replay (dedupeKey = `observation_recorded:{sessionId}:{key}`; same session and same key yield the same dedupeKey).
+2. **Timing.** The `complete` engine state is only known inside `buildSuccessOutcome` (computed at line 179 of `outcome-success.ts` via `fromV1ExecutionState(out.state)`). End SHA capture must happen at this same point.
+3. **`workspacePath` threading.** `buildSuccessOutcome` receives `AdvanceCorePorts` (no workspace context). End SHA resolution requires `workspacePath` from `input.workspacePath`. The resolution must happen before `executeAdvanceCore` is called and the result passed as a parameter `endGitSha: string | null` down to `buildSuccessOutcome`. This is a straightforward parameter threading change but non-obvious.
+4. **Accumulation problem.** `mergeContext` (locked at §18.2) replaces arrays rather than merging them. `metrics_commit_shas: ['abc']` at step 5 followed by `metrics_commit_shas: ['def']` at step 9 leaves only `['def']`. The agent must be instructed to send the full accumulated list on every context_set. This is a convention constraint and cannot be fixed architecturally without a new event kind.
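The accumulation problem in item 4 can be reproduced with a small sketch. It assumes the merge is a plain top-level spread; `shallowMergeContext` and `SketchContext` are hypothetical stand-ins for the real `mergeContext`, whose exact signature is not shown in these notes.

```typescript
type SketchContext = Readonly<Record<string, unknown>>;

// Assumed shallow merge: top-level keys in `patch` replace those in `base`.
// Arrays are replaced wholesale, never concatenated.
function shallowMergeContext(base: SketchContext, patch: SketchContext): SketchContext {
  return { ...base, ...patch };
}

// Step 5: the agent reports its first commit.
const afterStep5 = shallowMergeContext({}, { metrics_commit_shas: ['abc'] });

// Step 9, incremental report: the step-5 SHA is lost.
const incremental = shallowMergeContext(afterStep5, { metrics_commit_shas: ['def'] });

// Step 9, full-list convention: the agent resends everything accumulated so far.
const accumulated = shallowMergeContext(afterStep5, {
  metrics_commit_shas: ['abc', 'def'],
});
```

Only the second form preserves both SHAs, which is why the convention has to live in the step prompt rather than in the merge.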
+---
+## Philosophy Constraints
+**Principles that matter most here:**
+- **Make illegal states unrepresentable:** `captureConfidence: 'high'` with empty `agentCommitShas` is an illegal state. Must enforce via Zod `superRefine` cross-field check.
+- **Exhaustiveness everywhere:** Most projections use `default: // ignore` (forward-compat pattern, confirmed in `run-dag.ts`, `session-index.ts`). New event kind does NOT require updating existing projections automatically. Primary update: `DomainEventV1Schema` Zod union + `EVENT_KIND` constant.
+- **Validate at boundaries, trust inside:** Agent-reported SHAs should be validated as sha1 format (40-char hex regex) in the Zod schema, not in projections.
+- **YAGNI with discipline:** Do NOT add `durationSeconds` (blocked on timestamps). Do NOT add 'low' confidence tier (blocked on cross-session queries). Placeholder fields that can't be populated are anti-YAGNI.
+- **Architectural fixes over patches:** DedupeKey collision is an architectural signal -- use a new event kind, not a workaround.
+- **Immutability by default:** Once `run_completed` is emitted, it cannot be modified. All fields must be computable at emission time.
+**Philosophy conflicts:**
+1. YAGNI vs. architectural-fixes-over-patches: YAGNI says wait for validated same-branch concurrency data. Architectural-fix principle says use a proper event kind. Resolution: build `run_completed` correctly if built at all; but the "whether to build" decision is separate.
+2. §18.2 locked shallow-merge vs. make-illegal-states-unrepresentable: The shallow merge lock means agent SHA accumulation is inherently fragile. Accepted tension; document the convention.
+---
+## Impact Surface
+Files that must stay consistent:
+- `src/v2/durable-core/schemas/session/events.ts` -- `DomainEventV1Schema` Zod union, primary update target
+- `src/v2/durable-core/constants.ts` -- `EVENT_KIND` constants object, must add `RUN_COMPLETED`
+- `src/mcp/handlers/v2-advance-core/outcome-success.ts` -- emission site, `extraEventsToAppend` usage
+- `src/mcp/handlers/v2-execution/continue-advance.ts` -- workspace anchor resolution must happen here, `endGitSha` passed down
+- `src/v2/projections/run-completed.ts` (new) -- pure projection to extract `run_completed` data from event log
+- `src/v2/usecases/console-service.ts` -- may need to call new projection to populate `ConsoleSessionSummary`
+- `src/v2/usecases/console-types.ts` -- extend `ConsoleSessionSummary` with `gitAttribution` field
+- `console/src/api/types.ts` -- mirror `ConsoleSessionSummary` changes for frontend
+- `docs/authoring-v2.md` -- step prompt convention for `metrics_commit_shas` accumulation
+Contracts that must remain consistent:
+- `ConsoleSessionSummary` DTO: new field must be optional/null-safe (backward-compatible)
+- Session event log schema: Zod parse must accept existing event logs without `run_completed` events (forward-compat, already handled by discriminated union)
+---
+## Candidates
+### Candidate 1: Minimal `run_completed` -- endGitSha only
+**One-sentence summary:** Emit a `run_completed` event at session completion containing `startGitSha`, `endGitSha`, and a two-tier confidence (`medium` / `none`), with no agent-reported data.
+**Tensions resolved:**
+- Attribution accuracy vs. schema complexity: resolves at minimum cost. Consumers get an authoritative start+end SHA range.
+- Snapshot vs. query-time: resolves correctly -- endGitSha is snapshotted at completion, not computed from current HEAD at query time.
+**Tensions accepted:**
+- Agent compliance irrelevant (no agent data captured). Per-session disambiguation in same-branch concurrency is NOT solved.
+**Boundary solved at:** `buildSuccessOutcome()` in `outcome-success.ts`, guarded by `newEngineState.kind === 'complete'`.
+**Why this boundary is the best fit:** Only place with both complete state knowledge and atomic append capability.
+**Failure mode:** `workspacePath` absent on final advance -- `endGitSha` is null, confidence 'none'. Graceful; not an error.
+**Repo-pattern relationship:** Follows -- mirrors `observation_recorded` at start; uses established `extraEventsToAppend` hook.
+**Gains:** Engine-authoritative diff range with minimal schema change. Zero agent compliance requirement.
+**Losses:** Concurrent same-branch attribution problem remains unsolved. 'high' confidence tier is unachievable.
+**Schema:**
+```typescript
+{
+  kind: 'run_completed',
+  scope: { runId: string },
+  data: {
+    startGitSha: string | null,
+    endGitSha: string | null,
+    captureConfidence: 'medium' | 'none',
+    // 'medium' = both SHAs available; 'none' = no git context
+  }
+}
+```
+**Scope judgment:** Too narrow for the stated goal (concurrent same-branch attribution). Best-fit for "approximately how much did this session change."
+**Philosophy fit:** YAGNI satisfied. Tension: 'high' confidence tier promised in architecture but undeliverable.
+---
+### Candidate 2: `run_completed` with agentCommitShas snapshot (RECOMMENDED)
+**One-sentence summary:** Same as Candidate 1 plus `agentCommitShas: readonly string[]` copied from context at completion time, enabling 'high' confidence tier when agents comply.
+**Tensions resolved:**
+- Attribution accuracy vs. schema complexity: best resolution. Marginal schema cost increase over C1; delivers per-session disambiguation.
+- Snapshot vs. query-time: fully resolved. Agent SHAs are snapshotted into the immutable event log at completion.
+**Tensions accepted:**
+- Agent compliance convention. Accumulation problem (agents must send full list, not incremental). No 'low' confidence tier.
+**Boundary solved at:** Same as Candidate 1.
+**Why this boundary is the best fit:** `lockedIndex.runContextByRunId.get(String(runId))?.metrics_commit_shas` is directly readable at this point with no additional I/O.
+**Failure mode:** Agent sends partial SHA list (only current step's commits, not accumulated). `agentCommitShas` in `run_completed` is incomplete; confidence is 'high' but the record is silently wrong. Mitigation: authoring docs must explicitly state "include ALL commit SHAs made during the session in your final context_set."
+**Repo-pattern relationship:** Follows -- `assessment_recorded` does the exact same thing (snapshots assessment context into event log at completion time). This is the established pattern.
+**Gains:**
+- Durable per-session commit SHA record that survives context overwrites
+- 'high' confidence tier enables per-session disambiguation
+- Fully reversible: if agent compliance is poor, consumers can ignore `agentCommitShas` and fall back to 'medium' with no schema change
+- Incremental cost over Candidate 1 is trivial (one field + superRefine + sha1 validation)
+**Losses:** Agent compliance convention is required for the 'high' path to be meaningful.
+**Schema:**
+```typescript
+{
+  kind: 'run_completed',
+  scope: { runId: string },
+  data: {
+    startGitSha: string | null,
+    endGitSha: string | null,
+    agentCommitShas: readonly string[],  // sha1 regex validated, may be empty
+    captureConfidence: 'high' | 'medium' | 'none',
+    // 'high' = agentCommitShas non-empty (all sha1 validated)
+    // 'medium' = start+end SHA available, agentCommitShas empty
+    // 'none' = no git context
+  }
+}
+// Zod superRefine: captureConfidence === 'high' requires agentCommitShas.length > 0
+```
+**Scope judgment:** Best-fit for stated goal.
+**Philosophy fit:** Make illegal states unrepresentable (superRefine invariant), validate at boundaries (sha1 regex on every entry in agentCommitShas), errors are data (captureConfidence), immutability (event log snapshot).
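The two checks named for Candidate 2 (a 40-character hex format check on every entry, and the rule that 'high' requires a non-empty list) can be sketched without any schema library. This is a dependency-free illustration of the invariant, not the Zod schema itself; `validateRunCompletedData` and `RunCompletedDataSketch` are hypothetical names.

```typescript
type CaptureConfidence = 'high' | 'medium' | 'none';

// Mirrors the Candidate 2 data shape shown above.
interface RunCompletedDataSketch {
  readonly startGitSha: string | null;
  readonly endGitSha: string | null;
  readonly agentCommitShas: readonly string[];
  readonly captureConfidence: CaptureConfidence;
}

// 40 lowercase hex characters.
const SHA1_PATTERN = /^[0-9a-f]{40}$/;

// Returns a list of problems; an empty list means both checks passed.
function validateRunCompletedData(data: RunCompletedDataSketch): readonly string[] {
  const issues: string[] = [];

  // Boundary validation: every agent-reported SHA must match the sha1 format.
  data.agentCommitShas.forEach((sha, index) => {
    if (!SHA1_PATTERN.test(sha)) {
      issues.push(`agentCommitShas[${index}] is not a 40-char lowercase hex sha1`);
    }
  });

  // Cross-field invariant: 'high' confidence with no SHAs is an illegal state.
  if (data.captureConfidence === 'high' && data.agentCommitShas.length === 0) {
    issues.push("captureConfidence 'high' requires a non-empty agentCommitShas");
  }

  return issues;
}

const sha = 'c'.repeat(40);

const validHigh = validateRunCompletedData({
  startGitSha: sha,
  endGitSha: sha,
  agentCommitShas: [sha],
  captureConfidence: 'high',
});

const illegalHigh = validateRunCompletedData({
  startGitSha: sha,
  endGitSha: sha,
  agentCommitShas: [],
  captureConfidence: 'high',
});

const malformedSha = validateRunCompletedData({
  startGitSha: sha,
  endGitSha: sha,
  agentCommitShas: ['abc'],
  captureConfidence: 'high',
});
```

Neither check can detect a partial list: a single well-formed SHA passes even if the session made several commits, which is the silent failure mode noted above.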
+---
+### Candidate 3 (reframe): Consumer-offload -- projection only, no new event kind
+**One-sentence summary:** Add a pure projection `projectSessionGitAttributionV2(events)` that surfaces existing `startGitSha`, `gitBranch`, and `metrics_commit_shas` from context into a typed DTO; let consumers compute diffs from startSha at query time.
+**Tensions resolved:**
+- Schema complexity: zero. No new event kind.
+- YAGNI: satisfied.
+**Tensions accepted:**
+- Retrospective attribution broken: current HEAD != session end SHA for sessions completed days ago. `git diff startSha..HEAD` is meaningless for historical queries.
+**Boundary solved at:** Projection layer only. No changes to event emission.
+**Why this boundary is wrong:** Consumers need a bounded diff range. Without `endGitSha`, retrospective analytics (the primary stated use case) is impossible.
+**Failure mode:** Consumer runs attribution query on a 3-month-old session. Repo's HEAD is 800 commits ahead of session end. Diff is enormous and meaningless.
+**Repo-pattern relationship:** Follows projection pattern. But insufficient as a standalone solution.
+**Gains:** Zero schema change cost. Can ship as a companion to Candidate 1 or 2 (the projection is needed regardless).
+**Losses:** Breaks retrospective analytics. Does not solve the stated problem.
+**Scope judgment:** Too narrow. Valid as a projection companion (needed for all candidates) but not as a standalone solution.
+**Philosophy fit:** YAGNI (zero schema change). Conflict: architectural fixes over patches -- a projection without endGitSha capture is a workaround.
+---
+## Comparison and Recommendation
+**Recommendation: Candidate 2**
+The incremental cost of `agentCommitShas` over Candidate 1 is trivial. The gain is meaningful: a durable per-session commit SHA record and the 'high' confidence tier. The design follows the `assessment_recorded` pattern exactly. The decision is fully reversible (consumers can ignore `agentCommitShas` if compliance is poor).
+Candidate 3 is a required companion (the projection is needed for all candidates to surface `run_completed` data) but is insufficient as a standalone solution.
+**Phase 1 scope (what ships):**
+- `run_completed` event kind with `startGitSha`, `endGitSha`, `agentCommitShas`, `captureConfidence: 'high' | 'medium' | 'none'`
+- `projectSessionGitAttributionV2` pure projection
+- `ConsoleSessionSummary` extension (optional `gitAttribution` field)
+- `docs/authoring-v2.md` update: full-accumulated-list convention for `metrics_commit_shas`
+**Deferred to Phase 2 (data-gated):**
+- Concurrent session detection (`captureConfidence: 'low'`)
+- `durationSeconds` (blocked on event timestamps)
+- Append-style `commit_sha_appended` event (only if compliance measurement shows full-list convention is unreliable)
+**Implementation threading fix required:**
+Before `executeAdvanceCore` is called in `continue-advance.ts`, resolve end git SHA from `input.workspacePath` and pass `endGitSha: string | null` as a new parameter through to `buildSuccessOutcome`. The existing `resolveWorkspaceAnchors()` function can be reused.
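The threading change can be sketched as follows, under heavy simplification. Every signature here is hypothetical: the real `buildSuccessOutcome` and the advance handler take more parameters, the non-complete state name `in_progress` is a placeholder, and the SHA resolver is injected as a function so the sketch needs no git access.

```typescript
// Hypothetical, simplified input; only workspacePath matters for this sketch.
interface SketchAdvanceInput {
  readonly workspacePath?: string;
}

// Stand-in for workspace anchor resolution, injected to avoid real git I/O.
type ResolveHeadSha = (workspacePath: string) => string | null;

interface SketchRunCompleted {
  readonly kind: 'run_completed';
  readonly endGitSha: string | null;
}

// Inner layer: emits run_completed only when the new engine state is complete.
// It receives endGitSha as a plain parameter and has no workspace knowledge.
function buildSuccessOutcomeSketch(
  newEngineStateKind: 'in_progress' | 'complete',
  endGitSha: string | null,
): readonly SketchRunCompleted[] {
  return newEngineStateKind === 'complete' ? [{ kind: 'run_completed', endGitSha }] : [];
}

// Outer layer: resolves the SHA before calling into the core, then passes it down.
function continueAdvanceSketch(
  input: SketchAdvanceInput,
  resolveHeadSha: ResolveHeadSha,
  newEngineStateKind: 'in_progress' | 'complete',
): readonly SketchRunCompleted[] {
  const endGitSha =
    input.workspacePath !== undefined ? resolveHeadSha(input.workspacePath) : null;
  return buildSuccessOutcomeSketch(newEngineStateKind, endGitSha);
}

const fakeResolver: ResolveHeadSha = () => 'd'.repeat(40);

const withWorkspace = continueAdvanceSketch({ workspacePath: '/repo' }, fakeResolver, 'complete');
const withoutWorkspace = continueAdvanceSketch({}, fakeResolver, 'complete');
const notFinished = continueAdvanceSketch({ workspacePath: '/repo' }, fakeResolver, 'in_progress');
```

This is the eager variant: the resolver runs on every advance that carries a `workspacePath`, including non-final ones, which is the latency cost raised in the open questions.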
+---
+## Self-Critique
+**Strongest argument against Candidate 2:**
+The silent-wrong-confidence problem. If agents consistently report partial SHA lists (only the current step's commits), the 'high' confidence flag in `run_completed` will be systematically misleading. This is exactly the failure mode that caused Discovery 1 to reject agent self-reporting. The mitigation (authoring docs convention) is necessary but not sufficient -- it creates a trust-the-author dependency that the WorkRail engine cannot enforce at the schema level.
+**Pivot conditions:**
+- Switch to Candidate 1 if: stakeholder confirms per-session disambiguation is not needed (branch-level totals are sufficient for all consumers).
+- Switch to append-style events if: agent compliance measurement shows <50% of multi-commit sessions report full accumulated lists.
+- Keep Candidate 2 if: adoption measurement after 30 days shows agents generally include full accumulated lists in final steps.
+**Assumption that would invalidate the design:**
+If `workspacePath` is frequently absent in the final `continue_workflow` call, `endGitSha` will be null for most sessions. The `run_completed` event would exist but be nearly empty (confidence 'none'). The feature would be dead in practice. This is a documentation/UX concern: the MCP schema must strongly encourage `workspacePath` on every advance call.
+---
+## Open Questions for the Main Agent
+1. **Is per-session attribution (vs. per-branch totals) a validated consumer need?** If branch-level totals suffice, Candidate 1 or even Candidate 3 is correct.
+2. **What is the actual same-branch concurrent session rate?** The decision to build Phase 2 (concurrent detection) should be data-gated on this.
+3. **Should `run_completed` be scoped to `{ runId }` or `{ runId, nodeId }`?** Run-scoped (like `context_set`, `run_started`) seems right since session completion is a run-level event. But the final node's `nodeId` might be useful for linking to the last step.
+4. **How does the console surface `captureConfidence`?** The design doc should specify the UI label ("agent-reported", "approximate", "engine-verified") before implementation to avoid post-hoc design decisions.
+5. **Does the `workspacePath` threading require resolving git anchors eagerly (before the completion check) or lazily (only when completion is detected)?** Eager resolution adds latency to every non-final advance. Lazy resolution (resolve only when `newEngineState.kind === 'complete'`) requires the workspace anchor resolution to happen inside `buildSuccessOutcome` -- which means adding the workspace context to the ports. Trade-off: latency vs. interface cleanliness.

package/docs/design/session-metrics-attribution-design-review.md ADDED

@@ -0,0 +1,115 @@
+# Session Metrics Attribution -- Design Review Findings
+*Review findings for the selected direction: Candidate 2 (`run_completed` event with startGitSha, endGitSha, gitBranch, agentCommitShas, captureConfidence)*
+---
+## Tradeoff Review
+| Tradeoff | Assessment | Condition for Reversal |
+|----------|-----------|----------------------|
+| Agent compliance convention for 'high' confidence | Acceptable. No in-engine enforcement possible given §18.2 shallow-merge lock. Authoring docs + step prompt mitigate. | If consumers build automated decisions on 'high' confidence -- migrate to typed artifact contract enforcement |
+| Accumulation problem deferred to convention | Acceptable IF step prompt explicitly says "include ALL commit SHAs made during this session." Without explicit prompt, adoption is near-zero. | If compliance measurement shows <50% full-list reporting for multi-commit sessions -- ship commit_sha_appended event type |
+| 'low' confidence tier deferred to Phase 2 | Acceptable. Concurrent session detection requires cross-session I/O not available at AdvanceCorePorts level. | If production misattribution incidents occur on shared branches -- prioritize Phase 2 session summary provider injection |
+| durationSeconds deferred | Acceptable. Event envelope has no timestamps. | If event timestamps are added -- add durationSeconds retroactively (backward-compatible field addition) |
+| Multi-run sessions use last-run-wins | Acceptable. Engine model prevents multiple completions per session (fork/rewind happens before completion). | If multi-completion sessions are possible in a future engine variant |
+---
+## Failure Mode Review
+| Failure Mode | Design Coverage | Missing Mitigation | Risk |
+|-------------|----------------|-------------------|------|
+| Partial SHA list produces silently incomplete 'high' confidence | Not covered at schema level. captureConfidence 'high' only requires non-empty SHAs, not a complete set. | Consumers SHOULD cross-check agentCommitShas.length against the output of `git rev-list startGitSha..endGitSha --count`. Document as consumer convention. | MEDIUM -- silent partial attribution, not false attribution of other sessions' work |
+| workspacePath absent -- endGitSha null | Covered. captureConfidence 'none' is valid graceful state. startGitSha also null if no observation_recorded events. run_completed still records completion fact. | None needed | LOW |
+| Multi-run sessions (fork/rewind) | Engine model prevents it: fork/rewind happens before completion, so only one run reaches 'complete'. | Document the invariant in the event schema comment. | LOW |
+| workspacePath present but not a git repo | Covered. resolveWorkspaceAnchors returns null anchors for non-git paths. endGitSha null, captureConfidence 'none'. | None needed | LOW |
+---
+## Runner-Up / Simpler Alternative Review
+**From runner-up (Candidate 1):** No elements worth borrowing. Candidate 1 is strictly dominated by Candidate 2 -- the only thing it does differently is omit agentCommitShas. The incremental cost of including agentCommitShas is trivial and the gain (durable SHA record, 'high' confidence path) is meaningful.
+**Simpler variant:** Not possible without dropping an acceptance criterion. Every field in the final schema satisfies a specific criterion.
+**Candidate 3 (reframe / projection-only):** Valid as a companion (a `projectSessionGitAttributionV2` projection is needed regardless of which candidate ships). Not sufficient as a standalone solution because endGitSha is not capturable without a new event kind.
+**Hybrid found during analysis:** Add `gitBranch: string | null` to `run_completed.data` for consumer convenience. Makes the event self-contained for branch-aware analytics. No join against `observation_recorded` events needed. Small gain, zero cost. **INCLUDED in final schema.**
+---
+## Philosophy Alignment
+| Principle | Status |
+|-----------|--------|
+| Immutability by default | Satisfied -- run_completed is immutable once appended |
+| Errors are data | Satisfied -- captureConfidence is a first-class field |
+| Validate at boundaries, trust inside | Satisfied -- sha1 regex validation on agentCommitShas at Zod boundary |
+| Functional/declarative | Satisfied -- new projection is a pure fold |
+| Make illegal states unrepresentable | Partial -- superRefine enforces high-implies-non-empty, but not completeness (external git access required) |
+| Exhaustiveness everywhere | Satisfied -- discriminated union update is complete; existing projections use default: ignore (forward compat) |
+| Architectural fixes over patches | Tension -- accumulation problem solved by convention (not architecture). Accepted with deferral condition. |
+| YAGNI | Satisfied -- durationSeconds absent, 'low' confidence absent |
+| Type safety as first line of defense | Partial -- format validation only, not semantic verification. Acceptable for advisory data. |
+---
+## Findings
+### RED (Blocking)
+None.
+### ORANGE (Important -- Must Address Before Ship)
+**O1: Step prompt convention is required, not optional**
+The 'high' confidence tier is meaningless without a step prompt that explicitly instructs: "Include ALL commit SHAs made during this session in your final continue_workflow context_set, not just commits from the current step." Without this, the design's primary differentiator over Candidate 1 is functionally dead. This prompt template MUST be included in `docs/authoring-v2.md` as a required deliverable alongside the code changes.
+**O2: Consumer cross-check convention must be documented**
+Consumers relying on 'high' confidence SHOULD cross-check `agentCommitShas.length` against `git rev-list startGitSha..endGitSha --count`. If counts diverge, treat as 'medium' regardless of stored confidence. This cross-check detects partial-list compliance. Must be documented in the design doc and any consumer-facing API docs.
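The O2 cross-check can be written as a pure function on the consumer side. This is a minimal sketch: `effectiveConfidence` is a hypothetical name, and `revListCount` is assumed to be the number the consumer obtained from `git rev-list startGitSha..endGitSha --count`; the sketch does not invoke git itself.

```typescript
type StoredConfidence = 'high' | 'medium' | 'none';

// Downgrades a stored 'high' to 'medium' when the agent-reported SHA count
// does not match the commit count of the start..end range.
function effectiveConfidence(
  stored: StoredConfidence,
  agentCommitShaCount: number,
  revListCount: number,
): StoredConfidence {
  if (stored === 'high' && agentCommitShaCount !== revListCount) {
    return 'medium';
  }
  return stored;
}

// Agent reported 2 SHAs but the range contains 3 commits: treat as 'medium'.
const partialList = effectiveConfidence('high', 2, 3);

// Counts agree: the stored confidence stands.
const fullList = effectiveConfidence('high', 3, 3);

// Lower tiers are never upgraded or changed by the cross-check.
const alreadyMedium = effectiveConfidence('medium', 0, 3);
```

A matching count is a necessary signal rather than proof: on a shared branch the range can also contain other sessions' commits, so equal counts do not guarantee the SHAs are the same set.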
+### YELLOW (Advisories -- Can Be Addressed in Follow-Up)
+**Y1: Console UI must label confidence levels clearly**
+Display labels: 'high' = "agent-reported (verified sha1 format)", 'medium' = "approximate (branch-scoped diff)", 'none' = "no git context". Do not display 'high' as "verified" without qualifying it as agent-reported.
+**Y2: Document multi-run session invariant in event schema**
+Add a comment to the `run_completed` event Zod schema: "One per completed run. In normal operation, at most one run per session completes (fork/rewind creates a new run before the original completes). For sessions with multiple completed runs (unusual), consumers should use the latest `run_completed` event by eventIndex."
+**Y3: Track Phase 2 measurement criteria**
+Before Phase 2 (concurrent session detection) is scheduled, measure: (a) actual same-branch concurrent session rate, (b) agent compliance rate for full-list reporting. Phase 2 is only justified if same-branch concurrency rate is >5% of sessions.
+---
+## Recommended Revisions
+**Required (before ship):**
+1. Add `gitBranch: string | null` to `run_completed.data` -- makes event self-contained, no downstream impact.
+2. Add `superRefine` cross-field invariant: `captureConfidence === 'high'` requires `agentCommitShas.length > 0`.
+3. Add sha1 format validation on `agentCommitShas` entries: `z.string().regex(/^[0-9a-f]{40}$/)`.
+4. Pre-resolve `endGitSha` in `continue-advance.ts` BEFORE calling `executeAdvanceCore`. Pass `endGitSha: string | null` as a parameter through to `buildSuccessOutcome`. Do NOT add a new port -- this is a simple parameter threading change.
+5. Guard `run_completed` emission with `newEngineState.kind === 'complete'` check inside `buildSuccessOutcome`.
+6. Update `docs/authoring-v2.md` with the full-accumulated-list step prompt template.
+**Optional (can follow up):**
+7. Add `projectSessionGitAttributionV2` pure projection that surfaces `run_completed` data into a typed DTO for console and analytics consumers.
+8. Extend `ConsoleSessionSummary` with optional `gitAttribution: SessionGitAttribution | null` field.
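A projection of the kind described in item 7 might look like the sketch below. It is written as a single pass over the event log, ignores unknown kinds via a default branch, and applies the Y2 rule of taking the latest `run_completed` by eventIndex. The event shapes and the name `projectGitAttributionSketch` are hypothetical simplifications, not the real `projectSessionGitAttributionV2`.

```typescript
interface GitAttributionSketch {
  readonly startGitSha: string | null;
  readonly endGitSha: string | null;
  readonly gitBranch: string | null;
  readonly agentCommitShas: readonly string[];
  readonly captureConfidence: 'high' | 'medium' | 'none';
}

interface RunCompletedSketch {
  readonly kind: 'run_completed';
  readonly eventIndex: number;
  readonly data: GitAttributionSketch;
}

// Placeholder for every other event kind in the log.
interface OtherEventSketch {
  readonly kind: 'context_set' | 'observation_recorded';
  readonly eventIndex: number;
}

type LogEventSketch = RunCompletedSketch | OtherEventSketch;

// Pure projection: no I/O, no mutation of the input log.
function projectGitAttributionSketch(
  events: readonly LogEventSketch[],
): GitAttributionSketch | null {
  let latest: RunCompletedSketch | null = null;
  for (const event of events) {
    switch (event.kind) {
      case 'run_completed':
        // Keep the run_completed event with the highest eventIndex.
        if (latest === null || event.eventIndex > latest.eventIndex) {
          latest = event;
        }
        break;
      default:
        // Other kinds are irrelevant to this projection.
        break;
    }
  }
  return latest === null ? null : latest.data;
}

const baseData: GitAttributionSketch = {
  startGitSha: 'a'.repeat(40),
  endGitSha: 'b'.repeat(40),
  gitBranch: 'main',
  agentCommitShas: [],
  captureConfidence: 'medium',
};

const projected = projectGitAttributionSketch([
  { kind: 'observation_recorded', eventIndex: 0 },
  { kind: 'run_completed', eventIndex: 7, data: { ...baseData, endGitSha: 'e'.repeat(40) } },
  { kind: 'run_completed', eventIndex: 4, data: baseData },
]);

const emptyLog = projectGitAttributionSketch([]);
```

Returning `null` for logs without a `run_completed` event keeps the DTO field null-safe for sessions recorded before the event kind existed.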
+---
+## Residual Concerns
+1. **The riskiest assumption in the design:** Agent compliance with the full-list convention. The design's primary differentiator (per-session disambiguation via agentCommitShas) only works if agents consistently send the full accumulated list. Measurement should begin after ship.
+2. **The missing consumer:** No specific consumer of `run_completed` has been identified yet. The console dashboard and external analytics are assumed. Before Phase 2 investment, validate that at least one consumer actually needs the per-session granularity.
+3. **Schema drift risk:** The `observation_recorded.key` enum (`git_branch`, `git_head_sha`, `repo_root_hash`, `repo_root`) is already closed. If a future need arises to add a new observation key (e.g., `git_remote_url`), the same Zod update burden will recur. The `run_completed` event with its embedded `gitBranch` and `startGitSha` reduces (but doesn't eliminate) the need for new observation keys.