npm - @exaudeus/workrail - Versions diffs - 3.73.1 → 3.73.2 - Mend

@exaudeus/workrail 3.73.1 → 3.73.2

Files changed (34)

package/dist/console-ui/assets/{index-txIYXGHx.js → index-CfI4I3OX.js} +1 -1
package/dist/console-ui/index.html +1 -1
package/dist/manifest.json +67 -51
package/dist/mcp/handlers/v2-advance-core/index.d.ts +1 -0
package/dist/mcp/handlers/v2-advance-core/index.js +3 -3
package/dist/mcp/handlers/v2-advance-core/outcome-success.js +4 -18
package/dist/mcp/handlers/v2-advance-events.d.ts +1 -1
package/dist/mcp/handlers/v2-advance-events.js +1 -1
package/dist/mcp/handlers/v2-execution/advance.d.ts +1 -0
package/dist/mcp/handlers/v2-execution/advance.js +3 -3
package/dist/mcp/handlers/v2-execution/continue-advance.d.ts +1 -0
package/dist/mcp/handlers/v2-execution/continue-advance.js +2 -1
package/dist/mcp/handlers/v2-execution/index.js +3 -1
package/dist/mcp/server.js +6 -4
package/dist/mcp/types.d.ts +2 -0
package/dist/trigger/delivery-action.d.ts +1 -0
package/dist/trigger/delivery-action.js +1 -1
package/dist/trigger/delivery-pipeline.d.ts +13 -2
package/dist/trigger/delivery-pipeline.js +58 -3
package/dist/trigger/trigger-router.js +6 -3
package/dist/v2/durable-core/constants.d.ts +1 -0
package/dist/v2/durable-core/constants.js +1 -0
package/dist/v2/durable-core/schemas/export-bundle/index.d.ts +202 -0
package/dist/v2/durable-core/schemas/session/events.d.ts +56 -0
package/dist/v2/durable-core/schemas/session/events.js +8 -0
package/dist/v2/infra/local/git-snapshot/index.d.ts +6 -0
package/dist/v2/infra/local/git-snapshot/index.js +39 -0
package/dist/v2/ports/git-snapshot.port.d.ts +10 -0
package/dist/v2/ports/git-snapshot.port.js +9 -0
package/dist/v2/projections/session-metrics.js +17 -2
package/docs/design/engine-boundary-discovery.md +123 -0
package/docs/design/engine-boundary-review-findings.md +72 -0
package/docs/ideas/backlog.md +46 -0
package/package.json +1 -1

package/docs/design/engine-boundary-discovery.md ADDED

@@ -0,0 +1,123 @@
+# Engine Boundary Discovery: SHA Tracking and Layer Architecture
+**Date:** 2026-04-30
+**Status:** Discovery complete -- recommendation: A now (PR #903), C next
+**Goal:** Determine the right architecture for commit SHA tracking and clarify engine/coordinator/delivery layer boundaries
+---
+## Problem Understanding
+### The Real Problem
+The engine emits `run_completed` at step T. The delivery pipeline commits to git at step T+1. There is no mechanism to communicate back from T+1 to T. Agent self-reporting of `metrics_commit_shas` was the broken workaround.
+### Core Tensions
+1. **Information completeness vs engine purity**: commits only exist after delivery, which is after `run_completed`
+2. **YAGNI vs architectural completeness**: MCP sessions may not need SHA tracking right now
+3. **Port purity vs pragmatism**: `outcome-success.ts` already calls `execFile` directly for `endGitSha` -- known violation
+4. **Atomic delivery vs distributed write-back**: appending after `run_completed` requires the session gate to remain open
+### What Makes This Hard
+The gap is temporal. Information that belongs to session T becomes available at T+1. The event log is append-only but NOT closed after `run_completed` -- `ExecutionSessionGateV2` gates on `healthy` vs `corrupt`, not on completion state. Post-completion appends ARE valid.
+---
+## Architecture Contract (what exists, what's missing)
+### Engine owns (src/v2/durable-core/ + src/mcp/handlers/):
+- Session lifecycle (start, advance, checkpoint, resume)
+- HMAC token protocol -- cryptographic enforcement
+- Append-only event log with crash-safe writes
+- Port abstractions for all I/O (no direct fs/git/network calls in engine)
+- Projections: reads event log, produces typed views
+- **Records observations it can make directly**: git_branch, git_head_sha (via WorkspaceContextResolverPortV2 at START); endGitSha (via execFile direct call at END -- known port violation)
+### Coordinator/delivery layer owns (src/trigger/ + src/daemon/):
+- Git commits, pushes, PR creation
+- Delivery pipeline stages
+- SHA extraction from `git commit` output
+- Proof records, pipeline orchestration (future)
+- Agent loop, context injection, crash recovery
+### The gap:
+No mechanism for the delivery layer to record its observations (what it committed) back into the session event log.
+---
+## Philosophy Constraints
+- **Architectural fixes over patches**: always-empty is a patch; extending the port is the fix
+- **Dependency injection for boundaries**: git I/O must be behind a port in the engine layer
+- **Make illegal states unrepresentable**: `captureConfidence: 'high'` with `agentCommitShas: []` is already schema-enforced as invalid
+- **Validate at boundaries, trust inside**: SHA derivation must happen at the git boundary, never from agent memory
+- **YAGNI with discipline**: don't build write-back infrastructure until coordinators actually consume it
+---
+## Candidates
+### A: Always-empty (current PR #903)
+Engine emits `agentCommitShas: []` and `captureConfidence: 'none'` always. Console shows no SHAs. Coordinator scripts derive SHAs from `git log startSha..endSha` themselves.
+- **Tensions resolved**: engine purity, illegal states, YAGNI
+- **Tensions accepted**: information completeness, worktree-after-cleanup risk
+- **Failure mode**: worktree cleaned up before coordinator script runs `git log`
+- **Scope**: best-fit immediate fix; a patch, not an architectural fix
+- **Philosophy**: honors YAGNI, make-illegal-states-unrepresentable. Conflicts with architectural-fixes-over-patches.
+### B: Delivery write-back via DeliveryStage
+New 5th DeliveryStage appends `observation_recorded` event with `git_commit_shas` to session event log after successful `git commit`. Requires injecting `SessionEventLogAppendStorePortV2` into `DeliveryPipeline`.
+- **Tensions resolved**: information completeness (daemon sessions), crash safety (stage before sidecar delete)
+- **Tensions accepted**: delivery-action API change (new session store dependency); MCP sessions still get no SHAs
+- **Failure mode**: append fails silently if session health degraded (must be best-effort, never abort delivery)
+- **Scope**: best-fit for daemon/autoCommit sessions; accepted gap for MCP
+- **Philosophy**: honors architectural-fixes-over-patches, dependency-injection. Mild YAGNI tension.
+### C: Extend WorkspaceContextResolverPortV2 with end-state (port-correct fix)
+Add `resolveEndState(repoRoot, startSha) -> { endSha, commitShas }` to the workspace anchor port (or create `GitSnapshotPortV2`). Inject into `outcome-success.ts`. Runs `git rev-parse HEAD` + `git log startSha..HEAD --format=%H` in parallel. `run_completed` event gets `commitShas: string[]` field.
+- **Tensions resolved**: engine purity (eliminates execFile violation), information completeness (all session types), MCP + daemon covered
+- **Tensions accepted**: run_completed schema gets new field; old sessions project `commitShas: []` (non-breaking)
+- **Failure mode**: `git log startSha..HEAD` may include merge commits or upstream advances; use `--no-merges --first-parent` to scope
+- **Scope**: slightly broader than needed (touches port contract) but correct -- each piece is load-bearing
+- **Philosophy**: honors architectural-fixes-over-patches, dependency-injection-for-boundaries, make-illegal-states-unrepresentable.
+### D: Delivery sidecar store (reframe -- REJECTED)
+Delivery writes `delivery-result-<sessionId>.json` sidecar. Session-metrics reads from both event log and sidecar. REJECTED: violates single-source-of-truth principle for the event log. Two truth sources for SHA data is architecturally unsound for this codebase.
+---
+## Comparison and Recommendation
+**Recommendation: A now (PR #903 as-is), C next.**
+C is the architectural ideal because:
+1. Solves both problems simultaneously: SHA gap AND the existing execFile violation in outcome-success.ts
+2. Works for ALL session types (MCP + daemon) without special-casing
+3. Follows the port pattern the codebase is designed around
+4. run_completed is the correct home -- startGitSha is already there, endGitSha is already there, commitShas belongs alongside them
+B is viable but inferior: adds a session store dependency to the delivery layer (which currently has none), only covers daemon/autoCommit sessions, and has a semantic question (delivery artifacts vs session truth).
+A is correct as the IMMEDIATE state. PR #903 should merge. It makes the illegal state (high confidence + empty SHAs) impossible, and clears the field for C.
+**The original question -- do we need to split the engine? -- is NO.** The current architecture is sound. The fix is a port extension, not a structural split.
+---
+## Self-Critique
+**Strongest counter-argument against C**: run_completed schema migration. Existing stored events lack `commitShas`. Rebuttal: the session-metrics projection already handles absent fields (defaults to []). Adding `commitShas?: string[]` is a non-breaking, additive change.
+**What would tip to B**: if daemon sessions with autoCommit dominate and MCP sessions never need SHAs. Currently autoCommit is daemon-only, so B's coverage gap is acceptable. But if MCP sessions start running in worktrees with autoCommit, B becomes wrong.
+**Invalidating assumption**: if `git log startSha..HEAD` includes upstream merge commits, SHAs from unrelated PRs land in the session's record. Mitigation: `--no-merges --first-parent` in the port implementation.
+---
+## Open Questions
+1. Should `GitSnapshotPortV2` be a new port or an extension of `WorkspaceContextResolverPortV2`? New port has cleaner separation; extension avoids proliferating ports.
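+2. What is the right git log filter for commit SHAs: `--no-merges`, `--first-parent`, both? Depends on whether merge commits, and the single commits that squash-merged PRs produce, should be included. Note that `--no-merges` drops only commits with more than one parent, so squash commits pass through it.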
+3. Should Candidate C be implemented now (before PR #903 merges) or after? After is safer -- PR #903 is already in CI, touching outcome-success.ts again would create a dirty merge.
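
Reader's note (not part of the package diff): Candidate C above can be sketched as follows. This is a minimal illustration, not the implementation shipped in `package/dist/v2/infra/local/git-snapshot/index.js`, which this view does not show. The method name and return shape follow the signature proposed in the review findings; the helpers `commitRangeArgs`, `parseCommitShas`, and `gitStdout` are hypothetical names introduced here, and the choice to apply both `--no-merges` and `--first-parent` is an assumption, since the document leaves that question open.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Return shape taken from the proposed signature in the review findings.
interface EndSnapshot {
  endSha: string | null;
  commitShas: string[];
}

// Hypothetical helper: git log arguments for the commits made during the session.
// Applying both filters is an assumption; the design notes leave the filter open.
function commitRangeArgs(startSha: string): string[] {
  return ["log", "--no-merges", "--first-parent", "--format=%H", `${startSha}..HEAD`];
}

// Hypothetical helper: keep only lines that look like full SHA-1 or SHA-256 object names.
function parseCommitShas(stdout: string): string[] {
  return stdout
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^([0-9a-f]{40}|[0-9a-f]{64})$/.test(line));
}

// Hypothetical helper: best-effort git call that degrades to null instead of throwing.
async function gitStdout(repoRoot: string, args: string[]): Promise<string | null> {
  try {
    const { stdout } = await execFileAsync("git", args, { cwd: repoRoot });
    return stdout;
  } catch {
    return null;
  }
}

// Resolves the end SHA and the commit range in parallel; each degrades independently.
async function resolveEndSnapshot(repoRoot: string, startSha: string): Promise<EndSnapshot> {
  const [head, log] = await Promise.all([
    gitStdout(repoRoot, ["rev-parse", "HEAD"]),
    gitStdout(repoRoot, commitRangeArgs(startSha)),
  ]);
  return {
    endSha: head !== null && head.trim() !== "" ? head.trim() : null,
    commitShas: log !== null ? parseCommitShas(log) : [],
  };
}
```

Keeping the argument builder and the output parser as pure functions means the filter choice can be unit-tested without a repository, while only `gitStdout` touches the process boundary.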

package/docs/design/engine-boundary-review-findings.md ADDED

@@ -0,0 +1,72 @@
+# Engine Boundary Discovery: Design Review Findings
+**Date:** 2026-04-30
+**Decision:** A now (PR #903), C next (GitSnapshotPortV2)
+**Confidence:** HIGH
+---
+## Tradeoff Review
+| Tradeoff | Acceptable? | Invalidating condition |
+|---|---|---|
+| Console shows empty SHAs until C is built | Yes -- empty is honest, no current consumer needs populated SHAs | Proof records feature built before C |
+| MCP sessions get no SHAs under A-only | Yes -- human can check git; MCP is not the autonomous use case | MCP sessions start requiring attribution for proof records |
+| run_completed schema gets optional field | Yes -- non-breaking additive change; old sessions project [] | Schema versioning feature enforces strict field presence |
+All tradeoffs acceptable under current conditions.
+---
+## Failure Mode Review
+| Failure Mode | Coverage | Risk |
+|---|---|---|
+| git log includes upstream merge commits | Mitigate with `--no-merges --first-parent` in port implementation | MEDIUM -- must be in the implementation spec |
+| outcome-success.ts partial failure (endGitSha or commitShas fails) | Promise.all + best-effort; both degrade to null/[] independently | LOW -- same behavior as current resolveEndGitSha |
+**Highest-risk**: FM1 (git log includes upstream merge commits). The Candidate C implementation must specify `--no-merges --first-parent` explicitly.
+---
+## Runner-up / Simpler Alternative Review
+- B (delivery write-back): no elements worth borrowing. Its coupling (session store injected into delivery layer) is the problem C avoids.
+- Simpler C variant (fix execFile violation only, skip commitShas): would require a second port extension later. Marginal cost of doing both at once is low. Not simpler enough to justify the incompleteness.
+- No hybrid warranted. A-then-C sequencing is correct.
+---
+## Philosophy Alignment
+**Satisfied**: make-illegal-states-unrepresentable, validate-at-boundaries, errors-as-data, dependency-injection-for-boundaries, architectural-fixes-over-patches.
+**Under acceptable tension**: YAGNI (building port before proof records consume it -- justified by dual purpose: also fixes execFile violation). Immutability (appending to completed session -- design-correct, session gate confirms).
+---
+## Findings
+### YELLOW: git log filter not specified
+The implementation of Candidate C's `resolveCommitShaRange` must use `--no-merges --first-parent` to avoid including upstream merge commits. This is not currently specified anywhere. Without it, sessions on branches with merge commits from main would record incorrect SHAs.
+**Action**: add explicit filter spec to the Candidate C implementation ticket.
+### YELLOW: No forcing function to build C after A merges
+PR #903 will merge. C is the architectural fix. Without a ticket, C may never get built and the console SHA display stays empty indefinitely.
+**Action**: create a GitHub issue for Candidate C immediately after PR #903 merges.
+---
+## Recommended Revisions to Selected Design
+1. Before implementing C: specify `--no-merges --first-parent` as the required git log filter in the issue description
+2. New port name: `GitSnapshotPortV2` is cleaner than extending `WorkspaceContextResolverPortV2` -- keeps the end-state capture concern separate from the start-state anchor concern
+3. New method signature: `resolveEndSnapshot(repoRoot: string, startSha: string): Promise<{ endSha: string | null, commitShas: string[] }>`
+4. In `outcome-success.ts`: replace `resolveEndGitSha` with `resolveEndSnapshot` -- a single port call that resolves the end SHA and `commitShas` in parallel and returns both
+---
+## Residual Concerns
+- **Architecture contract not written**: the meta-fix (documenting what the engine layer owns vs coordinator/delivery) has not been done. This is the most important long-term outcome from this discovery. Without it, future contributors will make the same boundary violations. Should be added to `docs/design/v2-core-design-locks.md`.
+- **C has open design question**: `GitSnapshotPortV2` vs extension of `WorkspaceContextResolverPortV2`. Both work. `GitSnapshotPortV2` is slightly cleaner (separate port, separate concern). Decision can be made at implementation time.
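
Reader's note (not part of the package diff): two claims in these findings can be made concrete with a short sketch: that an optional `commitShas` field is non-breaking because old events project to `[]`, and that high confidence with no SHAs is an invalid state. The field names come from the design notes. The interface shapes, the helper names `projectCommitShas` and `isValidCapture`, and the restriction of confidence to the two values mentioned in the notes are assumptions made for illustration, not the package's actual schema.

```typescript
// Hypothetical event payload; only the field names are taken from the design notes.
interface RunCompletedData {
  startGitSha: string | null;
  endGitSha: string | null;
  commitShas?: string[]; // absent on events stored before the field was added
}

// Projection default: events without the field project to an empty list.
function projectCommitShas(data: RunCompletedData): string[] {
  return data.commitShas ?? [];
}

// Assumption: only the two confidence values named in the notes are modelled here.
type CaptureConfidence = "none" | "high";

interface ShaCapture {
  captureConfidence: CaptureConfidence;
  agentCommitShas: string[];
}

// The invalid state from the notes: high confidence with no SHAs to back it up.
function isValidCapture(capture: ShaCapture): boolean {
  return !(capture.captureConfidence === "high" && capture.agentCommitShas.length === 0);
}
```

Under these assumptions, Candidate A's always-empty output (`'none'` with `[]`) passes validation, and a session stored before the schema change still yields a well-typed `string[]`.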

package/docs/ideas/backlog.md CHANGED

@@ -18,6 +18,26 @@ See the scoring rubric in the "Agent-assisted backlog prioritization" entry (Wor
 ## P0 / Critical (blocks WorkTrain from working correctly)
+### wr.coding-task implementation loop does not exit when slices complete (Apr 30, 2026)
+**Status: bug** | Priority: high
+**Score: 13** | Cor:3 Cap:1 Eff:2 Lev:2 Con:3 | Blocked: no
+The `wr.coding-task` workflow's implementation loop (up to 20 passes) does not exit when all slices are complete. The `wr.loop_control` stop artifact is emitted correctly but the loop decision gate never fires because `currentSlice.name` remains `[unset]` -- the engine is not tracking which slice is current across passes. The loop ran 8 passes before eventually exiting on its own.
+This means: (1) every coding task session wastes passes doing no work, (2) the agent cannot confidently signal completion, (3) total session turn count is inflated, increasing cost and timeout risk.
+**Root cause**: the `slices` array is stored in context but the engine does not advance a `currentSliceIndex` counter -- or the counter is not being surfaced to the step as `currentSlice.name`. The `wr.loop_control` artifact is evaluated at the loop decision step, but that step only fires when the engine recognizes it's at the end of a pass. With `currentSlice.name = [unset]`, the recognition fails.
+**Things to hash out:**
+- Is the bug in the workflow JSON (slices not wired to currentSlice tracking), in the engine (loop_control artifact evaluation), or in the way context variables are threaded between passes?
+- Does the issue affect all loops with `wr.loop_control`, or only the implementation loop in `wr.coding-task` specifically?
+- Is there a workaround agents can use today (e.g. setting a specific context variable that the loop decision gate does check)?
+- Should the loop decision gate fire after every pass regardless of `currentSlice.name` state, or only when the slice tracking is valid?
+---
 ### Intent gap: agent builds what it understood, not what the user meant (Apr 30, 2026)
 **Status: idea** | Priority: high
@@ -1639,6 +1659,32 @@ Ghost nodes represent steps that were compiled into the DAG but skipped at runti
 ## Workflow Library
+### Automatic root cause analysis when MR review finds issues post-coding (Apr 30, 2026)
+**Status: idea** | Priority: high
+**Score: 13** | Cor:3 Cap:3 Eff:2 Lev:3 Con:2 | Blocked: no
+When an MR review session (run by a WorkTrain agent) finds issues in a coding session's output, WorkTrain should automatically investigate why the coding agent missed it and determine whether the workflow, the prompts, or the process can be improved.
+**Two distinct triggers:**
+1. **WorkTrain MR review finds something**: after a WorkTrain review session produces findings, the coordinator should automatically spawn an analysis session asking: why did the coding agent produce code with this issue? Was it a workflow gap (missing verification step, insufficient scrutiny at a phase), a prompt gap (the agent wasn't told to check this), or a context gap (the agent didn't have the information needed)?
+2. **Human finds something post-review**: when a human reviewer comments on or requests changes to a PR that already passed WorkTrain's review, this is doubly significant -- it means both the coding agent AND the review agent missed it. WorkTrain should automatically investigate why both missed it and whether the review workflow has a systematic blind spot.
+**Why this matters**: every finding that slips through is a signal about a workflow or process gap. Today that signal is lost. Capturing it systematically and feeding it back into workflow improvement closes the quality loop.
+**Things to hash out:**
+- How does WorkTrain detect that a human has commented on a PR post-review? This requires monitoring the PR for new review activity after WorkTrain's session completed -- either webhook events or polling.
+- What does the analysis session actually produce? A structured finding about the gap? A concrete proposal for workflow improvement? Both?
+- Who reviews the analysis output before it becomes a workflow change? Auto-applying workflow changes based on analysis is risky.
+- How do you distinguish "the workflow is fine but this was a genuinely hard edge case" from "the workflow has a systematic gap"? A single miss doesn't prove a gap; multiple misses of the same kind do.
+- Should the analysis result feed directly into `workflow-effectiveness-assessment`, or is it a separate concern?
+- For the "coding agent missed it" case: is the right fix to change the coding workflow, or to make the review workflow more adversarial?
+---
 ### Workflow previewer for compiled and runtime behavior
 **Status: idea** | Priority: medium

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@exaudeus/workrail",
-  "version": "3.73.1",
+  "version": "3.73.2",
   "description": "Step-by-step workflow enforcement for AI agents via MCP",
   "license": "MIT",
   "repository": {