@exaudeus/workrail 3.38.0 → 3.39.0

@@ -0,0 +1,44 @@
1
+ # Hypothesis Challenge: PR Review Coordinator HTTP-First Design
2
+
3
+ *Generated: 2026-04-18*
4
+
5
+ ## Target Claim
6
+
7
+ The PR review coordinator's HTTP-first design (Candidate B) is sound: the 2-call HTTP notes extraction is reliable, the two-tier keyword parser correctly classifies review severity, and the 5 robustness rules are sufficient.
8
+
9
+ ## Strongest Counter-Argument
10
+
11
+ `phase-6-final-handoff` has `requireConfirmation: true`. The `preferredTipNodeId` may point to a checkpoint or confirmation node whose `recapMarkdown` is sparse or absent, causing the keyword scanner to misclassify clean PRs as 'unknown' and escalate them unnecessarily.
12
+
13
+ **Verdict:** Mitigated. In autonomous mode, the agent calls `continue_workflow` with substantive notes before the confirmation gate, and WorkRail stores these notes as the RECAP payload for the node. `recapMarkdown` is therefore populated for autonomous sessions completing phase-6. The `requireConfirmation` flag only blocks advancement until notes are written, which the autonomous agent does.
14
+
15
+ ## Weak Assumptions / Evidence Gaps
16
+
17
+ 1. **`runs[0]` is always the most recent run:** True in practice. WorkRail appends new runs; index 0 is the most recent. Confirmed in worktrain-await.ts pollSession() which uses `runs[0]`.
18
+
19
+ 2. **Keyword scan is context-unaware:** 'blocking' appearing in a negated context ('this is not blocking') would trigger a false positive. Mitigation: match a blocking keyword only when no negation precedes it. Simpler alternative: apply priority order, so any blocking keyword yields 'blocking' regardless of context. This conservative default is acceptable.
20
+
21
+ 3. **go/no-go time check needs adaptation:** Rule 3 (don't spawn if remaining time < 20 minutes) was designed for daemon sessions with known maxSessionMinutes. A CLI coordinator has no such limit. Adaptation: track wall-clock time since coordinator start, refuse to spawn new sessions if elapsed > coordinator_max_minutes - 20.
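The negation-aware mitigation from item 2 could be sketched roughly as follows. This is only an illustration: the helper name `hasUnnegatedBlockingKeyword`, the negation word list, and the single-preceding-word heuristic are all assumptions, not part of the coordinator design.

```typescript
// Hypothetical sketch: keyword list taken from the design notes,
// negation list and matching heuristic are assumptions.
const BLOCKING_KEYWORDS: string[] = ["blocking", "critical", "request changes"];
const NEGATIONS: string[] = ["not", "no", "non", "isn't", "never"];

/** True if any blocking keyword appears without a negation word directly before it. */
function hasUnnegatedBlockingKeyword(notes: string): boolean {
  const text = notes.toLowerCase();
  for (const keyword of BLOCKING_KEYWORDS) {
    let index = text.indexOf(keyword);
    while (index !== -1) {
      // Inspect only the single word immediately preceding the match.
      const preceding = text.slice(0, index).trim().split(/\s+/).pop() ?? "";
      const word = preceding.replace(/[^a-z']/g, "");
      if (NEGATIONS.indexOf(word) === -1) return true;
      // This occurrence was negated; keep scanning for later occurrences.
      index = text.indexOf(keyword, index + keyword.length);
    }
  }
  return false;
}
```

A one-word lookback misses phrasings such as "not really blocking", which is one reason the simpler priority-order rule may be the safer default.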
22
+
23
+ ## Likely Failure Modes
24
+
25
+ 1. **recapMarkdown null for final step** -> 'unknown' severity -> escalate (conservative, correct)
26
+ 2. **Fix-agent loop max 3 passes exceeded** -> escalate after 3 (loop counter enforces this)
27
+ 3. **ECONNREFUSED on daemon calls** -> early exit with clear error message
28
+ 4. **Keyword false positive** -> a clean PR is escalated as blocking (a missed merge rather than a bad merge, acceptable)
29
+ 5. **Merge conflict at merge time** -> `gh pr merge` fails, coordinator reports error and escalates
30
+
31
+ ## Critical Tests
32
+
33
+ - `parseFindingsFromNotes(null)` -> returns err, classifies as 'unknown'
34
+ - `parseFindingsFromNotes(markdown with 'not blocking but...')` -> must NOT return 'blocking'
35
+ - Loop counter: 3 passes with persistent minor -> escalate on pass 3, NOT pass 4
36
+ - ECONNREFUSED: spawnSession failure propagates cleanly to stderr and exit code 1
37
+
38
+ ## Verdict: Keep
39
+
40
+ The design is sound. The 2-call HTTP extraction works for autonomous sessions. The two-tier parser with conservative defaults is sufficient. The 5 robustness rules need one adaptation: Rule 3 (go/no-go time check) should use wall-clock time since coordinator start, not daemon session remaining time.
41
+
42
+ ## Next Action
43
+
44
+ Proceed with Candidate B implementation. Add adaptation to Rule 3: track coordinator wall-clock start time; refuse new spawns if `now() - startTime > (coordinatorMaxMs - 20*60*1000)`. Default `coordinatorMaxMs = 90 * 60 * 1000` (90 minutes).
@@ -0,0 +1,85 @@
1
+ # Execution Simulation Report: PR Review Coordinator Failure Paths
2
+
3
+ *Generated: 2026-04-18*
4
+
5
+ ## Summary
6
+
7
+ Three failure paths were simulated for the PR review coordinator design. All three produce correct outcomes under the proposed design. One gap was identified: Rule 3 (go/no-go time check) needs adaptation for the CLI context.
8
+
9
+ ## Scenario 1: recapMarkdown is Null
10
+
11
+ **Setup:** Session completes successfully, but `GET /api/v2/sessions/:id/nodes/:nodeId` returns `recapMarkdown: null`.
12
+
13
+ **Trace:**
14
+ ```
15
+ getAgentResult('handle-419')
16
+ GET /api/v2/sessions/handle-419 -> runs[0].preferredTipNodeId = 'node-xyz'
17
+ GET /api/v2/sessions/handle-419/nodes/node-xyz -> recapMarkdown = null
18
+ returns null
19
+
20
+ parseFindingsFromNotes(null) -> err('notes is null or empty')
21
+ classifySeverity(err) -> 'unknown'
22
+ route('unknown') -> escalate
23
+ ```
24
+
25
+ **Divergence from expected:** None -- conservative escalation is the designed behavior.
26
+
27
+ **Outcome:** PR escalated, not merged. No crash. Clear escalation note written.
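The null path traced above might look like the sketch below. The function names come from the trace; the `Result` shape, the `Route` values, and the injected `scanKeywords` callback are assumptions made so the example is self-contained.

```typescript
// Hypothetical types: the real coordinator's Result and Severity shapes may differ.
type Result<T> = { kind: "ok"; value: T } | { kind: "err"; error: string };
type Severity = "clean" | "minor" | "blocking" | "unknown";
type Route = "merge" | "fix" | "escalate";

/** Rejects null explicitly, then rejects empty or whitespace-only notes. */
function parseFindingsFromNotes(notes: string | null): Result<string> {
  if (notes === null) return { kind: "err", error: "notes is null or empty" };
  if (notes.trim() === "") return { kind: "err", error: "notes is null or empty" };
  return { kind: "ok", value: notes };
}

/** Any parse failure maps to 'unknown'; otherwise defer to the keyword scanner. */
function classifySeverity(
  parsed: Result<string>,
  scanKeywords: (notes: string) => Severity,
): Severity {
  if (parsed.kind === "err") return "unknown";
  return scanKeywords(parsed.value);
}

/** Assumed routing table: only 'unknown' -> escalate is confirmed by the trace. */
function route(severity: Severity): Route {
  switch (severity) {
    case "clean":
      return "merge";
    case "minor":
      return "fix";
    case "blocking":
    case "unknown":
      return "escalate";
  }
}
```

Checking `notes === null` separately from the empty-string case keeps the null path explicit rather than relying on a single falsy check.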
28
+
29
+ ## Scenario 2: Fix-Agent Loop Exhaustion (3 Passes, Persistent Minor)
30
+
31
+ **Setup:** PR #406 has minor findings. Fix agent runs 3 times but each re-review still returns minor.
32
+
33
+ **Trace:**
34
+ ```
35
+ Pass 1: passCount=0 -> review: 'minor' -> passCount becomes 1 -> spawn fix agent -> re-review
36
+ Pass 2: passCount=1 -> review: 'minor' -> passCount becomes 2 -> spawn fix agent -> re-review
37
+ Pass 3: passCount=2 -> review: 'minor' -> passCount becomes 3 -> CHECK: 3 >= 3 -> STOP
38
+ -> escalate, write: 'PR #406: 3 fix passes exhausted, still minor. Manual review required.'
39
+ ```
40
+
41
+ **Divergence from expected:** None -- loop terminates correctly at pass 3.
42
+
43
+ **Key invariant verified:** `passCount >= MAX_FIX_PASSES` check happens BEFORE spawning fix agent, not after. This prevents a 4th spawn.
44
+
45
+ **Outcome:** PR escalated after exactly 3 passes. Not merged.
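A minimal sketch of the loop invariant, following the trace above step by step. `MAX_FIX_PASSES` and `passCount` come from the report; the `LoopDeps` interface, the `runFixLoop` name, and the outcome shape are assumptions for illustration.

```typescript
const MAX_FIX_PASSES = 3;

type Severity = "clean" | "minor" | "blocking" | "unknown";
type LoopOutcome = { kind: "merge" } | { kind: "escalate"; reason: string };

// Hypothetical dependency interface so the loop can be exercised without a daemon.
interface LoopDeps {
  review(): Promise<Severity>;
  spawnFixAgent(): Promise<void>;
}

async function runFixLoop(deps: LoopDeps): Promise<LoopOutcome> {
  let passCount = 0;
  while (true) {
    const severity = await deps.review();
    if (severity === "clean") return { kind: "merge" };
    if (severity !== "minor") {
      return { kind: "escalate", reason: `review returned ${severity}` };
    }
    passCount += 1;
    // The limit is checked BEFORE spawning, so no fix agent starts once it is reached.
    if (passCount >= MAX_FIX_PASSES) {
      return {
        kind: "escalate",
        reason: `${MAX_FIX_PASSES} fix passes exhausted, still minor. Manual review required.`,
      };
    }
    await deps.spawnFixAgent();
  }
}
```

With a reviewer that always returns 'minor', this sketch performs three reviews and stops on the third, matching the pass numbering in the trace.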
46
+
47
+ ## Scenario 3: Daemon Not Running (ECONNREFUSED)
48
+
49
+ **Setup:** No daemon running on port 3456. No lock files present.
50
+
51
+ **Trace:**
52
+ ```
53
+ discoverPort() -> no lock files -> falls back to 3456
54
+ spawnSession('mr-review-workflow-agentic', 'Review PR #419...', '/workspace')
55
+ POST http://127.0.0.1:3456/api/v2/auto/dispatch
56
+ -> fetch throws Error: ECONNREFUSED 127.0.0.1:3456
57
+ -> spawnSession catches -> returns err('Could not connect to WorkTrain daemon on port 3456')
58
+ runPrReviewCoordinator() receives err
59
+ -> deps.stderr('Could not connect to WorkTrain daemon on port 3456. Start with: worktrain daemon')
60
+ -> returns { kind: 'failure', exitCode: 1 }
61
+ ```
62
+
63
+ **Divergence from expected:** None.
64
+
65
+ **Outcome:** Clear error message, exit code 1, no hang, no partial state.
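The connection-failure handling in this trace could be sketched as below. The endpoint path and error message are taken from the trace; the request body fields, the `FetchLike` type, and the parameter order are assumptions, and the fetch function is injected so the failure can be simulated without a network.

```typescript
type Result<T> = { kind: "ok"; value: T } | { kind: "err"; error: string };

// Hypothetical minimal fetch signature, injected for testability.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function spawnSession(
  fetchFn: FetchLike,
  port: number,
  workflowId: string,
  goal: string,
  workspacePath: string,
): Promise<Result<unknown>> {
  try {
    const response = await fetchFn(`http://127.0.0.1:${port}/api/v2/auto/dispatch`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // Assumed payload shape; the real dispatch API may expect different fields.
      body: JSON.stringify({ workflowId, goal, workspacePath }),
    });
    if (!response.ok) {
      return { kind: "err", error: `Daemon responded with HTTP ${response.status}` };
    }
    return { kind: "ok", value: await response.json() };
  } catch {
    // Network-level failures (for example ECONNREFUSED) surface as a typed error, not a throw.
    return { kind: "err", error: `Could not connect to WorkTrain daemon on port ${port}` };
  }
}
```

Returning an error value instead of rethrowing lets the caller write to stderr and exit with code 1 in one place, as the trace describes.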
66
+
67
+ ## Divergence Analysis
68
+
69
+ No divergences found. All 3 scenarios produce correct outcomes.
70
+
71
+ ## Gap Identified: Rule 3 Adaptation for CLI Context
72
+
73
+ **Problem:** The original Rule 3 (go/no-go time check: don't spawn if remaining time < 20 minutes) was written for daemon sessions that have a `maxSessionMinutes` parameter. A CLI coordinator script has no such parameter.
74
+
75
+ **Adaptation required:** Track wall-clock time since coordinator script start (`const startTimeMs = deps.now()` at beginning). Before spawning any new child session (review OR fix agent), check: `if (deps.now() - startTimeMs > coordinatorMaxMs - 20 * 60 * 1000)` -> refuse to spawn.
76
+
77
+ **Default:** `coordinatorMaxMs = 90 * 60 * 1000` (90 minutes). Configurable via `--max-runtime` flag if needed.
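The adapted check can be expressed as a small pure function. The 20-minute margin, the 90-minute default, and the comparison direction follow the text above; the function and constant names are hypothetical.

```typescript
const GO_NO_GO_MARGIN_MS = 20 * 60 * 1000; // 20 minutes
const DEFAULT_COORDINATOR_MAX_MS = 90 * 60 * 1000; // 90 minutes

/**
 * Returns false once elapsed wall-clock time exceeds the budget minus the margin,
 * mirroring `deps.now() - startTimeMs > coordinatorMaxMs - 20 * 60 * 1000` -> refuse.
 */
function canSpawnSession(
  nowMs: number,
  startTimeMs: number,
  coordinatorMaxMs: number = DEFAULT_COORDINATOR_MAX_MS,
): boolean {
  const elapsedMs = nowMs - startTimeMs;
  return elapsedMs <= coordinatorMaxMs - GO_NO_GO_MARGIN_MS;
}
```

With the default budget, spawning is refused once more than 70 minutes have elapsed since coordinator start. Passing the clock values in as arguments keeps the check deterministic under test.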
78
+
79
+ ## Recommendations
80
+
81
+ 1. Handle the `parseFindingsFromNotes(null)` path explicitly -- don't rely on an empty-string check.
82
+ 2. Keyword parser priority: BLOCKING/CRITICAL/REQUEST CHANGES takes absolute precedence over APPROVE/CLEAN/LGTM (blocking wins even if both present).
83
+ 3. Fix-agent loop: check `passCount >= MAX_FIX_PASSES` BEFORE spawning, not after.
84
+ 4. Add `coordinatorStartMs` tracking and Rule 3 go/no-go check adapted for CLI (wall-clock elapsed time).
85
+ 5. `gh pr merge` failures: catch, write to stderr, escalate -- do NOT retry automatically.
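Recommendation 2 (blocking keywords take precedence) might be sketched as follows. The keyword lists come from the recommendation; the function name, case-insensitive substring matching, and the three-value return type are assumptions, and detection of 'minor' findings is deliberately left out.

```typescript
const BLOCKING_TIER: string[] = ["blocking", "critical", "request changes"];
const APPROVAL_TIER: string[] = ["approve", "clean", "lgtm"];

type KeywordVerdict = "blocking" | "clean" | "unknown";

/** Two-tier scan: the blocking tier is checked first and wins even if approval words are present. */
function classifyByKeywords(notes: string): KeywordVerdict {
  const text = notes.toLowerCase();
  const containsAny = (keywords: string[]): boolean =>
    keywords.some((keyword) => text.indexOf(keyword) !== -1);

  if (containsAny(BLOCKING_TIER)) return "blocking";
  if (containsAny(APPROVAL_TIER)) return "clean";
  // No keyword from either tier: fall back to the conservative default.
  return "unknown";
}
```

Because matching is by substring, "approve" also matches "approved"; a stricter word-boundary match would be a reasonable refinement.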
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exaudeus/workrail",
3
- "version": "3.38.0",
3
+ "version": "3.39.0",
4
4
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
5
5
  "license": "MIT",
6
6
  "repository": {