@exaudeus/workrail 3.80.0 → 3.81.0

@@ -3418,6 +3418,26 @@ Open questions: does `wr.dispatch` replace `workflowId` in trigger config, or co
 
  ---
 
+ ### wr.mr-review quality and architecture overhaul (May 8, 2026)
+
+ **Status: idea** | Priority: high
+
+ **Score: 13** | Cor:3 Cap:3 Eff:2 Lev:3 Con:2 | Blocked: no
+
+ The current `wr.mr-review` workflow produces findings that are often shallow, miss real issues present in the diff, and conflate pre-existing problems with changes introduced by the PR. In practice, reviews have missed incomplete migrations, attributed failures to the wrong root cause, and approved PRs with silent regressions in the commit history. The workflow runs as a single long-lived session reading the full diff in one pass, which limits how deeply any single concern can be investigated.
+
+ The core gap: the review workflow does not spawn focused sub-agents to investigate suspicious areas. A reviewer that spots a potentially incomplete migration should be able to spawn a quick agent to grep the codebase for other sites that were not updated -- rather than noting it as a surface observation and moving on. Without targeted investigation, the review is pattern-matching on the diff rather than reasoning about system-wide impact.
+
+ **Things to hash out:**
+ - What should trigger spawning a sub-agent during review -- explicit workflow step, or reviewer judgment via `spawn_agent`?
+ - Should sub-agents have narrow scope (e.g. "find all remaining `sessionId: string` in `src/daemon/`") or full workspace access?
+ - How does the parent session synthesize sub-agent findings into the final verdict? What happens if a sub-agent returns inconclusive results?
+ - What is the right decomposition of review concerns -- by file, by concern type, or by risk level?
+ - How does the review workflow distinguish "pre-existing issue on main" from "regression introduced by this PR"?
+ - What does a measurable acceptance criterion look like -- false negative rate, human reviewer agreement, or something else?
+
+ ---
+
  ### MR review session count inflation
 
  **Status: idea** | Priority: medium
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@exaudeus/workrail",
- "version": "3.80.0",
+ "version": "3.81.0",
  "description": "Step-by-step workflow enforcement for AI agents via MCP",
  "license": "MIT",
  "repository": {