wogiflow 2.17.0 → 2.18.0

Files changed (111)
  1. package/.claude/commands/wogi-audit.md +212 -17
  2. package/.claude/commands/wogi-research.md +37 -0
  3. package/.claude/commands/wogi-review.md +200 -22
  4. package/.claude/commands/wogi-start.md +45 -0
  5. package/.claude/docs/claude-code-compatibility.md +46 -1
  6. package/.claude/docs/intent-grounded-review.md +209 -0
  7. package/.claude/settings.json +34 -1
  8. package/.workflow/agents/logic-adversary.md +8 -0
  9. package/.workflow/templates/claude-md.hbs +18 -0
  10. package/lib/installer.js +22 -0
  11. package/lib/utils.js +29 -3
  12. package/lib/workspace-changelog.js +2 -1
  13. package/lib/workspace-channel-server.js +4 -6
  14. package/lib/workspace-contracts.js +5 -4
  15. package/lib/workspace-events.js +8 -7
  16. package/lib/workspace-gates.js +4 -3
  17. package/lib/workspace-integration-tests.js +2 -1
  18. package/lib/workspace-intelligence.js +3 -2
  19. package/lib/workspace-locks.js +2 -1
  20. package/lib/workspace-messages.js +7 -6
  21. package/lib/workspace-routing.js +14 -26
  22. package/lib/workspace-session.js +7 -6
  23. package/lib/workspace-sync.js +9 -8
  24. package/package.json +4 -2
  25. package/scripts/base-workflow-step.js +1 -1
  26. package/scripts/flow +19 -0
  27. package/scripts/flow-adaptive-learning.js +1 -1
  28. package/scripts/flow-aggregate.js +2 -1
  29. package/scripts/flow-architect-pass.js +3 -3
  30. package/scripts/flow-archive-runs.js +372 -0
  31. package/scripts/flow-ask.js +1 -1
  32. package/scripts/flow-ast-grep.js +216 -0
  33. package/scripts/flow-audit-gates.js +1 -1
  34. package/scripts/flow-auto-learn.js +8 -11
  35. package/scripts/flow-bug.js +2 -2
  36. package/scripts/flow-capture-gate.js +644 -0
  37. package/scripts/flow-capture.js +4 -3
  38. package/scripts/flow-cli-flags.js +95 -0
  39. package/scripts/flow-community-sync.js +2 -1
  40. package/scripts/flow-community.js +6 -6
  41. package/scripts/flow-conclusion-classifier.js +310 -0
  42. package/scripts/flow-config-defaults.js +3 -3
  43. package/scripts/flow-constants.js +8 -11
  44. package/scripts/flow-context-scoring.js +1 -0
  45. package/scripts/flow-correction-detector.js +344 -3
  46. package/scripts/flow-damage-control.js +1 -1
  47. package/scripts/flow-decisions-merge.js +1 -0
  48. package/scripts/flow-done-gates.js +20 -0
  49. package/scripts/flow-done-report.js +2 -2
  50. package/scripts/flow-done.js +4 -4
  51. package/scripts/flow-epics.js +5 -11
  52. package/scripts/flow-health.js +145 -1
  53. package/scripts/flow-id.js +92 -0
  54. package/scripts/flow-io.js +15 -5
  55. package/scripts/flow-knowledge-router.js +2 -1
  56. package/scripts/flow-links.js +1 -1
  57. package/scripts/flow-log-manager.js +2 -1
  58. package/scripts/flow-logic-adversary.js +4 -4
  59. package/scripts/flow-long-input-cli.js +6 -0
  60. package/scripts/flow-long-input-stories.js +1 -1
  61. package/scripts/flow-loop-retry-learning.js +1 -1
  62. package/scripts/flow-mcp-capabilities.js +2 -3
  63. package/scripts/flow-mcp-docs.js +2 -1
  64. package/scripts/flow-memory-blocks.js +2 -1
  65. package/scripts/flow-memory-sync.js +1 -1
  66. package/scripts/flow-memory.js +767 -0
  67. package/scripts/flow-migrate-igr.js +1 -1
  68. package/scripts/flow-migrate.js +2 -1
  69. package/scripts/flow-model-adapter.js +1 -1
  70. package/scripts/flow-model-config.js +5 -1
  71. package/scripts/flow-model-profile.js +2 -1
  72. package/scripts/flow-orchestrate.js +3 -3
  73. package/scripts/flow-output.js +29 -0
  74. package/scripts/flow-parallel.js +10 -9
  75. package/scripts/flow-pattern-enforcer.js +2 -1
  76. package/scripts/flow-permissions-audit.js +124 -0
  77. package/scripts/flow-plugin-registry.js +2 -2
  78. package/scripts/flow-progress.js +5 -1
  79. package/scripts/flow-project-analyzer.js +1 -1
  80. package/scripts/flow-promote.js +510 -0
  81. package/scripts/flow-registries.js +86 -0
  82. package/scripts/flow-request-log.js +133 -0
  83. package/scripts/flow-research-protocol.js +0 -1
  84. package/scripts/flow-revision-tracker.js +2 -1
  85. package/scripts/flow-roadmap.js +2 -1
  86. package/scripts/flow-rules-sync.js +3 -7
  87. package/scripts/flow-session-end.js +3 -1
  88. package/scripts/flow-session-learning.js +6 -13
  89. package/scripts/flow-session-state.js +2 -2
  90. package/scripts/flow-setup-hooks.js +2 -1
  91. package/scripts/flow-skill-create.js +1 -1
  92. package/scripts/flow-skill-freshness.js +6 -7
  93. package/scripts/flow-skill-learn.js +1 -1
  94. package/scripts/flow-step-coverage.js +1 -1
  95. package/scripts/flow-step-security.js +1 -1
  96. package/scripts/flow-story.js +58 -10
  97. package/scripts/flow-sys.js +204 -0
  98. package/scripts/flow-task-hierarchy.js +88 -0
  99. package/scripts/flow-tech-debt.js +2 -1
  100. package/scripts/flow-test-api.js +1 -1
  101. package/scripts/flow-utils.js +60 -890
  102. package/scripts/hooks/core/bugfix-scope-gate.js +5 -4
  103. package/scripts/hooks/core/deploy-gate.js +1 -1
  104. package/scripts/hooks/core/pre-tool-helpers.js +72 -0
  105. package/scripts/hooks/core/pre-tool-orchestrator.js +442 -0
  106. package/scripts/hooks/core/routing-gate.js +8 -0
  107. package/scripts/hooks/core/session-end.js +28 -0
  108. package/scripts/hooks/entry/claude-code/pre-tool-use.js +48 -492
  109. package/scripts/hooks/entry/shared/hook-runner.js +1 -1
  110. package/scripts/registries/schema-registry.js +1 -1
  111. package/scripts/registries/service-registry.js +1 -1
@@ -29,7 +29,7 @@ Auto-detects when to use multi-pass (4 sequential passes) vs parallel (3 agents)
29
29
  At each phase checkpoint, display a progress bar AND update the progress state file:
30
30
 
31
31
  ```bash
32
- node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"wf-XXX","command":"/wogi-review","phase":"AI Review","phaseNum":2,"totalPhases":5,"step":"Agent 3/6 complete","stepNum":3,"totalSteps":6}'
32
+ node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"wf-XXX","command":"/wogi-review","phase":"AI Review","phaseNum":2,"totalPhases":7,"step":"Agent 3/6 complete","stepNum":3,"totalSteps":6}'
33
33
  ```
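A minimal sketch of how the `update` payload above could be assembled programmatically instead of hand-written. The helper name `buildProgressCommand` and its required-field check are hypothetical and not part of wogiflow; only the script path and the payload field names come from the command above.

```javascript
// Hypothetical helper (not a wogiflow API): builds the shell command shown above.
// Field names mirror the example payload; the required-field list is an assumption.
function buildProgressCommand(state) {
  const required = ['taskId', 'command', 'phase', 'phaseNum', 'totalPhases'];
  for (const key of required) {
    if (state[key] === undefined) {
      throw new Error(`missing progress field: ${key}`);
    }
  }
  // The JSON is wrapped in single quotes for the shell, so escape embedded ones.
  const json = JSON.stringify(state).replace(/'/g, `'\\''`);
  return `node node_modules/wogiflow/scripts/flow-progress-tracker.js update '${json}'`;
}

const cmd = buildProgressCommand({
  taskId: 'wf-XXX',
  command: '/wogi-review',
  phase: 'AI Review',
  phaseNum: 2,
  totalPhases: 7,
  step: 'Agent 3/6 complete',
  stepNum: 3,
  totalSteps: 6,
});
console.log(cmd);
```

Building the payload from an object keeps `totalPhases` in one place, which is where the 5 → 7 change in this hunk would otherwise have to be repeated by hand.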
34
34
 
35
35
  **Standard format for each checkpoint:**
@@ -38,34 +38,49 @@ node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"w
38
38
  Agent 3/6 complete
39
39
  ```
40
40
 
41
- **Phase mapping for /wogi-review:**
41
+ **Phase mapping for /wogi-review (v6.0 — IGR-hardened):**
42
42
  | phaseNum | Phase | Description |
43
43
  |-------|----------|-------------|
44
+ | 0 | Review Framing | Scope + assumptions (IGR v6.0) |
44
45
  | 1 | Verification Gates | Syntax, lint, tests |
45
46
  | 2 | AI Review | N agents (sub-steps = agents) |
47
+ | 2.5 | Git-Verified Claims | Cross-reference spec vs diff |
48
+ | 2.8 | Findings Adversary | Different-model critique (IGR v6.0) |
46
49
  | 3 | Standards + Promotion | Compliance check + pattern learning |
47
50
  | 4 | Optimization | Solution suggestions |
48
- | 5 | Post-Review | Fix routing, learning, archive |
51
+ | 5 | Post-Review | Fix routing, truth gate, archive |
52
+
53
+ Note: pass `totalPhases: 7` to the progress tracker. There are 8 named phases overall (0, 1, 2, 2.5, 2.8, 3, 4, 5); with Phase 0 counted as slot 0, the displayed counter runs from `[0/7]` to `[7/7]`.
49
54
 
50
55
  On review completion, clear progress: `node node_modules/wogiflow/scripts/flow-progress-tracker.js clear`
51
56
 
52
- ## Review Phases (v5.0)
57
+ ## Review Phases (v6.0 — IGR-hardened)
53
58
 
54
59
  ```
55
60
  ┌─────────────────────────────────────────────────────────────┐
56
61
  │ /wogi-review │
57
62
  ├─────────────────────────────────────────────────────────────┤
63
+ │ Phase 0: Review Framing Pass (IGR v6.0) │
64
+ │ → Interpret what the user asked to review │
65
+ │ → Surface scope (in/out) + review-model assumptions │
66
+ │ → Item reconciliation (anti-deferral guard) │
67
+ │ │
58
68
  │ Phase 1: Verification Gates │
59
69
  │ → Spec verification, lint, typecheck, tests │
60
70
  │ │
61
71
  │ Phase 2: AI Review (multi-pass or parallel) │
62
72
  │ → Code/Logic, Security, Architecture analysis │
63
73
  │ → Adversarial mode: min findings per agent (v5.0) │
74
+ │ → Evidence tiers required on every finding (IGR v6.0) │
64
75
  │ │
65
76
  │ Phase 2.5: Git-Verified Claim Checking (v5.0) │
66
77
  │ → Cross-reference spec claims vs actual git diff │
67
78
  │ → BLOCKS if spec promises files not in git diff │
68
79
  │ │
80
+ │ Phase 2.8: Findings Adversary Critique (IGR v6.0) │
81
+ │ → Different-model review of the findings themselves │
82
+ │ → Flags false positives, severity inflation, missed bugs │
83
+ │ │
69
84
  │ Phase 3: Standards Compliance [STRICT] │
70
85
  │ → decisions.md, app-map.md, naming-conventions.md │
71
86
  │ → MUST_FIX violations block sign-off in Phase 5 │
@@ -74,8 +89,9 @@ On review completion, clear progress: `node node_modules/wogiflow/scripts/flow-p
74
89
  │ → Technical alternatives, UX improvements │
75
90
  │ → Suggestions only - not violations │
76
91
  │ │
77
- │ Phase 5: Post-Review Workflow
92
+ │ Phase 5: Post-Review Workflow + Completion Truth Gate
78
93
  │ → Fix loop, learning, task creation │
94
+ │ → "Fixed" claims require INTERACTIVE evidence (IGR v6.0) │
79
95
  └─────────────────────────────────────────────────────────────┘
80
96
  ```
81
97
 
@@ -112,9 +128,17 @@ Multi-pass advantages:
112
128
 
113
129
  The review system has **two layers**:
114
130
  1. **Runtime scripts** (`flow-review.js`, `flow-standards-checker.js`, `flow-solution-optimizer.js`) — perform automated pre-flight checks (verification gates, standards, optimization). These are helper tools, NOT the full review.
115
- 2. **AI instructions** (this document) — describe the complete 5-phase review loop, agent spawning, and post-review workflow. The AI model executes the full 5-phase loop, using runtime script output as input to specific phases.
131
+ 2. **AI instructions** (this document) — describe the complete 7-phase review loop, agent spawning, and post-review workflow. The AI model executes the full 7-phase loop, using runtime script output as input to specific phases.
132
+
133
+ **The runtime script does NOT execute all 7 phases.** It handles pre-flight only. You (the AI) are responsible for orchestrating the complete review.
134
+
135
+ ### IGR v6.0 — Config Enforcement + Adversary Model Rule (concise)
116
136
 
117
- **The runtime script does NOT execute all 5 phases.** It handles pre-flight only. You (the AI) are responsible for orchestrating the complete review.
137
+ All `config.review.*` toggles are **AI-honored, not runtime-enforced**. Load config first, print toggle states, honor them. Matches `/wogi-audit`'s docs-driven model.
138
+
139
+ `adversaryPass.adversaryModel` is a mapping. **Override-always rule**: the adversary MUST run on a different model than the review agents (same-model = rubber-stamp). If the resolved value equals the agent model, pick a different model regardless.
140
+
141
+ Full reference: [intent-grounded-review.md → Config Enforcement Model](../docs/intent-grounded-review.md#config-enforcement-model--reference-detail).
118
142
 
119
143
  ## Step 0: Scope Resolution (Natural Language Scoping)
120
144
 
@@ -182,7 +206,7 @@ The resolved file list replaces the default git diff in Phase 1. All subsequent
182
206
 
183
207
  ## How It Works (MANDATORY 5-PHASE SEQUENTIAL EXECUTION)
184
208
 
185
- **CRITICAL: You MUST execute ALL 5 phases sequentially. Do NOT stop after Phase 2.**
209
+ **CRITICAL: You MUST execute ALL 7 phases sequentially (0 → 1 → 2 → 2.5 → 2.8 → 3 → 4 → 5). Do NOT stop after Phase 2.**
186
210
 
187
211
  ```
188
212
  ┌─────────────────────────────────────────────────────────────┐
@@ -221,7 +245,7 @@ The resolved file list replaces the default git diff in Phase 1. All subsequent
221
245
  │ → Persist findings, present fix options to user │
222
246
  │ → If user chooses fix: convert to todos, fix loop │
223
247
  │ → Learning capture: corrections, pattern promotion │
224
- │ → Display "Phases: 5/5 executed" │
248
+ │ → Display "Phases: 7/7 executed" │
225
249
  │ ✓ CHECKPOINT: "Phase 5 complete - Review done" │
226
250
  │ │
227
251
  └─────────────────────────────────────────────────────────────┘
@@ -535,6 +559,48 @@ Track phases completed: start at 0/5, increment after each phase checkpoint.
535
559
 
536
560
  ---
537
561
 
562
+ ### PHASE 0: Review Framing Pass (IGR v6.0)
563
+
564
+ **Config toggle**: `config.review.framingPass.enabled` (default `true`). Reference: [intent-grounded-review.md → Phase 0](../docs/intent-grounded-review.md#phase-0-review-framing-pass--reference-detail).
565
+
566
+ **Procedure**:
567
+
568
+ 1. Interpret the review request into a **Framing Artifact** with 5 fields: `interpretation`, `scopeIn`, `scopeOut`, `assumptions`, `posture` (`pre-ship` | `session-review` | `security-focused` | `exploratory`).
569
+
570
+ 2. Write the artifact to `.workflow/state/review-framing/{timestamp}.md` (with PIN markers).
571
+
572
+ 3. Display a short summary:
573
+ ```
574
+ ━━━ REVIEW FRAMING ━━━
575
+ Interpretation: [one sentence]
576
+ Scope (in): [list]
577
+ Scope (out): [list]
578
+ Assumptions:
579
+ - [assumption 1]
580
+ - [assumption 2]
581
+ Posture: [pre-ship | session-review | security-focused | exploratory]
582
+ Proceeding with N-agent analysis on this scope.
583
+ ━━━━━━━━━━━━━━━━━━━━━━
584
+ ```
585
+
586
+ 4. **Item reconciliation (MANDATORY anti-deferral guard)**: if the user's request enumerated multiple items, each MUST appear in `scopeIn`. If the count shrank, framing FAILS — require user confirmation before proceeding.
587
+
588
+ 5. **Posture adjusts agent weighting** — see the reference doc for the full table.
589
+
590
+ **Display Phase 0 results**:
591
+ ```
592
+ ═══════════════════════════════════════
593
+ PHASE 0: REVIEW FRAMING [0/7]
594
+ ═══════════════════════════════════════
595
+ [Framing artifact summary]
596
+
597
+ ✓ Phase 0 complete. Proceeding to Phase 1...
598
+ ```
599
+
600
+ Config toggles: `review.framingPass.enabled` (default true), `review.framingPass.itemReconciliation` (default true), `review.framingPass.adversaryInExploratory` (default false).
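The item-reconciliation guard in step 4 can be sketched as a simple set check. This is an illustration under assumptions: the five artifact field names and the posture values come from step 1, while `reconcileItems` and the exact-match comparison of items are hypothetical.

```javascript
// Posture values as listed in step 1 of the procedure.
const POSTURES = ['pre-ship', 'session-review', 'security-focused', 'exploratory'];

// Hypothetical guard: every item the user enumerated must appear in scopeIn.
// Exact string matching is an assumption; a real check may need fuzzier matching.
function reconcileItems(requestedItems, framing) {
  if (!POSTURES.includes(framing.posture)) {
    throw new Error(`unknown posture: ${framing.posture}`);
  }
  const missing = requestedItems.filter((item) => !framing.scopeIn.includes(item));
  return { ok: missing.length === 0, missing };
}

// Example Framing Artifact with the five fields from step 1 (values invented).
const framing = {
  interpretation: 'Review the auth and billing changes before release',
  scopeIn: ['src/auth'],
  scopeOut: ['docs'],
  assumptions: ['Billing changes are covered by a separate review'],
  posture: 'pre-ship',
};

// The user asked for two items but only one survived into scopeIn, so
// framing fails and user confirmation is required before proceeding.
const result = reconcileItems(['src/auth', 'src/billing'], framing);
console.log(result);
```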
601
+
602
+ ---
603
+
538
604
  ### PHASE 1: Verification Gates
539
605
 
540
606
  **1.1. Get changed files**:
@@ -554,7 +620,7 @@ git diff --name-only HEAD~N HEAD # If --commits N specified
554
620
  **1.3. Display Phase 1 results**:
555
621
  ```
556
622
  ═══════════════════════════════════════
557
- PHASE 1: VERIFICATION GATES [1/5]
623
+ PHASE 1: VERIFICATION GATES [1/7]
558
624
  ═══════════════════════════════════════
559
625
  ✓ Spec: N/N deliverables exist
560
626
  ✓ Lint: passed
@@ -613,9 +679,9 @@ Agent Lineup (N agents):
613
679
  Total: N (max: 6)
614
680
  ```
615
681
 
616
- **2.3. Append adversarial minimum findings suffix to EVERY agent prompt**:
682
+ **2.3. Append adversarial minimum findings suffix + evidence tier requirement to EVERY agent prompt**:
617
683
 
618
- Read `config.review.minFindings` (default: 3). Append this to every agent's prompt:
684
+ Read `config.review.minFindings` (default: 3) and `config.review.evidenceTiers.enabled` (default: true). Append this to every agent's prompt:
619
685
 
620
686
  ```
621
687
  IMPORTANT: Adversarial Review Mode
@@ -623,8 +689,37 @@ You MUST find at least [minFindings] findings. If you genuinely cannot find
623
689
  [minFindings] issues, you MUST provide a "clean code justification" as a
624
690
  special finding with type "clean-justification" explaining WHY the code is
625
691
  clean. Generic praise like "looks good" is NOT acceptable.
692
+
693
+ IMPORTANT: Evidence Tier Requirement (IGR v6.0)
694
+
695
+ Every finding MUST carry two additional fields:
696
+
697
+ evidenceTier: integer 0–4
698
+ 0 = STATIC — inferred from source alone (weakest)
699
+ 1 = STRUCTURAL — grepped / globbed / counted instances
700
+ 2 = OBSERVATIONAL — ran a tool (lint, typecheck, npm audit) and read output
701
+ 3 = INTERACTIVE — executed code/tests and observed behavior
702
+ 4 = AUTOMATED — deterministic check in a quality gate / test suite
703
+
704
+ evidenceNote: one-line string citing what produced the evidence
705
+ examples: "grep 'JSON\\.parse' returned 7 matches in src/api/"
706
+ "ran require.resolve() — path resolves correctly"
707
+ "executed tests/foo.test.js and observed assertion failure"
708
+
709
+ SEVERITY IS CAPPED BY TIER:
710
+ - Tier 0: severity MUST be LOW (and will be flagged UNVERIFIED in the report)
711
+ - Tier 1: severity capped at MEDIUM (unless grep returned >=5 instances → HIGH allowed)
712
+ - Tier 2+: severity stands as you assign it
713
+
714
+ Also respect the FRAMING ARTIFACT from Phase 0 — only report findings within
715
+ `scopeIn`. Findings that fall under `scopeOut` will be moved to an appendix by the
716
+ orchestrator.
626
717
  ```
627
718
 
719
+ **Why evidence tiers matter**: During this project's own self-review (session logs), a `code-reviewer` agent reported an F1 finding as "Critical — broken require path" without citing evidence. Manual verification via `require.resolve()` showed the path was correct — the agent's path math was flawed. With tier enforcement, F1 would have been Tier 0 (no grep, no execution), capped at LOW, and flagged UNVERIFIED — alerting the reader to verify before acting.
720
+
721
+ **Config toggles**: `review.evidenceTiers.enabled` (default true), `review.evidenceTiers.capByTier` (default true — enforce severity caps).
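The severity caps from the prompt suffix above can be expressed as a small pure function. This is a sketch, not the package's implementation: `capSeverity` and the `instanceCount` field (standing in for "grep returned >=5 instances") are hypothetical names.

```javascript
// Severity levels in ascending order, matching the report's low/medium/high/critical.
const SEVERITY_ORDER = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL'];

// Hypothetical helper applying the tier caps described in the prompt suffix:
//   Tier 0  -> LOW, flagged UNVERIFIED
//   Tier 1  -> MEDIUM, or HIGH when at least 5 instances were found
//   Tier 2+ -> severity stands as assigned
function capSeverity(finding) {
  const { severity, evidenceTier, instanceCount = 0 } = finding;
  let cap = 'CRITICAL';
  if (evidenceTier === 0) cap = 'LOW';
  else if (evidenceTier === 1) cap = instanceCount >= 5 ? 'HIGH' : 'MEDIUM';

  const exceedsCap = SEVERITY_ORDER.indexOf(severity) > SEVERITY_ORDER.indexOf(cap);
  return {
    ...finding,
    severity: exceedsCap ? cap : severity,
    unverified: evidenceTier === 0,
  };
}

// The F1 case described above: "Critical" claimed with no grep and no execution.
const f1 = capSeverity({ id: 'F1', severity: 'CRITICAL', evidenceTier: 0 });
console.log(f1.severity, f1.unverified);
```

A cap can only lower a severity, never raise it, so a Tier 1 finding already rated LOW is left unchanged.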
722
+
628
723
  **2.4. Launch ALL agents in parallel** (single message with N Task tool calls, subagent_type=Explore)
629
724
 
630
725
  **2.5. Wait for all agents to complete**
@@ -657,7 +752,7 @@ clean. Generic praise like "looks good" is NOT acceptable.
657
752
  **2.7. Display Phase 2 results (per-agent sections)**:
658
753
  ```
659
754
  ═══════════════════════════════════════
660
- PHASE 2: AI REVIEW [2/5]
755
+ PHASE 2: AI REVIEW [2/7]
661
756
  ═══════════════════════════════════════
662
757
 
663
758
  Agents: N launched (3 core + 1 optional + 2 project-rules)
@@ -713,7 +808,7 @@ git diff --name-only # For unstaged changes
713
808
  **2.5.5. Display Phase 2.5 results**:
714
809
  ```
715
810
  ═══════════════════════════════════════
716
- PHASE 2.5: GIT-VERIFIED CLAIMS [2.5/5]
811
+ PHASE 2.5: GIT-VERIFIED CLAIMS [3/7]
717
812
  ═══════════════════════════════════════
718
813
 
719
814
  Spec: .workflow/changes/wf-XXXXXXXX.md
@@ -733,6 +828,52 @@ Summary: X verified, Y missing, Z unplanned
733
828
 
734
829
  ---
735
830
 
831
+ ### PHASE 2.8: Findings Adversary Critique (IGR v6.0)
832
+
833
+ **Config toggle**: `config.review.adversaryPass.enabled` (default `true`; MANDATORY when framing posture is `pre-ship`). Reference: [intent-grounded-review.md → Phase 2.8](../docs/intent-grounded-review.md#phase-28-findings-adversary-critique--reference-detail).
834
+
835
+ **Procedure**:
836
+
837
+ 1. **Collect inputs**: the framing artifact + all Phase 2 findings (with `evidenceTier` + `evidenceNote`) + Phase 2.5 git-claim results.
838
+
839
+ 2. **Launch ONE Agent sub-agent** (`subagent_type=Explore`, READ-ONLY) on a DIFFERENT model than the review agents. Resolve via `config.review.adversaryPass.adversaryModel` mapping: agents on Sonnet → adversary on Opus; agents on Opus → adversary on Sonnet; agents on Haiku → adversary on Sonnet. **Override-always rule**: if the resolved value equals the agent model, pick a different model anyway.
840
+
841
+ 3. **Adversary prompt** — produce JSON with: `falsePositives[]`, `missedIssues[]`, `severityAdjustments[]`, `scopeDrift[]`, `evidenceChallenges[]`, `overallVerdict` (`ACCEPT | ACCEPT_WITH_ADJUSTMENTS | REVISE_SCOPE | BLOCK`).
842
+
843
+ HUNT specifically for: (a) `evidenceTier=0` + severity ≥ HIGH, (b) line-number claims without code quotes, (c) "broken require path" / "missing import" / "wrong type" without `require.resolve` / `tsc` / `grep` verification, (d) findings contradicting `scopeIn`/`scopeOut`.
844
+
845
+ Forbid "I think" / "might" / "could" — require evidence. Full prompt template in the reference doc.
846
+
847
+ 4. **Parse + apply adjustments**: `severityAdjustments` rewrite severity (mark `[ADVERSARY-ADJUSTED]`); `scopeDrift` moves to appendix; `falsePositives` marked `[DISPUTED]` (not removed); `missedIssues` appended as `[ADVERSARY-FOUND]` Tier-0; `evidenceChallenges` downgrade tier and re-apply severity cap.
848
+
849
+ 5. **Archive** run to `.workflow/state/adversary-runs/review-{timestamp}.json` for the pattern-promotion pipeline.
850
+
851
+ 6. **Display Phase 2.8 results**:
852
+ ```
853
+ ═══════════════════════════════════════
854
+ PHASE 2.8: FINDINGS ADVERSARY [4/7]
855
+ ═══════════════════════════════════════
856
+
857
+ Adversary model: [model] (agents: [agent-model])
858
+ Verdict: [ACCEPT | ACCEPT_WITH_ADJUSTMENTS | REVISE_SCOPE | BLOCK]
859
+
860
+ False positives: N (marked [DISPUTED])
861
+ Severity adjustments: N (marked [ADVERSARY-ADJUSTED])
862
+ Missed issues found: N (appended as [ADVERSARY-FOUND] Tier-0 findings)
863
+ Scope drift: N (moved to Out-of-Scope appendix)
864
+ Evidence challenges: N (tier downgraded, severity re-capped)
865
+
866
+ [For each item, one-line summary with finding ID + reason]
867
+
868
+ ✓ Phase 2.8 complete. Proceeding to Phase 3...
869
+ ```
870
+
871
+ **One pass only**: no iteration loop. If the adversary's `overallVerdict` is `BLOCK`, display the block reason prominently and require the user to acknowledge it before proceeding to Phase 3, or to retry the review with adjusted scope.
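Step 4 of the procedure (parse and apply adjustments) could look roughly like the sketch below. The entry shapes inside each array are assumptions, since the source names only the top-level JSON keys; `applyAdversary` is a hypothetical name, and `evidenceChallenges` handling is left out for brevity.

```javascript
// Assumed entry shapes (not specified in the source):
//   severityAdjustments: [{ findingId, newSeverity }]
//   falsePositives, scopeDrift: [{ findingId }]
//   missedIssues: [{ id, title }]
function applyAdversary(findings, critique) {
  // Copy findings so the originals are not mutated.
  const byId = new Map(
    findings.map((f) => [f.id, { ...f, tags: [...(f.tags || [])] }])
  );

  for (const adj of critique.severityAdjustments || []) {
    const f = byId.get(adj.findingId);
    if (f) {
      f.severity = adj.newSeverity;
      f.tags.push('ADVERSARY-ADJUSTED');
    }
  }

  for (const fp of critique.falsePositives || []) {
    const f = byId.get(fp.findingId);
    if (f) f.tags.push('DISPUTED'); // marked, never removed
  }

  // Scope-drift findings move to the out-of-scope appendix.
  const driftIds = new Set((critique.scopeDrift || []).map((d) => d.findingId));
  const all = [...byId.values()];
  const appendix = all.filter((f) => driftIds.has(f.id));
  const main = all.filter((f) => !driftIds.has(f.id));

  // Missed issues arrive as Tier-0 findings, so the Tier-0 cap (LOW) applies.
  for (const missed of critique.missedIssues || []) {
    main.push({ ...missed, evidenceTier: 0, severity: 'LOW', tags: ['ADVERSARY-FOUND'] });
  }
  return { main, appendix };
}

const adjusted = applyAdversary(
  [
    { id: 'F1', severity: 'CRITICAL' },
    { id: 'F2', severity: 'MEDIUM' },
    { id: 'F3', severity: 'LOW' },
  ],
  {
    severityAdjustments: [{ findingId: 'F1', newSeverity: 'LOW' }],
    falsePositives: [{ findingId: 'F2' }],
    scopeDrift: [{ findingId: 'F3' }],
    missedIssues: [{ id: 'A1', title: 'Unchecked null' }],
  }
);
console.log(JSON.stringify(adjusted, null, 2));
```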
872
+
873
+ **Config toggles**: `review.adversaryPass.enabled` (default true), `review.adversaryPass.adversaryModel` (mapping object — see "Adversary Model Selection Rule" in the Architecture Note; resolve at runtime based on agent model, override-always rule applies), `review.adversaryPass.applySeverityAdjustments` (default true), `review.adversaryPass.applyScopeDrift` (default true), `review.adversaryPass.blockOnBlockVerdict` (default true).
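The adversary model mapping and the override-always rule might be resolved along these lines. This is a sketch under assumptions: the lowercase model keys, `resolveAdversaryModel`, and the fallback order are illustrative and not taken from the package.

```javascript
// Mapping from step 2: Sonnet -> Opus, Opus -> Sonnet, Haiku -> Sonnet.
// Lowercase keys are an assumption about how models are named in config.
const DEFAULT_MAPPING = { sonnet: 'opus', opus: 'sonnet', haiku: 'sonnet' };

// Hypothetical resolver for config.review.adversaryPass.adversaryModel.
function resolveAdversaryModel(agentModel, mapping = DEFAULT_MAPPING) {
  const resolved = mapping[agentModel];
  if (resolved && resolved !== agentModel) return resolved;
  // Override-always rule: a same-model (or missing) mapping is ignored and
  // the first known model that differs from the agents' model is used instead.
  return Object.keys(DEFAULT_MAPPING).find((m) => m !== agentModel);
}

console.log(resolveAdversaryModel('sonnet'));                       // default mapping
console.log(resolveAdversaryModel('sonnet', { sonnet: 'sonnet' })); // misconfigured mapping
```

The second call shows why the rule exists: a config that maps a model to itself would otherwise yield a same-model adversary.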
874
+
875
+ ---
876
+
736
877
  ### PHASE 3: Standards Compliance [STRICT]
737
878
 
738
879
  **This phase BLOCKS review completion if MUST_FIX violations are found.**
@@ -766,7 +907,7 @@ After running the standards check, feed any violations through the pattern promo
766
907
  **3.4. Display Phase 3 results**:
767
908
  ```
768
909
  ═══════════════════════════════════════
769
- PHASE 3: STANDARDS COMPLIANCE [3/5]
910
+ PHASE 3: STANDARDS COMPLIANCE [5/7]
770
911
  ═══════════════════════════════════════
771
912
 
772
913
  ✓ decisions.md: passed
@@ -809,7 +950,7 @@ Or if the runtime script is not available, manually analyze changed files for:
809
950
  **4.3. Display Phase 4 results**:
810
951
  ```
811
952
  ═══════════════════════════════════════
812
- PHASE 4: SOLUTION OPTIMIZATION [4/5]
953
+ PHASE 4: SOLUTION OPTIMIZATION [6/7]
813
954
  ═══════════════════════════════════════
814
955
 
815
956
  Technical (N):
@@ -850,7 +991,7 @@ Phase Results:
850
991
 
851
992
  Total Findings: N (X critical, Y high, Z medium, W low)
852
993
  Pattern Learning: P patterns tracked, M promoted, G enforcement gaps
853
- Phases: 5/5 executed
994
+ Phases: 7/7 executed
854
995
  ```
855
996
 
856
997
  **5.2. Present severity-aware fix options to user** (use AskUserQuestion):
@@ -990,14 +1131,51 @@ This ensures that patterns discovered during code review feed into the same prom
990
1131
  - Save review report to `.workflow/reviews/YYYY-MM-DD-HHMMSS-review.md`
991
1132
  - Include: date, files reviewed, mode, all findings with status (fixed/task-created/dismissed), summary
992
1133
 
993
- **5.6. Sign-off gate**:
1134
+ **5.6. Completion Truth Gate (IGR v6.0)** — runs BEFORE sign-off:
1135
+
1136
+ **Config toggle**: `config.review.completionTruthGate.enabled` (default `true`).
1137
+
1138
+ **Problem this solves**: A review's "fixed" claim is only as good as the evidence behind it. A finding marked `fixed` because the AI applied an edit is NOT the same as a finding verified to work. Without a truth gate, the sign-off rubber-stamps whatever the agent says.
1139
+
1140
+ **Procedure** — for every finding now marked `status: fixed`:
1141
+
1142
+ 1. **Check evidence tier of the fix**:
1143
+ - Did the fix come with an executed test (`tier 3 INTERACTIVE`)?
1144
+ - Or an automated gate confirming the fix (`tier 4 AUTOMATED`)?
1145
+ - Or just an edit + lint pass (`tier 2 OBSERVATIONAL`)?
1146
+ - Or just an edit (`tier 0 STATIC`)?
1147
+
1148
+ 2. **Downgrade rule**:
1149
+ - `tier ≥ 3` → status stays `fixed` (INTERACTIVE evidence is sufficient)
1150
+ - `tier 2` → status downgraded to `fixed-unverified` (lint/typecheck passed but behavior not exercised)
1151
+ - `tier ≤ 1` → status downgraded to `implemented-unverified` (edit applied, no evidence of correctness)
1152
+
1153
+ 3. **Display the downgrade in the final summary**:
1154
+ ```
1155
+ ━━━ COMPLETION TRUTH GATE ━━━
1156
+ Findings marked "fixed": N
1157
+ Tier ≥ 3 (INTERACTIVE): M → status stands
1158
+ Tier 2 (OBSERVATIONAL): K → downgraded to "fixed-unverified"
1159
+ Tier ≤ 1 (STATIC/STRUCTURAL): J → downgraded to "implemented-unverified"
1160
+
1161
+ ⚠ K + J findings lack runtime proof of fix.
1162
+ To upgrade: run the relevant tests / smoke-test / browser check and re-verify.
1163
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1164
+ ```
1165
+
1166
+ 4. **Persist downgraded statuses** to `last-review.json`. Do NOT silently mark everything as complete.
1167
+
1168
+ **Config toggles**: `review.completionTruthGate.enabled` (default true), `review.completionTruthGate.requireInteractiveForFixed` (default true — when false, Tier 2 counts as fully fixed).
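The downgrade rule in step 2 can be sketched as a mapping over findings. The status strings and the `requireInteractiveForFixed` toggle come from the text above; `applyTruthGate` and the `fixEvidenceTier` field are hypothetical names, and a missing tier is treated as Tier 0 by assumption.

```javascript
// Hypothetical truth gate: downgrades "fixed" findings that lack runtime evidence.
function applyTruthGate(findings, { requireInteractiveForFixed = true } = {}) {
  return findings.map((f) => {
    if (f.status !== 'fixed') return f;
    const tier = f.fixEvidenceTier ?? 0; // assumption: no recorded tier means Tier 0
    if (tier >= 3) return f; // INTERACTIVE or AUTOMATED evidence: status stands
    if (tier === 2) {
      // Lint/typecheck passed, but behavior was not exercised.
      return requireInteractiveForFixed ? { ...f, status: 'fixed-unverified' } : f;
    }
    return { ...f, status: 'implemented-unverified' }; // edit applied, no evidence
  });
}

const gated = applyTruthGate([
  { id: 'F1', status: 'fixed', fixEvidenceTier: 3 },
  { id: 'F2', status: 'fixed', fixEvidenceTier: 2 },
  { id: 'F3', status: 'fixed', fixEvidenceTier: 0 },
  { id: 'F4', status: 'dismissed' },
]);
console.log(gated.map((f) => `${f.id}: ${f.status}`).join('\n'));
```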
1169
+
1170
+ **5.7. Sign-off gate**:
994
1171
  - Present summary to user and ask for confirmation that the review is complete
995
- - If user requests additional fixes, return to step 5.3
1172
+ - Display the truth-gate downgrade counts prominently — the user should consciously accept unverified fixes, not have them hidden
1173
+ - If user requests additional fixes or verification, return to step 5.3
996
1174
 
997
- **5.7. Display final checkpoint**:
1175
+ **5.8. Display final checkpoint**:
998
1176
  ```
999
1177
  ═══════════════════════════════════════
1000
- PHASE 5: POST-REVIEW COMPLETE [5/5]
1178
+ PHASE 5: POST-REVIEW COMPLETE [7/7]
1001
1179
  ═══════════════════════════════════════
1002
1180
 
1003
1181
  Findings: N total
@@ -1010,7 +1188,7 @@ Pattern Learning:
1010
1188
 
1011
1189
  Run /wogi-review-fix --pending to batch-process deferred items.
1012
1190
 
1013
- Phases: 5/5 executed
1191
+ Phases: 7/7 executed
1014
1192
  Review complete.
1015
1193
  ```
1016
1194
 
@@ -101,6 +101,51 @@ When a local `/wogi-*` CLI command fails (error in output, "Unknown skill", comm
101
101
  - After `/wogi-start` classifies as conversation: Read, Glob, Grep, WebSearch, WebFetch (read-only). No Edit/Write/state modifications.
102
102
  - Natural exit: when user gives an implementation imperative, transition to `/wogi-story`.
103
103
 
104
+ **Research Reasoning Gate** (applies inside Conversation mode when `config.researchReasoningGate.enabled` — default ON): classify the question into a tier based on structural markers. Do NOT self-classify the question's complexity — use the markers below mechanically. When ambiguous, default to Tier 2.
105
+
106
+ | Tier | Marker phrases | What you do |
107
+ |------|---------------|-------------|
108
+ | **Tier 1 — Factual** | "what is", "how many", "show me", "list all", "which file", "where does" | Answer directly from code/docs. No gate. |
109
+ | **Tier 2 — Domain** (default for ambiguous) | "what should", "how should", "recommend", "which approach", "what do you think about", "is it better to" | **Surface assumptions, then WAIT.** |
110
+ | **Tier 3 — Architecture** | "should we restructure", "what's the right architecture", "design a schema", "how to migrate", "should we split / merge / replace" | Tier 2 flow + spawn adversary on a different model after recommendation. |
111
+
112
+ **Tier 2 flow — the user is the adversary**:
113
+ 1. Before any analysis, identify the domain-model assumptions your answer will depend on (typically 2–5).
114
+ 2. Present them in a fenced block and STOP:
115
+ ```
116
+ ━━━ ASSUMPTIONS (confirm before I analyze) ━━━
117
+ My analysis will depend on these domain model assumptions:
118
+ 1. <assumption 1>
119
+ 2. <assumption 2>
120
+ 3. <assumption 3>
121
+
122
+ Do these match your understanding? [confirm / correct]
123
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
124
+ ```
125
+ 3. WAIT for the user to confirm or correct. Do not analyze while waiting.
126
+ 4. When confirmed (or corrected), ground the analysis in the user's stated model — not your original guess.
127
+
128
+ **Tier 3 flow** — after steps 1–4 above, also:
129
+ 5. Produce the recommendation.
130
+ 6. Spawn an Agent sub-agent on a DIFFERENT model (config-controlled, default `sonnet`) with: the user's confirmed assumptions + your recommendation + the original question. Ask: "Does this recommendation follow from these assumptions? What's the strongest counterargument? List 1–3 specific concerns with line/file citations where possible."
131
+ 7. Present both the recommendation AND the adversary critique to the user in a single response:
132
+ ```
133
+ ━━━ RECOMMENDATION ━━━
134
+ <your recommendation>
135
+
136
+ ━━━ ADVERSARY CRITIQUE (reviewed by a different model) ━━━
137
+ <sub-agent output>
138
+ ```
139
+ 8. One pass only — this is conversation, not implementation. No iteration loop.
140
+
141
+ **Config toggles**:
142
+ - `researchReasoningGate.enabled` — master switch
143
+ - `researchReasoningGate.tier2.enabled` — assumption surfacing
144
+ - `researchReasoningGate.tier3.enabled` — spawn adversary
145
+ - `researchReasoningGate.tier3.adversaryModel` — model for the critique agent (default `sonnet`)
146
+
147
+ **Why this works** (from spec wf-6dbc0b2a): same-model self-critique is a known rubber-stamp. The USER is the effective adversary — you surface assumptions so they can validate the domain model before you build recommendations on invisible guesses.
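The mechanical, marker-based classification described in the tier table could be sketched as below. The marker phrases are copied from the table; `classifyTier`, the case-insensitive substring matching, the expansion of "should we split / merge / replace" into three phrases, and checking the highest tier first are all assumptions.

```javascript
// Marker phrases from the tier table. Splitting "should we split / merge / replace"
// into three separate phrases is an interpretation, not stated in the source.
const TIER_MARKERS = {
  3: ['should we restructure', "what's the right architecture", 'design a schema',
      'how to migrate', 'should we split', 'should we merge', 'should we replace'],
  2: ['what should', 'how should', 'recommend', 'which approach',
      'what do you think about', 'is it better to'],
  1: ['what is', 'how many', 'show me', 'list all', 'which file', 'where does'],
};

// Hypothetical classifier: checks the highest tier first (an assumed precedence)
// and falls back to Tier 2 when no marker matches, as the gate requires.
function classifyTier(question) {
  const q = question.toLowerCase();
  for (const tier of [3, 2, 1]) {
    if (TIER_MARKERS[tier].some((marker) => q.includes(marker))) return tier;
  }
  return 2;
}

console.log(classifyTier('Where does the config get loaded?'));
console.log(classifyTier('Should we split the monolith?'));
```

Checking Tier 3 before Tier 1 means a question such as "What is the recommended approach?" lands in Tier 2 rather than Tier 1, which errs on the side of surfacing assumptions.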
148
+
104
149
  **Everything else**: Route to best command from catalog. Zero exemptions.
105
150
 
106
151
  ### Examples
@@ -74,6 +74,7 @@ flow parallel check # See available parallel tasks
74
74
  | 2.5.0+ | 2.1.84+ | TaskCreated hook, YAML glob lists in rules, CLAUDE_STREAM_IDLE_TIMEOUT_MS, WorktreeCreate HTTP transport, idle-return prompt, MCP 2KB cap |
75
75
  | 2.9.0+ | 2.1.90+ | --resume deferred-tool cache fix, MCP schema perf, PostToolUse format-on-save fix, PreToolUse exit-code-2 fix, .husky protected |
76
76
  | 2.9.2+ | 2.1.97+ | Stop/SubagentStop long-session fix, subagent worktree cwd leak fix, refreshInterval status line, workspace.git_worktree, MCP HTTP/SSE leak fix, 429 backoff, compaction transcript dedup |
77
+ | 2.18.0+ | 2.1.108+ | ENABLE_PROMPT_CACHING_1H guidance, /recap awareness, /doctor MCP duplicate-scope mirror in `/wogi-health` |
77
78
 
78
79
  ### Environment Variables (2.1.19+)
79
80
 
@@ -363,6 +364,50 @@ await cancelTask('wf-123', 'superseded', false);
363
364
 
364
365
  - **`/claude-api` skill updated for Managed Agents**: The `/claude-api` skill now covers Managed Agents (`/v1/agents`, `/v1/sessions`) alongside the Claude API. **Impact on WogiFlow**: Informational — WogiFlow's `claude-api` skill reference remains accurate.
365
366
 
367
+ ### Features in 2.1.108+
368
+
369
+ - **`ENABLE_PROMPT_CACHING_1H` env var (RECOMMENDED for non-subscribers)**: Opts into **1-hour prompt-cache TTL** on **API key, Bedrock, Vertex, and Foundry** providers. Subscribers (Claude Pro, Max, Team, Enterprise via claude.ai OAuth) already get 1h TTL by default — this flag is a **no-op for them**. The complementary `FORCE_PROMPT_CACHING_5M` pins to 5min, and the older `ENABLE_PROMPT_CACHING_1H_BEDROCK` is deprecated but still honored. **Impact on WogiFlow (HIGH)**: WogiFlow sessions load a large, stable prefix every turn — CLAUDE.md (~300 lines), state files (`ready.json`, `decisions.md`, `app-map.md`), phase files, and pinned spec context. At the default 5min TTL, any pause longer than 5 minutes (user thinking, a long `flow` CLI run, a meeting mid-session) invalidates the cache and the next turn pays the full input-token cost again. At 1h TTL, the same prefix stays cached across those pauses, yielding **substantial token-cost reduction** on typical multi-hour WogiFlow work. **Action for API-key / Bedrock / Vertex / Foundry users**: `export ENABLE_PROMPT_CACHING_1H=1` in your shell profile. **Action for subscribers**: none (already enabled). **Risk**: none — if set on a subscriber account it is ignored; if set when not supported, it silently falls back.
370
+
371
+ - **`/recap` command and session recap feature**: Provides context when returning to a session. Configurable in `/config` and manually invocable with `/recap`. For users with telemetry disabled (Bedrock/Vertex/Foundry/`DISABLE_TELEMETRY`), recap is still enabled by default; opt out via `/config` or `CLAUDE_CODE_ENABLE_AWAY_SUMMARY=0`. **Overlap with WogiFlow**: `/wogi-morning`, `/wogi-session-end`, and `/wogi-pre-compact` already provide durable recap via state files. `/recap` is ephemeral (summarizes the current session); WogiFlow's state survives session exit. Use both: `/recap` for intra-session context, `/wogi-morning` for cross-session pickup.
372
+
373
+ - **Built-in slash commands via Skill tool**: Claude can now discover and invoke `/init`, `/review`, `/security-review` via the Skill tool. **Impact on WogiFlow**: No collision — all WogiFlow commands use the `wogi-*` prefix (`/wogi-review`, `/wogi-init`, `/wogi-review-fix`). Natural-language routing in CLAUDE.md directs "code review" phrases to `/wogi-review`, not the built-in `/review`. If a user explicitly types `/review`, Claude Code handles it natively — this is expected.
374
+
375
+ - **`/model` mid-conversation warning**: `/model` now warns before switching models mid-conversation, since the next response re-reads the full history uncached. **Impact on WogiFlow**: Relevant for hybrid mode (`/wogi-hybrid`) — switching the executor model via `/model` during hybrid execution wastes the cached context. WogiFlow's `/wogi-hybrid-setup` is the correct way to change executor models between sessions rather than mid-session.
376
+
377
+ - **`DISABLE_PROMPT_CACHING*` startup warning**: Claude Code now warns at startup when prompt caching is disabled via `DISABLE_PROMPT_CACHING*` env vars. **Impact on WogiFlow**: WogiFlow's heavy context prefix makes disabled caching **expensive**. This warning helps users who accidentally disabled caching (e.g., copy-pasted env from another project) spot the regression fast.
378
+
379
+ - **`/undo` alias for `/rewind`**: `/undo` is now an alias for `/rewind`. **Impact on WogiFlow**: `/undo`/`/rewind` rolls back message turns within a session, while `/wogi-pre-compact` and `/wogi-suspend` preserve state across sessions, so the two are complementary.
380
+
381
+ - **Memory footprint reductions for file reads**: Language grammars now load on demand, reducing memory for file reads, edits, and syntax highlighting. **Impact on WogiFlow**: Long WogiFlow sessions (especially `/wogi-bulk-loop` continuous runs) use noticeably less RAM. No code change needed.
382
+
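The prompt-caching variables named in the 2.1.108 notes above can be sanity-checked before a long session. The sketch below is illustrative only: `describePromptCaching` is a hypothetical helper, not part of WogiFlow or Claude Code. Both the truthiness rule and the precedence of `FORCE_PROMPT_CACHING_5M` over the 1-hour flags are assumptions, since the notes do not specify either.

```javascript
// Hypothetical helper (not a WogiFlow or Claude Code API).
// Reports which prompt-caching env vars are set in an env-like object.
function describePromptCaching(env) {
  // Assumption: empty, "0" and "false" count as unset.
  const isSet = (v) => v !== undefined && v !== '' && v !== '0' && v !== 'false';

  // Any DISABLE_PROMPT_CACHING* variable that is set turns caching off somewhere.
  const disabled = Object.keys(env)
    .filter((key) => key.startsWith('DISABLE_PROMPT_CACHING') && isSet(env[key]))
    .sort();

  // Assumption: the 5-minute pin wins if both it and a 1-hour flag are set.
  let ttl = 'default';
  if (isSet(env.FORCE_PROMPT_CACHING_5M)) {
    ttl = '5m';
  } else if (isSet(env.ENABLE_PROMPT_CACHING_1H) || isSet(env.ENABLE_PROMPT_CACHING_1H_BEDROCK)) {
    ttl = '1h';
  }
  return { disabled, ttl };
}
```

Calling `describePromptCaching(process.env)` at the start of a session gives a quick view of whether a copied environment has silently disabled caching or pinned the shorter TTL.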
383
+ ### Features in 2.1.110+
384
+
385
+ - **PreToolUse hook `additionalContext` preserved on tool failure (BUG FIX, GOOD NEWS)**: Previously, when a tool call failed, any `additionalContext` returned by PreToolUse hooks was **dropped**. Fixed in 2.1.110. **Impact on WogiFlow (HIGH)**: WogiFlow injects `additionalContext` in 8 places via `scripts/hooks/adapters/claude-code.js` (PreToolUse, UserPromptSubmit, SessionStart) for routing enforcement, phase-gate messages, component reuse hints, and session-start task context. Before this fix, if a guarded tool call failed, WogiFlow's context message vanished — producing "silent" hook behavior that was confusing to debug. After this fix, WogiFlow's hook messages are reliably delivered regardless of tool outcome. **Action**: none — automatic improvement after upgrade.
386
+
387
+ - **`/doctor` warns on duplicate MCP server definitions across scopes**: When the same MCP server is defined in more than one of the user (`~/.claude/settings.json`), project (`.claude/settings.json`), and local (`.claude/settings.local.json`) scopes with different endpoints, `/doctor` now flags the conflict. **Impact on WogiFlow**: `/wogi-health` has a mirror check in `flow-health.js` that scans the same three scopes and reports duplicate MCP server names with divergent endpoints as a health finding (v2.18.0+).
388
+
389
+ - **PushNotification tool**: Claude can send mobile push notifications when Remote Control and "Push when Claude decides" config are enabled. **WogiFlow opportunity**: Long-running autonomous loops (`/wogi-bulk`, `/wogi-bulk-loop`) could emit a notification on completion, blocker, or extended hang. Tracked as a future enhancement; not auto-wired.
390
+
391
+ - **Bash tool timeout enforcement**: The Bash tool now enforces the documented maximum timeout (600000ms / 10min) instead of accepting arbitrarily large values. **Impact on WogiFlow**: No impact — all WogiFlow hook Bash timeouts are under 60s (verified across `.claude/settings.json` and `scripts/hooks/`).
392
+
393
+ - **stdio MCP servers no longer disconnect on stray non-JSON lines**: Fixed a regression from 2.1.105 where stdio MCP servers that print stray non-JSON lines to stdout were disconnected on the first stray line. **Impact on WogiFlow**: WogiFlow has no custom MCP servers in-repo. User-installed MCP servers (figma, atlassian, gmail) benefit automatically.
394
+
395
+ - **PermissionRequest hook `updatedInput` re-check**: Fixed a bug where `updatedInput` returned by a PermissionRequest hook was not re-checked against `permissions.deny` rules. In addition, `setMode:'bypassPermissions'` updates now respect `disableBypassPermissionsMode`. **Impact on WogiFlow**: WogiFlow does not implement PermissionRequest hooks (only PermissionDenied for logging). Not affected.
396
+
397
+ - **`--resume`/`--continue` resurrects unexpired scheduled tasks**: Scheduled tasks (cron/CronCreate) now resume across session restarts. **Impact on WogiFlow**: WogiFlow does not currently use Claude Code's cron feature. Not affected; tracked as a future opportunity for automated maintenance tasks.
398
+
399
+ - **`/context`, `/exit`, `/reload-plugins` work from Remote Control (mobile/web) clients**: Remote Control users can now invoke these built-ins. **Impact on WogiFlow**: WogiFlow has no TTY-only code paths — all `/wogi-*` skills already work identically on Remote Control. Users can now do full WogiFlow-driven work from mobile/web.
400
+
401
+ - **`/tui` command and `tui` setting**: `/tui fullscreen` switches to flicker-free rendering in the same conversation. The focus view is now toggled separately with `/focus` (Ctrl+O now toggles verbose transcript only). **Impact on WogiFlow**: Documentation only — no runtime dependency on Ctrl+O. The WogiFlow statusline works identically in both TUI modes.
402
+
403
+ - **`autoScrollEnabled` config**: New setting to disable conversation auto-scroll in fullscreen mode. Purely UX — no WogiFlow impact.
404
+
405
+ - **Write tool reports IDE diff edits**: The Write tool now informs the model when the user edits the proposed content in the IDE diff before accepting. **Impact on WogiFlow**: Useful signal for learning — WogiFlow's `/wogi-correction` could eventually consume this to detect "user edited my output" events. Not auto-wired; tracked as an enhancement.
406
+
407
+ - **TRACEPARENT/TRACESTATE in SDK/headless sessions**: SDK and headless sessions now read W3C trace headers from the environment for distributed trace linking. **Impact on wogiflow-cloud**: Teams backend can propagate trace context from CI/CD pipelines into WogiFlow sessions for end-to-end observability. Tracked as a cloud opportunity.
408
+
409
+ - **Hardened "Open in editor" against command injection**: Security hardening for untrusted filenames. **Impact on WogiFlow**: Validates the same pattern in `.claude/rules/security/security-patterns.md` — external inputs going into shell commands must be validated. No WogiFlow code change needed.
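The duplicate-scope check described in the `/doctor` item above can be sketched as follows. This is a minimal illustration, not the actual code in `flow-health.js`: the function name `findDivergentMcpServers` and the input shape (one plain object per scope, mapping server name to endpoint string) are assumptions made for the example. Reading and parsing the three settings files is left out.

```javascript
// Hypothetical sketch (not the flow-health.js implementation).
// scopes: { user: { name: endpoint }, project: { ... }, local: { ... } }
function findDivergentMcpServers(scopes) {
  // Collect every definition of each server name, remembering its scope.
  const definitionsByName = new Map();
  for (const [scope, servers] of Object.entries(scopes)) {
    for (const [name, endpoint] of Object.entries(servers || {})) {
      if (!definitionsByName.has(name)) definitionsByName.set(name, []);
      definitionsByName.get(name).push({ scope, endpoint });
    }
  }

  // A finding is a name whose definitions point at more than one endpoint.
  // Identical endpoints repeated across scopes are not reported.
  const findings = [];
  for (const [name, definitions] of definitionsByName) {
    const distinctEndpoints = new Set(definitions.map((d) => d.endpoint));
    if (distinctEndpoints.size > 1) findings.push({ name, definitions });
  }
  return findings;
}
```

Each finding keeps the scope next to the endpoint, so a health report can tell the user which file to edit.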
410
+
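The Bash timeout item above states a fixed ceiling of 600000ms. A check in the spirit of the "verified across" claim could look like the sketch below. The helper `findOverLimitTimeouts` and the `{ label, timeoutMs }` entry shape are hypothetical. How timeout values are extracted from `.claude/settings.json` or `scripts/hooks/` is not shown, and the unit used in those files should be confirmed before converting to milliseconds.

```javascript
// Documented Bash tool maximum from the 2.1.110 notes: 600000ms (10 minutes).
const BASH_MAX_TIMEOUT_MS = 600000;

// Hypothetical helper: returns the labels of entries whose timeout exceeds the limit.
// entries: [{ label: string, timeoutMs?: number }]
function findOverLimitTimeouts(entries, maxMs = BASH_MAX_TIMEOUT_MS) {
  return entries
    // Entries without a numeric timeout are skipped rather than flagged.
    .filter((entry) => Number.isFinite(entry.timeoutMs) && entry.timeoutMs > maxMs)
    .map((entry) => entry.label);
}
```

An empty result means no collected timeout exceeds the enforced maximum.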
366
411
  ### Simple Mode Naming Distinction
367
412
 
368
413
  Claude Code's `CLAUDE_CODE_SIMPLE` environment variable (which enables a simplified tool set) is **unrelated** to WogiFlow's `loops.simpleMode` (a lightweight task completion loop using string detection). They are separate features that happen to share the word "simple":
@@ -497,4 +542,4 @@ Run `/keybindings` in Claude Code to customize your shortcuts.
497
542
 
498
543
  ---
499
544
 
500
- *Last updated: 2026-04-09*
545
+ *Last updated: 2026-04-16*