npm - wogiflow - Versions diffs - 2.17.0 → 2.18.0 - Mend

wogiflow 2.17.0 → 2.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (111)

package/.claude/commands/wogi-audit.md +212 -17
package/.claude/commands/wogi-research.md +37 -0
package/.claude/commands/wogi-review.md +200 -22
package/.claude/commands/wogi-start.md +45 -0
package/.claude/docs/claude-code-compatibility.md +46 -1
package/.claude/docs/intent-grounded-review.md +209 -0
package/.claude/settings.json +34 -1
package/.workflow/agents/logic-adversary.md +8 -0
package/.workflow/templates/claude-md.hbs +18 -0
package/lib/installer.js +22 -0
package/lib/utils.js +29 -3
package/lib/workspace-changelog.js +2 -1
package/lib/workspace-channel-server.js +4 -6
package/lib/workspace-contracts.js +5 -4
package/lib/workspace-events.js +8 -7
package/lib/workspace-gates.js +4 -3
package/lib/workspace-integration-tests.js +2 -1
package/lib/workspace-intelligence.js +3 -2
package/lib/workspace-locks.js +2 -1
package/lib/workspace-messages.js +7 -6
package/lib/workspace-routing.js +14 -26
package/lib/workspace-session.js +7 -6
package/lib/workspace-sync.js +9 -8
package/package.json +4 -2
package/scripts/base-workflow-step.js +1 -1
package/scripts/flow +19 -0
package/scripts/flow-adaptive-learning.js +1 -1
package/scripts/flow-aggregate.js +2 -1
package/scripts/flow-architect-pass.js +3 -3
package/scripts/flow-archive-runs.js +372 -0
package/scripts/flow-ask.js +1 -1
package/scripts/flow-ast-grep.js +216 -0
package/scripts/flow-audit-gates.js +1 -1
package/scripts/flow-auto-learn.js +8 -11
package/scripts/flow-bug.js +2 -2
package/scripts/flow-capture-gate.js +644 -0
package/scripts/flow-capture.js +4 -3
package/scripts/flow-cli-flags.js +95 -0
package/scripts/flow-community-sync.js +2 -1
package/scripts/flow-community.js +6 -6
package/scripts/flow-conclusion-classifier.js +310 -0
package/scripts/flow-config-defaults.js +3 -3
package/scripts/flow-constants.js +8 -11
package/scripts/flow-context-scoring.js +1 -0
package/scripts/flow-correction-detector.js +344 -3
package/scripts/flow-damage-control.js +1 -1
package/scripts/flow-decisions-merge.js +1 -0
package/scripts/flow-done-gates.js +20 -0
package/scripts/flow-done-report.js +2 -2
package/scripts/flow-done.js +4 -4
package/scripts/flow-epics.js +5 -11
package/scripts/flow-health.js +145 -1
package/scripts/flow-id.js +92 -0
package/scripts/flow-io.js +15 -5
package/scripts/flow-knowledge-router.js +2 -1
package/scripts/flow-links.js +1 -1
package/scripts/flow-log-manager.js +2 -1
package/scripts/flow-logic-adversary.js +4 -4
package/scripts/flow-long-input-cli.js +6 -0
package/scripts/flow-long-input-stories.js +1 -1
package/scripts/flow-loop-retry-learning.js +1 -1
package/scripts/flow-mcp-capabilities.js +2 -3
package/scripts/flow-mcp-docs.js +2 -1
package/scripts/flow-memory-blocks.js +2 -1
package/scripts/flow-memory-sync.js +1 -1
package/scripts/flow-memory.js +767 -0
package/scripts/flow-migrate-igr.js +1 -1
package/scripts/flow-migrate.js +2 -1
package/scripts/flow-model-adapter.js +1 -1
package/scripts/flow-model-config.js +5 -1
package/scripts/flow-model-profile.js +2 -1
package/scripts/flow-orchestrate.js +3 -3
package/scripts/flow-output.js +29 -0
package/scripts/flow-parallel.js +10 -9
package/scripts/flow-pattern-enforcer.js +2 -1
package/scripts/flow-permissions-audit.js +124 -0
package/scripts/flow-plugin-registry.js +2 -2
package/scripts/flow-progress.js +5 -1
package/scripts/flow-project-analyzer.js +1 -1
package/scripts/flow-promote.js +510 -0
package/scripts/flow-registries.js +86 -0
package/scripts/flow-request-log.js +133 -0
package/scripts/flow-research-protocol.js +0 -1
package/scripts/flow-revision-tracker.js +2 -1
package/scripts/flow-roadmap.js +2 -1
package/scripts/flow-rules-sync.js +3 -7
package/scripts/flow-session-end.js +3 -1
package/scripts/flow-session-learning.js +6 -13
package/scripts/flow-session-state.js +2 -2
package/scripts/flow-setup-hooks.js +2 -1
package/scripts/flow-skill-create.js +1 -1
package/scripts/flow-skill-freshness.js +6 -7
package/scripts/flow-skill-learn.js +1 -1
package/scripts/flow-step-coverage.js +1 -1
package/scripts/flow-step-security.js +1 -1
package/scripts/flow-story.js +58 -10
package/scripts/flow-sys.js +204 -0
package/scripts/flow-task-hierarchy.js +88 -0
package/scripts/flow-tech-debt.js +2 -1
package/scripts/flow-test-api.js +1 -1
package/scripts/flow-utils.js +60 -890
package/scripts/hooks/core/bugfix-scope-gate.js +5 -4
package/scripts/hooks/core/deploy-gate.js +1 -1
package/scripts/hooks/core/pre-tool-helpers.js +72 -0
package/scripts/hooks/core/pre-tool-orchestrator.js +442 -0
package/scripts/hooks/core/routing-gate.js +8 -0
package/scripts/hooks/core/session-end.js +28 -0
package/scripts/hooks/entry/claude-code/pre-tool-use.js +48 -492
package/scripts/hooks/entry/shared/hook-runner.js +1 -1
package/scripts/registries/schema-registry.js +1 -1
package/scripts/registries/service-registry.js +1 -1

package/.claude/commands/wogi-review.md CHANGED

@@ -29,7 +29,7 @@ Auto-detects when to use multi-pass (4 sequential passes) vs parallel (3 agents)
 At each phase checkpoint, display a progress bar AND update the progress state file:
 ```bash
-node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"wf-XXX","command":"/wogi-review","phase":"AI Review","phaseNum":2,"totalPhases":5,"step":"Agent 3/6 complete","stepNum":3,"totalSteps":6}'
+node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"wf-XXX","command":"/wogi-review","phase":"AI Review","phaseNum":2,"totalPhases":7,"step":"Agent 3/6 complete","stepNum":3,"totalSteps":6}'
 ```
 **Standard format for each checkpoint:**
@@ -38,34 +38,49 @@ node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"w
   Agent 3/6 complete
 ```
-**Phase mapping for /wogi-review:**
+**Phase mapping for /wogi-review (v6.0 — IGR-hardened):**
 | Phase | phaseNum | Description |
 |-------|----------|-------------|
+| 0 | Review Framing | Scope + assumptions (IGR v6.0) |
 | 1 | Verification Gates | Syntax, lint, tests |
 | 2 | AI Review | N agents (sub-steps = agents) |
+| 2.5 | Git-Verified Claims | Cross-reference spec vs diff |
+| 2.8 | Findings Adversary | Different-model critique (IGR v6.0) |
 | 3 | Standards + Promotion | Compliance check + pattern learning |
 | 4 | Optimization | Solution suggestions |
-| 5 | Post-Review | Fix routing, learning, archive |
+| 5 | Post-Review | Fix routing, truth gate, archive |
+Note: `totalPhases: 7` when Phase 0 counted as phaseNum=0 (8 named phases overall, 7 sequential numeric slots 0→5). Pass `totalPhases: 7` to the progress tracker.
 On review completion, clear progress: `node node_modules/wogiflow/scripts/flow-progress-tracker.js clear`
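The `[n/7]` counters that the phase display blocks in this diff use (`[0/7]` for Phase 0 through `[7/7]` for Phase 5) appear to be the zero-based position of each phase in the ordered sequence 0 → 1 → 2 → 2.5 → 2.8 → 3 → 4 → 5. A minimal sketch of that mapping follows; `REVIEW_PHASES` and `phaseCounter` are hypothetical names for illustration and are not part of the package.

```javascript
// Ordered phase IDs as listed in the v6.0 phase mapping (assumed to be the full sequence).
const REVIEW_PHASES = ['0', '1', '2', '2.5', '2.8', '3', '4', '5'];

// Returns the display counter for a phase, e.g. '2.8' -> '[4/7]'.
// The denominator is the index of the last phase, so 8 phases yield "/7".
function phaseCounter(phaseId) {
  const index = REVIEW_PHASES.indexOf(String(phaseId));
  if (index === -1) {
    throw new Error(`Unknown phase: ${phaseId}`);
  }
  return `[${index}/${REVIEW_PHASES.length - 1}]`;
}
```

Under this reading, `totalPhases: 7` is the index of the final phase rather than a count of phases.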
-## Review Phases (v5.0)
+## Review Phases (v6.0 — IGR-hardened)
 ```
 ┌─────────────────────────────────────────────────────────────┐
 │  /wogi-review                                                │
 ├─────────────────────────────────────────────────────────────┤
+│  Phase 0: Review Framing Pass (IGR v6.0)                     │
+│     → Interpret what the user asked to review                │
+│     → Surface scope (in/out) + review-model assumptions      │
+│     → Item reconciliation (anti-deferral guard)              │
+│                                                              │
 │  Phase 1: Verification Gates                                 │
 │     → Spec verification, lint, typecheck, tests              │
 │                                                              │
 │  Phase 2: AI Review (multi-pass or parallel)                 │
 │     → Code/Logic, Security, Architecture analysis            │
 │     → Adversarial mode: min findings per agent (v5.0)        │
+│     → Evidence tiers required on every finding (IGR v6.0)    │
 │                                                              │
 │  Phase 2.5: Git-Verified Claim Checking (v5.0)               │
 │     → Cross-reference spec claims vs actual git diff         │
 │     → BLOCKS if spec promises files not in git diff          │
 │                                                              │
+│  Phase 2.8: Findings Adversary Critique (IGR v6.0)           │
+│     → Different-model review of the findings themselves      │
+│     → Flags false positives, severity inflation, missed bugs │
+│                                                              │
 │  Phase 3: Standards Compliance [STRICT]                      │
 │     → decisions.md, app-map.md, naming-conventions.md        │
 │     → MUST_FIX violations block sign-off in Phase 5          │
@@ -74,8 +89,9 @@ On review completion, clear progress: `node node_modules/wogiflow/scripts/flow-p
 │     → Technical alternatives, UX improvements                │
 │     → Suggestions only - not violations                      │
 │                                                              │
-│  Phase 5: Post-Review Workflow                               │
+│  Phase 5: Post-Review Workflow + Completion Truth Gate       │
 │     → Fix loop, learning, task creation                      │
+│     → "Fixed" claims require INTERACTIVE evidence (IGR v6.0) │
 └─────────────────────────────────────────────────────────────┘
 ```
@@ -112,9 +128,17 @@ Multi-pass advantages:
 The review system has **two layers**:
 1. **Runtime scripts** (`flow-review.js`, `flow-standards-checker.js`, `flow-solution-optimizer.js`) — perform automated pre-flight checks (verification gates, standards, optimization). These are helper tools, NOT the full review.
-2. **AI instructions** (this document) — describe the complete 5-phase review loop, agent spawning, and post-review workflow. The AI model executes the full 5-phase loop, using runtime script output as input to specific phases.
+2. **AI instructions** (this document) — describe the complete 7-phase review loop, agent spawning, and post-review workflow. The AI model executes the full 7-phase loop, using runtime script output as input to specific phases.
+**The runtime script does NOT execute all 7 phases.** It handles pre-flight only. You (the AI) are responsible for orchestrating the complete review.
+### IGR v6.0 — Config Enforcement + Adversary Model Rule (concise)
-**The runtime script does NOT execute all 5 phases.** It handles pre-flight only. You (the AI) are responsible for orchestrating the complete review.
+All `config.review.*` toggles are **AI-honored, not runtime-enforced**. Load config first, print toggle states, honor them. Matches `/wogi-audit`'s docs-driven model.
+`adversaryPass.adversaryModel` is a mapping. **Override-always rule**: the adversary MUST run on a different model than the review agents (same-model = rubber-stamp). If the resolved value equals the agent model, pick a different model regardless.
+Full reference: [intent-grounded-review.md → Config Enforcement Model](../docs/intent-grounded-review.md#config-enforcement-model--reference-detail).
 ## Step 0: Scope Resolution (Natural Language Scoping)
@@ -182,7 +206,7 @@ The resolved file list replaces the default git diff in Phase 1. All subsequent
 ## How It Works (MANDATORY 5-PHASE SEQUENTIAL EXECUTION)
-**CRITICAL: You MUST execute ALL 5 phases sequentially. Do NOT stop after Phase 2.**
+**CRITICAL: You MUST execute ALL 7 phases sequentially (0 → 1 → 2 → 2.5 → 2.8 → 3 → 4 → 5). Do NOT stop after Phase 2.**
 ```
 ┌─────────────────────────────────────────────────────────────┐
@@ -221,7 +245,7 @@ The resolved file list replaces the default git diff in Phase 1. All subsequent
 │     → Persist findings, present fix options to user          │
 │     → If user chooses fix: convert to todos, fix loop        │
 │     → Learning capture: corrections, pattern promotion       │
-│     → Display "Phases: 5/5 executed"                         │
+│     → Display "Phases: 7/7 executed"                         │
 │     ✓ CHECKPOINT: "Phase 5 complete - Review done"           │
 │                                                              │
 └─────────────────────────────────────────────────────────────┘
@@ -535,6 +559,48 @@ Track phases completed: start at 0/5, increment after each phase checkpoint.
 ---
+### PHASE 0: Review Framing Pass (IGR v6.0)
+**Config toggle**: `config.review.framingPass.enabled` (default `true`). Reference: [intent-grounded-review.md → Phase 0](../docs/intent-grounded-review.md#phase-0-review-framing-pass--reference-detail).
+**Procedure**:
+1. Interpret the review request into a **Framing Artifact** with 5 fields: `interpretation`, `scopeIn`, `scopeOut`, `assumptions`, `posture` (`pre-ship` | `session-review` | `security-focused` | `exploratory`).
+2. Write the artifact to `.workflow/state/review-framing/{timestamp}.md` (with PIN markers).
+3. Display a short summary:
+   ```
+   ━━━ REVIEW FRAMING ━━━
+   Interpretation: [one sentence]
+   Scope (in):  [list]
+   Scope (out): [list]
+   Assumptions:
+     - [assumption 1]
+     - [assumption 2]
+   Posture: [pre-ship | session-review | security-focused | exploratory]
+   Proceeding with N-agent analysis on this scope.
+   ━━━━━━━━━━━━━━━━━━━━━━
+   ```
+4. **Item reconciliation (MANDATORY anti-deferral guard)**: if the user's request enumerated multiple items, each MUST appear in `scopeIn`. If the count shrank, framing FAILS — require user confirmation before proceeding.
+5. **Posture adjusts agent weighting** — see the reference doc for the full table.
+**Display Phase 0 results**:
+```
+═══════════════════════════════════════
+PHASE 0: REVIEW FRAMING [0/7]
+═══════════════════════════════════════
+[Framing artifact summary]
+✓ Phase 0 complete. Proceeding to Phase 1...
+```
+Config toggles: `review.framingPass.enabled` (default true), `review.framingPass.itemReconciliation` (default true), `review.framingPass.adversaryInExploratory` (default false).
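The item reconciliation guard in step 4 can be pictured as a set comparison between what the user enumerated and what ended up in `scopeIn`. The sketch below is only an illustration: in the package this check is performed by the AI model following the instructions, and `reconcileItems` plus its case-insensitive exact matching are assumptions made here.

```javascript
// Hypothetical helper: reports which user-enumerated items are absent from scopeIn.
function reconcileItems(requestedItems, scopeIn) {
  const normalize = (s) => s.trim().toLowerCase();
  const inScope = new Set(scopeIn.map(normalize));
  const missing = requestedItems.filter((item) => !inScope.has(normalize(item)));
  // Framing fails (user confirmation required) when any requested item was dropped.
  return { ok: missing.length === 0, missing };
}
```

For example, a request naming `auth`, `billing`, and `search` with a `scopeIn` of `['Auth', 'search']` would fail framing with `billing` reported as missing.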
+---
 ### PHASE 1: Verification Gates
 **1.1. Get changed files**:
@@ -554,7 +620,7 @@ git diff --name-only HEAD~N HEAD  # If --commits N specified
 **1.3. Display Phase 1 results**:
 ```
 ═══════════════════════════════════════
-PHASE 1: VERIFICATION GATES [1/5]
+PHASE 1: VERIFICATION GATES [1/7]
 ═══════════════════════════════════════
 ✓ Spec: N/N deliverables exist
 ✓ Lint: passed
@@ -613,9 +679,9 @@ Agent Lineup (N agents):
   Total: N (max: 6)
 ```
-**2.3. Append adversarial minimum findings suffix to EVERY agent prompt**:
+**2.3. Append adversarial minimum findings suffix + evidence tier requirement to EVERY agent prompt**:
-Read `config.review.minFindings` (default: 3). Append this to every agent's prompt:
+Read `config.review.minFindings` (default: 3) and `config.review.evidenceTiers.enabled` (default: true). Append this to every agent's prompt:
 ```
 IMPORTANT: Adversarial Review Mode
@@ -623,8 +689,37 @@ You MUST find at least [minFindings] findings. If you genuinely cannot find
 [minFindings] issues, you MUST provide a "clean code justification" as a
 special finding with type "clean-justification" explaining WHY the code is
 clean. Generic praise like "looks good" is NOT acceptable.
+IMPORTANT: Evidence Tier Requirement (IGR v6.0)
+Every finding MUST carry two additional fields:
+  evidenceTier: integer 0–4
+    0 = STATIC      — inferred from source alone (weakest)
+    1 = STRUCTURAL  — grepped / globbed / counted instances
+    2 = OBSERVATIONAL — ran a tool (lint, typecheck, npm audit) and read output
+    3 = INTERACTIVE — executed code/tests and observed behavior
+    4 = AUTOMATED   — deterministic check in a quality gate / test suite
+  evidenceNote: one-line string citing what produced the evidence
+    examples: "grep 'JSON\\.parse' returned 7 matches in src/api/"
+              "ran require.resolve() — path resolves correctly"
+              "executed tests/foo.test.js and observed assertion failure"
+SEVERITY IS CAPPED BY TIER:
+  - Tier 0: severity MUST be LOW (and will be flagged UNVERIFIED in the report)
+  - Tier 1: severity capped at MEDIUM (unless grep returned >=5 instances → HIGH allowed)
+  - Tier 2+: severity stands as you assign it
+Also respect the FRAMING ARTIFACT from Phase 0 — only report findings within
+`scopeIn`. Findings outside `scopeOut` will be moved to an appendix by the
+orchestrator.
 ```
+**Why evidence tiers matter**: During this project's own self-review (session logs), a `code-reviewer` agent reported an F1 finding as "Critical — broken require path" without citing evidence. Manual verification via `require.resolve()` showed the path was correct — the agent's path math was flawed. With tier enforcement, F1 would have been Tier 0 (no grep, no execution), capped at LOW, and flagged UNVERIFIED — alerting the reader to verify before acting.
+**Config toggles**: `review.evidenceTiers.enabled` (default true), `review.evidenceTiers.capByTier` (default true — enforce severity caps).
 **2.4. Launch ALL agents in parallel** (single message with N Task tool calls, subagent_type=Explore)
 **2.5. Wait for all agents to complete**
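The severity caps described in the agent prompt suffix above can be sketched as a small function. This is a hedged illustration, not the package's implementation: the four-level severity scale is taken from the review summary line ("critical, high, medium, low"), `matchCount` is a hypothetical field standing in for the grep instance count, and reading "HIGH allowed" as a cap at HIGH is an assumption.

```javascript
// Severity scale, weakest to strongest (assumed from the review summary format).
const SEVERITY_ORDER = ['LOW', 'MEDIUM', 'HIGH', 'CRITICAL'];

// Applies the tier-based cap to one finding and flags Tier 0 findings as unverified.
function capSeverity(finding) {
  const { severity, evidenceTier, matchCount = 0 } = finding;
  const rank = SEVERITY_ORDER.indexOf(severity);
  if (rank === -1) {
    throw new Error(`Unknown severity: ${severity}`);
  }
  let cap;
  if (evidenceTier === 0) {
    cap = 'LOW'; // Tier 0: static inference only
  } else if (evidenceTier === 1) {
    cap = matchCount >= 5 ? 'HIGH' : 'MEDIUM'; // Tier 1: >=5 instances lifts the cap
  } else {
    cap = 'CRITICAL'; // Tier 2+: severity stands as assigned
  }
  const cappedRank = Math.min(rank, SEVERITY_ORDER.indexOf(cap));
  return { ...finding, severity: SEVERITY_ORDER[cappedRank], unverified: evidenceTier === 0 };
}
```

Applied to the F1 example in the rationale, a Tier 0 finding reported as Critical would come back as LOW with `unverified: true`.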
@@ -657,7 +752,7 @@ clean. Generic praise like "looks good" is NOT acceptable.
 **2.7. Display Phase 2 results (per-agent sections)**:
 ```
 ═══════════════════════════════════════
-PHASE 2: AI REVIEW [2/5]
+PHASE 2: AI REVIEW [2/7]
 ═══════════════════════════════════════
 Agents: N launched (3 core + 1 optional + 2 project-rules)
@@ -713,7 +808,7 @@ git diff --name-only               # For unstaged changes
 **2.5.5. Display Phase 2.5 results**:
 ```
 ═══════════════════════════════════════
-PHASE 2.5: GIT-VERIFIED CLAIMS [2.5/5]
+PHASE 2.5: GIT-VERIFIED CLAIMS [3/7]
 ═══════════════════════════════════════
 Spec: .workflow/changes/wf-XXXXXXXX.md
@@ -733,6 +828,52 @@ Summary: X verified, Y missing, Z unplanned
 ---
+### PHASE 2.8: Findings Adversary Critique (IGR v6.0)
+**Config toggle**: `config.review.adversaryPass.enabled` (default `true`; MANDATORY when framing posture is `pre-ship`). Reference: [intent-grounded-review.md → Phase 2.8](../docs/intent-grounded-review.md#phase-28-findings-adversary-critique--reference-detail).
+**Procedure**:
+1. **Collect inputs**: the framing artifact + all Phase 2 findings (with `evidenceTier` + `evidenceNote`) + Phase 2.5 git-claim results.
+2. **Launch ONE Agent sub-agent** (`subagent_type=Explore`, READ-ONLY) on a DIFFERENT model than the review agents. Resolve via `config.review.adversaryPass.adversaryModel` mapping: agents on Sonnet → adversary on Opus; agents on Opus → adversary on Sonnet; agents on Haiku → adversary on Sonnet. **Override-always rule**: if the resolved value equals the agent model, pick a different model anyway.
+3. **Adversary prompt** — produce JSON with: `falsePositives[]`, `missedIssues[]`, `severityAdjustments[]`, `scopeDrift[]`, `evidenceChallenges[]`, `overallVerdict` (`ACCEPT | ACCEPT_WITH_ADJUSTMENTS | REVISE_SCOPE | BLOCK`).
+    HUNT specifically for: (a) `evidenceTier=0` + severity ≥ HIGH, (b) line-number claims without code quotes, (c) "broken require path" / "missing import" / "wrong type" without `require.resolve` / `tsc` / `grep` verification, (d) findings contradicting `scopeIn`/`scopeOut`.
+    Forbid "I think" / "might" / "could" — require evidence. Full prompt template in the reference doc.
+4. **Parse + apply adjustments**: `severityAdjustments` rewrite severity (mark `[ADVERSARY-ADJUSTED]`); `scopeDrift` moves to appendix; `falsePositives` marked `[DISPUTED]` (not removed); `missedIssues` appended as `[ADVERSARY-FOUND]` Tier-0; `evidenceChallenges` downgrade tier and re-apply severity cap.
+5. **Archive** run to `.workflow/state/adversary-runs/review-{timestamp}.json` for the pattern-promotion pipeline.
+6. **Display Phase 2.8 results**:
+```
+═══════════════════════════════════════
+PHASE 2.8: FINDINGS ADVERSARY [4/7]
+═══════════════════════════════════════
+Adversary model: [model]  (agents: [agent-model])
+Verdict:         [ACCEPT | ACCEPT_WITH_ADJUSTMENTS | REVISE_SCOPE | BLOCK]
+False positives:       N  (marked [DISPUTED])
+Severity adjustments:  N  (marked [ADVERSARY-ADJUSTED])
+Missed issues found:   N  (appended as [ADVERSARY-FOUND] Tier-0 findings)
+Scope drift:           N  (moved to Out-of-Scope appendix)
+Evidence challenges:   N  (tier downgraded, severity re-capped)
+[For each item, one-line summary with finding ID + reason]
+✓ Phase 2.8 complete. Proceeding to Phase 3...
+```
+**One pass only** — no iteration loop. If the adversary `BLOCKS`, display the block reason prominently and require the user to acknowledge before proceeding to Phase 3 — or to retry the review with adjusted scope.
+**Config toggles**: `review.adversaryPass.enabled` (default true), `review.adversaryPass.adversaryModel` (mapping object — see "Adversary Model Selection Rule" in the Architecture Note; resolve at runtime based on agent model, override-always rule applies), `review.adversaryPass.applySeverityAdjustments` (default true), `review.adversaryPass.applyScopeDrift` (default true), `review.adversaryPass.blockOnBlockVerdict` (default true).
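The adversary model resolution and its override-always rule from step 2 can be sketched as follows. The default mapping mirrors the pairs stated in the procedure (Sonnet → Opus, Opus → Sonnet, Haiku → Sonnet); the lowercase key format, the `KNOWN_MODELS` list, and the function name are assumptions for illustration, since the diff does not show the actual config shape.

```javascript
// Default pairs as stated in the Phase 2.8 procedure (key format is assumed).
const DEFAULT_ADVERSARY_MAP = { sonnet: 'opus', opus: 'sonnet', haiku: 'sonnet' };
// Hypothetical fallback pool used only when the mapping gives no usable answer.
const KNOWN_MODELS = ['sonnet', 'opus', 'haiku'];

// Resolves the adversary model; never returns the same model the agents ran on.
function resolveAdversaryModel(agentModel, mapping = DEFAULT_ADVERSARY_MAP) {
  const resolved = mapping[agentModel];
  if (resolved && resolved !== agentModel) {
    return resolved;
  }
  // Override-always rule: the mapping is missing or points back at the agent
  // model, so pick any different model instead.
  return KNOWN_MODELS.find((model) => model !== agentModel);
}
```

A misconfigured mapping such as `{ opus: 'opus' }` therefore still yields a different model for the adversary.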
+---
 ### PHASE 3: Standards Compliance [STRICT]
 **This phase BLOCKS review completion if MUST_FIX violations are found.**
@@ -766,7 +907,7 @@ After running the standards check, feed any violations through the pattern promo
 **3.4. Display Phase 3 results**:
 ```
 ═══════════════════════════════════════
-PHASE 3: STANDARDS COMPLIANCE [3/5]
+PHASE 3: STANDARDS COMPLIANCE [5/7]
 ═══════════════════════════════════════
 ✓ decisions.md: passed
@@ -809,7 +950,7 @@ Or if the runtime script is not available, manually analyze changed files for:
 **4.3. Display Phase 4 results**:
 ```
 ═══════════════════════════════════════
-PHASE 4: SOLUTION OPTIMIZATION [4/5]
+PHASE 4: SOLUTION OPTIMIZATION [6/7]
 ═══════════════════════════════════════
 Technical (N):
@@ -850,7 +991,7 @@ Phase Results:
 Total Findings: N (X critical, Y high, Z medium, W low)
 Pattern Learning: P patterns tracked, M promoted, G enforcement gaps
-Phases: 5/5 executed
+Phases: 7/7 executed
 ```
 **5.2. Present severity-aware fix options to user** (use AskUserQuestion):
@@ -990,14 +1131,51 @@ This ensures that patterns discovered during code review feed into the same prom
 - Save review report to `.workflow/reviews/YYYY-MM-DD-HHMMSS-review.md`
 - Include: date, files reviewed, mode, all findings with status (fixed/task-created/dismissed), summary
-**5.6. Sign-off gate**:
+**5.6. Completion Truth Gate (IGR v6.0)** — runs BEFORE sign-off:
+**Config toggle**: `config.review.completionTruthGate.enabled` (default `true`).
+**Problem this solves**: A review's "fixed" claim is only as good as the evidence behind it. A finding marked `fixed` because the AI applied an edit is NOT the same as a finding verified to work. Without a truth gate, the sign-off rubber-stamps whatever the agent says.
+**Procedure** — for every finding now marked `status: fixed`:
+1. **Check evidence tier of the fix**:
+   - Did the fix come with an executed test (`tier ≥ 3 INTERACTIVE`)?
+   - Or an automated gate confirming the fix (`tier 4 AUTOMATED`)?
+   - Or just an edit + lint pass (`tier 2 OBSERVATIONAL`)?
+   - Or just an edit (`tier 0 STATIC`)?
+2. **Downgrade rule**:
+   - `tier ≥ 3` → status stays `fixed` (INTERACTIVE evidence is sufficient)
+   - `tier 2` → status downgraded to `fixed-unverified` (lint/typecheck passed but behavior not exercised)
+   - `tier ≤ 1` → status downgraded to `implemented-unverified` (edit applied, no evidence of correctness)
+3. **Display the downgrade in the final summary**:
+```
+━━━ COMPLETION TRUTH GATE ━━━
+  Findings marked "fixed":         N
+  Tier ≥ 3 (INTERACTIVE):          M  → status stands
+  Tier 2 (OBSERVATIONAL):          K  → downgraded to "fixed-unverified"
+  Tier ≤ 1 (STATIC/STRUCTURAL):    J  → downgraded to "implemented-unverified"
+  ⚠ K + J findings lack runtime proof of fix.
+  To upgrade: run the relevant tests / smoke-test / browser check and re-verify.
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+4. **Persist downgraded statuses** to `last-review.json`. Do NOT silently mark everything as complete.
+**Config toggles**: `review.completionTruthGate.enabled` (default true), `review.completionTruthGate.requireInteractiveForFixed` (default true — when false, Tier 2 counts as fully fixed).
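The downgrade rule in step 2, together with the `requireInteractiveForFixed` toggle, can be sketched as a pure function over a finding. This is an illustration under assumptions: `fixEvidenceTier` is a hypothetical field name for the tier of the evidence behind the fix, and a missing tier is treated like Tier 0.

```javascript
// Downgrades a "fixed" status according to the evidence tier behind the fix.
function applyTruthGate(finding, { requireInteractiveForFixed = true } = {}) {
  if (finding.status !== 'fixed') {
    return finding; // the gate only inspects findings claimed as fixed
  }
  const tier = finding.fixEvidenceTier;
  if (tier >= 3) {
    return finding; // INTERACTIVE or AUTOMATED evidence: status stands
  }
  if (tier === 2) {
    // OBSERVATIONAL: downgraded unless the toggle lets Tier 2 count as fixed
    return requireInteractiveForFixed ? { ...finding, status: 'fixed-unverified' } : finding;
  }
  // Tier 0, Tier 1, or no recorded tier: edit applied without evidence of correctness
  return { ...finding, status: 'implemented-unverified' };
}
```

Mapping this over all findings before sign-off gives the M, K, and J counts shown in the summary block.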
+**5.7. Sign-off gate**:
 - Present summary to user and ask for confirmation that the review is complete
-- If user requests additional fixes, return to step 5.3
+- Display the truth-gate downgrade counts prominently — the user should consciously accept unverified fixes, not have them hidden
+- If user requests additional fixes or verification, return to step 5.3
-**5.7. Display final checkpoint**:
+**5.8. Display final checkpoint**:
 ```
 ═══════════════════════════════════════
-PHASE 5: POST-REVIEW COMPLETE [5/5]
+PHASE 5: POST-REVIEW COMPLETE [7/7]
 ═══════════════════════════════════════
 Findings: N total
@@ -1010,7 +1188,7 @@ Pattern Learning:
 Run /wogi-review-fix --pending to batch-process deferred items.
-Phases: 5/5 executed
+Phases: 7/7 executed
 Review complete.
 ```

package/.claude/commands/wogi-start.md CHANGED

@@ -101,6 +101,51 @@ When a local `/wogi-*` CLI command fails (error in output, "Unknown skill", comm
 - After `/wogi-start` classifies as conversation: Read, Glob, Grep, WebSearch, WebFetch (read-only). No Edit/Write/state modifications.
 - Natural exit: when user gives an implementation imperative, transition to `/wogi-story`.
+**Research Reasoning Gate** (applies inside Conversation mode when `config.researchReasoningGate.enabled` — default ON): classify the question into a tier based on structural markers. Do NOT self-classify the question's complexity — use the markers below mechanically. When ambiguous, default to Tier 2.
+| Tier | Marker phrases | What you do |
+|------|---------------|-------------|
+| **Tier 1 — Factual** | "what is", "how many", "show me", "list all", "which file", "where does" | Answer directly from code/docs. No gate. |
+| **Tier 2 — Domain** (default for ambiguous) | "what should", "how should", "recommend", "which approach", "what do you think about", "is it better to" | **Surface assumptions, then WAIT.** |
+| **Tier 3 — Architecture** | "should we restructure", "what's the right architecture", "design a schema", "how to migrate", "should we split / merge / replace" | Tier 2 flow + spawn adversary on a different model after recommendation. |
+**Tier 2 flow — the user is the adversary**:
+1. Before any analysis, identify the domain-model assumptions your answer will depend on (typically 2–5).
+2. Present them in a fenced block and STOP:
+   ```
+   ━━━ ASSUMPTIONS (confirm before I analyze) ━━━
+   My analysis will depend on these domain model assumptions:
+   1. <assumption 1>
+   2. <assumption 2>
+   3. <assumption 3>
+   Do these match your understanding? [confirm / correct]
+   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+   ```
+3. WAIT for the user to confirm or correct. Do not analyze while waiting.
+4. When confirmed (or corrected), ground the analysis in the user's stated model — not your original guess.
+**Tier 3 flow** — after steps 1–4 above, also:
+5. Produce the recommendation.
+6. Spawn an Agent sub-agent on a DIFFERENT model (config-controlled, default `sonnet`) with: the user's confirmed assumptions + your recommendation + the original question. Ask: "Does this recommendation follow from these assumptions? What's the strongest counterargument? List 1–3 specific concerns with line/file citations where possible."
+7. Present both the recommendation AND the adversary critique to the user in a single response:
+   ```
+   ━━━ RECOMMENDATION ━━━
+   <your recommendation>
+   ━━━ ADVERSARY CRITIQUE (reviewed by a different model) ━━━
+   <sub-agent output>
+   ```
+8. One pass only — this is conversation, not implementation. No iteration loop.
+**Config toggles**:
+- `researchReasoningGate.enabled` — master switch
+- `researchReasoningGate.tier2.enabled` — assumption surfacing
+- `researchReasoningGate.tier3.enabled` — spawn adversary
+- `researchReasoningGate.tier3.adversaryModel` — model for the critique agent (default `sonnet`)
+**Why this works** (from spec wf-6dbc0b2a): same-model self-critique is a known rubber-stamp. The USER is the effective adversary — you surface assumptions so they can validate the domain model before you build recommendations on invisible guesses.
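The mechanical, marker-based classification described in the tier table above can be sketched as a substring lookup with a Tier 2 default. The marker phrases are copied from the table; checking the highest tier first and using plain case-insensitive substring matching are assumptions made for this sketch, and `classifyQuestionTier` is a hypothetical name.

```javascript
// Marker phrases per tier, taken from the Research Reasoning Gate table.
const TIER_MARKERS = {
  3: ['should we restructure', "what's the right architecture", 'design a schema',
      'how to migrate', 'should we split', 'should we merge', 'should we replace'],
  2: ['what should', 'how should', 'recommend', 'which approach',
      'what do you think about', 'is it better to'],
  1: ['what is', 'how many', 'show me', 'list all', 'which file', 'where does'],
};

// Returns 1, 2, or 3 based on marker phrases only; ambiguous questions default to Tier 2.
function classifyQuestionTier(question) {
  const text = question.toLowerCase();
  for (const tier of [3, 2, 1]) { // assumed precedence: strictest tier wins
    if (TIER_MARKERS[tier].some((marker) => text.includes(marker))) {
      return tier;
    }
  }
  return 2;
}
```

A question with no marker at all, such as "Tell me about hooks", falls through to Tier 2, which matches the "when ambiguous, default to Tier 2" rule.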
 **Everything else**: Route to best command from catalog. Zero exemptions.
 ### Examples

package/.claude/docs/claude-code-compatibility.md CHANGED

@@ -74,6 +74,7 @@ flow parallel check  # See available parallel tasks
 | 2.5.0+ | 2.1.84+ | TaskCreated hook, YAML glob lists in rules, CLAUDE_STREAM_IDLE_TIMEOUT_MS, WorktreeCreate HTTP transport, idle-return prompt, MCP 2KB cap |
 | 2.9.0+ | 2.1.90+ | --resume deferred-tool cache fix, MCP schema perf, PostToolUse format-on-save fix, PreToolUse exit-code-2 fix, .husky protected |
 | 2.9.2+ | 2.1.97+ | Stop/SubagentStop long-session fix, subagent worktree cwd leak fix, refreshInterval status line, workspace.git_worktree, MCP HTTP/SSE leak fix, 429 backoff, compaction transcript dedup |
+| 2.18.0+ | 2.1.108+ | ENABLE_PROMPT_CACHING_1H guidance, /recap awareness, /doctor MCP duplicate-scope mirror in `/wogi-health` |
 ### Environment Variables (2.1.19+)
@@ -363,6 +364,50 @@ await cancelTask('wf-123', 'superseded', false);
 - **`/claude-api` skill updated for Managed Agents**: The `/claude-api` skill now covers Managed Agents (`/v1/agents`, `/v1/sessions`) alongside the Claude API. **Impact on WogiFlow**: Informational — WogiFlow's `claude-api` skill reference remains accurate.
+### Features in 2.1.108+
+- **`ENABLE_PROMPT_CACHING_1H` env var (RECOMMENDED for non-subscribers)**: Opts into **1-hour prompt-cache TTL** on **API key, Bedrock, Vertex, and Foundry** providers. Subscribers (Claude Pro, Max, Team, Enterprise via claude.ai OAuth) already get 1h TTL by default — this flag is a **no-op for them**. The complementary `FORCE_PROMPT_CACHING_5M` pins to 5min, and the older `ENABLE_PROMPT_CACHING_1H_BEDROCK` is deprecated but still honored. **Impact on WogiFlow (HIGH)**: WogiFlow sessions load a large, stable prefix every turn — CLAUDE.md (~300 lines), state files (`ready.json`, `decisions.md`, `app-map.md`), phase files, and pinned spec context. At the default 5min TTL, any pause longer than 5 minutes (user thinking, a long `flow` CLI run, a meeting mid-session) invalidates the cache and the next turn pays the full input-token cost again. At 1h TTL, the same prefix stays cached across those pauses, yielding **substantial token-cost reduction** on typical multi-hour WogiFlow work. **Action for API-key / Bedrock / Vertex / Foundry users**: `export ENABLE_PROMPT_CACHING_1H=1` in your shell profile. **Action for subscribers**: none (already enabled). **Risk**: none — if set on a subscriber account it is ignored; if set when not supported, it silently falls back.
+- **`/recap` command and session recap feature**: Provides context when returning to a session. Configurable in `/config` and manually invocable with `/recap`. For users with telemetry disabled (Bedrock/Vertex/Foundry/`DISABLE_TELEMETRY`), recap is still enabled by default; opt out via `/config` or `CLAUDE_CODE_ENABLE_AWAY_SUMMARY=0`. **Overlap with WogiFlow**: `/wogi-morning`, `/wogi-session-end`, and `/wogi-pre-compact` already provide durable recap via state files. `/recap` is ephemeral (summarizes the current session); WogiFlow's state survives session exit. Use both: `/recap` for intra-session context, `/wogi-morning` for cross-session pickup.
+- **Built-in slash commands via Skill tool**: Claude can now discover and invoke `/init`, `/review`, `/security-review` via the Skill tool. **Impact on WogiFlow**: No collision — all WogiFlow commands use the `wogi-*` prefix (`/wogi-review`, `/wogi-init`, `/wogi-review-fix`). Natural-language routing in CLAUDE.md directs "code review" phrases to `/wogi-review`, not the built-in `/review`. If a user explicitly types `/review`, Claude Code handles it natively — this is expected.
+- **`/model` mid-conversation warning**: `/model` now warns before switching models mid-conversation, since the next response re-reads the full history uncached. **Impact on WogiFlow**: Relevant for hybrid mode (`/wogi-hybrid`) — switching the executor model via `/model` during hybrid execution wastes the cached context. WogiFlow's `/wogi-hybrid-setup` is the correct way to change executor models between sessions rather than mid-session.
+- **`DISABLE_PROMPT_CACHING*` startup warning**: Claude Code now warns at startup when prompt caching is disabled via `DISABLE_PROMPT_CACHING*` env vars. **Impact on WogiFlow**: WogiFlow's heavy context prefix makes disabled caching **expensive**. This warning helps users who accidentally disabled caching (e.g., copy-pasted env from another project) spot the regression fast.
+- **`/undo` alias for `/rewind`**: `/undo` is now an alias for `/rewind`. **Impact on WogiFlow**: WogiFlow's `/wogi-pre-compact` and `/wogi-suspend` are complementary rather than overlapping: `/undo`/`/rewind` rolls back message turns, while the WogiFlow flows preserve state across sessions.
+- **Memory footprint reductions for file reads**: Language grammars now load on demand, reducing memory for file reads, edits, and syntax highlighting. **Impact on WogiFlow**: Long WogiFlow sessions (especially `/wogi-bulk-loop` continuous runs) use noticeably less RAM. No code change needed.
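The cache-TTL reasoning in the `ENABLE_PROMPT_CACHING_1H` entry above can be illustrated with a small sketch. It assumes that the cache TTL is refreshed on every turn and that the cached prefix expires whenever the idle gap before a turn exceeds the TTL. The helper `countCacheExpirations` and the sample gaps are hypothetical and exist only for this example; they are not part of WogiFlow or Claude Code.

```javascript
// Hypothetical helper for illustration only; not part of WogiFlow or Claude Code.
// Counts how many turns would start with an expired prompt cache, assuming the
// TTL is refreshed on every turn and the cache expires when a gap exceeds it.
function countCacheExpirations(gapsMinutes, ttlMinutes) {
  return gapsMinutes.filter((gap) => gap > ttlMinutes).length;
}

// Assumed example session: minutes of idle time before each turn.
const gaps = [2, 7, 12, 45, 3];

const expirationsAt5m = countCacheExpirations(gaps, 5);
const expirationsAt1h = countCacheExpirations(gaps, 60);

console.log(`5min TTL: ${expirationsAt5m} full-price prefix reloads`);
console.log(`1h TTL: ${expirationsAt1h} full-price prefix reloads`);
```

In this assumed session, three of the five pauses outlast a 5-minute TTL, while none outlast a 1-hour TTL. Real savings depend on prefix size and provider pricing, which this sketch does not model.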
+### Features in 2.1.110+
+- **PreToolUse hook `additionalContext` preserved on tool failure (BUG FIX, GOOD NEWS)**: Previously, when a tool call failed, any `additionalContext` returned by PreToolUse hooks was **dropped**. Fixed in 2.1.110. **Impact on WogiFlow (HIGH)**: WogiFlow injects `additionalContext` in 8 places via `scripts/hooks/adapters/claude-code.js` (PreToolUse, UserPromptSubmit, SessionStart) for routing enforcement, phase-gate messages, component reuse hints, and session-start task context. Before this fix, if a guarded tool call failed, WogiFlow's context message vanished — producing "silent" hook behavior that was confusing to debug. After this fix, WogiFlow's hook messages are reliably delivered regardless of tool outcome. **Action**: none — automatic improvement after upgrade.
+- **`/doctor` warns on duplicate MCP server definitions across scopes**: When the same MCP server is defined in more than one of the user (`~/.claude/settings.json`), project (`.claude/settings.json`), and local (`.claude/settings.local.json`) scopes with different endpoints, `/doctor` now flags the conflict. **Impact on WogiFlow**: `/wogi-health` mirrors this check in `flow-health.js`: it scans the same three scopes and reports duplicate MCP server names with divergent endpoints as a health finding (v2.18.0+).
+- **PushNotification tool**: Claude can send mobile push notifications when Remote Control and "Push when Claude decides" config are enabled. **WogiFlow opportunity**: Long-running autonomous loops (`/wogi-bulk`, `/wogi-bulk-loop`) could emit a notification on completion, blocker, or extended hang. Tracked as a future enhancement; not auto-wired.
+- **Bash tool timeout enforcement**: The Bash tool now enforces the documented maximum timeout (600000ms / 10min) instead of accepting arbitrarily large values. **Impact on WogiFlow**: No impact — all WogiFlow hook Bash timeouts are under 60s (verified across `.claude/settings.json` and `scripts/hooks/`).
+- **stdio MCP servers no longer disconnect on stray non-JSON lines**: Fixed a regression from 2.1.105 where stdio MCP servers that print stray non-JSON lines to stdout were disconnected on the first stray line. **Impact on WogiFlow**: WogiFlow has no custom MCP servers in-repo. User-installed MCP servers (figma, atlassian, gmail) benefit automatically.
+- **PermissionRequest hook `updatedInput` re-check**: Fixed a bug where `updatedInput` returned by PermissionRequest hooks was not re-checked against `permissions.deny` rules; `setMode:'bypassPermissions'` updates now also respect `disableBypassPermissionsMode`. **Impact on WogiFlow**: WogiFlow does not implement PermissionRequest hooks (only PermissionDenied for logging). Not affected.
+- **`--resume`/`--continue` resurrects unexpired scheduled tasks**: Resuming a session with `--resume` or `--continue` now restores scheduled tasks (cron/CronCreate) that have not yet expired. **Impact on WogiFlow**: WogiFlow does not currently use Claude Code's cron feature. Not affected; tracked as a future opportunity for automated maintenance tasks.
+- **`/context`, `/exit`, `/reload-plugins` work from Remote Control (mobile/web) clients**: Remote Control users can now invoke these built-ins. **Impact on WogiFlow**: WogiFlow has no TTY-only code paths — all `/wogi-*` skills already work identically on Remote Control. Users can now do full WogiFlow-driven work from mobile/web.
+- **`/tui` command and `tui` setting**: `/tui fullscreen` switches to flicker-free rendering in the same conversation. The focus view is now toggled separately with `/focus` (Ctrl+O now toggles verbose transcript only). **Impact on WogiFlow**: Documentation only — no runtime dependency on Ctrl+O. The WogiFlow statusline works identically in both TUI modes.
+- **`autoScrollEnabled` config**: New setting to disable conversation auto-scroll in fullscreen mode. Purely UX — no WogiFlow impact.
+- **Write tool reports IDE diff edits**: The Write tool now informs the model when the user edits the proposed content in the IDE diff before accepting. **Impact on WogiFlow**: Useful signal for learning — WogiFlow's `/wogi-correction` could eventually consume this to detect "user edited my output" events. Not auto-wired; tracked as an enhancement.
+- **TRACEPARENT/TRACESTATE in SDK/headless sessions**: SDK and headless sessions now read W3C trace headers from the environment for distributed trace linking. **Impact on wogiflow-cloud**: The Teams backend can propagate trace context from CI/CD pipelines into WogiFlow sessions for end-to-end observability. Tracked as a cloud opportunity.
+- **Hardened "Open in editor" against command injection**: Security hardening against command injection from untrusted filenames. **Impact on WogiFlow**: Reinforces the pattern already documented in `.claude/rules/security/security-patterns.md`: external inputs going into shell commands must be validated. No WogiFlow code change needed.
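The duplicate MCP server check described in the `/doctor` entry above could look roughly like the following. This is a minimal sketch, not the actual `flow-health.js` implementation: it assumes each scope's parsed config exposes servers under an `mcpServers` key and that a server's endpoint is either its `url` or its `command` plus `args`. The function names and sample data are hypothetical.

```javascript
// Hypothetical sketch; not the actual flow-health.js implementation.
// Assumes each scope's parsed config exposes servers under `mcpServers`,
// and that an "endpoint" is a server's `url` or its `command` plus `args`.
function endpointOf(server) {
  if (server.url) return server.url;
  return [server.command, ...(server.args || [])].join(' ');
}

function findDivergentMcpServers(scopes) {
  const seen = new Map(); // server name -> Map(endpoint -> [scope names])
  for (const [scopeName, config] of Object.entries(scopes)) {
    const servers = (config && config.mcpServers) || {};
    for (const [name, server] of Object.entries(servers)) {
      const endpoints = seen.get(name) || new Map();
      const endpoint = endpointOf(server);
      endpoints.set(endpoint, [...(endpoints.get(endpoint) || []), scopeName]);
      seen.set(name, endpoints);
    }
  }
  // Only names that resolve to more than one distinct endpoint are findings.
  return [...seen.entries()]
    .filter(([, endpoints]) => endpoints.size > 1)
    .map(([name, endpoints]) => ({ name, endpoints: Object.fromEntries(endpoints) }));
}

// Assumed sample data: `figma` diverges across scopes, `docs` does not.
const findings = findDivergentMcpServers({
  user: { mcpServers: { figma: { url: 'https://example.test/a' } } },
  project: {
    mcpServers: {
      figma: { url: 'https://example.test/b' },
      docs: { command: 'docs-mcp' },
    },
  },
  local: { mcpServers: { docs: { command: 'docs-mcp' } } },
});

console.log(JSON.stringify(findings, null, 2));
```

A server that appears in several scopes with the same endpoint (here `docs`) is not reported, which matches the "divergent endpoints" wording above; only `figma` is returned as a finding.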
 ### Simple Mode Naming Distinction
 Claude Code's `CLAUDE_CODE_SIMPLE` environment variable (which enables a simplified tool set) is **unrelated** to WogiFlow's `loops.simpleMode` (a lightweight task completion loop using string detection). They are separate features that happen to share the word "simple":
@@ -497,4 +542,4 @@ Run `/keybindings` in Claude Code to customize your shortcuts.
 ---
-*Last updated: 2026-04-09*
+*Last updated: 2026-04-16*