npm - wogiflow - Versions diffs - 2.25.0 → 2.26.0 - Mend

wogiflow 2.25.0 → 2.26.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

package/.claude/commands/wogi-debug-hypothesis.md +1 -1
package/.claude/commands/wogi-peer-review.md +5 -3
package/.claude/commands/wogi-session-end.md +1 -1
package/.claude/commands/wogi-triage.md +31 -15
package/.workflow/templates/claude-md.hbs +4 -0
package/.workflow/templates/partials/methodology-rules.hbs +149 -0
package/lib/workspace-routing.js +44 -0
package/package.json +2 -2
package/scripts/flow-completion-truth-gate.js +130 -0
package/scripts/flow-config-defaults.js +35 -1
package/scripts/flow-extraction-review.js +18 -3
package/scripts/flow-morning.js +12 -3
package/scripts/flow-session-end.js +67 -6
package/scripts/hooks/core/phase-gate.js +22 -0
package/scripts/hooks/core/post-compact.js +15 -0
package/scripts/hooks/core/pre-tool-orchestrator.js +24 -0
package/scripts/hooks/core/research-evidence-gate.js +280 -0
package/scripts/hooks/core/session-end.js +7 -0
package/scripts/hooks/entry/claude-code/pre-tool-use.js +15 -0
package/.workflow/state/bugfix-scope.json.template +0 -7
package/.workflow/state/deploy-history.json.template +0 -3
package/.workflow/state/deploy-routes.json.template +0 -4
package/.workflow/state/strike-tracker.json.template +0 -3

package/.claude/commands/wogi-debug-hypothesis.md CHANGED

@@ -176,7 +176,7 @@ After all agents complete, display the consolidated results:
 ### Step 4: Hypothesis Adversary (v2.23.0+ — MANDATORY unless `--no-adversary`)
-After consolidation, spawn a single Agent (different `model` param if `config.hybrid.enabled`, else same) with this prompt:
+After consolidation, spawn a single Agent on a DIFFERENT model (default `sonnet` via `config.researchReasoningGate.tier3.adversaryModel` — canonical cross-command adversary key, same as `/wogi-peer-review`, `/wogi-learn`, `/wogi-decide`) with this prompt:
 ```
 You are the hypothesis adversary.

package/.claude/commands/wogi-peer-review.md CHANGED

@@ -47,9 +47,11 @@ Models are selected once per session and remembered for subsequent runs.
 ├─────────────────────────────────────────────────────────┤
 │  1. Collect code changes (git diff or specified files)   │
 │  2. Classify change size → effort tier:                  │
-│     L0/L1 (>10 files)  → opus-4-7 xhigh                  │
+│     L0/L1 (>10 files)  → opus (latest) xhigh             │
 │     L2 (3-10 files)    → sonnet medium                   │
 │     L3 (<3 files)      → haiku medium                    │
+│     (Model IDs resolve from config.models — avoid        │
+│      hardcoding model version in this doc.)              │
 │  3. Generate improvement-focused prompt                  │
 │  4. If includeClaude enabled:                            │
 │     - Launch Claude review (Task agent, Explore type)    │
@@ -96,7 +98,7 @@ analysis, EACH carrying an explicit evidence tier.
 ## Synthesis Adversary (v2.23.0+ — MANDATORY unless `--no-adversary`)
-After initial synthesis, spawn a single adversary agent on a DIFFERENT model from the synthesizer (default: if synthesizer is Opus, adversary is Sonnet; config via `peerReview.adversaryModel`). Prompt:
+After initial synthesis, spawn a single adversary agent on a DIFFERENT model from the synthesizer (default `sonnet`; override via the canonical `config.researchReasoningGate.tier3.adversaryModel` — same key used by `/wogi-debug-hypothesis`, `/wogi-learn`, `/wogi-decide`). Prompt:
 ```
 You are the synthesis adversary.
@@ -199,7 +201,7 @@ For manual review (no API keys needed): `/wogi-peer-review --manual`
 | `--verbose` | Show full model responses |
 | `--create-tasks` | Auto-create tasks for strong agreements |
 | `--no-adversary` | Skip the v2.23.0 synthesis adversary (not recommended for L0/L1 diffs) |
-| `--adversary-model <id>` | Override adversary model (default: cross-model from synthesizer) |
+| `--adversary-model <id>` | Override adversary model (default: `config.researchReasoningGate.tier3.adversaryModel`, usually `sonnet`) |
 | `--effort <level>` | Override effort tier (low/medium/high/xhigh/max) — otherwise derived from diff size |
 ARGUMENTS: {args}

package/.claude/commands/wogi-session-end.md CHANGED

@@ -13,7 +13,7 @@ Steps:
 6. **Completion-claim honesty scan** - Surface done-in-text-but-not-in-status contradictions (2026-04-16 honesty-infra)
 7. **Workspace session-end message (v2.23.0+)** - If running inside a workspace manager, write a `heads-up` message to `.workspace/messages/` so workers know no new dispatches are coming
 8. **Commit changes** - Stage and commit all workflow files
-9. **Offer to push** - Ask if should push to remote
+9. **Offer to push** - Ask if should push to remote. In workspace worker mode, the prompt is suppressed (workers cannot prompt the user directly). When `config.sessionEnd.autoPushInWorker` is `true` (default), the worker auto-pushes silently. When `false`, the push is skipped and the user pushes manually later. Non-worker sessions are unchanged.
 Output:
 ```

package/.claude/commands/wogi-triage.md CHANGED

@@ -357,33 +357,45 @@ Each finding is displayed using these fields from `last-review.json`:
 | Issue | `finding.issue` | "Raw JSON.parse without try-catch" |
 | Recommendation | `finding.recommendation` | "Use safeJsonParse from flow-utils.js" |
-## Anti-Deferral Enforcement (v2.25.0+ — MANDATORY)
+## Anti-Deferral Enforcement (v2.25.0+ — two layers)
-The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026-04-15) extends to `/wogi-triage` mechanically in v2.25.0+. Prevents the rubber-stamp pattern where the AI silently drops findings from "fix all" requests.
+The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026-04-15) gets two complementary enforcement layers. One mechanical (an actual gate in the codebase), one AI-followed (a protocol documented here that the triage flow honors).
-**Enforcement rules**:
+### Layer 1 — Mechanical gate (v2.25.1+)
-1. **"Defer" / "skip" requires explicit user confirmation with a reason.** When the AI or user proposes to defer a finding, the triage flow MUST prompt:
+`scripts/flow-completion-truth-gate.js` exports `parseCommitMessageClaims()` and `verifyCommitMessageAgainstDiff()`. Callers pass a commit message and the staged diff (or changed-files list); the function parses finding IDs (`F1`/`M1`/`SEC-001`), task IDs (`wf-XXXXXXXX` after fix/close/resolve verbs), and file-path mentions, then checks each against the diff. Any unverified claim surfaces as a blocking prompt with three remediation options. This is real code, callable from pre-commit hooks, `flow-done.js`, or the triage flow itself.
+Example usage:
+```javascript
+const { verifyCommitMessageAgainstDiff, formatMissingClaimsMessage } =
+  require('wogiflow/scripts/flow-completion-truth-gate');
+const result = verifyCommitMessageAgainstDiff(commitMsg, { diffText, changedFiles });
+if (!result.ok) {
+  console.error(formatMissingClaimsMessage(result));
+  // Block + remediate
+}
+```
+### Layer 2 — AI-followed protocol (documentation)
+The rest of the triage flow is a protocol the AI follows. It is NOT automatically enforced by a hook — the historical v2.17.4 incident showed that doc-only protocols can be violated. The mechanical gate above closes the most damaging failure mode (commit message / diff mismatch). The AI-followed rules below cover the earlier stages:
+1. **Defer requires explicit user confirmation + reason.** The triage flow prompts when proposing to defer:
    ```
    Defer finding wf-review-XXXX?
      Severity: HIGH
      Reason required: [user input]
      [Confirm defer] [Cancel — fix now]
    ```
-   Auto-defer without reason is FORBIDDEN.
+   Auto-defer without reason is forbidden by this protocol.
-2. **"Fix all" / "Option 1" / equivalent means fix ALL.** If the user requests bulk processing:
+2. **"Fix all" / "Option 1" means fix ALL.** If the user requests bulk processing:
    - Ship a fix for every finding with evidence-tier ≥ 1
    - If any finding is too large, STOP and ask: "Finding X requires ~Y minutes of work. Ship now, split to its own release, or defer (needs reason)?"
    - Never silently convert a finding to "deferred" in commit messages or release notes
-3. **Commit/release consistency check.** Before finalizing, scan the commit message / release notes against the findings list. If the message claims "fixes F1, F2, F3, M1" but M1 isn't in the diff, BLOCK with:
-   ```
-   Commit message claims M1 is fixed, but M1 does not appear in the diff.
-   Options: [Fix M1 now] [Remove M1 from message] [Acknowledge + proceed]
-   ```
-4. **Triage output includes a Deferral Audit Trail**:
+3. **Triage output includes a Deferral Audit Trail**:
    ```
    ━━━ TRIAGE SUMMARY ━━━
    Fixed: 12
@@ -394,6 +406,10 @@ The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026
    ━━━━━━━━━━━━━━━━━━━━━━
    ```
-Historical incident (v2.17.4 release, 2026-04-15): commit message claimed "fix all findings" but M1 and M3 were silently dropped. The v2.25.0+ mechanical enforcement makes that failure mode architecturally impossible — the flow stops and asks rather than letting the AI make an autonomous defer decision.
+### Honest tradeoff
+Layer 1 is genuinely mechanical — impossible for an AI to bypass without explicitly disabling the gate. Layer 2 is a protocol the AI can fail to follow if prompted poorly, distracted, or confused about priorities. Both matter; calling the whole system "architecturally impossible to bypass" would be inaccurate. The mechanical gate at least ensures that WHEN the AI writes a commit message, claimed fixes must actually appear in the diff.
+Historical incident (v2.17.4 release, 2026-04-15): commit claimed "fix all findings" but M1 and M3 were silently dropped. Layer 1 would have caught that — the commit message mentioned M1 + M3 but the diff didn't. Layer 2 is the human-protocol reinforcement.
-Skip only if `config.triage.antiDeferralEnforcement.enabled` is explicitly `false` (default: true).
+Skip via `config.triage.antiDeferralEnforcement.enabled: false` — note that this is currently a surface flag only (read by AI-followed protocol, not by the Layer 1 gate); to disable Layer 1 set `config.commitClaimsGate.enabled: false`.
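
Reviewer sketch (not part of the package): the Layer 1 finding-ID check described above can be illustrated with a simplified, self-contained restatement. The regular expression is copied from the `flow-completion-truth-gate.js` hunk in this diff; `extractFindingIds` and `findMissing` are hypothetical names introduced here for illustration and are not wogiflow exports. Task-ID and file-path claims are left out.

```javascript
// Simplified restatement of the finding-ID check; illustrative only.
// extractFindingIds and findMissing are hypothetical names, not wogiflow exports.
const FINDING_RE = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;

function extractFindingIds(commitMessage) {
  if (typeof commitMessage !== 'string') return [];
  // A Set removes duplicates while keeping first-seen order.
  return [...new Set(commitMessage.match(FINDING_RE) || [])];
}

function findMissing(commitMessage, diffText) {
  // Plain substring search, mirroring how the gate treats finding IDs.
  return extractFindingIds(commitMessage).filter(id => !diffText.includes(id));
}

const message = 'fix: address F1, F2 and M1 from review';
const diff = '+// F1: guard JSON.parse\n+// F2: validate input\n';
console.log(findMissing(message, diff)); // [ 'M1' ]
```

Because verification is a substring search, an ID such as `F1` would also count as verified if the diff happened to contain `F10`; the real gate appears to share this property, so it is probably best treated as a consistency check rather than a proof that the fix exists.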

package/.workflow/templates/claude-md.hbs CHANGED

@@ -432,6 +432,10 @@ Use `/wogi-research "question"` for rigorous verification.
 ---
+{{> methodology-rules}}
+---
 ## Generated by CLI Bridge
 This file was generated by the Wogi Flow CLI bridge.

package/.workflow/templates/partials/methodology-rules.hbs ADDED

@@ -0,0 +1,149 @@
+## WogiFlow Methodology Rules
+These are product-level rules that apply to every WogiFlow session. They ship with the tool — enforcement is in the shipped scripts/hooks, and the text below explains the contract to Claude so it doesn't try to work around the enforcement.
+---
+### Research Before Propose (MANDATORY)
+**Rule**: Before proposing any fix, plan, or spec, audit existing infrastructure for the problem area. Propose only what fills a confirmed gap. Evidence-before-invention.
+**What counts as research**: read relevant files in `.workflow/state/` (decisions.md, feedback-patterns.md, app-map.md, function-map.md, api-map.md), read the task spec from `.workflow/changes/` or `.workflow/specs/`, grep existing hooks/classifiers/gates, read relevant source files.
+**Why**: baseline LLM training biases toward generating plausible-sounding solutions. In a codebase with existing infrastructure, "plausible" is frequently wrong — proposing a feature that already exists, missing an existing pattern, or reinventing a wired-up hook. The correction cycle cost (user rejecting → replanning → rejecting again) is higher than the upfront audit cost.
+**You MAY ask the user clarifying questions when genuinely needed.** The rule is not "never ask" — it is "don't propose before researching." Asking is a valid escape hatch; proposing without evidence is not.
+**Enforcement**: `scripts/hooks/core/research-evidence-gate.js` tracks state-file reads (`.workflow/state/`, `.workflow/changes/`, `.workflow/specs/`, `.workflow/epics/`) in the current task turn. Three enforcement points check the evidence fingerprint before proposal actions:
+1. **Phase transition** — `transitionPhase()` blocks `→ spec_review` and `→ coding` until `minEvidence` distinct state/spec file reads have been recorded.
+2. **Spec write** — PreToolUse blocks `Edit`/`Write` to `.workflow/changes/*.md`, `.workflow/specs/*.md`, or `.workflow/epics/*.md` when evidence is below threshold.
+3. **Channel dispatch** — in workspace manager mode, `dispatchToChannel()` blocks dispatching a task to a worker until the manager has read evidence from the target member repo.
+Evidence is cleared at task start, session end, and post-compaction so each task begins with a clean slate. The `AskUserQuestion` tool is NOT gated — asking for clarification is a valid escape hatch. IGR's architect + adversary passes challenge solution *quality* downstream; this gate enforces the evidence *base* upstream.
+**Config**: `hooks.rules.researchEvidenceGate.{enabled,minEvidence}` (defaults: `true`, `2`).
+---
+### Completion-Claim Honesty Scan
+**Rule**: At session-end and on `flow health`, scan `ready.json` entries for two contradiction classes and surface (not block) them for user reconciliation.
+- **Class A — status-mismatch**: free-text field contains done-words (`done|completed|shipped|deployed|finished`) while `status` is partial (`completed-partial|blocked|in-progress|failed`).
+- **Class B — negation-vs-evidence**: free-text contains a negated claim (`no outages`, `0 regressions`) while `hotfixes[]`, `incidents[]`, or `regressions[]` is non-empty.
+**Why**: mechanical gates (test counts, lint, tsc) catch implementation errors. Narrative-quality claims in free-text fields (`notes`, `result`, `summary`, `description`) get rubber-stamped. This scan compares narrative against adjacent structured fields.
+**Mode**: surface-and-prompt, non-blocking. A hard-fail at session-end has no recovery path.
+**Enforcement**: `scripts/flow-completion-truth-gate.js` → `scanForClaimContradictions()`. Invoked by `flow-session-end.js` and `flow-health.js`.
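
Reviewer sketch (not part of the template): a minimal illustration of the Class A status-mismatch check, assuming each `ready.json` entry is a plain object with an `id`, a `status`, and the free-text fields named above. The function name `findStatusMismatches` and the `id` field are assumptions; the shipped `scanForClaimContradictions()` may be structured quite differently.

```javascript
// Hypothetical sketch of the Class A check. Names and entry shape are assumptions.
const DONE_WORDS = /\b(?:done|completed|shipped|deployed|finished)\b/i;
const PARTIAL_STATUSES = new Set(['completed-partial', 'blocked', 'in-progress', 'failed']);
const FREE_TEXT_FIELDS = ['notes', 'result', 'summary', 'description'];

function findStatusMismatches(entries) {
  const findings = [];
  for (const entry of entries) {
    // Only entries whose structured status is partial can contradict a done-word.
    if (!PARTIAL_STATUSES.has(entry.status)) continue;
    for (const field of FREE_TEXT_FIELDS) {
      const text = entry[field];
      if (typeof text === 'string' && DONE_WORDS.test(text)) {
        findings.push({ id: entry.id, field, status: entry.status });
      }
    }
  }
  return findings;
}

const entries = [
  { id: 'wf-1a2b3c4d', status: 'blocked', notes: 'Shipped to staging' },
  { id: 'wf-5e6f7a8b', status: 'completed', summary: 'All done' },
];
console.log(findStatusMismatches(entries));
// [ { id: 'wf-1a2b3c4d', field: 'notes', status: 'blocked' } ]
```

This sketch ignores negation, so a note like "not done yet" would be flagged; that is tolerable for a surface-and-prompt scan, where a false positive costs the user one dismissal.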
+---
+### Merge-Plan Artifact Gate
+**Rule**: `/wogi-finalize` requires `.workflow/scratch/merge-plan.md` for any merge with more than `config.finalization.mergePlan.threshold` commits (default 5) OR any cross-repo merge. The plan must map every commit in `git log <base>..<branch>` to one of: `port | adapt | skip-style | superseded | skip-with-reason`.
+**Mechanical invariant**: count of SHA-prefixed lines in the plan MUST equal `git log <base>..<branch> | wc -l`. Mismatch blocks the merge.
+**Structural-change sensor**: when ≥ `config.finalization.mergePlan.restructureThreshold` (default 20%) of changed files match a restructure pattern (folder-per-component, split-into-submodule, barrel-introduction, rename-new-home), a structural warning prefixes the plan and biases affected commits toward `adapt`.
+**Enforcement**: `scripts/flow-structure-sensor.js`, `.claude/commands/wogi-finalize.md` Step 2.5.
+---
+### Story Creation Quality Gates
+**Rule**: `/wogi-story` enforces five P0 specification-quality gates at creation time. Gates answer *"is the story clear, complete, checkable?"* — NOT *"is the implementation correct?"* (the latter remains `/wogi-start`'s job).
+1. **Long Input** — ≥40 lines OR ≥5 discrete items → route to `/wogi-extract-review` for zero-loss capture.
+2. **Item Reconciliation** — ≥3 discrete items → enumerated "Item Manifest" section; every item must appear in at least one criterion or sub-task. Unmapped items surface as a warning.
+3. **Consumer Impact Analysis** — refactoring keywords (`refactor`, `rename`, `migrate`, `split`, `extract`, ...) trigger `git grep` for consumers. ≥5 breaking consumers → phased migration recommendation.
+4. **Scope-Confidence Audit** — assumption patterns (`new <X>`, `existing <Y>`, `the <Z> service`) are verified against the codebase; findings go into a "Pending Clarifications" block.
+5. **Intent Bootstrap Coordination** — schedules IGR artifact bootstrap via `intentBootstrapScheduledAt` flag so `/wogi-story` and `/wogi-start` don't both prompt.
+**Guard-rails**: all gates fail-open (grep failure, classifier unavailable → warning, story still created). Gates may be bypassed via `--skip-gates` for testing.
+**Config**: `storyFlow.consumerImpactAnalysis.*`, `storyFlow.scopeConfidenceAudit.*`, `storyFlow.itemReconciliation.*` in `.workflow/config.json`.
+---
+### Workspace Autonomous-Mode Action-After-Completion Contract
+**Applies to**: workspace worker mode (`WOGI_WORKSPACE_ROOT` set + `WOGI_REPO_NAME !== 'manager'`).
+**Rule**: A worker's end-of-turn must be a deterministic action. Exactly one of these states must hold:
+1. **ACTION** — started the next pre-approved channel dispatch (invoked `/wogi-start <nextId>`), OR
+2. **ESCALATION** — channel-dispatched a `## QUESTION:` to the manager (after local resolution attempts failed), OR
+3. **IDLE** — zero pending channel dispatches AND zero in-progress tasks.
+**Hedging language is mechanically forbidden**: *"awaiting your signal"*, *"let me know if"*, *"should I continue"*, *"standing by"*, *"ready when you are"*. These invent an imaginary decision point — the manager already pre-approved the dispatch by queuing it. Visibility is NOT a substitute for action; workers narrate AND act in the same turn.
+**Enforcement**: `TaskCompleted` hook emits auto-pickup when queued dispatches exist. `Stop` hook blocks end-of-turn when a worker has queued dispatches but no in-progress task. `worker-rules.md` template carries the 3-state contract.
+**Config**: `workspace.autoPickupChannelDispatches` (default `true`).
+---
+### Workspace Worker Cannot Prompt User Directly
+**Applies to**: workspace worker mode.
+**Rule**: The `AskUserQuestion` tool is mechanically blocked in worker mode. Questions to the user MUST be channel-dispatched to the manager via `## QUESTION: ...`.
+**Why block instead of auto-redirect**: the worker must consciously choose between (a) channel-dispatching the real question to the manager for user input, or (b) making a reasonable autonomous decision and noting it in the task reply. Silent redirection removes that choice.
+**Enforcement**: `scripts/hooks/core/worker-boundary-gate.js` → `checkWorkerBoundary()`. PreToolUse hook blocks `AskUserQuestion`; block message includes the exact `curl ... --data-binary "## QUESTION: ..."` command. Config: `workspace.blockAskUserQuestionInWorker` (default `true`).
+---
+### Workspace Worker Text-Question Classifier
+**Applies to**: workspace worker mode.
+**Rule**: If a worker ends a turn with a text-based question to the user (no tool call — just hedging: *"let me know"*, *"should I"*, *"which option"*, *"thoughts?"*, trailing `?`), the Stop hook runs a Haiku classifier on the final assistant message. If it detects an open question with confidence ≥ `minConfidence` → stop is blocked with channel-dispatch instructions.
+**Why AI instead of regex**: hedging vocabulary is infinite. Regex misses novel phrasings.
+**Fail-open throughout**: missing `ANTHROPIC_API_KEY`, missing transcript path, malformed transcript, or model error → skip. Silent-stall false negatives are recoverable; false-positive blocks every turn are not.
+**Enforcement**: `scripts/flow-worker-question-classifier.js`. Config: `workspace.aiWorkerQuestionClassifier.{enabled,minConfidence,model}`.
+---
+### Workspace Worker Silent-Halt Detection
+**Applies to**: workspace manager mode.
+**Rule**: Every dispatch to a worker MUST be tracked. Any pending dispatch past its `expectedDeadline` with no matching `task-complete` or `worker-stopped` message = silent death, surfaced on the manager's next turn.
+**Three terminal states**:
+1. **Completed** — `task-complete` message arrived.
+2. **Graceful-stop** — `worker-stopped` message arrived (worker's Stop hook fired, but didn't complete).
+3. **Silent-halt** — no message, deadline passed. Worker probably dead.
+**Deadline**: default `expectedDurationMs` = 30 min. Callers override per-dispatch for long tasks.
+**Architecture — file-based, hook-driven, no background processes**:
+- `lib/workspace-dispatch-tracking.js` — record / reconcile / overdue helpers
+- `.workspace/state/dispatched-tasks.json` — ring buffer of last 100 active records
+- Manager's `dispatchToChannel()` calls `recordDispatch()` after successful POST
+- Manager's `UserPromptSubmit` hook sweeps the message bus and surfaces overdue records as `additionalContext`
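
Reviewer sketch (not part of the template): the three terminal states above, plus the still-pending case, can be expressed as a small classifier. The record and message shapes (`taskId`, `dispatchedAt`, `expectedDeadline`, `type`) and the function name `classifyDispatch` are assumptions made for illustration; the actual helpers in `lib/workspace-dispatch-tracking.js` are not shown in this diff.

```javascript
// Hypothetical sketch of silent-halt classification. Shapes and names are assumptions.
const DEFAULT_EXPECTED_DURATION_MS = 30 * 60 * 1000; // 30 minutes, per the rule text

function classifyDispatch(record, messages, now = Date.now()) {
  const forTask = messages.filter(m => m.taskId === record.taskId);
  if (forTask.some(m => m.type === 'task-complete')) return 'completed';
  if (forTask.some(m => m.type === 'worker-stopped')) return 'graceful-stop';
  // Fall back to dispatch time plus the default duration when no deadline was recorded.
  const deadline = record.expectedDeadline ?? record.dispatchedAt + DEFAULT_EXPECTED_DURATION_MS;
  return now > deadline ? 'silent-halt' : 'pending';
}

const record = { taskId: 'wf-0000abcd', dispatchedAt: 0 };
console.log(classifyDispatch(record, [], 31 * 60 * 1000)); // silent-halt
console.log(classifyDispatch(record, [], 10 * 60 * 1000)); // pending
```

Checking messages before the deadline means a late `task-complete` still resolves the record as completed rather than leaving it flagged as a silent halt.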
+---
+### Code Quality Patterns (generic)
+These apply to any codebase being built with WogiFlow's help.
+**1. Single Source of Truth for Constants** — avoid duplicating model/configuration objects across files. Import from one canonical location. Prevents drift and makes updates simpler.
+**2. Named Constants for Magic Numbers** — define thresholds and limits as named constants; don't inline literals.
+```js
+const COVERAGE_THRESHOLDS = { default: 0.7, comprehensive: 0.85, concise: 0.5 };
+```
+Self-documenting; easier to maintain.

package/lib/workspace-routing.js CHANGED

@@ -710,6 +710,50 @@ async function dispatchToChannel(workspaceRoot, repoName, taskId, opts = {}) {
     return { ok: false, message: `Invalid task ID format: "${taskId}" — expected wf-XXXXXXXX` };
   }
+  // Research-before-propose gate (manager mode): require evidence that the
+  // manager has read at least N state/spec files before dispatching work.
+  //
+  // Architectural note (ARCH-005): the dispatch gate is wired here in the
+  // routing layer rather than via pre-tool-orchestrator because channel
+  // dispatch is a lib-level operation that can be invoked outside the
+  // Claude Code PreToolUse hook path (e.g., by a worker's internal CLI
+  // or by a script). Wiring here ensures the gate fires on every dispatch,
+  // regardless of invocation surface. The spec-write gate is wired in the
+  // orchestrator because Edit/Write is hook-surface-only.
+  //
+  // Error handling (CL-005): separate catches so gate-not-installed (the
+  // intended fail-open path) is distinct from config-load failures (which
+  // indicate broken installs worth surfacing in DEBUG) and from gate
+  // runtime errors (which suggest a bug in the gate itself).
+  let dispatchGateModule = null;
+  try {
+    dispatchGateModule = require('../scripts/hooks/core/research-evidence-gate');
+  } catch (_err) {
+    // Gate module not installed — fail-open silently; this is expected on
+    // older installs that predate the research-evidence gate.
+  }
+  if (dispatchGateModule) {
+    let cfg = null;
+    try {
+      const { getConfig } = require('../scripts/flow-utils');
+      cfg = getConfig();
+    } catch (err) {
+      if (process.env.DEBUG) {
+        console.error(`[dispatchToChannel] Config load failed (gate still enforced with defaults): ${err.message}`);
+      }
+    }
+    try {
+      const dispatchGate = dispatchGateModule.checkDispatchEvidenceGate(cfg);
+      if (dispatchGate.blocked) {
+        return { ok: false, message: dispatchGate.message };
+      }
+    } catch (err) {
+      if (process.env.DEBUG) {
+        console.error(`[dispatchToChannel] Dispatch gate runtime error (fail-open): ${err.message}`);
+      }
+    }
+  }
   const configPath = path.join(workspaceRoot, 'wogi-workspace.json');
   const config = safeReadJson(configPath);
   if (!config || typeof config !== 'object') {
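
Reviewer sketch (not part of the package): the three separate `try`/`catch` stages in the hunk above amount to a layered fail-open pattern. The self-contained version below injects the loaders as functions so that each stage can be exercised in isolation; `runOptionalGate`, `loadGate`, `loadConfig`, and the `reason` field are hypothetical names, not wogiflow APIs.

```javascript
// Hypothetical helper illustrating the layered fail-open pattern.
// runOptionalGate, loadGate, and loadConfig are illustrative names only.
function runOptionalGate(loadGate, loadConfig) {
  let gate;
  try {
    gate = loadGate();
  } catch (_err) {
    // Stage 1: gate not installed, so skip it entirely.
    return { ok: true, reason: 'gate-not-installed' };
  }

  let cfg = null;
  try {
    cfg = loadConfig();
  } catch (_err) {
    // Stage 2: config failed to load; the gate still runs with its defaults.
  }

  try {
    const result = gate.check(cfg);
    if (result.blocked) return { ok: false, message: result.message };
  } catch (_err) {
    // Stage 3: a bug inside the gate must not block the caller.
    return { ok: true, reason: 'gate-runtime-error' };
  }
  return { ok: true, reason: 'passed' };
}

// A gate that blocks whenever it receives no config.
const blockingGate = { check: (cfg) => ({ blocked: cfg === null, message: 'need evidence' }) };
console.log(runOptionalGate(() => blockingGate, () => { throw new Error('bad config'); }));
// { ok: false, message: 'need evidence' }
```

The example shows the one non-obvious consequence of the design: a config-load failure does not disable the gate, so a broken install can still be blocked by a gate running on defaults.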

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wogiflow",
-  "version": "2.25.0",
+  "version": "2.26.0",
   "description": "AI-powered development workflow management system with multi-model support",
   "main": "lib/index.js",
   "bin": {
@@ -10,7 +10,7 @@
   },
   "scripts": {
     "flow": "./scripts/flow",
-    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
+    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js tests/flow-commit-claims-gate.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
     "test:syntax": "find scripts/ lib/ -name '*.js' -not -path '*/node_modules/*' -exec node --check {} +",
     "lint": "eslint scripts/ lib/ tests/",
     "lint:ci": "eslint scripts/ lib/ tests/ --max-warnings 0",

package/scripts/flow-completion-truth-gate.js CHANGED

@@ -614,6 +614,133 @@ function collectArrayEntries(obj, keys) {
   return out;
 }
+// ============================================================
+// Commit-vs-diff consistency scanner (v2.25.1 — H2b from Waves 1-3 review)
+// ============================================================
+/**
+ * Parse a commit message for "fixes X" / "closes X" / "F1, F2, M1" style claims
+ * that should be verifiable against the diff.
+ *
+ * Heuristics — conservative to avoid false positives:
+ *   1. Bracketed finding IDs: `F1`, `F2`, `M1`, `H3`, `L5`, or `SEC-001`/`PERF-002`
+ *   2. Task IDs: `wf-XXXXXXXX` that appear as "fixes wf-...", "closes wf-...", etc.
+ *   3. File paths mentioned in fix-context: "fixes `path/to/file.js`"
+ *
+ * Returns the structured claims a diff-consistency check can verify.
+ *
+ * @param {string} commitMessage
+ * @returns {{claims: Array<{kind: 'finding-id'|'task-id'|'file', value: string, raw: string}>}}
+ */
+function parseCommitMessageClaims(commitMessage) {
+  const claims = [];
+  if (typeof commitMessage !== 'string' || commitMessage.trim().length === 0) {
+    return { claims };
+  }
+  // Finding IDs: F1, F2, M1, H3, L5, SEC-001, PERF-002, etc.
+  //   - Single-letter + digits: match on word boundary
+  //   - ALLCAPS-dashnum: SEC-001, PERF-002
+  const findingRe = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;
+  for (const m of commitMessage.matchAll(findingRe)) {
+    claims.push({ kind: 'finding-id', value: m[0], raw: m[0] });
+  }
+  // Task IDs (wf-XXXXXXXX) — only count if preceded by fix/close/resolve verb
+  const taskRe = /\b(?:fix(?:es|ed)?|clos(?:es|ed)?|resolv(?:es|ed)?|address(?:es|ed)?)\s+(wf-[0-9a-f]{8})\b/gi;
+  for (const m of commitMessage.matchAll(taskRe)) {
+    claims.push({ kind: 'task-id', value: m[1], raw: m[0] });
+  }
+  // File paths in backticks after fix/address verbs: `fixes \`path/to/file.js\``
+  const fileRe = /(?:fix(?:es|ed)?|address(?:es|ed)?|updat(?:es|ed)?)\s+`([^`\n]{3,120})`/gi;
+  for (const m of commitMessage.matchAll(fileRe)) {
+    // Only count values that look like file paths (have an extension or a slash)
+    const val = m[1];
+    if (/[./]/.test(val) && !val.includes(' ')) {
+      claims.push({ kind: 'file', value: val, raw: m[0] });
+    }
+  }
+  // Dedup
+  const seen = new Set();
+  return {
+    claims: claims.filter(c => {
+      const k = `${c.kind}::${c.value.toLowerCase()}`;
+      if (seen.has(k)) return false;
+      seen.add(k);
+      return true;
+    })
+  };
+}
+/**
+ * Check commit message claims against the staged diff. Each claim must appear
+ * somewhere in the diff (a file path in the changed-files list OR the token
+ * appearing as-is in the diff body).
+ *
+ * @param {string} commitMessage
+ * @param {Object} [opts]
+ * @param {string} [opts.diffText] — raw `git diff --staged` output
+ * @param {string[]} [opts.changedFiles] — staged file list (alternative input)
+ * @returns {{ok: boolean, totalClaims: number, missingClaims: Array, verifiedClaims: Array}}
+ */
+function verifyCommitMessageAgainstDiff(commitMessage, opts = {}) {
+  const { claims } = parseCommitMessageClaims(commitMessage);
+  if (claims.length === 0) return { ok: true, totalClaims: 0, missingClaims: [], verifiedClaims: [] };
+  const diffText = typeof opts.diffText === 'string' ? opts.diffText : '';
+  const changedFiles = Array.isArray(opts.changedFiles) ? opts.changedFiles : [];
+  const haystack = [diffText, ...changedFiles].join('\n');
+  const missingClaims = [];
+  const verifiedClaims = [];
+  for (const claim of claims) {
+    let found = false;
+    if (claim.kind === 'file') {
+      // File claims verify by exact path match (or suffix) in changed-files list
+      found = changedFiles.some(f => f === claim.value || f.endsWith('/' + claim.value) || f.endsWith(claim.value));
+      if (!found) found = diffText.includes(claim.value);
+    } else {
+      // finding-id + task-id: plain substring search in the haystack
+      found = haystack.includes(claim.value);
+    }
+    (found ? verifiedClaims : missingClaims).push(claim);
+  }
+  return {
+    ok: missingClaims.length === 0,
+    totalClaims: claims.length,
+    missingClaims,
+    verifiedClaims
+  };
+}
+/**
+ * Human-readable message when claims are missing from the diff.
+ *
+ * @param {Object} result — from verifyCommitMessageAgainstDiff
+ * @returns {string|null}
+ */
+function formatMissingClaimsMessage(result) {
+  if (!result || result.ok || !Array.isArray(result.missingClaims) || result.missingClaims.length === 0) {
+    return null;
+  }
+  const lines = [
+    `Commit message claims ${result.missingClaims.length} item(s) that do not appear in the staged diff:`
+  ];
+  for (const c of result.missingClaims) {
+    lines.push(`  • ${c.kind === 'finding-id' ? 'Finding' : c.kind === 'task-id' ? 'Task' : 'File'} "${c.value}" — not found`);
+  }
+  lines.push('');
+  lines.push('Options:');
+  lines.push('  1. Add the missing fix to the commit now (git add + amend)');
+  lines.push('  2. Remove the unverified claim from the commit message');
+  lines.push('  3. Acknowledge + proceed (use --force-commit-claims if blocking from a gate)');
+  return lines.join('\n');
+}
 // ============================================================
 // Exports
 // ============================================================
@@ -627,6 +754,9 @@ module.exports = {
   isTruthGateDisabled,
   getMinTierForDone,
   scanForClaimContradictions,
+  parseCommitMessageClaims,
+  verifyCommitMessageAgainstDiff,
+  formatMissingClaimsMessage,
   TIER_NAMES,
   DONE_WORDS,
   DISAGREEMENT_WORDS,
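
The claim check added above can be tried in isolation. The following is a minimal sketch, not the packaged implementation: it copies the finding-ID and task-ID patterns from the diff, omits the file-path pattern and the dedup step, and uses hypothetical helper names (`extractClaims`, `missingFromDiff`) and made-up sample inputs.

```javascript
// Patterns copied from the diff above; everything else here is illustrative.
const findingRe = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;
const taskRe = /\b(?:fix(?:es|ed)?|clos(?:es|ed)?|resolv(?:es|ed)?|address(?:es|ed)?)\s+(wf-[0-9a-f]{8})\b/gi;

// Hypothetical helper: collect finding-ID and task-ID claims from a message.
function extractClaims(message) {
  const claims = [];
  for (const m of message.matchAll(findingRe)) {
    claims.push({ kind: 'finding-id', value: m[0] });
  }
  for (const m of message.matchAll(taskRe)) {
    claims.push({ kind: 'task-id', value: m[1] });
  }
  return claims;
}

// Hypothetical helper: a claim counts as verified when its token appears
// anywhere in the diff text (plain substring search, as in the diff above).
function missingFromDiff(message, diffText) {
  return extractClaims(message).filter(c => !diffText.includes(c.value));
}

// Sample inputs are invented for illustration.
const message = 'fixes wf-1a2b3c4d and addresses SEC-001, M2';
const diffText = '+ // SEC-001: validate workspace root\n+ // wf-1a2b3c4d';
const missing = missingFromDiff(message, diffText);
console.log(missing.map(c => c.value));
```

One consequence of the verb patterns as written: `clos(?:es|ed)?` and `resolv(?:es|ed)?` match "closes"/"closed" and "resolves"/"resolved" but not the bare forms "close" or "resolve", so a message such as `close wf-1a2b3c4d` would yield no task-id claim.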

package/scripts/flow-config-defaults.js CHANGED

@@ -818,6 +818,31 @@ const CONFIG_DEFAULTS = {
   // --- Gate Confidence ---
   gateConfidence: { enabled: false },
+  // --- Intent-Grounded Reasoning (IGR) ---
+  // Master flag for the IGR pipeline: Intent Framing (Step 1.15), Architect
+  // Pass (Step 1.55), Logic Adversary (Step 1.57), Scope-Confidence Audit
+  // (Step 1.45), Completion Truth Gate (Step 3.9). Default-on so new projects
+  // inherit the full reasoning pipeline. See .claude/docs/intent-grounded-reasoning.md.
+  intentGroundedReasoning: {
+    enabled: true,
+    _comment: 'IGR pipeline: architect + logic adversary + truth gate. See .claude/docs/intent-grounded-reasoning.md'
+  },
+  // --- Research Reasoning Gate ---
+  // Tiered classification for conversation-mode questions. Tier 1 = factual,
+  // direct answer. Tier 2 = domain/recommendation, surface assumptions and
+  // wait for user confirmation. Tier 3 = architecture, tier 2 flow + spawn
+  // cross-model adversary. See wogi-start.md § Research Reasoning Gate.
+  researchReasoningGate: {
+    enabled: true,
+    tier2: { enabled: true },
+    tier3: {
+      enabled: true,
+      adversaryModel: 'sonnet',
+      _comment_adversaryModel: 'Model used for Tier-3 cross-model adversary. Reused by /wogi-peer-review, /wogi-debug-hypothesis, /wogi-learn, /wogi-decide — single canonical key.'
+    }
+  },
   // --- Long Input Gate ---
   longInputGate: {
     enabled: true,
@@ -961,7 +986,8 @@ const CONFIG_DEFAULTS = {
         configChange: { enabled: false },
         setup: { enabled: true, autoOnboard: false, maintenanceTasks: ['healthCheck', 'cleanupLocks'] },
         sessionCleanup: { enabled: true },
-        phaseGate: { enabled: true }
+        phaseGate: { enabled: true },
+        researchEvidenceGate: { enabled: true, minEvidence: 2 }
       }
     },
     claudeCode: { installPath: '.claude/settings.local.json' }
@@ -1041,6 +1067,14 @@ const CONFIG_DEFAULTS = {
   // --- Decisions ---
   decisions: { amendmentTracking: { enabled: false } },
+  // --- Session End ---
+  // autoPushInWorker: when /wogi-session-end runs inside a workspace worker
+  // (WOGI_WORKSPACE_ROOT set, WOGI_REPO_NAME !== 'manager'), the worker must
+  // not prompt the user directly. true (default) → auto-push silently.
+  // false → skip push entirely; user pushes manually. Non-worker sessions
+  // are unaffected and still see the interactive prompt.
+  sessionEnd: { autoPushInWorker: true },
   // --- Community ---
   community: {
     enabled: false,
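
The new defaults are read elsewhere in this release through dotted paths such as `sessionEnd.autoPushInWorker`. The sketch below shows how such a lookup with a fallback might behave. It is an assumption for illustration only: the real `getConfigValue` is not shown in this diff, `lookup` is a hypothetical name, and the override object (including the `'opus'` value) is invented.

```javascript
// Hypothetical stand-in for a dotted-path config read with a default.
function lookup(config, dottedPath, fallback) {
  let node = config;
  for (const key of dottedPath.split('.')) {
    const isObject = node !== null && typeof node === 'object';
    if (!isObject || !Object.prototype.hasOwnProperty.call(node, key)) {
      return fallback; // path is absent: use the caller's default
    }
    node = node[key];
  }
  return node === undefined ? fallback : node;
}

// Invented project-level overrides, shaped like the defaults above.
const overrides = {
  sessionEnd: { autoPushInWorker: false },
  researchReasoningGate: { tier3: { adversaryModel: 'opus' } }
};

const autoPush = lookup(overrides, 'sessionEnd.autoPushInWorker', true);
const adversary = lookup(overrides, 'researchReasoningGate.tier3.adversaryModel', 'sonnet');
const minEvidence = lookup(overrides, 'hooks.rules.researchEvidenceGate.minEvidence', 2);
console.log({ autoPush, adversary, minEvidence });
```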

package/scripts/flow-extraction-review.js CHANGED

@@ -104,8 +104,13 @@ function loadReviewSession() {
       return null;
     }
-    // Check for prototype pollution keys
-    if ('__proto__' in parsed || 'constructor' in parsed || 'prototype' in parsed) {
+    // Check for prototype pollution keys. Use Object.prototype.hasOwnProperty
+    // rather than `key in parsed` — the latter also returns true for inherited
+    // properties, and EVERY plain object inherits `constructor` from
+    // Object.prototype, which made this guard falsely trip on every valid
+    // session file (pre-existing bug, found via v2.25.1 wave2 test).
+    const hasOwn = Object.prototype.hasOwnProperty;
+    if (hasOwn.call(parsed, '__proto__') || hasOwn.call(parsed, 'constructor') || hasOwn.call(parsed, 'prototype')) {
       console.error('Review session file contains unsafe keys');
       return null;
     }
@@ -414,11 +419,21 @@ function exportAsItemManifest() {
   // Coordinate with Intent Bootstrap (see flow-story-gates.coordinateIntentBootstrap)
   // so /wogi-start doesn't re-prompt if the user already scheduled bootstrap via
   // /wogi-story during this session.
+  //
+  // v2.25.1: Semantics corrected (nit from Waves 1-3 review). The flag
+  // represents "is IGR bootstrap active/scheduled for this session?", NOT
+  // "did THIS call schedule it?". `result.active` is true when IGR is enabled
+  // and bootstrap has been scheduled — whether by this call or a prior one.
   let intentBootstrapScheduled = false;
   try {
     const gates = require('./flow-story-gates');
     const result = gates.coordinateIntentBootstrap();
-    intentBootstrapScheduled = !!(result && result.scheduled);
+    if (result && result.active) {
+      // Scheduled in this call OR already-scheduled from a prior call = active
+      intentBootstrapScheduled = result.scheduled === true ||
+                                 result.reason === 'already-scheduled' ||
+                                 result.reason === 'artifacts-exist';
+    }
   } catch (_err) { /* non-critical */ }
   return {
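
The `hasOwnProperty` change in `loadReviewSession()` above rests on a difference between the `in` operator and an own-property check. A short sketch with invented session payloads makes the difference visible:

```javascript
const hasOwn = Object.prototype.hasOwnProperty;

// A perfectly ordinary parsed session file (payload invented for illustration).
const parsed = JSON.parse('{"items": []}');

// `in` walks the prototype chain, so the inherited Object.prototype.constructor
// makes this true for every plain object.
const viaIn = 'constructor' in parsed;

// An own-property check only looks at keys actually present in the JSON.
const viaHasOwn = hasOwn.call(parsed, 'constructor');

// A hostile payload: JSON.parse stores "__proto__" as an own data property,
// so the own-property check still flags it.
const hostile = JSON.parse('{"__proto__": {"polluted": true}}');
const hostileFlagged = hasOwn.call(hostile, '__proto__');

console.log({ viaIn, viaHasOwn, hostileFlagged });
```

With the old `in`-based guard, the first check alone would have rejected every valid session file, which matches the bug described in the diff comment.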

package/scripts/flow-morning.js CHANGED

@@ -385,22 +385,31 @@ function collectBriefingData() {
   // v2.23.0 — Workspace dispatch surfacing (manager mode only).
   // If the user is working inside a workspace manager session, surface any
   // overdue or restart-gap-lost dispatches so the morning briefing catches
-  // what the last manager turn would have caught. Fail-open.
+  // what the last manager turn would have caught. Fail-open; DEBUG-logged.
   try {
     if (process.env.WOGI_WORKSPACE_ROOT) {
       const { buildOverdueContext } = require('./hooks/core/overdue-dispatches');
       const ctx = buildOverdueContext();
       if (ctx) briefing.workspaceOverdue = ctx;
     }
-  } catch (_err) { /* non-critical */ }
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[morning] Workspace overdue check failed (fail-open): ${err.message}`);
+    }
+  }
   // v2.23.0 — Completion-claim honesty scan.
   // Catches done-word-in-notes-while-status-partial and similar
   // contradictions across ready.json (uses the honesty-infra from 2026-04-16).
+  // Fail-open; DEBUG-logged.
   try {
     const { checkCompletionClaimHonesty } = require('./flow-health');
     briefing.honestyHits = checkCompletionClaimHonesty();
-  } catch (_err) { /* non-critical */ }
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[morning] Honesty scan failed (fail-open): ${err.message}`);
+    }
+  }
   // Generate suggested prompt if enabled
   if (morningConfig.generatePrompt !== false) {

package/scripts/flow-session-end.js CHANGED

@@ -41,6 +41,17 @@ const { getReadyData, saveReadyData } = require('./flow-utils');
 // v2.6.1: Use centralized state cleanup module
 const { cleanupStaleState } = require('./flow-state-cleanup');
+// Workspace worker detection — used by offerPush() to branch between
+// interactive prompt (single-repo) and silent auto-push (worker mode).
+// Loaded eagerly; if module is missing on a broken install, fail loud
+// rather than masking the issue via a lazy require deep in offerPush().
+let isWorker;
+try {
+  isWorker = require('../lib/workspace-worker-ready').isWorker;
+} catch (_err) {
+  isWorker = () => false;
+}
 // v1.8.0 automatic memory management
 let memoryDb = null;
 try {
@@ -448,22 +459,63 @@ function analyzeCrossSessionPatterns() {
 }
 /**
- * Offer to push to remote
+ * Offer to push to remote.
+ *
+ * In workspace worker mode (WOGI_WORKSPACE_ROOT + WOGI_REPO_NAME !== 'manager'),
+ * workers must not prompt the user directly — that violates the Workspace Worker
+ * Cannot Prompt User Directly contract (decisions.md v2.20.1+). Instead:
+ *   - config.sessionEnd.autoPushInWorker === true  (default) → push silently
+ *   - config.sessionEnd.autoPushInWorker === false          → skip push silently
+ * In single-repo mode (the common case), the interactive prompt is unchanged.
  */
 async function offerPush() {
   if (!isGitRepo()) return;
   try {
     execSync('git remote get-url origin', { stdio: 'pipe' });
+  } catch (_err) {
+    return;
+  }
-    const confirm = await prompt('Push to remote? (y/N) ');
+  // Worker-mode detection + secondary validation (SEC-001). isWorker() checks
+  // WOGI_WORKSPACE_ROOT + WOGI_REPO_NAME env vars. For defense in depth,
+  // confirm that `.workspace/` exists inside WOGI_WORKSPACE_ROOT before
+  // treating env-var signals as authoritative — this narrows the window in
+  // which a misconfigured single-repo session could be misidentified as a
+  // worker and auto-push without confirmation.
+  if (isWorker() && isValidWorkspaceRoot()) {
+    const autoPush = getConfigValue('sessionEnd.autoPushInWorker', true);
+    if (!autoPush) {
+      console.log(color('dim', '⊘ Push skipped (worker mode — autoPushInWorker: false)'));
+      return;
+    }
+    try {
+      execSync('git push', { stdio: 'inherit' });
+      success('Auto-pushed (worker mode)');
+    } catch (err) {
+      warn(`Auto-push failed: ${err.message}`);
+    }
+    return;
+  }
+  try {
+    const confirm = await prompt('Push to remote? (y/N) ');
     if (confirm.toLowerCase() === 'y') {
       execSync('git push', { stdio: 'inherit' });
       success('Pushed to remote');
     }
   } catch (_err) {
-    // No remote configured, skip
+    // Prompt or push failed, skip
+  }
+}
+function isValidWorkspaceRoot() {
+  const root = process.env.WOGI_WORKSPACE_ROOT;
+  if (!root || !path.isAbsolute(root)) return false;
+  try {
+    return fs.existsSync(path.join(root, '.workspace'));
+  } catch (_err) {
+    return false;
   }
 }
@@ -596,9 +648,18 @@ function writeWorkspaceSessionEndMessage() {
   const workspaceRoot = process.env.WOGI_WORKSPACE_ROOT;
   if (!workspaceRoot) return;
   const repo = process.env.WOGI_REPO_NAME;
-  // Only manager-mode sessions emit this signal. Workers use their own
-  // Stop-hook worker-stopped message (see lib/workspace-messages.js).
-  if (repo && repo !== 'manager') return;
+  // Only emit this signal from EXPLICIT manager-mode sessions.
+  // v2.25.1 (M2 from Waves 1-3 review): tightened to require
+  // WOGI_REPO_NAME === 'manager' explicitly. Previously we let
+  // unset-repo sessions fall through, which could emit a spurious
+  // "manager session ended" broadcast from a mis-env'd worker shell.
+  // Workers use their own Stop-hook worker-stopped message.
+  if (repo !== 'manager') {
+    if (repo && process.env.DEBUG) {
+      console.error(`[session-end] Skipping workspace message — WOGI_REPO_NAME is '${repo}', not 'manager'`);
+    }
+    return;
+  }
   try {
     const messagesLib = path.resolve(__dirname, '..', 'lib', 'workspace-messages.js');
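
The branching that `offerPush()` gains in this file can be summarized as a pure decision function. This is a sketch under stated assumptions: `decidePushAction` and its parameter names are hypothetical, and the real function performs the git calls and the prompt itself rather than returning a label.

```javascript
// Hypothetical summary of the offerPush() branches shown in the diff above.
function decidePushAction({ worker, validWorkspaceRoot, autoPushInWorker = true }) {
  // Worker mode is only trusted when the env-var signal AND the
  // `.workspace/` directory check both agree.
  if (worker && validWorkspaceRoot) {
    return autoPushInWorker ? 'auto-push' : 'skip';
  }
  // Everything else, including a worker signal without a valid root,
  // falls back to the interactive prompt.
  return 'prompt';
}

const workerDefault = decidePushAction({ worker: true, validWorkspaceRoot: true });
const workerOptOut = decidePushAction({ worker: true, validWorkspaceRoot: true, autoPushInWorker: false });
const misconfigured = decidePushAction({ worker: true, validWorkspaceRoot: false });
const singleRepo = decidePushAction({ worker: false, validWorkspaceRoot: false });
console.log({ workerDefault, workerOptOut, misconfigured, singleRepo });
```

Note that the `isWorker` shim in the diff falls back to `() => false` when `../lib/workspace-worker-ready` cannot be loaded, so a broken install degrades to the interactive prompt rather than failing loudly.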

package/scripts/hooks/core/phase-gate.js CHANGED

@@ -164,6 +164,28 @@ function transitionPhase(from, to, taskId) {
     return false;
   }
+  // Research-evidence gate: transitions into proposal phases (spec_review,
+  // coding) require minimum evidence fingerprint. Fail-open if gate module
+  // is absent. Prints the block message to stderr so flow-phase.js CLI
+  // surfaces it to the AI invoking the transition.
+  if (to === 'spec_review' || to === 'coding') {
+    try {
+      const { checkPhaseTransitionEvidence } = require('./research-evidence-gate');
+      let cfg = null;
+      try {
+        const { getConfig } = require('../../flow-utils');
+        cfg = getConfig();
+      } catch (_err) { /* fail-open on config error */ }
+      const result = checkPhaseTransitionEvidence(from, to, cfg);
+      if (result.blocked) {
+        console.error(result.message);
+        return false;
+      }
+    } catch (_err) {
+      // Gate not installed — fail-open
+    }
+  }
   return writePhaseState({
     phase: to,
     taskId: taskId || current.taskId,

package/scripts/hooks/core/post-compact.js CHANGED

@@ -167,6 +167,21 @@ function handlePostCompact() {
     }
   }
+  // Clear research-evidence fingerprint after compaction — the AI has fresh
+  // context, so claims of "already read X" in the previous context no longer
+  // apply. The gate must force re-reading in the new context.
+  try {
+    const { clearResearchEvidence } = require('./research-evidence-gate');
+    clearResearchEvidence();
+    if (process.env.DEBUG) {
+      console.error('[post-compact] Research-evidence cleared');
+    }
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[post-compact] Research-evidence clear failed: ${err.message}`);
+    }
+  }
   // 3. Re-set routing-pending flag
   // After compaction, the AI has fresh context and may try to act without routing.
   // Setting routing-pending ensures the next tool use goes through routing checks.

package/scripts/hooks/core/pre-tool-orchestrator.js CHANGED

@@ -80,6 +80,9 @@ function runPreToolGates(ctx, deps) {
   // Phase-read recording (side effect)
   if (toolName === 'Read' && filePath) {
     try { deps.recordPhaseRead(filePath); } catch (_err) { /* fail-open */ }
+    if (deps.recordEvidenceRead) {
+      try { deps.recordEvidenceRead(filePath); } catch (_err) { /* fail-open */ }
+    }
   }
   // Phase gate
@@ -114,6 +117,24 @@ function runPreToolGates(ctx, deps) {
     }
   }
+  // Research-evidence gate (spec-write): blocks Edit/Write to proposal paths
+  // when the AI has not read enough evidence files this task turn.
+  if ((toolName === 'Edit' || toolName === 'Write') && deps.checkSpecWriteGate) {
+    try {
+      const specResult = deps.checkSpecWriteGate(filePath, config);
+      if (specResult.blocked) {
+        return {
+          allowed: false,
+          blocked: true,
+          reason: 'Research-evidence gate: insufficient research before proposal',
+          message: specResult.message,
+        };
+      }
+    } catch (_err) {
+      if (process.env.DEBUG) console.error(`[Hook] Research-evidence gate error (fail-open): ${_err.message}`);
+    }
+  }
   // Scope gate (Edit/Write only)
   if (toolName === 'Edit' || toolName === 'Write') {
     coreResult = deps.checkScopeGate({ filePath, operation: toolName.toLowerCase() }, config);
@@ -133,6 +154,9 @@ function runPreToolGates(ctx, deps) {
     if (typeof skillName === 'string' && /^wogi-(bulk|start)$/i.test(skillName)) {
       deps.markSkillPending(skillName.toLowerCase(), { args: toolInput.args });
       try { deps.clearPhaseReads(); } catch (_err) { /* fail-open */ }
+      if (deps.clearResearchEvidence) {
+        try { deps.clearResearchEvidence(); } catch (_err) { /* fail-open */ }
+      }
       if (process.env.DEBUG) {
         console.error(`[Hook] Marked skill ${skillName} as pending (via Skill tool)`);
       }

package/scripts/hooks/core/research-evidence-gate.js ADDED

@@ -0,0 +1,280 @@
+#!/usr/bin/env node
+/**
+ * Wogi Flow - Research-Evidence Gate (Core Module)
+ *
+ * Enforces the "Research Before Propose" methodology rule by tracking which
+ * state/spec/epic files the AI has Read in the current task turn, and blocking
+ * proposal actions (spec writes, channel-dispatch to workers) until a minimum
+ * evidence threshold has been reached.
+ *
+ * Does NOT block AskUserQuestion, plain Read/Glob/Grep/WebSearch, conversational
+ * text, or non-proposal edits. Asking the user is a valid escape hatch.
+ *
+ * State file: .workflow/state/research-evidence.json
+ * Fail-open: If state file is missing/corrupt or config disabled, allow the tool call.
+ *
+ * Three entry points:
+ *   recordEvidenceRead(filePath)         — called when Read targets an evidence file
+ *   checkSpecWriteGate(filePath, config) — called before Edit/Write to proposal paths
+ *   checkDispatchEvidenceGate(config)    — called before manager channel-dispatch
+ *   clearResearchEvidence()              — called on new task start / session end / post-compact
+ */
+const path = require('node:path');
+const fs = require('node:fs');
+const { PATHS, safeJsonParse } = require('../../flow-utils');
+const EVIDENCE_FILE = path.join(PATHS.state, 'research-evidence.json');
+// Relative-to-project path prefixes that count as evidence when Read.
+// Any file whose project-relative path starts with one of these prefixes
+// increments the evidence counter.
+const EVIDENCE_PREFIXES = [
+  '.workflow/state/',
+  '.workflow/changes/',
+  '.workflow/specs/',
+  '.workflow/epics/'
+];
+// Path prefixes that trigger the spec-write gate when targeted by Edit/Write.
+// Writing to these paths = "proposing a spec" = must have evidence first.
+const PROPOSAL_PREFIXES = [
+  '.workflow/changes/',
+  '.workflow/specs/',
+  '.workflow/epics/'
+];
+// Default threshold: minimum number of distinct evidence-file reads required
+// before a proposal action is allowed. Can be overridden by config.
+const DEFAULT_MIN_EVIDENCE = 2;
+function toProjectRelative(filePath) {
+  try {
+    // Canonicalize both sides via realpath to prevent symlink escape
+    // (SEC-003): a symlink in PATHS.root or the input path could make a
+    // file outside the project appear inside after a plain path.relative.
+    let rootCanon = PATHS.root;
+    let targetCanon = path.resolve(filePath);
+    try { rootCanon = fs.realpathSync(PATHS.root); } catch (_err) { /* root may not exist mid-test */ }
+    try { targetCanon = fs.realpathSync(targetCanon); } catch (_err) { /* target may not exist yet */ }
+    const rel = path.relative(rootCanon, targetCanon);
+    return rel.split(path.sep).join('/');
+  } catch (_err) {
+    return null;
+  }
+}
+function matchesPrefix(relPath, prefixes) {
+  if (!relPath || relPath.startsWith('..')) return false;
+  return prefixes.some(p => relPath.startsWith(p));
+}
+/**
+ * Record that an evidence file was read. Called from PreToolUse on Read.
+ * De-duplicates: reading the same file twice still counts as 1.
+ *
+ * Write strategy (CL-001): atomic temp-file + rename. A concurrent tool call
+ * that loses the read-modify-write race will at worst lose one evidence entry,
+ * which causes a false-block that the user resolves by reading one more file.
+ * The atomic rename prevents partial writes on crash.
+ */
+function recordEvidenceRead(filePath) {
+  if (!filePath || typeof filePath !== 'string') return;
+  const rel = toProjectRelative(filePath);
+  if (!matchesPrefix(rel, EVIDENCE_PREFIXES)) return;
+  try {
+    const existing = safeJsonParse(EVIDENCE_FILE, {});
+    if (!existing.reads || typeof existing.reads !== 'object') existing.reads = {};
+    if (!existing.reads[rel]) {
+      existing.reads[rel] = { at: new Date().toISOString() };
+      const tmp = `${EVIDENCE_FILE}.${process.pid}.tmp`;
+      fs.writeFileSync(tmp, JSON.stringify(existing, null, 2));
+      fs.renameSync(tmp, EVIDENCE_FILE);
+      if (process.env.DEBUG) {
+        console.error(`[ResearchEvidenceGate] Recorded evidence read: ${rel}`);
+      }
+    }
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[ResearchEvidenceGate] Failed to record read: ${err.message}`);
+    }
+  }
+}
+function getEvidenceCount() {
+  try {
+    const data = safeJsonParse(EVIDENCE_FILE, {});
+    if (!data.reads || typeof data.reads !== 'object') return 0;
+    return Object.keys(data.reads).length;
+  } catch (_err) {
+    return 0;
+  }
+}
+function isGateEnabled(config) {
+  const gateCfg = config?.hooks?.rules?.researchEvidenceGate;
+  if (gateCfg === undefined || gateCfg === null) return true;
+  if (gateCfg === false) return false;
+  if (typeof gateCfg === 'object' && gateCfg.enabled === false) return false;
+  return true;
+}
+function getMinEvidence(config) {
+  const v = config?.hooks?.rules?.researchEvidenceGate?.minEvidence;
+  if (typeof v === 'number' && v >= 0 && Number.isFinite(v)) return v;
+  return DEFAULT_MIN_EVIDENCE;
+}
+/**
+ * Block Edit/Write to a proposal path when evidence fingerprint is below threshold.
+ * Called from pre-tool-orchestrator before Edit/Write runs.
+ *
+ * @param {string} filePath - Path being written/edited
+ * @param {Object} config
+ * @returns {{ blocked: boolean, message?: string }}
+ */
+function checkSpecWriteGate(filePath, config) {
+  try {
+    if (!isGateEnabled(config)) return { blocked: false };
+    if (!filePath || typeof filePath !== 'string') return { blocked: false };
+    const rel = toProjectRelative(filePath);
+    if (!matchesPrefix(rel, PROPOSAL_PREFIXES)) return { blocked: false };
+    const minEvidence = getMinEvidence(config);
+    const count = getEvidenceCount();
+    if (count >= minEvidence) return { blocked: false };
+    return {
+      blocked: true,
+      message:
+        `Research-before-propose: this writes a spec/change/epic (${rel}), but you have only ` +
+        `read ${count} evidence file(s) this task turn. Minimum required: ${minEvidence}.\n\n` +
+        `Before proposing, read relevant files from:\n` +
+        `  .workflow/state/decisions.md, feedback-patterns.md, app-map.md, function-map.md, api-map.md\n` +
+        `  the task spec (.workflow/changes/<taskId>.md or .workflow/specs/<id>.md)\n` +
+        `  .workflow/epics/ if this task belongs to an epic\n\n` +
+        `If you genuinely need clarification before proposing, use AskUserQuestion — that is allowed.\n` +
+        `The rule is "don't propose before researching," not "never ask."`
+    };
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[ResearchEvidenceGate] Spec-write gate error (fail-open): ${err.message}`);
+    }
+    return { blocked: false };
+  }
+}
+/**
+ * Block channel-dispatch to a worker when the manager has no evidence of
+ * having researched the target's state. Called from lib/workspace-routing.js
+ * before dispatchToChannel posts.
+ *
+ * @param {Object} config
+ * @returns {{ blocked: boolean, message?: string }}
+ */
+function checkDispatchEvidenceGate(config) {
+  try {
+    if (!isGateEnabled(config)) return { blocked: false };
+    // Manager-mode only: workers don't dispatch. Single-repo sessions
+    // (no WOGI_WORKSPACE_ROOT) are n/a — there are no workers to dispatch
+    // to, so the evidence requirement does not apply (CL-003).
+    const workspaceRoot = process.env.WOGI_WORKSPACE_ROOT;
+    const repo = process.env.WOGI_REPO_NAME;
+    const isManager = !!workspaceRoot && (!repo || repo === 'manager');
+    if (!isManager) return { blocked: false };
+    const minEvidence = getMinEvidence(config);
+    const count = getEvidenceCount();
+    if (count >= minEvidence) return { blocked: false };
+    return {
+      blocked: true,
+      message:
+        `Research-before-dispatch: dispatching to a worker proposes work, but the manager has only ` +
+        `read ${count} evidence file(s) this turn. Minimum required: ${minEvidence}.\n\n` +
+        `Before dispatching, read relevant state from the target member repo:\n` +
+        `  <member-repo>/.workflow/state/decisions.md, app-map.md, feedback-patterns.md\n` +
+        `  <member-repo>/.workflow/changes/ for existing task specs\n\n` +
+        `Silent workers that receive poorly-specified work cost the most to recover. ` +
+        `The Wogi Hub manager incident that prompted this rule dispatched Employee-class ` +
+        `clarifying-question work without reading the existing class system.`
+    };
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[ResearchEvidenceGate] Dispatch gate error (fail-open): ${err.message}`);
+    }
+    return { blocked: false };
+  }
+}
+/**
+ * Block phase transitions into proposal phases (spec_review, coding) when
+ * the AI has not read enough evidence files this task turn. Called from
+ * phase-gate.js transitionPhase() when the target is a proposal phase.
+ *
+ * @param {string} from
+ * @param {string} to
+ * @param {Object} config
+ * @returns {{ blocked: boolean, message?: string }}
+ */
+function checkPhaseTransitionEvidence(from, to, config) {
+  try {
+    if (!isGateEnabled(config)) return { blocked: false };
+    if (to !== 'spec_review' && to !== 'coding') return { blocked: false };
+    const minEvidence = getMinEvidence(config);
+    const count = getEvidenceCount();
+    if (count >= minEvidence) return { blocked: false };
+    return {
+      blocked: true,
+      message:
+        `Research-before-propose: transitioning to "${to}" requires ${minEvidence} evidence ` +
+        `file read(s) this task turn; you have ${count}.\n\n` +
+        `Before transitioning to a proposal phase, read relevant files from:\n` +
+        `  .workflow/state/decisions.md, feedback-patterns.md, app-map.md, function-map.md\n` +
+        `  the task spec (.workflow/changes/<taskId>.md or .workflow/specs/<id>.md)\n` +
+        `  .workflow/epics/ if this task belongs to an epic\n\n` +
+        `AskUserQuestion is not blocked — ask if clarification is genuinely needed.`
+    };
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[ResearchEvidenceGate] Phase-transition gate error (fail-open): ${err.message}`);
+    }
+    return { blocked: false };
+  }
+}
+/**
+ * Clear evidence state. Called at:
+ *   - New task start (pre-tool-use.js Skill hook for wogi-start)
+ *   - Session end
+ *   - Post-compact (forces re-read in new context)
+ */
+function clearResearchEvidence() {
+  try {
+    fs.writeFileSync(EVIDENCE_FILE, JSON.stringify({ reads: {} }, null, 2));
+  } catch (err) {
+    if (process.env.DEBUG) {
+      console.error(`[ResearchEvidenceGate] Failed to clear evidence: ${err.message}`);
+    }
+  }
+}
+module.exports = {
+  recordEvidenceRead,
+  checkSpecWriteGate,
+  checkDispatchEvidenceGate,
+  checkPhaseTransitionEvidence,
+  clearResearchEvidence,
+  getEvidenceCount,
+  EVIDENCE_FILE,
+  EVIDENCE_PREFIXES,
+  PROPOSAL_PREFIXES,
+  DEFAULT_MIN_EVIDENCE
+};
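
The core rule of the module above (count distinct evidence reads, block proposal writes below a threshold) can be exercised without touching the filesystem. The sketch below is an in-memory approximation, not the shipped module: `createGate` is a hypothetical name, state lives in a `Set` instead of `research-evidence.json`, and paths are assumed to be project-relative already.

```javascript
// Prefix lists copied from the diff above.
const EVIDENCE_PREFIXES = ['.workflow/state/', '.workflow/changes/', '.workflow/specs/', '.workflow/epics/'];
const PROPOSAL_PREFIXES = ['.workflow/changes/', '.workflow/specs/', '.workflow/epics/'];

const matches = (rel, prefixes) =>
  !rel.startsWith('..') && prefixes.some(p => rel.startsWith(p));

// Hypothetical in-memory gate; the real module persists reads to a state file.
function createGate(minEvidence = 2) {
  const reads = new Set(); // a Set de-duplicates repeat reads of one file
  return {
    recordRead(rel) {
      if (matches(rel, EVIDENCE_PREFIXES)) reads.add(rel);
    },
    checkWrite(rel) {
      if (!matches(rel, PROPOSAL_PREFIXES)) return { blocked: false };
      return { blocked: reads.size < minEvidence };
    },
    clear() {
      reads.clear();
    }
  };
}

const gate = createGate();
gate.recordRead('.workflow/state/decisions.md');
gate.recordRead('.workflow/state/decisions.md'); // duplicate, still one read
gate.recordRead('src/index.js');                 // not an evidence path
const first = gate.checkWrite('.workflow/specs/feature.md');
gate.recordRead('.workflow/state/app-map.md');
const second = gate.checkWrite('.workflow/specs/feature.md');
const unrelated = gate.checkWrite('src/index.js');
gate.clear();
const afterClear = gate.checkWrite('.workflow/specs/feature.md');
console.log({ first, second, unrelated, afterClear });
```

The final check mirrors what happens after compaction or a new `/wogi-start`: the evidence record is cleared, so proposal writes are blocked again until files are re-read.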

package/scripts/hooks/core/session-end.js CHANGED

@@ -110,6 +110,13 @@ function handleSessionEnd(input) {
       // Non-critical — phase-read gate may not be installed
     }
+    try {
+      const { clearResearchEvidence } = require('./research-evidence-gate');
+      clearResearchEvidence();
+    } catch (_err) {
+      // Non-critical — research-evidence gate may not be installed
+    }
     // State folder hygiene — clean stale/orphan files (fire-and-forget)
     try {
       const hygiene = cleanStaleFiles();

package/scripts/hooks/entry/claude-code/pre-tool-use.js CHANGED

@@ -35,6 +35,20 @@ try {
   clearPhaseReads = prg.clearPhaseReads;
 } catch (_err) { if (process.env.DEBUG) console.error(`[Hook] Phase-read gate not loaded: ${_err.message}`); }
+let recordEvidenceRead = () => {}, checkSpecWriteGate = () => ({ blocked: false }), clearResearchEvidence = () => {};
+try {
+  const reg = require('../../core/research-evidence-gate');
+  recordEvidenceRead = reg.recordEvidenceRead;
+  checkSpecWriteGate = reg.checkSpecWriteGate;
+  clearResearchEvidence = reg.clearResearchEvidence;
+} catch (err) {
+  // CL-004: load failure for a gate file that SHOULD be present is a
+  // deployment issue worth surfacing even without DEBUG set. Silently
+  // shimming masks broken installs. Preserve fail-open (shims above)
+  // so the hook pipeline still works, but log to stderr so operators see it.
+  console.error(`[Hook] WARNING: Research-evidence gate failed to load — gate is disabled. ${err.message}`);
+}
 const _noop = () => ({ allowed: true, blocked: false });
 let checkDeployGate = _noop, checkWriteBlock = _noop;
 try { const dg = require('../../core/deploy-gate'); checkDeployGate = dg.checkDeployGate; checkWriteBlock = dg.checkWriteBlock; } catch (_err) { if (process.env.DEBUG) console.error(`[Hook] Deploy gate not loaded: ${_err.message}`); }
@@ -84,6 +98,7 @@ runHook('PreToolUse', async ({ input, parsedInput }) => {
     checkRoutingGate, clearRoutingPending, hasActiveTask,
     checkPhaseGate, checkCommitLogGate,
     recordPhaseRead, checkPhaseReadGate, clearPhaseReads,
+    recordEvidenceRead, checkSpecWriteGate, clearResearchEvidence,
     checkDeployGate, checkWriteBlock,
     checkStrikeGate, checkBugfixScope, checkScopeMutation,
     checkGitSafety, checkManagerBoundary, checkWorkerBoundary,

package/.workflow/state/bugfix-scope.json.template DELETED

@@ -1,7 +0,0 @@
-{
-  "taskId": null,
-  "uniqueFiles": [],
-  "thresholdReached": false,
-  "scopeInventory": null,
-  "warnedAt": null
-}

package/.workflow/state/deploy-history.json.template DELETED

@@ -1,3 +0,0 @@
-{
-  "deploys": []
-}

package/.workflow/state/deploy-routes.json.template DELETED

@@ -1,4 +0,0 @@
-{
-  "routes": [],
-  "lastUpdated": null
-}

package/.workflow/state/strike-tracker.json.template DELETED

@@ -1,3 +0,0 @@
-{
-  "tasks": {}
-}