wogiflow 2.25.0 → 2.26.0

@@ -176,7 +176,7 @@ After all agents complete, display the consolidated results:
176
176
 
177
177
  ### Step 4: Hypothesis Adversary (v2.23.0+ — MANDATORY unless `--no-adversary`)
178
178
 
179
- After consolidation, spawn a single Agent (different `model` param if `config.hybrid.enabled`, else same) with this prompt:
179
+ After consolidation, spawn a single Agent on a DIFFERENT model (default `sonnet` via `config.researchReasoningGate.tier3.adversaryModel` — canonical cross-command adversary key, same as `/wogi-peer-review`, `/wogi-learn`, `/wogi-decide`) with this prompt:
180
180
 
181
181
  ```
182
182
  You are the hypothesis adversary.
@@ -47,9 +47,11 @@ Models are selected once per session and remembered for subsequent runs.
47
47
  ├─────────────────────────────────────────────────────────┤
48
48
  │ 1. Collect code changes (git diff or specified files) │
49
49
  │ 2. Classify change size → effort tier: │
50
- │ L0/L1 (>10 files) → opus-4-7 xhigh
50
+ │ L0/L1 (>10 files) → opus (latest) xhigh
51
51
  │ L2 (3-10 files) → sonnet medium │
52
52
  │ L3 (<3 files) → haiku medium │
53
+ │ (Model IDs resolve from config.models — avoid │
54
+ │ hardcoding model version in this doc.) │
53
55
  │ 3. Generate improvement-focused prompt │
54
56
  │ 4. If includeClaude enabled: │
55
57
  │ - Launch Claude review (Task agent, Explore type) │
@@ -96,7 +98,7 @@ analysis, EACH carrying an explicit evidence tier.
96
98
 
97
99
  ## Synthesis Adversary (v2.23.0+ — MANDATORY unless `--no-adversary`)
98
100
 
99
- After initial synthesis, spawn a single adversary agent on a DIFFERENT model from the synthesizer (default: if synthesizer is Opus, adversary is Sonnet; config via `peerReview.adversaryModel`). Prompt:
101
+ After initial synthesis, spawn a single adversary agent on a DIFFERENT model from the synthesizer (default `sonnet`; override via the canonical `config.researchReasoningGate.tier3.adversaryModel`, the same key used by `/wogi-debug-hypothesis`, `/wogi-learn`, `/wogi-decide`). Prompt:
100
102
 
101
103
  ```
102
104
  You are the synthesis adversary.
@@ -199,7 +201,7 @@ For manual review (no API keys needed): `/wogi-peer-review --manual`
199
201
  | `--verbose` | Show full model responses |
200
202
  | `--create-tasks` | Auto-create tasks for strong agreements |
201
203
  | `--no-adversary` | Skip the v2.23.0 synthesis adversary (not recommended for L0/L1 diffs) |
202
- | `--adversary-model <id>` | Override adversary model (default: cross-model from synthesizer) |
204
+ | `--adversary-model <id>` | Override adversary model (default: `config.researchReasoningGate.tier3.adversaryModel`, usually `sonnet`) |
203
205
  | `--effort <level>` | Override effort tier (low/medium/high/xhigh/max) — otherwise derived from diff size |
204
206
 
205
207
  ARGUMENTS: {args}
@@ -13,7 +13,7 @@ Steps:
13
13
  6. **Completion-claim honesty scan** - Surface done-in-text-but-not-in-status contradictions (2026-04-16 honesty-infra)
14
14
  7. **Workspace session-end message (v2.23.0+)** - If running inside a workspace manager, write a `heads-up` message to `.workspace/messages/` so workers know no new dispatches are coming
15
15
  8. **Commit changes** - Stage and commit all workflow files
16
- 9. **Offer to push** - Ask if should push to remote
16
+ 9. **Offer to push** - Ask whether to push to remote. In workspace worker mode, the prompt is suppressed (workers cannot prompt the user directly): when `config.sessionEnd.autoPushInWorker` is `true` (default), the worker pushes automatically without prompting; when `false`, the push is skipped and the user pushes manually later. Non-worker sessions are unchanged.
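
The step 9 behaviour can be read as a small decision table. The sketch below is illustrative only: `decidePushAction`, its argument shape, and the returned labels are hypothetical names, not part of wogiflow. Only the `sessionEnd.autoPushInWorker` key and its default of `true` come from the step description.

```javascript
// Hypothetical helper: maps session mode + config to the step 9 behaviour.
function decidePushAction({ isWorker, config = {} }) {
  // Non-worker sessions are unchanged: the user is still asked.
  if (!isWorker) return 'prompt-user';
  // Workers never prompt; the documented default for the flag is true.
  const autoPush = config.sessionEnd?.autoPushInWorker ?? true;
  return autoPush ? 'auto-push' : 'skip-push';
}

console.log(decidePushAction({ isWorker: false }));
console.log(decidePushAction({ isWorker: true }));
console.log(decidePushAction({ isWorker: true, config: { sessionEnd: { autoPushInWorker: false } } }));
```

Keeping the worker check first mirrors the wording of the step: the config flag only matters once the session is known to be a worker.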
17
17
 
18
18
  Output:
19
19
  ```
@@ -357,33 +357,45 @@ Each finding is displayed using these fields from `last-review.json`:
357
357
  | Issue | `finding.issue` | "Raw JSON.parse without try-catch" |
358
358
  | Recommendation | `finding.recommendation` | "Use safeJsonParse from flow-utils.js" |
359
359
 
360
- ## Anti-Deferral Enforcement (v2.25.0+ — MANDATORY)
360
+ ## Anti-Deferral Enforcement (v2.25.0+ — two layers)
361
361
 
362
- The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026-04-15) extends to `/wogi-triage` mechanically in v2.25.0+. Prevents the rubber-stamp pattern where the AI silently drops findings from "fix all" requests.
362
+ The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026-04-15) has two complementary enforcement layers: one mechanical (an actual gate in the codebase) and one AI-followed (a protocol documented here that the triage flow honors).
363
363
 
364
- **Enforcement rules**:
364
+ ### Layer 1 — Mechanical gate (v2.25.1+)
365
365
 
366
- 1. **"Defer" / "skip" requires explicit user confirmation with a reason.** When the AI or user proposes to defer a finding, the triage flow MUST prompt:
366
+ `scripts/flow-completion-truth-gate.js` exports `parseCommitMessageClaims()` and `verifyCommitMessageAgainstDiff()`. Callers pass a commit message and the staged diff (or changed-files list); the function parses finding IDs (`F1`/`M1`/`SEC-001`), task IDs (`wf-XXXXXXXX` after fix/close/resolve verbs), and file-path mentions, then checks each against the diff. Any unverified claim surfaces as a blocking prompt with three remediation options. This is real code, callable from pre-commit hooks, `flow-done.js`, or the triage flow itself.
367
+
368
+ Example usage:
369
+ ```javascript
370
+ const { verifyCommitMessageAgainstDiff, formatMissingClaimsMessage } =
371
+ require('wogiflow/scripts/flow-completion-truth-gate');
372
+
373
+ const result = verifyCommitMessageAgainstDiff(commitMsg, { diffText, changedFiles });
374
+ if (!result.ok) {
375
+ console.error(formatMissingClaimsMessage(result));
376
+ // Block + remediate
377
+ }
378
+ ```
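
To make the finding-ID heuristic concrete, the sketch below reuses the finding-ID pattern that appears in `parseCommitMessageClaims()` in this release and runs it on two sample messages. The wrapper `extractFindingIds` and the sample messages are hypothetical; this is not the package's API, only a way to see what the pattern picks up.

```javascript
// Pattern copied from parseCommitMessageClaims() in this diff.
const FINDING_RE = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;

// Hypothetical wrapper, for illustration only.
function extractFindingIds(message) {
  return [...message.matchAll(FINDING_RE)].map((m) => m[0]);
}

console.log(extractFindingIds('Fixes F1, M1 and SEC-001; closes wf-1a2b3c4d'));
console.log(extractFindingIds('Switch log encoding to UTF-8'));
```

The first message yields `F1`, `M1`, and `SEC-001` (the lowercase task ID is handled by a separate pattern). The second yields `UTF-8`, because the ALLCAPS-dash-number branch also matches ordinary tokens of that shape, so a commit message that mentions such a token would be expected to find it in the diff as well.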
379
+
380
+ ### Layer 2 — AI-followed protocol (documentation)
381
+
382
+ The rest of the triage flow is a protocol the AI follows. It is NOT automatically enforced by a hook — the historical v2.17.4 incident showed that doc-only protocols can be violated. The mechanical gate above closes the most damaging failure mode (commit message / diff mismatch). The AI-followed rules below cover the earlier stages:
383
+
384
+ 1. **Defer requires explicit user confirmation + reason.** The triage flow prompts when proposing to defer:
367
385
  ```
368
386
  Defer finding wf-review-XXXX?
369
387
  Severity: HIGH
370
388
  Reason required: [user input]
371
389
  [Confirm defer] [Cancel — fix now]
372
390
  ```
373
- Auto-defer without reason is FORBIDDEN.
391
+ Auto-defer without reason is forbidden by this protocol.
374
392
 
375
- 2. **"Fix all" / "Option 1" / equivalent means fix ALL.** If the user requests bulk processing:
393
+ 2. **"Fix all" / "Option 1" means fix ALL.** If the user requests bulk processing:
376
394
  - Ship a fix for every finding with evidence-tier ≥ 1
377
395
  - If any finding is too large, STOP and ask: "Finding X requires ~Y minutes of work. Ship now, split to its own release, or defer (needs reason)?"
378
396
  - Never silently convert a finding to "deferred" in commit messages or release notes
379
397
 
380
- 3. **Commit/release consistency check.** Before finalizing, scan the commit message / release notes against the findings list. If the message claims "fixes F1, F2, F3, M1" but M1 isn't in the diff, BLOCK with:
381
- ```
382
- Commit message claims M1 is fixed, but M1 does not appear in the diff.
383
- Options: [Fix M1 now] [Remove M1 from message] [Acknowledge + proceed]
384
- ```
385
-
386
- 4. **Triage output includes a Deferral Audit Trail**:
398
+ 3. **Triage output includes a Deferral Audit Trail**:
387
399
  ```
388
400
  ━━━ TRIAGE SUMMARY ━━━
389
401
  Fixed: 12
@@ -394,6 +406,10 @@ The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026
394
406
  ━━━━━━━━━━━━━━━━━━━━━━
395
407
  ```
396
408
 
397
- Historical incident (v2.17.4 release, 2026-04-15): commit message claimed "fix all findings" but M1 and M3 were silently dropped. The v2.25.0+ mechanical enforcement makes that failure mode architecturally impossible — the flow stops and asks rather than letting the AI make an autonomous defer decision.
409
+ ### Honest tradeoff
410
+
411
+ Layer 1 is genuinely mechanical — impossible for an AI to bypass without explicitly disabling the gate. Layer 2 is a protocol the AI can fail to follow if prompted poorly, distracted, or confused about priorities. Both matter; calling the whole system "architecturally impossible to bypass" would be inaccurate. The mechanical gate at least ensures that WHEN the AI writes a commit message, claimed fixes must actually appear in the diff.
412
+
413
+ Historical incident (v2.17.4 release, 2026-04-15): commit claimed "fix all findings" but M1 and M3 were silently dropped. Layer 1 would have caught that — the commit message mentioned M1 + M3 but the diff didn't. Layer 2 is the human-protocol reinforcement.
398
414
 
399
- Skip only if `config.triage.antiDeferralEnforcement.enabled` is explicitly `false` (default: true).
415
+ Skip via `config.triage.antiDeferralEnforcement.enabled: false` — note that this is currently a surface flag only (read by AI-followed protocol, not by the Layer 1 gate); to disable Layer 1 set `config.commitClaimsGate.enabled: false`.
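
Because the two layers are switched by different keys, it may help to see them side by side. The helper below is a hypothetical sketch (the name `antiDeferralLayers` and the returned shape are invented here); it also assumes the Layer 1 gate is on unless `commitClaimsGate.enabled` is explicitly `false`, which the text does not state.

```javascript
// Hypothetical helper; only the two config keys come from the documentation.
function antiDeferralLayers(config = {}) {
  return {
    // Layer 1: mechanical commit-claims gate (assumed on unless set to false).
    layer1Gate: config.commitClaimsGate?.enabled !== false,
    // Layer 2: AI-followed protocol flag (on unless set to false).
    layer2Protocol: config.triage?.antiDeferralEnforcement?.enabled !== false,
  };
}

// Turning off the protocol flag alone leaves the mechanical gate active.
console.log(antiDeferralLayers({ triage: { antiDeferralEnforcement: { enabled: false } } }));
```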
@@ -432,6 +432,10 @@ Use `/wogi-research "question"` for rigorous verification.
432
432
 
433
433
  ---
434
434
 
435
+ {{> methodology-rules}}
436
+
437
+ ---
438
+
435
439
  ## Generated by CLI Bridge
436
440
 
437
441
  This file was generated by the Wogi Flow CLI bridge.
@@ -0,0 +1,149 @@
1
+ ## WogiFlow Methodology Rules
2
+
3
+ These are product-level rules that apply to every WogiFlow session. They ship with the tool — enforcement is in the shipped scripts/hooks, and the text below explains the contract to Claude so it doesn't try to work around the enforcement.
4
+
5
+ ---
6
+
7
+ ### Research Before Propose (MANDATORY)
8
+
9
+ **Rule**: Before proposing any fix, plan, or spec, audit existing infrastructure for the problem area. Propose only what fills a confirmed gap. Evidence-before-invention.
10
+
11
+ **What counts as research**: read relevant files in `.workflow/state/` (decisions.md, feedback-patterns.md, app-map.md, function-map.md, api-map.md), read the task spec from `.workflow/changes/` or `.workflow/specs/`, grep existing hooks/classifiers/gates, read relevant source files.
12
+
13
+ **Why**: baseline LLM training biases toward generating plausible-sounding solutions. In a codebase with existing infrastructure, "plausible" is frequently wrong — proposing a feature that already exists, missing an existing pattern, or reinventing a wired-up hook. The correction cycle cost (user rejecting → replanning → rejecting again) is higher than the upfront audit cost.
14
+
15
+ **You MAY ask the user clarifying questions when genuinely needed.** The rule is not "never ask" — it is "don't propose before researching." Asking is a valid escape hatch; proposing without evidence is not.
16
+
17
+ **Enforcement**: `scripts/hooks/core/research-evidence-gate.js` tracks state-file reads (`.workflow/state/`, `.workflow/changes/`, `.workflow/specs/`, `.workflow/epics/`) in the current task turn. Three enforcement points check the evidence fingerprint before proposal actions:
18
+
19
+ 1. **Phase transition** — `transitionPhase()` blocks `→ spec_review` and `→ coding` until `minEvidence` distinct state/spec file reads have been recorded.
20
+ 2. **Spec write** — PreToolUse blocks `Edit`/`Write` to `.workflow/changes/*.md`, `.workflow/specs/*.md`, or `.workflow/epics/*.md` when evidence is below threshold.
21
+ 3. **Channel dispatch** — in workspace manager mode, `dispatchToChannel()` blocks dispatching a task to a worker until the manager has read evidence from the target member repo.
22
+
23
+ Evidence is cleared at task start, session end, and post-compaction so each task begins with a clean slate. The `AskUserQuestion` tool is NOT gated — asking for clarification is a valid escape hatch. IGR's architect + adversary passes challenge solution *quality* downstream; this gate enforces the evidence *base* upstream.
24
+
25
+ **Config**: `hooks.rules.researchEvidenceGate.{enabled,minEvidence}` (defaults: `true`, `2`).
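
The evidence fingerprint described above can be pictured as a set of distinct tracked-file reads compared against `minEvidence`. The class below is a minimal sketch under that reading; `EvidenceTracker` and its methods are hypothetical and do not reflect the internals of `research-evidence-gate.js`. The tracked directories and the defaults (`true`, `2`) are taken from the text.

```javascript
// Directories whose reads count as evidence, per the rule text.
const TRACKED_PREFIXES = [
  '.workflow/state/',
  '.workflow/changes/',
  '.workflow/specs/',
  '.workflow/epics/',
];

// Hypothetical tracker, for illustration only.
class EvidenceTracker {
  constructor({ enabled = true, minEvidence = 2 } = {}) {
    this.enabled = enabled;
    this.minEvidence = minEvidence;
    this.reads = new Set(); // distinct paths only; re-reading a file adds nothing
  }

  recordRead(filePath) {
    if (TRACKED_PREFIXES.some((prefix) => filePath.startsWith(prefix))) {
      this.reads.add(filePath);
    }
  }

  // Called at task start, session end, and post-compaction in the described design.
  clear() {
    this.reads.clear();
  }

  check() {
    if (!this.enabled || this.reads.size >= this.minEvidence) return { blocked: false };
    return {
      blocked: true,
      message: `Need ${this.minEvidence} distinct state/spec reads, have ${this.reads.size}`,
    };
  }
}

const tracker = new EvidenceTracker();
tracker.recordRead('.workflow/state/decisions.md');
tracker.recordRead('.workflow/state/decisions.md'); // duplicate, not counted twice
console.log(tracker.check().blocked);
tracker.recordRead('.workflow/specs/example.md');
console.log(tracker.check().blocked);
```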
26
+
27
+ ---
28
+
29
+ ### Completion-Claim Honesty Scan
30
+
31
+ **Rule**: At session-end and on `flow health`, scan `ready.json` entries for two contradiction classes and surface (not block) them for user reconciliation.
32
+
33
+ - **Class A — status-mismatch**: free-text field contains done-words (`done|completed|shipped|deployed|finished`) while `status` is partial (`completed-partial|blocked|in-progress|failed`).
34
+ - **Class B — negation-vs-evidence**: free-text contains a negated claim (`no outages`, `0 regressions`) while `hotfixes[]`, `incidents[]`, or `regressions[]` is non-empty.
35
+
36
+ **Why**: mechanical gates (test counts, lint, tsc) catch implementation errors. Narrative-quality claims in free-text fields (`notes`, `result`, `summary`, `description`) get rubber-stamped. This scan compares narrative against adjacent structured fields.
37
+
38
+ **Mode**: surface-and-prompt, non-blocking. A hard-fail at session-end has no recovery path.
39
+
40
+ **Enforcement**: `scripts/flow-completion-truth-gate.js` → `scanForClaimContradictions()`. Invoked by `flow-session-end.js` and `flow-health.js`.
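
As a rough illustration of the two contradiction classes, the sketch below checks a single `ready.json`-style entry. It is not the shipped `scanForClaimContradictions()`: the function name `scanEntry`, the returned labels, and in particular the negation pattern (built only from the two examples in the rule) are assumptions.

```javascript
const DONE_WORDS = /\b(?:done|completed|shipped|deployed|finished)\b/i;
const PARTIAL_STATUSES = new Set(['completed-partial', 'blocked', 'in-progress', 'failed']);
const TEXT_FIELDS = ['notes', 'result', 'summary', 'description'];
const EVIDENCE_FIELDS = ['hotfixes', 'incidents', 'regressions'];
// Assumption: a deliberately narrow negation pattern; the real scanner may differ.
const NEGATED_CLAIM = /\b(?:no|0|zero)\s+(?:outages|regressions|incidents|hotfixes)\b/i;

// Hypothetical per-entry check, for illustration only.
function scanEntry(entry) {
  const text = TEXT_FIELDS
    .map((field) => entry[field])
    .filter((value) => typeof value === 'string')
    .join('\n');
  const found = [];
  // Class A: narrative says "done" while the structured status is partial.
  if (DONE_WORDS.test(text) && PARTIAL_STATUSES.has(entry.status)) {
    found.push('status-mismatch');
  }
  // Class B: narrative negates problems while evidence arrays are non-empty.
  const hasEvidence = EVIDENCE_FIELDS.some(
    (field) => Array.isArray(entry[field]) && entry[field].length > 0
  );
  if (NEGATED_CLAIM.test(text) && hasEvidence) {
    found.push('negation-vs-evidence');
  }
  return found;
}

console.log(scanEntry({ status: 'blocked', notes: 'Shipped and done' }));
console.log(scanEntry({ status: 'completed', summary: '0 regressions', regressions: ['r1'] }));
```

Returning labels instead of throwing matches the surface-and-prompt mode: the caller decides how to present contradictions, and nothing blocks.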
41
+
42
+ ---
43
+
44
+ ### Merge-Plan Artifact Gate
45
+
46
+ **Rule**: `/wogi-finalize` requires `.workflow/scratch/merge-plan.md` for any merge with more than `config.finalization.mergePlan.threshold` commits (default 5) OR any cross-repo merge. The plan must map every commit in `git log <base>..<branch>` to one of: `port | adapt | skip-style | superseded | skip-with-reason`.
47
+
48
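+ **Mechanical invariant**: count of SHA-prefixed lines in the plan MUST equal the number of commits in the range, e.g. `git log --oneline <base>..<branch> | wc -l` (plain `git log` prints several lines per commit). Mismatch blocks the merge.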
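
The invariant amounts to comparing two counts. The sketch below assumes each mapped commit occupies one plan line that begins with an abbreviated or full SHA (optionally after a list bullet); that line format, and the names `countPlanShaLines` and `checkMergePlan`, are assumptions made for illustration. The commit count would come from git, for example `git rev-list --count <base>..<branch>`.

```javascript
// Assumption: a plan line for a commit starts with 7 to 40 hex characters.
const SHA_LINE = /^\s*(?:[-*]\s+)?[0-9a-f]{7,40}\b/i;

function countPlanShaLines(planText) {
  return planText.split('\n').filter((line) => SHA_LINE.test(line)).length;
}

// Hypothetical check: the plan must account for every commit in the range.
function checkMergePlan(planText, commitCount) {
  const planned = countPlanShaLines(planText);
  return { ok: planned === commitCount, planned, commitCount };
}

const plan = [
  '# Merge plan',
  'a1b2c3d port: add gate',
  '9f8e7d6 skip-with-reason: reverted upstream',
].join('\n');

// Two commits are mapped, but the range is assumed to hold three.
console.log(checkMergePlan(plan, 3));
```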
49
+
50
+ **Structural-change sensor**: when ≥ `config.finalization.mergePlan.restructureThreshold` (default 20%) of changed files match a restructure pattern (folder-per-component, split-into-submodule, barrel-introduction, rename-new-home), a structural warning prefixes the plan and biases affected commits toward `adapt`.
51
+
52
+ **Enforcement**: `scripts/flow-structure-sensor.js`, `.claude/commands/wogi-finalize.md` Step 2.5.
53
+
54
+ ---
55
+
56
+ ### Story Creation Quality Gates
57
+
58
+ **Rule**: `/wogi-story` enforces five P0 specification-quality gates at creation time. Gates answer *"is the story clear, complete, checkable?"* — NOT *"is the implementation correct?"* (the latter remains `/wogi-start`'s job).
59
+
60
+ 1. **Long Input** — ≥40 lines OR ≥5 discrete items → route to `/wogi-extract-review` for zero-loss capture.
61
+ 2. **Item Reconciliation** — ≥3 discrete items → enumerated "Item Manifest" section; every item must appear in at least one criterion or sub-task. Unmapped items surface as a warning.
62
+ 3. **Consumer Impact Analysis** — refactoring keywords (`refactor`, `rename`, `migrate`, `split`, `extract`, ...) trigger `git grep` for consumers. ≥5 breaking consumers → phased migration recommendation.
63
+ 4. **Scope-Confidence Audit** — assumption patterns (`new <X>`, `existing <Y>`, `the <Z> service`) are verified against the codebase; findings go into a "Pending Clarifications" block.
64
+ 5. **Intent Bootstrap Coordination** — schedules IGR artifact bootstrap via `intentBootstrapScheduledAt` flag so `/wogi-story` and `/wogi-start` don't both prompt.
65
+
66
+ **Guard-rails**: all gates fail-open (grep failure, classifier unavailable → warning, story still created). Gates may be bypassed via `--skip-gates` for testing.
67
+
68
+ **Config**: `storyFlow.consumerImpactAnalysis.*`, `storyFlow.scopeConfidenceAudit.*`, `storyFlow.itemReconciliation.*` in `.workflow/config.json`.
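
The first two gates are simple threshold checks. A minimal sketch, assuming the caller has already counted input lines and discrete items (how items are counted is not specified above); `routeStoryInput` and its return shape are hypothetical.

```javascript
// Hypothetical router for gates 1 and 2; thresholds are the ones listed above.
function routeStoryInput({ lineCount, itemCount }) {
  return {
    // Gate 1: long input goes to /wogi-extract-review.
    useExtractReview: lineCount >= 40 || itemCount >= 5,
    // Gate 2: three or more items require an Item Manifest section.
    needsItemManifest: itemCount >= 3,
  };
}

console.log(routeStoryInput({ lineCount: 12, itemCount: 3 }));
```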
69
+
70
+ ---
71
+
72
+ ### Workspace Autonomous-Mode Action-After-Completion Contract
73
+
74
+ **Applies to**: workspace worker mode (`WOGI_WORKSPACE_ROOT` set + `WOGI_REPO_NAME !== 'manager'`).
75
+
76
+ **Rule**: A worker's end-of-turn must be a deterministic action. Exactly one of these states must hold:
77
+
78
+ 1. **ACTION** — started the next pre-approved channel dispatch (invoked `/wogi-start <nextId>`), OR
79
+ 2. **ESCALATION** — channel-dispatched a `## QUESTION:` to the manager (after local resolution attempts failed), OR
80
+ 3. **IDLE** — zero pending channel dispatches AND zero in-progress tasks.
81
+
82
+ **Hedging language is mechanically forbidden**: *"awaiting your signal"*, *"let me know if"*, *"should I continue"*, *"standing by"*, *"ready when you are"*. These invent an imaginary decision point — the manager already pre-approved the dispatch by queuing it. Visibility is NOT a substitute for action; workers narrate AND act in the same turn.
83
+
84
+ **Enforcement**: `TaskCompleted` hook emits auto-pickup when queued dispatches exist. `Stop` hook blocks end-of-turn when a worker has queued dispatches but no in-progress task. `worker-rules.md` template carries the 3-state contract.
85
+
86
+ **Config**: `workspace.autoPickupChannelDispatches` (default `true`).
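
One way to read the three-state contract is as a classifier over what the worker did in its final turn. The function below is a hypothetical sketch, not the `Stop` hook itself; the input flags and the `VIOLATION` label are invented for illustration.

```javascript
// Hypothetical end-of-turn classifier for a workspace worker.
function classifyEndOfTurn({
  startedNextDispatch = false, // invoked /wogi-start <nextId>
  escalatedQuestion = false,   // channel-dispatched a "## QUESTION:" to the manager
  pendingDispatches = 0,
  inProgressTasks = 0,
} = {}) {
  if (startedNextDispatch) return 'ACTION';
  if (escalatedQuestion) return 'ESCALATION';
  if (pendingDispatches === 0 && inProgressTasks === 0) return 'IDLE';
  // Queued or unfinished work with no action and no escalation: the stalled case.
  return 'VIOLATION';
}

console.log(classifyEndOfTurn({ pendingDispatches: 2 }));
```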
87
+
88
+ ---
89
+
90
+ ### Workspace Worker Cannot Prompt User Directly
91
+
92
+ **Applies to**: workspace worker mode.
93
+
94
+ **Rule**: The `AskUserQuestion` tool is mechanically blocked in worker mode. Questions to the user MUST be channel-dispatched to the manager via `## QUESTION: ...`.
95
+
96
+ **Why block instead of auto-redirect**: the worker must consciously choose between (a) channel-dispatching the real question to the manager for user input, or (b) making a reasonable autonomous decision and noting it in the task reply. Silent redirection removes that choice.
97
+
98
+ **Enforcement**: `scripts/hooks/core/worker-boundary-gate.js` → `checkWorkerBoundary()`. PreToolUse hook blocks `AskUserQuestion`; block message includes the exact `curl ... --data-binary "## QUESTION: ..."` command. Config: `workspace.blockAskUserQuestionInWorker` (default `true`).
99
+
100
+ ---
101
+
102
+ ### Workspace Worker Text-Question Classifier
103
+
104
+ **Applies to**: workspace worker mode.
105
+
106
+ **Rule**: If a worker ends a turn with a text-based question to the user (no tool call — just hedging: *"let me know"*, *"should I"*, *"which option"*, *"thoughts?"*, trailing `?`), the Stop hook runs a Haiku classifier on the final assistant message. If it detects an open question with confidence ≥ `minConfidence` → stop is blocked with channel-dispatch instructions.
107
+
108
+ **Why AI instead of regex**: hedging vocabulary is infinite. Regex misses novel phrasings.
109
+
110
+ **Fail-open throughout**: missing `ANTHROPIC_API_KEY`, missing transcript path, malformed transcript, or model error → skip. Silent-stall false negatives are recoverable; false-positive blocks on every turn are not.
111
+
112
+ **Enforcement**: `scripts/flow-worker-question-classifier.js`. Config: `workspace.aiWorkerQuestionClassifier.{enabled,minConfidence,model}`.
113
+
114
+ ---
115
+
116
+ ### Workspace Worker Silent-Halt Detection
117
+
118
+ **Applies to**: workspace manager mode.
119
+
120
+ **Rule**: Every dispatch to a worker MUST be tracked. Any pending dispatch past its `expectedDeadline` with no matching `task-complete` or `worker-stopped` message = silent death, surfaced on the manager's next turn.
121
+
122
+ **Three terminal states**:
123
+ 1. **Completed** — `task-complete` message arrived.
124
+ 2. **Graceful-stop** — `worker-stopped` message arrived (the worker's Stop hook fired, but the task did not complete).
125
+ 3. **Silent-halt** — no message, deadline passed. Worker probably dead.
126
+
127
+ **Deadline**: default `expectedDurationMs` = 30 min. Callers override per-dispatch for long tasks.
128
+
129
+ **Architecture — file-based, hook-driven, no background processes**:
130
+ - `lib/workspace-dispatch-tracking.js` — record / reconcile / overdue helpers
131
+ - `.workspace/state/dispatched-tasks.json` — ring buffer of last 100 active records
132
+ - Manager's `dispatchToChannel()` calls `recordDispatch()` after successful POST
133
+ - Manager's `UserPromptSubmit` hook sweeps the message bus and surfaces overdue records as `additionalContext`
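
The overdue sweep can be sketched as a filter over dispatch records. The record and message shapes below (`taskId`, `dispatchedAt`, `expectedDurationMs`, `type`) and the function name `findSilentHalts` are assumptions for illustration, not the layout of `dispatched-tasks.json` or the API of `workspace-dispatch-tracking.js`; the 30 minute default comes from the text.

```javascript
const DEFAULT_EXPECTED_DURATION_MS = 30 * 60 * 1000; // documented default: 30 min
const TERMINAL_TYPES = new Set(['task-complete', 'worker-stopped']);

// Hypothetical sweep: returns records past their deadline with no terminal message.
function findSilentHalts(records, messages, now = Date.now()) {
  const settled = new Set(
    messages.filter((m) => TERMINAL_TYPES.has(m.type)).map((m) => m.taskId)
  );
  return records.filter((record) => {
    const duration = record.expectedDurationMs ?? DEFAULT_EXPECTED_DURATION_MS;
    const expectedDeadline = record.dispatchedAt + duration;
    return !settled.has(record.taskId) && now > expectedDeadline;
  });
}

const records = [
  { taskId: 'wf-00000001', dispatchedAt: 0 },
  { taskId: 'wf-00000002', dispatchedAt: 0 },
  { taskId: 'wf-00000003', dispatchedAt: 0, expectedDurationMs: 4 * 60 * 60 * 1000 },
];
const messages = [{ type: 'task-complete', taskId: 'wf-00000001' }];

// One hour after dispatch: only the second task is a silent halt.
console.log(findSilentHalts(records, messages, 60 * 60 * 1000).map((r) => r.taskId));
```

Because the check is a pure function of files on disk and the current time, it fits the hook-driven design: it can run on the manager's next turn without any background process.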
134
+
135
+ ---
136
+
137
+ ### Code Quality Patterns (generic)
138
+
139
+ These apply to any codebase being built with WogiFlow's help.
140
+
141
+ **1. Single Source of Truth for Constants** — avoid duplicating model/configuration objects across files. Import from one canonical location. Prevents drift and makes updates simpler.
142
+
143
+ **2. Named Constants for Magic Numbers** — define thresholds and limits as named constants; don't inline literals.
144
+
145
+ ```js
146
+ const COVERAGE_THRESHOLDS = { default: 0.7, comprehensive: 0.85, concise: 0.5 };
147
+ ```
148
+
149
+ Self-documenting; easier to maintain.
@@ -710,6 +710,50 @@ async function dispatchToChannel(workspaceRoot, repoName, taskId, opts = {}) {
      return { ok: false, message: `Invalid task ID format: "${taskId}" — expected wf-XXXXXXXX` };
    }
 
+   // Research-before-propose gate (manager mode): require evidence that the
+   // manager has read at least N state/spec files before dispatching work.
+   //
+   // Architectural note (ARCH-005): the dispatch gate is wired here in the
+   // routing layer rather than via pre-tool-orchestrator because channel
+   // dispatch is a lib-level operation that can be invoked outside the
+   // Claude Code PreToolUse hook path (e.g., by a worker's internal CLI
+   // or by a script). Wiring here ensures the gate fires on every dispatch,
+   // regardless of invocation surface. The spec-write gate is wired in the
+   // orchestrator because Edit/Write is hook-surface-only.
+   //
+   // Error handling (CL-005): separate catches so gate-not-installed (the
+   // intended fail-open path) is distinct from config-load failures (which
+   // indicate broken installs worth surfacing in DEBUG) and from gate
+   // runtime errors (which suggest a bug in the gate itself).
+   let dispatchGateModule = null;
+   try {
+     dispatchGateModule = require('../scripts/hooks/core/research-evidence-gate');
+   } catch (_err) {
+     // Gate module not installed — fail-open silently; this is expected on
+     // older installs that predate the research-evidence gate.
+   }
+   if (dispatchGateModule) {
+     let cfg = null;
+     try {
+       const { getConfig } = require('../scripts/flow-utils');
+       cfg = getConfig();
+     } catch (err) {
+       if (process.env.DEBUG) {
+         console.error(`[dispatchToChannel] Config load failed (gate still enforced with defaults): ${err.message}`);
+       }
+     }
+     try {
+       const dispatchGate = dispatchGateModule.checkDispatchEvidenceGate(cfg);
+       if (dispatchGate.blocked) {
+         return { ok: false, message: dispatchGate.message };
+       }
+     } catch (err) {
+       if (process.env.DEBUG) {
+         console.error(`[dispatchToChannel] Dispatch gate runtime error (fail-open): ${err.message}`);
+       }
+     }
+   }
+
    const configPath = path.join(workspaceRoot, 'wogi-workspace.json');
    const config = safeReadJson(configPath);
    if (!config || typeof config !== 'object') {
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "wogiflow",
3
- "version": "2.25.0",
3
+ "version": "2.26.0",
4
4
  "description": "AI-powered development workflow management system with multi-model support",
5
5
  "main": "lib/index.js",
6
6
  "bin": {
@@ -10,7 +10,7 @@
10
10
  },
11
11
  "scripts": {
12
12
  "flow": "./scripts/flow",
13
- "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js 
tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
13
+ "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js 
tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js tests/flow-commit-claims-gate.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
14
14
  "test:syntax": "find scripts/ lib/ -name '*.js' -not -path '*/node_modules/*' -exec node --check {} +",
15
15
  "lint": "eslint scripts/ lib/ tests/",
16
16
  "lint:ci": "eslint scripts/ lib/ tests/ --max-warnings 0",
@@ -614,6 +614,133 @@ function collectArrayEntries(obj, keys) {
    return out;
  }
 
+ // ============================================================
+ // Commit-vs-diff consistency scanner (v2.25.1 — H2b from Waves 1-3 review)
+ // ============================================================
+
+ /**
+  * Parse a commit message for "fixes X" / "closes X" / "F1, F2, M1" style claims
+  * that should be verifiable against the diff.
+  *
+  * Heuristics — conservative to avoid false positives:
+  *   1. Bracketed finding IDs: `F1`, `F2`, `M1`, `H3`, `L5`, or `SEC-001`/`PERF-002`
+  *   2. Task IDs: `wf-XXXXXXXX` that appear as "fixes wf-...", "closes wf-...", etc.
+  *   3. File paths mentioned in fix-context: "fixes `path/to/file.js`"
+  *
+  * Returns the structured claims a diff-consistency check can verify.
+  *
+  * @param {string} commitMessage
+  * @returns {{claims: Array<{kind: 'finding-id'|'task-id'|'file', value: string, raw: string}>}}
+  */
+ function parseCommitMessageClaims(commitMessage) {
+   const claims = [];
+   if (typeof commitMessage !== 'string' || commitMessage.trim().length === 0) {
+     return { claims };
+   }
+
+   // Finding IDs: F1, F2, M1, H3, L5, SEC-001, PERF-002, etc.
+   // - Single-letter + digits: match on word boundary
+   // - ALLCAPS-dashnum: SEC-001, PERF-002
+   const findingRe = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;
+   for (const m of commitMessage.matchAll(findingRe)) {
+     claims.push({ kind: 'finding-id', value: m[0], raw: m[0] });
+   }
+
+   // Task IDs (wf-XXXXXXXX) — only count if preceded by fix/close/resolve verb
+   const taskRe = /\b(?:fix(?:es|ed)?|clos(?:es|ed)?|resolv(?:es|ed)?|address(?:es|ed)?)\s+(wf-[0-9a-f]{8})\b/gi;
+   for (const m of commitMessage.matchAll(taskRe)) {
+     claims.push({ kind: 'task-id', value: m[1], raw: m[0] });
+   }
+
+   // File paths in backticks after fix/address verbs: `fixes \`path/to/file.js\``
+   const fileRe = /(?:fix(?:es|ed)?|address(?:es|ed)?|updat(?:es|ed)?)\s+`([^`\n]{3,120})`/gi;
+   for (const m of commitMessage.matchAll(fileRe)) {
+     // Only count values that look like file paths (have an extension or a slash)
+     const val = m[1];
+     if (/[./]/.test(val) && !val.includes(' ')) {
+       claims.push({ kind: 'file', value: val, raw: m[0] });
+     }
+   }
+
+   // Dedup
+   const seen = new Set();
+   return {
+     claims: claims.filter(c => {
+       const k = `${c.kind}::${c.value.toLowerCase()}`;
+       if (seen.has(k)) return false;
+       seen.add(k);
+       return true;
+     })
+   };
+ }
+
+ /**
+  * Check commit message claims against the staged diff. Each claim must appear
+  * somewhere in the diff (a file path in the changed-files list OR the token
+  * appearing as-is in the diff body).
+  *
+  * @param {string} commitMessage
+  * @param {Object} [opts]
+  * @param {string} [opts.diffText] — raw `git diff --staged` output
+  * @param {string[]} [opts.changedFiles] — staged file list (alternative input)
+  * @returns {{ok: boolean, totalClaims: number, missingClaims: Array, verifiedClaims: Array}}
+  */
+ function verifyCommitMessageAgainstDiff(commitMessage, opts = {}) {
+   const { claims } = parseCommitMessageClaims(commitMessage);
+   if (claims.length === 0) return { ok: true, totalClaims: 0, missingClaims: [], verifiedClaims: [] };
+
+   const diffText = typeof opts.diffText === 'string' ? opts.diffText : '';
+   const changedFiles = Array.isArray(opts.changedFiles) ? opts.changedFiles : [];
+   const haystack = [diffText, ...changedFiles].join('\n');
+
+   const missingClaims = [];
+   const verifiedClaims = [];
+
+   for (const claim of claims) {
+     let found = false;
+     if (claim.kind === 'file') {
+       // File claims verify by exact path match (or suffix) in changed-files list
+       found = changedFiles.some(f => f === claim.value || f.endsWith('/' + claim.value) || f.endsWith(claim.value));
+       if (!found) found = diffText.includes(claim.value);
+     } else {
+       // finding-id + task-id: plain substring search in the haystack
+       found = haystack.includes(claim.value);
+     }
+     (found ? verifiedClaims : missingClaims).push(claim);
+   }
+
+   return {
+     ok: missingClaims.length === 0,
+     totalClaims: claims.length,
+     missingClaims,
+     verifiedClaims
+   };
+ }
+
+ /**
+  * Human-readable message when claims are missing from the diff.
+  *
+  * @param {Object} result — from verifyCommitMessageAgainstDiff
+  * @returns {string|null}
+  */
+ function formatMissingClaimsMessage(result) {
+   if (!result || result.ok || !Array.isArray(result.missingClaims) || result.missingClaims.length === 0) {
+     return null;
+   }
+   const lines = [
+     `Commit message claims ${result.missingClaims.length} item(s) that do not appear in the staged diff:`
+   ];
+   for (const c of result.missingClaims) {
+     lines.push(` • ${c.kind === 'finding-id' ? 'Finding' : c.kind === 'task-id' ? 'Task' : 'File'} "${c.value}" — not found`);
+   }
+   lines.push('');
+   lines.push('Options:');
+   lines.push(' 1. Add the missing fix to the commit now (git add + amend)');
+   lines.push(' 2. Remove the unverified claim from the commit message');
740
+ lines.push(' 3. Acknowledge + proceed (use --force-commit-claims if blocking from a gate)');
741
+ return lines.join('\n');
742
+ }
743
+
617
744
  // ============================================================
618
745
  // Exports
619
746
  // ============================================================
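The claim check above verifies finding IDs with a plain substring search. The following is a minimal sketch of that behavior, not the package's own code: `extractFindingIds` and `isClaimInDiff` are hypothetical names, and the commit message and diff text are made-up inputs. It suggests one caveat worth knowing: because there is no word-boundary check on the diff side, a short ID such as `F1` is also considered present when the diff only mentions `F10`.

```javascript
// Sketch only: simplified restatement of the finding-id check shown in the diff.
// `extractFindingIds` and `isClaimInDiff` are hypothetical names, not wogiflow APIs.
const FINDING_RE = /\b(?:F\d+|H\d+|M\d+|L\d+|[A-Z]{2,6}-\d+)\b/g;

function extractFindingIds(commitMessage) {
  // Collect unique IDs in first-seen order.
  return [...new Set([...commitMessage.matchAll(FINDING_RE)].map(m => m[0]))];
}

function isClaimInDiff(id, diffText) {
  // Plain substring search, as in the diff: no word-boundary check.
  return diffText.includes(id);
}

// Made-up inputs for illustration.
const ids = extractFindingIds('Fix SEC-001 and F1; follow-up for F1 and PERF-002');
const diff = '+ // SEC-001: validate workspace root\n+ // F10: unrelated note';
const report = ids.map(id => ({ id, found: isClaimInDiff(id, diff) }));

console.log(report);
// SEC-001 is found, F1 is found only because "F10" contains it, PERF-002 is missing.
```

If that false positive matters in practice, one option would be to verify IDs with the same word-boundary regex used for extraction rather than `includes`.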
@@ -627,6 +754,9 @@ module.exports = {
627
754
  isTruthGateDisabled,
628
755
  getMinTierForDone,
629
756
  scanForClaimContradictions,
757
+ parseCommitMessageClaims,
758
+ verifyCommitMessageAgainstDiff,
759
+ formatMissingClaimsMessage,
630
760
  TIER_NAMES,
631
761
  DONE_WORDS,
632
762
  DISAGREEMENT_WORDS,
@@ -818,6 +818,31 @@ const CONFIG_DEFAULTS = {
818
818
  // --- Gate Confidence ---
819
819
  gateConfidence: { enabled: false },
820
820
 
821
+ // --- Intent-Grounded Reasoning (IGR) ---
822
+ // Master flag for the IGR pipeline: Intent Framing (Step 1.15), Architect
823
+ // Pass (Step 1.55), Logic Adversary (Step 1.57), Scope-Confidence Audit
824
+ // (Step 1.45), Completion Truth Gate (Step 3.9). Default-on so new projects
825
+ // inherit the full reasoning pipeline. See .claude/docs/intent-grounded-reasoning.md.
826
+ intentGroundedReasoning: {
827
+ enabled: true,
828
+ _comment: 'IGR pipeline: architect + logic adversary + truth gate. See .claude/docs/intent-grounded-reasoning.md'
829
+ },
830
+
831
+ // --- Research Reasoning Gate ---
832
+ // Tiered classification for conversation-mode questions. Tier 1 = factual,
833
+ // direct answer. Tier 2 = domain/recommendation, surface assumptions and
834
+ // wait for user confirmation. Tier 3 = architecture, tier 2 flow + spawn
835
+ // cross-model adversary. See wogi-start.md § Research Reasoning Gate.
836
+ researchReasoningGate: {
837
+ enabled: true,
838
+ tier2: { enabled: true },
839
+ tier3: {
840
+ enabled: true,
841
+ adversaryModel: 'sonnet',
842
+ _comment_adversaryModel: 'Model used for Tier-3 cross-model adversary. Reused by /wogi-peer-review, /wogi-debug-hypothesis, /wogi-learn, /wogi-decide — single canonical key.'
843
+ }
844
+ },
845
+
821
846
  // --- Long Input Gate ---
822
847
  longInputGate: {
823
848
  enabled: true,
@@ -961,7 +986,8 @@ const CONFIG_DEFAULTS = {
961
986
  configChange: { enabled: false },
962
987
  setup: { enabled: true, autoOnboard: false, maintenanceTasks: ['healthCheck', 'cleanupLocks'] },
963
988
  sessionCleanup: { enabled: true },
964
- phaseGate: { enabled: true }
989
+ phaseGate: { enabled: true },
990
+ researchEvidenceGate: { enabled: true, minEvidence: 2 }
965
991
  }
966
992
  },
967
993
  claudeCode: { installPath: '.claude/settings.local.json' }
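The `researchEvidenceGate` default added in this hunk is read by the research-evidence gate module later in this diff, which looks it up under `config.hooks.rules.researchEvidenceGate`. As a rough sketch of how those lookup rules appear to resolve (assuming the default shown here lives at that path), the hypothetical helper `resolveGateSettings` below restates them in one place; it is not a function in the package.

```javascript
// Sketch only: `resolveGateSettings` is a hypothetical helper that restates the
// isGateEnabled/getMinEvidence rules from the gate module later in this diff.
const DEFAULT_MIN_EVIDENCE = 2;

function resolveGateSettings(config) {
  const gateCfg = config?.hooks?.rules?.researchEvidenceGate;
  // Missing config means enabled; only an explicit `false` or `enabled: false` disables.
  const disabled = gateCfg === false ||
    (gateCfg !== null && typeof gateCfg === 'object' && gateCfg.enabled === false);
  // Non-numeric, negative, or non-finite thresholds fall back to the default.
  const v = gateCfg?.minEvidence;
  const minEvidence = (typeof v === 'number' && v >= 0 && Number.isFinite(v))
    ? v
    : DEFAULT_MIN_EVIDENCE;
  return { enabled: !disabled, minEvidence };
}

const withRule = rule => ({ hooks: { rules: { researchEvidenceGate: rule } } });

console.log(resolveGateSettings({}));                                // defaults apply
console.log(resolveGateSettings(withRule({ enabled: false })));      // gate off
console.log(resolveGateSettings(withRule({ minEvidence: -1 })));     // invalid, default used
console.log(resolveGateSettings(withRule({ minEvidence: 0 })));      // threshold of zero
```

Note that a threshold of `0` leaves the gate enabled but means it can never block, since any evidence count satisfies `count >= 0`.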
@@ -1041,6 +1067,14 @@ const CONFIG_DEFAULTS = {
1041
1067
  // --- Decisions ---
1042
1068
  decisions: { amendmentTracking: { enabled: false } },
1043
1069
 
1070
+ // --- Session End ---
1071
+ // autoPushInWorker: when /wogi-session-end runs inside a workspace worker
1072
+ // (WOGI_WORKSPACE_ROOT set, WOGI_REPO_NAME !== 'manager'), the worker must
1073
+ // not prompt the user directly. true (default) → auto-push silently.
1074
+ // false → skip push entirely; user pushes manually. Non-worker sessions
1075
+ // are unaffected and still see the interactive prompt.
1076
+ sessionEnd: { autoPushInWorker: true },
1077
+
1044
1078
  // --- Community ---
1045
1079
  community: {
1046
1080
  enabled: false,
@@ -104,8 +104,13 @@ function loadReviewSession() {
104
104
  return null;
105
105
  }
106
106
 
107
- // Check for prototype pollution keys
108
- if ('__proto__' in parsed || 'constructor' in parsed || 'prototype' in parsed) {
107
+ // Check for prototype pollution keys. Use Object.prototype.hasOwnProperty
108
+ // rather than `key in parsed`; the latter also returns true for inherited
109
+ // properties, and EVERY plain object inherits `constructor` from
110
+ // Object.prototype, which made this guard falsely trip on every valid
111
+ // session file (pre-existing bug, found via v2.25.1 wave2 test).
112
+ const hasOwn = Object.prototype.hasOwnProperty;
113
+ if (hasOwn.call(parsed, '__proto__') || hasOwn.call(parsed, 'constructor') || hasOwn.call(parsed, 'prototype')) {
109
114
  console.error('Review session file contains unsafe keys');
110
115
  return null;
111
116
  }
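The difference between the old and new guard can be reproduced in isolation. This is a small sketch rather than the package's code: `hasUnsafeOwnKey` is a hypothetical helper and the JSON strings are made-up inputs. It shows that `in` reports the inherited `constructor` on any plain object, while an own-property check only trips when the parsed JSON really carries one of the unsafe keys.

```javascript
// Sketch only: `hasUnsafeOwnKey` is a hypothetical helper, not wogiflow's function.
const UNSAFE_KEYS = ['__proto__', 'constructor', 'prototype'];
const hasOwn = Object.prototype.hasOwnProperty;

function hasUnsafeOwnKey(obj) {
  return UNSAFE_KEYS.some(key => hasOwn.call(obj, key));
}

// Made-up session payloads for illustration.
const valid = JSON.parse('{"files": ["a.js"], "round": 1}');
const hostile = JSON.parse('{"__proto__": {"polluted": true}}');

console.log('constructor' in valid);   // true: inherited from Object.prototype
console.log(hasUnsafeOwnKey(valid));   // false: no own unsafe keys
console.log(hasUnsafeOwnKey(hostile)); // true: JSON.parse creates an own "__proto__" key
```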
@@ -414,11 +419,21 @@ function exportAsItemManifest() {
414
419
  // Coordinate with Intent Bootstrap (see flow-story-gates.coordinateIntentBootstrap)
415
420
  // so /wogi-start doesn't re-prompt if the user already scheduled bootstrap via
416
421
  // /wogi-story during this session.
422
+ //
423
+ // v2.25.1: Semantics corrected (nit from Waves 1-3 review). The flag
424
+ // represents "is IGR bootstrap active/scheduled for this session?", NOT
425
+ // "did THIS call schedule it?". `result.active` is true when IGR is enabled
426
+ // and bootstrap has been scheduled — whether by this call or a prior one.
417
427
  let intentBootstrapScheduled = false;
418
428
  try {
419
429
  const gates = require('./flow-story-gates');
420
430
  const result = gates.coordinateIntentBootstrap();
421
- intentBootstrapScheduled = !!(result && result.scheduled);
431
+ if (result && result.active) {
432
+ // Scheduled in this call OR already-scheduled from a prior call = active
433
+ intentBootstrapScheduled = result.scheduled === true ||
434
+ result.reason === 'already-scheduled' ||
435
+ result.reason === 'artifacts-exist';
436
+ }
422
437
  } catch (_err) { /* non-critical */ }
423
438
 
424
439
  return {
@@ -385,22 +385,31 @@ function collectBriefingData() {
385
385
  // v2.23.0 — Workspace dispatch surfacing (manager mode only).
386
386
  // If the user is working inside a workspace manager session, surface any
387
387
  // overdue or restart-gap-lost dispatches so the morning briefing catches
388
- // what the last manager turn would have caught. Fail-open.
388
+ // what the last manager turn would have caught. Fail-open; DEBUG-logged.
389
389
  try {
390
390
  if (process.env.WOGI_WORKSPACE_ROOT) {
391
391
  const { buildOverdueContext } = require('./hooks/core/overdue-dispatches');
392
392
  const ctx = buildOverdueContext();
393
393
  if (ctx) briefing.workspaceOverdue = ctx;
394
394
  }
395
- } catch (_err) { /* non-critical */ }
395
+ } catch (err) {
396
+ if (process.env.DEBUG) {
397
+ console.error(`[morning] Workspace overdue check failed (fail-open): ${err.message}`);
398
+ }
399
+ }
396
400
 
397
401
  // v2.23.0 — Completion-claim honesty scan.
398
402
  // Catches done-word-in-notes-while-status-partial and similar
399
403
  // contradictions across ready.json (uses the honesty-infra from 2026-04-16).
404
+ // Fail-open; DEBUG-logged.
400
405
  try {
401
406
  const { checkCompletionClaimHonesty } = require('./flow-health');
402
407
  briefing.honestyHits = checkCompletionClaimHonesty();
403
- } catch (_err) { /* non-critical */ }
408
+ } catch (err) {
409
+ if (process.env.DEBUG) {
410
+ console.error(`[morning] Honesty scan failed (fail-open): ${err.message}`);
411
+ }
412
+ }
404
413
 
405
414
  // Generate suggested prompt if enabled
406
415
  if (morningConfig.generatePrompt !== false) {
@@ -41,6 +41,17 @@ const { getReadyData, saveReadyData } = require('./flow-utils');
41
41
  // v2.6.1: Use centralized state cleanup module
42
42
  const { cleanupStaleState } = require('./flow-state-cleanup');
43
43
 
44
+ // Workspace worker detection — used by offerPush() to branch between
45
+ // interactive prompt (single-repo) and silent auto-push (worker mode).
46
+ // Loaded eagerly; if the module is missing on a broken install, fall back
47
+ // to a stub that reports non-worker mode, so offerPush() keeps the interactive prompt.
48
+ let isWorker;
49
+ try {
50
+ isWorker = require('../lib/workspace-worker-ready').isWorker;
51
+ } catch (_err) {
52
+ isWorker = () => false;
53
+ }
54
+
44
55
  // v1.8.0 automatic memory management
45
56
  let memoryDb = null;
46
57
  try {
@@ -448,22 +459,63 @@ function analyzeCrossSessionPatterns() {
448
459
  }
449
460
 
450
461
  /**
451
- * Offer to push to remote
462
+ * Offer to push to remote.
463
+ *
464
+ * In workspace worker mode (WOGI_WORKSPACE_ROOT + WOGI_REPO_NAME !== 'manager'),
465
+ * workers must not prompt the user directly — that violates the Workspace Worker
466
+ * Cannot Prompt User Directly contract (decisions.md v2.20.1+). Instead:
467
+ * - config.sessionEnd.autoPushInWorker === true (default) → push silently
468
+ * - config.sessionEnd.autoPushInWorker === false → skip push silently
469
+ * In single-repo mode (the common case), the interactive prompt is unchanged.
452
470
  */
453
471
  async function offerPush() {
454
472
  if (!isGitRepo()) return;
455
473
 
456
474
  try {
457
475
  execSync('git remote get-url origin', { stdio: 'pipe' });
476
+ } catch (_err) {
477
+ return;
478
+ }
458
479
 
459
- const confirm = await prompt('Push to remote? (y/N) ');
480
+ // Worker-mode detection + secondary validation (SEC-001). isWorker() checks
481
+ // WOGI_WORKSPACE_ROOT + WOGI_REPO_NAME env vars. For defense in depth,
482
+ // confirm that `.workspace/` exists inside WOGI_WORKSPACE_ROOT before
483
+ // treating env-var signals as authoritative — this narrows the window in
484
+ // which a misconfigured single-repo session could be misidentified as a
485
+ // worker and auto-push without confirmation.
486
+ if (isWorker() && isValidWorkspaceRoot()) {
487
+ const autoPush = getConfigValue('sessionEnd.autoPushInWorker', true);
488
+ if (!autoPush) {
489
+ console.log(color('dim', '⊘ Push skipped (worker mode — autoPushInWorker: false)'));
490
+ return;
491
+ }
492
+ try {
493
+ execSync('git push', { stdio: 'inherit' });
494
+ success('Auto-pushed (worker mode)');
495
+ } catch (err) {
496
+ warn(`Auto-push failed: ${err.message}`);
497
+ }
498
+ return;
499
+ }
460
500
 
501
+ try {
502
+ const confirm = await prompt('Push to remote? (y/N) ');
461
503
  if (confirm.toLowerCase() === 'y') {
462
504
  execSync('git push', { stdio: 'inherit' });
463
505
  success('Pushed to remote');
464
506
  }
465
507
  } catch (_err) {
466
- // No remote configured, skip
508
+ // Prompt or push failed, skip
509
+ }
510
+ }
511
+
512
+ function isValidWorkspaceRoot() {
513
+ const root = process.env.WOGI_WORKSPACE_ROOT;
514
+ if (!root || !path.isAbsolute(root)) return false;
515
+ try {
516
+ return fs.existsSync(path.join(root, '.workspace'));
517
+ } catch (_err) {
518
+ return false;
467
519
  }
468
520
  }
469
521
 
@@ -596,9 +648,18 @@ function writeWorkspaceSessionEndMessage() {
596
648
  const workspaceRoot = process.env.WOGI_WORKSPACE_ROOT;
597
649
  if (!workspaceRoot) return;
598
650
  const repo = process.env.WOGI_REPO_NAME;
599
- // Only manager-mode sessions emit this signal. Workers use their own
600
- // Stop-hook worker-stopped message (see lib/workspace-messages.js).
601
- if (repo && repo !== 'manager') return;
651
+ // Only emit this signal from EXPLICIT manager-mode sessions.
652
+ // v2.25.1 (M2 from Waves 1-3 review): tightened to require
653
+ // WOGI_REPO_NAME === 'manager' explicitly. Previously we let
654
+ // unset-repo sessions fall through, which could emit a spurious
655
+ // "manager session ended" broadcast from a mis-env'd worker shell.
656
+ // Workers use their own Stop-hook worker-stopped message.
657
+ if (repo !== 'manager') {
658
+ if (repo && process.env.DEBUG) {
659
+ console.error(`[session-end] Skipping workspace message — WOGI_REPO_NAME is '${repo}', not 'manager'`);
660
+ }
661
+ return;
662
+ }
602
663
 
603
664
  try {
604
665
  const messagesLib = path.resolve(__dirname, '..', 'lib', 'workspace-messages.js');
@@ -164,6 +164,28 @@ function transitionPhase(from, to, taskId) {
164
164
  return false;
165
165
  }
166
166
 
167
+ // Research-evidence gate: transitions into proposal phases (spec_review,
168
+ // coding) require a minimum evidence fingerprint. Fail-open if the gate module
169
+ // is absent. Prints the block message to stderr so flow-phase.js CLI
170
+ // surfaces it to the AI invoking the transition.
171
+ if (to === 'spec_review' || to === 'coding') {
172
+ try {
173
+ const { checkPhaseTransitionEvidence } = require('./research-evidence-gate');
174
+ let cfg = null;
175
+ try {
176
+ const { getConfig } = require('../../flow-utils');
177
+ cfg = getConfig();
178
+ } catch (_err) { /* fail-open on config error */ }
179
+ const result = checkPhaseTransitionEvidence(from, to, cfg);
180
+ if (result.blocked) {
181
+ console.error(result.message);
182
+ return false;
183
+ }
184
+ } catch (_err) {
185
+ // Gate not installed — fail-open
186
+ }
187
+ }
188
+
167
189
  return writePhaseState({
168
190
  phase: to,
169
191
  taskId: taskId || current.taskId,
@@ -167,6 +167,21 @@ function handlePostCompact() {
167
167
  }
168
168
  }
169
169
 
170
+ // Clear research-evidence fingerprint after compaction — the AI has fresh
171
+ // context, so claims of "already read X" in the previous context no longer
172
+ // apply. The gate must force re-reading in the new context.
173
+ try {
174
+ const { clearResearchEvidence } = require('./research-evidence-gate');
175
+ clearResearchEvidence();
176
+ if (process.env.DEBUG) {
177
+ console.error('[post-compact] Research-evidence cleared');
178
+ }
179
+ } catch (err) {
180
+ if (process.env.DEBUG) {
181
+ console.error(`[post-compact] Research-evidence clear failed: ${err.message}`);
182
+ }
183
+ }
184
+
170
185
  // 3. Re-set routing-pending flag
171
186
  // After compaction, the AI has fresh context and may try to act without routing.
172
187
  // Setting routing-pending ensures the next tool use goes through routing checks.
@@ -80,6 +80,9 @@ function runPreToolGates(ctx, deps) {
80
80
  // Phase-read recording (side effect)
81
81
  if (toolName === 'Read' && filePath) {
82
82
  try { deps.recordPhaseRead(filePath); } catch (_err) { /* fail-open */ }
83
+ if (deps.recordEvidenceRead) {
84
+ try { deps.recordEvidenceRead(filePath); } catch (_err) { /* fail-open */ }
85
+ }
83
86
  }
84
87
 
85
88
  // Phase gate
@@ -114,6 +117,24 @@ function runPreToolGates(ctx, deps) {
114
117
  }
115
118
  }
116
119
 
120
+ // Research-evidence gate (spec-write): blocks Edit/Write to proposal paths
121
+ // when the AI has not read enough evidence files this task turn.
122
+ if ((toolName === 'Edit' || toolName === 'Write') && deps.checkSpecWriteGate) {
123
+ try {
124
+ const specResult = deps.checkSpecWriteGate(filePath, config);
125
+ if (specResult.blocked) {
126
+ return {
127
+ allowed: false,
128
+ blocked: true,
129
+ reason: 'Research-evidence gate: insufficient research before proposal',
130
+ message: specResult.message,
131
+ };
132
+ }
133
+ } catch (_err) {
134
+ if (process.env.DEBUG) console.error(`[Hook] Research-evidence gate error (fail-open): ${_err.message}`);
135
+ }
136
+ }
137
+
117
138
  // Scope gate (Edit/Write only)
118
139
  if (toolName === 'Edit' || toolName === 'Write') {
119
140
  coreResult = deps.checkScopeGate({ filePath, operation: toolName.toLowerCase() }, config);
@@ -133,6 +154,9 @@ function runPreToolGates(ctx, deps) {
133
154
  if (typeof skillName === 'string' && /^wogi-(bulk|start)$/i.test(skillName)) {
134
155
  deps.markSkillPending(skillName.toLowerCase(), { args: toolInput.args });
135
156
  try { deps.clearPhaseReads(); } catch (_err) { /* fail-open */ }
157
+ if (deps.clearResearchEvidence) {
158
+ try { deps.clearResearchEvidence(); } catch (_err) { /* fail-open */ }
159
+ }
136
160
  if (process.env.DEBUG) {
137
161
  console.error(`[Hook] Marked skill ${skillName} as pending (via Skill tool)`);
138
162
  }
@@ -0,0 +1,280 @@
1
+ #!/usr/bin/env node
2
+
3
+ /**
4
+ * Wogi Flow - Research-Evidence Gate (Core Module)
5
+ *
6
+ * Enforces the "Research Before Propose" methodology rule by tracking which
7
+ * state/spec/epic files the AI has Read in the current task turn, and blocking
8
+ * proposal actions (spec writes, channel-dispatch to workers) until a minimum
9
+ * evidence threshold has been reached.
10
+ *
11
+ * Does NOT block AskUserQuestion, plain Read/Glob/Grep/WebSearch, conversational
12
+ * text, or non-proposal edits. Asking the user is a valid escape hatch.
13
+ *
14
+ * State file: .workflow/state/research-evidence.json
15
+ * Fail-open: If state file is missing/corrupt or config disabled, allow the tool call.
16
+ *
17
+ * Entry points:
18
+ * recordEvidenceRead(filePath) — called when Read targets an evidence file
19
+ * checkSpecWriteGate(filePath, config) — called before Edit/Write to proposal paths
20
+ * checkDispatchEvidenceGate(config) — called before manager channel-dispatch
21
+ * clearResearchEvidence() — called on new task start / session end / post-compact
22
+ */
23
+
24
+ const path = require('node:path');
25
+ const fs = require('node:fs');
26
+ const { PATHS, safeJsonParse } = require('../../flow-utils');
27
+
28
+ const EVIDENCE_FILE = path.join(PATHS.state, 'research-evidence.json');
29
+
30
+ // Relative-to-project path prefixes that count as evidence when Read.
31
+ // Any file whose project-relative path starts with one of these prefixes
32
+ // increments the evidence counter.
33
+ const EVIDENCE_PREFIXES = [
34
+ '.workflow/state/',
35
+ '.workflow/changes/',
36
+ '.workflow/specs/',
37
+ '.workflow/epics/'
38
+ ];
39
+
40
+ // Path prefixes that trigger the spec-write gate when targeted by Edit/Write.
41
+ // Writing to these paths = "proposing a spec" = must have evidence first.
42
+ const PROPOSAL_PREFIXES = [
43
+ '.workflow/changes/',
44
+ '.workflow/specs/',
45
+ '.workflow/epics/'
46
+ ];
47
+
48
+ // Default threshold: minimum number of distinct evidence-file reads required
49
+ // before a proposal action is allowed. Can be overridden by config.
50
+ const DEFAULT_MIN_EVIDENCE = 2;
51
+
52
+ function toProjectRelative(filePath) {
53
+ try {
54
+ // Canonicalize both sides via realpath to prevent symlink escape
55
+ // (SEC-003): a symlink in PATHS.root or the input path could make a
56
+ // file outside the project appear inside after a plain path.relative.
57
+ let rootCanon = PATHS.root;
58
+ let targetCanon = path.resolve(filePath);
59
+ try { rootCanon = fs.realpathSync(PATHS.root); } catch (_err) { /* root may not exist mid-test */ }
60
+ try { targetCanon = fs.realpathSync(targetCanon); } catch (_err) { /* target may not exist yet */ }
61
+ const rel = path.relative(rootCanon, targetCanon);
62
+ return rel.split(path.sep).join('/');
63
+ } catch (_err) {
64
+ return null;
65
+ }
66
+ }
67
+
68
+ function matchesPrefix(relPath, prefixes) {
69
+ if (!relPath || relPath.startsWith('..')) return false;
70
+ return prefixes.some(p => relPath.startsWith(p));
71
+ }
72
+
73
+ /**
74
+ * Record that an evidence file was read. Called from PreToolUse on Read.
75
+ * De-duplicates: reading the same file twice still counts as 1.
76
+ *
77
+ * Write strategy (CL-001): atomic temp-file + rename. A concurrent tool call
78
+ * that loses the read-modify-write race will at worst lose one evidence entry,
79
+ * which causes a false-block that the user resolves by reading one more file.
80
+ * The atomic rename prevents partial writes on crash.
81
+ */
82
+ function recordEvidenceRead(filePath) {
83
+ if (!filePath || typeof filePath !== 'string') return;
84
+
85
+ const rel = toProjectRelative(filePath);
86
+ if (!matchesPrefix(rel, EVIDENCE_PREFIXES)) return;
87
+
88
+ try {
89
+ const existing = safeJsonParse(EVIDENCE_FILE, {});
90
+ if (!existing.reads || typeof existing.reads !== 'object') existing.reads = {};
91
+ if (!existing.reads[rel]) {
92
+ existing.reads[rel] = { at: new Date().toISOString() };
93
+ const tmp = `${EVIDENCE_FILE}.${process.pid}.tmp`;
94
+ fs.writeFileSync(tmp, JSON.stringify(existing, null, 2));
95
+ fs.renameSync(tmp, EVIDENCE_FILE);
96
+ if (process.env.DEBUG) {
97
+ console.error(`[ResearchEvidenceGate] Recorded evidence read: ${rel}`);
98
+ }
99
+ }
100
+ } catch (err) {
101
+ if (process.env.DEBUG) {
102
+ console.error(`[ResearchEvidenceGate] Failed to record read: ${err.message}`);
103
+ }
104
+ }
105
+ }
106
+
107
+ function getEvidenceCount() {
108
+ try {
109
+ const data = safeJsonParse(EVIDENCE_FILE, {});
110
+ if (!data.reads || typeof data.reads !== 'object') return 0;
111
+ return Object.keys(data.reads).length;
112
+ } catch (_err) {
113
+ return 0;
114
+ }
115
+ }
116
+
117
+ function isGateEnabled(config) {
118
+ const gateCfg = config?.hooks?.rules?.researchEvidenceGate;
119
+ if (gateCfg === undefined || gateCfg === null) return true;
120
+ if (gateCfg === false) return false;
121
+ if (typeof gateCfg === 'object' && gateCfg.enabled === false) return false;
122
+ return true;
123
+ }
124
+
125
+ function getMinEvidence(config) {
126
+ const v = config?.hooks?.rules?.researchEvidenceGate?.minEvidence;
127
+ if (typeof v === 'number' && v >= 0 && Number.isFinite(v)) return v;
128
+ return DEFAULT_MIN_EVIDENCE;
129
+ }
130
+
131
+ /**
132
+ * Block Edit/Write to a proposal path when evidence fingerprint is below threshold.
133
+ * Called from pre-tool-orchestrator before Edit/Write runs.
134
+ *
135
+ * @param {string} filePath - Path being written/edited
136
+ * @param {Object} config
137
+ * @returns {{ blocked: boolean, message?: string }}
138
+ */
139
+ function checkSpecWriteGate(filePath, config) {
140
+ try {
141
+ if (!isGateEnabled(config)) return { blocked: false };
142
+ if (!filePath || typeof filePath !== 'string') return { blocked: false };
143
+
144
+ const rel = toProjectRelative(filePath);
145
+ if (!matchesPrefix(rel, PROPOSAL_PREFIXES)) return { blocked: false };
146
+
147
+ const minEvidence = getMinEvidence(config);
148
+ const count = getEvidenceCount();
149
+ if (count >= minEvidence) return { blocked: false };
150
+
151
+ return {
152
+ blocked: true,
153
+ message:
154
+ `Research-before-propose: this writes a spec/change/epic (${rel}), but you have only ` +
155
+ `read ${count} evidence file(s) this task turn. Minimum required: ${minEvidence}.\n\n` +
156
+ `Before proposing, read relevant files from:\n` +
157
+ ` .workflow/state/decisions.md, feedback-patterns.md, app-map.md, function-map.md, api-map.md\n` +
158
+ ` the task spec (.workflow/changes/<taskId>.md or .workflow/specs/<id>.md)\n` +
159
+ ` .workflow/epics/ if this task belongs to an epic\n\n` +
160
+ `If you genuinely need clarification before proposing, use AskUserQuestion — that is allowed.\n` +
161
+ `The rule is "don't propose before researching," not "never ask."`
162
+ };
163
+ } catch (err) {
164
+ if (process.env.DEBUG) {
165
+ console.error(`[ResearchEvidenceGate] Spec-write gate error (fail-open): ${err.message}`);
166
+ }
167
+ return { blocked: false };
168
+ }
169
+ }
170
+
171
+ /**
172
+ * Block channel-dispatch to a worker when the manager has no evidence of
173
+ * having researched the target's state. Called from lib/workspace-routing.js
174
+ * before dispatchToChannel posts.
175
+ *
176
+ * @param {Object} config
177
+ * @returns {{ blocked: boolean, message?: string }}
178
+ */
179
+ function checkDispatchEvidenceGate(config) {
180
+ try {
181
+ if (!isGateEnabled(config)) return { blocked: false };
182
+
183
+ // Manager-mode only: workers don't dispatch. Single-repo sessions
184
+ // (no WOGI_WORKSPACE_ROOT) are n/a — there are no workers to dispatch
185
+ // to, so the evidence requirement does not apply (CL-003).
186
+ const workspaceRoot = process.env.WOGI_WORKSPACE_ROOT;
187
+ const repo = process.env.WOGI_REPO_NAME;
188
+ const isManager = !!workspaceRoot && (!repo || repo === 'manager');
189
+ if (!isManager) return { blocked: false };
190
+
191
+ const minEvidence = getMinEvidence(config);
192
+ const count = getEvidenceCount();
193
+ if (count >= minEvidence) return { blocked: false };
194
+
195
+ return {
196
+ blocked: true,
197
+ message:
198
+ `Research-before-dispatch: dispatching to a worker proposes work, but the manager has only ` +
199
+ `read ${count} evidence file(s) this turn. Minimum required: ${minEvidence}.\n\n` +
200
+ `Before dispatching, read relevant state from the target member repo:\n` +
201
+ ` <member-repo>/.workflow/state/decisions.md, app-map.md, feedback-patterns.md\n` +
202
+ ` <member-repo>/.workflow/changes/ for existing task specs\n\n` +
203
+ `Silent workers that receive poorly-specified work cost the most to recover. ` +
204
+ `The Wogi Hub manager incident that prompted this rule dispatched Employee-class ` +
205
+ `clarifying-question work without reading the existing class system.`
206
+ };
207
+ } catch (err) {
208
+ if (process.env.DEBUG) {
209
+ console.error(`[ResearchEvidenceGate] Dispatch gate error (fail-open): ${err.message}`);
210
+ }
211
+ return { blocked: false };
212
+ }
213
+ }
214
+
215
+ /**
216
+ * Block phase transitions into proposal phases (spec_review, coding) when
217
+ * the AI has not read enough evidence files this task turn. Called from
218
+ * phase-gate.js transitionPhase() when the target is a proposal phase.
219
+ *
220
+ * @param {string} from
221
+ * @param {string} to
222
+ * @param {Object} config
223
+ * @returns {{ blocked: boolean, message?: string }}
224
+ */
225
+ function checkPhaseTransitionEvidence(from, to, config) {
226
+ try {
227
+ if (!isGateEnabled(config)) return { blocked: false };
228
+ if (to !== 'spec_review' && to !== 'coding') return { blocked: false };
229
+
230
+ const minEvidence = getMinEvidence(config);
231
+ const count = getEvidenceCount();
232
+ if (count >= minEvidence) return { blocked: false };
233
+
234
+ return {
235
+ blocked: true,
236
+ message:
237
+ `Research-before-propose: transitioning to "${to}" requires ${minEvidence} evidence ` +
238
+ `file read(s) this task turn; you have ${count}.\n\n` +
239
+ `Before transitioning to a proposal phase, read relevant files from:\n` +
240
+ ` .workflow/state/decisions.md, feedback-patterns.md, app-map.md, function-map.md\n` +
241
+ ` the task spec (.workflow/changes/<taskId>.md or .workflow/specs/<id>.md)\n` +
242
+ ` .workflow/epics/ if this task belongs to an epic\n\n` +
243
+ `AskUserQuestion is not blocked — ask if clarification is genuinely needed.`
244
+ };
245
+ } catch (err) {
246
+ if (process.env.DEBUG) {
247
+ console.error(`[ResearchEvidenceGate] Phase-transition gate error (fail-open): ${err.message}`);
248
+ }
249
+ return { blocked: false };
250
+ }
251
+ }
252
+
253
+ /**
254
+ * Clear evidence state. Called at:
255
+ * - New task start (pre-tool-use.js Skill hook for wogi-start)
256
+ * - Session end
257
+ * - Post-compact (forces re-read in new context)
258
+ */
259
+ function clearResearchEvidence() {
260
+ try {
261
+ fs.writeFileSync(EVIDENCE_FILE, JSON.stringify({ reads: {} }, null, 2));
262
+ } catch (err) {
263
+ if (process.env.DEBUG) {
264
+ console.error(`[ResearchEvidenceGate] Failed to clear evidence: ${err.message}`);
265
+ }
266
+ }
267
+ }
268
+
269
+ module.exports = {
270
+ recordEvidenceRead,
271
+ checkSpecWriteGate,
272
+ checkDispatchEvidenceGate,
273
+ checkPhaseTransitionEvidence,
274
+ clearResearchEvidence,
275
+ getEvidenceCount,
276
+ EVIDENCE_FILE,
277
+ EVIDENCE_PREFIXES,
278
+ PROPOSAL_PREFIXES,
279
+ DEFAULT_MIN_EVIDENCE
280
+ };
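The write strategy described for `recordEvidenceRead` (de-duplicated reads, temp file plus rename) can be sketched on its own. This is an illustration under assumptions, not the module itself: `writeJsonAtomic` and `recordRead` are hypothetical names, the state file lives in a throwaway temp directory, and `JSON.parse` stands in for the package's `safeJsonParse` helper.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Sketch only: `writeJsonAtomic` and `recordRead` are hypothetical names.
function writeJsonAtomic(file, data) {
  // Write to a per-process temp file, then rename over the target so a crash
  // mid-write does not leave a truncated state file behind.
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, file);
}

function recordRead(file, relPath) {
  let state = {};
  try {
    state = JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch (_err) { /* missing or corrupt file: start from an empty state */ }
  if (!state || typeof state !== 'object') state = {};
  if (!state.reads || typeof state.reads !== 'object') state.reads = {};
  if (!state.reads[relPath]) {
    // Only the first read of a given path is recorded.
    state.reads[relPath] = { at: new Date().toISOString() };
    writeJsonAtomic(file, state);
  }
  return Object.keys(state.reads).length;
}

// Throwaway directory and made-up paths for illustration.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'evidence-'));
const file = path.join(dir, 'research-evidence.json');
const counts = [
  recordRead(file, '.workflow/state/decisions.md'),
  recordRead(file, '.workflow/state/decisions.md'), // duplicate, count unchanged
  recordRead(file, '.workflow/specs/example.md'),
];
console.log(counts); // [1, 1, 2]
fs.rmSync(dir, { recursive: true, force: true });
```

The rename protects against partial writes but not against two processes interleaving their read-modify-write cycles, which matches the trade-off the module's own comment accepts: a lost entry at worst causes one extra read.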
@@ -110,6 +110,13 @@ function handleSessionEnd(input) {
110
110
  // Non-critical — phase-read gate may not be installed
111
111
  }
112
112
 
113
+ try {
114
+ const { clearResearchEvidence } = require('./research-evidence-gate');
115
+ clearResearchEvidence();
116
+ } catch (_err) {
117
+ // Non-critical — research-evidence gate may not be installed
118
+ }
119
+
113
120
  // State folder hygiene — clean stale/orphan files (fire-and-forget)
114
121
  try {
115
122
  const hygiene = cleanStaleFiles();
@@ -35,6 +35,20 @@ try {
35
35
  clearPhaseReads = prg.clearPhaseReads;
36
36
  } catch (_err) { if (process.env.DEBUG) console.error(`[Hook] Phase-read gate not loaded: ${_err.message}`); }
37
37
 
38
+ let recordEvidenceRead = () => {}, checkSpecWriteGate = () => ({ blocked: false }), clearResearchEvidence = () => {};
39
+ try {
40
+ const reg = require('../../core/research-evidence-gate');
41
+ recordEvidenceRead = reg.recordEvidenceRead;
42
+ checkSpecWriteGate = reg.checkSpecWriteGate;
43
+ clearResearchEvidence = reg.clearResearchEvidence;
44
+ } catch (err) {
45
+ // CL-004: load failure for a gate file that SHOULD be present is a
46
+ // deployment issue worth surfacing even without DEBUG set. Silently
47
+ // shimming masks broken installs. Preserve fail-open (shims above)
48
+ // so the hook pipeline still works, but log to stderr so operators see it.
49
+ console.error(`[Hook] WARNING: Research-evidence gate failed to load — gate is disabled. ${err.message}`);
50
+ }
51
+
38
52
  const _noop = () => ({ allowed: true, blocked: false });
39
53
  let checkDeployGate = _noop, checkWriteBlock = _noop;
40
54
  try { const dg = require('../../core/deploy-gate'); checkDeployGate = dg.checkDeployGate; checkWriteBlock = dg.checkWriteBlock; } catch (_err) { if (process.env.DEBUG) console.error(`[Hook] Deploy gate not loaded: ${_err.message}`); }
@@ -84,6 +98,7 @@ runHook('PreToolUse', async ({ input, parsedInput }) => {
84
98
  checkRoutingGate, clearRoutingPending, hasActiveTask,
85
99
  checkPhaseGate, checkCommitLogGate,
86
100
  recordPhaseRead, checkPhaseReadGate, clearPhaseReads,
101
+ recordEvidenceRead, checkSpecWriteGate, clearResearchEvidence,
87
102
  checkDeployGate, checkWriteBlock,
88
103
  checkStrikeGate, checkBugfixScope, checkScopeMutation,
89
104
  checkGitSafety, checkManagerBoundary, checkWorkerBoundary,
@@ -1,7 +0,0 @@
1
- {
2
- "taskId": null,
3
- "uniqueFiles": [],
4
- "thresholdReached": false,
5
- "scopeInventory": null,
6
- "warnedAt": null
7
- }
@@ -1,3 +0,0 @@
1
- {
2
- "deploys": []
3
- }
@@ -1,4 +0,0 @@
1
- {
2
- "routes": [],
3
- "lastUpdated": null
4
- }
@@ -1,3 +0,0 @@
1
- {
2
- "tasks": {}
3
- }