npm - wogiflow - Versions diffs - 2.29.4 → 2.29.6 - Mend

wogiflow 2.29.4 → 2.29.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/.workflow/templates/partials/methodology-rules.hbs +74 -0
package/README.md +1 -1
package/lib/wogi-claude +34 -3
package/lib/wogi-claude-expect.exp +30 -5
package/package.json +2 -2
package/scripts/flow-defer-auth.js +103 -0
package/scripts/flow-utils.js +52 -0
package/scripts/hooks/core/deferral-classifier.js +129 -0
package/scripts/hooks/core/deferral-gate.js +379 -0
package/scripts/hooks/core/deletion-log.js +426 -0
package/scripts/hooks/core/pre-tool-orchestrator.js +58 -0
package/scripts/hooks/core/research-evidence-gate.js +11 -1
package/scripts/hooks/core/research-required-classifier.js +205 -0
package/scripts/hooks/core/research-required-gate.js +235 -0
package/scripts/hooks/core/session-context.js +21 -0
package/scripts/hooks/core/task-boundary-reset.js +132 -1
package/scripts/hooks/entry/claude-code/post-tool-use.js +34 -0
package/scripts/hooks/entry/claude-code/stop.js +26 -0
package/scripts/hooks/entry/claude-code/user-prompt-submit.js +39 -0

package/.workflow/templates/partials/methodology-rules.hbs CHANGED

@@ -213,3 +213,77 @@ The user can dump N items, say "go until you finish" / "autonomous mode" / "run
 **Exit conditions**: ready queue drains, user types "stop"/"pause", or fatal error. On exit, render the completion summary (terminal block + JSON payload at `.workflow/state/autonomous-run-summary-<runId>.json`) and clear the flag.
 Enforced by: `flow-autonomous-detector.js`, `flow-question-queue.js`, `flow-decision-authority.js` (autonomous param + `queue-for-review` + `adversary-loop` buckets), `flow-completion-summary.js`, and the SessionStart context injection in `scripts/hooks/core/session-context.js`.
+---
+### Mechanical Deferral Authorization Gate (wf-f9912af6)
+The textual "Review-Findings Anti-Deferral" rule above (incident-driven 2026-04-15) is enforced mechanically by the deferral gate. The AI cannot silently mark review/audit findings as `status: deferred*` without explicit user authorization — the PreToolUse hook intercepts every Write/Edit/Bash that targets `.workflow/state/last-review.json` or `.workflow/state/last-audit.json` and BLOCKS the write when:
+1. The new content introduces one or more findings whose `status` matches `/^deferred(?:[-_].*)?$|^wont-?fix$|^skipped$/i`, AND
+2. No valid authorization marker exists at `.workflow/state/deferral-authorization.json`, AND
+3. The `no-defer-pin.json` is not active (a pin overrides any auth — set when the user says "fix everything" / "no deferrals" / "I don't want tech debt").
+**Authorization sources** (one of):
+- **User-prompt classifier** (`scripts/hooks/core/deferral-classifier.js`): regex-detects explicit defer phrases in UserPromptSubmit messages — "defer X", "fix critical only", "ship as-is", "option 2"/"option 4" from the /wogi-review menu, etc. Writes auth marker with TTL 10 min by default.
+- **Explicit CLI**: `node scripts/flow-defer-auth.js grant --scope=all --reason="<verbatim user phrase>"` (or `--findings=F5,F6,...`). Used when the AI needs to record explicit authorization (e.g., user picked option 4).
+**Negative intent overrides positive**: phrases like "fix everything", "no deferrals", "don't defer", "I don't want tech debt" delete any existing auth and write a `no-defer-pin.json` that hard-blocks deferrals for ~30 minutes.
+**Bash-mutating commands** that write to the target files AND mention `deferred|wont-fix|skipped|dismissed` are blocked when no auth is active — this catches `node -e "fs.writeFileSync('.workflow/state/last-review.json', ...)"` patterns that bypass Write/Edit. Reads (`cat`/`jq`/`grep`) are not blocked.
+**Audit trail**: every blocked attempt logs to `.workflow/state/deferral-block-log.json` (last 100 entries) for telemetry.
+**Why mechanical enforcement matters:** the textual rule has been violated multiple times in incidents — the AI decides "low risk / can wait / pre-existing" and writes `status: deferred` to last-review.json based on its own judgment. The gate makes this structurally impossible without the user's word.
+**Anti-rationalization** (if any of these thoughts cross your mind, you are about to violate the gate):
+- *"This finding is pre-existing, not introduced by my changes"* → WRONG. Pre-existing is a reason to fix it now (continuous improvement) or to surface it to the user with an explicit "ship / fix / defer" question, not to silently `status: deferred-pre-existing`.
+- *"This is LOW severity, the user won't care"* → WRONG. Severity is the user's call, not yours.
+- *"The adversary already verified it's not a real bug"* → WRONG. If it's not a bug, mark it `dismissed-not-a-bug` only AFTER the user confirms; otherwise leave it `open`.
+- *"I'll batch deferrals into the next review cycle"* → WRONG. There is no "next cycle" — the user reads the findings now.
+Config: `deferralGate.{enabled,authTtlSeconds,classifyUserPrompts}` in `.workflow/config.json` (defaults: true / 600 / true).
+Enforced by: `scripts/hooks/core/deferral-gate.js` (core), `scripts/hooks/core/deferral-classifier.js` (intent detection), `scripts/flow-defer-auth.js` (CLI), wired into `scripts/hooks/core/pre-tool-orchestrator.js` (PreToolUse) and `scripts/hooks/entry/claude-code/user-prompt-submit.js` (UserPromptSubmit).
+---
+### Mechanical Research-Required Gate (wf-5cd71b1f)
+The textual rules in CLAUDE.md ("Research Before Propose," Tier 2/3 routing protocol) say the AI must read evidence before answering diagnostic questions. The research-required gate makes this mechanical: it intercepts diagnostic prompts at UserPromptSubmit and re-prompts the AI at Stop hook if the assistant turn produced text without enough Read calls against evidence paths.
+**How it works**:
+1. **UserPromptSubmit classifier** (`scripts/hooks/core/research-required-classifier.js`): regex-classifies each prompt into `command` / `factual` / `diagnostic` / `none`.
+   - `command` — task IDs, action imperatives ("add X"), follow-ups ("yes", "continue", "option N"), AI's own slash commands
+   - `factual` — Tier 1 markers ("what is", "where is", "show me", "list all")
+   - `diagnostic` — Tier 2/3 markers ("why", "should I", "what do you think", "is this correct", "explain why", "did you fix")
+   - On `diagnostic`: writes `.workflow/state/research-required-this-turn.json` with `{requiredEvidence: 2, attemptCount: 0, classifiedAt}`.
+2. **Override**: prompt prefix `!` skips the gate entirely. For when the user knows their question is conversational and doesn't need evidence reading.
+3. **Stop-hook gate** (`scripts/hooks/core/research-required-gate.js`): if marker exists, parses the JSONL transcript for the current turn (since the most recent user entry), counts:
+   - `Read` tool calls where `file_path` matches an evidence prefix
+   - `Bash` tool calls where the command starts with `cat|head|tail|grep|rg|jq|less|view|awk|sed` and targets an evidence-prefix path
+   - `Glob`/`Grep` tool calls (any pattern counts)
+4. **If count < requiredEvidence**:
+   - Increments `attemptCount` in the marker
+   - Returns `{continue: true, stopReason: <violation message>}` — Claude Code re-prompts the AI with the message; the AI must redo the turn with reads
+   - After `maxAttempts` (default 3): returns `{continue: false, stopReason: <hard-stop message>}` — visible to the user, marker cleared
+5. **If count ≥ requiredEvidence**: marker is consumed (deleted), Stop proceeds normally.
+**Evidence prefixes** (shared with `research-evidence-gate.js`): `.workflow/state/`, `.workflow/changes/`, `.workflow/specs/`, `.workflow/epics/`, `lib/`, `scripts/`, `src/`, `tests/`, `app/`. Reading code in answer to "why does X happen" is the legitimate path.
+**Why mechanical enforcement matters**: the textual Tier 2/3 protocol relies on the AI self-classifying its own question's complexity, which is the rubber-stamp pattern. The gate uses structural markers + Stop-hook redo loop — same proven architecture as `worker-tool-first-gate.js` G1/G4. The AI cannot bypass: UserPromptSubmit fires on every user message, Stop fires on every assistant turn end, and `{continue: true, stopReason}` is honored by Claude Code as a forced redo.
+**Anti-rationalization**:
+- *"I already know the answer from context"* → WRONG. Confidence is not evidence. The gate fires on the question's structure, not your perceived certainty.
+- *"This question is conversational, doesn't need code reading"* → WRONG. If you genuinely believe that, the user can prefix `!` next time. Within a turn, the gate is final.
+- *"I'll cite the evidence in my next answer instead of reading it now"* → WRONG. Citations require reads in the same turn. The transcript proves it.
+Config: `researchRequiredGate.{enabled,requiredEvidence,maxAttempts}` in `.workflow/config.json` (defaults: true / 2 / 3). The override prefix `!` is hard-coded.
+Enforced by: `scripts/hooks/core/research-required-classifier.js` (UserPromptSubmit), `scripts/hooks/core/research-required-gate.js` (Stop), wired into `scripts/hooks/entry/claude-code/user-prompt-submit.js` and `scripts/hooks/entry/claude-code/stop.js`.
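
The blocking decision described in the Mechanical Deferral Authorization Gate section can be sketched roughly as follows. This is a minimal illustration, not the code in `deferral-gate.js`: the status pattern is copied from the rule text, but the marker shapes (`scope`, `expiresAt`), the `id` field on findings, and the names `isLive` and `shouldBlockWrite` are hypothetical. The sketch reads an active no-defer pin as blocking even when an authorization exists, following the "a pin overrides any auth" wording.

```javascript
// Status pattern copied from the rule text in the hunk above.
const DEFERRAL_STATUS = /^deferred(?:[-_].*)?$|^wont-?fix$|^skipped$/i;

// Hypothetical marker shapes, for illustration only:
//   auth: { scope: 'all' | string | string[], expiresAt: <ms epoch> } or null
//   pin:  { expiresAt: <ms epoch> } or null
function isLive(marker, now) {
  return Boolean(marker) && typeof marker.expiresAt === 'number' && marker.expiresAt > now;
}

// `findings` is assumed to hold only the findings newly introduced by the write.
function shouldBlockWrite({ findings, auth, pin, now = Date.now() }) {
  const deferred = findings.filter(f => DEFERRAL_STATUS.test(String(f.status)));
  if (deferred.length === 0) return { block: false, reason: 'no-deferral-status' };
  // Assumption: an active pin wins over any authorization.
  if (isLive(pin, now)) return { block: true, reason: 'no-defer-pin-active' };
  if (!isLive(auth, now)) return { block: true, reason: 'no-valid-authorization' };
  if (auth.scope === 'all') return { block: false, reason: 'authorized-all' };
  // A scoped authorization only covers the listed finding ids.
  const allowed = new Set([].concat(auth.scope));
  const uncovered = deferred.filter(f => !allowed.has(f.id));
  return uncovered.length === 0
    ? { block: false, reason: 'authorized-scope' }
    : { block: true, reason: 'outside-authorized-scope' };
}

const now = 1000000;
const findings = [
  { id: 'F5', status: 'deferred-pre-existing' },
  { id: 'F6', status: 'open' },
];
console.log(shouldBlockWrite({ findings, auth: null, pin: null, now }));
console.log(shouldBlockWrite({ findings, auth: { scope: ['F5'], expiresAt: now + 1 }, pin: null, now }));
```

Under these assumptions the first call blocks (no authorization) and the second allows the write, because the only deferred finding is inside the authorized scope.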

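The Stop-hook steps of the Mechanical Research-Required Gate (count evidence reads, then proceed, redo, or hard-stop) could look something like the sketch below. The evidence prefixes and the read-command list are taken from the rule text; everything else is an assumption: the simplified `{ name, input }` tool-call shape, the helper names, the whitespace-based path matching, and the choice to hard-stop once the incremented `attemptCount` reaches `maxAttempts`. The real gate parses the JSONL transcript and returns Claude Code hook output, which is not modeled here.

```javascript
// Prefixes and read commands as listed in the rule text.
const EVIDENCE_PREFIXES = [
  '.workflow/state/', '.workflow/changes/', '.workflow/specs/', '.workflow/epics/',
  'lib/', 'scripts/', 'src/', 'tests/', 'app/',
];
const READ_COMMANDS = /^(?:cat|head|tail|grep|rg|jq|less|view|awk|sed)\b/;

// Simplification: only relative paths (optionally starting with "./") are recognized.
function isEvidencePath(p) {
  const rel = String(p).replace(/^\.\//, '');
  return EVIDENCE_PREFIXES.some(prefix => rel.startsWith(prefix));
}

// toolCalls: hypothetical simplified shape [{ name, input }], one entry per tool call in the turn.
function countEvidence(toolCalls) {
  let count = 0;
  for (const { name, input = {} } of toolCalls) {
    if (name === 'Read' && isEvidencePath(input.file_path)) {
      count += 1;
    } else if (name === 'Glob' || name === 'Grep') {
      count += 1; // any pattern counts
    } else if (name === 'Bash') {
      const cmd = String(input.command || '').trim();
      const targetsEvidence = cmd.split(/\s+/).slice(1).some(isEvidencePath);
      if (READ_COMMANDS.test(cmd) && targetsEvidence) count += 1;
    }
  }
  return count;
}

// Returns the next action and the marker to persist (null means the marker is cleared).
function decide(marker, evidenceCount, maxAttempts = 3) {
  if (evidenceCount >= marker.requiredEvidence) return { action: 'proceed', marker: null };
  const attemptCount = marker.attemptCount + 1;
  if (attemptCount >= maxAttempts) return { action: 'hard-stop', marker: null };
  return { action: 'redo', marker: { ...marker, attemptCount } };
}

const calls = [
  { name: 'Read', input: { file_path: 'scripts/flow-utils.js' } },
  { name: 'Bash', input: { command: 'ls scripts/' } },
];
const marker = { requiredEvidence: 2, attemptCount: 0 };
console.log(decide(marker, countEvidence(calls)));
```

In this example only the `Read` call counts (`ls` is not in the read-command list), so one piece of evidence falls short of the required two and the sketch asks for a redo.
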
package/README.md CHANGED

@@ -1,6 +1,6 @@
 # WogiFlow
-A self-improving AI development workflow that learns from your feedback. Currently supports **Claude Code 2.1.33+**.
+A self-improving AI development workflow that learns from your feedback. Currently supports **Claude Code 2.1.33+**. Claude Code **2.1.121+** is recommended for native MCP startup retries (transient channel-server failures auto-recover), Bash resilience to deleted CWDs (worktree cleanup is safe mid-session), and `Always allow` permission persistence across worker restarts.
 ```bash
 npm install -D wogiflow

package/lib/wogi-claude CHANGED

@@ -21,6 +21,12 @@
 #   WOGI_MAX_RESTARTS    — safety cap, default 50 (prevents runaway restart storms)
 #   WOGI_WRAPPER_PID     — exported to child; hook checks this to confirm wrapper is present
 #   WOGI_CLAUDE_BIN      — override path to claude binary (default: found via PATH)
+#   WOGI_BASH_BIN        — (wf-ee4e343b cleanup) override the bash binary used
+#                          by the PID-alignment subshell trick. Defaults to
+#                          `bash` on PATH. Useful on minimal containers
+#                          (Alpine, distroless) where bash lives at a
+#                          non-standard path or where the shell wrapping the
+#                          claude CLI is not bash-by-default.
 #   WOGI_USE_EXPECT      — (EXPERIMENTAL, v2.22.4+) set to 1 to opt IN to the
 #                          expect-based auto-dismiss of the "Loading development
 #                          channels" dialog. OFF BY DEFAULT because Ink's
@@ -99,7 +105,7 @@ if [ "$__wogi_is_worker" -eq 1 ]; then
       try {
         const cfg = require(process.cwd() + "/.workflow/config.json");
         process.stdout.write(String(!!(cfg.workspace && cfg.workspace.inheritClaudeAiMcpIntegrations)));
-      } catch (_e) { process.stdout.write("false"); }
+      } catch (_err) { process.stdout.write("false"); }
     ' 2>/dev/null)"
     if [ "$__wogi_config_inherit" = "true" ]; then
       __wogi_strip_mcp=0
@@ -205,7 +211,7 @@ if [ "$__wogi_strip_mcp" -eq 1 ]; then
             const ws = cfg && cfg.mcpServers && cfg.mcpServers["wogi-workspace-channel"];
             if (ws) channelEntry = ws;
           }
-        } catch (_e) {}
+        } catch (_err) {}
         const payload = channelEntry
           ? { mcpServers: { "wogi-workspace-channel": channelEntry } }
           : { mcpServers: {} };
@@ -284,12 +290,37 @@ __wogi_build_argv() {
 # run_claude — invoke claude, routing through expect when we can auto-dismiss
 # the dev-channels dialog. Preserves stdin/stdout/stderr exactly.
+#
+# wf-ee4e343b: PID-alignment via bash-c-exec trick. The Stop hook's SEC-006
+# check (task-boundary-reset.js:200-206) requires WOGI_WRAPPER_PID === process.ppid
+# in any hook running under claude. Plain `"$CLAUDE_BIN" ...` without `exec`
+# causes bash to fork: claude gets a NEW PID that does not match $$ (this bash
+# wrapper's PID). The check then fails silently, breaking auto-restart for
+# everyone since 2026-04-26.
+#
+# Fix: spawn claude through `bash -c '...'` which forks a fresh bash with its
+# OWN $$, sets WOGI_WRAPPER_PID to that $$, then `exec` replaces the new bash
+# with claude — preserving the same PID. Result: claude's PID equals the
+# WOGI_WRAPPER_PID it inherits, and process.ppid in any hook child of claude
+# equals that same value. The strict-equality SEC-006 check now holds.
+#
+# Why `bash -c` and not a `( ... )` subshell: in bash 3.x (macOS system bash),
+# `$$` inside a `( ... )` subshell returns the OUTER shell's PID, not the
+# subshell's own PID. Bash 4+ adds $BASHPID for that purpose, but we cannot
+# rely on bash 4+ being installed. `bash -c` always returns its own PID via
+# `$$`, regardless of version.
+#
+# Bash -c argv form: `bash -c COMMAND COMMAND_NAME ARG1 ARG2 ...` — COMMAND_NAME
+# becomes $0 inside the script and ARG1..N become $1..$N, so `exec "$0" "$@"`
+# invokes claude with all original args without quoting hazards.
+#
+# For expect mode, the same alignment is performed inside wogi-claude-expect.exp.
 run_claude() {
   __wogi_build_argv "$@"
   if [ "$__wogi_use_expect" -eq 1 ]; then
     expect "$WOGI_EXPECT_SCRIPT" "$CLAUDE_BIN" "${__wogi_claude_argv[@]+"${__wogi_claude_argv[@]}"}"
   else
-    "$CLAUDE_BIN" "${__wogi_claude_argv[@]+"${__wogi_claude_argv[@]}"}"
+    "${WOGI_BASH_BIN:-bash}" -c 'export WOGI_WRAPPER_PID=$$; exec "$0" "$@"' "$CLAUDE_BIN" "${__wogi_claude_argv[@]+"${__wogi_claude_argv[@]}"}"
   fi
 }
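
The PID-alignment trick in `run_claude` can be observed from Node.js with a small probe. The sketch below is only a demonstration under assumptions: it needs a `bash` on PATH (or `WOGI_BASH_BIN`), it uses the current `node` binary as a stand-in for claude, and `runAligned` / `runUnaligned` are hypothetical names. It checks the exec'd process's own PID against the inherited `WOGI_WRAPPER_PID`; the hook-side `process.ppid` comparison described in the comments is one level further down and is not reproduced here.

```javascript
const { spawnSync } = require('node:child_process');

// Same one-liner the wrapper passes to `bash -c`: $0 is the program to exec, "$@" its arguments.
const ALIGN = 'export WOGI_WRAPPER_PID=$$; exec "$0" "$@"';

// Probe script run by the stand-in process: does my PID equal the inherited marker?
const PROBE = 'process.stdout.write(String(process.pid === Number(process.env.WOGI_WRAPPER_PID)))';

// bash sets the variable to its own PID, then exec replaces bash with node at that same PID.
function runAligned(bashBin = process.env.WOGI_BASH_BIN || 'bash') {
  const res = spawnSync(bashBin, ['-c', ALIGN, process.execPath, '-e', PROBE], { encoding: 'utf8' });
  if (res.error) throw res.error;
  return res.stdout;
}

// Without the trick: the parent exports its own PID and the child is a separate process.
function runUnaligned() {
  const env = { ...process.env, WOGI_WRAPPER_PID: String(process.pid) };
  const res = spawnSync(process.execPath, ['-e', PROBE], { encoding: 'utf8', env });
  if (res.error) throw res.error;
  return res.stdout;
}

console.log('aligned:', runAligned());
console.log('unaligned:', runUnaligned());
```

The aligned run reports a match because `exec` keeps the PID of the intermediate bash, while the unaligned run reports a mismatch because the child receives a fresh PID.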

package/lib/wogi-claude-expect.exp CHANGED

@@ -90,12 +90,37 @@ set claude_bin [lindex $argv 0]
 set claude_args [lrange $argv 1 end]
 # Spawn claude in a pseudo-TTY so its Ink UI renders normally.
-# Use {*} list-splice rather than `eval spawn` — `eval` reparses its
-# arguments as Tcl script, which lets an argument containing bracket syntax
-# (e.g. `[exec attacker-cmd]`) escape to command execution. The splice form
-# expands the list without reparsing.
+#
+# wf-ee4e343b PID-alignment: spawn claude through `bash -c` so we can
+# re-export WOGI_WRAPPER_PID=$$ (the subshell's PID) before exec — this
+# makes claude inherit a WOGI_WRAPPER_PID equal to its own PID, satisfying
+# the SEC-006 strict-equality check in task-boundary-reset.js. Without this,
+# expect's spawn gives claude a PID different from WOGI_WRAPPER_PID and the
+# Stop-hook restart trigger silently fails.
+#
+# The bash -c form `bash -c COMMAND COMMAND_NAME ARG1 ARG2 ...` makes
+# COMMAND_NAME = $0 and the remaining args = $1..$N, so we use `exec "$0" "$@"`
+# to invoke claude with all original args — no quoting hazards.
 _wogi_boot_mark "before spawn"
-spawn $claude_bin {*}$claude_args
+# F4 fix (wf-ee4e343b cleanup): defensive list construction. `lrange $argv 1
+# end` already returns a clean Tcl list, and `{*}$claude_args` splices it
+# without re-parsing element contents — so brace-containing args are
+# preserved as single elements. The Sonnet review flagged this as a quoting
+# hazard; the Opus adversary verified it's safe. We rebuild the list
+# explicitly via [list ...] anyway for defense-in-depth and to make the
+# safety property obvious to future readers (no implicit dependency on
+# lrange's return contract).
+set claude_args_safe [list]
+foreach _arg $claude_args { lappend claude_args_safe $_arg }
+# Honor WOGI_BASH_BIN (wf-ee4e343b cleanup) for non-standard shell layouts.
+set _wogi_bash_bin "bash"
+if {[info exists env(WOGI_BASH_BIN)] && $env(WOGI_BASH_BIN) ne ""} {
+    set _wogi_bash_bin $env(WOGI_BASH_BIN)
+}
+spawn $_wogi_bash_bin -c "export WOGI_WRAPPER_PID=\$\$; exec \"\$0\" \"\$@\"" $claude_bin {*}$claude_args_safe
 _wogi_boot_mark "after spawn (pid=$spawn_id)"
 # ============================================================================

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wogiflow",
-  "version": "2.29.4",
+  "version": "2.29.6",
   "description": "AI-powered development workflow management system with multi-model support",
   "main": "lib/index.js",
   "bin": {
@@ -10,7 +10,7 @@
   },
   "scripts": {
     "flow": "./scripts/flow",
-    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/workspace-ipc-sqlite.test.js tests/workspace-ipc-multi-worker.test.js 
tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js tests/flow-commit-claims-gate.test.js tests/auto-review.test.js tests/gate-telemetry-surface.test.js tests/agents-md-alias.test.js tests/flow-skill-manage.test.js tests/fuzzy-patch.test.js tests/mode-schema.test.js tests/flow-feature-dossier.test.js tests/flow-autonomous-mode.test.js tests/flow-epic-cascade.test.js tests/flow-workspace-summary.test.js tests/flow-hooks-research-evidence-gate.test.js tests/flow-worker-mcp-strip.test.js tests/flow-orchestrate-corrections.test.js tests/flow-source-fidelity.test.js tests/flow-hooks-long-input-enforcement.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
+    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/workspace-ipc-sqlite.test.js tests/workspace-ipc-multi-worker.test.js 
tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js tests/flow-commit-claims-gate.test.js tests/auto-review.test.js tests/gate-telemetry-surface.test.js tests/agents-md-alias.test.js tests/flow-skill-manage.test.js tests/fuzzy-patch.test.js tests/mode-schema.test.js tests/flow-feature-dossier.test.js tests/flow-autonomous-mode.test.js tests/flow-epic-cascade.test.js tests/flow-workspace-summary.test.js tests/flow-hooks-research-evidence-gate.test.js tests/flow-worker-mcp-strip.test.js tests/flow-orchestrate-corrections.test.js tests/flow-source-fidelity.test.js tests/flow-hooks-long-input-enforcement.test.js tests/workspace-channel-tracking.test.js tests/flow-hooks-deletion-log.test.js tests/flow-task-boundary-reset.test.js tests/flow-deferral-gate.test.js tests/flow-research-required-gate.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
     "test:syntax": "find scripts/ lib/ -name '*.js' -not -path '*/node_modules/*' -exec node --check {} +",
     "lint": "eslint scripts/ lib/ tests/",
     "lint:ci": "eslint scripts/ lib/ tests/ --max-warnings 0",

package/scripts/flow-defer-auth.js ADDED

@@ -0,0 +1,103 @@
+#!/usr/bin/env node
+/**
+ * Wogi Flow — Deferral Authorization CLI (wf-f9912af6)
+ *
+ * Explicit user-authorization helper for the deferral gate. Used when the AI
+ * needs to record that the user picked a defer-style menu option in
+ * /wogi-review (e.g., "Create tasks for all - fix later in batches").
+ *
+ * Usage:
+ *   flow defer-auth grant --scope=all --reason="<verbatim user phrase>"
+ *   flow defer-auth grant --findings=F5,F6,F7 --reason="..."
+ *   flow defer-auth clear
+ *   flow defer-auth status
+ */
+const path = require('node:path');
+const gate = require('./hooks/core/deferral-gate');
+function parseArgs(argv) {
+  const args = {};
+  for (const a of argv) {
+    if (a.startsWith('--')) {
+      const eq = a.indexOf('=');
+      if (eq === -1) {
+        args[a.slice(2)] = true;
+      } else {
+        args[a.slice(2, eq)] = a.slice(eq + 1);
+      }
+    }
+  }
+  return args;
+}
+function cmdGrant(args) {
+  let scope = 'all';
+  if (args.findings) {
+    scope = String(args.findings)
+      .split(',')
+      .map(s => s.trim())
+      .filter(Boolean);
+    if (scope.length === 0) {
+      console.error('grant: --findings must be a non-empty comma-separated list');
+      process.exit(2);
+    }
+  } else if (args.scope === 'all' || args.scope === undefined) {
+    scope = 'all';
+  } else {
+    scope = String(args.scope);
+  }
+  const reason = args.reason ? String(args.reason) : 'cli-grant';
+  const ttlSec = args['ttl-sec'] ? parseInt(args['ttl-sec'], 10) : undefined;
+  const payload = gate.writeAuth({
+    scope,
+    source: reason,
+    grantedBy: 'explicit-cli',
+    ttlSec
+  });
+  if (!payload) {
+    console.error('grant: failed to write authorization marker');
+    process.exit(1);
+  }
+  console.log(JSON.stringify({ status: 'granted', ...payload }, null, 2));
+}
+function cmdClear() {
+  gate.clearAuth();
+  gate.clearNoDeferPin();
+  console.log(JSON.stringify({ status: 'cleared' }, null, 2));
+}
+function cmdStatus() {
+  const auth = gate.loadAuth();
+  const pin = gate.loadNoDeferPin();
+  console.log(JSON.stringify({
+    authorization: auth || null,
+    noDeferPin: pin || null,
+    authPath: gate.getAuthPath(),
+    pinPath: gate.getNoDeferPinPath()
+  }, null, 2));
+}
+function usage() {
+  console.log('Usage: flow defer-auth <grant|clear|status> [--scope=all|<id>] [--findings=F1,F2] [--reason="..."] [--ttl-sec=600]');
+  process.exit(2);
+}
+function main() {
+  const [, , subcommand, ...rest] = process.argv;
+  const args = parseArgs(rest);
+  switch (subcommand) {
+    case 'grant': return cmdGrant(args);
+    case 'clear': return cmdClear();
+    case 'status': return cmdStatus();
+    default: return usage();
+  }
+}
+if (require.main === module) main();
+module.exports = { parseArgs, cmdGrant, cmdClear, cmdStatus };
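
To see how the CLI turns flags into an authorization scope, the sketch below reproduces `parseArgs` from the file above and pulls the scope rules of `cmdGrant` into a separate function. `resolveScope` is a hypothetical helper introduced only for illustration; it throws where the CLI would print an error and exit with code 2, and it does not write any marker.

```javascript
// Reproduced from flow-defer-auth.js: "--key=value" becomes a string, a bare "--flag" becomes true.
function parseArgs(argv) {
  const args = {};
  for (const a of argv) {
    if (a.startsWith('--')) {
      const eq = a.indexOf('=');
      if (eq === -1) {
        args[a.slice(2)] = true;
      } else {
        args[a.slice(2, eq)] = a.slice(eq + 1);
      }
    }
  }
  return args;
}

// Hypothetical helper mirroring the scope branch of cmdGrant.
function resolveScope(args) {
  if (args.findings) {
    const ids = String(args.findings).split(',').map(s => s.trim()).filter(Boolean);
    if (ids.length === 0) throw new Error('--findings must be a non-empty comma-separated list');
    return ids;
  }
  if (args.scope === 'all' || args.scope === undefined) return 'all';
  return String(args.scope);
}

const args = parseArgs(['--findings=F5, F6,,F7', '--reason=user picked option 4', '--verbose']);
console.log(args);
console.log(resolveScope(args));
```

One consequence of the parsing rules as written: a bare `--findings` with no `=` parses to `true`, which the scope logic would stringify into a single id `"true"`, so the `--findings=F5,F6` form is the one to use.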

package/scripts/flow-utils.js CHANGED

@@ -336,6 +336,7 @@ function saveReadyData(data) {
   const toSave = { ...data, lastUpdated: new Date().toISOString() };
   const result = writeJson(PATHS.ready, toSave);
   invalidateReadyDataCache(); // Invalidate AFTER write completes to avoid stale cache race
+  maybeArmTaskBoundaryRestart(previousData, toSave);
   return result;
 }
@@ -359,10 +360,61 @@ async function saveReadyDataAsync(data) {
     const toSave = { ...data, lastUpdated: new Date().toISOString() };
     const result = writeJson(PATHS.ready, toSave);
     invalidateReadyDataCache(); // Invalidate AFTER write completes
+    maybeArmTaskBoundaryRestart(previousData, toSave);
     return result;
   });
 }
+/**
+ * wf-ee4e343b — Phase 1 chokepoint for task-boundary auto-restart.
+ *
+ * Why this exists: Phase 1 marker writes were previously split across three
+ * disjoint paths (`flow done`, `task-completed.js` hook, Stop-hook fallback
+ * with a 5-min freshness window). The Stop-hook fallback misses real-world
+ * timing (user takes >5min to type next message → fallback rejects) and the
+ * other two paths are not always called. By detecting "new entry in
+ * recentlyCompleted" right here in saveReadyData — the actual chokepoint
+ * every completion goes through — we arm the marker at the moment of
+ * completion regardless of who completed the task.
+ *
+ * Gated on WOGI_WRAPPER_PID so test/CLI/non-wrapper invocations don't
+ * write spurious markers. Lazy-required to avoid circular dependency
+ * (task-boundary-reset.js → flow-utils.js).
+ */
+function maybeArmTaskBoundaryRestart(previousData, savedData) {
+  try {
+    if (!process.env.WOGI_WRAPPER_PID) return;
+    // First-save guard (F2): when ready.json doesn't yet exist, previousData
+    // is null. If savedData arrives pre-populated (fresh install seeded from
+    // backup, init script bootstrapping recentlyCompleted, etc.) we MUST NOT
+    // arm a restart marker — there's no completion event, just an initial
+    // state snapshot. Real completions always have a previousData to diff
+    // against because saveReadyData is the only writer.
+    if (!previousData) return;
+    // F7 fix (wf-ee4e343b cleanup): F2 was asymmetric — readJson returns {}
+    // (truthy) on corrupt JSON or missing top-level keys, so the !previousData
+    // guard would not catch that case. A corrupt ready.json that recovers as
+    // {} followed by a save with populated recentlyCompleted would still
+    // false-positive. Require previousData.recentlyCompleted to be an actual
+    // array for the diff to be meaningful — anything else is "we don't know
+    // the prior state," which is structurally identical to first-save and
+    // must NOT arm.
+    if (!Array.isArray(previousData.recentlyCompleted)) return;
+    const prevTop = previousData.recentlyCompleted[0];
+    const curTop = savedData?.recentlyCompleted?.[0];
+    if (!curTop || !curTop.id) return;
+    if (prevTop && prevTop.id === curTop.id) return; // no new completion
+    const { markRestartPending } = require('./hooks/core/task-boundary-reset');
+    markRestartPending({
+      taskId: curTop.id,
+      taskTitle: curTop.title,
+      source: 'saveReadyData'
+    });
+  } catch (_err) {
+    // Fail-open — never let an observability/marker write break ready.json save
+  }
+}
 /**
  * Archive overflow completed tasks to a log file (v3.2)
  * When recentlyCompleted exceeds 10 items, archive the overflow
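
The guard sequence in `maybeArmTaskBoundaryRestart` amounts to a small diff on the head of `recentlyCompleted`. The sketch below isolates that diff as a pure function so the guards can be exercised without touching `ready.json`. `detectNewCompletion` is a hypothetical name, and the `WOGI_WRAPPER_PID` gate, the fail-open `try/catch`, and the call to `markRestartPending` are deliberately left out.

```javascript
// Returns the newly completed task entry, or null when no restart marker should be armed.
function detectNewCompletion(previousData, savedData) {
  // First-save guard: no prior state means there is nothing to diff against.
  if (!previousData) return null;
  // Recovered-as-{} guard: a prior state without a real array is treated like a first save.
  if (!Array.isArray(previousData.recentlyCompleted)) return null;
  const prevTop = previousData.recentlyCompleted[0];
  const curTop = savedData?.recentlyCompleted?.[0];
  if (!curTop || !curTop.id) return null;
  if (prevTop && prevTop.id === curTop.id) return null; // head unchanged, no new completion
  return curTop;
}

const before = { recentlyCompleted: [{ id: 'wf-aaa', title: 'Old task' }] };
const after = { recentlyCompleted: [{ id: 'wf-bbb', title: 'New task' }, { id: 'wf-aaa', title: 'Old task' }] };
console.log(detectNewCompletion(before, after));
console.log(detectNewCompletion({}, after));
```

Keeping the comparison limited to the first entry matches the diff's logic, which assumes new completions are placed at the head of `recentlyCompleted`.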

package/scripts/hooks/core/deferral-classifier.js ADDED

@@ -0,0 +1,129 @@
+#!/usr/bin/env node
+/**
+ * Wogi Flow — Deferral Intent Classifier (wf-f9912af6)
+ *
+ * Regex-based detector for explicit user deferral intent in UserPromptSubmit
+ * messages. Cheap (no Haiku call), deterministic, runs every prompt.
+ *
+ * NEGATIVE intent takes precedence over POSITIVE — if the user says both
+ * "fix everything" and "skip Y" in the same message, we assume they want
+ * everything fixed (the defer-everything pattern is the dangerous one this
+ * gate exists to stop).
+ *
+ * Negative match → write `no-defer-pin.json` (HARD block, overrides any auth)
+ * Positive match → write `deferral-authorization.json` (allows specific scope)
+ * Neither → no-op
+ *
+ * Fail-open: any error in classification falls through silently.
+ */
+// Negative phrases (HIGH PRIORITY — clear auth, write no-defer pin)
+const NEGATIVE_PATTERNS = [
+  /\bfix\s+(everything|all\s+of\s+(them|it)|all\s+findings?)\b/i,
+  /\bno\s+deferr?als?\b/i,
+  /\b(don'?t|do\s+not)\s+defer\b/i,
+  /\bi\s+don'?t\s+(want|like)\s+(tech\s*-?\s*debt|technical\s*-?\s*debt|deferr?al)/i,
+  /\bnever\s+defer\b/i,
+  /\balways\s+fix\s+(what'?s\s+broken|what\s+needs?\s+fixing)/i,
+  /\bnothing\s+(should\s+be|gets)\s+deferr?ed\b/i,
+];
+// Positive phrases (MEDIUM PRIORITY — write auth marker)
+// We're conservative: require defer/skip phrasing to be coupled with finding
+// context (this/that/those/it/option N/F\d+/severity word) to avoid catching
+// unrelated mentions like "let's defer the meeting".
+const POSITIVE_PATTERNS = [
+  // "defer X" / "skip X" with a referent
+  /\b(defer|skip|ignore|drop)\s+(this|that|those|it|them|f\d+|finding\s+\w+)\b/i,
+  /\bleave\s+(this|that|those|f\d+|.*?)\s+(for\s+)?later\b/i,
+  // /wogi-review menu options that mean defer
+  /\boption\s*[24]\b/i, // option 2 = "fix critical only"; option 4 = "create tasks for all (defer)"
+  /\bcreate\s+tasks?\s+for\s+(all|the\s+rest|remaining)\b/i,
+  // Severity-scoped deferrals
+  /\bfix\s+(only\s+)?(critical|high)\s*(\s*\/\s*high)?\s+only\b/i,
+  /\bfix\s+(critical|high)\s+(only|first)\b/i,
+  /\bskip\s+(low|medium|low\s*\/\s*medium)\b/i,
+  // Ship-as-is style
+  /\bship\s+(it\s+)?as\s*-?\s*is\b/i,
+  /\bgood\s+enough\s+(as\s*-?\s*is|for\s+now)\b/i,
+  /\bcall\s+it\s+(done|good)\b/i,
+];
+/**
+ * Classify a user prompt for deferral intent.
+ *
+ * @param {string} prompt - the user's UserPromptSubmit text
+ * @returns {{ intent: 'negative'|'positive'|'none', match?: string, scope?: string|string[] }}
+ */
+function classifyDeferralIntent(prompt) {
+  if (!prompt || typeof prompt !== 'string') return { intent: 'none' };
+  // Negative first — overrides positive
+  for (const rx of NEGATIVE_PATTERNS) {
+    const m = prompt.match(rx);
+    if (m) return { intent: 'negative', match: m[0] };
+  }
+  // Positive
+  for (const rx of POSITIVE_PATTERNS) {
+    const m = prompt.match(rx);
+    if (m) {
+      // Try to extract scope — look for F\d+ ids in the prompt
+      const findingIds = Array.from(prompt.matchAll(/\bF\d+\b/g)).map(x => x[0]);
+      return {
+        intent: 'positive',
+        match: m[0],
+        scope: findingIds.length > 0 ? findingIds : 'all'
+      };
+    }
+  }
+  return { intent: 'none' };
+}
+/**
+ * Apply classification result to the gate's state files. Wired into
+ * UserPromptSubmit. Fail-open throughout.
+ */
+function applyClassification(prompt, config) {
+  try {
+    if (config?.deferralGate?.classifyUserPrompts === false) return { applied: false, reason: 'classifier-disabled' };
+    const result = classifyDeferralIntent(prompt);
+    if (result.intent === 'none') return { applied: false, reason: 'no-match' };
+    // Lazy-require to avoid load-order coupling
+    const gate = require('./deferral-gate');
+    if (result.intent === 'negative') {
+      gate.writeNoDeferPin({ source: result.match });
+      return { applied: true, intent: 'negative', match: result.match };
+    }
+    if (result.intent === 'positive') {
+      gate.writeAuth({
+        scope: result.scope,
+        source: result.match,
+        grantedBy: 'user-prompt',
+        config
+      });
+      return { applied: true, intent: 'positive', match: result.match, scope: result.scope };
+    }
+    return { applied: false, reason: 'unhandled-intent' };
+  } catch (err) {
+    if (process.env.DEBUG) console.error(`[deferral-classifier] applyClassification error (fail-open): ${err.message}`);
+    return { applied: false, reason: `error: ${err.message}` };
+  }
+}
+module.exports = {
+  classifyDeferralIntent,
+  applyClassification,
+  NEGATIVE_PATTERNS,
+  POSITIVE_PATTERNS
+};
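
The negative-before-positive precedence is easiest to see on a few sample prompts. The sketch below uses only a subset of the patterns from the file above, copied verbatim, with the same matching order and the same `F\d+` scope extraction; `classify` is a shortened stand-in for `classifyDeferralIntent`, and results for the full pattern lists may differ.

```javascript
// Subset of NEGATIVE_PATTERNS from deferral-classifier.js.
const NEGATIVE = [
  /\bfix\s+(everything|all\s+of\s+(them|it)|all\s+findings?)\b/i,
  /\bno\s+deferr?als?\b/i,
  /\b(don'?t|do\s+not)\s+defer\b/i,
];
// Subset of POSITIVE_PATTERNS from deferral-classifier.js.
const POSITIVE = [
  /\b(defer|skip|ignore|drop)\s+(this|that|those|it|them|f\d+|finding\s+\w+)\b/i,
  /\boption\s*[24]\b/i,
  /\bship\s+(it\s+)?as\s*-?\s*is\b/i,
];

function classify(prompt) {
  if (!prompt || typeof prompt !== 'string') return { intent: 'none' };
  // Negative patterns are checked first, so they win when both kinds appear.
  for (const rx of NEGATIVE) {
    const m = prompt.match(rx);
    if (m) return { intent: 'negative', match: m[0] };
  }
  for (const rx of POSITIVE) {
    const m = prompt.match(rx);
    if (m) {
      // Scope extraction is case-sensitive: only uppercase ids such as "F5" are collected.
      const ids = Array.from(prompt.matchAll(/\bF\d+\b/g)).map(x => x[0]);
      return { intent: 'positive', match: m[0], scope: ids.length > 0 ? ids : 'all' };
    }
  }
  return { intent: 'none' };
}

for (const prompt of [
  'fix everything, but skip F3',
  'defer F5 and F6',
  'please skip f3',
  "let's defer the meeting",
]) {
  console.log(JSON.stringify(prompt), '->', JSON.stringify(classify(prompt)));
}
```

Two details stand out when reading the patterns as written: a prompt that mixes "fix everything" with a skip request is classified as negative, and a lowercase id such as `f3` satisfies the case-insensitive positive pattern but is missed by the case-sensitive id extraction, so its scope falls back to `'all'`.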