npm - wogiflow - Versions diffs - 2.1.3 → 2.3.0 - Mend

wogiflow 2.1.3 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

package/.claude/commands/wogi-audit.md +189 -3
package/.claude/commands/wogi-onboard.md +30 -8
package/.claude/commands/wogi-review.md +86 -13
package/.claude/commands/wogi-start.md +66 -21
package/.claude/docs/claude-code-compatibility.md +28 -0
package/.workflow/templates/claude-md.hbs +32 -2
package/package.json +1 -1
package/scripts/flow-api-index.js +128 -63
package/scripts/flow-audit.js +158 -1
package/scripts/flow-function-index.js +65 -63
package/scripts/flow-pattern-extractor.js +1 -1
package/scripts/flow-progress-tracker.js +289 -0
package/scripts/flow-prompt-capture.js +263 -170
package/scripts/flow-scanner-base.js +200 -7
package/scripts/flow-skill-generator.js +1 -0
package/scripts/flow-standards-learner.js +167 -3
package/scripts/flow-task-checkpoint.js +2 -0
package/scripts/flow-template-extractor.js +1 -1
package/scripts/flow-version-check.js +1 -0
package/scripts/hooks/core/commit-log-gate.js +146 -0
package/scripts/hooks/core/post-compact.js +81 -8
package/scripts/hooks/core/task-completed.js +19 -0
package/scripts/hooks/entry/claude-code/post-tool-use.js +60 -0
package/scripts/hooks/entry/claude-code/pre-tool-use.js +27 -0
package/scripts/registries/component-registry.js +141 -4

package/.claude/commands/wogi-audit.md CHANGED

@@ -31,6 +31,32 @@ The audit system has **two layers**:
 1. **Runtime script** (`flow-audit.js`) — provides helper functions for file scanning, TODO finding, dependency checking, and score calculation.
 2. **AI instructions** (this document) — describe the 7-agent parallel analysis, scoring, and post-audit workflow. You (the AI) orchestrate the full audit.
+## Progress Tracking
+At each step checkpoint, display a progress bar AND update the progress state file:
+```bash
+node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"audit","command":"/wogi-audit","phase":"Agents","phaseNum":2,"totalPhases":6,"step":"Agent 5/7 complete","stepNum":5,"totalSteps":7}'
+```
+**Phase mapping for /wogi-audit:**
+| phaseNum | Phase | Description |
+|-------|----------|-------------|
+| 1 | Gather Files | Scan project files |
+| 2 | Agents | 7 parallel agents (sub-steps = agents) |
+| 3 | Consolidate | Score calculation |
+| 4 | Pattern Promotion | AI clustering + cross-reference + gaps |
+| 5 | Report | Display formatted report |
+| 6 | Persist | Save to last-audit.json |
+**Display at each agent completion:**
+```
+━━━ PROGRESS: [████░░░░░░] 35% Step 2: Audit Agents ━━━
+  Agent 5/7 complete (Architecture, Dependencies, Duplication, Performance, Consistency done)
+```
+On audit completion, clear progress: `node node_modules/wogiflow/scripts/flow-progress-tracker.js clear`
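A minimal sketch of how the progress line shown above might be rendered from a percentage. The function name `renderProgressLine` and the 10-cell, round-to-nearest fill rule are assumptions chosen to match the sample output; how `flow-progress-tracker.js` derives the percentage from `phaseNum` and `stepNum` is not visible in this diff.

```javascript
// Hypothetical helper: renders a progress line in the format shown above.
// Assumes a 10-cell bar where each cell stands for 10%, rounded to nearest.
function renderProgressLine(percent, label) {
  const clamped = Math.min(100, Math.max(0, percent));
  const filled = Math.round(clamped / 10);
  const bar = '█'.repeat(filled) + '░'.repeat(10 - filled);
  return `━━━ PROGRESS: [${bar}] ${clamped}% ${label} ━━━`;
}

console.log(renderProgressLine(35, 'Step 2: Audit Agents'));
```

Keeping the renderer separate from the percentage calculation means the same function could serve `/wogi-audit` and `/wogi-review`, whatever weighting each command applies to its phases.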
 ## How It Works
 ### Step 1: Gather Project Files
@@ -291,14 +317,146 @@ Top 5 Quick Wins (highest impact, lowest effort):
 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 ```
+### Step 4.5: Pattern Promotion Analysis (MANDATORY)
+After displaying the report, run pattern promotion analysis **before** offering post-audit actions. This step has 3 phases.
+#### Phase 1: AI Semantic Clustering
+Launch a single Agent (`subagent_type=Explore`, `model="sonnet"`) with ALL findings from the 7 audit agents:
+```
+You are a pattern clustering judge. You receive findings from 7 audit agents.
+Your job is to semantically group findings that describe the SAME underlying issue.
+IMPORTANT: This is semantic matching, not string matching.
+"Missing try-catch", "no error handling", and "unprotected JSON.parse" are the SAME pattern.
+"Inconsistent naming" and "mixed camelCase and snake_case" are the SAME pattern.
+For the findings below, produce a JSON array of clusters:
+[findings from all 7 agents pasted here]
+Output format (ONLY output valid JSON, no markdown):
+[
+  {
+    "patternId": "kebab-case-id",
+    "category": "architecture|code-style|security|performance|consistency|dependencies|tech-debt",
+    "description": "One sentence describing the underlying issue",
+    "severity": "HIGH|MEDIUM|LOW",
+    "isSystemic": true/false (true if 5+ files affected),
+    "instanceCount": N,
+    "instances": [{"file": "path", "detail": "brief description"}]
+  }
+]
+Rules:
+- Merge findings that describe the same root cause, even if different agents worded them differently
+- patternId must be stable: same issue should produce the same ID across audits
+- severity: HIGH if 5+ files, MEDIUM if 3-4, LOW if 1-2
+- Do NOT create a cluster for single-file, one-off issues — only patterns (2+ instances)
+- Maximum 20 clusters (if more, merge the most similar)
+```
+Parse the AI output as JSON. If parsing fails, log a warning and skip to Step 5.
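The parse-and-fallback step above could look roughly like the following. This is a sketch under stated assumptions: `parseClusters` and `severityForFileCount` are hypothetical names, and the real handling inside WogiFlow is not shown in this diff.

```javascript
// Hypothetical helpers illustrating the parse-and-validate step.

// Severity rule from the prompt: HIGH if 5+ files, MEDIUM if 3-4, LOW if 1-2.
function severityForFileCount(fileCount) {
  if (fileCount >= 5) return 'HIGH';
  if (fileCount >= 3) return 'MEDIUM';
  return 'LOW';
}

// Returns an array of clusters, or null when the caller should skip to Step 5.
function parseClusters(rawOutput) {
  try {
    const parsed = JSON.parse(rawOutput);
    if (!Array.isArray(parsed)) {
      throw new Error('expected a JSON array of clusters');
    }
    // The prompt asks for patterns only (2+ instances), so drop one-offs defensively.
    return parsed.filter(
      (cluster) => Array.isArray(cluster.instances) && cluster.instances.length >= 2
    );
  } catch (err) {
    console.warn(`Pattern clustering output was not usable: ${err.message}`);
    return null;
  }
}
```

Returning `null` rather than an empty array lets the caller tell "parsing failed, skip promotion" apart from "parsed fine, no patterns found".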
+#### Phase 2: Cross-Reference & Promotion
+Run the promote command with the clustered findings:
+```bash
+node node_modules/wogiflow/scripts/flow-audit.js promote '<clusters-json>'
+```
+This automatically:
+1. Checks each pattern against `decisions.md` — marks `ENFORCEMENT_GAP` if a rule already exists
+2. Records/increments patterns in `feedback-patterns.md`
+3. Auto-promotes to `decisions.md` when count reaches threshold (default: 3)
+4. Detects `RECURRING` patterns by comparing with `last-audit.json`
+Display the promotion summary in the report:
+```
+━━━ PATTERN PROMOTION ━━━
+  Patterns found: N
+  - Promoted to rules:    N (auto-promoted to decisions.md)
+  - Tracking:             N (count below threshold)
+  - Enforcement gaps:     N (rule exists, still violated!)
+  - New patterns:         N (first occurrence)
+  - Recurring:            N (seen in previous audit)
+  [For each ENFORCEMENT_GAP]:
+  ⚠ ENFORCEMENT GAP: "pattern description"
+    Rule in: ## Section > ### Rule Name
+    Still violated in N files
+  [For each PROMOTED]:
+  ✓ PROMOTED: "pattern description" (N occurrences → decisions.md)
+  [For each SYSTEMIC (5+ files)]:
+  ! SYSTEMIC: "pattern description" (N files affected)
+    Consider creating an immediate rule
+```
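The cross-reference logic described in Phase 2 can be sketched as a small classifier. Everything here is illustrative: `classifyPattern`, the shape of `state`, and the choice to increment the count once per run are assumptions, not the code in `flow-audit.js`.

```javascript
// Hypothetical sketch of the cross-reference step.
const DEFAULT_THRESHOLD = 3;

// state.ruleIds:     Set of pattern IDs that already have a rule in decisions.md (assumed shape)
// state.counts:      Map of running occurrence counts, as in feedback-patterns.md (assumed shape)
// state.previousIds: Set of pattern IDs present in last-audit.json (assumed shape)
function classifyPattern(pattern, state, threshold = DEFAULT_THRESHOLD) {
  const recurring = state.previousIds.has(pattern.patternId);
  if (state.ruleIds.has(pattern.patternId)) {
    // A rule already exists but the pattern was found again.
    return { status: 'ENFORCEMENT_GAP', recurring };
  }
  const count = (state.counts.get(pattern.patternId) || 0) + 1;
  state.counts.set(pattern.patternId, count);
  if (count >= threshold) {
    state.ruleIds.add(pattern.patternId);
    return { status: 'PROMOTED', count, recurring };
  }
  return { status: 'TRACKING', count, recurring };
}
```

Checking for an existing rule before incrementing keeps an already-promoted pattern from being counted toward promotion a second time.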
+#### Phase 3: Enforcement Gap Investigation (on demand)
+This phase runs ONLY if enforcement gaps were found AND the user selects it from post-audit actions.
+For each `ENFORCEMENT_GAP` pattern, launch an Agent (`subagent_type=Explore`, `model="sonnet"`):
+```
+You are investigating why a rule in decisions.md is still being violated.
+THE RULE (from decisions.md):
+[insert ruleText from promotion results]
+THE VIOLATIONS (files still violating this rule):
+[insert instances array from cluster]
+Investigate WHY this rule was violated. Check:
+1. Is the rule too vague? Does it say WHAT to do but not HOW?
+2. Is the rule too long or buried in a large section? Key constraint might be lost in noise.
+3. Is the rule outdated? Does it reference patterns/APIs that have changed?
+4. Is the rule in the wrong section? Might be overlooked if categorized poorly.
+5. Does the rule have programmatic enforcement? Or is it text-only with no automated checks?
+6. Does the rule conflict with another rule or common practice in the codebase?
+7. Does the code predate the rule? (Check git blame dates vs rule creation date if available)
+Output format (ONLY output valid JSON):
+{
+  "rootCause": "TOO_VAGUE|TOO_LONG|OUTDATED|WRONG_SCOPE|NO_ENFORCEMENT|CONTRADICTORY|PRE_EXISTING",
+  "explanation": "2-3 sentences explaining what's wrong",
+  "recommendation": "REWRITE|SPLIT|ADD_TO_STANDARDS_GATE|BACKFILL|NO_ACTION",
+  "suggestedFix": "If REWRITE or SPLIT: the improved rule text. If ADD_TO_STANDARDS_GATE: the pattern to add. If BACKFILL: description of cleanup needed."
+}
+```
+Display investigation results:
+```
+━━━ ENFORCEMENT GAP INVESTIGATION ━━━
+  [For each gap]:
+  Pattern: "description"
+  Root cause: TOO_VAGUE — "The rule says to handle errors but doesn't specify the pattern"
+  Recommendation: REWRITE
+  Suggested fix: [improved rule text]
+  Actions available:
+  - Apply suggested rewrites to decisions.md
+  - Create backfill cleanup tasks in ready.json
+  - Add patterns to standards gate for programmatic enforcement
+```
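Because the investigation agent is asked to return a fixed set of enum values, its output can be checked before it is displayed or persisted. The sketch below assumes a hypothetical `validateInvestigation` helper; the enum values are copied from the prompt above.

```javascript
// Enum values taken from the investigation prompt's output format.
const ROOT_CAUSES = new Set([
  'TOO_VAGUE', 'TOO_LONG', 'OUTDATED', 'WRONG_SCOPE',
  'NO_ENFORCEMENT', 'CONTRADICTORY', 'PRE_EXISTING',
]);
const RECOMMENDATIONS = new Set([
  'REWRITE', 'SPLIT', 'ADD_TO_STANDARDS_GATE', 'BACKFILL', 'NO_ACTION',
]);

// Hypothetical validator: returns a list of problems (empty when the result is usable).
function validateInvestigation(result) {
  const problems = [];
  if (!ROOT_CAUSES.has(result.rootCause)) {
    problems.push(`unknown rootCause: ${result.rootCause}`);
  }
  if (!RECOMMENDATIONS.has(result.recommendation)) {
    problems.push(`unknown recommendation: ${result.recommendation}`);
  }
  // Every recommendation except NO_ACTION is expected to carry a suggestedFix.
  if (result.recommendation !== 'NO_ACTION' && !result.suggestedFix) {
    problems.push('suggestedFix is required unless recommendation is NO_ACTION');
  }
  return problems;
}
```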
 ### Step 5: Post-Audit Actions
-After displaying the report, offer these options using AskUserQuestion:
+After displaying the report and promotion summary, offer these options using AskUserQuestion:
 1. **Create tasks** — Convert high-priority findings to stories/tasks in ready.json
 2. **Add to tech debt** — Add findings to `.workflow/state/tech-debt.json` via `/wogi-debt`
 3. **Save report** — Persist to `.workflow/audits/YYYY-MM-DD-audit.md`
-4. **Create rules** — Promote recurring patterns to decisions.md via `/wogi-decide`
+4. **Create rules** — Manually promote specific patterns via `/wogi-decide`
+5. **Investigate enforcement gaps** — Run Phase 3 investigation for all `ENFORCEMENT_GAP` patterns
+6. **Apply all promotions** — Batch-confirm all auto-promoted rules (already written by Phase 2)
 ### Step 6: Persist Report
@@ -323,7 +481,35 @@ Regardless of user choice, always save the audit results to `.workflow/state/las
     "medium": 18,
     "low": 19
   },
-  "topFindings": [...]
+  "topFindings": [...],
+  "patterns": [
+    {
+      "patternId": "missing-error-handling",
+      "category": "security",
+      "description": "Functions missing try-catch around I/O operations",
+      "instanceCount": 7,
+      "severity": "HIGH",
+      "status": "ENFORCEMENT_GAP",
+      "count": 5,
+      "isSystemic": true
+    }
+  ],
+  "enforcementGaps": [
+    {
+      "patternId": "json-parse-safety",
+      "ruleLocation": "## Coding Standards",
+      "rootCause": "TOO_VAGUE",
+      "recommendation": "REWRITE",
+      "suggestedFix": "..."
+    }
+  ],
+  "promotions": {
+    "promoted": 2,
+    "tracking": 5,
+    "gaps": 1,
+    "new": 3,
+    "recurring": 4
+  }
 }
 ```

package/.claude/commands/wogi-onboard.md CHANGED

@@ -530,14 +530,36 @@ Display:
     Display: `  Data-fetching hooks... ✓ react-query, 80 useGet* hooks`
-15. **Populate app-map.md from component data:**
-    From the pattern extraction result, populate app-map.md with:
-    - Detected UI components -> Components table
-    - Detected pages/screens -> Screens table
-    - Detected modals -> Modals table
-    Include paths and patterns where detected.
-    Display: `  app-map.md...         ✓ Found 24 components/modules`
+15. **Run registry-manager scan (comprehensive registry population):**
+    **CRITICAL**: This step replaces manual AI-driven app-map population. The registry
+    manager runs ALL active scanners (components, functions, APIs, schemas, services)
+    with recursive directory traversal and glob-based discovery. This ensures no
+    subdirectory components, co-located hooks, or separated-export API functions are missed.
+    ```javascript
+    // Run the full registry scan — this handles recursion, glob patterns, and all export patterns
+    const { execSync } = require('child_process');
+    try {
+      execSync('node node_modules/wogiflow/scripts/flow-registry-manager.js scan', {
+        cwd: projectRoot,
+        stdio: 'inherit',
+        timeout: 60000
+      });
+    } catch (err) {
+      console.warn('Registry manager scan failed, falling back to individual scanners:', err.message);
+      // Individual scanners from steps 13-14 already ran as fallback
+    }
+    ```
+    The component scanner generates `app-map.md` from scan results (grouped by category/directory).
+    The function scanner discovers co-located hooks via glob patterns (`src/**/hooks`, etc.).
+    The API scanner handles all export patterns including separated `const + export default`.
+    Display: `  Registry scan...      ✓ All registries populated (components, functions, APIs)`
+    **NOTE**: If the registry manager scan succeeds, it supersedes the individual scanner runs
+    from steps 13-14. The scanners are idempotent — running them twice just refreshes the same data.
 16. **Extract file templates:**
     ```javascript

package/.claude/commands/wogi-review.md CHANGED

@@ -23,6 +23,31 @@ Auto-detects when to use multi-pass (4 sequential passes) vs parallel (3 agents)
 /wogi-review --skip-optimization  # Skip solution optimization suggestions
 ```
+## Progress Tracking
+At each phase checkpoint, display a progress bar AND update the progress state file:
+```bash
+node node_modules/wogiflow/scripts/flow-progress-tracker.js update '{"taskId":"wf-XXX","command":"/wogi-review","phase":"AI Review","phaseNum":2,"totalPhases":5,"step":"Agent 3/6 complete","stepNum":3,"totalSteps":6}'
+```
+**Standard format for each checkpoint:**
+```
+━━━ PROGRESS: [████░░░░░░] 40% Phase 2: AI Review ━━━
+  Agent 3/6 complete
+```
+**Phase mapping for /wogi-review:**
+| phaseNum | Phase | Description |
+|-------|----------|-------------|
+| 1 | Verification Gates | Syntax, lint, tests |
+| 2 | AI Review | N agents (sub-steps = agents) |
+| 3 | Standards + Promotion | Compliance check + pattern learning |
+| 4 | Optimization | Solution suggestions |
+| 5 | Post-Review | Fix routing, learning, archive |
+On review completion, clear progress: `node node_modules/wogiflow/scripts/flow-progress-tracker.js clear`
 ## Review Phases (v5.0)
 ```
@@ -695,7 +720,20 @@ Or if the runtime script is not available, manually check:
 - `naming-conventions.md` - File names (kebab-case), catch variables (`err` not `e`)
 - `security-patterns.md` - Raw JSON.parse, unprotected fs.readFileSync
-**3.3. Display Phase 3 results**:
+**3.3. Pattern Promotion on Violations**:
+After running the standards check, feed any violations through the pattern promotion pipeline (same infrastructure as `/wogi-audit` Step 4.5):
+1. **Cluster violations by pattern**: Group findings that describe the same underlying issue. For review (smaller scope than audit), simple grouping by violation `type` + `category` is sufficient — no AI clustering agent needed.
+2. **Cross-reference with learning pipeline**: For each cluster, call `flow-audit.js promote` or use the learner directly:
+   - Check `decisions.md` — if a rule exists for this pattern, it's an **ENFORCEMENT_GAP**
+   - Record/increment in `feedback-patterns.md` via `recordAuditPattern()`
+   - Auto-promote to `decisions.md` if count reaches threshold (default 3)
+3. **Flag enforcement gaps**: When a violation matches an existing rule in `decisions.md`, this is critical — the rule exists but the AI still violated it. Mark as `ENFORCEMENT_GAP` and include in the report.
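The "simple grouping by violation `type` + `category`" from step 1 might be sketched as follows. The violation fields (`type`, `category`, `file`, `message`) and the `clusterViolations` name are assumptions; the output loosely mirrors the cluster shape used by `/wogi-audit`.

```javascript
// Hypothetical grouping helper: one cluster per category/type pair.
function clusterViolations(violations) {
  const clusters = new Map();
  for (const violation of violations) {
    const key = `${violation.category}:${violation.type}`;
    if (!clusters.has(key)) {
      clusters.set(key, { patternId: key, category: violation.category, instances: [] });
    }
    clusters.get(key).instances.push({ file: violation.file, detail: violation.message });
  }
  // Attach instanceCount so each cluster resembles the audit cluster format.
  return [...clusters.values()].map((cluster) => ({
    ...cluster,
    instanceCount: cluster.instances.length,
  }));
}
```

A `Map` preserves insertion order, so clusters come out in the order their first violation was seen.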
+**3.4. Display Phase 3 results**:
 ```
 ═══════════════════════════════════════
 PHASE 3: STANDARDS COMPLIANCE [3/5]
@@ -705,6 +743,13 @@ PHASE 3: STANDARDS COMPLIANCE [3/5]
 ✗ naming-conventions: 1 violation [MUST FIX]
    → src/utils.ts:45 - Catch variable "e" should be "err"
+Pattern Learning:
+  Patterns tracked: N
+  - ENFORCEMENT_GAP: 1 (rule exists, still violated!)
+    ⚠ "err in catch blocks" — rule in ## Coding Standards, violated in src/utils.ts
+  - Tracked: 1 (security-json-parse-safety: 2/3 toward promotion)
+  - Promoted: 0
 Summary: N checks, M violations (X must-fix, Y warnings)
 ✓ Phase 3 complete. Proceeding to Phase 4...
@@ -770,10 +815,11 @@ Phase Results:
   Phase 1 (Verification): 4/4 gates passed
   Phase 2 (AI Review): M findings from N agents
   Phase 2.5 (Git Claims): X verified, Y missing, Z unplanned
-  Phase 3 (Standards): N checks, M violations
+  Phase 3 (Standards): N checks, M violations, P patterns tracked
   Phase 4 (Optimization): N suggestions
 Total Findings: N (X critical, Y high, Z medium, W low)
+Pattern Learning: P patterns tracked, M promoted, G enforcement gaps
 Phases: 5/5 executed
 ```
@@ -867,22 +913,48 @@ After the fix loop completes (Options 1/2), or immediately (Option 4), handle un
 **Learning signal detection:**
-After completing fixes, check for recurring patterns:
-1. If 3+ findings share the same `category` or `file` → log to `feedback-patterns.md`
-2. Display warning suggesting `/wogi-decide` to create a preventive rule
+After completing fixes, feed ALL findings (not just standards violations) through the pattern promotion pipeline:
-**Update `last-review.json`**: Set `"triaged": true` on the review after all findings are addressed (fixed, deferred, or dismissed).
+1. **Cluster all findings by category + type** — group findings that describe the same underlying pattern
+2. **Run promotion pipeline** for each cluster:
+   - Check `decisions.md` for enforcement gaps (rule exists, still violated)
+   - Record/increment in `feedback-patterns.md` via the standards learner
+   - Auto-promote to `decisions.md` when count reaches threshold
+3. **Enforcement gap investigation** — for any `ENFORCEMENT_GAP` patterns, launch an Agent (subagent_type=Explore, model=sonnet) to investigate WHY the rule was violated:
+   - Read the specific rule from `decisions.md`
+   - Read the violating code
+   - Classify root cause: `TOO_VAGUE` | `TOO_LONG` | `OUTDATED` | `WRONG_SCOPE` | `NO_ENFORCEMENT` | `CONTRADICTORY` | `PRE_EXISTING`
+   - Recommend action: `REWRITE` | `SPLIT` | `ADD_TO_STANDARDS_GATE` | `BACKFILL` | `NO_ACTION`
+   - Display investigation results with suggested fixes
+4. **Display learning summary**:
+```
+━━━ PATTERN LEARNING ━━━
+  Patterns from this review: N
+  - Promoted to rules: M (auto-promoted to decisions.md)
+  - Tracking: K (count below threshold)
+  - Enforcement gaps: G (rule exists, still violated!)
+    [For each gap]:
+    ⚠ "pattern description" — Root cause: TOO_VAGUE
+      Recommendation: REWRITE — [suggested improved rule text]
+```
+5. **Offer gap resolution**: If enforcement gaps were found, offer to apply suggested rewrites to `decisions.md` immediately — fixing the rule while the violation is fresh in context.
+**Update `last-review.json`**: Set `"triaged": true` on the review after all findings are addressed (fixed, deferred, or dismissed). Include `patterns` and `enforcementGaps` arrays in the saved review (same schema as `last-audit.json`).
 **Config toggles** (all in `config.originTaskTracing`):
 - `annotateCompletedTasks: false` → Skip same-session detection, all findings create standalone tasks
 - `traceOrigin: false` → No `originTask` field on fix tasks
-- `learningSignal.enabled: false` → No pattern detection
+- `learningSignal.enabled: false` → No pattern detection or promotion
 - `sameSessionWindow: "2h"` → Time window for same-session detection (default: 2 hours)
-**5.4. Learning capture**:
-- Check each finding against `feedback-patterns.md`
-- For preventable patterns, create correction records
-- If a pattern has occurred 3+ times → Suggest promoting to `decisions.md`
+**5.4. Learning capture (ENHANCED — uses audit promotion pipeline)**:
+The learning capture now uses the same infrastructure as `/wogi-audit` Step 4.5:
+- `flow-standards-learner.js`: `recordAuditPattern()`, `checkEnforcementGap()`, `promoteToDecisions()`
+- `flow-audit.js`: `promoteAuditPatterns()` for batch processing
+This ensures that patterns discovered during code review feed into the same promotion pipeline as audit findings — a violation found 2x in audits and 1x in review reaches the threshold of 3 and auto-promotes.
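The "2x in audits and 1x in review" example can be worked through with a shared counter. This is only an illustration of the counting idea; `createPatternCounter` is a hypothetical name, and the real counts live in `feedback-patterns.md` rather than in memory.

```javascript
// Hypothetical in-memory stand-in for the shared pattern count.
function createPatternCounter(threshold = 3) {
  const counts = new Map();
  return {
    // Records one occurrence from a given source ("audit" or "review").
    record(patternId, source) {
      const entry = counts.get(patternId) || { total: 0, sources: {} };
      entry.total += 1;
      entry.sources[source] = (entry.sources[source] || 0) + 1;
      counts.set(patternId, entry);
      // Promotion fires on the occurrence that reaches the threshold.
      return { total: entry.total, sources: { ...entry.sources }, promote: entry.total === threshold };
    },
  };
}

const counter = createPatternCounter();
counter.record('json-parse-safety', 'audit');
counter.record('json-parse-safety', 'audit');
const third = counter.record('json-parse-safety', 'review');
console.log(third.promote, third.sources);
```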
 **5.5. Archive review report**:
 - Save review report to `.workflow/reviews/YYYY-MM-DD-HHMMSS-review.md`
@@ -902,8 +974,9 @@ Findings: N total
 Fixed: M  |  Tasks Created: Z  |  Annotated: A  |  Dismissed: W
 Saved to: .workflow/state/last-review.json
-Same-session annotations: A findings linked to N completed tasks
-Origin tracing: Z fix tasks with origin references
+Pattern Learning:
+  Patterns tracked: N  |  Promoted: M  |  Enforcement gaps: G
+  [If gaps found]: Gap fixes applied to decisions.md
 Run /wogi-review-fix --pending to batch-process deferred items.

package/.claude/commands/wogi-start.md CHANGED

@@ -341,30 +341,75 @@ After implementing all scenarios, BEFORE quality gates:
 **Anti-pattern: "Dead service"** — a service that exists, compiles, is imported somewhere, but its critical method is never called by the thing that should trigger it. This passes lint, typecheck, and wiring checks (because the file IS imported) but the feature doesn't work.
-### Step 3.55: Semantic Verification Pass (for "remove/fix all X" tasks)
+### Step 3.55: Inventory-Based Verification (for "remove/fix/replace all X" tasks)
-**Activates when**: The task involves removing, cleaning up, or fixing ALL instances of something (e.g., "remove all mock data", "fix all console.log", "replace all hardcoded URLs", "remove all deprecated APIs").
+**Activates when**: The task involves removing, cleaning up, fixing, or replacing ALL instances of something (e.g., "remove all mock data", "fix all console.log", "replace all hardcoded URLs", "remove all deprecated APIs").
-**Purpose**: Pattern-based search (regex, grep) finds instances that match a naming pattern. But semantic variants — hardcoded values, helper functions that serve the same purpose, inline fallbacks — are invisible to pattern search. This pass catches what regex misses.
+**The problem this solves**: Pattern-based search (grep, regex) only finds instances that match a naming convention. Semantic variants — inline hardcoded arrays, helper functions that wrap the target, useState initializers with fake data, constants not named with the expected prefix — are invisible to pattern search. In practice, pattern search finds ~60-70% of instances. The AI then declares "done" and the remaining 30-40% persist undetected. This has caused repeated false completions (3-4x on a single project).
-**Procedure**:
-1. After the implementation agent completes, do NOT immediately mark the task as done
-2. Run a **second verification pass** that asks SEMANTICALLY, not syntactically:
-   - "Does any remaining code serve the purpose of [what we're removing]?"
-   - NOT: "Does any remaining code match the pattern [MOCK_*]?"
-3. For each type of removal, use type-specific semantic checks:
-| Removal Type | Semantic Check |
-|-------------|----------------|
-| Mock data | Scan render output for hardcoded business data (customer names, dollar amounts, percentages, dates that look like test data) |
-| Console.log | Scan for any debugging output (console.warn, console.debug, alert(), debugger statements) |
-| Hardcoded URLs | Scan for string literals containing http://, https://, localhost, IP addresses |
-| Deprecated APIs | Scan for the FUNCTIONALITY the API provided, not just its name |
-4. Use AI reasoning, NOT regex — the whole point is catching what regex misses
-5. Report any findings as additional criteria to fix before completion
-**Cross-cutting principle**: Pattern matching is for discovery. AI reasoning is for verification. The two-pass approach (pattern search → semantic verification) is the standard for any "fix all X" task.
+**Core principle**: For each file in scope, ask **"does anything in this file serve the PURPOSE of [what we're removing]?"** — regardless of what it's named. Reason about function, not strings.
+**Procedure (3 phases — ALL mandatory)**:
+#### Phase A: Pre-Implementation Inventory (BEFORE any code changes)
+1. **Identify all files in scope** — every file that could contain instances of [X]. Use both:
+   - Pattern search (grep/glob) for syntactic matches
+   - File-by-file reading of components/pages/modules that CONSUME data related to [X]
+2. **For each file, answer the semantic question**: "Does anything in this file serve the purpose of [what we're removing]?" Examples by task type:
+   | Task Type | Semantic Question | What Pattern Search Misses |
+   |-----------|-------------------|---------------------------|
+   | Remove mock data | "Where does this component get its displayed data? Is it from an API call or a local constant/array/useState?" | Inline arrays (`const customers = [{...}]`), useState initializers (`useState([...POLICY_DATA])`), export constants not named `MOCK_*` |
+   | Remove console.log | "What in this file produces output to any channel?" | `console.warn`, `console.debug`, `debugger`, `alert()`, custom logger wrappers |
+   | Replace hardcoded URLs | "What string values in this file resolve to network addresses?" | URLs built from concatenation, template literals, env var fallbacks with hardcoded defaults |
+   | Remove deprecated API | "What in this file provides the same FUNCTIONALITY as the deprecated API?" | Wrapper functions, polyfills, compatibility shims, re-implementations |
+   | Fix all raw JSON.parse | "What in this file deserializes JSON?" | Utility functions that call JSON.parse internally, library wrappers |
+3. **Produce a numbered inventory** and display it to the user:
+   ```
+   ━━━ PRE-IMPLEMENTATION INVENTORY ━━━
+   Found N instances of [X] across M files:
+     1. [file:lines] — [description] [TYPE: syntactic|semantic]
+     2. [file:lines] — [description] [TYPE: syntactic|semantic]
+     ...
+   Total: N instances (S syntactic, T semantic)
+   Confirm inventory is complete before proceeding? [Y/adjust]
+   ```
+4. **Wait for user confirmation** that the inventory is complete. If the user identifies missing items, add them. This step is CRITICAL — it commits the AI to a concrete scope that can be verified later.
+#### Phase B: Implementation
+5. Implement the removal/fix/replacement for EVERY item in the inventory. Each inventory item becomes a trackable unit of work.
+#### Phase C: Post-Implementation Re-Inventory (AFTER all changes)
+6. **Re-run the SAME semantic scan** from Phase A on the SAME set of files. Use the same questions — do NOT downgrade to pattern-only search.
+7. **Diff the inventories**:
+   ```
+   ━━━ POST-IMPLEMENTATION VERIFICATION ━━━
+   Re-scanned M files for [X]:
+     1. [file:lines] — [description]          → REMOVED ✓
+     2. [file:lines] — [description]          → REMOVED ✓
+     3. [file:lines] — [description]          → STILL PRESENT ✗
+     ...
+   Result: N/N removed (0 remaining)
+   ```
+8. **If ANY items remain** → task is NOT done. Fix the remaining items and re-verify. Do NOT proceed to quality gates with remaining items.
+9. **If new instances are discovered** during re-scan that weren't in the original inventory → add them, fix them, and note them as "discovered during verification."
+**Why this works**: The inventory creates a concrete, numbered checklist BEFORE implementation. The AI cannot claim "done" when the post-inventory shows items still present — the evidence is in the conversation. The pre/post diff is unfakeable.
+**Skip conditions**: Tasks that target a specific file or a small known set (e.g., "remove the mock import in Dashboard.tsx") don't need the full inventory — they're scoped enough already. The inventory is for "all X" / "every X" / "clean up X everywhere" tasks.
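The pre/post inventory diff from Phase C can be sketched as a set comparison. This is a simplification under stated assumptions: items are matched by `file` plus `description` (line numbers shift after edits), the `diffInventories` name is hypothetical, and in the documented workflow the re-scan itself is done by AI reasoning rather than by code.

```javascript
// Hypothetical inventory items: { file, description }.
function diffInventories(before, after) {
  const keyOf = (item) => `${item.file}::${item.description}`;
  const beforeKeys = new Set(before.map(keyOf));
  const afterKeys = new Set(after.map(keyOf));

  // Each original item is either gone or still present after the changes.
  const results = before.map((item) => ({
    ...item,
    status: afterKeys.has(keyOf(item)) ? 'STILL_PRESENT' : 'REMOVED',
  }));
  // Items found only in the re-scan were missed by the original inventory.
  const discovered = after.filter((item) => !beforeKeys.has(keyOf(item)));

  const remaining =
    results.filter((item) => item.status === 'STILL_PRESENT').length + discovered.length;
  return { results, discovered, done: remaining === 0 };
}
```

`done` is true only when nothing from the original inventory remains and the re-scan surfaced nothing new, which corresponds to the "N/N removed (0 remaining)" result above.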
 ### Step 3.6: Integration Wiring Validation (MANDATORY)

package/.claude/docs/claude-code-compatibility.md CHANGED

@@ -68,6 +68,8 @@ flow parallel check  # See available parallel tasks
 | 1.9.1+ | 2.1.72+ | ExitWorktree, Agent model param, effort levels, /plan description, fd auto-approval, prompt cache fix |
 | 1.9.5+ | 2.1.73+ | SessionStart double-fire fix, hook context pollution fix, modelOverrides, subagent model fix on Bedrock/Vertex |
 | 1.9.5+ | 2.1.74+ | SessionEnd timeout fix, managed policy ask rules, autoMemoryDirectory, Agent tool routing gate fix |
+| 2.0.0+ | 2.1.76+ | PostCompact hook, Elicitation/ElicitationResult events, deferred tool schema fix |
+| 2.1.0+ | 2.1.77+ | PreToolUse allow/deny separation, 128k output tokens, worktree sparse checkout, compaction circuit breaker |
 ### Environment Variables (2.1.19+)
@@ -161,6 +163,8 @@ await cancelTask('wf-123', 'superseded', false);
 | SessionEnd | session-end.js | Request logging, progress update |
 | TaskCompleted | task-completed.js | Move task to recentlyCompleted |
 | ConfigChange | config-change.js | Re-sync bridge on mid-session config changes |
+| InstructionsLoaded | instructions-loaded.js | Package check, rule conflicts, auto-onboard |
+| PostCompact | post-compact.js | Re-inject state after context compaction (2.1.76+) |
 ### Features in Latest Release
@@ -209,6 +213,29 @@ await cancelTask('wf-123', 'superseded', false);
 - **Plugin install/marketplace fixes**: Fixed `/plugin install` for marketplace plugins with local sources, and marketplace update not syncing git submodules. WogiFlow's plugin system is internal (not marketplace-based), so no direct impact.
 - **`--plugin-dir` override change**: Local dev copies now override installed marketplace plugins with the same name. Useful for WogiFlow plugin development workflows.
+### Features in 2.1.76+
+- **PostCompact hook**: New hook event that fires after context compaction completes. WogiFlow uses this to re-inject critical state (active task, workflow phase, durable session progress) and re-arm the routing-pending flag. Fully implemented in `scripts/hooks/core/post-compact.js` and registered in `settings.json`.
+- **MCP elicitation support**: MCP servers can now request structured input mid-task via interactive dialogs (form fields or browser URL). New `Elicitation` and `ElicitationResult` hooks available for intercepting/overriding responses. WogiFlow lists these in `UNUSED_SUPPORTED_EVENTS` — not yet implemented but ready for future use (e.g., interactive clarification forms during task triage).
+- **Deferred tools schema fix after compaction**: Previously, tools loaded via `ToolSearch` lost their input schemas after compaction, causing array and number parameters to be rejected with type errors. Now fixed. WogiFlow sessions using deferred MCP tools are no longer affected.
+- **Auto-compaction circuit breaker**: Auto-compaction now stops after 3 consecutive failures instead of retrying indefinitely. WogiFlow's PostCompact hook tracks compaction frequency and warns when multiple compactions occur in quick succession, indicating potential circuit breaker activation.
+- **`/effort` slash command**: New command to set model effort level. WogiFlow already maps task levels to effort (L3→low, L2→medium, L1/L0→high) — this provides a manual override path.
+- **`-n`/`--name` CLI flag**: Set a display name for the session at startup. Can be used with WogiFlow task IDs for clearer session identification.
+- **`worktree.sparsePaths` setting**: New setting for `claude --worktree` in large monorepos to check out only needed directories via git sparse-checkout. WogiFlow documents this in the worktree comparison table but does not auto-configure it — users should set `sparsePaths` in their Claude Code config for monorepo projects.
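The task-level-to-effort mapping mentioned in the `/effort` item can be written out as a lookup. The mapping values come from the text above; the `effortForTask` function and its `manualOverride` parameter are hypothetical and only illustrate how a manual setting could take precedence.

```javascript
// Mapping stated in the notes: L3 -> low, L2 -> medium, L1/L0 -> high.
const EFFORT_BY_LEVEL = { L0: 'high', L1: 'high', L2: 'medium', L3: 'low' };

// Hypothetical resolver: a manual override (e.g. set via /effort) wins if present.
function effortForTask(level, manualOverride) {
  if (manualOverride) return manualOverride;
  const effort = EFFORT_BY_LEVEL[level];
  if (!effort) throw new Error(`Unknown task level: ${level}`);
  return effort;
}
```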
+### Features in 2.1.77+
+- **PreToolUse "allow" no longer bypasses deny rules**: Previously, a PreToolUse hook returning `permissionDecision: "allow"` would bypass explicit deny rules (including enterprise managed settings). Now `allow` only means "this hook permits it" — deny rules from permissions/managed settings still apply independently. WogiFlow's routing gate returns `allow` after routing is complete and `deny` when routing is pending. This fix is CORRECT behavior for WogiFlow — our `allow` should never have overridden user/enterprise deny rules. No code change needed.
+- **Compound bash "Always Allow" fix**: "Always Allow" on compound bash commands (e.g., `cd src && npm test`) now saves a single rule for the full string instead of per-subcommand, preventing dead rules and repeated permission prompts. WogiFlow's generated permission rules in `claude-bridge.js` use single-command patterns (e.g., `Bash(npm install *)`) so this fix does not affect WogiFlow-generated permissions. Users who manually "Always Allow" compound commands will see improved behavior.
+- **Increased output token limits**: Default max output for Opus 4.6 increased to 64k tokens. Upper bound for both Opus 4.6 and Sonnet 4.6 increased to 128k tokens. WogiFlow's model registry updated: `claude-sonnet-4-6.maxOutputTokens` changed from 64000 to 128000. Opus 4.6 was already at 128000.
+- **Background agent partial results preserved**: Killing a background agent now preserves its partial results in conversation context. WogiFlow's explore phase agents (5-6 launched in parallel) benefit — if one agent is killed or times out, its partial findings are still available.
+- **Agent tool resume parameter removed**: The Agent tool no longer accepts a `resume` parameter. Use `SendMessage({to: agentId})` to continue a previously spawned agent. WogiFlow does not use the `resume` parameter (confirmed by codebase search). `SendMessage` now auto-resumes stopped agents in the background instead of returning an error.
+- **Improved `claude plugin validate`**: Now checks skill, agent, and command frontmatter plus hooks/hooks.json, catching YAML parse errors and schema violations. WogiFlow should periodically run this to catch frontmatter issues.
+- **`--resume` truncation fix**: Fixed `--resume` silently truncating recent conversation history due to a race between memory-extraction writes and the main transcript. Improves reliability of session resumption for WogiFlow durable sessions.
+- **Stale worktree cleanup race condition fix**: Fixed a race condition where stale-worktree cleanup could delete an agent worktree just resumed from a previous crash. WogiFlow's parallel execution with worktree isolation benefits from improved safety.
+- **Memory growth fix**: Fixed progress messages surviving compaction in long-running sessions. Reduces memory pressure during long WogiFlow bulk-loop sessions.
+- **Faster startup on macOS**: Startup is ~60ms faster because keychain credentials are now read in parallel. `--resume` on fork-heavy sessions is also faster, with up to 45% faster loading and ~100-150MB less peak memory. Benefits WogiFlow sessions with heavy hook context.
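
The PreToolUse change described in the list above can be modeled as a precedence rule: a matching deny rule wins regardless of what a hook returns. The sketch below is an illustrative model only, not Claude Code's implementation. `resolvePermission`, `routingGateDecision`, and the `'ask'` fallback are assumptions introduced for the example.

```javascript
// Illustrative model only: function names and the 'ask' fallback are assumptions.

// WogiFlow's routing gate, as described above: allow once routing is
// complete, deny while routing is still pending.
function routingGateDecision(routingComplete) {
  return routingComplete ? 'allow' : 'deny';
}

// 2.1.77+ semantics: deny rules from permissions or managed settings are
// evaluated independently, so a hook's "allow" can no longer override them.
function resolvePermission({ hookDecision, matchesDenyRule }) {
  if (matchesDenyRule) return 'deny';
  if (hookDecision === 'deny') return 'deny';
  if (hookDecision === 'allow') return 'allow';
  return 'ask'; // assumed fallback when no hook expressed an opinion
}
```

In this model, `resolvePermission({ hookDecision: routingGateDecision(true), matchesDenyRule: true })` resolves to `'deny'`, which is why the note concludes that no WogiFlow code change is needed.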
 ### Simple Mode Naming Distinction
 Claude Code's `CLAUDE_CODE_SIMPLE` environment variable (which enables a simplified tool set) is **unrelated** to WogiFlow's `loops.simpleMode` (a lightweight task completion loop using string detection). They are separate features that happen to share the word "simple":
@@ -229,6 +256,7 @@ Both can be active simultaneously without conflict.
 | Squash merge | No (manual) | Yes (`squashOnMerge` config) |
 | Task linking | No | Yes (links to task ID) |
 | Cleanup | Prompted on session exit | Auto after 24h (`autoCleanupHours`) |
+| Sparse checkout | Yes (`worktree.sparsePaths` setting, 2.1.76+) | Not supported — relies on Claude Code native |
 WogiFlow detects native worktrees and avoids nesting. When launched with `--worktree`, WogiFlow uses the native worktree as-is.

package/.workflow/templates/claude-md.hbs CHANGED

@@ -142,7 +142,7 @@ See `.claude/docs/commands.md` for complete command reference.
 | "show tasks", "what's ready", "available tasks" | `/wogi-ready` |
 | "project status", "show status", "where are we" | `/wogi-status` |
 | "check health", "workflow health", "is everything ok" | `/wogi-health` |
-| "wrap up", "end session", "that's all" | `/wogi-session-end` |
+| "wrap up", "end session", "that's all" | `/wogi-session-end` (**intent-check required** — see note below) |
 | "compact context", "save context", "running low on context" | `/wogi-pre-compact` |
 | "show roadmap", "what's planned", "future work", "deferred items" | `/wogi-roadmap` |
 | "debug this", "investigate hypotheses", "competing theories", "parallel debug" | `/wogi-debug-hypothesis` |
@@ -162,6 +162,14 @@ See `.claude/docs/commands.md` for complete command reference.
 **IMPORTANT**: When a user's message matches one of these patterns, immediately invoke the Skill tool with the corresponding command. Do not ask for confirmation. These `/wogi-*` commands satisfy the mandatory routing requirement — you do NOT also need to invoke `/wogi-start` when a detection match exists. `/wogi-start` is the fallback for messages that don't match this table.
+**Session-end intent check**: `/wogi-session-end` requires extra care. Phrases like "wrap up", "that's all", "let's finish with this" often mean "finish this topic" not "end the entire session." Only invoke `/wogi-session-end` when the user clearly intends to **stop working entirely** — not when they're concluding one topic before moving to another. Examples:
+- "that's all for today, thanks" → session-end (clear finality)
+- "let's wrap up this task and move on to the auth bug" → NOT session-end (continuing work)
+- "I'm done" → session-end (if no follow-up topic mentioned)
+- "let's finish with that and then do X" → NOT session-end (next topic follows)
+When in doubt, route through `/wogi-start` which will classify correctly.
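
The intent check above is an instruction to the model rather than code, but its logic can be approximated with a keyword heuristic. The sketch below is a rough illustration under that assumption. `routeSessionEnd` and both regular expressions are hypothetical, and the phrase lists are far from exhaustive.

```javascript
// Hypothetical heuristic: the template itself relies on model judgment.
// Phrases signalling that more work follows in the same session.
const CONTINUATION = /\b(and then|move on|after that)\b/i;
// Phrases signalling that the user is stopping entirely.
const FINALITY = /\b(for today|i'm done|end (the )?session|signing off)\b/i;

function routeSessionEnd(message) {
  // A follow-up topic overrides any wrap-up phrasing.
  if (CONTINUATION.test(message)) return '/wogi-start';
  if (FINALITY.test(message)) return '/wogi-session-end';
  // When in doubt, fall back to /wogi-start for classification.
  return '/wogi-start';
}
```

Checking continuation cues first mirrors the examples above: "let's finish with that and then do X" contains wrap-up wording but still routes to `/wogi-start`.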
 ## CRITICAL: Universal Entry Point — ALL Requests
 **ALL user messages MUST go through a `/wogi-*` command. No direct handling. No self-classification.**
@@ -274,7 +282,29 @@ Before closing any task, ensure all required gates pass (per `config.json → qu
 ## Context Management
-Use `/wogi-pre-compact` when context is getting large. Before compacting: update progress.md, ensure request-log is current, commit work.
+Context compaction happens automatically — WogiFlow persists all critical state to disk continuously, and the PostCompact hook restores it after compaction. You do NOT need to manually run `/wogi-pre-compact` before compaction.
+**What survives compaction automatically** (via PostCompact hook + state files):
+- Active task ID, title, type, and acceptance criteria
+- Which criteria are completed vs pending (from durable-session.json)
+- Current workflow phase
+- Changed files list (from task-checkpoint.json)
+- Last request-log entry number
+- Routing enforcement (re-armed automatically)
+**`/wogi-pre-compact` is optional** — use it only when you want a detailed summary for a very long session. It is NOT required for state safety.
+**For L1+ tasks**: The pre-task context estimator (Step 0.25) checks whether the task fits in the remaining context. If it does not, compact BEFORE starting to avoid mid-task compaction.
+## Compact Instructions
+When compacting this conversation, preserve the following WogiFlow state:
+- The current task ID and title from `.workflow/state/ready.json` inProgress array
+- Which acceptance criteria are done vs pending
+- The current workflow phase (routing, exploring, coding, validating, completing)
+- The list of files changed in this session
+- Any spec decisions or architectural choices made during this session
+- Read `.workflow/state/task-checkpoint.json` after compaction for full state recovery
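
The last instruction above, reading the checkpoint file after compaction, might look like the following in a Node script. This is a sketch under assumptions: `readCheckpoint` is a hypothetical helper, and the checkpoint's schema is not shown in this diff, so the function returns the parsed JSON as-is without assuming any field names.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: reads the checkpoint file named in the template.
// Returns the parsed object, or null when the file is missing or not valid
// JSON, so callers can fall back to other state files.
function readCheckpoint(projectRoot) {
  const file = path.join(projectRoot, '.workflow', 'state', 'task-checkpoint.json');
  try {
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  } catch (err) {
    return null;
  }
}
```

Returning `null` instead of throwing keeps recovery best-effort, which suits a step that runs right after compaction when the file may not exist yet.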
 ## Continuous Learning

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "wogiflow",
-  "version": "2.1.3",
+  "version": "2.3.0",
   "description": "AI-powered development workflow management system with multi-model support",
   "main": "lib/index.js",
   "bin": {