wogiflow 2.15.0 → 2.16.0

Files changed (38)
  1. package/.claude/commands/wogi-challenge.md +4 -4
  2. package/.claude/commands/wogi-gate-stats.md +1 -1
  3. package/.claude/commands/wogi-start-continuation.md +10 -10
  4. package/.claude/commands/wogi-start.md +2 -0
  5. package/.claude/docs/intent-grounded-reasoning.md +1 -1
  6. package/.claude/docs/knowledge-base/02-task-execution/02-execution-loop.md +8 -0
  7. package/.claude/docs/knowledge-base/02-task-execution/03-verification.md +110 -10
  8. package/.claude/docs/knowledge-base/02-task-execution/README.md +10 -0
  9. package/.claude/docs/knowledge-base/02-task-execution/decision-authority.md +110 -0
  10. package/.claude/docs/knowledge-base/02-task-execution/workspace-mode.md +176 -0
  11. package/.claude/docs/knowledge-base/04-memory-context/context-management.md +40 -0
  12. package/.claude/docs/knowledge-base/04-memory-context/memory-systems.md +12 -1
  13. package/.claude/docs/knowledge-base/06-safety-guardrails/README.md +1 -0
  14. package/.claude/docs/knowledge-base/06-safety-guardrails/mechanical-gates.md +150 -0
  15. package/.claude/docs/knowledge-base/wogiflow-enterprise-showcase.md +423 -0
  16. package/.claude/docs/phases/02-spec.md +2 -2
  17. package/.claude/docs/phases/04-verify.md +1 -1
  18. package/.workflow/agents/logic-adversary.md +7 -2
  19. package/.workflow/templates/claude-md.hbs +27 -0
  20. package/lib/wogi-claude +87 -0
  21. package/package.json +3 -2
  22. package/scripts/flow-architect-pass.js +3 -3
  23. package/scripts/flow-config-defaults.js +51 -0
  24. package/scripts/flow-constants.js +3 -1
  25. package/scripts/flow-correct.js +1 -0
  26. package/scripts/flow-done.js +16 -0
  27. package/scripts/flow-hook-status.js +6 -2
  28. package/scripts/flow-logic-adversary.js +4 -4
  29. package/scripts/flow-migrate-igr.js +1 -1
  30. package/scripts/hooks/core/phase-read-gate.js +52 -9
  31. package/scripts/hooks/core/post-compact.js +18 -0
  32. package/scripts/hooks/core/session-context.js +26 -0
  33. package/scripts/hooks/core/session-end.js +10 -0
  34. package/scripts/hooks/core/session-history.js +116 -0
  35. package/scripts/hooks/core/task-boundary-reset.js +249 -0
  36. package/scripts/hooks/core/task-completed.js +35 -0
  37. package/scripts/hooks/entry/claude-code/pre-tool-use.js +10 -5
  38. package/scripts/hooks/entry/claude-code/stop.js +63 -0
@@ -0,0 +1,150 @@
1
+ # Mechanical Enforcement Gates
2
+
3
+ WogiFlow enforces workflow rules through PreToolUse hook gates: JavaScript code that intercepts every tool call and blocks violations before they happen. These are not prompt suggestions; they are hard blocks, enforced in code, that the AI cannot bypass.
4
+
5
+ ---
6
+
7
+ ## How It Works
8
+
9
+ Every time Claude Code calls a tool (Edit, Write, Bash, Read, Grep, etc.), the PreToolUse hook fires first. It runs through a chain of gates. If any gate returns `blocked: true`, the tool call is rejected with an error message telling the AI what to do instead.
10
+
11
+ ```
+ Claude Code → tool call → PreToolUse hook → [Gate chain] → allowed / blocked
+ ```
14
+
15
+ Gates are fail-open by default (errors don't block work) unless noted. The routing gate is fail-closed (errors block everything until routing completes).
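The chain and its two failure modes can be sketched roughly as follows. This is a minimal illustration, not WogiFlow's actual hook code: `runGateChain`, the gate object shape (`name`, `failClosed`, `check`), and both sample gates are hypothetical names invented for the example.

```javascript
// Hypothetical sketch of a gate chain; not WogiFlow's real implementation.
// Each gate's check() returns { blocked: boolean, message?: string }.
function runGateChain(gates, toolCall) {
  for (const gate of gates) {
    try {
      const result = gate.check(toolCall);
      if (result.blocked) {
        return { allowed: false, gate: gate.name, message: result.message };
      }
    } catch (err) {
      // Fail-closed gates block when they error; fail-open gates are skipped.
      if (gate.failClosed) {
        return { allowed: false, gate: gate.name, message: `gate error: ${err.message}` };
      }
    }
  }
  return { allowed: true };
}

// Example gates (assumed shapes, for illustration only).
const routingGate = {
  name: 'routing',
  failClosed: true,
  check: (call) => {
    if (call.routed === undefined) throw new Error('routing state unavailable');
    return call.routed
      ? { blocked: false }
      : { blocked: true, message: 'Run /wogi-start first' };
  },
};

const flakyGate = {
  name: 'flaky',
  failClosed: false,
  check: () => {
    throw new Error('state file unreadable');
  },
};
```

In this sketch, an error in `flakyGate` is swallowed and the chain continues, while an error in `routingGate` rejects the call, which mirrors the fail-open and fail-closed behaviours described above.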
16
+
17
+ ---
18
+
19
+ ## Gate Reference
20
+
21
+ ### Routing Gate
22
+ **Enforces**: Every user message must go through `/wogi-start` before any tool can run.
23
+ **Blocks**: Read, Glob, Grep, Edit, Write, Bash, Agent, WebSearch, WebFetch, EnterPlanMode
24
+ **Fail mode**: Fail-CLOSED (if the gate errors, tools are blocked)
25
+ **Config**: `hooks.rules.routingGate.enabled`
26
+ **Exceptions**: Read-only git commands (`git status`, `git log`, `git diff`), subagents with an active parent task
27
+
28
+ ### Phase Gate
29
+ **Enforces**: Tools are restricted based on the current workflow phase.
30
+ **Rules**:
31
+ - `routing` phase: Edit, Write, Bash blocked
32
+ - `exploring` phase: Edit, Write blocked (research is read-only)
33
+ - `spec_review` phase: Edit, Write, Bash blocked (reviewing, not coding)
34
+ - `coding` phase: All tools allowed
35
+ - `validating` phase: Edit, Write blocked (verifying, not changing)
36
+ - `completing` phase: All tools allowed (for logs, maps, commits)
37
+ **Config**: `hooks.rules.phaseGate.enabled`
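The phase rules above amount to a lookup table. The sketch below transcribes them into one; the names `BLOCKED_BY_PHASE` and `checkPhaseGate`, and the choice to allow calls in an unknown phase, are assumptions made for illustration rather than the package's real API.

```javascript
// Blocked tools per phase, transcribed from the rules listed above.
const BLOCKED_BY_PHASE = {
  routing: ['Edit', 'Write', 'Bash'],
  exploring: ['Edit', 'Write'],
  spec_review: ['Edit', 'Write', 'Bash'],
  coding: [],
  validating: ['Edit', 'Write'],
  completing: [],
};

// Hypothetical helper: returns a gate result for a tool in a given phase.
function checkPhaseGate(phase, tool) {
  const known = Object.prototype.hasOwnProperty.call(BLOCKED_BY_PHASE, phase);
  if (!known) {
    // Unknown phase: this sketch allows the call (an assumption, not documented behaviour).
    return { blocked: false };
  }
  if (BLOCKED_BY_PHASE[phase].includes(tool)) {
    return { blocked: true, message: `${tool} is not allowed during the ${phase} phase` };
  }
  return { blocked: false };
}
```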
38
+
39
+ ### Phase-Read Gate
40
+ **Enforces**: Must read the phase instruction file before using mutation tools.
41
+ **Blocks**: Edit, Write, Bash — until the current phase's file in `.claude/docs/phases/` is read
42
+ **Purpose**: Enables on-demand loading of pipeline instructions (79% token savings for conversations)
43
+ **State file**: `.workflow/state/phase-reads.json`
44
+ **Config**: Respects `hooks.rules.phaseGate.enabled` (same toggle)
45
+
46
+ ### Scope Gate
47
+ **Enforces**: Edits must be within the task's declared file scope.
48
+ **Blocks**: Edit, Write on files not listed in the task spec
49
+ **Config**: `hooks.rules.scopeGating.enabled`
50
+
51
+ ### Bugfix Scope Gate
52
+ **Enforces**: L3 bugfix tasks are limited in how many files they can touch.
53
+ **Behavior**: Warns after 3 unique file edits, blocks at configurable threshold
54
+ **Purpose**: Prevents scope creep — a "quick fix" that touches 15 files should be an L2 task
55
+ **Config**: `enforcement.bugfixScope.enabled`
56
+
57
+ ### Scope Mutation Gate
58
+ **Enforces**: Fix tasks cannot create new files; tasks cannot delete pre-existing files via Bash.
59
+ **Blocks**: Write (new file creation in fix tasks), Bash (rm commands on tracked files)
60
+ **Config**: `enforcement.scopeMutation.enabled`
61
+
62
+ ### Strike Gate
63
+ **Enforces**: After repeated verification failures, blocks further edits.
64
+ **Behavior**: Tracks consecutive failures per task. After 3 strikes, blocks Edit/Write/Bash.
65
+ **Purpose**: Prevents infinite retry loops where the AI tries the same broken approach repeatedly
66
+ **State file**: `.workflow/state/strike-tracker.json`
67
+ **Config**: `enforcement.strikeEscalation.enabled`
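A rough sketch of the strike counting described above. It keeps state in a plain object rather than in `.workflow/state/strike-tracker.json`, and it assumes that a passing verification resets the count (implied by "consecutive" but not stated). `recordVerification` and `checkStrikeGate` are hypothetical names.

```javascript
const MAX_STRIKES = 3; // threshold stated in the gate description

// Hypothetical in-memory tracker; the real gate persists its state to a JSON file.
function recordVerification(tracker, taskId, passed) {
  const current = tracker[taskId] || 0;
  // Assumption: a pass resets the consecutive-failure count; a failure increments it.
  return { ...tracker, [taskId]: passed ? 0 : current + 1 };
}

function checkStrikeGate(tracker, taskId, tool) {
  const strikes = tracker[taskId] || 0;
  const gated = ['Edit', 'Write', 'Bash'].includes(tool);
  if (gated && strikes >= MAX_STRIKES) {
    return {
      blocked: true,
      message: `${strikes} consecutive verification failures on ${taskId}`,
    };
  }
  return { blocked: false };
}
```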
68
+
69
+ ### Deploy Gate
70
+ **Enforces**: Cannot deploy without a verification artifact. Cannot write/edit verification artifacts directly (anti-forgery).
71
+ **Blocks**: Bash (deploy commands without artifact), Write/Edit (on verification artifact files)
72
+ **Config**: `enforcement.deployGate.enabled`
73
+
74
+ ### Git Safety Gate
75
+ **Enforces**: Creates automatic backup before destructive git operations.
76
+ **Triggers on**: `git reset --hard`, `git checkout -- .`, `git restore .`, `git clean -f`
77
+ **Behavior**: Creates a backup branch before allowing the destructive operation
78
+ **Config**: `enforcement.gitSafety.enabled`
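One way the trigger detection could work is simple pattern matching on the Bash command string. The regular expressions and the `isDestructiveGitCommand` helper below are assumptions for illustration; the source lists only the four commands, not how they are matched.

```javascript
// Patterns for the four destructive commands listed above.
// The regexes are this sketch's guess at how matching could be done.
const DESTRUCTIVE_GIT_PATTERNS = [
  /\bgit\s+reset\s+(?:.*\s)?--hard\b/,
  /\bgit\s+checkout\s+--\s+\.(?:\s|$)/,
  /\bgit\s+restore\s+\.(?:\s|$)/,
  /\bgit\s+clean\s+(?:.*\s)?-[a-zA-Z]*f/,
];

// Returns true when the command matches any destructive pattern.
function isDestructiveGitCommand(command) {
  return DESTRUCTIVE_GIT_PATTERNS.some((pattern) => pattern.test(command));
}
```

A gate built on this check would create the backup branch first and only then let the command through.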
79
+
80
+ ### Commit-Log Gate
81
+ **Enforces**: Commits must have corresponding request-log entries.
82
+ **Blocks**: Bash (git commit) when request-log.md wasn't updated for the current task
83
+ **Config**: `hooks.rules.commitLogGate.enabled`
84
+
85
+ ### Manager Boundary Gate
86
+ **Enforces**: Manager repos cannot modify worker repo source code (workspace mode).
87
+ **Blocks**: Edit/Write on any file inside member repos; Bash except allowlisted read-only commands
88
+ **Allows**: Read of metadata files (api-map, app-map, config, state files)
89
+ **Active when**: `WOGI_REPO_NAME === 'manager'`
90
+
91
+ ### Component Reuse Gate
92
+ **Enforces**: Must check existing components before creating new ones.
93
+ **Behavior**: When Write creates a new component file, checks app-map for similar existing components
94
+ **Output**: Warning with reuse candidates (name, path, similarity score)
95
+ **Config**: `componentReuse.enabled`
96
+
97
+ ### Standards Compliance Gate
98
+ **Enforces**: Naming conventions, security patterns, decisions.md rules.
99
+ **Runs at**: Task completion (part of quality gates)
100
+ **Checks scoped by task type**: component → naming/components/security, API → naming/api/security, feature → all
101
+ **Config**: Part of `qualityGates.<taskType>.require` array
102
+
103
+ ### Damage Control
104
+ **Enforces**: Configurable blocklist for dangerous commands and protected files.
105
+ **Behavior**: Block or ask-before-execute based on pattern matching
106
+ **Config**: `damageControl.enabled` with custom patterns in `.workflow/damage-control.yaml`
107
+
108
+ ---
109
+
110
+ ## Configuration
111
+
112
+ Most gates can be toggled independently (the Phase-Read Gate shares the Phase Gate toggle):
113
+
114
+ ```json
+ {
+   "hooks": {
+     "rules": {
+       "routingGate": { "enabled": true },
+       "phaseGate": { "enabled": true },
+       "scopeGating": { "enabled": true },
+       "commitLogGate": { "enabled": true }
+     }
+   },
+   "enforcement": {
+     "strictMode": true,
+     "bugfixScope": { "enabled": true },
+     "scopeMutation": { "enabled": true },
+     "strikeEscalation": { "enabled": true },
+     "deployGate": { "enabled": true },
+     "gitSafety": { "enabled": true }
+   },
+   "componentReuse": { "enabled": true },
+   "damageControl": { "enabled": false }
+ }
+ ```
136
+
137
+ ---
138
+
139
+ ## Fast-Path Optimization
140
+
141
+ The hook checks pre-computed status from `.workflow/state/hook-status.json`. If all gates are disabled, the entire hook exits in <1ms. Individual gates are skipped via the status cache without loading their modules.
142
+
143
+ ---
144
+
145
+ ## Related
146
+
147
+ - [Damage Control](./damage-control.md) — Configurable pattern-based protection
148
+ - [Commit Gates](./commit-gates.md) — Approval workflow for commits
149
+ - [Task Execution](../02-task-execution/) — Where gates enforce workflow phases
150
+ - [Verification](../02-task-execution/03-verification.md) — Quality gates at completion
@@ -0,0 +1,423 @@
1
+ # WogiFlow: Enterprise AI Development Workflow
2
+
3
+ **The only AI coding workflow with mechanical enforcement.**
4
+
5
+ Every other tool in the Claude Code ecosystem gives the AI _suggestions_. WogiFlow gives it _rules it physically cannot break_. This is the difference between a coding assistant that sometimes follows best practices and one that is architecturally incapable of skipping them.
6
+
7
+ ---
8
+
9
+ ## The Core Problem WogiFlow Solves
10
+
11
+ AI coding agents are powerful but undisciplined. Without structure, they:
12
+
13
+ - Skip verification and claim "done" based on "the code looks correct"
14
+ - Make untracked changes that nobody can audit
15
+ - Forget project conventions between sessions
16
+ - Hallucinate completion — static checks pass, but the feature doesn't actually work
17
+ - Make autonomous decisions about things the team should decide
18
+ - Lose context mid-task and produce inconsistent output
19
+
20
+ These problems multiply with team size. One developer with a loose AI assistant creates tech debt. Ten developers with loose AI assistants create chaos.
21
+
22
+ **WogiFlow eliminates these failure modes through mechanical enforcement** — hook-based gates that physically intercept every tool call and block violations before they happen. Not prompts asking nicely. Real code that runs before every file edit, every bash command, every tool invocation.
23
+
24
+ ---
25
+
26
+ ## How It Works: The /wogi-start Pipeline
27
+
28
+ Everything in WogiFlow begins with `/wogi-start`. This is the universal entry point — every user message, every task, every question must pass through it. The routing gate mechanically blocks all tools (Edit, Write, Bash, Read, Grep) until routing completes.
29
+
30
+ ### What Happens When You Say "Add dark mode toggle"
31
+
32
+ ```
+ User message
+       │
+       ▼
+ ┌─────────────────────────────┐
+ │  ROUTING GATE (mechanical)  │  ← All tools blocked until /wogi-start runs
+ │  PreToolUse hook intercepts │
+ │  every tool call            │
+ └─────────────┬───────────────┘
+               │
+               ▼
+ ┌─────────────────────────────┐
+ │  /wogi-start TRIAGE         │  ← Classifies: story, bug, review, conversation?
+ │  Determines task level:     │     Routes to appropriate pipeline
+ │    L0 Epic (15+ files)      │
+ │    L1 Story (5-15 files)    │
+ │    L2 Task (1-5 files)      │
+ │    L3 Subtask (1 file)      │
+ └─────────────┬───────────────┘
+               │
+               ▼
+ ```
54
+
55
+ ### The Pipeline: What Runs Depends on Task Size
56
+
57
+ | Phase | L3 Subtask | L2 Task | L1 Story | L0 Epic |
+ |-------|-----------|---------|----------|---------|
+ | **Routing + Context** | Yes | Yes | Yes | Yes |
+ | **Multi-Agent Research** (6 parallel agents) | Skip | Yes | Yes | Yes |
+ | **Reuse Gate** (check existing components) | Skip | Yes | Yes | Yes |
+ | **Scope-Confidence Audit** | Skip | Skip | Yes | Yes |
+ | **Architect Pass** (IGR) | Skip | Skip | Yes | Yes |
+ | **Logic Adversary** (different model critiques plan) | Skip | Skip | Yes | Yes |
+ | **Spec Generation + Approval** | Skip | Conditional | Yes (blocks for approval) | Yes (blocks for approval) |
+ | **Implementation Loop** | Yes | Yes | Yes | Yes |
+ | **Skeptical Evaluator** (separate agent grades work) | Skip | Yes | Yes | Yes |
+ | **Runtime Verification** (auto-generated tests) | Yes | Yes | Yes | Yes |
+ | **Wiring Validation** | Yes | Yes | Yes | Yes |
+ | **Standards Compliance** | Yes | Yes | Yes | Yes |
+ | **Quality Gates** | Yes | Yes | Yes | Yes |
72
+
73
+ A trivial L3 fix takes seconds. An L1 story goes through the full pipeline — research, planning, adversarial review, implementation, independent verification, quality gates. The AI doesn't choose what to skip; the pipeline enforces it based on task classification.
74
+
75
+ ### Phase-Loaded Architecture
76
+
77
+ The pipeline instructions are split into 5 phase files loaded on-demand. A conversation that never reaches the coding phase never loads coding instructions — saving ~79% of prompt tokens. The PreToolUse hook blocks Edit/Write/Bash until the current phase's instruction file is read, ensuring the AI always has the right instructions loaded.
78
+
79
+ ---
80
+
81
+ ## Mechanical Enforcement: 12+ Gates That Cannot Be Bypassed
82
+
83
+ Every tool call in WogiFlow passes through the PreToolUse hook. This is real JavaScript code that executes before Claude Code processes any Edit, Write, Bash, Read, or Grep command. The gates are not suggestions — they are physical blocks.
84
+
85
+ | Gate | What It Enforces | Action on Violation |
+ |------|-----------------|---------------------|
+ | **Routing Gate** | Every message must go through /wogi-start first | Block all tools |
+ | **Phase Gate** | Tools restricted per workflow phase (no editing during research) | Block Edit/Write |
+ | **Phase-Read Gate** | Must read phase instructions before working | Block Edit/Write/Bash |
+ | **Scope Gate** | Edits must be within the task's declared file scope | Block Edit/Write |
+ | **Bugfix Scope Gate** | L3 bugfixes warn after 3 edited files and block at a configurable threshold | Warn then block |
+ | **Scope Mutation Gate** | Fix tasks can't create new files; can't delete pre-existing files | Block Write/Bash |
+ | **Strike Gate** | After 3 failed verification attempts, blocks further edits | Block Edit/Write/Bash |
+ | **Deploy Gate** | Can't deploy without verification artifact | Block Bash |
+ | **Git Safety Gate** | Auto-backup before destructive git operations | Block or backup |
+ | **Commit-Log Gate** | Commits must have request-log entries | Block Bash (git commit) |
+ | **Manager Boundary Gate** | Manager repos can't modify worker repo source code | Block Edit/Write |
+ | **Component Reuse Gate** | Must check existing components before creating new ones | Warn on Write |
+ | **Standards Compliance** | Naming, security, decisions.md rules enforced | Block task completion |
+ | **Damage Control** | Configurable blocklist for dangerous commands/files | Block or ask |
101
+
102
+ **Example**: A developer asks Claude to "quickly fix the login bug." Without WogiFlow, Claude edits 8 files, skips tests, and says "done." With WogiFlow:
103
+
104
+ 1. Routing gate forces `/wogi-start` classification → L3 bugfix
105
+ 2. Bugfix scope gate warns after 3 files, blocks after threshold
106
+ 3. Phase gate prevents editing during the research phase
107
+ 4. Strike gate blocks further changes after 3 consecutive failed verification attempts
108
+ 5. Standards gate verifies the fix follows project conventions
109
+ 6. Commit-log gate ensures the change is logged before commit
110
+
111
+ ---
112
+
113
+ ## Intent-Grounded Reasoning (IGR)
114
+
115
+ For L1+ tasks, WogiFlow adds a reasoning layer that catches logic failures _before_ code is written.
116
+
117
+ ### The Architect + Adversary Pattern
118
+
119
+ 1. **Intent Framing**: The AI explicitly interprets the task — resolving ambiguous terms, identifying affected user journeys, surfacing assumptions
120
+ 2. **Architect Pass**: A read-only sub-agent produces an 8-section plan (approach, data model, risks, alternatives, dependencies)
121
+ 3. **Logic Adversary**: A _different model_ critiques the plan against a 10-principle Logic Constitution. Same-model self-critique is a known rubber-stamp failure mode — WogiFlow uses Sonnet to critique Opus plans (or vice versa)
122
+ 4. **Iteration Loop**: If the Adversary finds issues, the plan goes back to the Architect. Max 3 rounds. If it still fails → task is blocked and surfaced to the user
123
+
124
+ This catches architectural mistakes, missing edge cases, and flawed assumptions before a single line of code is written.
125
+
126
+ ### Completion Truth Gate
127
+
128
+ When the AI claims a task is "done," the Truth Gate audits every acceptance criterion against evidence tiers:
129
+
130
+ | Tier | Name | Counts as Done? |
+ |------|------|----------------|
+ | 0 | STATIC (compiles, lints) | Never |
+ | 1 | STRUCTURAL (file exists, imported) | Never |
+ | 2 | OBSERVATIONAL (page loads, renders) | Display-only criteria |
+ | 3 | INTERACTIVE (click → result persists) | Yes |
+ | 4 | AUTOMATED (test passes) | Yes (strongest) |
137
+
138
+ If the AI claims "done" with only Tier 0 evidence (it compiles), the gate downgrades the claim to "implemented (unverified)" and blocks task completion.
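The tier rules in the table can be expressed as a small audit function. This is a sketch under stated assumptions: the criterion shape (`text`, `displayOnly`, `evidenceTier`) and the function names are hypothetical, and only the tier semantics come from the table.

```javascript
// Evidence tiers from the table above.
const TIER = { STATIC: 0, STRUCTURAL: 1, OBSERVATIONAL: 2, INTERACTIVE: 3, AUTOMATED: 4 };

// Hypothetical criterion shape: { text, displayOnly, evidenceTier }.
function criterionIsDone(criterion) {
  if (criterion.evidenceTier >= TIER.INTERACTIVE) return true;
  // Tier 2 evidence only counts for display-only criteria.
  return criterion.evidenceTier === TIER.OBSERVATIONAL && criterion.displayOnly === true;
}

// Downgrades the claim when any criterion lacks sufficient evidence.
function auditCompletion(criteria) {
  const unverified = criteria.filter((c) => !criterionIsDone(c));
  return unverified.length === 0
    ? { status: 'done', unverified }
    : { status: 'implemented (unverified)', unverified };
}
```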
139
+
140
+ ---
141
+
142
+ ## Verification: The Skeptical Evaluator
143
+
144
+ After implementation, WogiFlow spawns a separate sub-agent specifically tuned for skepticism:
145
+
146
+ - **Different model** than the implementer (prevents self-congratulation bias)
147
+ - **Prompted to find problems**, not praise
148
+ - **Grades each criterion**: PASS / PARTIAL / FAIL with file:line evidence
149
+ - **Iteration loop**: If issues found → fix → re-evaluate (max 3 rounds)
150
+
151
+ This is based on Anthropic's own harness design research: _"Separating the agent doing the work from the agent judging it is a strong lever"_ and _"tuning standalone evaluators toward skepticism is far more tractable than making a generator critical of its own work."_
152
+
153
+ ---
154
+
155
+ ## Enforced Testing: Auto-Generated Verification Tests
156
+
157
+ WogiFlow doesn't trust "it compiles" as proof that a feature works. For every task that changes code, the Runtime Verification Gate automatically generates and runs tests — frontend and backend — as part of the execution loop. This is ON by default, not opt-in.
158
+
159
+ ### How It Works
160
+
161
+ For each acceptance criterion in the task spec:
162
+ 1. **Classify**: Is this a UI behavior, API behavior, or internal logic?
163
+ 2. **Generate**: Write a test that exercises the specific criterion
164
+ 3. **Implement**: Write the actual code
165
+ 4. **Run**: Execute the test — it must pass
166
+ 5. **If fail** → debug, fix, re-run (max 5 retries)
167
+ 6. **Persist**: Test stays in `tests/verification/` as a permanent regression guard
168
+
169
+ Over time, this builds an automated test suite from the actual use cases that were implemented — not hypothetical test cases, but tests that directly verify the features your team shipped.
170
+
171
+ ### Frontend: Browser Verification
172
+
173
+ When changed files include UI code (`.tsx`, `.jsx`, `.vue`, `.svelte`, `.css`), WogiFlow generates browser-level tests:
174
+
175
+ **Method 1 — WebMCP (preferred)**: If a browser MCP server is connected, WogiFlow drives the actual browser:
176
+ 1. Navigate to the affected page
177
+ 2. Screenshot BEFORE the change
178
+ 3. Perform the user action (click, type, submit)
179
+ 4. Wait for async updates
180
+ 5. Screenshot AFTER
181
+ 6. Assert DOM state matches expected behavior
182
+ 7. For state mutations: reload the page and verify changes persisted
183
+
184
+ **Method 2 — Playwright**: Auto-generates a Playwright test to `tests/verification/verify-{taskId}.spec.ts`. Persists as a CI-ready regression guard.
185
+
186
+ **Method 3 — User Checklist (fallback)**: When no browser automation is available, generates a specific checklist and blocks task completion until the user replies "verified."
187
+
188
+ **High-risk mutation detection**: When code contains `useMutation`, `invalidateQueries`, or `onMutate`, WogiFlow adds extra verification — wait 3 seconds after all criteria pass, reload the page, and confirm state survived the refetch cycle. This catches the #1 frontend false-completion: "it looked right but the server didn't persist it."
189
+
190
+ ### Backend: API Integration Tests
191
+
192
+ When changed files include API code (`.controller.*`, `.service.*`, `.resolver.*`, `/routes/`, `/api/`, `.dto.*`, `.guard.*`, `.middleware.*`), WogiFlow generates integration tests:
193
+
194
+ - Makes actual HTTP requests to the running dev server
195
+ - Asserts status codes, response shapes, and field values
196
+ - For mutations (POST/PUT/PATCH/DELETE): re-fetches the resource to verify persistence
197
+ - Auth-protected endpoints get the auth token included automatically
198
+ - Tests persist in `tests/verification/api-verify-{taskId}.test.js`
199
+
200
+ ### Fullstack: Boundary Verification
201
+
202
+ When both UI and API files change, WogiFlow generates BOTH browser and API tests and validates the boundary:
203
+ - The API test verifies the server accepts the payload shape the frontend sends
204
+ - The browser test verifies the UI correctly displays the response shape the server returns
205
+ - If either side fails → the frontend/backend contract is broken
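The frontend, backend, and fullstack cases above depend on classifying the changed files. A minimal sketch of that classification, using the extensions and path markers listed in the preceding sections; `classifyChange` and the plain substring matching are assumptions, not the package's documented behaviour.

```javascript
// File patterns taken from the Frontend and Backend sections above.
const UI_EXTENSIONS = ['.tsx', '.jsx', '.vue', '.svelte', '.css'];
const API_MARKERS = [
  '.controller.', '.service.', '.resolver.', '/routes/',
  '/api/', '.dto.', '.guard.', '.middleware.',
];

// Hypothetical classifier: decides which verification tests a change would need.
function classifyChange(files) {
  const hasUi = files.some((f) => UI_EXTENSIONS.some((ext) => f.endsWith(ext)));
  const hasApi = files.some((f) => API_MARKERS.some((marker) => f.includes(marker)));
  if (hasUi && hasApi) return 'fullstack';
  if (hasUi) return 'frontend';
  if (hasApi) return 'backend';
  return 'none';
}
```

Substring matching is deliberately naive here: a file such as `src/api/Button.tsx` would count as both UI and API code.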
206
+
207
+ ### Banned Verification Methods
208
+
209
+ WogiFlow explicitly bans methods that give false confidence for UI tasks:
210
+
211
+ | Banned Method | Why It's Insufficient |
+ |---|---|
+ | `tsc --noEmit` passes | Type-correct code can have wrong runtime behavior |
+ | `vite build` succeeds | Build success says nothing about UX |
+ | `grep` deployed bundle | Function may exist but never execute |
+ | "I read the code and it's correct" | Author is the worst judge of own work |
217
+
218
+ ### The Repeat Failure Protocol
219
+
220
+ When the same issue appears in 2+ consecutive attempts:
221
+
222
+ | Strike | What Happens |
+ |--------|-------------|
+ | 1 | Normal fix + evidence log |
+ | 2 | Mandatory root cause analysis BEFORE coding. Must change approach. Tier 3+ evidence required. |
+ | 3 | Hard block: Cannot mark done without screenshot/console evidence. Must explain what's different this time. |
+ | 4+ | Escalation: Acknowledge inability, suggest pair debugging with the developer. |
228
+
229
+ This prevents the AI from trying the same broken approach in a loop — a pattern that wastes hours on real projects.
230
+
231
+ ---
232
+
233
+ ## Hybrid Mode: Smart Model Routing
234
+
235
+ Not every task needs the most expensive model. WogiFlow's hybrid mode lets Opus plan while cheaper models execute:
236
+
237
+ | Task Complexity | Executor | Token Savings |
+ |----------------|----------|---------------|
+ | Typo fix, config edit | Haiku / GPT-4o-mini | ~75% |
+ | New function, component | Sonnet / GPT-4o | ~60% |
+ | Documentation | Haiku | ~80% |
+ | Complex refactoring | Opus only | 0% (not delegated) |
243
+
244
+ **How it works:**
245
+ 1. Opus analyzes the task and creates a detailed execution plan
246
+ 2. You review and approve (or edit via `/wogi-hybrid-edit`)
247
+ 3. The executor model (Haiku, Sonnet, Ollama, etc.) runs each step
248
+ 4. Opus validates results — lint, typecheck, standards checks
249
+ 5. Opus handles failures with escalation back to itself if needed
250
+
251
+ **Supports cloud and local models**: Haiku, Sonnet, GPT-4o, GPT-4o-mini, Gemini Flash/Pro, Ollama (Qwen3-Coder, DeepSeek Coder, Nemotron). Local models = free execution.
252
+
253
+ ---
254
+
255
+ ## Code Review: /wogi-review
256
+
257
+ WogiFlow's review is not "look at the code and give suggestions." It's a 5-phase verification pipeline:
258
+
259
+ 1. **Verification Gates**: Spec verification, lint, typecheck, test execution
260
+ 2. **AI Analysis**: Multi-pass or parallel code/logic/security/architecture review with adversarial minimum findings (the reviewer must find at least N issues — prevents rubber-stamping)
261
+ 3. **Git-Verified Claim Checking**: Cross-references spec claims against actual git diff. If the spec promises a file that isn't in the diff → BLOCKED
262
+ 4. **Standards Compliance [STRICT]**: Every finding checked against decisions.md, app-map.md, naming conventions. MUST_FIX violations block sign-off
263
+ 5. **Post-Review Workflow**: Fix loop with automatic task creation for findings
264
+
265
+ The review produces a structured report with findings classified by severity (critical, high, medium, low) and type (MUST_FIX, SHOULD_FIX, SUGGESTION). Critical/high findings block the next release.
266
+
267
+ ---
268
+
269
+ ## Morning Briefing: /wogi-morning
270
+
271
+ Start every day with instant situational awareness:
272
+
273
+ - **Where you left off**: Current task, status, files touched
274
+ - **What happened since**: New commits, issues, changes by teammates
275
+ - **Rule violations**: Auto-promoted patterns awaiting enforcement decisions
276
+ - **Stale skills**: Documentation older than 90 days flagged for refresh
277
+ - **Recommended next tasks**: Top 3 by priority from the backlog
278
+ - **Ready-to-use prompt**: Copy-paste continuation prompt to resume immediately
279
+
280
+ This eliminates the 10-15 minute "where was I?" ramp-up that costs teams hours per week.
281
+
282
+ ---
283
+
284
+ ## Session End: /wogi-session-end
285
+
286
+ When you stop working, WogiFlow preserves everything:
287
+
288
+ - **Structured handoff notes**: What's completed, what's in progress, what's next
289
+ - **Cross-session pattern detection**: Identifies requests repeated 3+ times across sessions and suggests promoting them to permanent rules
290
+ - **State persistence**: All workflow state committed to git — survives restarts, crashes, and machine changes
291
+ - **Request log verification**: Ensures all changes are logged before closing
292
+ - **Component registry check**: Verifies new components are registered in app-map
293
+
294
+ The next session (or the next developer on the same repo) picks up exactly where you left off.
295
+
296
+ ---
297
+
298
+ ## Workspace Mode: Multi-Repo Orchestration
299
+
300
+ This is the game changer for enterprise teams. WogiFlow can manage multiple repositories from a single orchestrator.
301
+
302
+ ### The Manager-Worker Architecture
303
+
304
+ ```
305
+ Manager (workspace root — orchestrates, never codes)
306
+
307
+ ├── Backend repo (provider — APIs, database)
308
+ ├── Frontend repo (consumer — UI, pages)
309
+ ├── Shared repo (library — types, utils)
310
+ └── Mobile repo (consumer — native app)
311
+ ```
312
+
313
+ **How it works:**
314
+ 1. You tell the manager: "Add user profile editing"
315
+ 2. Manager reads metadata from all repos (API maps, component registries, schemas — never source code)
316
+ 3. Manager analyzes which repos are affected and in what order
317
+ 4. Manager creates phased execution plan: library → provider → consumer
318
+ 5. Each worker repo gets a task dispatched via HTTP channel
319
+ 6. Workers execute independently in their own Claude Code sessions
320
+ 7. Workers communicate via message bus: contract changes, questions, completion signals
321
+ 8. Manager validates cross-repo integration
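Step 4's phased ordering (library, then provider, then consumer) can be sketched as a grouping over repo roles. The repo shape `{ name, role }` and the `planPhases` helper are hypothetical; only the role names and their order come from the text above.

```javascript
// Role order from step 4 above: library → provider → consumer.
const ROLE_ORDER = ['library', 'provider', 'consumer'];

// Hypothetical repo shape: { name, role }.
// Groups affected repos into phases and drops phases with no repos.
function planPhases(affectedRepos) {
  return ROLE_ORDER
    .map((role) => ({
      role,
      repos: affectedRepos.filter((r) => r.role === role).map((r) => r.name),
    }))
    .filter((phase) => phase.repos.length > 0);
}
```

Repos within the same phase have no ordering constraint in this sketch, so they could in principle be dispatched in parallel.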
322
+
323
+ ### Mechanical Boundary Enforcement
324
+
325
+ The manager physically cannot modify worker repo source code. The Manager Boundary Gate blocks all Edit/Write operations on files inside member repos. The manager can only:
326
+ - Read metadata (api-map, app-map, schema-map, config)
327
+ - Dispatch tasks to workers
328
+ - Read messages from the message bus
329
+ - Validate cross-repo contracts
330
+
331
+ This prevents the orchestrator from becoming a bottleneck or making unauthorized changes.
332
+
333
+ ### Cross-Repo Quality Gates
334
+
335
+ When workspace mode is active, additional gates enforce integration quality:
336
+ - **Contract Compliance**: Changes must comply with declared API contracts
337
+ - **Peer Notification**: Affected repos are automatically notified of changes
338
+ - **Cascade Verification**: Library changes trigger verification in all consumer repos
339
+ - **Impact Query**: Before implementing, workers can ask peers "will my change break you?"
340
+
341
+ ### Agent-to-Agent Communication
342
+
343
+ Workers communicate through 11 message types, including:
344
+ - `contract-change` — "I changed an API endpoint"
345
+ - `question` — "Does your side handle X?"
346
+ - `impact-query` / `impact-response` — Pre-implementation impact assessment
347
+ - `lock-acquired` / `lock-released` — Shared interface edit coordination
348
+ - `verification-request` — "Please verify your integrations"
349
+
350
+ ---
351
+
352
+ ## Self-Improving Workflow
353
+
354
+ WogiFlow learns from corrections and promotes patterns to permanent rules:
355
+
356
+ 1. **Feedback Patterns**: When you correct the AI, it records the pattern in `feedback-patterns.md`
357
+ 2. **Promotion Threshold**: When a pattern occurs 3+ times, it's promoted to `decisions.md` (permanent project rules)
358
+ 3. **Gate Telemetry**: Every gate tracks pass/catch/miss rates. A gate that consistently passes work that later needs correction has a high "miss rate" — revealing rubber-stamping
359
+ 4. **Correction Memory**: The IGR system cross-references corrections back to gates that previously approved the flawed work
360
+ 5. **Decision Authority**: The AI learns which decisions it can make autonomously vs which need human approval, calibrated per category
361
+
362
+ This means the more you use WogiFlow, the better it gets. As an illustration: if week 1 catches 60% of issues and week 8 catches 90%, the additional 30% comes from rules learned from your corrections.
363
+
364
+ ---
365
+
366
+ ## Memory System
367
+
368
+ WogiFlow maintains persistent memory across sessions using SQLite with semantic search:
369
+
370
+ - **SQLite database** with HuggingFace Transformers embeddings (all-MiniLM-L6-v2)
371
+ - **Cosine similarity search**: "Find decisions related to authentication" works without exact keywords
372
+ - **Relevance decay**: Facts accessed frequently stay hot; unused facts decay over 30 days
373
+ - **Cold retention**: Facts not accessed in 90 days are archived
374
+ - **Auto-promotion**: Facts accessed 3+ times with high relevance are promoted to permanent knowledge
375
+ - **Structured registries**: app-map, function-map, api-map, schema-map, service-map — deterministic lookup for components, functions, endpoints, schemas
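The cosine similarity search mentioned above ranks stored facts by the angle between embedding vectors. The sketch below shows the calculation on plain arrays; it does not call SQLite or the embedding model, and `rankFacts` and the fact shape (`text`, `embedding`) are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical fact shape: { text, embedding }. Returns the topK closest facts.
function rankFacts(queryEmbedding, facts, topK = 3) {
  return facts
    .map((fact) => ({ ...fact, score: cosineSimilarity(queryEmbedding, fact.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```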
376
+
377
+ ---
378
+
379
+ ## Why This Matters for Enterprise
380
+
381
+ ### The Cost of Undisciplined AI Coding
382
+
383
+ | Problem | Without WogiFlow | With WogiFlow |
+ |---------|-----------------|---------------|
+ | Untracked changes | AI edits files without logging | Every change logged, tagged, auditable |
+ | Skipped verification | "It compiles" = "it works" | 5-tier evidence system, skeptical evaluator |
+ | Convention drift | Each session forgets project rules | Rules mechanically enforced, self-improving |
+ | Scope creep | Fix one bug, touch 15 files | Bugfix scope gate limits blast radius |
+ | Knowledge loss | Context lost between sessions | SQLite memory + structured handoffs |
+ | Unsafe deployments | Deploy without testing | Deploy gate blocks without verification |
+ | Team inconsistency | 10 developers, 10 styles | Standards gate enforces unified conventions |
+ | Wasted tokens | Full pipeline instructions loaded for every message | Phase-loaded router: ~79% prompt-token savings for conversations that never reach coding |
+ | Multi-repo chaos | Manual coordination across repos | Workspace orchestration with boundary enforcement |
394
+
395
+ ### What Companies Get
396
+
397
+ 1. **Auditability**: Every task is tracked from request through implementation to verification. The request log, git history, and verification artifacts create a complete audit trail.
398
+
399
+ 2. **Consistency**: Mechanical enforcement means the AI follows the same process whether it's Monday morning or Friday 5pm, whether the developer is senior or junior.
400
+
401
+ 3. **Scalability**: Hybrid mode reduces token costs by 60-75%. Workspace mode enables multi-repo orchestration. Phase loading optimizes prompt costs.
402
+
403
+ 4. **Safety**: 12+ mechanical gates prevent unauthorized changes, scope creep, unsafe deployments, and convention violations. The AI literally cannot bypass them.
404
+
405
+ 5. **Continuous improvement**: The self-learning system means the workflow gets better over time without manual rule maintenance. Corrections become permanent rules automatically.
406
+
407
+ 6. **Knowledge retention**: Project knowledge persists in structured state files, semantic memory, and cross-session handoffs. New team members inherit the full learning history.
408
+
409
+ ---
410
+
411
+ ## Quick Start
412
+
413
+ ```bash
414
+ npm install -D wogiflow
415
+ npx flow onboard
416
+ ```
417
+
418
+ Onboarding analyzes the existing project, detects the tech stack, indexes components, and configures the workflow. The first task can start in under 5 minutes.
419
+
420
+ ---
421
+
422
+ *WogiFlow v2.15.0 — AGPL-3.0 Licensed*
423
+ *Teams/enterprise features available via @wogiflow/teams*
@@ -6,7 +6,7 @@ Instructions for the spec/approval phase. Loaded on-demand when phase transition
6
6
 
7
7
  **Conditional** — runs for L1+ tasks when IGR on. L3 skip. L2 runs only on ultrathink auto-bump.
8
8
 
9
- Spawn a **read-only sub-agent** (Explore subagent_type, with Read/Grep/Glob only — no Edit/Write/Bash) on a model chosen per `config.intentGroundedReasoning.architectPass.modelOverride`. Input: Framing Artifact from Step 1.15 + explore findings from Step 1.3 + scope-confidence audit from Step 1.45 + the Logic Constitution v1 rubric (so the Architect anticipates the Adversary's checks).
9
+ Spawn a **read-only sub-agent** (Explore subagent_type, with Read/Grep/Glob only — no Edit/Write/Bash) on a model chosen per `config.intentGroundedReasoning.architectPass.modelOverride`. Input: Framing Artifact from Step 1.15 + explore findings from Step 1.3 + scope-confidence audit from Step 1.45 + the Logic Constitution v2 rubric (so the Architect anticipates the Adversary's checks, including Principle 11 — Platform Capability Grounding, which demands citation + enforcement-preservation + alternative-ruled-out + fallback for every platform-capability claim).
10
10
 
11
11
  Build the prompt via `node scripts/flow-architect-pass.js prompt <task>`. Invoke via Agent tool. Output: an 8-section plan at `.workflow/plans/{taskId}.md` (PINs: approach, data-model, journey-impact, net-new, alternatives, risks, reversibility, dependencies). Parse via `parsePlanArtifact()`; if structural FAIL, re-prompt.
12
12
 
@@ -20,7 +20,7 @@ When IGR flag is OFF: SKIPPED. Pipeline proceeds from Step 1.45 directly to Step
20
20
 
21
21
  Spawn a **separate sub-agent on a different model** than the Architect (Sonnet when Architect is Opus; Opus when Architect is Sonnet — per `modelSeparation: different-from-architect`). Per Anthropic harness research, same-model self-critique is a known rubber-stamp failure mode.
22
22
 
23
- Build the prompt via `node scripts/flow-logic-adversary.js prompt .workflow/plans/{taskId}.md`. The Adversary critiques the plan against the 10-principle Logic Constitution v1 with few-shot calibration examples from `.workflow/state/adversary-calibration.json`.
23
+ Build the prompt via `node scripts/flow-logic-adversary.js prompt .workflow/plans/{taskId}.md`. The Adversary critiques the plan against the 11-principle Logic Constitution v2 with few-shot calibration examples from `.workflow/state/adversary-calibration.json`. Principle 11 (Platform Capability Grounding) always runs — every claim about hook/tool/API/platform behavior must be cited, enforcement must be preserved, an alternative must be named, and a fallback must be specified.
24
24
 
25
25
  Iteration loop (max 3 rounds by default):
26
26
  - `overallVerdict: PASS` or `PASS_WITH_CONCERNS` → proceed to Step 1.5. Concerns surface at approval gate (Step 1.6).
@@ -473,7 +473,7 @@ Run `node node_modules/wogiflow/scripts/flow-standards-gate.js wf-XXXXXXXX [chan
473
473
 
474
474
  Checks scoped by task type: component → naming/components/security. Utility → naming/functions/security. API → naming/api/security. Bugfix → naming/security. Feature → all. Refactor/migration → all + consumer-impact verification.
475
475
 
476
- **Consumer impact check** (ALL L1+ tasks): For each BREAKING consumer from explore phase (blast-radius analysis), verify it was updated. If any NOT migrated → BLOCK task completion. Results are persisted in `.workflow/state/blast-radius-{taskId}.json`.
476
+ **Consumer impact check** (ALL L1+ tasks): The blast-radius analysis ran in the explore phase (Agent 6: Consumer Impact) and wrote results to `.workflow/state/blast-radius-{taskId}.json`. That file contains an array of consumer entries, each classified as BREAKING (must update), NEEDS-UPDATE (review), or SAFE (no change needed). Read the file and, for each entry with `classification: "BREAKING"`, verify the file listed in `path` was actually modified in this task's changeset (check `git diff --name-only`). If any BREAKING consumer is NOT in the diff → BLOCK task completion and surface to user.
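
Reviewer note: the check described in the added line can be sketched as a pure function. The `classification` and `path` fields come from the text. The function names are hypothetical, and the sketch assumes that paths in the blast-radius file and in the `git diff --name-only` output are written the same way (no normalization is attempted).

```javascript
// Turn the text output of `git diff --name-only` into a list of paths.
function parseNameOnly(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean);
}

// Hypothetical helper: return the BREAKING consumers whose files do not
// appear in the task's changeset. A non-empty result means BLOCK.
function findUnmigratedConsumers(consumers, changedFiles) {
  const changed = new Set(changedFiles);
  return consumers
    .filter((entry) => entry.classification === 'BREAKING')
    .filter((entry) => !changed.has(entry.path))
    .map((entry) => entry.path);
}

// Example data standing in for blast-radius-{taskId}.json and git output.
const consumers = [
  { path: 'src/api/client.js', classification: 'BREAKING' },
  { path: 'src/ui/form.js', classification: 'BREAKING' },
  { path: 'src/ui/list.js', classification: 'NEEDS-UPDATE' },
  { path: 'src/util/date.js', classification: 'SAFE' },
];
const changedFiles = parseNameOnly('src/api/client.js\nsrc/core/model.js\n');

const unmigrated = findUnmigratedConsumers(consumers, changedFiles);
console.log(unmigrated.length === 0 ? 'OK' : `BLOCK: ${unmigrated.join(', ')}`);
```

Only BREAKING entries can block here. NEEDS-UPDATE and SAFE entries are ignored by this sketch, which matches the wording of the check but may be narrower than the real gate.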
477
477
 
478
478
  **Reuse candidate check** (AI-as-Judge): Standards gate returns similar items from all registries. AI reasons about PURPOSE overlap (not just name). If purpose overlaps → ask user (use existing / extend / create new). If purpose clearly differs → proceed silently.
479
479
 
@@ -11,9 +11,9 @@
11
11
 
12
12
  You are the **Logic Adversary** for WogiFlow's Intent-Grounded Reasoning layer.
13
13
 
14
- Your job is to find logic problems in a plan BEFORE any code is written. You are not critiquing code. You are not checking style, library choice, or syntax — other gates handle those. You are reasoning about whether this plan is **logically right** for the project it's proposed for.
14
+ Your job is to find logic problems in a plan BEFORE any code is written. You are not critiquing code. You are not checking style, library choice, or syntax — other gates handle those. You are reasoning about whether this plan is **logically right** for the project it's proposed for AND whether its claims about the target platform (hooks, tool APIs, subagent model, MCP, etc.) are actually true.
15
15
 
16
- You have a specific rubric: the Logic Constitution (currently v1). You will receive it as input. For each of the 10 principles, you produce a verdict: PASS, CONCERN, FAIL, or SKIP. You cite specific evidence for every verdict. Verdicts without evidence are themselves failures of your job.
16
+ You have a specific rubric: the Logic Constitution (currently v2). You will receive it as input. For each of the 11 principles, you produce a verdict: PASS, CONCERN, FAIL, or SKIP. You cite specific evidence for every verdict. Verdicts without evidence are themselves failures of your job.
17
17
 
18
18
  ### What you are looking for
19
19
 
@@ -29,6 +29,11 @@ Patterns that produce logic failures in practice — seen in real agent session
29
29
  8. **Implicit-requirement blindness** — happy path only, no edge cases.
30
30
  9. **User-journey orphans** — dead-end screens, unreachable features.
31
31
  10. **Undocumented irreversibility** — destructive ops without confirmation.
32
+ 11. **Ungrounded platform-capability claims** — plans that rely on a hook, tool API, subagent behavior, MCP feature, or slash command working a certain way WITHOUT citation, WITHOUT enforcement-preservation evidence, WITHOUT a ruled-out alternative, or WITHOUT a capability-unavailable fallback. For every platform-capability claim, demand all four: citation, enforcement walk-through, alternative, fallback. Missing any of the four = FAIL. **Additionally, for runtime-behavior claims (hooks firing, tools returning specific shapes, signals being handled, events being emitted), hearsay-level citations — code comments or docs claiming "X does Y" — are NOT sufficient. Demand either O1 (a captured observation: log, telemetry, trace, test result) OR O2 (a named live-test plan that produces O1 before downstream code is built). See P11.1 in the rubric. A comment saying "the hook fires" is not evidence the hook fires; a log line showing it firing is.**
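
Reviewer note: the "all four or FAIL" rule in item 11, plus the O1/O2 requirement for runtime-behavior claims, can be sketched as a small check. The claim object and every field name (`citation`, `enforcement`, `alternative`, `fallback`, `isRuntimeBehavior`, `observation`, `liveTestPlan`) are hypothetical. The actual Adversary is a prompted sub-agent, and its output format is not shown in this diff.

```javascript
// The four parts the rubric demands for every platform-capability claim.
const REQUIRED_PARTS = ['citation', 'enforcement', 'alternative', 'fallback'];

const isFilled = (value) => typeof value === 'string' && value.trim() !== '';

// Hypothetical check: FAIL if any required part is missing. Runtime-behavior
// claims also need a captured observation (O1) or a live-test plan (O2).
function checkCapabilityClaim(claim) {
  const missing = REQUIRED_PARTS.filter((part) => !isFilled(claim[part]));
  if (
    claim.isRuntimeBehavior &&
    !isFilled(claim.observation) &&
    !isFilled(claim.liveTestPlan)
  ) {
    missing.push('observation-or-live-test-plan');
  }
  return { verdict: missing.length === 0 ? 'PASS' : 'FAIL', missing };
}

const result = checkCapabilityClaim({
  citation: 'hook documentation, section on pre-tool events',
  enforcement: 'gate still blocks when the hook returns non-zero',
  alternative: 'wrapper script, ruled out because it cannot see tool input',
  fallback: '',
  isRuntimeBehavior: true,
});
console.log(result);
```

In this example the claim fails twice over: the fallback is empty, and no observation or live-test plan backs the runtime behavior.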
33
+
34
+ **P11.2 — The same discipline applies to the PROJECT'S OWN RULES**, not just platform capabilities. For every artifact a plan produces (task IDs, file names, config values, state-file entries, spec structures, commit messages), demand: (E1) which rule from `decisions.md`, `feedback-patterns.md`, `.claude/rules/`, a schema, or a validator function applies? (E2) show the artifact satisfying the rule — run the validator, show the format side-by-side with the rule, paste the passing check — *not* "I followed it." (E3) what's the failure mode when violated? Examples of P11.2 violations: (a) "task ID `wf-test0001` follows WogiFlow convention" — no, the convention requires hex, this fails `validateTaskId()`; (b) "config key `taskBoundaryReset` is valid" without being in `flow-constants.js`'s known-keys list; (c) "file name `flowFoo.js` follows kebab-case" — it doesn't. Reflex: *what's the artifact? what rule governs it? is satisfaction SHOWN, not just claimed?*
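
Reviewer note: "show the artifact satisfying the rule" (E2) can be illustrated with two pattern checks based on the examples in P11.2. Both patterns are assumptions: the text only says task IDs require hex, and the eight-digit length is inferred from the `wf-XXXXXXXX` placeholder elsewhere in this diff. This sketch is not the real `validateTaskId()`, whose implementation is not shown here.

```javascript
// Assumed format: "wf-" followed by eight lowercase hex digits.
const TASK_ID_PATTERN = /^wf-[0-9a-f]{8}$/;

// Assumed kebab-case rule for .js file names: lowercase words joined by "-".
const KEBAB_JS_FILE_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*\.js$/;

// Hypothetical helper: pair an artifact with its governing rule and record
// whether the check actually passed, instead of asserting "I followed it".
function showRuleCheck(artifact, rule, pattern) {
  return { artifact, rule, satisfied: pattern.test(artifact) };
}

const checks = [
  showRuleCheck('wf-test0001', 'task ID is hex', TASK_ID_PATTERN),
  showRuleCheck('wf-1a2b3c4d', 'task ID is hex', TASK_ID_PATTERN),
  showRuleCheck('flowFoo.js', 'file name is kebab-case', KEBAB_JS_FILE_PATTERN),
  showRuleCheck('flow-foo.js', 'file name is kebab-case', KEBAB_JS_FILE_PATTERN),
];

for (const check of checks) {
  console.log(`${check.satisfied ? 'PASS' : 'FAIL'}  ${check.artifact}  (${check.rule})`);
}
```

The two failing artifacts correspond to violations (a) and (c) in the paragraph: `wf-test0001` contains non-hex characters, and `flowFoo.js` is camelCase.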
35
+
36
+ **P11.3 — Also check for EXISTING WOGIFLOW FEATURES that touch the same domain.** Before shipping any new mechanism (hook, wrapper, CLI entry, state file, config key, skill), enumerate the sibling surface: (S1) `grep -r "execSync\|spawn.*claude" lib/ scripts/`, check `.claude/commands/`, check `scripts/flow-constants.js`, check `lib/workspace.js` — does an existing feature already touch this domain? (S2) Show how the new mechanism composes, conflicts, or integrates with each sibling. "Orthogonal" is OK but must be asserted. (S3) If integration work is needed (e.g., the new wrapper needs to be injected into workspace's `execSync('claude')` call), include it in scope OR explicitly file a follow-up story. Silent omission of sibling integration = FAIL. Example violation caught live: `wogi-claude` wrapper initially missed that `lib/workspace.js:1612` spawns claude directly, so workspace-mode workers weren't restart-capable.
32
37
 
33
38
  ### What you are NOT looking for
34
39