wogiflow 2.23.0 → 2.25.0
- package/.claude/commands/wogi-decide.md +40 -0
- package/.claude/commands/wogi-epics.md +12 -0
- package/.claude/commands/wogi-extract-review.md +25 -0
- package/.claude/commands/wogi-feature.md +35 -0
- package/.claude/commands/wogi-init.md +29 -0
- package/.claude/commands/wogi-learn.md +46 -0
- package/.claude/commands/wogi-onboard.md +26 -0
- package/.claude/commands/wogi-plan.md +28 -0
- package/.claude/commands/wogi-test-browser.md +11 -0
- package/.claude/commands/wogi-test-generate.md +11 -0
- package/.claude/commands/wogi-test.md +25 -0
- package/.claude/commands/wogi-triage.md +41 -0
- package/package.json +2 -2
- package/scripts/flow-extraction-review.js +51 -0
package/.claude/commands/wogi-decide.md
CHANGED
@@ -304,6 +304,46 @@ In `config.json`:
 }
 ```
 
+## Rule-Creation Adversary (v2.25.0+ — OPTIONAL but recommended for ambiguous rules)
+
+When creating a non-trivial rule (anything beyond pure preference-setting like "always use semicolons"), spawn an adversary on a different model (default `sonnet` via `config.researchReasoningGate.tier3.adversaryModel`) to stress-test the proposed rule BEFORE it lands in `decisions.md`.
+
+```
+Spawn Agent (subagent_type: general-purpose, model: <adversaryModel>):
+
+Input:
+Proposed rule title: <title>
+Proposed rule body: <body>
+User's original phrasing: <literal request>
+
+Prompt:
+You are the rule-creation adversary.
+1. Edge cases: name 3 situations where following this rule would produce
+worse outcomes than NOT following it.
+2. Interpretation: are there 2+ reasonable interpretations? If yes, list
+them and pick the one the user most likely meant.
+3. Scope creep: could this rule be over-applied to situations the user
+didn't intend? Suggest scope qualifiers.
+4. Verdict:
+- ACCEPT — ship as-is
+- CLARIFY — multiple interpretations; ask user
+- NARROW — over-application risk; add scope qualifiers
+- REJECT — edge cases dominate; more harm than good
+
+Output JSON: {
+"verdict", "edge_cases", "interpretations",
+"scope_qualifiers", "suggested_revision"
+}
+```
+
+Process:
+- **ACCEPT** → proceed with rule creation
+- **CLARIFY** → ask user to pick interpretation
+- **NARROW** → show scope qualifier; ask user to approve
+- **REJECT** → surface edge cases; require explicit override
+
+Fail-open: adversary unavailable → proceed with standard flow. User confirmation is still present.
+
 ## Files
 
 | Action | File |
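The verdict handling described in the wogi-decide.md hunk above could be wired roughly as follows. This is a minimal sketch, not wogiflow code: `routeRuleVerdict` and the action names are hypothetical, and treating an unrecognized verdict as "adversary unavailable" is an assumption.

```javascript
// Hypothetical mapping from adversary verdict to the next step in the flow.
// Verdict names come from the prompt above; action names are illustrative.
const RULE_VERDICT_ACTIONS = new Map([
  ['ACCEPT', 'create_rule'],
  ['CLARIFY', 'ask_user_to_pick_interpretation'],
  ['NARROW', 'ask_user_to_approve_scope_qualifiers'],
  ['REJECT', 'require_explicit_override'],
]);

function routeRuleVerdict(adversaryResult) {
  const verdict = adversaryResult ? adversaryResult.verdict : undefined;
  const action = RULE_VERDICT_ACTIONS.get(verdict);
  if (action === undefined) {
    // Fail-open: missing or unrecognized output falls back to the standard flow
    // (assumption: unknown verdicts are handled like an unavailable adversary).
    return { action: 'create_rule', failOpen: true };
  }
  return { action, failOpen: false };
}

console.log(routeRuleVerdict({ verdict: 'NARROW' }));
console.log(routeRuleVerdict(null));
```

Keeping the fail-open branch explicit makes it easy to log when the adversary was skipped, since user confirmation is the only remaining check in that case.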
package/.claude/commands/wogi-epics.md
CHANGED
@@ -165,6 +165,18 @@ You MAY:
 
 **If the user provides 9 items, the epic MUST contain 9 stories (or items grouped into stories where every item appears as an acceptance criterion). Verify this with a reconciliation count before proceeding.**
 
+## P0 Specification-Quality Gates (v2.24.0+)
+
+Before finalizing an epic, run the same P0 gates `/wogi-story` uses (`scripts/flow-story-gates.js`):
+
+1. **Long Input Gate** — ≥40 lines or ≥5 items → route to `/wogi-extract-review`
+2. **Item Reconciliation** — already enforced via Anti-Deferral above; use `gates.reconcileItems()` to generate the manifest mechanically
+3. **Consumer Impact** — if epic description contains refactor/rename/migrate/etc. → grep consumers, list breaking count, recommend phased migration at ≥5
+4. **Scope-Confidence** — extract "new X"/"existing Y" assumptions → classify against codebase → surface CONTRADICTED/UNVERIFIED as Pending Clarifications
+5. **Intent Bootstrap Coordination** — schedule IGR bootstrap if missing; respects existing session-state flag
+
+All fail-open. Call `require('wogiflow/scripts/flow-story-gates')` directly in the epic-creation flow.
+
 ## Tips
 
 - **Start with epics for major features** - Break down into stories before implementation
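The Long Input Gate threshold in the wogi-epics.md hunk above (≥40 lines or ≥5 items) can be illustrated with a small standalone check. This is a sketch under stated assumptions: `checkLongInput` is a hypothetical name, and both the item-detection pattern and the choice to count only non-empty lines are guesses, not the logic in `flow-story-gates.js`.

```javascript
// Thresholds as stated in the gate description.
const LINE_THRESHOLD = 40;
const ITEM_THRESHOLD = 5;

// Assumption: an "item" is a line that starts with a bullet (-, *, •)
// or a numbered marker such as "1." or "1)".
const ITEM_PATTERN = /^\s*(?:[-*•]|\d+[.)])\s+\S/;

function checkLongInput(text) {
  // Assumption: blank lines do not count toward the line threshold.
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== '');
  const items = lines.filter((line) => ITEM_PATTERN.test(line));
  return {
    lineCount: lines.length,
    itemCount: items.length,
    routeToExtractReview:
      lines.length >= LINE_THRESHOLD || items.length >= ITEM_THRESHOLD,
  };
}

console.log(checkLongInput('- a\n- b\n- c\n- d\n- e'));
```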
package/.claude/commands/wogi-extract-review.md
CHANGED
@@ -182,6 +182,31 @@ Contradictions are resolved automatically using temporal ordering:
 
 The superseded statement is marked and excluded from story generation.
 
+## Item Manifest Export (v2.24.0+)
+
+After review is complete, the confirmed items can be exported as an **Item Manifest** in the format `/wogi-story`'s P0 item-reconciliation gate consumes directly. This closes the re-extraction loop — when `/wogi-start` routes a long input to `/wogi-extract-review`, the resulting manifest is handed to `/wogi-story` with `bypassLongInput: true` so story creation doesn't re-route back to extraction.
+
+```bash
+flow extract-zero-loss manifest # JSON: { items, count, bypassLongInput, sourceSessionId, intentBootstrapScheduled }
+```
+
+AI flow:
+```
+User pastes large input
+↓
+/wogi-start detects line/item threshold → routes to /wogi-extract-review
+↓
+Extract → review → confirm completeness
+↓
+flow extract-zero-loss manifest → JSON
+↓
+/wogi-story --full-input=<manifest.items.join('\n')> --bypass-long-input
+↓
+Item Reconciliation gate sees bypassLongInput, skips Gate 1, runs Gates 2-5 normally
+```
+
+The manifest also carries an `intentBootstrapScheduled` flag so `/wogi-story` won't re-prompt the user if /wogi-extract-review already scheduled IGR bootstrap.
+
 ## Files
 
 | File | Location |
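To make the hand-off in the wogi-extract-review.md hunk above concrete, here is a sketch of turning the manifest JSON into the inputs for `/wogi-story`. The manifest fields match the documented shape; `buildStoryInvocation` and its return object are hypothetical, and the count check is an added sanity guard rather than documented behavior.

```javascript
// Hypothetical helper: converts an Item Manifest into /wogi-story inputs.
function buildStoryInvocation(manifest) {
  if (!manifest || !Array.isArray(manifest.items) || manifest.items.length === 0) {
    throw new Error('Manifest has no confirmed items');
  }
  // Sanity guard (assumption): count should agree with the items array.
  if (manifest.count !== manifest.items.length) {
    throw new Error(
      `Manifest count ${manifest.count} does not match ${manifest.items.length} items`
    );
  }
  return {
    command: '/wogi-story',
    // Mirrors --full-input=<manifest.items.join('\n')> from the flow above.
    fullInput: manifest.items.join('\n'),
    bypassLongInput: manifest.bypassLongInput === true,
    // When true, the story flow should not re-prompt for IGR bootstrap.
    skipBootstrapPrompt: manifest.intentBootstrapScheduled === true,
  };
}

const sampleManifest = {
  items: ['Add login', 'Add logout'],
  count: 2,
  bypassLongInput: true,
  sourceSessionId: 'example-session', // placeholder value
  intentBootstrapScheduled: false,
};
console.log(buildStoryInvocation(sampleManifest));
```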
package/.claude/commands/wogi-feature.md
CHANGED
@@ -145,6 +145,41 @@ node node_modules/wogiflow/scripts/flow-feature.js progress ft-a1b2c3d4
 | → | In Progress (1-99%) |
 | ✓ | Completed (100%) |
 
+## Anti-Deferral Rule (MANDATORY — v2.24.0+)
+
+**When creating a feature from user input, EVERY item the user provided MUST become a tracked story within the feature.**
+
+You must NEVER:
+- Create stories for items 1-3 and silently skip items 4-6 because you judged them as "enhancements"
+- Label items as "deferred" or "long-term" and exclude them from the feature
+- Apply your own priority filter to decide which items deserve stories
+
+You MAY:
+- Assign different priorities (P0/P1/P2/P3) — but ALL items get stories
+- Suggest an execution order — but ALL items are tracked in the feature
+- Ask the user "Should I defer items 4-6?" — explicit user consent is the ONLY valid reason to exclude items
+
+**If the user provides 5 items, the feature MUST contain 5 stories (or items grouped into stories where every item appears as an acceptance criterion).** Verify with a reconciliation count before proceeding.
+
+## P0 Specification-Quality Gates (v2.24.0+)
+
+When creating a feature from user input, apply the same P0 gates `/wogi-story` uses (from `scripts/flow-story-gates.js`):
+
+1. **Long Input Gate** — ≥40 lines or ≥5 items → route to `/wogi-extract-review` first
+2. **Item Reconciliation** — ≥3 items → enumerate items as a manifest + verify every item maps to at least one story
+3. **Consumer Impact Analysis** — if the feature title/description contains refactor/rename/migrate/replace/consolidate/split/extract/move keywords → run `git grep` on seeded filenames, list likely consumers, recommend phased migration if ≥5 breaking consumers
+4. **Scope-Confidence Audit** — extract "new X"/"existing Y"/"the Z service" assumptions from the description, classify against codebase (VERIFIED/CONTRADICTED/UNVERIFIED), write to a Pending Clarifications section on the feature
+5. **Intent Bootstrap Coordination** — if IGR is enabled and artifacts missing, schedule bootstrap via session-state flag (do NOT duplicate-prompt)
+
+All gates fail-open. Use `flow-story-gates.js` directly:
+
+```javascript
+const gates = require('wogiflow/scripts/flow-story-gates');
+gates.reconcileItems(userInput);
+gates.analyzeConsumerImpact(title + ' ' + description);
+gates.auditScopeConfidence(description);
+```
+
 ## Tips
 
 - **Features represent user-facing capabilities** - Not technical components
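The "reconciliation count" required by the Anti-Deferral Rule in the wogi-feature.md hunk above amounts to checking that every user item is covered by some story. The sketch below illustrates that idea only; it is not `gates.reconcileItems()`, and the `coversItems` field on stories is a hypothetical data shape.

```javascript
// Hypothetical shapes: items are strings; each story lists the items it covers
// (either as the story itself or as acceptance criteria).
function reconcile(items, stories) {
  const covered = new Set();
  for (const story of stories) {
    for (const item of story.coversItems) {
      covered.add(item);
    }
  }
  const missing = items.filter((item) => !covered.has(item));
  return {
    provided: items.length,
    tracked: items.length - missing.length,
    missing,
    ok: missing.length === 0,
  };
}

const items = ['export CSV', 'dark mode', 'audit log'];
const stories = [{ title: 'Reporting', coversItems: ['export CSV', 'audit log'] }];
console.log(reconcile(items, stories));
```

A non-empty `missing` list is the signal to stop and either add stories or ask the user for explicit consent to defer.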
package/.claude/commands/wogi-init.md
CHANGED
@@ -1209,3 +1209,32 @@ Say "show me the rules" or "what patterns are we using?" anytime.
 ### If user cancels mid-wizard
 - Save progress to `.workflow/state/setup-progress.json`
 - Next run can offer to resume
+
+## v2.25.0+ — Modern Config Scaffolding (MANDATORY)
+
+New projects MUST be initialized with the following modern-stack config blocks explicitly written to `.workflow/config.json` so users can see + tune them (defaults-only inheritance is fine for behavior, but visibility matters for learning):
+
+```json
+{
+  "intentGroundedReasoning": { "enabled": true },
+  "taskBoundaryReset": {
+    "enabled": true,
+    "maxRestartsPerSession": 50
+  },
+  "storyFlow": {
+    "consumerImpactAnalysis": { "enabled": true, "breakingThreshold": 5 },
+    "scopeConfidenceAudit": { "enabled": true },
+    "itemReconciliation": { "enabled": true, "minItems": 3 }
+  },
+  "longInputGate": { "enabled": true, "lineThreshold": 40 },
+  "researchReasoningGate": {
+    "enabled": true,
+    "tier2": { "enabled": true },
+    "tier3": { "enabled": true, "adversaryModel": "sonnet" }
+  }
+}
+```
+
+These capabilities (IGR, task-boundary restart, P0 story gates, long-input routing, research reasoning gate) have proven out in 2.22+ releases. New users should NOT have to manually enable them via `flow migrate-igr` or equivalent — they are active from the first session.
+
+If onboarding a workspace (multi-repo), also ensure `workspace.autoPickupChannelDispatches: true` and the 2.22.x restart-handoff settings are present.
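One way to write the scaffold blocks from the wogi-init.md hunk above into a config object is a defaults-only deep merge. This is a sketch, not the wogiflow initializer: `scaffoldConfig` is a hypothetical name, and letting values already present in the config win over the scaffold is an assumption.

```javascript
// Scaffold blocks copied from the JSON in the hunk above.
const MODERN_DEFAULTS = {
  intentGroundedReasoning: { enabled: true },
  taskBoundaryReset: { enabled: true, maxRestartsPerSession: 50 },
  storyFlow: {
    consumerImpactAnalysis: { enabled: true, breakingThreshold: 5 },
    scopeConfidenceAudit: { enabled: true },
    itemReconciliation: { enabled: true, minItems: 3 },
  },
  longInputGate: { enabled: true, lineThreshold: 40 },
  researchReasoningGate: {
    enabled: true,
    tier2: { enabled: true },
    tier3: { enabled: true, adversaryModel: 'sonnet' },
  },
};

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Adds missing keys from `defaults`; existing values are kept (assumption).
function scaffoldConfig(existing, defaults) {
  const merged = { ...existing };
  for (const [key, defaultValue] of Object.entries(defaults)) {
    if (!Object.prototype.hasOwnProperty.call(merged, key)) {
      merged[key] = JSON.parse(JSON.stringify(defaultValue)); // deep copy
    } else if (isPlainObject(merged[key]) && isPlainObject(defaultValue)) {
      merged[key] = scaffoldConfig(merged[key], defaultValue);
    }
  }
  return merged;
}

const scaffolded = scaffoldConfig(
  { projectName: 'demo', longInputGate: { lineThreshold: 80 } },
  MODERN_DEFAULTS
);
console.log(JSON.stringify(scaffolded, null, 2));
```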
package/.claude/commands/wogi-learn.md
CHANGED
@@ -275,6 +275,52 @@ In `config.json`:
 }
 ```
 
+## Promotion Adversary (v2.25.0+ — MANDATORY)
+
+Before promoting a pattern from `feedback-patterns.md` to `decisions.md`, run a **Promotion Adversary** on a different model. Rationale: same-model self-critique rubber-stamps. The adversary checks whether the N events that triggered promotion share an actual root cause (genuine recurrence) vs. superficial similarity with different underlying causes (false recurrence — common when the pattern detector just matched keywords).
+
+```
+Spawn Agent (subagent_type: general-purpose,
+model: config.researchReasoningGate.tier3.adversaryModel, default 'sonnet'):
+
+Input:
+Proposed rule: <title + body>
+Triggering events: [
+{ date, request, correction },
+{ date, request, correction },
+{ date, request, correction }
+]
+
+Prompt:
+You are the rule-promotion adversary.
+Do these N events actually share a root cause, or are they superficially
+similar events with different underlying issues?
+
+1. For each event: describe the root cause in your own words.
+2. List what's common to all N root causes.
+3. List what's different between them.
+4. Verdict:
+- SAME_PATTERN — genuine recurrence; rule is well-founded
+- MIXED — N-1 match but one event has a different root cause
+- DIFFERENT — surface-similar only; no unifying pattern
+
+Output JSON:
+{
+"verdict": "SAME_PATTERN" | "MIXED" | "DIFFERENT",
+"root_causes": [...],
+"commonalities": [...],
+"differences": [...],
+"suggested_rule_scope": "as_proposed" | "narrower" | "split_into_N"
+}
+```
+
+Process the verdict:
+- **SAME_PATTERN** → proceed with promotion as-is
+- **MIXED** → ask the user: "Adversary flags event #X as different root cause. Promote rule anyway, narrow scope, or split into multiple rules?"
+- **DIFFERENT** → DO NOT auto-promote. Surface adversary output; require explicit user confirmation.
+
+Fail-open: if adversary cannot be spawned (missing API key, network), proceed with standard promotion and log a warning. The threshold check + user confirmation still apply.
+
 ## Files
 
 | Action | File |
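The "Process the verdict" steps in the wogi-learn.md hunk above, including the fail-open path, might look like this when the adversary returns raw JSON text. It is an illustrative sketch: `decidePromotion` and the `next` labels are hypothetical, and treating unparseable output the same as an unavailable adversary is an assumption.

```javascript
// Hypothetical helper: parses adversary output and picks the next step.
function decidePromotion(rawOutput) {
  let parsed = null;
  try {
    parsed = JSON.parse(rawOutput);
  } catch (_err) {
    parsed = null; // unparseable output is handled by the fail-open branch
  }
  const verdict = parsed ? parsed.verdict : undefined;

  switch (verdict) {
    case 'SAME_PATTERN':
      return { next: 'promote_as_is', warning: null };
    case 'MIXED':
      return { next: 'ask_user_promote_narrow_or_split', warning: null };
    case 'DIFFERENT':
      return { next: 'require_explicit_user_confirmation', warning: null };
    default:
      // Fail-open: standard promotion continues, with a logged warning.
      return {
        next: 'standard_promotion',
        warning: 'Promotion adversary unavailable or returned no usable verdict',
      };
  }
}

console.log(decidePromotion('{"verdict":"MIXED","differences":["event 2"]}'));
console.log(decidePromotion(undefined));
```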
package/.claude/commands/wogi-onboard.md
CHANGED
@@ -1077,3 +1077,29 @@ AskUserQuestion({
 }]
 });
 ```
+
+## v2.25.0+ — Modern Config Scaffolding (MANDATORY)
+
+When generating `.workflow/config.json` for a fresh project, include these 2.22+ capability blocks so new users inherit the current-best defaults:
+
+```json
+{
+  "intentGroundedReasoning": { "enabled": true },
+  "taskBoundaryReset": { "enabled": true, "maxRestartsPerSession": 50 },
+  "storyFlow": {
+    "consumerImpactAnalysis": { "enabled": true, "breakingThreshold": 5 },
+    "scopeConfidenceAudit": { "enabled": true },
+    "itemReconciliation": { "enabled": true, "minItems": 3 }
+  },
+  "longInputGate": { "enabled": true, "lineThreshold": 40 },
+  "researchReasoningGate": {
+    "enabled": true,
+    "tier2": { "enabled": true },
+    "tier3": { "enabled": true, "adversaryModel": "sonnet" }
+  }
+}
+```
+
+These drive IGR (Architect + Adversary + Truth Gate), task-boundary context reset via the `wogi-claude` wrapper, `/wogi-story` P0 spec gates, auto-routing of long inputs to `/wogi-extract-review`, and the research reasoning gate's assumption-surfacing + cross-model adversary. All have proven out across the 2.22.x release series; new users should not have to discover them one at a time.
+
+For multi-repo workspaces, also scaffold `workspace.autoPickupChannelDispatches: true` and leave the other `workspace.*` defaults intact — they include the 2.22.2 restart-handoff protocol.
package/.claude/commands/wogi-plan.md
CHANGED
@@ -182,6 +182,34 @@ When `/wogi-plan` is invoked with a description argument, it should:
 1. Create the plan structure in `.workflow/plans/`
 2. Enter Claude Code plan mode with the description: `EnterPlanMode` with the plan context
 
+## Anti-Deferral Rule (MANDATORY — v2.24.0+)
+
+**When creating a plan from user input, EVERY item the user provided MUST become a tracked epic or feature within the plan.**
+
+You must NEVER:
+- Create epics for items 1-3 and silently skip items 4-7 because you judged them as "enhancements" or "long-term"
+- Label items as "deferred" and exclude them from the plan
+- Apply your own filter to decide which items deserve tracking
+
+You MAY:
+- Assign different priorities (P0/P1/P2/P3) — but ALL items get epics/features
+- Suggest an execution order — but ALL items are tracked in the plan
+- Ask the user "Should I defer items 4-7?" — explicit user consent is the ONLY valid reason to exclude items
+
+**If the user provides 7 items, the plan MUST contain 7 tracked sub-items (epics or features, possibly grouped where every item appears as a criterion).** Verify with a reconciliation count before proceeding.
+
+## P0 Specification-Quality Gates (v2.24.0+)
+
+When creating a plan from user input, apply the same P0 gates `/wogi-story` uses (`scripts/flow-story-gates.js`):
+
+1. **Long Input Gate** — ≥40 lines or ≥5 items → route to `/wogi-extract-review`
+2. **Item Reconciliation** — ≥3 items → enumerate manifest + verify every item maps
+3. **Consumer Impact** — refactor keywords trigger a grep; ≥5 breaking consumers → recommend phased migration
+4. **Scope-Confidence** — extract "new X"/"existing Y"/"the Z service" claims; classify against codebase; surface contradictions as Pending Clarifications
+5. **Intent Bootstrap Coordination** — schedule IGR bootstrap if missing (don't re-prompt)
+
+All fail-open.
+
 ## Tips
 
 - **Plans are for strategic visibility** - Track high-level progress
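The Consumer Impact gate in the wogi-plan.md hunk above triggers on refactor-style keywords and recommends a phased migration at five or more breaking consumers. The sketch below shows only that decision logic. The keyword list is taken from the wogi-feature.md description; `assessConsumerImpact` is a hypothetical name, the prefix-style keyword match is an assumption, and the consumer list is passed in rather than produced by a real `git grep`.

```javascript
// Keywords as listed in the feature-level gate description.
const REFACTOR_KEYWORDS = [
  'refactor', 'rename', 'migrate', 'replace',
  'consolidate', 'split', 'extract', 'move',
];
const BREAKING_THRESHOLD = 5; // matches storyFlow.consumerImpactAnalysis.breakingThreshold

function assessConsumerImpact(description, consumerFiles) {
  const text = description.toLowerCase();
  // Assumption: a keyword matches at a word start, so "renamed" counts
  // for "rename" but "remove" does not count for "move".
  const matched = REFACTOR_KEYWORDS.filter((kw) => new RegExp(`\\b${kw}`).test(text));
  if (matched.length === 0) {
    return { triggered: false, matched, breakingCount: 0, recommendPhasedMigration: false };
  }
  const breakingCount = consumerFiles.length;
  return {
    triggered: true,
    matched,
    breakingCount,
    recommendPhasedMigration: breakingCount >= BREAKING_THRESHOLD,
  };
}

console.log(
  assessConsumerImpact('Rename the auth service', ['a.js', 'b.js', 'c.js', 'd.js', 'e.js'])
);
```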
package/.claude/commands/wogi-test-browser.md
CHANGED
@@ -328,3 +328,14 @@ Controlled by `.workflow/config.json`:
 - **WebMCP assertions**: ~100 tokens per step (tool call + JSON assertion)
 - **Savings**: ~95% token reduction for a 10-step test flow
 - **Bonus**: Deterministic results (no visual diff ambiguity)
+
+## Evidence Tier (v2.24.0+)
+
+A successful `/wogi-test-browser` run emits **Evidence Tier 4 (SHIPPED)** — the highest tier on the Completion Truth Gate scale. Browser-driven WebMCP tests exercise real DOM interactions with real event listeners, network calls, and state transitions.
+
+When reporting results:
+```json
+{ "evidenceTier": 4, "evidenceTierLabel": "SHIPPED" }
+```
+
+See `/wogi-test` for the full tier scale. L0/L1 tasks touching UI should reach Tier 4 before closing; Truth Gate (Step 3.9) will downgrade "done" claims that don't.
package/.claude/commands/wogi-test-generate.md
CHANGED
@@ -99,4 +99,15 @@ During `/wogi-start` Step 3, verify:
 - Run generated tests AFTER implementing → they should all pass
 - If any test passes before implementation → WARNING: test may be trivial
 
+## Evidence Tier (v2.24.0+)
+
+Generated tests that only assert structural properties (class exists, function has signature) emit **Evidence Tier 1 (STATIC)**. Tests that exercise actual behavior emit **Evidence Tier 2 (COMPILED)** once they pass. To reach Tier 3+ (interactive / shipped), tests must make real network or DOM calls — see `/wogi-test` (API/UI) and `/wogi-test-browser` for those.
+
+When recording test results in `.workflow/verifications/`, include:
+```json
+{ "evidenceTier": 1|2, "evidenceTierLabel": "STATIC"|"COMPILED" }
+```
+
+The Completion Truth Gate reads these labels to decide whether "tests pass" is sufficient to accept a "done" claim. L0/L1 tasks cannot close on Tier 1 alone.
+
 ARGUMENTS: {args}
package/.claude/commands/wogi-test.md
CHANGED
@@ -285,6 +285,31 @@ Include results in the summary:
 Generated Tests: [passed]/[total] passed
 ```
 
+## Evidence Tiers (v2.24.0+)
+
+Every test invocation should emit an **evidence tier** label that the Completion Truth Gate (Step 3.9 in `/wogi-start`) uses to accept or downgrade "done" claims. A task that claims completion without sufficient tier evidence gets surfaced as "implemented (unverified)" rather than rubber-stamped as done.
+
+| Tier | Label | What it means | Source |
+|------|-------|---------------|--------|
+| 0 | NONE | No test ran or all tests silently skipped | N/A |
+| 1 | STATIC | Code parses / type-checks cleanly | `flow lint`, `flow typecheck` |
+| 2 | COMPILED | Unit tests pass | `flow-test-integrity.js` generated/unit tests |
+| 3 | INTERACTIVE | API calls succeed end-to-end (HTTP round-trips, DB reads) | `flow-test-api.js`, `flow-test-ui.js` backend hits |
+| 4 | SHIPPED | UI interaction succeeds in a real browser (click, submit, see result) | `flow-test-ui.js` browser E2E |
+
+Output format (JSON mode):
+```json
+{
+  "passed": true,
+  "results": [...],
+  "evidenceTier": 3,
+  "evidenceTierLabel": "INTERACTIVE",
+  "gates": { "static": true, "unit": true, "api": true, "ui": false }
+}
+```
+
+Any `/wogi-start` task closing with `evidenceTier < 3` for features that touch UI or a service boundary will be flagged by the Truth Gate. L3 tasks (refactor/chore) accept tier 1–2. L0/L1 tasks require tier 3+.
+
 ## Important Notes
 
 - Testing is **disabled by default** — zero overhead for projects that don't use it
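The tier table and closing rules in the wogi-test.md hunk above can be read as a lookup plus a minimum-tier check. The following is a sketch of that reading, not the Completion Truth Gate itself: `evaluateDoneClaim` is a hypothetical name, only the levels the text mentions (L0, L1, L3) are covered, and the stricter "UI tasks should reach Tier 4" guidance from wogi-test-browser.md is deliberately left out.

```javascript
// Index in the array is the tier number from the table above.
const TIER_LABELS = ['NONE', 'STATIC', 'COMPILED', 'INTERACTIVE', 'SHIPPED'];

// Minimum tiers as stated: L0/L1 require 3+, L3 accepts 1-2.
// L2 is not specified in the source, so it is intentionally absent.
const MIN_TIER_BY_LEVEL = new Map([['L0', 3], ['L1', 3], ['L3', 1]]);

function evaluateDoneClaim(taskLevel, evidenceTier) {
  if (!Number.isInteger(evidenceTier) || evidenceTier < 0 || evidenceTier >= TIER_LABELS.length) {
    throw new RangeError(`Unknown evidence tier: ${evidenceTier}`);
  }
  const evidenceTierLabel = TIER_LABELS[evidenceTier];
  const required = MIN_TIER_BY_LEVEL.get(taskLevel);
  if (required === undefined) {
    return { evidenceTier, evidenceTierLabel, status: 'level not covered by this sketch' };
  }
  const status = evidenceTier >= required ? 'done' : 'implemented (unverified)';
  return { evidenceTier, evidenceTierLabel, status };
}

console.log(evaluateDoneClaim('L1', 2));
console.log(evaluateDoneClaim('L3', 1));
```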
package/.claude/commands/wogi-triage.md
CHANGED
@@ -356,3 +356,44 @@ Each finding is displayed using these fields from `last-review.json`:
 | File | `finding.file` + `finding.line` | "src/api.ts:45" |
 | Issue | `finding.issue` | "Raw JSON.parse without try-catch" |
 | Recommendation | `finding.recommendation` | "Use safeJsonParse from flow-utils.js" |
+
+## Anti-Deferral Enforcement (v2.25.0+ — MANDATORY)
+
+The **Review-Findings Anti-Deferral Rule** (`.workflow/state/decisions.md`, 2026-04-15) extends to `/wogi-triage` mechanically in v2.25.0+. Prevents the rubber-stamp pattern where the AI silently drops findings from "fix all" requests.
+
+**Enforcement rules**:
+
+1. **"Defer" / "skip" requires explicit user confirmation with a reason.** When the AI or user proposes to defer a finding, the triage flow MUST prompt:
+```
+Defer finding wf-review-XXXX?
+Severity: HIGH
+Reason required: [user input]
+[Confirm defer] [Cancel — fix now]
+```
+Auto-defer without reason is FORBIDDEN.
+
+2. **"Fix all" / "Option 1" / equivalent means fix ALL.** If the user requests bulk processing:
+- Ship a fix for every finding with evidence-tier ≥ 1
+- If any finding is too large, STOP and ask: "Finding X requires ~Y minutes of work. Ship now, split to its own release, or defer (needs reason)?"
+- Never silently convert a finding to "deferred" in commit messages or release notes
+
+3. **Commit/release consistency check.** Before finalizing, scan the commit message / release notes against the findings list. If the message claims "fixes F1, F2, F3, M1" but M1 isn't in the diff, BLOCK with:
+```
+Commit message claims M1 is fixed, but M1 does not appear in the diff.
+Options: [Fix M1 now] [Remove M1 from message] [Acknowledge + proceed]
+```
+
+4. **Triage output includes a Deferral Audit Trail**:
+```
+━━━ TRIAGE SUMMARY ━━━
+Fixed: 12
+Deferred (with reasons): 2
+• M3 — "requires restructure, tracked as wf-XXXXXXXX" (user-confirmed)
+• L5 — "out of scope for current release" (user-confirmed)
+Silently dropped: 0 ← MUST be 0
+━━━━━━━━━━━━━━━━━━━━━━
+```
+
+Historical incident (v2.17.4 release, 2026-04-15): commit message claimed "fix all findings" but M1 and M3 were silently dropped. The v2.25.0+ mechanical enforcement makes that failure mode architecturally impossible — the flow stops and asks rather than letting the AI make an autonomous defer decision.
+
+Skip only if `config.triage.antiDeferralEnforcement.enabled` is explicitly `false` (default: true).
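The Deferral Audit Trail in the wogi-triage.md hunk above boils down to partitioning findings into fixed, deferred with a confirmed reason, and silently dropped. A minimal sketch of that bookkeeping follows; `buildDeferralAudit` and the deferral record shape (`id`, `reason`, `userConfirmed`) are hypothetical and not taken from the package.

```javascript
// Hypothetical audit: a finding is "silently dropped" when it is neither
// fixed nor deferred with a non-empty, user-confirmed reason.
function buildDeferralAudit(findingIds, fixedIds, deferrals) {
  const fixed = new Set(fixedIds);
  const validDeferrals = deferrals.filter(
    (d) => d.userConfirmed === true && typeof d.reason === 'string' && d.reason.trim() !== ''
  );
  const deferred = new Set(validDeferrals.map((d) => d.id));
  const silentlyDropped = findingIds.filter((id) => !fixed.has(id) && !deferred.has(id));
  return {
    fixed: findingIds.filter((id) => fixed.has(id)).length,
    deferred: findingIds.filter((id) => !fixed.has(id) && deferred.has(id)).length,
    silentlyDropped,
    ok: silentlyDropped.length === 0, // the summary requires this count to be 0
  };
}

const audit = buildDeferralAudit(
  ['F1', 'F2', 'M1', 'M3'],
  ['F1', 'F2'],
  [
    { id: 'M3', reason: 'requires restructure', userConfirmed: true },
    { id: 'M1', reason: '', userConfirmed: false }, // no reason: does not count
  ]
);
console.log(audit);
```

A result with `ok: false` is the point at which the flow would stop and ask the user instead of finalizing the commit or release notes.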
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "wogiflow",
-  "version": "2.
+  "version": "2.25.0",
   "description": "AI-powered development workflow management system with multi-model support",
   "main": "lib/index.js",
   "bin": {
@@ -10,7 +10,7 @@
   },
   "scripts": {
     "flow": "./scripts/flow",
-    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
+    "test": "NODE_ENV=test node --test tests/auto-compact-prompt.test.js tests/flow-paths.test.js tests/flow-io.test.js tests/flow-config-loader.test.js tests/flow-damage-control.test.js tests/flow-output.test.js tests/flow-constants.test.js tests/flow-session-state.test.js tests/flow-hooks-integration.test.js tests/flow-utils.test.js tests/flow-security.test.js tests/flow-memory-db.test.js tests/flow-durable-session.test.js tests/flow-skill-matcher.test.js tests/flow-bridge.test.js tests/flow-proactive-compact.test.js tests/flow-cascade-completion.test.js tests/flow-capture-gate.test.js tests/flow-correction-detector-hybrid.test.js tests/flow-promote.test.js tests/flow-archive-runs.test.js tests/flow-memory.test.js tests/flow-hooks-pre-tool-helpers.test.js tests/flow-hooks-bugfix-scope-gate.test.js tests/flow-hooks-routing-gate.test.js tests/flow-hooks-phase-read-gate.test.js tests/flow-hooks-commit-log-gate.test.js tests/flow-hooks-deploy-gate.test.js tests/flow-hooks-todowrite-gate.test.js tests/flow-hooks-git-safety-gate.test.js tests/flow-hooks-scope-mutation-gate.test.js tests/flow-hooks-strike-gate.test.js tests/flow-hooks-component-check.test.js tests/flow-hooks-scope-gate.test.js tests/flow-hooks-implementation-gate.test.js tests/flow-hooks-research-gate.test.js tests/flow-hooks-loop-check.test.js tests/flow-hooks-manager-boundary-gate.test.js tests/flow-hooks-phase-gate.test.js tests/flow-hooks-pre-tool-orchestrator.test.js tests/flow-hooks-observation-capture.test.js tests/flow-hooks-task-gate.test.js tests/flow-durable-session-suspension.test.js tests/flow-health-mcp-scopes.test.js tests/flow-lean-config.test.js tests/flow-workspace-autopickup.test.js tests/flow-worker-boundary-gate.test.js tests/flow-worker-question-classifier.test.js tests/flow-completion-truth-gate-contradictions.test.js tests/flow-structure-sensor.test.js tests/flow-workspace-dispatch-tracking.test.js tests/flow-story-gates.test.js tests/flow-workspace-restart-handoff.test.js tests/flow-wogi-claude-wrapper.test.js tests/flow-wave1-integrations.test.js tests/flow-wave2-integrations.test.js tests/flow-wave3-integrations.test.js && NODE_ENV=test node tests/run-quality-gates.test.js",
     "test:syntax": "find scripts/ lib/ -name '*.js' -not -path '*/node_modules/*' -exec node --check {} +",
     "lint": "eslint scripts/ lib/ tests/",
     "lint:ci": "eslint scripts/ lib/ tests/ --max-warnings 0",
package/scripts/flow-extraction-review.js
CHANGED
@@ -390,6 +390,46 @@ function getConfirmedTasks() {
   }));
 }
 
+/**
+ * v2.24.0 — Export confirmed items as an "Item Manifest" compatible with
+ * /wogi-story's item-reconciliation gate (wf-63c0f4cc). Downstream callers
+ * (/wogi-story, /wogi-epics, /wogi-feature, /wogi-plan) can pass this to
+ * their P0 gates as `fullInput` and mark `bypassLongInput: true` so the
+ * re-extraction loop is skipped.
+ *
+ * @returns {{items: Array<string>, count: number, bypassLongInput: true, sourceSessionId: string, intentBootstrapScheduled: boolean}}
+ */
+function exportAsItemManifest() {
+  const session = loadReviewSession();
+  if (!session) throw new Error('No review session active');
+  if (!session.completeness_confirmed) {
+    throw new Error('Cannot export manifest: review not yet confirmed as complete');
+  }
+
+  const items = session.items
+    .filter(i => i.review_status === 'confirmed')
+    .map(i => (i.text || '').trim())
+    .filter(Boolean);
+
+  // Coordinate with Intent Bootstrap (see flow-story-gates.coordinateIntentBootstrap)
+  // so /wogi-start doesn't re-prompt if the user already scheduled bootstrap via
+  // /wogi-story during this session.
+  let intentBootstrapScheduled = false;
+  try {
+    const gates = require('./flow-story-gates');
+    const result = gates.coordinateIntentBootstrap();
+    intentBootstrapScheduled = !!(result && result.scheduled);
+  } catch (_err) { /* non-critical */ }
+
+  return {
+    items,
+    count: items.length,
+    bypassLongInput: true,
+    sourceSessionId: session.id || null,
+    intentBootstrapScheduled
+  };
+}
+
 // =============================================================================
 // DISPLAY HELPERS
 // =============================================================================
@@ -779,6 +819,7 @@ module.exports = {
 
   // Get results
   getConfirmedTasks,
+  exportAsItemManifest, // v2.24.0 — Wave 2 alignment with /wogi-story P0 gates
 
   // Display
   formatReviewStatus,
@@ -858,6 +899,16 @@ if (require.main === module) {
       }
       break;
 
+    case 'manifest':
+      // v2.24.0 — export Item Manifest for downstream /wogi-story coordination
+      try {
+        const manifest = exportAsItemManifest();
+        console.log(JSON.stringify(manifest, null, 2));
+      } catch (err) {
+        console.error(`${c.red}✗ ${err.message}${c.reset}`);
+      }
+      break;
+
     default:
       console.log('Extraction Review Module');
       console.log('Commands:');