wogiflow 2.12.1 → 2.15.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (65)
  1. package/.claude/commands/wogi-challenge.md +62 -0
  2. package/.claude/commands/wogi-eval.md +14 -1
  3. package/.claude/commands/wogi-gate-stats.md +80 -0
  4. package/.claude/commands/wogi-start-continuation.md +12 -0
  5. package/.claude/commands/wogi-start.md +32 -902
  6. package/.claude/docs/explore-agents.md +49 -0
  7. package/.claude/docs/gate-telemetry.md +142 -0
  8. package/.claude/docs/intent-grounded-reasoning.md +140 -0
  9. package/.claude/docs/phases/01-explore.md +159 -0
  10. package/.claude/docs/phases/02-spec.md +88 -0
  11. package/.claude/docs/phases/03-implement.md +92 -0
  12. package/.claude/docs/phases/04-verify.md +495 -0
  13. package/.claude/docs/phases/05-complete.md +140 -0
  14. package/.claude/rules/_internal/README.md +64 -0
  15. package/.claude/rules/_internal/document-structure.md +77 -0
  16. package/.claude/rules/_internal/dual-repo-management.md +174 -0
  17. package/.claude/rules/_internal/feature-refactoring-cleanup.md +87 -0
  18. package/.claude/rules/_internal/github-releases.md +71 -0
  19. package/.claude/rules/_internal/model-management.md +35 -0
  20. package/.claude/rules/_internal/self-maintenance.md +87 -0
  21. package/.claude/rules/architecture/component-reuse.md +38 -0
  22. package/.claude/rules/code-style/naming-conventions.md +107 -0
  23. package/.claude/rules/operations/git-workflows.md +92 -0
  24. package/.claude/rules/operations/scratch-directory.md +54 -0
  25. package/.claude/rules/security/security-patterns.md +193 -0
  26. package/.claude/skills/figma-analyzer/knowledge/learnings.md +11 -0
  27. package/.workflow/agents/architect.md +104 -0
  28. package/.workflow/agents/logic-adversary.md +81 -0
  29. package/.workflow/specs/architecture.md.template +24 -0
  30. package/.workflow/specs/stack.md.template +33 -0
  31. package/.workflow/specs/testing.md.template +36 -0
  32. package/.workflow/templates/claude-md.hbs +2 -0
  33. package/.workflow/templates/partials/auto-features.hbs +2 -0
  34. package/.workflow/templates/partials/intent-grounded-reasoning.hbs +40 -0
  35. package/package.json +1 -1
  36. package/scripts/flow-architect-pass.js +621 -0
  37. package/scripts/flow-bridge.js +6 -0
  38. package/scripts/flow-cli-utils.js +85 -0
  39. package/scripts/flow-completion-truth-gate.js +477 -0
  40. package/scripts/flow-correction-detector.js +279 -6
  41. package/scripts/flow-done-gates.js +69 -1
  42. package/scripts/flow-gate-telemetry.js +602 -0
  43. package/scripts/flow-intent-bootstrap.js +662 -0
  44. package/scripts/flow-intent-framing.js +708 -0
  45. package/scripts/flow-logic-adversary.js +693 -0
  46. package/scripts/flow-migrate-igr.js +245 -0
  47. package/scripts/flow-runtime-verification.js +37 -0
  48. package/scripts/flow-standards-checker.js +62 -6
  49. package/scripts/flow-standards-gate.js +45 -1
  50. package/scripts/flow-state-drift-detector.js +279 -0
  51. package/scripts/flow-trap-zone.js +470 -0
  52. package/scripts/flow-worktree.js +58 -0
  53. package/scripts/hooks/adapters/claude-code.js +34 -2
  54. package/scripts/hooks/core/phase-read-gate.js +156 -0
  55. package/scripts/hooks/core/pre-compact.js +159 -0
  56. package/scripts/hooks/core/template-change-detector.js +112 -0
  57. package/scripts/hooks/entry/claude-code/post-tool-use.js +12 -0
  58. package/scripts/hooks/entry/claude-code/pre-compact.js +31 -0
  59. package/scripts/hooks/entry/claude-code/pre-tool-use.js +38 -0
  60. package/scripts/hooks/entry/claude-code/session-start.js +17 -0
  61. package/scripts/postinstall.js +7 -0
  62. package/templates/intent/domain-model.md.hbs +44 -0
  63. package/templates/intent/glossary.md.hbs +43 -0
  64. package/templates/intent/product.md.hbs +43 -0
  65. package/templates/intent/user-journeys.md.hbs +41 -0
@@ -0,0 +1,62 @@
+ ---
+ description: "Manual trigger of the IGR Logic Adversary — critique a plan against the Logic Constitution v1 rubric."
+ effort: medium
+ ---
+
+ Manually invoke the **Logic Adversary** (IGR Stage 4) against a plan or spec of your choosing. Normally the Adversary runs automatically during `/wogi-start` Step 1.57. Use `/wogi-challenge` when you want to stress-test a plan outside the pipeline — for example, a design doc you wrote by hand, a pre-approved task where you want an extra pass, or an ad-hoc proposal in conversation.
+
+ Story: `wf-b00262b1` (IGR)
+
+ ## Usage
+
+ ```bash
+ # Critique a plan file
+ /wogi-challenge path/to/plan.md
+
+ # Critique the plan for a specific task (reads .workflow/plans/{taskId}.md)
+ /wogi-challenge wf-XXXXXXXX
+
+ # Critique with an explicit rubric version
+ /wogi-challenge path/to/plan.md --rubric=logic-constitution-v1
+ ```
+
+ ## What it does
+
+ 1. Loads the plan (either from a file path or from `.workflow/plans/{taskId}.md`).
+ 2. Calls `scripts/flow-logic-adversary.js buildAdversaryPrompt` to assemble the critique prompt — includes the 10-principle Logic Constitution, few-shot calibration examples, and all available intent artifacts.
+ 3. Spawns a sub-agent via the Agent tool on a different model than this session when possible (Sonnet when you're on Opus; Opus when you're on Sonnet) — the model-separation rule per the approved spec.
+ 4. Parses the returned JSON verdict against the rubric schema.
+ 5. Records a telemetry event (`gateId: logic-adversary`) with the verdict.
+ 6. Renders the verdict in human-readable form with per-principle PASS/CONCERN/FAIL.
+ 7. If the verdict is NEEDS_REVISION or FAIL, offers to iterate — the user can edit the plan and re-run.
+
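The verdict parsing and rendering in steps 4 and 6 could look roughly like this. A minimal sketch: the field names `verdict`, `principles`, `result`, and `note` are assumptions for illustration, not the real schema, which is defined in `scripts/flow-logic-adversary.js`.

```javascript
// Hypothetical sketch of rendering an Adversary verdict.
// Field names (verdict, principles, result, note) are assumed for
// illustration -- the real schema lives in flow-logic-adversary.js.
function renderVerdict(raw) {
  const v = JSON.parse(raw);
  const lines = [`Overall: ${v.verdict}`];
  for (const p of v.principles) {
    // One line per Logic Constitution principle: PASS/CONCERN/FAIL plus note.
    lines.push(`  [${p.result}] ${p.name}${p.note ? ` -- ${p.note}` : ""}`);
  }
  return lines.join("\n");
}

const example = JSON.stringify({
  verdict: "NEEDS_REVISION",
  principles: [
    { name: "No unstated assumptions", result: "PASS" },
    { name: "Reuses existing concepts", result: "CONCERN", note: "duplicates TaskQueue" },
  ],
});
console.log(renderVerdict(example));
```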
+ ## When to use it
+
+ - **Before approval** — re-check a plan the automatic Adversary already passed, with fresh eyes.
+ - **After major revisions** — when you edited a plan by hand and want re-validation.
+ - **On external docs** — critique a design doc written outside WogiFlow against the Logic Constitution.
+ - **As a gut check** — for high-stakes decisions (architecture, migrations) where one pass isn't enough.
+
+ ## Requirements
+
+ - `intentGroundedReasoning.enabled` and `intentGroundedReasoning.logicAdversary.enabled` must be true in `config.json`.
+ - The plan file must contain structured content parseable by the Adversary (plain markdown is fine).
+ - For the task-ID form (`/wogi-challenge wf-XXX`), the plan must exist at `.workflow/plans/{taskId}.md`.
+
+ ## Under the hood
+
+ - Script: `scripts/flow-logic-adversary.js`
+ - Rubric: `.workflow/rubrics/logic-constitution-v1.md`
+ - Persona: `.workflow/agents/logic-adversary.md`
+ - Calibration: `.workflow/state/adversary-calibration.json`
+ - Telemetry: `gateId: logic-adversary` in `.workflow/state/gate-telemetry.jsonl`
+
+ ## Related
+
+ - `/wogi-start` — runs the Adversary automatically at Step 1.57 during task execution
+ - `node scripts/flow-gate-telemetry.js stats --gate=logic-adversary` — see historical Adversary effectiveness
+ - `/wogi-review` — a different tool: it critiques code after implementation, while the Adversary critiques plans before
+
+ ## Arguments
+
+ Arguments: `{{ args }}`
@@ -127,10 +127,23 @@ In `config.json`:
  {
    "eval": {
      "judges": { "opus": 1, "sonnet": 2 },
-     "scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality"],
+     "scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality", "productCoherence"],
      "passingThreshold": 6
    }
  }
  ```

+ ## Dimension reference
+
+ | Dimension | What the judge scores |
+ |-----------|----------------------|
+ | `completeness` | Were all acceptance criteria implemented? |
+ | `accuracy` | Does the implementation match the spec exactly? |
+ | `workflowCompliance` | Did the work follow WogiFlow rules (request log, registry update, etc.)? |
+ | `tokenEfficiency` | Were tokens used efficiently — minimal redundancy, focused output? |
+ | `quality` | Is the code well-structured, readable, maintainable? |
+ | **`productCoherence`** (IGR) | Does this implementation make sense for the product's stated users and domain? Does it reuse existing concepts or introduce duplicates? Does it integrate with existing user journeys, or create orphans? Would a product manager look at this and say "yes, this fits"? — **Score 1–10. 10 = textbook product fit; 1 = solves the wrong problem entirely.** Requires `intentGroundedReasoning.productFitEval.enabled` and a confirmed (not draft) product.md for full-strength scoring. |
+
+ When IGR is disabled, `productCoherence` is omitted from the scoring rubric and the judge falls back to the five pre-IGR dimensions.
+
  ARGUMENTS: $ARGUMENTS
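For orientation, here is one hypothetical way `passingThreshold` could be applied to a judge's dimension scores. The equal-weight mean is an assumption for illustration only; the diff does not show the shipped aggregation logic.

```javascript
// Hypothetical sketch: average a judge's per-dimension scores (1-10)
// and compare to passingThreshold. Equal weighting is an ASSUMPTION,
// not confirmed by the package.
function passes(scoresByDimension, threshold) {
  const values = Object.values(scoresByDimension);
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return mean >= threshold;
}

console.log(passes({
  completeness: 8, accuracy: 7, workflowCompliance: 6,
  tokenEfficiency: 5, quality: 7, productCoherence: 6,
}, 6)); // mean 6.5 -> true
```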
@@ -0,0 +1,80 @@
+ ---
+ description: "Show per-gate telemetry: invocations, pass rate, catch rate, miss rate. The IGR self-assessment dashboard."
+ effort: low
+ ---
+
+ Display per-gate statistics from the IGR Gate Telemetry log (`.workflow/state/gate-telemetry.jsonl`). This is the **self-assessment dashboard** the owner asked for during epic planning ("write down every time a gate catches something so we can see what's working and what's not working").
+
+ Story: `wf-faf340cf` (IGR Story 0 — Gate Telemetry & Self-Assessment Framework)
+
+ ## Usage
+
+ ```bash
+ # All gates, all time
+ /wogi-gate-stats
+
+ # Filter by time window (e.g., last 7 days)
+ /wogi-gate-stats --since=7d
+ /wogi-gate-stats --since=24h
+ /wogi-gate-stats --since=30d
+
+ # Filter to one gate
+ /wogi-gate-stats --gate=logic-adversary
+ /wogi-gate-stats --gate=completion-truth-gate
+ /wogi-gate-stats --gate=standards-gate
+
+ # Combined
+ /wogi-gate-stats --since=7d --gate=intent-framing
+ ```
+
+ ## What the metrics mean
+
+ | Metric | Definition | What it tells you |
+ |--------|-----------|-------------------|
+ | `invocations` | How many times the gate ran | Activity level |
+ | `pass%` | `PASS / invocations` | How permissive the gate is |
+ | `catch%` | `(CONCERN + FAIL) / invocations` | How often the gate found issues — the "is this gate doing anything" signal |
+ | `miss%` | `userCorrectedAfterPass / PASS` | **Critical signal** — how often the gate passed work the user later corrected. High miss% = rubber-stamping. |
+ | `avgMs` | Average duration | Performance |
+ | `misses` | Raw count of cross-referenced misses | The number of times you had to correct something this gate said was fine |
+
42
+
43
+ A gate with `pass% = 100%` and `miss% > 10%` is **rubber-stamping**. It's letting things through that you then have to correct. This is the failure mode the owner's QA-98%-parable warned against: 100% coverage that creates false confidence is more dangerous than 70% coverage that triggers a second review.
44
+
45
+ When you see high miss rates:
46
+ 1. Tune the rubric (for `logic-adversary`: edit `.workflow/rubrics/logic-constitution-v1.md`)
47
+ 2. Add calibration examples (for `logic-adversary`: append to `.workflow/state/adversary-calibration.json`)
48
+ 3. Strengthen the gate's blocking behavior (for `completion-truth-gate`: raise `minTierForDone` or set `blockFalseCompletion: true`)
49
+
50
+ ## Example output
51
+
52
+ ```
53
+ gateId invocations pass% catch% miss% avgMs misses
54
+ --------------------- ----------- ------ ------ ------ ----- ------
55
+ logic-adversary 12 75.0% 25.0% 8.3% 72341 1
56
+ intent-framing 12 83.3% 16.7% 0.0% 234 0
57
+ architect-pass 12 91.7% 8.3% 0.0% 5234 0
58
+ completion-truth-gate 10 80.0% 20.0% 0.0% 14 0
59
+ session-corrections 3 100.0% 0.0% 0.0% 87 0
60
+ standards-gate 12 100.0% 0.0% 0.0% 52 0
61
+ intent-bootstrap 1 100.0% 0.0% 0.0% 12 0
62
+
63
+ Total events: 62
64
+ ```
65
+
66
+ ## Related commands
67
+
68
+ - `node scripts/flow-gate-telemetry.js rotate` — force log rotation (default rotates at 10 MB)
69
+ - `node scripts/flow-gate-telemetry.js schema` — print the event schema
70
+ - `/wogi-challenge` — manually invoke the Logic Adversary (one of the gates this dashboard tracks)
71
+
72
+ ## Under the hood
73
+
74
+ - Script: `scripts/flow-gate-telemetry.js`
75
+ - Log: `.workflow/state/gate-telemetry.jsonl` (append-only, JSONL)
76
+ - Archive: `.workflow/state/gate-telemetry-archive/`
77
+
78
+ ## Arguments
79
+
80
+ Arguments: `{{ args }}`
@@ -74,6 +74,18 @@ Run `flow-spec-verifier.js verify`. Check `config.qualityGates` for task type:
  ## Sprint Reset (5+ criteria)
  At every 3rd criterion: commit progress, save checkpoint to `task-checkpoint.json`, compact context, resume from checkpoint.

+ ## Phase Execution (MANDATORY)
+
+ Before executing ANY phase, you MUST Read the phase instruction file. The PreToolUse hook BLOCKS Edit/Write/Bash until the phase file is read.
+
+ | Phase | File to Read |
+ |-------|-------------|
+ | exploring | `.claude/docs/phases/01-explore.md` |
+ | spec_review | `.claude/docs/phases/02-spec.md` |
+ | coding | `.claude/docs/phases/03-implement.md` |
+ | validating | `.claude/docs/phases/04-verify.md` |
+ | completing | `.claude/docs/phases/05-complete.md` |
+
  ## Rules
  - Validate after EVERY file edit
  - Re-read ALL criteria before marking done
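The blocking rule in that hunk could be modeled roughly like this. A sketch only: the state shape and return contract are assumptions, and the actual implementation is `scripts/hooks/core/phase-read-gate.js`.

```javascript
// Hypothetical sketch of the PreToolUse phase-read gate described above.
// The hook's real state shape and return contract are NOT shown in the
// diff; this only illustrates the phase -> file mapping and the block.
const PHASE_FILES = {
  exploring: ".claude/docs/phases/01-explore.md",
  spec_review: ".claude/docs/phases/02-spec.md",
  coding: ".claude/docs/phases/03-implement.md",
  validating: ".claude/docs/phases/04-verify.md",
  completing: ".claude/docs/phases/05-complete.md",
};

const BLOCKED_TOOLS = new Set(["Edit", "Write", "Bash"]);

function phaseReadGate(tool, phase, filesReadThisSession) {
  const required = PHASE_FILES[phase];
  // Only gate the mutating tools, and only when the phase has a doc.
  if (!required || !BLOCKED_TOOLS.has(tool)) return { allow: true };
  if (filesReadThisSession.has(required)) return { allow: true };
  return { allow: false, reason: `Read ${required} before using ${tool}` };
}
```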