wogiflow 1.7.0 → 1.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (83)
  1. package/.claude/commands/wogi-compact.md +49 -0
  2. package/.claude/commands/wogi-eval.md +135 -0
  3. package/.claude/commands/wogi-onboard.md +66 -1
  4. package/.claude/commands/wogi-register.md +185 -0
  5. package/.claude/commands/wogi-start.md +109 -1
  6. package/.workflow/templates/claude-md.hbs +2 -0
  7. package/.workflow/templates/partials/user-commands.hbs +1 -0
  8. package/.workflow/templates/prompts/gemini-flash.yaml +42 -0
  9. package/.workflow/templates/prompts/gpt4o.yaml +44 -0
  10. package/.workflow/templates/prompts/haiku.yaml +42 -0
  11. package/.workflow/templates/prompts/opus.yaml +45 -0
  12. package/.workflow/templates/prompts/sonnet.yaml +44 -0
  13. package/package.json +1 -1
  14. package/scripts/flow-best-of-n.js +432 -0
  15. package/scripts/flow-community-sync.js +469 -0
  16. package/scripts/flow-eval-judge.js +388 -0
  17. package/scripts/flow-eval.js +430 -0
  18. package/scripts/flow-model-router.js +10 -0
  19. package/scripts/flow-plugin-registry.js +631 -0
  20. package/scripts/flow-proactive-compact.js +341 -0
  21. package/scripts/flow-prompt-template.js +517 -0
  22. package/scripts/flow-revision-tracker.js +258 -0
  23. package/scripts/flow-skill-freshness.js +39 -18
  24. package/scripts/flow-skill-generator.js +90 -27
  25. package/scripts/flow-stack-wizard.js +2 -2
  26. package/scripts/flow-stats-collector.js +534 -0
  27. package/scripts/flow-sync-anonymizer.js +254 -0
  28. package/scripts/flow-task-checkpoint.js +497 -0
  29. package/scripts/flow-tech-options.js +4 -1
  30. package/scripts/flow-utils.js +47 -11
  31. package/scripts/hooks/core/session-context.js +12 -0
  32. package/scripts/hooks/core/session-end.js +12 -0
  33. package/scripts/hooks/core/task-completed.js +32 -0
  34. package/scripts/hooks/entry/claude-code/session-start.js +56 -10
  35. package/templates/skills/angular/skill.md +1 -1
  36. package/templates/skills/anthropic/knowledge/anti-patterns.md +78 -0
  37. package/templates/skills/anthropic/knowledge/conventions.md +18 -0
  38. package/templates/skills/anthropic/knowledge/learnings.md +5 -0
  39. package/templates/skills/anthropic/knowledge/patterns.md +111 -0
  40. package/templates/skills/anthropic/skill.md +61 -0
  41. package/templates/skills/commander/knowledge/anti-patterns.md +71 -0
  42. package/templates/skills/commander/knowledge/conventions.md +17 -0
  43. package/templates/skills/commander/knowledge/learnings.md +5 -0
  44. package/templates/skills/commander/knowledge/patterns.md +80 -0
  45. package/templates/skills/commander/skill.md +61 -0
  46. package/templates/skills/cypress/skill.md +1 -1
  47. package/templates/skills/django/skill.md +1 -1
  48. package/templates/skills/docker/skill.md +1 -1
  49. package/templates/skills/eslint/skill.md +1 -1
  50. package/templates/skills/express/skill.md +1 -1
  51. package/templates/skills/fastapi/skill.md +1 -1
  52. package/templates/skills/fastify/skill.md +1 -1
  53. package/templates/skills/flask/skill.md +1 -1
  54. package/templates/skills/hono/skill.md +1 -1
  55. package/templates/skills/jest/skill.md +1 -1
  56. package/templates/skills/nestjs/skill.md +1 -1
  57. package/templates/skills/openai/knowledge/anti-patterns.md +69 -0
  58. package/templates/skills/openai/knowledge/conventions.md +18 -0
  59. package/templates/skills/openai/knowledge/learnings.md +5 -0
  60. package/templates/skills/openai/knowledge/patterns.md +121 -0
  61. package/templates/skills/openai/skill.md +61 -0
  62. package/templates/skills/playwright/skill.md +1 -1
  63. package/templates/skills/prisma/skill.md +1 -1
  64. package/templates/skills/pytest/skill.md +1 -1
  65. package/templates/skills/svelte/skill.md +1 -1
  66. package/templates/skills/tailwindcss/skill.md +1 -1
  67. package/templates/skills/terraform/skill.md +1 -1
  68. package/templates/skills/typescript/skill.md +1 -1
  69. package/templates/skills/vitest/skill.md +2 -2
  70. package/templates/skills/zod/skill.md +1 -1
  71. package/.claude/rules/README.md +0 -60
  72. package/.claude/rules/architecture/component-reuse.md +0 -38
  73. package/.claude/rules/architecture/document-structure.md +0 -76
  74. package/.claude/rules/architecture/dual-repo-management.md +0 -169
  75. package/.claude/rules/architecture/feature-refactoring-cleanup.md +0 -87
  76. package/.claude/rules/architecture/model-management.md +0 -35
  77. package/.claude/rules/architecture/self-maintenance.md +0 -87
  78. package/.claude/rules/code-style/naming-conventions.md +0 -55
  79. package/.claude/rules/security/security-patterns.md +0 -176
  80. package/.claude/skills/figma-analyzer/knowledge/learnings.md +0 -11
  81. package/.workflow/specs/architecture.md.template +0 -24
  82. package/.workflow/specs/stack.md.template +0 -33
  83. package/.workflow/specs/testing.md.template +0 -36
@@ -123,6 +123,55 @@ With smart compaction enabled (`config.smartCompaction.enabled`), context is man
123
123
 
124
124
  This means fixed thresholds are less relevant - compaction happens when actually needed based on the specific task.
125
125
 
126
+ ### Proactive Phase-Boundary Compaction (v2.3)
127
+
128
+ With proactive compaction enabled (`config.proactiveCompaction.enabled`), WogiFlow compacts between task phases:
129
+
130
+ - **Phase boundaries**: After explore, spec, each scenario, criteria check, validation
131
+ - **Trigger threshold**: Default 75% context usage (configurable via `triggerThreshold`)
132
+ - **Task checkpoints**: Full task state saved to `.workflow/state/task-checkpoint.json` at every phase boundary
133
+ - **Auto-compact recovery**: If Claude's auto-compact fires, checkpoint enables lossless recovery
134
+
135
+ **How it works:**
136
+ 1. At each phase boundary, `/wogi-start` saves a task checkpoint (task ID, phase, scenarios, files changed)
137
+ 2. If context exceeds the trigger threshold, proactive compaction fires before the next phase
138
+ 3. If Claude auto-compacts (at ~95%), session resume reads the checkpoint and restores full state
139
+
140
+ **Recovery flow:**
141
+ ```
142
+ Auto-compact fires at ~95% → Session resumes with compressed context
143
+ → /wogi-start detects checkpoint exists → Reads task-checkpoint.json
144
+ → Displays: "Auto-compact detected. Restoring task state from checkpoint..."
145
+ → Continues from the exact phase where it left off
146
+ ```
147
+
148
+ **Config** (`config.proactiveCompaction`):
149
+ ```json
150
+ {
151
+ "enabled": true,
152
+ "triggerThreshold": 0.75,
153
+ "useHaiku": true,
154
+ "phases": ["exploring", "spec_review", "scenario", "criteria_check", "validating"]
155
+ }
156
+ ```
157
+
158
+ **CLI commands:**
159
+ ```bash
160
+ # Check if compaction needed at a phase
161
+ node scripts/flow-proactive-compact.js check exploring 0.78 wf-a1b2c3d4
162
+
163
+ # Show current config
164
+ node scripts/flow-proactive-compact.js config
165
+
166
+ # Generate compaction context from checkpoint
167
+ node scripts/flow-proactive-compact.js context
168
+
169
+ # View/manage checkpoints
170
+ node scripts/flow-task-checkpoint.js load
171
+ node scripts/flow-task-checkpoint.js check
172
+ node scripts/flow-task-checkpoint.js clear wf-a1b2c3d4
173
+ ```
174
+
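The phase-boundary check described above can be sketched as a small pure function. This is a minimal illustration, not the code in `flow-proactive-compact.js`: the function name `shouldCompact` and the shape of its return value are assumptions, and only the config keys (`enabled`, `triggerThreshold`, `phases`) come from the documented config.

```javascript
// Hypothetical sketch of the phase-boundary decision; names are illustrative.
const DEFAULT_CONFIG = {
  enabled: true,
  triggerThreshold: 0.75,
  phases: ['exploring', 'spec_review', 'scenario', 'criteria_check', 'validating'],
};

// contextUsage is a fraction between 0 and 1 (0.78 means 78% of the context window).
function shouldCompact(phase, contextUsage, config = DEFAULT_CONFIG) {
  if (!config.enabled) return { compact: false, reason: 'disabled' };
  if (!config.phases.includes(phase)) return { compact: false, reason: 'phase not configured' };
  if (contextUsage >= config.triggerThreshold) return { compact: true, reason: 'threshold reached' };
  return { compact: false, reason: 'below threshold' };
}

console.log(shouldCompact('exploring', 0.78));
console.log(shouldCompact('exploring', 0.5));
```

With the default threshold of 0.75, a usage of 0.78 at the `exploring` boundary would trigger compaction, while 0.5 would not.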
126
175
  ### Legacy Fixed Thresholds
127
176
 
128
177
  If smart compaction is disabled, check context pressure status:
@@ -0,0 +1,135 @@
1
+ ---
2
+ description: "Evaluate WogiFlow task output quality with multi-judge scoring"
3
+ ---
4
+ Evaluate a completed task's output quality using multi-judge scoring (1 Opus + 2 Sonnet).
5
+
6
+ ## Usage
7
+
8
+ ```
9
+ /wogi-eval wf-XXXXXXXX Evaluate a specific task
10
+ /wogi-eval --batch --last 5 Evaluate the last 5 completed tasks
11
+ /wogi-eval --compare Show eval trend comparison
12
+ /wogi-eval --candidates Show tasks eligible for evaluation
13
+ ```
14
+
15
+ ## How It Works
16
+
17
+ 1. **Read the spec**: Load the task's acceptance criteria and requirements
18
+ 2. **Get the diff**: Find the commit and extract the implementation diff
19
+ 3. **Spawn 3 judge agents**: 1 Opus + 2 Sonnet (via Agent tool `model` parameter)
20
+ 4. **Score independently**: Each judge scores on 5 dimensions (1-10)
21
+ 5. **Take median**: Final score = median of 3 judges per dimension
22
+ 6. **Save results**: Store in `.workflow/evals/`
23
+
24
+ ## Scoring Dimensions
25
+
26
+ | Dimension | What It Measures |
27
+ |-----------|-----------------|
28
+ | Completeness | Did implementation address ALL acceptance criteria? |
29
+ | Accuracy | Is code correct, handling edge cases? |
30
+ | Workflow Compliance | Did it follow WogiFlow patterns (spec, criteria check, wiring, standards)? |
31
+ | Token Efficiency | How many tokens/iterations to reach passing state? |
32
+ | Quality | Code quality, readability, maintainability |
33
+
34
+ ## Execution Flow
35
+
36
+ ### Step 1: Prepare eval data
37
+
38
+ ```bash
39
+ node scripts/flow-eval.js prepare wf-XXXXXXXX
40
+ ```
41
+
42
+ This returns: spec content, implementation diff, iteration count, token estimate.
43
+
44
+ ### Step 2: Spawn judge agents
45
+
46
+ Launch 3 agents in parallel using the Agent tool:
47
+
48
+ ```
49
+ Agent(model: "opus", prompt: "<judge prompt with spec + diff>")
50
+ Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
51
+ Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
52
+ ```
53
+
54
+ Each judge receives the same prompt (from `buildJudgePrompt()` in `flow-eval-judge.js`) and scores independently.
55
+
56
+ ### Step 3: Aggregate scores
57
+
58
+ ```javascript
59
+ const { aggregateScores, parseJudgeResponse } = require('./scripts/flow-eval-judge');
60
+
61
+ // Parse each judge's response
62
+ const scores = judgeResponses.map(parseJudgeResponse).filter(Boolean);
63
+
64
+ // Take median per dimension
65
+ const result = aggregateScores(scores);
66
+ ```
67
+
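As a rough illustration of the per-dimension median step, the sketch below aggregates three judges' scores. It is not the implementation of `aggregateScores()` in `flow-eval-judge.js`: the name `aggregateSketch`, the result shape, and the choice to compute the overall score as the mean of the dimension medians are all assumptions.

```javascript
// Hypothetical sketch; the real aggregateScores() may differ.
const DIMENSIONS = ['completeness', 'accuracy', 'workflowCompliance', 'tokenEfficiency', 'quality'];

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function aggregateSketch(judgeScores, passingThreshold = 6) {
  const dimensions = {};
  for (const dim of DIMENSIONS) {
    dimensions[dim] = median(judgeScores.map((scores) => scores[dim]));
  }
  // Assumption: overall = mean of the dimension medians, rounded to one decimal.
  const total = DIMENSIONS.reduce((sum, dim) => sum + dimensions[dim], 0);
  const overall = Math.round((total / DIMENSIONS.length) * 10) / 10;
  return { dimensions, overall, passed: overall >= passingThreshold };
}

const result = aggregateSketch([
  { completeness: 8, accuracy: 7, workflowCompliance: 9, tokenEfficiency: 6, quality: 8 },
  { completeness: 9, accuracy: 6, workflowCompliance: 9, tokenEfficiency: 5, quality: 8 },
  { completeness: 7, accuracy: 8, workflowCompliance: 8, tokenEfficiency: 7, quality: 7 },
]);
console.log(result);
```

Taking the median rather than the mean keeps a single outlier judge from moving a dimension's score.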
68
+ ### Step 4: Save and display
69
+
70
+ ```javascript
71
+ const { saveEvalResult, formatEvalResults } = require('./scripts/flow-eval');
72
+ saveEvalResult({ taskId, aggregated: result, judgeResults: scores, model, taskType });
73
+ ```
74
+
75
+ ## Output Format
76
+
77
+ ```
78
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
79
+ 📊 EVAL RESULTS: wf-XXXXXXXX
80
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
81
+
82
+ Judges: 3 (1 Opus + 2 Sonnet) | Confidence: high
83
+
84
+ completeness ████████░░ 8/10
85
+ accuracy ███████░░░ 7/10
86
+ workflowCompliance █████████░ 9/10
87
+ tokenEfficiency ██████░░░░ 6/10
88
+ quality ████████░░ 8/10
89
+
90
+ Overall: 7.6/10 — PASS (threshold: 6)
91
+
92
+ Individual Judges:
93
+ Judge 1 (opus): Strong implementation, minor edge case gaps
94
+ Judge 2 (sonnet): Good workflow compliance, token usage could improve
95
+ Judge 3 (sonnet): Clean code, well-structured implementation
96
+
97
+ Saved: .workflow/evals/wf-XXXXXXXX-eval-2026-03-02T10-00-00.json
98
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
99
+ ```
100
+
101
+ ## Batch Mode
102
+
103
+ When running `--batch --last N`:
104
+ 1. Get the last N completed tasks from stats
105
+ 2. Evaluate each sequentially
106
+ 3. Display summary table
107
+
108
+ ```
109
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
110
+ 📊 BATCH EVAL RESULTS
111
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
112
+
113
+ Task Model Overall Comp Acc WF Tok Qual
114
+ wf-a1b2c3d4 opus-4-6 7.6 8 7 9 6 8
115
+ wf-e5f6a7b8 sonnet-4-6 6.8 7 7 8 5 7
116
+ wf-c9d0e1f2 opus-4-6 8.2 9 8 9 7 8
117
+
118
+ Average: 7.5/10
119
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
120
+ ```
121
+
122
+ ## Configuration
123
+
124
+ In `config.json`:
125
+ ```json
126
+ {
127
+ "eval": {
128
+ "judges": { "opus": 1, "sonnet": 2 },
129
+ "scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality"],
130
+ "passingThreshold": 6
131
+ }
132
+ }
133
+ ```
134
+
135
+ ARGUMENTS: $ARGUMENTS
@@ -345,12 +345,31 @@ Display:
345
345
 
346
346
  If user approves, create task entries in ready.json backlog, grouped by category.
347
347
 
348
+ **CRITICAL: Task ID Generation**
349
+ For EACH task created from health findings:
350
+ 1. Generate the ID by running: `node -e "const { generateTaskId } = require('./scripts/flow-utils'); console.log(generateTaskId('[category] health findings'));"` — or call `generateTaskId()` programmatically
351
+ 2. The ID MUST be in format `wf-[8 hex chars]` (e.g., `wf-a1b2c3d4`)
352
+ 3. **NEVER** manually construct descriptive IDs like `WF-health-1`, `wf-redundancy-check`, etc.
353
+ 4. The descriptive name goes in the `title` field, NOT the `id` field
354
+ 5. Example entry:
355
+ ```json
356
+ {
357
+ "id": "wf-a1b2c3d4",
358
+ "title": "Health: Consolidate 3 redundant button components",
359
+ "type": "refactor",
360
+ "feature": "health-scan",
361
+ "status": "ready",
362
+ "priority": "P2",
363
+ "createdAt": "[ISO timestamp]"
364
+ }
365
+ ```
366
+
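The ID format rule above can be checked mechanically. The sketch below is an assumption-laden illustration: `isValidTaskId` and `makeTaskIdSketch` are hypothetical helpers, and the real `generateTaskId()` in `scripts/flow-utils` may derive the hex suffix differently (for example from the title passed to it).

```javascript
const crypto = require('crypto');

// Matches the documented format: "wf-" followed by exactly 8 lowercase hex characters.
const TASK_ID_PATTERN = /^wf-[0-9a-f]{8}$/;

function isValidTaskId(id) {
  return TASK_ID_PATTERN.test(id);
}

// Hypothetical stand-in for generateTaskId(): 4 random bytes encode to 8 hex characters.
function makeTaskIdSketch() {
  return `wf-${crypto.randomBytes(4).toString('hex')}`;
}

console.log(isValidTaskId('wf-a1b2c3d4'));         // documented example ID
console.log(isValidTaskId('WF-health-1'));         // descriptive ID, rejected
console.log(isValidTaskId(makeTaskIdSketch()));
```

A validator like this could reject hand-built descriptive IDs before an entry is written to `ready.json`.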
348
367
  **If "Paste known issues":**
349
368
  ```
350
369
  Paste your known issues or tech debt below.
351
370
  (One per line, or a comma-separated list)
352
371
  ```
353
- If issues provided, create task entries in ready.json backlog.
372
+ If issues provided, create task entries in ready.json backlog using the same ID generation rules above (call `generateTaskId()`, never construct IDs manually).
354
373
 
355
374
  **If "Skip for now":**
356
375
  Continue to Phase 4. User can run `/wogi-review` or `/wogi-health` later.
@@ -655,6 +674,52 @@ Display:
655
674
  // Check for conventional commits, ticket prefixes, etc.
656
675
  ```
657
676
 
677
+ **Model Routing Configuration:**
678
+
679
+ Present the user with a model routing choice using `AskUserQuestion`:
680
+
681
+ ```
682
+ How should WogiFlow route sub-tasks to AI models?
683
+
684
+ 1. "Full Opus (Recommended)" — Maximum quality. All sub-agents use Opus.
685
+ Best for complex projects where quality matters most.
686
+
687
+ 2. "Smart Routing" — Opus orchestrates, Sonnet handles implementation/review,
688
+ Haiku handles searches/lookups. Best quality-to-cost balance.
689
+ Preserves context window by offloading sub-tasks to lighter models.
690
+
691
+ 3. "Custom" — Configure your own routing rules per task type.
692
+ ```
693
+
694
+ Based on choice:
695
+ - Option 1: Set `config.hybrid.enabled = false` (all tasks stay with current model)
696
+ - Option 2: Set `config.hybrid.enabled = true` with default routing table (already configured)
697
+ - Option 3: Set `config.hybrid.enabled = true` and guide user through per-task-type routing overrides
698
+
699
+ Display: ` Model routing... ✓ [Smart Routing | Full Opus | Custom]`
700
+
701
+ **Community Knowledge Sync:**
702
+
703
+ Present opt-in question using `AskUserQuestion`:
704
+
705
+ ```
706
+ Would you like to share anonymized model performance data with the WogiFlow community?
707
+
708
+ What's shared: model ID, task type, iteration count, token usage, wall clock time
709
+ What's NOT shared: file paths, code, project names, task descriptions
710
+
711
+ You'll receive back: community-optimized model routing rules and capability scores.
712
+
713
+ 1. "Enable (Recommended)" — Help improve WogiFlow for everyone
714
+ 2. "Disable" — Keep all data local only
715
+ ```
716
+
717
+ Based on choice:
718
+ - Option 1: Set `config.communitySync.enabled = true`
719
+ - Option 2: Set `config.communitySync.enabled = false` (default)
720
+
721
+ Display: ` Community sync... ✓ [Enabled | Disabled]`
722
+
658
723
  **Commit style detection:**
659
724
  ```bash
660
725
  git log --oneline -20 --format="%s"
@@ -0,0 +1,185 @@
1
+ ---
2
+ description: "Register Claude Code plugins for /wogi-start routing"
3
+ allowed-tools: "Read,Glob,Grep,WebSearch,WebFetch,Edit,Write,Bash,Agent,ToolSearch,ListMcpResourcesTool,ReadMcpResourceTool,AskUserQuestion"
4
+ user-invocable: true
5
+ ---
6
+
7
+ # /wogi-register — Plugin Registration
8
+
9
+ Register Claude Code plugins so that `/wogi-start` can automatically route requests to them.
10
+
11
+ ## Usage
12
+
13
+ ```
14
+ /wogi-register <plugin-name> Register a new plugin (auto-discover capabilities)
15
+ /wogi-register --list List all registered plugins
16
+ /wogi-register --remove <name> Remove a registered plugin
17
+ ```
18
+
19
+ ## How It Works
20
+
21
+ When you run `/wogi-register <plugin-name>`, the system:
22
+
23
+ 1. **Inspects MCP tools** matching the plugin name (most reliable)
24
+ 2. **Searches online** for the plugin's documentation and capabilities
25
+ 3. **Generates a plugin entry** with triggers, capabilities, and invocation details
26
+ 4. **Saves to registry** at `.workflow/state/plugin-registry.json`
27
+ 5. **Displays summary** of discovered capabilities for confirmation
28
+
29
+ After registration, `/wogi-start` will automatically route matching requests to the plugin.
30
+
31
+ ## Registration Flow
32
+
33
+ ### Step 1: MCP Tool Discovery
34
+
35
+ First, try to discover the plugin's capabilities through MCP tools:
36
+
37
+ 1. Run `node scripts/flow-plugin-registry.js scan` to check for unregistered MCP servers
38
+ 2. Use `ToolSearch` to search for tools matching the plugin name pattern
39
+ 3. Use `ListMcpResourcesTool` to check for MCP resources from matching servers
40
+ 4. Extract: tool names, descriptions, input schemas
41
+ 5. Map each tool to a capability entry
42
+
43
+ ### Step 2: Web Search Discovery (if MCP insufficient)
44
+
45
+ If MCP inspection yields few or no results:
46
+
47
+ 1. Search for `"<plugin-name> Claude Code plugin capabilities"`
48
+ 2. Search for `"<plugin-name> Claude Code MCP tools"`
49
+ 3. Search for the plugin's documentation page
50
+ 4. Extract capabilities from documentation
51
+ 5. Generate trigger phrases from discovered capabilities
52
+
53
+ ### Step 3: Build Plugin Entry
54
+
55
+ From the discovered information, construct:
56
+
57
+ ```json
58
+ {
59
+ "name": "<plugin-name>",
60
+ "description": "Human-readable description of the plugin",
61
+ "source": "mcp|web-discovered|manual",
62
+ "triggers": ["phrase 1", "phrase 2"],
63
+ "capabilities": [
64
+ {
65
+ "action": "action-name",
66
+ "description": "What this action does",
67
+ "triggerPhrases": ["send to X", "push to X"],
68
+ "mcpTool": "mcp__server__tool_name or null",
69
+ "requiresTask": false
70
+ }
71
+ ],
72
+ "metadata": {
73
+ "mcpServer": "server name if MCP-based",
74
+ "docsUrl": "URL to plugin docs if found",
75
+ "version": "plugin version if known"
76
+ }
77
+ }
78
+ ```
79
+
80
+ ### Step 4: User Confirmation
81
+
82
+ Display the discovered capabilities and ask for confirmation:
83
+
84
+ ```
85
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
86
+ Plugin Registration: <plugin-name>
87
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
88
+
89
+ Description: <discovered description>
90
+ Source: MCP tools | Web search | Manual
91
+
92
+ Capabilities discovered (N):
93
+ 1. <action>: <description>
94
+ Triggers: "phrase 1", "phrase 2"
95
+ MCP Tool: mcp__server__tool
96
+
97
+ 2. <action>: <description>
98
+ Triggers: "phrase 3"
99
+
100
+ Trigger phrases (top-level):
101
+ - "send to <plugin>"
102
+ - "push to <plugin>"
103
+ - "use <plugin>"
104
+
105
+ Does this look correct? You can adjust before saving.
106
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
107
+ ```
108
+
109
+ ### Step 5: Save to Registry
110
+
111
+ Call `registerPlugin()` from `scripts/flow-plugin-registry.js`:
112
+
113
+ ```javascript
114
+ const { registerPlugin } = require('./scripts/flow-plugin-registry');
115
+ registerPlugin({
116
+ name: pluginName,
117
+ description: discoveredDescription,
118
+ source: discoverySource,
119
+ triggers: topLevelTriggers,
120
+ capabilities: discoveredCapabilities,
121
+ metadata: { mcpServer, docsUrl, version }
122
+ });
123
+ ```
124
+
125
+ ## Re-Registration (Update)
126
+
127
+ When `/wogi-register <plugin-name>` is run for an already-registered plugin:
128
+
129
+ 1. Re-discover capabilities (same flow as above)
130
+ 2. Compare with existing registration
131
+ 3. Display diff: new capabilities, removed capabilities, changed triggers
132
+ 4. Update the existing entry (preserves registeredAt timestamp)
133
+ 5. Display: `Plugin "<name>" updated. Added N capabilities, removed M.`
134
+
135
+ ## --list Mode
136
+
137
+ Display all registered plugins:
138
+
139
+ ```
140
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
141
+ Registered Plugins (N)
142
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
143
+
144
+ figma [active]
145
+ 4 capabilities | Source: mcp
146
+ Triggers: "send to figma", "push to figma", "create in figma"
147
+
148
+ linear [active]
149
+ 3 capabilities | Source: web-discovered
150
+ Triggers: "create linear issue", "sync with linear"
151
+
152
+ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
153
+ ```
154
+
155
+ If no plugins registered:
156
+ ```
157
+ No plugins registered. Install a Claude Code plugin and run:
158
+ /wogi-register <plugin-name>
159
+ ```
160
+
161
+ ## --remove Mode
162
+
163
+ ```
164
+ Removed plugin: <plugin-name>
165
+ Was registered with N capabilities
166
+ /wogi-start will no longer route to this plugin
167
+ ```
168
+
169
+ ## Important
170
+
171
+ - The system is **fully generic** — it does NOT hardcode any plugin-specific logic
172
+ - Plugin-specific knowledge is discovered at registration time, not built-in
173
+ - All trigger matching uses word overlap scoring with a 0.5 minimum threshold
174
+ - Built-in `/wogi-*` commands always take priority over plugin routing
175
+ - Plugin actions are tracked through the normal WogiFlow task system when `trackPluginActions` is enabled
176
+
177
+ ## Auto-Discovery on Session Start
178
+
179
+ When `config.plugins.autoScanOnSessionStart` is true:
180
+ - The session-start hook compares available MCP servers against the registry
181
+ - New unregistered servers are auto-registered with discovered capabilities
182
+ - Previously registered servers that are no longer available are marked `inactive`
183
+ - Display: `New plugin detected: <name>. Auto-registered with N capabilities.`
184
+
185
+ Mid-session plugin installs require manual `/wogi-register <name>`.
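The session-start comparison described above amounts to a set reconciliation between the registry and the currently available MCP servers. The sketch below is illustrative only: `reconcilePlugins` is a hypothetical name, the `status` field and the assumption that a plugin's name equals its MCP server name are not confirmed by the source, and capability discovery is left out.

```javascript
// Hypothetical sketch of registry reconciliation at session start.
function reconcilePlugins(registry, availableServers) {
  const available = new Set(availableServers);
  const known = new Set(registry.map((plugin) => plugin.name));

  // Registered plugins whose server disappeared are marked inactive, not removed.
  const updated = registry.map((plugin) => ({
    ...plugin,
    status: available.has(plugin.name) ? 'active' : 'inactive',
  }));

  // Servers not yet in the registry are auto-registered (capabilities omitted here).
  const added = availableServers
    .filter((name) => !known.has(name))
    .map((name) => ({ name, source: 'mcp', status: 'active', capabilities: [] }));

  return { plugins: [...updated, ...added], newlyRegistered: added.map((plugin) => plugin.name) };
}

const reconciled = reconcilePlugins(
  [{ name: 'figma', source: 'mcp', status: 'active' }, { name: 'linear', source: 'web-discovered', status: 'active' }],
  ['figma', 'notion'],
);
console.log(reconciled.newlyRegistered);
```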
@@ -36,6 +36,31 @@ When `config.longInputGate.enabled` is `true`:
36
36
  - Prompt is a task ID → already handled in Step 0
37
37
  - Prompt content is primarily code (>80% code blocks) → skip, as code pastes are better handled by normal triage
38
38
 
39
+ ### Step 0.4: Plugin Registry Routing (Automatic)
40
+
41
+ **After the command catalog finds no match, check if the request matches a registered plugin.** Plugin routing has lower priority than built-in `/wogi-*` commands.
42
+
43
+ When `config.plugins.enabled` is `true`:
44
+
45
+ 1. Read `.workflow/state/plugin-registry.json` (the plugin registry)
46
+ 2. For each active plugin, check if the user's request matches any trigger phrase
47
+ 3. Use word overlap scoring (minimum threshold: 0.5) to find the best match
48
+ 4. **If a plugin match is found** (score >= 0.5):
49
+ - Display: `Plugin match: "<plugin-name>" (score: X.XX, trigger: "<matched phrase>")`
50
+ - If the matched capability has an `mcpTool` → use ToolSearch to load and invoke it
51
+ - If no specific `mcpTool` → display the plugin's capabilities and ask the user which action to take
52
+ - If `config.plugins.trackPluginActions` is true → create a lightweight task entry for tracking
53
+ 5. **If no plugin match** → Continue to the Command Catalog below
54
+
55
+ **Plugin routing has LOWER priority than built-in `/wogi-*` commands.** If a request matches both a built-in command and a plugin trigger, the built-in command wins. Plugin routing is the fallback AFTER the command catalog finds no match.
56
+
57
+ **Actual routing order:**
58
+ 1. Check if request is a task ID → Structured Execution
59
+ 2. Check long input gate → `/wogi-extract-review`
60
+ 3. Check Command Catalog → matching `/wogi-*` command
61
+ 4. Check Plugin Registry → matching plugin capability
62
+ 5. Default → `/wogi-story` (implementation request)
63
+
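The source does not define "word overlap scoring", so the sketch below assumes one plausible reading: the score is the share of a trigger phrase's words that also appear in the request. The function names and the `active` flag on plugin entries are hypothetical; only the 0.5 threshold and the trigger phrases come from the text.

```javascript
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Assumed definition: fraction of the trigger's words found in the request.
function overlapScore(request, trigger) {
  const requestWords = new Set(tokenize(request));
  const triggerWords = tokenize(trigger);
  if (triggerWords.length === 0) return 0;
  const hits = triggerWords.filter((word) => requestWords.has(word)).length;
  return hits / triggerWords.length;
}

// Returns the best-scoring trigger at or above the threshold, or null.
function matchPlugin(request, plugins, threshold = 0.5) {
  let best = null;
  for (const plugin of plugins) {
    if (plugin.active === false) continue;
    for (const trigger of plugin.triggers) {
      const score = overlapScore(request, trigger);
      if (score >= threshold && (best === null || score > best.score)) {
        best = { plugin: plugin.name, trigger, score };
      }
    }
  }
  return best;
}

const plugins = [
  { name: 'figma', triggers: ['send to figma', 'push to figma'] },
  { name: 'linear', triggers: ['create linear issue', 'sync with linear'] },
];
console.log(matchPlugin('send this design to Figma', plugins));
console.log(matchPlugin('fix the login bug', plugins));
```

Under this definition, common words such as "to" count toward the score, so a real implementation might filter stop words before comparing.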
39
64
  ### Command Catalog
40
65
 
41
66
  Think of each command below as a tool available to you. Read the user's request, understand what they need, and invoke the best-fit command using the Skill tool.
@@ -63,6 +88,7 @@ Think of each command below as a tool available to you. Read the user's request,
63
88
  | `/wogi-decide` | Creates/updates project rules with clarifying questions | User says **"from now on" + rule verb** (always/never/must/should), "let's make it a rule", "update our rules". Note: "from now on" alone is not sufficient — require a follow-on rule verb to distinguish from implementation requests. |
64
89
  | `/wogi-learn` | Promotes feedback patterns to decision rules | User says **"let's learn from this"**, "we keep making this mistake", "extract lessons" |
65
90
  | `/wogi-retrospective` | Guided session reflection with lesson capture | User says **"retro"**, "what went well", "what can we improve", "lessons learned" |
91
+ | `/wogi-register` | Register Claude Code plugins for /wogi-start routing | User wants to **register a plugin**, list registered plugins, or remove a plugin registration |
66
92
 
67
93
  ### Internal Tools (Auto-Invoked by wogi-start)
68
94
 
@@ -240,6 +266,18 @@ User: "help me think through how the hook architecture should evolve"
240
266
  → Action: Read relevant code, discuss architecture options. No files written, no tasks created.
241
267
  ```
242
268
 
269
+ ```
270
+ User: "send this design to Figma"
271
+ → Intent: PLUGIN ROUTING — request matches a registered plugin trigger
272
+ → Action: Check plugin-registry.json. If "figma" plugin registered with trigger "send to figma" → route to plugin. If not registered → suggest /wogi-register figma
273
+ ```
274
+
275
+ ```
276
+ User: "register the linear plugin"
277
+ → Intent: Plugin registration
278
+ → Action: Invoke /wogi-register linear
279
+ ```
280
+
243
281
  ```
244
282
  User: "yes"
245
283
  → Intent: CONVERSATIONAL FOLLOW-UP — user is responding to a previous AI question
@@ -330,6 +368,34 @@ At each execution milestone, update the workflow phase. These are no-ops when ph
330
368
 
331
369
  If a transition fails (wrong current phase), it's non-blocking — log and continue.
332
370
 
371
+ ### Task Checkpoints (when `config.proactiveCompaction.enabled`)
372
+
373
+ At each phase boundary, save a task checkpoint and check if proactive compaction is needed. This enables lossless recovery after auto-compact.
374
+
375
+ **At EVERY phase transition listed above**, also:
376
+ 1. Save checkpoint: Record task ID, current phase, completed scenarios, changed files, verification results to `.workflow/state/task-checkpoint.json`
377
+ 2. Check compaction: If context usage >= `proactiveCompaction.triggerThreshold` (default 75%), display compaction message and run `/wogi-compact` before proceeding
378
+
379
+ **Checkpoint integration points:**
380
+ | When | Checkpoint Action |
381
+ |------|-------------------|
382
+ | After explore phase completes | Save exploration summary + related files |
383
+ | After spec is generated | Save spec path + acceptance criteria count |
384
+ | After each scenario completes | Update scenario progress (completed/pending) |
385
+ | After criteria check | Save verification results |
386
+ | Before final validation | Save all changed files list |
387
+ | After task completion | Clear checkpoint |
388
+
389
+ **Auto-compact recovery** (on session resume):
390
+ 1. Check `.workflow/state/task-checkpoint.json` for an active checkpoint
391
+ 2. If checkpoint exists with incomplete scenarios → display recovery message:
392
+ `Auto-compact detected. Restoring task state from checkpoint...`
393
+ 3. Reload: task ID, current phase, completed scenarios, spec path, changed files
394
+ 4. Continue execution from the next pending scenario
395
+
396
+ **Haiku-powered summaries** (when `proactiveCompaction.useHaiku: true`):
397
+ When compacting between phases, use the Agent tool with `model: "haiku"` to generate the compaction summary. This preserves Opus context for the actual implementation work.
398
+
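A minimal sketch of the checkpoint save and recovery steps follows. It does not reproduce `flow-task-checkpoint.js`: the helper names and the checkpoint field names (`scenarios`, `completedScenarios`, `changedFiles`) are assumptions chosen to mirror the items listed above, and the demo writes to a temporary directory rather than a real project.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write the full task state as JSON, creating parent directories if needed.
function saveCheckpoint(file, state) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const payload = { ...state, savedAt: new Date().toISOString() };
  fs.writeFileSync(file, JSON.stringify(payload, null, 2));
}

// Return the parsed checkpoint, or null when no checkpoint file exists.
function loadCheckpoint(file) {
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Recovery resumes from the first scenario that is not marked complete.
function nextPendingScenario(checkpoint) {
  if (!checkpoint) return null;
  const done = new Set(checkpoint.completedScenarios);
  return checkpoint.scenarios.find((scenario) => !done.has(scenario)) || null;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wogi-checkpoint-'));
const file = path.join(dir, '.workflow', 'state', 'task-checkpoint.json');
saveCheckpoint(file, {
  taskId: 'wf-a1b2c3d4',
  phase: 'scenario',
  scenarios: ['s1', 's2', 's3'],
  completedScenarios: ['s1'],
  changedFiles: ['src/a.js'],
});
const restored = loadCheckpoint(file);
console.log(nextPendingScenario(restored));
```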
333
399
  ### Execution Flow
334
400
 
335
401
  ```
@@ -804,6 +870,10 @@ Return a structured report:
804
870
 
805
871
  ```javascript
806
872
  // Launch all in parallel (single message, multiple Task tool calls)
873
+ // When hybrid mode is enabled (config.hybrid.enabled), use the model parameter
874
+ // to route sub-agents to the appropriate model tier.
875
+ // Routing is provided by getAgentModel() from flow-prompt-template.js:
876
+ // explore → sonnet, research → sonnet, search → haiku, judging → opus
807
877
  Task(subagent_type=Explore, prompt="Codebase Analyzer: ...")
808
878
  Task(subagent_type=Explore, prompt="Best Practices: ...")
809
879
  Task(subagent_type=Explore, prompt="Version Verifier: ...")
@@ -813,6 +883,21 @@ Task(subagent_type=Explore, prompt="Standards Preview: ...")
813
883
  Task(subagent_type=Explore, prompt="Consumer Impact Analyzer: ...")
814
884
  ```
815
885
 
886
+ **Hybrid Model Routing (S4):**
887
+
888
+ When `config.hybrid.enabled` is `true`, use the Agent tool's `model` parameter to route sub-agents:
889
+
890
+ | Sub-Agent Type | Agent `model` Parameter | Rationale |
891
+ |----------------|------------------------|-----------|
892
+ | Explore/Research | `"sonnet"` | Good analysis capability, saves Opus context |
893
+ | Code Review | `"sonnet"` | Balanced quality for review tasks |
894
+ | Simple Lookup/Search | `"haiku"` | Fast and cheap for file searches |
895
+ | Complex Reasoning | `"opus"` | Only for architecture/planning decisions |
896
+ | Compaction Summary | `"haiku"` | Summaries don't need premium models |
897
+ | Eval Judging | `"opus"` (1) + `"sonnet"` (2) | Multi-judge composition from eval config |
898
+
899
+ The routing table is configured in `scripts/flow-prompt-template.js` and can be overridden via `config.hybrid.routing.overrides`. Capability scores from `.workflow/models/capabilities/*.yaml` are consulted when `checkCapabilities` is true — if a model's score for the task type is below the `capabilityThreshold` (default: 5), the task is escalated to the next tier.
900
+
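The routing and escalation rule can be sketched as a lookup followed by a single tier bump. This is not `getAgentModel()` from `flow-prompt-template.js`: the function name `pickModel`, the task-type keys, the fallback to `opus` for unknown task types, and the shape of the `scores` object are all assumptions made for illustration.

```javascript
// Tiers ordered from lightest to most capable.
const TIERS = ['haiku', 'sonnet', 'opus'];

// Hypothetical default table mirroring the rows above (eval judging omitted).
const DEFAULT_ROUTING = {
  explore: 'sonnet',
  research: 'sonnet',
  review: 'sonnet',
  search: 'haiku',
  reasoning: 'opus',
  compaction: 'haiku',
};

function pickModel(taskType, options = {}) {
  const { overrides = {}, scores = {}, checkCapabilities = false, capabilityThreshold = 5 } = options;
  const model = overrides[taskType] || DEFAULT_ROUTING[taskType] || 'opus';
  if (!checkCapabilities) return model;

  // Escalate one tier when the model's known score for this task type is below the threshold.
  const tier = TIERS.indexOf(model);
  const score = scores[model]?.[taskType];
  const belowThreshold = score !== undefined && score < capabilityThreshold;
  if (belowThreshold && tier !== -1 && tier < TIERS.length - 1) return TIERS[tier + 1];
  return model;
}

console.log(pickModel('search'));
console.log(pickModel('search', { checkCapabilities: true, scores: { haiku: { search: 3 } } }));
```

In this sketch a missing capability score leaves the routed model unchanged, and a task already on the top tier cannot escalate further.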
816
901
  **After all agents complete**, display a consolidated research summary:
817
902
 
818
903
  **Output Format:**
@@ -1801,9 +1886,32 @@ Phase commands:
1801
1886
  ### Scenario keeps failing after max retries
1802
1887
  - Stop and report: "Scenario X failed after N attempts. Issue: [description]"
1803
1888
  - Leave task in inProgress
1804
- - **Auto-suggest hypothesis debugging**: When a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents that analyze competing theories about the root cause
1889
+ - **Best-of-N fallback (high-risk tasks)**: When a HIGH-RISK task (architecture, migration, refactor, or complexity HIGH + files > 10) fails 3+ times, auto-suggest Best-of-N:
1890
+ ```
1891
+ This high-risk task has failed 3 times. Would you like to try Best-of-N?
1892
+ → Spawn 2 alternative implementation approaches in isolated worktrees
1893
+ → Opus judges the best approach against the spec
1894
+ ```
1895
+ Use `checkFallbackTrigger()` from `flow-best-of-n.js` to determine if Best-of-N applies.
1896
+ If the task is NOT high-risk: suggest `/wogi-debug-hypothesis` instead (competing theories about root cause).
1897
+ - **Auto-suggest hypothesis debugging**: For non-high-risk tasks, when a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents
1805
1898
  - User can investigate and re-run `/wogi-start TASK-XXX` to continue
1806
1899
 
1900
+ ### Best-of-N auto-suggestion (high-risk tasks)
1901
+
1902
+ When starting a task, if `config.bestOfN.enabled` is true:
1903
+ 1. Run `assessRisk()` from `flow-best-of-n.js` with the task's type, description, and file count
1904
+ 2. If `shouldSuggest` is true, display:
1905
+ ```
1906
+ This is a high-risk task. Would you like to use Best-of-N?
1907
+ → Spawn 3 approaches in parallel (isolated worktrees)
1908
+ → Opus selects the best implementation
1909
+ Options: [Yes, use Best-of-N] [No, proceed normally]
1910
+ ```
1911
+ 3. If user confirms: spawn N agents using `Agent(isolation: "worktree")` with variation strategy from `getVariationStrategy()`
1912
+ 4. After all complete: spawn Opus judge using `buildSelectionPrompt()` to select winner
1913
+ 5. Apply winner, clean up losing worktrees
1914
+
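The high-risk rule and the failure fallback can be expressed compactly. The sketch below is an interpretation, not the code in `flow-best-of-n.js`: it reads "architecture, migration, refactor, or complexity HIGH + files > 10" as "one of those three task types, or high complexity combined with more than 10 files", and the names `assessRiskSketch` and `fallbackSuggestion`, the input fields, and the `'retry'` value are hypothetical.

```javascript
const HIGH_RISK_TYPES = new Set(['architecture', 'migration', 'refactor']);

// Assumed reading of the high-risk rule described above.
function assessRiskSketch({ type, complexity, fileCount }) {
  const byType = HIGH_RISK_TYPES.has(type);
  const bySize = complexity === 'HIGH' && fileCount > 10;
  const highRisk = byType || bySize;
  return { highRisk, shouldSuggest: highRisk };
}

// After 3 or more failures: Best-of-N for high-risk tasks, hypothesis debugging otherwise.
function fallbackSuggestion(task, failureCount) {
  if (failureCount < 3) return 'retry';
  return assessRiskSketch(task).highRisk ? 'best-of-n' : '/wogi-debug-hypothesis';
}

console.log(fallbackSuggestion({ type: 'migration', complexity: 'LOW', fileCount: 2 }, 3));
console.log(fallbackSuggestion({ type: 'feature', complexity: 'HIGH', fileCount: 5 }, 3));
```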
1807
1915
  ### Quality gate keeps failing
1808
1916
  - Report which gate is failing and why
1809
1917
  - Attempt to fix automatically
@@ -120,6 +120,7 @@ npm install -D wogiflow && npx flow onboard
120
120
  | `/wogi-roadmap` | View/manage deferred work |
121
121
  | `/wogi-suggest "text"` | Submit suggestion for WogiFlow |
122
122
  | `/wogi-audit` | Comprehensive project-wide analysis (7 dimensions) |
123
+ | `/wogi-register` | Register plugins for /wogi-start routing |
123
124
 
124
125
  See `.claude/docs/commands.md` for complete command reference.
125
126
 
@@ -147,6 +148,7 @@ See `.claude/docs/commands.md` for complete command reference.
147
148
  | "rescan project", "re-evaluate project", "project changed", "others made changes", "sync wogi", "things changed", "out of sync" | `/wogi-rescan` |
148
149
  | "suggest improvement", "feature request for wogi", "wogi suggestion", "submit feedback" | `/wogi-suggest` |
149
150
  | "audit project", "project audit", "full project analysis", "full analysis" | `/wogi-audit` |
151
+ | "register plugin", "list plugins", "remove plugin", "register MCP" | `/wogi-register` |
150
152
 
151
153
  **IMPORTANT**: When a user's message matches one of these patterns, immediately invoke the Skill tool with the corresponding command. Do not ask for confirmation. These `/wogi-*` commands satisfy the mandatory routing requirement — you do NOT also need to invoke `/wogi-start` when a detection match exists. `/wogi-start` is the fallback for messages that don't match this table.
152
154
 
@@ -15,5 +15,6 @@
15
15
  | Session retro | "retro" or "what went well" |
16
16
  | Rescan project | "rescan project" or "things changed" or "out of sync" |
17
17
  | Project audit | "audit project" or "full analysis" |
18
+ | Register plugin | "register plugin" or "/wogi-register <name>" |
18
19
 
19
20
  `/wogi-start` is the universal fallback router — it classifies any request and routes to the right action. Detailed per-command docs live in each skill's `.md` file under `.claude/commands/`.