wogiflow 1.7.0 → 1.8.0
- package/.claude/commands/wogi-compact.md +49 -0
- package/.claude/commands/wogi-eval.md +135 -0
- package/.claude/commands/wogi-onboard.md +66 -1
- package/.claude/commands/wogi-register.md +185 -0
- package/.claude/commands/wogi-start.md +109 -1
- package/.workflow/templates/claude-md.hbs +2 -0
- package/.workflow/templates/partials/user-commands.hbs +1 -0
- package/.workflow/templates/prompts/gemini-flash.yaml +42 -0
- package/.workflow/templates/prompts/gpt4o.yaml +44 -0
- package/.workflow/templates/prompts/haiku.yaml +42 -0
- package/.workflow/templates/prompts/opus.yaml +45 -0
- package/.workflow/templates/prompts/sonnet.yaml +44 -0
- package/package.json +1 -1
- package/scripts/flow-best-of-n.js +432 -0
- package/scripts/flow-community-sync.js +469 -0
- package/scripts/flow-eval-judge.js +388 -0
- package/scripts/flow-eval.js +430 -0
- package/scripts/flow-model-router.js +10 -0
- package/scripts/flow-plugin-registry.js +631 -0
- package/scripts/flow-proactive-compact.js +341 -0
- package/scripts/flow-prompt-template.js +517 -0
- package/scripts/flow-revision-tracker.js +258 -0
- package/scripts/flow-skill-freshness.js +39 -18
- package/scripts/flow-skill-generator.js +90 -27
- package/scripts/flow-stack-wizard.js +2 -2
- package/scripts/flow-stats-collector.js +534 -0
- package/scripts/flow-sync-anonymizer.js +254 -0
- package/scripts/flow-task-checkpoint.js +497 -0
- package/scripts/flow-tech-options.js +4 -1
- package/scripts/flow-utils.js +47 -11
- package/scripts/hooks/core/session-context.js +12 -0
- package/scripts/hooks/core/session-end.js +12 -0
- package/scripts/hooks/core/task-completed.js +32 -0
- package/scripts/hooks/entry/claude-code/session-start.js +56 -10
- package/templates/skills/angular/skill.md +1 -1
- package/templates/skills/anthropic/knowledge/anti-patterns.md +78 -0
- package/templates/skills/anthropic/knowledge/conventions.md +18 -0
- package/templates/skills/anthropic/knowledge/learnings.md +5 -0
- package/templates/skills/anthropic/knowledge/patterns.md +111 -0
- package/templates/skills/anthropic/skill.md +61 -0
- package/templates/skills/commander/knowledge/anti-patterns.md +71 -0
- package/templates/skills/commander/knowledge/conventions.md +17 -0
- package/templates/skills/commander/knowledge/learnings.md +5 -0
- package/templates/skills/commander/knowledge/patterns.md +80 -0
- package/templates/skills/commander/skill.md +61 -0
- package/templates/skills/cypress/skill.md +1 -1
- package/templates/skills/django/skill.md +1 -1
- package/templates/skills/docker/skill.md +1 -1
- package/templates/skills/eslint/skill.md +1 -1
- package/templates/skills/express/skill.md +1 -1
- package/templates/skills/fastapi/skill.md +1 -1
- package/templates/skills/fastify/skill.md +1 -1
- package/templates/skills/flask/skill.md +1 -1
- package/templates/skills/hono/skill.md +1 -1
- package/templates/skills/jest/skill.md +1 -1
- package/templates/skills/nestjs/skill.md +1 -1
- package/templates/skills/openai/knowledge/anti-patterns.md +69 -0
- package/templates/skills/openai/knowledge/conventions.md +18 -0
- package/templates/skills/openai/knowledge/learnings.md +5 -0
- package/templates/skills/openai/knowledge/patterns.md +121 -0
- package/templates/skills/openai/skill.md +61 -0
- package/templates/skills/playwright/skill.md +1 -1
- package/templates/skills/prisma/skill.md +1 -1
- package/templates/skills/pytest/skill.md +1 -1
- package/templates/skills/svelte/skill.md +1 -1
- package/templates/skills/tailwindcss/skill.md +1 -1
- package/templates/skills/terraform/skill.md +1 -1
- package/templates/skills/typescript/skill.md +1 -1
- package/templates/skills/vitest/skill.md +2 -2
- package/templates/skills/zod/skill.md +1 -1
- package/.claude/rules/README.md +0 -60
- package/.claude/rules/architecture/component-reuse.md +0 -38
- package/.claude/rules/architecture/document-structure.md +0 -76
- package/.claude/rules/architecture/dual-repo-management.md +0 -169
- package/.claude/rules/architecture/feature-refactoring-cleanup.md +0 -87
- package/.claude/rules/architecture/model-management.md +0 -35
- package/.claude/rules/architecture/self-maintenance.md +0 -87
- package/.claude/rules/code-style/naming-conventions.md +0 -55
- package/.claude/rules/security/security-patterns.md +0 -176
- package/.claude/skills/figma-analyzer/knowledge/learnings.md +0 -11
- package/.workflow/specs/architecture.md.template +0 -24
- package/.workflow/specs/stack.md.template +0 -33
- package/.workflow/specs/testing.md.template +0 -36
|
@@ -123,6 +123,55 @@ With smart compaction enabled (`config.smartCompaction.enabled`), context is man
|
|
|
123
123
|
|
|
124
124
|
This means fixed thresholds are less relevant - compaction happens when actually needed based on the specific task.
|
|
125
125
|
|
|
126
|
+
### Proactive Phase-Boundary Compaction (v2.3)
|
|
127
|
+
|
|
128
|
+
With proactive compaction enabled (`config.proactiveCompaction.enabled`), WogiFlow compacts between task phases:
|
|
129
|
+
|
|
130
|
+
- **Phase boundaries**: After explore, spec, each scenario, criteria check, validation
|
|
131
|
+
- **Trigger threshold**: Default 75% context usage (configurable via `triggerThreshold`)
|
|
132
|
+
- **Task checkpoints**: Full task state saved to `.workflow/state/task-checkpoint.json` at every phase boundary
|
|
133
|
+
- **Auto-compact recovery**: If Claude's auto-compact fires, checkpoint enables lossless recovery
|
|
134
|
+
|
|
135
|
+
**How it works:**
|
|
136
|
+
1. At each phase boundary, `/wogi-start` saves a task checkpoint (task ID, phase, scenarios, files changed)
|
|
137
|
+
2. If context exceeds the trigger threshold, proactive compaction fires before the next phase
|
|
138
|
+
3. If Claude auto-compacts (at ~95%), session resume reads the checkpoint and restores full state
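The threshold check in step 2 can be sketched roughly as follows. This is an illustration only, not the logic shipped in `scripts/flow-proactive-compact.js`: the function name `shouldCompact` and its return shape are assumptions, and only the config field names are taken from `config.proactiveCompaction`.

```javascript
// Hypothetical sketch: decide whether to compact at a phase boundary.
// Field names mirror config.proactiveCompaction; the function itself is illustrative.
const defaultConfig = {
  enabled: true,
  triggerThreshold: 0.75,
  phases: ['exploring', 'spec_review', 'scenario', 'criteria_check', 'validating'],
};

function shouldCompact(phase, contextUsage, config = defaultConfig) {
  // Feature switched off: never compact proactively.
  if (!config.enabled) return { compact: false, reason: 'disabled' };
  // Only the configured phase boundaries are eligible.
  if (!config.phases.includes(phase)) return { compact: false, reason: 'phase not configured' };
  // contextUsage is a fraction of the context window (0.78 means 78%).
  if (contextUsage < config.triggerThreshold) return { compact: false, reason: 'below threshold' };
  return { compact: true, reason: 'threshold reached' };
}

console.log(shouldCompact('exploring', 0.78));
```

With the defaults above, a usage of 0.78 at the `exploring` boundary would trigger compaction, while 0.50 would not.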
|
|
139
|
+
|
|
140
|
+
**Recovery flow:**
|
|
141
|
+
```
|
|
142
|
+
Auto-compact fires at ~95% → Session resumes with compressed context
|
|
143
|
+
→ /wogi-start detects checkpoint exists → Reads task-checkpoint.json
|
|
144
|
+
→ Displays: "Auto-compact detected. Restoring task state from checkpoint..."
|
|
145
|
+
→ Continues from the exact phase where it left off
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
**Config** (`config.proactiveCompaction`):
|
|
149
|
+
```json
|
|
150
|
+
{
|
|
151
|
+
"enabled": true,
|
|
152
|
+
"triggerThreshold": 0.75,
|
|
153
|
+
"useHaiku": true,
|
|
154
|
+
"phases": ["exploring", "spec_review", "scenario", "criteria_check", "validating"]
|
|
155
|
+
}
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
**CLI commands:**
|
|
159
|
+
```bash
|
|
160
|
+
# Check if compaction needed at a phase
|
|
161
|
+
node scripts/flow-proactive-compact.js check exploring 0.78 wf-a1b2c3d4
|
|
162
|
+
|
|
163
|
+
# Show current config
|
|
164
|
+
node scripts/flow-proactive-compact.js config
|
|
165
|
+
|
|
166
|
+
# Generate compaction context from checkpoint
|
|
167
|
+
node scripts/flow-proactive-compact.js context
|
|
168
|
+
|
|
169
|
+
# View/manage checkpoints
|
|
170
|
+
node scripts/flow-task-checkpoint.js load
|
|
171
|
+
node scripts/flow-task-checkpoint.js check
|
|
172
|
+
node scripts/flow-task-checkpoint.js clear wf-a1b2c3d4
|
|
173
|
+
```
|
|
174
|
+
|
|
126
175
|
### Legacy Fixed Thresholds
|
|
127
176
|
|
|
128
177
|
If smart compaction is disabled, check context pressure status:
|
|
@@ -0,0 +1,135 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Evaluate WogiFlow task output quality with multi-judge scoring"
|
|
3
|
+
---
|
|
4
|
+
Evaluate a completed task's output quality using multi-judge scoring (1 Opus + 2 Sonnet).
|
|
5
|
+
|
|
6
|
+
## Usage
|
|
7
|
+
|
|
8
|
+
```
|
|
9
|
+
/wogi-eval wf-XXXXXXXX Evaluate a specific task
|
|
10
|
+
/wogi-eval --batch --last 5 Evaluate the last 5 completed tasks
|
|
11
|
+
/wogi-eval --compare Show eval trend comparison
|
|
12
|
+
/wogi-eval --candidates Show tasks eligible for evaluation
|
|
13
|
+
```
|
|
14
|
+
|
|
15
|
+
## How It Works
|
|
16
|
+
|
|
17
|
+
1. **Read the spec**: Load the task's acceptance criteria and requirements
|
|
18
|
+
2. **Get the diff**: Find the commit and extract the implementation diff
|
|
19
|
+
3. **Spawn 3 judge agents**: 1 Opus + 2 Sonnet (via Agent tool `model` parameter)
|
|
20
|
+
4. **Score independently**: Each judge scores on 5 dimensions (1-10)
|
|
21
|
+
5. **Take median**: Final score = median of 3 judges per dimension
|
|
22
|
+
6. **Save results**: Store in `.workflow/evals/`
|
|
23
|
+
|
|
24
|
+
## Scoring Dimensions
|
|
25
|
+
|
|
26
|
+
| Dimension | What It Measures |
|-----------|-----------------|
| Completeness | Did implementation address ALL acceptance criteria? |
| Accuracy | Is code correct, handling edge cases? |
| Workflow Compliance | Did it follow WogiFlow patterns (spec, criteria check, wiring, standards)? |
| Token Efficiency | How many tokens/iterations to reach passing state? |
| Quality | Code quality, readability, maintainability |
|
|
33
|
+
|
|
34
|
+
## Execution Flow
|
|
35
|
+
|
|
36
|
+
### Step 1: Prepare eval data
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
node scripts/flow-eval.js prepare wf-XXXXXXXX
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
This returns: spec content, implementation diff, iteration count, token estimate.
|
|
43
|
+
|
|
44
|
+
### Step 2: Spawn judge agents
|
|
45
|
+
|
|
46
|
+
Launch 3 agents in parallel using the Agent tool:
|
|
47
|
+
|
|
48
|
+
```
|
|
49
|
+
Agent(model: "opus", prompt: "<judge prompt with spec + diff>")
|
|
50
|
+
Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
|
|
51
|
+
Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
Each judge receives the same prompt (from `buildJudgePrompt()` in `flow-eval-judge.js`) and scores independently.
|
|
55
|
+
|
|
56
|
+
### Step 3: Aggregate scores
|
|
57
|
+
|
|
58
|
+
```javascript
|
|
59
|
+
const { aggregateScores, parseJudgeResponse } = require('./scripts/flow-eval-judge');
|
|
60
|
+
|
|
61
|
+
// Parse each judge's response
|
|
62
|
+
const scores = judgeResponses.map(parseJudgeResponse).filter(Boolean);
|
|
63
|
+
|
|
64
|
+
// Take median per dimension
|
|
65
|
+
const result = aggregateScores(scores);
|
|
66
|
+
```
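For readers who want to see what "median per dimension" amounts to, here is a minimal sketch. It is not the actual `aggregateScores()` from `flow-eval-judge.js`: the helper names `median` and `aggregateByMedian` are hypothetical. Computing the overall score as the mean of the dimension medians is an assumption, though it is consistent with the sample output in this file (8, 7, 9, 6, 8 gives 7.6). Treating a score equal to the threshold as a pass is also an assumption.

```javascript
// Hypothetical sketch of median-per-dimension aggregation; not the real aggregateScores().
const DIMENSIONS = ['completeness', 'accuracy', 'workflowCompliance', 'tokenEfficiency', 'quality'];

function median(values) {
  const sorted = [...values].sort((a, b) => a - b); // numeric sort on a copy
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function aggregateByMedian(judgeScores, passingThreshold = 6) {
  const dimensions = {};
  for (const dim of DIMENSIONS) {
    dimensions[dim] = median(judgeScores.map((s) => s[dim]));
  }
  // Assumption: overall = mean of the dimension medians, rounded to one decimal.
  const total = DIMENSIONS.reduce((sum, dim) => sum + dimensions[dim], 0);
  const overall = Math.round((total / DIMENSIONS.length) * 10) / 10;
  return { dimensions, overall, pass: overall >= passingThreshold };
}

const example = aggregateByMedian([
  { completeness: 8, accuracy: 7, workflowCompliance: 9, tokenEfficiency: 6, quality: 8 },
  { completeness: 9, accuracy: 7, workflowCompliance: 9, tokenEfficiency: 5, quality: 8 },
  { completeness: 7, accuracy: 8, workflowCompliance: 8, tokenEfficiency: 6, quality: 7 },
]);
console.log(example.overall, example.pass); // prints: 7.6 true
```

Using the median rather than the mean per dimension means a single outlier judge cannot move a dimension score on its own.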
|
|
67
|
+
|
|
68
|
+
### Step 4: Save and display
|
|
69
|
+
|
|
70
|
+
```javascript
|
|
71
|
+
const { saveEvalResult, formatEvalResults } = require('./scripts/flow-eval');
|
|
72
|
+
saveEvalResult({ taskId, aggregated: result, judgeResults: scores, model, taskType });
|
|
73
|
+
```
|
|
74
|
+
|
|
75
|
+
## Output Format
|
|
76
|
+
|
|
77
|
+
```
|
|
78
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
79
|
+
📊 EVAL RESULTS: wf-XXXXXXXX
|
|
80
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
81
|
+
|
|
82
|
+
Judges: 3 (1 Opus + 2 Sonnet) | Confidence: high
|
|
83
|
+
|
|
84
|
+
completeness ████████░░ 8/10
|
|
85
|
+
accuracy ███████░░░ 7/10
|
|
86
|
+
workflowCompliance █████████░ 9/10
|
|
87
|
+
tokenEfficiency ██████░░░░ 6/10
|
|
88
|
+
quality ████████░░ 8/10
|
|
89
|
+
|
|
90
|
+
Overall: 7.6/10 — PASS (threshold: 6)
|
|
91
|
+
|
|
92
|
+
Individual Judges:
|
|
93
|
+
Judge 1 (opus): Strong implementation, minor edge case gaps
|
|
94
|
+
Judge 2 (sonnet): Good workflow compliance, token usage could improve
|
|
95
|
+
Judge 3 (sonnet): Clean code, well-structured implementation
|
|
96
|
+
|
|
97
|
+
Saved: .workflow/evals/wf-XXXXXXXX-eval-2026-03-02T10-00-00.json
|
|
98
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
## Batch Mode
|
|
102
|
+
|
|
103
|
+
When running `--batch --last N`:
|
|
104
|
+
1. Get the last N completed tasks from stats
|
|
105
|
+
2. Evaluate each sequentially
|
|
106
|
+
3. Display summary table
|
|
107
|
+
|
|
108
|
+
```
|
|
109
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
110
|
+
📊 BATCH EVAL RESULTS
|
|
111
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
112
|
+
|
|
113
|
+
Task Model Overall Comp Acc WF Tok Qual
|
|
114
|
+
wf-a1b2c3d4 opus-4-6 7.6 8 7 9 6 8
|
|
115
|
+
wf-e5f6a7b8 sonnet-4-6 6.8 7 7 8 5 7
|
|
116
|
+
wf-c9d0e1f2 opus-4-6 8.2 9 8 9 7 8
|
|
117
|
+
|
|
118
|
+
Average: 7.5/10
|
|
119
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## Configuration
|
|
123
|
+
|
|
124
|
+
In `config.json`:
|
|
125
|
+
```json
|
|
126
|
+
{
|
|
127
|
+
"eval": {
|
|
128
|
+
"judges": { "opus": 1, "sonnet": 2 },
|
|
129
|
+
"scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality"],
|
|
130
|
+
"passingThreshold": 6
|
|
131
|
+
}
|
|
132
|
+
}
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
ARGUMENTS: $ARGUMENTS
|
|
@@ -345,12 +345,31 @@ Display:
|
|
|
345
345
|
|
|
346
346
|
If user approves, create task entries in ready.json backlog, grouped by category.
|
|
347
347
|
|
|
348
|
+
**CRITICAL: Task ID Generation**
|
|
349
|
+
For EACH task created from health findings:
|
|
350
|
+
1. Generate the ID by running: `node -e "const { generateTaskId } = require('./scripts/flow-utils'); console.log(generateTaskId('[category] health findings'));"` — or call `generateTaskId()` programmatically
|
|
351
|
+
2. The ID MUST be in format `wf-[8 hex chars]` (e.g., `wf-a1b2c3d4`)
|
|
352
|
+
3. **NEVER** manually construct descriptive IDs like `WF-health-1`, `wf-redundancy-check`, etc.
|
|
353
|
+
4. The descriptive name goes in the `title` field, NOT the `id` field
|
|
354
|
+
5. Example entry:
|
|
355
|
+
```json
|
|
356
|
+
{
|
|
357
|
+
"id": "wf-a1b2c3d4",
|
|
358
|
+
"title": "Health: Consolidate 3 redundant button components",
|
|
359
|
+
"type": "refactor",
|
|
360
|
+
"feature": "health-scan",
|
|
361
|
+
"status": "ready",
|
|
362
|
+
"priority": "P2",
|
|
363
|
+
"createdAt": "[ISO timestamp]"
|
|
364
|
+
}
|
|
365
|
+
```
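A quick guard against hand-built IDs could look like the sketch below. The validator `isValidTaskId` is hypothetical and not part of `scripts/flow-utils`; it only encodes the `wf-[8 hex chars]` format stated above, and it assumes lowercase hex digits, as in the `wf-a1b2c3d4` example.

```javascript
// Hypothetical validator for the `wf-[8 hex chars]` format; not part of flow-utils.
// Assumption: hex digits are lowercase, matching the documented example.
const TASK_ID_PATTERN = /^wf-[0-9a-f]{8}$/;

function isValidTaskId(id) {
  return typeof id === 'string' && TASK_ID_PATTERN.test(id);
}

console.log(isValidTaskId('wf-a1b2c3d4'));         // a generated-style ID
console.log(isValidTaskId('WF-health-1'));         // a descriptive ID, which the rules forbid
console.log(isValidTaskId('wf-redundancy-check')); // another forbidden descriptive ID
```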
|
|
366
|
+
|
|
348
367
|
**If "Paste known issues":**
|
|
349
368
|
```
|
|
350
369
|
Paste your known issues or tech debt below.
|
|
351
370
|
(One per line, or a comma-separated list)
|
|
352
371
|
```
|
|
353
|
-
If issues provided, create task entries in ready.json backlog.
|
|
372
|
+
If issues provided, create task entries in ready.json backlog using the same ID generation rules above (call `generateTaskId()`, never construct IDs manually).
|
|
354
373
|
|
|
355
374
|
**If "Skip for now":**
|
|
356
375
|
Continue to Phase 4. User can run `/wogi-review` or `/wogi-health` later.
|
|
@@ -655,6 +674,52 @@ Display:
|
|
|
655
674
|
// Check for conventional commits, ticket prefixes, etc.
|
|
656
675
|
```
|
|
657
676
|
|
|
677
|
+
**Model Routing Configuration:**
|
|
678
|
+
|
|
679
|
+
Present the user with a model routing choice using `AskUserQuestion`:
|
|
680
|
+
|
|
681
|
+
```
|
|
682
|
+
How should WogiFlow route sub-tasks to AI models?
|
|
683
|
+
|
|
684
|
+
1. "Full Opus (Recommended)" — Maximum quality. All sub-agents use Opus.
|
|
685
|
+
Best for complex projects where quality matters most.
|
|
686
|
+
|
|
687
|
+
2. "Smart Routing" — Opus orchestrates, Sonnet handles implementation/review,
|
|
688
|
+
Haiku handles searches/lookups. Best quality-to-cost balance.
|
|
689
|
+
Preserves context window by offloading sub-tasks to lighter models.
|
|
690
|
+
|
|
691
|
+
3. "Custom" — Configure your own routing rules per task type.
|
|
692
|
+
```
|
|
693
|
+
|
|
694
|
+
Based on choice:
|
|
695
|
+
- Option 1: Set `config.hybrid.enabled = false` (all tasks stay with current model)
|
|
696
|
+
- Option 2: Set `config.hybrid.enabled = true` with default routing table (already configured)
|
|
697
|
+
- Option 3: Set `config.hybrid.enabled = true` and guide user through per-task-type routing overrides
|
|
698
|
+
|
|
699
|
+
Display: ` Model routing... ✓ [Smart Routing | Full Opus | Custom]`
|
|
700
|
+
|
|
701
|
+
**Community Knowledge Sync:**
|
|
702
|
+
|
|
703
|
+
Present opt-in question using `AskUserQuestion`:
|
|
704
|
+
|
|
705
|
+
```
|
|
706
|
+
Would you like to share anonymized model performance data with the WogiFlow community?
|
|
707
|
+
|
|
708
|
+
What's shared: model ID, task type, iteration count, token usage, wall clock time
|
|
709
|
+
What's NOT shared: file paths, code, project names, task descriptions
|
|
710
|
+
|
|
711
|
+
You'll receive back: community-optimized model routing rules and capability scores.
|
|
712
|
+
|
|
713
|
+
1. "Enable (Recommended)" — Help improve WogiFlow for everyone
|
|
714
|
+
2. "Disable" — Keep all data local only
|
|
715
|
+
```
|
|
716
|
+
|
|
717
|
+
Based on choice:
|
|
718
|
+
- Option 1: Set `config.communitySync.enabled = true`
|
|
719
|
+
- Option 2: Set `config.communitySync.enabled = false` (default)
|
|
720
|
+
|
|
721
|
+
Display: ` Community sync... ✓ [Enabled | Disabled]`
|
|
722
|
+
|
|
658
723
|
**Commit style detection:**
|
|
659
724
|
```bash
|
|
660
725
|
git log --oneline -20 --format="%s"
|
|
@@ -0,0 +1,185 @@
|
|
|
1
|
+
---
|
|
2
|
+
description: "Register Claude Code plugins for /wogi-start routing"
|
|
3
|
+
allowed-tools: "Read,Glob,Grep,WebSearch,WebFetch,Edit,Write,Bash,Agent,ToolSearch,ListMcpResourcesTool,ReadMcpResourceTool,AskUserQuestion"
|
|
4
|
+
user-invocable: true
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# /wogi-register — Plugin Registration
|
|
8
|
+
|
|
9
|
+
Register Claude Code plugins so that `/wogi-start` can automatically route requests to them.
|
|
10
|
+
|
|
11
|
+
## Usage
|
|
12
|
+
|
|
13
|
+
```
|
|
14
|
+
/wogi-register <plugin-name> Register a new plugin (auto-discover capabilities)
|
|
15
|
+
/wogi-register --list List all registered plugins
|
|
16
|
+
/wogi-register --remove <name> Remove a registered plugin
|
|
17
|
+
```
|
|
18
|
+
|
|
19
|
+
## How It Works
|
|
20
|
+
|
|
21
|
+
When you run `/wogi-register <plugin-name>`, the system:
|
|
22
|
+
|
|
23
|
+
1. **Inspects MCP tools** matching the plugin name (most reliable)
|
|
24
|
+
2. **Searches online** for the plugin's documentation and capabilities
|
|
25
|
+
3. **Generates a plugin entry** with triggers, capabilities, and invocation details
|
|
26
|
+
4. **Saves to registry** at `.workflow/state/plugin-registry.json`
|
|
27
|
+
5. **Displays summary** of discovered capabilities for confirmation
|
|
28
|
+
|
|
29
|
+
After registration, `/wogi-start` will automatically route matching requests to the plugin.
|
|
30
|
+
|
|
31
|
+
## Registration Flow
|
|
32
|
+
|
|
33
|
+
### Step 1: MCP Tool Discovery
|
|
34
|
+
|
|
35
|
+
First, try to discover the plugin's capabilities through MCP tools:
|
|
36
|
+
|
|
37
|
+
1. Run `node scripts/flow-plugin-registry.js scan` to check for unregistered MCP servers
|
|
38
|
+
2. Use `ToolSearch` to search for tools matching the plugin name pattern
|
|
39
|
+
3. Use `ListMcpResourcesTool` to check for MCP resources from matching servers
|
|
40
|
+
4. Extract: tool names, descriptions, input schemas
|
|
41
|
+
5. Map each tool to a capability entry
|
|
42
|
+
|
|
43
|
+
### Step 2: Web Search Discovery (if MCP insufficient)
|
|
44
|
+
|
|
45
|
+
If MCP inspection yields few or no results:
|
|
46
|
+
|
|
47
|
+
1. Search for `"<plugin-name> Claude Code plugin capabilities"`
|
|
48
|
+
2. Search for `"<plugin-name> Claude Code MCP tools"`
|
|
49
|
+
3. Search for the plugin's documentation page
|
|
50
|
+
4. Extract capabilities from documentation
|
|
51
|
+
5. Generate trigger phrases from discovered capabilities
|
|
52
|
+
|
|
53
|
+
### Step 3: Build Plugin Entry
|
|
54
|
+
|
|
55
|
+
From the discovered information, construct:
|
|
56
|
+
|
|
57
|
+
```json
|
|
58
|
+
{
|
|
59
|
+
"name": "<plugin-name>",
|
|
60
|
+
"description": "Human-readable description of the plugin",
|
|
61
|
+
"source": "mcp|web-discovered|manual",
|
|
62
|
+
"triggers": ["phrase 1", "phrase 2"],
|
|
63
|
+
"capabilities": [
|
|
64
|
+
{
|
|
65
|
+
"action": "action-name",
|
|
66
|
+
"description": "What this action does",
|
|
67
|
+
"triggerPhrases": ["send to X", "push to X"],
|
|
68
|
+
"mcpTool": "mcp__server__tool_name or null",
|
|
69
|
+
"requiresTask": false
|
|
70
|
+
}
|
|
71
|
+
],
|
|
72
|
+
"metadata": {
|
|
73
|
+
"mcpServer": "server name if MCP-based",
|
|
74
|
+
"docsUrl": "URL to plugin docs if found",
|
|
75
|
+
"version": "plugin version if known"
|
|
76
|
+
}
|
|
77
|
+
}
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
### Step 4: User Confirmation
|
|
81
|
+
|
|
82
|
+
Display the discovered capabilities and ask for confirmation:
|
|
83
|
+
|
|
84
|
+
```
|
|
85
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
86
|
+
Plugin Registration: <plugin-name>
|
|
87
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
88
|
+
|
|
89
|
+
Description: <discovered description>
|
|
90
|
+
Source: MCP tools | Web search | Manual
|
|
91
|
+
|
|
92
|
+
Capabilities discovered (N):
|
|
93
|
+
1. <action>: <description>
|
|
94
|
+
Triggers: "phrase 1", "phrase 2"
|
|
95
|
+
MCP Tool: mcp__server__tool
|
|
96
|
+
|
|
97
|
+
2. <action>: <description>
|
|
98
|
+
Triggers: "phrase 3"
|
|
99
|
+
|
|
100
|
+
Trigger phrases (top-level):
|
|
101
|
+
- "send to <plugin>"
|
|
102
|
+
- "push to <plugin>"
|
|
103
|
+
- "use <plugin>"
|
|
104
|
+
|
|
105
|
+
Does this look correct? You can adjust before saving.
|
|
106
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
### Step 5: Save to Registry
|
|
110
|
+
|
|
111
|
+
Call `registerPlugin()` from `scripts/flow-plugin-registry.js`:
|
|
112
|
+
|
|
113
|
+
```javascript
|
|
114
|
+
const { registerPlugin } = require('./scripts/flow-plugin-registry');
|
|
115
|
+
registerPlugin({
|
|
116
|
+
name: pluginName,
|
|
117
|
+
description: discoveredDescription,
|
|
118
|
+
source: discoverySource,
|
|
119
|
+
triggers: topLevelTriggers,
|
|
120
|
+
capabilities: discoveredCapabilities,
|
|
121
|
+
metadata: { mcpServer, docsUrl, version }
|
|
122
|
+
});
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
## Re-Registration (Update)
|
|
126
|
+
|
|
127
|
+
When `/wogi-register <plugin-name>` is run for an already-registered plugin:
|
|
128
|
+
|
|
129
|
+
1. Re-discover capabilities (same flow as above)
|
|
130
|
+
2. Compare with existing registration
|
|
131
|
+
3. Display diff: new capabilities, removed capabilities, changed triggers
|
|
132
|
+
4. Update the existing entry (preserves registeredAt timestamp)
|
|
133
|
+
5. Display: `Plugin "<name>" updated. Added N capabilities, removed M.`
|
|
134
|
+
|
|
135
|
+
## --list Mode
|
|
136
|
+
|
|
137
|
+
Display all registered plugins:
|
|
138
|
+
|
|
139
|
+
```
|
|
140
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
141
|
+
Registered Plugins (N)
|
|
142
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
143
|
+
|
|
144
|
+
figma [active]
|
|
145
|
+
4 capabilities | Source: mcp
|
|
146
|
+
Triggers: "send to figma", "push to figma", "create in figma"
|
|
147
|
+
|
|
148
|
+
linear [active]
|
|
149
|
+
3 capabilities | Source: web-discovered
|
|
150
|
+
Triggers: "create linear issue", "sync with linear"
|
|
151
|
+
|
|
152
|
+
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
If no plugins registered:
|
|
156
|
+
```
|
|
157
|
+
No plugins registered. Install a Claude Code plugin and run:
|
|
158
|
+
/wogi-register <plugin-name>
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
## --remove Mode
|
|
162
|
+
|
|
163
|
+
```
|
|
164
|
+
Removed plugin: <plugin-name>
|
|
165
|
+
Was registered with N capabilities
|
|
166
|
+
/wogi-start will no longer route to this plugin
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
## Important
|
|
170
|
+
|
|
171
|
+
- The system is **fully generic** — it does NOT hardcode any plugin-specific logic
|
|
172
|
+
- Plugin-specific knowledge is discovered at registration time, not built-in
|
|
173
|
+
- All trigger matching uses word overlap scoring with a 0.5 minimum threshold
|
|
174
|
+
- Built-in `/wogi-*` commands always take priority over plugin routing
|
|
175
|
+
- Plugin actions are tracked through the normal WogiFlow task system when `trackPluginActions` is enabled
|
|
176
|
+
|
|
177
|
+
## Auto-Discovery on Session Start
|
|
178
|
+
|
|
179
|
+
When `config.plugins.autoScanOnSessionStart` is true:
|
|
180
|
+
- The session-start hook compares available MCP servers against the registry
|
|
181
|
+
- New unregistered servers are auto-registered with discovered capabilities
|
|
182
|
+
- Previously registered servers that are no longer available are marked `inactive`
|
|
183
|
+
- Display: `New plugin detected: <name>. Auto-registered with N capabilities.`
|
|
184
|
+
|
|
185
|
+
Mid-session plugin installs require manual `/wogi-register <name>`.
|
|
@@ -36,6 +36,31 @@ When `config.longInputGate.enabled` is `true`:
|
|
|
36
36
|
- Prompt is a task ID → already handled in Step 0
|
|
37
37
|
- Prompt content is primarily code (>80% code blocks) → skip, as code pastes are better handled by normal triage
|
|
38
38
|
|
|
39
|
+
### Step 0.4: Plugin Registry Routing (Automatic)
|
|
40
|
+
|
|
41
|
+
**After the command catalog finds no match, check if the request matches a registered plugin.** Plugin routing has lower priority than built-in `/wogi-*` commands.
|
|
42
|
+
|
|
43
|
+
When `config.plugins.enabled` is `true`:
|
|
44
|
+
|
|
45
|
+
1. Read `.workflow/state/plugin-registry.json` (the plugin registry)
|
|
46
|
+
2. For each active plugin, check if the user's request matches any trigger phrase
|
|
47
|
+
3. Use word overlap scoring (minimum threshold: 0.5) to find the best match
|
|
48
|
+
4. **If a plugin match is found** (score >= 0.5):
|
|
49
|
+
- Display: `Plugin match: "<plugin-name>" (score: X.XX, trigger: "<matched phrase>")`
|
|
50
|
+
- If the matched capability has an `mcpTool` → use ToolSearch to load and invoke it
|
|
51
|
+
- If no specific `mcpTool` → display the plugin's capabilities and ask the user which action to take
|
|
52
|
+
- If `config.plugins.trackPluginActions` is true → create a lightweight task entry for tracking
|
|
53
|
+
5. **If no plugin match** → Fall through to the default `/wogi-story` route (implementation request)
|
|
54
|
+
|
|
55
|
+
**Plugin routing has LOWER priority than built-in `/wogi-*` commands.** If a request matches both a built-in command and a plugin trigger, the built-in command wins. Plugin routing is the fallback AFTER the command catalog finds no match.
|
|
56
|
+
|
|
57
|
+
**Actual routing order:**
|
|
58
|
+
1. Check if request is a task ID → Structured Execution
|
|
59
|
+
2. Check long input gate → `/wogi-extract-review`
|
|
60
|
+
3. Check Command Catalog → matching `/wogi-*` command
|
|
61
|
+
4. Check Plugin Registry → matching plugin capability
|
|
62
|
+
5. Default → `/wogi-story` (implementation request)
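The word overlap scoring used in step 4 is not spelled out here, so the following is only one plausible reading: the share of a trigger phrase's words that also appear in the request. All names (`tokenize`, `overlapScore`, `bestPluginMatch`) and the `status` field on a plugin entry are assumptions; the real implementation in `scripts/flow-plugin-registry.js` may score differently.

```javascript
// Hypothetical word-overlap scorer; the real scoring in flow-plugin-registry.js may differ.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Fraction of the trigger's words that occur in the request (0 to 1).
function overlapScore(request, trigger) {
  const requestWords = new Set(tokenize(request));
  const triggerWords = tokenize(trigger);
  if (triggerWords.length === 0) return 0;
  const hits = triggerWords.filter((w) => requestWords.has(w)).length;
  return hits / triggerWords.length;
}

// Return the highest-scoring trigger at or above the threshold, or null.
function bestPluginMatch(request, plugins, threshold = 0.5) {
  let best = null;
  for (const plugin of plugins) {
    // Assumption: inactive plugins carry a `status` field and are skipped.
    if (plugin.status && plugin.status !== 'active') continue;
    for (const trigger of plugin.triggers) {
      const score = overlapScore(request, trigger);
      if (score >= threshold && (!best || score > best.score)) {
        best = { plugin: plugin.name, trigger, score };
      }
    }
  }
  return best;
}

const registry = [
  { name: 'figma', status: 'active', triggers: ['send to figma', 'push to figma'] },
  { name: 'linear', status: 'active', triggers: ['create linear issue'] },
];
console.log(bestPluginMatch('send this design to Figma', registry));
```

Under this reading, "send this design to Figma" contains every word of the trigger "send to figma" and so scores 1.0, while a request that shares no words with any trigger returns `null` and falls through to the default route.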
|
|
63
|
+
|
|
39
64
|
### Command Catalog
|
|
40
65
|
|
|
41
66
|
Think of each command below as a tool available to you. Read the user's request, understand what they need, and invoke the best-fit command using the Skill tool.
|
|
@@ -63,6 +88,7 @@ Think of each command below as a tool available to you. Read the user's request,
|
|
|
63
88
|
| `/wogi-decide` | Creates/updates project rules with clarifying questions | User says **"from now on" + rule verb** (always/never/must/should), "let's make it a rule", "update our rules". Note: "from now on" alone is not sufficient — require a follow-on rule verb to distinguish from implementation requests. |
|
|
64
89
|
| `/wogi-learn` | Promotes feedback patterns to decision rules | User says **"let's learn from this"**, "we keep making this mistake", "extract lessons" |
|
|
65
90
|
| `/wogi-retrospective` | Guided session reflection with lesson capture | User says **"retro"**, "what went well", "what can we improve", "lessons learned" |
|
|
91
|
+
| `/wogi-register` | Register Claude Code plugins for /wogi-start routing | User wants to **register a plugin**, list registered plugins, or remove a plugin registration |
|
|
66
92
|
|
|
67
93
|
### Internal Tools (Auto-Invoked by wogi-start)
|
|
68
94
|
|
|
@@ -240,6 +266,18 @@ User: "help me think through how the hook architecture should evolve"
|
|
|
240
266
|
→ Action: Read relevant code, discuss architecture options. No files written, no tasks created.
|
|
241
267
|
```
|
|
242
268
|
|
|
269
|
+
```
|
|
270
|
+
User: "send this design to Figma"
|
|
271
|
+
→ Intent: PLUGIN ROUTING — request matches a registered plugin trigger
|
|
272
|
+
→ Action: Check plugin-registry.json. If "figma" plugin registered with trigger "send to figma" → route to plugin. If not registered → suggest /wogi-register figma
|
|
273
|
+
```
|
|
274
|
+
|
|
275
|
+
```
|
|
276
|
+
User: "register the linear plugin"
|
|
277
|
+
→ Intent: Plugin registration
|
|
278
|
+
→ Action: Invoke /wogi-register linear
|
|
279
|
+
```
|
|
280
|
+
|
|
243
281
|
```
|
|
244
282
|
User: "yes"
|
|
245
283
|
→ Intent: CONVERSATIONAL FOLLOW-UP — user is responding to a previous AI question
|
|
@@ -330,6 +368,34 @@ At each execution milestone, update the workflow phase. These are no-ops when ph
|
|
|
330
368
|
|
|
331
369
|
If a transition fails (wrong current phase), it's non-blocking — log and continue.
|
|
332
370
|
|
|
371
|
+
### Task Checkpoints (when `config.proactiveCompaction.enabled`)
|
|
372
|
+
|
|
373
|
+
At each phase boundary, save a task checkpoint and check if proactive compaction is needed. This enables lossless recovery after auto-compact.
|
|
374
|
+
|
|
375
|
+
**At EVERY phase transition listed above**, also:
|
|
376
|
+
1. Save checkpoint: Record task ID, current phase, completed scenarios, changed files, verification results to `.workflow/state/task-checkpoint.json`
|
|
377
|
+
2. Check compaction: If context usage >= `proactiveCompaction.triggerThreshold` (default 75%), display compaction message and run `/wogi-compact` before proceeding
|
|
378
|
+
|
|
379
|
+
**Checkpoint integration points:**
|
|
380
|
+
| When | Checkpoint Action |
|------|-------------------|
| After explore phase completes | Save exploration summary + related files |
| After spec is generated | Save spec path + acceptance criteria count |
| After each scenario completes | Update scenario progress (completed/pending) |
| After criteria check | Save verification results |
| Before final validation | Save all changed files list |
| After task completion | Clear checkpoint |
|
|
388
|
+
|
|
389
|
+
**Auto-compact recovery** (on session resume):
|
|
390
|
+
1. Check `.workflow/state/task-checkpoint.json` for an active checkpoint
|
|
391
|
+
2. If checkpoint exists with incomplete scenarios → display recovery message:
|
|
392
|
+
`Auto-compact detected. Restoring task state from checkpoint...`
|
|
393
|
+
3. Reload: task ID, current phase, completed scenarios, spec path, changed files
|
|
394
|
+
4. Continue execution from the next pending scenario
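The recovery steps above can be sketched as a small loader. The checkpoint schema is not shown in this diff, so the field names used here (`taskId`, `phase`, `scenarios`, and a per-scenario `name` and `status`) are assumptions, as is the function name `loadRecoveryState`; the shipped logic lives in `scripts/flow-task-checkpoint.js` and may differ.

```javascript
// Hypothetical recovery sketch; the checkpoint field names below are assumptions.
const fs = require('fs');

function loadRecoveryState(checkpointPath) {
  if (!fs.existsSync(checkpointPath)) return null; // no active checkpoint
  let checkpoint;
  try {
    checkpoint = JSON.parse(fs.readFileSync(checkpointPath, 'utf8'));
  } catch (err) {
    return null; // unreadable or corrupt checkpoint: nothing to restore
  }
  const scenarios = Array.isArray(checkpoint.scenarios) ? checkpoint.scenarios : [];
  // Resume from the first scenario that has not been completed.
  const next = scenarios.find((s) => s.status !== 'completed');
  if (!next) return null; // everything finished, no recovery needed
  return { taskId: checkpoint.taskId, phase: checkpoint.phase, nextScenario: next.name };
}

console.log(loadRecoveryState('.workflow/state/task-checkpoint.json'));
```

Returning `null` for a missing, corrupt, or fully completed checkpoint keeps the caller's branch simple: restore state only when there is a pending scenario to continue from.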
|
|
395
|
+
|
|
396
|
+
**Haiku-powered summaries** (when `proactiveCompaction.useHaiku: true`):
|
|
397
|
+
When compacting between phases, use the Agent tool with `model: "haiku"` to generate the compaction summary. This preserves Opus context for the actual implementation work.
|
|
398
|
+
|
|
333
399
|
### Execution Flow
|
|
334
400
|
|
|
335
401
|
```
|
|
@@ -804,6 +870,10 @@ Return a structured report:
|
|
|
804
870
|
|
|
805
871
|
```javascript
|
|
806
872
|
// Launch all in parallel (single message, multiple Task tool calls)
|
|
873
|
+
// When hybrid mode is enabled (config.hybrid.enabled), use the model parameter
|
|
874
|
+
// to route sub-agents to the appropriate model tier.
|
|
875
|
+
// Routing is provided by getAgentModel() from flow-prompt-template.js:
|
|
876
|
+
// explore → sonnet, research → sonnet, search → haiku, judging → opus
|
|
807
877
|
Task(subagent_type=Explore, prompt="Codebase Analyzer: ...")
|
|
808
878
|
Task(subagent_type=Explore, prompt="Best Practices: ...")
|
|
809
879
|
Task(subagent_type=Explore, prompt="Version Verifier: ...")
|
|
@@ -813,6 +883,21 @@ Task(subagent_type=Explore, prompt="Standards Preview: ...")
|
|
|
813
883
|
Task(subagent_type=Explore, prompt="Consumer Impact Analyzer: ...")
|
|
814
884
|
```
|
|
815
885
|
|
|
886
|
+
**Hybrid Model Routing (S4):**
|
|
887
|
+
|
|
888
|
+
When `config.hybrid.enabled` is `true`, use the Agent tool's `model` parameter to route sub-agents:
|
|
889
|
+
|
|
890
|
+
| Sub-Agent Type | Agent `model` Parameter | Rationale |
|
|
891
|
+
|----------------|------------------------|-----------|
|
|
892
|
+
| Explore/Research | `"sonnet"` | Good analysis capability, saves Opus context |
|
|
893
|
+
| Code Review | `"sonnet"` | Balanced quality for review tasks |
|
|
894
|
+
| Simple Lookup/Search | `"haiku"` | Fast and cheap for file searches |
|
|
895
|
+
| Complex Reasoning | `"opus"` | Only for architecture/planning decisions |
|
|
896
|
+
| Compaction Summary | `"haiku"` | Summaries don't need premium models |
|
|
897
|
+
| Eval Judging | `"opus"` (1) + `"sonnet"` (2) | Multi-judge composition from eval config |
|
|
898
|
+
|
|
899
|
+
The routing table is configured in `scripts/flow-prompt-template.js` and can be overridden via `config.hybrid.routing.overrides`. Capability scores from `.workflow/models/capabilities/*.yaml` are consulted when `checkCapabilities` is true — if a model's score for the task type is below the `capabilityThreshold` (default: 5), the task is escalated to the next tier.
|
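A minimal sketch of the routing rule described above, assuming a three-tier order (haiku, sonnet, opus) and a simple per-model score map. The function name `pickModel`, the task-type keys, the `scores` shape, and the `"sonnet"` default for unknown task types are all hypothetical; the actual `getAgentModel()` in `flow-prompt-template.js` may differ.

```javascript
// Sketch only: not the code in flow-prompt-template.js. The table mirrors
// the documented defaults; tier order, score shape, and names are assumed.
const DEFAULT_ROUTING = {
  explore: "sonnet",
  research: "sonnet",
  review: "sonnet",
  search: "haiku",
  reasoning: "opus",
  compaction: "haiku",
};
const TIERS = ["haiku", "sonnet", "opus"]; // assumed order, cheapest first

function pickModel(taskType, options = {}) {
  const {
    overrides = {},
    checkCapabilities = false,
    capabilityThreshold = 5, // documented default
    scores = {}, // hypothetical shape: { haiku: { search: 4 } }
  } = options;

  // Overrides win over the default table; "sonnet" is an assumed fallback.
  const model = overrides[taskType] || DEFAULT_ROUTING[taskType] || "sonnet";
  if (!checkCapabilities) return model;

  // Escalate one tier when the model's score is below the threshold.
  const tier = TIERS.indexOf(model);
  const score = scores[model]?.[taskType];
  const tooWeak = typeof score === "number" && score < capabilityThreshold;
  if (tooWeak && tier >= 0 && tier < TIERS.length - 1) {
    return TIERS[tier + 1];
  }
  return model;
}

console.log(pickModel("search"));
console.log(pickModel("search", { checkCapabilities: true, scores: { haiku: { search: 4 } } }));
```

In this sketch a missing score is treated as adequate and the top tier never escalates; both are choices made for the example, since the source does not say how those cases are handled.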
|
900
|
+
|
|
816
901
|
**After all agents complete**, display a consolidated research summary:
|
|
817
902
|
|
|
818
903
|
**Output Format:**
|
|
@@ -1801,9 +1886,32 @@ Phase commands:
|
|
|
1801
1886
|
### Scenario keeps failing after max retries
|
|
1802
1887
|
- Stop and report: "Scenario X failed after N attempts. Issue: [description]"
|
|
1803
1888
|
- Leave task in inProgress
|
|
1804
|
-
- **
|
|
1889
|
+
- **Best-of-N fallback (high-risk tasks)**: When a HIGH-RISK task (architecture, migration, refactor, or complexity HIGH + files > 10) fails 3+ times, auto-suggest Best-of-N:
|
|
1890
|
+
```
|
|
1891
|
+
This high-risk task has failed 3 times. Would you like to try Best-of-N?
|
|
1892
|
+
→ Spawn 2 alternative implementation approaches in isolated worktrees
|
|
1893
|
+
→ Opus judges the best approach against the spec
|
|
1894
|
+
```
|
|
1895
|
+
Use `checkFallbackTrigger()` from `flow-best-of-n.js` to determine if Best-of-N applies.
|
|
1896
|
+
If the task is NOT high-risk: suggest `/wogi-debug-hypothesis` instead (competing theories about root cause).
|
|
1897
|
+
- **Auto-suggest hypothesis debugging**: For non-high-risk tasks, when a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents
|
|
1805
1898
|
- User can investigate and re-run `/wogi-start TASK-XXX` to continue
|
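The fallback rule in the list above could be restated roughly as follows. This is a sketch under assumptions: the task fields (`type`, `complexity`, `fileCount`), the `"retry"` action below three failures, and the function names are illustrative, and `checkFallbackTrigger()` in `flow-best-of-n.js` may take different inputs or return a different shape.

```javascript
// Sketch only: restates the documented rule; it is not checkFallbackTrigger().
const HIGH_RISK_TYPES = ["architecture", "migration", "refactor"];

function isHighRisk(task) {
  // High-risk by task type, or by HIGH complexity combined with > 10 files.
  const byType = HIGH_RISK_TYPES.includes(task.type);
  const bySize = task.complexity === "HIGH" && task.fileCount > 10;
  return byType || bySize;
}

function suggestAfterFailures(task, failureCount) {
  // Below three failures nothing is suggested (assumed behaviour).
  if (failureCount < 3) return { action: "retry" };
  // High-risk tasks get the Best-of-N offer with 2 alternative approaches.
  if (isHighRisk(task)) return { action: "best-of-n", approaches: 2 };
  // Everything else is pointed at hypothesis debugging.
  return { action: "debug-hypothesis", command: "/wogi-debug-hypothesis" };
}

console.log(suggestAfterFailures({ type: "migration", complexity: "LOW", fileCount: 2 }, 3));
console.log(suggestAfterFailures({ type: "bugfix", complexity: "LOW", fileCount: 2 }, 3));
```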
|
1806
1899
|
|
|
1900
|
+
### Best-of-N auto-suggestion (high-risk tasks)
|
|
1901
|
+
|
|
1902
|
+
When starting a task, if `config.bestOfN.enabled` is true:
|
|
1903
|
+
1. Run `assessRisk()` from `flow-best-of-n.js` with the task's type, description, and file count
|
|
1904
|
+
2. If `shouldSuggest` is true, display:
|
|
1905
|
+
```
|
|
1906
|
+
This is a high-risk task. Would you like to use Best-of-N?
|
|
1907
|
+
→ Spawn 3 approaches in parallel (isolated worktrees)
|
|
1908
|
+
→ Opus selects the best implementation
|
|
1909
|
+
Options: [Yes, use Best-of-N] [No, proceed normally]
|
|
1910
|
+
```
|
|
1911
|
+
3. If user confirms: spawn N agents using `Agent(isolation: "worktree")` with variation strategy from `getVariationStrategy()`
|
|
1912
|
+
4. After all complete: spawn Opus judge using `buildSelectionPrompt()` to select winner
|
|
1913
|
+
5. Apply winner, clean up losing worktrees
|
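Steps 4 and 5 above (pick a winner, then clean up the rest) might look like the sketch below. The candidate and verdict shapes, the worktree paths, and the name `finalizeBestOfN` are hypothetical; building the judge prompt with `buildSelectionPrompt()` and running agents in worktrees are not reproduced here.

```javascript
// Sketch only: candidate/verdict shapes are assumed for illustration.
function finalizeBestOfN(candidates, verdict) {
  const winner = candidates.find((c) => c.id === verdict.winnerId);
  if (!winner) {
    // Guard against a judge response that names no known candidate.
    throw new Error(`Judge picked unknown candidate: ${verdict.winnerId}`);
  }
  return {
    // Worktree whose changes should be applied.
    apply: winner.worktree,
    // Worktrees of the losing approaches, to be removed afterwards.
    cleanup: candidates.filter((c) => c !== winner).map((c) => c.worktree),
  };
}

const result = finalizeBestOfN(
  [
    { id: "A", worktree: "wt/approach-a" },
    { id: "B", worktree: "wt/approach-b" },
    { id: "C", worktree: "wt/approach-c" },
  ],
  { winnerId: "B" }
);
console.log(result.apply, result.cleanup);
```

Failing loudly on an unknown winner is a design choice for this example, so that no worktree is deleted on the strength of a malformed verdict.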
|
1914
|
+
|
|
1807
1915
|
### Quality gate keeps failing
|
|
1808
1916
|
- Report which gate is failing and why
|
|
1809
1917
|
- Attempt to fix automatically
|
|
@@ -120,6 +120,7 @@ npm install -D wogiflow && npx flow onboard
|
|
|
120
120
|
| `/wogi-roadmap` | View/manage deferred work |
|
|
121
121
|
| `/wogi-suggest "text"` | Submit suggestion for WogiFlow |
|
|
122
122
|
| `/wogi-audit` | Comprehensive project-wide analysis (7 dimensions) |
|
|
123
|
+
| `/wogi-register` | Register plugins for /wogi-start routing |
|
|
123
124
|
|
|
124
125
|
See `.claude/docs/commands.md` for complete command reference.
|
|
125
126
|
|
|
@@ -147,6 +148,7 @@ See `.claude/docs/commands.md` for complete command reference.
|
|
|
147
148
|
| "rescan project", "re-evaluate project", "project changed", "others made changes", "sync wogi", "things changed", "out of sync" | `/wogi-rescan` |
|
|
148
149
|
| "suggest improvement", "feature request for wogi", "wogi suggestion", "submit feedback" | `/wogi-suggest` |
|
|
149
150
|
| "audit project", "project audit", "full project analysis", "full analysis" | `/wogi-audit` |
|
|
151
|
+
| "register plugin", "list plugins", "remove plugin", "register MCP" | `/wogi-register` |
|
|
150
152
|
|
|
151
153
|
**IMPORTANT**: When a user's message matches one of these patterns, immediately invoke the Skill tool with the corresponding command. Do not ask for confirmation. These `/wogi-*` commands satisfy the mandatory routing requirement — you do NOT also need to invoke `/wogi-start` when a detection match exists. `/wogi-start` is the fallback for messages that don't match this table.
|
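The detection-then-fallback behaviour described above can be pictured with a small lookup. This is an illustration only: in practice the matching is done by the model reading the table, so the case-insensitive substring matching and the name `routeMessage` are simplifying assumptions, and only a subset of the listed phrases is included.

```javascript
// Sketch only: substring matching is an assumed simplification of the
// natural-language detection table.
const DETECTION_TABLE = [
  { command: "/wogi-rescan", phrases: ["rescan project", "project changed", "out of sync"] },
  { command: "/wogi-suggest", phrases: ["suggest improvement", "submit feedback"] },
  { command: "/wogi-audit", phrases: ["audit project", "project audit", "full analysis"] },
  { command: "/wogi-register", phrases: ["register plugin", "list plugins", "remove plugin", "register MCP"] },
];

function routeMessage(message) {
  const text = message.toLowerCase();
  for (const { command, phrases } of DETECTION_TABLE) {
    // The first table row with a matching phrase wins.
    if (phrases.some((p) => text.includes(p.toLowerCase()))) return command;
  }
  // No detection match: fall back to the universal router.
  return "/wogi-start";
}

console.log(routeMessage("Please register plugin foo"));
console.log(routeMessage("Add a login page"));
```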
|
152
154
|
|
|
@@ -15,5 +15,6 @@
|
|
|
15
15
|
| Session retro | "retro" or "what went well" |
|
|
16
16
|
| Rescan project | "rescan project" or "things changed" or "out of sync" |
|
|
17
17
|
| Project audit | "audit project" or "full analysis" |
|
|
18
|
+
| Register plugin | "register plugin" or "/wogi-register <name>" |
|
|
18
19
|
|
|
19
20
|
`/wogi-start` is the universal fallback router — it classifies any request and routes to the right action. Detailed per-command docs live in each skill's `.md` file under `.claude/commands/`.
|