npm - wogiflow - Versions diffs - 1.7.0 → 1.8.0 - Mend

wogiflow 1.7.0 → 1.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (83)

package/.claude/commands/wogi-compact.md +49 -0
package/.claude/commands/wogi-eval.md +135 -0
package/.claude/commands/wogi-onboard.md +66 -1
package/.claude/commands/wogi-register.md +185 -0
package/.claude/commands/wogi-start.md +109 -1
package/.workflow/templates/claude-md.hbs +2 -0
package/.workflow/templates/partials/user-commands.hbs +1 -0
package/.workflow/templates/prompts/gemini-flash.yaml +42 -0
package/.workflow/templates/prompts/gpt4o.yaml +44 -0
package/.workflow/templates/prompts/haiku.yaml +42 -0
package/.workflow/templates/prompts/opus.yaml +45 -0
package/.workflow/templates/prompts/sonnet.yaml +44 -0
package/package.json +1 -1
package/scripts/flow-best-of-n.js +432 -0
package/scripts/flow-community-sync.js +469 -0
package/scripts/flow-eval-judge.js +388 -0
package/scripts/flow-eval.js +430 -0
package/scripts/flow-model-router.js +10 -0
package/scripts/flow-plugin-registry.js +631 -0
package/scripts/flow-proactive-compact.js +341 -0
package/scripts/flow-prompt-template.js +517 -0
package/scripts/flow-revision-tracker.js +258 -0
package/scripts/flow-skill-freshness.js +39 -18
package/scripts/flow-skill-generator.js +90 -27
package/scripts/flow-stack-wizard.js +2 -2
package/scripts/flow-stats-collector.js +534 -0
package/scripts/flow-sync-anonymizer.js +254 -0
package/scripts/flow-task-checkpoint.js +497 -0
package/scripts/flow-tech-options.js +4 -1
package/scripts/flow-utils.js +47 -11
package/scripts/hooks/core/session-context.js +12 -0
package/scripts/hooks/core/session-end.js +12 -0
package/scripts/hooks/core/task-completed.js +32 -0
package/scripts/hooks/entry/claude-code/session-start.js +56 -10
package/templates/skills/angular/skill.md +1 -1
package/templates/skills/anthropic/knowledge/anti-patterns.md +78 -0
package/templates/skills/anthropic/knowledge/conventions.md +18 -0
package/templates/skills/anthropic/knowledge/learnings.md +5 -0
package/templates/skills/anthropic/knowledge/patterns.md +111 -0
package/templates/skills/anthropic/skill.md +61 -0
package/templates/skills/commander/knowledge/anti-patterns.md +71 -0
package/templates/skills/commander/knowledge/conventions.md +17 -0
package/templates/skills/commander/knowledge/learnings.md +5 -0
package/templates/skills/commander/knowledge/patterns.md +80 -0
package/templates/skills/commander/skill.md +61 -0
package/templates/skills/cypress/skill.md +1 -1
package/templates/skills/django/skill.md +1 -1
package/templates/skills/docker/skill.md +1 -1
package/templates/skills/eslint/skill.md +1 -1
package/templates/skills/express/skill.md +1 -1
package/templates/skills/fastapi/skill.md +1 -1
package/templates/skills/fastify/skill.md +1 -1
package/templates/skills/flask/skill.md +1 -1
package/templates/skills/hono/skill.md +1 -1
package/templates/skills/jest/skill.md +1 -1
package/templates/skills/nestjs/skill.md +1 -1
package/templates/skills/openai/knowledge/anti-patterns.md +69 -0
package/templates/skills/openai/knowledge/conventions.md +18 -0
package/templates/skills/openai/knowledge/learnings.md +5 -0
package/templates/skills/openai/knowledge/patterns.md +121 -0
package/templates/skills/openai/skill.md +61 -0
package/templates/skills/playwright/skill.md +1 -1
package/templates/skills/prisma/skill.md +1 -1
package/templates/skills/pytest/skill.md +1 -1
package/templates/skills/svelte/skill.md +1 -1
package/templates/skills/tailwindcss/skill.md +1 -1
package/templates/skills/terraform/skill.md +1 -1
package/templates/skills/typescript/skill.md +1 -1
package/templates/skills/vitest/skill.md +2 -2
package/templates/skills/zod/skill.md +1 -1
package/.claude/rules/README.md +0 -60
package/.claude/rules/architecture/component-reuse.md +0 -38
package/.claude/rules/architecture/document-structure.md +0 -76
package/.claude/rules/architecture/dual-repo-management.md +0 -169
package/.claude/rules/architecture/feature-refactoring-cleanup.md +0 -87
package/.claude/rules/architecture/model-management.md +0 -35
package/.claude/rules/architecture/self-maintenance.md +0 -87
package/.claude/rules/code-style/naming-conventions.md +0 -55
package/.claude/rules/security/security-patterns.md +0 -176
package/.claude/skills/figma-analyzer/knowledge/learnings.md +0 -11
package/.workflow/specs/architecture.md.template +0 -24
package/.workflow/specs/stack.md.template +0 -33
package/.workflow/specs/testing.md.template +0 -36

package/.claude/commands/wogi-compact.md CHANGED

@@ -123,6 +123,55 @@ With smart compaction enabled (`config.smartCompaction.enabled`), context is man
 This means fixed thresholds are less relevant - compaction happens when actually needed based on the specific task.
+### Proactive Phase-Boundary Compaction (v2.3)
+With proactive compaction enabled (`config.proactiveCompaction.enabled`), WogiFlow compacts between task phases:
+- **Phase boundaries**: After explore, spec, each scenario, criteria check, validation
+- **Trigger threshold**: Default 75% context usage (configurable via `triggerThreshold`)
+- **Task checkpoints**: Full task state saved to `.workflow/state/task-checkpoint.json` at every phase boundary
+- **Auto-compact recovery**: If Claude's auto-compact fires, checkpoint enables lossless recovery
+**How it works:**
+1. At each phase boundary, `/wogi-start` saves a task checkpoint (task ID, phase, scenarios, files changed)
+2. If context exceeds the trigger threshold, proactive compaction fires before the next phase
+3. If Claude auto-compacts (at ~95%), session resume reads the checkpoint and restores full state
+**Recovery flow:**
+```
+Auto-compact fires at ~95% → Session resumes with compressed context
+→ /wogi-start detects checkpoint exists → Reads task-checkpoint.json
+→ Displays: "Auto-compact detected. Restoring task state from checkpoint..."
+→ Continues from the exact phase where it left off
+```
+**Config** (`config.proactiveCompaction`):
+```json
+{
+  "enabled": true,
+  "triggerThreshold": 0.75,
+  "useHaiku": true,
+  "phases": ["exploring", "spec_review", "scenario", "criteria_check", "validating"]
+}
+```
+**CLI commands:**
+```bash
+# Check if compaction needed at a phase
+node scripts/flow-proactive-compact.js check exploring 0.78 wf-a1b2c3d4
+# Show current config
+node scripts/flow-proactive-compact.js config
+# Generate compaction context from checkpoint
+node scripts/flow-proactive-compact.js context
+# View/manage checkpoints
+node scripts/flow-task-checkpoint.js load
+node scripts/flow-task-checkpoint.js check
+node scripts/flow-task-checkpoint.js clear wf-a1b2c3d4
+```
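
Reviewer note: the trigger rule in the added section above can be sketched as a small predicate. This is a minimal sketch under stated assumptions, not the contents of `flow-proactive-compact.js`. The name `shouldCompact` and its signature are hypothetical; the config values are copied from the `config.proactiveCompaction` block above. The `>=` comparison follows the wording in `wogi-start.md` ("context usage >= `triggerThreshold`"), while this file says "exceeds", so the exact boundary behaviour is an assumption.

```javascript
// Hypothetical sketch of the phase-boundary trigger rule; not the package's code.
const DEFAULT_PROACTIVE_COMPACTION = {
  enabled: true,
  triggerThreshold: 0.75,
  useHaiku: true,
  phases: ["exploring", "spec_review", "scenario", "criteria_check", "validating"],
};

// Returns true when compaction should run before the next phase starts.
// contextUsage is a fraction between 0 and 1 (0.78 means 78% of the window).
function shouldCompact(phase, contextUsage, config = DEFAULT_PROACTIVE_COMPACTION) {
  if (!config.enabled) return false;                 // feature switched off
  if (!config.phases.includes(phase)) return false;  // not a configured boundary
  return contextUsage >= config.triggerThreshold;    // assumed inclusive comparison
}

console.log(shouldCompact("exploring", 0.78)); // true
console.log(shouldCompact("exploring", 0.6));  // false
```

The first call mirrors the `check exploring 0.78` CLI example above: 0.78 is over the default 0.75 threshold, so compaction would be requested.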
 ### Legacy Fixed Thresholds
 If smart compaction is disabled, check context pressure status:

package/.claude/commands/wogi-eval.md ADDED

@@ -0,0 +1,135 @@
+---
+description: "Evaluate WogiFlow task output quality with multi-judge scoring"
+---
+Evaluate a completed task's output quality using multi-judge scoring (1 Opus + 2 Sonnet).
+## Usage
+```
+/wogi-eval wf-XXXXXXXX              Evaluate a specific task
+/wogi-eval --batch --last 5          Evaluate the last 5 completed tasks
+/wogi-eval --compare                 Show eval trend comparison
+/wogi-eval --candidates              Show tasks eligible for evaluation
+```
+## How It Works
+1. **Read the spec**: Load the task's acceptance criteria and requirements
+2. **Get the diff**: Find the commit and extract the implementation diff
+3. **Spawn 3 judge agents**: 1 Opus + 2 Sonnet (via Agent tool `model` parameter)
+4. **Score independently**: Each judge scores on 5 dimensions (1-10)
+5. **Take median**: Final score = median of 3 judges per dimension
+6. **Save results**: Store in `.workflow/evals/`
+## Scoring Dimensions
+| Dimension | What It Measures |
+|-----------|-----------------|
+| Completeness | Did implementation address ALL acceptance criteria? |
+| Accuracy | Is code correct, handling edge cases? |
+| Workflow Compliance | Did it follow WogiFlow patterns (spec, criteria check, wiring, standards)? |
+| Token Efficiency | How many tokens/iterations to reach passing state? |
+| Quality | Code quality, readability, maintainability |
+## Execution Flow
+### Step 1: Prepare eval data
+```bash
+node scripts/flow-eval.js prepare wf-XXXXXXXX
+```
+This returns: spec content, implementation diff, iteration count, token estimate.
+### Step 2: Spawn judge agents
+Launch 3 agents in parallel using the Agent tool:
+```
+Agent(model: "opus", prompt: "<judge prompt with spec + diff>")
+Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
+Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
+```
+Each judge receives the same prompt (from `buildJudgePrompt()` in `flow-eval-judge.js`) and scores independently.
+### Step 3: Aggregate scores
+```javascript
+const { aggregateScores, parseJudgeResponse } = require('./scripts/flow-eval-judge');
+// Parse each judge's response
+const scores = judgeResponses.map(parseJudgeResponse).filter(Boolean);
+// Take median per dimension
+const result = aggregateScores(scores);
+```
+### Step 4: Save and display
+```javascript
+const { saveEvalResult, formatEvalResults } = require('./scripts/flow-eval');
+saveEvalResult({ taskId, aggregated: result, judgeResults: scores, model, taskType });
+```
+## Output Format
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📊 EVAL RESULTS: wf-XXXXXXXX
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Judges: 3 (1 Opus + 2 Sonnet) | Confidence: high
+  completeness          ████████░░ 8/10
+  accuracy              ███████░░░ 7/10
+  workflowCompliance    █████████░ 9/10
+  tokenEfficiency       ██████░░░░ 6/10
+  quality               ████████░░ 8/10
+Overall: 7.6/10 — PASS (threshold: 6)
+Individual Judges:
+  Judge 1 (opus): Strong implementation, minor edge case gaps
+  Judge 2 (sonnet): Good workflow compliance, token usage could improve
+  Judge 3 (sonnet): Clean code, well-structured implementation
+Saved: .workflow/evals/wf-XXXXXXXX-eval-2026-03-02T10-00-00.json
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+## Batch Mode
+When running `--batch --last N`:
+1. Get the last N completed tasks from stats
+2. Evaluate each sequentially
+3. Display summary table
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📊 BATCH EVAL RESULTS
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Task            Model         Overall  Comp  Acc   WF    Tok   Qual
+wf-a1b2c3d4    opus-4-6      7.6      8     7     9     6     8
+wf-e5f6a7b8    sonnet-4-6    6.8      7     7     8     5     7
+wf-c9d0e1f2    opus-4-6      8.2      9     8     9     7     8
+Average: 7.5/10
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+## Configuration
+In `config.json`:
+```json
+{
+  "eval": {
+    "judges": { "opus": 1, "sonnet": 2 },
+    "scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality"],
+    "passingThreshold": 6
+  }
+}
+```
+ARGUMENTS: $ARGUMENTS
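
Reviewer note: the "median of 3 judges per dimension" step described in this file can be illustrated as follows. This is a sketch only and does not reproduce `aggregateScores()` from `flow-eval-judge.js`, whose internals are not shown in this diff. The function names `median` and `aggregateMedian` are hypothetical, and the judge scores are made-up illustration data. Computing the overall score as the mean of the per-dimension medians, and treating the threshold as inclusive, are both assumptions; the mean is at least consistent with the sample output above, where medians of 8, 7, 9, 6, 8 give 7.6.

```javascript
// Hypothetical sketch of per-dimension median aggregation; not the package's code.
const DIMENSIONS = [
  "completeness",
  "accuracy",
  "workflowCompliance",
  "tokenEfficiency",
  "quality",
];

// Median of a non-empty list of numbers (average of the two middle values if even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// judgeScores: one object per judge, each holding a 1-10 score per dimension.
function aggregateMedian(judgeScores, passingThreshold = 6) {
  const dimensions = {};
  for (const dim of DIMENSIONS) {
    dimensions[dim] = median(judgeScores.map((scores) => scores[dim]));
  }
  // Assumption: overall = mean of the per-dimension medians.
  const total = DIMENSIONS.reduce((sum, dim) => sum + dimensions[dim], 0);
  const overall = total / DIMENSIONS.length;
  return { dimensions, overall, passed: overall >= passingThreshold };
}

// Illustrative scores for one Opus judge and two Sonnet judges.
const judges = [
  { completeness: 8, accuracy: 7, workflowCompliance: 9, tokenEfficiency: 6, quality: 8 },
  { completeness: 9, accuracy: 6, workflowCompliance: 9, tokenEfficiency: 5, quality: 7 },
  { completeness: 7, accuracy: 8, workflowCompliance: 8, tokenEfficiency: 7, quality: 8 },
];
const result = aggregateMedian(judges);
console.log(result.dimensions, result.overall, result.passed);
```

Taking the median rather than the mean per dimension means a single outlier judge cannot move a dimension score on its own.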

package/.claude/commands/wogi-onboard.md CHANGED

@@ -345,12 +345,31 @@ Display:
    If user approves, create task entries in ready.json backlog, grouped by category.
+   **CRITICAL: Task ID Generation**
+   For EACH task created from health findings:
+   1. Generate the ID by running: `node -e "const { generateTaskId } = require('./scripts/flow-utils'); console.log(generateTaskId('[category] health findings'));"` — or call `generateTaskId()` programmatically
+   2. The ID MUST be in format `wf-[8 hex chars]` (e.g., `wf-a1b2c3d4`)
+   3. **NEVER** manually construct descriptive IDs like `WF-health-1`, `wf-redundancy-check`, etc.
+   4. The descriptive name goes in the `title` field, NOT the `id` field
+   5. Example entry:
+      ```json
+      {
+        "id": "wf-a1b2c3d4",
+        "title": "Health: Consolidate 3 redundant button components",
+        "type": "refactor",
+        "feature": "health-scan",
+        "status": "ready",
+        "priority": "P2",
+        "createdAt": "[ISO timestamp]"
+      }
+      ```
    **If "Paste known issues":**
    ```
    Paste your known issues or tech debt below.
    (One per line, or a comma-separated list)
    ```
-   If issues provided, create task entries in ready.json backlog.
+   If issues provided, create task entries in ready.json backlog using the same ID generation rules above (call `generateTaskId()`, never construct IDs manually).
    **If "Skip for now":**
    Continue to Phase 4. User can run `/wogi-review` or `/wogi-health` later.
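
Reviewer note: the ID format rule added in this hunk (`wf-` followed by 8 hex characters) can be checked mechanically. The sketch below is a hypothetical validator, not part of `flow-utils.js`; it does not generate IDs and makes no claim about how `generateTaskId()` works. Restricting the hex digits to lowercase is an assumption based on the `wf-a1b2c3d4` example.

```javascript
// Hypothetical validator for the documented task ID format; not the package's code.
// Assumption: hex digits are lowercase, as in the example "wf-a1b2c3d4".
const TASK_ID_PATTERN = /^wf-[0-9a-f]{8}$/;

function isValidTaskId(id) {
  return typeof id === "string" && TASK_ID_PATTERN.test(id);
}

console.log(isValidTaskId("wf-a1b2c3d4"));         // true
console.log(isValidTaskId("WF-health-1"));         // false
console.log(isValidTaskId("wf-redundancy-check")); // false
```

A check like this could run before an entry is written to `ready.json`, so that descriptive IDs are rejected and the description goes into `title` instead.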
@@ -655,6 +674,52 @@ Display:
     // Check for conventional commits, ticket prefixes, etc.
     ```
+    **Model Routing Configuration:**
+    Present the user with a model routing choice using `AskUserQuestion`:
+    ```
+    How should WogiFlow route sub-tasks to AI models?
+    1. "Full Opus (Recommended)" — Maximum quality. All sub-agents use Opus.
+       Best for complex projects where quality matters most.
+    2. "Smart Routing" — Opus orchestrates, Sonnet handles implementation/review,
+       Haiku handles searches/lookups. Best quality-to-cost balance.
+       Preserves context window by offloading sub-tasks to lighter models.
+    3. "Custom" — Configure your own routing rules per task type.
+    ```
+    Based on choice:
+    - Option 1: Set `config.hybrid.enabled = false` (all tasks stay with current model)
+    - Option 2: Set `config.hybrid.enabled = true` with default routing table (already configured)
+    - Option 3: Set `config.hybrid.enabled = true` and guide user through per-task-type routing overrides
+    Display: `  Model routing...      ✓ [Smart Routing | Full Opus | Custom]`
+    **Community Knowledge Sync:**
+    Present opt-in question using `AskUserQuestion`:
+    ```
+    Would you like to share anonymized model performance data with the WogiFlow community?
+    What's shared: model ID, task type, iteration count, token usage, wall clock time
+    What's NOT shared: file paths, code, project names, task descriptions
+    You'll receive back: community-optimized model routing rules and capability scores.
+    1. "Enable (Recommended)" — Help improve WogiFlow for everyone
+    2. "Disable" — Keep all data local only
+    ```
+    Based on choice:
+    - Option 1: Set `config.communitySync.enabled = true`
+    - Option 2: Set `config.communitySync.enabled = false` (default)
+    Display: `  Community sync...     ✓ [Enabled | Disabled]`
     **Commit style detection:**
     ```bash
     git log --oneline -20 --format="%s"

package/.claude/commands/wogi-register.md ADDED

@@ -0,0 +1,185 @@
+---
+description: "Register Claude Code plugins for /wogi-start routing"
+allowed-tools: "Read,Glob,Grep,WebSearch,WebFetch,Edit,Write,Bash,Agent,ToolSearch,ListMcpResourcesTool,ReadMcpResourceTool,AskUserQuestion"
+user-invocable: true
+---
+# /wogi-register — Plugin Registration
+Register Claude Code plugins so that `/wogi-start` can automatically route requests to them.
+## Usage
+```
+/wogi-register <plugin-name>     Register a new plugin (auto-discover capabilities)
+/wogi-register --list            List all registered plugins
+/wogi-register --remove <name>   Remove a registered plugin
+```
+## How It Works
+When you run `/wogi-register <plugin-name>`, the system:
+1. **Inspects MCP tools** matching the plugin name (most reliable)
+2. **Searches online** for the plugin's documentation and capabilities
+3. **Generates a plugin entry** with triggers, capabilities, and invocation details
+4. **Saves to registry** at `.workflow/state/plugin-registry.json`
+5. **Displays summary** of discovered capabilities for confirmation
+After registration, `/wogi-start` will automatically route matching requests to the plugin.
+## Registration Flow
+### Step 1: MCP Tool Discovery
+First, try to discover the plugin's capabilities through MCP tools:
+1. Run `node scripts/flow-plugin-registry.js scan` to check for unregistered MCP servers
+2. Use `ToolSearch` to search for tools matching the plugin name pattern
+3. Use `ListMcpResourcesTool` to check for MCP resources from matching servers
+4. Extract: tool names, descriptions, input schemas
+5. Map each tool to a capability entry
+### Step 2: Web Search Discovery (if MCP insufficient)
+If MCP inspection yields few or no results:
+1. Search for `"<plugin-name> Claude Code plugin capabilities"`
+2. Search for `"<plugin-name> Claude Code MCP tools"`
+3. Search for the plugin's documentation page
+4. Extract capabilities from documentation
+5. Generate trigger phrases from discovered capabilities
+### Step 3: Build Plugin Entry
+From the discovered information, construct:
+```json
+{
+  "name": "<plugin-name>",
+  "description": "Human-readable description of the plugin",
+  "source": "mcp|web-discovered|manual",
+  "triggers": ["phrase 1", "phrase 2"],
+  "capabilities": [
+    {
+      "action": "action-name",
+      "description": "What this action does",
+      "triggerPhrases": ["send to X", "push to X"],
+      "mcpTool": "mcp__server__tool_name or null",
+      "requiresTask": false
+    }
+  ],
+  "metadata": {
+    "mcpServer": "server name if MCP-based",
+    "docsUrl": "URL to plugin docs if found",
+    "version": "plugin version if known"
+  }
+}
+```
+### Step 4: User Confirmation
+Display the discovered capabilities and ask for confirmation:
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Plugin Registration: <plugin-name>
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Description: <discovered description>
+Source: MCP tools | Web search | Manual
+Capabilities discovered (N):
+  1. <action>: <description>
+     Triggers: "phrase 1", "phrase 2"
+     MCP Tool: mcp__server__tool
+  2. <action>: <description>
+     Triggers: "phrase 3"
+Trigger phrases (top-level):
+  - "send to <plugin>"
+  - "push to <plugin>"
+  - "use <plugin>"
+Does this look correct? You can adjust before saving.
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+### Step 5: Save to Registry
+Call `registerPlugin()` from `scripts/flow-plugin-registry.js`:
+```javascript
+const { registerPlugin } = require('./scripts/flow-plugin-registry');
+registerPlugin({
+  name: pluginName,
+  description: discoveredDescription,
+  source: discoverySource,
+  triggers: topLevelTriggers,
+  capabilities: discoveredCapabilities,
+  metadata: { mcpServer, docsUrl, version }
+});
+```
+## Re-Registration (Update)
+When `/wogi-register <plugin-name>` is run for an already-registered plugin:
+1. Re-discover capabilities (same flow as above)
+2. Compare with existing registration
+3. Display diff: new capabilities, removed capabilities, changed triggers
+4. Update the existing entry (preserves registeredAt timestamp)
+5. Display: `Plugin "<name>" updated. Added N capabilities, removed M.`
+## --list Mode
+Display all registered plugins:
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Registered Plugins (N)
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+  figma [active]
+    4 capabilities | Source: mcp
+    Triggers: "send to figma", "push to figma", "create in figma"
+  linear [active]
+    3 capabilities | Source: web-discovered
+    Triggers: "create linear issue", "sync with linear"
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+If no plugins registered:
+```
+No plugins registered. Install a Claude Code plugin and run:
+  /wogi-register <plugin-name>
+```
+## --remove Mode
+```
+Removed plugin: <plugin-name>
+  Was registered with N capabilities
+  /wogi-start will no longer route to this plugin
+```
+## Important
+- The system is **fully generic** — it does NOT hardcode any plugin-specific logic
+- Plugin-specific knowledge is discovered at registration time, not built-in
+- All trigger matching uses word overlap scoring with a 0.5 minimum threshold
+- Built-in `/wogi-*` commands always take priority over plugin routing
+- Plugin actions are tracked through the normal WogiFlow task system when `trackPluginActions` is enabled
+## Auto-Discovery on Session Start
+When `config.plugins.autoScanOnSessionStart` is true:
+- The session-start hook compares available MCP servers against the registry
+- New unregistered servers are auto-registered with discovered capabilities
+- Previously registered servers that are no longer available are marked `inactive`
+- Display: `New plugin detected: <name>. Auto-registered with N capabilities.`
+Mid-session plugin installs require manual `/wogi-register <name>`.
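
Reviewer note: this file states that trigger matching uses "word overlap scoring with a 0.5 minimum threshold" but the diff excerpt does not show the formula. The sketch below is one plausible reading, offered as an assumption rather than the behaviour of `flow-plugin-registry.js`: the score is the fraction of a trigger phrase's words that also appear in the request. The names `tokenize`, `overlapScore`, and `bestPluginMatch`, and the `status` field on plugin entries, are hypothetical.

```javascript
// Hypothetical sketch of word-overlap trigger matching; not the package's code.

// Lowercase the text and split it into alphanumeric words.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Assumed formula: share of the trigger's words that occur in the request.
function overlapScore(request, trigger) {
  const requestWords = new Set(tokenize(request));
  const triggerWords = tokenize(trigger);
  if (triggerWords.length === 0) return 0;
  const hits = triggerWords.filter((word) => requestWords.has(word)).length;
  return hits / triggerWords.length;
}

// Returns the highest-scoring active plugin trigger at or above the threshold, or null.
function bestPluginMatch(request, plugins, threshold = 0.5) {
  let best = null;
  for (const plugin of plugins) {
    if (plugin.status !== "active") continue; // "status" is a hypothetical field name
    for (const trigger of plugin.triggers) {
      const score = overlapScore(request, trigger);
      if (score >= threshold && (best === null || score > best.score)) {
        best = { plugin: plugin.name, trigger, score };
      }
    }
  }
  return best;
}

const registry = [
  { name: "figma", status: "active", triggers: ["send to figma", "push to figma", "create in figma"] },
  { name: "linear", status: "active", triggers: ["create linear issue", "sync with linear"] },
];
console.log(bestPluginMatch("send this design to Figma", registry));
console.log(bestPluginMatch("fix the login bug", registry)); // null
```

With this formula, "send this design to Figma" contains all three words of the trigger "send to figma" and scores 1, while a request that shares no words with any trigger falls through to the next routing step.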

package/.claude/commands/wogi-start.md CHANGED

@@ -36,6 +36,31 @@ When `config.longInputGate.enabled` is `true`:
 - Prompt is a task ID → already handled in Step 0
 - Prompt content is primarily code (>80% code blocks) → skip, as code pastes are better handled by normal triage
+### Step 0.4: Plugin Registry Routing (Automatic)
+**After the command catalog finds no match, check if the request matches a registered plugin.** Plugin routing has lower priority than built-in `/wogi-*` commands.
+When `config.plugins.enabled` is `true`:
+1. Read `.workflow/state/plugin-registry.json` (the plugin registry)
+2. For each active plugin, check if the user's request matches any trigger phrase
+3. Use word overlap scoring (minimum threshold: 0.5) to find the best match
+4. **If a plugin match is found** (score >= 0.5):
+   - Display: `Plugin match: "<plugin-name>" (score: X.XX, trigger: "<matched phrase>")`
+   - If the matched capability has an `mcpTool` → use ToolSearch to load and invoke it
+   - If no specific `mcpTool` → display the plugin's capabilities and ask the user which action to take
+   - If `config.plugins.trackPluginActions` is true → create a lightweight task entry for tracking
+5. **If no plugin match** → Continue to the Command Catalog below
+**Plugin routing has LOWER priority than built-in `/wogi-*` commands.** If a request matches both a built-in command and a plugin trigger, the built-in command wins. Plugin routing is the fallback AFTER the command catalog finds no match.
+**Actual routing order:**
+1. Check if request is a task ID → Structured Execution
+2. Check long input gate → `/wogi-extract-review`
+3. Check Command Catalog → matching `/wogi-*` command
+4. Check Plugin Registry → matching plugin capability
+5. Default → `/wogi-story` (implementation request)
 ### Command Catalog
 Think of each command below as a tool available to you. Read the user's request, understand what they need, and invoke the best-fit command using the Skill tool.
@@ -63,6 +88,7 @@ Think of each command below as a tool available to you. Read the user's request,
 | `/wogi-decide` | Creates/updates project rules with clarifying questions | User says **"from now on" + rule verb** (always/never/must/should), "let's make it a rule", "update our rules". Note: "from now on" alone is not sufficient — require a follow-on rule verb to distinguish from implementation requests. |
 | `/wogi-learn` | Promotes feedback patterns to decision rules | User says **"let's learn from this"**, "we keep making this mistake", "extract lessons" |
 | `/wogi-retrospective` | Guided session reflection with lesson capture | User says **"retro"**, "what went well", "what can we improve", "lessons learned" |
+| `/wogi-register` | Register Claude Code plugins for /wogi-start routing | User wants to **register a plugin**, list registered plugins, or remove a plugin registration |
 ### Internal Tools (Auto-Invoked by wogi-start)
@@ -240,6 +266,18 @@ User: "help me think through how the hook architecture should evolve"
 → Action: Read relevant code, discuss architecture options. No files written, no tasks created.
 ```
+```
+User: "send this design to Figma"
+→ Intent: PLUGIN ROUTING — request matches a registered plugin trigger
+→ Action: Check plugin-registry.json. If "figma" plugin registered with trigger "send to figma" → route to plugin. If not registered → suggest /wogi-register figma
+```
+```
+User: "register the linear plugin"
+→ Intent: Plugin registration
+→ Action: Invoke /wogi-register linear
+```
 ```
 User: "yes"
 → Intent: CONVERSATIONAL FOLLOW-UP — user is responding to a previous AI question
@@ -330,6 +368,34 @@ At each execution milestone, update the workflow phase. These are no-ops when ph
 If a transition fails (wrong current phase), it's non-blocking — log and continue.
+### Task Checkpoints (when `config.proactiveCompaction.enabled`)
+At each phase boundary, save a task checkpoint and check if proactive compaction is needed. This enables lossless recovery after auto-compact.
+**At EVERY phase transition listed above**, also:
+1. Save checkpoint: Record task ID, current phase, completed scenarios, changed files, verification results to `.workflow/state/task-checkpoint.json`
+2. Check compaction: If context usage >= `proactiveCompaction.triggerThreshold` (default 75%), display compaction message and run `/wogi-compact` before proceeding
+**Checkpoint integration points:**
+| When | Checkpoint Action |
+|------|-------------------|
+| After explore phase completes | Save exploration summary + related files |
+| After spec is generated | Save spec path + acceptance criteria count |
+| After each scenario completes | Update scenario progress (completed/pending) |
+| After criteria check | Save verification results |
+| Before final validation | Save all changed files list |
+| After task completion | Clear checkpoint |
+**Auto-compact recovery** (on session resume):
+1. Check `.workflow/state/task-checkpoint.json` for an active checkpoint
+2. If checkpoint exists with incomplete scenarios → display recovery message:
+   `Auto-compact detected. Restoring task state from checkpoint...`
+3. Reload: task ID, current phase, completed scenarios, spec path, changed files
+4. Continue execution from the next pending scenario
+**Haiku-powered summaries** (when `proactiveCompaction.useHaiku: true`):
+When compacting between phases, use the Agent tool with `model: "haiku"` to generate the compaction summary. This preserves Opus context for the actual implementation work.
 ### Execution Flow
 ```
@@ -804,6 +870,10 @@ Return a structured report:
 ```javascript
 // Launch all in parallel (single message, multiple Task tool calls)
+// When hybrid mode is enabled (config.hybrid.enabled), use the model parameter
+// to route sub-agents to the appropriate model tier.
+// Routing is provided by getAgentModel() from flow-prompt-template.js:
+//   explore → sonnet, research → sonnet, search → haiku, judging → opus
 Task(subagent_type=Explore, prompt="Codebase Analyzer: ...")
 Task(subagent_type=Explore, prompt="Best Practices: ...")
 Task(subagent_type=Explore, prompt="Version Verifier: ...")
@@ -813,6 +883,21 @@ Task(subagent_type=Explore, prompt="Standards Preview: ...")
 Task(subagent_type=Explore, prompt="Consumer Impact Analyzer: ...")
 ```
+**Hybrid Model Routing (S4):**
+When `config.hybrid.enabled` is `true`, use the Agent tool's `model` parameter to route sub-agents:
+| Sub-Agent Type | Agent `model` Parameter | Rationale |
+|----------------|------------------------|-----------|
+| Explore/Research | `"sonnet"` | Good analysis capability, saves Opus context |
+| Code Review | `"sonnet"` | Balanced quality for review tasks |
+| Simple Lookup/Search | `"haiku"` | Fast and cheap for file searches |
+| Complex Reasoning | `"opus"` | Only for architecture/planning decisions |
+| Compaction Summary | `"haiku"` | Summaries don't need premium models |
+| Eval Judging | `"opus"` (1) + `"sonnet"` (2) | Multi-judge composition from eval config |
+The routing table is configured in `scripts/flow-prompt-template.js` and can be overridden via `config.hybrid.routing.overrides`. Capability scores from `.workflow/models/capabilities/*.yaml` are consulted when `checkCapabilities` is true — if a model's score for the task type is below the `capabilityThreshold` (default: 5), the task is escalated to the next tier.
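
Reviewer note: the escalation rule in the paragraph above (score below `capabilityThreshold`, default 5, moves the task to the next tier) can be sketched as below. This is not the code in `flow-prompt-template.js` or `flow-model-router.js`. The tier order haiku, sonnet, opus, the function name `escalateIfBelowThreshold`, and the shape of the `capabilityScores` object are assumptions made for illustration; the real scores live in `.workflow/models/capabilities/*.yaml`, whose schema is not shown in this diff.

```javascript
// Hypothetical sketch of capability-based escalation; not the package's code.
// Assumption: tiers are ordered from lightest to strongest.
const TIERS = ["haiku", "sonnet", "opus"];

// capabilityScores: { [model]: { [taskType]: number } } (assumed shape).
// Returns the routed model, moved up one tier when its score is below the threshold.
function escalateIfBelowThreshold(model, taskType, capabilityScores, capabilityThreshold = 5) {
  const index = TIERS.indexOf(model);
  const score = capabilityScores[model]?.[taskType];
  const belowThreshold = typeof score === "number" && score < capabilityThreshold;
  // Unknown models, the top tier, and models without a low score stay as routed.
  if (index === -1 || index === TIERS.length - 1 || !belowThreshold) return model;
  return TIERS[index + 1];
}

const scores = {
  haiku: { search: 8, review: 3 },
  sonnet: { review: 7 },
};
console.log(escalateIfBelowThreshold("haiku", "search", scores)); // "haiku"
console.log(escalateIfBelowThreshold("haiku", "review", scores)); // "sonnet"
```

The sketch escalates by a single tier, matching the "next tier" wording; whether the package re-checks the new tier's score is not stated in the diff.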
 **After all agents complete**, display a consolidated research summary:
 **Output Format:**
@@ -1801,9 +1886,32 @@ Phase commands:
 ### Scenario keeps failing after max retries
 - Stop and report: "Scenario X failed after N attempts. Issue: [description]"
 - Leave task in inProgress
-- **Auto-suggest hypothesis debugging**: When a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents that analyze competing theories about the root cause
+- **Best-of-N fallback (high-risk tasks)**: When a HIGH-RISK task (architecture, migration, refactor, or complexity HIGH + files > 10) fails 3+ times, auto-suggest Best-of-N:
+  ```
+  This high-risk task has failed 3 times. Would you like to try Best-of-N?
+  → Spawn 2 alternative implementation approaches in isolated worktrees
+  → Opus judges the best approach against the spec
+  ```
+  Use `checkFallbackTrigger()` from `flow-best-of-n.js` to determine if Best-of-N applies.
+  If the task is NOT high-risk: suggest `/wogi-debug-hypothesis` instead (competing theories about root cause).
+- **Auto-suggest hypothesis debugging**: For non-high-risk tasks, when a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents
 - User can investigate and re-run `/wogi-start TASK-XXX` to continue
+### Best-of-N auto-suggestion (high-risk tasks)
+When starting a task, if `config.bestOfN.enabled` is true:
+1. Run `assessRisk()` from `flow-best-of-n.js` with the task's type, description, and file count
+2. If `shouldSuggest` is true, display:
+   ```
+   This is a high-risk task. Would you like to use Best-of-N?
+   → Spawn 3 approaches in parallel (isolated worktrees)
+   → Opus selects the best implementation
+   Options: [Yes, use Best-of-N] [No, proceed normally]
+   ```
+3. If user confirms: spawn N agents using `Agent(isolation: "worktree")` with variation strategy from `getVariationStrategy()`
+4. After all complete: spawn Opus judge using `buildSelectionPrompt()` to select winner
+5. Apply winner, clean up losing worktrees
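
Reviewer note: the fallback rule added in this hunk can be read as a small decision function. The sketch below is hypothetical and does not reproduce `assessRisk()` or `checkFallbackTrigger()` from `flow-best-of-n.js`, whose signatures are not shown here. It reads "architecture, migration, refactor, or complexity HIGH + files > 10" as: the task type is one of the three named types, or complexity is HIGH and more than 10 files are involved. That parsing, the names `isHighRisk` and `suggestFallback`, and the task object shape are assumptions.

```javascript
// Hypothetical sketch of the high-risk check and fallback choice; not the package's code.
const HIGH_RISK_TYPES = new Set(["architecture", "migration", "refactor"]);

// Assumed reading of the rule: risky type, OR (HIGH complexity AND more than 10 files).
function isHighRisk({ type, complexity, fileCount }) {
  return HIGH_RISK_TYPES.has(type) || (complexity === "HIGH" && fileCount > 10);
}

// After 3 or more failures, pick which recovery path to suggest; otherwise suggest nothing.
function suggestFallback(task, failureCount) {
  if (failureCount < 3) return null;
  return isHighRisk(task) ? "best-of-n" : "/wogi-debug-hypothesis";
}

console.log(suggestFallback({ type: "migration", complexity: "LOW", fileCount: 2 }, 3)); // "best-of-n"
console.log(suggestFallback({ type: "bugfix", complexity: "HIGH", fileCount: 4 }, 3));   // "/wogi-debug-hypothesis"
console.log(suggestFallback({ type: "bugfix", complexity: "HIGH", fileCount: 12 }, 2));  // null
```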
 ### Quality gate keeps failing
 - Report which gate is failing and why
 - Attempt to fix automatically

package/.workflow/templates/claude-md.hbs CHANGED

@@ -120,6 +120,7 @@ npm install -D wogiflow && npx flow onboard
 | `/wogi-roadmap` | View/manage deferred work |
 | `/wogi-suggest "text"` | Submit suggestion for WogiFlow |
 | `/wogi-audit` | Comprehensive project-wide analysis (7 dimensions) |
+| `/wogi-register` | Register plugins for /wogi-start routing |
 See `.claude/docs/commands.md` for complete command reference.
@@ -147,6 +148,7 @@ See `.claude/docs/commands.md` for complete command reference.
 | "rescan project", "re-evaluate project", "project changed", "others made changes", "sync wogi", "things changed", "out of sync" | `/wogi-rescan` |
 | "suggest improvement", "feature request for wogi", "wogi suggestion", "submit feedback" | `/wogi-suggest` |
 | "audit project", "project audit", "full project analysis", "full analysis" | `/wogi-audit` |
+| "register plugin", "list plugins", "remove plugin", "register MCP" | `/wogi-register` |
 **IMPORTANT**: When a user's message matches one of these patterns, immediately invoke the Skill tool with the corresponding command. Do not ask for confirmation. These `/wogi-*` commands satisfy the mandatory routing requirement — you do NOT also need to invoke `/wogi-start` when a detection match exists. `/wogi-start` is the fallback for messages that don't match this table.

package/.workflow/templates/partials/user-commands.hbs CHANGED

@@ -15,5 +15,6 @@
 | Session retro | "retro" or "what went well" |
 | Rescan project | "rescan project" or "things changed" or "out of sync" |
 | Project audit | "audit project" or "full analysis" |
+| Register plugin | "register plugin" or "/wogi-register <name>" |
 `/wogi-start` is the universal fallback router — it classifies any request and routes to the right action. Detailed per-command docs live in each skill's `.md` file under `.claude/commands/`.