npm - wogiflow - Versions diffs - 1.6.3 → 1.7.1 - Mend

wogiflow 1.6.3 → 1.7.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (199)

package/.claude/commands/wogi-compact.md +49 -0
package/.claude/commands/wogi-eval.md +135 -0
package/.claude/commands/wogi-morning.md +24 -1
package/.claude/commands/wogi-onboard.md +46 -0
package/.claude/commands/wogi-setup-stack.md +66 -0
package/.claude/commands/wogi-start.md +71 -1
package/.claude/rules/README.md +60 -0
package/.claude/rules/architecture/component-reuse.md +38 -0
package/.claude/rules/architecture/document-structure.md +76 -0
package/.claude/rules/architecture/dual-repo-management.md +169 -0
package/.claude/rules/architecture/feature-refactoring-cleanup.md +87 -0
package/.claude/rules/architecture/model-management.md +35 -0
package/.claude/rules/architecture/self-maintenance.md +87 -0
package/.claude/rules/code-style/naming-conventions.md +55 -0
package/.claude/rules/security/security-patterns.md +176 -0
package/.claude/skills/figma-analyzer/knowledge/learnings.md +11 -0
package/.workflow/specs/architecture.md.template +24 -0
package/.workflow/specs/stack.md.template +33 -0
package/.workflow/specs/testing.md.template +36 -0
package/.workflow/templates/prompts/gemini-flash.yaml +42 -0
package/.workflow/templates/prompts/gpt4o.yaml +44 -0
package/.workflow/templates/prompts/haiku.yaml +42 -0
package/.workflow/templates/prompts/opus.yaml +45 -0
package/.workflow/templates/prompts/sonnet.yaml +44 -0
package/package.json +1 -1
package/scripts/flow +3 -0
package/scripts/flow-best-of-n.js +432 -0
package/scripts/flow-community-sync.js +469 -0
package/scripts/flow-eval-judge.js +388 -0
package/scripts/flow-eval.js +430 -0
package/scripts/flow-model-router.js +10 -0
package/scripts/flow-proactive-compact.js +341 -0
package/scripts/flow-prompt-template.js +517 -0
package/scripts/flow-revision-tracker.js +258 -0
package/scripts/flow-skill-freshness.js +250 -0
package/scripts/flow-skill-generator.js +237 -36
package/scripts/flow-stack-wizard.js +17 -1
package/scripts/flow-stats-collector.js +534 -0
package/scripts/flow-sync-anonymizer.js +254 -0
package/scripts/flow-task-checkpoint.js +497 -0
package/scripts/flow-utils.js +47 -11
package/scripts/hooks/core/session-context.js +12 -0
package/scripts/hooks/core/session-end.js +12 -0
package/scripts/hooks/core/task-completed.js +32 -0
package/templates/skills/angular/knowledge/anti-patterns.md +34 -0
package/templates/skills/angular/knowledge/conventions.md +15 -0
package/templates/skills/angular/knowledge/learnings.md +5 -0
package/templates/skills/angular/knowledge/patterns.md +35 -0
package/templates/skills/angular/skill.md +60 -0
package/templates/skills/cypress/knowledge/anti-patterns.md +20 -0
package/templates/skills/cypress/knowledge/conventions.md +14 -0
package/templates/skills/cypress/knowledge/learnings.md +5 -0
package/templates/skills/cypress/knowledge/patterns.md +39 -0
package/templates/skills/cypress/skill.md +59 -0
package/templates/skills/django/knowledge/anti-patterns.md +20 -0
package/templates/skills/django/knowledge/conventions.md +15 -0
package/templates/skills/django/knowledge/learnings.md +5 -0
package/templates/skills/django/knowledge/patterns.md +40 -0
package/templates/skills/django/skill.md +59 -0
package/templates/skills/docker/knowledge/anti-patterns.md +34 -0
package/templates/skills/docker/knowledge/conventions.md +16 -0
package/templates/skills/docker/knowledge/learnings.md +5 -0
package/templates/skills/docker/knowledge/patterns.md +48 -0
package/templates/skills/docker/skill.md +60 -0
package/templates/skills/drizzle/knowledge/anti-patterns.md +20 -0
package/templates/skills/drizzle/knowledge/conventions.md +14 -0
package/templates/skills/drizzle/knowledge/learnings.md +5 -0
package/templates/skills/drizzle/knowledge/patterns.md +38 -0
package/templates/skills/drizzle/skill.md +59 -0
package/templates/skills/eslint/knowledge/anti-patterns.md +20 -0
package/templates/skills/eslint/knowledge/conventions.md +14 -0
package/templates/skills/eslint/knowledge/learnings.md +5 -0
package/templates/skills/eslint/knowledge/patterns.md +26 -0
package/templates/skills/eslint/skill.md +58 -0
package/templates/skills/express/knowledge/anti-patterns.md +34 -0
package/templates/skills/express/knowledge/conventions.md +15 -0
package/templates/skills/express/knowledge/learnings.md +5 -0
package/templates/skills/express/knowledge/patterns.md +63 -0
package/templates/skills/express/skill.md +61 -0
package/templates/skills/fastapi/knowledge/anti-patterns.md +20 -0
package/templates/skills/fastapi/knowledge/conventions.md +15 -0
package/templates/skills/fastapi/knowledge/learnings.md +5 -0
package/templates/skills/fastapi/knowledge/patterns.md +47 -0
package/templates/skills/fastapi/skill.md +59 -0
package/templates/skills/fastify/knowledge/anti-patterns.md +20 -0
package/templates/skills/fastify/knowledge/conventions.md +14 -0
package/templates/skills/fastify/knowledge/learnings.md +5 -0
package/templates/skills/fastify/knowledge/patterns.md +42 -0
package/templates/skills/fastify/skill.md +59 -0
package/templates/skills/flask/knowledge/anti-patterns.md +20 -0
package/templates/skills/flask/knowledge/conventions.md +15 -0
package/templates/skills/flask/knowledge/learnings.md +5 -0
package/templates/skills/flask/knowledge/patterns.md +41 -0
package/templates/skills/flask/skill.md +59 -0
package/templates/skills/graphql/knowledge/anti-patterns.md +19 -0
package/templates/skills/graphql/knowledge/conventions.md +14 -0
package/templates/skills/graphql/knowledge/learnings.md +5 -0
package/templates/skills/graphql/knowledge/patterns.md +22 -0
package/templates/skills/graphql/skill.md +58 -0
package/templates/skills/hono/knowledge/anti-patterns.md +20 -0
package/templates/skills/hono/knowledge/conventions.md +14 -0
package/templates/skills/hono/knowledge/learnings.md +5 -0
package/templates/skills/hono/knowledge/patterns.md +23 -0
package/templates/skills/hono/skill.md +58 -0
package/templates/skills/jest/knowledge/anti-patterns.md +35 -0
package/templates/skills/jest/knowledge/conventions.md +15 -0
package/templates/skills/jest/knowledge/learnings.md +5 -0
package/templates/skills/jest/knowledge/patterns.md +42 -0
package/templates/skills/jest/skill.md +60 -0
package/templates/skills/mocha/knowledge/anti-patterns.md +20 -0
package/templates/skills/mocha/knowledge/conventions.md +14 -0
package/templates/skills/mocha/knowledge/learnings.md +5 -0
package/templates/skills/mocha/knowledge/patterns.md +26 -0
package/templates/skills/mocha/skill.md +58 -0
package/templates/skills/mongoose/knowledge/anti-patterns.md +20 -0
package/templates/skills/mongoose/knowledge/conventions.md +15 -0
package/templates/skills/mongoose/knowledge/learnings.md +5 -0
package/templates/skills/mongoose/knowledge/patterns.md +37 -0
package/templates/skills/mongoose/skill.md +59 -0
package/templates/skills/nestjs/knowledge/anti-patterns.md +33 -0
package/templates/skills/nestjs/knowledge/conventions.md +15 -0
package/templates/skills/nestjs/knowledge/learnings.md +5 -0
package/templates/skills/nestjs/knowledge/patterns.md +45 -0
package/templates/skills/nestjs/skill.md +60 -0
package/templates/skills/next/knowledge/anti-patterns.md +47 -0
package/templates/skills/next/knowledge/conventions.md +16 -0
package/templates/skills/next/knowledge/learnings.md +5 -0
package/templates/skills/next/knowledge/patterns.md +62 -0
package/templates/skills/next/skill.md +62 -0
package/templates/skills/playwright/knowledge/anti-patterns.md +34 -0
package/templates/skills/playwright/knowledge/conventions.md +14 -0
package/templates/skills/playwright/knowledge/learnings.md +5 -0
package/templates/skills/playwright/knowledge/patterns.md +41 -0
package/templates/skills/playwright/skill.md +60 -0
package/templates/skills/prisma/knowledge/anti-patterns.md +34 -0
package/templates/skills/prisma/knowledge/conventions.md +15 -0
package/templates/skills/prisma/knowledge/learnings.md +5 -0
package/templates/skills/prisma/knowledge/patterns.md +52 -0
package/templates/skills/prisma/skill.md +61 -0
package/templates/skills/pytest/knowledge/anti-patterns.md +20 -0
package/templates/skills/pytest/knowledge/conventions.md +15 -0
package/templates/skills/pytest/knowledge/learnings.md +5 -0
package/templates/skills/pytest/knowledge/patterns.md +49 -0
package/templates/skills/pytest/skill.md +59 -0
package/templates/skills/react/knowledge/anti-patterns.md +48 -0
package/templates/skills/react/knowledge/conventions.md +16 -0
package/templates/skills/react/knowledge/learnings.md +5 -0
package/templates/skills/react/knowledge/patterns.md +57 -0
package/templates/skills/react/skill.md +62 -0
package/templates/skills/sequelize/knowledge/anti-patterns.md +20 -0
package/templates/skills/sequelize/knowledge/conventions.md +14 -0
package/templates/skills/sequelize/knowledge/learnings.md +5 -0
package/templates/skills/sequelize/knowledge/patterns.md +41 -0
package/templates/skills/sequelize/skill.md +59 -0
package/templates/skills/sqlalchemy/knowledge/anti-patterns.md +20 -0
package/templates/skills/sqlalchemy/knowledge/conventions.md +14 -0
package/templates/skills/sqlalchemy/knowledge/learnings.md +5 -0
package/templates/skills/sqlalchemy/knowledge/patterns.md +40 -0
package/templates/skills/sqlalchemy/skill.md +59 -0
package/templates/skills/svelte/knowledge/anti-patterns.md +20 -0
package/templates/skills/svelte/knowledge/conventions.md +14 -0
package/templates/skills/svelte/knowledge/learnings.md +5 -0
package/templates/skills/svelte/knowledge/patterns.md +38 -0
package/templates/skills/svelte/skill.md +59 -0
package/templates/skills/tailwindcss/knowledge/anti-patterns.md +34 -0
package/templates/skills/tailwindcss/knowledge/conventions.md +15 -0
package/templates/skills/tailwindcss/knowledge/learnings.md +5 -0
package/templates/skills/tailwindcss/knowledge/patterns.md +39 -0
package/templates/skills/tailwindcss/skill.md +60 -0
package/templates/skills/terraform/knowledge/anti-patterns.md +36 -0
package/templates/skills/terraform/knowledge/conventions.md +16 -0
package/templates/skills/terraform/knowledge/learnings.md +5 -0
package/templates/skills/terraform/knowledge/patterns.md +50 -0
package/templates/skills/terraform/skill.md +60 -0
package/templates/skills/typeorm/knowledge/anti-patterns.md +20 -0
package/templates/skills/typeorm/knowledge/conventions.md +14 -0
package/templates/skills/typeorm/knowledge/learnings.md +5 -0
package/templates/skills/typeorm/knowledge/patterns.md +25 -0
package/templates/skills/typeorm/skill.md +58 -0
package/templates/skills/typescript/knowledge/anti-patterns.md +34 -0
package/templates/skills/typescript/knowledge/conventions.md +15 -0
package/templates/skills/typescript/knowledge/learnings.md +5 -0
package/templates/skills/typescript/knowledge/patterns.md +41 -0
package/templates/skills/typescript/skill.md +60 -0
package/templates/skills/vitest/knowledge/anti-patterns.md +19 -0
package/templates/skills/vitest/knowledge/conventions.md +14 -0
package/templates/skills/vitest/knowledge/learnings.md +5 -0
package/templates/skills/vitest/knowledge/patterns.md +40 -0
package/templates/skills/vitest/skill.md +59 -0
package/templates/skills/vue/knowledge/anti-patterns.md +34 -0
package/templates/skills/vue/knowledge/conventions.md +15 -0
package/templates/skills/vue/knowledge/learnings.md +5 -0
package/templates/skills/vue/knowledge/patterns.md +40 -0
package/templates/skills/vue/skill.md +60 -0
package/templates/skills/zod/knowledge/anti-patterns.md +20 -0
package/templates/skills/zod/knowledge/conventions.md +14 -0
package/templates/skills/zod/knowledge/learnings.md +5 -0
package/templates/skills/zod/knowledge/patterns.md +41 -0
package/templates/skills/zod/skill.md +59 -0

package/.claude/commands/wogi-compact.md CHANGED

@@ -123,6 +123,55 @@ With smart compaction enabled (`config.smartCompaction.enabled`), context is man
 This means fixed thresholds are less relevant - compaction happens when actually needed based on the specific task.
+### Proactive Phase-Boundary Compaction (v2.3)
+With proactive compaction enabled (`config.proactiveCompaction.enabled`), WogiFlow compacts between task phases:
+- **Phase boundaries**: After explore, spec, each scenario, criteria check, validation
+- **Trigger threshold**: Default 75% context usage (configurable via `triggerThreshold`)
+- **Task checkpoints**: Full task state saved to `.workflow/state/task-checkpoint.json` at every phase boundary
+- **Auto-compact recovery**: If Claude's auto-compact fires, checkpoint enables lossless recovery
+**How it works:**
+1. At each phase boundary, `/wogi-start` saves a task checkpoint (task ID, phase, scenarios, files changed)
+2. If context exceeds the trigger threshold, proactive compaction fires before the next phase
+3. If Claude auto-compacts (at ~95%), session resume reads the checkpoint and restores full state
+**Recovery flow:**
+```
+Auto-compact fires at ~95% → Session resumes with compressed context
+→ /wogi-start detects checkpoint exists → Reads task-checkpoint.json
+→ Displays: "Auto-compact detected. Restoring task state from checkpoint..."
+→ Continues from the exact phase where it left off
+```
+**Config** (`config.proactiveCompaction`):
+```json
+{
+  "enabled": true,
+  "triggerThreshold": 0.75,
+  "useHaiku": true,
+  "phases": ["exploring", "spec_review", "scenario", "criteria_check", "validating"]
+}
+```
+**CLI commands:**
+```bash
+# Check if compaction needed at a phase
+node scripts/flow-proactive-compact.js check exploring 0.78 wf-a1b2c3d4
+# Show current config
+node scripts/flow-proactive-compact.js config
+# Generate compaction context from checkpoint
+node scripts/flow-proactive-compact.js context
+# View/manage checkpoints
+node scripts/flow-task-checkpoint.js load
+node scripts/flow-task-checkpoint.js check
+node scripts/flow-task-checkpoint.js clear wf-a1b2c3d4
+```
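
As a rough illustration of the phase-boundary check described in this hunk (not code from the package), the decision could be sketched as below. The function name `shouldCompact` and its return shape are hypothetical; only the config keys come from the documented `config.proactiveCompaction` block. The sketch treats usage at or above the threshold as a trigger, which is an assumption.

```javascript
// Hypothetical sketch: decide whether to compact at a phase boundary.
// Only the config keys are taken from the documentation; the rest is assumed.
const defaultCompactionConfig = {
  enabled: true,
  triggerThreshold: 0.75,
  useHaiku: true,
  phases: ['exploring', 'spec_review', 'scenario', 'criteria_check', 'validating'],
};

function shouldCompact(phase, contextUsage, config = defaultCompactionConfig) {
  if (!config.enabled) {
    return { compact: false, reason: 'proactive compaction disabled' };
  }
  if (!config.phases.includes(phase)) {
    return { compact: false, reason: 'not a configured phase boundary' };
  }
  if (contextUsage < config.triggerThreshold) {
    return { compact: false, reason: 'context usage below trigger threshold' };
  }
  return { compact: true, reason: 'context usage at or above trigger threshold' };
}

// Mirrors the documented CLI example: phase "exploring" at 78% usage.
console.log(shouldCompact('exploring', 0.78));
```

Returning a reason alongside the boolean keeps the decision easy to log at each boundary.
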
 ### Legacy Fixed Thresholds
 If smart compaction is disabled, check context pressure status:

package/.claude/commands/wogi-eval.md ADDED

@@ -0,0 +1,135 @@
+---
+description: "Evaluate WogiFlow task output quality with multi-judge scoring"
+---
+Evaluate a completed task's output quality using multi-judge scoring (1 Opus + 2 Sonnet).
+## Usage
+```
+/wogi-eval wf-XXXXXXXX              Evaluate a specific task
+/wogi-eval --batch --last 5          Evaluate the last 5 completed tasks
+/wogi-eval --compare                 Show eval trend comparison
+/wogi-eval --candidates              Show tasks eligible for evaluation
+```
+## How It Works
+1. **Read the spec**: Load the task's acceptance criteria and requirements
+2. **Get the diff**: Find the commit and extract the implementation diff
+3. **Spawn 3 judge agents**: 1 Opus + 2 Sonnet (via Agent tool `model` parameter)
+4. **Score independently**: Each judge scores on 5 dimensions (1-10)
+5. **Take median**: Final score = median of 3 judges per dimension
+6. **Save results**: Store in `.workflow/evals/`
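
The median step above can be sketched as follows. This is a minimal illustration, not the package's `aggregateScores()`; the names `median` and `aggregateByMedian` are hypothetical, the judge scores are made-up sample values, and computing the overall score as the mean of the per-dimension medians is an assumption inferred from the sample output (8, 7, 9, 6, 8 giving 7.6).

```javascript
// Hypothetical sketch of median-per-dimension aggregation across judges.
const DIMENSIONS = ['completeness', 'accuracy', 'workflowCompliance', 'tokenEfficiency', 'quality'];

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function aggregateByMedian(judgeScores, passingThreshold = 6) {
  const dimensions = {};
  for (const dim of DIMENSIONS) {
    dimensions[dim] = median(judgeScores.map((score) => score[dim]));
  }
  // Assumption: overall score is the mean of the per-dimension medians.
  const mean = DIMENSIONS.reduce((sum, dim) => sum + dimensions[dim], 0) / DIMENSIONS.length;
  return {
    dimensions,
    overall: Math.round(mean * 10) / 10,
    passed: mean >= passingThreshold,
  };
}

// Made-up scores from three judges, for illustration only.
const exampleJudgeScores = [
  { completeness: 8, accuracy: 7, workflowCompliance: 9, tokenEfficiency: 6, quality: 8 },
  { completeness: 9, accuracy: 6, workflowCompliance: 9, tokenEfficiency: 5, quality: 8 },
  { completeness: 7, accuracy: 8, workflowCompliance: 8, tokenEfficiency: 7, quality: 7 },
];
console.log(aggregateByMedian(exampleJudgeScores));
```

Using the median rather than the mean per dimension means one outlying judge cannot move a dimension's score on its own.
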
+## Scoring Dimensions
+| Dimension | What It Measures |
+|-----------|-----------------|
+| Completeness | Did implementation address ALL acceptance criteria? |
+| Accuracy | Is code correct, handling edge cases? |
+| Workflow Compliance | Did it follow WogiFlow patterns (spec, criteria check, wiring, standards)? |
+| Token Efficiency | How many tokens/iterations to reach passing state? |
+| Quality | Code quality, readability, maintainability |
+## Execution Flow
+### Step 1: Prepare eval data
+```bash
+node scripts/flow-eval.js prepare wf-XXXXXXXX
+```
+This returns: spec content, implementation diff, iteration count, token estimate.
+### Step 2: Spawn judge agents
+Launch 3 agents in parallel using the Agent tool:
+```
+Agent(model: "opus", prompt: "<judge prompt with spec + diff>")
+Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
+Agent(model: "sonnet", prompt: "<judge prompt with spec + diff>")
+```
+Each judge receives the same prompt (from `buildJudgePrompt()` in `flow-eval-judge.js`) and scores independently.
+### Step 3: Aggregate scores
+```javascript
+const { aggregateScores, parseJudgeResponse } = require('./scripts/flow-eval-judge');
+// Parse each judge's response
+const scores = judgeResponses.map(parseJudgeResponse).filter(Boolean);
+// Take median per dimension
+const result = aggregateScores(scores);
+```
+### Step 4: Save and display
+```javascript
+const { saveEvalResult, formatEvalResults } = require('./scripts/flow-eval');
+saveEvalResult({ taskId, aggregated: result, judgeResults: scores, model, taskType });
+```
+## Output Format
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📊 EVAL RESULTS: wf-XXXXXXXX
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Judges: 3 (1 Opus + 2 Sonnet) | Confidence: high
+  completeness          ████████░░ 8/10
+  accuracy              ███████░░░ 7/10
+  workflowCompliance    █████████░ 9/10
+  tokenEfficiency       ██████░░░░ 6/10
+  quality               ████████░░ 8/10
+Overall: 7.6/10 — PASS (threshold: 6)
+Individual Judges:
+  Judge 1 (opus): Strong implementation, minor edge case gaps
+  Judge 2 (sonnet): Good workflow compliance, token usage could improve
+  Judge 3 (sonnet): Clean code, well-structured implementation
+Saved: .workflow/evals/wf-XXXXXXXX-eval-2026-03-02T10-00-00.json
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+## Batch Mode
+When running `--batch --last N`:
+1. Get the last N completed tasks from stats
+2. Evaluate each sequentially
+3. Display summary table
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📊 BATCH EVAL RESULTS
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+Task            Model         Overall  Comp  Acc   WF    Tok   Qual
+wf-a1b2c3d4    opus-4-6      7.6      8     7     9     6     8
+wf-e5f6a7b8    sonnet-4-6    6.8      7     7     8     5     7
+wf-c9d0e1f2    opus-4-6      8.2      9     8     9     7     8
+Average: 7.5/10
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+## Configuration
+In `config.json`:
+```json
+{
+  "eval": {
+    "judges": { "opus": 1, "sonnet": 2 },
+    "scoringDimensions": ["completeness", "accuracy", "workflowCompliance", "tokenEfficiency", "quality"],
+    "passingThreshold": 6
+  }
+}
+```
+ARGUMENTS: $ARGUMENTS

package/.claude/commands/wogi-morning.md CHANGED

@@ -41,6 +41,11 @@ NEW RULES LEARNED (auto-promoted from patterns)
   → Fix now? Routes through /wogi-start with quality gates.
   → Dismiss? Grandfathers existing violations.
+STALE SKILLS (documentation may be outdated)
+  3 skills older than 90 days:
+    react (95 days), express (120 days), prisma (91 days)
+  → Run `/wogi-setup-stack --refresh-stale` to update.
 CHANGES SINCE LAST SESSION
   - 2 new commits
   - 1 new bug filed
@@ -141,8 +146,26 @@ Configuration in `.workflow/config.json`:
   "showBlockers": true,
   "showKeyContext": true,
   "showRuleViolations": true,
-  "showAutoPromotedRules": true
+  "showAutoPromotedRules": true,
+  "showStaleSkills": true
 }
 ```
 Set `enabled: false` to disable this command.
+## Stale Skills Section (Implementation Details)
+When `morningBriefing.showStaleSkills` is true, the morning briefing checks skill documentation freshness:
+1. Run `node scripts/flow-skill-freshness.js check` or call `getSkillFreshnessReport()` programmatically
+2. If any skills have `lastDocCheck` older than `config.skills.freshnessThreshold` days (default: 90):
+   - Display the skill names and ages
+   - Offer to run `/wogi-setup-stack --refresh-stale`
+3. If no stale skills, omit this section entirely
+Configuration in `.workflow/config.json`:
+```json
+"skills": {
+  "freshnessThreshold": 90
+}
+```
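
The staleness check described in this hunk might look roughly like the sketch below. It is not the package's `getSkillFreshnessReport()`; the function name `findStaleSkills`, the input shape, and the sample dates are assumptions. Only `lastDocCheck` and the 90-day default come from the text.

```javascript
// Hypothetical sketch: flag skills whose lastDocCheck is older than the threshold.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function findStaleSkills(skills, thresholdDays = 90, now = new Date()) {
  return skills
    .map(({ name, lastDocCheck }) => ({
      name,
      // Whole days elapsed since the skill's documentation was last checked.
      ageDays: Math.floor((now.getTime() - new Date(lastDocCheck).getTime()) / MS_PER_DAY),
    }))
    .filter(({ ageDays }) => ageDays > thresholdDays);
}

// Sample data for illustration only (dates are invented).
const exampleSkills = [
  { name: 'react', lastDocCheck: '2025-11-27' },
  { name: 'zod', lastDocCheck: '2026-02-01' },
];
console.log(findStaleSkills(exampleSkills, 90, new Date('2026-03-02T00:00:00Z')));
```

Passing `now` as a parameter keeps the check deterministic, which makes it easier to test than reading the clock inside the function.
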

package/.claude/commands/wogi-onboard.md CHANGED

@@ -655,6 +655,52 @@ Display:
     // Check for conventional commits, ticket prefixes, etc.
     ```
+    **Model Routing Configuration:**
+    Present the user with a model routing choice using `AskUserQuestion`:
+    ```
+    How should WogiFlow route sub-tasks to AI models?
+    1. "Full Opus (Recommended)" — Maximum quality. All sub-agents use Opus.
+       Best for complex projects where quality matters most.
+    2. "Smart Routing" — Opus orchestrates, Sonnet handles implementation/review,
+       Haiku handles searches/lookups. Best quality-to-cost balance.
+       Preserves context window by offloading sub-tasks to lighter models.
+    3. "Custom" — Configure your own routing rules per task type.
+    ```
+    Based on choice:
+    - Option 1: Set `config.hybrid.enabled = false` (all tasks stay with current model)
+    - Option 2: Set `config.hybrid.enabled = true` with default routing table (already configured)
+    - Option 3: Set `config.hybrid.enabled = true` and guide user through per-task-type routing overrides
+    Display: `  Model routing...      ✓ [Smart Routing | Full Opus | Custom]`
+    **Community Knowledge Sync:**
+    Present opt-in question using `AskUserQuestion`:
+    ```
+    Would you like to share anonymized model performance data with the WogiFlow community?
+    What's shared: model ID, task type, iteration count, token usage, wall clock time
+    What's NOT shared: file paths, code, project names, task descriptions
+    You'll receive back: community-optimized model routing rules and capability scores.
+    1. "Enable (Recommended)" — Help improve WogiFlow for everyone
+    2. "Disable" — Keep all data local only
+    ```
+    Based on choice:
+    - Option 1: Set `config.communitySync.enabled = true`
+    - Option 2: Set `config.communitySync.enabled = false` (default)
+    Display: `  Community sync...     ✓ [Enabled | Disabled]`
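
One way to honor the shared / not-shared split described in the opt-in prompt is an allowlist: copy only the named fields and drop everything else. The sketch below is an illustration under that assumption, not the contents of `flow-sync-anonymizer.js`. The field names (`modelId`, `taskType`, `iterationCount`, `tokenUsage`, `wallClockMs`) and both function names are hypothetical.

```javascript
// Hypothetical sketch: allowlist-based anonymization of a stats record.
// Field names are assumed; they mirror the "What's shared" list in prose only.
const SHARED_FIELDS = ['modelId', 'taskType', 'iterationCount', 'tokenUsage', 'wallClockMs'];

function anonymizeRecord(record) {
  const out = {};
  for (const field of SHARED_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      out[field] = record[field];
    }
  }
  return out;
}

function prepareSyncPayload(records, config) {
  // Nothing leaves the machine unless the user explicitly opted in.
  const sync = config.communitySync || {};
  if (sync.enabled !== true) return [];
  return records.map(anonymizeRecord);
}

const exampleRecord = {
  modelId: 'sonnet',
  taskType: 'bugfix',
  iterationCount: 2,
  tokenUsage: 12000,
  wallClockMs: 90000,
  filePath: 'src/secret.js',      // must never be shared
  taskDescription: 'Fix login',   // must never be shared
};
console.log(prepareSyncPayload([exampleRecord], { communitySync: { enabled: true } }));
```

An allowlist fails closed: a field added to the stats record later stays local until someone deliberately adds it to `SHARED_FIELDS`.
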
     **Commit style detection:**
     ```bash
     git log --oneline -20 --format="%s"

package/.claude/commands/wogi-setup-stack.md CHANGED

@@ -201,6 +201,72 @@ Check the tech options in `flow-tech-options.js` for known `skillsShId` mappings
 }
 ```
+## Pre-Built Skills (v1.7)
+WogiFlow ships with pre-built skill templates for 31 of the most common libraries. During skill generation:
+1. **Pre-built skills** are copied instantly (zero context cost, no network)
+2. **Unknown libraries** fall back to Context7 generation via sub-agent isolation
+Pre-built libraries include: react, next, vue, angular, svelte, tailwindcss, express, nestjs, fastify, hono, prisma, sequelize, mongoose, drizzle, typeorm, jest, vitest, playwright, cypress, mocha, zod, typescript, graphql, eslint, django, flask, fastapi, sqlalchemy, pytest, docker, terraform.
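
The split between pre-built skills and the fallback path could be sketched as a simple partition. This is an illustration only; `planSkillGeneration` is a hypothetical name, and the set below simply repeats the library list given above.

```javascript
// Hypothetical sketch: partition detected libraries into "copy the pre-built
// template" and "generate via an isolated sub-agent".
const PREBUILT_SKILLS = new Set([
  'react', 'next', 'vue', 'angular', 'svelte', 'tailwindcss',
  'express', 'nestjs', 'fastify', 'hono',
  'prisma', 'sequelize', 'mongoose', 'drizzle', 'typeorm',
  'jest', 'vitest', 'playwright', 'cypress', 'mocha',
  'zod', 'typescript', 'graphql', 'eslint',
  'django', 'flask', 'fastapi', 'sqlalchemy', 'pytest',
  'docker', 'terraform',
]);

function planSkillGeneration(libraries) {
  const prebuilt = [];
  const needsSubAgent = [];
  for (const lib of libraries) {
    const target = PREBUILT_SKILLS.has(lib.toLowerCase()) ? prebuilt : needsSubAgent;
    target.push(lib);
  }
  return { prebuilt, needsSubAgent };
}

console.log(planSkillGeneration(['react', 'zod', 'left-pad']));
```
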
+### Sub-Agent Isolation for Unknown Libraries
+When `--fetch-docs` encounters libraries not in the pre-built set:
+1. **Skip Context7 in main context** — do NOT fetch docs directly
+2. **Spawn a sub-agent** using the Agent tool with `subagent_type=general-purpose`:
+   ```
+   Agent(subagent_type="general-purpose", prompt="Generate a WogiFlow skill for [library].
+   1. Call mcp__MCP_DOCKER__resolve-library-id with libraryName='[library]'
+   2. Call mcp__MCP_DOCKER__get-library-docs with the resolved ID, topic='best practices patterns common mistakes', tokens=8000
+   3. Extract patterns, anti-patterns, and conventions from the docs
+   4. Write skill files to .claude/skills/[library]/ using the skill template format
+   5. Return ONLY the file paths created (not the doc content)
+   ")
+   ```
+3. **Main context receives only file paths** — the full docs stay in the sub-agent's context
+4. **If sub-agent fails**, create a stub skill with placeholder content
+This prevents unknown libraries from consuming the main context window.
+## Documentation Freshness (`--check-freshness`)
+Check if pre-built skills have stale documentation:
+```
+/wogi-setup-stack --check-freshness
+```
+This reads `lastDocCheck` from each skill's frontmatter and flags skills older than 90 days (configurable via `config.skills.freshnessThreshold`).
+Output:
+```
+Documentation Freshness Check:
+  3 skills may have outdated documentation:
+    react (95 days since last check)
+    express (120 days)
+    prisma (91 days)
+  Run `/wogi-setup-stack --refresh-stale` to update.
+```
+## Refresh Stale Skills (`--refresh-stale`)
+Update skills flagged as stale:
+```
+/wogi-setup-stack --refresh-stale
+```
+For each stale skill:
+1. Spawn a sub-agent to fetch latest docs via Context7
+2. Compare new docs against existing skill content
+3. If changes detected: update skill files and `lastDocCheck` date
+4. If no changes: bump `lastDocCheck` only
+5. Display summary of changes
 ## Re-running the Wizard
 You can run the wizard again to:

package/.claude/commands/wogi-start.md CHANGED

@@ -330,6 +330,34 @@ At each execution milestone, update the workflow phase. These are no-ops when ph
 If a transition fails (wrong current phase), it's non-blocking — log and continue.
+### Task Checkpoints (when `config.proactiveCompaction.enabled`)
+At each phase boundary, save a task checkpoint and check if proactive compaction is needed. This enables lossless recovery after auto-compact.
+**At EVERY phase transition listed above**, also:
+1. Save checkpoint: Record task ID, current phase, completed scenarios, changed files, verification results to `.workflow/state/task-checkpoint.json`
+2. Check compaction: If context usage >= `proactiveCompaction.triggerThreshold` (default 75%), display compaction message and run `/wogi-compact` before proceeding
+**Checkpoint integration points:**
+| When | Checkpoint Action |
+|------|-------------------|
+| After explore phase completes | Save exploration summary + related files |
+| After spec is generated | Save spec path + acceptance criteria count |
+| After each scenario completes | Update scenario progress (completed/pending) |
+| After criteria check | Save verification results |
+| Before final validation | Save all changed files list |
+| After task completion | Clear checkpoint |
+**Auto-compact recovery** (on session resume):
+1. Check `.workflow/state/task-checkpoint.json` for an active checkpoint
+2. If checkpoint exists with incomplete scenarios → display recovery message:
+   `Auto-compact detected. Restoring task state from checkpoint...`
+3. Reload: task ID, current phase, completed scenarios, spec path, changed files
+4. Continue execution from the next pending scenario
+**Haiku-powered summaries** (when `proactiveCompaction.useHaiku: true`):
+When compacting between phases, use the Agent tool with `model: "haiku"` to generate the compaction summary. This preserves Opus context for the actual implementation work.
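
The recovery steps above can be sketched as below. The checkpoint shape is an assumption built from the items the documentation says are saved (task ID, phase, scenarios, spec path, changed files); the field names, the sample values, and the function `planResume` are all hypothetical and not taken from `flow-task-checkpoint.js`.

```javascript
// Hypothetical checkpoint shape; every field name and value here is assumed.
const exampleCheckpoint = {
  taskId: 'wf-a1b2c3d4',
  phase: 'scenario',
  scenarios: ['login succeeds', 'login fails', 'session expires'],
  completedScenarios: ['login succeeds'],
  specPath: '.workflow/specs/example-spec.md',
  changedFiles: ['src/auth/login.js'],
};

function planResume(checkpoint) {
  // No checkpoint on disk means there is nothing to recover.
  if (!checkpoint) return { resume: false };
  const done = new Set(checkpoint.completedScenarios);
  const pending = checkpoint.scenarios.filter((name) => !done.has(name));
  // A checkpoint with no pending scenarios is treated as finished work.
  if (pending.length === 0) return { resume: false };
  return {
    resume: true,
    message: 'Auto-compact detected. Restoring task state from checkpoint...',
    taskId: checkpoint.taskId,
    phase: checkpoint.phase,
    nextScenario: pending[0],
  };
}

console.log(planResume(exampleCheckpoint));
```
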
 ### Execution Flow
 ```
@@ -804,6 +832,10 @@ Return a structured report:
 ```javascript
 // Launch all in parallel (single message, multiple Task tool calls)
+// When hybrid mode is enabled (config.hybrid.enabled), use the model parameter
+// to route sub-agents to the appropriate model tier.
+// Routing is provided by getAgentModel() from flow-prompt-template.js:
+//   explore → sonnet, research → sonnet, search → haiku, judging → opus
 Task(subagent_type=Explore, prompt="Codebase Analyzer: ...")
 Task(subagent_type=Explore, prompt="Best Practices: ...")
 Task(subagent_type=Explore, prompt="Version Verifier: ...")
@@ -813,6 +845,21 @@ Task(subagent_type=Explore, prompt="Standards Preview: ...")
 Task(subagent_type=Explore, prompt="Consumer Impact Analyzer: ...")
 ```
+**Hybrid Model Routing (S4):**
+When `config.hybrid.enabled` is `true`, use the Agent tool's `model` parameter to route sub-agents:
+| Sub-Agent Type | Agent `model` Parameter | Rationale |
+|----------------|------------------------|-----------|
+| Explore/Research | `"sonnet"` | Good analysis capability, saves Opus context |
+| Code Review | `"sonnet"` | Balanced quality for review tasks |
+| Simple Lookup/Search | `"haiku"` | Fast and cheap for file searches |
+| Complex Reasoning | `"opus"` | Only for architecture/planning decisions |
+| Compaction Summary | `"haiku"` | Summaries don't need premium models |
+| Eval Judging | `"opus"` (1) + `"sonnet"` (2) | Multi-judge composition from eval config |
+The routing table is configured in `scripts/flow-prompt-template.js` and can be overridden via `config.hybrid.routing.overrides`. Capability scores from `.workflow/models/capabilities/*.yaml` are consulted when `checkCapabilities` is true — if a model's score for the task type is below the `capabilityThreshold` (default: 5), the task is escalated to the next tier.
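
The routing and escalation rule in the paragraph above might be sketched like this. It is not the package's `getAgentModel()`. The task-type keys, the placement of `checkCapabilities` and `capabilityThreshold` under `config.hybrid.routing`, the shape of the capability scores, and the function name `pickModel` are all assumptions. Eval judging, which uses several models at once, is left out.

```javascript
// Hypothetical sketch of tiered model routing with one-step escalation.
const TIERS = ['haiku', 'sonnet', 'opus']; // cheapest to most capable
const DEFAULT_ROUTING = {
  explore: 'sonnet',
  research: 'sonnet',
  review: 'sonnet',
  search: 'haiku',
  reasoning: 'opus',
  compaction: 'haiku',
};

function pickModel(taskType, config, capabilityScores = {}) {
  const hybrid = config.hybrid || {};
  // Hybrid routing off: the caller keeps whatever model it is already using.
  if (!hybrid.enabled) return null;

  const routing = hybrid.routing || {};
  const overrides = routing.overrides || {};
  let model = overrides[taskType] || DEFAULT_ROUTING[taskType] || 'opus';

  if (routing.checkCapabilities) {
    const threshold =
      typeof routing.capabilityThreshold === 'number' ? routing.capabilityThreshold : 5;
    const score = (capabilityScores[model] || {})[taskType];
    const tier = TIERS.indexOf(model);
    // Escalate one tier when the model's score for this task type is too low.
    if (typeof score === 'number' && score < threshold && tier >= 0 && tier < TIERS.length - 1) {
      model = TIERS[tier + 1];
    }
  }
  return model;
}

console.log(pickModel('search', { hybrid: { enabled: true } }));
```

Unknown task types fall back to the top tier here, a conservative choice made for this sketch and not something the source states.
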
 **After all agents complete**, display a consolidated research summary:
 **Output Format:**
@@ -1801,9 +1848,32 @@ Phase commands:
 ### Scenario keeps failing after max retries
 - Stop and report: "Scenario X failed after N attempts. Issue: [description]"
 - Leave task in inProgress
-- **Auto-suggest hypothesis debugging**: When a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents that analyze competing theories about the root cause
+- **Best-of-N fallback (high-risk tasks)**: When a HIGH-RISK task (architecture, migration, refactor, or complexity HIGH + files > 10) fails 3+ times, auto-suggest Best-of-N:
+  ```
+  This high-risk task has failed 3 times. Would you like to try Best-of-N?
+  → Spawn 2 alternative implementation approaches in isolated worktrees
+  → Opus judges the best approach against the spec
+  ```
+  Use `checkFallbackTrigger()` from `flow-best-of-n.js` to determine if Best-of-N applies.
+  If the task is NOT high-risk: suggest `/wogi-debug-hypothesis` instead (competing theories about root cause).
+- **Auto-suggest hypothesis debugging**: For non-high-risk tasks, when a scenario fails 3+ times, suggest running `/wogi-debug-hypothesis "[failure description]"` to spawn parallel investigation agents
 - User can investigate and re-run `/wogi-start TASK-XXX` to continue
+### Best-of-N auto-suggestion (high-risk tasks)
+When starting a task, if `config.bestOfN.enabled` is true:
+1. Run `assessRisk()` from `flow-best-of-n.js` with the task's type, description, and file count
+2. If `shouldSuggest` is true, display:
+   ```
+   This is a high-risk task. Would you like to use Best-of-N?
+   → Spawn 3 approaches in parallel (isolated worktrees)
+   → Opus selects the best implementation
+   Options: [Yes, use Best-of-N] [No, proceed normally]
+   ```
+3. If user confirms: spawn N agents using `Agent(isolation: "worktree")` with variation strategy from `getVariationStrategy()`
+4. After all complete: spawn Opus judge using `buildSelectionPrompt()` to select winner
+5. Apply winner, clean up losing worktrees
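
The high-risk rule and the fallback choice described in this hunk could be sketched as below. This is not the package's `assessRisk()` or `checkFallbackTrigger()`; the function names and input shape are hypothetical. Reading "architecture, migration, refactor, or complexity HIGH + files > 10" as "one of those task types, or HIGH complexity together with more than 10 files" is an interpretation.

```javascript
// Hypothetical sketch of the high-risk test and the repeated-failure fallback.
const HIGH_RISK_TYPES = ['architecture', 'migration', 'refactor'];

function isHighRisk({ type, complexity, fileCount }) {
  // Assumed reading: risky task type, OR high complexity touching many files.
  return HIGH_RISK_TYPES.includes(type) || (complexity === 'HIGH' && fileCount > 10);
}

function suggestOnRepeatedFailure(task, failureCount) {
  if (failureCount < 3) return 'retry';
  return isHighRisk(task) ? 'best-of-n' : 'debug-hypothesis';
}

console.log(suggestOnRepeatedFailure({ type: 'migration', complexity: 'LOW', fileCount: 2 }, 3));
console.log(suggestOnRepeatedFailure({ type: 'bugfix', complexity: 'LOW', fileCount: 2 }, 3));
```
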
 ### Quality gate keeps failing
 - Report which gate is failing and why
 - Attempt to fix automatically

package/.claude/rules/README.md ADDED

@@ -0,0 +1,60 @@
+# Project Rules
+This directory contains coding rules and patterns for this project, organized by category.
+## Structure
+```
+.claude/rules/
+├── code-style/           # Naming conventions, formatting
+│   └── naming-conventions.md
+├── security/             # Security patterns and practices
+│   └── security-patterns.md
+├── architecture/         # Design decisions and patterns
+│   ├── component-reuse.md
+│   └── model-management.md
+└── README.md
+```
+## How Rules Work
+Rules are automatically loaded by Claude Code based on:
+- **alwaysApply: true** - Rule is always loaded
+- **alwaysApply: false** - Rule is loaded based on `globs` or `description` relevance
+- **globs** - File patterns that trigger rule loading
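
To make the loading semantics above concrete, here is a small sketch of how `alwaysApply` and `globs` could combine. It is not how Claude Code actually matches rules. The glob converter is a deliberately minimal one written for this example (it handles only `*`, `**`, and `**/`), `description`-based relevance is not modeled, and both function names are hypothetical.

```javascript
// Minimal glob-to-RegExp converter for illustration; supports *, ** and **/ only.
function globToRegExp(glob) {
  let pattern = '';
  for (let i = 0; i < glob.length; i += 1) {
    const ch = glob[i];
    if (ch === '*') {
      if (glob[i + 1] === '*') {
        if (glob[i + 2] === '/') {
          pattern += '(?:.*/)?'; // "**/" matches zero or more directories
          i += 2;
        } else {
          pattern += '.*'; // bare "**" matches anything, including "/"
          i += 1;
        }
      } else {
        pattern += '[^/]*'; // "*" matches within one path segment
      }
    } else if ('\\^$+?.()|[]{}'.includes(ch)) {
      pattern += '\\' + ch; // escape RegExp metacharacters
    } else {
      pattern += ch;
    }
  }
  return new RegExp('^' + pattern + '$');
}

function shouldLoadRule(frontmatter, filePath) {
  if (frontmatter.alwaysApply === true) return true;
  const globs = [].concat(frontmatter.globs || []); // accept a string or an array
  return globs.some((glob) => globToRegExp(glob).test(filePath));
}

console.log(shouldLoadRule({ alwaysApply: false, globs: 'src/**/*.ts' }, 'src/api/user.ts'));
```
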
+## Adding New Rules
+1. Choose the appropriate category subdirectory
+2. Create a `.md` file with frontmatter:
+```yaml
+---
+alwaysApply: false
+description: "Brief description for relevance matching"
+globs: src/**/*.ts  # Optional: only load for these files
+---
+```
+3. Write the rule content in markdown
+## Categories
+| Category | Purpose |
+|----------|---------|
+| code-style | Naming conventions, formatting, file structure |
+| security | Security patterns, input validation, safe practices |
+| architecture | Design decisions, component patterns, system organization |
+## Auto-Generation
+Some rules can be auto-generated from `.workflow/state/decisions.md`:
+```bash
+node scripts/flow-rules-sync.js
+```
+The sync script will route rules to appropriate category subdirectories.
+---
+Last updated: 2026-01-12

package/.claude/rules/architecture/component-reuse.md ADDED

@@ -0,0 +1,38 @@
+---
+globs: src/components/**/*
+alwaysApply: false
+description: "Component reuse policy - always check app-map.md before creating components"
+---
+# Component Reuse Policy
+**Rule**: Always check `app-map.md` before creating any component.
+## Priority Order
+1. **Use existing** - Check if component already exists in app-map
+2. **Add variant** - Extend existing component with a new variant
+3. **Extend** - Create a wrapper/HOC around existing component
+4. **Create new** - Only as last resort
+## Before Creating Components
+```bash
+# Check app-map first
+grep -i "button" .workflow/state/app-map.md
+# Or search codebase
+grep -r "Button" src/components/
+```
+## Variant vs New Component
+Prefer variants when:
+- Same base functionality, different appearance
+- Same HTML structure, different styling
+- Same component, different size/color/state
+Create new component when:
+- Fundamentally different functionality
+- Different DOM structure
+- Different state management

package/.claude/rules/architecture/document-structure.md ADDED

@@ -0,0 +1,76 @@
+---
+alwaysApply: true
+description: "All AI-context documents must use PIN markers for targeted context loading"
+---
+# Document Structure for AI Context
+All documents in `.workflow/` that are used as AI context MUST follow the PIN standard.
+## Required Structure
+### 1. Header with PIN List
+Every document starts with a comment listing all pins in the document:
+```markdown
+<!-- PINS: pin1, pin2, pin3 -->
+```
+### 2. Section PIN Markers
+Each major section has a PIN marker comment:
+```markdown
+### Section Title
+<!-- PIN: section-specific-pin -->
+[Content]
+```
+### 3. PIN Naming Convention
+- Use kebab-case: `user-authentication`, not `userAuthentication`
+- Use semantic names: `error-handling`, not `eh`
+- Use compound names for specificity: `json-parse-safety`
+## Why PINs Matter
+The PIN system enables:
+1. **Targeted context loading**: Only load sections relevant to current task
+2. **Cheaper model routing**: Haiku can fetch only relevant sections for Opus
+3. **Change detection**: Hash sections independently for smart invalidation
+4. **Cross-reference**: Link sections by PIN across documents
+## Example Document
+```markdown
+# Config Reference
+<!-- PINS: database, authentication, api-keys, environment -->
+## Database Settings
+<!-- PIN: database -->
+| Setting | Default | Description |
+|---------|---------|-------------|
+## Authentication
+<!-- PIN: authentication -->
+| Setting | Default | Description |
+|---------|---------|-------------|
+```
+## Parsing
+The PIN system automatically parses documents with:
+- `flow-section-index.js` - Generates section index with pins
+- `flow-section-resolver.js` - Resolves sections by PIN lookup
+- `getSectionsByPins(['auth', 'security'])` - Fetch only relevant sections
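
A PIN lookup along the lines described above could be sketched as follows. This is not the code in `flow-section-index.js` or `flow-section-resolver.js`; the function names `parsePinnedSections` and `sectionsByPins`, the returned shape, and the sample document content are assumptions made for illustration.

```javascript
// Hypothetical sketch: index a document by PIN markers, then fetch sections by PIN.
function parsePinnedSections(markdown) {
  const declared = [];
  const sections = [];
  let lastHeading = null;
  let current = null;

  for (const line of markdown.split('\n')) {
    const pinList = line.match(/^<!--\s*PINS:\s*(.*?)\s*-->$/);
    if (pinList) {
      declared.push(...pinList[1].split(',').map((p) => p.trim()).filter(Boolean));
      continue;
    }
    const heading = line.match(/^#{1,6}\s+(.*)$/);
    if (heading) {
      lastHeading = heading[1];
      current = null; // a new heading closes the previous pinned section
      continue;
    }
    const pin = line.match(/^<!--\s*PIN:\s*([a-z0-9-]+)\s*-->$/);
    if (pin) {
      current = { pin: pin[1], title: lastHeading, content: [] };
      sections.push(current);
      continue;
    }
    if (current) current.content.push(line);
  }
  return { declared, sections };
}

function sectionsByPins(markdown, pins) {
  const wanted = new Set(pins);
  return parsePinnedSections(markdown).sections.filter((s) => wanted.has(s.pin));
}

// Sample document for illustration only.
const exampleDoc = [
  '# Config Reference',
  '<!-- PINS: database, authentication -->',
  '## Database Settings',
  '<!-- PIN: database -->',
  'Connection pool size: 10',
  '## Authentication',
  '<!-- PIN: authentication -->',
  'Tokens expire after 1 hour',
].join('\n');
console.log(sectionsByPins(exampleDoc, ['database']));
```

Comparing `declared` against the pins actually found in `sections` is one cheap way to validate that a document's header list and its section markers agree.
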
+## Files That Must Have PINs
+| File | Required PINs |
+|------|---------------|
+| `decisions.md` | Per coding rule/pattern |
+| `app-map.md` | Per component/screen |
+| `product.md` | Per product section |
+| `stack.md` | Per technology |
+## Validation
+Run `node scripts/flow-section-index.js --force` to regenerate the index.
+Check `.workflow/state/section-index.json` for indexed sections and their pins.