npm - @massu/core - Versions diffs - 0.5.0 → 0.6.0 - Mend

@massu/core 0.5.0 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (118) hide show

package/README.md +40 -0
package/agents/massu-architecture-reviewer.md +104 -0
package/agents/massu-blast-radius-analyzer.md +84 -0
package/agents/massu-competitive-scorer.md +126 -0
package/agents/massu-help-sync.md +73 -0
package/agents/massu-migration-writer.md +94 -0
package/agents/massu-output-scorer.md +87 -0
package/agents/massu-pattern-reviewer.md +84 -0
package/agents/massu-plan-auditor.md +170 -0
package/agents/massu-schema-sync-verifier.md +70 -0
package/agents/massu-security-reviewer.md +98 -0
package/agents/massu-ux-reviewer.md +106 -0
package/commands/_shared-preamble.md +53 -23
package/commands/_shared-references/auto-learning-protocol.md +71 -0
package/commands/_shared-references/blast-radius-protocol.md +76 -0
package/commands/_shared-references/security-pre-screen.md +64 -0
package/commands/_shared-references/test-first-protocol.md +87 -0
package/commands/_shared-references/verification-table.md +52 -0
package/commands/massu-article-review.md +343 -0
package/commands/massu-autoresearch/references/eval-runner.md +84 -0
package/commands/massu-autoresearch/references/safety-rails.md +125 -0
package/commands/massu-autoresearch/references/scoring-protocol.md +151 -0
package/commands/massu-autoresearch.md +258 -0
package/commands/massu-batch.md +44 -12
package/commands/massu-bearings.md +42 -8
package/commands/massu-checkpoint.md +588 -0
package/commands/massu-ci-fix.md +2 -2
package/commands/massu-command-health.md +132 -0
package/commands/massu-command-improve.md +232 -0
package/commands/massu-commit.md +205 -44
package/commands/massu-create-plan.md +239 -57
package/commands/massu-data/references/common-queries.md +79 -0
package/commands/massu-data/references/table-guide.md +50 -0
package/commands/massu-data.md +66 -0
package/commands/massu-dead-code.md +29 -34
package/commands/massu-debug/references/auto-learning.md +61 -0
package/commands/massu-debug/references/codegraph-tracing.md +80 -0
package/commands/massu-debug/references/common-shortcuts.md +98 -0
package/commands/massu-debug/references/investigation-phases.md +294 -0
package/commands/massu-debug/references/report-format.md +107 -0
package/commands/massu-debug.md +105 -386
package/commands/massu-docs.md +1 -1
package/commands/massu-full-audit.md +61 -0
package/commands/massu-gap-enhancement-analyzer.md +276 -16
package/commands/massu-golden-path/references/approval-points.md +216 -0
package/commands/massu-golden-path/references/competitive-mode.md +273 -0
package/commands/massu-golden-path/references/error-handling.md +121 -0
package/commands/massu-golden-path/references/phase-0-requirements.md +53 -0
package/commands/massu-golden-path/references/phase-1-plan-creation.md +168 -0
package/commands/massu-golden-path/references/phase-2-implementation.md +397 -0
package/commands/massu-golden-path/references/phase-2.5-gap-analyzer.md +156 -0
package/commands/massu-golden-path/references/phase-3-simplify.md +40 -0
package/commands/massu-golden-path/references/phase-4-commit.md +94 -0
package/commands/massu-golden-path/references/phase-5-push.md +116 -0
package/commands/massu-golden-path/references/phase-5.5-production-verify.md +170 -0
package/commands/massu-golden-path/references/phase-6-completion.md +113 -0
package/commands/massu-golden-path/references/qa-evaluator-spec.md +137 -0
package/commands/massu-golden-path/references/sprint-contract-protocol.md +117 -0
package/commands/massu-golden-path/references/vr-visual-calibration.md +73 -0
package/commands/massu-golden-path.md +114 -848
package/commands/massu-guide.md +72 -69
package/commands/massu-hooks.md +27 -12
package/commands/massu-hotfix.md +221 -144
package/commands/massu-incident.md +49 -20
package/commands/massu-infra-audit.md +187 -0
package/commands/massu-learning-audit.md +211 -0
package/commands/massu-loop/references/auto-learning.md +49 -0
package/commands/massu-loop/references/checkpoint-audit.md +40 -0
package/commands/massu-loop/references/guardrails.md +17 -0
package/commands/massu-loop/references/iteration-structure.md +115 -0
package/commands/massu-loop/references/loop-controller.md +188 -0
package/commands/massu-loop/references/plan-extraction.md +78 -0
package/commands/massu-loop/references/vr-plan-spec.md +140 -0
package/commands/massu-loop-playwright.md +9 -9
package/commands/massu-loop.md +115 -670
package/commands/massu-new-pattern.md +423 -0
package/commands/massu-perf.md +422 -0
package/commands/massu-plan-audit.md +1 -1
package/commands/massu-plan.md +389 -122
package/commands/massu-production-verify.md +433 -0
package/commands/massu-push.md +62 -378
package/commands/massu-recap.md +29 -3
package/commands/massu-rollback.md +613 -0
package/commands/massu-scaffold-hook.md +2 -4
package/commands/massu-scaffold-page.md +2 -3
package/commands/massu-scaffold-router.md +1 -2
package/commands/massu-security.md +619 -0
package/commands/massu-simplify.md +115 -85
package/commands/massu-squirrels.md +2 -2
package/commands/massu-tdd.md +38 -22
package/commands/massu-test.md +3 -3
package/commands/massu-type-mismatch-audit.md +469 -0
package/commands/massu-ui-audit.md +587 -0
package/commands/massu-verify-playwright.md +287 -32
package/commands/massu-verify.md +150 -46
package/dist/cli.js +146 -95
package/package.json +6 -2
package/patterns/build-patterns.md +302 -0
package/patterns/component-patterns.md +246 -0
package/patterns/display-patterns.md +185 -0
package/patterns/form-patterns.md +890 -0
package/patterns/integration-testing-checklist.md +445 -0
package/patterns/security-patterns.md +219 -0
package/patterns/testing-patterns.md +569 -0
package/patterns/tool-routing.md +81 -0
package/patterns/ui-patterns.md +371 -0
package/protocols/plan-implementation.md +267 -0
package/protocols/recovery.md +225 -0
package/protocols/verification.md +404 -0
package/reference/command-taxonomy.md +178 -0
package/reference/cr-rules-reference.md +76 -0
package/reference/hook-execution-order.md +148 -0
package/reference/lessons-learned.md +175 -0
package/reference/patterns-quickref.md +208 -0
package/reference/standards.md +135 -0
package/reference/subagents-reference.md +17 -0
package/reference/vr-verification-reference.md +867 -0
package/src/commands/install-commands.ts +149 -53

package/commands/massu-golden-path.md CHANGED Viewed

@@ -1,66 +1,53 @@
 ---
 name: massu-golden-path
-description: Complete end-to-end workflow from requirements to production push with minimal pause points
+description: "When user wants full autonomous implementation: 'build this', 'implement this feature', 'golden path', or provides a plan file to execute end-to-end"
 allowed-tools: Bash(*), Read(*), Write(*), Edit(*), Grep(*), Glob(*), Task(*), mcp__plugin_playwright_playwright__*, mcp__playwright__*
 ---
 name: massu-golden-path
-> **Shared rules apply.** Read `.claude/commands/_shared-preamble.md` before proceeding. CR-9 enforced.
+> **Shared rules apply.** Read `.claude/commands/_shared-preamble.md` before proceeding. CR-12, CR-9 enforced.
 # Massu Golden Path: Requirements to Production Push
 ## Objective
 Execute the COMPLETE development workflow in one continuous run:
-**Requirements --> Plan Creation --> Plan Audit --> Implementation --> Browser Verification --> Simplification --> Commit --> Push**
+**Requirements -> Plan Creation -> Plan Audit -> Implementation -> Gap Analysis -> Simplification -> Commit -> Push**
 This command has FULL FEATURE PARITY with the individual commands it replaces:
-`/massu-create-plan` --> `/massu-plan` --> `/massu-loop` --> `/massu-loop-playwright` --> `/massu-simplify` --> `/massu-commit` --> `/massu-push`
+`/massu-create-plan` -> `/massu-plan` -> `/massu-loop` -> `/massu-loop-playwright` -> `/massu-simplify` -> `/massu-commit` -> `/massu-push`
 ---
 ## NON-NEGOTIABLE RULES
-- **Complete workflow** -- ALL phases must execute, no skipping
+- **Complete workflow (CR-11)** -- ALL phases must execute, no skipping. 100% plan coverage required
 - **Zero failures** -- Each phase gate must pass before proceeding
-- **Proof required** -- Show output of each phase gate
+- **Proof required (CR-1)** -- VR-* output pasted, not summarized. "I verified" without output = invalid
 - **FIX ALL ISSUES ENCOUNTERED (CR-9)** -- Whether from current changes or pre-existing
 - **MEMORY IS MANDATORY (CR-38)** -- Persist ALL learnings before session ends
+- **Stagnation bail-out (CR-37)** -- If same item fails 3+ times, replan instead of grinding
 ---
-## APPROVAL POINTS (Max 4 Pauses)
+## APPROVAL POINTS (Max 4 Pauses, 5 with --competitive)
 ```
 +-----------------------------------------------------------------------------+
 |   THIS COMMAND RUNS STRAIGHT THROUGH THE ENTIRE GOLDEN PATH.                |
-|   IT ONLY PAUSES FOR THESE APPROVAL POINTS:                                |
-|                                                                             |
-|   1. PLAN APPROVAL - After plan creation + audit (user reviews plan)        |
-|   2. NEW PATTERN APPROVAL - If a new pattern is needed (during any phase)   |
-|   3. COMMIT APPROVAL - Before creating the commit                           |
-|   4. PUSH APPROVAL - Before pushing to remote                               |
-|                                                                             |
+|   IT ONLY PAUSES FOR THESE APPROVAL POINTS:                                 |
+|                                                                              |
+|   1. PLAN APPROVAL - After plan creation + audit (user reviews plan)         |
+|   2. NEW PATTERN APPROVAL - If a new pattern is needed (during any phase)    |
+|   3. COMMIT APPROVAL - Before creating the commit                            |
+|   4. PUSH APPROVAL - Before pushing to remote                                |
+|   5. WINNER SELECTION - After competitive scoring (--competitive only)       |
+|                                                                              |
 |   EVERYTHING ELSE RUNS AUTOMATICALLY WITHOUT STOPPING.                      |
 +-----------------------------------------------------------------------------+
 ```
-### Approval Point Format
-```
-===============================================================================
-APPROVAL REQUIRED: [TYPE]
-===============================================================================
-[Details]
-OPTIONS:
-  - "approve" / "yes" to continue
-  - "modify" to request changes
-  - "abort" to stop the golden path
-===============================================================================
-```
+Read `references/approval-points.md` for the exact format and options for each approval point.
 After receiving approval, immediately continue. Do NOT ask "shall I continue?" -- just proceed.
@@ -73,888 +60,165 @@ After receiving approval, immediately continue. Do NOT ask "shall I continue?" -
 | **Task Description** | `/massu-golden-path "Implement feature X"` | Full flow from Phase 0 |
 | **Plan File** | `/massu-golden-path /path/to/plan.md` | Skip to Phase 1C (audit) |
 | **Continue** | `/massu-golden-path "Continue [feature]"` | Resume from session state |
+| **Competitive** | `/massu-golden-path --competitive "task"` | Spawn 2-3 competing implementations with bias presets, score, select winner |
+| **Competitive (3 agents)** | `/massu-golden-path --competitive --agents 3 "task"` | 3 agents with quality/ux/robust biases (default: 2 agents = quality + robust) |
+| **External Loop** | `/massu-golden-path --external /path/to/plan.md` | Phase 2 uses `scripts/loop-external.sh` for context-fresh iterations |
 ---
-## PHASE 0: REQUIREMENTS & CONTEXT LOADING
-### 0.1 Session Context Loading
-```
-[GOLDEN PATH -- PHASE 0: REQUIREMENTS & CONTEXT]
-```
+## PHASE OVERVIEW
-- Read `session-state/CURRENT.md` for any prior state
-- Read `massu.config.yaml` for project configuration
-- Search memory files for relevant prior context
-### 0.2 Requirements Coverage Map
-Initialize ALL dimensions as `pending`:
-| # | Dimension | Status | Resolved By |
-|---|-----------|--------|-------------|
-| D1 | Problem & Scope | pending | User request + interview |
-| D2 | Users & Personas | pending | Interview |
-| D3 | Data Model | pending | Phase 1A (Config/Schema Reality Check) |
-| D4 | Backend / API | pending | Phase 1A (Codebase Reality Check) |
-| D5 | Frontend / UX | pending | Interview + Phase 1A |
-| D6 | Auth & Permissions | pending | Phase 1A (Security Pre-Screen) |
-| D7 | Error Handling | pending | Phase 1A (Pattern Compliance) |
-| D8 | Security | pending | Phase 1A (Security Pre-Screen) |
-| D9 | Edge Cases | pending | Phase 1A (Question Filtering) |
-| D10 | Performance | pending | Phase 1A (Pattern Compliance) |
-### 0.3 Ambiguity Detection (7 Signals)
-| Signal | Description |
-|--------|-------------|
-| A1 | Vague scope -- no clear boundary |
-| A2 | No success criteria -- no measurable outcome |
-| A3 | Implicit requirements -- unstated but necessary |
-| A4 | Multi-domain -- spans 3+ domains |
-| A5 | Contradictions -- conflicting constraints |
-| A6 | No persona -- unclear who benefits |
-| A7 | New integration -- external service not yet in codebase |
-**Score >= 2**: Enter interview loop (0.4). **Score 0-1**: Fast-track to Phase 1A.
-### 0.4 Interview Loop (When Triggered)
-Ask via AskUserQuestion, one question at a time:
-1. Show compact coverage status: `Coverage: D1:done D2:pending ...`
-2. Provide 2-4 curated options (never open-ended)
-3. Push back on contradictions and over-engineering
-4. Self-terminate when D1, D2, D5 covered
-5. Escape hatch: user says "skip" / "enough" / "just do it" --> mark remaining as `n/a`
+| Phase | Name | Key Actions | Approval Gate |
+|-------|------|-------------|---------------|
+| 0 | Requirements & Context | Session context, ambiguity detection, interview loop | -- |
+| 1 | Plan Creation & Audit | Research, plan generation, audit loop | PLAN APPROVAL |
+| 2 | Implementation | Item loop, multi-perspective review, verification audit, browser testing | NEW PATTERN (if needed) |
+| 2-COMP | Competitive Implementation | Spawn N agents with bias presets, score, select winner (`--competitive` only) | WINNER SELECTION |
+| 2.5 | Gap & Enhancement Analysis | Find+fix gaps, UX issues, security, pattern compliance; loop until zero | -- |
+| 3 | Simplification | Pattern scanner, parallel semantic review, apply findings | -- |
+| 4 | Pre-Commit Verification | Verification gates, quality scoring | COMMIT APPROVAL |
+| 5 | Push Verification | `scripts/push-verify.sh`, CI monitoring via `scripts/ci-status.sh` | PUSH APPROVAL |
+| 6 | Completion | Final report, plan update, auto-learning, feature registration | -- |
 ---
-## PHASE 1: PLAN CREATION & AUDIT
-### Phase 1A: Research & Reality Check
-```
-[GOLDEN PATH -- PHASE 1A: RESEARCH & REALITY CHECK]
-```
-**If plan file was provided**: Skip to Phase 1C.
-#### 1A.1 Feature Understanding
-- Document: exact user request, feature type, affected domains
-- Search codebase for similar features, tool modules, existing patterns
-- Read `massu.config.yaml` for relevant config sections
-#### 1A.2 Config & Schema Reality Check
-For features touching config or databases:
-- Parse `massu.config.yaml` and verify all referenced config keys exist
-- Check SQLite schema for affected tables (`getCodeGraphDb`, `getDataDb`, `getMemoryDb`)
-- Verify tool definitions in `tools.ts` for any tools being modified
-Document: existing config keys, required new keys, required schema changes.
-#### 1A.3 Config-Code Alignment (VR-CONFIG)
-If feature uses config-driven values:
-```bash
-# Check config keys used in code
-grep -rn "getConfig()" packages/core/src/ | grep -oP 'config\.\w+' | sort -u
-# Compare to massu.config.yaml structure
-```
-#### 1A.4 Codebase Reality Check
-- Verify target directories/files exist
-- Read similar tool modules and handlers
-- Load relevant pattern files (build/testing/security/database/mcp)
-#### 1A.5 Blast Radius Analysis (CR-10)
-**MANDATORY when plan changes any constant, export name, config key, or tool name.**
-1. Identify ALL changed values (old --> new)
-2. Codebase-wide grep for EACH value
-3. If plan deletes files: verify no remaining imports or references
-4. Categorize EVERY occurrence: CHANGE / KEEP (with reason) / INVESTIGATE
-5. Resolve ALL INVESTIGATE to 0. Add ALL CHANGE items as plan deliverables.
-#### 1A.6 Pattern Compliance Check
-Check applicable patterns: ESM imports (.ts extensions), config access (getConfig()), tool registration (3-function pattern), hook compilation (esbuild), SQLite DB access (getCodeGraphDb/getDataDb/getMemoryDb), memDb lifecycle (try/finally close).
-Read most similar tool module for patterns used.
-#### 1A.7 Tool Registration Check (CR-11)
-For EVERY new MCP tool planned -- verify a corresponding registration item exists in the plan (definitions + routing + handler in `tools.ts`). If NOT, ADD IT.
-#### 1A.8 Question Filtering
-1. List all open questions
-2. Self-answer anything answerable by reading code or config
-3. Surface only business logic / UX / scope / priority questions to user via AskUserQuestion
-4. If all self-answerable, skip user prompt
-#### 1A.9 Security Pre-Screen (5 Dimensions)
-| Dim | Check | If Triggered |
-|-----|-------|-------------|
-| S1 | PII / Sensitive Data | Add access controls |
-| S2 | Authentication | Verify auth checks |
-| S3 | Authorization | Add permission checks |
-| S4 | Injection Surfaces | Add input validation, parameterized queries |
-| S5 | Rate Limiting | Add rate limiting considerations |
-**BLOCKS_REMAINING must = 0 before proceeding.**
-Mark all coverage dimensions as `done` or `n/a`.
-### Phase 1B: Plan Generation
-```
-[GOLDEN PATH -- PHASE 1B: PLAN GENERATION]
-```
-Write plan to: `docs/plans/[YYYY-MM-DD]-[feature-name].md`
-**Plan structure** (P-XXX numbered items):
-- Overview (feature, complexity, domains, item count)
-- Requirements Coverage Map (D1-D10 all resolved)
-- Phase 1: Configuration Changes (massu.config.yaml)
-- Phase 2: Backend Implementation (tool modules, handlers, SQLite schema)
-- Phase 3: Frontend/Hook Implementation (hooks, plugin code)
-- Phase 4: Testing & Verification
-- Phase 5: Documentation
-- Verification Commands table
-- Item Summary table
-- Risk Assessment
-- Dependencies
-**Item numbering**: P1-XXX (config), P2-XXX (backend), P3-XXX (frontend/hooks), P4-XXX (testing), P5-XXX (docs).
+## PHASE 0: REQUIREMENTS & CONTEXT LOADING
-**Implementation Specificity Check**: Every item MUST have exact file path, exact content, insertion point, format matches target, verification command.
+Read `references/phase-0-requirements.md` for full details.
-### Phase 1C: Plan Audit Loop
+**Summary**: Load session context via memory tools. Build a 10-dimension requirements coverage map (D1-D10). Run ambiguity detection (7 signals). If ambiguity score >= 2, enter interview loop. Fast-track to Phase 1 when D1, D2, D5 covered or user says "skip" / "just do it".
-```
-[GOLDEN PATH -- PHASE 1C: PLAN AUDIT LOOP]
-```
-Run audit loop using subagent architecture (prevents early termination):
+---
-```
-iteration = 0
-WHILE true:
-  iteration += 1
-  result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
-    Audit iteration {iteration} for plan: {PLAN_PATH}
-    Execute ONE complete audit pass. Verify ALL deliverables.
-    Check: VR-PLAN-FEASIBILITY, VR-PLAN-SPECIFICITY, Pattern Alignment, Config Reality.
-    Fix any plan document gaps you find.
-    CRITICAL: Report GAPS_DISCOVERED as total gaps FOUND, EVEN IF you fixed them.
-    Finding N gaps and fixing all N = GAPS_DISCOVERED: N.
-    A clean pass finding nothing = GAPS_DISCOVERED: 0.
-  ")
-  gaps = parse GAPS_DISCOVERED from result
-  IF gaps == 0: BREAK (clean pass)
-  ELSE: CONTINUE (re-audit)
-  IF iteration >= 10: Report to user, ask how to proceed
-END WHILE
-```
+## PHASE 1: PLAN CREATION & AUDIT
-**VR-PLAN-FEASIBILITY**: Files exist, config keys valid, dependencies available, patterns documented.
-**VR-PLAN-SPECIFICITY**: Every item has exact path, exact content, insertion point, verification command.
-**Pattern Alignment**: Cross-reference ALL applicable patterns from CLAUDE.md and patterns/*.md.
+Read `references/phase-1-plan-creation.md` for full details.
-### Phase 1 Complete --> APPROVAL POINT #1: PLAN
+**Summary**: Three sub-phases:
+- **1A: Research & Reality Check** -- Feature understanding, codebase check, blast radius analysis (CR-25), pattern compliance, backend-frontend coupling (CR-12), question filtering, security pre-screen (6 dimensions).
+- **1B: Plan Generation** -- Write plan to `docs/plans/[YYYY-MM-DD]-[feature-name].md` with P-XXX numbered items across 6 phases.
+- **1C: Plan Audit Loop** -- Subagent architecture. Iterate until GAPS_DISCOVERED = 0. Max 10 iterations.
-```
-===============================================================================
-APPROVAL REQUIRED: PLAN
-===============================================================================
-Plan created and audited ({iteration} audit passes, 0 gaps).
-PLAN SUMMARY:
--------------------------------------------------------------------------------
-Feature: [name]
-File: [plan path]
-Total Items: [N]
-Phases: [list]
-Requirements Coverage: [X]/10 dimensions resolved
-Feasibility: VERIFIED (config, files, patterns, security)
-Audit Passes: {iteration} (final pass: 0 gaps)
--------------------------------------------------------------------------------
-OPTIONS:
-  - "approve" to begin implementation
-  - "modify: [changes]" to adjust plan
-  - "abort" to stop
-===============================================================================
-```
+**Gate**: APPROVAL POINT #1: PLAN
 ---
 ## PHASE 2: IMPLEMENTATION
-### Phase 2A: Plan Item Extraction & Setup
-```
-[GOLDEN PATH -- PHASE 2: IMPLEMENTATION]
-```
-1. Read plan from disk (NOT memory -- CR-5)
-2. Extract ALL deliverables into tracking table:
-| Item # | Type | Description | Location | Verification | Status |
-|--------|------|-------------|----------|--------------|--------|
-| P1-001 | CONFIG | ... | ... | VR-CONFIG | PENDING |
-3. Create VR-PLAN verification strategy:
-| # | VR-* Check | Target | Why Applicable | Status |
-|---|-----------|--------|----------------|--------|
-| 1 | VR-BUILD | Full project | Always | PENDING |
-4. Initialize session state with AUTHORIZED_COMMAND: massu-golden-path
-### Phase 2B: Implementation Loop
-For each plan item:
-1. **Pre-check**: Verify file exists, read current state
-2. **Execute**: Implement the item following established patterns
-3. **Guardrail**: Run `bash scripts/massu-pattern-scanner.sh` (ABORT if fails)
-4. **Verify**: Run applicable VR-* checks with proof
-5. **Update**: Mark item complete in tracking table
-**DO NOT STOP between items** unless:
-- New pattern needed (Approval Point #2)
-- True blocker (external service, credentials)
-- Critical error after 3 retries
-**Checkpoint Audit at phase boundaries** (after all P1-XXX, after all P2-XXX, etc.):
-```
-CHECKPOINT:
-[1] READ plan section          [2] GREP tool registrations    [3] LS modules
-[4] VR-CONFIG check            [5] VR-TOOL-REG check          [6] VR-HOOK-BUILD check
-[7] Pattern scanner            [8] npm run build               [9] cd packages/core && npx tsc --noEmit
-[10] npm test                  [11] VR-GENERIC check           [12] Security scanner
-[13] COUNT gaps --> IF > 0: FIX and return to [1]
-```
-### Phase 2C: Multi-Perspective Review
-After implementation, BEFORE verification loop -- spawn 3 review agents **IN PARALLEL**:
-```
-security_result = Task(subagent_type="massu-security-reviewer", model="opus", prompt="
-  Review implementation for plan: {PLAN_PATH}
-  Focus: Security vulnerabilities, auth gaps, input validation, data exposure.
-  Return structured result with SECURITY_GATE: PASS/FAIL.
-")
-architecture_result = Task(subagent_type="massu-architecture-reviewer", model="opus", prompt="
-  Review implementation for plan: {PLAN_PATH}
-  Focus: Design issues, coupling, pattern compliance, scalability.
-  Return structured result with ARCHITECTURE_GATE: PASS/FAIL.
-")
-quality_result = Task(subagent_type="massu-quality-reviewer", model="sonnet", prompt="
-  Review implementation for plan: {PLAN_PATH}
-  Focus: Code quality, ESM compliance, config-driven patterns, TypeScript strict mode, test coverage.
-  Return structured result with QUALITY_GATE: PASS/FAIL.
-")
-```
-Fix ALL CRITICAL/HIGH findings before proceeding. WARN findings = document and proceed.
-### Phase 2D: Verification Audit Loop
-```
-iteration = 0
-WHILE true:
-  iteration += 1
-  # Circuit breaker (CR-37)
-  IF iteration >= 3 AND same gaps as previous iteration:
-    AskUserQuestion: "Loop stalled after {iteration} passes. Re-plan / Continue / Stop?"
-  result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
-    Audit iteration {iteration} for plan: {PLAN_PATH}
-    Verify ALL deliverables with VR-* proof.
-    Check code quality (patterns, build, types, tests).
-    Check plan coverage (every item verified).
-    Fix any gaps you find.
-    CRITICAL: GAPS_DISCOVERED = total FOUND, even if fixed.
-    Finding 5 + fixing 5 = GAPS_DISCOVERED: 5 (NOT 0).
-  ")
-  gaps = parse GAPS_DISCOVERED from result
-  Output: "Verification iteration {iteration}: {gaps} gaps"
-  IF gaps == 0: BREAK
-  IF iteration >= 10: Report remaining gaps, ask user
-END WHILE
-```
-### Phase 2E: Post-Build Reflection + Memory Persist (CR-38)
-**MANDATORY -- reflection + memory write = ONE atomic action.**
-Answer these questions:
-1. "Now that I've built this, what would I have done differently?"
-2. "What should be refactored before moving on?"
-3. "Did we over-build? Is there a simpler way?"
-4. "Would a staff engineer approve this?" (Core Principle #9)
-**IMMEDIATELY write ALL learnings to memory/ files** -- failed approaches, new patterns, tool gotchas, architectural insights. DO NOT output reflections as text without writing to memory.
-Apply any low-risk refactors immediately. Log remaining suggestions in plan under `## Post-Build Reflection`.
-### Phase 2F: Documentation Sync (User-Facing Features)
-If plan includes ANY user-facing features (new MCP tools, config changes, hook changes):
-1. Update relevant documentation (README, API docs, config docs)
-2. Ensure tool descriptions match implementation
-3. Update config schema documentation if config keys changed
-Skip ONLY if purely internal refactoring with zero user-facing changes.
-### Phase 2G: Browser Verification & Fix Loop (`/massu-loop-playwright`)
-```
-[GOLDEN PATH -- PHASE 2G: BROWSER VERIFICATION]
-```
-**This phase executes the full `/massu-loop-playwright` protocol inline.** See `massu-loop-playwright.md` for the standalone version.
-**Auto-trigger condition**: If plan touches ANY UI/demo files or produces visual output, this phase runs automatically. If purely backend/MCP/config with zero visual output, skip with log note: `Browser verification: SKIPPED (no UI files changed)`.
-#### 2G.1 Determine Target Pages
-Map changed features to testable URLs:
-- If Massu has a demo page or documentation site: test affected pages
-- If testing MCP tool output: use a test harness or verify tool responses
-- Component changes: identify ALL pages that render the component
-#### 2G.2 Browser Setup & Authentication
-Use Playwright MCP plugin tools (`mcp__plugin_playwright_playwright__*`). Fallback: `mcp__playwright__*`.
-1. `browser_navigate` to target URL ($TARGET_URL)
-2. `browser_snapshot` to check page status
-3. If authentication required: STOP and request manual login
-```
-AUTHENTICATION REQUIRED
-The Playwright browser is not logged in to the target application.
-Please log in manually in the open browser window, then re-run the golden path.
-```
-**NEVER type credentials. NEVER hardcode passwords. NEVER proceed without authentication.**
-#### 2G.3 Load Audit (Per Page)
-For EACH target page:
+Read `references/phase-2-implementation.md` for full details.
-| Check | Tool | Captures |
-|-------|------|----------|
-| Console errors/warnings | `browser_console_messages` | React errors, TypeError, CSP violations |
-| Network failures | `browser_network_requests` | 500s, 404s, CORS failures, timeouts |
+**Summary**: Nine sub-phases (or external loop via `--external` flag using `scripts/loop-external.sh` for context-fresh iterations):
+- **2A**: Extract plan items into tracking table, initialize session state
+- **2A.5**: Sprint contracts -- negotiate definition-of-done per plan item before implementation (scope boundary, acceptance criteria, VR-* mapping). See `references/sprint-contract-protocol.md`
+- **2B**: Implementation loop (pre-check, execute, guardrail, verify, update per item)
+- **2C**: Multi-perspective review (3 parallel agents: security, architecture, UX) + **QA evaluator** (conditional, UI plans only -- adversarial Playwright-based acceptance testing against sprint contracts). See `references/qa-evaluator-spec.md`
+- **2D**: Verification audit loop (subagent, circuit breaker CR-37, refine-or-pivot at 3+ iterations, sprint contract verification, max 10 iterations)
+- **2E**: Post-build reflection + memory persist (CR-38)
+- **2F**: Documentation sync (if user-facing features)
+- **2G**: Browser verification & fix loop (auto-triggers if UI files changed, Playwright MCP)
-Categorize findings:
-| Category | Severity |
-|----------|----------|
-| Crash, 500 error, data exposure | **P0 -- CRITICAL** |
-| Network failure, broken interaction | **P1 -- HIGH** |
-| Visual issues, performance warnings | **P2 -- MEDIUM** |
-| Console warnings, deprecations | **P3 -- LOW** |
-#### 2G.4 Interactive Testing (Per Page)
-1. `browser_snapshot` --> inventory ALL interactive elements (buttons, links, forms, selects, tabs, modals, data tables)
-2. For EACH testable element:
-   - Capture console state BEFORE interaction (`browser_console_messages`)
-   - Perform interaction (`browser_click`, `browser_select_option`, `browser_fill_form`)
-   - Wait 2-3 seconds for async operations
-   - Capture console state AFTER interaction
-   - Record any NEW errors introduced
-   - `browser_snapshot` to verify DOM state after interaction
-   - If interaction opened modal/sheet: test elements inside, then close
-**SAFETY**: Never submit forms, click Delete/Send/Submit, or create real records on production.
-#### 2G.5 Visual & Performance Audit
-**Visual checks**:
-- Broken images: `browser_evaluate` to find `img` elements with `naturalWidth === 0`
-- Layout issues: overflow, overlapping, missing content, broken alignment
-- Responsive: `browser_resize` at 1440x900 (desktop), 768x1024 (tablet), 375x812 (mobile)
-- Screenshot evidence: `browser_take_screenshot` at each breakpoint if issues found
-**Performance checks**:
-- Page load timing via `browser_evaluate` (`performance.getEntriesByType('navigation')`)
-- Resources > 500KB via `browser_evaluate` (`performance.getEntriesByType('resource')`)
-- Slow API calls > 3s, duplicate requests via `browser_network_requests`
-| Metric | Good | Needs Work | Critical |
-|--------|------|------------|----------|
-| DOM Content Loaded | < 2s | 2-5s | > 5s |
-| Full Load | < 4s | 4-8s | > 8s |
-| TTFB | < 500ms | 500ms-1.5s | > 1.5s |
-#### 2G.6 Fix Loop
-```
-issues = ALL findings from 2G.3-2G.5, sorted by priority (P0 first)
-FOR EACH issue WHERE priority <= P2:
-  1. IDENTIFY root cause (Grep/Read source files)
-  2. APPLY fix (follow CLAUDE.md patterns)
-  3. VERIFY fix (VR-GREP, VR-NEGATIVE, VR-BUILD, VR-TYPE)
-  4. LOG fix in report
-Zero-issue standard: ALL P0/P1 fixed, ALL P2 fixed or documented with justification.
-Circuit breaker: 5 iterations on same page --> ask user.
-```
-Post-fix: reload target URLs, re-run load audit + interactive testing for elements that had failures. If new errors appear, add to issues list and continue fix loop.
-#### 2G.7 Report
-Save to `.claude/playwright-reports/{TIMESTAMP}-{SLUG}.md`.
-Report includes: summary table, console errors, network failures, interactive element failures, visual issues, performance issues, fix log with files changed and VR checks, unfixed issues with justification, screenshots.
-#### 2G.8 Auto-Learning Protocol
-For EACH browser-discovered fix:
-1. Update memory files with symptom/root cause/fix/files
-2. Add to `scripts/massu-pattern-scanner.sh` if the bad pattern is grep-able
-3. Codebase-wide search for same bad pattern (CR-9) -- fix ALL instances
-```
-[GOLDEN PATH -- PHASE 2 COMPLETE]
-- All plan items implemented
-- Multi-perspective review: PASSED (security, architecture, quality)
-- Verification audit: PASSED (Loop #{iteration}, 0 gaps)
-- Post-build reflection: PERSISTED to memory
-- Documentation sync: COMPLETE / N/A
-- Browser verification: PASSED ({N} pages tested, {M} issues fixed) / SKIPPED (no UI files)
-```
+**Gate**: APPROVAL POINT #2: NEW PATTERN (if needed, any sub-phase)
 ---
-## PHASE 3: SIMPLIFICATION (`/massu-simplify`)
-```
-[GOLDEN PATH -- PHASE 3: SIMPLIFICATION]
-```
-**This phase executes the full `/massu-simplify` protocol inline.** See `massu-simplify.md` for the standalone version.
-### 3.1 Fast Gate
+## PHASE 2.5: GAP & ENHANCEMENT ANALYSIS
-```bash
-bash scripts/massu-pattern-scanner.sh  # Fix ALL violations before semantic analysis
-```
-### 3.2 Parallel Semantic Review (3 Agents)
-Spawn IN PARALLEL (Core Principle #10 -- one task per agent):
-**Efficiency Reviewer** (haiku): Query inefficiency (findMany equivalent vs SQL COUNT, N+1 queries, unbounded queries), algorithmic inefficiency (O(n^2), repeated sort/filter), unnecessary allocations, missing caching opportunities.
+Read `references/phase-2.5-gap-analyzer.md` for full details.
-**Reuse Reviewer** (haiku): Known utilities (getConfig(), stripPrefix(), tool registration patterns, memDb lifecycle pattern), module duplication against existing tool modules, pattern duplication across new files, config values that should be in massu.config.yaml.
+**Summary**: After implementation completes, run a continuous gap and enhancement analysis loop. A subagent analyzes all changed files across 7 categories (functional gaps, UX gaps, data integrity, security, pattern compliance, enhancements, sprint contract compliance). VR-VISUAL uses weighted 4-dimension scoring (threshold >= 3.0). Every gap/enhancement found is fixed immediately. The loop re-runs until a full pass discovers ZERO gaps. Max 10 iterations. Skippable only for documentation-only changes or explicit user request.
-**Pattern Compliance Reviewer** (haiku): ESM compliance (.ts import extensions, no require()), config-driven patterns (no hardcoded project-specific values -- CR-38/VR-GENERIC), TypeScript strict mode compliance, tool registration (3-function pattern preferred -- CR-11), hook compilation (esbuild compatible -- CR-12), memDb lifecycle (try/finally close), security (input validation, no eval/exec).
+---
-### 3.3 Apply ALL Findings
+## PHASE 3: SIMPLIFICATION
-Sort by SEVERITY (CRITICAL --> LOW). Fix ALL (CR-9). Re-run pattern scanner.
+Read `references/phase-3-simplify.md` for full details.
-```
-SIMPLIFY_GATE: PASS (N findings, N fixed, 0 remaining)
-```
+**Summary**: Fast gate (pattern scanner), then 3 parallel semantic review agents (efficiency, reuse, pattern compliance). Apply ALL findings sorted by severity. Re-run pattern scanner.
 ---
 ## PHASE 4: PRE-COMMIT VERIFICATION
-```
-[GOLDEN PATH -- PHASE 4: PRE-COMMIT VERIFICATION]
-```
-### 4.1 Auto-Verification Gates (ALL must pass in SINGLE run)
-| Gate | Command | Expected |
-|------|---------|----------|
-| 1. Pattern Scanner | `bash scripts/massu-pattern-scanner.sh` | Exit 0 |
-| 2. Type Safety (VR-TYPE) | `cd packages/core && npx tsc --noEmit` | 0 errors |
-| 3. Build (VR-BUILD) | `npm run build` | Exit 0 |
-| 4. Tests (VR-TEST) | `npm test` | ALL pass |
-| 5. Hook Compilation (VR-HOOK-BUILD) | `cd packages/core && npm run build:hooks` | Exit 0 |
-| 6. Generalization (VR-GENERIC) | `bash scripts/massu-generalization-scanner.sh` | Exit 0 |
-| 7. Security Scanner | `bash scripts/massu-security-scanner.sh` | Exit 0 |
-| 8. Secrets Staged | `git diff --cached --name-only \| grep -E '\.(env\|pem\|key\|secret)'` | 0 files |
-| 9. Credentials in Code | `grep -rn "sk-\|password.*=.*['\"]" --include="*.ts" packages/ \| grep -v "process.env" \| wc -l` | 0 |
-| 10. VR-TOOL-REG | For EACH new tool: verify definitions + handler wired in tools.ts | All wired |
-| 11. Plan Coverage | Verify ALL plan items with VR-* proof | 100% |
-| 12. VR-PLAN-STATUS | `grep "IMPLEMENTATION STATUS" [plan]` | Match |
-| 13. Dependency Security | `npm audit --audit-level=high` | 0 high/crit |
-### 4.2 Quality Scoring Gate
-Spawn `massu-output-scorer` (sonnet): Code Clarity, Pattern Compliance, Error Handling, Test Coverage, Config-Driven Design (1-5 each). All >= 3: PASS. Any < 3: FAIL.
-### 4.3 If ANY Gate Fails
+Read `references/phase-4-commit.md` for full details.
-**DO NOT PAUSE** -- Fix automatically, re-run ALL gates, repeat until all pass.
+**Summary**: Verification gates (pattern scanner, tsc, build, lint, secrets, VR-RENDER, VR-COUPLING, plan coverage, plan status, dep security). Quality scoring gate. Auto-fix on failure.
-### 4.4 Auto-Learning Protocol
-- For each bug fixed: update memory files
-- For new patterns: record in memory
-- Add detection to `scripts/massu-pattern-scanner.sh` if grep-able
-- Codebase-wide search: no other instances of same bad pattern (CR-9)
-- Record user corrections to `memory/corrections.md`
-### Phase 4 Complete --> APPROVAL POINT #3: COMMIT
-```
-===============================================================================
-APPROVAL REQUIRED: COMMIT
-===============================================================================
-All verification checks passed. Ready to commit.
-VERIFICATION RESULTS:
--------------------------------------------------------------------------------
-- Pattern scanner: Exit 0
-- Type check: 0 errors
-- Build: Exit 0
-- Tests: ALL pass
-- Hook compilation: Exit 0
-- Generalization: Exit 0
-- Security: No secrets staged, no credentials in code
-- Tool registration: All new tools wired
-- Plan Coverage: [X]/[X] = 100%
-- Quality Score: [X.X]/5.0
--------------------------------------------------------------------------------
-FILES TO BE COMMITTED:
-[list]
-PROPOSED COMMIT MESSAGE:
--------------------------------------------------------------------------------
-[type]: [description]
-[body]
-Co-Authored-By: Claude <noreply@anthropic.com>
--------------------------------------------------------------------------------
-OPTIONS:
-  - "approve" to commit and continue to push
-  - "message: [new message]" to change commit message
-  - "abort" to stop (changes remain staged)
-===============================================================================
-```
-### Commit Format
-```bash
-git commit -m "$(cat <<'EOF'
-[type]: [description]
-[Body]
-Changes:
-- [Change 1]
-- [Change 2]
-Verified:
-- Pattern scanner: PASS | Type check: 0 errors | Build: PASS
-- Tests: ALL pass | Hooks: compiled | Generalization: PASS
-Co-Authored-By: Claude <noreply@anthropic.com>
-EOF
-)"
-```
+**Gate**: APPROVAL POINT #3: COMMIT
 ---
 ## PHASE 5: PUSH VERIFICATION & PUSH
-```
-[GOLDEN PATH -- PHASE 5: PUSH VERIFICATION]
-```
-### 5.1 Pre-Flight
-```bash
-git log origin/main..HEAD --oneline  # Commits to push
-```
-### 5.2 Tier 1: Quick Re-Verification
-Run in parallel where possible:
-| Check | Command |
-|-------|---------|
-| Pattern Scanner | `bash scripts/massu-pattern-scanner.sh` |
-| Generalization | `bash scripts/massu-generalization-scanner.sh` |
-| TypeScript | `cd packages/core && npx tsc --noEmit` |
-| Build | `npm run build` |
-| Hook Compilation | `cd packages/core && npm run build:hooks` |
-### 5.3 Tier 2: Test Suite (CRITICAL)
-#### 5.3.0 Regression Detection (MANDATORY FIRST)
-```bash
-# Establish baseline on main
-git stash && git checkout main -q
-npm test 2>&1 | tee /tmp/baseline-tests.txt
-git checkout - -q && git stash pop -q
-# Run on current branch
-npm test 2>&1 | tee /tmp/current-tests.txt
-# Compare: any test passing on main but failing now = REGRESSION
-# Regressions MUST be fixed before push
-```
-#### 5.3.1-5.3.3 Test Execution
-Use **parallel Task agents** for independent checks:
-```
-Agent Group A (parallel):
-- Agent 1: npm test (unit tests)
-- Agent 2: npm audit --audit-level=high
-- Agent 3: bash scripts/massu-security-scanner.sh
-Sequential:
-- VR-TOOL-REG: verify ALL new tools registered in tools.ts
-- VR-GENERIC: verify ALL files pass generalization scanner
-```
-### 5.4 Tier 3: Security & Compliance
-| Check | Command |
-|-------|---------|
-| npm audit | `npm audit --audit-level=high` |
-| Security scan | `bash scripts/massu-security-scanner.sh` |
-| Config validation | Parse massu.config.yaml without errors |
-### 5.5 Tier 4: Final Gate
-All tiers must pass:
-| Tier | Status |
-|------|--------|
-| Tier 1: Quick Checks | PASS/FAIL |
-| Tier 2: Test Suite + Regression | PASS/FAIL |
-| Tier 3: Security & Compliance | PASS/FAIL |
-### Phase 5 Gate --> APPROVAL POINT #4: PUSH
-```
-===============================================================================
-APPROVAL REQUIRED: PUSH TO REMOTE
-===============================================================================
-All verification tiers passed. Ready to push.
-PUSH GATE SUMMARY:
--------------------------------------------------------------------------------
-Commit: [hash]
-Message: [message]
-Files changed: [N] | +[N] / -[N]
-Branch: [branch] --> origin
-Tier 1 (Quick): PASS
-Tier 2 (Tests): PASS -- Unit: X/X, Regression: 0
-Tier 3 (Security): PASS -- Audit: 0 high/crit, Secrets: clean
--------------------------------------------------------------------------------
+Read `references/phase-5-push.md` for full details.
-OPTIONS:
-  - "approve" / "push" to push to remote
-  - "abort" to stop (commit remains local)
+**Summary**: Pre-flight (commits to push). Tier 1: quick re-verification. Tier 2: test suite with mandatory regression detection. Tier 3: security & compliance (npm audit, secrets scan). Tier 4: final gate.
-===============================================================================
-```
-After approval: `git push origin [branch]`, then verify with `gh run list --limit 3`.
+**Gate**: APPROVAL POINT #4: PUSH
 ---
 ## PHASE 6: COMPLETION
-### 6.1 Final Report
-```
-===============================================================================
-GOLDEN PATH COMPLETE
-===============================================================================
-SUMMARY:
--------------------------------------------------------------------------------
-Phase 0: Requirements & Context    - D1-D10 resolved
-Phase 1: Plan Creation & Audit     - [N] items, [M] audit passes
-Phase 2: Implementation            - [N] audit loops, 3 reviewers passed
-Phase 2G: Browser Verification     - [N] pages tested, [M] issues fixed / SKIPPED
-Phase 3: Simplification            - [N] findings fixed
-Phase 4: Pre-Commit Verification   - All gates passed
-Phase 5: Push Verification         - 3 tiers passed, 0 regressions
--------------------------------------------------------------------------------
-DELIVERABLES:
-  - Plan: [plan path]
-  - Commit: [hash]
-  - Branch: [branch]
-  - Pushed: YES
-  - Files changed: [N]
-===============================================================================
-```
-### 6.2 Plan Document Update (MANDATORY)
-Add to TOP of plan document:
-```markdown
-# IMPLEMENTATION STATUS
-**Plan**: [Name]
-**Status**: COMPLETE -- PUSHED
-**Last Updated**: [YYYY-MM-DD HH:MM]
-**Push Commit**: [hash]
-**Completed By**: Claude Code (Massu Golden Path)
-## Task Completion Summary
-| # | Task/Phase | Status | Verification | Date |
-|---|------------|--------|--------------|------|
-| 1 | [description] | 100% COMPLETE | VR-BUILD: Pass | [date] |
-```
-### 6.3 Auto-Learning Protocol (MANDATORY)
-1. Review ALL fixes: `git diff origin/main..HEAD`
-2. For each fix: verify memory files updated
-3. For each new pattern: verify recorded
-4. For each failed approach: verify recorded
-5. Record user corrections to `memory/corrections.md`
-6. Consider new CR rule if a class of bug was found
-### 6.4 Update Session State
+Read `references/phase-6-completion.md` for full details.
-Update `session-state/CURRENT.md` with completion status.
+**Summary**: Final report with phase-by-phase status. Plan document update (IMPLEMENTATION STATUS at top). Auto-learning protocol (memory ingest for all fixes/patterns). Quality & observability report. Feature registration. Session state update.
 ---
-## NEW PATTERN APPROVAL (APPROVAL POINT #2 -- Any Phase)
+## Skill Contents
-If a new pattern is needed during ANY phase:
+This skill is a folder. The following files are available for reference:
-```
-===============================================================================
-APPROVAL REQUIRED: NEW PATTERN
-===============================================================================
-A new pattern is needed for: [functionality]
-Existing patterns checked:
-- [pattern 1]: Not suitable because [reason]
-PROPOSED NEW PATTERN:
--------------------------------------------------------------------------------
-Name: [Pattern Name]
-Domain: [Config/MCP/Hook/etc.]
-WRONG: ```[code]```
-CORRECT: ```[code]```
-Error if violated: [What breaks]
--------------------------------------------------------------------------------
-OPTIONS:
-  - "approve" to save and continue
-  - "modify: [changes]" to adjust
-  - "abort" to stop
-===============================================================================
-```
+| File | Purpose | Read When |
+|------|---------|-----------|
+| `references/phase-0-requirements.md` | Requirements interview, ambiguity detection, 10-dimension coverage map | Starting a new implementation from a task description |
+| `references/phase-1-plan-creation.md` | Blast radius analysis, plan generation, audit loop | Writing or auditing a plan |
+| `references/phase-2-implementation.md` | Item loop, sprint contracts, multi-perspective review, QA evaluator, verification audit, browser testing | Executing implementation; any Phase 2 sub-phase |
+| `references/sprint-contract-protocol.md` | Sprint contract template, quality bar, negotiation rules, skip conditions | Phase 2A.5 sprint contract negotiation |
+| `references/qa-evaluator-spec.md` | Adversarial QA evaluator: 4 dimensions, anti-leniency rules, known failure patterns | Phase 2C.2 QA evaluation (UI plans only) |
+| `references/vr-visual-calibration.md` | Score 5/3/1 calibration examples for VR-VISUAL weighted dimensions | Calibrating VR-VISUAL evaluator scoring |
+| `references/phase-2.5-gap-analyzer.md` | Gap/enhancement analysis loop, 7 categories (incl. sprint contract compliance), fix-and-repass until zero | After implementation, before simplification |
+| `references/phase-3-simplify.md` | Pattern scanner fast gate, dead code detection, parallel semantic review agents | Running simplification after implementation |
+| `references/phase-4-commit.md` | Verification gates, quality scoring, commit format | Preparing a commit |
+| `references/phase-5-push.md` | Pre-flight, push verification, regression detection | Preparing to push to remote |
+| `references/phase-6-completion.md` | Final report, plan status update, auto-learning, feature registration | After all verification; completing the golden path |
+| `references/approval-points.md` | Exact format and options for all 4 approval points (5 with --competitive: Plan, New Pattern, Winner Selection, Commit, Push) | Presenting any approval gate to the user |
+| `references/competitive-mode.md` | Competitive mode protocol: agent spawning, scoring, winner selection | Using --competitive flag |
+| `references/error-handling.md` | Abort handling, non-recoverable errors, post-compaction re-verification, competitive mode errors | On user abort, blocker error, or after context compaction |
 ---
-## ABORT HANDLING
+## Gotchas
-```
-===============================================================================
-GOLDEN PATH ABORTED
-===============================================================================
-Stopped at: [Phase N -- Approval Point]
-CURRENT STATE:
-  - Completed phases: [list]
-  - Pending phases: [list]
-  - Plan file: [path]
-  - Files changed: [list]
-  - Commit created: YES/NO
-  - Pushed: NO
-TO RESUME:
-  Run /massu-golden-path again with the same plan
-  Or run individual commands:
-    /massu-loop      -- Continue implementation
-    /massu-commit    -- Run commit verification
-    /massu-push      -- Run push verification
-===============================================================================
-```
+- **Compaction mid-loop loses plan state** -- if context compaction occurs during implementation, the plan file path and current item must be recoverable from session-state/CURRENT.md
+- **UI items need browser verification (CR-41)** -- any plan item touching UI files must be verified with Playwright before claiming done
+- **Approval points must not be skipped** -- there are 4 approval gates (5 with --competitive: Plan, New Pattern, Winner Selection, Commit, Push)
+- **Plan file must be re-read from disk, not memory (CR-5)** -- after compaction, always re-read the plan file. Memory of plan contents drifts from reality
+- **100% coverage required (CR-11)** -- never stop early. "Most items done" is not "all items done"
+- **--competitive increases token cost ~2-3x for Phase 2** -- use for high-stakes features only
+- **Competing agents do NOT run database migrations** -- DB changes must be applied separately before competitive mode
+- **Worktree branches must be mergeable** -- competing agents edit different files from the same plan, but shared files may conflict
+- **Bias presets are suggestions, not constraints** -- agents may deviate if the plan requires a specific approach
 ---
-## ERROR HANDLING
+## Quality Scoring Criteria
-**Recoverable**: Fix automatically --> re-run failed step --> if fixed, continue without pausing --> if not fixable after 3 attempts, pause and report.
+| Dimension | Weight | Measured By |
+|-----------|--------|-------------|
+| Code Clarity | 1-5 | Naming, structure, comments |
+| Pattern Compliance | 1-5 | CLAUDE.md patterns followed |
+| Error Handling | 1-5 | Edge cases, validation, fallbacks |
+| UX Quality | 1-5 | Loading/error/empty states, accessibility |
+| Test Coverage | 1-5 | Test files exist for new code |
-**Non-Recoverable**:
-```
-===============================================================================
-GOLDEN PATH BLOCKED
-===============================================================================
-BLOCKER: [Description]
-Required: [Steps to resolve]
-After resolving, run /massu-golden-path again.
-===============================================================================
-```
+All >= 3: PASS. Any < 3: FAIL.
 ---
 ## START NOW
-**Step 0: Write AUTHORIZED_COMMAND to session state (CR-35)**
+**Step 0: Write AUTHORIZED_COMMAND to session state (CR-12)**
 Update `session-state/CURRENT.md`:
 ```
@@ -963,11 +227,13 @@ AUTHORIZED_COMMAND: massu-golden-path
 1. **Determine input**: Task description, plan file, or continue
 2. **Phase 0**: Requirements & context (if task description)
-3. **Phase 1**: Plan creation & audit --> **PAUSE: Plan Approval**
+3. **Phase 1**: Plan creation & audit -> **PAUSE: Plan Approval**
 4. **Phase 2**: Implementation with verification loops + browser verification (UI changes)
-5. **Phase 3**: Simplification (efficiency, reuse, patterns)
-6. **Phase 4**: Pre-commit verification --> **PAUSE: Commit Approval**
-7. **Phase 5**: Push verification --> **PAUSE: Push Approval**
-8. **Phase 6**: Completion, learning, quality metrics
+4a. **Phase 2-COMP**: Competitive implementation (if --competitive) -> **PAUSE: Winner Selection**
+5. **Phase 2.5**: Gap & enhancement analysis loop (until zero gaps)
+6. **Phase 3**: Simplification (efficiency, reuse, patterns)
+7. **Phase 4**: Pre-commit verification -> **PAUSE: Commit Approval**
+8. **Phase 5**: Push verification via `scripts/push-verify.sh` -> **PAUSE: Push Approval**
+9. **Phase 6**: Completion, learning, quality metrics
 **This command does NOT stop to ask "should I continue?" -- it runs straight through.**