
@massu/core 0.6.0 → 0.6.2

This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (26)

package/commands/_shared-preamble.md +14 -0
package/commands/massu-ci-fix.md +2 -2
package/commands/massu-gap-enhancement-analyzer.md +85 -345
package/commands/massu-golden-path/references/approval-points.md +9 -12
package/commands/massu-golden-path/references/competitive-mode.md +9 -7
package/commands/massu-golden-path/references/error-handling.md +4 -2
package/commands/massu-golden-path/references/phase-0-requirements.md +3 -3
package/commands/massu-golden-path/references/phase-1-plan-creation.md +41 -52
package/commands/massu-golden-path/references/phase-2-implementation.md +50 -151
package/commands/massu-golden-path/references/phase-2.5-gap-analyzer.md +14 -34
package/commands/massu-golden-path/references/phase-3-simplify.md +5 -5
package/commands/massu-golden-path/references/phase-4-commit.md +20 -46
package/commands/massu-golden-path/references/phase-5-push.md +14 -47
package/commands/massu-golden-path/references/phase-6-completion.md +8 -58
package/commands/massu-golden-path.md +25 -30
package/commands/massu-loop/references/checkpoint-audit.md +14 -18
package/commands/massu-loop/references/guardrails.md +3 -3
package/commands/massu-loop/references/iteration-structure.md +46 -14
package/commands/massu-loop/references/loop-controller.md +72 -63
package/commands/massu-loop/references/plan-extraction.md +19 -11
package/commands/massu-loop/references/vr-plan-spec.md +20 -28
package/commands/massu-loop.md +36 -56
package/commands/massu-review.md +2 -2
package/dist/cli.js +0 -0
package/package.json +1 -1
package/README.md +0 -40

package/commands/massu-golden-path/references/approval-points.md CHANGED

@@ -44,7 +44,7 @@ Total Items: [N]
 Phases: [list]
 Requirements Coverage: [X]/10 dimensions resolved
-Feasibility: VERIFIED (DB, files, patterns, security)
+Feasibility: VERIFIED (config, files, patterns, security)
 Audit Passes: {iteration} (final pass: 0 gaps)
 --------------------------------------------------------------------------
@@ -75,7 +75,7 @@ Existing patterns checked:
 PROPOSED NEW PATTERN:
 --------------------------------------------------------------------------
 Name: [Pattern Name]
-Domain: [UI/Database/Auth/etc.]
+Domain: [Config/MCP/Hook/etc.]
 WRONG: [code]
 CORRECT: [code]
@@ -108,15 +108,12 @@ VERIFICATION RESULTS:
   Pattern scanner: Exit 0
   Type check: 0 errors
   Build: Exit 0
-  Lint: Exit 0
-  Prisma: Valid
+  Tests: ALL pass
+  Hook compilation: Exit 0
+  Generalization: Exit 0
   Security: No secrets staged, no credentials in code
-  VR-RENDER: All UI components rendered
-  VR-COUPLING: All backend features exposed in UI
-  VR-COLOR: No hardcoded Tailwind colors
+  Tool registration: All new tools wired
   Plan Coverage: [X]/[X] = 100%
-  Database: All environments verified
-  Help site: UP TO DATE / N/A
   Quality Score: [X.X]/5.0
 --------------------------------------------------------------------------
@@ -161,8 +158,8 @@ Files changed: [N] | +[N] / -[N]
 Branch: [branch] -> origin
 Tier 1 (Quick): PASS
-Tier 2 (Tests): PASS -- Unit: X/X, E2E: X/X, Regression: 0
-Tier 3 (Security): PASS -- Audit: 0 high/crit, RLS: verified, Secrets: clean
+Tier 2 (Tests): PASS -- Unit: X/X, Regression: 0
+Tier 3 (Security): PASS -- Audit: 0 high/crit, Secrets: clean
 --------------------------------------------------------------------------
 OPTIONS:
@@ -200,7 +197,7 @@ COMPETITIVE SCORECARD:
 NOTABLE DIFFERENCES:
   [Aspect]: Agent A did [X], Agent B did [Y]
-RECOMMENDATION: Agent {X} ({bias}) — [reason]
+RECOMMENDATION: Agent {X} ({bias}) -- [reason]
 --------------------------------------------------------------------------
 PER-AGENT NOTES:

package/commands/massu-golden-path/references/competitive-mode.md CHANGED

@@ -1,8 +1,10 @@
 # Competitive Mode Protocol
+> **Shared rules apply.** Read .claude/commands/_shared-preamble.md before proceeding.
 > Reference doc for `/massu-golden-path --competitive`. Return to main file for overview.
-**Purpose**: Spawn 2-3 competing implementations of the same plan with different optimization biases, score all implementations, and select the winner before proceeding with Massu's verification rigor.
+**Purpose**: Spawn 2-3 competing implementations of the same plan with different optimization biases, score all implementations, and select the winner before proceeding with verification rigor.
 **Triggering**: Only when `/massu-golden-path --competitive` is explicitly used. Never automatic.
@@ -17,7 +19,7 @@ SCAN plan for:
   - Items with type = MIGRATION
   - Items containing ALTER TABLE, CREATE TABLE, DROP TABLE
   - Items containing RLS policies or grants
-  - Items referencing database migrations
+  - Items referencing all database environments
 IF any found:
   ABORT competitive mode with message:
@@ -25,7 +27,7 @@ IF any found:
    Apply migrations first, then re-run with --competitive."
 ```
-This mirrors the `/massu-batch` DB guard pattern (`scripts/batch-db-guard.sh`).
+This mirrors the `/massu-batch` DB guard pattern.
 ---
@@ -75,8 +77,8 @@ IMPLEMENTATION RULES:
 1. Read the plan from disk and implement ALL items
 2. Follow ALL CLAUDE.md patterns (ctx.db, protectedProcedure, etc.)
 3. Do NOT run database migrations (handled separately)
-4. Run pattern-scanner after each file: ./scripts/pattern-scanner.sh
-5. Run tsc after implementation: npx tsc --noEmit
+4. Run pattern-scanner after each file: bash scripts/massu-pattern-scanner.sh
+5. Run tsc after implementation: cd packages/core && npx tsc --noEmit
 6. Fix any issues before declaring done
 OUTPUT FORMAT (at completion):
@@ -249,8 +251,8 @@ IF cleanup fails:
 ### Post-Merge Verification
 ```
-1. Run ./scripts/pattern-scanner.sh (exit 0 required)
-2. Run npx tsc --noEmit (0 errors required)
+1. Run bash scripts/massu-pattern-scanner.sh (exit 0 required)
+2. Run cd packages/core && npx tsc --noEmit (0 errors required)
 3. IF either fails:
      Fix issues from merge
      Re-run verification
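
The migration guard in the competitive-mode hunk (scan the plan, abort if any migration-like item is found) could look roughly like the sketch below. The `PlanItem` shape and both function names are assumptions made for illustration; the diff does not show how the package represents plan items, and the new "all database environments" condition is left out because its matching rule is not specified.

```typescript
// Hypothetical plan item shape; the real plan format is not shown in the diff.
interface PlanItem {
  id: string;
  type: string;
  description: string;
}

// Markers taken from the SCAN list: DDL statements and RLS policies or grants.
const DDL_PATTERN = /\b(ALTER|CREATE|DROP)\s+TABLE\b/i;
const RLS_PATTERN = /\bRLS\b|\bgrants?\b/i;

// Return every item that would force competitive mode to abort.
function findMigrationItems(items: PlanItem[]): PlanItem[] {
  return items.filter(
    (item) =>
      item.type === "MIGRATION" ||
      DDL_PATTERN.test(item.description) ||
      RLS_PATTERN.test(item.description),
  );
}

// Competitive mode is allowed only when no blocking item exists.
function canRunCompetitive(items: PlanItem[]): { ok: boolean; blockers: string[] } {
  const blockers = findMigrationItems(items).map((item) => item.id);
  return { ok: blockers.length === 0, blockers };
}
```

Returning the blocking item IDs rather than a bare boolean lets the abort message name the items that need their migrations applied first.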

package/commands/massu-golden-path/references/error-handling.md CHANGED

@@ -1,5 +1,7 @@
 # Error Handling
+> **Shared rules apply.** Read .claude/commands/_shared-preamble.md before proceeding.
 > Reference doc for `/massu-golden-path`. Return to main file for overview.
 ## Recoverable Errors
@@ -54,11 +56,11 @@ TO RESUME:
 ---
-## Post-Compaction Re-Verification (CR-42)
+## Post-Compaction Re-Verification (CR-12)
 **After ANY context compaction during a golden path run**, BEFORE continuing implementation:
-1. **Re-read the FULL plan document** from disk (CR-5 -- never from memory)
+1. **Re-read the FULL plan document** from disk (never from memory)
 2. **Diff every completed item against actual code**: For each item marked complete in the tracking table, re-run its VR-* verification command
 3. **VR-SPEC-MATCH audit**: For every completed UI item with specific CSS classes/structure in the plan, grep for those EXACT strings in the implementation
 4. **Flag mismatches**: Any item where implementation doesn't match the plan's exact spec -> mark as gap, fix before continuing
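
As a rough illustration of the post-compaction steps above (re-run each completed item's verification and flag mismatches as gaps), consider the sketch below. The `TrackedItem` type, the `COMPLETE` status label, and the `verify` callback are hypothetical stand-ins for the tracking table and the VR-* commands, which the diff does not define in code.

```typescript
// Hypothetical row of the tracking table; verify() stands in for a VR-* command.
interface TrackedItem {
  id: string;
  status: "PENDING" | "COMPLETE";
  verify: () => boolean;
}

// Re-run verification for every item marked complete and
// return the IDs whose implementation no longer passes.
function findPostCompactionGaps(items: TrackedItem[]): string[] {
  return items
    .filter((item) => item.status === "COMPLETE")
    .filter((item) => !item.verify())
    .map((item) => item.id);
}
```

Pending items are skipped on purpose: the protocol only re-checks work that was already recorded as done before the compaction.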

package/commands/massu-golden-path/references/phase-0-requirements.md CHANGED

@@ -8,9 +8,9 @@
 [GOLDEN PATH -- PHASE 0: REQUIREMENTS & CONTEXT]
 ```
-- Call `massu_memory_sessions` for recent session context
-- Call `massu_memory_search` + `massu_memory_failures` with feature keywords
 - Read `session-state/CURRENT.md` for any prior state
+- Read `massu.config.yaml` for project configuration
+- Search memory files for relevant prior context
 ## 0.2 Requirements Coverage Map
@@ -20,7 +20,7 @@ Initialize ALL dimensions as `pending`:
 |---|-----------|--------|-------------|
 | D1 | Problem & Scope | pending | User request + interview |
 | D2 | Users & Personas | pending | Interview |
-| D3 | Data Model | pending | Phase 1A (DB Reality Check) |
+| D3 | Data Model | pending | Phase 1A (Config/Schema Reality Check) |
 | D4 | Backend / API | pending | Phase 1A (Codebase Reality Check) |
 | D5 | Frontend / UX | pending | Interview + Phase 1A |
 | D6 | Auth & Permissions | pending | Phase 1A (Security Pre-Screen) |
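
The coverage map in this hunk starts every dimension as `pending` and is later resolved to `done` or `n/a`. A minimal sketch of that bookkeeping follows; the type and function names are assumptions, and only the status values come from the diff.

```typescript
// Status values as used in the command docs.
type DimensionStatus = "pending" | "done" | "n/a";

interface Dimension {
  id: string; // e.g. "D1"
  name: string;
  status: DimensionStatus;
}

// Initialize ALL dimensions as pending.
function initCoverage(dimensions: Array<[string, string]>): Dimension[] {
  return dimensions.map(([id, name]): Dimension => ({ id, name, status: "pending" }));
}

// Count resolved dimensions and list the ones still pending.
function summarizeCoverage(dims: Dimension[]): { resolved: number; total: number; blocked: string[] } {
  const blocked = dims.filter((d) => d.status === "pending").map((d) => d.id);
  return { resolved: dims.length - blocked.length, total: dims.length, blocked };
}
```

The `resolved` and `total` fields correspond to the `[X]/10 dimensions resolved` line in the approval summary, assuming all ten dimensions are loaded.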

package/commands/massu-golden-path/references/phase-1-plan-creation.md CHANGED

@@ -12,85 +12,75 @@
 ### 1A.1 Feature Understanding
-- Call `massu_knowledge_search`, `massu_knowledge_pattern`, `massu_knowledge_schema_check` with feature name
 - Document: exact user request, feature type, affected domains
-- Search codebase for similar features, routers, pages
+- Search codebase for similar features, tool modules, existing patterns
+- Read `massu.config.yaml` for relevant config sections
-### 1A.2 Database Reality Check (VR-SCHEMA-PRE)
+### 1A.2 Config & Schema Reality Check
-For EACH table the feature might use, query via MCP:
+For features touching config or databases:
-```sql
-SELECT column_name, data_type, is_nullable, column_default
-FROM information_schema.columns WHERE table_name = '[TABLE]' ORDER BY ordinal_position;
+- Parse `massu.config.yaml` and verify all referenced config keys exist
+- Check SQLite schema for affected tables (`getCodeGraphDb`, `getDataDb`, `getMemoryDb`)
+- Verify tool definitions in `tools.ts` for any tools being modified
-SELECT polname, polcmd FROM pg_policies WHERE tablename = '[TABLE]';
+Document: existing config keys, required new keys, required schema changes.
-SELECT grantee, privilege_type FROM information_schema.table_privileges WHERE table_name = '[TABLE]';
-```
-Run `./scripts/check-bad-columns.sh`. Call `massu_schema` for Prisma cross-reference.
-Document: existing tables, required new tables/columns, migration SQL previews.
+### 1A.3 Config-Code Alignment (VR-CONFIG)
-### 1A.3 Config-Code Alignment (VR-DATA)
+If feature uses config-driven values:
-If feature uses DB-stored configs:
-```sql
-SELECT DISTINCT jsonb_object_keys(config_column) as keys FROM config_table;
+```bash
+# Check config keys used in code
+grep -rn "getConfig()" packages/core/src/ | grep -o 'config\.\w\+' | sort -u
+# Compare to massu.config.yaml structure
 ```
-Compare to code: `grep -rn "config\." src/lib/[feature]/ | grep -oP 'config\.\w+' | sort -u`
 ### 1A.4 Codebase Reality Check
 - Verify target directories/files exist
-- Read similar routers and components
-- Load relevant pattern files (database/auth/ui/realtime/build)
+- Read similar tool modules and handlers
+- Load relevant pattern files (build/testing/security/database/mcp)
 ### 1A.5 Blast Radius Analysis (CR-25)
-**MANDATORY when plan changes any constant, path, route, enum, or config key.**
+**MANDATORY when plan changes any constant, export name, config key, or tool name.**
 1. Identify ALL changed values (old -> new)
 2. Codebase-wide grep for EACH value
-3. Call `massu_impact` for indirect impact through import chains
-4. If plan deletes files: call `massu_sentinel_impact` -- zero orphaned features allowed
-5. Categorize EVERY occurrence: CHANGE / KEEP (with reason) / INVESTIGATE
-6. Resolve ALL INVESTIGATE to 0. Add ALL CHANGE items as plan deliverables.
+3. If plan deletes files: verify no remaining imports or references
+4. Categorize EVERY occurrence: CHANGE / KEEP (with reason) / INVESTIGATE
+5. Resolve ALL INVESTIGATE to 0. Add ALL CHANGE items as plan deliverables.
 ### 1A.6 Pattern Compliance Check
-Check applicable patterns: ctx.db, user_profiles, 3-step query, BigInt/Decimal, RLS+Grants, Suspense, Select.Item, protectedProcedure, Zod validation. Read most similar router/component for patterns used.
+Check applicable patterns: ESM imports (.ts extensions), config access (getConfig()), tool registration (3-function pattern), hook compilation (esbuild), SQLite DB access (getCodeGraphDb/getDataDb/getMemoryDb), memDb lifecycle (try/finally close).
-### 1A.7 Backend-Frontend Coupling Check (CR-12)
+Read most similar tool module for patterns used.
-For EVERY backend z.enum, type, or procedure planned -- verify a corresponding frontend item exists. If NOT, ADD IT.
+### 1A.7 Tool Registration Check
+For EVERY new MCP tool planned -- verify a corresponding registration item exists in the plan (definitions + routing + handler in `tools.ts`). If NOT, ADD IT.
 ### 1A.8 Question Filtering
 1. List all open questions
-2. Self-answer anything answerable by reading code or querying DB
+2. Self-answer anything answerable by reading code or config
 3. Surface only business logic / UX / scope / priority questions to user via AskUserQuestion
 4. If all self-answerable, skip user prompt
-### 1A.9 Security Pre-Screen (6 Dimensions)
+### 1A.9 Security Pre-Screen (5 Dimensions)
 | Dim | Check | If Triggered |
 |-----|-------|-------------|
-| S1 | PII / Sensitive Data | Add RLS + column-level access |
-| S2 | Authentication | Verify protectedProcedure |
-| S3 | Authorization | Add RBAC checks, RLS policies |
-| S4 | Injection Surfaces | Add Zod validation, parameterized queries |
-| S5 | Secrets Management (CR-5) | Add AWS Secrets Manager items |
-| S6 | Rate Limiting | Add rate limiting middleware |
+| S1 | PII / Sensitive Data | Add access controls |
+| S2 | Authentication | Verify auth checks |
+| S3 | Authorization | Add permission checks |
+| S4 | Injection Surfaces | Add input validation, parameterized queries |
+| S5 | Rate Limiting | Add rate limiting considerations |
 **BLOCKS_REMAINING must = 0 before proceeding.**
-### 1A.10 ADR Generation (Optional)
-For architectural decisions: `massu_adr_list` -> `massu_adr_generate`.
 Mark all coverage dimensions as `done` or `n/a`.
 ---
@@ -101,25 +91,24 @@ Mark all coverage dimensions as `done` or `n/a`.
 [GOLDEN PATH -- PHASE 1B: PLAN GENERATION]
 ```
-Write plan to: `plans/[YYYY-MM-DD]-[feature-name].md`
+Write plan to: `docs/plans/[YYYY-MM-DD]-[feature-name].md`
 **Plan structure** (P-XXX numbered items):
 - Overview (feature, complexity, domains, item count)
 - Requirements Coverage Map (D1-D10 all resolved)
-- Phase 0: Credentials & Secrets (CR-5)
-- Phase 1: Database Changes (migrations with exact SQL)
-- Phase 2: Backend Implementation (routers, procedures, input schemas)
-- Phase 3: Frontend Implementation (components, pages, renders-in)
+- Phase 1: Configuration Changes (massu.config.yaml)
+- Phase 2: Backend Implementation (tool modules, handlers, SQLite schema)
+- Phase 3: Frontend/Hook Implementation (hooks, plugin code)
 - Phase 4: Testing & Verification
-- Phase 5: Documentation (help site pages, changelog)
+- Phase 5: Documentation
 - Verification Commands table
 - Item Summary table
 - Risk Assessment
 - Dependencies
-**Item numbering**: P0-XXX (secrets), P1-XXX (database), P2-XXX (backend), P3-XXX (frontend), P4-XXX (testing), P5-XXX (docs).
+**Item numbering**: P1-XXX (config), P2-XXX (backend), P3-XXX (frontend/hooks), P4-XXX (testing), P5-XXX (docs).
-**Implementation Specificity Check**: Every item MUST have exact file path, exact content/SQL, insertion point, format matches target, verification command.
+**Implementation Specificity Check**: Every item MUST have exact file path, exact content, insertion point, format matches target, verification command.
 **Documentation Impact Assessment**: If ANY user-facing features, Phase 5 deliverables are MANDATORY.
@@ -141,7 +130,7 @@ WHILE true:
   result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
     Audit iteration {iteration} for plan: {PLAN_PATH}
     Execute ONE complete audit pass. Verify ALL deliverables.
-    Check: VR-PLAN-FEASIBILITY, VR-PLAN-SPECIFICITY, Pattern Alignment, Schema Reality.
+    Check: VR-PLAN-FEASIBILITY, VR-PLAN-SPECIFICITY, Pattern Alignment, Config Reality.
     Fix any plan document gaps you find.
     CRITICAL: Report GAPS_DISCOVERED as total gaps FOUND, EVEN IF you fixed them.
@@ -157,7 +146,7 @@ WHILE true:
 END WHILE
 ```
-**VR-PLAN-FEASIBILITY**: DB schema exists, files exist, dependencies available, patterns documented, credentials planned.
+**VR-PLAN-FEASIBILITY**: Files exist, config keys valid, dependencies available, patterns documented.
 **VR-PLAN-SPECIFICITY**: Every item has exact path, exact content, insertion point, verification command.
 **Pattern Alignment**: Cross-reference ALL applicable patterns from CLAUDE.md and patterns/*.md.
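
The VR-CONFIG step in section 1A.3 greps for `config.<key>` usages and compares them to the structure of `massu.config.yaml`. The sketch below shows the same comparison in TypeScript under two assumptions: the YAML has already been parsed into a plain object by some loader, and only top-level keys are checked, as in the grep. The function names and the sample keys in the usage note are hypothetical.

```typescript
// Collect the distinct top-level keys referenced as `config.<key>` in source text.
function extractConfigKeys(source: string): string[] {
  const keys = new Set<string>();
  const pattern = /\bconfig\.(\w+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    keys.add(match[1]);
  }
  return Array.from(keys).sort();
}

// Return the keys used in code that are absent from the parsed config object.
function findMissingKeys(usedKeys: string[], config: Record<string, unknown>): string[] {
  return usedKeys.filter((key) => !Object.prototype.hasOwnProperty.call(config, key));
}
```

For example, if the code references `config.paths` and `config.toolPrefix` while the parsed config defines only `toolPrefix`, `findMissingKeys` reports `paths` as a key the plan must add or the code must stop using.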

package/commands/massu-golden-path/references/phase-2-implementation.md CHANGED

@@ -21,12 +21,12 @@ ELSE:
 [GOLDEN PATH -- PHASE 2: IMPLEMENTATION]
 ```
-1. Read plan from disk (NOT memory)
+1. Read plan from disk (NOT memory -- CR-5)
 2. Extract ALL deliverables into tracking table:
 | Item # | Type | Description | Location | Verification | Status |
 |--------|------|-------------|----------|--------------|--------|
-| P1-001 | MIGRATION | ... | ... | VR-SCHEMA | PENDING |
+| P1-001 | CONFIG | ... | ... | VR-CONFIG | PENDING |
 3. Create VR-PLAN verification strategy:
@@ -38,38 +38,12 @@ ELSE:
 ---
-## Phase 2A.5: Sprint Contracts
-> Full protocol: [sprint-contract-protocol.md](sprint-contract-protocol.md)
-**Before implementation begins**, negotiate a sprint contract for each plan item:
-1. For each plan item in the tracking table:
-   - Define **Scope Boundary** (IN/OUT)
-   - Define **Implementation Approach** (files, patterns)
-   - Write **3-5 Acceptance Criteria** (must be specific enough that two independent evaluators agree on PASS/FAIL)
-   - Map to **VR-\* Verification Types**
-2. Add contract columns to the Phase 2A tracking table:
-| Item # | Type | Description | Location | Verification | Scope Boundary | Acceptance Criteria | Contract Status |
-|--------|------|-------------|----------|--------------|----------------|---------------------|-----------------|
-| P1-001 | MIGRATION | ... | ... | VR-SCHEMA | IN: ... / OUT: ... | 1. ... 2. ... 3. ... | AGREED |
-3. **Quality bar**: Criteria using words like "good", "correct", "proper" without specifics = reject and rewrite. Each contract must include criteria from at least 3 categories: happy path, data display, empty/loading/error states, user feedback, edge cases.
-4. **Skip conditions**: Mark `Contract: N/A` for pure refactors (VR-BUILD + VR-TYPE + VR-TEST sufficient), documentation-only items, and migrations where SQL IS the contract.
-5. **Max 3 negotiation rounds** per item. If unresolved, escalate via AskUserQuestion.
----
 ## Phase 2B: Implementation Loop
 For each plan item:
-1. **Pre-check**: Load CR rules, domain patterns for the affected file
+1. **Pre-check**: Verify file exists, read current state
 2. **Execute**: Implement the item following established patterns
-3. **Guardrail**: Run `./scripts/pattern-scanner.sh` (ABORT if fails)
+3. **Guardrail**: Run `bash scripts/massu-pattern-scanner.sh` (ABORT if fails)
 4. **Verify**: Run applicable VR-* checks with proof
 5. **VR-PIPELINE**: If the item involves a data pipeline (AI, cron, generation, ETL), trigger the pipeline manually, verify output is non-empty. Empty output = fix before continuing.
 6. **Update**: Mark item complete in tracking table
@@ -83,16 +57,13 @@ For each plan item:
 ```
 CHECKPOINT:
-[1] READ plan section    [2] QUERY DB             [3] GREP routers
-[4] LS components        [5] VR-RENDER check      [6] VR-COUPLING check
-[7] Pattern scanner      [8] npm run build         [9] npx tsc --noEmit
-[10] npm run lint        [11] npx prisma validate  [12] npm test
-[13] UI/UX verification  [14] API/router verification  [15] Security check
-[16] COUNT gaps -> IF > 0: FIX and return to [1]
+[1] READ plan section          [2] GREP tool registrations    [3] LS modules
+[4] VR-CONFIG check            [5] VR-TOOL-REG check          [6] VR-HOOK-BUILD check
+[7] Pattern scanner            [8] npm run build               [9] cd packages/core && npx tsc --noEmit
+[10] npm test                  [11] VR-GENERIC check           [12] Security scanner
+[13] COUNT gaps -> IF > 0: FIX and return to [1]
 ```
-> **Cross-reference**: Full checkpoint audit protocol with detailed steps is in `massu-loop/references/checkpoint-audit.md`.
 ---
 ## Phase 2C: Multi-Perspective Review
@@ -112,60 +83,14 @@ architecture_result = Task(subagent_type="massu-architecture-reviewer", model="o
   Return structured result with ARCHITECTURE_GATE: PASS/FAIL.
 ")
-ux_result = Task(subagent_type="massu-ux-reviewer", model="sonnet", prompt="
+quality_result = Task(subagent_type="massu-quality-reviewer", model="sonnet", prompt="
   Review implementation for plan: {PLAN_PATH}
-  Focus: UX, accessibility, loading/error/empty states, consistency.
-  Return structured result with UX_GATE: PASS/FAIL.
+  Focus: Code quality, ESM compliance, config-driven patterns, TypeScript strict mode, test coverage.
+  Return structured result with QUALITY_GATE: PASS/FAIL.
 ")
 ```
-**Phase 2C.2: QA Evaluator** (conditional -- UI plans only)
-> Full spec: [qa-evaluator-spec.md](qa-evaluator-spec.md)
-If the plan touches UI files, spawn an adversarial QA evaluator:
-```
-IF plan has UI files:
-  qa_result = Task(subagent_type="massu-ux-reviewer", model="opus", prompt="
-    === QA EVALUATOR MODE ===
-    You are an ADVERSARIAL QA agent. Your job is to FIND BUGS, not approve work.
-    Plan: {PLAN_PATH}
-    Sprint contracts: {CONTRACTS_FROM_2A5}
-    For EACH plan item with a sprint contract:
-    1. NAVIGATE to the affected page using Playwright MCP
-    2. EXERCISE the feature as a real user would
-    3. VERIFY against sprint contract acceptance criteria (EVERY criterion)
-    4. CHECK for known failure patterns:
-       - Mock/hardcoded data (data doesn't change when DB changes)
-       - Write succeeds but read/display broken
-       - Feature stubs (onClick/onSubmit empty or log-only)
-       - Invisible elements (display:none, opacity:0, z-index buried)
-       - Missing query invalidation (create item, verify list updates without refresh)
-    5. GRADE: PASS / PARTIAL / FAIL with specific evidence
-    ANTI-LENIENCY RULES:
-    - Never say 'this is acceptable because...' — if criteria aren't met, it's FAIL
-    - Never give benefit of the doubt — if you can't verify it works, it's FAIL
-    - Partial credit is still failure — PARTIAL means 'not done yet'
-    - Every PASS must cite specific evidence (screenshot, DOM state, network response)
-    Return structured result with QA_GATE: PASS/FAIL and per-item grades.
-  ")
-ELSE:
-  Log: "QA Evaluator: SKIPPED (no UI files in plan)"
-```
-**Gate logic**: Fix ALL CRITICAL/HIGH findings before proceeding. WARN findings = document and proceed.
-```
-GATES = [SECURITY_GATE, ARCHITECTURE_GATE, UX_GATE]
-IF plan has UI files: GATES += [QA_GATE]
-IF ANY gate == FAIL: Fix findings and re-run failed gates
-ALL gates must PASS before proceeding to Phase 2D.
-```
+Fix ALL findings at ALL severity levels before proceeding (CR-45). CRITICAL, HIGH, MEDIUM, LOW — all get fixed. No severity is exempt.
 ---
@@ -176,36 +101,15 @@ iteration = 0
 WHILE true:
   iteration += 1
-  # Circuit breaker (detect stagnation)
-  IF iteration >= 3:
-    stalled_items = items that failed in ALL of last 3 iterations
-    IF stalled_items.length > 0:
-      Log: "REFINE-OR-PIVOT: {stalled_items.length} items stalled for 3+ iterations"
-      FOR EACH stalled_item:
-        IF same_root_cause_each_time: REFINE (targeted fix for root cause)
-        IF different_failures_each_time: PIVOT (scrap approach, try alternative)
-        IF no_clear_pattern: AskUserQuestion with evidence from last 3 attempts
+  # Circuit breaker (CR-37)
+  IF iteration >= 3 AND same gaps as previous iteration:
+    AskUserQuestion: "Loop stalled after {iteration} passes. Re-plan / Continue / Stop?"
   result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
     Audit iteration {iteration} for plan: {PLAN_PATH}
     Verify ALL deliverables with VR-* proof.
     Check code quality (patterns, build, types, tests).
     Check plan coverage (every item verified).
-    VR-SPEC-MATCH: For EVERY UI plan item with specific CSS classes,
-    component names, or layout instructions -- grep the implementation for those
-    EXACT strings. Missing = gap.
-    VR-PIPELINE: For features with data pipelines (AI, cron, generation),
-    trigger the pipeline procedure and verify output is non-empty. Empty = gap.
-    SPRINT CONTRACT VERIFICATION: For each plan item with a sprint contract
-    (from Phase 2A.5), verify EVERY acceptance criterion is met:
-    - Read the contract's acceptance criteria list
-    - Test each criterion with specific evidence (screenshot, grep, DOM state)
-    - Any unmet criterion = gap, even if the code 'looks right'
-    - Contract criteria are IN ADDITION TO VR-* checks — both must pass
     Fix any gaps you find.
     CRITICAL: GAPS_DISCOVERED = total FOUND, even if fixed.
@@ -222,7 +126,7 @@ END WHILE
 ---
-## Phase 2E: Post-Build Reflection + Memory Persist
+## Phase 2E: Post-Build Reflection + Memory Persist (CR-38)
 **MANDATORY -- reflection + memory write = ONE atomic action.**
@@ -240,14 +144,13 @@ Apply any low-risk refactors immediately. Log remaining suggestions in plan unde
 ## Phase 2F: Documentation Sync (User-Facing Features)
-If plan includes ANY user-facing features:
+If plan includes ANY user-facing features (new MCP tools, config changes, hook changes):
-1. Audit documentation against code changes
-2. Update affected documentation pages
-3. Add changelog entry
-4. Commit documentation updates (separate repo if applicable)
+1. Update relevant documentation (README, API docs, config docs)
+2. Ensure tool descriptions match implementation
+3. Update config schema documentation if config keys changed
-Skip ONLY if purely backend/infra with zero user-facing changes.
+Skip ONLY if purely internal refactoring with zero user-facing changes.
 ---
@@ -257,28 +160,27 @@ Skip ONLY if purely backend/infra with zero user-facing changes.
 [GOLDEN PATH -- PHASE 2G: BROWSER VERIFICATION]
 ```
-**Auto-trigger condition**: If plan touches ANY UI files, this phase runs automatically. If purely backend/infra with zero UI changes, skip with log note: `Browser verification: SKIPPED (no UI files changed)`.
-**SAFETY RULE**: NEVER use real client data. NEVER click destructive actions (Delete, Send, Submit) on production.
+**Auto-trigger condition**: If plan touches ANY UI/demo files or produces visual output, this phase runs automatically. If purely backend/MCP/config with zero visual output, skip with log note: `Browser verification: SKIPPED (no UI files changed)`.
 ### 2G.1 Determine Target Pages
-Map changed files to URLs:
+Map changed features to testable URLs:
+- If the project has a demo page or documentation site: test affected pages
+- If testing MCP tool output: use a test harness or verify tool responses
 - Component changes: identify ALL pages that render the component
-- Layout changes: test ALL child routes under that layout
 ### 2G.2 Browser Setup & Authentication
-Use Playwright MCP plugin tools.
+Use Playwright MCP plugin tools (`mcp__plugin_playwright_playwright__*`). Fallback: `mcp__playwright__*`.
-1. `browser_navigate` to first target URL
-2. `browser_snapshot` to check auth status
-3. If redirected to `/login` or auth check visible: STOP and request manual login
+1. `browser_navigate` to target URL
+2. `browser_snapshot` to check page status
+3. If authentication required: STOP and request manual login
 ```
 AUTHENTICATION REQUIRED
-The Playwright browser is not logged in to the app.
+The Playwright browser is not logged in to the target application.
 Please log in manually in the open browser window, then re-run the golden path.
 ```
@@ -290,24 +192,24 @@ For EACH target page:
 | Check | Tool | Captures |
 |-------|------|----------|
-| Console errors/warnings | `browser_console_messages` | React errors, TypeError, CSP violations, auth warnings |
+| Console errors/warnings | `browser_console_messages` | React errors, TypeError, CSP violations |
 | Network failures | `browser_network_requests` | 500s, 404s, CORS failures, timeouts |
 Categorize findings:
 | Category | Severity |
 |----------|----------|
-| React crash, 500 error, data exposure | **P0 -- CRITICAL** |
-| Network failure, CSP violation, broken interaction, auth warning | **P1 -- HIGH** |
-| Visual issues, performance warnings, broken images | **P2 -- MEDIUM** |
-| Console warnings, deprecations, i18n missing keys | **P3 -- LOW** |
+| Crash, 500 error, data exposure | **P0 -- CRITICAL** |
+| Network failure, broken interaction | **P1 -- HIGH** |
+| Visual issues, performance warnings | **P2 -- MEDIUM** |
+| Console warnings, deprecations | **P3 -- LOW** |
 ### 2G.4 Interactive Testing (Per Page)
-1. `browser_snapshot` -> inventory ALL interactive elements
+1. `browser_snapshot` -> inventory ALL interactive elements (buttons, links, forms, selects, tabs, modals, data tables)
 2. For EACH testable element:
-   - Capture console state BEFORE interaction
-   - Perform interaction
+   - Capture console state BEFORE interaction (`browser_console_messages`)
+   - Perform interaction (`browser_click`, `browser_select_option`, `browser_fill_form`)
    - Wait 2-3 seconds for async operations
    - Capture console state AFTER interaction
    - Record any NEW errors introduced
@@ -319,15 +221,15 @@ Categorize findings:
 ### 2G.5 Visual & Performance Audit
 **Visual checks**:
-- Broken images: find `img` elements with `naturalWidth === 0`
+- Broken images: `browser_evaluate` to find `img` elements with `naturalWidth === 0`
 - Layout issues: overflow, overlapping, missing content, broken alignment
-- Responsive: test at 1440x900 (desktop), 768x1024 (tablet), 375x812 (mobile)
-- Screenshot evidence at each breakpoint if issues found
+- Responsive: `browser_resize` at 1440x900 (desktop), 768x1024 (tablet), 375x812 (mobile)
+- Screenshot evidence: `browser_take_screenshot` at each breakpoint if issues found
 **Performance checks**:
-- Page load timing
-- Resources > 500KB
-- Slow API calls > 3s, duplicate requests
+- Page load timing via `browser_evaluate` (`performance.getEntriesByType('navigation')`)
+- Resources > 500KB via `browser_evaluate` (`performance.getEntriesByType('resource')`)
+- Slow API calls > 3s, duplicate requests via `browser_network_requests`
 | Metric | Good | Needs Work | Critical |
 |--------|------|------------|----------|
@@ -361,10 +263,9 @@ Report includes: summary table, console errors, network failures, interactive el
 ### 2G.8 Auto-Learning Protocol
 For EACH browser-discovered fix:
-1. Record with type="bugfix" including browser symptom -> code fix mapping
-2. Update MEMORY.md with symptom/root cause/fix/files
-3. Add to `scripts/pattern-scanner.sh` if the bad pattern is grep-able
-4. Codebase-wide search for same bad pattern (CR-9) -- fix ALL instances
+1. Update memory files with symptom/root cause/fix/files
+2. Add to `scripts/massu-pattern-scanner.sh` if the bad pattern is grep-able
+3. Codebase-wide search for same bad pattern (CR-9) -- fix ALL instances
 ---
@@ -386,11 +287,9 @@ The golden path spawns multiple subagents across Phase 2. Follow these principle
 ```
 [GOLDEN PATH -- PHASE 2 COMPLETE]
-  Sprint contracts: NEGOTIATED ({N} items contracted, {M} N/A)
   All plan items implemented
-  Multi-perspective review: PASSED (security, architecture, UX)
-  QA evaluator: PASSED / SKIPPED (no UI files)
-  Verification audit: PASSED (Loop #{iteration}, 0 gaps, contracts verified)
+  Multi-perspective review: PASSED (security, architecture, quality)
+  Verification audit: PASSED (Loop #{iteration}, 0 gaps)
   Post-build reflection: PERSISTED to memory
   Documentation sync: COMPLETE / N/A
   Browser verification: PASSED ({N} pages tested, {M} issues fixed) / SKIPPED (no UI files)
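
The simplified circuit breaker in Phase 2D (`iteration >= 3 AND same gaps as previous iteration`) can be read as the check sketched below. Representing gaps as arrays of string IDs, and both function names, are assumptions made for the example; the diff does not say how the auditor reports gaps.

```typescript
// Two gap lists are "the same" when they contain identical IDs, ignoring order.
function sameGaps(a: string[], b: string[]): boolean {
  if (a.length !== b.length) return false;
  const sortedA = a.slice().sort();
  const sortedB = b.slice().sort();
  return sortedA.every((gap, index) => gap === sortedB[index]);
}

// Stalled: at least three passes and no change in gaps since the previous pass.
function isLoopStalled(iteration: number, gaps: string[], previousGaps: string[] | null): boolean {
  return iteration >= 3 && previousGaps !== null && sameGaps(gaps, previousGaps);
}
```

When `isLoopStalled` returns true, the documented behavior is to ask the user whether to re-plan, continue, or stop, rather than to choose a refine-or-pivot strategy automatically as in 0.6.0.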