npm - @massu/core - Versions diffs - 0.4.2 → 0.6.0 - Mend

@massu/core 0.4.2 → 0.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (125)

package/README.md +40 -0
package/agents/massu-architecture-reviewer.md +104 -0
package/agents/massu-blast-radius-analyzer.md +84 -0
package/agents/massu-competitive-scorer.md +126 -0
package/agents/massu-help-sync.md +73 -0
package/agents/massu-migration-writer.md +94 -0
package/agents/massu-output-scorer.md +87 -0
package/agents/massu-pattern-reviewer.md +84 -0
package/agents/massu-plan-auditor.md +170 -0
package/agents/massu-schema-sync-verifier.md +70 -0
package/agents/massu-security-reviewer.md +98 -0
package/agents/massu-ux-reviewer.md +106 -0
package/commands/_shared-preamble.md +53 -23
package/commands/_shared-references/auto-learning-protocol.md +71 -0
package/commands/_shared-references/blast-radius-protocol.md +76 -0
package/commands/_shared-references/security-pre-screen.md +64 -0
package/commands/_shared-references/test-first-protocol.md +87 -0
package/commands/_shared-references/verification-table.md +52 -0
package/commands/massu-article-review.md +343 -0
package/commands/massu-autoresearch/references/eval-runner.md +84 -0
package/commands/massu-autoresearch/references/safety-rails.md +125 -0
package/commands/massu-autoresearch/references/scoring-protocol.md +151 -0
package/commands/massu-autoresearch.md +258 -0
package/commands/massu-batch.md +44 -12
package/commands/massu-bearings.md +42 -8
package/commands/massu-checkpoint.md +588 -0
package/commands/massu-ci-fix.md +2 -2
package/commands/massu-command-health.md +132 -0
package/commands/massu-command-improve.md +232 -0
package/commands/massu-commit.md +205 -44
package/commands/massu-create-plan.md +239 -57
package/commands/massu-data/references/common-queries.md +79 -0
package/commands/massu-data/references/table-guide.md +50 -0
package/commands/massu-data.md +66 -0
package/commands/massu-dead-code.md +29 -34
package/commands/massu-debug/references/auto-learning.md +61 -0
package/commands/massu-debug/references/codegraph-tracing.md +80 -0
package/commands/massu-debug/references/common-shortcuts.md +98 -0
package/commands/massu-debug/references/investigation-phases.md +294 -0
package/commands/massu-debug/references/report-format.md +107 -0
package/commands/massu-debug.md +105 -386
package/commands/massu-docs.md +1 -1
package/commands/massu-full-audit.md +61 -0
package/commands/massu-gap-enhancement-analyzer.md +276 -16
package/commands/massu-golden-path/references/approval-points.md +216 -0
package/commands/massu-golden-path/references/competitive-mode.md +273 -0
package/commands/massu-golden-path/references/error-handling.md +121 -0
package/commands/massu-golden-path/references/phase-0-requirements.md +53 -0
package/commands/massu-golden-path/references/phase-1-plan-creation.md +168 -0
package/commands/massu-golden-path/references/phase-2-implementation.md +397 -0
package/commands/massu-golden-path/references/phase-2.5-gap-analyzer.md +156 -0
package/commands/massu-golden-path/references/phase-3-simplify.md +40 -0
package/commands/massu-golden-path/references/phase-4-commit.md +94 -0
package/commands/massu-golden-path/references/phase-5-push.md +116 -0
package/commands/massu-golden-path/references/phase-5.5-production-verify.md +170 -0
package/commands/massu-golden-path/references/phase-6-completion.md +113 -0
package/commands/massu-golden-path/references/qa-evaluator-spec.md +137 -0
package/commands/massu-golden-path/references/sprint-contract-protocol.md +117 -0
package/commands/massu-golden-path/references/vr-visual-calibration.md +73 -0
package/commands/massu-golden-path.md +114 -848
package/commands/massu-guide.md +72 -69
package/commands/massu-hooks.md +27 -12
package/commands/massu-hotfix.md +221 -144
package/commands/massu-incident.md +49 -20
package/commands/massu-infra-audit.md +187 -0
package/commands/massu-learning-audit.md +211 -0
package/commands/massu-loop/references/auto-learning.md +49 -0
package/commands/massu-loop/references/checkpoint-audit.md +40 -0
package/commands/massu-loop/references/guardrails.md +17 -0
package/commands/massu-loop/references/iteration-structure.md +115 -0
package/commands/massu-loop/references/loop-controller.md +188 -0
package/commands/massu-loop/references/plan-extraction.md +78 -0
package/commands/massu-loop/references/vr-plan-spec.md +140 -0
package/commands/massu-loop-playwright.md +9 -9
package/commands/massu-loop.md +115 -670
package/commands/massu-new-pattern.md +423 -0
package/commands/massu-perf.md +422 -0
package/commands/massu-plan-audit.md +1 -1
package/commands/massu-plan.md +389 -122
package/commands/massu-production-verify.md +433 -0
package/commands/massu-push.md +62 -378
package/commands/massu-recap.md +29 -3
package/commands/massu-rollback.md +613 -0
package/commands/massu-scaffold-hook.md +2 -4
package/commands/massu-scaffold-page.md +2 -3
package/commands/massu-scaffold-router.md +1 -2
package/commands/massu-security.md +619 -0
package/commands/massu-simplify.md +115 -85
package/commands/massu-squirrels.md +2 -2
package/commands/massu-tdd.md +38 -22
package/commands/massu-test.md +3 -3
package/commands/massu-type-mismatch-audit.md +469 -0
package/commands/massu-ui-audit.md +587 -0
package/commands/massu-verify-playwright.md +287 -32
package/commands/massu-verify.md +150 -46
package/dist/cli.js +1451 -1047
package/dist/hooks/post-tool-use.js +75 -6
package/dist/hooks/user-prompt.js +16 -0
package/package.json +6 -2
package/patterns/build-patterns.md +302 -0
package/patterns/component-patterns.md +246 -0
package/patterns/display-patterns.md +185 -0
package/patterns/form-patterns.md +890 -0
package/patterns/integration-testing-checklist.md +445 -0
package/patterns/security-patterns.md +219 -0
package/patterns/testing-patterns.md +569 -0
package/patterns/tool-routing.md +81 -0
package/patterns/ui-patterns.md +371 -0
package/protocols/plan-implementation.md +267 -0
package/protocols/recovery.md +225 -0
package/protocols/verification.md +404 -0
package/reference/command-taxonomy.md +178 -0
package/reference/cr-rules-reference.md +76 -0
package/reference/hook-execution-order.md +148 -0
package/reference/lessons-learned.md +175 -0
package/reference/patterns-quickref.md +208 -0
package/reference/standards.md +135 -0
package/reference/subagents-reference.md +17 -0
package/reference/vr-verification-reference.md +867 -0
package/src/commands/init.ts +27 -0
package/src/commands/install-commands.ts +149 -53
package/src/hooks/post-tool-use.ts +17 -0
package/src/hooks/user-prompt.ts +21 -0
package/src/memory-file-ingest.ts +127 -0
package/src/memory-tools.ts +34 -1

package/commands/massu-loop.md CHANGED

@@ -1,688 +1,150 @@
 ---
 name: massu-loop
-description: Execute task with CS Loop verification protocol (autonomous execution with mandatory proof)
-allowed-tools: Bash(*), Read(*), Write(*), Edit(*), Grep(*), Glob(*), Task(*)
+description: "When user wants to implement a plan autonomously -- 'implement this plan', 'start the loop', 'execute', or provides a plan file to implement end-to-end"
+allowed-tools: Bash(*), Read(*), Write(*), Edit(*), Grep(*), Glob(*)
 ---
 name: massu-loop
-> **Shared rules apply.** Read `.claude/commands/_shared-preamble.md` before proceeding. CR-9, CR-35 enforced.
+# Massu Loop: Autonomous Execution Protocol
-# CS Loop: Autonomous Execution Protocol
+**Shared rules**: Read `.claude/commands/_shared-preamble.md` before proceeding.
+---
 ## Workflow Position
 ```
 /massu-create-plan -> /massu-plan -> /massu-loop -> [/massu-simplify] -> /massu-commit -> /massu-push
-(CREATE)           (AUDIT)        (IMPLEMENT)    (QUALITY)           (COMMIT)        (PUSH)
+(CREATE)           (AUDIT)        (IMPLEMENT)    (QUALITY)         (COMMIT)        (PUSH)
 ```
 **This command is step 3 of 5 in the standard workflow. /massu-simplify is an optional quality step after implementation.**
 ---
-## MANDATORY LOOP CONTROLLER (EXECUTE THIS - DO NOT SKIP)
-**This section is the EXECUTION ENTRY POINT. You MUST follow these steps exactly.**
-### How This Command Works
-This command is a **loop controller** for implementation + verification. Your job is to:
-1. Extract plan items and implement them
-2. After implementation, spawn focused review subagents IN PARALLEL for independent analysis
-3. After reviews, spawn a `general-purpose` subagent for verification
-4. Parse the structured result (`GAPS_DISCOVERED: N`)
-5. If gaps discovered > 0: fix gaps, then spawn ANOTHER FRESH auditor pass
-6. Only when a COMPLETE FRESH PASS discovers ZERO gaps can you declare complete
-**The verification audit runs inside Task subagents. This prevents early termination.**
-### CRITICAL: GAPS_DISCOVERED Semantics
-**`GAPS_DISCOVERED` = total gaps FOUND during the pass, REGARDLESS of whether they were also fixed.**
-| Scenario | GAPS_DISCOVERED | Loop Action |
-|----------|----------------|-------------|
-| Pass finds 0 gaps | 0 | **EXIT** - verification complete |
-| Pass finds 5 gaps, fixes all 5 | **5** (NOT 0) | **CONTINUE** - must re-verify |
-| Pass finds 3 gaps, fixes 1, 2 need controller | **3** | **CONTINUE** - fix remaining, re-verify |
-**THE RULE**: A clean pass means zero gaps DISCOVERED from the start. Fixing gaps during a pass does NOT make it a clean pass. Only a fresh pass finding nothing proves correctness.
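The exit rule in the removed text above can be illustrated with a minimal sketch. It assumes the auditor returns the structured result block that the removed execution protocol describes; the helper names `parseGapsDiscovered` and `nextLoopAction` are hypothetical and are not part of `@massu/core`.

```typescript
// Hypothetical helpers for illustration only; not part of @massu/core.
type LoopAction = "EXIT" | "CONTINUE";

// Pull the GAPS_DISCOVERED count out of an auditor's structured result text.
function parseGapsDiscovered(resultText: string): number {
  const digits = resultText.match(/^\s*GAPS_DISCOVERED:\s*(\d+)\s*$/m)?.[1];
  if (digits === undefined) {
    throw new Error("GAPS_DISCOVERED not found in structured result");
  }
  return Number.parseInt(digits, 10);
}

// Only the discovered count drives the loop; GAPS_FIXED and
// GAPS_REMAINING are deliberately ignored.
function nextLoopAction(resultText: string): LoopAction {
  return parseGapsDiscovered(resultText) === 0 ? "EXIT" : "CONTINUE";
}

// A pass that found 5 gaps and fixed all 5 still requires a fresh pass.
const foundAndFixed = [
  "---STRUCTURED-RESULT---",
  "ITERATION: 1",
  "GAPS_DISCOVERED: 5",
  "GAPS_FIXED: 5",
  "GAPS_REMAINING: 0",
  "---END-RESULT---",
].join("\n");

console.log(nextLoopAction(foundAndFixed)); // CONTINUE
```

Keying the decision on the discovered count rather than the remaining count is what prevents a "found and fixed" pass from ending the loop without re-verification.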
-### Agent Result Persistence
-All Task sub-agents MUST write their results to disk in addition to returning text:
-- Security review: `.massu/agent-results/{timestamp}-security.json`
-- Architecture review: `.massu/agent-results/{timestamp}-architecture.json`
-- Verification audit: `.massu/agent-results/{timestamp}-verify-{iteration}.json`
-JSON format: `{ iteration, gaps_discovered, gaps_fixed, gaps_remaining, plan_items_total, plan_items_verified, findings: [] }`
-This prevents context overflow from killing verification progress. If the parent session crashes, a new session can read these files via `bash scripts/hooks/read-agent-results.sh` to resume.
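As a rough sketch, the persisted JSON shape listed in the removed text could be typed and validated as follows. The field names come from the format line above; the field types (numbers, and an array of unspecified elements for `findings`) and the `isAgentResult` guard are assumptions, not something the package is known to ship.

```typescript
// Field names follow the JSON format quoted in the diff; the types are assumed.
interface AgentResult {
  iteration: number;
  gaps_discovered: number;
  gaps_fixed: number;
  gaps_remaining: number;
  plan_items_total: number;
  plan_items_verified: number;
  findings: unknown[]; // element shape is not specified in the source
}

const NUMERIC_FIELDS = [
  "iteration",
  "gaps_discovered",
  "gaps_fixed",
  "gaps_remaining",
  "plan_items_total",
  "plan_items_verified",
] as const;

// Hypothetical type guard: checks that a parsed file has the expected shape
// before a resumed session trusts it.
function isAgentResult(value: unknown): value is AgentResult {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    NUMERIC_FIELDS.every((field) => typeof record[field] === "number") &&
    Array.isArray(record["findings"])
  );
}

const raw =
  '{"iteration":2,"gaps_discovered":3,"gaps_fixed":1,"gaps_remaining":2,' +
  '"plan_items_total":12,"plan_items_verified":10,"findings":[]}';
const parsed: unknown = JSON.parse(raw);

console.log(isAgentResult(parsed)); // true
```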
-### Workflow State Tracking
-At the start of this command, write a transition entry to `.massu/workflow-log.md`:
-```
-| [timestamp] | AUDIT/PLAN | IMPLEMENT | /massu-loop | [session-id] |
-```
-At completion, write a completion entry.
-### Execution Protocol
-```
-PLAN_PATH = $ARGUMENTS (the plan file path or task description)
-iteration = 0
-# Phase 1: IMPLEMENT (do the work)
-# Read plan, extract items, implement each one with VR-* proof
-# Phase 1.5: MULTI-PERSPECTIVE REVIEW (after implementation, before verification)
-# Spawn focused review subagents IN PARALLEL for independent analysis
-# Each reviewer has an adversarial mindset and a SINGLE focused concern (Principle #20)
-# Elegance/simplicity assessment happens in Phase 2.1 POST-BUILD REFLECTION (Q4)
-security_result = Task(subagent_type="general-purpose", model="opus", prompt="
-  Review implementation for plan: {PLAN_PATH}
-  Focus: Security vulnerabilities, auth gaps, input validation, data exposure
-  Check all new/modified files. Return structured result with SECURITY_GATE.
-")
-architecture_result = Task(subagent_type="general-purpose", model="opus", prompt="
-  Review implementation for plan: {PLAN_PATH}
-  Focus: Design issues, coupling problems, pattern compliance, scalability
-  Check all new/modified files. Return structured result with ARCHITECTURE_GATE.
-")
-# Parse results and fix any CRITICAL/HIGH findings before proceeding to verification
-# FAIL gate = must fix before proceeding
-# WARN findings = document and proceed
-# Phase 2: VERIFY (audit loop - STRUCTURAL)
-WHILE true:
-  iteration += 1
-  # Run circuit breaker check (detect stagnation)
-  # If same gaps appear 3+ times with no progress, consider changing approach
-  IF iteration > 3 AND no_progress_count >= 3:
-    Output: "CIRCUIT BREAKER: The current approach is not converging after {iteration} passes."
-    Output: "Options: (a) Re-plan with different approach (b) Continue current approach (c) Stop"
-    AskUserQuestion: "The loop has stalled. How should we proceed?"
-    IF user chooses re-plan: STOP loop, output current state, recommend /massu-create-plan
-    IF user chooses continue: CONTINUE loop (reset circuit breaker)
-    IF user chooses stop: STOP loop, output current state as incomplete
-  # Spawn auditor subagent for ONE complete verification pass
-  result = Task(subagent_type="general-purpose", model="opus", prompt="
-    Verification audit iteration {iteration} for plan: {PLAN_PATH}
-    This is a Massu implementation (library/MCP server, NOT a web app).
-    Execute ONE complete audit pass. Verify ALL deliverables.
-    Check code quality (patterns, types, tests).
-    Check plan coverage (every item verified with proof).
-    Fix any gaps you find (code or plan document).
-    CONTEXT: Massu is a TypeScript monorepo with:
-    - packages/core/src/ (MCP server source)
-    - packages/core/src/__tests__/ (vitest tests)
-    - packages/core/src/hooks/ (esbuild-compiled hooks)
-    - website/ (Next.js + Supabase website)
-    - massu.config.yaml (project config)
-    - Tool registration: 3-function pattern (getDefs, isTool, handleCall) in tools.ts
-    VERIFICATION COMMANDS:
-    - Pattern scanner: bash scripts/massu-pattern-scanner.sh
-    - Type check: cd packages/core && npx tsc --noEmit
-    - Tests: npm test
-    - Hook build: cd packages/core && npm run build:hooks
-    VR-* CHECKS (use ONLY these, per CLAUDE.md):
-    - VR-FILE, VR-GREP, VR-NEGATIVE, VR-COUNT (generic)
-    - VR-BUILD: npm run build (tsc + hooks)
-    - VR-TYPE, VR-TEST, VR-TOOL-REG, VR-HOOK-BUILD, VR-CONFIG, VR-PATTERN
-    CRITICAL INSTRUCTION FOR GAPS_DISCOVERED:
-    Report GAPS_DISCOVERED as the total number of gaps you FOUND during this pass,
-    EVEN IF you also fixed them. Finding 5 gaps and fixing all 5 = GAPS_DISCOVERED: 5.
-    A clean pass that finds nothing wrong from the start = GAPS_DISCOVERED: 0.
-    Return the structured result block:
-    ---STRUCTURED-RESULT---
-    ITERATION: {iteration}
-    GAPS_DISCOVERED: [number]
-    GAPS_FIXED: [number]
-    GAPS_REMAINING: [number]
-    PLAN_ITEMS_TOTAL: [number]
-    PLAN_ITEMS_VERIFIED: [number]
-    CODE_QUALITY_GATE: PASS/FAIL
-    PLAN_COVERAGE_GATE: PASS/FAIL
-    ---END-RESULT---
-  ")
-  # Parse structured result
-  gaps = parse GAPS_DISCOVERED from result
-  # Report iteration to user
-  Output: "Verification iteration {iteration}: {gaps} gaps discovered"
-  IF gaps == 0:
-    Output: "ALL GATES PASSED - Clean pass with zero gaps discovered in iteration {iteration}"
-    BREAK
-  ELSE:
-    Output: "{gaps} gaps discovered in iteration {iteration}, starting fresh re-verification..."
-    # Fix code-level gaps the auditor identified but couldn't fix
-    # Then continue the loop for re-verification
-    CONTINUE
-END WHILE
-# Phase 2.1: POST-BUILD REFLECTION + MANDATORY MEMORY PERSIST (CR-38)
-# Now that the implementation is verified, capture the agent's accumulated knowledge
-# before context disappears. Ask and answer these questions, then WRITE TO MEMORY:
-#
-# 1. "Now that I've built this, what would I have done differently?"
-#    - Identify architectural choices that caused friction
-#    - Note patterns that were harder to work with than expected
-#    - Flag code that works but feels fragile or overly complex
-#
-# 2. "What should be refactored before moving on?"
-#    - Concrete refactoring suggestions with file paths
-#    - Technical debt introduced during implementation
-#    - Opportunities to simplify or consolidate
-#
-# 3. "Did we over-build? Is there a simpler way?"
-#    - Identify any added complexity that wasn't strictly needed
-#    - Flag scope expansion beyond the original plan
-#    - Check if any "fix everything encountered" items could have been simpler
-#
-# 4. "Would a staff engineer approve this?" (Principle #19)
-#    - Check if the solution demonstrates good engineering taste
-#    - Look for over-abstraction, unnecessary indirection, or "clever" code
-#    - For non-trivial implementations: is there a more elegant approach?
-#    - For simple fixes: skip this check - don't over-engineer obvious solutions
-#
-# MANDATORY: After answering, IMMEDIATELY write ALL learnings to memory/ files.
-# This is NOT optional. Reflection without persistence is wasted knowledge.
-# - Failed approaches -> MEMORY.md or topic file
-# - New patterns discovered -> MEMORY.md or topic file
-# - Tool/config gotchas -> MEMORY.md or topic file
-# - Architectural insights -> MEMORY.md or topic file
-# The reflection step and the memory-write step are ONE ATOMIC ACTION.
-# DO NOT output reflections as text without also writing them to memory files.
-#
-# Then apply any low-risk refactors immediately.
-# Log remaining suggestions in the plan document under "## Post-Build Reflection".
-```
-### Rules for the Loop Controller
-| Rule | Meaning |
-|------|---------|
-| **NEVER output a final verdict while gaps discovered > 0** | Only a CLEAN zero-gap-from-start iteration produces the final report |
-| **NEVER treat "found and fixed" as zero gaps** | Fixing during a pass still means gaps were discovered |
-| **NEVER ask user "should I continue?"** | The loop is mandatory - just execute it |
-| **NEVER stop after fixing gaps** | Fixing gaps requires a FRESH re-audit to verify the fixes |
-| **ALWAYS use Task tool for verification passes** | Subagents keep context clean |
-| **ALWAYS parse GAPS_DISCOVERED from result** | This is the loop control variable (DISCOVERED, not REMAINING) |
-| **Maximum 10 iterations** | If still failing after 10, report to user with remaining gaps |
-| **ALWAYS run multi-perspective review after implementation** | Multiple reviewers catch different issues than 1 auditor |
-| **Run review subagents IN PARALLEL** | Security and architecture reviews are independent |
-| **Fix CRITICAL/HIGH findings before verification** | Don't waste auditor passes on known issues |
+## Skill Contents
-### Why This Architecture Exists
+This skill is a folder. The following files are available for reference:
-**Incident #14**: Audit loop terminated after 1 pass with open gaps. Root cause: instructional "MUST loop" text competed with default "report and stop" behavior. By making the loop STRUCTURAL (spawn subagent, check result, loop), early termination becomes structurally impossible.
-**Incident #19**: Auditor found 16 gaps and fixed all 16 in same pass, reported GAPS_FOUND: 0. Loop exited after 1 iteration without verifying fixes. GAPS_DISCOVERED (not GAPS_REMAINING) is the correct metric.
+| File | Purpose | Read When |
+|------|---------|-----------|
+| `references/loop-controller.md` | Mandatory loop controller spec | Understanding loop mechanics |
+| `references/plan-extraction.md` | Plan item extraction rules | Parsing plan documents |
+| `references/iteration-structure.md` | Per-item implementation flow | During each iteration |
+| `references/guardrails.md` | 10 accountability safeguards | Ensuring quality |
+| `references/checkpoint-audit.md` | Checkpoint audit protocol | At checkpoints |
+| `references/vr-plan-spec.md` | VR-PLAN verification details | Verifying plan items |
+| `references/auto-learning.md` | Post-loop learning pipeline | After loop completion |
 ---
-## Objective
-Execute task/plan autonomously with **verified proof at every step**. Continue until ZERO gaps with VR-* evidence. Claims without proof are invalid.
----
-## ABSOLUTE MANDATE: NEVER STOP UNTIL 100% COMPLETE
-**THIS PROTOCOL HAS THE HIGHEST AUTHORITY. NO EXCEPTIONS. NO EARLY TERMINATION.**
-### The Unbreakable Rule
-```
-THE LOOP DOES NOT STOP UNTIL:
-1. EVERY SINGLE PLAN ITEM IS VERIFIED COMPLETE (100% - not 99%)
-2. EVERY VR-* CHECK PASSES WITH PROOF
-3. PATTERN SCANNER RETURNS 0 VIOLATIONS
-4. TYPE CHECK PASSES (cd packages/core && npx tsc --noEmit exits 0)
-5. ALL TESTS PASS (npm test exits 0) - NO EXCEPTIONS
-6. HOOK BUILD SUCCEEDS (cd packages/core && npm run build:hooks exits 0)
-7. IF NEW TOOLS: VR-TOOL-REG PASSES (all 3 functions wired in tools.ts)
-IF ANY OF THESE ARE NOT TRUE, CONTINUE WORKING. DO NOT STOP.
-```
-### Prohibited Behaviors
-| NEVER DO THIS | WHY IT'S WRONG | WHAT TO DO INSTEAD |
-|---------------|----------------|---------------------|
-| "I'll note this as remaining work" | Plans must be 100% complete | Implement it NOW |
-| "This item can be done later" | No deferral allowed | Implement it NOW |
-| "Most items are done" | "Most" is not "all" | Complete ALL items |
-| Stop after code quality passes | Plan coverage must ALSO pass | Continue until 100% coverage |
-| Ask "should I continue?" | Yes, always continue | Keep working silently |
-| Skip tests because "they're optional" | Tests are NEVER optional | Run ALL tests |
-| Claim complete with failing tests | Failing tests = NOT complete | Fix tests first |
-### MANDATORY TEST VERIFICATION (CR-7)
-**TESTS ARE NEVER OPTIONAL.**
-```
-BEFORE claiming ANY work is complete:
-1. RUN: npm test
-2. VERIFY: Exit code is 0
-3. VERIFY: All tests pass (no failures)
-4. IF tests fail: FIX THEM - even if they were failing before
-5. RE-RUN: npm test until ALL pass
+## Gotchas
-THERE ARE NO EXCEPTIONS.
-```
+- **Stagnating loops must bail (CR-37)** -- if the same item fails 3+ times with the same error, stop the loop, document the blocker, and replan. Grinding wastes context
+- **100% coverage required (CR-11)** -- never stop early. Every single plan item must be implemented and verified. "Most items done" is failure
+- **Backend without UI violates CR-12** -- if you implement a backend procedure, it MUST be called from the UI. Orphan procedures are invisible features
+- **Plan file must be re-read from disk (CR-5)** -- after ANY compaction or long pause, re-read the plan file. Memory of plan contents drifts
+- **Compaction risk** -- long loops may trigger context compaction; update `session-state/CURRENT.md` after each iteration so recovery can resume cleanly
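The stagnation bail-out in the first gotcha can be sketched as a small counter. This assumes "same error" means an identical error string and that the count resets when the error changes or the item succeeds; the `StagnationTracker` class is hypothetical and only illustrates the rule.

```typescript
// Hypothetical tracker for illustration; not part of @massu/core.
class StagnationTracker {
  private readonly threshold: number;
  private readonly lastError = new Map<string, string>();
  private readonly repeatCount = new Map<string, number>();

  constructor(threshold: number = 3) {
    this.threshold = threshold;
  }

  // Records a failure for an item. Returns true once the same error has been
  // seen `threshold` times in a row, meaning the loop should stop and replan.
  recordFailure(itemId: string, errorMessage: string): boolean {
    const sameAsBefore = this.lastError.get(itemId) === errorMessage;
    const count = sameAsBefore ? (this.repeatCount.get(itemId) ?? 0) + 1 : 1;
    this.lastError.set(itemId, errorMessage);
    this.repeatCount.set(itemId, count);
    return count >= this.threshold;
  }

  // A success clears the history for that item.
  recordSuccess(itemId: string): void {
    this.lastError.delete(itemId);
    this.repeatCount.delete(itemId);
  }
}

const tracker = new StagnationTracker();
tracker.recordFailure("P1-002", "TS2339: property does not exist");
tracker.recordFailure("P1-002", "TS2339: property does not exist");
const shouldBail = tracker.recordFailure("P1-002", "TS2339: property does not exist");

console.log(shouldBail); // true
```

Resetting on a different error is a design choice in this sketch: a changing error suggests progress, while an identical error three times suggests the approach itself is wrong.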
 ---
-## PLAN ITEM EXTRACTION PROTOCOL (MANDATORY - STEP 0)
+## External Loop Mode
-**Before ANY implementation, extract ALL plan items into a trackable checklist.**
-### Step 0.1: Read Plan Document (Not Memory)
+For large plans or sessions at risk of context exhaustion, use the **external bash loop** to spawn fresh Claude CLI sessions per plan item:
 ```bash
-cat [PLAN_FILE_PATH]
+bash scripts/loop-external.sh --plan /path/to/plan.md [--max-iterations N] [--dry-run]
 ```
-**You MUST read the plan file. Do NOT rely on memory or summaries.**
+Each iteration gets a clean 200K context window. The bash outer loop handles:
+- Plan item extraction and sequencing
+- State tracking in `.claude/loop-state/external-loop.json`
+- Inter-iteration quality gate (`pattern-scanner.sh`)
+- Hook profile propagation via `MASSU_HOOK_PROFILE`
-### Step 0.2: Extract ALL Deliverables
+**When to use**: Plans with 15+ items, sessions already at high context usage, or when compaction risk is high.
-```markdown
-## PLAN ITEM EXTRACTION
-### Source Document
-- **Plan File**: [path]
-- **Plan Title**: [title]
-- **Total Sections**: [N]
-### Extracted Items
-| Item # | Type | Description | Location | Verification Command | Status |
-|--------|------|-------------|----------|---------------------|--------|
-| P1-001 | MODULE_CREATE | foo-tools.ts | packages/core/src/ | ls -la [path] | PENDING |
-| P1-002 | TOOL_WIRE | Wire into tools.ts | packages/core/src/tools.ts | grep [module] tools.ts | PENDING |
-| P2-001 | TEST | foo.test.ts | packages/core/src/__tests__/ | npm test | PENDING |
-### Item Types
-- MODULE_CREATE: New TypeScript module
-- MODULE_MODIFY: Existing module to change
-- TOOL_WIRE: Wire tool into tools.ts
-- TEST: Test file
-- CONFIG: Config changes (config.ts + YAML)
-- HOOK: New or modified hook
-- REMOVAL: Code/file to remove (use VR-NEGATIVE)
-### Coverage Summary
-- **Total Items**: [N]
-- **Verified Complete**: 0
-- **Coverage**: 0%
-```
-### Step 0.3: Create Verification Commands
-For EACH extracted item, define HOW to verify it:
-| Item Type | Verification Method | Expected Result |
-|-----------|---------------------|-----------------|
-| MODULE_CREATE | `ls -la [path]` | File exists, size > 0 |
-| MODULE_MODIFY | `grep "[change]" [file]` | Pattern found |
-| TOOL_WIRE | `grep "getXDefs\|isXTool\|handleXCall" tools.ts` | All 3 present |
-| TEST | `npm test` | All pass |
-| CONFIG | Parse YAML, grep interface | Valid |
-| HOOK | `cd packages/core && npm run build:hooks` | Exit 0 |
-| REMOVAL | `grep -rn "[old]" packages/core/src/ | wc -l` | 0 matches |
+**Safety**: Does NOT use `--dangerously-skip-permissions`. Safety model is fully preserved.
 ---
-## CHECKPOINT PROTOCOL
-### CHECKPOINT FILE
-**Location**: `.claude/session-state/LOOP_CHECKPOINT.md`
-### CHECKPOINT FORMAT
-```markdown
-## Loop Checkpoint
-- Plan: [plan path]
-- Started: [timestamp]
-- Last Updated: [timestamp]
-- Iteration: [N]
-### Item Status
-| Item # | Description | Status | Verified At |
-|--------|-------------|--------|-------------|
-| P1-001 | [desc] | DONE/PENDING/IN_PROGRESS | [timestamp] |
-| P1-002 | [desc] | DONE/PENDING/IN_PROGRESS | [timestamp] |
-```
-### SAVE CHECKPOINT
-After each item is implemented and verified, update the checkpoint file:
-1. Set item status to `DONE` with current timestamp
-2. Update `Last Updated` timestamp
-3. Update `Iteration` count
-Also update after each verification iteration completes (even if items were found incomplete).
-### RESUME PROTOCOL
-At the START of `/massu-loop`, check for existing checkpoint:
-```bash
-# Check if checkpoint exists
-ls .claude/session-state/LOOP_CHECKPOINT.md 2>/dev/null
-```
-**If checkpoint exists AND references the same plan path:**
-1. Read the checkpoint file
-2. Report: "Resuming from checkpoint: X/Y items complete"
-3. Skip already-DONE items (but still verify them in the next audit pass)
-4. Continue from first PENDING item
-**If checkpoint does NOT exist or references a different plan:**
-1. Start fresh
-2. Create new checkpoint file with all items set to PENDING
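The resume decision described in the removed protocol can be sketched as a pure function. The `Checkpoint` shape and the `decideResume` name are hypothetical; the sketch also treats `IN_PROGRESS` items like `PENDING` ones when choosing where to continue, which is an assumption (the source only says to continue from the first PENDING item).

```typescript
// Hypothetical types for illustration; the real checkpoint is a markdown file.
type ItemStatus = "DONE" | "PENDING" | "IN_PROGRESS";

interface CheckpointItem {
  id: string;
  status: ItemStatus;
}

interface Checkpoint {
  planPath: string;
  items: CheckpointItem[];
}

interface ResumeDecision {
  mode: "resume" | "fresh";
  nextItemId: string | null;
  message: string;
}

function decideResume(
  checkpoint: Checkpoint | null,
  planPath: string,
  planItemIds: string[],
): ResumeDecision {
  // Resume only when a checkpoint exists AND it references the same plan.
  if (checkpoint !== null && checkpoint.planPath === planPath) {
    const done = checkpoint.items.filter((item) => item.status === "DONE").length;
    // Assumption: IN_PROGRESS is handled the same way as PENDING.
    const next = checkpoint.items.find((item) => item.status !== "DONE");
    return {
      mode: "resume",
      nextItemId: next ? next.id : null,
      message: `Resuming from checkpoint: ${done}/${checkpoint.items.length} items complete`,
    };
  }
  // No checkpoint, or a checkpoint for a different plan: start over.
  return {
    mode: "fresh",
    nextItemId: planItemIds[0] ?? null,
    message: `Starting fresh: ${planItemIds.length} items set to PENDING`,
  };
}

const saved: Checkpoint = {
  planPath: "plans/feature.md",
  items: [
    { id: "P1-001", status: "DONE" },
    { id: "P1-002", status: "PENDING" },
  ],
};

console.log(decideResume(saved, "plans/feature.md", ["P1-001", "P1-002"]).message);
// Resuming from checkpoint: 1/2 items complete
```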
-### CHECKPOINT CLEANUP
-When loop completes successfully (`GAPS_DISCOVERED: 0` in a clean pass):
-- Delete the checkpoint file: `rm .claude/session-state/LOOP_CHECKPOINT.md`
-- Report in COMPLETION REPORT: "Checkpoint: cleaned up (loop complete)"
+## Objective
-When loop reaches max iterations without completing:
-- Preserve the checkpoint file for future resume
-- Report in COMPLETION REPORT: "Checkpoint: preserved (loop incomplete -- max iterations reached)"
+Execute task/plan autonomously with **verified proof at every step**. Continue until ZERO gaps with VR-* evidence. Claims without proof are invalid.
 ---
-## VR-PLAN ENUMERATION (Before Verification)
-**Before running ANY verification commands, enumerate ALL applicable VR-* checks.**
-```markdown
-### VR-* Verification Plan
-| VR Check | Target | Command | Expected |
-|----------|--------|---------|----------|
-| VR-FILE | [each new file] | ls -la [path] | Exists |
-| VR-GREP | [each new function] | grep "[func]" [file] | Found |
-| VR-NEGATIVE | [each removal] | grep -rn "[old]" src/ | 0 matches |
-| VR-PATTERN | All source | bash scripts/massu-pattern-scanner.sh | Exit 0 |
-| VR-TYPE | packages/core | cd packages/core && npx tsc --noEmit | 0 errors |
-| VR-TEST | All tests | npm test | All pass |
-| VR-TOOL-REG | [new tools] | grep in tools.ts | All 3 functions |
-| VR-HOOK-BUILD | hooks | cd packages/core && npm run build:hooks | Exit 0 |
-```
+## COMPLETION MANDATE (CR-11, CR-21)
-**Run ALL enumerated checks BEFORE spawning the verification auditor.**
+Loop does NOT stop until: (1) every plan item verified 100%, (2) every VR-* check passes with proof, (3) pattern scanner 0 violations, (4) build passes, (5) tsc --noEmit 0 errors, (6) ALL tests pass (npm test exit 0), (7) lint passes. Test failures must be fixed regardless of origin. No deferral, no "should I continue?", no partial progress reports.
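One way to read the completion mandate is as a conjunction of seven gates. The sketch below models that reading; the gate names and the `failingGates` and `isLoopComplete` helpers are hypothetical labels for the seven conditions listed above, not identifiers from the package.

```typescript
// Hypothetical gate names mapping to conditions (1) through (7) above.
const GATES = [
  "planItemsVerified", // (1) every plan item verified 100%
  "vrChecks",          // (2) every VR-* check passes with proof
  "patternScanner",    // (3) pattern scanner reports 0 violations
  "build",             // (4) build passes
  "typeCheck",         // (5) tsc --noEmit reports 0 errors
  "tests",             // (6) npm test exits 0
  "lint",              // (7) lint passes
] as const;

type Gate = (typeof GATES)[number];
type GateResults = Record<Gate, boolean>;

// Lists the gates that still block completion, in mandate order.
function failingGates(results: GateResults): Gate[] {
  return GATES.filter((gate) => !results[gate]);
}

// The loop may stop only when no gate is failing.
function isLoopComplete(results: GateResults): boolean {
  return failingGates(results).length === 0;
}

const almostDone: GateResults = {
  planItemsVerified: true,
  vrChecks: true,
  patternScanner: true,
  build: true,
  typeCheck: true,
  tests: true,
  lint: false,
};

console.log(isLoopComplete(almostDone)); // false
console.log(failingGates(almostDone)); // [ 'lint' ]
```

Returning the list of failing gates, rather than a bare boolean, keeps the "what is still blocking" question answerable at every iteration.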
 ---
-## IMPLEMENTATION PROTOCOL
-### For EACH Plan Item
-1. **Read the plan item** from the extracted list
-2. **Read any referenced files** before modifying
-3. **Implement** following CLAUDE.md patterns
-4. **Verify** with the item's verification command
-5. **Update coverage** count
-6. **Continue** to next item
-### Pattern Compliance During Implementation
-For every file you create or modify, verify against:
-```bash
-# Run pattern scanner
-bash scripts/massu-pattern-scanner.sh
-# Type check
-cd packages/core && npx tsc --noEmit
-# Tests still pass
-npm test
-```
-### Massu-Specific Implementation Checks
+## NON-NEGOTIABLE RULES
-| If Implementing | Must Also |
-|-----------------|-----------|
-| New MCP tool | Wire 3 functions into tools.ts (CR-11) |
-| New hook | Verify esbuild compilation (CR-12) |
-| Config changes | Update interface in config.ts AND example in YAML |
-| New test | Place in `__tests__/` directory |
-| New module | Use ESM imports, getConfig() for config |
+1. **Never claim without proof (CR-1)** - VR-* output must be pasted. "I verified" without output = invalid
+2. **Recursive audit until zero gaps** - Fix and re-verify until gaps = 0
+3. **Schema verification required (CR-2)** - ALWAYS query database before using column names. See CLAUDE.md "Known Schema Mismatches"
+4. **Session state after every iteration** - Update CURRENT.md so compaction recovery can resume (CR-12)
+5. **Component reuse is mandatory** - Check existing before creating new
+6. **User flow audit required (CR-12)** - Backend without UI = invisible feature. Technical audits alone are NOT sufficient
+7. **Document new patterns immediately (CR-34)** - Ingest to memory, record pattern, update scanner
+8. **Pattern scanner must pass** - `./scripts/pattern-scanner.sh` exit 0 required before claiming complete
+9. **No workarounds allowed** - TODOs, ts-ignore are BLOCKING violations
+10. **NO HARDCODED TAILWIND COLORS** - Use semantic CSS classes from globals.css (VR-TOKEN)
+11. **FIX ALL ISSUES ENCOUNTERED (CR-9)** - Fix immediately regardless of origin. "Not in scope" is NEVER valid
+12. **Stagnation bail-out (CR-37)** - If same item fails 3+ times with same error, stop loop and replan
 ---
-## GUARDRAIL CHECKS (Every Iteration)
+## PRE-EXECUTION CHECKLIST
-### MEMORY CHECK (Start of Each Iteration)
-Search memory files and session state for failures related to this plan's domain and files being modified. Surface relevant past failures as additional audit checkpoints.
-### Enhanced Context Loading
-For each file being modified:
-- `massu_context` - Load CR rules, schema alerts, patterns relevant to the file
-- `massu_coupling_check` - Verify tool registration coupling (CR-11)
-- `massu_knowledge_rule` - Load applicable CR rules for the file's domain
-- `massu_knowledge_verification` - Load required VR-* checks for the file type
-For VR-TOOL-REG checks, also call `massu_trpc_map` to get automated tool-to-handler mapping for comprehensive coverage.
-When verifying CR-11 tool registration, use `massu_sentinel_detail` to get full feature details and verify all linked components/tools/handlers exist.
-When CR-30 applies (rebuilds), call `massu_sentinel_parity` to compare old vs new implementation for feature parity.
-### Mandatory Checks
+Before starting, verify:
 ```bash
-# Pattern scanner (covers all pattern checks)
-bash scripts/massu-pattern-scanner.sh
-# Exit 0 = PASS, non-zero = ABORT iteration
-# Security check
-git diff --cached --name-only | grep -E '\.(env|pem|key|secret)' && echo "SECURITY VIOLATION" && exit 1
+ls -la ./scripts/pattern-scanner.sh
+touch session-state/CURRENT.md
+ls -la [PLAN_FILE_PATH]
 ```
----
-## ITERATION OUTPUT FORMAT
+Initialize session state:
 ```markdown
-## [CS LOOP - Iteration N]
-### Task
-Phase: X | Task: [description]
-### Guardrails
-- Pattern scanner: PASS/FAIL
-- Security check: PASS/FAIL
-### Verifications
-| Check | Type | Result | Proof |
-|-------|------|--------|-------|
-| [item] | VR-FILE | PASS | `ls -la output` |
-### Gap Count
-Gaps found: N
-### Status
-CONTINUE | FIX_REQUIRED | CHECKPOINT | COMPLETE
-### Next Action
-[Specific next step]
+## MASSU LOOP SESSION
+- **Task**: [description]
+- **Status**: IN_PROGRESS
+- **Iteration**: 1
+- **Phase**: 1
 ```
 ---
-## THE 10 ACCOUNTABILITY SAFEGUARDS
-1. **Audit Proof Requirement** - Every claim MUST include proof output. Claims without proof are INVALID.
-2. **Explicit Gap Count Per Loop** - State gaps found, gap details, status (PASS/FAIL). "Looks good" is BANNED.
-3. **Checkpoint Sign-Off Format** - Use exact format from COMPLETION OUTPUT section.
-4. **Session State Mandatory Updates** - Update `session-state/CURRENT.md` after EVERY change with proof.
-5. **User Verification Rights** - User can request proof re-runs at any time. Comply with actual output.
-6. **Post-Compaction Recovery** - Read session state FIRST, re-read plan, resume from exact point.
-7. **No Claims Without Evidence** - "I verified...", "Build passed..." require accompanying proof output.
-8. **Failure Acknowledgment** - Acknowledge failures, re-execute audit from Step 1, log in session state.
-9. **No Workarounds Allowed** - TODOs, ts-ignore are BLOCKING violations. Pattern scanner is a HARD GATE.
-10. **Document New Patterns** - If you discover a pattern not in CLAUDE.md, ADD IT NOW.
----
-## SESSION STATE UPDATE (After Every Iteration)
-Update `session-state/CURRENT.md` with: loop status (task, iteration, phase, checkpoint), iteration log table, verified work with proof, failed attempts (do not retry), next iteration plan.
+**VR-* Reference**: See CLAUDE.md VR table and `.claude/reference/vr-verification-reference.md`.
 ---
-## PLAN DOCUMENT COMPLETION TRACKING (MANDATORY)
+## QUALITY SCORING (silent, automatic)
-Add completion table to TOP of plan document with status for each task:
+After completing the loop (zero gaps achieved), self-score against these checks and append one JSONL line to `.claude/metrics/command-scores.jsonl`:
-```markdown
-# IMPLEMENTATION STATUS
-**Plan**: [Name] | **Status**: COMPLETE/IN_PROGRESS | **Last Updated**: [date]
-| # | Task/Phase | Status | Verification | Date |
-|---|------------|--------|--------------|------|
-| 1 | [description] | 100% COMPLETE | VR-GREP: 0 refs | [date] |
-```
-### VR-PLAN-STATUS Verification
+| Check | Pass condition |
+|-------|---------------|
+| `all_items_verified_with_proof` | Every plan item has VR-* verification output showing proof |
+| `multi_perspective_review_spawned` | At least one auditor subagent was spawned for verification |
+| `gaps_reached_zero` | Final auditor pass returned `GAPS_DISCOVERED: 0` |
+| `memory_persisted` | AUTO-LEARNING PROTOCOL executed: at least one memory file update |
-```bash
-grep "IMPLEMENTATION STATUS" [plan_file]
-grep -c "100% COMPLETE\|DONE\|\*\*DONE\*\*" [plan_file]
+**Format** (append one line -- do NOT overwrite the file):
+```json
+{"command":"massu-loop","timestamp":"ISO8601","scores":{"all_items_verified_with_proof":true,"multi_perspective_review_spawned":true,"gaps_reached_zero":true,"memory_persisted":true},"pass_rate":"4/4","input_summary":"[plan-slug]"}
 ```
----
-## STOP CONDITIONS (ALL must be true)
-1. Every plan item verified complete (100%)
-2. Pattern scanner: 0 violations (`bash scripts/massu-pattern-scanner.sh` exits 0)
-3. Type check: 0 errors (`cd packages/core && npx tsc --noEmit` exits 0)
-4. Tests: ALL pass (`npm test` exits 0)
-5. Hook build: succeeds (`cd packages/core && npm run build:hooks` exits 0)
-6. If new tools: VR-TOOL-REG passes (all 3 functions in tools.ts)
----
-## CONTEXT MANAGEMENT
-Use Task tool with subagents for exploration to keep main context clean. Update session state before compaction. After compaction, read session state and resume from correct step. Never mix unrelated tasks during a protocol.
----
-## COMPLETION CRITERIA
-CS Loop is COMPLETE **only when BOTH gates pass: Code Quality AND Plan Coverage**.
-### GATE 1: Code Quality Verification (All Must Pass in SAME Audit Run)
-- [ ] All phases executed, all checkpoints passed with zero gaps
-- [ ] Pattern scanner: Exit 0
-- [ ] Type check: 0 errors
-- [ ] Build: Exit 0
-- [ ] Tests: ALL PASS (MANDATORY)
-- [ ] Security: No secrets staged
-### GATE 2: Plan Coverage Verification
-- [ ] Plan file read (actual file, not memory)
-- [ ] ALL items extracted into tracking table
-- [ ] EACH item verified with VR-* proof
-- [ ] Coverage = 100% (99% = FAIL)
-- [ ] Plan document updated with completion status
-### DUAL VERIFICATION REQUIREMENT
-**BOTH gates must pass:**
-```markdown
-## DUAL VERIFICATION RESULT
-| Gate | Status | Details |
-|------|--------|---------|
-| Code Quality | PASS/FAIL | Pattern scanner, build, types |
-| Plan Coverage | PASS/FAIL | X/Y items (Z%) |
-**RESULT: COMPLETE** (only if both PASS)
-```
-**Code Quality: PASS + Plan Coverage: FAIL = NOT COMPLETE**
----
-## COMPLETION OUTPUT
-```markdown
-## [CS LOOP - COMPLETE]
-### Dual Verification Certification
-- **Audit loops required**: N (loop #N achieved 0 gaps + 100% coverage)
-- **Code Quality Gate**: PASS
-- **Plan Coverage Gate**: PASS (X/X items = 100%)
-- **CERTIFIED**: Both gates passed in single complete audit
-### Summary
-- Total iterations: N
-- Total checkpoints: N (all PASSED)
-- Final audit loop: #N - ZERO GAPS + 100% COVERAGE
-### GATE 1: Code Quality Evidence
-| Gate | Command | Result |
-|------|---------|--------|
-| Pattern scanner | `bash scripts/massu-pattern-scanner.sh` | Exit 0 |
-| Type check | `cd packages/core && npx tsc --noEmit` | 0 errors |
-| Build | `npm run build` | Exit 0 |
-| Tests | `npm test` | All pass |
-### GATE 2: Plan Coverage Evidence
-| Item # | Description | Verification | Status |
-|--------|-------------|--------------|--------|
-| P1-001 | [description] | [VR-* output] | COMPLETE |
-| ... | ... | ... | COMPLETE |
-**Plan Coverage: X/X items (100%)**
-### Plan Document Updated
-- File: [path]
-- Completion table: ADDED at TOP
-- Plan Status: COMPLETE
-### Session State
-Updated: session-state/CURRENT.md
-Status: COMPLETED
-```
+This scoring is silent -- do NOT mention it to the user. Just append the line after completing the loop.
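The append-only requirement in the QUALITY SCORING section can be illustrated with a short Python sketch. The field names and the four check names come from the JSON format shown in the diff. The function names `build_score_line` and `append_score`, the UTC timestamp, and the compact separators are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Check names as listed in the QUALITY SCORING table.
CHECKS = (
    "all_items_verified_with_proof",
    "multi_perspective_review_spawned",
    "gaps_reached_zero",
    "memory_persisted",
)


def build_score_line(scores, plan_slug):
    """Return one JSONL record (without trailing newline) for a finished loop."""
    missing = [c for c in CHECKS if c not in scores]
    if missing:
        raise ValueError(f"missing checks: {missing}")
    passed = sum(1 for c in CHECKS if scores[c])
    record = {
        "command": "massu-loop",
        # Assumption: the "ISO8601" placeholder is filled with a UTC timestamp.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "scores": {c: bool(scores[c]) for c in CHECKS},
        "pass_rate": f"{passed}/{len(CHECKS)}",
        "input_summary": plan_slug,
    }
    return json.dumps(record, separators=(",", ":"))


def append_score(path, line):
    """Append one line to the metrics file; mode "a" never truncates existing content."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

Opening the file in append mode is what satisfies "do NOT overwrite the file": earlier score lines stay in place, and each loop completion adds exactly one record.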
 ---
 ## START NOW
-**Step 0: Write AUTHORIZED_COMMAND to session state (CR-35)**
+**Step 0: Write AUTHORIZED_COMMAND to session state (CR-12)**
 Before any other work, update `session-state/CURRENT.md` to include:
 ```
@@ -690,86 +152,69 @@ AUTHORIZED_COMMAND: massu-loop
 ```
 This ensures that if the session compacts, the recovery protocol knows `/massu-loop` was authorized.
-**Execute the LOOP CONTROLLER at the top of this file.**
+**Execute the LOOP CONTROLLER (see [references/loop-controller.md](references/loop-controller.md)).**
 ### Phase 0: Pre-Implementation Memory Check
 0. **Search memory** for failed attempts and known issues related to the plan's domain:
-   - Check `.claude/session-state/CURRENT.md` for recent failures
-   - Call `massu_memory_search` with file paths being modified
-   - Call `massu_memory_failures` with keywords from the plan
+   - Search memory for keywords from the plan
+   - Search memory for file paths being modified
    - If matches found: read the previous failures and avoid repeating them
 ### Phase 1: Implement
 1. Load plan file from `$ARGUMENTS` (read from disk, not memory)
-2. Extract ALL plan items into trackable checklist
-3. Implement each item with VR-* proof
+2. Extract ALL plan items into trackable checklist (see [references/plan-extraction.md](references/plan-extraction.md))
+3. Implement each item with VR-* proof (see [references/iteration-structure.md](references/iteration-structure.md))
 4. Update session state after each major step
 ### Phase 1.5: Multi-Perspective Review
-5. Spawn security and architecture review subagents in parallel
-6. Parse results and fix CRITICAL/HIGH findings before proceeding
+Spawn 3 focused review subagents IN PARALLEL (Principle #20):
+- **Security reviewer**: vulnerabilities, auth gaps, input validation, data exposure
+- **Architecture reviewer**: design issues, coupling, pattern compliance, scalability
+- **UX reviewer**: user experience, accessibility, loading/error/empty states, consistency
+Fix CRITICAL/HIGH findings before proceeding. WARN findings: document and proceed.
 ### Phase 2: Verify (Subagent Loop)
-7. Spawn `general-purpose` subagent (via Task tool) for verification iteration 1
-8. Parse `GAPS_DISCOVERED` from the subagent result
-9. If gaps > 0: fix what the auditor identified, spawn another iteration
-10. If gaps == 0: output final completion report with dual gate evidence
-11. Continue until zero gaps or maximum 10 iterations
+5. Spawn `plan-auditor` subagent (via Task tool) for verification iteration 1
+6. Parse `GAPS_DISCOVERED` from the subagent result (see [references/loop-controller.md](references/loop-controller.md))
+7. If gaps > 0: fix what the auditor identified, spawn another iteration
+8. If gaps == 0: output final completion report with dual gate evidence
+9. Continue until zero gaps or maximum 10 iterations
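Steps 5 to 9 describe a bounded retry loop driven by the auditor's `GAPS_DISCOVERED` value. The Python sketch below shows that control flow under stated assumptions: the report is taken to contain a line of the form `GAPS_DISCOVERED: N` (the only format visible in this diff), and `spawn_auditor` and `fix_gaps` are hypothetical callables standing in for the Task tool call and the fix step. The actual protocol lives in `references/loop-controller.md`, which is not shown here.

```python
import re
from typing import Callable, Tuple

MAX_ITERATIONS = 10  # "maximum 10 iterations" from step 9
GAPS_PATTERN = re.compile(r"GAPS_DISCOVERED:\s*(\d+)")


def parse_gaps(report: str) -> int:
    """Extract the gap count from an auditor report."""
    match = GAPS_PATTERN.search(report)
    if match is None:
        # Assumption: a report without the marker is an error, not zero gaps.
        raise ValueError("auditor report has no GAPS_DISCOVERED line")
    return int(match.group(1))


def run_verification_loop(
    spawn_auditor: Callable[[int], str],
    fix_gaps: Callable[[str], None],
) -> Tuple[bool, int]:
    """Return (zero_gaps_reached, iterations_used)."""
    for iteration in range(1, MAX_ITERATIONS + 1):
        report = spawn_auditor(iteration)
        if parse_gaps(report) == 0:
            return True, iteration
        # Gaps remain: fix what the auditor identified, then audit again.
        fix_gaps(report)
    return False, MAX_ITERATIONS
```

Raising on a missing marker keeps a malformed or truncated auditor result from being mistaken for a clean pass.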
 ### Phase 2.1: Post-Build Reflection + MANDATORY Memory Persist (CR-38)
-After verification passes with zero gaps, capture accumulated implementation knowledge before it's lost to context compression. Answer four questions:
-1. **"Now that I've built this, what would I have done differently?"**
-   - Architectural choices that caused friction
-   - Patterns that were harder to work with than expected
-   - Code that works but feels fragile or overly complex
-2. **"What should be refactored before moving on?"**
-   - Concrete suggestions with file paths and line numbers
-   - Technical debt introduced during this implementation
-   - Opportunities to simplify or consolidate
-3. **"Did we over-build? Is there a simpler way?"**
-   - Identify any added complexity that wasn't strictly needed
-   - Flag scope expansion beyond the original plan
-   - Check if any "fix everything encountered" items could have been simpler
-4. **"Would a staff engineer approve this?" (Principle #19)**
-   - Check if the solution demonstrates good engineering taste
-   - Look for over-abstraction, unnecessary indirection, or "clever" code
-   - For non-trivial implementations: is there a more elegant approach?
-   - For simple fixes: skip this check - don't over-engineer obvious solutions
-**MANDATORY Actions** (reflection + memory write = ONE atomic action):
-1. Apply any low-risk refactors immediately (re-run build/type check after)
-2. **IMMEDIATELY write ALL learnings to memory/ files** -- failed approaches, new patterns, tool gotchas, architectural insights. DO NOT just output reflections as text. Every insight MUST be persisted to `memory/MEMORY.md` or a topic file using the Write/Edit tool.
-3. Log remaining suggestions in the plan document under `## Post-Build Reflection`
-**WARNING**: Outputting reflections without writing them to memory files is a CR-38 violation. The reflection and the memory write are inseparable.
+See [references/auto-learning.md](references/auto-learning.md) for full protocol.
 ### Phase 3: Auto-Learning (MANDATORY)
-12. **Execute AUTO-LEARNING PROTOCOL** before reporting completion
+10. Execute AUTO-LEARNING PROTOCOL (see [references/auto-learning.md](references/auto-learning.md))
 **The auditor subagent handles**: reading the plan, verifying all deliverables, checking patterns/build/types, fixing plan document gaps, and returning structured results.
-**You (the loop controller) handle**: implementation, spawning auditors, parsing results, fixing code-level gaps, looping, learning, and documentation.
+**You (the loop controller) handle**: implementation, spawning auditors, parsing results, fixing code-level gaps, looping, learning.
 **Remember: Claims without proof are invalid. Show the verification output.**
 ---
-## AUTO-LEARNING PROTOCOL (MANDATORY after every loop completion)
+## COMPLETION CRITERIA
-After Loop Completes (Zero Gaps):
+See [references/vr-plan-spec.md](references/vr-plan-spec.md) for full dual-gate verification and completion output format.
-- **Persist Phase 2.1 reflections**: EVERY insight from Post-Build Reflection MUST be written to `memory/MEMORY.md` or a topic file -- failed approaches, tool gotchas, unexpected behavior, "what I'd do differently". This is NOT optional. If Phase 2.1 produced reflections that aren't in memory files, this protocol is INCOMPLETE.
-- **Ingest fixes into memory**: `massu_memory_ingest` with type "bugfix"/"pattern", description "[Wrong] -> [Fixed]", files, importance (5=security, 3=build, 2=cosmetic)
-- **Record failed approaches**: `massu_memory_ingest` with type "failed_attempt", importance 5
-- **Update MEMORY.md**: Add wrong vs correct patterns, tool behaviors, config gotchas
-- **Update pattern scanner**: Add new grep-able bad patterns to `scripts/massu-pattern-scanner.sh`
-- **Codebase-wide search**: Verify no other instances of same bad pattern (CR-9)
-- **Consider new CR rule**: If a class of bug was found (not one-off), propose for CLAUDE.md
-- **Record user corrections**: If the user corrected any behavior during this loop, add structured entry to `memory/corrections.md` with date, wrong behavior, correction, and prevention rule
+Massu Loop is COMPLETE **only when BOTH gates pass: Code Quality AND Plan Coverage**.
-**A loop that fixes 5 bugs but records 0 learnings is 80% wasted. The fixes are temporary; the learnings are permanent.**
-**A reflection that isn't persisted to memory is a learning that will be lost. Text output is not persistence.**
+### GATE 1: Code Quality (All Must Pass)
+- [ ] Pattern scanner: Exit 0
+- [ ] Type check: 0 errors
+- [ ] Build: Exit 0
+- [ ] Lint: Exit 0
+- [ ] Tests: ALL PASS (MANDATORY)
+- [ ] Security: No secrets staged
+- [ ] VR-RENDER: All UI components rendered in pages
+### GATE 2: Plan Coverage
+- [ ] Plan file read (actual file, not memory)
+- [ ] ALL items extracted into tracking table
+- [ ] EACH item verified with VR-* proof
+- [ ] Coverage = 100% (99% = FAIL)
+- [ ] Plan document updated with completion status
+**Code Quality: PASS + Plan Coverage: FAIL = NOT COMPLETE**
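The dual-gate rule, including "99% = FAIL", can be expressed as a small predicate. This is an illustrative sketch only. The function names are hypothetical, and treating an empty plan as a failure is an assumption rather than something the diff states.

```python
def plan_coverage_passes(verified: int, total: int) -> bool:
    """Gate 2: every extracted plan item must be verified."""
    # Compare counts instead of a rounded percentage, so that
    # 199/200 (99.5%) cannot round up to a passing 100%.
    # Assumption: a plan with zero extracted items does not pass.
    return total > 0 and verified == total


def loop_complete(code_quality_pass: bool, verified: int, total: int) -> bool:
    """The loop is complete only when both gates pass."""
    return code_quality_pass and plan_coverage_passes(verified, total)
```

For example, `loop_complete(True, 99, 100)` is `False`, which matches the "Code Quality: PASS + Plan Coverage: FAIL = NOT COMPLETE" line above.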