@massu/core 0.6.0 → 0.6.2

Files changed (26)
  1. package/commands/_shared-preamble.md +14 -0
  2. package/commands/massu-ci-fix.md +2 -2
  3. package/commands/massu-gap-enhancement-analyzer.md +85 -345
  4. package/commands/massu-golden-path/references/approval-points.md +9 -12
  5. package/commands/massu-golden-path/references/competitive-mode.md +9 -7
  6. package/commands/massu-golden-path/references/error-handling.md +4 -2
  7. package/commands/massu-golden-path/references/phase-0-requirements.md +3 -3
  8. package/commands/massu-golden-path/references/phase-1-plan-creation.md +41 -52
  9. package/commands/massu-golden-path/references/phase-2-implementation.md +50 -151
  10. package/commands/massu-golden-path/references/phase-2.5-gap-analyzer.md +14 -34
  11. package/commands/massu-golden-path/references/phase-3-simplify.md +5 -5
  12. package/commands/massu-golden-path/references/phase-4-commit.md +20 -46
  13. package/commands/massu-golden-path/references/phase-5-push.md +14 -47
  14. package/commands/massu-golden-path/references/phase-6-completion.md +8 -58
  15. package/commands/massu-golden-path.md +25 -30
  16. package/commands/massu-loop/references/checkpoint-audit.md +14 -18
  17. package/commands/massu-loop/references/guardrails.md +3 -3
  18. package/commands/massu-loop/references/iteration-structure.md +46 -14
  19. package/commands/massu-loop/references/loop-controller.md +72 -63
  20. package/commands/massu-loop/references/plan-extraction.md +19 -11
  21. package/commands/massu-loop/references/vr-plan-spec.md +20 -28
  22. package/commands/massu-loop.md +36 -56
  23. package/commands/massu-review.md +2 -2
  24. package/dist/cli.js +0 -0
  25. package/package.json +1 -1
  26. package/README.md +0 -40
@@ -44,7 +44,7 @@ Total Items: [N]
44
44
  Phases: [list]
45
45
 
46
46
  Requirements Coverage: [X]/10 dimensions resolved
47
- Feasibility: VERIFIED (DB, files, patterns, security)
47
+ Feasibility: VERIFIED (config, files, patterns, security)
48
48
  Audit Passes: {iteration} (final pass: 0 gaps)
49
49
  --------------------------------------------------------------------------
50
50
 
@@ -75,7 +75,7 @@ Existing patterns checked:
75
75
  PROPOSED NEW PATTERN:
76
76
  --------------------------------------------------------------------------
77
77
  Name: [Pattern Name]
78
- Domain: [UI/Database/Auth/etc.]
78
+ Domain: [Config/MCP/Hook/etc.]
79
79
 
80
80
  WRONG: [code]
81
81
  CORRECT: [code]
@@ -108,15 +108,12 @@ VERIFICATION RESULTS:
108
108
  Pattern scanner: Exit 0
109
109
  Type check: 0 errors
110
110
  Build: Exit 0
111
- Lint: Exit 0
112
- Prisma: Valid
111
+ Tests: ALL pass
112
+ Hook compilation: Exit 0
113
+ Generalization: Exit 0
113
114
  Security: No secrets staged, no credentials in code
114
- VR-RENDER: All UI components rendered
115
- VR-COUPLING: All backend features exposed in UI
116
- VR-COLOR: No hardcoded Tailwind colors
115
+ Tool registration: All new tools wired
117
116
  Plan Coverage: [X]/[X] = 100%
118
- Database: All environments verified
119
- Help site: UP TO DATE / N/A
120
117
  Quality Score: [X.X]/5.0
121
118
  --------------------------------------------------------------------------
122
119
 
@@ -161,8 +158,8 @@ Files changed: [N] | +[N] / -[N]
161
158
  Branch: [branch] -> origin
162
159
 
163
160
  Tier 1 (Quick): PASS
164
- Tier 2 (Tests): PASS -- Unit: X/X, E2E: X/X, Regression: 0
165
- Tier 3 (Security): PASS -- Audit: 0 high/crit, RLS: verified, Secrets: clean
161
+ Tier 2 (Tests): PASS -- Unit: X/X, Regression: 0
162
+ Tier 3 (Security): PASS -- Audit: 0 high/crit, Secrets: clean
166
163
  --------------------------------------------------------------------------
167
164
 
168
165
  OPTIONS:
@@ -200,7 +197,7 @@ COMPETITIVE SCORECARD:
200
197
  NOTABLE DIFFERENCES:
201
198
  [Aspect]: Agent A did [X], Agent B did [Y]
202
199
 
203
- RECOMMENDATION: Agent {X} ({bias}) [reason]
200
+ RECOMMENDATION: Agent {X} ({bias}) -- [reason]
204
201
  --------------------------------------------------------------------------
205
202
 
206
203
  PER-AGENT NOTES:
@@ -1,8 +1,10 @@
1
1
  # Competitive Mode Protocol
2
2
 
3
+ > **Shared rules apply.** Read .claude/commands/_shared-preamble.md before proceeding.
4
+
3
5
  > Reference doc for `/massu-golden-path --competitive`. Return to main file for overview.
4
6
 
5
- **Purpose**: Spawn 2-3 competing implementations of the same plan with different optimization biases, score all implementations, and select the winner before proceeding with Massu's verification rigor.
7
+ **Purpose**: Spawn 2-3 competing implementations of the same plan with different optimization biases, score all implementations, and select the winner before proceeding with verification rigor.
6
8
 
7
9
  **Triggering**: Only when `/massu-golden-path --competitive` is explicitly used. Never automatic.
8
10
 
@@ -17,7 +19,7 @@ SCAN plan for:
17
19
  - Items with type = MIGRATION
18
20
  - Items containing ALTER TABLE, CREATE TABLE, DROP TABLE
19
21
  - Items containing RLS policies or grants
20
- - Items referencing database migrations
22
+ - Items referencing all database environments
21
23
 
22
24
  IF any found:
23
25
  ABORT competitive mode with message:
@@ -25,7 +27,7 @@ IF any found:
25
27
  Apply migrations first, then re-run with --competitive."
26
28
  ```
27
29
 
28
- This mirrors the `/massu-batch` DB guard pattern (`scripts/batch-db-guard.sh`).
30
+ This mirrors the `/massu-batch` DB guard pattern.
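The DB guard scan in this hunk could be sketched roughly as follows. This is only an illustration, assuming the plan is available as a list of text lines; `DB_MARKERS`, `find_db_items`, and `competitive_mode_allowed` are hypothetical names, and the patterns cover only the checks visible in the SCAN list here, not whatever the actual `/massu-batch` guard does.

```python
import re

# Hypothetical markers taken from the SCAN list in the hunk; the real guard may check more.
DB_MARKERS = [
    re.compile(r"type\s*=\s*MIGRATION", re.IGNORECASE),
    re.compile(r"\b(ALTER|CREATE|DROP)\s+TABLE\b", re.IGNORECASE),
    re.compile(r"\bRLS\b|\bGRANTS?\b", re.IGNORECASE),
]


def find_db_items(plan_lines):
    """Return (line_number, text) for plan lines that look like database work."""
    hits = []
    for number, text in enumerate(plan_lines, start=1):
        if any(marker.search(text) for marker in DB_MARKERS):
            hits.append((number, text.strip()))
    return hits


def competitive_mode_allowed(plan_lines):
    """Competitive mode is allowed only when the scan finds no database items."""
    return not find_db_items(plan_lines)


# Made-up plan rows, for illustration only.
plan = [
    "P1-001 | CONFIG | Add scanner section to massu.config.yaml",
    "P2-003 | MIGRATION | ALTER TABLE sessions ADD COLUMN summary TEXT",
]
print(find_db_items(plan))
print(competitive_mode_allowed(plan))
```

With the sample rows, the second item trips the guard, so competitive mode would abort until the migration is applied separately.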
29
31
 
30
32
  ---
31
33
 
@@ -75,8 +77,8 @@ IMPLEMENTATION RULES:
75
77
  1. Read the plan from disk and implement ALL items
76
78
  2. Follow ALL CLAUDE.md patterns (ctx.db, protectedProcedure, etc.)
77
79
  3. Do NOT run database migrations (handled separately)
78
- 4. Run pattern-scanner after each file: ./scripts/pattern-scanner.sh
79
- 5. Run tsc after implementation: npx tsc --noEmit
80
+ 4. Run pattern-scanner after each file: bash scripts/massu-pattern-scanner.sh
81
+ 5. Run tsc after implementation: cd packages/core && npx tsc --noEmit
80
82
  6. Fix any issues before declaring done
81
83
 
82
84
  OUTPUT FORMAT (at completion):
@@ -249,8 +251,8 @@ IF cleanup fails:
249
251
  ### Post-Merge Verification
250
252
 
251
253
  ```
252
- 1. Run ./scripts/pattern-scanner.sh (exit 0 required)
253
- 2. Run npx tsc --noEmit (0 errors required)
254
+ 1. Run bash scripts/massu-pattern-scanner.sh (exit 0 required)
255
+ 2. Run cd packages/core && npx tsc --noEmit (0 errors required)
254
256
  3. IF either fails:
255
257
  Fix issues from merge
256
258
  Re-run verification
@@ -1,5 +1,7 @@
1
1
  # Error Handling
2
2
 
3
+ > **Shared rules apply.** Read .claude/commands/_shared-preamble.md before proceeding.
4
+
3
5
  > Reference doc for `/massu-golden-path`. Return to main file for overview.
4
6
 
5
7
  ## Recoverable Errors
@@ -54,11 +56,11 @@ TO RESUME:
54
56
 
55
57
  ---
56
58
 
57
- ## Post-Compaction Re-Verification (CR-42)
59
+ ## Post-Compaction Re-Verification (CR-12)
58
60
 
59
61
  **After ANY context compaction during a golden path run**, BEFORE continuing implementation:
60
62
 
61
- 1. **Re-read the FULL plan document** from disk (CR-5 -- never from memory)
63
+ 1. **Re-read the FULL plan document** from disk (never from memory)
62
64
  2. **Diff every completed item against actual code**: For each item marked complete in the tracking table, re-run its VR-* verification command
63
65
  3. **VR-SPEC-MATCH audit**: For every completed UI item with specific CSS classes/structure in the plan, grep for those EXACT strings in the implementation
64
66
  4. **Flag mismatches**: Any item where implementation doesn't match the plan's exact spec -> mark as gap, fix before continuing
@@ -8,9 +8,9 @@
8
8
  [GOLDEN PATH -- PHASE 0: REQUIREMENTS & CONTEXT]
9
9
  ```
10
10
 
11
- - Call `massu_memory_sessions` for recent session context
12
- - Call `massu_memory_search` + `massu_memory_failures` with feature keywords
13
11
  - Read `session-state/CURRENT.md` for any prior state
12
+ - Read `massu.config.yaml` for project configuration
13
+ - Search memory files for relevant prior context
14
14
 
15
15
  ## 0.2 Requirements Coverage Map
16
16
 
@@ -20,7 +20,7 @@ Initialize ALL dimensions as `pending`:
20
20
  |---|-----------|--------|-------------|
21
21
  | D1 | Problem & Scope | pending | User request + interview |
22
22
  | D2 | Users & Personas | pending | Interview |
23
- | D3 | Data Model | pending | Phase 1A (DB Reality Check) |
23
+ | D3 | Data Model | pending | Phase 1A (Config/Schema Reality Check) |
24
24
  | D4 | Backend / API | pending | Phase 1A (Codebase Reality Check) |
25
25
  | D5 | Frontend / UX | pending | Interview + Phase 1A |
26
26
  | D6 | Auth & Permissions | pending | Phase 1A (Security Pre-Screen) |
@@ -12,85 +12,75 @@
12
12
 
13
13
  ### 1A.1 Feature Understanding
14
14
 
15
- - Call `massu_knowledge_search`, `massu_knowledge_pattern`, `massu_knowledge_schema_check` with feature name
16
15
  - Document: exact user request, feature type, affected domains
17
- - Search codebase for similar features, routers, pages
16
+ - Search codebase for similar features, tool modules, existing patterns
17
+ - Read `massu.config.yaml` for relevant config sections
18
18
 
19
- ### 1A.2 Database Reality Check (VR-SCHEMA-PRE)
19
+ ### 1A.2 Config & Schema Reality Check
20
20
 
21
- For EACH table the feature might use, query via MCP:
21
+ For features touching config or databases:
22
22
 
23
- ```sql
24
- SELECT column_name, data_type, is_nullable, column_default
25
- FROM information_schema.columns WHERE table_name = '[TABLE]' ORDER BY ordinal_position;
23
+ - Parse `massu.config.yaml` and verify all referenced config keys exist
24
+ - Check SQLite schema for affected tables (`getCodeGraphDb`, `getDataDb`, `getMemoryDb`)
25
+ - Verify tool definitions in `tools.ts` for any tools being modified
26
26
 
27
- SELECT polname, polcmd FROM pg_policies WHERE tablename = '[TABLE]';
27
+ Document: existing config keys, required new keys, required schema changes.
28
28
 
29
- SELECT grantee, privilege_type FROM information_schema.table_privileges WHERE table_name = '[TABLE]';
30
- ```
31
-
32
- Run `./scripts/check-bad-columns.sh`. Call `massu_schema` for Prisma cross-reference.
33
- Document: existing tables, required new tables/columns, migration SQL previews.
29
+ ### 1A.3 Config-Code Alignment (VR-CONFIG)
34
30
 
35
- ### 1A.3 Config-Code Alignment (VR-DATA)
31
+ If feature uses config-driven values:
36
32
 
37
- If feature uses DB-stored configs:
38
-
39
- ```sql
40
- SELECT DISTINCT jsonb_object_keys(config_column) as keys FROM config_table;
33
+ ```bash
34
+ # Check config keys used in code
35
+ grep -rn "getConfig()" packages/core/src/ | grep -o 'config\.\w\+' | sort -u
36
+ # Compare to massu.config.yaml structure
41
37
  ```
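The comparison step that the bash comment leaves open ("Compare to massu.config.yaml structure") might look like the sketch below. It assumes the declared keys have already been read from the YAML file by some other means and that calling code stores the result of `getConfig()` in a variable named `config`. The key names `toolPrefix`, `paths`, and `framework` are invented for the example and are not taken from the package.

```python
import re

# Matches property reads such as `config.toolPrefix`. The variable name `config`
# is an assumption about how calling code names the result of getConfig().
CONFIG_KEY = re.compile(r"\bconfig\.(\w+)")


def used_config_keys(source_text):
    """Collect the top-level config keys read in a piece of source code."""
    return set(CONFIG_KEY.findall(source_text))


def alignment_report(source_text, declared_keys):
    """Compare keys read in code against keys declared in the config file."""
    used = used_config_keys(source_text)
    declared = set(declared_keys)
    return {
        "missing_from_config": sorted(used - declared),
        "unused_in_code": sorted(declared - used),
    }


# Hypothetical source snippet and declared keys.
source = """
const config = getConfig();
const prefix = config.toolPrefix;
const root = config.paths.source;
"""
print(alignment_report(source, ["toolPrefix", "framework"]))
```

Keys listed under `missing_from_config` would be the ones to add to the plan as required new config keys.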
42
38
 
43
- Compare to code: `grep -rn "config\." src/lib/[feature]/ | grep -oP 'config\.\w+' | sort -u`
44
-
45
39
  ### 1A.4 Codebase Reality Check
46
40
 
47
41
  - Verify target directories/files exist
48
- - Read similar routers and components
49
- - Load relevant pattern files (database/auth/ui/realtime/build)
42
+ - Read similar tool modules and handlers
43
+ - Load relevant pattern files (build/testing/security/database/mcp)
50
44
 
51
45
  ### 1A.5 Blast Radius Analysis (CR-25)
52
46
 
53
- **MANDATORY when plan changes any constant, path, route, enum, or config key.**
47
+ **MANDATORY when plan changes any constant, export name, config key, or tool name.**
54
48
 
55
49
  1. Identify ALL changed values (old -> new)
56
50
  2. Codebase-wide grep for EACH value
57
- 3. Call `massu_impact` for indirect impact through import chains
58
- 4. If plan deletes files: call `massu_sentinel_impact` -- zero orphaned features allowed
59
- 5. Categorize EVERY occurrence: CHANGE / KEEP (with reason) / INVESTIGATE
60
- 6. Resolve ALL INVESTIGATE to 0. Add ALL CHANGE items as plan deliverables.
51
+ 3. If plan deletes files: verify no remaining imports or references
52
+ 4. Categorize EVERY occurrence: CHANGE / KEEP (with reason) / INVESTIGATE
53
+ 5. Resolve ALL INVESTIGATE to 0. Add ALL CHANGE items as plan deliverables.
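The blast-radius steps in the new version can be pictured with a small sketch. It assumes file contents are held in a dictionary keyed by path (a stand-in for a codebase-wide grep) and that each occurrence is then assigned a category by hand; `find_occurrences` and `ready_to_plan` are hypothetical helpers, not part of the package.

```python
from collections import Counter

CATEGORIES = ("CHANGE", "KEEP", "INVESTIGATE")


def find_occurrences(files, old_values):
    """Return (path, line_number, value) for every line containing an old value."""
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            for value in old_values:
                if value in line:
                    hits.append((path, number, value))
    return hits


def ready_to_plan(decisions):
    """True only when every decision uses a known category and none is INVESTIGATE."""
    counts = Counter(decisions.values())
    unknown = set(counts) - set(CATEGORIES)
    return not unknown and counts["INVESTIGATE"] == 0


# Made-up file contents standing in for a codebase-wide grep.
files = {
    "tools.ts": "export const NAME = 'old_tool';\ncall('old_tool');",
    "README.md": "Use old_tool to start.",
}
hits = find_occurrences(files, ["old_tool"])
decisions = {hit: "INVESTIGATE" for hit in hits}
print(len(hits), ready_to_plan(decisions))
decisions = {hit: "CHANGE" for hit in hits}
print(ready_to_plan(decisions))
```

Every hit marked CHANGE would then become a plan deliverable, as the last step requires.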
61
54
 
62
55
  ### 1A.6 Pattern Compliance Check
63
56
 
64
- Check applicable patterns: ctx.db, user_profiles, 3-step query, BigInt/Decimal, RLS+Grants, Suspense, Select.Item, protectedProcedure, Zod validation. Read most similar router/component for patterns used.
57
+ Check applicable patterns: ESM imports (.ts extensions), config access (getConfig()), tool registration (3-function pattern), hook compilation (esbuild), SQLite DB access (getCodeGraphDb/getDataDb/getMemoryDb), memDb lifecycle (try/finally close).
65
58
 
66
- ### 1A.7 Backend-Frontend Coupling Check (CR-12)
59
+ Read most similar tool module for patterns used.
67
60
 
68
- For EVERY backend z.enum, type, or procedure planned -- verify a corresponding frontend item exists. If NOT, ADD IT.
61
+ ### 1A.7 Tool Registration Check
62
+
63
+ For EVERY new MCP tool planned -- verify a corresponding registration item exists in the plan (definitions + routing + handler in `tools.ts`). If NOT, ADD IT.
69
64
 
70
65
  ### 1A.8 Question Filtering
71
66
 
72
67
  1. List all open questions
73
- 2. Self-answer anything answerable by reading code or querying DB
68
+ 2. Self-answer anything answerable by reading code or config
74
69
  3. Surface only business logic / UX / scope / priority questions to user via AskUserQuestion
75
70
  4. If all self-answerable, skip user prompt
76
71
 
77
- ### 1A.9 Security Pre-Screen (6 Dimensions)
72
+ ### 1A.9 Security Pre-Screen (5 Dimensions)
78
73
 
79
74
  | Dim | Check | If Triggered |
80
75
  |-----|-------|-------------|
81
- | S1 | PII / Sensitive Data | Add RLS + column-level access |
82
- | S2 | Authentication | Verify protectedProcedure |
83
- | S3 | Authorization | Add RBAC checks, RLS policies |
84
- | S4 | Injection Surfaces | Add Zod validation, parameterized queries |
85
- | S5 | Secrets Management (CR-5) | Add AWS Secrets Manager items |
86
- | S6 | Rate Limiting | Add rate limiting middleware |
76
+ | S1 | PII / Sensitive Data | Add access controls |
77
+ | S2 | Authentication | Verify auth checks |
78
+ | S3 | Authorization | Add permission checks |
79
+ | S4 | Injection Surfaces | Add input validation, parameterized queries |
80
+ | S5 | Rate Limiting | Add rate limiting considerations |
87
81
 
88
82
  **BLOCKS_REMAINING must = 0 before proceeding.**
89
83
 
90
- ### 1A.10 ADR Generation (Optional)
91
-
92
- For architectural decisions: `massu_adr_list` -> `massu_adr_generate`.
93
-
94
84
  Mark all coverage dimensions as `done` or `n/a`.
95
85
 
96
86
  ---
@@ -101,25 +91,24 @@ Mark all coverage dimensions as `done` or `n/a`.
101
91
  [GOLDEN PATH -- PHASE 1B: PLAN GENERATION]
102
92
  ```
103
93
 
104
- Write plan to: `plans/[YYYY-MM-DD]-[feature-name].md`
94
+ Write plan to: `docs/plans/[YYYY-MM-DD]-[feature-name].md`
105
95
 
106
96
  **Plan structure** (P-XXX numbered items):
107
97
  - Overview (feature, complexity, domains, item count)
108
98
  - Requirements Coverage Map (D1-D10 all resolved)
109
- - Phase 0: Credentials & Secrets (CR-5)
110
- - Phase 1: Database Changes (migrations with exact SQL)
111
- - Phase 2: Backend Implementation (routers, procedures, input schemas)
112
- - Phase 3: Frontend Implementation (components, pages, renders-in)
99
+ - Phase 1: Configuration Changes (massu.config.yaml)
100
+ - Phase 2: Backend Implementation (tool modules, handlers, SQLite schema)
101
+ - Phase 3: Frontend/Hook Implementation (hooks, plugin code)
113
102
  - Phase 4: Testing & Verification
114
- - Phase 5: Documentation (help site pages, changelog)
103
+ - Phase 5: Documentation
115
104
  - Verification Commands table
116
105
  - Item Summary table
117
106
  - Risk Assessment
118
107
  - Dependencies
119
108
 
120
- **Item numbering**: P0-XXX (secrets), P1-XXX (database), P2-XXX (backend), P3-XXX (frontend), P4-XXX (testing), P5-XXX (docs).
109
+ **Item numbering**: P1-XXX (config), P2-XXX (backend), P3-XXX (frontend/hooks), P4-XXX (testing), P5-XXX (docs).
121
110
 
122
- **Implementation Specificity Check**: Every item MUST have exact file path, exact content/SQL, insertion point, format matches target, verification command.
111
+ **Implementation Specificity Check**: Every item MUST have exact file path, exact content, insertion point, format matches target, verification command.
123
112
 
124
113
  **Documentation Impact Assessment**: If ANY user-facing features, Phase 5 deliverables are MANDATORY.
125
114
 
@@ -141,7 +130,7 @@ WHILE true:
141
130
  result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
142
131
  Audit iteration {iteration} for plan: {PLAN_PATH}
143
132
  Execute ONE complete audit pass. Verify ALL deliverables.
144
- Check: VR-PLAN-FEASIBILITY, VR-PLAN-SPECIFICITY, Pattern Alignment, Schema Reality.
133
+ Check: VR-PLAN-FEASIBILITY, VR-PLAN-SPECIFICITY, Pattern Alignment, Config Reality.
145
134
  Fix any plan document gaps you find.
146
135
 
147
136
  CRITICAL: Report GAPS_DISCOVERED as total gaps FOUND, EVEN IF you fixed them.
@@ -157,7 +146,7 @@ WHILE true:
157
146
  END WHILE
158
147
  ```
159
148
 
160
- **VR-PLAN-FEASIBILITY**: DB schema exists, files exist, dependencies available, patterns documented, credentials planned.
149
+ **VR-PLAN-FEASIBILITY**: Files exist, config keys valid, dependencies available, patterns documented.
161
150
  **VR-PLAN-SPECIFICITY**: Every item has exact path, exact content, insertion point, verification command.
162
151
  **Pattern Alignment**: Cross-reference ALL applicable patterns from CLAUDE.md and patterns/*.md.
163
152
 
@@ -21,12 +21,12 @@ ELSE:
21
21
  [GOLDEN PATH -- PHASE 2: IMPLEMENTATION]
22
22
  ```
23
23
 
24
- 1. Read plan from disk (NOT memory)
24
+ 1. Read plan from disk (NOT memory -- CR-5)
25
25
  2. Extract ALL deliverables into tracking table:
26
26
 
27
27
  | Item # | Type | Description | Location | Verification | Status |
28
28
  |--------|------|-------------|----------|--------------|--------|
29
- | P1-001 | MIGRATION | ... | ... | VR-SCHEMA | PENDING |
29
+ | P1-001 | CONFIG | ... | ... | VR-CONFIG | PENDING |
30
30
 
31
31
  3. Create VR-PLAN verification strategy:
32
32
 
@@ -38,38 +38,12 @@ ELSE:
38
38
 
39
39
  ---
40
40
 
41
- ## Phase 2A.5: Sprint Contracts
42
-
43
- > Full protocol: [sprint-contract-protocol.md](sprint-contract-protocol.md)
44
-
45
- **Before implementation begins**, negotiate a sprint contract for each plan item:
46
-
47
- 1. For each plan item in the tracking table:
48
- - Define **Scope Boundary** (IN/OUT)
49
- - Define **Implementation Approach** (files, patterns)
50
- - Write **3-5 Acceptance Criteria** (must be specific enough that two independent evaluators agree on PASS/FAIL)
51
- - Map to **VR-\* Verification Types**
52
-
53
- 2. Add contract columns to the Phase 2A tracking table:
54
-
55
- | Item # | Type | Description | Location | Verification | Scope Boundary | Acceptance Criteria | Contract Status |
56
- |--------|------|-------------|----------|--------------|----------------|---------------------|-----------------|
57
- | P1-001 | MIGRATION | ... | ... | VR-SCHEMA | IN: ... / OUT: ... | 1. ... 2. ... 3. ... | AGREED |
58
-
59
- 3. **Quality bar**: Criteria using words like "good", "correct", "proper" without specifics = reject and rewrite. Each contract must include criteria from at least 3 categories: happy path, data display, empty/loading/error states, user feedback, edge cases.
60
-
61
- 4. **Skip conditions**: Mark `Contract: N/A` for pure refactors (VR-BUILD + VR-TYPE + VR-TEST sufficient), documentation-only items, and migrations where SQL IS the contract.
62
-
63
- 5. **Max 3 negotiation rounds** per item. If unresolved, escalate via AskUserQuestion.
64
-
65
- ---
66
-
67
41
  ## Phase 2B: Implementation Loop
68
42
 
69
43
  For each plan item:
70
- 1. **Pre-check**: Load CR rules, domain patterns for the affected file
44
+ 1. **Pre-check**: Verify file exists, read current state
71
45
  2. **Execute**: Implement the item following established patterns
72
- 3. **Guardrail**: Run `./scripts/pattern-scanner.sh` (ABORT if fails)
46
+ 3. **Guardrail**: Run `bash scripts/massu-pattern-scanner.sh` (ABORT if fails)
73
47
  4. **Verify**: Run applicable VR-* checks with proof
74
48
  5. **VR-PIPELINE**: If the item involves a data pipeline (AI, cron, generation, ETL), trigger the pipeline manually, verify output is non-empty. Empty output = fix before continuing.
75
49
  6. **Update**: Mark item complete in tracking table
@@ -83,16 +57,13 @@ For each plan item:
83
57
 
84
58
  ```
85
59
  CHECKPOINT:
86
- [1] READ plan section [2] QUERY DB [3] GREP routers
87
- [4] LS components [5] VR-RENDER check [6] VR-COUPLING check
88
- [7] Pattern scanner [8] npm run build [9] npx tsc --noEmit
89
- [10] npm run lint [11] npx prisma validate [12] npm test
90
- [13] UI/UX verification [14] API/router verification [15] Security check
91
- [16] COUNT gaps -> IF > 0: FIX and return to [1]
60
+ [1] READ plan section [2] GREP tool registrations [3] LS modules
61
+ [4] VR-CONFIG check [5] VR-TOOL-REG check [6] VR-HOOK-BUILD check
62
+ [7] Pattern scanner [8] npm run build [9] cd packages/core && npx tsc --noEmit
63
+ [10] npm test [11] VR-GENERIC check [12] Security scanner
64
+ [13] COUNT gaps -> IF > 0: FIX and return to [1]
92
65
  ```
93
66
 
94
- > **Cross-reference**: Full checkpoint audit protocol with detailed steps is in `massu-loop/references/checkpoint-audit.md`.
95
-
96
67
  ---
97
68
 
98
69
  ## Phase 2C: Multi-Perspective Review
@@ -112,60 +83,14 @@ architecture_result = Task(subagent_type="massu-architecture-reviewer", model="o
112
83
  Return structured result with ARCHITECTURE_GATE: PASS/FAIL.
113
84
  ")
114
85
 
115
- ux_result = Task(subagent_type="massu-ux-reviewer", model="sonnet", prompt="
86
+ quality_result = Task(subagent_type="massu-quality-reviewer", model="sonnet", prompt="
116
87
  Review implementation for plan: {PLAN_PATH}
117
- Focus: UX, accessibility, loading/error/empty states, consistency.
118
- Return structured result with UX_GATE: PASS/FAIL.
88
+ Focus: Code quality, ESM compliance, config-driven patterns, TypeScript strict mode, test coverage.
89
+ Return structured result with QUALITY_GATE: PASS/FAIL.
119
90
  ")
120
91
  ```
121
92
 
122
- **Phase 2C.2: QA Evaluator** (conditional -- UI plans only)
123
-
124
- > Full spec: [qa-evaluator-spec.md](qa-evaluator-spec.md)
125
-
126
- If the plan touches UI files, spawn an adversarial QA evaluator:
127
-
128
- ```
129
- IF plan has UI files:
130
- qa_result = Task(subagent_type="massu-ux-reviewer", model="opus", prompt="
131
- === QA EVALUATOR MODE ===
132
- You are an ADVERSARIAL QA agent. Your job is to FIND BUGS, not approve work.
133
-
134
- Plan: {PLAN_PATH}
135
- Sprint contracts: {CONTRACTS_FROM_2A5}
136
-
137
- For EACH plan item with a sprint contract:
138
- 1. NAVIGATE to the affected page using Playwright MCP
139
- 2. EXERCISE the feature as a real user would
140
- 3. VERIFY against sprint contract acceptance criteria (EVERY criterion)
141
- 4. CHECK for known failure patterns:
142
- - Mock/hardcoded data (data doesn't change when DB changes)
143
- - Write succeeds but read/display broken
144
- - Feature stubs (onClick/onSubmit empty or log-only)
145
- - Invisible elements (display:none, opacity:0, z-index buried)
146
- - Missing query invalidation (create item, verify list updates without refresh)
147
- 5. GRADE: PASS / PARTIAL / FAIL with specific evidence
148
-
149
- ANTI-LENIENCY RULES:
150
- - Never say 'this is acceptable because...' — if criteria aren't met, it's FAIL
151
- - Never give benefit of the doubt — if you can't verify it works, it's FAIL
152
- - Partial credit is still failure — PARTIAL means 'not done yet'
153
- - Every PASS must cite specific evidence (screenshot, DOM state, network response)
154
-
155
- Return structured result with QA_GATE: PASS/FAIL and per-item grades.
156
- ")
157
- ELSE:
158
- Log: "QA Evaluator: SKIPPED (no UI files in plan)"
159
- ```
160
-
161
- **Gate logic**: Fix ALL CRITICAL/HIGH findings before proceeding. WARN findings = document and proceed.
162
-
163
- ```
164
- GATES = [SECURITY_GATE, ARCHITECTURE_GATE, UX_GATE]
165
- IF plan has UI files: GATES += [QA_GATE]
166
- IF ANY gate == FAIL: Fix findings and re-run failed gates
167
- ALL gates must PASS before proceeding to Phase 2D.
168
- ```
93
+ Fix ALL findings at ALL severity levels before proceeding (CR-45). CRITICAL, HIGH, MEDIUM, LOW — all get fixed. No severity is exempt.
169
94
 
170
95
  ---
171
96
 
@@ -176,36 +101,15 @@ iteration = 0
176
101
  WHILE true:
177
102
  iteration += 1
178
103
 
179
- # Circuit breaker (detect stagnation)
180
- IF iteration >= 3:
181
- stalled_items = items that failed in ALL of last 3 iterations
182
- IF stalled_items.length > 0:
183
- Log: "REFINE-OR-PIVOT: {stalled_items.length} items stalled for 3+ iterations"
184
- FOR EACH stalled_item:
185
- IF same_root_cause_each_time: REFINE (targeted fix for root cause)
186
- IF different_failures_each_time: PIVOT (scrap approach, try alternative)
187
- IF no_clear_pattern: AskUserQuestion with evidence from last 3 attempts
104
+ # Circuit breaker (CR-37)
105
+ IF iteration >= 3 AND same gaps as previous iteration:
106
+ AskUserQuestion: "Loop stalled after {iteration} passes. Re-plan / Continue / Stop?"
188
107
 
189
108
  result = Task(subagent_type="massu-plan-auditor", model="opus", prompt="
190
109
  Audit iteration {iteration} for plan: {PLAN_PATH}
191
110
  Verify ALL deliverables with VR-* proof.
192
111
  Check code quality (patterns, build, types, tests).
193
112
  Check plan coverage (every item verified).
194
-
195
- VR-SPEC-MATCH: For EVERY UI plan item with specific CSS classes,
196
- component names, or layout instructions -- grep the implementation for those
197
- EXACT strings. Missing = gap.
198
-
199
- VR-PIPELINE: For features with data pipelines (AI, cron, generation),
200
- trigger the pipeline procedure and verify output is non-empty. Empty = gap.
201
-
202
- SPRINT CONTRACT VERIFICATION: For each plan item with a sprint contract
203
- (from Phase 2A.5), verify EVERY acceptance criterion is met:
204
- - Read the contract's acceptance criteria list
205
- - Test each criterion with specific evidence (screenshot, grep, DOM state)
206
- - Any unmet criterion = gap, even if the code 'looks right'
207
- - Contract criteria are IN ADDITION TO VR-* checks — both must pass
208
-
209
113
  Fix any gaps you find.
210
114
 
211
115
  CRITICAL: GAPS_DISCOVERED = total FOUND, even if fixed.
@@ -222,7 +126,7 @@ END WHILE
222
126
 
223
127
  ---
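The simplified circuit breaker added to the audit loop in the preceding hunk (`IF iteration >= 3 AND same gaps as previous iteration`) can be read as the predicate sketched below. The function name and the representation of gaps as lists of item IDs are assumptions made for illustration. The extra requirement that the gap set be non-empty is also an assumption, on the reasoning that a pass with zero gaps would end the loop anyway.

```python
def circuit_breaker_tripped(iteration, current_gaps, previous_gaps):
    """Trip when the loop has run at least 3 passes and the gap set is unchanged."""
    if iteration < 3:
        return False
    # Assumption: an empty gap set is not a stall.
    return bool(current_gaps) and set(current_gaps) == set(previous_gaps)


print(circuit_breaker_tripped(3, ["P1-001"], ["P1-001"]))
print(circuit_breaker_tripped(2, ["P1-001"], ["P1-001"]))
print(circuit_breaker_tripped(4, ["P1-001"], ["P1-001", "P2-002"]))
```

When the predicate is true, the new version asks the user whether to re-plan, continue, or stop, instead of running the removed REFINE-OR-PIVOT analysis per stalled item.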
224
128
 
225
- ## Phase 2E: Post-Build Reflection + Memory Persist
129
+ ## Phase 2E: Post-Build Reflection + Memory Persist (CR-38)
226
130
 
227
131
  **MANDATORY -- reflection + memory write = ONE atomic action.**
228
132
 
@@ -240,14 +144,13 @@ Apply any low-risk refactors immediately. Log remaining suggestions in plan unde
240
144
 
241
145
  ## Phase 2F: Documentation Sync (User-Facing Features)
242
146
 
243
- If plan includes ANY user-facing features:
147
+ If plan includes ANY user-facing features (new MCP tools, config changes, hook changes):
244
148
 
245
- 1. Audit documentation against code changes
246
- 2. Update affected documentation pages
247
- 3. Add changelog entry
248
- 4. Commit documentation updates (separate repo if applicable)
149
+ 1. Update relevant documentation (README, API docs, config docs)
150
+ 2. Ensure tool descriptions match implementation
151
+ 3. Update config schema documentation if config keys changed
249
152
 
250
- Skip ONLY if purely backend/infra with zero user-facing changes.
153
+ Skip ONLY if purely internal refactoring with zero user-facing changes.
251
154
 
252
155
  ---
253
156
 
@@ -257,28 +160,27 @@ Skip ONLY if purely backend/infra with zero user-facing changes.
257
160
  [GOLDEN PATH -- PHASE 2G: BROWSER VERIFICATION]
258
161
  ```
259
162
 
260
- **Auto-trigger condition**: If plan touches ANY UI files, this phase runs automatically. If purely backend/infra with zero UI changes, skip with log note: `Browser verification: SKIPPED (no UI files changed)`.
261
-
262
- **SAFETY RULE**: NEVER use real client data. NEVER click destructive actions (Delete, Send, Submit) on production.
163
+ **Auto-trigger condition**: If plan touches ANY UI/demo files or produces visual output, this phase runs automatically. If purely backend/MCP/config with zero visual output, skip with log note: `Browser verification: SKIPPED (no UI files changed)`.
263
164
 
264
165
  ### 2G.1 Determine Target Pages
265
166
 
266
- Map changed files to URLs:
167
+ Map changed features to testable URLs:
168
+ - If the project has a demo page or documentation site: test affected pages
169
+ - If testing MCP tool output: use a test harness or verify tool responses
267
170
  - Component changes: identify ALL pages that render the component
268
- - Layout changes: test ALL child routes under that layout
269
171
 
270
172
  ### 2G.2 Browser Setup & Authentication
271
173
 
272
- Use Playwright MCP plugin tools.
174
+ Use Playwright MCP plugin tools (`mcp__plugin_playwright_playwright__*`). Fallback: `mcp__playwright__*`.
273
175
 
274
- 1. `browser_navigate` to first target URL
275
- 2. `browser_snapshot` to check auth status
276
- 3. If redirected to `/login` or auth check visible: STOP and request manual login
176
+ 1. `browser_navigate` to target URL
177
+ 2. `browser_snapshot` to check page status
178
+ 3. If authentication required: STOP and request manual login
277
179
 
278
180
  ```
279
181
  AUTHENTICATION REQUIRED
280
182
 
281
- The Playwright browser is not logged in to the app.
183
+ The Playwright browser is not logged in to the target application.
282
184
  Please log in manually in the open browser window, then re-run the golden path.
283
185
  ```
284
186
 
@@ -290,24 +192,24 @@ For EACH target page:
290
192
 
291
193
  | Check | Tool | Captures |
292
194
  |-------|------|----------|
293
- | Console errors/warnings | `browser_console_messages` | React errors, TypeError, CSP violations, auth warnings |
195
+ | Console errors/warnings | `browser_console_messages` | React errors, TypeError, CSP violations |
294
196
  | Network failures | `browser_network_requests` | 500s, 404s, CORS failures, timeouts |
295
197
 
296
198
  Categorize findings:
297
199
 
298
200
  | Category | Severity |
299
201
  |----------|----------|
300
- | React crash, 500 error, data exposure | **P0 -- CRITICAL** |
301
- | Network failure, CSP violation, broken interaction, auth warning | **P1 -- HIGH** |
302
- | Visual issues, performance warnings, broken images | **P2 -- MEDIUM** |
303
- | Console warnings, deprecations, i18n missing keys | **P3 -- LOW** |
202
+ | Crash, 500 error, data exposure | **P0 -- CRITICAL** |
203
+ | Network failure, broken interaction | **P1 -- HIGH** |
204
+ | Visual issues, performance warnings | **P2 -- MEDIUM** |
205
+ | Console warnings, deprecations | **P3 -- LOW** |
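As a rough illustration of the updated severity table, the mapping below assigns each finding kind the level from the `+` rows and reports the worst one. The dictionary keys are hypothetical labels chosen for this sketch; the source only gives the prose categories.

```python
# Finding kinds are hypothetical labels; the levels follow the updated table rows.
SEVERITY = {
    "crash": "P0",
    "500_error": "P0",
    "data_exposure": "P0",
    "network_failure": "P1",
    "broken_interaction": "P1",
    "visual_issue": "P2",
    "performance_warning": "P2",
    "console_warning": "P3",
    "deprecation": "P3",
}


def worst_severity(findings):
    """Return the most severe level among the findings, or None when there are none."""
    levels = [SEVERITY[kind] for kind in findings]
    # "P0" sorts before "P1", so the minimum string is the most severe level.
    return min(levels) if levels else None


print(worst_severity(["deprecation", "network_failure"]))
print(worst_severity([]))
```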
304
206
 
305
207
  ### 2G.4 Interactive Testing (Per Page)
306
208
 
307
- 1. `browser_snapshot` -> inventory ALL interactive elements
209
+ 1. `browser_snapshot` -> inventory ALL interactive elements (buttons, links, forms, selects, tabs, modals, data tables)
308
210
  2. For EACH testable element:
309
- - Capture console state BEFORE interaction
310
- - Perform interaction
211
+ - Capture console state BEFORE interaction (`browser_console_messages`)
212
+ - Perform interaction (`browser_click`, `browser_select_option`, `browser_fill_form`)
311
213
  - Wait 2-3 seconds for async operations
312
214
  - Capture console state AFTER interaction
313
215
  - Record any NEW errors introduced
@@ -319,15 +221,15 @@ Categorize findings:
319
221
  ### 2G.5 Visual & Performance Audit
320
222
 
321
223
  **Visual checks**:
322
- - Broken images: find `img` elements with `naturalWidth === 0`
224
+ - Broken images: `browser_evaluate` to find `img` elements with `naturalWidth === 0`
323
225
  - Layout issues: overflow, overlapping, missing content, broken alignment
324
- - Responsive: test at 1440x900 (desktop), 768x1024 (tablet), 375x812 (mobile)
325
- - Screenshot evidence at each breakpoint if issues found
226
+ - Responsive: `browser_resize` at 1440x900 (desktop), 768x1024 (tablet), 375x812 (mobile)
227
+ - Screenshot evidence: `browser_take_screenshot` at each breakpoint if issues found
326
228
 
327
229
  **Performance checks**:
328
- - Page load timing
329
- - Resources > 500KB
330
- - Slow API calls > 3s, duplicate requests
230
+ - Page load timing via `browser_evaluate` (`performance.getEntriesByType('navigation')`)
231
+ - Resources > 500KB via `browser_evaluate` (`performance.getEntriesByType('resource')`)
232
+ - Slow API calls > 3s, duplicate requests via `browser_network_requests`
331
233
 
332
234
  | Metric | Good | Needs Work | Critical |
333
235
  |--------|------|------------|----------|
@@ -361,10 +263,9 @@ Report includes: summary table, console errors, network failures, interactive el
361
263
  ### 2G.8 Auto-Learning Protocol
362
264
 
363
265
  For EACH browser-discovered fix:
364
- 1. Record with type="bugfix" including browser symptom -> code fix mapping
365
- 2. Update MEMORY.md with symptom/root cause/fix/files
366
- 3. Add to `scripts/pattern-scanner.sh` if the bad pattern is grep-able
367
- 4. Codebase-wide search for same bad pattern (CR-9) -- fix ALL instances
266
+ 1. Update memory files with symptom/root cause/fix/files
267
+ 2. Add to `scripts/massu-pattern-scanner.sh` if the bad pattern is grep-able
268
+ 3. Codebase-wide search for same bad pattern (CR-9) -- fix ALL instances
368
269
 
369
270
  ---
370
271
 
@@ -386,11 +287,9 @@ The golden path spawns multiple subagents across Phase 2. Follow these principle
386
287
 
387
288
  ```
388
289
  [GOLDEN PATH -- PHASE 2 COMPLETE]
389
- Sprint contracts: NEGOTIATED ({N} items contracted, {M} N/A)
390
290
  All plan items implemented
391
- Multi-perspective review: PASSED (security, architecture, UX)
392
- QA evaluator: PASSED / SKIPPED (no UI files)
393
- Verification audit: PASSED (Loop #{iteration}, 0 gaps, contracts verified)
291
+ Multi-perspective review: PASSED (security, architecture, quality)
292
+ Verification audit: PASSED (Loop #{iteration}, 0 gaps)
394
293
  Post-build reflection: PERSISTED to memory
395
294
  Documentation sync: COMPLETE / N/A
396
295
  Browser verification: PASSED ({N} pages tested, {M} issues fixed) / SKIPPED (no UI files)