mindsystem-cc 3.20.0 → 3.22.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (113)
  1. package/README.md +9 -18
  2. package/agents/ms-mockup-designer.md +1 -1
  3. package/agents/ms-plan-checker.md +30 -30
  4. package/agents/ms-plan-writer.md +1 -1
  5. package/agents/ms-product-researcher.md +71 -0
  6. package/agents/ms-research-synthesizer.md +1 -1
  7. package/agents/ms-researcher.md +8 -8
  8. package/agents/ms-roadmapper.md +9 -13
  9. package/agents/ms-verifier.md +25 -117
  10. package/bin/install.js +68 -5
  11. package/commands/ms/add-phase.md +7 -8
  12. package/commands/ms/add-todo.md +3 -4
  13. package/commands/ms/adhoc.md +4 -5
  14. package/commands/ms/audit-milestone.md +15 -14
  15. package/commands/ms/complete-milestone.md +27 -24
  16. package/commands/ms/config.md +229 -0
  17. package/commands/ms/create-roadmap.md +3 -4
  18. package/commands/ms/debug.md +3 -4
  19. package/commands/ms/design-phase.md +11 -13
  20. package/commands/ms/discuss-phase.md +26 -22
  21. package/commands/ms/doctor.md +28 -205
  22. package/commands/ms/execute-phase.md +20 -12
  23. package/commands/ms/help.md +46 -39
  24. package/commands/ms/insert-phase.md +6 -7
  25. package/commands/ms/map-codebase.md +1 -2
  26. package/commands/ms/new-milestone.md +41 -19
  27. package/commands/ms/new-project.md +56 -47
  28. package/commands/ms/plan-milestone-gaps.md +7 -9
  29. package/commands/ms/plan-phase.md +4 -5
  30. package/commands/ms/progress.md +3 -4
  31. package/commands/ms/remove-phase.md +3 -4
  32. package/commands/ms/research-phase.md +11 -16
  33. package/commands/ms/research-project.md +19 -26
  34. package/commands/ms/review-design.md +4 -2
  35. package/commands/ms/verify-work.md +6 -8
  36. package/mindsystem/references/continuation-format.md +3 -3
  37. package/mindsystem/references/principles.md +1 -1
  38. package/mindsystem/references/routing/audit-result-routing.md +12 -11
  39. package/mindsystem/references/routing/between-milestones-routing.md +2 -2
  40. package/mindsystem/references/routing/milestone-complete-routing.md +1 -1
  41. package/mindsystem/references/routing/next-phase-routing.md +4 -2
  42. package/mindsystem/references/verification-patterns.md +0 -37
  43. package/mindsystem/templates/config.json +2 -1
  44. package/mindsystem/templates/context.md +7 -6
  45. package/mindsystem/templates/milestone-archive.md +5 -5
  46. package/mindsystem/templates/milestone-context.md +1 -1
  47. package/mindsystem/templates/milestone.md +9 -9
  48. package/mindsystem/templates/project.md +2 -2
  49. package/mindsystem/templates/research-subagent-prompt.md +3 -3
  50. package/mindsystem/templates/roadmap-milestone.md +14 -14
  51. package/mindsystem/templates/roadmap.md +10 -8
  52. package/mindsystem/templates/state.md +2 -2
  53. package/mindsystem/templates/verification-report.md +3 -26
  54. package/mindsystem/workflows/adhoc.md +1 -1
  55. package/mindsystem/workflows/complete-milestone.md +40 -75
  56. package/mindsystem/workflows/discuss-phase.md +141 -65
  57. package/mindsystem/workflows/doctor-fixes.md +273 -0
  58. package/mindsystem/workflows/execute-phase.md +9 -21
  59. package/mindsystem/workflows/execute-plan.md +3 -0
  60. package/mindsystem/workflows/map-codebase.md +6 -12
  61. package/mindsystem/workflows/mockup-generation.md +47 -23
  62. package/mindsystem/workflows/plan-phase.md +13 -6
  63. package/mindsystem/workflows/transition.md +2 -2
  64. package/mindsystem/workflows/verify-work.md +97 -70
  65. package/package.json +1 -1
  66. package/scripts/__pycache__/ms-tools.cpython-314.pyc +0 -0
  67. package/scripts/__pycache__/test_ms_tools.cpython-314-pytest-9.0.2.pyc +0 -0
  68. package/scripts/fixtures/scan-context/.planning/ROADMAP.md +16 -0
  69. package/scripts/fixtures/scan-context/.planning/adhoc/20260220-fix-token-SUMMARY.md +12 -0
  70. package/scripts/fixtures/scan-context/.planning/config.json +3 -0
  71. package/scripts/fixtures/scan-context/.planning/debug/resolved/token-bug.md +11 -0
  72. package/scripts/fixtures/scan-context/.planning/knowledge/auth.md +11 -0
  73. package/scripts/fixtures/scan-context/.planning/phases/02-infra/02-1-SUMMARY.md +20 -0
  74. package/scripts/fixtures/scan-context/.planning/phases/04-setup/04-1-SUMMARY.md +21 -0
  75. package/scripts/fixtures/scan-context/.planning/phases/05-auth/05-1-SUMMARY.md +28 -0
  76. package/scripts/fixtures/scan-context/.planning/todos/done/setup-db.md +10 -0
  77. package/scripts/fixtures/scan-context/.planning/todos/pending/add-logout.md +10 -0
  78. package/scripts/fixtures/scan-context/expected-output.json +257 -0
  79. package/scripts/ms-tools.py +2884 -0
  80. package/scripts/test_ms_tools.py +1622 -0
  81. package/agents/ms-flutter-code-quality.md +0 -169
  82. package/agents/ms-flutter-reviewer.md +0 -211
  83. package/agents/ms-flutter-simplifier.md +0 -79
  84. package/commands/ms/list-phase-assumptions.md +0 -56
  85. package/mindsystem/workflows/list-phase-assumptions.md +0 -178
  86. package/mindsystem/workflows/verify-phase.md +0 -625
  87. package/scripts/__pycache__/compare_mockups.cpython-314.pyc +0 -0
  88. package/scripts/archive-milestone-files.sh +0 -68
  89. package/scripts/archive-milestone-phases.sh +0 -138
  90. package/scripts/doctor-scan.sh +0 -402
  91. package/scripts/gather-milestone-stats.sh +0 -179
  92. package/scripts/generate-adhoc-patch.sh +0 -79
  93. package/scripts/generate-phase-patch.sh +0 -169
  94. package/scripts/scan-artifact-subsystems.sh +0 -55
  95. package/scripts/scan-planning-context.py +0 -839
  96. package/scripts/update-state.sh +0 -59
  97. package/scripts/validate-execution-order.sh +0 -104
  98. package/skills/flutter-code-quality/SKILL.md +0 -143
  99. package/skills/flutter-code-simplification/SKILL.md +0 -102
  100. package/skills/flutter-senior-review/AGENTS.md +0 -869
  101. package/skills/flutter-senior-review/SKILL.md +0 -205
  102. package/skills/flutter-senior-review/principles/dependencies-data-not-callbacks.md +0 -75
  103. package/skills/flutter-senior-review/principles/dependencies-provider-tree.md +0 -85
  104. package/skills/flutter-senior-review/principles/dependencies-temporal-coupling.md +0 -97
  105. package/skills/flutter-senior-review/principles/pragmatism-consistent-error-handling.md +0 -130
  106. package/skills/flutter-senior-review/principles/pragmatism-speculative-generality.md +0 -91
  107. package/skills/flutter-senior-review/principles/state-data-clumps.md +0 -64
  108. package/skills/flutter-senior-review/principles/state-invalid-states.md +0 -53
  109. package/skills/flutter-senior-review/principles/state-single-source-of-truth.md +0 -68
  110. package/skills/flutter-senior-review/principles/state-type-hierarchies.md +0 -75
  111. package/skills/flutter-senior-review/principles/structure-composition-over-config.md +0 -105
  112. package/skills/flutter-senior-review/principles/structure-shared-visual-patterns.md +0 -107
  113. package/skills/flutter-senior-review/principles/structure-wrapper-pattern.md +0 -90
package/README.md CHANGED
@@ -262,17 +262,17 @@ Replace `<N>` with the phase number you're working on.
  **Run:**

  ```
- /ms:audit-milestone 1.0.0
- /ms:complete-milestone 1.0.0
- /ms:new-milestone "v1.1"
+ /ms:audit-milestone
+ /ms:complete-milestone
+ /ms:new-milestone
  ```

  **What you'll get:**

- - `.planning/milestones/v1.0/` — archived milestone (ROADMAP, REQUIREMENTS, DECISIONS, research)
+ - `.planning/milestones/mvp/` — archived milestone (ROADMAP, REQUIREMENTS, DECISIONS, research)
  - Active docs stay lean; full detail lives in the version folder

- **Tip:** Milestone review can be **report-only** (e.g., Flutter structural review) so you stay in control. Create a quality phase, or accept tech debt explicitly — your call.
+ **Tip:** Milestone review can be **report-only** so you stay in control. Create a quality phase, or accept tech debt explicitly — your call.

  ---

@@ -312,17 +312,9 @@ After `/ms:execute-phase` (and optionally `/ms:audit-milestone`), Mindsystem run

  | Value | What it does |
  | ------------------------- | -------------------------------------------------------------- |
- | `null` | Use the default (stack-aware when available) |
- | `"ms-code-simplifier"` | Generic reviewer — improves clarity and maintainability |
- | `"ms-flutter-simplifier"` | Flutter/Dart-specific strong widget and Riverpod conventions |
- | `"ms-flutter-reviewer"` | Flutter structural analysis (report-only, no code changes) |
- | `"skip"` | Disable review for that level |
-
- **Flutter-specific tools (built-in):**
-
- - **`ms-flutter-simplifier`** — pragmatic refactors that preserve behavior
- - **`ms-flutter-reviewer`** — milestone-level structural audit with actionable report (you control the fixes)
- - **`flutter-senior-review` skill** — domain principles that raise review quality beyond generic lint advice
+ | `null` | No reviewer (default) |
+ | `"ms-code-simplifier"` | Generic reviewer — improves clarity and maintainability |
+ | `"skip"` | Disable review for that level |

  ---

@@ -338,11 +330,10 @@ Full docs live in `/ms:help` (same content as `commands/ms/help.md`).
  | `/ms:map-codebase` | Document existing repo's stack, structure, and conventions |
  | `/ms:research-project` | Do domain research and save findings to `.planning/research/` |
  | `/ms:create-roadmap` | Define requirements and create phases mapped to them |
- | `/ms:discuss-phase <number>` | Lock intent and constraints before planning |
+ | `/ms:discuss-phase <number>` | Product-informed collaborative thinking before planning |
  | `/ms:design-phase <number>` | Generate UI/UX spec for UI-heavy work |
  | `/ms:review-design [scope]` | Audit and improve existing UI quality |
  | `/ms:research-phase <number>` | Do deep research for niche phase domains |
- | `/ms:list-phase-assumptions <number>` | Show what Mindsystem assumes before planning |
  | `/ms:plan-phase [number] [--gaps]` | Create small, verifiable plans with optional risk-based verification |
  | `/ms:check-phase <number>` | Sanity-check plans before execution |
  | `/ms:execute-phase <phase-number>` | Run all unexecuted plans in fresh subagents |
package/agents/ms-mockup-designer.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: ms-mockup-designer
  description: Generates self-contained HTML/CSS mockups for design direction exploration. Spawned by design-phase command.
- model: sonnet
+ model: opus
  tools: Read, Write, Bash
  color: magenta
  ---
package/agents/ms-plan-checker.md CHANGED
@@ -186,34 +186,38 @@ issue:
  **Question:** Will plans complete within context budget?

  **Process:**
- 1. Count `### ` subsections (changes) per plan
- 2. Count files from `**Files:**` lines per plan
- 3. Check against thresholds
+ 1. For each `### ` subsection (change), classify its weight:
+ - **Light (5%):** Config changes, localization keys, renaming, simple field additions, pattern-copying with parameter substitution
+ - **Medium (10%):** CRUD endpoints, pattern-following implementations, widget extraction, single-file refactoring
+ - **Heavy (20%):** Complex business logic, novel state management, architecture changes, multi-file integrations
+ 2. Sum estimated budget per plan (target: 25-45%)
+ 3. Check structural signals

- **Thresholds:**
- | Metric | Target | Warning | Blocker |
- |--------|--------|---------|---------|
- | Changes/plan | 2-3 | 4 | 5+ |
- | Files/plan | 5-8 | 10 | 15+ |
- | Total context | ~50% | ~70% | 80%+ |
+ **Thresholds (warning-level only — scope never produces blockers):**
+ | Metric | Target | Warning |
+ |--------|--------|---------|
+ | Estimated budget/plan | 25-45% | >50% |
+ | Files per single change | 1-3 | 8+ |
+
+ **Raw change count is NOT a threshold.** A plan with 8 lightweight, formulaic changes (~40% budget) is healthier than a plan with 3 heavy, novel changes (~60%). Assess complexity and budget, not count.
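
The weight-based estimate introduced by the added lines above can be illustrated with a small sketch. It is not part of the package: the names `WEIGHTS`, `ScopeEstimate`, and `estimate_plan_budget` are hypothetical, and only the percentages and the >50% warning level come from the diff. It reproduces the two figures quoted in the text (8 light changes at about 40%, 3 heavy changes at about 60%).

```python
from dataclasses import dataclass

# Per-change weights in percent, taken from the Light/Medium/Heavy list above.
WEIGHTS = {"light": 5, "medium": 10, "heavy": 20}
WARNING_THRESHOLD = 50  # percent; the thresholds table warns above this value


@dataclass
class ScopeEstimate:
    budget: int         # estimated context budget for the plan, in percent
    heavy_changes: int  # number of changes classified as heavy
    warning: bool       # True when the budget exceeds the warning threshold


def estimate_plan_budget(change_weights: list[str]) -> ScopeEstimate:
    """Sum per-change weights and flag plans over the warning threshold."""
    budget = sum(WEIGHTS[w] for w in change_weights)
    return ScopeEstimate(
        budget=budget,
        heavy_changes=change_weights.count("heavy"),
        warning=budget > WARNING_THRESHOLD,
    )


eight_light = estimate_plan_budget(["light"] * 8)
three_heavy = estimate_plan_budget(["heavy"] * 3)
```

Under these assumptions the plan with more changes stays inside the target band while the plan with fewer, heavier changes triggers the warning, which is the point the added paragraph makes about counting complexity rather than changes.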

  **Red flags:**
- - Plan with 5+ changes (quality degrades)
- - Plan with 15+ file modifications
- - Single change with 10+ files
- - Complex work (auth, payments) crammed into one plan
+ - Estimated plan budget >50% (quality will degrade)
+ - Single change with 10+ file modifications
+ - Multiple unrelated subsystems crammed into one plan
+ - Novel/complex work appearing late in a long change sequence (context fatigue risks lower attention)

  **Example issue:**
  ```yaml
  issue:
  dimension: scope_sanity
  severity: warning
- description: "Plan 01 has 5 changes - split recommended"
+ description: "Plan 01 estimated at ~55% budget - 3 heavy changes with novel state management"
  plan: "01"
  metrics:
- changes: 5
- files: 12
- fix_hint: "Split into 2 plans: foundation (01) and integration (02)"
+ estimated_budget: "55%"
+ heavy_changes: 3
+ fix_hint: "Move change 3 (complex state machine) to a separate plan"
  ```

  ## Dimension 6: Verification Derivation
@@ -301,7 +305,7 @@ PHASE_DIR=$(ls -d .planning/phases/${PADDED_PHASE}-* .planning/phases/${PHASE_AR
  ls "$PHASE_DIR"/*-PLAN.md 2>/dev/null

  # Get phase goal from ROADMAP
- grep -A 10 "Phase ${PHASE_NUM}" .planning/ROADMAP.md | head -15
+ grep -A 10 "Phase ${PADDED_PHASE}" .planning/ROADMAP.md | head -15

  # Get phase brief if exists
  ls "$PHASE_DIR"/*-BRIEF.md 2>/dev/null
@@ -342,12 +346,12 @@ Run Dimensions 1-7 from `<verification_dimensions>` against the loaded plans. Bu
  - Missing requirement coverage
  - Missing required change fields
  - Circular dependencies or file conflicts in same wave
- - Scope > 5 changes per plan

  **warning** - Should fix, execution may work
- - Scope 4 tasks (borderline)
+ - Estimated plan budget >50%
  - Implementation-focused truths
  - Minor wiring missing
+ - Novel/complex changes appearing late in change sequence

  **info** - Suggestions for improvement
  - Could split for better parallelization
@@ -369,8 +373,8 @@ issues:
  - plan: "01"
  dimension: "scope_sanity"
  severity: "warning"
- description: "Plan has 4 changes - consider splitting"
- fix_hint: "Split into foundation + integration plans"
+ description: "Plan estimated at ~50% budget - heavy changes may cause degradation"
+ fix_hint: "Consider splitting complex changes into separate plan"

  - plan: null
  dimension: "requirement_coverage"
@@ -462,14 +466,10 @@ issues:

  <anti_patterns>

- **DO NOT check code existence.** That's ms-verifier's job after execution. You verify plans, not codebase.
-
- **DO NOT run the application.** This is static plan analysis. No `npm start`, no `curl` to running server.
+ **DO NOT check the codebase.** You verify plans describe what to build — checking code existence is ms-verifier's job after execution. No `npm start`, no `curl`, no runtime verification.

  **DO NOT accept vague changes.** "Implement auth" is not specific enough. Changes need concrete files, implementation details, verification.

- **DO NOT verify implementation details.** Check that plans describe what to build, not that code exists.
-
  **DO NOT trust change titles alone.** Read the implementation details, Files lines, verification entries. A well-named change can be empty.

  </anti_patterns>
@@ -478,11 +478,11 @@ issues:

  Plan verification complete when:

- - [ ] Key links checked (wiring planned between artifacts, not just creation)
- - [ ] Scope assessed per plan (changes, files within thresholds)
+ - [ ] Context compliance checked (if CONTEXT.md: locked decisions implemented, deferred ideas excluded)
  - [ ] Must-Haves are user-observable truths, not implementation details
+ - [ ] Key links checked (wiring planned between artifacts, not just creation)
  - [ ] EXECUTION-ORDER.md validated (no missing plans, no file conflicts in same wave)
- - [ ] Context compliance checked (if CONTEXT.md: locked decisions implemented, deferred ideas excluded)
+ - [ ] Scope assessed per plan (estimated budget within thresholds)
  - [ ] Structured issues returned to orchestrator

  </success_criteria>
package/agents/ms-plan-writer.md CHANGED
@@ -91,7 +91,7 @@ The orchestrator provides structured XML:
  </proposed_grouping>

  <confirmed_skills>
- flutter-code-quality, flutter-code-simplification
+ project-skill-a, project-skill-b
  </confirmed_skills>

  <learnings>
package/agents/ms-product-researcher.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ name: ms-product-researcher
+ description: Researches competitor products, UX patterns, and industry best practices for phase-level product decisions. Spawned by /ms:discuss-phase.
+ model: sonnet
+ tools: WebSearch, WebFetch
+ color: cyan
+ ---
+
+ <input>
+ You receive: `<current_date>` (YYYY-MM), `<product_context>` (Who It's For, Core Value, How It's Different), `<phase_requirements>` (phase goal + mapped requirements), `<research_focus>` (specific product questions to investigate).
+ </input>
+
+ <role>
+ You are a Mindsystem product researcher. Deliver prescriptive, audience-grounded product intelligence — "Users expect X" beats "Consider whether X."
+
+ **Prescriptive, not exploratory.** "Users expect inline editing for this type of content" beats "You could consider inline editing or modal editing or page-based editing." Make a recommendation, explain why, let the user override.
+
+ **Audience-grounded.** Every recommendation ties back to the target audience from `<product_context>`. "Enterprise users expect X" is different from "Consumer app users expect Y." Never give generic advice.
+
+ **Competitor-aware, not competitor-driven.** Know what exists. Recommend what fits THIS product's positioning. "Competitors do X, but given your differentiation of Y, consider Z" is the ideal output shape.
+
+ **Concise and structured.** Target 2000-3000 tokens max. The orchestrator weaves your findings into a briefing — dense signal beats comprehensive coverage.
+ </role>
+
+ <tool_strategy>
+
+ | Need | Tool | Why |
+ |------|------|-----|
+ | Competitor features | WebSearch | Discover what exists |
+ | UX pattern details | WebFetch | Read specific articles/docs |
+ | Industry best practices | WebSearch | Current standards |
+ | Product comparisons | WebSearch | Side-by-side analysis |
+
+ **Search freshness:** Use `<current_date>` to keep results current, but apply year strings selectively:
+ - **Add year** to trend/best-practice queries where listicle freshness matters: `"payment terminal UX best practices 2026"`
+ - **Omit year** from product-specific queries where it narrows results unhelpfully: `"Square Terminal cashier workflow features"` (Square's docs don't mention the year)
+
+ **Budget:** 5-8 searches max. Prioritize breadth over depth — the user needs a landscape, not a dissertation.
+ </tool_strategy>
+
+ <output>
+ Return structured text (do NOT write files). Use this format:
+
+ ```markdown
+ ## PRODUCT RESEARCH COMPLETE
+
+ ### Competitor Landscape
+ [How 3-5 relevant competitors handle this. Specific features, not vague descriptions.]
+
+ ### UX Patterns Users Expect
+ [Industry conventions for this type of feature. What feels "right" to the target audience.]
+
+ ### Audience Expectations
+ [What the target audience specifically expects, grounded in Who It's For from `<product_context>`.]
+
+ ### Key Tradeoffs
+ [2-3 decision points with pros/cons and recommendation for each.]
+
+ ### Recommendations
+ [Prescriptive recommendations tied to this product's positioning. "Do X because Y."]
+ ```
+ </output>
+
+ <success_criteria>
+ - Findings grounded in target audience, not generic
+ - Competitor analysis names specific products and features
+ - Recommendations are prescriptive with reasoning
+ - Total output 2000-3000 tokens
+ - No technical implementation details
+ - Every recommendation connects to product positioning
+ </success_criteria>
package/agents/ms-research-synthesizer.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: ms-research-synthesizer
  description: Synthesizes research outputs from parallel researcher agents into SUMMARY.md. Spawned by /ms:research-project after 4 researcher agents complete.
- model: haiku
+ model: sonnet
  tools: Read, Write, Bash
  color: purple
  ---
package/agents/ms-researcher.md CHANGED
@@ -1,7 +1,7 @@
  ---
  name: ms-researcher
  description: Conducts comprehensive research using systematic methodology, source verification, and structured output. Spawned by /ms:research-phase and /ms:research-project orchestrators.
- model: sonnet
+ model: opus
  tools: Read, Write, Bash, Grep, Glob, WebSearch, WebFetch
  color: cyan
  ---
@@ -195,18 +195,18 @@ When researching "best library for X":

  ## ms-lookup CLI

- The CLI is at `~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh`.
+ The CLI is available as `ms-lookup`.

  ### Library Documentation

  ```bash
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh docs <library> "<query>"
+ ms-lookup docs <library> "<query>"
  ```

  Example:
  ```bash
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh docs nextjs "app router file conventions"
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh docs "react-three-fiber" "physics setup"
+ ms-lookup docs nextjs "app router file conventions"
+ ms-lookup docs "react-three-fiber" "physics setup"
  ```

  **When to use:** Library APIs, framework features, configuration options, version-specific behavior. This is your PRIMARY source for library-specific questions — most authoritative.
@@ -216,13 +216,13 @@ Example:
  ### Deep Research

  ```bash
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh deep "<query>"
+ ms-lookup deep "<query>"
  ```

  Example:
  ```bash
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh deep "authentication patterns for SaaS applications"
- ~/.claude/mindsystem/scripts/ms-lookup-wrapper.sh deep "WebGPU browser support and production readiness 2026"
+ ms-lookup deep "authentication patterns for SaaS applications"
+ ms-lookup deep "WebGPU browser support and production readiness 2026"
  ```

  **When to use:** Architecture decisions, technology comparisons, comprehensive ecosystem surveys, best practices synthesis. Use for HIGH-VALUE research questions — this costs money.
package/agents/ms-roadmapper.md CHANGED
@@ -222,20 +222,16 @@ All use binary Likely/Unlikely with parenthetical reason. These are hints to use

  ### Discussion Indicators

- **Problem it solves:** User's mental model isn't documented. Planning happens without understanding what's essential vs nice-to-have.
+ **Problem it solves:** Claude plans based on assumptions about user intent. Discussion surfaces those assumptions before they become embedded in plans.

- **Likely when ANY of:**
- - Phase goal mentions "user can [verb]" without specifying HOW
- - Success criteria have multiple valid interpretations
- - Phase involves UX decisions (not just backend)
- - Requirements mention experiential qualities ("should feel", "intuitive")
- - Novel feature not based on existing patterns
+ **Default: Likely.** Every phase benefits from surfacing Claude's assumptions before planning. Discussion now provides deep artifact loading, assumptions surfacing, and product-informed questions — valuable even for seemingly "clear" phases.

- **Unlikely when ALL of:**
- - Requirements are specific and unambiguous
- - Backend/infrastructure only (APIs, database, CI/CD)
- - Follows clearly established patterns
- - Bug fix, performance, or technical debt work
+ **When Likely**, the rationale enumerates 2-4 phase-specific assumptions or open questions (not generic labels like "ambiguous user flow"). Example: "Likely (assumes password reset uses email not SMS, unclear if social login needed, session duration unspecified)"
+
+ **Unlikely only when ALL of:**
+ - Fully mechanical (zero design decisions)
+ - Zero ambiguity in scope or approach
+ - Examples: version bump, rename-only refactor, config-only change, pure deletion/cleanup
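
As a rough illustration of the new default, the rule added above can be written as a small formatter. This is a sketch only: `discuss_indicator` and its parameters are hypothetical names, and the roadmapper agent is prompted in prose rather than running code like this.

```python
def discuss_indicator(assumptions: list[str], fully_mechanical: bool = False) -> str:
    """Format a **Discuss** line following the Likely-by-default rule above."""
    # "Unlikely" is reserved for fully mechanical work with nothing left open.
    if fully_mechanical and not assumptions:
        return "**Discuss**: Unlikely (mechanical change, zero decisions)"
    # A Likely rationale enumerates 2-4 phase-specific assumptions or questions.
    if not 2 <= len(assumptions) <= 4:
        raise ValueError("a Likely rationale should list 2-4 phase-specific assumptions")
    return f"**Discuss**: Likely ({', '.join(assumptions)})"


line = discuss_indicator([
    "assumes password reset uses email not SMS",
    "unclear if social login needed",
    "session duration unspecified",
])
```

The example input reuses the rationale quoted in the added text, so `line` matches the shape the roadmap template expects.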

  ### Design Indicators

@@ -273,7 +269,7 @@ All use binary Likely/Unlikely with parenthetical reason. These are hints to use
  For each phase in ROADMAP.md:

  ```markdown
- **Discuss**: Likely (ambiguous user flow) | Unlikely (clear requirements)
+ **Discuss**: Likely (assumes X, unclear if Y, Z unspecified) | Unlikely (mechanical change, zero decisions)
  **Discuss topics**: [What to clarify] (only if Likely)
  **Design**: Likely (significant new UI) | Unlikely (backend only)
  **Design focus**: [What to design] (only if Likely)
package/agents/ms-verifier.md CHANGED
@@ -19,13 +19,7 @@ Your job: Goal-backward verification. Start from what the phase SHOULD deliver,

  A task "create chat component" can be marked complete when the component is a placeholder. The task was done — a file was created — but the goal "working chat interface" was not achieved.

- Goal-backward verification starts from the outcome and works backwards:
-
- 1. What must be TRUE for the goal to be achieved?
- 2. What must EXIST for those truths to hold?
- 3. What must be WIRED for those artifacts to function?
-
- Then verify each level against the actual codebase.
+ Goal-backward verification starts from the outcome and works backwards — verify each level against the actual codebase.
  </core_principle>

  <verification_process>
@@ -205,7 +199,7 @@ Identify the project's tech stack from file extensions and project structure. Fo
  If REQUIREMENTS.md exists and has requirements mapped to this phase:

  ```bash
- grep -E "Phase ${PHASE_NUM}" .planning/REQUIREMENTS.md 2>/dev/null
+ grep -E "^| ${PHASE_NUM}" .planning/REQUIREMENTS.md 2>/dev/null
  ```
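
One thing that may be worth checking in the added pattern: in extended regular expressions an unescaped `|` is alternation, so `^| N` reads as "start of line, or a space followed by N" and the first branch matches every line. If the intent is a literal table pipe at the start of a row, the pipe would need escaping. The sketch below uses Python's `re` module, which treats `^` and `|` the same way for this pattern; the sample rows are hypothetical, not taken from a real REQUIREMENTS.md.

```python
import re

# Hypothetical REQUIREMENTS.md lines: two table rows and one prose line.
lines = [
    "| 3 | REQ-01 | Login |",
    "Unrelated prose line",
    "| 13 | REQ-09 | Export |",
]

unescaped = re.compile(r"^| 3")   # alternation: start-of-line OR " 3"
escaped = re.compile(r"^\| 3")    # literal pipe at start of line, then " 3"

matches_unescaped = [line for line in lines if unescaped.search(line)]
matches_escaped = [line for line in lines if escaped.search(line)]
```

With the unescaped form every line is kept; with the escaped form only the row for phase 3 is kept.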

  For each requirement:
@@ -226,7 +220,7 @@ Identify files modified in this phase from PLAN.md `**Files:**` lines or git his

  ```bash
  # Extract files from PLAN.md (trustworthy source)
- grep "^\*\*Files:\*\*" "$PHASE_DIR"/*-PLAN.md | sed 's/.*`\([^`]*\)`.*/\1/' | sort -u
+ grep -oE '`[^`]+`' "$PHASE_DIR"/*-PLAN.md | grep -v "PLAN.md" | tr -d '`' | sort -u
  ```
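
The intent of the new pipeline (collect every backtick-quoted span from the plan files, drop references to PLAN.md files, de-duplicate, sort) can be sketched in Python as follows. The function name `backticked_paths` and the sample plan texts are hypothetical; this is an illustration of the extraction, not code from the package.

```python
import re

BACKTICK_SPAN = re.compile(r"`([^`]+)`")


def backticked_paths(plan_texts: dict[str, str]) -> list[str]:
    """Collect unique backtick-quoted spans across plans, skipping PLAN.md mentions."""
    found = set()
    for text in plan_texts.values():
        for span in BACKTICK_SPAN.findall(text):
            # Filter on the span itself, not on the name of the file it came from.
            if "PLAN.md" not in span:
                found.add(span)
    return sorted(found)


paths = backticked_paths({
    "05-1-PLAN.md": "**Files:** `src/a.ts`, `src/b.ts`",
    "05-2-PLAN.md": "**Files:** `src/a.ts`\nSee `05-1-PLAN.md` for context",
})
```

The sketch filters on the matched span rather than the whole output line. With the shell version, `grep` prefixes each match with the file name when it is given more than one file, so the later `grep -v "PLAN.md"` stage would likely discard every line for multi-plan phases; `grep -h` suppresses that prefix.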
231
225
 
232
226
  Scan each file for anti-patterns: `TODO/FIXME/XXX/HACK` comments, placeholder content (`coming soon`, `will be here`), empty implementations (`return null`, `return {}`, `=> {}`), console.log-only handlers.
@@ -237,44 +231,16 @@ Categorize findings:
  - ⚠️ Warning: Indicates incomplete (TODO comments, console.log)
  - ℹ️ Info: Notable but not problematic

- ## Step 8: Identify Human Verification Needs
-
- Some things can't be verified programmatically:
-
- **Always needs human:**
-
- - Visual appearance (does it look right?)
- - User flow completion (can you do the full task?)
- - Real-time behavior (WebSocket, SSE updates)
- - External service integration (payments, email)
- - Performance feel (does it feel fast?)
- - Error message clarity
-
- **Needs human if uncertain:**
-
- - Complex wiring that grep can't trace
- - Dynamic behavior depending on state
- - Edge cases and error states
-
- **Format for human verification:**
-
- ```markdown
- ### 1. {Test Name}
-
- **Test:** {What to do}
- **Expected:** {What should happen}
- **Why human:** {Why can't verify programmatically}
- ```
-
- ## Step 9: Determine Overall Status
+ ## Step 8: Determine Overall Status

  **Status: passed**

- - All truths VERIFIED
+ - All truths VERIFIED or UNCERTAIN
  - All artifacts pass level 1-3
  - All key links WIRED
  - No blocker anti-patterns
- - (Human verification items are OK — will be prompted)
+
+ UNCERTAIN truths count toward passed — they are structurally present but need functional confirmation through UAT.

  **Status: gaps_found**

@@ -283,64 +249,19 @@ Some things can't be verified programmatically:
  - OR one or more key links NOT_WIRED
  - OR blocker anti-patterns found

- **Status: human_needed**
-
- - All automated checks pass
- - BUT items flagged for human verification
- - Can't determine goal achievement without human
-
  **Calculate score:**

  ```
  score = (verified_truths / total_truths)
  ```
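
With `human_needed` removed, the status decision reduces to two outcomes. The sketch below shows one way to read the revised rules together with the score formula. It is an illustration under assumptions: `summarize_verification`, its parameters, and the `FAILED` label are hypothetical, and the `uncertain` value here counts UNCERTAIN truths only, whereas the report template also counts NEEDS HUMAN requirements.

```python
def summarize_verification(
    truth_statuses: list[str],
    artifacts_ok: bool,
    links_wired: bool,
    blocker_antipatterns: int,
) -> dict:
    """Derive status, score, and uncertain count from per-truth results."""
    total = len(truth_statuses)
    verified = truth_statuses.count("VERIFIED")
    uncertain = truth_statuses.count("UNCERTAIN")
    # UNCERTAIN truths do not block "passed"; any other status does.
    truths_ok = verified + uncertain == total
    passed = truths_ok and artifacts_ok and links_wired and blocker_antipatterns == 0
    return {
        "status": "passed" if passed else "gaps_found",
        "score": f"{verified}/{total}",  # only VERIFIED truths count toward the score
        "uncertain": uncertain,
    }


ok = summarize_verification(["VERIFIED", "UNCERTAIN", "VERIFIED"], True, True, 0)
gaps = summarize_verification(["VERIFIED", "FAILED"], True, True, 0)
```

Note that under this reading a phase can pass with a score below N/N, since UNCERTAIN truths satisfy the status check without counting as verified.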

- ## Step 10: Structure Gap Output (If Gaps Found)
+ ## Step 9: Structure Gap Output (If Gaps Found)

- When gaps are found, structure them for consumption by `/ms:plan-phase --gaps`.
+ When gaps are found, structure them in YAML frontmatter for consumption by `/ms:plan-phase --gaps`. Use the `gaps:` format shown in the VERIFICATION.md template below.

- **Output structured gaps in YAML frontmatter:**
+ **Gap fields:** `truth` (observable truth that failed), `status` (failed | partial), `reason` (why it failed), `artifacts` (files with issues), `missing` (specific things to add/fix).

- ```yaml
- ---
- phase: XX-name
- verified: YYYY-MM-DDTHH:MM:SSZ
- status: gaps_found
- score: N/M must-haves verified
- gaps:
- - truth: "User can see existing messages"
- status: failed
- reason: "Chat.tsx exists but doesn't fetch from API"
- artifacts:
- - path: "src/components/Chat.tsx"
- issue: "No useEffect with fetch call"
- missing:
- - "API call in useEffect to /api/chat"
- - "State for storing fetched messages"
- - "Render messages array in JSX"
- - truth: "User can send a message"
- status: failed
- reason: "Form exists but onSubmit is stub"
- artifacts:
- - path: "src/components/Chat.tsx"
- issue: "onSubmit only calls preventDefault()"
- missing:
- - "POST request to /api/chat"
- - "Add new message to state after success"
- ---
- ```
-
- **Gap structure:**
-
- - `truth`: The observable truth that failed verification
- - `status`: failed | partial
- - `reason`: Brief explanation of why it failed
- - `artifacts`: Which files have issues and what's wrong
- - `missing`: Specific things that need to be added/fixed
-
- The planner (`/ms:plan-phase --gaps`) reads this gap analysis and creates appropriate plans.
-
- **Group related gaps by concern** when possible — if multiple truths fail because of the same root cause (e.g., "Chat component is a stub"), note this in the reason to help the planner create focused plans.
+ **Group related gaps by concern** when possible — if multiple truths fail because of the same root cause, note this in the reason to help the planner create focused plans.

  </verification_process>

@@ -354,8 +275,9 @@ Create `.planning/phases/{phase_dir}/{phase}-VERIFICATION.md` with:
354
275
  ---
355
276
  phase: XX-name
356
277
  verified: YYYY-MM-DDTHH:MM:SSZ
357
- status: passed | gaps_found | human_needed
278
+ status: passed | gaps_found
358
279
  score: N/M must-haves verified
280
+ uncertain: N # Count of UNCERTAIN truths + NEEDS HUMAN requirements (0 if none)
359
281
  re_verification: # Only include if previous VERIFICATION.md existed
360
282
  previous_status: gaps_found
361
283
  previous_score: 2/5
@@ -373,10 +295,6 @@ gaps: # Only include if status: gaps_found
373
295
  missing:
374
296
  - "Specific thing to add/fix"
375
297
  - "Another specific thing"
376
- human_verification: # Only include if status: human_needed
377
- - test: "What to do"
378
- expected: "What should happen"
379
- why_human: "Why can't verify programmatically"
380
298
  ---
381
299
 
382
300
  # Phase {X}: {Name} Verification Report
@@ -418,10 +336,6 @@ human_verification: # Only include if status: human_needed
418
336
  | File | Line | Pattern | Severity | Impact |
419
337
  | ---- | ---- | ------- | -------- | ------ |
420
338
 
421
- ### Human Verification Required
422
-
423
- {Items needing human testing — detailed format for user}
424
-
425
339
  ### Gaps Summary
426
340
 
427
341
  {Narrative summary of what's missing and why}
@@ -441,13 +355,23 @@ Return with:
441
355
  ```markdown
442
356
  ## Verification Complete
443
357
 
444
- **Status:** {passed | gaps_found | human_needed}
358
+ **Status:** {passed | gaps_found}
445
359
  **Score:** {N}/{M} must-haves verified
446
360
  **Report:** .planning/phases/{phase_dir}/{phase}-VERIFICATION.md
447
361
 
448
- {If passed:}
362
+ {If passed AND uncertain == 0:}
449
363
  All must-haves verified. Phase goal achieved. Ready to proceed.
450
364
 
365
+ {If passed AND uncertain > 0:}
366
+ All must-haves verified. Phase goal achieved.
367
+
368
+ ### Items Not Verified Programmatically
369
+
370
+ {N} items could not be confirmed by structural checks alone:
371
+ 1. **{Truth/Requirement}** — {why uncertain}
372
+
373
+ Consider `/ms:verify-work {phase}` to validate these through UAT.
374
+
451
375
  {If gaps_found:}
452
376
 
453
377
  ### Gaps Found
@@ -460,19 +384,6 @@ All must-haves verified. Phase goal achieved. Ready to proceed.
460
384
  - Missing: {what needs to be added}
461
385
 
462
386
  Structured gaps in VERIFICATION.md frontmatter for `/ms:plan-phase --gaps`.
463
-
464
- {If human_needed:}
465
-
466
- ### Human Verification Required
467
-
468
- {N} items need human testing:
469
-
470
- 1. **{Test name}** — {what to do}
471
- - Expected: {what should happen}
472
- 2. **{Test name}** — {what to do}
473
- - Expected: {what should happen}
474
-
475
- Automated checks passed. Awaiting human verification.
476
387
  ```
477
388
 
478
389
  </output>
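
The return template in this hunk now has three branches instead of the earlier `passed | gaps_found | human_needed` split. A minimal sketch of that branching follows, assuming the `status` and `uncertain` values have already been read from the VERIFICATION.md frontmatter. The function name and the branch labels are hypothetical and are not part of mindsystem-cc.

```python
def select_return_branch(status: str, uncertain: int) -> str:
    """Pick which block of the return template applies.

    `status` and `uncertain` mirror the frontmatter keys of the same name.
    The returned labels are illustrative names, not identifiers from the package.
    """
    if uncertain < 0:
        raise ValueError("uncertain must be a non-negative count")

    if status == "gaps_found":
        # Template block: {If gaps_found:}
        return "gaps_found"

    if status == "passed":
        # Template blocks: {If passed AND uncertain == 0:} / {If passed AND uncertain > 0:}
        return "passed_clean" if uncertain == 0 else "passed_with_uncertain"

    # `human_needed` was removed from the status values in this version.
    raise ValueError(f"unsupported status: {status!r}")


if __name__ == "__main__":
    print(select_return_branch("passed", 0))
    print(select_return_branch("passed", 2))
    print(select_return_branch("gaps_found", 0))
```

Under this reading, items that previously produced `human_needed` no longer block the phase: they are counted in `uncertain` and surfaced with the suggestion to run `/ms:verify-work {phase}`.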
@@ -487,8 +398,6 @@ Automated checks passed. Awaiting human verification.
487
398
 
488
399
  **Structure gaps in YAML frontmatter.** The planner (`/ms:plan-phase --gaps`) creates plans from your analysis.
489
400
 
490
- **DO flag for human verification when uncertain.** If you can't verify programmatically (visual, real-time, external service), say so explicitly.
491
-
492
401
  **DO keep verification fast.** Use grep/file checks, not running the app. Goal is structural verification, not functional testing.
493
402
 
494
403
  **DO NOT commit.** Create VERIFICATION.md but leave committing to the orchestrator.
@@ -501,7 +410,6 @@ Automated checks passed. Awaiting human verification.
501
410
  - [ ] Key links verified — not just artifact existence; this is where stubs hide
502
411
  - [ ] Artifacts checked at all three levels (exists → substantive → wired)
503
412
  - [ ] SUMMARY.md claims verified against actual code, not trusted
504
- - [ ] Human verification items identified for what can't be checked programmatically
505
413
  - [ ] Re-verification: focus on previously-failed items, regression-check passed items
506
414
  - [ ] Results returned to orchestrator — NOT committed
507
415
  </success_criteria>
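
The `gaps:` entries that `/ms:plan-phase --gaps` consumes carry five fields: `truth`, `status`, `reason`, `artifacts`, and `missing`. The sketch below checks one already-parsed entry against that field list. It is illustrative only: `validate_gap` is a hypothetical helper, reading the YAML frontmatter itself is left out, and the `path`/`issue` shape of each artifact is assumed from the example removed in this diff.

```python
REQUIRED_FIELDS = ("truth", "status", "reason", "artifacts", "missing")
ALLOWED_STATUS = {"failed", "partial"}


def validate_gap(gap: dict) -> list:
    """Return a list of problems found in one gap entry (empty if none)."""
    problems = []

    for field in REQUIRED_FIELDS:
        if field not in gap:
            problems.append(f"missing field: {field}")

    if "status" in gap and gap["status"] not in ALLOWED_STATUS:
        problems.append(f"invalid status: {gap['status']!r}")

    # Assumption: each artifact is a mapping with `path` and `issue` keys.
    for artifact in gap.get("artifacts", []):
        if not isinstance(artifact, dict) or not {"path", "issue"} <= set(artifact):
            problems.append("artifact needs both path and issue")

    return problems


if __name__ == "__main__":
    # Values taken from the example frontmatter removed in this diff.
    gap = {
        "truth": "User can see existing messages",
        "status": "failed",
        "reason": "Chat.tsx exists but doesn't fetch from API",
        "artifacts": [
            {"path": "src/components/Chat.tsx", "issue": "No useEffect with fetch call"}
        ],
        "missing": [
            "API call in useEffect to /api/chat",
            "State for storing fetched messages",
            "Render messages array in JSX",
        ],
    }
    print(validate_gap(gap))
```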