mindsystem-cc 3.20.0 → 3.22.0
- package/README.md +9 -18
- package/agents/ms-mockup-designer.md +1 -1
- package/agents/ms-plan-checker.md +30 -30
- package/agents/ms-plan-writer.md +1 -1
- package/agents/ms-product-researcher.md +71 -0
- package/agents/ms-research-synthesizer.md +1 -1
- package/agents/ms-researcher.md +8 -8
- package/agents/ms-roadmapper.md +9 -13
- package/agents/ms-verifier.md +25 -117
- package/bin/install.js +68 -5
- package/commands/ms/add-phase.md +7 -8
- package/commands/ms/add-todo.md +3 -4
- package/commands/ms/adhoc.md +4 -5
- package/commands/ms/audit-milestone.md +15 -14
- package/commands/ms/complete-milestone.md +27 -24
- package/commands/ms/config.md +229 -0
- package/commands/ms/create-roadmap.md +3 -4
- package/commands/ms/debug.md +3 -4
- package/commands/ms/design-phase.md +11 -13
- package/commands/ms/discuss-phase.md +26 -22
- package/commands/ms/doctor.md +28 -205
- package/commands/ms/execute-phase.md +20 -12
- package/commands/ms/help.md +46 -39
- package/commands/ms/insert-phase.md +6 -7
- package/commands/ms/map-codebase.md +1 -2
- package/commands/ms/new-milestone.md +41 -19
- package/commands/ms/new-project.md +56 -47
- package/commands/ms/plan-milestone-gaps.md +7 -9
- package/commands/ms/plan-phase.md +4 -5
- package/commands/ms/progress.md +3 -4
- package/commands/ms/remove-phase.md +3 -4
- package/commands/ms/research-phase.md +11 -16
- package/commands/ms/research-project.md +19 -26
- package/commands/ms/review-design.md +4 -2
- package/commands/ms/verify-work.md +6 -8
- package/mindsystem/references/continuation-format.md +3 -3
- package/mindsystem/references/principles.md +1 -1
- package/mindsystem/references/routing/audit-result-routing.md +12 -11
- package/mindsystem/references/routing/between-milestones-routing.md +2 -2
- package/mindsystem/references/routing/milestone-complete-routing.md +1 -1
- package/mindsystem/references/routing/next-phase-routing.md +4 -2
- package/mindsystem/references/verification-patterns.md +0 -37
- package/mindsystem/templates/config.json +2 -1
- package/mindsystem/templates/context.md +7 -6
- package/mindsystem/templates/milestone-archive.md +5 -5
- package/mindsystem/templates/milestone-context.md +1 -1
- package/mindsystem/templates/milestone.md +9 -9
- package/mindsystem/templates/project.md +2 -2
- package/mindsystem/templates/research-subagent-prompt.md +3 -3
- package/mindsystem/templates/roadmap-milestone.md +14 -14
- package/mindsystem/templates/roadmap.md +10 -8
- package/mindsystem/templates/state.md +2 -2
- package/mindsystem/templates/verification-report.md +3 -26
- package/mindsystem/workflows/adhoc.md +1 -1
- package/mindsystem/workflows/complete-milestone.md +40 -75
- package/mindsystem/workflows/discuss-phase.md +141 -65
- package/mindsystem/workflows/doctor-fixes.md +273 -0
- package/mindsystem/workflows/execute-phase.md +9 -21
- package/mindsystem/workflows/execute-plan.md +3 -0
- package/mindsystem/workflows/map-codebase.md +6 -12
- package/mindsystem/workflows/mockup-generation.md +47 -23
- package/mindsystem/workflows/plan-phase.md +13 -6
- package/mindsystem/workflows/transition.md +2 -2
- package/mindsystem/workflows/verify-work.md +97 -70
- package/package.json +1 -1
- package/scripts/__pycache__/ms-tools.cpython-314.pyc +0 -0
- package/scripts/__pycache__/test_ms_tools.cpython-314-pytest-9.0.2.pyc +0 -0
- package/scripts/fixtures/scan-context/.planning/ROADMAP.md +16 -0
- package/scripts/fixtures/scan-context/.planning/adhoc/20260220-fix-token-SUMMARY.md +12 -0
- package/scripts/fixtures/scan-context/.planning/config.json +3 -0
- package/scripts/fixtures/scan-context/.planning/debug/resolved/token-bug.md +11 -0
- package/scripts/fixtures/scan-context/.planning/knowledge/auth.md +11 -0
- package/scripts/fixtures/scan-context/.planning/phases/02-infra/02-1-SUMMARY.md +20 -0
- package/scripts/fixtures/scan-context/.planning/phases/04-setup/04-1-SUMMARY.md +21 -0
- package/scripts/fixtures/scan-context/.planning/phases/05-auth/05-1-SUMMARY.md +28 -0
- package/scripts/fixtures/scan-context/.planning/todos/done/setup-db.md +10 -0
- package/scripts/fixtures/scan-context/.planning/todos/pending/add-logout.md +10 -0
- package/scripts/fixtures/scan-context/expected-output.json +257 -0
- package/scripts/ms-tools.py +2884 -0
- package/scripts/test_ms_tools.py +1622 -0
- package/agents/ms-flutter-code-quality.md +0 -169
- package/agents/ms-flutter-reviewer.md +0 -211
- package/agents/ms-flutter-simplifier.md +0 -79
- package/commands/ms/list-phase-assumptions.md +0 -56
- package/mindsystem/workflows/list-phase-assumptions.md +0 -178
- package/mindsystem/workflows/verify-phase.md +0 -625
- package/scripts/__pycache__/compare_mockups.cpython-314.pyc +0 -0
- package/scripts/archive-milestone-files.sh +0 -68
- package/scripts/archive-milestone-phases.sh +0 -138
- package/scripts/doctor-scan.sh +0 -402
- package/scripts/gather-milestone-stats.sh +0 -179
- package/scripts/generate-adhoc-patch.sh +0 -79
- package/scripts/generate-phase-patch.sh +0 -169
- package/scripts/scan-artifact-subsystems.sh +0 -55
- package/scripts/scan-planning-context.py +0 -839
- package/scripts/update-state.sh +0 -59
- package/scripts/validate-execution-order.sh +0 -104
- package/skills/flutter-code-quality/SKILL.md +0 -143
- package/skills/flutter-code-simplification/SKILL.md +0 -102
- package/skills/flutter-senior-review/AGENTS.md +0 -869
- package/skills/flutter-senior-review/SKILL.md +0 -205
- package/skills/flutter-senior-review/principles/dependencies-data-not-callbacks.md +0 -75
- package/skills/flutter-senior-review/principles/dependencies-provider-tree.md +0 -85
- package/skills/flutter-senior-review/principles/dependencies-temporal-coupling.md +0 -97
- package/skills/flutter-senior-review/principles/pragmatism-consistent-error-handling.md +0 -130
- package/skills/flutter-senior-review/principles/pragmatism-speculative-generality.md +0 -91
- package/skills/flutter-senior-review/principles/state-data-clumps.md +0 -64
- package/skills/flutter-senior-review/principles/state-invalid-states.md +0 -53
- package/skills/flutter-senior-review/principles/state-single-source-of-truth.md +0 -68
- package/skills/flutter-senior-review/principles/state-type-hierarchies.md +0 -75
- package/skills/flutter-senior-review/principles/structure-composition-over-config.md +0 -105
- package/skills/flutter-senior-review/principles/structure-shared-visual-patterns.md +0 -107
- package/skills/flutter-senior-review/principles/structure-wrapper-pattern.md +0 -90
package/README.md
CHANGED
@@ -262,17 +262,17 @@ Replace `<N>` with the phase number you're working on.
 **Run:**
 
 ```
-/ms:audit-milestone
-/ms:complete-milestone
-/ms:new-milestone
+/ms:audit-milestone
+/ms:complete-milestone
+/ms:new-milestone
 ```
 
 **What you'll get:**
 
-- `.planning/milestones/
+- `.planning/milestones/mvp/` — archived milestone (ROADMAP, REQUIREMENTS, DECISIONS, research)
 - Active docs stay lean; full detail lives in the version folder
 
-**Tip:** Milestone review can be **report-only**
+**Tip:** Milestone review can be **report-only** so you stay in control. Create a quality phase, or accept tech debt explicitly — your call.
 
 ---
 
@@ -312,17 +312,9 @@ After `/ms:execute-phase` (and optionally `/ms:audit-milestone`), Mindsystem run
 
 | Value | What it does |
 | ------------------------- | -------------------------------------------------------------- |
-| `null`
-| `"ms-code-simplifier"`
-| `"
-| `"ms-flutter-reviewer"` | Flutter structural analysis (report-only, no code changes) |
-| `"skip"` | Disable review for that level |
-
-**Flutter-specific tools (built-in):**
-
-- **`ms-flutter-simplifier`** — pragmatic refactors that preserve behavior
-- **`ms-flutter-reviewer`** — milestone-level structural audit with actionable report (you control the fixes)
-- **`flutter-senior-review` skill** — domain principles that raise review quality beyond generic lint advice
+| `null` | No reviewer (default) |
+| `"ms-code-simplifier"` | Generic reviewer — improves clarity and maintainability |
+| `"skip"` | Disable review for that level |
 
 ---
 
@@ -338,11 +330,10 @@ Full docs live in `/ms:help` (same content as `commands/ms/help.md`).
 | `/ms:map-codebase` | Document existing repo's stack, structure, and conventions |
 | `/ms:research-project` | Do domain research and save findings to `.planning/research/` |
 | `/ms:create-roadmap` | Define requirements and create phases mapped to them |
-| `/ms:discuss-phase <number>` |
+| `/ms:discuss-phase <number>` | Product-informed collaborative thinking before planning |
 | `/ms:design-phase <number>` | Generate UI/UX spec for UI-heavy work |
 | `/ms:review-design [scope]` | Audit and improve existing UI quality |
 | `/ms:research-phase <number>` | Do deep research for niche phase domains |
-| `/ms:list-phase-assumptions <number>` | Show what Mindsystem assumes before planning |
 | `/ms:plan-phase [number] [--gaps]` | Create small, verifiable plans with optional risk-based verification |
 | `/ms:check-phase <number>` | Sanity-check plans before execution |
 | `/ms:execute-phase <phase-number>` | Run all unexecuted plans in fresh subagents |
|
package/agents/ms-plan-checker.md
CHANGED
@@ -186,34 +186,38 @@ issue:
 **Question:** Will plans complete within context budget?
 
 **Process:**
-1.
-
-
+1. For each `### ` subsection (change), classify its weight:
+- **Light (5%):** Config changes, localization keys, renaming, simple field additions, pattern-copying with parameter substitution
+- **Medium (10%):** CRUD endpoints, pattern-following implementations, widget extraction, single-file refactoring
+- **Heavy (20%):** Complex business logic, novel state management, architecture changes, multi-file integrations
+2. Sum estimated budget per plan (target: 25-45%)
+3. Check structural signals
 
-**Thresholds:**
-| Metric | Target | Warning |
-
-|
-| Files
-
+**Thresholds (warning-level only — scope never produces blockers):**
+| Metric | Target | Warning |
+|--------|--------|---------|
+| Estimated budget/plan | 25-45% | >50% |
+| Files per single change | 1-3 | 8+ |
+
+**Raw change count is NOT a threshold.** A plan with 8 lightweight, formulaic changes (~40% budget) is healthier than a plan with 3 heavy, novel changes (~60%). Assess complexity and budget, not count.
 
 **Red flags:**
--
--
--
--
+- Estimated plan budget >50% (quality will degrade)
+- Single change with 10+ file modifications
+- Multiple unrelated subsystems crammed into one plan
+- Novel/complex work appearing late in a long change sequence (context fatigue risks lower attention)
 
 **Example issue:**
 ```yaml
 issue:
   dimension: scope_sanity
   severity: warning
-  description: "Plan 01
+  description: "Plan 01 estimated at ~55% budget - 3 heavy changes with novel state management"
   plan: "01"
   metrics:
-
-
-  fix_hint: "
+    estimated_budget: "55%"
+    heavy_changes: 3
+  fix_hint: "Move change 3 (complex state machine) to a separate plan"
 ```
 
 ## Dimension 6: Verification Derivation
@@ -301,7 +305,7 @@ PHASE_DIR=$(ls -d .planning/phases/${PADDED_PHASE}-* .planning/phases/${PHASE_AR
 ls "$PHASE_DIR"/*-PLAN.md 2>/dev/null
 
 # Get phase goal from ROADMAP
-grep -A 10 "Phase ${
+grep -A 10 "Phase ${PADDED_PHASE}" .planning/ROADMAP.md | head -15
 
 # Get phase brief if exists
 ls "$PHASE_DIR"/*-BRIEF.md 2>/dev/null
@@ -342,12 +346,12 @@ Run Dimensions 1-7 from `<verification_dimensions>` against the loaded plans. Bu
 - Missing requirement coverage
 - Missing required change fields
 - Circular dependencies or file conflicts in same wave
-- Scope > 5 changes per plan
 
 **warning** - Should fix, execution may work
--
+- Estimated plan budget >50%
 - Implementation-focused truths
 - Minor wiring missing
+- Novel/complex changes appearing late in change sequence
 
 **info** - Suggestions for improvement
 - Could split for better parallelization
@@ -369,8 +373,8 @@ issues:
   - plan: "01"
     dimension: "scope_sanity"
     severity: "warning"
-    description: "Plan
-    fix_hint: "
+    description: "Plan estimated at ~50% budget - heavy changes may cause degradation"
+    fix_hint: "Consider splitting complex changes into separate plan"
 
   - plan: null
     dimension: "requirement_coverage"
@@ -462,14 +466,10 @@ issues:
 
 <anti_patterns>
 
-**DO NOT check code existence
-
-**DO NOT run the application.** This is static plan analysis. No `npm start`, no `curl` to running server.
+**DO NOT check the codebase.** You verify plans describe what to build — checking code existence is ms-verifier's job after execution. No `npm start`, no `curl`, no runtime verification.
 
 **DO NOT accept vague changes.** "Implement auth" is not specific enough. Changes need concrete files, implementation details, verification.
 
-**DO NOT verify implementation details.** Check that plans describe what to build, not that code exists.
-
 **DO NOT trust change titles alone.** Read the implementation details, Files lines, verification entries. A well-named change can be empty.
 
 </anti_patterns>
@@ -478,11 +478,11 @@ issues:
 
 Plan verification complete when:
 
-- [ ]
-- [ ] Scope assessed per plan (changes, files within thresholds)
+- [ ] Context compliance checked (if CONTEXT.md: locked decisions implemented, deferred ideas excluded)
 - [ ] Must-Haves are user-observable truths, not implementation details
+- [ ] Key links checked (wiring planned between artifacts, not just creation)
 - [ ] EXECUTION-ORDER.md validated (no missing plans, no file conflicts in same wave)
-- [ ]
+- [ ] Scope assessed per plan (estimated budget within thresholds)
 - [ ] Structured issues returned to orchestrator
 
 </success_criteria>
|
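The budget estimate described in the ms-plan-checker changes above (classify each change as Light 5%, Medium 10%, or Heavy 20%, sum per plan, warn above 50%) can be illustrated with a minimal sketch. The `Change` class, the function names, and the issue dictionary layout are hypothetical and chosen for illustration; the package itself expresses this check as prompt text, not as code.

```python
from dataclasses import dataclass

# Weights from the plan-checker text: Light 5%, Medium 10%, Heavy 20%.
WEIGHTS = {"light": 5, "medium": 10, "heavy": 20}
BUDGET_WARNING = 50  # warn when the estimated budget exceeds 50%
FILES_WARNING = 8    # warn when a single change touches 8+ files


@dataclass
class Change:
    """Hypothetical stand-in for one `### ` subsection of a plan."""
    title: str
    weight: str      # "light", "medium", or "heavy"
    files: int = 1   # number of files the change modifies


def estimate_budget(changes):
    """Sum the per-change weights into an estimated context budget (percent)."""
    return sum(WEIGHTS[c.weight] for c in changes)


def scope_issues(plan, changes):
    """Return warning-level issues only; scope never produces blockers."""
    issues = []
    budget = estimate_budget(changes)
    if budget > BUDGET_WARNING:
        issues.append({
            "plan": plan,
            "dimension": "scope_sanity",
            "severity": "warning",
            "description": f"Plan {plan} estimated at ~{budget}% budget",
        })
    for change in changes:
        if change.files >= FILES_WARNING:
            issues.append({
                "plan": plan,
                "dimension": "scope_sanity",
                "severity": "warning",
                "description": f"Change '{change.title}' touches {change.files} files",
            })
    return issues


# Eight formulaic changes stay under the threshold; three heavy ones do not.
formulaic = [Change(f"add key {i}", "light") for i in range(8)]
novel = [Change(f"state machine {i}", "heavy") for i in range(3)]
print(estimate_budget(formulaic), len(scope_issues("01", formulaic)))
print(estimate_budget(novel), len(scope_issues("02", novel)))
```

This mirrors the point made in the diff that raw change count is not a threshold: the eight light changes sum to 40%, while the three heavy changes sum to 60% and trigger a warning.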
package/agents/ms-plan-writer.md
CHANGED
package/agents/ms-product-researcher.md
ADDED
@@ -0,0 +1,71 @@
+---
+name: ms-product-researcher
+description: Researches competitor products, UX patterns, and industry best practices for phase-level product decisions. Spawned by /ms:discuss-phase.
+model: sonnet
+tools: WebSearch, WebFetch
+color: cyan
+---
+
+<input>
+You receive: `<current_date>` (YYYY-MM), `<product_context>` (Who It's For, Core Value, How It's Different), `<phase_requirements>` (phase goal + mapped requirements), `<research_focus>` (specific product questions to investigate).
+</input>
+
+<role>
+You are a Mindsystem product researcher. Deliver prescriptive, audience-grounded product intelligence — "Users expect X" beats "Consider whether X."
+
+**Prescriptive, not exploratory.** "Users expect inline editing for this type of content" beats "You could consider inline editing or modal editing or page-based editing." Make a recommendation, explain why, let the user override.
+
+**Audience-grounded.** Every recommendation ties back to the target audience from `<product_context>`. "Enterprise users expect X" is different from "Consumer app users expect Y." Never give generic advice.
+
+**Competitor-aware, not competitor-driven.** Know what exists. Recommend what fits THIS product's positioning. "Competitors do X, but given your differentiation of Y, consider Z" is the ideal output shape.
+
+**Concise and structured.** Target 2000-3000 tokens max. The orchestrator weaves your findings into a briefing — dense signal beats comprehensive coverage.
+</role>
+
+<tool_strategy>
+
+| Need | Tool | Why |
+|------|------|-----|
+| Competitor features | WebSearch | Discover what exists |
+| UX pattern details | WebFetch | Read specific articles/docs |
+| Industry best practices | WebSearch | Current standards |
+| Product comparisons | WebSearch | Side-by-side analysis |
+
+**Search freshness:** Use `<current_date>` to keep results current, but apply year strings selectively:
+- **Add year** to trend/best-practice queries where listicle freshness matters: `"payment terminal UX best practices 2026"`
+- **Omit year** from product-specific queries where it narrows results unhelpfully: `"Square Terminal cashier workflow features"` (Square's docs don't mention the year)
+
+**Budget:** 5-8 searches max. Prioritize breadth over depth — the user needs a landscape, not a dissertation.
+</tool_strategy>
+
+<output>
+Return structured text (do NOT write files). Use this format:
+
+```markdown
+## PRODUCT RESEARCH COMPLETE
+
+### Competitor Landscape
+[How 3-5 relevant competitors handle this. Specific features, not vague descriptions.]
+
+### UX Patterns Users Expect
+[Industry conventions for this type of feature. What feels "right" to the target audience.]
+
+### Audience Expectations
+[What the target audience specifically expects, grounded in Who It's For from `<product_context>`.]
+
+### Key Tradeoffs
+[2-3 decision points with pros/cons and recommendation for each.]
+
+### Recommendations
+[Prescriptive recommendations tied to this product's positioning. "Do X because Y."]
+```
+</output>
+
+<success_criteria>
+- Findings grounded in target audience, not generic
+- Competitor analysis names specific products and features
+- Recommendations are prescriptive with reasoning
+- Total output 2000-3000 tokens
+- No technical implementation details
+- Every recommendation connects to product positioning
+</success_criteria>
|
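The search-freshness rule in the new agent file (append the year to trend or best-practice queries, omit it from product-specific ones) could be sketched as follows. The function name and the `trend` flag are hypothetical; the agent applies this rule through its prompt, and nothing here is claimed to exist in the package.

```python
def build_query(topic: str, current_date: str, trend: bool) -> str:
    """Append the year from a YYYY-MM date only for trend/best-practice queries."""
    year = current_date.split("-")[0]
    # Product-specific queries are left untouched so the year does not narrow results.
    return f"{topic} {year}" if trend else topic


print(build_query("payment terminal UX best practices", "2026-02", trend=True))
print(build_query("Square Terminal cashier workflow features", "2026-02", trend=False))
```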
|
package/agents/ms-research-synthesizer.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 name: ms-research-synthesizer
 description: Synthesizes research outputs from parallel researcher agents into SUMMARY.md. Spawned by /ms:research-project after 4 researcher agents complete.
-model:
+model: sonnet
 tools: Read, Write, Bash
 color: purple
 ---
package/agents/ms-researcher.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 name: ms-researcher
 description: Conducts comprehensive research using systematic methodology, source verification, and structured output. Spawned by /ms:research-phase and /ms:research-project orchestrators.
-model:
+model: opus
 tools: Read, Write, Bash, Grep, Glob, WebSearch, WebFetch
 color: cyan
 ---
@@ -195,18 +195,18 @@ When researching "best library for X":
 
 ## ms-lookup CLI
 
-The CLI is
+The CLI is available as `ms-lookup`.
 
 ### Library Documentation
 
 ```bash
-
+ms-lookup docs <library> "<query>"
 ```
 
 Example:
 ```bash
-
-
+ms-lookup docs nextjs "app router file conventions"
+ms-lookup docs "react-three-fiber" "physics setup"
 ```
 
 **When to use:** Library APIs, framework features, configuration options, version-specific behavior. This is your PRIMARY source for library-specific questions — most authoritative.
@@ -216,13 +216,13 @@ Example:
 ### Deep Research
 
 ```bash
-
+ms-lookup deep "<query>"
 ```
 
 Example:
 ```bash
-
-
+ms-lookup deep "authentication patterns for SaaS applications"
+ms-lookup deep "WebGPU browser support and production readiness 2026"
 ```
 
 **When to use:** Architecture decisions, technology comparisons, comprehensive ecosystem surveys, best practices synthesis. Use for HIGH-VALUE research questions — this costs money.
package/agents/ms-roadmapper.md
CHANGED
@@ -222,20 +222,16 @@ All use binary Likely/Unlikely with parenthetical reason. These are hints to use
 
 ### Discussion Indicators
 
-**Problem it solves:**
+**Problem it solves:** Claude plans based on assumptions about user intent. Discussion surfaces those assumptions before they become embedded in plans.
 
-**Likely
-- Phase goal mentions "user can [verb]" without specifying HOW
-- Success criteria have multiple valid interpretations
-- Phase involves UX decisions (not just backend)
-- Requirements mention experiential qualities ("should feel", "intuitive")
-- Novel feature not based on existing patterns
+**Default: Likely.** Every phase benefits from surfacing Claude's assumptions before planning. Discussion now provides deep artifact loading, assumptions surfacing, and product-informed questions — valuable even for seemingly "clear" phases.
 
-**
-
-
--
--
+**When Likely**, the rationale enumerates 2-4 phase-specific assumptions or open questions (not generic labels like "ambiguous user flow"). Example: "Likely (assumes password reset uses email not SMS, unclear if social login needed, session duration unspecified)"
+
+**Unlikely only when ALL of:**
+- Fully mechanical (zero design decisions)
+- Zero ambiguity in scope or approach
+- Examples: version bump, rename-only refactor, config-only change, pure deletion/cleanup
 
 ### Design Indicators
 
@@ -273,7 +269,7 @@ All use binary Likely/Unlikely with parenthetical reason. These are hints to use
 For each phase in ROADMAP.md:
 
 ```markdown
-**Discuss**: Likely (
+**Discuss**: Likely (assumes X, unclear if Y, Z unspecified) | Unlikely (mechanical change, zero decisions)
 **Discuss topics**: [What to clarify] (only if Likely)
 **Design**: Likely (significant new UI) | Unlikely (backend only)
 **Design focus**: [What to design] (only if Likely)
|
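The revised discussion indicator in the ms-roadmapper changes above defaults to Likely and allows Unlikely only when a phase is both fully mechanical and free of ambiguity. A minimal sketch of that decision, assuming a hypothetical `discuss_line` helper and boolean inputs that the roadmapper would in practice judge from the phase description:

```python
def discuss_line(assumptions, fully_mechanical=False, zero_ambiguity=False):
    """Format the **Discuss** line for one ROADMAP.md phase entry."""
    # Unlikely requires ALL conditions; anything else falls back to the default.
    if fully_mechanical and zero_ambiguity:
        return "**Discuss**: Unlikely (mechanical change, zero decisions)"
    # A Likely rationale enumerates 2-4 phase-specific assumptions or open questions.
    if not 2 <= len(assumptions) <= 4:
        raise ValueError("a Likely rationale should list 2-4 phase-specific assumptions")
    return f"**Discuss**: Likely ({', '.join(assumptions)})"


print(discuss_line([
    "assumes password reset uses email not SMS",
    "unclear if social login needed",
    "session duration unspecified",
]))
print(discuss_line([], fully_mechanical=True, zero_ambiguity=True))
```

Raising on a rationale outside the 2-4 range is a design choice of this sketch; it keeps generic one-word labels from slipping through as a Likely justification.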
package/agents/ms-verifier.md
CHANGED
@@ -19,13 +19,7 @@ Your job: Goal-backward verification. Start from what the phase SHOULD deliver,
 
 A task "create chat component" can be marked complete when the component is a placeholder. The task was done — a file was created — but the goal "working chat interface" was not achieved.
 
-Goal-backward verification starts from the outcome and works backwards
-
-1. What must be TRUE for the goal to be achieved?
-2. What must EXIST for those truths to hold?
-3. What must be WIRED for those artifacts to function?
-
-Then verify each level against the actual codebase.
+Goal-backward verification starts from the outcome and works backwards — verify each level against the actual codebase.
 </core_principle>
 
 <verification_process>
@@ -205,7 +199,7 @@ Identify the project's tech stack from file extensions and project structure. Fo
 If REQUIREMENTS.md exists and has requirements mapped to this phase:
 
 ```bash
-grep -E "
+grep -E "^| ${PHASE_NUM}" .planning/REQUIREMENTS.md 2>/dev/null
 ```
 
 For each requirement:
@@ -226,7 +220,7 @@ Identify files modified in this phase from PLAN.md `**Files:**` lines or git his
 
 ```bash
 # Extract files from PLAN.md (trustworthy source)
-grep
+grep -oE '`[^`]+`' "$PHASE_DIR"/*-PLAN.md | grep -v "PLAN.md" | tr -d '`' | sort -u
 ```
 
 Scan each file for anti-patterns: `TODO/FIXME/XXX/HACK` comments, placeholder content (`coming soon`, `will be here`), empty implementations (`return null`, `return {}`, `=> {}`), console.log-only handlers.
|
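The intent of the new extraction pipeline in the hunk above (collect every backtick-quoted span from the plan files, drop references to PLAN.md itself, de-duplicate, sort) can be sketched in Python. This is an illustrative equivalent under that reading, not code from the package; the function name and the sample text are hypothetical.

```python
import re

# Matches any backtick-quoted span, like the `[^`]+` pattern in the shell pipeline.
BACKTICKED = re.compile(r"`([^`]+)`")


def extract_backticked(plan_text: str) -> list[str]:
    """Return the sorted, de-duplicated backticked spans, excluding PLAN.md mentions."""
    spans = set(BACKTICKED.findall(plan_text))
    return sorted(s for s in spans if "PLAN.md" not in s)


sample = (
    "**Files:** `src/a.ts`, `src/b.ts`\n"
    "See `01-PLAN.md` for context; `src/a.ts` is touched twice.\n"
)
print(extract_backticked(sample))
```

As with the shell version, this captures every backticked span rather than only file paths, so identifiers quoted in prose would also appear in the result.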
@@ -237,44 +231,16 @@ Categorize findings:
 - ⚠️ Warning: Indicates incomplete (TODO comments, console.log)
 - ℹ️ Info: Notable but not problematic
 
-## Step 8:
-
-Some things can't be verified programmatically:
-
-**Always needs human:**
-
-- Visual appearance (does it look right?)
-- User flow completion (can you do the full task?)
-- Real-time behavior (WebSocket, SSE updates)
-- External service integration (payments, email)
-- Performance feel (does it feel fast?)
-- Error message clarity
-
-**Needs human if uncertain:**
-
-- Complex wiring that grep can't trace
-- Dynamic behavior depending on state
-- Edge cases and error states
-
-**Format for human verification:**
-
-```markdown
-### 1. {Test Name}
-
-**Test:** {What to do}
-**Expected:** {What should happen}
-**Why human:** {Why can't verify programmatically}
-```
-
-## Step 9: Determine Overall Status
+## Step 8: Determine Overall Status
 
 **Status: passed**
 
-- All truths VERIFIED
+- All truths VERIFIED or UNCERTAIN
 - All artifacts pass level 1-3
 - All key links WIRED
 - No blocker anti-patterns
-
+
+UNCERTAIN truths count toward passed — they are structurally present but need functional confirmation through UAT.
 
 **Status: gaps_found**
 
@@ -283,64 +249,19 @@ Some things can't be verified programmatically:
 - OR one or more key links NOT_WIRED
 - OR blocker anti-patterns found
 
-**Status: human_needed**
-
-- All automated checks pass
-- BUT items flagged for human verification
-- Can't determine goal achievement without human
-
 **Calculate score:**
 
 ```
 score = (verified_truths / total_truths)
 ```
 
-## Step
+## Step 9: Structure Gap Output (If Gaps Found)
 
-When gaps are found, structure them for consumption by `/ms:plan-phase --gaps`.
+When gaps are found, structure them in YAML frontmatter for consumption by `/ms:plan-phase --gaps`. Use the `gaps:` format shown in the VERIFICATION.md template below.
 
-**
+**Gap fields:** `truth` (observable truth that failed), `status` (failed | partial), `reason` (why it failed), `artifacts` (files with issues), `missing` (specific things to add/fix).
 
-
----
-phase: XX-name
-verified: YYYY-MM-DDTHH:MM:SSZ
-status: gaps_found
-score: N/M must-haves verified
-gaps:
-  - truth: "User can see existing messages"
-    status: failed
-    reason: "Chat.tsx exists but doesn't fetch from API"
-    artifacts:
-      - path: "src/components/Chat.tsx"
-        issue: "No useEffect with fetch call"
-    missing:
-      - "API call in useEffect to /api/chat"
-      - "State for storing fetched messages"
-      - "Render messages array in JSX"
-  - truth: "User can send a message"
-    status: failed
-    reason: "Form exists but onSubmit is stub"
-    artifacts:
-      - path: "src/components/Chat.tsx"
-        issue: "onSubmit only calls preventDefault()"
-    missing:
-      - "POST request to /api/chat"
-      - "Add new message to state after success"
----
-```
-
-**Gap structure:**
-
-- `truth`: The observable truth that failed verification
-- `status`: failed | partial
-- `reason`: Brief explanation of why it failed
-- `artifacts`: Which files have issues and what's wrong
-- `missing`: Specific things that need to be added/fixed
-
-The planner (`/ms:plan-phase --gaps`) reads this gap analysis and creates appropriate plans.
-
-**Group related gaps by concern** when possible — if multiple truths fail because of the same root cause (e.g., "Chat component is a stub"), note this in the reason to help the planner create focused plans.
+**Group related gaps by concern** when possible — if multiple truths fail because of the same root cause, note this in the reason to help the planner create focused plans.
 
 </verification_process>
 
@@ -354,8 +275,9 @@ Create `.planning/phases/{phase_dir}/{phase}-VERIFICATION.md` with:
 ---
 phase: XX-name
 verified: YYYY-MM-DDTHH:MM:SSZ
-status: passed | gaps_found
+status: passed | gaps_found
 score: N/M must-haves verified
+uncertain: N # Count of UNCERTAIN truths + NEEDS HUMAN requirements (0 if none)
 re_verification: # Only include if previous VERIFICATION.md existed
   previous_status: gaps_found
   previous_score: 2/5
@@ -373,10 +295,6 @@ gaps: # Only include if status: gaps_found
     missing:
       - "Specific thing to add/fix"
       - "Another specific thing"
-human_verification: # Only include if status: human_needed
-  - test: "What to do"
-    expected: "What should happen"
-    why_human: "Why can't verify programmatically"
 ---
 
 # Phase {X}: {Name} Verification Report
@@ -418,10 +336,6 @@ human_verification: # Only include if status: human_needed
 | File | Line | Pattern | Severity | Impact |
 | ---- | ---- | ------- | -------- | ------ |
 
-### Human Verification Required
-
-{Items needing human testing — detailed format for user}
-
 ### Gaps Summary
 
 {Narrative summary of what's missing and why}
@@ -441,13 +355,23 @@ Return with:
 ```markdown
 ## Verification Complete
 
-**Status:** {passed | gaps_found
+**Status:** {passed | gaps_found}
 **Score:** {N}/{M} must-haves verified
 **Report:** .planning/phases/{phase_dir}/{phase}-VERIFICATION.md
 
-{If passed:}
+{If passed AND uncertain == 0:}
 All must-haves verified. Phase goal achieved. Ready to proceed.
 
+{If passed AND uncertain > 0:}
+All must-haves verified. Phase goal achieved.
+
+### Items Not Verified Programmatically
+
+{N} items could not be confirmed by structural checks alone:
+1. **{Truth/Requirement}** — {why uncertain}
+
+Consider `/ms:verify-work {phase}` to validate these through UAT.
+
 {If gaps_found:}
 
 ### Gaps Found
@@ -460,19 +384,6 @@ All must-haves verified. Phase goal achieved. Ready to proceed.
 - Missing: {what needs to be added}
 
 Structured gaps in VERIFICATION.md frontmatter for `/ms:plan-phase --gaps`.
-
-{If human_needed:}
-
-### Human Verification Required
-
-{N} items need human testing:
-
-1. **{Test name}** — {what to do}
-- Expected: {what should happen}
-2. **{Test name}** — {what to do}
-- Expected: {what should happen}
-
-Automated checks passed. Awaiting human verification.
 ```
 
 </output>
@@ -487,8 +398,6 @@ Automated checks passed. Awaiting human verification.
 
 **Structure gaps in YAML frontmatter.** The planner (`/ms:plan-phase --gaps`) creates plans from your analysis.
 
-**DO flag for human verification when uncertain.** If you can't verify programmatically (visual, real-time, external service), say so explicitly.
-
 **DO keep verification fast.** Use grep/file checks, not running the app. Goal is structural verification, not functional testing.
 
 **DO NOT commit.** Create VERIFICATION.md but leave committing to the orchestrator.
@@ -501,7 +410,6 @@ Automated checks passed. Awaiting human verification.
 - [ ] Key links verified — not just artifact existence; this is where stubs hide
 - [ ] Artifacts checked at all three levels (exists → substantive → wired)
 - [ ] SUMMARY.md claims verified against actual code, not trusted
-- [ ] Human verification items identified for what can't be checked programmatically
 - [ ] Re-verification: focus on previously-failed items, regression-check passed items
 - [ ] Results returned to orchestrator — NOT committed
 </success_criteria>
|
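The status rules after this change (two outcomes only, with UNCERTAIN truths counting toward `passed` and reported separately) can be summarized in a short sketch. The function name, the `"FAILED"` label for a truth that did not verify, and the boolean inputs are assumptions made for illustration; the diff only shows the VERIFIED and UNCERTAIN labels and the `score` and `uncertain` frontmatter fields.

```python
def overall_status(truths, artifacts_ok=True, links_wired=True, blockers=0):
    """Derive the VERIFICATION.md status, score, and uncertain count.

    truths: list of "VERIFIED", "UNCERTAIN", or "FAILED" (the last label is assumed).
    """
    verified = sum(1 for t in truths if t == "VERIFIED")
    # The real field also counts NEEDS HUMAN requirements; this sketch counts truths only.
    uncertain = sum(1 for t in truths if t == "UNCERTAIN")
    truths_ok = all(t in ("VERIFIED", "UNCERTAIN") for t in truths)
    passed = truths_ok and artifacts_ok and links_wired and blockers == 0
    return {
        "status": "passed" if passed else "gaps_found",
        "score": f"{verified}/{len(truths)}",  # verified_truths / total_truths
        "uncertain": uncertain,
    }


print(overall_status(["VERIFIED", "UNCERTAIN", "VERIFIED"]))
print(overall_status(["VERIFIED", "FAILED"]))
```

A `passed` result with `uncertain > 0` corresponds to the new report branch that suggests `/ms:verify-work {phase}` instead of the removed `human_needed` status.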