mindsystem-cc 3.11.0 → 3.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (35)
  1. package/agents/ms-consolidator.md +4 -4
  2. package/agents/ms-executor.md +19 -351
  3. package/agents/ms-flutter-code-quality.md +7 -6
  4. package/agents/ms-plan-checker.md +170 -175
  5. package/agents/ms-plan-writer.md +121 -125
  6. package/agents/ms-roadmapper.md +1 -18
  7. package/agents/ms-verifier.md +22 -18
  8. package/commands/ms/check-phase.md +3 -3
  9. package/commands/ms/design-phase.md +2 -9
  10. package/commands/ms/execute-phase.md +8 -6
  11. package/commands/ms/help.md +0 -5
  12. package/commands/ms/new-project.md +3 -40
  13. package/commands/ms/plan-phase.md +4 -3
  14. package/commands/ms/review-design.md +1 -8
  15. package/mindsystem/references/goal-backward.md +10 -25
  16. package/mindsystem/references/plan-format.md +326 -247
  17. package/mindsystem/references/scope-estimation.md +29 -57
  18. package/mindsystem/references/tdd-execution.md +70 -0
  19. package/mindsystem/references/tdd.md +53 -194
  20. package/mindsystem/templates/config.json +0 -11
  21. package/mindsystem/templates/phase-prompt.md +51 -367
  22. package/mindsystem/templates/roadmap.md +2 -2
  23. package/mindsystem/templates/verification-report.md +2 -2
  24. package/mindsystem/workflows/adhoc.md +16 -21
  25. package/mindsystem/workflows/execute-phase.md +71 -50
  26. package/mindsystem/workflows/execute-plan.md +183 -1060
  27. package/mindsystem/workflows/mockup-generation.md +10 -4
  28. package/mindsystem/workflows/plan-phase.md +56 -75
  29. package/mindsystem/workflows/transition.md +1 -10
  30. package/mindsystem/workflows/verify-phase.md +16 -20
  31. package/package.json +1 -1
  32. package/scripts/update-state.sh +59 -0
  33. package/scripts/validate-execution-order.sh +102 -0
  34. package/skills/flutter-code-quality/SKILL.md +4 -3
  35. package/mindsystem/templates/summary.md +0 -293
@@ -32,10 +32,10 @@ Why 50% not 80%?
32
32
  | Task Complexity | Tasks/Plan | Context/Task | Total |
33
33
  |-----------------|------------|--------------|-------|
34
34
  | Simple (CRUD, config) | 3 | ~10-15% | ~30-45% |
35
- | Complex (auth, payments) | 2 | ~20-30% | ~40-50% |
35
+ | Complex (auth, payments) | 2-3 | ~15-25% | ~40-50% |
36
36
  | Very complex (migrations, refactors) | 1-2 | ~30-40% | ~30-50% |
37
37
 
38
- **When in doubt: Default to 2 tasks.** Better to have an extra plan than degraded quality.
38
+ **Default to 3 tasks for simple-medium work, 2 for complex.** Executor overhead reduction creates headroom for the third task.
39
39
  </task_rule>
40
40
 
41
41
  <tdd_plans>
@@ -108,24 +108,23 @@ Plan 03: Visualization components
108
108
  </splitting_strategies>
109
109
 
110
110
  <dependency_awareness>
111
- **Plans declare dependencies explicitly via frontmatter.**
111
+ **Dependencies centralized in EXECUTION-ORDER.md.**
112
112
 
113
- ```yaml
114
- # Independent plan (Wave 1 candidate)
115
- depends_on: []
116
- files_modified: [src/features/user/model.ts, src/features/user/api.ts]
117
-
118
-
119
- # Dependent plan (later wave)
120
- depends_on: ["03-01"]
121
- files_modified: [src/integration/stripe.ts]
113
+ ```markdown
114
+ ## Wave 1 (parallel)
115
+ - 03-01-PLAN.md — User feature
116
+ - 03-02-PLAN.md — Product feature
122
117
 
118
+ ## Wave 2
119
+ - 03-03-PLAN.md — Integration (after: 01, 02)
123
120
  ```
124
121
 
122
+ Plans declare files in `**Files:**` lines within `## Changes` subsections. EXECUTION-ORDER.md tracks wave groups and dependencies.
123
+
125
124
  **Wave assignment rules:**
126
- - `depends_on: []` + no file conflicts → Wave 1 (parallel)
127
- - `depends_on: ["XX"]` → runs after plan XX completes
128
- - Shared `files_modified` with sibling → sequential (by plan number)
125
+ - No dependencies + no file conflicts with other Wave 1 plans → Wave 1 (parallel)
126
+ - Depends on earlier plan → later wave (runs after dependency completes)
127
+ - Shared files with sibling plan → sequential (by plan number)
129
128
 
130
129
  **SUMMARY references:**
131
130
  - Only reference prior SUMMARY if genuinely needed (imported types, decisions affecting this plan)
@@ -134,17 +133,21 @@ files_modified: [src/integration/stripe.ts]
134
133
  </dependency_awareness>
135
134
 
136
135
  <file_ownership>
137
- **Exclusive file ownership prevents conflicts:**
136
+ **Exclusive file ownership prevents conflicts.**
137
+
138
+ File ownership is determined from `**Files:**` lines in each plan's `## Changes` section and validated in EXECUTION-ORDER.md wave assignments.
138
139
 
139
- ```yaml
140
- # Plan 01 frontmatter
141
- files_modified: [src/models/user.ts, src/api/users.ts, src/components/UserList.tsx]
140
+ ```markdown
141
+ # Plan 01 Changes
142
+ ### 1. Create User model
143
+ **Files:** `src/models/user.ts`, `src/api/users.ts`, `src/components/UserList.tsx`
142
144
 
143
- # Plan 02 frontmatter
144
- files_modified: [src/models/product.ts, src/api/products.ts, src/components/ProductList.tsx]
145
+ # Plan 02 Changes
146
+ ### 1. Create Product model
147
+ **Files:** `src/models/product.ts`, `src/api/products.ts`, `src/components/ProductList.tsx`
145
148
  ```
146
149
 
147
- No overlap → can run parallel.
150
+ No overlap → can run parallel (same wave in EXECUTION-ORDER.md).
148
151
 
149
152
  **If file appears in multiple plans:** Later plan depends on earlier (by plan number).
150
153
  **If file cannot be split:** Plans must be sequential for that file.
@@ -202,39 +205,9 @@ Waves: [01, 02, 03] (all parallel)
202
205
 
203
206
  **2 tasks:** Simple ~30%, Medium ~50%, Complex ~80% (split)
204
207
  **3 tasks:** Simple ~45%, Medium ~75% (risky), Complex 120% (impossible)
205
- </estimating_context>
206
-
207
- <depth_calibration>
208
- **Depth controls compression tolerance, not artificial inflation.**
209
-
210
- | Depth | Typical Phases | Typical Plans/Phase | Tasks/Plan |
211
- |-------|----------------|---------------------|------------|
212
- | Quick | 3-5 | 1-3 | 2-3 |
213
- | Standard | 5-8 | 3-5 | 2-3 |
214
- | Comprehensive | 8-12 | 5-10 | 2-3 |
215
208
 
216
- Tasks/plan is CONSTANT at 2-3. The 50% context rule applies universally.
217
-
218
- **Key principle:** Derive from actual work. Depth determines how aggressively you combine things, not a target to hit.
219
-
220
- - Comprehensive auth = 8 plans (because auth genuinely has 8 concerns)
221
- - Comprehensive "add favicon" = 1 plan (because that's all it is)
222
-
223
- Don't pad small work to hit a number. Don't compress complex work to look efficient.
224
-
225
- **Comprehensive depth example:**
226
- Auth system at comprehensive depth = 8 plans (not 3 big ones):
227
- - 01: DB models (2 tasks)
228
- - 02: Password hashing (2 tasks)
229
- - 03: JWT generation (2 tasks)
230
- - 04: JWT validation middleware (2 tasks)
231
- - 05: Login endpoint (2 tasks)
232
- - 06: Register endpoint (2 tasks)
233
- - 07: Protected route patterns (2 tasks)
234
- - 08: Auth UI components (3 tasks)
235
-
236
- Each plan: fresh context, peak quality. More plans = more thoroughness, same quality per plan.
237
- </depth_calibration>
209
+ **Executor overhead:** ~2,400 tokens (down from ~6,900 in previous versions), freeing ~4,500 tokens per plan for code quality.
210
+ </estimating_context>
238
211
 
239
212
  <summary>
240
213
  **2-3 tasks, 50% context target:**
@@ -246,11 +219,10 @@ Each plan: fresh context, peak quality. More plans = more thoroughness, same qua
246
219
 
247
220
  **The rules:**
248
221
  - If in doubt, split. Quality over consolidation.
249
- - Depth increases plan COUNT, never plan SIZE.
250
222
  - Vertical slices over horizontal layers.
251
- - Explicit dependencies via `depends_on` frontmatter.
223
+ - Dependencies centralized in EXECUTION-ORDER.md.
252
224
  - Autonomous plans get parallel execution.
253
225
 
254
- **Commit rule:** Each plan produces 3-4 commits total (2-3 task commits + 1 docs commit).
226
+ **Commit rule:** Each plan produces 3-4 commits total (2-3 change commits + 1 docs commit).
255
227
  </summary>
256
228
  </scope_estimation>
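
Reviewer sketch (not part of the package): the wave assignment rules in the hunk above can be read as a small algorithm. The Python below is one possible interpretation, not how mindsystem-cc computes waves. `Plan`, `assign_waves`, and the `after` field are hypothetical names, and the sketch assumes every dependency points at a lower-numbered plan.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Hypothetical in-memory view of one PLAN.md file."""
    number: str                                    # e.g. "03-01"
    files: set[str]                                # paths from the **Files:** lines
    after: set[str] = field(default_factory=set)   # plans this one depends on


def assign_waves(plans: list[Plan]) -> dict[str, int]:
    """Return {plan number: wave}, with waves starting at 1."""
    waves: dict[str, int] = {}
    for plan in sorted(plans, key=lambda p: p.number):
        wave = 1  # no dependencies and no file conflicts -> Wave 1
        # A dependent plan lands in a later wave than each dependency.
        # Assumption: dependencies were already placed (lower plan numbers).
        for dep in plan.after:
            wave = max(wave, waves[dep] + 1)
        # Shared files with an already-placed sibling -> sequential by plan number.
        for other in plans:
            if other.number in waves and plan.files & other.files:
                wave = max(wave, waves[other.number] + 1)
        waves[plan.number] = wave
    return waves


if __name__ == "__main__":
    plans = [
        Plan("03-01", {"src/models/user.ts"}),
        Plan("03-02", {"src/models/product.ts"}),
        Plan("03-03", {"src/integration/stripe.ts"}, after={"03-01", "03-02"}),
    ]
    print(assign_waves(plans))  # {'03-01': 1, '03-02': 1, '03-03': 2}
```

The example input mirrors the EXECUTION-ORDER.md snippet in the hunk: two independent plans share Wave 1 and the integration plan follows in Wave 2.
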
@@ -0,0 +1,70 @@
1
+ <tdd_execution>
2
+
3
+ ## RED-GREEN-REFACTOR Cycle
4
+
5
+ Lazy-loaded by executor when plan metadata says `**Type:** tdd`.
6
+
7
+ ### RED — Write failing test
8
+
9
+ 1. Create test file following project conventions
10
+ 2. Write test describing expected behavior (from plan's behavior specification)
11
+ 3. Run test — MUST fail (if passes, feature exists or test is wrong — investigate)
12
+ 4. Commit: `test({phase}-{plan}): add failing test for [feature]`
13
+
14
+ ### GREEN — Implement to pass
15
+
16
+ 1. Write minimal code to make test pass — no cleverness, no optimization
17
+ 2. Run test — MUST pass
18
+ 3. Commit: `feat({phase}-{plan}): implement [feature]`
19
+
20
+ ### REFACTOR (if needed)
21
+
22
+ 1. Clean up implementation if obvious improvements exist
23
+ 2. Run tests — MUST still pass
24
+ 3. Commit only if changes made: `refactor({phase}-{plan}): clean up [feature]`
25
+
26
+ Result: Each TDD plan produces 2-3 atomic commits (test/feat/refactor).
27
+
28
+ ---
29
+
30
+ ## Test Framework Setup
31
+
32
+ When no test framework is configured, set it up as part of the RED phase:
33
+
34
+ | Project | Framework | Install |
35
+ |---------|-----------|---------|
36
+ | Node.js | Jest | `npm install -D jest @types/jest ts-jest` |
37
+ | Node.js (Vite) | Vitest | `npm install -D vitest` |
38
+ | Python | pytest | `pip install pytest` |
39
+ | Go | testing | Built-in |
40
+ | Rust | cargo test | Built-in |
41
+ | Flutter/Dart | flutter_test | Built-in |
42
+
43
+ Detect project type from package.json / requirements.txt / go.mod / pubspec.yaml. Create config if needed. Verify with empty test run.
44
+
45
+ ---
46
+
47
+ ## Commit Pattern
48
+
49
+ TDD plans use dedicated commit types per phase:
50
+
51
+ ```
52
+ test(08-02): add failing test for email validation
53
+ feat(08-02): implement email validation
54
+ refactor(08-02): extract regex to constant # optional
55
+ ```
56
+
57
+ Comparison: Standard plans produce 1 commit per task. TDD plans produce 2-3 commits for single feature.
58
+
59
+ ---
60
+
61
+ ## Error Handling
62
+
63
+ | Situation | Action |
64
+ |-----------|--------|
65
+ | Test doesn't fail in RED | Feature may exist or test is wrong — investigate before proceeding |
66
+ | Test doesn't pass in GREEN | Debug implementation, keep iterating until green |
67
+ | Tests fail in REFACTOR | Undo refactor — commit was premature, refactor in smaller steps |
68
+ | Unrelated tests break | Stop and investigate — may indicate coupling issue |
69
+
70
+ </tdd_execution>
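
Reviewer sketch (not part of the package): the new file replaces the old shell detection script with a single sentence, so the lookup it implies may be worth spelling out. The Python below is a minimal illustration under stated assumptions. `MARKERS` and `detect_framework` are hypothetical names, only the four marker files named in the added text are checked, and the Vitest row is left out because a marker file alone does not distinguish a Vite project.

```python
from pathlib import Path
from typing import Optional, Tuple

# (marker file, (project, framework, install command or None when built in)).
# Values are copied from the table in the hunk above; checked in this order.
MARKERS = [
    ("package.json", ("Node.js", "Jest", "npm install -D jest @types/jest ts-jest")),
    ("requirements.txt", ("Python", "pytest", "pip install pytest")),
    ("go.mod", ("Go", "testing", None)),
    ("pubspec.yaml", ("Flutter/Dart", "flutter_test", None)),
]


def detect_framework(root: Path) -> Optional[Tuple[str, str, Optional[str]]]:
    """Return the first table row whose marker file exists in root, else None."""
    for marker, choice in MARKERS:
        if (root / marker).is_file():
            return choice
    return None
```

A caller would run the install command when it is not `None`, then do the empty test run that the text asks for.
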
@@ -1,12 +1,13 @@
1
- <overview>
1
+ # TDD Reference for Plan Writers
2
+
2
3
  TDD is about design quality, not coverage metrics. The red-green-refactor cycle forces you to think about behavior before implementation, producing cleaner interfaces and more testable code.
3
4
 
4
5
  **Principle:** If you can describe the behavior as `expect(fn(input)).toBe(output)` before writing `fn`, TDD improves the result.
5
6
 
6
- **Key insight:** TDD work is fundamentally heavier than standard tasks—it requires 2-3 execution cycles (RED → GREEN → REFACTOR), each with file reads, test runs, and potential debugging. TDD features get dedicated plans to ensure full context is available throughout the cycle.
7
- </overview>
7
+ **Key insight:** TDD work is fundamentally heavier than standard tasks: it requires 2-3 execution cycles (RED -> GREEN -> REFACTOR), each with file reads, test runs, and potential debugging. TDD features get dedicated plans to ensure full context is available throughout the cycle.
8
+
9
+ ---
8
10
 
9
- <when_to_use_tdd>
10
11
  ## When TDD Improves Quality
11
12
 
12
13
  **TDD candidates (create a TDD plan):**
@@ -18,7 +19,7 @@ TDD is about design quality, not coverage metrics. The red-green-refactor cycle
18
19
  - State machines and workflows
19
20
  - Utility functions with clear specifications
20
21
 
21
- **Skip TDD (use standard plan with `type="auto"` tasks):**
22
+ **Skip TDD (use standard plan):**
22
23
  - UI layout, styling, visual components
23
24
  - Configuration changes
24
25
  - Glue code connecting existing components
@@ -27,92 +28,65 @@ TDD is about design quality, not coverage metrics. The red-green-refactor cycle
27
28
  - Exploratory prototyping
28
29
 
29
30
  **Heuristic:** Can you write `expect(fn(input)).toBe(output)` before writing `fn`?
30
- Yes: Create a TDD plan
31
- No: Use standard plan, add tests after if needed
32
- </when_to_use_tdd>
31
+ - Yes: Create a TDD plan
32
+ - No: Use standard plan, add tests after if needed
33
+
34
+ ---
33
35
 
34
- <tdd_plan_structure>
35
36
  ## TDD Plan Structure
36
37
 
37
- Each TDD plan implements **one feature** through the full RED-GREEN-REFACTOR cycle.
38
+ Each TDD plan implements **one feature** through the full RED-GREEN-REFACTOR cycle. Use the same pure markdown format as all other plans:
38
39
 
39
40
  ```markdown
40
- ---
41
- phase: XX-name
42
- plan: NN
43
- type: tdd
44
- ---
41
+ # Plan NN: Feature name
45
42
 
46
- <objective>
47
- [What feature and why]
48
- Purpose: [Design benefit of TDD for this feature]
49
- Output: [Working, tested feature]
50
- </objective>
51
-
52
- <context>
53
- @.planning/PROJECT.md
54
- @.planning/ROADMAP.md
55
- @relevant/source/files.ts
56
- </context>
57
-
58
- <feature>
59
- <name>[Feature name]</name>
60
- <files>[source file, test file]</files>
61
- <behavior>
62
- [Expected behavior in testable terms]
63
- Cases: input → expected output
64
- </behavior>
65
- <implementation>[How to implement once tests pass]</implementation>
66
- </feature>
67
-
68
- <verification>
69
- [Test command that proves feature works]
70
- </verification>
71
-
72
- <success_criteria>
73
- - Failing test written and committed
74
- - Implementation passes test
75
- - Refactor complete (if needed)
76
- - All 2-3 commits present
77
- </success_criteria>
78
-
79
- <output>
80
- After completion, create SUMMARY.md with:
81
- - RED: What test was written, why it failed
82
- - GREEN: What implementation made it pass
83
- - REFACTOR: What cleanup was done (if any)
84
- - Commits: List of commits produced
85
- </output>
86
- ```
43
+ **Subsystem:** validation | **Type:** tdd
44
+
45
+ ## Context
46
+ Why TDD benefits this feature. Clear inputs/outputs that make test-first
47
+ design valuable. Reference any prior work.
48
+
49
+ ## Changes
50
+
51
+ ### 1. RED — Write failing tests
52
+ **Files:** `src/lib/__tests__/validate-email.test.ts`
53
+
54
+ Test cases:
55
+ - Valid: `user@example.com`, `name+tag@domain.co.uk` -> returns true
56
+ - Invalid: `@domain.com`, `user@`, empty string -> returns false
57
+ - Edge: very long local part (>64 chars) -> returns false
87
58
 
88
- **One feature per TDD plan.** If features are trivial enough to batch, they're trivial enough to skip TDD—use a standard plan and add tests after.
89
- </tdd_plan_structure>
59
+ Import `validateEmail` from `src/lib/validate-email.ts` (does not exist yet).
60
+ Run tests — all must fail with import/function error.
90
61
 
91
- <execution_flow>
92
- ## Red-Green-Refactor Cycle
62
+ ### 2. GREEN — Implement minimal validation
63
+ **Files:** `src/lib/validate-email.ts`
93
64
 
94
- **RED - Write failing test:**
95
- 1. Create test file following project conventions
96
- 2. Write test describing expected behavior (from `<behavior>` element)
97
- 3. Run test - it MUST fail
98
- 4. If test passes: feature exists or test is wrong. Investigate.
99
- 5. Commit: `test({phase}-{plan}): add failing test for [feature]`
65
+ Export `validateEmail(email: string): boolean`. Use regex matching RFC 5322
66
+ simplified pattern. Handle null/undefined input by returning false. No
67
+ optimization, just make tests pass.
100
68
 
101
- **GREEN - Implement to pass:**
102
- 1. Write minimal code to make test pass
103
- 2. No cleverness, no optimization - just make it work
104
- 3. Run test - it MUST pass
105
- 4. Commit: `feat({phase}-{plan}): implement [feature]`
69
+ ### 3. REFACTOR — Extract regex constant
70
+ **Files:** `src/lib/validate-email.ts`
106
71
 
107
- **REFACTOR (if needed):**
108
- 1. Clean up implementation if obvious improvements exist
109
- 2. Run tests - MUST still pass
110
- 3. Only commit if changes made: `refactor({phase}-{plan}): clean up [feature]`
72
+ Extract regex to `EMAIL_REGEX` constant at module level. Add JSDoc with
73
+ examples. Run tests; all must still pass. Only commit if changes improve
74
+ readability.
111
75
 
112
- **Result:** Each TDD plan produces 2-3 atomic commits.
113
- </execution_flow>
76
+ ## Verification
77
+ - `npm test -- --grep "validate-email"` passes all cases
78
+ - Import works from other modules without errors
79
+
80
+ ## Must-Haves
81
+ - [ ] Valid email addresses return true
82
+ - [ ] Invalid email addresses return false
83
+ - [ ] Edge cases (length limits, null input) handled correctly
84
+ ```
85
+
86
+ **One feature per TDD plan.** If features are trivial enough to batch, they're trivial enough to skip TDD — use a standard plan and add tests after.
87
+
88
+ ---
114
89
 
115
- <test_quality>
116
90
  ## Good Tests vs Bad Tests
117
91
 
118
92
  **Test behavior, not implementation:**
@@ -131,123 +105,9 @@ After completion, create SUMMARY.md with:
131
105
  **No implementation details:**
132
106
  - Good: Test public API, observable behavior
133
107
  - Bad: Mock internals, test private methods, assert on internal state
134
- </test_quality>
135
-
136
- <framework_setup>
137
- ## Test Framework Setup (If None Exists)
138
-
139
- When executing a TDD plan but no test framework is configured, set it up as part of the RED phase:
140
-
141
- **1. Detect project type:**
142
- ```bash
143
- # JavaScript/TypeScript
144
- if [ -f package.json ]; then echo "node"; fi
145
-
146
- # Python
147
- if [ -f requirements.txt ] || [ -f pyproject.toml ]; then echo "python"; fi
148
-
149
- # Go
150
- if [ -f go.mod ]; then echo "go"; fi
151
-
152
- # Rust
153
- if [ -f Cargo.toml ]; then echo "rust"; fi
154
- ```
155
-
156
- **2. Install minimal framework:**
157
- | Project | Framework | Install |
158
- |---------|-----------|---------|
159
- | Node.js | Jest | `npm install -D jest @types/jest ts-jest` |
160
- | Node.js (Vite) | Vitest | `npm install -D vitest` |
161
- | Python | pytest | `pip install pytest` |
162
- | Go | testing | Built-in |
163
- | Rust | cargo test | Built-in |
164
-
165
- **3. Create config if needed:**
166
- - Jest: `jest.config.js` with ts-jest preset
167
- - Vitest: `vitest.config.ts` with test globals
168
- - pytest: `pytest.ini` or `pyproject.toml` section
169
-
170
- **4. Verify setup:**
171
- ```bash
172
- # Run empty test suite - should pass with 0 tests
173
- npm test # Node
174
- pytest # Python
175
- go test ./... # Go
176
- cargo test # Rust
177
- ```
178
-
179
- **5. Create first test file:**
180
- Follow project conventions for test location:
181
- - `*.test.ts` / `*.spec.ts` next to source
182
- - `__tests__/` directory
183
- - `tests/` directory at root
184
-
185
- Framework setup is a one-time cost included in the first TDD plan's RED phase.
186
- </framework_setup>
187
-
188
- <error_handling>
189
- ## Error Handling
190
-
191
- **Test doesn't fail in RED phase:**
192
- - Feature may already exist - investigate
193
- - Test may be wrong (not testing what you think)
194
- - Fix before proceeding
195
-
196
- **Test doesn't pass in GREEN phase:**
197
- - Debug implementation
198
- - Don't skip to refactor
199
- - Keep iterating until green
200
-
201
- **Tests fail in REFACTOR phase:**
202
- - Undo refactor
203
- - Commit was premature
204
- - Refactor in smaller steps
205
-
206
- **Unrelated tests break:**
207
- - Stop and investigate
208
- - May indicate coupling issue
209
- - Fix before proceeding
210
- </error_handling>
211
-
212
- <commit_pattern>
213
- ## Commit Pattern for TDD Plans
214
-
215
- TDD plans produce 2-3 atomic commits (one per phase):
216
108
 
217
- ```
218
- test(08-02): add failing test for email validation
219
-
220
- - Tests valid email formats accepted
221
- - Tests invalid formats rejected
222
- - Tests empty input handling
223
-
224
- feat(08-02): implement email validation
225
-
226
- - Regex pattern matches RFC 5322
227
- - Returns boolean for validity
228
- - Handles edge cases (empty, null)
229
-
230
- refactor(08-02): extract regex to constant (optional)
231
-
232
- - Moved pattern to EMAIL_REGEX constant
233
- - No behavior changes
234
- - Tests still pass
235
- ```
236
-
237
- **Comparison with standard plans:**
238
- - Standard plans: 1 commit per task, 2-4 commits per plan
239
- - TDD plans: 2-3 commits for single feature
240
-
241
- Both follow same format: `{type}({phase}-{plan}): {description}`
242
-
243
- **Benefits:**
244
- - Each commit independently revertable
245
- - Git bisect works at commit level
246
- - Clear history showing TDD discipline
247
- - Consistent with overall commit strategy
248
- </commit_pattern>
109
+ ---
249
110
 
250
- <context_budget>
251
111
  ## Context Budget
252
112
 
253
113
  TDD plans target **~40% context usage** (lower than standard plans' ~50%).
@@ -260,4 +120,3 @@ Why lower:
260
120
  Each phase involves reading files, running commands, analyzing output. The back-and-forth is inherently heavier than linear task execution.
261
121
 
262
122
  Single feature focus ensures full quality throughout the cycle.
263
- </context_budget>
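
Reviewer sketch (not part of the package): the example TDD plan in the hunk above names its test cases but, being a plan, contains no code. As a rough illustration of the end state after GREEN and REFACTOR, here is a Python rendering of the behavior the plan describes. The plan itself targets TypeScript, so `validate_email`, `MAX_LOCAL_LENGTH`, and the regex are assumptions made for this sketch. The pattern is deliberately simplified and does not implement full RFC 5322.

```python
import re
from typing import Optional

# Simplified pattern (an assumption, not full RFC 5322):
# local part, "@", then at least two dot-separated domain labels.
EMAIL_REGEX = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
MAX_LOCAL_LENGTH = 64  # the plan's edge case: local part over 64 chars is invalid


def validate_email(email: Optional[str]) -> bool:
    """Return True only for addresses matching the simplified pattern."""
    if not email:  # covers None and the empty string
        return False
    local, _, _ = email.partition("@")
    if len(local) > MAX_LOCAL_LENGTH:
        return False
    return EMAIL_REGEX.fullmatch(email) is not None
```

In the RED step the assertions for the plan's valid, invalid, and edge cases would be written first and fail on import. The function above is the kind of minimal code GREEN would add to make them pass.
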
@@ -1,16 +1,5 @@
1
1
  {
2
2
  "subsystems": [],
3
- "depth": "standard",
4
- "parallelization": {
5
- "enabled": true,
6
- "plan_level": true,
7
- "max_concurrent_agents": 3,
8
- "min_plans_for_parallel": 2
9
- },
10
- "safety": {
11
- "always_confirm_destructive": true,
12
- "always_confirm_external_services": true
13
- },
14
3
  "code_review": {
15
4
  "adhoc": null,
16
5
  "phase": null,