mindsystem-cc 3.17.1 → 3.18.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. package/agents/ms-consolidator.md +4 -4
  2. package/agents/ms-debugger.md +3 -3
  3. package/agents/ms-designer.md +33 -70
  4. package/agents/ms-executor.md +7 -6
  5. package/agents/ms-plan-writer.md +52 -26
  6. package/agents/ms-researcher.md +13 -13
  7. package/commands/ms/check-phase.md +1 -1
  8. package/commands/ms/complete-milestone.md +47 -54
  9. package/commands/ms/design-phase.md +33 -30
  10. package/commands/ms/review-design.md +106 -395
  11. package/mindsystem/references/principles.md +3 -3
  12. package/mindsystem/references/routing/next-phase-routing.md +1 -1
  13. package/mindsystem/references/scope-estimation.md +22 -35
  14. package/mindsystem/templates/design-iteration.md +13 -13
  15. package/mindsystem/templates/design.md +145 -327
  16. package/mindsystem/templates/knowledge.md +1 -1
  17. package/mindsystem/templates/milestone-archive.md +3 -3
  18. package/mindsystem/templates/phase-prompt.md +6 -7
  19. package/mindsystem/templates/research-subagent-prompt.md +2 -2
  20. package/mindsystem/templates/research.md +7 -7
  21. package/mindsystem/templates/roadmap.md +1 -1
  22. package/mindsystem/templates/verification-report.md +1 -1
  23. package/mindsystem/workflows/complete-milestone.md +52 -227
  24. package/mindsystem/workflows/discuss-phase.md +3 -3
  25. package/mindsystem/workflows/execute-plan.md +1 -1
  26. package/mindsystem/workflows/plan-phase.md +22 -50
  27. package/mindsystem/workflows/verify-phase.md +1 -1
  28. package/package.json +1 -1
  29. package/scripts/archive-milestone-files.sh +68 -0
  30. package/scripts/archive-milestone-phases.sh +138 -0
  31. package/scripts/gather-milestone-stats.sh +179 -0
  32. package/scripts/ms-lookup/ms_lookup/backends/context7.py +17 -5
  33. package/scripts/ms-lookup/ms_lookup/backends/perplexity.py +17 -3
  34. package/scripts/ms-lookup-wrapper.sh +1 -1
  35. package/scripts/scan-planning-context.py +186 -36
  36. package/scripts/validate-execution-order.sh +4 -5
  37. package/scripts/cleanup-phase-artifacts.sh +0 -68
@@ -27,21 +27,23 @@ Why 50% not 80%?
  </context_target>
 
  <task_rule>
- **Each plan: 2-3 tasks maximum. Stay under 50% context.**
+ **Budget-based grouping.** Classify tasks by weight, then pack plans until budget full.
 
- | Task Complexity | Tasks/Plan | Context/Task | Total |
- |-----------------|------------|--------------|-------|
- | Simple (CRUD, config) | 3 | ~10-15% | ~30-45% |
- | Complex (auth, payments) | 2-3 | ~15-25% | ~40-50% |
- | Very complex (migrations, refactors) | 1-2 | ~30-40% | ~30-50% |
+ | Weight | Cost | Examples |
+ |--------|------|----------|
+ | Light | 5% | One-line fixes, config changes, dead code removal, renaming |
+ | Medium | 10% | CRUD endpoints, widget extraction, single-file refactoring |
+ | Heavy | 20% | Complex business logic, architecture changes, multi-file integrations |
 
- **Default to 3 tasks for simple-medium work, 2 for complex.** Executor overhead reduction creates headroom for the third task.
+ **Grouping rule:** `sum(weights) <= 45%`. Pack tasks by feature affinity until budget full. Bias toward consolidation: fewer plans, less overhead.
+
+ **Minimum plan threshold:** Plans under ~10% → consolidate with related work in the same wave. A single light task alone wastes executor overhead.
  </task_rule>
 
  <tdd_plans>
  **TDD features get their own plans. Target ~40% context.**
 
- TDD requires 2-3 execution cycles (RED → GREEN → REFACTOR), each with file reads, test runs, and potential debugging. This is fundamentally heavier than linear task execution.
+ TDD requires 2-3 execution cycles (RED → GREEN → REFACTOR), each with file reads, test runs, and potential debugging. This is fundamentally heavier than linear task execution. TDD features are inherently heavy-weight (~25-40% marginal) and are always isolated into dedicated plans.
 
  | TDD Feature Complexity | Context Usage |
  |------------------------|---------------|
@@ -62,7 +64,7 @@ See `~/.claude/mindsystem/references/tdd.md` for TDD plan structure.
  <split_signals>
 
  <always_split>
- - **More than 3 tasks** - Even if tasks seem small
+ - **Budget sum exceeds 45%** - Budget overflow regardless of task count
  - **Multiple subsystems** - DB + API + UI = separate plans
  - **Any task with >5 file modifications** - Split by file groups
  - **Discovery + verification in separate plans** - Don't mix exploratory and implementation work
@@ -161,13 +163,12 @@ Tasks: 8 (models, migrations, API, JWT, middleware, hashing, login form, registe
  Result: Task 1-3 good, Task 4-5 degrading, Task 6-8 rushed
  ```
 
- **Good - Atomic plans:**
+ **Good - Budget-aware plans:**
  ```
- Plan 1: "Auth Database Models" (2 tasks)
- Plan 2: "Auth API Core" (3 tasks)
- Plan 3: "Auth API Protection" (2 tasks)
- Plan 4: "Auth UI Components" (2 tasks)
- Each: 30-40% context, peak quality, atomic commits
+ Plan 1: "Auth Database + Config" (4L+1M = ~30%)
+ Plan 2: "Auth API" (3M = ~30%)
+ Plan 3: "Auth UI" (1H+1M = ~30%)
+ Each: within 45% budget, peak quality, atomic commits
  ```
 
  **Bad - Horizontal layers (sequential):**
@@ -190,35 +191,21 @@ Waves: [01, 02, 03] (all parallel)
  </anti_patterns>
 
  <estimating_context>
- | Files Modified | Context Impact |
- |----------------|----------------|
- | 0-3 files | ~10-15% (small) |
- | 4-6 files | ~20-30% (medium) |
- | 7+ files | ~40%+ (large - split) |
-
- | Complexity | Context/Task |
- |------------|--------------|
- | Simple CRUD | ~15% |
- | Business logic | ~25% |
- | Complex algorithms | ~40% |
- | Domain modeling | ~35% |
-
- **2 tasks:** Simple ~30%, Medium ~50%, Complex ~80% (split)
- **3 tasks:** Simple ~45%, Medium ~75% (risky), Complex 120% (impossible)
-
- **Executor overhead:** ~2,400 tokens (down from ~6,900 in previous versions), freeing ~4,500 tokens per plan for code quality.
+ Weight estimates are heuristics for the plan-writer to bias toward consolidation, not precise predictions. Actual context usage depends on model, task complexity, file sizes, and context window. Calibrate from real execution data — when plans consistently finish with headroom, pack more aggressively.
  </estimating_context>
 
  <summary>
- **2-3 tasks, 50% context target:**
+ **Budget-aware consolidation, 50% context target:**
  - All tasks: Peak quality
  - Git: Atomic per-task commits
  - Parallel by default: Fresh context per subagent
 
- **The principle:** Aggressive atomicity. More plans, smaller scope, consistent quality.
+ **The principle:** Fewer executors, same quality, less overhead. Bias toward consolidation.
 
  **The rules:**
- - If in doubt, split. Quality over consolidation.
+ - Group by weight budget (`sum(weights) <= 45%`), not by fixed task count.
+ - Consolidate plans under ~10% with related same-wave work.
+ - Split when budget sum exceeds 45%.
  - Vertical slices over horizontal layers.
  - Dependencies centralized in EXECUTION-ORDER.md.
  - Autonomous plans get parallel execution.
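Note: the grouping rule introduced in the hunks above (weights of 5%, 10% and 20%, a 45% budget, and a ~10% minimum plan size) can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The names `Task`, `Plan`, `pack_plans`, `undersized` and the `feature` affinity key are hypothetical, and the greedy first-fit strategy is one possible reading of "pack tasks by feature affinity until budget full", not something the package is confirmed to implement.

```python
from dataclasses import dataclass, field

# Costs in percent of context, copied from the weight table in the diff.
WEIGHT_COST = {"light": 5, "medium": 10, "heavy": 20}
BUDGET = 45    # grouping rule: sum(weights) <= 45%
MIN_PLAN = 10  # plans under ~10% should be consolidated


@dataclass
class Task:
    name: str
    weight: str   # "light", "medium" or "heavy"
    feature: str  # hypothetical affinity key used for grouping


@dataclass
class Plan:
    feature: str
    tasks: list = field(default_factory=list)

    @property
    def cost(self) -> int:
        """Summed weight cost of all tasks in this plan, in percent."""
        return sum(WEIGHT_COST[t.weight] for t in self.tasks)


def pack_plans(tasks):
    """Greedy first-fit: put each task into the first plan of the same
    feature that still has room, otherwise open a new plan."""
    plans = []
    for task in tasks:
        cost = WEIGHT_COST[task.weight]
        target = next(
            (p for p in plans
             if p.feature == task.feature and p.cost + cost <= BUDGET),
            None,
        )
        if target is None:
            target = Plan(task.feature)
            plans.append(target)
        target.tasks.append(task)
    return plans


def undersized(plans):
    """Plans below the minimum threshold, i.e. consolidation candidates."""
    return [p for p in plans if p.cost < MIN_PLAN]


# The "Good - Budget-aware plans" example from the diff: 4L+1M, 3M, 1H+1M.
demo = (
    [Task(f"db-{i}", "light", "auth-db") for i in range(4)]
    + [Task("db-models", "medium", "auth-db")]
    + [Task(f"api-{i}", "medium", "auth-api") for i in range(3)]
    + [Task("login-flow", "heavy", "auth-ui"),
       Task("register-form", "medium", "auth-ui")]
)
plans = pack_plans(demo)
print([p.cost for p in plans])  # [30, 30, 30]
```

With these inputs each plan lands at 30%, matching the `4L+1M`, `3M` and `1H+1M` figures in the diff. A plan holding a single light task would cost 5% and show up in `undersized`, which corresponds to the "consolidate with related work in the same wave" rule.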
@@ -27,11 +27,11 @@ Iteration: v[N] (previous: v[N-1])
  <previous_design>
  [Include key sections from current DESIGN.md that are relevant to the changes]
 
- Visual Identity:
- [Current visual identity section]
+ Design Direction:
+ [Current design direction]
 
- Relevant screens/components:
- [Sections being modified]
+ Relevant screens (wireframe + states + behavior + hints):
+ [Screens being modified]
  </previous_design>
 
  <feedback_on_previous>
@@ -87,14 +87,14 @@ Iteration: v2 (previous: v1)
  </iteration_context>
 
  <previous_design>
- Visual Identity:
- Professional analytics dashboard with dark mode. Deep navy background (#0a0f1a) with amber accent (#F59E0B).
-
- Design System Colors:
- - Primary: #0a0f1a
- - Secondary: #1a1f2e
- - Text: #ffffff
- - Accent: #F59E0B
+ Design Direction:
+ Professional analytics dashboard with dark mode. Deep navy background instead of generic black, amber accent for energy.
+
+ Design Tokens (relevant):
+ - bg-primary: #0a0f1a
+ - bg-surface: #1a1f2e
+ - text-primary: #ffffff
+ - accent: #F59E0B
  </previous_design>
 
  <feedback_on_previous>
@@ -139,7 +139,7 @@ Iteration: v3 (previous: v2)
  </iteration_context>
 
  <previous_design>
- Screen Layout (Kanban Board):
+ Kanban Board screen (wireframe):
  +------------------------------------------+
  | [Header] |
  +------------------------------------------+