mindsystem-cc 3.18.1 → 3.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -94,31 +94,32 @@ Exit command.
94
94
 
95
95
  Ask inline (freeform, NOT AskUserQuestion):
96
96
 
97
- "What do you want to build?"
97
+ - **Greenfield** (no existing code): "What do you want to build?"
98
+ - **Brownfield** (code detected in Step 1): "Tell me about this project — if you were explaining it to someone who's never seen it, what does it do and who uses it?"
99
+
100
+ For brownfield, explicitly note: "Not the next feature — the product as a whole."
98
101
 
99
102
  Wait for their response. This gives you the context needed to ask intelligent follow-up questions.
100
103
 
104
+ **Derive business context:**
105
+
106
+ After the initial response, infer business context before asking more questions. Present inferred audience/problem/differentiation for the user to react to:
107
+
108
+ "It sounds like this is for [audience] dealing with [problem], and your approach is different because [differentiation]. Sound right?"
109
+
110
+ This leverages what they've already said and gives them something concrete to react to.
111
+
101
112
  **Follow the thread:**
102
113
 
103
114
  Based on what they said, ask follow-up questions that dig into their response. Use AskUserQuestion with options that probe what they mentioned — interpretations, clarifications, concrete examples.
104
115
 
105
- Keep following threads. Each answer opens new threads to explore. Ask about:
106
- - What excited them
107
- - What problem sparked this
108
- - What they mean by vague terms
109
- - What it would actually look like
110
- - What's already decided
116
+ Track coverage against the 6-item context checklist from `questioning.md`. Probe fuzziest areas first spend questioning budget where clarity is lowest.
111
117
 
112
- Consult `questioning.md` for techniques:
113
- - Challenge vagueness
114
- - Make abstract concrete
115
- - Surface assumptions
116
- - Find edges
117
- - Reveal motivation
118
+ Use grounding questions from `questioning.md` instead of template-shaped questions. Don't ask "Who is your target audience?" — ask "Who would be your first 10 users?"
118
119
 
119
- **Check context (background, not out loud):**
120
+ When the conversation pauses and sections are still fuzzy, use success-backward: "Imagine this is wildly successful in a year. What does that look like?"
120
121
 
121
- As you go, mentally check the context checklist from `questioning.md`. If gaps remain, weave questions naturally. Don't suddenly switch to checklist mode.
122
+ Consult `questioning.md` for techniques — challenge vagueness, derive before asking, use grounding questions over template-shaped questions.
122
123
 
123
124
  **Decision gate:**
124
125
 
@@ -142,54 +143,31 @@ Synthesize all context into `.planning/PROJECT.md` using the template from `temp
142
143
 
143
144
  **For greenfield projects:**
144
145
 
145
- Initialize requirements as hypotheses:
146
-
147
- ```markdown
148
- ## Requirements
149
-
150
- ### Validated
151
-
152
- (None yetship to validate)
153
-
154
- ### Active
155
-
156
- - [ ] [Requirement 1]
157
- - [ ] [Requirement 2]
158
- - [ ] [Requirement 3]
159
-
160
- ### Out of Scope
161
-
162
- - [Exclusion 1] — [why]
163
- - [Exclusion 2] — [why]
164
- ```
165
-
166
- All Active requirements are hypotheses until shipped and validated.
146
+ Initialize all sections from questioning:
147
+ - **What This Is** — product identity from questioning
148
+ - **Core Value** — the ONE thing from questioning
149
+ - **Who It's For** — target audience from questioning
150
+ - **Core Problem** — pain or desire from questioning
151
+ - **How It's Different** — differentiators from questioning
152
+ - **Key User Flows** — core interactions from questioning
153
+ - **Out of Scope** boundaries from questioning
154
+ - **Constraints** — hard limits from questioning
155
+ - **Technical Context** — minimal for greenfield (stack choices if discussed)
156
+ - **Validated** — `(None yet — ship to validate)`
157
+ - **Key Decisions** any decisions made during questioning
167
158
 
168
159
  **For brownfield projects (codebase map exists):**
169
160
 
170
- Infer Validated requirements from existing code:
161
+ Same as greenfield, plus:
171
162
 
172
163
  1. Read `.planning/codebase/ARCHITECTURE.md` and `STACK.md`
173
- 2. Identify what the codebase already does
174
- 3. These become the initial Validated set
164
+ 2. Infer Validated requirements from existing code — what the codebase already does
165
+ 3. Populate Technical Context from codebase map (stack, integrations, known debt)
175
166
 
176
167
  ```markdown
177
- ## Requirements
178
-
179
- ### Validated
180
-
168
+ ## Validated
181
169
  - ✓ [Existing capability 1] — existing
182
170
  - ✓ [Existing capability 2] — existing
183
- - ✓ [Existing capability 3] — existing
184
-
185
- ### Active
186
-
187
- - [ ] [New requirement 1]
188
- - [ ] [New requirement 2]
189
-
190
- ### Out of Scope
191
-
192
- - [Exclusion 1] — [why]
193
171
  ```
194
172
 
195
173
  **Key Decisions:**
@@ -239,7 +217,7 @@ docs: initialize [project-name]
239
217
 
240
218
  [One-liner from PROJECT.md]
241
219
 
242
- Creates PROJECT.md with requirements and constraints.
220
+ Creates PROJECT.md with business context and constraints.
243
221
  EOF
244
222
  )"
245
223
  ```
@@ -261,19 +239,14 @@ Project initialized:
261
239
 
262
240
  ## ▶ Next Up
263
241
 
264
- Choose your path:
265
-
266
- **Option A: Research first** (recommended)
267
- Research ecosystem → define requirements → create roadmap. Discovers standard stacks, expected features, architecture patterns.
268
-
269
- `/ms:research-project`
242
+ `/ms:new-milestone` Discover what to build first, create requirements and roadmap
270
243
 
271
- **Option B: Define requirements and roadmap directly** (familiar domains)
272
- Skip research, define requirements and create roadmap from what you know.
244
+ <sub>`/clear` first fresh context window</sub>
273
245
 
274
- `/ms:create-roadmap`
246
+ ---
275
247
 
276
- <sub>`/clear` first → fresh context window</sub>
248
+ **Also available:**
249
+ - `/ms:create-roadmap` — Skip milestone discovery, define requirements and roadmap directly
277
250
 
278
251
  ---
279
252
  ```
@@ -299,9 +272,10 @@ Update `.planning/STATE.md` Last Command field (if STATE.md exists):
299
272
 
300
273
  <success_criteria>
301
274
 
302
- - [ ] Deep questioning completed (not rushed, threads followed)
303
- - [ ] PROJECT.md captures full context with evolutionary structure
304
- - [ ] Requirements initialized as hypotheses (greenfield) or with inferred Validated (brownfield)
275
+ - [ ] Deep questioning completed with business context extracted
276
+ - [ ] PROJECT.md captures product identity (What/Value/Audience/Problem/Differentiation/Flows)
277
+ - [ ] PROJECT.md captures boundaries (Out of Scope, Constraints, Technical Context)
278
+ - [ ] Validated requirements initialized (empty for greenfield, inferred for brownfield)
305
279
  - [ ] Key Decisions table initialized
306
280
  - [ ] config.json has subsystems and code_review settings
307
281
  - [ ] All committed to git
@@ -12,6 +12,7 @@ allowed-tools:
12
12
  - WebFetch
13
13
  - mcp__context7__*
14
14
  - Task
15
+ - Skill
15
16
  ---
16
17
 
17
18
  <objective>
@@ -90,7 +91,9 @@ Check for `.planning/codebase/` and load relevant documents based on phase type.
90
91
  - Perform mandatory discovery (Level 0-3 as appropriate)
91
92
  - Scan project history via context scanner script (prior decisions, issues, debug resolutions, adhoc learnings, cross-milestone patterns)
92
93
  - Break phase into tasks
93
- - Estimate scope and split into multiple plans if needed
94
+ - Propose plan grouping (plan boundaries, wave structure, budget estimates) for user review
95
+ - Discover relevant project skills, confirm with user
96
+ - Hand off tasks + proposed grouping + confirmed skills to plan-writer subagent
94
97
  - Create PLAN.md file(s) with executable structure
95
98
 
96
99
  **Gap closure mode (--gaps flag):**
@@ -1,11 +1,11 @@
1
1
  ---
2
2
  name: ms:release-notes
3
3
  description: Show full Mindsystem release notes with update status
4
- allowed-tools: [Read, Bash, WebFetch]
4
+ allowed-tools: [Read, Bash, Task]
5
5
  ---
6
6
 
7
7
  <objective>
8
- Display all Mindsystem release notes from 2.0.0 onward in clean bullet format, with update status at the end.
8
+ Display Mindsystem release notes from 2.0.0 through installed version in clean bullet format, oldest first, with update status at the end.
9
9
 
10
10
  Use when: you want to see what Mindsystem has shipped across versions, or check if an update is available.
11
11
  </objective>
@@ -31,65 +31,48 @@ Reinstall with: `npx mindsystem-cc`
31
31
  STOP here if no VERSION file.
32
32
  </step>
33
33
 
34
- <step name="fetch_changelog">
35
- Fetch latest CHANGELOG.md from GitHub:
34
+ <step name="fetch_and_display">
35
+ Spawn a Haiku general-purpose subagent with `model: "haiku"` to fetch, parse, and display the changelog.
36
36
 
37
- Use WebFetch tool with:
38
- - URL: `https://raw.githubusercontent.com/rolandtolnay/mindsystem/main/CHANGELOG.md`
39
- - Prompt: "Return the full raw markdown content of this changelog file. Do not summarize or modify."
37
+ The subagent prompt must include the installed version and instruct it to:
40
38
 
41
- **If fetch fails:**
42
- Fall back to local changelog:
43
- ```bash
44
- cat ~/.claude/mindsystem/CHANGELOG.md 2>/dev/null
45
- ```
39
+ 1. **Fetch changelog** — Run `curl -sfL https://raw.githubusercontent.com/rolandtolnay/mindsystem/main/CHANGELOG.md`. If curl fails, fall back to reading `~/.claude/mindsystem/CHANGELOG.md`.
46
40
 
47
- Note to user: "Showing local changelog (couldn't reach GitHub)."
48
- </step>
41
+ 2. **Parse versions** Extract all `## [X.Y.Z]` entries from 2.0.0 onward (skip `## [Unreleased]` and `## [1.x]`).
49
42
 
50
- <step name="parse_and_display">
51
- From the changelog content:
43
+ 3. **Filter by installed version** — Only include versions where X.Y.Z is less than or equal to the installed version (semver comparison). Also extract the latest version from the changelog for the status line.
52
44
 
53
- 1. **Extract latest version** First `## [X.Y.Z]` entry after `## [Unreleased]`
54
- 2. **Parse all version entries from 2.0.0 onward** — Skip the collapsed `## [1.x]` section
55
- 3. **Convert to clean bullet format** — Transform Keep-a-Changelog sections into flat bullets:
56
-
57
- **Format each version as:**
58
- ```
59
- Version X.Y.Z:
60
- - Added feature description
61
- - Added another feature
62
- - Changed behavior description
63
- - Removed old thing
64
- - Fixed bug description
65
- ```
66
-
67
- **Conversion rules:**
68
- - Combine all `### Added`, `### Changed`, `### Removed`, `### Fixed` items under a single version header
69
- - Prefix each item with its category: "Added", "Changed", "Removed", "Fixed"
70
- - Strip bold markers from item text
71
- - One bullet per changelog line item
72
- - Blank line between versions
73
- - Skip `### Migration` sections
74
-
75
- Output all versions from newest to oldest, stopping before `## [1.x]`.
76
- </step>
45
+ 4. **Convert to clean bullet format:**
46
+ ```
47
+ Version X.Y.Z:
48
+ - Added feature description
49
+ - Changed behavior description
50
+ - Removed old thing
51
+ - Fixed bug description
52
+ ```
53
+ - Combine all `### Added`, `### Changed`, `### Removed`, `### Fixed` items under a single version header
54
+ - Prefix each item with its category: "Added", "Changed", "Removed", "Fixed"
55
+ - Strip bold markers from item text
56
+ - One bullet per changelog line item
57
+ - Blank line between versions
58
+ - Skip `### Migration` sections
77
59
 
78
- <step name="update_status">
79
- After all release notes, add a single status line:
60
+ 5. **Output oldest first, newest last** — Display versions starting from 2.0.0 up to the installed version.
80
61
 
81
- **Compare installed version with latest version from changelog.**
62
+ 6. **End with update status line:**
63
+ - If installed < latest: `Update available: vINSTALLED -> vLATEST. Run 'npx mindsystem-cc@latest' to update.`
64
+ - If installed == latest: `You are using the latest version (vINSTALLED).`
65
+ - If installed > latest: `You are ahead of the latest release (vINSTALLED > vLATEST) — development version.`
82
66
 
83
- - **If behind:** `Update available: vINSTALLED -> vLATEST. Run 'npx mindsystem-cc@latest' to update.`
84
- - **If current:** `You are using the latest version (vINSTALLED).`
85
- - **If ahead:** `You are ahead of the latest release (vINSTALLED > vLATEST) — development version.`
67
+ Output the subagent's response directly to the user.
86
68
  </step>
87
69
 
88
70
  </process>
89
71
 
90
72
  <success_criteria>
91
- - [ ] Installed version read from VERSION file
92
- - [ ] Remote changelog fetched (or graceful fallback to local)
93
- - [ ] All versions from 2.0.0 onward displayed in clean bullet format
73
+ - [ ] Delegated to Haiku subagent (no main-agent token waste)
74
+ - [ ] Only versions installed version displayed
75
+ - [ ] Versions displayed oldest first, newest last
76
+ - [ ] Clean bullet format with category prefixes
94
77
  - [ ] Update status shown as single line at end
95
78
  </success_criteria>
@@ -43,7 +43,7 @@ Details referencing existing utilities: "Use `comparePassword` in `src/lib/crypt
43
43
  | Section | Purpose | Required |
44
44
  |---------|---------|----------|
45
45
  | `# Plan NN: Title` | H1 with plan number and descriptive title | Yes |
46
- | `**Subsystem:** X \| **Type:** Y` | Inline metadata (replaces YAML frontmatter) | Yes (type defaults to `execute`) |
46
+ | `**Subsystem:** X \| **Type:** Y \| **Skills:** Z` | Inline metadata (replaces YAML frontmatter) | Yes (type defaults to `execute`, skills optional) |
47
47
  | `## Context` | Why this work exists, why this approach | Yes |
48
48
  | `## Changes` | Numbered subsections with files and implementation details | Yes |
49
49
  | `## Verification` | Bash commands and observable checks | Yes |
@@ -130,6 +130,15 @@ Matches vocabulary from the project's `config.json`. Used by the executor when g
130
130
 
131
131
  When `**Type:**` is omitted, the plan defaults to `execute`.
132
132
 
133
+ ### Skills
134
+
135
+ | Value | Behavior |
136
+ |-------|----------|
137
+ | *(omitted)* | No skills loaded. Skill discovery happens during `/ms:plan-phase` — absence means none were relevant. |
138
+ | `skill-a, skill-b` | Executor invokes listed skills via Skill tool before implementing. |
139
+
140
+ Skills are confirmed by the user during `/ms:plan-phase` and encoded into plans. Comma-separated list of skill names matching entries in the system-reminder.
141
+
133
142
  ---
134
143
 
135
144
  ## Must-Haves Specification
@@ -190,6 +199,7 @@ Execution order lives in a single `EXECUTION-ORDER.md` file alongside the plans.
190
199
  | What triggers TDD lazy-loading? | `**Type:** tdd` in inline metadata |
191
200
  | How does the executor know why an approach was chosen? | Reads `## Context` section |
192
201
  | How does the executor find existing utilities? | Reads `**Files:**` lines and inline references in `## Changes` |
202
+ | What triggers skill loading in executor? | `**Skills:**` in inline metadata. No skills loaded if omitted. |
193
203
 
194
204
  ---
195
205
 
@@ -102,7 +102,7 @@ ELSE:
102
102
 
103
103
  | Pre-work | Recommended | Status |
104
104
  |----------|-------------|--------|
105
- | Discuss | Unlikely | - |
105
+ | Discuss | Unlikely | Done |
106
106
  | Design | Unlikely | - |
107
107
  | Research | Likely | ✓ Done |
108
108
 
@@ -34,9 +34,9 @@ Plans must complete within reasonable context usage.
34
34
  - 70%+ context: Poor quality
35
35
 
36
36
  **Solution:** Budget-aware consolidation - prefer larger plans, calibrate from real data.
37
- - Budget-based grouping (sum of weights within 45%)
37
+ - Orchestrator proposes grouping (weight heuristics, target 25-45%), plan-writer validates structurally
38
38
  - Each plan independently executable
39
- - Consolidate plans under ~10% with related work
39
+ - Bias toward consolidation fewer plans, less overhead
40
40
  </scope_control>
41
41
 
42
42
  <claude_automates>
@@ -16,9 +16,10 @@ Don't interrogate. Collaborate. Don't follow a script. Follow the thread.
16
16
 
17
17
  By the end of questioning, you need enough clarity to write a PROJECT.md that downstream phases can act on:
18
18
 
19
- - **research-project** needs: what domain to research, what the user already knows, what unknowns exist
20
- - **create-roadmap** needs: clear enough vision to decompose into phases, what "done" looks like
21
- - **plan-phase** needs: specific requirements to break into tasks, context for implementation choices
19
+ - **research-project** needs: product domain, target audience, core problem, what unknowns exist
20
+ - **create-roadmap** needs: clear vision with business context grounding — who it's for, what problem it solves, how it's different — to decompose into phases
21
+ - **plan-phase** needs: audience context for task weighting, specific requirements for implementation choices
22
+ - **discuss-phase** needs: business context for PO-style analysis — audience, problem, differentiation inform scope recommendations
22
23
  - **execute-phase** needs: success criteria to verify against, the "why" behind requirements
23
24
 
24
25
  A vague PROJECT.md forces every downstream phase to guess. The cost compounds.
@@ -37,13 +38,17 @@ A vague PROJECT.md forces every downstream phase to guess. The cost compounds.
37
38
 
38
39
  **Clarify ambiguity.** "When you say Z, do you mean A or B?" "You mentioned X — tell me more."
39
40
 
41
+ **Brownfield reframing.** For brownfield projects, reframe to product-level before feature-level. "Tell me about this project" not "What do you want to build?" Users with existing codebases default to describing the next feature — redirect to the product as a whole first.
42
+
43
+ **Derive before asking.** Infer business context from feature descriptions before asking directly. "You described X, Y, Z — it sounds like this is for [audience] dealing with [problem]. Sound right?" This leverages what they've already said and gives them something concrete to react to.
44
+
40
45
  **Know when to stop.** When you understand what they want, why they want it, who it's for, and what done looks like — offer to proceed.
41
46
 
42
47
  </how_to_question>
43
48
 
44
49
  <highest_leverage>
45
50
 
46
- These three questions unlock the most downstream value. Everything else is nice-to-have context.
51
+ These four questions unlock the most downstream value. Everything else is nice-to-have context.
47
52
 
48
53
  1. **"What does done look like?"** — Without observable outcomes, every downstream phase guesses scope. Roadmap can't decompose, plans can't verify, execution can't stop.
49
54
 
@@ -51,7 +56,9 @@ These three questions unlock the most downstream value. Everything else is nice-
51
56
 
52
57
  3. **"What already exists / what can't change?"** — Constraints and existing code prevent planning in a vacuum. Surfaces brownfield reality, API limitations, time pressure.
53
58
 
54
- If you only get three answers, get these three.
59
+ 4. **"How will you know this is a success?"** — Reveals what actually matters: commercial viability, reliability, user experience, personal satisfaction. Informs Core Value and the weighting of all sections.
60
+
61
+ If you only get four answers, get these four.
55
62
 
56
63
  </highest_leverage>
57
64
 
@@ -109,16 +116,49 @@ User mentions "frustrated with current tools"
109
116
 
110
117
  </using_askuserquestion>
111
118
 
119
+ <clarity_adaptive>
120
+
121
+ Clarity is non-uniform across dimensions. Track per-section, not globally. Spend questioning budget where clarity is lowest.
122
+
123
+ **High clarity signals:** Specific demographics, named competitors, concrete scenarios, quantifiable outcomes, clear success metrics.
124
+ → Confirm and move on. Probe one level deeper to test robustness at most.
125
+
126
+ **Low clarity signals:** Broad categories ("developers", "small businesses"), vague benefits ("makes things easier"), no competitor awareness ("nothing else does this"), feature-focused language, hedging ("I think", "maybe").
127
+ → Offer frameworks. Provide concrete options via AskUserQuestion with derived options. Use scenarios to ground.
128
+
129
+ If audience is crystal-clear but differentiation is fuzzy, probe differentiation — don't revisit audience.
130
+
131
+ </clarity_adaptive>
132
+
133
+ <grounding_questions>
134
+
135
+ Grounding questions produce better answers than template-shaped questions. Use these instead of asking directly for template sections.
136
+
137
+ | Section | Don't Ask | Ask Instead |
138
+ |---------|-----------|-------------|
139
+ | Who It's For | "Who is your target audience?" | "Who would be your first 10 users — real people you'd tell tomorrow?" |
140
+ | Core Problem | "What problem does this solve?" | "What triggered you to want to build this? What's broken today?" |
141
+ | How It's Different | "What's your USP?" | "What are people using today instead? What's wrong with it?" |
142
+ | Core Value | "What's most important?" | "If only ONE thing worked perfectly and everything else was mediocre, what would that one thing be?" |
143
+ | Key User Flows | "What are the key flows?" | "Walk me through a session. You open the app — then what?" |
144
+ | Success | "How do you define success?" | "Imagine this is wildly successful in a year. What does that look like?" |
145
+
146
+ People articulate by reacting, not generating.
147
+
148
+ </grounding_questions>
149
+
112
150
  <context_checklist>
113
151
 
114
152
  Use this as a **background checklist**, not a conversation structure. Check these mentally as you go. If gaps remain, weave questions naturally.
115
153
 
116
154
  - [ ] What they're building (concrete enough to explain to a stranger)
117
155
  - [ ] Why it needs to exist (the problem or desire driving it)
118
- - [ ] Who it's for (even if just themselves)
119
- - [ ] What "done" looks like (observable outcomes)
156
+ - [ ] Who it's for (specific enough to find 10 of these people)
157
+ - [ ] What makes it different (from alternatives, even manual ones)
158
+ - [ ] What users actually do (2-3 core interactions)
159
+ - [ ] What success looks like (how they'll know it worked)
120
160
 
121
- Four things. If they volunteer more, capture it.
161
+ Six things. If they volunteer more, capture it.
122
162
 
123
163
  </context_checklist>
124
164
 
@@ -148,6 +188,8 @@ Loop until "Create PROJECT.md" selected.
148
188
  - **Shallow acceptance** — Taking vague answers without probing
149
189
  - **Premature constraints** — Asking about tech stack before understanding the idea
150
190
  - **User skills** — NEVER ask about user's technical experience. Claude builds.
191
+ - **Accepting "everyone" as audience** — Always narrow: "Who needs this MOST?" Broad audiences are a sign of fuzzy thinking, not universal appeal.
192
+ - **Skipping differentiation** — "Nothing else does this" is almost always wrong. Probe alternatives including manual workarounds, spreadsheets, competitor products.
151
193
 
152
194
  </anti_patterns>
153
195
 
@@ -27,7 +27,7 @@ Why 50% not 80%?
27
27
  </context_target>
28
28
 
29
29
  <task_rule>
30
- **Budget-based grouping.** Classify tasks by weight, then pack plans until budget full.
30
+ **Budget-aware grouping.** The orchestrator proposes plan boundaries using weight heuristics and contextual judgment.
31
31
 
32
32
  | Weight | Cost | Examples |
33
33
  |--------|------|----------|
@@ -35,9 +35,9 @@ Why 50% not 80%?
35
35
  | Medium | 10% | CRUD endpoints, widget extraction, single-file refactoring |
36
36
  | Heavy | 20% | Complex business logic, architecture changes, multi-file integrations |
37
37
 
38
- **Grouping rule:** `sum(weights) <= 45%`. Pack tasks by feature affinity until budget full. Bias toward consolidation — fewer plans, less overhead.
38
+ **For the orchestrator (grouping authority):** Use weight estimates as guidance when proposing plan boundaries. Target 25-45% per plan. Bias toward consolidation — fewer plans, less overhead. Consider that sequential-only work (no parallelism benefit from splitting) can be grouped more aggressively. A single light task alone wastes executor overhead — consolidate with related work.
39
39
 
40
- **Minimum plan threshold:** Plans under ~10% consolidate with related work in the same wave. A single light task alone wastes executor overhead.
40
+ **For the plan-writer (structural validation):** Classify weights for the grouping rationale table. Do NOT re-group based on budget math — the orchestrator already considered context budget with user input. Deviate only for structural issues (file conflicts, circular dependencies, missing dependency chains).
41
41
  </task_rule>
42
42
 
43
43
  <tdd_plans>
@@ -64,7 +64,6 @@ See `~/.claude/mindsystem/references/tdd.md` for TDD plan structure.
64
64
  <split_signals>
65
65
 
66
66
  <always_split>
67
- - **Budget sum exceeds 45%** - Budget overflow regardless of task count
68
67
  - **Multiple subsystems** - DB + API + UI = separate plans
69
68
  - **Any task with >5 file modifications** - Split by file groups
70
69
  - **Discovery + verification in separate plans** - Don't mix exploratory and implementation work
@@ -163,12 +162,11 @@ Tasks: 8 (models, migrations, API, JWT, middleware, hashing, login form, registe
163
162
  Result: Task 1-3 good, Task 4-5 degrading, Task 6-8 rushed
164
163
  ```
165
164
 
166
- **Good - Budget-aware plans:**
165
+ **Good - Consolidated plans:**
167
166
  ```
168
- Plan 1: "Auth Database + Config" (4L+1M = ~30%)
169
- Plan 2: "Auth API" (3M = ~30%)
170
- Plan 3: "Auth UI" (1H+1M = ~30%)
171
- Each: within 45% budget, peak quality, atomic commits
167
+ Plan 1: "Auth Foundation" (models + config + middleware = ~35%)
168
+ Plan 2: "Auth Endpoints + UI" (API + forms + wiring = ~40%)
169
+ Each: within reasonable budget, peak quality, atomic commits
172
170
  ```
173
171
 
174
172
  **Bad - Horizontal layers (sequential):**
@@ -191,7 +189,9 @@ Waves: [01, 02, 03] (all parallel)
191
189
  </anti_patterns>
192
190
 
193
191
  <estimating_context>
194
- Weight estimates are heuristics for the plan-writer to bias toward consolidation, not precise predictions. Actual context usage depends on model, task complexity, file sizes, and context window. Calibrate from real execution data — when plans consistently finish with headroom, pack more aggressively.
192
+ Weight estimates are heuristics for the orchestrator's grouping judgment, not mechanical constraints. Actual context usage depends on model, task complexity, file sizes, and context window. Calibrate from real execution data — when plans consistently finish with headroom, the orchestrator should group more aggressively.
193
+
194
+ The plan-writer uses weight classifications for the grouping rationale table (transparency), not as a reason to override the orchestrator's proposed boundaries.
195
195
  </estimating_context>
196
196
 
197
197
  <summary>
@@ -203,9 +203,8 @@ Weight estimates are heuristics for the plan-writer to bias toward consolidation
203
203
  **The principle:** Fewer executors, same quality, less overhead. Bias toward consolidation.
204
204
 
205
205
  **The rules:**
206
- - Group by weight budget (`sum(weights) <= 45%`), not by fixed task count.
207
- - Consolidate plans under ~10% with related same-wave work.
208
- - Split when budget sum exceeds 45%.
206
+ - Orchestrator proposes grouping using weight heuristics and contextual judgment (target 25-45%).
207
+ - Plan-writer validates structurally (file conflicts, circular deps) — deviates only for structural issues.
209
208
  - Vertical slices over horizontal layers.
210
209
  - Dependencies centralized in EXECUTION-ORDER.md.
211
210
  - Autonomous plans get parallel execution.
@@ -16,17 +16,16 @@ read `~/.claude/mindsystem/references/plan-format.md`.
16
16
 
17
17
  ## Scope Guidance
18
18
 
19
- **Plan sizing — budget-based grouping:**
19
+ **Plan sizing — orchestrator proposes, plan-writer validates:**
20
20
 
21
- Weight classifications (L: 5%, M: 10%, H: 20%) defined in scope-estimation.md.
21
+ Weight classifications (L: 5%, M: 10%, H: 20%) defined in scope-estimation.md. The orchestrator uses these as guidance when proposing plan boundaries to the user. Target 25-45% per plan. Bias toward consolidation.
22
22
 
23
- Grouping rule: `sum(weights) <= 45%`. Target 25-45% per plan. Plans under ~10% consolidate with related same-wave work. Bias toward consolidation.
23
+ The plan-writer receives the proposed grouping and validates structurally (file conflicts, circular deps). It does NOT re-group based on budget math.
24
24
 
25
- **When to split:**
25
+ **When to split (orchestrator judgment):**
26
26
 
27
27
  - Different subsystems (auth vs API vs UI)
28
- - Budget sum exceeds 45%
29
- - Risk of context overflow
28
+ - Risk of context overflow (contextual judgment, not mechanical formula)
30
29
  - TDD candidates — separate plans (inherently heavy-weight)
31
30
 
32
31
  **Vertical slices preferred:**