maxsimcli 4.8.0 → 4.9.0
- package/dist/assets/CHANGELOG.md +13 -0
- package/dist/assets/hooks/maxsim-check-update.cjs +38 -0
- package/dist/assets/hooks/maxsim-check-update.cjs.map +1 -1
- package/dist/assets/hooks/maxsim-statusline.cjs +116 -48
- package/dist/assets/hooks/maxsim-statusline.cjs.map +1 -1
- package/dist/assets/hooks/maxsim-sync-reminder.cjs +117 -0
- package/dist/assets/hooks/maxsim-sync-reminder.cjs.map +1 -0
- package/dist/assets/templates/agents/AGENTS.md +78 -106
- package/dist/assets/templates/agents/executor.md +101 -0
- package/dist/assets/templates/agents/planner.md +86 -0
- package/dist/assets/templates/agents/researcher.md +71 -0
- package/dist/assets/templates/agents/verifier.md +88 -0
- package/dist/assets/templates/commands/maxsim/debug.md +7 -7
- package/dist/assets/templates/commands/maxsim/execute.md +45 -0
- package/dist/assets/templates/commands/maxsim/go.md +29 -0
- package/dist/assets/templates/commands/maxsim/help.md +2 -2
- package/dist/assets/templates/commands/maxsim/init.md +52 -0
- package/dist/assets/templates/commands/maxsim/plan.md +50 -0
- package/dist/assets/templates/commands/maxsim/progress.md +4 -3
- package/dist/assets/templates/commands/maxsim/quick.md +6 -4
- package/dist/assets/templates/commands/maxsim/settings.md +4 -3
- package/dist/assets/templates/references/continuation-format.md +16 -16
- package/dist/assets/templates/references/model-profile-resolution.md +1 -1
- package/dist/assets/templates/references/model-profiles.md +12 -19
- package/dist/assets/templates/rules/conventions.md +51 -0
- package/dist/assets/templates/rules/verification-protocol.md +57 -0
- package/dist/assets/templates/skills/agent-system-map/SKILL.md +92 -0
- package/dist/assets/templates/skills/brainstorming/SKILL.md +48 -36
- package/dist/assets/templates/skills/code-review/SKILL.md +40 -61
- package/dist/assets/templates/skills/commit-conventions/SKILL.md +75 -0
- package/dist/assets/templates/skills/evidence-collection/SKILL.md +87 -0
- package/dist/assets/templates/skills/handoff-contract/SKILL.md +70 -0
- package/dist/assets/templates/skills/input-validation/SKILL.md +51 -0
- package/dist/assets/templates/skills/maxsim-batch/SKILL.md +41 -45
- package/dist/assets/templates/skills/maxsim-simplify/SKILL.md +37 -90
- package/dist/assets/templates/skills/memory-management/SKILL.md +32 -67
- package/dist/assets/templates/skills/research-methodology/SKILL.md +137 -0
- package/dist/assets/templates/skills/roadmap-writing/SKILL.md +40 -58
- package/dist/assets/templates/skills/sdd/SKILL.md +34 -69
- package/dist/assets/templates/skills/systematic-debugging/SKILL.md +20 -26
- package/dist/assets/templates/skills/tdd/SKILL.md +25 -33
- package/dist/assets/templates/skills/tool-priority-guide/SKILL.md +80 -0
- package/dist/assets/templates/skills/using-maxsim/SKILL.md +42 -73
- package/dist/assets/templates/skills/verification-before-completion/SKILL.md +12 -24
- package/dist/assets/templates/skills/verification-gates/SKILL.md +169 -0
- package/dist/assets/templates/templates/UAT.md +3 -3
- package/dist/assets/templates/templates/VALIDATION.md +1 -1
- package/dist/assets/templates/templates/context.md +4 -4
- package/dist/assets/templates/templates/debug-subagent-prompt.md +3 -3
- package/dist/assets/templates/templates/discovery.md +2 -2
- package/dist/assets/templates/templates/phase-prompt.md +2 -2
- package/dist/assets/templates/templates/planner-subagent-prompt.md +7 -7
- package/dist/assets/templates/templates/project.md +1 -1
- package/dist/assets/templates/templates/research.md +1 -1
- package/dist/assets/templates/templates/state.md +2 -2
- package/dist/assets/templates/templates/summary.md +41 -0
- package/dist/assets/templates/workflows/batch.md +5 -5
- package/dist/assets/templates/workflows/diagnose-issues.md +2 -2
- package/dist/assets/templates/workflows/discovery-phase.md +3 -3
- package/dist/assets/templates/workflows/discuss-phase.md +11 -11
- package/dist/assets/templates/workflows/execute-phase.md +205 -11
- package/dist/assets/templates/workflows/execute-plan.md +299 -34
- package/dist/assets/templates/workflows/execute.md +421 -0
- package/dist/assets/templates/workflows/go.md +250 -0
- package/dist/assets/templates/workflows/health.md +5 -5
- package/dist/assets/templates/workflows/help.md +165 -435
- package/dist/assets/templates/workflows/init-existing.md +23 -23
- package/dist/assets/templates/workflows/init.md +205 -0
- package/dist/assets/templates/workflows/new-milestone.md +9 -9
- package/dist/assets/templates/workflows/new-project.md +26 -26
- package/dist/assets/templates/workflows/plan-create.md +298 -0
- package/dist/assets/templates/workflows/plan-discuss.md +347 -0
- package/dist/assets/templates/workflows/plan-phase.md +29 -29
- package/dist/assets/templates/workflows/plan-research.md +177 -0
- package/dist/assets/templates/workflows/plan.md +231 -0
- package/dist/assets/templates/workflows/progress.md +46 -42
- package/dist/assets/templates/workflows/quick.md +195 -14
- package/dist/assets/templates/workflows/research-phase.md +5 -5
- package/dist/assets/templates/workflows/sdd.md +20 -12
- package/dist/assets/templates/workflows/settings.md +18 -14
- package/dist/assets/templates/workflows/verify-phase.md +1 -1
- package/dist/assets/templates/workflows/verify-work.md +16 -16
- package/dist/cli.cjs +496 -91
- package/dist/cli.cjs.map +1 -1
- package/dist/core-D5zUr9cb.cjs.map +1 -1
- package/dist/install.cjs +234 -17
- package/dist/install.cjs.map +1 -1
- package/dist/mcp-server.cjs +21 -2
- package/dist/mcp-server.cjs.map +1 -1
- package/dist/skills-CjFWZIGM.cjs.map +1 -1
- package/package.json +1 -1
- package/dist/assets/hooks/maxsim-context-monitor.cjs +0 -121
- package/dist/assets/hooks/maxsim-context-monitor.cjs.map +0 -1
- package/dist/assets/templates/agents/maxsim-code-reviewer.md +0 -239
- package/dist/assets/templates/agents/maxsim-codebase-mapper.md +0 -214
- package/dist/assets/templates/agents/maxsim-debugger.md +0 -572
- package/dist/assets/templates/agents/maxsim-drift-checker.md +0 -522
- package/dist/assets/templates/agents/maxsim-executor.md +0 -504
- package/dist/assets/templates/agents/maxsim-integration-checker.md +0 -273
- package/dist/assets/templates/agents/maxsim-phase-researcher.md +0 -305
- package/dist/assets/templates/agents/maxsim-plan-checker.md +0 -343
- package/dist/assets/templates/agents/maxsim-planner.md +0 -610
- package/dist/assets/templates/agents/maxsim-project-researcher.md +0 -359
- package/dist/assets/templates/agents/maxsim-research-synthesizer.md +0 -263
- package/dist/assets/templates/agents/maxsim-roadmapper.md +0 -324
- package/dist/assets/templates/agents/maxsim-spec-reviewer.md +0 -245
- package/dist/assets/templates/agents/maxsim-verifier.md +0 -393
- package/dist/assets/templates/commands/maxsim/add-phase.md +0 -43
- package/dist/assets/templates/commands/maxsim/add-tests.md +0 -41
- package/dist/assets/templates/commands/maxsim/add-todo.md +0 -57
- package/dist/assets/templates/commands/maxsim/artefakte.md +0 -122
- package/dist/assets/templates/commands/maxsim/audit-milestone.md +0 -36
- package/dist/assets/templates/commands/maxsim/batch.md +0 -42
- package/dist/assets/templates/commands/maxsim/check-drift.md +0 -56
- package/dist/assets/templates/commands/maxsim/check-todos.md +0 -46
- package/dist/assets/templates/commands/maxsim/cleanup.md +0 -18
- package/dist/assets/templates/commands/maxsim/complete-milestone.md +0 -136
- package/dist/assets/templates/commands/maxsim/discuss-phase.md +0 -87
- package/dist/assets/templates/commands/maxsim/discuss.md +0 -70
- package/dist/assets/templates/commands/maxsim/execute-phase.md +0 -41
- package/dist/assets/templates/commands/maxsim/health.md +0 -22
- package/dist/assets/templates/commands/maxsim/init-existing.md +0 -46
- package/dist/assets/templates/commands/maxsim/insert-phase.md +0 -32
- package/dist/assets/templates/commands/maxsim/list-phase-assumptions.md +0 -46
- package/dist/assets/templates/commands/maxsim/map-codebase.md +0 -71
- package/dist/assets/templates/commands/maxsim/new-milestone.md +0 -44
- package/dist/assets/templates/commands/maxsim/new-project.md +0 -46
- package/dist/assets/templates/commands/maxsim/pause-work.md +0 -38
- package/dist/assets/templates/commands/maxsim/plan-milestone-gaps.md +0 -34
- package/dist/assets/templates/commands/maxsim/plan-phase.md +0 -44
- package/dist/assets/templates/commands/maxsim/realign.md +0 -39
- package/dist/assets/templates/commands/maxsim/reapply-patches.md +0 -110
- package/dist/assets/templates/commands/maxsim/remove-phase.md +0 -31
- package/dist/assets/templates/commands/maxsim/research-phase.md +0 -189
- package/dist/assets/templates/commands/maxsim/resume-work.md +0 -40
- package/dist/assets/templates/commands/maxsim/roadmap.md +0 -19
- package/dist/assets/templates/commands/maxsim/sdd.md +0 -39
- package/dist/assets/templates/commands/maxsim/set-profile.md +0 -34
- package/dist/assets/templates/commands/maxsim/update.md +0 -37
- package/dist/assets/templates/commands/maxsim/verify-work.md +0 -38
- package/dist/assets/templates/workflows/add-phase.md +0 -111
- package/dist/assets/templates/workflows/add-tests.md +0 -351
- package/dist/assets/templates/workflows/add-todo.md +0 -247
- package/dist/assets/templates/workflows/audit-milestone.md +0 -297
- package/dist/assets/templates/workflows/check-drift.md +0 -248
- package/dist/assets/templates/workflows/check-todos.md +0 -261
- package/dist/assets/templates/workflows/cleanup.md +0 -153
- package/dist/assets/templates/workflows/complete-milestone.md +0 -701
- package/dist/assets/templates/workflows/discuss.md +0 -343
- package/dist/assets/templates/workflows/insert-phase.md +0 -129
- package/dist/assets/templates/workflows/list-phase-assumptions.md +0 -178
- package/dist/assets/templates/workflows/map-codebase.md +0 -315
- package/dist/assets/templates/workflows/pause-work.md +0 -122
- package/dist/assets/templates/workflows/plan-milestone-gaps.md +0 -274
- package/dist/assets/templates/workflows/realign.md +0 -288
- package/dist/assets/templates/workflows/remove-phase.md +0 -154
- package/dist/assets/templates/workflows/resume-project.md +0 -306
- package/dist/assets/templates/workflows/roadmap.md +0 -130
- package/dist/assets/templates/workflows/set-profile.md +0 -81
- package/dist/assets/templates/workflows/transition.md +0 -544
- package/dist/assets/templates/workflows/update.md +0 -220
|
@@ -1,22 +1,21 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: roadmap-writing
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
4
|
+
Phased planning with dependency graphs, success criteria, and requirement
|
|
5
|
+
mapping. Produces roadmaps with observable truths as success criteria.
|
|
6
|
+
Use when creating project roadmaps, breaking features into phases, or
|
|
7
|
+
structuring multi-phase work.
|
|
7
8
|
---
|
|
8
9
|
|
|
9
10
|
# Roadmap Writing
|
|
10
11
|
|
|
11
12
|
A roadmap without success criteria is a wish list. Define what done looks like for every phase.
|
|
12
13
|
|
|
13
|
-
**HARD GATE: No phase without success criteria and dependencies. Every phase must have a number, name, goal, testable success criteria, and explicit dependencies. Violating this rule is a violation, not flexibility.**
|
|
14
|
-
|
|
15
14
|
## Process
|
|
16
15
|
|
|
17
16
|
### 1. SCOPE -- Understand the Project
|
|
18
17
|
|
|
19
|
-
Before writing phases
|
|
18
|
+
Before writing phases:
|
|
20
19
|
|
|
21
20
|
- Read PROJECT.md for vision and constraints
|
|
22
21
|
- Read REQUIREMENTS.md for v1/v2/out-of-scope boundaries
|
|
@@ -29,7 +28,7 @@ Each phase should be:
|
|
|
29
28
|
|
|
30
29
|
| Property | Requirement |
|
|
31
30
|
|----------|------------|
|
|
32
|
-
| **Independently deliverable** |
|
|
31
|
+
| **Independently deliverable** | Produces a working increment, not a half-built feature |
|
|
33
32
|
| **1-3 days of work** | Larger phases should be split; smaller ones should be merged |
|
|
34
33
|
| **Clear boundary** | You can tell when the phase is done without ambiguity |
|
|
35
34
|
| **Ordered by dependency** | No phase depends on a later phase |
|
|
@@ -42,28 +41,25 @@ Phase numbering convention:
|
|
|
42
41
|
| `01A`, `01B` | Parallel sub-phases that can execute concurrently |
|
|
43
42
|
| `01.1`, `01.2` | Sequential sub-phases within a parent phase |
|
|
44
43
|
|
|
45
|
-
Sort order: `01` then `01A` then `01B` then `01.1` then `01.2` then `02`.
|
|
46
|
-
|
|
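The sort order stated in the removed line above (`01`, `01A`, `01B`, `01.1`, `01.2`, `02`) can be sketched as a comparator. This is a minimal illustration, not the package's `comparePhaseNum()`; the names `PhaseKey`, `parsePhaseId`, and `comparePhaseIds` are hypothetical, and it assumes phase IDs take only the three shapes shown in the numbering table.

```typescript
// Hypothetical parsed form of a phase ID such as "01", "01A", or "01.2".
interface PhaseKey {
  major: number;  // leading number, e.g. 1 for "01B"
  kind: number;   // 0 = bare phase, 1 = letter suffix, 2 = decimal suffix
  letter: string; // "A", "B", ... or "" when absent
  minor: number;  // decimal sub-phase number, or 0 when absent
}

// Parse a phase ID; throws on anything outside the three shapes above.
function parsePhaseId(id: string): PhaseKey {
  const m = /^(\d+)(?:([A-Z])|\.(\d+))?$/.exec(id);
  if (m === null) {
    throw new Error("Unrecognized phase ID: " + id);
  }
  const letter = m[2] || "";
  const decimal = m[3] || "";
  const kind = letter !== "" ? 1 : decimal !== "" ? 2 : 0;
  return {
    major: parseInt(m[1] || "0", 10),
    kind: kind,
    letter: letter,
    minor: decimal !== "" ? parseInt(decimal, 10) : 0,
  };
}

// Order: major number, then bare < lettered < decimal, then suffix value.
function comparePhaseIds(a: string, b: string): number {
  const x = parsePhaseId(a);
  const y = parsePhaseId(b);
  if (x.major !== y.major) return x.major - y.major;
  if (x.kind !== y.kind) return x.kind - y.kind;
  if (x.letter !== y.letter) return x.letter < y.letter ? -1 : 1;
  return x.minor - y.minor;
}

// Yields ["01", "01A", "01B", "01.1", "01.2", "02"].
const sortedPhaseIds = ["02", "01.2", "01B", "01", "01.1", "01A"].sort(comparePhaseIds);
```

Ranking by shape before comparing suffix values keeps parallel sub-phases ahead of sequential ones under the same parent, which is what the stated order requires.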
47
44
|
### 3. DEFINE -- Write Each Phase
|
|
48
45
|
|
|
49
|
-
Every phase must include
|
|
46
|
+
Every phase must include:
|
|
50
47
|
|
|
51
48
|
```markdown
|
|
52
49
|
### Phase {number}: {name}
|
|
53
50
|
**Goal**: {one sentence -- what this phase achieves}
|
|
54
51
|
**Depends on**: {phase numbers, or "Nothing" for the first phase}
|
|
55
|
-
**Requirements**: {requirement IDs from REQUIREMENTS.md
|
|
52
|
+
**Requirements**: {requirement IDs from REQUIREMENTS.md}
|
|
56
53
|
**Success Criteria** (what must be TRUE):
|
|
57
|
-
1. {
|
|
58
|
-
2. {
|
|
59
|
-
3. {Testable statement}
|
|
54
|
+
1. {Observable truth -- verifiable by command, test, or inspection}
|
|
55
|
+
2. {Observable truth}
|
|
60
56
|
**Plans**: TBD
|
|
61
57
|
```
|
|
62
58
|
|
|
63
59
|
Success criteria rules:
|
|
64
60
|
- Each criterion must be testable -- "code is clean" is not testable; "no lint warnings" is testable
|
|
65
61
|
- Include at least 2 criteria per phase
|
|
66
|
-
- At least one criterion should be verifiable by running a command
|
|
62
|
+
- At least one criterion should be verifiable by running a command
|
|
67
63
|
- Criteria describe the end state, not the process ("tests pass" not "write tests")
|
|
68
64
|
|
|
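The mechanical parts of the success criteria rules above can be checked in code. The sketch below assumes a hypothetical in-memory shape (`RoadmapPhase`, `SuccessCriterion`) that the skill file does not define, and the lint command in the usage note is only an example. Whether a criterion is actually testable still needs a human read.

```typescript
// Hypothetical shapes for illustration; not part of the MAXSIM package.
interface SuccessCriterion {
  statement: string; // end state, e.g. "no lint warnings"
  command?: string;  // optional command that verifies the statement
}

interface RoadmapPhase {
  id: string;
  title: string;
  goal: string;
  criteria: SuccessCriterion[];
}

// Return human-readable problems; an empty list means the phase passes
// the mechanical checks (goal present, 2+ criteria, one command-verifiable).
function checkRoadmapPhase(phase: RoadmapPhase): string[] {
  const problems: string[] = [];
  if (phase.goal.trim() === "") {
    problems.push("missing goal");
  }
  if (phase.criteria.length < 2) {
    problems.push("fewer than 2 success criteria");
  }
  const hasCommand = phase.criteria.some(
    (c) => c.command !== undefined && c.command.trim() !== ""
  );
  if (!hasCommand) {
    problems.push("no criterion is verifiable by running a command");
  }
  return problems;
}
```

A phase whose only criterion is "code is clean" would return two problems, while a phase with "no lint warnings" (verified by, say, `npm run lint`) plus one more criterion would return none.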
69
65
|
### 4. CONNECT -- Map Dependencies
|
|
@@ -71,42 +67,54 @@ Success criteria rules:
|
|
|
71
67
|
- Which phases can run in parallel? (Use letter suffixes: `03A`, `03B`)
|
|
72
68
|
- Which phases are strictly sequential? (Use number suffixes: `03.1`, `03.2`)
|
|
73
69
|
- Are there any circular dependencies? (This is a design error -- restructure)
|
|
70
|
+
- Every phase except the first must declare at least one dependency
|
|
71
|
+
|
|
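The circular-dependency question above is a cycle check on a directed graph. The following sketch uses a depth-first search; `DependencyMap`, `findDependencyCycle`, and `phasesWithoutDependencies` are hypothetical names, and the input shape (phase ID mapped to the IDs it depends on) is an assumption for illustration.

```typescript
// Hypothetical shape: phase ID -> IDs of the phases it depends on.
type DependencyMap = { [phaseId: string]: string[] };

// Depth-first search with two marks per node (1 = in progress, 2 = done).
// Returns one cycle as a path, e.g. ["02", "03", "02"], or null if acyclic.
function findDependencyCycle(deps: DependencyMap): string[] | null {
  const state: { [phaseId: string]: number } = {};
  const path: string[] = [];

  const visit = (id: string): string[] | null => {
    if (state[id] === 2) return null;
    if (state[id] === 1) {
      // Reached a phase that is still on the current path: a loop.
      return path.slice(path.indexOf(id)).concat([id]);
    }
    state[id] = 1;
    path.push(id);
    for (const dep of deps[id] || []) {
      const cycle = visit(dep);
      if (cycle !== null) return cycle;
    }
    path.pop();
    state[id] = 2;
    return null;
  };

  for (const id of Object.keys(deps)) {
    const cycle = visit(id);
    if (cycle !== null) return cycle;
  }
  return null;
}

// Phases other than the first that declare no dependency at all.
function phasesWithoutDependencies(deps: DependencyMap, firstPhaseId: string): string[] {
  return Object.keys(deps).filter(
    (id) => id !== firstPhaseId && (deps[id] || []).length === 0
  );
}
```

Returning the offending path rather than a boolean makes the restructuring step easier, since it shows which phases form the loop.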
72
|
+
### 5. MAP REQUIREMENTS -- Ensure Coverage
|
|
74
73
|
|
|
75
|
-
Every
|
|
74
|
+
Every requirement ID from REQUIREMENTS.md must appear in at least one phase. Produce a coverage map:
|
|
76
75
|
|
|
77
|
-
|
|
76
|
+
```
|
|
77
|
+
REQUIREMENT-ID -> Phase N
|
|
78
|
+
```
|
|
78
79
|
|
|
79
|
-
|
|
80
|
+
If any requirement is unmapped, either add it to a phase or explicitly mark it as out-of-scope.
|
|
81
|
+
|
|
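The coverage rule above can be sketched as a small function that emits the `REQUIREMENT-ID -> Phase N` lines and reports anything unmapped. The `PhaseRequirements` shape and the requirement IDs in the usage note are hypothetical; the real IDs come from REQUIREMENTS.md.

```typescript
// Hypothetical minimal shape: each phase lists the requirement IDs it covers.
interface PhaseRequirements {
  phase: string;
  requirements: string[];
}

// Build "REQUIREMENT-ID -> Phase N" lines and collect unmapped requirement IDs.
function buildCoverageMap(
  requirementIds: string[],
  phases: PhaseRequirements[]
): { lines: string[]; unmapped: string[] } {
  const lines: string[] = [];
  const unmapped: string[] = [];
  for (const reqId of requirementIds) {
    const covering = phases
      .filter((p) => p.requirements.indexOf(reqId) !== -1)
      .map((p) => "Phase " + p.phase);
    if (covering.length === 0) {
      unmapped.push(reqId);
    } else {
      lines.push(reqId + " -> " + covering.join(", "));
    }
  }
  return { lines: lines, unmapped: unmapped };
}
```

Given requirements `AUTH-01`, `AUTH-02`, `DATA-01` and two phases covering `AUTH-01` and `AUTH-01` plus `DATA-01`, the result lists `AUTH-01 -> Phase 1, Phase 2` and `DATA-01 -> Phase 2`, with `AUTH-02` left in `unmapped` to be added to a phase or marked out-of-scope.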
82
|
+
### 6. MILESTONE -- Group Into Milestones
|
|
83
|
+
|
|
84
|
+
Group phases into milestones that represent user-visible releases:
|
|
80
85
|
|
|
81
86
|
```markdown
|
|
82
87
|
## Milestones
|
|
83
|
-
|
|
84
88
|
- **v1.0 MVP** -- Phases 1-4
|
|
85
89
|
- **v1.1 Polish** -- Phases 5-7
|
|
86
|
-
- **v2.0 Scale** -- Phases 8-10
|
|
87
90
|
```
|
|
88
91
|
|
|
89
|
-
###
|
|
92
|
+
### 7. VALIDATE -- Check the Roadmap
|
|
93
|
+
|
|
94
|
+
| Check | How to Verify |
|
|
95
|
+
|-------|--------------|
|
|
96
|
+
| Every phase has success criteria | Read each phase detail section |
|
|
97
|
+
| Dependencies are acyclic | Trace the dependency chain -- no loops |
|
|
98
|
+
| Phase numbering is sequential | Numbers increase, no gaps larger than 1 |
|
|
99
|
+
| Milestones cover all phases | Every phase appears in exactly one milestone |
|
|
100
|
+
| Success criteria are testable | Each criterion can be verified by command, test, or inspection |
|
|
101
|
+
| Requirements are covered | Every requirement ID maps to at least one phase |
|
|
90
102
|
|
|
91
|
-
|
|
103
|
+
## Roadmap Format
|
|
92
104
|
|
|
93
105
|
```markdown
|
|
94
106
|
# Roadmap: {project name}
|
|
95
107
|
|
|
96
108
|
## Overview
|
|
97
|
-
|
|
98
|
-
{2-3 sentences: what the project is, what this roadmap covers, delivery strategy}
|
|
109
|
+
{2-3 sentences: what the project is, what this roadmap covers}
|
|
99
110
|
|
|
100
111
|
## Milestones
|
|
101
|
-
|
|
102
112
|
- **{milestone name}** -- Phases {range} ({status})
|
|
103
113
|
|
|
104
114
|
## Phases
|
|
105
|
-
|
|
106
115
|
- [ ] **Phase {N}: {name}** - {one-line summary}
|
|
107
116
|
|
|
108
117
|
## Phase Details
|
|
109
|
-
|
|
110
118
|
### Phase {N}: {name}
|
|
111
119
|
**Goal**: ...
|
|
112
120
|
**Depends on**: ...
|
|
@@ -114,51 +122,25 @@ Assemble the complete ROADMAP.md:
|
|
|
114
122
|
**Success Criteria** (what must be TRUE):
|
|
115
123
|
1. ...
|
|
116
124
|
**Plans**: TBD
|
|
117
|
-
```
|
|
118
125
|
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
| Check | How to Verify |
|
|
124
|
-
|-------|--------------|
|
|
125
|
-
| Every phase has success criteria | Read each phase detail section |
|
|
126
|
-
| Dependencies are acyclic | Trace the dependency chain -- no loops |
|
|
127
|
-
| Phase numbering is sequential | Numbers increase, no gaps larger than 1 |
|
|
128
|
-
| Milestones cover all phases | Every phase appears in exactly one milestone |
|
|
129
|
-
| Success criteria are testable | Each criterion can be verified by command, test, or inspection |
|
|
126
|
+
## Coverage Map
|
|
127
|
+
REQUIREMENT-ID -> Phase N
|
|
128
|
+
```
|
|
130
129
|
|
|
131
130
|
## Common Pitfalls
|
|
132
131
|
|
|
133
132
|
| Pitfall | Why It Fails |
|
|
134
133
|
|---------|-------------|
|
|
135
134
|
| "We don't know enough to plan" | Plan what you know. Unknown phases get a research spike first. |
|
|
136
|
-
| "The roadmap will change anyway" | Plans change -- that is expected. No plan guarantees drift. |
|
|
137
135
|
| "Success criteria are too rigid" | Vague criteria are useless. Rigid criteria are adjustable. |
|
|
138
136
|
| "One big phase is simpler" | Big phases hide complexity and delay feedback. Split them. |
|
|
139
137
|
| "Dependencies are obvious" | Obvious to you now. Not obvious to the agent running phase 5 next week. |
|
|
140
138
|
| "We'll add details later" | Later never comes. Write the details now while context is fresh. |
|
|
141
139
|
|
|
142
|
-
Stop if you catch yourself writing a phase without success criteria, creating phases longer than 3 days of work, skipping dependency declarations, writing vague criteria like "code is good"
|
|
143
|
-
|
|
144
|
-
## Verification
|
|
145
|
-
|
|
146
|
-
Before finalizing a roadmap, confirm:
|
|
147
|
-
|
|
148
|
-
- [ ] Every phase has a number, name, goal, dependencies, and success criteria
|
|
149
|
-
- [ ] Success criteria are testable (verifiable by command, test, or inspection)
|
|
150
|
-
- [ ] Dependencies form a DAG (no circular dependencies)
|
|
151
|
-
- [ ] Phase numbering follows MAXSIM convention (01, 01A, 01B, 01.1, etc.)
|
|
152
|
-
- [ ] Phases are 1-3 days of work each
|
|
153
|
-
- [ ] Milestones group phases into coherent deliverables
|
|
154
|
-
- [ ] ROADMAP.md matches the expected format for MAXSIM CLI parsing
|
|
155
|
-
- [ ] Overview section summarizes the project and delivery strategy
|
|
140
|
+
Stop if you catch yourself writing a phase without success criteria, creating phases longer than 3 days of work, skipping dependency declarations, or writing vague criteria like "code is good".
|
|
156
141
|
|
|
157
142
|
## MAXSIM Integration
|
|
158
143
|
|
|
159
|
-
Roadmap writing integrates with the MAXSIM lifecycle:
|
|
160
|
-
- Use during project initialization to create the initial roadmap
|
|
161
|
-
- Use when restructuring after a significant scope change or pivot
|
|
162
144
|
- The roadmap is read by MAXSIM agents via `roadmap read` -- format compliance is mandatory
|
|
163
|
-
- Phase numbering must be parseable by `normalizePhaseName()` and `comparePhaseNum()`
|
|
145
|
+
- Phase numbering must be parseable by `normalizePhaseName()` and `comparePhaseNum()`
|
|
164
146
|
- Config `model_profile` in `.planning/config.json` affects agent assignment per phase
|
|
@@ -1,84 +1,66 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: sdd
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
4
|
+
Spec-driven development with fresh-agent-per-task execution. Prevents context
|
|
5
|
+
rot by isolating each task in a clean context window with its spec. Use when
|
|
6
|
+
executing multi-task plans, orchestrating agent work, or when context
|
|
7
|
+
accumulation degrades quality.
|
|
7
8
|
---
|
|
8
9
|
|
|
9
|
-
# Spec-Driven
|
|
10
|
+
# Spec-Driven Development (SDD)
|
|
10
11
|
|
|
11
|
-
Execute tasks sequentially, each in a fresh
|
|
12
|
+
Execute tasks sequentially, each in a fresh agent with clean context. Verify every task before moving to the next.
|
|
12
13
|
|
|
13
|
-
|
|
14
|
+
## Why SDD
|
|
14
15
|
|
|
15
|
-
|
|
16
|
+
Context rot is the primary failure mode for multi-task execution. As an agent processes more tasks, earlier context competes with later instructions. Quality degrades predictably after 3-5 tasks in a single context window. SDD solves this by giving each task a fresh context with only its specification.
|
|
17
|
+
|
|
18
|
+
## The SDD Process
|
|
16
19
|
|
|
17
20
|
### 1. LOAD -- Read the Plan
|
|
18
21
|
|
|
19
22
|
- Read the plan file (PLAN.md) to get the ordered task list
|
|
20
|
-
- For each task
|
|
21
|
-
- Confirm task order
|
|
23
|
+
- For each task: description, acceptance criteria, relevant files
|
|
24
|
+
- Confirm task order respects dependencies
|
|
22
25
|
|
|
23
26
|
### 2. DISPATCH -- Spawn Fresh Agent Per Task
|
|
24
27
|
|
|
25
28
|
For each task in order:
|
|
26
29
|
|
|
27
|
-
1. Assemble
|
|
30
|
+
1. Assemble minimal task context:
|
|
28
31
|
- Task description and acceptance criteria from the plan
|
|
29
32
|
- Only the files relevant to this specific task
|
|
30
33
|
- Results from previous tasks (commit hashes, created files) -- NOT the full previous context
|
|
31
34
|
2. Spawn a fresh agent with this minimal context
|
|
32
|
-
3. The agent implements the task, runs
|
|
35
|
+
3. The agent implements the task, runs verification, and commits
|
|
33
36
|
|
|
34
37
|
### 3. REVIEW -- Two-Stage Quality Gate
|
|
35
38
|
|
|
36
|
-
After each task completes
|
|
37
|
-
|
|
38
|
-
**Stage 1: Spec Compliance**
|
|
39
|
-
|
|
40
|
-
- Does the implementation match the task description?
|
|
41
|
-
- Are all acceptance criteria met?
|
|
42
|
-
- Were only the specified files modified (no scope creep)?
|
|
43
|
-
- Do the changes align with the plan's intent?
|
|
44
|
-
|
|
45
|
-
Verdict: PASS or FAIL with specific issues.
|
|
39
|
+
After each task completes:
|
|
46
40
|
|
|
47
|
-
**Stage
|
|
41
|
+
**Stage 1: Spec Compliance** -- Does the implementation match the task spec? Are all acceptance criteria met? Were only specified files modified?
|
|
48
42
|
|
|
49
|
-
|
|
50
|
-
- Is the code readable and consistent with codebase conventions?
|
|
51
|
-
- Are there unnecessary complications or dead code?
|
|
52
|
-
- Do all tests pass?
|
|
43
|
+
**Stage 2: Code Quality** -- Are there bugs, edge cases, or error handling gaps? Is the code consistent with codebase conventions? Do all tests pass?
|
|
53
44
|
|
|
54
|
-
Verdict: PASS or FAIL with specific issues.
|
|
45
|
+
Verdict: PASS or FAIL with specific issues per stage.
|
|
55
46
|
|
|
56
47
|
### 4. FIX -- Address Review Failures
|
|
57
48
|
|
|
58
49
|
If either review stage fails:
|
|
59
50
|
|
|
60
|
-
1. Spawn a NEW fresh agent with
|
|
61
|
-
2.
|
|
62
|
-
3. Re-run both review stages
|
|
63
|
-
4. If 3 fix attempts fail: STOP and escalate
|
|
51
|
+
1. Spawn a NEW fresh agent with original task spec + review feedback + current file state
|
|
52
|
+
2. Fix agent addresses ONLY the review issues -- no new features
|
|
53
|
+
3. Re-run both review stages
|
|
54
|
+
4. If 3 fix attempts fail: STOP and escalate
|
|
64
55
|
|
|
65
56
|
### 5. ADVANCE -- Move to Next Task
|
|
66
57
|
|
|
67
58
|
Only after both review stages pass:
|
|
68
59
|
|
|
69
|
-
- Record
|
|
70
|
-
-
|
|
71
|
-
- Pass this minimal summary (not full context) to the next task's agent
|
|
72
|
-
|
|
73
|
-
### 6. REPORT -- Final Summary
|
|
74
|
-
|
|
75
|
-
After all tasks complete:
|
|
76
|
-
|
|
77
|
-
- List each task with its status and commit hash
|
|
78
|
-
- Note any tasks that required fix iterations
|
|
79
|
-
- Summarize the total changes made
|
|
60
|
+
- Record task as complete with commit hash
|
|
61
|
+
- Pass minimal summary (not full context) to the next task
|
|
80
62
|
|
|
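Steps 2 through 5 above form a loop: dispatch, review, fix up to three times, then advance or escalate. The sketch below models that control flow only. All names (`SddTask`, `SddReview`, `SddAgents`, `runSdd`) are hypothetical, and the agent calls are synchronous placeholders for whatever actually spawns a fresh agent.

```typescript
// Hypothetical shapes; none of these names come from the MAXSIM package.
interface SddTask {
  id: string;
  spec: string;    // description plus acceptance criteria
  files: string[]; // only the files relevant to this task
}

interface SddReview {
  specCompliance: boolean; // Stage 1 verdict
  codeQuality: boolean;    // Stage 2 verdict
  issues: string[];
}

interface SddAgents {
  // Each call stands for spawning a fresh agent with only the given context.
  implement(task: SddTask, priorCommits: string[]): string; // returns commit hash
  review(task: SddTask, commit: string): SddReview;
  fix(task: SddTask, issues: string[], commit: string): string; // new commit hash
}

const MAX_FIX_ATTEMPTS = 3;

function runSdd(
  tasks: SddTask[],
  agents: SddAgents
): { commits: string[]; escalated: string | null } {
  const commits: string[] = [];
  for (const task of tasks) {
    // Only commit hashes travel forward, never the previous conversation.
    let commit = agents.implement(task, commits.slice());
    let review = agents.review(task, commit);
    let attempts = 0;
    while (!(review.specCompliance && review.codeQuality)) {
      if (attempts === MAX_FIX_ATTEMPTS) {
        // Three fix attempts failed: stop and escalate this task.
        return { commits: commits, escalated: task.id };
      }
      attempts += 1;
      commit = agents.fix(task, review.issues, commit);
      review = agents.review(task, commit); // re-run both stages
    }
    commits.push(commit); // advance only after both stages pass
  }
  return { commits: commits, escalated: null };
}
```

Passing the agents in as an interface keeps the loop testable with stubs and makes the context boundary explicit: each call receives the task, the review issues, and commit hashes, nothing else.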
81
|
-
## Context Management
|
|
63
|
+
## Context Management
|
|
82
64
|
|
|
83
65
|
Each agent receives ONLY what it needs:
|
|
84
66
|
|
|
@@ -89,38 +71,21 @@ Each agent receives ONLY what it needs:
|
|
|
89
71
|
| Previous task commit hashes | Always |
|
|
90
72
|
| Previous task full diff | Never |
|
|
91
73
|
| Previous task agent conversation | Never |
|
|
92
|
-
| PROJECT.md / REQUIREMENTS.md | Only if task references project-level concerns |
|
|
93
74
|
| Full codebase | Never -- only specified files |
|
|
94
75
|
|
|
95
76
|
The point of SDD is fresh context. Loading the previous agent's full context defeats the purpose.
|
|
96
77
|
|
|
78
|
+
## When to Use SDD
|
|
79
|
+
|
|
80
|
+
- **Good fit:** Multi-task plans (3+ tasks), sequential work where each task builds on the previous, implementations where quality degrades over time
|
|
81
|
+
- **Poor fit:** Single-task work, highly interactive tasks requiring user feedback, tasks that share significant overlapping context
|
|
82
|
+
|
|
97
83
|
## Common Pitfalls
|
|
98
84
|
|
|
99
|
-
| Pitfall | Why
|
|
85
|
+
| Pitfall | Why It Matters |
|
|
100
86
|
|---|---|
|
|
101
|
-
| Skipping review for simple tasks | Simple tasks still have bugs. Review
|
|
102
|
-
| Passing full context forward | Full context causes
|
|
87
|
+
| Skipping review for simple tasks | Simple tasks still have bugs. Review catches what the implementer missed. |
|
|
88
|
+
| Passing full context forward | Full context causes the exact rot SDD is designed to prevent. |
|
|
103
89
|
| Deferring fixes to the next task | The next task's agent does not know about the bug. Fix it now. |
|
|
104
|
-
| Accumulating fix-later items across tasks | Each task must be clean before the next starts. |
|
|
105
|
-
|
|
106
|
-
## Verification
|
|
107
|
-
|
|
108
|
-
Before reporting completion, confirm:
|
|
109
|
-
|
|
110
|
-
- [ ] Every task was executed by a fresh agent with minimal context
|
|
111
|
-
- [ ] Every task passed both spec compliance and code quality review
|
|
112
|
-
- [ ] No task was skipped or started before the previous task passed review
|
|
113
|
-
- [ ] Fix iterations (if any) are documented
|
|
114
|
-
- [ ] All tests pass after the final task
|
|
115
|
-
- [ ] Summary includes per-task status and commit hashes
|
|
116
|
-
|
|
117
|
-
## MAXSIM Integration
|
|
118
|
-
|
|
119
|
-
When a plan specifies `skill: "sdd"`:
|
|
120
90
|
|
|
121
|
-
-
|
|
122
|
-
- Each task is dispatched to a fresh subagent
|
|
123
|
-
- Two-stage review runs between every task
|
|
124
|
-
- Failed reviews trigger fix agents (up to 3 attempts)
|
|
125
|
-
- Progress is tracked in STATE.md via decision entries
|
|
126
|
-
- Final results are recorded in SUMMARY.md
|
|
91
|
+
See also: `/verification-before-completion` for the evidence-based verification methodology used within each SDD task.
|
|
@@ -1,18 +1,18 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: systematic-debugging
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
unexpected behavior, or
|
|
4
|
+
Systematic debugging via reproduce-hypothesize-isolate-verify-fix cycle.
|
|
5
|
+
Requires evidence at each step. Use when investigating bugs, test failures,
|
|
6
|
+
unexpected behavior, or runtime errors.
|
|
7
7
|
---
|
|
8
8
|
|
|
9
9
|
# Systematic Debugging
|
|
10
10
|
|
|
11
11
|
Find the root cause first. Random fixes waste time and create new bugs.
|
|
12
12
|
|
|
13
|
-
**
|
|
13
|
+
**No fix attempts without understanding root cause.** If you have not completed the REPRODUCE and HYPOTHESIZE steps, you cannot propose a fix.
|
|
14
14
|
|
|
15
|
-
## Process
|
|
15
|
+
## The 5-Step Process
|
|
16
16
|
|
|
17
17
|
### 1. REPRODUCE -- Confirm the Problem
|
|
18
18
|
|
|
@@ -52,6 +52,19 @@ Find the root cause first. Random fixes waste time and create new bugs.
|
|
|
52
52
|
- Run the full test suite: no regressions.
|
|
53
53
|
- Verify the original error no longer occurs.
|
|
54
54
|
|
|
55
|
+
## Hypothesis Testing Protocol
|
|
56
|
+
|
|
57
|
+
For each hypothesis:
|
|
58
|
+
|
|
59
|
+
1. **Form:** "I think X is the root cause because Y."
|
|
60
|
+
2. **Design test:** "If X is the cause, then changing Z should produce W."
|
|
61
|
+
3. **Run test:** Execute the change and observe the result.
|
|
62
|
+
4. **Evaluate:** Did the result match the prediction? If yes, proceed to FIX. If no, form a new hypothesis.
|
|
63
|
+
|
|
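The four protocol steps above map onto a small record with one field per placeholder (X, Y, Z, W). This is an illustrative sketch with hypothetical names; comparing the observed and predicted results as exact strings is a simplification, since in practice that judgment is made by reading the evidence.

```typescript
// Hypothetical record for one pass through the protocol.
interface DebugHypothesis {
  cause: string;     // X: suspected root cause
  because: string;   // Y: evidence that points at X
  change: string;    // Z: the single change made to test it
  predicted: string; // W: result expected if X is the cause
  observed?: string; // filled in after running the test
}

type HypothesisVerdict = "untested" | "confirmed" | "rejected";

// Evaluate: did the observed result match the prediction?
function evaluateHypothesis(h: DebugHypothesis): HypothesisVerdict {
  if (h.observed === undefined) return "untested";
  return h.observed === h.predicted ? "confirmed" : "rejected";
}

// Map the verdict onto the next step named in the protocol.
function nextDebugStep(h: DebugHypothesis): string {
  switch (evaluateHypothesis(h)) {
    case "untested":
      return "run the test";
    case "confirmed":
      return "proceed to FIX";
    case "rejected":
      return "form a new hypothesis";
  }
}
```

Keeping rejected hypotheses as records also provides the "hypotheses tested, evidence gathered" material needed when escalating.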
64
|
+
## Escalation
|
|
65
|
+
|
|
66
|
+
If 3+ fix attempts have failed, the issue is likely architectural. Document what you have tried (hypotheses tested, evidence gathered, fixes attempted) and escalate.
|
|
67
|
+
|
|
55
68
|
## Common Pitfalls
|
|
56
69
|
|
|
57
70
|
| Excuse | Reality |
|
|
@@ -61,25 +74,6 @@ Find the root cause first. Random fixes waste time and create new bugs.
|
|
|
61
74
|
| "Multiple changes at once saves time" | You cannot isolate what worked. You will create new bugs. |
|
|
62
75
|
| "The issue is simple" | Simple bugs have root causes too. The process is fast for simple bugs. |
|
|
63
76
|
|
|
64
|
-
Stop immediately if you catch yourself changing code before reproducing, proposing a fix before reading the full error, trying random fixes, or changing multiple things at once.
|
|
65
|
-
|
|
66
|
-
If 3+ fix attempts have failed, the issue is likely architectural. Document what you have tried and escalate to the user.
|
|
67
|
-
|
|
68
|
-
## Verification
|
|
69
|
-
|
|
70
|
-
Before claiming a bug is fixed, confirm:
|
|
71
|
-
|
|
72
|
-
- [ ] The original error has been reproduced reliably
|
|
73
|
-
- [ ] Root cause has been identified with evidence (not guessed)
|
|
74
|
-
- [ ] A failing test reproduces the bug
|
|
75
|
-
- [ ] A single, targeted fix addresses the root cause
|
|
76
|
-
- [ ] The failing test now passes
|
|
77
|
-
- [ ] The full test suite passes (no regressions)
|
|
78
|
-
- [ ] The original error no longer occurs when running the original steps
|
|
79
|
-
|
|
80
|
-
## MAXSIM Integration
|
|
77
|
+
Stop immediately if you catch yourself changing code before reproducing, proposing a fix before reading the full error, trying random fixes, or changing multiple things at once.
|
|
81
78
|
|
|
82
|
-
|
|
83
|
-
- **Rule 1 (Auto-fix bugs):** You may auto-fix bugs found during execution, but you must still follow this debugging process.
|
|
84
|
-
- **Rule 4 (Architectural changes):** If 3+ fix attempts fail, STOP and return a checkpoint -- this is an architectural decision for the user.
|
|
85
|
-
- Track all debugging deviations for SUMMARY.md documentation.
|
|
79
|
+
See also: `/verification-before-completion` for evidence-based confirmation after fixes.
|
|
@@ -1,18 +1,23 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: tdd
|
|
3
3
|
description: >-
|
|
4
|
-
|
|
5
|
-
failing test first,
|
|
6
|
-
implementing
|
|
4
|
+
Test-driven development with red-green-refactor cycle and atomic commits.
|
|
5
|
+
Write failing test first, then minimal passing code, then refactor. Use when
|
|
6
|
+
implementing business logic, API endpoints, data transformations, validation
|
|
7
|
+
rules, or algorithms.
|
|
7
8
|
---
|
|
8
9
|
|
|
9
10
|
# Test-Driven Development (TDD)
|
|
10
11
|
|
|
11
12
|
Write the test first. Watch it fail. Write minimal code to pass. Clean up.
|
|
12
13
|
|
|
13
|
-
|
|
14
|
+
## When to Use TDD
|
|
14
15
|
|
|
15
|
-
|
|
16
|
+
**Good fit:** Business logic with defined I/O, API endpoints with contracts, data transformations, validation rules, algorithms, state machines.
|
|
17
|
+
|
|
18
|
+
**Poor fit:** UI layout, configuration files, build scripts, one-off scripts, mechanical renames.
|
|
19
|
+
|
|
20
|
+
## The Red-Green-Refactor Cycle
|
|
16
21
|
|
|
17
22
|
### 1. RED -- Write One Failing Test
|
|
18
23
|
|
|
@@ -46,40 +51,27 @@ Write the test first. Watch it fail. Write minimal code to pass. Clean up.
|
|
|
46
51
|
|
|
47
52
|
### 6. REPEAT -- Next failing test for next behavior
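As an illustration only, one pass through the cycle might look like the sketch below. The `slugify` function and its test are hypothetical examples, not part of the package, and Python's `unittest` stands in for whatever test runner the project actually uses.

```python
import unittest


def slugify(title: str) -> str:
    """GREEN: the minimal implementation that makes the test below pass."""
    return title.strip().lower().replace(" ", "-")


class SlugifyTest(unittest.TestCase):
    """RED: this test is written first and observed failing before slugify exists."""

    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("  Hello World "), "hello-world")


def run_suite() -> bool:
    """VERIFY GREEN: run the test case and report whether everything passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

The next behavior (for example, stripping punctuation) would start with a new failing test rather than an extension of `slugify`.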
|
|
48
53
|
|
|
54
|
+
## Commit Pattern
|
|
55
|
+
|
|
56
|
+
Each TDD cycle produces 2-3 atomic commits:
|
|
57
|
+
|
|
58
|
+
- **RED commit:** `test({scope}): add failing test for [feature]`
|
|
59
|
+
- **GREEN commit:** `feat({scope}): implement [feature]`
|
|
60
|
+
- **REFACTOR commit (if changes made):** `refactor({scope}): clean up [feature]`
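A minimal sketch of how the three message templates above could be filled in. The helper name `tdd_commit_messages` and the example scope and feature are assumptions made for illustration; the package itself does not ship this function.

```python
from typing import List


def tdd_commit_messages(scope: str, feature: str, refactored: bool = True) -> List[str]:
    """Build the commit subjects for one TDD cycle, in the order they are made."""
    messages = [
        f"test({scope}): add failing test for {feature}",  # RED commit
        f"feat({scope}): implement {feature}",             # GREEN commit
    ]
    if refactored:
        # The REFACTOR commit is only made when the cleanup changed something.
        messages.append(f"refactor({scope}): clean up {feature}")
    return messages
```

Calling `tdd_commit_messages("auth", "token refresh", refactored=False)` yields the two-commit variant of the cycle.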
|
|
61
|
+
|
|
62
|
+
## Context Budget
|
|
63
|
+
|
|
64
|
+
TDD uses approximately 40% more context than direct implementation due to the RED-GREEN-REFACTOR overhead. Plan accordingly for long task lists.
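To make the planning arithmetic concrete, here is a rough sketch that applies the stated 40% figure to a task list. The token counts and function names are hypothetical, and the overhead is treated as a flat multiplier, which is a simplifying assumption.

```python
from typing import List

TDD_OVERHEAD_PERCENT = 40  # the approximate figure quoted above


def tdd_context_estimate(direct_tokens: int) -> int:
    """Estimate context for a task done with TDD, given its direct-implementation cost."""
    return direct_tokens * (100 + TDD_OVERHEAD_PERCENT) // 100


def tasks_within_budget(direct_costs: List[int], budget: int) -> int:
    """Count how many tasks, taken in order, fit in the budget when done with TDD."""
    used = 0
    count = 0
    for cost in direct_costs:
        used += tdd_context_estimate(cost)
        if used > budget:
            break
        count += 1
    return count
```

Under these assumptions, a task estimated at 10,000 tokens of direct implementation would be budgeted at 14,000 tokens with TDD.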
|
|
65
|
+
|
|
49
66
|
## Common Pitfalls
|
|
50
67
|
|
|
51
|
-
| Excuse | Why
|
|
68
|
+
| Excuse | Why It Fails |
|
|
52
69
|
|--------|-------------|
|
|
53
70
|
| "Too simple to test" | Simple code breaks. The test takes 30 seconds. |
|
|
54
71
|
| "I'll add tests after" | Tests written after pass immediately -- they prove nothing. |
|
|
55
72
|
| "I know the code works" | Knowledge is not evidence. A passing test is evidence. |
|
|
56
73
|
| "TDD is slower" | TDD is faster than debugging. Every skip creates debt. |
|
|
57
|
-
| "Let me keep the code as reference" | You will adapt it instead of writing test-first. Delete means delete. |
|
|
58
|
-
|
|
59
|
-
Stop immediately if you catch yourself:
|
|
60
|
-
|
|
61
|
-
- Writing implementation code before writing a test
|
|
62
|
-
- Writing a test that passes on the first run
|
|
63
|
-
- Skipping the VERIFY RED step
|
|
64
|
-
- Adding features beyond what the current test requires
|
|
65
|
-
- Keeping pre-TDD code "as reference"
|
|
66
|
-
|
|
67
|
-
## Verification
|
|
68
|
-
|
|
69
|
-
Before claiming TDD compliance, confirm:
|
|
70
|
-
|
|
71
|
-
- [ ] Every new function/method has a corresponding test
|
|
72
|
-
- [ ] Each test was written BEFORE its implementation
|
|
73
|
-
- [ ] Each test was observed to FAIL before implementation was written
|
|
74
|
-
- [ ] Each test failed for the expected reason (missing behavior, not syntax error)
|
|
75
|
-
- [ ] Minimal code was written to pass each test
|
|
76
|
-
- [ ] All tests pass after implementation
|
|
77
|
-
- [ ] Refactoring (if any) did not break any tests
|
|
78
|
-
|
|
79
|
-
## MAXSIM Integration
|
|
80
74
|
|
|
81
|
-
|
|
75
|
+
Stop immediately if you catch yourself writing implementation code before writing a test, writing a test that passes on the first run, skipping the VERIFY RED step, or adding features beyond what the current test requires.
|
|
82
76
|
|
|
83
|
-
-
|
|
84
|
-
- **GREEN commit:** `feat({phase}-{plan}): implement [feature]`
|
|
85
|
-
- **REFACTOR commit (if changes made):** `refactor({phase}-{plan}): clean up [feature]`
|
|
77
|
+
See also: `/verification-before-completion` for evidence-based completion claims after TDD cycles.
|
|
@@ -0,0 +1,80 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: tool-priority-guide
|
|
3
|
+
description: >-
|
|
4
|
+
Tool selection guide for Claude Code operations. Maps common tasks to preferred
|
|
5
|
+
tools, explaining when to use Read over cat, Grep over rg, Glob over find,
|
|
6
|
+
Write over echo, and Edit over sed. Use when deciding which tool to use for
|
|
7
|
+
file operations, search, content modification, or web content retrieval.
|
|
8
|
+
user-invocable: false
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Tool Priority Guide
|
|
12
|
+
|
|
13
|
+
Use dedicated Claude Code tools over Bash equivalents. Dedicated tools provide better permissions handling, output formatting, and user experience.
|
|
14
|
+
|
|
15
|
+
## File Reading
|
|
16
|
+
|
|
17
|
+
| Task | Use | Not |
|
|
18
|
+
|------|-----|-----|
|
|
19
|
+
| Read file contents | **Read tool** | `cat`, `head`, `tail` via Bash |
|
|
20
|
+
| Read specific lines | **Read tool** (with offset/limit) | `sed -n 'X,Yp'` via Bash |
|
|
21
|
+
| Read images | **Read tool** (multimodal) | Not possible via Bash |
|
|
22
|
+
| Read PDFs | **Read tool** (with pages param) | `pdftotext` via Bash |
|
|
23
|
+
|
|
24
|
+
**Why Read:** Handles permissions, large files, binary formats. Returns line-numbered output.
|
|
25
|
+
|
|
26
|
+
## File Writing
|
|
27
|
+
|
|
28
|
+
| Task | Use | Not |
|
|
29
|
+
|------|-----|-----|
|
|
30
|
+
| Create new file | **Write tool** | `echo > file`, `cat <<EOF` via Bash |
|
|
31
|
+
| Rewrite entire file | **Write tool** (after Read) | `cat > file` via Bash |
|
|
32
|
+
| Modify part of file | **Edit tool** | `sed`, `awk` via Bash |
|
|
33
|
+
| Rename string across file | **Edit tool** (replace_all) | `sed -i 's/old/new/g'` via Bash |
|
|
34
|
+
|
|
35
|
+
**Why Write/Edit:** Atomic operations, preserves encoding, provides diff view for review.
|
|
36
|
+
|
|
37
|
+
## Searching
|
|
38
|
+
|
|
39
|
+
| Task | Use | Not |
|
|
40
|
+
|------|-----|-----|
|
|
41
|
+
| Search file contents | **Grep tool** | `grep`, `rg` via Bash |
|
|
42
|
+
| Find files by pattern | **Glob tool** | `find`, `ls -R` via Bash |
|
|
43
|
+
| Search with context | **Grep tool** (-A, -B, -C params) | `grep -C N` via Bash |
|
|
44
|
+
| Count matches | **Grep tool** (output_mode: count) | `grep -c` via Bash |
|
|
45
|
+
|
|
46
|
+
**Why Grep/Glob:** Optimized permissions, structured output, result limiting.
|
|
47
|
+
|
|
48
|
+
## Web Content
|
|
49
|
+
|
|
50
|
+
| Task | Use | Not |
|
|
51
|
+
|------|-----|-----|
|
|
52
|
+
| Fetch documentation | **WebFetch tool** | `curl` via Bash |
|
|
53
|
+
| Read API responses | **WebFetch tool** | `curl | jq` via Bash |
|
|
54
|
+
| Download files | **Bash** (`curl -O`) | WebFetch (not for binary downloads) |
|
|
55
|
+
|
|
56
|
+
**Why WebFetch:** Handles authentication, follows redirects, parses HTML.
|
|
57
|
+
|
|
58
|
+
## When Bash IS the Right Tool
|
|
59
|
+
|
|
60
|
+
| Task | Why Bash |
|
|
61
|
+
|------|---------|
|
|
62
|
+
| Run build/test commands | `npm test`, `npm run build` -- no dedicated tool |
|
|
63
|
+
| Git operations | `git status`, `git commit` -- no dedicated tool |
|
|
64
|
+
| Install dependencies | `npm install` -- no dedicated tool |
|
|
65
|
+
| Check file existence | `test -f path` -- lightweight, often part of larger commands |
|
|
66
|
+
| Run project CLI tools | Project-specific commands -- no dedicated tool |
|
|
67
|
+
| Chained operations | Multiple sequential commands with `&&` |
|
|
68
|
+
|
|
69
|
+
## Quick Reference
|
|
70
|
+
|
|
71
|
+
```
|
|
72
|
+
Read file --> Read tool
|
|
73
|
+
Write file --> Write tool (new) or Edit tool (modify)
|
|
74
|
+
Search code --> Grep tool
|
|
75
|
+
Find files --> Glob tool
|
|
76
|
+
Fetch URL --> WebFetch tool
|
|
77
|
+
Run commands --> Bash tool
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
The general principle: if a dedicated tool exists for the operation, use it. Fall back to Bash only when no dedicated tool covers the task.
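The fallback rule can be sketched as a simple lookup. The task labels and the `pick_tool` helper are illustrative assumptions rather than anything Claude Code or the package exposes; only the tool names come from the Quick Reference above.

```python
# Task labels are hypothetical; tool names mirror the Quick Reference.
PREFERRED_TOOL = {
    "read file": "Read",
    "create file": "Write",
    "modify file": "Edit",
    "search code": "Grep",
    "find files": "Glob",
    "fetch url": "WebFetch",
}


def pick_tool(task: str) -> str:
    """Return the dedicated tool for a task, falling back to Bash when none exists."""
    return PREFERRED_TOOL.get(task.strip().lower(), "Bash")
```

Anything without a dedicated entry, such as `pick_tool("git commit")`, falls through to Bash, matching the principle stated above.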
|