maxsimcli 4.0.1 → 4.0.2
This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
- package/dist/assets/CHANGELOG.md +7 -0
- package/dist/assets/templates/skills/batch-worktree/SKILL.md +38 -85
- package/dist/assets/templates/skills/brainstorming/SKILL.md +44 -114
- package/dist/assets/templates/skills/code-review/SKILL.md +43 -71
- package/dist/assets/templates/skills/memory-management/SKILL.md +36 -100
- package/dist/assets/templates/skills/roadmap-writing/SKILL.md +39 -73
- package/dist/assets/templates/skills/sdd/SKILL.md +36 -85
- package/dist/assets/templates/skills/simplify/SKILL.md +96 -139
- package/dist/assets/templates/skills/systematic-debugging/SKILL.md +41 -74
- package/dist/assets/templates/skills/tdd/SKILL.md +32 -65
- package/dist/assets/templates/skills/using-maxsim/SKILL.md +26 -39
- package/dist/assets/templates/skills/verification-before-completion/SKILL.md +37 -56
- package/package.json +1 -1
|
@@ -1,185 +1,142 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: simplify
|
|
3
|
-
description:
|
|
3
|
+
description: >-
|
|
4
|
+
Reviews changed code for reuse opportunities, unnecessary complexity, and
|
|
5
|
+
dead weight using three parallel review agents. Use when reviewing code
|
|
6
|
+
before committing, cleaning up implementations, or preparing changes for
|
|
7
|
+
review.
|
|
4
8
|
---
|
|
5
9
|
|
|
6
10
|
# Simplify
|
|
7
11
|
|
|
8
|
-
Every line of code is a liability.
|
|
12
|
+
Every line of code is a liability. Remove what does not earn its place.
|
|
9
13
|
|
|
10
|
-
**If you have not
|
|
14
|
+
**HARD GATE**: No code ships without a simplification pass. If you have not checked for duplication, dead code, and unnecessary complexity, the change is not ready. "It works" is the starting point, not the finish line.
|
|
11
15
|
|
|
12
|
-
##
|
|
16
|
+
## When to Use
|
|
13
17
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
Violating this rule is a violation — not a preference.
|
|
19
|
-
</HARD-GATE>
|
|
18
|
+
- After implementing a feature or fix, before committing
|
|
19
|
+
- When preparing changes for code review
|
|
20
|
+
- When cleaning up code that has grown organically over multiple iterations
|
|
21
|
+
- When onboarding to a file and noticing accumulated complexity
|
|
20
22
|
|
|
21
|
-
|
|
23
|
+
Do NOT use this skill when:
|
|
24
|
+
- Making a hotfix where speed matters more than polish (file a follow-up instead)
|
|
25
|
+
- The changes are purely mechanical (renames, formatting, dependency bumps)
|
|
22
26
|
|
|
23
|
-
|
|
27
|
+
## Process
|
|
24
28
|
|
|
25
|
-
### 1. DIFF —
|
|
29
|
+
### 1. DIFF — Identify What Changed
|
|
26
30
|
|
|
27
|
-
-
|
|
28
|
-
- Read
|
|
29
|
-
-
|
|
31
|
+
- Collect the set of modified and added files
|
|
32
|
+
- Read each file in full, not just the changed hunks
|
|
33
|
+
- Note files that interact with the changes (callers, consumers, shared modules)
|
|
30
34
|
|
|
31
|
-
|
|
32
|
-
# Review staged changes
|
|
33
|
-
git diff --staged
|
|
34
|
-
# Or all uncommitted changes
|
|
35
|
-
git diff
|
|
36
|
-
```
|
|
35
|
+
### 2. DUPLICATION — Eliminate Repeated Logic
|
|
37
36
|
|
|
38
|
-
|
|
37
|
+
- Are there patterns repeated across files that should be a shared helper?
|
|
38
|
+
- Does new code duplicate existing utilities or library functions?
|
|
39
|
+
- Could two similar implementations be merged behind a single interface?
|
|
40
|
+
- Is there copy-paste that should be refactored?
|
|
39
41
|
|
|
40
|
-
|
|
41
|
-
- Are there two or more blocks that do similar things and could share a helper?
|
|
42
|
-
- Could an existing utility, library function, or helper replace new code?
|
|
43
|
-
- Is there copy-paste from another file that should be extracted?
|
|
42
|
+
**Rule of three**: If the same pattern appears three times, extract it.
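As a rough illustration of the rule of three, the sketch below assumes the same trim/lowercase/dash pattern had been written out at three call sites and is now extracted into one helper. The names `normalizeKey`, `userKey`, `teamKey`, and `projectKey` are hypothetical and are not part of maxsim.

```typescript
// Hypothetical shared helper, extracted once the same normalization
// logic had appeared at three separate call sites.
function normalizeKey(raw: string): string {
  return raw.trim().toLowerCase().replace(/\s+/g, "-");
}

// The three former copies now delegate to the helper instead of
// repeating the trim/lowercase/replace chain inline.
const userKey = (name: string): string => `user:${normalizeKey(name)}`;
const teamKey = (name: string): string => `team:${normalizeKey(name)}`;
const projectKey = (name: string): string => `project:${normalizeKey(name)}`;
```

With two occurrences the inline copies might reasonably stay; the third occurrence is the point at which the skill asks for extraction.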
|
|
44
43
|
|
|
45
|
-
|
|
44
|
+
### 3. DEAD CODE — Remove What Is Not Called
|
|
46
45
|
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
-
|
|
50
|
-
-
|
|
51
|
-
- Are there unreachable branches or impossible conditions?
|
|
52
|
-
- Are there parameters that are always passed the same value?
|
|
53
|
-
|
|
54
|
-
**If it is not called, it does not belong. Delete it.**
|
|
46
|
+
- Delete unused imports, variables, functions, and parameters
|
|
47
|
+
- Remove commented-out code blocks (version control is the archive)
|
|
48
|
+
- Strip unreachable branches and impossible conditions
|
|
49
|
+
- Drop feature flags and configuration for features that no longer exist
|
|
55
50
|
|
|
56
51
|
### 4. COMPLEXITY — Question Every Abstraction
|
|
57
52
|
|
|
58
|
-
-
|
|
59
|
-
-
|
|
60
|
-
-
|
|
61
|
-
-
|
|
53
|
+
- Does every wrapper, adapter, or indirection layer justify its existence?
|
|
54
|
+
- Are there generics or parametrization that serve only one concrete case?
|
|
55
|
+
- Could a 20-line class be replaced by a 3-line function?
|
|
56
|
+
- Is there defensive programming that guards against conditions that cannot occur?
|
|
62
57
|
|
|
63
|
-
**
|
|
58
|
+
**If removing it does not break anything, it should not be there.**
|
|
64
59
|
|
|
65
|
-
### 5. CLARITY —
|
|
60
|
+
### 5. CLARITY — Tighten Naming and Structure
|
|
66
61
|
|
|
67
|
-
- Are
|
|
68
|
-
- Could
|
|
69
|
-
-
|
|
70
|
-
-
|
|
62
|
+
- Are names self-documenting? Rename anything that needs a comment to explain.
|
|
63
|
+
- Could nested logic be flattened with early returns?
|
|
64
|
+
- Is control flow straightforward, or does it require tracing to understand?
|
|
65
|
+
- Are there layers of indirection that obscure the data path?
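The early-return question above can be sketched as follows. This is a minimal example under assumed names (`Order`, `shippingLabelNested`, `shippingLabel` are hypothetical); both functions are meant to behave identically, so the change is a simplification rather than a behavioral change.

```typescript
// Hypothetical order shape used only for this illustration.
interface Order {
  paid: boolean;
  items: string[];
}

// Before: each condition adds a level of nesting.
function shippingLabelNested(order: Order | null): string {
  if (order) {
    if (order.paid) {
      if (order.items.length > 0) {
        return `ship ${order.items.length} item(s)`;
      } else {
        return "nothing to ship";
      }
    } else {
      return "awaiting payment";
    }
  } else {
    return "no order";
  }
}

// After: guard clauses handle the exceptional cases first,
// leaving the main path unindented at the end.
function shippingLabel(order: Order | null): string {
  if (!order) return "no order";
  if (!order.paid) return "awaiting payment";
  if (order.items.length === 0) return "nothing to ship";
  return `ship ${order.items.length} item(s)`;
}
```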
|
|
71
66
|
|
|
72
|
-
### 6.
|
|
67
|
+
### 6. REVIEW — Final Sanity Check
|
|
73
68
|
|
|
74
|
-
-
|
|
75
|
-
-
|
|
76
|
-
-
|
|
77
|
-
- Is data being transformed multiple times when once would suffice?
|
|
69
|
+
- Re-read the simplified code end to end
|
|
70
|
+
- Confirm all tests still pass
|
|
71
|
+
- Verify no behavioral changes were introduced (simplify, do not alter)
|
|
78
72
|
|
|
79
|
-
|
|
73
|
+
## Parallel 3-Reviewer Pattern
|
|
80
74
|
|
|
81
|
-
|
|
75
|
+
When invoked as part of the execute-phase cycle, simplification runs as three parallel review agents, each focused on one dimension.
|
|
82
76
|
|
|
83
|
-
|
|
84
|
-
|----------|----------|---------------|
|
|
85
|
-
| Duplication | Same logic exists elsewhere? | Extract shared helper or reuse existing |
|
|
86
|
-
| Duplication | Copy-paste from another file? | Extract to shared module |
|
|
87
|
-
| Dead code | Unused imports or variables? | Delete them |
|
|
88
|
-
| Dead code | Commented-out code? | Delete it (git has history) |
|
|
89
|
-
| Complexity | Abstraction for one call site? | Inline it |
|
|
90
|
-
| Complexity | Generic where specific suffices? | Simplify to specific |
|
|
91
|
-
| Complexity | Config for hypothetical needs? | Remove until needed |
|
|
92
|
-
| Clarity | Confusing variable name? | Rename to describe purpose |
|
|
93
|
-
| Clarity | Deep nesting? | Flatten with early returns |
|
|
94
|
-
| Efficiency | O(n^2) with obvious O(n) fix? | Fix it now |
|
|
95
|
-
| Efficiency | Same computation in a loop? | Hoist outside the loop |
|
|
77
|
+
### Reviewer 1: Code Reuse
|
|
96
78
|
|
|
97
|
-
|
|
79
|
+
- Scan all changed files for duplicated patterns
|
|
80
|
+
- Cross-reference against existing shared utilities and helpers
|
|
81
|
+
- Flag any logic that appears three or more times without extraction
|
|
82
|
+
- **Output**: List of reuse opportunities with file paths and line ranges
|
|
98
83
|
|
|
99
|
-
|
|
100
|
-
|--------|--------------------------|
|
|
101
|
-
| "It works, don't touch it" | Working is the minimum bar. Simplify before it becomes legacy. |
|
|
102
|
-
| "We might need the flexibility later" | YAGNI. Add flexibility when you need it, not before. |
|
|
103
|
-
| "Refactoring is risky" | Small simplifications with passing tests are safe. Large refactors are a separate task. |
|
|
104
|
-
| "The duplication is minor" | Minor duplication compounds. Three occurrences is the threshold, not six. |
|
|
105
|
-
| "I'll clean it up in a follow-up" | Follow-ups rarely happen. Simplify now while context is fresh. |
|
|
106
|
-
| "It's just a few extra lines" | Every unnecessary line is maintenance cost. Delete it. |
|
|
84
|
+
### Reviewer 2: Code Quality
|
|
107
85
|
|
|
108
|
-
|
|
86
|
+
- Check for dead code: unused imports, unreachable branches, commented blocks
|
|
87
|
+
- Verify naming consistency with codebase conventions
|
|
88
|
+
- Flag unnecessary abstractions, wrappers, and indirection
|
|
89
|
+
- **Output**: List of quality issues categorized by severity
|
|
109
90
|
|
|
110
|
-
|
|
111
|
-
- Keeping dead code "just in case"
|
|
112
|
-
- Adding a utility class for a single use case
|
|
113
|
-
- Building configuration for features no one has requested
|
|
114
|
-
- Writing a comment to explain code that could be rewritten to be self-evident
|
|
115
|
-
- Skipping simplification because "it's good enough"
|
|
91
|
+
### Reviewer 3: Efficiency
|
|
116
92
|
|
|
117
|
-
|
|
93
|
+
- Identify over-engineered solutions (parametrization serving one case, generic interfaces with one implementor)
|
|
94
|
+
- Flag defensive programming that guards impossible conditions
|
|
95
|
+
- Check for configuration and feature flags that serve no current purpose
|
|
96
|
+
- **Output**: List of efficiency issues with suggested removals
|
|
118
97
|
|
|
119
|
-
|
|
98
|
+
### Consolidation
|
|
120
99
|
|
|
121
|
-
|
|
100
|
+
After all three reviewers complete:
|
|
101
|
+
1. Merge findings into a deduplicated list
|
|
102
|
+
2. Apply fixes for all actionable items (BLOCKER and HIGH priority first)
|
|
103
|
+
3. Re-run tests to confirm nothing broke
|
|
104
|
+
4. Report status: CLEAN (nothing found), FIXED (issues resolved), or BLOCKED (cannot simplify without architectural changes)
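One way the consolidation steps above might look in code is sketched below. The `Finding` shape, the deduplication key, and the mapping from counts to a status are assumptions made for illustration; the skill text only names the severities and the three status values, and the MEDIUM level is taken from the earlier version of the file.

```typescript
type Severity = "BLOCKER" | "HIGH" | "MEDIUM";
type Status = "CLEAN" | "FIXED" | "BLOCKED";

// Hypothetical shape of one reviewer finding.
interface Finding {
  file: string;
  line: number;
  message: string;
  severity: Severity;
}

// Lower rank sorts first, so BLOCKER and HIGH items are handled first.
const severityRank: Record<Severity, number> = { BLOCKER: 0, HIGH: 1, MEDIUM: 2 };

// Merge the three reviewer reports into one deduplicated, prioritized list.
function consolidate(reports: Finding[][]): Finding[] {
  const seen = new Map<string, Finding>();
  for (const report of reports) {
    for (const finding of report) {
      const key = `${finding.file}:${finding.line}:${finding.message}`;
      const existing = seen.get(key);
      // When two reviewers flag the same issue, keep the more severe entry.
      if (existing === undefined || severityRank[finding.severity] < severityRank[existing.severity]) {
        seen.set(key, finding);
      }
    }
  }
  return Array.from(seen.values()).sort(
    (a, b) => severityRank[a.severity] - severityRank[b.severity]
  );
}

// Assumed mapping: nothing found is CLEAN, everything resolved is FIXED,
// and anything left unresolved is reported as BLOCKED.
function reportStatus(found: number, unresolved: number): Status {
  if (found === 0) return "CLEAN";
  return unresolved === 0 ? "FIXED" : "BLOCKED";
}
```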
|
|
122
105
|
|
|
123
|
-
|
|
124
|
-
- [ ] No duplication with existing codebase logic (or duplication is below threshold)
|
|
125
|
-
- [ ] No dead code: unused imports, variables, unreachable branches, commented code
|
|
126
|
-
- [ ] No unnecessary abstractions, wrappers, or premature generalizations
|
|
127
|
-
- [ ] Naming is clear and consistent with codebase conventions
|
|
128
|
-
- [ ] No obvious efficiency issues (unnecessary O(n^2), repeated computations)
|
|
129
|
-
- [ ] Tests still pass after simplification changes
|
|
106
|
+
## Common Rationalizations — REJECT THESE
|
|
130
107
|
|
|
131
|
-
|
|
108
|
+
| Excuse | Why It Violates the Rule |
|
|
109
|
+
|--------|--------------------------|
|
|
110
|
+
| "It might be needed later" | Delete it. Re-adding is cheaper than maintaining unused code. |
|
|
111
|
+
| "The abstraction makes it extensible" | Extensibility that serves no current requirement is dead weight. |
|
|
112
|
+
| "Refactoring is risky" | Small, tested simplifications reduce risk. Accumulated complexity increases it. |
|
|
113
|
+
| "I'll clean it up later" | Later never comes. Simplify now while context is fresh. |
|
|
132
114
|
|
|
133
|
-
|
|
134
|
-
- After task implementation is complete and tests pass, run this review
|
|
135
|
-
- Make simplification changes as part of the same commit (not a separate task)
|
|
136
|
-
- If simplification reveals a larger refactoring opportunity, file a todo — do not scope-creep
|
|
137
|
-
- Track significant simplifications in the task's commit message (e.g., "extracted shared helper for X")
|
|
115
|
+
## Red Flags — STOP If You Catch Yourself:
|
|
138
116
|
|
|
139
|
-
|
|
117
|
+
- Skipping the simplification pass because the diff is small
|
|
118
|
+
- Keeping dead code "just in case"
|
|
119
|
+
- Adding complexity during a simplification pass
|
|
120
|
+
- Merging without having read the full file (not just changed lines)
|
|
140
121
|
|
|
141
|
-
|
|
122
|
+
**If any red flag triggers: STOP. Complete the simplification cycle before proceeding.**
|
|
142
123
|
|
|
143
|
-
|
|
144
|
-
- Find duplicated logic across the codebase (not just within changed files)
|
|
145
|
-
- Identify copy-paste from other files that should be extracted to shared modules
|
|
146
|
-
- Suggest shared helpers for patterns appearing 3+ times (Rule of Three)
|
|
147
|
-
- Check if existing utility functions, library methods, or helpers could replace new code
|
|
148
|
-
- **Output:** List of reuse opportunities with file paths and line ranges
|
|
124
|
+
## Verification Checklist
|
|
149
125
|
|
|
150
|
-
|
|
151
|
-
- Check naming consistency with codebase conventions (read CLAUDE.md first)
|
|
152
|
-
- Verify error handling covers all external calls and edge cases
|
|
153
|
-
- Look for dead code: unused imports, variables, unreachable branches, commented-out code
|
|
154
|
-
- Check for unnecessary abstractions, wrappers, or premature generalizations
|
|
155
|
-
- Verify security: no hardcoded secrets, no unsanitized inputs, no data exposure
|
|
156
|
-
- **Output:** List of quality issues categorized by severity (BLOCKER, HIGH, MEDIUM)
|
|
126
|
+
Before reporting completion, confirm:
|
|
157
127
|
|
|
158
|
-
|
|
159
|
-
-
|
|
160
|
-
-
|
|
161
|
-
-
|
|
162
|
-
-
|
|
163
|
-
-
|
|
164
|
-
- **Output:** List of efficiency issues with suggested fixes
|
|
128
|
+
- [ ] All changed files were reviewed in full (not just diffs)
|
|
129
|
+
- [ ] No duplicated logic remains that appears three or more times
|
|
130
|
+
- [ ] No dead code: unused imports, commented blocks, unreachable branches
|
|
131
|
+
- [ ] No unnecessary abstractions, wrappers, or indirection layers
|
|
132
|
+
- [ ] All tests pass after simplification
|
|
133
|
+
- [ ] No behavioral changes were introduced (simplify only, do not alter)
|
|
165
134
|
|
|
166
|
-
|
|
135
|
+
## MAXSIM Integration
|
|
167
136
|
|
|
168
|
-
|
|
169
|
-
|
|
170
|
-
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
|
|
174
|
-
6. Reports final status: CLEAN (nothing found), FIXED (issues found and resolved), or BLOCKED (cannot fix without architectural change)
|
|
175
|
-
|
|
176
|
-
### Spawning Pattern
|
|
177
|
-
|
|
178
|
-
Each reviewer is spawned as a parallel agent via the Agent tool:
|
|
179
|
-
```
|
|
180
|
-
# All 3 run in parallel
|
|
181
|
-
Task(prompt="Reviewer 1: Code Reuse — {files_to_review} ...", subagent_type="maxsim-executor")
|
|
182
|
-
Task(prompt="Reviewer 2: Code Quality — {files_to_review} ...", subagent_type="maxsim-executor")
|
|
183
|
-
Task(prompt="Reviewer 3: Efficiency — {files_to_review} ...", subagent_type="maxsim-executor")
|
|
184
|
-
# Wait for all 3 to complete, then consolidate
|
|
185
|
-
```
|
|
137
|
+
When a plan specifies `skill: "simplify"`:
|
|
138
|
+
- The orchestrator collects changed files from the implementation step
|
|
139
|
+
- Three parallel reviewers (Reuse, Quality, Efficiency) are spawned
|
|
140
|
+
- Findings are consolidated and fixes applied
|
|
141
|
+
- Progress is tracked in STATE.md via decision entries
|
|
142
|
+
- Final results are recorded in SUMMARY.md
|
|
@@ -1,104 +1,71 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: systematic-debugging
|
|
3
|
-
description:
|
|
3
|
+
description: >-
|
|
4
|
+
Investigates bugs through systematic root-cause analysis: reproduce, hypothesize,
|
|
5
|
+
isolate, verify, fix, confirm. Use when encountering any bug, test failure,
|
|
6
|
+
unexpected behavior, or error message.
|
|
4
7
|
---
|
|
5
8
|
|
|
6
9
|
# Systematic Debugging
|
|
7
10
|
|
|
8
|
-
Random fixes waste time and create new bugs.
|
|
11
|
+
Find the root cause first. Random fixes waste time and create new bugs.
|
|
9
12
|
|
|
10
|
-
**If you have not
|
|
13
|
+
**HARD GATE -- No fix attempts without understanding root cause. If you have not completed the REPRODUCE and HYPOTHESIZE steps, you cannot propose a fix.**
|
|
11
14
|
|
|
12
|
-
##
|
|
15
|
+
## Process
|
|
13
16
|
|
|
14
|
-
|
|
15
|
-
NO FIX ATTEMPTS WITHOUT UNDERSTANDING ROOT CAUSE.
|
|
16
|
-
If you have not completed the REPRODUCE and HYPOTHESIZE steps, you CANNOT propose a fix.
|
|
17
|
-
"Let me just try this" is guessing, not debugging.
|
|
18
|
-
Violating this rule is a violation — not a time-saving shortcut.
|
|
19
|
-
</HARD-GATE>
|
|
20
|
-
|
|
21
|
-
## The Gate Function
|
|
22
|
-
|
|
23
|
-
Follow these steps IN ORDER for every bug, test failure, or unexpected behavior.
|
|
24
|
-
|
|
25
|
-
### 1. REPRODUCE — Confirm the Problem
|
|
17
|
+
### 1. REPRODUCE -- Confirm the Problem
|
|
26
18
|
|
|
27
19
|
- Run the failing command or test. Capture the EXACT error output.
|
|
28
20
|
- Can you trigger it reliably? What are the exact steps?
|
|
29
|
-
- If not reproducible: gather more data
|
|
30
|
-
|
|
31
|
-
```bash
|
|
32
|
-
# Example: reproduce a test failure
|
|
33
|
-
npx vitest run path/to/failing.test.ts
|
|
34
|
-
```
|
|
21
|
+
- If not reproducible: gather more data -- do not guess.
|
|
35
22
|
|
|
36
|
-
### 2. HYPOTHESIZE
|
|
23
|
+
### 2. HYPOTHESIZE -- Form a Theory
|
|
37
24
|
|
|
38
|
-
- Read the error message
|
|
39
|
-
- Check recent changes: `git diff`, recent commits, new dependencies
|
|
25
|
+
- Read the error message completely (stack trace, line numbers, exit codes).
|
|
26
|
+
- Check recent changes: `git diff`, recent commits, new dependencies.
|
|
40
27
|
- Trace data flow: where does the bad value originate?
|
|
41
|
-
- State your hypothesis clearly: "I think X is the root cause because Y"
|
|
28
|
+
- State your hypothesis clearly: "I think X is the root cause because Y."
|
|
42
29
|
|
|
43
|
-
### 3. ISOLATE
|
|
30
|
+
### 3. ISOLATE -- Narrow the Scope
|
|
44
31
|
|
|
45
|
-
- Find the
|
|
46
|
-
- In multi-component systems, add diagnostic logging at each boundary
|
|
47
|
-
- Identify which
|
|
48
|
-
- Compare against working examples in the codebase
|
|
32
|
+
- Find the smallest reproduction case.
|
|
33
|
+
- In multi-component systems, add diagnostic logging at each boundary.
|
|
34
|
+
- Identify which specific layer or component is failing.
|
|
35
|
+
- Compare against working examples in the codebase.
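The boundary-logging step above could be sketched like this. The pipeline, its stage names, and the `traceStages` helper are hypothetical; the point is only that recording the value at every boundary shows which stage first produces the bad value, even when a later stage hides it.

```typescript
// Hypothetical pipeline stage: a name plus a numeric transformation.
interface Stage {
  name: string;
  run: (input: number) => number;
}

// Run the stages in order and record the value crossing each boundary.
function traceStages(stages: Stage[], input: number, log: string[]): number {
  let value = input;
  for (const stage of stages) {
    const next = stage.run(value);
    log.push(`${stage.name}: in=${value} out=${next}`);
    value = next;
  }
  return value;
}

// Example pipeline in which the middle stage misbehaves (division by zero).
const stages: Stage[] = [
  { name: "parse", run: (n) => n + 1 },
  { name: "scale", run: (n) => n / 0 },
  { name: "clamp", run: (n) => Math.min(n, 100) },
];

const boundaryLog: string[] = [];
const finalValue = traceStages(stages, 1, boundaryLog);
```

Here the final value looks plausible because the last stage clamps it, while the log entry for the `scale` boundary exposes the failing component.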
|
|
49
36
|
|
|
50
|
-
### 4. VERIFY
|
|
37
|
+
### 4. VERIFY -- Test Your Hypothesis
|
|
51
38
|
|
|
52
|
-
- Make the
|
|
53
|
-
- Change
|
|
54
|
-
- If hypothesis is wrong: form a
|
|
39
|
+
- Make the smallest possible change to test your hypothesis.
|
|
40
|
+
- Change one variable at a time -- never multiple things simultaneously.
|
|
41
|
+
- If hypothesis is wrong: form a new hypothesis, do not stack fixes.
|
|
55
42
|
|
|
56
|
-
### 5. FIX
|
|
43
|
+
### 5. FIX -- Address the Root Cause
|
|
57
44
|
|
|
58
|
-
- Write a failing test that reproduces the bug
|
|
59
|
-
- Implement a
|
|
60
|
-
- No "while I'm here" improvements
|
|
45
|
+
- Write a failing test that reproduces the bug.
|
|
46
|
+
- Implement a single fix that addresses the root cause.
|
|
47
|
+
- No "while I'm here" improvements -- fix only the identified issue.
|
|
61
48
|
|
|
62
|
-
### 6. CONFIRM
|
|
49
|
+
### 6. CONFIRM -- Verify the Fix
|
|
63
50
|
|
|
64
|
-
- Run the original failing test: it must now pass
|
|
65
|
-
- Run the full test suite: no regressions
|
|
66
|
-
- Verify the original error no longer occurs
|
|
51
|
+
- Run the original failing test: it must now pass.
|
|
52
|
+
- Run the full test suite: no regressions.
|
|
53
|
+
- Verify the original error no longer occurs.
|
|
67
54
|
|
|
68
|
-
|
|
69
|
-
# Confirm the specific fix
|
|
70
|
-
npx vitest run path/to/fixed.test.ts
|
|
71
|
-
# Confirm no regressions
|
|
72
|
-
npx vitest run
|
|
73
|
-
```
|
|
55
|
+
## Common Pitfalls
|
|
74
56
|
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
|
78
|
-
|
|
79
|
-
| "I think I know what it is" | Thinking is not evidence. Reproduce first, then hypothesize. |
|
|
80
|
-
| "Let me just try this fix" | "Just try" = guessing. You have skipped REPRODUCE and HYPOTHESIZE. |
|
|
81
|
-
| "Quick patch for now, investigate later" | "Later" never comes. Patches mask the real problem. |
|
|
57
|
+
| Excuse | Reality |
|
|
58
|
+
|--------|---------|
|
|
59
|
+
| "I think I know what it is" | Thinking is not evidence. Reproduce first. |
|
|
60
|
+
| "Let me just try this fix" | That is guessing. Complete REPRODUCE and HYPOTHESIZE first. |
|
|
82
61
|
| "Multiple changes at once saves time" | You cannot isolate what worked. You will create new bugs. |
|
|
83
|
-
| "The issue is simple
|
|
84
|
-
| "I'm under time pressure" | Systematic debugging IS faster than guess-and-check thrashing. |
|
|
85
|
-
| "The reference is too long, I'll skim it" | Partial understanding guarantees partial fixes. Read it completely. |
|
|
86
|
-
|
|
87
|
-
## Red Flags — STOP If You Catch Yourself:
|
|
88
|
-
|
|
89
|
-
- Changing code before reproducing the error
|
|
90
|
-
- Proposing a fix before reading the full error message and stack trace
|
|
91
|
-
- Trying random fixes hoping one will work
|
|
92
|
-
- Changing multiple things simultaneously
|
|
93
|
-
- Saying "it's probably X" without evidence
|
|
94
|
-
- Applying a fix that did not work, then adding another fix on top
|
|
95
|
-
- On your 3rd failed fix attempt (this signals an architectural problem — escalate)
|
|
62
|
+
| "The issue is simple" | Simple bugs have root causes too. The process is fast for simple bugs. |
|
|
96
63
|
|
|
97
|
-
|
|
64
|
+
Stop immediately if you catch yourself changing code before reproducing, proposing a fix before reading the full error, trying random fixes, or changing multiple things at once. If any of these triggers, return to step 1.
|
|
98
65
|
|
|
99
|
-
|
|
66
|
+
If 3+ fix attempts have failed, the issue is likely architectural. Document what you have tried and escalate to the user.
|
|
100
67
|
|
|
101
|
-
## Verification
|
|
68
|
+
## Verification
|
|
102
69
|
|
|
103
70
|
Before claiming a bug is fixed, confirm:
|
|
104
71
|
|
|
@@ -110,9 +77,9 @@ Before claiming a bug is fixed, confirm:
|
|
|
110
77
|
- [ ] The full test suite passes (no regressions)
|
|
111
78
|
- [ ] The original error no longer occurs when running the original steps
|
|
112
79
|
|
|
113
|
-
##
|
|
80
|
+
## MAXSIM Integration
|
|
114
81
|
|
|
115
82
|
When debugging during plan execution, MAXSIM deviation rules apply:
|
|
116
83
|
- **Rule 1 (Auto-fix bugs):** You may auto-fix bugs found during execution, but you must still follow this debugging process.
|
|
117
|
-
- **Rule 4 (Architectural changes):** If 3+ fix attempts fail, STOP and return a checkpoint
|
|
84
|
+
- **Rule 4 (Architectural changes):** If 3+ fix attempts fail, STOP and return a checkpoint -- this is an architectural decision for the user.
|
|
118
85
|
- Track all debugging deviations for SUMMARY.md documentation.
|
|
@@ -1,93 +1,70 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: tdd
|
|
3
|
-
description:
|
|
3
|
+
description: >-
|
|
4
|
+
Enforces test-driven development with the Red-Green-Refactor cycle: write a
|
|
5
|
+
failing test first, implement minimal code to pass, then refactor. Use when
|
|
6
|
+
implementing features, fixing bugs, or adding new behavior.
|
|
4
7
|
---
|
|
5
8
|
|
|
6
9
|
# Test-Driven Development (TDD)
|
|
7
10
|
|
|
8
11
|
Write the test first. Watch it fail. Write minimal code to pass. Clean up.
|
|
9
12
|
|
|
10
|
-
**
|
|
13
|
+
**HARD GATE: No implementation code without a failing test first. If you wrote production code before the test, delete it and start over. No exceptions.**
|
|
11
14
|
|
|
12
|
-
##
|
|
15
|
+
## Process
|
|
13
16
|
|
|
14
|
-
|
|
15
|
-
NO IMPLEMENTATION CODE WITHOUT A FAILING TEST FIRST.
|
|
16
|
-
If you wrote production code before the test, DELETE IT. Start over.
|
|
17
|
-
No exceptions. No "I'll add tests after." No "keep as reference."
|
|
18
|
-
Violating this rule is a violation — not a judgment call.
|
|
19
|
-
</HARD-GATE>
|
|
17
|
+
### 1. RED -- Write One Failing Test
|
|
20
18
|
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
Follow this cycle for every behavior change, feature addition, or bug fix.
|
|
24
|
-
|
|
25
|
-
### 1. RED — Write Failing Test
|
|
26
|
-
|
|
27
|
-
- Write ONE minimal test that describes the desired behavior
|
|
19
|
+
- Write ONE minimal test describing the desired behavior
|
|
28
20
|
- Test name describes what SHOULD happen, not implementation details
|
|
29
|
-
- Use real code paths
|
|
21
|
+
- Use real code paths -- mocks only when unavoidable (external APIs, databases)
|
|
30
22
|
|
|
31
|
-
### 2. VERIFY RED
|
|
23
|
+
### 2. VERIFY RED -- Run the Test
|
|
32
24
|
|
|
33
|
-
|
|
34
|
-
# Run the test suite for this file
|
|
35
|
-
npx vitest run path/to/test.test.ts
|
|
36
|
-
```
|
|
37
|
-
|
|
38
|
-
- Test MUST fail (not error — fail with an assertion)
|
|
25
|
+
- Test MUST fail with an assertion (not error out from syntax or imports)
|
|
39
26
|
- Failure message must match the missing behavior
|
|
40
|
-
- If test passes immediately
|
|
27
|
+
- If the test passes immediately, you are testing existing behavior -- rewrite it
|
|
41
28
|
|
|
42
|
-
### 3. GREEN
|
|
29
|
+
### 3. GREEN -- Write Minimal Code
|
|
43
30
|
|
|
44
31
|
- Write the SIMPLEST code that makes the test pass
|
|
45
32
|
- Do NOT add features the test does not require
|
|
46
|
-
- Do NOT refactor yet
|
|
47
|
-
|
|
48
|
-
### 4. VERIFY GREEN — Run All Tests
|
|
33
|
+
- Do NOT refactor yet
|
|
49
34
|
|
|
50
|
-
|
|
51
|
-
npx vitest run
|
|
52
|
-
```
|
|
35
|
+
### 4. VERIFY GREEN -- Run All Tests
|
|
53
36
|
|
|
54
37
|
- The new test MUST pass
|
|
55
38
|
- ALL existing tests MUST still pass
|
|
56
|
-
- If any test fails
|
|
39
|
+
- If any test fails, fix code -- not tests
|
|
57
40
|
|
|
58
|
-
### 5. REFACTOR
|
|
41
|
+
### 5. REFACTOR -- Clean Up (Tests Still Green)
|
|
59
42
|
|
|
60
43
|
- Remove duplication, improve names, extract helpers
|
|
61
|
-
- Run tests after every change
|
|
44
|
+
- Run tests after every change
|
|
62
45
|
- Do NOT add new behavior during refactor
|
|
63
46
|
|
|
64
|
-
### 6. REPEAT
|
|
47
|
+
### 6. REPEAT -- Next failing test for next behavior
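The cycle above can be sketched without a test framework, so that the example runs on its own. In a real project the assertions would more likely live in a test runner such as the vitest setup referenced by the earlier version of this file. All names here (`expectEqual`, `testCartTotal`, `notImplemented`, `cartTotal`, `fails`) are hypothetical.

```typescript
// Tiny stand-in for a framework assertion, so the sketch is self-contained.
function expectEqual<T>(actual: T, expected: T, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${String(expected)}, got ${String(actual)}`);
  }
}

// RED: the test comes first and describes the desired behavior.
function testCartTotal(total: (prices: number[]) => number): void {
  expectEqual(total([]), 0, "empty cart totals zero");
  expectEqual(total([2, 3]), 5, "sums item prices");
}

// VERIFY RED: a placeholder makes the test fail on an assertion,
// not on a syntax or import error.
const notImplemented = (_prices: number[]): number => -1;

// GREEN: the simplest code that makes the test pass.
const cartTotal = (prices: number[]): number =>
  prices.reduce((sum, price) => sum + price, 0);

// Helper reporting whether a test run threw an assertion error.
function fails(run: () => void): boolean {
  try {
    run();
    return false;
  } catch (err) {
    return true;
  }
}
```

Calling `fails(() => testCartTotal(notImplemented))` corresponds to the VERIFY RED step, and `fails(() => testCartTotal(cartTotal))` to VERIFY GREEN.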
|
|
65
48
|
|
|
66
|
-
## Common
|
|
49
|
+
## Common Pitfalls
|
|
67
50
|
|
|
68
|
-
| Excuse | Why
|
|
69
|
-
|
|
70
|
-
| "Too simple to test" | Simple code breaks. The test takes 30 seconds
|
|
71
|
-
| "I'll add tests after" | Tests written after pass immediately
|
|
72
|
-
| "The test framework isn't set up yet" | Set it up. That is part of the task, not a reason to skip. |
|
|
51
|
+
| Excuse | Why it fails |
|
|
52
|
+
|--------|-------------|
|
|
53
|
+
| "Too simple to test" | Simple code breaks. The test takes 30 seconds. |
|
|
54
|
+
| "I'll add tests after" | Tests written after pass immediately -- they prove nothing. |
|
|
73
55
|
| "I know the code works" | Knowledge is not evidence. A passing test is evidence. |
|
|
74
|
-
| "TDD is slower
|
|
56
|
+
| "TDD is slower" | TDD is faster than debugging. Every skip creates debt. |
|
|
75
57
|
| "Let me keep the code as reference" | You will adapt it instead of writing test-first. Delete means delete. |
|
|
76
|
-
| "I need to explore the design first" | Explore, then throw it away. Start implementation with TDD. |
|
|
77
58
|
|
|
78
|
-
|
|
59
|
+
Stop immediately if you catch yourself:
|
|
79
60
|
|
|
80
61
|
- Writing implementation code before writing a test
|
|
81
|
-
- Writing a test that passes on the first run
|
|
82
|
-
- Skipping the VERIFY RED step
|
|
62
|
+
- Writing a test that passes on the first run
|
|
63
|
+
- Skipping the VERIFY RED step
|
|
83
64
|
- Adding features beyond what the current test requires
|
|
84
|
-
-
|
|
85
|
-
- Rationalizing "just this once" or "this is different"
|
|
86
|
-
- Keeping pre-TDD code "as reference" while writing tests
|
|
65
|
+
- Keeping pre-TDD code "as reference"
|
|
87
66
|
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
## Verification Checklist
|
|
67
|
+
## Verification
|
|
91
68
|
|
|
92
69
|
Before claiming TDD compliance, confirm:
|
|
93
70
|
|
|
@@ -99,20 +76,10 @@ Before claiming TDD compliance, confirm:
|
|
|
99
76
|
- [ ] All tests pass after implementation
|
|
100
77
|
- [ ] Refactoring (if any) did not break any tests
|
|
101
78
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
## When Stuck
|
|
105
|
-
|
|
106
|
-
| Problem | Solution |
|
|
107
|
-
|---------|----------|
|
|
108
|
-
| Don't know how to test it | Write the assertion first. What should the output be? |
|
|
109
|
-
| Test setup is too complex | The design is too complex. Simplify the interface. |
|
|
110
|
-
| Must mock everything | Code is too coupled. Use dependency injection. |
|
|
111
|
-
| Existing code has no tests | Add tests for the code you are changing. Start the cycle now. |
|
|
112
|
-
|
|
113
|
-
## Integration with MAXSIM
|
|
79
|
+
## MAXSIM Integration
|
|
114
80
|
|
|
115
81
|
In MAXSIM plan execution, tasks marked `tdd="true"` follow this cycle with per-step commits:
|
|
82
|
+
|
|
116
83
|
- **RED commit:** `test({phase}-{plan}): add failing test for [feature]`
|
|
117
84
|
- **GREEN commit:** `feat({phase}-{plan}): implement [feature]`
|
|
118
85
|
- **REFACTOR commit (if changes made):** `refactor({phase}-{plan}): clean up [feature]`
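The three commit formats above could be generated by a small helper like the one below. This is only a sketch: `tddCommitMessage` is a hypothetical function, not part of maxsim, and the phase and plan values in the usage note are made up.

```typescript
type TddStep = "RED" | "GREEN" | "REFACTOR";

// Build the per-step commit message in the `type({phase}-{plan}): text` form.
function tddCommitMessage(step: TddStep, phase: string, plan: string, feature: string): string {
  const scope = `${phase}-${plan}`;
  switch (step) {
    case "RED":
      return `test(${scope}): add failing test for ${feature}`;
    case "GREEN":
      return `feat(${scope}): implement ${feature}`;
    case "REFACTOR":
      return `refactor(${scope}): clean up ${feature}`;
  }
}
```

For example, `tddCommitMessage("RED", "03", "01", "login")` yields `test(03-01): add failing test for login`.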
|