maxsimcli 4.0.1 → 4.0.2

@@ -1,185 +1,142 @@
1
1
  ---
2
2
  name: simplify
3
- description: Use after implementation and before commit — requires reviewing changed code for reuse opportunities, quality issues, and unnecessary complexity
3
+ description: >-
4
+ Reviews changed code for reuse opportunities, unnecessary complexity, and
5
+ dead weight using three parallel review agents. Use when reviewing code
6
+ before committing, cleaning up implementations, or preparing changes for
7
+ review.
4
8
  ---
5
9
 
6
10
  # Simplify
7
11
 
8
- Every line of code is a liability. Less code that does the same thing is always better.
12
+ Every line of code is a liability. Remove what does not earn its place.
9
13
 
10
- **If you have not looked for ways to simplify, you are shipping the first draft.**
14
+ **HARD GATE**: No code ships without a simplification pass. If you have not checked for duplication, dead code, and unnecessary complexity, the change is not ready. "It works" is the starting point, not the finish line.
11
15
 
12
- ## The Iron Law
16
+ ## When to Use
13
17
 
14
- <HARD-GATE>
15
- NO COMMIT WITHOUT REVIEWING FOR SIMPLIFICATION.
16
- If you have not checked for duplication, dead code, and unnecessary complexity, you CANNOT commit.
17
- "It works" is the starting point, not the finish line.
18
- Violating this rule is a violation — not a preference.
19
- </HARD-GATE>
18
+ - After implementing a feature or fix, before committing
19
+ - When preparing changes for code review
20
+ - When cleaning up code that has grown organically over multiple iterations
21
+ - When onboarding to a file and noticing accumulated complexity
20
22
 
21
- ## The Gate Function
23
+ Do NOT use this skill when:
24
+ - Making a hotfix where speed matters more than polish (file a follow-up instead)
25
+ - The changes are purely mechanical (renames, formatting, dependency bumps)
22
26
 
23
- After implementation is complete and tests pass, BEFORE committing:
27
+ ## Process
24
28
 
25
- ### 1. DIFF — Review What Changed
29
+ ### 1. DIFF — Identify What Changed
26
30
 
27
- - Run `git diff --staged` (or `git diff` for unstaged changes)
28
- - Read every line you are about to commit
29
- - Flag anything that feels "off" — trust that instinct
31
+ - Collect the set of modified and added files
32
+ - Read each file in full, not just the changed hunks
33
+ - Note files that interact with the changes (callers, consumers, shared modules)
30
34
 
31
- ```bash
32
- # Review staged changes
33
- git diff --staged
34
- # Or all uncommitted changes
35
- git diff
36
- ```
35
+ ### 2. DUPLICATION — Eliminate Repeated Logic
37
36
 
38
- ### 2. DUPLICATION — Find Reuse Opportunities
37
+ - Are there patterns repeated across files that should be a shared helper?
38
+ - Does new code duplicate existing utilities or library functions?
39
+ - Could two similar implementations be merged behind a single interface?
40
+ - Is there copy-paste that should be refactored?
39
41
 
40
- - Does any new code duplicate existing logic in the codebase?
41
- - Are there two or more blocks that do similar things and could share a helper?
42
- - Could an existing utility, library function, or helper replace new code?
43
- - Is there copy-paste from another file that should be extracted?
42
+ **Rule of three**: If the same pattern appears three times, extract it.
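
The rule of three can be sketched as follows. This is a minimal illustration, not MAXSIM code: the handler names and the `normalize_email` helper are hypothetical. It assumes three call sites that each repeated the same strip-and-lowercase logic inline before the helper was extracted.

```python
# Hypothetical example: three handlers used to repeat the same
# strip-and-lowercase logic inline. After the third occurrence,
# the pattern is extracted into one shared helper.

def normalize_email(raw: str) -> str:
    """Shared helper extracted once the pattern appeared three times."""
    return raw.strip().lower()


def register_user(email: str) -> dict:
    return {"action": "register", "email": normalize_email(email)}


def invite_user(email: str) -> dict:
    return {"action": "invite", "email": normalize_email(email)}


def reset_password(email: str) -> dict:
    return {"action": "reset", "email": normalize_email(email)}
```

With only two call sites, leaving the logic inline would usually be the cheaper choice; the helper starts to pay for itself at the third.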
44
43
 
45
- **Rule of three:** If the same pattern appears three times, extract it. Two occurrences are acceptable.
44
+ ### 3. DEAD CODE — Remove What Is Not Called
46
45
 
47
- ### 3. DEAD CODE — Remove What Is Not Used
48
-
49
- - Are there unused imports, variables, or functions?
50
- - Are there commented-out code blocks? (Delete them — git has history)
51
- - Are there unreachable branches or impossible conditions?
52
- - Are there parameters that are always passed the same value?
53
-
54
- **If it is not called, it does not belong. Delete it.**
46
+ - Delete unused imports, variables, functions, and parameters
47
+ - Remove commented-out code blocks (version control is the archive)
48
+ - Strip unreachable branches and impossible conditions
49
+ - Drop feature flags and configuration for features that no longer exist
55
50
 
56
51
  ### 4. COMPLEXITY — Question Every Abstraction
57
52
 
58
- - Is there a wrapper that adds no value beyond indirection?
59
- - Is there a generic solution where a specific one would be simpler?
60
- - Are there feature flags, configuration options, or extension points for hypothetical future needs?
61
- - Could a 3-line inline block replace a 20-line abstraction?
53
+ - Does every wrapper, adapter, or indirection layer justify its existence?
54
+ - Are there generics or parametrization that serve only one concrete case?
55
+ - Could a 20-line class be replaced by a 3-line function?
56
+ - Is there defensive programming that guards against conditions that cannot occur?
62
57
 
63
- **The right amount of abstraction is the minimum needed for the current requirements.**
58
+ **If removing it does not break anything, it should not be there.**
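
As an illustration of the class-versus-function question above, here is a small sketch. Both `PriceFormatter` and `format_price` are hypothetical names invented for this example. It assumes the class holds no state worth keeping between calls.

```python
class PriceFormatter:
    """Before: a class wrapping a single stateless operation."""

    def __init__(self, currency: str = "USD") -> None:
        self.currency = currency

    def format(self, amount: float) -> str:
        return f"{amount:.2f} {self.currency}"


def format_price(amount: float, currency: str = "USD") -> str:
    """After: the same behavior as a plain function."""
    return f"{amount:.2f} {currency}"
```

If callers only ever construct the class to call `format` once, the function is likely the simpler interface.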
64
59
 
65
- ### 5. CLARITY — Improve Readability
60
+ ### 5. CLARITY — Tighten Naming and Structure
66
61
 
67
- - Are variable and function names self-explanatory?
68
- - Could a confusing block be rewritten more clearly without comments?
69
- - Are there nested conditions that could be flattened with early returns?
70
- - Is the control flow straightforward or unnecessarily clever?
62
+ - Are names self-documenting? Rename anything that needs a comment to explain.
63
+ - Could nested logic be flattened with early returns?
64
+ - Is control flow straightforward, or does it require tracing to understand?
65
+ - Are there layers of indirection that obscure the data path?
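
Flattening nested logic with early returns might look like the sketch below. The order fields (`items`, `is_digital`, `total`) and the shipping rule are assumptions made up for the example. Both functions are intended to behave identically.

```python
def shipping_cost_nested(order: dict) -> float:
    # Before: each condition adds another level of nesting.
    if order.get("items"):
        if not order.get("is_digital"):
            if order.get("total", 0) < 50:
                return 5.0
            else:
                return 0.0
        else:
            return 0.0
    else:
        raise ValueError("order has no items")


def shipping_cost_flat(order: dict) -> float:
    # After: guard clauses handle the edge cases first,
    # leaving the main path unindented at the end.
    if not order.get("items"):
        raise ValueError("order has no items")
    if order.get("is_digital"):
        return 0.0
    if order.get("total", 0) >= 50:
        return 0.0
    return 5.0
```

Because this is a simplification and not a behavior change, the two versions should agree on every input, which is easy to check with a handful of sample orders.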
71
66
 
72
- ### 6. EFFICIENCY — Check for Obvious Issues
67
+ ### 6. REVIEW — Final Sanity Check
73
68
 
74
- - Are there O(n^2) operations where O(n) is straightforward?
75
- - Are there repeated computations that could be cached or hoisted?
76
- - Are there unnecessary allocations in hot paths?
77
- - Is data being transformed multiple times when once would suffice?
69
+ - Re-read the simplified code end to end
70
+ - Confirm all tests still pass
71
+ - Verify no behavioral changes were introduced (simplify, do not alter)
78
72
 
79
- **Only fix efficiency issues that are obvious. Do not optimize without evidence of a problem.**
73
+ ## Parallel 3-Reviewer Pattern
80
74
 
81
- ## Review Checklist
75
+ When invoked as part of the execute-phase cycle, simplification runs as three parallel review agents, each focused on one dimension.
82
76
 
83
- | Category | Question | Action if Yes |
84
- |----------|----------|---------------|
85
- | Duplication | Same logic exists elsewhere? | Extract shared helper or reuse existing |
86
- | Duplication | Copy-paste from another file? | Extract to shared module |
87
- | Dead code | Unused imports or variables? | Delete them |
88
- | Dead code | Commented-out code? | Delete it (git has history) |
89
- | Complexity | Abstraction for one call site? | Inline it |
90
- | Complexity | Generic where specific suffices? | Simplify to specific |
91
- | Complexity | Config for hypothetical needs? | Remove until needed |
92
- | Clarity | Confusing variable name? | Rename to describe purpose |
93
- | Clarity | Deep nesting? | Flatten with early returns |
94
- | Efficiency | O(n^2) with obvious O(n) fix? | Fix it now |
95
- | Efficiency | Same computation in a loop? | Hoist outside the loop |
77
+ ### Reviewer 1: Code Reuse
96
78
 
97
- ## Common Rationalizations — REJECT THESE
79
+ - Scan all changed files for duplicated patterns
80
+ - Cross-reference against existing shared utilities and helpers
81
+ - Flag any logic that appears three or more times without extraction
82
+ - **Output**: List of reuse opportunities with file paths and line ranges
98
83
 
99
- | Excuse | Why It Violates the Rule |
100
- |--------|--------------------------|
101
- | "It works, don't touch it" | Working is the minimum bar. Simplify before it becomes legacy. |
102
- | "We might need the flexibility later" | YAGNI. Add flexibility when you need it, not before. |
103
- | "Refactoring is risky" | Small simplifications with passing tests are safe. Large refactors are a separate task. |
104
- | "The duplication is minor" | Minor duplication compounds. Three occurrences is the threshold, not six. |
105
- | "I'll clean it up in a follow-up" | Follow-ups rarely happen. Simplify now while context is fresh. |
106
- | "It's just a few extra lines" | Every unnecessary line is maintenance cost. Delete it. |
84
+ ### Reviewer 2: Code Quality
107
85
 
108
- ## Red Flags — STOP If You Catch Yourself:
86
+ - Check for dead code: unused imports, unreachable branches, commented blocks
87
+ - Verify naming consistency with codebase conventions
88
+ - Flag unnecessary abstractions, wrappers, and indirection
89
+ - **Output**: List of quality issues categorized by severity
109
90
 
110
- - Committing without reading your own diff
111
- - Keeping dead code "just in case"
112
- - Adding a utility class for a single use case
113
- - Building configuration for features no one has requested
114
- - Writing a comment to explain code that could be rewritten to be self-evident
115
- - Skipping simplification because "it's good enough"
91
+ ### Reviewer 3: Efficiency
116
92
 
117
- **If any red flag triggers: STOP. Review the diff again. Simplify before committing.**
93
+ - Identify over-engineered solutions (parametrization serving one case, generic interfaces with one implementor)
94
+ - Flag defensive programming that guards impossible conditions
95
+ - Check for configuration and feature flags that serve no current purpose
96
+ - **Output**: List of efficiency issues with suggested removals
118
97
 
119
- ## Verification Checklist
98
+ ### Consolidation
120
99
 
121
- Before committing, confirm:
100
+ After all three reviewers complete:
101
+ 1. Merge findings into a deduplicated list
102
+ 2. Apply fixes for all actionable items (BLOCKER and HIGH priority first)
103
+ 3. Re-run tests to confirm nothing broke
104
+ 4. Report status: CLEAN (nothing found), FIXED (issues resolved), or BLOCKED (cannot simplify without architectural changes)
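
The consolidation steps above could be sketched roughly as follows. The `Finding` shape, the `consolidate` and `final_status` names, and the severity ordering are assumptions for illustration only; the package does not specify a data format for reviewer output. Only the status names (CLEAN, FIXED, BLOCKED) come from the text.

```python
from dataclasses import dataclass

# Assumed ordering; lower numbers are handled first.
SEVERITY_ORDER = {"BLOCKER": 0, "HIGH": 1, "MEDIUM": 2}


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of one reviewer finding."""
    path: str
    line: int
    severity: str
    message: str


def consolidate(*reports: list[Finding]) -> list[Finding]:
    """Merge reviewer reports, drop exact duplicates, most severe first."""
    unique: set[Finding] = set()
    for report in reports:
        unique.update(report)
    return sorted(
        unique,
        key=lambda f: (SEVERITY_ORDER[f.severity], f.path, f.line, f.message),
    )


def final_status(findings: list[Finding], unresolved: list[Finding]) -> str:
    """Map the outcome onto the three status values named in the text."""
    if not findings:
        return "CLEAN"
    return "BLOCKED" if unresolved else "FIXED"
```

Deduplicating on the full tuple only removes findings that two reviewers reported identically; near-duplicates with different wording would still need a human or agent to merge them.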
122
105
 
123
- - [ ] All staged changes have been reviewed line by line
124
- - [ ] No duplication with existing codebase logic (or duplication is below threshold)
125
- - [ ] No dead code: unused imports, variables, unreachable branches, commented code
126
- - [ ] No unnecessary abstractions, wrappers, or premature generalizations
127
- - [ ] Naming is clear and consistent with codebase conventions
128
- - [ ] No obvious efficiency issues (unnecessary O(n^2), repeated computations)
129
- - [ ] Tests still pass after simplification changes
106
+ ## Common Rationalizations — REJECT THESE
130
107
 
131
- ## In MAXSIM Plan Execution
108
+ | Excuse | Why It Violates the Rule |
109
+ |--------|--------------------------|
110
+ | "It might be needed later" | Delete it. Re-adding is cheaper than maintaining unused code. |
111
+ | "The abstraction makes it extensible" | Extensibility that serves no current requirement is dead weight. |
112
+ | "Refactoring is risky" | Small, tested simplifications reduce risk. Accumulated complexity increases it. |
113
+ | "I'll clean it up later" | Later never comes. Simplify now while context is fresh. |
132
114
 
133
- Simplification applies at the task level, after implementation and before commit:
134
- - After task implementation is complete and tests pass, run this review
135
- - Make simplification changes as part of the same commit (not a separate task)
136
- - If simplification reveals a larger refactoring opportunity, file a todo — do not scope-creep
137
- - Track significant simplifications in the task's commit message (e.g., "extracted shared helper for X")
115
+ ## Red Flags — STOP If You Catch Yourself:
138
116
 
139
- ## Parallel 3-Reviewer Pattern (Execution Pipeline)
117
+ - Skipping the simplification pass because the diff is small
118
+ - Keeping dead code "just in case"
119
+ - Adding complexity during a simplification pass
120
+ - Merging without having read the full file (not just changed lines)
140
121
 
141
- When invoked as part of the Execute-Review-Simplify-Review cycle in `execute-plan.md`, simplification runs as 3 parallel review agents. Each reviewer focuses on one dimension:
122
+ **If any red flag triggers: STOP. Complete the simplification cycle before proceeding.**
142
123
 
143
- ### Reviewer 1: Code Reuse
144
- - Find duplicated logic across the codebase (not just within changed files)
145
- - Identify copy-paste from other files that should be extracted to shared modules
146
- - Suggest shared helpers for patterns appearing 3+ times (Rule of Three)
147
- - Check if existing utility functions, library methods, or helpers could replace new code
148
- - **Output:** List of reuse opportunities with file paths and line ranges
124
+ ## Verification Checklist
149
125
 
150
- ### Reviewer 2: Code Quality
151
- - Check naming consistency with codebase conventions (read CLAUDE.md first)
152
- - Verify error handling covers all external calls and edge cases
153
- - Look for dead code: unused imports, variables, unreachable branches, commented-out code
154
- - Check for unnecessary abstractions, wrappers, or premature generalizations
155
- - Verify security: no hardcoded secrets, no unsanitized inputs, no data exposure
156
- - **Output:** List of quality issues categorized by severity (BLOCKER, HIGH, MEDIUM)
126
+ Before reporting completion, confirm:
157
127
 
158
- ### Reviewer 3: Efficiency
159
- - Find O(n^2) operations where O(n) is straightforward
160
- - Identify repeated computations that could be cached or hoisted out of loops
161
- - Check for unnecessary allocations in hot paths
162
- - Look for redundant data transformations (data processed multiple times when once suffices)
163
- - Only flag obvious issues — do not optimize without evidence of a problem
164
- - **Output:** List of efficiency issues with suggested fixes
128
+ - [ ] All changed files were reviewed in full (not just diffs)
129
+ - [ ] No duplicated logic remains that appears three or more times
130
+ - [ ] No dead code: unused imports, commented blocks, unreachable branches
131
+ - [ ] No unnecessary abstractions, wrappers, or indirection layers
132
+ - [ ] All tests pass after simplification
133
+ - [ ] No behavioral changes were introduced (simplify only, do not alter)
165
134
 
166
- ### Consolidation
135
+ ## MAXSIM Integration
167
136
 
168
- After all 3 reviewers report, the orchestrating agent:
169
- 1. Merges findings into a single deduplicated list
170
- 2. Prioritizes: BLOCKER > HIGH > MEDIUM > informational
171
- 3. Applies fixes for all actionable items (BLOCKER and HIGH)
172
- 4. Files MEDIUM issues as todos if they require larger refactoring
173
- 5. Runs tests to confirm fixes do not break anything
174
- 6. Reports final status: CLEAN (nothing found), FIXED (issues found and resolved), or BLOCKED (cannot fix without architectural change)
175
-
176
- ### Spawning Pattern
177
-
178
- Each reviewer is spawned as a parallel agent via the Agent tool:
179
- ```
180
- # All 3 run in parallel
181
- Task(prompt="Reviewer 1: Code Reuse — {files_to_review} ...", subagent_type="maxsim-executor")
182
- Task(prompt="Reviewer 2: Code Quality — {files_to_review} ...", subagent_type="maxsim-executor")
183
- Task(prompt="Reviewer 3: Efficiency — {files_to_review} ...", subagent_type="maxsim-executor")
184
- # Wait for all 3 to complete, then consolidate
185
- ```
137
+ When a plan specifies `skill: "simplify"`:
138
+ - The orchestrator collects changed files from the implementation step
139
+ - Three parallel reviewers (Reuse, Quality, Efficiency) are spawned
140
+ - Findings are consolidated and fixes applied
141
+ - Progress is tracked in STATE.md via decision entries
142
+ - Final results are recorded in SUMMARY.md
@@ -1,104 +1,71 @@
1
1
  ---
2
2
  name: systematic-debugging
3
- description: Use when encountering any bug, test failure, or unexpected behavior — requires root cause investigation before attempting any fix
3
+ description: >-
4
+ Investigates bugs through systematic root-cause analysis: reproduce, hypothesize,
5
+ isolate, verify, fix, confirm. Use when encountering any bug, test failure,
6
+ unexpected behavior, or error message.
4
7
  ---
5
8
 
6
9
  # Systematic Debugging
7
10
 
8
- Random fixes waste time and create new bugs. Find the root cause first.
11
+ Find the root cause first. Random fixes waste time and create new bugs.
9
12
 
10
- **If you have not identified the root cause, you are guessing — not debugging.**
13
+ **HARD GATE -- No fix attempts without understanding root cause. If you have not completed the REPRODUCE and HYPOTHESIZE steps, you cannot propose a fix.**
11
14
 
12
- ## The Iron Law
15
+ ## Process
13
16
 
14
- <HARD-GATE>
15
- NO FIX ATTEMPTS WITHOUT UNDERSTANDING ROOT CAUSE.
16
- If you have not completed the REPRODUCE and HYPOTHESIZE steps, you CANNOT propose a fix.
17
- "Let me just try this" is guessing, not debugging.
18
- Violating this rule is a violation — not a time-saving shortcut.
19
- </HARD-GATE>
20
-
21
- ## The Gate Function
22
-
23
- Follow these steps IN ORDER for every bug, test failure, or unexpected behavior.
24
-
25
- ### 1. REPRODUCE — Confirm the Problem
17
+ ### 1. REPRODUCE -- Confirm the Problem
26
18
 
27
19
  - Run the failing command or test. Capture the EXACT error output.
28
20
  - Can you trigger it reliably? What are the exact steps?
29
- - If not reproducible: gather more data — do not guess.
30
-
31
- ```bash
32
- # Example: reproduce a test failure
33
- npx vitest run path/to/failing.test.ts
34
- ```
21
+ - If not reproducible: gather more data -- do not guess.
35
22
 
36
- ### 2. HYPOTHESIZE — Form a Theory
23
+ ### 2. HYPOTHESIZE -- Form a Theory
37
24
 
38
- - Read the error message COMPLETELY (stack trace, line numbers, exit codes)
39
- - Check recent changes: `git diff`, recent commits, new dependencies
25
+ - Read the error message completely (stack trace, line numbers, exit codes).
26
+ - Check recent changes: `git diff`, recent commits, new dependencies.
40
27
  - Trace data flow: where does the bad value originate?
41
- - State your hypothesis clearly: "I think X is the root cause because Y"
28
+ - State your hypothesis clearly: "I think X is the root cause because Y."
42
29
 
43
- ### 3. ISOLATE — Narrow the Scope
30
+ ### 3. ISOLATE -- Narrow the Scope
44
31
 
45
- - Find the SMALLEST reproduction case
46
- - In multi-component systems, add diagnostic logging at each boundary
47
- - Identify which SPECIFIC layer or component is failing
48
- - Compare against working examples in the codebase
32
+ - Find the smallest reproduction case.
33
+ - In multi-component systems, add diagnostic logging at each boundary.
34
+ - Identify which specific layer or component is failing.
35
+ - Compare against working examples in the codebase.
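
Diagnostic logging at each boundary might look like this sketch. The three-stage pipeline (`parse`, `to_numbers`, `total`) is a hypothetical stand-in for a multi-component system. It uses only the standard `logging` module.

```python
import logging

logger = logging.getLogger("pipeline")


def parse(raw: str) -> list[str]:
    return [part.strip() for part in raw.split(",")]


def to_numbers(parts: list[str]) -> list[int]:
    # Empty fields are skipped rather than converted.
    return [int(p) for p in parts if p]


def total(numbers: list[int]) -> int:
    return sum(numbers)


def run(raw: str) -> int:
    # Log the value crossing each boundary so the first stage
    # that produces a bad value is visible in the output.
    parts = parse(raw)
    logger.debug("after parse: %r", parts)
    numbers = to_numbers(parts)
    logger.debug("after to_numbers: %r", numbers)
    result = total(numbers)
    logger.debug("after total: %r", result)
    return result
```

Calling `logging.basicConfig(level=logging.DEBUG)` before `run(...)` prints the intermediate values; once the failing stage is identified, the temporary log lines can be removed again.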
49
36
 
50
- ### 4. VERIFY — Test Your Hypothesis
37
+ ### 4. VERIFY -- Test Your Hypothesis
51
38
 
52
- - Make the SMALLEST possible change to test your hypothesis
53
- - Change ONE variable at a time — never multiple things simultaneously
54
- - If hypothesis is wrong: form a NEW hypothesis, do not stack fixes
39
+ - Make the smallest possible change to test your hypothesis.
40
+ - Change one variable at a time -- never multiple things simultaneously.
41
+ - If hypothesis is wrong: form a new hypothesis, do not stack fixes.
55
42
 
56
- ### 5. FIX — Address the Root Cause
43
+ ### 5. FIX -- Address the Root Cause
57
44
 
58
- - Write a failing test that reproduces the bug (see TDD skill)
59
- - Implement a SINGLE fix that addresses the root cause
60
- - No "while I'm here" improvements — fix only the identified issue
45
+ - Write a failing test that reproduces the bug.
46
+ - Implement a single fix that addresses the root cause.
47
+ - No "while I'm here" improvements -- fix only the identified issue.
61
48
 
62
- ### 6. CONFIRM — Verify the Fix
49
+ ### 6. CONFIRM -- Verify the Fix
63
50
 
64
- - Run the original failing test: it must now pass
65
- - Run the full test suite: no regressions
66
- - Verify the original error no longer occurs
51
+ - Run the original failing test: it must now pass.
52
+ - Run the full test suite: no regressions.
53
+ - Verify the original error no longer occurs.
67
54
 
68
- ```bash
69
- # Confirm the specific fix
70
- npx vitest run path/to/fixed.test.ts
71
- # Confirm no regressions
72
- npx vitest run
73
- ```
55
+ ## Common Pitfalls
74
56
 
75
- ## Common Rationalizations — REJECT THESE
76
-
77
- | Excuse | Why It Violates the Rule |
78
- |--------|--------------------------|
79
- | "I think I know what it is" | Thinking is not evidence. Reproduce first, then hypothesize. |
80
- | "Let me just try this fix" | "Just try" = guessing. You have skipped REPRODUCE and HYPOTHESIZE. |
81
- | "Quick patch for now, investigate later" | "Later" never comes. Patches mask the real problem. |
57
+ | Excuse | Reality |
58
+ |--------|---------|
59
+ | "I think I know what it is" | Thinking is not evidence. Reproduce first. |
60
+ | "Let me just try this fix" | That is guessing. Complete REPRODUCE and HYPOTHESIZE first. |
82
61
  | "Multiple changes at once saves time" | You cannot isolate what worked. You will create new bugs. |
83
- | "The issue is simple, I don't need the process" | Simple bugs have root causes too. The process is fast for simple bugs. |
84
- | "I'm under time pressure" | Systematic debugging IS faster than guess-and-check thrashing. |
85
- | "The reference is too long, I'll skim it" | Partial understanding guarantees partial fixes. Read it completely. |
86
-
87
- ## Red Flags — STOP If You Catch Yourself:
88
-
89
- - Changing code before reproducing the error
90
- - Proposing a fix before reading the full error message and stack trace
91
- - Trying random fixes hoping one will work
92
- - Changing multiple things simultaneously
93
- - Saying "it's probably X" without evidence
94
- - Applying a fix that did not work, then adding another fix on top
95
- - On your 3rd failed fix attempt (this signals an architectural problem — escalate)
62
+ | "The issue is simple" | Simple bugs have root causes too. The process is fast for simple bugs. |
96
63
 
97
- **If any red flag triggers: STOP. Return to step 1 (REPRODUCE).**
64
+ Stop immediately if you catch yourself changing code before reproducing, proposing a fix before reading the full error, trying random fixes, or changing multiple things at once. If any of these triggers, return to step 1.
98
65
 
99
- **If 3+ fix attempts have failed:** The issue is likely architectural, not a simple bug. Document what you have tried and escalate to the user for a design decision.
66
+ If 3+ fix attempts have failed, the issue is likely architectural. Document what you have tried and escalate to the user.
100
67
 
101
- ## Verification Checklist
68
+ ## Verification
102
69
 
103
70
  Before claiming a bug is fixed, confirm:
104
71
 
@@ -110,9 +77,9 @@ Before claiming a bug is fixed, confirm:
110
77
  - [ ] The full test suite passes (no regressions)
111
78
  - [ ] The original error no longer occurs when running the original steps
112
79
 
113
- ## Debugging in MAXSIM Context
80
+ ## MAXSIM Integration
114
81
 
115
82
  When debugging during plan execution, MAXSIM deviation rules apply:
116
83
  - **Rule 1 (Auto-fix bugs):** You may auto-fix bugs found during execution, but you must still follow this debugging process.
117
- - **Rule 4 (Architectural changes):** If 3+ fix attempts fail, STOP and return a checkpoint — this is an architectural decision for the user.
84
+ - **Rule 4 (Architectural changes):** If 3+ fix attempts fail, STOP and return a checkpoint -- this is an architectural decision for the user.
118
85
  - Track all debugging deviations for SUMMARY.md documentation.
@@ -1,93 +1,70 @@
1
1
  ---
2
2
  name: tdd
3
- description: Use when implementing any feature or bug fix — requires writing a failing test before any implementation code
3
+ description: >-
4
+ Enforces test-driven development with the Red-Green-Refactor cycle: write a
5
+ failing test first, implement minimal code to pass, then refactor. Use when
6
+ implementing features, fixing bugs, or adding new behavior.
4
7
  ---
5
8
 
6
9
  # Test-Driven Development (TDD)
7
10
 
8
11
  Write the test first. Watch it fail. Write minimal code to pass. Clean up.
9
12
 
10
- **If you did not watch the test fail, you do not know if it tests the right thing.**
13
+ **HARD GATE: No implementation code without a failing test first. If you wrote production code before the test, delete it and start over. No exceptions.**
11
14
 
12
- ## The Iron Law
15
+ ## Process
13
16
 
14
- <HARD-GATE>
15
- NO IMPLEMENTATION CODE WITHOUT A FAILING TEST FIRST.
16
- If you wrote production code before the test, DELETE IT. Start over.
17
- No exceptions. No "I'll add tests after." No "keep as reference."
18
- Violating this rule is a violation — not a judgment call.
19
- </HARD-GATE>
17
+ ### 1. RED -- Write One Failing Test
20
18
 
21
- ## The Gate Function
22
-
23
- Follow this cycle for every behavior change, feature addition, or bug fix.
24
-
25
- ### 1. RED — Write Failing Test
26
-
27
- - Write ONE minimal test that describes the desired behavior
19
+ - Write ONE minimal test describing the desired behavior
28
20
  - Test name describes what SHOULD happen, not implementation details
29
- - Use real code paths — mocks only when unavoidable (external APIs, databases)
21
+ - Use real code paths -- mocks only when unavoidable (external APIs, databases)
30
22
 
31
- ### 2. VERIFY RED — Run the Test
23
+ ### 2. VERIFY RED -- Run the Test
32
24
 
33
- ```bash
34
- # Run the test suite for this file
35
- npx vitest run path/to/test.test.ts
36
- ```
37
-
38
- - Test MUST fail (not error — fail with an assertion)
25
+ - Test MUST fail with an assertion (not error out from syntax or imports)
39
26
  - Failure message must match the missing behavior
40
- - If test passes immediately: you are testing existing behavior — rewrite it
27
+ - If the test passes immediately, you are testing existing behavior -- rewrite it
41
28
 
42
- ### 3. GREEN — Write Minimal Code
29
+ ### 3. GREEN -- Write Minimal Code
43
30
 
44
31
  - Write the SIMPLEST code that makes the test pass
45
32
  - Do NOT add features the test does not require
46
- - Do NOT refactor yet — that comes next
47
-
48
- ### 4. VERIFY GREEN — Run All Tests
33
+ - Do NOT refactor yet
49
34
 
50
- ```bash
51
- npx vitest run
52
- ```
35
+ ### 4. VERIFY GREEN -- Run All Tests
53
36
 
54
37
  - The new test MUST pass
55
38
  - ALL existing tests MUST still pass
56
- - If any test fails: fix code, not tests
39
+ - If any test fails, fix code -- not tests
57
40
 
58
- ### 5. REFACTOR — Clean Up (Tests Still Green)
41
+ ### 5. REFACTOR -- Clean Up (Tests Still Green)
59
42
 
60
43
  - Remove duplication, improve names, extract helpers
61
- - Run tests after every change — they must stay green
44
+ - Run tests after every change
62
45
  - Do NOT add new behavior during refactor
63
46
 
64
- ### 6. REPEAT — Next failing test for next behavior
47
+ ### 6. REPEAT -- Next failing test for next behavior
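
One pass through the cycle above can be sketched with Python's standard `unittest` module. The `slugify` function and its expected behavior are hypothetical. In a real cycle the test class is written and run first (RED), and the function body is added only after the assertion has been seen to fail.

```python
import unittest


# GREEN: the simplest implementation that satisfies the test below.
# Written only after the test had been run and seen failing.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


# RED: written first; the name describes the behavior, not the implementation.
class SlugifyTest(unittest.TestCase):
    def test_spaces_become_single_hyphens(self):
        self.assertEqual(slugify("Hello  Wide World"), "hello-wide-world")
```

The next behavior, for example stripping punctuation, would start with a new failing test and not with a change to `slugify`.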
65
48
 
66
- ## Common Rationalizations — REJECT THESE
49
+ ## Common Pitfalls
67
50
 
68
- | Excuse | Why It Violates the Rule |
69
- |--------|--------------------------|
70
- | "Too simple to test" | Simple code breaks. The test takes 30 seconds to write. |
71
- | "I'll add tests after" | Tests written after pass immediately — they prove nothing. |
72
- | "The test framework isn't set up yet" | Set it up. That is part of the task, not a reason to skip. |
51
+ | Excuse | Why it fails |
52
+ |--------|-------------|
53
+ | "Too simple to test" | Simple code breaks. The test takes 30 seconds. |
54
+ | "I'll add tests after" | Tests written after pass immediately -- they prove nothing. |
73
55
  | "I know the code works" | Knowledge is not evidence. A passing test is evidence. |
74
- | "TDD is slower for this task" | TDD is faster than debugging. Every "quick skip" creates debt. |
56
+ | "TDD is slower" | TDD is faster than debugging. Every skip creates debt. |
75
57
  | "Let me keep the code as reference" | You will adapt it instead of writing test-first. Delete means delete. |
76
- | "I need to explore the design first" | Explore, then throw it away. Start implementation with TDD. |
77
58
 
78
- ## Red Flags — STOP If You Catch Yourself:
59
+ Stop immediately if you catch yourself:
79
60
 
80
61
  - Writing implementation code before writing a test
81
- - Writing a test that passes on the first run (you are testing existing behavior)
82
- - Skipping the VERIFY RED step ("I know it will fail")
62
+ - Writing a test that passes on the first run
63
+ - Skipping the VERIFY RED step
83
64
  - Adding features beyond what the current test requires
84
- - Skipping the REFACTOR step to save time
85
- - Rationalizing "just this once" or "this is different"
86
- - Keeping pre-TDD code "as reference" while writing tests
65
+ - Keeping pre-TDD code "as reference"
87
66
 
88
- **If any red flag triggers: STOP. Delete the implementation. Write the test first.**
89
-
90
- ## Verification Checklist
67
+ ## Verification
91
68
 
92
69
  Before claiming TDD compliance, confirm:
93
70
 
@@ -99,20 +76,10 @@ Before claiming TDD compliance, confirm:
99
76
  - [ ] All tests pass after implementation
100
77
  - [ ] Refactoring (if any) did not break any tests
101
78
 
102
- Cannot check all boxes? You skipped TDD. Start over.
103
-
104
- ## When Stuck
105
-
106
- | Problem | Solution |
107
- |---------|----------|
108
- | Don't know how to test it | Write the assertion first. What should the output be? |
109
- | Test setup is too complex | The design is too complex. Simplify the interface. |
110
- | Must mock everything | Code is too coupled. Use dependency injection. |
111
- | Existing code has no tests | Add tests for the code you are changing. Start the cycle now. |
112
-
113
- ## Integration with MAXSIM
79
+ ## MAXSIM Integration
114
80
 
115
81
  In MAXSIM plan execution, tasks marked `tdd="true"` follow this cycle with per-step commits:
82
+
116
83
  - **RED commit:** `test({phase}-{plan}): add failing test for [feature]`
117
84
  - **GREEN commit:** `feat({phase}-{plan}): implement [feature]`
118
85
  - **REFACTOR commit (if changes made):** `refactor({phase}-{plan}): clean up [feature]`
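
The per-step commit subjects above can be generated with a small helper like the one below. This is only a sketch: the function name `tdd_commit_subject` and the example values for phase and plan are assumptions. The templates are copied from the list above, with the `[feature]` placeholder filled in.

```python
# Templates taken from the commit conventions listed above.
TEMPLATES = {
    "red": "test({phase}-{plan}): add failing test for {feature}",
    "green": "feat({phase}-{plan}): implement {feature}",
    "refactor": "refactor({phase}-{plan}): clean up {feature}",
}


def tdd_commit_subject(step: str, phase: str, plan: str, feature: str) -> str:
    """Build the commit subject for one TDD step (hypothetical helper)."""
    return TEMPLATES[step].format(phase=phase, plan=plan, feature=feature)
```

For example, `tdd_commit_subject("red", "02", "01", "login")` yields `test(02-01): add failing test for login`; the actual format of the phase and plan identifiers depends on the MAXSIM plan in use.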