thought-cabinet 0.1.11 → 0.1.12

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -24,7 +24,7 @@ Thought Cabinet solves these by providing:
 cd your-project
 
 # 1. Install
-npm install -g thought-cabinet
+pnpm install -g thought-cabinet
 
 # 2. Initialize thoughts in your project
 thc init
@@ -33,7 +33,7 @@ thc init
 thc agent init
 
 # 4. Use skills in your agent session (e.g. Claude Code)
-> /researching-codebase How does the authentication system work?
+> /research-codebase How does the authentication system work?
 > /creating-plan Add OAuth2 support based on the research
 > /implementing-plan thoughts/shared/plans/add-oauth.md
 > /validating-plan thoughts/shared/plans/add-oauth.md
@@ -43,14 +43,14 @@ thc agent init
 
 Skills are installed by `thc agent init` and invoked as slash commands in your agent session:
 
-| Skill                   | Description                                                           |
-| ----------------------- | --------------------------------------------------------------------- |
-| `/researching-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
-| `/creating-plan`        | Create implementation plan with phases and success criteria           |
-| `/iterating-plan`       | Refine existing plans based on feedback                               |
-| `/implementing-plan`    | Execute plan phase-by-phase with verification                         |
-| `/validating-plan`      | Verify implementation against plan's success criteria                 |
-| `/commit`               | Create git commits with clear, descriptive messages                   |
+| Skill                | Description                                                           |
+| -------------------- | --------------------------------------------------------------------- |
+| `/research-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
+| `/creating-plan`     | Create implementation plan with phases and success criteria           |
+| `/iterating-plan`    | Refine existing plans based on feedback                               |
+| `/implementing-plan` | Execute plan phase-by-phase with verification                         |
+| `/validating-plan`   | Verify implementation against plan's success criteria                 |
+| `/commit`            | Create git commits with clear, descriptive messages                   |
 
 **Typical workflow**: research the codebase to build understanding, create a plan, iterate until the plan is solid, implement it, then validate the result.
 
package/docs/CLI.md CHANGED
@@ -104,14 +104,14 @@ thc agent init --force # Overwrite existing installations
 
 #### Installed Skills
 
-| Skill                | Slash Command           | Description                                                           |
-| -------------------- | ----------------------- | --------------------------------------------------------------------- |
-| researching-codebase | `/researching-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
-| creating-plan        | `/creating-plan`        | Create implementation plan with phases and success criteria           |
-| iterating-plan       | `/iterating-plan`       | Refine existing plans based on feedback                               |
-| implementing-plan    | `/implementing-plan`    | Execute plan phase-by-phase with verification                         |
-| validating-plan      | `/validating-plan`      | Verify implementation against plan's success criteria                 |
-| commit               | `/commit`               | Create git commits with clear, descriptive messages                   |
+| Skill             | Slash Command        | Description                                                           |
+| ----------------- | -------------------- | --------------------------------------------------------------------- |
+| research-codebase | `/research-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
+| creating-plan     | `/creating-plan`     | Create implementation plan with phases and success criteria           |
+| iterating-plan    | `/iterating-plan`    | Refine existing plans based on feedback                               |
+| implementing-plan | `/implementing-plan` | Execute plan phase-by-phase with verification                         |
+| validating-plan   | `/validating-plan`   | Verify implementation against plan's success criteria                 |
+| commit            | `/commit`            | Create git commits with clear, descriptive messages                   |
 
 #### Installed Agents
 
package/docs/WORKTREES.md CHANGED
@@ -16,7 +16,7 @@ cd your-project
 claude
 
 # Research the codebase
-> /researching-codebase
+> /research-codebase
 > How does the authentication system work?
 
 # Create an implementation plan
@@ -75,7 +75,7 @@ thc worktree merge add-oauth
 ```
 Main Branch Worktree (parallel)
 
-├── /researching-codebase
+├── /research-codebase
 │   └── writes to thoughts/shared/research/
 
 ├── /creating-plan
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "thought-cabinet",
-  "version": "0.1.11",
+  "version": "0.1.12",
   "description": "Thought Cabinet (thc) — CLI for structured AI coding workflows with filesystem-based memory and context management.",
   "type": "module",
   "main": "dist/index.js",
@@ -21,9 +21,9 @@
     "lint": "eslint . --format stylish",
     "format": "prettier --write .",
     "format:check": "prettier --check .",
-    "prepublishOnly": "npm run clean && npm run build",
+    "prepublishOnly": "pnpm run clean && pnpm run build",
     "test": "vitest run --passWithNoTests",
-    "check": "npm run format:check && npm run lint && npm run test && npm run build",
+    "check": "pnpm run format:check && pnpm run lint && pnpm run test && pnpm run build",
     "clean": "rm -rf dist/"
   },
   "dependencies": {
@@ -31,7 +31,8 @@
     "chalk": "^5.4.1",
     "commander": "^14.0.0",
     "dotenv": "^16.5.0",
-    "tabtab": "^3.0.2"
+    "tabtab": "^3.0.2",
+    "thought-cabinet": "link:"
   },
   "devDependencies": {
     "@changesets/cli": "^2.29.8",
@@ -50,15 +51,11 @@
   "engines": {
     "node": "^20.0.0 || >=22.0.0"
   },
-  "overrides": {
-    "tabtab": {
-      "inquirer": "^10.0.0"
-    },
-    "external-editor": {
-      "tmp": "^0.2.4"
-    },
-    "@typescript-eslint/typescript-estree": {
-      "minimatch": "^10.2.1"
+  "pnpm": {
+    "overrides": {
+      "tabtab>inquirer": "^10.0.0",
+      "external-editor>tmp": "^0.2.4",
+      "@typescript-eslint/typescript-estree>minimatch": "^10.2.1"
     }
   },
   "keywords": [
@@ -75,5 +72,6 @@
     "type": "git",
     "url": "https://github.com/sanbaiw/thought-cabinet"
   },
-  "license": "Apache-2.0"
+  "license": "Apache-2.0",
+  "packageManager": "pnpm@10.30.1"
 }
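The overrides migration in this hunk moves from npm's nested per-package objects at the top-level `overrides` key to pnpm's flat `parent>child` selector form nested under a `pnpm` key. A minimal sketch of the resulting shape, using one of the packages from the diff (not a complete manifest):

```json
{
  "pnpm": {
    "overrides": {
      "tabtab>inquirer": "^10.0.0"
    }
  }
}
```

The `>` selector scopes the override to `inquirer` only where it is a dependency of `tabtab`, which is the same effect the removed npm form achieved by nesting `"inquirer"` inside a `"tabtab"` object.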
@@ -1,22 +1,31 @@
 ---
 name: creating-plan
-description: Create detailed implementation plans through interactive research and iteration. Use when planning new features or changes, or creating technical specifications.
+description: Create detailed implementation plans through interactive research and iteration. Use when planning new features, designing changes, writing technical specs.
 ---
 
 # Creating Implementation Plans
 
 Create detailed implementation plans through an interactive, iterative process. Be skeptical, thorough, and work collaboratively to produce high-quality technical specifications.
 
+## Workflow Context
+
+This skill produces the plan document consumed by downstream skills:
+
+1. **creating-plan** (this skill) — Research, design, write the plan
+2. `implementing-plan` — Execute the plan phase-by-phase, running build/lint/test after each phase
+3. `validating-plan` — Audit the implementation against the plan
+
+The plan file at `thoughts/shared/plans/YYYY-MM-DD-description.md` is the contract between these skills. Write success criteria knowing that `implementing-plan` will run the automated verification commands literally.
+
 ## Workflow Overview
 
-1. **Gather context** - Read provided files, research codebase
-2. **Ask clarifying questions** - Only what research couldn't answer
-3. **Discover and propose options** - Present design choices with tradeoffs
-4. **Structure the plan** - Get approval on phases before detailing
-5. **Write the plan**
-6. **Iterate** - Refine until user is satisfied
+1. **Gather context & clarify** - Research codebase, present understanding, ask only what research couldn't answer
+2. **Research & propose options** - Deeper investigation based on user input, present design choices with tradeoffs
+3. **Structure the plan** - Get approval on phases before detailing
+4. **Write the plan** - Detailed, actionable plan following the template
+5. **Iterate** - Refine until user is satisfied
 
-## Step 1: Context Gathering
+## Step 1: Gather Context & Clarify
 
 ### If Parameters Provided
 
@@ -56,14 +65,19 @@ Questions that my research couldn't answer:
 
 Only ask questions you genuinely cannot answer through code investigation.
 
-## Step 2: Research & Discovery
+## Step 2: Research & Propose Options
 
-If the user corrects any misunderstanding:
+After the user responds to Step 1 (whether confirming understanding, answering questions, or correcting misunderstandings):
+
+**If the user corrects any misunderstanding:**
 1. DO NOT just accept the correction
 2. Spawn new research tasks to verify
 3. Read the specific files/directories mentioned
 4. Only proceed once verified
 
+**If the user confirms understanding or provides answers:**
+Spawn deeper research tasks informed by the user's input to explore the solution space.
+
 ### Spawn Parallel Research Tasks
 
 Use the right agent for each type:
@@ -77,7 +91,7 @@ Use the right agent for each type:
 - `thoughts-locator` - Find research, plans, or decisions
 - `thoughts-analyzer` - Extract insights from relevant documents
 
-Wait for ALL sub-tasks to complete before proceeding.
+Wait for ALL tasks to complete before proceeding.
 
 ### Present Findings
 
@@ -101,7 +115,7 @@ Which approach aligns best with your vision?
 
 ## Step 3: Plan Structure
 
-Once aligned on approach:
+Once aligned on approach, propose the phase structure. Each phase MUST be independently verifiable — see [Phase Independence](#phase-independence) below.
 
 ```
 Here's my proposed plan structure:
@@ -126,6 +140,7 @@ After structure approval:
 1. **Determine file path**: `thoughts/shared/plans/YYYY-MM-DD-description.md`
    - YYYY-MM-DD: today's date
    - description: brief kebab-case summary
+   - If a file already exists at this path, append a numeric suffix (e.g. `-2`) or ask the user
 
 2. **Write plan** using [plan-template.md](plan-template.md)
    - **MUST** Read the template and follow the structure exactly.
@@ -135,7 +150,7 @@ After structure approval:
   thoughtcabinet sync -m "Plan: <description>"
   ```
 
-## Step 5: Review & Iterate
+## Step 5: Iterate
 
 Present the draft location:
 
@@ -152,35 +167,9 @@ Please review it and let me know:
 
 Iterate until the user is satisfied.
 
-## Guidelines
-
-### Be Skeptical
-- Question vague requirements
-- Identify potential issues early
-- Ask "why" and "what about"
-- Don't assume - verify with code
-
-### Be Interactive
-- Don't write the full plan in one shot
-- Get buy-in at each major step
-- Allow course corrections
-- Work collaboratively
-
-### Be Thorough
-- Read all context files COMPLETELY
-- Research actual code patterns using parallel sub-tasks
-- Include specific file paths and line numbers
-- Write measurable success criteria
-
-### Be Practical
-- Focus on incremental, testable changes
-- Consider migration and rollback
-- Think about edge cases
-- Include "what we're NOT doing"
-
-### Phase Independence
+## Phase Independence
 
-Each phase must be independently verifiable. The implementing-plan workflow runs build/lint/test and pauses for manual verification after each phase, so phases cannot have circular dependencies.
+Each phase MUST be independently verifiable. `implementing-plan` runs build/lint/test and pauses for manual verification after each phase, so phases cannot have circular dependencies.
 
 **Requirements:**
 - Code must compile/build after completing each phase alone
@@ -188,14 +177,14 @@ Each phase must be independently verifiable. The implementing-plan workflow runs
 - Success criteria should be testable without implementing later phases
 - Ask: "Can I run build/lint/test and pause for manual verification after this phase alone?"
 
-**Example of a BAD phase structure:**
+**BAD phase structure:**
 ```
 Phase 1: Create command that imports handler
 Phase 2: Create handler module
 ```
 Problem: Phase 1 won't compile until Phase 2 is done.
 
-**Example of a GOOD phase structure:**
+**GOOD phase structure:**
 ```
 Phase 1: Create handler module with core logic
 Phase 2: Create command that imports and uses handler
@@ -209,14 +198,40 @@ Phase 2: Implement handler logic
 ```
 Both phases compile; Phase 1 has minimal but working functionality.
 
+## Guidelines
+
+### Be Skeptical
+- Question vague requirements
+- Identify potential issues early
+- Ask "why" and "what about"
+- Don't assume - verify with code
+
+### Be Interactive
+- Don't write the full plan in one shot
+- Get buy-in at each major step
+- Allow course corrections
+- Work collaboratively
+
+### Be Thorough
+- Read all context files COMPLETELY
+- Research actual code patterns using parallel tasks
+- Include specific file paths and line numbers
+- Write measurable success criteria
+
+### Be Practical
+- Focus on incremental, testable changes
+- Consider migration and rollback
+- Think about edge cases
+- Include "what we're NOT doing"
+
 ### No Open Questions in Final Plan
 - If you encounter open questions, STOP
 - Research or ask for clarification immediately
 - The implementation plan must be complete and actionable
 
-## Sub-task Best Practices
+## Research Task Best Practices
 
-When spawning research sub-tasks:
+When spawning research tasks:
 
 1. **Spawn multiple tasks in parallel** for efficiency
 2. **Each task should be focused** on a specific area
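The GOOD phase structure above can be made concrete with a hypothetical two-part sketch (`formatGreeting` and `runGreetCommand` are invented names for illustration): Phase 1 ships the handler's core logic alone, which compiles and is testable by itself; Phase 2 later adds the command that imports and uses it.

```typescript
// Phase 1: handler module with core logic.
// Compiles on its own; build/lint/test can run after this phase alone.
function formatGreeting(name: string): string {
  return `Hello, ${name}!`;
}

// Phase 2 (later): the command that wires the handler to CLI arguments.
// Shown inline here; in a real plan it would land in its own file and phase.
function runGreetCommand(args: string[]): string {
  const name = args[0] ?? 'world';
  return formatGreeting(name);
}
```

Note the dependency direction: the command depends on the handler, never the reverse, so neither phase is blocked on the other compiling.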
@@ -50,6 +50,7 @@
 
 **File**: `path/to/file.ext`
 **Changes**: [Summary of changes]
+**Testable behaviors**: [List the behaviors this change introduces or modifies — these become TDD RED tests during implementation]
 
 ```[language]
 // Specific code to add/modify
@@ -1,12 +1,37 @@
 ---
 name: implementing-plan
-description: Implement technical plans from thoughts/shared/plans with verification. Use when executing approved implementation plans, or resuming work on partially completed plans.
+description: Implement technical plans from thoughts/shared/plans with verification. Use when executing approved implementation plans, resuming partially completed plans, or when the user mentions execute plan or resume plan.
 ---
 
 # Implementing Plans
 
 Execute approved technical plans from `thoughts/shared/plans/` with verification at each phase.
 
+## Workflow Context
+
+This skill executes plans produced by `creating-plan`:
+
+1. `creating-plan` — Research, design, write the plan
+2. **implementing-plan** (this skill) — Execute phase-by-phase with verification
+3. `validating-plan` — Audit the implementation against the plan
+
+The plan file at `thoughts/shared/plans/` is the contract. Success criteria in the plan are executed literally — automated verification commands are run as written.
+
+### Test-Driven Implementation
+
+**MANDATORY**: Apply the `test-driven-development` skill's RED-GREEN-REFACTOR cycle for every unit of production code written within a phase.
+
+The procedure for each unit of work:
+
+1. Write a failing test (RED)
+2. Write minimal code to pass (GREEN)
+3. Refactor while keeping tests green
+4. Repeat for the next behavior
+
+After all TDD cycles in the phase are complete, run the phase's automated verification commands as the final gate.
+
+**Resolving conflicts with the plan**: If the plan says "no tests needed", evaluate independently — apply TDD unless genuinely untestable (pure wiring, no behavioral logic). Document any skip with a reason in the phase completion message.
+
 ## Getting Started
 
 When given a plan path:
@@ -52,14 +77,22 @@ Why this matters: [explanation]
 How should I proceed?
 ```
 
+## Phase Implementation Workflow
+
+Before writing any production code for a phase:
+
+1. Identify the testable behaviors the phase introduces or changes
+2. Apply the `test-driven-development` RED-GREEN-REFACTOR cycle for each behavior
+3. Only after all TDD cycles are complete, proceed to the completion checklist below
+
 ## Phase Completion Checklist
 
-After implementing a phase, follow this checklist **in order**:
+After implementing a phase (all TDD cycles done), follow this checklist **in order**:
 
 1. Run automated success criteria checks (compile, tests, etc.)
 2. Fix any issues found
 3. Update checkboxes in the plan file for completed automated verification items
-4. Update progress in todos (TodoWrite)
+4. Update progress in todo list
 5. **STOP** and present the verification message (see below)
 6. **WAIT** for user confirmation before starting next phase
 
@@ -102,7 +135,7 @@ When something isn't working as expected:
 2. Consider if the codebase evolved since the plan was written
 3. Present the mismatch clearly and ask for guidance
 
-Use sub-tasks sparingly - mainly for targeted debugging or exploring unfamiliar territory.
+Use tasks sparingly - mainly for targeted debugging or exploring unfamiliar territory.
 
 ## Guidelines
 
@@ -1,6 +1,6 @@
 ---
-name: researching-codebase
-description: Document codebase as-is with thoughts directory for historical context. Use when exploring how codebase features work, or understanding component interactions, or creating technical documentation of existing systems.
+name: research-codebase
+description: Research and understand how the codebase works, then document findings in thoughts directory. Use when investigating specific code flows or workflows, exploring how features are implemented, understanding component interactions, tracing initialization or lifecycle processes, or answering "how does X work" questions about the codebase.
 ---
 
 # Research Codebase
@@ -0,0 +1,201 @@
+---
+name: test-driven-development
+description: Write tests before implementation code using red-green-refactor. Use when implementing features, fixing bugs, or when the user mentions TDD, test-first, or test-driven.
+---
+
+# Test-Driven Development
+
+Write the test first. Watch it fail. Write minimal code to pass.
+
+**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
+
+## Workflow Context
+
+This skill integrates with the plan-based workflow:
+
+- `implementing-plan` executes phases that include success criteria with test commands
+- **test-driven-development** (this skill) governs **how** code within each phase gets written: test-first
+
+When implementing a plan phase, apply TDD to each unit of work within that phase. The phase's automated verification commands are the final check, not a substitute for test-first development.
+
+## Workflow Overview
+
+1. **Discover test infrastructure** - Find test runner, patterns, conventions
+2. **RED** - Write one failing test
+3. **Verify RED** - Run it, confirm correct failure
+4. **GREEN** - Write minimal code to pass
+5. **Verify GREEN** - Run it, confirm all tests pass
+6. **REFACTOR** - Clean up, keep tests green
+7. **Repeat** - Next behavior, next failing test
+
+## Step 1: Discover Test Infrastructure
+
+Before writing any test, research the project's testing setup:
+
+```
+Tasks to spawn concurrently:
+- codebase-locator: Find test files near the code being changed
+- codebase-pattern-finder: Find test patterns (describe/it, test runner config, assertion style)
+```
+
+Identify:
+- Test runner and command (e.g. `npm test`, `pytest`, `make test`)
+- File naming convention (e.g. `*.test.ts`, `*_test.go`, `test_*.py`)
+- Assertion style and test structure used in existing tests
+- Any test utilities or fixtures in the project
+
+Follow existing conventions exactly. Do not introduce new test libraries or patterns unless user explicitly asks for it.
+
+## Step 2: RED - Write Failing Test
+
+Write one minimal test for one behavior.
+
+<Good>
+```typescript
+test('rejects empty email', async () => {
+  const result = await submitForm({ email: '' });
+  expect(result.error).toBe('Email required');
+});
+```
+Clear name describing behavior, tests one thing, uses real code.
+</Good>
+
+<Bad>
+```typescript
+test('works', async () => {
+  const mock = jest.fn().mockResolvedValueOnce('ok');
+  await submitForm(mock);
+  expect(mock).toHaveBeenCalledTimes(1);
+});
+```
+Vague name, tests mock not behavior.
+</Bad>
+
+**Requirements:**
+- One behavior per test
+- Name describes the expected behavior
+- Use real code paths (mocks only when unavoidable — external APIs, databases)
+
+## Step 3: Verify RED - Watch It Fail
+
+**MANDATORY. Never skip.**
+
+Run the test and confirm:
+- Test **fails** (not errors from syntax/import issues)
+- Failure message matches what you expect
+- Fails because the feature is missing, not because of a typo
+
+**Test passes immediately?** You're testing existing behavior. Rewrite the test.
+
+**Test errors instead of failing?** Fix the error first, re-run until it fails correctly.
+
+## Step 4: GREEN - Minimal Code
+
+Write the simplest code that makes the test pass.
+
+<Good>
+```typescript
+async function retryOperation<T>(fn: () => T | Promise<T>): Promise<T> {
+  for (let i = 0; i < 3; i++) {
+    try { return await fn(); }
+    catch (e) { if (i === 2) throw e; }
+  }
+  throw new Error('unreachable');
+}
+```
+Just enough to pass.
+</Good>
+
+<Bad>
+```typescript
+async function retryOperation<T>(
+  fn: () => T | Promise<T>,
+  options?: { maxRetries?: number; backoff?: 'linear' | 'exponential'; onRetry?: (n: number) => void }
+): Promise<T> { /* ... */ }
+```
+Over-engineered beyond what the test requires.
+</Bad>
+
+Do not add features, refactor surrounding code, or "improve" beyond the test.
+
+## Step 5: Verify GREEN - Watch It Pass
+
+**MANDATORY.**
+
+Run the test and confirm:
+- The new test passes
+- All existing tests still pass
+- No errors or warnings in output
+
+**New test fails?** Fix the implementation, not the test.
+
+**Other tests break?** Fix them now before proceeding.
+
+## Step 6: REFACTOR - Clean Up
+
+Only after green:
+- Remove duplication
+- Improve names
+- Extract helpers if warranted
+
+Run tests after each refactoring change. Stay green throughout.
+
+Do not add new behavior during refactoring.
+
+## Step 7: Repeat
+
+Return to Step 2 for the next behavior. Each cycle adds one tested behavior.
+
+## The Iron Law
+
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+
+Wrote code before the test? Delete it. Start over from Step 2.
+
+- Do not keep it as "reference"
+- Do not "adapt" it while writing tests
+- Delete means delete
+
+## Good Tests
+
+| Quality | Good | Bad |
+|---------|------|-----|
+| **Minimal** | One behavior | `test('validates email and domain and length')` |
+| **Clear** | Name describes expected behavior | `test('test1')` |
+| **Real** | Tests actual code paths | Tests mock return values |
+| **Focused** | Asserts on outcome | Asserts on internal implementation |
+
+## Red Flags - STOP and Start Over
+
+- Wrote production code before a test
+- Test passes immediately on first run
+- Can't explain why the test failed
+- Rationalizing "just this once"
+- "I already manually tested it"
+- Keeping pre-written code as "reference"
+
+All of these mean: delete the code, start over with a failing test.
+
+## When Stuck
+
+| Problem | Solution |
+|---------|----------|
+| Don't know how to test | Write the API you wish existed. Write the assertion first. Ask the user. |
+| Test too complicated | Design too complicated. Simplify the interface. |
+| Must mock everything | Code too coupled. Use dependency injection. |
+| Test setup huge | Extract helpers. Still complex? Simplify design. |
+
+## Integration with Plan Phases
+
+When working within a plan phase:
+
+1. Read the phase's success criteria
+2. For each unit of work in the phase, apply the RED-GREEN-REFACTOR cycle
+3. After all units are complete, run the phase's automated verification commands
+4. Follow `implementing-plan`'s Phase Completion Checklist: update checkboxes, present the verification message, wait for user confirmation
+
+The phase's automated verification is the final gate. TDD cycles happen within that gate, not instead of it.
+
+**When the plan says tests aren't needed**: Evaluate independently — apply TDD unless genuinely untestable (pure wiring, no behavioral logic). Document any skip with a reason in the phase completion message.
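To make the skill's RED step concrete, here is a sketch of the kind of behavior-pinning test one might write for the `retryOperation` example shown in the GREEN step, using plain assertions rather than any particular runner (the actual runner would come from Step 1's infrastructure discovery; the test name is invented for illustration):

```typescript
// Implementation under test, mirroring the GREEN-step example:
// try up to 3 times, rethrowing only the final failure.
async function retryOperation<T>(fn: () => T | Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try { return await fn(); }
    catch (e) { if (i === 2) throw e; }
  }
  throw new Error('unreachable');
}

// RED-style test: one behavior — an operation that fails twice
// then succeeds should resolve on the third attempt.
async function testRetriesUntilThirdAttempt(): Promise<void> {
  let attempts = 0;
  const result = await retryOperation(() => {
    attempts++;
    if (attempts < 3) throw new Error('flaky');
    return 'ok';
  });
  if (result !== 'ok') throw new Error(`expected 'ok', got ${result}`);
  if (attempts !== 3) throw new Error(`expected 3 attempts, got ${attempts}`);
}

testRetriesUntilThirdAttempt();
```

Written before `retryOperation` exists, this test fails for the right reason (the function is missing), which is exactly what the Verify RED step checks.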