thought-cabinet 0.1.11 → 0.1.12
- package/README.md +10 -10
- package/docs/CLI.md +8 -8
- package/docs/WORKTREES.md +2 -2
- package/package.json +12 -14
- package/src/agent-assets/skills/creating-plan/SKILL.md +60 -45
- package/src/agent-assets/skills/creating-plan/plan-template.md +1 -0
- package/src/agent-assets/skills/implementing-plan/SKILL.md +37 -4
- package/src/agent-assets/skills/{researching-codebase → research-codebase}/SKILL.md +2 -2
- package/src/agent-assets/skills/test-driven-development/SKILL.md +201 -0
- package/src/agent-assets/skills/{researching-codebase → research-codebase}/research-template.md +0 -0
package/README.md CHANGED

@@ -24,7 +24,7 @@ Thought Cabinet solves these by providing:
 cd your-project
 
 # 1. Install
-
+pnpm install -g thought-cabinet
 
 # 2. Initialize thoughts in your project
 thc init
@@ -33,7 +33,7 @@ thc init
 thc agent init
 
 # 4. Use skills in your agent session (e.g. Claude Code)
-> /
+> /research-codebase How does the authentication system work?
 > /creating-plan Add OAuth2 support based on the research
 > /implementing-plan thoughts/shared/plans/add-oauth.md
 > /validating-plan thoughts/shared/plans/add-oauth.md
@@ -43,14 +43,14 @@ thc agent init
 
 Skills are installed by `thc agent init` and invoked as slash commands in your agent session:
 
-| Skill
-|
-| `/
-| `/creating-plan`
-| `/iterating-plan`
-| `/implementing-plan`
-| `/validating-plan`
-| `/commit`
+| Skill                | Description                                                           |
+| -------------------- | --------------------------------------------------------------------- |
+| `/research-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
+| `/creating-plan`     | Create implementation plan with phases and success criteria           |
+| `/iterating-plan`    | Refine existing plans based on feedback                               |
+| `/implementing-plan` | Execute plan phase-by-phase with verification                         |
+| `/validating-plan`   | Verify implementation against plan's success criteria                 |
+| `/commit`            | Create git commits with clear, descriptive messages                   |
 
 **Typical workflow**: research the codebase to build understanding, create a plan, iterate until the plan is solid, implement it, then validate the result.
 
package/docs/CLI.md CHANGED

@@ -104,14 +104,14 @@ thc agent init --force # Overwrite existing installations
 
 #### Installed Skills
 
-| Skill
-|
-|
-| creating-plan
-| iterating-plan
-| implementing-plan
-| validating-plan
-| commit
+| Skill             | Slash Command        | Description                                                           |
+| ----------------- | -------------------- | --------------------------------------------------------------------- |
+| research-codebase | `/research-codebase` | Deep-dive into codebase, save findings to `thoughts/shared/research/` |
+| creating-plan     | `/creating-plan`     | Create implementation plan with phases and success criteria           |
+| iterating-plan    | `/iterating-plan`    | Refine existing plans based on feedback                               |
+| implementing-plan | `/implementing-plan` | Execute plan phase-by-phase with verification                         |
+| validating-plan   | `/validating-plan`   | Verify implementation against plan's success criteria                 |
+| commit            | `/commit`            | Create git commits with clear, descriptive messages                   |
 
 #### Installed Agents
 
package/docs/WORKTREES.md CHANGED

@@ -16,7 +16,7 @@ cd your-project
 claude
 
 # Research the codebase
-> /
+> /research-codebase
 > How does the authentication system work?
 
 # Create an implementation plan
@@ -75,7 +75,7 @@ thc worktree merge add-oauth
 ```
 Main Branch                    Worktree (parallel)
 │
-├── /
+├── /research-codebase
 │   └── writes to thoughts/shared/research/
 │
 ├── /creating-plan
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "thought-cabinet",
-  "version": "0.1.
+  "version": "0.1.12",
   "description": "Thought Cabinet (thc) — CLI for structured AI coding workflows with filesystem-based memory and context management.",
   "type": "module",
   "main": "dist/index.js",
@@ -21,9 +21,9 @@
     "lint": "eslint . --format stylish",
     "format": "prettier --write .",
     "format:check": "prettier --check .",
-    "prepublishOnly": "
+    "prepublishOnly": "pnpm run clean && pnpm run build",
     "test": "vitest run --passWithNoTests",
-    "check": "
+    "check": "pnpm run format:check && pnpm run lint && pnpm run test && pnpm run build",
     "clean": "rm -rf dist/"
   },
   "dependencies": {
@@ -31,7 +31,8 @@
     "chalk": "^5.4.1",
     "commander": "^14.0.0",
     "dotenv": "^16.5.0",
-    "tabtab": "^3.0.2"
+    "tabtab": "^3.0.2",
+    "thought-cabinet": "link:"
   },
   "devDependencies": {
     "@changesets/cli": "^2.29.8",
@@ -50,15 +51,11 @@
   "engines": {
     "node": "^20.0.0 || >=22.0.0"
   },
-  "
-  "
-      "inquirer": "^10.0.0"
-
-
-      "tmp": "^0.2.4"
-    },
-    "@typescript-eslint/typescript-estree": {
-      "minimatch": "^10.2.1"
+  "pnpm": {
+    "overrides": {
+      "tabtab>inquirer": "^10.0.0",
+      "external-editor>tmp": "^0.2.4",
+      "@typescript-eslint/typescript-estree>minimatch": "^10.2.1"
     }
   },
   "keywords": [
@@ -75,5 +72,6 @@
     "type": "git",
     "url": "https://github.com/sanbaiw/thought-cabinet"
   },
-  "license": "Apache-2.0"
+  "license": "Apache-2.0",
+  "packageManager": "pnpm@10.30.1"
 }
package/src/agent-assets/skills/creating-plan/SKILL.md CHANGED

@@ -1,22 +1,31 @@
 ---
 name: creating-plan
-description: Create detailed implementation plans through interactive research and iteration. Use when planning new features
+description: Create detailed implementation plans through interactive research and iteration. Use when planning new features, designing changes, writing technical specs.
 ---
 
 # Creating Implementation Plans
 
 Create detailed implementation plans through an interactive, iterative process. Be skeptical, thorough, and work collaboratively to produce high-quality technical specifications.
 
+## Workflow Context
+
+This skill produces the plan document consumed by downstream skills:
+
+1. **creating-plan** (this skill) — Research, design, write the plan
+2. `implementing-plan` — Execute the plan phase-by-phase, running build/lint/test after each phase
+3. `validating-plan` — Audit the implementation against the plan
+
+The plan file at `thoughts/shared/plans/YYYY-MM-DD-description.md` is the contract between these skills. Write success criteria knowing that `implementing-plan` will run the automated verification commands literally.
+
 ## Workflow Overview
 
-1. **Gather context** -
-2. **
-3. **
-4. **
-5. **
-6. **Iterate** - Refine until user is satisfied
+1. **Gather context & clarify** - Research codebase, present understanding, ask only what research couldn't answer
+2. **Research & propose options** - Deeper investigation based on user input, present design choices with tradeoffs
+3. **Structure the plan** - Get approval on phases before detailing
+4. **Write the plan** - Detailed, actionable plan following the template
+5. **Iterate** - Refine until user is satisfied
 
-## Step 1: Context
+## Step 1: Gather Context & Clarify
 
 ### If Parameters Provided
 
@@ -56,14 +65,19 @@ Questions that my research couldn't answer:
 
 Only ask questions you genuinely cannot answer through code investigation.
 
-## Step 2: Research &
+## Step 2: Research & Propose Options
 
-
+After the user responds to Step 1 (whether confirming understanding, answering questions, or correcting misunderstandings):
+
+**If the user corrects any misunderstanding:**
 1. DO NOT just accept the correction
 2. Spawn new research tasks to verify
 3. Read the specific files/directories mentioned
 4. Only proceed once verified
 
+**If the user confirms understanding or provides answers:**
+Spawn deeper research tasks informed by the user's input to explore the solution space.
+
 ### Spawn Parallel Research Tasks
 
 Use the right agent for each type:
@@ -77,7 +91,7 @@ Use the right agent for each type:
 - `thoughts-locator` - Find research, plans, or decisions
 - `thoughts-analyzer` - Extract insights from relevant documents
 
-Wait for ALL
+Wait for ALL tasks to complete before proceeding.
 
 ### Present Findings
 
@@ -101,7 +115,7 @@ Which approach aligns best with your vision?
 
 ## Step 3: Plan Structure
 
-Once aligned on approach
+Once aligned on approach, propose the phase structure. Each phase MUST be independently verifiable — see [Phase Independence](#phase-independence) below.
 
 ```
 Here's my proposed plan structure:
@@ -126,6 +140,7 @@ After structure approval:
 1. **Determine file path**: `thoughts/shared/plans/YYYY-MM-DD-description.md`
    - YYYY-MM-DD: today's date
    - description: brief kebab-case summary
+   - If a file already exists at this path, append a numeric suffix (e.g. `-2`) or ask the user
 
 2. **Write plan** using [plan-template.md](plan-template.md)
    - **MUST** Read the template and follow the structure exactly.
@@ -135,7 +150,7 @@ After structure approval:
 thoughtcabinet sync -m "Plan: <description>"
 ```
 
-## Step 5:
+## Step 5: Iterate
 
 Present the draft location:
 
@@ -152,35 +167,9 @@ Please review it and let me know:
 
 Iterate until the user is satisfied.
 
-##
-
-### Be Skeptical
-- Question vague requirements
-- Identify potential issues early
-- Ask "why" and "what about"
-- Don't assume - verify with code
-
-### Be Interactive
-- Don't write the full plan in one shot
-- Get buy-in at each major step
-- Allow course corrections
-- Work collaboratively
-
-### Be Thorough
-- Read all context files COMPLETELY
-- Research actual code patterns using parallel sub-tasks
-- Include specific file paths and line numbers
-- Write measurable success criteria
-
-### Be Practical
-- Focus on incremental, testable changes
-- Consider migration and rollback
-- Think about edge cases
-- Include "what we're NOT doing"
-
-### Phase Independence
+## Phase Independence
 
-Each phase
+Each phase MUST be independently verifiable. `implementing-plan` runs build/lint/test and pauses for manual verification after each phase, so phases cannot have circular dependencies.
 
 **Requirements:**
 - Code must compile/build after completing each phase alone
@@ -188,14 +177,14 @@ Each phase must be independently verifiable. The implementing-plan workflow runs
 - Success criteria should be testable without implementing later phases
 - Ask: "Can I run build/lint/test and pause for manual verification after this phase alone?"
 
-**
+**BAD phase structure:**
 ```
 Phase 1: Create command that imports handler
 Phase 2: Create handler module
 ```
 Problem: Phase 1 won't compile until Phase 2 is done.
 
-**
+**GOOD phase structure:**
 ```
 Phase 1: Create handler module with core logic
 Phase 2: Create command that imports and uses handler
@@ -209,14 +198,40 @@ Phase 2: Implement handler logic
 ```
 Both phases compile; Phase 1 has minimal but working functionality.
 
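To make the GOOD split concrete, here is a small self-contained TypeScript sketch; the names (`createHandler`, `runCommand`) are hypothetical illustrations, not code from this package. Phase 1 ships a handler module that compiles and works on its own; Phase 2 only wires it into a command.

```typescript
// --- Phase 1: handler module with core logic (hypothetical names) ---
type Handler = (input: string) => string;

function createHandler(): Handler {
  // Minimal but working behavior, verifiable without Phase 2.
  return (input) => input.trim().toLowerCase();
}

// --- Phase 2: command that imports and uses the handler ---
function runCommand(args: string[]): string {
  return createHandler()(args.join(' '));
}

console.log(createHandler()('  Hello '));       // "hello"
console.log(runCommand(['  Add', 'OAuth2  '])); // "add oauth2"
```

Because each phase leaves the code compiling with working behavior, build/lint/test can run after Phase 1 alone.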
+## Guidelines
+
+### Be Skeptical
+- Question vague requirements
+- Identify potential issues early
+- Ask "why" and "what about"
+- Don't assume - verify with code
+
+### Be Interactive
+- Don't write the full plan in one shot
+- Get buy-in at each major step
+- Allow course corrections
+- Work collaboratively
+
+### Be Thorough
+- Read all context files COMPLETELY
+- Research actual code patterns using parallel tasks
+- Include specific file paths and line numbers
+- Write measurable success criteria
+
+### Be Practical
+- Focus on incremental, testable changes
+- Consider migration and rollback
+- Think about edge cases
+- Include "what we're NOT doing"
+
 ### No Open Questions in Final Plan
 - If you encounter open questions, STOP
 - Research or ask for clarification immediately
 - The implementation plan must be complete and actionable
 
-##
+## Research Task Best Practices
 
-When spawning research
+When spawning research tasks:
 
 1. **Spawn multiple tasks in parallel** for efficiency
 2. **Each task should be focused** on a specific area
package/src/agent-assets/skills/implementing-plan/SKILL.md CHANGED

@@ -1,12 +1,37 @@
 ---
 name: implementing-plan
-description: Implement technical plans from thoughts/shared/plans with verification. Use when executing approved implementation plans, or
+description: Implement technical plans from thoughts/shared/plans with verification. Use when executing approved implementation plans, resuming partially completed plans, or when the user mentions execute plan or resume plan.
 ---
 
 # Implementing Plans
 
 Execute approved technical plans from `thoughts/shared/plans/` with verification at each phase.
 
+## Workflow Context
+
+This skill executes plans produced by `creating-plan`:
+
+1. `creating-plan` — Research, design, write the plan
+2. **implementing-plan** (this skill) — Execute phase-by-phase with verification
+3. `validating-plan` — Audit the implementation against the plan
+
+The plan file at `thoughts/shared/plans/` is the contract. Success criteria in the plan are executed literally — automated verification commands are run as written.
+
+### Test-Driven Implementation
+
+**MANDATORY**: Apply the `test-driven-development` skill's RED-GREEN-REFACTOR cycle for every unit of production code written within a phase.
+
+The procedure for each unit of work:
+
+1. Write a failing test (RED)
+2. Write minimal code to pass (GREEN)
+3. Refactor while keeping tests green
+4. Repeat for the next behavior
+
+After all TDD cycles in the phase are complete, run the phase's automated verification commands as the final gate.
+
+**Resolving conflicts with the plan**: If the plan says "no tests needed", evaluate independently — apply TDD unless genuinely untestable (pure wiring, no behavioral logic). Document any skip with a reason in the phase completion message.
+
 ## Getting Started
 
 When given a plan path:
@@ -52,14 +77,22 @@ Why this matters: [explanation]
 How should I proceed?
 ```
 
+## Phase Implementation Workflow
+
+Before writing any production code for a phase:
+
+1. Identify the testable behaviors the phase introduces or changes
+2. Apply the `test-driven-development` RED-GREEN-REFACTOR cycle for each behavior
+3. Only after all TDD cycles are complete, proceed to the completion checklist below
+
 ## Phase Completion Checklist
 
-After implementing a phase, follow this checklist **in order**:
+After implementing a phase (all TDD cycles done), follow this checklist **in order**:
 
 1. Run automated success criteria checks (compile, tests, etc.)
 2. Fix any issues found
 3. Update checkboxes in the plan file for completed automated verification items
-4. Update progress in
+4. Update progress in todo list
 5. **STOP** and present the verification message (see below)
 6. **WAIT** for user confirmation before starting next phase
 
@@ -102,7 +135,7 @@ When something isn't working as expected:
 2. Consider if the codebase evolved since the plan was written
 3. Present the mismatch clearly and ask for guidance
 
-Use
+Use tasks sparingly - mainly for targeted debugging or exploring unfamiliar territory.
 
 ## Guidelines
 
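The checklist's first step (run every automated check, stop at the first failure) can be pictured as a small gate. This is an illustrative TypeScript sketch, not code from the package; `runStep` is a stand-in for spawning the real commands (e.g. `pnpm run <script>`) named in a plan's success criteria.

```typescript
// Hypothetical phase gate: run each automated check in order, stop at the
// first failure. runStep stands in for shelling out to the real commands.
type StepResult = { step: string; ok: boolean };

function runStep(step: string): StepResult {
  // A real implementation would spawn e.g. `pnpm run <step>`;
  // this stand-in always succeeds.
  return { step, ok: true };
}

function phaseGate(steps: string[]): boolean {
  for (const step of steps) {
    if (!runStep(step).ok) {
      console.log(`phase gate FAILED at: ${step}`);
      return false;
    }
    console.log(`ok: ${step}`);
  }
  console.log('phase gate passed');
  return true;
}

phaseGate(['format:check', 'lint', 'test', 'build']);
```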
package/src/agent-assets/skills/{researching-codebase → research-codebase}/SKILL.md CHANGED

@@ -1,6 +1,6 @@
 ---
-name:
-description:
+name: research-codebase
+description: Research and understand how the codebase works, then document findings in thoughts directory. Use when investigating specific code flows or workflows, exploring how features are implemented, understanding component interactions, tracing initialization or lifecycle processes, or answering "how does X work" questions about the codebase.
 ---
 
 # Research Codebase
 
package/src/agent-assets/skills/test-driven-development/SKILL.md ADDED

@@ -0,0 +1,201 @@
+---
+name: test-driven-development
+description: Write tests before implementation code using red-green-refactor. Use when implementing features, fixing bugs, or when the user mentions TDD, test-first, or test-driven.
+---
+
+# Test-Driven Development
+
+Write the test first. Watch it fail. Write minimal code to pass.
+
+**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
+
+## Workflow Context
+
+This skill integrates with the plan-based workflow:
+
+- `implementing-plan` executes phases that include success criteria with test commands
+- **test-driven-development** (this skill) governs **how** code within each phase gets written: test-first
+
+When implementing a plan phase, apply TDD to each unit of work within that phase. The phase's automated verification commands are the final check, not a substitute for test-first development.
+
+## Workflow Overview
+
+1. **Discover test infrastructure** - Find test runner, patterns, conventions
+2. **RED** - Write one failing test
+3. **Verify RED** - Run it, confirm correct failure
+4. **GREEN** - Write minimal code to pass
+5. **Verify GREEN** - Run it, confirm all tests pass
+6. **REFACTOR** - Clean up, keep tests green
+7. **Repeat** - Next behavior, next failing test
+
+## Step 1: Discover Test Infrastructure
+
+Before writing any test, research the project's testing setup:
+
+```
+Tasks to spawn concurrently:
+- codebase-locator: Find test files near the code being changed
+- codebase-pattern-finder: Find test patterns (describe/it, test runner config, assertion style)
+```
+
+Identify:
+- Test runner and command (e.g. `npm test`, `pytest`, `make test`)
+- File naming convention (e.g. `*.test.ts`, `*_test.go`, `test_*.py`)
+- Assertion style and test structure used in existing tests
+- Any test utilities or fixtures in the project
+
+Follow existing conventions exactly. Do not introduce new test libraries or patterns unless user explicitly asks for it.
+
+## Step 2: RED - Write Failing Test
+
+Write one minimal test for one behavior.
+
+<Good>
+```typescript
+test('rejects empty email', async () => {
+  const result = await submitForm({ email: '' });
+  expect(result.error).toBe('Email required');
+});
+```
+Clear name describing behavior, tests one thing, uses real code.
+</Good>
+
+<Bad>
+```typescript
+test('works', async () => {
+  const mock = jest.fn().mockResolvedValueOnce('ok');
+  await submitForm(mock);
+  expect(mock).toHaveBeenCalledTimes(1);
+});
+```
+Vague name, tests mock not behavior.
+</Bad>
+
+**Requirements:**
+- One behavior per test
+- Name describes the expected behavior
+- Use real code paths (mocks only when unavoidable — external APIs, databases)
+
+## Step 3: Verify RED - Watch It Fail
+
+**MANDATORY. Never skip.**
+
+Run the test and confirm:
+- Test **fails** (not errors from syntax/import issues)
+- Failure message matches what you expect
+- Fails because the feature is missing, not because of a typo
+
+**Test passes immediately?** You're testing existing behavior. Rewrite the test.
+
+**Test errors instead of failing?** Fix the error first, re-run until it fails correctly.
+
+## Step 4: GREEN - Minimal Code
+
+Write the simplest code that makes the test pass.
+
+<Good>
+```typescript
+async function retryOperation<T>(fn: () => T | Promise<T>): Promise<T> {
+  for (let i = 0; i < 3; i++) {
+    try { return await fn(); }
+    catch (e) { if (i === 2) throw e; }
+  }
+  throw new Error('unreachable');
+}
+```
+Just enough to pass.
+</Good>
+
+<Bad>
+```typescript
+async function retryOperation<T>(
+  fn: () => T | Promise<T>,
+  options?: { maxRetries?: number; backoff?: 'linear' | 'exponential'; onRetry?: (n: number) => void }
+): Promise<T> { /* ... */ }
+```
+Over-engineered beyond what the test requires.
+</Bad>
+
+Do not add features, refactor surrounding code, or "improve" beyond the test.
+
+## Step 5: Verify GREEN - Watch It Pass
+
+**MANDATORY.**
+
+Run the test and confirm:
+- The new test passes
+- All existing tests still pass
+- No errors or warnings in output
+
+**New test fails?** Fix the implementation, not the test.
+
+**Other tests break?** Fix them now before proceeding.
+
+## Step 6: REFACTOR - Clean Up
+
+Only after green:
+- Remove duplication
+- Improve names
+- Extract helpers if warranted
+
+Run tests after each refactoring change. Stay green throughout.
+
+Do not add new behavior during refactoring.
+
+## Step 7: Repeat
+
+Return to Step 2 for the next behavior. Each cycle adds one tested behavior.
+
+## The Iron Law
+
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+
+Wrote code before the test? Delete it. Start over from Step 2.
+
+- Do not keep it as "reference"
+- Do not "adapt" it while writing tests
+- Delete means delete
+
+## Good Tests
+
+| Quality | Good | Bad |
+|---------|------|-----|
+| **Minimal** | One behavior | `test('validates email and domain and length')` |
+| **Clear** | Name describes expected behavior | `test('test1')` |
+| **Real** | Tests actual code paths | Tests mock return values |
+| **Focused** | Asserts on outcome | Asserts on internal implementation |
+
+## Red Flags - STOP and Start Over
+
+- Wrote production code before a test
+- Test passes immediately on first run
+- Can't explain why the test failed
+- Rationalizing "just this once"
+- "I already manually tested it"
+- Keeping pre-written code as "reference"
+
+All of these mean: delete the code, start over with a failing test.
+
+## When Stuck
+
+| Problem | Solution |
+|---------|----------|
+| Don't know how to test | Write the API you wish existed. Write the assertion first. Ask the user. |
+| Test too complicated | Design too complicated. Simplify the interface. |
+| Must mock everything | Code too coupled. Use dependency injection. |
+| Test setup huge | Extract helpers. Still complex? Simplify design. |
+
+## Integration with Plan Phases
+
+When working within a plan phase:
+
+1. Read the phase's success criteria
+2. For each unit of work in the phase, apply the RED-GREEN-REFACTOR cycle
+3. After all units are complete, run the phase's automated verification commands
+4. Follow `implementing-plan`'s Phase Completion Checklist: update checkboxes, present the verification message, wait for user confirmation
+
+The phase's automated verification is the final gate. TDD cycles happen within that gate, not instead of it.
+
+**When the plan says tests aren't needed**: Evaluate independently — apply TDD unless genuinely untestable (pure wiring, no behavioral logic). Document any skip with a reason in the phase completion message.
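The minimal GREEN-step `retryOperation` from the skill above can be exercised end to end. A sketch, where the flaky callback is illustrative (it fails twice, then succeeds on the third attempt):

```typescript
// retryOperation as written in the skill's GREEN example: up to 3 attempts,
// rethrowing the last error if all fail.
async function retryOperation<T>(fn: () => T | Promise<T>): Promise<T> {
  for (let i = 0; i < 3; i++) {
    try { return await fn(); }
    catch (e) { if (i === 2) throw e; }
  }
  throw new Error('unreachable');
}

async function demo(): Promise<void> {
  let calls = 0;
  const result = await retryOperation(() => {
    calls += 1;
    if (calls < 3) throw new Error('flaky'); // illustrative transient failure
    return 'ok';
  });
  console.log(`result=${result} after ${calls} attempts`); // result=ok after 3 attempts
}

demo();
```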
package/src/agent-assets/skills/{researching-codebase → research-codebase}/research-template.md RENAMED

File without changes
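One row of the TDD skill's When Stuck table ("Must mock everything → use dependency injection") can be sketched in TypeScript; `Clock` and `isExpired` are hypothetical names for illustration, not from the package.

```typescript
// Hypothetical dependency-injection fix: instead of calling Date.now()
// directly (which would force mocking a global in tests), the clock is
// injected, so a test can pass a deterministic fake.
type Clock = () => number;

function isExpired(expiresAt: number, now: Clock = Date.now): boolean {
  return now() >= expiresAt;
}

// A test supplies a fixed clock instead of reaching for a mock framework:
const fakeNow: Clock = () => 1000;
console.log(isExpired(999, fakeNow));  // true  (1000 >= 999)
console.log(isExpired(1001, fakeNow)); // false (1000 < 1001)
```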