@tianhai/pi-workflow-kit 0.10.1 → 0.11.0
- package/README.md +86 -37
- package/extensions/workflow-guard.ts +2 -3
- package/package.json +11 -4
- package/skills/brainstorming/SKILL.md +3 -3
- package/skills/executing-tasks/SKILL.md +86 -116
- package/skills/finalizing/SKILL.md +5 -3
- package/skills/writing-plans/SKILL.md +10 -10
- package/banner.jpg +0 -0
package/README.md
CHANGED
|
@@ -1,72 +1,121 @@
|
|
|
1
1
|
# pi-workflow-kit
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
> Stop AI agents from rushing to code. Enforce a structured brainstorm→plan→execute→finalize workflow with TDD discipline.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
AI coding agents tend to skip design and jump straight into implementation, producing over-engineered or misaligned code. **pi-workflow-kit** solves this by hard-blocking write operations during brainstorm and planning phases — the agent *literally cannot modify your source files* until you approve the design.
|
|
6
|
+
|
|
7
|
+
[pi](https://github.com/badlogic/pi-mono) package. Zero configuration required.
|
|
6
8
|
|
|
7
|
-
|
|
9
|
+
## Install
|
|
8
10
|
|
|
11
|
+
```bash
|
|
12
|
+
pi install npm:@tianhai/pi-workflow-kit
|
|
9
13
|
```
|
|
10
|
-
|
|
14
|
+
|
|
15
|
+
No setup needed — skills and guards activate automatically after install.
|
|
16
|
+
|
|
17
|
+
**Want to try before committing?**
|
|
18
|
+
|
|
19
|
+
```bash
|
|
20
|
+
pi -e npm:@tianhai/pi-workflow-kit
|
|
11
21
|
```
|
|
12
22
|
|
|
13
|
-
|
|
23
|
+
## What You Get
|
|
14
24
|
|
|
15
|
-
|
|
16
|
-
- `bash` is **restricted to read-only commands** — file writes, installs, git mutations, and editors are blocked. Safe commands like `grep`, `find`, `git status`, `cat`, `curl`, `go doc`, `go list` remain available.
|
|
25
|
+
### 🛡️ Workflow Guard (extension)
|
|
17
26
|
|
|
18
|
-
|
|
27
|
+
Enforces phase-appropriate tool access — not just guidelines, but hard blocks:
|
|
19
28
|
|
|
20
|
-
|
|
29
|
+
| Phase | `write` / `edit` | `bash` |
|
|
30
|
+
|-------|:-:|:-:|
|
|
31
|
+
| **Brainstorm** / **Plan** | 🔒 Blocked outside `docs/plans/` | 🔒 Read-only only (grep, find, cat, git status, curl…) |
|
|
32
|
+
| **Execute** / **Finalize** | ✅ Full access | ✅ Full access |
|
|
21
33
|
|
|
22
|
-
|
|
23
|
-
pi install npm:@tianhai/pi-workflow-kit
|
|
24
|
-
```
|
|
34
|
+
The agent can read code and discuss design with you during brainstorm/plan, but it physically cannot modify source files or run mutating commands.
|
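The access table above can be read as a small decision function. The following is a minimal sketch under stated assumptions: `Phase`, `Tool`, `isAllowed`, and `READ_ONLY_COMMANDS` are hypothetical names chosen for illustration, and the prefix list only repeats the commands the README names. It is not the extension's actual implementation, and it deliberately ignores compound commands and path normalization.

```typescript
// Hypothetical names throughout; this is not the extension's real API.
type Phase = "brainstorm" | "plan" | "execute" | "finalize";
type Tool = "write" | "edit" | "bash";

// Assumed read-only command prefixes, taken from the README's examples.
const READ_ONLY_COMMANDS = ["grep", "find", "cat", "git status", "curl"];

function isAllowed(phase: Phase, tool: Tool, target: string): boolean {
  // Execute and finalize have full access to every tool.
  if (phase === "execute" || phase === "finalize") return true;

  if (tool === "bash") {
    // During brainstorm/plan, only commands starting with a known
    // read-only prefix pass. Compound commands are not handled here.
    const command = target.trim();
    return READ_ONLY_COMMANDS.some(
      (prefix) => command === prefix || command.startsWith(prefix + " "),
    );
  }

  // write/edit: only paths under docs/plans/ are writable.
  // A real guard would also normalize the path first.
  return target.startsWith("docs/plans/");
}
```

For example, `isAllowed("brainstorm", "write", "src/index.ts")` is `false`, while `isAllowed("plan", "edit", "docs/plans/auth-design.md")` is `true`.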
|
25
35
|
|
|
26
|
-
|
|
36
|
+
### 🧠 5 Workflow Skills
|
|
27
37
|
|
|
28
|
-
|
|
38
|
+
Guide the agent through a disciplined development process:
|
|
29
39
|
|
|
30
|
-
|
|
40
|
+
```
|
|
41
|
+
brainstorm → plan → execute → finalize
|
|
42
|
+
↕
|
|
43
|
+
diagnose (anytime)
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
| Phase | Trigger | What Happens |
|
|
31
47
|
|-------|---------|--------------|
|
|
32
|
-
| **Brainstorm** | `/skill:brainstorming` |
|
|
33
|
-
| **Plan** | `/skill:writing-plans` | Break
|
|
34
|
-
| **Execute** | `/skill:executing-tasks` | Implement
|
|
48
|
+
| **Brainstorm** | `/skill:brainstorming` | Explore approaches, debate tradeoffs, produce a design doc |
|
|
49
|
+
| **Plan** | `/skill:writing-plans` | Break design into bite-sized TDD tasks with file paths and acceptance criteria |
|
|
50
|
+
| **Execute** | `/skill:executing-tasks` | Implement tasks one-by-one with TDD discipline and optional checkpoint review gates |
|
|
35
51
|
| **Finalize** | `/skill:finalizing` | Archive plan docs, update README/CHANGELOG, create PR |
|
|
52
|
+
| **Diagnose** | `/skill:diagnose` | 6-phase debugging loop: reproduce → hypothesize → instrument → fix → verify |
|
|
36
53
|
|
|
37
|
-
|
|
54
|
+
## The Workflow in Detail
|
|
38
55
|
|
|
39
|
-
###
|
|
56
|
+
### Phase Control
|
|
40
57
|
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
58
|
+
You control each phase — the agent never advances on its own. Invoke a skill to move forward:
|
|
59
|
+
|
|
60
|
+
```
|
|
61
|
+
/skill:brainstorming → discuss and design
|
|
62
|
+
/skill:writing-plans → break into tasks
|
|
63
|
+
/skill:executing-tasks → implement with TDD
|
|
64
|
+
/skill:finalizing → ship it
|
|
65
|
+
```
|
|
48
66
|
|
|
49
67
|
### TDD Three-Scenario Model
|
|
50
68
|
|
|
51
|
-
|
|
69
|
+
Each task is labeled with its TDD scenario during planning:
|
|
52
70
|
|
|
53
71
|
| Scenario | When | Rule |
|
|
54
72
|
|----------|------|------|
|
|
55
|
-
| New feature | Adding new behavior | Write failing test → implement → pass |
|
|
56
|
-
| Modifying tested code | Changing existing behavior | Run existing tests first → modify → verify |
|
|
57
|
-
| Trivial | Config, docs, naming | Use judgment |
|
|
73
|
+
| **New feature** | Adding new behavior | Write failing test → implement → pass |
|
|
74
|
+
| **Modifying tested code** | Changing existing behavior | Run existing tests first → modify → verify |
|
|
75
|
+
| **Trivial** | Config, docs, naming | Use judgment |
|
|
58
76
|
|
|
59
77
|
### Checkpoint Review Gates
|
|
60
78
|
|
|
61
|
-
Optionally label tasks with a `checkpoint` to pause for human review
|
|
79
|
+
Optionally label tasks with a `checkpoint` to pause for human review. At each checkpoint the agent stops and waits for your feedback — you can approve, ask for changes, or send it back to rethink. Only when you're satisfied does it move on to the next task.
|
|
62
80
|
|
|
63
|
-
| Checkpoint | When to
|
|
81
|
+
| Checkpoint | When to Use | What Happens |
|
|
64
82
|
|---|---|---|
|
|
65
83
|
| *(none)* | Trivial tasks, well-understood changes | Auto-advance, no pause |
|
|
66
|
-
| `checkpoint: test` | Test design matters |
|
|
67
|
-
| `checkpoint: done` | Implementation review matters |
|
|
84
|
+
| `checkpoint: test` | Test design matters | Agent writes the failing test, then pauses for your review. Verify the test covers the right cases before the agent implements. |
|
|
85
|
+
| `checkpoint: done` | Implementation review matters | Agent implements and passes tests, then pauses for your review. Verify the implementation is correct before committing. |
|
|
86
|
+
|
|
87
|
+
## Quick Start
|
|
88
|
+
|
|
89
|
+
```bash
|
|
90
|
+
# Install
|
|
91
|
+
pi install npm:@tianhai/pi-workflow-kit
|
|
92
|
+
|
|
93
|
+
# Start a new feature
|
|
94
|
+
> /skill:brainstorming
|
|
95
|
+
> I want to add OAuth2 login to our API
|
|
96
|
+
|
|
97
|
+
# (agent explores approaches, writes design doc)
|
|
98
|
+
# (write/edit are blocked — your code is safe)
|
|
99
|
+
|
|
100
|
+
> /skill:writing-plans
|
|
101
|
+
|
|
102
|
+
# (agent breaks design into TDD tasks)
|
|
103
|
+
> /skill:executing-tasks
|
|
104
|
+
|
|
105
|
+
# (agent implements with TDD, all tools unlocked)
|
|
106
|
+
> /skill:finalizing
|
|
107
|
+
|
|
108
|
+
# (agent archives docs, updates changelog, creates PR)
|
|
109
|
+
```
|
|
110
|
+
|
|
111
|
+
## Why?
|
|
112
|
+
|
|
113
|
+
- **AI agents skip design.** Left unchecked, they jump to code and over-engineer. This forces a think-first workflow.
|
|
114
|
+
- **TDD needs structure.** The three-scenario model gives the agent clear rules for when to write tests first.
|
|
115
|
+
- **You stay in control.** Checkpoint review gates let you approve test designs and implementations before the agent commits.
|
|
116
|
+
- **Enforced, not suggested.** Hard blocks mean the agent can't ignore the rules — not even accidentally.
|
|
68
117
|
|
|
69
|
-
##
|
|
118
|
+
## Project
|
|
70
119
|
|
|
71
120
|
```
|
|
72
121
|
pi-workflow-kit/
|
|
@@ -92,4 +141,4 @@ npm test
|
|
|
92
141
|
|
|
93
142
|
## License
|
|
94
143
|
|
|
95
|
-
MIT
|
|
144
|
+
[MIT](LICENSE)
|
package/extensions/workflow-guard.ts
CHANGED
|
@@ -45,7 +45,7 @@ const DESTRUCTIVE_PATTERNS = [
|
|
|
45
45
|
/\bshutdown\b/i,
|
|
46
46
|
/\bsystemctl\s+(start|stop|restart|enable|disable)/i,
|
|
47
47
|
/\bservice\s+\S+\s+(start|stop|restart)/i,
|
|
48
|
-
|
|
48
|
+
/\b(vim?|nano|emacs|code|subl)\b/i,
|
|
49
49
|
];
|
|
50
50
|
|
|
51
51
|
const SAFE_PATTERNS = [
|
|
@@ -117,8 +117,7 @@ const SAFE_PATTERNS = [
|
|
|
117
117
|
];
|
|
118
118
|
|
|
119
119
|
/** Split a compound command into individual sub-commands.
|
|
120
|
-
*
|
|
121
|
-
* Does NOT split on | (pipe) to allow piping (e.g. `git log | head`).
|
|
120
|
+
* Handles &&, ||, ;, and | (pipe) operators, ignoring leading whitespace.
|
|
122
121
|
*/
|
|
123
122
|
function splitCompoundCommand(command: string): string[] {
|
|
124
123
|
// Match sub-commands separated by &&, ||, ; (with optional whitespace)
|
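To make the two guard changes above concrete, here is a minimal sketch of splitting a compound command on `&&`, `||`, `;`, and `|`, then testing each part against the new editor pattern. The regular expression is copied from the diff; `splitCompound` and `hasBlockedEditor` are hypothetical helpers and are not claimed to match the package's own `splitCompoundCommand`. The split is naive and does not respect shell quoting.

```typescript
// Illustrative only: not the package's actual splitCompoundCommand.
function splitCompound(command: string): string[] {
  return command
    // "||" is listed before "|" so a logical OR is not split twice.
    .split(/&&|\|\||;|\|/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Editor pattern copied from the diff above; "vim?" matches both vi and vim.
const EDITOR_PATTERN = /\b(vim?|nano|emacs|code|subl)\b/i;

// True if any sub-command mentions one of the listed editors.
function hasBlockedEditor(command: string): boolean {
  return splitCompound(command).some((part) => EDITOR_PATTERN.test(part));
}
```

With this sketch, `splitCompound("git log | head")` yields two parts, and `hasBlockedEditor("cat notes.md && vim notes.md")` is `true`. Because the pattern relies on word boundaries alone, it also matches these names when they appear as arguments, such as the word `code` in `grep code README.md`; how the real guard treats that case is not shown in this diff.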
package/package.json
CHANGED
|
@@ -1,9 +1,17 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@tianhai/pi-workflow-kit",
|
|
3
|
-
"version": "0.
|
|
4
|
-
"description": "
|
|
3
|
+
"version": "0.11.0",
|
|
4
|
+
"description": "Enforce structured brainstorm→plan→execute→finalize workflow with TDD discipline in AI coding agents",
|
|
5
5
|
"keywords": [
|
|
6
|
-
"pi-package"
|
|
6
|
+
"pi-package",
|
|
7
|
+
"ai-coding-agent",
|
|
8
|
+
"workflow",
|
|
9
|
+
"tdd",
|
|
10
|
+
"guard-rails",
|
|
11
|
+
"code-review",
|
|
12
|
+
"pi-extension",
|
|
13
|
+
"brainstorm",
|
|
14
|
+
"test-driven-development"
|
|
7
15
|
],
|
|
8
16
|
"scripts": {
|
|
9
17
|
"test": "vitest run",
|
|
@@ -20,7 +28,6 @@
|
|
|
20
28
|
"extensions/",
|
|
21
29
|
"skills/",
|
|
22
30
|
"docs/",
|
|
23
|
-
"banner.jpg",
|
|
24
31
|
"LICENSE",
|
|
25
32
|
"README.md"
|
|
26
33
|
],
|
package/skills/brainstorming/SKILL.md
CHANGED
|
@@ -10,9 +10,9 @@ Read-only exploration. You may **not** edit or create any files except under `do
|
|
|
10
10
|
## Process
|
|
11
11
|
|
|
12
12
|
1. **Check git state** — run `git status` and `git log --oneline -5`. If there's uncommitted work, ask the user what to do with it first.
|
|
13
|
-
2. **Understand the idea** — read existing code, docs, and recent commits.
|
|
13
|
+
2. **Understand the idea** — read existing code, docs, and recent commits. Grep for related functionality, check package.json/dependencies and module structure. Read only what's necessary to ground the design — don't read the entire codebase. Ask questions to refine the idea. Prefer multiple choice when possible. After each question, check: can you clearly articulate (a) what the user wants to build, (b) why, and (c) key constraints? If yes, present your understanding as a short summary and ask: "Should I proceed with this, or is there more to add?" The human decides when to move on.
|
|
14
14
|
3. **Explore approaches** — propose 2-3 approaches. For each approach, sketch the concrete interface (types, method signatures, example caller code) so the comparison is grounded in actual code, not abstract descriptions. Lead with your recommendation.
|
|
15
|
-
4. **Present the design** — break it into sections of
|
|
15
|
+
4. **Present the design** — break it into focused sections. Each section should be one screen of reading. Present each section to the human and wait for approval before continuing. Cover: architecture, components, data flow, error handling, testing. On feedback, incorporate it and re-present the revised section.
|
|
16
16
|
|
|
17
17
|
When a significant architectural decision is identified, offer to write a lightweight ADR to `docs/plans/adr/`. Only write an ADR when all three are true:
|
|
18
18
|
|
|
@@ -29,7 +29,7 @@ Read-only exploration. You may **not** edit or create any files except under `do
|
|
|
29
29
|
```
|
|
30
30
|
|
|
31
31
|
ADRs live under `docs/plans/adr/` and are archived during finalizing alongside the design doc.
|
|
32
|
-
5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`.
|
|
32
|
+
5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`. Organize features as end-to-end slices (each slice delivers one observable behavior through all relevant layers) so the planning phase can decompose them directly into tasks. Branch creation, committing, and workspace setup are handled by `/skill:executing-tasks`.
|
|
33
33
|
|
|
34
34
|
## Principles
|
|
35
35
|
|
package/skills/executing-tasks/SKILL.md
CHANGED
|
@@ -10,19 +10,32 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
|
|
|
10
10
|
## Before you start
|
|
11
11
|
|
|
12
12
|
1. **Check git state** — run `git status` and `git log --oneline -5`. Note any uncommitted changes.
|
|
13
|
-
2. **Find the plan** — look for `docs/plans/*-implementation.md`. If multiple exist, ask the user which one to execute.
|
|
13
|
+
2. **Find the plan** — look for `docs/plans/*-implementation.md`. If none exist, say "No implementation plan found. Run `/skill:writing-plans` first." and stop. If multiple exist, ask the user which one to execute.
|
|
14
14
|
3. **Check for existing progress** — look for `docs/plans/*-progress.md`. If one exists matching the plan, this is a **resume** (see [Resume](#resume)). If not, this is a **first run** (see [First run](#first-run)).
|
|
15
15
|
|
|
16
16
|
## First run
|
|
17
17
|
|
|
18
18
|
1. **Parse the implementation plan** — read the plan and extract all `## Task N:` headings. Build the progress table with all tasks as `⬜ pending`.
|
|
19
|
-
2. **
|
|
19
|
+
2. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
|
|
20
|
+
|
|
21
|
+
- **Branch** (smaller changes):
|
|
22
|
+
```
|
|
23
|
+
git checkout -b <feature-name>
|
|
24
|
+
```
|
|
25
|
+
- **Worktree** (larger features, keeps main clean):
|
|
26
|
+
```
|
|
27
|
+
git worktree add ../<repo>-<feature-name> -b <feature-name>
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
|
|
31
|
+
|
|
32
|
+
3. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
|
|
20
33
|
|
|
21
34
|
```markdown
|
|
22
35
|
# Progress: <topic>
|
|
23
36
|
|
|
24
37
|
Plan: docs/plans/YYYY-MM-DD-<topic>-implementation.md
|
|
25
|
-
Branch: <
|
|
38
|
+
Branch: <actual branch name>
|
|
26
39
|
Started: <ISO timestamp>
|
|
27
40
|
Last updated: <ISO timestamp>
|
|
28
41
|
|
|
@@ -31,18 +44,7 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
|
|
|
31
44
|
| 1 | ⬜ pending | Task description (preserve checkpoint labels) | — |
|
|
32
45
|
```
|
|
33
46
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
- **Branch** (smaller changes):
|
|
37
|
-
```
|
|
38
|
-
git checkout -b <feature-name>
|
|
39
|
-
```
|
|
40
|
-
- **Worktree** (larger features, keeps main clean):
|
|
41
|
-
```
|
|
42
|
-
git worktree add ../<repo>-<feature-name> -b <feature-name>
|
|
43
|
-
```
|
|
44
|
-
|
|
45
|
-
Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
|
|
47
|
+
Use the actual branch name — whether it's the original branch or a new one from the isolation step.
|
|
46
48
|
|
|
47
49
|
4. **Commit the plan docs** — if `docs/plans/` has uncommitted files, commit them on the new branch:
|
|
48
50
|
```
|
|
@@ -92,117 +94,84 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
|
|
|
92
94
|
For each task the agent works on:
|
|
93
95
|
|
|
94
96
|
1. **Mark in-progress** — update the progress file: `🔄 in-progress`
|
|
95
|
-
2. **Read
|
|
96
|
-
3. **
|
|
97
|
-
4. **
|
|
98
|
-
5.
|
|
99
|
-
6. **
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
1. **Implement** — write the code as described in the plan
|
|
124
|
-
2. **Run tests** — verify the changes work
|
|
125
|
-
3. **Fix if needed** — if tests fail, debug and fix before moving on
|
|
126
|
-
4. **Pause for review** — show what was done and the diff, then wait for human input
|
|
127
|
-
5. **Commit** — `git add` the relevant files and commit with a clear message
|
|
97
|
+
2. **Read the plan selectively** — read the plan's overview section (everything before `## Task 1:`). Skim all `## Task N:` headings for dependency awareness. Then read the current task's body in full.
|
|
98
|
+
3. **Write the test** — for `new-feature`: write a failing test. For `modifying-tested-code`: run existing tests first. For `trivial`: skip steps 3-5, go to step 6.
|
|
99
|
+
4. **Run the test** — confirm it fails (new-feature) or passes (modifying-tested-code). Fix if needed.
|
|
100
|
+
5. **⏸ PAUSE if `checkpoint: test`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
|
|
101
|
+
6. **Implement** — write the code to make the test pass.
|
|
102
|
+
7. **Run tests** — verify everything passes. If tests fail and you cannot fix them after retrying, see [If you're stuck](#if-youre-stuck). If still stuck, mark the task `❌ failed` with the reason in the progress file and move to the next task.
|
|
103
|
+
8. **Verify against task description** — re-read the task from the plan. Does the implementation satisfy every requirement in the description? If not, fix before proceeding.
|
|
104
|
+
9. **Refactor if needed** — after all tests pass, check for refactoring opportunities:
|
|
105
|
+
- **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
|
|
106
|
+
- **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
|
|
107
|
+
- **Duplication** — extract repeated patterns
|
|
108
|
+
- **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
|
|
109
|
+
|
|
110
|
+
Run tests after each refactor step. Never refactor while tests are failing.
|
|
111
|
+
10. **⏸ PAUSE if `checkpoint: done`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
|
|
112
|
+
11. **Commit** — `git add` the relevant files and commit with a clear message.
|
|
113
|
+
12. **Update progress** — mark `✅ done` + record the commit hash.
|
|
114
|
+
13. **Suggest session break if needed** — after completing ~3-5 tasks since the last break, suggest:
|
|
115
|
+
```
|
|
116
|
+
✅ Tasks N-M done (commits: abc, def)
|
|
117
|
+
Progress: X/Y tasks done
|
|
118
|
+
⏭ Next: Task [N+1] — [description]
|
|
119
|
+
💡 Context is building up. For clean context on remaining tasks:
|
|
120
|
+
/new then /skill:executing-tasks
|
|
121
|
+
(or just say "continue" to keep going here)
|
|
122
|
+
```
|
|
123
|
+
Also suggest at checkpoint review pauses when multiple tasks have been completed since the last break. Respect the user's choice if they say "continue".
|
|
124
|
+
14. **Loop** — go back to step 1 for the next `⬜ pending` task, or see [After all tasks](#after-all-tasks) if none remain.
|
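The per-task loop added above branches on the TDD scenario and the checkpoint label. As a rough illustration, the sketch below lists the steps a single task would go through. `stepsFor` and the step labels are hypothetical; the branching follows the skill text (a `trivial` task skips steps 3-5, so it never reaches the `checkpoint: test` pause).

```typescript
type Tdd = "new-feature" | "modifying-tested-code" | "trivial";
type Checkpoint = "none" | "test" | "done";

// Hypothetical helper: returns the ordered step labels for one task.
function stepsFor(tdd: Tdd, checkpoint: Checkpoint): string[] {
  const steps = ["mark in-progress", "read plan"];

  if (tdd !== "trivial") {
    // Steps 3-5 apply only to the two test-driven scenarios.
    steps.push(tdd === "new-feature" ? "write failing test" : "run existing tests");
    steps.push("confirm test result");
    if (checkpoint === "test") steps.push("pause: checkpoint test");
  }

  steps.push("implement", "run tests", "verify against task", "refactor if needed");
  if (checkpoint === "done") steps.push("pause: checkpoint done");
  steps.push("commit", "update progress");
  return steps;
}
```

For instance, `stepsFor("trivial", "test")` contains no pause, while `stepsFor("new-feature", "test")` pauses after the failing test is confirmed.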
|
128
125
|
|
|
129
126
|
## Checkpoint review
|
|
130
127
|
|
|
131
|
-
When pausing at a checkpoint
|
|
128
|
+
When pausing at a `checkpoint: test`, present the test code first:
|
|
132
129
|
|
|
133
130
|
```
|
|
134
|
-
⏸ Paused at checkpoint:
|
|
135
|
-
|
|
136
|
-
**
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
131
|
+
⏸ Paused at checkpoint: test for task [N]
|
|
132
|
+
|
|
133
|
+
**Test written:**
|
|
134
|
+
[show the test code]
|
|
135
|
+
|
|
136
|
+
**Expected behavior:** [what this test validates]
|
|
137
|
+
**Next:** Task [N+1] — [description]
|
|
138
|
+
|
|
139
|
+
**Available actions:**
|
|
140
|
+
- **Approve** — continue to implementation (step 6)
|
|
141
|
+
- **Request changes** — describe what to change, I'll update and re-present
|
|
142
|
+
- **Revert** — undo this task and mark it back to pending
|
|
143
|
+
- **Adjust plan** — modify the remaining tasks in the implementation plan
|
|
144
|
+
- `skip` — skip this task and move on
|
|
145
|
+
- `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
|
|
146
|
+
- `status` — show the full progress table
|
|
140
147
|
```
|
|
141
148
|
|
|
142
|
-
|
|
143
|
-
- Approve and continue
|
|
144
|
-
- Request changes to the test or implementation
|
|
145
|
-
- Ask to revert the task
|
|
146
|
-
- Adjust the remaining plan
|
|
147
|
-
|
|
148
|
-
## TDD discipline
|
|
149
|
-
|
|
150
|
-
Follow the TDD scenario from the plan:
|
|
151
|
-
|
|
152
|
-
- **New feature**: write the test first, see it fail, then implement
|
|
153
|
-
- **Modifying tested code**: run existing tests before and after
|
|
154
|
-
- **Trivial change**: use judgment
|
|
155
|
-
|
|
156
|
-
Don't skip tests because "it's obvious." The test is the contract.
|
|
157
|
-
|
|
158
|
-
## Refactoring
|
|
159
|
-
|
|
160
|
-
After all tests pass for a task, check for refactoring opportunities:
|
|
161
|
-
|
|
162
|
-
- **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
|
|
163
|
-
- **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
|
|
164
|
-
- **Duplication** — extract repeated patterns
|
|
165
|
-
- **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
|
|
166
|
-
|
|
167
|
-
Run tests after each refactor step. Never refactor while tests are failing.
|
|
168
|
-
|
|
169
|
-
Key vocabulary: **depth** (lots of behavior behind a small interface), **seam** (where behavior can be altered without editing in place), **locality** (change concentrated in one place).
|
|
170
|
-
|
|
171
|
-
## Batching and session management
|
|
172
|
-
|
|
173
|
-
The agent suggests a fresh session at natural break points to minimize token accumulation. After completing ~3-5 non-checkpoint tasks in the same session, suggest:
|
|
149
|
+
When pausing at a `checkpoint: done`, present the implementation review:
|
|
174
150
|
|
|
175
151
|
```
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
Progress: 5/10 tasks done
|
|
152
|
+
⏸ Paused at checkpoint: done for task [N]
|
|
179
153
|
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
154
|
+
**What was done:** [brief summary]
|
|
155
|
+
**Diff:** [show relevant diff]
|
|
156
|
+
**Next:** Task [N+1] — [description]
|
|
157
|
+
|
|
158
|
+
**Available actions:**
|
|
159
|
+
- **Approve** — continue to the next task
|
|
160
|
+
- **Request changes** — describe what to change, I'll update and re-present
|
|
161
|
+
- **Revert** — undo this task and mark it back to pending
|
|
162
|
+
- **Adjust plan** — modify the remaining tasks in the implementation plan
|
|
163
|
+
- `skip` — skip this task and move on
|
|
164
|
+
- `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
|
|
165
|
+
- `status` — show the full progress table
|
|
185
166
|
```
|
|
186
167
|
|
|
187
|
-
|
|
168
|
+
Wait for the human to respond. On **request changes**, make the edits, then re-present at the same checkpoint. Repeat until approved.
|
|
188
169
|
|
|
189
|
-
|
|
170
|
+
## Progress file updates
|
|
190
171
|
|
|
191
|
-
|
|
192
|
-
|
|
193
|
-
During execution, the agent should update the progress file in place. Example workflow:
|
|
194
|
-
|
|
195
|
-
```bash
|
|
196
|
-
# Before task 2 starts:
|
|
197
|
-
sed -i 's/| 2 | ⬜ pending/| 2 | 🔄 in-progress/'
|
|
198
|
-
# After successful commit a1b2c3d:
|
|
199
|
-
sed -i 's/| 2 | 🔄 in-progress/| 2 | ✅ done/'
|
|
200
|
-
sed -i 's/| 2 | ✅ done[^|]*|/| 2 | ✅ done | a1b2c3d |/'
|
|
201
|
-
# Update timestamp:
|
|
202
|
-
sed -i "s/Last updated:.*/Last updated: $(date -u +%Y-%m-%dT%H:%M:%SZ)/"
|
|
203
|
-
```
|
|
172
|
+
Update the progress file by reading it, modifying the relevant row's status and commit hash, and writing it back. Target the specific task row — do not use pattern-matching approaches (e.g. sed) that could corrupt the table.
|
|
204
173
|
|
|
205
|
-
|
|
174
|
+
Update `Last updated` timestamp on every change.
|
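The replacement guidance (read the file, change one row, write it back) can be sketched as a pure string transformation. This is an illustration under assumptions, not code from the package: `updateProgress` is a hypothetical helper, and it assumes the four-column row layout shown in the progress template (task number, status, description, commit).

```typescript
type Status = "⬜ pending" | "🔄 in-progress" | "✅ done" | "❌ failed";

// Hypothetical helper: returns the progress file text with one row updated.
function updateProgress(
  markdown: string,
  taskNumber: number,
  status: Status,
  commit: string,
  now: Date,
): string {
  return markdown
    .split("\n")
    .map((line) => {
      if (line.startsWith("Last updated:")) {
        return `Last updated: ${now.toISOString()}`;
      }
      // A task row looks like: | 1 | ⬜ pending | description | <commit> |
      // Splitting on "|" yields ["", "1", status, description, commit, ""].
      const cells = line.split("|").map((cell) => cell.trim());
      if (cells.length === 6 && cells[1] === String(taskNumber)) {
        return `| ${cells[1]} | ${status} | ${cells[3]} | ${commit} |`;
      }
      // Header, separator, and other task rows are left untouched.
      return line;
    })
    .join("\n");
}
```

Matching on the exact task-number cell means updating task 1 cannot touch task 12, which is the kind of corruption a loose pattern match risks. A description that itself contains `|` would not match the six-cell shape and is left unchanged rather than rewritten incorrectly.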
|
206
175
|
|
|
207
176
|
## User override commands
|
|
208
177
|
|
|
@@ -217,18 +186,19 @@ The user can issue these commands at any time during execution:
|
|
|
217
186
|
|
|
218
187
|
## Receiving code review
|
|
219
188
|
|
|
220
|
-
When the user shares code review feedback:
|
|
189
|
+
When the user shares code review feedback (outside of a checkpoint pause):
|
|
221
190
|
|
|
222
191
|
1. **Verify the criticism** — read the relevant code. Is the feedback accurate?
|
|
223
192
|
2. **Evaluate the suggestion** — is the proposed fix the right approach? Consider alternatives.
|
|
224
|
-
3. **Implement or push back** — if valid, fix it. If not, explain why with evidence from the codebase.
|
|
193
|
+
3. **Implement or push back** — if valid, fix it, re-run tests, and amend the commit. If not, explain why with evidence from the codebase.
|
|
225
194
|
4. **Don't blindly implement** — every suggestion should be verified against the code before accepting.
|
|
226
195
|
|
|
227
196
|
## If you're stuck
|
|
228
197
|
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
|
|
198
|
+
1. Re-read the current task section from the plan — you may have drifted from the spec
|
|
199
|
+
2. Check git log — recent commits may reveal context
|
|
200
|
+
3. Ask the user — it's better to clarify than to guess wrong
|
|
201
|
+
4. If still stuck after asking, mark the task `❌ failed` with the reason in the progress file and move to the next task
|
|
232
202
|
|
|
233
203
|
## After all tasks
|
|
234
204
|
|
package/skills/finalizing/SKILL.md
CHANGED
|
@@ -25,14 +25,16 @@ Wait for the user to confirm before proceeding.
|
|
|
25
25
|
```
|
|
26
26
|
mkdir -p docs/plans/completed
|
|
27
27
|
mkdir -p docs/plans/completed/adr
|
|
28
|
-
mv docs/plans/*-design.md docs/plans/completed/
|
|
29
|
-
mv docs/plans/*-implementation.md docs/plans/completed/
|
|
30
|
-
mv docs/plans/*-progress.md docs/plans/completed/
|
|
28
|
+
mv docs/plans/*-design.md docs/plans/completed/ 2>/dev/null || true
|
|
29
|
+
mv docs/plans/*-implementation.md docs/plans/completed/ 2>/dev/null || true
|
|
30
|
+
mv docs/plans/*-progress.md docs/plans/completed/ 2>/dev/null || true
|
|
31
31
|
mv docs/plans/adr/*.md docs/plans/completed/adr/ 2>/dev/null || true
|
|
32
32
|
rmdir docs/plans/adr 2>/dev/null || true
|
|
33
33
|
git add docs/plans/ && git commit -m "chore: archive planning docs"
|
|
34
34
|
```
|
|
35
35
|
|
|
36
|
+
Each `mv` gracefully handles the case where no matching files exist (e.g., if the user skipped straight from brainstorm to finalize without executing tasks).
|
|
37
|
+
|
|
36
38
|
2. **Update documentation** — if the API or surface changed:
|
|
37
39
|
- Update README.md
|
|
38
40
|
- Update CHANGELOG.md
|
package/skills/writing-plans/SKILL.md
CHANGED
|
@@ -5,22 +5,24 @@ description: "Use this to break a design into an implementation plan with bite-s
|
|
|
5
5
|
|
|
6
6
|
# Writing Plans
|
|
7
7
|
|
|
8
|
-
|
|
8
|
+
You may only create or edit files under `docs/plans/`. Do not modify source code or configuration.
|
|
9
9
|
|
|
10
10
|
## Process
|
|
11
11
|
|
|
12
|
-
1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If no design doc exists, ask the user to describe what they want to build and read relevant code.
|
|
13
|
-
2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`.
|
|
12
|
+
1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If the design doc is incomplete, fill gaps by asking the human. If no design doc exists, ask the user to describe what they want to build and read relevant code.
|
|
13
|
+
2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`. If the design is too large for ~15 tasks, flag this to the human and ask whether to reduce scope or proceed with the full plan.
|
|
14
|
+
3. **Present the plan** — show the complete plan to the human. Wait for approval before suggesting execution.
|
|
14
15
|
|
|
15
16
|
## Task format
|
|
16
17
|
|
|
17
|
-
Each task should
|
|
18
|
+
Each task should produce one committed, testable change:
|
|
18
19
|
|
|
19
20
|
- Exact file paths to create/modify
|
|
20
|
-
- Complete code (not "add validation")
|
|
21
|
+
- Complete code (not "add validation"). For tasks that depend on types or utilities from earlier tasks, reference them explicitly (e.g., `import { User } from Task 2`) and include only the new code
|
|
21
22
|
- Exact commands with expected output
|
|
22
23
|
- `git commit` after each task
|
|
23
24
|
- Optional `checkpoint: test` or `checkpoint: done` label
|
|
25
|
+
- Each task's tests should cover the happy path and at least one edge case or error path
|
|
24
26
|
|
|
25
27
|
Each task must use a numbered heading:
|
|
26
28
|
|
|
@@ -33,16 +35,12 @@ Each task must use a numbered heading:
|
|
|
33
35
|
|
|
34
36
|
...where N starts at 1 and incrementally numbers each task in the plan.
|
|
35
37
|
|
|
36
|
-
The metadata comments (placed right after the heading) are optional
|
|
38
|
+
The metadata comments (placed right after the heading) are optional. If omitted, the executing-tasks skill infers the TDD scenario and checkpoint from context. When in doubt, include them explicitly.
|
|
37
39
|
|
|
38
40
|
Valid TDD values: `new-feature`, `modifying-tested-code`, `trivial`
|
|
39
41
|
|
|
40
42
|
Valid checkpoint values: `none`, `test`, `done`
|
|
41
43
|
|
|
42
|
-
These comments are optional — if omitted, the agent infers TDD scenario and checkpoint from context.
|
|
43
|
-
|
|
44
|
-
Also use the `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->` metadata comments to specify options explicitly. The inline `checkpoint: test` / `checkpoint: done` label format (e.g. in a task list) is also supported as a fallback, but the metadata comment is the canonical source.
|
|
45
|
-
|
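As an illustration of how the `## Task N:` headings and the optional `<!-- tdd: ... -->` / `<!-- checkpoint: ... -->` metadata comments might be read back from a plan, here is a minimal sketch. `PlanTask` and `parseTasks` are hypothetical names, and the parser is a simplification: it attaches any metadata comment found before the next heading to the current task and performs no validation of the values.

```typescript
interface PlanTask {
  number: number;
  title: string;
  tdd?: string;
  checkpoint?: string;
}

// Hypothetical parser: collects tasks and their optional metadata.
function parseTasks(plan: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  for (const line of plan.split("\n")) {
    const heading = /^## Task (\d+):\s*(.*)$/.exec(line);
    if (heading) {
      tasks.push({ number: Number(heading[1]), title: heading[2] ?? "" });
      continue;
    }
    const meta = /^<!--\s*(tdd|checkpoint):\s*([a-z-]+)\s*-->$/.exec(line.trim());
    const current = tasks[tasks.length - 1];
    if (meta && current) {
      // Omitted metadata stays undefined, leaving inference to the caller.
      if (meta[1] === "tdd") current.tdd = meta[2];
      else current.checkpoint = meta[2];
    }
  }
  return tasks;
}
```

A task without metadata comes back with `tdd` and `checkpoint` undefined, which mirrors the skill's statement that the scenario and checkpoint are inferred from context when the comments are omitted.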
|
46
44
|
|
|
47
45
|
## Vertical slices
|
|
48
46
|
|
|
@@ -61,6 +59,8 @@ RIGHT (vertical):
|
|
|
61
59
|
Task 3: User can view profile (query + endpoint + test)
|
|
62
60
|
```
|
|
63
61
|
|
|
62
|
+
Order tasks so each one can be verified independently and delivers a complete vertical slice. If a task requires infrastructure (models, types) that no previous task has created, include it in that task — don't create it as a separate task.
|
|
63
|
+
|
|
64
64
|
Vertical slices ensure every committed task leaves the codebase in a testable state and reduces the blast radius of a bad task.
|
|
65
65
|
|
|
66
66
|
## TDD in the plan
|
package/banner.jpg
DELETED
|
Binary file
|