@tianhai/pi-workflow-kit 0.10.1 → 0.11.0

package/README.md CHANGED
@@ -1,72 +1,121 @@
1
1
  # pi-workflow-kit
2
2
 
3
- Structured workflow skills and enforcement for [pi](https://github.com/badlogic/pi-mono).
3
+ > Stop AI agents from rushing to code. Enforce a structured brainstorm→plan→execute→finalize workflow with TDD discipline.
4
4
 
5
- ## What You Get
5
+ AI coding agents tend to skip design and jump straight into implementation, producing over-engineered or misaligned code. **pi-workflow-kit** solves this by hard-blocking write operations during brainstorm and planning phases — the agent *literally cannot modify your source files* until you approve the design.
6
+
7
+ [pi](https://github.com/badlogic/pi-mono) package. Zero configuration required.
6
8
 
7
- **4 workflow skills** that guide the agent through a structured development process:
9
+ ## Install
8
10
 
11
+ ```bash
12
+ pi install npm:@tianhai/pi-workflow-kit
9
13
  ```
10
- brainstorm → plan → execute → finalize
14
+
15
+ No setup needed — skills and guards activate automatically after install.
16
+
17
+ **Want to try before committing?**
18
+
19
+ ```bash
20
+ pi -e npm:@tianhai/pi-workflow-kit
11
21
  ```
12
22
 
13
- **1 extension** that enforces the rules:
23
+ ## What You Get
14
24
 
15
- - During brainstorming and planning, `write` and `edit` are **hard-blocked** outside `docs/plans/`. The agent can only read code and discuss the design with you — it literally cannot modify source files.
16
- - `bash` is **restricted to read-only commands** — file writes, installs, git mutations, and editors are blocked. Safe commands like `grep`, `find`, `git status`, `cat`, `curl`, `go doc`, `go list` remain available.
25
+ ### 🛡️ Workflow Guard (extension)
17
26
 
18
- No configuration required. Skills and extensions activate automatically after install.
27
+ Enforces phase-appropriate tool access with hard blocks, not just guidelines:
19
28
 
20
- ## Install
29
+ | Phase | `write` / `edit` | `bash` |
30
+ |-------|:-:|:-:|
31
+ | **Brainstorm** / **Plan** | 🔒 Blocked outside `docs/plans/` | 🔒 Read-only commands only (grep, find, cat, git status, curl…) |
32
+ | **Execute** / **Finalize** | ✅ Full access | ✅ Full access |
21
33
 
22
- ```bash
23
- pi install npm:@tianhai/pi-workflow-kit
24
- ```
34
+ The agent can read code and discuss design with you during brainstorm/plan, but it physically cannot modify source files or run mutating commands.
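The access table above can be read as a small decision function. The TypeScript below is a minimal sketch under stated assumptions, not the extension's implementation: `Phase`, `ToolCall`, `isAllowed`, and `READ_ONLY_COMMANDS` are hypothetical names, and the allow-list only covers the commands the README names explicitly.

```typescript
// Hypothetical names throughout; none of these identifiers come from the package.
type Phase = "brainstorm" | "plan" | "execute" | "finalize";

type ToolCall =
  | { tool: "write" | "edit"; path: string }
  | { tool: "bash"; command: string };

// Assumed allow-list: only the read-only commands the README mentions by name.
const READ_ONLY_COMMANDS: RegExp[] = [
  /^grep\b/,
  /^find\b/,
  /^cat\b/,
  /^git\s+status\b/,
  /^curl\b/,
];

function isAllowed(phase: Phase, call: ToolCall): boolean {
  // Execute and finalize have full access to every tool.
  if (phase === "execute" || phase === "finalize") return true;

  if (call.tool === "bash") {
    const command = call.command.trim();
    return READ_ONLY_COMMANDS.some((pattern) => pattern.test(command));
  }

  // write/edit: only paths under docs/plans/ are writable while designing.
  // Simplification: paths are not resolved, so "docs/plans/../src" is not handled.
  const normalized = call.path.replace(/^\.\//, "");
  return normalized.startsWith("docs/plans/");
}
```

Under these assumptions, `isAllowed("brainstorm", { tool: "write", path: "src/index.ts" })` is rejected, while the same call against `docs/plans/2026-04-16-auth-design.md` is accepted.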
25
35
 
26
- ## The Workflow
36
+ ### 🧠 5 Workflow Skills
27
37
 
28
- You control each phase explicitly by invoking the skill:
38
+ Guide the agent through a disciplined development process:
29
39
 
30
- | Phase | Command | What Happens |
40
+ ```
41
+ brainstorm → plan → execute → finalize
42
+
43
+ diagnose (anytime)
44
+ ```
45
+
46
+ | Phase | Trigger | What Happens |
31
47
  |-------|---------|--------------|
32
- | **Brainstorm** | `/skill:brainstorming` | Refine your idea into a design doc via collaborative dialogue |
33
- | **Plan** | `/skill:writing-plans` | Break the design into bite-sized TDD tasks with exact file paths and code |
34
- | **Execute** | `/skill:executing-tasks` | Implement the plan task-by-task with TDD discipline and optional checkpoint review gates |
48
+ | **Brainstorm** | `/skill:brainstorming` | Explore approaches, debate tradeoffs, produce a design doc |
49
+ | **Plan** | `/skill:writing-plans` | Break design into bite-sized TDD tasks with file paths and acceptance criteria |
50
+ | **Execute** | `/skill:executing-tasks` | Implement tasks one-by-one with TDD discipline and optional checkpoint review gates |
35
51
  | **Finalize** | `/skill:finalizing` | Archive plan docs, update README/CHANGELOG, create PR |
52
+ | **Diagnose** | `/skill:diagnose` | 6-phase debugging loop: reproduce → hypothesize → instrument → fix → verify |
36
53
 
37
- During brainstorm and plan, the extension blocks `write`/`edit` outside `docs/plans/` and restricts `bash` to read-only commands. During execute and finalize, all tools are available.
54
+ ## The Workflow in Detail
38
55
 
39
- ### Skills
56
+ ### Phase Control
40
57
 
41
- | Skill | Lines | Description |
42
- |-------|------:|-------------|
43
- | `brainstorming` | ~30 | Explore the idea, propose approaches, write design doc |
44
- | `writing-plans` | ~35 | Break design into tasks with TDD scenarios, set up branch/worktree |
45
- | `executing-tasks` | ~50 | Implement tasks with TDD discipline, checkpoint review gates, handle code review |
46
- | `finalizing` | ~20 | Archive docs, update changelog, create PR, clean up |
47
- | `diagnose` | ~35 | 6-phase debugging loop: build feedback loop, reproduce, hypothesise, instrument, fix, cleanup |
58
+ You control each phase; the agent never advances on its own. Invoke a skill to move forward:
59
+
60
+ ```
61
+ /skill:brainstorming → discuss and design
62
+ /skill:writing-plans → break into tasks
63
+ /skill:executing-tasks → implement with TDD
64
+ /skill:finalizing → ship it
65
+ ```
48
66
 
49
67
  ### TDD Three-Scenario Model
50
68
 
51
- The plan labels each task with its TDD scenario:
69
+ Each task is labeled with its TDD scenario during planning:
52
70
 
53
71
  | Scenario | When | Rule |
54
72
  |----------|------|------|
55
- | New feature | Adding new behavior | Write failing test → implement → pass |
56
- | Modifying tested code | Changing existing behavior | Run existing tests first → modify → verify |
57
- | Trivial | Config, docs, naming | Use judgment |
73
+ | **New feature** | Adding new behavior | Write failing test → implement → pass |
74
+ | **Modifying tested code** | Changing existing behavior | Run existing tests first → modify → verify |
75
+ | **Trivial** | Config, docs, naming | Use judgment |
58
76
 
59
77
  ### Checkpoint Review Gates
60
78
 
61
- Optionally label tasks with a `checkpoint` to pause for human review:
79
+ Optionally label tasks with a `checkpoint` to pause for human review. At each checkpoint the agent stops and waits for your feedback — you can approve, ask for changes, or send it back to rethink. Only when you're satisfied does it move on to the next task.
62
80
 
63
- | Checkpoint | When to use | What happens |
81
+ | Checkpoint | When to Use | What Happens |
64
82
  |---|---|---|
65
83
  | *(none)* | Trivial tasks, well-understood changes | Auto-advance, no pause |
66
- | `checkpoint: test` | Test design matters | Pause after failing test, before implementing |
67
- | `checkpoint: done` | Implementation review matters | Pause after implementation passes tests, before committing |
84
+ | `checkpoint: test` | Test design matters | Agent writes the failing test, then pauses for your review. Verify the test covers the right cases before the agent implements. |
85
+ | `checkpoint: done` | Implementation review matters | Agent implements and passes tests, then pauses for your review. Verify the implementation is correct before committing. |
86
+
87
+ ## Quick Start
88
+
89
+ ```bash
90
+ # Install
91
+ pi install npm:@tianhai/pi-workflow-kit
92
+
93
+ # Start a new feature
94
+ > /skill:brainstorming
95
+ > I want to add OAuth2 login to our API
96
+
97
+ # (agent explores approaches, writes design doc)
98
+ # (write/edit are blocked — your code is safe)
99
+
100
+ > /skill:writing-plans
101
+
102
+ # (agent breaks design into TDD tasks)
103
+ > /skill:executing-tasks
104
+
105
+ # (agent implements with TDD, all tools unlocked)
106
+ > /skill:finalizing
107
+
108
+ # (agent archives docs, updates changelog, creates PR)
109
+ ```
110
+
111
+ ## Why?
112
+
113
+ - **AI agents skip design.** Left unchecked, they jump to code and over-engineer. This forces a think-first workflow.
114
+ - **TDD needs structure.** The three-scenario model gives the agent clear rules for when to write tests first.
115
+ - **You stay in control.** Checkpoint review gates let you approve test designs and implementations before the agent commits.
116
+ - **Enforced, not suggested.** Hard blocks mean the agent can't ignore the rules — not even accidentally.
68
117
 
69
- ## Architecture
118
+ ## Project
70
119
 
71
120
  ```
72
121
  pi-workflow-kit/
@@ -92,4 +141,4 @@ npm test
92
141
 
93
142
  ## License
94
143
 
95
- MIT
144
+ [MIT](LICENSE)
@@ -45,7 +45,7 @@ const DESTRUCTIVE_PATTERNS = [
45
45
  /\bshutdown\b/i,
46
46
  /\bsystemctl\s+(start|stop|restart|enable|disable)/i,
47
47
  /\bservice\s+\S+\s+(start|stop|restart)/i,
48
- /^\s*(vim?|nano|emacs|code|subl)\b/i,
48
+ /\b(vim?|nano|emacs|code|subl)\b/i,
49
49
  ];
50
50
 
51
51
  const SAFE_PATTERNS = [
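The editor pattern in this hunk drops its `^\s*` anchor in favor of a leading `\b`. The following TypeScript sketch copies both regular expressions verbatim from the hunk and compares them; the helper `editorMatches` is a hypothetical name added only for illustration.

```typescript
// Both patterns are copied from the hunk above (old line 48, new line 48).
const anchoredEditor = /^\s*(vim?|nano|emacs|code|subl)\b/i;
const anywhereEditor = /\b(vim?|nano|emacs|code|subl)\b/i;

// Hypothetical helper: report which pattern flags a command string.
function editorMatches(command: string): { anchored: boolean; anywhere: boolean } {
  return {
    anchored: anchoredEditor.test(command),
    anywhere: anywhereEditor.test(command),
  };
}
```

With these definitions, `vim notes.md` matches both patterns, `sudo nano /etc/hosts` matches only the new one, and `grep -rn code src/` also matches only the new one because `code` appears as a standalone word. Whether that last command ends up blocked depends on how the extension combines `DESTRUCTIVE_PATTERNS` with `SAFE_PATTERNS`, which this diff does not show.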
@@ -117,8 +117,7 @@ const SAFE_PATTERNS = [
117
117
  ];
118
118
 
119
119
  /** Split a compound command into individual sub-commands.
120
- * Splits on &&, ||, and ; operators, ignoring leading whitespace.
121
- * Does NOT split on | (pipe) to allow piping (e.g. `git log | head`).
120
+ * Handles &&, ||, ;, and | (pipe) operators, ignoring leading whitespace.
122
121
  */
123
122
  function splitCompoundCommand(command: string): string[] {
124
123
  // Match sub-commands separated by &&, ||, ; (with optional whitespace)
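The updated doc comment says the splitter now handles `|` in addition to `&&`, `||`, and `;`, but the function body is outside this hunk. The TypeScript below is a naive sketch of that behavior, assuming a plain regular-expression split; `splitCompoundSketch` is a hypothetical name and the real implementation may differ, for example in how it treats quoted strings.

```typescript
// Naive sketch: not quote-aware, so an operator inside a quoted string is
// still treated as a separator. The function name is hypothetical.
function splitCompoundSketch(command: string): string[] {
  return command
    // "||" is listed before "|" so a logical OR is consumed as one operator.
    .split(/&&|\|\||;|\|/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}
```

Splitting on the pipe means each stage of a pipeline can be checked on its own, so in `cat a.txt | tee b.txt` the `tee b.txt` stage is evaluated as a separate sub-command.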
package/package.json CHANGED
@@ -1,9 +1,17 @@
1
1
  {
2
2
  "name": "@tianhai/pi-workflow-kit",
3
- "version": "0.10.1",
4
- "description": "Workflow skills and enforcement extensions for pi",
3
+ "version": "0.11.0",
4
+ "description": "Enforce structured brainstorm→plan→execute→finalize workflow with TDD discipline in AI coding agents",
5
5
  "keywords": [
6
- "pi-package"
6
+ "pi-package",
7
+ "ai-coding-agent",
8
+ "workflow",
9
+ "tdd",
10
+ "guard-rails",
11
+ "code-review",
12
+ "pi-extension",
13
+ "brainstorm",
14
+ "test-driven-development"
7
15
  ],
8
16
  "scripts": {
9
17
  "test": "vitest run",
@@ -20,7 +28,6 @@
20
28
  "extensions/",
21
29
  "skills/",
22
30
  "docs/",
23
- "banner.jpg",
24
31
  "LICENSE",
25
32
  "README.md"
26
33
  ],
@@ -10,9 +10,9 @@ Read-only exploration. You may **not** edit or create any files except under `do
10
10
  ## Process
11
11
 
12
12
  1. **Check git state** — run `git status` and `git log --oneline -5`. If there's uncommitted work, ask the user what to do with it first.
13
- 2. **Understand the idea** — read existing code, docs, and recent commits. Ask questions one at a time to refine the idea. Prefer multiple choice when possible.
13
+ 2. **Understand the idea** — read existing code, docs, and recent commits. Grep for related functionality, check package.json/dependencies and module structure. Read only what's necessary to ground the design — don't read the entire codebase. Ask questions to refine the idea. Prefer multiple choice when possible. After each question, check: can you clearly articulate (a) what the user wants to build, (b) why, and (c) key constraints? If yes, present your understanding as a short summary and ask: "Should I proceed with this, or is there more to add?" The human decides when to move on.
14
14
  3. **Explore approaches** — propose 2-3 approaches. For each approach, sketch the concrete interface (types, method signatures, example caller code) so the comparison is grounded in actual code, not abstract descriptions. Lead with your recommendation.
15
- 4. **Present the design** — break it into sections of 200-300 words. Check after each section whether it looks right. Cover: architecture, components, data flow, error handling, testing.
15
+ 4. **Present the design** — break it into focused sections. Each section should be one screen of reading. Present each section to the human and wait for approval before continuing. Cover: architecture, components, data flow, error handling, testing. On feedback, incorporate it and re-present the revised section.
16
16
 
17
17
  When a significant architectural decision is identified, offer to write a lightweight ADR to `docs/plans/adr/`. Only write an ADR when all three are true:
18
18
 
@@ -29,7 +29,7 @@ Read-only exploration. You may **not** edit or create any files except under `do
29
29
  ```
30
30
 
31
31
  ADRs live under `docs/plans/adr/` and are archived during finalizing alongside the design doc.
32
- 5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`. Ask the user to commit it. Branch creation and worktree setup should be deferred to the execution phase (`/skill:executing-tasks`).
32
+ 5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`. Organize features as end-to-end slices (each slice delivers one observable behavior through all relevant layers) so the planning phase can decompose them directly into tasks. Branch creation, committing, and workspace setup are handled by `/skill:executing-tasks`.
33
33
 
34
34
  ## Principles
35
35
 
@@ -10,19 +10,32 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
10
10
  ## Before you start
11
11
 
12
12
  1. **Check git state** — run `git status` and `git log --oneline -5`. Note any uncommitted changes.
13
- 2. **Find the plan** — look for `docs/plans/*-implementation.md`. If multiple exist, ask the user which one to execute.
13
+ 2. **Find the plan** — look for `docs/plans/*-implementation.md`. If none exist, say "No implementation plan found. Run `/skill:writing-plans` first." and stop. If multiple exist, ask the user which one to execute.
14
14
  3. **Check for existing progress** — look for `docs/plans/*-progress.md`. If one exists matching the plan, this is a **resume** (see [Resume](#resume)). If not, this is a **first run** (see [First run](#first-run)).
15
15
 
16
16
  ## First run
17
17
 
18
18
  1. **Parse the implementation plan** — read the plan and extract all `## Task N:` headings. Build the progress table with all tasks as `⬜ pending`.
19
- 2. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
19
+ 2. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
20
+
21
+ - **Branch** (smaller changes):
22
+ ```
23
+ git checkout -b <feature-name>
24
+ ```
25
+ - **Worktree** (larger features, keeps main clean):
26
+ ```
27
+ git worktree add ../<repo>-<feature-name> -b <feature-name>
28
+ ```
29
+
30
+ Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
31
+
32
+ 3. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
20
33
 
21
34
  ```markdown
22
35
  # Progress: <topic>
23
36
 
24
37
  Plan: docs/plans/YYYY-MM-DD-<topic>-implementation.md
25
- Branch: <current-branch>
38
+ Branch: <actual branch name>
26
39
  Started: <ISO timestamp>
27
40
  Last updated: <ISO timestamp>
28
41
 
@@ -31,18 +44,7 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
31
44
  | 1 | ⬜ pending | Task description (preserve checkpoint labels) | — |
32
45
  ```
33
46
 
34
- 3. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
35
-
36
- - **Branch** (smaller changes):
37
- ```
38
- git checkout -b <feature-name>
39
- ```
40
- - **Worktree** (larger features, keeps main clean):
41
- ```
42
- git worktree add ../<repo>-<feature-name> -b <feature-name>
43
- ```
44
-
45
- Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
47
+ Use the actual branch name, whether it's the original branch or a new one from the isolation step.
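The workspace-isolation step derives `<feature-name>` from the plan doc's filename (`docs/plans/2026-04-16-auth-design.md` → `auth`). A minimal TypeScript sketch of that derivation follows, assuming the `YYYY-MM-DD-<topic>-design.md` and `-implementation.md` naming used elsewhere in these skills; `deriveFeatureName` is a hypothetical helper, since the skill itself describes this step in prose only.

```typescript
// Hypothetical helper: extract <topic> from a plan doc path such as
// docs/plans/YYYY-MM-DD-<topic>-design.md or ...-implementation.md.
function deriveFeatureName(planPath: string): string | null {
  const fileName = planPath.split("/").pop() ?? "";
  const match = /^\d{4}-\d{2}-\d{2}-(.+)-(?:design|implementation)\.md$/.exec(fileName);
  // Return null when the filename does not follow the dated naming scheme.
  return match?.[1] ?? null;
}
```

Returning `null` for non-matching names leaves room for the agent to ask the user for a branch name instead of guessing one.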
46
48
 
47
49
  4. **Commit the plan docs** — if `docs/plans/` has uncommitted files, commit them on the new branch:
48
50
  ```
@@ -92,117 +94,84 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
92
94
  For each task the agent works on:
93
95
 
94
96
  1. **Mark in-progress** — update the progress file: `🔄 in-progress`
95
- 2. **Read only the relevant task** — grep/jump to `## Task N:` in the implementation plan. Do not read the entire plan.
96
- 3. **Implement** — follow the TDD discipline (see [TDD discipline](#tdd-discipline)) and checkpoint flow (see [Checkpoints](#checkpoints))
97
- 4. **Commit** — `git add` the relevant files and commit with a clear message
98
- 5. **Update progress** — mark `✅ done` + record the commit hash
99
- 6. **Check next task** — look at the next task in the progress file:
100
- - **Has checkpoint** — pause for review (see [Checkpoint review](#checkpoint-review))
101
- - **No checkpoint** — continue to the next task
102
-
103
- ## Checkpoints
104
-
105
- Check each task for a `checkpoint` label and follow the appropriate flow:
106
-
107
- ### No checkpoint (auto-advance)
108
-
109
- 1. **Implement** — write the code as described in the plan
110
- 2. **Run tests** — verify the changes work
111
- 3. **Fix if needed** — if tests fail, debug and fix before moving on
112
- 4. **Commit** — `git add` the relevant files and commit with a clear message
113
-
114
- ### checkpoint: test
115
-
116
- 1. **Write the test** — follow the TDD scenario for the task
117
- 2. **Pause for review** — show what was done and the diff, then wait for human input
118
- 3. **Continue** — implement, run tests, fix if needed
119
- 4. **Commit** — `git add` the relevant files and commit with a clear message
120
-
121
- ### checkpoint: done
122
-
123
- 1. **Implement** — write the code as described in the plan
124
- 2. **Run tests** — verify the changes work
125
- 3. **Fix if needed** — if tests fail, debug and fix before moving on
126
- 4. **Pause for review** — show what was done and the diff, then wait for human input
127
- 5. **Commit** — `git add` the relevant files and commit with a clear message
97
+ 2. **Read the plan selectively** — read the plan's overview section (everything before `## Task 1:`). Skim all `## Task N:` headings for dependency awareness. Then read the current task's body in full.
98
+ 3. **Write the test** — for `new-feature`: write a failing test. For `modifying-tested-code`: run existing tests first. For `trivial`: skip steps 3-5, go to step 6.
99
+ 4. **Run the test** — confirm it fails (new-feature) or passes (modifying-tested-code). Fix if needed.
100
+ 5. **⏸ PAUSE if `checkpoint: test`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
101
+ 6. **Implement** — write the code to make the test pass.
102
+ 7. **Run tests** — verify everything passes. If tests fail and you cannot fix them after retrying, see [If you're stuck](#if-youre-stuck). If still stuck, mark the task `❌ failed` with the reason in the progress file and move to the next task.
103
+ 8. **Verify against task description** — re-read the task from the plan. Does the implementation satisfy every requirement in the description? If not, fix before proceeding.
104
+ 9. **Refactor if needed** — after all tests pass, check for refactoring opportunities:
105
+ - **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
106
+ - **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
107
+ - **Duplication** — extract repeated patterns
108
+ - **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
109
+
110
+ Run tests after each refactor step. Never refactor while tests are failing.
111
+ 10. **⏸ PAUSE if `checkpoint: done`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
112
+ 11. **Commit** — `git add` the relevant files and commit with a clear message.
113
+ 12. **Update progress** — mark `✅ done` + record the commit hash.
114
+ 13. **Suggest session break if needed** — after completing ~3-5 tasks since the last break, suggest:
115
+ ```
116
+ Tasks N-M done (commits: abc, def)
117
+ Progress: X/Y tasks done
118
+ ⏭ Next: Task [N+1] — [description]
119
+ 💡 Context is building up. For clean context on remaining tasks:
120
+ /new then /skill:executing-tasks
121
+ (or just say "continue" to keep going here)
122
+ ```
123
+ Also suggest at checkpoint review pauses when multiple tasks have been completed since the last break. Respect the user's choice if they say "continue".
124
+ 14. **Loop** — go back to step 1 for the next `⬜ pending` task, or see [After all tasks](#after-all-tasks) if none remain.
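The per-task loop above can be condensed into a function that lists the steps for one task from its TDD scenario and checkpoint label. This TypeScript is a minimal sketch under stated assumptions: `stepsForTask` and the step strings are hypothetical, the scenario and checkpoint values are the ones named in the writing-plans skill, and the session-break suggestion and loop-back steps are left out.

```typescript
// Values taken from the skill docs; the type and function names are hypothetical.
type TddScenario = "new-feature" | "modifying-tested-code" | "trivial";
type Checkpoint = "none" | "test" | "done";

function stepsForTask(tdd: TddScenario, checkpoint: Checkpoint): string[] {
  const steps: string[] = ["mark in-progress", "read plan selectively"];

  // Trivial tasks skip the test-writing steps, including the test checkpoint.
  if (tdd !== "trivial") {
    steps.push(tdd === "new-feature" ? "write failing test" : "run existing tests");
    steps.push("run the test");
    if (checkpoint === "test") steps.push("PAUSE: checkpoint test");
  }

  steps.push("implement", "run tests", "verify against task description", "refactor if needed");

  // The done checkpoint pauses after refactoring and before the commit.
  if (checkpoint === "done") steps.push("PAUSE: checkpoint done");

  steps.push("commit", "update progress");
  return steps;
}
```

One consequence the sketch makes visible: because a `trivial` task jumps straight to implementation, a `checkpoint: test` label on such a task never triggers a pause.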
128
125
 
129
126
  ## Checkpoint review
130
127
 
131
- When pausing at a checkpoint, present:
128
+ When pausing at a `checkpoint: test`, present the test code first:
132
129
 
133
130
  ```
134
- ⏸ Paused at checkpoint: [test|done] for task [N]
135
-
136
- **What was done:** [brief summary]
137
- **Diff:** [show relevant diff]
138
-
139
- Review and let me know how to proceed.
131
+ ⏸ Paused at checkpoint: test for task [N]
132
+
133
+ **Test written:**
134
+ [show the test code]
135
+
136
+ **Expected behavior:** [what this test validates]
137
+ **Next:** Task [N+1] — [description]
138
+
139
+ **Available actions:**
140
+ - **Approve** — continue to implementation (step 6)
141
+ - **Request changes** — describe what to change, I'll update and re-present
142
+ - **Revert** — undo this task and mark it back to pending
143
+ - **Adjust plan** — modify the remaining tasks in the implementation plan
144
+ - `skip` — skip this task and move on
145
+ - `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
146
+ - `status` — show the full progress table
140
147
  ```
141
148
 
142
- Wait for the human to respond. They may:
143
- - Approve and continue
144
- - Request changes to the test or implementation
145
- - Ask to revert the task
146
- - Adjust the remaining plan
147
-
148
- ## TDD discipline
149
-
150
- Follow the TDD scenario from the plan:
151
-
152
- - **New feature**: write the test first, see it fail, then implement
153
- - **Modifying tested code**: run existing tests before and after
154
- - **Trivial change**: use judgment
155
-
156
- Don't skip tests because "it's obvious." The test is the contract.
157
-
158
- ## Refactoring
159
-
160
- After all tests pass for a task, check for refactoring opportunities:
161
-
162
- - **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
163
- - **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
164
- - **Duplication** — extract repeated patterns
165
- - **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
166
-
167
- Run tests after each refactor step. Never refactor while tests are failing.
168
-
169
- Key vocabulary: **depth** (lots of behavior behind a small interface), **seam** (where behavior can be altered without editing in place), **locality** (change concentrated in one place).
170
-
171
- ## Batching and session management
172
-
173
- The agent suggests a fresh session at natural break points to minimize token accumulation. After completing ~3-5 non-checkpoint tasks in the same session, suggest:
149
+ When pausing at a `checkpoint: done`, present the implementation review:
174
150
 
175
151
  ```
176
- Tasks 3-5 done (commits: a1b2, e4f5, i7j8)
177
-
178
- Progress: 5/10 tasks done
152
+ ⏸ Paused at checkpoint: done for task [N]
179
153
 
180
- ⏭ Next: Task 6 — Add auth middleware (no checkpoint)
181
-
182
- 💡 Context is building up. For clean context on remaining tasks:
183
- /new then /skill:executing-tasks
184
- (or just say "continue" to keep going here)
154
+ **What was done:** [brief summary]
155
+ **Diff:** [show relevant diff]
156
+ **Next:** Task [N+1] — [description]
157
+
158
+ **Available actions:**
159
+ - **Approve** — continue to the next task
160
+ - **Request changes** — describe what to change, I'll update and re-present
161
+ - **Revert** — undo this task and mark it back to pending
162
+ - **Adjust plan** — modify the remaining tasks in the implementation plan
163
+ - `skip` — skip this task and move on
164
+ - `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
165
+ - `status` — show the full progress table
185
166
  ```
186
167
 
187
- The user can say "continue" to keep going in the same session. Respect their choice.
168
+ Wait for the human to respond. On **request changes**, make the edits, then re-present at the same checkpoint. Repeat until approved.
188
169
 
189
- Also suggest `/new` at checkpoint review pauses when multiple tasks have been completed since the last session break.
170
+ ## Progress file updates
190
171
 
191
- ## Progress file updates (automated)
192
-
193
- During execution, the agent should update the progress file in place. Example workflow:
194
-
195
- ```bash
196
- # Before task 2 starts:
197
- sed -i 's/| 2 | ⬜ pending/| 2 | 🔄 in-progress/'
198
- # After successful commit a1b2c3d:
199
- sed -i 's/| 2 | 🔄 in-progress/| 2 | ✅ done/'
200
- sed -i 's/| 2 | ✅ done[^|]*|/| 2 | ✅ done | a1b2c3d |/'
201
- # Update timestamp:
202
- sed -i "s/Last updated:.*/Last updated: $(date -u +%Y-%m-%dT%H:%M:%SZ)/"
203
- ```
172
+ Update the progress file by reading it, modifying the relevant row's status and commit hash, and writing it back. Target the specific task row — do not use pattern-matching approaches (e.g. sed) that could corrupt the table.
204
173
 
205
- Note: The agent should use proper markdown table parsing (not naive sed in production) to avoid corrupting the file — ensure the replacement targets the correct row.
174
+ Update `Last updated` timestamp on every change.
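The new guidance replaces the `sed` recipe with a read, modify, and write-back approach that targets one row. The TypeScript below is a minimal sketch of that row-targeted update on the file's text, assuming four-column rows shaped like the template row `| 1 | ⬜ pending | Task description | commit |`. `updateTaskRow` is a hypothetical helper, and reading and writing the file is left to the caller.

```typescript
// Hypothetical helper. Assumes each task row has exactly four cells:
// task number, status, description, commit. A description that itself
// contains "|" would not be recognised as a row and is left untouched.
function updateTaskRow(
  progress: string,
  taskNumber: number,
  status: string,
  commit: string,
  nowIso: string,
): string {
  const lines = progress.split("\n").map((line) => {
    // Refresh the timestamp on every change, as the skill requires.
    if (line.startsWith("Last updated:")) return `Last updated: ${nowIso}`;

    // A four-cell row splits into ["", number, status, description, commit, ""].
    const cells = line.split("|").map((cell) => cell.trim());
    if (cells.length !== 6 || cells[1] !== String(taskNumber)) return line;

    // Rebuild only the matching row; the description is preserved as-is.
    return `| ${cells[1]} | ${status} | ${cells[3]} | ${commit} |`;
  });
  return lines.join("\n");
}
```

Because the match is on the parsed task-number cell rather than on a text pattern, a status string that happens to appear in another row's description cannot be rewritten by mistake.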
206
175
 
207
176
  ## User override commands
208
177
 
@@ -217,18 +186,19 @@ The user can issue these commands at any time during execution:
217
186
 
218
187
  ## Receiving code review
219
188
 
220
- When the user shares code review feedback:
189
+ When the user shares code review feedback (outside of a checkpoint pause):
221
190
 
222
191
  1. **Verify the criticism** — read the relevant code. Is the feedback accurate?
223
192
  2. **Evaluate the suggestion** — is the proposed fix the right approach? Consider alternatives.
224
- 3. **Implement or push back** — if valid, fix it. If not, explain why with evidence from the codebase.
193
+ 3. **Implement or push back** — if valid, fix it, re-run tests, and amend the commit. If not, explain why with evidence from the codebase.
225
194
  4. **Don't blindly implement** — every suggestion should be verified against the code before accepting.
226
195
 
227
196
  ## If you're stuck
228
197
 
229
- - Re-read the current task section from the plan — you may have drifted from the spec
230
- - Check git log — recent commits may reveal context
231
- - Ask the user — it's better to clarify than to guess wrong
198
+ 1. Re-read the current task section from the plan — you may have drifted from the spec
199
+ 2. Check git log — recent commits may reveal context
200
+ 3. Ask the user — it's better to clarify than to guess wrong
201
+ 4. If still stuck after asking, mark the task `❌ failed` with the reason in the progress file and move to the next task
232
202
 
233
203
  ## After all tasks
234
204
 
@@ -25,14 +25,16 @@ Wait for the user to confirm before proceeding.
25
25
  ```
26
26
  mkdir -p docs/plans/completed
27
27
  mkdir -p docs/plans/completed/adr
28
- mv docs/plans/*-design.md docs/plans/completed/
29
- mv docs/plans/*-implementation.md docs/plans/completed/
30
- mv docs/plans/*-progress.md docs/plans/completed/
28
+ mv docs/plans/*-design.md docs/plans/completed/ 2>/dev/null || true
29
+ mv docs/plans/*-implementation.md docs/plans/completed/ 2>/dev/null || true
30
+ mv docs/plans/*-progress.md docs/plans/completed/ 2>/dev/null || true
31
31
  mv docs/plans/adr/*.md docs/plans/completed/adr/ 2>/dev/null || true
32
32
  rmdir docs/plans/adr 2>/dev/null || true
33
33
  git add docs/plans/ && git commit -m "chore: archive planning docs"
34
34
  ```
35
35
 
36
+ Each `mv` gracefully handles the case where no matching files exist (e.g., if the user skipped straight from brainstorm to finalize without executing tasks).
37
+
36
38
  2. **Update documentation** — if the API or surface changed:
37
39
  - Update README.md
38
40
  - Update CHANGELOG.md
@@ -5,22 +5,24 @@ description: "Use this to break a design into an implementation plan with bite-s
5
5
 
6
6
  # Writing Plans
7
7
 
8
- Read-only exploration. You may **not** edit or create any files except under `docs/plans/`.
8
+ You may only create or edit files under `docs/plans/`. Do not modify source code or configuration.
9
9
 
10
10
  ## Process
11
11
 
12
- 1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If no design doc exists, ask the user to describe what they want to build and read relevant code.
13
- 2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`.
12
+ 1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If the design doc is incomplete, fill gaps by asking the human. If no design doc exists, ask the user to describe what they want to build and read relevant code.
13
+ 2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`. If the design is too large for ~15 tasks, flag this to the human and ask whether to reduce scope or proceed with the full plan.
14
+ 3. **Present the plan** — show the complete plan to the human. Wait for approval before suggesting execution.
14
15
 
15
16
  ## Task format
16
17
 
17
- Each task should be 2-5 minutes of work:
18
+ Each task should produce one committed, testable change:
18
19
 
19
20
  - Exact file paths to create/modify
20
- - Complete code (not "add validation")
21
+ - Complete code (not "add validation"). For tasks that depend on types or utilities from earlier tasks, reference them explicitly (e.g., `import { User } from Task 2`) and include only the new code
21
22
  - Exact commands with expected output
22
23
  - `git commit` after each task
23
24
  - Optional `checkpoint: test` or `checkpoint: done` label
25
+ - Each task's tests should cover the happy path and at least one edge case or error path
24
26
 
25
27
  Each task must use a numbered heading:
26
28
 
@@ -33,16 +35,12 @@ Each task must use a numbered heading:
33
35
 
34
36
  ...where N starts at 1 and increases by one for each task in the plan.
35
37
 
36
- The metadata comments (placed right after the heading) are optional but recommended. If present, they help the executing-tasks skill parse the plan correctly.
38
+ The metadata comments (placed right after the heading) are optional. If omitted, the executing-tasks skill infers the TDD scenario and checkpoint from context. When in doubt, include them explicitly.
37
39
 
38
40
  Valid TDD values: `new-feature`, `modifying-tested-code`, `trivial`
39
41
 
40
42
  Valid checkpoint values: `none`, `test`, `done`
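A plan written in this format can be read mechanically: `## Task N:` headings mark tasks, and the optional `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->` comments carry the metadata. The TypeScript below is a minimal sketch of such a parser, not the executing-tasks skill's actual behavior (the skill is prose instructions for the agent). `PlannedTask` and `parsePlanTasks` are hypothetical names, and as a simplification the sketch accepts a metadata comment anywhere in a task body rather than only right after the heading.

```typescript
// Hypothetical types and names; only the heading format, comment format,
// and valid values come from the skill docs.
interface PlannedTask {
  number: number;
  title: string;
  tdd: string | null; // null means "infer from context"
  checkpoint: string | null;
}

const TDD_VALUES = ["new-feature", "modifying-tested-code", "trivial"];
const CHECKPOINT_VALUES = ["none", "test", "done"];

function parsePlanTasks(plan: string): PlannedTask[] {
  const tasks: PlannedTask[] = [];
  for (const line of plan.split("\n")) {
    const heading = /^## Task (\d+):\s*(.*)$/.exec(line);
    if (heading) {
      const title = (heading[2] ?? "").trim();
      tasks.push({ number: Number(heading[1]), title, tdd: null, checkpoint: null });
      continue;
    }

    // Metadata applies to the most recently seen task heading.
    const current = tasks[tasks.length - 1];
    const meta = /^<!--\s*(tdd|checkpoint):\s*([\w-]+)\s*-->$/.exec(line.trim());
    if (!current || !meta) continue;

    // Unknown values are ignored so the field stays null.
    const value = meta[2] ?? "";
    if (meta[1] === "tdd" && TDD_VALUES.includes(value)) current.tdd = value;
    if (meta[1] === "checkpoint" && CHECKPOINT_VALUES.includes(value)) current.checkpoint = value;
  }
  return tasks;
}
```

Leaving a field as `null` when the comment is missing or invalid mirrors the documented fallback, where the scenario and checkpoint are inferred from context.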
41
43
 
42
- These comments are optional — if omitted, the agent infers TDD scenario and checkpoint from context.
43
-
44
- Also use the `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->` metadata comments to specify options explicitly. The inline `checkpoint: test` / `checkpoint: done` label format (e.g. in a task list) is also supported as a fallback, but the metadata comment is the canonical source.
45
-
46
44
 
47
45
  ## Vertical slices
48
46
 
@@ -61,6 +59,8 @@ RIGHT (vertical):
61
59
  Task 3: User can view profile (query + endpoint + test)
62
60
  ```
63
61
 
62
+ Order tasks so each one can be verified independently and delivers a complete vertical slice. If a task requires infrastructure (models, types) that no previous task has created, include it in that task — don't create it as a separate task.
63
+
64
64
  Vertical slices ensure every committed task leaves the codebase in a testable state and reduce the blast radius of a bad task.
65
65
 
66
66
  ## TDD in the plan
package/banner.jpg DELETED
Binary file