@tianhai/pi-workflow-kit 0.5.3 → 0.7.0

Files changed (70)
  1. package/README.md +50 -490
  2. package/docs/developer-usage-guide.md +41 -401
  3. package/docs/oversight-model.md +13 -34
  4. package/docs/plans/2026-04-11-finalizing-merge-options-design.md +33 -0
  5. package/docs/plans/completed/2026-04-11-checkpoint-review-gates-design.md +50 -0
  6. package/docs/plans/completed/2026-04-11-checkpoint-review-gates-implementation.md +98 -0
  7. package/docs/plans/completed/2026-04-11-finalizing-merge-options-design.md +33 -0
  8. package/docs/plans/completed/2026-04-11-finalizing-merge-options-implementation.md +75 -0
  9. package/docs/plans/completed/2026-04-11-workspace-setup-design.md +28 -0
  10. package/docs/plans/completed/2026-04-11-workspace-setup-implementation.md +57 -0
  11. package/docs/workflow-phases.md +32 -46
  12. package/extensions/workflow-guard.ts +67 -0
  13. package/package.json +3 -7
  14. package/skills/brainstorming/SKILL.md +20 -67
  15. package/skills/executing-tasks/SKILL.md +49 -214
  16. package/skills/finalizing/SKILL.md +67 -0
  17. package/skills/writing-plans/SKILL.md +29 -129
  18. package/ROADMAP.md +0 -16
  19. package/agents/code-reviewer.md +0 -18
  20. package/agents/config.ts +0 -5
  21. package/agents/implementer.md +0 -26
  22. package/agents/spec-reviewer.md +0 -13
  23. package/agents/worker.md +0 -17
  24. package/docs/plans/2026-04-10-brainstorming-boundary-enforcement-design.md +0 -60
  25. package/docs/plans/completed/2026-04-09-cleanup-legacy-state-and-enforce-think-phases-design.md +0 -56
  26. package/docs/plans/completed/2026-04-09-cleanup-legacy-state-and-enforce-think-phases-implementation.md +0 -196
  27. package/docs/plans/completed/2026-04-09-workflow-next-autocomplete-design.md +0 -185
  28. package/docs/plans/completed/2026-04-09-workflow-next-autocomplete-implementation.md +0 -334
  29. package/docs/plans/completed/2026-04-09-workflow-next-handoff-state-design.md +0 -251
  30. package/docs/plans/completed/2026-04-09-workflow-next-handoff-state-implementation.md +0 -253
  31. package/extensions/constants.ts +0 -15
  32. package/extensions/lib/logging.ts +0 -138
  33. package/extensions/plan-tracker.ts +0 -508
  34. package/extensions/subagent/agents.ts +0 -144
  35. package/extensions/subagent/concurrency.ts +0 -52
  36. package/extensions/subagent/env.ts +0 -47
  37. package/extensions/subagent/index.ts +0 -1181
  38. package/extensions/subagent/lifecycle.ts +0 -25
  39. package/extensions/subagent/timeout.ts +0 -13
  40. package/extensions/workflow-monitor/debug-monitor.ts +0 -98
  41. package/extensions/workflow-monitor/git.ts +0 -31
  42. package/extensions/workflow-monitor/heuristics.ts +0 -58
  43. package/extensions/workflow-monitor/investigation.ts +0 -52
  44. package/extensions/workflow-monitor/reference-tool.ts +0 -42
  45. package/extensions/workflow-monitor/skip-confirmation.ts +0 -19
  46. package/extensions/workflow-monitor/tdd-monitor.ts +0 -137
  47. package/extensions/workflow-monitor/test-runner.ts +0 -37
  48. package/extensions/workflow-monitor/verification-monitor.ts +0 -61
  49. package/extensions/workflow-monitor/warnings.ts +0 -81
  50. package/extensions/workflow-monitor/workflow-handler.ts +0 -363
  51. package/extensions/workflow-monitor/workflow-next-completions.ts +0 -68
  52. package/extensions/workflow-monitor/workflow-next-state.ts +0 -112
  53. package/extensions/workflow-monitor/workflow-tracker.ts +0 -286
  54. package/extensions/workflow-monitor/workflow-transitions.ts +0 -88
  55. package/extensions/workflow-monitor.ts +0 -909
  56. package/skills/dispatching-parallel-agents/SKILL.md +0 -194
  57. package/skills/receiving-code-review/SKILL.md +0 -196
  58. package/skills/systematic-debugging/SKILL.md +0 -170
  59. package/skills/systematic-debugging/condition-based-waiting-example.ts +0 -158
  60. package/skills/systematic-debugging/condition-based-waiting.md +0 -115
  61. package/skills/systematic-debugging/defense-in-depth.md +0 -122
  62. package/skills/systematic-debugging/find-polluter.sh +0 -63
  63. package/skills/systematic-debugging/reference/rationalizations.md +0 -61
  64. package/skills/systematic-debugging/root-cause-tracing.md +0 -169
  65. package/skills/test-driven-development/SKILL.md +0 -266
  66. package/skills/test-driven-development/reference/examples.md +0 -101
  67. package/skills/test-driven-development/reference/rationalizations.md +0 -67
  68. package/skills/test-driven-development/reference/when-stuck.md +0 -33
  69. package/skills/test-driven-development/testing-anti-patterns.md +0 -299
  70. package/skills/using-git-worktrees/SKILL.md +0 -231
@@ -1,247 +1,82 @@
1
1
  ---
2
2
  name: executing-tasks
3
- description: Use when you have an approved implementation plan to execute task-by-task with human gates and bounded retries
3
+ description: "Use this to implement an approved plan task-by-task. Run after writing-plans, before finalizing."
4
4
  ---
5
5
 
6
6
  # Executing Tasks
7
7
 
8
- ## Overview
8
+ Implement the plan from `docs/plans/*-implementation.md` task by task.
9
9
 
10
- Execute an implementation plan task-by-task using a per-task lifecycle with human gates and bounded retry loops. Each task goes through: **define → approve → execute → verify → review → fix**.
10
+ ## Per-task lifecycle
11
11
 
12
- **Announce at start:** "I'm using the executing-tasks skill to implement the plan."
12
+ Check each task for a `checkpoint` label and follow the appropriate flow:
13
13
 
14
- ## Prerequisites
14
+ ### No checkpoint (auto-advance)
15
15
 
16
- Before starting, verify:
17
- - [ ] On the correct branch/worktree
18
- - [ ] Plan file exists at `docs/plans/YYYY-MM-DD-<name>.md`
19
- - [ ] Plan has been reviewed and approved
16
+ 1. **Implement** — write the code as described in the plan
17
+ 2. **Run tests** — verify the changes work
18
+ 3. **Fix if needed** — if tests fail, debug and fix before moving on
19
+ 4. **Commit** — `git add` the relevant files and commit with a clear message
20
20
 
21
- ## Initialization
21
+ ### checkpoint: test
22
22
 
23
- 1. Read the plan file and extract all tasks, including each task's `Type:` field
24
- 2. Initialize plan_tracker with structured task metadata:
25
- ```
26
- plan_tracker({
27
- action: "init",
28
- tasks: [
29
- { name: "Task 1 name", type: "code" },
30
- { name: "Task 2 name", type: "non-code" },
31
- ],
32
- })
33
- ```
34
- 3. Mark the execute phase as active
23
+ 1. **Write the test** — follow the TDD scenario for the task
24
+ 2. **Pause for review** — show what was done and the diff, then wait for human input
25
+ 3. **Continue** — implement, run tests, fix if needed
26
+ 4. **Commit** — `git add` the relevant files and commit with a clear message
35
27
 
36
- ## Per-Task Lifecycle
28
+ ### checkpoint: done
37
29
 
38
- For each task in the plan:
30
+ 1. **Implement** — write the code as described in the plan
31
+ 2. **Run tests** — verify the changes work
32
+ 3. **Fix if needed** — if tests fail, debug and fix before moving on
33
+ 4. **Pause for review** — show what was done and the diff, then wait for human input
34
+ 5. **Commit** — `git add` the relevant files and commit with a clear message
39
35
 
40
- ### 1. Define
36
+ ## TDD discipline
41
37
 
42
- **Code task →** Write actual test file(s) with assertions:
43
- - Create test files that exercise the new/modified behavior
44
- - Tests must be specific, deterministic, and fail before implementation
45
- - Include edge cases and error conditions
46
- - Apply TDD-specific guidance only to code tasks
38
+ Follow the TDD scenario from the plan:
47
39
 
48
- **Non-code task →** Reuse and refine the plan's acceptance criteria:
49
- - List specific, measurable conditions
50
- - Each criterion must be independently verifiable
51
- - Treat these criteria as the basis for approval and verification
40
+ - **New feature**: write the test first, see it fail, then implement
41
+ - **Modifying tested code**: run existing tests before and after
42
+ - **Trivial change**: use judgment
52
43
 
53
- Update plan_tracker:
54
- ```
55
- plan_tracker({ action: "update", index: N, phase: "define" })
56
- ```
57
-
58
- ### 2. Approve (Human Gate)
59
-
60
- Present the test cases or acceptance criteria to the human:
61
-
62
- **For code tasks:**
63
- - Show the test files to be written
64
- - Explain what each test verifies
65
- - Ask: "Do these test cases cover the requirements? Approve, revise, or reject?"
66
-
67
- **For non-code tasks:**
68
- - Show the acceptance criteria list from the plan
69
- - Ask: "Do these criteria capture the intent? Approve, revise, or reject?"
70
-
71
- **No execution begins until approved.**
72
-
73
- If revised → return to Define step.
74
- If rejected → skip task and mark as blocked.
75
-
76
- ```
77
- plan_tracker({ action: "update", index: N, phase: "approve" })
78
- ```
44
+ Don't skip tests because "it's obvious." The test is the contract.
79
45
 
80
- ### 3. Execute (max 3 attempts)
46
+ ## Checkpoint review
81
47
 
82
- Implement the task following the plan's steps.
48
+ When pausing at a checkpoint, present:
83
49
 
84
- For each attempt:
85
- 1. Write/modify code as specified in the plan
86
- 2. Run tests or verify against acceptance criteria
87
- 3. If all pass → move to Verify
88
- 4. If failures:
89
- - Analyze the failures
90
- - Fix the implementation
91
- - Increment executeAttempts
92
- - If executeAttempts reaches 3 → **escalate to human**
93
-
94
- ```
95
- plan_tracker({ action: "update", index: N, phase: "execute" })
96
- plan_tracker({ action: "update", index: N, attempts: 1 }) // after each attempt (routes to executeAttempts based on phase)
97
50
  ```
51
+ ⏸ Paused at checkpoint: [test|done] for task [N]
98
52
 
99
- **Escalation on budget exhaustion:**
100
- > "I've attempted this task 3 times without success. Options:
101
- > 1. Revise the scope or approach
102
- > 2. Adjust the test cases / acceptance criteria
103
- > 3. Abandon this task and move on
104
- >
105
- > What would you like to do?"
106
-
107
- ### 4. Verify
108
-
109
- Re-run all tests or check all acceptance criteria.
110
-
111
- Report results to the human:
112
- - ✅ Condition 1: passed
113
- - ✅ Condition 2: passed
114
- - ❌ Condition 3: failed — [description of failure]
115
-
116
- **Does not auto-fix.** Flags failures to human for decision.
53
+ **What was done:** [brief summary]
54
+ **Diff:** [show relevant diff]
117
55
 
56
+ Review and let me know how to proceed.
118
57
  ```
119
- plan_tracker({ action: "update", index: N, phase: "verify" })
120
- ```
121
-
122
- If failures detected:
123
- > "Verification found issues. Options:
124
- > 1. Go back to Execute for another attempt
125
- > 2. Revise the tests/criteria
126
- > 3. Accept as-is (mark partial)
127
- >
128
- > What would you like to do?"
129
-
130
- ### 5. Review (two layers)
131
58
 
132
- **Layer 1 — Subagent review:**
133
- - Dispatch a subagent to review the implementation against the task spec
134
- - Subagent checks: correctness, edge cases, code quality, test coverage
135
- - Subagent reports findings
59
+ Wait for the human to respond. They may:
60
+ - Approve and continue
61
+ - Request changes to the test or implementation
62
+ - Ask to revert the task
63
+ - Adjust the remaining plan
136
64
 
137
- Use `agentScope: "both"` to access the bundled `code-reviewer` agent:
138
- ```
139
- subagent({ agent: "code-reviewer", task: "Review implementation of task N against spec", agentScope: "both" })
140
- ```
65
+ ## Receiving code review
141
66
 
142
- **Layer 2 — Human sign-off:**
143
- - Present the subagent review + test results to the human
144
- - Summarize what was done, what passed, any concerns
145
- - Ask: "Does this look good? Approve or request changes?"
67
+ When the user shares code review feedback:
146
68
 
147
- ```
148
- plan_tracker({ action: "update", index: N, phase: "review" })
149
- ```
69
+ 1. **Verify the criticism** — read the relevant code. Is the feedback accurate?
70
+ 2. **Evaluate the suggestion** — is the proposed fix the right approach? Consider alternatives.
71
+ 3. **Implement or push back** — if valid, fix it. If not, explain why with evidence from the codebase.
72
+ 4. **Don't blindly implement** — every suggestion should be verified against the code before accepting.
150
73
 
151
- If issues found → move to Fix.
74
+ ## If you're stuck
152
75
 
153
- ### 6. Fix (max 3 loops, re-enters Verify → Review)
76
+ - Re-read the plan — you may have drifted from the spec
77
+ - Check git log — recent commits may reveal context
78
+ - Ask the user — it's better to clarify than to guess wrong
154
79
 
155
- 1. Address the review feedback
156
- 2. Re-enter Verify → Review cycle
157
- 3. Increment fixAttempts after each fix round
158
- 4. If fixAttempts reaches 3 → **escalate to human**
159
-
160
- ```
161
- plan_tracker({ action: "update", index: N, phase: "fix" })
162
- plan_tracker({ action: "update", index: N, attempts: 1 }) // routes to fixAttempts based on phase
163
- ```
164
-
165
- **Escalation on budget exhaustion:**
166
- > "I've attempted fixes 3 times. Options:
167
- > 1. Proceed as-is despite remaining issues
168
- > 2. Keep fixing (at your own risk)
169
- > 3. Abandon this task and move on
170
- >
171
- > What would you like to do?"
172
-
173
- ### Task Complete
174
-
175
- When both reviewers are satisfied and all conditions pass:
176
-
177
- ```
178
- plan_tracker({ action: "update", index: N, status: "complete" })
179
- ```
180
-
181
- Commit the task:
182
- ```bash
183
- git add <relevant files>
184
- git commit -m "feat(task N): <description>"
185
- ```
186
-
187
- ## Escalation Rules
188
-
189
- | Event | Action |
190
- |-------|--------|
191
- | Execute 3 attempts exhausted | Escalate to human — never auto-skip |
192
- | Fix loop 3 attempts exhausted | Escalate to human — never auto-skip |
193
- | Verify fails | Flag to human — human decides next step |
194
-
195
- **No silent skipping. Consistent escalation everywhere.**
196
-
197
- ## Finalize Phase
198
-
199
- After all tasks complete (or are explicitly accepted by human):
200
-
201
- ### 1. Final Review
202
- - Dispatch subagent to review the entire implementation holistically
203
- - Check for integration issues, consistency across tasks, documentation gaps
204
-
205
- ### 2. Create PR
206
- ```bash
207
- git push origin <branch>
208
- gh pr create --title "feat: <feature summary>" --body "<task summary>"
209
- ```
210
-
211
- ### 3. Archive Planning Docs
212
- ```bash
213
- mkdir -p docs/plans/completed
214
- mv docs/plans/<plan-file> docs/plans/completed/
215
- ```
80
+ ## After all tasks
216
81
 
217
- ### 4. Update Repo Docs
218
- - Update CHANGELOG with feature summary
219
- - Update README if API/surface changed
220
- - Update inline documentation as needed
221
-
222
- ### 5. Update Project Documentation
223
- - Update README if project overview has changed
224
- - Update CONTRIBUTING or architecture docs if structure changed
225
- - Note any new patterns or conventions introduced
226
-
227
- ### 6. Clean Up
228
- - Remove worktree if one was used
229
- - Mark finalize phase complete
230
-
231
- ## Boundaries
232
- - Read code, docs, and tests: yes
233
- - Write tests and implementation code: yes (within current task scope)
234
- - Write to docs/plans/completed/: yes (during finalize)
235
- - Edit files outside task scope: no (unless human explicitly approves)
236
-
237
- ## Remember
238
- - Always present test cases/criteria for human approval before executing
239
- - Extract each task's `Type:` from the plan and preserve it in `plan_tracker`
240
- - Track per-task phase and attempts in plan_tracker
241
- - Code tasks use TDD; non-code tasks use acceptance criteria during define, approve, and verify
242
- - Escalate immediately on budget exhaustion — never silently skip or continue
243
- - Verify does not auto-fix — always flag to human
244
- - Review has two layers (subagent first, then human)
245
- - Fix loops re-enter verify → review (max 3 fix loops)
246
- - Execute has separate budget (max 3 attempts)
247
- - Total max cycles per task: 3 execute + 3 fix = 6
82
+ Ask: "All tasks done? Run `/skill:finalizing` to ship."
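
The three per-task flows in the new `executing-tasks` skill differ only in whether, and where, the review pause occurs. A minimal TypeScript sketch of that mapping follows. The `Checkpoint` type, the `FLOWS` table, and the `stepsFor` and `pauseIndex` helpers are hypothetical names used for illustration; the package describes these flows only in prose.

```typescript
// Hypothetical model of the per-task flows; not part of the package.
type Checkpoint = "none" | "test" | "done";

// Step order for each flow, as listed in the skill text.
const FLOWS: Record<Checkpoint, readonly string[]> = {
  none: ["implement", "run tests", "fix if needed", "commit"],
  test: ["write the test", "pause for review", "continue", "commit"],
  done: ["implement", "run tests", "fix if needed", "pause for review", "commit"],
};

// Returns the ordered steps for a task with the given checkpoint label.
function stepsFor(checkpoint: Checkpoint): readonly string[] {
  return FLOWS[checkpoint];
}

// Returns the position of the human review pause, or -1 if the flow auto-advances.
function pauseIndex(checkpoint: Checkpoint): number {
  return FLOWS[checkpoint].indexOf("pause for review");
}
```

In this sketch, `checkpoint: test` pauses right after the test is written, while `checkpoint: done` pauses after implementation and tests but before the commit.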
@@ -0,0 +1,67 @@
1
+ ---
2
+ name: finalizing
3
+ description: "Use this after all tasks are complete to clean up, document, and ship the work."
4
+ ---
5
+
6
+ # Finalizing
7
+
8
+ Ship the completed work.
9
+
10
+ ## Process
11
+
12
+ 1. **Move planning docs** — archive the design and implementation docs, then commit:
13
+ ```
14
+ mkdir -p docs/plans/completed
15
+ mv docs/plans/*-design.md docs/plans/completed/
16
+ mv docs/plans/*-implementation.md docs/plans/completed/
17
+ git add docs/plans/completed/ && git commit -m "chore: archive planning docs"
18
+ ```
19
+
20
+ 2. **Update documentation** — if the API or surface changed:
21
+ - Update README.md
22
+ - Update CHANGELOG.md
23
+ - Update any inline docs
24
+
25
+ 3. **Choose a merge strategy** — ask the human which option they prefer:
26
+
27
+ 1. **Create PR** — push and open a PR for external review:
28
+ ```
29
+ git push origin <branch>
30
+ gh pr create --title "feat: <summary>" --body "<task summary>"
31
+ ```
32
+
33
+ 2. **Rebase & merge** *(recommended)* — rebase onto parent, fast-forward merge, push parent, delete branch:
34
+ ```
35
+ parent=$(git show-branch -a 2>/dev/null | grep '\*' | grep -v "$(git branch --show-current)" | head -1 | sed 's/.*\[\(.*\)\].*/\1/' | sed 's/[\^~].*//')
36
+ git checkout "$parent" && git pull
37
+ git checkout - && git rebase "$parent"
38
+ git checkout "$parent" && git merge --ff-only -
39
+ git push origin "$parent"
40
+ git branch -d - && git push origin --delete -
41
+ ```
42
+
43
+ 3. **Squash & merge** — squash all commits into one on parent, push parent, delete branch:
44
+ ```
45
+ parent=$(git show-branch -a 2>/dev/null | grep '\*' | grep -v "$(git branch --show-current)" | head -1 | sed 's/.*\[\(.*\)\].*/\1/' | sed 's/[\^~].*//')
46
+ git checkout "$parent" && git pull
47
+ git merge --squash -
48
+ git commit -m "feat: <summary>"
49
+ git push origin "$parent"
50
+ git branch -d - && git push origin --delete -
51
+ ```
52
+
53
+ 4. **Merge commit** — merge with `--no-ff`, push parent, delete branch:
54
+ ```
55
+ parent=$(git show-branch -a 2>/dev/null | grep '\*' | grep -v "$(git branch --show-current)" | head -1 | sed 's/.*\[\(.*\)\].*/\1/' | sed 's/[\^~].*//')
56
+ git checkout "$parent" && git pull
57
+ git merge --no-ff -m "Merge branch '<branch>'" -
58
+ git push origin "$parent"
59
+ git branch -d - && git push origin --delete -
60
+ ```
61
+
62
+ For options 2–4, confirm the detected parent branch with the human before proceeding.
63
+
64
+ 4. **Clean up** — if a worktree was used, remove it:
65
+ ```
66
+ git worktree remove ../<repo>-<feature-name>
67
+ ```
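
The parent-branch detection used in options 2–4 of the `finalizing` skill pipes `git show-branch` output through `sed` to pull out a bracketed ref and strip any ancestry suffix. The TypeScript sketch below shows the same extraction on a single line of text, assuming the line contains a ref such as `[main~2]`. The function name `parentFromShowBranchLine` is hypothetical and the code is not part of the package.

```typescript
// Hypothetical helper: extracts a branch name from one line of `git show-branch` output.
function parentFromShowBranchLine(line: string): string | null {
  // Find the first bracketed ref, e.g. "[main~2]".
  const bracketed = /\[([^\]]+)\]/.exec(line);
  if (!bracketed) return null;
  // Drop ancestry suffixes such as "~2" or "^", leaving the bare branch name.
  return bracketed[1].replace(/[\^~].*$/, "");
}
```

This sketch takes the first bracketed ref on the line, whereas the greedy `.*` in the `sed` expression would select the last bracketed text if the commit subject itself contains brackets. Either way, the skill asks to confirm the detected parent with the human before proceeding.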
@@ -1,149 +1,49 @@
1
1
  ---
2
2
  name: writing-plans
3
- description: Use when you have a spec or requirements for a multi-step task, before touching code
3
+ description: "Use this to break a design into an implementation plan with bite-sized TDD tasks. Works with or without a prior brainstorm."
4
4
  ---
5
5
 
6
- > **Related skills:** Did you `/skill:brainstorming` first? Ready to implement? Use `/skill:executing-tasks`.
7
-
8
6
  # Writing Plans
9
7
 
10
- ## Overview
11
-
12
- Write comprehensive implementation plans assuming the engineer has zero context for our codebase and questionable taste. Document everything they need to know: which files to touch for each task, code, testing, docs they might need to check, how to test it. Give them the whole plan as bite-sized tasks. DRY. YAGNI. TDD. Frequent commits.
13
-
14
- Assume they are a skilled developer, but know almost nothing about our toolset or problem domain. Assume they don't know good test design very well.
15
-
16
- **Announce at start:** "I'm using the writing-plans skill to create the implementation plan."
17
-
18
- **Context:** This should be run in a dedicated worktree (created by brainstorming skill).
19
-
20
- **Save plans to:** `docs/plans/YYYY-MM-DD-<feature-name>-implementation.md`
21
-
22
- ## Boundaries
23
- - Read code and docs: yes
24
- - Write to docs/plans/: yes
25
- - Edit or create any other files: no
26
-
27
- ## Bite-Sized Task Granularity
28
-
29
- **Each step is one action (2-5 minutes):**
30
- - "Write the failing test" - step
31
- - "Run it to make sure it fails" - step
32
- - "Implement the minimal code to make the test pass" - step
33
- - "Run the tests and make sure they pass" - step
34
- - "Commit" - step
35
-
36
- ## Plan Document Header
37
-
38
- **Every plan MUST start with this header:**
39
-
40
- ```markdown
41
- # [Feature Name] Implementation Plan
42
-
43
- > **REQUIRED SUB-SKILL:** Use the executing-tasks skill to implement this plan task-by-task.
44
-
45
- **Goal:** [One sentence describing what this builds]
46
-
47
- **Architecture:** [2-3 sentences about approach]
48
-
49
- **Tech Stack:** [Key technologies/libraries]
50
-
51
- ---
52
- ```
8
+ Read-only exploration. You may **not** edit or create any files except under `docs/plans/`.
53
9
 
54
- ## Task Structure
10
+ ## Process
55
11
 
56
- Every task must declare its type explicitly so `executing-tasks` can initialize `plan_tracker` with the correct metadata.
12
+ 1. **Check for a design doc & workspace** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. Verify you're on the feature branch (or in its worktree) created during brainstorming. If no design doc exists, ask the user to describe what they want to build, read relevant code, create a branch, and create the plan directly.
13
+ 2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`.
57
14
 
58
- ### Code task template
15
+ ## Task format
59
16
 
60
- ```markdown
61
- ### Task N: [Component Name]
17
+ Each task should be 2-5 minutes of work:
62
18
 
63
- **Type:** code
64
- **TDD scenario:** [New feature — full TDD cycle | Modifying tested code — run existing tests first | Trivial change — use judgment]
65
-
66
- **Files:**
67
- - Create: `exact/path/to/file.py`
68
- - Modify: `exact/path/to/existing.py:123-145`
69
- - Test: `tests/exact/path/to/test.py`
70
-
71
- **Step 1: Write the failing test**
72
-
73
- ```python
74
- def test_specific_behavior():
75
- result = function(input)
76
- assert result == expected
77
- ```
78
-
79
- **Step 2: Run test to verify it fails**
80
-
81
- Run: `pytest tests/path/test.py::test_name -v`
82
- Expected: FAIL with "function not defined"
83
-
84
- **Step 3: Write minimal implementation**
85
-
86
- ```python
87
- def function(input):
88
- return expected
89
- ```
90
-
91
- **Step 4: Run test to verify it passes**
92
-
93
- Run: `pytest tests/path/test.py::test_name -v`
94
- Expected: PASS
95
-
96
- **Step 5: Commit**
97
-
98
- ```bash
99
- git add tests/path/test.py src/path/file.py
100
- git commit -m "feat: add specific feature"
101
- ```
102
- ```
103
-
104
- ### Non-code task template
105
-
106
- ```markdown
107
- ### Task N: [Documentation / rollout / analysis task]
108
-
109
- **Type:** non-code
110
-
111
- **Files:**
112
- - Modify: `README.md`
113
- - Modify: `docs/architecture.md`
19
+ - Exact file paths to create/modify
20
+ - Complete code (not "add validation")
21
+ - Exact commands with expected output
22
+ - `git commit` after each task
23
+ - Optional `checkpoint: test` or `checkpoint: done` label
114
24
 
115
- **Acceptance criteria:**
116
- - Criterion 1: [Specific, observable outcome]
117
- - Criterion 2: [Specific, observable outcome]
118
- - Criterion 3: [Specific, observable outcome]
25
+ ## TDD in the plan
119
26
 
120
- **Implementation notes:**
121
- - Update the listed files only.
122
- - Keep terminology consistent with the rest of the repo.
123
- - Reference the relevant code paths or docs where useful.
27
+ Label each task with its TDD scenario:
124
28
 
125
- **Verification:**
126
- - Review each acceptance criterion one-by-one.
127
- - Confirm the updated docs match the implemented behavior.
128
- ```
29
+ | Scenario | When | Instructions in the task |
30
+ |---|---|---|
31
+ | **New feature** | Adding new behavior | Write failing test → run it → implement → run it → commit |
32
+ | **Modifying tested code** | Changing existing behavior | Run existing tests first → modify → verify they pass → commit |
33
+ | **Trivial** | Config, docs, naming | Use judgment, commit when done |
129
34
 
130
- ## Remember
131
- - Exact file paths always
132
- - Every task must include `**Type:** code` or `**Type:** non-code`
133
- - Non-code tasks must include explicit `**Acceptance criteria:**`
134
- - Complete code in plan (not "add validation")
135
- - Exact commands with expected output
136
- - Reference relevant skills
137
- - DRY, YAGNI, TDD, frequent commits
138
- - Order tasks so each task's dependencies are completed by earlier tasks
139
- - If plan exceeds ~8 tasks, consider splitting into phases with a checkpoint between them
35
+ ## Checkpoint labels
140
36
 
141
- ## Execution Handoff
37
+ Optionally label each task with a `checkpoint` to require human review before proceeding:
142
38
 
143
- After saving the plan, the workflow monitor automatically tracks phase transitions when you invoke skills.
39
+ | Checkpoint | When to use | What happens during execution |
40
+ |---|---|---|
41
+ | *(none)* | Trivial tasks, well-understood changes | Auto-advance, no pause |
42
+ | **`checkpoint: test`** | Test design matters (API contracts, edge cases, complex behavior) | Pause after writing the failing test, before implementing |
43
+ | **`checkpoint: done`** | Implementation review matters (complex logic, security, performance) | Pause after implementation + tests pass, before committing |
144
44
 
145
- Then offer execution:
45
+ Use judgment when assigning checkpoints. Prefer `checkpoint: test` for new features with non-obvious test design. Prefer `checkpoint: done` for tasks where the implementation approach is debatable. Most tasks should not need a checkpoint. The user can adjust checkpoints when reviewing the plan.
146
46
 
147
- **"Plan complete and saved to `docs/plans/<filename>.md`. Ready to execute with `/skill:executing-tasks`."**
47
+ ## After the plan
148
48
 
149
- The executing-tasks skill handles the full per-task lifecycle (define → approve → execute → verify → review → fix) with human gates and bounded retry loops.
49
+ Ask: "Ready to execute? Run `/skill:executing-tasks`"
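
The new `writing-plans` skill lets each task carry an optional `checkpoint: test` or `checkpoint: done` label, with no label meaning auto-advance. The TypeScript sketch below shows one way such a label could be read from a task's text. It assumes the label appears literally as `checkpoint: <value>` somewhere in the task, which the skill does not specify; `parseCheckpoint` is a hypothetical name and the code is not part of the package.

```typescript
// Hypothetical reader for the optional checkpoint label on a plan task.
type Checkpoint = "none" | "test" | "done";

function parseCheckpoint(taskText: string): Checkpoint {
  // Assumed label syntax: "checkpoint: test" or "checkpoint: done", case-insensitive.
  const match = /checkpoint:\s*(test|done)\b/i.exec(taskText);
  // A task without a label auto-advances, modeled here as "none".
  return match ? (match[1].toLowerCase() as Checkpoint) : "none";
}
```

Defaulting to `"none"` mirrors the skill's guidance that most tasks should not need a checkpoint.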
package/ROADMAP.md DELETED
@@ -1,16 +0,0 @@
1
- # Roadmap
2
-
3
- Short-term priorities for `pi-workflow-kit`:
4
-
5
- - keep the simplified 4-phase workflow coherent across skills, extensions, and docs
6
- - improve `executing-tasks` ergonomics for real feature delivery
7
- - continue tightening workflow-monitor tests around phase transitions and handoffs
8
- - improve README and reference docs so package behavior is easy to trust
9
- - expand examples for non-code tasks and finalize flows
10
-
11
- Longer-term possibilities:
12
-
13
- - richer plan parsing helpers for typed task extraction
14
- - deeper TUI support for active-task detail
15
- - more bundled subagents for targeted review and maintenance tasks
16
- - optional reporting/export of workflow state
package/agents/code-reviewer.md DELETED
@@ -1,18 +0,0 @@
1
- ---
2
- name: code-reviewer
3
- description: "Production readiness review: quality, security, testing (read-only)"
4
- tools: read, bash, find, grep, ls
5
- ---
6
-
7
- You are a code quality reviewer.
8
-
9
- Review for:
10
- - correctness, error handling
11
- - maintainability
12
- - security and footguns
13
- - test coverage quality
14
-
15
- Return:
16
- - Strengths
17
- - Issues (Critical/Important/Minor)
18
- - Clear verdict (ready or not)
package/agents/config.ts DELETED
@@ -1,5 +0,0 @@
1
- /**
2
- * Canonical model configuration for bundled agents.
3
- * Change this one constant to update the default model for all bundled agents.
4
- */
5
- export const DEFAULT_MODEL = "claude-sonnet-4-5";
package/agents/implementer.md DELETED
@@ -1,26 +0,0 @@
1
- ---
2
- name: implementer
3
- description: Implement tasks via TDD and commit small changes
4
- tools: read, write, edit, bash, plan_tracker, workflow_reference
5
- extensions: ../extensions/workflow-monitor, ../extensions/plan-tracker
6
- ---
7
-
8
- You are an implementation subagent.
9
-
10
- ## TDD Approach
11
-
12
- Determine which scenario applies before writing code:
13
-
14
- **New files / new features:** Full TDD. Write a failing test first, verify it fails, implement minimal code to pass, refactor.
15
-
16
- **Modifying code with existing tests:** Run existing tests first to confirm green. Make your change. Run tests again. If the change isn't covered by existing tests, add a test. If it is, you're done.
17
-
18
- **Trivial changes (typo, config, rename):** Use judgment. Run relevant tests after if they exist.
19
-
20
- **If you see a ⚠️ TDD warning:** Pause. Consider which scenario applies. If existing tests cover your change, run them and proceed. If not, write a test first.
21
-
22
- ## Rules
23
- - Keep changes minimal and scoped to the task.
24
- - Run the narrowest test(s) first, then the full suite when appropriate.
25
- - Commit when the task's tests pass.
26
- - Report: what changed, tests run, files changed, any concerns.
package/agents/spec-reviewer.md DELETED
@@ -1,13 +0,0 @@
1
- ---
2
- name: spec-reviewer
3
- description: Verify implementation matches the plan/spec (read-only)
4
- tools: read, bash, find, grep, ls
5
- ---
6
-
7
- You are a spec compliance reviewer.
8
-
9
- Check the implementation against the provided requirements.
10
- - Identify missing requirements.
11
- - Identify scope creep / unrequested changes.
12
- - Point to exact files/lines and provide concrete fixes.
13
- Return a clear verdict: ✅ compliant / ❌ not compliant.
package/agents/worker.md DELETED
@@ -1,17 +0,0 @@
1
- ---
2
- name: worker
3
- description: General-purpose worker for isolated tasks
4
- tools: read, write, edit, bash, plan_tracker, workflow_reference
5
- extensions: ../extensions/workflow-monitor, ../extensions/plan-tracker
6
- ---
7
-
8
- You are a general-purpose subagent. Follow the task exactly.
9
-
10
- ## TDD (when changing production code)
11
-
12
- - New files: write a failing test first, then implement.
13
- - Modifying existing code: run existing tests first, make your change, run again. Add tests if not covered.
14
- - Trivial changes: run relevant tests after if they exist.
15
- - If you see a ⚠️ TDD warning, pause and decide which scenario applies before proceeding.
16
-
17
- Prefer small, test-backed changes.