npm - @tianhai/pi-workflow-kit - Versions diffs - 0.10.1 → 0.11.0 - Mend

@tianhai/pi-workflow-kit 0.10.1 → 0.11.0

Files changed (8)

package/README.md +86 -37
package/extensions/workflow-guard.ts +2 -3
package/package.json +11 -4
package/skills/brainstorming/SKILL.md +3 -3
package/skills/executing-tasks/SKILL.md +86 -116
package/skills/finalizing/SKILL.md +5 -3
package/skills/writing-plans/SKILL.md +10 -10
package/banner.jpg +0 -0

package/README.md CHANGED

@@ -1,72 +1,121 @@
 # pi-workflow-kit
-Structured workflow skills and enforcement for [pi](https://github.com/badlogic/pi-mono).
+> Stop AI agents from rushing to code. Enforce a structured brainstorm→plan→execute→finalize workflow with TDD discipline.
-## What You Get
+AI coding agents tend to skip design and jump straight into implementation, producing over-engineered or misaligned code. **pi-workflow-kit** solves this by hard-blocking write operations during brainstorm and planning phases — the agent *literally cannot modify your source files* until you approve the design.
+[pi](https://github.com/badlogic/pi-mono) package. Zero configuration required.
-**4 workflow skills** that guide the agent through a structured development process:
+## Install
+```bash
+pi install npm:@tianhai/pi-workflow-kit
 ```
-brainstorm → plan → execute → finalize
+No setup needed — skills and guards activate automatically after install.
+**Want to try before committing?**
+```bash
+pi -e npm:@tianhai/pi-workflow-kit
 ```
-**1 extension** that enforces the rules:
+## What You Get
-- During brainstorming and planning, `write` and `edit` are **hard-blocked** outside `docs/plans/`. The agent can only read code and discuss the design with you — it literally cannot modify source files.
-- `bash` is **restricted to read-only commands** — file writes, installs, git mutations, and editors are blocked. Safe commands like `grep`, `find`, `git status`, `cat`, `curl`, `go doc`, `go list` remain available.
+### 🛡️ Workflow Guard (extension)
-No configuration required. Skills and extensions activate automatically after install.
+Enforces phase-appropriate tool access — not just guidelines, but hard blocks:
-## Install
+| Phase | `write` / `edit` | `bash` |
+|-------|:-:|:-:|
+| **Brainstorm** / **Plan** | 🔒 Blocked outside `docs/plans/` | 🔒 Read-only only (grep, find, cat, git status, curl…) |
+| **Execute** / **Finalize** | ✅ Full access | ✅ Full access |
-```bash
-pi install npm:@tianhai/pi-workflow-kit
-```
+The agent can read code and discuss design with you during brainstorm/plan, but it physically cannot modify source files or run mutating commands.
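
As a rough illustration of the access table in this hunk, the write/edit rule could be modeled as a pure function. This is only a sketch under stated assumptions, not the extension's code: `Phase` and `isWriteAllowed` are hypothetical names, paths are assumed to be repo-relative with forward slashes, and the `docs/plans/` check is reduced to a prefix test.

```typescript
// Hypothetical model of the phase table; not taken from workflow-guard.ts.
type Phase = "brainstorm" | "plan" | "execute" | "finalize";

// Returns true when a write or edit to `path` would be permitted in `phase`.
// Assumption: `path` is repo-relative and uses forward slashes.
function isWriteAllowed(phase: Phase, path: string): boolean {
  if (phase === "execute" || phase === "finalize") {
    return true; // full access in these phases
  }
  // Brainstorm and plan: only files under docs/plans/ may be touched.
  const normalized = path.replace(/^\.\//, "");
  // Reject any ".." so a path cannot climb out of docs/plans/.
  return normalized.startsWith("docs/plans/") && !normalized.includes("..");
}
```

Under these assumptions, `isWriteAllowed("plan", "src/index.ts")` is `false`, while the same path is allowed once the phase is `"execute"`.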
-## The Workflow
+### 🧠 5 Workflow Skills
-You control each phase explicitly by invoking the skill:
+Guide the agent through a disciplined development process:
-| Phase | Command | What Happens |
+```
+brainstorm → plan → execute → finalize
+              ↕
+           diagnose (anytime)
+```
+| Phase | Trigger | What Happens |
 |-------|---------|--------------|
-| **Brainstorm** | `/skill:brainstorming` | Refine your idea into a design doc via collaborative dialogue |
-| **Plan** | `/skill:writing-plans` | Break the design into bite-sized TDD tasks with exact file paths and code |
-| **Execute** | `/skill:executing-tasks` | Implement the plan task-by-task with TDD discipline and optional checkpoint review gates |
+| **Brainstorm** | `/skill:brainstorming` | Explore approaches, debate tradeoffs, produce a design doc |
+| **Plan** | `/skill:writing-plans` | Break design into bite-sized TDD tasks with file paths and acceptance criteria |
+| **Execute** | `/skill:executing-tasks` | Implement tasks one-by-one with TDD discipline and optional checkpoint review gates |
 | **Finalize** | `/skill:finalizing` | Archive plan docs, update README/CHANGELOG, create PR |
+| **Diagnose** | `/skill:diagnose` | 6-phase debugging loop: reproduce → hypothesize → instrument → fix → verify |
-During brainstorm and plan, the extension blocks `write`/`edit` outside `docs/plans/` and restricts `bash` to read-only commands. During execute and finalize, all tools are available.
+## The Workflow in Detail
-### Skills
+### Phase Control
-| Skill | Lines | Description |
-|-------|------:|-------------|
-| `brainstorming` | ~30 | Explore the idea, propose approaches, write design doc |
-| `writing-plans` | ~35 | Break design into tasks with TDD scenarios, set up branch/worktree |
-| `executing-tasks` | ~50 | Implement tasks with TDD discipline, checkpoint review gates, handle code review |
-| `finalizing` | ~20 | Archive docs, update changelog, create PR, clean up |
-| `diagnose` | ~35 | 6-phase debugging loop: build feedback loop, reproduce, hypothesise, instrument, fix, cleanup |
+You control each phase — the agent never advances on its own. Invoke a skill to move forward:
+```
+/skill:brainstorming   →  discuss and design
+/skill:writing-plans   →  break into tasks
+/skill:executing-tasks →  implement with TDD
+/skill:finalizing      →  ship it
+```
 ### TDD Three-Scenario Model
-The plan labels each task with its TDD scenario:
+Each task is labeled with its TDD scenario during planning:
 | Scenario | When | Rule |
 |----------|------|------|
-| New feature | Adding new behavior | Write failing test → implement → pass |
-| Modifying tested code | Changing existing behavior | Run existing tests first → modify → verify |
-| Trivial | Config, docs, naming | Use judgment |
+| **New feature** | Adding new behavior | Write failing test → implement → pass |
+| **Modifying tested code** | Changing existing behavior | Run existing tests first → modify → verify |
+| **Trivial** | Config, docs, naming | Use judgment |
 ### Checkpoint Review Gates
-Optionally label tasks with a `checkpoint` to pause for human review:
+Optionally label tasks with a `checkpoint` to pause for human review. At each checkpoint the agent stops and waits for your feedback — you can approve, ask for changes, or send it back to rethink. Only when you're satisfied does it move on to the next task.
-| Checkpoint | When to use | What happens |
+| Checkpoint | When to Use | What Happens |
 |---|---|---|
 | *(none)* | Trivial tasks, well-understood changes | Auto-advance, no pause |
-| `checkpoint: test` | Test design matters | Pause after failing test, before implementing |
-| `checkpoint: done` | Implementation review matters | Pause after implementation passes tests, before committing |
+| `checkpoint: test` | Test design matters | Agent writes the failing test, then pauses for your review. Verify the test covers the right cases before the agent implements. |
+| `checkpoint: done` | Implementation review matters | Agent implements and passes tests, then pauses for your review. Verify the implementation is correct before committing. |
+## Quick Start
+```bash
+# Install
+pi install npm:@tianhai/pi-workflow-kit
+# Start a new feature
+> /skill:brainstorming
+> I want to add OAuth2 login to our API
+# (agent explores approaches, writes design doc)
+# (write/edit are blocked — your code is safe)
+> /skill:writing-plans
+# (agent breaks design into TDD tasks)
+> /skill:executing-tasks
+# (agent implements with TDD, all tools unlocked)
+> /skill:finalizing
+# (agent archives docs, updates changelog, creates PR)
+```
+## Why?
+- **AI agents skip design.** Left unchecked, they jump to code and over-engineer. This forces a think-first workflow.
+- **TDD needs structure.** The three-scenario model gives the agent clear rules for when to write tests first.
+- **You stay in control.** Checkpoint review gates let you approve test designs and implementations before the agent commits.
+- **Enforced, not suggested.** Hard blocks mean the agent can't ignore the rules — not even accidentally.
-## Architecture
+## Project
 ```
 pi-workflow-kit/
@@ -92,4 +141,4 @@ npm test
 ## License
-MIT
+[MIT](LICENSE)

package/extensions/workflow-guard.ts CHANGED

@@ -45,7 +45,7 @@ const DESTRUCTIVE_PATTERNS = [
 	/\bshutdown\b/i,
 	/\bsystemctl\s+(start|stop|restart|enable|disable)/i,
 	/\bservice\s+\S+\s+(start|stop|restart)/i,
-	/^\s*(vim?|nano|emacs|code|subl)\b/i,
+	/\b(vim?|nano|emacs|code|subl)\b/i,
 ];
 const SAFE_PATTERNS = [
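
To see what dropping the `^\s*` anchor changes, the two patterns from this hunk can be compared on a few commands. The sample commands are illustrative and not from the package's tests. How the guard combines this pattern with `SAFE_PATTERNS` is not visible in this diff, so the sketch shows only the raw regex behavior.

```typescript
// Both patterns are copied verbatim from the hunk above.
const oldEditorPattern = /^\s*(vim?|nano|emacs|code|subl)\b/i;
const newEditorPattern = /\b(vim?|nano|emacs|code|subl)\b/i;

// Illustrative sample commands (assumptions, not package fixtures).
const samples = [
  "vim notes.txt",               // editor at the start: both patterns match
  "git status && vim notes.txt", // editor after &&: only the new pattern matches
  "grep code src/",              // the word "code" as an argument: only the new pattern matches
  "cat encoder.ts",              // "code" inside a longer word: neither matches
];

const results = samples.map((command) => ({
  command,
  oldMatch: oldEditorPattern.test(command),
  newMatch: newEditorPattern.test(command),
}));
```

The third sample suggests a trade-off: the unanchored pattern also matches an editor name used as an ordinary argument. Whether that blocks such a command depends on evaluation logic not shown here.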
@@ -117,8 +117,7 @@ const SAFE_PATTERNS = [
 ];
 /** Split a compound command into individual sub-commands.
- * Splits on &&, ||, and ; operators, ignoring leading whitespace.
- * Does NOT split on | (pipe) to allow piping (e.g. `git log | head`).
+ * Handles &&, ||, ;, and | (pipe) operators, ignoring leading whitespace.
  */
 function splitCompoundCommand(command: string): string[] {
 	// Match sub-commands separated by &&, ||, ; (with optional whitespace)
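
The updated doc comment says the function now handles `|` as well, although the inline comment in the context line still lists only `&&`, `||`, and `;`. The function body is not part of this diff, so the following is a hypothetical re-implementation showing what splitting on all four operators might look like. `splitOnOperators` is an invented name, and the sketch deliberately ignores quoting.

```typescript
// Hypothetical sketch; the package's actual splitCompoundCommand body
// is not shown in this diff and may differ.
function splitOnOperators(command: string): string[] {
  return command
    // `\|\|` is listed before `\|` so that "||" is consumed as one operator.
    .split(/\s*(?:&&|\|\||;|\|)\s*/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Each side of a pipe becomes its own sub-command.
const piped = splitOnOperators("git log | head");
// Mixed operators are all treated as separators.
const mixed = splitOnOperators("git status && ls || echo no; pwd");
```

A regex split like this would also cut through a quoted pipe such as `grep "a|b" file`, so a real guard would likely need quote-aware parsing.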

package/package.json CHANGED

@@ -1,9 +1,17 @@
 {
   "name": "@tianhai/pi-workflow-kit",
-  "version": "0.10.1",
-  "description": "Workflow skills and enforcement extensions for pi",
+  "version": "0.11.0",
+  "description": "Enforce structured brainstorm→plan→execute→finalize workflow with TDD discipline in AI coding agents",
   "keywords": [
-    "pi-package"
+    "pi-package",
+    "ai-coding-agent",
+    "workflow",
+    "tdd",
+    "guard-rails",
+    "code-review",
+    "pi-extension",
+    "brainstorm",
+    "test-driven-development"
   ],
   "scripts": {
     "test": "vitest run",
@@ -20,7 +28,6 @@
     "extensions/",
     "skills/",
     "docs/",
-    "banner.jpg",
     "LICENSE",
     "README.md"
   ],

package/skills/brainstorming/SKILL.md CHANGED

@@ -10,9 +10,9 @@ Read-only exploration. You may **not** edit or create any files except under `do
 ## Process
 1. **Check git state** — run `git status` and `git log --oneline -5`. If there's uncommitted work, ask the user what to do with it first.
-2. **Understand the idea** — read existing code, docs, and recent commits. Ask questions one at a time to refine the idea. Prefer multiple choice when possible.
+2. **Understand the idea** — read existing code, docs, and recent commits. Grep for related functionality, check package.json/dependencies and module structure. Read only what's necessary to ground the design — don't read the entire codebase. Ask questions to refine the idea. Prefer multiple choice when possible. After each question, check: can you clearly articulate (a) what the user wants to build, (b) why, and (c) key constraints? If yes, present your understanding as a short summary and ask: "Should I proceed with this, or is there more to add?" The human decides when to move on.
 3. **Explore approaches** — propose 2-3 approaches. For each approach, sketch the concrete interface (types, method signatures, example caller code) so the comparison is grounded in actual code, not abstract descriptions. Lead with your recommendation.
-4. **Present the design** — break it into sections of 200-300 words. Check after each section whether it looks right. Cover: architecture, components, data flow, error handling, testing.
+4. **Present the design** — break it into focused sections. Each section should be one screen of reading. Present each section to the human and wait for approval before continuing. Cover: architecture, components, data flow, error handling, testing. On feedback, incorporate it and re-present the revised section.
    When a significant architectural decision is identified, offer to write a lightweight ADR to `docs/plans/adr/`. Only write an ADR when all three are true:
@@ -29,7 +29,7 @@ Read-only exploration. You may **not** edit or create any files except under `do
    ```
    ADRs live under `docs/plans/adr/` and are archived during finalizing alongside the design doc.
-5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`. Ask the user to commit it. Branch creation and worktree setup should be deferred to the execution phase (`/skill:executing-tasks`).
+5. **Write the design doc** — save it to `docs/plans/YYYY-MM-DD-<topic>-design.md`. Organize features as end-to-end slices (each slice delivers one observable behavior through all relevant layers) so the planning phase can decompose them directly into tasks. Branch creation, committing, and workspace setup are handled by `/skill:executing-tasks`.
 ## Principles

package/skills/executing-tasks/SKILL.md CHANGED

@@ -10,19 +10,32 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
 ## Before you start
 1. **Check git state** — run `git status` and `git log --oneline -5`. Note any uncommitted changes.
-2. **Find the plan** — look for `docs/plans/*-implementation.md`. If multiple exist, ask the user which one to execute.
+2. **Find the plan** — look for `docs/plans/*-implementation.md`. If none exist, say "No implementation plan found. Run `/skill:writing-plans` first." and stop. If multiple exist, ask the user which one to execute.
 3. **Check for existing progress** — look for `docs/plans/*-progress.md`. If one exists matching the plan, this is a **resume** (see [Resume](#resume)). If not, this is a **first run** (see [First run](#first-run)).
 ## First run
 1. **Parse the implementation plan** — read the plan and extract all `## Task N:` headings. Build the progress table with all tasks as `⬜ pending`.
-2. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
+2. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
+   - **Branch** (smaller changes):
+     ```
+     git checkout -b <feature-name>
+     ```
+   - **Worktree** (larger features, keeps main clean):
+     ```
+     git worktree add ../<repo>-<feature-name> -b <feature-name>
+     ```
+   Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
+3. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
    ```markdown
    # Progress: <topic>
    Plan: docs/plans/YYYY-MM-DD-<topic>-implementation.md
-   Branch: <current-branch>
+   Branch: <actual branch name>
    Started: <ISO timestamp>
    Last updated: <ISO timestamp>
@@ -31,18 +44,7 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
    | 1 | ⬜ pending | Task description (preserve checkpoint labels) | — |
    ```
-3. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
-   - **Branch** (smaller changes):
-     ```
-     git checkout -b <feature-name>
-     ```
-   - **Worktree** (larger features, keeps main clean):
-     ```
-     git worktree add ../<repo>-<feature-name> -b <feature-name>
-     ```
-   Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
+   Use the actual branch name — whether it's the original branch or a new one from the isolation step.
 4. **Commit the plan docs** — if `docs/plans/` has uncommitted files, commit them on the new branch:
    ```
@@ -92,117 +94,84 @@ Implement the plan from `docs/plans/*-implementation.md` task by task, with file
 For each task the agent works on:
 1. **Mark in-progress** — update the progress file: `🔄 in-progress`
-2. **Read only the relevant task** — grep/jump to `## Task N:` in the implementation plan. Do not read the entire plan.
-3. **Implement** — follow the TDD discipline (see [TDD discipline](#tdd-discipline)) and checkpoint flow (see [Checkpoints](#checkpoints))
-4. **Commit** — `git add` the relevant files and commit with a clear message
-5. **Update progress** — mark `✅ done` + record the commit hash
-6. **Check next task** — look at the next task in the progress file:
-   - **Has checkpoint** → pause for review (see [Checkpoint review](#checkpoint-review))
-   - **No checkpoint** → continue to the next task
-## Checkpoints
-Check each task for a `checkpoint` label and follow the appropriate flow:
-### No checkpoint (auto-advance)
-1. **Implement** — write the code as described in the plan
-2. **Run tests** — verify the changes work
-3. **Fix if needed** — if tests fail, debug and fix before moving on
-4. **Commit** — `git add` the relevant files and commit with a clear message
-### checkpoint: test
-1. **Write the test** — follow the TDD scenario for the task
-2. **Pause for review** — show what was done and the diff, then wait for human input
-3. **Continue** — implement, run tests, fix if needed
-4. **Commit** — `git add` the relevant files and commit with a clear message
-### checkpoint: done
-1. **Implement** — write the code as described in the plan
-2. **Run tests** — verify the changes work
-3. **Fix if needed** — if tests fail, debug and fix before moving on
-4. **Pause for review** — show what was done and the diff, then wait for human input
-5. **Commit** — `git add` the relevant files and commit with a clear message
+2. **Read the plan selectively** — read the plan's overview section (everything before `## Task 1:`). Skim all `## Task N:` headings for dependency awareness. Then read the current task's body in full.
+3. **Write the test** — for `new-feature`: write a failing test. For `modifying-tested-code`: run existing tests first. For `trivial`: skip steps 3-5, go to step 6.
+4. **Run the test** — confirm it fails (new-feature) or passes (modifying-tested-code). Fix if needed.
+5. **⏸ PAUSE if `checkpoint: test`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
+6. **Implement** — write the code to make the test pass.
+7. **Run tests** — verify everything passes. If tests fail and you cannot fix them after retrying, see [If you're stuck](#if-youre-stuck). If still stuck, mark the task `❌ failed` with the reason in the progress file and move to the next task.
+8. **Verify against task description** — re-read the task from the plan. Does the implementation satisfy every requirement in the description? If not, fix before proceeding.
+9. **Refactor if needed** — after all tests pass, check for refactoring opportunities:
+   - **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
+   - **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
+   - **Duplication** — extract repeated patterns
+   - **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
+   Run tests after each refactor step. Never refactor while tests are failing.
+10. **⏸ PAUSE if `checkpoint: done`** — present the [checkpoint review](#checkpoint-review) below. Wait for human input. On changes, update and re-present at this same pause.
+11. **Commit** — `git add` the relevant files and commit with a clear message.
+12. **Update progress** — mark `✅ done` + record the commit hash.
+13. **Suggest session break if needed** — after completing ~3-5 tasks since the last break, suggest:
+    ```
+    ✅ Tasks N-M done (commits: abc, def)
+    Progress: X/Y tasks done
+    ⏭  Next: Task [N+1] — [description]
+    💡 Context is building up. For clean context on remaining tasks:
+       /new  then  /skill:executing-tasks
+       (or just say "continue" to keep going here)
+    ```
+    Also suggest at checkpoint review pauses when multiple tasks have been completed since the last break. Respect the user's choice if they say "continue".
+14. **Loop** — go back to step 1 for the next `⬜ pending` task, or see [After all tasks](#after-all-tasks) if none remain.
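
The merged per-task list above folds the TDD scenario and the checkpoint label into one sequence. One way to read it is as a function from those two labels to an ordered list of steps. The sketch below is hypothetical: `stepsForTask` and the step labels are shorthand invented here, and it omits the session-break and loop steps (13 and 14).

```typescript
type TddScenario = "new-feature" | "modifying-tested-code" | "trivial";
type Checkpoint = "none" | "test" | "done";

// Returns shorthand labels for steps 1-12 of the list above, in order.
// Hypothetical helper; the skill itself is prose, not code.
function stepsForTask(tdd: TddScenario, checkpoint: Checkpoint): string[] {
  const steps: string[] = ["mark-in-progress", "read-plan"];
  if (tdd !== "trivial") {
    // Step 3 differs by scenario; trivial tasks skip steps 3-5 entirely.
    steps.push(tdd === "new-feature" ? "write-failing-test" : "run-existing-tests");
    steps.push("run-test");
    if (checkpoint === "test") steps.push("pause-test");
  }
  steps.push("implement", "run-tests", "verify-against-task", "refactor-if-needed");
  if (checkpoint === "done") steps.push("pause-done");
  steps.push("commit", "update-progress");
  return steps;
}
```

Because a trivial task skips steps 3-5, a `checkpoint: test` label on a trivial task never produces a pause under this reading.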
 ## Checkpoint review
-When pausing at a checkpoint, present:
+When pausing at a `checkpoint: test`, present the test code first:
 ```
-⏸ Paused at checkpoint: [test|done] for task [N]
-**What was done:** [brief summary]
-**Diff:** [show relevant diff]
-Review and let me know how to proceed.
+⏸ Paused at checkpoint: test for task [N]
+**Test written:**
+[show the test code]
+**Expected behavior:** [what this test validates]
+**Next:** Task [N+1] — [description]
+**Available actions:**
+- **Approve** — continue to implementation (step 6)
+- **Request changes** — describe what to change, I'll update and re-present
+- **Revert** — undo this task and mark it back to pending
+- **Adjust plan** — modify the remaining tasks in the implementation plan
+- `skip` — skip this task and move on
+- `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
+- `status` — show the full progress table
 ```
-Wait for the human to respond. They may:
-- Approve and continue
-- Request changes to the test or implementation
-- Ask to revert the task
-- Adjust the remaining plan
-## TDD discipline
-Follow the TDD scenario from the plan:
-- **New feature**: write the test first, see it fail, then implement
-- **Modifying tested code**: run existing tests before and after
-- **Trivial change**: use judgment
-Don't skip tests because "it's obvious." The test is the contract.
-## Refactoring
-After all tests pass for a task, check for refactoring opportunities:
-- **Shallow modules** — is the interface nearly as complex as the implementation? Can complexity be hidden behind a simpler interface?
-- **Deletion test** — if you deleted this module, would complexity vanish (pass-through) or reappear across callers (earning its keep)?
-- **Duplication** — extract repeated patterns
-- **Seam discipline** — don't introduce abstraction unless something actually varies across it. One adapter = hypothetical seam. Two adapters = real seam
-Run tests after each refactor step. Never refactor while tests are failing.
-Key vocabulary: **depth** (lots of behavior behind a small interface), **seam** (where behavior can be altered without editing in place), **locality** (change concentrated in one place).
-## Batching and session management
-The agent suggests a fresh session at natural break points to minimize token accumulation. After completing ~3-5 non-checkpoint tasks in the same session, suggest:
+When pausing at a `checkpoint: done`, present the implementation review:
 ```
-✅ Tasks 3-5 done (commits: a1b2, e4f5, i7j8)
-Progress: 5/10 tasks done
+⏸ Paused at checkpoint: done for task [N]
-⏭  Next: Task 6 — Add auth middleware (no checkpoint)
-💡 Context is building up. For clean context on remaining tasks:
-   /new  then  /skill:executing-tasks
-   (or just say "continue" to keep going here)
+**What was done:** [brief summary]
+**Diff:** [show relevant diff]
+**Next:** Task [N+1] — [description]
+**Available actions:**
+- **Approve** — continue to the next task
+- **Request changes** — describe what to change, I'll update and re-present
+- **Revert** — undo this task and mark it back to pending
+- **Adjust plan** — modify the remaining tasks in the implementation plan
+- `skip` — skip this task and move on
+- `stop` — pause here, start a fresh session later with `/skill:executing-tasks`
+- `status` — show the full progress table
 ```
-The user can say "continue" to keep going in the same session. Respect their choice.
+Wait for the human to respond. On **request changes**, make the edits, then re-present at the same checkpoint. Repeat until approved.
-Also suggest `/new` at checkpoint review pauses when multiple tasks have been completed since the last session break.
+## Progress file updates
-## Progress file updates (automated)
-During execution, the agent should update the progress file in place. Example workflow:
-```bash
-# Before task 2 starts:
-sed -i 's/| 2 | ⬜ pending/| 2 | 🔄 in-progress/'
-# After successful commit a1b2c3d:
-sed -i 's/| 2 | 🔄 in-progress/| 2 | ✅ done/'
-sed -i 's/| 2 | ✅ done[^|]*|/| 2 | ✅ done | a1b2c3d |/'
-# Update timestamp:
-sed -i "s/Last updated:.*/Last updated: $(date -u +%Y-%m-%dT%H:%M:%SZ)/"
-```
+Update the progress file by reading it, modifying the relevant row's status and commit hash, and writing it back. Target the specific task row — do not use pattern-matching approaches (e.g. sed) that could corrupt the table.
-Note: The agent should use proper markdown table parsing (not naive sed in production) to avoid corrupting the file — ensure the replacement targets the correct row.
+Update `Last updated` timestamp on every change.
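
A row-targeted update in the spirit of the guidance above might look like the following. This is a sketch under assumptions: `updateTaskRow` and `touchLastUpdated` are hypothetical helpers, each table row is assumed to have four cells (number, status, description, commit) as in the template row shown earlier, and descriptions are assumed not to contain `|`.

```typescript
// Hypothetical helper: rewrites only the row whose first cell equals taskNumber.
function updateTaskRow(
  markdown: string,
  taskNumber: number,
  status: string,
  commit?: string,
): string {
  const lines = markdown.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const trimmed = (lines[i] ?? "").trim();
    if (!trimmed.startsWith("|") || !trimmed.endsWith("|")) continue;
    const cells = trimmed.slice(1, -1).split("|").map((c) => c.trim());
    // Assumed layout: number | status | description | commit
    if (cells.length !== 4 || cells[0] !== String(taskNumber)) continue;
    const [num, , description, oldCommit] = cells;
    lines[i] = `| ${num} | ${status} | ${description} | ${commit ?? oldCommit} |`;
    return lines.join("\n");
  }
  // Fail loudly rather than silently leaving the file unchanged.
  throw new Error(`Task ${taskNumber} not found in progress table`);
}

// Hypothetical helper: refreshes the "Last updated" header line.
function touchLastUpdated(markdown: string, now: Date): string {
  return markdown
    .split("\n")
    .map((line) =>
      line.trim().startsWith("Last updated:")
        ? `Last updated: ${now.toISOString()}`
        : line,
    )
    .join("\n");
}
```

Matching on the parsed first cell rather than a text pattern means task 1 cannot accidentally match task 10, and a missing row raises an error instead of passing unnoticed.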
 ## User override commands
@@ -217,18 +186,19 @@ The user can issue these commands at any time during execution:
 ## Receiving code review
-When the user shares code review feedback:
+When the user shares code review feedback (outside of a checkpoint pause):
 1. **Verify the criticism** — read the relevant code. Is the feedback accurate?
 2. **Evaluate the suggestion** — is the proposed fix the right approach? Consider alternatives.
-3. **Implement or push back** — if valid, fix it. If not, explain why with evidence from the codebase.
+3. **Implement or push back** — if valid, fix it, re-run tests, and amend the commit. If not, explain why with evidence from the codebase.
 4. **Don't blindly implement** — every suggestion should be verified against the code before accepting.
 ## If you're stuck
-- Re-read the current task section from the plan — you may have drifted from the spec
-- Check git log — recent commits may reveal context
-- Ask the user — it's better to clarify than to guess wrong
+1. Re-read the current task section from the plan — you may have drifted from the spec
+2. Check git log — recent commits may reveal context
+3. Ask the user — it's better to clarify than to guess wrong
+4. If still stuck after asking, mark the task `❌ failed` with the reason in the progress file and move to the next task
 ## After all tasks

package/skills/finalizing/SKILL.md CHANGED

@@ -25,14 +25,16 @@ Wait for the user to confirm before proceeding.
    ```
    mkdir -p docs/plans/completed
    mkdir -p docs/plans/completed/adr
-   mv docs/plans/*-design.md docs/plans/completed/
-   mv docs/plans/*-implementation.md docs/plans/completed/
-   mv docs/plans/*-progress.md docs/plans/completed/
+   mv docs/plans/*-design.md docs/plans/completed/ 2>/dev/null || true
+   mv docs/plans/*-implementation.md docs/plans/completed/ 2>/dev/null || true
+   mv docs/plans/*-progress.md docs/plans/completed/ 2>/dev/null || true
    mv docs/plans/adr/*.md docs/plans/completed/adr/ 2>/dev/null || true
    rmdir docs/plans/adr 2>/dev/null || true
    git add docs/plans/ && git commit -m "chore: archive planning docs"
    ```
+   Each `mv` gracefully handles the case where no matching files exist (e.g., if the user skipped straight from brainstorm to finalize without executing tasks).
 2. **Update documentation** — if the API or surface changed:
    - Update README.md
    - Update CHANGELOG.md

package/skills/writing-plans/SKILL.md CHANGED

@@ -5,22 +5,24 @@ description: "Use this to break a design into an implementation plan with bite-s
 # Writing Plans
-Read-only exploration. You may **not** edit or create any files except under `docs/plans/`.
+You may only create or edit files under `docs/plans/`. Do not modify source code or configuration.
 ## Process
-1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If no design doc exists, ask the user to describe what they want to build and read relevant code.
-2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`.
+1. **Check for a design doc** — look for `docs/plans/*-design.md`. If one exists, use it as the basis for the plan. If the design doc is incomplete, fill gaps by asking the human. If no design doc exists, ask the user to describe what they want to build and read relevant code.
+2. **Write the implementation plan** — break the design into tasks. Save to `docs/plans/YYYY-MM-DD-<topic>-implementation.md`. If the design is too large for ~15 tasks, flag this to the human and ask whether to reduce scope or proceed with the full plan.
+3. **Present the plan** — show the complete plan to the human. Wait for approval before suggesting execution.
 ## Task format
-Each task should be 2-5 minutes of work:
+Each task should produce one committed, testable change:
 - Exact file paths to create/modify
-- Complete code (not "add validation")
+- Complete code (not "add validation"). For tasks that depend on types or utilities from earlier tasks, reference them explicitly (e.g., `import { User } from Task 2`) and include only the new code
 - Exact commands with expected output
 - `git commit` after each task
 - Optional `checkpoint: test` or `checkpoint: done` label
+- Each task's tests should cover the happy path and at least one edge case or error path
 Each task must use a numbered heading:
@@ -33,16 +35,12 @@ Each task must use a numbered heading:
 ...where N starts at 1 and incrementally numbers each task in the plan.
-The metadata comments (placed right after the heading) are optional but recommended. If present, they help the executing-tasks skill parse the plan correctly.
+The metadata comments (placed right after the heading) are optional. If omitted, the executing-tasks skill infers the TDD scenario and checkpoint from context. When in doubt, include them explicitly.
 Valid TDD values: `new-feature`, `modifying-tested-code`, `trivial`
 Valid checkpoint values: `none`, `test`, `done`
-These comments are optional — if omitted, the agent infers TDD scenario and checkpoint from context.
-Also use the `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->` metadata comments to specify options explicitly. The inline `checkpoint: test` / `checkpoint: done` label format (e.g. in a task list) is also supported as a fallback, but the metadata comment is the canonical source.
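
For a sense of how a consumer might read this task format, here is a minimal parser sketch. It is hypothetical: `parsePlanTasks` and `PlanTask` are invented names, and the exact heading and comment syntax is inferred from the `## Task N:` and `<!-- tdd: ... -->` / `<!-- checkpoint: ... -->` forms mentioned in these skill files. As a simplification, it attaches a metadata comment to the most recent task heading rather than requiring it to sit directly below the heading.

```typescript
interface PlanTask {
  number: number;
  title: string;
  tdd?: string;        // left undefined when the comment is omitted
  checkpoint?: string; // left undefined when the comment is omitted
}

// Hypothetical parser for "## Task N:" headings and their metadata comments.
function parsePlanTasks(plan: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  let current: PlanTask | undefined;
  for (const line of plan.split("\n")) {
    const heading = /^## Task (\d+):\s*(.*)$/.exec(line);
    if (heading) {
      current = { number: Number(heading[1]), title: (heading[2] ?? "").trim() };
      tasks.push(current);
      continue;
    }
    const meta = /^<!--\s*(tdd|checkpoint):\s*([\w-]+)\s*-->$/.exec(line.trim());
    if (meta && current) {
      if (meta[1] === "tdd") current.tdd = meta[2];
      else current.checkpoint = meta[2];
    }
  }
  return tasks;
}
```

Leaving `tdd` and `checkpoint` undefined when the comments are absent keeps the "infer from context" fallback described above distinguishable from an explicit `none`.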
 ## Vertical slices
@@ -61,6 +59,8 @@ RIGHT (vertical):
   Task 3: User can view profile (query + endpoint + test)
 ```
+Order tasks so each one can be verified independently and delivers a complete vertical slice. If a task requires infrastructure (models, types) that no previous task has created, include it in that task — don't create it as a separate task.
 Vertical slices ensure every committed task leaves the codebase in a testable state and reduces the blast radius of a bad task.
 ## TDD in the plan

package/banner.jpg DELETED

Binary file