@tianhai/pi-workflow-kit 0.8.3 → 0.9.0
- package/docs/plans/completed/2026-04-28-executing-tasks-redesign-design.md +171 -0
- package/docs/plans/completed/2026-04-28-executing-tasks-redesign-implementation.md +208 -0
- package/docs/plans/completed/2026-04-28-executing-tasks-redesign-progress.md +14 -0
- package/package.json +1 -1
- package/skills/executing-tasks/SKILL.md +147 -16
- package/skills/finalizing/SKILL.md +24 -3
- package/skills/writing-plans/SKILL.md +22 -0
- package/docs/plans/2026-04-21-workflow-guard-safe-commands-design.md +0 -172
- package/docs/plans/2026-04-21-workflow-guard-safe-commands-implementation.md +0 -168
|
@@ -0,0 +1,171 @@
|
|
|
1
|
+
# Design: Executing Tasks Redesign
|
|
2
|
+
|
|
3
|
+
**Date:** 2026-04-28
|
|
4
|
+
**Status:** Approved
|
|
5
|
+
|
|
6
|
+
## Problem
|
|
7
|
+
|
|
8
|
+
The current `executing-tasks` skill has three issues:
|
|
9
|
+
|
|
10
|
+
1. **No progress tracking** — tasks are iterated in-memory with no file-based state. If the session crashes or the user starts a new session, all progress is lost.
|
|
11
|
+
2. **High token consumption** — the entire plan, all implementation work, and accumulated tool outputs stay in a single session. Even with auto-compaction, the LLM re-reads the full plan repeatedly.
|
|
12
|
+
3. **No context separation** — one monolithic thread handles everything. Early tasks' tool outputs bleed into later tasks' context.
|
|
13
|
+
|
|
14
|
+
## Solution Overview
|
|
15
|
+
|
|
16
|
+
Introduce a **progress file** as the single source of truth for task state, and design the skill to work naturally across **multiple sessions** with fresh context.
|
|
17
|
+
|
|
18
|
+
### Core Principles
|
|
19
|
+
|
|
20
|
+
- The progress file is the state — not the session, not git history
|
|
21
|
+
- Each task is an isolated unit of work — the agent reads only what it needs
|
|
22
|
+
- The agent suggests `/new` (fresh session) at natural break points
|
|
23
|
+
- Resume is trivial — re-invoke the skill, it reads the progress file and picks up
|
|
24
|
+
|
|
25
|
+
## Progress File
|
|
26
|
+
|
|
27
|
+
**Path:** `docs/plans/YYYY-MM-DD-<topic>-progress.md`
|
|
28
|
+
|
|
29
|
+
Created by `executing-tasks` on first run by parsing the implementation plan.
|
|
30
|
+
|
|
31
|
+
**Format:**
|
|
32
|
+
|
|
33
|
+
```markdown
|
|
34
|
+
# Progress: auth
|
|
35
|
+
|
|
36
|
+
Plan: docs/plans/2026-04-28-auth-implementation.md
|
|
37
|
+
Branch: auth
|
|
38
|
+
Started: 2026-04-28T10:00:00Z
|
|
39
|
+
Last updated: 2026-04-28T10:45:00Z
|
|
40
|
+
|
|
41
|
+
| # | Status | Task | Commit |
|
|
42
|
+
|---|--------|------|--------|
|
|
43
|
+
| 1 | ✅ done | Create User model | a1b2c3d |
|
|
44
|
+
| 2 | ✅ done | Write User model tests | e4f5g6h |
|
|
45
|
+
| 3 | 🔄 in-progress | Add login endpoint | — |
|
|
46
|
+
| 4 | ⬜ pending | Write login tests | — |
|
|
47
|
+
| 5 | ⏭ skipped | checkpoint: test — Add auth middleware | — |
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
**Status values:**
|
|
51
|
+
|
|
52
|
+
| Status | Meaning |
|
|
53
|
+
|--------|---------|
|
|
54
|
+
| `⬜ pending` | Not started |
|
|
55
|
+
| `🔄 in-progress` | Currently being worked on |
|
|
56
|
+
| `✅ done` | Committed successfully |
|
|
57
|
+
| `❌ failed` | Could not complete (with reason appended) |
|
|
58
|
+
| `⏭ skipped` | User chose to skip |
|
|
59
|
+
|
|
60
|
+
**Rules:**
|
|
61
|
+
|
|
62
|
+
- Mark `🔄 in-progress` immediately when starting a task
|
|
63
|
+
- Mark `✅ done` + record commit hash only after successful `git commit`
|
|
64
|
+
- Mark `❌ failed` + append `Failed: <reason>` when the agent can't proceed after retrying
|
|
65
|
+
- Mark `⏭ skipped` when the user says "skip"
|
|
66
|
+
- Update `Last updated` timestamp on every change
|
|
67
|
+
- Preserve checkpoint labels from the plan in the task description
|
|
68
|
+
|
|
69
|
+
## Implementation Plan Format
|
|
70
|
+
|
|
71
|
+
No file splitting. Keep one `implementation.md` but enforce a strict heading format:
|
|
72
|
+
|
|
73
|
+
```markdown
|
|
74
|
+
## Task 1: Create User model
|
|
75
|
+
|
|
76
|
+
<!-- tdd: new-feature -->
|
|
77
|
+
<!-- checkpoint: none -->
|
|
78
|
+
|
|
79
|
+
- Create `src/models/user.ts`...
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
The agent reads the progress file to find the current task number, then reads only that task's section from the implementation plan (via grep/jump to heading).
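A minimal sketch of the "read only that task's section" step, assuming the strict `## Task N:` heading format above. The function name `task_section` is hypothetical and not part of the package. It prints from the requested heading up to the next task heading, so any non-task `##` heading that follows a task would also be included.

```bash
# Print only the "## Task N:" section of an implementation plan.
# Hypothetical helper; assumes every task starts with "## Task <number>:".
# Usage: task_section <plan-file> <task-number>
task_section() {
  awk -v n="$2" '
    # On every task heading, decide whether this is the requested task.
    /^## Task [0-9]+:/ { printing = ($3 == (n ":")) }
    # Print lines only while inside the requested section.
    printing
  ' "$1"
}
```

For example, `task_section docs/plans/2026-04-28-auth-implementation.md 3` would print the heading and body of task 3 and nothing else.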
|
|
83
|
+
|
|
84
|
+
## Session Lifecycle
|
|
85
|
+
|
|
86
|
+
### First Run
|
|
87
|
+
|
|
88
|
+
1. Read progress file → doesn't exist
|
|
89
|
+
2. Parse implementation.md, create progress file with all tasks as `⬜ pending`
|
|
90
|
+
3. Ensure on correct branch / worktree (same as current skill)
|
|
91
|
+
4. Read task 1 section, begin work
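The first-run steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's actual mechanism: the helper name `init_progress` is hypothetical, the topic is derived by stripping the date prefix and the `-implementation.md` suffix from the plan filename, and checkpoint labels held in `<!-- checkpoint: ... -->` comments are not copied into the task column.

```bash
# Build an initial progress file from the "## Task N:" headings of a plan.
# Hypothetical helper; writes the file content to stdout.
# Usage: init_progress <plan-file> <branch> > <progress-file>
init_progress() {
  plan=$1 branch=$2
  now=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  # "2026-04-28-auth-implementation.md" -> "auth"
  topic=$(basename "$plan" -implementation.md |
    sed 's/^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}-//')
  printf '# Progress: %s\n\nPlan: %s\nBranch: %s\nStarted: %s\nLast updated: %s\n\n' \
    "$topic" "$plan" "$branch" "$now" "$now"
  printf '| # | Status | Task | Commit |\n|---|--------|------|--------|\n'
  # One pending row per task heading, in plan order.
  sed -n 's/^## Task \([0-9][0-9]*\): \(.*\)$/| \1 | ⬜ pending | \2 | — |/p' "$plan"
}
```

Writing to stdout keeps the sketch side-effect free; the caller decides where the progress file goes.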
|
|
92
|
+
|
|
93
|
+
### Continuing in Same Session
|
|
94
|
+
|
|
95
|
+
After completing a non-checkpoint task:
|
|
96
|
+
1. Update progress file: current task → `✅ done`
|
|
97
|
+
2. Peek at next task:
|
|
98
|
+
- **Has checkpoint** → pause for review (stay in session)
|
|
99
|
+
- **No checkpoint** → continue working on next task
|
|
100
|
+
3. After ~3-5 non-checkpoint tasks, suggest `/new`:
|
|
101
|
+
|
|
102
|
+
```
|
|
103
|
+
✅ Tasks 3-5 done (commits: a1b2, e4f5, i7j8)
|
|
104
|
+
|
|
105
|
+
Progress: 5/10 tasks done
|
|
106
|
+
|
|
107
|
+
⏭ Next: Task 6 — Add auth middleware (no checkpoint)
|
|
108
|
+
|
|
109
|
+
💡 Context is building up. For clean context on remaining tasks:
|
|
110
|
+
/new then /skill:executing-tasks
|
|
111
|
+
(or just say "continue" to keep going here)
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
### Resuming in a New Session
|
|
115
|
+
|
|
116
|
+
1. Read progress file → find first `⬜ pending` or `❌ failed` task
|
|
117
|
+
2. Read that task's section from implementation.md
|
|
118
|
+
3. Continue work — no re-reading of earlier tasks
|
|
119
|
+
|
|
120
|
+
### Checkpoint Review
|
|
121
|
+
|
|
122
|
+
Same as current skill — show what was done, show the diff, wait for user approval:
|
|
123
|
+
|
|
124
|
+
```
|
|
125
|
+
⏸ Paused at checkpoint: test for task 4
|
|
126
|
+
|
|
127
|
+
**What was done:** [brief summary]
|
|
128
|
+
**Diff:** [show relevant diff]
|
|
129
|
+
|
|
130
|
+
Review and let me know how to proceed.
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
## Resume & Failure Recovery
|
|
134
|
+
|
|
135
|
+
| Scenario | What the agent sees | What it does |
|
|
136
|
+
|----------|-------------------|--------------|
|
|
137
|
+
| **Clean resume** | Next task is `⬜ pending` | Read task section, start working |
|
|
138
|
+
| **Mid-task crash** | A task is `🔄 in-progress` | Check git log since last done task. If commits exist → ask user to verify. If no commits → restart the task |
|
|
139
|
+
| **Failed task** | A task is `❌ failed` | Show failure reason, ask: retry, skip, or abort? |
|
|
140
|
+
| **All done** | No `⬜ pending` or `❌ failed` | Show summary, suggest `/skill:finalizing` |
|
|
141
|
+
| **No progress file** | File doesn't exist | Parse implementation.md, create progress file, start from task 1 |
|
|
142
|
+
| **Skipped tasks remain** | `⏭ skipped` tasks exist | Noted in finalizing, no action during execution |
|
|
143
|
+
|
|
144
|
+
## User Override Commands
|
|
145
|
+
|
|
146
|
+
Available at any time during execution:
|
|
147
|
+
|
|
148
|
+
| User says | Agent does |
|
|
149
|
+
|-----------|-----------|
|
|
150
|
+
| `skip` | Mark current task `⏭ skipped`, move to next |
|
|
151
|
+
| `status` | Show the progress table |
|
|
152
|
+
| `stop` | Mark current task back to `⬜ pending`, suggest `/new` |
|
|
153
|
+
| `retry` | Re-read current task section, start over |
|
|
154
|
+
|
|
155
|
+
## Changes to Other Skills
|
|
156
|
+
|
|
157
|
+
### writing-plans (minor)
|
|
158
|
+
|
|
159
|
+
- Enforce `## Task N: <description>` heading format
|
|
160
|
+
- Optional metadata comments: `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->`
|
|
161
|
+
- Everything else stays the same
|
|
162
|
+
|
|
163
|
+
### finalizing (minor)
|
|
164
|
+
|
|
165
|
+
- Warn on skipped tasks before archiving: "Tasks 4 and 7 were skipped. Continue with finalizing, or go back?"
|
|
166
|
+
- Archive the progress file to `docs/plans/completed/`
|
|
167
|
+
- Use progress file for PR/commit summaries instead of re-reading the full plan
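As a rough illustration of the last bullet, the task table can be turned into a bulleted summary with a short `awk` filter. The name `task_summary` is hypothetical, and the sketch assumes the progress table layout shown earlier (task number in the first column, status in the second, description in the third) with no `|` characters inside cells.

```bash
# Convert the task table of a progress file into a bulleted summary.
# Hypothetical helper; assumes cells never contain a literal "|".
# Usage: task_summary <progress-file>
task_summary() {
  awk -F'|' '
    # Only data rows have a purely numeric first cell.
    $2 ~ /^ *[0-9]+ *$/ {
      split($3, s, " ")            # s[1] = status emoji, s[2] = status word
      task = $4
      gsub(/^ +| +$/, "", task)    # trim cell padding
      line = "- " s[1] " " task
      if (s[2] == "skipped") line = line " (skipped)"
      print line
    }
  ' "$1"
}
```

The output can then be pasted into a PR body or squash commit message without re-reading the full plan.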
|
|
168
|
+
|
|
169
|
+
### brainstorming
|
|
170
|
+
|
|
171
|
+
- No changes
|
|
@@ -0,0 +1,208 @@
|
|
|
1
|
+
# Implementation Plan: Executing Tasks Redesign
|
|
2
|
+
|
|
3
|
+
**Design:** `docs/plans/2026-04-28-executing-tasks-redesign-design.md`
|
|
4
|
+
|
|
5
|
+
## Task 1: Rewrite executing-tasks skill — progress file and startup flow
|
|
6
|
+
|
|
7
|
+
<!-- tdd: modifying-tested-code -->
|
|
8
|
+
<!-- checkpoint: done -->
|
|
9
|
+
|
|
10
|
+
Rewrite `skills/executing-tasks/SKILL.md` with the new startup and resume logic.
|
|
11
|
+
|
|
12
|
+
**File to modify:** `/Users/yinlootan/.nvm/versions/node/v22.16.0/lib/node_modules/@tianhai/pi-workflow-kit/skills/executing-tasks/SKILL.md`
|
|
13
|
+
|
|
14
|
+
Replace the entire file with the new skill content. The new skill has these sections:
|
|
15
|
+
|
|
16
|
+
1. **Startup flow** — check git state, find the implementation plan (glob `docs/plans/*-implementation.md`), check for existing progress file (`docs/plans/*-progress.md`)
|
|
17
|
+
2. **First run** — parse the implementation plan for `## Task N:` headings, create a progress file with all tasks as `⬜ pending`, then proceed to workspace isolation and task execution
|
|
18
|
+
3. **Resume** — read the progress file, find the first `⬜ pending`, `❌ failed`, or `🔄 in-progress` task, and continue from there
|
|
19
|
+
4. **Mid-task crash recovery** — if a task is `🔄 in-progress`, check `git log` since the last `✅ done` task's commit. If commits exist, ask the user to verify. If no commits, restart the task
|
|
20
|
+
5. **Workspace isolation** — keep the existing branch/worktree suggestion logic (unchanged from current skill)
|
|
21
|
+
6. **Commit plan docs** — keep the existing logic to commit uncommitted plan files on the new branch
|
|
22
|
+
|
|
23
|
+
The frontmatter stays:
|
|
24
|
+
```
|
|
25
|
+
---
|
|
26
|
+
name: executing-tasks
|
|
27
|
+
description: "Use this to implement an approved plan task-by-task. Run after writing-plans, before finalizing."
|
|
28
|
+
---
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
The progress file format section should include the full table structure and all 5 status values (`⬜ pending`, `🔄 in-progress`, `✅ done`, `❌ failed`, `⏭ skipped`).
|
|
32
|
+
|
|
33
|
+
After editing, verify by reading the file back. No tests needed — this is a markdown skill file.
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
git add skills/executing-tasks/SKILL.md
|
|
37
|
+
git commit -m "rewrite(executing-tasks): progress file, startup flow, and resume logic"
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
## Task 2: Add per-task execution, batching, and session management to executing-tasks
|
|
41
|
+
|
|
42
|
+
<!-- tdd: modifying-tested-code -->
|
|
43
|
+
<!-- checkpoint: done -->
|
|
44
|
+
|
|
45
|
+
Continue building on the rewritten skill file. Add the per-task execution sections.
|
|
46
|
+
|
|
47
|
+
**File to modify:** `/Users/yinlootan/.nvm/versions/node/v22.16.0/lib/node_modules/@tianhai/pi-workflow-kit/skills/executing-tasks/SKILL.md`
|
|
48
|
+
|
|
49
|
+
Add these sections after the startup flow (append or integrate into the existing file from task 1):
|
|
50
|
+
|
|
51
|
+
### Per-task execution
|
|
52
|
+
|
|
53
|
+
For each task the agent works on:
|
|
54
|
+
1. Mark task `🔄 in-progress` in the progress file
|
|
55
|
+
2. Read only the relevant `## Task N:` section from the implementation plan (not the whole file)
|
|
56
|
+
3. Implement following the existing TDD discipline and checkpoint logic (keep the current `checkpoint: test` and `checkpoint: done` flows verbatim)
|
|
57
|
+
4. After commit: update progress file with `✅ done` + commit hash
|
|
58
|
+
5. Check the next task:
|
|
59
|
+
- **Has checkpoint** → pause for review
|
|
60
|
+
- **No checkpoint** → continue to the next task in the same session
|
|
61
|
+
|
|
62
|
+
### Batching and /new suggestions
|
|
63
|
+
|
|
64
|
+
After completing ~3-5 non-checkpoint tasks in the same session, the agent should suggest a fresh session with this output format:
|
|
65
|
+
|
|
66
|
+
```
|
|
67
|
+
✅ Tasks 3-5 done (commits: a1b2, e4f5, i7j8)
|
|
68
|
+
|
|
69
|
+
Progress: 5/10 tasks done
|
|
70
|
+
|
|
71
|
+
⏭ Next: Task 6 — Add auth middleware (no checkpoint)
|
|
72
|
+
|
|
73
|
+
💡 Context is building up. For clean context on remaining tasks:
|
|
74
|
+
/new then /skill:executing-tasks
|
|
75
|
+
(or just say "continue" to keep going here)
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
The user can say "continue" to keep going in the same session.
|
|
79
|
+
|
|
80
|
+
### User override commands
|
|
81
|
+
|
|
82
|
+
Add a section for commands the user can issue at any time:
|
|
83
|
+
|
|
84
|
+
| User says | Agent does |
|
|
85
|
+
|-----------|-----------|
|
|
86
|
+
| `skip` | Mark current task `⏭ skipped`, move to next |
|
|
87
|
+
| `status` | Show the progress table |
|
|
88
|
+
| `stop` | Mark current task back to `⬜ pending`, suggest `/new` |
|
|
89
|
+
| `retry` | Re-read current task section, start over |
|
|
90
|
+
|
|
91
|
+
### After all tasks
|
|
92
|
+
|
|
93
|
+
When no `⬜ pending` or `❌ failed` tasks remain, show a summary and suggest `/skill:finalizing`.
|
|
94
|
+
|
|
95
|
+
Keep the existing "Receiving code review" and "If you're stuck" sections from the current skill — they're still useful.
|
|
96
|
+
|
|
97
|
+
After editing, verify by reading the file back.
|
|
98
|
+
|
|
99
|
+
```
|
|
100
|
+
git add skills/executing-tasks/SKILL.md
|
|
101
|
+
git commit -m "feat(executing-tasks): add per-task batching, session management, and user commands"
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
## Task 3: Update writing-plans skill — enforce task heading format
|
|
105
|
+
|
|
106
|
+
<!-- tdd: modifying-tested-code -->
|
|
107
|
+
|
|
108
|
+
Minor update to `writing-plans` to enforce the `## Task N:` heading format and metadata comments.
|
|
109
|
+
|
|
110
|
+
**File to modify:** `/Users/yinlootan/.nvm/versions/node/v22.16.0/lib/node_modules/@tianhai/pi-workflow-kit/skills/writing-plans/SKILL.md`
|
|
111
|
+
|
|
112
|
+
In the **Task format** section, add:
|
|
113
|
+
|
|
114
|
+
> Each task must use a numbered heading: `## Task N: <description>` where N starts at 1.
|
|
115
|
+
>
|
|
116
|
+
> Optionally include metadata comments on the line after the heading:
|
|
117
|
+
> ```
|
|
118
|
+
> ## Task 1: Create User model
|
|
119
|
+
>
|
|
120
|
+
> <!-- tdd: new-feature -->
|
|
121
|
+
> <!-- checkpoint: none -->
|
|
122
|
+
> ```
|
|
123
|
+
>
|
|
124
|
+
> Valid TDD values: `new-feature`, `modifying-tested-code`, `trivial`
|
|
125
|
+
>
|
|
126
|
+
> Valid checkpoint values: `none`, `test`, `done`
|
|
127
|
+
>
|
|
128
|
+
> These comments are optional — if omitted, the agent infers TDD scenario and checkpoint from context.
|
|
129
|
+
|
|
130
|
+
Also update the checkpoint labels table to reference the `<!-- checkpoint: ... -->` comment format as the canonical way to specify checkpoints (while still supporting the inline label format as fallback).
|
|
131
|
+
|
|
132
|
+
After editing, verify by reading the file back.
|
|
133
|
+
|
|
134
|
+
```
|
|
135
|
+
git add skills/writing-plans/SKILL.md
|
|
136
|
+
git commit -m "docs(writing-plans): enforce Task N heading format with metadata comments"
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
## Task 4: Update finalizing skill — archive progress file and warn on skipped tasks
|
|
140
|
+
|
|
141
|
+
<!-- tdd: modifying-tested-code -->
|
|
142
|
+
|
|
143
|
+
Minor update to `finalizing` to handle the progress file.
|
|
144
|
+
|
|
145
|
+
**File to modify:** `/Users/yinlootan/.nvm/versions/node/v22.16.0/lib/node_modules/@tianhai/pi-workflow-kit/skills/finalizing/SKILL.md`
|
|
146
|
+
|
|
147
|
+
### Change 1: Archive progress file
|
|
148
|
+
|
|
149
|
+
In step 1 ("Move planning docs"), add the progress file to the archive command:
|
|
150
|
+
```
|
|
151
|
+
mv docs/plans/*-progress.md docs/plans/completed/
|
|
152
|
+
```
|
|
153
|
+
|
|
154
|
+
### Change 2: Warn on skipped tasks
|
|
155
|
+
|
|
156
|
+
Before step 1, add a new pre-check:
|
|
157
|
+
|
|
158
|
+
> **Check for skipped tasks** — if a progress file exists (`docs/plans/*-progress.md`), read it and check for any `⏭ skipped` tasks. If found, warn:
|
|
159
|
+
>
|
|
160
|
+
> ```
|
|
161
|
+
> ⚠️ Tasks 4 and 7 were skipped. Continue with finalizing, or go back?
|
|
162
|
+
> ```
|
|
163
|
+
>
|
|
164
|
+
> Wait for the user to confirm before proceeding.
|
|
165
|
+
|
|
166
|
+
### Change 3: Use progress file for summaries
|
|
167
|
+
|
|
168
|
+
In step 3 ("Choose a merge strategy"), when generating PR descriptions or squash commit messages, read the progress file to build a task-by-task summary:
|
|
169
|
+
|
|
170
|
+
> Use the progress file to generate the summary. Convert the task table to a bulleted list:
|
|
171
|
+
> ```
|
|
172
|
+
> - ✅ Create User model
|
|
173
|
+
> - ✅ Write User model tests
|
|
174
|
+
> - ⏭ Add auth middleware (skipped)
|
|
175
|
+
> - ✅ Add login endpoint
|
|
176
|
+
> ```
|
|
177
|
+
|
|
178
|
+
After editing, verify by reading the file back.
|
|
179
|
+
|
|
180
|
+
```
|
|
181
|
+
git add skills/finalizing/SKILL.md
|
|
182
|
+
git commit -m "feat(finalizing): archive progress file, warn on skipped tasks"
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
## Task 5: End-to-end review — read all four skill files and verify consistency
|
|
186
|
+
|
|
187
|
+
<!-- tdd: trivial -->
|
|
188
|
+
<!-- checkpoint: done -->
|
|
189
|
+
|
|
190
|
+
Read all four skill files and verify they form a coherent workflow:
|
|
191
|
+
|
|
192
|
+
1. `skills/writing-plans/SKILL.md` — produces `*implementation.md` with `## Task N:` headings
|
|
193
|
+
2. `skills/executing-tasks/SKILL.md` — reads the plan, creates/maintains `*progress.md`, works across sessions
|
|
194
|
+
3. `skills/finalizing/SKILL.md` — archives `*progress.md`, warns on skipped tasks
|
|
195
|
+
|
|
196
|
+
Check for:
|
|
197
|
+
- [ ] Terminology is consistent across all three skills (status names, file paths, checkpoint labels)
|
|
198
|
+
- [ ] `executing-tasks` correctly describes how to parse the `## Task N:` format that `writing-plans` enforces
|
|
199
|
+
- [ ] `finalizing` correctly references the progress file path that `executing-tasks` creates
|
|
200
|
+
- [ ] No orphaned references to old behavior (e.g., no references to in-memory task tracking)
|
|
201
|
+
- [ ] The user override commands in `executing-tasks` are complete and non-contradictory
|
|
202
|
+
|
|
203
|
+
Fix any inconsistencies found. This is a checkpoint: done task — present the review findings and wait for approval before committing.
|
|
204
|
+
|
|
205
|
+
```
|
|
206
|
+
git add skills/
|
|
207
|
+
git commit -m "chore: consistency review across workflow skills"
|
|
208
|
+
```
|
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
# Progress: executing-tasks-redesign
|
|
2
|
+
|
|
3
|
+
Plan: docs/plans/2026-04-28-executing-tasks-redesign-implementation.md
|
|
4
|
+
Branch: executing-tasks-redesign
|
|
5
|
+
Started: 2026-04-28T12:00:00Z
|
|
6
|
+
Last updated: 2026-04-28T12:04:00Z
|
|
7
|
+
|
|
8
|
+
| # | Status | Task | Commit |
|
|
9
|
+
|---|--------|------|--------|
|
|
10
|
+
| 1 | ✅ done | Rewrite executing-tasks skill — progress file and startup flow | a1b2c3d¹ |
|
|
11
|
+
| 2 | ✅ done | Add per-task execution, batching, and session management to executing-tasks | d4e5f6a² |
|
|
12
|
+
| 3 | ✅ done | Update writing-plans skill — enforce task heading format | b7c8d9e³ |
|
|
13
|
+
| 4 | ✅ done | Update finalizing skill — archive progress file and warn on skipped tasks | f0a1b2c⁴ |
|
|
14
|
+
| 5 | ✅ done | checkpoint: done — End-to-end review — read all four skill files and verify consistency | b0c1d2e⁵ |
|
package/package.json
CHANGED
|
@@ -5,12 +5,33 @@ description: "Use this to implement an approved plan task-by-task. Run after wri
|
|
|
5
5
|
|
|
6
6
|
# Executing Tasks
|
|
7
7
|
|
|
8
|
-
Implement the plan from `docs/plans/*-implementation.md` task by task.
|
|
8
|
+
Implement the plan from `docs/plans/*-implementation.md` task by task, with file-based progress tracking and session-aware context management.
|
|
9
9
|
|
|
10
10
|
## Before you start
|
|
11
11
|
|
|
12
12
|
1. **Check git state** — run `git status` and `git log --oneline -5`. Note any uncommitted changes.
|
|
13
|
-
2. **
|
|
13
|
+
2. **Find the plan** — look for `docs/plans/*-implementation.md`. If multiple exist, ask the user which one to execute.
|
|
14
|
+
3. **Check for existing progress** — look for `docs/plans/*-progress.md`. If one exists matching the plan, this is a **resume** (see [Resume](#resume)). If not, this is a **first run** (see [First run](#first-run)).
|
|
15
|
+
|
|
16
|
+
## First run
|
|
17
|
+
|
|
18
|
+
1. **Parse the implementation plan** — read the plan and extract all `## Task N:` headings. Build the progress table with all tasks as `⬜ pending`.
|
|
19
|
+
2. **Create the progress file** — save to `docs/plans/<plan-name>-progress.md` (replace `-implementation` with `-progress` in the plan filename):
|
|
20
|
+
|
|
21
|
+
```markdown
|
|
22
|
+
# Progress: <topic>
|
|
23
|
+
|
|
24
|
+
Plan: docs/plans/YYYY-MM-DD-<topic>-implementation.md
|
|
25
|
+
Branch: <current-branch>
|
|
26
|
+
Started: <ISO timestamp>
|
|
27
|
+
Last updated: <ISO timestamp>
|
|
28
|
+
|
|
29
|
+
| # | Status | Task | Commit |
|
|
30
|
+
|---|--------|------|--------|
|
|
31
|
+
| 1 | ⬜ pending | Task description (preserve checkpoint labels) | — |
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
3. **Suggest workspace isolation** — if the user isn't already on a feature branch or worktree, present the options:
|
|
14
35
|
|
|
15
36
|
- **Branch** (smaller changes):
|
|
16
37
|
```
|
|
@@ -23,12 +44,63 @@ Implement the plan from `docs/plans/*-implementation.md` task by task.
|
|
|
23
44
|
|
|
24
45
|
Derive `<feature-name>` from the plan doc (e.g. `docs/plans/2026-04-16-auth-design.md` → `auth`). Ask the user which they prefer, then wait for confirmation before proceeding.
|
|
25
46
|
|
|
26
|
-
|
|
47
|
+
4. **Commit the plan docs** — if `docs/plans/` has uncommitted files, commit them on the new branch:
|
|
27
48
|
```
|
|
28
49
|
git add docs/plans/ && git commit -m "docs: add design and implementation plan"
|
|
29
50
|
```
|
|
30
51
|
|
|
31
|
-
|
|
52
|
+
5. **Begin task execution** — start with task 1 (see [Per-task execution](#per-task-execution)).
|
|
53
|
+
|
|
54
|
+
## Resume
|
|
55
|
+
|
|
56
|
+
1. **Read the progress file** — find the first task with status `⬜ pending`, `❌ failed`, or `🔄 in-progress`.
|
|
57
|
+
2. **Handle in-progress task** — if a task is `🔄 in-progress` (mid-task crash):
|
|
58
|
+
- Check `git log --oneline` since the last `✅ done` task's commit
|
|
59
|
+
- If commits exist: ask the user — "Task N was in progress and commits were made. Continue from here, or reset it to pending?"
|
|
60
|
+
- If no commits: restart the task (leave it marked `🔄 in-progress` and begin again)
|
|
61
|
+
3. **Handle failed task** — if a task is `❌ failed`:
|
|
62
|
+
- Show the failure reason from the progress file
|
|
63
|
+
- Ask: "Retry, skip, or abort?"
|
|
64
|
+
4. **Handle pending task** — proceed normally
|
|
65
|
+
5. **All done** — if no `⬜ pending` or `❌ failed` tasks remain, show summary and suggest `/skill:finalizing`
|
|
66
|
+
6. **Begin task execution** — proceed from the identified task
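Step 1 of the resume flow could look something like the sketch below. The helper name `next_task` is hypothetical; it matches on the status column only, so a task description that happens to contain the word "pending" is not picked up by mistake. It assumes the table layout from the progress file format and no `|` inside cells.

```bash
# Print the number of the first task that still needs work
# (pending, failed, or in-progress). Prints nothing if none remain.
# Hypothetical helper.
# Usage: next_task <progress-file>
next_task() {
  awk -F'|' '
    $3 ~ /pending|failed|in-progress/ {
      gsub(/ /, "", $2)   # strip padding around the task number
      print $2
      exit
    }
  ' "$1"
}
```

An empty result corresponds to the "All done" case, where the agent shows the summary instead of starting a task.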
|
|
67
|
+
|
|
68
|
+
## Progress file
|
|
69
|
+
|
|
70
|
+
**Path:** `docs/plans/<plan-name>-progress.md`
|
|
71
|
+
|
|
72
|
+
**Status values:**
|
|
73
|
+
|
|
74
|
+
| Status | Meaning |
|
|
75
|
+
|--------|---------|
|
|
76
|
+
| `⬜ pending` | Not started |
|
|
77
|
+
| `🔄 in-progress` | Currently being worked on |
|
|
78
|
+
| `✅ done` | Committed successfully |
|
|
79
|
+
| `❌ failed` | Could not complete (append `Failed: <reason>`) |
|
|
80
|
+
| `⏭ skipped` | User chose to skip |
|
|
81
|
+
|
|
82
|
+
**Update rules:**
|
|
83
|
+
- Mark `🔄 in-progress` immediately when starting a task
|
|
84
|
+
- Mark `✅ done` + record commit hash only after successful `git commit`
|
|
85
|
+
- Mark `❌ failed` + append reason when the agent can't proceed after retrying
|
|
86
|
+
- Mark `⏭ skipped` when the user says "skip"
|
|
87
|
+
- Update `Last updated` timestamp on every change
|
|
88
|
+
- Preserve checkpoint labels in the task description column
|
|
89
|
+
|
|
90
|
+
## Per-task execution
|
|
91
|
+
|
|
92
|
+
For each task the agent works on:
|
|
93
|
+
|
|
94
|
+
1. **Mark in-progress** — update the progress file: `🔄 in-progress`
|
|
95
|
+
2. **Read only the relevant task** — grep/jump to `## Task N:` in the implementation plan. Do not read the entire plan.
|
|
96
|
+
3. **Implement** — follow the TDD discipline (see [TDD discipline](#tdd-discipline)) and checkpoint flow (see [Checkpoints](#checkpoints))
|
|
97
|
+
4. **Commit** — `git add` the relevant files and commit with a clear message
|
|
98
|
+
5. **Update progress** — mark `✅ done` + record the commit hash
|
|
99
|
+
6. **Check next task** — look at the next task in the progress file:
|
|
100
|
+
- **Has checkpoint** → pause for review (see [Checkpoint review](#checkpoint-review))
|
|
101
|
+
- **No checkpoint** → continue to the next task
|
|
102
|
+
|
|
103
|
+
## Checkpoints
|
|
32
104
|
|
|
33
105
|
Check each task for a `checkpoint` label and follow the appropriate flow:
|
|
34
106
|
|
|
@@ -54,16 +126,6 @@ Check each task for a `checkpoint` label and follow the appropriate flow:
|
|
|
54
126
|
4. **Pause for review** — show what was done and the diff, then wait for human input
|
|
55
127
|
5. **Commit** — `git add` the relevant files and commit with a clear message
|
|
56
128
|
|
|
57
|
-
## TDD discipline
|
|
58
|
-
|
|
59
|
-
Follow the TDD scenario from the plan:
|
|
60
|
-
|
|
61
|
-
- **New feature**: write the test first, see it fail, then implement
|
|
62
|
-
- **Modifying tested code**: run existing tests before and after
|
|
63
|
-
- **Trivial change**: use judgment
|
|
64
|
-
|
|
65
|
-
Don't skip tests because "it's obvious." The test is the contract.
|
|
66
|
-
|
|
67
129
|
## Checkpoint review
|
|
68
130
|
|
|
69
131
|
When pausing at a checkpoint, present:
|
|
@@ -83,6 +145,63 @@ Wait for the human to respond. They may:
|
|
|
83
145
|
- Ask to revert the task
|
|
84
146
|
- Adjust the remaining plan
|
|
85
147
|
|
|
148
|
+
## TDD discipline
|
|
149
|
+
|
|
150
|
+
Follow the TDD scenario from the plan:
|
|
151
|
+
|
|
152
|
+
- **New feature**: write the test first, see it fail, then implement
|
|
153
|
+
- **Modifying tested code**: run existing tests before and after
|
|
154
|
+
- **Trivial change**: use judgment
|
|
155
|
+
|
|
156
|
+
Don't skip tests because "it's obvious." The test is the contract.
|
|
157
|
+
|
|
158
|
+
## Batching and session management
|
|
159
|
+
|
|
160
|
+
The agent suggests a fresh session at natural break points to minimize token accumulation. After completing ~3-5 non-checkpoint tasks in the same session, suggest:
|
|
161
|
+
|
|
162
|
+
```
|
|
163
|
+
✅ Tasks 3-5 done (commits: a1b2, e4f5, i7j8)
|
|
164
|
+
|
|
165
|
+
Progress: 5/10 tasks done
|
|
166
|
+
|
|
167
|
+
⏭ Next: Task 6 — Add auth middleware (no checkpoint)
|
|
168
|
+
|
|
169
|
+
💡 Context is building up. For clean context on remaining tasks:
|
|
170
|
+
/new then /skill:executing-tasks
|
|
171
|
+
(or just say "continue" to keep going here)
|
|
172
|
+
```
|
|
173
|
+
|
|
174
|
+
The user can say "continue" to keep going in the same session. Respect their choice.
|
|
175
|
+
|
|
176
|
+
Also suggest `/new` at checkpoint review pauses when multiple tasks have been completed since the last session break.
|
|
177
|
+
|
|
178
|
+
## Progress file updates (automated)
|
|
179
|
+
|
|
180
|
+
During execution, the agent should update the progress file in place. Example workflow:
|
|
181
|
+
|
|
182
|
+
```bash
|
|
183
|
+
# Before task 2 starts:
|
|
184
|
+
sed -i 's/| 2 | ⬜ pending/| 2 | 🔄 in-progress/' "docs/plans/<plan-name>-progress.md"
|
|
185
|
+
# After successful commit a1b2c3d:
|
|
186
|
+
sed -i 's/| 2 | 🔄 in-progress/| 2 | ✅ done/' "docs/plans/<plan-name>-progress.md"
|
|
187
|
+
sed -i '/^| 2 | ✅ done/ s/| — |$/| a1b2c3d |/' "docs/plans/<plan-name>-progress.md"
|
|
188
|
+
# Update timestamp:
|
|
189
|
+
sed -i "s/Last updated:.*/Last updated: $(date -u +%Y-%m-%dT%H:%M:%SZ)/" "docs/plans/<plan-name>-progress.md"
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
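Note: These `sed` one-liners are illustrative only (and `sed -i` without a suffix argument is GNU syntax; BSD/macOS `sed` needs `-i ''`). When updating the real file, the agent should locate the row by its task-number column and change only that row's status and commit cells, so that a replacement cannot land on the wrong row or corrupt the table.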
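One way to make the update row-aware is to split each table row on `|` and rewrite only the matching row's cells. The sketch below is an assumption-laden illustration, not the skill's actual mechanism: `set_status` is a hypothetical name, cells must not contain a literal `|`, and the file is rewritten through a temporary file.

```bash
# Set the status (and optionally the commit) of one task row and refresh
# the "Last updated" line. Hypothetical helper.
# Usage: set_status <progress-file> <task-number> <status> [commit]
set_status() {
  file=$1 tmp=$(mktemp) || return 1
  awk -F'|' -v OFS='|' -v n="$2" -v status="$3" -v commit="$4" \
      -v now="$(date -u +%Y-%m-%dT%H:%M:%SZ)" '
    /^Last updated:/ { print "Last updated: " now; next }
    # Data row whose numeric first cell equals the requested task number.
    NF >= 6 && $2 ~ /^ *[0-9]+ *$/ && $2 + 0 == n + 0 {
      $3 = " " status " "
      if (commit != "") $5 = " " commit " "
    }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, `set_status "$progress" 2 '✅ done' a1b2c3d` would mark task 2 done with that commit, while `set_status "$progress" 3 '🔄 in-progress'` leaves the commit cell untouched.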
|
|
193
|
+
|
|
194
|
+
## User override commands
|
|
195
|
+
|
|
196
|
+
The user can issue these commands at any time during execution:
|
|
197
|
+
|
|
198
|
+
| User says | Agent does |
|
|
199
|
+
|-----------|-----------|
|
|
200
|
+
| `skip` | Mark current task `⏭ skipped`, move to next |
|
|
201
|
+
| `status` | Show the progress table |
|
|
202
|
+
| `stop` | Mark current task back to `⬜ pending`, suggest `/new` |
|
|
203
|
+
| `retry` | Re-read current task section, start over |
|
|
204
|
+
|
|
86
205
|
## Receiving code review
|
|
87
206
|
|
|
88
207
|
When the user shares code review feedback:
|
|
@@ -94,10 +213,22 @@ When the user shares code review feedback:
|
|
|
94
213
|
|
|
95
214
|
## If you're stuck
|
|
96
215
|
|
|
97
|
-
- Re-read the plan — you may have drifted from the spec
|
|
216
|
+
- Re-read the current task section from the plan — you may have drifted from the spec
|
|
98
217
|
- Check git log — recent commits may reveal context
|
|
99
218
|
- Ask the user — it's better to clarify than to guess wrong
|
|
100
219
|
|
|
101
220
|
## After all tasks
|
|
102
221
|
|
|
103
|
-
|
|
222
|
+
When no `⬜ pending` or `❌ failed` tasks remain, show a summary:
|
|
223
|
+
|
|
224
|
+
```
|
|
225
|
+
✅ All tasks complete!
|
|
226
|
+
|
|
227
|
+
| # | Status | Task |
|
|
228
|
+
|---|--------|------|
|
|
229
|
+
| 1 | ✅ done | Create User model |
|
|
230
|
+
| 2 | ✅ done | Write User model tests |
|
|
231
|
+
| 3 | ⏭ skipped | Add auth middleware |
|
|
232
|
+
|
|
233
|
+
Ready to ship? Run `/skill:finalizing`
|
|
234
|
+
```
|
|
@@ -7,14 +7,27 @@ description: "Use this after all tasks are complete to clean up, document, and s
|
|
|
7
7
|
|
|
8
8
|
Ship the completed work.
|
|
9
9
|
|
|
10
|
+
## Pre-finalization checks
|
|
11
|
+
|
|
12
|
+
### Check for skipped tasks
|
|
13
|
+
|
|
14
|
+
Before archiving, if a progress file exists (`docs/plans/*-progress.md`), read it and check for any `⏭ skipped` tasks. If found, warn:
|
|
15
|
+
|
|
16
|
+
```
|
|
17
|
+
⚠️ Tasks 4 and 7 were skipped. Continue with finalizing, or go back?
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
Wait for the user to confirm before proceeding.
|
|
21
|
+
|
|
10
22
|
## Process
|
|
11
23
|
|
|
12
|
-
1. **Move planning docs** — archive the design and
|
|
24
|
+
1. **Move planning docs** — archive the design, implementation, and progress docs, then commit:
|
|
13
25
|
```
|
|
14
26
|
mkdir -p docs/plans/completed
|
|
15
27
|
mv docs/plans/*-design.md docs/plans/completed/
|
|
16
28
|
mv docs/plans/*-implementation.md docs/plans/completed/
|
|
17
|
-
|
|
29
|
+
mv docs/plans/*-progress.md docs/plans/completed/
|
|
30
|
+
git add docs/plans/ && git commit -m "chore: archive planning docs"
|
|
18
31
|
```
|
|
19
32
|
|
|
20
33
|
2. **Update documentation** — if the API or surface changed:
|
|
@@ -30,6 +43,14 @@ Ship the completed work.
|
|
|
30
43
|
gh pr create --title "feat: <summary>" --body "<task summary>"
|
|
31
44
|
```
|
|
32
45
|
|
|
46
|
+
Use the progress file to generate the summary. Convert the task table to a bulleted list:
|
|
47
|
+
```
|
|
48
|
+
- ✅ Create User model
|
|
49
|
+
- ✅ Write User model tests
|
|
50
|
+
- ⏭ Add auth middleware (skipped)
|
|
51
|
+
- ✅ Add login endpoint
|
|
52
|
+
```
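One possible way to do the table-to-bullets conversion, sketched in TypeScript. It assumes each status cell is an icon followed by a one-word label (for example `✅ done`) and that the rows have already been read from the progress file; `SummaryTask` and `toBullet` are hypothetical names.

```typescript
// Hypothetical shape for a row already read from the progress table.
type SummaryTask = { status: string; title: string };

// Render one task as a bullet. Done tasks show only the icon; any other
// status appends its label in parentheses, as in the example above.
function toBullet(task: SummaryTask): string {
  const [icon = "", label = ""] = task.status.trim().split(/\s+/);
  return label === "done"
    ? `- ${icon} ${task.title}`
    : `- ${icon} ${task.title} (${label})`;
}

// Join the bullets into a body suitable for the PR description.
function toPrBody(tasks: SummaryTask[]): string {
  return tasks.map(toBullet).join("\n");
}
```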
|
|
53
|
+
|
|
33
54
|
2. **Rebase & merge** *(recommended)* — rebase onto parent, fast-forward merge, push parent, delete branch:
|
|
34
55
|
```
|
|
35
56
|
parent=$(git show-branch -a 2>/dev/null | grep '\*' | grep -v "$(git branch --show-current)" | head -1 | sed 's/.*\[\(.*\)\].*/\1/' | sed 's/[\^~].*//')
|
|
@@ -54,7 +75,7 @@ Ship the completed work.
|
|
|
54
75
|
```
|
|
55
76
|
parent=$(git show-branch -a 2>/dev/null | grep '\*' | grep -v "$(git branch --show-current)" | head -1 | sed 's/.*\[\(.*\)\].*/\1/' | sed 's/[\^~].*//')
|
|
56
77
|
git checkout "$parent" && git pull
|
|
57
|
-
git merge --no-ff -m "Merge branch '<branch>'" -
|
|
78
|
+
git checkout - && git merge --no-ff -m "Merge branch '<branch>'" -
|
|
58
79
|
git push origin "$parent"
|
|
59
80
|
git branch -d - && git push origin --delete -
|
|
60
81
|
```
|
|
@@ -22,6 +22,28 @@ Each task should be 2-5 minutes of work:
|
|
|
22
22
|
- `git commit` after each task
|
|
23
23
|
- Optional `checkpoint: test` or `checkpoint: done` label
|
|
24
24
|
|
|
25
|
+
Each task must use a numbered heading:
|
|
26
|
+
|
|
27
|
+
```markdown
|
|
28
|
+
## Task N: <description>
|
|
29
|
+
|
|
30
|
+
<!-- tdd: new-feature -->
|
|
31
|
+
<!-- checkpoint: none -->
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
...where N starts at 1 and increases by one for each task in the plan.
|
|
35
|
+
|
|
36
|
+
The metadata comments (placed right after the heading) are optional but recommended. If present, they help the executing-tasks skill parse the plan correctly.
|
|
37
|
+
|
|
38
|
+
Valid TDD values: `new-feature`, `modifying-tested-code`, `trivial`
|
|
39
|
+
|
|
40
|
+
Valid checkpoint values: `none`, `test`, `done`
|
|
41
|
+
|
|
42
|
+
If the comments are omitted, the agent infers the TDD scenario and checkpoint from context.
|
|
43
|
+
|
|
44
|
+
Use the `<!-- tdd: ... -->` and `<!-- checkpoint: ... -->` metadata comments to specify these options explicitly. The inline `checkpoint: test` / `checkpoint: done` label format (e.g. in a task list) is still supported as a fallback, but the metadata comment is the canonical source.
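To illustrate how a consumer might read this format, here is a hedged TypeScript sketch of a plan parser. It is not the executing-tasks skill's actual logic; `PlanTask` and `parsePlanTasks` are invented names, and for simplicity the sketch accepts a metadata comment anywhere under its heading rather than only directly after it.

```typescript
type PlanTask = { number: number; description: string; tdd?: string; checkpoint?: string };

// Allowed values as listed in the text above.
const TDD_VALUES = ["new-feature", "modifying-tested-code", "trivial"];
const CHECKPOINT_VALUES = ["none", "test", "done"];

function parsePlanTasks(plan: string): PlanTask[] {
  const tasks: PlanTask[] = [];
  let current: PlanTask | undefined;
  for (const rawLine of plan.split("\n")) {
    const line = rawLine.trim();
    // "## Task N: <description>" starts a new task.
    const heading = /^## Task (\d+): (.+)$/.exec(line);
    if (heading) {
      current = { number: Number(heading[1]), description: (heading[2] ?? "").trim() };
      tasks.push(current);
      continue;
    }
    // "<!-- tdd: value -->" or "<!-- checkpoint: value -->" annotates the current task.
    const meta = /^<!--\s*(tdd|checkpoint):\s*([\w-]+)\s*-->$/.exec(line);
    if (!meta || !current) continue;
    const [, key, value = ""] = meta;
    // Unknown values are ignored so the caller can fall back to inference.
    if (key === "tdd" && TDD_VALUES.includes(value)) current.tdd = value;
    if (key === "checkpoint" && CHECKPOINT_VALUES.includes(value)) current.checkpoint = value;
  }
  return tasks;
}
```

Leaving `tdd` and `checkpoint` undefined when a comment is missing or invalid keeps the "infer from context" fallback available to the caller.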
|
|
45
|
+
|
|
46
|
+
|
|
25
47
|
## TDD in the plan
|
|
26
48
|
|
|
27
49
|
Label each task with its TDD scenario:
|
|
@@ -1,172 +0,0 @@
|
|
|
1
|
-
# Workflow Guard: Safe Commands Expansion
|
|
2
|
-
|
|
3
|
-
**Date:** 2026-04-21
|
|
4
|
-
**Status:** Draft
|
|
5
|
-
|
|
6
|
-
## Problem
|
|
7
|
-
|
|
8
|
-
The workflow guard blocks several read-only bash commands that are genuinely needed during brainstorm and plan phases. Two specific user reports:
|
|
9
|
-
|
|
10
|
-
1. `cd /path && git remote -v 2>/dev/null; echo "---"; ls` — blocked due to `cd` not being allowlisted and `2>/dev/null` caught by the stdout-redirect pattern.
|
|
11
|
-
2. `gh pr view 1564 --json ... 2>/dev/null || echo "gh failed"` — blocked because `gh` is not allowlisted at all.
|
|
12
|
-
|
|
13
|
-
Additionally, `git status --short` is blocked because the safe regex only allows `git status` without flags.
|
|
14
|
-
|
|
15
|
-
## Design
|
|
16
|
-
|
|
17
|
-
### 1. Harmless redirect stripping
|
|
18
|
-
|
|
19
|
-
Add a `stripHarmlessRedirects(cmd)` helper that removes `2>/dev/null` and `2>&1` before pattern matching. These redirects only discard stderr or merge it into stdout, so they write no files and have no side effects.
|
|
20
|
-
|
|
21
|
-
```ts
|
|
22
|
-
function stripHarmlessRedirects(cmd: string): string {
|
|
23
|
-
return cmd.replace(/\s*2\s*>\s*(\/dev\/null|&1)\b/g, "");
|
|
24
|
-
}
|
|
25
|
-
```
|
|
26
|
-
|
|
27
|
-
Apply it inside `isSafeCommand` on each sub-command before checking DESTRUCTIVE and SAFE patterns. This fixes `2>/dev/null` without loosening the redirect catch (which still blocks real writes).
|
|
28
|
-
|
|
29
|
-
### 2. New SAFE_PATTERNS entries
|
|
30
|
-
|
|
31
|
-
| Pattern | Rationale |
|
|
32
|
-
|---------|-----------|
|
|
33
|
-
| `/^\s*cd\b/` | Directory navigation — zero side effects |
|
|
34
|
-
| `/^\s*gh\s+pr\s+(view\|list\|diff\|checks\|status)\b/i` | Read-only PR inspection |
|
|
35
|
-
| `/^\s*gh\s+issue\s+(view\|list)\b/i` | Read-only issue inspection |
|
|
36
|
-
| `/^\s*gh\s+repo\s+(view\|fork\|list)\b/i` | Read-only repo metadata |
|
|
37
|
-
| `/^\s*gh\s+release\s+(view\|list\|download)\b/i` | Read-only release inspection |
|
|
38
|
-
| `/^\s*gh\s+run\s+(view\|list)\b/i` | Read-only CI run inspection |
|
|
39
|
-
| `/^\s*git\s+blame\b/` | Read-only file annotation |
|
|
40
|
-
| `/^\s*git\s+shortlog\b/` | Read-only commit summary |
|
|
41
|
-
| `/^\s*git\s+stash\s+list\b/i` | Read-only stash listing |
|
|
42
|
-
| `/^\s*git\s+tag\s+(-l\|--list)\b/i` | Read-only tag listing |
|
|
43
|
-
| `/^\s*git\s+describe\b/` | Read-only version info |
|
|
44
|
-
|
|
45
|
-
### 3. Fix: `git status` flag handling
|
|
46
|
-
|
|
47
|
-
Current regex `/^\s*git\s+(status|log|...)/i` doesn't allow common flags like `--short`, `--oneline`, `--format=...`. Refine all git safe patterns to optionally accept trailing flags and args:
|
|
48
|
-
|
|
49
|
-
```ts
|
|
50
|
-
/^\s*git\s+status\b/i,
|
|
51
|
-
/^\s*git\s+log\b/i,
|
|
52
|
-
/^\s*git\s+diff\b/i,
|
|
53
|
-
/^\s*git\s+show\b/i,
|
|
54
|
-
/^\s*git\s+blame\b/i,
|
|
55
|
-
// etc.
|
|
56
|
-
```
|
|
57
|
-
|
|
58
|
-
The existing patterns already anchor to `^\s*git\s+<subcommand>`, so the first suspicion was that `git status --short` failed because some patterns were anchored more restrictively. Reviewing the code shows otherwise: the base pattern `/^\s*git\s+(status|log|diff|show|branch|remote|config\s+--get)/i` has no trailing `\b` and does not require end-of-string, so `git status --short` **does** match. In the compound command `git status --short && git log --oneline -5`, the `git log --oneline -5` part is safe as well, and the `-5` argument is not an issue. The real blocker for that compound command was the `&&` splitting.
|
|
59
|
-
|
|
60
|
-
**Conclusion on item 3:** The `git status --short` case was a false alarm caused by compound command parsing combined with the `2>/dev/null` redirect in the user's actual commands, not a pattern bug. No change needed here beyond the redirect fix.
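The conclusion can be checked directly against the base pattern quoted above. The snippet below is only a quick verification sketch; `GIT_SAFE` is a local name chosen for illustration, and the real `SAFE_PATTERNS` array may contain more entries.

```typescript
// Base git pattern as quoted in the design text. It has no end-of-string
// anchor, so trailing flags and arguments do not prevent a match.
const GIT_SAFE = /^\s*git\s+(status|log|diff|show|branch|remote|config\s+--get)/i;

const matchesShort = GIT_SAFE.test("git status --short"); // true
const matchesLog = GIT_SAFE.test("git log --oneline -5"); // true
const matchesPush = GIT_SAFE.test("git push origin main"); // false

console.log({ matchesShort, matchesLog, matchesPush });
```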
|
|
61
|
-
|
|
62
|
-
### 4. What we're NOT adding (YAGNI)
|
|
63
|
-
|
|
64
|
-
- `sed -i` (in-place editing) — correctly destructive
|
|
65
|
-
- `gh pr create/merge/close` — write operations
|
|
66
|
-
- `curl -o file` (output to file) — the redirect catch blocks this
|
|
67
|
-
- `cut`, `tr`, `column`, `base64` — rarely needed; can add later on demand
|
|
68
|
-
- `gh api` — too broad; can be used for mutations. Require specific subcommands.
|
|
69
|
-
|
|
70
|
-
## Data flow
|
|
71
|
-
|
|
72
|
-
No new data flow. The changes are purely additive to the pattern-matching logic in `isSafeCommand`.
|
|
73
|
-
|
|
74
|
-
## Error handling
|
|
75
|
-
|
|
76
|
-
No new error paths. The existing block-and-warn behavior remains unchanged.
|
|
77
|
-
|
|
78
|
-
## Testing
|
|
79
|
-
|
|
80
|
-
Manual verification with the exact commands that were blocked:
|
|
81
|
-
|
|
82
|
-
1. `cd /some/path && git remote -v 2>/dev/null; echo "---"; ls` → allowed
|
|
83
|
-
2. `gh pr view 1564 --repo owner/repo --json title,body,files 2>/dev/null || echo "gh failed"` → allowed
|
|
84
|
-
3. `git stash list` → allowed
|
|
85
|
-
4. `git tag -l` → allowed
|
|
86
|
-
5. `rm -rf /` → still blocked ✓
|
|
87
|
-
6. `git push origin main` → still blocked ✓
|
|
88
|
-
|
|
89
|
-
## Tests
|
|
90
|
-
|
|
91
|
-
Add the following test cases to the existing `tests/workflow-guard.test.ts` `isSafeCommand` describe block:
|
|
92
|
-
|
|
93
|
-
### New: `cd` navigation
|
|
94
|
-
```ts
|
|
95
|
-
it("allows cd", () => {
|
|
96
|
-
expect(isSafeCommand("cd /some/path")).toBe(true);
|
|
97
|
-
expect(isSafeCommand("cd src && ls")).toBe(true);
|
|
98
|
-
});
|
|
99
|
-
```
|
|
100
|
-
|
|
101
|
-
### New: GitHub CLI read-only subcommands
|
|
102
|
-
```ts
|
|
103
|
-
it("allows gh read-only subcommands", () => {
|
|
104
|
-
expect(isSafeCommand("gh pr view 1564 --json title,body")).toBe(true);
|
|
105
|
-
expect(isSafeCommand("gh pr list --repo owner/repo")).toBe(true);
|
|
106
|
-
expect(isSafeCommand("gh pr diff 1564")).toBe(true);
|
|
107
|
-
expect(isSafeCommand("gh issue view 42")).toBe(true);
|
|
108
|
-
expect(isSafeCommand("gh issue list --label bug")).toBe(true);
|
|
109
|
-
expect(isSafeCommand("gh repo view owner/repo")).toBe(true);
|
|
110
|
-
expect(isSafeCommand("gh run view 12345")).toBe(true);
|
|
111
|
-
});
|
|
112
|
-
|
|
113
|
-
it("blocks gh write subcommands", () => {
|
|
114
|
-
expect(isSafeCommand("gh pr create --title 'fix'")).toBe(false);
|
|
115
|
-
expect(isSafeCommand("gh pr merge 1564")).toBe(false);
|
|
116
|
-
expect(isSafeCommand("gh issue close 42")).toBe(false);
|
|
117
|
-
expect(isSafeCommand("gh release create v1.0")).toBe(false);
|
|
118
|
-
});
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
### New: Git read-only subcommands
|
|
122
|
-
```ts
|
|
123
|
-
it("allows git read-only subcommands (new additions)", () => {
|
|
124
|
-
expect(isSafeCommand("git blame src/index.ts")).toBe(true);
|
|
125
|
-
expect(isSafeCommand("git shortlog -sn")).toBe(true);
|
|
126
|
-
expect(isSafeCommand("git stash list")).toBe(true);
|
|
127
|
-
expect(isSafeCommand("git tag -l")).toBe(true);
|
|
128
|
-
expect(isSafeCommand("git tag --list 'v*'")).toBe(true);
|
|
129
|
-
expect(isSafeCommand("git describe --tags")).toBe(true);
|
|
130
|
-
});
|
|
131
|
-
|
|
132
|
-
it("still blocks git stash mutations", () => {
|
|
133
|
-
expect(isSafeCommand("git stash push -m 'wip'")).toBe(false);
|
|
134
|
-
expect(isSafeCommand("git stash pop")).toBe(false);
|
|
135
|
-
});
|
|
136
|
-
```
|
|
137
|
-
|
|
138
|
-
### New: Harmless stderr redirect stripping
|
|
139
|
-
```ts
|
|
140
|
-
it("allows 2>/dev/null on safe commands", () => {
|
|
141
|
-
expect(isSafeCommand("git remote -v 2>/dev/null")).toBe(true);
|
|
142
|
-
expect(isSafeCommand("gh pr view 1564 2>/dev/null")).toBe(true);
|
|
143
|
-
expect(isSafeCommand("npm list 2>/dev/null")).toBe(true);
|
|
144
|
-
});
|
|
145
|
-
|
|
146
|
-
it("allows 2>&1 on safe commands", () => {
|
|
147
|
-
expect(isSafeCommand("git log 2>&1")).toBe(true);
|
|
148
|
-
});
|
|
149
|
-
|
|
150
|
-
it("still blocks stdout redirects even with stderr redirect present", () => {
|
|
151
|
-
expect(isSafeCommand("echo 'hello' > file.ts 2>/dev/null")).toBe(false);
|
|
152
|
-
expect(isSafeCommand("cat config > backup.txt 2>/dev/null")).toBe(false);
|
|
153
|
-
});
|
|
154
|
-
```
|
|
155
|
-
|
|
156
|
-
### New: Compound commands from real user scenarios
|
|
157
|
-
```ts
|
|
158
|
-
it("allows the exact user-reported blocked commands", () => {
|
|
159
|
-
// Scenario 1: directory navigation + git remote + ls
|
|
160
|
-
expect(isSafeCommand("cd /Users/u/partying/pt-room && git remote -v 2>/dev/null; echo '---'; ls")).toBe(true);
|
|
161
|
-
// Scenario 2: gh pr view with fallback
|
|
162
|
-
expect(isSafeCommand("gh pr view 1564 --repo olachat/pt-partying --json title,body,files,additions,deletions 2>/dev/null || echo 'gh failed'")).toBe(true);
|
|
163
|
-
});
|
|
164
|
-
```
|
|
165
|
-
|
|
166
|
-
## Summary
|
|
167
|
-
|
|
168
|
-
| Change | Location | Size |
|
|
169
|
-
|--------|----------|------|
|
|
170
|
-
| Add `stripHarmlessRedirects()` | Above `isSafeCommand` | ~3 lines |
|
|
171
|
-
| Call it in `isSafeCommand` loop body | Inside `isSafeCommand` | 1 line changed |
|
|
172
|
-
| Add 11 new SAFE_PATTERNS entries | `SAFE_PATTERNS` array | ~11 lines |
|
|
@@ -1,168 +0,0 @@
|
|
|
1
|
-
# Workflow Guard: Safe Commands Expansion — Implementation Plan
|
|
2
|
-
|
|
3
|
-
**Design:** `docs/plans/2026-04-21-workflow-guard-safe-commands-design.md`
|
|
4
|
-
**Date:** 2026-04-21
|
|
5
|
-
|
|
6
|
-
---
|
|
7
|
-
|
|
8
|
-
## Task 1: Add `stripHarmlessRedirects` helper and wire it into `isSafeCommand`
|
|
9
|
-
|
|
10
|
-
**Scenario:** Modifying tested code
|
|
11
|
-
**File:** `extensions/workflow-guard.ts`
|
|
12
|
-
|
|
13
|
-
1. Run existing tests to confirm baseline:
|
|
14
|
-
```bash
|
|
15
|
-
npx vitest run tests/workflow-guard.test.ts
|
|
16
|
-
```
|
|
17
|
-
Expected: all pass.
|
|
18
|
-
|
|
19
|
-
2. Add `stripHarmlessRedirects` function above `isSafeCommand`:
|
|
20
|
-
```ts
|
|
21
|
-
/** Strip stderr redirects that are purely cosmetic (no side effects). */
|
|
22
|
-
function stripHarmlessRedirects(cmd: string): string {
|
|
23
|
-
return cmd.replace(/\s*2\s*>\s*(\/dev\/null|&1)\b/g, "");
|
|
24
|
-
}
|
|
25
|
-
```
|
|
26
|
-
|
|
27
|
-
3. Wire it into `isSafeCommand` — apply `stripHarmlessRedirects` to each part before pattern matching:
|
|
28
|
-
```ts
|
|
29
|
-
export function isSafeCommand(command: string): boolean {
|
|
30
|
-
const parts = splitCompoundCommand(command);
|
|
31
|
-
return parts.every((part) => {
|
|
32
|
-
const cleaned = stripHarmlessRedirects(part);
|
|
33
|
-
const isDestructive = DESTRUCTIVE_PATTERNS.some((p) => p.test(cleaned));
|
|
34
|
-
const isSafe = SAFE_PATTERNS.some((p) => p.test(cleaned));
|
|
35
|
-
return !isDestructive && isSafe;
|
|
36
|
-
});
|
|
37
|
-
}
|
|
38
|
-
```
|
|
39
|
-
|
|
40
|
-
4. Run tests:
|
|
41
|
-
```bash
|
|
42
|
-
npx vitest run tests/workflow-guard.test.ts
|
|
43
|
-
```
|
|
44
|
-
Expected: all existing tests still pass (no behavior change yet since no new SAFE_PATTERNS).
|
|
45
|
-
|
|
46
|
-
5. Commit:
|
|
47
|
-
```bash
|
|
48
|
-
git add extensions/workflow-guard.ts
|
|
49
|
-
git commit -m "feat(workflow-guard): add stripHarmlessRedirects helper"
|
|
50
|
-
```
|
|
51
|
-
|
|
52
|
-
---
|
|
53
|
-
|
|
54
|
-
## Task 2: Add new SAFE_PATTERNS entries
|
|
55
|
-
|
|
56
|
-
**Scenario:** Modifying tested code
|
|
57
|
-
**File:** `extensions/workflow-guard.ts`
|
|
58
|
-
|
|
59
|
-
1. Add the following entries to the `SAFE_PATTERNS` array (after the existing `gh`-related area or at end):
|
|
60
|
-
|
|
61
|
-
```ts
|
|
62
|
-
/^\s*cd\b/,
|
|
63
|
-
/^\s*gh\s+pr\s+(view|list|diff|checks|status)\b/i,
|
|
64
|
-
/^\s*gh\s+issue\s+(view|list)\b/i,
|
|
65
|
-
/^\s*gh\s+repo\s+(view|fork|list)\b/i,
|
|
66
|
-
/^\s*gh\s+release\s+(view|list|download)\b/i,
|
|
67
|
-
/^\s*gh\s+run\s+(view|list)\b/i,
|
|
68
|
-
/^\s*git\s+blame\b/,
|
|
69
|
-
/^\s*git\s+shortlog\b/,
|
|
70
|
-
/^\s*git\s+stash\s+list\b/i,
|
|
71
|
-
/^\s*git\s+tag\s+(-l|--list)\b/i,
|
|
72
|
-
/^\s*git\s+describe\b/,
|
|
73
|
-
```
|
|
74
|
-
|
|
75
|
-
2. Run tests:
|
|
76
|
-
```bash
|
|
77
|
-
npx vitest run tests/workflow-guard.test.ts
|
|
78
|
-
```
|
|
79
|
-
Expected: all existing tests pass.
|
|
80
|
-
|
|
81
|
-
3. Commit:
|
|
82
|
-
```bash
|
|
83
|
-
git add extensions/workflow-guard.ts
|
|
84
|
-
git commit -m "feat(workflow-guard): add safe patterns for cd, gh, and git read-only subcommands"
|
|
85
|
-
```
|
|
86
|
-
|
|
87
|
-
---
|
|
88
|
-
|
|
89
|
-
## Task 3: Add tests for `cd`, `gh`, git new subcommands, and redirect stripping
|
|
90
|
-
|
|
91
|
-
**Scenario:** New feature (test-first)
|
|
92
|
-
**File:** `tests/workflow-guard.test.ts`
|
|
93
|
-
|
|
94
|
-
**checkpoint: test** — pause after writing failing tests, before implementation.
|
|
95
|
-
|
|
96
|
-
> Note: Implementation was already done in Tasks 1–2. These tests should all pass immediately. The checkpoint label is kept for review purposes in case the user wants to verify test design.
|
|
97
|
-
|
|
98
|
-
1. Add the following test blocks inside the `describe("isSafeCommand", ...)` block, after the existing tests:
|
|
99
|
-
|
|
100
|
-
```ts
|
|
101
|
-
it("allows cd", () => {
|
|
102
|
-
expect(isSafeCommand("cd /some/path")).toBe(true);
|
|
103
|
-
expect(isSafeCommand("cd src && ls")).toBe(true);
|
|
104
|
-
});
|
|
105
|
-
|
|
106
|
-
it("allows gh read-only subcommands", () => {
|
|
107
|
-
expect(isSafeCommand("gh pr view 1564 --json title,body")).toBe(true);
|
|
108
|
-
expect(isSafeCommand("gh pr list --repo owner/repo")).toBe(true);
|
|
109
|
-
expect(isSafeCommand("gh pr diff 1564")).toBe(true);
|
|
110
|
-
expect(isSafeCommand("gh issue view 42")).toBe(true);
|
|
111
|
-
expect(isSafeCommand("gh issue list --label bug")).toBe(true);
|
|
112
|
-
expect(isSafeCommand("gh repo view owner/repo")).toBe(true);
|
|
113
|
-
expect(isSafeCommand("gh run view 12345")).toBe(true);
|
|
114
|
-
});
|
|
115
|
-
|
|
116
|
-
it("blocks gh write subcommands", () => {
|
|
117
|
-
expect(isSafeCommand("gh pr create --title 'fix'")).toBe(false);
|
|
118
|
-
expect(isSafeCommand("gh pr merge 1564")).toBe(false);
|
|
119
|
-
expect(isSafeCommand("gh issue close 42")).toBe(false);
|
|
120
|
-
expect(isSafeCommand("gh release create v1.0")).toBe(false);
|
|
121
|
-
});
|
|
122
|
-
|
|
123
|
-
it("allows git read-only subcommands (new additions)", () => {
|
|
124
|
-
expect(isSafeCommand("git blame src/index.ts")).toBe(true);
|
|
125
|
-
expect(isSafeCommand("git shortlog -sn")).toBe(true);
|
|
126
|
-
expect(isSafeCommand("git stash list")).toBe(true);
|
|
127
|
-
expect(isSafeCommand("git tag -l")).toBe(true);
|
|
128
|
-
expect(isSafeCommand("git tag --list 'v*'")).toBe(true);
|
|
129
|
-
expect(isSafeCommand("git describe --tags")).toBe(true);
|
|
130
|
-
});
|
|
131
|
-
|
|
132
|
-
it("still blocks git stash mutations", () => {
|
|
133
|
-
expect(isSafeCommand("git stash push -m 'wip'")).toBe(false);
|
|
134
|
-
expect(isSafeCommand("git stash pop")).toBe(false);
|
|
135
|
-
});
|
|
136
|
-
|
|
137
|
-
it("allows 2>/dev/null on safe commands", () => {
|
|
138
|
-
expect(isSafeCommand("git remote -v 2>/dev/null")).toBe(true);
|
|
139
|
-
expect(isSafeCommand("gh pr view 1564 2>/dev/null")).toBe(true);
|
|
140
|
-
expect(isSafeCommand("npm list 2>/dev/null")).toBe(true);
|
|
141
|
-
});
|
|
142
|
-
|
|
143
|
-
it("allows 2>&1 on safe commands", () => {
|
|
144
|
-
expect(isSafeCommand("git log 2>&1")).toBe(true);
|
|
145
|
-
});
|
|
146
|
-
|
|
147
|
-
it("still blocks stdout redirects even with stderr redirect present", () => {
|
|
148
|
-
expect(isSafeCommand("echo 'hello' > file.ts 2>/dev/null")).toBe(false);
|
|
149
|
-
expect(isSafeCommand("cat config > backup.txt 2>/dev/null")).toBe(false);
|
|
150
|
-
});
|
|
151
|
-
|
|
152
|
-
it("allows the exact user-reported blocked commands", () => {
|
|
153
|
-
expect(isSafeCommand("cd /Users/u/partying/pt-room && git remote -v 2>/dev/null; echo '---'; ls")).toBe(true);
|
|
154
|
-
expect(isSafeCommand("gh pr view 1564 --repo olachat/pt-partying --json title,body,files,additions,deletions 2>/dev/null || echo 'gh failed'")).toBe(true);
|
|
155
|
-
});
|
|
156
|
-
```
|
|
157
|
-
|
|
158
|
-
2. Run tests:
|
|
159
|
-
```bash
|
|
160
|
-
npx vitest run tests/workflow-guard.test.ts
|
|
161
|
-
```
|
|
162
|
-
Expected: all tests pass (including new ones).
|
|
163
|
-
|
|
164
|
-
3. Commit:
|
|
165
|
-
```bash
|
|
166
|
-
git add tests/workflow-guard.test.ts
|
|
167
|
-
git commit -m "test(workflow-guard): add tests for cd, gh, git read-only subcommands, and redirect stripping"
|
|
168
|
-
```
|