@devshop/crew 0.10.0 → 0.11.0

package/CHANGELOG.md CHANGED
@@ -1,3 +1,17 @@
1
+ # [0.11.0](https://github.com/devshop-software/crew/compare/v0.10.1...v0.11.0) (2026-05-09)
2
+
3
+
4
+ ### Features
5
+
6
+ * **skills:** qa-v2 — journey-scope coverage map, leaner skill text ([d6342a0](https://github.com/devshop-software/crew/commit/d6342a066579047921041829cf75e45cbdb840fb))
7
+
8
+ ## [0.10.1](https://github.com/devshop-software/crew/compare/v0.10.0...v0.10.1) (2026-05-09)
9
+
10
+
11
+ ### Bug Fixes
12
+
13
+ * **skills:** close per-feature .feature ban loophole; mandate Gherkin Impact ([46eb47b](https://github.com/devshop-software/crew/commit/46eb47b932b6a4be90547c76f2b72324c2f0f09a)), closes [#53](https://github.com/devshop-software/crew/issues/53)
14
+
1
15
  # [0.10.0](https://github.com/devshop-software/crew/compare/v0.9.1...v0.10.0) (2026-05-07)
2
16
 
3
17
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@devshop/crew",
3
- "version": "0.10.0",
3
+ "version": "0.11.0",
4
4
  "description": "Project-agnostic Claude Code skills for spec → implement → qa → review → ship",
5
5
  "bin": {
6
6
  "crew": "scripts/cli.js"
@@ -19,11 +19,7 @@ Activate when called from the `/adjust` command. Otherwise ignore.
19
19
 
20
20
  ## Input Handling
21
21
 
22
- `$ARGUMENTS` may be:
23
-
24
- - **Empty** (most common) — scan the current project and set up config
25
- - **`update`** — re-scan and update an existing Workflow Config
26
- - **A specific key** (e.g. `test-cmd`) — update just that config value
22
+ Take whatever was passed and infer the scope: full project scan (default), a re-scan of the existing config, or an update to a single config key.
27
23
 
28
24
  ---
29
25
 
@@ -166,7 +162,7 @@ This step converts a standard git clone into a bare-clone worktree layout, or va
166
162
  .claude/ ← real dir at root (not a symlink)
167
163
  .mcp.json ← shared across worktrees
168
164
  main/ ← worktree for the base branch
169
- wt/ ← feature worktrees (created by /indie)
165
+ wt/ ← feature worktrees (created by /indie-agent)
170
166
  <feature-name>/ ← short, scannable names
171
167
  ```
172
168
 
@@ -245,7 +241,7 @@ This project uses a **bare-clone worktree layout**. The repo root is not a worki
245
241
  .claude/ ← real dir at root (not a symlink)
246
242
  .mcp.json ← shared across worktrees
247
243
  main/ ← worktree for the main branch (primary working copy)
248
- wt/ ← feature worktrees created by /indie
244
+ wt/ ← feature worktrees created by /indie-agent
249
245
  <feature-name>/ ← short, scannable names
250
246
  \`\`\`
251
247
 
@@ -19,11 +19,7 @@ Activate when called from the `/audit` command. Otherwise ignore.
19
19
 
20
20
  ## Input Handling
21
21
 
22
- `$ARGUMENTS` may be:
23
-
24
- - **Empty** (most common) — full codebase audit across all five dimensions
25
- - **A dimension name** (e.g. `security`, `testing`, `infrastructure`, `code-health`, `dependencies`) — audit only that dimension
26
- - **A path** to a directory — scope the audit to a specific area of the codebase
22
+ Take whatever was passed: empty for a full audit across all five dimensions, a dimension name (`security`, `testing`, `infrastructure`, `code-health`, `dependencies`) for one slice, or a directory path to scope to part of the codebase.
27
23
 
28
24
  ---
29
25
 
@@ -17,13 +17,7 @@ Activate when called from the `/docs` command. Otherwise ignore.
17
17
 
18
18
  ## Input Handling
19
19
 
20
- `$ARGUMENTS` may be:
21
-
22
- - **Empty** or `all` — regenerate all five managed files
23
- - **A folder name** — `operational` or `technical`. Regenerate the files in that folder only.
24
- - **A file name** — `architecture`, `first-time-setup`, `ci-cd`, `best-practices`, or `patterns`. Regenerate only that file.
25
-
26
- Record which files are in scope for this run before proceeding.
20
+ Take whatever was passed: empty regenerates all five managed files, a folder name (`operational` or `technical`) scopes to that folder, a file name scopes to that single file. Record which files are in scope before proceeding.
27
21
 
28
22
  ---
29
23
 
@@ -21,11 +21,7 @@ Activate when called from the `/implement` command. Otherwise ignore.
21
21
 
22
22
  ## Input Handling
23
23
 
24
- `$ARGUMENTS` may be:
25
-
26
- - A **folder name** (e.g. `20260413-1423-dark-mode`)
27
- - A **path** to the workflow folder or spec file
28
- - **Empty** — auto-detect: scan the workflow directory for folders that have `01-spec.md` but no `02-implementation.md`. If exactly one exists, use it. If multiple, list them and ask. If none, tell the user there are no unimplemented specs.
24
+ Take whatever was passed — workflow folder name, path to the folder or spec file, or empty to auto-detect (one folder with `01-spec.md` and no `02-implementation.md`; ask if multiple).
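The empty-input auto-detect rule can be sketched as below. This is a minimal illustration, assuming the workflow directory holds one subfolder per feature; the function name `unimplemented_specs` is hypothetical and not part of the package.

```python
from pathlib import Path
from typing import List


def unimplemented_specs(workflow_dir: Path) -> List[Path]:
    """Folders that contain 01-spec.md but no 02-implementation.md."""
    return sorted(
        folder
        for folder in workflow_dir.iterdir()
        if folder.is_dir()
        and (folder / "01-spec.md").is_file()
        and not (folder / "02-implementation.md").exists()
    )
```

With exactly one result the skill would use that folder; with several it would list them and ask.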
29
25
 
30
26
  ---
31
27
 
@@ -11,8 +11,6 @@ You are a lightweight workflow orchestrator. You drive the full development chai
11
11
 
12
12
  You are a conductor, not a player. You never write code, write tests, or perform reviews yourself. You read skill files from disk, construct agent prompts, dispatch them, then read the resulting artifacts to decide what comes next.
13
13
 
14
- **Key difference from the `indie` skill:** Each phase runs in a fresh subagent with its own context window. This means every phase gets full context budget for codebase exploration, and the orchestrator stays lean regardless of how many fix loop iterations occur.
15
-
16
14
  Each feature runs in its own git worktree, enabling multiple `/indie-agent` invocations to run in parallel across separate terminals.
17
15
 
18
16
  By default you run fully autonomously. The user provides input once, you deliver a PR with green CI. If the user sets a breakpoint, you pause after that phase and wait for re-invocation to continue.
@@ -25,28 +23,16 @@ Activate when called from the `/indie-agent` command. Otherwise ignore.
25
23
 
26
24
  ## Input Handling
27
25
 
28
- `$ARGUMENTS` may be:
29
-
30
- - A **GitHub issue URL** (e.g. `https://github.com/org/repo/issues/42`) — passed to the spec phase as input
31
- - **Free text** — a feature description, passed to the spec phase as input
32
- - A **workflow folder reference** (folder name or path) — resume an existing workflow from wherever it left off
33
- - **Empty** — auto-detect: scan the workflow directory for incomplete workflows (folders missing later artifacts). If exactly one exists, resume it. If multiple, list and ask. If none, tell the user to provide a feature description.
26
+ Take whatever was passed: a feature brief, GitHub issue URL, workflow folder reference (resume), or empty to auto-detect (one incomplete workflow if exactly one exists; ask if multiple).
34
27
 
35
28
  ### Breakpoints
36
29
 
37
- The input may include a breakpoint instruction. Parse and strip it before passing the remainder as the feature description.
38
-
39
- **Syntax:** `--stop-after <phase>`, `stop after <phase>`, `pause after <phase>`, or `break after <phase>` anywhere in the input.
30
+ The input may include a natural-language breakpoint instruction like "stop after spec", "pause after review", or "break after implement" anywhere in the message. Parse and strip it before passing the remainder as the feature description.
40
31
 
41
32
  **Recognized phases:** `spec`, `implement`, `qa`, `review`, `ship`
42
33
 
43
- **Examples:**
44
- - `/indie-agent https://github.com/org/repo/issues/42 --stop-after spec`
45
- - `/indie-agent add user avatars, stop after review`
46
- - `/indie-agent dark-mode --stop-after implement`
47
- - `/indie-agent https://github.com/org/repo/issues/42` — no breakpoint, fully autonomous
48
-
49
34
  **At a breakpoint:**
35
+
50
36
  1. Complete the phase normally (let the subagent finish)
51
37
  2. Verify the output artifact exists
52
38
  3. Report: "Paused after `<phase>`. Artifact: `<path>`. Worktree: `<worktree-path>`. Review it, then re-invoke `/indie-agent <folder>` to continue."
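The 0.11.0 wording leaves breakpoint parsing to the model. If it were done in code, it might look roughly like the sketch below. The function name `extract_breakpoint`, the regular expression, and the punctuation trimming are illustrative assumptions, not part of the skill.

```python
import re
from typing import Optional, Tuple

PHASES = ("spec", "implement", "qa", "review", "ship")

# Matches "stop after <phase>", "pause after <phase>", or "break after <phase>".
_BREAKPOINT = re.compile(
    r"\b(?:stop|pause|break)\s+after\s+(" + "|".join(PHASES) + r")\b",
    re.IGNORECASE,
)


def extract_breakpoint(message: str) -> Tuple[Optional[str], str]:
    """Return (phase or None, message with the breakpoint phrase removed)."""
    match = _BREAKPOINT.search(message)
    if match is None:
        return None, message.strip()
    remainder = message[: match.start()] + " " + message[match.end():]
    # Collapse the gap left by the removed phrase and trim stray separators.
    remainder = re.sub(r"\s+", " ", remainder).strip(" ,;.")
    return match.group(1).lower(), remainder
```

For example, `extract_breakpoint("add user avatars, stop after review")` yields the phase `review` and the feature description `add user avatars`.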
@@ -80,6 +66,7 @@ The input may include a breakpoint instruction. Parse and strip it before passin
80
66
  6. **Switch context:** all subsequent steps run inside the worktree directory
81
67
 
82
68
  Present a one-line plan:
69
+
83
70
  - **No breakpoint:** "Starting autonomous workflow for: `<feature summary>` in worktree `<path>`. Will run: spec → implement → qa → review → ship → monitor CI. I'll report back when done or if I hit a blocker."
84
71
  - **With breakpoint:** "Starting workflow for: `<feature summary>` in worktree `<path>`. Will run through `<phase>` and pause for your review."
85
72
  - **Resuming:** "Resuming workflow `<folder>` from `<next phase>`."
@@ -90,17 +77,17 @@ Present a one-line plan:
90
77
 
91
78
  Read the workflow folder and determine the current state from existing artifacts:
92
79
 
93
- | State | Artifacts Present | Next Action |
94
- |-------|-------------------|-------------|
95
- | Nothing | No workflow folder | Dispatch spec agent (Step 2) |
96
- | Spec done | `01-spec.md` only | Dispatch implementation agent (Step 3) |
97
- | Implementation done | `+ 02-implementation.md` | Dispatch QA agent (Step 4) |
98
- | QA done | `+ 03-qa*.md` (latest) | Dispatch review agent (Step 5) |
99
- | Review FAIL | `+ 04-review*.md` with FAIL verdict | Dispatch implementation fix agent (Step 5F) |
100
- | Review PASS | `+ 04-review*.md` with PASS verdict | Dispatch ship agent (Step 6) |
101
- | PR created | PR exists on remote branch | Monitor CI (Step 7) |
102
- | CI passing | All checks green | Write summary (Step 8) |
103
- | CI failing | Checks red | CI fix loop (Step 7F) |
80
+ | State | Artifacts Present | Next Action |
81
+ | ------------------- | ----------------------------------- | ------------------------------------------- |
82
+ | Nothing | No workflow folder | Dispatch spec agent (Step 2) |
83
+ | Spec done | `01-spec.md` only | Dispatch implementation agent (Step 3) |
84
+ | Implementation done | `+ 02-implementation.md` | Dispatch QA agent (Step 4) |
85
+ | QA done | `+ 03-qa*.md` (latest) | Dispatch review agent (Step 5) |
86
+ | Review FAIL | `+ 04-review*.md` with FAIL verdict | Dispatch implementation fix agent (Step 5F) |
87
+ | Review PASS | `+ 04-review*.md` with PASS verdict | Dispatch ship agent (Step 6) |
88
+ | PR created | PR exists on remote branch | Monitor CI (Step 7) |
89
+ | CI passing | All checks green | Write summary (Step 8) |
90
+ | CI failing | Checks red | CI fix loop (Step 7F) |
104
91
 
105
92
  **To detect "PR created":** Check if the current branch exists on the remote (`git ls-remote --heads origin <branch-name>`). If it does, find the PR with `gh pr list --head <branch-name>`.
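The artifact rows of the state table can be read as a simple decision function. The sketch below covers only the file-based rows (spec through review). It ignores the PR and CI rows and the ordering of fix-loop re-runs. The name `next_action` and its verdict argument are hypothetical.

```python
from typing import Iterable, Optional


def next_action(artifacts: Iterable[str], review_verdict: Optional[str] = None) -> str:
    """Map the artifact file names in a workflow folder to the next step label."""
    names = set(artifacts)
    has_qa = any(n.startswith("03-qa") and n.endswith(".md") for n in names)
    has_review = any(n.startswith("04-review") and n.endswith(".md") for n in names)

    if "01-spec.md" not in names:
        return "Step 2"   # dispatch spec agent
    if "02-implementation.md" not in names:
        return "Step 3"   # dispatch implementation agent
    if not has_qa:
        return "Step 4"   # dispatch QA agent
    if not has_review:
        return "Step 5"   # dispatch review agent
    if review_verdict == "PASS":
        return "Step 6"   # dispatch ship agent
    if review_verdict == "FAIL":
        return "Step 5F"  # dispatch implementation fix agent
    raise ValueError("review artifact present but verdict is unknown")
```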
106
93
 
@@ -113,9 +100,8 @@ Every phase (Steps 2–6) follows the same dispatch pattern:
113
100
  ### Before dispatching:
114
101
 
115
102
  1. **Read the skill file** from disk: `.claude/skills/<skill-name>/SKILL.md`
116
- 2. **Pre-seed the TaskList** — call `TaskCreate` once per subtask of this phase (see per-phase seed lists in Steps 2–6). Capture the returned task IDs; they go into the agent prompt.
117
- 3. **Construct the agent prompt** (see template below) — it MUST embed the progress-log path and the seeded task IDs
118
- 4. **Dispatch** via the Agent tool. For long phases (implementation, QA, fix loops) use `run_in_background: true` so the orchestrator stays responsive to user status queries. For short phases (spec, review, ship) foreground is fine.
103
+ 2. **Construct the agent prompt** (see template below) — it MUST embed the progress-log path
104
+ 3. **Dispatch** via the Agent tool. For long phases (implementation, QA, fix loops) use `run_in_background: true` so the orchestrator stays responsive to user status queries. For short phases (spec, review, ship) foreground is fine.
119
105
 
120
106
  ### Agent prompt template:
121
107
 
@@ -133,28 +119,28 @@ You are running as part of an autonomous workflow orchestrator. Your working dir
133
119
  - Workflow Config:
134
120
  <key-value pairs from CLAUDE.md>
135
121
 
136
- ## Progress Reporting (MANDATORY)
122
+ ## Progress Log (MANDATORY)
123
+
124
+ You MUST update the progress log as you work. The orchestrator reads it in real time to answer the user's status queries while you run. The progress log is the only signal — there is no task list, no other channel.
125
+
126
+ **Log path:** `<worktree-path>/<workflow-dir>/<folder-name>/_progress.log`
127
+
128
+ **Format (one line per milestone):** `[<phase>] <ISO-8601 UTC timestamp> — <event>`
129
+ Example: `[implementation] 2026-04-20T21:14:03Z — step 4/13: FIFO allocator service — starting`
137
130
 
138
- You MUST report progress as you work. The orchestrator reads these signals in real time to answer the user's status queries while you run.
131
+ **You MUST append at least:**
139
132
 
140
- 1. **Progress log.** Append to: `<worktree-path>/<workflow-dir>/<folder-name>/_progress.log`
141
- Format (one line per milestone): `[<phase>] <ISO-8601 UTC timestamp> — <event>`
142
- Example: `[implementation] 2026-04-20T21:14:03Z — step 4/13: FIFO allocator service — starting`
143
- Append AT LEAST:
144
- - ONE line when the phase starts
145
- - ONE line when you begin each subtask (with "<name> — starting")
146
- - ONE line when you finish each subtask (with "<name> — done" or "<name> — failed: <short reason>")
147
- - ONE line when the phase finishes (success or failure)
148
- Use shell append (`>>`), not overwrite. Never delete or truncate the file. Never batch-log at the end — log BEFORE and AFTER each subtask, as you go.
133
+ - ONE line when the phase starts (`[<phase>] <ts> — phase start`)
134
+ - ONE line when you begin each subtask (`<name> — starting`)
135
+ - ONE line when you finish each subtask (`<name> — done` or `<name> — failed: <short reason>`)
136
+ - ONE line when the phase finishes (`[<phase>] <ts> — phase done` or `phase failed: <reason>`)
149
137
 
150
- 2. **Task list.** The orchestrator pre-seeded these TaskList IDs for your phase:
151
- <task-id-list>
152
- Flip each one via `TaskUpdate`:
153
- - `status: "in_progress"` when you start working it
154
- - `status: "completed"` when it's done
155
- Do NOT TaskCreate new tasks unless you discover genuinely new work the orchestrator did not plan. Do NOT delete or re-subject seeded tasks.
138
+ **Discipline:**
156
139
 
157
- 3. **Discipline.** If a step fails or you hit a blocker, log it immediately with the `— failed:` form AND flip the task to `in_progress` (not completed). Do not stay silent.
140
+ - Use shell append (`echo ... >> _progress.log`), never overwrite. Never delete or truncate the file.
141
+ - Log BEFORE and AFTER each subtask, as you go. Never batch-log at the end.
142
+ - If a step fails or you hit a blocker, log it immediately with the `— failed:` form. Do not stay silent.
143
+ - Phase-end line is non-negotiable. The orchestrator uses it to detect completion.
158
144
 
159
145
  ## Skill Instructions
160
146
  Follow the skill instructions below. They define your role, steps, constraints, and red flags.
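The progress-log line format in the new template can be illustrated with a small append-only writer. The skill itself prescribes shell append (`echo ... >> _progress.log`), so the Python below is only an equivalent sketch. The helper names `format_progress_line` and `append_progress` are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def format_progress_line(phase: str, event: str, when: datetime) -> str:
    """Build one log line: [<phase>] <ISO-8601 UTC timestamp> — <event>."""
    ts = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"[{phase}] {ts} — {event}"


def append_progress(log_path: Path, phase: str, event: str) -> None:
    """Append a single milestone line; mode "a" never truncates the file."""
    line = format_progress_line(phase, event, datetime.now(timezone.utc))
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

Opening the file in append mode mirrors the "never overwrite, never truncate" discipline stated in the template.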
@@ -168,16 +154,15 @@ Where the skill says to ask the user or wait for confirmation, the overrides abo
168
154
 
169
155
  1. **Verify the output artifact** exists (read the file)
170
156
  2. **Read the artifact** to extract status/verdict
171
- 3. **Reconcile the TaskList** — mark any still-`in_progress` tasks `completed` if the artifact shows they're done, or leave them in-progress and note the gap
157
+ 3. **Verify the progress log was updated** — `tail _progress.log` and confirm a phase-end line exists. If missing, log it as a gap but proceed with the artifact.
172
158
  4. **Check breakpoint** — if the current phase matches, pause
173
159
  5. **Decide next step** based on the artifact state
174
160
 
175
161
  ### When the user asks "status" / "what's up" while a subagent is running
176
162
 
177
163
  1. `tail` the last ~30 lines of `<workflow-dir>/<folder-name>/_progress.log`
178
- 2. Call `TaskList` and read which seeded tasks are `pending` / `in_progress` / `completed`
179
- 3. Report concisely: phase, step N of M, most recent log event, time since the last log line. If the last log line is more than ~5 minutes old, note that the agent may be in a long tool call or stuck.
180
- 4. Do NOT dispatch another agent, do NOT mutate files. Answering the user is read-only.
164
+ 2. Report concisely: phase, most recent log event, time since the last log line. If the last log line is more than ~5 minutes old, note that the agent may be in a long tool call or stuck.
165
+ 3. Do NOT dispatch another agent, do NOT mutate files. Answering the user is read-only.
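The status check described above, including the roughly five-minute staleness hint, could be approximated as follows. This sketch assumes the log follows the `[<phase>] <timestamp> — <event>` format. The function name `summarize_status` and the returned dictionary shape are illustrative assumptions.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

_LINE = re.compile(
    r"^\[(?P<phase>[^\]]+)\] (?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) — (?P<event>.*)$"
)


def summarize_status(
    log_text: str,
    now: datetime,
    stale_after: timedelta = timedelta(minutes=5),
) -> Optional[dict]:
    """Summarize the most recent well-formed log line, or None if there is none."""
    for raw in reversed(log_text.splitlines()):
        m = _LINE.match(raw.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
            age = now - ts
            return {
                "phase": m["phase"],
                "event": m["event"],
                "age": age,
                # An old last line may mean a long tool call or a stuck agent.
                "possibly_stuck": age > stale_after,
            }
    return None
```

The function only reads text it is given, which matches the read-only rule for status queries.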
181
166
 
182
167
  ---
183
168
 
@@ -185,15 +170,10 @@ Where the skill says to ask the user or wait for confirmation, the overrides abo
185
170
 
186
171
  **Skill file:** `.claude/skills/spec-writer/SKILL.md`
187
172
 
188
- **Pre-seed TaskList (call TaskCreate once each, capture IDs):**
189
- - `[spec] Read inputs and project config`
190
- - `[spec] Explore codebase / pick structural template`
191
- - `[spec] Draft 01-spec.md`
192
- - `[spec] Verify acceptance criteria are testable`
193
-
194
173
  **Dispatch mode:** foreground (spec runs are bounded — usually 5–15 min).
195
174
 
196
175
  **Task instructions:**
176
+
197
177
  ```
198
178
  Write a spec for this feature.
199
179
 
@@ -202,9 +182,12 @@ Input: <issue URL or free text description>
202
182
  Create the workflow folder: <workflow-dir>/<folder-name>/
203
183
  Write the spec as: <workflow-dir>/<folder-name>/01-spec.md
204
184
  DO NOT modify files outside the workflow folder. Writing the spec is the ONLY deliverable — no code, no migrations, no src/ changes.
185
+
186
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). At minimum: phase start, before/after each subtask, phase done. The phase-end line is required.
205
187
  ```
206
188
 
207
189
  **Autonomous overrides:**
190
+
208
191
  - Skip the ambiguity check's user questions — make reasonable decisions and document assumptions in the spec
209
192
  - Skip Step 8 ("Present and Refine") — write the spec and finish
210
193
  - If requirements are genuinely too vague to plan (no identifiable feature, contradictory requirements), write a message explaining why and stop
@@ -217,23 +200,25 @@ DO NOT modify files outside the workflow folder. Writing the spec is the ONLY de
217
200
 
218
201
  **Skill file:** `.claude/skills/implementation/SKILL.md`
219
202
 
220
- **Pre-seed TaskList:** First read `01-spec.md` in the orchestrator and parse the `## Implementation Steps` section to count `### Step N — <title>` entries. Then TaskCreate one task per spec step — subject: `[impl] Step N — <title>`. Add one trailing task: `[impl] Run lint / test / build and write 02-implementation.md`.
221
-
222
203
  **Dispatch mode:** **background** (`run_in_background: true`). Implementation is the longest phase; staying responsive matters.
223
204
 
224
205
  **Task instructions:**
206
+
225
207
  ```
226
208
  Implement the feature specified in the spec.
227
209
 
228
210
  Workflow folder: <workflow-dir>/<folder-name>/
229
211
  Read 01-spec.md for the implementation plan.
230
212
  Write the implementation report as 02-implementation.md in the same folder.
213
+
214
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). Implementation is long — log BEFORE and AFTER each Implementation Step from the spec. Phase-end line is required when 02-implementation.md is written.
231
215
  ```
232
216
 
233
217
  **Autonomous overrides:**
218
+
234
219
  - Skip Step 3 ("Present Summary and Get Confirmation") — begin implementing immediately after reading the spec
235
220
 
236
- **Gate:** After the completion notification fires, verify `02-implementation.md` exists and has a Status line. Reconcile the TaskList (any step left `in_progress` = the agent didn't finish it; read the artifact to confirm). If breakpoint is `implement`, pause here.
221
+ **Gate:** After the completion notification fires, verify `02-implementation.md` exists and has a Status line. Verify a phase-end line exists in `_progress.log`. If breakpoint is `implement`, pause here.
237
222
 
238
223
  ---
239
224
 
@@ -241,11 +226,10 @@ Write the implementation report as 02-implementation.md in the same folder.
241
226
 
242
227
  **Skill file:** `.claude/skills/qa-engineer/SKILL.md`
243
228
 
244
- **Pre-seed TaskList:** Read the spec's `## Acceptance Criteria` section and count criteria. TaskCreate one task per criterion — subject: `[qa] AC N — <short paraphrase>`. Add: `[qa] Run e2e suite` and `[qa] Write 03-qa*.md`.
245
-
246
229
  **Dispatch mode:** **background** (`run_in_background: true`). Playwright runs can be slow.
247
230
 
248
231
  **Task instructions:**
232
+
249
233
  ```
250
234
  Write and run e2e tests for the implemented feature.
251
235
 
@@ -253,11 +237,14 @@ Workflow folder: <workflow-dir>/<folder-name>/
253
237
  Read 01-spec.md for acceptance criteria.
254
238
  Read 02-implementation.md for what was built.
255
239
  Write the QA report as <03-qa.md or 03-qa-N.md> in the same folder.
240
+
241
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). At minimum: phase start, log before/after writing each scenario, before/after running the suite, phase done. The phase-end line is required.
256
242
  ```
257
243
 
258
244
  **Autonomous overrides:** None — the QA skill already runs without confirmation.
259
245
 
260
246
  **Gate:** After the agent returns, verify the QA artifact exists. If breakpoint is `qa`, pause here. Otherwise read its status:
247
+
261
248
  - **PASS** → proceed to review
262
249
  - **FAIL** → log it, proceed to review (the review will catch the implementation issue)
263
250
  - **PARTIAL** → proceed to review
@@ -268,16 +255,10 @@ Write the QA report as <03-qa.md or 03-qa-N.md> in the same folder.
268
255
 
269
256
  **Skill file:** `.claude/skills/review/SKILL.md`
270
257
 
271
- **Pre-seed TaskList:**
272
- - `[review] Read spec / implementation / QA artifacts`
273
- - `[review] Read the actual code (diff vs base branch)`
274
- - `[review] Check acceptance criteria coverage`
275
- - `[review] Check code quality / security / scope`
276
- - `[review] Write 04-review*.md with verdict`
277
-
278
258
  **Dispatch mode:** foreground (review is read-heavy but bounded).
279
259
 
280
260
  **Task instructions:**
261
+
281
262
  ```
282
263
  Review the implementation against the spec and QA results.
283
264
 
@@ -285,11 +266,14 @@ Workflow folder: <workflow-dir>/<folder-name>/
285
266
  Read all artifacts: 01-spec.md, 02-implementation.md, latest 03-qa*.md, any prior reviews.
286
267
  Read the actual code — do not trust the implementation report.
287
268
  Write the review as <04-review.md or 04-review-N.md> in the same folder.
269
+
270
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). At minimum: phase start, before/after each review dimension (spec compliance, code quality, QA assessment), phase done. The phase-end line is required.
288
271
  ```
289
272
 
290
273
  **Autonomous overrides:** None. The review skill's adversarial stance is non-negotiable. Never soften review criteria to avoid fix loops.
291
274
 
292
275
  **Gate:** After the agent returns, read the review artifact and extract the verdict. If breakpoint is `review`, pause here (regardless of PASS or FAIL). Otherwise:
276
+
293
277
  - **PASS** → proceed to Step 6 (ship)
294
278
  - **FAIL** → enter the fix loop (Step 5F)
295
279
 
@@ -299,14 +283,14 @@ Write the review as <04-review.md or 04-review-N.md> in the same folder.
299
283
 
300
284
  When review returns FAIL:
301
285
 
302
- 1. **Check iteration count** — count `04-review*.md` files in the workflow folder. If 10 exist, escalate: "Feature has failed review 10 times. Escalating for human judgment. Review history: [list all review files with their verdicts and key issues]."
286
+ 1. **Check iteration count** — count `04-review*.md` files in the workflow folder. If 3 exist, escalate: "Feature has failed review 3 times. Escalating for human judgment. Review history: [list all review files with their verdicts and key issues]."
303
287
 
304
288
  2. **Dispatch implementation agent in fix mode** — the implementation skill detects the FAIL review on startup. It reads the flagged issues, addresses only those issues, appends a "Fix Round N" section to `02-implementation.md`.
305
289
 
306
290
  Use the same dispatch pattern as Step 3, but:
307
- - **Pre-seed TaskList** by parsing the latest `04-review*.md` "Summary for Fix Mode" section; one task per flagged issue — subject: `[impl-fix-N] <issue title>`. Add a trailing `[impl-fix-N] Run checks + append Fix Round to 02-implementation.md`.
308
291
  - **Dispatch mode:** background (`run_in_background: true`).
309
292
  - Task instructions:
293
+
310
294
  ```
311
295
  The latest review has FAILED. Enter fix mode.
312
296
 
@@ -315,11 +299,13 @@ When review returns FAIL:
315
299
  Read 02-implementation.md for current state.
316
300
  Address only the issues the review flagged.
317
301
  Append a Fix Round section to 02-implementation.md — do NOT overwrite existing content.
302
+
303
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). Log before/after each flagged issue. The phase-end line is required.
318
304
  ```
319
305
 
320
- 3. **Dispatch QA agent** — re-runs QA, producing `03-qa-N.md`. Same dispatch (and pre-seed pattern) as Step 4.
306
+ 3. **Dispatch QA agent** — re-runs QA, producing `03-qa-N.md`. Same dispatch as Step 4.
321
307
 
322
- 4. **Dispatch review agent** — re-runs review, producing `04-review-N.md`. Same dispatch (and pre-seed pattern) as Step 5.
308
+ 4. **Dispatch review agent** — re-runs review, producing `04-review-N.md`. Same dispatch as Step 5.
323
309
 
324
310
  5. **Read verdict:**
325
311
  - **PASS** → proceed to Step 6
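The iteration check in step 1 of the fix loop amounts to counting review files against the new cap of 3. A minimal sketch, assuming all review artifacts live directly in the workflow folder; the names `review_count` and `should_escalate` are hypothetical, and `>=` is used so that a folder already past the cap also escalates.

```python
from pathlib import Path

REVIEW_CAP = 3  # lowered from 10 in this release


def review_count(workflow_folder: Path) -> int:
    """Count 04-review*.md files (04-review.md, 04-review-2.md, ...)."""
    return len(list(workflow_folder.glob("04-review*.md")))


def should_escalate(workflow_folder: Path, cap: int = REVIEW_CAP) -> bool:
    """True once the number of review artifacts has reached the cap."""
    return review_count(workflow_folder) >= cap
```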
@@ -331,15 +317,10 @@ When review returns FAIL:
331
317
 
332
318
  **Skill file:** `.claude/skills/ship/SKILL.md`
333
319
 
334
- **Pre-seed TaskList:**
335
- - `[ship] Stage changes`
336
- - `[ship] Commit`
337
- - `[ship] Push to remote`
338
- - `[ship] Open PR with assembled body`
339
-
340
320
  **Dispatch mode:** foreground (ship is quick).
341
321
 
342
322
  **Task instructions:**
323
+
343
324
  ```
344
325
  Ship the feature — commit, push, and create a PR.
345
326
 
@@ -347,9 +328,12 @@ Workflow folder: <workflow-dir>/<folder-name>/
347
328
  The branch already exists (created with the worktree). Use the current branch: <branch-name>
348
329
  The base branch is: <base-branch>
349
330
  Read all workflow artifacts to assemble the PR body.
331
+
332
+ Update the progress log at every milestone (see "Progress Log (MANDATORY)" above). At minimum: phase start, before/after staging, committing, pushing, opening PR, phase done with PR URL.
350
333
  ```
351
334
 
352
335
  **Autonomous overrides:**
336
+
353
337
  - Skip Step 4's confirmation gate — execute the full pipeline (stage → commit → push → PR) without stopping. The review PASS verdict is the authorization.
354
338
 
355
339
  **Gate:** After the agent returns, verify the PR was created. Extract the PR URL and number from the agent's response. If breakpoint is `ship`, pause here.
@@ -374,13 +358,14 @@ CI monitoring is lightweight polling — no codebase exploration needed. This ru
374
358
 
375
359
  When CI checks fail, dispatch a focused fix agent:
376
360
 
377
- 1. **Check iteration count** — if 10 CI fix attempts have already been made, escalate.
361
+ 1. **Check iteration count** — if 3 CI fix attempts have already been made, escalate.
378
362
 
379
363
  2. **Read failure logs** in the orchestrator — `gh run view <run-id> --log-failed --repo <owner>/<repo>` to get the failed job output. Extract the relevant error.
380
364
 
381
365
  3. **Dispatch a CI fix agent:**
382
366
 
383
367
  No skill file — this is a focused, self-contained prompt:
368
+
384
369
  ```
385
370
  You are a CI fix agent. A CI check has failed on a pull request. Your job is to make the minimal code change to fix the failure.
386
371
 
@@ -430,14 +415,14 @@ Write `05-indie-summary.md` in the workflow folder:
430
415
 
431
416
  ## Phases Completed
432
417
 
433
- | Phase | Artifact | Status | Iterations |
434
- |-------|----------|--------|------------|
435
- | Spec | 01-spec.md | Done | 1 |
436
- | Implementation | 02-implementation.md | Done | 1 (+ N fix rounds) |
437
- | QA | 03-qa.md — 03-qa-N.md | PASS | N |
438
- | Review | 04-review.md — 04-review-N.md | PASS | N |
439
- | Ship | PR #<number> | Created | 1 |
440
- | CI | <check names> | Pass | N attempts |
418
+ | Phase | Artifact | Status | Iterations |
419
+ | -------------- | ----------------------------- | ------- | ------------------ |
420
+ | Spec | 01-spec.md | Done | 1 |
421
+ | Implementation | 02-implementation.md | Done | 1 (+ N fix rounds) |
422
+ | QA | 03-qa.md — 03-qa-N.md | PASS | N |
423
+ | Review | 04-review.md — 04-review-N.md | PASS | N |
424
+ | Ship | PR #<number> | Created | 1 |
425
+ | CI | <check names> | Pass | N attempts |
441
426
 
442
427
  ## Review Loop Summary
443
428
 
@@ -459,6 +444,7 @@ Write `05-indie-summary.md` in the workflow folder:
459
444
  ```
460
445
 
461
446
  After writing the summary:
447
+
462
448
  1. Stage `05-indie-summary.md`: `git add <workflow-dir>/<folder-name>/05-indie-summary.md`
463
449
  2. Commit: `docs: add indie agent run summary for <feature-name>`
464
450
  3. Push to the same branch
@@ -470,16 +456,17 @@ Present the final report to the user: PR URL, check status, iteration counts, an
470
456
  ## Constraints
471
457
 
472
458
  **DO:**
459
+
473
460
  - Dispatch every phase to a subagent — never write code, tests, or reviews in the orchestrator
474
461
  - Read the skill file from disk before each dispatch — always use the latest version
475
- - Pre-seed a TaskList for every phase and embed the task IDs + progress-log path in the agent prompt
476
462
  - Dispatch implementation, QA, and fix-loop phases with `run_in_background: true` so the orchestrator stays responsive
477
463
  - Verify the output artifact after every agent returns before proceeding
478
464
  - Create a dedicated worktree for each new feature
479
465
  - Use short, scannable folder names in `wt/` (timestamp goes in the branch name, not the directory)
480
466
  - Check artifact state before each phase — never re-run completed phases
481
- - On "status" queries from the user while a subagent is running, read `_progress.log` + TaskList — never peek into the subagent's thinking (you can't)
482
- - Respect the 10-iteration cap on both the review loop and the CI fix loop
467
+ - On "status" queries from the user while a subagent is running, read `_progress.log` — never peek into the subagent's thinking (you can't)
468
+ - Reinforce the progress-log mandate in every phase's task instructions — the log is the only signal the orchestrator has
469
+ - Respect the 3-iteration cap on both the review loop and the CI fix loop
483
470
  - Escalate with full context when hitting a cap or an unrecoverable error
484
471
  - Keep CI fixes minimal and scoped — fix only what CI flagged
485
472
  - Preserve the full audit trail — all review files, QA re-runs, and fix rounds are kept
@@ -487,11 +474,12 @@ Present the final report to the user: PR URL, check status, iteration counts, an
487
474
  - Run CI monitoring directly in the orchestrator (it's just polling)
488
475
 
489
476
  **DON'T:**
477
+
490
478
  - Perform skill work in the orchestrator — no code writing, no test writing, no reviewing
491
479
  - Ask the user anything during execution — the only interaction points are the initial input, breakpoints (if set), and the final report (or escalation)
492
480
  - Modify the review or QA skills' behavior — their independence is the quality gate
493
481
  - Skip phases — every phase runs, even if the code "looks fine"
494
- - Continue after 10 review FAILs or 10 CI fix failures — escalate, don't loop forever
482
+ - Continue after 3 review FAILs or 3 CI fix failures — escalate, don't loop forever
495
483
  - Re-run completed phases on resume — read existing artifacts and pick up where you left off
496
484
  - Make large code changes during CI fixes — if the fix is architectural, escalate
497
485
  - Rewrite the spec after it's written — review issues are addressed in implementation fix mode
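The 3-iteration cap and the "escalate, don't loop forever" rule in the constraints above can be sketched as a bounded loop. This is a minimal illustration, not the orchestrator's actual code: `runCappedLoop`, `LoopResult`, and `Verdict` are hypothetical names, and the `attempt` callback stands in for dispatching a review or CI fix round to a subagent.

```typescript
// Hypothetical sketch: none of these names come from the crew package.
type Verdict = "PASS" | "FAIL";

interface LoopResult {
  status: "passed" | "escalated";
  iterations: number;
  history: Verdict[]; // every round is kept, mirroring the audit-trail rule
}

// Cap stated in the 0.11.0 constraints for both the review and CI fix loops.
const MAX_ITERATIONS = 3;

function runCappedLoop(attempt: (iteration: number) => Verdict): LoopResult {
  const history: Verdict[] = [];
  for (let i = 1; i <= MAX_ITERATIONS; i++) {
    const verdict = attempt(i);
    history.push(verdict);
    if (verdict === "PASS") {
      return { status: "passed", iterations: i, history };
    }
  }
  // Cap reached: hand back the full history so the escalation has context.
  return { status: "escalated", iterations: MAX_ITERATIONS, history };
}
```

Returning the history alongside the status keeps the escalation path and the audit trail in one value, so the caller never has to reconstruct what happened in earlier rounds.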
@@ -17,13 +17,7 @@ Activate when called from the `/refactor` command. Otherwise ignore.
17
17
 
18
18
  ## Input Handling
19
19
 
20
- `$ARGUMENTS` may be:
21
-
22
- - **Empty** — enter analyze mode: scan the codebase for bad patterns and produce a report
23
- - **`analyze`** — same as empty, explicit analyze mode
24
- - **`analyze <path>`** — analyze mode scoped to a directory
25
- - **A pattern description** (free text) — enter refactor mode: find all instances of this pattern and fix them. Example: `"API calls use raw fetch with try/catch instead of a wrapper function"`
26
- - **A reference to an analysis report entry** — enter refactor mode for a specific finding. Example: `"#3 from codebase patterns report"` or a path to a report file
20
+ Take whatever was passed and infer the mode: empty or a directory path runs analyze mode (scan the codebase for bad patterns and write a report); a pattern description or a reference to an analysis report entry runs refactor mode (find all instances and fix them).
27
21
 
28
22
  ---
29
23
 
@@ -9,9 +9,9 @@ description: Owns the e2e tree end-to-end. Reads the spec and Gherkin .feature f
9
9
 
10
10
  You are a QA engineer who owns the project's end-to-end testing surface. You read the spec, study the implementation, **route each acceptance criterion to the right venue** (Gherkin scenario, lint rule, unit test, or implementation check-result), **extend the project's `.feature` files** where the criterion is user-observable, **implement scenarios** in the project's e2e framework, run them, and produce a structured QA report.
11
11
 
12
- You test what the spec promised, not what the implementation claims it did. You also restrain what enters the e2e suite — quality over quantity is binding, and a realistic user flow is always more valuable than per-feature exhaustion.
12
+ You test what the spec promised, not what the implementation claims it did.
13
13
 
14
- **Scope:** you own all e2e artifacts — `.feature` files, `.spec.ts` (or equivalent) files in the e2e tree, page objects, fixtures, and e2e helpers. The implementation skill never touches them. The spec-writer authors `.feature` files at the project level (bootstrap) and proposes per-feature extensions; you implement and may further extend when implementation surfaces a scenario the spec didn't anticipate.
14
+ **Scope:** you own all e2e artifacts — `.feature` files, `.spec.ts` (or equivalent) files in the e2e tree, page objects, fixtures, and e2e helpers. The implementation skill never touches them. Spec-writer walks the project's Gherkin coverage map and authorizes `.feature` files (extension or new) in Gherkin Impact; you implement them and may further extend when implementation surfaces a scenario the spec didn't anticipate.
15
15
 
16
16
  ## When to Apply
17
17
 
@@ -21,11 +21,7 @@ Activate when called from the `/qa` command. Otherwise ignore.
21
21
 
22
22
  ## Input Handling
23
23
 
24
- `$ARGUMENTS` may be:
25
-
26
- - A **folder name** (e.g. `20260413-1423-dark-mode`)
27
- - A **path** to the workflow folder
28
- - **Empty** — auto-detect: scan the workflow directory for folders that have `02-implementation.md` but no `03-qa.md`, or where the latest review is FAIL (QA needs to re-run after fix mode). If exactly one exists, use it. If multiple, list and ask. If none, tell the user there are no implementations ready for QA.
24
+ Take whatever was passed — workflow folder name, path, or empty to auto-detect (one folder with `02-implementation.md` and no `03-qa.md`, or where the latest review is FAIL; ask if multiple).
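The auto-detect rule above can be sketched as a pure function over a directory listing. This is only an illustration under stated assumptions: `WorkflowFolder`, `Detection`, and `detectQaTarget` are hypothetical names, the listing is passed in rather than read from disk, and the rule is read as "has `02-implementation.md` and either lacks `03-qa.md` or has a FAIL as its latest review". The `none` branch follows the pre-0.11.0 wording, which told the user there were no implementations ready for QA.

```typescript
// Hypothetical sketch: the skill text does not define these types.
interface WorkflowFolder {
  name: string;
  files: string[]; // file names directly inside the workflow folder
  latestReviewVerdict?: "PASS" | "FAIL"; // undefined when no review exists yet
}

type Detection =
  | { kind: "use"; folder: string }
  | { kind: "ask"; candidates: string[] }
  | { kind: "none" };

function detectQaTarget(folders: WorkflowFolder[]): Detection {
  const candidates = folders
    .filter((f) => {
      const implemented = f.files.includes("02-implementation.md");
      const hasQa = f.files.includes("03-qa.md");
      // Ready for QA: never tested, or sent back by a failed review.
      return implemented && (!hasQa || f.latestReviewVerdict === "FAIL");
    })
    .map((f) => f.name);

  const [only] = candidates;
  if (candidates.length === 1 && only !== undefined) {
    return { kind: "use", folder: only };
  }
  if (candidates.length > 1) {
    return { kind: "ask", candidates }; // list the folders and ask the user
  }
  return { kind: "none" };
}
```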
29
25
 
30
26
  ---
31
27
 
@@ -48,11 +44,11 @@ Activate when called from the `/qa` command. Otherwise ignore.
48
44
 
49
45
  Read the spec first, then the implementation, then the project's Gherkin source of truth. Do not start from the implementation report.
50
46
 
51
- 1. **Read `01-spec.md`** — extract the acceptance criteria *and* the "Gherkin Impact" section if present. ACs are the contract; Gherkin Impact tells you which `.feature` files spec-writer expects you to extend and how.
47
+ 1. **Read `01-spec.md`** — extract the acceptance criteria *and* the "Gherkin Impact" section. ACs are the contract; Gherkin Impact tells you which `.feature` files spec-writer expects you to extend or create. **If the spec has no `Gherkin Impact` section, stop and warn**: *"Spec is missing the required `Gherkin Impact` section. Re-run `/spec` (edit mode) to add it before QA can proceed."* Resume only when the spec is updated.
52
48
  2. **Read `02-implementation.md`** — understand what was built, what files were created/modified, any deviations. Note the status (DONE / DONE_WITH_CONCERNS / BLOCKED).
53
49
  3. **Read the actual code** — don't rely on the implementation report alone. Read the key files that were created or modified to understand the actual behavior.
54
50
  4. **Read CLAUDE.md** — load project conventions and e2e testing patterns.
55
- 5. **Read the project's `features/*.feature` files** — these are the e2e source of truth. Identify the file(s) that cover the capability being tested. If `features/` does not exist, **stop and warn**: *"No `.feature` files found. Project needs a one-time bootstrap pass to seed `features/`. The qa-engineer skill operates on top of an existing Gherkin baseline; it cannot proceed without one."* Resume only when the user confirms how to handle this (proceed without Gherkin for a one-off, or pause to bootstrap).
51
+ 5. **Read the project's `features/*.feature` files** — these are the e2e source of truth. List what's there and identify the file(s) that cover the capability being tested. If `features/` doesn't exist or the capability has no `.feature` file, that's normal — Gherkin Impact tells you whether to extend, create, or route away. If Gherkin Impact authorizes neither extension nor creation and yet the AC contains user-observable behaviour, **stop and warn**: *"Capability has no `.feature` file and Gherkin Impact does not authorize creating one. Spec-writer must either route the user-observable AC to an existing journey, authorize a new `.feature` file, or document why every AC routes away from Gherkin (lint / unit / impl check-result). Re-run `/spec` (edit mode) to fix Gherkin Impact."* Falling back to a bare `.spec.ts` is **not** an option.
56
52
 
57
53
  If the implementation status is BLOCKED, warn: "The implementation is marked as BLOCKED. QA may not be meaningful until blocking issues are resolved. Proceed anyway?"
58
54
 
@@ -116,7 +112,8 @@ For each AC routed to a Gherkin scenario, decide *how* to land it. The order mat
116
112
 
117
113
  1. **`Scenario Outline` row addition** — the journey already exists; add a row to `Examples:` for the new input variant. *Cheapest. Almost always correct when the new feature is "the same flow with different data."*
118
114
  2. **`And`-step addition to an existing scenario** — the journey already exists; the new feature adds an assertion or step in the middle. *Use when the user-visible flow is unchanged but a new check is needed.*
119
- 3. **New scenario:** *last resort.* Only when no existing scenario fits the user journey. Justify in the QA report's coverage table with a one-line reason.
115
+ 3. **New scenario in an existing file:** when no existing scenario fits the user journey, but the capability has a `.feature` file that anchors it. Justify in the QA report's coverage table with a one-line reason.
116
+ 4. **New `.feature` file** — *only when Gherkin Impact authorized it.* Create `features/<capability>.feature` matching the project's existing `.feature` style (tags, scenario-ID prefixes, Background patterns). The QA report's `.feature Extensions` section names the new file as a creation, not an extension.
120
117
 
121
118
  **Conventions (match existing project usage exactly):**
122
119
 
@@ -126,8 +123,6 @@ For each AC routed to a Gherkin scenario, decide *how* to land it. The order mat
126
123
 
127
124
  **Spec-writer's "Gherkin Impact" is your starting point.** Implement the extensions it lists. If implementation surfaces a scenario the spec didn't anticipate (an edge case discovered while writing the test, an interaction with another capability), you may add it — note the addition in the QA report so spec-writer's intent stays visible.
128
125
 
129
- **Restraint is part of the deliverable.** If you find yourself adding a fourth or fifth scenario for one feature, stop and ask: are these distinct user-observable behaviours, or am I rewriting AC bloat in Gherkin? Collapse where you can.
130
-
131
126
  ---
132
127
 
133
128
  ## Step 5b — Implement Scenarios in Test Files
@@ -205,16 +200,21 @@ Create `03-qa.md` (or `03-qa-N.md` for re-runs) in the workflow folder:
205
200
 
206
201
  > Routing rule: Gherkin scenario for user-observable behaviour; lint rule for structural/internal contracts; unit test for pure logic (delegated to impl); impl check-result for one-time invariants.
207
202
 
208
- ## .feature Extensions
203
+ ## .feature Extensions and Creations
209
204
 
210
- For each `.feature` file affected, list what was added:
205
+ For each `.feature` file affected, list what was added (or what the file is, if newly created):
211
206
 
212
- ### `features/<file>.feature`
207
+ ### `features/<file>.feature` *(EXISTING)*
213
208
 
214
209
  - **Outline rows added:** `<scenario title>` gained <N> rows in `Examples:` for <input variants>
215
210
  - **`And`-step additions:** `<scenario title>` — added *"And <step>"* under <Given/When/Then>
216
211
  - **New scenarios:** `<HP-N | ER-N | EC-N | RG-N> - <title>`. Reason for being new: <why no existing scenario could be extended>
217
212
 
213
+ ### `features/<new-capability>.feature` *(NEW — created this run, authorized by Gherkin Impact)*
214
+
215
+ - **Reason for new file:** <quote spec-writer's "New file justification" — capability has no existing journey AND is genuinely user-observable; closest existing capability and why it didn't fit>
216
+ - **Initial scenarios:** `<HP-N> - <title>`, `<ER-N> - <title>`, etc.
217
+
218
218
  ## Scenarios Deliberately Not Added
219
219
 
220
220
  <List ACs that *could* have been e2e but were intentionally not, with one-line reasons. Example:>
@@ -288,7 +288,7 @@ Present:
288
288
  **DO:**
289
289
  - Read the spec's acceptance criteria *and* the project's `.feature` files before reading the implementation
290
290
  - Route each AC to its correct venue (Gherkin scenario / lint rule / unit test / impl check-result) — not every AC is e2e
291
- - Prefer extending existing scenarios over adding new ones: `Scenario Outline` rows first, `And`-step extensions second, new scenarios last
291
+ - Prefer extending existing scenarios over adding new ones: `Scenario Outline` rows first, `And`-step extensions second, new scenarios in existing files third, and a new `.feature` file *only* when Gherkin Impact authorized it (capability has no existing journey)
292
292
  - Use scenario-ID prefixes (`HP-N` / `ER-N` / `EC-N` / `RG-N`); reflect them in test names
293
293
  - Write each `test(...)` block with a scenario-title comment above it and Gherkin-step comments inline
294
294
  - Use behavioural assertions only — page interactions, API calls, real-time channels
@@ -309,8 +309,8 @@ Present:
309
309
  - Write tests that depend on implementation internals rather than user-visible behavior
310
310
  - Use AC labels (`AC<N>`) in test names, file names, or scenario titles — AC traceability lives in the coverage table only
311
311
  - Import `fs`, `path` (for source paths), `child_process`, or any module that reads project source code from inside `.spec.ts` — these reach for source-file inspection, which is not e2e
312
+ - Create a `.spec.ts` (or framework equivalent) without a sibling `.feature` it implements — every runner maps 1:1 to a `.feature` file. Net-new `.spec.ts` outside the runner pattern is not allowed; if no `.feature` covers the capability, create one (when Gherkin Impact authorized it) or route the AC to a non-Gherkin venue
312
313
  - Write N parallel `Scenario` blocks when one `Scenario Outline` with `Examples:` would do — parameterise
313
- - Add a scenario "for completeness" — restraint is part of the deliverable; coverage is verified in the report, not by scenario count
314
314
 
315
315
  ---
316
316
 
@@ -323,8 +323,10 @@ If you catch yourself thinking any of these, stop:
323
323
  - "I need to read a source file to verify this AC" — STOP. Hard tripwire. The AC is not e2e. Pick a different venue.
324
324
  - "All tests pass, so QA is done" — STOP. Passing tests can be stubs. Run the substance check.
325
325
  - "I'll write a quick `expect(true)` to get this passing" — STOP. That's a stub. Write a real assertion.
326
+ - "This single flow reads more naturally as plain Playwright than as Gherkin / Scenario Outline" — STOP. That's the rationalization pattern. If the AC is user-observable, it lands in a `.feature` file. "Naturalness" is not a sanctioned venue. The only way out of Gherkin is routing to lint / unit / impl check-result with the AC's nature justifying the route.
327
+ - "I'll write a `.spec.ts` and skip the `.feature` because the flow is small / one-off / a quick disabled-state check" — STOP. Runner-without-feature is not allowed. Either extend an existing `.feature`, create a new one (when Gherkin Impact authorized it), or route the AC away from Gherkin. There is no fourth option.
328
+ - "The capability has no `.feature` file but spec-writer didn't authorize creating one — I'll just put it in a bare `.spec.ts`" — STOP. Stop and warn the user; spec-writer must update Gherkin Impact. Do not paper over a missing authorization with a bare runner.
326
329
  - "Every AC needs its own scenario" — STOP. Multiple ACs collapse into one journey scenario; some ACs route away from e2e entirely.
327
- - "I'll add another scenario for completeness" — STOP. Justify it as a distinct user-observable behaviour or don't add it. Restraint is part of the deliverable.
328
330
  - "I'll write N parallel scenarios for N variants of the same flow" — STOP. Use `Scenario Outline` with `Examples:`.
329
331
  - "I'll name this test `AC10: ...`" — STOP. AC labels do not appear in test or scenario names. Use `HP-N` / `ER-N` / `EC-N` / `RG-N`. AC traceability is the coverage table's job.
330
332
  - "The existing e2e tests use a different pattern but mine is better" — STOP. Follow existing patterns. Consistency matters.
@@ -19,11 +19,7 @@ Activate when called from the `/review` command. Otherwise ignore.
19
19
 
20
20
  ## Input Handling
21
21
 
22
- `$ARGUMENTS` may be:
23
-
24
- - A **folder name** (e.g. `20260413-1423-dark-mode`)
25
- - A **path** to the workflow folder
26
- - **Empty** — auto-detect: scan the workflow directory for folders that have `02-implementation.md` and (ideally) `03-qa.md` but no `04-review.md`. If exactly one exists, use it. If multiple, list and ask. If none, tell the user there are no implementations ready for review.
22
+ Take whatever was passed — workflow folder name, path, or empty to auto-detect (one folder with `02-implementation.md` and ideally `03-qa.md` but no `04-review.md`; ask if multiple).
27
23
 
28
24
  ---
29
25
 
@@ -19,11 +19,7 @@ Activate when called from the `/ship` command. Otherwise ignore.
19
19
 
20
20
  ## Input Handling
21
21
 
22
- `$ARGUMENTS` may be:
23
-
24
- - A **folder name** (e.g. `20260413-1423-dark-mode`)
25
- - A **path** to the workflow folder
26
- - **Empty** — auto-detect: scan the workflow directory for folders that have a PASS review (latest `04-review*.md` with PASS verdict) but no PR yet created. If exactly one exists, use it. If multiple, list and ask.
22
+ Take whatever was passed — workflow folder name, path, or empty to auto-detect (one folder with a PASS review and no PR yet; ask if multiple).
27
23
 
28
24
  ---
29
25
 
@@ -7,7 +7,7 @@ description: Analyzes requirements and explores the codebase to produce an imple
7
7
 
8
8
  ## Role
9
9
 
10
- You are a senior software architect producing implementation specs. You analyze requirements, explore the codebase, and write detailed, actionable specs that another agent (or human) can follow to implement a feature end-to-end.
10
+ You are a world-class senior software architect who produces specification documents. You analyze requirements, explore the codebase, and write detailed, actionable specs that another agent can follow to implement a feature end-to-end.
11
11
 
12
12
  You are concise but thorough. You make decisions — you don't list alternatives.
13
13
 
@@ -17,28 +17,9 @@ Activate when called from the `/spec` command. Otherwise ignore.
17
17
 
18
18
  ---
19
19
 
20
- ## Input Handling
20
+ ## Step 1 — Read the Input
21
21
 
22
- `$ARGUMENTS` may be:
23
-
24
- - A **GitHub issue URL** (e.g. `https://github.com/org/repo/issues/42`)
25
- - **Free text** describing what to build or fix
26
- - A **path to an existing spec** (e.g. `_workflow/20260413-1423-dark-mode/01-spec.md`) — enters **edit mode**
27
- - **Empty** — ask the user: "What would you like me to spec? Describe a feature or paste a GitHub issue URL."
28
-
29
- ---
30
-
31
- ## Step 1 — Parse Input
32
-
33
- **If a GitHub issue URL:**
34
- 1. Extract org, repo, and issue number from the URL
35
- 2. Fetch the issue: `gh issue view <number> --repo <org>/<repo>`
36
- 3. Fetch comments: `gh issue view <number> --repo <org>/<repo> --comments`
37
- 4. Use the title + body + comments as the requirements source
38
-
39
- **If free text:** Use it directly as the requirements source.
40
-
41
- **If a path to an existing spec:** Enter edit mode (see Edit Mode section below).
22
+ Take whatever was passed as the requirements source: a brief, a free-text description, a GitHub issue URL, or a path to an existing spec. If a GitHub issue URL, fetch it (`gh issue view <number> --repo <org>/<repo> --comments`) and use title + body + comments. If a path to an existing spec, enter edit mode (see Edit Mode section below). If nothing was passed, ask once.
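The GitHub issue branch of this step can be sketched as follows. The `gh issue view <number> --repo <org>/<repo> --comments` invocation is taken from the text above; everything else (`IssueRef`, `parseIssueUrl`, `ghIssueViewArgs`, and the exact URL pattern) is a hypothetical illustration that assumes plain `https://github.com/<org>/<repo>/issues/<number>` URLs.

```typescript
// Hypothetical helper names; only the gh command shape comes from the skill text.
interface IssueRef {
  org: string;
  repo: string;
  number: number;
}

// Returns null for anything that is not a github.com issue URL, so the caller
// can fall through to the free-text or existing-spec branches.
function parseIssueUrl(input: string): IssueRef | null {
  const pattern =
    /^https:\/\/github\.com\/([^/\s]+)\/([^/\s]+)\/issues\/(\d+)\/?(?:[?#].*)?$/;
  const m = pattern.exec(input.trim());
  if (!m) return null;
  return { org: m[1], repo: m[2], number: Number(m[3]) };
}

// Argument vector for: gh issue view <number> --repo <org>/<repo> --comments
function ghIssueViewArgs(ref: IssueRef): string[] {
  return [
    "issue",
    "view",
    String(ref.number),
    "--repo",
    `${ref.org}/${ref.repo}`,
    "--comments",
  ];
}
```

Building an argument array rather than a shell string avoids quoting problems if the result is later passed to a process-spawning API.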
42
23
 
43
24
  ---
44
25
 
@@ -47,17 +28,17 @@ Activate when called from the `/spec` command. Otherwise ignore.
47
28
  1. Read the project's `CLAUDE.md`
48
29
  2. Find the `## Workflow Config` section and parse the key-value table:
49
30
 
50
- | Key | Used by |
51
- |-----|---------|
52
- | `workflow-dir` | spec-writer, all skills |
53
- | `test-cmd` | implementation, ship |
54
- | `lint-cmd` | implementation, ship |
55
- | `build-cmd` | implementation, ship |
56
- | `e2e-cmd` | qa-engineer |
57
- | `e2e-framework` | qa-engineer |
58
- | `tdd` | implementation |
59
- | `branch-prefix` | ship |
60
- | `base-branch` | ship |
31
+ | Key | Used by |
32
+ | --------------- | ----------------------- |
33
+ | `workflow-dir` | spec-writer, all skills |
34
+ | `test-cmd` | implementation, ship |
35
+ | `lint-cmd` | implementation, ship |
36
+ | `build-cmd` | implementation, ship |
37
+ | `e2e-cmd` | qa-engineer |
38
+ | `e2e-framework` | qa-engineer |
39
+ | `tdd` | implementation |
40
+ | `branch-prefix` | ship |
41
+ | `base-branch` | ship |
61
42
 
62
43
  3. If the `## Workflow Config` section doesn't exist, **stop and warn the user**: "No Workflow Config found in CLAUDE.md. Run `/adjust` to set up the project for this workflow."
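Reading the `## Workflow Config` table can be sketched as below. This assumes the section in `CLAUDE.md` is a two-column Markdown table of key and value, as in the spec template's own Workflow Config table; `parseWorkflowConfig` is a hypothetical name, and values that themselves contain a `|` character are not handled.

```typescript
// Hypothetical sketch: returns null when the section is missing so the caller
// can stop and warn, as step 3 above requires.
function parseWorkflowConfig(claudeMd: string): Map<string, string> | null {
  const lines = claudeMd.split(/\r?\n/);
  const start = lines.findIndex((l) => l.trim() === "## Workflow Config");
  if (start === -1) return null;

  const config = new Map<string, string>();
  for (const line of lines.slice(start + 1)) {
    if (/^#{1,6}\s/.test(line)) break; // the next heading ends the section
    // A row "| a | b |" splits into ["", "a", "b", ""].
    const cells = line.split("|").map((c) => c.trim());
    if (cells.length < 4) continue;
    const key = cells[1].replace(/`/g, "");
    const value = cells[2];
    const isHeader = key.toLowerCase() === "key";
    const isSeparator = /^:?-+:?$/.test(key);
    if (key === "" || isHeader || isSeparator) continue;
    config.set(key, value);
  }
  return config;
}
```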
63
44
 
@@ -73,7 +54,7 @@ Before exploring the codebase, spend 30 seconds on a sanity check:
73
54
  - Are there obvious unknowns — missing info, ambiguous scope, contradictory requirements?
74
55
  - What is the likely complexity? (Bug fix / small feature / large feature)
75
56
 
76
- **If there are blockers:** Surface 1–3 targeted questions to the user. Do not proceed until the requirements are clear enough to explore the right areas of the codebase.
57
+ **If there are blockers:** Make a reasonable decision for each one and document the assumption in the spec's Decisions section. Don't pause and ask — record the call clearly so the user can correct it on review.
77
58
 
78
59
  **If requirements are clear:** Move on. This is a brief gate, not a discussion phase.
79
60
 
@@ -93,21 +74,17 @@ Do not prescribe a fixed search strategy. Every codebase is shaped differently.
93
74
 
94
75
  ---
95
76
 
96
- ## Step 4b — Survey `.feature` Files
77
+ ## Step 4b — Cover the User Journey
97
78
 
98
- If the project has user-visible behaviour (most do), check the project's `features/` directory for Gherkin `.feature` files; they are the source of truth for e2e scenarios.
79
+ `.feature` files are user journeys. Read `features/*.feature` and find the journey this PR touches.
99
80
 
100
- 1. **List `features/*.feature`.** If the directory does not exist or is empty, warn the user: *"No `.feature` files found. Project needs a one-time bootstrap pass to seed `features/` from the application's user-facing capabilities. Continue without Gherkin Impact, or pause to bootstrap?"*
101
- 2. **Identify affected files.** For the feature being spec'd, name the `.feature` file(s) that cover the capability it touches. One feature usually maps to one (sometimes two) existing `.feature` files, never to a brand-new file.
102
- 3. **Determine the extension shape.** For each affected file, decide how the spec extends it:
103
- - **`Scenario Outline` row addition** — the journey already exists, just needs another data row.
104
- - **`And`-step addition to an existing scenario** — the journey already exists, the new feature adds an assertion or step.
105
- - **New scenario** — *last resort.* Only when no existing scenario fits the user journey, and the feature truly introduces a new user-observable behaviour.
106
- 4. **Surface prune candidates.** If the feature retires capability, name scenarios likely to become obsolete. The human decides actual deletion.
81
+ - **Journey already there:** extend its `.feature`. Preference: `Scenario Outline` row > `And`-step > new scenario in the same file.
82
+ - **Journey not there yet:** think from a user perspective. What journey does this code serve? Add `features/<journey>.feature` and authorize it in Gherkin Impact.
83
+ - **No user-observable surface** (dependency bump, internal refactor): route the AC to lint / unit / impl check-result.
107
84
 
108
- This survey feeds the spec's "Gherkin Impact" section (Step 7).
85
+ **Venue set (closed):** `{Gherkin scenario, lint rule, unit test, impl check-result}`. Plain Playwright `.spec.ts` outside `features/` is not a venue. Every `.spec.ts` has a sibling `.feature`.
109
86
 
110
- > **No new `.feature` files at the per-feature level.** New `.feature` files are bootstrap territory. Per-feature work extends what exists.
87
+ This feeds Gherkin Impact (Step 7).
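The "every `.spec.ts` has a sibling `.feature`" rule lends itself to a mechanical check. The sketch below is one possible reading, not something the skill prescribes: it assumes "sibling" means the same directory and the same base name (for example `features/login.feature` next to `features/login.spec.ts`), and `findOrphanRunners` is a hypothetical name. A project with a different layout would need a different pairing rule.

```typescript
// Hypothetical check: lists runner files that have no matching .feature file.
// Assumption: a runner and its feature share directory and base name.
function findOrphanRunners(paths: string[]): string[] {
  const featureStems = new Set(
    paths
      .filter((p) => p.endsWith(".feature"))
      .map((p) => p.slice(0, -".feature".length)),
  );
  return paths
    .filter((p) => p.endsWith(".spec.ts"))
    .filter((p) => !featureStems.has(p.slice(0, -".spec.ts".length)));
}
```

An empty result means every runner maps to a `.feature` file under this pairing rule; any path returned is a bare runner that the closed venue set does not allow.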
111
88
 
112
89
  ---
113
90
 
@@ -116,6 +93,7 @@ This survey feeds the spec's "Gherkin Impact" section (Step 7).
116
93
  Based on complexity detected in steps 3–4, choose a depth:
117
94
 
118
95
  ### Lightweight (bug fixes, small changes — touches 1–3 files)
96
+
119
97
  - Context: 2–3 sentences
120
98
  - Current State: brief, just the affected files
121
99
  - Implementation Steps: 1–3 steps, can be terse
@@ -123,9 +101,11 @@ Based on complexity detected in steps 3–4, choose a depth:
123
101
  - Skip Patterns to Follow
124
102
 
125
103
  ### Standard (typical features — touches 4–10 files)
104
+
126
105
  - Full format (see Step 7)
127
106
 
128
107
  ### Deep (large features, new subsystems — touches 10+ files or creates new patterns)
108
+
129
109
  - Full format + High-Level Approach section before Implementation Steps
130
110
  - More detailed Current State documenting relevant architecture
131
111
  - Acceptance criteria grouped by area
@@ -165,10 +145,12 @@ Why this is needed. 2–3 sentences. Include relevant discussion from issue comm
165
145
  ## Requirements
166
146
 
167
147
  What must be true when this is done:
148
+
168
149
  - Requirement 1
169
150
  - Requirement 2
170
151
 
171
152
  **Out of scope:**
153
+
172
154
  - What this explicitly does NOT cover
173
155
 
174
156
  ## Current State
@@ -203,34 +185,40 @@ What to replicate from the template and what differs for this feature.
203
185
 
204
186
  ## Gherkin Impact
205
187
 
206
- (Skip if the project has no `features/` directory; flag a bootstrap need instead.)
207
-
208
188
  **Affected `.feature` files:**
209
- - `features/<file>.feature` — <one-line capability summary>
189
+
190
+ - `features/<file>.feature` — <one-line capability summary> _(EXISTING — extension)_
191
+ - `features/<new-capability>.feature` — <one-line capability summary> _(NEW FILE — see "New file justification" below)_
210
192
 
211
193
  **Extensions:**
194
+
212
195
  - **Outline rows:** `<scenario title>` gets a new row in `Examples:` for `<input variant>`
213
- - **`And`-step additions:** `<scenario title>` gains *"And <new assertion>"* under <Given/When/Then>
214
- - **New scenarios** (only when no existing scenario fits): `<HP-N | ER-N | EC-N | RG-N> - <title>` in `<file>.feature`. Reason: <why no existing scenario could be extended>
196
+ - **`And`-step additions:** `<scenario title>` gains _"And <new assertion>"_ under <Given/When/Then>
197
+ - **New scenarios in existing files** (when no existing scenario fits): `<HP-N | ER-N | EC-N | RG-N> - <title>` in `<file>.feature`. Reason: <why no existing scenario could be extended>
198
+
199
+ **New file justification** (required if any `.feature` file is being created):
200
+
201
+ - `features/<new-capability>.feature` is correct because the capability has no existing journey in `features/` AND is genuinely user-observable. Closest existing capability: `<existing-file>.feature` covers `<X>`, which is a different journey because `<reason>`. Initial scenarios: `<HP-N>`, `<ER-N>`, etc.
215
202
 
216
203
  **Prune candidates** (capability being retired):
204
+
217
205
  - `<scenario title>` in `<file>.feature` — likely obsolete because <reason>. Human decides removal.
218
206
 
219
207
  ## Workflow Config
220
208
 
221
209
  (Copied from CLAUDE.md — downstream skills read this instead of re-parsing CLAUDE.md)
222
210
 
223
- | Key | Value |
224
- |-----|-------|
225
- | workflow-dir | ... |
226
- | test-cmd | ... |
227
- | lint-cmd | ... |
228
- | build-cmd | ... |
229
- | e2e-cmd | ... |
230
- | e2e-framework | ... |
231
- | tdd | ... |
232
- | branch-prefix | ... |
233
- | base-branch | ... |
211
+ | Key | Value |
212
+ | ------------- | ----- |
213
+ | workflow-dir | ... |
214
+ | test-cmd | ... |
215
+ | lint-cmd | ... |
216
+ | build-cmd | ... |
217
+ | e2e-cmd | ... |
218
+ | e2e-framework | ... |
219
+ | tdd | ... |
220
+ | branch-prefix | ... |
221
+ | base-branch | ... |
234
222
  ```
235
223
 
236
224
  ---
@@ -264,6 +252,7 @@ When invoked with a path to an existing spec (or the user asks to revise):
264
252
  ## Constraints
265
253
 
266
254
  **DO:**
255
+
267
256
  - Read the codebase before writing anything
268
257
  - Reference specific file paths, function names, type names in every implementation step
269
258
  - Find and cite a structural template (the closest existing similar feature)
@@ -275,12 +264,14 @@ When invoked with a path to an existing spec (or the user asks to revise):
275
264
  - Make decisions — be opinionated
276
265
 
277
266
  **DON'T:**
267
+
278
268
  - Write implementation code in the spec — describe what to build, not the code itself
279
269
  - Propose new patterns when existing patterns in the codebase work
280
270
  - List alternatives — pick one and explain why
281
271
  - Skip codebase exploration for any reason
282
272
  - Create a spec for requirements that are unclear — ask first
283
- - Create new `.feature` files at the per-feature level: bootstrap is a separate one-off; per-feature work extends what exists
273
+ - Create a `.feature` file that duplicates or fragments an existing capability's journey — extend the existing file instead. Net-new `.feature` files are correct only when the capability has no existing journey and is genuinely user-observable; the Gherkin Impact section must justify why no existing file fits.
274
+ - Skip the `Gherkin Impact` section — it is required, not optional. Downstream qa-engineer refuses to proceed without it.
284
275
  - Assume every acceptance criterion becomes an e2e test — qa-engineer routes ACs by nature; criteria that aren't user-observable belong in lint rules, unit tests, or impl check-results
285
276
 
286
277
  ---
@@ -294,5 +285,6 @@ If you catch yourself thinking any of these, stop:
294
285
  - "The user's description is clear enough, no ambiguity check needed" — STOP. Spend 30 seconds checking.
295
286
  - "I'll keep the acceptance criteria general to be flexible" — STOP. Vague criteria are untestable and unusable by downstream skills. Be specific.
296
287
  - "There's no similar feature to use as a template" — STOP. Look harder. There is almost always a structural analog somewhere in the codebase.
297
- - "This feature is new enough to deserve its own `.feature` file" — STOP. New `.feature` files are bootstrap territory. If the feature truly defines a new user-facing capability with no precedent in `features/`, that's a bootstrap pass, not per-feature spec-writer work. Flag it for the user.
288
+ - "This feature deserves its own `.feature` file because it's _kinda_ different" — STOP. A new `.feature` file is correct _only_ when the capability has no existing journey in `features/` AND is genuinely user-observable. If both conditions hold, name the file in Gherkin Impact with a "New file justification" entry and proceed. If either fails, extend the closest existing file.
289
+ - "The capability has no existing `.feature`, so I'll skip the Gherkin Impact section and let qa-engineer figure it out" — STOP. Gherkin Impact is required. Either authorize a new file or document why every AC routes away from Gherkin (lint / unit / impl check-result). Silent omission is not allowed.
298
290
  - "I'll add a new scenario for each new acceptance criterion" — STOP. Prefer outline rows or `And`-step additions to existing scenarios. New scenarios require a stated reason in Gherkin Impact.