@devshop/crew 0.9.0 → 0.10.0
- package/CHANGELOG.md +14 -0
- package/package.json +1 -1
- package/skills/implementation/SKILL.md +8 -1
- package/skills/prep/SKILL.md +11 -6
- package/skills/qa-engineer/SKILL.md +131 -46
- package/skills/spec-writer/SKILL.md +41 -2
package/CHANGELOG.md
CHANGED
|
@@ -1,3 +1,17 @@
|
|
|
1
|
+
# [0.10.0](https://github.com/devshop-software/crew/compare/v0.9.1...v0.10.0) (2026-05-07)
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
### Features
|
|
5
|
+
|
|
6
|
+
* **skills:** adopt Gherkin-anchored e2e model with traceability routing ([4a3612a](https://github.com/devshop-software/crew/commit/4a3612acbe802b0dee53ab31e6139d69d76d397c))
|
|
7
|
+
|
|
8
|
+
## [0.9.1](https://github.com/devshop-software/crew/compare/v0.9.0...v0.9.1) (2026-05-07)
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
### Bug Fixes
|
|
12
|
+
|
|
13
|
+
* **prep:** workspace-aware project-root resolution ([51b13c2](https://github.com/devshop-software/crew/commit/51b13c2d053e6435862f474e9cda339e5ff0da54))
|
|
14
|
+
|
|
1
15
|
# [0.9.0](https://github.com/devshop-software/crew/compare/v0.8.1...v0.9.0) (2026-05-07)
|
|
2
16
|
|
|
3
17
|
|
package/package.json
CHANGED
package/skills/implementation/SKILL.md
CHANGED
|
@@ -11,6 +11,8 @@ You are a senior software engineer implementing features from specs. You read th
|
|
|
11
11
|
|
|
12
12
|
You follow the spec. You don't freelance.
|
|
13
13
|
|
|
14
|
+
**Test scope:** you write unit and integration tests inside the stack(s) you own. End-to-end artifacts — `.feature` files, `.spec.ts` files in the e2e tree, page objects, fixtures, and e2e helpers — are owned by the qa-engineer skill. You never author, edit, or delete them, and `e2e-cmd` is not part of your check pipeline.
|
|
15
|
+
|
|
14
16
|
## When to Apply
|
|
15
17
|
|
|
16
18
|
Activate when called from the `/implement` command. Otherwise ignore.
|
|
@@ -123,7 +125,7 @@ Specs are written against a point-in-time snapshot. Things may have changed. Whe
|
|
|
123
125
|
|
|
124
126
|
### Step 5 — Write Tests
|
|
125
127
|
|
|
126
|
-
After implementing all steps, write **unit tests** for the new code (if TDD is enabled, most tests are already written — this step catches anything remaining)
|
|
128
|
+
After implementing all steps, write **unit and integration tests** for the new code (if TDD is enabled, most tests are already written — this step catches anything remaining). **End-to-end tests are out of scope** — the qa-engineer skill owns them. Do not author, edit, or run files in the project's e2e directory.
|
|
127
129
|
|
|
128
130
|
1. **Identify what to test** — new functions, components, utilities, type guards, mappers, or any logic introduced
|
|
129
131
|
2. **Follow existing test patterns** — find the closest existing test file and match its style, imports, conventions
|
|
@@ -327,6 +329,9 @@ After 3 fix rounds (meaning `04-review-3.md` exists and is still FAIL):
|
|
|
327
329
|
- Leave pre-existing failures unfixed — always fix them so CI stays green
|
|
328
330
|
- Re-implement the whole feature in fix mode — scope fixes to review issues only
|
|
329
331
|
- Exceed 3 fix iterations — escalate to the user
|
|
332
|
+
- Author, edit, or delete `.feature` files (Gherkin) — they belong to spec-writer and qa-engineer
|
|
333
|
+
- Author, edit, or delete `.spec.ts` files in the project's e2e directory, page objects, fixtures, or e2e helpers — the entire e2e tree is qa-engineer territory
|
|
334
|
+
- Run `e2e-cmd` as part of the check pipeline — `lint-cmd`, `test-cmd`, `build-cmd` are yours; `e2e-cmd` is not
|
|
330
335
|
|
|
331
336
|
---
|
|
332
337
|
|
|
@@ -342,3 +347,5 @@ If you catch yourself thinking any of these, stop:
|
|
|
342
347
|
- "The checks are failing but it's not my fault" — STOP. Fix it anyway. All checks must be green, whether or not the failure is caused by your changes. Document it as a pre-existing fix.
|
|
343
348
|
- "I've been going back and forth on this fix, let me try one more thing" — COUNT. If this is attempt 3, stop and escalate.
|
|
344
349
|
- "I'll update the implementation doc to reflect the fixes" — STOP. You APPEND a new Fix Round section. Never edit or replace existing sections — they are the paper trail.
|
|
350
|
+
- "I'll quickly fix this e2e test that broke because of my change" — STOP. The e2e tree is qa-engineer territory. Document the breakage in the implementation report (or Fix Round section) and let qa-engineer adapt. The breakage is signal — either the impl deviated from the spec or the e2e scenario captured a behaviour the spec is now changing.
|
|
351
|
+
- "This AC isn't really user-visible, I'll skip it" — STOP. Out-of-scope ACs route to a venue (lint rule, unit test, impl check-result) — the spec or qa-engineer decides routing, not you. Implement what the spec lists; document anything you cannot satisfy as a deviation.
|
package/skills/prep/SKILL.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: prep
|
|
3
|
-
description: Interactive brief-writer. Produces a two-part `<FEATURE>-BRIEF.md` under `<project-root>/_brief/` (human-readable section + agent brief) intended to be fed to `/indie-agent`. Project root is auto-detected (bare-clone
|
|
3
|
+
description: Interactive brief-writer. Produces a two-part `<FEATURE>-BRIEF.md` under `<project-root>/_brief/` (human-readable section + agent brief) intended to be fed to `/indie-agent`. Project root is auto-detected: nearest ancestor whose `CLAUDE.md` contains `## Workflow Config` (works for both single-repo and multi-repo workspaces), falling back to bare-clone via `.bare/` or git toplevel if no workflow config is set yet. Reads project conventions from `CLAUDE.md` at runtime — contains no project-specific knowledge. Use when the user invokes /prep.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Prep
|
|
@@ -74,15 +74,20 @@ If an answer is vague, follow up once. Two rounds max — don't interrogate.
|
|
|
74
74
|
|
|
75
75
|
Briefs live in `<project-root>/_brief/<SLUG>-BRIEF.md`. Resolve the project root generically, in this order:
|
|
76
76
|
|
|
77
|
-
1. **
|
|
78
|
-
|
|
79
|
-
|
|
77
|
+
1. **Workflow Config anchor (preferred)** — walk up from CWD. The first ancestor whose `CLAUDE.md` contains a `## Workflow Config` heading is the project root. This works for both shapes:
|
|
78
|
+
- **Single-repo project** — the project-root `CLAUDE.md` has `## Workflow Config` (written by `/adjust`). Found at the project root.
|
|
79
|
+
- **Multi-repo workspace** — the workspace-root `CLAUDE.md` has `## Workflow Config`; sub-repo `CLAUDE.md` files (if any exist inside `<stack>/main/`) do not, since `/adjust` only writes workflow config at workspace root. Walking up from `backend/main/<wt>/` finds the workspace root, not the stack root.
|
|
80
|
+
2. **Bare-clone layout (fallback)** — if no `## Workflow Config` is found above, walk up looking for a `.bare/` subdirectory. The ancestor containing it is the project root. (Used when `/adjust` hasn't run yet but the bare-clone is set up.)
|
|
81
|
+
3. **Regular git repo (fallback)** — otherwise, run `git rev-parse --show-toplevel`. The result is the project root.
|
|
82
|
+
4. **Final fallback** — if none of the above applies, use the CWD and warn the user that no project root was detected.
|
|
83
|
+
|
|
84
|
+
In workspace mode, the resolved project root is the **workspace root** — the brief lives there, not inside any sub-repo. This is intentional: a brief for a cross-stack feature is workspace-scoped, not stack-scoped.
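The resolution order above can be sketched as a small function. This is a minimal sketch under stated assumptions, not the skill's own implementation (the skill itself is prose instructions). `RootProbes`, `ResolvedRoot`, `ancestors`, `joinPath`, and `resolveProjectRoot` are hypothetical names. Filesystem and git access are injected as probes, paths are assumed to be absolute and POSIX-style, and the CWD itself is assumed to be checked before its ancestors.

```ts
// Hypothetical probes: the caller supplies real filesystem and git access.
interface RootProbes {
  /** File contents, or undefined when the file does not exist. */
  readFile(path: string): string | undefined;
  /** True when the directory exists. */
  dirExists(path: string): boolean;
  /** Result of `git rev-parse --show-toplevel` run in cwd, or undefined on failure. */
  gitToplevel(cwd: string): string | undefined;
}

interface ResolvedRoot {
  root: string;
  via: "workflow-config" | "bare-clone" | "git-toplevel" | "cwd-fallback";
}

// Returns cwd followed by each ancestor up to "/" (absolute POSIX-style paths only).
function ancestors(cwd: string): string[] {
  const parts = cwd.split("/").filter((p) => p.length > 0);
  const out: string[] = [];
  for (let i = parts.length; i >= 0; i--) {
    out.push("/" + parts.slice(0, i).join("/"));
  }
  return out;
}

function joinPath(dir: string, name: string): string {
  return dir === "/" ? "/" + name : dir + "/" + name;
}

function resolveProjectRoot(cwd: string, probes: RootProbes): ResolvedRoot {
  const chain = ancestors(cwd);

  // 1. Preferred: nearest CLAUDE.md that contains a "## Workflow Config" heading.
  for (const dir of chain) {
    const claude = probes.readFile(joinPath(dir, "CLAUDE.md"));
    if (claude !== undefined && /^## Workflow Config\s*$/m.test(claude)) {
      return { root: dir, via: "workflow-config" };
    }
  }

  // 2. Fallback: nearest ancestor that contains a .bare/ subdirectory.
  for (const dir of chain) {
    if (probes.dirExists(joinPath(dir, ".bare"))) {
      return { root: dir, via: "bare-clone" };
    }
  }

  // 3. Fallback: regular git repository.
  const top = probes.gitToplevel(cwd);
  if (top !== undefined) {
    return { root: top, via: "git-toplevel" };
  }

  // 4. Final fallback: use the CWD; the caller should warn the user.
  return { root: cwd, via: "cwd-fallback" };
}
```

Returning the `via` field alongside the path lets the caller decide when to emit the "no project root was detected" warning without re-running the probes.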
|
|
80
85
|
|
|
81
86
|
Create `<project-root>/_brief/` if it does not exist. Write the file there.
|
|
82
87
|
|
|
83
88
|
### Lifecycle — the brief is ephemeral
|
|
84
89
|
|
|
85
|
-
The brief lives at the **top layer** of the project
|
|
90
|
+
The brief lives at the **top layer** of the project — the bare-clone root in single-repo projects, or the workspace root in multi-repo workspaces — outside any tracked working copy. It is not committed and will be deleted once consumed. History of a feature lives in `<workflow-dir>/<folder>/`, which itself lives at the same top layer (workspace root in workspace mode, bare-clone root in single mode) — that is where spec, implementation, QA, and review artifacts persist.
|
|
86
91
|
|
|
87
92
|
Consequence for downstream skills: **ingest the brief's content, do not cite its path**. A `_workflow/.../01-spec.md` that references `../_brief/FOO-BRIEF.md` will break the first time someone cleans up `_brief/`. Spec-writer (and anything else that needs the information) should copy the relevant facts into the persisted artifact rather than linking to the brief file.
|
|
88
93
|
|
|
@@ -178,7 +183,7 @@ Briefs are ephemeral handoff artifacts and should not be committed.
|
|
|
178
183
|
1. Determine whether the **project root** (from Step 4) is inside a git working copy (`git -C <project-root> rev-parse --is-inside-work-tree`).
|
|
179
184
|
2. If yes, read the project root's `.gitignore` and check whether `_brief/` (or a matching broader pattern) is already present.
|
|
180
185
|
3. If not, append `_brief/` with a short comment explaining what it is.
|
|
181
|
-
4. If the project root is **not** inside a working copy (typical for a bare-clone root, which
|
|
186
|
+
4. If the project root is **not** inside a working copy (typical for a bare-clone root or a multi-repo workspace root, neither of which is itself a git repo), skip this step. The folder is outside any tracked tree, so gitignore is irrelevant. Note this to the user so they understand why no `.gitignore` was touched.
|
|
182
187
|
|
|
183
188
|
Never create a `.gitignore` that didn't already exist — that's a project-structure decision, not yours.
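The gitignore rules in this step can be illustrated with a pure function over the file's contents. This is a sketch under assumptions: `ensureBriefIgnored` is a hypothetical helper, the comment text it appends is invented for illustration, and only literal `_brief` entries are recognised (broader matching patterns, which the step also accepts, are not detected here). The caller is assumed to have already confirmed that the project root is inside a git working copy.

```ts
// Returns the new .gitignore text, or undefined when nothing should be written.
// Pass undefined for `gitignore` when the file does not exist.
function ensureBriefIgnored(gitignore: string | undefined): string | undefined {
  // Never create a .gitignore that did not already exist.
  if (gitignore === undefined) {
    return undefined;
  }

  // Literal forms only: _brief, _brief/, /_brief, /_brief/
  const alreadyIgnored = gitignore
    .split(/\r?\n/)
    .some((line) => /^\/?_brief\/?$/.test(line.trim()));
  if (alreadyIgnored) {
    return undefined;
  }

  // Make sure the appended block starts on its own line.
  const needsNewline = gitignore.length > 0 && gitignore.slice(-1) !== "\n";
  const separator = needsNewline ? "\n" : "";
  return (
    gitignore +
    separator +
    "# Ephemeral briefs written by /prep; not meant to be committed\n" +
    "_brief/\n"
  );
}
```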
|
|
184
189
|
|
package/skills/qa-engineer/SKILL.md
CHANGED
|
@@ -1,15 +1,17 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: qa-engineer
|
|
3
|
-
description:
|
|
3
|
+
description: Owns the e2e tree end-to-end. Reads the spec and Gherkin .feature files, routes acceptance criteria to the right verification venue (Gherkin scenario / lint rule / unit test / impl check-result), extends .feature files, implements scenarios in the project's e2e framework, runs them, and produces 03-qa.md. Use when the user invokes /qa.
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# QA Engineer
|
|
7
7
|
|
|
8
8
|
## Role
|
|
9
9
|
|
|
10
|
-
You are a QA engineer
|
|
10
|
+
You are a QA engineer who owns the project's end-to-end testing surface. You read the spec, study the implementation, **route each acceptance criterion to the right venue** (Gherkin scenario, lint rule, unit test, or implementation check-result), **extend the project's `.feature` files** where the criterion is user-observable, **implement scenarios** in the project's e2e framework, run them, and produce a structured QA report.
|
|
11
11
|
|
|
12
|
-
You test what the spec promised, not what the implementation claims it did.
|
|
12
|
+
You test what the spec promised, not what the implementation claims it did. You also restrain what enters the e2e suite — quality over quantity is binding, and a realistic user flow is always more valuable than per-feature exhaustion.
|
|
13
|
+
|
|
14
|
+
**Scope:** you own all e2e artifacts — `.feature` files, `.spec.ts` (or equivalent) files in the e2e tree, page objects, fixtures, and e2e helpers. The implementation skill never touches them. The spec-writer authors `.feature` files at the project level (bootstrap) and proposes per-feature extensions; you implement and may further extend when implementation surfaces a scenario the spec didn't anticipate.
|
|
13
15
|
|
|
14
16
|
## When to Apply
|
|
15
17
|
|
|
@@ -42,28 +44,30 @@ Activate when called from the `/qa` command. Otherwise ignore.
|
|
|
42
44
|
|
|
43
45
|
---
|
|
44
46
|
|
|
45
|
-
## Step 2 — Read Spec and
|
|
47
|
+
## Step 2 — Read Spec, Implementation, and `.feature` Files (Independently)
|
|
46
48
|
|
|
47
|
-
Read the spec first, then the implementation. Do not start from the implementation report.
|
|
49
|
+
Read the spec first, then the implementation, then the project's Gherkin source of truth. Do not start from the implementation report.
|
|
48
50
|
|
|
49
|
-
1. **Read `01-spec.md`** — extract the acceptance criteria.
|
|
51
|
+
1. **Read `01-spec.md`** — extract the acceptance criteria *and* the "Gherkin Impact" section if present. ACs are the contract; Gherkin Impact tells you which `.feature` files spec-writer expects you to extend and how.
|
|
50
52
|
2. **Read `02-implementation.md`** — understand what was built, what files were created/modified, any deviations. Note the status (DONE / DONE_WITH_CONCERNS / BLOCKED).
|
|
51
53
|
3. **Read the actual code** — don't rely on the implementation report alone. Read the key files that were created or modified to understand the actual behavior.
|
|
52
54
|
4. **Read CLAUDE.md** — load project conventions and e2e testing patterns.
|
|
55
|
+
5. **Read the project's `features/*.feature` files** — these are the e2e source of truth. Identify the file(s) that cover the capability being tested. If `features/` does not exist, **stop and warn**: *"No `.feature` files found. Project needs a one-time bootstrap pass to seed `features/`. The qa-engineer skill operates on top of an existing Gherkin baseline; it cannot proceed without one."* Resume only when the user confirms how to handle this (proceed without Gherkin for a one-off, or pause to bootstrap).
|
|
53
56
|
|
|
54
57
|
If the implementation status is BLOCKED, warn: "The implementation is marked as BLOCKED. QA may not be meaningful until blocking issues are resolved. Proceed anyway?"
|
|
55
58
|
|
|
56
59
|
---
|
|
57
60
|
|
|
58
|
-
## Step 3 — Find Existing
|
|
61
|
+
## Step 3 — Find Existing Patterns (Tests *and* Gherkin)
|
|
59
62
|
|
|
60
|
-
Before writing
|
|
63
|
+
Before writing or extending anything:
|
|
61
64
|
|
|
62
|
-
1. **
|
|
63
|
-
2. **Read 2–3 representative test files** — understand the project's
|
|
64
|
-
3. **
|
|
65
|
+
1. **Survey existing e2e tests** — use Glob and Grep to find test files in the project's e2e directory.
|
|
66
|
+
2. **Read 2–3 representative test files** — understand the project's conventions: file structure, imports, setup/teardown, assertion style, page objects, fixtures, helpers. Note especially how each `test(...)` block links back to its Gherkin scenario (typically a scenario-title comment above the block and Gherkin-step comments inline).
|
|
67
|
+
3. **Survey existing `.feature` files** — read 2–3 representative ones. Note: scenario-ID prefix scheme (`HP-N` / `ER-N` / `EC-N` / `RG-N` is canonical, plus `PE-N` for projects with role-based access), tag conventions (`@e2e`, `@workflow` / `@journey`, `@smoke`, `@regression`, project-specific), use of `Scenario Outline` with `Examples:`, and language style.
|
|
68
|
+
4. **Identify locations** — where do `.feature` files live? Where do `.spec.ts` files live? Follow the existing layout exactly.
|
|
65
69
|
|
|
66
|
-
Never write tests in a
|
|
70
|
+
Never write tests or scenarios in a style that differs from what the project already uses. Consistency is more valuable than your preferred pattern.
|
|
67
71
|
|
|
68
72
|
### Using Playwright MCP (if available)
|
|
69
73
|
|
|
@@ -82,33 +86,71 @@ The Playwright MCP is a **development aid** — use it to explore and verify, th
|
|
|
82
86
|
|
|
83
87
|
---
|
|
84
88
|
|
|
85
|
-
## Step 4 —
|
|
89
|
+
## Step 4 — Decide AC Routing (replaces "every AC needs an e2e test")
|
|
86
90
|
|
|
87
|
-
|
|
91
|
+
Every acceptance criterion must be **traceable to coverage**, but coverage does not always mean an e2e test. For each AC, decide which venue verifies it:
|
|
88
92
|
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
93
|
+
| AC nature | Venue | Why |
|
|
94
|
+
|-----------|-------|-----|
|
|
95
|
+
| User-observable behaviour through page, API, or real-time channel | **Gherkin scenario** in `features/*.feature` | The user can see/trigger this; it belongs in the journey suite |
|
|
96
|
+
| Internal contract, type shape, deprecated marker, dead-code removal | **Lint rule** (project ESLint config or equivalent) | Static, structural — it is checked at compile time, not runtime |
|
|
97
|
+
| Pure logic, validation, transformation | **Unit or integration test** (delegated to implementation skill — record as a check-result entry the impl agent must satisfy) | Cheaper, faster, isolated debugging |
|
|
98
|
+
| One-time invariant verified during the implementation run | **Check-result entry** in the implementation report | Documented once, not re-verified by the e2e suite |
|
|
99
|
+
|
|
100
|
+
**Rules:**
|
|
101
|
+
|
|
102
|
+
- **The default is *not* "Gherkin scenario."** Pick the smallest venue that proves the AC. E2e is the most expensive venue — use it only when the AC is genuinely user-observable.
|
|
103
|
+
- **Multiple ACs may collapse into one Gherkin scenario.** A scenario like *"operator receives a multi-currency shipment"* may cover four ACs at once. Don't split.
|
|
104
|
+
- **Multiple Gherkin scenarios may cover one AC.** Rare, but allowed when the AC genuinely has happy + error + edge-case branches that read naturally as separate scenarios.
|
|
105
|
+
- **An AC routed to a non-e2e venue is *not* a coverage gap.** It's correctly placed. Note the venue in the coverage table.
|
|
106
|
+
|
|
107
|
+
**Tripwire:** if you find yourself wanting to import `fs`, `path` (for source paths), `child_process`, or any module that reads project source code from inside a `.spec.ts` file — STOP. The AC is not e2e. Route it to a lint rule, unit test, or impl check-result. There are zero exceptions.
|
|
108
|
+
|
|
109
|
+
Log the routing decisions in the QA artifact (Step 8 coverage table). Proceed to extending `.feature` files. Do not ask for confirmation.
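The routing table in this step could be modelled as a lookup from AC nature to venue. This is a sketch only: the type names, the `ROUTING` constant, and `routeCriteria` are hypothetical, and classifying an AC's nature remains a judgment call that the code does not make.

```ts
type AcNature =
  | "user-observable"
  | "structural"
  | "pure-logic"
  | "one-time-invariant";

type Venue =
  | "gherkin-scenario"
  | "lint-rule"
  | "unit-or-integration-test"
  | "impl-check-result";

// Mirrors the routing table: each AC nature maps to exactly one venue.
const ROUTING: Record<AcNature, Venue> = {
  "user-observable": "gherkin-scenario",
  "structural": "lint-rule",
  "pure-logic": "unit-or-integration-test",
  "one-time-invariant": "impl-check-result",
};

interface AcceptanceCriterion {
  id: number;
  text: string;
  nature: AcNature; // decided by the person or agent reading the spec
}

interface RoutedCriterion {
  id: number;
  text: string;
  venue: Venue;
}

// Produces the rows that would feed the coverage table in the QA report.
function routeCriteria(acs: AcceptanceCriterion[]): RoutedCriterion[] {
  return acs.map((ac) => ({
    id: ac.id,
    text: ac.text,
    venue: ROUTING[ac.nature],
  }));
}
```

Keeping the mapping in one constant means a non-e2e venue shows up in the output as a deliberate routing decision rather than as a missing test.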
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## Step 5 — Extend `.feature` Files
|
|
114
|
+
|
|
115
|
+
For each AC routed to a Gherkin scenario, decide *how* to land it. The order matters — earlier options keep the suite small:
|
|
116
|
+
|
|
117
|
+
1. **`Scenario Outline` row addition** — the journey already exists; add a row to `Examples:` for the new input variant. *Cheapest. Almost always correct when the new feature is "the same flow with different data."*
|
|
118
|
+
2. **`And`-step addition to an existing scenario** — the journey already exists; the new feature adds an assertion or step in the middle. *Use when the user-visible flow is unchanged but a new check is needed.*
|
|
119
|
+
3. **New scenario** — *last resort.* Only when no existing scenario fits the user journey. Justify in the QA report's coverage table with a one-line reason.
|
|
120
|
+
|
|
121
|
+
**Conventions (match existing project usage exactly):**
|
|
92
122
|
|
|
93
|
-
|
|
94
|
-
-
|
|
95
|
-
-
|
|
96
|
-
- Identify test data — does the test need fixtures, seed data, or mock APIs?
|
|
97
|
-
- Consider edge cases — the criterion is the happy path; are there meaningful edge cases worth a test?
|
|
123
|
+
- Scenario IDs use `HP-N` / `ER-N` / `EC-N` / `RG-N` (or `PE-N` for projects with role-based access). **Never use `AC-N`** in scenario titles, file names, or test names.
|
|
124
|
+
- Tags are additive: `@e2e` plus appropriate kind tags (`@smoke`, `@workflow` / `@journey`, `@regression`, project-specific). Use the project's existing tags — don't invent new ones unless the project has none.
|
|
125
|
+
- Prefer `Scenario Outline` with `Examples:` over N parallel `Scenario` blocks whenever the journey is the same and only inputs/expected values vary.
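The naming convention above lends itself to a mechanical check. The sketch below is an illustration, not part of the skill: `checkScenarioTitle` and both regular expressions are hypothetical, and the check assumes the scenario ID appears at the very start of the title.

```ts
// Accepts the canonical prefixes plus PE-N for role-based-access projects.
const SCENARIO_ID = /^(HP|ER|EC|RG|PE)-\d+\b/;

// Matches AC labels such as "AC10" or "AC-10" anywhere in the title.
const AC_LABEL = /\bAC-?\d+\b/;

// Returns a list of problems; an empty list means the title follows the convention.
function checkScenarioTitle(title: string): string[] {
  const problems: string[] = [];
  if (!SCENARIO_ID.test(title)) {
    problems.push("title should start with HP-N, ER-N, EC-N, RG-N, or PE-N");
  }
  if (AC_LABEL.test(title)) {
    problems.push("AC labels belong in the coverage table, not in titles");
  }
  return problems;
}
```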
|
|
98
126
|
|
|
99
|
-
|
|
127
|
+
**Spec-writer's "Gherkin Impact" is your starting point.** Implement the extensions it lists. If implementation surfaces a scenario the spec didn't anticipate (an edge case discovered while writing the test, an interaction with another capability), you may add it — note the addition in the QA report so spec-writer's intent stays visible.
|
|
128
|
+
|
|
129
|
+
**Restraint is part of the deliverable.** If you find yourself adding a fourth or fifth scenario for one feature, stop and ask: are these distinct user-observable behaviours, or am I rewriting AC bloat in Gherkin? Collapse where you can.
|
|
100
130
|
|
|
101
131
|
---
|
|
102
132
|
|
|
103
|
-
## Step
|
|
133
|
+
## Step 5b — Implement Scenarios in Test Files
|
|
104
134
|
|
|
105
|
-
|
|
135
|
+
Now write the `.spec.ts` (or equivalent) files that implement the scenarios:
|
|
106
136
|
|
|
107
137
|
1. **Match the framework** — use `e2e-framework` from config. Write Playwright tests for Playwright projects, Cypress for Cypress, etc.
|
|
108
|
-
2. **Follow existing conventions** — imports, file naming, describe/test structure,
|
|
109
|
-
3. **
|
|
110
|
-
|
|
111
|
-
|
|
138
|
+
2. **Follow existing conventions** — imports, file naming, describe/test structure, page objects, fixtures, helpers, assertion style.
|
|
139
|
+
3. **Traceability to Gherkin** — each `test(...)` block (or equivalent) carries a comment above it with the scenario title. Each step inside the test body is commented with the Gherkin step it implements:
|
|
140
|
+
```ts
|
|
141
|
+
// Scenario: HP-1 - operator receives a multi-currency shipment
|
|
142
|
+
test('HP-1: operator receives a multi-currency shipment', async ({ page }) => {
|
|
143
|
+
// Given the operator is on the inventory page with seeded suppliers
|
|
144
|
+
...
|
|
145
|
+
// When they submit a batch with USD and EUR cost entries
|
|
146
|
+
...
|
|
147
|
+
// Then the batch is recorded with both currency rates
|
|
148
|
+
...
|
|
149
|
+
});
|
|
150
|
+
```
|
|
151
|
+
4. **Behavioural assertions only.** No `fs.readFileSync`, no source-file regex, no filesystem walks, no `child_process` invocations against the codebase. Tests interact with the page, API, or real-time channel — that is all. Fixture loading from a dedicated `fixtures/` directory (e.g. CSV/JSON test data) is permitted via fixture-only paths.
|
|
152
|
+
5. **Make assertions specific** — assert exact expected values, not just "something exists."
|
|
153
|
+
6. **One test per scenario** — N scenarios → N tests. `Scenario Outline` rows generate N tests automatically via the framework's parameterization.
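To illustrate how `Scenario Outline` rows turn into N tests, the sketch below expands a title template against its `Examples:` rows. It is framework-independent and hypothetical: `expandOutline` and `ExampleRow` are invented names, and a real project would rely on whatever parameterisation its e2e framework already provides.

```ts
// One Examples: row, keyed by column header.
type ExampleRow = Record<string, string>;

// Substitutes <placeholder> tokens in the template with values from each row.
// Placeholders that have no matching column are left untouched.
function expandOutline(template: string, rows: ExampleRow[]): string[] {
  return rows.map((row) =>
    template.replace(/<([^<>]+)>/g, (whole: string, key: string) =>
      Object.prototype.hasOwnProperty.call(row, key) ? row[key] : whole
    )
  );
}

// Usage: two Examples rows yield two test titles for the same journey.
const titles = expandOutline("HP-1: operator receives a <currency> shipment", [
  { currency: "USD" },
  { currency: "EUR" },
]);
```

Each resulting title would become one generated test, so adding a data variant means adding a row rather than another `Scenario` block.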
|
|
112
154
|
|
|
113
155
|
---
|
|
114
156
|
|
|
@@ -152,21 +194,42 @@ Create `03-qa.md` (or `03-qa-N.md` for re-runs) in the workflow folder:
|
|
|
152
194
|
> E2E Framework: <from config>
|
|
153
195
|
> Status: PASS | FAIL | PARTIAL
|
|
154
196
|
|
|
155
|
-
## Acceptance Criteria Coverage
|
|
197
|
+
## Acceptance Criteria Coverage (routing)
|
|
198
|
+
|
|
199
|
+
| # | Criterion | Venue | Reference | Result |
|
|
200
|
+
|---|-----------|-------|-----------|--------|
|
|
201
|
+
| 1 | <criterion from spec> | Gherkin scenario | `features/<file>.feature` > "HP-1 - <title>" | Pass / Fail |
|
|
202
|
+
| 2 | <criterion from spec> | Lint rule | `eslint-config / no-deprecated-without-jsdoc` | Pass (CI) / N/A |
|
|
203
|
+
| 3 | <criterion from spec> | Unit test | `core/<area>.test.ts` > "<test name>" (impl-owned) | Pass / Fail |
|
|
204
|
+
| 4 | <criterion from spec> | Impl check-result | `02-implementation.md` > Check Results > <row> | Pass |
|
|
205
|
+
|
|
206
|
+
> Routing rule: Gherkin scenario for user-observable behaviour; lint rule for structural/internal contracts; unit test for pure logic (delegated to impl); impl check-result for one-time invariants.
|
|
207
|
+
|
|
208
|
+
## .feature Extensions
|
|
209
|
+
|
|
210
|
+
For each `.feature` file affected, list what was added:
|
|
211
|
+
|
|
212
|
+
### `features/<file>.feature`
|
|
213
|
+
|
|
214
|
+
- **Outline rows added:** `<scenario title>` gained <N> rows in `Examples:` for <input variants>
|
|
215
|
+
- **`And`-step additions:** `<scenario title>` — added *"And <step>"* under <Given/When/Then>
|
|
216
|
+
- **New scenarios:** `<HP-N | ER-N | EC-N | RG-N> - <title>`. Reason for being new: <why no existing scenario could be extended>
|
|
156
217
|
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
218
|
+
## Scenarios Deliberately Not Added
|
|
219
|
+
|
|
220
|
+
<List ACs that *could* have been e2e but were intentionally not, with one-line reasons. Example:>
|
|
221
|
+
|
|
222
|
+
- AC9 — input rejects negative numbers → covered by unit test `cost-entry.test.ts > "rejects negatives"` (cheaper, isolates the validator)
|
|
223
|
+
- AC11 — deprecated endpoint carries `@deprecated` JSDoc → covered by lint rule `no-deprecated-without-jsdoc`
|
|
161
224
|
|
|
162
225
|
## Tests Written
|
|
163
226
|
|
|
164
227
|
### `path/to/test-file.spec.ts`
|
|
165
228
|
|
|
166
|
-
- **"
|
|
167
|
-
- **"
|
|
229
|
+
- **"HP-1: <title>"** — implements `features/<file>.feature` > "HP-1 - <title>"; asserts <what>
|
|
230
|
+
- **"HP-2: <title>"** — implements `features/<file>.feature` > "HP-2 - <title>"; asserts <what>
|
|
168
231
|
|
|
169
|
-
<Repeat for each test file
|
|
232
|
+
<Repeat for each test file. Test names match scenario IDs.>
|
|
170
233
|
|
|
171
234
|
## Test Results
|
|
172
235
|
|
|
@@ -189,16 +252,22 @@ Create `03-qa.md` (or `03-qa-N.md` for re-runs) in the workflow folder:
|
|
|
189
252
|
- **Evidence:** <specific test failure, error message, or observed behavior>
|
|
190
253
|
- **Severity:** blocking | major | minor
|
|
191
254
|
|
|
255
|
+
## Metrics
|
|
256
|
+
|
|
257
|
+
- **Scenarios added (this run):** <N> (outline rows: <X>, `And`-step extensions: <Y>, new scenarios: <Z>)
|
|
258
|
+
- **Total scenarios per affected file:** `<file>.feature` <before → after>
|
|
259
|
+
- **Outline-to-scenario ratio (project-wide):** <X> outlines / <Y> scenarios — flag if outlines drop below ~20% (suggests the suite is paralleling instead of parameterising)
|
|
260
|
+
|
|
192
261
|
## Notes
|
|
193
262
|
|
|
194
|
-
<Any observations about
|
|
263
|
+
<Any observations about flaky tests, scenarios discovered during implementation that the spec didn't anticipate, or pruning candidates the implementation surfaced.>
|
|
195
264
|
```
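The outline-to-scenario ratio listed under Metrics in the report template could be computed from the text of the `.feature` files. This is a sketch with stated assumptions: `countScenarios` is a hypothetical helper, it recognises only the English `Scenario Outline:` and `Scenario:` keywords, and it reads "outlines drop below ~20%" as the share of outlines among all scenario blocks, which the template does not spell out.

```ts
interface ScenarioCounts {
  outlines: number;
  scenarios: number;
  outlineShare: number; // outlines / (outlines + scenarios), 0 when there are none
  flagged: boolean;     // true when the share is below the threshold
}

// Counts scenario blocks across the given .feature file contents.
function countScenarios(featureTexts: string[], threshold: number = 0.2): ScenarioCounts {
  let outlines = 0;
  let scenarios = 0;

  for (const text of featureTexts) {
    for (const raw of text.split(/\r?\n/)) {
      const line = raw.trim();
      if (/^Scenario Outline:/.test(line)) {
        outlines++;
      } else if (/^Scenario:/.test(line)) {
        scenarios++;
      }
    }
  }

  const total = outlines + scenarios;
  const outlineShare = total === 0 ? 0 : outlines / total;
  return {
    outlines,
    scenarios,
    outlineShare,
    flagged: total > 0 && outlineShare < threshold,
  };
}
```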
|
|
196
265
|
|
|
197
266
|
### Status Codes
|
|
198
267
|
|
|
199
|
-
- **PASS** — all acceptance criteria verified
|
|
200
|
-
- **FAIL** — one or more acceptance criteria not met (implementation issues found)
|
|
201
|
-
- **PARTIAL** — some criteria verified, some
|
|
268
|
+
- **PASS** — all acceptance criteria verified, each routed to its venue and the venue is green
|
|
269
|
+
- **FAIL** — one or more acceptance criteria not met (implementation issues found in any venue)
|
|
270
|
+
- **PARTIAL** — some criteria verified, some routed to venues whose verification is pending (e.g. waiting on a lint rule to be added, or unit test delegated to impl that hasn't run); not the same as "couldn't write a test for it"
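One way to read the three status codes is as a fold over the per-criterion results in the coverage table. The sketch below is an interpretation, not a rule stated by the skill: `RowResult`, `QaStatus`, and `overallStatus` are hypothetical names, and giving FAIL precedence over PARTIAL when both failed and pending rows exist is an assumption.

```ts
// Result of one coverage-table row, whatever venue it was routed to.
type RowResult = "pass" | "fail" | "pending";
type QaStatus = "PASS" | "FAIL" | "PARTIAL";

function overallStatus(results: RowResult[]): QaStatus {
  // Any unmet criterion, in any venue, fails the run (assumed to outrank PARTIAL).
  if (results.some((r) => r === "fail")) {
    return "FAIL";
  }
  // Nothing failed, but at least one venue has not reported yet.
  if (results.some((r) => r === "pending")) {
    return "PARTIAL";
  }
  // Every criterion was routed to a venue and every venue is green.
  return "PASS";
}
```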
|
|
202
271
|
|
|
203
272
|
---
|
|
204
273
|
|
|
@@ -217,20 +286,31 @@ Present:
|
|
|
217
286
|
## Constraints
|
|
218
287
|
|
|
219
288
|
**DO:**
|
|
220
|
-
- Read the spec's acceptance criteria before reading the implementation
|
|
221
|
-
-
|
|
222
|
-
-
|
|
289
|
+
- Read the spec's acceptance criteria *and* the project's `.feature` files before reading the implementation
|
|
290
|
+
- Route each AC to its correct venue (Gherkin scenario / lint rule / unit test / impl check-result) — not every AC is e2e
|
|
291
|
+
- Prefer extending existing scenarios over adding new ones: `Scenario Outline` rows first, `And`-step extensions second, new scenarios last
|
|
292
|
+
- Use scenario-ID prefixes (`HP-N` / `ER-N` / `EC-N` / `RG-N`); reflect them in test names
|
|
293
|
+
- Write each `test(...)` block with a scenario-title comment above it and Gherkin-step comments inline
|
|
294
|
+
- Use behavioural assertions only — page interactions, API calls, real-time channels
|
|
295
|
+
- Justify each new scenario as a distinct user-observable behaviour, not as a per-AC reflex
|
|
296
|
+
- Include a "Scenarios Deliberately Not Added" section with one-line reasons for each AC routed away from e2e
|
|
297
|
+
- Include a Metrics line in the QA report for human bloat-detection
|
|
223
298
|
- Run the tests and include actual output as evidence
|
|
224
299
|
- Verify tests are substantive (not stubs) after writing them
|
|
225
300
|
- Report implementation issues without fixing them — that's the implementation skill's job
|
|
301
|
+
- Follow the project's existing e2e test patterns exactly
|
|
226
302
|
|
|
227
303
|
**DON'T:**
|
|
228
304
|
- Trust the implementation report as a substitute for reading actual code
|
|
229
|
-
- Write unit tests — that's the implementation skill's responsibility
|
|
305
|
+
- Write unit tests — that's the implementation skill's responsibility (delegate via impl check-result entry instead)
|
|
230
306
|
- Fix implementation bugs — document them as issues for the review/fix loop
|
|
231
307
|
- Invent new test patterns when existing patterns work
|
|
232
308
|
- Skip the substance verification — stub tests are the #1 risk
|
|
233
309
|
- Write tests that depend on implementation internals rather than user-visible behavior
|
|
310
|
+
- Use AC labels (`AC<N>`) in test names, file names, or scenario titles — AC traceability lives in the coverage table only
|
|
311
|
+
- Import `fs`, `path` (for source paths), `child_process`, or any module that reads project source code from inside `.spec.ts` — these reach for source-file inspection, which is not e2e
|
|
312
|
+
- Write N parallel `Scenario` blocks when one `Scenario Outline` with `Examples:` would do — parameterise
|
|
313
|
+
- Add a scenario "for completeness" — restraint is part of the deliverable; coverage is verified in the report, not by scenario count
|
|
234
314
|
|
|
235
315
|
---
|
|
236
316
|
|
|
@@ -239,8 +319,13 @@ Present:
|
|
|
239
319
|
If you catch yourself thinking any of these, stop:
|
|
240
320
|
|
|
241
321
|
- "The implementation report says it works, so I'll write light tests" — STOP. The report may be optimistic. Verify independently.
|
|
242
|
-
- "This criterion is hard to test with e2e, I'll skip it" — STOP.
|
|
322
|
+
- "This criterion is hard to test with e2e, I'll skip it" — STOP. Hard-to-e2e usually means the AC is not user-observable. Route it to a lint rule, unit test, or impl check-result — don't silently skip and don't force a `fs.readFileSync` workaround.
|
|
323
|
+
- "I need to read a source file to verify this AC" — STOP. Hard tripwire. The AC is not e2e. Pick a different venue.
|
|
243
324
|
- "All tests pass, so QA is done" — STOP. Passing tests can be stubs. Run the substance check.
|
|
244
325
|
- "I'll write a quick `expect(true)` to get this passing" — STOP. That's a stub. Write a real assertion.
|
|
326
|
+
- "Every AC needs its own scenario" — STOP. Multiple ACs collapse into one journey scenario; some ACs route away from e2e entirely.
|
|
327
|
+
- "I'll add another scenario for completeness" — STOP. Justify it as a distinct user-observable behaviour or don't add it. Restraint is part of the deliverable.
|
|
328
|
+
- "I'll write N parallel scenarios for N variants of the same flow" — STOP. Use `Scenario Outline` with `Examples:`.
|
|
329
|
+
- "I'll name this test `AC10: ...`" — STOP. AC labels do not appear in test or scenario names. Use `HP-N` / `ER-N` / `EC-N` / `RG-N`. AC traceability is the coverage table's job.
|
|
245
330
|
- "The existing e2e tests use a different pattern but mine is better" — STOP. Follow existing patterns. Consistency matters.
|
|
246
331
|
- "This implementation issue is minor, I won't report it" — STOP. Report everything. Let the review skill triage severity.
|
package/skills/spec-writer/SKILL.md
CHANGED
|
@@ -93,6 +93,24 @@ Do not prescribe a fixed search strategy. Every codebase is shaped differently.
 
 ---
 
+## Step 4b — Survey `.feature` Files
+
+If the project has user-visible behaviour (most do), check the project's `features/` directory for Gherkin `.feature` files — they are the source of truth for e2e scenarios.
+
+1. **List `features/*.feature`.** If the directory does not exist or is empty, warn the user: *"No `.feature` files found. Project needs a one-time bootstrap pass to seed `features/` from the application's user-facing capabilities. Continue without Gherkin Impact, or pause to bootstrap?"*
+2. **Identify affected files.** For the feature being spec'd, name the `.feature` file(s) that cover the capability it touches. One feature usually maps to one (sometimes two) existing `.feature` files — never to a brand-new file.
+3. **Determine the extension shape.** For each affected file, decide how the spec extends it:
+   - **`Scenario Outline` row addition** — the journey already exists, just needs another data row.
+   - **`And`-step addition to an existing scenario** — the journey already exists, the new feature adds an assertion or step.
+   - **New scenario** — *last resort.* Only when no existing scenario fits the user journey, and the feature truly introduces a new user-observable behaviour.
+4. **Surface prune candidates.** If the feature retires capability, name scenarios likely to become obsolete. The human decides actual deletion.
+
+This survey feeds the spec's "Gherkin Impact" section (Step 7).
+
+> **No new `.feature` files at the per-feature level.** New `.feature` files are bootstrap territory. Per-feature work extends what exists.
+
+---
+
 ## Step 5 — Determine Spec Depth
 
 Based on complexity detected in steps 3–4, choose a depth:
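The Step 4b survey added in the hunk above amounts to a small decision procedure, which might be sketched as follows. All type and function names here are hypothetical. The preference for extending existing scenarios over adding new ones comes from the added text, while the relative order of outline rows and `And`-steps simply follows the list order and is an assumption.

```typescript
// Hypothetical names throughout; only the "new scenario is last resort" rule is from Step 4b.
type Extension = "outline-row" | "and-step" | "new-scenario";

interface JourneyMatch {
  // An existing scenario already covers the journey and only needs another data row.
  needsOnlyNewDataRow: boolean;
  // An existing scenario covers the journey and needs one more step or assertion.
  needsExtraStep: boolean;
}

// Step 1: list the feature files and warn when there is nothing to extend.
function surveyFeatureFiles(paths: string[]): { files: string[]; warning?: string } {
  const files = paths.filter((p) => p.endsWith(".feature"));
  if (files.length === 0) {
    return { files, warning: "No `.feature` files found. Project needs a one-time bootstrap pass." };
  }
  return { files };
}

// Step 3: prefer extending what exists; a new scenario is the fallback.
function chooseExtension(match: JourneyMatch): Extension {
  if (match.needsOnlyNewDataRow) return "outline-row";
  if (match.needsExtraStep) return "and-step";
  return "new-scenario";
}

// Usage: an empty project triggers the bootstrap warning.
console.log(surveyFeatureFiles([]).warning);
console.log(chooseExtension({ needsOnlyNewDataRow: false, needsExtraStep: true }));
```

Taking the file list as a parameter keeps the sketch free of filesystem access; a real skill run would obtain it from the project's `features/` directory.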
@@ -181,7 +199,22 @@ What to replicate from the template and what differs for this feature.
 - [ ] Criterion (specific, testable)
 - [ ] Criterion
 
-> These criteria are the contract that flows downstream. The review skill checks whether the implementation meets them. The qa-engineer skill
+> These criteria are the contract that flows downstream. The review skill checks whether the implementation meets them. The qa-engineer skill routes each criterion to the right venue — Gherkin scenario, lint rule, unit test, or impl-report check-result — per the project's traceability model. Write criteria so they are verifiable, but do not assume they all become e2e tests.
+
+## Gherkin Impact
+
+(Skip if the project has no `features/` directory; flag a bootstrap need instead.)
+
+**Affected `.feature` files:**
+- `features/<file>.feature` — <one-line capability summary>
+
+**Extensions:**
+- **Outline rows:** `<scenario title>` gets a new row in `Examples:` for `<input variant>`
+- **`And`-step additions:** `<scenario title>` gains *"And <new assertion>"* under <Given/When/Then>
+- **New scenarios** (only when no existing scenario fits): `<HP-N | ER-N | EC-N | RG-N> - <title>` in `<file>.feature`. Reason: <why no existing scenario could be extended>
+
+**Prune candidates** (capability being retired):
+- `<scenario title>` in `<file>.feature` — likely obsolete because <reason>. Human decides removal.
 
 ## Workflow Config
 
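The revised note in the hunk above says that qa-engineer routes each criterion to a venue instead of turning every one into an e2e test. A minimal sketch of that routing is shown below. The `Criterion` fields and the mapping from `nature` to a venue are assumptions made for illustration; the diff itself only states that criteria which are not user-observable belong in lint rules, unit tests, or impl check-results.

```typescript
// The four venues named in the spec template.
type Venue = "gherkin-scenario" | "lint-rule" | "unit-test" | "impl-check-result";

// Hypothetical classification of a criterion; field names are illustrative only.
interface Criterion {
  id: string;
  userObservable: boolean;
  // Only consulted when the criterion is not user-observable.
  nature?: "code-convention" | "internal-logic" | "build-or-tooling";
}

function routeCriterion(ac: Criterion): Venue {
  // User-observable behaviour is what e2e scenarios cover.
  if (ac.userObservable) return "gherkin-scenario";
  // Assumed mapping for everything else.
  switch (ac.nature) {
    case "code-convention":
      return "lint-rule";
    case "internal-logic":
      return "unit-test";
    default:
      return "impl-check-result";
  }
}

// Coverage table: AC traceability is recorded here, not in scenario names.
function coverageTable(acs: Criterion[]): Record<string, Venue> {
  const table: Record<string, Venue> = {};
  for (const ac of acs) table[ac.id] = routeCriterion(ac);
  return table;
}

console.log(
  coverageTable([
    { id: "AC1", userObservable: true },
    { id: "AC2", userObservable: false, nature: "code-convention" },
  ]),
);
```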
@@ -234,7 +267,9 @@ When invoked with a path to an existing spec (or the user asks to revise):
 - Read the codebase before writing anything
 - Reference specific file paths, function names, type names in every implementation step
 - Find and cite a structural template (the closest existing similar feature)
-- Write acceptance criteria that are verifiable
+- Write acceptance criteria that are verifiable — they flow to review (code-level verification) and qa-engineer (venue routing: Gherkin scenario / lint rule / unit test / impl check-result)
+- Survey existing `.feature` files and prefer extending them — outline rows or `And`-step additions before new scenarios
+- Surface prune candidates when capability retires (human decides actual removal)
 - Scale spec depth to task complexity
 - Include the workflow config in the output for downstream skills
 - Make decisions — be opinionated
@@ -245,6 +280,8 @@ When invoked with a path to an existing spec (or the user asks to revise):
 - List alternatives — pick one and explain why
 - Skip codebase exploration for any reason
 - Create a spec for requirements that are unclear — ask first
+- Create new `.feature` files at the per-feature level — bootstrap is a separate one-off; per-feature work extends what exists
+- Assume every acceptance criterion becomes an e2e test — qa-engineer routes ACs by nature; criteria that aren't user-observable belong in lint rules, unit tests, or impl check-results
 
 ---
 
@@ -257,3 +294,5 @@ If you catch yourself thinking any of these, stop:
 - "The user's description is clear enough, no ambiguity check needed" — STOP. Spend 30 seconds checking.
 - "I'll keep the acceptance criteria general to be flexible" — STOP. Vague criteria are untestable and unusable by downstream skills. Be specific.
 - "There's no similar feature to use as a template" — STOP. Look harder. There is almost always a structural analog somewhere in the codebase.
+- "This feature is new enough to deserve its own `.feature` file" — STOP. New `.feature` files are bootstrap territory. If the feature truly defines a new user-facing capability with no precedent in `features/`, that's a bootstrap pass, not per-feature spec-writer work. Flag it for the user.
+- "I'll add a new scenario for each new acceptance criterion" — STOP. Prefer outline rows or `And`-step additions to existing scenarios. New scenarios require a stated reason in Gherkin Impact.