specweave 1.0.436 → 1.0.438

package/README.md CHANGED
@@ -1,6 +1,8 @@
1
1
  # SpecWeave
2
2
 
3
- **The spec-driven Skill Fabric for AI coding agents.** Program your AI in English. Ship features while you sleep.
3
+ **AI-assisted development, under control.**
4
+
5
+ Your AI responds to natural language — and now follows a structured, spec-first, quality-gated process every time. Configure your standards once. Every developer, every AI tool, every session enforces them automatically.
4
6
 
5
7
  *Works with Claude Code, Cursor, Copilot, Codex, Antigravity & any LLM-powered coding tool.*
6
8
 
@@ -15,36 +17,61 @@ npm install -g specweave # Requires Node.js 20.12.0+
15
17
 
16
18
  ---
17
19
 
18
- ## What Are Skills?
20
+ ## No Commands to Memorize
19
21
 
20
- **Skills are programs written in English**: not prompts, not templates, but reusable logic that controls how AI thinks, decides, and acts.
22
+ SpecWeave is not a workflow you switch into. It is a behavior layer that changes how your AI works — installed once, active in every conversation.
21
23
 
22
- ```
23
- Without SpecWeave: With SpecWeave:
24
- ───────────────── ───────────────
25
- "Use React Hook Form with Zod..." "Add a login form"
26
- "Remember, we use Tailwind..." → AI already knows your patterns.
27
- "Don't forget the test pattern..." → It remembered from last time.
28
- "Wait, I told you this yesterday..." → Fix once, learned permanently.
29
- ```
24
+ When you describe what you want, your AI routes internally to the right skill. You just work naturally:
30
25
 
31
- Each skill is a **programmable AI behavior** you can customize without forking. Fix once, remembered permanently. 100+ skills ship out of the box: PM, Architect, QA, Security, DevOps, Frontend, Backend, Mobile, ML.
26
+ | You say | Your AI runs automatically |
27
+ |---------|------------------------------|
28
+ | "Build me X" / "Let's add Y" | `/sw:increment` → spec + plan + tasks |
29
+ | "Go ahead" / "Build it" | `/sw:auto` → autonomous execution |
30
+ | "Ship it" / "We're done" | `/sw:done` → quality gates + close |
31
+ | "Split this into teams" | `/sw:team-lead` → parallel agents |
32
+ | "Review the code" | `/sw:grill` → critical code review |
32
33
 
33
- **You don't need to learn Claude Code docs.** SpecWeave handles hooks, plugins, CLAUDE.md, and context management for you. Install, describe your feature, skills do the rest.
34
+ You can also invoke these directly for fine-grained control, but you rarely need to.
35
+
36
+ ---
37
+
38
+ ## What You Control
39
+
40
+ SpecWeave's behavior is driven by configuration. Define your standards once; every AI interaction in your project enforces them.
41
+
42
+ ```json
43
+ // .specweave/config.json
44
+ {
45
+ "testing": {
46
+ "defaultTestMode": "TDD", // AI always follows red-green-refactor
47
+ "tddEnforcement": "strict" // Tasks cannot close without passing tests
48
+ },
49
+ "quality": {
50
+ "grillRequired": true, // Code review gate before every close
51
+ "judgeLlmRequired": true // Independent AI validation gate
52
+ },
53
+ "sync": {
54
+ "github": true, // Auto-sync to GitHub Issues / PRs
55
+ "jira": true // Bidirectional JIRA sync on close
56
+ }
57
+ }
58
+ ```
59
+
60
+ This is the difference between **asking** an AI to follow a process and **configuring** it to. No prompting required. No hoping it remembers. The config is the contract.
34
61
 
35
62
  ---
36
63
 
37
64
  ## The Workflow
38
65
 
39
- Just describe what you want. SpecWeave handles the rest.
66
+ Just describe what you want. Your AI handles the orchestration.
40
67
 
41
68
  ```
42
69
  You: "Build me a checkout flow with Stripe"
43
70
 
44
- SpecWeave asks 5-10 clarifying questions
71
+ AI asks 5-10 clarifying questions
45
72
  (What payment methods? Guest checkout? Subscriptions? Which UI library?)
46
73
 
47
- Creates: spec.md → plan.md → tasks.md
74
+ Creates: spec.md → plan.md → tasks.md ← you review the plan here
48
75
 
49
76
  You: "Go ahead and build it"
50
77
  → autonomous execution for hours
@@ -54,13 +81,13 @@ You wake up. Review finished work.
54
81
  Tests cover technical correctness. You check the UI and UX.
55
82
 
56
83
  You: "Looks good, ship it"
57
- → validated, documented, shipped.
84
+ → validated, documented, closed.
58
85
  ```
59
86
 
60
87
  **Solo developer:**
61
88
  ```
62
89
  You: "I need user authentication with OAuth and magic links"
63
- SpecWeave interviews you, creates spec + plan + tasks
90
+ AI interviews you, creates spec + plan + tasks
64
91
  You: "Build it"
65
92
  → AI works autonomously for hours
66
93
  You: "Ship it"
@@ -81,23 +108,28 @@ You: "Migrate the checkout page to React"
81
108
  → TDD-first autonomous execution
82
109
  ```
83
110
 
84
- <details>
85
- <summary><strong>Under the hood</strong> — SpecWeave auto-activates these skills from natural language:</summary>
111
+ ---
86
112
 
87
- | You say | SpecWeave runs |
88
- |---------|---------------|
89
- | "Build me X" | `/sw:increment` → spec + plan + tasks |
90
- | "Go ahead" / "Build it" | `/sw:auto` → autonomous execution |
91
- | "Ship it" / "We're done" | `/sw:done` → quality gates + close |
92
- | "Split this into teams" | `/sw:team-lead` → parallel agents |
93
- | "Review the code" | `/sw:grill` → critical code review |
113
+ ## What Are Skills?
114
+
115
+ **Skills are programs written in English**: not prompts, not templates, but reusable logic that controls how AI thinks, decides, and acts.
94
116
 
95
- You can also invoke commands directly for fine-grained control.
96
- </details>
117
+ ```
118
+ Without SpecWeave: With SpecWeave:
119
+ ───────────────── ───────────────
120
+ "Use React Hook Form with Zod..." "Add a login form"
121
+ "Remember, we use Tailwind..." → AI already knows your patterns.
122
+ "Don't forget the test pattern..." → It remembered from last time.
123
+ "Wait, I told you this yesterday..." → Fix once, learned permanently.
124
+ ```
125
+
126
+ Each skill is a **programmable AI behavior** you can customize without forking. Fix once, remembered permanently. 100+ skills ship out of the box — PM, Architect, QA, Security, DevOps, Frontend, Backend, Mobile, ML.
127
+
128
+ **You don't need to learn Claude Code docs.** SpecWeave handles hooks, plugins, CLAUDE.md, and context management for you. Install, describe your feature, skills do the rest.
97
129
 
98
130
  ---
99
131
 
100
- ## Why SpecWeave?
132
+ ## Why Spec-First?
101
133
 
102
134
  **The plan is more important than the code.**
103
135
 
@@ -271,7 +303,7 @@ You: "Add dark mode to the app"
271
303
 
272
304
  ## Core Commands
273
305
 
274
- All commands activate automatically from natural language. Use directly for fine-grained control.
306
+ These run automatically from natural language — see the table above. Use directly when you want fine-grained control.
275
307
 
276
308
  | Command | Purpose | Natural trigger |
277
309
  |---------|---------|----------------|
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "specweave",
3
- "version": "1.0.436",
4
- "description": "Spec-driven development framework for AI coding agents. Works with Claude Code, Codex, Antigravity, Cursor, Copilot & more. 100+ skills, 49 CLI commands, verified skill certification, autonomous execution, and living documentation.",
3
+ "version": "1.0.438",
4
+ "description": "AI-assisted development, under control. Configure your standards once (spec-first, TDD, quality gates) and every AI interaction enforces them automatically. Works with Claude Code, Cursor, Copilot, Codex & more.",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",
7
7
  "bin": {
@@ -0,0 +1,420 @@
1
+ ---
2
+ description: Generate, run, and report Playwright E2E tests traced to spec.md acceptance criteria. Supports accessibility auditing via --a11y. Use when saying "e2e tests", "playwright tests", "run e2e", "generate e2e", "accessibility audit", "a11y test".
3
+ argument-hint: "--generate|--run|--a11y <increment-id>"
4
+ allowed-tools: Read, Write, Edit, Grep, Glob, Bash
5
+ context: fork
6
+ model: sonnet
7
+ ---
8
+
9
+ # E2E Testing — Playwright + AC Traceability
10
+
11
+ ## Project Overrides
12
+
13
+ !`s="e2e"; for d in .specweave/skill-memories .claude/skill-memories "$HOME/.claude/skill-memories"; do p="$d/$s.md"; [ -f "$p" ] && awk '/^## Learnings$/{ok=1;next}/^## /{ok=0}ok' "$p" && break; done 2>/dev/null; true`
14
+
15
+ Generate Playwright E2E tests from spec.md acceptance criteria, run them, and produce a structured report that maps pass/fail results to AC-IDs. Consumed by sw:done Gate 2a for automated closure gating.
16
+
17
+ ## Modes
18
+
19
+ | Flag | Action |
20
+ |------|--------|
21
+ | `--generate <id>` | Read spec.md → create one `.spec.ts` per US with one `test()` per AC |
22
+ | `--run <id>` | Execute `npx playwright test` → parse results → write `e2e-report.json` |
23
+ | `--a11y <id>` | Like `--run` but also scans each page with `@axe-core/playwright` |
24
+
25
+ Combine `--run` + `--a11y` to get both functional and accessibility results. `--generate` ignores `--a11y` (warn if combined).
26
+
27
+ ---
28
+
29
+ ## Step 1: Parse Arguments
30
+
31
+ Extract mode and increment ID from `$ARGUMENTS`:
32
+
33
+ ```bash
34
+ # Parse: --generate 0042 | --run 0042 | --a11y 0042 | --run --a11y 0042
35
+ MODE="run" # default
36
+ A11Y=false
37
+ INCREMENT_ID=""
38
+
39
+ for arg in $ARGUMENTS; do
40
+ case "$arg" in
41
+ --generate) MODE="generate" ;;
42
+ --run) MODE="run" ;;
43
+ --a11y) A11Y=true ;;
44
+ *) INCREMENT_ID="$arg" ;;
45
+ esac
46
+ done
47
+ ```
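For readers who prefer to see the same parsing outside the shell, here is a minimal TypeScript sketch of the loop above. The names `ParsedArgs` and `parseArgs` are hypothetical and not part of SpecWeave; the behavior (default mode `run`, last non-flag token taken as the increment ID) simply mirrors the bash `case` statement.

```typescript
// Hypothetical helper mirroring the bash loop; not a SpecWeave API.
type Mode = "generate" | "run";

interface ParsedArgs {
  mode: Mode;
  a11y: boolean;
  incrementId: string;
}

function parseArgs(argumentsString: string): ParsedArgs {
  // Defaults match the bash snippet: MODE="run", A11Y=false, INCREMENT_ID="".
  const parsed: ParsedArgs = { mode: "run", a11y: false, incrementId: "" };
  const tokens = argumentsString.split(/\s+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    if (token === "--generate") parsed.mode = "generate";
    else if (token === "--run") parsed.mode = "run";
    else if (token === "--a11y") parsed.a11y = true;
    else parsed.incrementId = token; // any other token, as in the `*)` branch
  }
  return parsed;
}
```

Like the shell version, this treats any unrecognized token (including an unknown flag) as the increment ID, so a caller may want to validate the ID format afterwards.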
48
+
49
+ If no increment ID provided, check for an active increment:
50
+ ```bash
51
+ ACTIVE=$(find .specweave/increments -maxdepth 2 -name "metadata.json" -exec grep -l '"active"' {} \; 2>/dev/null | head -1)
52
+ ```
53
+
54
+ If still no ID → **STOP**: "No increment ID provided and no active increment found."
55
+
56
+ Resolve increment path: `.specweave/increments/<id>/`
57
+
58
+ ## Step 2: Environment Validation — Playwright Detection
59
+
60
+ **MANDATORY before any operation.** Detect Playwright installation:
61
+
62
+ ```bash
63
+ # 1. Find playwright.config
64
+ PW_CONFIG=$(find . repositories -maxdepth 4 -name "playwright.config.ts" -o -name "playwright.config.js" 2>/dev/null | head -1)
65
+
66
+ # 2. Check for @playwright/test in package.json
67
+ PW_PACKAGE=$(grep -r '"@playwright/test"' package.json packages/*/package.json repositories/*/*/package.json 2>/dev/null | head -1)
68
+ ```
69
+
70
+ **Decision matrix**:
71
+
72
+ | Config | Package | Action |
73
+ |--------|---------|--------|
74
+ | Found | Found | **Proceed** — use config path |
75
+ | Missing | Found | **FAIL**: "Playwright installed but no config found. Run `npm init playwright@latest` to create playwright.config.ts" |
76
+ | Missing | Missing | **FAIL**: "Playwright not installed. Run `npm init playwright@latest` to set up E2E testing" |
77
+ | Found | Missing | **Proceed** with warning: "Playwright config found but package not in package.json (global install?)" |
78
+
79
+ Store `PW_CONFIG` path for later use.
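The decision matrix can be read as a small pure function. The sketch below is illustrative only: `Decision` and `decidePlaywright` are hypothetical names, and the messages are shortened from the table rather than copied from any real implementation.

```typescript
// Hypothetical encoding of the Step 2 decision matrix.
type Decision =
  | { action: "proceed"; warning?: string }
  | { action: "fail"; message: string };

function decidePlaywright(configFound: boolean, packageFound: boolean): Decision {
  if (configFound && packageFound) {
    return { action: "proceed" };
  }
  if (configFound) {
    // Config present, package missing: proceed, but surface a warning.
    return {
      action: "proceed",
      warning: "Playwright config found but package not in package.json (global install?)",
    };
  }
  if (packageFound) {
    return { action: "fail", message: "Playwright installed but no config found." };
  }
  return {
    action: "fail",
    message: "Playwright not installed. Run `npm init playwright@latest` to set up E2E testing",
  };
}
```

Keeping the decision separate from the `find`/`grep` detection makes each of the four rows easy to check in isolation.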
80
+
81
+ ## Step 3: Read spec.md — AC Extraction
82
+
83
+ Parse the increment's spec.md to extract acceptance criteria:
84
+
85
+ ```bash
86
+ # Extract ACs: matches both [ ] and [x] checkboxes
87
+ grep -E '^\s*-\s*\[[ x]\]\s*\*\*AC-' .specweave/increments/<id>/spec.md
88
+ ```
89
+
90
+ **Parsing algorithm**:
91
+
92
+ 1. Read `.specweave/increments/<id>/spec.md`
93
+ 2. For each line matching `- [[ x]] **AC-USx-xx**: <text>`:
94
+ - Extract AC-ID (e.g., `AC-US1-01`)
95
+ - Extract description text (the Given/When/Then or plain text after the colon)
96
+ - Derive parent US-ID from AC prefix (e.g., `AC-US1-01` → `US-001`)
97
+ - Flag `hasGWT` if text contains "Given" AND "When" AND "Then"
98
+ 3. Group ACs by parent US-ID
99
+ 4. Detect journey sequences: ACs under the same US that describe sequential steps on the same page
100
+
101
+ **Edge cases**:
102
+ - **No ACs found**: Output "No acceptance criteria found in spec.md — nothing to generate" and exit cleanly
103
+ - **ACs without Given/When/Then**: Generate a test stub with `// TODO: AC text does not follow GWT format — implement test manually`
104
+ - **Duplicate AC-IDs**: Warn, append `-dup1` suffix to the test name
105
+
106
+ Store the parsed result as a structured list for subsequent steps.
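The parsing algorithm above could be sketched roughly as follows. This is an assumption-laden illustration, not the skill's actual code: `AcceptanceCriterion`, `extractAcs`, and `groupByUserStory` are hypothetical names, the regex assumes the `**AC-USx-xx**:` form shown in step 2, and the Given/When/Then check is a case-sensitive whole-word match. Journey detection (step 4) is left out because it requires judgment about the AC text.

```typescript
// Hypothetical types and helpers for Step 3; names are illustrative.
interface AcceptanceCriterion {
  acId: string; // e.g. "AC-US1-01"
  usId: string; // e.g. "US-001"
  text: string; // description after the colon
  hasGWT: boolean;
}

// Matches "- [ ] **AC-US1-01**: text" and "- [x] **AC-US1-01**: text".
const AC_LINE = /^\s*-\s*\[[ x]\]\s*\*\*(AC-US(\d+)-\d+)\*\*:?\s*(.*)$/;

function extractAcs(specMarkdown: string): AcceptanceCriterion[] {
  const acs: AcceptanceCriterion[] = [];
  for (const line of specMarkdown.split(/\r?\n/)) {
    const m = AC_LINE.exec(line);
    if (!m) continue;
    // Derive the parent US-ID: "AC-US1-01" -> "US-001" (zero-padded to 3 digits).
    let padded = String(parseInt(m[2], 10));
    while (padded.length < 3) padded = "0" + padded;
    const text = m[3].trim();
    acs.push({
      acId: m[1],
      usId: "US-" + padded,
      text: text,
      hasGWT: /\bGiven\b/.test(text) && /\bWhen\b/.test(text) && /\bThen\b/.test(text),
    });
  }
  return acs;
}

function groupByUserStory(
  acs: AcceptanceCriterion[]
): { [usId: string]: AcceptanceCriterion[] } {
  const groups: { [usId: string]: AcceptanceCriterion[] } = {};
  for (const ac of acs) {
    (groups[ac.usId] = groups[ac.usId] || []).push(ac);
  }
  return groups;
}
```

An empty array from `extractAcs` corresponds to the "No ACs found" edge case, where the skill is expected to exit cleanly.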
107
+
108
+ ---
109
+
110
+ ## Step 4: Generate Mode (`--generate`)
111
+
112
+ **Goal**: Create Playwright test files from extracted ACs.
113
+
114
+ ### 4a. Determine Output Directory
115
+
116
+ ```bash
117
+ # Read testDir from playwright config, default to e2e/
118
+ TEST_DIR=$(sed -nE "s/.*testDir:[[:space:]]*['\"]([^'\"]+)['\"].*/\1/p" "$PW_CONFIG" | head -1)  # portable: avoids grep -P, which BSD/macOS grep lacks
119
+ TEST_DIR="${TEST_DIR:-e2e}"
120
+ mkdir -p "$TEST_DIR"
121
+ ```
122
+
123
+ ### 4b. Generate Test Files
124
+
125
+ For each user story, create `{TEST_DIR}/us-{NNN}.spec.ts`:
126
+
127
+ **Template for standard ACs** (one test per AC):
128
+
129
+ ```typescript
130
+ import { test, expect } from '@playwright/test';
131
+
132
+ test.describe('US-001: <User Story Title>', () => {
133
+ test('AC-US1-01: <AC description summary>', async ({ page }) => {
134
+ // Given: <given clause>
135
+ // When: <when clause>
136
+ // Then: <then clause>
137
+
138
+ // TODO: Implement test steps
139
+ // AC text: <full AC text>
140
+ });
141
+
142
+ test('AC-US1-02: <AC description summary>', async ({ page }) => {
143
+ // ...
144
+ });
145
+ });
146
+ ```
147
+
148
+ **Template for journey ACs** (grouped into one test):
149
+
150
+ When multiple ACs under the same US describe sequential steps (e.g., "user sees form" → "user submits form" → "user sees confirmation"), group them:
151
+
152
+ ```typescript
153
+ test('AC-US1-01 → AC-US1-03: <journey description>', async ({ page }) => {
154
+ // --- AC-US1-01: <description> ---
155
+ // Given/When/Then steps...
156
+
157
+ // --- AC-US1-02: <description> ---
158
+ // Given/When/Then steps...
159
+
160
+ // --- AC-US1-03: <description> ---
161
+ // Given/When/Then steps...
162
+ });
163
+ ```
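To make the standard template concrete, here is a hedged sketch of how a generator might render one file per user story. `Ac`, `specFileName`, and `renderSpecFile` are hypothetical names; the `-dupN` numbering for repeated AC-IDs is one possible reading of the duplicate rule in Step 3, and the TODO text for non-GWT ACs is paraphrased. Journey grouping is not covered.

```typescript
// Hypothetical generator for the "one test per AC" template.
interface Ac {
  acId: string;
  text: string;
  hasGWT: boolean;
}

function specFileName(testDir: string, usId: string): string {
  // "US-001" -> "<testDir>/us-001.spec.ts"
  return testDir.replace(/\/+$/, "") + "/" + usId.toLowerCase() + ".spec.ts";
}

function renderSpecFile(usId: string, title: string, acs: Ac[]): string {
  const seen: { [acId: string]: number } = {};
  const lines: string[] = [];
  lines.push("import { test, expect } from '@playwright/test';", "");
  // JSON.stringify yields a safely escaped string literal for the title.
  lines.push("test.describe(" + JSON.stringify(usId + ": " + title) + ", () => {");
  for (const ac of acs) {
    const count = seen[ac.acId] || 0;
    seen[ac.acId] = count + 1;
    // First occurrence keeps the plain AC-ID; repeats get -dup1, -dup2, ...
    const name = count === 0 ? ac.acId : ac.acId + "-dup" + count;
    lines.push("  test(" + JSON.stringify(name + ": " + ac.text) + ", async ({ page }) => {");
    lines.push(
      ac.hasGWT
        ? "    // TODO: Implement test steps"
        : "    // TODO: AC text does not follow GWT format, implement test manually"
    );
    lines.push("  });");
  }
  lines.push("});", "");
  return lines.join("\n");
}
```

Because the AC-ID stays at the start of every test title, the run-mode regex `/AC-US\d+-\d+/` can still recover it, including for `-dupN` names.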
164
+
165
+ ### 4c. Post-Generate Summary
166
+
167
+ Output:
168
+ ```
169
+ Generated E2E tests:
170
+ {TEST_DIR}/us-001.spec.ts (3 ACs: AC-US1-01, AC-US1-02, AC-US1-03)
171
+ {TEST_DIR}/us-002.spec.ts (2 ACs: AC-US2-01, AC-US2-02)
172
+
173
+ Total: 5 tests across 2 files
174
+ Next: Implement test steps, then run with /sw:e2e --run <id>
175
+ ```
176
+
177
+ ---
178
+
179
+ ## Step 5: Run Mode (`--run`)
180
+
181
+ **Goal**: Execute Playwright tests and produce AC-mapped `e2e-report.json`.
182
+
183
+ ### 5a. Execute Playwright
184
+
185
+ ```bash
186
+ # Run with JSON reporter for structured output
187
+ npx playwright test --reporter=json > /tmp/pw-results.json  # stdout only: no pipe, so $? below is Playwright's exit code and stderr stays out of the JSON
188
+
189
+ # Capture exit code
190
+ PW_EXIT=$?
191
+ ```
192
+
193
+ If Playwright exits non-zero, that's expected for failing tests — continue to report generation.
194
+
195
+ ### 5b. Parse Results and Map to ACs
196
+
197
+ 1. Read the JSON reporter output
198
+ 2. For each test result:
199
+ - Extract test title
200
+ - Match AC-ID from title using regex: `/AC-US\d+-\d+/`
201
+ - Map to status: `passed` → `pass`, `failed` → `fail`, `skipped` → `skip`
202
+ - Extract duration and error message (if failed)
203
+ 3. Tests without AC-IDs in title → report under `acId: "UNMAPPED"`
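The mapping in 5b might look like the sketch below. It deliberately works on a flattened, hypothetical `FlatTestResult` shape rather than the JSON reporter's nested output, so nothing here should be read as the reporter's schema; turning the reporter output into that shape is not shown. Mapping `timedOut` to `fail` follows the timeout row in Step 8, and only the first AC-ID in a title is used, as the regex in the list implies.

```typescript
// Hypothetical flattened input; not the Playwright JSON reporter schema.
type RawStatus = "passed" | "failed" | "skipped" | "timedOut";

interface FlatTestResult {
  title: string;
  file: string;
  status: RawStatus;
  duration: number; // milliseconds
  error?: string;
}

interface AcResult {
  acId: string;
  testFile: string;
  status: "pass" | "fail" | "skip";
  duration: number;
  error: string | null;
}

interface Summary {
  total: number;
  passed: number;
  failed: number;
  skipped: number;
}

const AC_ID = /AC-US\d+-\d+/;

function mapToAcResults(tests: FlatTestResult[]): { summary: Summary; results: AcResult[] } {
  const summary: Summary = { total: 0, passed: 0, failed: 0, skipped: 0 };
  const results: AcResult[] = [];
  for (const t of tests) {
    const match = AC_ID.exec(t.title);
    let status: AcResult["status"];
    if (t.status === "passed") {
      status = "pass";
      summary.passed += 1;
    } else if (t.status === "skipped") {
      status = "skip";
      summary.skipped += 1;
    } else {
      status = "fail"; // covers "failed" and "timedOut"
      summary.failed += 1;
    }
    summary.total += 1;
    results.push({
      acId: match ? match[0] : "UNMAPPED",
      testFile: t.file,
      status: status,
      duration: t.duration,
      error: status === "fail" ? t.error || null : null,
    });
  }
  return { summary: summary, results: results };
}
```

The returned `summary` and `results` line up with the fields of the same names in `e2e-report.json`.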
204
+
205
+ ### 5c. Write e2e-report.json
206
+
207
+ Write to `.specweave/increments/<id>/reports/e2e-report.json`:
208
+
209
+ ```json
210
+ {
211
+ "incrementId": "<id>",
212
+ "timestamp": "<ISO-8601>",
213
+ "mode": "run",
214
+ "playwrightConfig": "<path to playwright.config.ts>",
215
+ "summary": {
216
+ "total": 5,
217
+ "passed": 4,
218
+ "failed": 1,
219
+ "skipped": 0
220
+ },
221
+ "results": [
222
+ {
223
+ "acId": "AC-US1-01",
224
+ "testFile": "e2e/us-001.spec.ts",
225
+ "status": "pass",
226
+ "duration": 1234,
227
+ "error": null
228
+ },
229
+ {
230
+ "acId": "AC-US1-02",
231
+ "testFile": "e2e/us-001.spec.ts",
232
+ "status": "fail",
233
+ "duration": 5678,
234
+ "error": "Expected element to be visible but it was hidden"
235
+ }
236
+ ]
237
+ }
238
+ ```
239
+
240
+ ### 5d. Output Summary
241
+
242
+ ```
243
+ E2E Results for increment <id>:
244
+ Total: 5 | Passed: 4 | Failed: 1 | Skipped: 0
245
+
246
+ FAILED:
247
+ AC-US1-02: Expected element to be visible but it was hidden (us-001.spec.ts)
248
+
249
+ Report: .specweave/increments/<id>/reports/e2e-report.json
250
+ ```
251
+
252
+ If `summary.failed > 0`:
253
+ ```
254
+ ⚠ E2E tests have failures. Fix before closing increment.
255
+ ```
256
+
257
+ If `summary.failed === 0`:
258
+ ```
259
+ All E2E tests passed. Gate 2a will allow closure.
260
+ ```
261
+
262
+ ---
263
+
264
+ ## Step 6: A11y Mode (`--a11y`)
265
+
266
+ **Goal**: Extend run mode with accessibility scanning via `@axe-core/playwright`.
267
+
268
+ ### 6a. Check axe-core Installation
269
+
270
+ ```bash
271
+ grep -q '"@axe-core/playwright"' package.json 2>/dev/null
272
+ ```
273
+
274
+ If not installed:
275
+ ```
276
+ @axe-core/playwright is not installed. Install it:
277
+ npm install -D @axe-core/playwright axe-core
278
+
279
+ Then re-run: /sw:e2e --a11y <id>
280
+ ```
281
+
282
+ ### 6b. Inject A11y Scans
283
+
284
+ When generating tests with `--a11y`, add after each test's primary assertions:
285
+
286
+ ```typescript
287
+ import AxeBuilder from '@axe-core/playwright';
288
+
289
+ // After primary test assertions:
290
+ const a11yResults = await new AxeBuilder({ page })
291
+ .withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
292
+ .analyze();
293
+ ```
294
+
295
+ When running existing tests, the a11y scan must be part of the generated test code. If tests were generated without `--a11y`, recommend regenerating with the flag.
296
+
297
+ ### 6c. A11y Report Extension
298
+
299
+ For each AC result, attach a11y data:
300
+
301
+ ```json
302
+ {
303
+ "acId": "AC-US1-01",
304
+ "status": "pass",
305
+ "a11y": {
306
+ "violations": [
307
+ {
308
+ "rule": "color-contrast",
309
+ "impact": "serious",
310
+ "description": "Elements must have sufficient color contrast",
311
+ "nodes": 3,
312
+ "helpUrl": "https://dequeuniversity.com/rules/axe/4.7/color-contrast"
313
+ }
314
+ ],
315
+ "passes": 42
316
+ }
317
+ }
318
+ ```
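One way to derive the `a11y` object above from an axe scan is sketched below. The input interface is a local, minimal stand-in for the result object returned by `AxeBuilder.analyze()` (fields `id`, `impact`, `help`, `helpUrl`, `nodes`), declared here so the example stays self-contained. Two choices are assumptions: `description` is filled from axe's `help` text, since the sample string reads like that field, and `passes` is taken as the number of passing rules. `toA11yReport` is a hypothetical name.

```typescript
// Minimal local stand-in for the parts of an axe result used here.
interface AxeRuleLike {
  id: string;
  impact?: string | null;
  help: string;
  helpUrl: string;
  nodes: unknown[];
}

interface AxeResultsLike {
  violations: AxeRuleLike[];
  passes: AxeRuleLike[];
}

interface A11yViolation {
  rule: string;
  impact: string | null;
  description: string;
  nodes: number;
  helpUrl: string;
}

interface A11yReport {
  violations: A11yViolation[];
  passes: number;
}

function toA11yReport(results: AxeResultsLike): A11yReport {
  return {
    violations: results.violations.map((v) => ({
      rule: v.id,
      impact: v.impact ? v.impact : null,
      description: v.help, // assumption: report "description" comes from axe "help"
      nodes: v.nodes.length, // count of affected nodes, not the nodes themselves
      helpUrl: v.helpUrl,
    })),
    passes: results.passes.length,
  };
}
```

A scan with no violations yields `{ violations: [], passes: N }`, which matches the shape the a11y eval expects for the zero-violation case.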
319
+
320
+ ### 6d. Standalone A11y (No AC Context)
321
+
322
+ When `--a11y` runs without `--generate` context (pre-existing tests without AC-IDs):
323
+
324
+ - Group violations by page URL instead of AC-ID
325
+ - Write to the top-level `a11y` field in the report:
326
+
327
+ ```json
328
+ {
329
+ "a11y": {
330
+ "violations": [
331
+ {
332
+ "pageUrl": "/login",
333
+ "rule": "color-contrast",
334
+ "impact": "serious",
335
+ "nodes": 2
336
+ }
337
+ ],
338
+ "passes": 87
339
+ }
340
+ }
341
+ ```
342
+
343
+ ### 6e. axe-core Rule Tags Reference
344
+
345
+ | Tag | Meaning |
346
+ |-----|---------|
347
+ | `wcag2a` / `wcag2aa` | WCAG 2.0 Level A / AA |
348
+ | `wcag21aa` / `wcag22aa` | WCAG 2.1 / 2.2 Level AA |
349
+ | `best-practice` | Non-WCAG best practices |
350
+
351
+ Default: `['wcag2a', 'wcag2aa', 'wcag21aa']` (WCAG 2.0 Level A and AA plus WCAG 2.1 Level AA; WCAG 2.1 Level A rules carry the separate `wcag21a` tag).
352
+
353
+ ---
354
+
355
+ ## Step 7: Report Schema Reference
356
+
357
+ ### e2e-report.json (complete)
358
+
359
+ ```json
360
+ {
361
+ "incrementId": "string",
362
+ "timestamp": "ISO-8601",
363
+ "mode": "run | generate | a11y",
364
+ "playwrightConfig": "path/to/playwright.config.ts",
365
+ "summary": {
366
+ "total": 0,
367
+ "passed": 0,
368
+ "failed": 0,
369
+ "skipped": 0
370
+ },
371
+ "results": [
372
+ {
373
+ "acId": "AC-US1-01",
374
+ "testFile": "e2e/us-001.spec.ts",
375
+ "status": "pass | fail | skip",
376
+ "duration": 1234,
377
+ "error": null,
378
+ "a11y": {
379
+ "violations": [],
380
+ "passes": 0
381
+ }
382
+ }
383
+ ],
384
+ "a11y": {
385
+ "violations": [],
386
+ "passes": 0
387
+ }
388
+ }
389
+ ```
390
+
391
+ **Consumption by sw:done Gate 2a**:
392
+ 1. Read `.specweave/increments/<id>/reports/e2e-report.json`
393
+ 2. If `summary.failed > 0` → **BLOCK closure**
394
+ 3. If report missing → **BLOCK closure** (report must exist after sw:e2e invocation)
395
+ 4. If `summary.failed === 0` → **PASS gate**
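The gate rules above reduce to a three-way check. The following is a sketch under stated assumptions, not the actual sw:done implementation: `evaluateGate2a` and `GateOutcome` are hypothetical names, and a missing report is modeled as `null`.

```typescript
// Hypothetical Gate 2a check; only the `summary.failed` field is consulted.
interface ReportLike {
  summary: { failed: number };
}

interface GateOutcome {
  pass: boolean;
  reason: string;
}

function evaluateGate2a(report: ReportLike | null): GateOutcome {
  if (report === null) {
    // Rule 3: a missing report blocks closure.
    return { pass: false, reason: "e2e-report.json missing" };
  }
  if (report.summary.failed > 0) {
    // Rule 2: any failed test blocks closure.
    return { pass: false, reason: report.summary.failed + " E2E test(s) failed" };
  }
  // Rule 4: zero failures passes the gate.
  return { pass: true, reason: "all E2E tests passed" };
}
```

Note that skipped tests do not block closure under these rules, since only `summary.failed` is inspected.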
396
+
397
+ ---
398
+
399
+ ## Step 8: Edge Cases and Error Handling
400
+
401
+ | Scenario | Behavior |
402
+ |----------|----------|
403
+ | No spec.md | "spec.md not found at increment path. Run /sw:increment first." |
404
+ | spec.md with no ACs | "No acceptance criteria found in spec.md — nothing to generate." Exit cleanly. |
405
+ | ACs without GWT format | Generate test stub with `// TODO: implement` comment |
406
+ | Duplicate AC-IDs | Warn, append `-dup1` suffix |
407
+ | Playwright timeout on a test | Report as `status: "fail"`, `error: "Test timed out after Xms"` |
408
+ | `--generate` + `--a11y` combined | Warn: "a11y flag is only used with --run. Generating without a11y scans." |
409
+ | No Playwright config | FAIL with installation instructions (see Step 2) |
410
+ | Pre-existing tests without AC-IDs | Map to `acId: "UNMAPPED"` in results |
411
+ | JSON reporter not available | Fall back to parsing Playwright stdout for pass/fail counts |
412
+
413
+ ## Anti-Rationalization
414
+
415
+ | Excuse | Rebuttal |
416
+ |--------|----------|
417
+ | "Tests are too simple to need AC tracing" | Tracing is free — it costs one string in the test title. Skip it and you lose audit trail. |
418
+ | "I'll add AC-IDs later" | You won't. Generate with `--generate` and they're there from the start. |
419
+ | "Accessibility can wait" | WCAG violations caught at dev time cost 10x less to fix than post-release. Use `--a11y`. |
420
+ | "The report is overkill for a small project" | Gate 2a reads the report. No report = no closure. The schema is fixed overhead, not per-test. |
@@ -0,0 +1,122 @@
1
+ {
2
+ "skill_name": "sw:e2e",
3
+ "evals": [
4
+ {
5
+ "id": 1,
6
+ "name": "generate-from-spec",
7
+ "prompt": "Generate E2E tests for increment 0042-auth-flow. The spec.md has 2 user stories: US-001 (Login) with AC-US1-01 (see login form), AC-US1-02 (submit credentials), AC-US1-03 (see dashboard after login); US-002 (Logout) with AC-US2-01 (click logout), AC-US2-02 (redirected to login page).",
8
+ "expected_output": "Playwright test files created in e2e/ directory with one file per user story and AC-IDs in test titles",
9
+ "assertions": [
10
+ {
11
+ "id": "a1",
12
+ "text": "Reads spec.md from .specweave/increments/0042-auth-flow/spec.md to extract ACs",
13
+ "type": "boolean"
14
+ },
15
+ {
16
+ "id": "a2",
17
+ "text": "Creates e2e/us-001.spec.ts with test titles containing AC-US1-01, AC-US1-02, AC-US1-03",
18
+ "type": "boolean"
19
+ },
20
+ {
21
+ "id": "a3",
22
+ "text": "Creates e2e/us-002.spec.ts with test titles containing AC-US2-01, AC-US2-02",
23
+ "type": "boolean"
24
+ },
25
+ {
26
+ "id": "a4",
27
+ "text": "Checks for playwright.config.ts before generating test files",
28
+ "type": "boolean"
29
+ },
30
+ {
31
+ "id": "a5",
32
+ "text": "Groups AC-US1-01 through AC-US1-03 as a journey since they are sequential login steps",
33
+ "type": "boolean"
34
+ }
35
+ ]
36
+ },
37
+ {
38
+ "id": 2,
39
+ "name": "run-and-report",
40
+ "prompt": "Run E2E tests for increment 0042-auth-flow and produce the AC-mapped report. Tests already exist in e2e/ directory.",
41
+ "expected_output": "Playwright tests executed, e2e-report.json written to reports/ with AC-ID mapping and pass/fail summary",
42
+ "assertions": [
43
+ {
44
+ "id": "a1",
45
+ "text": "Runs npx playwright test with --reporter=json flag",
46
+ "type": "boolean"
47
+ },
48
+ {
49
+ "id": "a2",
50
+ "text": "Writes e2e-report.json to .specweave/increments/0042-auth-flow/reports/ directory",
51
+ "type": "boolean"
52
+ },
53
+ {
54
+ "id": "a3",
55
+ "text": "Report contains summary object with total, passed, failed, skipped counts",
56
+ "type": "boolean"
57
+ },
58
+ {
59
+ "id": "a4",
60
+ "text": "Report results array maps test outcomes to AC-IDs extracted from test titles",
61
+ "type": "boolean"
62
+ }
63
+ ]
64
+ },
65
+ {
66
+ "id": 3,
67
+ "name": "a11y-scan",
68
+ "prompt": "Run E2E tests with accessibility auditing for increment 0042-auth-flow. Include WCAG 2.1 AA compliance checks.",
69
+ "expected_output": "Tests run with @axe-core/playwright scans, violations attached to per-AC results in report",
70
+ "assertions": [
71
+ {
72
+ "id": "a1",
73
+ "text": "Checks for @axe-core/playwright installation before running a11y scans",
74
+ "type": "boolean"
75
+ },
76
+ {
77
+ "id": "a2",
78
+ "text": "Uses AxeBuilder with wcag2a, wcag2aa, wcag21aa tags",
79
+ "type": "boolean"
80
+ },
81
+ {
82
+ "id": "a3",
83
+ "text": "Attaches a11y violations to per-AC result entries in e2e-report.json",
84
+ "type": "boolean"
85
+ },
86
+ {
87
+ "id": "a4",
88
+ "text": "Reports zero violations as { violations: [], passes: N }",
89
+ "type": "boolean"
90
+ }
91
+ ]
92
+ },
93
+ {
94
+ "id": 4,
95
+ "name": "no-playwright-detected",
96
+ "prompt": "Generate E2E tests for increment 0099-new-feature. The project has no playwright.config.ts and no @playwright/test in package.json.",
97
+ "expected_output": "Skill detects missing Playwright and outputs installation instructions without generating files",
98
+ "assertions": [
99
+ {
100
+ "id": "a1",
101
+ "text": "Searches for playwright.config.ts and playwright.config.js in project root and common locations",
102
+ "type": "boolean"
103
+ },
104
+ {
105
+ "id": "a2",
106
+ "text": "Searches for @playwright/test in package.json dependencies",
107
+ "type": "boolean"
108
+ },
109
+ {
110
+ "id": "a3",
111
+ "text": "Outputs error message mentioning npm init playwright@latest for installation",
112
+ "type": "boolean"
113
+ },
114
+ {
115
+ "id": "a4",
116
+ "text": "Does NOT create any test files or e2e-report.json",
117
+ "type": "boolean"
118
+ }
119
+ ]
120
+ }
121
+ ]
122
+ }
@@ -157,15 +157,15 @@ This spawns three parallel reviewers:
157
157
 
158
158
  Generate comprehensive test coverage across all test levels simultaneously. Each agent focuses on a different testing layer and operates independently.
159
159
 
160
- > **Note:** `testing:qa` is the primary orchestration skill for testing workflows. This preset splits its responsibilities into specialized agents for parallel execution.
160
+ > **Note:** SpecWeave testing skills (`sw:tdd-red`, `sw:e2e`, `sw:validate`) provide the testing workflows. This preset splits responsibilities into specialized agents for parallel execution.
161
161
 
162
162
  #### Agent Composition
163
163
 
164
164
  | # | Role | Skill(s) | Owns | Responsibility |
165
165
  |---|------|----------|------|----------------|
166
- | 1 | Unit | `testing:unit` | `tests/unit/` | Write unit tests for individual functions, classes, and modules with proper mocking |
167
- | 2 | E2E | `testing:e2e` | `tests/e2e/` | Write end-to-end tests for user flows, API sequences, and cross-service interactions |
168
- | 3 | Coverage | `testing:qa` | `tests/` (analysis scope) | Analyze coverage gaps, generate missing test cases, ensure threshold compliance |
166
+ | 1 | Unit | `sw:tdd-red` | `tests/unit/` | Write unit tests for individual functions, classes, and modules with proper mocking |
167
+ | 2 | E2E | `sw:e2e` | `tests/e2e/` | Write end-to-end tests for user flows, API sequences, and cross-service interactions |
168
+ | 3 | Coverage | `sw:validate` | `tests/` (analysis scope) | Analyze coverage gaps, generate missing test cases, ensure threshold compliance |
169
169
 
170
170
  #### Execution Chain
171
171
 
@@ -164,7 +164,7 @@ Analyze the feature request and map affected domains to SpecWeave skills.
164
164
  | **Backend** | `sw:architect` | `infra:devops` | API endpoints, services, business logic |
165
165
  | **Database** | `sw:architect` | | Schema design, migrations, seed data |
166
166
  | **Shared/Types** | `sw:architect` | `sw:code-simplifier` | TypeScript interfaces, shared constants, API contracts |
167
- | **Testing** | `testing:qa` | `testing:e2e`, `testing:unit` | Test strategy, E2E suites, integration tests |
167
+ | **Testing** | `sw:e2e` | `sw:tdd-red`, `sw:validate` | Test strategy, E2E suites, integration tests |
168
168
  | **Security** | `sw:security` | `security:patterns` | Auth, authorization, threat modeling, OWASP |
169
169
  | **DevOps** | `infra:devops` | `k8s:deployment-generate`, `infra:observability` | CI/CD, Docker, K8s, monitoring |
170
170
  | **Mobile** | `mobile:react-native` | `mobile:screen-generate`, `mobile:expo` | Native/cross-platform mobile apps |
@@ -416,7 +416,7 @@ Agent definitions live as reusable `.md` files in the `agents/` subdirectory. Wh
416
416
  | Frontend | `agents/frontend.md` | UI, components, pages | 2 (downstream) | `frontend:architect`, `frontend:design` |
417
417
  | Backend | `agents/backend.md` | API, services, middleware | 2 (downstream) | `sw:architect`, `infra:devops` |
418
418
  | Database | `agents/database.md` | Schema, migrations, seeds | 1 (upstream) | `sw:architect` |
419
- | Testing | `agents/testing.md` | Unit, integration, E2E | 2 (downstream) | `testing:qa`, `testing:e2e` |
419
+ | Testing | `agents/testing.md` | Unit, integration, E2E | 2 (downstream) | `sw:e2e`, `sw:tdd-red` |
420
420
  | Security | `agents/security.md` | Auth, validation, audit | 2 (downstream) | `sw:security` |
421
421
 
422
422
  ### How to Use Agent Files
@@ -7,9 +7,9 @@ MASTER SPEC (SOURCE OF TRUTH):
7
7
  Read the master spec BEFORE planning any work.
8
8
 
9
9
  SKILLS TO INVOKE:
10
- Skill({ skill: "testing:qa" })
11
- Skill({ skill: "testing:e2e" }) // for E2E test suites
12
- Skill({ skill: "testing:unit" }) // for unit test coverage
10
+ Skill({ skill: "sw:e2e", args: "--generate [INCREMENT_ID]" }) // generate E2E tests from ACs
11
+ Skill({ skill: "sw:e2e", args: "--run [INCREMENT_ID]" }) // run E2E + produce e2e-report.json
12
+ Skill({ skill: "sw:e2e", args: "--a11y [INCREMENT_ID]" }) // E2E + accessibility audit
13
13
 
14
14
  FILE OWNERSHIP (WRITE access):
15
15
  tests/**