npm - specweave - Versions diffs - 1.0.436 → 1.0.438 - Mend

specweave 1.0.436 → 1.0.438


Files changed (7)

package/README.md +63 -31
package/package.json +2 -2
package/plugins/specweave/skills/e2e/SKILL.md +420 -0
package/plugins/specweave/skills/e2e/evals/evals.json +122 -0
package/plugins/specweave/skills/team-build/SKILL.md +4 -4
package/plugins/specweave/skills/team-lead/SKILL.md +2 -2
package/plugins/specweave/skills/team-lead/agents/testing.md +3 -3

package/README.md CHANGED

@@ -1,6 +1,8 @@
 # SpecWeave
-**The spec-driven Skill Fabric for AI coding agents.** Program your AI in English. Ship features while you sleep.
+**AI-assisted development, under control.**
+Your AI responds to natural language — and now follows a structured, spec-first, quality-gated process every time. Configure your standards once. Every developer, every AI tool, every session enforces them automatically.
 *Works with Claude Code, Cursor, Copilot, Codex, Antigravity & any LLM-powered coding tool.*
@@ -15,36 +17,61 @@ npm install -g specweave   # Requires Node.js 20.12.0+
 ---
-## What Are Skills?
+## No Commands to Memorize
-**Skills are programs written in English** — not prompts, not templates, but reusable logic that controls how AI thinks, decides, and acts.
+SpecWeave is not a workflow you switch into. It is a behavior layer that changes how your AI works — installed once, active in every conversation.
-```
-Without SpecWeave:                          With SpecWeave:
-─────────────────                           ───────────────
-"Use React Hook Form with Zod..."           "Add a login form"
-"Remember, we use Tailwind..."              → AI already knows your patterns.
-"Don't forget the test pattern..."          → It remembered from last time.
-"Wait, I told you this yesterday..."        → Fix once, learned permanently.
-```
+When you describe what you want, your AI routes internally to the right skill. You just work naturally:
-Each skill is a **programmable AI behavior** you can customize without forking. Fix once, remembered permanently. 100+ skills ship out of the box — PM, Architect, QA, Security, DevOps, Frontend, Backend, Mobile, ML.
+| You say | Your AI runs — automatically |
+|---------|------------------------------|
+| "Build me X" / "Let's add Y" | `/sw:increment` → spec + plan + tasks |
+| "Go ahead" / "Build it" | `/sw:auto` → autonomous execution |
+| "Ship it" / "We're done" | `/sw:done` → quality gates + close |
+| "Split this into teams" | `/sw:team-lead` → parallel agents |
+| "Review the code" | `/sw:grill` → critical code review |
-**You don't need to learn Claude Code docs.** SpecWeave handles hooks, plugins, CLAUDE.md, and context management for you. Install, describe your feature, skills do the rest.
+You can also invoke these directly for fine-grained control — but you rarely need to.
+---
+## What You Control
+SpecWeave's behavior is driven by configuration. Define your standards once; every AI interaction in your project enforces them.
+```json
+// .specweave/config.json
+{
+  "testing": {
+    "defaultTestMode": "TDD",       // AI always follows red-green-refactor
+    "tddEnforcement": "strict"      // Tasks cannot close without passing tests
+  },
+  "quality": {
+    "grillRequired": true,          // Code review gate before every close
+    "judgeLlmRequired": true        // Independent AI validation gate
+  },
+  "sync": {
+    "github": true,                 // Auto-sync to GitHub Issues / PRs
+    "jira": true                    // Bidirectional JIRA sync on close
+  }
+}
+```
+This is the difference between **asking** an AI to follow a process and **configuring** it to. No prompting required. No hoping it remembers. The config is the contract.
 ---
 ## The Workflow
-Just describe what you want. SpecWeave handles the rest.
+Just describe what you want. Your AI handles the orchestration.
 ```
 You: "Build me a checkout flow with Stripe"
   ↓
-SpecWeave asks 5-10 clarifying questions
+AI asks 5-10 clarifying questions
   (What payment methods? Guest checkout? Subscriptions? Which UI library?)
   ↓
-Creates: spec.md → plan.md → tasks.md
+Creates: spec.md → plan.md → tasks.md   ← you review the plan here
   ↓
 You: "Go ahead and build it"
   → autonomous execution for hours
@@ -54,13 +81,13 @@ You wake up. Review finished work.
   Tests cover technical correctness. You check the UI and UX.
   ↓
 You: "Looks good, ship it"
-  → validated, documented, shipped.
+  → validated, documented, closed.
 ```
 **Solo developer:**
 ```
 You: "I need user authentication with OAuth and magic links"
-  → SpecWeave interviews you, creates spec + plan + tasks
+  → AI interviews you, creates spec + plan + tasks
 You: "Build it"
   → AI works autonomously for hours
 You: "Ship it"
@@ -81,23 +108,28 @@ You: "Migrate the checkout page to React"
   → TDD-first autonomous execution
 ```
-<details>
-<summary><strong>Under the hood</strong> — SpecWeave auto-activates these skills from natural language:</summary>
+---
-| You say | SpecWeave runs |
-|---------|---------------|
-| "Build me X" | `/sw:increment` → spec + plan + tasks |
-| "Go ahead" / "Build it" | `/sw:auto` → autonomous execution |
-| "Ship it" / "We're done" | `/sw:done` → quality gates + close |
-| "Split this into teams" | `/sw:team-lead` → parallel agents |
-| "Review the code" | `/sw:grill` → critical code review |
+## What Are Skills?
+**Skills are programs written in English** — not prompts, not templates, but reusable logic that controls how AI thinks, decides, and acts.
-You can also invoke commands directly for fine-grained control.
-</details>
+```
+Without SpecWeave:                          With SpecWeave:
+─────────────────                           ───────────────
+"Use React Hook Form with Zod..."           "Add a login form"
+"Remember, we use Tailwind..."              → AI already knows your patterns.
+"Don't forget the test pattern..."          → It remembered from last time.
+"Wait, I told you this yesterday..."        → Fix once, learned permanently.
+```
+Each skill is a **programmable AI behavior** you can customize without forking. Fix once, remembered permanently. 100+ skills ship out of the box — PM, Architect, QA, Security, DevOps, Frontend, Backend, Mobile, ML.
+**You don't need to learn Claude Code docs.** SpecWeave handles hooks, plugins, CLAUDE.md, and context management for you. Install, describe your feature, skills do the rest.
 ---
-## Why SpecWeave?
+## Why Spec-First?
 **The plan is more important than the code.**
@@ -271,7 +303,7 @@ You: "Add dark mode to the app"
 ## Core Commands
-All commands activate automatically from natural language. Use directly for fine-grained control.
+These run automatically from natural language — see the table above. Use directly when you want fine-grained control.
 | Command | Purpose | Natural trigger |
 |---------|---------|----------------|

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "specweave",
-  "version": "1.0.436",
-  "description": "Spec-driven development framework for AI coding agents. Works with Claude Code, Codex, Antigravity, Cursor, Copilot & more. 100+ skills, 49 CLI commands, verified skill certification, autonomous execution, and living documentation.",
+  "version": "1.0.438",
+  "description": "AI-assisted development, under control. Configure your standards once — spec-first, TDD, quality gates — and every AI interaction enforces them automatically. Works with Claude Code, Cursor, Copilot, Codex & more.",
   "type": "module",
   "main": "dist/index.js",
   "bin": {

package/plugins/specweave/skills/e2e/SKILL.md ADDED

@@ -0,0 +1,420 @@
+---
+description: Generate, run, and report Playwright E2E tests traced to spec.md acceptance criteria. Supports accessibility auditing via --a11y. Use when saying "e2e tests", "playwright tests", "run e2e", "generate e2e", "accessibility audit", "a11y test".
+argument-hint: "--generate|--run|--a11y <increment-id>"
+allowed-tools: Read, Write, Edit, Grep, Glob, Bash
+context: fork
+model: sonnet
+---
+# E2E Testing — Playwright + AC Traceability
+## Project Overrides
+!`s="e2e"; for d in .specweave/skill-memories .claude/skill-memories "$HOME/.claude/skill-memories"; do p="$d/$s.md"; [ -f "$p" ] && awk '/^## Learnings$/{ok=1;next}/^## /{ok=0}ok' "$p" && break; done 2>/dev/null; true`
+Generate Playwright E2E tests from spec.md acceptance criteria, run them, and produce a structured report that maps pass/fail results to AC-IDs. Consumed by sw:done Gate 2a for automated closure gating.
+## Modes
+| Flag | Action |
+|------|--------|
+| `--generate <id>` | Read spec.md → create one `.spec.ts` per US with one `test()` per AC |
+| `--run <id>` | Execute `npx playwright test` → parse results → write `e2e-report.json` |
+| `--a11y <id>` | Like `--run` but also scans each page with `@axe-core/playwright` |
+Combine `--run` + `--a11y` to get both functional and accessibility results. `--generate` ignores `--a11y` (warn if combined).
+---
+## Step 1: Parse Arguments
+Extract mode and increment ID from `$ARGUMENTS`:
+```bash
+# Parse: --generate 0042 | --run 0042 | --a11y 0042 | --run --a11y 0042
+MODE="run"       # default
+A11Y=false
+INCREMENT_ID=""
+for arg in $ARGUMENTS; do
+  case "$arg" in
+    --generate) MODE="generate" ;;
+    --run)      MODE="run" ;;
+    --a11y)     A11Y=true ;;
+    *)          INCREMENT_ID="$arg" ;;
+  esac
+done
+```
+If no increment ID provided, check for an active increment:
+```bash
+ACTIVE=$(find .specweave/increments -maxdepth 2 -name "metadata.json" -exec grep -l '"active"' {} \; 2>/dev/null | head -1)
+```
+If still no ID → **STOP**: "No increment ID provided and no active increment found."
+Resolve increment path: `.specweave/increments/<id>/`
+## Step 2: Environment Validation — Playwright Detection
+**MANDATORY before any operation.** Detect Playwright installation:
+```bash
+# 1. Find playwright.config
+PW_CONFIG=$(find . repositories -maxdepth 4 -name "playwright.config.ts" -o -name "playwright.config.js" 2>/dev/null | head -1)
+# 2. Check for @playwright/test in package.json
+PW_PACKAGE=$(grep -r '"@playwright/test"' package.json packages/*/package.json repositories/*/*/package.json 2>/dev/null | head -1)
+```
+**Decision matrix**:
+| Config | Package | Action |
+|--------|---------|--------|
+| Found | Found | **Proceed** — use config path |
+| Missing | Found | **FAIL**: "Playwright installed but no config found. Run `npm init playwright@latest` to create playwright.config.ts" |
+| Missing | Missing | **FAIL**: "Playwright not installed. Run `npm init playwright@latest` to set up E2E testing" |
+| Found | Missing | **Proceed** with warning: "Playwright config found but package not in package.json (global install?)" |
+Store `PW_CONFIG` path for later use.
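The decision matrix above can be read as a small pure function. The following is only a sketch: `PlaywrightCheck` and `checkPlaywright` are hypothetical names that do not appear in SpecWeave, and the messages are shortened versions of the ones in the table.

```typescript
// Hypothetical result type for the Step 2 check (not part of SpecWeave).
interface PlaywrightCheck {
  proceed: boolean;
  warning: string | null;
  error: string | null;
}

// configPath: path of the detected playwright.config file, or null if none was found.
// packageFound: whether "@playwright/test" was found in a package.json.
function checkPlaywright(configPath: string | null, packageFound: boolean): PlaywrightCheck {
  if (configPath !== null && packageFound) {
    // Found | Found: proceed using the config path.
    return { proceed: true, warning: null, error: null };
  }
  if (configPath !== null) {
    // Found | Missing: proceed, but surface a warning.
    return {
      proceed: true,
      warning: "Playwright config found but package not in package.json (global install?)",
      error: null,
    };
  }
  if (packageFound) {
    // Missing | Found: fail, a config file is required.
    return { proceed: false, warning: null, error: "Playwright installed but no config found" };
  }
  // Missing | Missing: fail, Playwright is not set up at all.
  return { proceed: false, warning: null, error: "Playwright not installed" };
}
```

Keeping the check free of file-system access makes each of the four rows easy to test on its own; the caller would supply the two inputs from the `find` and `grep` commands in this step.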
+## Step 3: Read spec.md — AC Extraction
+Parse the increment's spec.md to extract acceptance criteria:
+```bash
+# Extract ACs: matches both [ ] and [x] checkboxes
+grep -E '^\s*-\s*\[[ x]\]\s*\*\*AC-' .specweave/increments/<id>/spec.md
+```
+**Parsing algorithm**:
+1. Read `.specweave/increments/<id>/spec.md`
+2. For each line matching `- [[ x]] **AC-USx-xx**: <text>`:
+   - Extract AC-ID (e.g., `AC-US1-01`)
+   - Extract description text (the Given/When/Then or plain text after the colon)
+   - Derive parent US-ID from AC prefix (e.g., `AC-US1-01` → `US-001`)
+   - Flag `hasGWT` if text contains "Given" AND "When" AND "Then"
+3. Group ACs by parent US-ID
+4. Detect journey sequences: ACs under the same US that describe sequential steps on the same page
+**Edge cases**:
+- **No ACs found**: Output "No acceptance criteria found in spec.md — nothing to generate" and exit cleanly
+- **ACs without Given/When/Then**: Generate a test stub with `// TODO: AC text does not follow GWT format — implement test manually`
+- **Duplicate AC-IDs**: Warn, append `-dup1` suffix to the test name
+Store the parsed result as a structured list for subsequent steps.
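As an illustration of the parsing algorithm in this step, here is a minimal sketch that extracts AC lines, derives the parent US-ID, flags Given/When/Then text, and groups by story. The names `ParsedAc` and `parseAcs` are hypothetical, the regular expression assumes the `- [ ] **AC-USx-xx**: <text>` shape described above, and journey detection and duplicate handling are left out.

```typescript
// Hypothetical shape for one parsed acceptance criterion (not part of SpecWeave).
interface ParsedAc {
  acId: string;    // e.g. "AC-US1-01"
  usId: string;    // e.g. "US-001", derived from the AC prefix
  text: string;    // description after the colon
  hasGWT: boolean; // true when the text contains Given, When and Then
}

// Matches "- [ ] **AC-US1-01**: text" and "- [x] **AC-US1-01**: text".
const AC_LINE = /^\s*-\s*\[[ x]\]\s*\*\*(AC-US(\d+)-\d+)\*\*:\s*(.*)$/;

// Returns the ACs grouped by parent US-ID, in the order they appear in spec.md.
function parseAcs(specMarkdown: string): Record<string, ParsedAc[]> {
  const byStory: Record<string, ParsedAc[]> = {};
  for (const line of specMarkdown.split(/\r?\n/)) {
    const m = AC_LINE.exec(line);
    if (m === null) continue; // not an AC line
    const storyNumber = m[2];
    // Zero-pad the story number to at least three digits: "1" -> "001".
    const padded = storyNumber.length >= 3 ? storyNumber : ("000" + storyNumber).slice(-3);
    const usId = "US-" + padded;
    const text = m[3].trim();
    const ac: ParsedAc = {
      acId: m[1],
      usId: usId,
      text: text,
      hasGWT: /Given/.test(text) && /When/.test(text) && /Then/.test(text),
    };
    if (byStory[usId] === undefined) byStory[usId] = [];
    byStory[usId].push(ac);
  }
  return byStory;
}
```

An empty result object corresponds to the "No ACs found" edge case, so a caller could check `Object.keys(result).length === 0` before generating anything.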
+---
+## Step 4: Generate Mode (`--generate`)
+**Goal**: Create Playwright test files from extracted ACs.
+### 4a. Determine Output Directory
+```bash
+# Read testDir from playwright config, default to e2e/
+TEST_DIR=$(grep -oP "testDir:\s*['\"]([^'\"]+)" "$PW_CONFIG" | head -1 | sed "s/testDir:\s*['\"]//")
+TEST_DIR="${TEST_DIR:-e2e}"
+mkdir -p "$TEST_DIR"
+```
+### 4b. Generate Test Files
+For each user story, create `{TEST_DIR}/us-{NNN}.spec.ts`:
+**Template for standard ACs** (one test per AC):
+```typescript
+import { test, expect } from '@playwright/test';
+test.describe('US-001: <User Story Title>', () => {
+  test('AC-US1-01: <AC description summary>', async ({ page }) => {
+    // Given: <given clause>
+    // When: <when clause>
+    // Then: <then clause>
+    // TODO: Implement test steps
+    // AC text: <full AC text>
+  });
+  test('AC-US1-02: <AC description summary>', async ({ page }) => {
+    // ...
+  });
+});
+```
+**Template for journey ACs** (grouped into one test):
+When multiple ACs under the same US describe sequential steps (e.g., "user sees form" → "user submits form" → "user sees confirmation"), group them:
+```typescript
+test('AC-US1-01 → AC-US1-03: <journey description>', async ({ page }) => {
+  // --- AC-US1-01: <description> ---
+  // Given/When/Then steps...
+  // --- AC-US1-02: <description> ---
+  // Given/When/Then steps...
+  // --- AC-US1-03: <description> ---
+  // Given/When/Then steps...
+});
+```
+### 4c. Post-Generate Summary
+Output:
+```
+Generated E2E tests:
+  {TEST_DIR}/us-001.spec.ts (3 ACs: AC-US1-01, AC-US1-02, AC-US1-03)
+  {TEST_DIR}/us-002.spec.ts (2 ACs: AC-US2-01, AC-US2-02)
+Total: 5 tests across 2 files
+Next: Implement test steps, then run with /sw:e2e --run <id>
+```
+---
+## Step 5: Run Mode (`--run`)
+**Goal**: Execute Playwright tests and produce AC-mapped `e2e-report.json`.
+### 5a. Execute Playwright
+```bash
+# Run with the JSON reporter; only stdout (the JSON report) is written to the file
+npx playwright test --reporter=json | tee /tmp/pw-results.json
+# Capture Playwright's exit code, not tee's (PIPESTATUS is bash-specific)
+PW_EXIT=${PIPESTATUS[0]}
+```
+If Playwright exits non-zero, that's expected for failing tests — continue to report generation.
+### 5b. Parse Results and Map to ACs
+1. Read the JSON reporter output
+2. For each test result:
+   - Extract test title
+   - Match AC-ID from title using regex: `/AC-US\d+-\d+/`
+   - Map to status: `passed` → `pass`, `failed` → `fail`, `skipped` → `skip`
+   - Extract duration and error message (if failed)
+3. Tests without AC-IDs in title → report under `acId: "UNMAPPED"`
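The mapping rules above could look roughly like the sketch below. It assumes the nested JSON reporter output has already been flattened into simple records; `FlatResult`, `AcResult`, `toAcResult`, and `summarize` are hypothetical names, and treating every status other than `passed` and `skipped` as a failure is an assumption made here so that timeouts are reported as `fail`.

```typescript
// Hypothetical flattened view of one test result (not the raw reporter schema).
interface FlatResult {
  title: string;
  file: string;
  status: string; // status string as reported by the test runner
  durationMs: number;
  error: string | null;
}

type AcStatus = "pass" | "fail" | "skip";

interface AcResult {
  acId: string;
  testFile: string;
  status: AcStatus;
  duration: number;
  error: string | null;
}

const AC_ID = /AC-US\d+-\d+/;

function toAcResult(r: FlatResult): AcResult {
  const match = AC_ID.exec(r.title);
  let status: AcStatus;
  if (r.status === "passed") status = "pass";
  else if (r.status === "skipped") status = "skip";
  else status = "fail"; // assumption: failed, timed out, and anything else count as failures
  return {
    acId: match !== null ? match[0] : "UNMAPPED", // tests without an AC-ID in the title
    testFile: r.file,
    status: status,
    duration: r.durationMs,
    error: status === "fail" ? r.error : null,
  };
}

// Builds the "summary" object of e2e-report.json from the mapped results.
function summarize(results: AcResult[]): { total: number; passed: number; failed: number; skipped: number } {
  const summary = { total: results.length, passed: 0, failed: 0, skipped: 0 };
  for (const r of results) {
    if (r.status === "pass") summary.passed++;
    else if (r.status === "fail") summary.failed++;
    else summary.skipped++;
  }
  return summary;
}
```

Note that the single-match regex only picks up the first AC-ID in a title, so a journey test titled `AC-US1-01 → AC-US1-03` would be recorded under `AC-US1-01` alone; covering every AC in a journey would need additional handling that the source does not specify.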
+### 5c. Write e2e-report.json
+Write to `.specweave/increments/<id>/reports/e2e-report.json`:
+```json
+{
+  "incrementId": "<id>",
+  "timestamp": "<ISO-8601>",
+  "mode": "run",
+  "playwrightConfig": "<path to playwright.config.ts>",
+  "summary": {
+    "total": 5,
+    "passed": 4,
+    "failed": 1,
+    "skipped": 0
+  },
+  "results": [
+    {
+      "acId": "AC-US1-01",
+      "testFile": "e2e/us-001.spec.ts",
+      "status": "pass",
+      "duration": 1234,
+      "error": null
+    },
+    {
+      "acId": "AC-US1-02",
+      "testFile": "e2e/us-001.spec.ts",
+      "status": "fail",
+      "duration": 5678,
+      "error": "Expected element to be visible but it was hidden"
+    }
+  ]
+}
+```
+### 5d. Output Summary
+```
+E2E Results for increment <id>:
+  Total: 5 | Passed: 4 | Failed: 1 | Skipped: 0
+  FAILED:
+    AC-US1-02: Expected element to be visible but it was hidden (us-001.spec.ts)
+  Report: .specweave/increments/<id>/reports/e2e-report.json
+```
+If `summary.failed > 0`:
+```
+⚠ E2E tests have failures. Fix before closing increment.
+```
+If `summary.failed === 0`:
+```
+All E2E tests passed. Gate 2a will allow closure.
+```
+---
+## Step 6: A11y Mode (`--a11y`)
+**Goal**: Extend run mode with accessibility scanning via `@axe-core/playwright`.
+### 6a. Check axe-core Installation
+```bash
+grep -q '"@axe-core/playwright"' package.json 2>/dev/null
+```
+If not installed:
+```
+@axe-core/playwright is not installed. Install it:
+  npm install -D @axe-core/playwright axe-core
+Then re-run: /sw:e2e --a11y <id>
+```
+### 6b. Inject A11y Scans
+For tests that should include accessibility scans, add the following after each test's primary assertions:
+```typescript
+import AxeBuilder from '@axe-core/playwright';
+// After primary test assertions:
+const a11yResults = await new AxeBuilder({ page })
+  .withTags(['wcag2a', 'wcag2aa', 'wcag21aa'])
+  .analyze();
+```
+When running existing tests, the a11y scan must already be part of the test code. If the tests lack it, recommend adding the scan shown above before re-running with `--a11y`.
+### 6c. A11y Report Extension
+For each AC result, attach a11y data:
+```json
+{
+  "acId": "AC-US1-01",
+  "status": "pass",
+  "a11y": {
+    "violations": [
+      {
+        "rule": "color-contrast",
+        "impact": "serious",
+        "description": "Elements must have sufficient color contrast",
+        "nodes": 3,
+        "helpUrl": "https://dequeuniversity.com/rules/axe/4.7/color-contrast"
+      }
+    ],
+    "passes": 42
+  }
+}
+```
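One way to produce the per-AC `a11y` object above from an axe results object is sketched here. `AxeRuleLike`, `AxeResultsLike`, and `toA11yReport` are hypothetical local names that model only the fields used. The source does not say which axe field feeds `description`, so using axe's `help` string is an assumption, chosen because its wording matches the sample.

```typescript
// Minimal local model of the parts of an axe results object used here.
interface AxeRuleLike {
  id: string;
  impact?: string | null;
  help: string;
  helpUrl: string;
  nodes: unknown[]; // one entry per affected element
}

interface AxeResultsLike {
  violations: AxeRuleLike[];
  passes: AxeRuleLike[];
}

interface A11yViolation {
  rule: string;
  impact: string | null;
  description: string;
  nodes: number; // count of affected elements, not the elements themselves
  helpUrl: string;
}

// Reduces a full axe result to the compact shape stored in e2e-report.json.
function toA11yReport(results: AxeResultsLike): { violations: A11yViolation[]; passes: number } {
  return {
    violations: results.violations.map((v) => ({
      rule: v.id,
      impact: v.impact === undefined ? null : v.impact,
      description: v.help, // assumption: the report's description comes from axe's help text
      nodes: v.nodes.length,
      helpUrl: v.helpUrl,
    })),
    passes: results.passes.length,
  };
}
```

With no violations this returns `{ violations: [], passes: N }`, which keeps the report shape stable whether or not a page has problems.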
+### 6d. Standalone A11y (No AC Context)
+When `--a11y` runs without `--generate` context (pre-existing tests without AC-IDs):
+- Group violations by page URL instead of AC-ID
+- Write to the top-level `a11y` field in the report:
+```json
+{
+  "a11y": {
+    "violations": [
+      {
+        "pageUrl": "/login",
+        "rule": "color-contrast",
+        "impact": "serious",
+        "nodes": 2
+      }
+    ],
+    "passes": 87
+  }
+}
+```
+### 6e. axe-core Rule Tags Reference
+| Tag | Meaning |
+|-----|---------|
+| `wcag2a` / `wcag2aa` | WCAG 2.0 Level A / AA |
+| `wcag21aa` / `wcag22aa` | WCAG 2.1 / 2.2 Level AA |
+| `best-practice` | Non-WCAG best practices |
+Default: `['wcag2a', 'wcag2aa', 'wcag21aa']` (covers standard compliance).
+---
+## Step 7: Report Schema Reference
+### e2e-report.json (complete)
+```json
+{
+  "incrementId": "string",
+  "timestamp": "ISO-8601",
+  "mode": "run | generate | a11y",
+  "playwrightConfig": "path/to/playwright.config.ts",
+  "summary": {
+    "total": 0,
+    "passed": 0,
+    "failed": 0,
+    "skipped": 0
+  },
+  "results": [
+    {
+      "acId": "AC-US1-01",
+      "testFile": "e2e/us-001.spec.ts",
+      "status": "pass | fail | skip",
+      "duration": 1234,
+      "error": null,
+      "a11y": {
+        "violations": [],
+        "passes": 0
+      }
+    }
+  ],
+  "a11y": {
+    "violations": [],
+    "passes": 0
+  }
+}
+```
+**Consumption by sw:done Gate 2a**:
+1. Read `.specweave/increments/<id>/reports/e2e-report.json`
+2. If `summary.failed > 0` → **BLOCK closure**
+3. If report missing → **BLOCK closure** (report must exist after sw:e2e invocation)
+4. If `summary.failed === 0` → **PASS gate**
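The gate rules above reduce to a short decision function. This is a sketch of the logic as described, not the actual `sw:done` implementation: `E2eReportLike`, `GateDecision`, and `evaluateGate2a` are hypothetical names, and a missing report is represented by `null`.

```typescript
// Only the part of e2e-report.json that the gate needs.
interface E2eReportLike {
  summary: { total: number; passed: number; failed: number; skipped: number };
}

interface GateDecision {
  pass: boolean;
  reason: string;
}

// report is null when e2e-report.json does not exist for the increment.
function evaluateGate2a(report: E2eReportLike | null): GateDecision {
  if (report === null) {
    return { pass: false, reason: "e2e-report.json is missing" };
  }
  if (report.summary.failed > 0) {
    return { pass: false, reason: report.summary.failed + " E2E test(s) failed" };
  }
  return { pass: true, reason: "no failing E2E tests" };
}
```

Checking for the missing report first means an absent file can never be mistaken for a run with zero failures.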
+---
+## Step 8: Edge Cases and Error Handling
+| Scenario | Behavior |
+|----------|----------|
+| No spec.md | "spec.md not found at increment path. Run /sw:increment first." |
+| spec.md with no ACs | "No acceptance criteria found in spec.md — nothing to generate." Exit cleanly. |
+| ACs without GWT format | Generate test stub with `// TODO: implement` comment |
+| Duplicate AC-IDs | Warn, append `-dup1` suffix |
+| Playwright timeout on a test | Report as `status: "fail"`, `error: "Test timed out after Xms"` |
+| `--generate` + `--a11y` combined | Warn: "a11y flag is only used with --run. Generating without a11y scans." |
+| No Playwright config | FAIL with installation instructions (see Step 2) |
+| Pre-existing tests without AC-IDs | Map to `acId: "UNMAPPED"` in results |
+| JSON reporter not available | Fall back to parsing Playwright stdout for pass/fail counts |
+## Anti-Rationalization
+| Excuse | Rebuttal |
+|--------|----------|
+| "Tests are too simple to need AC tracing" | Tracing is free — it costs one string in the test title. Skip it and you lose audit trail. |
+| "I'll add AC-IDs later" | You won't. Generate with `--generate` and they're there from the start. |
+| "Accessibility can wait" | WCAG violations caught during development are generally much cheaper to fix than ones found after release. Use `--a11y`. |
+| "The report is overkill for a small project" | Gate 2a reads the report. No report = no closure. The schema is fixed overhead, not per-test. |

package/plugins/specweave/skills/e2e/evals/evals.json ADDED

@@ -0,0 +1,122 @@
+{
+  "skill_name": "sw:e2e",
+  "evals": [
+    {
+      "id": 1,
+      "name": "generate-from-spec",
+      "prompt": "Generate E2E tests for increment 0042-auth-flow. The spec.md has 2 user stories: US-001 (Login) with AC-US1-01 (see login form), AC-US1-02 (submit credentials), AC-US1-03 (see dashboard after login); US-002 (Logout) with AC-US2-01 (click logout), AC-US2-02 (redirected to login page).",
+      "expected_output": "Playwright test files created in e2e/ directory with one file per user story and AC-IDs in test titles",
+      "assertions": [
+        {
+          "id": "a1",
+          "text": "Reads spec.md from .specweave/increments/0042-auth-flow/spec.md to extract ACs",
+          "type": "boolean"
+        },
+        {
+          "id": "a2",
+          "text": "Creates e2e/us-001.spec.ts with test titles containing AC-US1-01, AC-US1-02, AC-US1-03",
+          "type": "boolean"
+        },
+        {
+          "id": "a3",
+          "text": "Creates e2e/us-002.spec.ts with test titles containing AC-US2-01, AC-US2-02",
+          "type": "boolean"
+        },
+        {
+          "id": "a4",
+          "text": "Checks for playwright.config.ts before generating test files",
+          "type": "boolean"
+        },
+        {
+          "id": "a5",
+          "text": "Groups AC-US1-01 through AC-US1-03 as a journey since they are sequential login steps",
+          "type": "boolean"
+        }
+      ]
+    },
+    {
+      "id": 2,
+      "name": "run-and-report",
+      "prompt": "Run E2E tests for increment 0042-auth-flow and produce the AC-mapped report. Tests already exist in e2e/ directory.",
+      "expected_output": "Playwright tests executed, e2e-report.json written to reports/ with AC-ID mapping and pass/fail summary",
+      "assertions": [
+        {
+          "id": "a1",
+          "text": "Runs npx playwright test with --reporter=json flag",
+          "type": "boolean"
+        },
+        {
+          "id": "a2",
+          "text": "Writes e2e-report.json to .specweave/increments/0042-auth-flow/reports/ directory",
+          "type": "boolean"
+        },
+        {
+          "id": "a3",
+          "text": "Report contains summary object with total, passed, failed, skipped counts",
+          "type": "boolean"
+        },
+        {
+          "id": "a4",
+          "text": "Report results array maps test outcomes to AC-IDs extracted from test titles",
+          "type": "boolean"
+        }
+      ]
+    },
+    {
+      "id": 3,
+      "name": "a11y-scan",
+      "prompt": "Run E2E tests with accessibility auditing for increment 0042-auth-flow. Include WCAG 2.1 AA compliance checks.",
+      "expected_output": "Tests run with @axe-core/playwright scans, violations attached to per-AC results in report",
+      "assertions": [
+        {
+          "id": "a1",
+          "text": "Checks for @axe-core/playwright installation before running a11y scans",
+          "type": "boolean"
+        },
+        {
+          "id": "a2",
+          "text": "Uses AxeBuilder with wcag2a, wcag2aa, wcag21aa tags",
+          "type": "boolean"
+        },
+        {
+          "id": "a3",
+          "text": "Attaches a11y violations to per-AC result entries in e2e-report.json",
+          "type": "boolean"
+        },
+        {
+          "id": "a4",
+          "text": "Reports zero violations as { violations: [], passes: N }",
+          "type": "boolean"
+        }
+      ]
+    },
+    {
+      "id": 4,
+      "name": "no-playwright-detected",
+      "prompt": "Generate E2E tests for increment 0099-new-feature. The project has no playwright.config.ts and no @playwright/test in package.json.",
+      "expected_output": "Skill detects missing Playwright and outputs installation instructions without generating files",
+      "assertions": [
+        {
+          "id": "a1",
+          "text": "Searches for playwright.config.ts and playwright.config.js in project root and common locations",
+          "type": "boolean"
+        },
+        {
+          "id": "a2",
+          "text": "Searches for @playwright/test in package.json dependencies",
+          "type": "boolean"
+        },
+        {
+          "id": "a3",
+          "text": "Outputs error message mentioning npm init playwright@latest for installation",
+          "type": "boolean"
+        },
+        {
+          "id": "a4",
+          "text": "Does NOT create any test files or e2e-report.json",
+          "type": "boolean"
+        }
+      ]
+    }
+  ]
+}

package/plugins/specweave/skills/team-build/SKILL.md CHANGED

@@ -157,15 +157,15 @@ This spawns three parallel reviewers:
 Generate comprehensive test coverage across all test levels simultaneously. Each agent focuses on a different testing layer and operates independently.
-> **Note:** `testing:qa` is the primary orchestration skill for testing workflows. This preset splits its responsibilities into specialized agents for parallel execution.
+> **Note:** SpecWeave testing skills (`sw:tdd-red`, `sw:e2e`, `sw:validate`) provide the testing workflows. This preset splits responsibilities into specialized agents for parallel execution.
 #### Agent Composition
 | # | Role | Skill(s) | Owns | Responsibility |
 |---|------|----------|------|----------------|
-| 1 | Unit | `testing:unit` | `tests/unit/` | Write unit tests for individual functions, classes, and modules with proper mocking |
-| 2 | E2E | `testing:e2e` | `tests/e2e/` | Write end-to-end tests for user flows, API sequences, and cross-service interactions |
-| 3 | Coverage | `testing:qa` | `tests/` (analysis scope) | Analyze coverage gaps, generate missing test cases, ensure threshold compliance |
+| 1 | Unit | `sw:tdd-red` | `tests/unit/` | Write unit tests for individual functions, classes, and modules with proper mocking |
+| 2 | E2E | `sw:e2e` | `tests/e2e/` | Write end-to-end tests for user flows, API sequences, and cross-service interactions |
+| 3 | Coverage | `sw:validate` | `tests/` (analysis scope) | Analyze coverage gaps, generate missing test cases, ensure threshold compliance |
 #### Execution Chain

package/plugins/specweave/skills/team-lead/SKILL.md CHANGED

@@ -164,7 +164,7 @@ Analyze the feature request and map affected domains to SpecWeave skills.
 | **Backend** | `sw:architect` | `infra:devops` | API endpoints, services, business logic |
 | **Database** | `sw:architect` | | Schema design, migrations, seed data |
 | **Shared/Types** | `sw:architect` | `sw:code-simplifier` | TypeScript interfaces, shared constants, API contracts |
-| **Testing** | `testing:qa` | `testing:e2e`, `testing:unit` | Test strategy, E2E suites, integration tests |
+| **Testing** | `sw:e2e` | `sw:tdd-red`, `sw:validate` | Test strategy, E2E suites, integration tests |
 | **Security** | `sw:security` | `security:patterns` | Auth, authorization, threat modeling, OWASP |
 | **DevOps** | `infra:devops` | `k8s:deployment-generate`, `infra:observability` | CI/CD, Docker, K8s, monitoring |
 | **Mobile** | `mobile:react-native` | `mobile:screen-generate`, `mobile:expo` | Native/cross-platform mobile apps |
@@ -416,7 +416,7 @@ Agent definitions live as reusable `.md` files in the `agents/` subdirectory. Wh
 | Frontend | `agents/frontend.md` | UI, components, pages | 2 (downstream) | `frontend:architect`, `frontend:design` |
 | Backend | `agents/backend.md` | API, services, middleware | 2 (downstream) | `sw:architect`, `infra:devops` |
 | Database | `agents/database.md` | Schema, migrations, seeds | 1 (upstream) | `sw:architect` |
-| Testing | `agents/testing.md` | Unit, integration, E2E | 2 (downstream) | `testing:qa`, `testing:e2e` |
+| Testing | `agents/testing.md` | Unit, integration, E2E | 2 (downstream) | `sw:e2e`, `sw:tdd-red` |
 | Security | `agents/security.md` | Auth, validation, audit | 2 (downstream) | `sw:security` |
 ### How to Use Agent Files

package/plugins/specweave/skills/team-lead/agents/testing.md CHANGED

@@ -7,9 +7,9 @@ MASTER SPEC (SOURCE OF TRUTH):
   Read the master spec BEFORE planning any work.
 SKILLS TO INVOKE:
-  Skill({ skill: "testing:qa" })
-  Skill({ skill: "testing:e2e" })        // for E2E test suites
-  Skill({ skill: "testing:unit" })       // for unit test coverage
+  Skill({ skill: "sw:e2e", args: "--generate [INCREMENT_ID]" })  // generate E2E tests from ACs
+  Skill({ skill: "sw:e2e", args: "--run [INCREMENT_ID]" })       // run E2E + produce e2e-report.json
+  Skill({ skill: "sw:e2e", args: "--a11y [INCREMENT_ID]" })      // E2E + accessibility audit
 FILE OWNERSHIP (WRITE access):
   tests/**