npm - @brunosps00/dev-workflow - Versions diffs - 0.8.0 → 0.9.0 - Mend

@brunosps00/dev-workflow 0.8.0 → 0.9.0


Files changed (73)

package/README.md +18 -14
package/bin/dev-workflow.js +1 -1
package/lib/constants.js +8 -2
package/lib/init.js +6 -0
package/lib/install-deps.js +0 -5
package/lib/migrate-gsd.js +164 -0
package/lib/uninstall.js +2 -2
package/package.json +1 -1
package/scaffold/en/commands/dw-analyze-project.md +6 -11
package/scaffold/en/commands/dw-autopilot.md +6 -13
package/scaffold/en/commands/dw-brainstorm.md +1 -1
package/scaffold/en/commands/dw-code-review.md +6 -5
package/scaffold/en/commands/dw-create-prd.md +5 -4
package/scaffold/en/commands/dw-create-techspec.md +5 -4
package/scaffold/en/commands/dw-execute-phase.md +149 -0
package/scaffold/en/commands/dw-fix-qa.md +34 -13
package/scaffold/en/commands/dw-help.md +5 -2
package/scaffold/en/commands/dw-intel.md +98 -29
package/scaffold/en/commands/dw-map-codebase.md +125 -0
package/scaffold/en/commands/dw-new-project.md +1 -1
package/scaffold/en/commands/dw-plan-checker.md +144 -0
package/scaffold/en/commands/dw-quick.md +30 -12
package/scaffold/en/commands/dw-redesign-ui.md +5 -9
package/scaffold/en/commands/dw-refactoring-analysis.md +6 -5
package/scaffold/en/commands/dw-resume.md +10 -8
package/scaffold/en/commands/dw-run-plan.md +14 -20
package/scaffold/en/commands/dw-run-qa.md +124 -23
package/scaffold/en/commands/dw-run-task.md +5 -4
package/scaffold/en/commands/dw-update.md +3 -1
package/scaffold/en/templates/idea-onepager.md +1 -1
package/scaffold/pt-br/commands/dw-analyze-project.md +6 -11
package/scaffold/pt-br/commands/dw-autopilot.md +6 -13
package/scaffold/pt-br/commands/dw-brainstorm.md +1 -1
package/scaffold/pt-br/commands/dw-code-review.md +6 -5
package/scaffold/pt-br/commands/dw-create-prd.md +5 -4
package/scaffold/pt-br/commands/dw-create-techspec.md +5 -4
package/scaffold/pt-br/commands/dw-execute-phase.md +149 -0
package/scaffold/pt-br/commands/dw-fix-qa.md +34 -13
package/scaffold/pt-br/commands/dw-help.md +5 -2
package/scaffold/pt-br/commands/dw-intel.md +98 -29
package/scaffold/pt-br/commands/dw-map-codebase.md +125 -0
package/scaffold/pt-br/commands/dw-new-project.md +1 -1
package/scaffold/pt-br/commands/dw-plan-checker.md +144 -0
package/scaffold/pt-br/commands/dw-quick.md +30 -12
package/scaffold/pt-br/commands/dw-redesign-ui.md +5 -9
package/scaffold/pt-br/commands/dw-refactoring-analysis.md +6 -5
package/scaffold/pt-br/commands/dw-resume.md +10 -8
package/scaffold/pt-br/commands/dw-run-plan.md +16 -22
package/scaffold/pt-br/commands/dw-run-qa.md +124 -23
package/scaffold/pt-br/commands/dw-run-task.md +5 -4
package/scaffold/pt-br/commands/dw-update.md +3 -1
package/scaffold/pt-br/templates/idea-onepager.md +1 -1
package/scaffold/skills/api-testing-recipes/SKILL.md +104 -0
package/scaffold/skills/api-testing-recipes/recipes/dotnet-webapp-factory.md +168 -0
package/scaffold/skills/api-testing-recipes/recipes/http-rest-client.md +130 -0
package/scaffold/skills/api-testing-recipes/recipes/pytest-httpx.md +157 -0
package/scaffold/skills/api-testing-recipes/recipes/rust-reqwest.md +173 -0
package/scaffold/skills/api-testing-recipes/recipes/supertest-node.md +153 -0
package/scaffold/skills/api-testing-recipes/references/auth-patterns.md +138 -0
package/scaffold/skills/api-testing-recipes/references/log-conventions.md +117 -0
package/scaffold/skills/api-testing-recipes/references/matrix-conventions.md +68 -0
package/scaffold/skills/api-testing-recipes/references/openapi-driven.md +97 -0
package/scaffold/skills/dw-codebase-intel/SKILL.md +101 -0
package/scaffold/skills/dw-codebase-intel/agents/intel-updater.md +318 -0
package/scaffold/skills/dw-codebase-intel/references/incremental-update.md +79 -0
package/scaffold/skills/dw-codebase-intel/references/intel-format.md +208 -0
package/scaffold/skills/dw-codebase-intel/references/query-patterns.md +148 -0
package/scaffold/skills/dw-execute-phase/SKILL.md +133 -0
package/scaffold/skills/dw-execute-phase/agents/executor.md +264 -0
package/scaffold/skills/dw-execute-phase/agents/plan-checker.md +215 -0
package/scaffold/skills/dw-execute-phase/references/atomic-commits.md +143 -0
package/scaffold/skills/dw-execute-phase/references/plan-verification.md +156 -0
package/scaffold/skills/dw-execute-phase/references/wave-coordination.md +102 -0

package/scaffold/skills/api-testing-recipes/references/auth-patterns.md ADDED

@@ -0,0 +1,138 @@
+# Auth patterns — how to wire credentials into API tests
+Tests need real credentials, but credentials must never live in the script files (which are committed). This file describes how each recipe handles the four common auth schemes and where credentials come from.
+## The four schemes
+| Scheme | How it travels | Recipe handling |
+|--------|----------------|-----------------|
+| **Bearer JWT** | `Authorization: Bearer <token>` | Most common. The token comes from a login response or is pre-issued for QA. |
+| **Cookie session** | `Cookie: session=<sid>` (set by `Set-Cookie` on login) | Recipes capture the cookie from a login call and replay it. |
+| **API key** | `X-API-Key: <key>` (header) or `?api_key=<key>` (query) | Header form is preferred; key comes from a per-environment env var. |
+| **Basic auth** | `Authorization: Basic <base64(user:pass)>` | Rare in modern APIs; supported but discouraged. |
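A minimal Python sketch of how the four schemes in the table above map onto request headers. The helper name `auth_headers` and its parameters are hypothetical and exist only for illustration; the `session` cookie name and the `X-API-Key` header are taken from the table.

```python
import base64


def auth_headers(scheme: str, secret: str, user: str = "") -> dict[str, str]:
    """Return the header that carries `secret` for the given scheme (hypothetical helper)."""
    if scheme == "bearer":
        return {"Authorization": f"Bearer {secret}"}
    if scheme == "cookie":
        # The session id normally comes from a Set-Cookie header on login.
        return {"Cookie": f"session={secret}"}
    if scheme == "api_key":
        return {"X-API-Key": secret}
    if scheme == "basic":
        # Basic auth sends base64("user:pass"); here `secret` is the password.
        encoded = base64.b64encode(f"{user}:{secret}".encode()).decode()
        return {"Authorization": f"Basic {encoded}"}
    raise ValueError(f"unknown auth scheme: {scheme}")
```

Keeping the scheme-to-header mapping in one place means a test only names the scheme and the env var that holds the secret.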
+## Where credentials come from (in priority order)
+1. **`.env` file** at the repo root, gitignored. Contains `QA_TOKEN_ADMIN`, `QA_ADMIN_EMAIL`, `QA_ADMIN_PASSWORD`, etc.
+2. **Pre-issued QA tokens** — long-lived JWTs minted by an admin tool (e.g., a `make qa-tokens` target) and stored in `.env`. Best for CI; avoids login-time flake.
+3. **Login at runtime** — a setup request hits `/auth/login` with `QA_ADMIN_EMAIL` + `QA_ADMIN_PASSWORD` and captures the token. Use when no pre-issued option exists.
+4. **`.dw/templates/qa-test-credentials.md`** — the project-level QA credentials registry that `dw-run-qa` already reads (UI mode). API mode reads the same file for env-var hints + role mapping.
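The lookup order above can be sketched in Python as follows. This is an illustration under assumptions, not the package's implementation: `resolve_admin_token` and the `login` callback are hypothetical names, and the `.env` file is assumed to have been loaded into the process environment already.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_admin_token(
    env: Mapping[str, str] = os.environ,
    login: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Prefer a pre-issued token; fall back to a runtime login (hypothetical helper)."""
    token = env.get("QA_TOKEN_ADMIN", "")
    if token:
        return token
    email = env.get("QA_ADMIN_EMAIL", "")
    password = env.get("QA_ADMIN_PASSWORD", "")
    if login is not None and email and password:
        # `login` is expected to POST to /auth/login and return the access token.
        return login(email, password)
    raise RuntimeError("no QA admin credentials found in the environment")
```

Passing `env` and `login` as parameters keeps the function testable without real credentials or a running API.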
+## Three roles every project should have
+Even for single-tenant apps, define at minimum:
+- **`token_admin`** — has every permission. Used for setup (create test data) and teardown.
+- **`token_user`** — regular authenticated user. The role most happy-path tests run as.
+- **`token_guest`** OR **`token_other_org_admin`** — for negative tests. In multi-tenant apps, this token belongs to a different org and powers the cross-tenant denial tests.
+## Per-recipe variable conventions
+### `.http` (REST Client)
+Top of the file:
+```http
+@base = {{$dotenv API_BASE_URL}}
+@token_admin = {{$dotenv QA_TOKEN_ADMIN}}
+@token_user = {{$dotenv QA_TOKEN_USER}}
+@token_other_org = {{$dotenv QA_TOKEN_OTHER_ORG}}
+```
+Or, if logging in at runtime, capture once and reuse:
+```http
+### Setup: login as admin
+# @name login_admin
+POST {{base}}/auth/login
+Content-Type: application/json
+
+{ "email": "{{$dotenv QA_ADMIN_EMAIL}}", "password": "{{$dotenv QA_ADMIN_PASSWORD}}" }
+
+### Reuse the captured token in later requests
+@token_admin = {{login_admin.response.body.$.access_token}}
+```
+### `pytest + httpx`
+Read from environment in module scope; expose as fixtures if the test count grows:
+```python
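+import os
+
+import httpx
+import pytest
+
+BASE = os.environ["API_BASE_URL"]
+TOKEN_ADMIN = os.environ["QA_TOKEN_ADMIN"]
+TOKEN_USER = os.environ["QA_TOKEN_USER"]
+TOKEN_OTHER_ORG = os.environ.get("QA_TOKEN_OTHER_ORG", "")
+
+# Async fixtures need an async plugin (e.g. pytest-asyncio in auto mode, or its own fixture decorator).
+@pytest.fixture(scope="session")
+async def admin_client():
+    async with httpx.AsyncClient(base_url=BASE,
+        headers={"Authorization": f"Bearer {TOKEN_ADMIN}"},
+        timeout=10.0) as c:
+        yield c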
+### `supertest` (Node)
+Same `process.env` reads, optionally one helper per role:
+```ts
+const auth = (token: string) => ({ Authorization: `Bearer ${token}` });
+const TOKEN_ADMIN = process.env.QA_TOKEN_ADMIN!;
+```
+### `WebApplicationFactory` (.NET)
+Subclass the factory once per role:
+```csharp
+using System.Net.Http.Headers;
+using Microsoft.AspNetCore.Mvc.Testing;
+
+public class AdminAppFactory : WebApplicationFactory<Program>
+{
+    protected override void ConfigureClient(HttpClient client)
+    {
+        client.DefaultRequestHeaders.Authorization =
+            new AuthenticationHeaderValue("Bearer",
+                Environment.GetEnvironmentVariable("QA_TOKEN_ADMIN") ?? "");
+    }
+}
+```
+### `reqwest` (Rust)
+Helper functions read the env var on each call:
+```rust
+fn token_admin() -> String { std::env::var("QA_TOKEN_ADMIN").unwrap_or_default() }
+fn admin_client() -> reqwest::Client {
+    reqwest::Client::builder().build().unwrap()
+}
+// then: admin_client().get(url).bearer_auth(token_admin()).send().await
+```
+## Refresh tokens
+If the API uses refresh tokens, capture both `access_token` and `refresh_token` in the login setup. When a test needs a long-lived flow (e.g., wait for a webhook), refresh the access token before the wait.
+For most QA suites, the access token's TTL (typically 15-60 min) is longer than the suite's runtime, so refresh is unnecessary.
+## Scoped credentials per role
+For RBAC-heavy systems, define more roles:
+- `token_admin` — global admin
+- `token_org_admin` — admin within one org
+- `token_member` — regular member of one org
+- `token_billing` — read-only billing access
+- `token_other_org_admin` — admin of a different org (for cross-tenant tests)
+Add one env var per role; the recipe reads them as needed. Tests that don't need a particular role just don't reference it.
+## Anti-patterns
+- **Don't hardcode `Bearer eyJ...` in any committed file.** Even "test" tokens leak.
+- **Don't share one token across happy-path AND negative tests.** If a happy-path test mutates the token's user (e.g., suspends it), every later test fails.
+- **Don't reuse production tokens for QA.** Mint QA-only tokens with a clearly distinct subject (`sub: qa-admin@example.com`).
+- **Don't pass credentials via command-line args.** They land in shell history and process listings.
+## What `dw-run-qa` does
+In API mode, `/dw-run-qa` reads `QA/test-credentials.md` (or `.env`) for the env var names, picks the recipe, and substitutes variables at test-generation time. The script files contain only `@variable` references, never raw tokens.

package/scaffold/skills/api-testing-recipes/references/log-conventions.md ADDED

@@ -0,0 +1,117 @@
+# Log conventions — request/response evidence as JSONL
+In API mode, **logs replace screenshots** as the primary QA evidence. Every request/response pair the QA suite makes is captured as one JSONL line so the bug report links back to a reproducible event.
+## File location
+`{{PRD_PATH}}/QA/logs/api/<scope>.log`
+Where `<scope>` is one of:
+- `RF-XX-[slug].log` — log for a single requirement run (1 file per RF).
+- `BUG-NN-retest.log` — log for a fix retest (1 file per bug retest cycle).
+- `run-<YYYY-MM-DD>.log` — global run log (full QA pass).
+## Line shape (JSONL — one JSON object per line)
+```json
+{
+  "ts": 1715000000000,
+  "rf": "RF-03",
+  "case": "happy-path",
+  "method": "POST",
+  "url": "http://localhost:3000/users",
+  "request_headers": {
+    "authorization": "Bearer <redacted>",
+    "content-type": "application/json"
+  },
+  "request_body": {
+    "email": "qa-1@example.com",
+    "name": "QA"
+  },
+  "status": 201,
+  "response_headers": {
+    "content-type": "application/json",
+    "location": "/users/12345"
+  },
+  "response_body": {
+    "id": "12345",
+    "email": "qa-1@example.com",
+    "name": "QA",
+    "created_at": "2026-05-06T12:00:00Z"
+  },
+  "ms": 47,
+  "verdict": "PASS",
+  "assertion_failures": []
+}
+```
+The example above is pretty-printed for readability; in the log file each object occupies exactly one line.
+## Required fields
+| Field | Type | Notes |
+|-------|------|-------|
+| `ts` | int (epoch ms, UTC) | When the request was sent |
+| `rf` | string | Which `RF-XX` this request belongs to (or `"BUG-NN"` for retests) |
+| `case` | string | One of `happy-path`, `validation`, `auth-missing`, `auth-expired`, `authz-cross-tenant`, `not-found`, `conflict`, `server-error`, `contract` |
+| `method` | string | HTTP method |
+| `url` | string | Full URL including query string |
+| `status` | int | HTTP status code |
+| `ms` | int | Elapsed milliseconds |
+| `verdict` | string | `"PASS"` or `"FAIL"` |
+| `assertion_failures` | array of strings | Each failed assertion as a one-line description (empty array on PASS) |
+## Optional fields
+| Field | Type | Notes |
+|-------|------|-------|
+| `request_headers` | object | Map of header name → value |
+| `request_body` | any | Parsed JSON if `Content-Type: application/json`; raw string otherwise |
+| `response_headers` | object | Same shape as request_headers |
+| `response_body` | any | Parsed JSON if `Content-Type: application/json`; raw string otherwise |
+| `err` | string | Network/runtime error message (if no response was received at all) |
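A minimal Python sketch of a writer that produces the line shape described by the two tables above. The helper names `build_log_line` and `append_log_line` are hypothetical; the field names are the ones listed in the tables, and deriving `verdict` from the list of assertion failures is an assumption of this sketch.

```python
import json
import time


def build_log_line(rf, case, method, url, status, ms, failures=(), **optional):
    """Serialize one request/response pair as a single JSONL line (hypothetical helper)."""
    record = {
        "ts": int(time.time() * 1000),  # epoch milliseconds, UTC
        "rf": rf,
        "case": case,
        "method": method,
        "url": url,
        "status": status,
        "ms": ms,
        "verdict": "FAIL" if failures else "PASS",
        "assertion_failures": list(failures),
    }
    # Optional fields: request_headers, request_body, response_headers, response_body, err.
    record.update(optional)
    # Compact separators keep the object on one line with no spaces after ':' or ','.
    return json.dumps(record, separators=(",", ":"))


def append_log_line(path, line):
    """Append one serialized record, terminated by a newline, to the log file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```

With compact separators, a literal search for `"verdict":"FAIL"` matches every failing line; a writer that emits a space after the colon would need a looser pattern.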
+## Redaction rules
+The log goes to `QA/logs/api/` which **may end up in artifacts uploaded to CI** or attached to bug reports. Redact:
+- **`Authorization` header** → `"Bearer <redacted>"` or `"Basic <redacted>"`. The token's presence is logged; the value never is.
+- **`Cookie` header** → `"<redacted>"`. Same reasoning.
+- **`X-API-Key` header** → `"<redacted>"`.
+- **Response fields named `password*`, `secret*`, `*_hash`, `token*`, `apiKey*`** → `"<redacted>"`. These should never be in a response anyway; if they are, the log redacts AND the QA report flags the leak.
+- **Free-form `request_body` fields named `password`** → `"<redacted>"`.
+Redaction is applied at log-write time, never on read, so even a leaked log file does not expose the redacted values.
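The redaction rules above could be applied with a pair of small helpers like these. This is a sketch under assumptions: the function names are hypothetical, field-name matching is done case-insensitively with glob patterns, and nested objects are walked recursively, none of which the rules spell out.

```python
import fnmatch

REDACTED = "<redacted>"
RESPONSE_PATTERNS = ("password*", "secret*", "*_hash", "token*", "apikey*")
REQUEST_PATTERNS = ("password",)


def redact_headers(headers):
    """Mask credential-bearing headers while keeping their presence visible."""
    out = {}
    for name, value in headers.items():
        lower = name.lower()
        if lower == "authorization":
            # Keep the scheme ("Bearer", "Basic"), drop the credential itself.
            scheme = value.split(" ", 1)[0] if " " in value else ""
            out[name] = f"{scheme} {REDACTED}".strip()
        elif lower in ("cookie", "x-api-key"):
            out[name] = REDACTED
        else:
            out[name] = value
    return out


def redact_body(body, patterns=RESPONSE_PATTERNS):
    """Recursively mask fields whose lowercased name matches one of `patterns`."""
    if isinstance(body, dict):
        return {
            key: REDACTED
            if any(fnmatch.fnmatchcase(key.lower(), p) for p in patterns)
            else redact_body(value, patterns)
            for key, value in body.items()
        }
    if isinstance(body, list):
        return [redact_body(item, patterns) for item in body]
    return body
```

Both helpers return new objects rather than mutating their input, so the test can still assert against the real response after the log line is written.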
+## Why JSONL (not pretty-printed JSON)
+- **Append-friendly**: each request is one line; concurrent runs append safely without parsing the whole file.
+- **Greppable**: `grep -E '"verdict": ?"FAIL"' QA/logs/api/RF-03.log` shows every failed case in one shot, whether or not the writer puts a space after the colon.
+- **Queryable**: `jq -c 'select(.status >= 500)' QA/logs/api/run-*.log | jq -s 'group_by(.url) | map({url: .[0].url, count: length})'` finds the most-failing URLs.
+- **Diffable across runs**: `diff <(jq -c 'del(.ts, .ms)' RF-03.log) <(jq -c 'del(.ts, .ms)' RF-03.log.prev)` shows behavior changes free of timing noise.
+## Per-recipe writers
+Every recipe in `recipes/` includes a small writer helper in its example:
+- `.http` — the agent writes via `Bash` after each `curl` invocation.
+- `pytest+httpx` — `LoggingClient` subclass overriding `request`.
+- `supertest` — small `logRequest` helper imported by tests.
+- `.NET WebApplicationFactory` — `DelegatingHandler` registered on the test client.
+- `reqwest` — wrapper function around `client.execute(req)`.
+All of them produce the same JSONL shape so downstream tooling (the QA report renderer, the bug retest loop) doesn't care which recipe was used.
+## How `dw-run-qa` reads logs back
+When generating the QA report (Step 8 in `dw-run-qa`), the agent reads each `RF-XX-[slug].log`, computes:
+- **Total requests** per RF
+- **Pass count vs fail count**
+- **Failing cases** with the assertion message
+- **Tail latency** (p99 if there are ≥10 requests, max otherwise)
+These land in the report's "Verified Requirements" table and feed the bug entries (with `evidence_path: QA/logs/api/RF-03.log#L42` pointing to the failing line).
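A sketch of the per-RF aggregation described above, in Python. The function name `summarize` and the returned keys are hypothetical, and the p99 is computed with the nearest-rank method, which is an assumption since no percentile definition is given.

```python
import json


def summarize(lines):
    """Aggregate the JSONL lines of one RF log into report figures (hypothetical helper)."""
    records = [json.loads(line) for line in lines if line.strip()]
    failures = [r for r in records if r["verdict"] == "FAIL"]
    latencies = sorted(r["ms"] for r in records)
    n = len(latencies)
    if n >= 10:
        # Nearest-rank p99: rank = ceil(0.99 * n), computed with integers.
        tail_ms = latencies[(99 * n + 99) // 100 - 1]
    else:
        tail_ms = max(latencies, default=0)
    return {
        "total": len(records),
        "passed": len(records) - len(failures),
        "failed": len(failures),
        "failing_cases": [(r["case"], r["assertion_failures"]) for r in failures],
        "tail_ms": tail_ms,
    }
```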
+## How `dw-fix-qa` consumes them
+The retest loop reads `QA/bugs.md` for each open bug, finds the corresponding log line via `evidence_path`, replays the request via the same recipe + assertions, and writes a new line to `BUG-NN-retest.log` with `verdict: "PASS"` (closing the bug) or `verdict: "FAIL"` (cycling through the fix-retest loop again, max 5 cycles).

package/scaffold/skills/api-testing-recipes/references/matrix-conventions.md ADDED

@@ -0,0 +1,68 @@
+# Matrix conventions — deriving tests from a PRD requirement
+Every API requirement (`RF-XX`) gets a structured matrix of test cases. The matrix is the bridge between "the PRD says this endpoint must exist" and "we have evidence it works under the cases that matter."
+## The nine tiers
+For each `RF-XX`, generate at least one test per tier that applies:
+| Tier | Goal | When to skip |
+|------|------|--------------|
+| **200 happy path** | Prove the endpoint accepts the documented input and returns the documented output. | Never — every RF needs at least one happy path. |
+| **4xx — validation** | Prove input validation rejects malformed payloads with a useful error. | Skip only for endpoints with no body (`GET` without query params). |
+| **4xx — auth (401)** | Prove missing/expired/invalid credentials return 401. | Skip for endpoints documented as anonymous. |
+| **4xx — authorization (403)** | Prove valid credentials without the required role/scope return 403. | Skip if the endpoint is open to any authenticated user. |
+| **4xx — not found (404)** | Prove non-existent IDs return 404, not 500. | Skip for endpoints that don't take an ID. |
+| **4xx — conflict (409)** | Prove duplicates / version mismatches return 409. | Skip if the endpoint is idempotent and conflict-free by design. |
+| **5xx — server error** | Prove the system fails gracefully (no leaked stack trace, no half-write). | Skip if no synthetic failure is reproducible without invasive infrastructure changes. |
+| **Contract drift** | Prove the response shape matches the documented spec (OpenAPI, TS types, README examples). | Never — this is the cheapest way to catch silent breakage. |
+| **Authorization cross-tenant** | Prove tokens from org A cannot access data of org B. | Skip only for single-tenant systems (rare in practice). |
+## Why the cross-tenant test is mandatory
+Cross-tenant data leakage is the most damaging API bug class — it's silent (no error), undetected by happy-path tests, and lethal in B2B SaaS. Every endpoint that returns or mutates tenant-scoped data must have a cross-tenant denial test. If the project is single-tenant, mark the test `pytest.skip` / `it.skip` / `[Fact(Skip="single-tenant")]` instead of omitting — the explicit skip is a record of the decision.
+## How to enumerate inputs per tier
+For each tier, ask:
+- **200**: what's the minimum valid payload? Build the test around that. Add 2-3 variations only if the endpoint has interesting branching (nullable fields, enum variants, optional sections).
+- **4xx validation**: what fields are required? Drop each one. What types are constrained? Send the wrong type. What ranges? Test min-1 and max+1. Don't test all combinations — one per kind of constraint is enough.
+- **4xx auth**: 3 variants — no token, expired token, malformed token. One test for each is enough.
+- **4xx authorization**: identify role boundaries (admin vs user vs guest, owner vs member). One test per boundary.
+- **4xx not found**: 1 test with a syntactically-valid-but-nonexistent ID (UUID, integer, etc.).
+- **4xx conflict**: 1 test that triggers the documented conflict (duplicate email, race on version).
+- **5xx**: skip if not reproducible. If the project has a way to inject failures (chaos hooks, dev-only error endpoints), use them.
+- **Contract drift**: 1 test that asserts every documented field is present AND no leaked internal field is.
+- **Cross-tenant**: 1 test per tenant-scoped endpoint with a token from a different tenant.
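For the validation tier, the "drop each required field, test min-1 and max+1" guidance above can be sketched in Python like this. The helper name `validation_cases` and its parameters are hypothetical, and the sketch covers only presence and numeric-range constraints.

```python
from copy import deepcopy


def validation_cases(valid_payload, required, ranges=None):
    """Build (label, payload) pairs: one per dropped required field and per range edge."""
    cases = []
    for field in required:
        payload = deepcopy(valid_payload)
        payload.pop(field, None)  # drop exactly one required field
        cases.append((f"missing {field}", payload))
    for field, (low, high) in (ranges or {}).items():
        for label, value in ((f"{field} below min", low - 1), (f"{field} above max", high + 1)):
            payload = deepcopy(valid_payload)
            payload[field] = value  # one step outside the allowed range
            cases.append((label, payload))
    return cases
```

Each case changes a single thing relative to the minimum valid payload, so a failing assertion points at one constraint.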
+## Example expansion: `POST /users`
+PRD says: "RF-03 — admins can create users. Validation: email is required and must be unique. Returns 201 with the new user."
+Matrix:
+| # | Tier | Case | Expected |
+|---|------|------|----------|
+| 1 | 200 | admin creates user with valid payload | 201, body has id |
+| 2 | 4xx validation | missing email | 422, error mentions email |
+| 3 | 4xx validation | invalid email format | 422 |
+| 4 | 4xx auth | no token | 401 |
+| 5 | 4xx auth | expired token | 401 |
+| 6 | 4xx authorization | regular user (not admin) | 403 |
+| 7 | 4xx conflict | email already taken | 409 |
+| 8 | Contract | all required fields present, no `password_hash` | matches spec |
+| 9 | Cross-tenant | admin from another org tries to fetch this user | 403 or 404 |
+That's 9 test cases for one RF — the floor for a real API surface, not the ceiling.
+## What NOT to do
+- **Don't test every combination** of validation failures. The framework already enforces type + presence; one test per kind of constraint is the signal.
+- **Don't test the framework**. `Content-Type: application/json` parsing, default routing, etc. — those belong to FastAPI / Fastify / ASP.NET, not to your QA suite.
+- **Don't write tests for endpoints with no PRD reference**. If a route exists but no RF describes it, that's a documentation gap to flag, not a test to add.
+- **Don't skip 5xx because "it shouldn't happen"**. If you have a way to reproduce, do it. If you genuinely can't, document the skip in the QA report so the gap is visible.
+## How `dw-run-qa` uses this
+When in API mode, `/dw-run-qa` walks each `RF-XX` in the PRD, runs through this matrix, and emits PASS/FAIL per RF — not per test case. A single FAIL in any tier marks the RF as FAIL and lands a `BUG-NN` entry pointing to the failing log line.
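The per-RF roll-up described above amounts to "any FAIL wins". A minimal Python sketch, with the hypothetical name `rf_verdicts` and an assumed input of `(rf, verdict)` pairs:

```python
def rf_verdicts(results):
    """Collapse per-case verdicts into one verdict per RF; a single FAIL marks the RF as FAIL."""
    verdicts = {}
    for rf, verdict in results:
        # Once an RF is FAIL it stays FAIL, regardless of later passing cases.
        if verdicts.get(rf) != "FAIL":
            verdicts[rf] = verdict
    return verdicts
```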

package/scaffold/skills/api-testing-recipes/references/openapi-driven.md ADDED

@@ -0,0 +1,97 @@
+# OpenAPI-driven mode — generating tests from a spec
+When the project exposes an OpenAPI spec (static `openapi.yaml`/`openapi.json`, or dynamic `/openapi.json` for FastAPI), `/dw-run-qa` can derive a baseline test suite directly from it. This catches contract drift between code and spec for free.
+## When to use this mode
+- The project already maintains an OpenAPI spec — either hand-written, generated from code annotations (FastAPI, NestJS + `@nestjs/swagger`, dotnet Swashbuckle), or synced via a code generator.
+- You want a quick "is this endpoint reachable AND does its response shape match the spec?" check.
+- You want to detect when code drifts ahead of (or behind) the spec.
+## What it generates
+For each path × method in the spec:
+1. A **happy-path test** using the spec's `requestBody` example (or schema-derived sample).
+2. A **contract-shape test** asserting the response matches the documented schema.
+3. Skips paths/methods marked with the `x-internal: true` extension or those without examples.
+The generated tests live alongside hand-written ones in `{{PRD_PATH}}/QA/scripts/api/`. Filename pattern: `openapi-RF-XX-[path-slug].http` (or stack-specific extension).
+## How to run it
+`/dw-run-qa --from-openapi <spec-path-or-url>` — explicit. The `<spec-path-or-url>` can be:
+- `./openapi.yaml`
+- `http://localhost:3000/openapi.json` (FastAPI default)
+- `http://localhost:3000/swagger/v1/swagger.json` (ASP.NET Core default)
+Without the flag, `/dw-run-qa` auto-detects:
+- File at repo root: `openapi.yaml`, `openapi.json`, `swagger.yaml`, `swagger.json`.
+- Project running locally: `GET /openapi.json`, `GET /swagger/v1/swagger.json`, `GET /api-docs`.
+If found, the agent asks: "OpenAPI spec detected at `<path>`. Generate baseline tests from it? [y/n]". On `y`, the baseline is added on top of the PRD-derived matrix.
+## Mapping spec endpoints to RFs
+The PRD names requirements (`RF-01`, `RF-02`); the spec names paths (`/users`, `/orders/{id}`). Two conventions for cross-referencing:
+- **By tag**: tests for a path tagged `users` map to the PRD section also tagged `users`. Cleanest if the project keeps tags consistent.
+- **By summary keyword**: the spec's `summary` field is matched against PRD requirement titles. Less reliable; only use as a fallback.
+If neither matches, the test lands as `openapi-no-rf-[slug].http` and the QA report flags it as "spec endpoint not mapped to any RF — possible documentation gap."
+## Contract drift detection
+For each response from a generated test, compare:
+1. **Status code** — does the actual status appear in the spec's `responses` block?
+2. **Required fields**: every field listed in the schema's `required` array must be present.
+3. **Type matching** — `email: string` in spec, but actual is `email: null`? Fail.
+4. **No leaked fields** — fields NOT in the spec but present in the response are flagged as **drift forward** (code added a field; spec is stale).
+5. **Sensitive defaults** — fields named `password*`, `secret*`, `token*`, `*_hash` in the response trigger an immediate FAIL with severity HIGH, even if they're "documented."
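The five checks above can be sketched in Python for the simplest case: a flat JSON object response and a schema with no `$ref`, nesting, or nullable handling. The function names and issue strings are hypothetical, and the sketch assumes required fields are listed in the schema's `required` array, as in JSON Schema.

```python
import fnmatch

SENSITIVE = ("password*", "secret*", "token*", "*_hash")
JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "array": list, "object": dict,
}


def _matches(value, json_type):
    expected = JSON_TYPES.get(json_type)
    if expected is None:
        return True  # unknown or missing type: nothing to check
    if isinstance(value, bool) and json_type != "boolean":
        return False  # bool is a subclass of int in Python
    return isinstance(value, expected)


def contract_issues(status, body, documented_statuses, schema):
    """Compare one response against its documented statuses and schema."""
    issues = []
    if str(status) not in documented_statuses:
        issues.append(f"status {status} not documented")
    properties = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in body:
            issues.append(f"missing required field: {field}")
    for field, value in body.items():
        if any(fnmatch.fnmatchcase(field.lower(), p) for p in SENSITIVE):
            issues.append(f"HIGH: sensitive field in response: {field}")
        if field not in properties:
            issues.append(f"drift forward: undocumented field: {field}")
        elif not _matches(value, properties[field].get("type")):
            issues.append(f"type mismatch: {field}")
    return issues
```

An empty list means no drift was found for that response; anything else feeds the report's contract-issues section.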
+## Generating example payloads
+If the spec has `example` or `examples`, use them verbatim. If only schemas exist, sample using a deterministic strategy:
+| JSON Schema type | Sample value |
+|-------|-------|
+| `string` | `"qa-string"` (or `"qa@example.com"` if `format: email`, ISO date if `format: date-time`, UUID v4 if `format: uuid`) |
+| `integer` | `1` (or value within `minimum`/`maximum` if set) |
+| `number` | `1.0` |
+| `boolean` | `true` |
+| `array` | one element of the inner type |
+| `object` | recurse on `properties`; only include `required` fields |
+| `enum` | first value |
+| `oneOf`/`anyOf` | first variant |
+Skip endpoints whose request shape can't be sampled deterministically (e.g., free-form JSON without schema, file uploads requiring real binary data).
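The sampling table above translates into a small recursive function. This Python sketch is an illustration, not the package's generator: the name `sample` is hypothetical, the fixed date-time and UUID values are placeholders chosen here for determinism, and only a schema-level `example` is honored.

```python
STRING_FORMATS = {
    "email": "qa@example.com",
    "date-time": "2026-01-01T00:00:00Z",  # fixed placeholder timestamp (assumption)
    "uuid": "00000000-0000-4000-8000-000000000000",  # fixed v4-shaped placeholder (assumption)
}


def sample(schema):
    """Return a deterministic sample value for a JSON Schema fragment."""
    if "example" in schema:
        return schema["example"]
    if "enum" in schema:
        return schema["enum"][0]
    for key in ("oneOf", "anyOf"):
        if key in schema:
            return sample(schema[key][0])
    kind = schema.get("type")
    if kind == "string":
        return STRING_FORMATS.get(schema.get("format"), "qa-string")
    if kind == "integer":
        value = 1
        if "minimum" in schema:
            value = max(value, schema["minimum"])
        if "maximum" in schema:
            value = min(value, schema["maximum"])
        return value
    if kind == "number":
        return 1.0
    if kind == "boolean":
        return True
    if kind == "array":
        return [sample(schema["items"])]
    if kind == "object":
        props = schema.get("properties", {})
        # Only required fields are included, matching the table.
        return {name: sample(props[name]) for name in schema.get("required", [])}
    raise ValueError("schema cannot be sampled deterministically")
```

Raising on an unsupported schema gives the caller a clear signal to skip the endpoint instead of sending a guessed payload.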
+## What NOT to do
+- **Don't replace the PRD-derived matrix with OpenAPI-only tests.** OpenAPI tells you what the code claims to do; the PRD tells you what the product needs. Both matter. Keep both.
+- **Don't trust the spec implicitly.** If `dw-run-qa` finds 0 drift on day 1 and the team has been shipping for 6 months, the spec is probably stale, not the code. Flag the suspicion in the QA report.
+- **Don't generate tests for `x-internal: true` endpoints.** Those are behind an internal-network boundary; QA on them needs different credentials and risk profile.
+## Limitations
+- Doesn't generate authorization tests automatically (the spec doesn't say "this endpoint should reject other-tenant tokens"). Hand-write those per the cross-tenant pattern in `matrix-conventions.md`.
+- Doesn't generate state-mutating sequences (create → update → delete). Those need PRD context to know what state matters.
+- Treats the spec as authoritative for contract drift, but not for behavior. A wrong spec will still fail tests against correct code, and that is the right outcome: update the spec.
+## What `dw-run-qa` produces
+When OpenAPI mode runs, the QA report gains a section:
+```markdown
+## OpenAPI baseline
+- Spec source: openapi.yaml (53 paths, 121 operations)
+- Endpoints sampled: 89 (32 skipped: missing examples, file uploads, `x-internal`)
+- Drift detected: 4 endpoints (see RF-12, RF-15, RF-22, openapi-no-rf-internal-metrics)
+- Contract issues:
+  - RF-12 — `email` documented as required, response returns null
+  - openapi-no-rf-internal-metrics — endpoint exists in spec but no PRD reference
+```

package/scaffold/skills/dw-codebase-intel/SKILL.md ADDED

@@ -0,0 +1,101 @@
+---
+name: dw-codebase-intel
+description: Codebase intelligence for dev-workflow. The intel-updater agent maintains a queryable index in .dw/intel/ (stack.json, files.json, apis.json, deps.json, arch.md) that other commands read instead of doing expensive codebase exploration. Used by /dw-intel and /dw-map-codebase. Adapted from get-shit-done-cc (MIT).
+allowed-tools:
+  - Read
+  - Write
+  - Bash
+  - Glob
+  - Grep
+---
+# dw-codebase-intel
+Bundled skill that gives dev-workflow native **codebase intelligence** — a queryable knowledge base of the project's stack, file graph, API surface, dependencies, and architecture. Other commands (`/dw-create-prd`, `/dw-create-techspec`, `/dw-code-review`, `/dw-refactoring-analysis`, `/dw-brainstorm`, etc.) read from this index instead of re-exploring the codebase on every invocation.
+## Why a skill (not inline)
+- Each agent is independently maintainable and reusable.
+- The format of `.dw/intel/` is a contract that downstream commands depend on; codifying it in references makes drift visible.
+- Other commands invoke these agents through the skill, not by duplicating the logic.
+## When to Use
+Read this skill when:
+- `/dw-map-codebase` is invoked (full or partial codebase analysis).
+- `/dw-intel "<query>"` is invoked (query existing intel).
+- `/dw-analyze-project` runs after first commit and wants to enrich `.dw/rules/` with structural facts from `.dw/intel/`.
+- Any other `dw-*` command wants to look up "where is X used", "what frameworks are in this stack", "what's the architecture pattern" without re-scanning files.
+Do NOT use when:
+- The user is asking a runtime question (use logs / running app).
+- The intel is fresh (`.dw/intel/.last-refresh.json` is current and the user did not ask for a refresh).
+- The project is greenfield with no source files yet — in that case, fall back to `.dw/rules/` seeded by `/dw-new-project`.
+## Files in `.dw/intel/`
+The intel directory is the contract. Every file is machine-parseable and references real file paths.
+| File | Purpose | Format |
+|------|---------|--------|
+| `stack.json` | Languages, frameworks, build/test tooling, package manager | JSON |
+| `files.json` | File graph: per-file imports/exports/type | JSON |
+| `apis.json` | API surface: routes, methods, params, source file | JSON |
+| `deps.json` | Dependencies: version, type, used_by, invocation | JSON |
+| `arch.md` | Human-readable architecture overview + key components + data flow | Markdown |
+| `.last-refresh.json` | Timestamps + content hashes for incremental detection | JSON |
+Schemas are documented in `references/intel-format.md`.
+## Agents
+| Agent | Responsibility | Spawn from |
+|-------|----------------|------------|
+| `agents/intel-updater.md` | Reads source files, writes structured intel to `.dw/intel/`. Supports `full` or `partial --files <paths>` updates. | `/dw-map-codebase` |
+This skill ships ONE agent — `intel-updater` — which produces machine-readable JSON for `/dw-intel` queries. Human-readable architecture analysis (per-module conventions, anti-patterns, code smells) lives in `.dw/rules/` and is generated by `/dw-analyze-project`. The two are complementary: `.dw/intel/` answers "what's in this codebase right now?" and `.dw/rules/` answers "how should we write code here?".
+## How to Compose (the typical flow)
+1. **`/dw-map-codebase`** is invoked.
+2. The command spawns `intel-updater` with `focus: full` (first run) or `focus: partial --files <paths>` (incremental).
+3. The agent reads source files (using Glob/Read/Grep; no Bash file listing for cross-platform safety) and writes the 5 intel files.
+4. The agent writes `.last-refresh.json` with timestamps + hashes for incremental change detection on the next run.
+5. `/dw-map-codebase` reports completion and invites the user to query via `/dw-intel "<question>"`.
+For human-readable analysis (architecture overview, module conventions, anti-patterns), run `/dw-analyze-project` after `/dw-map-codebase` — it reads `.dw/intel/` as input and produces `.dw/rules/`.
+## How `/dw-intel` Reads This
+`/dw-intel "auth flow"` does:
+1. Check `.dw/intel/.last-refresh.json` — is the index fresh (within last 7 days)? If stale, suggest re-running `/dw-map-codebase`.
+2. Search `apis.json` for matching paths/descriptions.
+3. Search `files.json` for matching exports.
+4. Search `arch.md` (full-text) for the keyword.
+5. Cross-reference with `deps.json` if the query is about a library.
+6. Return a structured answer with file paths cited.
+If no `.dw/intel/` exists at all, `/dw-intel` falls back to `.dw/rules/` (seeded by `/dw-new-project` or `/dw-analyze-project`) and direct grep over the codebase.
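The freshness decision in step 1 and the fallback above can be sketched in Python. How the timestamp is stored inside `.last-refresh.json` is not shown here, so this hypothetical helper takes an already-parsed, timezone-aware `datetime` (or `None` when no intel exists); the status labels are also assumptions.

```python
from datetime import datetime, timedelta, timezone


def intel_status(last_refresh, now=None, max_age=timedelta(days=7)):
    """Return 'missing', 'stale', or 'fresh' for the intel index (hypothetical helper)."""
    if last_refresh is None:
        return "missing"  # caller falls back to .dw/rules/ and direct grep
    now = now or datetime.now(timezone.utc)
    # Stale means the caller should suggest re-running /dw-map-codebase.
    return "fresh" if now - last_refresh <= max_age else "stale"
```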
+## Rules
+- **Always cite file paths.** Every claim in intel must reference an actual file location.
+- **Current state only.** No temporal language ("recently added", "will change"). The index reflects what's there NOW.
+- **Evidence-based.** Read actual files. Never guess from filenames or directory structures.
+- **Cross-platform**: use Glob/Read/Grep; never Bash `ls`/`find`/`cat` (those break on Windows).
+- **Forbidden files**: never read `.env*` (except `.env.example`/`.env.template`), `*.key`, `*.pem`, `*.pfx`, `*.p12`, `*.keystore`, `*.jks`, `id_rsa`, `id_ed25519`, or files matching `*credential*`/`*secret*` in name. Skip silently if encountered.
+- **Excluded directories**: `node_modules/`, `.git/`, `dist/`, `build/`, `.dw/` (planning docs, not project code).
+- **Output budget**: `files.json` ≤2K tokens, `apis.json` ≤1.5K, `deps.json` ≤1K, `stack.json` ≤500, `arch.md` ≤1.5K. For large repos, prioritize coverage of key files over exhaustive listing.
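The forbidden-file and excluded-directory rules above can be expressed as a single path filter. This Python sketch is an assumption about one way to apply them: the name `may_read` is hypothetical, matching is done on the lowercased file name, and directory exclusion applies at any depth.

```python
import fnmatch
from pathlib import PurePosixPath

ALLOWED_ENV = {".env.example", ".env.template"}
FORBIDDEN = (
    "*.key", "*.pem", "*.pfx", "*.p12", "*.keystore", "*.jks",
    "id_rsa", "id_ed25519", "*credential*", "*secret*",
)
EXCLUDED_DIRS = {"node_modules", ".git", "dist", "build", ".dw"}


def may_read(path: str) -> bool:
    """Return True if the intel agent is allowed to read `path`."""
    p = PurePosixPath(path.replace("\\", "/"))  # accept Windows separators too
    if EXCLUDED_DIRS.intersection(p.parts[:-1]):
        return False
    name = p.name
    if name in ALLOWED_ENV:
        return True
    if name.startswith(".env"):
        return False
    return not any(fnmatch.fnmatchcase(name.lower(), pattern) for pattern in FORBIDDEN)
```

A caller would filter candidate paths through `may_read` before opening anything and skip rejected paths silently, as the rule requires.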
+## References
+- `references/intel-format.md` — schema for each `.dw/intel/` file with examples.
+- `references/incremental-update.md` — how partial updates work (which files to re-read, how to merge with existing entries).
+- `references/query-patterns.md` — how `/dw-intel` answers different question shapes (where-is, what-uses, architecture-of, dependency-of).
+## Inspired by
+Adapted from [`get-shit-done-cc`](https://github.com/gsd-build/get-shit-done) (`gsd-intel-updater`) by gsd-build (MIT license). Core schemas (`files.json`, `apis.json`, `deps.json`, `stack.json`, `arch.md`) and incremental update protocol preserved. Path conventions changed from `.planning/intel/` to `.dw/intel/`. CLI tooling (`gsd-sdk query intel.*`) replaced by agent-driven inline operations (no separate runtime). The companion `gsd-codebase-mapper` agent (human-readable analysis docs) was NOT ported — its scope overlaps with the existing `/dw-analyze-project` command which writes to `.dw/rules/`.