@brunosps00/dev-workflow 0.8.0 → 0.8.1

@@ -0,0 +1,117 @@
# Log conventions — request/response evidence as JSONL

In API mode, **logs replace screenshots** as the primary QA evidence. Every request/response pair the QA suite makes is captured as one JSONL line, so the bug report links back to a reproducible event.

## File location

`{{PRD_PATH}}/QA/logs/api/<scope>.log`

Where `<scope>` is one of:

- `RF-XX-[slug].log` — log for a single requirement run (1 file per RF).
- `BUG-NN-retest.log` — log for a fix retest (1 file per bug retest cycle).
- `run-<YYYY-MM-DD>.log` — global run log (full QA pass).

## Line shape (JSONL — one JSON object per line)

The example below is pretty-printed for readability; on disk, each record is written as a single line.

```json
{
  "ts": 1715000000000,
  "rf": "RF-03",
  "case": "happy-path",
  "method": "POST",
  "url": "http://localhost:3000/users",
  "request_headers": {
    "authorization": "Bearer <redacted>",
    "content-type": "application/json"
  },
  "request_body": {
    "email": "qa-1@example.com",
    "name": "QA"
  },
  "status": 201,
  "response_headers": {
    "content-type": "application/json",
    "location": "/users/12345"
  },
  "response_body": {
    "id": "12345",
    "email": "qa-1@example.com",
    "name": "QA",
    "created_at": "2026-05-06T12:00:00Z"
  },
  "ms": 47,
  "verdict": "PASS",
  "assertion_failures": []
}
```

## Required fields

| Field | Type | Notes |
|-------|------|-------|
| `ts` | int (epoch ms, UTC) | When the request was sent |
| `rf` | string | Which `RF-XX` this request belongs to (or `"BUG-NN"` for retests) |
| `case` | string | One of `happy-path`, `validation`, `auth-missing`, `auth-expired`, `authz-cross-tenant`, `not-found`, `conflict`, `server-error`, `contract` |
| `method` | string | HTTP method |
| `url` | string | Full URL, including query string |
| `status` | int | HTTP status code |
| `ms` | int | Elapsed milliseconds |
| `verdict` | string | `"PASS"` or `"FAIL"` |
| `assertion_failures` | array of strings | Each failed assertion as a one-line description (empty array on PASS) |

## Optional fields

| Field | Type | Notes |
|-------|------|-------|
| `request_headers` | object | Map of header name → value |
| `request_body` | any | Parsed JSON if `Content-Type: application/json`; raw string otherwise |
| `response_headers` | object | Same shape as `request_headers` |
| `response_body` | any | Parsed JSON if `Content-Type: application/json`; raw string otherwise |
| `err` | string | Network/runtime error message (present when no response was received at all) |

## Redaction rules

The log goes to `QA/logs/api/`, which **may end up in artifacts uploaded to CI** or attached to bug reports. Redact:

- **`Authorization` header** → `"Bearer <redacted>"` or `"Basic <redacted>"`. The token's presence is logged; its value never is.
- **`Cookie` header** → `"<redacted>"`. Same reasoning.
- **`X-API-Key` header** → `"<redacted>"`.
- **Response fields named `password*`, `secret*`, `*_hash`, `token*`, `apiKey*`** → `"<redacted>"`. These should never appear in a response anyway; if they do, the log redacts them AND the QA report flags the leak.
- **Free-form `request_body` fields named `password`** → `"<redacted>"`.

Redaction is applied at log-write time, never on read, so even a leaked log file cannot expose secrets.
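
A write-time redactor can be sketched in a few lines. This is an illustrative Python sketch, not the package's actual helper; the header set and field-name patterns mirror the rules above:

```python
import re

REDACTED = "<redacted>"
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}
# Field-name patterns taken from the redaction rules above.
SENSITIVE_FIELDS = re.compile(r"^(password.*|secret.*|token.*|apiKey.*|.*_hash)$")

def redact_headers(headers: dict) -> dict:
    """Redact sensitive headers, keeping the auth scheme so the log shows a token was sent."""
    out = {}
    for name, value in headers.items():
        if name.lower() not in SENSITIVE_HEADERS:
            out[name] = value
        elif name.lower() == "authorization" and " " in value:
            out[name] = value.split(" ", 1)[0] + " " + REDACTED  # e.g. "Bearer <redacted>"
        else:
            out[name] = REDACTED
    return out

def redact_body(body):
    """Recursively redact sensitive field names in a parsed JSON body."""
    if isinstance(body, dict):
        return {k: REDACTED if SENSITIVE_FIELDS.match(k) else redact_body(v)
                for k, v in body.items()}
    if isinstance(body, list):
        return [redact_body(v) for v in body]
    return body
```

Calling this in the writer, never in a reader, is what preserves the guarantee above.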

## Why JSONL (not pretty-printed JSON)

- **Append-friendly**: each request is one line; concurrent runs append safely without parsing the whole file.
- **Greppable**: `grep '"verdict": *"FAIL"' QA/logs/api/RF-03.log` shows every failed case in one shot.
- **Queryable**: `jq -c 'select(.status >= 500)' QA/logs/api/run-*.log | jq -s 'group_by(.url) | map({url: .[0].url, count: length})'` counts 5xx responses per URL.
- **Diffable across runs**: `diff <(jq -c 'del(.ts, .ms)' RF-03.log) <(jq -c 'del(.ts, .ms)' RF-03.log.prev)` shows behavior changes free of timing noise.

## Per-recipe writers

Every recipe in `recipes/` includes a small writer helper in its example:

- `.http` — the agent writes via `Bash` after each `curl` invocation.
- `pytest+httpx` — a `LoggingClient` subclass overriding `request`.
- `supertest` — a small `logRequest` helper imported by tests.
- `.NET WebApplicationFactory` — a `DelegatingHandler` registered on the test client.
- `reqwest` — a wrapper function around `client.execute(req)`.

All of them produce the same JSONL shape, so downstream tooling (the QA report renderer, the bug retest loop) doesn't care which recipe was used.

## How `dw-run-qa` reads logs back

When generating the QA report (Step 8 in `dw-run-qa`), the agent reads each `RF-XX-[slug].log` and computes:

- **Total requests** per RF
- **Pass count vs fail count**
- **Failing cases** with the assertion message
- **Tail latency** (p99 if there are ≥10 requests, max otherwise)

These land in the report's "Verified Requirements" table and feed the bug entries (with `evidence_path: QA/logs/api/RF-03.log#L42` pointing to the failing line).
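
That aggregation over one log file can be sketched as follows (illustrative; the percentile convention here is nearest-rank):

```python
import json
import math

def summarize_rf_log(path: str) -> dict:
    """Fold one RF-XX JSONL log into the numbers the QA report needs."""
    with open(path) as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    failures = [r for r in records if r.get("verdict") == "FAIL"]
    latencies = sorted(r["ms"] for r in records)
    if len(latencies) >= 10:
        # Nearest-rank p99; with enough samples this stops one outlier dominating.
        tail = latencies[math.ceil(0.99 * len(latencies)) - 1]
    else:
        tail = latencies[-1] if latencies else None  # too few samples: just take max
    return {
        "total": len(records),
        "passed": len(records) - len(failures),
        "failed": len(failures),
        "failing_cases": [(r["case"], r["assertion_failures"]) for r in failures],
        "tail_ms": tail,
    }
```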

## How `dw-fix-qa` consumes them

The retest loop reads `QA/bugs.md` for each open bug, finds the corresponding log line via `evidence_path`, replays the request with the same recipe and assertions, and writes a new line to `BUG-NN-retest.log` with `verdict: "PASS"` (closing the bug) or `verdict: "FAIL"` (cycling through the fix-retest loop again, max 5 cycles).
@@ -0,0 +1,68 @@
# Matrix conventions — deriving tests from a PRD requirement

Every API requirement (`RF-XX`) gets a structured matrix of test cases. The matrix is the bridge between "the PRD says this endpoint must exist" and "we have evidence it works under the cases that matter."

## The nine tiers

For each `RF-XX`, generate at least one test per tier that applies:

| Tier | Goal | When to skip |
|------|------|--------------|
| **200 happy path** | Prove the endpoint accepts the documented input and returns the documented output. | Never — every RF needs at least one happy path. |
| **4xx — validation** | Prove input validation rejects malformed payloads with a useful error. | Skip only for endpoints with no body (`GET` without query params). |
| **4xx — auth (401)** | Prove missing/expired/invalid credentials return 401. | Skip for endpoints documented as anonymous. |
| **4xx — authorization (403)** | Prove valid credentials without the required role/scope return 403. | Skip if the endpoint is open to any authenticated user. |
| **4xx — not found (404)** | Prove non-existent IDs return 404, not 500. | Skip for endpoints that don't take an ID. |
| **4xx — conflict (409)** | Prove duplicates / version mismatches return 409. | Skip if the endpoint is idempotent and conflict-free by design. |
| **5xx — server error** | Prove the system fails gracefully (no leaked stack trace, no half-write). | Skip if no synthetic failure is reproducible without invasive infrastructure changes. |
| **Contract drift** | Prove the response shape matches the documented spec (OpenAPI, TS types, README examples). | Never — this is the cheapest way to catch silent breakage. |
| **Cross-tenant authorization** | Prove tokens from org A cannot access data of org B. | Skip only for single-tenant systems (rare in practice). |

## Why the cross-tenant test is mandatory

Cross-tenant data leakage is the most damaging API bug class — it's silent (no error), undetected by happy-path tests, and lethal in B2B SaaS. Every endpoint that returns or mutates tenant-scoped data must have a cross-tenant denial test. If the project is single-tenant, mark the test `pytest.skip` / `it.skip` / `[Fact(Skip="single-tenant")]` instead of omitting it — the explicit skip is a record of the decision.

## How to enumerate inputs per tier

For each tier, ask:

- **200**: what's the minimum valid payload? Build the test around that. Add 2-3 variations only if the endpoint has interesting branching (nullable fields, enum variants, optional sections).
- **4xx validation**: what fields are required? Drop each one. What types are constrained? Send the wrong type. What ranges? Test min-1 and max+1. Don't test all combinations — one per kind of constraint is enough.
- **4xx auth**: 3 variants — no token, expired token, malformed token. One test for each is enough.
- **4xx authorization**: identify role boundaries (admin vs user vs guest, owner vs member). One test per boundary.
- **4xx not found**: 1 test with a syntactically-valid-but-nonexistent ID (UUID, integer, etc.).
- **4xx conflict**: 1 test that triggers the documented conflict (duplicate email, race on version).
- **5xx**: skip if not reproducible. If the project has a way to inject failures (chaos hooks, dev-only error endpoints), use them.
- **Contract drift**: 1 test that asserts every documented field is present AND no leaked internal field is.
- **Cross-tenant**: 1 test per tenant-scoped endpoint with a token from a different tenant.
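
The "drop each required field" rule mechanizes nicely. A small illustrative generator that a parametrized test could consume (the endpoint and field names are hypothetical):

```python
def missing_field_payloads(valid: dict) -> list[tuple[str, dict]]:
    """For each field of a known-good payload, return a copy with that field removed.

    Each (field, payload) pair becomes one validation test case: POST the
    payload and assert a 422 whose error message names `field`.
    """
    return [(field, {k: v for k, v in valid.items() if k != field})
            for field in valid]

# e.g. with pytest:
#   @pytest.mark.parametrize("field,payload", missing_field_payloads(VALID_USER))
#   def test_missing_field(field, payload, client):
#       resp = client.post("/users", json=payload)
#       assert resp.status_code == 422 and field in resp.text
```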

## Example expansion: `POST /users`

PRD says: "RF-03 — admins can create users. Validation: email is required and must be unique. Returns 201 with the new user."

Matrix:

| # | Tier | Case | Expected |
|---|------|------|----------|
| 1 | 200 | admin creates user with valid payload | 201, body has id |
| 2 | 4xx validation | missing email | 422, error mentions email |
| 3 | 4xx validation | invalid email format | 422 |
| 4 | 4xx auth | no token | 401 |
| 5 | 4xx auth | expired token | 401 |
| 6 | 4xx authorization | regular user (not admin) | 403 |
| 7 | 4xx conflict | email already taken | 409 |
| 8 | Contract | all required fields present, no `password_hash` | matches spec |
| 9 | Cross-tenant | admin from another org tries to fetch this user | 403 or 404 |

That's 9 test cases for one RF — the floor for a real API surface, not the ceiling.
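
One way to keep such a matrix maintainable is to encode it as data and drive every row through a single runner. A sketch; the `send` callable is an assumption standing in for whichever recipe actually executes the request:

```python
# A data-driven slice of the RF-03 matrix above: (case, method, path, payload, expected_status).
RF_03_MATRIX = [
    ("happy-path",   "POST", "/users", {"email": "qa-1@example.com", "name": "QA"}, 201),
    ("validation",   "POST", "/users", {"name": "QA"},                              422),
    ("auth-missing", "POST", "/users", {"email": "qa-2@example.com", "name": "QA"}, 401),
]

def run_matrix(send, matrix):
    """Run every row; `send(case, method, path, payload)` returns the actual status code."""
    return [
        {"case": case, "verdict": "PASS" if send(case, method, path, payload) == expected else "FAIL"}
        for case, method, path, payload, expected in matrix
    ]
```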

## What NOT to do

- **Don't test every combination** of validation failures. The framework already enforces type + presence; one test per kind of constraint is the signal.
- **Don't test the framework**. `Content-Type: application/json` parsing, default routing, etc. — those belong to FastAPI / Fastify / ASP.NET, not to your QA suite.
- **Don't write tests for endpoints with no PRD reference**. If a route exists but no RF describes it, that's a documentation gap to flag, not a test to add.
- **Don't skip 5xx because "it shouldn't happen"**. If you have a way to reproduce, do it. If you genuinely can't, document the skip in the QA report so the gap is visible.

## How `dw-run-qa` uses this

When in API mode, `/dw-run-qa` walks each `RF-XX` in the PRD, runs through this matrix, and emits PASS/FAIL per RF — not per test case. A single FAIL in any tier marks the RF as FAIL and lands a `BUG-NN` entry pointing to the failing log line.
@@ -0,0 +1,97 @@
# OpenAPI-driven mode — generating tests from a spec

When the project exposes an OpenAPI spec (static `openapi.yaml`/`openapi.json`, or dynamic `/openapi.json` for FastAPI), `/dw-run-qa` can derive a baseline test suite directly from it. This catches contract drift between code and spec for free.

## When to use this mode

- The project already maintains an OpenAPI spec — either hand-written, generated from code annotations (FastAPI, NestJS + `@nestjs/swagger`, dotnet Swashbuckle), or synced via a code generator.
- You want a quick "is this endpoint reachable AND does its response shape match the spec?" check.
- You want to detect when code drifts ahead of (or behind) the spec.

## What it generates

For each path × method in the spec:

1. A **happy-path test** using the spec's `requestBody` example (or a schema-derived sample).
2. A **contract-shape test** asserting the response matches the documented schema.

Paths/methods marked with the `x-internal: true` extension, or whose request shape can't be sampled, are skipped.

The generated tests live alongside hand-written ones in `{{PRD_PATH}}/QA/scripts/api/`. Filename pattern: `openapi-RF-XX-[path-slug].http` (or a stack-specific extension).

## How to run it

`/dw-run-qa --from-openapi <spec-path-or-url>` — explicit. The `<spec-path-or-url>` can be:

- `./openapi.yaml`
- `http://localhost:3000/openapi.json` (FastAPI default)
- `http://localhost:3000/swagger/v1/swagger.json` (ASP.NET Core default)

Without the flag, `/dw-run-qa` auto-detects:

- A file at the repo root: `openapi.yaml`, `openapi.json`, `swagger.yaml`, `swagger.json`.
- A project running locally: `GET /openapi.json`, `GET /swagger/v1/swagger.json`, `GET /api-docs`.

If found, the agent asks: "OpenAPI spec detected at `<path>`. Generate baseline tests from it? [y/n]". On `y`, the baseline is added on top of the PRD-derived matrix.

## Mapping spec endpoints to RFs

The PRD names requirements (`RF-01`, `RF-02`); the spec names paths (`/users`, `/orders/{id}`). Two conventions for cross-referencing:

- **By tag**: tests for a path tagged `users` map to the PRD section also tagged `users`. Cleanest if the project keeps tags consistent.
- **By summary keyword**: the spec's `summary` field is matched against PRD requirement titles. Less reliable; only use as a fallback.

If neither matches, the test lands as `openapi-no-rf-[slug].http` and the QA report flags it as "spec endpoint not mapped to any RF — possible documentation gap."

## Contract drift detection

For each response from a generated test, compare:

1. **Status code** — does the actual status appear in the spec's `responses` block?
2. **Required fields** — every field listed in the schema's `required` array must be present.
3. **Type matching** — `email: string` in the spec, but the actual value is `null`? Fail.
4. **No leaked fields** — fields NOT in the spec but present in the response are flagged as **drift forward** (code added a field; the spec is stale).
5. **Sensitive defaults** — fields named `password*`, `secret*`, `token*`, `*_hash` in the response trigger an immediate FAIL with severity HIGH, even if they're "documented."
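
Checks 2–4 reduce to a pure comparison of the response body against the schema. An illustrative sketch for a flat object schema (real specs also need `$ref` resolution and nested objects):

```python
def contract_drift(body: dict, schema: dict) -> list[str]:
    """Compare a JSON response body against a flat OpenAPI object schema.

    Returns human-readable drift findings (empty list = no drift).
    """
    findings = []
    properties = schema.get("properties", {})
    # Check 2: every field in the schema's `required` array must be present.
    for field in schema.get("required", []):
        if field not in body:
            findings.append(f"missing required field: {field}")
    # Check 3: primitive type match (simplified mapping; None never matches).
    type_map = {"string": str, "integer": int, "number": (int, float), "boolean": bool}
    for field, spec in properties.items():
        expected = type_map.get(spec.get("type"))
        if expected and field in body and not isinstance(body[field], expected):
            findings.append(f"{field}: documented {spec['type']}, got {type(body[field]).__name__}")
    # Check 4: fields in the response but not in the spec = drift forward.
    for field in body:
        if field not in properties:
            findings.append(f"undocumented field in response: {field} (drift forward)")
    return findings
```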

## Generating example payloads

If the spec has `example` or `examples`, use them verbatim. If only schemas exist, sample using a deterministic strategy:

| JSON Schema type | Sample value |
|------------------|--------------|
| `string` | `"qa-string"` (or `"qa@example.com"` if `format: email`, an ISO date if `format: date-time`, a UUID v4 if `format: uuid`) |
| `integer` | `1` (or a value within `minimum`/`maximum` if set) |
| `number` | `1.0` |
| `boolean` | `true` |
| `array` | one element of the inner type |
| `object` | recurse on `properties`; only include `required` fields |
| `enum` | first value |
| `oneOf`/`anyOf` | first variant |

Skip endpoints whose request shape can't be sampled deterministically (e.g., free-form JSON without a schema, file uploads requiring real binary data).
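
The table above translates almost mechanically into a recursive sampler. A sketch (no `$ref` resolution; only the `email` format is special-cased):

```python
def sample(schema: dict):
    """Derive a deterministic example value from a JSON Schema fragment."""
    if "enum" in schema:
        return schema["enum"][0]                      # enum: first value
    for key in ("oneOf", "anyOf"):
        if key in schema:
            return sample(schema[key][0])             # oneOf/anyOf: first variant
    t = schema.get("type")
    if t == "string":
        return "qa@example.com" if schema.get("format") == "email" else "qa-string"
    if t == "integer":
        return schema.get("minimum", 1)               # stay inside the documented range
    if t == "number":
        return 1.0
    if t == "boolean":
        return True
    if t == "array":
        return [sample(schema.get("items", {}))]      # one element of the inner type
    if t == "object":
        props = schema.get("properties", {})
        return {name: sample(props[name])             # recurse; required fields only
                for name in schema.get("required", []) if name in props}
    return None  # unsampleable: caller should skip this endpoint
```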

## What NOT to do

- **Don't replace the PRD-derived matrix with OpenAPI-only tests.** OpenAPI tells you what the code claims to do; the PRD tells you what the product needs. Both matter. Keep both.
- **Don't trust the spec implicitly.** If `dw-run-qa` finds 0 drift on day 1 and the team has been shipping for 6 months, suspect limited sampling coverage or a stale spec rather than concluding everything is healthy. Flag the suspicion in the QA report.
- **Don't generate tests for `x-internal: true` endpoints.** Those sit behind an internal-network boundary; QA on them needs different credentials and a different risk profile.

## Limitations

- Doesn't generate authorization tests automatically (the spec doesn't say "this endpoint should reject other-tenant tokens"). Hand-write those per the cross-tenant pattern in `matrix-conventions.md`.
- Doesn't generate state-mutating sequences (create → update → delete). Those need PRD context to know what state matters.
- Treats the spec as authoritative for contract drift, but not for behavior. A spec that's wrong will still fail tests against correct code — and that's the right outcome: update the spec.

## What `dw-run-qa` produces

When OpenAPI mode runs, the QA report gains a section:

```markdown
## OpenAPI baseline

- Spec source: openapi.yaml (53 paths, 121 operations)
- Endpoints sampled: 89 (32 skipped: missing examples, file uploads, `x-internal`)
- Drift detected: 4 endpoints (see RF-12, RF-15, RF-22, openapi-no-rf-internal-metrics)
- Contract issues:
  - RF-12 — `email` documented as required, response returns null
  - openapi-no-rf-internal-metrics — endpoint exists in spec but no PRD reference
```