ccqa 0.8.3 → 0.10.0
- package/README.md +115 -12
- package/dist/bin/ccqa.mjs +869 -303
- package/dist/package.json +1 -1
- package/dist/runtime/test-helpers.d.mts +8 -1
- package/dist/runtime/test-helpers.mjs +28 -3
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -69,6 +69,8 @@ ccqa run tasks/create-and-complete # vitest replays test.spec.ts; no LLM
|
|
|
69
69
|
ccqa run tasks/create-and-complete # Claude drives the browser every time
|
|
70
70
|
```
|
|
71
71
|
|
|
72
|
+
Live specs can start already-signed-in by pointing `statePath:` at a saved agent-browser state file (cookies + localStorage). Run an interactive login locally once, save the state with `agent-browser state save .ccqa/sessions/<name>.json`, then commit the path (not the file) — see [Pre-authenticated state](#pre-authenticated-state-statepath) below for the local bootstrap and the CI restore pattern.
|
|
73
|
+
|
|
72
74
|
By default deterministic runs write step-boundary screenshots and metadata to `ccqa-report/evidence/<feature>/<spec>/` so a reviewer can confirm a passing spec actually reached the states its `expected` clauses describe. Disable with `--no-evidence`.
|
|
73
75
|
|
|
74
76
|
In CI you can opt in to an HTML run report by passing `--report` — every failing spec gets a drift audit plus a root-cause call (TEST_DRIFT / SPEC_CHANGE / PRODUCT_BUG) using the branch's git diff as context, and the report lets a human grade those calls to measure their accuracy. Requires `ANTHROPIC_API_KEY` or a local Claude login for the analysis part. Opt out with `--no-failure-analysis` (which also implicitly skips the drift audit — the audit is rendered as evidence under the classification, so without the classification the cost has nowhere to land). Use `--no-drift-audit` to keep the classification but skip the audit. See [Run report](./docs/report.md).
|
|
@@ -84,6 +86,7 @@ ccqa run --changed --report # only specs whose relatedPaths t
|
|
|
84
86
|
|---|---|
|
|
85
87
|
| Write specs interactively with Claude | [Draft](./docs/draft.md) |
|
|
86
88
|
| Reuse login and other shared step sequences | [Blocks](./docs/blocks.md) |
|
|
89
|
+
| Drive `<input type="file">` without an OS picker | [File upload](./docs/file-upload.md) |
|
|
87
90
|
| Assertion helper functions | [Assertions](./docs/assertions.md) |
|
|
88
91
|
| Auto-fix failing tests | [Auto-fix](./docs/auto-fix.md) |
|
|
89
92
|
| Detect spec/code drift in CI | [Drift](./docs/drift.md) |
|
|
@@ -94,19 +97,22 @@ ccqa run --changed --report # only specs whose relatedPaths t
|
|
|
94
97
|
## Commands
|
|
95
98
|
|
|
96
99
|
```
|
|
100
|
+
ccqa init Scaffold .ccqa/prompts/{live,record}.{user,agent}.md templates
|
|
97
101
|
ccqa draft [feature/spec] Co-author a test spec with Claude
|
|
98
102
|
ccqa perspectives Inventory existing test coverage into .ccqa/perspectives.yaml
|
|
99
103
|
ccqa record <feature/spec> (deterministic specs only) Trace browser actions + generate test.spec.ts
|
|
100
|
-
ccqa run [feature/spec]
|
|
104
|
+
ccqa run [feature/spec...] Execute specs. Per spec, the spec.yaml `mode:` field selects deterministic
|
|
101
105
|
(vitest replay) or live (Claude drives every time). One run can mix both;
|
|
102
|
-
`--report` writes one unified HTML.
|
|
106
|
+
`--report` writes one unified HTML. Pass multiple targets space-separated.
|
|
103
107
|
ccqa drift [feature/spec] Standalone spec ↔ codebase static audit (for PR checks)
|
|
104
108
|
```
|
|
105
109
|
|
|
106
110
|
`ccqa run` flags:
|
|
107
111
|
|
|
108
112
|
- `--report [dir]` — write a self-contained HTML run report (default dir: `ccqa-report/`)
|
|
109
|
-
- `--
|
|
113
|
+
- `--profile <name>` — load `.ccqa/profiles/<name>.env` into the environment before resolving spec `${VAR}` references, so one spec targets dev/stg/prd without per-environment copies. See [Profiles](#profiles---profile).
|
|
114
|
+
- `--changed` — restrict execution to specs whose `relatedPaths` intersect `git diff <base>...HEAD`. Mutually exclusive with explicit spec targets.
|
|
115
|
+
- `--concurrency <n>` — run up to N specs in parallel **within each mode** (deterministic specs run as one phase, live specs as the next; parallelism is within a phase, not across). Default `1` (sequential, identical to before). Above 1, each spec's output is buffered and flushed as a labelled block so parallel logs stay legible. Live specs each launch their own headed Chrome, so high values spawn many browser instances.
|
|
110
116
|
- `--base <ref>` — base ref for the git diff (default: `$GITHUB_BASE_REF`, then `origin/main`)
|
|
111
117
|
- `--no-failure-analysis` — skip the per-failure root-cause classification (also skips the drift audit, since the audit only shows under the classification)
|
|
112
118
|
- `--no-drift-audit` — skip the spec ↔ code drift audit while keeping the classification
|
|
@@ -114,10 +120,11 @@ ccqa drift [feature/spec] Standalone spec ↔ codebase static audit (fo
|
|
|
114
120
|
- `--retry <n>` — (live specs only) retry each failing step up to N more times
|
|
115
121
|
- `--format <fmt>` — `text` (default), `json` (report.json), `github` (Actions annotations)
|
|
116
122
|
- `--out <dir>` — (live specs only, single-spec invocations) override the per-run artifact directory
|
|
123
|
+
- `--update-agent-prompt` — (live specs only) after the run, summarise it back to Claude and rewrite `.ccqa/prompts/live.agent.md` so the next run inherits the lessons learned. `ccqa record` ships the same flag, refreshing `record.agent.md` from the trace summary.
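The `--concurrency` behaviour described in the flag list above (parallelism within a phase, deterministic specs first, then live specs, with per-spec output buffered into a labelled block) can be sketched as follows. This is a minimal illustration and not ccqa's actual implementation: `Spec`, `planPhases`, `runPhase`, and the `runSpec` callback are hypothetical names introduced here.

```typescript
// Hypothetical types and helpers for illustration only.
type Mode = "deterministic" | "live";

interface Spec {
  name: string;
  mode: Mode;
}

// Group specs into phases: deterministic specs first, live specs next.
// Empty phases are dropped.
function planPhases(specs: Spec[]): Spec[][] {
  const deterministic = specs.filter((s) => s.mode === "deterministic");
  const live = specs.filter((s) => s.mode === "live");
  return [deterministic, live].filter((phase) => phase.length > 0);
}

// Run one phase with at most `limit` specs in flight at once.
// Each spec's output is buffered and returned as one labelled block.
async function runPhase(
  phase: Spec[],
  limit: number,
  runSpec: (spec: Spec) => Promise<string>, // assumed to resolve with the spec's log text
): Promise<string[]> {
  const blocks: string[] = new Array<string>(phase.length).fill("");
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < phase.length) {
      const index = next++;
      const spec = phase[index];
      if (spec === undefined) continue;
      blocks[index] = `[${spec.name}]\n${await runSpec(spec)}`;
    }
  };
  const workerCount = Math.max(1, Math.min(limit, phase.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return blocks;
}
```

With a limit of `1` the single worker processes specs one after another, which matches the sequential default. Phases are meant to be awaited in order, so a live spec never overlaps a deterministic one.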
|
|
117
124
|
|
|
118
125
|
All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus` | `haiku`, or a full model ID). The flag overrides `CCQA_MODEL`; when both are unset, the Claude Code CLI default is used. They also accept `--language <bcp47>` (e.g. `ja`, `en`) to set the language of human-readable output; the default `auto` follows the language of the spec/codebase. `--cwd <path>` works on `record` / `run` / `drift` so you can target a subpackage inside a monorepo from the repo root. Interactive commands authenticate via your local Claude Code login; commands that talk to Claude in CI (`ccqa run --report`, `ccqa drift`) additionally honor `ANTHROPIC_API_KEY`.
|
|
119
126
|
|
|
120
|
-
`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`.
|
|
127
|
+
`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`. `ccqa run` accepts several targets space-separated (each a `<feature>/<spec>`, a bare `<feature>` for all its specs, or omitted for everything); duplicates are de-duped and `--changed` cannot be combined with explicit targets.
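As a rough illustration of the target rules above, the sketch below maps a target to its on-disk path, de-dupes repeated targets, and rejects `--changed` combined with explicit targets. The function names and the error messages are hypothetical; ccqa's real resolver may behave differently, for example when expanding a bare `<feature>` into its individual specs.

```typescript
// Hypothetical helper: map "<feature>/<spec>" (or a bare "<feature>")
// to the directory layout described above.
function targetToPath(target: string): string {
  const segments = target.split("/").filter((s) => s.length > 0);
  if (segments.length < 1 || segments.length > 2) {
    throw new Error(`expected <feature> or <feature>/<spec>, got "${target}"`);
  }
  // A bare feature resolves to the directory that holds all of its specs.
  return [".ccqa/features", segments[0], "test-cases", ...segments.slice(1)].join("/");
}

// Hypothetical helper: validate flag combinations, de-dupe, and resolve paths.
function planTargets(targets: string[], changed: boolean): string[] {
  if (changed && targets.length > 0) {
    throw new Error("--changed cannot be combined with explicit targets");
  }
  const unique = targets.filter((t, i) => targets.indexOf(t) === i);
  return unique.map(targetToPath);
}
```

For instance, `planTargets(["auth/login", "tasks", "auth/login"], false)` yields two entries: the `auth/login` spec directory and the `tasks` feature's `test-cases` directory.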
|
|
121
128
|
|
|
122
129
|
## File structure
|
|
123
130
|
|
|
@@ -125,9 +132,14 @@ All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus`
|
|
|
125
132
|
.ccqa/
|
|
126
133
|
perspectives.yaml # Inventory of existing coverage (machine-readable, canonical)
|
|
127
134
|
perspectives.md # Category index, regenerated from the YAML
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
135
|
+
profiles/ # `--profile <name>` env files
|
|
136
|
+
stg.env # URLs + credential refs; commit if it uses secret-manager refs, gitignore if it holds plaintext secrets
|
|
137
|
+
prd.env
|
|
138
|
+
prompts/ # Run `ccqa init` to scaffold these
|
|
139
|
+
record.user.md # Human-maintained guidance appended to `ccqa record` (trace phase)
|
|
140
|
+
record.agent.md # Auto-updated by `ccqa record --update-agent-prompt`
|
|
141
|
+
live.user.md # Human-maintained guidance appended to `ccqa run` (live specs)
|
|
142
|
+
live.agent.md # Auto-updated by `ccqa run --update-agent-prompt`
|
|
131
143
|
blocks/
|
|
132
144
|
login/
|
|
133
145
|
spec.yaml # Reusable block (params + steps)
|
|
@@ -151,6 +163,26 @@ All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus`
|
|
|
151
163
|
|
|
152
164
|
Add `.ccqa/features/*/test-cases/*/runs/` to `.gitignore` — these are per-run artefacts that should not be committed. Likewise `ccqa-report*/`.
|
|
153
165
|
|
|
166
|
+
## Profiles (`--profile`)
|
|
167
|
+
|
|
168
|
+
Keep environment-specific values out of specs by writing them as `${VAR}` references and supplying them per environment from a **profile**: a `.env` file at `.ccqa/profiles/<name>.env`. Passing `--profile <name>` to `ccqa run` or `ccqa record` merges the profile into the environment before `${VAR}` references are resolved, so one spec runs against any environment.
|
|
169
|
+
|
|
170
|
+
```bash
|
|
171
|
+
# .ccqa/profiles/stg.env
|
|
172
|
+
APP_BASE_URL=https://<your-app-host>
|
|
173
|
+
TEST_USER_EMAIL=<stg-test-account>
|
|
174
|
+
TEST_USER_PASSWORD=...
|
|
175
|
+
```
|
|
176
|
+
```bash
|
|
177
|
+
ccqa run auth/login --profile stg # same spec, stg values
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
- The name is free-form (`stg`/`prd` are conventions). A name containing a path separator, `..`, or a leading dot is rejected, and a missing profile exits with status 2. Only the name is logged, never the values.
|
|
181
|
+
- Format is a small `.env` subset (`KEY=value`, `#` comments, `export`, quotes). Profile values **override** the inherited environment.
|
|
182
|
+
- Without `--profile`, ccqa auto-loads `<cwd>/.env` if present (like dotenv); with neither, `${VAR}` resolves against the existing `process.env` (e.g. `direnv`).
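The rules in the list above can be sketched in a few lines. This is an illustrative approximation under stated assumptions, not ccqa's parser: all function names are hypothetical, and leaving an unknown `${VAR}` untouched is a choice made for this sketch, since the README does not say how unresolved references are handled.

```typescript
// Hypothetical helpers; not ccqa's actual implementation.
type Env = Record<string, string | undefined>;

// Reject names that could escape .ccqa/profiles/:
// path separators, "..", or a leading dot.
function isValidProfileName(name: string): boolean {
  if (name.length === 0 || name.startsWith(".")) return false;
  if (name.includes("/") || name.includes("\\")) return false;
  return !name.includes("..");
}

// Parse a small .env subset: KEY=value, # comments, optional `export`, quotes.
function parseEnvSubset(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$/.exec(line);
    if (match === null) continue; // lines outside the subset are skipped in this sketch
    const key = match[1] ?? "";
    let value = match[2] ?? "";
    const quote = value.charAt(0);
    if (value.length >= 2 && (quote === '"' || quote === "'") && value.endsWith(quote)) {
      value = value.slice(1, -1); // strip one pair of matching quotes
    }
    out[key] = value;
  }
  return out;
}

// Profile values override the inherited environment.
function mergeProfile(inherited: Env, profile: Record<string, string>): Env {
  return { ...inherited, ...profile };
}

// Replace ${VAR} references; unknown variables are left as-is (assumption).
function resolveVars(input: string, env: Env): string {
  return input.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (whole: string, name: string) => env[name] ?? whole,
  );
}
```

Merging before resolving is what lets the same spec text produce different URLs per profile: the spec only ever contains the `${VAR}` reference.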
|
|
183
|
+
|
|
184
|
+
**Secrets:** gitignore any profile that holds plaintext secrets. ccqa only parses `.env` files — it doesn't resolve secret-manager references — so to keep secrets off disk, drop `--profile` and run ccqa under your secret manager instead (e.g. `op run --env-file=.ccqa/profiles/stg.env -- ccqa run ...`), which injects the resolved values into `process.env` for ccqa to read.
|
|
185
|
+
|
|
154
186
|
## Live specs (`mode: live`)
|
|
155
187
|
|
|
156
188
|
For specs declared `mode: live` in their spec.yaml, `ccqa run` skips codegen entirely: Claude executes each spec step against `agent-browser` directly, judges whether the step's `expected` outcome holds, and saves a PNG screenshot before and after every step. Use this mode when:
|
|
@@ -175,11 +207,82 @@ ccqa run --retry 2 tasks/create-and-complete
|
|
|
175
207
|
|
|
176
208
|
Constraints on selectors / `agent-browser` subcommands that apply during `ccqa record` (no `eval`, no `@ref`, no bare-tag positional `find`, no chained agent-browser calls) are **relaxed** for live specs — Claude can use any subcommand and any selector style because there is no replay contract to honour.
|
|
177
209
|
|
|
178
|
-
###
|
|
210
|
+
### Pre-authenticated state (`statePath:`)
|
|
211
|
+
|
|
212
|
+
By default each `ccqa run` of a live spec spins up a fresh `agent-browser` session and starts signed-out. That keeps runs hermetic but forces every device-trust gate (Slack "we don't recognize this browser", Google's unfamiliar-device prompt, MFA challenges, …) to fire on every run.
|
|
213
|
+
|
|
214
|
+
To skip them, save an authenticated browser state to a JSON file once locally and point the spec at it:
|
|
215
|
+
|
|
216
|
+
```yaml
|
|
217
|
+
title: Slack App Home — non-admin access denied
|
|
218
|
+
mode: live
|
|
219
|
+
statePath: .ccqa/sessions/slack-stg.json # cookies + localStorage to restore
|
|
220
|
+
steps:
|
|
221
|
+
- ...
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
ccqa resolves the path against the project root and passes `--state <path>` to every `agent-browser` invocation in the run (including ccqa's own screenshot calls). The file is **read-only** — `--state` loads it but never writes back to it. Re-running locally or in CI does not mutate it.
|
|
225
|
+
|
|
226
|
+
Bootstrap once locally:
|
|
227
|
+
|
|
228
|
+
```bash
|
|
229
|
+
# 1. Log in interactively in a headed browser.
|
|
230
|
+
agent-browser --headed open https://app.slack.com
|
|
231
|
+
# …complete login + device-trust prompts by hand…
|
|
232
|
+
|
|
233
|
+
# 2. Snapshot cookies + localStorage to the path the spec references.
|
|
234
|
+
mkdir -p .ccqa/sessions
|
|
235
|
+
agent-browser state save .ccqa/sessions/slack-stg.json
|
|
236
|
+
agent-browser close
|
|
237
|
+
|
|
238
|
+
# 3. ccqa run reuses the saved state — no login prompt.
|
|
239
|
+
ccqa run slack/app-home-non-admin-access-denied
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
Add `.ccqa/sessions/` to `.gitignore` — these files contain live auth cookies and must never be committed.
|
|
243
|
+
|
|
244
|
+
#### CI: bring the state file with you
|
|
245
|
+
|
|
246
|
+
`statePath:` lives entirely inside `.ccqa/` and never touches `~/`. CI re-uses the state by writing the file into the same path the spec already references:
|
|
247
|
+
|
|
248
|
+
```bash
|
|
249
|
+
# Locally, after the interactive bootstrap above:
|
|
250
|
+
base64 -i .ccqa/sessions/slack-stg.json | pbcopy
|
|
251
|
+
# paste into your CI secret store as CCQA_SLACK_STG_STATE_B64
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
```yaml
|
|
255
|
+
# .github/workflows/ccqa.yml (sketch)
|
|
256
|
+
- name: Restore agent-browser state
|
|
257
|
+
env:
|
|
258
|
+
CCQA_SLACK_STG_STATE_B64: ${{ secrets.CCQA_SLACK_STG_STATE_B64 }}
|
|
259
|
+
run: |
|
|
260
|
+
mkdir -p .ccqa/sessions
|
|
261
|
+
printf '%s' "$CCQA_SLACK_STG_STATE_B64" | base64 -d \
|
|
262
|
+
> .ccqa/sessions/slack-stg.json
|
|
263
|
+
|
|
264
|
+
- name: Run live specs
|
|
265
|
+
env:
|
|
266
|
+
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
|
|
267
|
+
run: pnpm ccqa run --report
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
Caveats:
|
|
271
|
+
|
|
272
|
+
- **Expiry.** Whatever the upstream service's "remember this device" window is (Slack ≈ 30 days, others vary), the cookies in the state file eventually expire and CI starts failing on the device-trust gate again. Re-bootstrap locally and rotate the secret.
|
|
273
|
+
- **Treat the file as a credential.** It contains live auth cookies. Store it in your CI secret manager (GitHub Actions encrypted secrets, Vault, …) and never commit it.
|
|
274
|
+
- **Deterministic specs ignore `statePath:`.** Today it only affects `mode: live`; vitest-replayed specs always run isolated.
|
|
275
|
+
|
|
276
|
+
### Per-project guidance (`.ccqa/prompts/live.user.md` + `live.agent.md`)
|
|
277
|
+
|
|
278
|
+
ccqa's live-mode system prompt is deliberately product-agnostic. Anything specific to **your** project — staging URLs, login flow quirks, rich-editor types, common access-denied wording — belongs in two sibling files (run `ccqa init` to scaffold both):
|
|
279
|
+
|
|
280
|
+
- `.ccqa/prompts/live.user.md` — human-maintained stable guidance.
|
|
281
|
+
- `.ccqa/prompts/live.agent.md` — auto-updated by `ccqa run --update-agent-prompt` from each run's summary. You can hand-edit it, but the next `--update-agent-prompt` run may rewrite the whole file; durable rules should live in `live.user.md`.
|
|
179
282
|
|
|
180
|
-
|
|
283
|
+
Both files (when present) are read once per invocation and appended to the system prompt under "Project-specific guidance". The `ccqa record` (trace) side has the same split: `record.user.md` + `record.agent.md`, refreshed by `ccqa record --update-agent-prompt`.
|
|
181
284
|
|
|
182
|
-
Keep
|
|
285
|
+
Keep them short. A page or two of focused notes beats a long handbook: Claude already has the spec's `expected` text to work from, so these files are for the *non-obvious* product knowledge that isn't in any single spec. Examples of what's useful here:
|
|
183
286
|
|
|
184
287
|
- "the rich text editor is `[contenteditable='true']` — use `fill`, not keystrokes"
|
|
185
288
|
- "login redirects through an IDP service-selection screen; you can skip it by opening the destination URL directly"
|
|
@@ -189,9 +292,9 @@ Examples of what does **not** belong:
|
|
|
189
292
|
|
|
190
293
|
- per-spec details (those belong in the spec's `instruction` / `expected`)
|
|
191
294
|
- restating the STEP_RESULT contract (already in the system prompt)
|
|
192
|
-
- copy-pasted style guidelines from `
|
|
295
|
+
- copy-pasted style guidelines from `record.user.md` (the relaxed-constraint mode doesn't need them)
|
|
193
296
|
|
|
194
|
-
The
|
|
297
|
+
The combined bundle is capped at 32 KiB; anything beyond that is truncated with a warning.
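A minimal sketch of such a cap, assuming the limit is measured in UTF-8 bytes over the concatenation of the two guidance files. `buildGuidance` and `MAX_GUIDANCE_BYTES` are hypothetical names, and how ccqa joins the files or words its warning is not specified in the README.

```typescript
// Hypothetical constant: 32 KiB expressed in bytes.
const MAX_GUIDANCE_BYTES = 32 * 1024;

// Combine the user and agent guidance files and enforce the byte cap.
// `truncated` tells the caller whether a warning should be shown.
function buildGuidance(
  userMd: string | undefined,
  agentMd: string | undefined,
): { text: string; truncated: boolean } {
  const combined = [userMd, agentMd]
    .filter((part): part is string => part !== undefined && part.trim() !== "")
    .join("\n\n");
  const bytes = new TextEncoder().encode(combined);
  if (bytes.length <= MAX_GUIDANCE_BYTES) {
    return { text: combined, truncated: false };
  }
  // Cutting on a byte boundary can split a multi-byte character;
  // TextDecoder then emits a replacement character at the cut.
  const text = new TextDecoder().decode(bytes.slice(0, MAX_GUIDANCE_BYTES));
  return { text, truncated: true };
}
```

Returning a flag instead of logging inside the helper keeps the truncation decision testable and leaves the warning text to the caller.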
|
|
195
298
|
|
|
196
299
|
## License
|
|
197
300
|
|