ccqa 0.3.9 → 0.4.0
- package/README.md +39 -301
- package/dist/bin/ccqa.mjs +2180 -1124
- package/dist/package.json +2 -2
- package/dist/runtime/test-helpers.mjs +1 -53
- package/dist/runtime/vitest.config.d.mts +10 -10
- package/dist/spawn-ab-BxjEhA5e.mjs +65 -0
- package/package.json +2 -2
package/README.md
CHANGED
|
@@ -2,375 +2,113 @@
|
|
|
2
2
|
|
|
3
3
|
**Your Claude subscription already includes a QA engineer.**
|
|
4
4
|
|
|
5
|
-
ccqa turns Claude Code into a browser test recorder.
|
|
5
|
+
ccqa turns Claude Code into a browser test recorder. Write a spec in YAML, run `ccqa trace`, and Claude drives your app via [agent-browser](https://github.com/vercel-labs/agent-browser). Every action is recorded and compiled into a deterministic test script you can run in CI. No extra API key. Just `claude`.
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
Every action is recorded as structured data and compiled into a deterministic test script you can run in CI. No extra API key. Just `claude`.
|
|
7
|
+
[日本語版 README](./docs/README.ja.md)
|
|
10
8
|
|
|
11
9
|
## How it works
|
|
12
10
|
|
|
13
11
|
```mermaid
|
|
14
12
|
flowchart LR
|
|
15
|
-
A["Write spec\n(
|
|
13
|
+
A["Write spec\n(spec.yaml)"] --> B["ccqa trace\n(Claude drives browser)"]
|
|
16
14
|
B --> C["ccqa generate\n(LLM → test script)"]
|
|
17
15
|
C --> D["ccqa run\n(deterministic replay)"]
|
|
18
16
|
```
|
|
19
17
|
|
|
20
|
-
`trace` invokes Claude Code with your spec. Claude drives the browser step by step
|
|
18
|
+
`trace` invokes Claude Code with your spec. Claude drives the browser step by step, recording every action as structured data. `generate` compiles that data into a vitest-compatible script. `run` replays it deterministically — no LLM involved.
|
|
21
19
|
|
|
22
20
|
## Install
|
|
23
21
|
|
|
24
|
-
Add ccqa as a dev dependency in your project:
|
|
25
|
-
|
|
26
|
-
```bash
|
|
27
|
-
pnpm add -D ccqa vitest
|
|
28
|
-
# or
|
|
29
|
-
npm install -D ccqa vitest
|
|
30
|
-
```
|
|
31
|
-
|
|
32
|
-
Then invoke the CLI via your package runner:
|
|
33
|
-
|
|
34
22
|
```bash
|
|
35
|
-
pnpm
|
|
36
|
-
# or
|
|
37
|
-
npx ccqa trace tasks/create-and-complete
|
|
23
|
+
pnpm add -D ccqa vitest agent-browser
|
|
38
24
|
```
|
|
39
25
|
|
|
40
|
-
|
|
26
|
+
Requires Node.js **20+**. [agent-browser](https://github.com/vercel-labs/agent-browser) is a peer dependency.
|
|
41
27
|
|
|
42
|
-
|
|
43
|
-
pnpm add -D agent-browser
|
|
44
|
-
```
|
|
28
|
+
## Quick start
|
|
45
29
|
|
|
46
|
-
|
|
30
|
+
**1. Write a spec** — by hand, or interactively with [`ccqa draft`](./docs/draft.md)
|
|
47
31
|
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
```markdown
|
|
51
|
-
<!-- .ccqa/features/tasks/test-cases/create-and-complete/test-spec.md -->
|
|
52
|
-
---
|
|
32
|
+
```yaml
|
|
33
|
+
# .ccqa/features/tasks/test-cases/create-and-complete/spec.yaml
|
|
53
34
|
title: Create a task and mark it complete
|
|
54
|
-
baseUrl: http://localhost:3000
|
|
55
|
-
---
|
|
56
|
-
|
|
57
|
-
## Steps
|
|
58
35
|
|
|
59
|
-
|
|
60
|
-
-
|
|
61
|
-
|
|
36
|
+
steps:
|
|
37
|
+
- instruction: |
|
|
38
|
+
Open ${APP_URL}/login. Fill in email and password, submit the form.
|
|
39
|
+
expected: Redirected to /dashboard, user avatar visible in the header
|
|
62
40
|
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
### Step 3: Mark the task as complete
|
|
68
|
-
- **Instruction**: Open the task "Fix login bug", click "Mark as complete"
|
|
69
|
-
- **Expected**: Task status changes to "Done", task moves to the completed section
|
|
41
|
+
- instruction: |
|
|
42
|
+
Click "New Task", fill in the title "Fix login bug", set priority to High, save.
|
|
43
|
+
expected: Task appears in the task list with status "Open"
|
|
70
44
|
```
|
|
71
45
|
|
|
72
|
-
|
|
46
|
+
URLs live inside `instruction` strings — either verbatim or via `${ENV_VAR}` references for environment-specific values.
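
A minimal sketch of how `${ENV_VAR}` references in an `instruction` string could be expanded before the instruction is handed to Claude. The helper name `expandEnvRefs` is hypothetical, and throwing on an unset variable is an assumption: the README does not say how ccqa treats missing values.

```typescript
// Hypothetical helper (not ccqa's actual code): expand ${NAME} references
// in an instruction string using the supplied environment map.
function expandEnvRefs(
  instruction: string,
  env: Record<string, string | undefined>,
): string {
  return instruction.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g,
    (_match: string, name: string) => {
      const value = env[name];
      // Assumption: fail loudly on an unset variable rather than leave the
      // reference in place or substitute an empty string.
      if (value === undefined) {
        throw new Error(`Environment variable ${name} is not set`);
      }
      return value;
    },
  );
}

// The env map is passed explicitly here; in a real run it would be process.env.
const expanded = expandEnvRefs(
  "Open ${APP_URL}/login. Fill in email and password, submit the form.",
  { APP_URL: "http://localhost:3000" },
);
console.log(expanded);
```

Passing the environment as an argument keeps the sketch easy to test and makes it explicit which values a spec depends on.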
|
|
47
|
+
|
|
48
|
+
**2. Trace** — Claude drives the browser and records every action
|
|
73
49
|
|
|
74
50
|
```bash
|
|
75
51
|
ccqa trace tasks/create-and-complete
|
|
76
52
|
```
|
|
77
53
|
|
|
78
|
-
|
|
79
|
-
▶ trace tasks/create-and-complete
|
|
80
|
-
spec Create a task and mark it complete
|
|
81
|
-
url http://localhost:3000
|
|
82
|
-
steps 3
|
|
83
|
-
|
|
84
|
-
Running agent-browser session...
|
|
85
|
-
● step-01 Log in
|
|
86
|
-
● step-02 Create a new task
|
|
87
|
-
● step-03 Mark the task as complete
|
|
88
|
-
|
|
89
|
-
trace .ccqa/features/tasks/test-cases/create-and-complete/actions.json
|
|
90
|
-
actions 24
|
|
91
|
-
status PASSED
|
|
92
|
-
```
|
|
93
|
-
|
|
94
|
-
**3. Generate — convert recorded actions into a replayable test**
|
|
54
|
+
**3. Generate** — convert recorded actions into a replayable test
|
|
95
55
|
|
|
96
56
|
```bash
|
|
97
57
|
ccqa generate tasks/create-and-complete
|
|
98
58
|
```
|
|
99
59
|
|
|
100
|
-
**4. Run — replay deterministically, no LLM involved
|
|
60
|
+
**4. Run** — replay deterministically, no LLM involved
|
|
101
61
|
|
|
102
62
|
```bash
|
|
103
63
|
ccqa run tasks/create-and-complete
|
|
104
64
|
```
|
|
105
65
|
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
Writing a `test-spec.md` from scratch means digging into your codebase to find the right aria-labels, URLs, and button text. `ccqa draft` puts Claude in the loop: you describe what you want to test in plain language, Claude reads the relevant code, and you refine the spec interactively.
|
|
66
|
+
In CI you can opt in to drift analysis on test failures by passing `--drift` — Claude will explain the failure by comparing the spec against the current codebase. Requires `ANTHROPIC_API_KEY` or a local Claude login.
|
|
109
67
|
|
|
110
68
|
```bash
|
|
111
|
-
ccqa
|
|
112
|
-
```
|
|
113
|
-
|
|
114
|
-
The first run asks for your intent, proposes a `feature/spec` name, and writes a draft. Each subsequent invocation lets you give a refinement instruction — empty input means "just re-check the current spec against the code." Press `y` at the final "Are you done with this draft?" prompt to end the session.
|
|
115
|
-
|
|
116
|
-
```
|
|
117
|
-
ccqa draft
|
|
118
|
-
|
|
119
|
-
What do you want to test? > Select a category on the AI Maintenance page and run a check
|
|
120
|
-
Proposing a feature/spec name based on your intent...
|
|
121
|
-
proposed: ai-maintenance/run-check-with-category
|
|
122
|
-
Use this name? [y/N/edit] > y
|
|
123
|
-
|
|
124
|
-
Reading codebase and drafting spec...
|
|
125
|
-
✓ 5 Read, 3 Grep, 2 Glob (4.2s)
|
|
126
|
-
|
|
127
|
-
── Review (1 warning, 3 passed) ───────────────────────────────────
|
|
128
|
-
|
|
129
|
-
WARNINGS (1)
|
|
130
|
-
Assertability step-05
|
|
131
|
-
Result row may still show "running" right after the click
|
|
132
|
-
└ ContentQualityCheck.tsx polls every 5s; the status starts at
|
|
133
|
-
IN_PROGRESS and only flips to SUCCEEDED later.
|
|
134
|
-
|
|
135
|
-
PASSED (3)
|
|
136
|
-
Setup references, Step granularity, Unimplemented checks
|
|
137
|
-
|
|
138
|
-
────────────────────────────────────────────────────────────────────
|
|
139
|
-
|
|
140
|
-
--- proposed changes ---
|
|
141
|
-
+ ---
|
|
142
|
-
+ title: "AI Maintenance — content quality check"
|
|
143
|
-
...
|
|
144
|
-
|
|
145
|
-
Apply this patch? [y/N] y
|
|
146
|
-
saved: .ccqa/features/ai-maintenance/test-cases/run-check-with-category/test-spec.md
|
|
147
|
-
|
|
148
|
-
How would you like to refine? (empty = re-validate) >
|
|
69
|
+
ccqa run tasks/create-and-complete --drift --format github
|
|
149
70
|
```
|
|
150
71
|
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
### What gets reviewed
|
|
72
|
+
## Features
|
|
154
73
|
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
| Check | What it verifies |
|
|
74
|
+
| Feature | Docs |
|
|
158
75
|
|---|---|
|
|
159
|
-
|
|
|
160
|
-
|
|
|
161
|
-
|
|
|
162
|
-
|
|
|
163
|
-
|
|
164
|
-
Findings with severity `WARN` or `ERROR` are shown in full; `OK` checks collapse to a one-line summary.
|
|
165
|
-
|
|
166
|
-
### Flags
|
|
167
|
-
|
|
168
|
-
```
|
|
169
|
-
ccqa draft [feature/spec] # arg is optional; Claude proposes a name if omitted
|
|
170
|
-
--instruction <text> # single-shot, non-interactive
|
|
171
|
-
--apply # auto-apply patches without [y/N] confirmation
|
|
172
|
-
```
|
|
173
|
-
|
|
174
|
-
## Setup Specs — Reusable shared procedures
|
|
175
|
-
|
|
176
|
-
Setup specs let you define reusable procedures (login, data preparation, etc.) that run before your test steps. Define once, use across multiple test specs.
|
|
177
|
-
|
|
178
|
-
### 1. Write a setup spec
|
|
179
|
-
|
|
180
|
-
```markdown
|
|
181
|
-
<!-- .ccqa/setups/login/setup-spec.md -->
|
|
182
|
-
---
|
|
183
|
-
title: "Login"
|
|
184
|
-
placeholders:
|
|
185
|
-
loginUrl:
|
|
186
|
-
dummy: "http://localhost:3000/login"
|
|
187
|
-
description: "Login page URL"
|
|
188
|
-
email:
|
|
189
|
-
dummy: "user@example.com"
|
|
190
|
-
description: "Email address"
|
|
191
|
-
password:
|
|
192
|
-
dummy: "secret"
|
|
193
|
-
description: "Password"
|
|
194
|
-
---
|
|
195
|
-
|
|
196
|
-
## Steps
|
|
197
|
-
|
|
198
|
-
### Step 1: Open login page
|
|
199
|
-
- **Instruction**: Navigate to {{loginUrl}}
|
|
200
|
-
- **Expected**: Login form is displayed
|
|
201
|
-
|
|
202
|
-
### Step 2: Enter credentials and log in
|
|
203
|
-
- **Instruction**: Enter email {{email}} and password {{password}}, then submit
|
|
204
|
-
- **Expected**: Login succeeds
|
|
205
|
-
```
|
|
206
|
-
|
|
207
|
-
The `placeholders` section defines variables with `dummy` values. During `trace-setup`, the dummy values are used for actual browser operation. During `generate-setup`, they are reverse-replaced with `{{key}}` placeholders.
|
|
208
|
-
|
|
209
|
-
### 2. Trace the setup
|
|
210
|
-
|
|
211
|
-
```bash
|
|
212
|
-
ccqa trace-setup login
|
|
213
|
-
```
|
|
214
|
-
|
|
215
|
-
### 3. Generate and validate the setup
|
|
216
|
-
|
|
217
|
-
```bash
|
|
218
|
-
ccqa generate-setup login
|
|
219
|
-
```
|
|
220
|
-
|
|
221
|
-
This generates `test.dummy.spec.ts` with dummy values, runs vitest to validate, and applies auto-fix. On success, it reverse-replaces dummy values with placeholders and saves `test.spec.ts`.
|
|
222
|
-
|
|
223
|
-
If auto-fix fails, edit `test.dummy.spec.ts` manually and re-run:
|
|
224
|
-
|
|
225
|
-
```bash
|
|
226
|
-
ccqa generate-setup login --from-dummy
|
|
227
|
-
```
|
|
228
|
-
|
|
229
|
-
### 4. Reference from test specs
|
|
230
|
-
|
|
231
|
-
```markdown
|
|
232
|
-
---
|
|
233
|
-
title: Create a task
|
|
234
|
-
baseUrl: http://localhost:3000
|
|
235
|
-
setups:
|
|
236
|
-
- name: login
|
|
237
|
-
params:
|
|
238
|
-
loginUrl: "http://localhost:3000/login"
|
|
239
|
-
email: "admin@example.com"
|
|
240
|
-
password: "AdminPass123"
|
|
241
|
-
---
|
|
242
|
-
|
|
243
|
-
## Steps
|
|
244
|
-
### Step 1: Create a new task
|
|
245
|
-
...
|
|
246
|
-
```
|
|
247
|
-
|
|
248
|
-
When you run `ccqa trace` or `ccqa generate`, the setup's test body is loaded, placeholders are replaced with `params` values, and it runs before your test steps — sharing the same browser session.
|
|
249
|
-
|
|
250
|
-
## What gets generated
|
|
251
|
-
|
|
252
|
-
`ab()` is a thin wrapper around [agent-browser](https://github.com/vercel-labs/agent-browser) — a headless browser CLI. Each call spawns `agent-browser <command>` as a subprocess and throws if it exits non-zero. No browser driver setup, no async/await, no `.waitFor()`.
|
|
253
|
-
|
|
254
|
-
```typescript
|
|
255
|
-
// .ccqa/features/tasks/test-cases/create-and-complete/test.spec.ts
|
|
256
|
-
import { test } from "vitest";
|
|
257
|
-
import { ab, abWait, abAssertUrl, abAssertTextVisible, abAssertEnabled } from "ccqa/test-helpers";
|
|
258
|
-
|
|
259
|
-
process.env.AGENT_BROWSER_SESSION = `ccqa-run-${Date.now()}`;
|
|
260
|
-
|
|
261
|
-
test("setup: login", () => {
|
|
262
|
-
ab("cookies", "clear");
|
|
263
|
-
ab("open", "http://localhost:3000/login");
|
|
264
|
-
ab("fill", "[placeholder='Email']", "admin@example.com");
|
|
265
|
-
ab("fill", "[type='password']", "AdminPass123");
|
|
266
|
-
ab("press", "Enter");
|
|
267
|
-
}, 3 * 60 * 1000);
|
|
268
|
-
|
|
269
|
-
test("Create a task", () => {
|
|
270
|
-
ab("open", "http://localhost:3000");
|
|
271
|
-
|
|
272
|
-
// Create a new task
|
|
273
|
-
ab("click", "[aria-label='New Task']");
|
|
274
|
-
ab("fill", "[placeholder='Task title']", "Fix login bug");
|
|
275
|
-
ab("select", "[aria-label='Priority']", "High");
|
|
276
|
-
ab("click", "[aria-label='Save']");
|
|
277
|
-
abAssertTextVisible("Fix login bug");
|
|
278
|
-
abAssertTextVisible("Open");
|
|
279
|
-
}, 5 * 60 * 1000);
|
|
280
|
-
```
|
|
281
|
-
|
|
282
|
-
Setup and test share the same `AGENT_BROWSER_SESSION` — login state carries over. Each run starts with `cookies clear` to ensure a clean session.
|
|
283
|
-
|
|
284
|
-
## Assertions
|
|
285
|
-
|
|
286
|
-
During `trace`, Claude verifies each step with at least two independent signals and emits structured assertions. These become typed helper calls in the generated script:
|
|
287
|
-
|
|
288
|
-
| Assert | What it checks |
|
|
289
|
-
|--------|---------------|
|
|
290
|
-
| `abAssertTextVisible(text)` | Text appears on page (waits up to 30s) |
|
|
291
|
-
| `abAssertUrl(pattern)` | Current URL contains pattern |
|
|
292
|
-
| `abAssertEnabled(selector)` | Button/input is enabled |
|
|
293
|
-
| `abAssertDisabled(selector)` | Button/input is disabled |
|
|
294
|
-
| `abAssertVisible(selector)` | Element is visible |
|
|
295
|
-
| `abAssertNotVisible(selector)` | Element is hidden |
|
|
296
|
-
| `abAssertChecked(selector)` | Checkbox is checked |
|
|
297
|
-
| `abAssertUnchecked(selector)` | Checkbox is unchecked |
|
|
298
|
-
|
|
299
|
-
Assertions are stability-aware: Claude skips timestamps, session IDs, and exact counts that vary between runs.
|
|
300
|
-
|
|
301
|
-
## Auto-fix
|
|
302
|
-
|
|
303
|
-
If the generated script fails, `generate` invokes an LLM to diagnose the failure and propose a fix. The diagnosis is one of:
|
|
304
|
-
|
|
305
|
-
- **TIMING_ISSUE** — insert or extend `sleep` so the page has time to settle.
|
|
306
|
-
- **OVER_ASSERTION** — remove `abAssert*` lines that the spec doesn't actually require.
|
|
307
|
-
- **SELECTOR_DRIFT** — replace a renamed selector with the new one. The diagnose LLM is allowed to `Grep` / `Read` your repository (read-only) to find the actual `aria-label` / `placeholder` / `data-testid` / i18n string in the app source, so renames in the UI code are caught even when the failure log only says "selector not visible".
|
|
308
|
-
- **DATA_MISSING** / **UNKNOWN** — not auto-fixable; the loop bails and reports the diagnosis.
|
|
309
|
-
|
|
310
|
-
Each diagnosis has a `confidence` score. By default high-confidence fixes are applied automatically; low-confidence fixes drop into an interactive `[a]pply / [s]kip / [m]anual / [q]uit` prompt.
|
|
311
|
-
|
|
312
|
-
```bash
|
|
313
|
-
ccqa generate tasks/create-and-complete # default: interactive on low confidence
|
|
314
|
-
ccqa generate tasks/create-and-complete --auto # CI: always auto-apply
|
|
315
|
-
ccqa generate tasks/create-and-complete --no-interactive # CI: auto-apply on high confidence, give up otherwise
|
|
316
|
-
ccqa generate tasks/create-and-complete --max-retries 5
|
|
317
|
-
```
|
|
318
|
-
|
|
319
|
-
> **Note**: `generate` regenerates `test.spec.ts` from `actions.json` on every run. Manual edits to `test.spec.ts` are lost on the next `generate`. When an existing `test.spec.ts` is detected, `generate` always asks for `y/N` confirmation before overwriting (even with `--auto` / `--no-interactive`). To skip the prompt in CI, pass `--force`. To persist a fix, re-run `trace` so `actions.json` reflects the new flow.
|
|
76
|
+
| Write specs interactively with Claude | [Draft](./docs/draft.md) |
|
|
77
|
+
| Reuse login and other shared step sequences | [Blocks](./docs/blocks.md) |
|
|
78
|
+
| Assertion helper functions | [Assertions](./docs/assertions.md) |
|
|
79
|
+
| Auto-fix failing tests | [Auto-fix](./docs/auto-fix.md) |
|
|
80
|
+
| Detect spec/code drift in CI | [Drift](./docs/drift.md) |
|
|
320
81
|
|
|
321
82
|
## Commands
|
|
322
83
|
|
|
323
84
|
```
|
|
324
85
|
ccqa draft [feature/spec] Co-author a test spec with Claude
|
|
325
|
-
|
|
326
|
-
--apply Auto-apply patches without [y/N] confirmation
|
|
327
|
-
|
|
328
|
-
ccqa trace <feature/spec> Record browser actions for a test spec
|
|
86
|
+
ccqa trace <feature/spec> Record browser actions for a spec (inlines any included blocks)
|
|
329
87
|
ccqa generate <feature/spec> Generate test script from recorded actions
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
--force Overwrite an existing test.spec.ts without prompting
|
|
333
|
-
--max-retries <n> Default: 3
|
|
334
|
-
ccqa run [feature/spec] Execute generated test scripts
|
|
335
|
-
|
|
336
|
-
ccqa trace-setup <name> Record browser actions for a setup spec
|
|
337
|
-
ccqa generate-setup <name> Generate and validate setup test script
|
|
338
|
-
--from-dummy Resume from manually edited test.dummy.spec.ts
|
|
339
|
-
--auto / --no-interactive Same semantics as `generate`
|
|
88
|
+
ccqa run [feature/spec] Execute generated test scripts (add --drift to analyze failures)
|
|
89
|
+
ccqa drift [feature/spec] Standalone spec ↔ codebase drift audit (for scheduled jobs)
|
|
340
90
|
```
|
|
341
91
|
|
|
342
|
-
All Claude-driven commands
|
|
92
|
+
All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus` | `haiku`, or a full model ID). The flag overrides `CCQA_MODEL`; when both are unset, the Claude Code CLI default is used. Interactive commands authenticate via your local Claude Code login; commands that talk to Claude in CI (`ccqa run --drift`, `ccqa drift`) additionally honor `ANTHROPIC_API_KEY`.
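
The precedence described above can be sketched as a small function. This is an illustration only: `resolveModel` is a hypothetical name, and treating an empty string as "unset" is an assumption the README does not confirm.

```typescript
// Hypothetical sketch of the model precedence; not ccqa's actual code.
function resolveModel(
  flagValue: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  // 1. -m / --model wins whenever it is given.
  if (flagValue !== undefined && flagValue !== "") {
    return flagValue;
  }
  // 2. Otherwise fall back to the CCQA_MODEL environment variable.
  const fromEnv = env["CCQA_MODEL"];
  if (fromEnv !== undefined && fromEnv !== "") {
    return fromEnv;
  }
  // 3. Neither is set: return undefined so the Claude Code CLI default applies.
  return undefined;
}

console.log(resolveModel("opus", { CCQA_MODEL: "haiku" }));
console.log(resolveModel(undefined, { CCQA_MODEL: "haiku" }));
console.log(resolveModel(undefined, {}));
```

Returning `undefined` in the last case, rather than a hard-coded model name, leaves the default to the Claude Code CLI as the paragraph describes.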
|
|
343
93
|
|
|
344
|
-
`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`.
|
|
94
|
+
`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`.
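
As a rough illustration of the alias, the mapping from `<feature/spec>` to the on-disk directory might look like the following. `resolveSpecDir` is a hypothetical helper, not part of ccqa's public API, and the validation rules are assumptions.

```typescript
// Hypothetical helper: map a 2-segment alias to its directory under .ccqa/.
function resolveSpecDir(alias: string): string {
  const segments = alias.split("/");
  // Assumption: exactly two non-empty segments are required.
  if (segments.length !== 2 || segments.some((s) => s.length === 0)) {
    throw new Error(`Expected "<feature>/<spec>", got "${alias}"`);
  }
  const [feature, spec] = segments;
  return `.ccqa/features/${feature}/test-cases/${spec}/`;
}

console.log(resolveSpecDir("tasks/create-and-complete"));
```

With the example alias used throughout this README, the result is the directory that holds `spec.yaml`, `actions.json`, and `test.spec.ts`.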
|
|
345
95
|
|
|
346
96
|
## File structure
|
|
347
97
|
|
|
348
98
|
```
|
|
349
99
|
.ccqa/
|
|
350
|
-
|
|
100
|
+
blocks/
|
|
351
101
|
login/
|
|
352
|
-
|
|
353
|
-
test.spec.ts # Generated setup script (with {{placeholders}})
|
|
102
|
+
spec.yaml # Reusable block (params + steps)
|
|
354
103
|
features/
|
|
355
104
|
tasks/
|
|
356
105
|
test-cases/
|
|
357
106
|
create-and-complete/
|
|
358
|
-
|
|
107
|
+
spec.yaml # Test definition
|
|
359
108
|
actions.json # Recorded actions from trace
|
|
360
109
|
test.spec.ts # Generated test script
|
|
361
110
|
```
|
|
362
111
|
|
|
363
|
-
## Why not write Playwright tests by hand?
|
|
364
|
-
|
|
365
|
-
| | ccqa | Hand-written Playwright |
|
|
366
|
-
|---|---|---|
|
|
367
|
-
| Write selectors | Claude picks them from ARIA snapshots | You inspect the DOM |
|
|
368
|
-
| Handle timing | Recorded wait commands, auto-fix sleep | `waitFor`, `expect().toBeVisible()` |
|
|
369
|
-
| Assertions | Auto-generated from verified signals | Written manually |
|
|
370
|
-
| Login / setup | Shared setup specs with placeholders | Custom fixtures per project |
|
|
371
|
-
| Update after UI change | Re-run `trace` | Find and update every affected locator |
|
|
372
|
-
| Runs in CI | Yes (deterministic replay, no LLM) | Yes |
|
|
373
|
-
|
|
374
112
|
## License
|
|
375
113
|
|
|
376
114
|
MIT
|