ccqa 0.3.8 → 0.3.10

package/README.md CHANGED
@@ -2,11 +2,9 @@
 
  **Your Claude subscription already includes a QA engineer.**
 
- ccqa turns Claude Code into a browser test recorder.
+ ccqa turns Claude Code into a browser test recorder. Write a spec in Markdown, run `ccqa trace`, and Claude drives your app via [agent-browser](https://github.com/vercel-labs/agent-browser). Every action is recorded and compiled into a deterministic test script you can run in CI. No extra API key. Just `claude`.
 
- Write a spec in Markdown, run `ccqa trace`, and Claude drives your app via [agent-browser](https://github.com/vercel-labs/agent-browser) — a lightweight headless browser CLI that runs anywhere without a browser driver or Playwright setup. Because the agent controls the browser through a simple CLI interface, it can handle login flows, intermediate screens, and dynamic UI the same way a human would.
-
- Every action is recorded as structured data and compiled into a deterministic test script you can run in CI. No extra API key. Just `claude`.
+ [日本語版 README](./docs/README.ja.md)
 
  ## How it works
 
@@ -17,35 +15,19 @@ flowchart LR
  C --> D["ccqa run\n(deterministic replay)"]
  ```
 
- `trace` invokes Claude Code with your spec. Claude drives the browser step by step via [agent-browser](https://github.com/vercel-labs/agent-browser), recording every action as structured data. `generate` compiles that data into a vitest-compatible script. `run` replays it deterministically — no LLM involved.
+ `trace` invokes Claude Code with your spec. Claude drives the browser step by step, recording every action as structured data. `generate` compiles that data into a vitest-compatible script. `run` replays it deterministically — no LLM involved.
 
  ## Install
 
- Add ccqa as a dev dependency in your project:
-
- ```bash
- pnpm add -D ccqa vitest
- # or
- npm install -D ccqa vitest
- ```
-
- Then invoke the CLI via your package runner:
-
  ```bash
- pnpm exec ccqa trace tasks/create-and-complete
- # or
- npx ccqa trace tasks/create-and-complete
+ pnpm add -D ccqa vitest agent-browser
  ```
 
- ccqa requires Node.js **20+** at runtime. The peer dependency [agent-browser](https://github.com/vercel-labs/agent-browser) must also be installed:
+ Requires Node.js **20+**. [agent-browser](https://github.com/vercel-labs/agent-browser) is a peer dependency.
 
- ```bash
- pnpm add -D agent-browser
- ```
+ ## Quick start
 
- ## Usage
-
- **1. Write a spec**
+ **1. Write a spec** — by hand, or interactively with [`ccqa draft`](./docs/draft.md)
 
  ```markdown
  <!-- .ccqa/features/tasks/test-cases/create-and-complete/test-spec.md -->
@@ -63,213 +45,51 @@ baseUrl: http://localhost:3000
  ### Step 2: Create a new task
  - **Instruction**: Click "New Task", fill in the title "Fix login bug", set priority to High, save
  - **Expected**: Task appears in the task list with status "Open"
-
- ### Step 3: Mark the task as complete
- - **Instruction**: Open the task "Fix login bug", click "Mark as complete"
- - **Expected**: Task status changes to "Done", task moves to the completed section
  ```
 
- **2. Trace — Claude drives the browser and records every action**
+ **2. Trace** — Claude drives the browser and records every action
 
  ```bash
  ccqa trace tasks/create-and-complete
  ```
 
- ```
- ▶ trace tasks/create-and-complete
- spec Create a task and mark it complete
- url http://localhost:3000
- steps 3
-
- Running agent-browser session...
- ● step-01 Log in
- ● step-02 Create a new task
- ● step-03 Mark the task as complete
-
- trace .ccqa/features/tasks/test-cases/create-and-complete/actions.json
- actions 24
- status PASSED
- ```
-
- **3. Generate — convert recorded actions into a replayable test**
+ **3. Generate** — convert recorded actions into a replayable test
 
  ```bash
  ccqa generate tasks/create-and-complete
  ```
 
- **4. Run — replay deterministically, no LLM involved**
+ **4. Run** — replay deterministically, no LLM involved
 
  ```bash
  ccqa run tasks/create-and-complete
  ```
 
- ## Setup Specs — Reusable shared procedures
+ ## Features
 
- Setup specs let you define reusable procedures (login, data preparation, etc.) that run before your test steps. Define once, use across multiple test specs.
-
- ### 1. Write a setup spec
-
- ```markdown
- <!-- .ccqa/setups/login/setup-spec.md -->
- ---
- title: "Login"
- placeholders:
-   loginUrl:
-     dummy: "http://localhost:3000/login"
-     description: "Login page URL"
-   email:
-     dummy: "user@example.com"
-     description: "Email address"
-   password:
-     dummy: "secret"
-     description: "Password"
- ---
-
- ## Steps
-
- ### Step 1: Open login page
- - **Instruction**: Navigate to {{loginUrl}}
- - **Expected**: Login form is displayed
-
- ### Step 2: Enter credentials and log in
- - **Instruction**: Enter email {{email}} and password {{password}}, then submit
- - **Expected**: Login succeeds
- ```
-
- The `placeholders` section defines variables with `dummy` values. During `trace-setup`, the dummy values are used for actual browser operation. During `generate-setup`, they are reverse-replaced with `{{key}}` placeholders.
-
- ### 2. Trace the setup
-
- ```bash
- ccqa trace-setup login
- ```
-
- ### 3. Generate and validate the setup
-
- ```bash
- ccqa generate-setup login
- ```
-
- This generates `test.dummy.spec.ts` with dummy values, runs vitest to validate, and applies auto-fix. On success, it reverse-replaces dummy values with placeholders and saves `test.spec.ts`.
-
- If auto-fix fails, edit `test.dummy.spec.ts` manually and re-run:
-
- ```bash
- ccqa generate-setup login --from-dummy
- ```
-
- ### 4. Reference from test specs
-
- ```markdown
- ---
- title: Create a task
- baseUrl: http://localhost:3000
- setups:
-   - name: login
-     params:
-       loginUrl: "http://localhost:3000/login"
-       email: "admin@example.com"
-       password: "AdminPass123"
- ---
-
- ## Steps
- ### Step 1: Create a new task
- ...
- ```
-
- When you run `ccqa trace` or `ccqa generate`, the setup's test body is loaded, placeholders are replaced with `params` values, and it runs before your test steps — sharing the same browser session.
-
- ## What gets generated
-
- `ab()` is a thin wrapper around [agent-browser](https://github.com/vercel-labs/agent-browser) — a headless browser CLI. Each call spawns `agent-browser <command>` as a subprocess and throws if it exits non-zero. No browser driver setup, no async/await, no `.waitFor()`.
-
- ```typescript
- // .ccqa/features/tasks/test-cases/create-and-complete/test.spec.ts
- import { test } from "vitest";
- import { ab, abWait, abAssertUrl, abAssertTextVisible, abAssertEnabled } from "ccqa/test-helpers";
-
- process.env.AGENT_BROWSER_SESSION = `ccqa-run-${Date.now()}`;
-
- test("setup: login", () => {
-   ab("cookies", "clear");
-   ab("open", "http://localhost:3000/login");
-   ab("fill", "[placeholder='Email']", "admin@example.com");
-   ab("fill", "[type='password']", "AdminPass123");
-   ab("press", "Enter");
- }, 3 * 60 * 1000);
-
- test("Create a task", () => {
-   ab("open", "http://localhost:3000");
-
-   // Create a new task
-   ab("click", "[aria-label='New Task']");
-   ab("fill", "[placeholder='Task title']", "Fix login bug");
-   ab("select", "[aria-label='Priority']", "High");
-   ab("click", "[aria-label='Save']");
-   abAssertTextVisible("Fix login bug");
-   abAssertTextVisible("Open");
- }, 5 * 60 * 1000);
- ```
-
- Setup and test share the same `AGENT_BROWSER_SESSION` — login state carries over. Each run starts with `cookies clear` to ensure a clean session.
-
- ## Assertions
-
- During `trace`, Claude verifies each step with at least two independent signals and emits structured assertions. These become typed helper calls in the generated script:
-
- | Assert | What it checks |
- |--------|---------------|
- | `abAssertTextVisible(text)` | Text appears on page (waits up to 30s) |
- | `abAssertUrl(pattern)` | Current URL contains pattern |
- | `abAssertEnabled(selector)` | Button/input is enabled |
- | `abAssertDisabled(selector)` | Button/input is disabled |
- | `abAssertVisible(selector)` | Element is visible |
- | `abAssertNotVisible(selector)` | Element is hidden |
- | `abAssertChecked(selector)` | Checkbox is checked |
- | `abAssertUnchecked(selector)` | Checkbox is unchecked |
-
- Assertions are stability-aware: Claude skips timestamps, session IDs, and exact counts that vary between runs.
-
- ## Auto-fix
-
- If the generated script fails, `generate` invokes an LLM to diagnose the failure and propose a fix. The diagnosis is one of:
-
- - **TIMING_ISSUE** — insert or extend `sleep` so the page has time to settle.
- - **OVER_ASSERTION** — remove `abAssert*` lines that the spec doesn't actually require.
- - **SELECTOR_DRIFT** — replace a renamed selector with the new one. The diagnose LLM is allowed to `Grep` / `Read` your repository (read-only) to find the actual `aria-label` / `placeholder` / `data-testid` / i18n string in the app source, so renames in the UI code are caught even when the failure log only says "selector not visible".
- - **DATA_MISSING** / **UNKNOWN** — not auto-fixable; the loop bails and reports the diagnosis.
-
- Each diagnosis has a `confidence` score. By default high-confidence fixes are applied automatically; low-confidence fixes drop into an interactive `[a]pply / [s]kip / [m]anual / [q]uit` prompt.
-
- ```bash
- ccqa generate tasks/create-and-complete # default: interactive on low confidence
- ccqa generate tasks/create-and-complete --auto # CI: always auto-apply
- ccqa generate tasks/create-and-complete --no-interactive # CI: auto-apply on high confidence, give up otherwise
- ccqa generate tasks/create-and-complete --max-retries 5
- ```
-
- > **Note**: `generate` regenerates `test.spec.ts` from `actions.json` on every run. Manual edits to `test.spec.ts` are lost on the next `generate`. When an existing `test.spec.ts` is detected, `generate` always asks for `y/N` confirmation before overwriting (even with `--auto` / `--no-interactive`). To skip the prompt in CI, pass `--force`. To persist a fix, re-run `trace` so `actions.json` reflects the new flow.
+ | Feature | Docs |
+ |---|---|
+ | Write specs interactively with Claude | [Draft](./docs/draft.md) |
+ | Reuse login and other setup steps | [Setup Specs](./docs/setup-specs.md) |
+ | Assertion helper functions | [Assertions](./docs/assertions.md) |
+ | Auto-fix failing tests | [Auto-fix](./docs/auto-fix.md) |
+ | Detect spec/code drift in CI | [Drift](./docs/drift.md) |
 
  ## Commands
 
  ```
- ccqa trace <feature/spec> Record browser actions for a test spec
+ ccqa draft [feature/spec] Co-author a test spec with Claude
+ ccqa drift [feature/spec] Check spec ↔ codebase drift (CI-friendly)
+ ccqa trace <feature/spec> Record browser actions
  ccqa generate <feature/spec> Generate test script from recorded actions
- --auto Apply auto-fixes without confirmation (CI)
- --no-interactive Auto-apply only on high confidence; never prompt
- --force Overwrite an existing test.spec.ts without prompting
- --max-retries <n> Default: 3
  ccqa run [feature/spec] Execute generated test scripts
-
- ccqa trace-setup <name> Record browser actions for a setup spec
+ ccqa trace-setup <name> Record actions for a setup spec
  ccqa generate-setup <name> Generate and validate setup test script
- --from-dummy Resume from manually edited test.dummy.spec.ts
- --auto / --no-interactive Same semantics as `generate`
  ```
 
- All Claude-driven commands (`trace`, `trace-setup`, `generate`, `generate-setup`) accept `-m, --model <name>` to select the Claude model — pass an alias (`sonnet` | `opus` | `haiku`) or a full model ID (e.g. `claude-opus-4-7`). The flag overrides the `CCQA_MODEL` environment variable; when both are unset, the Claude Code CLI default is used. Authentication is handled by your local Claude Code login no `ANTHROPIC_API_KEY` is required.
+ All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus` | `haiku`, or a full model ID). The flag overrides `CCQA_MODEL`; when both are unset, the Claude Code CLI default is used. Interactive commands authenticate via your local Claude Code login; `ccqa drift` additionally honors `ANTHROPIC_API_KEY` for CI.
 
- `<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`. Pass the alias, not the full directory path.
+ `<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`.
 
  ## File structure
 
@@ -278,27 +98,16 @@ All Claude-driven commands (`trace`, `trace-setup`, `generate`, `generate-setup`
    setups/
      login/
        setup-spec.md # Setup definition with placeholders
-       test.spec.ts # Generated setup script (with {{placeholders}})
+       test.spec.ts # Generated setup script
    features/
      tasks/
        test-cases/
          create-and-complete/
-           test-spec.md # Test definition (references setups)
+           test-spec.md # Test definition
            actions.json # Recorded actions from trace
            test.spec.ts # Generated test script
  ```
 
- ## Why not write Playwright tests by hand?
-
- | | ccqa | Hand-written Playwright |
- |---|---|---|
- | Write selectors | Claude picks them from ARIA snapshots | You inspect the DOM |
- | Handle timing | Recorded wait commands, auto-fix sleep | `waitFor`, `expect().toBeVisible()` |
- | Assertions | Auto-generated from verified signals | Written manually |
- | Login / setup | Shared setup specs with placeholders | Custom fixtures per project |
- | Update after UI change | Re-run `trace` | Find and update every affected locator |
- | Runs in CI | Yes (deterministic replay, no LLM) | Yes |
-
  ## License
 
  MIT
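
Illustrative note, not part of the diff: the README text above states two resolution rules, the `<feature/spec>` alias expanding to `.ccqa/features/<feature>/test-cases/<spec>/`, and the `--model` flag taking precedence over `CCQA_MODEL`. A minimal TypeScript sketch of those two rules might look like the following. The function names `resolveSpecDir` and `resolveModel` are hypothetical and are not taken from ccqa's source; treating an empty string as "unset" is also an assumption.

```typescript
// Hypothetical helpers illustrating the rules described in the README diff.
// They are NOT ccqa's actual implementation.

/** Expand a 2-segment `<feature/spec>` alias into the on-disk directory. */
function resolveSpecDir(alias: string): string {
  const segments = alias.split("/");
  // The README says the alias has exactly two segments: feature and spec.
  if (segments.length !== 2 || segments.some((s) => s.length === 0)) {
    throw new Error(`expected <feature/spec>, got "${alias}"`);
  }
  const [feature, spec] = segments;
  return `.ccqa/features/${feature}/test-cases/${spec}/`;
}

/**
 * Pick the model name: the flag wins, then CCQA_MODEL.
 * Returns undefined when both are unset, meaning "use the CLI default".
 * Assumption: empty strings count as unset.
 */
function resolveModel(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flag || env["CCQA_MODEL"] || undefined;
}
```

Under these assumptions, `resolveSpecDir("tasks/create-and-complete")` yields the directory shown in the file-structure hunk, and `resolveModel("opus", { CCQA_MODEL: "haiku" })` picks `opus` because the flag overrides the environment variable.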