npm - ccqa - Versions diffs - 0.9.1 → 0.10.1 - Mend

ccqa 0.9.1 → 0.10.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6) hide show

package/README.md +98 -4
package/dist/bin/ccqa.mjs +496 -159
package/dist/package.json +1 -1
package/dist/runtime/test-helpers.d.mts +8 -1
package/dist/runtime/test-helpers.mjs +28 -3
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -69,6 +69,8 @@ ccqa run tasks/create-and-complete      # vitest replays test.spec.ts; no LLM
 ccqa run tasks/create-and-complete      # Claude drives the browser every time
 ```
+Live specs can start already-signed-in by pointing `statePath:` at a saved agent-browser state file (cookies + localStorage). Run an interactive login locally once, save the state with `agent-browser state save .ccqa/sessions/<name>.json`, then commit the path (not the file) — see [Pre-authenticated state](#pre-authenticated-state-statepath) below for the local bootstrap and the CI restore pattern.
 By default deterministic runs write step-boundary screenshots and metadata to `ccqa-report/evidence/<feature>/<spec>/` so a reviewer can confirm a passing spec actually reached the states its `expected` clauses describe. Disable with `--no-evidence`.
 In CI you can opt in to an HTML run report by passing `--report` — every failing spec gets a drift audit plus a root-cause call (TEST_DRIFT / SPEC_CHANGE / PRODUCT_BUG) using the branch's git diff as context, and the report lets a human grade those calls to measure their accuracy. Requires `ANTHROPIC_API_KEY` or a local Claude login for the analysis part. Opt out with `--no-failure-analysis` (which also implicitly skips the drift audit — the audit is rendered as evidence under the classification, so without the classification the cost has nowhere to land). Use `--no-drift-audit` to keep the classification but skip the audit. See [Run report](./docs/report.md).
@@ -84,6 +86,7 @@ ccqa run --changed --report                    # only specs whose relatedPaths t
 |---|---|
 | Write specs interactively with Claude | [Draft](./docs/draft.md) |
 | Reuse login and other shared step sequences | [Blocks](./docs/blocks.md) |
+| Drive `<input type="file">` without an OS picker | [File upload](./docs/file-upload.md) |
 | Assertion helper functions | [Assertions](./docs/assertions.md) |
 | Auto-fix failing tests | [Auto-fix](./docs/auto-fix.md) |
 | Detect spec/code drift in CI | [Drift](./docs/drift.md) |
@@ -98,16 +101,18 @@ ccqa init                          Scaffold .ccqa/prompts/{live,record}.{user,ag
 ccqa draft [feature/spec]          Co-author a test spec with Claude
 ccqa perspectives                  Inventory existing test coverage into .ccqa/perspectives.yaml
 ccqa record <feature/spec>         (deterministic specs only) Trace browser actions + generate test.spec.ts
-ccqa run [feature/spec]            Execute specs. Per spec, the spec.yaml `mode:` field selects deterministic
+ccqa run [feature/spec...]         Execute specs. Per spec, the spec.yaml `mode:` field selects deterministic
                                    (vitest replay) or live (Claude drives every time). One run can mix both;
-                                   `--report` writes one unified HTML.
+                                   `--report` writes one unified HTML. Pass multiple targets space-separated.
 ccqa drift [feature/spec]          Standalone spec ↔ codebase static audit (for PR checks)
 ```
 `ccqa run` flags:
 - `--report [dir]` — write a self-contained HTML run report (default dir: `ccqa-report/`)
-- `--changed` — restrict execution to specs whose `relatedPaths` intersect `git diff <base>...HEAD`. Mutually exclusive with an explicit spec id.
+- `--profile <name>` — load `.ccqa/profiles/<name>.env` into the environment before resolving spec `${VAR}` references, so one spec targets dev/stg/prd without per-environment copies. See [Profiles](#profiles---profile).
+- `--changed` — restrict execution to specs whose `relatedPaths` intersect `git diff <base>...HEAD`. Mutually exclusive with explicit spec targets.
+- `--concurrency <n>` — run up to N specs in parallel **within each mode** (deterministic specs run as one phase, live specs as the next; parallelism is within a phase, not across). Default `1` (sequential, identical to before). Above 1, each spec's output is buffered and flushed as a labelled block so parallel logs stay legible. Live specs each launch their own headed Chrome, so high values spawn many browser instances.
 - `--base <ref>` — base ref for the git diff (default: `$GITHUB_BASE_REF`, then `origin/main`)
 - `--no-failure-analysis` — skip the per-failure root-cause classification (also skips the drift audit, since the audit only shows under the classification)
 - `--no-drift-audit` — skip the spec ↔ code drift audit while keeping the classification
@@ -119,7 +124,7 @@ ccqa drift [feature/spec]          Standalone spec ↔ codebase static audit (fo
 All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus` | `haiku`, or a full model ID). The flag overrides `CCQA_MODEL`; when both are unset, the Claude Code CLI default is used. They also accept `--language <bcp47>` (e.g. `ja`, `en`) to set the language of human-readable output; the default `auto` follows the language of the spec/codebase. `--cwd <path>` works on `record` / `run` / `drift` so you can target a subpackage inside a monorepo from the repo root. Interactive commands authenticate via your local Claude Code login; commands that talk to Claude in CI (`ccqa run --report`, `ccqa drift`) additionally honor `ANTHROPIC_API_KEY`.
-`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`.
+`<feature/spec>` is a 2-segment alias for the on-disk path `.ccqa/features/<feature>/test-cases/<spec>/`. `ccqa run` accepts several targets space-separated (each a `<feature>/<spec>`, a bare `<feature>` for all its specs, or omitted for everything); duplicates are de-duped and `--changed` cannot be combined with explicit targets.
 ## File structure
@@ -127,6 +132,9 @@ All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus`
 .ccqa/
   perspectives.yaml              # Inventory of existing coverage (machine-readable, canonical)
   perspectives.md                # Category index, regenerated from the YAML
+  profiles/                      # `--profile <name>` env files
+    stg.env                      # URLs + credential refs; commit if it uses secret-manager refs, gitignore if it holds plaintext secrets
+    prd.env
   prompts/                       # Run `ccqa init` to scaffold these
     record.user.md               # Human-maintained guidance appended to `ccqa record` (trace phase)
     record.agent.md              # Auto-updated by `ccqa record --update-agent-prompt`
@@ -155,6 +163,26 @@ All Claude-driven commands accept `-m, --model <name>` (alias `sonnet` | `opus`
 Add `.ccqa/features/*/test-cases/*/runs/` to `.gitignore` — these are per-run artefacts that should not be committed. Likewise `ccqa-report*/`.
+## Profiles (`--profile`)
+Keep environment-specific values out of specs as `${VAR}` references and supply them per environment from a **profile** — a `.env` under `.ccqa/profiles/<name>.env`. `ccqa run`/`record --profile <name>` merges it into the environment before resolving `${VAR}`, so one spec runs anywhere.
+```bash
+# .ccqa/profiles/stg.env
+APP_BASE_URL=https://<your-app-host>
+TEST_USER_EMAIL=<stg-test-account>
+TEST_USER_PASSWORD=...
+```
+```bash
+ccqa run auth/login --profile stg    # same spec, stg values
+```
+- Name is free-form (`stg`/`prd` are conventions); a path separator / `..` / leading dot is rejected, and a missing profile exits 2. Only the name is logged, never values.
+- Format is a small `.env` subset (`KEY=value`, `#` comments, `export`, quotes). Profile values **override** the inherited environment.
+- Without `--profile`, ccqa auto-loads `<cwd>/.env` if present (like dotenv); with neither, `${VAR}` resolves against the existing `process.env` (e.g. `direnv`).
+**Secrets:** gitignore any profile that holds plaintext secrets. ccqa only parses `.env` files — it doesn't resolve secret-manager references — so to keep secrets off disk, drop `--profile` and run ccqa under your secret manager instead (e.g. `op run --env-file=.ccqa/profiles/stg.env -- ccqa run ...`), which injects the resolved values into `process.env` for ccqa to read.
 ## Live specs (`mode: live`)
 For specs declared `mode: live` in their spec.yaml, `ccqa run` skips codegen entirely: Claude executes each spec step against `agent-browser` directly, judges whether the step's `expected` outcome holds, and saves a PNG screenshot before and after every step. Use this mode when:
@@ -179,6 +207,72 @@ ccqa run --retry 2 tasks/create-and-complete
 Constraints on selectors / `agent-browser` subcommands that apply during `ccqa record` (no `eval`, no `@ref`, no bare-tag positional `find`, no chained agent-browser calls) are **relaxed** for live specs — Claude can use any subcommand and any selector style because there is no replay contract to honour.
+### Pre-authenticated state (`statePath:`)
+By default each `ccqa run` of a live spec spins up a fresh `agent-browser` session and starts signed-out. That keeps runs hermetic but forces every device-trust gate (Slack "we don't recognize this browser", Google's unfamiliar-device prompt, MFA challenges, …) to fire on every run.
+To skip them, save an authenticated browser state to a JSON file once locally and point the spec at it:
+```yaml
+title: Slack App Home — non-admin access denied
+mode: live
+statePath: .ccqa/sessions/slack-stg.json   # cookies + localStorage to restore
+steps:
+  - ...
+```
+ccqa resolves the path against the project root and passes `--state <path>` to every `agent-browser` invocation in the run (including ccqa's own screenshot calls). The file is **read-only** — `--state` loads it but never writes back to it. Re-running locally or in CI does not mutate it.
+Bootstrap once locally:
+```bash
+# 1. Log in interactively in a headed browser.
+agent-browser --headed open https://app.slack.com
+# …complete login + device-trust prompts by hand…
+# 2. Snapshot cookies + localStorage to the path the spec references.
+mkdir -p .ccqa/sessions
+agent-browser state save .ccqa/sessions/slack-stg.json
+agent-browser close
+# 3. ccqa run reuses the saved state — no login prompt.
+ccqa run slack/app-home-non-admin-access-denied
+```
+Add `.ccqa/sessions/` to `.gitignore` — these files contain live auth cookies and must never be committed.
+#### CI: bring the state file with you
+`statePath:` lives entirely inside `.ccqa/` and never touches `~/`. CI re-uses the state by writing the file into the same path the spec already references:
+```bash
+# Locally, after the interactive bootstrap above:
+base64 -i .ccqa/sessions/slack-stg.json | pbcopy
+# paste into your CI secret store as CCQA_SLACK_STG_STATE_B64
+```
+```yaml
+# .github/workflows/ccqa.yml (sketch)
+- name: Restore agent-browser state
+  env:
+    CCQA_SLACK_STG_STATE_B64: ${{ secrets.CCQA_SLACK_STG_STATE_B64 }}
+  run: |
+    mkdir -p .ccqa/sessions
+    printf '%s' "$CCQA_SLACK_STG_STATE_B64" | base64 -d \
+      > .ccqa/sessions/slack-stg.json
+- name: Run live specs
+  env:
+    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+  run: pnpm ccqa run --report
+```
+Caveats:
+- **Expiry.** Whatever the upstream service's "remember this device" window is (Slack ≈ 30 days, others vary), the cookies in the state file eventually expire and CI starts failing on the device-trust gate again. Re-bootstrap locally and rotate the secret.
+- **Treat the file as a credential.** It contains live auth cookies. Store it in your CI secret manager (GitHub Actions encrypted secrets, Vault, …) and never commit it.
+- **Deterministic specs ignore `statePath:`.** Today it only affects `mode: live`; vitest-replayed specs always run isolated.
 ### Per-project guidance (`.ccqa/prompts/live.user.md` + `live.agent.md`)
 ccqa's live-mode system prompt is deliberately product-agnostic. Anything specific to **your** project — staging URLs, login flow quirks, rich-editor types, common access-denied wording — belongs in two sibling files (run `ccqa init` to scaffold both):

package/dist/bin/ccqa.mjs CHANGED Viewed

@@ -6,12 +6,14 @@ import { accessSync, existsSync, readFileSync, statSync } from "node:fs";
 import { fileURLToPath } from "node:url";
 import { access, mkdir, mkdtemp, readFile, readdir, rm, stat, writeFile } from "node:fs/promises";
 import { homedir, tmpdir } from "node:os";
-import { delimiter, dirname, join, posix, relative, resolve } from "node:path";
+import { delimiter, dirname, isAbsolute, join, posix, relative, resolve } from "node:path";
 import { parse, stringify } from "yaml";
 import { ZodError, z } from "zod";
 import { execFile, spawn, spawnSync } from "node:child_process";
 import { query } from "@anthropic-ai/claude-agent-sdk";
+import { AsyncLocalStorage } from "node:async_hooks";
 import { promisify } from "node:util";
+import { randomUUID } from "node:crypto";
 import { createInterface } from "node:readline";
 import { createInterface as createInterface$1 } from "node:readline/promises";
 //#region src/runtime/env-vars.ts
@@ -139,6 +141,7 @@ const TestSpecSchema = z.object({
 	title: z.string().min(1),
 	relatedPaths: z.array(z.string().min(1)).optional(),
 	mode: SpecModeSchema.optional(),
+	statePath: z.string().min(1).optional(),
 	steps: z.array(StepSchema).min(1)
 }).strict();
 /** Default mode when `mode:` is absent. */
@@ -757,6 +760,55 @@ function waitExit(child) {
 	});
 }
 //#endregion
+//#region src/runtime/live-artifacts.ts
+/**
+* Build a sortable run id from the current wall-clock time. ISO8601 with
+* `:` / `.` replaced so it's filename-safe. Caller is expected to mkdir the
+* directory once and pass `runDir = <baseDir>/<runId>` to the path helpers
+* below.
+*/
+function buildRunId() {
+	return (/* @__PURE__ */ new Date()).toISOString().replace(/[:.]/g, "-");
+}
+/**
+* Per-step artifact paths under a run directory. `<runDir>/steps/<stepId>.*`.
+* Three files per step:
+*   - <stepId>.before.png : screenshot taken BEFORE Claude executes the step.
+*   - <stepId>.after.png  : screenshot taken AFTER  Claude executes the step.
+*   - <stepId>.log.txt    : full assistant transcript for the step (judgement
+*                           reasoning, any STEP_RESULT lines, raw tool output
+*                           summaries the model chose to keep).
+*/
+function stepArtifactPaths(runDir, stepId) {
+	const dir = join(runDir, "steps");
+	return {
+		beforePng: join(dir, `${stepId}.before.png`),
+		afterPng: join(dir, `${stepId}.after.png`),
+		logTxt: join(dir, `${stepId}.log.txt`)
+	};
+}
+//#endregion
+//#region src/runtime/pool.ts
+/**
+* Run each item through `fn` with at most `concurrency` running at once.
+* Results preserve input order. A throwing `fn` rejects the whole pool
+* (callers that want per-item isolation should catch inside `fn`).
+*/
+async function runPool(items, concurrency, fn) {
+	const results = new Array(items.length);
+	let cursor = 0;
+	const worker = async () => {
+		while (true) {
+			const idx = cursor++;
+			if (idx >= items.length) return;
+			results[idx] = await fn(items[idx], idx);
+		}
+	};
+	const n = Math.max(1, Math.min(concurrency, items.length));
+	await Promise.all(Array.from({ length: n }, () => worker()));
+	return results;
+}
+//#endregion
 //#region src/claude/extract-json.ts
 /**
 * Pulls a JSON object out of a Claude completion. Accepts either a fenced
@@ -779,26 +831,70 @@ const STEP_ICONS = {
 	STEP_SKIPPED: "⊘",
 	RUN_COMPLETED: "■"
 };
+/**
+* When a `withBuffer` scope is active, every log line (stdout and stderr) is
+* appended to its buffer instead of being written immediately. Parallel spec
+* runs use this so each spec's narration — including logs emitted deep inside
+* the live executor — flushes as one contiguous block, not interleaved.
+*/
+const bufferStore = new AsyncLocalStorage();
+/** True while inside a `withBuffer` scope: progress lines avoid TTY cursor tricks. */
+function isBuffered() {
+	return bufferStore.getStore() !== void 0;
+}
+function emit(text, sink = process.stdout) {
+	const store = bufferStore.getStore();
+	if (store) {
+		store.out.push(text);
+		return;
+	}
+	sink.write(text);
+}
+/**
+* Write raw text to the active `withBuffer` scope, or straight to stdout when
+* none is active. Lets a runner redirect sub-process output (e.g. a child's
+* stdout) into the same buffer as its `log.*` lines so they flush together.
+*/
+function emitRaw(text) {
+	emit(text);
+}
+/**
+* Run `fn` with all its log output captured into a buffer, then flush the
+* buffer in one shot under `label`. Used by parallel runners to keep each
+* spec's output legible. Output is flushed even when `fn` throws.
+*
+* When `buffered` is false, `fn` runs with no buffer so its output streams
+* live — this is the sequential (concurrency 1) path, unchanged from before.
+*/
+async function withBuffer(label, buffered, fn) {
+	if (!buffered) return fn();
+	const store = { out: [] };
+	try {
+		return await bufferStore.run(store, fn);
+	} finally {
+		process.stdout.write(`\n──── ${label} ────\n${store.out.join("")}`);
+	}
+}
 function header(command, target) {
-	process.stdout.write(`\nccqa ${command}${target ? ` ${target}` : ""}\n\n`);
+	emit(`\nccqa ${command}${target ? ` ${target}` : ""}\n\n`);
 }
 function write(scope, message, sink = process.stdout) {
-	sink.write(`[${scope}] ${message}\n`);
+	emit(`[${scope}] ${message}\n`, sink);
 }
 function meta(key, value) {
 	write("meta", `${key}: ${value}`);
 }
 function blank() {
-	process.stdout.write("\n");
+	emit("\n");
 }
 function info(message) {
 	write("info", message);
 }
 function step(type, stepId, detail) {
-	process.stdout.write(`  ${STEP_ICONS[type]} [${stepId}] ${detail}\n`);
+	emit(`  ${STEP_ICONS[type]} [${stepId}] ${detail}\n`);
 }
 function bash(command) {
-	process.stdout.write(`  $ ${command.slice(0, 120)}\n`);
+	emit(`  $ ${command.slice(0, 120)}\n`);
 }
 function error(message) {
 	write("error", message, process.stderr);
@@ -807,7 +903,7 @@ function warn(message) {
 	write("warn", message, process.stderr);
 }
 function hint(message) {
-	process.stdout.write("\n");
+	emit("\n");
 	write("hint", message);
 }
 function fix(message) {
@@ -832,17 +928,17 @@ const PROGRESS_NONTTY_STRIDE = 5;
 let lastProgressNonTtyEmit = -1;
 function progress(current, total, label) {
 	const text = `[info] ${current + 1}/${total} ${label}`;
-	if (process.stdout.isTTY) {
+	if (process.stdout.isTTY && !isBuffered()) {
 		process.stdout.write(`\r${text}\x1b[K`);
 		return;
 	}
 	if (current === 0 || current - lastProgressNonTtyEmit >= PROGRESS_NONTTY_STRIDE) {
-		process.stdout.write(`${text}\n`);
+		emit(`${text}\n`);
 		lastProgressNonTtyEmit = current;
 	}
 }
 function progressEnd() {
-	if (process.stdout.isTTY) process.stdout.write(`\r\x1b[K`);
+	if (process.stdout.isTTY && !isBuffered()) process.stdout.write(`\r\x1b[K`);
 	lastProgressNonTtyEmit = -1;
 }
 /**
@@ -1363,6 +1459,12 @@ function extractAbActionFromBashCommand(cmd) {
 		case "type":
 		case "select": return `AB_ACTION|${subCmd}|${args[0] ?? ""}|${args[1] ?? ""}|${args[2] ?? ""}`;
 		case "drag": return `AB_ACTION|drag|${args[0] ?? ""}|${args[1] ?? ""}|${args[2] ?? ""}`;
+		case "upload": {
+			const sel = args[0] ?? "";
+			const files = args.slice(1);
+			if (!sel || files.length === 0) return null;
+			return `AB_ACTION|upload|${sel}|${files.join("|")}`;
+		}
 		case "snapshot": return null;
 		case "find": return extractFindAbAction(args);
 		default: return null;
@@ -1700,25 +1802,15 @@ const DEFAULT_CONCURRENCY$1 = 3;
 */
 async function analyzeDrift(input) {
 	const { targets, cwd, blocks, concurrency = DEFAULT_CONCURRENCY$1, model, language, onSpecStart } = input;
-	const results = new Array(targets.length);
-	let cursor = 0;
-	const worker = async () => {
-		while (true) {
-			const idx = cursor++;
-			if (idx >= targets.length) return;
-			const target = targets[idx];
-			onSpecStart?.(target);
-			results[idx] = await checkSpec(target, {
-				cwd,
-				blocks,
-				model,
-				language
-			});
-		}
-	};
-	const pool = Array.from({ length: Math.min(concurrency, targets.length) }, () => worker());
-	await Promise.all(pool);
-	return results;
+	return runPool(targets, concurrency, async (target) => {
+		onSpecStart?.(target);
+		return checkSpec(target, {
+			cwd,
+			blocks,
+			model,
+			language
+		});
+	});
 }
 async function checkSpec(target, opts) {
 	const { featureName, specName } = target;
@@ -4095,6 +4187,123 @@ const CLIENT_JS = `
 })();
 `;
 //#endregion
+//#region src/runtime/profile-env.ts
+/**
+* Profile env (Issue #37). A profile is a named `.env` under
+* `.ccqa/profiles/<name>.env`; its contents merge into `process.env` before any
+* spec work, so one spec targets dev/stg/prd without per-environment copies.
+* Spec `${VAR}` references all resolve against `process.env` downstream.
+*
+* The `.env` parser is a small hand-rolled subset (no dotenv dependency).
+*/
+/**
+* Parse a `.env` body into a `name → value` map. Subset: blank / `#` lines
+* skipped, optional leading `export`, split on the first `=`, surrounding
+* quotes stripped, inline `# comment` dropped. No multi-line / interpolation.
+*/
+function parseDotenv(content) {
+	const out = {};
+	for (const rawLine of content.split(/\r?\n/)) {
+		const line = rawLine.trim();
+		if (line === "" || line.startsWith("#")) continue;
+		const withoutExport = line.replace(/^export\s+/, "");
+		const eq = withoutExport.indexOf("=");
+		if (eq === -1) continue;
+		const key = withoutExport.slice(0, eq).trim();
+		if (key === "") continue;
+		out[key] = parseValue(withoutExport.slice(eq + 1).trim());
+	}
+	return out;
+}
+function parseValue(raw) {
+	const quote = raw[0];
+	if (quote === "\"" || quote === "'") {
+		const close = raw.indexOf(quote, 1);
+		if (close !== -1 && /^\s*(#.*)?$/.test(raw.slice(close + 1))) return raw.slice(1, close);
+	}
+	const hash = raw.search(/\s#/);
+	return hash === -1 ? raw : raw.slice(0, hash).trimEnd();
+}
+var ProfileNotFoundError = class extends Error {
+	profile;
+	path;
+	constructor(profile, path) {
+		super(`profile "${profile}" not found: ${path}`);
+		this.name = "ProfileNotFoundError";
+		this.profile = profile;
+		this.path = path;
+	}
+};
+var InvalidProfileNameError = class extends Error {
+	profile;
+	constructor(profile) {
+		super(`invalid profile name "${profile}": expected a bare name like "stg" (no path separators, no leading dot)`);
+		this.name = "InvalidProfileNameError";
+		this.profile = profile;
+	}
+};
+/**
+* A profile name must be a single, non-dot-leading path segment, so
+* `--profile <name>` can't read a file outside the profiles dir (e.g.
+* `--profile ../../etc/hosts`). Rejecting separators and a leading dot already
+* blocks `..` traversal, so an in-name `..` (like `v1..2`) stays allowed.
+*/
+function assertValidProfileName(profile) {
+	if (profile === "" || profile.includes("/") || profile.includes("\\") || profile.startsWith(".")) throw new InvalidProfileNameError(profile);
+}
+/** Absolute path of the `.env` file backing `<profile>` under `<cwd>/.ccqa/`. */
+function profilePath(profile, cwd) {
+	assertValidProfileName(profile);
+	return join(cwd, ".ccqa", "profiles", `${profile}.env`);
+}
+/** Read + parse a `.env`, or `null` if absent. Other read errors propagate. */
+async function readDotenv(path) {
+	let content;
+	try {
+		content = await readFile(path, "utf8");
+	} catch (err) {
+		if (err.code === "ENOENT") return null;
+		throw err;
+	}
+	return parseDotenv(content);
+}
+/**
+* Load `.ccqa/profiles/<profile>.env`. A missing file throws — a typo must fail
+* loudly, not silently resolve every credential to empty.
+*/
+async function loadProfileEnv(profile, cwd) {
+	const path = profilePath(profile, cwd);
+	const vars = await readDotenv(path);
+	if (vars === null) throw new ProfileNotFoundError(profile, path);
+	return vars;
+}
+/** Absolute path of the default `.env` ccqa loads when `--profile` is absent. */
+function defaultEnvPath(cwd) {
+	return join(cwd, ".env");
+}
+/**
+* Load `<cwd>/.env`, the default when no `--profile` is given. A missing `.env`
+* is fine (returns `null`) — the run falls back to the existing `process.env`.
+*/
+async function loadDefaultEnv(cwd) {
+	return readDotenv(defaultEnvPath(cwd));
+}
+/**
+* Merge vars into `process.env`. With `override` (the default), the profile
+* wins over inherited values. Returns the applied names — never values, so
+* callers log names only and secrets stay out of the log.
+*/
+function applyProfileEnv(vars, opts = {}) {
+	const override = opts.override ?? true;
+	const applied = [];
+	for (const [name, value] of Object.entries(vars)) {
+		if (!override && process.env[name] !== void 0) continue;
+		process.env[name] = value;
+		applied.push(name);
+	}
+	return applied;
+}
+//#endregion
 //#region src/cli/options.ts
 /**
 * Shared `--language` flag. Every Claude-driven command writes some
@@ -4105,6 +4314,53 @@ const CLIENT_JS = `
 function addLanguageOption(command) {
 	return command.option("--language <bcp47>", "Language for human-readable output (e.g. 'en', 'ja'). Default 'auto' follows the language of the spec/codebase.", DEFAULT_LANGUAGE);
 }
+/**
+* Shared `--profile <name>` flag for the browser-driving commands (`run`,
+* `record`), registered identically so help text and behaviour don't drift.
+*/
+function addProfileOption(command) {
+	return command.option("--profile <name>", "Load .ccqa/profiles/<name>.env into the environment before resolving spec ${VAR} references (URLs, credentials), so one spec can target dev/stg/prd without per-environment copies. Profile values override the inherited environment.");
+}
+/**
+* Merge the environment for a `run` / `record` invocation into `process.env`
+* before any spec work. With `--profile <name>`, load that profile (missing /
+* invalid → exit 2). Without it, auto-load `<cwd>/.env` if present (a missing
+* `.env` is fine). Checking `!== undefined` rejects `--profile ""` rather than
+* skipping it.
+*/
+async function applyProfileFromOption(profile, cwd) {
+	if (profile !== void 0) await applyNamedProfile(profile, cwd);
+	else await applyDefaultEnv(cwd);
+}
+/** "1 var" / "2 vars" — the count summary shared by both load paths' meta line. */
+function varCount(n) {
+	return `${n} var${n === 1 ? "" : "s"}`;
+}
+async function applyNamedProfile(profile, cwd) {
+	try {
+		const applied = applyProfileEnv(await loadProfileEnv(profile, cwd));
+		meta("profile", `${profile} (${varCount(applied.length)})`);
+		if (applied.length === 0) warn(`profile "${profile}" defined no variables — spec $\{VAR} references will resolve to empty`);
+	} catch (err) {
+		if (err instanceof ProfileNotFoundError) {
+			error(err.message);
+			hint(`create ${err.path} with the environment's $\{VAR} values`);
+		} else if (err instanceof InvalidProfileNameError) error(err.message);
+		else error(`failed to load profile "${profile}": ${err instanceof Error ? err.message : String(err)}`);
+		process.exit(2);
+	}
+}
+async function applyDefaultEnv(cwd) {
+	let vars;
+	try {
+		vars = await loadDefaultEnv(cwd);
+	} catch (err) {
+		error(`failed to load ${defaultEnvPath(cwd)}: ${err instanceof Error ? err.message : String(err)}`);
+		process.exit(2);
+	}
+	if (vars === null) return;
+	meta("env", `.env (${varCount(applyProfileEnv(vars, { override: false }).length)})`);
+}
 //#endregion
 //#region src/cli/resolve-cwd.ts
 /**
@@ -4116,7 +4372,7 @@ function addLanguageOption(command) {
 *
 * It's mostly useful in monorepos where you want to invoke ccqa from the
 * repo root but target a subpackage (e.g.
-* `ccqa run --cwd js/apps/knowledge-webapp`).
+* `ccqa run --cwd apps/web-app`).
 *
 * Falls back to `process.cwd()` when the option is not given.
 */
@@ -4328,36 +4584,14 @@ function oneLine$1(s) {
 	return s.replace(/\s+/g, " ").trim();
 }
 //#endregion
-//#region src/runtime/live-artifacts.ts
-/**
-* Build a sortable run id from the current wall-clock time. ISO8601 with
-* `:` / `.` replaced so it's filename-safe. Caller is expected to mkdir the
-* directory once and pass `runDir = <baseDir>/<runId>` to the path helpers
-* below.
-*/
-function buildRunId() {
-	return (/* @__PURE__ */ new Date()).toISOString().replace(/[:.]/g, "-");
-}
-/**
-* Per-step artifact paths under a run directory. `<runDir>/steps/<stepId>.*`.
-* Three files per step:
-*   - <stepId>.before.png : screenshot taken BEFORE Claude executes the step.
-*   - <stepId>.after.png  : screenshot taken AFTER  Claude executes the step.
-*   - <stepId>.log.txt    : full assistant transcript for the step (judgement
-*                           reasoning, any STEP_RESULT lines, raw tool output
-*                           summaries the model chose to keep).
-*/
-function stepArtifactPaths(runDir, stepId) {
-	const dir = join(runDir, "steps");
-	return {
-		beforePng: join(dir, `${stepId}.before.png`),
-		afterPng: join(dir, `${stepId}.after.png`),
-		logTxt: join(dir, `${stepId}.log.txt`)
-	};
-}
-//#endregion
 //#region src/claude/agent-browser-invoke.ts
 function agentBrowserInvokeBase(input) {
+	const env = {
+		AGENT_BROWSER_SESSION: input.sessionName,
+		CCQA_RUN_ID: input.runId,
+		PATH: pathWithAgentBrowserShim(process.env["PATH"])
+	};
+	if (input.statePath) env["CCQA_AB_STATE"] = input.statePath;
 	return {
 		allowedTools: [
 			"Bash(*)",
@@ -4365,17 +4599,19 @@ function agentBrowserInvokeBase(input) {
 			"Grep",
 			"Glob"
 		],
-		env: {
-			AGENT_BROWSER_SESSION: input.sessionName,
-			CCQA_RUN_ID: input.runId,
-			PATH: pathWithAgentBrowserShim(process.env["PATH"])
-		}
+		env
 	};
 }
 //#endregion
 //#region src/prompts/live.ts
+/**
+* Unique agent-browser session name. The runId is millisecond-precision wall
+* clock, so under `--concurrency > 1` two specs can start in the same
+* millisecond and collide; a random suffix guarantees each spec gets its own
+* Chrome session and state never bleeds across parallel runs.
+*/
 function generateLiveSessionName() {
-	return `ccqa-live-${buildRunId()}`;
+	return `ccqa-live-${buildRunId()}-${randomUUID().slice(0, 8)}`;
 }
 /**
 * Static prefix of the `ccqa run` (live spec) system prompt. Built once per
@@ -4405,19 +4641,20 @@ function buildLiveSystemPromptPrefix(input) {
 	const stepsText = input.allSteps.map((s) => `### ${s.id} [${s.source}]
 - **Instruction**: ${s.instruction}
 - **Expected**: ${s.expected}`).join("\n\n");
+	const stateLine = input.statePath ? `\n\nA pre-recorded auth-state file is provided at \`${input.statePath}\` (also in the env var \`CCQA_AB_STATE\`). **Always also pass \`--state "$CCQA_AB_STATE"\`** to every \`agent-browser\` command — this restores cookies and localStorage from a prior interactive login, so the user is already signed in to the application under test from step 1. The file is loaded read-only; do not run \`agent-browser state save\`.` : "";
 	return `You are a QA execution agent. You are executing ONE step of a browser-based end-to-end test and judging whether the step's expected outcome was achieved. You are NOT recording a replayable test script — be flexible, explore the DOM as needed, and make a clear pass / fail call at the end.
 ## Session
 SESSION NAME: \`${input.sessionName}\`
-Always pass \`--session ${input.sessionName}\` to every \`agent-browser\` command. The session persists across steps within this test run, so the browser state from previous steps is already loaded when this turn starts.
+Always pass \`--session ${input.sessionName}\` to every \`agent-browser\` command. The session persists across steps within this test run, so the browser state from previous steps is already loaded when this turn starts.${stateLine}
 ## Tools
 You have:
-- **Bash** to run \`agent-browser\` (the full surface — \`open\`, \`snapshot\`, \`click\`, \`fill\`, \`press\`, \`wait\`, \`find\`, \`screenshot\`, \`eval\`, \`js\`, \`get\`, etc.). Any selector form is allowed: \`@ref\` (e.g. \`@e14\`), CSS selectors, \`text=...\`, \`[aria-label='...']\`, \`[data-testid='...']\`, bare tags inside \`find first/last/nth\` — whatever works for this single run. There is no replay contract to honour.
+- **Bash** to run \`agent-browser\` (the full surface — \`open\`, \`snapshot\`, \`click\`, \`fill\`, \`upload\`, \`press\`, \`wait\`, \`find\`, \`screenshot\`, \`eval\`, \`js\`, \`get\`, etc.). Any selector form is allowed: \`@ref\` (e.g. \`@e14\`), CSS selectors, \`text=...\`, \`[aria-label='...']\`, \`[data-testid='...']\`, bare tags inside \`find first/last/nth\` — whatever works for this single run. There is no replay contract to honour. For file inputs (\`<input type="file">\`) do NOT \`click\` the input — use \`agent-browser upload "<selector>" <path>\` so no OS file-picker dialog opens. Fixtures conventionally live under \`.ccqa/fixtures/\`; reference them via \`\${CCQA_FIXTURES_DIR}/<name>\`.
 - **Read / Grep / Glob** for inspecting the application source code when you need to find a selector or understand routing. Read-only — do not modify source files.
 ## Test Specification
@@ -4525,11 +4762,9 @@ function findLastStepResult(text) {
 * artifact, not a reason to abort the test step.
 */
 function takeScreenshot(sessionName, outPath, options) {
-	const args = [
-		"--session",
-		sessionName,
-		"screenshot"
-	];
+	const args = ["--session", sessionName];
+	if (options?.statePath) args.push("--state", options.statePath);
+	args.push("screenshot");
 	if (options?.fullPage) args.push("--full");
 	args.push(outPath);
 	const res = spawnAB(args);
@@ -4562,16 +4797,19 @@ async function runLiveExecutor(input) {
 	const startedAt = /* @__PURE__ */ new Date();
 	const stepResults = [];
 	let overallFailed = false;
+	const statePath = input.statePath ?? null;
 	const promptPrefix = buildLiveSystemPromptPrefix({
 		title: input.spec.title,
 		allSteps: input.steps,
-		sessionName: input.sessionName
+		sessionName: input.sessionName,
+		statePath
 	});
 	const suffixBlock = input.systemPromptSuffix ? `\n## Project-specific guidance\n\n${input.systemPromptSuffix}\n` : "";
 	const langDirective = languageDirective(input.language);
 	const invokeBase = agentBrowserInvokeBase({
 		sessionName: input.sessionName,
-		runId: input.runId
+		runId: input.runId,
+		statePath
 	});
 	const retries = Math.max(0, input.retries ?? 0);
 	for (let i = 0; i < input.steps.length; i++) {
@@ -4616,7 +4854,7 @@ async function runLiveExecutor(input) {
 		}
 	}
 	async function executeStepAttempt(step, paths, systemPrompt, userPrompt) {
-		const before = takeScreenshot(input.sessionName, paths.beforePng);
+		const before = takeScreenshot(input.sessionName, paths.beforePng, { statePath });
 		if (!before.ok) warn(`screenshot (before, ${step.id}) failed: ${before.error}`);
 		const transcriptParts = [];
 		let isError = false;
@@ -4648,7 +4886,10 @@ async function runLiveExecutor(input) {
 			transcriptParts.push(`[ccqa] invokeClaudeStreaming threw: ${err instanceof Error ? err.message : String(err)}`);
 		}
 		const transcript = transcriptParts.join("\n");
-		const after = takeScreenshot(input.sessionName, paths.afterPng, { fullPage: true });
+		const after = takeScreenshot(input.sessionName, paths.afterPng, {
+			fullPage: true,
+			statePath
+		});
 		if (!after.ok) warn(`screenshot (after, ${step.id}) failed: ${after.error}`);
 		await writeFile(paths.logTxt, transcript || "(no assistant text captured)", "utf-8");
 		const { status, reasoning } = judgeStepOutcome({
@@ -4840,24 +5081,24 @@ async function runLiveSpecs(specs, opts) {
 	await preflightAgentBrowserCommand();
 	meta("live-specs", specs.length);
 	const userPromptBundle = await loadLivePromptBundle(cwd);
-	if (userPromptBundle !== null) meta("user-prompt", userPromptBundle.loaded.join(" + "));
+	if (userPromptBundle !== null) meta("prompt", userPromptBundle.loaded.join(" + "));
 	const userPromptSuffix = userPromptBundle?.text ?? null;
-	const runs = [];
-	for (let i = 0; i < specs.length; i++) {
-		const { featureName, specName } = specs[i];
-		const label = `${featureName}/${specName}`;
-		if (specs.length > 1) {
-			blank();
-			info(`[${i + 1}/${specs.length}] ${label}`);
-		}
-		runs.push(await runOneSpec({
-			featureName,
-			specName,
-			opts,
-			userPromptSuffix,
-			cwd
-		}));
-	}
+	const concurrency = Math.max(1, opts.concurrency ?? 1);
+	const runs = await runPool(specs, concurrency, (spec, i) => {
+		const label = `${spec.featureName}/${spec.specName}`;
+		return withBuffer(label, concurrency > 1, () => {
+			if (concurrency === 1 && specs.length > 1) {
+				blank();
+				info(`[${i + 1}/${specs.length}] ${label}`);
+			}
+			return runOneSpec({
+				...spec,
+				opts,
+				userPromptSuffix,
+				cwd
+			});
+		});
+	});
 	const failedCount = runs.filter((r) => r.kind === "error" || r.kind === "run" && r.result.status === "failed").length;
 	blank();
 	meta("live-summary", `${runs.length - failedCount} passed / ${failedCount} failed`);
@@ -4956,6 +5197,23 @@ async function runOneSpec(args) {
 	if (includes.length > 0) meta("blocks", includes.join(", "));
 	const sessionName = generateLiveSessionName();
 	meta("session", sessionName);
+	let statePath = null;
+	if (spec.statePath) {
+		statePath = isAbsolute(spec.statePath) ? spec.statePath : resolve(cwd, spec.statePath);
+		try {
+			await access(statePath);
+		} catch {
+			const msg = `spec.statePath points to a missing file: ${statePath}`;
+			error(msg);
+			return {
+				kind: "error",
+				featureName,
+				specName,
+				error: msg
+			};
+		}
+		meta("state", statePath);
+	}
 	const runId = buildRunId();
 	const runDir = opts.out ?? join(specDir, "runs", runId);
 	await mkdir(runDir, { recursive: true });
@@ -4966,6 +5224,7 @@ async function runOneSpec(args) {
 		runId,
 		runDir,
 		sessionName,
+		statePath,
 		systemPromptSuffix: userPromptSuffix,
 		model: opts.model,
 		language: opts.language,
@@ -5231,28 +5490,57 @@ async function resolveVitestConfig(cwd) {
 		return bundledVitestConfigPath();
 	}
 }
-const runCommand = addLanguageOption(new Command("run").argument("[target]", "Spec to run: '<feature>/<spec>', '<feature>', or omit for all").description("Run specs. Each spec's execution mode comes from its spec.yaml `mode:` field (default deterministic; set `mode: live` to have Claude drive agent-browser live per step). Deterministic specs replay the recorded test.spec.ts under vitest. Pass --report to write one unified HTML report covering both modes.").option("--report [dir]", `Write a self-contained HTML run report (failure analysis + drift audit by default). Default dir: ${DEFAULT_REPORT_DIR}/`).option("--changed", "Restrict execution to specs whose relatedPaths intersect the git diff against --base (or, in CI, $GITHUB_BASE_REF, else origin/main). Cannot be combined with an explicit spec id.").option("--no-failure-analysis", "Skip the per-failure root-cause classification (TEST_DRIFT / SPEC_CHANGE / PRODUCT_BUG). --report only.").option("--no-drift-audit", "Skip the spec↔code drift audit shown in the report. --report only.").option("--base <ref>", "Base ref the source diff is taken against for failure analysis (default: GITHUB_BASE_REF, then origin/main).").option("--cwd <path>", "Working directory containing the .ccqa/ tree (monorepo support). Defaults to the current directory.").option("--format <fmt>", "Additional output format alongside HTML when --report is set: 'text' (default), 'json' (writes report.json), 'github' (GitHub Actions annotations on stdout).", (raw) => {
+const runCommand = addProfileOption(addLanguageOption(new Command("run").argument("[targets...]", "Specs to run, space-separated: each '<feature>/<spec>', '<feature>', or omit for all. Duplicates are de-duped.").description("Run specs. Each spec's execution mode comes from its spec.yaml `mode:` field (default deterministic; set `mode: live` to have Claude drive agent-browser live per step). Deterministic specs replay the recorded test.spec.ts under vitest. Pass --report to write one unified HTML report covering both modes.").option("--report [dir]", `Write a self-contained HTML run report (failure analysis + drift audit by default). Default dir: ${DEFAULT_REPORT_DIR}/`).option("--changed", "Restrict execution to specs whose relatedPaths intersect the git diff against --base (or, in CI, $GITHUB_BASE_REF, else origin/main). Cannot be combined with an explicit spec id.").option("--no-failure-analysis", "Skip the per-failure root-cause classification (TEST_DRIFT / SPEC_CHANGE / PRODUCT_BUG). --report only.").option("--no-drift-audit", "Skip the spec↔code drift audit shown in the report. --report only.").option("--base <ref>", "Base ref the source diff is taken against for failure analysis (default: GITHUB_BASE_REF, then origin/main).").option("--cwd <path>", "Working directory containing the .ccqa/ tree (monorepo support). Defaults to the current directory.").option("--format <fmt>", "Additional output format alongside HTML when --report is set: 'text' (default), 'json' (writes report.json), 'github' (GitHub Actions annotations on stdout).", (raw) => {
 	if (REPORT_FORMATS.includes(raw)) return raw;
 	throw new Error(`--format must be one of ${REPORT_FORMATS.join(" | ")}`);
 }, "text").option("-m, --model <name>", "Claude model alias ('sonnet'|'opus'|'haiku') or full ID. Overrides CCQA_MODEL.").option("--no-evidence", `(deterministic only) Skip step-boundary evidence capture (PNG + meta JSON written to ${DEFAULT_REPORT_DIR}/${EVIDENCE_SUBDIR}/ by default).`).option("--retry <n>", "(live only) Retry each failed step up to N more times before recording failure. Default 0.", (raw) => {
 	const n = Number(raw);
 	if (!Number.isFinite(n) || n < 0 || Math.floor(n) !== n) throw new Error(`--retry must be a non-negative integer, got "${raw}"`);
 	return n;
-}, 0).option("--out <dir>", "(live only) Override the per-spec artifact directory. Default: <specDir>/runs/<runId>. Ignored when running multiple specs.").option("--update-agent-prompt", "(live only) After the run finishes, ask Claude to refresh .ccqa/prompts/live.agent.md from a summary of the run.")).action(async (target, opts) => {
-	await runDispatcher(target, opts);
+}, 0).option("--out <dir>", "(live only) Override the per-spec artifact directory. Default: <specDir>/runs/<runId>. Ignored when running multiple specs.").option("--update-agent-prompt", "(live only) After the run finishes, ask Claude to refresh .ccqa/prompts/live.agent.md from a summary of the run.").option("--concurrency <n>", "Run up to N specs in parallel within each mode (deterministic / live). Default 1 (sequential). Live specs each get an isolated agent-browser session; high values spawn many headed Chrome instances.", parseConcurrency$1, 1))).action(async (targets, opts) => {
+	await runDispatcher(targets, opts);
 });
+/** Parse --concurrency: a positive integer. Rejects 0, negatives, non-integers. */
+function parseConcurrency$1(raw) {
+	const n = Number(raw);
+	if (!Number.isInteger(n) || n < 1) {
+		error(`invalid --concurrency: ${raw} (expected positive integer)`);
+		process.exit(2);
+	}
+	return n;
+}
 function resolveReportDir(report, cwd) {
 	if (report === void 0 || report === false) return void 0;
 	return resolve(cwd, typeof report === "string" ? report : DEFAULT_REPORT_DIR);
 }
-async function runDispatcher(target, opts) {
-	header("run", target ?? (opts.changed ? "(changed)" : "(all specs)"));
-	if (opts.changed && target) {
+/** Header label shown after `ccqa run`: the lone target, a count, or a mode marker. */
+function headerTarget(targets, opts) {
+	if (targets.length === 1) return targets[0];
+	if (targets.length > 1) return `${targets.length} targets`;
+	return opts.changed ? "(changed)" : "(all specs)";
+}
+/** De-dupe by `featureName/specName`, keeping first-seen order. */
+function dedupeSpecs(specs) {
+	const seen = /* @__PURE__ */ new Set();
+	const out = [];
+	for (const s of specs) {
+		const key = `${s.featureName}/${s.specName}`;
+		if (seen.has(key)) continue;
+		seen.add(key);
+		out.push(s);
+	}
+	return out;
+}
+async function runDispatcher(targets, opts) {
+	header("run", headerTarget(targets, opts));
+	if (opts.changed && targets.length > 0) {
 		error("--changed and an explicit spec target cannot be combined");
 		process.exit(2);
 	}
 	const cwd = resolveCwd(opts.cwd);
-	let specs = await resolveSpecTargets(target, () => listAllSpecsWithSpecFile(cwd), cwd);
+	await applyProfileFromOption(opts.profile, cwd);
+	const enumerateAll = () => listAllSpecsWithSpecFile(cwd);
+	let specs = dedupeSpecs((await Promise.all((targets.length ? targets : [void 0]).map((t) => resolveSpecTargets(t, enumerateAll, cwd)))).flat());
 	if (opts.changed) {
 		const before = specs.length;
 		specs = await collectChangedSpecs(specs, {
@@ -5273,7 +5561,7 @@ async function runDispatcher(target, opts) {
 		if (typeof opts.retry === "number" && opts.retry > 0) warn("--retry is ignored without any 'mode: live' spec");
 		if (opts.out) warn("--out is ignored without any 'mode: live' spec");
 		if (opts.updateAgentPrompt) warn("--update-agent-prompt is ignored without any 'mode: live' spec");
-	}
+	} else if (opts.out && liveSpecs.length > 1) warn("--out is ignored when running multiple live specs");
 	if (detSpecs.length === 0 && opts.evidence === false) warn("--no-evidence is ignored without any 'mode: deterministic' spec");
 	blank();
 	const reportDir = resolveReportDir(opts.report, cwd);
@@ -5282,11 +5570,12 @@ async function runDispatcher(target, opts) {
 	const live = await runLiveSpecs(liveSpecs, {
 		...opts.model ? { model: opts.model } : {},
 		...opts.language ? { language: opts.language } : {},
-		...opts.out ? { out: opts.out } : {},
+		...opts.out && liveSpecs.length === 1 ? { out: opts.out } : {},
 		cwd,
 		...opts.base ? { base: opts.base } : {},
 		...reportDir ? { reportDir } : {},
 		...typeof opts.retry === "number" ? { retry: opts.retry } : {},
+		concurrency: opts.concurrency ?? 1,
 		...reportDir && opts.driftAudit !== false ? { driftAudit: true } : {},
 		...reportDir && opts.failureAnalysis === false ? { failureAnalysis: false } : {}
 	});
@@ -5345,72 +5634,87 @@ async function runDeterministicSpecs(specs, opts, cwd, reportDirAbs) {
 		exitCode: 0
 	};
 	const tmpDir = await mkdtemp(join(tmpdir(), "ccqa-run-"));
-	const summaries = [];
-	let exitCode = 0;
 	const vitestConfig = await resolveVitestConfig(cwd);
 	const captureOutput = Boolean(opts.report);
 	const evidenceRoot = opts.evidence !== false ? join(reportDirAbs, EVIDENCE_SUBDIR) : null;
+	const concurrency = Math.max(1, opts.concurrency ?? 1);
+	const ctx = {
+		cwd,
+		tmpDir,
+		vitestConfig,
+		captureOutput,
+		evidenceRoot
+	};
 	try {
-		for (let i = 0; i < specs.length; i++) {
-			const { featureName, specName } = specs[i];
-			const scriptFile = await getTestScript(featureName, specName, cwd);
-			if (!scriptFile) {
-				warn(`${featureName}/${specName}: no test.spec.ts found`);
-				hint("run 'ccqa record <feature>/<spec>' to record it, or set 'mode: live' in spec.yaml");
-				continue;
-			}
-			run(`${featureName}/${specName}`);
-			meta("test", scriptFile);
-			blank();
-			const reportFile = join(tmpDir, `report-${i}.json`);
-			const evidenceDir = evidenceRoot ? join(evidenceRoot, featureName, specName) : null;
-			if (evidenceDir) {
-				await rm(evidenceDir, {
-					recursive: true,
-					force: true
-				});
-				await mkdir(evidenceDir, { recursive: true });
-			}
-			const proc = spawnVitestStreaming([
-				"run",
-				"--config",
-				vitestConfig,
-				scriptFile,
-				"--reporter=json",
-				`--outputFile.json=${reportFile}`
-			], {
-				cwd,
-				env: evidenceDir ? {
-					...process.env,
-					CCQA_EVIDENCE_DIR: evidenceDir
-				} : process.env
-			});
-			const tail = captureOutput ? new TailBuffer(OUTPUT_TAIL_CAP) : null;
-			await Promise.all([streamFiltered(proc.stdout, process.stdout, tail), streamFiltered(proc.stderr, process.stderr, tail)]);
-			const specExitCode = await proc.exited;
-			if (specExitCode !== 0) exitCode = specExitCode;
-			const report = await readReport(reportFile);
-			summaries.push({
-				featureName,
-				specName,
-				scriptFile,
-				report,
-				exitCode: specExitCode,
-				outputTail: tail ? tail.toString() : null,
-				evidenceDir
-			});
-			blank();
-		}
+		const summaries = (await runPool(specs, concurrency, (spec, i) => withBuffer(`${spec.featureName}/${spec.specName}`, concurrency > 1, () => runOneDeterministicSpec(spec, i, ctx)))).filter((s) => s !== null);
 		printSummary(summaries);
+		return {
+			summaries,
+			exitCode: summaries.reduce((acc, s) => s.exitCode !== 0 ? s.exitCode : acc, 0)
+		};
 	} finally {
 		await rm(tmpDir, {
 			recursive: true,
 			force: true
 		});
 	}
+}
+/**
+* Run one spec under vitest. Returns null when the spec has no recorded
+* test.spec.ts (skipped). All output goes through the logger, so under a
+* `log.withBuffer` scope it's captured and flushed as one labelled block.
+*/
+async function runOneDeterministicSpec(spec, index, ctx) {
+	const { featureName, specName } = spec;
+	const scriptFile = await getTestScript(featureName, specName, ctx.cwd);
+	if (!scriptFile) {
+		warn(`${featureName}/${specName}: no test.spec.ts found`);
+		hint("run 'ccqa record <feature>/<spec>' to record it, or set 'mode: live' in spec.yaml");
+		return null;
+	}
+	run(`${featureName}/${specName}`);
+	meta("test", scriptFile);
+	const runId = buildRunId();
+	meta("runId", runId);
+	blank();
+	const reportFile = join(ctx.tmpDir, `report-${index}.json`);
+	const evidenceDir = ctx.evidenceRoot ? join(ctx.evidenceRoot, featureName, specName) : null;
+	if (evidenceDir) {
+		await rm(evidenceDir, {
+			recursive: true,
+			force: true
+		});
+		await mkdir(evidenceDir, { recursive: true });
+	}
+	const specEnv = {
+		...process.env,
+		CCQA_RUN_ID: runId
+	};
+	if (evidenceDir) specEnv.CCQA_EVIDENCE_DIR = evidenceDir;
+	const proc = spawnVitestStreaming([
+		"run",
+		"--config",
+		ctx.vitestConfig,
+		scriptFile,
+		"--reporter=json",
+		`--outputFile.json=${reportFile}`
+	], {
+		cwd: ctx.cwd,
+		env: specEnv
+	});
+	const sink = { write: emitRaw };
+	const tail = ctx.captureOutput ? new TailBuffer(OUTPUT_TAIL_CAP) : null;
+	await Promise.all([streamFiltered(proc.stdout, sink, tail), streamFiltered(proc.stderr, sink, tail)]);
+	const specExitCode = await proc.exited;
+	blank();
 	return {
-		summaries,
-		exitCode
+		featureName,
+		specName,
+		scriptFile,
+		report: await readReport(reportFile),
+		exitCode: specExitCode,
+		outputTail: tail ? tail.toString() : null,
+		evidenceDir
 	};
 }
 function failedSpec(s) {
@@ -5859,6 +6163,7 @@ agent-browser --session SESSION wait --load networkidle
 agent-browser --session SESSION get count "<selector>"   # element-existence check (returns a number, fast)
 agent-browser --session SESSION cookies clear
 agent-browser --session SESSION find <locator> <value> <action> [<input>] [--name "<n>"] [--exact]
+agent-browser --session SESSION upload "<input[type=file] selector>" <file> [<file> ...]
 # See "Selector Rules" for the full \`find\` subset.
 # IMPORTANT: do NOT use \`wait "<css-selector>"\`. agent-browser ignores --timeout on a
 # CSS-selector wait and blocks for ~150s when the selector never matches, killing the run.
@@ -5934,6 +6239,8 @@ find nth <index> "<ALLOWED-css>" <action>
 **Verifying cleanup / deletion**: assert the *absence* of the deleted thing, not the surrounding listing screen's text. Use \`wait --fn "!document.body.innerText.includes('<unique-label>')"\` (text disappearance) — never \`wait "<css-selector>" --state hidden\` (blocks the daemon) and never \`wait --text "<navbar label>"\` (passes regardless of the deletion).
+**File inputs (\`<input type="file">\`) / OS file-picker dialogs**: do NOT \`click\` the input — that opens the OS picker, which agent-browser cannot drive. Use \`upload "<selector>" <path>\` instead. agent-browser sets the input's files directly via the underlying browser API, no native dialog ever opens. Use an ALLOWED selector to identify the input (\`[aria-label='…']\`, \`[data-testid='…']\`, \`[type='file']\` only when it's unique on the page). File paths must be plain shell args — wrap each in \`"\` for safety. Reference fixtures via \`\${CCQA_FIXTURES_DIR}/<name>\` so the same spec works locally and in CI; conventionally fixtures live under \`.ccqa/fixtures/\` and the env var resolves there. Multi-file inputs accept several positionals: \`upload "[aria-label='Attach']" "\${CCQA_FIXTURES_DIR}/a.pdf" "\${CCQA_FIXTURES_DIR}/b.pdf"\`.
 ## Test Specification
 Title: ${input.title}
@@ -6016,6 +6323,7 @@ AB_ACTION|select|<selector>|<value>|<aria label>
 AB_ACTION|hover|<selector>|<visible label>
 AB_ACTION|scroll|<direction>|<pixels>
 AB_ACTION|drag|<source selector>|<target selector>|<source label>
+AB_ACTION|upload|<file-input selector>|<file1>[|<file2>...]
 AB_ACTION|wait|<selector or text>|<label>
 AB_ACTION|snapshot|<key observation, max 100 chars>
 AB_ACTION|assert|<assertType>|<selector or "">|<value or "">|<observation>
@@ -6332,6 +6640,17 @@ function actionToAbArgs(action, sessionName) {
 			sub(action.selector),
 			sub(action.target)
 		];
+		case "upload": {
+			const sel = sub(action.selector);
+			const files = (action.files ?? []).map((f) => sub(f));
+			if (!sel || files.length === 0) return null;
+			return [
+				...base,
+				"upload",
+				sel,
+				...files
+			];
+		}
 		case "wait": {
 			const raw = sub(action.selector);
 			if (!raw) return null;
@@ -6824,7 +7143,7 @@ async function runTrace(featureName, specName, model, validationMode = "lenient"
 		sessionName
 	});
 	const promptBundle = await loadRecordPromptBundle();
-	if (promptBundle !== null) meta("user-prompt", promptBundle.loaded.join(" + "));
+	if (promptBundle !== null) meta("prompt", promptBundle.loaded.join(" + "));
 	const systemPrompt = (promptBundle === null ? baseSystemPrompt : `${baseSystemPrompt}\n## Project-specific guidance\n\n${promptBundle.text}\n`) + languageDirective(language);
 	const prompt = buildTracePrompt(spec.title);
 	info("Running agent-browser session...");
@@ -6970,7 +7289,7 @@ function dedupAndReport(actions) {
 function isAdjacentDuplicate(a, b) {
 	if (a.command !== b.command) return false;
 	if ((a.stepId ?? "") !== (b.stepId ?? "")) return false;
-	return (a.selector ?? "") === (b.selector ?? "") && (a.value ?? "") === (b.value ?? "") && (a.target ?? "") === (b.target ?? "") && (a.label ?? "") === (b.label ?? "") && (a.assertType ?? "") === (b.assertType ?? "") && (a.findLocator ?? "") === (b.findLocator ?? "") && (a.findValue ?? "") === (b.findValue ?? "") && (a.findName ?? "") === (b.findName ?? "") && (a.findIndex ?? -1) === (b.findIndex ?? -1) && (a.findExact ?? false) === (b.findExact ?? false);
+	return (a.selector ?? "") === (b.selector ?? "") && (a.value ?? "") === (b.value ?? "") && (a.target ?? "") === (b.target ?? "") && (a.label ?? "") === (b.label ?? "") && (a.assertType ?? "") === (b.assertType ?? "") && (a.findLocator ?? "") === (b.findLocator ?? "") && (a.findValue ?? "") === (b.findValue ?? "") && (a.findName ?? "") === (b.findName ?? "") && (a.findIndex ?? -1) === (b.findIndex ?? -1) && (a.findExact ?? false) === (b.findExact ?? false) && (a.files ?? []).join("|") === (b.files ?? []).join("|");
 }
 /**
 * Run the post-trace replay validation and emit user-visible drop reports.
@@ -7192,6 +7511,16 @@ function parseAbAction(line) {
 			target: parts[3],
 			label: parts[4]
 		};
+		case "upload": {
+			const selector = parts[2];
+			const files = parts.slice(3).filter((f) => f !== "");
+			if (!selector || files.length === 0) return null;
+			return {
+				command,
+				selector,
+				files
+			};
+		}
 		case "find_click":
 		case "find_dblclick":
 		case "find_hover":
@@ -7242,6 +7571,7 @@ function actionsToScript(input) {
 		`import { ${[
 			"ab",
 			"abWait",
+			"abUpload",
 			"abAssertTextVisible",
 			"abAssertVisible",
 			"abAssertNotVisible",
@@ -7275,6 +7605,7 @@ const ELEMENT_COMMANDS = new Set([
 	"select",
 	"hover",
 	"drag",
+	"upload",
 	"find_click",
 	"find_dblclick",
 	"find_fill",
@@ -7406,6 +7737,11 @@ function actionToLine(action) {
 		case "hover": return `ab("hover", ${j(action.selector)});`;
 		case "scroll": return `ab("scroll", ${[action.direction ?? "down", ...action.pixels ? [action.pixels] : []].map(j).join(", ")});`;
 		case "drag": return `ab("drag", ${j(action.selector)}, ${j(action.target)});`;
+		case "upload": {
+			const files = action.files ?? [];
+			if (!action.selector || files.length === 0) return null;
+			return `abUpload(${[j(action.selector), ...files.map(jExpr)].join(", ")});`;
+		}
 		case "wait": {
 			const sel = action.selector;
 			if (/^\d+$/.test(sel)) return `spawnSync("sleep", [${j(sel)}], { stdio: "inherit" });`;
@@ -8481,19 +8817,20 @@ function toFixMode(autoFix) {
 		case "interactive": return "interactive";
 	}
 }
-const recordCommand = addLanguageOption(new Command("record").argument("<feature/spec>", "Spec id in '<feature>/<spec>' form (resolves to .ccqa/features/<feature>/test-cases/<spec>/)").description("Record a deterministic test from a spec: run agent-browser to collect actions (trace), then generate test.spec.ts with auto-fix retries (generate). After recording, `ccqa run <feature/spec>` replays it under vitest (deterministic specs only — live specs do not need recording).").option("-m, --model <name>", "Claude model alias ('sonnet'|'opus'|'haiku') or full ID. Overrides CCQA_MODEL.").option("--validation-mode <mode>", "Post-trace validation behaviour: 'lenient' (default) tags failing actions; 'strict' drops them.", (raw) => {
+const recordCommand = addProfileOption(addLanguageOption(new Command("record").argument("<feature/spec>", "Spec id in '<feature>/<spec>' form (resolves to .ccqa/features/<feature>/test-cases/<spec>/)").description("Record a deterministic test from a spec: run agent-browser to collect actions (trace), then generate test.spec.ts with auto-fix retries (generate). After recording, `ccqa run <feature/spec>` replays it under vitest (deterministic specs only — live specs do not need recording).").option("-m, --model <name>", "Claude model alias ('sonnet'|'opus'|'haiku') or full ID. Overrides CCQA_MODEL.").option("--validation-mode <mode>", "Post-trace validation behaviour: 'lenient' (default) tags failing actions; 'strict' drops them.", (raw) => {
 	if (VALIDATION_MODES.includes(raw)) return raw;
 	throw new Error(`--validation-mode must be one of ${VALIDATION_MODES.join(" | ")}`);
 }, "lenient").option("--auto-fix <mode>", "Auto-fix behaviour during script generation: 'interactive' (default, prompt y/N), 'auto' (apply without prompt, for CI), 'skip' (never prompt, only apply high-confidence fixes).", (raw) => {
 	if (AUTO_FIX_MODES.includes(raw)) return raw;
 	throw new Error(`--auto-fix must be one of ${AUTO_FIX_MODES.join(" | ")}`);
-}, "interactive").option("--max-retries <n>", "Maximum number of auto-fix retries", "3").option("--force", "Overwrite an existing test.spec.ts without warning").option("--no-snapshot", "Don't pin AGENT_BROWSER_SESSION / capture page snapshots after a failure (debug toggle)").option("--skip-trace", "Skip the trace step and run codegen against an existing actions.json").option("--skip-codegen", "Run only the trace step (do not generate test.spec.ts)").option("--update-agent-prompt", "After the trace finishes, ask Claude to refresh .ccqa/prompts/record.agent.md from a summary of the run.").option("--cwd <path>", "Working directory containing the .ccqa/ tree (monorepo support). Defaults to the current directory.")).action(async (specPath, opts) => {
+}, "interactive").option("--max-retries <n>", "Maximum number of auto-fix retries", "3").option("--force", "Overwrite an existing test.spec.ts without warning").option("--no-snapshot", "Don't pin AGENT_BROWSER_SESSION / capture page snapshots after a failure (debug toggle)").option("--skip-trace", "Skip the trace step and run codegen against an existing actions.json").option("--skip-codegen", "Run only the trace step (do not generate test.spec.ts)").option("--update-agent-prompt", "After the trace finishes, ask Claude to refresh .ccqa/prompts/record.agent.md from a summary of the run.").option("--cwd <path>", "Working directory containing the .ccqa/ tree (monorepo support). Defaults to the current directory."))).action(async (specPath, opts) => {
 	const { featureName, specName } = parseSpecPath(specPath);
 	const language = opts.language ?? "auto";
 	if (opts.skipTrace && opts.skipCodegen) {
 		error("--skip-trace and --skip-codegen cannot be combined; nothing would run");
 		process.exit(2);
 	}
+	await applyProfileFromOption(opts.profile, resolveCwd(opts.cwd));
 	let traceResult = null;
 	if (!opts.skipTrace) {
 		traceResult = await runTrace(featureName, specName, opts.model, opts.validationMode ?? "lenient", language);

package/dist/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "ccqa",
-  "version": "0.9.1",
+  "version": "0.10.1",
   "type": "module",
   "description": "Browser test recorder powered by Claude Code and agent-browser",
   "repository": {

package/dist/runtime/test-helpers.d.mts CHANGED Viewed

@@ -3,6 +3,13 @@ declare function __setCurrentStep(stepId: string, source: string): void;
 declare function ab(...args: string[]): void;
 /** Wait for element/text with an explicit timeout so long-running async ops don't hang. */
 declare function abWait(selector: string, timeoutMs?: number): void;
+/**
+ * Upload one or more files to a file input via `agent-browser upload`.
+ * Relative paths resolve against the test's cwd. We pre-check existence
+ * locally so a typo in a fixture path surfaces as a clear error before
+ * agent-browser exits with an opaque non-zero status.
+ */
+declare function abUpload(selector: string, ...files: string[]): void;
 /** Assert stable text is visible on page (via wait --text). */
 declare function abAssertTextVisible(text: string, timeoutMs?: number): void;
 /** Assert element is visible (polls `get count`; never uses the blocking `wait <selector>`). */
@@ -29,4 +36,4 @@ declare function abAssertUnchecked(selector: string): void;
  */
 declare function abStepEvidence(stepId: string, source: string): void;
 //#endregion
-export { __setCurrentStep, ab, abAssertChecked, abAssertDisabled, abAssertEnabled, abAssertNotVisible, abAssertTextVisible, abAssertUnchecked, abAssertUrl, abAssertVisible, abStepEvidence, abWait };
+export { __setCurrentStep, ab, abAssertChecked, abAssertDisabled, abAssertEnabled, abAssertNotVisible, abAssertTextVisible, abAssertUnchecked, abAssertUrl, abAssertVisible, abStepEvidence, abUpload, abWait };

package/dist/runtime/test-helpers.mjs CHANGED Viewed

@@ -1,6 +1,6 @@
 import { i as FAILURE_STEP_ID, n as spawnAB, r as FAILURE_SOURCE, t as sleepSync } from "../spawn-ab-Ja8NRRab.mjs";
-import { mkdirSync, writeFileSync } from "node:fs";
-import { dirname, join } from "node:path";
+import { existsSync, mkdirSync, writeFileSync } from "node:fs";
+import { dirname, isAbsolute, join, resolve } from "node:path";
 //#region src/runtime/test-helpers.ts
 const POST_OPEN_SETTLE_MS = 600;
 function logStep(action, args) {
@@ -100,6 +100,31 @@ function abWait(selector, timeoutMs = 3e4) {
 		stderr: ""
 	});
 }
+/**
+* Upload one or more files to a file input via `agent-browser upload`.
+* Relative paths resolve against the test's cwd. We pre-check existence
+* locally so a typo in a fixture path surfaces as a clear error before
+* agent-browser exits with an opaque non-zero status.
+*/
+function abUpload(selector, ...files) {
+	logStep("upload", [selector, ...files]);
+	const resolved = [];
+	for (const file of files) {
+		const abs = isAbsolute(file) ? file : resolve(process.cwd(), file);
+		if (!existsSync(abs)) fail(`abUpload: file not found (${file} → ${abs})`, {
+			status: 1,
+			stdout: "",
+			stderr: ""
+		});
+		resolved.push(abs);
+	}
+	const result = spawnAB([
+		"upload",
+		selector,
+		...resolved
+	]);
+	if (result.status !== 0) fail(`agent-browser upload failed (exit ${result.status})`, result);
+}
 /** Assert stable text is visible on page (via wait --text). */
 function abAssertTextVisible(text, timeoutMs = 3e4) {
 	logStep("assert.text", [text]);
@@ -289,4 +314,4 @@ function warnEvidence(msg) {
 	process.stderr.write(`[ccqa] evidence: ${msg}\n`);
 }
 //#endregion
-export { __setCurrentStep, ab, abAssertChecked, abAssertDisabled, abAssertEnabled, abAssertNotVisible, abAssertTextVisible, abAssertUnchecked, abAssertUrl, abAssertVisible, abStepEvidence, abWait };
+export { __setCurrentStep, ab, abAssertChecked, abAssertDisabled, abAssertEnabled, abAssertNotVisible, abAssertTextVisible, abAssertUnchecked, abAssertUrl, abAssertVisible, abStepEvidence, abUpload, abWait };

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "ccqa",
-  "version": "0.9.1",
+  "version": "0.10.1",
   "type": "module",
   "description": "Browser test recorder powered by Claude Code and agent-browser",
   "repository": {