prose-qa 0.1.0 → 0.2.0
- package/README.md +65 -502
- package/dist/agent/llm-model.d.ts.map +1 -1
- package/dist/agent/llm-model.js +19 -10
- package/dist/agent/llm-model.js.map +1 -1
- package/dist/agent/prompt.js +1 -1
- package/dist/agent/prompt.js.map +1 -1
- package/dist/agent/prompt.test.js +0 -1
- package/dist/agent/prompt.test.js.map +1 -1
- package/dist/cli/analyze.js +7 -7
- package/dist/cli/analyze.js.map +1 -1
- package/dist/cli/record.js +3 -3
- package/dist/cli/record.js.map +1 -1
- package/dist/cli/run.js +12 -12
- package/dist/cli/run.js.map +1 -1
- package/dist/config/load.d.ts +4 -1
- package/dist/config/load.d.ts.map +1 -1
- package/dist/config/load.js +40 -24
- package/dist/config/load.js.map +1 -1
- package/dist/config/load.test.js +72 -13
- package/dist/config/load.test.js.map +1 -1
- package/dist/prompt/load.d.ts +2 -2
- package/dist/prompt/load.d.ts.map +1 -1
- package/dist/prompt/load.js +4 -9
- package/dist/prompt/load.js.map +1 -1
- package/dist/redact/env-secrets.test.js +3 -3
- package/dist/redact/env-secrets.test.js.map +1 -1
- package/dist/scenarios/globs.d.ts +3 -4
- package/dist/scenarios/globs.d.ts.map +1 -1
- package/dist/scenarios/globs.js +5 -27
- package/dist/scenarios/globs.js.map +1 -1
- package/dist/scenarios/globs.test.js +11 -29
- package/dist/scenarios/globs.test.js.map +1 -1
- package/dist/types/config.d.ts +3 -4
- package/dist/types/config.d.ts.map +1 -1
- package/package.json +4 -4
- package/pqa.config.ts +0 -10
package/README.md
CHANGED
|
@@ -1,570 +1,133 @@
|
|
|
1
1
|
# Prose-QA
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Write what you want to test in plain text, and let Prose-QA do the rest. This autonomous, LLM-driven testing engine executes complex web workflows and validation checkpoints without the overhead of heavy browser wrappers, bringing frictionless QA to modern development.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Requires **Node.js 24+**, `PQA_LLM_API_KEY`, and `llm.provider` / `llm.model` in config.
|
|
6
6
|
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
## Features
|
|
10
|
-
|
|
11
|
-
- **Natural language scenarios** with `# Goal`, `# Steps`, and `# Then` checkpoints
|
|
12
|
-
- **Agent Skills** ([agentskills.io](https://agentskills.io/)) — Anthropic-compatible `SKILL.md` format
|
|
13
|
-
- **Pinned agent-browser skill** vendored at `skills/agent-browser/` (installed via `postinstall` on `npm ci` / `npm install`)
|
|
14
|
-
- **CI + local debug** modes with HTML/JSON reports
|
|
15
|
-
|
|
16
|
-
## Install
|
|
7
|
+
## Quick start
|
|
17
8
|
|
|
18
9
|
```bash
|
|
19
|
-
npm install -g prose-qa
|
|
20
|
-
# or in a project:
|
|
21
10
|
npm install prose-qa
|
|
22
|
-
npx pqa --help
|
|
23
|
-
```
|
|
24
|
-
|
|
25
|
-
Requires Node.js 20+ and an LLM API key (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `FIREWORKS_API_KEY`, `OPENROUTER_API_KEY`, etc. depending on config).
|
|
26
|
-
|
|
27
|
-
On first install, `agent-browser` downloads its browser binary via `postinstall`. In CI, run:
|
|
28
|
-
|
|
29
|
-
```bash
|
|
30
|
-
npx agent-browser install --with-deps
|
|
31
|
-
```
|
|
32
11
|
|
|
33
|
-
## New project setup
|
|
34
|
-
|
|
35
|
-
1. Install the package in your app repo (or globally).
|
|
36
|
-
2. Create `pqa.config.json` in your project root (or use `pqa config <key> <value>` to set values incrementally):
|
|
37
|
-
|
|
38
|
-
```bash
|
|
39
12
|
pqa config llm.provider anthropic
|
|
40
13
|
pqa config llm.model claude-sonnet-4-20250514
|
|
41
|
-
```
|
|
42
|
-
|
|
43
|
-
Supported config filenames (first match wins): `pqa.config.json`, `pqa.config.mjs`, `pqa.config.js`, `pqa.config.ts`.
|
|
44
|
-
|
|
45
|
-
3. Create scenarios under `scenarios/` (see [0_hello-world.md](scenarios/0_hello-world.md)).
|
|
46
|
-
4. Copy [`.env.example`](.env.example) to `.env.development.local` (or set env vars in CI) and fill in secrets.
|
|
47
|
-
5. Run:
|
|
48
14
|
|
|
49
|
-
|
|
50
|
-
export ANTHROPIC_API_KEY=...
|
|
15
|
+
export PQA_LLM_API_KEY=...
|
|
51
16
|
pqa run scenarios/**/*.md --tags smoke
|
|
52
17
|
```
|
|
53
18
|
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
## Development (this repo)
|
|
57
|
-
|
|
58
|
-
```bash
|
|
59
|
-
git clone https://github.com/FreakDev/Prose-QA.git
|
|
60
|
-
cd Prose-QA
|
|
61
|
-
npm ci
|
|
62
|
-
npm run build
|
|
63
|
-
|
|
64
|
-
export ANTHROPIC_API_KEY=...
|
|
65
|
-
|
|
66
|
-
# Bundled scenarios target http://127.0.0.1:8080/ — start the demo server first (separate terminal or background)
|
|
67
|
-
npm run demo:server
|
|
68
|
-
|
|
69
|
-
# CI mode
|
|
70
|
-
npm run dev -- run scenarios/**/*.md --tags example
|
|
71
|
-
|
|
72
|
-
# Debug single scenario
|
|
73
|
-
npm run dev -- debug scenarios/0_hello-world.md --verbose
|
|
74
|
-
|
|
75
|
-
# Auth demo (demo server with hardcoded credentials)
|
|
76
|
-
export PQA_TEST_EMAIL=demo@pqa.local PQA_TEST_PASSWORD=demo-password
|
|
77
|
-
npm run dev -- debug scenarios/1_example-authenticated.md --verbose
|
|
78
|
-
```
|
|
79
|
-
|
|
80
|
-
The demo server (`npm run demo:server` → `scripts/demo-server.mjs`) serves `/` (Hello World), `/login`, and protected `/projects`. Credentials match `.env.example`.
|
|
81
|
-
|
|
82
|
-
See [CONTRIBUTING.md](CONTRIBUTING.md) for pull request guidelines.
|
|
19
|
+
**New project checklist**
|
|
83
20
|
|
|
84
|
-
|
|
21
|
+
1. Install the package in your app repo (or globally with `npm install -g prose-qa`).
|
|
22
|
+
2. Create `pqa.config.json` — use `pqa config <key> <value>` or copy the [minimal example](docs/CONFIG.md#minimal-example).
|
|
23
|
+
3. Add scenarios under `scenarios/` (see [0_hello-world.md](scenarios/0_hello-world.md)).
|
|
24
|
+
4. Copy `[.env.example](.env.example)` to `.env.development.local` (or set env vars in CI).
|
|
25
|
+
5. Run `pqa run` or `pqa debug`.
|
|
85
26
|
|
|
86
|
-
|
|
27
|
+
On first install, `agent-browser` downloads its browser binary via `postinstall`. In CI:
|
|
87
28
|
|
|
88
29
|
```bash
|
|
89
|
-
|
|
90
|
-
# or from this repo:
|
|
91
|
-
npm run mcp
|
|
92
|
-
```
|
|
93
|
-
|
|
94
|
-
**Cursor** (`.cursor/mcp.json` in your app repo — `cwd` must be the project with `pqa.config` and env vars):
|
|
95
|
-
|
|
96
|
-
```json
|
|
97
|
-
{
|
|
98
|
-
"mcpServers": {
|
|
99
|
-
"prose-qa": {
|
|
100
|
-
"command": "npx",
|
|
101
|
-
"args": ["-y", "prose-qa", "mcp"]
|
|
102
|
-
}
|
|
103
|
-
}
|
|
104
|
-
}
|
|
30
|
+
npx agent-browser install --with-deps
|
|
105
31
|
```
|
|
106
32
|
|
|
107
|
-
|
|
33
|
+
Bundled harness assets (`prompt/`, `skills/`) ship inside the npm package. Your project only needs `pqa.config.*`, `scenarios/`, and optional `.agents/skills/` overrides.
|
|
108
34
|
|
|
109
|
-
|
|
110
|
-
| -------- | -------- |
|
|
111
|
-
| Resource `pqa://skill/create-pqa-scenario` | Full create-pqa-scenario `SKILL.md` |
|
|
112
|
-
| Tool `get_create_pqa_scenario_skill` | Same skill as text |
|
|
113
|
-
| Tool `validate_scenario` | Parse `content` without running the browser |
|
|
114
|
-
| Tool `run_scenario` | Execute `content` (requires LLM + browser env) |
|
|
115
|
-
| Prompt `author_pqa_scenario` | Template that includes the skill |
|
|
35
|
+
## What you get
|
|
116
36
|
|
|
117
|
-
|
|
37
|
+
- **Natural language scenarios** — `# Goal`, `# Steps`, and `# Then` checkpoints ([format guide](docs/HOWTO.md#1-scenario-format-goal--steps--then--frontmatter))
|
|
38
|
+
- **Agent Skills** ([agentskills.io](https://agentskills.io/)) — Anthropic-compatible `SKILL.md` format
|
|
39
|
+
- **Pinned agent-browser skill** vendored at `skills/agent-browser/` (installed via `postinstall`)
|
|
40
|
+
- **CI + local debug** modes with HTML/JSON reports
|
|
41
|
+
- **Auth, cache, healing, recording, and analysis** — see [HOWTO](docs/HOWTO.md)
|
|
118
42
|
|
|
119
|
-
|
|
43
|
+
## Documentation
|
|
120
44
|
|
|
121
|
-
```markdown
|
|
122
|
-
---
|
|
123
|
-
name: checkout-happy-path
|
|
124
|
-
tags: [smoke]
|
|
125
|
-
auth: admin
|
|
126
|
-
url: https://app.example.com
|
|
127
|
-
---
|
|
128
45
|
|
|
129
|
-
|
|
130
|
-
|
|
46
|
+
| Doc | Purpose |
|
|
47
|
+
| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------ |
|
|
48
|
+
| [docs/HOWTO.md](docs/HOWTO.md) | Step-by-step guide: scenarios → run → CI → auth → MCP → record → cache → healing → analyze |
|
|
49
|
+
| [docs/CONFIG.md](docs/CONFIG.md) | Full configuration reference |
|
|
50
|
+
| [CONTRIBUTING.md](CONTRIBUTING.md) | Pull request guidelines |
|
|
51
|
+
| [SECURITY.md](SECURITY.md) | Vulnerability reporting, secrets, and run artifacts |
|
|
52
|
+
| [recorder-extension/README.md](recorder-extension/README.md) | Chrome extension recorder (WIP) |
|
|
131
53
|
|
|
132
|
-
# Steps
|
|
133
|
-
1. Add item to cart and proceed to checkout.
|
|
134
|
-
2. Complete payment with test card.
|
|
135
54
|
|
|
136
|
-
|
|
137
|
-
- url contains "/order-confirmation"
|
|
138
|
-
- page shows "Thank you"
|
|
139
|
-
```
|
|
55
|
+
## CLI
|
|
140
56
|
|
|
141
|
-
## Configuration
|
|
142
57
|
|
|
143
|
-
|
|
58
|
+
| Command | Description |
|
|
59
|
+
| --------------------------------------------------- | -------------------------------------------------- |
|
|
60
|
+
| `pqa config <key> <value>` | Set a value in `pqa.config.json` |
|
|
61
|
+
| `pqa run [globs]` | Run scenarios (headless by default) |
|
|
62
|
+
| `pqa debug [globs]` | Verbose debug run (headed by default) |
|
|
63
|
+
| `pqa clear-cache [scenario]` | Clear scenario replay cache |
|
|
64
|
+
| `pqa auth list` / `clear` / `save` | Manage cached auth profiles |
|
|
65
|
+
| `pqa analyze [run...]` | Post-run analysis and flaky detection (`--last N`) |
|
|
66
|
+
| `pqa record start` / `note` / `checkpoint` / `stop` | Record browser actions → scenario markdown |
|
|
67
|
+
| `pqa skills list` / `show` / `sync` | Discover and inspect agent skills |
|
|
68
|
+
| `pqa mcp` | Start MCP server (Cursor, Claude Desktop, …) |
|
|
144
69
|
|
|
145
|
-
**Local config files** (first match in the project root wins): `pqa.config.json`, `pqa.config.mjs`, `pqa.config.js`, `pqa.config.ts`.
|
|
146
70
|
|
|
147
|
-
|
|
71
|
+
Tag filters, auth refresh, retries, and cache flags: see [HOWTO §3–§4](docs/HOWTO.md#3-debug-vs-run) and [HOWTO §11](docs/HOWTO.md#11-healing--retries).
|
|
148
72
|
|
|
149
|
-
|
|
150
|
-
pqa config llm.provider anthropic
|
|
151
|
-
pqa config browser.headed true
|
|
152
|
-
pqa config envVars '["PQA_TEST_EMAIL","PQA_TEST_PASSWORD"]'
|
|
153
|
-
```
|
|
73
|
+
**Exit codes:** `0` pass · `1` failure · `2` config/harness error
|
|
154
74
|
|
|
155
|
-
|
|
75
|
+
## Configuration
|
|
156
76
|
|
|
157
|
-
|
|
77
|
+
Supported filenames (first match wins): `pqa.config.json`, `pqa.config.mjs`, `pqa.config.js`, `pqa.config.ts`.
|
|
158
78
|
|
|
159
79
|
```json
|
|
160
80
|
{
|
|
161
81
|
"envVars": ["PQA_TEST_EMAIL", "PQA_TEST_PASSWORD"],
|
|
162
|
-
"sensitiveEnvVars": ["PQA_TEST_EMAIL", "PQA_TEST_PASSWORD"],
|
|
163
82
|
"llm": {
|
|
164
83
|
"provider": "anthropic",
|
|
165
84
|
"model": "claude-sonnet-4-20250514"
|
|
166
|
-
},
|
|
167
|
-
"auth": {
|
|
168
|
-
"admin": {
|
|
169
|
-
"scenario": "login-admin",
|
|
170
|
-
"statePath": ".pqa/auth/admin.json"
|
|
171
|
-
}
|
|
172
85
|
}
|
|
173
86
|
}
|
|
174
87
|
```
|
|
175
88
|
|
|
176
|
-
### Environment variables
|
|
177
|
-
|
|
178
|
-
| Variable | Description |
|
|
179
|
-
| --- | --- |
|
|
180
|
-
| `ANTHROPIC_API_KEY` | Required when `llm.provider` is `anthropic` |
|
|
181
|
-
| `OPENAI_API_KEY` | Required when `llm.provider` is `openai` |
|
|
182
|
-
| `FIREWORKS_API_KEY` | Required when `llm.provider` is `fireworks` |
|
|
183
|
-
| `GOOGLE_GENERATIVE_AI_API_KEY` | Required when `llm.provider` is `google` |
|
|
184
|
-
| `OPENROUTER_API_KEY` | Required when `llm.provider` is `openrouter` |
|
|
185
|
-
| `PQA_LLM_PROVIDER` | Overrides bundled default `llm.provider` (dev / CI shortcut) |
|
|
186
|
-
| `PQA_LLM_MODEL` | Overrides bundled default `llm.model` |
|
|
187
|
-
|
|
188
|
-
Ollama does not require an API key env var. Any name listed in `envVars` must be set before a run starts.
|
|
189
|
-
|
|
190
|
-
### All options
|
|
191
|
-
|
|
192
|
-
#### `scenariosDir` (string)
|
|
193
|
-
|
|
194
|
-
Root directory for scenario markdown files. Set directly in `pqa.config.json`.
|
|
195
|
-
|
|
196
|
-
| | |
|
|
197
|
-
| --- | --- |
|
|
198
|
-
| **Default** | `scenarios`, or `pqa/` when that directory exists and `scenarios/` does not |
|
|
199
|
-
|
|
200
|
-
#### `systemPromptPath` (string)
|
|
201
|
-
|
|
202
|
-
Path to the agent system prompt markdown file. Relative paths resolve against the project cwd first, then bundled package assets.
|
|
203
|
-
|
|
204
|
-
| | |
|
|
205
|
-
| --- | --- |
|
|
206
|
-
| **Default** | `prompt/SYSTEM.md` (bundled) |
|
|
207
|
-
|
|
208
|
-
#### `envVars` (string[])
|
|
209
|
-
|
|
210
|
-
Environment variable **names** the agent should know about. Injected into the system prompt at runtime (set / not-set status only — never values). Validated before each run.
|
|
211
|
-
|
|
212
|
-
| | |
|
|
213
|
-
| --- | --- |
|
|
214
|
-
| **Default** | `[]` |
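The `envVars` behaviour described above (names validated before a run, and only set / not-set status exposed to the prompt) can be sketched roughly as follows. The helper names `checkEnvVars` and `describeEnvStatus` are hypothetical and do not come from the package; treating an empty string as "not set" is also an assumption.

```typescript
// Hypothetical helpers; the package's real implementation is not shown in this diff.
type EnvMap = Record<string, string | undefined>;

/** Split configured names into those that are set and those that are missing. */
function checkEnvVars(
  names: string[],
  env: EnvMap,
): { present: string[]; missing: string[] } {
  const present: string[] = [];
  const missing: string[] = [];
  for (const name of names) {
    const value = env[name];
    // Assumption: an empty string counts as "not set".
    if (value !== undefined && value !== "") present.push(name);
    else missing.push(name);
  }
  return { present, missing };
}

/** Render one status line per name: names and status only, never values. */
function describeEnvStatus(names: string[], env: EnvMap): string {
  const { missing } = checkEnvVars(names, env);
  return names
    .map((n) => `${n}: ${missing.includes(n) ? "not set" : "set"}`)
    .join("\n");
}
```

A runner built along these lines would presumably abort with the config/harness exit code when `missing` is non-empty, but that wiring is not confirmed by the README text.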
|
|
215
|
-
|
|
216
|
-
#### `sensitiveEnvVars` (string[])
|
|
217
|
-
|
|
218
|
-
Env var names whose **values** are redacted from transcripts, verdicts, reports, and verbose logs (replaced with `${VAR_NAME}`). If omitted, defaults to `envVars`. The LLM API key for the configured provider is always redacted.
|
|
219
|
-
|
|
220
|
-
| | |
|
|
221
|
-
| --- | --- |
|
|
222
|
-
| **Default** | same as `envVars` |
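As a rough illustration of the redaction rule above (sensitive values replaced with `${VAR_NAME}`), the sketch below does a literal replacement over a piece of text. The function name `redactSecrets` and the longest-value-first ordering are assumptions made for this example, not details taken from the package.

```typescript
type EnvValues = Record<string, string | undefined>;

/** Replace each sensitive value found in `text` with a `${VAR_NAME}` placeholder. */
function redactSecrets(
  text: string,
  sensitiveNames: string[],
  env: EnvValues,
): string {
  // Assumption: redact longer values first so a secret containing another is fully masked.
  const entries = sensitiveNames
    .map((name) => ({ name, value: env[name] ?? "" }))
    .filter((entry) => entry.value.length > 0)
    .sort((a, b) => b.value.length - a.value.length);

  let out = text;
  for (const { name, value } of entries) {
    // split/join performs a literal (non-regex) global replacement.
    out = out.split(value).join("${" + name + "}");
  }
  return out;
}
```

Unset names are skipped, so an empty value never turns every position in the text into a placeholder.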
|
|
223
|
-
|
|
224
|
-
---
|
|
225
|
-
|
|
226
|
-
#### `llm` (object)
|
|
227
|
-
|
|
228
|
-
LLM provider and model used for test runs, recording generation, and analysis.
|
|
229
|
-
|
|
230
|
-
| Key | Type | Default | Description |
|
|
231
|
-
| --- | --- | --- | --- |
|
|
232
|
-
| `provider` | `"anthropic"` \| `"openai"` \| `"fireworks"` \| `"ollama"` \| `"google"` \| `"openrouter"` | `"anthropic"` | LLM backend |
|
|
233
|
-
| `model` | string | `"claude-sonnet-4-20250514"` | Model identifier for the chosen provider |
|
|
234
|
-
|
|
235
|
-
##### `llm.thinking` (object, optional)
|
|
236
|
-
|
|
237
|
-
Extended thinking / reasoning. Provider support varies.
|
|
238
|
-
|
|
239
|
-
| Key | Type | Default | Description |
|
|
240
|
-
| --- | --- | --- | --- |
|
|
241
|
-
| `enabled` | boolean | `true` | Enable extended thinking |
|
|
242
|
-
| `budgetTokens` | number | `10000` | Thinking token budget (Anthropic, Fireworks, Google, OpenRouter) |
|
|
243
|
-
| `reasoningEffort` | `"none"` \| `"minimal"` \| `"low"` \| `"medium"` \| `"high"` \| `"xhigh"` | — | OpenAI reasoning effort; mapped to Anthropic effort, Google thinking level, and OpenRouter reasoning effort. Ollama uses `think` mode only (other fields ignored) |
|
|
244
|
-
|
|
245
|
-
---
|
|
246
|
-
|
|
247
|
-
#### `browser` (object)
|
|
248
|
-
|
|
249
|
-
Default browser behavior for scenario runs (overridable per run with `--headed` / `--no-headed`).
|
|
250
|
-
|
|
251
|
-
| Key | Type | Default | Description |
|
|
252
|
-
| --- | --- | --- | --- |
|
|
253
|
-
| `headed` | boolean | `false` | Run browser in visible (headed) mode |
|
|
254
|
-
| `sessionName` | string | `"pqa"` | agent-browser session name |
|
|
255
|
-
| `defaultTimeout` | number | `25000` | Default timeout in milliseconds |
|
|
256
|
-
|
|
257
|
-
---
|
|
258
|
-
|
|
259
|
-
#### `skills` (object)
|
|
260
|
-
|
|
261
|
-
Agent skill discovery and preloading ([agentskills.io](https://agentskills.io/) `SKILL.md` format).
|
|
262
|
-
|
|
263
|
-
| Key | Type | Default | Description |
|
|
264
|
-
| --- | --- | --- | --- |
|
|
265
|
-
| `dirs` | string[] | `["skills", ".agents/skills"]` | Directories scanned for skills. Relative paths resolve like bundled assets |
|
|
266
|
-
| `preloads` | string[] | `["core"]` | Skill names always appended to the system prompt (`core` = vendored agent-browser skill) |
|
|
267
|
-
|
|
268
|
-
---
|
|
269
|
-
|
|
270
|
-
#### `agent` (object)
|
|
271
89
|
|
|
272
|
-
|
|
90
|
+
| Variable | Required when |
|
|
91
|
+
| ------------------ | ---------------------------------------- |
|
|
92
|
+
| `PQA_LLM_API_KEY` | Any cloud `llm.provider` (not `ollama`) |
|
|
93
|
+
| `PQA_LLM_PROVIDER` | Optional env shortcut for `llm.provider` |
|
|
94
|
+
| `PQA_LLM_MODEL` | Optional env shortcut for `llm.model` |
|
|
273
95
|
|
|
274
|
-
| Key | Type | Default | Description |
|
|
275
|
-
| --- | --- | --- | --- |
|
|
276
|
-
| `maxTurns` | number | `200` | Maximum agent turns per scenario |
|
|
277
|
-
| `bashTimeoutMs` | number | `120000` | Timeout for each bash (agent-browser) command in milliseconds |
|
|
278
96
|
|
|
279
|
-
|
|
97
|
+
All options, env vars, and a full example: **[docs/CONFIG.md](docs/CONFIG.md)**.
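The "first match wins" rule for config filenames can be pictured with a small sketch. The candidate order is the one listed in the README; the function `pickConfigFile` and its injected `exists` callback are hypothetical, chosen so the example does not depend on the file system.

```typescript
// Candidate order taken from the README; everything else here is illustrative.
const CONFIG_CANDIDATES = [
  "pqa.config.json",
  "pqa.config.mjs",
  "pqa.config.js",
  "pqa.config.ts",
] as const;

/**
 * Return the first candidate that exists in the project root, or undefined.
 * `exists` is injected so the sketch stays independent of the file system.
 */
function pickConfigFile(
  exists: (filename: string) => boolean,
): string | undefined {
  return CONFIG_CANDIDATES.find((filename) => exists(filename));
}
```

In a Node.js script, `exists` could be backed by `existsSync` from `node:fs` joined with the project root; how the package itself resolves the file is not shown here.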
|
|
280
98
|
|
|
281
|
-
|
|
99
|
+
## MCP (Cursor)
|
|
282
100
|
|
|
283
|
-
|
|
284
|
-
|
|
285
|
-
Each profile key (e.g. `admin`) supports:
|
|
286
|
-
|
|
287
|
-
| Key | Type | Default | Description |
|
|
288
|
-
| --- | --- | --- | --- |
|
|
289
|
-
| `scenario` | string | — | `frontmatter.name` of the on-demand auth scenario (e.g. `"login-admin"`) |
|
|
290
|
-
| `statePath` | string | `.pqa/auth/<profile>.json` | agent-browser state file path |
|
|
291
|
-
|
|
292
|
-
When a scenario uses `auth: admin`, the harness loads cached state from `statePath` or runs the auth scenario once, saves browser state, then continues. See [Auth (hybrid authStore)](#auth-hybrid-authstore).
|
|
293
|
-
|
|
294
|
-
---
|
|
295
|
-
|
|
296
|
-
#### `healing` (object, optional)
|
|
297
|
-
|
|
298
|
-
Conservative self-healing: in-run recovery and transient-only scenario retries. See [Self-healing](#self-healing-conservative).
|
|
299
|
-
|
|
300
|
-
| Key | Type | Default | Description |
|
|
301
|
-
| --- | --- | --- | --- |
|
|
302
|
-
| `enabled` | boolean | `true` | Master switch for in-run recovery and transient retry gating |
|
|
303
|
-
| `maxRecoveryTurns` | number | `2` | Extra agent turns after a failed verdict (same browser session) |
|
|
304
|
-
| `recoverOnUnknown` | boolean | `false` | Allow recovery when failure class is unknown but bash output looks transient |
|
|
305
|
-
| `transientPatterns` | string[] | see below | Substrings matched against bash output and checkpoint reasons to classify transient failures |
|
|
306
|
-
|
|
307
|
-
Default `transientPatterns`: `timeout`, `timed out`, `not found`, `waiting for`, `navigation`, `net::`, `target closed`, `detached`, `stale`, `interrupted`.
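A minimal sketch of substring-based transient classification, using the default patterns listed above. The function name `looksTransient` is hypothetical, and case-insensitive matching is an assumption; the README only says the patterns are matched as substrings.

```typescript
// Default patterns as listed in the README.
const DEFAULT_TRANSIENT_PATTERNS: string[] = [
  "timeout", "timed out", "not found", "waiting for", "navigation",
  "net::", "target closed", "detached", "stale", "interrupted",
];

/** True when any pattern occurs as a substring of the failure text. */
function looksTransient(
  failureText: string,
  patterns: string[] = DEFAULT_TRANSIENT_PATTERNS,
): boolean {
  // Assumption: matching ignores case.
  const haystack = failureText.toLowerCase();
  return patterns.some((p) => haystack.indexOf(p.toLowerCase()) !== -1);
}
```

Broad patterns such as `not found` would also match genuine assertion failures that happen to contain those words, which is presumably why the README calls the overall approach conservative and keeps `recoverOnUnknown` off by default.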
|
|
308
|
-
|
|
309
|
-
CLI equivalents: `--no-healing`, `--retries N`, `--retries-policy transient|always`, `--no-cache`.
|
|
310
|
-
|
|
311
|
-
---
|
|
312
|
-
|
|
313
|
-
#### `cache` (object, optional)
|
|
314
|
-
|
|
315
|
-
Scenario replay cache settings. See [Scenario replay cache](#scenario-replay-cache).
|
|
316
|
-
|
|
317
|
-
| Key | Type | Default | Description |
|
|
318
|
-
| --- | --- | --- | --- |
|
|
319
|
-
| `dir` | string | `".pqa/cache"` | Directory for per-scenario replay hints |
|
|
320
|
-
| `enabled` | boolean | `true` | Master switch (opt-out via `--no-cache`) |
|
|
321
|
-
|
|
322
|
-
---
|
|
323
|
-
|
|
324
|
-
#### `recorder` (object, optional)
|
|
325
|
-
|
|
326
|
-
Settings for `pqa record`. See [Recording scenarios](#recording-scenarios).
|
|
327
|
-
|
|
328
|
-
| Key | Type | Default | Description |
|
|
329
|
-
| --- | --- | --- | --- |
|
|
330
|
-
| `bridgePort` | number | `17321` | Local HTTP port for the recording event bridge |
|
|
331
|
-
| `outputDir` | string | `".pqa/recordings"` | Directory for saved recording sessions |
|
|
332
|
-
| `defaultTags` | string[] | `["recorded"]` | Tags added to generated scenario frontmatter |
|
|
333
|
-
|
|
334
|
-
---
|
|
335
|
-
|
|
336
|
-
### Full reference example
|
|
101
|
+
Add to `.cursor/mcp.json` in your app repo (`cwd` must be the project with `pqa.config` and env vars):
|
|
337
102
|
|
|
338
103
|
```json
|
|
339
104
|
{
|
|
340
|
-
"
|
|
341
|
-
|
|
342
|
-
|
|
343
|
-
|
|
344
|
-
"llm": {
|
|
345
|
-
"provider": "anthropic",
|
|
346
|
-
"model": "claude-sonnet-4-20250514",
|
|
347
|
-
"thinking": {
|
|
348
|
-
"enabled": true,
|
|
349
|
-
"budgetTokens": 10000,
|
|
350
|
-
"reasoningEffort": "high"
|
|
351
|
-
}
|
|
352
|
-
},
|
|
353
|
-
"browser": {
|
|
354
|
-
"headed": false,
|
|
355
|
-
"sessionName": "pqa",
|
|
356
|
-
"defaultTimeout": 25000
|
|
357
|
-
},
|
|
358
|
-
"skills": {
|
|
359
|
-
"dirs": ["skills", ".agents/skills"],
|
|
360
|
-
"preloads": ["core"]
|
|
361
|
-
},
|
|
362
|
-
"agent": {
|
|
363
|
-
"maxTurns": 200,
|
|
364
|
-
"bashTimeoutMs": 120000
|
|
365
|
-
},
|
|
366
|
-
"auth": {
|
|
367
|
-
"admin": {
|
|
368
|
-
"scenario": "login-admin",
|
|
369
|
-
"statePath": ".pqa/auth/admin.json"
|
|
105
|
+
"mcpServers": {
|
|
106
|
+
"prose-qa": {
|
|
107
|
+
"command": "npx",
|
|
108
|
+
"args": ["-y", "prose-qa", "mcp"]
|
|
370
109
|
}
|
|
371
|
-
},
|
|
372
|
-
"healing": {
|
|
373
|
-
"enabled": true,
|
|
374
|
-
"maxRecoveryTurns": 2,
|
|
375
|
-
"recoverOnUnknown": false,
|
|
376
|
-
"transientPatterns": [
|
|
377
|
-
"timeout",
|
|
378
|
-
"timed out",
|
|
379
|
-
"not found",
|
|
380
|
-
"waiting for",
|
|
381
|
-
"navigation",
|
|
382
|
-
"net::",
|
|
383
|
-
"target closed",
|
|
384
|
-
"detached",
|
|
385
|
-
"stale",
|
|
386
|
-
"interrupted"
|
|
387
|
-
]
|
|
388
|
-
},
|
|
389
|
-
"recorder": {
|
|
390
|
-
"bridgePort": 17321,
|
|
391
|
-
"outputDir": ".pqa/recordings",
|
|
392
|
-
"defaultTags": ["recorded"]
|
|
393
|
-
},
|
|
394
|
-
"cache": {
|
|
395
|
-
"dir": ".pqa/cache",
|
|
396
|
-
"enabled": true
|
|
397
|
-
}
|
|
398
|
-
}
|
|
399
|
-
```
|
|
400
|
-
|
|
401
|
-
## CLI
|
|
402
|
-
|
|
403
|
-
| Command | Description |
|
|
404
|
-
| --- | --- |
|
|
405
|
-
| `pqa config <key> <value>` | Set a value in `pqa.config.json` |
|
|
406
|
-
| `pqa run [globs]` | Run scenarios (headless by default) |
|
|
407
|
-
| `pqa clear-cache [scenario]` | Clear scenario replay cache |
|
|
408
|
-
| `pqa debug [globs]` | Verbose debug run (headed by default, supports `--tag` / `--tags`) |
|
|
409
|
-
| `pqa skills list` | List discovered skills |
|
|
410
|
-
| `pqa skills show <name>` | Print skill body |
|
|
411
|
-
| `pqa skills sync` | Re-vendor agent-browser skill (dev repo only) |
|
|
412
|
-
| `pqa auth list` | List cached auth profiles in the auth store |
|
|
413
|
-
| `pqa auth clear [profile]` | Clear cached auth state |
|
|
414
|
-
| `pqa auth save <name>` | Run the configured auth scenario and save state |
|
|
415
|
-
| `pqa analyze [run...]` | Heuristic + LLM analysis, interactive patch review (REPL); multi-run flaky detection with `--last N` |
|
|
416
|
-
| `pqa record start` | Start headed recording session (browser + event bridge) |
|
|
417
|
-
| `pqa record note <text>` | Add a comment to the active recording |
|
|
418
|
-
| `pqa record checkpoint <text>` | Add a Then-section hint |
|
|
419
|
-
| `pqa record stop` | Stop recording and generate `scenarios/recorded/*.md` via LLM |
|
|
420
|
-
| `pqa record generate <dir>` | Regenerate scenario markdown from a saved recording |
|
|
421
|
-
|
|
422
|
-
Tag filters on `run` and `debug` can express AND/OR/NOT combinations:
|
|
423
|
-
|
|
424
|
-
```bash
|
|
425
|
-
# AND: scenario must have both tags
|
|
426
|
-
pqa run scenarios/**/*.md --tags smoke,checkout
|
|
427
|
-
|
|
428
|
-
# AND with NOT: scenario must have p0 and must not have smoke
|
|
429
|
-
pqa run scenarios/**/*.md --tags p0,!smoke
|
|
430
|
-
|
|
431
|
-
# OR: scenario may have either tag
|
|
432
|
-
pqa run scenarios/**/*.md --tag smoke --tag checkout
|
|
433
|
-
|
|
434
|
-
# OR with NOT: scenario either lacks p0 or has smoke
|
|
435
|
-
pqa run scenarios/**/*.md --tag !p0 --tag smoke
|
|
436
|
-
|
|
437
|
-
# Combined: (smoke AND checkout) OR auth
|
|
438
|
-
pqa run scenarios/**/*.md --tags smoke,checkout --tag auth
|
|
439
|
-
```
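The tag examples above amount to an OR over AND-groups, with `!` negating a single term. A sketch of that evaluation, under the assumption that each `--tags` value forms one AND-group and each `--tag` value forms a group of its own (the names `parseTagGroup` and `matchesTagFilter` are hypothetical):

```typescript
/** One AND-group: every term must hold. A leading "!" negates a term. */
type TagGroup = string[];

/** Parse a comma-separated `--tags` value (or a single `--tag` value) into a group. */
function parseTagGroup(value: string): TagGroup {
  return value
    .split(",")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
}

/** A scenario matches when at least one group is fully satisfied (OR of ANDs). */
function matchesTagFilter(scenarioTags: string[], groups: TagGroup[]): boolean {
  // Assumption: no filter at all means "run everything".
  if (groups.length === 0) return true;
  const tags = new Set(scenarioTags);
  return groups.some((group) =>
    group.every((term) =>
      term.charAt(0) === "!" ? !tags.has(term.slice(1)) : tags.has(term),
    ),
  );
}
```

For instance, `--tags smoke,checkout --tag auth` would become `[["smoke", "checkout"], ["auth"]]`, so a scenario tagged only `auth` matches while one tagged only `smoke` does not.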
|
|
440
|
-
|
|
441
|
-
Use `--auth-refresh` on `run` / `debug` to re-run auth scenarios and refresh the store.
|
|
442
|
-
|
|
443
|
-
## Scenario replay cache
|
|
444
|
-
|
|
445
|
-
After a scenario passes, PQA runs a secondary LLM pass on the run transcript to produce **replay hints** under `.pqa/cache/<scenario-name>/` (`hints.md` + `meta.json`). On the next run, those hints are injected into the agent system prompt (like a skill) so the agent can follow proven `agent-browser` paths and avoid repeating costly recovery loops.
|
|
446
|
-
|
|
447
|
-
```bash
|
|
448
|
-
# First run: agent executes; hints are generated on pass
|
|
449
|
-
pqa run scenarios/lapresse/homepage-smoke.md
|
|
450
|
-
|
|
451
|
-
# Second run: agent runs with cached hints (if scenario content unchanged)
|
|
452
|
-
pqa run scenarios/lapresse/homepage-smoke.md
|
|
453
|
-
|
|
454
|
-
# Skip cache read/write
|
|
455
|
-
pqa run scenarios/**/*.md --no-cache
|
|
456
|
-
|
|
457
|
-
# Clear one or all caches
|
|
458
|
-
pqa clear-cache lapresse-homepage-smoke
|
|
459
|
-
pqa clear-cache
|
|
460
|
-
```
|
|
461
|
-
|
|
462
|
-
Cache is **invalidated** when the effective scenario content changes (Goal, Steps, Then, frontmatter, and expanded includes — detected via content hash). Hints are **merged and refined** on each subsequent pass. Failed runs do not update the cache.
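Content-hash invalidation of this kind might look roughly like the sketch below. The `EffectiveScenario` shape, the function names, and the canonical serialization are all assumptions; the 32-bit FNV-1a hash is only a dependency-free stand-in for whatever digest the package actually uses, and includes are assumed to be expanded before hashing.

```typescript
/** The parts of a scenario that, per the README, feed the cache key. */
interface EffectiveScenario {
  frontmatter: Record<string, unknown>;
  goal: string;
  steps: string[]; // assumed to be already expanded (includes resolved)
  then: string[];
}

/** 32-bit FNV-1a over UTF-16 code units; a stand-in for a real digest. */
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return ("00000000" + (hash >>> 0).toString(16)).slice(-8);
}

/** Hash a canonical serialization so frontmatter key order does not matter. */
function scenarioHash(s: EffectiveScenario): string {
  const sortedFrontmatter = Object.keys(s.frontmatter)
    .sort()
    .map((key) => [key, s.frontmatter[key]]);
  return fnv1a(JSON.stringify([sortedFrontmatter, s.goal, s.steps, s.then]));
}

/** Cached hints are reusable only when the stored hash matches the current one. */
function cacheIsFresh(
  storedHash: string | undefined,
  current: EffectiveScenario,
): boolean {
  return storedHash !== undefined && storedHash === scenarioHash(current);
}
```

A production implementation would more likely use a cryptographic digest such as SHA-256; the point here is only that any edit to Goal, Steps, Then, or frontmatter changes the key and so bypasses stale hints.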
|
|
463
|
-
|
|
464
|
-
Config (optional):
|
|
465
|
-
|
|
466
|
-
```json
|
|
467
|
-
{
|
|
468
|
-
"cache": {
|
|
469
|
-
"dir": ".pqa/cache",
|
|
470
|
-
"enabled": true
|
|
471
110
|
}
|
|
472
111
|
}
|
|
473
112
|
```
|
|
474
113
|
|
|
475
|
-
|
|
476
|
-
|
|
477
|
-
Record user actions and generate a draft scenario markdown file:
|
|
478
|
-
|
|
479
|
-
```bash
|
|
480
|
-
pqa record start --url http://localhost:3000/projects
|
|
481
|
-
pqa record note "intentionally invalid date"
|
|
482
|
-
# interact in the browser
|
|
483
|
-
pqa record checkpoint 'page shows "Projects"'
|
|
484
|
-
pqa record stop --name my-flow
|
|
485
|
-
pqa debug scenarios/recorded/my-flow.md --verbose --headed
|
|
486
|
-
```
|
|
487
|
-
|
|
488
|
-
Events are stored under `.pqa/recordings/<timestamp>/events.jsonl`. On each interaction, the bridge runs `agent-browser snapshot -i`, matches the target to a snapshot ref (`snapshot.ref`, `snapshot.description`), and saves the tree under `snapshots/<ts>.json`. A background bridge process keeps receiving browser events until `pqa record stop` (so you can run `record note` / `record checkpoint` in another terminal while clicking in the browser). Generation uses the same LLM config as test runs (`prompt/RECORD.md`). Recorder options: see [`recorder`](#recorder-object-optional) in Configuration.
|
|
489
|
-
|
|
490
|
-
**Chrome extension (WIP):** load unpacked from [recorder-extension/](recorder-extension/README.md), run `pqa record start --connect 9222`, and use the popup for notes/checkpoints.
|
|
491
|
-
|
|
492
|
-
**Exit codes:** `0` pass · `1` failure · `2` config/harness error
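A CI wrapper could branch on these exit codes along the following lines. The type `RunOutcome` and the function `interpretExitCode` are hypothetical names for this sketch; only the three numeric codes come from the README.

```typescript
type RunOutcome = "pass" | "failure" | "config-or-harness-error" | "unknown";

/** Map a `pqa` process exit code to the outcome documented in the README. */
function interpretExitCode(code: number): RunOutcome {
  switch (code) {
    case 0:
      return "pass";
    case 1:
      return "failure";
    case 2:
      return "config-or-harness-error";
    default:
      // Codes outside the documented set (for example, a signal-terminated process).
      return "unknown";
  }
}
```

Separating `1` from `2` lets a pipeline treat a broken setup differently from a genuine scenario failure, for example by skipping retries when the configuration itself is wrong.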
|
|
493
|
-
|
|
494
|
-
## System prompt & skills
|
|
495
|
-
|
|
496
|
-
| File / skill | Role |
|
|
497
|
-
| --- | --- |
|
|
498
|
-
| [prompt/SYSTEM.md](prompt/SYSTEM.md) | Agent system prompt (workflow, verdict schema, rules) |
|
|
499
|
-
| `core` | Vendored agent-browser skill at `skills/agent-browser/` (bundled with the package) |
|
|
500
|
-
|
|
501
|
-
`prompt/SYSTEM.md` is loaded as the system prompt; `core` is appended as a supplemental skill. Browser control stays in bash — the agent runs `agent-browser` commands directly.
|
|
502
|
-
|
|
503
|
-
The system prompt enforces an **Observe-Act-Verify loop**: snapshot before each UI interaction, one interaction command per bash call, re-snapshot after page changes, and targeted reasoning only at ambiguous refs, failures, or before the final verdict. See [prompt/SYSTEM.md](prompt/SYSTEM.md) for details.
|
|
504
|
-
|
|
505
|
-
## Auth (hybrid authStore)
|
|
506
|
-
|
|
507
|
-
Map auth profiles to on-demand login scenarios via the [`auth`](#auth-object) config block. See [scenario format — Auth](prompt/references/scenario-format.md#auth-hybrid-authstore).
|
|
508
|
-
|
|
509
|
-
When a consumer scenario uses `auth: admin`, the harness loads cached state from `.pqa/auth/` or runs `login-admin` once, saves browser state, then continues.
|
|
510
|
-
|
|
511
|
-
```bash
|
|
512
|
-
# Inspect / invalidate cache
|
|
513
|
-
pqa auth list
|
|
514
|
-
pqa auth clear admin
|
|
515
|
-
|
|
516
|
-
# Force fresh login
|
|
517
|
-
pqa run scenarios/**/*.md --auth-refresh
|
|
518
|
-
```
|
|
519
|
-
|
|
520
|
-
**CI:** pass test credentials as GitHub Secrets → env vars (`PQA_TEST_EMAIL`, etc.) referenced in auth scenario Steps. Optionally pre-seed state from a base64 secret before the run.
|
|
521
|
-
|
|
522
|
-
Legacy manual capture (runs the configured auth scenario):
|
|
523
|
-
|
|
524
|
-
```bash
|
|
525
|
-
pqa auth save admin
|
|
526
|
-
```
|
|
527
|
-
|
|
528
|
-
## Self-healing (conservative)
|
|
529
|
-
|
|
530
|
-
When [`healing.enabled`](#healing-object-optional) is `true` (default), Prose-QA can:
|
|
531
|
-
|
|
532
|
-
1. **In-run recovery** — after a failed verdict, retry verification of failed checkpoints only (same browser session), for **transient** failures (timeouts, stale refs).
|
|
533
|
-
2. **Scenario retries** — `--retries N` with `--retries-policy transient` (default) re-runs the whole scenario only when the failure is classified transient. Use `--no-healing` for legacy behavior (any failure retries).
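The scenario-retry rule in item 2 can be summarized as a small decision function. The `RetryOptions` shape and the name `shouldRetryScenario` are assumptions for illustration; the behaviour follows the README's description of `--retries`, `--retries-policy`, and `--no-healing`.

```typescript
type RetriesPolicy = "transient" | "always";

interface RetryOptions {
  retries: number; // value of --retries N
  policy: RetriesPolicy; // value of --retries-policy
  healingEnabled: boolean; // false when --no-healing is passed
}

/** Decide whether a failed attempt (0-based `attempt`) should be re-run. */
function shouldRetryScenario(
  attempt: number,
  failureIsTransient: boolean,
  opts: RetryOptions,
): boolean {
  // Retry budget exhausted.
  if (attempt >= opts.retries) return false;
  // Per the README, disabling healing restores the legacy "any failure retries" behaviour.
  if (!opts.healingEnabled || opts.policy === "always") return true;
  return failureIsTransient;
}
```

With the default `transient` policy, a deterministic assertion failure is reported immediately instead of consuming the retry budget.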
|
|
114
|
+
Tools: `validate_scenario`, `run_scenario`, `get_create_pqa_scenario_skill`. Details: [HOWTO §8](docs/HOWTO.md#8-mcp--author-skill).
|
|
534
115
|
|
|
535
|
-
|
|
116
|
+
## Development (this repo)
|
|
536
117
|
|
|
537
118
|
```bash
|
|
538
|
-
|
|
539
|
-
|
|
119
|
+
git clone https://github.com/FreakDev/Prose-QA.git
|
|
120
|
+
cd Prose-QA
|
|
121
|
+
npm ci && npm run build
|
|
540
122
|
|
|
541
|
-
|
|
542
|
-
pqa analyze
|
|
123
|
+
export PQA_LLM_API_KEY=...
|
|
543
124
|
|
|
544
|
-
|
|
545
|
-
|
|
125
|
+
npm run demo:server # terminal 1 — http://127.0.0.1:8080/
|
|
126
|
+
npm run dev -- debug scenarios/0_hello-world.md --verbose
|
|
546
127
|
```
|
|
547
128
|
|
|
548
|
-
|
|
549
|
-
|
|
550
|
-
## Reports
|
|
551
|
-
|
|
552
|
-
Runs write artifacts to `.pqa/runs/<runId>/`:
|
|
553
|
-
|
|
554
|
-
- `report.json` / `report.html` — summary
|
|
555
|
-
- `analyze.json` / `analyze-llm.json` — written by `pqa analyze` (single run)
|
|
556
|
-
- `.pqa/analyze/<timestamp>/analyze-flaky.json` / `analyze-llm.json` — multi-run flaky analysis
|
|
557
|
-
- `<scenario>/transcript.json` — bash commands + agent messages
|
|
558
|
-
- `<scenario>/verdict.json` — structured pass/fail
|
|
559
|
-
|
|
560
|
-
## CI
|
|
561
|
-
|
|
562
|
-
See [.github/workflows/smoke_tests.yml](.github/workflows/smoke_tests.yml). Unit tests run on every push. Optional smoke PQA runs require `ANTHROPIC_API_KEY` (or configure another provider via env).
|
|
563
|
-
|
|
564
|
-
## Security
|
|
565
|
-
|
|
566
|
-
See [SECURITY.md](SECURITY.md) for vulnerability reporting and guidance on run artifacts and credentials.
|
|
129
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md) and [docs/HOWTO.md](docs/HOWTO.md) for the full walkthrough.
|
|
567
130
|
|
|
568
131
|
## License
|
|
569
132
|
|
|
570
|
-
MIT — see [LICENSE](LICENSE).
|
|
133
|
+
MIT — see [LICENSE](LICENSE).
|
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"llm-model.d.ts","sourceRoot":"","sources":["../../src/agent/llm-model.ts"],"names":[],"mappings":"AAMA,OAAO,KAAK,EAAE,aAAa,EAAE,MAAM,IAAI,CAAC;
|
|
1
|
+
{"version":3,"file":"llm-model.d.ts","sourceRoot":"","sources":["../../src/agent/llm-model.ts"],"names":[],"mappings":"AAMA,OAAO,KAAK,EAAE,aAAa,EAAE,MAAM,IAAI,CAAC;AAExC,OAAO,KAAK,EAAE,SAAS,EAAE,MAAM,oBAAoB,CAAC;AASpD,gFAAgF;AAChF,wBAAgB,cAAc,CAAC,MAAM,EAAE,SAAS,GAAG,aAAa,CAwB/D"}
|