@exodus/xqa 1.2.3 → 1.4.0

package/README.md CHANGED
@@ -1,166 +1,220 @@
1
1
  # @exodus/xqa
2
2
 
3
- CLI for running AI-powered QA agents against Exodus mobile apps on iOS.
3
+ AI-powered QA agent CLI for Exodus applications.
4
4
 
5
- ## Prerequisites
5
+ ## Overview
6
6
 
7
- - Node >= 22
8
- - pnpm
9
- - An Anthropic API key
7
+ `xqa` automates mobile app QA by connecting to physical devices or emulators and running intelligent exploration and spec-based testing. The CLI orchestrates the pipeline that spawns agents to interact with your app, capture screenshots, and generate findings based on user-defined specs or breadth-first exploration.
10
8
 
11
- ## Installation
9
+ The tool manages configuration, project initialization, session state tracking, and interactive review workflows for triaging findings.
12
10
 
13
- From the monorepo root:
11
+ ## Commands
14
12
 
15
- ```bash
16
- pnpm install
17
- ```
13
+ ### init
18
14
 
19
- Then build and link the CLI globally:
15
+ Initialize a new xqa project in the current directory.
16
+
17
+ Creates a `.xqa/` directory with templates and subdirectories for specs, designs, and suites. Installs the `xqa-spec` skill for creating test specs.
20
18
 
21
19
  ```bash
22
- pnpm build:link # build + link `xqa` into PATH
20
+ xqa init
23
21
  ```
24
22
 
25
- For active development:
23
+ ### explore [prompt]
24
+
25
+ Run the explorer agent; omit prompt for a full breadth-first sweep.
26
+
27
+ The optional `prompt` argument is a focus hint for the explorer agent; omit it to explore the entire app from the starting state. The run generates a findings JSON file in `.xqa/output/` and prints its path on completion.
26
28
 
27
29
  ```bash
28
- pnpm dev:link # build, link, and watch for changes
30
+ xqa explore # breadth-first exploration
31
+ xqa explore "test the login flow" # focused exploration
32
+ xqa explore -v prompt,screen # verbose output for categories
33
+ xqa explore -v # verbose output for all categories
29
34
  ```
30
35
 
31
- ## Setup
36
+ Flag: `-v, --verbose [categories]` — Log categories (prompt, tools, screen, memory). Default: all if flag is present without value.
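The flag semantics described here can be sketched as a small pure function. This is an illustrative sketch only, not the contents of the package's `parse-verbose.ts`: the name `parseVerbose`, the decision to throw on unknown categories, and the assumption that a bare flag arrives as `true` (a common convention for options with an optional value) are all hypothetical.

```typescript
// Hypothetical sketch of verbose-flag parsing; not taken from the package source.
// Only the four category names come from the README.
const VERBOSE_CATEGORIES = ["prompt", "tools", "screen", "memory"] as const;
type VerboseCategory = (typeof VERBOSE_CATEGORIES)[number];

function isVerboseCategory(value: string): value is VerboseCategory {
  return (VERBOSE_CATEGORIES as readonly string[]).includes(value);
}

// `raw` is undefined when the flag is absent, `true` when the flag is given
// without a value, or a comma-separated string such as "prompt,screen".
function parseVerbose(raw: string | boolean | undefined): VerboseCategory[] {
  if (raw === undefined || raw === false) return [];
  if (raw === true) return [...VERBOSE_CATEGORIES];

  const names = raw
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);

  // Assumption: unknown categories are rejected rather than silently ignored.
  const unknown = names.filter((name) => !isVerboseCategory(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown verbose categories: ${unknown.join(", ")}`);
  }
  return names.filter(isVerboseCategory);
}
```

Under these assumptions, `xqa explore -v` would correspond to `parseVerbose(true)` and `xqa explore -v prompt,screen` to `parseVerbose("prompt,screen")`.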
32
37
 
33
- Copy the example env file and fill in your values:
38
+ ### spec [spec-file]
39
+
40
+ Run the explorer agent against a spec file.
41
+
42
+ Loads a spec markdown file from `.xqa/specs/` (or an absolute path) and executes the agent against it. Spec files define entry points, steps, and optional timeouts. Omit the argument to pick from available specs interactively.
34
43
 
35
44
  ```bash
36
- cp .env.example .env.local
45
+ xqa spec # interactive spec picker
46
+ xqa spec .xqa/specs/authentication.test.md # explicit spec file
47
+ xqa spec -v tools,memory # verbose output
37
48
  ```
38
49
 
39
- `.env.local` is loaded automatically at startup.
50
+ Flag: `-v, --verbose [categories]` — Same as explore.
40
51
 
41
- ## Environment Variables
52
+ Spec file format (YAML frontmatter + markdown):
42
53
 
43
- | Variable | Required | Default | Description |
44
- | ------------------------------ | -------- | ---------------- | ------------------------------------------------------------------------------------------- |
45
- | `ANTHROPIC_API_KEY` | Yes | — | Anthropic API key |
46
- | `GOOGLE_GENERATIVE_AI_API_KEY` | No | — | Gemini key — enables video analysis; required for `xqa analyse` |
47
- | `QA_RUN_ID` | No | auto-generated | Fixed run ID; auto-incremented when omitted |
48
- | `QA_EXPLORE_TIMEOUT_SECONDS` | No | — | Max wall-clock time for an explore or spec run |
49
- | `QA_WALLET_MNEMONIC` | No | — | Wallet mnemonic; agent restores wallet before exploring when set |
50
- | `QA_BUILD_ENV` | No | `prod` | `dev` or `prod`; `dev` mode ignores debug overlays |
51
- | `QA_STARTUP_STATE` | No | — | `portfolio`, `new-wallet`, or `restore-wallet`; unset means app starts in its current state |
52
- | `QA_DESIGNS_DIR` | No | `./.xqa/designs` | Design artboards directory; enables visual regression checks when set |
54
+ ```markdown
55
+ ---
56
+ feature: 'Feature Name'
57
+ entry: 'Screen name or navigation path'
58
+ timeout: 300
59
+ ---
53
60
 
54
- ## Commands
61
+ # Spec content
62
+ ```
63
+
64
+ ### review [findings-path]
55
65
 
56
- ### `xqa explore [prompt]`
66
+ Review findings and mark false positives.
57
67
 
58
- Runs the explorer agent against the live simulator. Without a prompt the agent sweeps the entire app. With a prompt it focuses on the described flow.
68
+ Interactive session for triaging findings generated by explore or spec runs. Displays findings with confidence scores, steps, and screenshots. Mark findings as false positives (with optional reason) or undo previous dismissals. Saves dismissals to `.xqa/dismissals.json`. Defaults to the last findings path if omitted.
59
69
 
60
70
  ```bash
61
- xqa explore
62
- xqa explore "Try to send Bitcoin to an external address"
63
- xqa explore --verbose
71
+ xqa review # use last findings file
72
+ xqa review .xqa/output/findings-abc123.json # explicit path
64
73
  ```
65
74
 
66
- Startup state (`QA_STARTUP_STATE`) controls what the agent sees on launch:
75
+ ### analyse [video-path]
76
+
77
+ Analyse a session recording with Gemini.
67
78
 
68
- - `portfolio` — main assets screen (default)
69
- - `new-wallet` — onboarding screen; agent taps through setup
70
- - `restore-wallet` — onboarding screen; agent restores wallet using `QA_WALLET_MNEMONIC`
79
+ Requires `GOOGLE_GENERATIVE_AI_API_KEY` in the environment. Analyses a video file recorded during exploration and outputs findings as JSON.
80
+
81
+ ```bash
82
+ xqa analyse /path/to/video.mp4
83
+ ```
71
84
 
72
- When `GOOGLE_GENERATIVE_AI_API_KEY` is set, a Gemini video analyser runs automatically after the explorer finishes.
85
+ ### completion <shell>
73
86
 
74
- ### `xqa spec <spec-file>`
87
+ Output shell completion script.
75
88
 
76
- Runs the explorer against a markdown spec file. The agent navigates to the entry point defined in the frontmatter and verifies each described step.
89
+ Generates a completion script for bash or zsh. Append the output to your shell config file to enable tab completion.
77
90
 
78
91
  ```bash
79
- xqa spec path/to/send-flow.md
80
- xqa spec path/to/send-flow.md --verbose
92
+ xqa completion bash # generate bash completions
93
+ xqa completion zsh # generate zsh completions
81
94
  ```
82
95
 
83
- Spec file format:
96
+ ## Configuration
84
97
 
85
- ```markdown
86
- ---
87
- feature: Send Flow
88
- entry: Assets list
89
- max_steps: 40
90
- ---
98
+ Configuration is loaded from environment variables and `.env.local`:
91
99
 
92
- Steps describing the flow to verify...
93
- ```
100
+ - `ANTHROPIC_API_KEY` (required) — Anthropic Claude API key for agent reasoning
101
+ - `GOOGLE_GENERATIVE_AI_API_KEY` (optional) — Google Generative AI key for video analysis
102
+ - `QA_RUN_ID` (optional) — Custom run identifier; defaults to auto-generated
103
+ - `QA_EXPLORE_TIMEOUT_SECONDS` (optional) — Exploration timeout in seconds
104
+ - `QA_BUILD_ENV` (optional) — Build environment: `dev` or `prod` (default: prod)
105
+
106
+ ## Architecture
94
107
 
95
- | Field | Required | Description |
96
- | ----------- | -------- | -------------------------------------------------- |
97
- | `feature` | Yes | Human-readable feature name |
98
- | `entry` | Yes | Screen name the agent navigates to before starting |
99
- | `max_steps` | No | Maximum number of agent steps |
108
+ Key files and directories:
100
109
 
101
- ### `xqa analyse <video-path>`
110
+ - `src/index.ts` — CLI entry point; wires commander commands and manages graceful shutdown via process locks
111
+ - `src/commands/` — Command implementations (init, explore, spec, review, analyse, completion)
112
+ - `src/core/` — Pure functions: spec parsing, completion generation, verbose option parsing, last-path tracking
113
+ - `src/shell/` — I/O wrappers: file reading, device discovery, app context loading
114
+ - `src/config.ts`, `src/config-schema.ts` — Configuration loading and validation with Zod
115
+ - `src/review-session.ts` — Interactive finding review loop with dismissal tracking
116
+ - `src/spec-frontmatter.ts` — Spec markdown frontmatter parsing (YAML)
117
+ - `src/spec-slug.ts` — Spec filename to slug derivation for output organization
118
+ - `src/pid-lock.ts` — Process-level mutual exclusion to prevent concurrent runs
102
119
 
103
- Analyses a session recording with Gemini. Requires `GOOGLE_GENERATIVE_AI_API_KEY`. Prints findings as JSON to stdout.
120
+ ## Error Types
121
+
122
+ Core error discriminated unions:
123
+
124
+ - `ConfigError` — Configuration validation failed (INVALID_CONFIG)
125
+ - `AppContextError` — Failed to read app.md or explore.md (READ_FAILED)
126
+ - `XqaDirectoryError` — No .xqa directory found (XQA_NOT_INITIALIZED)
127
+ - `SpecFrontmatterError` — Malformed spec markdown (MISSING_FRONTMATTER, MISSING_FIELD, PARSE_ERROR)
128
+ - `LastPathError` — No findings path provided and no prior session (NO_ARG_AND_NO_STATE)
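To illustrate how discriminated unions like these are typically consumed, here is a minimal sketch covering two of the listed errors. Only the error names and codes come from the README; the `type` discriminant field, the `Result` wrapper, and the function names `resolveFindingsPath` and `describeError` are assumptions made for illustration.

```typescript
// Hypothetical shapes: only the error names and codes appear in the README.
type LastPathError = { type: "NO_ARG_AND_NO_STATE" };
type XqaDirectoryError = { type: "XQA_NOT_INITIALIZED" };
type ReviewError = LastPathError | XqaDirectoryError;

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Prefer the explicit argument, then fall back to the last recorded findings path.
function resolveFindingsPath(
  arg: string | undefined,
  lastPath: string | undefined,
): Result<string, LastPathError> {
  const chosen = arg ?? lastPath;
  if (chosen === undefined) {
    return { ok: false, error: { type: "NO_ARG_AND_NO_STATE" } };
  }
  return { ok: true, value: chosen };
}

// Exhaustive switch: the `never` assignment stops compiling if a new
// variant is added to ReviewError without a matching case.
function describeError(error: ReviewError): string {
  switch (error.type) {
    case "NO_ARG_AND_NO_STATE":
      return "No findings path provided and no prior session";
    case "XQA_NOT_INITIALIZED":
      return "No .xqa directory found";
    default: {
      const unreachable: never = error;
      return unreachable;
    }
  }
}
```

The appeal of this pattern is that callers handle failures as ordinary values and the compiler checks that every error code has a branch.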
129
+
130
+ ## Development
131
+
132
+ Install dependencies:
104
133
 
105
134
  ```bash
106
- xqa analyse .xqa/output/2026-04-10/0001/recording.mp4
135
+ pnpm install
107
136
  ```
108
137
 
109
- ### `xqa review [findings-path]`
138
+ Build the CLI:
110
139
 
111
- Interactive terminal session for reviewing findings and marking false positives. Requires a TTY. Dismissals are persisted to a dismissals store and excluded from future runs.
140
+ ```bash
141
+ pnpm run build
142
+ ```
143
+
144
+ Run tests:
112
145
 
113
146
  ```bash
114
- xqa review .xqa/output/2026-04-10/0001/findings.json
147
+ pnpm run test
148
+ ```
149
+
150
+ Type check:
115
151
 
116
- # re-open the last reviewed findings file
117
- xqa review
152
+ ```bash
153
+ pnpm run typecheck
118
154
  ```
119
155
 
120
- ### `xqa completion <shell>`
156
+ Lint and format:
121
157
 
122
- Outputs a shell completion script.
158
+ ```bash
159
+ pnpm run lint
160
+ pnpm run lint:fix
161
+ ```
162
+
163
+ Full quality check (lint, typecheck, test):
123
164
 
124
165
  ```bash
125
- xqa completion zsh >> ~/.zshrc
126
- xqa completion bash >> ~/.bashrc
166
+ pnpm run check
167
+ pnpm run check:fix
127
168
  ```
128
169
 
129
- ## Process Behaviour
170
+ Watch mode (build + re-run on file changes):
130
171
 
131
- Only one `xqa` instance runs at a time (PID lock). A second invocation while a run is active will exit immediately with an error.
172
+ ```bash
173
+ pnpm run dev
174
+ ```
132
175
 
133
- - `Ctrl+C` once: graceful shutdown — the current agent step completes, findings are written, then the process exits
134
- - `Ctrl+C` twice: force exit
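The two-stage interrupt behaviour described in this list can be modelled as a tiny state machine. This is a sketch under stated assumptions, not the package's implementation: `createInterruptHandler` and `InterruptAction` are hypothetical names, and the PID lock itself is not shown.

```typescript
// Hypothetical model of the Ctrl+C behaviour; not taken from the package source.
type InterruptAction = "graceful" | "force";

// Returns a handler that reports what each successive interrupt should do:
// the first requests a graceful shutdown, every later one forces an exit.
function createInterruptHandler(): () => InterruptAction {
  let interrupts = 0;
  return () => {
    interrupts += 1;
    return interrupts === 1 ? "graceful" : "force";
  };
}
```

In a Node.js CLI, a handler like this could be invoked from a `process.on("SIGINT", ...)` listener, finishing the current step and writing findings on `"graceful"` and exiting immediately on `"force"`.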
176
+ Link binary globally (symlinks dist/xqa.cjs to ~/.local/bin/xqa):
135
177
 
136
- ## Development
178
+ ```bash
179
+ pnpm run build:link
180
+ ```
181
+
182
+ Unlink binary:
137
183
 
138
184
  ```bash
139
- pnpm dev # watch build
140
- pnpm build # production build
141
- pnpm build:link # build + link `xqa` globally
142
- pnpm dev:link # watch build + link
143
- pnpm test # run Vitest test suite
144
- pnpm typecheck # TypeScript type check
145
- pnpm lint # ESLint + Prettier check
146
- pnpm lint:fix # ESLint + Prettier auto-fix
147
- pnpm check # lint + typecheck + test (affected only)
148
- pnpm check:fix # lint:fix + typecheck + test (affected only)
185
+ pnpm run build:unlink
149
186
  ```
150
187
 
151
- ## Architecture
188
+ ## Project Structure
152
189
 
153
190
  ```
154
191
  src/
155
- index.ts # CLI entry — registers all commands
156
- config-schema.ts # Zod schema for all environment variables
192
+ index.ts # CLI entry point
193
+ config.ts # Config loading and types
194
+ config-schema.ts # Zod schema for env vars
195
+ constants.ts # Tool lists and timeouts
196
+ pid-lock.ts # Process exclusion lock
197
+ spec-slug.ts # Spec file to slug conversion
198
+ spec-frontmatter.ts # Spec YAML parsing
199
+ review-session.ts # Interactive finding review loop
200
+
157
201
  commands/
158
- explore-command.ts # xqa explore
159
- spec-command.ts # xqa spec
160
- analyse-command.ts # xqa analyse
161
- review-command.ts # xqa review
162
- completion-command.ts # xqa completion
163
- prompt-builder.ts # builds the explorer system prompt from config
202
+ init-command.ts # Project initialization
203
+ explore-command.ts # Breadth-first exploration
204
+ spec-command.ts # Spec-based exploration
205
+ review-command.ts # Finding triage workflow
206
+ analyse-command.ts # Video analysis
207
+ completion-command.ts # Shell completion generation
208
+
209
+ core/
210
+ parse-verbose.ts # Verbose flag parsing
211
+ completion-generator.ts # Bash/zsh completion script generation
212
+ last-path.ts # Last findings path tracking
213
+
214
+ shell/
215
+ app-context.ts # Read app.md and explore.md
216
+ xqa-directory.ts # Locate .xqa directory
217
+
218
+ __tests__/
219
+ *.test.ts # Test files co-located with src/
164
220
  ```
165
-
166
- The CLI is a thin shell over `@qa-agents/pipeline`. It parses env vars, builds a `PipelineConfig`, and calls `runPipeline()`.
@@ -0,0 +1,99 @@
1
+ # xqa-spec
2
+
3
+ ## When to use
4
+
5
+ - User runs `/xqa-spec` with a flow description
6
+ - User implies spec authoring intent: "I want to test X", "write a spec for Y", "update the Z spec"
7
+
8
+ Detect implied intent and self-activate without requiring explicit slash command.
9
+
10
+ ## Process
11
+
12
+ ```
13
+ Explore → Detect mode → Interview (one question at a time) → Draft → Review → Write
14
+ ```
15
+
16
+ IMPORTANT: Never generate a draft before the interview is complete. The user describes the spec; you transcribe it.
17
+
18
+ ### 1. Explore
19
+
20
+ Silently scan `.xqa/specs/*.test.md`. Learn:
21
+
22
+ - Naming conventions
23
+ - Tag vocabulary
24
+ - Level of detail and step granularity
25
+
26
+ Also read `.xqa/app.md` if it exists for app context.
27
+
28
+ ### 2. Detect mode
29
+
30
+ | Condition | Mode |
31
+ | ------------------------ | ----------------------------------------------- |
32
+ | Matching spec file found | Edit — read it, ask which sections to change |
33
+ | No match | Create — derive kebab-case filename from intent |
34
+
35
+ In **edit mode**: ask which sections to change before doing anything. Modify only those sections; preserve everything else verbatim.
36
+
37
+ ### 3. Interview (create mode only)
38
+
39
+ Ask one question at a time. Wait for the answer before asking the next. Prefer multiple choice when options are known.
40
+
41
+ **Question sequence:**
42
+
43
+ 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
44
+ 2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
45
+ 3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
46
+ 4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
47
+ 5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
48
+ 6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
49
+
50
+ IMPORTANT: Ask each question in its own message. Never batch questions.
51
+
52
+ ### 4. Draft
53
+
54
+ Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
55
+
56
+ Steps and assertions come from the user — never invent them. Present the full draft for review.
57
+
58
+ ### 5. Review
59
+
60
+ Show the draft. Ask: "Does this look right, or anything to change?"
61
+
62
+ Iterate until approved. One round of changes per message.
63
+
64
+ ### 6. Write
65
+
66
+ Save to `.xqa/specs/<name>.test.md` only after explicit approval.
67
+
68
+ ## File format
69
+
70
+ ```md
71
+ ---
72
+ feature: <string>
73
+ entry: <string>
74
+ timeout: <seconds>
75
+ ---
76
+
77
+ ## Setup
78
+
79
+ <preconditions and starting state>
80
+
81
+ ## Steps
82
+
83
+ 1. <action> → <expected outcome>
84
+ 2. <action>
85
+
86
+ ## Assertions
87
+
88
+ - <global flow-level check>
89
+ ```
90
+
91
+ Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
92
+
93
+ ## Rules
94
+
95
+ - `## Setup` and `## Steps` are required; `## Assertions` is optional
96
+ - Inline assertion syntax: `action → outcome` using the → character
97
+ - Steps come from the user — never invent them
98
+ - Write file only after explicit approval
99
+ - In edit mode, ask before touching anything
@@ -0,0 +1,125 @@
1
+ ---
2
+ name: xqa-spec
3
+ description: Create or edit *.test.md spec files in .xqa/specs/ through guided dialogue. Triggers on /xqa-spec or implied spec authoring intent ("I want to test X", "write a spec for Y", "update the Z spec").
4
+ license: MIT
5
+ ---
6
+
7
+ # xqa-spec
8
+
9
+ ## When to use
10
+
11
+ - User runs `/xqa-spec` with a flow description
12
+ - User implies spec authoring intent: "I want to test X", "write a spec for Y", "update the Z spec"
13
+
14
+ Detect implied intent and self-activate without requiring explicit slash command.
15
+
16
+ ## Process
17
+
18
+ ```
19
+ Explore → Detect mode → Interview (one question at a time) → Draft → Review → Write
20
+ ```
21
+
22
+ IMPORTANT: Never generate a draft before the interview is complete. The user describes the spec; you transcribe it.
23
+
24
+ ### 1. Explore
25
+
26
+ Silently scan `.xqa/specs/*.test.md`. Learn:
27
+
28
+ - Naming conventions
29
+ - Tag vocabulary
30
+ - Level of detail and step granularity
31
+
32
+ Also read `.xqa/app.md` if it exists for app context.
33
+
34
+ ### 2. Detect mode
35
+
36
+ | Condition | Mode |
37
+ | ------------------------ | ----------------------------------------------- |
38
+ | Matching spec file found | Edit — read it, ask which sections to change |
39
+ | No match | Create — derive kebab-case filename from intent |
40
+
41
+ In **edit mode**: ask which sections to change before doing anything. Modify only those sections; preserve everything else verbatim.
42
+
43
+ ### 3. Interview (create mode only)
44
+
45
+ Ask one question at a time. Wait for the answer before asking the next. Prefer multiple choice when options are known.
46
+
47
+ **Question sequence:**
48
+
49
+ 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
50
+ 2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
51
+ 3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
52
+ 4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
53
+ 5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
54
+ 6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
55
+
56
+ IMPORTANT: Ask each question in its own message. Never batch questions.
57
+
58
+ ### 4. Draft
59
+
60
+ Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
61
+
62
+ Steps and assertions come from the user — never invent them. Present the full draft for review.
63
+
64
+ ### 5. Review
65
+
66
+ Show the draft. Ask: "Does this look right, or anything to change?"
67
+
68
+ Iterate until approved. One round of changes per message.
69
+
70
+ ### 6. Write
71
+
72
+ Before writing, verify the draft passes all checks:
73
+
74
+ - [ ] `feature` is present and non-empty
75
+ - [ ] frontmatter contains only permitted fields: `feature`, `entry`, `timeout`
76
+ - [ ] `timeout` if present is a positive number in seconds (not a string, not zero)
77
+ - [ ] `## Setup` section is present
78
+ - [ ] `## Steps` section is present
79
+ - [ ] No forbidden fields: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
80
+
81
+ Fix any failure before writing. Save to `.xqa/specs/<name>.test.md` only after explicit approval.
82
+
83
+ ## File format
84
+
85
+ FRONTMATTER SCHEMA — exact fields, exact types, no others:
86
+
87
+ ```
88
+ feature string REQUIRED
89
+ entry string OPTIONAL — omit if not provided
90
+ timeout positive number (seconds) OPTIONAL — omit if not provided
91
+ ```
92
+
93
+ FORBIDDEN frontmatter fields — never generate these: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
94
+
95
+ CANONICAL OUTPUT FORMAT:
96
+
97
+ ```md
98
+ ---
99
+ feature: <string>
100
+ entry: <string>
101
+ timeout: <seconds>
102
+ ---
103
+
104
+ ## Setup
105
+
106
+ <preconditions and starting state>
107
+
108
+ ## Steps
109
+
110
+ 1. <action> → <expected outcome>
111
+ 2. <action>
112
+
113
+ ## Assertions
114
+
115
+ - <global flow-level check>
116
+ ```
117
+
118
+ Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
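The schema and checklist above can be expressed as a validator. The following is a minimal sketch, not the package's `spec-frontmatter.ts`: it reads only flat `key: value` lines instead of using a real YAML parser, and the function name `parseSpecFrontmatter`, the `type` discriminant, and the choice to report non-permitted fields and bad timeouts as `PARSE_ERROR` are assumptions. Only the three permitted fields and the three error codes come from the source.

```typescript
// Hypothetical validator; error codes mirror those listed for SpecFrontmatterError.
type SpecFrontmatter = { feature: string; entry?: string; timeout?: number };
type SpecFrontmatterError =
  | { type: "MISSING_FRONTMATTER" }
  | { type: "MISSING_FIELD"; field: string }
  | { type: "PARSE_ERROR"; message: string };
type ParseResult =
  | { ok: true; value: SpecFrontmatter }
  | { ok: false; error: SpecFrontmatterError };

const PERMITTED_FIELDS = ["feature", "entry", "timeout"];

// Remove one pair of matching surrounding quotes, if present.
function stripQuotes(value: string): string {
  return value.replace(/^(['"])(.*)\1$/, "$2");
}

function parseError(message: string): ParseResult {
  return { ok: false, error: { type: "PARSE_ERROR", message } };
}

function parseSpecFrontmatter(markdown: string): ParseResult {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (match === null) return { ok: false, error: { type: "MISSING_FRONTMATTER" } };

  // Simplification: flat `key: value` lines only, no nested YAML.
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const separator = line.indexOf(":");
    if (separator === -1) return parseError(`Not a key: value line: ${line}`);
    const key = line.slice(0, separator).trim();
    if (!PERMITTED_FIELDS.includes(key)) return parseError(`Field not permitted: ${key}`);
    fields.set(key, line.slice(separator + 1).trim());
  }

  const feature = stripQuotes(fields.get("feature") ?? "");
  if (feature === "") return { ok: false, error: { type: "MISSING_FIELD", field: "feature" } };

  const result: SpecFrontmatter = { feature };
  const entry = fields.get("entry");
  if (entry !== undefined) result.entry = stripQuotes(entry);

  const rawTimeout = fields.get("timeout");
  if (rawTimeout !== undefined) {
    // A quoted value such as '300' becomes NaN here, so strings are rejected.
    const timeout = Number(rawTimeout);
    if (!Number.isFinite(timeout) || timeout <= 0) {
      return parseError("timeout must be a positive number of seconds");
    }
    result.timeout = timeout;
  }
  return { ok: true, value: result };
}
```

With this sketch, a spec that still carries the old `max_steps` field is rejected before any agent run starts, which matches the intent of the forbidden-fields list.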
119
+
120
+ ## Rules
121
+
122
+ - Inline assertion syntax: `action → outcome` using the → character
123
+ - Steps come from the user — never invent them
124
+ - Write file only after explicit approval
125
+ - In edit mode, ask before touching anything
@@ -0,0 +1,5 @@
1
+ {
2
+ "version": "1.0.0",
3
+ "organization": "Exodus Movement",
4
+ "abstract": "Guides QA engineers through creating and editing *.test.md spec files in .xqa/specs/ using a structured interview-first workflow. Asks one question at a time to extract setup, steps, and assertions from the user before drafting."
5
+ }