@exodus/xqa 1.3.0 → 1.4.0

package/README.md CHANGED
@@ -1,166 +1,220 @@
1
1
  # @exodus/xqa
2
2
 
3
- CLI for running AI-powered QA agents against Exodus mobile apps on iOS.
3
+ AI-powered QA agent CLI for Exodus applications.
4
4
 
5
- ## Prerequisites
5
+ ## Overview
6
6
 
7
- - Node >= 22
8
- - pnpm
9
- - An Anthropic API key
7
+ `xqa` automates mobile app QA by connecting to physical devices or emulators and running intelligent exploration and spec-based testing. The CLI orchestrates the pipeline that spawns agents to interact with your app, capture screenshots, and generate findings based on user-defined specs or breadth-first exploration.
10
8
 
11
- ## Installation
9
+ The tool manages configuration, project initialization, session state tracking, and interactive review workflows for triaging findings.
12
10
 
13
- From the monorepo root:
11
+ ## Commands
14
12
 
15
- ```bash
16
- pnpm install
17
- ```
13
+ ### init
18
14
 
19
- Then build and link the CLI globally:
15
+ Initialize a new xqa project in the current directory.
16
+
17
+ Creates a `.xqa/` directory with templates and subdirectories for specs, designs, and suites. Installs the `xqa-spec` skill for creating test specs.
20
18
 
21
19
  ```bash
22
- pnpm build:link # build + link `xqa` into PATH
20
+ xqa init
23
21
  ```
24
22
 
25
- For active development:
23
+ ### explore [prompt]
24
+
25
+ Run the explorer agent; omit prompt for a full breadth-first sweep.
26
+
27
+ The optional `prompt` is a focus hint for the explorer agent. Omit it to explore the entire app from the starting state. Generates a findings JSON file in `.xqa/output/` and prints the path upon completion.
26
28
 
27
29
  ```bash
28
- pnpm dev:link # build, link, and watch for changes
30
+ xqa explore # breadth-first exploration
31
+ xqa explore "test the login flow" # focused exploration
32
+ xqa explore -v prompt,screen # verbose output for categories
33
+ xqa explore -v # verbose output for all categories
29
34
  ```
30
35
 
31
- ## Setup
36
+ Flag: `-v, --verbose [categories]` — Log categories (prompt, tools, screen, memory). Default: all if flag is present without value.
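A minimal sketch of how the `-v, --verbose [categories]` value could map onto the category set that `isVerboseEnabled` checks in `dist/xqa.cjs`. The name `parseVerbose` is hypothetical, the assumption that a bare flag arrives as `true` is not confirmed by this diff, and dropping unknown category names is a choice made here for illustration only.

```javascript
// Categories listed in the README and in ALL_VERBOSE_CATEGORIES (dist/xqa.cjs).
const ALL_VERBOSE_CATEGORIES = new Set(["prompt", "tools", "screen", "memory"]);

// Hypothetical helper: turn the raw flag value into a Set, or undefined when the flag is absent.
function parseVerbose(raw) {
  if (raw === undefined || raw === false) {
    return undefined; // flag not passed: nothing is verbose
  }
  if (raw === true) {
    return new Set(ALL_VERBOSE_CATEGORIES); // bare `-v`: every category
  }
  const requested = String(raw)
    .split(",")
    .map((name) => name.trim())
    .filter((name) => ALL_VERBOSE_CATEGORIES.has(name)); // assumption: unknown names are ignored
  return new Set(requested);
}

// Same shape as the bundled helper: undefined means "not verbose".
function isVerboseEnabled(config, category) {
  return config !== undefined && config.has(category);
}

const verbose = parseVerbose("prompt,screen");
console.log(isVerboseEnabled(verbose, "screen")); // true
console.log(isVerboseEnabled(verbose, "tools")); // false
```

Keeping `undefined` distinct from an empty set mirrors the bundled code, where `createConsoleObserver` now passes `options?.verbose` through without defaulting it to `false`.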
32
37
 
33
- Copy the example env file and fill in your values:
38
+ ### spec [spec-file]
39
+
40
+ Run the explorer agent against a spec file.
41
+
42
+ Loads a spec markdown file from `.xqa/specs/` (or an absolute path) and executes the agent against it. Spec files define entry points, steps, and optional timeouts. Omit the argument to pick from available specs interactively.
34
43
 
35
44
  ```bash
36
- cp .env.example .env.local
45
+ xqa spec # interactive spec picker
46
+ xqa spec .xqa/specs/authentication.test.md # explicit spec file
47
+ xqa spec -v tools,memory # verbose output
37
48
  ```
38
49
 
39
- `.env.local` is loaded automatically at startup.
50
+ Flag: `-v, --verbose [categories]` — Same as explore.
40
51
 
41
- ## Environment Variables
52
+ Spec file format (YAML frontmatter + markdown):
42
53
 
43
- | Variable | Required | Default | Description |
44
- | ------------------------------ | -------- | ---------------- | ------------------------------------------------------------------------------------------- |
45
- | `ANTHROPIC_API_KEY` | Yes | — | Anthropic API key |
46
- | `GOOGLE_GENERATIVE_AI_API_KEY` | No | — | Gemini key — enables video analysis; required for `xqa analyse` |
47
- | `QA_RUN_ID` | No | auto-generated | Fixed run ID; auto-incremented when omitted |
48
- | `QA_EXPLORE_TIMEOUT_SECONDS` | No | — | Max wall-clock time for an explore or spec run |
49
- | `QA_WALLET_MNEMONIC` | No | — | Wallet mnemonic; agent restores wallet before exploring when set |
50
- | `QA_BUILD_ENV` | No | `prod` | `dev` or `prod`; `dev` mode ignores debug overlays |
51
- | `QA_STARTUP_STATE` | No | — | `portfolio`, `new-wallet`, or `restore-wallet`; unset means app starts in its current state |
52
- | `QA_DESIGNS_DIR` | No | `./.xqa/designs` | Design artboards directory; enables visual regression checks when set |
54
+ ```markdown
55
+ ---
56
+ feature: 'Feature Name'
57
+ entry: 'Screen name or navigation path'
58
+ timeout: 300
59
+ ---
53
60
 
54
- ## Commands
61
+ # Spec content
62
+ ```
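As a rough illustration of the format above, the following sketch parses flat `key: value` frontmatter and reports failures using the codes the README lists for `SpecFrontmatterError`. The function name `parseSpecFrontmatter` and the mapping of each condition to a code are assumptions; the package's own `src/spec-frontmatter.ts` is described as YAML parsing and is not shown in this diff.

```javascript
// Hypothetical parser for the simple `key: value` frontmatter shown in the example spec.
// It handles flat string/number fields only, not full YAML.
function parseSpecFrontmatter(markdown) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(markdown);
  if (match === null) {
    return { ok: false, error: { type: "MISSING_FRONTMATTER" } };
  }
  const fields = {};
  for (const line of match[1].split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const separator = line.indexOf(":");
    if (separator === -1) {
      return { ok: false, error: { type: "PARSE_ERROR", line } };
    }
    const key = line.slice(0, separator).trim();
    // Strip one pair of surrounding quotes, as in `feature: 'Feature Name'`.
    const value = line.slice(separator + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    fields[key] = value;
  }
  if (!fields.feature) {
    return { ok: false, error: { type: "MISSING_FIELD", field: "feature" } };
  }
  const spec = { feature: fields.feature, body: markdown.slice(match[0].length) };
  if (fields.entry !== undefined) spec.entry = fields.entry;
  if (fields.timeout !== undefined) {
    const timeout = Number(fields.timeout);
    if (!Number.isFinite(timeout) || timeout <= 0) {
      return { ok: false, error: { type: "PARSE_ERROR", line: `timeout: ${fields.timeout}` } };
    }
    spec.timeout = timeout;
  }
  return { ok: true, spec };
}
```

Returning a tagged result instead of throwing lets the caller branch on `error.type`, which matches the discriminated-union style the README describes for its error types.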
63
+
64
+ ### review [findings-path]
55
65
 
56
- ### `xqa explore [prompt]`
66
+ Review findings and mark false positives.
57
67
 
58
- Runs the explorer agent against the live simulator. Without a prompt the agent sweeps the entire app. With a prompt it focuses on the described flow.
68
+ Interactive session for triaging findings generated by explore or spec runs. Displays findings with confidence scores, steps, and screenshots. Mark findings as false positives (with optional reason) or undo previous dismissals. Saves dismissals to `.xqa/dismissals.json`. Defaults to the last findings path if omitted.
59
69
 
60
70
  ```bash
61
- xqa explore
62
- xqa explore "Try to send Bitcoin to an external address"
63
- xqa explore --verbose
71
+ xqa review # use last findings file
72
+ xqa review .xqa/output/findings-abc123.json # explicit path
64
73
  ```
65
74
 
66
- Startup state (`QA_STARTUP_STATE`) controls what the agent sees on launch:
75
+ ### analyse [video-path]
76
+
77
+ Analyse a session recording with Gemini.
67
78
 
68
- - `portfolio` — main assets screen (default)
69
- - `new-wallet` — onboarding screen; agent taps through setup
70
- - `restore-wallet` — onboarding screen; agent restores wallet using `QA_WALLET_MNEMONIC`
79
+ Requires `GOOGLE_GENERATIVE_AI_API_KEY` in the environment. Analyses a video file recorded during exploration and outputs the findings as JSON.
80
+
81
+ ```bash
82
+ xqa analyse /path/to/video.mp4
83
+ ```
71
84
 
72
- When `GOOGLE_GENERATIVE_AI_API_KEY` is set, a Gemini video analyser runs automatically after the explorer finishes.
85
+ ### completion <shell>
73
86
 
74
- ### `xqa spec <spec-file>`
87
+ Output shell completion script.
75
88
 
76
- Runs the explorer against a markdown spec file. The agent navigates to the entry point defined in the frontmatter and verifies each described step.
89
+ Generate completion script for bash or zsh. Pipe output to shell config file to enable tab completion.
77
90
 
78
91
  ```bash
79
- xqa spec path/to/send-flow.md
80
- xqa spec path/to/send-flow.md --verbose
92
+ xqa completion bash # generate bash completions
93
+ xqa completion zsh # generate zsh completions
81
94
  ```
82
95
 
83
- Spec file format:
96
+ ## Configuration
84
97
 
85
- ```markdown
86
- ---
87
- feature: Send Flow
88
- entry: Assets list
89
- max_steps: 40
90
- ---
98
+ Configuration is loaded from environment variables and `.env.local`:
91
99
 
92
- Steps describing the flow to verify...
93
- ```
100
+ - `ANTHROPIC_API_KEY` (required) — Anthropic Claude API key for agent reasoning
101
+ - `GOOGLE_GENERATIVE_AI_API_KEY` (optional) — Google Generative AI key for video analysis
102
+ - `QA_RUN_ID` (optional) — Custom run identifier; defaults to auto-generated
103
+ - `QA_EXPLORE_TIMEOUT_SECONDS` (optional) — Exploration timeout in seconds
104
+ - `QA_BUILD_ENV` (optional) — Build environment: `dev` or `prod` (default: prod)
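The variables listed above could be validated roughly as follows. This is a plain-JavaScript sketch, not the package's Zod schema from `src/config-schema.ts`; `loadConfig`, the shape of the returned object, and the rule that the timeout must be positive are all assumptions made for illustration.

```javascript
// Hypothetical validator for the documented variables; the package itself uses a Zod schema.
function loadConfig(env) {
  const issues = [];
  if (!env.ANTHROPIC_API_KEY) {
    issues.push("ANTHROPIC_API_KEY is required");
  }
  const buildEnv = env.QA_BUILD_ENV ?? "prod"; // documented default
  if (buildEnv !== "dev" && buildEnv !== "prod") {
    issues.push("QA_BUILD_ENV must be dev or prod");
  }
  let exploreTimeoutSeconds;
  if (env.QA_EXPLORE_TIMEOUT_SECONDS !== undefined) {
    exploreTimeoutSeconds = Number(env.QA_EXPLORE_TIMEOUT_SECONDS);
    if (!Number.isFinite(exploreTimeoutSeconds) || exploreTimeoutSeconds <= 0) {
      issues.push("QA_EXPLORE_TIMEOUT_SECONDS must be a positive number"); // assumed rule
    }
  }
  if (issues.length > 0) {
    return { ok: false, error: { type: "INVALID_CONFIG", issues } };
  }
  return {
    ok: true,
    config: {
      anthropicApiKey: env.ANTHROPIC_API_KEY,
      googleApiKey: env.GOOGLE_GENERATIVE_AI_API_KEY,
      runId: env.QA_RUN_ID,
      exploreTimeoutSeconds,
      buildEnv,
    },
  };
}

const result = loadConfig({ ANTHROPIC_API_KEY: "example-key" });
console.log(result.ok); // true
```

Collecting every issue before returning means a single run reports all misconfigured variables at once rather than one per attempt.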
105
+
106
+ ## Architecture
94
107
 
95
- | Field | Required | Description |
96
- | ----------- | -------- | -------------------------------------------------- |
97
- | `feature` | Yes | Human-readable feature name |
98
- | `entry` | Yes | Screen name the agent navigates to before starting |
99
- | `max_steps` | No | Maximum number of agent steps |
108
+ Key files and directories:
100
109
 
101
- ### `xqa analyse <video-path>`
110
+ - `src/index.ts` — CLI entry point; wires commander commands and manages graceful shutdown via process locks
111
+ - `src/commands/` — Command implementations (init, explore, spec, review, analyse, completion)
112
+ - `src/core/` — Pure functions: spec parsing, completion generation, verbose option parsing, last-path tracking
113
+ - `src/shell/` — I/O wrappers: file reading, device discovery, app context loading
114
+ - `src/config.ts`, `src/config-schema.ts` — Configuration loading and validation with Zod
115
+ - `src/review-session.ts` — Interactive finding review loop with dismissal tracking
116
+ - `src/spec-frontmatter.ts` — Spec markdown frontmatter parsing (YAML)
117
+ - `src/spec-slug.ts` — Spec filename to slug derivation for output organization
118
+ - `src/pid-lock.ts` — Process-level mutual exclusion to prevent concurrent runs
102
119
 
103
- Analyses a session recording with Gemini. Requires `GOOGLE_GENERATIVE_AI_API_KEY`. Prints findings as JSON to stdout.
120
+ ## Error Types
121
+
122
+ Core error discriminated unions:
123
+
124
+ - `ConfigError` — Configuration validation failed (INVALID_CONFIG)
125
+ - `AppContextError` — Failed to read app.md or explore.md (READ_FAILED)
126
+ - `XqaDirectoryError` — No .xqa directory found (XQA_NOT_INITIALIZED)
127
+ - `SpecFrontmatterError` — Malformed spec markdown (MISSING_FRONTMATTER, MISSING_FIELD, PARSE_ERROR)
128
+ - `LastPathError` — No findings path provided and no prior session (NO_ARG_AND_NO_STATE)
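To make the discriminated-union idea concrete, here is a small sketch of how `xqa review` might fall back to the last findings path and surface `NO_ARG_AND_NO_STATE`. The functions `resolveFindingsPath` and `describeError` are hypothetical, and the message strings are illustrative rather than the CLI's actual output.

```javascript
// Hypothetical resolver: an explicit argument wins, otherwise fall back to the stored last path.
function resolveFindingsPath(argument, lastPath) {
  if (argument !== undefined && argument !== "") {
    return { ok: true, path: argument };
  }
  if (lastPath !== undefined && lastPath !== "") {
    return { ok: true, path: lastPath };
  }
  return { ok: false, error: { type: "NO_ARG_AND_NO_STATE" } };
}

// Callers branch on the `type` tag rather than catching exceptions.
function describeError(error) {
  switch (error.type) {
    case "NO_ARG_AND_NO_STATE":
      return "No findings path given and no previous session recorded";
    case "XQA_NOT_INITIALIZED":
      return "No .xqa directory found";
    default:
      return `Unhandled error: ${error.type}`;
  }
}

const resolved = resolveFindingsPath(undefined, undefined);
if (!resolved.ok) {
  console.log(describeError(resolved.error));
}
```

Because every error carries a literal `type`, a `switch` over it stays exhaustive as new error kinds are added, which is the main benefit of the union style.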
129
+
130
+ ## Development
131
+
132
+ Install dependencies:
104
133
 
105
134
  ```bash
106
- xqa analyse .xqa/output/2026-04-10/0001/recording.mp4
135
+ pnpm install
107
136
  ```
108
137
 
109
- ### `xqa review [findings-path]`
138
+ Build the CLI:
110
139
 
111
- Interactive terminal session for reviewing findings and marking false positives. Requires a TTY. Dismissals are persisted to a dismissals store and excluded from future runs.
140
+ ```bash
141
+ pnpm run build
142
+ ```
143
+
144
+ Run tests:
112
145
 
113
146
  ```bash
114
- xqa review .xqa/output/2026-04-10/0001/findings.json
147
+ pnpm run test
148
+ ```
149
+
150
+ Type check:
115
151
 
116
- # re-open the last reviewed findings file
117
- xqa review
152
+ ```bash
153
+ pnpm run typecheck
118
154
  ```
119
155
 
120
- ### `xqa completion <shell>`
156
+ Lint and format:
121
157
 
122
- Outputs a shell completion script.
158
+ ```bash
159
+ pnpm run lint
160
+ pnpm run lint:fix
161
+ ```
162
+
163
+ Full quality check (lint, typecheck, test):
123
164
 
124
165
  ```bash
125
- xqa completion zsh >> ~/.zshrc
126
- xqa completion bash >> ~/.bashrc
166
+ pnpm run check
167
+ pnpm run check:fix
127
168
  ```
128
169
 
129
- ## Process Behaviour
170
+ Watch mode (build + re-run on file changes):
130
171
 
131
- Only one `xqa` instance runs at a time (PID lock). A second invocation while a run is active will exit immediately with an error.
172
+ ```bash
173
+ pnpm run dev
174
+ ```
132
175
 
133
- - `Ctrl+C` once: graceful shutdown; the current agent step completes, findings are written, then the process exits
134
- - `Ctrl+C` twice: force exit
176
+ Link binary globally (symlinks dist/xqa.cjs to ~/.local/bin/xqa):
135
177
 
136
- ## Development
178
+ ```bash
179
+ pnpm run build:link
180
+ ```
181
+
182
+ Unlink binary:
137
183
 
138
184
  ```bash
139
- pnpm dev # watch build
140
- pnpm build # production build
141
- pnpm build:link # build + link `xqa` globally
142
- pnpm dev:link # watch build + link
143
- pnpm test # run Vitest test suite
144
- pnpm typecheck # TypeScript type check
145
- pnpm lint # ESLint + Prettier check
146
- pnpm lint:fix # ESLint + Prettier auto-fix
147
- pnpm check # lint + typecheck + test (affected only)
148
- pnpm check:fix # lint:fix + typecheck + test (affected only)
185
+ pnpm run build:unlink
149
186
  ```
150
187
 
151
- ## Architecture
188
+ ## Project Structure
152
189
 
153
190
  ```
154
191
  src/
155
- index.ts # CLI entry — registers all commands
156
- config-schema.ts # Zod schema for all environment variables
192
+ index.ts # CLI entry point
193
+ config.ts # Config loading and types
194
+ config-schema.ts # Zod schema for env vars
195
+ constants.ts # Tool lists and timeouts
196
+ pid-lock.ts # Process exclusion lock
197
+ spec-slug.ts # Spec file to slug conversion
198
+ spec-frontmatter.ts # Spec YAML parsing
199
+ review-session.ts # Interactive finding review loop
200
+
157
201
  commands/
158
- explore-command.ts # xqa explore
159
- spec-command.ts # xqa spec
160
- analyse-command.ts # xqa analyse
161
- review-command.ts # xqa review
162
- completion-command.ts # xqa completion
163
- prompt-builder.ts # builds the explorer system prompt from config
202
+ init-command.ts # Project initialization
203
+ explore-command.ts # Breadth-first exploration
204
+ spec-command.ts # Spec-based exploration
205
+ review-command.ts # Finding triage workflow
206
+ analyse-command.ts # Video analysis
207
+ completion-command.ts # Shell completion generation
208
+
209
+ core/
210
+ parse-verbose.ts # Verbose flag parsing
211
+ completion-generator.ts # Bash/zsh completion script generation
212
+ last-path.ts # Last findings path tracking
213
+
214
+ shell/
215
+ app-context.ts # Read app.md and explore.md
216
+ xqa-directory.ts # Locate .xqa directory
217
+
218
+ __tests__/
219
+ *.test.ts # Test files co-located with src/
164
220
  ```
165
-
166
- The CLI is a thin shell over `@qa-agents/pipeline`. It parses env vars, builds a `PipelineConfig`, and calls `runPipeline()`.
@@ -23,7 +23,7 @@ Silently scan `.xqa/specs/*.test.md`. Learn:
23
23
  - Tag vocabulary
24
24
  - Level of detail and step granularity
25
25
 
26
- Also read `.xqa/instructions.md` if it exists for app context.
26
+ Also read `.xqa/app.md` if it exists for app context.
27
27
 
28
28
  ### 2. Detect mode
29
29
 
@@ -40,17 +40,20 @@ Ask one question at a time. Wait for the answer before asking the next. Prefer m
40
40
 
41
41
  **Question sequence:**
42
42
 
43
- 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename.
44
- 2. **Starting state** — "Where does the app start for this test? What's already set up?" → becomes `## Setup`
45
- 3. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
46
- 4. **Global assertions** — "Anything non-obvious that should hold at the end?" → becomes `## Assertions`; skip if none. Never suggest trivial examples (no errors shown, page loaded); only capture meaningful, app-specific checks.
47
- 5. **Metadata** — "Any tags or a custom timeout?" (offer to skip)
43
+ 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
44
+ 2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
45
+ 3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
46
+ 4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
47
+ 5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
48
+ 6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
48
49
 
49
50
  IMPORTANT: Ask each question in its own message. Never batch questions.
50
51
 
51
52
  ### 4. Draft
52
53
 
53
- Assemble the spec from interview answers; don't invent steps or assertions the user didn't describe. Present the full draft for review.
54
+ Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
55
+
56
+ Steps and assertions come from the user — never invent them. Present the full draft for review.
54
57
 
55
58
  ### 5. Review
56
59
 
@@ -66,28 +69,30 @@ Save to `.xqa/specs/<name>.test.md` only after explicit approval.
66
69
 
67
70
  ```md
68
71
  ---
69
- description: optional one-liner
70
- tags: [optional, tags]
71
- timeout: 120
72
+ feature: <string>
73
+ entry: <string>
74
+ timeout: <seconds>
72
75
  ---
73
76
 
74
77
  ## Setup
75
78
 
76
- Starting screen and preconditions. Required.
79
+ <preconditions and starting state>
77
80
 
78
81
  ## Steps
79
82
 
80
- 1. Action → expected outcome (optional inline assertion)
81
- 2. Next action
83
+ 1. <action> → <expected outcome>
84
+ 2. <action>
82
85
 
83
86
  ## Assertions
84
87
 
85
- - Global flow-level check (optional section)
88
+ - <global flow-level check>
86
89
  ```
87
90
 
91
+ Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
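The omission rules for the canonical format can be sketched as a small renderer. `renderSpec` and the shape of its `answers` argument are hypothetical, and the sketch writes frontmatter values unquoted, so it assumes they contain no characters that need YAML quoting.

```javascript
// Hypothetical renderer for the canonical format: optional pieces are left out entirely.
function renderSpec(answers) {
  const frontmatter = [`feature: ${answers.feature}`];
  if (answers.entry !== undefined) frontmatter.push(`entry: ${answers.entry}`);
  if (answers.timeout !== undefined) frontmatter.push(`timeout: ${answers.timeout}`);

  const steps = answers.steps.map((step, index) => {
    // Inline assertion syntax: `action → outcome`; the outcome is optional.
    const text = step.outcome ? `${step.action} → ${step.outcome}` : step.action;
    return `${index + 1}. ${text}`;
  });

  const sections = [
    ["---", ...frontmatter, "---"].join("\n"),
    `## Setup\n\n${answers.setup}`,
    `## Steps\n\n${steps.join("\n")}`,
  ];
  if (answers.assertions && answers.assertions.length > 0) {
    sections.push(`## Assertions\n\n${answers.assertions.map((item) => `- ${item}`).join("\n")}`);
  }
  return sections.join("\n\n") + "\n";
}
```

Calling it with only `feature`, `setup`, and `steps` yields a file with no `entry`, no `timeout`, and no `## Assertions` heading, which is the behaviour the rule above asks for.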
92
+
88
93
  ## Rules
89
94
 
90
- - `## Setup` and `## Steps` are required; frontmatter and `## Assertions` are optional
95
+ - `## Setup` and `## Steps` are required; `## Assertions` is optional
91
96
  - Inline assertion syntax: `action → outcome` using the → character
92
97
  - Steps come from the user — never invent them
93
98
  - Write file only after explicit approval
@@ -29,7 +29,7 @@ Silently scan `.xqa/specs/*.test.md`. Learn:
29
29
  - Tag vocabulary
30
30
  - Level of detail and step granularity
31
31
 
32
- Also read `.xqa/instructions.md` if it exists for app context.
32
+ Also read `.xqa/app.md` if it exists for app context.
33
33
 
34
34
  ### 2. Detect mode
35
35
 
@@ -46,17 +46,20 @@ Ask one question at a time. Wait for the answer before asking the next. Prefer m
46
46
 
47
47
  **Question sequence:**
48
48
 
49
- 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename.
50
- 2. **Starting state** — "Where does the app start for this test? What's already set up?" → becomes `## Setup`
51
- 3. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
52
- 4. **Global assertions** — "Anything non-obvious that should hold at the end?" → becomes `## Assertions`; skip if none. Never suggest trivial examples (no errors shown, page loaded); only capture meaningful, app-specific checks.
53
- 5. **Metadata** — "Any tags or a custom timeout?" (offer to skip)
49
+ 1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
50
+ 2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
51
+ 3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
52
+ 4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
53
+ 5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
54
+ 6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
54
55
 
55
56
  IMPORTANT: Ask each question in its own message. Never batch questions.
56
57
 
57
58
  ### 4. Draft
58
59
 
59
- Assemble the spec from interview answers; don't invent steps or assertions the user didn't describe. Present the full draft for review.
60
+ Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
61
+
62
+ Steps and assertions come from the user — never invent them. Present the full draft for review.
60
63
 
61
64
  ### 5. Review
62
65
 
@@ -66,34 +69,56 @@ Iterate until approved. One round of changes per message.
66
69
 
67
70
  ### 6. Write
68
71
 
69
- Save to `.xqa/specs/<name>.test.md` only after explicit approval.
72
+ Before writing, verify the draft passes all checks:
73
+
74
+ - [ ] `feature` is present and non-empty
75
+ - [ ] frontmatter contains only permitted fields: `feature`, `entry`, `timeout`
76
+ - [ ] `timeout` if present is a positive number in seconds (not a string, not zero)
77
+ - [ ] `## Setup` section is present
78
+ - [ ] `## Steps` section is present
79
+ - [ ] No forbidden fields: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
80
+
81
+ Fix any failure before writing. Save to `.xqa/specs/<name>.test.md` only after explicit approval.
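The checklist above translates fairly directly into a pre-write check. The sketch below is one possible reading of it: `checkDraft` is a hypothetical name, and it takes the frontmatter as an already-parsed object, which is an assumption about how a draft would be represented.

```javascript
const PERMITTED_FIELDS = new Set(["feature", "entry", "timeout"]);

// Hypothetical pre-write check; `frontmatter` is a plain object, `body` the markdown below it.
// Returns a list of failures; an empty list means the draft passes.
function checkDraft(frontmatter, body) {
  const failures = [];
  if (typeof frontmatter.feature !== "string" || frontmatter.feature.trim() === "") {
    failures.push("feature must be present and non-empty");
  }
  for (const key of Object.keys(frontmatter)) {
    // Allow-listing also rejects every forbidden field (tags, max_steps, ...).
    if (!PERMITTED_FIELDS.has(key)) failures.push(`field not permitted: ${key}`);
  }
  if ("timeout" in frontmatter) {
    const { timeout } = frontmatter;
    if (typeof timeout !== "number" || !Number.isFinite(timeout) || timeout <= 0) {
      failures.push("timeout must be a positive number of seconds");
    }
  }
  for (const heading of ["## Setup", "## Steps"]) {
    const present = body.split("\n").some((line) => line.trim() === heading);
    if (!present) failures.push(`missing section: ${heading}`);
  }
  return failures;
}
```

Checking against the permitted set makes the separate forbidden-field list redundant in code, though keeping it in the prompt still helps steer what gets generated.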
70
82
 
71
83
  ## File format
72
84
 
85
+ FRONTMATTER SCHEMA — exact fields, exact types, no others:
86
+
87
+ ```
88
+ feature string REQUIRED
89
+ entry string OPTIONAL — omit if not provided
90
+ timeout positive number (seconds) OPTIONAL — omit if not provided
91
+ ```
92
+
93
+ FORBIDDEN frontmatter fields — never generate these: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
94
+
95
+ CANONICAL OUTPUT FORMAT:
96
+
73
97
  ```md
74
98
  ---
75
- description: optional one-liner
76
- tags: [optional, tags]
77
- timeout: 120
99
+ feature: <string>
100
+ entry: <string>
101
+ timeout: <seconds>
78
102
  ---
79
103
 
80
104
  ## Setup
81
105
 
82
- Starting screen and preconditions. Required.
106
+ <preconditions and starting state>
83
107
 
84
108
  ## Steps
85
109
 
86
- 1. Action → expected outcome (optional inline assertion)
87
- 2. Next action
110
+ 1. <action> → <expected outcome>
111
+ 2. <action>
88
112
 
89
113
  ## Assertions
90
114
 
91
- - Global flow-level check (optional section)
115
+ - <global flow-level check>
92
116
  ```
93
117
 
118
+ Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
119
+
94
120
  ## Rules
95
121
 
96
- - `## Setup` and `## Steps` are required; frontmatter and `## Assertions` are optional
97
122
  - Inline assertion syntax: `action → outcome` using the → character
98
123
  - Steps come from the user — never invent them
99
124
  - Write file only after explicit approval
package/dist/xqa.cjs CHANGED
@@ -15864,6 +15864,18 @@ function formatMemoryElements(elements) {
15864
15864
  (element) => `${element.label} [${String(Math.round(element.confidence * PCT_MULTIPLIER))}%${element.phase === "after-scroll" ? "\u2193" : ""}]`
15865
15865
  ).join(", ");
15866
15866
  }
15867
+ var ALL_VERBOSE_CATEGORIES = /* @__PURE__ */ new Set([
15868
+ "prompt",
15869
+ "tools",
15870
+ "screen",
15871
+ "memory"
15872
+ ]);
15873
+ function isVerboseEnabled(config3, category) {
15874
+ if (config3 === void 0) {
15875
+ return false;
15876
+ }
15877
+ return config3.has(category);
15878
+ }
15867
15879
  var SCREEN_PREVIEW_LENGTH = 80;
15868
15880
  function write(line) {
15869
15881
  process.stderr.write(line + "\n");
@@ -15871,7 +15883,7 @@ function write(line) {
15871
15883
  function writePlainScreenState(event, verbose) {
15872
15884
  const preview = (event.snapshot.split("\n")[0] ?? "").slice(0, SCREEN_PREVIEW_LENGTH);
15873
15885
  write(`[${event.agent}] screen (${String(event.snapshot.length)} chars): ${preview}`);
15874
- if (verbose) {
15886
+ if (isVerboseEnabled(verbose, "screen")) {
15875
15887
  write(event.snapshot);
15876
15888
  }
15877
15889
  }
@@ -15879,7 +15891,7 @@ function writePlainScreenMemory(event, verbose) {
15879
15891
  write(
15880
15892
  `[${event.agent}] memory (${String(event.sessionsObserved)} sessions): ${formatMemoryElements(event.elements)}`
15881
15893
  );
15882
- if (verbose) {
15894
+ if (isVerboseEnabled(verbose, "memory")) {
15883
15895
  write(event.enrichedSnapshot);
15884
15896
  }
15885
15897
  }
@@ -15914,8 +15926,14 @@ function writePlainToolError(event) {
15914
15926
  write(`${prefix} error handling ${event.toolName}: ${line}`);
15915
15927
  }
15916
15928
  }
15929
+ function writePlainError(event) {
15930
+ write(`[${event.agent}] error: ${event.message}`);
15931
+ if (event.stack !== void 0) {
15932
+ write(event.stack);
15933
+ }
15934
+ }
15917
15935
  function writePlainToolResult(event, verbose) {
15918
- if (!verbose) {
15936
+ if (!isVerboseEnabled(verbose, "tools")) {
15919
15937
  return;
15920
15938
  }
15921
15939
  const prefix = `[${event.agent}]`;
@@ -15941,13 +15959,13 @@ function handlePlainToolEvent(event, verbose) {
15941
15959
  }
15942
15960
  }
15943
15961
  function writePlainSystemPrompt(event, verbose) {
15944
- if (!verbose) {
15962
+ if (!isVerboseEnabled(verbose, "prompt")) {
15945
15963
  return;
15946
15964
  }
15947
15965
  write(`[${event.agent}] system prompt:
15948
15966
  ${event.prompt}`);
15949
15967
  }
15950
- function dispatchPlainEventFirst(event, verbose) {
15968
+ function dispatchPlainNonVerboseFirst(event) {
15951
15969
  switch (event.type) {
15952
15970
  case "STAGE_START": {
15953
15971
  writePlainStageStart(event);
@@ -15961,16 +15979,19 @@ function dispatchPlainEventFirst(event, verbose) {
15961
15979
  writePlainThought(event);
15962
15980
  return;
15963
15981
  }
15964
- case "SCREEN_STATE": {
15965
- writePlainScreenState(event, verbose);
15966
- return;
15967
- }
15968
15982
  case "SCREENSHOT": {
15969
15983
  writePlainScreenshot(event);
15970
15984
  return;
15971
15985
  }
15972
15986
  }
15973
15987
  }
15988
+ function dispatchPlainEventFirst(event, verbose) {
15989
+ if (event.type === "SCREEN_STATE") {
15990
+ writePlainScreenState(event, verbose);
15991
+ return;
15992
+ }
15993
+ dispatchPlainNonVerboseFirst(event);
15994
+ }
15974
15995
  function dispatchPlainEventSecond(event, verbose) {
15975
15996
  switch (event.type) {
15976
15997
  case "SCREEN_MEMORY": {
@@ -15986,10 +16007,7 @@ function dispatchPlainEventSecond(event, verbose) {
15986
16007
  return;
15987
16008
  }
15988
16009
  case "ERROR": {
15989
- write(`[${event.agent}] error: ${event.message}`);
15990
- if (event.stack !== void 0) {
15991
- write(event.stack);
15992
- }
16010
+ writePlainError(event);
15993
16011
  return;
15994
16012
  }
15995
16013
  }
@@ -16054,14 +16072,14 @@ function createGitHubCIFormatter(write2) {
16054
16072
  for (const warning of flushWarnings(event.agent, warnings)) {
16055
16073
  write2(warning);
16056
16074
  }
16057
- handlePlain(event, false);
16075
+ handlePlain(event);
16058
16076
  return;
16059
16077
  }
16060
16078
  if (event.type === "INSPECTOR_STEP") {
16061
16079
  collectWarning(event, warnings);
16062
16080
  return;
16063
16081
  }
16064
- handlePlain(event, false);
16082
+ handlePlain(event);
16065
16083
  };
16066
16084
  }
16067
16085
  var CHALK_TRUECOLOR_LEVEL = 3;
@@ -16116,7 +16134,7 @@ function writePrettyMemory(event, context) {
16116
16134
  barLine(applyMemoryStyle(`\u25B8 memory (${String(event.sessionsObserved)} sessions): ${top}`)),
16117
16135
  context.state
16118
16136
  );
16119
- if (context.verbose) {
16137
+ if (isVerboseEnabled(context.verbose, "memory")) {
16120
16138
  for (const line of event.enrichedSnapshot.split("\n")) {
16121
16139
  writeLine(`${chalk2.dim(S_BAR)} ${applyMemoryStyle(line)}`, context.state);
16122
16140
  }
@@ -16136,7 +16154,7 @@ function writePrettyScreenState(snapshot, context) {
16136
16154
  barLine(applyMemoryStyle(`\u25B8 screen (${String(snapshot.length)} chars): ${preview}`)),
16137
16155
  context.state
16138
16156
  );
16139
- if (context.verbose) {
16157
+ if (isVerboseEnabled(context.verbose, "screen")) {
16140
16158
  for (const line of snapshot.split("\n")) {
16141
16159
  writeLine(`${chalk2.dim(S_BAR)} ${applyMemoryStyle(line)}`, context.state);
16142
16160
  }
@@ -16151,7 +16169,7 @@ function writePrettyError(event, state) {
16151
16169
  }
16152
16170
  }
16153
16171
  function writePrettySystemPrompt(event, context) {
16154
- if (!context.verbose) {
16172
+ if (!isVerboseEnabled(context.verbose, "prompt")) {
16155
16173
  return;
16156
16174
  }
16157
16175
  writeLine(barLine(applyThoughtStyle("\u25C6 system prompt")), context.state);
@@ -16566,7 +16584,7 @@ function buildToolArguments(input) {
16566
16584
  return Object.entries(input).filter(([key]) => !HIDDEN_TOOL_ARGS.has(key)).map(([key, value]) => `${key}: ${String(value)}`).join(", ");
16567
16585
  }
16568
16586
  function writeToolResult(event, context) {
16569
- if (context.verbose) {
16587
+ if (isVerboseEnabled(context.verbose, "tools")) {
16570
16588
  for (const line of event.result.split("\n")) {
16571
16589
  writeLine(`${chalk4.dim(S_BAR4)} ${applyToolStyle(line)}`, context.state);
16572
16590
  }
@@ -16697,7 +16715,7 @@ function resolveOutputMode() {
16697
16715
  }
16698
16716
  function createConsoleObserver(options) {
16699
16717
  const mode = options?.mode ?? resolveOutputMode();
16700
- const verbose = options?.verbose ?? false;
16718
+ const verbose = options?.verbose;
16701
16719
  if (mode === "tty") {
16702
16720
  return createHybridTtyRenderer({ verbose });
16703
16721
  }
@@ -55891,13 +55909,13 @@ var WORKING_STATE_SECTION = `## Working State
55891
55909
 
55892
55910
  At every reasoning step, maintain a mental ledger:
55893
55911
  - VISITED: screen names confirmed via \`view_ui\` this session
55894
- - QUEUE: screen names seen as reachable but not yet explored
55912
+ - QUEUE: screen names seen as reachable but not yet explored \u2014 also seed from App Knowledge if present
55895
55913
  - PATH: your current navigation stack from root (e.g. Home > Settings > Privacy)
55896
55914
 
55897
55915
  Consult the ledger before every action. Always prefer navigating to a QUEUE screen over a VISITED one.`;
55898
- var BACK_NAV_RULE = `- After navigating forward to any new screen: tap back, call \`view_ui\`, confirm you returned to the expected parent in PATH \u2014 if not, emit a \`back-nav-failure\` finding, then navigate forward again to continue`;
55899
- var STUCK_LOOP_RULE = `- Stuck loop: emit a \`stuck-loop\` finding when any of these occur: (1) \`view_ui\` returns the same screen state 3 or more consecutive steps, (2) the same element has been tapped more than twice with no screen change, (3) PATH shows the same screen at two non-adjacent positions \u2014 before emitting, try one alternative action (scroll, long-press, swipe) to rule out a gesture mismatch`;
55900
- var CLIPPED_ELEMENT_RULE = `- Never tap an element tagged \`[clipped-top]\`, \`[clipped-bottom]\`, \`[clipped-left]\`, or \`[clipped-right]\` \u2014 scroll to fully reveal it first, then re-call \`view_ui\` before tapping`;
55916
+ var BACK_NAV_RULE = `After navigating forward to any new screen: tap back, call \`view_ui\`, confirm you returned to the expected parent in PATH \u2014 if not, emit a \`back-nav-failure\` finding, then navigate forward again to continue`;
55917
+ var STUCK_LOOP_RULE = `Stuck loop: emit a \`stuck-loop\` finding when any of these occur: (1) \`view_ui\` returns the same screen state 3 or more consecutive steps, (2) the same element has been tapped more than twice with no screen change, (3) PATH shows the same screen at two non-adjacent positions \u2014 before emitting, try one alternative action (scroll, long-press, swipe) to rule out a gesture mismatch`;
55918
+ var CLIPPED_ELEMENT_RULE = `Never tap an element tagged \`[clipped-top]\`, \`[clipped-bottom]\`, \`[clipped-left]\`, or \`[clipped-right]\` \u2014 scroll to fully reveal it first, then re-call \`view_ui\` before tapping`;
55901
55919
  var WHAT_TO_TEST_SECTION = `## What to Test
55902
55920
 
55903
55921
  Test navigation elements first, interactions second.
@@ -55919,41 +55937,59 @@ Test navigation elements first, interactions second.
55919
55937
  If an interaction produces no observable change, retry once before flagging.`;
55920
55938
  var DEAD_END_SECTION = `## Dead End and Modal Detection
55921
55939
 
55922
- **Dead end** \u2014 when \`view_ui\` shows no interactive exit affordance, attempt ALL of the following before emitting a finding: (1) any visible back/close button, (2) swipe from the left edge (back gesture), (3) swipe down (dismiss gesture). If all fail, emit a \`dead-end\` finding describing what was visible and what was attempted.
55940
+ **Dead end** \u2014 when \`view_ui\` shows no interactive exit affordance, first consult App Knowledge for gesture-based navigation on this screen, then attempt ALL of the following before emitting a finding: (1) any visible back/close button, (2) swipe from the left edge (back gesture), (3) swipe down (dismiss gesture). If all fail, emit a \`dead-end\` finding describing what was visible and what was attempted.
55923
55941
 
55924
55942
  **Stuck modal** \u2014 when a modal or bottom sheet blocks the screen, attempt dismissal in order: (1) close/X button if present, (2) tap outside the modal, (3) swipe down, (4) swipe from the left edge. If all fail, emit a \`stuck-modal\` finding listing the modal, the screen it appeared on, and the methods attempted.`;
55925
- var SPEC_MODE_TEMPLATE = (specContent, options) => {
55926
- const appSection = options.userPrompt ? `## Application
55943
+ var SPEC_WHAT_TO_TEST_SECTION = `## What to Test
55927
55944
 
55928
- ${options.userPrompt}
55945
+ Test only the elements and interactions described in the spec. Do not interact with elements outside the spec path.
55929
55946
 
55930
- ` : "";
55931
- const environmentSection = options.buildEnv === "dev" ? `
55947
+ If you observe obvious breakage while navigating to a spec step \u2014 a broken control, unexpected error, missing screen, or crash \u2014 flag it as a passive observation without stopping to investigate it.`;
55948
+ var SPEC_DEAD_END_SECTION = `## Dead End and Modal Detection
55932
55949
 
55933
- ${DEV_ENVIRONMENT_SECTION}` : "";
55934
- return `You are a navigation and interaction testing agent. Your role is to find broken navigation flows and non-functional interactive elements. Do not report content bugs, copy errors, or visual style issues unless they directly prevent a navigation action from completing.
55950
+ **Dead end** \u2014 if a spec step leaves the agent on a screen with no path to the next spec step, attempt: (1) any visible back/close button, (2) swipe from the left edge, (3) swipe down. If all fail, emit a \`dead-end\` finding and halt \u2014 do not attempt further exploration to recover.
55935
55951
 
55936
- Verify app against specs below.
55952
+ **Stuck modal** \u2014 when a modal or bottom sheet blocks spec step execution, attempt dismissal in order: (1) close/X button if present, (2) tap outside the modal, (3) swipe down, (4) swipe from the left edge. If all fail, emit a \`stuck-modal\` finding listing the modal, the screen it appeared on, and the methods attempted.`;
55953
+ function buildContextSections(appContext, initialState) {
55954
+ return [
55955
+ appContext ? `## App Knowledge
55937
55956
 
55938
- ${appSection}${WORKING_STATE_SECTION}
55957
+ ${appContext}` : void 0,
55958
+ initialState ? `## Initial State
55939
55959
 
55940
- ## Rules
55960
+ ${initialState}` : void 0,
55961
+ WORKING_STATE_SECTION
55962
+ ].filter((section) => section !== void 0).join("\n\n");
55963
+ }
55964
+ var SPEC_RULES_SECTION = `## Rules
55941
55965
 
55942
55966
  - ALWAYS call \`view_ui\` after every action before deciding what to do next \u2014 it is your only way to observe the screen
55943
- - ${BACK_NAV_RULE.slice(2)}
55967
+ - ${BACK_NAV_RULE}
55944
55968
  - Before selecting any action, prefer navigating to a QUEUE screen over re-exploring a VISITED one
55945
- - ${STUCK_LOOP_RULE.slice(2)}
55946
- - ${CLIPPED_ELEMENT_RULE.slice(2)}
55969
+ - ${STUCK_LOOP_RULE}
55970
+ - ${CLIPPED_ELEMENT_RULE}
55947
55971
  - Each item in \`**Assertions**\` is a mandatory pass/fail check \u2014 verify using \`view_ui\`; if the accessibility tree cannot confirm, emit a \`spec-deviation\` finding based on what is observable
55948
- - Flag unexpected navigation failures, broken interactions, or crash dialogs encountered during step execution, even if not listed as assertions
55972
+ - Flag crash dialogs, unexpected system errors, or navigation failures that occur as a direct result of executing a spec step; if you observe a visibly broken element in passing while navigating, note it without interacting with it`;
55973
+ function buildSpecModeBody({
55974
+ specContent,
55975
+ contextBlock,
55976
+ environmentSection
55977
+ }) {
55978
+ return `You are a spec execution agent. Your role is to follow the provided spec exactly \u2014 execute each step in sequence, verify each assertion, and report deviations. Observe and flag obvious breakage encountered in transit, but do not explore or interact with anything outside the spec.
55949
55979
 
55950
- ## Exploration Strategy
55980
+ Verify app against specs below.
55951
55981
 
55952
- Navigate to verify each spec's scenarios. When choosing how to reach a screen, prefer breadth-first paths \u2014 map sibling screens before going deeper into any one branch.
55982
+ ${contextBlock}
55953
55983
 
55954
- ${WHAT_TO_TEST_SECTION}
55984
+ ${SPEC_RULES_SECTION}
55955
55985
 
55956
- ${DEAD_END_SECTION}
55986
+ ## Execution Strategy
55987
+
55988
+ Execute spec steps in strict sequence. Navigate by the shortest path to each step's target screen. Do not interact with any screen, element, or flow not required by the spec.
55989
+
55990
+ ${SPEC_WHAT_TO_TEST_SECTION}
55991
+
55992
+ ${SPEC_DEAD_END_SECTION}
55957
55993
 
55958
55994
  ## Specs
55959
55995
 
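Review note on `buildContextSections` in the hunk above: the helper builds each optional section only when its input is truthy, drops the `undefined` entries, and joins the rest with blank lines. A minimal sketch of that pattern follows. `WORKING_STATE_SECTION` is a placeholder string here (an assumption, since the real constant is defined outside this diff), and the sample inputs are illustrative only.

```javascript
// Placeholder for the real constant, which is not shown in this diff.
const WORKING_STATE_SECTION = "## Working State\n\n(placeholder)";

// Mirrors the shape of buildContextSections from the hunk above:
// optional sections become undefined and are filtered out before joining.
function buildContextSections(appContext, initialState) {
  return [
    appContext ? `## App Knowledge\n\n${appContext}` : undefined,
    initialState ? `## Initial State\n\n${initialState}` : undefined,
    WORKING_STATE_SECTION,
  ]
    .filter((section) => section !== undefined)
    .join("\n\n");
}

// Both optional inputs present: three sections in a fixed order.
const both = buildContextSections(
  "Swipe down on Portfolio opens Profile.",
  "App is on the Portfolio screen."
);

// Neither present: only the working-state section remains.
const neither = buildContextSections(undefined, undefined);
```

Because the filter runs before the join, a missing `app.md` or `explore.md` leaves no empty heading or stray blank lines in the generated prompt.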
@@ -55962,28 +55998,31 @@ ${specContent}${environmentSection}
55962
55998
  ## Output
55963
55999
 
55964
56000
  CRITICAL: Call \`set_output\` each time your findings change \u2014 when you discover something new, confirm a false positive, or revise a finding. Each call replaces the previous output entirely, so always pass the full current list. Do not reply in plain text.`;
56001
+ }
56002
+ var SPEC_MODE_TEMPLATE = (specContent, options) => {
56003
+ const contextBlock = buildContextSections(options.appContext, options.initialState);
56004
+ const environmentSection = options.buildEnv === "dev" ? `
56005
+
56006
+ ${DEV_ENVIRONMENT_SECTION}` : "";
56007
+ return buildSpecModeBody({ specContent, contextBlock, environmentSection });
55965
56008
  };
55966
56009
  var FREESTYLE_TEMPLATE = (options) => {
55967
- const { userPrompt, buildEnv } = options ?? {};
55968
- const appSection = userPrompt ? `## Application
55969
-
55970
- ${userPrompt}
55971
-
55972
- ` : "";
56010
+ const { appContext, initialState, buildEnv } = options ?? {};
56011
+ const contextBlock = buildContextSections(appContext, initialState);
55973
56012
  const environmentSection = buildEnv === "dev" ? `
55974
56013
 
55975
56014
  ${DEV_ENVIRONMENT_SECTION}` : "";
55976
56015
  return `You are a navigation and interaction testing agent. Your role is to find broken navigation flows and non-functional interactive elements. Do not report content bugs, copy errors, or visual style issues unless they directly prevent a navigation action from completing.
55977
56016
 
55978
- ${appSection}${WORKING_STATE_SECTION}
56017
+ ${contextBlock}
55979
56018
 
55980
56019
  ## Rules
55981
56020
 
55982
56021
  - ALWAYS call \`view_ui\` after every action before deciding what to do next \u2014 it is your only way to observe the screen
55983
- - ${BACK_NAV_RULE.slice(2)}
56022
+ - ${BACK_NAV_RULE}
55984
56023
  - Before selecting any action, prefer navigating to a QUEUE screen over re-exploring a VISITED one
55985
- - ${STUCK_LOOP_RULE.slice(2)}
55986
- - ${CLIPPED_ELEMENT_RULE.slice(2)}
56024
+ - ${STUCK_LOOP_RULE}
56025
+ - ${CLIPPED_ELEMENT_RULE}
55987
56026
 
55988
56027
  ## Exploration Strategy
55989
56028
 
@@ -56000,10 +56039,11 @@ CRITICAL: Call \`set_output\` each time your findings change \u2014 when you dis
56000
56039
  function generateExplorerPrompt({
56001
56040
  mode,
56002
56041
  specs,
56003
- userPrompt,
56042
+ appContext,
56043
+ initialState,
56004
56044
  buildEnv
56005
56045
  }) {
56006
- return mode === "spec" ? buildSpecModePrompt(specs, { userPrompt, buildEnv }) : FREESTYLE_TEMPLATE({ userPrompt, buildEnv });
56046
+ return mode === "spec" ? buildSpecModePrompt(specs, { appContext, initialState, buildEnv }) : FREESTYLE_TEMPLATE({ appContext, initialState, buildEnv });
56007
56047
  }
56008
56048
  function renderStep(step, index) {
56009
56049
  const stepNumber = String(index + 1);
@@ -56862,7 +56902,8 @@ function buildPrompt(safeConfig, specs) {
56862
56902
  return generateExplorerPrompt({
56863
56903
  mode: safeConfig.mode,
56864
56904
  specs,
56865
- userPrompt: safeConfig.userPrompt,
56905
+ appContext: safeConfig.appContext,
56906
+ initialState: safeConfig.initialState,
56866
56907
  buildEnv: safeConfig.buildEnv
56867
56908
  });
56868
56909
  }
@@ -56907,12 +56948,18 @@ function collectAndFinalize({
56907
56948
  return error48;
56908
56949
  });
56909
56950
  }
56951
+ function resolveAndParseSpecs(safeConfig) {
56952
+ if (safeConfig.mode === "freestyle") {
56953
+ return (0, import_neverthrow15.okAsync)([]);
56954
+ }
56955
+ return resolveSpecs(safeConfig).mapErr((cause) => ({ type: "SPEC_RESOLVE_FAILED", cause })).andThen((specs) => parseSpecs(specs));
56956
+ }
56910
56957
  function runPipeline({
56911
56958
  safeConfig,
56912
56959
  runPaths,
56913
56960
  start
56914
56961
  }) {
56915
- return resolveSpecs(safeConfig).mapErr((cause) => ({ type: "SPEC_RESOLVE_FAILED", cause })).andThen((specs) => parseSpecs(specs)).map((parsedSpecs) => buildPrompt(safeConfig, parsedSpecs)).map((prompt) => {
56962
+ return resolveAndParseSpecs(safeConfig).map((parsedSpecs) => buildPrompt(safeConfig, parsedSpecs)).map((prompt) => {
56916
56963
  safeConfig.onEvent?.({ type: "SYSTEM_PROMPT", agent: "explorer", prompt });
56917
56964
  return prompt;
56918
56965
  }).andThen((prompt) => collectAndFinalize({ safeConfig, prompt, runPaths, start }));
@@ -61066,7 +61113,7 @@ function writeLastPath(xqaDirectory, findingsPath) {
61066
61113
  (0, import_node_fs4.writeFileSync)(lastPathFilePath(xqaDirectory), findingsPath);
61067
61114
  }
61068
61115
 
61069
- // src/shell/instructions.ts
61116
+ // src/shell/app-context.ts
61070
61117
  var import_promises17 = require("node:fs/promises");
61071
61118
  var import_node_path9 = __toESM(require("node:path"), 1);
61072
61119
  var import_neverthrow36 = __toESM(require_index_cjs(), 1);
@@ -61074,45 +61121,53 @@ var HTML_COMMENT_PATTERN = /<!--[\s\S]*?-->/g;
61074
61121
  function isEnoentError(value) {
61075
61122
  return value !== null && typeof value === "object" && "code" in value && value.code === "ENOENT";
61076
61123
  }
61077
- function toInstructionsError(cause) {
61124
+ function toAppContextError(cause) {
61078
61125
  return { type: "READ_FAILED", cause };
61079
61126
  }
61080
- function absentInstructions() {
61127
+ function absentContext() {
61081
61128
  const absent = void 0;
61082
61129
  return (0, import_neverthrow36.ok)(absent);
61083
61130
  }
61084
61131
  var safeReadFile2 = import_neverthrow36.ResultAsync.fromThrowable(
61085
61132
  async (filePath) => (0, import_promises17.readFile)(filePath, "utf8"),
61086
- toInstructionsError
61133
+ toAppContextError
61087
61134
  );
61088
61135
  function stripAndNormalize(content) {
61089
61136
  const stripped = content.replaceAll(HTML_COMMENT_PATTERN, "").trim();
61090
61137
  return stripped.length === 0 ? void 0 : stripped;
61091
61138
  }
61092
- function readInstructions(xqaDirectory) {
61093
- const filePath = import_node_path9.default.join(xqaDirectory, "instructions.md");
61139
+ function readContextFile(xqaDirectory, filename) {
61140
+ const filePath = import_node_path9.default.join(xqaDirectory, filename);
61094
61141
  return safeReadFile2(filePath).map((content) => stripAndNormalize(content)).orElse((error48) => {
61095
61142
  if (isEnoentError(error48.cause)) {
61096
- return absentInstructions();
61143
+ return absentContext();
61097
61144
  }
61098
61145
  return (0, import_neverthrow36.err)(error48);
61099
61146
  });
61100
61147
  }
61148
+ function readAppContext(xqaDirectory) {
61149
+ return readContextFile(xqaDirectory, "app.md");
61150
+ }
61151
+ function readExploreContext(xqaDirectory) {
61152
+ return readContextFile(xqaDirectory, "explore.md");
61153
+ }
61101
61154
 
61102
61155
  // src/commands/explore-command.ts
61103
61156
  function buildExplorerConfig2({
61104
61157
  input,
61105
61158
  config: config3,
61106
- instructions
61159
+ appContext,
61160
+ initialState
61107
61161
  }) {
61108
- const parts = [instructions, input.prompt].filter(Boolean);
61109
- const userPrompt = parts.length > 0 ? parts.join("\n\n") : void 0;
61162
+ const parts = [initialState, input.prompt].filter(Boolean);
61163
+ const resolvedStartingState = parts.length > 0 ? parts.join("\n\n") : void 0;
61110
61164
  return {
61111
61165
  mode: "freestyle",
61112
61166
  mcpServers: createDefaultMcpServers(),
61113
61167
  allowedTools: ALLOWED_TOOLS,
61114
61168
  timeoutMs: config3.QA_EXPLORE_TIMEOUT_SECONDS === void 0 ? void 0 : config3.QA_EXPLORE_TIMEOUT_SECONDS * MS_PER_SECOND3,
61115
- userPrompt,
61169
+ appContext,
61170
+ initialState: resolvedStartingState,
61116
61171
  buildEnv: config3.QA_BUILD_ENV
61117
61172
  };
61118
61173
  }
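Review note on the hunk above: `stripAndNormalize` removes HTML comments before trimming, so a context file that still holds only its commented template appears to resolve to `undefined`, the same as a missing file. `buildExplorerConfig2` then joins the `explore.md` content with the CLI prompt. The sketch below reproduces both steps in isolation. `resolveStartingState` is a hypothetical name for logic that is inline in the bundle, and the sample strings are illustrative.

```javascript
// Same pattern as HTML_COMMENT_PATTERN in the bundle.
const HTML_COMMENT_PATTERN = /<!--[\s\S]*?-->/g;

// Strip comments, trim, and treat an empty result as "no content".
function stripAndNormalize(content) {
  const stripped = content.replaceAll(HTML_COMMENT_PATTERN, "").trim();
  return stripped.length === 0 ? undefined : stripped;
}

// Hypothetical helper: the same join logic that buildExplorerConfig2 performs inline.
function resolveStartingState(initialState, prompt) {
  const parts = [initialState, prompt].filter(Boolean);
  return parts.length > 0 ? parts.join("\n\n") : undefined;
}

// A file containing only commented-out template sections.
const untouchedTemplate =
  "<!-- Starting State\nDescribe the exact screen.\n-->\n\n<!-- Scope\nOptional.\n-->\n";

// A file where the user added one line of real content.
const editedFile = "<!-- Starting State -->\nApp is on the Portfolio screen.\n";

const fromTemplate = stripAndNormalize(untouchedTemplate);
const fromEdited = stripAndNormalize(editedFile);
const combined = resolveStartingState(fromEdited, "Focus on Settings");
```

One consequence of this design is that running `xqa init` and never editing the generated files should not add empty context sections to the prompt.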
@@ -61125,7 +61180,7 @@ function buildPipelineConfig({
61125
61180
  const base = {
61126
61181
  outputDir: import_node_path10.default.join(xqaDirectory, "output"),
61127
61182
  runId: config3.QA_RUN_ID,
61128
- onEvent: createConsoleObserver(input.verbose ? { verbose: true } : void 0),
61183
+ onEvent: createConsoleObserver(input.verbose ? { verbose: input.verbose } : void 0),
61129
61184
  signal: input.signal,
61130
61185
  inspector: { designsDirectory: import_node_path10.default.join(xqaDirectory, "designs") },
61131
61186
  explorer
@@ -61163,24 +61218,29 @@ ${cause}
61163
61218
  }
61164
61219
  };
61165
61220
  }
61221
+ function handleContextError(error48) {
61222
+ const cause = error48.cause instanceof Error ? error48.cause.message : JSON.stringify(error48.cause);
61223
+ process.stderr.write(`Failed to read context: ${error48.type}
61224
+ ${cause}
61225
+ `);
61226
+ process.exit(1);
61227
+ }
61166
61228
  function runExploreCommand(input, options) {
61167
61229
  const { config: config3, xqaDirectory } = options;
61168
61230
  const { onSuccess, onError } = handlePipelineResult(input, xqaDirectory);
61169
- void readInstructions(xqaDirectory).match(
61170
- (instructions) => {
61171
- const explorerConfig = buildExplorerConfig2({ input, config: config3, instructions });
61172
- void runPipeline2(
61173
- buildPipelineConfig({ input, config: config3, xqaDirectory, explorer: explorerConfig })
61174
- ).match(onSuccess, onError);
61175
- },
61176
- (error48) => {
61177
- const cause = error48.cause instanceof Error ? error48.cause.message : JSON.stringify(error48.cause);
61178
- process.stderr.write(`Failed to read instructions: ${error48.type}
61179
- ${cause}
61180
- `);
61181
- process.exit(1);
61182
- }
61183
- );
61231
+ void readAppContext(xqaDirectory).andThen(
61232
+ (appContext) => readExploreContext(xqaDirectory).map((exploreContext) => ({ appContext, exploreContext }))
61233
+ ).match(({ appContext, exploreContext }) => {
61234
+ const explorerConfig = buildExplorerConfig2({
61235
+ input,
61236
+ config: config3,
61237
+ appContext,
61238
+ initialState: exploreContext
61239
+ });
61240
+ void runPipeline2(
61241
+ buildPipelineConfig({ input, config: config3, xqaDirectory, explorer: explorerConfig })
61242
+ ).match(onSuccess, onError);
61243
+ }, handleContextError);
61184
61244
  }
61185
61245
 
61186
61246
  // src/commands/init-command.ts
@@ -61191,20 +61251,45 @@ var import_node_url = require("node:url");
61191
61251
  var GITIGNORE_CONTENT = `/output
61192
61252
  /last-findings-path
61193
61253
  `;
61194
- var INSTRUCTIONS_TEMPLATE = `<!-- App Overview
61195
- Describe what your app does and its main purpose.
61196
- Example: This is a crypto wallet app that lets users send, receive, and swap tokens.
61254
+ var APP_TEMPLATE = `<!-- Overview
61255
+ What this app does in 1-2 sentences. Focus on domain, not tech stack.
61256
+ Example: Crypto wallet for sending, receiving, and swapping tokens across multiple blockchains.
61197
61257
  -->
61198
61258
 
61199
- <!-- Navigation
61200
- Describe the main navigation structure and how to move between screens.
61201
- Example: The main screen is the asset list. Swipe down to open Profile. Dismiss modals by swiping down.
61259
+ <!-- Screens
61260
+ List the main screens and how to reach them. Use > for navigation paths.
61261
+ Include any non-obvious names the accessibility tree uses for screen titles.
61262
+ Example:
61263
+ - Portfolio: default home screen, shows asset list
61264
+ - Asset Detail: tap any asset in Portfolio
61265
+ - Settings: tap the gear icon top-right on Portfolio
61266
+ - Send: Portfolio > tap asset > Send button
61267
+ If the accessibility tree uses a different name than what's visible, include both.
61202
61268
  -->
61203
61269
 
61204
- <!-- Startup
61205
- Describe the initial state of the app when the agent starts.
61206
- Example: The app starts on the home screen with a wallet already loaded.
61207
- If this file contains a mnemonic phrase, add .xqa/instructions.md to your .gitignore.
61270
+ <!-- Gestures
61271
+ Optional. List navigation gestures that have no visible button. The agent cannot discover these from the UI tree.
61272
+ Skip this section if your app does not use gesture-based navigation.
61273
+ Example:
61274
+ - Swipe down on Portfolio \u2192 opens Profile
61275
+ - Swipe down on any modal \u2192 dismisses it
61276
+ - Swipe left on Asset Detail \u2192 goes back (no back button visible)
61277
+ -->
61278
+ `;
61279
+ var EXPLORE_TEMPLATE = `<!-- Starting State
61280
+ Describe the exact screen and state the app is in when the agent connects.
61281
+ Include credentials or wallet state if relevant. Add explore.md to .gitignore if it contains secrets.
61282
+ Example: App is on the Portfolio screen with a funded wallet loaded. No modals are open.
61283
+ Example with credentials: App is on the Login screen. Use PIN 123456 to unlock.
61284
+ Example with mid-flow state: App is on the Send flow, amount entry modal is open. Dismiss before exploring.
61285
+ -->
61286
+
61287
+ <!-- Scope
61288
+ Optional. Tell the agent where to focus or what to skip.
61289
+ Without this, the agent explores everything reachable from the starting screen.
61290
+ Example: Focus on the Settings section only. Skip the Send and Receive flows.
61291
+ Example: Explore everything except the Swap screen \u2014 it requires live network.
61292
+ Scope applies from the starting screen. If the focus area requires navigation, describe that in Starting State instead.
61208
61293
  -->
61209
61294
  `;
61210
61295
  function resolveSkillPath(skillName) {
@@ -61213,7 +61298,7 @@ function resolveSkillPath(skillName) {
61213
61298
  }
61214
61299
  function runInitCommand() {
61215
61300
  const xqaDirectory = import_node_path11.default.join(process.cwd(), ".xqa");
61216
- (0, import_node_child_process5.spawnSync)("npx", ["skills", "add", resolveSkillPath("xqa-spec"), "--all", "-y"], {
61301
+ (0, import_node_child_process5.spawnSync)("npx", ["skills", "add", resolveSkillPath("xqa-spec")], {
61217
61302
  stdio: "inherit"
61218
61303
  });
61219
61304
  if ((0, import_node_fs5.existsSync)(xqaDirectory)) {
@@ -61223,15 +61308,18 @@ function runInitCommand() {
61223
61308
  }
61224
61309
  (0, import_node_fs5.mkdirSync)(xqaDirectory);
61225
61310
  (0, import_node_fs5.writeFileSync)(import_node_path11.default.join(xqaDirectory, ".gitignore"), GITIGNORE_CONTENT);
61226
- (0, import_node_fs5.writeFileSync)(import_node_path11.default.join(xqaDirectory, "instructions.md"), INSTRUCTIONS_TEMPLATE);
61311
+ (0, import_node_fs5.writeFileSync)(import_node_path11.default.join(xqaDirectory, "app.md"), APP_TEMPLATE);
61312
+ (0, import_node_fs5.writeFileSync)(import_node_path11.default.join(xqaDirectory, "explore.md"), EXPLORE_TEMPLATE);
61227
61313
  for (const subdir of ["designs", "specs", "suites"]) {
61228
61314
  (0, import_node_fs5.mkdirSync)(import_node_path11.default.join(xqaDirectory, subdir));
61229
61315
  (0, import_node_fs5.writeFileSync)(import_node_path11.default.join(xqaDirectory, subdir, ".gitkeep"), "");
61230
61316
  }
61231
61317
  process.stdout.write(`Initialized xqa project: ${xqaDirectory}
61232
61318
  `);
61233
- process.stdout.write(`Edit .xqa/instructions.md to describe your app.
61234
- `);
61319
+ process.stdout.write(
61320
+ `Edit .xqa/app.md to describe your app and .xqa/explore.md to configure exploration.
61321
+ `
61322
+ );
61235
61323
  }
61236
61324
 
61237
61325
  // src/commands/review-command.ts
@@ -63984,14 +64072,14 @@ function extractFrontmatterBlock(content) {
63984
64072
  }
63985
64073
  return (0, import_neverthrow39.ok)(normalized.slice(FRONTMATTER_OPEN_LEN, end));
63986
64074
  }
63987
- function parseMaxSteps(fields) {
63988
- const maxStepsRaw = fields.get("max_steps");
63989
- if (maxStepsRaw === void 0) {
63990
- return (0, import_neverthrow39.ok)(maxStepsRaw);
64075
+ function parseTimeout(fields) {
64076
+ const raw = fields.get("timeout");
64077
+ if (raw === void 0) {
64078
+ return (0, import_neverthrow39.ok)(raw);
63991
64079
  }
63992
- const parsed = Number(maxStepsRaw);
63993
- if (!Number.isInteger(parsed) || parsed <= 0) {
63994
- return (0, import_neverthrow39.err)({ type: "PARSE_ERROR", cause: `invalid max_steps: ${maxStepsRaw}` });
64080
+ const parsed = Number(raw);
64081
+ if (Number.isNaN(parsed) || parsed <= 0) {
64082
+ return (0, import_neverthrow39.err)({ type: "PARSE_ERROR", cause: `invalid timeout: ${raw}` });
63995
64083
  }
63996
64084
  return (0, import_neverthrow39.ok)(parsed);
63997
64085
  }
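Review note on the hunk above: the validation changed along with the field name. The old `max_steps` check required a positive integer, while the new `timeout` check rejects only `NaN` and non-positive values. The sketch below isolates the two predicates so the difference is easy to see. The function names `isValidMaxSteps` and `isValidTimeout` are hypothetical; the bundle returns neverthrow results instead of booleans.

```javascript
// Old predicate (1.3.0, max_steps): positive integers only.
function isValidMaxSteps(raw) {
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0;
}

// New predicate (1.4.0, timeout): any number that is not NaN and is above zero.
function isValidTimeout(raw) {
  const parsed = Number(raw);
  return !(Number.isNaN(parsed) || parsed <= 0);
}

// Sample frontmatter values and how each predicate treats them.
const samples = ["90", "1.5", "0", "-5", "abc", "", "Infinity"];
const comparison = samples.map((raw) => ({
  raw,
  oldCheck: isValidMaxSteps(raw),
  newCheck: isValidTimeout(raw),
}));
```

Fractional values such as `1.5` now pass, which is consistent with the value later being multiplied by `MS_PER_SECOND3` in `buildSpecExplorer`. As written, the new check also accepts `Infinity`, since it is neither `NaN` nor non-positive; whether that matters depends on how `timeoutMs` is consumed downstream, which this diff does not show.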
@@ -64003,7 +64091,7 @@ function parseSpecFrontmatter(content) {
64003
64091
  return (0, import_neverthrow39.err)({ type: "MISSING_FIELD", field: "feature" });
64004
64092
  }
64005
64093
  const entry = fields.get("entry");
64006
- return parseMaxSteps(fields).map((maxSteps) => ({ feature, entry, maxSteps }));
64094
+ return parseTimeout(fields).map((timeout) => ({ feature, entry, timeout }));
64007
64095
  });
64008
64096
  }
64009
64097
  function parseYamlFields(block) {
@@ -64056,7 +64144,9 @@ function buildSpecExplorer(input, context) {
64056
64144
  specFiles: [context.absolutePath],
64057
64145
  mcpServers: createDefaultMcpServers(),
64058
64146
  allowedTools: ALLOWED_TOOLS,
64059
- userPrompt: context.entry ? `Navigate to \`${context.entry}\` before beginning spec verification.` : void 0,
64147
+ timeoutMs: context.timeout === void 0 ? void 0 : context.timeout * MS_PER_SECOND3,
64148
+ appContext: context.appContext,
64149
+ initialState: context.entry ? `Navigate to \`${context.entry}\` before beginning spec verification.` : void 0,
64060
64150
  buildEnv: context.config.QA_BUILD_ENV
64061
64151
  };
64062
64152
  }
@@ -64064,7 +64154,7 @@ function buildPipelineConfig2(input, context) {
64064
64154
  return {
64065
64155
  outputDir: import_node_path15.default.join(context.xqaDirectory, "output", context.slug),
64066
64156
  signal: input.signal,
64067
- onEvent: createConsoleObserver(input.verbose ? { verbose: true } : void 0),
64157
+ onEvent: createConsoleObserver(input.verbose ? { verbose: input.verbose } : void 0),
64068
64158
  inspector: { designsDirectory: import_node_path15.default.join(context.xqaDirectory, "designs") },
64069
64159
  explorer: buildSpecExplorer(input, context)
64070
64160
  };
@@ -64149,9 +64239,31 @@ async function executeSpec(input, context) {
64149
64239
  handleSpecSuccess(context.xqaDirectory, output);
64150
64240
  }, handleSpecError);
64151
64241
  }
64242
+ function handleAppContextError(error48) {
64243
+ const cause = error48.cause instanceof Error ? error48.cause.message : JSON.stringify(error48.cause);
64244
+ process.stderr.write(`Failed to read app context: ${error48.type}
64245
+ ${cause}
64246
+ `);
64247
+ process.exit(1);
64248
+ }
64249
+ async function buildContext(options, specData) {
64250
+ const appContextResult = await readAppContext(options.xqaDirectory);
64251
+ if (appContextResult.isErr()) {
64252
+ handleAppContextError(appContextResult.error);
64253
+ return void 0;
64254
+ }
64255
+ return {
64256
+ config: options.config,
64257
+ xqaDirectory: options.xqaDirectory,
64258
+ absolutePath: specData.absolutePath,
64259
+ entry: specData.entry,
64260
+ timeout: specData.timeout,
64261
+ slug: deriveSpecSlug(specData.absolutePath),
64262
+ appContext: appContextResult.value
64263
+ };
64264
+ }
64152
64265
  async function runSpecCommand(input, options) {
64153
- const { config: config3, xqaDirectory } = options;
64154
- const resolvedSpecFile = await resolveSpecFile(input.specFile, xqaDirectory);
64266
+ const resolvedSpecFile = await resolveSpecFile(input.specFile, options.xqaDirectory);
64155
64267
  if (resolvedSpecFile === void 0) {
64156
64268
  return;
64157
64269
  }
@@ -64160,8 +64272,15 @@ async function runSpecCommand(input, options) {
64160
64272
  if (frontmatter === void 0) {
64161
64273
  return;
64162
64274
  }
64163
- const slug = deriveSpecSlug(absolutePath);
64164
- await executeSpec(input, { config: config3, xqaDirectory, absolutePath, entry: frontmatter.entry, slug });
64275
+ const context = await buildContext(options, {
64276
+ absolutePath,
64277
+ entry: frontmatter.entry,
64278
+ timeout: frontmatter.timeout
64279
+ });
64280
+ if (context === void 0) {
64281
+ return;
64282
+ }
64283
+ await executeSpec(input, context);
64165
64284
  }
64166
64285
 
64167
64286
  // src/config.ts
@@ -68235,6 +68354,25 @@ ${messages.join("\n")}` });
68235
68354
  return (0, import_neverthrow41.ok)(result.data);
68236
68355
  }
68237
68356
 
68357
+ // src/core/parse-verbose.ts
68358
+ function parseVerboseOption(value) {
68359
+ if (value === void 0 || value === "all") {
68360
+ return new Set(ALL_VERBOSE_CATEGORIES);
68361
+ }
68362
+ if (value === "") {
68363
+ throw new InvalidArgumentError("--verbose requires categories or no value for all");
68364
+ }
68365
+ const requested = value.split(",").map((category) => category.trim().toLowerCase());
68366
+ const invalid = requested.filter((category) => !ALL_VERBOSE_CATEGORIES.has(category));
68367
+ const validList = [...ALL_VERBOSE_CATEGORIES].join(", ");
68368
+ if (invalid.length > 0) {
68369
+ const names = invalid.map((name) => `"${name}"`).join(", ");
68370
+ const label = invalid.length === 1 ? "category" : "categories";
68371
+ throw new InvalidArgumentError(`Unknown verbose ${label}: ${names}. Valid: ${validList}`);
68372
+ }
68373
+ return new Set(requested);
68374
+ }
68375
+
68238
68376
  // src/pid-lock.ts
68239
68377
  var import_node_fs8 = require("node:fs");
68240
68378
  var import_neverthrow42 = __toESM(require_index_cjs(), 1);
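Review note on `parseVerboseOption` in the hunk above: the parser accepts a comma-separated category list, normalizes case and whitespace, and rejects unknown names. The sketch below is a standalone approximation. It assumes the category set is `prompt`, `tools`, `screen`, `memory` (taken from the option help text later in this diff, since `ALL_VERBOSE_CATEGORIES` itself is not shown), and it throws a plain `Error` where the bundle throws `InvalidArgumentError`.

```javascript
// Assumption: categories inferred from the "--verbose" help text in this diff.
const ALL_VERBOSE_CATEGORIES = new Set(["prompt", "tools", "screen", "memory"]);

function parseVerboseOption(value) {
  // No value, or the literal "all", enables every category.
  if (value === undefined || value === "all") {
    return new Set(ALL_VERBOSE_CATEGORIES);
  }
  if (value === "") {
    // Plain Error stands in for InvalidArgumentError in this sketch.
    throw new Error("--verbose requires categories or no value for all");
  }
  const requested = value.split(",").map((category) => category.trim().toLowerCase());
  const invalid = requested.filter((category) => !ALL_VERBOSE_CATEGORIES.has(category));
  if (invalid.length > 0) {
    const names = invalid.map((name) => `"${name}"`).join(", ");
    const label = invalid.length === 1 ? "category" : "categories";
    const validList = [...ALL_VERBOSE_CATEGORIES].join(", ");
    throw new Error(`Unknown verbose ${label}: ${names}. Valid: ${validList}`);
  }
  return new Set(requested);
}

// Mixed case and stray spaces are normalized before validation.
const selected = parseVerboseOption("Tools, screen");

// Capture the message produced for an unknown category.
let rejection;
try {
  parseVerboseOption("tools,bogus");
} catch (error) {
  rejection = error.message;
}
```

How the CLI framework invokes the parser when `-v` is passed with no value is outside this sketch; only the function body is reproduced here.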
@@ -68350,7 +68488,11 @@ program2.name("xqa").description("AI-powered QA agent CLI");
68350
68488
  program2.command("init").description("Initialize a new xqa project in the current directory").action(() => {
68351
68489
  runInitCommand();
68352
68490
  });
68353
- program2.command("explore").description("Run the explorer agent; omit prompt for a full breadth-first sweep").argument("[prompt]", "Optional focus hint for the explorer; omit for a full breadth-first sweep").option("--verbose", "Log tool call results").action((prompt, options) => {
68491
+ program2.command("explore").description("Run the explorer agent; omit prompt for a full breadth-first sweep").argument("[prompt]", "Optional focus hint for the explorer; omit for a full breadth-first sweep").option(
68492
+ "-v, --verbose [categories]",
68493
+ "Verbose output [prompt,tools,screen,memory] (default: all)",
68494
+ parseVerboseOption
68495
+ ).action((prompt, options) => {
68354
68496
  const xqaDirectory = resolveXqaDirectory();
68355
68497
  runExploreCommand(
68356
68498
  { prompt, verbose: options.verbose, signal: controller.signal },
@@ -68367,10 +68509,14 @@ program2.command("review").description("Review findings and mark false positives
68367
68509
  const xqaDirectory = resolveXqaDirectory();
68368
68510
  void runReviewCommand(findingsPath, xqaDirectory);
68369
68511
  });
68370
- program2.command("spec").description("Run the explorer agent against a spec file").argument("[spec-file]", "Path to the spec markdown file; omit to pick interactively").option("--verbose", "Log tool call results").action((specFile, options) => {
68512
+ program2.command("spec").description("Run the explorer agent against a spec file").argument("[spec-file]", "Path to the spec markdown file; omit to pick interactively").option(
68513
+ "-v, --verbose [categories]",
68514
+ "Verbose output [prompt,tools,screen,memory] (default: all)",
68515
+ parseVerboseOption
68516
+ ).action((specFile, options) => {
68371
68517
  const xqaDirectory = resolveXqaDirectory();
68372
68518
  void runSpecCommand(
68373
- { specFile, verbose: options.verbose ?? false, signal: controller.signal },
68519
+ { specFile, verbose: options.verbose, signal: controller.signal },
68374
68520
  { config: config2, xqaDirectory }
68375
68521
  );
68376
68522
  });
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@exodus/xqa",
3
- "version": "1.3.0",
3
+ "version": "1.4.0",
4
4
  "type": "module",
5
5
  "engines": {
6
6
  "node": ">=22"
@@ -26,12 +26,12 @@
26
26
  "typescript": "^5.8.3",
27
27
  "vitest": "^3.2.1",
28
28
  "zod": "^3.0.0",
29
- "@qa-agents/explorer": "0.0.0",
30
29
  "@qa-agents/eslint-config": "0.0.0",
31
- "@qa-agents/shared": "0.0.0",
32
- "@qa-agents/typescript-config": "0.0.0",
30
+ "@qa-agents/explorer": "0.0.0",
33
31
  "@qa-agents/observer": "0.0.0",
34
- "@qa-agents/pipeline": "0.0.0"
32
+ "@qa-agents/pipeline": "0.0.0",
33
+ "@qa-agents/typescript-config": "0.0.0",
34
+ "@qa-agents/shared": "0.0.0"
35
35
  },
36
36
  "dependencies": {
37
37
  "@mobilenext/mobile-mcp": "^0.0.50",
@@ -46,8 +46,8 @@
46
46
  },
47
47
  "scripts": {
48
48
  "dev": "node scripts/build.mjs --watch",
49
- "build:link": "pnpm link --global",
50
- "build:unlink": "pnpm remove -g @exodus/xqa",
49
+ "build:link": "ln -sf \"$(pwd)/dist/xqa.cjs\" ~/.local/bin/xqa",
50
+ "build:unlink": "rm -f ~/.local/bin/xqa",
51
51
  "dev:link": "pnpm link --global && pnpm run dev",
52
52
  "build": "node scripts/build.mjs",
53
53
  "typecheck": "tsc --noEmit",