npm - @exodus/xqa - Versions diffs - 1.2.3 → 1.4.0 - Mend

@exodus/xqa 1.2.3 → 1.4.0

Files changed (6)

package/README.md +150 -96
package/dist/skills/xqa-spec/AGENTS.md +99 -0
package/dist/skills/xqa-spec/SKILL.md +125 -0
package/dist/skills/xqa-spec/metadata.json +5 -0
package/dist/xqa.cjs +710 -359
package/package.json +8 -7

package/README.md CHANGED

@@ -1,166 +1,220 @@
 # @exodus/xqa
-CLI for running AI-powered QA agents against Exodus mobile apps on iOS.
+AI-powered QA agent CLI for Exodus applications.
-## Prerequisites
+## Overview
-- Node >= 22
-- pnpm
-- An Anthropic API key
+`xqa` automates mobile app QA by connecting to physical devices or emulators and running intelligent exploration and spec-based testing. The CLI orchestrates the pipeline that spawns agents to interact with your app, capture screenshots, and generate findings based on user-defined specs or breadth-first exploration.
-## Installation
+The tool manages configuration, project initialization, session state tracking, and interactive review workflows for triaging findings.
-From the monorepo root:
+## Commands
-```bash
-pnpm install
-```
+### init
-Then build and link the CLI globally:
+Initialize a new xqa project in the current directory.
+Creates a `.xqa/` directory with templates and subdirectories for specs, designs, and suites. Installs the `xqa-spec` skill for creating test specs.
 ```bash
-pnpm build:link   # build + link `xqa` into PATH
+xqa init
 ```
-For active development:
+### explore [prompt]
+Run the explorer agent; omit prompt for a full breadth-first sweep.
+`prompt` is an optional focus hint for the explorer agent. Omit it to explore the entire app from the starting state. The run generates a findings JSON file in `.xqa/output/` and prints its path upon completion.
 ```bash
-pnpm dev:link     # build, link, and watch for changes
+xqa explore                          # breadth-first exploration
+xqa explore "test the login flow"    # focused exploration
+xqa explore -v prompt,screen         # verbose output for categories
+xqa explore -v                       # verbose output for all categories
 ```
-## Setup
+Flag: `-v, --verbose [categories]` — Log categories (prompt, tools, screen, memory). Default: all if flag is present without value.
-Copy the example env file and fill in your values:
+### spec [spec-file]
+Run the explorer agent against a spec file.
+Loads a spec markdown file from `.xqa/specs/` (or an absolute path) and executes the agent against it. Spec files define entry points, steps, and optional timeouts. Omit the argument to pick from available specs interactively.
 ```bash
-cp .env.example .env.local
+xqa spec                                      # interactive spec picker
+xqa spec .xqa/specs/authentication.test.md   # explicit spec file
+xqa spec -v tools,memory                      # verbose output
 ```
-`.env.local` is loaded automatically at startup.
+Flag: `-v, --verbose [categories]` — Same as explore.
-## Environment Variables
+Spec file format (YAML frontmatter + markdown):
-| Variable                       | Required | Default          | Description                                                                                 |
-| ------------------------------ | -------- | ---------------- | ------------------------------------------------------------------------------------------- |
-| `ANTHROPIC_API_KEY`            | Yes      | —                | Anthropic API key                                                                           |
-| `GOOGLE_GENERATIVE_AI_API_KEY` | No       | —                | Gemini key — enables video analysis; required for `xqa analyse`                             |
-| `QA_RUN_ID`                    | No       | auto-generated   | Fixed run ID; auto-incremented when omitted                                                 |
-| `QA_EXPLORE_TIMEOUT_SECONDS`   | No       | —                | Max wall-clock time for an explore or spec run                                              |
-| `QA_WALLET_MNEMONIC`           | No       | —                | Wallet mnemonic; agent restores wallet before exploring when set                            |
-| `QA_BUILD_ENV`                 | No       | `prod`           | `dev` or `prod`; `dev` mode ignores debug overlays                                          |
-| `QA_STARTUP_STATE`             | No       | —                | `portfolio`, `new-wallet`, or `restore-wallet`; unset means app starts in its current state |
-| `QA_DESIGNS_DIR`               | No       | `./.xqa/designs` | Design artboards directory; enables visual regression checks when set                       |
+```markdown
+---
+feature: 'Feature Name'
+entry: 'Screen name or navigation path'
+timeout: 300
+---
-## Commands
+# Spec content
+```
+### review [findings-path]
-### `xqa explore [prompt]`
+Review findings and mark false positives.
-Runs the explorer agent against the live simulator. Without a prompt the agent sweeps the entire app. With a prompt it focuses on the described flow.
+Interactive session for triaging findings generated by explore or spec runs. Displays findings with confidence scores, steps, and screenshots. Mark findings as false positives (with optional reason) or undo previous dismissals. Saves dismissals to `.xqa/dismissals.json`. Defaults to the last findings path if omitted.
 ```bash
-xqa explore
-xqa explore "Try to send Bitcoin to an external address"
-xqa explore --verbose
+xqa review                                      # use last findings file
+xqa review .xqa/output/findings-abc123.json    # explicit path
 ```
-Startup state (`QA_STARTUP_STATE`) controls what the agent sees on launch:
+### analyse [video-path]
+Analyse a session recording with Gemini.
-- `portfolio` — main assets screen (default)
-- `new-wallet` — onboarding screen; agent taps through setup
-- `restore-wallet` — onboarding screen; agent restores wallet using `QA_WALLET_MNEMONIC`
+Requires `GOOGLE_GENERATIVE_AI_API_KEY` in environment. Analyzes a video file recorded during exploration and outputs findings as JSON.
+```bash
+xqa analyse /path/to/video.mp4
+```
-When `GOOGLE_GENERATIVE_AI_API_KEY` is set, a Gemini video analyser runs automatically after the explorer finishes.
+### completion <shell>
-### `xqa spec <spec-file>`
+Output shell completion script.
-Runs the explorer against a markdown spec file. The agent navigates to the entry point defined in the frontmatter and verifies each described step.
+Generate a completion script for bash or zsh. Append the output to your shell config file to enable tab completion.
 ```bash
-xqa spec path/to/send-flow.md
-xqa spec path/to/send-flow.md --verbose
+xqa completion bash  # generate bash completions
+xqa completion zsh   # generate zsh completions
 ```
-Spec file format:
+## Configuration
-```markdown
----
-feature: Send Flow
-entry: Assets list
-max_steps: 40
----
+Configuration is loaded from environment variables and `.env.local`:
-Steps describing the flow to verify...
-```
+- `ANTHROPIC_API_KEY` (required) — Anthropic Claude API key for agent reasoning
+- `GOOGLE_GENERATIVE_AI_API_KEY` (optional) — Google Generative AI key for video analysis
+- `QA_RUN_ID` (optional) — Custom run identifier; defaults to auto-generated
+- `QA_EXPLORE_TIMEOUT_SECONDS` (optional) — Exploration timeout in seconds
+- `QA_BUILD_ENV` (optional) — Build environment: `dev` or `prod` (default: prod)
+## Architecture
-| Field       | Required | Description                                        |
-| ----------- | -------- | -------------------------------------------------- |
-| `feature`   | Yes      | Human-readable feature name                        |
-| `entry`     | Yes      | Screen name the agent navigates to before starting |
-| `max_steps` | No       | Maximum number of agent steps                      |
+Key files and directories:
-### `xqa analyse <video-path>`
+- `src/index.ts` — CLI entry point; wires commander commands and manages graceful shutdown via process locks
+- `src/commands/` — Command implementations (init, explore, spec, review, analyse, completion)
+- `src/core/` — Pure functions: spec parsing, completion generation, verbose option parsing, last-path tracking
+- `src/shell/` — I/O wrappers: file reading, device discovery, app context loading
+- `src/config.ts`, `src/config-schema.ts` — Configuration loading and validation with Zod
+- `src/review-session.ts` — Interactive finding review loop with dismissal tracking
+- `src/spec-frontmatter.ts` — Spec markdown frontmatter parsing (YAML)
+- `src/spec-slug.ts` — Spec filename to slug derivation for output organization
+- `src/pid-lock.ts` — Process-level mutual exclusion to prevent concurrent runs
-Analyses a session recording with Gemini. Requires `GOOGLE_GENERATIVE_AI_API_KEY`. Prints findings as JSON to stdout.
+## Error Types
+Core error discriminated unions:
+- `ConfigError` — Configuration validation failed (INVALID_CONFIG)
+- `AppContextError` — Failed to read app.md or explore.md (READ_FAILED)
+- `XqaDirectoryError` — No .xqa directory found (XQA_NOT_INITIALIZED)
+- `SpecFrontmatterError` — Malformed spec markdown (MISSING_FRONTMATTER, MISSING_FIELD, PARSE_ERROR)
+- `LastPathError` — No findings path provided and no prior session (NO_ARG_AND_NO_STATE)
+## Development
+Install dependencies:
 ```bash
-xqa analyse .xqa/output/2026-04-10/0001/recording.mp4
+pnpm install
 ```
-### `xqa review [findings-path]`
+Build the CLI:
-Interactive terminal session for reviewing findings and marking false positives. Requires a TTY. Dismissals are persisted to a dismissals store and excluded from future runs.
+```bash
+pnpm run build
+```
+Run tests:
 ```bash
-xqa review .xqa/output/2026-04-10/0001/findings.json
+pnpm run test
+```
+Type check:
-# re-open the last reviewed findings file
-xqa review
+```bash
+pnpm run typecheck
 ```
-### `xqa completion <shell>`
+Lint and format:
-Outputs a shell completion script.
+```bash
+pnpm run lint
+pnpm run lint:fix
+```
+Full quality check (lint, typecheck, test):
 ```bash
-xqa completion zsh >> ~/.zshrc
-xqa completion bash >> ~/.bashrc
+pnpm run check
+pnpm run check:fix
 ```
-## Process Behaviour
+Watch mode (build + re-run on file changes):
-Only one `xqa` instance runs at a time (PID lock). A second invocation while a run is active will exit immediately with an error.
+```bash
+pnpm run dev
+```
-- `Ctrl+C` once: graceful shutdown — the current agent step completes, findings are written, then the process exits
-- `Ctrl+C` twice: force exit
+Link binary globally (symlinks dist/xqa.cjs to ~/.local/bin/xqa):
-## Development
+```bash
+pnpm run build:link
+```
+Unlink binary:
 ```bash
-pnpm dev          # watch build
-pnpm build        # production build
-pnpm build:link   # build + link `xqa` globally
-pnpm dev:link     # watch build + link
-pnpm test         # run Vitest test suite
-pnpm typecheck    # TypeScript type check
-pnpm lint         # ESLint + Prettier check
-pnpm lint:fix     # ESLint + Prettier auto-fix
-pnpm check        # lint + typecheck + test (affected only)
-pnpm check:fix    # lint:fix + typecheck + test (affected only)
+pnpm run build:unlink
 ```
-## Architecture
+## Project Structure
 ```
 src/
-  index.ts                # CLI entry — registers all commands
-  config-schema.ts        # Zod schema for all environment variables
+  index.ts                    # CLI entry point
+  config.ts                   # Config loading and types
+  config-schema.ts            # Zod schema for env vars
+  constants.ts                # Tool lists and timeouts
+  pid-lock.ts                 # Process exclusion lock
+  spec-slug.ts                # Spec file to slug conversion
+  spec-frontmatter.ts         # Spec YAML parsing
+  review-session.ts           # Interactive finding review loop
   commands/
-    explore-command.ts    # xqa explore
-    spec-command.ts       # xqa spec
-    analyse-command.ts    # xqa analyse
-    review-command.ts     # xqa review
-    completion-command.ts # xqa completion
-  prompt-builder.ts       # builds the explorer system prompt from config
+    init-command.ts           # Project initialization
+    explore-command.ts        # Breadth-first exploration
+    spec-command.ts           # Spec-based exploration
+    review-command.ts         # Finding triage workflow
+    analyse-command.ts        # Video analysis
+    completion-command.ts     # Shell completion generation
+  core/
+    parse-verbose.ts          # Verbose flag parsing
+    completion-generator.ts   # Bash/zsh completion script generation
+    last-path.ts              # Last findings path tracking
+  shell/
+    app-context.ts            # Read app.md and explore.md
+    xqa-directory.ts          # Locate .xqa directory
+  __tests__/
+    *.test.ts                 # Test files co-located with src/
 ```
-The CLI is a thin shell over `@qa-agents/pipeline`. It parses env vars, builds a `PipelineConfig`, and calls `runPipeline()`.
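Reviewer note: the new README lists `SpecFrontmatterError` with the codes `MISSING_FRONTMATTER`, `MISSING_FIELD`, and `PARSE_ERROR`, but does not show how they arise. The following is a minimal sketch, not the package's actual `src/spec-frontmatter.ts`. The names `parseSpecFrontmatter`, `Result`, and the `type` discriminant are hypothetical. It assumes that only `feature` is required (as the bundled skill states) and reads plain `key: value` lines instead of using a real YAML parser.

```typescript
// Hypothetical sketch; the real implementation may differ in names and behaviour.
type SpecFrontmatter = { feature: string; entry?: string; timeout?: number };

// Discriminated union mirroring the three codes named in the README.
type SpecFrontmatterError =
  | { type: "MISSING_FRONTMATTER" }
  | { type: "MISSING_FIELD"; field: string }
  | { type: "PARSE_ERROR"; message: string };

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

function parseSpecFrontmatter(
  markdown: string,
): Result<SpecFrontmatter, SpecFrontmatterError> {
  // Frontmatter is the block between the first pair of `---` lines.
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (match === null) {
    return { ok: false, error: { type: "MISSING_FRONTMATTER" } };
  }

  // Simplification: flat `key: value` pairs only, with optional quotes.
  const fields = new Map<string, string>();
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    if (line.trim() === "") continue;
    const separator = line.indexOf(":");
    if (separator === -1) {
      return {
        ok: false,
        error: { type: "PARSE_ERROR", message: `Expected "key: value", got "${line}"` },
      };
    }
    const key = line.slice(0, separator).trim();
    const value = line
      .slice(separator + 1)
      .trim()
      .replace(/^(['"])(.*)\1$/, "$2");
    fields.set(key, value);
  }

  const feature = fields.get("feature");
  if (feature === undefined || feature === "") {
    return { ok: false, error: { type: "MISSING_FIELD", field: "feature" } };
  }

  const spec: SpecFrontmatter = { feature };
  const entry = fields.get("entry");
  if (entry !== undefined) spec.entry = entry;

  const rawTimeout = fields.get("timeout");
  if (rawTimeout !== undefined) {
    const timeout = Number(rawTimeout);
    if (!Number.isFinite(timeout) || timeout <= 0) {
      return {
        ok: false,
        error: { type: "PARSE_ERROR", message: "timeout must be a positive number of seconds" },
      };
    }
    spec.timeout = timeout;
  }
  return { ok: true, value: spec };
}
```

Returning a tagged result instead of throwing lets a caller handle each error code in a `switch` on `error.type`, which is the usual reason to model errors as a discriminated union.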

package/dist/skills/xqa-spec/AGENTS.md ADDED

@@ -0,0 +1,99 @@
+# xqa-spec
+## When to use
+- User runs `/xqa-spec` with a flow description
+- User implies spec authoring intent: "I want to test X", "write a spec for Y", "update the Z spec"
+Detect implied intent and self-activate without requiring explicit slash command.
+## Process
+```
+Explore → Detect mode → Interview (one question at a time) → Draft → Review → Write
+```
+IMPORTANT: Never generate a draft before the interview is complete. The user describes the spec; you transcribe it.
+### 1. Explore
+Silently scan `.xqa/specs/*.test.md`. Learn:
+- Naming conventions
+- Tag vocabulary
+- Level of detail and step granularity
+Also read `.xqa/app.md` if it exists for app context.
+### 2. Detect mode
+| Condition                | Mode                                            |
+| ------------------------ | ----------------------------------------------- |
+| Matching spec file found | Edit — read it, ask which sections to change    |
+| No match                 | Create — derive kebab-case filename from intent |
+In **edit mode**: ask which sections to change before doing anything. Modify only those sections; preserve everything else verbatim.
+### 3. Interview (create mode only)
+Ask one question at a time. Wait for the answer before asking the next. Prefer multiple choice when options are known.
+**Question sequence:**
+1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
+2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
+3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
+4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
+5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
+6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
+IMPORTANT: Ask each question in its own message. Never batch questions.
+### 4. Draft
+Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
+Steps and assertions come from the user — never invent them. Present the full draft for review.
+### 5. Review
+Show the draft. Ask: "Does this look right, or anything to change?"
+Iterate until approved. One round of changes per message.
+### 6. Write
+Save to `.xqa/specs/<name>.test.md` only after explicit approval.
+## File format
+```md
+---
+feature: <string>
+entry: <string>
+timeout: <seconds>
+---
+## Setup
+<preconditions and starting state>
+## Steps
+1. <action> → <expected outcome>
+2. <action>
+## Assertions
+- <global flow-level check>
+```
+Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
+## Rules
+- `## Setup` and `## Steps` are required; `## Assertions` is optional
+- Inline assertion syntax: `action → outcome` using the → character
+- Steps come from the user — never invent them
+- Write file only after explicit approval
+- In edit mode, ask before touching anything
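Reviewer note: the rules above define an inline assertion syntax, `action → outcome`, inside a numbered `## Steps` list. As an illustration only, a consumer of such a file could split steps roughly as follows. The function `parseSteps` and the `SpecStep` shape are hypothetical and do not appear in the package; the sketch assumes each step sits on a single numbered line.

```typescript
// Hypothetical sketch of reading the `## Steps` section of a *.test.md spec.
type SpecStep = { action: string; expected?: string };

function parseSteps(specBody: string): SpecStep[] {
  const lines = specBody.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === "## Steps");
  if (start === -1) return [];

  const steps: SpecStep[] = [];
  for (const line of lines.slice(start + 1)) {
    // The next level-2 heading (for example `## Assertions`) ends the section.
    if (line.startsWith("## ")) break;

    const numbered = /^\s*\d+\.\s+(.*)$/.exec(line);
    if (numbered === null) continue;

    const text = numbered[1] ?? "";
    const arrow = text.indexOf("→");
    if (arrow === -1) {
      // A step without an arrow carries no inline assertion.
      steps.push({ action: text.trim() });
    } else {
      steps.push({
        action: text.slice(0, arrow).trim(),
        expected: text.slice(arrow + 1).trim(),
      });
    }
  }
  return steps;
}
```

Keeping `expected` optional matches the file format, where the second example step has an action but no outcome.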

package/dist/skills/xqa-spec/SKILL.md ADDED

@@ -0,0 +1,125 @@
+---
+name: xqa-spec
+description: Create or edit *.test.md spec files in .xqa/specs/ through guided dialogue. Triggers on /xqa-spec or implied spec authoring intent ("I want to test X", "write a spec for Y", "update the Z spec").
+license: MIT
+---
+# xqa-spec
+## When to use
+- User runs `/xqa-spec` with a flow description
+- User implies spec authoring intent: "I want to test X", "write a spec for Y", "update the Z spec"
+Detect implied intent and self-activate without requiring explicit slash command.
+## Process
+```
+Explore → Detect mode → Interview (one question at a time) → Draft → Review → Write
+```
+IMPORTANT: Never generate a draft before the interview is complete. The user describes the spec; you transcribe it.
+### 1. Explore
+Silently scan `.xqa/specs/*.test.md`. Learn:
+- Naming conventions
+- Tag vocabulary
+- Level of detail and step granularity
+Also read `.xqa/app.md` if it exists for app context.
+### 2. Detect mode
+| Condition                | Mode                                            |
+| ------------------------ | ----------------------------------------------- |
+| Matching spec file found | Edit — read it, ask which sections to change    |
+| No match                 | Create — derive kebab-case filename from intent |
+In **edit mode**: ask which sections to change before doing anything. Modify only those sections; preserve everything else verbatim.
+### 3. Interview (create mode only)
+Ask one question at a time. Wait for the answer before asking the next. Prefer multiple choice when options are known.
+**Question sequence:**
+1. **What flow?** — Confirm what's being tested if not already clear. Suggest a filename and `feature` name.
+2. **Entry point** — "What's the navigation path to reach this flow?" (e.g., `App launch`, `Home > Wallet`) → becomes `entry:` frontmatter
+3. **Starting state** — "What's already set up? What state is the device/app in?" → becomes `## Setup`
+4. **Steps** — "Walk me through the steps, one at a time. I'll ask for the next when you're done." → collect each step, then ask "What should happen?" for the assertion (optional)
+5. **Global assertions** — "Any overall things that should be true at the end of the flow?" → becomes `## Assertions` (skip if none)
+6. **Timeout** — "Set a timeout in seconds? (optional, for long-running specs)" → becomes `timeout:` frontmatter (offer to skip)
+IMPORTANT: Ask each question in its own message. Never batch questions.
+### 4. Draft
+Assemble using ONLY these frontmatter fields: `feature`, `entry`, `timeout`. Do not add any other frontmatter field. `feature` MUST be present. `timeout` MUST be a positive number (seconds) if included.
+Steps and assertions come from the user — never invent them. Present the full draft for review.
+### 5. Review
+Show the draft. Ask: "Does this look right, or anything to change?"
+Iterate until approved. One round of changes per message.
+### 6. Write
+Before writing, verify the draft passes all checks:
+- [ ] `feature` is present and non-empty
+- [ ] frontmatter contains only permitted fields: `feature`, `entry`, `timeout`
+- [ ] `timeout` if present is a positive number in seconds (not a string, not zero)
+- [ ] `## Setup` section is present
+- [ ] `## Steps` section is present
+- [ ] No forbidden fields: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
+Fix any failure before writing. Save to `.xqa/specs/<name>.test.md` only after explicit approval.
+## File format
+FRONTMATTER SCHEMA — exact fields, exact types, no others:
+```
+feature    string           REQUIRED
+entry      string           OPTIONAL — omit if not provided
+timeout    positive number (seconds)  OPTIONAL — omit if not provided
+```
+FORBIDDEN frontmatter fields — never generate these: `tags`, `max_steps`, `priority`, `type`, `description`, `id`, `author`, `version`
+CANONICAL OUTPUT FORMAT:
+```md
+---
+feature: <string>
+entry: <string>
+timeout: <seconds>
+---
+## Setup
+<preconditions and starting state>
+## Steps
+1. <action> → <expected outcome>
+2. <action>
+## Assertions
+- <global flow-level check>
+```
+Omit `entry` and `timeout` lines if not provided. Omit `## Assertions` section if none.
+## Rules
+- Inline assertion syntax: `action → outcome` using the → character
+- Steps come from the user — never invent them
+- Write file only after explicit approval
+- In edit mode, ask before touching anything
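Reviewer note: the pre-write checklist in SKILL.md is written for an agent, but the same checks are easy to express mechanically. The sketch below is one possible reading of that checklist, assuming the frontmatter has already been parsed into a plain object. The function `checkDraft` and its failure messages are hypothetical and are not part of the package.

```typescript
// Hypothetical sketch of the SKILL.md pre-write checklist as a pure function.
const PERMITTED_FIELDS = ["feature", "entry", "timeout"];
const REQUIRED_SECTIONS = ["## Setup", "## Steps"];

function checkDraft(frontmatter: Record<string, unknown>, body: string): string[] {
  const failures: string[] = [];

  // `feature` is present and non-empty.
  const feature = frontmatter["feature"];
  if (typeof feature !== "string" || feature.trim() === "") {
    failures.push("feature must be present and non-empty");
  }

  // Only permitted fields; this also rejects every listed forbidden field.
  for (const key of Object.keys(frontmatter)) {
    if (!PERMITTED_FIELDS.includes(key)) {
      failures.push(`field not permitted: ${key}`);
    }
  }

  // `timeout`, if present, is a positive number (not a string, not zero).
  if ("timeout" in frontmatter) {
    const timeout = frontmatter["timeout"];
    if (typeof timeout !== "number" || !Number.isFinite(timeout) || timeout <= 0) {
      failures.push("timeout must be a positive number of seconds");
    }
  }

  // `## Setup` and `## Steps` headings are present in the body.
  const headings = body.split(/\r?\n/).map((line) => line.trim());
  for (const section of REQUIRED_SECTIONS) {
    if (!headings.includes(section)) {
      failures.push(`missing section: ${section}`);
    }
  }
  return failures;
}
```

An empty array means the draft passes. Checking against an allow-list makes the separate forbidden-field list redundant for validation, so the two lists in SKILL.md cannot drift apart in this form.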

package/dist/skills/xqa-spec/metadata.json ADDED

@@ -0,0 +1,5 @@
+{
+  "version": "1.0.0",
+  "organization": "Exodus Movement",
+  "abstract": "Guides QA engineers through creating and editing *.test.md spec files in .xqa/specs/ using a structured interview-first workflow. Asks one question at a time to extract setup, steps, and assertions from the user before drafting."
+}