npm - @exodus/xqa - Versions diffs - 3.0.1 → 5.0.0 - Mend

@exodus/xqa 3.0.1 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md +134 -8
package/dist/skills/xqa-test-plan/SKILL.md +211 -0
package/dist/xqa.cjs +64932 -59127
package/package.json +10 -4

package/README.md CHANGED

@@ -47,7 +47,7 @@ xqa explore --udid ABCD1234          # target a specific booted simulator
 Flags:
 - `-v, --verbose [categories]` — Log categories (prompt, tools, screen, memory). Default: all if flag is present without value.
-- `-t, --timeout <seconds>` — Explorer timeout in seconds (overrides `QA_EXPLORE_TIMEOUT_SECONDS`).
+- `-t, --timeout <seconds>` — Explorer timeout in seconds (overrides `agents.explorer.timeoutSeconds` in `.xqa/config.yaml`).
 - `--debug` — Log timing and event details to stderr.
 - `--udid <id>` — Target simulator UDID. Overrides auto-detect of first booted; exits with code 2 if the UDID is not booted.
@@ -104,11 +104,40 @@ Flags:
 - `--debug` — Log timing and event details to stderr.
 - `--udid <id>` — Target simulator UDID. When supplied, the suite is constrained to that one simulator; exits with code 2 if the UDID is not booted.
+### plan
+Generate or evolve the manual test plan for the current branch.
+Inspects the git diff between the current branch and its upstream, asks the planner agent to emit Markdown scenario specs, and writes them to `.xqa/test-plan/default/` (or a custom directory). Subcommands let you refine individual scenarios, append new scenarios after fresh commits, and correlate findings from a run against the plan.
+```bash
+xqa plan                                          # generate scenarios from current diff
+xqa plan --intent "login changes" --out .xqa/test-plan/my-slug
+xqa plan edit .xqa/test-plan/my-slug/scenario-1.test.md --feedback "rename step 2"
+xqa plan extend                                   # append scenarios for fresh commits
+xqa plan report --findings .xqa/output/.../findings.json --specs .xqa/test-plan/my-slug
+```
+Flags:
+- `--intent <text>` — Optional focus hint passed to the planner.
+- `--out <dir>` — Output directory for the generated scenarios (default: `<xqa>/test-plan/default`).
+Subcommands:
+- `xqa plan edit <file> --feedback <text>` — apply user-requested edits to an existing scenario spec.
+- `xqa plan extend [--intent <text>] [--out <dir>]` — append new scenarios for commits since the last plan was generated.
+- `xqa plan report --findings <path> [--specs <dir>]` — correlate findings with scenarios and write `report.json` next to the plan.
+What does it do?
+`xqa plan` reads the branch diff, summarizes it, and feeds the context to the planner agent, which emits one Markdown scenario spec per suggested flow. The specs are written to the plan directory so you can review or hand them to `xqa run`. After running the scenarios, `xqa plan report` correlates the resulting findings back to each scenario so you can see which flows passed, which surfaced issues, and which were skipped. `xqa plan edit` lets you nudge a single scenario with natural-language feedback; `xqa plan extend` picks up commits added after the initial generation and appends new scenarios without touching the existing ones.
 ### review [findings-path]
 Review findings and mark false positives.
-Interactive session for triaging findings generated by explore or spec runs. Mark findings as dismissed (with optional reason) or undo previous dismissals. Dismissals are written to `dismissals.json` next to the `.xqa` directory (override with `QA_DISMISSALS_PATH`). Defaults to the last findings path if omitted.
+Interactive session for triaging findings generated by explore or spec runs. Mark findings as dismissed (with optional reason) or undo previous dismissals. Dismissals are written to `dismissals.json` next to the `.xqa` directory (override with `run.dismissalsPath` in `.xqa/config.yaml`). Defaults to the last findings path if omitted.
 ```bash
 xqa review                                     # use last findings file
@@ -178,14 +207,111 @@ Contract:
 ## Configuration
-Configuration is loaded from environment variables and `.env.local`:
+Configuration splits in two: non-sensitive runtime settings in `.xqa/config.yaml`, secrets in the environment.
+### `.xqa/config.yaml`
+`xqa init` writes this file with sensible defaults. It's the canonical home for agent toggles and tunables:
+```yaml
+version: 1
+run:
+  # id: my-run
+  # dismissalsPath: .xqa/dismissals.json
+suites:
+  directory: .xqa/suites
+agents:
+  explorer:
+    enabled: true
+    timeoutSeconds: 600
+    buildEnv: dev
+    capabilities:
+      videoRecording: false
+      viewUiServer: true
+      findingScreenshots: true
+  analyser:
+    enabled: true
+  inspector:
+    enabled: true
+    designsDirectory: .xqa/designs
+  consolidator:
+    enabled: true
+  triager:
+    enabled: false
+```
+Field reference:
+| Field                                             | Default                | Description                                                           |
+| ------------------------------------------------- | ---------------------- | --------------------------------------------------------------------- |
+| `version`                                         | `1`                    | Config schema version.                                                |
+| `run.id`                                          | _(auto)_               | Fixed run ID. Omit for sequential per-run IDs.                        |
+| `run.dismissalsPath`                              | `.xqa/dismissals.json` | Where `xqa review` persists dismissals.                               |
+| `suites.directory`                                | `.xqa/suites`          | Directory containing `*.suite.json` files.                            |
+| `agents.explorer.enabled`                         | `true`                 | Runs the explorer agent.                                              |
+| `agents.explorer.timeoutSeconds`                  | `600`                  | Wall-clock limit per explore/spec run.                                |
+| `agents.explorer.buildEnv`                        | `dev`                  | `dev` or `prod`. `dev` ignores debug overlays as findings.            |
+| `agents.explorer.capabilities.videoRecording`     | `false`                | Records the simulator screen to MP4.                                  |
+| `agents.explorer.capabilities.viewUiServer`       | `true`                 | Registers the `view_ui` MCP tool for reading the UI tree.             |
+| `agents.explorer.capabilities.findingScreenshots` | `true`                 | Writes per-finding PNGs.                                              |
+| `agents.analyser.enabled`                         | `true`                 | Runs the Gemini video analyser. Needs `GOOGLE_GENERATIVE_AI_API_KEY`. |
+| `agents.inspector.enabled`                        | `true`                 | Runs the visual diff inspector.                                       |
+| `agents.inspector.designsDirectory`               | `.xqa/designs`         | Reference artboards for visual diffing.                               |
+| `agents.consolidator.enabled`                     | `true`                 | Merges and deduplicates findings from every agent.                    |
+| `agents.triager.enabled`                          | `false`                | Runs the PR suite matcher. Needs `GITHUB_TOKEN`.                      |
+### Capabilities
+Each agent has a `capabilities` block of opt-in feature flags. Enabling a capability doesn't enable the agent — both `enabled: true` and `capabilities.<name>: true` are required.
+The explorer's `videoRecording` capability used to auto-turn-on whenever `GOOGLE_GENERATIVE_AI_API_KEY` was set. It's now independent: you can record video without running the analyser, and vice versa.
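The two-switch rule described above can be sketched in a few lines. This is an illustration only, not xqa's actual resolver: the function name `capability_active` is hypothetical, and missing keys are treated as `False` here rather than falling back to the documented defaults.

```python
def capability_active(config: dict, agent: str, capability: str) -> bool:
    """Return True only when the agent AND the capability are both switched on."""
    agent_cfg = config.get("agents", {}).get(agent, {})
    if not agent_cfg.get("enabled", False):
        # A capability flag on a disabled agent has no effect.
        return False
    return bool(agent_cfg.get("capabilities", {}).get(capability, False))


config = {
    "agents": {
        "explorer": {"enabled": True, "capabilities": {"videoRecording": True}},
        "analyser": {"enabled": False},
    }
}
print(capability_active(config, "explorer", "videoRecording"))  # True
```

Note that the analyser being disabled does not affect the result, which matches the decoupling of `videoRecording` from the analyser.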
+### Environment variables
+Secrets stay in `.env.local` (loaded by dotenv) or your shell. Lock the file down:
+```bash
+chmod 600 .env.local
+```
 - `ANTHROPIC_API_KEY` (required) — Anthropic Claude API key for agent reasoning
-- `GOOGLE_GENERATIVE_AI_API_KEY` (optional) — Google Generative AI key for video analysis
-- `QA_RUN_ID` (optional) — Custom run identifier; defaults to auto-generated
-- `QA_EXPLORE_TIMEOUT_SECONDS` (optional) — Exploration timeout in seconds
-- `QA_BUILD_ENV` (optional) — Build environment: `dev` or `prod` (default: prod)
-- `QA_DISMISSALS_PATH` (optional) — Override the dismissals file path used by `xqa review`
+- `GOOGLE_GENERATIVE_AI_API_KEY` (optional) — Gemini key for the analyser and `xqa analyse`
+- `GITHUB_TOKEN` (optional) — required for `xqa triage`
+### Example: explorer only, with video recording
+No Gemini key? Record video without the analyser:
+```yaml
+agents:
+  explorer:
+    enabled: true
+    capabilities:
+      videoRecording: true
+  analyser:
+    enabled: false
+```
+`videoRecording` is decoupled from the analyser, so this works without `GOOGLE_GENERATIVE_AI_API_KEY`.
+### Migration from legacy env vars
+Legacy `QA_*` and `XQA_*` environment variables are rejected at startup with a `LEGACY_ENV_DETECTED` error. Move their values into `.xqa/config.yaml`:
+| Legacy env var               | New config path                  |
+| ---------------------------- | -------------------------------- |
+| `QA_RUN_ID`                  | `run.id`                         |
+| `QA_EXPLORE_TIMEOUT_SECONDS` | `agents.explorer.timeoutSeconds` |
+| `QA_BUILD_ENV`               | `agents.explorer.buildEnv`       |
+| `QA_DISMISSALS_PATH`         | `run.dismissalsPath`             |
+| `XQA_SUITES_DIR`             | `suites.directory`               |
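As a rough aid for the migration, the mapping table above can be turned into a nested structure ready to be written out as YAML. This is a sketch, not a tool shipped with the package: `migrate_legacy_env` is a hypothetical helper, and converting the timeout to an integer is an assumption based on the `timeoutSeconds: 600` default.

```python
LEGACY_TO_CONFIG = {
    "QA_RUN_ID": "run.id",
    "QA_EXPLORE_TIMEOUT_SECONDS": "agents.explorer.timeoutSeconds",
    "QA_BUILD_ENV": "agents.explorer.buildEnv",
    "QA_DISMISSALS_PATH": "run.dismissalsPath",
    "XQA_SUITES_DIR": "suites.directory",
}


def migrate_legacy_env(env: dict) -> dict:
    """Build a nested config dict from any legacy variables present in env."""
    config: dict = {}
    for name, dotted_path in LEGACY_TO_CONFIG.items():
        if name not in env:
            continue
        value = env[name]
        if name == "QA_EXPLORE_TIMEOUT_SECONDS":
            value = int(value)  # assumption: the YAML field is numeric
        *parents, leaf = dotted_path.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return config


print(migrate_legacy_env({"QA_RUN_ID": "nightly", "QA_EXPLORE_TIMEOUT_SECONDS": "900"}))
```

Passing `os.environ` as `env` would collect whatever legacy values are still set in the current shell.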
 ## Architecture

package/dist/skills/xqa-test-plan/SKILL.md ADDED Viewed

@@ -0,0 +1,211 @@
+---
+name: xqa-test-plan
+description: Generate a manual test plan from the current branch diff, approve it, run it via xqa, and see findings inline. Triggers on /xqa-test-plan or implied intent ("what should I test before pushing?", "generate a test plan for my changes", "QA my branch").
+license: MIT
+---
+# xqa-test-plan
+## When to use
+- User runs `/xqa-test-plan`
+- User says "what should I test", "generate a test plan", "QA my branch", "test my changes before pushing", "what should I QA?"
+- Self-activate on implied intent when the user is asking for pre-push manual verification on the current branch
+## Process
+```
+Detect state → Generate → Approve → Run → Report → Re-run / Regenerate / Extend
+```
+IMPORTANT: The skill orchestrates the CLI; it never writes `.test.md` files directly. Every spec mutation goes through `xqa plan` or `xqa plan edit`.
+## Detect state
+Resolve the current branch and its plan directory before anything else.
+```bash
+git symbolic-ref --short HEAD 2>/dev/null || git rev-parse --short HEAD
+```
+- Named branch → use the name as the slug source.
+- Detached HEAD → slug becomes `sha-<first-7-chars>`.
+Slug rules (mirrors the planner's `branchToSlug`):
+| Input fragment      | Slug output   |
+| ------------------- | ------------- |
+| `a-z`, `A-Z`, `0-9` | preserved     |
+| `.`, `_`, `-`       | preserved     |
+| any other char      | `-`           |
+| consecutive `-`     | collapsed `-` |
+Plan directory: `.xqa/test-plan/<slug>/`.
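The slug table above translates into two regular-expression substitutions. The following is a minimal sketch written from the table alone; the planner's real `branchToSlug` is not shown in this diff and may differ in edge cases. The `detached` parameter is an illustrative addition for the detached-HEAD rule.

```python
import re


def branch_to_slug(ref: str, detached: bool = False) -> str:
    """Map a branch name (or commit SHA when detached) to a plan directory slug."""
    if detached:
        # Detached HEAD: slug is "sha-" plus the first 7 characters of the SHA.
        return f"sha-{ref[:7]}"
    # Keep letters, digits, '.', '_' and '-'; replace everything else with '-'.
    slug = re.sub(r"[^A-Za-z0-9._-]", "-", ref)
    # Collapse runs of '-' into a single '-'.
    return re.sub(r"-{2,}", "-", slug)


print(branch_to_slug("feature/login fix"))          # feature-login-fix
print(branch_to_slug("abcdef1234", detached=True))  # sha-abcdef1
```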
+### Auto-prune stale siblings
+List every child of `.xqa/test-plan/`. For each sibling directory:
+1. Decode the slug back to an approximate branch name. WARNING: slug decoding is ambiguous (dashes could have been slashes or other chars). Treat the approximation as best-effort.
+2. Probe the local branch list:
+   ```bash
+   git branch --list '<approximate-name>'
+   ```
+3. If the probe is empty, the directory is stale — `rm -rf` it.
+IMPORTANT: Never prune the current branch's slug directory, even if the approximate-name probe looks empty.
+IMPORTANT: Auto-prune only removes directories whose approximation has NO matching local branch. When in doubt, leave it.
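Because decoding a slug is ambiguous, one alternative is to go the other way: slugify every local branch and compare the results against the sibling directory names. The sketch below uses that swapped-in approach rather than the decode-and-probe steps listed above, so treat it as an illustration. `stale_plan_dirs` is a hypothetical helper, and the slug function is re-derived from the slug table, not taken from the planner.

```python
import re


def branch_to_slug(branch: str) -> str:
    """Slug rules from the table: keep [A-Za-z0-9._-], dash the rest, collapse dashes."""
    return re.sub(r"-{2,}", "-", re.sub(r"[^A-Za-z0-9._-]", "-", branch))


def stale_plan_dirs(sibling_dirs, local_branches, current_slug):
    """Return sibling directory names that match no local branch slug."""
    live = {branch_to_slug(branch) for branch in local_branches}
    live.add(current_slug)  # never prune the current branch's directory
    return sorted(name for name in sibling_dirs if name not in live)


print(stale_plan_dirs(
    ["main", "feature-login", "old-experiment"],
    ["main", "feature/login"],
    current_slug="feature-login",
))  # ['old-experiment']
```

The function only reports candidates; deleting them stays a separate, explicit step, in keeping with "when in doubt, leave it".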
+### Existing plan detection
+Check `.xqa/test-plan/<slug>/` for `*.test.md` files.
+| State                 | Next action                                                |
+| --------------------- | ---------------------------------------------------------- |
+| No specs              | Proceed to Generate flow                                   |
+| Specs already present | Offer three choices: **Rerun**, **Regenerate**, **Extend** |
+## Generate flow
+### 1. Summarize intent
+Scan the last ~5 turns of chat history. Compress the user's stated goal into a single sentence suitable for `--intent`. If no intent emerges, pass an empty string.
+### 2. Detect booted simulators
+```bash
+xcrun simctl list devices booted --json
+```
+Count booted devices:
+| Count | Behavior                                                                   |
+| ----- | -------------------------------------------------------------------------- |
+| 0     | STOP — tell user to boot a simulator (`xcrun simctl boot <udid>`) and wait |
+| 1     | Auto-select its UDID                                                       |
+| >1    | Ask user to pick one by name + UDID                                        |
+Remember the chosen UDID for the Run step.
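The count table above can be sketched as a small decision function over the command's JSON output. This assumes the output has a top-level `devices` object keyed by runtime identifier, each holding a list of entries with `name`, `udid`, and `state` fields; `choose_simulator` and its return labels are hypothetical.

```python
import json


def choose_simulator(simctl_json: str):
    """Return ("stop" | "auto" | "ask", details) for the booted-device count."""
    booted = [
        device
        for runtime_devices in json.loads(simctl_json).get("devices", {}).values()
        for device in runtime_devices
        if device.get("state") == "Booted"
    ]
    if not booted:
        return ("stop", [])  # the user has to boot a simulator first
    if len(booted) == 1:
        return ("auto", [booted[0]["udid"]])
    # More than one: offer name + UDID so the user can pick.
    return ("ask", [f'{d["name"]} ({d["udid"]})' for d in booted])
```

In practice the input string would come from running the `xcrun simctl` command above, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.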
+### 3. Invoke planner
+```bash
+xqa plan --intent "<one-sentence summary>" --out .xqa/test-plan/<slug>
+```
+### 4. Render approval checklist
+- Parse planner stdout as JSON; extract the `specs[]` array.
+- Read each `<path>.test.md` that the planner wrote.
+- Extract the step intents (first line of each numbered step, before `→` or `[hint:`).
+- Render a numbered checklist, one line per scenario, showing scenario title and 1-3 key steps.
+Ask: "Does this cover what you want to QA? Reply **approve** / **run it** / **looks good** to proceed, or describe edits."
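Extracting step intents for the checklist can be sketched as follows. The exact layout of a `.test.md` spec is not shown in this diff, so the sketch assumes steps are lines of the form `1. <intent> → <expectation>` or `1. <intent> [hint: ...]`; `step_intents` is a hypothetical helper.

```python
import re

# Assumed step shape: optional indent, a number, a dot, then the step text.
STEP = re.compile(r"^\s*\d+\.\s+(.*)$")


def step_intents(spec_text: str) -> list[str]:
    """Return the intent of each numbered step, cut before '→' or '[hint:'."""
    intents = []
    for line in spec_text.splitlines():
        match = STEP.match(line)
        if not match:
            continue
        intent = re.split(r"→|\[hint:", match.group(1), maxsplit=1)[0]
        intents.append(intent.strip())
    return intents


sample = (
    "# Login\n"
    "1. Open the app → splash screen shows\n"
    "2. Tap Sign in [hint: top right]\n"
    "3. Enter credentials\n"
)
print(step_intents(sample))  # ['Open the app', 'Tap Sign in', 'Enter credentials']
```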
+## Approval loop
+| User reply                                 | Skill action                                              |
+| ------------------------------------------ | --------------------------------------------------------- |
+| "approve" / "run it" / "looks good" / "go" | Exit loop → proceed to Run                                |
+| Edit request targeting one scenario        | `xqa plan edit <path> --feedback "<verbatim user words>"` |
+| Edit request spanning multiple scenarios   | Issue one `xqa plan edit` call per affected file          |
+| Ambiguous request                          | Ask one clarifying question; do not invoke edit           |
+After every edit call, re-read the updated spec and re-render the checklist. Loop until approval.
+IMPORTANT: Never hand-edit `.test.md` files. Every change flows through `xqa plan edit`.
+## Run
+### 1. Re-verify simulator
+Confirm the UDID chosen during Generate is still booted:
+```bash
+xcrun simctl list devices booted --json
+```
+If it is no longer booted, go back to the simulator selection step.
+### 2. Preflight screenshot
+```bash
+xcrun simctl io <udid> screenshot /tmp/preflight.png
+```
+| Outcome | Next                                                            |
+| ------- | --------------------------------------------------------------- |
+| Success | Proceed to Run dispatch                                         |
+| Error   | Pick a different booted simulator or abort with a clear message |
+### 3. Dispatch
+```bash
+xqa run --spec '.xqa/test-plan/<slug>/*.test.md' --udid <udid>
+```
+Capture the findings path from the `Run complete. Findings: <path>` line in stdout.
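Capturing the path can be sketched with a single anchored regular expression. Only the `Run complete. Findings: <path>` line format comes from the text above; `findings_path` is a hypothetical helper name.

```python
import re

FINDINGS_LINE = re.compile(r"^Run complete\. Findings: (?P<path>.+)$", re.MULTILINE)


def findings_path(stdout: str):
    """Return the findings path reported by the run, or None if the line is absent."""
    match = FINDINGS_LINE.search(stdout)
    return match.group("path").strip() if match else None
```

Returning `None` instead of raising lets the caller decide whether a missing line means the run failed or the output format changed.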
+## Move artifacts
+`xqa run` writes findings + screenshots into a timestamped output directory under its own roots. Relocate them alongside the plan:
+```bash
+mkdir -p .xqa/test-plan/<slug>/runs/<iso-timestamp>
+mv <findings-path>                     .xqa/test-plan/<slug>/runs/<iso-timestamp>/findings.json
+mv "$(dirname <findings-path>)/shots"  .xqa/test-plan/<slug>/runs/<iso-timestamp>/shots
+```
+Use an ISO-8601 UTC timestamp (e.g. `2026-04-22T15-30-00Z`) as the directory name. Use `mv`, not `cp` — the plan directory owns the canonical copy.
+WARNING: After moving, all downstream steps reference the new paths under `.xqa/test-plan/<slug>/runs/`.
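The timestamp format in the example (`2026-04-22T15-30-00Z`) replaces the colons of standard ISO-8601 with dashes, which keeps the name filesystem-safe. A minimal sketch of producing it, with `run_dir_name` as a hypothetical helper:

```python
from datetime import datetime, timezone


def run_dir_name(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC directory name like 2026-04-22T15-30-00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")


print(run_dir_name(datetime(2026, 4, 22, 15, 30, 0, tzinfo=timezone.utc)))  # 2026-04-22T15-30-00Z
```

For the current time, pass `datetime.now(timezone.utc)`. A naive datetime would be interpreted as local time by `astimezone`, so an aware value is the safer input.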
+## Report
+```bash
+xqa plan report \
+  --findings .xqa/test-plan/<slug>/runs/<iso-timestamp>/findings.json \
+  --specs .xqa/test-plan/<slug>
+```
+Parse stdout JSON as `CorrelatedReport`. Render grouped output:
+| Group                    | Rendering                                                                                  |
+| ------------------------ | ------------------------------------------------------------------------------------------ |
+| `has_findings` scenarios | Strikethrough title + fail marker. Inline each finding description. Link screenshot paths. |
+| `not_run` scenarios      | Muted/grey. Prefix with "(no findings referenced)" — do not mark as passed.                |
+| Unmatched findings       | Separate "Findings without a scenario" section at the end.                                 |
+IMPORTANT: `not_run` does NOT mean "passed". It means the agent did not emit a finding that correlated to this scenario. We literally don't know the outcome. Do not print a green check.
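The two-state rendering rule can be sketched as below. The shape of `CorrelatedReport` is not shown in this diff, so the function takes a title, a status string, and a list of finding descriptions as hypothetical inputs. Only the status names `has_findings` and `not_run` and the rendering rules come from the table above.

```python
def render_scenario(title: str, status: str, findings: list[str]) -> list[str]:
    """Render one scenario; not_run is shown as unknown, never as passed."""
    if status == "has_findings":
        lines = [f"~~{title}~~ FAIL"]
        lines += [f"  - {text}" for text in findings]
        return lines
    if status == "not_run":
        # Deliberately no check mark: the outcome is unknown.
        return [f"(no findings referenced) {title}"]
    raise ValueError(f"unexpected status: {status}")


print(render_scenario("Login with bad password", "has_findings", ["Error banner missing"]))
print(render_scenario("Logout", "not_run", []))
```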
+## Rerun / Regenerate / Extend
+| Mode       | Action                                                                                          |
+| ---------- | ----------------------------------------------------------------------------------------------- |
+| Rerun      | Reuse existing `.test.md` files. Go straight to Run → Move artifacts → Report.                  |
+| Regenerate | `rm .xqa/test-plan/<slug>/*.test.md` (specs only — preserve `runs/`). Re-run Generate flow.     |
+| Extend     | `xqa plan extend --out .xqa/test-plan/<slug>`. Planner appends new scenarios for fresh commits. |
+IMPORTANT: Regenerate never deletes `runs/`. Run history is preserved across regenerations.
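The Regenerate rule (remove specs, keep `runs/`) can be sketched with a non-recursive glob, which by construction cannot reach into subdirectories. `clear_specs` is a hypothetical helper, and the demo works in a temporary directory instead of a real plan directory.

```python
import tempfile
from pathlib import Path


def clear_specs(plan_dir: Path) -> list[str]:
    """Delete top-level *.test.md files only; subdirectories such as runs/ are untouched."""
    removed = []
    for spec in sorted(plan_dir.glob("*.test.md")):
        spec.unlink()
        removed.append(spec.name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    plan = Path(tmp)
    (plan / "scenario-1.test.md").write_text("spec")
    (plan / "runs").mkdir()
    (plan / "runs" / "findings.json").write_text("{}")
    removed = clear_specs(plan)
    survivors = sorted(p.name for p in plan.iterdir())

print(removed, survivors)  # ['scenario-1.test.md'] ['runs']
```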
+## Guardrails
+- IMPORTANT: The skill NEVER writes `.test.md` files directly. Only `xqa plan` and `xqa plan edit` produce or modify them. `scenarioId` and meta injection live in the planner — hand-authoring breaks correlation.
+- IMPORTANT: The skill NEVER computes finding correlation. Only `xqa plan report` does. Do not write ad-hoc string matchers.
+- WARNING: The skill NEVER modifies another branch's slug directory. Auto-prune only removes directories whose local branch is gone AND which are not the current branch.
+- WARNING: The skill NEVER edits artifacts under `.xqa/test-plan/<slug>/runs/`. Those are immutable run records.
+- IMPORTANT: The skill NEVER claims `not_run` scenarios passed. Report them as unknown, not green.
+## Manual test plan
+- [ ] Skill activates on `/xqa-test-plan`
+- [ ] Skill activates on implied intent ("what should I QA?")
+- [ ] Detect state correctly computes slug for current branch
+- [ ] Detect state auto-prunes stale sibling dirs but never current
+- [ ] Generate flow handles 0 booted simulators (error), 1 (auto), >1 (prompt)
+- [ ] Generate flow passes `--intent` correctly
+- [ ] Approval loop calls `xqa plan edit` per file
+- [ ] Run flow preflights simulator before dispatching
+- [ ] Report flow renders two-state status (has_findings / not_run) correctly
+- [ ] Rerun doesn't regenerate specs
+- [ ] Regenerate wipes specs before re-invoking xqa plan
+- [ ] Extend appends scenario-N+1 without re-writing existing scenarios