@exodus/xqa 5.0.0 → 5.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +3 -0
- package/dist/skills/xqa-test-plan/SKILL.md +720 -53
- package/dist/skills/xqa-test-plan/SKILL.test.md +122 -0
- package/dist/xqa.cjs +1195 -770
- package/package.json +3 -3
@@ -0,0 +1,122 @@
+# xqa-test-plan — manual test plan
+
+Manual verification checklist for the `xqa-test-plan` skill. Run when adding/changing skill behavior.
+
+## Activation
+
+- [ ] Skill activates on `/xqa-test-plan`
+- [ ] Skill activates on implied intent ("what should I QA?")
+
+## Detect state
+
+- [ ] Detect state correctly computes slug for current branch
+- [ ] Detect state auto-prunes stale sibling dirs but never current
+- [ ] Auto-prune iterates branches forward (applies `branchToSlug`); never tries to invert slug
+- [ ] Auto-prune leaves directory when slug-match is uncertain
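Reviewer note: the forward-only pruning rule in the Detect state items can be sketched as below. This is a minimal illustration, not the packaged implementation. The slug rule inside `branchToSlug` is a guess (only the function name appears in the checklist), and `dirsToPrune` with its `null` convention for "branch list unavailable" is hypothetical.

```typescript
// Hypothetical slug rule for illustration; the real `branchToSlug` may differ.
function branchToSlug(branch: string): string {
  return branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every run of other characters into one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

// Returns the sibling directories that look safe to prune.
// `branches` is null when the branch list could not be read; in that case
// every match is uncertain, so nothing is pruned.
function dirsToPrune(
  dirs: string[],
  branches: string[] | null,
  currentBranch: string,
): string[] {
  if (branches === null) return [];
  // Forward direction only: slugify each branch, never reverse a directory name.
  const liveSlugs = new Set(branches.map((b) => branchToSlug(b)));
  liveSlugs.add(branchToSlug(currentBranch)); // the current slug is never pruned
  return dirs.filter((dir) => !liveSlugs.has(dir));
}

// Example: only the directory with no matching branch is returned.
console.log(dirsToPrune(["feat-login", "old-branch"], ["feat/login"], "main"));
```

Slugging is lossy under this rule (`feat/login` and `feat-login` produce the same slug), which is one plausible reason the checklist forbids inverting a slug back into a branch name.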
+
+## Generate flow
+
+- [ ] Handles 0 booted simulators (error), 1 (auto), >1 (prompt)
+- [ ] Uses AskUserQuestion (or platform fallback) for simulator selection when >1 booted
+- [ ] Passes `--intent` correctly
+- [ ] Detects base ref from open PR (`gh pr view`)
+- [ ] Falls back to upstream tracking when no PR exists
+- [ ] Omits `--base` + emits broader-diff warning when both detection methods fail
+- [ ] Never fabricates `origin/HEAD` fallback claim
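Reviewer note: the simulator branching and the base-ref fallback chain in the Generate flow items amount to two small decision functions. The sketch below assumes the probe results are already available as plain values; `chooseSimulator`, `resolveBase`, the result shapes, and the warning text are all hypothetical names chosen for illustration.

```typescript
type SimulatorChoice =
  | { kind: "error"; message: string }
  | { kind: "auto"; udid: string }
  | { kind: "prompt"; options: string[] };

// 0 booted simulators is an error, 1 is picked automatically, more than 1 needs a prompt.
function chooseSimulator(booted: string[]): SimulatorChoice {
  const first = booted[0];
  if (first === undefined) return { kind: "error", message: "No booted simulator found" };
  if (booted.length === 1) return { kind: "auto", udid: first };
  return { kind: "prompt", options: booted };
}

interface BaseResolution {
  args: string[]; // either ["--base", ref] or empty when no ref was detected
  warning: string | null;
}

// Prefer the open PR's base, then the upstream tracking ref.
// When both are missing, omit `--base` and warn; no other fallback is invented.
function resolveBase(prBase: string | null, upstream: string | null): BaseResolution {
  const ref = prBase ?? upstream;
  if (ref !== null) return { args: ["--base", ref], warning: null };
  return {
    args: [],
    warning: "No base ref detected; the diff may be broader than this branch's changes.",
  };
}

console.log(chooseSimulator(["SIM-A", "SIM-B"]).kind); // "prompt"
console.log(resolveBase(null, "origin/dev").args); // ["--base", "origin/dev"]
```

Keeping the "no base" case as an empty argument list plus a warning mirrors the checklist item that forbids claiming an `origin/HEAD` fallback.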
+
+## Approval
+
+- [ ] Approval loop calls `xqa plan edit` per file
+- [ ] Emits Run-go gate after plan approval for all scenarios (never skips; plan approval ≠ run approval)
+- [ ] Run-go gate question references expected app state (first scenario's precondition label)
+- [ ] Emits flat `- [ ]` checklist for approval (not numbered scenario+steps)
+
+## Run
+
+- [ ] Preflights simulator before dispatching
+- [ ] Setup coordination asks per-group before dispatching when scenarios have divergent setups
+- [ ] Batches scenarios with identical setup without re-asking
+- [ ] Ordering puts non-destructive setups before no-wallet/delete-wallet setups
+- [ ] Cancellation handles abort/skip/skip all/rerun
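Reviewer note: the batching and ordering items in the Run section can be read as "group by setup, then put destructive groups last". The sketch below is one way to express that, assuming each scenario carries a setup label and a destructive flag. The `Scenario` shape, `groupBySetup`, and the `funded-wallet` label are hypothetical; only `no-wallet` comes from the checklist.

```typescript
interface Scenario {
  name: string;
  setup: string; // setup label, e.g. "no-wallet"
  destructive: boolean; // true for setups that wipe state (no-wallet, delete-wallet)
}

interface SetupGroup {
  setup: string;
  destructive: boolean;
  scenarios: string[];
}

// Batch scenarios that share a setup, then order non-destructive groups first.
function groupBySetup(scenarios: Scenario[]): SetupGroup[] {
  const groups = new Map<string, SetupGroup>();
  for (const s of scenarios) {
    const existing = groups.get(s.setup);
    if (existing) {
      existing.scenarios.push(s.name); // same setup: join the batch, no second question
    } else {
      groups.set(s.setup, { setup: s.setup, destructive: s.destructive, scenarios: [s.name] });
    }
  }
  // Array.prototype.sort is stable in modern engines, so groups keep
  // their first-seen order within each destructive/non-destructive half.
  return Array.from(groups.values()).sort(
    (a, b) => Number(a.destructive) - Number(b.destructive),
  );
}

const ordered = groupBySetup([
  { name: "Onboarding", setup: "no-wallet", destructive: true },
  { name: "Send", setup: "funded-wallet", destructive: false },
  { name: "Swap", setup: "funded-wallet", destructive: false },
]);
console.log(ordered.map((g) => g.setup)); // ["funded-wallet", "no-wallet"]
```

With groups in hand, a per-group question would be asked once per entry in the returned array, which matches the "asks per-group" and "without re-asking" items.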
+
+## Gates
+
+- [ ] Does not emit free-text "Reply X / Y / Z" prompts at any decision gate
+- [ ] Uses AskUserQuestion for plan approval gate
+- [ ] Uses AskUserQuestion for run-go gate
+- [ ] Uses AskUserQuestion for sim-state question (multi-profile case)
+- [ ] Uses AskUserQuestion for transition-ready prompts between groups
+- [ ] Uses AskUserQuestion for destructive-delete seed-backup confirmation
+- [ ] Uses AskUserQuestion for existing-plan Rerun/Regenerate/Extend choice
+- [ ] Uses AskUserQuestion for post-interruption Resume/Report/Abort choice
+- [ ] Still accepts free-form replies (AskUserQuestion's "Other" path)
+- [ ] Detects `AskUserQuestion` availability at conversation start
+- [ ] When available: uses AskUserQuestion at all 11 gates
+- [ ] When unavailable: uses structured platform-fallback format at all 11 gates
+- [ ] Fallback format never degenerates into loose "Reply X / Y / describe" prompts
+
+## Report
+
+- [ ] Renders `passed` scenarios with green check (no strikethrough)
+- [ ] Renders `failed` scenarios with strikethrough + inline finding + screenshot link
+- [ ] Renders `not_run` scenarios with outcome verb (errored/timed_out/aborted) or "(no run record)"
+- [ ] Opens with correct sentence based on bucket distribution
+- [ ] Footer summary line shows counts when >1 scenario
+- [ ] Uses correct `xqa plan report` flags (`--findings`, `--specs`) with default paths from `.xqa/test-plan/<slug>/`
+- [ ] Omits `--runs` by default (`scenario-runs.json` resolved from same dir as `findings.json`)
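Reviewer note: the three rendering buckets in the Report items map naturally onto a discriminated union. The sketch below shows one possible rendering. The exact line layout, the `ScenarioResult` shape, and the footer wording are assumptions; the checklist only fixes the green check, the strikethrough with finding and screenshot link, the outcome verbs, and the "(no run record)" fallback.

```typescript
type Outcome = "errored" | "timed_out" | "aborted";

type ScenarioResult =
  | { status: "passed"; title: string }
  | { status: "failed"; title: string; finding: string; screenshot: string }
  | { status: "not_run"; title: string; outcome?: Outcome };

// Render one scenario line according to its bucket.
function renderLine(r: ScenarioResult): string {
  if (r.status === "passed") {
    return `✅ ${r.title}`; // green check, no strikethrough
  }
  if (r.status === "failed") {
    // strikethrough title, inline finding, screenshot link
    return `~~${r.title}~~ ${r.finding} ([screenshot](${r.screenshot}))`;
  }
  // not_run: show the outcome verb, or the fallback when no run record exists
  return `${r.title} (${r.outcome ?? "no run record"})`;
}

// The footer is only shown when there is more than one scenario.
function renderFooter(results: ScenarioResult[]): string | null {
  if (results.length <= 1) return null;
  const count = (status: ScenarioResult["status"]): number =>
    results.filter((r) => r.status === status).length;
  return `${count("passed")} passed, ${count("failed")} failed, ${count("not_run")} not run`;
}

console.log(renderLine({ status: "not_run", title: "Send", outcome: "timed_out" }));
// Send (timed_out)
```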
+
+## Rerun / Regenerate / Extend
+
+- [ ] Rerun doesn't regenerate specs
+- [ ] Regenerate wipes specs before re-invoking xqa plan
+- [ ] Regenerate preserves `runs/`
+- [ ] Extend appends scenario-N+1 without re-writing existing scenarios
+
+## PR detection
+
+- [ ] Probe runs in parallel with sim probe and slug computation (single Bash batch)
+- [ ] Branch with no open PR: skips PR-integration silently, proceeds local-only
+- [ ] Branch with open PR: fetches PR body, computes diagnostic signals
+- [ ] Signal A: coverage count identifies items with changed-surface token overlap
+- [ ] Signal B: items containing vitest/eslint/pnpm test/ci passes are flagged
+- [ ] Signal C: items containing TODO/TBD/??? are flagged
+- [ ] Diagnostic emitted before gate (N_covered/M, K CI steps, P placeholders)
+- [ ] Same PR body always produces same diagnostic (deterministic)
+- [ ] AskUserQuestion gate offers all three options: Use as-is / Enrich / Regenerate
+- [ ] "Use as-is" skips planner, converts PR checklist to approval checklist
+- [ ] "Enrich" runs planner then merges with PR checklist, dedupes by intent
+- [ ] "Regenerate" ignores PR body, runs normal Generate flow
+- [ ] PR with no test plan section proceeds with normal Generate flow (no gate shown)
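Reviewer note: Signals A, B, and C from the PR detection items are simple text checks, which is what makes the determinism item easy to satisfy. The sketch below computes all three from a list of checklist items. The matching rules (case-insensitive substring for Signal A, the two regular expressions) and the output wording are assumptions; the checklist only lists the trigger words and the `N_covered/M, K CI steps, P placeholders` summary.

```typescript
// Trigger words taken from the checklist; the matching rules themselves are assumptions.
const CI_PATTERN = /vitest|eslint|pnpm test|ci passes/i;
const PLACEHOLDER_PATTERN = /TODO|TBD|\?\?\?/;

interface Diagnostic {
  covered: number; // Signal A: items that mention a changed-surface token
  total: number;
  ciSteps: number; // Signal B: items that describe CI steps
  placeholders: number; // Signal C: items that are still placeholders
}

// Pure function: the same items and tokens always yield the same diagnostic.
function diagnose(items: string[], changedTokens: string[]): Diagnostic {
  const tokens = changedTokens.map((t) => t.toLowerCase());
  const mentionsChange = (item: string): boolean => {
    const lower = item.toLowerCase();
    return tokens.some((t) => lower.indexOf(t) !== -1);
  };
  return {
    covered: items.filter(mentionsChange).length,
    total: items.length,
    ciSteps: items.filter((item) => CI_PATTERN.test(item)).length,
    placeholders: items.filter((item) => PLACEHOLDER_PATTERN.test(item)).length,
  };
}

function formatDiagnostic(d: Diagnostic): string {
  return `${d.covered}/${d.total} covered, ${d.ciSteps} CI steps, ${d.placeholders} placeholders`;
}

console.log(
  formatDiagnostic(diagnose(["Open swap screen", "pnpm test passes", "TODO: edge cases"], ["swap"])),
);
// 1/3 covered, 1 CI steps, 1 placeholders
```

Neither regular expression uses the `g` flag, so `test` carries no state between calls and repeated runs over the same PR body give the same counts.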
+
+## Update PR
+
+- [ ] Update PR gate fires after report renders when open PR exists (single 3-way gate, not two sequential y/n gates)
+- [ ] `Write plan + tACK`: executes write-back then self-tACK posting sequentially, no additional confirmation
+- [ ] `Write plan only`: executes write-back only, no self-tACK
+- [ ] `Skip`: PR body unchanged, no tACK
+
+## Write-back
+
+- [ ] Posts every approved checklist item verbatim as `- [ ]` bullets (not scenario titles)
+- [ ] Single-scenario plan has no subheader; multi-scenario plan has one `### <Scenario name>` per scenario
+- [ ] No `[x]` boxes at write-back time — all unchecked
+- [ ] Replaces existing `## Test plan` section without corrupting other sections
+- [ ] Appends `## Test plan` section when none existed
+- [ ] Preserves all content below the original test plan section verbatim
+- [ ] Uses `gh pr edit <number> --body-file` (not --body string interpolation)
+- [ ] Surfaces `gh pr edit` errors verbatim, no silent retry
+- [ ] `## Test plan` bullets stay verbatim after run completes (no scenario-title rewrite, no `<!-- failed -->` injection, no `[x]` flip)
+- [ ] Post-run status goes in optional footnote below checklist, not inline in bullets
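Reviewer note: the replace-or-append behaviour in the Write-back items is the part most likely to corrupt a PR body, so a small sketch may help when checking it by hand. The function below works on the body text only and assumes the section ends at the next level-1 or level-2 heading. `upsertTestPlan` is a hypothetical name, and the sketch does not handle headings that appear inside fenced code blocks.

```typescript
// Replace the existing `## Test plan` section, or append one if none exists.
// Everything outside the section is preserved verbatim.
function upsertTestPlan(body: string, checklist: string[]): string {
  // All boxes are written unchecked at write-back time.
  const section = ["## Test plan", ""].concat(checklist.map((item) => `- [ ] ${item}`));
  const lines = body.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Test plan");

  if (start === -1) {
    const trimmed = body.replace(/\s+$/, "");
    const prefix = trimmed.length > 0 ? trimmed + "\n\n" : "";
    return prefix + section.join("\n") + "\n";
  }

  // The section runs until the next `# ` or `## ` heading; `### ` subheaders stay inside it.
  const offset = lines.slice(start + 1).findIndex((line) => /^#{1,2} /.test(line));
  const end = offset === -1 ? lines.length : start + 1 + offset;

  return lines
    .slice(0, start)
    .concat(section, [""], lines.slice(end))
    .join("\n");
}

const updated = upsertTestPlan(
  "## Summary\n\nFixes swap.\n\n## Test plan\n\n- [ ] old item\n\n## Notes\n\nKeep me.\n",
  ["Open swap", "Confirm quote"],
);
console.log(updated);
```

Per the checklist, the resulting text would then be written to a file and passed to `gh pr edit <number> --body-file` rather than interpolated into a `--body` string.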
+
+## Self-tACK
+
+- [ ] No prior tACK: renders proposed comment for transparency, posts immediately (no separate gate — Update PR was the confirmation)
+- [ ] Prior tACK identical: reports "no update needed" without gating
+- [ ] Prior tACK differs: renders diff before gating update (self-tACK update confirmation gate fires)
+- [ ] Update uses `gh api --method PATCH /repos/{owner}/{repo}/issues/comments/<id>` with `--field body=@file`
+- [ ] Comment body: first line `tACK`, blank line, all items `[x]`, no commentary
+- [ ] Validation: item count matches, all boxes `[x]`, no paraphrase, no invented items
+- [ ] Post-run PR body fetch and tACK lookup run in parallel (Probe A + Probe B)
+- [ ] tACK body is subset of PR `## Test plan` — every tACK line matches a test plan line verbatim (modulo `[ ]` → `[x]`)
+- [ ] tACK item count equals PR `## Test plan` item count (no missing, no extra)
+- [ ] Scenario subheaders (`### <name>`) preserved when present in test plan
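Reviewer note: the comment-body and validation items in the Self-tACK section describe a mechanical transformation of the PR's `## Test plan` lines, so they can be checked with a pair of pure functions. The sketch below is an illustration under that reading. `buildTack`, `validateTack`, and the error strings are hypothetical.

```typescript
// Build the comment body: `tACK`, a blank line, then the plan lines with every box checked.
// Subheaders such as `### <name>` pass through unchanged.
function buildTack(testPlanLines: string[]): string {
  const checked = testPlanLines.map((line) => line.replace(/^- \[ \]/, "- [x]"));
  return ["tACK", ""].concat(checked).join("\n");
}

// Returns a list of problems; an empty list means the comment passes validation.
function validateTack(tack: string, testPlanLines: string[]): string[] {
  const errors: string[] = [];
  const lines = tack.split("\n");
  if (lines[0] !== "tACK") errors.push("first line must be tACK");
  if (lines[1] !== "") errors.push("second line must be blank");

  const items = lines.slice(2).filter((line) => line.startsWith("- ["));
  const planItems = testPlanLines.filter((line) => line.startsWith("- ["));
  if (items.length !== planItems.length) errors.push("item count mismatch");

  for (const item of items) {
    if (!item.startsWith("- [x] ")) {
      errors.push(`unchecked item: ${item}`);
    } else if (planItems.indexOf(item.replace("- [x]", "- [ ]")) === -1) {
      // Verbatim match only: a paraphrased or invented item fails here.
      errors.push(`item not in test plan: ${item}`);
    }
  }
  return errors;
}

const plan = ["### Swap", "- [ ] Open swap", "- [ ] Confirm quote"];
console.log(validateTack(buildTack(plan), plan).length); // 0
```

Comparing each checked line back against the unchecked plan line is one direct way to enforce the "no paraphrase, no invented items" rule, since any edit to the wording breaks the match.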