@exodus/xqa 5.0.0 → 5.2.0

@@ -0,0 +1,122 @@
+ # xqa-test-plan — manual test plan
+
+ Manual verification checklist for the `xqa-test-plan` skill. Run when adding/changing skill behavior.
+
+ ## Activation
+
+ - [ ] Skill activates on `/xqa-test-plan`
+ - [ ] Skill activates on implied intent ("what should I QA?")
+
+ ## Detect state
+
+ - [ ] Detect state correctly computes slug for current branch
+ - [ ] Detect state auto-prunes stale sibling dirs but never current
+ - [ ] Auto-prune iterates branches forward (applies `branchToSlug`); never tries to invert slug
+ - [ ] Auto-prune leaves directory when slug-match is uncertain
+
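
The forward-only pruning rule in the Detect state checklist can be sketched as follows. This is a minimal illustration, not the package's code: the body of `branchToSlug` is a hypothetical slug rule (the real one may differ), `staleDirs` is an invented helper name, and treating an empty branch list as "uncertain" is an assumption.

```typescript
// Hypothetical slug rule; the real `branchToSlug` in xqa may differ.
function branchToSlug(branch: string): string {
  return branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns the sibling directories that are safe to prune.
// Branches are mapped forward to slugs; a slug is never mapped back to a branch.
function staleDirs(
  dirs: string[],
  branches: string[],
  currentBranch: string,
): string[] {
  // Assumption: an empty branch list means enumeration failed,
  // so every match is uncertain and nothing is pruned.
  if (branches.length === 0) return [];

  const live = new Set<string>(branches.map(branchToSlug));
  // The current branch's directory is always kept.
  live.add(branchToSlug(currentBranch));

  return dirs.filter((dir) => !live.has(dir));
}
```

Because slugging is lossy (several branch names can collapse to one slug), comparing in the forward direction avoids guessing which branch a directory belonged to.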
+ ## Generate flow
+
+ - [ ] Handles 0 booted simulators (error), 1 (auto), >1 (prompt)
+ - [ ] Uses AskUserQuestion (or platform fallback) for simulator selection when >1 booted
+ - [ ] Passes `--intent` correctly
+ - [ ] Detects base ref from open PR (`gh pr view`)
+ - [ ] Falls back to upstream tracking when no PR exists
+ - [ ] Omits `--base` + emits broader-diff warning when both detection methods fail
+ - [ ] Never fabricates `origin/HEAD` fallback claim
+
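
The two decision rules in the Generate flow checklist (simulator count and base-ref fallback) might look roughly like this. The type names, function names, and message strings are hypothetical; only the `--base` flag and the 0 / 1 / >1 behavior come from the checklist.

```typescript
type SimulatorChoice =
  | { kind: "error"; message: string }
  | { kind: "auto"; udid: string }
  | { kind: "prompt"; options: string[] };

// 0 booted simulators is an error, 1 is selected automatically, more than 1 needs a prompt.
function chooseSimulator(booted: string[]): SimulatorChoice {
  const [first] = booted;
  if (first === undefined) {
    return { kind: "error", message: "No booted simulator found" };
  }
  if (booted.length === 1) {
    return { kind: "auto", udid: first };
  }
  return { kind: "prompt", options: booted };
}

interface BaseRef {
  args: string[];
  warning?: string;
}

// Prefer the open PR's base, then the upstream tracking ref.
// When both are missing, omit `--base` and warn; no other fallback is invented.
function resolveBase(prBase: string | null, upstream: string | null): BaseRef {
  if (prBase) return { args: ["--base", prBase] };
  if (upstream) return { args: ["--base", upstream] };
  return {
    args: [],
    warning: "No base ref detected; the diff may be broader than this branch's changes.",
  };
}
```

Keeping both rules as pure functions makes each checklist item checkable without a simulator or a `gh` session.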
+ ## Approval
+
+ - [ ] Approval loop calls `xqa plan edit` per file
+ - [ ] Emits Run-go gate after plan approval for all scenarios (never skips; plan approval ≠ run approval)
+ - [ ] Run-go gate question references expected app state (first scenario's precondition label)
+ - [ ] Emits flat `- [ ]` checklist for approval (not numbered scenario+steps)
+
+ ## Run
+
+ - [ ] Preflights simulator before dispatching
+ - [ ] Setup coordination asks per-group before dispatching when scenarios have divergent setups
+ - [ ] Batches scenarios with identical setup without re-asking
+ - [ ] Ordering puts non-destructive setups before no-wallet/delete-wallet setups
+ - [ ] Cancellation handles abort/skip/skip all/rerun
+
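
One way to picture the batching and ordering items in the Run checklist is the sketch below. The `RunScenario` shape, the `destructive` flag, and the `planGroups` name are assumptions made for illustration; the checklist only says that identical setups are batched and that non-destructive setups run before no-wallet/delete-wallet ones.

```typescript
interface RunScenario {
  name: string;
  setup: string;        // hypothetical setup label, e.g. "funded-wallet"
  destructive: boolean; // hypothetical flag for no-wallet/delete-wallet setups
}

// Batches scenarios that share a setup, then orders non-destructive groups first.
function planGroups(scenarios: RunScenario[]): RunScenario[][] {
  const bySetup = new Map<string, RunScenario[]>();
  for (const scenario of scenarios) {
    const group = bySetup.get(scenario.setup) ?? [];
    group.push(scenario);
    bySetup.set(scenario.setup, group);
  }

  const isDestructive = (group: RunScenario[]): number =>
    Number(group.some((s) => s.destructive));

  // Array.prototype.sort is stable in modern runtimes, so groups with the
  // same destructiveness keep their first-seen order.
  return Array.from(bySetup.values()).sort(
    (a, b) => isDestructive(a) - isDestructive(b),
  );
}
```

With this shape, the per-group setup question is asked once per returned group rather than once per scenario.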
+ ## Gates
+
+ - [ ] Does not emit free-text "Reply X / Y / Z" prompts at any decision gate
+ - [ ] Uses AskUserQuestion for plan approval gate
+ - [ ] Uses AskUserQuestion for run-go gate
+ - [ ] Uses AskUserQuestion for sim-state question (multi-profile case)
+ - [ ] Uses AskUserQuestion for transition-ready prompts between groups
+ - [ ] Uses AskUserQuestion for destructive-delete seed-backup confirmation
+ - [ ] Uses AskUserQuestion for existing-plan Rerun/Regenerate/Extend choice
+ - [ ] Uses AskUserQuestion for post-interruption Resume/Report/Abort choice
+ - [ ] Still accepts free-form replies (AskUserQuestion's "Other" path)
+ - [ ] Detects `AskUserQuestion` availability at conversation start
+ - [ ] When available: uses AskUserQuestion at all 11 gates
+ - [ ] When unavailable: uses structured platform-fallback format at all 11 gates
+ - [ ] Fallback format never degenerates into loose "Reply X / Y / describe" prompts
+
+ ## Report
+
+ - [ ] Renders `passed` scenarios with green check (no strikethrough)
+ - [ ] Renders `failed` scenarios with strikethrough + inline finding + screenshot link
+ - [ ] Renders `not_run` scenarios with outcome verb (errored/timed_out/aborted) or "(no run record)"
+ - [ ] Opens with correct sentence based on bucket distribution
+ - [ ] Footer summary line shows counts when >1 scenario
+ - [ ] Uses correct `xqa plan report` flags (`--findings`, `--specs`) with default paths from `.xqa/test-plan/<slug>/`
+ - [ ] Omits `--runs` by default (`scenario-runs.json` resolved from same dir as `findings.json`)
+
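
The three rendering buckets in the Report checklist could be modeled as a discriminated union, as in this sketch. The exact Markdown emitted by `xqa plan report` is not shown in the checklist, so the output strings, the `ReportEntry` type, and both function names here are illustrative assumptions.

```typescript
type RunOutcome = "errored" | "timed_out" | "aborted";

type ReportEntry =
  | { status: "passed"; title: string }
  | { status: "failed"; title: string; finding: string; screenshot: string }
  | { status: "not_run"; title: string; outcome?: RunOutcome };

// Renders one scenario line; the concrete markup is an assumption.
function renderEntry(entry: ReportEntry): string {
  switch (entry.status) {
    case "passed":
      // Green check, no strikethrough.
      return `- ✅ ${entry.title}`;
    case "failed":
      // Strikethrough title, inline finding, screenshot link.
      return `- ~~${entry.title}~~: ${entry.finding} ([screenshot](${entry.screenshot}))`;
    case "not_run":
      // Outcome verb when a run record exists, otherwise a fixed marker.
      return `- ${entry.title} (${entry.outcome ?? "no run record"})`;
  }
}

// The footer summary is only produced when there is more than one scenario.
function renderFooter(entries: ReportEntry[]): string | null {
  if (entries.length <= 1) return null;
  const count = (status: ReportEntry["status"]): number =>
    entries.filter((e) => e.status === status).length;
  return `${count("passed")} passed, ${count("failed")} failed, ${count("not_run")} not run`;
}
```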
+ ## Rerun / Regenerate / Extend
+
+ - [ ] Rerun doesn't regenerate specs
+ - [ ] Regenerate wipes specs before re-invoking xqa plan
+ - [ ] Regenerate preserves `runs/`
+ - [ ] Extend appends scenario-N+1 without re-writing existing scenarios
+
+ ## PR detection
+
+ - [ ] Probe runs in parallel with sim probe and slug computation (single Bash batch)
+ - [ ] Branch with no open PR: skips PR-integration silently, proceeds local-only
+ - [ ] Branch with open PR: fetches PR body, computes diagnostic signals
+ - [ ] Signal A: coverage count identifies items with changed-surface token overlap
+ - [ ] Signal B: items containing vitest/eslint/pnpm test/ci passes are flagged
+ - [ ] Signal C: items containing TODO/TBD/??? are flagged
+ - [ ] Diagnostic emitted before gate (N_covered/M, K CI steps, P placeholders)
+ - [ ] Same PR body always produces same diagnostic (deterministic)
+ - [ ] AskUserQuestion gate offers all three options: Use as-is / Enrich / Regenerate
+ - [ ] "Use as-is" skips planner, converts PR checklist to approval checklist
+ - [ ] "Enrich" runs planner then merges with PR checklist, dedupes by intent
+ - [ ] "Regenerate" ignores PR body, runs normal Generate flow
+ - [ ] PR with no test plan section proceeds with normal Generate flow (no gate shown)
+
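
Signals A, B, and C from the PR detection checklist can be approximated with plain string matching, which also makes the determinism requirement easy to satisfy. In this sketch the tokenizer, the regular expressions, and the `Diagnostic` field names are assumptions; the skill's actual matching rules are not specified in the checklist.

```typescript
interface Diagnostic {
  covered: number;      // Signal A: items overlapping the changed surface
  total: number;
  ciSteps: number;      // Signal B: items that describe CI steps
  placeholders: number; // Signal C: items with placeholder text
}

// No `g` flag, so `test` carries no state between calls.
const CI_PATTERN = /vitest|eslint|pnpm test|ci passes/i;
const PLACEHOLDER_PATTERN = /TODO|TBD|\?\?\?/;

// Assumed tokenizer: lowercase alphanumeric runs.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Pure function of its inputs: the same PR checklist always yields the same diagnostic.
function diagnose(items: string[], changedSurface: string[]): Diagnostic {
  const surface = new Set<string>(changedSurface.map((t) => t.toLowerCase()));
  return {
    covered: items.filter((item) => tokenize(item).some((t) => surface.has(t))).length,
    total: items.length,
    ciSteps: items.filter((item) => CI_PATTERN.test(item)).length,
    placeholders: items.filter((item) => PLACEHOLDER_PATTERN.test(item)).length,
  };
}
```

A caller could then format the result as the `N_covered/M, K CI steps, P placeholders` line that precedes the gate.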
+ ## Update PR
+
+ - [ ] Update PR gate fires after report renders when open PR exists (single 3-way gate, not two sequential y/n gates)
+ - [ ] `Write plan + tACK`: executes write-back then self-tACK posting sequentially, no additional confirmation
+ - [ ] `Write plan only`: executes write-back only, no self-tACK
+ - [ ] `Skip`: PR body unchanged, no tACK
+
+ ## Write-back
+
+ - [ ] Posts every approved checklist item verbatim as `- [ ]` bullets (not scenario titles)
+ - [ ] Single-scenario plan has no subheader; multi-scenario plan has one `### <Scenario name>` per scenario
+ - [ ] No `[x]` boxes at write-back time — all unchecked
+ - [ ] Replaces existing `## Test plan` section without corrupting other sections
+ - [ ] Appends `## Test plan` section when none existed
+ - [ ] Preserves all content below the original test plan section verbatim
+ - [ ] Uses `gh pr edit <number> --body-file` (not --body string interpolation)
+ - [ ] Surfaces `gh pr edit` errors verbatim, no silent retry
+ - [ ] `## Test plan` bullets stay verbatim after run completes (no scenario-title rewrite, no `<!-- failed -->` injection, no `[x]` flip)
+ - [ ] Post-run status goes in optional footnote below checklist, not inline in bullets
+
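
The replace-or-append behavior in the Write-back checklist can be sketched as a pure string transformation whose result would then be written to a file for `gh pr edit <number> --body-file`. The `replaceTestPlan` name and the rule that a section ends at the next level-1 or level-2 heading are assumptions. The sketch covers only the single-scenario (flat) case and does not account for headings inside fenced code blocks.

```typescript
// Replaces the `## Test plan` section of a PR body, or appends one if absent.
// All items are written unchecked, verbatim.
function replaceTestPlan(body: string, checklist: string[]): string {
  const section = ["## Test plan", ""]
    .concat(checklist.map((item) => `- [ ] ${item}`))
    .join("\n");

  const lines = body.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Test plan");

  if (start === -1) {
    // No existing section: append after the current content.
    const trimmed = body.replace(/\s+$/, "");
    return trimmed === "" ? `${section}\n` : `${trimmed}\n\n${section}\n`;
  }

  // Assumption: the section runs until the next level-1 or level-2 heading,
  // so `###` scenario subheaders stay inside it.
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (/^#{1,2} /.test(lines[i] ?? "")) {
      end = i;
      break;
    }
  }

  // Everything before and after the section is passed through untouched.
  return lines
    .slice(0, start)
    .concat([section, ""], lines.slice(end))
    .join("\n");
}
```

Building the whole body in memory and passing it via a file avoids shell quoting problems with backticks and brackets in checklist items.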
+ ## Self-tACK
+
+ - [ ] No prior tACK: renders proposed comment for transparency, posts immediately (no separate gate — Update PR was the confirmation)
+ - [ ] Prior tACK identical: reports "no update needed" without gating
+ - [ ] Prior tACK differs: renders diff before gating update (self-tACK update confirmation gate fires)
+ - [ ] Update uses `gh api --method PATCH /repos/{owner}/{repo}/issues/comments/<id>` with `--field body=@file`
+ - [ ] Comment body: first line `tACK`, blank line, all items `[x]`, no commentary
+ - [ ] Validation: item count matches, all boxes `[x]`, no paraphrase, no invented items
+ - [ ] Post-run PR body fetch and tACK lookup run in parallel (Probe A + Probe B)
+ - [ ] tACK body is subset of PR `## Test plan` — every tACK line matches a test plan line verbatim (modulo `[ ]` → `[x]`)
+ - [ ] tACK item count equals PR `## Test plan` item count (no missing, no extra)
+ - [ ] Scenario subheaders (`### <name>`) preserved when present in test plan
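
The comment shape and validation rules in the Self-tACK checklist lend themselves to a small build-and-check pair. The function names and error messages below are hypothetical; the rules themselves (first line `tACK`, blank line, every item `[x]`, verbatim match, equal count) are taken from the checklist.

```typescript
// Builds a tACK comment from the PR's `## Test plan` section text:
// items are copied verbatim with `[ ]` flipped to `[x]`, subheaders preserved.
function buildTack(testPlan: string): string {
  const lines = testPlan
    .split("\n")
    .filter((line) => line.startsWith("- [ ] ") || line.startsWith("### "))
    .map((line) => line.replace(/^- \[ \] /, "- [x] "));
  return ["tACK", ""].concat(lines).join("\n");
}

// Returns a list of problems; an empty list means the comment is valid.
function validateTack(tack: string, testPlan: string): string[] {
  const errors: string[] = [];
  const tackLines = tack.split("\n");

  if (tackLines[0] !== "tACK" || tackLines[1] !== "") {
    errors.push("comment must start with `tACK` followed by a blank line");
  }

  const expected = testPlan
    .split("\n")
    .filter((line) => line.startsWith("- [ ] "))
    .map((line) => line.replace(/^- \[ \] /, "- [x] "));
  const actual = tackLines.filter((line) => line.startsWith("- ["));

  if (actual.length !== expected.length) {
    errors.push(`item count ${actual.length} does not match test plan count ${expected.length}`);
  }
  for (const line of actual) {
    // Catches unchecked boxes, paraphrases, and invented items alike.
    if (expected.indexOf(line) === -1) {
      errors.push(`item not found verbatim in test plan: ${line}`);
    }
  }
  return errors;
}
```

Running `validateTack` on the output of `buildTack` before posting gives a cheap guard against paraphrased or missing items.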