sequant 2.0.0 → 2.1.0
- package/.claude-plugin/marketplace.json +1 -1
- package/.claude-plugin/plugin.json +1 -1
- package/README.md +7 -6
- package/dist/bin/cli.js +2 -1
- package/dist/marketplace/external_plugins/sequant/.claude-plugin/plugin.json +1 -1
- package/dist/marketplace/external_plugins/sequant/.mcp.json +6 -0
- package/dist/marketplace/external_plugins/sequant/README.md +58 -8
- package/dist/marketplace/external_plugins/sequant/hooks/post-tool.sh +19 -8
- package/dist/marketplace/external_plugins/sequant/hooks/pre-tool.sh +36 -49
- package/dist/marketplace/external_plugins/sequant/skills/_shared/references/subagent-types.md +158 -48
- package/dist/marketplace/external_plugins/sequant/skills/assess/SKILL.md +354 -352
- package/dist/marketplace/external_plugins/sequant/skills/exec/SKILL.md +1155 -33
- package/dist/marketplace/external_plugins/sequant/skills/fullsolve/SKILL.md +35 -4
- package/dist/marketplace/external_plugins/sequant/skills/qa/SKILL.md +2157 -104
- package/dist/marketplace/external_plugins/sequant/skills/qa/scripts/quality-checks.sh +1 -1
- package/dist/marketplace/external_plugins/sequant/skills/setup/SKILL.md +386 -0
- package/dist/marketplace/external_plugins/sequant/skills/solve/SKILL.md +38 -664
- package/dist/marketplace/external_plugins/sequant/skills/spec/SKILL.md +505 -120
- package/dist/marketplace/external_plugins/sequant/skills/test/SKILL.md +246 -1
- package/dist/marketplace/external_plugins/sequant/skills/testgen/SKILL.md +138 -1
- package/dist/src/commands/dashboard.js +1 -1
- package/dist/src/commands/doctor.js +1 -1
- package/dist/src/commands/init.js +10 -10
- package/dist/src/commands/logs.js +1 -1
- package/dist/src/commands/run.js +49 -39
- package/dist/src/commands/state.js +3 -3
- package/dist/src/commands/status.js +5 -5
- package/dist/src/commands/sync.js +8 -8
- package/dist/src/commands/update.js +16 -16
- package/dist/src/lib/cli-ui.js +20 -19
- package/dist/src/lib/merge-check/index.js +2 -2
- package/dist/src/lib/settings.d.ts +8 -0
- package/dist/src/lib/settings.js +1 -0
- package/dist/src/lib/shutdown.js +1 -1
- package/dist/src/lib/templates.js +2 -0
- package/dist/src/lib/wizard.js +6 -4
- package/dist/src/lib/workflow/batch-executor.d.ts +9 -1
- package/dist/src/lib/workflow/batch-executor.js +39 -2
- package/dist/src/lib/workflow/log-writer.js +6 -6
- package/dist/src/lib/workflow/metrics-writer.js +5 -3
- package/dist/src/lib/workflow/phase-executor.d.ts +1 -1
- package/dist/src/lib/workflow/phase-executor.js +52 -22
- package/dist/src/lib/workflow/platforms/github.js +5 -1
- package/dist/src/lib/workflow/state-cleanup.js +1 -1
- package/dist/src/lib/workflow/state-manager.js +15 -13
- package/dist/src/lib/workflow/state-rebuild.js +2 -2
- package/dist/src/lib/workflow/types.d.ts +27 -0
- package/dist/src/lib/workflow/worktree-manager.js +40 -41
- package/dist/src/lib/worktree-isolation.d.ts +130 -0
- package/dist/src/lib/worktree-isolation.js +310 -0
- package/package.json +24 -14
- package/templates/agents/sequant-explorer.md +23 -0
- package/templates/agents/sequant-implementer.md +18 -0
- package/templates/agents/sequant-qa-checker.md +24 -0
- package/templates/agents/sequant-testgen.md +25 -0
- package/templates/scripts/cleanup-worktree.sh +18 -0
- package/templates/skills/_shared/references/subagent-types.md +158 -48
- package/templates/skills/exec/SKILL.md +72 -6
- package/templates/skills/qa/SKILL.md +8 -217
- package/templates/skills/spec/SKILL.md +446 -120
- package/templates/skills/testgen/SKILL.md +138 -1
@@ -16,10 +16,11 @@ allowed-tools:
 - Bash(gh pr view:*)
 - Bash(gh pr diff:*)
 - Bash(gh pr comment:*)
+- Bash(gh pr checks:*)
 - Bash(semgrep:*)
 - Bash(npx semgrep:*)
 - Bash(npx tsx scripts/semgrep-scan.ts:*)
--
+- Agent(sequant-qa-checker)
 - AgentOutputTool
 ---
 
@@ -37,6 +38,32 @@ When invoked as `/qa`, your job is to:
 4. Assess whether the change is "A+ status" or needs more work.
 5. Draft a GitHub review/QA comment summarizing findings and recommendations.
 
+## Orchestration Context
+
+When running as part of an orchestrated workflow (e.g., `sequant run` or `/fullsolve`), this skill receives environment variables that indicate the orchestration context:
+
+| Environment Variable | Description | Example Value |
+|---------------------|-------------|---------------|
+| `SEQUANT_ORCHESTRATOR` | The orchestrator invoking this skill | `sequant-run` |
+| `SEQUANT_PHASE` | Current phase in the workflow | `qa` |
+| `SEQUANT_ISSUE` | Issue number being processed | `123` |
+| `SEQUANT_WORKTREE` | Path to the feature worktree | `/path/to/worktrees/feature/...` |
+
+**Behavior when orchestrated (SEQUANT_ORCHESTRATOR is set):**
+
+1. **Skip pre-flight sync check** - Orchestrator has already synced
+2. **Use provided worktree** - Work in `SEQUANT_WORKTREE` path directly
+3. **Skip issue fetch** - Use `SEQUANT_ISSUE`, orchestrator has context
+4. **Reduce GitHub comment frequency** - Defer updates to orchestrator
+5. **Trust git state** - Orchestrator verified branch status
+
+**Behavior when standalone (SEQUANT_ORCHESTRATOR is NOT set):**
+
+- Perform pre-flight sync check
+- Locate worktree or work from main
+- Fetch fresh issue context from GitHub
+- Post QA comment directly to GitHub
+
 ## Phase Detection (Smart Resumption)
 
 **Before executing**, check if the exec phase has been completed (prerequisite for QA):
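Reviewer sketch (not part of the package): the orchestrated/standalone split added in the hunk above comes down to whether `SEQUANT_ORCHESTRATOR` is set. The helper name `qa_mode` is hypothetical and only illustrates that branch; the real skill is prose interpreted by an agent, not a script.

```shell
# Hypothetical helper, not shipped by sequant.
# Prints "orchestrated" when SEQUANT_ORCHESTRATOR is set and non-empty,
# otherwise "standalone".
qa_mode() {
  if [ -n "${SEQUANT_ORCHESTRATOR:-}" ]; then
    echo "orchestrated"
  else
    echo "standalone"
  fi
}

# Example: gate the pre-flight sync check on the detected mode.
if [ "$(qa_mode)" = "standalone" ]; then
  echo "would run pre-flight sync check"
fi
```

In orchestrated mode the same test would be used to skip the sync check and the issue fetch, and to work in `$SEQUANT_WORKTREE` directly.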
@@ -70,15 +97,22 @@ fi
 
 **Phase Marker Emission:**
 
-When posting the QA review comment to GitHub, append a phase marker at the end
+When posting the QA review comment to GitHub, append a phase marker at the end.
+
+**IMPORTANT:** Always include the `commitSHA` field with the current HEAD SHA. This enables incremental re-runs by recording the baseline commit for future QA runs.
+
+```bash
+# Get current HEAD SHA for the phase marker
+COMMIT_SHA=$(git rev-parse HEAD)
+```
 
 ```markdown
-<!-- SEQUANT_PHASE: {"phase":"qa","status":"completed","timestamp":"<ISO-8601>"} -->
+<!-- SEQUANT_PHASE: {"phase":"qa","status":"completed","timestamp":"<ISO-8601>","commitSHA":"<HEAD-SHA>"} -->
 ```
 
 If QA determines AC_NOT_MET, emit:
 ```markdown
-<!-- SEQUANT_PHASE: {"phase":"qa","status":"failed","timestamp":"<ISO-8601>","error":"AC_NOT_MET"} -->
+<!-- SEQUANT_PHASE: {"phase":"qa","status":"failed","timestamp":"<ISO-8601>","error":"AC_NOT_MET","commitSHA":"<HEAD-SHA>"} -->
 ```
 
 Include this marker in every `gh issue comment` that represents QA completion.
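Reviewer sketch (not part of the package): one way the new `commitSHA` field could be filled in when building the marker. The function name `phase_marker` and its argument order are assumptions made for illustration; only the marker layout is taken from the hunk above.

```shell
# Hypothetical helper, not shipped by sequant.
# Usage: phase_marker STATUS COMMIT_SHA [ERROR]
# Prints a SEQUANT_PHASE marker for the qa phase with a UTC ISO-8601 timestamp.
phase_marker() {
  pm_status=$1
  pm_sha=$2
  pm_error=${3:-}
  pm_ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  if [ -n "$pm_error" ]; then
    printf '<!-- SEQUANT_PHASE: {"phase":"qa","status":"%s","timestamp":"%s","error":"%s","commitSHA":"%s"} -->\n' \
      "$pm_status" "$pm_ts" "$pm_error" "$pm_sha"
  else
    printf '<!-- SEQUANT_PHASE: {"phase":"qa","status":"%s","timestamp":"%s","commitSHA":"%s"} -->\n' \
      "$pm_status" "$pm_ts" "$pm_sha"
  fi
}
```

Inside a repository this would typically be called as `phase_marker completed "$(git rev-parse HEAD)"`, or as `phase_marker failed "$(git rev-parse HEAD)" AC_NOT_MET` for the failure case.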
@@ -89,10 +123,100 @@ Invocation:
 
 - `/qa 123`: Treat `123` as the GitHub issue/PR identifier in context.
 - `/qa <freeform description>`: Treat the text as context about the change to review.
+- `/qa 123 --parallel`: Force parallel agent execution (faster, higher token usage).
+- `/qa 123 --sequential`: Force sequential agent execution (slower, lower token usage).
+
+### Agent Execution Mode
+
+Before spawning quality check agents, determine the execution mode:
+
+1. **Check for CLI flag override:**
+   - `--parallel` → Use parallel execution
+   - `--sequential` → Use sequential execution
+
+2. **If no flag, read project settings:**
+   Use the Read tool to check project settings:
+   ```
+   Read(file_path=".sequant/settings.json")
+   # Parse JSON and extract agents.parallel (default: false)
+   ```
+
+3. **Default:** Sequential (cost-optimized)
+
+| Mode | Token Usage | Speed | Best For |
+|------|-------------|-------|----------|
+| Sequential | 1x (baseline) | Slower | Limited API plans, single issues |
+| Parallel | ~2-3x | ~50% faster | Unlimited plans, batch operations |
+
+### Quality Check Caching
+
+The QA quality checks support caching to skip unchanged checks on re-run, significantly improving iteration speed.
+
+#### Cache Configuration
+
+**CLI flags:**
+- `/qa 123 --no-cache`: Force fresh run, ignore all cached results
+- `/qa 123 --use-cache`: Enable caching (default)
+
+**When caching is used:**
+- Type safety check → Cached (keyed by diff hash)
+- Deleted tests check → Cached (keyed by diff hash)
+- Security scan → Cached (keyed by diff hash + config)
+- Semgrep analysis → Cached (keyed by diff hash)
+- Build verification → Cached (keyed by diff hash)
+- Scope/size metrics → Always fresh (cheap operations)
+
+#### Cache Invalidation Rules
+
+| Change Type | Invalidation Scope |
+|-------------|-------------------|
+| Source file changes | Re-run type safety, security, semgrep |
+| Test file changes | Re-run deleted-tests check |
+| Config changes (tsconfig, package.json) | Re-run affected checks |
+| `package-lock.json` changes | Re-run ALL checks |
+| TTL expiry (1 hour default) | Re-run expired checks |
+
+#### Cache Status Reporting (AC-4)
+
+The quality-checks.sh script outputs a cache status table:
+
+```markdown
+### Cache Status Report
+
+| Check | Cache Status |
+|-------|--------------|
+| type-safety | ✅ HIT |
+| deleted-tests | ✅ HIT |
+| scope | ⏭️ SKIP |
+| size | ⏭️ SKIP |
+| security | ❌ MISS |
+| semgrep | ❌ MISS |
+| build | ✅ HIT |
+
+**Summary:** 3 hits, 2 misses, 2 skipped
+**Performance:** Cached checks saved execution time
+```
+
+#### Cache Location
+
+Cache is stored at `.sequant/.cache/qa/cache.json` with the following structure:
+- `diffHash`: SHA256 hash of `git diff main...HEAD`
+- `configHash`: SHA256 hash of relevant config files
+- `result`: Check result (passed, message, details)
+- `ttl`: Time-to-live in milliseconds (default: 1 hour)
+
+#### Graceful Degradation (AC-6)
+
+If the cache is corrupted or unreadable:
+1. Log warning at debug level (AC-7)
+2. Fall back to fresh run
+3. Continue without caching errors affecting QA
 
 ### Pre-flight Sync Check
 
-
+**Skip this section if `SEQUANT_ORCHESTRATOR` is set** - the orchestrator has already verified sync status.
+
+Before starting QA (standalone mode), verify the local branch is in sync with remote:
 
 ```bash
 git fetch origin 2>/dev/null || echo "Network unavailable - proceeding with local state"
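Reviewer sketch (not part of the package): the "Agent Execution Mode" precedence added above (CLI flag, then `agents.parallel` in `.sequant/settings.json`, then the sequential default) could be expressed as follows. The function name `resolve_agent_mode` is hypothetical, and reading the setting with `jq` is an assumption; the skill itself tells the agent to use its Read tool.

```shell
# Hypothetical helper, not shipped by sequant.
# Usage: resolve_agent_mode [FLAG] [SETTINGS_FILE]
# Precedence: explicit flag > agents.parallel in settings > sequential default.
resolve_agent_mode() {
  ram_flag=${1:-}
  ram_settings=${2:-.sequant/settings.json}

  # 1. A CLI flag always wins.
  case "$ram_flag" in
    --parallel)   echo "parallel";   return 0 ;;
    --sequential) echo "sequential"; return 0 ;;
  esac

  # 2. Otherwise consult project settings, if the file and jq are available.
  if [ -f "$ram_settings" ] && command -v jq >/dev/null 2>&1; then
    if [ "$(jq -r '.agents.parallel // false' "$ram_settings" 2>/dev/null)" = "true" ]; then
      echo "parallel"
      return 0
    fi
  fi

  # 3. Default: sequential (cost-optimized).
  echo "sequential"
}
```

A missing or unreadable settings file falls through to the sequential default, which matches the documented `default: false` for `agents.parallel`.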
@@ -109,10 +233,103 @@ If diverged, recommend:
 git pull origin main # Or merge origin/main if pull fails
 ```
 
+### Stale Branch Detection
+
+**Skip this section if `SEQUANT_ORCHESTRATOR` is set** - the orchestrator handles branch freshness checks.
+
+**Purpose:** Detect when the feature branch is significantly behind main, which can lead to:
+- QA cycles wasted reviewing code that won't cleanly merge
+- False `READY_FOR_MERGE` verdicts that fail at merge time
+- Conflicts that require rework after QA approval
+
+**Detection:**
+
+```bash
+# Ensure we have latest remote state
+git fetch origin 2>/dev/null || true
+
+# Count commits behind main
+behind=$(git rev-list --count HEAD..origin/main 2>/dev/null || echo "0")
+echo "Feature branch is $behind commits behind main"
+```
+
+**Threshold Configuration:**
+
+The stale branch threshold is configurable in `.sequant/settings.json`:
+
+```json
+{
+  "run": {
+    "staleBranchThreshold": 5
+  }
+}
+```
+
+Default: 5 commits
+
+**Behavior:**
+
+| Commits Behind | Action |
+|----------------|--------|
+| 0 | ✅ Proceed normally |
+| 1 to threshold | ⚠️ **Warning:** "Feature branch is N commits behind main. Consider rebasing before QA." |
+| > threshold | ❌ **Block:** "STALE_BRANCH: Feature branch is N commits behind main (threshold: T). Rebase required before QA." |
+
+**Implementation:**
+
+```bash
+# Read threshold from settings (default: 5)
+threshold=$(jq -r '.run.staleBranchThreshold // 5' .sequant/settings.json 2>/dev/null || echo "5")
+
+behind=$(git rev-list --count HEAD..origin/main 2>/dev/null || echo "0")
+
+if [[ $behind -gt $threshold ]]; then
+  echo "❌ STALE_BRANCH: Feature branch is $behind commits behind main (threshold: $threshold)"
+  echo " Rebase required before QA:"
+  echo " git fetch origin && git rebase origin/main"
+  # Exit with error - QA should not proceed
+  exit 1
+elif [[ $behind -gt 0 ]]; then
+  echo "⚠️ Warning: Feature branch is $behind commits behind main."
+  echo " Consider rebasing before QA: git fetch origin && git rebase origin/main"
+  # Continue with warning
+fi
+```
+
+**Output Format:**
+
+Include in QA output when branch is stale:
+
+```markdown
+### Stale Branch Check
+
+| Check | Value |
+|-------|-------|
+| Commits behind main | N |
+| Threshold | T |
+| Status | ✅ OK / ⚠️ Warning / ❌ Blocked |
+
+[Warning/blocking message if applicable]
+```
+
+**Verdict Impact:**
+
+| Status | Verdict Impact |
+|--------|----------------|
+| OK (0 behind) | No impact |
+| Warning (1 to threshold) | Note in findings, recommend rebase |
+| Blocked (> threshold) | **Cannot proceed** - rebase first |
+
 ### Feature Worktree Workflow
 
 **QA Phase:** Review code in the feature worktree.
 
+**If orchestrated (SEQUANT_WORKTREE is set):**
+- Use the provided worktree path directly: `cd $SEQUANT_WORKTREE`
+- Skip step 1 below (worktree location provided by orchestrator)
+
+**If standalone:**
+
 1. **Locate the worktree:**
    - The worktree should already exist from the execution phase (`/exec`)
    - Find the worktree: `git worktree list` or check `../worktrees/feature/` for directories matching the issue number
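Reviewer sketch (not part of the package): the three-way "Behavior" table in the Stale Branch Detection section added above can be isolated as a pure function, which makes the boundary cases easy to check without a repository. The name `classify_staleness` is hypothetical.

```shell
# Hypothetical helper, not shipped by sequant.
# Usage: classify_staleness BEHIND [THRESHOLD]
# Prints ok (0 behind), warning (1..threshold) or blocked (> threshold).
classify_staleness() {
  cs_behind=$1
  cs_threshold=${2:-5}   # same default as run.staleBranchThreshold
  if [ "$cs_behind" -gt "$cs_threshold" ]; then
    echo "blocked"
  elif [ "$cs_behind" -gt 0 ]; then
    echo "warning"
  else
    echo "ok"
  fi
}
```

Note that a branch exactly at the threshold (for example 5 behind with the default of 5) still only warns; blocking starts at threshold + 1, as in the package's own `-gt` comparison.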
@@ -163,111 +380,1434 @@ If no feature worktree exists (work was done directly on main):
 
 4. **Run quality checks** on the current branch instead of comparing to a worktree.
 
-###
+### Phase 0: Implementation Status Check — REQUIRED
 
-**
+**Before spawning quality check agents**, verify that implementation actually exists. Running full QA on an unimplemented issue wastes tokens and produces confusing output.
 
-**
+**Detection Logic:**
 
-
+```bash
+# 1. Check for worktree (indicates work may have started)
+worktree_path=$(git worktree list | grep -i "<issue-number>" | awk '{print $1}' | head -1 || true)
 
-2.
+# 2. Check for commits on feature branch (vs main) — include ALL file types
+commits_exist=$(git log --oneline main..HEAD 2>/dev/null | head -1)
 
-3.
+# 3. Check for uncommitted changes
+uncommitted_changes=$(git status --porcelain | head -1)
 
-
+# 4. Check for open PR linked to this issue
+pr_exists=$(gh pr list --search "<issue-number>" --state open --json number -q '.[0].number' 2>/dev/null)
+
+# 5. Check for ANY file changes (including .md, prompt-only changes)
+any_diff=$(git diff --name-only main..HEAD 2>/dev/null | head -1 || true)
+```
+
+**IMPORTANT: Prompt-only and markdown-only changes ARE valid implementations.** Many issues (e.g., skill improvements, documentation features) are implemented entirely via `.md` file changes. The detection logic must count these as real implementation, not skip them.
+
+**Implementation Status Matrix:**
+
+| Worktree | Commits | Uncommitted | PR | Status | Action |
+|----------|---------|-------------|-----|--------|--------|
+| ❌ | ❌ | ❌ | ❌ | No implementation | Early exit |
+| ✅ | ❌ | ❌ | ❌ | Worktree created but no work | Early exit |
+| ✅ | ❌ | ✅ | ❌ | Work in progress (uncommitted) | Proceed with QA |
+| ✅ | ✅ | * | * | Implementation exists | Proceed with QA |
+| * | ✅ | * | * | Commits exist | Proceed with QA |
+| * | * | * | ✅ | PR exists | Proceed with QA |
+
+**Early Exit Condition:**
+- No commits on feature branch AND no uncommitted changes AND no open PR
+
+**False Negative Prevention (CRITICAL):**
+
+Root cause analysis (#448) found that 33% of multi-attempt QA failures were caused by QA reporting "NOT FOUND" when implementation existed. Common causes:
+
+| Cause | Example | Fix |
+|-------|---------|-----|
+| Prompt-only changes | Skill SKILL.md modifications (#413) | Check `git diff --name-only` for ANY file, not just .ts/.tsx |
+| Cross-repo work | Landing page issue tracked in main repo (#393) | Check exec progress comments for cross-repo indicators |
+| Worktree mismatch | QA runs in wrong directory | Verify `pwd` matches expected worktree path |
+
+**If `git diff --name-only main..HEAD` shows files but standard detection says "NOT FOUND":**
+1. The implementation exists — proceed with QA
+2. Adapt review approach to the file types changed (e.g., review .md changes for content quality rather than TypeScript compilation)
+
+**If early exit triggered:**
+1. **Skip** sub-agent spawning (nothing to check)
+2. **Skip** code review (no code to review)
+3. **Skip** quality metrics collection
+4. Use the **Early Exit Output Template** below
+5. Verdict: `AC_NOT_MET`
+
+---
+
+### Early Exit Output Template
+
+When no implementation is detected, use this streamlined output:
+
+```markdown
+## QA Review for Issue #<N>
+
+### Implementation Status: NOT FOUND
+
+No implementation detected for this issue:
+- Commits on feature branch: None
+- Uncommitted changes: None
+- Open PR: None
+
+**Verdict: AC_NOT_MET**
+
+No code changes found to review. The acceptance criteria cannot be evaluated without an implementation.
+
+### Next Steps
+
+1. Run `/exec <issue-number>` to implement the feature
+2. Re-run `/qa <issue-number>` after implementation is complete
+
+---
+
+*QA skipped: No implementation to review*
+```
+
+**Important:** Do NOT spawn sub-agents when using early exit. This saves tokens and avoids confusing "no changes found" outputs from quality checkers.
+
+**CRITICAL — Before early exit, double-check for false negatives:**
 ```bash
-
+# Final safety check: are there ANY file changes vs main?
+any_changes=$(git diff --name-only main..HEAD 2>/dev/null | wc -l | xargs || echo "0")
+if [[ "$any_changes" -gt 0 ]]; then
+  echo "WARNING: $any_changes files changed but detection said NOT FOUND"
+  echo "Changed files:"
+  git diff --name-only main..HEAD 2>/dev/null | head -20
+  echo "Proceeding with QA instead of early exit."
+  # DO NOT early exit — proceed with QA
+fi
 ```
 
-
+---
+
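Reviewer sketch (not part of the package): the Phase 0 early-exit rule above, combined with its false-negative safety check, reduces to "exit early only when every signal is empty". The function name `implementation_status` and its argument order are assumptions; each argument stands for the captured output of one of the detection commands (commits, uncommitted changes, open PR, any diff against main).

```shell
# Hypothetical helper, not shipped by sequant.
# Usage: implementation_status COMMITS UNCOMMITTED PR ANY_DIFF
# Each argument is the (possibly empty) output of one detection command.
# Prints "proceed" if any signal is non-empty, otherwise "early-exit".
implementation_status() {
  if [ -n "${1:-}" ] || [ -n "${2:-}" ] || [ -n "${3:-}" ] || [ -n "${4:-}" ]; then
    echo "proceed"
  else
    echo "early-exit"
  fi
}
```

The worktree column of the matrix is deliberately not an input here: per the matrix, a worktree with no commits, no uncommitted changes and no PR still leads to an early exit.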
+### Phase 0b: Quality Plan Verification (CONDITIONAL)
+
+**When to apply:** If issue has a Feature Quality Planning section in comments (from `/spec`).
+
+**Purpose:** Verify that quality dimensions identified during planning were addressed in implementation. This catches gaps that AC verification alone misses.
+
+**Detection:**
+```bash
+# Check if issue has quality planning section in comments
+quality_plan_exists=$(gh issue view <issue> --comments --json comments -q '.comments[].body' | grep -q "Feature Quality Planning" && echo "yes" || echo "no")
+```
 
-
+**If Quality Plan found:**
 
-
+1. **Extract quality dimensions** from the spec comment:
+   - Completeness Check items
+   - Error Handling items
+   - Code Quality items
+   - Test Coverage Plan items
+   - Best Practices items
+   - Polish items (if UI feature)
+   - Derived ACs
 
-
+2. **Verify each dimension against implementation:**
 
-
+| Dimension | Verification Method |
+|-----------|---------------------|
+| Completeness | Check all AC steps have code |
+| Error Handling | Search for error handling code, try/catch blocks |
+| Code Quality | Check for `any` types, magic strings |
+| Test Coverage | Verify test files exist for critical paths |
+| Best Practices | Check for logging, security patterns |
+| Polish | Check loading/error/empty states in UI |
 
-
-
-
-
+3. **Extract and Verify Derived ACs:**
+
+**Extraction Method:**
+```bash
+# Extract derived ACs from spec comment's Derived ACs table
+# Format: | Source | AC-N: Description | Priority |
+# Uses flexible pattern to match any source dimension (not hardcoded)
+derived_acs=$(gh issue view <issue-number> --comments --json comments -q '.comments[].body' | \
+  grep -E '\|[^|]+\|\s*AC-[0-9]+:' | \
+  grep -oE 'AC-[0-9]+:[^|]+' | \
+  sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | \
+  sort -u || true)
+
+# Count derived ACs
+derived_count=$(echo "$derived_acs" | grep -c "AC-" || true)
+echo "Found $derived_count derived ACs"
+```
 
-
+**Handling Edge Cases:**
+- **0 derived ACs:** Output "Derived ACs: None found" and skip derived AC verification
+- **1+ derived ACs:** Include each in AC coverage table with source attribution
+- **Malformed rows:** Rows missing the `| Source | AC-N: ... |` pattern are skipped
+- **Extra whitespace:** Trimmed during extraction
 
-**
--
--
--
+**Verification:**
+- Treat derived ACs identically to original ACs
+- Include in AC coverage table with "Derived ([Source])" notation
+- Mark as MET/PARTIALLY_MET/NOT_MET based on implementation evidence
 
-**
-1. Structure your analysis with explicit numbered steps
-2. Document each concern systematically before synthesizing verdict
-3. Use a pros/cons format for trade-off decisions
+**Output Format:**
 
 ```markdown
-
+### Quality Plan Verification
+
+**Quality Plan found:** Yes/No
+
+| Dimension | Items Planned | Items Addressed | Status |
+|-----------|---------------|-----------------|--------|
+| Completeness | 5 | 5 | ✅ Complete |
+| Error Handling | 3 | 2 | ⚠️ Partial (missing: API timeout) |
+| Code Quality | 4 | 4 | ✅ Complete |
+| Test Coverage | 3 | 3 | ✅ Complete |
+| Best Practices | 2 | 2 | ✅ Complete |
+| Polish | N/A | N/A | - (not UI feature) |
+
+**Derived ACs:** 2/2 addressed
 
-**
-**Step 2:** [Analyze second dimension - maintainability]
-**Step 3:** [Analyze third dimension - performance]
-**Step 4:** [Synthesize findings into verdict]
+**Quality Plan Status:** Complete / Partial / Not Addressed
 ```
 
-
+**Verdict Impact:**
 
-
-
-
-
+| Quality Plan Status | Verdict Impact |
+|---------------------|----------------|
+| Complete | No impact (positive signal) |
+| Partial | Note in findings, consider `AC_MET_BUT_NOT_A_PLUS` |
+| Not Addressed | Flag in findings, may indicate gaps |
+| No Plan Found | Note: "Quality plan not available - standard QA only" |
 
-**
-1. Search codebase with Grep for existing usage patterns
-2. Use WebSearch for official library documentation
-3. Check similar implementations in the codebase as reference
-4. Review library's README or documentation in node_modules
+**Status Threshold Definitions:**
 
-
+| Status | Criteria |
+|--------|----------|
+| **Complete** | All applicable dimensions have ≥80% items addressed |
+| **Partial** | At least 50% of applicable dimensions have items addressed |
+| **Not Addressed** | <50% of applicable dimensions addressed, or 0 items addressed |
 
-
--
--
+*Example: If 4 dimensions apply (Completeness, Error Handling, Code Quality, Test Coverage):*
+- *Complete: 4/4 dimensions at ≥80%*
+- *Partial: 2-3/4 dimensions have work done*
+- *Not Addressed: 0-1/4 dimensions have work done*
+
+**If no Quality Plan found:**
+- Output: "Quality Plan Verification: N/A - No quality plan found in issue comments"
+- Proceed with standard QA (no verdict impact)
+
+---
+
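Reviewer sketch (not part of the package): one possible reading of the "Status Threshold Definitions" table above, written as integer arithmetic. The function name `quality_plan_status` and the `addressed/planned` argument format are assumptions, and so is the interpretation that a dimension "has items addressed" when at least one of its items is addressed. Pass only applicable dimensions (omit N/A rows such as Polish for non-UI features).

```shell
# Hypothetical helper, not shipped by sequant.
# Usage: quality_plan_status ADDRESSED/PLANNED [ADDRESSED/PLANNED ...]
# Prints Complete, Partial, Not Addressed, or No Plan Found (no arguments).
quality_plan_status() {
  qp_total=0      # applicable dimensions
  qp_complete=0   # dimensions with >= 80% of items addressed
  qp_touched=0    # dimensions with at least one item addressed
  for qp_pair in "$@"; do
    qp_addressed=${qp_pair%/*}
    qp_planned=${qp_pair#*/}
    qp_total=$((qp_total + 1))
    if [ "$qp_addressed" -gt 0 ]; then
      qp_touched=$((qp_touched + 1))
    fi
    # addressed/planned >= 0.8, compared without division
    if [ $((qp_addressed * 100)) -ge $((qp_planned * 80)) ]; then
      qp_complete=$((qp_complete + 1))
    fi
  done
  if [ "$qp_total" -eq 0 ]; then
    echo "No Plan Found"
  elif [ "$qp_complete" -eq "$qp_total" ]; then
    echo "Complete"
  elif [ $((qp_touched * 2)) -ge "$qp_total" ]; then
    echo "Partial"
  else
    echo "Not Addressed"
  fi
}
```

With the sample table from the hunk (5/5, 2/3, 4/4, 3/3, 2/2), Error Handling sits at 2 of 3 items, below the 80% bar, so under this reading the overall status is Partial rather than Complete.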
|
602
|
+
### Phase 0c: Incremental Re-Run Detection (CONDITIONAL)
|
|
603
|
+
|
|
604
|
+
**When to apply:** On QA re-runs (when a prior QA phase marker exists in issue comments).
|
|
605
|
+
|
|
606
|
+
**Purpose:** Optimize QA re-runs by detecting what changed since the last QA run and skipping checks whose inputs haven't changed. This significantly reduces token usage and execution time on iterative QA cycles.
|
|
607
|
+
|
|
608
|
+
**Detection:**
|
|
609
|
+
|
|
610
|
+
```bash
|
|
611
|
+
# Step 1: Check for prior QA run context in cache
|
|
612
|
+
prior_context=$(npx tsx scripts/qa/qa-cache-cli.ts get-run-context 2>/dev/null || true)
|
|
613
|
+
|
|
614
|
+
# Step 2: If no cache context found, fall through to full QA run
|
|
615
|
+
if [[ -z "$prior_context" ]] || echo "$prior_context" | grep -q "No QA run context"; then
|
|
616
|
+
echo "No prior QA context found — running full QA"
|
|
617
|
+
INCREMENTAL_MODE=false
|
|
618
|
+
else
|
|
619
|
+
LAST_QA_SHA=$(echo "$prior_context" | jq -r '.lastQACommitSHA')
|
|
620
|
+
LAST_QA_HASH=$(echo "$prior_context" | jq -r '.lastQADiffHash')
|
|
621
|
+
|
|
622
|
+
# Step 3: Validate the commit SHA still exists in git history
|
|
623
|
+
if ! git cat-file -t "$LAST_QA_SHA" &>/dev/null; then
|
|
624
|
+
echo "Warning: Last QA commit SHA ($LAST_QA_SHA) not found in history — running full QA"
|
|
625
|
+
INCREMENTAL_MODE=false
|
|
626
|
+
else
|
|
627
|
+
# Step 4: Get files changed since last QA
|
|
628
|
+
changed_files=$(npx tsx scripts/qa/qa-cache-cli.ts changed-since "$LAST_QA_SHA" 2>/dev/null || true)
|
|
629
|
+
|
|
630
|
+
if [[ "$changed_files" == "NO_CHANGES" ]]; then
|
|
631
|
+
echo "No changes since last QA — all checks can use cached results"
|
|
632
|
+
INCREMENTAL_MODE=true
|
|
633
|
+
NO_FILE_CHANGES=true
|
|
634
|
+
else
|
|
635
|
+
echo "Changes detected since last QA ($LAST_QA_SHA):"
|
|
636
|
+
echo "$changed_files" | head -20
|
|
637
|
+
INCREMENTAL_MODE=true
|
|
638
|
+
NO_FILE_CHANGES=false
|
|
639
|
+
fi
|
|
640
|
+
fi
|
|
641
|
+
fi
|
|
642
|
+
```
|
|
643
|
+
|
|
644
|
+
**Skip Logic (when INCREMENTAL_MODE=true):**
|
|
645
|
+
|
|
646
|
+
| Check / Item | Skip Condition | Re-run Condition |
|
|
647
|
+
|-------------|----------------|------------------|
|
|
648
|
+
| Quality checks (type-safety, security, etc.) | Existing diff-hash cache handles this | Hash mismatch -> re-run |
|
|
649
|
+
| Build verification | **Never skip** (always re-run) | Always — cheap and can regress |
|
|
650
|
+
| CI status | **Never skip** (always re-run) | Always — external state changes |
|
|
651
|
+
| AC items with prior status `met` | Skip if NO_FILE_CHANGES=true | Any file changes since last QA |
|
|
652
|
+
| AC items with prior status `not_met` | **Never skip** | Always re-evaluate |
|
|
653
|
+
| AC items with prior status `partially_met` | **Never skip** | Always re-evaluate |
|
|
654
|
+
| AC items with prior status `pending`/`blocked` | **Never skip** | Always re-evaluate |
|
|
655
|
+
|
|
656
|
+
**AC Re-evaluation Rules:**
|
|
657
|
+
|
|
658
|
+
When `INCREMENTAL_MODE=true`:
|
|
659
|
+
|
|
660
|
+
1. **Load prior AC statuses** from run context:
|
|
661
|
+
```bash
|
|
662
|
+
# Extract AC statuses from prior context
|
|
663
|
+
ac_statuses=$(echo "$prior_context" | jq -r '.acStatuses | to_entries[] | "\(.key)=\(.value)"')
|
|
664
|
+
```
|
|
665
|
+
|
|
666
|
+
2. **For each AC item:**
   - If prior status is `met` AND `NO_FILE_CHANGES=true`:
     - **Skip full re-evaluation**: output "Cached: previously MET, no file changes"
     - Mark as `MET (cached)` in output
   - If prior status is `met` AND files changed:
     - **Re-evaluate**: changes may have caused regression
   - If prior status is `not_met` or `partially_met`:
     - **Always re-evaluate**: this is the primary purpose of re-runs
   - If prior status is `pending` or `blocked`:
     - **Always re-evaluate**: status may have changed
|
|
676
|
+
|
|
677
|
+
3. **`--no-cache` flag behavior:**
|
|
678
|
+
- When `--no-cache` is passed, set `INCREMENTAL_MODE=false`
|
|
679
|
+
- This forces full re-evaluation of ALL checks and AC items
|
|
680
|
+
- Run context is still saved at the end for future re-runs
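
The re-evaluation rules above reduce to one small decision. The following is a minimal sketch, not the skill's actual implementation: `ac_action` is a hypothetical helper name, and it assumes the prior status strings are exactly those listed above (`met`, `not_met`, `partially_met`, `pending`, `blocked`).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): decide what to do with one AC item.
# Args: prior_status, INCREMENTAL_MODE (true/false), NO_FILE_CHANGES (true/false)
ac_action() {
  local prior_status="$1" incremental="${2:-false}" no_file_changes="${3:-false}"

  # --no-cache sets INCREMENTAL_MODE=false, which forces full re-evaluation.
  if [[ "$incremental" != "true" ]]; then
    echo "re-evaluate"
    return
  fi

  # Only a previously met item with no file changes may reuse its cached result.
  if [[ "$prior_status" == "met" && "$no_file_changes" == "true" ]]; then
    echo "cached"
  else
    echo "re-evaluate"
  fi
}
```

For example, `ac_action met "$INCREMENTAL_MODE" "$NO_FILE_CHANGES"` prints `cached` only when both flags are `true`; every other combination prints `re-evaluate`.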
|
|
681
|
+
|
|
682
|
+
**Output Format (Incremental QA Summary):**
|
|
683
|
+
|
|
684
|
+
When `INCREMENTAL_MODE=true`, prepend this section to the QA output:
|
|
685
|
+
|
|
686
|
+
```markdown
### Incremental QA Summary

**Last QA:** <timestamp> (commit: <sha-short>)
**Changes since last QA:** N files

| Check / AC | Status | Re-run? | Reason |
|------------|--------|---------|--------|
| type-safety | PASS | Cached | Diff hash unchanged |
| security | PASS | Cached | Diff hash unchanged |
| build | PASS | Re-run | Always fresh |
| CI status | PASS | Re-run | Always fresh |
| AC-1 | MET | Cached | Previously MET, no file changes |
| AC-2 | MET | Re-evaluated | Was NOT_MET |
| AC-3 | MET | Re-evaluated | Files changed since last QA |

**Summary:** X checks cached, Y re-evaluated, Z always-fresh
```
|
|
704
|
+
|
|
705
|
+
**Run Context Persistence:**
|
|
706
|
+
|
|
707
|
+
After QA completes (regardless of incremental mode), save the run context:
|
|
708
|
+
|
|
709
|
+
```bash
|
|
710
|
+
# Get current HEAD SHA
|
|
711
|
+
current_sha=$(git rev-parse HEAD)
|
|
712
|
+
# Get current diff hash
|
|
713
|
+
current_hash=$(npx tsx scripts/qa/qa-cache-cli.ts hash)
|
|
714
|
+
|
|
715
|
+
# Build AC statuses JSON from QA results
|
|
716
|
+
# Example: {"AC-1":"met","AC-2":"not_met","AC-3":"met"}
|
|
717
|
+
ac_json='{"AC-1":"met","AC-2":"not_met"}' # Replace with actual results
|
|
718
|
+
|
|
719
|
+
# Save run context
|
|
720
|
+
echo "{
|
|
721
|
+
\"lastQACommitSHA\": \"$current_sha\",
|
|
722
|
+
\"lastQADiffHash\": \"$current_hash\",
|
|
723
|
+
\"acStatuses\": $ac_json,
|
|
724
|
+
\"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%S.000Z)\"
|
|
725
|
+
}" | npx tsx scripts/qa/qa-cache-cli.ts set-run-context
|
|
726
|
+
```
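
The `ac_json` placeholder above has to be filled from real results. One possible way, sketched here under assumptions, is to convert `key=value` lines (the same shape the earlier `jq` extraction prints) back into a JSON object. `ac_lines_to_json` is a hypothetical helper name, and the sketch assumes keys and values contain no quotes, backslashes, or control characters.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): read "AC-1=met" lines on stdin
# and print a compact JSON object such as {"AC-1":"met"}.
# Assumption: keys/values need no JSON escaping (no quotes or backslashes).
ac_lines_to_json() {
  local json="{" sep="" key value
  # The "|| [[ -n $key ]]" clause also handles a final line without a newline.
  while IFS='=' read -r key value || [[ -n "$key" ]]; do
    [[ -z "$key" ]] && continue          # skip blank lines
    json+="${sep}\"${key}\":\"${value}\""
    sep=","
  done
  printf '%s}\n' "$json"
}
```

Usage would look like `ac_json=$(printf 'AC-1=met\nAC-2=not_met\n' | ac_lines_to_json)`. If statuses can contain arbitrary text, building the object with `jq` is the safer choice.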
|
|
727
|
+
|
|
728
|
+
---
|
|
729
|
+
|
|
730
|
+
### Phase 1: CI Status Check — REQUIRED
|
|
731
|
+
|
|
732
|
+
**Purpose:** Check GitHub CI status before finalizing verdict. CI-dependent AC items (e.g., "Tests pass in CI") should reflect actual CI status, not just local test results.
|
|
733
|
+
|
|
734
|
+
**When to check:** If a PR exists for the issue/branch.
|
|
735
|
+
|
|
736
|
+
**Detection:**
|
|
737
|
+
```bash
# Get PR number for current branch
pr_number=$(gh pr view --json number -q '.number' 2>/dev/null)

# If PR exists, check CI status
if [[ -n "$pr_number" ]]; then
  gh pr checks "$pr_number" --json name,state,bucket
fi
```
|
|
746
|
+
|
|
747
|
+
**CI Status Mapping:**
|
|
748
|
+
|
|
749
|
+
| State | Bucket | AC Status | Verdict Impact |
|
|
750
|
+
|-------|--------|-----------|----------------|
|
|
751
|
+
| `SUCCESS` | `pass` | `MET` | No impact |
|
|
752
|
+
| `FAILURE` | `fail` | `NOT_MET` | Blocks merge |
|
|
753
|
+
| `CANCELLED` | `fail` | `NOT_MET` | Blocks merge |
|
|
754
|
+
| `SKIPPED` | `pass` | `N/A` | No impact |
|
|
755
|
+
| `PENDING` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
|
|
756
|
+
| `QUEUED` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
|
|
757
|
+
| `IN_PROGRESS` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
|
|
758
|
+
| (empty response) | - | `N/A` | No CI configured |
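
The mapping table can be expressed as a single `case` statement. This is a sketch only: `ci_ac_status` is a hypothetical function name, and the `UNKNOWN` fallback for states not listed in the table is an assumption rather than documented behavior.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): map one CI check state,
# as listed in the table above, to the AC status it implies.
ci_ac_status() {
  local state="$1"
  case "$state" in
    SUCCESS)                    echo "MET" ;;
    FAILURE|CANCELLED)          echo "NOT_MET" ;;
    SKIPPED|"")                 echo "N/A" ;;      # empty = no CI configured
    PENDING|QUEUED|IN_PROGRESS) echo "PENDING" ;;
    *)                          echo "UNKNOWN" ;;  # assumption: surface unlisted states
  esac
}
```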
|
|
759
|
+
|
|
760
|
+
**CI-Related AC Detection:**
|
|
761
|
+
|
|
762
|
+
Identify AC items that depend on CI by matching these patterns:
|
|
763
|
+
- "Tests pass in CI"
|
|
764
|
+
- "CI passes"
|
|
765
|
+
- "Build succeeds in CI"
|
|
766
|
+
- "GitHub Actions pass"
|
|
767
|
+
- "Pipeline passes"
|
|
768
|
+
- "Workflow passes"
|
|
769
|
+
- "Checks pass"
|
|
770
|
+
- "Actions succeed"
|
|
771
|
+
- "CI/CD passes"
|
|
772
|
+
|
|
773
|
+
```bash
|
|
774
|
+
# Example: Check if any AC mentions CI
|
|
775
|
+
ci_ac_patterns="CI|pipeline|GitHub Actions|build succeeds|tests pass in CI|workflow|checks pass|actions succeed"
|
|
776
|
+
```
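
The pattern variable above is only defined, not applied. A minimal sketch of applying it to a single AC description follows. `is_ci_related_ac` is a hypothetical name, and the use of `grep -w` (whole-word matching) is an assumption added here so that the short token `CI` does not match inside words such as "specific".

```bash
#!/usr/bin/env bash
# Same pattern list as in the example above.
ci_ac_patterns="CI|pipeline|GitHub Actions|build succeeds|tests pass in CI|workflow|checks pass|actions succeed"

# Hypothetical helper (not part of sequant): succeed if the AC text looks CI-related.
is_ci_related_ac() {
  # -i: ignore case, -w: whole-word matches only, -E: extended regex, -q: no output
  printf '%s\n' "$1" | grep -qiwE "$ci_ac_patterns"
}
```

A caller could then write `if is_ci_related_ac "$ac_text"; then ...; fi` for each AC item. Whole-word matching means plural or inflected forms (for example "workflows") would not match without extending the pattern.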
|
|
777
|
+
|
|
778
|
+
**Error Handling:**
|
|
779
|
+
|
|
780
|
+
If `gh pr checks` fails or returns unexpected results:
|
|
781
|
+
- **`gh` not installed** → Skip CI section with note: "CI status unavailable (gh CLI not found)"
- **`gh` not authenticated** → Skip CI section with note: "CI status unavailable (gh auth required)"
- **Network or other `gh` command error** → Treat as N/A with note: "CI status unavailable (gh command failed)"
- **No PR exists** → Skip CI status section entirely
- **Empty response** → No CI configured (not an error)
|
|
786
|
+
|
|
787
|
+
**Portability Note:**
|
|
788
|
+
|
|
789
|
+
CI status detection requires GitHub. Other platforms (GitLab, Bitbucket, Azure DevOps) are not supported. To check if `gh` is available:
|
|
790
|
+
```bash
|
|
791
|
+
if ! command -v gh &>/dev/null; then
|
|
792
|
+
echo "gh CLI not installed - skipping CI status check"
|
|
793
|
+
fi
|
|
794
|
+
```
|
|
795
|
+
|
|
796
|
+
**Output Format:**
|
|
797
|
+
|
|
798
|
+
Include CI status in the QA output:
|
|
799
|
+
|
|
800
|
+
```markdown
|
|
801
|
+
### CI Status
|
|
802
|
+
|
|
803
|
+
| Check | State | Bucket | Impact |
|
|
804
|
+
|-------|-------|--------|--------|
|
|
805
|
+
| `build (20.x)` | SUCCESS | pass | ✅ MET |
|
|
806
|
+
| `build (22.x)` | PENDING | pending | ⏳ PENDING |
|
|
807
|
+
| `lint` | FAILURE | fail | ❌ NOT_MET |
|
|
808
|
+
|
|
809
|
+
**CI Summary:** 1 passed, 1 pending, 1 failed
|
|
810
|
+
**CI-related AC items:** AC-4 ("Tests pass in CI") → PENDING (CI still running)
|
|
811
|
+
```
|
|
812
|
+
|
|
813
|
+
**No CI Configured:**
|
|
814
|
+
|
|
815
|
+
If `gh pr checks` returns an empty response:
|
|
816
|
+
```markdown
|
|
817
|
+
### CI Status
|
|
818
|
+
|
|
819
|
+
No CI checks configured for this repository.
|
|
820
|
+
|
|
821
|
+
**CI-related AC items:** AC-4 ("Tests pass in CI") → N/A (no CI configured)
|
|
822
|
+
```
|
|
823
|
+
|
|
824
|
+
**Verdict Integration:**
|
|
825
|
+
|
|
826
|
+
CI status affects the final verdict through the standard verdict algorithm:
|
|
827
|
+
- CI `PENDING` → AC item marked `PENDING` → Verdict: `NEEDS_VERIFICATION`
- CI `FAILURE` → AC item marked `NOT_MET` → Verdict: `AC_NOT_MET`
- CI `SUCCESS` → AC item marked `MET` → No additional impact
- No CI → AC item marked `N/A` → No impact on verdict
|
|
831
|
+
|
|
832
|
+
**Important:** Do NOT give `READY_FOR_MERGE` if any CI check is still pending. The correct verdict is `NEEDS_VERIFICATION` with a note to re-run QA after CI completes.
|
|
833
|
+
|
|
834
|
+
---
|
|
835
|
+
|
|
836
|
+
### Small-Diff Fast Path (Size Gate)
|
|
837
|
+
|
|
838
|
+
**Purpose:** Skip sub-agent spawning for trivial diffs to save ~30s latency and reduce token cost.
|
|
839
|
+
|
|
840
|
+
**Evaluate the size gate BEFORE spawning any quality check sub-agents:**
|
|
841
|
+
|
|
842
|
+
```bash
# 1. Read threshold from settings (default: 100)
threshold=$(cat .sequant/settings.json 2>/dev/null | grep -o '"smallDiffThreshold"[[:space:]]*:[[:space:]]*[0-9]*' | grep -o '[0-9]*$' || echo "100")
if [ -z "$threshold" ]; then threshold=100; fi

# 2. Compute diff size (additions + deletions)
diff_stats=$(git diff origin/main...HEAD --stat | tail -1 || true)
additions=$(echo "$diff_stats" | grep -o '[0-9]* insertion' | grep -o '[0-9]*' || echo "0")
deletions=$(echo "$diff_stats" | grep -o '[0-9]* deletion' | grep -o '[0-9]*' || echo "0")
total_changes=$((${additions:-0} + ${deletions:-0}))

# 3. Check if package.json changed
pkg_changed=$(git diff origin/main...HEAD --name-only | grep -c '^package\.json$' || true)

# 4. Check security-sensitive paths (reuses existing heuristic from anti-pattern detection)
security_paths=$(git diff origin/main...HEAD --name-only | grep -iE 'auth|payment|security|server-action|middleware|admin' || true)
security_sensitive="false"
if [ -n "$security_paths" ]; then security_sensitive="true"; fi

echo "Size gate: $total_changes lines changed (threshold: $threshold), pkg_changed=$pkg_changed, security=$security_sensitive"
```
|
|
863
|
+
|
|
864
|
+
**Size gate decision:**
|
|
865
|
+
|
|
866
|
+
| Condition | Result |
|
|
867
|
+
|-----------|--------|
|
|
868
|
+
| `total_changes < threshold` AND `pkg_changed == 0` AND `security_sensitive == false` | `SMALL_DIFF=true` — use inline checks |
|
|
869
|
+
| Any condition fails | `SMALL_DIFF=false` — use sub-agents (standard pipeline) |
|
|
870
|
+
| Size gate evaluation errors (e.g., git fails) | `SMALL_DIFF=false` — fall back to full pipeline (AC-5) |
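
The decision table can be sketched as a function over the four values computed by the size-gate script. `size_gate` is a hypothetical name; treating any non-numeric input as an "evaluation error" is an assumption about how such errors would surface.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): print the SMALL_DIFF value.
# Args: total_changes, threshold, pkg_changed (count), security_sensitive (true/false)
size_gate() {
  local total="$1" threshold="$2" pkg_changed="$3" security_sensitive="$4"

  # Assumption: a missing or non-numeric value means the gate could not be
  # evaluated, so fall back to the full pipeline.
  if ! [[ "$total" =~ ^[0-9]+$ && "$threshold" =~ ^[0-9]+$ && "$pkg_changed" =~ ^[0-9]+$ ]]; then
    echo "false"
    return
  fi

  if [ "$total" -lt "$threshold" ] && [ "$pkg_changed" -eq 0 ] && [ "$security_sensitive" = "false" ]; then
    echo "true"    # inline checks
  else
    echo "false"   # sub-agents (standard pipeline)
  fi
}
```

With the variables from the script above, this would be called as `SMALL_DIFF=$(size_gate "$total_changes" "$threshold" "$pkg_changed" "$security_sensitive")`. Note the comparison is strict: a diff exactly at the threshold is not small.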
|
|
871
|
+
|
|
872
|
+
**Log the decision (AC-6):**
|
|
873
|
+
|
|
874
|
+
```markdown
|
|
875
|
+
### Size Gate
|
|
876
|
+
|
|
877
|
+
| Check | Value |
|
|
878
|
+
|-------|-------|
|
|
879
|
+
| Diff size | N lines (threshold: T) |
|
|
880
|
+
| package.json changed | Yes/No |
|
|
881
|
+
| Security-sensitive paths | Yes/No [list if yes] |
|
|
882
|
+
| Decision | **Inline checks** / **Sub-agents** |
|
|
883
|
+
```
|
|
884
|
+
|
|
885
|
+
#### If `SMALL_DIFF=true`: Inline Quality Checks
|
|
886
|
+
|
|
887
|
+
Run these checks directly (no sub-agents needed):
|
|
888
|
+
|
|
889
|
+
```bash
# Type safety: check for 'any' additions
any_count=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -cw 'any' || true)

# Deleted tests check
deleted_tests=$(git diff origin/main...HEAD --name-only --diff-filter=D | grep -cE '\.(test|spec)\.' || true)

# Scope: files changed count
files_changed=$(git diff origin/main...HEAD --name-only | wc -l | tr -d ' ')

# Security scan (lightweight: only obvious patterns in added lines).
# The pattern is double-quoted so the ["'] character classes do not
# terminate the shell string early.
security_issues=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -ciE "eval\(|innerHTML|dangerouslySetInnerHTML|exec\(|password.*=.*[\"']|secret.*=.*[\"']|api.?key.*=.*[\"']" || true)

echo "Inline checks: any=$any_count, deleted_tests=$deleted_tests, files=$files_changed, security_issues=$security_issues"
```
|
|
904
|
+
|
|
905
|
+
**After inline checks, skip to the output template** (the sub-agent section below is not executed).
|
|
906
|
+
|
|
907
|
+
#### If `SMALL_DIFF=false`: Use Sub-Agents (Standard Pipeline)
|
|
908
|
+
|
|
909
|
+
Proceed to the standard Quality Checks section below.
|
|
910
|
+
|
|
911
|
+
---
|
|
912
|
+
|
|
913
|
+
### Quality Checks (Multi-Agent) — REQUIRED
|
|
914
|
+
|
|
915
|
+
**When `SMALL_DIFF=false`**, you MUST spawn sub-agents for quality checks. Do NOT run these checks inline with bash commands. Sub-agents provide parallel execution, better context isolation, and consistent reporting.
|
|
916
|
+
|
|
917
|
+
**Execution mode:** Respect the agent execution mode determined above (see "Agent Execution Mode" section).
|
|
918
|
+
|
|
919
|
+
#### Documentation Issue Detection
|
|
920
|
+
|
|
921
|
+
Check if this is a documentation-only issue by reading the `SEQUANT_ISSUE_TYPE` environment variable:
|
|
922
|
+
|
|
923
|
+
```bash
|
|
924
|
+
issue_type="${SEQUANT_ISSUE_TYPE:-}"
|
|
925
|
+
```
|
|
926
|
+
|
|
927
|
+
**If `SEQUANT_ISSUE_TYPE=docs`**, use the lighter docs QA pipeline:
|
|
928
|
+
|
|
929
|
+
- **Skip** type safety sub-agent (no TypeScript changes expected)
|
|
930
|
+
- **Skip** security scan sub-agent (no runtime code changes)
|
|
931
|
+
- **Keep** scope/size check (still useful for docs)
|
|
932
|
+
- **Focus review on:** content accuracy, completeness, formatting, and link validity
|
|
933
|
+
|
|
934
|
+
**Docs QA sub-agents (1 agent instead of 3):**
|
|
935
|
+
|
|
936
|
+
1. `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Check for broken links in changed markdown files. Report: files count, diff size, broken links, size assessment.")`
|
|
937
|
+
|
|
938
|
+
**If `SEQUANT_ISSUE_TYPE` is not set or is not `docs`**, use the standard pipeline below.
|
|
939
|
+
|
|
940
|
+
#### If parallel mode enabled:
|
|
941
|
+
|
|
942
|
+
**Spawn ALL THREE agents in a SINGLE message (one Tool call per agent, all in same response):**
|
|
943
|
+
|
|
944
|
+
**IMPORTANT:** Background agents need `mode="bypassPermissions"` to execute Bash commands (`git diff`, `npm test`, etc.) without interactive approval. The default `acceptEdits` mode only auto-approves Edit/Write — Bash calls are silently denied. These quality check agents only read and analyze; they never write files or push code, so bypassing permissions is safe.
|
|
945
|
+
|
|
946
|
+
1. `Agent(subagent_type="sequant-qa-checker", prompt="Run type safety and deleted tests checks on the current branch vs main. Report: type issues count, deleted tests, verdict.")`
|
|
947
|
+
|
|
948
|
+
2. `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Report: files count, diff size, size assessment.")`
|
|
949
|
+
|
|
950
|
+
3. `Agent(subagent_type="sequant-qa-checker", prompt="Run security scan on changed files in current branch vs main. Report: critical/warning/info counts, verdict.")`
|
|
951
|
+
|
|
952
|
+
#### If sequential mode (default):
|
|
953
|
+
|
|
954
|
+
**Spawn each agent ONE AT A TIME, waiting for each to complete before the next:**
|
|
955
|
+
|
|
956
|
+
**Note:** Sequential agents run in the foreground where the user can approve Bash interactively. However, for consistency and to avoid approval fatigue, we still use `mode="bypassPermissions"` since these agents only perform read-only quality checks.
|
|
957
|
+
|
|
958
|
+
1. **First:** `Agent(subagent_type="sequant-qa-checker", prompt="Run type safety and deleted tests checks on the current branch vs main. Report: type issues count, deleted tests, verdict.")`
|
|
959
|
+
|
|
960
|
+
2. **After #1 completes:** `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Report: files count, diff size, size assessment.")`
|
|
961
|
+
|
|
962
|
+
3. **After #2 completes:** `Agent(subagent_type="sequant-qa-checker", prompt="Run security scan on changed files in current branch vs main. Report: critical/warning/info counts, verdict.")`
|
|
963
|
+
|
|
964
|
+
**Add RLS check if admin files modified:**
|
|
965
|
+
```bash
|
|
966
|
+
admin_modified=$(git diff main...HEAD --name-only | grep -E "^app/admin/" | head -1 || true)
|
|
967
|
+
```
|
|
968
|
+
|
|
969
|
+
See [quality-gates.md](references/quality-gates.md) for detailed verdict synthesis.
|
|
970
|
+
|
|
971
|
+
### Using MCP Tools (Optional)
|
|
972
|
+
|
|
973
|
+
- **Sequential Thinking:** For complex multi-step analysis
|
|
974
|
+
- **Context7:** For broader pattern context and library documentation
|
|
975
|
+
|
|
976
|
+
### 1. Context and AC Alignment
|
|
977
|
+
|
|
978
|
+
- **Read all GitHub issue comments** for complete context
|
|
979
|
+
- Reconstruct the AC checklist (AC-1, AC-2, ...)
|
|
980
|
+
- If AC unclear, state assumptions explicitly
|
|
981
|
+
|
|
982
|
+
### 2. Code Review
|
|
983
|
+
|
|
984
|
+
Perform a code review focusing on:
|
|
985
|
+
|
|
986
|
+
- Correctness and potential bugs
|
|
987
|
+
- Readability and maintainability
|
|
988
|
+
- Alignment with existing patterns (see CLAUDE.md)
|
|
989
|
+
- TypeScript strictness and type safety
|
|
990
|
+
- **Duplicate utility check:** Verify new utilities don't duplicate existing ones in `docs/patterns/`
|
|
991
|
+
|
|
992
|
+
See [code-review-checklist.md](references/code-review-checklist.md) for integration verification steps.
|
|
993
|
+
|
|
994
|
+
### 2a. Build Verification (When Build Fails)
|
|
995
|
+
|
|
996
|
+
**When to apply:** `npm run build` fails on the feature branch.
|
|
997
|
+
|
|
998
|
+
**Purpose:** Distinguish between pre-existing build failures (already on main) and regressions introduced by this PR.
|
|
999
|
+
|
|
1000
|
+
**Detection:**
|
|
1001
|
+
```bash
|
|
1002
|
+
# Run build and capture result
|
|
1003
|
+
npm run build 2>&1
|
|
1004
|
+
BUILD_EXIT_CODE=$?
|
|
1005
|
+
```
|
|
1006
|
+
|
|
1007
|
+
**If build fails, verify against main:**
|
|
1008
|
+
|
|
1009
|
+
The quality-checks.sh script includes `run_build_with_verification()` which:
|
|
1010
|
+
1. Runs `npm run build` on the feature branch
|
|
1011
|
+
2. If it fails, runs build on main branch (via the main repo directory)
|
|
1012
|
+
3. Compares exit codes and first error lines
|
|
1013
|
+
4. Produces a "Build Verification" table (see AC-4)
|
|
1014
|
+
|
|
1015
|
+
**Verification Logic:**
|
|
1016
|
+
|
|
1017
|
+
| Feature Build | Main Build | Error Match | Result |
|
|
1018
|
+
|---------------|------------|-------------|--------|
|
|
1019
|
+
| ❌ Fail | ✅ Pass | N/A | **Regression** - failure introduced by PR |
|
|
1020
|
+
| ❌ Fail | ❌ Fail | Same error | **Pre-existing** - not blocking |
|
|
1021
|
+
| ❌ Fail | ❌ Fail | Different | **Unknown** - manual review needed |
|
|
1022
|
+
| ✅ Pass | * | N/A | No verification needed |
|
|
1023
|
+
|
|
1024
|
+
**Verdict Impact:**
|
|
1025
|
+
|
|
1026
|
+
| Build Verification Result | Verdict Impact |
|
|
1027
|
+
|---------------------------|----------------|
|
|
1028
|
+
| Regression detected | `AC_NOT_MET` - must fix before merge |
|
|
1029
|
+
| Pre-existing failure | No impact - document and proceed |
|
|
1030
|
+
| Unknown (different errors) | `AC_MET_BUT_NOT_A_PLUS` - manual review |
|
|
1031
|
+
| Build passes | No impact |
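
The two tables above combine into a classification step followed by a verdict lookup. The sketch below is illustrative only and is not the `run_build_with_verification()` function from `quality-checks.sh`; the names `classify_build` and `build_verdict_impact` are hypothetical, and "error match" is assumed to mean string equality of the first error lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): classify a build result.
# Args: feature_exit_code, main_exit_code, feature_first_error, main_first_error
classify_build() {
  local feature_exit="$1" main_exit="$2" feature_err="$3" main_err="$4"
  if [ "$feature_exit" -eq 0 ]; then
    echo "pass"            # no verification needed
  elif [ "$main_exit" -eq 0 ]; then
    echo "regression"      # fails only on the feature branch
  elif [ "$feature_err" = "$main_err" ]; then
    echo "pre-existing"    # same failure already on main
  else
    echo "unknown"         # both fail, different errors
  fi
}

# Map a classification to its verdict impact, per the table above.
build_verdict_impact() {
  case "$1" in
    regression) echo "AC_NOT_MET" ;;
    unknown)    echo "AC_MET_BUT_NOT_A_PLUS" ;;
    *)          echo "none" ;;   # pass or pre-existing
  esac
}
```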
|
|
1032
|
+
|
|
1033
|
+
**Output Format:**
|
|
1034
|
+
|
|
1035
|
+
```markdown
|
|
1036
|
+
### Build Verification
|
|
1037
|
+
|
|
1038
|
+
| Check | Status |
|
|
1039
|
+
|-------|--------|
|
|
1040
|
+
| Feature branch build | ❌ Failed |
|
|
1041
|
+
| Main branch build | ❌ Failed |
|
|
1042
|
+
| Error match | ✅ Same error |
|
|
1043
|
+
| Regression | **No** (pre-existing) |
|
|
1044
|
+
|
|
1045
|
+
**Note:** Build failure is pre-existing on main branch. Not blocking this PR.
|
|
1046
|
+
```
|
|
1047
|
+
|
|
1048
|
+
### 2b. Test Coverage Transparency (REQUIRED)
|
|
1049
|
+
|
|
1050
|
+
**Purpose:** Report which changed files have corresponding tests, not just "N tests passed."
|
|
1051
|
+
|
|
1052
|
+
**After running `npm test`, you MUST analyze test coverage for changed files:**
|
|
1053
|
+
|
|
1054
|
+
Use the Glob tool to check for corresponding test files:
|
|
1055
|
+
```
|
|
1056
|
+
# Get changed source files (excluding tests) from git
|
|
1057
|
+
changed=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' | grep -v -E '\.test\.|\.spec\.|__tests__' || true)
|
|
1058
|
+
|
|
1059
|
+
# For each changed file, use the Glob tool to find matching test files
|
|
1060
|
+
# Glob(pattern="**/${base}.test.*") or Glob(pattern="**/${base}.spec.*")
|
|
1061
|
+
# If no test file found, report "NO TEST: $file"
|
|
1062
|
+
```
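
Where the Glob tool is not available, the same lookup can be approximated in plain shell. This is a sketch under assumptions: `find_test_file` is a hypothetical helper, it matches only on the file's base name, and it skips `node_modules`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): print the first test file whose
# name matches <base>.test.* or <base>.spec.*, or nothing if none exists.
# Args: source file path, search root (default: current directory)
# Limitation: base names containing glob characters (e.g. "[id]") are not escaped.
find_test_file() {
  local src="$1" root="${2:-.}" base
  base="$(basename "$src")"
  base="${base%.*}"   # strip the final extension: foo.ts -> foo
  find "$root" -type f \( -name "${base}.test.*" -o -name "${base}.spec.*" \) \
    -not -path '*/node_modules/*' -print | head -n 1
}
```

A loop over the changed files could then report `NO TEST: $file` whenever `find_test_file "$file"` prints nothing.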
|
|
1063
|
+
|
|
1064
|
+
**Required reporting format:**
|
|
1065
|
+
|
|
1066
|
+
| Scenario | Report |
|
|
1067
|
+
|----------|--------|
|
|
1068
|
+
| Tests cover changed files | `Tests: N passed (covers changed files)` |
|
|
1069
|
+
| Tests don't cover changed files | `Tests: N passed (⚠️ 0 cover changed files)` |
|
|
1070
|
+
| No tests for specific files | `Tests: N passed (⚠️ NO TESTS: file1.ts, file2.ts)` |
|
|
1071
|
+
|
|
1072
|
+
**Include in output template:**
|
|
1073
|
+
|
|
1074
|
+
```markdown
|
|
1075
|
+
### Test Coverage Analysis
|
|
1076
|
+
|
|
1077
|
+
| Changed File | Has Tests? | Test File |
|
|
1078
|
+
|--------------|------------|-----------|
|
|
1079
|
+
| `lib/foo.ts` | ✅ Yes | `__tests__/foo.test.ts` |
|
|
1080
|
+
| `lib/bar.ts` | ⚠️ No | - |
|
|
1081
|
+
|
|
1082
|
+
**Coverage:** X/Y changed files have tests
|
|
1083
|
+
```
|
|
1084
|
+
|
|
1085
|
+
### 2c. Change Tier Classification
|
|
1086
|
+
|
|
1087
|
+
**Purpose:** Flag coverage gaps based on criticality, not just presence/absence.
|
|
1088
|
+
|
|
1089
|
+
**Tier definitions:**
|
|
1090
|
+
|
|
1091
|
+
| Tier | Change Type | Coverage Requirement |
|
|
1092
|
+
|------|-------------|---------------------|
|
|
1093
|
+
| **Critical** | Auth, payments, security, server-actions, middleware, admin | Flag prominently if missing |
|
|
1094
|
+
| **Standard** | Business logic, API handlers, utilities | Note if missing |
|
|
1095
|
+
| **Optional** | Config, types-only, UI tweaks | No flag needed |
|
|
1096
|
+
|
|
1097
|
+
**Detection heuristic:**
|
|
1098
|
+
|
|
1099
|
+
```bash
|
|
1100
|
+
# Detect critical paths in changed files
|
|
1101
|
+
changed=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' || true)
|
|
1102
|
+
critical=$(echo "$changed" | grep -E 'auth|payment|security|server-action|middleware|admin' || true)
|
|
1103
|
+
|
|
1104
|
+
if [[ -n "$critical" ]]; then
|
|
1105
|
+
echo "⚠️ CRITICAL PATH CHANGES (test coverage strongly recommended):"
|
|
1106
|
+
echo "$critical"
|
|
1107
|
+
fi
|
|
1108
|
+
```
|
|
1109
|
+
|
|
1110
|
+
**Reporting format:**
|
|
1111
|
+
|
|
1112
|
+
```markdown
|
|
1113
|
+
### Change Tiers
|
|
1114
|
+
|
|
1115
|
+
| Tier | Files | Coverage Status |
|
|
1116
|
+
|------|-------|-----------------|
|
|
1117
|
+
| Critical | `auth/login.ts` | ⚠️ NO TESTS - Flag prominently |
|
|
1118
|
+
| Standard | `lib/utils.ts` | Note: No tests |
|
|
1119
|
+
| Optional | `types/index.ts` | OK - Types only |
|
|
1120
|
+
```
|
|
1121
|
+
|
|
1122
|
+
### 2d. Test Quality Review
|
|
1123
|
+
|
|
1124
|
+
**When to apply:** Test files were added or modified.
|
|
1125
|
+
|
|
1126
|
+
Evaluate test quality using the checklist:
|
|
1127
|
+
- **Behavior vs Implementation:** Tests assert on outputs, not internals
|
|
1128
|
+
- **Coverage Depth:** Error paths and edge cases covered
|
|
1129
|
+
- **Mock Hygiene:** Only external dependencies mocked
|
|
1130
|
+
- **Test Reliability:** No timing dependencies, deterministic
|
|
1131
|
+
|
|
1132
|
+
See [test-quality-checklist.md](references/test-quality-checklist.md) for detailed evaluation criteria.
|
|
1133
|
+
|
|
1134
|
+
**Flag common issues:**
|
|
1135
|
+
- Over-mocking (4+ modules mocked in single test)
|
|
1136
|
+
- Missing error path tests
|
|
1137
|
+
- Snapshot abuse (>50 line snapshots)
|
|
1138
|
+
- Implementation mirroring
|
|
1139
|
+
|
|
1140
|
+
### 2e. Anti-Pattern Detection
|
|
1141
|
+
|
|
1142
|
+
**Always run** code pattern checks on changed files:
|
|
1143
|
+
|
|
1144
|
+
```bash
|
|
1145
|
+
# Get changed TypeScript/JavaScript files
|
|
1146
|
+
changed_files=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' || true)
|
|
1147
|
+
```
|
|
1148
|
+
|
|
1149
|
+
**Check for:**
|
|
1150
|
+
|
|
1151
|
+
| Category | Pattern | Risk |
|
|
1152
|
+
|----------|---------|------|
|
|
1153
|
+
| Performance | N+1 query (`await` in loop) | ⚠️ Medium |
|
|
1154
|
+
| Error Handling | Empty catch block | ⚠️ Medium |
|
|
1155
|
+
| Security | Hardcoded secrets | ❌ High |
|
|
1156
|
+
| Security | SQL concatenation | ❌ High |
|
|
1157
|
+
| Security | Server binds all interfaces (`0.0.0.0`) | ❌ High |
|
|
1158
|
+
| Memory | Uncleared interval/timeout | ⚠️ Medium |
|
|
1159
|
+
| A11y | Image without alt | ⚠️ Low |
|
|
1160
|
+
|
|
1161
|
+
**Dependency audit** (when `package.json` modified):
|
|
1162
|
+
|
|
1163
|
+
| Flag | Threshold |
|
|
1164
|
+
|------|-----------|
|
|
1165
|
+
| Low downloads | <1,000/week |
|
|
1166
|
+
| Stale | No updates 12+ months |
|
|
1167
|
+
| License risk | UNLICENSED, GPL in MIT |
|
|
1168
|
+
| Security | Known vulnerabilities |
|
|
1169
|
+
|
|
1170
|
+
See [anti-pattern-detection.md](references/anti-pattern-detection.md) for detection commands and full criteria.
|
|
1171
|
+
|
|
1172
|
+
### 2f. Product Review (When New User-Facing Features Added)
|
|
1173
|
+
|
|
1174
|
+
**When to apply:** New CLI commands, MCP tools, configuration options, or other features that end users interact with directly.
|
|
1175
|
+
|
|
1176
|
+
**Detection:**
|
|
1177
|
+
```bash
|
|
1178
|
+
# Detect user-facing changes
|
|
1179
|
+
cli_added=$(git diff main...HEAD -- bin/cli.ts | grep -E '^\+.*\.command\(' | wc -l | xargs || true)
|
|
1180
|
+
new_commands=$(git diff main...HEAD --name-only | grep -E '^src/commands/' | wc -l | xargs || true)
|
|
1181
|
+
mcp_added=$(git diff main...HEAD --name-only | grep -E '^src/mcp/' | wc -l | xargs || true)
|
|
1182
|
+
config_changed=$(git diff main...HEAD --name-only | grep -E 'settings|config' | wc -l | xargs || true)
|
|
1183
|
+
|
|
1184
|
+
if [[ $((cli_added + new_commands + mcp_added + config_changed)) -gt 0 ]]; then
|
|
1185
|
+
echo "User-facing changes detected - running product review"
|
|
1186
|
+
fi
|
|
1187
|
+
```
|
|
1188
|
+
|
|
1189
|
+
**If user-facing changes detected, answer these questions:**
|
|
1190
|
+
|
|
1191
|
+
| Question | What to check |
|
|
1192
|
+
|----------|---------------|
|
|
1193
|
+
| **First-time setup:** Can a new user go from zero to working? | List every prerequisite. Try the setup path mentally. |
|
|
1194
|
+
| **Per-environment differences:** Does this work the same everywhere? | macOS/Linux/Windows, different clients/tools, CI vs local |
|
|
1195
|
+
| **What does the user see?** | Walk through the actual UX — wait times, output format, progress indicators |
|
|
1196
|
+
| **What happens after?** | Where's the output? What does the user do next? |
|
|
1197
|
+
| **Failure modes the user will hit:** | Not code edge cases — real scenarios (wrong directory, missing auth, timeout) |
|
|
1198
|
+
|
|
1199
|
+
**Output Format:**
|
|
1200
|
+
|
|
1201
|
+
```markdown
|
|
1202
|
+
### Product Review
|
|
1203
|
+
|
|
1204
|
+
**User-facing changes:** [list new commands/tools/options]
|
|
1205
|
+
|
|
1206
|
+
| Question | Finding |
|
|
1207
|
+
|----------|---------|
|
|
1208
|
+
| First-time setup | [All prerequisites identified? Setup path clear?] |
|
|
1209
|
+
| Per-environment | [Any client/platform differences?] |
|
|
1210
|
+
| User sees | [Wait times, output format, progress] |
|
|
1211
|
+
| After completion | [Where output goes, next steps] |
|
|
1212
|
+
| Likely failure modes | [Real user scenarios] |
|
|
1213
|
+
|
|
1214
|
+
**Gaps found:** [list any gaps, or "None"]
|
|
1215
|
+
```
|
|
1216
|
+
|
|
1217
|
+
**Verdict Impact:**
|
|
1218
|
+
|
|
1219
|
+
| Finding | Verdict Impact |
|
|
1220
|
+
|---------|----------------|
|
|
1221
|
+
| No gaps | No impact |
|
|
1222
|
+
| Missing prerequisites in docs | `AC_MET_BUT_NOT_A_PLUS` |
|
|
1223
|
+
| Feature silently fails in common environment | `AC_NOT_MET` (e.g., wrong cwd, missing auth) |
|
|
1224
|
+
| Poor UX but functional | Note in findings |
|
|
1225
|
+
|
|
1226
|
+
### 2g. Call-Site Review (When New Functions Added)
|
|
1227
|
+
|
|
1228
|
+
**When to apply:** New exported functions are detected in the diff.
|
|
1229
|
+
|
|
1230
|
+
**Purpose:** Review not just the function implementation but **where** and **how** it's called. A function can be perfectly implemented but called incorrectly at the call site. (Origin: Issue #295 — `rebaseBeforePR()` had thorough unit tests but was called for every issue in a chain loop when the AC specified "only the final branch.")
|
|
1231
|
+
|
|
1232
|
+
**Detection:**
|
|
1233
|
+
```bash
# Find new exported functions (added lines only)
# Catches: export function foo, export async function foo,
#          export const foo = () =>, export const foo = async () =>
fn_exports=$(git diff main...HEAD | grep -E '^\+export (async )?function \w+' | sed 's/^+//' | grep -oE 'function \w+' | awk '{print $2}' || true)
arrow_exports=$(git diff main...HEAD | grep -E '^\+export const \w+ = (async )?\(' | sed 's/^+//' | grep -oE 'const \w+' | awk '{print $2}' || true)
new_exports=$(echo -e "${fn_exports}\n${arrow_exports}" | sed '/^$/d' | sort -u)
# grep -c already prints 0 when nothing matches (and exits 1), so use
# "|| true" rather than "|| echo 0", which would produce "0" twice.
export_count=$(echo "$new_exports" | grep -c . || true)

if [[ $export_count -gt 0 ]]; then
  echo "New exported functions detected: $export_count"
  echo "$new_exports"
fi
```
|
|
1247
|
+
|
|
1248
|
+
**If new exported functions found:**
|
|
1249
|
+
|
|
1250
|
+
#### Step 1: Call-Site Inventory
|
|
1251
|
+
|
|
1252
|
+
For each new exported function, identify ALL call sites using the Grep tool:
|
|
1253
|
+
|
|
1254
|
+
```
|
|
1255
|
+
# For each new function, find call sites
|
|
1256
|
+
# Use the Grep tool for each function name:
|
|
1257
|
+
Grep(pattern="${func}\\(", glob="*.{ts,tsx}", output_mode="content")
|
|
1258
|
+
# Then exclude test files, __tests__ dirs, and the export definition itself
|
|
1259
|
+
```
|
|
1260
|
+
|
|
1261
|
+
**Call site types:**
|
|
1262
|
+
- Direct call: `functionName(args)`
|
|
1263
|
+
- Method call: `this.functionName(args)` or `obj.functionName(args)`
|
|
1264
|
+
- Callback: `.then(functionName)` or `array.map(functionName)`
|
|
1265
|
+
- Conditional: `condition && functionName(args)`
|
|
1266
|
+
|
|
1267
|
+
#### Step 2: Condition Audit
|
|
1268
|
+
|
|
1269
|
+
For each call site, document the conditions that gate the call:
|
|
1270
|
+
|
|
1271
|
+
| Condition Type | Example | Check |
|
|
1272
|
+
|----------------|---------|-------|
|
|
1273
|
+
| Guard clause | `if (x) { fn() }` | Does condition match AC? |
|
|
1274
|
+
| Logical AND | `x && fn()` | Is guard sufficient? |
|
|
1275
|
+
| Ternary | `x ? fn() : null` | Correct branch? |
|
|
1276
|
+
| Early return | `if (!x) return; fn()` | Correct logic? |
|
|
1277
|
+
|
|
1278
|
+
**Compare conditions against AC constraints:**
|
|
1279
|
+
- AC says "only when X" → Call site should have `if (X)` guard
|
|
1280
|
+
- AC says "not in Y mode" → Call site should have `if (!Y)` guard
|
|
1281
|
+
- AC says "for Z items" → Call site should filter for Z condition
|
|
1282
|
+
|
|
1283
|
+
#### Step 3: Loop Awareness
|
|
1284
|
+
|
|
1285
|
+
**Detect if function is called inside a loop:**
|
|
1286
|
+
|
|
1287
|
+
```
|
|
1288
|
+
# For each function, use the Grep tool with context to check surrounding lines:
|
|
1289
|
+
Grep(pattern="${func}\\(", glob="*.{ts,tsx}", output_mode="content", -B=5)
|
|
1290
|
+
# Then inspect the context lines for loop constructs:
|
|
1291
|
+
# for, while, forEach, .map(, .filter(, .reduce(
|
|
1292
|
+
# If a loop is found, flag: "function called inside loop - verify iteration scope"
|
|
1293
|
+
```
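
Outside the Grep tool, the same heuristic can be approximated with plain `grep`. The sketch below is an assumption-laden illustration: `call_in_loop` is a hypothetical helper, it inspects one file at a time, and like the approach above it only looks at the five lines before each call, so it can produce both false positives and false negatives.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): succeed if a call to the given
# function has a loop construct within the 5 preceding lines of the file.
# Args: function name, file path
call_in_loop() {
  local func="$1" file="$2"
  # First grep: call sites with 5 lines of leading context.
  # Second grep: for(...) / while(...) keywords or .forEach( .map( .filter( .reduce(
  grep -B5 -E "${func}\(" "$file" \
    | grep -qE '(^|[^[:alnum:]_])(for|while)[[:space:]]*\(|\.(forEach|map|filter|reduce)\('
}
```

For the Issue #295 case described earlier, `call_in_loop rebaseBeforePR path/to/file.ts` (the path is a placeholder) would flag a call sitting directly inside a `for (...)` loop; the reviewer still has to check whether an index guard or condition limits the iterations.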
|
|
1294
|
+
|
|
1295
|
+
**Loop iteration review questions:**
|
|
1296
|
+
1. Should function run for ALL iterations? → OK if yes
|
|
1297
|
+
2. Should function run for FIRST/LAST only? → Check for index guard
|
|
1298
|
+
3. Should function run for SOME iterations? → Check for condition filter
|
|
1299
|
+
|
|
1300
|
+
**Red flags:**
|
|
1301
|
+
- Function called unconditionally in loop when AC says "only once"
|
|
1302
|
+
- No break/return after call when AC implies single execution
|
|
1303
|
+
- Missing mode/flag guard when AC specifies conditions
|
|
1304
|
+
|
|
1305
|
+
#### Step 4: Mode Sensitivity
|
|
1306
|
+
|
|
1307
|
+
If the function accepts configuration or mode options:
|
|
1308
|
+
- Is the correct mode passed at the call site?
|
|
1309
|
+
- Are all mode-specific paths exercised appropriately?
|
|
1310
|
+
|
|
1311
|
+
**Output Format:**
|
|
1312
|
+
|
|
1313
|
+
```markdown
|
|
1314
|
+
### Call-Site Review
|
|
1315
|
+
|
|
1316
|
+
**New exported functions detected:** N
|
|
1317
|
+
|
|
1318
|
+
| Function | Call Sites | Loop? | Conditions | AC Match |
|
|
1319
|
+
|----------|-----------|-------|------------|----------|
|
|
1320
|
+
| `newFunction()` | `file.ts:123` | No | `if (success)` | ✅ Matches AC-2 |
|
|
1321
|
+
| `anotherFunc()` | `run.ts:456` | Yes | None | ⚠️ Missing guard (AC-3 says "final only") |
|
|
1322
|
+
| `thirdFunc()` | Not called | - | - | ⚠️ Unused export |
|
|
1323
|
+
|
|
1324
|
+
**Findings:**
|
|
1325
|
+
- [List any mismatches between call-site conditions and AC constraints]
|
|
1326
|
+
|
|
1327
|
+
**Recommendations:**
|
|
1328
|
+
- [Specific fixes needed at call sites]
|
|
1329
|
+
```
|
|
1330
|
+
|
|
1331
|
+
**Verdict Impact:**
|
|
1332
|
+
|
|
1333
|
+
| Finding | Verdict Impact |
|
|
1334
|
+
|---------|----------------|
|
|
1335
|
+
| All call sites match AC | No impact |
|
|
1336
|
+
| Call site missing AC-required guard | `AC_NOT_MET` |
|
|
1337
|
+
| Function not called anywhere | `AC_MET_BUT_NOT_A_PLUS` (dead export) |
|
|
1338
|
+
| Call site in loop, AC unclear about iteration | `NEEDS_VERIFICATION` |
|
|
1339
|
+
|
|
1340
|
+
See [call-site-review.md](references/call-site-review.md) for detailed methodology and examples.
|
|
1341
|
+
|
|
1342
|
+
### 2h. CLI Registration Verification (When Option Interfaces Modified)
|
|
1343
|
+
|
|
1344
|
+
**When to apply:** `RunOptions` or similar CLI option interfaces are modified in the diff.
|
|
1345
|
+
|
|
1346
|
+
**Purpose:** Detect new option interface fields that have runtime usage (via `mergedOptions.X`) but lack corresponding CLI registration (via `.option()` in `bin/cli.ts`). This class of bug is invisible to TypeScript, build, and unit tests—caught only by manual review or this check.
|
|
1347
|
+
|
|
1348
|
+
**Origin:** Issue #305 — `force?: boolean` was added to `RunOptions`, checked at runtime with `mergedOptions.force`, and referenced in user-facing warnings ("use --force to re-run"), but `--force` was never registered in `bin/cli.ts`. The bug passed QA and was caught only by manual cross-reference.
|
|
1349
|
+
|
|
1350
|
+
**Detection:**
|
|
1351
|
+
|
|
1352
|
+
```bash
|
|
1353
|
+
# Check if option interfaces or CLI file were modified
|
|
1354
|
+
option_files=$(git diff main...HEAD --name-only | grep -E "batch-executor\.ts|run\.ts|cli\.ts" || true)
|
|
1355
|
+
option_modified=$(echo "$option_files" | grep -v "^$" | wc -l | xargs || echo "0")
|
|
1356
|
+
|
|
1357
|
+
if [[ $option_modified -gt 0 ]]; then
|
|
1358
|
+
echo "Option interface or CLI file modified - running CLI registration verification"
|
|
1359
|
+
fi
|
|
1360
|
+
```
|
|
1361
|
+
|
|
1362
|
+
**Key File Map:**
|
|
1363
|
+
|
|
1364
|
+
| Interface | Location | CLI Registration |
|
|
1365
|
+
|-----------|----------|------------------|
|
|
1366
|
+
| `RunOptions` | `src/lib/workflow/batch-executor.ts` | `run` command in `bin/cli.ts` |
|
|
1367
|
+
|
|
1368
|
+
**Verification Logic:**
|
|
1369
|
+
|
|
1370
|
+
1. **Extract new interface fields from diff:**
|
|
1371
|
+
```bash
|
|
1372
|
+
# Get new fields added to RunOptions (or similar interfaces)
|
|
1373
|
+
new_fields=$(git diff main...HEAD -- src/lib/workflow/batch-executor.ts | \
|
|
1374
|
+
grep -E '^\+\s+\w+\??: ' | \
|
|
1375
|
+
sed 's/.*+ *//' | \
|
|
1376
|
+
sed 's/\?.*//' | \
|
|
1377
|
+
sed 's/:.*//' | \
|
|
1378
|
+
tr -d ' ' || true)
|
|
1379
|
+
```
|
|
1380
|
+
|
|
1381
|
+
2. **Check for runtime usage (mergedOptions.X):**
|
|
1382
|
+
```bash
|
|
1383
|
+
# For each new field, check if it's used at runtime
|
|
1384
|
+
for field in $new_fields; do
|
|
1385
|
+
runtime_usage=$(git diff main...HEAD | grep -E "mergedOptions\.$field|options\.$field" || true)
|
|
1386
|
+
if [[ -n "$runtime_usage" ]]; then
|
|
1387
|
+
echo "Field '$field' has runtime usage - verify CLI registration"
|
|
1388
|
+
fi
|
|
1389
|
+
done
|
|
1390
|
+
```
|
|
1391
|
+
|
|
1392
|
+
3. **Verify CLI registration exists:**
|
|
1393
|
+
```bash
|
|
1394
|
+
# Extract registered CLI options from bin/cli.ts
|
|
1395
|
+
# Matches patterns like: --force, --dry-run, --timeout
|
|
1396
|
+
registered=$(grep -oE '"(-[a-zA-Z], )?--[a-z][a-z0-9-]*' bin/cli.ts | sed -E 's/^.*--//' | sort -u || true)
|
|
1397
|
+
|
|
1398
|
+
# Check if field has corresponding registration
|
|
1399
|
+
# Note: CLI flags use kebab-case, interface fields use camelCase
|
|
1400
|
+
# Example: fieldName → --field-name
|
|
1401
|
+
```
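The camelCase to kebab-case comparison can be sketched as follows. This is an illustration under assumptions, not the project's implementation: `camelToKebab`, `registeredFlags`, and `unregisteredFields` are hypothetical helpers, and the regular expression assumes flags are declared through `.option("...")` string literals as in the remediation example. Negated flags such as `--no-x` are not handled.

```typescript
// Convert an interface field name to its expected CLI flag name: dryRun -> dry-run.
function camelToKebab(field: string): string {
  return field.replace(/([a-z0-9])([A-Z])/g, "$1-$2").toLowerCase();
}

// Collect long flags from .option("...") literals, tolerating a short alias
// ("-f, --force") and a value placeholder ("--timeout <seconds>").
function registeredFlags(cliSource: string): string[] {
  const flags: string[] = [];
  const pattern = /\.option\(\s*["'](?:-[A-Za-z],\s*)?--([a-z][a-z0-9-]*)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(cliSource)) !== null) {
    flags.push(match[1]);
  }
  return flags;
}

// Fields whose kebab-case flag is not registered in the CLI source.
function unregisteredFields(fields: string[], cliSource: string): string[] {
  const flags = registeredFlags(cliSource);
  return fields.filter((field) => flags.indexOf(camelToKebab(field)) === -1);
}

// Sample CLI source (made up for this example).
const cliSample = [
  '.option("-f, --force", "Re-run completed issues")',
  '.option("--dry-run", "Preview only")',
  ".option('--timeout <seconds>', 'Phase timeout')",
].join("\n");

const missing = unregisteredFields(["force", "dryRun", "timeout", "newField"], cliSample);
```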
|
|
1402
|
+
|
|
1403
|
+
4. **Internal-only field exclusion (AC-5):**
|
|
1404
|
+
|
|
1405
|
+
Fields without runtime `mergedOptions.X` usage are internal-only and don't need CLI registration:
|
|
1406
|
+
- `autoDetectPhases` — set programmatically, not user-facing
|
|
1407
|
+
- `worktreeIsolation` — environment-controlled
|
|
1408
|
+
- Fields only used in type signatures without runtime access
|
|
1409
|
+
|
|
1410
|
+
**Detection:** If `grep "mergedOptions.$field"` returns no matches, the field is internal-only.
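Putting the runtime-usage and registration checks together, the per-field status could be derived roughly as below. `FieldCheck`, `classifyField`, and `verificationStatus` are hypothetical names, and treating an empty field list as "N/A" is an assumption made for this sketch.

```typescript
type RegistrationStatus = "OK" | "FAIL" | "SKIP";

interface FieldCheck {
  field: string;
  hasRuntimeUsage: boolean; // e.g. mergedOptions.<field> appears in the diff
  cliRegistered: boolean; // e.g. --<field-name> appears in bin/cli.ts
}

function classifyField(check: FieldCheck): RegistrationStatus {
  if (!check.hasRuntimeUsage) return "SKIP"; // internal-only: no CLI flag needed
  return check.cliRegistered ? "OK" : "FAIL";
}

// Any FAIL fails the whole check; no fields to inspect is treated as N/A (assumption).
function verificationStatus(checks: FieldCheck[]): "Passed" | "Failed" | "N/A" {
  if (checks.length === 0) return "N/A";
  return checks.some((c) => classifyField(c) === "FAIL") ? "Failed" : "Passed";
}
```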
|
|
1411
|
+
|
|
1412
|
+
**Output Format:**
|
|
1413
|
+
|
|
1414
|
+
```markdown
|
|
1415
|
+
### CLI Registration Verification
|
|
1416
|
+
|
|
1417
|
+
**Option files modified:** Yes/No
|
|
1418
|
+
|
|
1419
|
+
| Interface Field | Runtime Usage | CLI Registered | Status |
|
|
1420
|
+
|----------------|--------------|----------------|--------|
|
|
1421
|
+
| `force` | `mergedOptions.force` (line 2447) | `--force` in bin/cli.ts | ✅ OK |
|
|
1422
|
+
| `newField` | `mergedOptions.newField` (line 500) | NOT REGISTERED | ❌ FAIL |
|
|
1423
|
+
| `internalOnly` | None (internal) | N/A | ⏭️ SKIP |
|
|
1424
|
+
|
|
1425
|
+
**Verification Status:** Passed / Failed / N/A
|
|
1426
|
+
```
|
|
1427
|
+
|
|
1428
|
+
**Verdict Gating (AC-4):**
|
|
1429
|
+
|
|
1430
|
+
| Verification Status | Maximum Verdict |
|
|
1431
|
+
|---------------------|-----------------|
|
|
1432
|
+
| Passed | READY_FOR_MERGE |
|
|
1433
|
+
| N/A (no option changes) | READY_FOR_MERGE |
|
|
1434
|
+
| Failed | AC_NOT_MET |
|
|
1435
|
+
|
|
1436
|
+
**CRITICAL:** If CLI registration verification = **Failed**, verdict CANNOT be `READY_FOR_MERGE`. Missing CLI registrations mean users cannot access the feature via command line.
|
|
1437
|
+
|
|
1438
|
+
**If verification fails:**
|
|
1439
|
+
1. Flag the specific fields missing CLI registration
|
|
1440
|
+
2. Set verdict to `AC_NOT_MET`
|
|
1441
|
+
3. Include remediation steps:
|
|
1442
|
+
```markdown
|
|
1443
|
+
**Remediation:**
|
|
1444
|
+
1. Add to `bin/cli.ts` under the appropriate command:
|
|
1445
|
+
```typescript
|
|
1446
|
+
.option("--field-name", "Description of what the flag does")
|
|
1447
|
+
```
|
|
1448
|
+
2. Verify with `npx sequant <command> --help`
|
|
1449
|
+
```
|
|
1450
|
+
|
|
1451
|
+
---
|
|
1452
|
+
|
|
1453
|
+
### 3. QA vs AC
|
|
1454
|
+
|
|
1455
|
+
For each AC item, mark as:
|
|
1456
|
+
- `MET`
|
|
1457
|
+
- `PARTIALLY_MET`
|
|
1458
|
+
- `NOT_MET`
|
|
1459
|
+
|
|
1460
|
+
Provide a sentence or two explaining why.
|
|
1461
|
+
|
|
1462
|
+
#### AC Literal Verification (REQUIRED)
|
|
1463
|
+
|
|
1464
|
+
**Before marking any AC as MET**, verify the implementation matches the AC text literally, not just in spirit:
|
|
1465
|
+
|
|
1466
|
+
1. **Extract specific technical claims** from the AC text (commands, flags, function names, config keys, UI elements)
|
|
1467
|
+
2. **Search the implementation** for each claim using Grep or Read — do not assume presence
|
|
1468
|
+
3. **If the AC mentions a flag** (e.g., `--file <relevant-files>`), verify that flag appears in the code
|
|
1469
|
+
4. **If the AC says "works end-to-end"**, trace the full call chain from entry point to execution
|
|
1470
|
+
|
|
1471
|
+
**Example:** If AC says *"shells out to `aider --yes --no-auto-commits --message '<prompt>' --file <relevant-files>`"*:
|
|
1472
|
+
- Verify `--yes` is in args array ✅
|
|
1473
|
+
- Verify `--no-auto-commits` is in args array ✅
|
|
1474
|
+
- Verify `--message` is in args array ✅
|
|
1475
|
+
- Verify `--file` is in args array — **if missing, AC is NOT MET** ❌
|
|
1476
|
+
|
|
1477
|
+
Do NOT mark MET based on "the general intent is satisfied." The AC text is the contract — verify it literally.
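The literal check in the example above amounts to a set difference between the flags the AC names and the flags the implementation passes. A minimal sketch, where `missingFlags` is a hypothetical helper and `implementedArgs` is sample data rather than real project code:

```typescript
// Return every flag the AC text requires that is absent from the spawned args.
function missingFlags(requiredFlags: string[], args: string[]): string[] {
  return requiredFlags.filter((flag) => args.indexOf(flag) === -1);
}

// Flags taken literally from the AC text in the example above.
const requiredByAc = ["--yes", "--no-auto-commits", "--message", "--file"];

// Hypothetical args array as it might appear in an implementation under review.
const implementedArgs = ["--yes", "--no-auto-commits", "--message", "fix the bug"];

const absent = missingFlags(requiredByAc, implementedArgs);

// Any missing flag means NOT_MET, even if the general intent is satisfied.
const acStatus = absent.length === 0 ? "MET" : "NOT_MET";
```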
|
|
1478
|
+
|
|
1479
|
+
### 3a. AC Status Persistence — REQUIRED
|
|
1480
|
+
|
|
1481
|
+
**After evaluating each AC item**, update the status in workflow state using the state CLI:
|
|
1482
|
+
|
|
1483
|
+
```bash
|
|
1484
|
+
# Step 1: Initialize AC items for the issue (run once, before updating statuses)
|
|
1485
|
+
npx tsx scripts/state/update.ts init-ac <issue-number> <ac-count>
|
|
1486
|
+
|
|
1487
|
+
# Example: Initialize 4 AC items for issue #250
|
|
1488
|
+
npx tsx scripts/state/update.ts init-ac 250 4
|
|
1489
|
+
```
|
|
1490
|
+
|
|
1491
|
+
```bash
|
|
1492
|
+
# Step 2: Update each AC item's status
|
|
1493
|
+
npx tsx scripts/state/update.ts ac <issue-number> <ac-id> <status> "<notes>"
|
|
1494
|
+
|
|
1495
|
+
# Examples:
|
|
1496
|
+
npx tsx scripts/state/update.ts ac 250 AC-1 met "Verified: tests pass and feature works"
|
|
1497
|
+
npx tsx scripts/state/update.ts ac 250 AC-2 not_met "Missing error handling for edge case"
|
|
1498
|
+
npx tsx scripts/state/update.ts ac 250 AC-3 blocked "Waiting on upstream dependency"
|
|
1499
|
+
```
|
|
1500
|
+
|
|
1501
|
+
**Status mapping:**
|
|
1502
|
+
- `MET` → `met`
|
|
1503
|
+
- `PARTIALLY_MET` → `not_met` (with notes explaining what's missing)
|
|
1504
|
+
- `NOT_MET` → `not_met`
|
|
1505
|
+
- `BLOCKED` → `blocked` (external dependency issue)
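The mapping above is small enough to express as a single function. This sketch assumes the four QA statuses and three state values listed; `toStateStatus` and `acUpdateArgs` are hypothetical helpers, not part of the state CLI.

```typescript
type QaStatus = "MET" | "PARTIALLY_MET" | "NOT_MET" | "BLOCKED";
type StateStatus = "met" | "not_met" | "blocked";

function toStateStatus(qa: QaStatus): StateStatus {
  switch (qa) {
    case "MET":
      return "met";
    case "PARTIALLY_MET": // stored as not_met; the notes should say what is missing
    case "NOT_MET":
      return "not_met";
    case "BLOCKED":
      return "blocked";
  }
}

// Build the argument list that follows `update.ts` in the commands shown above.
function acUpdateArgs(issue: number, acId: string, qa: QaStatus, notes: string): string[] {
  return ["ac", String(issue), acId, toStateStatus(qa), notes];
}
```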
|
|
1506
|
+
|
|
1507
|
+
**Why this matters:** Updating AC status in state enables:
|
|
1508
|
+
- Dashboard shows real-time AC progress per issue
|
|
1509
|
+
- Cross-skill tracking of which AC items need work
|
|
1510
|
+
- Summary badges show "X/Y met" status
|
|
1511
|
+
|
|
1512
|
+
**If issue has no stored AC:**
|
|
1513
|
+
- Run `init-ac` first to create the AC items
|
|
1514
|
+
- Then update each AC status individually
|
|
1515
|
+
|
|
1516
|
+
### 4. Failure Path & Edge Case Testing (REQUIRED)
|
|
1517
|
+
|
|
1518
|
+
Before any READY_FOR_MERGE verdict, complete the adversarial thinking checklist:
|
|
1519
|
+
|
|
1520
|
+
1. **"What would break this?"** - Identify and test at least 2 failure scenarios
|
|
1521
|
+
2. **"What assumptions am I making?"** - List and validate key assumptions
|
|
1522
|
+
3. **"What's the unhappy path?"** - Test invalid inputs, failed dependencies
|
|
1523
|
+
4. **"Did I test the feature's PRIMARY PURPOSE?"** - If it handles errors, trigger an error
|
|
1524
|
+
|
|
1525
|
+
See [testing-requirements.md](references/testing-requirements.md) for edge case checklists.
|
|
1526
|
+
|
|
1527
|
+
### 5. Adversarial Self-Evaluation (REQUIRED)
|
|
1528
|
+
|
|
1529
|
+
**Before issuing your verdict**, you MUST complete this adversarial self-evaluation to catch issues that automated quality checks miss.
|
|
1530
|
+
|
|
1531
|
+
**Why this matters:** QA automation catches type issues, deleted tests, and scope creep - but misses:
|
|
1532
|
+
- Features that don't actually work as expected
|
|
1533
|
+
- Tests that pass but don't test the right things
|
|
1534
|
+
- Edge cases only apparent when actually using the feature
|
|
1535
|
+
|
|
1536
|
+
**Answer these questions honestly:**
|
|
1537
|
+
1. "Did the implementation actually work when I reviewed it, or am I assuming it works?"
|
|
1538
|
+
2. "Do the tests actually test the feature's primary purpose, or just pass?"
|
|
1539
|
+
3. "What's the most likely way this feature could break in production?"
|
|
1540
|
+
4. "Am I giving a positive verdict because the code looks clean, or because I verified it works?"
|
|
1541
|
+
5. "Are there 'design choices' I'm excusing that are actually bad practices?" (e.g., no version pinning, leaking secrets to unnecessary env vars, non-portable shell in example code, no input validation). Would I accept this in a code review from a junior developer?
|
|
1542
|
+
|
|
1543
|
+
**Include this section in your output:**
|
|
1544
|
+
|
|
1545
|
+
```markdown
|
|
1546
|
+
### Self-Evaluation
|
|
1547
|
+
|
|
1548
|
+
- **Verified working:** [Yes/No - did you actually verify the feature works, or assume it does?]
|
|
1549
|
+
- **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
|
|
1550
|
+
- **Likely failure mode:** [What would most likely break this in production?]
|
|
1551
|
+
- **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
|
|
1552
|
+
```
|
|
1553
|
+
|
|
1554
|
+
**If any answer reveals concerns:**
|
|
1555
|
+
- Factor the concerns into your verdict
|
|
1556
|
+
- If significant, change verdict to `AC_NOT_MET` or `AC_MET_BUT_NOT_A_PLUS`
|
|
1557
|
+
- Document the concerns in the QA comment
|
|
1558
|
+
|
|
1559
|
+
**Do NOT skip this self-evaluation.** Honest reflection catches issues that code review cannot.
|
|
1560
|
+
|
|
1561
|
+
#### Skill Change Review (Conditional)
|
|
1562
|
+
|
|
1563
|
+
**When to apply:** `.claude/skills/**/*.md` files were modified.
|
|
1564
|
+
|
|
1565
|
+
**Detect skill changes:**
|
|
1566
|
+
```bash
|
|
1567
|
+
skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/.*\.md$" | wc -l | xargs || true)
|
|
1568
|
+
```
|
|
1569
|
+
|
|
1570
|
+
**If skills_changed > 0, add these adversarial prompts:**
|
|
1571
|
+
|
|
1572
|
+
| Prompt | Why It Matters |
|
|
1573
|
+
|--------|----------------|
|
|
1574
|
+
| **Command verified:** Did you execute at least one referenced command? | Skill instructions can reference commands that don't work (wrong flags, missing fields) |
|
|
1575
|
+
| **Fields verified:** For JSON commands, do field names match actual output? | Issue #178: `gh pr checks --json conclusion` failed because `conclusion` doesn't exist |
|
|
1576
|
+
| **Patterns complete:** What variations might users write that aren't covered? | Skills define patterns - missing coverage causes silent failures |
|
|
1577
|
+
| **Dependencies explicit:** What CLIs/tools does this skill assume are installed? | Missing `gh`, `npm`, etc. breaks the skill with confusing errors |
|
|
1578
|
+
|
|
1579
|
+
**Example skill-specific self-evaluation:**
|
|
1580
|
+
|
|
1581
|
+
```markdown
|
|
1582
|
+
### Skill Change Review
|
|
1583
|
+
|
|
1584
|
+
- [ ] **Command verified:** Executed `gh pr checks --json name,state,bucket` - fields exist ✅
|
|
1585
|
+
- [ ] **Fields verified:** Checked `gh pr checks --help` for valid JSON fields ✅
|
|
1586
|
+
- [ ] **Patterns complete:** Covered SUCCESS, FAILURE, PENDING states ✅
|
|
1587
|
+
- [ ] **Dependencies explicit:** Requires `gh` CLI authenticated ✅
|
|
1588
|
+
```
|
|
1589
|
+
|
|
1590
|
+
---
|
|
1591
|
+
|
|
1592
|
+
### 6. Execution Evidence (REQUIRED for scripts/CLI)
|
|
1593
|
+
|
|
1594
|
+
**When to apply:** `scripts/` or CLI files were modified.
|
|
1595
|
+
|
|
1596
|
+
**Detect change type:**
|
|
1597
|
+
```bash
|
|
1598
|
+
scripts_changed=$(git diff main...HEAD --name-only | grep -E "^scripts/" | wc -l | xargs || true)
|
|
1599
|
+
cli_changed=$(git diff main...HEAD --name-only | grep -E "(cli|commands?)" | wc -l | xargs || true)
|
|
1600
|
+
```
|
|
1601
|
+
|
|
1602
|
+
**If scripts/CLI changed, execute at least one smoke command:**
|
|
1603
|
+
|
|
1604
|
+
| Change Type | Required Command |
|
|
1605
|
+
|-------------|------------------|
|
|
1606
|
+
| `scripts/` | `npx tsx scripts/<file>.ts --help` |
|
|
1607
|
+
| CLI commands | `npx sequant <cmd> --help` or `--dry-run` |
|
|
1608
|
+
| Tests only | `npm test -- --grep "feature"` |
|
|
1609
|
+
| Types/config only | Waiver with reason |
|
|
1610
|
+
|
|
1611
|
+
**Capture evidence:**
|
|
1612
|
+
```bash
|
|
1613
|
+
# Execute and capture
|
|
1614
|
+
npx tsx scripts/example.ts --help 2>&1
|
|
1615
|
+
echo "Exit code: $?"
|
|
1616
|
+
```
|
|
1617
|
+
|
|
1618
|
+
**Evidence status:**
|
|
1619
|
+
- **Complete:** All required commands executed successfully
|
|
1620
|
+
- **Incomplete:** Some commands not run or failed
|
|
1621
|
+
- **Waived:** Explicit reason documented (types-only, config-only)
|
|
1622
|
+
- **Not Required:** No executable changes
|
|
1623
|
+
|
|
1624
|
+
**Verdict gating:**
|
|
1625
|
+
- `READY_FOR_MERGE` requires evidence status: Complete, Waived, or Not Required
|
|
1626
|
+
- `AC_MET_BUT_NOT_A_PLUS` if evidence is Incomplete
|
|
1627
|
+
|
|
1628
|
+
See [quality-gates.md](references/quality-gates.md) for detailed evidence requirements.
|
|
1629
|
+
|
|
1630
|
+
---
|
|
1631
|
+
|
|
1632
|
+
### 6a. Skill Command Verification (REQUIRED for skill changes)
|
|
1633
|
+
|
|
1634
|
+
**When to apply:** `.claude/skills/**/*.md` files were modified.
|
|
1635
|
+
|
|
1636
|
+
**Purpose:** Skills contain instructions with CLI commands. If those commands have wrong syntax, missing flags, or non-existent JSON fields, the skill will fail when used. QA must verify commands actually work before READY_FOR_MERGE.
|
|
1637
|
+
|
|
1638
|
+
**Detect skill changes:**
|
|
1639
|
+
```bash
|
|
1640
|
+
skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/.*\.md$" || true)
|
|
1641
|
+
skill_count=$(echo "$skills_changed" | grep -c . || true)
|
|
1642
|
+
```
|
|
1643
|
+
|
|
1644
|
+
**Pre-requisite check:**
|
|
1645
|
+
```bash
|
|
1646
|
+
# Verify gh CLI is available before running verification
|
|
1647
|
+
if ! command -v gh &>/dev/null; then
|
|
1648
|
+
echo "⚠️ gh CLI not installed - skill command verification skipped"
|
|
1649
|
+
echo "Install: https://cli.github.com/"
|
|
1650
|
+
# Set verification status to "Skipped" with reason
|
|
1651
|
+
fi
|
|
1652
|
+
```
|
|
1653
|
+
|
|
1654
|
+
**If skill_count > 0, extract and verify commands:**
|
|
1655
|
+
|
|
1656
|
+
#### Step 1: Extract Commands from Changed Skills
|
|
1657
|
+
|
|
1658
|
+
```bash
|
|
1659
|
+
# Extract command patterns from skill files
|
|
1660
|
+
for skill_file in $skills_changed; do
|
|
1661
|
+
echo "=== Commands in $skill_file ==="
|
|
1662
|
+
|
|
1663
|
+
# Commands at start of line (simple commands)
|
|
1664
|
+
grep -E '^\s*(gh|npm|npx|git)\s+' "$skill_file" 2>/dev/null | head -10
|
|
1665
|
+
|
|
1666
|
+
# Commands in subshells/variable assignments: result=$(gh pr view ...)
|
|
1667
|
+
grep -oE '\$\((gh|npm|npx|git)\s+[^)]+\)' "$skill_file" 2>/dev/null | head -10
|
|
1668
|
+
|
|
1669
|
+
# Commands in inline backticks
|
|
1670
|
+
grep -oE '`(gh|npm|npx|git)\s+[^`]+`' "$skill_file" 2>/dev/null | head -10
|
|
1671
|
+
|
|
1672
|
+
# Commands after pipe or semicolon: ... | gh ... or ; npm ...
|
|
1673
|
+
grep -oE '[|;]\s*(gh|npm|npx|git)\s+[^|;&]+' "$skill_file" 2>/dev/null | head -10
|
|
1674
|
+
done
|
|
1675
|
+
```
|
|
1676
|
+
|
|
1677
|
+
**Note:** Multi-line commands (using `\` continuation) require manual review. The extraction patterns above capture single-line commands only.
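For comparison, the inline-backtick extraction can also be written as a small function over the skill file's text. This is a sketch only: `extractInlineCommands` is a hypothetical helper, and like the grep patterns it covers single-line commands and nothing else.

```typescript
// Extract single-line commands wrapped in inline backticks, e.g. `gh pr view 1`.
function extractInlineCommands(markdown: string): string[] {
  const commands: string[] = [];
  const pattern = /`((?:gh|npm|npx|git)\s+[^`\n]+)`/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    commands.push(match[1].trim());
  }
  return commands;
}

// Sample skill text (made up for this example).
const skillText = [
  "Run `gh pr checks --json name,state` to list checks.",
  "Then run `npm test` and read the `README`.",
].join("\n");

const inlineCommands = extractInlineCommands(skillText);
```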
|
|
1678
|
+
|
|
1679
|
+
#### Step 2: Verify Command Syntax
|
|
1680
|
+
|
|
1681
|
+
For each extracted command type:
|
|
1682
|
+
|
|
1683
|
+
| Command Type | Verification Method | Example |
|
|
1684
|
+
|--------------|---------------------|---------|
|
|
1685
|
+
| `gh pr checks --json X` | Check `gh pr checks --help` for valid JSON fields | `gh pr checks --help \| grep -A 30 "JSON FIELDS"` |
|
|
1686
|
+
| `gh issue view --json X` | Check `gh issue view --help` for valid JSON fields | `gh issue view --help \| grep -A 30 "JSON FIELDS"` |
|
|
1687
|
+
| `gh api ...` | Verify endpoint format matches GitHub API | Check endpoint structure |
|
|
1688
|
+
| `npm run <script>` | Verify script exists in package.json | `jq '.scripts["<script>"]' package.json` |
|
|
1689
|
+
| `npx tsx <file>` | Verify file exists | `test -f <file>` |
|
|
1690
|
+
| `git <cmd>` | Verify against `git <cmd> --help` | Check valid flags |
|
|
1691
|
+
|
|
1692
|
+
**JSON Field Validation Example:**
|
|
1693
|
+
|
|
1694
|
+
```bash
|
|
1695
|
+
# For commands like: gh pr checks --json name,state,conclusion
|
|
1696
|
+
# Verify each field exists
|
|
1697
|
+
|
|
1698
|
+
# Get valid fields
|
|
1699
|
+
valid_fields=$(gh pr checks --help 2>/dev/null | grep -A 50 "JSON FIELDS" | grep -E "^\s+\w+" | awk '{print $1}' || true)
|
|
1700
|
+
|
|
1701
|
+
# Check if "conclusion" is valid (spoiler: it's not)
|
|
1702
|
+
echo "$valid_fields" | grep -qw "conclusion" && echo "✅ conclusion exists" || echo "❌ conclusion NOT a valid field"
|
|
1703
|
+
```
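The same field check can be sketched as a pure function once the valid field list is known. `jsonFields` and `invalidJsonFields` are hypothetical helpers, and `validForChecks` is a deliberately partial sample list containing only the fields this document names as valid; in practice the list would come from the `--help` output.

```typescript
// Pull the comma-separated field list out of a "--json a,b,c" argument.
function jsonFields(command: string): string[] {
  const match = /--json\s+([A-Za-z0-9_,]+)/.exec(command);
  return match ? match[1].split(",").filter((f) => f.length > 0) : [];
}

// Fields requested by the command that are absent from the valid list.
function invalidJsonFields(command: string, validFields: string[]): string[] {
  return jsonFields(command).filter((f) => validFields.indexOf(f) === -1);
}

// Partial sample list (assumption); obtain the real list from `gh pr checks --help`.
const validForChecks = ["name", "state", "bucket"];

const badFields = invalidJsonFields("gh pr checks --json name,state,conclusion", validForChecks);
```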
|
|
1704
|
+
|
|
1705
|
+
#### Step 3: Handle Placeholders
|
|
1706
|
+
|
|
1707
|
+
Commands with placeholders (`<issue-number>`, `$PR_NUMBER`, `${VAR}`) cannot be executed directly.
|
|
1708
|
+
|
|
1709
|
+
**Handling:**
|
|
1710
|
+
- **Skip execution** for commands with placeholders
|
|
1711
|
+
- **Mark as "Syntax verified, execution skipped"**
|
|
1712
|
+
- **Still verify JSON fields** by extracting field names
|
|
1713
|
+
|
|
1714
|
+
```bash
|
|
1715
|
+
# Example: gh pr checks $pr_number --json name,state,bucket
|
|
1716
|
+
# Can't execute (no $pr_number), but can verify fields
|
|
1717
|
+
echo "name,state,bucket" | tr ',' '\n' | while read -r field; do
|
|
1718
|
+
gh pr checks --help | grep -qw "$field" && echo "✅ $field" || echo "❌ $field"
|
|
1719
|
+
done
|
|
1720
|
+
```
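Deciding whether a command can be executed comes down to detecting the three placeholder styles listed above. A minimal sketch, assuming those are the only styles in use; `hasPlaceholder` and `verificationNote` are hypothetical helpers, and the "Eligible for execution" label is invented for this example.

```typescript
// Matches <issue-number>, $PR_NUMBER / $pr_number, and ${VAR} style placeholders.
// Command substitution such as $(...) is intentionally not treated as a placeholder.
const PLACEHOLDER_PATTERN =
  /<[a-z][a-z0-9-]*>|\$\{[A-Za-z_][A-Za-z0-9_]*\}|\$[A-Za-z_][A-Za-z0-9_]*/i;

function hasPlaceholder(command: string): boolean {
  return PLACEHOLDER_PATTERN.test(command);
}

function verificationNote(command: string): string {
  return hasPlaceholder(command)
    ? "Syntax verified, execution skipped"
    : "Eligible for execution";
}
```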
|
|
1721
|
+
|
|
1722
|
+
#### Step 4: Command Verification Status
|
|
1723
|
+
|
|
1724
|
+
| Status | Meaning |
|
|
1725
|
+
|--------|---------|
|
|
1726
|
+
| **Passed** | All commands verified, fields exist |
|
|
1727
|
+
| **Failed** | At least one command has invalid syntax or non-existent fields |
|
|
1728
|
+
| **Skipped** | Commands have placeholders; syntax looks valid but not executed |
|
|
1729
|
+
| **Not Required** | No skill files changed |
|
|
1730
|
+
|
|
1731
|
+
#### Verdict Gating
|
|
1732
|
+
|
|
1733
|
+
**CRITICAL:** If skill command verification = **Failed**, verdict CANNOT be `READY_FOR_MERGE`.
|
|
1734
|
+
|
|
1735
|
+
| Verification Status | Maximum Verdict |
|
|
1736
|
+
|---------------------|-----------------|
|
|
1737
|
+
| Passed | READY_FOR_MERGE |
|
|
1738
|
+
| Skipped | READY_FOR_MERGE (with note about unverified placeholders) |
|
|
1739
|
+
| Failed | AC_MET_BUT_NOT_A_PLUS (blocks merge until fixed) |
|
|
1740
|
+
| Not Required | READY_FOR_MERGE |
|
|
1741
|
+
|
|
1742
|
+
**Output Format:**
|
|
1743
|
+
|
|
1744
|
+
```markdown
|
|
1745
|
+
### Skill Command Verification
|
|
1746
|
+
|
|
1747
|
+
**Skill files changed:** 2
|
|
1748
|
+
|
|
1749
|
+
| File | Commands Found | Verification Status |
|
|
1750
|
+
|------|----------------|---------------------|
|
|
1751
|
+
| `.claude/skills/qa/SKILL.md` | 5 | ✅ Passed |
|
|
1752
|
+
| `.claude/skills/exec/SKILL.md` | 3 | ⚠️ Skipped (placeholders) |
|
|
1753
|
+
|
|
1754
|
+
**Commands Verified:**
|
|
1755
|
+
- `gh pr checks --json name,state,bucket` → ✅ All fields exist
|
|
1756
|
+
- `gh issue view --json title,body` → ✅ All fields exist
|
|
1757
|
+
|
|
1758
|
+
**Commands with Issues:**
|
|
1759
|
+
- `gh pr checks --json conclusion` → ❌ Field "conclusion" does not exist
|
|
1760
|
+
|
|
1761
|
+
**Verification Status:** Failed
|
|
1762
|
+
```
|
|
237
1763
|
|
|
238
|
-
|
|
1764
|
+
---
|
|
239
1765
|
|
|
240
|
-
|
|
1766
|
+
### 6b. Smoke Test (CONDITIONAL)
|
|
241
1767
|
|
|
242
|
-
|
|
243
|
-
- Readability and maintainability
|
|
244
|
-
- Alignment with existing patterns (see CLAUDE.md)
|
|
245
|
-
- TypeScript strictness and type safety
|
|
246
|
-
- **Duplicate utility check:** Verify new utilities don't duplicate existing ones in `docs/patterns/`
|
|
1768
|
+
**When to apply:** Feature changes workflow behavior (skills, CLI commands, scripts).
|
|
247
1769
|
|
|
248
|
-
|
|
1770
|
+
**Detection:**
|
|
1771
|
+
```bash
|
|
1772
|
+
# Detect workflow-affecting changes
|
|
1773
|
+
skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/" | wc -l | xargs || true)
|
|
1774
|
+
scripts_changed=$(git diff main...HEAD --name-only | grep -E "^scripts/" | wc -l | xargs || true)
|
|
1775
|
+
cli_changed=$(git diff main...HEAD --name-only | grep -E "^(src/cli|bin)/" | wc -l | xargs || true)
|
|
249
1776
|
|
|
250
|
-
|
|
1777
|
+
if [[ $((skills_changed + scripts_changed + cli_changed)) -gt 0 ]]; then
|
|
1778
|
+
echo "Smoke test recommended for workflow changes"
|
|
1779
|
+
fi
|
|
1780
|
+
```
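The detection logic can be sketched over a list of changed paths, using the same path prefixes as the grep patterns above. `WorkflowChanges`, `countWorkflowChanges`, and `smokeTestRecommended` are hypothetical names introduced for this illustration.

```typescript
interface WorkflowChanges {
  skills: number;
  scripts: number;
  cli: number;
}

// Count changed paths per category, mirroring the grep patterns above.
function countWorkflowChanges(changedPaths: string[]): WorkflowChanges {
  const count = (pattern: RegExp): number =>
    changedPaths.filter((p) => pattern.test(p)).length;
  return {
    skills: count(/^\.claude\/skills\//),
    scripts: count(/^scripts\//),
    cli: count(/^(src\/cli|bin)\//),
  };
}

function smokeTestRecommended(changes: WorkflowChanges): boolean {
  return changes.skills + changes.scripts + changes.cli > 0;
}
```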
|
|
251
1781
|
|
|
252
|
-
|
|
253
|
-
|
|
254
|
-
|
|
255
|
-
|
|
1782
|
+
**Smoke Test Checklist:**
|
|
1783
|
+
1. **Happy path:** Execute the primary use case
|
|
1784
|
+
2. **Edge cases:** Test graceful handling (missing deps, invalid input)
|
|
1785
|
+
3. **Error detection:** Verify errors are caught and reported
|
|
256
1786
|
|
|
257
|
-
|
|
1787
|
+
**Output Format:**
|
|
258
1788
|
|
|
259
|
-
|
|
1789
|
+
| Test | Command | Result | Notes |
|
|
1790
|
+
|------|---------|--------|-------|
|
|
1791
|
+
| Happy path | `[command]` | ✅/❌ | [observation] |
|
|
1792
|
+
| Edge case | `[command]` | ✅/❌ | [observation] |
|
|
1793
|
+
| Error handling | `[command]` | ✅/❌ | [observation] |
|
|
260
1794
|
|
|
261
|
-
|
|
1795
|
+
**Smoke Test Status:**
|
|
1796
|
+
- **Complete:** All applicable tests passed
|
|
1797
|
+
- **Partial:** Some tests skipped or failed (document why)
|
|
1798
|
+
- **Not Required:** No workflow-affecting changes
|
|
262
1799
|
|
|
263
|
-
|
|
264
|
-
2. **"What assumptions am I making?"** - List and validate key assumptions
|
|
265
|
-
3. **"What's the unhappy path?"** - Test invalid inputs, failed dependencies
|
|
266
|
-
4. **"Did I test the feature's PRIMARY PURPOSE?"** - If it handles errors, trigger an error
|
|
1800
|
+
**Verdict Impact:**
|
|
267
1801
|
|
|
268
|
-
|
|
1802
|
+
| Smoke Test Status | Verdict Impact |
|
|
1803
|
+
|-------------------|----------------|
|
|
1804
|
+
| Complete | No impact (positive signal) |
|
|
1805
|
+
| Partial | → `AC_MET_BUT_NOT_A_PLUS` (document gaps) |
|
|
1806
|
+
| Not Required | No impact |
|
|
1807
|
+
|
|
1808
|
+
---
|
|
269
1809
|
|
|
270
|
-
###
|
|
1810
|
+
### 7. A+ Status Verdict
|
|
271
1811
|
|
|
272
1812
|
Provide an overall verdict:
|
|
273
1813
|
|
|
@@ -279,28 +1819,47 @@ Provide an overall verdict:
|
|
|
279
1819
|
**Verdict Determination Algorithm (REQUIRED):**
|
|
280
1820
|
|
|
281
1821
|
```text
|
|
282
|
-
1. Count AC statuses:
|
|
283
|
-
- met_count = ACs with status MET
|
|
284
|
-
- partial_count = ACs with status PARTIALLY_MET
|
|
285
|
-
- pending_count = ACs with status PENDING
|
|
286
|
-
- not_met_count = ACs with status NOT_MET
|
|
287
|
-
|
|
288
|
-
|
|
1822
|
+
1. Count AC statuses (INCLUDES both original AND derived ACs):
|
|
1823
|
+
- met_count = ACs with status MET (original + derived)
|
|
1824
|
+
- partial_count = ACs with status PARTIALLY_MET (original + derived)
|
|
1825
|
+
- pending_count = ACs with status PENDING (original + derived)
|
|
1826
|
+
- not_met_count = ACs with status NOT_MET (original + derived)
|
|
1827
|
+
|
|
1828
|
+
NOTE: Derived ACs are treated IDENTICALLY to original ACs.
|
|
1829
|
+
A derived AC marked NOT_MET will block merge just like an original AC.
|
|
1830
|
+
|
|
1831
|
+
2. Check verification gates:
|
|
1832
|
+
- skill_verification = status from Section 6a (Passed/Failed/Skipped/Not Required)
|
|
1833
|
+
- execution_evidence = status from Section 6 (Complete/Incomplete/Waived/Not Required)
|
|
1834
|
+
- quality_plan_status = status from Phase 0b (Complete/Partial/Not Addressed/N/A)
|
|
1835
|
+
- smoke_test_status = status from Section 6b (Complete/Partial/Not Required)
|
|
1836
|
+
|
|
1837
|
+
3. Browser testing enforcement check:
|
|
289
1838
|
- Check if any .tsx files were changed: git diff main...HEAD --name-only | grep '\.tsx$' || true
|
|
290
1839
|
- Check if /test phase ran: look for test phase marker in issue comments
|
|
291
1840
|
- Check if issue has 'no-browser-test' label
|
|
292
1841
|
- IF .tsx files changed AND /test did NOT run AND no 'no-browser-test' label:
|
|
293
|
-
→
|
|
294
|
-
"Browser testing recommended: .tsx files were modified but no /test phase ran.
|
|
295
|
-
Add 'ui' label to enable browser testing, or 'no-browser-test' to opt out."
|
|
1842
|
+
→ Set browser_test_missing = true
|
|
296
1843
|
|
|
297
|
-
|
|
1844
|
+
4. Determine verdict (in order):
|
|
298
1845
|
- IF not_met_count > 0 OR partial_count > 0:
|
|
299
1846
|
→ AC_NOT_MET (block merge)
|
|
1847
|
+
- ELSE IF skill_verification == "Failed":
|
|
1848
|
+
→ AC_MET_BUT_NOT_A_PLUS (skill commands have issues - cannot be READY_FOR_MERGE)
|
|
1849
|
+
- ELSE IF execution_evidence == "Incomplete":
|
|
1850
|
+
→ AC_MET_BUT_NOT_A_PLUS (scripts not verified - cannot be READY_FOR_MERGE)
|
|
1851
|
+
- ELSE IF quality_plan_status == "Not Addressed" AND quality_plan_exists:
|
|
1852
|
+
→ AC_MET_BUT_NOT_A_PLUS (quality dimensions not addressed - flag for review)
|
|
1853
|
+
- ELSE IF browser_test_missing (from step 3):
|
|
1854
|
+
→ AC_MET_BUT_NOT_A_PLUS (browser testing recommended for .tsx changes)
|
|
1855
|
+
Note: "Browser testing recommended: .tsx files modified without /test phase.
|
|
1856
|
+
Add 'ui' label to enable, or 'no-browser-test' to opt out."
|
|
300
1857
|
- ELSE IF pending_count > 0:
|
|
301
1858
|
→ NEEDS_VERIFICATION (wait for verification)
|
|
302
|
-
- ELSE IF
|
|
303
|
-
→ AC_MET_BUT_NOT_A_PLUS (
|
|
1859
|
+
- ELSE IF quality_plan_status == "Partial":
|
|
1860
|
+
→ AC_MET_BUT_NOT_A_PLUS (some quality dimensions incomplete - can merge with notes)
|
|
1861
|
+
- ELSE IF smoke_test_status == "Partial":
|
|
1862
|
+
→ AC_MET_BUT_NOT_A_PLUS (smoke tests incomplete - document gaps before merge)
|
|
304
1863
|
- ELSE IF improvement_suggestions.length > 0:
|
|
305
1864
|
→ AC_MET_BUT_NOT_A_PLUS (can merge with notes)
|
|
306
1865
|
- ELSE:
|
|
@@ -338,18 +1897,22 @@ fi
|
|
|
338
1897
|
|
|
339
1898
|
**CRITICAL:** `PARTIALLY_MET` is NOT sufficient for merge. It MUST be treated as `NOT_MET` for verdict purposes.
|
|
340
1899
|
|
|
1900
|
+
**CRITICAL:** If skill command verification = "Failed", verdict CANNOT be `READY_FOR_MERGE`. This prevents shipping skills with broken commands (like issue #178's `conclusion` field).
|
|
1901
|
+
|
|
341
1902
|
See [quality-gates.md](references/quality-gates.md) for detailed verdict criteria.
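The ordering in the verdict determination algorithm can be sketched as a chain of early returns. This is an illustration under stated assumptions, not the skill's implementation: the input shape (`VerdictInputs`) is invented here, and the final `READY_FOR_MERGE` return is an assumption, because the closing `ELSE` branch is not shown in this excerpt.

```typescript
type Verdict =
  | "READY_FOR_MERGE"
  | "AC_MET_BUT_NOT_A_PLUS"
  | "NEEDS_VERIFICATION"
  | "AC_NOT_MET";

// Hypothetical input shape; counts include both original and derived ACs.
interface VerdictInputs {
  notMetCount: number;
  partialCount: number;
  pendingCount: number;
  skillVerification: "Passed" | "Failed" | "Skipped" | "Not Required";
  executionEvidence: "Complete" | "Incomplete" | "Waived" | "Not Required";
  qualityPlanStatus: "Complete" | "Partial" | "Not Addressed" | "N/A";
  qualityPlanExists: boolean;
  smokeTestStatus: "Complete" | "Partial" | "Not Required";
  browserTestMissing: boolean;
  improvementSuggestions: string[];
}

// Checks run in the listed order; the first matching rule decides the verdict.
function determineVerdict(i: VerdictInputs): Verdict {
  if (i.notMetCount > 0 || i.partialCount > 0) return "AC_NOT_MET";
  if (i.skillVerification === "Failed") return "AC_MET_BUT_NOT_A_PLUS";
  if (i.executionEvidence === "Incomplete") return "AC_MET_BUT_NOT_A_PLUS";
  if (i.qualityPlanStatus === "Not Addressed" && i.qualityPlanExists) return "AC_MET_BUT_NOT_A_PLUS";
  if (i.browserTestMissing) return "AC_MET_BUT_NOT_A_PLUS";
  if (i.pendingCount > 0) return "NEEDS_VERIFICATION";
  if (i.qualityPlanStatus === "Partial") return "AC_MET_BUT_NOT_A_PLUS";
  if (i.smokeTestStatus === "Partial") return "AC_MET_BUT_NOT_A_PLUS";
  if (i.improvementSuggestions.length > 0) return "AC_MET_BUT_NOT_A_PLUS";
  return "READY_FOR_MERGE"; // assumption: final ELSE branch is not shown in this excerpt
}

// Sample input with every gate clean.
const cleanRun: VerdictInputs = {
  notMetCount: 0,
  partialCount: 0,
  pendingCount: 0,
  skillVerification: "Passed",
  executionEvidence: "Complete",
  qualityPlanStatus: "Complete",
  qualityPlanExists: true,
  smokeTestStatus: "Complete",
  browserTestMissing: false,
  improvementSuggestions: [],
};
```

Because the rules are ordered, a failed skill verification takes precedence over pending ACs: the verdict is `AC_MET_BUT_NOT_A_PLUS`, not `NEEDS_VERIFICATION`.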
|
|
342
1903
|
|
|
1904
|
+
---
|
|
1905
|
+
|
|
343
1906
|
## Automated Quality Checks (Reference)
|
|
344
1907
|
|
|
345
1908
|
**Note:** These commands are what the sub-agents execute internally. You do NOT run these directly — the sub-agents spawned above handle this. This section is reference documentation only.
|
|
346
1909
|
|
|
347
1910
|
```bash
|
|
348
1911
|
# Type safety
|
|
349
|
-
type_issues=$(git diff main...HEAD | grep -E ":\s*any[,)]|as any" | wc -l | xargs ||
|
|
1912
|
+
type_issues=$(git diff main...HEAD | grep -E ":\s*any[,)]|as any" | wc -l | xargs || true)
|
|
350
1913
|
|
|
351
1914
|
# Deleted tests
|
|
352
|
-
deleted_tests=$(git diff main...HEAD --diff-filter=D --name-only | grep -E "\\.test\\.|\\spec\\." | wc -l | xargs ||
|
|
1915
|
+
deleted_tests=$(git diff main...HEAD --diff-filter=D --name-only | grep -E "\\.test\\.|\\.spec\\." | wc -l | xargs || true)
|
|
353
1916
|
|
|
354
1917
|
# Scope check
|
|
355
1918
|
files_changed=$(git diff main...HEAD --name-only | wc -l | xargs)
|
|
@@ -364,7 +1927,7 @@ npx tsx scripts/lib/__tests__/run-security-scan.ts 2>/dev/null
|
|
|
364
1927
|
|
|
365
1928
|
See [scripts/quality-checks.sh](scripts/quality-checks.sh) for the complete automation script.
|
|
366
1929
|
|
|
367
|
-
###
|
|
1930
|
+
### 8. Draft Review/QA Comment
|
|
368
1931
|
|
|
369
1932
|
Produce a Markdown snippet for the PR/issue:
|
|
370
1933
|
- Short summary of the change
|
|
@@ -372,7 +1935,14 @@ Produce a Markdown snippet for the PR/issue:
|
|
|
372
1935
|
- Key strengths and issues
|
|
373
1936
|
- Clear, actionable next steps
|
|
374
1937
|
|
|
375
|
-
###
|
|
1938
|
+
### 9. Update GitHub Issue
|
|
1939
|
+
|
|
1940
|
+
**If orchestrated (SEQUANT_ORCHESTRATOR is set):**
|
|
1941
|
+
- Skip posting GitHub comment (orchestrator handles aggregated summary)
|
|
1942
|
+
- Include verdict and AC coverage in output for orchestrator to capture
|
|
1943
|
+
- Let orchestrator update labels based on final workflow status
|
|
1944
|
+
|
|
1945
|
+
**If standalone:**
|
|
376
1946
|
|
|
377
1947
|
Post the draft comment to GitHub and update labels:
|
|
378
1948
|
|
|
@@ -381,7 +1951,7 @@ Post the draft comment to GitHub and update labels:
|
|
|
381
1951
|
- `AC_MET_BUT_NOT_A_PLUS`: add `needs-improvement` label
|
|
382
1952
|
- `NEEDS_VERIFICATION`: add `needs-verification` label
|
|
383
1953
|
|
|
384
|
-
###
|
|
1954
|
+
### 10. Documentation Reminder
|
|
385
1955
|
|
|
386
1956
|
If verdict is `READY_FOR_MERGE` or `AC_MET_BUT_NOT_A_PLUS`:
|
|
387
1957
|
|
|
@@ -389,13 +1959,84 @@ If verdict is `READY_FOR_MERGE` or `AC_MET_BUT_NOT_A_PLUS`:
|
|
|
389
1959
|
**Documentation:** Before merging, run `/docs <issue>` to generate feature documentation.
|
|
390
1960
|
```
|
|
391
1961
|
|
|
392
|
-
###
|
|
1962
|
+
### 10a. CHANGELOG Quality Gate (REQUIRED)
|
|
1963
|
+
|
|
1964
|
+
**Purpose:** Verify user-facing changes have corresponding CHANGELOG entries before `READY_FOR_MERGE`.
|
|
1965
|
+
|
|
1966
|
+
**Detection:**
|
|
1967
|
+
|
|
1968
|
+
```bash
|
|
1969
|
+
# Check if CHANGELOG.md exists
|
|
1970
|
+
if [ ! -f "CHANGELOG.md" ]; then
|
|
1971
|
+
  echo "No CHANGELOG.md found - skip CHANGELOG check"
|
|
1972
|
+
  exit 0
|
|
1973
|
+
fi
|
|
1974
|
+
|
|
1975
|
+
# Check if [Unreleased] section has entries
|
|
1976
|
+
unreleased_entries=$(sed -n '/^## \[Unreleased\]/,/^## \[/p' CHANGELOG.md | grep -E '^\s*-' | wc -l | xargs || true)
|
|
1977
|
+
|
|
1978
|
+
# Determine if change is user-facing (new features, bug fixes, etc.)
|
|
1979
|
+
# Approximated here from conventional-commit prefixes in commit messages
|
|
1980
|
+
user_facing=$(git log main..HEAD --oneline | grep -iE '^[a-f0-9]+ (feat|fix|perf|refactor|docs):' | wc -l | xargs || true)
|
|
1981
|
+
```
|
|
1982
|
+
|
|
1983
|
+
**Verification Logic:**
|
|
1984
|
+
|
|
1985
|
+
| Condition | CHANGELOG Entry Required? | Action |
|
|
1986
|
+
|-----------|---------------------------|--------|
|
|
1987
|
+
| User-facing changes detected + CHANGELOG exists | ✅ Yes | Check for `[Unreleased]` entry |
|
|
1988
|
+
| User-facing changes + no entry | ⚠️ Block | Flag as missing CHANGELOG |
|
|
1989
|
+
| Non-user-facing changes (test, ci, chore) | ❌ No | Skip check |
|
|
1990
|
+
| No CHANGELOG.md in repo | ❌ No | Skip check |
|
|
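The table above can be read as a short decision function. The sketch below is illustrative only: `changelog_gate` and its outcome strings (`skip`, `block`, `pass`) are hypothetical names, and the two counts are assumed to come from the `user_facing` and `unreleased_entries` variables computed in the detection step.

```bash
# Hypothetical helper (not part of sequant): prints the gate outcome.
# $1: "yes" if CHANGELOG.md exists, anything else otherwise
# $2: number of user-facing commits (user_facing)
# $3: number of [Unreleased] entries (unreleased_entries)
changelog_gate() {
  if [ "$1" != "yes" ]; then
    echo "skip"    # No CHANGELOG.md in repo
  elif [ "$2" -eq 0 ]; then
    echo "skip"    # Non-user-facing changes only (test, ci, chore)
  elif [ "$3" -eq 0 ]; then
    echo "block"   # User-facing changes but no [Unreleased] entry
  else
    echo "pass"    # User-facing changes with an [Unreleased] entry
  fi
}
```

A `block` outcome corresponds to the missing-entry case, where the verdict cannot be `READY_FOR_MERGE`.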
1991
|
+
|
|
1992
|
+
**If CHANGELOG entry is missing:**
|
|
1993
|
+
|
|
1994
|
+
1. Do NOT give `READY_FOR_MERGE` verdict
|
|
1995
|
+
2. Set verdict to `AC_MET_BUT_NOT_A_PLUS` with note:
|
|
1996
|
+
```markdown
|
|
1997
|
+
**CHANGELOG:** Missing entry for user-facing changes. Add entry to `## [Unreleased]` section before merging.
|
|
1998
|
+
```
|
|
1999
|
+
3. Include this in the draft review comment
|
|
2000
|
+
|
|
2001
|
+
**CHANGELOG Entry Validation:**
|
|
2002
|
+
|
|
2003
|
+
When an entry exists, verify it follows the format:
|
|
2004
|
+
- Starts with action verb (Add, Fix, Update, Remove, Improve)
|
|
2005
|
+
- Includes issue number `(#123)`
|
|
2006
|
+
- Is under the correct section (Added, Fixed, Changed, etc.)
|
|
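The first two format rules above lend themselves to a mechanical check. The following is a minimal sketch under stated assumptions: `valid_changelog_entry` is a hypothetical helper, it requires the verb to be followed by whitespace, and it does not attempt the third rule (correct section), which needs the surrounding headings rather than a single line.

```bash
# Hypothetical helper (not part of sequant): checks one CHANGELOG bullet
# against the verb and issue-number rules. Returns 0 if both hold.
valid_changelog_entry() {
  # Strip indentation and the leading "-" bullet marker.
  entry=$(printf '%s\n' "$1" | sed -E 's/^[[:space:]]*-[[:space:]]*//')

  # Rule 1: starts with one of the listed action verbs.
  printf '%s\n' "$entry" |
    grep -qE '^(Add|Fix|Update|Remove|Improve)[[:space:]]' || return 1

  # Rule 2: contains an issue reference such as (#123).
  printf '%s\n' "$entry" | grep -qE '\(#[0-9]+\)' || return 1
}
```

With this sketch, an entry such as `- Add CHANGELOG quality gate (#123)` passes, while an entry without an issue reference is rejected.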
2007
|
+
|
|
2008
|
+
**Example validation:**
|
|
2009
|
+
|
|
2010
|
+
```markdown
|
|
2011
|
+
### CHANGELOG Verification
|
|
2012
|
+
|
|
2013
|
+
| Check | Status |
|
|
2014
|
+
|-------|--------|
|
|
2015
|
+
| CHANGELOG.md exists | ✅ Found |
|
|
2016
|
+
| User-facing changes | ✅ Yes (feat: commit detected) |
|
|
2017
|
+
| [Unreleased] entry | ✅ Present |
|
|
2018
|
+
| Entry format | ✅ Valid (includes issue number) |
|
|
2019
|
+
|
|
2020
|
+
**Result:** CHANGELOG requirements met
|
|
2021
|
+
```
|
|
2022
|
+
|
|
2023
|
+
**If CHANGELOG is not required:**
|
|
2024
|
+
|
|
2025
|
+
```markdown
|
|
2026
|
+
### CHANGELOG Verification
|
|
2027
|
+
|
|
2028
|
+
**Result:** N/A (non-user-facing changes only)
|
|
2029
|
+
```
|
|
2030
|
+
|
|
2031
|
+
---
|
|
2032
|
+
|
|
2033
|
+
### 11. Script/CLI Execution Verification
|
|
393
2034
|
|
|
394
2035
|
**REQUIRED for CLI/script features:** When `scripts/` files are modified, execution verification is required before `READY_FOR_MERGE`.
|
|
395
2036
|
|
|
396
2037
|
**Detection:**
|
|
397
2038
|
```bash
|
|
398
|
-
scripts_changed=$(git diff main...HEAD --name-only | grep "^scripts/" | wc -l | xargs ||
|
|
2039
|
+
scripts_changed=$(git diff main...HEAD --name-only | grep -E "^(scripts/|templates/scripts/)" | wc -l | xargs || true)
|
|
399
2040
|
if [[ $scripts_changed -gt 0 ]]; then
|
|
400
2041
|
  echo "Script changes detected. Run /verify before READY_FOR_MERGE"
|
|
401
2042
|
fi
|
|
@@ -409,7 +2050,7 @@ fi
|
|
|
409
2050
|
|
|
410
2051
|
**If no verification evidence exists:**
|
|
411
2052
|
1. Prompt: "Script changes detected but no execution verification found. Run `/verify <issue> --command \"<test command>\"` before READY_FOR_MERGE verdict."
|
|
412
|
-
2. Do NOT give `READY_FOR_MERGE` verdict until verification is complete (unless an approved override applies — see Section
|
|
2053
|
+
2. Do NOT give `READY_FOR_MERGE` verdict until verification is complete (unless an approved override applies — see Section 11a)
|
|
413
2054
|
3. Verdict should be `AC_MET_BUT_NOT_A_PLUS` with note about missing verification
|
|
414
2055
|
|
|
415
2056
|
**Why this matters:**
|
|
@@ -418,7 +2059,7 @@ fi
|
|
|
418
2059
|
|
|
419
2060
|
**Example workflow:**
|
|
420
2061
|
```bash
|
|
421
|
-
# QA detects scripts/ changes
|
|
2062
|
+
# QA detects scripts/ or templates/scripts/ changes
|
|
422
2063
|
# -> Prompt: "Run /verify before READY_FOR_MERGE"
|
|
423
2064
|
|
|
424
2065
|
/verify 558 --command "npx tsx scripts/migrate.ts --dry-run"
|
|
@@ -429,7 +2070,7 @@ fi
|
|
|
429
2070
|
/qa 558 # Re-run, now sees verification, can give READY_FOR_MERGE
|
|
430
2071
|
```
|
|
431
2072
|
|
|
432
|
-
###
|
|
2073
|
+
### 11a. Script Verification Override
|
|
433
2074
|
|
|
434
2075
|
In some cases, `/verify` execution can be safely skipped when script changes are purely cosmetic or have no runtime impact. **Overrides require explicit justification and risk assessment.**
|
|
435
2076
|
|
|
@@ -484,33 +2125,123 @@ In some cases, `/verify` execution can be safely skipped when script changes are
|
|
|
484
2125
|
|
|
485
2126
|
---
|
|
486
2127
|
|
|
2128
|
+
## State Tracking
|
|
2129
|
+
|
|
2130
|
+
**IMPORTANT:** Update workflow state when running standalone (not orchestrated).
|
|
2131
|
+
|
|
2132
|
+
### State Updates (Standalone Only)
|
|
2133
|
+
|
|
2134
|
+
When NOT orchestrated (`SEQUANT_ORCHESTRATOR` is not set):
|
|
2135
|
+
|
|
2136
|
+
**At skill start:**
|
|
2137
|
+
```bash
|
|
2138
|
+
npx tsx scripts/state/update.ts start <issue-number> qa
|
|
2139
|
+
```
|
|
2140
|
+
|
|
2141
|
+
**On successful completion (READY_FOR_MERGE or AC_MET_BUT_NOT_A_PLUS):**
|
|
2142
|
+
```bash
|
|
2143
|
+
npx tsx scripts/state/update.ts complete <issue-number> qa
|
|
2144
|
+
npx tsx scripts/state/update.ts status <issue-number> ready_for_merge
|
|
2145
|
+
```
|
|
2146
|
+
|
|
2147
|
+
**On failure (AC_NOT_MET):**
|
|
2148
|
+
```bash
|
|
2149
|
+
npx tsx scripts/state/update.ts fail <issue-number> qa "AC not met"
|
|
2150
|
+
```
|
|
2151
|
+
|
|
2152
|
+
**Why this matters:** State tracking enables dashboard visibility, resume capability, and workflow orchestration. Skills update state when standalone; orchestrators handle state when running workflows.
|
|
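The standalone-versus-orchestrated split above can be sketched as a dry-run helper. This is an assumption-laden illustration: `qa_state_commands` is a hypothetical name, it only prints the `scripts/state/update.ts` commands listed above instead of running them, and it prints nothing for `NEEDS_VERIFICATION` because this section does not specify a state update for that verdict.

```bash
# Hypothetical helper (not part of sequant): prints the state-update
# commands for a finished QA run without executing them.
# $1: issue number, $2: verdict
qa_state_commands() {
  # Orchestrated runs leave state handling to the orchestrator.
  if [ -n "${SEQUANT_ORCHESTRATOR:-}" ]; then
    return 0
  fi

  case "$2" in
    READY_FOR_MERGE|AC_MET_BUT_NOT_A_PLUS)
      echo "npx tsx scripts/state/update.ts complete $1 qa"
      echo "npx tsx scripts/state/update.ts status $1 ready_for_merge"
      ;;
    AC_NOT_MET)
      echo "npx tsx scripts/state/update.ts fail $1 qa \"AC not met\""
      ;;
  esac
}
```

Printing the commands rather than running them keeps the sketch safe to try in any repository, including one without `scripts/state/update.ts`.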
2153
|
+
|
|
2154
|
+
---
|
|
2155
|
+
|
|
487
2156
|
## Output Verification
|
|
488
2157
|
|
|
489
2158
|
**Before responding, verify your output includes ALL of these:**
|
|
490
2159
|
|
|
491
|
-
|
|
492
|
-
|
|
2160
|
+
### Simple Fix Mode (`SMALL_DIFF=true`)
|
|
2161
|
+
|
|
2162
|
+
When the size gate determined `SMALL_DIFF=true`, use the **simplified output template**. The following sections are **omitted** (not marked N/A — completely absent):
|
|
2163
|
+
|
|
2164
|
+
- Quality Plan Verification
|
|
2165
|
+
- Incremental QA Summary
|
|
2166
|
+
- Call-Site Review
|
|
2167
|
+
- Product Review
|
|
2168
|
+
- Smoke Test
|
|
2169
|
+
- CLI Registration Verification
|
|
2170
|
+
- Skill Command Verification
|
|
2171
|
+
- Script Verification Override
|
|
2172
|
+
- Skill Change Review
|
|
2173
|
+
|
|
2174
|
+
**Required sections for simple fix mode:**
|
|
2175
|
+
|
|
2176
|
+
- [ ] **Size Gate** - Size gate decision table with threshold, diff size, and decision
|
|
2177
|
+
- [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, NOT_MET, PENDING, or N/A
|
|
2178
|
+
- [ ] **Quality Metrics** - Type issues, deleted tests, files changed, additions/deletions (from inline checks)
|
|
2179
|
+
- [ ] **Code Review Findings** - Strengths, issues, suggestions
|
|
2180
|
+
- [ ] **Test Coverage Analysis** - Changed files with/without tests, critical paths flagged
|
|
2181
|
+
- [ ] **Anti-Pattern Detection** - Code patterns check (lightweight)
|
|
2182
|
+
- [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included
|
|
2183
|
+
- [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, NEEDS_VERIFICATION, AC_NOT_MET
|
|
2184
|
+
- [ ] **Documentation Check** - README/docs updated if feature adds new functionality
|
|
2185
|
+
- [ ] **Next Steps** - Clear, actionable recommendations
|
|
2186
|
+
|
|
2187
|
+
### Standard QA (Implementation Exists, `SMALL_DIFF=false`)
|
|
2188
|
+
|
|
2189
|
+
- [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included in output
|
|
2190
|
+
- [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, NOT_MET, PENDING, or N/A
|
|
2191
|
+
- [ ] **Quality Plan Verification** - Included if quality plan exists (or marked N/A if no quality plan)
|
|
2192
|
+
- [ ] **CI Status** - Included if PR exists (or marked "No PR" / "No CI configured")
|
|
2193
|
+
- [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, NEEDS_VERIFICATION, AC_NOT_MET
|
|
493
2194
|
- [ ] **Quality Metrics** - Type issues, deleted tests, files changed, additions/deletions
|
|
2195
|
+
- [ ] **Cache Status** - Included if caching enabled (or marked N/A if --no-cache)
|
|
2196
|
+
- [ ] **Build Verification** - Included if build failed (or marked N/A if build passed)
|
|
2197
|
+
- [ ] **Test Coverage Analysis** - Changed files with/without tests, critical paths flagged
|
|
494
2198
|
- [ ] **Code Review Findings** - Strengths, issues, suggestions
|
|
2199
|
+
- [ ] **Test Quality Review** - Included if test files modified (or marked N/A)
|
|
2200
|
+
- [ ] **Anti-Pattern Detection** - Dependency audit (if package.json changed) + code patterns
|
|
2201
|
+
- [ ] **Call-Site Review** - Included if new exported functions detected (or marked N/A)
|
|
2202
|
+
- [ ] **Execution Evidence** - Included if scripts/CLI modified (or marked N/A)
|
|
495
2203
|
- [ ] **Script Verification Override** - Included if scripts/CLI modified AND /verify was skipped (with justification and risk assessment)
|
|
2204
|
+
- [ ] **Skill Command Verification** - Included if `.claude/skills/**/*.md` modified (or marked N/A)
|
|
2205
|
+
- [ ] **Skill Change Review** - Skill-specific adversarial prompts included if skills changed
|
|
2206
|
+
- [ ] **Smoke Test** - Included if workflow-affecting changes (skills, scripts, CLI), or marked "Not Required"
|
|
2207
|
+
- [ ] **CHANGELOG Verification** - User-facing changes have `[Unreleased]` entry (or marked N/A)
|
|
496
2208
|
- [ ] **Documentation Check** - README/docs updated if feature adds new functionality
|
|
497
2209
|
- [ ] **Next Steps** - Clear, actionable recommendations
|
|
498
2210
|
|
|
499
|
-
|
|
2211
|
+
### Early Exit (No Implementation)
|
|
2212
|
+
|
|
2213
|
+
When early exit is triggered (no commits, no uncommitted changes, no PR):
|
|
2214
|
+
|
|
2215
|
+
- [ ] **Implementation Status** - Clearly states "NOT FOUND"
|
|
2216
|
+
- [ ] **Verdict** - Must be `AC_NOT_MET`
|
|
2217
|
+
- [ ] **Next Steps** - Directs user to run `/exec` first
|
|
2218
|
+
- [ ] **Sub-agents NOT spawned** - Quality check agents were skipped
|
|
2219
|
+
|
|
2220
|
+
**DO NOT respond until all applicable items are verified.**
|
|
500
2221
|
|
|
501
2222
|
## Output Template
|
|
502
2223
|
|
|
503
|
-
|
|
2224
|
+
### Simple Fix Template (`SMALL_DIFF=true`)
|
|
2225
|
+
|
|
2226
|
+
When the size gate triggers simple fix mode, use this shorter template:
|
|
504
2227
|
|
|
505
2228
|
```markdown
|
|
506
|
-
## QA Review for Issue #<N>
|
|
2229
|
+
## QA Review for Issue #<N> (Simple Fix)
|
|
2230
|
+
|
|
2231
|
+
### Size Gate
|
|
2232
|
+
|
|
2233
|
+
| Check | Value |
|
|
2234
|
+
|-------|-------|
|
|
2235
|
+
| Diff size | N lines (threshold: T) |
|
|
2236
|
+
| package.json changed | No |
|
|
2237
|
+
| Security-sensitive paths | No |
|
|
2238
|
+
| Decision | **Inline checks** |
|
|
507
2239
|
|
|
508
2240
|
### AC Coverage
|
|
509
2241
|
|
|
510
2242
|
| AC | Description | Status | Notes |
|
|
511
2243
|
|----|-------------|--------|-------|
|
|
512
|
-
| AC-1 | [description] | MET/
|
|
513
|
-
| AC-2 | [description] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
|
|
2244
|
+
| AC-1 | [description] | MET/NOT_MET | [explanation] |
|
|
514
2245
|
|
|
515
2246
|
**Coverage:** X/Y AC items fully met
|
|
516
2247
|
|
|
@@ -525,6 +2256,191 @@ You MUST include these sections:
|
|
|
525
2256
|
| Files changed | X | OK/WARN |
|
|
526
2257
|
| Lines added | +X | - |
|
|
527
2258
|
| Lines deleted | -X | - |
|
|
2259
|
+
| Security patterns | X | OK/WARN |
|
|
2260
|
+
|
|
2261
|
+
---
|
|
2262
|
+
|
|
2263
|
+
### Code Review
|
|
2264
|
+
|
|
2265
|
+
**Strengths:**
|
|
2266
|
+
- [Positive findings]
|
|
2267
|
+
|
|
2268
|
+
**Issues:**
|
|
2269
|
+
- [Problems found]
|
|
2270
|
+
|
|
2271
|
+
**Suggestions:**
|
|
2272
|
+
- [Improvements recommended]
|
|
2273
|
+
|
|
2274
|
+
---
|
|
2275
|
+
|
|
2276
|
+
### Test Coverage Analysis
|
|
2277
|
+
|
|
2278
|
+
| Changed File | Tier | Has Tests? | Test File |
|
|
2279
|
+
|--------------|------|------------|-----------|
|
|
2280
|
+
| `[file]` | Critical/Standard/Optional | Yes/No | `[test file or -]` |
|
|
2281
|
+
|
|
2282
|
+
**Coverage:** X/Y changed source files have corresponding tests
|
|
2283
|
+
|
|
2284
|
+
---
|
|
2285
|
+
|
|
2286
|
+
### Anti-Pattern Detection
|
|
2287
|
+
|
|
2288
|
+
| File:Line | Category | Pattern | Suggestion |
|
|
2289
|
+
|-----------|----------|---------|------------|
|
|
2290
|
+
| [location] | [category] | [pattern] | [fix] |
|
|
2291
|
+
|
|
2292
|
+
---
|
|
2293
|
+
|
|
2294
|
+
### Self-Evaluation
|
|
2295
|
+
|
|
2296
|
+
- **Verified working:** [Yes/No]
|
|
2297
|
+
- **Test efficacy:** [High/Medium/Low]
|
|
2298
|
+
- **Likely failure mode:** [description]
|
|
2299
|
+
- **Verdict confidence:** [High/Medium/Low]
|
|
2300
|
+
|
|
2301
|
+
---
|
|
2302
|
+
|
|
2303
|
+
### Verdict: [READY_FOR_MERGE | AC_MET_BUT_NOT_A_PLUS | NEEDS_VERIFICATION | AC_NOT_MET]
|
|
2304
|
+
|
|
2305
|
+
[Explanation of verdict]
|
|
2306
|
+
|
|
2307
|
+
### Documentation
|
|
2308
|
+
|
|
2309
|
+
- [ ] N/A - Simple fix, no documentation needed
|
|
2310
|
+
- [ ] README/docs updated
|
|
2311
|
+
|
|
2312
|
+
### Next Steps
|
|
2313
|
+
|
|
2314
|
+
1. [Action item]
|
|
2315
|
+
```
|
|
2316
|
+
|
|
2317
|
+
---
|
|
2318
|
+
|
|
2319
|
+
### Standard Template (`SMALL_DIFF=false`)
|
|
2320
|
+
|
|
2321
|
+
You MUST include these sections:
|
|
2322
|
+
|
|
2323
|
+
```markdown
|
|
2324
|
+
## QA Review for Issue #<N>
|
|
2325
|
+
|
|
2326
|
+
### AC Coverage
|
|
2327
|
+
|
|
2328
|
+
| AC | Source | Description | Status | Notes |
|
|
2329
|
+
|----|--------|-------------|--------|-------|
|
|
2330
|
+
| AC-1 | Original | [description] | MET/PARTIALLY_MET/NOT_MET/PENDING/N/A | [explanation] |
|
|
2331
|
+
| AC-2 | Original | [description] | MET/PARTIALLY_MET/NOT_MET/PENDING/N/A | [explanation] |
|
|
2332
|
+
| **Derived ACs** | | | | |
|
|
2333
|
+
| AC-6 | Derived (Error Handling) | [description from quality plan] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
|
|
2334
|
+
| AC-7 | Derived (Test Coverage) | [description from quality plan] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
|
|
2335
|
+
|
|
2336
|
+
**Coverage:** X/Y AC items fully met (includes derived ACs)
|
|
2337
|
+
**Original ACs:** X/Y met
|
|
2338
|
+
**Derived ACs:** X/Y met
|
|
2339
|
+
|
|
2340
|
+
---
|
|
2341
|
+
|
|
2342
|
+
### Quality Plan Verification
|
|
2343
|
+
|
|
2344
|
+
[Include if quality plan exists in issue comments, otherwise: "N/A - No quality plan found"]
|
|
2345
|
+
|
|
2346
|
+
| Dimension | Items Planned | Items Addressed | Status |
|
|
2347
|
+
|-----------|---------------|-----------------|--------|
|
|
2348
|
+
| Completeness | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
|
|
2349
|
+
| Error Handling | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
|
|
2350
|
+
| Code Quality | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
|
|
2351
|
+
| Test Coverage | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
|
|
2352
|
+
| Best Practices | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
|
|
2353
|
+
| Polish | X | X | ✅ Complete / ⚠️ Partial / N/A (not UI) |
|
|
2354
|
+
|
|
2355
|
+
**Derived ACs:** X/Y addressed
|
|
2356
|
+
**Quality Plan Status:** Complete / Partial / Not Addressed
|
|
2357
|
+
|
|
2358
|
+
---
|
|
2359
|
+
|
|
2360
|
+
### Incremental QA Summary
|
|
2361
|
+
|
|
2362
|
+
[Include if INCREMENTAL_MODE=true from Phase 0c, otherwise: "N/A - First QA run"]
|
|
2363
|
+
|
|
2364
|
+
**Last QA:** <timestamp> (commit: <sha-short>)
|
|
2365
|
+
**Changes since last QA:** N files
|
|
2366
|
+
|
|
2367
|
+
| Check / AC | Status | Re-run? | Reason |
|
|
2368
|
+
|------------|--------|---------|--------|
|
|
2369
|
+
| [check/AC] | [status] | Cached / Re-run / Re-evaluated | [reason] |
|
|
2370
|
+
|
|
2371
|
+
**Summary:** X checks cached, Y re-evaluated, Z always-fresh
|
|
2372
|
+
|
|
2373
|
+
---
|
|
2374
|
+
|
|
2375
|
+
### CI Status
|
|
2376
|
+
|
|
2377
|
+
[Include if PR exists, otherwise: "No PR exists yet" or "No CI configured"]
|
|
2378
|
+
|
|
2379
|
+
| Check | State | Conclusion | Impact |
|
|
2380
|
+
|-------|-------|------------|--------|
|
|
2381
|
+
| `[check name]` | completed/in_progress/queued/pending | success/failure/cancelled/skipped/- | ✅ MET / ❌ NOT_MET / ⏳ PENDING |
|
|
2382
|
+
|
|
2383
|
+
**CI Summary:** X passed, Y pending, Z failed
|
|
2384
|
+
**CI-related AC items:** [list affected AC items and their status based on CI]
|
|
2385
|
+
|
|
2386
|
+
---
|
|
2387
|
+
|
|
2388
|
+
### Quality Metrics
|
|
2389
|
+
|
|
2390
|
+
| Metric | Value | Status |
|
|
2391
|
+
|--------|-------|--------|
|
|
2392
|
+
| Type issues (`any`) | X | OK/WARN |
|
|
2393
|
+
| Deleted tests | X | OK/WARN |
|
|
2394
|
+
| Files changed | X | OK/WARN |
|
|
2395
|
+
| Lines added | +X | - |
|
|
2396
|
+
| Lines deleted | -X | - |
|
|
2397
|
+
|
|
2398
|
+
---
|
|
2399
|
+
|
|
2400
|
+
### Cache Status
|
|
2401
|
+
|
|
2402
|
+
[Include if caching enabled, otherwise: "N/A - Caching disabled (--no-cache)"]
|
|
2403
|
+
|
|
2404
|
+
| Check | Cache Status |
|
|
2405
|
+
|-------|--------------|
|
|
2406
|
+
| type-safety | ✅ HIT / ❌ MISS / ⏭️ SKIP |
|
|
2407
|
+
| deleted-tests | ✅ HIT / ❌ MISS / ⏭️ SKIP |
|
|
2408
|
+
| scope | ⏭️ SKIP (always fresh) |
|
|
2409
|
+
| size | ⏭️ SKIP (always fresh) |
|
|
2410
|
+
| security | ✅ HIT / ❌ MISS / ⏭️ SKIP |
|
|
2411
|
+
| semgrep | ✅ HIT / ❌ MISS / ⏭️ SKIP |
|
|
2412
|
+
| build | ✅ HIT / ❌ MISS / ⏭️ SKIP |
|
|
2413
|
+
|
|
2414
|
+
**Summary:** X hits, Y misses, Z skipped
|
|
2415
|
+
**Performance:** [Note if cached checks saved time]
|
|
2416
|
+
|
|
2417
|
+
---
|
|
2418
|
+
|
|
2419
|
+
### Build Verification
|
|
2420
|
+
|
|
2421
|
+
[Include if `npm run build` failed, otherwise: "N/A - Build passed"]
|
|
2422
|
+
|
|
2423
|
+
| Check | Status |
|
|
2424
|
+
|-------|--------|
|
|
2425
|
+
| Feature branch build | ✅ Passed / ❌ Failed |
|
|
2426
|
+
| Main branch build | ✅ Passed / ❌ Failed |
|
|
2427
|
+
| Error match | ✅ Same error / ❌ Different errors / N/A |
|
|
2428
|
+
| Regression | **Yes** (new) / **No** (pre-existing) / **Unknown** |
|
|
2429
|
+
|
|
2430
|
+
**Note:** [Explanation of build verification result]
|
|
2431
|
+
|
|
2432
|
+
**Verdict impact:** [None / Blocking / Needs review]
|
|
2433
|
+
|
|
2434
|
+
---
|
|
2435
|
+
|
|
2436
|
+
### Test Coverage Analysis
|
|
2437
|
+
|
|
2438
|
+
| Changed File | Tier | Has Tests? | Test File |
|
|
2439
|
+
|--------------|------|------------|-----------|
|
|
2440
|
+
| `[file]` | Critical/Standard/Optional | ✅ Yes / ⚠️ No | `[test file or -]` |
|
|
2441
|
+
|
|
2442
|
+
**Coverage:** X/Y changed source files have corresponding tests
|
|
2443
|
+
**Critical paths without tests:** [list or "None"]
|
|
528
2444
|
|
|
529
2445
|
---
|
|
530
2446
|
|
|
@@ -541,18 +2457,155 @@ You MUST include these sections:
|
|
|
541
2457
|
|
|
542
2458
|
---
|
|
543
2459
|
|
|
2460
|
+
### Test Quality Review
|
|
2461
|
+
|
|
2462
|
+
[Include if test files were added/modified, otherwise: "N/A - No test files modified"]
|
|
2463
|
+
|
|
2464
|
+
| Category | Status | Notes |
|
|
2465
|
+
|----------|--------|-------|
|
|
2466
|
+
| Behavior vs Implementation | ✅ OK / ⚠️ WARN | [notes] |
|
|
2467
|
+
| Coverage Depth | ✅ OK / ⚠️ WARN | [notes] |
|
|
2468
|
+
| Mock Hygiene | ✅ OK / ⚠️ WARN | [notes] |
|
|
2469
|
+
| Test Reliability | ✅ OK / ⚠️ WARN | [notes] |
|
|
2470
|
+
|
|
2471
|
+
**Issues Found:**
|
|
2472
|
+
- [file:line - description]
|
|
2473
|
+
|
|
2474
|
+
---
|
|
2475
|
+
|
|
2476
|
+
### Anti-Pattern Detection
|
|
2477
|
+
|
|
2478
|
+
#### Dependency Audit
|
|
2479
|
+
[Include if package.json modified, otherwise: "N/A - No dependency changes"]
|
|
2480
|
+
|
|
2481
|
+
| Package | Downloads/wk | Last Update | Flags |
|
|
2482
|
+
|---------|--------------|-------------|-------|
|
|
2483
|
+
| [pkg] | [count] | [date] | [flags] |
|
|
2484
|
+
|
|
2485
|
+
#### Code Patterns
|
|
2486
|
+
|
|
2487
|
+
| File:Line | Category | Pattern | Suggestion |
|
|
2488
|
+
|-----------|----------|---------|------------|
|
|
2489
|
+
| [location] | [category] | [pattern] | [fix] |
|
|
2490
|
+
|
|
2491
|
+
**Critical Issues:** X
|
|
2492
|
+
**Warnings:** Y
|
|
2493
|
+
|
|
2494
|
+
---
|
|
2495
|
+
|
|
2496
|
+
### Call-Site Review
|
|
2497
|
+
|
|
2498
|
+
[Include if new exported functions detected, otherwise: "N/A - No new exported functions"]
|
|
2499
|
+
|
|
2500
|
+
**New exported functions detected:** N
|
|
2501
|
+
|
|
2502
|
+
| Function | Call Sites | Loop? | Conditions | AC Match |
|
|
2503
|
+
|----------|-----------|-------|------------|----------|
|
|
2504
|
+
| `[function]` | `[file:line]` | Yes/No | `[condition]` | ✅ Matches AC-N / ⚠️ [issue] |
|
|
2505
|
+
|
|
2506
|
+
**Findings:**
|
|
2507
|
+
- [List any mismatches between call-site conditions and AC constraints]
|
|
2508
|
+
|
|
2509
|
+
**Recommendations:**
|
|
2510
|
+
- [Specific fixes needed at call sites]
|
|
2511
|
+
|
|
2512
|
+
---
|
|
2513
|
+
|
|
2514
|
+
### Execution Evidence
|
|
2515
|
+
|
|
2516
|
+
[Include if scripts/CLI modified, otherwise: "N/A - No executable changes"]
|
|
2517
|
+
|
|
2518
|
+
| Test Type | Command | Exit Code | Result |
|
|
2519
|
+
|-----------|---------|-----------|--------|
|
|
2520
|
+
| Smoke test | `[command]` | [code] | [result] |
|
|
2521
|
+
|
|
2522
|
+
**Evidence status:** Complete / Incomplete / Waived (reason) / Not Required
|
|
2523
|
+
|
|
2524
|
+
---
|
|
2525
|
+
|
|
544
2526
|
### Script Verification Override
|
|
545
2527
|
|
|
546
2528
|
[Include if scripts/CLI modified AND /verify was skipped, otherwise omit this section]
|
|
547
2529
|
|
|
548
2530
|
**Requirement:** `/verify` before READY_FOR_MERGE
|
|
549
2531
|
**Override:** Yes
|
|
550
|
-
**Justification:** [Approved category from Section
|
|
2532
|
+
**Justification:** [Approved category from Section 11a]
|
|
551
2533
|
**Risk Assessment:** [None/Low/Medium]
|
|
552
2534
|
|
|
553
2535
|
---
|
|
554
2536
|
|
|
555
|
-
###
|
|
2537
|
+
### Skill Command Verification
|
|
2538
|
+
|
|
2539
|
+
[Include if `.claude/skills/**/*.md` modified, otherwise: "N/A - No skill files changed"]
|
|
2540
|
+
|
|
2541
|
+
**Skill files changed:** X
|
|
2542
|
+
|
|
2543
|
+
| File | Commands Found | Verification Status |
|
|
2544
|
+
|------|----------------|---------------------|
|
|
2545
|
+
| `[skill file]` | [count] | ✅ Passed / ❌ Failed / ⚠️ Skipped |
|
|
2546
|
+
|
|
2547
|
+
**Commands Verified:**
|
|
2548
|
+
- `[command]` → ✅ [result]
|
|
2549
|
+
|
|
2550
|
+
**Commands with Issues:**
|
|
2551
|
+
- `[command]` → ❌ [issue description]
|
|
2552
|
+
|
|
2553
|
+
**Verification Status:** Passed / Failed / Skipped / Not Required
|
|
2554
|
+
|
|
2555
|
+
---
|
|
2556
|
+
|
|
2557
|
+
### CLI Registration Verification
|
|
2558
|
+
|
|
2559
|
+
[Include if option interfaces or CLI file modified, otherwise: "N/A - No option interface changes"]
|
|
2560
|
+
|
|
2561
|
+
**Option files modified:** Yes/No
|
|
2562
|
+
|
|
2563
|
+
| Interface Field | Runtime Usage | CLI Registered | Status |
|
|
2564
|
+
|----------------|--------------|----------------|--------|
|
|
2565
|
+
| `[field]` | `[usage location]` | `--[flag]` in bin/cli.ts / NOT REGISTERED | ✅ OK / ❌ FAIL / ⏭️ SKIP |
|
|
2566
|
+
|
|
2567
|
+
**Verification Status:** Passed / Failed / N/A
|
|
2568
|
+
|
|
2569
|
+
**Remediation (if failed):**
|
|
2570
|
+
- Add `.option("--field-name", "description")` to bin/cli.ts
|
|
2571
|
+
|
|
2572
|
+
---
|
|
2573
|
+
|
|
2574
|
+
### Skill Change Review
|
|
2575
|
+
|
|
2576
|
+
[Include if skill files changed, otherwise omit]
|
|
2577
|
+
|
|
2578
|
+
- [ ] **Command verified:** Did you execute at least one referenced command?
|
|
2579
|
+
- [ ] **Fields verified:** For JSON commands, do field names match actual output?
|
|
2580
|
+
- [ ] **Patterns complete:** What variations might users write that aren't covered?
|
|
2581
|
+
- [ ] **Dependencies explicit:** What CLIs/tools does this skill assume are installed?
|
|
2582
|
+
|
|
2583
|
+
---
|
|
2584
|
+
|
|
2585
|
+
### Smoke Test
|
|
2586
|
+
|
|
2587
|
+
[Include if workflow-affecting changes (skills, scripts, CLI), otherwise: "Not Required - No workflow-affecting changes"]
|
|
2588
|
+
|
|
2589
|
+
| Test | Command | Result | Notes |
|
|
2590
|
+
|------|---------|--------|-------|
|
|
2591
|
+
| Happy path | `[command]` | ✅/❌ | [observation] |
|
|
2592
|
+
| Edge case | `[command]` | ✅/❌ | [observation] |
|
|
2593
|
+
| Error handling | `[command]` | ✅/❌ | [observation] |
|
|
2594
|
+
|
|
2595
|
+
**Smoke Test Status:** Complete / Partial (document gaps) / Not Required
|
|
2596
|
+
|
|
2597
|
+
---
|
|
2598
|
+
|
|
2599
|
+
### Self-Evaluation
|
|
2600
|
+
|
|
2601
|
+
- **Verified working:** [Yes/No - did you actually verify the feature works?]
|
|
2602
|
+
- **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
|
|
2603
|
+
- **Likely failure mode:** [What would most likely break this in production?]
|
|
2604
|
+
- **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
|
|
2605
|
+
|
|
2606
|
+
---
|
|
2607
|
+
|
|
2608
|
+
### Verdict: [READY_FOR_MERGE | AC_MET_BUT_NOT_A_PLUS | NEEDS_VERIFICATION | AC_NOT_MET]
|
|
556
2609
|
|
|
557
2610
|
[Explanation of verdict]
|
|
558
2611
|
|