sequant 2.0.0 → 2.1.0

Files changed (61)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/README.md +7 -6
  4. package/dist/bin/cli.js +2 -1
  5. package/dist/marketplace/external_plugins/sequant/.claude-plugin/plugin.json +1 -1
  6. package/dist/marketplace/external_plugins/sequant/.mcp.json +6 -0
  7. package/dist/marketplace/external_plugins/sequant/README.md +58 -8
  8. package/dist/marketplace/external_plugins/sequant/hooks/post-tool.sh +19 -8
  9. package/dist/marketplace/external_plugins/sequant/hooks/pre-tool.sh +36 -49
  10. package/dist/marketplace/external_plugins/sequant/skills/_shared/references/subagent-types.md +158 -48
  11. package/dist/marketplace/external_plugins/sequant/skills/assess/SKILL.md +354 -352
  12. package/dist/marketplace/external_plugins/sequant/skills/exec/SKILL.md +1155 -33
  13. package/dist/marketplace/external_plugins/sequant/skills/fullsolve/SKILL.md +35 -4
  14. package/dist/marketplace/external_plugins/sequant/skills/qa/SKILL.md +2157 -104
  15. package/dist/marketplace/external_plugins/sequant/skills/qa/scripts/quality-checks.sh +1 -1
  16. package/dist/marketplace/external_plugins/sequant/skills/setup/SKILL.md +386 -0
  17. package/dist/marketplace/external_plugins/sequant/skills/solve/SKILL.md +38 -664
  18. package/dist/marketplace/external_plugins/sequant/skills/spec/SKILL.md +505 -120
  19. package/dist/marketplace/external_plugins/sequant/skills/test/SKILL.md +246 -1
  20. package/dist/marketplace/external_plugins/sequant/skills/testgen/SKILL.md +138 -1
  21. package/dist/src/commands/dashboard.js +1 -1
  22. package/dist/src/commands/doctor.js +1 -1
  23. package/dist/src/commands/init.js +10 -10
  24. package/dist/src/commands/logs.js +1 -1
  25. package/dist/src/commands/run.js +49 -39
  26. package/dist/src/commands/state.js +3 -3
  27. package/dist/src/commands/status.js +5 -5
  28. package/dist/src/commands/sync.js +8 -8
  29. package/dist/src/commands/update.js +16 -16
  30. package/dist/src/lib/cli-ui.js +20 -19
  31. package/dist/src/lib/merge-check/index.js +2 -2
  32. package/dist/src/lib/settings.d.ts +8 -0
  33. package/dist/src/lib/settings.js +1 -0
  34. package/dist/src/lib/shutdown.js +1 -1
  35. package/dist/src/lib/templates.js +2 -0
  36. package/dist/src/lib/wizard.js +6 -4
  37. package/dist/src/lib/workflow/batch-executor.d.ts +9 -1
  38. package/dist/src/lib/workflow/batch-executor.js +39 -2
  39. package/dist/src/lib/workflow/log-writer.js +6 -6
  40. package/dist/src/lib/workflow/metrics-writer.js +5 -3
  41. package/dist/src/lib/workflow/phase-executor.d.ts +1 -1
  42. package/dist/src/lib/workflow/phase-executor.js +52 -22
  43. package/dist/src/lib/workflow/platforms/github.js +5 -1
  44. package/dist/src/lib/workflow/state-cleanup.js +1 -1
  45. package/dist/src/lib/workflow/state-manager.js +15 -13
  46. package/dist/src/lib/workflow/state-rebuild.js +2 -2
  47. package/dist/src/lib/workflow/types.d.ts +27 -0
  48. package/dist/src/lib/workflow/worktree-manager.js +40 -41
  49. package/dist/src/lib/worktree-isolation.d.ts +130 -0
  50. package/dist/src/lib/worktree-isolation.js +310 -0
  51. package/package.json +24 -14
  52. package/templates/agents/sequant-explorer.md +23 -0
  53. package/templates/agents/sequant-implementer.md +18 -0
  54. package/templates/agents/sequant-qa-checker.md +24 -0
  55. package/templates/agents/sequant-testgen.md +25 -0
  56. package/templates/scripts/cleanup-worktree.sh +18 -0
  57. package/templates/skills/_shared/references/subagent-types.md +158 -48
  58. package/templates/skills/exec/SKILL.md +72 -6
  59. package/templates/skills/qa/SKILL.md +8 -217
  60. package/templates/skills/spec/SKILL.md +446 -120
  61. package/templates/skills/testgen/SKILL.md +138 -1
@@ -16,10 +16,11 @@ allowed-tools:
16
16
  - Bash(gh pr view:*)
17
17
  - Bash(gh pr diff:*)
18
18
  - Bash(gh pr comment:*)
19
+ - Bash(gh pr checks:*)
19
20
  - Bash(semgrep:*)
20
21
  - Bash(npx semgrep:*)
21
22
  - Bash(npx tsx scripts/semgrep-scan.ts:*)
22
- - Task(general-purpose)
23
+ - Agent(sequant-qa-checker)
23
24
  - AgentOutputTool
24
25
  ---
25
26
 
@@ -37,6 +38,32 @@ When invoked as `/qa`, your job is to:
37
38
  4. Assess whether the change is "A+ status" or needs more work.
38
39
  5. Draft a GitHub review/QA comment summarizing findings and recommendations.
39
40
 
41
+ ## Orchestration Context
42
+
43
+ When running as part of an orchestrated workflow (e.g., `sequant run` or `/fullsolve`), this skill receives environment variables that indicate the orchestration context:
44
+
45
+ | Environment Variable | Description | Example Value |
46
+ |---------------------|-------------|---------------|
47
+ | `SEQUANT_ORCHESTRATOR` | The orchestrator invoking this skill | `sequant-run` |
48
+ | `SEQUANT_PHASE` | Current phase in the workflow | `qa` |
49
+ | `SEQUANT_ISSUE` | Issue number being processed | `123` |
50
+ | `SEQUANT_WORKTREE` | Path to the feature worktree | `/path/to/worktrees/feature/...` |
51
+
52
+ **Behavior when orchestrated (SEQUANT_ORCHESTRATOR is set):**
53
+
54
+ 1. **Skip pre-flight sync check** - Orchestrator has already synced
55
+ 2. **Use provided worktree** - Work in `SEQUANT_WORKTREE` path directly
56
+ 3. **Skip issue fetch** - Use `SEQUANT_ISSUE`; the orchestrator already has the issue context
57
+ 4. **Reduce GitHub comment frequency** - Defer updates to orchestrator
58
+ 5. **Trust git state** - Orchestrator verified branch status
59
+
60
+ **Behavior when standalone (SEQUANT_ORCHESTRATOR is NOT set):**
61
+
62
+ - Perform pre-flight sync check
63
+ - Locate worktree or work from main
64
+ - Fetch fresh issue context from GitHub
65
+ - Post QA comment directly to GitHub
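The orchestrated/standalone split described above can be sketched as a small shell check. This is only an illustration: `qa_mode` is a hypothetical helper name, not something shipped in the package, and treating an empty `SEQUANT_ORCHESTRATOR` the same as an unset one is an assumption.

```bash
# Sketch only: qa_mode is a hypothetical helper, not part of sequant.
# Prints "orchestrated" when SEQUANT_ORCHESTRATOR is set and non-empty,
# otherwise "standalone". (Treating empty as unset is an assumption.)
qa_mode() {
  if [ -n "${SEQUANT_ORCHESTRATOR:-}" ]; then
    echo "orchestrated"
  else
    echo "standalone"
  fi
}

# Example: pick the behavior for the current run.
if [ "$(qa_mode)" = "orchestrated" ]; then
  echo "Orchestrated: use worktree ${SEQUANT_WORKTREE:-(unset)} and skip pre-flight sync"
else
  echo "Standalone: run pre-flight sync and locate the worktree"
fi
```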
66
+
40
67
  ## Phase Detection (Smart Resumption)
41
68
 
42
69
  **Before executing**, check if the exec phase has been completed (prerequisite for QA):
@@ -70,15 +97,22 @@ fi
70
97
 
71
98
  **Phase Marker Emission:**
72
99
 
73
- When posting the QA review comment to GitHub, append a phase marker at the end:
100
+ When posting the QA review comment to GitHub, append a phase marker at the end.
101
+
102
+ **IMPORTANT:** Always include the `commitSHA` field with the current HEAD SHA. This enables incremental re-runs by recording the baseline commit for future QA runs.
103
+
104
+ ```bash
105
+ # Get current HEAD SHA for the phase marker
106
+ COMMIT_SHA=$(git rev-parse HEAD)
107
+ ```
74
108
 
75
109
  ```markdown
76
- <!-- SEQUANT_PHASE: {"phase":"qa","status":"completed","timestamp":"<ISO-8601>"} -->
110
+ <!-- SEQUANT_PHASE: {"phase":"qa","status":"completed","timestamp":"<ISO-8601>","commitSHA":"<HEAD-SHA>"} -->
77
111
  ```
78
112
 
79
113
  If QA determines AC_NOT_MET, emit:
80
114
  ```markdown
81
- <!-- SEQUANT_PHASE: {"phase":"qa","status":"failed","timestamp":"<ISO-8601>","error":"AC_NOT_MET"} -->
115
+ <!-- SEQUANT_PHASE: {"phase":"qa","status":"failed","timestamp":"<ISO-8601>","error":"AC_NOT_MET","commitSHA":"<HEAD-SHA>"} -->
82
116
  ```
83
117
 
84
118
  Include this marker in every `gh issue comment` that represents QA completion.
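One way to assemble the two marker variants shown in this hunk is a small formatting function. This is a sketch, not the skill's actual implementation: `phase_marker` and its argument order are hypothetical, and the values are interpolated without JSON escaping, which is only safe for simple values such as a SHA or an ISO-8601 timestamp.

```bash
# Sketch only: phase_marker is a hypothetical helper, not part of sequant.
# Usage: phase_marker <status> <timestamp> <commit-sha> [error]
# Values are inserted verbatim (no JSON escaping), so pass simple strings only.
phase_marker() {
  if [ -n "${4:-}" ]; then
    # Failed run: include the error code before commitSHA, as in the diff above.
    printf '<!-- SEQUANT_PHASE: {"phase":"qa","status":"%s","timestamp":"%s","error":"%s","commitSHA":"%s"} -->\n' \
      "$1" "$2" "$4" "$3"
  else
    printf '<!-- SEQUANT_PHASE: {"phase":"qa","status":"%s","timestamp":"%s","commitSHA":"%s"} -->\n' \
      "$1" "$2" "$3"
  fi
}

# Example (needs a git repository, so it is left commented out):
# phase_marker completed "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(git rev-parse HEAD)"
```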
@@ -89,10 +123,100 @@ Invocation:
89
123
 
90
124
  - `/qa 123`: Treat `123` as the GitHub issue/PR identifier in context.
91
125
  - `/qa <freeform description>`: Treat the text as context about the change to review.
126
+ - `/qa 123 --parallel`: Force parallel agent execution (faster, higher token usage).
127
+ - `/qa 123 --sequential`: Force sequential agent execution (slower, lower token usage).
128
+
129
+ ### Agent Execution Mode
130
+
131
+ Before spawning quality check agents, determine the execution mode:
132
+
133
+ 1. **Check for CLI flag override:**
134
+ - `--parallel` → Use parallel execution
135
+ - `--sequential` → Use sequential execution
136
+
137
+ 2. **If no flag, read project settings:**
138
+ Use the Read tool to check project settings:
139
+ ```
140
+ Read(file_path=".sequant/settings.json")
141
+ # Parse JSON and extract agents.parallel (default: false)
142
+ ```
143
+
144
+ 3. **Default:** Sequential (cost-optimized)
145
+
146
+ | Mode | Token Usage | Speed | Best For |
147
+ |------|-------------|-------|----------|
148
+ | Sequential | 1x (baseline) | Slower | Limited API plans, single issues |
149
+ | Parallel | ~2-3x | ~50% faster | Unlimited plans, batch operations |
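The precedence described above (CLI flag, then project setting, then the sequential default) could be expressed roughly as follows. `resolve_agent_mode` is a hypothetical name used for illustration, and the second argument is assumed to be the already-parsed value of `agents.parallel` from `.sequant/settings.json`.

```bash
# Sketch only: resolve_agent_mode is a hypothetical helper, not part of sequant.
# Usage: resolve_agent_mode <cli-flag-or-empty> <agents.parallel value or empty>
resolve_agent_mode() {
  # 1. A CLI flag always wins.
  case "${1:-}" in
    --parallel)   echo "parallel";   return ;;
    --sequential) echo "sequential"; return ;;
  esac
  # 2. Otherwise use the project setting; 3. default to sequential.
  if [ "${2:-}" = "true" ]; then
    echo "parallel"
  else
    echo "sequential"
  fi
}

# Example: no flag given, setting read elsewhere as "true".
resolve_agent_mode "" "true"
```

In a shell context the setting could be read with something like `jq -r '.agents.parallel // false' .sequant/settings.json`, mirroring the `jq` pattern the skill uses for `staleBranchThreshold`; the skill text itself uses the Read tool instead.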
150
+
151
+ ### Quality Check Caching
152
+
153
+ The QA quality checks cache their results so that a re-run can skip any check whose inputs are unchanged, which shortens iteration time.
154
+
155
+ #### Cache Configuration
156
+
157
+ **CLI flags:**
158
+ - `/qa 123 --no-cache`: Force fresh run, ignore all cached results
159
+ - `/qa 123 --use-cache`: Enable caching (default)
160
+
161
+ **When caching is used:**
162
+ - Type safety check → Cached (keyed by diff hash)
163
+ - Deleted tests check → Cached (keyed by diff hash)
164
+ - Security scan → Cached (keyed by diff hash + config)
165
+ - Semgrep analysis → Cached (keyed by diff hash)
166
+ - Build verification → Cached (keyed by diff hash)
167
+ - Scope/size metrics → Always fresh (cheap operations)
168
+
169
+ #### Cache Invalidation Rules
170
+
171
+ | Change Type | Invalidation Scope |
172
+ |-------------|-------------------|
173
+ | Source file changes | Re-run type safety, security, semgrep |
174
+ | Test file changes | Re-run deleted-tests check |
175
+ | Config changes (tsconfig, package.json) | Re-run affected checks |
176
+ | `package-lock.json` changes | Re-run ALL checks |
177
+ | TTL expiry (1 hour default) | Re-run expired checks |
178
+
179
+ #### Cache Status Reporting (AC-4)
180
+
181
+ The quality-checks.sh script outputs a cache status table:
182
+
183
+ ```markdown
184
+ ### Cache Status Report
185
+
186
+ | Check | Cache Status |
187
+ |-------|--------------|
188
+ | type-safety | ✅ HIT |
189
+ | deleted-tests | ✅ HIT |
190
+ | scope | ⏭️ SKIP |
191
+ | size | ⏭️ SKIP |
192
+ | security | ❌ MISS |
193
+ | semgrep | ❌ MISS |
194
+ | build | ✅ HIT |
195
+
196
+ **Summary:** 3 hits, 2 misses, 2 skipped
197
+ **Performance:** Cached checks saved execution time
198
+ ```
199
+
200
+ #### Cache Location
201
+
202
+ Cache is stored at `.sequant/.cache/qa/cache.json` with the following structure:
203
+ - `diffHash`: SHA256 hash of `git diff main...HEAD`
204
+ - `configHash`: SHA256 hash of relevant config files
205
+ - `result`: Check result (passed, message, details)
206
+ - `ttl`: Time-to-live in milliseconds (default: 1 hour)
207
+
208
+ #### Graceful Degradation (AC-6)
209
+
210
+ If the cache is corrupted or unreadable:
211
+ 1. Log a warning at debug level (AC-7)
212
+ 2. Fall back to fresh run
213
+ 3. Continue without caching errors affecting QA
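The hit/miss decision and the fall-back-to-fresh-run behavior can be illustrated with a pure function. Everything here is an assumption for illustration: `cache_status` is a hypothetical helper, and the `cached-at` timestamp argument is not one of the fields listed in the diff (which names only `diffHash`, `configHash`, `result`, and `ttl`).

```bash
# Sketch only: cache_status is a hypothetical helper, not part of sequant.
# Usage: cache_status <stored-diff-hash> <current-diff-hash> <cached-at-ms> <ttl-ms> <now-ms>
# The cached-at timestamp is an assumed field; the diff does not name one.
cache_status() {
  # Graceful degradation: a missing or non-numeric field counts as a MISS,
  # so a corrupted entry leads to a fresh run instead of an error.
  for n in "${3:-}" "${4:-}" "${5:-}"; do
    case "$n" in
      ''|*[!0-9]*) echo "MISS"; return ;;
    esac
  done
  # The diff hash must be present and unchanged.
  if [ -z "${1:-}" ] || [ "${1:-}" != "${2:-}" ]; then
    echo "MISS"
    return
  fi
  # The entry must still be within its time-to-live.
  if [ $(( $5 - $3 )) -le "$4" ]; then
    echo "HIT"
  else
    echo "MISS"
  fi
}

# Example: same hash, entry is 1 second old, TTL is 1 hour.
cache_status abc abc 1000 3600000 2000
```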
92
214
 
93
215
  ### Pre-flight Sync Check
94
216
 
95
- Before starting QA, verify the local branch is in sync with remote:
217
+ **Skip this section if `SEQUANT_ORCHESTRATOR` is set** - the orchestrator has already verified sync status.
218
+
219
+ Before starting QA (standalone mode), verify the local branch is in sync with remote:
96
220
 
97
221
  ```bash
98
222
  git fetch origin 2>/dev/null || echo "Network unavailable - proceeding with local state"
@@ -109,10 +233,103 @@ If diverged, recommend:
109
233
  git pull origin main # Or merge origin/main if pull fails
110
234
  ```
111
235
 
236
+ ### Stale Branch Detection
237
+
238
+ **Skip this section if `SEQUANT_ORCHESTRATOR` is set** - the orchestrator handles branch freshness checks.
239
+
240
+ **Purpose:** Detect when the feature branch is significantly behind main, which can lead to:
241
+ - QA cycles wasted reviewing code that won't cleanly merge
242
+ - False `READY_FOR_MERGE` verdicts that fail at merge time
243
+ - Conflicts that require rework after QA approval
244
+
245
+ **Detection:**
246
+
247
+ ```bash
248
+ # Ensure we have latest remote state
249
+ git fetch origin 2>/dev/null || true
250
+
251
+ # Count commits behind main
252
+ behind=$(git rev-list --count HEAD..origin/main 2>/dev/null || echo "0")
253
+ echo "Feature branch is $behind commits behind main"
254
+ ```
255
+
256
+ **Threshold Configuration:**
257
+
258
+ The stale branch threshold is configurable in `.sequant/settings.json`:
259
+
260
+ ```json
261
+ {
262
+ "run": {
263
+ "staleBranchThreshold": 5
264
+ }
265
+ }
266
+ ```
267
+
268
+ Default: 5 commits
269
+
270
+ **Behavior:**
271
+
272
+ | Commits Behind | Action |
273
+ |----------------|--------|
274
+ | 0 | ✅ Proceed normally |
275
+ | 1 to threshold | ⚠️ **Warning:** "Feature branch is N commits behind main. Consider rebasing before QA." |
276
+ | > threshold | ❌ **Block:** "STALE_BRANCH: Feature branch is N commits behind main (threshold: T). Rebase required before QA." |
277
+
278
+ **Implementation:**
279
+
280
+ ```bash
281
+ # Read threshold from settings (default: 5)
282
+ threshold=$(jq -r '.run.staleBranchThreshold // 5' .sequant/settings.json 2>/dev/null || echo "5")
283
+
284
+ behind=$(git rev-list --count HEAD..origin/main 2>/dev/null || echo "0")
285
+
286
+ if [[ $behind -gt $threshold ]]; then
287
+ echo "❌ STALE_BRANCH: Feature branch is $behind commits behind main (threshold: $threshold)"
288
+ echo " Rebase required before QA:"
289
+ echo " git fetch origin && git rebase origin/main"
290
+ # Exit with error - QA should not proceed
291
+ exit 1
292
+ elif [[ $behind -gt 0 ]]; then
293
+ echo "⚠️ Warning: Feature branch is $behind commits behind main."
294
+ echo " Consider rebasing before QA: git fetch origin && git rebase origin/main"
295
+ # Continue with warning
296
+ fi
297
+ ```
298
+
299
+ **Output Format:**
300
+
301
+ Include in QA output when branch is stale:
302
+
303
+ ```markdown
304
+ ### Stale Branch Check
305
+
306
+ | Check | Value |
307
+ |-------|-------|
308
+ | Commits behind main | N |
309
+ | Threshold | T |
310
+ | Status | ✅ OK / ⚠️ Warning / ❌ Blocked |
311
+
312
+ [Warning/blocking message if applicable]
313
+ ```
314
+
315
+ **Verdict Impact:**
316
+
317
+ | Status | Verdict Impact |
318
+ |--------|----------------|
319
+ | OK (0 behind) | No impact |
320
+ | Warning (1 to threshold) | Note in findings, recommend rebase |
321
+ | Blocked (> threshold) | **Cannot proceed** - rebase first |
322
+
112
323
  ### Feature Worktree Workflow
113
324
 
114
325
  **QA Phase:** Review code in the feature worktree.
115
326
 
327
+ **If orchestrated (SEQUANT_WORKTREE is set):**
328
+ - Use the provided worktree path directly: `cd $SEQUANT_WORKTREE`
329
+ - Skip step 1 below (worktree location provided by orchestrator)
330
+
331
+ **If standalone:**
332
+
116
333
  1. **Locate the worktree:**
117
334
  - The worktree should already exist from the execution phase (`/exec`)
118
335
  - Find the worktree: `git worktree list` or check `../worktrees/feature/` for directories matching the issue number
@@ -163,111 +380,1434 @@ If no feature worktree exists (work was done directly on main):
163
380
 
164
381
  4. **Run quality checks** on the current branch instead of comparing to a worktree.
165
382
 
166
- ### Quality Checks (Multi-Agent) — REQUIRED
383
+ ### Phase 0: Implementation Status Check — REQUIRED
167
384
 
168
- **You MUST spawn sub-agents for quality checks.** Do NOT run these checks inline with bash commands. Sub-agents provide parallel execution, better context isolation, and consistent reporting.
385
+ **Before spawning quality check agents**, verify that implementation actually exists. Running full QA on an unimplemented issue wastes tokens and produces confusing output.
169
386
 
170
- **Spawn ALL THREE agents in a SINGLE message:**
387
+ **Detection Logic:**
171
388
 
172
- 1. `Task(subagent_type="general-purpose", model="haiku", prompt="Run type safety and deleted tests checks on the current branch vs main. Report: type issues count, deleted tests, verdict.")`
389
+ ```bash
390
+ # 1. Check for worktree (indicates work may have started)
391
+ worktree_path=$(git worktree list | grep -i "<issue-number>" | awk '{print $1}' | head -1 || true)
173
392
 
174
- 2. `Task(subagent_type="general-purpose", model="haiku", prompt="Run scope and size checks on the current branch vs main. Report: files count, diff size, size assessment.")`
393
+ # 2. Check for commits on the feature branch (vs main); this includes ALL file types
394
+ commits_exist=$(git log --oneline main..HEAD 2>/dev/null | head -1)
175
395
 
176
- 3. `Task(subagent_type="general-purpose", model="haiku", prompt="Run security scan on changed files in current branch vs main. Report: critical/warning/info counts, verdict.")`
396
+ # 3. Check for uncommitted changes
397
+ uncommitted_changes=$(git status --porcelain | head -1)
177
398
 
178
- **Add RLS check if admin files modified:**
399
+ # 4. Check for open PR linked to this issue
400
+ pr_exists=$(gh pr list --search "<issue-number>" --state open --json number -q '.[0].number' 2>/dev/null)
401
+
402
+ # 5. Check for ANY file changes (including .md, prompt-only changes)
403
+ any_diff=$(git diff --name-only main..HEAD 2>/dev/null | head -1 || true)
404
+ ```
405
+
406
+ **IMPORTANT: Prompt-only and markdown-only changes ARE valid implementations.** Many issues (e.g., skill improvements, documentation features) are implemented entirely via `.md` file changes. The detection logic must count these as real implementation, not skip them.
407
+
408
+ **Implementation Status Matrix:**
409
+
410
+ | Worktree | Commits | Uncommitted | PR | Status | Action |
411
+ |----------|---------|-------------|-----|--------|--------|
412
+ | ❌ | ❌ | ❌ | ❌ | No implementation | Early exit |
413
+ | ✅ | ❌ | ❌ | ❌ | Worktree created but no work | Early exit |
414
+ | ✅ | ❌ | ✅ | ❌ | Work in progress (uncommitted) | Proceed with QA |
415
+ | ✅ | ✅ | * | * | Implementation exists | Proceed with QA |
416
+ | * | ✅ | * | * | Commits exist | Proceed with QA |
417
+ | * | * | * | ✅ | PR exists | Proceed with QA |
418
+
419
+ **Early Exit Condition:**
420
+ - No commits on feature branch AND no uncommitted changes AND no open PR
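The early exit condition reduces to a three-way emptiness check over the outputs of the detection commands. The sketch below assumes each argument is the (possibly empty) captured output of `commits_exist`, `uncommitted_changes`, and `pr_exists`; `should_early_exit` is a hypothetical name.

```bash
# Sketch only: should_early_exit is a hypothetical helper, not part of sequant.
# Usage: should_early_exit <commits_exist> <uncommitted_changes> <pr_exists>
# Each argument is the captured output of a detection command; empty means "none".
should_early_exit() {
  if [ -z "${1:-}" ] && [ -z "${2:-}" ] && [ -z "${3:-}" ]; then
    echo "early-exit"
  else
    echo "proceed"
  fi
}

# Example: a worktree exists but there are no commits, no changes, and no PR.
should_early_exit "" "" ""
```

The worktree column is deliberately not an input: in the matrix, a worktree alone never changes the outcome.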
421
+
422
+ **False Negative Prevention (CRITICAL):**
423
+
424
+ Root cause analysis (#448) found that 33% of multi-attempt QA failures were caused by QA reporting "NOT FOUND" when implementation existed. Common causes:
425
+
426
+ | Cause | Example | Fix |
427
+ |-------|---------|-----|
428
+ | Prompt-only changes | Skill SKILL.md modifications (#413) | Check `git diff --name-only` for ANY file, not just .ts/.tsx |
429
+ | Cross-repo work | Landing page issue tracked in main repo (#393) | Check exec progress comments for cross-repo indicators |
430
+ | Worktree mismatch | QA runs in wrong directory | Verify `pwd` matches expected worktree path |
431
+
432
+ **If `git diff --name-only main..HEAD` shows files but standard detection says "NOT FOUND":**
433
+ 1. The implementation exists — proceed with QA
434
+ 2. Adapt review approach to the file types changed (e.g., review .md changes for content quality rather than TypeScript compilation)
435
+
436
+ **If early exit triggered:**
437
+ 1. **Skip** sub-agent spawning (nothing to check)
438
+ 2. **Skip** code review (no code to review)
439
+ 3. **Skip** quality metrics collection
440
+ 4. Use the **Early Exit Output Template** below
441
+ 5. Verdict: `AC_NOT_MET`
442
+
443
+ ---
444
+
445
+ ### Early Exit Output Template
446
+
447
+ When no implementation is detected, use this streamlined output:
448
+
449
+ ```markdown
450
+ ## QA Review for Issue #<N>
451
+
452
+ ### Implementation Status: NOT FOUND
453
+
454
+ No implementation detected for this issue:
455
+ - Commits on feature branch: None
456
+ - Uncommitted changes: None
457
+ - Open PR: None
458
+
459
+ **Verdict: AC_NOT_MET**
460
+
461
+ No code changes found to review. The acceptance criteria cannot be evaluated without an implementation.
462
+
463
+ ### Next Steps
464
+
465
+ 1. Run `/exec <issue-number>` to implement the feature
466
+ 2. Re-run `/qa <issue-number>` after implementation is complete
467
+
468
+ ---
469
+
470
+ *QA skipped: No implementation to review*
471
+ ```
472
+
473
+ **Important:** Do NOT spawn sub-agents when using early exit. This saves tokens and avoids confusing "no changes found" outputs from quality checkers.
474
+
475
+ **CRITICAL — Before early exit, double-check for false negatives:**
179
476
  ```bash
180
- admin_modified=$(git diff main...HEAD --name-only | grep -E "^app/admin/" | head -1 || true)
477
+ # Final safety check: are there ANY file changes vs main?
478
+ any_changes=$(git diff --name-only main..HEAD 2>/dev/null | wc -l | xargs || echo "0")
479
+ if [[ "$any_changes" -gt 0 ]]; then
480
+ echo "WARNING: $any_changes files changed but detection said NOT FOUND"
481
+ echo "Changed files:"
482
+ git diff --name-only main..HEAD 2>/dev/null | head -20
483
+ echo "Proceeding with QA instead of early exit."
484
+ # DO NOT early exit — proceed with QA
485
+ fi
181
486
  ```
182
487
 
183
- See [quality-gates.md](references/quality-gates.md) for detailed verdict synthesis.
488
+ ---
489
+
490
+ ### Phase 0b: Quality Plan Verification (CONDITIONAL)
491
+
492
+ **When to apply:** If issue has a Feature Quality Planning section in comments (from `/spec`).
493
+
494
+ **Purpose:** Verify that quality dimensions identified during planning were addressed in implementation. This catches gaps that AC verification alone misses.
495
+
496
+ **Detection:**
497
+ ```bash
498
+ # Check if issue has quality planning section in comments
499
+ quality_plan_exists=$(gh issue view <issue-number> --comments --json comments -q '.comments[].body' | grep -q "Feature Quality Planning" && echo "yes" || echo "no")
500
+ ```
184
501
 
185
- ### MCP Tools (Optional - Graceful Degradation)
502
+ **If Quality Plan found:**
186
503
 
187
- MCP tools enhance `/qa` but are **not required**. The skill works fully without them.
504
+ 1. **Extract quality dimensions** from the spec comment:
505
+ - Completeness Check items
506
+ - Error Handling items
507
+ - Code Quality items
508
+ - Test Coverage Plan items
509
+ - Best Practices items
510
+ - Polish items (if UI feature)
511
+ - Derived ACs
188
512
 
189
- #### MCP Availability Check
513
+ 2. **Verify each dimension against implementation:**
190
514
 
191
- Before using MCP tools, verify they are available. If unavailable, use the fallback strategies.
515
+ | Dimension | Verification Method |
516
+ |-----------|---------------------|
517
+ | Completeness | Check all AC steps have code |
518
+ | Error Handling | Search for error handling code, try/catch blocks |
519
+ | Code Quality | Check for `any` types, magic strings |
520
+ | Test Coverage | Verify test files exist for critical paths |
521
+ | Best Practices | Check for logging, security patterns |
522
+ | Polish | Check loading/error/empty states in UI |
192
523
 
193
- | MCP Tool | Purpose | Fallback When Unavailable |
194
- |----------|---------|---------------------------|
195
- | Sequential Thinking | Complex multi-step analysis | Use explicit step-by-step reasoning in response |
196
- | Context7 | Library documentation lookup | Use WebSearch or codebase pattern search |
524
+ 3. **Extract and Verify Derived ACs:**
525
+
526
+ **Extraction Method:**
527
+ ```bash
528
+ # Extract derived ACs from spec comment's Derived ACs table
529
+ # Format: | Source | AC-N: Description | Priority |
530
+ # Uses flexible pattern to match any source dimension (not hardcoded)
531
+ derived_acs=$(gh issue view <issue-number> --comments --json comments -q '.comments[].body' | \
532
+ grep -E '\|[^|]+\|\s*AC-[0-9]+:' | \
533
+ grep -oE 'AC-[0-9]+:[^|]+' | \
534
+ sed 's/^[[:space:]]*//;s/[[:space:]]*$//' | \
535
+ sort -u || true)
536
+
537
+ # Count derived ACs
538
+ derived_count=$(echo "$derived_acs" | grep -c "AC-" || true)
539
+ echo "Found $derived_count derived ACs"
540
+ ```
197
541
 
198
- #### Sequential Thinking Fallback
542
+ **Handling Edge Cases:**
543
+ - **0 derived ACs:** Output "Derived ACs: None found" and skip derived AC verification
544
+ - **1+ derived ACs:** Include each in AC coverage table with source attribution
545
+ - **Malformed rows:** Rows missing the `| Source | AC-N: ... |` pattern are skipped
546
+ - **Extra whitespace:** Trimmed during extraction
199
547
 
200
- **When to use Sequential Thinking:**
201
- - Complex architectural trade-offs during code review
202
- - Multi-dimensional quality assessment
203
- - Analyzing interconnected issues across files
548
+ **Verification:**
549
+ - Treat derived ACs identically to original ACs
550
+ - Include in AC coverage table with "Derived ([Source])" notation
551
+ - Mark as MET/PARTIALLY_MET/NOT_MET based on implementation evidence
204
552
 
205
- **If unavailable:**
206
- 1. Structure your analysis with explicit numbered steps
207
- 2. Document each concern systematically before synthesizing verdict
208
- 3. Use a pros/cons format for trade-off decisions
553
+ **Output Format:**
209
554
 
210
555
  ```markdown
211
- ## Analysis Steps (Manual Sequential Thinking)
556
+ ### Quality Plan Verification
557
+
558
+ **Quality Plan found:** Yes/No
559
+
560
+ | Dimension | Items Planned | Items Addressed | Status |
561
+ |-----------|---------------|-----------------|--------|
562
+ | Completeness | 5 | 5 | ✅ Complete |
563
+ | Error Handling | 3 | 2 | ⚠️ Partial (missing: API timeout) |
564
+ | Code Quality | 4 | 4 | ✅ Complete |
565
+ | Test Coverage | 3 | 3 | ✅ Complete |
566
+ | Best Practices | 2 | 2 | ✅ Complete |
567
+ | Polish | N/A | N/A | - (not UI feature) |
568
+
569
+ **Derived ACs:** 2/2 addressed
212
570
 
213
- **Step 1:** [Analyze first dimension - correctness]
214
- **Step 2:** [Analyze second dimension - maintainability]
215
- **Step 3:** [Analyze third dimension - performance]
216
- **Step 4:** [Synthesize findings into verdict]
571
+ **Quality Plan Status:** Complete / Partial / Not Addressed
217
572
  ```
218
573
 
219
- #### Context7 Fallback
574
+ **Verdict Impact:**
220
575
 
221
- **When to use Context7:**
222
- - Verifying implementation matches library best practices
223
- - Checking if API usage follows recommended patterns
224
- - Understanding framework-specific conventions in reviewed code
576
+ | Quality Plan Status | Verdict Impact |
577
+ |---------------------|----------------|
578
+ | Complete | No impact (positive signal) |
579
+ | Partial | Note in findings, consider `AC_MET_BUT_NOT_A_PLUS` |
580
+ | Not Addressed | Flag in findings, may indicate gaps |
581
+ | No Plan Found | Note: "Quality plan not available - standard QA only" |
225
582
 
226
- **If unavailable:**
227
- 1. Search codebase with Grep for existing usage patterns
228
- 2. Use WebSearch for official library documentation
229
- 3. Check similar implementations in the codebase as reference
230
- 4. Review library's README or documentation in node_modules
583
+ **Status Threshold Definitions:**
231
584
 
232
- ### 1. Context and AC Alignment
585
+ | Status | Criteria |
586
+ |--------|----------|
587
+ | **Complete** | All applicable dimensions have ≥80% items addressed |
588
+ | **Partial** | At least 50% of applicable dimensions have items addressed |
589
+ | **Not Addressed** | <50% of applicable dimensions addressed, or 0 items addressed |
233
590
 
234
- - **Read all GitHub issue comments** for complete context
235
- - Reconstruct the AC checklist (AC-1, AC-2, ...)
236
- - If AC unclear, state assumptions explicitly
591
+ *Example: If 4 dimensions apply (Completeness, Error Handling, Code Quality, Test Coverage):*
592
+ - *Complete: 4/4 dimensions at ≥80%*
593
+ - *Partial: 2-3/4 dimensions have work done*
594
+ - *Not Addressed: 0-1/4 dimensions have work done*
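The threshold definitions can be read as a small classification over dimension counts. The sketch below is one interpretation, not the package's logic: `quality_plan_status` is a hypothetical helper, and returning "Not Addressed" when no dimensions apply is an assumption the diff does not cover.

```bash
# Sketch only: quality_plan_status is a hypothetical helper, not part of sequant.
# Usage: quality_plan_status <applicable-dims> <dims-at-80-percent> <dims-with-any-work>
quality_plan_status() {
  if [ "$1" -gt 0 ] && [ "$2" -eq "$1" ]; then
    # Every applicable dimension has at least 80% of its items addressed.
    echo "Complete"
  elif [ "$3" -gt 0 ] && [ $(( $3 * 2 )) -ge "$1" ]; then
    # At least half of the applicable dimensions have some work done.
    echo "Partial"
  else
    echo "Not Addressed"
  fi
}

# Example from the text: 4 applicable dimensions, 2 of them with work done.
quality_plan_status 4 1 2
```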
595
+
596
+ **If no Quality Plan found:**
597
+ - Output: "Quality Plan Verification: N/A - No quality plan found in issue comments"
598
+ - Proceed with standard QA (no verdict impact)
599
+
600
+ ---
601
+
602
+ ### Phase 0c: Incremental Re-Run Detection (CONDITIONAL)
603
+
604
+ **When to apply:** On QA re-runs (when a prior QA phase marker exists in issue comments).
605
+
606
+ **Purpose:** Optimize QA re-runs by detecting what changed since the last QA run and skipping checks whose inputs haven't changed. This reduces token usage and execution time on iterative QA cycles.
607
+
608
+ **Detection:**
609
+
610
+ ```bash
611
+ # Step 1: Check for prior QA run context in cache
612
+ prior_context=$(npx tsx scripts/qa/qa-cache-cli.ts get-run-context 2>/dev/null || true)
613
+
614
+ # Step 2: If no cache context found, fall through to full QA run
615
+ if [[ -z "$prior_context" ]] || echo "$prior_context" | grep -q "No QA run context"; then
616
+ echo "No prior QA context found — running full QA"
617
+ INCREMENTAL_MODE=false
618
+ else
619
+ LAST_QA_SHA=$(echo "$prior_context" | jq -r '.lastQACommitSHA')
620
+ LAST_QA_HASH=$(echo "$prior_context" | jq -r '.lastQADiffHash')
621
+
622
+ # Step 3: Validate the commit SHA still exists in git history
623
+ if ! git cat-file -t "$LAST_QA_SHA" &>/dev/null; then
624
+ echo "Warning: Last QA commit SHA ($LAST_QA_SHA) not found in history — running full QA"
625
+ INCREMENTAL_MODE=false
626
+ else
627
+ # Step 4: Get files changed since last QA
628
+ changed_files=$(npx tsx scripts/qa/qa-cache-cli.ts changed-since "$LAST_QA_SHA" 2>/dev/null || true)
629
+
630
+ if [[ "$changed_files" == "NO_CHANGES" ]]; then
631
+ echo "No changes since last QA — all checks can use cached results"
632
+ INCREMENTAL_MODE=true
633
+ NO_FILE_CHANGES=true
634
+ else
635
+ echo "Changes detected since last QA ($LAST_QA_SHA):"
636
+ echo "$changed_files" | head -20
637
+ INCREMENTAL_MODE=true
638
+ NO_FILE_CHANGES=false
639
+ fi
640
+ fi
641
+ fi
642
+ ```
643
+
644
+ **Skip Logic (when INCREMENTAL_MODE=true):**
645
+
646
+ | Check / Item | Skip Condition | Re-run Condition |
647
+ |-------------|----------------|------------------|
648
+ | Quality checks (type-safety, security, etc.) | Existing diff-hash cache handles this | Hash mismatch -> re-run |
649
+ | Build verification | **Never skip** (always re-run) | Always — cheap and can regress |
650
+ | CI status | **Never skip** (always re-run) | Always — external state changes |
651
+ | AC items with prior status `met` | Skip if NO_FILE_CHANGES=true | Any file changes since last QA |
652
+ | AC items with prior status `not_met` | **Never skip** | Always re-evaluate |
653
+ | AC items with prior status `partially_met` | **Never skip** | Always re-evaluate |
654
+ | AC items with prior status `pending`/`blocked` | **Never skip** | Always re-evaluate |
655
+
656
+ **AC Re-evaluation Rules:**
657
+
658
+ When `INCREMENTAL_MODE=true`:
659
+
660
+ 1. **Load prior AC statuses** from run context:
661
+ ```bash
662
+ # Extract AC statuses from prior context
663
+ ac_statuses=$(echo "$prior_context" | jq -r '.acStatuses | to_entries[] | "\(.key)=\(.value)"')
664
+ ```
665
+
666
+ 2. **For each AC item:**
667
+ - If prior status is `met` AND `NO_FILE_CHANGES=true`:
668
+ - **Skip full re-evaluation** — output "Cached: previously MET, no file changes"
669
+ - Mark as `MET (cached)` in output
670
+ - If prior status is `met` AND files changed:
671
+ - **Re-evaluate** — changes may have caused regression
672
+ - If prior status is `not_met` or `partially_met`:
673
+ - **Always re-evaluate** — this is the primary purpose of re-runs
674
+ - If prior status is `pending` or `blocked`:
675
+ - **Always re-evaluate** — status may have changed
676
+
677
+ 3. **`--no-cache` flag behavior:**
678
+ - When `--no-cache` is passed, set `INCREMENTAL_MODE=false`
679
+ - This forces full re-evaluation of ALL checks and AC items
680
+ - Run context is still saved at the end for future re-runs
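The per-AC rules above collapse to a single condition: an item is skipped only when incremental mode is on, its prior status is `met`, and no files changed. A minimal sketch, assuming a hypothetical helper `ac_action` that takes those three values as strings:

```bash
# Sketch only: ac_action is a hypothetical helper, not part of sequant.
# Usage: ac_action <prior-status> <no-file-changes: true|false> <incremental-mode: true|false>
# Passing --no-cache corresponds to incremental-mode=false.
ac_action() {
  if [ "${3:-false}" = "true" ] && [ "${1:-}" = "met" ] && [ "${2:-false}" = "true" ]; then
    echo "skip"          # report as "MET (cached)"
  else
    echo "re-evaluate"   # not_met, partially_met, pending, blocked, or files changed
  fi
}

# Example: previously MET, but files changed since the last QA run.
ac_action met false true
```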
681
+
682
+ **Output Format (Incremental QA Summary):**
683
+
684
+ When `INCREMENTAL_MODE=true`, prepend this section to the QA output:
685
+
686
+ ```markdown
687
+ ### Incremental QA Summary
688
+
689
+ **Last QA:** <timestamp> (commit: <sha-short>)
690
+ **Changes since last QA:** N files
691
+
692
+ | Check / AC | Status | Re-run? | Reason |
693
+ |------------|--------|---------|--------|
694
+ | type-safety | PASS | Cached | Diff hash unchanged |
695
+ | security | PASS | Cached | Diff hash unchanged |
696
+ | build | PASS | Re-run | Always fresh |
697
+ | CI status | PASS | Re-run | Always fresh |
698
+ | AC-1 | MET | Cached | Previously MET, no file changes |
699
+ | AC-2 | MET | Re-evaluated | Was NOT_MET |
700
+ | AC-3 | MET | Re-evaluated | Files changed since last QA |
701
+
702
+ **Summary:** X checks cached, Y re-evaluated, Z always-fresh
703
+ ```
704
+
705
+ **Run Context Persistence:**
706
+
707
+ After QA completes (regardless of incremental mode), save the run context:
708
+
709
+ ```bash
710
+ # Get current HEAD SHA
711
+ current_sha=$(git rev-parse HEAD)
712
+ # Get current diff hash
713
+ current_hash=$(npx tsx scripts/qa/qa-cache-cli.ts hash)
714
+
715
+ # Build AC statuses JSON from QA results
716
+ # Example: {"AC-1":"met","AC-2":"not_met","AC-3":"met"}
717
+ ac_json='{"AC-1":"met","AC-2":"not_met"}' # Replace with actual results
718
+
719
+ # Save run context
720
+ echo "{
721
+ \"lastQACommitSHA\": \"$current_sha\",
722
+ \"lastQADiffHash\": \"$current_hash\",
723
+ \"acStatuses\": $ac_json,
724
+ \"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%S.000Z)\"
725
+ }" | npx tsx scripts/qa/qa-cache-cli.ts set-run-context
726
+ ```
727
+
728
+ ---
729
+
730
+ ### Phase 1: CI Status Check — REQUIRED
731
+
732
+ **Purpose:** Check GitHub CI status before finalizing verdict. CI-dependent AC items (e.g., "Tests pass in CI") should reflect actual CI status, not just local test results.
733
+
734
+ **When to check:** If a PR exists for the issue/branch.
735
+
736
+ **Detection:**
737
+ ```bash
738
+ # Get PR number for current branch
739
+ pr_number=$(gh pr view --json number -q '.number' 2>/dev/null)
740
+
741
+ # If PR exists, check CI status
742
+ if [[ -n "$pr_number" ]]; then
743
+ gh pr checks "$pr_number" --json name,state,bucket
744
+ fi
745
+ ```
746
+
747
+ **CI Status Mapping:**
748
+
749
+ | State | Bucket | AC Status | Verdict Impact |
750
+ |-------|--------|-----------|----------------|
751
+ | `SUCCESS` | `pass` | `MET` | No impact |
752
+ | `FAILURE` | `fail` | `NOT_MET` | Blocks merge |
753
+ | `CANCELLED` | `cancel` | `NOT_MET` | Blocks merge |
753
+ | `SKIPPED` | `skipping` | `N/A` | No impact |
755
+ | `PENDING` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
756
+ | `QUEUED` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
757
+ | `IN_PROGRESS` | `pending` | `PENDING` | → `NEEDS_VERIFICATION` |
758
+ | (empty response) | - | `N/A` | No CI configured |
759
+
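As a rough illustration of the mapping table, the sketch below converts one check's `state` and `bucket` into an AC status. `ci_check_to_ac_status` is a hypothetical name. The fallback on `bucket` for states not listed in the table is an assumption, based on the bucket categories that `gh pr checks` reports (`pass`, `fail`, `pending`, `skipping`, `cancel`).

```bash
# Hypothetical helper (not part of sequant): map one CI check to an AC status.
# Args: state, bucket (both as returned by `gh pr checks --json name,state,bucket`).
ci_check_to_ac_status() {
  local state="${1:-}" bucket="${2:-}"
  case "$state" in
    SUCCESS)                     echo "MET" ;;
    FAILURE|CANCELLED)           echo "NOT_MET" ;;
    SKIPPED)                     echo "N/A" ;;
    PENDING|QUEUED|IN_PROGRESS)  echo "PENDING" ;;
    *)
      # Assumption: for states not in the table, trust the bucket instead.
      case "$bucket" in
        pass)        echo "MET" ;;
        fail|cancel) echo "NOT_MET" ;;
        pending)     echo "PENDING" ;;
        *)           echo "N/A" ;;   # empty response: no CI configured
      esac
      ;;
  esac
}
```

Keying on `state` first keeps the helper aligned with the table; the bucket is consulted only when the state is unfamiliar.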
760
+ **CI-Related AC Detection:**
761
+
762
+ Identify AC items that depend on CI by matching these patterns:
763
+ - "Tests pass in CI"
764
+ - "CI passes"
765
+ - "Build succeeds in CI"
766
+ - "GitHub Actions pass"
767
+ - "Pipeline passes"
768
+ - "Workflow passes"
769
+ - "Checks pass"
770
+ - "Actions succeed"
771
+ - "CI/CD passes"
772
+
773
+ ```bash
774
+ # Example: Check if any AC mentions CI
775
+ ci_ac_patterns="CI|pipeline|GitHub Actions|build succeeds|tests pass in CI|workflow|checks pass|actions succeed"
776
+ ```
777
+
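One way the pattern above might be applied is sketched here. `is_ci_related_ac` is a hypothetical helper, and the `-i` and `-w` flags are an assumption on top of the original pattern: `-i` ignores case, and `-w` requires whole-word matches so that `CI` does not match inside words such as "specification".

```bash
ci_ac_patterns="CI|pipeline|GitHub Actions|build succeeds|tests pass in CI|workflow|checks pass|actions succeed"

# Hypothetical helper (not part of sequant): succeed when an AC description
# looks CI-dependent according to $ci_ac_patterns.
is_ci_related_ac() {
  printf '%s\n' "$1" | grep -qiwE "$ci_ac_patterns"
}

# Usage
for ac in "Tests pass in CI" "Update the API specification"; do
  if is_ci_related_ac "$ac"; then
    echo "CI-related: $ac"
  else
    echo "Not CI-related: $ac"
  fi
done
```

Pattern matching of this kind is a heuristic; an AC that mentions a "workflow" in a non-CI sense would still be flagged and needs a manual look.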
778
+ **Error Handling:**
779
+
780
+ If `gh pr checks` fails or returns unexpected results:
781
+ - **`gh` not installed** → Skip CI section with note: "CI status unavailable (gh CLI not found)"
782
+ - **`gh` not authenticated** → Skip CI section with note: "CI status unavailable (gh auth required)"
783
+ - **Network or other `gh` error** → Treat as N/A with note: "CI status unavailable (gh command failed)"
784
+ - **No PR exists** → Skip CI status section entirely
785
+ - **Empty response** → No CI configured (not an error)
786
+
787
+ **Portability Note:**
788
+
789
+ CI status detection requires GitHub. Other platforms (GitLab, Bitbucket, Azure DevOps) are not supported. To check if `gh` is available:
790
+ ```bash
791
+ if ! command -v gh &>/dev/null; then
792
+ echo "gh CLI not installed - skipping CI status check"
793
+ fi
794
+ ```
795
+
796
+ **Output Format:**
797
+
798
+ Include CI status in the QA output:
799
+
800
+ ```markdown
801
+ ### CI Status
802
+
803
+ | Check | State | Bucket | Impact |
804
+ |-------|-------|--------|--------|
805
+ | `build (20.x)` | SUCCESS | pass | ✅ MET |
806
+ | `build (22.x)` | PENDING | pending | ⏳ PENDING |
807
+ | `lint` | FAILURE | fail | ❌ NOT_MET |
808
+
809
+ **CI Summary:** 1 passed, 1 pending, 1 failed
810
+ **CI-related AC items:** AC-4 ("Tests pass in CI") → PENDING (CI still running)
811
+ ```
812
+
813
+ **No CI Configured:**
814
+
815
+ If `gh pr checks` returns an empty response:
816
+ ```markdown
817
+ ### CI Status
818
+
819
+ No CI checks configured for this repository.
820
+
821
+ **CI-related AC items:** AC-4 ("Tests pass in CI") → N/A (no CI configured)
822
+ ```
823
+
824
+ **Verdict Integration:**
825
+
826
+ CI status affects the final verdict through the standard verdict algorithm:
827
+ - CI `PENDING` → AC item marked `PENDING` → Verdict: `NEEDS_VERIFICATION`
828
+ - CI `FAILURE` → AC item marked `NOT_MET` → Verdict: `AC_NOT_MET`
829
+ - CI `SUCCESS` → AC item marked `MET` → No additional impact
830
+ - No CI → AC item marked `N/A` → No impact on verdict
831
+
832
+ **Important:** Do NOT give `READY_FOR_MERGE` if any CI check is still pending. The correct verdict is `NEEDS_VERIFICATION` with a note to re-run QA after CI completes.
833
+
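When a PR has several checks, the per-check statuses have to be reduced to a single status for each CI-dependent AC item. A minimal sketch follows, assuming a hypothetical helper `aggregate_ci_status`. The precedence it uses (`NOT_MET` over `PENDING` over `MET`) is an assumption chosen to be consistent with the rules above, not something this section states explicitly.

```bash
# Hypothetical helper (not part of sequant): reduce per-check AC statuses
# to one status. Assumed precedence: NOT_MET > PENDING > MET > N/A.
aggregate_ci_status() {
  local result="N/A" status
  for status in "$@"; do
    case "$status" in
      NOT_MET) echo "NOT_MET"; return 0 ;;   # any failure decides the outcome
      PENDING) result="PENDING" ;;            # pending blocks READY_FOR_MERGE
      MET)     [ "$result" = "PENDING" ] || result="MET" ;;
    esac
  done
  echo "$result"
}

# Usage: one status per CI check
aggregate_ci_status MET PENDING MET
```

With no arguments the helper prints `N/A`, which corresponds to the "no CI configured" case.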
834
+ ---
835
+
836
+ ### Small-Diff Fast Path (Size Gate)
837
+
838
+ **Purpose:** Skip sub-agent spawning for trivial diffs to save ~30s latency and reduce token cost.
839
+
840
+ **Evaluate the size gate BEFORE spawning any quality check sub-agents:**
841
+
842
+ ```bash
843
+ # 1. Read threshold from settings (default: 100)
844
+ threshold=$(cat .sequant/settings.json 2>/dev/null | grep -o '"smallDiffThreshold"[[:space:]]*:[[:space:]]*[0-9]*' | grep -o '[0-9]*$' || echo "100")
845
+ if [ -z "$threshold" ]; then threshold=100; fi
846
+
847
+ # 2. Compute diff size (additions + deletions)
848
+ diff_stats=$(git diff origin/main...HEAD --stat | tail -1 || true)
849
+ additions=$(echo "$diff_stats" | grep -o '[0-9]* insertion' | grep -o '[0-9]*' || echo "0")
850
+ deletions=$(echo "$diff_stats" | grep -o '[0-9]* deletion' | grep -o '[0-9]*' || echo "0")
851
+ total_changes=$((${additions:-0} + ${deletions:-0}))
852
+
853
+ # 3. Check if package.json changed
854
+ pkg_changed=$(git diff origin/main...HEAD --name-only | grep -c '^package\.json$' || true)
855
+
856
+ # 4. Check security-sensitive paths (reuses existing heuristic from anti-pattern detection)
857
+ security_paths=$(git diff origin/main...HEAD --name-only | grep -iE 'auth|payment|security|server-action|middleware|admin' || true)
858
+ security_sensitive="false"
859
+ if [ -n "$security_paths" ]; then security_sensitive="true"; fi
860
+
861
+ echo "Size gate: $total_changes lines changed (threshold: $threshold), pkg_changed=$pkg_changed, security=$security_sensitive"
862
+ ```
863
+
864
+ **Size gate decision:**
865
+
866
+ | Condition | Result |
867
+ |-----------|--------|
868
+ | `total_changes < threshold` AND `pkg_changed == 0` AND `security_sensitive == false` | `SMALL_DIFF=true` — use inline checks |
869
+ | Any condition fails | `SMALL_DIFF=false` — use sub-agents (standard pipeline) |
870
+ | Size gate evaluation errors (e.g., git fails) | `SMALL_DIFF=false` — fall back to full pipeline (AC-5) |
871
+
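The decision table can be expressed as a single function. This is a sketch under stated assumptions: `is_small_diff` is a hypothetical name, and treating any non-numeric input as an evaluation error (falling back to the full pipeline) is one possible reading of the last table row.

```bash
# Hypothetical helper (not part of sequant): prints "true" when the inline
# fast path applies, "false" otherwise.
# Args: total_changes, threshold, pkg_changed, security_sensitive.
is_small_diff() {
  local total_changes="${1:-}" threshold="${2:-}" pkg_changed="${3:-}" security_sensitive="${4:-}"
  local value

  # Empty or non-numeric input means the gate could not be evaluated:
  # fall back to the full pipeline.
  for value in "$total_changes" "$threshold" "$pkg_changed"; do
    case "$value" in
      ''|*[!0-9]*) echo "false"; return 0 ;;
    esac
  done

  if [ "$total_changes" -lt "$threshold" ] \
    && [ "$pkg_changed" -eq 0 ] \
    && [ "$security_sensitive" = "false" ]; then
    echo "true"
  else
    echo "false"
  fi
}

# Usage with the variables computed by the size gate script above:
# SMALL_DIFF=$(is_small_diff "$total_changes" "$threshold" "$pkg_changed" "$security_sensitive")
```

The comparison is strict (`-lt`), so a diff exactly at the threshold still goes through the sub-agent pipeline.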
872
+ **Log the decision (AC-6):**
873
+
874
+ ```markdown
875
+ ### Size Gate
876
+
877
+ | Check | Value |
878
+ |-------|-------|
879
+ | Diff size | N lines (threshold: T) |
880
+ | package.json changed | Yes/No |
881
+ | Security-sensitive paths | Yes/No [list if yes] |
882
+ | Decision | **Inline checks** / **Sub-agents** |
883
+ ```
884
+
885
+ #### If `SMALL_DIFF=true`: Inline Quality Checks
886
+
887
+ Run these checks directly (no sub-agents needed):
888
+
889
+ ```bash
890
+ # Type safety: check for 'any' additions
891
+ any_count=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -cw 'any' || true)
892
+
893
+ # Deleted tests check
894
+ deleted_tests=$(git diff origin/main...HEAD --name-only --diff-filter=D | grep -cE '\.(test|spec)\.' || true)
895
+
896
+ # Scope: files changed count
897
+ files_changed=$(git diff origin/main...HEAD --name-only | wc -l | tr -d ' ')
898
+
899
+ # Security scan (lightweight — just check for obvious patterns in added lines)
900
+ security_issues=$(git diff origin/main...HEAD | grep '^\+' | grep -v '^\+\+\+' | grep -ciE "eval\(|innerHTML|dangerouslySetInnerHTML|exec\(|password.*=.*[\"']|secret.*=.*[\"']|api.?key.*=.*[\"']" || true)
901
+
902
+ echo "Inline checks: any=$any_count, deleted_tests=$deleted_tests, files=$files_changed, security_issues=$security_issues"
903
+ ```
904
+
905
+ **After inline checks, skip to the output template** (the sub-agent section below is not executed).
906
+
907
+ #### If `SMALL_DIFF=false`: Use Sub-Agents (Standard Pipeline)
908
+
909
+ Proceed to the standard Quality Checks section below.
910
+
911
+ ---
912
+
913
+ ### Quality Checks (Multi-Agent) — REQUIRED
914
+
915
+ **When `SMALL_DIFF=false`**, you MUST spawn sub-agents for quality checks. Do NOT run these checks inline with bash commands. Sub-agents provide parallel execution, better context isolation, and consistent reporting.
916
+
917
+ **Execution mode:** Respect the agent execution mode determined above (see "Agent Execution Mode" section).
918
+
919
+ #### Documentation Issue Detection
920
+
921
+ Check if this is a documentation-only issue by reading the `SEQUANT_ISSUE_TYPE` environment variable:
922
+
923
+ ```bash
924
+ issue_type="${SEQUANT_ISSUE_TYPE:-}"
925
+ ```
926
+
927
+ **If `SEQUANT_ISSUE_TYPE=docs`**, use the lighter docs QA pipeline:
928
+
929
+ - **Skip** type safety sub-agent (no TypeScript changes expected)
930
+ - **Skip** security scan sub-agent (no runtime code changes)
931
+ - **Keep** scope/size check (still useful for docs)
932
+ - **Focus review on:** content accuracy, completeness, formatting, and link validity
933
+
934
+ **Docs QA sub-agents (1 agent instead of 3):**
935
+
936
+ 1. `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Check for broken links in changed markdown files. Report: files count, diff size, broken links, size assessment.")`
937
+
938
+ **If `SEQUANT_ISSUE_TYPE` is not set or is not `docs`**, use the standard pipeline below.
939
+
940
+ #### If parallel mode enabled:
941
+
942
+ **Spawn ALL THREE agents in a SINGLE message (one Tool call per agent, all in same response):**
943
+
944
+ **IMPORTANT:** Background agents need `mode="bypassPermissions"` to execute Bash commands (`git diff`, `npm test`, etc.) without interactive approval. The default `acceptEdits` mode only auto-approves Edit/Write — Bash calls are silently denied. These quality check agents only read and analyze; they never write files or push code, so bypassing permissions is safe.
945
+
946
+ 1. `Agent(subagent_type="sequant-qa-checker", prompt="Run type safety and deleted tests checks on the current branch vs main. Report: type issues count, deleted tests, verdict.")`
947
+
948
+ 2. `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Report: files count, diff size, size assessment.")`
949
+
950
+ 3. `Agent(subagent_type="sequant-qa-checker", prompt="Run security scan on changed files in current branch vs main. Report: critical/warning/info counts, verdict.")`
951
+
952
+ #### If sequential mode (default):
953
+
954
+ **Spawn each agent ONE AT A TIME, waiting for each to complete before the next:**
955
+
956
+ **Note:** Sequential agents run in the foreground where the user can approve Bash interactively. However, for consistency and to avoid approval fatigue, we still use `mode="bypassPermissions"` since these agents only perform read-only quality checks.
957
+
958
+ 1. **First:** `Agent(subagent_type="sequant-qa-checker", prompt="Run type safety and deleted tests checks on the current branch vs main. Report: type issues count, deleted tests, verdict.")`
959
+
960
+ 2. **After #1 completes:** `Agent(subagent_type="sequant-qa-checker", prompt="Run scope and size checks on the current branch vs main. Report: files count, diff size, size assessment.")`
961
+
962
+ 3. **After #2 completes:** `Agent(subagent_type="sequant-qa-checker", prompt="Run security scan on changed files in current branch vs main. Report: critical/warning/info counts, verdict.")`
963
+
964
+ **Add RLS check if admin files modified:**
965
+ ```bash
966
+ admin_modified=$(git diff main...HEAD --name-only | grep -E "^app/admin/" | head -1 || true)
967
+ ```
968
+
969
+ See [quality-gates.md](references/quality-gates.md) for detailed verdict synthesis.
970
+
971
+ ### Using MCP Tools (Optional)
972
+
973
+ - **Sequential Thinking:** For complex multi-step analysis
974
+ - **Context7:** For broader pattern context and library documentation
975
+
976
+ ### 1. Context and AC Alignment
977
+
978
+ - **Read all GitHub issue comments** for complete context
979
+ - Reconstruct the AC checklist (AC-1, AC-2, ...)
980
+ - If AC unclear, state assumptions explicitly
981
+
982
+ ### 2. Code Review
983
+
984
+ Perform a code review focusing on:
985
+
986
+ - Correctness and potential bugs
987
+ - Readability and maintainability
988
+ - Alignment with existing patterns (see CLAUDE.md)
989
+ - TypeScript strictness and type safety
990
+ - **Duplicate utility check:** Verify new utilities don't duplicate existing ones in `docs/patterns/`
991
+
992
+ See [code-review-checklist.md](references/code-review-checklist.md) for integration verification steps.
993
+
994
+ ### 2a. Build Verification (When Build Fails)
995
+
996
+ **When to apply:** `npm run build` fails on the feature branch.
997
+
998
+ **Purpose:** Distinguish between pre-existing build failures (already on main) and regressions introduced by this PR.
999
+
1000
+ **Detection:**
1001
+ ```bash
1002
+ # Run build and capture result
1003
+ npm run build 2>&1
1004
+ BUILD_EXIT_CODE=$?
1005
+ ```
1006
+
1007
+ **If build fails, verify against main:**
1008
+
1009
+ The quality-checks.sh script includes `run_build_with_verification()` which:
1010
+ 1. Runs `npm run build` on the feature branch
1011
+ 2. If it fails, runs build on main branch (via the main repo directory)
1012
+ 3. Compares exit codes and first error lines
1013
+ 4. Produces a "Build Verification" table (see AC-4)
1014
+
1015
+ **Verification Logic:**
1016
+
1017
+ | Feature Build | Main Build | Error Match | Result |
1018
+ |---------------|------------|-------------|--------|
1019
+ | ❌ Fail | ✅ Pass | N/A | **Regression** - failure introduced by PR |
1020
+ | ❌ Fail | ❌ Fail | Same error | **Pre-existing** - not blocking |
1021
+ | ❌ Fail | ❌ Fail | Different | **Unknown** - manual review needed |
1022
+ | ✅ Pass | * | N/A | No verification needed |
1023
+
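The verification table maps directly onto a short conditional. The sketch below is illustrative only: `classify_build` is a hypothetical helper, not the `run_build_with_verification()` function from quality-checks.sh, and it assumes the exit codes and first error lines have already been captured.

```bash
# Hypothetical helper (not part of sequant): classify a build result.
# Args: feature exit code, main exit code, feature first error line, main first error line.
classify_build() {
  local feature_exit="${1:-0}" main_exit="${2:-0}" feature_error="${3:-}" main_error="${4:-}"

  if [ "$feature_exit" -eq 0 ]; then
    echo "none"            # feature build passes: no verification needed
  elif [ "$main_exit" -eq 0 ]; then
    echo "regression"      # fails on the feature branch only
  elif [ "$feature_error" = "$main_error" ]; then
    echo "pre-existing"    # same failure already on main
  else
    echo "unknown"         # both fail with different errors: manual review
  fi
}

# Usage
classify_build 1 1 "TS2322: Type error" "TS2322: Type error"
```

Comparing only the first error line is coarse; two different failures that share a first line would be classified as pre-existing, so the "unknown" path is the safer default when in doubt.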
1024
+ **Verdict Impact:**
1025
+
1026
+ | Build Verification Result | Verdict Impact |
1027
+ |---------------------------|----------------|
1028
+ | Regression detected | `AC_NOT_MET` - must fix before merge |
1029
+ | Pre-existing failure | No impact - document and proceed |
1030
+ | Unknown (different errors) | `AC_MET_BUT_NOT_A_PLUS` - manual review |
1031
+ | Build passes | No impact |
1032
+
1033
+ **Output Format:**
1034
+
1035
+ ```markdown
1036
+ ### Build Verification
1037
+
1038
+ | Check | Status |
1039
+ |-------|--------|
1040
+ | Feature branch build | ❌ Failed |
1041
+ | Main branch build | ❌ Failed |
1042
+ | Error match | ✅ Same error |
1043
+ | Regression | **No** (pre-existing) |
1044
+
1045
+ **Note:** Build failure is pre-existing on main branch. Not blocking this PR.
1046
+ ```
1047
+
1048
+ ### 2b. Test Coverage Transparency (REQUIRED)
1049
+
1050
+ **Purpose:** Report which changed files have corresponding tests, not just "N tests passed."
1051
+
1052
+ **After running `npm test`, you MUST analyze test coverage for changed files:**
1053
+
1054
+ Use the Glob tool to check for corresponding test files:
1055
+ ```
1056
+ # Get changed source files (excluding tests) from git
1057
+ changed=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' | grep -v -E '\.test\.|\.spec\.|__tests__' || true)
1058
+
1059
+ # For each changed file, take its base name (lib/foo.ts -> foo) and use the Glob tool to find matching test files
1060
+ # Glob(pattern="**/${base}.test.*") or Glob(pattern="**/${base}.spec.*")
1061
+ # If no test file found, report "NO TEST: $file"
1062
+ ```
1063
+
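Where the Glob tool is not available, the same lookup can be approximated in plain shell. This is a sketch under assumptions: `find_test_file` and `report_missing_tests` are hypothetical helpers, `find` stands in for the Glob tool, and matching is by base name only, so a test file for an unrelated module with the same name would count as coverage.

```bash
# Hypothetical helper (not part of sequant): print the first test file whose
# name matches a source file's base name.
# Args: changed source file path, directory to search (default: current directory).
find_test_file() {
  local file="$1" root="${2:-.}" base
  base="$(basename "$file")"
  base="${base%.*}"   # strip the extension: lib/foo.ts -> foo
  find "$root" -type f \( -name "${base}.test.*" -o -name "${base}.spec.*" \) \
    ! -path '*/node_modules/*' 2>/dev/null | head -n 1
}

# Hypothetical helper: print one "NO TEST: <file>" line per uncovered file.
# Args: search root, then the changed files.
report_missing_tests() {
  local root="$1" file
  shift
  for file in "$@"; do
    if [ -z "$(find_test_file "$file" "$root")" ]; then
      echo "NO TEST: $file"
    fi
  done
}

# Usage with the `changed` list computed above (one path per line):
# report_missing_tests . $changed
```

File names containing glob characters (for example `[id].tsx`) would be interpreted as patterns by `find -name`, so this sketch is only suitable for conventional file names.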
1064
+ **Required reporting format:**
1065
+
1066
+ | Scenario | Report |
1067
+ |----------|--------|
1068
+ | Tests cover changed files | `Tests: N passed (covers changed files)` |
1069
+ | Tests don't cover changed files | `Tests: N passed (⚠️ 0 cover changed files)` |
1070
+ | No tests for specific files | `Tests: N passed (⚠️ NO TESTS: file1.ts, file2.ts)` |
1071
+
1072
+ **Include in output template:**
1073
+
1074
+ ```markdown
1075
+ ### Test Coverage Analysis
1076
+
1077
+ | Changed File | Has Tests? | Test File |
1078
+ |--------------|------------|-----------|
1079
+ | `lib/foo.ts` | ✅ Yes | `__tests__/foo.test.ts` |
1080
+ | `lib/bar.ts` | ⚠️ No | - |
1081
+
1082
+ **Coverage:** X/Y changed files have tests
1083
+ ```
1084
+
1085
+ ### 2c. Change Tier Classification
1086
+
1087
+ **Purpose:** Flag coverage gaps based on criticality, not just presence/absence.
1088
+
1089
+ **Tier definitions:**
1090
+
1091
+ | Tier | Change Type | Coverage Requirement |
1092
+ |------|-------------|---------------------|
1093
+ | **Critical** | Auth, payments, security, server-actions, middleware, admin | Flag prominently if missing |
1094
+ | **Standard** | Business logic, API handlers, utilities | Note if missing |
1095
+ | **Optional** | Config, types-only, UI tweaks | No flag needed |
1096
+
1097
+ **Detection heuristic:**
1098
+
1099
+ ```bash
1100
+ # Detect critical paths in changed files
1101
+ changed=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' || true)
1102
+ critical=$(echo "$changed" | grep -E 'auth|payment|security|server-action|middleware|admin' || true)
1103
+
1104
+ if [[ -n "$critical" ]]; then
1105
+ echo "⚠️ CRITICAL PATH CHANGES (test coverage strongly recommended):"
1106
+ echo "$critical"
1107
+ fi
1108
+ ```
1109
+
1110
+ **Reporting format:**
1111
+
1112
+ ```markdown
1113
+ ### Change Tiers
1114
+
1115
+ | Tier | Files | Coverage Status |
1116
+ |------|-------|-----------------|
1117
+ | Critical | `auth/login.ts` | ⚠️ NO TESTS - Flag prominently |
1118
+ | Standard | `lib/utils.ts` | Note: No tests |
1119
+ | Optional | `types/index.ts` | OK - Types only |
1120
+ ```
1121
+
1122
+ ### 2d. Test Quality Review
1123
+
1124
+ **When to apply:** Test files were added or modified.
1125
+
1126
+ Evaluate test quality using the checklist:
1127
+ - **Behavior vs Implementation:** Tests assert on outputs, not internals
1128
+ - **Coverage Depth:** Error paths and edge cases covered
1129
+ - **Mock Hygiene:** Only external dependencies mocked
1130
+ - **Test Reliability:** No timing dependencies, deterministic
1131
+
1132
+ See [test-quality-checklist.md](references/test-quality-checklist.md) for detailed evaluation criteria.
1133
+
1134
+ **Flag common issues:**
1135
+ - Over-mocking (4+ modules mocked in single test)
1136
+ - Missing error path tests
1137
+ - Snapshot abuse (>50 line snapshots)
1138
+ - Implementation mirroring
1139
+
1140
+ ### 2e. Anti-Pattern Detection
1141
+
1142
+ **Always run** code pattern checks on changed files:
1143
+
1144
+ ```bash
1145
+ # Get changed TypeScript/JavaScript files
1146
+ changed_files=$(git diff main...HEAD --name-only | grep -E '\.(ts|tsx|js|jsx)$' || true)
1147
+ ```
1148
+
1149
+ **Check for:**
1150
+
1151
+ | Category | Pattern | Risk |
1152
+ |----------|---------|------|
1153
+ | Performance | N+1 query (`await` in loop) | ⚠️ Medium |
1154
+ | Error Handling | Empty catch block | ⚠️ Medium |
1155
+ | Security | Hardcoded secrets | ❌ High |
1156
+ | Security | SQL concatenation | ❌ High |
1157
+ | Security | Server binds all interfaces (`0.0.0.0`) | ❌ High |
1158
+ | Memory | Uncleared interval/timeout | ⚠️ Medium |
1159
+ | A11y | Image without alt | ⚠️ Low |
1160
+
1161
+ **Dependency audit** (when `package.json` modified):
1162
+
1163
+ | Flag | Threshold |
1164
+ |------|-----------|
1165
+ | Low downloads | <1,000/week |
1166
+ | Stale | No updates 12+ months |
1167
+ | License risk | UNLICENSED, or GPL dependency in an MIT-licensed project |
1168
+ | Security | Known vulnerabilities |
1169
+
1170
+ See [anti-pattern-detection.md](references/anti-pattern-detection.md) for detection commands and full criteria.
1171
+
1172
+ ### 2f. Product Review (When New User-Facing Features Added)
1173
+
1174
+ **When to apply:** New CLI commands, MCP tools, configuration options, or other features that end users interact with directly.
1175
+
1176
+ **Detection:**
1177
+ ```bash
1178
+ # Detect user-facing changes
1179
+ cli_added=$(git diff main...HEAD -- bin/cli.ts | grep -E '^\+.*\.command\(' | wc -l | xargs || true)
1180
+ new_commands=$(git diff main...HEAD --name-only | grep -E '^src/commands/' | wc -l | xargs || true)
1181
+ mcp_added=$(git diff main...HEAD --name-only | grep -E '^src/mcp/' | wc -l | xargs || true)
1182
+ config_changed=$(git diff main...HEAD --name-only | grep -E 'settings|config' | wc -l | xargs || true)
1183
+
1184
+ if [[ $((cli_added + new_commands + mcp_added + config_changed)) -gt 0 ]]; then
1185
+ echo "User-facing changes detected - running product review"
1186
+ fi
1187
+ ```
1188
+
1189
+ **If user-facing changes detected, answer these questions:**
1190
+
1191
+ | Question | What to check |
1192
+ |----------|---------------|
1193
+ | **First-time setup:** Can a new user go from zero to working? | List every prerequisite. Try the setup path mentally. |
1194
+ | **Per-environment differences:** Does this work the same everywhere? | macOS/Linux/Windows, different clients/tools, CI vs local |
1195
+ | **What does the user see?** | Walk through the actual UX — wait times, output format, progress indicators |
1196
+ | **What happens after?** | Where's the output? What does the user do next? |
1197
+ | **Failure modes the user will hit:** | Not code edge cases — real scenarios (wrong directory, missing auth, timeout) |
1198
+
1199
+ **Output Format:**
1200
+
1201
+ ```markdown
1202
+ ### Product Review
1203
+
1204
+ **User-facing changes:** [list new commands/tools/options]
1205
+
1206
+ | Question | Finding |
1207
+ |----------|---------|
1208
+ | First-time setup | [All prerequisites identified? Setup path clear?] |
1209
+ | Per-environment | [Any client/platform differences?] |
1210
+ | User sees | [Wait times, output format, progress] |
1211
+ | After completion | [Where output goes, next steps] |
1212
+ | Likely failure modes | [Real user scenarios] |
1213
+
1214
+ **Gaps found:** [list any gaps, or "None"]
1215
+ ```
1216
+
1217
+ **Verdict Impact:**
1218
+
1219
+ | Finding | Verdict Impact |
1220
+ |---------|----------------|
1221
+ | No gaps | No impact |
1222
+ | Missing prerequisites in docs | `AC_MET_BUT_NOT_A_PLUS` |
1223
+ | Feature silently fails in common environment | `AC_NOT_MET` (e.g., wrong cwd, missing auth) |
1224
+ | Poor UX but functional | Note in findings |
1225
+
1226
+ ### 2g. Call-Site Review (When New Functions Added)
1227
+
1228
+ **When to apply:** New exported functions are detected in the diff.
1229
+
1230
+ **Purpose:** Review not just the function implementation but **where** and **how** it's called. A function can be perfectly implemented but called incorrectly at the call site. (Origin: Issue #295 — `rebaseBeforePR()` had thorough unit tests but was called for every issue in a chain loop when the AC specified "only the final branch.")
1231
+
1232
+ **Detection:**
1233
+ ```bash
1234
+ # Find new exported functions (added lines only)
1235
+ # Catches: export function foo, export async function foo,
1236
+ # export const foo = () =>, export const foo = async () =>
1237
+ fn_exports=$(git diff main...HEAD | grep -E '^\+export (async )?function \w+' | sed 's/^+//' | grep -oE 'function \w+' | awk '{print $2}' || true)
1238
+ arrow_exports=$(git diff main...HEAD | grep -E '^\+export const \w+ = (async )?\(' | sed 's/^+//' | grep -oE 'const \w+' | awk '{print $2}' || true)
1239
+ new_exports=$(echo -e "${fn_exports}\n${arrow_exports}" | sed '/^$/d' | sort -u)
1240
+ export_count=$(echo "$new_exports" | grep -c . || true)
1241
+
1242
+ if [[ $export_count -gt 0 ]]; then
1243
+ echo "New exported functions detected: $export_count"
1244
+ echo "$new_exports"
1245
+ fi
1246
+ ```
1247
+
1248
+ **If new exported functions found:**
1249
+
1250
+ #### Step 1: Call-Site Inventory
1251
+
1252
+ For each new exported function, identify ALL call sites using the Grep tool:
1253
+
1254
+ ```
1255
+ # For each new function, find call sites
1256
+ # Use the Grep tool for each function name:
1257
+ Grep(pattern="${func}\\(", glob="*.{ts,tsx}", output_mode="content")
1258
+ # Then exclude test files, __tests__ dirs, and the export definition itself
1259
+ ```
1260
+
1261
+ **Call site types:**
1262
+ - Direct call: `functionName(args)`
1263
+ - Method call: `this.functionName(args)` or `obj.functionName(args)`
1264
+ - Callback: `.then(functionName)` or `array.map(functionName)`
1265
+ - Conditional: `condition && functionName(args)`
1266
+
1267
+ #### Step 2: Condition Audit
1268
+
1269
+ For each call site, document the conditions that gate the call:
1270
+
1271
+ | Condition Type | Example | Check |
1272
+ |----------------|---------|-------|
1273
+ | Guard clause | `if (x) { fn() }` | Does condition match AC? |
1274
+ | Logical AND | `x && fn()` | Is guard sufficient? |
1275
+ | Ternary | `x ? fn() : null` | Correct branch? |
1276
+ | Early return | `if (!x) return; fn()` | Correct logic? |
1277
+
1278
+ **Compare conditions against AC constraints:**
1279
+ - AC says "only when X" → Call site should have `if (X)` guard
1280
+ - AC says "not in Y mode" → Call site should have `if (!Y)` guard
1281
+ - AC says "for Z items" → Call site should filter for Z condition
1282
+
1283
+ #### Step 3: Loop Awareness
1284
+
1285
+ **Detect if function is called inside a loop:**
1286
+
1287
+ ```
1288
+ # For each function, use the Grep tool with context to check surrounding lines:
1289
+ Grep(pattern="${func}\\(", glob="*.{ts,tsx}", output_mode="content", -B=5)
1290
+ # Then inspect the context lines for loop constructs:
1291
+ # for, while, forEach, .map(, .filter(, .reduce(
1292
+ # If a loop is found, flag: "function called inside loop - verify iteration scope"
1293
+ ```
1294
+
1295
+ **Loop iteration review questions:**
1296
+ 1. Should function run for ALL iterations? → OK if yes
1297
+ 2. Should function run for FIRST/LAST only? → Check for index guard
1298
+ 3. Should function run for SOME iterations? → Check for condition filter
1299
+
1300
+ **Red flags:**
1301
+ - Function called unconditionally in loop when AC says "only once"
1302
+ - No break/return after call when AC implies single execution
1303
+ - Missing mode/flag guard when AC specifies conditions
1304
+
1305
+ #### Step 4: Mode Sensitivity
1306
+
1307
+ If the function accepts configuration or mode options:
1308
+ - Is the correct mode passed at the call site?
1309
+ - Are all mode-specific paths exercised appropriately?
1310
+
1311
+ **Output Format:**
1312
+
1313
+ ```markdown
1314
+ ### Call-Site Review
1315
+
1316
+ **New exported functions detected:** N
1317
+
1318
+ | Function | Call Sites | Loop? | Conditions | AC Match |
1319
+ |----------|-----------|-------|------------|----------|
1320
+ | `newFunction()` | `file.ts:123` | No | `if (success)` | ✅ Matches AC-2 |
1321
+ | `anotherFunc()` | `run.ts:456` | Yes | None | ⚠️ Missing guard (AC-3 says "final only") |
1322
+ | `thirdFunc()` | Not called | - | - | ⚠️ Unused export |
1323
+
1324
+ **Findings:**
1325
+ - [List any mismatches between call-site conditions and AC constraints]
1326
+
1327
+ **Recommendations:**
1328
+ - [Specific fixes needed at call sites]
1329
+ ```
1330
+
1331
+ **Verdict Impact:**
1332
+
1333
+ | Finding | Verdict Impact |
1334
+ |---------|----------------|
1335
+ | All call sites match AC | No impact |
1336
+ | Call site missing AC-required guard | `AC_NOT_MET` |
1337
+ | Function not called anywhere | `AC_MET_BUT_NOT_A_PLUS` (dead export) |
1338
+ | Call site in loop, AC unclear about iteration | `NEEDS_VERIFICATION` |
1339
+
1340
+ See [call-site-review.md](references/call-site-review.md) for detailed methodology and examples.
1341
+
1342
+ ### 2h. CLI Registration Verification (When Option Interfaces Modified)
1343
+
1344
+ **When to apply:** `RunOptions` or similar CLI option interfaces are modified in the diff.
1345
+
1346
+ **Purpose:** Detect new option interface fields that have runtime usage (via `mergedOptions.X`) but lack corresponding CLI registration (via `.option()` in `bin/cli.ts`). This class of bug is invisible to TypeScript, build, and unit tests—caught only by manual review or this check.
1347
+
1348
+ **Origin:** Issue #305 — `force?: boolean` was added to `RunOptions`, checked at runtime with `mergedOptions.force`, and referenced in user-facing warnings ("use --force to re-run"), but `--force` was never registered in `bin/cli.ts`. The bug passed QA and was caught only by manual cross-reference.
1349
+
1350
+ **Detection:**
1351
+
1352
+ ```bash
1353
+ # Check if option interfaces or CLI file were modified
1354
+ option_files=$(git diff main...HEAD --name-only | grep -E "batch-executor\.ts|run\.ts|cli\.ts" || true)
1355
+ option_modified=$(echo "$option_files" | grep -v "^$" | wc -l | xargs || echo "0")
1356
+
1357
+ if [[ $option_modified -gt 0 ]]; then
1358
+ echo "Option interface or CLI file modified - running CLI registration verification"
1359
+ fi
1360
+ ```
1361
+
1362
+ **Key File Map:**
1363
+
1364
+ | Interface | Location | CLI Registration |
1365
+ |-----------|----------|------------------|
1366
+ | `RunOptions` | `src/lib/workflow/batch-executor.ts` | `run` command in `bin/cli.ts` |
1367
+
1368
+ **Verification Logic:**
1369
+
1370
+ 1. **Extract new interface fields from diff:**
1371
+ ```bash
1372
+ # Get new fields added to RunOptions (or similar interfaces)
1373
+ new_fields=$(git diff main...HEAD -- src/lib/workflow/batch-executor.ts | \
1374
+ grep -E '^\+\s+\w+\??: ' | \
1375
+ sed -E 's/^\+[[:space:]]*//; s/\??:.*//' | \
1378
+ tr -d ' ' || true)
1379
+ ```
1380
+
1381
+ 2. **Check for runtime usage (mergedOptions.X):**
1382
+ ```bash
1383
+ # For each new field, check if it's used at runtime
1384
+ for field in $new_fields; do
1385
+ runtime_usage=$(git diff main...HEAD | grep -E "mergedOptions\.$field|options\.$field" || true)
1386
+ if [[ -n "$runtime_usage" ]]; then
1387
+ echo "Field '$field' has runtime usage - verify CLI registration"
1388
+ fi
1389
+ done
1390
+ ```
1391
+
1392
+ 3. **Verify CLI registration exists:**
1393
+ ```bash
1394
+ # Extract registered CLI options from bin/cli.ts
1395
+ # Matches patterns like: --force, --dry-run, --timeout
1396
+ registered=$(grep -oE -e '--[a-z][a-z-]*' bin/cli.ts | sed 's/^--//' | sort -u || true)
1397
+
1398
+ # Check if field has corresponding registration
1399
+ # Note: CLI flags use kebab-case, interface fields use camelCase
1400
+ # Example: fieldName → --field-name
1401
+ ```
1402
+
1403
+ 4. **Internal-only field exclusion (AC-5):**
1404
+
1405
+ Fields without runtime `mergedOptions.X` usage are internal-only and don't need CLI registration:
1406
+ - `autoDetectPhases` — set programmatically, not user-facing
1407
+ - `worktreeIsolation` — environment-controlled
1408
+ - Fields only used in type signatures without runtime access
1409
+
1410
+ **Detection:** If `grep "mergedOptions.$field"` returns no matches, the field is internal-only.
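
The camelCase to kebab-case comparison that step 3 leaves as a comment could be sketched as follows. This is a minimal illustration, not the package's implementation: the helper names `to_kebab` and `is_registered` are hypothetical, and `registered` is assumed to be a newline-separated list of flag names without the leading `--`, as produced by the `grep -oE` pipeline in step 3.

```bash
# Hypothetical helper: convert a camelCase interface field to a kebab-case flag name.
to_kebab() {
  printf '%s\n' "$1" \
    | sed -E 's/([a-z0-9])([A-Z])/\1-\2/g' \
    | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed if field $1 has a matching entry in the
# newline-separated flag list $2 (exact, whole-line match).
is_registered() {
  printf '%s\n' "$2" | grep -qxF -- "$(to_kebab "$1")"
}

# Example with a hand-written flag list (assumed data, not read from bin/cli.ts).
registered=$(printf 'force\ndry-run\ntimeout\n')
for field in force dryRun newField; do
  if is_registered "$field" "$registered"; then
    echo "$field: registered as --$(to_kebab "$field")"
  else
    echo "$field: NOT REGISTERED"
  fi
done
```

Matching whole lines with `grep -x` avoids treating `force` as registered merely because `force-push` is.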
1411
+
1412
+ **Output Format:**
1413
+
1414
+ ```markdown
1415
+ ### CLI Registration Verification
1416
+
1417
+ **Option files modified:** Yes/No
1418
+
1419
+ | Interface Field | Runtime Usage | CLI Registered | Status |
1420
+ |----------------|--------------|----------------|--------|
1421
+ | `force` | `mergedOptions.force` (line 2447) | `--force` in bin/cli.ts | ✅ OK |
1422
+ | `newField` | `mergedOptions.newField` (line 500) | NOT REGISTERED | ❌ FAIL |
1423
+ | `internalOnly` | None (internal) | N/A | ⏭️ SKIP |
1424
+
1425
+ **Verification Status:** Passed / Failed / N/A
1426
+ ```
1427
+
1428
+ **Verdict Gating (AC-4):**
1429
+
1430
+ | Verification Status | Maximum Verdict |
1431
+ |---------------------|-----------------|
1432
+ | Passed | READY_FOR_MERGE |
1433
+ | N/A (no option changes) | READY_FOR_MERGE |
1434
+ | Failed | AC_NOT_MET |
1435
+
1436
+ **CRITICAL:** If CLI registration verification = **Failed**, verdict CANNOT be `READY_FOR_MERGE`. Missing CLI registrations mean users cannot access the feature via command line.
1437
+
1438
+ **If verification fails:**
1439
+ 1. Flag the specific fields missing CLI registration
1440
+ 2. Set verdict to `AC_NOT_MET`
1441
+ 3. Include remediation steps:
1442
+ ```markdown
1443
+ **Remediation:**
1444
+ 1. Add to `bin/cli.ts` under the appropriate command:
1445
+ ```typescript
1446
+ .option("--field-name", "Description of what the flag does")
1447
+ ```
1448
+ 2. Verify with `npx sequant <command> --help`
1449
+ ```
1450
+
1451
+ ---
1452
+
1453
+ ### 3. QA vs AC
1454
+
1455
+ For each AC item, mark as:
1456
+ - `MET`
1457
+ - `PARTIALLY_MET`
1458
+ - `NOT_MET`
1459
+
1460
+ Provide a sentence or two explaining why.
1461
+
1462
+ #### AC Literal Verification (REQUIRED)
1463
+
1464
+ **Before marking any AC as MET**, verify the implementation matches the AC text literally, not just in spirit:
1465
+
1466
+ 1. **Extract specific technical claims** from the AC text (commands, flags, function names, config keys, UI elements)
1467
+ 2. **Search the implementation** for each claim using Grep or Read — do not assume presence
1468
+ 3. **If the AC mentions a flag** (e.g., `--file <relevant-files>`), verify that flag appears in the code
1469
+ 4. **If the AC says "works end-to-end"**, trace the full call chain from entry point to execution
1470
+
1471
+ **Example:** If AC says *"shells out to `aider --yes --no-auto-commits --message '<prompt>' --file <relevant-files>`"*:
1472
+ - Verify `--yes` is in args array ✅
1473
+ - Verify `--no-auto-commits` is in args array ✅
1474
+ - Verify `--message` is in args array ✅
1475
+ - Verify `--file` is in args array — **if missing, AC is NOT MET** ❌
1476
+
1477
+ Do NOT mark MET based on "the general intent is satisfied." The AC text is the contract — verify it literally.
1478
+
1479
+ ### 3a. AC Status Persistence — REQUIRED
1480
+
1481
+ **After evaluating each AC item**, update the status in workflow state using the state CLI:
1482
+
1483
+ ```bash
1484
+ # Step 1: Initialize AC items for the issue (run once, before updating statuses)
1485
+ npx tsx scripts/state/update.ts init-ac <issue-number> <ac-count>
1486
+
1487
+ # Example: Initialize 4 AC items for issue #250
1488
+ npx tsx scripts/state/update.ts init-ac 250 4
1489
+ ```
1490
+
1491
+ ```bash
1492
+ # Step 2: Update each AC item's status
1493
+ npx tsx scripts/state/update.ts ac <issue-number> <ac-id> <status> "<notes>"
1494
+
1495
+ # Examples:
1496
+ npx tsx scripts/state/update.ts ac 250 AC-1 met "Verified: tests pass and feature works"
1497
+ npx tsx scripts/state/update.ts ac 250 AC-2 not_met "Missing error handling for edge case"
1498
+ npx tsx scripts/state/update.ts ac 250 AC-3 blocked "Waiting on upstream dependency"
1499
+ ```
1500
+
1501
+ **Status mapping:**
1502
+ - `MET` → `met`
1503
+ - `PARTIALLY_MET` → `not_met` (with notes explaining what's missing)
1504
+ - `NOT_MET` → `not_met`
1505
+ - `BLOCKED` → `blocked` (external dependency issue)
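
The status mapping above can be expressed as a small lookup. This is a sketch only: `map_ac_status` is a hypothetical helper name, and rejecting unknown values with a non-zero exit is an assumption rather than documented behavior of the state CLI.

```bash
# Hypothetical helper: map a QA verdict-level AC status to the value
# passed to the state CLI, following the mapping list above.
map_ac_status() {
  case "$1" in
    MET)                   echo "met" ;;
    PARTIALLY_MET|NOT_MET) echo "not_met" ;;
    BLOCKED)               echo "blocked" ;;
    *) echo "unknown AC status: $1" >&2; return 1 ;;
  esac
}

map_ac_status PARTIALLY_MET
```

The result can then be passed as the `<status>` argument, for example `npx tsx scripts/state/update.ts ac 250 AC-2 "$(map_ac_status PARTIALLY_MET)" "<notes>"`.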
1506
+
1507
+ **Why this matters:** Updating AC status in state enables:
1508
+ - Dashboard shows real-time AC progress per issue
1509
+ - Cross-skill tracking of which AC items need work
1510
+ - Summary badges show "X/Y met" status
1511
+
1512
+ **If issue has no stored AC:**
1513
+ - Run `init-ac` first to create the AC items
1514
+ - Then update each AC status individually
1515
+
1516
+ ### 4. Failure Path & Edge Case Testing (REQUIRED)
1517
+
1518
+ Before any READY_FOR_MERGE verdict, complete the adversarial thinking checklist:
1519
+
1520
+ 1. **"What would break this?"** - Identify and test at least 2 failure scenarios
1521
+ 2. **"What assumptions am I making?"** - List and validate key assumptions
1522
+ 3. **"What's the unhappy path?"** - Test invalid inputs, failed dependencies
1523
+ 4. **"Did I test the feature's PRIMARY PURPOSE?"** - If it handles errors, trigger an error
1524
+
1525
+ See [testing-requirements.md](references/testing-requirements.md) for edge case checklists.
1526
+
1527
+ ### 5. Adversarial Self-Evaluation (REQUIRED)
1528
+
1529
+ **Before issuing your verdict**, you MUST complete this adversarial self-evaluation to catch issues that automated quality checks miss.
1530
+
1531
+ **Why this matters:** QA automation catches type issues, deleted tests, and scope creep - but misses:
1532
+ - Features that don't actually work as expected
1533
+ - Tests that pass but don't test the right things
1534
+ - Edge cases only apparent when actually using the feature
1535
+
1536
+ **Answer these questions honestly:**
1537
+ 1. "Did the implementation actually work when I reviewed it, or am I assuming it works?"
1538
+ 2. "Do the tests actually test the feature's primary purpose, or just pass?"
1539
+ 3. "What's the most likely way this feature could break in production?"
1540
+ 4. "Am I giving a positive verdict because the code looks clean, or because I verified it works?"
1541
+ 5. "Are there 'design choices' I'm excusing that are actually bad practices?" (e.g., no version pinning, leaking secrets to unnecessary env vars, non-portable shell in example code, no input validation). Would I accept this in a code review from a junior developer?
1542
+
1543
+ **Include this section in your output:**
1544
+
1545
+ ```markdown
1546
+ ### Self-Evaluation
1547
+
1548
+ - **Verified working:** [Yes/No - did you actually verify the feature works, or assume it does?]
1549
+ - **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
1550
+ - **Likely failure mode:** [What would most likely break this in production?]
1551
+ - **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
1552
+ ```
1553
+
1554
+ **If any answer reveals concerns:**
1555
+ - Factor the concerns into your verdict
1556
+ - If significant, change verdict to `AC_NOT_MET` or `AC_MET_BUT_NOT_A_PLUS`
1557
+ - Document the concerns in the QA comment
1558
+
1559
+ **Do NOT skip this self-evaluation.** Honest reflection catches issues that code review cannot.
1560
+
1561
+ #### Skill Change Review (Conditional)
1562
+
1563
+ **When to apply:** `.claude/skills/**/*.md` files were modified.
1564
+
1565
+ **Detect skill changes:**
1566
+ ```bash
1567
+ skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/.*\.md$" | wc -l | xargs || true)
1568
+ ```
1569
+
1570
+ **If skills_changed > 0, add these adversarial prompts:**
1571
+
1572
+ | Prompt | Why It Matters |
1573
+ |--------|----------------|
1574
+ | **Command verified:** Did you execute at least one referenced command? | Skill instructions can reference commands that don't work (wrong flags, missing fields) |
1575
+ | **Fields verified:** For JSON commands, do field names match actual output? | Issue #178: `gh pr checks --json conclusion` failed because `conclusion` doesn't exist |
1576
+ | **Patterns complete:** What variations might users write that aren't covered? | Skills define patterns - missing coverage causes silent failures |
1577
+ | **Dependencies explicit:** What CLIs/tools does this skill assume are installed? | Missing `gh`, `npm`, etc. breaks the skill with confusing errors |
1578
+
1579
+ **Example skill-specific self-evaluation:**
1580
+
1581
+ ```markdown
1582
+ ### Skill Change Review
1583
+
1584
+ - [ ] **Command verified:** Executed `gh pr checks --json name,state,bucket` - fields exist ✅
1585
+ - [ ] **Fields verified:** Checked `gh pr checks --help` for valid JSON fields ✅
1586
+ - [ ] **Patterns complete:** Covered SUCCESS, FAILURE, PENDING states ✅
1587
+ - [ ] **Dependencies explicit:** Requires `gh` CLI authenticated ✅
1588
+ ```
1589
+
1590
+ ---
1591
+
1592
+ ### 6. Execution Evidence (REQUIRED for scripts/CLI)
1593
+
1594
+ **When to apply:** `scripts/` or CLI files were modified.
1595
+
1596
+ **Detect change type:**
1597
+ ```bash
1598
+ scripts_changed=$(git diff main...HEAD --name-only | grep -E "^scripts/" | wc -l | xargs || true)
1599
+ cli_changed=$(git diff main...HEAD --name-only | grep -E "(cli|commands?)" | wc -l | xargs || true)
1600
+ ```
1601
+
1602
+ **If scripts/CLI changed, execute at least one smoke command:**
1603
+
1604
+ | Change Type | Required Command |
1605
+ |-------------|------------------|
1606
+ | `scripts/` | `npx tsx scripts/<file>.ts --help` |
1607
+ | CLI commands | `npx sequant <cmd> --help` or `--dry-run` |
1608
+ | Tests only | `npm test -- --grep "feature"` |
1609
+ | Types/config only | Waiver with reason |
1610
+
1611
+ **Capture evidence:**
1612
+ ```bash
1613
+ # Execute and capture
1614
+ npx tsx scripts/example.ts --help 2>&1
1615
+ echo "Exit code: $?"
1616
+ ```
1617
+
1618
+ **Evidence status:**
1619
+ - **Complete:** All required commands executed successfully
1620
+ - **Incomplete:** Some commands not run or failed
1621
+ - **Waived:** Explicit reason documented (types-only, config-only)
1622
+ - **Not Required:** No executable changes
1623
+
1624
+ **Verdict gating:**
1625
+ - `READY_FOR_MERGE` requires evidence status: Complete, Waived, or Not Required
1626
+ - `AC_MET_BUT_NOT_A_PLUS` if evidence is Incomplete
1627
+
1628
+ See [quality-gates.md](references/quality-gates.md) for detailed evidence requirements.
1629
+
1630
+ ---
1631
+
1632
+ ### 6a. Skill Command Verification (REQUIRED for skill changes)
1633
+
1634
+ **When to apply:** `.claude/skills/**/*.md` files were modified.
1635
+
1636
+ **Purpose:** Skills contain instructions with CLI commands. If those commands have wrong syntax, missing flags, or non-existent JSON fields, the skill will fail when used. QA must verify commands actually work before READY_FOR_MERGE.
1637
+
1638
+ **Detect skill changes:**
1639
+ ```bash
1640
+ skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/.*\.md$" || true)
1641
+ skill_count=$(echo "$skills_changed" | grep -c . || true)
1642
+ ```
1643
+
1644
+ **Pre-requisite check:**
1645
+ ```bash
1646
+ # Verify gh CLI is available before running verification
1647
+ if ! command -v gh &>/dev/null; then
1648
+ echo "⚠️ gh CLI not installed - skill command verification skipped"
1649
+ echo "Install: https://cli.github.com/"
1650
+ # Set verification status to "Skipped" with reason
1651
+ fi
1652
+ ```
1653
+
1654
+ **If skill_count > 0, extract and verify commands:**
1655
+
1656
+ #### Step 1: Extract Commands from Changed Skills
1657
+
1658
+ ```bash
1659
+ # Extract command patterns from skill files
1660
+ for skill_file in $skills_changed; do
1661
+ echo "=== Commands in $skill_file ==="
1662
+
1663
+ # Commands at start of line (simple commands)
1664
+ grep -E '^\s*(gh|npm|npx|git)\s+' "$skill_file" 2>/dev/null | head -10
1665
+
1666
+ # Commands in subshells/variable assignments: result=$(gh pr view ...)
1667
+ grep -oE '\$\((gh|npm|npx|git)\s+[^)]+\)' "$skill_file" 2>/dev/null | head -10
1668
+
1669
+ # Commands in inline backticks
1670
+ grep -oE '`(gh|npm|npx|git)\s+[^`]+`' "$skill_file" 2>/dev/null | head -10
1671
+
1672
+ # Commands after pipe or semicolon: ... | gh ... or ; npm ...
1673
+ grep -oE '[|;]\s*(gh|npm|npx|git)\s+[^|;&]+' "$skill_file" 2>/dev/null | head -10
1674
+ done
1675
+ ```
1676
+
1677
+ **Note:** Multi-line commands (using `\` continuation) require manual review. The extraction patterns above capture single-line commands only.
1678
+
1679
+ #### Step 2: Verify Command Syntax
1680
+
1681
+ For each extracted command type:
1682
+
1683
+ | Command Type | Verification Method | Example |
1684
+ |--------------|---------------------|---------|
1685
+ | `gh pr checks --json X` | Check `gh pr checks --help` for valid JSON fields | `gh pr checks --help \| grep -A 30 "JSON FIELDS"` |
1686
+ | `gh issue view --json X` | Check `gh issue view --help` for valid JSON fields | `gh issue view --help \| grep -A 30 "JSON FIELDS"` |
1687
+ | `gh api ...` | Verify endpoint format matches GitHub API | Check endpoint structure |
1688
+ | `npm run <script>` | Verify script exists in package.json | `jq '.scripts["<script>"]' package.json` |
1689
+ | `npx tsx <file>` | Verify file exists | `test -f <file>` |
1690
+ | `git <cmd>` | Verify against `git <cmd> --help` | Check valid flags |
1691
+
1692
+ **JSON Field Validation Example:**
1693
+
1694
+ ```bash
1695
+ # For commands like: gh pr checks --json name,state,conclusion
1696
+ # Verify each field exists
1697
+
1698
+ # Get valid fields
1699
+ valid_fields=$(gh pr checks --help 2>/dev/null | grep -A 50 "JSON FIELDS" | grep -E "^\s+\w+" | awk '{print $1}' || true)
1700
+
1701
+ # Check if "conclusion" is valid (spoiler: it's not)
1702
+ echo "$valid_fields" | grep -qw "conclusion" && echo "✅ conclusion exists" || echo "❌ conclusion NOT a valid field"
1703
+ ```
1704
+
1705
+ #### Step 3: Handle Placeholders
1706
+
1707
+ Commands with placeholders (`<issue-number>`, `$PR_NUMBER`, `${VAR}`) cannot be executed directly.
1708
+
1709
+ **Handling:**
1710
+ - **Skip execution** for commands with placeholders
1711
+ - **Mark as "Syntax verified, execution skipped"**
1712
+ - **Still verify JSON fields** by extracting field names
1713
+
1714
+ ```bash
1715
+ # Example: gh pr checks $pr_number --json name,state,bucket
1716
+ # Can't execute (no $pr_number), but can verify fields
1717
+ echo "name,state,bucket" | tr ',' '\n' | while read -r field; do
1718
+ gh pr checks --help | grep -qw "$field" && echo "✅ $field" || echo "❌ $field"
1719
+ done
1720
+ ```
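
Deciding whether a command contains a placeholder, and pulling out its `--json` field list, might look like the sketch below. Both helper names (`has_placeholder`, `json_fields`) are hypothetical, and the placeholder pattern covers only the three forms named in Step 3 (`<name>`, `$VAR`, `${VAR}`).

```bash
# Hypothetical helper: succeed if the command string contains <name>, $VAR or ${VAR}.
# Command substitutions such as $(...) are not treated as placeholders.
has_placeholder() {
  printf '%s\n' "$1" | grep -qE '<[a-z][a-z-]*>|\$[{]?[A-Za-z_][A-Za-z0-9_]*[}]?'
}

# Hypothetical helper: print the comma-separated --json fields one per line.
json_fields() {
  printf '%s\n' "$1" | sed -nE 's/.*--json[ =]+([A-Za-z_,]+).*/\1/p' | tr ',' '\n'
}

cmd='gh pr checks $pr_number --json name,state,bucket'
if has_placeholder "$cmd"; then
  echo "Syntax verified, execution skipped"
fi
json_fields "$cmd"
```

The extracted field names can then be fed to the `gh pr checks --help` lookup shown above without executing the original command.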
1721
+
1722
+ #### Step 4: Command Verification Status
1723
+
1724
+ | Status | Meaning |
1725
+ |--------|---------|
1726
+ | **Passed** | All commands verified, fields exist |
1727
+ | **Failed** | At least one command has invalid syntax or non-existent fields |
1728
+ | **Skipped** | Commands have placeholders; syntax looks valid but not executed |
1729
+ | **Not Required** | No skill files changed |
1730
+
1731
+ #### Verdict Gating
1732
+
1733
+ **CRITICAL:** If skill command verification = **Failed**, verdict CANNOT be `READY_FOR_MERGE`.
1734
+
1735
+ | Verification Status | Maximum Verdict |
1736
+ |---------------------|-----------------|
1737
+ | Passed | READY_FOR_MERGE |
1738
+ | Skipped | READY_FOR_MERGE (with note about unverified placeholders) |
1739
+ | Failed | AC_MET_BUT_NOT_A_PLUS (blocks merge until fixed) |
1740
+ | Not Required | READY_FOR_MERGE |
1741
+
1742
+ **Output Format:**
1743
+
1744
+ ```markdown
1745
+ ### Skill Command Verification
1746
+
1747
+ **Skill files changed:** 2
1748
+
1749
+ | File | Commands Found | Verification Status |
1750
+ |------|----------------|---------------------|
1751
+ | `.claude/skills/qa/SKILL.md` | 5 | ✅ Passed |
1752
+ | `.claude/skills/exec/SKILL.md` | 3 | ⚠️ Skipped (placeholders) |
1753
+
1754
+ **Commands Verified:**
1755
+ - `gh pr checks --json name,state,bucket` → ✅ All fields exist
1756
+ - `gh issue view --json title,body` → ✅ All fields exist
1757
+
1758
+ **Commands with Issues:**
1759
+ - `gh pr checks --json conclusion` → ❌ Field "conclusion" does not exist
1760
+
1761
+ **Verification Status:** Failed
1762
+ ```
237
1763
 
238
- ### 2. Code Review
1764
+ ---
239
1765
 
240
- Perform a code review focusing on:
1766
+ ### 6b. Smoke Test (CONDITIONAL)
241
1767
 
242
- - Correctness and potential bugs
243
- - Readability and maintainability
244
- - Alignment with existing patterns (see CLAUDE.md)
245
- - TypeScript strictness and type safety
246
- - **Duplicate utility check:** Verify new utilities don't duplicate existing ones in `docs/patterns/`
1768
+ **When to apply:** Feature changes workflow behavior (skills, CLI commands, scripts).
247
1769
 
248
- See [code-review-checklist.md](references/code-review-checklist.md) for integration verification steps.
1770
+ **Detection:**
1771
+ ```bash
1772
+ # Detect workflow-affecting changes
1773
+ skills_changed=$(git diff main...HEAD --name-only | grep -E "^\.claude/skills/" | wc -l | xargs || true)
1774
+ scripts_changed=$(git diff main...HEAD --name-only | grep -E "^scripts/" | wc -l | xargs || true)
1775
+ cli_changed=$(git diff main...HEAD --name-only | grep -E "^(src/cli|bin)/" | wc -l | xargs || true)
249
1776
 
250
- ### 3. QA vs AC
1777
+ if [[ $((skills_changed + scripts_changed + cli_changed)) -gt 0 ]]; then
1778
+ echo "Smoke test recommended for workflow changes"
1779
+ fi
1780
+ ```
251
1781
 
252
- For each AC item, mark as:
253
- - `MET`
254
- - `PARTIALLY_MET`
255
- - `NOT_MET`
1782
+ **Smoke Test Checklist:**
1783
+ 1. **Happy path:** Execute the primary use case
1784
+ 2. **Edge cases:** Test graceful handling (missing deps, invalid input)
1785
+ 3. **Error detection:** Verify errors are caught and reported
256
1786
 
257
- Provide a sentence or two explaining why.
1787
+ **Output Format:**
258
1788
 
259
- ### 4. Failure Path & Edge Case Testing (REQUIRED)
1789
+ | Test | Command | Result | Notes |
1790
+ |------|---------|--------|-------|
1791
+ | Happy path | `[command]` | ✅/❌ | [observation] |
1792
+ | Edge case | `[command]` | ✅/❌ | [observation] |
1793
+ | Error handling | `[command]` | ✅/❌ | [observation] |
260
1794
 
261
- Before any READY_FOR_MERGE verdict, complete the adversarial thinking checklist:
1795
+ **Smoke Test Status:**
1796
+ - **Complete:** All applicable tests passed
1797
+ - **Partial:** Some tests skipped or failed (document why)
1798
+ - **Not Required:** No workflow-affecting changes
262
1799
 
263
- 1. **"What would break this?"** - Identify and test at least 2 failure scenarios
264
- 2. **"What assumptions am I making?"** - List and validate key assumptions
265
- 3. **"What's the unhappy path?"** - Test invalid inputs, failed dependencies
266
- 4. **"Did I test the feature's PRIMARY PURPOSE?"** - If it handles errors, trigger an error
1800
+ **Verdict Impact:**
267
1801
 
268
- See [testing-requirements.md](references/testing-requirements.md) for edge case checklists.
1802
+ | Smoke Test Status | Verdict Impact |
1803
+ |-------------------|----------------|
1804
+ | Complete | No impact (positive signal) |
1805
+ | Partial | → `AC_MET_BUT_NOT_A_PLUS` (document gaps) |
1806
+ | Not Required | No impact |
1807
+
1808
+ ---
269
1809
 
270
- ### 5. A+ Status Verdict
1810
+ ### 7. A+ Status Verdict
271
1811
 
272
1812
  Provide an overall verdict:
273
1813
 
@@ -279,28 +1819,47 @@ Provide an overall verdict:
279
1819
  **Verdict Determination Algorithm (REQUIRED):**
280
1820
 
281
1821
  ```text
282
- 1. Count AC statuses:
283
- - met_count = ACs with status MET
284
- - partial_count = ACs with status PARTIALLY_MET
285
- - pending_count = ACs with status PENDING
286
- - not_met_count = ACs with status NOT_MET
287
-
288
- 2. Browser testing enforcement check:
1822
+ 1. Count AC statuses (INCLUDES both original AND derived ACs):
1823
+ - met_count = ACs with status MET (original + derived)
1824
+ - partial_count = ACs with status PARTIALLY_MET (original + derived)
1825
+ - pending_count = ACs with status PENDING (original + derived)
1826
+ - not_met_count = ACs with status NOT_MET (original + derived)
1827
+
1828
+ NOTE: Derived ACs are treated IDENTICALLY to original ACs.
1829
+ A derived AC marked NOT_MET will block merge just like an original AC.
1830
+
1831
+ 2. Check verification gates:
1832
+ - skill_verification = status from Section 6a (Passed/Failed/Skipped/Not Required)
1833
+ - execution_evidence = status from Section 6 (Complete/Incomplete/Waived/Not Required)
1834
+ - quality_plan_status = status from Phase 0b (Complete/Partial/Not Addressed/N/A)
1835
+ - smoke_test_status = status from Section 6b (Complete/Partial/Not Required)
1836
+
1837
+ 3. Browser testing enforcement check:
289
1838
  - Check if any .tsx files were changed: git diff main...HEAD --name-only | grep '\.tsx$' || true
290
1839
  - Check if /test phase ran: look for test phase marker in issue comments
291
1840
  - Check if issue has 'no-browser-test' label
292
1841
  - IF .tsx files changed AND /test did NOT run AND no 'no-browser-test' label:
293
- Force AC_MET_BUT_NOT_A_PLUS with note:
294
- "Browser testing recommended: .tsx files were modified but no /test phase ran.
295
- Add 'ui' label to enable browser testing, or 'no-browser-test' to opt out."
1842
+ Set browser_test_missing = true
296
1843
 
297
- 3. Determine verdict (in order):
1844
+ 4. Determine verdict (in order):
298
1845
  - IF not_met_count > 0 OR partial_count > 0:
299
1846
  → AC_NOT_MET (block merge)
1847
+ - ELSE IF skill_verification == "Failed":
1848
+ → AC_MET_BUT_NOT_A_PLUS (skill commands have issues - cannot be READY_FOR_MERGE)
1849
+ - ELSE IF execution_evidence == "Incomplete":
1850
+ → AC_MET_BUT_NOT_A_PLUS (scripts not verified - cannot be READY_FOR_MERGE)
1851
+ - ELSE IF quality_plan_status == "Not Addressed" AND quality_plan_exists:
1852
+ → AC_MET_BUT_NOT_A_PLUS (quality dimensions not addressed - flag for review)
1853
+ - ELSE IF browser_test_missing (from step 3):
1854
+ → AC_MET_BUT_NOT_A_PLUS (browser testing recommended for .tsx changes)
1855
+ Note: "Browser testing recommended: .tsx files modified without /test phase.
1856
+ Add 'ui' label to enable, or 'no-browser-test' to opt out."
300
1857
  - ELSE IF pending_count > 0:
301
1858
  → NEEDS_VERIFICATION (wait for verification)
302
- - ELSE IF browser_test_missing (from step 2):
303
- → AC_MET_BUT_NOT_A_PLUS (browser testing recommended)
1859
+ - ELSE IF quality_plan_status == "Partial":
1860
+ → AC_MET_BUT_NOT_A_PLUS (some quality dimensions incomplete - can merge with notes)
1861
+ - ELSE IF smoke_test_status == "Partial":
1862
+ → AC_MET_BUT_NOT_A_PLUS (smoke tests incomplete - document gaps before merge)
304
1863
  - ELSE IF improvement_suggestions.length > 0:
305
1864
  → AC_MET_BUT_NOT_A_PLUS (can merge with notes)
306
1865
  - ELSE:
@@ -338,18 +1897,22 @@ fi
338
1897
 
339
1898
  **CRITICAL:** `PARTIALLY_MET` is NOT sufficient for merge. It MUST be treated as `NOT_MET` for verdict purposes.
340
1899
 
1900
+ **CRITICAL:** If skill command verification = "Failed", verdict CANNOT be `READY_FOR_MERGE`. This prevents shipping skills with broken commands (like issue #178's `conclusion` field).
1901
+
341
1902
  See [quality-gates.md](references/quality-gates.md) for detailed verdict criteria.
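
The ordering of the verdict checks shown in this hunk can be sketched as a shell function. This is an illustration under stated assumptions: the variable names mirror the pseudocode, `suggestion_count` stands in for `improvement_suggestions.length`, and because the outcome of the final `ELSE` branch is elided in this diff, the sketch returns the placeholder `NO_GATE_TRIGGERED` there instead of guessing it.

```bash
# Sketch of the verdict ordering; inputs are plain shell variables (assumed names).
determine_verdict() {
  if [ "${not_met_count:-0}" -gt 0 ] || [ "${partial_count:-0}" -gt 0 ]; then
    echo "AC_NOT_MET"
  elif [ "${skill_verification:-}" = "Failed" ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  elif [ "${execution_evidence:-}" = "Incomplete" ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  elif [ "${quality_plan_status:-}" = "Not Addressed" ] && [ "${quality_plan_exists:-false}" = "true" ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  elif [ "${browser_test_missing:-false}" = "true" ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  elif [ "${pending_count:-0}" -gt 0 ]; then
    echo "NEEDS_VERIFICATION"
  elif [ "${quality_plan_status:-}" = "Partial" ] || [ "${smoke_test_status:-}" = "Partial" ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  elif [ "${suggestion_count:-0}" -gt 0 ]; then
    echo "AC_MET_BUT_NOT_A_PLUS"
  else
    # Placeholder: the final ELSE outcome is not visible in this diff.
    echo "NO_GATE_TRIGGERED"
  fi
}

# Example: one PARTIALLY_MET item blocks merge regardless of the other gates.
partial_count=1
determine_verdict
```

Because the branches are evaluated in order, a failed skill verification takes precedence over pending ACs, which matches the position of `NEEDS_VERIFICATION` in the list.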
342
1903
 
1904
+ ---
1905
+
343
1906
  ## Automated Quality Checks (Reference)
344
1907
 
345
1908
  **Note:** These commands are what the sub-agents execute internally. You do NOT run these directly — the sub-agents spawned above handle this. This section is reference documentation only.
346
1909
 
347
1910
  ```bash
348
1911
  # Type safety
349
- type_issues=$(git diff main...HEAD | grep -E ":\s*any[,)]|as any" | wc -l | xargs || echo "0")
1912
+ type_issues=$(git diff main...HEAD | grep -E ":\s*any[,)]|as any" | wc -l | xargs || true)
350
1913
 
351
1914
  # Deleted tests
352
- deleted_tests=$(git diff main...HEAD --diff-filter=D --name-only | grep -E "\\.test\\.|\\spec\\." | wc -l | xargs || echo "0")
1915
+ deleted_tests=$(git diff main...HEAD --diff-filter=D --name-only | grep -E "\\.test\\.|\\.spec\\." | wc -l | xargs || true)
353
1916
 
354
1917
  # Scope check
355
1918
  files_changed=$(git diff main...HEAD --name-only | wc -l | xargs)
@@ -364,7 +1927,7 @@ npx tsx scripts/lib/__tests__/run-security-scan.ts 2>/dev/null
364
1927
 
365
1928
  See [scripts/quality-checks.sh](scripts/quality-checks.sh) for the complete automation script.
366
1929
 
367
- ### 6. Draft Review/QA Comment
1930
+ ### 8. Draft Review/QA Comment
368
1931
 
369
1932
  Produce a Markdown snippet for the PR/issue:
370
1933
  - Short summary of the change
@@ -372,7 +1935,14 @@ Produce a Markdown snippet for the PR/issue:
372
1935
  - Key strengths and issues
373
1936
  - Clear, actionable next steps
374
1937
 
375
- ### 7. Update GitHub Issue
1938
+ ### 9. Update GitHub Issue
1939
+
1940
+ **If orchestrated (SEQUANT_ORCHESTRATOR is set):**
1941
+ - Skip posting GitHub comment (orchestrator handles aggregated summary)
1942
+ - Include verdict and AC coverage in output for orchestrator to capture
1943
+ - Let orchestrator update labels based on final workflow status
1944
+
1945
+ **If standalone:**
376
1946
 
377
1947
  Post the draft comment to GitHub and update labels:
378
1948
 
@@ -381,7 +1951,7 @@ Post the draft comment to GitHub and update labels:
381
1951
  - `AC_MET_BUT_NOT_A_PLUS`: add `needs-improvement` label
382
1952
  - `NEEDS_VERIFICATION`: add `needs-verification` label
383
1953
 
384
- ### 8. Documentation Reminder
1954
+ ### 10. Documentation Reminder
385
1955
 
386
1956
  If verdict is `READY_FOR_MERGE` or `AC_MET_BUT_NOT_A_PLUS`:
387
1957
 
@@ -389,13 +1959,84 @@ If verdict is `READY_FOR_MERGE` or `AC_MET_BUT_NOT_A_PLUS`:
389
1959
  **Documentation:** Before merging, run `/docs <issue>` to generate feature documentation.
390
1960
  ```
391
1961
 
392
- ### 9. Script/CLI Execution Verification
1962
+ ### 10a. CHANGELOG Quality Gate (REQUIRED)
1963
+
1964
+ **Purpose:** Verify user-facing changes have corresponding CHANGELOG entries before `READY_FOR_MERGE`.
1965
+
1966
+ **Detection:**
1967
+
1968
+ ```bash
1969
+ # Check if CHANGELOG.md exists
1970
+ if [ ! -f "CHANGELOG.md" ]; then
1971
+ echo "No CHANGELOG.md found - skip CHANGELOG check"
1972
+ exit 0
1973
+ fi
1974
+
1975
+ # Check if [Unreleased] section has entries
1976
+ unreleased_entries=$(sed -n '/^## \[Unreleased\]/,/^## \[/p' CHANGELOG.md | grep -E '^\s*-' | wc -l | xargs || true)
1977
+
1978
+ # Determine if change is user-facing (new features, bug fixes, etc.)
1979
+ # Look at commit messages or file changes
1980
+ user_facing=$(git log main..HEAD --oneline | grep -iE '^[a-f0-9]+ (feat|fix|perf|refactor|docs)(\([^)]*\))?!?:' | wc -l | xargs || true)
1981
+ ```
1982
+
1983
+ **Verification Logic:**
1984
+
1985
+ | Condition | CHANGELOG Entry Required? | Action |
1986
+ |-----------|---------------------------|--------|
1987
+ | User-facing changes detected + CHANGELOG exists | ✅ Yes | Check for `[Unreleased]` entry |
1988
+ | User-facing changes + no entry | ⚠️ Block | Flag as missing CHANGELOG |
1989
+ | Non-user-facing changes (test, ci, chore) | ❌ No | Skip check |
1990
+ | No CHANGELOG.md in repo | ❌ No | Skip check |
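
The table above reduces to a decision on the two counts computed in the detection step. A minimal sketch, assuming `user_facing` and `unreleased_entries` hold non-negative integers; the function name `changelog_status` and its three result strings are illustrative labels, not output defined by the package.

```bash
# Hypothetical helper: $1 = user-facing commit count, $2 = [Unreleased] entry count.
changelog_status() {
  if [ "$1" -eq 0 ]; then
    echo "N/A"        # non-user-facing changes only: skip the check
  elif [ "$2" -gt 0 ]; then
    echo "Present"    # entry found under [Unreleased]
  else
    echo "Missing"    # user-facing change without an entry: blocks READY_FOR_MERGE
  fi
}

changelog_status 2 0
```

The case of a repository with no `CHANGELOG.md` is handled earlier by the existence check, so it is not repeated here.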
1991
+
1992
+ **If CHANGELOG entry is missing:**
1993
+
1994
+ 1. Do NOT give `READY_FOR_MERGE` verdict
1995
+ 2. Set verdict to `AC_MET_BUT_NOT_A_PLUS` with note:
1996
+ ```markdown
1997
+ **CHANGELOG:** Missing entry for user-facing changes. Add entry to `## [Unreleased]` section before merging.
1998
+ ```
1999
+ 3. Include this in the draft review comment
2000
+
2001
+ **CHANGELOG Entry Validation:**
2002
+
2003
+ When an entry exists, verify it follows the format:
2004
+ - Starts with action verb (Add, Fix, Update, Remove, Improve)
2005
+ - Includes issue number `(#123)`
2006
+ - Is under the correct section (Added, Fixed, Changed, etc.)
2007
+
2008
+ **Example validation:**
2009
+
2010
+ ```markdown
2011
+ ### CHANGELOG Verification
2012
+
2013
+ | Check | Status |
2014
+ |-------|--------|
2015
+ | CHANGELOG.md exists | ✅ Found |
2016
+ | User-facing changes | ✅ Yes (feat: commit detected) |
2017
+ | [Unreleased] entry | ✅ Present |
2018
+ | Entry format | ✅ Valid (includes issue number) |
2019
+
2020
+ **Result:** CHANGELOG requirements met
2021
+ ```
2022
+
2023
+ **If CHANGELOG is not required:**
2024
+
2025
+ ```markdown
2026
+ ### CHANGELOG Verification
2027
+
2028
+ **Result:** N/A (non-user-facing changes only)
2029
+ ```
2030
+
2031
+ ---
2032
+
2033
+ ### 11. Script/CLI Execution Verification
393
2034
 
394
2035
  **REQUIRED for CLI/script features:** When `scripts/` files are modified, execution verification is required before `READY_FOR_MERGE`.
395
2036
 
396
2037
  **Detection:**
397
2038
  ```bash
398
- scripts_changed=$(git diff main...HEAD --name-only | grep "^scripts/" | wc -l | xargs || echo "0")
2039
+ scripts_changed=$(git diff main...HEAD --name-only | grep -E "^(scripts/|templates/scripts/)" | wc -l | xargs || true)
399
2040
  if [[ $scripts_changed -gt 0 ]]; then
400
2041
  echo "Script changes detected. Run /verify before READY_FOR_MERGE"
401
2042
  fi
@@ -409,7 +2050,7 @@ fi
409
2050
 
410
2051
  **If no verification evidence exists:**
411
2052
  1. Prompt: "Script changes detected but no execution verification found. Run `/verify <issue> --command \"<test command>\"` before READY_FOR_MERGE verdict."
412
- 2. Do NOT give `READY_FOR_MERGE` verdict until verification is complete (unless an approved override applies — see Section 9a)
2053
+ 2. Do NOT give `READY_FOR_MERGE` verdict until verification is complete (unless an approved override applies — see Section 11a)
413
2054
  3. Verdict should be `AC_MET_BUT_NOT_A_PLUS` with note about missing verification
414
2055
 
415
2056
  **Why this matters:**
@@ -418,7 +2059,7 @@ fi
418
2059
 
419
2060
  **Example workflow:**
420
2061
  ```bash
421
- # QA detects scripts/ changes
2062
+ # QA detects scripts/ or templates/scripts/ changes
422
2063
  # -> Prompt: "Run /verify before READY_FOR_MERGE"
423
2064
 
424
2065
  /verify 558 --command "npx tsx scripts/migrate.ts --dry-run"
@@ -429,7 +2070,7 @@ fi
429
2070
  /qa 558 # Re-run, now sees verification, can give READY_FOR_MERGE
430
2071
  ```
431
2072
 
432
- ### 9a. Script Verification Override
2073
+ ### 11a. Script Verification Override
433
2074
 
434
2075
  In some cases, `/verify` execution can be safely skipped when script changes are purely cosmetic or have no runtime impact. **Overrides require explicit justification and risk assessment.**
435
2076
 
@@ -484,33 +2125,123 @@ In some cases, `/verify` execution can be safely skipped when script changes are
484
2125
 
485
2126
  ---
486
2127
 
2128
+ ## State Tracking
2129
+
2130
+ **IMPORTANT:** Update workflow state when running standalone (not orchestrated).
2131
+
2132
+ ### State Updates (Standalone Only)
2133
+
2134
+ When NOT orchestrated (`SEQUANT_ORCHESTRATOR` is not set):
2135
+
2136
+ **At skill start:**
2137
+ ```bash
2138
+ npx tsx scripts/state/update.ts start <issue-number> qa
2139
+ ```
2140
+
2141
+ **On successful completion (READY_FOR_MERGE or AC_MET_BUT_NOT_A_PLUS):**
2142
+ ```bash
2143
+ npx tsx scripts/state/update.ts complete <issue-number> qa
2144
+ npx tsx scripts/state/update.ts status <issue-number> ready_for_merge
2145
+ ```
2146
+
2147
+ **On failure (AC_NOT_MET):**
2148
+ ```bash
2149
+ npx tsx scripts/state/update.ts fail <issue-number> qa "AC not met"
2150
+ ```
2151
+
2152
+ **Why this matters:** State tracking enables dashboard visibility, resume capability, and workflow orchestration. Skills update state when standalone; orchestrators handle state when running workflows.
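
One way to keep the standalone-only rule in a single place is a thin wrapper around the state CLI. This is a sketch, not part of the package: `update_state` and the `STATE_CMD` override are hypothetical names, and the override exists only so the wrapper can be exercised without running `npx`.

```bash
# Hypothetical override point; defaults to the state CLI used in this section.
STATE_CMD="${STATE_CMD:-npx tsx scripts/state/update.ts}"

# Run a state update only when not orchestrated.
update_state() {
  if [ -n "${SEQUANT_ORCHESTRATOR:-}" ]; then
    return 0   # orchestrator owns state; do nothing
  fi
  # Intentionally unquoted so the default command splits into words.
  $STATE_CMD "$@"
}
```

With this in place, `update_state start <issue-number> qa` behaves like the direct call when standalone and is a no-op under an orchestrator.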
2153
+
2154
+ ---
2155
+
487
2156
  ## Output Verification
488
2157
 
489
2158
  **Before responding, verify your output includes ALL of these:**
490
2159
 
491
- - [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, or NOT_MET
492
- - [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, AC_NOT_MET
2160
+ ### Simple Fix Mode (`SMALL_DIFF=true`)
2161
+
2162
+ When the size gate determined `SMALL_DIFF=true`, use the **simplified output template**. The following sections are **omitted** (not marked N/A — completely absent):
2163
+
2164
+ - Quality Plan Verification
2165
+ - Incremental QA Summary
2166
+ - Call-Site Review
2167
+ - Product Review
2168
+ - Smoke Test
2169
+ - CLI Registration Verification
2170
+ - Skill Command Verification
2171
+ - Script Verification Override
2172
+ - Skill Change Review
2173
+
2174
+ **Required sections for simple fix mode:**
2175
+
2176
+ - [ ] **Size Gate** - Size gate decision table with threshold, diff size, and decision
2177
+ - [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, NOT_MET, PENDING, or N/A
2178
+ - [ ] **Quality Metrics** - Type issues, deleted tests, files changed, additions/deletions (from inline checks)
2179
+ - [ ] **Code Review Findings** - Strengths, issues, suggestions
2180
+ - [ ] **Test Coverage Analysis** - Changed files with/without tests, critical paths flagged
2181
+ - [ ] **Anti-Pattern Detection** - Code patterns check (lightweight)
2182
+ - [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included
2183
+ - [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, NEEDS_VERIFICATION, AC_NOT_MET
2184
+ - [ ] **Documentation Check** - README/docs updated if feature adds new functionality
2185
+ - [ ] **Next Steps** - Clear, actionable recommendations
2186
+
2187
+ ### Standard QA (Implementation Exists, `SMALL_DIFF=false`)
2188
+
2189
+ - [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included in output
2190
+ - [ ] **AC Coverage** - Each AC item marked as MET, PARTIALLY_MET, NOT_MET, PENDING, or N/A
2191
+ - [ ] **Quality Plan Verification** - Included if quality plan exists (or marked N/A if no quality plan)
2192
+ - [ ] **CI Status** - Included if PR exists (or marked "No PR" / "No CI configured")
2193
+ - [ ] **Verdict** - One of: READY_FOR_MERGE, AC_MET_BUT_NOT_A_PLUS, NEEDS_VERIFICATION, AC_NOT_MET
493
2194
  - [ ] **Quality Metrics** - Type issues, deleted tests, files changed, additions/deletions
2195
+ - [ ] **Cache Status** - Included if caching enabled (or marked N/A if --no-cache)
2196
+ - [ ] **Build Verification** - Included if build failed (or marked N/A if build passed)
2197
+ - [ ] **Test Coverage Analysis** - Changed files with/without tests, critical paths flagged
494
2198
  - [ ] **Code Review Findings** - Strengths, issues, suggestions
2199
+ - [ ] **Test Quality Review** - Included if test files modified (or marked N/A)
2200
+ - [ ] **Anti-Pattern Detection** - Dependency audit (if package.json changed) + code patterns
2201
+ - [ ] **Call-Site Review** - Included if new exported functions detected (or marked N/A)
2202
+ - [ ] **Execution Evidence** - Included if scripts/CLI modified (or marked N/A)
495
2203
  - [ ] **Script Verification Override** - Included if scripts/CLI modified AND /verify was skipped (with justification and risk assessment)
2204
+ - [ ] **Skill Command Verification** - Included if `.claude/skills/**/*.md` modified (or marked N/A)
2205
+ - [ ] **Skill Change Review** - Skill-specific adversarial prompts included if skills changed
2206
+ - [ ] **Smoke Test** - Included if workflow-affecting changes (skills, scripts, CLI), or marked "Not Required"
2207
+ - [ ] **CHANGELOG Verification** - User-facing changes have `[Unreleased]` entry (or marked N/A)
496
2208
  - [ ] **Documentation Check** - README/docs updated if feature adds new functionality
497
2209
  - [ ] **Next Steps** - Clear, actionable recommendations
498
2210
 
499
- **DO NOT respond until all items are verified.**
2211
+ ### Early Exit (No Implementation)
2212
+
2213
+ When early exit is triggered (no commits, no uncommitted changes, no PR):
2214
+
2215
+ - [ ] **Implementation Status** - Clearly states "NOT FOUND"
2216
+ - [ ] **Verdict** - Must be `AC_NOT_MET`
2217
+ - [ ] **Next Steps** - Directs user to run `/exec` first
2218
+ - [ ] **Sub-agents NOT spawned** - Quality check agents were skipped
2219
+
2220
+ **DO NOT respond until all applicable items are verified.**
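One way to make the checklist above mechanical is to scan a drafted QA report for the expected headings before responding. The sketch below is an assumption-laden illustration: `missing_sections` is a hypothetical helper, the report is assumed to be saved to a file, and each required section is assumed to appear as a `### <name>` heading as in the output templates.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): prints every required section
# name whose "### <name>" heading does not occur in the given report file.
missing_sections() {
  local file="$1"
  shift
  local heading
  for heading in "$@"; do
    # -q: no output, -F: treat the heading as a fixed string, not a regex.
    grep -qF -- "### $heading" "$file" || printf '%s\n' "$heading"
  done
}
```

For example, `missing_sections qa-report.md "Size Gate" "AC Coverage" "Verdict" "Next Steps"` prints nothing when all four headings are present. The file name `qa-report.md` is illustrative only. A heading check cannot confirm that a section's content is adequate, so it complements the checklist without replacing it.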
500
2221
 
501
2222
  ## Output Template
502
2223
 
503
- You MUST include these sections:
2224
+ ### Simple Fix Template (`SMALL_DIFF=true`)
2225
+
2226
+ When the size gate triggers simple fix mode, use this shorter template:
504
2227
 
505
2228
  ```markdown
506
- ## QA Review for Issue #<N>
2229
+ ## QA Review for Issue #<N> (Simple Fix)
2230
+
2231
+ ### Size Gate
2232
+
2233
+ | Check | Value |
2234
+ |-------|-------|
2235
+ | Diff size | N lines (threshold: T) |
2236
+ | package.json changed | No |
2237
+ | Security-sensitive paths | No |
2238
+ | Decision | **Inline checks** |
507
2239
 
508
2240
  ### AC Coverage
509
2241
 
510
2242
  | AC | Description | Status | Notes |
511
2243
  |----|-------------|--------|-------|
512
- | AC-1 | [description] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
513
- | AC-2 | [description] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
2244
+ | AC-1 | [description] | MET/NOT_MET | [explanation] |
514
2245
 
515
2246
  **Coverage:** X/Y AC items fully met
516
2247
 
@@ -525,6 +2256,191 @@ You MUST include these sections:
525
2256
  | Files changed | X | OK/WARN |
526
2257
  | Lines added | +X | - |
527
2258
  | Lines deleted | -X | - |
2259
+ | Security patterns | X | OK/WARN |
2260
+
2261
+ ---
2262
+
2263
+ ### Code Review
2264
+
2265
+ **Strengths:**
2266
+ - [Positive findings]
2267
+
2268
+ **Issues:**
2269
+ - [Problems found]
2270
+
2271
+ **Suggestions:**
2272
+ - [Improvements recommended]
2273
+
2274
+ ---
2275
+
2276
+ ### Test Coverage Analysis
2277
+
2278
+ | Changed File | Tier | Has Tests? | Test File |
2279
+ |--------------|------|------------|-----------|
2280
+ | `[file]` | Critical/Standard/Optional | Yes/No | `[test file or -]` |
2281
+
2282
+ **Coverage:** X/Y changed source files have corresponding tests
2283
+
2284
+ ---
2285
+
2286
+ ### Anti-Pattern Detection
2287
+
2288
+ | File:Line | Category | Pattern | Suggestion |
2289
+ |-----------|----------|---------|------------|
2290
+ | [location] | [category] | [pattern] | [fix] |
2291
+
2292
+ ---
2293
+
2294
+ ### Self-Evaluation
2295
+
2296
+ - **Verified working:** [Yes/No]
2297
+ - **Test efficacy:** [High/Medium/Low]
2298
+ - **Likely failure mode:** [description]
2299
+ - **Verdict confidence:** [High/Medium/Low]
2300
+
2301
+ ---
2302
+
2303
+ ### Verdict: [READY_FOR_MERGE | AC_MET_BUT_NOT_A_PLUS | NEEDS_VERIFICATION | AC_NOT_MET]
2304
+
2305
+ [Explanation of verdict]
2306
+
2307
+ ### Documentation
2308
+
2309
+ - [ ] N/A - Simple fix, no documentation needed
2310
+ - [ ] README/docs updated
2311
+
2312
+ ### Next Steps
2313
+
2314
+ 1. [Action item]
2315
+ ```
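The `**Coverage:** X/Y AC items fully met` line in the template can be derived from the AC table instead of counted by hand. The following is a rough sketch under stated assumptions: `ac_coverage` is a hypothetical helper, table rows start with `| AC-`, the status sits in the third column as in the Simple Fix table, and only an exact `MET` counts as fully met.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sequant): reads a Markdown AC table from
# a file argument or stdin and prints "<met>/<total>".
ac_coverage() {
  awk -F'|' '
    /^[|] AC-/ {
      total++
      status = $4            # fields: "", AC, Description, Status, Notes
      gsub(/ /, "", status)  # strip the padding spaces around the cell
      if (status == "MET") met++
    }
    END { printf "%d/%d\n", met, total }
  ' "$@"
}
```

The Standard Template table has an extra `Source` column, so the status would sit one field further right (`$5`) there; the sketch does not try to detect that automatically.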
2316
+
2317
+ ---
2318
+
2319
+ ### Standard Template (`SMALL_DIFF=false`)
2320
+
2321
+ You MUST include these sections:
2322
+
2323
+ ```markdown
2324
+ ## QA Review for Issue #<N>
2325
+
2326
+ ### AC Coverage
2327
+
2328
+ | AC | Source | Description | Status | Notes |
2329
+ |----|--------|-------------|--------|-------|
2330
+ | AC-1 | Original | [description] | MET/PARTIALLY_MET/NOT_MET/PENDING/N/A | [explanation] |
2331
+ | AC-2 | Original | [description] | MET/PARTIALLY_MET/NOT_MET/PENDING/N/A | [explanation] |
2332
+ | **Derived ACs** | | | | |
2333
+ | AC-6 | Derived (Error Handling) | [description from quality plan] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
2334
+ | AC-7 | Derived (Test Coverage) | [description from quality plan] | MET/PARTIALLY_MET/NOT_MET | [explanation] |
2335
+
2336
+ **Coverage:** X/Y AC items fully met (includes derived ACs)
2337
+ **Original ACs:** X/Y met
2338
+ **Derived ACs:** X/Y met
2339
+
2340
+ ---
2341
+
2342
+ ### Quality Plan Verification
2343
+
2344
+ [Include if quality plan exists in issue comments, otherwise: "N/A - No quality plan found"]
2345
+
2346
+ | Dimension | Items Planned | Items Addressed | Status |
2347
+ |-----------|---------------|-----------------|--------|
2348
+ | Completeness | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
2349
+ | Error Handling | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
2350
+ | Code Quality | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
2351
+ | Test Coverage | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
2352
+ | Best Practices | X | X | ✅ Complete / ⚠️ Partial / ❌ Not addressed |
2353
+ | Polish | X | X | ✅ Complete / ⚠️ Partial / N/A (not UI) |
2354
+
2355
+ **Derived ACs:** X/Y addressed
2356
+ **Quality Plan Status:** Complete / Partial / Not Addressed
2357
+
2358
+ ---
2359
+
2360
+ ### Incremental QA Summary
2361
+
2362
+ [Include if INCREMENTAL_MODE=true from Phase 0c, otherwise: "N/A - First QA run"]
2363
+
2364
+ **Last QA:** <timestamp> (commit: <sha-short>)
2365
+ **Changes since last QA:** N files
2366
+
2367
+ | Check / AC | Status | Re-run? | Reason |
2368
+ |------------|--------|---------|--------|
2369
+ | [check/AC] | [status] | Cached / Re-run / Re-evaluated | [reason] |
2370
+
2371
+ **Summary:** X checks cached, Y re-evaluated, Z always-fresh
2372
+
2373
+ ---
2374
+
2375
+ ### CI Status
2376
+
2377
+ [Include if PR exists, otherwise: "No PR exists yet" or "No CI configured"]
2378
+
2379
+ | Check | State | Conclusion | Impact |
2380
+ |-------|-------|------------|--------|
2381
+ | `[check name]` | completed/in_progress/queued/pending | success/failure/cancelled/skipped/- | ✅ MET / ❌ NOT_MET / ⏳ PENDING |
2382
+
2383
+ **CI Summary:** X passed, Y pending, Z failed
2384
+ **CI-related AC items:** [list affected AC items and their status based on CI]
2385
+
2386
+ ---
2387
+
2388
+ ### Quality Metrics
2389
+
2390
+ | Metric | Value | Status |
2391
+ |--------|-------|--------|
2392
+ | Type issues (`any`) | X | OK/WARN |
2393
+ | Deleted tests | X | OK/WARN |
2394
+ | Files changed | X | OK/WARN |
2395
+ | Lines added | +X | - |
2396
+ | Lines deleted | -X | - |
2397
+
2398
+ ---
2399
+
2400
+ ### Cache Status
2401
+
2402
+ [Include if caching enabled, otherwise: "N/A - Caching disabled (--no-cache)"]
2403
+
2404
+ | Check | Cache Status |
2405
+ |-------|--------------|
2406
+ | type-safety | ✅ HIT / ❌ MISS / ⏭️ SKIP |
2407
+ | deleted-tests | ✅ HIT / ❌ MISS / ⏭️ SKIP |
2408
+ | scope | ⏭️ SKIP (always fresh) |
2409
+ | size | ⏭️ SKIP (always fresh) |
2410
+ | security | ✅ HIT / ❌ MISS / ⏭️ SKIP |
2411
+ | semgrep | ✅ HIT / ❌ MISS / ⏭️ SKIP |
2412
+ | build | ✅ HIT / ❌ MISS / ⏭️ SKIP |
2413
+
2414
+ **Summary:** X hits, Y misses, Z skipped
2415
+ **Performance:** [Note if cached checks saved time]
2416
+
2417
+ ---
2418
+
2419
+ ### Build Verification
2420
+
2421
+ [Include if `npm run build` failed, otherwise: "N/A - Build passed"]
2422
+
2423
+ | Check | Status |
2424
+ |-------|--------|
2425
+ | Feature branch build | ✅ Passed / ❌ Failed |
2426
+ | Main branch build | ✅ Passed / ❌ Failed |
2427
+ | Error match | ✅ Same error / ❌ Different errors / N/A |
2428
+ | Regression | **Yes** (new) / **No** (pre-existing) / **Unknown** |
2429
+
2430
+ **Note:** [Explanation of build verification result]
2431
+
2432
+ **Verdict impact:** [None / Blocking / Needs review]
2433
+
2434
+ ---
2435
+
2436
+ ### Test Coverage Analysis
2437
+
2438
+ | Changed File | Tier | Has Tests? | Test File |
2439
+ |--------------|------|------------|-----------|
2440
+ | `[file]` | Critical/Standard/Optional | ✅ Yes / ⚠️ No | `[test file or -]` |
2441
+
2442
+ **Coverage:** X/Y changed source files have corresponding tests
2443
+ **Critical paths without tests:** [list or "None"]
528
2444
 
529
2445
  ---
530
2446
 
@@ -541,18 +2457,155 @@ You MUST include these sections:
541
2457
 
542
2458
  ---
543
2459
 
2460
+ ### Test Quality Review
2461
+
2462
+ [Include if test files were added/modified, otherwise: "N/A - No test files modified"]
2463
+
2464
+ | Category | Status | Notes |
2465
+ |----------|--------|-------|
2466
+ | Behavior vs Implementation | ✅ OK / ⚠️ WARN | [notes] |
2467
+ | Coverage Depth | ✅ OK / ⚠️ WARN | [notes] |
2468
+ | Mock Hygiene | ✅ OK / ⚠️ WARN | [notes] |
2469
+ | Test Reliability | ✅ OK / ⚠️ WARN | [notes] |
2470
+
2471
+ **Issues Found:**
2472
+ - [file:line - description]
2473
+
2474
+ ---
2475
+
2476
+ ### Anti-Pattern Detection
2477
+
2478
+ #### Dependency Audit
2479
+ [Include if package.json modified, otherwise: "N/A - No dependency changes"]
2480
+
2481
+ | Package | Downloads/wk | Last Update | Flags |
2482
+ |---------|--------------|-------------|-------|
2483
+ | [pkg] | [count] | [date] | [flags] |
2484
+
2485
+ #### Code Patterns
2486
+
2487
+ | File:Line | Category | Pattern | Suggestion |
2488
+ |-----------|----------|---------|------------|
2489
+ | [location] | [category] | [pattern] | [fix] |
2490
+
2491
+ **Critical Issues:** X
2492
+ **Warnings:** Y
2493
+
2494
+ ---
2495
+
2496
+ ### Call-Site Review
2497
+
2498
+ [Include if new exported functions detected, otherwise: "N/A - No new exported functions"]
2499
+
2500
+ **New exported functions detected:** N
2501
+
2502
+ | Function | Call Sites | Loop? | Conditions | AC Match |
2503
+ |----------|-----------|-------|------------|----------|
2504
+ | `[function]` | `[file:line]` | Yes/No | `[condition]` | ✅ Matches AC-N / ⚠️ [issue] |
2505
+
2506
+ **Findings:**
2507
+ - [List any mismatches between call-site conditions and AC constraints]
2508
+
2509
+ **Recommendations:**
2510
+ - [Specific fixes needed at call sites]
2511
+
2512
+ ---
2513
+
2514
+ ### Execution Evidence
2515
+
2516
+ [Include if scripts/CLI modified, otherwise: "N/A - No executable changes"]
2517
+
2518
+ | Test Type | Command | Exit Code | Result |
2519
+ |-----------|---------|-----------|--------|
2520
+ | Smoke test | `[command]` | [code] | [result] |
2521
+
2522
+ **Evidence status:** Complete / Incomplete / Waived (reason) / Not Required
2523
+
2524
+ ---
2525
+
544
2526
  ### Script Verification Override
545
2527
 
546
2528
  [Include if scripts/CLI modified AND /verify was skipped, otherwise omit this section]
547
2529
 
548
2530
  **Requirement:** `/verify` before READY_FOR_MERGE
549
2531
  **Override:** Yes
550
- **Justification:** [Approved category from Section 9a]
2532
+ **Justification:** [Approved category from Section 11a]
551
2533
  **Risk Assessment:** [None/Low/Medium]
552
2534
 
553
2535
  ---
554
2536
 
555
- ### Verdict: [READY_FOR_MERGE | AC_MET_BUT_NOT_A_PLUS | AC_NOT_MET]
2537
+ ### Skill Command Verification
2538
+
2539
+ [Include if `.claude/skills/**/*.md` modified, otherwise: "N/A - No skill files changed"]
2540
+
2541
+ **Skill files changed:** X
2542
+
2543
+ | File | Commands Found | Verification Status |
2544
+ |------|----------------|---------------------|
2545
+ | `[skill file]` | [count] | ✅ Passed / ❌ Failed / ⚠️ Skipped |
2546
+
2547
+ **Commands Verified:**
2548
+ - `[command]` → ✅ [result]
2549
+
2550
+ **Commands with Issues:**
2551
+ - `[command]` → ❌ [issue description]
2552
+
2553
+ **Verification Status:** Passed / Failed / Skipped / Not Required
2554
+
2555
+ ---
2556
+
2557
+ ### CLI Registration Verification
2558
+
2559
+ [Include if option interfaces or CLI file modified, otherwise: "N/A - No option interface changes"]
2560
+
2561
+ **Option files modified:** Yes/No
2562
+
2563
+ | Interface Field | Runtime Usage | CLI Registered | Status |
2564
+ |----------------|--------------|----------------|--------|
2565
+ | `[field]` | `[usage location]` | `--[flag]` in bin/cli.ts / NOT REGISTERED | ✅ OK / ❌ FAIL / ⏭️ SKIP |
2566
+
2567
+ **Verification Status:** Passed / Failed / N/A
2568
+
2569
+ **Remediation (if failed):**
2570
+ - Add `.option("--field-name", "description")` to bin/cli.ts
2571
+
2572
+ ---
2573
+
2574
+ ### Skill Change Review
2575
+
2576
+ [Include if skill files changed, otherwise omit]
2577
+
2578
+ - [ ] **Command verified:** Did you execute at least one referenced command?
2579
+ - [ ] **Fields verified:** For JSON commands, do field names match actual output?
2580
+ - [ ] **Patterns complete:** What variations might users write that aren't covered?
2581
+ - [ ] **Dependencies explicit:** What CLIs/tools does this skill assume are installed?
2582
+
2583
+ ---
2584
+
2585
+ ### Smoke Test
2586
+
2587
+ [Include if workflow-affecting changes (skills, scripts, CLI), otherwise: "Not Required - No workflow-affecting changes"]
2588
+
2589
+ | Test | Command | Result | Notes |
2590
+ |------|---------|--------|-------|
2591
+ | Happy path | `[command]` | ✅/❌ | [observation] |
2592
+ | Edge case | `[command]` | ✅/❌ | [observation] |
2593
+ | Error handling | `[command]` | ✅/❌ | [observation] |
2594
+
2595
+ **Smoke Test Status:** Complete / Partial (document gaps) / Not Required
2596
+
2597
+ ---
2598
+
2599
+ ### Self-Evaluation
2600
+
2601
+ - **Verified working:** [Yes/No - did you actually verify the feature works?]
2602
+ - **Test efficacy:** [High/Medium/Low - do tests catch the feature breaking?]
2603
+ - **Likely failure mode:** [What would most likely break this in production?]
2604
+ - **Verdict confidence:** [High/Medium/Low - explain any uncertainty]
2605
+
2606
+ ---
2607
+
2608
+ ### Verdict: [READY_FOR_MERGE | AC_MET_BUT_NOT_A_PLUS | NEEDS_VERIFICATION | AC_NOT_MET]
556
2609
 
557
2610
  [Explanation of verdict]
558
2611