slash-do 1.4.2 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -30,6 +30,10 @@
30
30
 
31
31
  ---
32
32
 
33
+ ## Philosophy
34
+
35
+ slashdo commands emphasize **high-quality software engineering over token conservation**. While efforts are made to use agents, models, and prompts efficiently, these tools work hard to ensure your software meets high-quality standards — and will use the tokens necessary to that end. Expect thorough reviews, multi-agent scans, and verification loops rather than shortcuts.
36
+
33
37
  ## Quick Start
34
38
 
35
39
  **With npm/npx:**
@@ -49,6 +49,16 @@ When the resolved model is `opus`, **omit** the `model` parameter on the Agent/T
49
49
 
50
50
  Opus reduces false positives in audit (judgment-heavy). Sonnet is the floor for code-writing agents (remediation). Haiku works for fast first-pass pattern scanning but may produce more false positives — remediation agents (Sonnet+) validate before fixing.
51
51
 
52
+ ## Compaction Guidance
53
+
54
+ When compacting during this workflow, always preserve:
55
+ - The `FILE_OWNER_MAP` (complete, not summarized)
56
+ - All CRITICAL/HIGH findings with file:line references
57
+ - The current phase number and what phases remain
58
+ - All PR numbers and URLs created so far
59
+ - `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR` values
60
+ - `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
61
+
52
62
  ## Phase 0: Discovery & Setup
53
63
 
54
64
  Detect the project environment before any scanning or remediation.
@@ -98,6 +108,8 @@ If `VCS_HOST` is `github`, proactively verify browser authentication for the Cop
98
108
 
99
109
  This ensures the browser is ready before we need it in Phase 6, avoiding interruptions mid-flow.
100
110
 
111
+ <audit_instructions>
112
+
101
113
  ## Phase 1: Unified Audit
102
114
 
103
115
  Project conventions are already in your context. Pass relevant conventions to each agent.
@@ -107,7 +119,7 @@ Launch 7 Explore agents in two batches. Each agent must report findings in this
107
119
  - **[CRITICAL/HIGH/MEDIUM/LOW]** `file:line` - Description. Suggested fix: ... Complexity: Simple/Medium/Complex
108
120
  ```
109
121
 
110
- **IMPORTANT: Context requirement for audit agents.** When flagging an issue, agents MUST read at least 30 lines of surrounding context to confirm the issue is real. Common false positives to watch for:
122
+ **Context requirement.** Before flagging, read at least 30 lines of surrounding context to confirm the issue is real. Common false positives to watch for:
111
123
  - A Promise `.then()` chain that appears "unawaited" but IS collected into an array and awaited via `Promise.all` downstream
112
124
  - A value that appears "unvalidated" but IS checked by a guard clause earlier in the function or by the caller
113
125
  - A pattern that looks like an anti-pattern in isolation but IS idiomatic for the specific framework or library being used
@@ -115,6 +127,17 @@ Launch 7 Explore agents in two batches. Each agent must report findings in this
115
127
 
116
128
  If the surrounding context shows the code is correct, do NOT flag it.
117
129
 
130
+ If uncertain whether something is a genuine issue, report it as **[UNCERTAIN]** with your reasoning. The consolidation phase will evaluate these separately. A short list of confident findings is better than one padded with questionable ones.
131
+
132
+ <approach>
133
+ For each potential finding:
134
+ 1. Read the file and 30+ lines of surrounding context
135
+ 2. Quote the specific code that demonstrates the issue
136
+ 3. Explain why it's a problem given the context
137
+ 4. Only then classify severity and suggest a fix
138
+ Skip step 4 if steps 1-3 reveal the code is correct.
139
+ </approach>
140
+
118
141
  ### Batch 1 (5 parallel Explore agents via Task tool):
119
142
 
120
143
  **Model**: Pass `AUDIT_MODEL` as the `model` parameter on each agent. If `AUDIT_MODEL` is `opus`, omit the parameter to inherit from session.
@@ -122,6 +145,8 @@ If the surrounding context shows the code is correct, do NOT flag it.
122
145
  1. **Security & Secrets**
123
146
  Sources: authentication checks, credential exposure, infrastructure security, input validation, dependency health
124
147
  Focus: hardcoded credentials, API keys, exposed secrets, authentication bypasses, disabled security checks, PII exposure, injection vulnerabilities (SQL/command/path traversal), insecure CORS configurations, missing auth checks, unsanitized user input in file paths or queries, known CVEs in dependencies (check `npm audit` / `cargo audit` / `pip-audit` / `go vuln` output), abandoned or unmaintained dependencies, overly permissive dependency version ranges
148
+ OWASP Top 10 framing: broken auth (session fixation, credential stuffing), security misconfiguration (default creds, debug mode in prod), SSRF (user-controlled URLs in server fetch without allowlist), mass assignment (request bodies bound to models without field allowlist)
149
+ Supply chain: lockfile committed + frozen installs in CI, no untrusted postinstall scripts
125
150
 
126
151
  2. **Code Quality & Style**
127
152
  Sources: code brittleness, convention violations, test workarounds, logging & observability
@@ -134,10 +159,13 @@ If the surrounding context shows the code is correct, do NOT flag it.
134
159
  4. **Architecture & SOLID**
135
160
  Sources: structural violations, coupling analysis, modularity, API contract quality
136
161
  Focus: Single Responsibility violations (god files >500 lines, functions >50 lines doing multiple things), tight coupling between modules, circular dependencies, mixed concerns in single files, dependency inversion violations, classes/modules with too many responsibilities (>20 public methods), deep nesting (>4 levels), long parameter lists, modules reaching into other modules' internals, inconsistent API error response shapes across endpoints, list endpoints missing pagination, missing rate limiting on public endpoints, inconsistent request/response envelope patterns
162
+ API contract consistency: breaking response shape changes without versioning, inconsistent error envelopes across endpoints, missing deprecation headers on sunset endpoints
137
163
 
138
164
  5. **Bugs, Performance & Error Handling**
139
165
  Sources: runtime safety, resource management, async correctness, performance, race conditions
140
166
  Focus: missing `await` on async calls, unhandled promise rejections, null/undefined access without guards, off-by-one errors, incorrect comparison operators, mutation of shared state, resource leaks (unbounded caches/maps, unclosed connections/streams), `process.exit()` in library code, async routes without error forwarding, missing AbortController on data fetching, N+1 query patterns (loading related records inside loops), O(n²) or worse algorithms in hot paths, unbounded result sets (missing LIMIT/pagination on DB queries), missing database indexes on frequently queried columns, race conditions (TOCTOU, double-submit without idempotency keys, concurrent writes to shared state without locks, stale-read-then-write patterns), missing connection pooling or pool exhaustion
167
+ Resilience: external calls without timeouts, missing fallback for unavailable downstream services, retry without backoff ceiling/jitter, missing health check endpoints
168
+ Observability: production paths without structured logging, error logs missing reproduction context (request ID, input params), async flows without correlation IDs
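The retry guidance in the resilience bullet above can be sketched as a small helper (illustrative only; the base delay, ceiling, and attempt limit are assumptions, not values mandated by this workflow):

```shell
retry_with_backoff() {
  # Retry a command with exponential backoff, a delay ceiling, and jitter.
  max_attempts=5
  delay=1
  cap=30
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep a random 1..delay seconds so concurrent clients do not retry in lockstep.
    sleep "$(awk -v max="$delay" 'BEGIN { srand(); print int(rand() * max) + 1 }')"
    # Double the delay but never exceed the ceiling.
    delay=$(( delay * 2 ))
    [ "$delay" -gt "$cap" ] && delay="$cap"
    attempt=$(( attempt + 1 ))
  done
  return 1
}
```

Usage: `retry_with_backoff curl --max-time 5 https://example.com/health` retries a bounded, timeout-wrapped call without the unbounded-retry pattern the audit flags.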
141
169
 
142
170
  ### Batch 2 (2 agents after Batch 1 completes):
143
171
 
@@ -150,6 +178,7 @@ If the surrounding context shows the code is correct, do NOT flag it.
150
178
  - **Python**: mutable default arguments, bare except clauses, missing type hints on public APIs, sync I/O in async contexts
151
179
  - **Go**: unchecked errors, goroutine leaks, defer in loops, context propagation gaps
152
180
  - **Web projects (any stack)**: accessibility issues — missing alt text on images, broken keyboard navigation, missing ARIA labels on interactive elements, insufficient color contrast, form inputs without associated labels
181
+ - **Database migrations**: exclusive-lock ALTER TABLE on large tables, CREATE INDEX without CONCURRENTLY, missing down migrations or untested rollback paths
153
182
  - General: framework-specific security issues, language-specific gotchas, domain-specific compliance, environment variable hygiene (missing `.env.example`, required env vars not validated at startup, secrets in config files that should be in env)
154
183
 
155
184
  7. **Test Coverage**
@@ -158,12 +187,16 @@ If the surrounding context shows the code is correct, do NOT flag it.
158
187
 
159
188
  Wait for ALL agents to complete before proceeding.
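The database-migration checks in agent 6 lend themselves to a quick automated pre-pass. A minimal grep-based sketch, assuming SQL migrations live under a `migrations/`-style directory (the layout is hypothetical, and any hit is a candidate finding the agent must still verify in context):

```shell
lint_migrations() {
  # Flag CREATE INDEX statements that lack CONCURRENTLY; without it,
  # PostgreSQL holds a lock that blocks writes on the indexed table.
  dir="$1"
  status=0
  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue
    if grep -qi 'CREATE INDEX' "$f" && ! grep -qi 'CONCURRENTLY' "$f"; then
      echo "$f: CREATE INDEX without CONCURRENTLY" >&2
      status=1
    fi
  done
  return "$status"
}
```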
160
189
 
190
+ </audit_instructions>
191
+
192
+ <plan_and_remediate>
193
+
161
194
  ## Phase 2: Plan Generation
162
195
 
163
196
  1. Read the existing `PLAN.md` (create if it doesn't exist)
164
197
  2. Consolidate all findings from Phase 1, deduplicating across agents (same file:line flagged by multiple agents → keep the most specific description)
165
198
  3. Identify **shared utility extractions** — patterns duplicated 3+ times that should become reusable functions. Group these as "Foundation" work for Phase 3b.
166
- 4. **Build the file ownership map** (CRITICAL for Phase 5):
199
+ 4. **Build the file ownership map** (required by Phase 5 for conflict-free PRs):
167
200
  - For each finding, record which file(s) it touches
168
201
  - Assign each file to exactly ONE category (its primary category)
169
202
  - If a file is touched by multiple categories, assign it to the category with the highest-severity finding for that file
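The deduplication in step 2 can be sketched mechanically. Assuming findings arrive as flat lines of the form `[SEVERITY] file:line description` (an assumption about the report format, matching the format agents are asked to use), keeping the longest entry per `file:line` is one proxy for "most specific description":

```shell
dedup_findings() {
  # Field 2 is the "file:line" token; keep the longest line seen per key.
  awk '{
    key = $2
    if (length($0) > length(best[key])) best[key] = $0
  }
  END { for (k in best) print best[k] }' "$1"
}
```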
@@ -266,58 +299,17 @@ If no shared utilities were identified, skip this step.
266
299
  4. Spawn up to 5 general-purpose agents as teammates. **Pass `REMEDIATION_MODEL` as the `model` parameter on each agent.** If `REMEDIATION_MODEL` is `opus`, omit the parameter to inherit from session.
267
300
 
268
301
  ### Agent instructions template:
269
- ```
270
- You are {agent-name} on team better-{DATE}.
271
-
272
- Your task: Fix all {CATEGORY} findings from the Good audit.
273
- Working directory: {WORKTREE_DIR} (this is a git worktree — all work happens here)
274
-
275
- Project type: {PROJECT_TYPE}
276
- Build command: {BUILD_CMD}
277
- Test command: {TEST_CMD}
278
-
279
- Foundation utilities available (if created):
280
- {list of utility files with brief descriptions}
281
-
282
- Findings to address:
283
- {filtered list of CRITICAL/HIGH/MEDIUM findings for this category}
284
-
285
- FINDING VALIDATION — verify before fixing:
286
- - Before fixing each finding, READ the file and at least 30 lines of surrounding
287
- context to confirm the issue is genuine.
288
- - Check whether the flagged code is already correct (e.g., a Promise chain that
289
- IS properly awaited downstream, a value that IS validated earlier in the function,
290
- a pattern that IS idiomatic for the framework).
291
- - If the existing code is already correct, SKIP the fix and report it as a
292
- false positive with a brief explanation of why the original code is fine.
293
- - Do not make changes that are semantically equivalent to the original code
294
- (e.g., wrapping a .then() chain in an async IIFE adds noise without fixing anything).
295
-
296
- COMMIT STRATEGY — commit early and often:
297
- - After completing each logical group of related fixes, stage those files
298
- and commit immediately with a descriptive conventional commit message.
299
- - Each commit should be independently valid (build should pass).
300
- - Run {BUILD_CMD} in {WORKTREE_DIR} before each commit to verify.
301
- - Use `git -C {WORKTREE_DIR} add <specific files>` — never `git add -A` or `git add .`
302
- - Use `git -C {WORKTREE_DIR} commit -m "prefix: description"`
303
- - Use conventional commit prefixes: fix:, refactor:, feat:, security:
304
- - Do NOT include co-author or generated-by annotations in commits.
305
- - Do NOT bump the version — that happens once at the end.
306
-
307
- After all fixes:
308
- - Ensure all changes are committed (no uncommitted work)
309
- - Mark your task as completed via TaskUpdate
310
- - Report: commits made, files modified, findings addressed, any skipped issues
311
-
312
- CONFLICT AVOIDANCE:
313
- - Only modify files listed in your assigned findings
314
- - If you need to modify a file assigned to another agent, skip that change and report it
315
- ```
302
+
303
+ !`cat ~/.claude/lib/remediation-agent-template.md`
316
304
 
317
305
  ### Conflict avoidance:
318
306
  - Review all findings before task assignment. If two categories touch the same file, assign both sets of findings to the same agent.
319
307
  - Security agent gets priority on validation logic; DRY agent gets priority on import consolidation.
320
308
 
309
+ </plan_and_remediate>
310
+
311
+ <verification_and_pr>
312
+
321
313
  ## Phase 4: Verification
322
314
 
323
315
  After all agents complete:
@@ -359,9 +351,9 @@ For each category that has findings:
359
351
  ```
360
352
  5. Push the branch: `git push -u origin better/{CATEGORY_SLUG}`
361
353
 
362
- **CRITICAL: File isolation rule** — each file must appear in exactly ONE branch. If a file has changes from multiple categories (e.g., `server/index.js` with both security and stack-specific changes), assign the whole file to one category based on the file ownership map. Do not split file-level changes across PRs.
354
+ **File isolation rule** (one file per branch) — each file must appear in exactly ONE branch. If a file has changes from multiple categories (e.g., `server/index.js` with both security and stack-specific changes), assign the whole file to one category based on the file ownership map. Do not split file-level changes across PRs.
363
355
 
364
- **CRITICAL: Cross-PR dependency check** — after building all branches, verify each branch builds independently:
356
+ **Cross-PR dependency check** — verify each branch builds independently:
365
357
  ```bash
366
358
  git checkout better/{CATEGORY_SLUG} && {BUILD_CMD}
367
359
  ```
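Looped over every `better/*` branch, that check might look like the following sketch (the branch prefix comes from this workflow; substitute the real `{BUILD_CMD}` where indicated):

```shell
verify_better_branches() {
  # Check out each better/* branch in turn; a failure on any branch means
  # that branch does not stand on its own.
  for branch in $(git for-each-ref --format='%(refname:short)' refs/heads/better/); do
    echo "verifying $branch"
    git checkout -q "$branch" || return 1
    # Run the project's build here, e.g.: {BUILD_CMD} || return 1
  done
}
```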
@@ -465,7 +457,7 @@ After creating all PRs, verify CI passes on each one:
465
457
 
466
458
  Maximum 5 iterations per PR to prevent infinite loops.
467
459
 
468
- **IMPORTANT — Sub-agent delegation**: To prevent context exhaustion on long review cycles with multiple PRs, delegate each PR's review loop to a **separate general-purpose sub-agent** via the Agent tool. Launch sub-agents in parallel (one per PR). Each sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status.
460
+ **Sub-agent delegation** (prevents context exhaustion): delegate each PR's review loop to a **separate general-purpose sub-agent** via the Agent tool. Launch sub-agents in parallel (one per PR). Each sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status.
469
461
 
470
462
  ### 6.0: Verify browser authentication
471
463
 
@@ -486,85 +478,11 @@ If this returns 422 ("not a collaborator"), record `REVIEW_METHOD=playwright`. O
486
478
 
487
479
  ### 6.2: Launch parallel sub-agents (one per PR)
488
480
 
489
- For each PR, spawn a general-purpose sub-agent with:
481
+ For each PR, spawn a general-purpose sub-agent using the shared review loop template:
490
482
 
491
- ```
492
- You are a Copilot review loop agent for PR {PR_NUMBER}.
493
-
494
- Repository: {OWNER}/{REPO}
495
- Branch: better/{CATEGORY_SLUG}
496
- Build command: {BUILD_CMD}
497
- Review request method: {REVIEW_METHOD}
498
- Max iterations: 5
499
-
500
- DECREASING TIMEOUT SCHEDULE (shorter than single-PR review since multiple
501
- PRs are reviewed in parallel — see do:rpr for single-PR dynamic timing):
502
- - Iteration 1: max wait 5 minutes
503
- - Iteration 2: max wait 4 minutes
504
- - Iteration 3: max wait 3 minutes
505
- - Iteration 4: max wait 2 minutes
506
- - Iteration 5+: max wait 1 minute
507
- Poll interval: 30 seconds for all iterations.
508
-
509
- Run the following loop until Copilot returns zero new comments or you hit
510
- the max iteration limit:
511
-
512
- 1. CAPTURE the latest Copilot review timestamp, then REQUEST a new review:
513
- - First, capture the latest Copilot review timestamp via GraphQL:
514
- echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 20) { nodes { author { login } submittedAt } } } } }"}' | gh api graphql --input -
515
- - Find the most recent submittedAt where author.login is
516
- copilot-pull-request-reviewer[bot] and record as LAST_COPILOT_SUBMITTED_AT.
517
- - If no prior Copilot review exists, record LAST_COPILOT_SUBMITTED_AT=NONE
518
- and treat the next Copilot review as NEW regardless of timestamp.
519
- - Then REQUEST:
520
- If REVIEW_METHOD is "api":
521
- gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/requested_reviewers \
522
- -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
523
- If REVIEW_METHOD is "playwright":
524
- Navigate to the PR URL, click the "Reviewers" gear button, click the
525
- Copilot menuitemradio option, verify sidebar shows "Awaiting requested
526
- review from Copilot"
527
-
528
- 2. WAIT for the review (BLOCKING):
529
- - Poll using stdin JSON piping (avoid shell-escaping issues):
530
- echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 5) { totalCount nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
531
- - Complete when a new copilot-pull-request-reviewer[bot] review appears
532
- with submittedAt after LAST_COPILOT_SUBMITTED_AT captured in step 1
533
- (or, if LAST_COPILOT_SUBMITTED_AT=NONE, when the first
534
- copilot-pull-request-reviewer[bot] review for this loop appears)
535
- - Use the DECREASING TIMEOUT for the current iteration number
536
- - Error detection: if review body contains "Copilot encountered an error"
537
- or "unable to review", re-request and resume. Max 3 error retries.
538
- - If no review after max wait, report timeout and exit
539
-
540
- 3. CHECK for unresolved threads:
541
- Fetch threads via stdin JSON piping:
542
- echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviewThreads(first: 100) { nodes { id isResolved comments(first: 10) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
543
- - Verify review was successful (no error text in body)
544
- - If zero comments / no unresolved threads: report success and exit
545
- - If unresolved threads exist: proceed to step 4
546
-
547
- 4. FIX all unresolved threads:
548
- For each unresolved thread:
549
- - Read the referenced file and understand the feedback
550
- - Evaluate: valid feedback → make the fix; informational/false positive →
551
- resolve without changes
552
- - If fixing:
553
- git checkout better/{CATEGORY_SLUG}
554
- # make changes
555
- git add <specific files>
556
- git commit -m "address Copilot review feedback"
557
- git push
558
- - Resolve thread via stdin JSON piping:
559
- echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
560
- - After all threads resolved, increment iteration and go back to step 1
561
-
562
- When done, report back:
563
- - Final status: clean / max-iterations-reached / timeout / error
564
- - Total iterations completed
565
- - List of commits made (if any)
566
- - Any unresolved threads remaining
567
- ```
483
+ !`cat ~/.claude/lib/copilot-review-loop.md`
484
+
485
+ Pass each sub-agent the PR-specific variables: `{PR_NUMBER}`, `{OWNER}/{REPO}`, `better/{CATEGORY_SLUG}`, `{BUILD_CMD}`, and `{REVIEW_METHOD}`.
568
486
 
569
487
  Launch all PR sub-agents in parallel. Wait for all to complete.
570
488
 
@@ -598,6 +516,8 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
598
516
  Then re-run CI check before merging.
599
517
  - If branch protection: inform the user and suggest manual merge
600
518
 
519
+ </verification_and_pr>
520
+
601
521
  ## Phase 7: Cleanup
602
522
 
603
523
  1. Remove the worktree:
@@ -59,18 +59,39 @@ Before committing, ensure the fork is up to date with upstream:
59
59
  git push -u origin {CURRENT_BRANCH}
60
60
  ```
61
61
 
62
- ## Local Code Review (before opening PR)
62
+ ## Local Code Review (REQUIRED GATE)
63
+
64
+ Fork PRs go to upstream maintainers who can't easily ask for changes — getting it right the first time matters more here than on internal PRs.
65
+
66
+ <review_gate>
63
67
 
64
68
  1. Fetch upstream default branch for accurate diff:
65
69
  ```bash
66
70
  git fetch upstream {UPSTREAM_DEFAULT_BRANCH}
67
71
  ```
68
- 2. Run `git diff upstream/{UPSTREAM_DEFAULT_BRANCH}...{CURRENT_BRANCH}` to see the full diff against upstream
69
- 3. **For each changed file**, read the full file (not just the diff hunks) and check:
72
+ 2. Run `git diff upstream/{UPSTREAM_DEFAULT_BRANCH}...{CURRENT_BRANCH}` to get the list of changed files
73
+ 3. For every changed file:
74
+ a. Read the entire file using the Read tool (not just diff hunks)
75
+ b. Check it against the tiered checklist below (always check Tiers 1+4; check Tiers 2-3 when relevance filters match)
76
+ c. For each finding, quote the specific code line and explain why it's a problem
77
+ 4. After reviewing all files, verify: does the code actually deliver what the commits claim?
78
+ 5. Print a review summary table (see do:review for format)
79
+ 6. Fix any issues, recommit, and push before proceeding
80
+ 7. Only after printing the review summary may you proceed to "Open the PR"
81
+
82
+ If the diff touches more than 15 files, delegate later batches to a subagent to keep context clean.
83
+
84
+ </review_gate>
85
+
86
+ Checklist to apply to each file:
70
87
 
71
88
  !`cat ~/.claude/lib/code-review-checklist.md`
72
- 4. If issues are found, fix them, recommit, and push before proceeding
73
- 5. Summarize the review findings so the user can see what was checked
89
+
90
+ Verification confirm before proceeding:
91
+ - [ ] Read every changed file in full (not just diffs)
92
+ - [ ] Checked each file against the relevant checklist tiers
93
+ - [ ] Quoted specific code for each finding
94
+ - [ ] Printed a review summary table with findings
74
95
 
75
96
  ## Check for Upstream Contributing Guidelines
76
97
 
package/commands/do/pr.md CHANGED
@@ -17,17 +17,36 @@ Print: `PR flow: {current_branch} → {default_branch}`
17
17
  - Keep commit message concise and do not use co-author information
18
18
  - Push the branch to remote: `git pull --rebase --autostash && git push -u origin {current_branch}`
19
19
 
20
- ## Local Code Review (before opening PR)
20
+ ## Local Code Review (REQUIRED GATE)
21
21
 
22
- Before creating the PR, perform a thorough self-review. Read each changed file, not just the diff, to understand how the changes behave at runtime.
22
+ This review catches bugs that Copilot misses: incomplete pattern copying is the #1 source of post-merge review feedback. Skipping it costs more time in review cycles than it saves.
23
23
 
24
- 1. Run `git diff {default_branch}...{current_branch}` to see the full diff
25
- 2. **For each changed file**, read the full file (not just the diff hunks) and check:
24
+ <review_gate>
25
+
26
+ 1. Read commit messages to understand what this change claims to do
27
+ 2. Run `git diff {default_branch}...{current_branch}` to get the list of changed files
28
+ 3. For every changed file:
29
+ a. Read the entire file using the Read tool (not just diff hunks)
30
+ b. Check it against the tiered checklist below (always check Tiers 1+4; check Tiers 2-3 when relevance filters match)
31
+ c. For each finding, quote the specific code line and explain why it's a problem
32
+ 4. After reviewing all files, verify: does the code actually deliver what the commits claim?
33
+ 5. Print a review summary table (see do:review for format)
34
+ 6. Fix any issues, run tests, and verify tests cover the changed code paths
35
+ 7. Only after printing the review summary may you proceed to "Open the PR"
36
+
37
+ If the diff touches more than 15 files, delegate later batches to a subagent to keep context clean.
38
+
39
+ </review_gate>
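Step 2 of the gate reduces to a single command; a small helper for collecting the changed-file list (the branch names are the doc's own placeholders):

```shell
review_file_list() {
  # List files changed on $2 relative to its merge base with $1
  # (the three-dot form matches the gate's diff invocation).
  base="$1"
  head="$2"
  git diff --name-only "$base...$head"
}
```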
40
+
41
+ Checklist to apply to each file:
26
42
 
27
43
  !`cat ~/.claude/lib/code-review-checklist.md`
28
44
 
29
- 3. If issues are found, fix them and amend/recommit before proceeding
30
- 4. Summarize the review findings (even if clean) so the user can see what was checked
45
+ Verification confirm before proceeding:
46
+ - [ ] Read every changed file in full (not just diffs)
47
+ - [ ] Checked each file against the relevant checklist tiers
48
+ - [ ] Quoted specific code for each finding
49
+ - [ ] Printed a review summary table with findings
31
50
 
32
51
  ## Open the PR
33
52
 
@@ -57,17 +57,36 @@ If ambiguous, ask the user to confirm before proceeding.
57
57
 
58
58
  4. **Commit the release**: Stage `package.json`, `package-lock.json`, and the changelog file. Commit with message `chore: release v{new_version}`
59
59
 
60
- ## Local Code Review (before opening PR)
60
+ ## Local Code Review (REQUIRED GATE)
61
61
 
62
- Perform a thorough self-review. Read each changed file, not just the diff, to understand how the changes behave at runtime.
62
+ A release without a deep code review ships bugs to users. This review is the last line of defense: the full diff since the last release often contains interactions that individual PR reviews missed.
63
63
 
64
- 1. Run `git diff {target}...{source}` to see the full diff
65
- 2. **For each changed file**, read the full file (not just the diff hunks) and check:
64
+ <review_gate>
65
+
66
+ 1. Read all commit messages since last release to understand the scope
67
+ 2. Run `git diff {target}...{source}` to get the list of changed files
68
+ 3. For every changed file:
69
+ a. Read the entire file using the Read tool (not just diff hunks)
70
+ b. Check it against the tiered checklist below (always check Tiers 1+4; check Tiers 2-3 when relevance filters match)
71
+ c. For each finding, quote the specific code line and explain why it's a problem
72
+ 4. After reviewing all files, verify: does the aggregate change set deliver what the release claims?
73
+ 5. Print a review summary table (see do:review for format)
74
+ 6. Fix any issues, run tests, verify tests cover the changed code paths, commit and push
75
+ 7. Only after printing the review summary may you proceed to "Open the Release PR"
76
+
77
+ If the diff touches more than 15 files, delegate later batches to a subagent to keep context clean.
78
+
79
+ </review_gate>
80
+
81
+ Checklist to apply to each file:
66
82
 
67
83
  !`cat ~/.claude/lib/code-review-checklist.md`
68
84
 
69
- 3. If issues are found, fix them, commit, and push before proceeding
70
- 4. Summarize the review findings (even if clean) so the user can see what was checked
85
+ Verification confirm before proceeding:
86
+ - [ ] Read every changed file in full (not just diffs)
87
+ - [ ] Checked each file against the relevant checklist tiers
88
+ - [ ] Quoted specific code for each finding
89
+ - [ ] Printed a review summary table with findings
71
90
 
72
91
  ## Open the Release PR
73
92
 
@@ -17,6 +17,22 @@ If there are no changes, inform the user and stop.
17
17
 
18
18
  CLAUDE.md is already loaded into your context. Use its rules (code style, error handling, logging, security model, scope exclusions) as overrides to generic best practices throughout this review. For example, if CLAUDE.md says "no auth needed — internal tool", do not flag missing authentication.
19
19
 
20
+ <review_instructions>
21
+
22
+ ## PR-Level Coherence Check
23
+
24
+ Before reviewing individual files, understand what this change set claims to do:
25
+
26
+ 1. Read commit messages (`git log {base}...HEAD --oneline`)
27
+ 2. After reviewing all files, verify: does the changed code actually deliver what the commits claim? Flag any claims not backed by code (e.g., "adds rate limiting" but only adds a comment).
28
+
29
+ ## Large PR Strategy
30
+
31
+ If the diff touches more than 15 files, split the review into batches:
32
+ 1. Group files by module/directory
33
+ 2. Review each batch, printing findings as you go
34
+ 3. Delegate files beyond the first 15 to a subagent if context is getting full
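The batching in steps 1-3 can be sketched with awk; this hypothetical helper tags each incoming file path with a batch number (size 15, as stated above):

```shell
batch_files() {
  # Read file paths on stdin; print "batch<TAB>path" with `size` files per batch.
  awk -v size="${1:-15}" '
    NR % size == 1 { batch++ }
    { print batch "\t" $0 }'
}
```

Feeding it the output of `git diff --name-only` yields ready-made batches for per-batch review or subagent delegation.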
35
+
20
36
  ## Deep File Review
21
37
 
22
38
  For **each changed file** in the diff, read the **entire file** (not just diff hunks). Reviewing only the diff misses context bugs where new code interacts incorrectly with existing code.
@@ -59,12 +75,20 @@ With the flow understood, evaluate the changed code against these principles:
59
75
 
60
76
  Only flag principle violations that are **concrete and actionable** in the changed code. Do not flag pre-existing design issues in untouched code unless the changes make them worse.
61
77
 
78
+ </review_instructions>
79
+
80
+ <checklist>
81
+
62
82
  ### Per-File Checklist
63
83
 
64
- Check every file against this checklist:
84
+ Check every file against this checklist. The checklist is organized into tiers — always check Tiers 1 and 4, and check Tiers 2-3 only when the relevance filter matches the file:
65
85
 
66
86
  !`cat ~/.claude/lib/code-review-checklist.md`
67
87
 
88
+ </checklist>
89
+
90
+ <deep_checks>
91
+
68
92
  ### Additional deep checks (read surrounding code to verify):
69
93
 
70
94
  **Cross-file consistency**
@@ -87,6 +111,7 @@ Check every file against this checklist:
87
111
 
88
112
  **Access scope changes**
89
113
  - If the PR widens access to an endpoint or resource (admin→public, internal→external), trace all shared dependencies the endpoint uses (rate limiters, queues, connection pools, external service quotas) and assess whether they were sized for the previous access level — in-memory/process-local limiters don't enforce limits across horizontally scaled instances
114
+ - If the PR adds endpoints under a restricted route group (admin, internal, scoped), read sibling endpoints in the same route group and verify the new endpoint applies the same authorization gate — missing gates on admin-mounted endpoints are consistently the most dangerous review finding
90
115
 
91
116
  **Guard-before-cache ordering**
92
117
  - If a handler performs a pre-flight guard check (rate limit, quota, feature flag) before a cache lookup or short-circuit path, verify the guard doesn't block operations that would be served from cache without touching the guarded resource — restructure so cache hits bypass the guard
@@ -112,17 +137,62 @@ Check every file against this checklist:
  - If the PR modifies a value (identifier, parameter name, format convention, threshold, timeout) that is referenced in other files, trace all cross-references and verify they agree. This includes: reviewer usernames, API names, placeholder formats, GraphQL field names, operational constants
  - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
 
+ **Transactional write integrity**
+ - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation
+ - If the PR catches transaction/conditional failures, verify the error is translated to a client-appropriate status (409, 404) rather than bubbling as 500 — expected concurrency failures are not server errors
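The two checks above can be sketched together as a minimal in-memory illustration. The store shape, `StaleWriteError`, and the version field are hypothetical stand-ins, not any specific database API; real systems would use e.g. a conditional write expression or `UPDATE ... WHERE version = ?`.

```typescript
// Hypothetical in-memory store illustrating a version-conditioned write.
type Row = { value: number; version: number };
const store = new Map<string, Row>();

class StaleWriteError extends Error {
  status = 409; // expected concurrency failure: client-appropriate status, not 500
}

function conditionalUpdate(key: string, expectedVersion: number, value: number): Row {
  const row = store.get(key);
  // The condition closes the read/write gap: a concurrent writer that bumped
  // the version (or deleted the row) makes this write fail loudly instead of
  // silently clobbering or resurrecting state.
  if (!row || row.version !== expectedVersion) {
    throw new StaleWriteError(`stale write for ${key}`);
  }
  const next = { value, version: row.version + 1 };
  store.set(key, next);
  return next;
}

store.set("counter", { value: 1, version: 1 });
conditionalUpdate("counter", 1, 2); // succeeds, version becomes 2
try {
  conditionalUpdate("counter", 1, 99); // stale version: rejected as 409, not a crash
} catch (e) {
  if (!(e instanceof StaleWriteError)) throw e;
}
```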
+
+ **Batch/paginated API consumption**
+ - If the PR calls batch or paginated external APIs (database batch gets, paginated queries, bulk service calls), verify the caller handles partial results — unprocessed items, continuation tokens, and rate-limited responses must be retried or surfaced, not silently dropped. Check that retry loops include backoff and attempt limits
+ - If the PR references resource names from API responses (table names, queue names), verify lookups account for environment-prefixed names rather than hardcoding bare names
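A sketch of the retry discipline the first check asks for. The `batchWrite` callback and its unprocessed-items return shape are hypothetical stand-ins for whatever batch API the PR calls:

```typescript
// Retries unprocessed items with exponential backoff and an attempt cap.
// `batchWrite` accepts items and returns the subset it could NOT process,
// as real batch APIs often do.
async function writeAll<T>(
  items: T[],
  batchWrite: (items: T[]) => Promise<T[]>, // returns unprocessed items
  maxAttempts = 5,
  baseDelayMs = 100,
): Promise<void> {
  let pending = items;
  for (let attempt = 0; pending.length > 0 && attempt < maxAttempts; attempt++) {
    if (attempt > 0) {
      // Exponential backoff with jitter, so a throttled service isn't hammered.
      const delay = baseDelayMs * 2 ** (attempt - 1) * (0.5 + Math.random() / 2);
      await new Promise((r) => setTimeout(r, delay));
    }
    pending = await batchWrite(pending);
  }
  if (pending.length > 0) {
    // Surface, don't silently drop: callers must know the write was partial.
    throw new Error(`${pending.length} items unprocessed after ${maxAttempts} attempts`);
  }
}
```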
+
+ **Data model vs access pattern alignment**
+ - If the PR adds queries that claim ordering (e.g., "recent", "top"), verify the underlying key/index design actually supports that ordering natively — random UUIDs and non-time-sortable keys require full scans and in-memory sorting, which degrades at scale
+
+ **Deletion/lifecycle cleanup completeness**
+ - If the PR adds a delete or destroy function, trace all resources created during the entity's lifecycle (data directories, git branches, child records, temporary files, worktrees) and verify each is cleaned up on deletion. Compare with existing delete functions in the codebase for completeness patterns
+
+ **Update schema depth**
+ - If the PR derives an update/patch schema from a create schema (e.g., `.partial()`, `Partial<T>`), verify that nested objects also become partial — shallow partial on deeply-required schemas rejects valid partial updates where the caller only wants to change one nested field
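A minimal sketch of why shallow partial bites. The `Spec` shape and the validator are hypothetical, standing in for whatever schema library (`.partial()`, `Partial<T>`) the PR actually uses:

```typescript
// "required" marks a leaf field; nested objects carry their own spec.
type Spec = { [key: string]: Spec | "required" };

const createSpec: Spec = {
  name: "required",
  settings: { theme: "required", locale: "required" },
};

// Shallow partial: top-level keys become optional, but any nested object
// that IS supplied is still validated against the fully-required nested spec.
function validateShallowPartial(spec: Spec, input: Record<string, any>): boolean {
  for (const key of Object.keys(input)) {
    const sub = spec[key];
    if (sub && typeof sub === "object") {
      for (const nested of Object.keys(sub)) {
        if (!(nested in input[key])) return false; // rejects valid partial updates
      }
    }
  }
  return true;
}

// Caller only wants to change the theme, a perfectly reasonable PATCH body:
validateShallowPartial(createSpec, { settings: { theme: "dark" } }); // → false
```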
+
+ **Mutation return value freshness**
+ - If a function mutates an entity and returns it, verify the returned object reflects the post-mutation state, not a pre-read snapshot. Also check whether dependent scheduling/evaluation state (backoff, timers, status flags) is reset when a "force" or "trigger" operation is invoked
+
+ **Responsibility relocation audit**
+ - If the PR moves a responsibility from one module to another (e.g., a database write from a handler to middleware, a computation from client to server), trace all code at the old location that depended on the timing, return value, or side effects of the moved operation — guards, response fields, in-memory state updates, and downstream scheduling that assumed co-located execution. Verify the new execution point preserves these contracts or that dependents are updated. Check for dead code left behind at the old location
+
+ **Read-after-write consistency**
+ - If the PR writes to a data store and then immediately queries that store (especially scans, aggregations, or replica reads), check whether the store's consistency model guarantees visibility of the write. If not, flag the read as potentially stale and suggest computing from in-memory state, using consistent-read options, or adding a delay/caveat
+
  **Formatting & structural consistency**
  - If the PR adds content to an existing file (list items, sections, config entries), verify the new content matches the file's existing indentation, bullet style, heading levels, and structure — rendering inconsistencies are the most common Copilot review finding
 
+ </deep_checks>
+
+ <verify_findings>
+
+ ## Verify Findings
+
+ For each issue found, ground it in evidence before classifying:
+ 1. **Quote the specific code line(s)** that demonstrate the issue
+ 2. **Explain why it's a problem** in one sentence given the surrounding context
+ 3. If the fix involves async/state changes, **trace the execution path** to confirm the issue is real
+ 4. If you cannot quote specific code for a finding, downgrade it to **[UNCERTAIN]**
+
+ After verifying all findings, run the project's build and test commands to confirm no false positives.
+
+ </verify_findings>
+
+ <fix_and_report>
+
  ## Fix Issues Found
 
- For each issue found:
+ For each verified issue:
  1. Classify severity: **CRITICAL** (runtime crash, data leak, security) vs **IMPROVEMENT** (consistency, robustness, conventions)
  2. Fix all CRITICAL issues immediately
  3. For IMPROVEMENT issues, fix them too — the goal is to eliminate Copilot review round-trips
  4. After fixes, run the project's test suite and build command (per project conventions already in context)
- 5. Commit fixes: `refactor: address code review findings`
+ 5. Verify the test suite covers the changed code paths — passing unrelated tests is not validation
+ 6. Commit fixes: `refactor: address code review findings`
 
  ## Report
 
@@ -144,3 +214,5 @@ Print a summary table of what was reviewed and found:
  ```
 
  If no issues were found, confirm the code is clean and ready for PR.
+
+ </fix_and_report>
package/install.sh CHANGED
@@ -36,6 +36,7 @@ OLD_COMMANDS=(cam good makegoals makegood optimize-md)
 
  LIBS=(
  code-review-checklist copilot-review-loop graphql-escaping
+ remediation-agent-template
  )
 
  HOOKS=(slashdo-check-update slashdo-statusline)
@@ -1,3 +1,11 @@
+ <!--
+ Triage: Check Tiers 1 and 4 for every file. Check Tier 2/3 only when
+ the relevance filter matches the changed code. This prevents important
+ checks from being lost in a long list.
+ -->
+
+ ## Tier 1 — Always Check (Runtime Crashes, Security, Hygiene)
+
  **Hygiene**
  - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, and uncommittable files (.env, node_modules, build artifacts)
  - Overly broad changes that should be split into separate PRs
@@ -11,110 +19,157 @@
  - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters
  - Functions that index into arrays without guarding empty arrays; state/variables declared but never updated or only partially wired up
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
- - Side effects during React render (setState, navigation, mutations outside useEffect)
+ - Functions with >10 branches or >15 cyclomatic complexity — refactor into smaller units
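The coercion edge cases in the list above are quick to confirm directly; these are standard JavaScript semantics, shown here as runnable checks:

```typescript
// Type-coercion edge cases from the checklist, verified directly.
console.assert(Number("") === 0);            // empty string coerces to 0, not NaN
console.assert(Number.isNaN(Number("abc"))); // genuinely non-numeric input gives NaN

const n = Number("abc");
console.assert(!(n > 10) && !(n < 10) && !(n === 10)); // NaN comparisons are always
                                                       // false, so a bounds check
                                                       // silently "passes"

console.assert("10" < "2");                  // lexicographic, not numeric, ordering

// Safer alternatives: explicit finiteness/null checks instead of truthy guards.
console.assert(Number.isFinite(0));          // 0 is a valid value, but falsy
console.assert(0 != null && "" != null);     // `!= null` catches only null/undefined
```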
+
+ **API & URL safety**
+ - User-supplied or system-generated values interpolated into URL paths, shell commands, file paths, or subprocess arguments without encoding/validation — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries. Generated identifiers used as URL path segments must be safe for your router/storage (no `/`, `?`, `#`; consider allowlisting characters and/or applying `encodeURIComponent()`). Identifiers derived from human-readable names (slugs) used for namespaced resources (git branches, directories) need a unique suffix (ID, hash) to prevent collisions between entities with the same or similar names
+ - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
+ - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening
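The first two checks can be demonstrated with Node's standard library. The example paths are made up for illustration:

```typescript
import path from "node:path";

// An unencoded identifier breaks the URL path structure it lands in:
const id = "a/b?c#d";
console.assert(`/items/${id}` === "/items/a/b?c#d");        // extra segment + query + fragment
console.assert(encodeURIComponent(id) === "a%2Fb%3Fc%23d"); // safe as one path segment

// String-prefix containment vs. path.relative():
const root = "/srv/data";
const sneaky = "/srv/data-secrets/key";
console.assert(sneaky.startsWith(root));           // prefix check wrongly passes
const rel = path.relative(root, sneaky);
console.assert(rel.startsWith(".."));              // relative path reveals the escape
```

The `path.relative()` check works because any path outside `root` resolves to something beginning with `..`, regardless of shared string prefixes.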
+
+ **Trust boundaries & data exposure**
+ - API responses returning full objects with sensitive fields — destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it
+ - Server trusting client-provided computed/derived values (scores, totals, correctness flags) when the server can recompute them — strip and recompute server-side; don't require clients to submit fields the server should own
+ - New endpoints mounted under restricted paths (admin, internal) missing authorization verification — compare with sibling endpoints in the same route group to ensure the same access gate (role check, scope validation) is applied consistently
+
+ ## Tier 2 — Check When Relevant (Data Integrity, Async, Error Handling)
 
- **Async & state consistency**
- - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths
+ **Async & state consistency** _[applies when: code uses async/await, Promises, or UI state]_
+ - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding
  - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted)
  - Error notification at multiple layers (shared API client + component-level) — verify exactly one layer owns user-facing error messages
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount
  - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
+ - Mutation/trigger functions that return or propagate stale pre-mutation state — if a function activates, updates, or resets an entity, the returned value and any dependent scheduling/evaluation state (backoff timers, "last run" timestamps, status flags) must reflect the post-mutation state, not a snapshot read before the mutation
+ - Fire-and-forget or async writes where the in-memory object is not updated (response returns stale data) or is updated unconditionally regardless of write success (response claims state that was never persisted) — update in-memory state conditionally on write outcome, or document the tradeoff explicitly
+ - Missing `await` on async operations in error/cleanup paths — fire-and-forget cleanup (e.g., aborting a failed operation, rolling back partial state) that must complete before the function returns or the caller proceeds
  - `Promise.all` without error handling — partial load with unhandled rejection. Wrap with fallback/error state
+ - Side effects during React render (setState, navigation, mutations outside useEffect)
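The `.catch(() => null)` pattern flagged in the first bullet, as a minimal sketch. The `save` stub always fails, and `messages` stands in for user-facing toasts; both are hypothetical:

```typescript
// Hypothetical failing async operation.
async function save(): Promise<{ ok: true }> {
  throw new Error("network down");
}

const messages: string[] = [];

async function buggySubmit(): Promise<void> {
  await save().catch(() => null); // error silenced here...
  messages.push("Saved!");        // ...but the success path runs anyway
}

async function fixedSubmit(): Promise<void> {
  const result = await save().catch(() => null);
  if (result === null) {          // check the value before proceeding
    messages.push("Save failed");
    return;
  }
  messages.push("Saved!");
}
```

Calling `buggySubmit()` reports success despite the failure; `fixedSubmit()` reports the failure.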
 
- **Resource management**
- - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
- - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
-
- **Error handling**
- - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints
+ **Error handling** _[applies when: code has try/catch, .catch, error responses, or external calls]_
+ - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints. Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
  - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures
  - Destructive operations in retry/cleanup paths assumed to succeed without their own error handling — if cleanup fails, retry logic crashes instead of reporting the intended failure
+ - External service calls without configurable timeouts — a hung downstream service blocks the caller indefinitely
+ - Missing fallback behavior when downstream services are unavailable (see also: retry without backoff in "Sync & replication")
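A sketch of the typed-error approach the first bullet asks for. The class names and status mapping are illustrative, not from any specific framework:

```typescript
// Typed errors carry an explicit HTTP status so client-caused conditions
// stop bubbling up as generic 500s.
class AppError extends Error {
  constructor(public status: number, message: string) {
    super(message);
    this.name = new.target.name;
  }
}
class NotFoundError extends AppError {
  constructor(message: string) { super(404, message); }
}
class ConflictError extends AppError {
  constructor(message: string) { super(409, message); }
}

// A central handler maps errors to responses in one place, keeping
// responses consistent across similar endpoints.
function toResponse(err: unknown): { status: number; body: string } {
  if (err instanceof AppError) {
    return { status: err.status, body: err.message };
  }
  return { status: 500, body: "Internal error" }; // truly unexpected only
}

console.assert(toResponse(new NotFoundError("no such user")).status === 404);
console.assert(toResponse(new ConflictError("stale version")).status === 409);
console.assert(toResponse(new Error("oops")).status === 500);
```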
 
- **API & URL safety**
- - User-supplied values interpolated into URL paths, shell commands, file paths, or subprocess arguments without encoding/validation — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries
- - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
- - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening
-
- **Trust boundaries & data exposure**
- - API responses returning full objects with sensitive fields — destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it
- - Server trusting client-provided computed/derived values (scores, totals, correctness flags) when the server can recompute them — strip and recompute server-side; don't require clients to submit fields the server should own
-
- **Input handling**
- - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
- - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs
+ **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
+ - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
+ - Deletion/destroy functions that clean up the primary resource but leave orphaned secondary resources (data directories, git branches, child records, temporary files) — trace all resources created during the entity's lifecycle and verify each is removed on delete
+ - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
 
- **Validation & consistency**
+ **Validation & consistency** _[applies when: code handles user input, schemas, or API contracts]_
+ - API versioning: breaking changes to public endpoints without version bump or deprecation path
+ - Backward-incompatible response shape changes without client migration plan
  - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation
  - When a validation/sanitization function is introduced for a field, trace ALL write paths (create, update, sync, import) — partial application means invalid values re-enter through the unguarded path
- - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer
+ - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
+ - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
  - Handlers reading properties from framework-provided objects using field names the framework doesn't populate — silent `undefined`. Verify property names match the caller's contract
+ - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
- - Summary counters/accumulators that miss edge cases (removals, branch coverage); silent operations in verbose sequences where all branches should print status
-
- **Intent vs implementation**
- - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, or an action labeled "migrated" that never creates the target
- - Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
- - Cross-references between files (identifiers, parameter names, format conventions, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them
- - Sequential instructions or steps whose ordering doesn't match the required execution order — readers following in order will perform actions at the wrong time (e.g., "record X" in step 2 when X must be captured before step 1's action)
- - Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
- - Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
- - Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
- - Tracking/checkpoint files that default to empty on parse failure — causes full re-execution. Fail loudly instead
- - Registering references to resources without verifying the resource exists — dangling references after failed operations
+ - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); silent operations in verbose sequences where all branches should print status
 
- **Concurrency & data integrity**
- - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave
+ **Concurrency & data integrity** _[applies when: code has shared state, database writes, or multi-step mutations]_
+ - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave — use conditional writes/optimistic concurrency (e.g., condition expressions, version checks) to close the gap between read and write; if the conditional write fails, surface a retryable error instead of letting it bubble as a 500
  - Multi-table writes without a transaction — FK violations or errors leave partial state
+ - Writes that replace an entire composite attribute (array, map, JSON blob) when the field is populated by multiple sources — the write discards data from other sources. Use a separate attribute, merge with the existing value, or use list/set append operations
  - Functions with early returns for "no primary fields to update" that silently skip secondary operations (relationship updates, link writes)
  - Functions that acquire shared state (locks, flags, markers) with exit paths that skip cleanup — leaves the system permanently locked. Trace all exit paths including error branches
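The fix for the last item is mechanical: a `try`/`finally` ensures every exit path, including throws and early returns, releases the marker. A minimal sketch with a hypothetical module-level busy flag:

```typescript
// A module-level busy flag, standing in for the checklist's "locks, flags, markers".
let busy = false;

function withBusyFlag<T>(fn: () => T): T {
  if (busy) throw new Error("operation already in progress");
  busy = true;
  try {
    return fn(); // fn may return early or throw;
  } finally {
    busy = false; // the flag is cleared on every exit path regardless
  }
}

withBusyFlag(() => 42); // → 42, and busy is false afterwards
try {
  withBusyFlag(() => { throw new Error("boom"); });
} catch { /* expected */ }
// busy is still false: the error branch did not leave the system locked
```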
 
- **Search & navigation**
- - Search results linking to generic list pages instead of deep-linking to the specific record
- - Search/query code hardcoding one backend's implementation when the system supports multiple — verify option/parameter names are mapped between backends
+ **Input handling** _[applies when: code accepts user/external input]_
+ - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
+ - Endpoints accepting unbounded arrays/collections without upper limits — enforce max size or move to background jobs
 
- **Sync & replication**
- - Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
- - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results
+ ## Tier 3 — Domain-Specific (Check Only When File Type Matches)
 
- **SQL & database**
+ **SQL & database** _[applies when: code contains SQL, ORM queries, or migration files]_
  - Parameterized query placeholder indices must match parameter array positions — especially with shared param builders or computed indices
  - Database triggers clobbering explicitly-provided values; auto-incrementing columns that only increment on INSERT, not UPDATE
  - Full-text search with strict parsers (`to_tsquery`) on user input — use `websearch_to_tsquery` or `plainto_tsquery`
  - Dead queries (results never read), N+1 patterns inside transactions, O(n²) algorithms on growing data
  - `CREATE TABLE IF NOT EXISTS` as sole migration strategy — won't add columns/indexes on upgrade. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or a migration framework
  - Functions/extensions requiring specific database versions without verification
+ - Migrations that lock tables for extended periods (ADD COLUMN with default on large tables, CREATE INDEX without CONCURRENTLY) — use concurrent operations or batched backfills
+ - Missing rollback/down migration or untested rollback path
 
- **Lazy initialization & module loading**
+ **Sync & replication** _[applies when: code uses pagination, batch APIs, or data sync]_
+ - Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
+ - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results
+ - Batch/paginated API calls (database batch gets, external service calls) that don't handle partial results — unprocessed items, continuation tokens, or rate-limited responses silently dropped. Add retry loops with backoff for unprocessed items
+ - Retry loops without backoff or max-attempt limits — tight loops under throttling extend latency indefinitely. Use bounded retries with exponential backoff/jitter
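The `limit + 1` trick from the pagination bullet, as a sketch over an in-memory array. The numeric offset stands in for an opaque `next` token:

```typescript
// Fetch one extra row to learn whether more pages exist, instead of a
// COUNT(*) full scan. `next` is null when there is no further page.
function page<T>(rows: T[], offset: number, limit: number): { items: T[]; next: number | null } {
  const slice = rows.slice(offset, offset + limit + 1); // ask for limit + 1
  const hasMore = slice.length > limit;
  return {
    items: hasMore ? slice.slice(0, limit) : slice,
    next: hasMore ? offset + limit : null,
  };
}

const data = ["a", "b", "c", "d", "e"];
page(data, 0, 2); // → { items: ["a", "b"], next: 2 }
page(data, 4, 2); // → { items: ["e"], next: null }
```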
+
+ **Lazy initialization & module loading** _[applies when: code uses dynamic imports, lazy singletons, or bootstrap sequences]_
  - Cached state getters returning null before initialization — provide async initializer or ensure-style function
  - Module-level side effects (file reads, SDK init) without error handling — corrupted files crash the process on import
  - Bootstrap/resilience code that imports the dependencies it's meant to install — restructure so installation precedes resolution
  - Re-exporting from heavy modules defeats lazy loading — use lightweight shared modules
 
- **Data format portability**
+ **Data format portability** _[applies when: code crosses serialization boundaries — JSON, DB, IPC]_
  - Values crossing serialization boundaries may change format (arrays in JSON vs string literals in DB) — convert consistently
+ - Reads issued immediately after writes to an eventually consistent store (database scans, replica reads, cache refreshes) may return stale data — use consistent-read options, compute from in-memory state after confirmed writes, or document the eventual-consistency window
  - BIGINT values parsed into JavaScript `Number` — precision lost past `MAX_SAFE_INTEGER`. Use strings or `BigInt`
+ - Data model key/index design that doesn't support required query access patterns — e.g., claiming "recent" ordering but using non-time-sortable keys (random UUIDs, user IDs). Verify sort keys and indexes can serve the queries the code performs without full-partition scans and in-memory sorting
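The BIGINT precision loss is easy to demonstrate; these are standard IEEE 754 double semantics:

```typescript
// Past Number.MAX_SAFE_INTEGER (2^53 - 1), adjacent integers collapse:
const fromDb = "9007199254740993"; // a BIGINT column rendered as text

console.assert(Number(fromDb) === 9007199254740992);  // off by one: precision lost
console.assert(BigInt(fromDb) === 9007199254740993n); // exact
console.assert(Number.MAX_SAFE_INTEGER === 9007199254740991);

// JSON.parse has the same trap; keep big identifiers as strings end to end.
console.assert(JSON.parse('{"id":9007199254740993}').id === 9007199254740992);
```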
 
- **Shell & portability**
+ **Shell & portability** _[applies when: code spawns subprocesses, uses shell scripts, or builds CLI tools]_
  - Subprocess calls under `set -e` abort on failure; non-critical writes fail on broken pipes — use `|| true` for non-critical output
  - Detached child processes with piped stdio — parent exit causes SIGPIPE. Redirect to log files or use `'ignore'`
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
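A sketch of the `pathToFileURL()` advice using Node's standard library; the plugin path is made up for illustration:

```typescript
import { pathToFileURL } from "node:url";
import path from "node:path";

// path.join produces an OS-native path; dynamic import() wants a file: URL.
// On Windows, backslashes in a bare string break ESM resolution, while
// pathToFileURL handles separators and percent-encoding on every platform.
const modPath = path.join("/opt", "app", "plugin.mjs"); // hypothetical module path
const url = pathToFileURL(modPath);

console.assert(url.protocol === "file:");
console.assert(url.href.endsWith("/app/plugin.mjs"));

// usage (sketch): const plugin = await import(url.href);
```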
 
- **Test coverage**
- - New logic/schemas/services without corresponding tests when similar existing code has tests
- - New error paths untestable because services throw generic errors instead of typed ones
- - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
- - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
- - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
+ **Search & navigation** _[applies when: code implements search results or deep-linking]_
+ - Search results linking to generic list pages instead of deep-linking to the specific record
+ - Search/query code hardcoding one backend's implementation when the system supports multiple — verify option/parameter names are mapped between backends
+
+ **Destructive UI operations** _[applies when: code adds delete, reset, revoke, or other destructive actions]_
+ - Destructive actions (delete, reset, revoke) in the UI without a confirmation step — compare with how similar destructive operations elsewhere in the codebase handle confirmation
 
- **Accessibility**
+ **Accessibility** _[applies when: code modifies UI components or interactive elements]_
  - Interactive elements missing accessible names, roles, or ARIA states — including disabled interactions without `aria-disabled`
  - Custom toggle/switch UI built from non-semantic elements instead of native inputs
+ ## Tier 4 — Always Check (Quality, Conventions, AI-Generated Code)
+
+ **Intent vs implementation**
+ - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, or an action labeled "migrated" that never creates the target
+ - Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
+ - Cross-references between files (identifiers, parameter names, format conventions, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them
+ - Responsibility relocated from one module to another (e.g., writes moved from handler to middleware) without updating all consumers that depended on the old location's timing, return value, or side effects — trace callers that relied on the synchronous or co-located behavior and verify they still work with the new execution point. Remove dead code left behind at the old location
+ - Sequential instructions or steps whose ordering doesn't match the required execution order — readers following in order will perform actions at the wrong time (e.g., "record X" in step 2 when X must be captured before step 1's action)
+ - Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
+ - Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
+ - Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
+ - Lookups that check only one scope when multiple exist — e.g., checking local git branches but not remote, checking in-memory cache but not persistent store. Trace all locations where the resource could exist and check each
+ - Tracking/checkpoint files that default to empty on parse failure — causes full re-execution. Fail loudly instead
+ - Registering references to resources without verifying the resource exists — dangling references after failed operations
+
+ **AI-generated code quality** _(Claude 4.6 specific failure modes)_
+ - Over-engineering: new abstractions, wrapper functions, helper files, or utility modules that serve only one call site — inline the logic instead
+ - Feature flags, configuration options, or extension points with only one possible value or consumer
+ - Commit messages or comments claiming a fix while the underlying bug remains — verify each claimed fix actually addresses the root cause, not just the symptom
+ - Functions containing placeholder comments (`// TODO`, `// FIXME`, `// implement later`) or stub implementations presented as complete
+ - Unnecessary defensive code: error handling for scenarios that provably cannot occur given the call site, fallbacks for internal functions that always return valid data
+
114
152
  **Configuration & hardcoding**
- - Hardcoded values when a config field or env var already exists; dead config fields nothing consumes; unused function parameters creating false API contracts
- - Duplicated config/constants/utilities across modules — extract to shared module to prevent drift
+ - Hardcoded values when a config field or env var already exists; dead config fields nothing consumes; unused function parameters creating false API contracts; resource names (table names, queue names, bucket names) hardcoded without accounting for environment prefixes; lookups on response objects using the wrong key silently return undefined
+ - Duplicated config/constants/utilities/helper functions across modules — extract to a shared module to prevent drift. Watch for behavioral inconsistencies between copies (e.g., one returns `'unknown'` for null while another returns `'never'`)
  - CI pipelines installing without lockfile pinning or version constraints — non-deterministic builds
+ - Production code paths with no structured logging at entry/exit points
+ - Error logs missing reproduction context (request ID, input parameters)
+ - Async flows without correlation ID propagation
+
+ **Supply chain & dependency health**
+ - Lockfile missing or uncommitted, or CI installs without `--frozen-lockfile`; lockfile drift from the manifest
+ - Unaddressed HIGH/CRITICAL vulnerabilities reported by `npm audit` / `cargo audit` / `pip-audit`
+ - `postinstall` scripts from untrusted packages executing arbitrary code without review
+ - Overly permissive version ranges (`*`, `>=`) on deps with known breaking-change history
+
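The lockfile item above lends itself to a first-pass automated scan. A minimal sketch for Node-style projects; the filename list is an assumption to extend per ecosystem:

```shell
# check_lockfile: fail when no lockfile accompanies the manifest, since
# CI installs without one are non-deterministic. Filenames are Node-specific.
check_lockfile() {
  dir="$1"
  for f in package-lock.json npm-shrinkwrap.json yarn.lock pnpm-lock.yaml; do
    [ -f "$dir/$f" ] && return 0
  done
  echo "no lockfile in $dir: pin dependencies and use --frozen-lockfile in CI" >&2
  return 1
}
```

A nonzero exit lets the check gate CI directly; drift between lockfile and manifest is then caught by the install step itself (e.g., `npm ci` fails on mismatch).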
+ **Test coverage**
+ - New logic/schemas/services without corresponding tests when similar existing code has tests
+ - New error paths untestable because services throw generic errors instead of typed ones
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
+ - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
+ - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
+ - Tests that pass but don't cover the changed code paths — passing unrelated tests is not validation
 
  **Style & conventions**
  - Naming and patterns consistent with the rest of the codebase
@@ -0,0 +1,61 @@
+ ## Remediation Agent Template
+
+ Use this template when spawning remediation agents in Phase 3c. Replace all `{PLACEHOLDERS}` with actual values.
+
+ ```
+ <context>
+ Project type: {PROJECT_TYPE}
+ Build command: {BUILD_CMD}
+ Test command: {TEST_CMD}
+ Working directory: {WORKTREE_DIR} (this is a git worktree — all work happens here)
+ Foundation utilities available (if created):
+ {FOUNDATION_UTILS}
+ </context>
+
+ <findings>
+ {FINDINGS}
+ </findings>
+
+ <instructions>
+ You are {AGENT_NAME} on team better-{DATE}.
+
+ Your task: Fix all {CATEGORY} findings listed above.
+
+ FINDING VALIDATION — verify before fixing:
+ - Before fixing each finding, READ the file and at least 30 lines of surrounding
+ context to confirm the issue is genuine.
+ - Check whether the flagged code is already correct (e.g., a Promise chain that
+ IS properly awaited downstream, a value that IS validated earlier in the function,
+ a pattern that IS idiomatic for the framework).
+ - If the existing code is already correct, SKIP the fix and report it as a
+ false positive with a brief explanation of why the original code is fine.
+ - Do not make changes that are semantically equivalent to the original code
+ (e.g., wrapping a .then() chain in an async IIFE adds noise without fixing anything).
+ </instructions>
+
+ <guardrails>
+ - Only use APIs/functions verified to exist by reading source files. If a fix
+ requires an API you haven't confirmed, read the module's exports first.
+ - Fix with minimum change required. Do not introduce new abstractions or helpers
+ unless the finding specifically calls for it. A one-line fix beats a refactored module.
+ - If a git/build/file-read command fails, retry once after verifying the working
+ directory and path. If it fails again, report the error and move to the next finding.
+ </guardrails>
+
+ <commit_strategy>
+ Goal: each commit builds independently and contains one logical group of
+ related fixes. Use conventional prefixes (fix:, refactor:, feat:, security:).
+ Stage specific files only (`git -C {WORKTREE_DIR} add <specific files>` — never
+ `git add -A` or `git add .`). Run {BUILD_CMD} in {WORKTREE_DIR} before committing.
+ No co-author annotations or version bumps.
+ </commit_strategy>
+
+ CONFLICT AVOIDANCE:
+ - Only modify files listed in your assigned findings
+ - If you need to modify a file assigned to another agent, skip that change and report it
+
+ After all fixes:
+ - Ensure all changes are committed (no uncommitted work)
+ - Mark your task as completed via TaskUpdate
+ - Report: commits made, files modified, findings addressed, any skipped issues
+ ```
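The commit strategy in the template can be sketched as a small shell helper. The `commit_group` name and the `WORKTREE_DIR`/`BUILD_CMD` defaults are illustrative assumptions; real values come from Phase 0 discovery:

```shell
# commit_group: stage only the named files, verify the build, then commit.
# Mirrors the template's rules: no `git add -A`/`git add .`, build before commit.
WORKTREE_DIR="${WORKTREE_DIR:-.}"
BUILD_CMD="${BUILD_CMD:-true}"   # placeholder; substitute the project's build command

commit_group() {
  msg="$1"; shift
  git -C "$WORKTREE_DIR" add -- "$@"      # stage specific files only
  ( cd "$WORKTREE_DIR" && $BUILD_CMD )    # each commit must build independently
  git -C "$WORKTREE_DIR" commit -m "$msg"
}
```

Running the build between staging and committing is what makes each commit independently buildable; a failed build aborts before `git commit` runs.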
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "slash-do",
- "version": "1.4.2",
+ "version": "1.5.0",
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
  "author": "Adam Eivy <adam@eivy.com>",
  "license": "MIT",
package/uninstall.sh CHANGED
@@ -32,6 +32,7 @@ OLD_COMMANDS=(cam good makegoals makegood optimize-md)
 
  LIBS=(
  code-review-checklist copilot-review-loop graphql-escaping
+ remediation-agent-template
  )
 
  HOOKS=(slashdo-check-update slashdo-statusline)