slash-do 1.4.0 → 1.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -465,6 +465,8 @@ After creating all PRs, verify CI passes on each one:
 
  Maximum 5 iterations per PR to prevent infinite loops.
 
+ **IMPORTANT — Sub-agent delegation**: To prevent context exhaustion on long review cycles with multiple PRs, delegate each PR's review loop to a **separate general-purpose sub-agent** via the Agent tool. Launch sub-agents in parallel (one per PR). Each sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status.
+
  ### 6.0: Verify browser authentication
 
  If `BROWSER_AUTHENTICATED` is not true (e.g., Phase 0e was skipped or failed):
@@ -472,71 +474,109 @@ If `BROWSER_AUTHENTICATED` is not true (e.g., Phase 0e was skipped or failed):
  2. Check for user avatar/menu
  3. If not logged in: navigate to `https://github.com/login`, inform the user **"Please log in to GitHub in the browser. I'll wait for you to confirm."**, and use `AskUserQuestion` to wait
 
- ### 6.1: Request Copilot reviews on all PRs
-
- For each PR:
+ ### 6.1: Determine review request method
 
- **Try the API first:**
+ **Try the API first** on any one PR:
  ```bash
- gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/requested_reviewers --method POST --input - <<< '{"reviewers":["copilot-pull-request-reviewer"]}'
+ gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/requested_reviewers \
+ -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
  ```
 
- If this returns 422 ("not a collaborator"), **fall back to Playwright** for each PR:
- 1. Navigate to `{PR_URL}`
- 2. Click the "Reviewers" gear button in the PR sidebar
- 3. In the dropdown menu, click the Copilot `menuitemradio` option (not the "Request" button which may be obscured by the dropdown header)
- 4. Verify the sidebar shows "Awaiting requested review from Copilot"
+ If this returns 422 ("not a collaborator"), record `REVIEW_METHOD=playwright`. Otherwise record `REVIEW_METHOD=api`.
 
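As an editorial sketch (not part of the diffed package), the 422 branch above amounts to a one-line classification; the helper name is hypothetical:

```javascript
// Sketch: map the probe's HTTP status to a review-request method.
// 422 ("not a collaborator") means the API path won't work, so the
// Playwright fallback is recorded instead.
function reviewMethodFor(status) {
  return status === 422 ? "playwright" : "api";
}
```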
- ### 6.2: Poll for review completion
+ ### 6.2: Launch parallel sub-agents (one per PR)
 
- **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing consecutive Copilot review `submittedAt` timestamps (or PR creation time for the first review). Use that duration as your expected wait. If no prior review exists, default to 5 minutes. Set poll interval to 60 seconds and max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early.
+ For each PR, spawn a general-purpose sub-agent with:
 
- For each PR, poll using GraphQL to check for a new Copilot review:
- ```bash
- echo '{"query":"{ repository(owner: \"OWNER\", name: \"REPO\") { pullRequest(number: PR_NUM) { reviews(last: 5) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
  ```
+ You are a Copilot review loop agent for PR {PR_NUMBER}.
 
- The review is complete when a new `copilot-pull-request-reviewer[bot]` review appears with a `submittedAt` after your request. If no review appears after max wait, **ask the user** whether to continue waiting, re-request, or skip.
-
- **Error detection**: After a review appears, check its `body` for error text such as "Copilot encountered an error" or "unable to review this pull request". If found, this is NOT a successful review — log a warning, re-request the review (step 6.1), and resume polling from 6.2. Allow up to 3 error retries per PR before asking the user whether to continue or skip.
-
- ### 6.3: Check for unresolved threads
-
- For each reviewed PR, fetch review threads via GraphQL using stdin JSON (**never use `$variables`**; shell expansion consumes `$` signs):
- ```bash
- echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviewThreads(first: 100) { nodes { id isResolved comments(first: 10) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
+ Repository: {OWNER}/{REPO}
+ Branch: better/{CATEGORY_SLUG}
+ Build command: {BUILD_CMD}
+ Review request method: {REVIEW_METHOD}
+ Max iterations: 5
+
+ DECREASING TIMEOUT SCHEDULE (shorter than single-PR review since multiple
+ PRs are reviewed in parallel — see do:rpr for single-PR dynamic timing):
+ - Iteration 1: max wait 5 minutes
+ - Iteration 2: max wait 4 minutes
+ - Iteration 3: max wait 3 minutes
+ - Iteration 4: max wait 2 minutes
+ - Iteration 5+: max wait 1 minute
+ Poll interval: 30 seconds for all iterations.
+
+ Run the following loop until Copilot returns zero new comments or you hit
+ the max iteration limit:
+
+ 1. CAPTURE the latest Copilot review timestamp, then REQUEST a new review:
+ - First, capture the latest Copilot review timestamp via GraphQL:
+ echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 20) { nodes { author { login } submittedAt } } } } }"}' | gh api graphql --input -
+ - Find the most recent submittedAt where author.login is
+ copilot-pull-request-reviewer[bot] and record as LAST_COPILOT_SUBMITTED_AT.
+ - If no prior Copilot review exists, record LAST_COPILOT_SUBMITTED_AT=NONE
+ and treat the next Copilot review as NEW regardless of timestamp.
+ - Then REQUEST:
+ If REVIEW_METHOD is "api":
+ gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/requested_reviewers \
+ -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
+ If REVIEW_METHOD is "playwright":
+ Navigate to the PR URL, click the "Reviewers" gear button, click the
+ Copilot menuitemradio option, verify sidebar shows "Awaiting requested
+ review from Copilot"
+
+ 2. WAIT for the review (BLOCKING):
+ - Poll using stdin JSON piping (avoid shell-escaping issues):
+ echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 5) { totalCount nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
+ - Complete when a new copilot-pull-request-reviewer[bot] review appears
+ with submittedAt after LAST_COPILOT_SUBMITTED_AT captured in step 1
+ (or, if LAST_COPILOT_SUBMITTED_AT=NONE, when the first
+ copilot-pull-request-reviewer[bot] review for this loop appears)
+ - Use the DECREASING TIMEOUT for the current iteration number
+ - Error detection: if review body contains "Copilot encountered an error"
+ or "unable to review", re-request and resume. Max 3 error retries.
+ - If no review after max wait, report timeout and exit
+
+ 3. CHECK for unresolved threads:
+ Fetch threads via stdin JSON piping:
+ echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviewThreads(first: 100) { nodes { id isResolved comments(first: 10) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
+ - Verify review was successful (no error text in body)
+ - If zero comments / no unresolved threads: report success and exit
+ - If unresolved threads exist: proceed to step 4
+
+ 4. FIX all unresolved threads:
+ For each unresolved thread:
+ - Read the referenced file and understand the feedback
+ - Evaluate: valid feedback → make the fix; informational/false positive →
+ resolve without changes
+ - If fixing:
+ git checkout better/{CATEGORY_SLUG}
+ # make changes
+ git add <specific files>
+ git commit -m "address Copilot review feedback"
+ git push
+ - Resolve thread via stdin JSON piping:
+ echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
+ - After all threads resolved, increment iteration and go back to step 1
+
+ When done, report back:
+ - Final status: clean / max-iterations-reached / timeout / error
+ - Total iterations completed
+ - List of commits made (if any)
+ - Any unresolved threads remaining
  ```
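As an editorial sketch (not part of the diffed package), the sub-agent's timestamp bookkeeping and decreasing timeout schedule can be expressed in JavaScript; helper names are hypothetical, and the review-node shape follows the GraphQL queries above:

```javascript
// Sketch of the sub-agent's bookkeeping (illustrative helper names).
const COPILOT = "copilot-pull-request-reviewer[bot]";

// Latest Copilot submittedAt, or null when no prior review exists
// (the prompt's LAST_COPILOT_SUBMITTED_AT=NONE case). Same-format
// ISO-8601 UTC timestamps compare correctly as plain strings.
function lastCopilotSubmittedAt(nodes) {
  const times = nodes
    .filter((n) => n.author?.login === COPILOT)
    .map((n) => n.submittedAt)
    .sort();
  return times.length ? times[times.length - 1] : null;
}

// A review counts as NEW when submitted after the captured baseline,
// or when there was no baseline at all.
function hasNewCopilotReview(nodes, baseline) {
  return nodes.some(
    (n) => n.author?.login === COPILOT && (baseline === null || n.submittedAt > baseline)
  );
}

// Decreasing per-iteration max wait, in minutes: 5, 4, 3, 2, then 1.
function maxWaitMinutes(iteration) {
  return Math.max(1, 6 - iteration);
}
```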
- Save to `/tmp/better_threads_{PR_NUMBER}.json` for parsing.
-
- - If **no unresolved threads** → mark PR as ready to merge
- - If **unresolved threads exist** → proceed to 6.4 (fix)
 
- ### 6.4: Fix unresolved threads
+ Launch all PR sub-agents in parallel. Wait for all to complete.
 
- For each unresolved thread on each PR:
- 1. Read the referenced file and understand the feedback
- 2. Evaluate: is the feedback valid? Some Copilot comments are informational or about pre-existing patterns.
- - **Valid feedback**: make the code fix
- - **Informational/false positive**: resolve the thread without changes
- 3. If fixing:
- ```bash
- git checkout better/{CATEGORY_SLUG}
- # make changes
- git add <specific files>
- git commit -m "address review: {summary of change}"
- git push
- ```
- 4. Resolve the thread via GraphQL mutation:
- ```bash
- echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
- ```
+ ### 6.3: Handle sub-agent results
 
- After all threads are resolved on a PR:
- 1. Increment that PR's `REVIEW_ITERATION`
- 2. If `REVIEW_ITERATION >= 5`: inform the user "Reached max review iterations (5) on PR #{number}. Remaining issues may need manual review."
- 3. Otherwise: re-request Copilot review on that PR and repeat from 6.2
+ For each sub-agent result:
+ - **clean**: mark PR as ready to merge
+ - **timeout**: ask the user whether to continue waiting, re-request, or skip
+ - **max-iterations-reached**: inform the user "Reached max review iterations (5) on PR #{number}. Remaining issues may need manual review."
+ - **error**: inform the user and ask whether to retry or skip
 
- ### 6.5: Merge
+ ### 6.4: Merge
 
  For each PR that has passed CI and review (in dependency order if applicable):
  ```bash
@@ -599,7 +639,7 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  - **Push failure**: `git pull --rebase --autostash` then retry push
  - **CI failure on PR**: investigate logs, fix in a new commit, push, re-check (max 3 attempts per PR)
  - **Cross-PR dependency breakage**: add backward-compatible re-exports or move shared files to the PR that creates them
- - **Copilot timeout** (review not received within 3 min): inform user, offer to merge without review approval or wait longer
+ - **Copilot timeout** (review not received within the decreasing timeout window): inform user, offer to merge without review approval or wait longer
  - **Copilot review loop exceeds 5 iterations per PR**: stop iterating on that PR, inform user, proceed to merge
  - **Existing worktree found at startup**: ask user — resume (reuse worktree) or cleanup (remove and start fresh)
  - **No findings above LOW**: skip Phases 3-7, print "No actionable findings" with the LOW summary
package/commands/do/pr.md CHANGED
@@ -34,8 +34,6 @@ Before creating the PR, perform a thorough self-review. Read each changed file
  - Create a PR from `{current_branch}` to `{default_branch}`
  - Create a rich PR description
 
- **IMPORTANT**: During each fix cycle in the Copilot review loop below, after fixing all review comments and before pushing, also bump the patch version (`npm version patch --no-git-tag-version` or equivalent) and commit the version bump.
-
  !`cat ~/.claude/lib/copilot-review-loop.md`
 
  **Report the final status** to the user including PR URL and review outcome.
@@ -91,6 +91,30 @@ Check every file against this checklist:
  **Guard-before-cache ordering**
  - If a handler performs a pre-flight guard check (rate limit, quota, feature flag) before a cache lookup or short-circuit path, verify the guard doesn't block operations that would be served from cache without touching the guarded resource — restructure so cache hits bypass the guard
 
+ **Sanitization/validation coverage**
+ - If the PR introduces a new validation or sanitization function for a data field, trace every code path that writes to that field (create, update, import, sync, rename) — verify they all use the same sanitization. Partial application is the #1 way invalid data re-enters through an unguarded path
+
+ **Bootstrap/initialization ordering**
+ - If the PR adds resilience or self-healing code (dependency installers, auto-repair, migration runners), trace the execution order: does the main code path resolve or import the dependencies BEFORE the resilience code runs? If so, the bootstrapper never executes when it's needed most — restructure so verification/installation precedes resolution
+
+ **Lock/flag exit-path completeness**
+ - If a function sets a shared flag or lock (in-progress, mutex, status marker), trace every exit path — early returns, error catches, platform-specific guards, and normal completion — to verify the flag is cleared. A missed path leaves the system permanently locked
+
+ **Operation-marker ordering**
+ - If the PR writes completion markers, success flags, or status files, verify they are written AFTER the operation they attest to, not before. If the operation can fail after the marker write, consumers see false success. Also check that marker-dependent startup logic validates the marker's contents rather than treating presence as unconditional success
+
+ **Real-time event vs response timing**
+ - If a handler emits push notifications (WebSocket, SSE, pub/sub) AND returns an HTTP response, verify clients won't receive push events before the response that gives them context to interpret those events — especially when the response contains IDs or version numbers the event consumer needs
+
+ **Intent vs implementation (meta-cognitive pass)**
+ - For each label, comment, docstring, status message, or inline instruction that describes behavior, verify the code actually implements that behavior. A detection mechanism must query the data it claims to detect; a migration must create the target, not just delete the source
+ - If the PR contains inline code examples, command templates, or query snippets, verify they are syntactically valid for their language — run a mental parse of each example. Watch for template placeholder format inconsistencies within and across files
+ - If the PR modifies a value (identifier, parameter name, format convention, threshold, timeout) that is referenced in other files, trace all cross-references and verify they agree. This includes: reviewer usernames, API names, placeholder formats, GraphQL field names, operational constants
+ - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
+
+ **Formatting & structural consistency**
+ - If the PR adds content to an existing file (list items, sections, config entries), verify the new content matches the file's existing indentation, bullet style, heading levels, and structure — rendering inconsistencies are the most common Copilot review finding
+
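As an editorial sketch of the lock/flag item above (not part of the diffed package; the flag name is hypothetical), a `try`/`finally` covers every exit path at once:

```javascript
// Sketch: clear an in-progress flag on EVERY exit path via finally.
// Early returns, thrown errors, and normal completion all pass through it.
let syncInProgress = false;

function runSync(task) {
  if (syncInProgress) return "busy"; // guard against re-entry
  syncInProgress = true;
  try {
    return task();
  } finally {
    syncInProgress = false; // cleared even when task() throws
  }
}
```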
  ## Fix Issues Found
 
  For each issue found:
@@ -1,144 +1,121 @@
  **Hygiene**
- - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK comments)
- - Hardcoded secrets, API keys, or credentials
- - Files that shouldn't be committed (.env, node_modules, build artifacts)
+ - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, and uncommittable files (.env, node_modules, build artifacts)
  - Overly broad changes that should be split into separate PRs
 
  **Imports & references**
- - Every symbol used in the file is imported (missing imports → runtime crash)
- - No unused imports introduced by the changes
+ - Every symbol used is imported (missing → runtime crash); no unused imports introduced
 
  **Runtime correctness**
- - State/variables that are declared but never updated or only partially wired up (e.g. a state setter that's never called)
+ - Null/undefined access without guards, off-by-one errors, object spread of potentially-null values (spread of null is `{}`, silently discarding state)
+ - Data from external/user sources (parsed JSON, API responses, file reads) used without structural validation — guard against parse failures, missing properties, wrong types, and null elements before accessing nested values. When parsed data is optional enrichment, isolate failures so they don't abort the main operation
+ - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters
+ - Functions that index into arrays without guarding empty arrays; state/variables declared but never updated or only partially wired up
+ - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
  - Side effects during React render (setState, navigation, mutations outside useEffect)
- - Off-by-one errors, null/undefined access without guards
- - `JSON.parse` on user-editable or external files (config, settings, cache, package metadata) without error handling — corrupted files will crash the process. When the parsed data is optional enrichment (e.g., version info, display metadata), isolate the failure so it doesn't abort the main operation
- - Accessing properties/methods on parsed JSON objects without verifying expected structure (e.g., `obj.arr.push()` when `arr` might not be an array)
- - Iterating arrays from external/user-editable sources without guarding each element; a `null` or wrong-type entry throws `TypeError` when treated as an object
- - Version/string comparisons using `!==` when semantic ordering matters; use proper semver comparison for version checks
- - `Number('')` produces `0`, not empty; cleared numeric inputs must map to `undefined`/`null`, not `0`, which silently fails validation or sets wrong values
- - Truthy checks on numeric values where `0` is valid (e.g., `days || 365` treats `0` as falsy); use `!= null` or explicit undefined checks instead
- - Functions that index into arrays (`arr[Math.floor(Math.random() * arr.length)]`) without guarding empty arrays; produces `undefined`/`NaN` when `arr.length === 0`
- - Module-level default/config objects passed by reference to consumers — shared mutation across calls. Use `structuredClone()` or spread when handing out defaults
- - `useCallback`/`useMemo` referencing a `const` declared later in the same function body — triggers temporal dead zone `ReferenceError`. Ensure dependency declarations appear before their dependents
- - Object spread/merge followed by unconditional field assignment that clobbers spread values — e.g., `{...input.details, notes: notes || null}` silently overwrites `input.details.notes` even when `notes` is undefined. Only set fields when the overriding value is explicitly provided
-
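As an editorial sketch (not part of the diffed package), the coercion bullets above can be demonstrated directly:

```javascript
// Sketch of the coercion pitfalls listed in the checklist above.
const cleared = Number("");               // 0, not "empty": map cleared inputs to null explicitly
const days = 0;                           // a valid, explicit zero
const wrong = days || 365;                // 365: the truthy check treats 0 as "unset"
const right = days != null ? days : 365;  // 0: a null check preserves the explicit zero
const lexi = "10" < "2";                  // true: string compare is lexicographic, not numeric
```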
- **Async & UI state consistency**
- - Optimistic UI state changes (view switches, navigation, success callbacks) before an async operation completes — if the operation fails or is cancelled (drag cancel, upload abort, form dismiss), the UI is stuck in the wrong state with no rollback. Handle both failure and cancellation paths to reset intermediate state
- - `Promise.all` without try/catch — if any request rejects, the UI ends up partially loaded with an unhandled rejection. Wrap in try/catch with fallback/error state so the view remains usable
- - Success callbacks (`onSaved()`, `onComplete()`) called unconditionally after an async call — check the return value or catch errors before calling the callback
- - Debounced/cancelable async operations that don't reset loading state on all code paths (input cleared, stale response arrives, request fails) — loading spinners get stuck and stale results display. Use AbortController or request IDs to discard outdated responses and clear loading in every exit path (including early returns)
- - Multiple UI state variables representing coupled data (coordinates + display name, selected item + dependent list) updated independently — actions that change one must update all related fields to prevent display/data mismatch
- - Error notification at multiple layers (shared API client that auto-displays errors + component-level error handling) — verify exactly one layer is responsible for user-facing error messages to avoid duplicate toasts/alerts. Suppress lower-layer notifications when the caller handles its own error display
- - Optimistic state updates using full-collection snapshots for rollback — if a second user action starts while the first is in-flight, rollback restores the snapshot and clobbers the second action's changes. Use per-item rollback and functional state updaters (`setState(prev => ...)`) after async gaps to avoid stale closures
- - Child components maintaining local copies of parent-provided data for optimistic updates without propagating changes back — on unmount/remount the parent's stale cache is re-rendered. Sync optimistic changes to the parent via callback alongside local state, or trigger a data refetch on remount
+
+ **Async & state consistency**
+ - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths
+ - Multiple coupled state variables updated independently; actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted)
+ - Error notification at multiple layers (shared API client + component-level); verify exactly one layer owns user-facing error messages
+ - Optimistic updates using full-collection snapshots for rollback; a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount
+ - State updates guarded by truthiness of the new value (`if (arr?.length)`): prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
+ - `Promise.all` without error handling; partial load with unhandled rejection. Wrap with fallback/error state
 
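As an editorial sketch of the empty-versus-missing distinction above (not part of the diffed package; the helper name is hypothetical):

```javascript
// Sketch: a truthy guard like `if (result?.length)` would skip the update
// and leave stale items on screen. Treat only a missing response as "keep".
function nextItems(current, result) {
  if (result == null) return current; // no response yet: keep current state
  return result;                      // empty array is a real answer: clear the list
}
```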
  **Resource management**
- - Event listeners, socket handlers, subscriptions, and timers are cleaned up on unmount/teardown
- - useEffect cleanup functions remove everything the effect sets up
+ - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
+ - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
 
- **HTTP status codes & error classification**
- - Service functions that throw generic `Error` for client-caused conditions (not found, invalid input); these bubble as 500 when they should be 400/404. Use typed error classes with explicit status codes
- - Consistent error responses across similar endpoints; if one validates, all should
+ **Error handling**
+ - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints
+ - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output; surface a notification, propagate original context, and make failures look like failures
+ - Destructive operations in retry/cleanup paths assumed to succeed without their own error handling — if cleanup fails, retry logic crashes instead of reporting the intended failure
 
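As an editorial sketch of the typed-error bullet above (not part of the diffed package; class and helper names are hypothetical):

```javascript
// Sketch: typed errors carry an HTTP status, so client-caused failures
// don't surface as 500s while unknown errors still do.
class HttpError extends Error {
  constructor(status, message) {
    super(message);
    this.status = status;
  }
}

class NotFoundError extends HttpError {
  constructor(message) { super(404, message); }
}

// The top-level handler maps errors to responses in one place.
function statusFor(err) {
  return err instanceof HttpError ? err.status : 500; // unknown errors stay 500
}
```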
  **API & URL safety**
- - User-supplied values interpolated into URL paths must use `encodeURIComponent()`; even if the UI restricts input, the API should be safe independently
- - Route params (`:name`, `:id`) passed to services without validation; add format checks (regex, length limits) at the route level
- - Data from external APIs or upstream services interpolated into shell commands, file paths, or subprocess arguments without validation; enforce expected format (e.g., regex allowlist) before passing to execution boundaries
- - Path containment checks using string prefix comparison (`resolvedPath.startsWith(baseDir)`) without a path separator boundary — `baseDir + "evil/..."` passes the check. Use `path.relative()` (reject if starts with `..`) or append `path.sep` to the base
- - Error/fallback responses that hardcode security headers (CORS, CSP) instead of using the centralized policy — error paths bypass security tightening applied to happy paths. Always reuse shared header middleware/constants
-
- **Data exposure**
- - API responses returning full objects that contain sensitive fields (secrets, tokens, passwords) — destructure and omit before sending. Check ALL response paths (GET, PUT, POST) not just one
- - Comments/docs claiming data is never exposed while the code path does expose it
+ - User-supplied values interpolated into URL paths, shell commands, file paths, or subprocess arguments without encoding/validation — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries
+ - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
+ - Error/fallback responses that hardcode security headers instead of using centralized policy; error paths bypass security tightening
 
- **Client/server trust boundary**
- - Server trusting client-provided computed/derived values (scores, totals, correctness flags) when the server has the data to recompute them; strip client-provided scoring/summary fields and recompute server-side
- - Validation schemas requiring clients to submit fields the server should own (e.g., `expected` answers, `correct` flags); make these optional/omitted in submissions and derive them server-side
- - API responses leaking answer keys or expected values that the client will later submit back — either strip before responding or use server-side nonce/seed verification
+ **Trust boundaries & data exposure**
+ - API responses returning full objects with sensitive fields; destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it
+ - Server trusting client-provided computed/derived values (scores, totals, correctness flags) when the server can recompute them; strip and recompute server-side; don't require clients to submit fields the server should own
 
  **Input handling**
- - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names, not secret values
- - Swallowed errors (empty `.catch(() => {})`) that hide failures from users; at minimum surface a notification on failure
- - Endpoints that accept unbounded arrays/collections without an upper limit — large payloads can exceed request timeouts, exhaust memory, or create DoS vectors. Enforce a max size and return 400 when exceeded, or move large operations to background jobs
+ - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
+ - Endpoints accepting unbounded arrays/collections without upper limits; enforce max size or move to background jobs
 
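As an editorial sketch of the strip-and-recompute bullet above (not part of the diffed package; function and field names are hypothetical):

```javascript
// Sketch: discard client-computed fields and recompute the score from
// data the server owns, so clients can't inflate their own results.
function acceptSubmission(body, answerKey) {
  const { score, correct, ...answers } = body; // strip client-derived fields
  const computed = Object.keys(answerKey).filter(
    (k) => answers[k] === answerKey[k]
  ).length;
  return { answers, score: computed }; // score is server-derived only
}
```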
65
46
  **Validation & consistency**
66
- - New endpoints/schemas match validation standards of similar existing endpoints (check for field limits, required fields, types)
67
- - New API routes have the same error handling patterns as existing routes
68
- - If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation
69
- - Schema fields that accept values the rest of the system can't handle (e.g., a field accepts any string but downstream code requires a specific format)
- - Zod/schema stripping fields the service actually reads — when Zod uses `.strict()` or strips unknown keys, any field the service reads from the validated object must be declared in the schema, otherwise it's silently `undefined`
- - Config values accepted by the API and persisted but silently ignored by the implementation — trace each config field through schema → service → generator/consumer to verify it's actually used (e.g., a `startRange` saved to config but the generator hardcodes a range)
- - Handlers/functions that read properties from framework-provided objects (request, event, context) using a field name the framework doesn't populate — results in silent `undefined`. Verify the property name matches the caller's contract, not just the handler's assumption
- - Numeric query params (`limit`, `offset`, `page`) parsed from strings without lower-bound clamping — `parseInt` can produce 0, negative, or `NaN` values that cause SQL errors or unexpected behavior. Always clamp to safe bounds (e.g., `Math.max(1, ...)`)
- - Summary counters/accumulators that miss edge cases — if an item is removed, is the count updated? Are all branches counted?
- - Silent operations in verbose sequences — when a series of operations each prints a status line, ensure all branches print consistent output
- - UI elements hidden from navigation (filtered tabs, conditional menu items) but still accessible via direct URL — enforce access restrictions at the route/handler level, not just visibility
- - Labels, comments, or status messages that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, or an action labeled "migrated" that never creates the target
- - Registering references (config entries, settings pointers) to files or resources without verifying the resource actually exists — a failed download or missing file leaves dangling references that break later operations
- - Error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — makes failures look like successes; always print a skip/warning message explaining why the operation was skipped
+ - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation
+ - When a validation/sanitization function is introduced for a field, trace ALL write paths (create, update, sync, import) — partial application means invalid values re-enter through the unguarded path
+ - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer
+ - Handlers reading properties from framework-provided objects using field names the framework doesn't populate — silent `undefined`. Verify property names match the caller's contract
+ - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
+ - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
+ - Summary counters/accumulators that miss edge cases (removals, branch coverage); silent operations in verbose sequences where all branches should print status
+
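The clamping guidance above can be sketched in a few lines; `clampPage` is a hypothetical helper for illustration, not code from this package:

```javascript
// Hypothetical helper: clamp a numeric query param so parseInt's
// 0 / negative / NaN results can never reach the database layer.
function clampPage(raw, { min = 1, max = 1000, fallback = 1 } = {}) {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n)) return fallback; // NaN would silently pass a `n <= max` check
  return Math.min(max, Math.max(min, n));
}
```

Without the `Number.isNaN` guard, `Math.max` and `Math.min` both return `NaN` for non-numeric input, so the bad value would flow straight through.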
+ **Intent vs implementation**
+ - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, or an action labeled "migrated" that never creates the target
+ - Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
+ - Cross-references between files (identifiers, parameter names, format conventions, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them
+ - Sequential instructions or steps whose ordering doesn't match the required execution order — readers following in order will perform actions at the wrong time (e.g., "record X" in step 2 when X must be captured before step 1's action)
+ - Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
+ - Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
+ - Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
+ - Tracking/checkpoint files that default to empty on parse failure — causes full re-execution. Fail loudly instead
+ - Registering references to resources without verifying the resource exists — dangling references after failed operations

  **Concurrency & data integrity**
- - Shared mutable state (files, in-memory caches) accessed by concurrent requests without locking or atomic writes
- - Multi-step read-modify-write cycles on files or databases that can interleave with other requests
- - Multi-table writes (e.g., parent row + relationship/link rows) without a transaction — FK violations or errors after the first insert leave partial state. Wrap all related writes in a single transaction
- - Functions with early returns for "no primary fields to update" that silently skip secondary operations (relationship updates, link table writes) — ensure early-return guards don't bypass logic that should run independently of primary field changes
+ - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave
+ - Multi-table writes without a transaction — FK violations or errors leave partial state
+ - Functions with early returns for "no primary fields to update" that silently skip secondary operations (relationship updates, link writes)
+ - Functions that acquire shared state (locks, flags, markers) with exit paths that skip cleanup — leaves the system permanently locked. Trace all exit paths including error branches
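The lock-cleanup item above usually reduces to: release in `finally`. A minimal sketch (editor's illustration, not package code; the same shape works for async functions with `await`):

```javascript
// Track held locks in module state; a real system might use files or DB flags.
const locks = new Set();

// Every exit path (return, throw) passes through finally, so the lock
// can never be left held after the function exits.
function withLock(key, fn) {
  if (locks.has(key)) throw new Error(`already locked: ${key}`);
  locks.add(key);
  try {
    return fn();
  } finally {
    locks.delete(key); // cleanup runs even when fn throws
  }
}
```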

  **Search & navigation**
- - Search results that link to generic list pages instead of deep-linking to the specific record — include the record type and ID in the URL
- - Search or query code that hardcodes one backend's implementation when the system supports multiple backends — use the active backend's capabilities so results aren't stale after a backend switch. Also check that option/parameter names are mapped between backends (e.g., `ftsWeight` vs `bm25Weight`) so configuration isn't silently ignored
+ - Search results linking to generic list pages instead of deep-linking to the specific record
+ - Search/query code hardcoding one backend's implementation when the system supports multiple — verify option/parameter names are mapped between backends

  **Sync & replication**
- - Upsert/`ON CONFLICT UPDATE` clauses that only update a subset of the fields exported by the corresponding "get changes" query — omitted fields cause replicas to diverge. Deliberately omit only fields that should stay local (e.g., access stats), and document the decision
- - Pagination using `COUNT(*)` to compute `hasMore` — this forces a full table scan on large tables. Use the `limit + 1` pattern: fetch one extra row to detect more pages, return only `limit` rows
- - Pagination endpoints that return a `next` token but don't accept one as input (or vice versa) — clients can't retrieve pages beyond the first. Also check that hard-capped query limits (e.g., `Limit: 100`) don't silently truncate results when offset exceeds the cap
+ - Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
+ - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results
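The `limit + 1` pattern mentioned above, sketched with a hypothetical `fetchRows` accessor (editor's illustration, not package code):

```javascript
// Fetch one row more than the caller gets back: its presence answers
// "is there another page?" without a COUNT(*) table scan.
function paginate(fetchRows, { limit, offset }) {
  const rows = fetchRows(limit + 1, offset);
  return { rows: rows.slice(0, limit), hasMore: rows.length > limit };
}
```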

  **SQL & database**
- - Parameterized query placeholder indices (`$1`, `$2`, ...) must match the actual parameter array positions — especially when multiple queries share a param builder or when the index is computed dynamically
- - Database triggers (e.g., `BEFORE UPDATE` setting `updated_at = NOW()`) that clobber explicitly-provided values — verify triggers don't interfere with replication/sync that sets fields to remote timestamps
- - Auto-incrementing columns (`BIGSERIAL`, `SERIAL`) only auto-increment on INSERT, not UPDATE — if change-tracking relies on a sequence column, the UPDATE path must explicitly call `nextval()` to bump it
- - Database functions that require specific extensions or minimum versions — verify the deployment target supports them and the init script enables the extension
- - Full-text search with strict query parsers (`to_tsquery`) directly on user input — punctuation, quotes, and operators cause SQL errors. Use `websearch_to_tsquery` or `plainto_tsquery` for user-facing search
- - Query results assigned to variables but never read — remove dead queries to avoid unnecessary database load
- - N+1 query patterns inside transactions (SELECT + INSERT/UPDATE per row) — use batched upserts (`INSERT ... ON CONFLICT ... DO UPDATE`) to reduce round-trips and lock time
- - `CREATE TABLE IF NOT EXISTS` used as the sole schema migration strategy — it won't add new columns, indexes, or triggers to existing tables on upgrade. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or a migration framework for schema evolution
- - O(n²) algorithms (self-joins, all-pairs comparisons, nested loops over full tables) triggered per-request on data that grows over time — these become prohibitive at scale. Add caps, use indexed lookups, or move to background jobs
+ - Parameterized query placeholder indices must match parameter array positions — especially with shared param builders or computed indices
+ - Database triggers clobbering explicitly-provided values; auto-incrementing columns that only increment on INSERT, not UPDATE
+ - Full-text search with strict parsers (`to_tsquery`) on user input — use `websearch_to_tsquery` or `plainto_tsquery`
+ - Dead queries (results never read), N+1 patterns inside transactions, O(n²) algorithms on growing data
+ - `CREATE TABLE IF NOT EXISTS` as sole migration strategy — won't add columns/indexes on upgrade. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or a migration framework
+ - Functions/extensions requiring specific database versions without verification
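One way to keep placeholder indices and the parameter array in lockstep, per the first item above, is to derive `$n` from the array itself. A hypothetical builder (editor's sketch, not package code):

```javascript
// The placeholder index is always the array position, so queries built
// dynamically or from a shared builder cannot drift out of sync.
function makeParamBuilder() {
  const values = [];
  return {
    add(value) {
      values.push(value);
      return `$${values.length}`;
    },
    values: () => values.slice(),
  };
}

const p = makeParamBuilder();
const sql = `SELECT * FROM users WHERE org = ${p.add("acme")} AND role = ${p.add("admin")}`;
// sql: SELECT * FROM users WHERE org = $1 AND role = $2
// p.values(): ["acme", "admin"]
```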

  **Lazy initialization & module loading**
- - Cached state getters that return `null`/`undefined` before the module is initialized — code that checks the cached value before triggering initialization will get incorrect results. Provide an async initializer or ensure-style function
- - Re-exporting constants from heavy modules defeats lazy loading — define shared constants in a lightweight module or inline them
- - Module-level side effects (file reads, JSON.parse, SDK client init) that run on import without error handling — a corrupted file or missing credential crashes the entire process before any request is served. Wrap module-level init in try/catch and degrade gracefully
+ - Cached state getters returning null before initialization — provide async initializer or ensure-style function
+ - Module-level side effects (file reads, SDK init) without error handling — corrupted files crash the process on import
+ - Bootstrap/resilience code that imports the dependencies it's meant to install — restructure so installation precedes resolution
+ - Re-exporting from heavy modules defeats lazy loading — use lightweight shared modules
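The ensure-style getter recommended above, as a sketch (editor's illustration; `loadConfig` is a hypothetical initializer):

```javascript
let cache = null;

// Callers can never observe the uninitialized null: the getter initializes
// on first use, and a failing loader degrades instead of crashing on import.
function ensureConfig(loadConfig) {
  if (cache === null) {
    try {
      cache = loadConfig();
    } catch (err) {
      cache = { defaults: true, error: String(err) };
    }
  }
  return cache;
}
```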

  **Data format portability**
- - Values that cross serialization boundaries (JSON API → database, peer sync) may change format — e.g., arrays in JSON vs specialized string literals in the database. Convert consistently before writing to the target
- - Database BIGINT/BIGSERIAL values parsed into JavaScript `Number` via `parseInt` or `Number()` — precision is lost past `Number.MAX_SAFE_INTEGER`, silently corrupting IDs, sequence cursors, or pagination tokens. Use string representation or `BigInt` for large integer columns
-
- **Shell script safety**
- - Subprocess calls in shell scripts under `set -e` — if the subprocess fails, the script aborts. Also check non-critical writes (e.g., `echo` to stdout) which fail on broken pipes and trigger exit — use `|| true` for non-critical output
- - Detached/background child processes spawned with piped stdio — if the parent exits (restart, crash), pipes close and writes cause SIGPIPE. Redirect stdio to log files or use `'ignore'` for children that must outlive the parent
- - When the same data structure is manipulated in both application code and shell-inline scripts, apply identical guards in both places
+ - Values crossing serialization boundaries may change format (arrays in JSON vs string literals in DB) — convert consistently
+ - BIGINT values parsed into JavaScript `Number` — precision lost past `MAX_SAFE_INTEGER`. Use strings or `BigInt`
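The BIGINT item above is easy to demonstrate concretely (editor's sketch, not package code):

```javascript
// A BIGSERIAL value just past Number.MAX_SAFE_INTEGER (2^53 - 1):
const id = "9007199254740993"; // 2^53 + 1

const asNumber = Number(id);   // silently rounds to 9007199254740992
const asBigInt = BigInt(id);   // exact

console.log(asNumber === 9007199254740992); // true, the ID was corrupted
console.log(asBigInt.toString() === id);    // true, string/BigInt round-trips
```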

- **Cross-platform compatibility**
- - Platform-specific execution assumptions — hardcoded shell interpreters (`bash`, `sh`), `path.join()` producing backslashes that break ESM `import()` or URL-based APIs on Windows, platform-gated scripts without fallback or clear error. Use `pathToFileURL()` for dynamic imports, check `process.platform` for shell dispatch
+ **Shell & portability**
+ - Subprocess calls under `set -e` abort on failure; non-critical writes fail on broken pipes — use `|| true` for non-critical output
+ - Detached child processes with piped stdio — parent exit causes SIGPIPE. Redirect to log files or use `'ignore'`
+ - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports

  **Test coverage**
- - New validation schemas, service functions, or business logic added without corresponding tests — especially when the project already has a test suite covering similar existing code
- - New error paths (404, 400) that are untestable because the service throws generic errors instead of typed/status-coded ones
- - Tests that re-implement the logic under test instead of importing real exports — these pass even when the real code regresses. Import and call the actual functions
- - Missing tests for trust-boundary enforcement — if the server strips/recomputes client-provided fields, add a test that submits tampered values and verifies the server ignores them
- - Tests that depend on real wall-clock time (`setTimeout`, `Date.now`, network delays) for rate limiters, debounce, or scheduling — slow under normal conditions and flaky under CI load. Use fake timers or time mocking
+ - New logic/schemas/services without corresponding tests when similar existing code has tests
+ - New error paths untestable because services throw generic errors instead of typed ones
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
+ - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
+ - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
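The fake-timer item above often reduces to injecting a clock. A rate-limiter sketch (editor's illustration, not package code):

```javascript
// `now` defaults to the real clock, but tests can pass a fake one,
// so no test ever sleeps on a real setTimeout.
function makeRateLimiter({ limit, windowMs, now = Date.now }) {
  const hits = [];
  return function allow() {
    const t = now();
    while (hits.length && t - hits[0] >= windowMs) hits.shift();
    if (hits.length >= limit) return false;
    hits.push(t);
    return true;
  };
}
```

A test advances its fake clock variable instead of waiting out a real window.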

  **Accessibility**
- - Interactive elements (buttons, toggles, custom controls) missing accessible names, roles, or ARIA states — including programmatically disabled interactions that don't reflect the disabled state visually or via `aria-disabled` (e.g., drag handles that appear interactive but are inert during async operations)
- - Custom toggle/switch UI built from `<button>` or `<div>` instead of native inputs with appropriate labeling
+ - Interactive elements missing accessible names, roles, or ARIA states — including disabled interactions without `aria-disabled`
+ - Custom toggle/switch UI built from non-semantic elements instead of native inputs

  **Configuration & hardcoding**
- - Hardcoded values (usernames, org names, limits) when a config field or env var already exists for that purpose
- - Dead config fields that nothing reads — either wire them up or remove them
- - Function parameters that are accepted but never used — creates a false API contract; remove unused params or implement the intended behavior
- - Duplicated config/constants/utility helpers across modules — extract to a single shared module to prevent drift (watch for circular imports when choosing the shared location)
- - CI pipelines that install dependencies without lockfile pinning (`npm install` instead of `npm ci`) or that ad-hoc install packages without version constraints — creates non-deterministic builds that can break unpredictably
+ - Hardcoded values when a config field or env var already exists; dead config fields nothing consumes; unused function parameters creating false API contracts
+ - Duplicated config/constants/utilities across modules — extract to shared module to prevent drift
+ - CI pipelines installing without lockfile pinning or version constraints — non-deterministic builds

  **Style & conventions**
  - Naming and patterns consistent with the rest of the codebase
- - Missing error handling at system boundaries (user input, external APIs)
+ - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure
@@ -1,46 +1,87 @@
  ## Copilot Code Review Loop

- After the PR is created, run the Copilot review-and-fix loop:
-
- 1. **Request a Copilot review via API**
- ```bash
- gh api repos/OWNER/REPO/pulls/PR_NUM/requested_reviewers -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
- ```
- **CRITICAL**: The reviewer name MUST include the `[bot]` suffix. Without it, the API returns a 422 "not a collaborator" error.
- - For **public repos**: Copilot review may trigger automatically on PR creation — check if a review already exists before requesting
- - If no Copilot reviewer is configured at all, inform the user and skip this loop
-
- 2. **Wait for the review to complete (BLOCKING — do not skip or proceed early)**
- - Record the current review count and latest `submittedAt` timestamp before waiting
- - Poll using `gh api graphql` to check the `reviews` array for a NEW review node (compare `submittedAt` timestamps or count):
- ```bash
- gh api graphql -f query='{ repository(owner: "OWNER", name: "REPO") { pullRequest(number: PR_NUM) { reviews(last: 3) { nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }'
- ```
- - The review is complete when a new Copilot review node appears with a `submittedAt` after your latest push
- - **Error detection**: After a review appears, check the review `body` for error text such as "Copilot encountered an error" or "unable to review this pull request". If the review body contains this error, it is NOT a successful review — re-request the review (step 1) and resume polling. Log a warning so the user knows a retry occurred. Apply a maximum of 3 error retries before asking the user whether to continue waiting or skip.
- - **Do NOT proceed until the re-requested review has actually posted** — "Awaiting requested review" means it is still in progress
- - **Dynamic poll timing**: Before your first poll, check how long the most recent Copilot review on this PR took by comparing its `submittedAt` to the previous review's `submittedAt` (or to the PR creation time if it was the first review). Use that duration as your expected wait time. If no prior review exists, default to 5 minutes. Set poll interval to 60 seconds and max wait to **2x the expected duration** (minimum 5 minutes, maximum 20 minutes).
- - Copilot reviews can take **10-15 minutes** for large diffs — do NOT give up early
- - If no review appears after the max wait time, **ask the user** whether to continue waiting, re-request the review, or skip — **never proceed without user approval when the review loop fails**
- - If the review request silently disappears (reviewRequests becomes empty without a review being posted), re-request the review once and resume polling
-
- 3. **Check for unresolved comments**
- - Filter review threads for `isResolved: false`
- - **First, verify the review was successful**: check that the latest Copilot review body does NOT contain "Copilot encountered an error" or "unable to review". If it does, this is an error response — go back to step 1 (re-request) instead of proceeding. This check is critical because error reviews have no comments and no unresolved threads, making them look identical to a clean review.
- - Also count the total comments in the latest review (check the review body for "generated N comments")
- - If the latest review has **zero comments** (body says "generated 0 comments" or no unresolved threads exist): the PR is clean — exit the loop
- - If **there are unresolved comments**: proceed to fix them (step 4)
-
- 4. **Fix all unresolved review comments**
+ After the PR is created, run the Copilot review-and-fix loop.
+
+ **IMPORTANT — Sub-agent delegation**: To prevent context exhaustion on long review cycles, delegate the entire review loop to a **general-purpose sub-agent** via the Agent tool. The sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status. This keeps the parent agent's context clean.
+
+ ### Sub-agent prompt template:
+
+ ```
+ You are a Copilot review loop agent.
+
+ PR: {PR_NUMBER} in {OWNER}/{REPO}
+ Branch: {BRANCH_NAME}
+ Build command: {BUILD_CMD}
+ Max iterations: 5
+
+ TIMEOUT SCHEDULE:
+ When running parallel PR reviews (do:better), use shorter waits to avoid
+ blocking other PRs:
+ - Iteration 1: max wait 5 minutes
+ - Iteration 2: max wait 4 minutes
+ - Iteration 3: max wait 3 minutes
+ - Iteration 4: max wait 2 minutes
+ - Iteration 5+: max wait 1 minute
+ When running a single-PR review (do:pr, do:release), use dynamic timing:
+ check the previous Copilot review duration on this PR and wait up to 2x
+ that (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take
+ 10-15 minutes for large diffs.
+ Poll interval: 30 seconds for all iterations.
+
+ Run the following loop until Copilot returns zero new comments or you hit
+ the max iteration limit:
+
+ 1. CAPTURE the latest Copilot review submittedAt timestamp (so you can
+    detect when a NEW review arrives):
+    echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 5) { nodes { author { login } submittedAt } } } } }"}' | gh api graphql --input -
+    Record the most recent submittedAt from copilot-pull-request-reviewer[bot].
+    Then REQUEST a Copilot review:
+    gh api repos/{OWNER}/{REPO}/pulls/{PR_NUMBER}/requested_reviewers \
+      -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
+    CRITICAL: The reviewer name MUST include the [bot] suffix.
+    - For public repos: check if a review already exists before requesting
+    - If no Copilot reviewer is configured, report back and exit
+
+ 2. WAIT for the review to complete (BLOCKING):
+    - Poll using stdin JSON piping to avoid shell-escaping issues:
+      echo '{"query":"{ repository(owner: \"{OWNER}\", name: \"{REPO}\") { pullRequest(number: {PR_NUMBER}) { reviews(last: 5) { totalCount nodes { state body author { login } submittedAt } } reviewThreads(first: 100) { nodes { id isResolved comments(first: 3) { nodes { body path line author { login } } } } } } } }"}' | gh api graphql --input -
+    - The review is complete when a new Copilot review node appears with a
+      submittedAt after the timestamp captured in step 1
+    - For parallel PR reviews (do:better): use the DECREASING TIMEOUT for
+      the current iteration number
+    - For single-PR reviews (do:pr, do:release): use dynamic timing based on
+      the previous Copilot review duration on this PR (2x that, min 5 min,
+      max 20 min)
+    - Error detection: if the review body contains "Copilot encountered an
+      error" or "unable to review this pull request", re-request (step 1)
+      and resume polling. Max 3 error retries before reporting failure.
+    - If no review appears after max wait, report the timeout — the parent
+      agent will ask the user what to do
+
+ 3. CHECK for unresolved comments:
+    - Filter review threads for isResolved: false
+    - First verify the review was successful: check that the latest Copilot
+      review body does NOT contain error text. If it does, go back to step 1.
+    - If zero comments (body says "generated 0 comments" or no unresolved
+      threads): PR is clean — report success and exit
+    - If unresolved comments exist: proceed to step 4
+
+ 4. FIX all unresolved review comments:
  For each unresolved thread:
  - Read the referenced file and understand the feedback
  - Make the code fix
- - Run the build (`npm run build` or the project's build command)
- - If build passes, commit with message `address review: <summary of changes>`
- - Resolve the thread via GraphQL mutation:
- ```bash
- gh api graphql -f query='mutation { resolveReviewThread(input: {threadId: "THREAD_ID"}) { thread { id isResolved } } }'
- ```
- - After all threads are resolved, push all commits to remote
- - **Re-request a Copilot review** via API (same command as step 1)
- - **Go back to step 2** (wait for new review) — this loop MUST repeat until Copilot returns a review with zero new comments. Never proceed after only one round of fixes.
+    - Run the build command
+    - If build passes, commit: address review: <summary>
+    - Resolve the thread via GraphQL mutation using stdin JSON piping:
+      echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
+    - After all threads resolved, push all commits to remote
+    - Increment iteration counter and go back to step 1
+
+ When done, report back:
+ - Final status: clean / max-iterations-reached / timeout / error
+ - Total iterations completed
+ - List of commits made (if any)
+ - Any unresolved threads remaining
+ ```
+
+ Launch the sub-agent and wait for its result. If the sub-agent reports a timeout or error, **ask the user** whether to continue waiting, re-request the review, or skip — never proceed without user approval when the review loop fails.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "slash-do",
- "version": "1.4.0",
+ "version": "1.4.2",
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
  "author": "Adam Eivy <adam@eivy.com>",
  "license": "MIT",