slash-do 1.6.0 → 1.6.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/commands/do/better.md +20 -15
- package/commands/do/review.md +24 -1
- package/lib/code-review-checklist.md +25 -13
- package/lib/copilot-review-loop.md +10 -5
- package/package.json +1 -1
package/commands/do/better.md
CHANGED
@@ -453,13 +453,17 @@ Write tests for these remediated files:
  After writing/fixing each test file:
  1. Run `{TEST_CMD}` to verify all tests pass
  2. For each NEW test, verify that it fails when the behavior under test is wrong:
- -
+ - Stage your test changes so they are protected: `git add path/to/test_file*`
+ - Confirm your staged diff only includes the intended test changes: `git diff --cached`
+ - Confirm there are no other unstaged changes in the worktree: `git diff` is clean
  - Apply a small, obvious, and **uncommitted** change to the code under test (e.g., return a constant, flip a conditional)
  - Run `{TEST_CMD}` and confirm the new test FAILS
- - Immediately restore the code
-
+ - Immediately restore only the temporary code change (do **not** touch the staged tests), for example:
+ - `git restore path/to/code_under_test` **or**
+ - `git checkout HEAD -- path/to/code_under_test`
+ - Confirm the worktree has no remaining unstaged changes (`git diff` shows no changes) and that your staged test changes are still present (`git diff --cached`)
  This is the key quality gate — a test that does not fail when the code is broken is worthless.
- 3. After confirming the code is
+ 3. After confirming the temporary code change is reverted and only the intended test changes are staged, commit the passing tests: `test: {description of what's tested}`
  ```

  ### 4c.3: Verification
@@ -489,15 +493,15 @@ Instead of one mega PR, create **separate branches and PRs for each category**.

  Using the `FILE_OWNER_MAP` from Phase 2 (updated in Phase 4c.3), create one branch per category.

- Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After each category branch is successfully created and pushed below, append its slug: `CREATED_CATEGORY_SLUGS="$CREATED_CATEGORY_SLUGS {CATEGORY_SLUG}"`. Phase 7 uses this for cleanup.
+ Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After each category branch is successfully created and pushed below, append its slug: `CREATED_CATEGORY_SLUGS="$CREATED_CATEGORY_SLUGS {CATEGORY_SLUG}"`. Phase 7 uses this as the set of candidate branches for cleanup; when deleting branches, either run cleanup only after all desired merges are complete or explicitly verify that each branch in `CREATED_CATEGORY_SLUGS` has been merged before deleting it.

  For each category that has findings:
  1. Switch to `{DEFAULT_BRANCH}`: `git checkout {DEFAULT_BRANCH}`
  2. Create a category branch: `git checkout -b better/{CATEGORY_SLUG}`
  - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `tests`
  3. For each file assigned to this category in `FILE_OWNER_MAP`:
- - **Modified files**: `git checkout
- - **New files (Added)**: `git checkout
+ - **Modified files**: `git checkout better/{DATE} -- {file_path}`
+ - **New files (Added)**: `git checkout better/{DATE} -- {file_path}`
  - **Deleted files**: `git rm {file_path}`
  4. Commit all staged changes with a descriptive message:
  ```bash
@@ -609,7 +613,7 @@ After creating all PRs, verify CI passes on each one:

  ## Phase 6: Copilot Review Loop (GitHub only)

-
+ Loop until Copilot returns zero new comments (no fixed iteration limit). Sub-agents enforce a 10-iteration guardrail: at iteration 10 the sub-agent stops and returns a "guardrail" status, prompting the parent agent to ask the user whether to continue or stop.

  **Sub-agent delegation** (prevents context exhaustion): delegate each PR's review loop to a **separate general-purpose sub-agent** via the Agent tool. Launch sub-agents in parallel (one per PR). Each sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status.

@@ -627,9 +631,9 @@ Launch all PR sub-agents in parallel. Wait for all to complete.

  For each sub-agent result:
  - **clean**: mark PR as ready to merge
- - **timeout**:
- - **max-iterations-reached**: inform the user "Reached max review iterations (5) on PR #{number}. Remaining issues may need manual review."
+ - **timeout**: inform the user "Copilot review timed out on PR #{number}." and ask whether to continue waiting, re-request, or skip
  - **error**: inform the user and ask whether to retry or skip
+ - **guardrail**: the sub-agent hit the 10-iteration limit; ask the user whether to continue with more iterations or stop

  ### 6.3: Merge Gate (MANDATORY)

@@ -680,16 +684,17 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  ```bash
  git worktree remove {WORKTREE_DIR}
  ```
- 2. Delete local
+ 2. Delete the local staging branch and per-category branches (local + remote). Use the tracked list of branches from Phase 5 rather than a fixed list:
  ```bash
- git
+ git checkout {DEFAULT_BRANCH}
+ git branch -D better/{DATE}
  # CREATED_CATEGORY_SLUGS is a space-delimited string, e.g. "security code-quality tests"
  for slug in $CREATED_CATEGORY_SLUGS; do
- git branch -d "better/$slug" || echo "warning: local branch better/$slug not found or not fully merged"
- git push origin --delete "better/$slug"
+ git branch -d "better/$slug" || echo "warning: local branch better/$slug not found or not fully merged — skipping (use -D to force)"
+ git push origin --delete "better/$slug" || echo "warning: remote branch better/$slug not found or already deleted"
  done
  ```
-
+ `-D` (force delete) is used only for the staging branch `better/{DATE}` because it is intentionally unmerged — its file contents are cherry-picked into category branches. Category branches use `-d` (safe delete) so that unmerged work is not accidentally lost; if a category branch was not merged, the warning will surface it. The guards prevent errors from interrupting cleanup.
  3. Restore stashed changes (if stashed in Phase 3a):
  ```bash
  git stash pop
package/commands/do/review.md
CHANGED
@@ -94,6 +94,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Cross-file consistency**
  - If a new function/endpoint follows a pattern from an existing similar one, verify ALL aspects match (validation, error codes, response shape, cleanup). Partial copying is the #1 source of review feedback.
  - New API client functions should use the same encoding/escaping as existing ones (e.g., if other endpoints use `encodeURIComponent`, new ones must too)
+ - If the PR adds a new endpoint, trace where existing endpoints are registered and verify the new one is wired in all runtime adapters (serverless handler map, framework route file, API gateway config, local dev server) — a route registered in one adapter but missing from another will silently 404 in the missing runtime
+ - If the PR adds a new call to an external service that has established mock/test infrastructure (mock mode flags, test helpers, dev stubs), verify the new call uses the same patterns — bypassing them makes the new code path untestable in offline/dev environments and inconsistent with existing integrations
+ - If the PR adds a new UI component or client-side consumer against an existing API endpoint, read the actual endpoint handler or response shape — verify every field name, nesting level, identifier property, and response envelope path used in the consumer matches what the producer returns. This is the #1 source of "renders empty" bugs in new views built against existing APIs

  **Error path completeness**
  - Trace each error path end-to-end: does the error reach the user with a helpful message and correct HTTP status? Or does it get swallowed, logged silently, or surface as a generic 500?
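The new consumer/producer shape check is easy to demonstrate concretely. A minimal TypeScript sketch, with an entirely hypothetical response shape (`data.reviews` and `reviewId` are invented names, not from any real API):

```typescript
// Hypothetical producer: the endpoint nests results under a `data`
// envelope and names the identifier `reviewId`, not `id`.
type ReviewsResponse = {
  data: { reviews: { reviewId: string; summary: string }[] };
};

const response: ReviewsResponse = {
  data: { reviews: [{ reviewId: "r1", summary: "LGTM" }] },
};

// Buggy consumer: assumes a flat envelope. The property is undefined,
// so the fallback kicks in and the view "renders empty" with no error.
const assumedShape = (response as any).reviews ?? [];

// Correct consumer: field names and nesting verified against the
// producer's actual response shape.
const verifiedShape = response.data.reviews.map((r) => r.reviewId);

console.log(assumedShape.length, verifiedShape);
```

Reading the handler (or a captured real response) before writing the consumer catches the mismatch at review time instead of as an empty screen.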
@@ -148,8 +151,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Data model vs access pattern alignment**
  - If the PR adds queries that claim ordering (e.g., "recent", "top"), verify the underlying key/index design actually supports that ordering natively — random UUIDs and non-time-sortable keys require full scans and in-memory sorting, which degrades at scale

- **Deletion/lifecycle cleanup completeness**
+ **Deletion/lifecycle cleanup and aggregate reset completeness**
  - If the PR adds a delete or destroy function, trace all resources created during the entity's lifecycle (data directories, git branches, child records, temporary files, worktrees) and verify each is cleaned up on deletion. Compare with existing delete functions in the codebase for completeness patterns
+ - If the PR adds a state transition that resets an aggregate value (counter, score, flag count), trace all individual records that contribute to that aggregate and verify they are also cleared, archived, or versioned — a reset counter with stale contributing records causes inconsistency and blocks duplicate-prevention checks on re-entry

  **Update schema depth**
  - If the PR derives an update/patch schema from a create schema (e.g., `.partial()`, `Partial<T>`), verify that nested objects also become partial — shallow partial on deeply-required schemas rejects valid partial updates where the caller only wants to change one nested field
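The aggregate-reset check can be sketched in a few lines of TypeScript. Everything here is hypothetical: an in-memory post with a `flagCount` aggregate and per-user flag records behind it:

```typescript
// A post whose flagCount aggregates per-user flag records.
const post = { id: "p1", status: "flagged", flagCount: 2 };
const flags = new Map<string, { postId: string }>([
  ["u1:p1", { postId: "p1" }],
  ["u2:p1", { postId: "p1" }],
]);

// Incomplete transition: resets the aggregate but leaves the records,
// so the aggregate and its sources disagree, and a duplicate-prevention
// check ("has u1 already flagged p1?") blocks legitimate re-flagging.
function approveIncomplete() {
  post.status = "approved";
  post.flagCount = 0;
}

// Complete transition: also clears the records behind the aggregate.
function approveComplete() {
  post.status = "approved";
  post.flagCount = 0;
  for (const [key, flag] of [...flags]) {
    if (flag.postId === post.id) flags.delete(key);
  }
}

approveComplete();
console.log(post.status, post.flagCount, flags.size); // approved 0 0
```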
@@ -163,9 +167,28 @@
  **Read-after-write consistency**
  - If the PR writes to a data store and then immediately queries that store (especially scans, aggregations, or replica reads), check whether the store's consistency model guarantees visibility of the write. If not, flag the read as potentially stale and suggest computing from in-memory state, using consistent-read options, or adding a delay/caveat

+ **Security-sensitive configuration parsing**
+ - If the PR reads environment variables or config values that affect security behavior (proxy trust depth, rate limit thresholds, CORS origins, token expiry), verify the parsing enforces the expected type and range — e.g., integer-only via `parseInt` with `Number.isInteger` check, non-negative bounds, and a logged fallback to a safe default on invalid input. `Number()` on arbitrary strings accepts floats, negatives, and empty-string-as-zero, all of which can silently weaken security controls
+
+ **Multi-source data aggregation**
+ - If the PR aggregates items from multiple sources into a single collection (merging accounts, combining API results, flattening caches), verify each item retains its source identifier through the aggregation — downstream operations that need to route back to the correct source (updates, deletes, detail views) will silently break or operate on the wrong source if the origin is lost
+
+ **Field-set enumeration consistency**
+ - If the PR adds an operation that targets a set of entity fields (enrichment, validation, migration, sync), trace every other location that independently enumerates those fields — UI predicates, scan/query filters, API documentation, response shapes, and test assertions. Each must cover the same field set; a missed field causes silent skips or false UI state. Prefer deriving enumerations from a single source of truth (constant array, schema keys) over maintaining independent lists
+
+ **Abstraction layer fidelity**
+ - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
+
+ **Data model / status lifecycle changes**
+ - If the PR changes the set of valid statuses, enum values, or entity lifecycle states, sweep all dependent artifacts: API doc summaries and enum declarations, UI filter/tab options, conditional rendering branches (which actions to show per state), integration guide examples, route names derived from old status names, and test assertions. Each artifact that references the old value set must be updated — partial updates leave stale filters, invalid actions, and misleading documentation
+ - If the PR renames a concept (e.g., "flagged" → "rejected"), trace all manifestations beyond user-facing labels: route paths, component/file names, variable names, CSS classes, and test descriptions. Internal identifiers using the old name create confusion even when the UI is correct
+
  **Formatting & structural consistency**
  - If the PR adds content to an existing file (list items, sections, config entries), verify the new content matches the file's existing indentation, bullet style, heading levels, and structure — rendering inconsistencies are the most common Copilot review finding

+ **Query key / stored key precision alignment**
+ - If the PR adds queries that construct lookup keys with a different precision, encoding, or format than what the write path persists, the query will silently return zero matches. Trace the key construction in both write and read paths and verify they produce compatible values
+
  </deep_checks>

  <verify_findings>
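The security-sensitive parsing check, sketched in TypeScript. The variable name and default are hypothetical; the pattern is the point: integer-only parse, full-string check, range bound, logged fallback.

```typescript
// Strictly parse a security-affecting config value such as a proxy
// trust depth. Number() would accept "2.5", "-1", and "" (as 0);
// bare parseInt() would accept "2abc".
function parseProxyDepth(raw: string | undefined, fallback = 0): number {
  const trimmed = raw?.trim();
  const n = trimmed === undefined ? NaN : parseInt(trimmed, 10);
  // Re-check the full string so partial parses like "2abc" are rejected,
  // then enforce the non-negative bound.
  if (!Number.isInteger(n) || String(n) !== trimmed || n < 0) {
    console.warn(`invalid proxy depth ${JSON.stringify(raw)}; using ${fallback}`);
    return fallback; // fall back to the safe default, loudly
  }
  return n;
}

console.log(parseProxyDepth("2"));   // 2
console.log(parseProxyDepth("2.5")); // 0 — float rejected
console.log(parseProxyDepth("-1")); // 0 — negative rejected
console.log(parseProxyDepth(""));   // 0 — Number('') would be 0 silently
```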
package/lib/code-review-checklist.md
CHANGED

@@ -17,13 +17,15 @@
  - Null/undefined access without guards, off-by-one errors, object spread of potentially-null values (spread of null is `{}`, silently discarding state)
  - Data from external/user sources (parsed JSON, API responses, file reads) used without structural validation — guard against parse failures, missing properties, wrong types, and null elements before accessing nested values. When parsed data is optional enrichment, isolate failures so they don't abort the main operation
  - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters
- - Functions that index into arrays without guarding empty arrays; state/variables declared but never updated or only partially wired up
+ - Functions that index into arrays without guarding empty arrays; aggregate operations (`every`, `some`, `reduce`) on potentially-empty collections returning vacuously true/default values that mask misconfiguration or missing data; state/variables declared but never updated or only partially wired up
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
  - Functions with >10 branches or >15 cyclomatic complexity — refactor into smaller units

  **API & URL safety**
  - User-supplied or system-generated values interpolated into URL paths, shell commands, file paths, or subprocess arguments without encoding/validation — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries. Generated identifiers used as URL path segments must be safe for your router/storage (no `/`, `?`, `#`; consider allowlisting characters and/or applying `encodeURIComponent()`). Identifiers derived from human-readable names (slugs) used for namespaced resources (git branches, directories) need a unique suffix (ID, hash) to prevent collisions between entities with the same or similar names
  - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
+ - Parameterized/wildcard routes registered before specific named routes — the generic route captures requests meant for the specific endpoint (e.g., `/:id` registered before `/drafts` matches `/drafts` as `id="drafts"`). Verify route registration order or use path prefixes to disambiguate
+ - Stored or external URLs rendered as clickable links (`href`, `src`, `window.open`) without protocol validation — `javascript:`, `data:`, and `vbscript:` URLs execute in the user's browser. Allowlist `http:`/`https:` (and `mailto:` if needed) before rendering; for all other schemes, render as plain text or strip the value
  - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening

  **Trust boundaries & data exposure**
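The clickable-URL check above can be sketched with the WHATWG `URL` parser (available in modern browsers and Node); `safeHref` is a hypothetical helper name:

```typescript
// Allowlist protocols before rendering a stored URL as a link.
// Anything else (javascript:, data:, vbscript:, unparseable input)
// is rejected so the caller renders plain text instead.
function safeHref(raw: string): string | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  const allowed = new Set(["http:", "https:", "mailto:"]);
  return allowed.has(url.protocol) ? url.href : null;
}

console.log(safeHref("https://example.com/a"));   // https://example.com/a
console.log(safeHref("javascript:alert(1)"));     // null
console.log(safeHref("data:text/html,<b>x</b>")); // null
```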
@@ -40,33 +42,36 @@
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount
  - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
  - Mutation/trigger functions that return or propagate stale pre-mutation state — if a function activates, updates, or resets an entity, the returned value and any dependent scheduling/evaluation state (backoff timers, "last run" timestamps, status flags) must reflect the post-mutation state, not a snapshot read before the mutation
- - Fire-and-forget or async writes where the in-memory object is not updated (response returns stale data) or is updated unconditionally regardless of write success (response claims state that was never persisted) — update in-memory state conditionally on write outcome, or document the tradeoff explicitly
+ - Fire-and-forget or async writes where the in-memory object is not updated (response returns stale data) or is updated unconditionally regardless of write success (response claims state that was never persisted) — update in-memory state conditionally on write outcome, or document the tradeoff explicitly. Also applies to responses and business-logic decisions (threshold triggers, status transitions) derived from pre-transaction reads — concurrent writers all read the same stale value, so thresholds may be crossed without triggering the transition. Compute from post-write state or use conditional expressions that evaluate the stored value. For monotonic counters (sequence numbers, cursors) that must stay in lockstep with append-only storage, advancing before the write risks the counter running ahead on failure; not advancing after a partial write risks reuse — reserve the range before writing and commit only on success
+ - Error/early-exit paths that return status metadata (pagination flags, truncation indicators, hasMore, completion markers) or emit events (WebSocket, SSE, pub/sub) with default/initial values instead of reflecting actual accumulated state — downstream consumers make incorrect decisions (e.g., treating a failed sync as successful because the completion event was emitted unconditionally). Set metadata flags and event payloads based on actual outcome, not just the final request's exit path
  - Missing `await` on async operations in error/cleanup paths — fire-and-forget cleanup (e.g., aborting a failed operation, rolling back partial state) that must complete before the function returns or the caller proceeds
  - `Promise.all` without error handling — partial load with unhandled rejection. Wrap with fallback/error state
+ - Sequential processing of items (loops over external operations, batch mutations) where one item throwing aborts all remaining items — wrap per-item operations in try/catch with logging so partial progress is preserved and failures are isolated
  - Side effects during React render (setState, navigation, mutations outside useEffect)

  **Error handling** _[applies when: code has try/catch, .catch, error responses, or external calls]_
- - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints. Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
- - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures
+ - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints — when multiple endpoints make the same access-control decision (e.g., "resource exists but caller lacks access"), they must return the same HTTP status (typically 404 to avoid leaking existence). Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
+ - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures. Includes external service wrappers that return `null`/empty for all non-success responses — collapsing configuration errors (missing API key), auth failures (403), rate limits (429), and server errors (5xx) into a single "not found" return masks outages and misconfiguration as normal "no match" results. Distinguish retriable from non-retriable failures and surface infrastructure errors loudly
  - Destructive operations in retry/cleanup paths assumed to succeed without their own error handling — if cleanup fails, retry logic crashes instead of reporting the intended failure
  - External service calls without configurable timeouts — a hung downstream service blocks the caller indefinitely
  - Missing fallback behavior when downstream services are unavailable (see also: retry without backoff in "Sync & replication")

  **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
  - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
- - Deletion/destroy functions that clean up the primary resource but leave orphaned secondary resources (data directories, git branches, child records, temporary files) — trace all resources created during the entity's lifecycle and verify each is removed on delete
+ - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry
  - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing

  **Validation & consistency** _[applies when: code handles user input, schemas, or API contracts]_
  - API versioning: breaking changes to public endpoints without version bump or deprecation path
  - Backward-incompatible response shape changes without client migration plan
- - Backward compatibility breaking changes — renamed/removed config keys, changed file formats, altered DB schemas, modified event payloads, or restructured persisted data (localStorage, files, database rows) without a migration path or fallback that reads the old format. Trace all consumers of the changed contract (other services, CLI versions, stored data) and verify they still work or have an upgrade path. For schema changes, require a migration script; for config/format changes, support both old and new formats during a transition period or provide a one-time converter
+ - Backward compatibility breaking changes — renamed/removed config keys, changed file formats, altered DB schemas, modified event payloads, renamed URL routes/paths, or restructured persisted data (localStorage, files, database rows) without a migration path or fallback that reads the old format. For route/URL renames, add redirects from old paths to preserve bookmarks and external links. Trace all consumers of the changed contract (other services, CLI versions, stored data) and verify they still work or have an upgrade path. For schema changes, require a migration script; for config/format changes, support both old and new formats during a transition period or provide a one-time converter
  - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation
  - When a validation/sanitization function is introduced for a field, trace ALL write paths (create, update, sync, import) — partial application means invalid values re-enter through the unguarded path
  - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
  - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
- -
- - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use
+ - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in)
+ - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible
+ - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
  - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); silent operations in verbose sequences where all branches should print status
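The missing-value check is worth a tiny sketch: one shared predicate instead of per-layer definitions (the record shape is invented):

```typescript
// A single definition of "missing", used by queries, updates, and UI.
function isMissing(v: unknown): boolean {
  return v == null || (typeof v === "string" && v.trim() === "");
}

const records = [
  { email: null },
  { email: "" },
  { email: "   " },
  { email: "a@b.example" },
];

// A layer that only checks null/undefined disagrees with one that also
// treats empty/whitespace strings as missing — records get skipped by
// one path but processed by another.
const nullOnly = records.filter((r) => r.email == null).length;
const shared = records.filter((r) => isMissing(r.email)).length;
console.log(nullOnly, shared); // 1 3
```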
@@ -89,6 +94,7 @@
|
|
|
89
94
|
- Database triggers clobbering explicitly-provided values; auto-incrementing columns that only increment on INSERT, not UPDATE
|
|
90
95
|
- Full-text search with strict parsers (`to_tsquery`) on user input — use `websearch_to_tsquery` or `plainto_tsquery`
|
|
91
96
|
- Dead queries (results never read), N+1 patterns inside transactions, O(n²) algorithms on growing data
+ - Performance optimizations in query/search loops (early exits, capped per-item limits, break-on-first-match) that silently reduce correctness — verify the optimization preserves the same result set as the unoptimized path, especially for dedup/nearest-match queries where stopping early can miss closer or more appropriate results
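The nearest-match failure mode above, reduced to one axis (toy data, illustrative names): break-on-first-match returns the first candidate inside the radius, not the closest one.

```javascript
// Two candidates; "b" is closer to the target but "a" is scanned first.
const points = [{ id: "a", x: 9 }, { id: "b", x: 2 }];

function firstWithinRadius(points, target, radius) {
  for (const p of points) {
    if (Math.abs(p.x - target) <= radius) return p; // early exit
  }
  return null;
}

function nearest(points, target) {
  let best = null;
  for (const p of points) { // scans everything, keeps the minimum distance
    if (!best || Math.abs(p.x - target) < Math.abs(best.x - target)) best = p;
  }
  return best;
}

console.log(firstWithinRadius(points, 0, 10).id); // "a", first match
console.log(nearest(points, 0).id);               // "b", actual nearest
```

The early-exit version is faster but answers a different question; the review check is whether the two result sets are provably the same for the data at hand.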
- `CREATE TABLE IF NOT EXISTS` as sole migration strategy — won't add columns/indexes on upgrade. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or a migration framework
- Functions/extensions requiring specific database versions without verification
- Migrations that lock tables for extended periods (ADD COLUMN with default on large tables, CREATE INDEX without CONCURRENTLY) — use concurrent operations or batched backfills
@@ -96,7 +102,8 @@

**Sync & replication** _[applies when: code uses pagination, batch APIs, or data sync]_
- Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
- - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results
+ - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results. When a data store applies query limits before filter expressions, a fixed multiplier on the limit still under-fetches — loop with continuation tokens until the target count of post-filter results is collected
+ - Pagination cursors derived from the last *scanned* item rather than the last *returned* item — if accumulated results are trimmed (e.g., sliced to a page size), the cursor advances past items that were fetched but never delivered, causing permanent skips
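Both pagination bullets above can be sketched together. This is an illustrative model, not package code: `fetchPage` stands in for a store that applies `limit` before the filter expression (DynamoDB-style), and the cursor is derived from the last item actually returned.

```javascript
// Loop with continuation cursors until pageSize post-filter results are
// collected, instead of multiplying the limit and hoping it is enough.
function paginate(fetchPage, filter, pageSize, startCursor = null) {
  const results = [];
  let cursor = startCursor;
  do {
    const page = fetchPage(cursor, pageSize); // limit applies pre-filter
    for (const item of page.items) {
      if (filter(item)) results.push(item);
      if (results.length === pageSize) {
        // Cursor from the last RETURNED item; advancing to the last
        // scanned item would permanently skip the trimmed remainder.
        return { items: results, next: item.id };
      }
    }
    cursor = page.next;
  } while (cursor !== null);
  return { items: results, next: null };
}

// Toy backing store: ids 0..9, filter keeps the even ones.
const data = Array.from({ length: 10 }, (_, i) => ({ id: i, even: i % 2 === 0 }));
const fetchPage = (cursor, limit) => {
  const start = cursor === null ? 0 : data.findIndex(d => d.id === cursor) + 1;
  const items = data.slice(start, start + limit);
  return { items, next: items.length < limit ? null : items[items.length - 1].id };
};

const page1 = paginate(fetchPage, d => d.even, 3);
console.log(page1.items.map(d => d.id)); // [ 0, 2, 4 ]
console.log(page1.next);                 // 4, last returned item, so 5..9 resume cleanly
```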
- Batch/paginated API calls (database batch gets, external service calls) that don't handle partial results — unprocessed items, continuation tokens, or rate-limited responses silently dropped. Add retry loops with backoff for unprocessed items
- Retry loops without backoff or max-attempt limits — tight loops under throttling extend latency indefinitely. Use bounded retries with exponential backoff/jitter
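A minimal sketch covering both retry bullets above. `sendBatch` is a hypothetical stand-in for a batch call that can return partial results (its `{ unprocessed }` shape is an assumption for illustration); attempts are capped so throttling cannot extend the loop indefinitely.

```javascript
// Bounded retries with exponential backoff plus jitter for a batch API
// whose responses may include unprocessed items.
async function batchWithRetry(sendBatch, items, maxAttempts = 5) {
  let pending = items;
  for (let attempt = 0; pending.length > 0 && attempt < maxAttempts; attempt++) {
    if (attempt > 0) {
      const base = 50 * 2 ** attempt;                    // exponential backoff
      const delay = base / 2 + Math.random() * (base / 2); // plus jitter
      await new Promise(resolve => setTimeout(resolve, delay));
    }
    const { unprocessed } = await sendBatch(pending);
    pending = unprocessed; // retry only what the service dropped
  }
  if (pending.length > 0) {
    throw new Error(`unprocessed after retries: ${pending.length}`);
  }
}
```

The jitter term spreads retries from concurrent callers so they do not re-throttle each other in lockstep.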

@@ -110,7 +117,7 @@
- Values crossing serialization boundaries may change format (arrays in JSON vs string literals in DB) — convert consistently
- Reads issued immediately after writes to an eventually consistent store (database scans, replica reads, cache refreshes) may return stale data — use consistent-read options, compute from in-memory state after confirmed writes, or document the eventual-consistency window
- BIGINT values parsed into JavaScript `Number` — precision lost past `MAX_SAFE_INTEGER`. Use strings or `BigInt`
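The BIGINT bullet above, demonstrated: past `Number.MAX_SAFE_INTEGER` (2^53 − 1), adjacent integers collapse onto the same `Number`, so a database id silently changes value in transit.

```javascript
// MAX_SAFE_INTEGER is 9007199254740991; this id is two past it.
const id = "9007199254740993";

console.log(Number(id) === 9007199254740992); // true — off by one, silently
console.log(BigInt(id));                      // 9007199254740993n — exact

// JSON.parse has the same trap for numeric literals; keep big ids as
// strings on the wire, or route large integers through BigInt on parse.
```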
- - Data model key/index design that doesn't support required query access patterns — e.g., claiming "recent" ordering but using non-time-sortable keys (random UUIDs, user IDs). Verify sort keys and indexes can serve the queries the code performs without full-partition scans and in-memory sorting
+ - Data model key/index design that doesn't support required query access patterns — e.g., claiming "recent" ordering but using non-time-sortable keys (random UUIDs, user IDs). Verify sort keys and indexes can serve the queries the code performs without full-partition scans and in-memory sorting. When a new write path creates or associates an entity through a different attribute than the primary index (e.g., adding co-owners to an array field when the discovery index queries a single-owner scalar field), verify existing listing/discovery queries can surface the new association — otherwise the new data is persisted but undiscoverable

**Shell & portability** _[applies when: code spawns subprocesses, uses shell scripts, or builds CLI tools]_
- Subprocess calls under `set -e` abort on failure; non-critical writes fail on broken pipes — use `|| true` for non-critical output
@@ -131,12 +138,14 @@
## Tier 4 — Always Check (Quality, Conventions, AI-Generated Code)

**Intent vs implementation**
- - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes,
+ - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, an action labeled "migrated" that never creates the target, or UI actions offered for entity states where the transition is invalid (e.g., a "Reject" button on already-rejected items)
- Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
- - Cross-references between files (identifiers, parameter names, format conventions, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them
+ - Cross-references between files (identifiers, parameter names, format conventions, version numbers, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them. This includes internal identifiers (route paths, file names, component names) that should be renamed when the concept they represent is renamed — a nav label saying "Rejected" pointing to `/admin/flagged` or a component named `FlaggedList` rendering rejected items creates maintenance confusion. For releases, verify version consistency across all versioned artifacts (package manifests, lockfiles, API specs, changelogs, PR metadata). Also applies to field-set enumerations: when an operation targets a set of entity fields, every predicate, filter expression, scan criteria, API doc, and UI conditional that enumerates those fields must stay in sync — an independently maintained list that omits a field causes silent skips or false positives
- Template/workflow variables referenced (`{VAR_NAME}`) but never assigned — trace each placeholder to a definition step; undefined variables cause silent failures or confusing instructions. Also check for colliding identifiers (two distinct concepts mapped to the same slug, key, or name)
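A checker for the placeholder bullet above can be a few lines (illustrative sketch; the `{VAR_NAME}` format matches this package's templates, the function name is hypothetical):

```javascript
// Report {VAR_NAME}-style placeholders that are referenced in a template
// but never defined anywhere in the workflow.
function undefinedPlaceholders(template, defined) {
  const referenced = [...template.matchAll(/\{([A-Z_]+)\}/g)].map(m => m[1]);
  return [...new Set(referenced)].filter(name => !defined.has(name));
}

const template = "PR: {PR_NUMBER} in {OWNER}/{REPO}, build: {BUILD_CMD}";
const defined = new Set(["PR_NUMBER", "OWNER", "REPO"]);
console.log(undefinedPlaceholders(template, defined)); // [ 'BUILD_CMD' ]
```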
- Responsibility relocated from one module to another (e.g., writes moved from handler to middleware) without updating all consumers that depended on the old location's timing, return value, or side effects — trace callers that relied on the synchronous or co-located behavior and verify they still work with the new execution point. Remove dead code left behind at the old location
- Sequential instructions or steps whose ordering doesn't match the required execution order — readers following in order will perform actions at the wrong time (e.g., "record X" in step 2 when X must be captured before step 1's action)
+ - Constraints, limits, or guardrails described in a preamble or summary that are not enforced by an explicit condition in the procedural steps below — the description promises safety but the steps don't implement it. Add an explicit check/exit condition tied to the stated constraint
+ - Duplicate or contradictory items in sequential lists — copy/paste producing two entries for the same case with conflicting instructions. Deduplicate and reconcile
- Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
- Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
- Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
@@ -176,8 +185,11 @@
- Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
- Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
- Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
+ - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
+ - Test mock state leaking between tests — mock setup APIs that configure return values often persist across tests even after clearing call history, because "clear" resets invocation counts but not configured behavior (use "reset" variants that restore original implementations). Conversely, per-call sequential mock responses couple tests to internal call count — prefer stable return values for behavior tests, sequential mocks only when verifying call order
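The clear-versus-reset distinction above, shown with a hand-rolled mock so the mechanism is visible (illustrative code; in Jest terms this is `mockClear()` versus `mockReset()`/`mockRestore()`):

```javascript
// "clear" wipes call history only; a return value configured in test A
// survives it and leaks into test B. "reset" also restores the default
// implementation.
function makeMock(defaultImpl) {
  const mock = (...args) => { mock.calls.push(args); return mock.impl(...args); };
  mock.calls = [];
  mock.impl = defaultImpl;
  mock.mockReturnValue = v => { mock.impl = () => v; };
  mock.clear = () => { mock.calls = []; };                        // history only
  mock.reset = () => { mock.calls = []; mock.impl = defaultImpl; }; // history + behavior
  return mock;
}

const fetchUser = makeMock(() => "real");
fetchUser.mockReturnValue("stubbed"); // test A configures behavior

fetchUser.clear();                    // "clearing" between tests...
console.log(fetchUser());             // "stubbed" — behavior leaked into test B

fetchUser.reset();
console.log(fetchUser());             // "real"
```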
- Tests that pass but don't cover the changed code paths — passing unrelated tests is not validation

**Style & conventions**
- Naming and patterns consistent with the rest of the codebase
- - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure
+ - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure. For structured files that follow a convention across sibling files (changelogs, config files, migration files), verify new entries use the same section headers, field names, and ordering as existing siblings
+ - Shell/workflow instructions with destructive operations (branch deletion, file removal, force operations) must verify preconditions first — e.g., ensure you're not on a branch being deleted, confirm the target exists, and don't suppress stderr from commands where failures indicate real problems (auth errors, network issues)
package/lib/copilot-review-loop.md
CHANGED
@@ -12,7 +12,9 @@ You are a Copilot review loop agent.
PR: {PR_NUMBER} in {OWNER}/{REPO}
Branch: {BRANCH_NAME}
Build command: {BUILD_CMD}
- Max iterations:
+ Max iterations: unlimited (loop until Copilot returns 0 comments)
+ Safety guardrail: after 10 iterations, report back and ask the user
+ whether to continue or stop — never loop indefinitely without confirmation.

TIMEOUT SCHEDULE:
When running parallel PR reviews (do:better), use shorter waits to avoid
@@ -28,8 +30,7 @@ that (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take
10-15 minutes for large diffs.
Poll interval: 30 seconds for all iterations.

- Run the following loop until Copilot returns zero new comments
- the max iteration limit:
+ Run the following loop until Copilot returns zero new comments:

1. CAPTURE the latest Copilot review submittedAt timestamp (so you can
detect when a NEW review arrives):
@@ -75,10 +76,14 @@ the max iteration limit:
- Resolve the thread via GraphQL mutation using stdin JSON piping:
echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
- After all threads resolved, push all commits to remote
- - Increment iteration counter
+ - Increment iteration counter
+ - If iteration counter reaches 10, stop the loop and report back with
+ status "guardrail" — the parent agent will ask the user whether to
+ continue or stop
+ - Otherwise, go back to step 1

When done, report back:
- - Final status: clean /
+ - Final status: clean / timeout / error / guardrail
- Total iterations completed
- List of commits made (if any)
- Any unresolved threads remaining
package/package.json
CHANGED