slash-do 1.5.1 → 1.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,5 +1,5 @@
  ---
- description: Unified DevSecOps audit, remediation, per-category PRs, CI verification, and Copilot review loop with worktree isolation
+ description: Unified DevSecOps audit, remediation, test enhancement, per-category PRs, CI verification, and Copilot review loop with worktree isolation
  argument-hint: "[--scan-only] [--no-merge] [path filter or focus areas]"
  ---
 
@@ -58,6 +58,9 @@ When compacting during this workflow, always preserve:
  - All PR numbers and URLs created so far
  - `BUILD_CMD`, `TEST_CMD`, `PROJECT_TYPE`, `WORKTREE_DIR` values
  - `VCS_HOST`, `CLI_TOOL`, `DEFAULT_BRANCH`, `CURRENT_BRANCH`
+ - `PHASE_4C_START_SHA` (needed for FILE_OWNER_MAP update in Phase 4c.3)
+ - `VACUOUS_TESTS_FIXED`, `WEAK_TESTS_STRENGTHENED`, `NEW_TEST_CASES`, `NEW_TEST_FILES`
+ - `CREATED_CATEGORY_SLUGS` (list of branch slugs created in Phase 5)
 
 
  ## Phase 0: Discovery & Setup
@@ -173,9 +176,31 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  - **Database migrations**: exclusive-lock ALTER TABLE on large tables, CREATE INDEX without CONCURRENTLY, missing down migrations or untested rollback paths
  - General: framework-specific security issues, language-specific gotchas, domain-specific compliance, environment variable hygiene (missing `.env.example`, required env vars not validated at startup, secrets in config files that should be in env)
 
- 7. **Test Coverage**
- Uses Batch 1 findings as context to prioritize:
- Focus: missing test files for critical modules, untested edge cases, tests that only cover happy paths, mocked dependencies that hide real bugs, areas with high complexity (identified by agents 1-5) but no tests, test files that don't actually assert anything meaningful
+ 7. **Test Quality & Coverage**
+ Uses Batch 1 findings as context to prioritize.
+ Focus areas:
+
+ **Coverage gaps:**
+ - Missing test files for critical modules, untested edge cases, tests that only cover happy paths
+ - Areas with high complexity (identified by agents 1-5) but no tests
+ - Remediation changes from agents 1-6 that lack corresponding test coverage
+
+ **Vacuous tests (tests that don't actually test anything):**
+ - Tests that assert on mocked return values instead of real behavior (testing the mock, not the code)
+ - Tests that only check truthiness (`assert.ok(result)`) when they should verify specific values or shapes
+ - Tests with assertions that can never fail (e.g., asserting a hardcoded value equals itself, asserting `typeof x === 'object'` on a literal `{}`)
+ - Tests that re-implement the logic under test instead of importing the real function — these pass even when real code regresses
+ - `it('should work', ...)` tests with no meaningful assertion or with assertions commented out
+ - Tests that mock the module they're testing (testing mock behavior, not real behavior)
+
+ **Weak test patterns:**
+ - Tests that verify implementation details (internal state, private methods, call counts) instead of observable behavior
+ - Tests where all assertions pass even if the function under test returns `null`/`undefined`/empty — verify by mentally substituting a no-op and checking if the test would still pass
+ - Integration tests that mock so aggressively they become unit tests of glue code
+ - Tests missing negative cases (invalid input, error paths, boundary conditions)
+ - Tests with shared mutable state between cases (`beforeEach` that doesn't reset, module-level variables)
+
+ Report each finding with a severity prefix `**[CRITICAL]**`, `**[HIGH]**`, `**[MEDIUM]**`, or `**[LOW]**` followed immediately by a quality prefix `[VACUOUS]`, `[WEAK]`, or `[MISSING]` (for example, `**[HIGH][VACUOUS]**`) to distinguish quality issues from coverage gaps while keeping the format consistent with other agents. Include the specific test name and file:line for existing test issues.
 
  Wait for ALL agents to complete before proceeding.
 
@@ -220,10 +245,18 @@ For each file touched by multiple categories, document why it was assigned to on
  ### Architecture & SOLID
  ### Bugs, Performance & Error Handling
  ### Stack-Specific
- ### Test Coverage (tracked, not auto-remediated)
+ ### Test Quality & Coverage
  ```
 
- 6. Print a summary table:
+ 6. Print a summary table (short labels → full category → branch slug):
+ - Security → Security & Secrets → `security`
+ - Code Quality → Code Quality & Style → `code-quality`
+ - DRY & YAGNI → DRY & YAGNI → `dry`
+ - Architecture → Architecture & SOLID → `architecture`
+ - Bugs & Perf → Bugs, Performance & Error Handling → `bugs-perf`
+ - Stack-Specific → Stack-Specific → `stack-specific`
+ - Tests → Test Quality & Coverage → `tests`
+
  ```
  | Category | CRITICAL | HIGH | MEDIUM | LOW | Total |
  |-------------------|----------|------|--------|-----|-------|
@@ -233,7 +266,7 @@ For each file touched by multiple categories, document why it was assigned to on
  | Architecture | ... | ... | ... | ... | ... |
  | Bugs & Perf | ... | ... | ... | ... | ... |
  | Stack-Specific | ... | ... | ... | ... | ... |
- | Test Coverage | ... | ... | ... | ... | ... |
+ | Tests | ... | ... | ... | ... | ... |
  | TOTAL | ... | ... | ... | ... | ... |
  ```
 
@@ -241,7 +274,7 @@ For each file touched by multiple categories, document why it was assigned to on
 
  ## Phase 3: Worktree Remediation
 
- Only proceed with CRITICAL, HIGH, and MEDIUM findings. LOW and Test Coverage findings remain tracked in PLAN.md but are not auto-remediated.
+ Only proceed with CRITICAL, HIGH, and MEDIUM findings for code remediation. LOW findings remain tracked in PLAN.md but are not auto-remediated. Test Quality & Coverage findings are handled separately in Phase 4c.
 
  ### 3a: Setup
 
@@ -349,21 +382,126 @@ Before creating PRs, run a deep code review on all remediation changes to catch
  ```
  5. If "Show diff" selected, print the diff and re-ask. If "Abort", stop and print the worktree path.
 
+ ## Phase 4c: Test Enhancement
+
+ After internal code review passes, evaluate and enhance the project's test suite. This phase acts on Agent 7's findings AND ensures all remediation work from Phase 3 has proper test coverage.
+
+ ### 4c.0: Record Start SHA
+
+ Before any test enhancement commits, capture the current HEAD so Phase 4c changes can be diffed later:
+ ```bash
+ cd {WORKTREE_DIR}
+ PHASE_4C_START_SHA="$(git rev-parse HEAD)"
+ ```
+
+ ### 4c.1: Test Audit Triage
+
+ Review Agent 7 findings from Phase 1 and categorize them:
+
+ 1. **`[VACUOUS]` findings** — tests that exist but don't test real behavior. These are the highest priority because they create a false sense of safety.
+ 2. **`[WEAK]` findings** — tests that partially cover behavior but miss important cases. Strengthen with additional assertions and edge cases.
+ 3. **`[MISSING]` findings** — no tests exist for critical paths. Write new test files or add test cases to existing files.
+
+ Additionally, scan all remediation changes from Phase 3:
+ - For each file modified by remediation agents, check if corresponding tests exist
+ - If tests exist, verify they cover the specific behavior that was fixed/changed
+ - If no tests exist for a remediated module, flag for new test creation
+
+ ### 4c.2: Test Enhancement Execution
+
+ Spawn a general-purpose agent (using `REMEDIATION_MODEL`) in the worktree to fix and write tests. Populate the template placeholders below from Phase 4c.1 triage output: `{VACUOUS_AND_WEAK_FINDINGS}` from `[VACUOUS]`/`[WEAK]` findings, `{MISSING_FINDINGS}` from `[MISSING]` findings, and `{REMEDIATED_FILES_WITHOUT_TESTS}` from the remediation-change scan. The agent instructions:
+
+ ```
+ You are a test enhancement agent working in {WORKTREE_DIR}.
+ Project type: {PROJECT_TYPE}. Test command: {TEST_CMD}.
+
+ Your job is to fix weak/vacuous tests and write missing tests that verify REAL BEHAVIOR.
+
+ ## Rules for writing good tests
+
+ 1. **Test observable behavior, not implementation.** Assert on return values, side effects (files written, state changed), and error messages — never on internal variable names, call counts, or private method invocations.
+
+ 2. **Every assertion must be falsifiable.** For each assertion you write, mentally substitute a broken implementation (returns null, returns wrong value, throws instead of succeeding, succeeds instead of throwing). If your assertion would still pass, it's vacuous — rewrite it.
+
+ 3. **Prefer real modules over mocks.** Only mock at system boundaries (filesystem, network, time). If you must mock, assert on the arguments passed TO the mock, not on its return value.
+
+ 4. **Test the edges.** Each test function needs at minimum:
+ - Happy path with specific expected output
+ - Empty/null/undefined input
+ - Invalid input that should error
+ - Boundary values (0, -1, MAX, empty string vs null)
+
+ 5. **Use concrete expected values.** `assert.equal(result, 'expected string')` not `assert.ok(result)`. `assert.deepEqual(output, { key: 'value' })` not `assert.ok(typeof output === 'object')`.
+
+ 6. **One behavior per test.** Each `it()` block tests exactly one scenario. The test name describes the scenario and expected outcome.
+
+ 7. **No shared mutable state.** Each test must be independently runnable. Use `beforeEach` to create fresh fixtures. Never rely on test execution order.
+
+ ## Task list
+
+ Fix these vacuous/weak tests:
+ {VACUOUS_AND_WEAK_FINDINGS}
+
+ Write tests for these gaps:
+ {MISSING_FINDINGS}
+
+ Write tests for these remediated files:
+ {REMEDIATED_FILES_WITHOUT_TESTS}
+
+ ## Verification
+
+ After writing/fixing each test file:
+ 1. Run `{TEST_CMD}` to verify all tests pass
+ 2. For each NEW test, verify that it fails when the behavior under test is wrong:
+ - Stage your test changes so they are protected: `git add path/to/test_file*`
+ - Confirm your staged diff only includes the intended test changes: `git diff --cached`
+ - Confirm there are no other unstaged changes in the worktree: `git diff` is clean
+ - Apply a small, obvious, and **uncommitted** change to the code under test (e.g., return a constant, flip a conditional)
+ - Run `{TEST_CMD}` and confirm the new test FAILS
+ - Immediately restore only the temporary code change (do **not** touch the staged tests), for example:
+ - `git restore path/to/code_under_test` **or**
+ - `git checkout HEAD -- path/to/code_under_test`
+ - Confirm the worktree has no remaining unstaged changes (`git diff` shows no changes) and that your staged test changes are still present (`git diff --cached`)
+ This is the key quality gate — a test that does not fail when the code is broken is worthless.
+ 3. After confirming the temporary code change is reverted and only the intended test changes are staged, commit the passing tests: `test: {description of what's tested}`
+ ```
+
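The fail-when-broken gate in the instructions above can be sketched as a shell sequence (a sketch only — the file paths and the `sed` breakage edit are illustrative, and `$TEST_CMD` stands in for the project's real test command):

```shell
# Illustrative sequence; paths are hypothetical and TEST_CMD is a placeholder.
git add test/slugify.test.js        # stage the new test so it is protected
git diff --cached                   # staged diff shows only the intended test changes
git diff --exit-code                # worktree otherwise clean before the mutation

# Temporary, uncommitted breakage of the code under test:
sed -i 's/return normalized/return "broken"/' src/slugify.js

if $TEST_CMD; then
  echo "WARNING: new test still passes against broken code -- it is vacuous" >&2
fi

git restore src/slugify.js          # revert only the code mutation; staged tests untouched
git diff --exit-code                # confirm the mutation is gone
git diff --cached                   # confirm the staged test changes survived
```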
+ ### 4c.3: Verification
+
+ After the test agent completes:
+
+ 1. Run the full test suite:
+ ```bash
+ cd {WORKTREE_DIR} && {TEST_CMD}
+ ```
+ 2. If tests fail, fix in a new commit
+ 3. Count new/fixed tests and record four variables:
+ - `VACUOUS_TESTS_FIXED` — number of vacuous tests fixed
+ - `WEAK_TESTS_STRENGTHENED` — number of weak tests strengthened
+ - `NEW_TEST_CASES` — number of new test cases added
+ - `NEW_TEST_FILES` — number of new test files created
+ 4. **Update `FILE_OWNER_MAP`** — Phase 4c may have created or modified test files that were not in the Phase 2 map. Before Phase 5 assembles branches:
+ - List all files changed by Phase 4c commits: `git diff --name-only "$PHASE_4C_START_SHA"..HEAD`
+ - For each file not already in `FILE_OWNER_MAP`, assign it to the `tests` category
+ - For each file already owned by another category, leave it in that category (co-located test changes ship with the code they test — the `tests` branch only contains standalone test files not owned by other categories)
+
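Step 4's map update can be sketched as follows (a sketch only — `owner_of` and `assign_owner` are hypothetical helpers standing in for however `FILE_OWNER_MAP` is actually represented):

```shell
# Hypothetical helpers: owner_of prints a file's category or exits non-zero;
# assign_owner records a file-to-category assignment.
for f in $(git diff --name-only "$PHASE_4C_START_SHA"..HEAD); do
  if ! owner_of "$f" >/dev/null 2>&1; then
    assign_owner "$f" tests       # unowned Phase 4c files go to the tests category
  fi                              # files owned by another category keep that owner
done
```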
  ## Phase 5: Per-Category PR Creation
 
  Instead of one mega PR, create **separate branches and PRs for each category**. This enables independent review, targeted CI, and granular merge decisions.
 
  ### 5a: Build the Category Branches
 
- Using the `FILE_OWNER_MAP` from Phase 2, create one branch per category:
+ Using the `FILE_OWNER_MAP` from Phase 2 (updated in Phase 4c.3), create one branch per category.
+
+ Initialize `CREATED_CATEGORY_SLUGS=""` (empty space-delimited string). After each category branch is successfully created and pushed below, append its slug: `CREATED_CATEGORY_SLUGS="$CREATED_CATEGORY_SLUGS {CATEGORY_SLUG}"`. Phase 7 uses this as the set of candidate branches for cleanup; when deleting branches, either run cleanup only after all desired merges are complete or explicitly verify that each branch in `CREATED_CATEGORY_SLUGS` has been merged before deleting it.
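The slug tracking described above amounts to simple string accumulation; a minimal runnable sketch (the branch creation and push steps are elided):

```shell
CREATED_CATEGORY_SLUGS=""
# ...after the security branch is created and pushed successfully:
CREATED_CATEGORY_SLUGS="$CREATED_CATEGORY_SLUGS security"
# ...after the tests branch is created and pushed successfully:
CREATED_CATEGORY_SLUGS="$CREATED_CATEGORY_SLUGS tests"

# Phase 7 later iterates the accumulated list:
for slug in $CREATED_CATEGORY_SLUGS; do
  echo "cleanup candidate: better/$slug"
done
```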
 
  For each category that has findings:
  1. Switch to `{DEFAULT_BRANCH}`: `git checkout {DEFAULT_BRANCH}`
  2. Create a category branch: `git checkout -b better/{CATEGORY_SLUG}`
- - Use slugs: `security`, `code-quality`, `dry`, `arch-bugs`, `stack-specific`
+ - Use slugs: `security`, `code-quality`, `dry`, `architecture`, `bugs-perf`, `stack-specific`, `tests`
  3. For each file assigned to this category in `FILE_OWNER_MAP`:
- - **Modified files**: `git checkout origin/better/{DATE} -- {file_path}`
- - **New files (Added)**: `git checkout origin/better/{DATE} -- {file_path}`
+ - **Modified files**: `git checkout better/{DATE} -- {file_path}`
+ - **New files (Added)**: `git checkout better/{DATE} -- {file_path}`
  - **Deleted files**: `git rm {file_path}`
  4. Commit all staged changes with a descriptive message:
  ```bash
@@ -475,7 +613,7 @@ After creating all PRs, verify CI passes on each one:
 
  ## Phase 6: Copilot Review Loop (GitHub only)
 
- Maximum 5 iterations per PR to prevent infinite loops.
+ Loop until Copilot returns zero new comments; there is no fixed per-PR iteration limit. Instead, each sub-agent enforces a 10-iteration guardrail: at iteration 10 it stops and returns a "guardrail" status, and the parent agent asks the user whether to continue or stop.
 
  **Sub-agent delegation** (prevents context exhaustion): delegate each PR's review loop to a **separate general-purpose sub-agent** via the Agent tool. Launch sub-agents in parallel (one per PR). Each sub-agent runs the full loop (request → wait → check → fix → re-request) autonomously and returns only the final status.
 
@@ -493,9 +631,9 @@ Launch all PR sub-agents in parallel. Wait for all to complete.
 
  For each sub-agent result:
  - **clean**: mark PR as ready to merge
- - **timeout**: ask the user whether to continue waiting, re-request, or skip
- - **max-iterations-reached**: inform the user "Reached max review iterations (5) on PR #{number}. Remaining issues may need manual review."
+ - **timeout**: inform the user "Copilot review timed out on PR #{number}." and ask whether to continue waiting, re-request, or skip
  - **error**: inform the user and ask whether to retry or skip
+ - **guardrail**: the sub-agent hit the 10-iteration limit; ask the user whether to continue with more iterations or stop
 
  ### 6.3: Merge Gate (MANDATORY)
 
@@ -546,16 +684,17 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  ```bash
  git worktree remove {WORKTREE_DIR}
  ```
- 2. Delete local AND remote branches (only if merged):
+ 2. Delete the local staging branch and per-category branches (local + remote). Use the tracked list of branches from Phase 5 rather than a fixed list:
  ```bash
- git branch -d better/{DATE}
- git branch -d better/security better/code-quality better/dry better/arch-bugs better/stack-specific
+ git checkout {DEFAULT_BRANCH}
+ git branch -D better/{DATE}
+ # CREATED_CATEGORY_SLUGS is a space-delimited string, e.g. "security code-quality tests"
+ for slug in $CREATED_CATEGORY_SLUGS; do
+ git branch -d "better/$slug" || echo "warning: local branch better/$slug not found or not fully merged — skipping (use -D to force)"
+ git push origin --delete "better/$slug" || echo "warning: remote branch better/$slug not found or already deleted"
+ done
  ```
- ```bash
- git push origin --delete better/{DATE}
- git push origin --delete better/security better/code-quality better/dry better/arch-bugs better/stack-specific
- ```
- Ignore errors from `--delete` if a branch doesn't exist remotely.
+ `-D` (force delete) is used only for the staging branch `better/{DATE}` because it is intentionally unmerged — its file contents are cherry-picked into category branches. Category branches use `-d` (safe delete) so that unmerged work is not accidentally lost; if a category branch was not merged, the warning will surface it. The guards prevent errors from interrupting cleanup.
  3. Restore stashed changes (if stashed in Phase 3a):
  ```bash
  git stash pop
@@ -575,8 +714,14 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  | Architecture | ... | ... | ... | #number | pass | approved |
  | Bugs & Perf | ... | ... | ... | #number | pass | approved |
  | Stack-Specific | ... | ... | ... | #number | pass | approved |
- | Test Coverage | ... | (tracked only) | ... | | |
+ | Tests | ... | ... | ... | #number | pass | approved |
  | TOTAL | ... | ... | ... | N PRs | | |
+
+ Test Enhancement Stats:
+ - Vacuous tests fixed: {VACUOUS_TESTS_FIXED}
+ - Weak tests strengthened: {WEAK_TESTS_STRENGTHENED}
+ - New test cases added: {NEW_TEST_CASES}
+ - New test files created: {NEW_TEST_FILES}
  ```
 
  ## Error Recovery
@@ -602,6 +747,7 @@ If merge fails (e.g., branch protection, merge conflicts from a prior PR):
  - Each file appears in exactly ONE PR (file ownership map) to prevent merge conflicts between PRs
  - When extracting modules, always add backward-compatible re-exports in the original module to prevent cross-PR breakage
  - Version bump happens exactly once on the first category branch based on aggregate commit analysis
- - Only CRITICAL, HIGH, and MEDIUM findings are auto-remediated; LOW and Test Coverage remain tracked in PLAN.md
+ - Only CRITICAL, HIGH, and MEDIUM findings are auto-remediated for code categories; LOW findings remain tracked in PLAN.md
+ - Test Quality & Coverage findings are remediated in Phase 4c with a dedicated test enhancement agent that verifies tests fail when code is broken
  - GitLab projects skip the Copilot review loop entirely (Phase 6) and stop after MR creation
  - CI must pass on each PR before requesting Copilot review or merging
@@ -94,6 +94,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Cross-file consistency**
  - If a new function/endpoint follows a pattern from an existing similar one, verify ALL aspects match (validation, error codes, response shape, cleanup). Partial copying is the #1 source of review feedback.
  - New API client functions should use the same encoding/escaping as existing ones (e.g., if other endpoints use `encodeURIComponent`, new ones must too)
+ - If the PR adds a new endpoint, trace where existing endpoints are registered and verify the new one is wired in all runtime adapters (serverless handler map, framework route file, API gateway config, local dev server) — a route registered in one adapter but missing from another will silently 404 in the missing runtime
+ - If the PR adds a new call to an external service that has established mock/test infrastructure (mock mode flags, test helpers, dev stubs), verify the new call uses the same patterns — bypassing them makes the new code path untestable in offline/dev environments and inconsistent with existing integrations
+ - If the PR adds a new UI component or client-side consumer against an existing API endpoint, read the actual endpoint handler or response shape — verify every field name, nesting level, identifier property, and response envelope path used in the consumer matches what the producer returns. This is the #1 source of "renders empty" bugs in new views built against existing APIs
 
  **Error path completeness**
  - Trace each error path end-to-end: does the error reach the user with a helpful message and correct HTTP status? Or does it get swallowed, logged silently, or surface as a generic 500?
@@ -148,8 +151,9 @@ Check every file against this checklist. The checklist is organized into tiers
  **Data model vs access pattern alignment**
  - If the PR adds queries that claim ordering (e.g., "recent", "top"), verify the underlying key/index design actually supports that ordering natively — random UUIDs and non-time-sortable keys require full scans and in-memory sorting, which degrades at scale
 
- **Deletion/lifecycle cleanup completeness**
+ **Deletion/lifecycle cleanup and aggregate reset completeness**
  - If the PR adds a delete or destroy function, trace all resources created during the entity's lifecycle (data directories, git branches, child records, temporary files, worktrees) and verify each is cleaned up on deletion. Compare with existing delete functions in the codebase for completeness patterns
+ - If the PR adds a state transition that resets an aggregate value (counter, score, flag count), trace all individual records that contribute to that aggregate and verify they are also cleared, archived, or versioned — a reset counter with stale contributing records causes inconsistency and blocks duplicate-prevention checks on re-entry
 
  **Update schema depth**
  - If the PR derives an update/patch schema from a create schema (e.g., `.partial()`, `Partial<T>`), verify that nested objects also become partial — shallow partial on deeply-required schemas rejects valid partial updates where the caller only wants to change one nested field
@@ -163,9 +167,28 @@ Check every file against this checklist. The checklist is organized into tiers
  **Read-after-write consistency**
  - If the PR writes to a data store and then immediately queries that store (especially scans, aggregations, or replica reads), check whether the store's consistency model guarantees visibility of the write. If not, flag the read as potentially stale and suggest computing from in-memory state, using consistent-read options, or adding a delay/caveat
 
+ **Security-sensitive configuration parsing**
+ - If the PR reads environment variables or config values that affect security behavior (proxy trust depth, rate limit thresholds, CORS origins, token expiry), verify the parsing enforces the expected type and range — e.g., integer-only via `parseInt` with `Number.isInteger` check, non-negative bounds, and a logged fallback to a safe default on invalid input. `Number()` on arbitrary strings accepts floats, negatives, and empty-string-as-zero, all of which can silently weaken security controls
+
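The integer-only parse described in the bullet above can be sketched like this (a sketch only — the function and variable names are illustrative, not from the package):

```javascript
// Parse a security-sensitive integer config value with type and range enforcement.
function parseTrustedProxyDepth(raw, fallback = 1) {
  const n = Number.parseInt(raw, 10);
  // Reject NaN, negatives, and inputs that parseInt silently truncated (e.g. "2.5" -> 2).
  if (!Number.isInteger(n) || n < 0 || String(n) !== String(raw).trim()) {
    console.warn(`invalid proxy depth ${JSON.stringify(raw)}; using ${fallback}`);
    return fallback;
  }
  return n;
}

parseTrustedProxyDepth('2');    // accepted
parseTrustedProxyDepth('2.5');  // rejected: parseInt yields 2 but "2" !== "2.5"
parseTrustedProxyDepth('');     // rejected: NaN
parseTrustedProxyDepth('-1');   // rejected: negative
```

By contrast, `Number('2.5')` is `2.5` and `Number('')` is `0`, which is why the bullet warns against bare `Number()` here.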
+ **Multi-source data aggregation**
+ - If the PR aggregates items from multiple sources into a single collection (merging accounts, combining API results, flattening caches), verify each item retains its source identifier through the aggregation — downstream operations that need to route back to the correct source (updates, deletes, detail views) will silently break or operate on the wrong source if the origin is lost
+
+ **Field-set enumeration consistency**
+ - If the PR adds an operation that targets a set of entity fields (enrichment, validation, migration, sync), trace every other location that independently enumerates those fields — UI predicates, scan/query filters, API documentation, response shapes, and test assertions. Each must cover the same field set; a missed field causes silent skips or false UI state. Prefer deriving enumerations from a single source of truth (constant array, schema keys) over maintaining independent lists
+
+ **Abstraction layer fidelity**
+ - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
+
+ **Data model / status lifecycle changes**
+ - If the PR changes the set of valid statuses, enum values, or entity lifecycle states, sweep all dependent artifacts: API doc summaries and enum declarations, UI filter/tab options, conditional rendering branches (which actions to show per state), integration guide examples, route names derived from old status names, and test assertions. Each artifact that references the old value set must be updated — partial updates leave stale filters, invalid actions, and misleading documentation
+ - If the PR renames a concept (e.g., "flagged" → "rejected"), trace all manifestations beyond user-facing labels: route paths, component/file names, variable names, CSS classes, and test descriptions. Internal identifiers using the old name create confusion even when the UI is correct
+
  **Formatting & structural consistency**
  - If the PR adds content to an existing file (list items, sections, config entries), verify the new content matches the file's existing indentation, bullet style, heading levels, and structure — rendering inconsistencies are the most common Copilot review finding
 
+ **Query key / stored key precision alignment**
+ - If the PR adds queries that construct lookup keys with a different precision, encoding, or format than what the write path persists, the query will silently return zero matches. Trace the key construction in both write and read paths and verify they produce compatible values
+
  </deep_checks>
 
  <verify_findings>
@@ -17,13 +17,15 @@
  - Null/undefined access without guards, off-by-one errors, object spread of potentially-null values (spread of null is `{}`, silently discarding state)
  - Data from external/user sources (parsed JSON, API responses, file reads) used without structural validation — guard against parse failures, missing properties, wrong types, and null elements before accessing nested values. When parsed data is optional enrichment, isolate failures so they don't abort the main operation
  - Type coercion edge cases — `Number('')` is `0` not empty, `0` is falsy in truthy checks, `NaN` comparisons are always false; string comparison operators (`<`, `>`, `localeCompare`) do lexicographic, not semantic, ordering (e.g., `"10" < "2"`). Use explicit type checks (`Number.isFinite()`, `!= null`) and dedicated libraries (e.g., semver for versions) instead of truthy guards or lexicographic ordering when zero/empty are valid values or semantic ordering matters
- - Functions that index into arrays without guarding empty arrays; state/variables declared but never updated or only partially wired up
+ - Functions that index into arrays without guarding empty arrays; aggregate operations (`every`, `some`, `reduce`) on potentially-empty collections returning vacuously true/default values that mask misconfiguration or missing data; state/variables declared but never updated or only partially wired up
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
  - Functions with >10 branches or >15 cyclomatic complexity — refactor into smaller units
 
  **API & URL safety**
  - User-supplied or system-generated values interpolated into URL paths, shell commands, file paths, or subprocess arguments without encoding/validation — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries. Generated identifiers used as URL path segments must be safe for your router/storage (no `/`, `?`, `#`; consider allowlisting characters and/or applying `encodeURIComponent()`). Identifiers derived from human-readable names (slugs) used for namespaced resources (git branches, directories) need a unique suffix (ID, hash) to prevent collisions between entities with the same or similar names
  - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
+ - Parameterized/wildcard routes registered before specific named routes — the generic route captures requests meant for the specific endpoint (e.g., `/:id` registered before `/drafts` matches `/drafts` as `id="drafts"`). Verify route registration order or use path prefixes to disambiguate
+ - Stored or external URLs rendered as clickable links (`href`, `src`, `window.open`) without protocol validation — `javascript:`, `data:`, and `vbscript:` URLs execute in the user's browser. Allowlist `http:`/`https:` (and `mailto:` if needed) before rendering; for all other schemes, render as plain text or strip the value
  - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening
 
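A minimal sketch of the protocol allowlist described above (the function name is illustrative; note this simplification also rejects relative URLs, which a real implementation may want to resolve against a base first):

```javascript
const SAFE_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Return a safe href, or null when the caller should render plain text instead.
function safeHref(raw) {
  try {
    const url = new URL(raw);                 // throws on malformed or relative input
    return SAFE_PROTOCOLS.has(url.protocol) ? url.href : null;
  } catch {
    return null;
  }
}

safeHref('https://example.com/x');    // allowed
safeHref('javascript:alert(1)');      // null — would execute in the user's browser
safeHref('data:text/html,<b>x</b>');  // null — data: URLs can carry executable content
```

The `URL` parser lowercases the scheme, so `JAVASCRIPT:alert(1)` is caught as well.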
  **Trust boundaries & data exposure**
@@ -40,32 +42,36 @@
40
42
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount
41
43
  - State updates guarded by truthiness of the new value (`if (arr?.length)`) — prevents clearing state when the source legitimately returns empty. Distinguish "no response" from "empty response"
42
44
  - Mutation/trigger functions that return or propagate stale pre-mutation state — if a function activates, updates, or resets an entity, the returned value and any dependent scheduling/evaluation state (backoff timers, "last run" timestamps, status flags) must reflect the post-mutation state, not a snapshot read before the mutation
43
- - Fire-and-forget or async writes where the in-memory object is not updated (response returns stale data) or is updated unconditionally regardless of write success (response claims state that was never persisted) — update in-memory state conditionally on write outcome, or document the tradeoff explicitly
+ - Fire-and-forget or async writes where the in-memory object is not updated (response returns stale data) or is updated unconditionally regardless of write success (response claims state that was never persisted) — update in-memory state conditionally on write outcome, or document the tradeoff explicitly. Also applies to responses and business-logic decisions (threshold triggers, status transitions) derived from pre-transaction reads — concurrent writers all read the same stale value, so thresholds may be crossed without triggering the transition. Compute from post-write state or use conditional expressions that evaluate the stored value. For monotonic counters (sequence numbers, cursors) that must stay in lockstep with append-only storage, advancing before the write risks the counter running ahead on failure; not advancing after a partial write risks reuse — reserve the range before writing and commit only on success
+ - Error/early-exit paths that return status metadata (pagination flags, truncation indicators, hasMore, completion markers) or emit events (WebSocket, SSE, pub/sub) with default/initial values instead of reflecting actual accumulated state — downstream consumers make incorrect decisions (e.g., treating a failed sync as successful because the completion event was emitted unconditionally). Set metadata flags and event payloads based on actual outcome, not just the final request's exit path
  - Missing `await` on async operations in error/cleanup paths — fire-and-forget cleanup (e.g., aborting a failed operation, rolling back partial state) that must complete before the function returns or the caller proceeds
  - `Promise.all` without error handling — partial load with unhandled rejection. Wrap with fallback/error state
+ - Sequential processing of items (loops over external operations, batch mutations) where one item throwing aborts all remaining items — wrap per-item operations in try/catch with logging so partial progress is preserved and failures are isolated
  - Side effects during React render (setState, navigation, mutations outside useEffect)
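The per-item failure isolation described above can be sketched as follows (a sketch; the function name and result shape are illustrative):

```typescript
// Process each item independently: one failing item is recorded,
// not allowed to abort the remaining items in the batch.
async function processAll<T, R>(
  items: T[],
  op: (item: T) => Promise<R>,
): Promise<{ ok: R[]; failed: { item: T; error: unknown }[] }> {
  const ok: R[] = [];
  const failed: { item: T; error: unknown }[] = [];
  for (const item of items) {
    try {
      ok.push(await op(item));
    } catch (error) {
      failed.push({ item, error }); // isolate and record; partial progress survives
    }
  }
  return { ok, failed };
}
```

Callers can then decide whether a non-empty `failed` list is a warning or a hard error, instead of losing the whole batch on the first throw.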
 
  **Error handling** _[applies when: code has try/catch, .catch, error responses, or external calls]_
- - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints. Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
- - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures
+ - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints — when multiple endpoints make the same access-control decision (e.g., "resource exists but caller lacks access"), they must return the same HTTP status (typically 404 to avoid leaking existence). Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
+ - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures. Includes external service wrappers that return `null`/empty for all non-success responses — collapsing configuration errors (missing API key), auth failures (403), rate limits (429), and server errors (5xx) into a single "not found" return masks outages and misconfiguration as normal "no match" results. Distinguish retriable from non-retriable failures and surface infrastructure errors loudly
  - Destructive operations in retry/cleanup paths assumed to succeed without their own error handling — if cleanup fails, retry logic crashes instead of reporting the intended failure
  - External service calls without configurable timeouts — a hung downstream service blocks the caller indefinitely
  - Missing fallback behavior when downstream services are unavailable (see also: retry without backoff in "Sync & replication")
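The typed-error translation above might look like this (a sketch; the class names and response shape are illustrative, not a prescribed API):

```typescript
// Typed errors carry an explicit HTTP status so client-caused failures
// map to 4xx instead of bubbling up as generic 500s.
class HttpError extends Error {
  constructor(public readonly status: number, message: string) {
    super(message);
    this.name = new.target.name;
  }
}
class NotFoundError extends HttpError {
  constructor(resource: string) { super(404, `${resource} not found`); }
}
class ConflictError extends HttpError {
  // Expected concurrency failures (optimistic lock conflicts) map to 409.
  constructor(message: string) { super(409, message); }
}

// Central translation point: typed errors keep their status;
// anything unexpected stays a 500 and never leaks internals.
function toResponse(err: unknown): { status: number; message: string } {
  if (err instanceof HttpError) return { status: err.status, message: err.message };
  return { status: 500, message: "Internal error" };
}
```

Routing every handler's catch block through one `toResponse`-style function is also what keeps similar endpoints returning consistent statuses.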
 
  **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
  - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
- - Deletion/destroy functions that clean up the primary resource but leave orphaned secondary resources (data directories, git branches, child records, temporary files) — trace all resources created during the entity's lifecycle and verify each is removed on delete
+ - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry
  - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
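The initialization guard can be sketched like this (a sketch, using a timer as a stand-in for any poller or scheduler):

```typescript
let pollTimer: ReturnType<typeof setInterval> | null = null;

// Refuses to create a second instance if one is already running.
function startPoller(tick: () => void, ms: number): boolean {
  if (pollTimer !== null) return false; // already initialized: no duplicate timers
  pollTimer = setInterval(tick, ms);
  return true;
}

function stopPoller(): void {
  if (pollTimer !== null) {
    clearInterval(pollTimer);
    pollTimer = null;
  }
}
```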
 
  **Validation & consistency** _[applies when: code handles user input, schemas, or API contracts]_
  - API versioning: breaking changes to public endpoints without version bump or deprecation path
  - Backward-incompatible response shape changes without client migration plan
+ - Backward-incompatible contract changes — renamed/removed config keys, changed file formats, altered DB schemas, modified event payloads, renamed URL routes/paths, or restructured persisted data (localStorage, files, database rows) without a migration path or fallback that reads the old format. For route/URL renames, add redirects from old paths to preserve bookmarks and external links. Trace all consumers of the changed contract (other services, CLI versions, stored data) and verify they still work or have an upgrade path. For schema changes, require a migration script; for config/format changes, support both old and new formats during a transition period or provide a one-time converter
  - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation
  - When a validation/sanitization function is introduced for a field, trace ALL write paths (create, update, sync, import) — partial application means invalid values re-enter through the unguarded path
  - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
  - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
- - Handlers reading properties from framework-provided objects using field names the framework doesn't populate — silent `undefined`. Verify property names match the caller's contract
- - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use
+ - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in)
+ - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible
+ - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
  - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); silent operations in verbose sequences where all branches should print status
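The shared "missing value" predicate from the list above can be sketched as (a sketch; the names are illustrative):

```typescript
// Single definition of "missing": null, undefined, or whitespace-only string.
// Query filters, update guards, and UI predicates should all call this.
function isMissing(value: unknown): boolean {
  if (value === null || value === undefined) return true;
  if (typeof value === "string" && value.trim() === "") return true;
  return false;
}

// Companion write-time normalization so storage never mixes "" and null.
function normalizeOptional(value: string | null | undefined): string | null {
  return isMissing(value) ? null : (value as string);
}
```

Note that `0` and `false` are deliberately treated as present, which is also why sentinel coercions like `?? 0` need an explicit presence check first.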
@@ -88,6 +94,7 @@
  - Database triggers clobbering explicitly-provided values; auto-incrementing columns that only increment on INSERT, not UPDATE
  - Full-text search with strict parsers (`to_tsquery`) on user input — use `websearch_to_tsquery` or `plainto_tsquery`
  - Dead queries (results never read), N+1 patterns inside transactions, O(n²) algorithms on growing data
+ - Performance optimizations in query/search loops (early exits, capped per-item limits, break-on-first-match) that silently reduce correctness — verify the optimization preserves the same result set as the unoptimized path, especially for dedup/nearest-match queries where stopping early can miss closer or more appropriate results
  - `CREATE TABLE IF NOT EXISTS` as sole migration strategy — won't add columns/indexes on upgrade. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` or a migration framework
  - Functions/extensions requiring specific database versions without verification
  - Migrations that lock tables for extended periods (ADD COLUMN with default on large tables, CREATE INDEX without CONCURRENTLY) — use concurrent operations or batched backfills
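The early-exit correctness point above can be illustrated with a nearest-match lookup (a sketch; the data shapes are illustrative):

```typescript
type Candidate = { id: string; distance: number };

// Buggy "optimization": returns the FIRST acceptable candidate and stops scanning.
function firstUnderThreshold(cands: Candidate[], max: number): Candidate | null {
  for (const c of cands) if (c.distance <= max) return c; // may miss a closer match
  return null;
}

// Correct version: scans the full set and keeps the closest acceptable match.
function closestUnderThreshold(cands: Candidate[], max: number): Candidate | null {
  let best: Candidate | null = null;
  for (const c of cands) {
    if (c.distance <= max && (best === null || c.distance < best.distance)) best = c;
  }
  return best;
}
```

With candidates at distances 9 and 2 and a threshold of 10, the early-exit version returns the farther one. Early exits are only safe when the result set provably matches the unoptimized scan.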
@@ -95,7 +102,8 @@
 
  **Sync & replication** _[applies when: code uses pagination, batch APIs, or data sync]_
  - Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
- - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results
+ - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results. When a data store applies query limits before filter expressions, a fixed multiplier on the limit still under-fetches — loop with continuation tokens until the target count of post-filter results is collected
+ - Pagination cursors derived from the last *scanned* item rather than the last *returned* item — if accumulated results are trimmed (e.g., sliced to a page size), the cursor advances past items that were fetched but never delivered, causing permanent skips
  - Batch/paginated API calls (database batch gets, external service calls) that don't handle partial results — unprocessed items, continuation tokens, or rate-limited responses silently dropped. Add retry loops with backoff for unprocessed items
  - Retry loops without backoff or max-attempt limits — tight loops under throttling extend latency indefinitely. Use bounded retries with exponential backoff/jitter
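The bounded-retry guidance above can be sketched as (a sketch; the base delay, cap, and attempt limit are illustrative defaults):

```typescript
// Full-jitter exponential backoff: uniform delay in [0, min(cap, base * 2^attempt)).
// Kept pure so the bounds are testable without sleeping.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 10_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}

// Bounded retries: never loops forever under throttling.
async function withRetry<T>(op: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastErr = err;
      await new Promise((r) => setTimeout(r, backoffDelayMs(attempt)));
    }
  }
  throw lastErr; // surface the last failure instead of retrying indefinitely
}
```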
 
@@ -109,7 +117,7 @@
  - Values crossing serialization boundaries may change format (arrays in JSON vs string literals in DB) — convert consistently
  - Reads issued immediately after writes to an eventually consistent store (database scans, replica reads, cache refreshes) may return stale data — use consistent-read options, compute from in-memory state after confirmed writes, or document the eventual-consistency window
  - BIGINT values parsed into JavaScript `Number` — precision lost past `MAX_SAFE_INTEGER`. Use strings or `BigInt`
- - Data model key/index design that doesn't support required query access patterns — e.g., claiming "recent" ordering but using non-time-sortable keys (random UUIDs, user IDs). Verify sort keys and indexes can serve the queries the code performs without full-partition scans and in-memory sorting
+ - Data model key/index design that doesn't support required query access patterns — e.g., claiming "recent" ordering but using non-time-sortable keys (random UUIDs, user IDs). Verify sort keys and indexes can serve the queries the code performs without full-partition scans and in-memory sorting. When a new write path creates or associates an entity through a different attribute than the primary index (e.g., adding co-owners to an array field when the discovery index queries a single-owner scalar field), verify existing listing/discovery queries can surface the new association — otherwise the new data is persisted but undiscoverable
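The BIGINT precision hazard above is easy to demonstrate (the id value here is just `Number.MAX_SAFE_INTEGER + 2`, a plausible 64-bit database id):

```typescript
const rawId = "9007199254740993"; // a BIGINT id past Number.MAX_SAFE_INTEGER

const asNumber = Number(rawId); // silently rounds to the nearest representable double
const asBigInt = BigInt(rawId); // exact

const numberRoundTrip = String(asNumber);    // no longer equals rawId
const bigintRoundTrip = asBigInt.toString(); // still equals rawId
```

Carrying such ids as strings end to end (or as `BigInt` where arithmetic is needed) avoids the silent corruption.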
 
  **Shell & portability** _[applies when: code spawns subprocesses, uses shell scripts, or builds CLI tools]_
  - Subprocess calls under `set -e` abort on failure; non-critical writes fail on broken pipes — use `|| true` for non-critical output
@@ -130,11 +138,14 @@
  ## Tier 4 — Always Check (Quality, Conventions, AI-Generated Code)
 
  **Intent vs implementation**
- - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, or an action labeled "migrated" that never creates the target
+ - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, an action labeled "migrated" that never creates the target, or UI actions offered for entity states where the transition is invalid (e.g., a "Reject" button on already-rejected items)
  - Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
- - Cross-references between files (identifiers, parameter names, format conventions, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them
+ - Cross-references between files (identifiers, parameter names, format conventions, version numbers, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them. This includes internal identifiers (route paths, file names, component names) that should be renamed when the concept they represent is renamed — a nav label saying "Rejected" pointing to `/admin/flagged` or a component named `FlaggedList` rendering rejected items creates maintenance confusion. For releases, verify version consistency across all versioned artifacts (package manifests, lockfiles, API specs, changelogs, PR metadata). Also applies to field-set enumerations: when an operation targets a set of entity fields, every predicate, filter expression, scan criteria, API doc, and UI conditional that enumerates those fields must stay in sync — an independently maintained list that omits a field causes silent skips or false positives
+ - Template/workflow variables referenced (`{VAR_NAME}`) but never assigned — trace each placeholder to a definition step; undefined variables cause silent failures or confusing instructions. Also check for colliding identifiers (two distinct concepts mapped to the same slug, key, or name)
  - Responsibility relocated from one module to another (e.g., writes moved from handler to middleware) without updating all consumers that depended on the old location's timing, return value, or side effects — trace callers that relied on the synchronous or co-located behavior and verify they still work with the new execution point. Remove dead code left behind at the old location
  - Sequential instructions or steps whose ordering doesn't match the required execution order — readers following in order will perform actions at the wrong time (e.g., "record X" in step 2 when X must be captured before step 1's action)
+ - Constraints, limits, or guardrails described in a preamble or summary that are not enforced by an explicit condition in the procedural steps below — the description promises safety but the steps don't implement it. Add an explicit check/exit condition tied to the stated constraint
+ - Duplicate or contradictory items in sequential lists — copy/paste producing two entries for the same case with conflicting instructions. Deduplicate and reconcile
  - Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
  - Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
  - Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
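The undefined-placeholder check from the list above can be automated with a small scan (a sketch; the `{VAR_NAME}` pattern matches this document's own template convention):

```typescript
// Returns placeholders referenced in a template but absent from the set of
// defined variables, so undefined references fail loudly instead of silently.
function undefinedPlaceholders(template: string, defined: Set<string>): string[] {
  const missing = new Set<string>();
  for (const m of template.matchAll(/\{([A-Z0-9_]+)\}/g)) {
    if (!defined.has(m[1])) missing.add(m[1]);
  }
  return [...missing];
}
```

Running this over each workflow file against the variables its setup phase assigns catches undefined references; the inverse check catches variables assigned but never used.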
@@ -174,8 +185,11 @@
  - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses
  - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
  - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
+ - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
+ - Test mock state leaking between tests — mock setup APIs that configure return values often persist across tests even after clearing call history, because "clear" resets invocation counts but not configured behavior (use "reset" variants that restore original implementations). Conversely, per-call sequential mock responses couple tests to internal call count — prefer stable return values for behavior tests, sequential mocks only when verifying call order
  - Tests that pass but don't cover the changed code paths — passing unrelated tests is not validation
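The clear-vs-reset distinction above can be shown without any test framework (a sketch; Jest's `mockClear`/`mockReset` draw the same line):

```typescript
// Minimal mock: "clear" wipes call history only; "reset" also restores the
// original behavior, which is what prevents state leaking between tests.
function makeMock<R>(defaultImpl: () => R) {
  let impl = defaultImpl;
  const calls: unknown[][] = [];
  const fn = (...args: unknown[]): R => {
    calls.push(args);
    return impl();
  };
  return {
    fn,
    calls,
    configure(next: () => R) { impl = next; },
    clear() { calls.length = 0; },                     // history only: behavior leaks
    reset() { calls.length = 0; impl = defaultImpl; }, // history AND behavior
  };
}
```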
 
  **Style & conventions**
  - Naming and patterns consistent with the rest of the codebase
- - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure
+ - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure. For structured files that follow a convention across sibling files (changelogs, config files, migration files), verify new entries use the same section headers, field names, and ordering as existing siblings
+ - Shell/workflow instructions with destructive operations (branch deletion, file removal, force operations) must verify preconditions first — e.g., ensure you're not on a branch being deleted, confirm the target exists, and don't suppress stderr from commands where failures indicate real problems (auth errors, network issues)
@@ -12,7 +12,9 @@ You are a Copilot review loop agent.
  PR: {PR_NUMBER} in {OWNER}/{REPO}
  Branch: {BRANCH_NAME}
  Build command: {BUILD_CMD}
- Max iterations: 5
+ Max iterations: unlimited (loop until Copilot returns 0 comments)
+ Safety guardrail: after 10 iterations, report back and ask the user
+ whether to continue or stop — never loop indefinitely without confirmation.
 
  TIMEOUT SCHEDULE:
  When running parallel PR reviews (do:better), use shorter waits to avoid
@@ -28,8 +30,7 @@ that (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take
  10-15 minutes for large diffs.
  Poll interval: 30 seconds for all iterations.
 
- Run the following loop until Copilot returns zero new comments or you hit
- the max iteration limit:
+ Run the following loop until Copilot returns zero new comments:
 
  1. CAPTURE the latest Copilot review submittedAt timestamp (so you can
  detect when a NEW review arrives):
@@ -75,10 +76,14 @@ the max iteration limit:
  - Resolve the thread via GraphQL mutation using stdin JSON piping:
  echo '{"query":"mutation { resolveReviewThread(input: {threadId: \"{THREAD_ID}\"}) { thread { id isResolved } } }"}' | gh api graphql --input -
  - After all threads resolved, push all commits to remote
- - Increment iteration counter and go back to step 1
+ - Increment iteration counter
+ - If iteration counter reaches 10, stop the loop and report back with
+ status "guardrail" — the parent agent will ask the user whether to
+ continue or stop
+ - Otherwise, go back to step 1
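The loop's control flow, including the guardrail exit, can be sketched as (a sketch; `fetchNewCommentCount` is a hypothetical stand-in for the capture/poll steps above):

```typescript
type LoopStatus = "clean" | "guardrail";

// Iterate until Copilot returns zero new comments, but stop at the
// guardrail and report back rather than looping without confirmation.
async function reviewLoop(
  fetchNewCommentCount: () => Promise<number>,
  guardrail = 10,
): Promise<{ status: LoopStatus; iterations: number }> {
  for (let iteration = 1; iteration <= guardrail; iteration++) {
    const comments = await fetchNewCommentCount();
    if (comments === 0) return { status: "clean", iterations: iteration };
    // ...address comments, resolve threads, push commits here...
  }
  return { status: "guardrail", iterations: guardrail };
}
```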
 
  When done, report back:
- - Final status: clean / max-iterations-reached / timeout / error
+ - Final status: clean / timeout / error / guardrail
  - Total iterations completed
  - List of commits made (if any)
  - Any unresolved threads remaining
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "slash-do",
- "version": "1.5.1",
+ "version": "1.6.1",
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
  "author": "Adam Eivy <adam@eivy.com>",
  "license": "MIT",