slash-do 2.12.0 → 2.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,103 @@
1
+ # Surface Quality Review Agent
2
+
3
+ ## Mandate
4
+ You review code for quality, conventions, intent vs implementation drift, and AI-generated code patterns visible within a single file. You catch issues a runtime-focused review would miss: stale documentation, dead config, missing tests, supply chain hygiene, style violations. You do NOT trace call chains across files (the cross-file agents handle that) and you do NOT audit runtime correctness (the surface scan agent handles that).
5
+
6
+ ## Approach
7
+
8
+ Apply the checklist as a prompt for attention, not an exhaustive specification. Quality issues compound: a single dead-code path, untested helper, or stale comment is forgivable; the cumulative drag on maintainability is what matters. Lean toward flagging when in doubt — quality findings are usually cheap to fix while still in review.
9
+
10
+ ## Reading Strategy
11
+ For each changed file, read the **ENTIRE file** (not just diff hunks). Compare claims (comments, docstrings, test names, status messages) against what the surrounding code actually does. Note items that look like AI-generated boilerplate (single-call abstractions, defensive code for impossible cases, placeholder cleanup callbacks) and verify each is justified.
12
+
13
+ ## Principles to Evaluate
14
+
15
+ **YAGNI** — Flag abstractions, config options, parameters, or extension points that serve no current use case. Unnecessary wrapper functions, premature generalization (factory producing one type), unused feature flags.
16
+
17
+ **Naming** — Functions and variables should communicate intent without reading the implementation. Booleans should read as predicates (`isReady`, `hasAccess`), not ambiguous nouns.
18
+
19
+ **DRY** — Duplicated config/constants/helpers across modules drift over time. Extract to shared locations even when the duplication is small.
20
+
21
+ ## Checklist
22
+
23
+ ### Intent vs implementation (single-file)
24
+ - Labels, comments, status messages describing behavior the code doesn't implement. Also covers factual doc drift: file paths/extensions (`foo.js` referenced when the file is `foo.jsx`), item counts ("13 widgets" when there are 15), default entity names ("Default" vs actual "Everything"), and route/response-shape comments that don't match what the handler returns. Verify every factual claim in a comment or JSDoc against the code it references
25
+ - Inline code examples or command templates that aren't syntactically valid
26
+ - Sequential numbering with gaps or jumps after edits
27
+ - Template/workflow variables referenced but never assigned — trace each placeholder to a definition
28
+ - Constraints described in preamble not enforced by conditions in procedural steps
29
+ - Duplicate or contradictory items in sequential lists
30
+ - Completion markers or success flags written before the operation they attest to
31
+ - Existence checks (directory exists, file exists) used as proof of correctness — file can exist with invalid contents
32
+ - Lookups checking only one scope when multiple exist (local branches but not remote)
33
+ - Tracking/checkpoint files defaulting to empty on parse failure — fail-open guards
34
+ - Registering references to resources without verifying resource exists
35
+ - Composed instructions, prompts, system messages, or rule sets that vary by mode/role/context — unconditional clauses can contradict mode-specific directives (e.g., "always cite sources inline" combined with a `draft` mode that asks for "no preamble, no commentary"). Build the composition conditionally — include each block only for modes that want it — or define an explicit precedence so contradictions are predictable
36
+ - Audit, lint, anomaly, and analytics tools that re-derive a domain concept (current cycle, active session, completed period) using a SIMPLIFIED heuristic must match the authoritative production logic — or they produce false positives/negatives. When production keeps a cycle active until `sellRatio < 1.0`, the audit can't use a 50% threshold. Either invoke the authoritative function/predicate, OR snapshot-test the audit against production-state fixtures. Same for suppression rules — gating "ignore this anomaly" on a stale signal hides real issues
37
+
38
+ ### UX integrity (single-component)
39
+ - Unsaved changes / dirty state silently discarded when the user switches context in a multi-record editor or closes a sheet — data loss. Dirty-check on switch (inline confirm), auto-save drafts, or disable the switch control while dirty. `beforeunload` does not cover in-app context switches
40
+ - Array index used as React `key={i}` on a list that's sliced (`logs.slice(-40)`), reordered, filtered, or has items dropped from either end shifts keys as items move, causing React to reuse DOM nodes for different entries — flicker, lost focus, stale tooltips, broken animations, selection bleed across rows. Use a stable identifier from the payload (`id`, `timestamp + event`, content hash)
41
+
42
+ ### Code complexity & PR scope
43
+ - Functions with >10 branches or >15 cyclomatic complexity — refactor (extract early-returns, lookup tables, helper functions)
44
+ - Overly broad changes that should be split into separate PRs (mixed refactor + feature, multiple unrelated concerns) — flag as a process smell so reviewers can request a split
45
+
46
+ ### AI-generated code quality
47
+ - New abstractions, wrapper functions, helper files serving only one call site — inline instead
48
+ - Feature flags, config options, extension points with only one possible value
49
+ - Commit messages claiming a fix while the bug remains
50
+ - Placeholder comments (`// TODO`, `// FIXME`) or stubs presented as complete
51
+ - Unnecessary defensive code for scenarios that provably cannot occur
52
+ - Cleanup callbacks (useEffect return, finalizer, dispose, signal handler) containing only comments are misleading — implement the cleanup or remove the callback entirely
53
+
54
+ ### Configuration & hardcoding
55
+ - Hardcoded values when config/env var exists; dead config fields; unused function parameters
56
+ - Duplicated config/constants/helpers across modules — extract to shared module. Watch for behavioral inconsistencies between copies
57
+ - CI pipelines without lockfile pinning or version constraints
58
+ - Production code paths with no structured logging at entry/exit
59
+ - Error logs missing reproduction context (request ID, input params)
60
+ - Async flows without correlation ID propagation
61
+
62
+ ### Supply chain & dependencies
63
+ - Lockfile committed and CI uses `--frozen-lockfile`; no drift from manifest
64
+ - `npm audit` / `cargo audit` / `pip-audit` — no unaddressed HIGH/CRITICAL vulns
65
+ - No `postinstall` scripts from untrusted packages executing arbitrary code
66
+ - Overly permissive version ranges (`*`, `>=`) on deps with breaking-change history
67
+
68
+ ### Test coverage
69
+ - New logic/schemas/services without tests when similar existing code has tests
70
+ - New error paths untestable because services throw generic errors
71
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Tests asserting by inspecting source code strings rather than calling functions
72
+ - Tests depending on real wall-clock time or external dependencies (system `git`, `gh`, `python`, etc.) — environment-dependent flakiness; mock the subprocess interface (`child_process.spawn`) instead of relying on the binary being installed
73
+ - Missing tests for trust-boundary enforcement
74
+ - Tests exercising code paths the integration layer doesn't expose — pass against mocks but untriggerable in production
75
+ - Test mock state leaking between tests — "clear" resets invocation counts but not configured behavior; use "reset" variants
76
+ - Response/status assertions written as loose ranges (`status >= 400`, `status < 500`, `ok: false`) — a regression that turns a 400 validation failure into a 500 still passes. Assert the specific expected status so tests distinguish validation from server failure
77
+ - Tests gated by `if (process.platform !== 'darwin') return` (or POSIX-only filesystem tricks like `chmod`-based permission failures) silently skip on CI runners with different platforms — the new code becomes effectively untested. Factor platform-specific behavior into pure functions, mock `fs/promises` directly to throw deterministically, or run multi-platform CI. `vi.spyOn(process, 'platform', 'get')` is brittle because `process.platform` is a value property — use `Object.defineProperty(process, 'platform', { value: '<os>', configurable: true })` and restore the original descriptor in cleanup
78
+ - Tests that allocate temp directories (`mkdtempSync`, `mkdir`), spawn long-lived child processes, or write artifacts must clean up in `afterEach`/`finally` (e.g., `rmSync(dir, { recursive: true, force: true })`). Without cleanup, the OS temp dir accumulates over many test runs; concurrent test orderings can collide on shared paths
79
+ - Tests that mutate global state inside the test body (`vi.useFakeTimers()`/`jest.useFakeTimers()`, monkey-patches, `Object.defineProperty(process, 'platform', ...)`, env-var overrides, `mock.module()` setup, frozen `Date.now`, intercepted `console.log`) and only restore at the END of the happy path leak the mutation into the next test when an assertion throws midway — a flaky cascade where one failure causes unrelated tests to misbehave. Restore in a `try/finally` block inside the test body, OR move setup/teardown into `beforeEach`/`afterEach` for the describe block so the framework guarantees cleanup regardless of assertion outcome
80
+ - Tests whose name or description claims a behavior they don't actually assert (`'forwards lastImageFile'` that only checks `prompt` and `mode`) lie about the contract — the test passes even when the named behavior regresses. Either rename the test to match what's asserted or add the missing assertion
81
+ - Tests asserting a specific validation rejection (negative number, oversized string, invalid format) must provide ALL OTHER required fields in valid form — otherwise the 400 the test sees comes from a different validation path (UUID failure, missing required field) and the intended rule is never exercised. Use a valid fixture for unrelated fields so the rejection is attributable to the field under test
82
+ - Path assertions in tests using forward-slash literal substrings (`expect(p).toContain('/some/sub/path')`) fail on Windows where `path.join` produces backslashes. Use `path.join()` in expectations (`expect(p).toContain(join('some', 'sub', 'path'))`), assert the suffix via `endsWith(join(...))`, or use a separator-agnostic regex (`/some[\/\\]sub[\/\\]path/`) — otherwise the test is silently macOS/Linux-only despite running on a Windows CI matrix
83
+ - Refactors that consolidate logic into a SINGLE helper that becomes the source of truth for a critical shape (`buildXPayload`, `serializeY`, `formatZ`) — this helper deserves explicit unit-test coverage on its output fields/shape, because its bugs propagate to every consumer (engine, IPC, route fallback, websocket emitter) without a localized symptom. After extracting a shared builder, add field-by-field assertions on the helper's output and at least one test per call site that exercises end-to-end emission
84
+
85
+ ### Automated pipeline discipline
86
+ - Internal code review must run before creating PRs — never go straight from "tests pass" to PR
87
+ - Copilot review must complete before merging
88
+ - Automated agent output must be reviewed against project conventions
89
+
90
+ ### Style & conventions
91
+ - Naming and patterns inconsistent with rest of codebase
92
+ - New content not matching existing indentation, bullet style, heading levels. Within a single structured file (changelog, README, TOML config), section headers must be unique — duplicate `## Fixed` blocks or repeated table sections are a merge artifact that splits content downstream tools expect to find under one header. Consolidate
93
+ - Shell instructions with destructive operations not verifying preconditions first
94
+
95
+ ## Output Format
96
+
97
+ For each finding:
98
+ ```
99
+ file:line — [CRITICAL|IMPROVEMENT|UNCERTAIN] description
100
+ Evidence: `quoted code line(s)`
101
+ ```
102
+
103
+ Only report verified findings with quoted code evidence. If you cannot quote specific code for a finding, mark as [UNCERTAIN].
@@ -1,7 +1,7 @@
1
1
  # Surface Scan Review Agent
2
2
 
3
3
  ## Mandate
4
- You review code for per-file correctness: bugs, quality issues, and convention violations visible within a single file. You do NOT trace call chains or data flows across files another agent handles cross-file analysis.
4
+ You review code for per-file RUNTIME CORRECTNESS: bugs that crash, fail, mishandle data, or produce wrong output at runtime, plus domain-specific runtime patterns. You do NOT trace call chains across files (the cross-file agents handle that) and you do NOT audit code quality / conventions / tests / documentation drift (the surface quality agent handles that).
5
5
 
6
6
  ## Approach
7
7
 
@@ -10,11 +10,17 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
10
10
  ## Reading Strategy
11
11
  For each changed file, read the **ENTIRE file** (not just diff hunks). New code interacting incorrectly with existing code in the same file is a common bug source. Review one file at a time.
12
12
 
13
- ## Principles to Evaluate
13
+ ## Out of Scope (handled by sibling agents)
14
14
 
15
- **YAGNI** Flag abstractions, config options, parameters, or extension points that serve no current use case. Unnecessary wrapper functions, premature generalization (factory producing one type), unused feature flags.
15
+ The following concerns are explicitly NOT this agent's job do not flag them here:
16
16
 
17
- **Naming** Functions and variables should communicate intent without reading the implementation. Booleans should read as predicates (`isReady`, `hasAccess`), not ambiguous nouns.
17
+ - **Code quality / conventions / naming / YAGNI / dead code** handled by the Surface Quality agent
18
+ - **Tests / documentation drift / commit hygiene** — handled by the Surface Quality agent
19
+ - **Cross-file call chains, state lifecycle, concurrency** — handled by the Cross-File Tracing agent
20
+ - **Schemas / validation / error classification / architectural pattern adherence** — handled by the Cross-File Contract agent
21
+ - **Secrets, auth, supply-chain, prompt-injection** — handled by the Security Audit agent
22
+
23
+ If a concern fits one of the categories above, drop it from your output rather than co-flagging — that overlap is exactly what this split was intended to remove.
18
24
 
19
25
  ## Checklist
20
26
 
@@ -22,7 +28,6 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
22
28
 
23
29
  **Hygiene**
24
30
  - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, uncommittable files (.env, node_modules, build artifacts, runtime-generated data/reports)
25
- - Overly broad changes that should be split into separate PRs
26
31
 
27
32
  **Imports & references**
28
33
  - Every symbol used is imported (missing → runtime crash); no unused imports. Also check references to framework utilities (CSS class names, directive names, component props) — a non-existent utility class or prop name silently does nothing
@@ -37,11 +42,24 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
37
42
  - Parallel arrays coupled by index position — use objects/maps keyed by stable identifier
38
43
  - Shared mutable references: module-level defaults mutated across calls (use `structuredClone()`); `useCallback`/`useMemo` referencing later `const` (temporal dead zone); spread followed by unconditional assignment clobbering spread values
39
44
  - React state invariants (uniqueness, cap/floor, monotonicity) checked against render-time value before `setX(...)` — rapid events or concurrent updates race the check. Move into the functional updater: `setX(prev => prev.includes(id) ? prev : [...prev, id])`
45
+ - Bound-derived state values (current time, scroll position, focused index, selected page) that are bounded by another piece of state (total duration, list length, page count) must be clamped when the bound shrinks. Removing a clip / deleting an item / tightening trims can leave the derived value past its new upper bound, and downstream readers (preview seekers, `arr[index]`, `findClipAt(t)`) return garbage / past-end results / undefined elements. Add a `useEffect` keyed on the bound that clamps the derived value (and pauses playback / clears selection if appropriate)
40
46
  - `useEffect` depending on state it writes — the write retriggers the effect (infinite loop / request storm). Split into two effects, drop the self-written value from deps, or use a functional setter that doesn't require the current value in deps
41
- - Functions with >10 branches or >15 cyclomatic complexity — refactor
42
47
  - String accumulation via `+=` inside high-frequency loops (streaming frames, chunked I/O, per-event handlers) is O(n²) for long outputs and triggers React re-renders on growing payloads. Collect chunks into an array and `join('')` at the end while still emitting per-chunk events
48
+ - Sorting an entire collection just to find a single max/min/first-matching element is O(n log n) when a single pass is O(n) — `[...items].sort(byDate)[0]` over thousands of items burns CPU on every render. Use a linear scan that tracks the current best (`reduce`, manual `for` loop with comparator). Especially impactful when invoked per-render or per-component (collection card, list row)
43
49
  - Server-side string formatting (`toLocaleString`, `toLocaleDateString`, currency/number formatters) that depends on locale/timezone defaults produces non-deterministic outputs across deployments. For data flowing into prompts, logs, or persisted records, format with explicit `Intl.DateTimeFormat({ timeZone: ... })` or ISO strings; reserve locale-aware formatting for user-visible UI layers
44
50
  - Required-at-use-time config values (model name, API key, endpoint URL, default selection) that may be null/undefined in source data must be validated at the boundary before invoking the downstream API. Otherwise the API responds with an opaque error far from the user's intent. Emit a clear, actionable error identifying the missing field
51
+ - Optional-chain incompleteness through dereference chains: `obj?.a.b.c` only guards `obj` — every subsequent `.b` / `.c` still throws if `a` or `b` is null. Extend optional chaining through every dereference (`obj?.a?.b?.c`), or destructure with defaults at the boundary. Common in helpers wrapping CLI output (`res?.stdout.split(...)` crashes when stdout is missing)
52
+ - Temp filenames derived from `Date.now()` collide when concurrent operations start in the same millisecond — corrupted output, mid-flight `unlink` of another request's file. Use `randomUUID()` (combined with `process.pid` for cross-process isolation), `fs.mkdtemp` for per-request scratch dirs, or a shared atomic counter. Same caveat for any filename pattern using a clock reading or a non-unique sequence
53
+ - Cross-platform `fs.rename` "destination exists" handling that destroys the user's output on failure. POSIX `rename` atomically replaces the destination; Windows `fs.rename` rejects when the destination exists. Naive Windows fix (`unlink(dest); rename(tmp, dest)`) is dangerous: if `rename` fails (file lock, AV scan, transient permissions), the original is gone and only the temp file remains. Use a backup-aside rollback: rename original to `<dest>.bak.<uuid>`, rename temp into place, unlink the .bak on success; on failure, rename .bak back. Treat `ENOENT` on the initial backup rename as "no original existed." Same caveat for any optimization-style replace (faststart remux, sidecar updates) — and never silently swallow rename errors: if the rename throws, unlink the temp file in the catch (so failed optimization doesn't slowly leak `.fs.tmp` / `.bak` files) and log a warning so operators know the optimization didn't apply
54
+ - Cache miss for falsy successful values: `if (cache[key]) ... else cache[key] = compute(...)` re-computes whenever the cached value is `''`, `0`, or `false` — falsy successful results are never hit. Use `if (key in cache)` (or `Object.hasOwn(cache, key)`) so the *presence* of the key, not its truthiness, controls the cache hit. Same caveat applies to `Map.get(key) ?? compute()` when the cached value can be falsy
55
+ - Browser storage APIs (`sessionStorage`/`localStorage` `setItem`/`getItem`, IndexedDB) and `JSON.parse` on stored values can throw — Safari private mode, quota exceeded, disabled cookies, corrupted data, or older schema all surface as runtime exceptions during render or effects. Wrap reads/writes in try/catch and validate the parsed shape (string/object/required fields) before consuming; storage failure must not crash the page or trip the nearest error boundary
56
+ - LLM tool-call / palette / agent-invoked function parameters commonly arrive as JSON-typed strings even when the schema declares `number`/`boolean` (the calling LLM serialized them as strings). Pass-through patterns like `width: width || undefined` forward `'1024'` as a string into downstream APIs. Coerce explicitly (`Number(v)`, `Number.isInteger`, range guards) and return a structured error on invalid coercion — bypassing route-layer Zod schemas means the tool's own handler must enforce the same contract
57
+ - Boolean flags surfaced to clients (`feature.enabled`, `mode.active`) must reflect whether the feature is CONFIGURED, not whether transient runtime state happens to be present right now. Deriving `enabled` from "are there active items" causes the flag to flip between cycles/fills/sessions even though the feature is still on. Source from config; report runtime presence in a separate field (`hasActiveItems`, `currentCount`)
58
+ - Producer emits `null` or `0` as an "unknown" sentinel — consumers compute against it as if it were real (`Date.now() - placedAt` with `placedAt: null` becomes epoch → "decades old" age display; `qty * lastPrice` with `lastPrice: 0` zeroes out P&L/APY). Either omit the field, use an explicit `unknown: true` / `stale: true` marker, or document and verify each consumer guards with `!= null` before computing
59
+ - Tracking sets/arrays/maps that only GROW during process lifetime — settled-id sets, seen-event fingerprints, processed-task ids — leak monotonically with workload. Long-lived services (workers, daemons, market-data feeds) accumulate over weeks and OOM. Cap with size limits + LRU eviction, periodic age-based pruning, OR scope to a smaller window (per-cycle, per-day) when feasible
60
+ - State machines that handle only terminal events (`FILLED`, `COMPLETED`, `SETTLED`) ignore intermediate progress fields (`filledSize`, partial completion, in-progress counters) the producer exposes between start and terminal. Consumers that want live progress need the intermediate values too — process them or document the lag
61
+ - File-existence checks for "must be a regular file" — `existsSync(path)` returns true for directories, symlinks, and special files. When downstream code needs a regular file (image, config, manifest, ffmpeg input), use `statSync(path, { throwIfNoEntry: false })?.isFile()` and reject `.`, `..`, and empty basenames before passing to subprocess args / `readFile`. Wrap `statSync` in try/catch when invoked from request validation paths so transient permission/FS errors surface as clean 4xx, not unhandled 500s
62
+ - Cache-validity checks using `existsSync` alone return zero-byte / partial / corrupted files from prior failed attempts as valid cache hits. When the cache is a file written by a subprocess that may crash mid-write (ffmpeg frames, downloaded artifacts, generated thumbnails), validate the cached file is non-empty (`statSync(path).size > 0`) and consider a magic-byte / format check; on extraction failures, unlink the partial output before returning so the next call re-runs cleanly. Same caveat for "fall through to re-extraction on stat error" comments — wrap the `statSync` in try/catch and treat any error (EACCES, transient I/O, ENOENT) as cache miss, not as a fatal abort
45
63
 
46
64
  **API route basics**
47
65
  - Route params passed to services without format validation; path containment using string prefix without separator boundary (use `path.relative()`). When sibling endpoints validate body/query fields, the path param must be validated with the same schema — skipping param validation on one endpoint turns schema violations into 404/500 instead of 400
@@ -49,9 +67,13 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
49
67
  - Stored or external URLs rendered as clickable links without protocol validation — allowlist `http:`/`https:`
50
68
  - Request schema fields for large string/binary payloads (base64, file content, free text) without per-field size limits — a total body-size limit alone doesn't prevent individual oversized fields from consuming excessive memory or exceeding downstream service limits; add per-field `max(N)` constraints with clear error messages
51
69
  - Character-class regex validators (`^[a-z0-9-]+$`) claiming to enforce a structured format (slug, kebab-case, reverse-DNS) — they accept leading/trailing separators (`-foo`, `foo-`) and repeated separators (`a--b`). Require alnum boundaries or use a parser
52
- - `z.string().min(1)` without `.trim()` accepts whitespace-only values for user-visible names — use `z.string().trim().min(1)` when the field represents a human-readable identifier
70
+ - Schema-validator chain ordering — checks run in chain order, so `.trim()` AFTER `.min(1)` (`z.string().min(1).max(200).trim()`) means whitespace-only strings pass the non-empty guard and get trimmed to `''` afterward, defeating the intent. Always order as `z.string().trim().min(1)` for human-readable names AND for nullable foreign-key IDs (`folderId`, `parentId`, `workId`): `z.string().max(N).nullable()` without `.trim().min(1)` accepts `''` and persists it as the ID service layers treat `''` as falsy (skipping existence checks) but still write it, breaking lookups and grouping invariants. Tighten to `z.string().trim().min(1).max(N).nullable()` (or preprocess `''` → `null` at the schema boundary). Handler-side trim normalization (`patch.title = patch.title.trim()`) must additionally reject post-trim emptiness — a PATCH with `" "` trims to `''` and persists a blank value even when create requires non-empty
71
+ - HTTP query parameters can arrive as arrays for repeated keys (`?id=a&id=b`) in most server frameworks (Express, Fastify, etc.) — a route that types `req.query.id` as a string and passes it through unchecked may treat the array as `undefined` (silently dropping the filter and returning UNFILTERED data, a quiet authorization/leakage bug) or pass the array shape downstream. Validate explicitly at the route boundary: accept only `typeof === 'string'`, take the first element when arrays are intended, or reject with 400. Apply via a Zod query schema covering every typed query field
53
72
  - Object spread of a potentially-null/undefined boundary value (`{ ...req.body, id }`) — throws a TypeError and surfaces as 500 instead of 4xx. Use `{ ...(req.body ?? {}), id }` at request/boundary entry points
73
+ - HTTP header values that are case-insensitive by spec (`Content-Type`, `Accept`, `Authorization` scheme, `Transfer-Encoding`) must be lowercased before comparison — `startsWith('multipart/form-data')` against `Multipart/Form-Data; boundary=...` returns false and skips the middleware. Parse parameterized values structurally (`mimeType.split(';')[0].trim().toLowerCase()`) rather than substring-matching the raw header
54
74
  - Schemas accepting paired range fields (`startDate`/`endDate`, `min`/`max`, `from`/`to`) without a cross-field refinement (`zod .refine()`) — accepts inconsistent ranges (start > end). Define deterministic rules when only one bound is supplied
75
+ - PATCH/update endpoints accepting an empty body (or only metadata fields like `expectedUpdatedAt`) pass validation, write-through bumps `updatedAt`, and produce needless write churn + optimistic-concurrency conflicts. Add a `.refine()` requiring at least one mutating field, OR have the service no-op when the patch is empty
76
+ - Truthy-only guards on concurrency tokens / etags / version stamps (`if (expectedUpdatedAt && ...)`) skip the conflict check whenever the client sends an empty string, allowing overwrites of newer data. Use `!= null` (or explicit string validation that rejects empty/whitespace) so "provided but empty" surfaces as a 400, not a silent bypass
55
77
  - Outbound payloads (snippets, previews, cached excerpts) stored in persisted records or sent over streaming protocols without size caps — applies the same per-field max constraint as inbound payloads, enforced at capture time so display-layer truncation isn't the only defense
56
78
 
57
79
  **Async & state (single-file patterns)**
@@ -67,6 +89,11 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
67
89
  - Caches that store negative/error results (`null`, "not found", probe failure) without a TTL or invalidation hook — when a user installs the missing dependency mid-runtime (ffmpeg, python venv, model file), the cache reports "still missing" until process restart. Cache only successful lookups, OR use a short TTL for negatives, OR re-probe on demand when the cached value is the negative sentinel
68
90
  - Late-connecting clients to long-running async jobs (SSE, WebSocket subscribe-by-id) receive nothing if they connect after the terminal `complete`/`error` broadcast — the server emitted once and moved on. Persist the most-recent (or terminal) payload on the job and emit it immediately on attach, OR document that subscribers must connect before kicking off the job and update any "late connectors will get the final state" comments accordingly
69
91
  - Server returning an empty success payload (`200` with `{ images: [] }`, `{ items: null }`, etc.) when an awaited operation succeeded but the artifact fetch failed — clients treat empty as "no work to show" and never surface the underlying error. After awaiting completion, a missing/unreadable artifact is an internal error: return non-2xx with a structured error, never an empty 200
92
+ - DOM event handlers (`onMouseLeave`, `onBlur`, `onPointerLeave`, key handlers) registered via JSX close over render-time state — so `if (pressing) endPress()` inside the handler reads the value from the closure, not the live state, and may see stale `false` even when the user is still holding. Use refs (`pressingRef.current`) for liveness flags read inside DOM handlers, OR call cleanup methods unconditionally and have them no-op if not active
93
+ - Synchronous re-entrancy guards via React state — `if (saving) return; setSaving(true); await save()` is unsafe because `setState` is async and rapid repeated triggers (key-repeat Cmd/Ctrl+S, double-click, debounce overlap, two paths to the same handler) all observe `saving === false` and fire concurrent operations. Use a synchronous mutable ref (`savingRef.current = true` set BEFORE awaiting; reset in `finally`) for the gate, and reserve React state purely for rendering. Same pattern applies to any "in-flight" boolean used as a mutex
94
+ - Controlled-input rehydration from a parent prop — components that hold local state derived from `prop` and only sync via `useEffect(() => setLocal(prop.field), [prop.id])` MISS prop changes within the same parent identity (switching draft versions on the same work, switching tabs on the same record, server normalizing the field). Track every identity discriminator that should trigger rehydration (`[work.id, work.activeDraftVersionId]`), AND sync after server normalization — when a save returns a normalized value (trimmed, lowercased, server-canonical), update local state from the response, not just from optimistic input. Also: a `<select value={prop.status}>` controlled by the prop with no local state can snap back to the old value on re-render before the PATCH resolves — keep a local state mirror and update it in `onChange`
95
+ - Sorted-list mutations via in-place patch (`items.map(x => x.id === id ? merged : x)`) leave the list visually sorted by the OLD order when the mutation changes the sort key (`updatedAt`, priority, score) — a recently-saved item stays mid-list instead of moving to the top. After mutating the sort key, re-sort the array (or splice the item to its new position). Same for filter predicates that may now exclude the item, and for selection sets that may need pruning when the mutation changes the entity's group/tab
96
+ - Reload-on-update fallback that replaces richer state with partial input — handlers like `if (opts.reload) { fresh = await refetch(); setState(fresh) } else { setState(updated) }` that fall back to `setState(updated)` ON REFETCH FAILURE blank fields the previous state had (full body, comments, derived metadata, joined relations) when the input lacks them. Either keep the previous state on refetch failure, retry, or surface the error explicitly — never replace richer state with the partial input that triggered the reload as a "recovery" path
70
97
 
71
98
  **Streaming response handlers (server-side)**
72
99
  - Long-lived streaming handlers (SSE, chunked HTTP) must register a client-disconnect listener (`req.on('close')`/`req.on('aborted')`), set an `aborted` flag, and check it before subsequent writes — otherwise the server keeps emitting frames and incurring `write-after-end` errors after the client navigates away
@@ -74,14 +101,20 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
74
101
  - Per-request timeouts on streaming responses must remain active for the full duration of stream consumption, not cleared on initial fetch resolution — a stalled upstream that keeps the connection open hangs the consumer indefinitely
75
102
  - Honor write backpressure: check the boolean return of `write()` and await `'drain'` when it returns false; otherwise a slow client causes unbounded server-side buffering
76
103
  - When attaching paired listeners for backpressure or completion (`drain` + `close`), the cleanup handler must remove ALL of them — asymmetric removal accumulates listeners across slow-client cycles
104
+ - EventEmitter / socket cleanup using inline anonymous handlers (`socket.on('foo', () => { ... })`) cannot be removed precisely later — assign the handler to a named const (`const handleFoo = () => {...}; socket.on('foo', handleFoo); return () => socket.off('foo', handleFoo);`). On shared emitters/sockets/buses, NEVER call `.off(event)` without the specific handler — that removes ALL listeners on the emitter, breaking sibling subscribers (e.g., other components listening to the same socket event)
105
+ - State-reset methods (`clear()`, `reset()`, `dispose()`, mode-switch handlers) must release every related artifact, not just the data they track: timers (`clearTimeout`, `clearInterval`), refs (`pressingRef.current = false`), audio playback (`stopTone()`), pending press state, animation frames, and any boolean flags blocking re-entry. A `clear()` that resets only the dataset leaves dangling timers firing on stale state, audio playing forever, and a stuck `pressingRef` that prevents the next interaction
77
106
  - Named lifecycle events on streams (`error`, `done`, `complete`) must be MUTUALLY EXCLUSIVE — after emitting `error`, do NOT also emit `done`. Otherwise clients parsing the last event treat failed runs as completed
78
107
 
79
108
  **Error handling (single-file)**
80
109
  - Swallowed errors (empty `.catch(() => {})`); error handlers that exit cleanly (`exit 0`, `return`) without user-visible output; handlers replacing detailed failure info with generic messages
81
110
  - Error discrimination by string matching (`err.message.includes('not found')`, regex on error text) — localization, refactors, or wrapper rewrites silently change HTTP status / retry behavior. Use explicit error codes or typed classes
82
111
  - Route handlers mapping any exception from a service into a single HTTP status (e.g., `catch { throw new NotFoundError() }`) — hides real server errors (file I/O, parse, write failures) as domain 404s. Map only known codes/classes; let unknown errors surface as 500
112
+ - Catch-all fallback that synthesizes a SUCCESS-shaped payload indistinguishable from a legitimate quiet state — e.g., catching any IPC/RPC failure and returning `{ success: true, status: { isRunning: false } }` so the dashboard renders "engine cleanly stopped" whether the engine is actually stopped, timing out, or crashed. Operators can't distinguish "expected idle" from "outage". Either preserve a distinct health/mode value the UI recognizes as "outage" AND ensure every consumer maps it, OR translate the underlying error to a specific HTTP status. Don't gate the fallback on broad rejection types — distinguish timeout, connection-refused, and method-not-found
113
+ - State-machine transitions gated on an external call (settle a tracked order, mark a job complete) must verify success: a thrown error, an empty array where data was expected, or a null where a record was expected MUST keep the state in `pending`, NOT advance it. A swallowed external-call failure followed by `state = 'settled'` loses ledger entries and erases work that was never confirmed. Distinguish "external call definitively reports terminal state" from "external call failed to populate the data we needed"
83
114
  - Error wrappers that re-throw with only `{ status }` and drop `code`/`context`/`cause` — downstream consumers see generic `INTERNAL_ERROR` instead of the specific code. Preserve structured detail when wrapping
84
115
  - Errors thrown from middleware/parser modules (multipart, body parsers, validators) without `err.status` set are normalized to HTTP 500 by the framework's default error handler — but they typically represent client payload issues (bad multipart, payload too large, file type rejected, missing boundary). Set `err.status = 400` (or 413 for size limits, etc.) and a stable `err.code` (`PAYLOAD_TOO_LARGE`, `INVALID_MULTIPART`, `VALIDATION_ERROR`) at the throw site, OR throw a typed `ServerError`/`ApiError`, so clients distinguish their bad input from real server failures
116
+ - Error message templates that interpolate possibly-empty values (`${stderr.split('\n')[0]}`, `${err.code}`, first-line excerpts) produce useless output (`X failed: .`, `X failed: undefined`) when the upstream returns blank/whitespace/missing. Trim the source, fall back to a default (`err.message`, `'unknown error'`) when empty, and skip trailing punctuation that piles on already-punctuated content (`needsPeriod = !/[.!?]$/.test(text)`)
117
+ - JSDoc / comment claims of absolute behavior (`Never throws`, `Always returns`, `Synchronous`, `Idempotent`, `Pure`) must be verified against the implementation. A "Never throws" helper that calls `mkdirSync`/`readFileSync`/`atomicWrite` will throw on permissions/disk-full and crash callers that built error handling around the documented contract. Either soften the doc to specify which failure modes are handled (e.g., "expected provisioning failures surface as `{ ok: false }`; unexpected FS errors may still throw") OR wrap the throwing operations in try/catch and return a structured failure
85
118
  - Outbound `fetch()` / HTTP calls in setup, install, or update scripts (`scripts/*.js`, `setup.sh` invoked tools, post-install hooks) without an `AbortController` per-request timeout — a hung server (accepts connection, never responds) blocks the parent shell process indefinitely, breaking "fail-soft" guarantees that the parent script depends on. Use the same timeout helper the rest of the codebase uses for outbound HTTP, and treat timeout as a skip with a clean exit code
86
119
 
87
120
  ### Domain-Specific (check only when file type matches)
@@ -106,7 +139,7 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
106
139
 
107
140
  **Lazy initialization** _[dynamic imports, lazy singletons, bootstrap]_
108
141
  - Cached state getters returning null before initialization
109
- - Module-level side effects (file reads, SDK init) without error handling
142
+ - Module-level side effects (file reads, SDK init) without error handling. Loaders for user-editable config (`media-models.json`, settings, registries) invoked at import-time must wrap BOTH the read (`readFileSync` can throw on permissions, broken symlinks, transient I/O) AND the parse (`JSON.parse` throws on malformed edits) in try/catch — an uncaught exception during import aborts server boot. Beyond IO/parse: parsed-but-malformed data (missing top-level keys, wrong types like `image: {}` instead of `image: []`) crashes downstream consumers (`reg.video.macos` is undefined → TypeError). Normalize the parsed object: deep-merge with defaults, coerce array-typed fields with `Array.isArray(...)`, coerce object-typed fields with a `isPlainObject` check, and fall back to defaults with a clear log message when shapes don't match
110
143
  - File writes assuming parent directory exists
111
144
  - Bootstrap code importing dependencies it's meant to install — restructure so install precedes resolution
112
145
  - Re-exporting from heavy modules defeats lazy loading
@@ -123,6 +156,10 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
123
156
  - Detached processes with piped stdio — SIGPIPE on parent exit. Use `'ignore'`
124
157
  - Subprocess output buffered without size limits — unbounded memory growth
125
158
  - Platform-specific: hardcoded shell interpreters; `path.join()` backslashes breaking ESM imports — use `pathToFileURL()`
159
+ - Subprocess spawns of binaries with platform-or-distro-dependent names (`pwsh` vs Windows PowerShell `powershell.exe`, `python3` vs `python`, `gh` vs absent on minimal containers, `tailscale` vs path-bundled) must probe (`which`/`where`) and fall back to alternates rather than assuming the modern/preferred name is installed — many Windows boxes ship Windows PowerShell only, distro-stripped Linux containers may lack `python3`. Detect once at startup or per-call; emit a clear actionable error if no candidate is found
160
+ - Platform guards using inverse logic (`if (!IS_WIN && ...)`) when the real condition is "macOS only" run platform-specific binaries on Linux/BSD/Unix where they don't exist (`caffeinate` is macOS-only, `pmset` is macOS-only, `nvidia-smi` requires NVIDIA hardware). Use the positive predicate (`process.platform === 'darwin'`, `IS_MAC`) — every "not Windows" check is wrong on every other non-Windows OS. The same applies to "not macOS" checks meant for Windows-only behavior
161
+ - Setup/install scripts that switch between multiple packages providing the same import path (renamed packages, fork swaps, deprecated→replacement) must explicitly uninstall the predecessor before installing the new one (`pip uninstall -y <old>` before `pip install <new>`). Otherwise both remain installed and Python's import resolution becomes ambiguous (or the wrong one wins on path order). Idempotent: suppress errors from the uninstall step so reruns don't fail when the old package is already absent
162
+ - Case-sensitive string operations on filesystem paths break on case-insensitive filesystems (Windows always, macOS HFS+/APFS by default). Deriving a sibling binary path via case-sensitive `replace(/ffmpeg$/, 'ffprobe')` against a real filesystem path that may be `FFMPEG.EXE` produces an unchanged path → caller spawns the wrong binary and `existsSync` confirms it exists. Use case-insensitive regex flags (`/ffmpeg$/i`) and/or derive the sibling via `path.parse()` + `path.format()` so the result is structurally guaranteed to be the intended binary, not a string that happens to match a pattern
126
163
  - Naive whitespace splitting of command strings breaks quoted arguments — use proper argv parser
127
164
  - Subprocess output parsed from single stream (stdout or stderr) to detect conditions — check both streams and exit code
128
165
  - Readiness/health probes that rely solely on subprocess exit code without inspecting output — many CLIs (`psql`, `curl`, `kubectl`) exit 0 for empty results, missing schema, or auth-only handshake. Capture stdout and verify it contains the expected marker. For tools that read user-level config (`.psqlrc`, `~/.curlrc`), pass flags that ignore those files (`-X`, `--no-rcfile`) so the probe behaves the same in every environment
@@ -131,6 +168,9 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
131
168
  - Arguments passed via process argv have OS-imposed length limits (notoriously low on Windows, ~32KB). For variable-length payloads (prompts, JSON blobs, file contents), pipe via stdin instead of constructing a long argv. If argv must be used, enforce a strict cap and fail with a clear message before spawning
132
169
  - PowerShell `$LASTEXITCODE` propagates from any external call and is read by the script's final exit. A step claiming to be "fail-soft" (e.g., a non-essential post-install hook) that runs an external command without explicitly resetting `$LASTEXITCODE = 0` (or wrapping in try/catch with `$global:LASTEXITCODE = 0`) leaks a non-zero exit code from the soft step into the parent script's overall exit status — breaking the fail-soft contract that callers depend on
133
170
 
171
+ **Multi-input pipelines** _[ffmpeg concat, audio mixers, image compositors, set unions]_
172
+ - Pipelines that join multiple input streams (ffmpeg `concat`, audio mixers, image compositors, SQL `UNION ALL`) require matching parameters across ALL inputs (sample rate, channel layout, sample format, frame rate, dimensions, column types). When one branch normalizes (`aresample=48000,aformat=fltp`, dim coercion) and a sibling branch (silent track from `anullsrc`, fallback empty stream, generated placeholder) does not, the join fails at runtime ("Input link parameters do not match") or produces uneven output. Apply the SAME normalization across regular AND fallback/generated branches; if the pipeline is variable-rate, force a canonical rate (`fps`, `settb`, `aresample`) on every input
173
+
134
174
  **Search & navigation** _[search, deep-linking]_
135
175
  - Search results linking to generic list pages instead of deep-linking to specific record
136
176
  - Search code hardcoding one backend when system supports multiple
@@ -140,6 +180,10 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
140
180
 
141
181
  **Accessibility** _[UI components, interactive elements]_
142
182
  - Interactive elements missing accessible names, roles, or ARIA states — including labels removed or replaced with non-descriptive placeholders in conditional/compact rendering modes. ARIA attributes should match established patterns used elsewhere for the same widget type (disclosure, menu, dialog)
183
+ - Icon-only buttons relying on `title` for their accessible name fail across screen readers — `title` is announced inconsistently or not at all, and not surfaced to assistive tech in many contexts. Add an explicit `aria-label` to the button and mark the icon `aria-hidden="true"` since it's decorative
184
+ - Form submission via Enter calls `onSubmit` regardless of the submit button's disabled state — submit handlers must replicate every guard the disabled state enforces (`notConnected`, missing prerequisite, validation failure, in-flight). Either set the inner button to `type="button"` (so Enter doesn't submit) and submit only via the explicit handler, or duplicate every disabled-condition into the submit handler's early-return guard
185
+ - `<input type="number">` (or any text input) controlled with format-on-render (`value={n.toFixed(2)}`, `value={String(parsed)}`) prevents users from typing intermediate states ("", "0.", "-", "1e") because every keystroke snaps the value back to a formatted number. Store the raw string in local state and only parse/clamp on `onBlur` or Enter, OR use `defaultValue` with an uncontrolled input and read on commit — controlled + format-on-render is broken for free-form numeric entry
186
+ - Form inputs that mutate the source-of-truth prop (`onChange={e => project.name = e.target.value}`, `onChange={e => setProject({ ...project, name: e.target.value })}`) on every keystroke break dirty-checks that compare against the same prop on `onBlur` (`if (trimmed === project.name) return`) — the guard is always true, so the save never fires. Use a separate draft state OR compare against the last-saved/server-confirmed value (e.g., a ref capturing the value at mount), and trigger the save when the draft differs from the saved snapshot
143
187
  - ARIA roles applied without the keyboard interactions they imply — `role="menu"`/`menuitem*` expects roving focus, arrow-key navigation, Escape scoped to the menu, and focus management; `role="listbox"` expects Home/End/typeahead; `role="dialog"` expects focus trap + return focus. Either implement the full interaction pattern or drop to a simpler one (native `<button>` + disclosure)
144
188
  - Nested inputs handling `Escape`/`Enter`/`ArrowUp`/`ArrowDown` inside a modal/form that also handles the key at the ancestor — the event bubbles and the ancestor fires too (closes modal, submits form). Call `e.stopPropagation()` (and usually `preventDefault()`) in the inner handler
145
189
  - Custom toggles from non-semantic elements instead of native inputs
@@ -148,75 +192,18 @@ For each changed file, read the **ENTIRE file** (not just diff hunks). New code
148
192
 
149
193
  **UI performance** _[UI components with streaming, scroll, or frequent updates]_
150
194
  - Event handlers or `useEffect` callbacks firing on every high-frequency event (streaming deltas, scroll, resize, keydown) without throttle, debounce, or `requestAnimationFrame` batching — causes jank and excessive re-renders. Batch with rAF or a time-based limiter when the handler doesn't need to run on every tick
195
+ - Global event-listener `useEffect`s (`addEventListener('keydown'/'mousemove'/'resize')`) whose dependency array includes a rapidly-mutating object or callback — the listener detaches and re-attaches on every change, churning DOM and racing in-flight events. The cleanup also runs on every re-attach, so any timers/refs the cleanup clears (`flushTimerRef`, `wordTimerRef`, audio nodes) get reset mid-interaction. Stabilize: keep changing values in a `ref` read inside a stable handler, and depend only on truly listener-relevant inputs (`enabled`, route key) — not on the live filter object or per-press state
196
+ - Background polling (`setInterval(asyncFn, N)`, recurring `fetch` chains) must (a) suppress per-iteration error toasts via the codebase's `silent` flag (or equivalent) — transient failures otherwise spawn toast storms; (b) guard against overlapping in-flight requests with a ref boolean or convert to a `setTimeout`-after-resolve loop. `setInterval(asyncFn, N)` produces overlapping requests when N < response time, leading to out-of-order state updates and resource pile-up
197
+ - Background polling whose data appears in long-lived UI surfaces (HUD counts, headers, dashboards) must subscribe to live event streams (sockets, pub/sub) rather than re-polling on a fixed cadence — one-shot fetches at mount go stale immediately, and polling alone misses bursty changes between intervals
151
198
 
152
- **Wire-protocol parsers** _[SSE/NDJSON/line-delimited frame parsers, multipart, etc.]_
153
- - Wire-protocol parsers must (a) handle the spec's full set of separators (e.g., both `\n\n` and `\r\n\r\n` for SSE; multiple `data:` lines joined with `\n`); (b) flush remaining buffered content on EOF — otherwise the last frame is dropped when upstream closes mid-frame; (c) wrap per-frame deserialization (`JSON.parse`) so a single malformed frame doesn't terminate the entire stream
199
+ **Wire-protocol parsers** _[SSE/NDJSON/line-delimited frame parsers, multipart, source-code/config parsers, etc.]_
200
+ - Wire-protocol parsers must (a) handle the spec's full set of separators (e.g., both `\n\n` and `\r\n\r\n` for SSE; multiple `data:` lines joined with `\n`); (b) flush remaining buffered content on EOF — otherwise the last frame is dropped when upstream closes mid-frame; (c) wrap per-frame deserialization (`JSON.parse`) so a single malformed frame doesn't terminate the entire stream. Per-token regexes that scan stream chunks (line-extracting `data: foo` from raw chunks) miss matches when the producer splits a single line across chunks — buffer with a rolling buffer or use `readline` to parse complete lines before applying token regexes
201
+ - Hand-rolled parsers for source-code or config formats (env-block extraction, brace matching, key extraction) must handle the language's full token grammar: nested braces with depth counting (`\{([^}]*)\}` truncates at first `}`, missing nested object spreads / ternaries), ALL string delimiters including backtick template literals (not just single/double quotes), escape detection by **counting consecutive backslashes** (odd = escaped, even = not — `\\\\"` is a real quote, not an escape), and optional quoting on keys (`PORT: 3000`, `'PORT': 3000`, `"PORT": 3000`). Prefer the language's own AST parser (Babel, esbuild, ts-morph, YAML/TOML libs) when available; flag every regex that simplifies the source grammar
154
202
  - Stateful parsers (multipart, MIME, framed protocols) must verify they reached the terminal state on `req.on('end')` / EOF — calling `finish()` while still in `STATE_HEADERS`/`STATE_BODY` accepts truncated input as success, silently corrupting partially-written uploads or persisting half-written state. Track the terminal-state transition (e.g., `STATE_DONE` after the closing `--boundary--`) and return a 400 error otherwise (and clean up any partial files written)
155
203
  - Per-part state in stateful parsers must be reset at part boundaries — fields like `currentFileMimetype`, accumulated headers, decoder state, and offsets that aren't cleared at the start of each new part will leak the previous part's value (e.g., a file part with no `Content-Type` inherits the previous part's mimetype). Reset per-part state at the top of the part-start handler
156
204
  - Refactoring a streaming parser to "buffer-then-process" (calling `readAllBytes()` / `Buffer.concat(chunks)` / `await req.text()` before parsing) defeats the streaming contract and re-introduces an OOM/DoS vector for large uploads — verify the new implementation still respects each caller's `maxSize`/body cap WHILE reading (stop collecting once bytes exceed the cap), or restore true streaming. Watch for header comments still claiming "streams" / "never buffers entire body in memory" after such refactors — they become a documentation lie
157
205
  - Library wrappers advertising a multer/express-style contract `(req, file, cb)` must pass the real `req` (not `null`) through to filters/hooks; treating the `cb` as synchronous breaks any caller that supplied an async filter (callback fires later, but the wrapper already read pre-callback state). Either enforce synchronous filters with a clear error and document, or `await` a Promise-wrapped callback before continuing
158
206
 
159
- ### Always Check — Quality & Conventions
160
-
161
- **Intent vs implementation (single-file)**
162
- - Labels, comments, status messages describing behavior the code doesn't implement. Also covers factual doc drift: file paths/extensions (`foo.js` referenced when the file is `foo.jsx`), item counts ("13 widgets" when there are 15), default entity names ("Default" vs actual "Everything"), and route/response-shape comments that don't match what the handler returns. Verify every factual claim in a comment or JSDoc against the code it references
163
- - Inline code examples or command templates that aren't syntactically valid
164
- - Sequential numbering with gaps or jumps after edits
165
- - Template/workflow variables referenced but never assigned — trace each placeholder to a definition
166
- - Constraints described in preamble not enforced by conditions in procedural steps
167
- - Duplicate or contradictory items in sequential lists
168
- - Completion markers or success flags written before the operation they attest to
169
- - Existence checks (directory exists, file exists) used as proof of correctness — file can exist with invalid contents
170
- - Lookups checking only one scope when multiple exist (local branches but not remote)
171
- - Tracking/checkpoint files defaulting to empty on parse failure — fail-open guards
172
- - Registering references to resources without verifying resource exists
173
- - Composed instructions, prompts, system messages, or rule sets that vary by mode/role/context — unconditional clauses can contradict mode-specific directives (e.g., "always cite sources inline" combined with a `draft` mode that asks for "no preamble, no commentary"). Build the composition conditionally — include each block only for modes that want it — or define an explicit precedence so contradictions are predictable
174
-
175
- **UX integrity (single-component)**
176
- - Unsaved changes / dirty state silently discarded when the user switches context in a multi-record editor or closes a sheet — data loss. Dirty-check on switch (inline confirm), auto-save drafts, or disable the switch control while dirty. `beforeunload` does not cover in-app context switches
177
-
178
- **AI-generated code quality**
179
- - New abstractions, wrapper functions, helper files serving only one call site — inline instead
180
- - Feature flags, config options, extension points with only one possible value
181
- - Commit messages claiming a fix while the bug remains
182
- - Placeholder comments (`// TODO`, `// FIXME`) or stubs presented as complete
183
- - Unnecessary defensive code for scenarios that provably cannot occur
184
- - Cleanup callbacks (useEffect return, finalizer, dispose, signal handler) containing only comments are misleading — implement the cleanup or remove the callback entirely
185
-
186
- **Configuration & hardcoding**
187
- - Hardcoded values when config/env var exists; dead config fields; unused function parameters
188
- - Duplicated config/constants/helpers across modules — extract to shared module. Watch for behavioral inconsistencies between copies
189
- - CI pipelines without lockfile pinning or version constraints
190
- - Production code paths with no structured logging at entry/exit
191
- - Error logs missing reproduction context (request ID, input params)
192
- - Async flows without correlation ID propagation
193
-
194
- **Supply chain & dependencies**
195
- - Lockfile committed and CI uses `--frozen-lockfile`; no drift from manifest
196
- - `npm audit` / `cargo audit` / `pip-audit` — no unaddressed HIGH/CRITICAL vulns
197
- - No `postinstall` scripts from untrusted packages executing arbitrary code
198
- - Overly permissive version ranges (`*`, `>=`) on deps with breaking-change history
199
-
200
- **Test coverage**
201
- - New logic/schemas/services without tests when similar existing code has tests
202
- - New error paths untestable because services throw generic errors
203
- - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Tests asserting by inspecting source code strings rather than calling functions
204
- - Tests depending on real wall-clock time or external dependencies
205
- - Missing tests for trust-boundary enforcement
206
- - Tests exercising code paths the integration layer doesn't expose — pass against mocks but untriggerable in production
207
- - Test mock state leaking between tests — "clear" resets invocation counts but not configured behavior; use "reset" variants
208
- - Response/status assertions written as loose ranges (`status >= 400`, `status < 500`, `ok: false`) — a regression that turns a 400 validation failure into a 500 still passes. Assert the specific expected status so tests distinguish validation from server failure
209
-
210
- **Automated pipeline discipline**
211
- - Internal code review must run before creating PRs — never go straight from "tests pass" to PR
212
- - Copilot review must complete before merging
213
- - Automated agent output must be reviewed against project conventions
214
-
215
- **Style & conventions**
216
- - Naming and patterns inconsistent with rest of codebase
217
- - New content not matching existing indentation, bullet style, heading levels. Within a single structured file (changelog, README, TOML config), section headers must be unique — duplicate `## Fixed` blocks or repeated table sections are a merge artifact that splits content downstream tools expect to find under one header. Consolidate
218
- - Shell instructions with destructive operations not verifying preconditions first
219
-
220
207
  ## Output Format
221
208
 
222
209
  For each finding:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "slash-do",
3
- "version": "2.12.0",
3
+ "version": "2.13.0",
4
4
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
5
5
  "author": "Adam Eivy <adam@eivy.com>",
6
6
  "license": "MIT",
package/uninstall.sh CHANGED
@@ -33,7 +33,8 @@ OLD_COMMANDS=(cam good makegoals makegood optimize-md)
33
33
  LIBS=(
34
34
  code-review-checklist copilot-review-loop graphql-escaping
35
35
  remediation-agent-template swift-review-checklist swift-gotchas
36
- review-surface-scan review-security-audit review-cross-file-tracing
36
+ review-surface-scan review-surface-quality review-security-audit
37
+ review-cross-file-tracing review-cross-file-contract
37
38
  )
38
39
 
39
40
  HOOKS=(slashdo-check-update slashdo-statusline)