slash-do 2.12.0 → 2.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -23,9 +23,11 @@
  - Parallel arrays or tuples coupled by index position (e.g., a names array, a promises array, and a destructuring assignment that must stay aligned) — insertion or reordering in one silently misaligns all others. Use objects/maps keyed by a stable identifier instead
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
  - UI-framework state invariants (uniqueness, monotonicity, cap/floor) checked against the render-time value before calling the updater — rapid events or concurrent updates then violate the invariant. Move the check inside the functional updater (e.g., `setX(prev => prev.includes(id) ? prev : [...prev, id])`) so it runs against the latest state
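A minimal sketch of the functional-updater pattern above — `addOnce` is a hypothetical helper; the point is that the uniqueness check runs against `prev`, not against a captured render-time value:

```typescript
type Updater<T> = (prev: T) => T;

const addOnce = (id: string): Updater<string[]> => (prev) =>
  prev.includes(id) ? prev : [...prev, id];

// Simulate two rapid events that React would batch: each updater sees the
// latest state, so the uniqueness invariant holds even though both "fired"
// before any re-render.
let state: string[] = [];
for (const update of [addOnce("a"), addOnce("a")]) state = update(state);
// state is ["a"], not ["a", "a"]
```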
+ - Derived state values (current time, scroll position, focused index, selected page) bounded by another piece of state (total duration, list length, page count) must be clamped when the bound shrinks. Removing items, tightening trims, or deleting the last page leaves the derived value past its new upper bound, and downstream readers (preview seekers, `arr[index]`, lookups) return garbage / past-end / `undefined`. Add a `useEffect` keyed on the bound that clamps the derived value (and pauses playback / clears selection if appropriate)
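A sketch of the clamp itself (hypothetical `clampIndex`); in React it would run inside a `useEffect` keyed on `length` so the derived index is re-clamped whenever the bound shrinks:

```typescript
// Clamp a bound-derived index into [0, length - 1]; 0 is the conventional
// resting value for an empty collection (callers may also clear selection).
const clampIndex = (index: number, length: number): number =>
  length === 0 ? 0 : Math.min(index, length - 1);

// After deleting items, an index of 5 into a 3-item list clamps to 2
// instead of leaving arr[5] === undefined downstream.
```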
  - Reactive effects / observers that depend on a value they themselves write — the write retriggers the effect, producing an infinite loop or network storm. Split into separate effects (one guarded by a "not yet initialized" condition, one that refreshes on an explicit trigger), or drop the self-written value from the dependency array and update via a functional setter
  - Functions with >10 branches or >15 cyclomatic complexity — refactor into smaller units
  - String accumulation via `+=` inside high-frequency loops (streaming frames, chunked I/O, per-event handlers) becomes O(n²) for long outputs and triggers React re-renders on growing payloads. Collect chunks into an array and `join('')` once at the end while still emitting per-chunk events to consumers
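A minimal sketch of the collect-then-join shape, assuming a hypothetical `onChunk` consumer callback:

```typescript
function collectChunks(
  chunks: Iterable<string>,
  onChunk?: (chunk: string) => void,
): string {
  const parts: string[] = [];
  for (const chunk of chunks) {
    parts.push(chunk); // amortized O(1), vs O(n) per `out += chunk`
    onChunk?.(chunk);  // consumers still see every chunk as it arrives
  }
  return parts.join(""); // single O(n) concatenation at the end
}
```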
+ - Sorting an entire collection just to find a single max/min/first-matching element is O(n log n) when a single pass is O(n) — `[...items].sort(byDate)[0]` over thousands of items burns CPU on every render. Use a linear scan that tracks the current best (`reduce`, manual `for` loop with comparator). Especially impactful when invoked per-render or per-component
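The linear scan can be sketched as a generic `maxBy` (a hypothetical helper) replacing `[...items].sort(byScoreDesc)[0]` — one pass, no array copy:

```typescript
function maxBy<T>(items: readonly T[], score: (item: T) => number): T | undefined {
  let best: T | undefined;
  let bestScore = -Infinity;
  for (const item of items) {
    const s = score(item);
    if (best === undefined || s > bestScore) {
      best = item;
      bestScore = s;
    }
  }
  return best; // undefined on empty input, mirroring sort(...)[0]
}
```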
  - Server-side string formatting (`toLocaleString`, `toLocaleDateString`, currency/number formatters) that depends on locale/timezone defaults produces non-deterministic outputs across deployments. For data that flows into prompts, logs, persisted records, or cross-system messages, format with an explicit `new Intl.DateTimeFormat(locale, { timeZone: ... })` or ISO strings; reserve locale-aware formatting for user-visible UI layers where the user's locale is the explicit input
  - Required-at-use-time config values (model name, API key, endpoint URL, default selection) that may be null/undefined in the source data must be validated at the boundary before invoking the downstream API or initialization. Otherwise the upstream API responds with an opaque error far from the user's actual intent. Emit a clear, actionable error (or fail fast at startup) when a required value is missing, identifying the specific field
  - Child process `spawn()` calls without an `error` event handler — when the binary is missing or unexecutable, Node emits an `'error'` event and never `'close'`. Promise wrappers listening only for `'close'` hang forever; bare spawn calls with no listener crash the parent via uncaught exception. Always pair `proc.on('error', ...)` with `'close'`. SIGKILL escalation guards must check liveness via `proc.exitCode == null` (or a `closed` flag set in the close handler), not `proc.killed` — `.killed` becomes `true` immediately after `kill('SIGTERM')` is called, so guards using `if (!proc.killed)` never fire and hung children survive indefinitely. Single-process tracking ("BUSY guard", `activeProcess` global) must hold the reference until the `'close'` event fires, not until `kill()` is sent — clearing at SIGTERM opens a race window where a new job starts while the previous child is still alive
@@ -34,9 +36,26 @@
  - Late-connecting clients to long-running async jobs (SSE, WebSocket subscribe-by-id) receive nothing if they connect after the terminal `complete`/`error` broadcast — the server emitted once and moved on. Persist the most-recent (or terminal) payload on the job and emit it immediately on attach, OR document that subscribers must connect before kicking off the job and update any "late connectors will get the final state" comments accordingly
  - Server returning an empty success payload (`200` with `{ images: [] }`, `{ items: null }`, etc.) when an awaited operation succeeded but the artifact fetch failed — clients treat empty as "no work to show" and never surface the underlying error. After awaiting completion, a missing/unreadable artifact is an internal error: return non-2xx with a structured error, never an empty 200
  - HTML `<button>` elements without an explicit `type="button"` attribute default to `type="submit"`. When the component is rendered (or could be rendered) inside a `<form>` ancestor, clicks trigger unintended form submission. Set `type="button"` on every non-submit button (close, cancel, expand, menu trigger)
+ - Form submission via Enter calls `onSubmit` regardless of the submit button's disabled state — submit handlers must replicate every guard the disabled state enforces (`notConnected`, missing prerequisite, validation failure, in-flight). Either set the inner button to `type="button"` so Enter doesn't submit, or duplicate every disabled-condition into the submit handler's early-return guard
+ - `<input type="number">` (or any text input) controlled with format-on-render (`value={n.toFixed(2)}`, `value={String(parsed)}`) prevents users from typing intermediate states ("", "0.", "-", "1e") because every keystroke snaps the value back to a formatted number. Store the raw string in local state and only parse/clamp on `onBlur` or Enter, OR use `defaultValue` with an uncontrolled input — controlled + format-on-render is broken for free-form numeric entry
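A sketch of the commit-on-blur half of this pattern (`commitNumeric` is a hypothetical helper): the input keeps the raw string in local state, and parsing/clamping runs only on blur or Enter:

```typescript
function commitNumeric(raw: string, min: number, max: number, fallback: number): number {
  const trimmed = raw.trim();
  const n = Number(trimmed);
  // Intermediate states ("", "-", "0.") either parse to NaN or to a partial
  // value; only a committed value is clamped into [min, max].
  if (trimmed === "" || Number.isNaN(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}
```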
+ - Form inputs that mutate the source-of-truth prop on every keystroke break dirty-checks that compare against the same prop on `onBlur` (`if (trimmed === project.name) return`) — the guard is always true, so the save never fires. Use a separate draft state OR compare against the last-saved/server-confirmed value
+ - Optional-chain incompleteness through dereference chains: `obj?.a.b.c` only guards `obj` — every subsequent `.b` / `.c` still throws if `a` or `b` is null. Extend optional chaining through every dereference (`obj?.a?.b?.c`), or destructure with defaults at the boundary
+ - Temp filenames derived from `Date.now()` collide when concurrent operations start in the same millisecond — corrupted output, mid-flight `unlink` of another request's file. Use `randomUUID()` (combined with `process.pid` for cross-process isolation), `fs.mkdtemp` for per-request scratch dirs, or a shared atomic counter
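One shape of the fix, combining `process.pid` (cross-process isolation) with `randomUUID()` (within-process uniqueness); `tempName` is a hypothetical helper:

```typescript
import { randomUUID } from "node:crypto";

// Collision-free temp filename: two concurrent requests in the same
// millisecond can no longer produce the same name.
const tempName = (ext: string): string => `${process.pid}-${randomUUID()}${ext}`;
```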
+ - Cache miss for falsy successful values: `if (cache[key]) ... else cache[key] = compute(...)` re-computes whenever the cached value is `''`, `0`, or `false` — falsy successful results are never cached. Use `if (key in cache)` (or `Object.hasOwn(cache, key)`) so the *presence* of the key, not its truthiness, controls the cache hit. Same caveat applies to `Map.get(key) ?? compute()` when the cached value can be falsy
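A minimal demonstration of presence-based hits (hypothetical `getCached`): the lookup uses `Map.has()`, never the truthiness of the stored value:

```typescript
const cache = new Map<string, string>();
let computes = 0;

function getCached(key: string, compute: (k: string) => string): string {
  if (cache.has(key)) return cache.get(key)!; // hit even when the value is ""
  computes++;
  const value = compute(key);
  cache.set(key, value);
  return value;
}

getCached("k", () => ""); // computes once, caches the falsy ""
getCached("k", () => ""); // hit — `if (cache[key])` would have recomputed here
```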
+ - Browser storage APIs (`sessionStorage`/`localStorage` `setItem`/`getItem`, IndexedDB) and `JSON.parse` on stored values can throw — Safari private mode, quota exceeded, disabled cookies, corrupted data, or older schema all surface as runtime exceptions during render or effects. Wrap reads/writes in try/catch and validate the parsed shape before consuming; storage failure must not crash the page or trip the nearest error boundary
+ - File-existence checks for "must be a regular file" — `existsSync(path)` returns true for directories, symlinks, and special files. When downstream code needs a regular file (image, config, manifest, ffmpeg input), use `statSync(path, { throwIfNoEntry: false })?.isFile()` and reject `.`, `..`, and empty basenames before passing to subprocess args / `readFile`. Wrap `statSync` in try/catch when invoked from request validation paths so transient FS errors surface as clean 4xx, not 500s
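A sketch of the regular-file check (hypothetical `isRegularFile`), using the `statSync` form from the bullet and swallowing transient FS errors into a "not usable" result so request validation can return a clean 4xx:

```typescript
import { statSync } from "node:fs";

function isRegularFile(p: string): boolean {
  try {
    // existsSync() alone would say "yes" for directories and special files.
    return statSync(p, { throwIfNoEntry: false })?.isFile() ?? false;
  } catch {
    return false; // transient FS error → treat as "not a usable file"
  }
}
```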
+ - Cache-validity checks using `existsSync` alone return zero-byte / partial / corrupted files from prior failed attempts as valid cache hits. When the cache is a file written by a subprocess that may crash mid-write (extracted frames, downloaded artifacts, generated thumbnails), validate the cached file is non-empty (`statSync(path).size > 0`) and consider a magic-byte / format check; on extraction failures, unlink the partial output before returning so the next call re-runs cleanly. Same caveat for "fall through to re-extraction on stat error" comments — wrap `statSync` in try/catch and treat any error as cache miss, not as a fatal abort
+ - LLM tool-call / palette / agent-invoked function parameters commonly arrive as JSON-typed strings even when the schema declares `number`/`boolean` (the calling LLM serialized them as strings). Pass-through patterns like `width: width || undefined` forward `'1024'` as a string into downstream APIs. Coerce explicitly (`Number(v)`, `Number.isInteger`, range guards) and return a structured error on invalid coercion — bypassing route-layer schemas means the tool's own handler must enforce the same contract
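A coercion sketch for one integer parameter; the `{ ok, error }` result shape is a hypothetical convention, not a fixed API:

```typescript
function coerceInt(
  value: unknown, min: number, max: number,
): { ok: true; value: number } | { ok: false; error: string } {
  // Accept JSON-typed strings like "1024"; reject "" so Number("") !== 0 slips through.
  const n = typeof value === "string" && value.trim() !== "" ? Number(value) : value;
  if (typeof n !== "number" || !Number.isInteger(n) || n < min || n > max) {
    return { ok: false, error: `expected integer in [${min}, ${max}], got ${JSON.stringify(value)}` };
  }
  return { ok: true, value: n };
}
```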
+ - Boolean flags surfaced to clients/consumers (`feature.enabled`, `mode.active`, `replication.on`) must reflect whether the feature is CONFIGURED, not whether transient runtime state happens to be present at the moment the response is built. Deriving `enabled` from "are there active items right now" causes the flag to flip between cycles, fills, sessions, or batches even though the feature is still on — UIs that key off the flag turn the feature display on/off in lockstep with runtime activity. Source the flag from config; report runtime presence in a separate field (`hasActiveItems`, `currentCount`, `inProgress`)
+ - Tracking sets/arrays/maps that only GROW during process lifetime — settled-id sets, seen-event fingerprints, processed-task ids, dedup caches — leak monotonically with workload. Long-lived services (workers, daemons, market-data feeds, event consumers) accumulate gigabytes over weeks and eventually OOM. Cap with size limits + LRU eviction, periodic age-based pruning, persistence to a bounded store, OR scope the set to a smaller window (per-cycle, per-day, per-session) when the use case allows. The same caveat applies to any in-memory map keyed by an unbounded id space
 
  **API & URL safety**
  - User-supplied or system-generated values interpolated into URL paths, shell commands, file paths, subprocess arguments, or dynamically evaluated code (eval, CDP evaluate, new Function, template strings executed in browser/page context) without encoding/escaping — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries, and `JSON.stringify()` for values embedded in evaluated code strings. Generated identifiers used as URL path segments must be safe for your router/storage (no `/`, `?`, `#`; consider allowlisting characters and/or applying `encodeURIComponent()`). Identifiers derived from human-readable names (slugs) used for namespaced resources (git branches, directories) need a unique suffix (ID, hash) to prevent collisions between entities with the same or similar names
+ - HTTP header values that are case-insensitive by spec (`Content-Type`, `Accept`, `Authorization` scheme, `Transfer-Encoding`) must be lowercased before comparison — `startsWith('multipart/form-data')` against `Multipart/Form-Data; boundary=...` returns false and silently skips the middleware. Parse parameterized values structurally (`mimeType.split(';')[0].trim().toLowerCase()`) rather than substring-matching the raw header
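The structural parse is one line (hypothetical `mimeOf`): split off parameters, then normalize case:

```typescript
// "Multipart/Form-Data; boundary=----x" → "multipart/form-data"
const mimeOf = (contentType: string): string =>
  contentType.split(";")[0].trim().toLowerCase();
```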
+ - Hand-rolled URL parsing for structured services (GitHub/GitLab PR URLs, OAuth callback URLs, custom protocols) must use `new URL()` and validate: protocol allowlist (reject non-http(s)), `parsed.hostname` not `parsed.host` (host includes the port and breaks `--hostname` consumers), exact path-segment shape, and a numeric/format check on terminal segments. Allow trailing path suffixes (`/pull/123/files`) without mis-parsing earlier segments as the project. Reject an empty hostname or an explicit port by default unless explicitly supported. When the function operates on data from one source (PR URL) but determines runtime behavior from another (the local repo's `origin` remote), the two can disagree — carry the forge/host discriminator with the data and act on the URL itself
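A hypothetical GitHub-style PR-URL parser sketch following those rules (names and result shape are illustrative only):

```typescript
interface PrRef { hostname: string; owner: string; repo: string; number: number }

function parsePrUrl(raw: string): PrRef | null {
  let url: URL;
  try { url = new URL(raw); } catch { return null; }
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;
  if (!url.hostname || url.port) return null; // reject empty host / explicit port
  const seg = url.pathname.split("/").filter(Boolean);
  // /owner/repo/pull/123[/files...] — trailing suffixes allowed, number validated
  if (seg.length < 4 || seg[2] !== "pull" || !/^\d+$/.test(seg[3])) return null;
  return { hostname: url.hostname, owner: seg[0], repo: seg[1], number: Number(seg[3]) };
}
```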
+ - Identifier fields used as delimiter-separated keys (`<kind>:<ref>`, `owner/repo`, `tag,name`, `bucket#object`) must reject the delimiter character at validation time — otherwise persisted entries become unaddressable through the API (DELETE/GET/PATCH that splits on the delimiter gets the wrong segments) and cross-component matching (cover keys, cache keys) breaks. Trace from the validator through every consumer that splits on the delimiter
+ - Persisted JSON loaders called at runtime must normalize the parsed root shape — hand-edited or corrupted persistence (`{}` instead of `[]`, `null` instead of `{}`) crashes downstream `.find` / `.unshift` / `.map` callers. Apply `Array.isArray(parsed) ? parsed : []` (or `isPlainObject` for object roots) in EVERY loader, not just the import-time bootstrap, and verify each loader's consumers don't assume shape that the loader doesn't enforce
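A loader-level normalization sketch for an array root (hypothetical `loadArray`); an object-rooted loader would use an `isPlainObject` check the same way:

```typescript
function loadArray(rawJson: string): unknown[] {
  let parsed: unknown;
  try { parsed = JSON.parse(rawJson); } catch { return []; }
  // Hand-edited "{}" or "null" roots would crash downstream .find/.map callers.
  return Array.isArray(parsed) ? parsed : [];
}
```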
+ - Multi-input pipelines (ffmpeg `concat`, audio mixers, image compositors, SQL `UNION ALL`) require matching parameters across ALL inputs (sample rate, channel layout, sample format, frame rate, dimensions, column types). When one branch normalizes and a sibling branch (silent track, fallback empty stream, generated placeholder) does not, the join fails at runtime ("Input link parameters do not match") or produces uneven output. Apply the SAME normalization across regular AND fallback/generated branches; force a canonical rate/format on every input
  - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`). When a route validates body/query fields, the corresponding path parameter must be validated with the same schema — skipping param validation on sibling endpoints (e.g., `PUT /:id` validates body.id but `DELETE /:id` lets the raw param fall through) causes inconsistent error classes (400 vs 404 vs 500) for the same invalid input
  - Character-class regex validators (e.g., `^[a-z0-9-]+$`) claiming to enforce a structured format (slug, kebab-case, reverse-DNS, semver) — they accept leading/trailing separators (`-foo`, `foo-`), repeated separators (`a--b`), and empty segments. Require a segment-based pattern (`^[a-z0-9]+(?:-[a-z0-9]+)*$`, which rejects all three) or use a dedicated parser when the claim is a structured format rather than a character set
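The segment-based form can be checked directly — each `-` must separate two non-empty alphanumeric runs:

```typescript
// One-or-more segments of [a-z0-9]+, joined by single hyphens.
const SLUG = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
```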
  - Request schema fields for large string/binary payloads (base64, file content, free text) without per-field size limits — a total body-size limit alone doesn't prevent individual oversized fields from consuming excessive memory or exceeding downstream service limits. Add per-field `max(N)` constraints with clear error messages. The same cap applies to OUTBOUND payloads stored in persisted records or sent over streaming protocols (snippets, previews, cached excerpts) — capture-time size enforcement prevents on-disk growth and SSE/WebSocket payload bloat that pure display-time truncation cannot
@@ -46,7 +65,7 @@
  - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening
 
  **Trust boundaries & data exposure**
- - API responses returning full objects with sensitive fields — destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it
+ - API responses returning full objects with sensitive fields — destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it. The same scrubbing applies to server-emitted log/status/progress lines forwarded to clients via SSE, WebSocket, or any push channel — `STATUS:Saved to /Users/me/data/...`, error tails, ffmpeg progress lines all leak server filesystem layout exactly the way catalog responses do. Reduce paths to basenames before emitting
  - Server trusting client-provided computed/derived values (scores, totals, correctness flags, file metadata like MIME type and size) when the server can recompute or verify them — strip and recompute server-side; for file uploads, validate content type via magic bytes and size via actual buffer length rather than trusting client-supplied headers. The same principle applies to trust in persisted state: flags read from flat-file/JSON/DB records that control authorization or deletion protection (`builtIn`, `protected`, `role`, `owner`) must be derived from a trusted source (code constants, session identity) on every read — otherwise hand-editing the file or tampered sync can flip the flag and bypass the protection
  - New endpoints mounted under restricted paths (admin, internal) missing authorization verification — compare with sibling endpoints in the same route group to ensure the same access gate (role check, scope validation) is applied consistently. When new capabilities require additional OAuth scopes or API permissions, verify the scope-upgrade check covers all required scopes — a check that only tests for one scope will miss newly added scopes, causing downstream API calls to fail with insufficient permissions
  - User-controlled objects merged via `Object.assign`/spread without sanitizing keys — `__proto__`, `constructor`, and `prototype` keys enable prototype pollution. Use `Object.create(null)` for the target, whitelist allowed keys, and use `hasOwnProperty` (not `in`) to check membership. Also verify the merge can't override reserved/internal fields the system depends on
@@ -74,17 +93,24 @@
  - Async functions invoked from synchronous event handlers (`onClick`, `onKeyDown`, command dispatchers) or effects without handling rejection at the call site — even when a shared request helper toasts the error, the unhandled rejection pollutes the console, leaves UI state inconsistent (optimistic mutation not reverted, palette/modal stuck open, navigation skipped), and hides failures from downstream event-loop instrumentation. Wrap the `await` in try/catch, attach `.catch(...)`, or use `void promise.catch(...)` — and only run success-path side effects (close, navigate, clear dirty) inside the success branch. After awaiting an async operation that may be cancelled or whose owning component may unmount, check `signal.aborted` (or a `mountedRef`) before subsequent state writes — otherwise React warns about updates on unmounted trees and stale state leaks through
  - Single shared error-state variable (one `error` setter) reused by multiple independent async flows — one flow's success path clears the other flow's displayed error and vice versa. Split errors by domain (`dataError`, `layoutsError`), scope errors per operation, or only overwrite the specific error the current flow owns
  - A page renders based on multiple independent async loads but the loading flag reflects only one of them — a slow or failed secondary load renders a blank page with no loading indicator and no error. Either include every render-gating fetch in the loading state (or a per-fetch status map) or provide explicit empty/error states for the secondary data
+ - Array index used as React `key={i}` on a list that's sliced (`logs.slice(-40)`), reordered, filtered, or has items dropped from either end — keys shift as items move, so React reuses DOM nodes for different entries: flicker, lost focus, stale tooltips, broken animations, selection bleed across rows. Use a stable identifier from the payload
  - Unsaved changes / dirty state discarded without warning when the user switches context (selecting another record in a multi-record editor, navigating away, closing a sheet) — silent data loss. Dirty-check on context change (inline confirm), auto-save drafts per record, or gate the switch control until unsaved state is committed. The `beforeunload` prompt alone does not cover in-app context switches
  - Actions triggered from one surface (command palette, global menu, external event) that mutate data another already-mounted page fetched on mount — re-navigating to the same route does NOT remount the page (routers treat it as a no-op), so the visible state stays stale even though the mutation succeeded server-side. Propagate the change via a shared store, a pub/sub event whose name is a shared constant, a refetch-on-focus/visibility hook, or a key-based remount — and verify the mounted page actually subscribes
  - Long-lived streaming response handlers (SSE, chunked HTTP, WebSocket) on the server must register a client-disconnect listener (`req.on('close')`, `req.on('aborted')`) and propagate cancellation through the FULL processing chain (retrieval, fetches, subprocesses) — partial threading wastes work after disconnect, leaves `write-after-end` errors, and inflates resource use. Per-request timeouts on streaming responses must remain active for the full duration of stream consumption, not cleared on initial fetch resolution — a stalled upstream that keeps the connection open hangs the consumer indefinitely. Honor write backpressure: check the boolean return of `write()` and await `'drain'` when it returns false. When attaching paired listeners for backpressure or completion (`drain` + `close`), the cleanup handler must remove ALL of them — asymmetric removal accumulates listeners across slow-client cycles. After flushing streaming headers, framework error middleware (asyncHandler, exception filters) cannot send a JSON error — wrap post-handshake logic in try/finally that translates errors into a terminal `event: error` SSE frame and ends the response gracefully (`if (!res.writableEnded && !res.destroyed) res.end()`)
  - Client-side AbortController for an in-flight streaming/long-lived operation must be invoked when its owning UI context tears down OR navigates AWAY from that operation. When the cleanup is keyed to a route param, navigation events emitted BY the in-flight stream itself (e.g., redirecting to a permalink after the server returns an id) trigger the cleanup and abort the very operation that caused the navigation. Track the streaming operation's identity in a ref and abort only when navigating away from THAT identity, not on every param change. Mirror the abort by cancelling the stream reader (`reader.cancel()`) in a `finally` block and ignoring late events whose ID doesn't match the now-current operation
  - Streaming UIs must preserve the deltas already received when the stream emits a terminal `error` event mid-stream (commit them as a partial result with an error indicator). Clearing the streaming buffer on error discards visible content the user already saw and breaks recovery flows. Pair this with mutually exclusive terminal events (see Error handling)
+ - State-machine transitions gated on an external call (settle a tracked order, mark a job complete, finalize a write) must verify success before advancing: a thrown error, an empty array where data was expected, or a null where a record was expected MUST keep the state in `pending`/`uncertain`, NOT advance to `settled`/`completed`/`finalized`. A swallowed external-call failure followed by `state = 'settled'` loses ledger entries, drops linkages, and erases work that was never confirmed. Distinguish "external call definitively reports terminal state" (advance) from "external call failed or returned empty data" (retry, surface, or leave unchanged). The same applies to optimistic completion based on a single API response shape — verify the response actually contains the expected payload before transitioning
+ - State machines that handle only terminal events (`FILLED`, `COMPLETED`, `SETTLED`, `DONE`) ignore intermediate progress fields (`filledSize`, partial completion percentages, in-progress counters) the producer exposes between start and terminal. Consumers that want live progress (running totals, partial-fill tracking, mid-flight cost basis) need the intermediate values too — either process them or document the lag explicitly. Common bug: a buy order shows fully open in the dashboard until the final FILL event arrives, even though the exchange already partially filled it
+ - Reconstructing UI/API state from a persisted snapshot (state file, last-known-good cache, replay-from-disk path) while a live in-memory tracker exists must reconcile against the live tracker (`getOpenOrders`, `getOrderStatus`, in-memory event log) — not just disk. Fills, cancels, and external mutations that happened after the last persist are missing from the snapshot but may already be visible in the live tracker. Reading from disk alone produces phantom open orders, missed cancellations, zero quantities for filled positions, and stale flags until the next persist. Document the staleness window or explicitly merge live + persisted before emitting
 
  **Error handling** _[applies when: code has try/catch, .catch, error responses, or external calls]_
  - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints — when multiple endpoints make the same access-control decision (e.g., "resource exists but caller lacks access"), they must return the same HTTP status (typically 404 to avoid leaking existence). Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
  - Error discrimination by string matching (`err.message.includes('not found')`, regex on error text) or by coupling to localizable/developer-facing messages — refactors, localization, or wrapper rewrites silently change behavior (HTTP status, retry policy, user message). Use explicit error codes or typed classes and branch on them
  - Route handlers (or equivalent response mappers) that convert any exception from a service call into a single specific status — e.g., `catch (err) { throw new NotFoundError() }` — mask real server errors (file I/O, JSON parse, atomic-write failures) as domain 404s and hide outages. Map only known error classes/codes; let unknown errors surface as 500 so real bugs aren't suppressed
+ - Catch-all fallback that synthesizes a SUCCESS-shaped payload indistinguishable from a legitimate quiet state — e.g., a route that catches any IPC/RPC failure and returns `{ success: true, status: { isRunning: false } }` so the dashboard renders "engine cleanly stopped" whether the engine is actually stopped, timing out, transiently disconnected, or crashed. Operators can't distinguish "expected idle" from "outage" and may take unsafe control actions on a process that's still alive. Either (a) preserve a distinct health/mode value the UI recognizes as "outage" (`health.mode = 'ENGINE_DOWN'`) AND ensure every consumer maps that value, OR (b) translate the underlying error to a specific HTTP status so the dashboard's error-handling path runs. Don't gate the fallback on broad rejection types (e.g., "any IPC error") — distinguish timeout, connection-refused, and method-not-found
  - Error wrappers that re-throw with only a subset of the original fields (e.g., `new ServerError(err.message, { status })` that drops `code`, `context`, `cause`) — downstream consumers see a generic `INTERNAL_ERROR` instead of the specific code they branch on. Preserve structured detail across wrapping: propagate `code`, `context`, `cause`, and any fields clients or logs depend on
+ - Error message templates that interpolate possibly-empty values (`${stderr.split('\n')[0]}`, `${err.code}`, first-line excerpts) produce useless output (`X failed: .`, `X failed: undefined`) when the upstream returns blank/whitespace/missing. Trim the source, fall back to a default (`err.message`, `'unknown error'`) when empty, and skip trailing punctuation that piles on already-punctuated content
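A sketch of the safe-excerpt helper (hypothetical `firstLineOr`): trim, fall back when empty, and strip trailing punctuation before interpolation:

```typescript
function firstLineOr(fallback: string, raw: string | null | undefined): string {
  const line = (raw ?? "")
    .split("\n")[0]
    .trim()
    .replace(/[.:,;]+$/, ""); // avoid "X failed: boom.." pile-ups
  return line === "" ? fallback : line;
}
```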
+ - JSDoc / comment claims of absolute behavior (`Never throws`, `Always returns`, `Synchronous`, `Idempotent`, `Pure`) must be verified against the implementation. A "Never throws" helper that calls `mkdirSync`/`readFileSync`/`atomicWrite` will throw on permissions/disk-full and crash callers that built error handling around the documented contract. Either soften the doc to specify which failure modes are handled OR wrap the throwing operations in try/catch and return a structured failure
  - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures. This includes cross-layer error propagation: if the server returns structured error detail (field-level validation messages, `details[]` arrays, error codes), the client should surface actionable detail rather than discarding the structure for a generic string. Includes external service wrappers that return `null`/empty for all non-success responses — collapsing configuration errors (missing API key), auth failures (403), rate limits (429), and server errors (5xx) into a single "not found" return masks outages and misconfiguration as normal "no match" results. Distinguish retriable from non-retriable failures and surface infrastructure errors loudly
  - SSE or streaming handlers that call `end()`/`close()` on mid-stream errors without emitting an error event — the client observes a clean stream termination and treats partial content as complete. Emit a structured `event: error` block before closing so clients can detect and surface the failure. Conversely, named lifecycle events on streams (`error`, `done`, `complete`) must be MUTUALLY EXCLUSIVE — after emitting `error`, do NOT also emit `done`, or include explicit success/error info in the terminal frame. Otherwise clients parsing the last event treat failed runs as completed
  - Raw `fetch()` failures (TypeError "Failed to fetch", DNS errors, ECONNREFUSED) at API client boundaries must be translated to a consistent user-friendly message matching the project's established transport-error utility — otherwise users see cryptic browser errors and can't distinguish "server unreachable" from "request rejected." Preserve `AbortError` so callers can still distinguish cancellation from failure
@@ -98,13 +124,33 @@
  - Cross-module feature-flag detection drift — when multiple modules independently determine "is feature X active?" (HTTPS enabled, OAuth scope satisfied, dark mode, tier-gated capability) using divergent checks, behavior diverges and UX contradicts itself (UI advertises an `https://` URL while the server runs HTTP; one helper checks for `cert.pem`, another requires `cert.pem && key.pem`). Centralize the predicate in a single exported helper and have every caller import it; flag any module that re-derives the same boolean inline
  - Cross-module error classification — a low-level wrapper rethrows errors with a different `name`/`code`/`message` shape than the original (e.g., custom fetch wrapper aborts with `new Error('Request aborted')` while a downstream classifier checks `err.name === 'AbortError'`). The classifier matches nothing and the timeout/cancel branch never fires. Either preserve `name`/`code`/`cause` through the wrapper, OR have the classifier accept the union of shapes the wrapper can emit
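The "preserve `name`/`code`/`cause` through the wrapper" option can be sketched like this (function name and message format are illustrative):

```javascript
// Rethrows with extra context while keeping the fields downstream
// classifiers match on: name, code, and the original error as cause.
function wrapRequestError(err, url) {
  const wrapped = new Error(`request to ${url} failed: ${err.message}`, { cause: err });
  wrapped.name = err.name;               // e.g. 'AbortError' survives
  if (err.code) wrapped.code = err.code; // e.g. 'ECONNREFUSED'
  return wrapped;
}
```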
  - Compatibility-shim end-to-end plumbing — when a route bridges to an external API standard (A1111-style image-gen, OpenAI, S3-compatible, etc.) every documented response field must be backed by a real value chain through the provider, intermediate service, and response builder. Common bug: response shape is correct but a field like `seed`, `progress`, `eta`, `model`, or `usage.tokens` is hardcoded to a default (`0`, `null`, the request input) because nothing in the chain actually returns it. Trace each declared response field from where it's set in the route → service return shape → underlying provider/process output, and confirm the value flows end-to-end. "Always returns 0 / always undefined / always empty array" patterns signal incomplete plumbing
+ - Multi-provider operation enumeration — when a system supports multiple providers/backends/sources for the same capability (image-gen local + codex, search backends, storage tiers), every dispatcher operation that fans out (cancel, list active, attach SSE, status probe, "current job") must enumerate ALL providers — not short-circuit on the first match. Trace from the dispatcher entry point through every provider import and verify each provider's variant is invoked. Common bug: `cancel()` calls `local.cancel()` and returns; codex jobs survive. Same applies to `getActiveJob()` returning only the first truthy provider while a sibling has a hidden in-flight job
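The fan-out shape can be sketched with stand-in provider objects (real providers would be async and each `cancel` would be awaited):

```javascript
// Dispatcher-level cancel that enumerates EVERY provider instead of
// returning after the first match.
function cancelEverywhere(providers, jobId) {
  let cancelled = false;
  for (const provider of providers) {
    // no early return: a sibling provider may hold a hidden in-flight job
    if (provider.cancel(jobId)) cancelled = true;
  }
  return cancelled;
}
```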
+ - Job ownership before clearing shared singleton state — finalize / cleanup handlers in single-active-job providers (`activeJob`, `activeProcess`, `currentSession`) must check the cleared reference still belongs to the job that owns the handler — otherwise a stale finalize from an older run wipes state belonging to a newer in-flight job. Pattern: `if (activeJob === job) activeJob = null;`. The error path (spawn `'error'` before `'close'`) is the most common entry that bypasses the close handler's ownership check, so `finalizeError` is the most common offender
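The ownership guard in miniature (names are illustrative; the point is that a stale finalize must compare against the reference it is about to clear):

```javascript
// Single-active-job provider: finalize only clears the slot it still owns.
let activeJob = null;

function startJob(job) {
  activeJob = job;
  // both the close path and the error path must run through this guard
  return function finalize() {
    if (activeJob === job) activeJob = null; // ownership check
  };
}

const finalizeOld = startJob({ id: 'old' });
startJob({ id: 'new' }); // a newer run takes the slot
finalizeOld();           // stale finalize: must NOT wipe the new job
```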
+ - Disable-active-option fallback chain — when a UI "disable" or "remove" action removes the currently-active option from a configurable set (provider, model, default), the fallback must be the next CONFIGURED option, not a hardcoded default. Disabling X while it's the active backend should fall back to the next configured backend, not blindly to a sibling that may itself be unconfigured. Trace from the disable handler through fallback selection and verify each branch lands on a working configuration
+ - UI saved-state mirrors save-time normalization — when a save handler trims/lowercases/sorts/transforms input before persisting, the local `saved` snapshot used for dirty-checking must reflect the SAME normalization. Otherwise the dirty-check thinks the persisted state has uncommitted changes (trailing whitespace, case-mismatched id) until the next reload. Trace: input field state → normalization in save → patch sent to server → `saved` snapshot update → dirty-comparison
+ - Default selector validated against actual available set — when a default model/option/backend id comes from user-editable config (registry JSON, settings, env), and the consumer enumerates a filtered list (per-platform availability, `broken: true` flags, missing-file gating), the configured default may not appear in the consumer's list. Returning the raw configured value surfaces later as "Unknown X" / TypeError. The default getter must verify the configured id appears in the consumer's available set and fall back to the first-available with a clear warning log when invalid. Same applies to nested fields the downstream invocation depends on (`model.repo`, `entry.url`) — validate non-empty/well-formed at the boundary, never pass `undefined` into argv
+ - Input mode switching with stale "other-mode" value — when a UI offers multiple input modes for one purpose (gallery pick + file upload + URL paste, text + image + video) and they share a common destination field, switching modes must clear or override the OTHER modes' values. Otherwise the preview shows one source while the POST sends another, and the user submits something different from what they see. Trace: mode toggle → state mutations → preview render predicate → outgoing payload builder; verify exactly one mode owns the source value at any time, or make the preview always reflect what will actually be sent
+ - Cross-platform script-flag parity — when a JS service spawns platform-specific scripts (`generate.py` on macOS/Linux, `generate_win.py` on Windows; `setup.sh` vs `setup.ps1`), every flag/argument the dispatching service emits must be implemented in EVERY platform script. Partial coverage causes "unrecognized arguments" / silent ignores on the other OS. Trace from `buildArgs()` through the spawn call to each platform script's argparse/parameter definitions and verify coverage. If support is intentionally asymmetric, gate the flag emission by platform and update inline comments / STATUS lines to match
+ - Compound visual state propagation through child components — when a parent component supports a visual state (`dimmed`, `disabled`, `loading`, `selected`, `muted`) that should affect its entire visual presence, every visual sub-component (text labels, halos, edge strips, ground glow, accents, neon lines, hologram overlays, ring/border meshes) must inherit and apply the state. Threading the prop only into the primary mesh/material leaves surrounding decorations at full opacity, so non-matching items still read as "lit". Centralize via a single shared multiplier prop passed to every child
+ - Last-precedence wins for layered config blocks — when parsing config files (PM2 ecosystem, docker-compose, env-file layers) that allow multiple scope blocks for the same key (`env`, `env_development`, `env_production`), key extraction must respect explicit precedence (later/more-specific blocks override earlier/general) — not first-wins-then-skip. Otherwise a `PORT` defined in both `env` and `env_production` reserves whichever appears first in the file, which is rarely the runtime value
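A sketch of last-precedence extraction, assuming the parser has already ordered blocks from general to most specific (e.g., `env` before `env_production`):

```javascript
// Extracts a key from ordered scope blocks, letting later (more specific)
// blocks override earlier ones — not first-wins-then-skip.
function resolveKey(blocks, key) {
  let value;
  for (const block of blocks) {
    if (block && key in block) value = block[key]; // last match wins
  }
  return value;
}

// PM2-style layering: env is general, env_production is more specific
const port = resolveKey([{ PORT: 3000 }, { PORT: 8080 }], 'PORT');
```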
+ - Cancellation completeness — a cancel UI must short-circuit ALL continuation paths, not just close the visible stream: (a) the `runOperation()` Promise must settle (stored `reject` ref or AbortController), (b) any `.then(...)` chained from a non-cancelable upstream POST must check a per-run token before opening downstream resources (EventSource, WebSocket, secondary fetch) so a late HTTP response can't spawn a new SSE after cancellation, (c) every UI flag set during start (`extracting`, `running`, spinner) must reset regardless of which step was interrupted. Cancel + queue worker race: workers that mark the running item errored and immediately advance race the server's cancellation cleanup — the cancelled child takes SIGTERM→SIGKILL escalation seconds to actually exit, and the next job hits `409 BUSY`. Either await the child's actual exit before returning success from cancel, OR have the worker treat 409 BUSY as retry/backoff. User-initiated cancel signals propagating through subprocess close handlers must NOT surface as generic "error" — distinguish SIGTERM (deliberate cancel) from SIGKILL/non-zero exit, and emit a distinct `cancelled` event/status so the UI doesn't show a confusing failure for a normal cancel
+ - Client-side EventSource / WebSocket `onerror` handlers that only call `close()` without resetting render-state flags (`renderJobId`, `progress`, `isLoading`) leave the UI stuck in "in-progress" forever after a connection drop. Trace every long-lived stream attach → onerror handler → state reset; verify the handler clears every flag set when the stream opened and surfaces a user-visible toast/banner
+ - Page/component-level unsubscribe from shared event streams — when a page hook emits `unsubscribe` on a Socket.IO namespace / pub-sub channel / shared bus on cleanup, and the server's subscription model is per-socket (single Set, no ref-count), unmounting drops subscriptions other always-mounted consumers (Layout-level notifications, header counts, global toasts) still need. Trace every `subscribe` / `unsubscribe` emit and every `socket.off(event)` (without a handler) site to identify which channels are shared. Either avoid unsubscribing from shared namespaces in page-level hooks, OR introduce a ref-counted subscription manager. Same caveat applies to `socket.off(event)` without a handler on shared emitters — that removes ALL listeners across components
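One shape the ref-counted subscription manager could take; the class and the `subscribe`/`unsubscribe` event names are assumptions, and `socket` stands in for anything with an `emit(event, channel)` method:

```javascript
// Ref-counted wrapper around a per-socket subscription model, so a
// page-level unmount can't drop a channel an always-mounted consumer
// (Layout notifications, header counts) still needs.
class RefCountedSubscriptions {
  constructor(socket) {
    this.socket = socket;
    this.counts = new Map();
  }
  subscribe(channel) {
    const n = (this.counts.get(channel) ?? 0) + 1;
    this.counts.set(channel, n);
    if (n === 1) this.socket.emit('subscribe', channel); // first consumer only
  }
  unsubscribe(channel) {
    const n = (this.counts.get(channel) ?? 0) - 1;
    if (n > 0) {
      this.counts.set(channel, n); // siblings still subscribed: keep it open
    } else {
      this.counts.delete(channel);
      this.socket.emit('unsubscribe', channel); // last consumer left
    }
  }
}
```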
 
  **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
  - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown. `requestAnimationFrame` handles must be cancelled on unmount — pending frames invoke DOM operations or state updates on unmounted nodes. Blob/object URLs created via `URL.createObjectURL()` must be revoked on both item removal AND component unmount. ReadableStream / fetch readers consumed in a loop need `try/finally { reader.cancel() }` — exceptions in the loop otherwise leave the stream open; the `finally` block should catch its own errors so it doesn't mask the original exception
  - Large payloads (base64, binary buffers) stored in multiple state fields simultaneously (e.g., as both a `data` field and a `previewUrl` data URL) — each copy multiplies memory, with significant impact for large attachments. Derive one representation from the other on demand rather than storing both, and ensure cleanup on removal and unmount
- - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. Also check the inverse: preservation guards that prevent cleanup under overly broad conditions (e.g., preserving a branch "in case there were commits" when the ahead count is zero) — over-conservative guards leak resources over time. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry. Also check cleanup operations that perform implicit state mutations (auto-merge, auto-commit, cascade writes) as part of teardown — these can introduce unreviewed changes or silently modify shared state. Verify cleanup fails safely when a prerequisite step (e.g., saving dirty state) fails rather than proceeding with data loss
+ - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. Also check the inverse: preservation guards that prevent cleanup under overly broad conditions (e.g., preserving a branch "in case there were commits" when the ahead count is zero) — over-conservative guards leak resources over time. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry. Also check cleanup operations that perform implicit state mutations (auto-merge, auto-commit, cascade writes) as part of teardown — these can introduce unreviewed changes or silently modify shared state. Verify cleanup fails safely when a prerequisite step (e.g., saving dirty state) fails rather than proceeding with data loss. Also check the **valid-state requirement**: delete/cleanup handlers that gate on a getter (`getEntity(id)` → `unlink(...)`, `loadConfig(id)` → `rm -rf`, `parseManifest()` → archive) propagate the getter's failure modes to the cleanup path — if the entity is in a corrupted state (parse error, schema mismatch, transient FS error), the getter throws and the entity becomes UNDELETABLE through the API/UI, leaving users no recovery path. Catch known corruption error classes (`err.code === 'CORRUPTED_MANIFEST'`, `SyntaxError`, `SchemaValidationError`) and proceed with cleanup, OR use a lower-level existence check (`fs.existsSync` for the directory, lock-file presence, rowid lookup without joins) that doesn't require parsing — corrupted entities are exactly when users need to delete them most
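The valid-state requirement can be sketched as a delete handler that tolerates a corrupted getter; all names (`getEntity`, `exists`, `removeDir`, the `CORRUPTED_MANIFEST` code) are illustrative stand-ins:

```javascript
// Delete handler that survives a corrupted manifest: the getter's failure
// must not make the entity undeletable through the API/UI.
function deleteEntity(id, deps) {
  const { getEntity, exists, removeDir } = deps;
  let entity = null;
  try {
    entity = getEntity(id); // may throw on a corrupted manifest
  } catch (err) {
    const knownCorruption =
      err instanceof SyntaxError || err.code === 'CORRUPTED_MANIFEST';
    if (!knownCorruption) throw err; // transient/unknown errors still propagate
  }
  // fall back to a parse-free existence check and clean up anyway
  if (entity !== null || exists(id)) {
    removeDir(id);
    return true;
  }
  return false;
}
```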
  - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
+ - Time-based cache eviction (TTL, LRU pruning, post-completion cleanup) defeated by another code path that re-adds the evicted entry from a stale source. Pattern: tracker entries evict 60s after terminal status, but a periodic `cacheReload()` calls `restoreFromDisk()` and re-seeds the same ids from a snapshot that doesn't reflect terminal status. The TTL prunes "filled/cancelled" entries, the reload re-creates them as "open", and the eviction loop runs forever without making progress. Trace every code path that ADDS to the tracker; either gate the re-add on freshness (don't re-seed entries known to be terminal), use a tombstone set the eviction logic respects, or drive the reload from the same source-of-truth the eviction validated
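The tombstone option in miniature, with a stale reload simulated by a second `track` call (names are illustrative):

```javascript
// Eviction that can't be undone by a stale reload: evicted ids go into a
// tombstone set that every add path must respect.
const tracker = new Map();
const tombstones = new Set();

function track(id, status) {
  if (tombstones.has(id)) return; // stale re-seed from disk: ignore
  tracker.set(id, status);
}
function evictTerminal(id) {
  tracker.delete(id);
  tombstones.add(id); // remember this id reached terminal status
}

track('order-1', 'open');
evictTerminal('order-1'); // TTL fired after the order filled
track('order-1', 'open'); // periodic restoreFromDisk() tries to re-seed it
```

A production version would also prune old tombstones eventually, since the set otherwise grows without bound.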
107
146
  - Self-rescheduling callbacks (one-shot timers, deferred job handlers) where the next cycle is registered inside the callback body — an unhandled error before the re-registration call permanently stops the schedule. Wrap the callback body in try/finally with re-registration in the finally block, or register the next cycle before executing the current one
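The try/finally variant can be sketched with `schedule` as a stand-in for `setTimeout` or a job queue:

```javascript
// Re-registration lives in `finally`, so a throw in the body cannot
// permanently stop the schedule.
function runCycle(body, schedule) {
  try {
    body();
  } finally {
    schedule(() => runCycle(body, schedule)); // always re-arm the next cycle
  }
}

const pending = [];
try {
  runCycle(() => { throw new Error('job failed'); }, fn => pending.push(fn));
} catch (err) {
  // the error still surfaces to the caller / logger
}
```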
+ - EventEmitter / socket cleanup using inline anonymous handlers (`socket.on('foo', () => { ... })`) cannot be removed precisely later — assign the handler to a named const so cleanup can pass the same reference. On shared emitters/sockets/buses, NEVER call `.off(event)` without the specific handler — that removes ALL listeners on the emitter, breaking sibling subscribers
+ - State-reset methods (`clear()`, `reset()`, `dispose()`, mode-switch handlers) must release every related artifact, not just the data they track: timers (`clearTimeout`, `clearInterval`), refs (`pressingRef.current = false`), audio playback (`stopTone()`), pending press state, animation frames, and any boolean flags blocking re-entry. A `clear()` that resets only the dataset leaves dangling timers firing on stale state, audio playing forever, and stuck flags that prevent the next interaction
+ - DOM event handlers (`onMouseLeave`, `onBlur`, `onPointerLeave`, key handlers) registered via JSX close over render-time state — `if (pressing) endPress()` reads the value from the closure, not the live state, and may see stale `false` even when the user is still holding. Use refs (`pressingRef.current`) for liveness flags read inside DOM handlers, OR call cleanup methods unconditionally and have them no-op if not active
+ - Synchronous re-entrancy guards via React state — `if (saving) return; setSaving(true); await save()` is unsafe because `setState` is async and rapid repeated triggers (key-repeat Cmd/Ctrl+S, double-click, debounce overlap, two paths to the same handler) all observe `saving === false` and fire concurrent operations. Use a synchronous mutable ref (`savingRef.current = true` set BEFORE awaiting; reset in `finally`) for the gate, and reserve React state purely for rendering. Same pattern applies to any "in-flight" boolean used as a mutex
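The ref-based gate can be demonstrated outside React with a plain mutable object standing in for `useRef(false)`; note the second trigger is rejected synchronously, before any await runs:

```javascript
// Synchronous re-entrancy gate: the ref flips BEFORE the first await, so a
// key-repeat Cmd+S that fires twice in the same tick only saves once.
let saves = 0;
const savingRef = { current: false }; // stand-in for useRef(false)

async function save() {
  saves += 1;              // runs synchronously up to the first await
  await Promise.resolve(); // simulated network call
}

async function onSave() {
  if (savingRef.current) return; // synchronous check
  savingRef.current = true;      // set before any await
  try {
    await save();
  } finally {
    savingRef.current = false;   // always release the gate
  }
}

onSave();
onSave(); // rapid second trigger: observes the ref already set, bails out
```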
+ - Controlled-input rehydration from a parent prop — components that hold local state derived from `prop` and only sync via `useEffect(() => setLocal(prop.field), [prop.id])` MISS prop changes within the same parent identity (switching draft versions on the same work, switching tabs on the same record, server normalizing the field). Track every identity discriminator that should trigger rehydration (`[work.id, work.activeDraftVersionId]`), AND sync after server normalization — when a save returns a normalized value (trimmed, lowercased, server-canonical), update local state from the response. Also: a `<select value={prop.status}>` controlled by the prop with no local state can snap back to the old value on re-render before the PATCH resolves — keep a local state mirror and update it in `onChange`
+ - Sorted-list mutations via in-place patch (`items.map(x => x.id === id ? merged : x)`) leave the list visually sorted by the OLD order when the mutation changes the sort key (`updatedAt`, priority, score) — a recently-saved item stays mid-list instead of moving to the top. After mutating the sort key, re-sort the array (or splice the item to its new position). Same for filter predicates that may now exclude the item, and for selection sets that may need pruning when the mutation changes the entity's group/tab
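A sketch of patch-then-resort, assuming a list sorted newest-first by `updatedAt`:

```javascript
// Patch an item AND re-sort, because the patch may change the sort key
// and an in-place map keeps the old order.
function patchAndResort(items, id, patch) {
  return items
    .map(item => (item.id === id ? { ...item, ...patch } : item))
    .sort((a, b) => b.updatedAt - a.updatedAt); // newest first
}

const list = [
  { id: 'a', updatedAt: 300 },
  { id: 'b', updatedAt: 200 },
  { id: 'c', updatedAt: 100 },
];
const next = patchAndResort(list, 'c', { updatedAt: 400 }); // c was just saved
```

Because `.map()` returns a fresh array, the `.sort()` does not mutate the original `items`.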
+ - Reload-on-update fallback that replaces richer state with partial input — handlers like `if (opts.reload) { fresh = await refetch(); setState(fresh) } else { setState(updated) }` that fall back to `setState(updated)` ON REFETCH FAILURE blank fields the previous state had (full body, comments, derived metadata, joined relations) when the input lacks them. Either keep the previous state on refetch failure, retry, or surface the error explicitly — never replace richer state with the partial input that triggered the reload as a "recovery" path
 
  **Validation & consistency** _[applies when: code handles user input, schemas, or API contracts]_
  - API versioning: breaking changes to public endpoints without version bump or deprecation path
@@ -113,7 +159,12 @@
  - One-time migrations or initializations triggered on load/startup without a completion guard (version stamp, flag, or condition that excludes already-migrated data) — re-execute on every startup, causing unnecessary writes, duplicate processing, or state churn. Ensure the migration condition excludes records/configs that have already been migrated. Also covers setup/provisioning scripts invoked from hot paths (`npm start`, dev script, container entrypoint, app boot) that mutate credentials, privileges, or installed-package state (`ALTER USER`, password resets, brew installs, file ownership changes) — gate the heavy work behind a cheap readiness check so reruns are no-ops, OR refactor the script so each step is itself idempotent and detects already-applied state
  - Data migrations that silently change runtime behavior — converting records from one type/schedule/state to another must preserve the original execution semantics (frequency, enabled state, trigger behavior). Migrating an on-demand/manual entity to a scheduled one causes unexpected automated execution; migrating a fine-grained interval to a coarser default changes frequency. Unsupported source values (cron expressions, custom intervals) must be flagged or preserved, not silently dropped to defaults
  - Update/patch endpoints with explicit field allowlists (destructured picks, permitted-key arrays) — when the data model gains new configurable fields, the allowlist must be updated or the new fields are silently dropped on save. Trace from model definition to the update handler's field extraction to verify coverage
- - New endpoints/schemas should match validation patterns of existing similar endpoints field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation including path params, query params, and body fields on sibling endpoints (create/update/delete/activate). The same field should be accepted, rejected, and trimmed identically everywhere it appears, regardless of which endpoint consumes it. `z.string().min(1)` without `.trim()` accepts whitespace-only valuesprefer `z.string().trim().min(1)` for user-visible names. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
+ - Multiple code paths emitting what should be the SAME response/event/snapshot (e.g., a status payload constructed by a primary engine, a stopped-engine IPC handler, a route fallback, AND a websocket emitter) drift over time as fields are added — adding a new field to one path leaves the others returning a subset shape. Extract the shape into a single shared builder/serializer that every emission point imports, AND lock the contract with a snapshot/contract test (or a test against the builder) so a missing field surfaces in tests rather than as a UI regression. When a fix targets a drift bug, audit whether the fix REMOVES the drift (consolidates emitters) or PERPETUATES it (adds a 4th hand-built copy inline); the latter recreates the same class of bug at the next field addition. Trace every code path that emits the shape, count distinct construction sites, and recommend consolidation when N > 2
+ - Fallback or degraded-mode response shape must be a SUPERSET of fields the consumer reads, not a subset — when the consumer treats the response as a full replacement (`setStatus(res.status)` rather than merging), missing fields cause UI sections to disappear, render zeros, or default to wrong derived values (`apy`, `lifecycle`, `health.mode`, `market.lastPrice`). Don't rely on "the consumer will preserve the last good value via socket merging" without verifying the actual reload code path — a hard refresh during fallback may have no prior state to merge with. Inventory every field the consumer reads on the happy path, and verify the fallback emits a meaningful value (or explicit "stale/unknown" marker the consumer recognizes) for each
+ - Schema migrations that COPY a value from an old field to a new field WITHOUT clearing the old field can leave both populated for the entity's lifetime. Code that reads BOTH fields and emits one record per non-null instance produces duplicate output keyed by the same underlying identifier (the same orderId rendered twice with different categorizing types, the same record emitted as both `legacy` and `migrated`). Either clear the old field after migration completes, dedupe by stable id when emitting, or read only the migration-target field once a completion guard confirms migration ran. Trace from the migration step through every read path that enumerates the affected fields
+ - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation — including path params, query params, and body fields on sibling endpoints (create/update/delete/activate). The same field should be accepted, rejected, and trimmed identically everywhere it appears, regardless of which endpoint consumes it. **Schema-validator chain ordering:** in chain-style validators (Zod, Yup, Joi) checks run in chain order, so `.trim()` AFTER `.min(1)` (`z.string().min(1).max(200).trim()`) lets whitespace-only strings pass the non-empty guard and get trimmed to `''` afterward. Always order as `z.string().trim().min(1)` for human-readable names AND for nullable foreign-key IDs (`folderId`, `parentId`): `z.string().max(N).nullable()` without `.trim().min(1)` accepts `''` and persists it as the ID — service layers treat `''` as falsy (skipping existence checks) but still write it, breaking lookups and grouping invariants. Tighten to `z.string().trim().min(1).max(N).nullable()` (or preprocess `''` → `null` at the schema boundary). Handler-side trim normalization (`patch.title = patch.title.trim()`) must additionally reject post-trim emptiness — a PATCH with `" "` trims to `''` and persists a blank value even when create requires non-empty. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete — array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
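The chain-ordering pitfall can be reproduced without any schema library by modelling each chain as plain functions applied in order (these are models of Zod chains, not Zod itself):

```javascript
// models z.string().min(1).trim(): checks length BEFORE trimming
const minThenTrim = s => (s.length >= 1 ? s.trim() : null);

// models z.string().trim().min(1): trims FIRST, then checks length
const trimThenMin = s => {
  const t = s.trim();
  return t.length >= 1 ? t : null;
};

const bad = minThenTrim('   ');  // '   ' passes min(1), trims to '' — accepted
const good = trimThenMin('   '); // trimmed to '' first, fails min(1) — rejected
```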
+ - HTTP query parameters can arrive as arrays for repeated keys (`?id=a&id=b`) in most server frameworks (Express, Fastify, etc.) — a route that types `req.query.id` as a string and passes it through unchecked may treat the array as `undefined` (silently dropping the filter and returning UNFILTERED data, a quiet authorization/leakage bug) or pass the array shape downstream. Validate explicitly at the route boundary: accept only `typeof === 'string'`, take the first element when arrays are intended, or reject with 400. Apply via a Zod query schema covering every typed query field
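A minimal boundary normalizer for the string-or-array ambiguity (the function name is illustrative; a Zod query schema achieves the same thing declaratively):

```javascript
// Normalizes a query value that may arrive as a string, a string[] from
// repeated keys (?id=a&id=b), or undefined. Anything else maps to
// undefined so the route can reject with 400 instead of silently
// dropping the filter.
function firstQueryString(value) {
  if (typeof value === 'string') return value;
  if (Array.isArray(value) && typeof value[0] === 'string') return value[0];
  return undefined;
}
```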
+ - Foreign-key existence-check parity across write paths — when a create endpoint validates that a referenced ID exists (`createWork` checks `folderId` resolves to a folder; `createComment` checks `postId` exists) before persisting, every other write path that accepts the same field (update/PATCH, bulk import, sync, admin override, internal callers) must apply the same existence check. Partial coverage allows orphaned references through the unguarded path: the entity appears in no parent's listing, and cascade/group operations break. Trace every endpoint that writes the foreign-key field; the same applies to cross-entity invariants (parent must be of correct type, owner must be active, target must not be self)
  - Client-side input validation limits (max count, file size, string length, combined totals) must be consistent with — and ideally tighter than — server-side enforcement. When the client allows combinations the server rejects (e.g., 8 × 10MB files vs a 50MB JSON body limit), users hit confusing 400/413 errors. Trace all enforcement boundaries (UI, API schema, body parser, downstream service) and verify they form a coherent envelope
  - Sample config files, README examples, and documentation that reference config keys or structure must match what the implementation actually reads. Trace example keys against the config loader — stale examples teach operators to configure values the system ignores (or vice versa)
  - Subprocess invocations must inherit the same configuration source as the parent — if the parent reads from `.env`/config files but the child only sees `process.env`, exporting those values explicitly via the `env` option is required. Otherwise a probe uses customized credentials/ports while the underlying setup runs with built-in defaults, creating an "inconsistency loop" where the probe always fails and provisioning re-applies defaults that overwrite user customization. Pass the resolved values through; do not assume children re-parse the same config files
@@ -129,19 +180,22 @@
  - Arrays of IDs (widget ids, tag ids, member ids) persisted, returned by API, or rendered with `key={x}` without container-level deduplication — element-level validation (type check, length cap) is not enough. Duplicates cause React key collisions, inflated operation counts, and inconsistent UI updates. Enforce uniqueness via schema refinement (`zod.refine(arr => new Set(arr).size === arr.length)`), dedupe during ingestion, AND dedupe during read-path sanitization so hand-edited or legacy data can't reintroduce collisions. Apply the same logic to arrays of records keyed by id at the container level (first-wins, not just element-level shape checks)
  - Data loaded from files or persistent stores sanitized less strictly than the API accepts on write — hand-edited, migrated, or corrupted persisted state can introduce values (oversized names, non-kebab ids, duplicate entries, bypassed limits) the API would reject, producing oversized responses, unreachable records (client renders but API rejects on mutate), or invariant violations. Apply the same length caps, regex, uniqueness, and type guards in read-path sanitization as in request-schema validation; drop or truncate out-of-range values rather than passing them through
  - Authority / privilege flags (builtIn, protected, owner, role, immutable) read directly from persisted records without re-derivation — flipping the flag in a hand-edited file or a tampered sync source grants unintended privileges (e.g., changing `builtIn: true → false` lets "protected" records be deleted). Derive authority from code (a constant set of built-in ids, session identity, server-side role lookup), not from the persisted representation
- - Persisted-state filename/path fields (history JSON entries, settings.json paths, manifest entries) used as filesystem operands (`path.join(BASE, item.filename)` for `unlink`/`readFile`/`spawn` arg lists / ffmpeg-imagemagick concat manifests) without basename + path-resolve-prefix-check validation — corrupted or tampered persisted state can include `../` segments that escape the intended directory. Use a `safeUnder(base, candidate)` helper at every consumption site. For paths that further pass into exec arg strings or manifest files (e.g., ffmpeg concat-demuxer `file '...'` lines), basename validation is necessary but not sufficient — the consumer's parser has its own escaping rules: single quotes / newlines break ffmpeg manifests, backslashes on Windows are interpreted as escape characters in quoted strings, shell metacharacters break shell-quoted args. Either reject filenames containing parser-special characters at validation time, or apply consumer-specific escaping before writing the manifest/argv
+ - Persisted-state filename/path fields (history JSON entries, settings.json paths, manifest entries) used as filesystem operands (`path.join(BASE, item.filename)` for `unlink`/`readFile`/`spawn` arg lists / ffmpeg-imagemagick concat manifests) without basename + path-resolve-prefix-check validation — corrupted or tampered persisted state can include `../` segments that escape the intended directory. Use a `safeUnder(base, candidate)` helper at every consumption site. The same applies to non-filename id fields (UUIDs, slugs, slot keys) that get interpolated into a filename template (`lastframe-${item.id}.png`, `cache-${userId}.json`): a tampered id can include path separators or `..` and escape the intended directory just like a tampered filename can. Validate the id matches its expected shape (UUID `/^[a-f0-9-]{36}$/i`, slug `/^[a-z0-9-]+$/`) at every consumption site, OR run the assembled filename through `safeUnder()` before use. For paths that further pass into exec arg strings or manifest files (e.g., ffmpeg concat-demuxer `file '...'` lines), basename validation is necessary but not sufficient — the consumer's parser has its own escaping rules: single quotes / newlines break ffmpeg manifests, backslashes on Windows are interpreted as escape characters in quoted strings, shell metacharacters break shell-quoted args. Either reject filenames containing parser-special characters at validation time, or apply consumer-specific escaping before writing the manifest/argv
184
+ - Tilde-expansion that uses `path.resolve(homedir(), input.slice(1))` (or any equivalent that keeps the leading `/` from `~/foo`) silently discards the home directory because `path.resolve` restarts from the last absolute segment and drops everything before it. The result is `/foo` instead of `/Users/me/foo`, breaking auto-detect logic for files under the user's home (LM Studio caches, dotfiles, XDG dirs) and potentially leaking absolute root-relative paths into downstream tools. Strip `~/` (and handle bare `~`) before joining: `p === '~' ? homedir() : p.startsWith('~/') ? join(homedir(), p.slice(2)) : p`. Pin with a unit test that asserts the expansion result starts with `homedir()`
133
185
  - Allowlists gating user-provided identifiers must use the consumer's identifier namespace, not a sibling namespace. Common bug: an allowlist of import-module names (`cv2`, `PIL`) used to gate `pip install <name>` — pip's identifier space is package specs (`opencv-python`, `pillow`), so the allowlist permits installs of typosquatted/unintended packages. Same risk for command names vs aliases, OAuth scope strings vs role names, file extensions vs MIME types. Build the allowlist from the consumer's actual valid-input set, NOT from a related-but-different list, and include a unit test that asserts every allowlist entry is a valid input to the consumer
134
- - API responses returning server-internal absolute filesystem paths (`/Users/.../data/...`, `C:\app\data\...`) leak server layout, OS, and install locations to clients and couple them to filesystem structure. Return basenames or relative identifiers (`/data/loras/<filename>`) and resolve/validate server-side at consumption time
186
+ - API responses returning server-internal absolute filesystem paths (`/Users/.../data/...`, `C:\app\data\...`) leak server layout, OS, and install locations to clients and couple them to filesystem structure. Return basenames or relative identifiers (`/data/loras/<filename>`) and resolve/validate server-side at consumption time. **Error messages** are a frequent leak source: `ServerError`/custom error classes/thrown messages that interpolate filesystem paths (`new ServerError(\`Corrupted manifest at ${path}\`)`), connection strings, internal hostnames, env-var values, or stack frames are surfaced verbatim by default error handlers — keep the path/detail in the server log `context` field while the user-facing `message` stays path-free (`Corrupted manifest`, optionally with the entity ID). Audit every `throw new Error(\`... ${path/secret/host} ...\`)` and every error-response builder for infrastructure detail crossing the boundary
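One way to keep the detail server-side — the class shape, `context` field, and response builder here are assumptions, not a prescribed API:

```javascript
// Error type that separates the user-facing message from server-side context,
// so paths/hosts/secrets get logged but never cross the API boundary.
class ServerError extends Error {
  constructor(message, { status = 500, context = {} } = {}) {
    super(message);         // safe, path-free text
    this.status = status;
    this.context = context; // infrastructure detail: log it, never return it
  }
}

function buildErrorResponse(err, log = console.error) {
  log(err.message, err.context ?? {});  // full detail stays in the server log
  return { status: err.status ?? 500, body: { error: err.message } };
}
```

Call sites then throw `new ServerError('Corrupted manifest', { context: { path: manifestPath } })` and the path never reaches the client.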
135
187
  - Cross-module constants kept in sync by comment ("must stay in sync with X", copy-pasted regex, duplicated event names or size limits) — the comment is not enforcement and drift is a silent failure. Any constant shared across module boundaries (client↔server, route↔service, component↔component, producer↔consumer) must be a single exported value imported at both sides: event names, regex patterns, numeric limits, path segments, feature-flag keys, error codes
136
188
  - Generator/validator structural invariants — when a generator produces values with structural guarantees (sortability via fixed-width prefix, embedded checksum, encoded version), the validator regex (and any client-side mirror) must enforce the SAME shape. Broader regexes accept inputs the generator never emits, breaking invariants the rest of the system relies on (e.g., lexical sort == chronological sort breaks once a base36 timestamp grows by a digit). Verify generator + server validator + client mirror form a closed loop. Test fixtures should use IDs/payloads that match generator output, not contrived literals — fixtures with off-spec shapes hide regressions where the production format changes
137
189
  - Schemas accepting paired/range fields (`startDate`/`endDate`, `min`/`max`, `from`/`to`) must add a cross-field refinement (`zod .refine()`, JSON Schema `dependentSchemas`) enforcing the relationship (start ≤ end). Without it, the schema accepts inconsistent ranges that the implementation may silently swap, ignore, or interpret ambiguously. Define deterministic rules when only one bound is supplied
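With zod the fix is a one-liner (`.refine(v => v.start <= v.end, 'start must be <= end')`); a dependency-free sketch of the same cross-field rule, with a deterministic single-bound behavior (field names are illustrative):

```javascript
// Accept a range only when both bounds parse and start <= end; a single
// supplied bound deterministically means an open-ended range.
function validateRange({ startDate, endDate }) {
  const start = startDate != null ? Date.parse(startDate) : null;
  const end = endDate != null ? Date.parse(endDate) : null;
  if (startDate != null && Number.isNaN(start)) return { ok: false, error: 'invalid startDate' };
  if (endDate != null && Number.isNaN(end)) return { ok: false, error: 'invalid endDate' };
  if (start != null && end != null && start > end) {
    return { ok: false, error: 'startDate must be <= endDate' };
  }
  return { ok: true, range: { start, end } };
}
```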
190
+ - PATCH/update endpoints accepting an empty body (or only metadata fields like `expectedUpdatedAt`/`etag`) pass validation, bump `updatedAt` on write-through, and produce needless write churn + optimistic-concurrency conflicts. Add a `.refine()` requiring at least one mutating field, OR have the service no-op when the patch is empty
191
+ - Truthy-only guards on concurrency tokens / etags / version stamps (`if (expectedUpdatedAt && ...)`) skip the conflict check whenever the client sends an empty string, allowing overwrites of newer data. Use `!= null` (or explicit string validation that rejects empty/whitespace) so "provided but empty" surfaces as a 400, not a silent bypass
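Both guards in one sketch — the field names and the `expectedUpdatedAt` token are illustrative:

```javascript
const MUTATING_FIELDS = ['name', 'description', 'status'];

function checkPatch(body) {
  // `!= null` instead of truthiness: '' must surface as a 400, not a silent bypass
  if (body.expectedUpdatedAt != null && body.expectedUpdatedAt.trim() === '') {
    return { ok: false, status: 400, error: 'expectedUpdatedAt must be non-empty' };
  }
  // require at least one mutating field so empty patches don't churn writes
  if (!MUTATING_FIELDS.some((f) => body[f] !== undefined)) {
    return { ok: false, status: 400, error: 'patch must include at least one field' };
  }
  return { ok: true };
}
```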
138
192
  - Pure persistence/utility modules importing from orchestration/service modules just to access a constant pulls the entire downstream import graph as a transitive dependency — increases test cost, slows imports, and creates side effects on load. Move shared constants (enums, regex, size caps, valid-mode sets) into small dedicated modules so storage and utility code can be reused without dragging the service stack in
139
193
  - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in). Also verify call sites pass inputs in the format the called function actually accepts — framework constructors with non-obvious positional argument order, loaders with format-specific variants (content paths vs script paths, asset objects vs class references), and accessor APIs with distinct method-vs-property semantics. Fallback branches in multi-format dispatchers commonly use the wrong function for the input type
140
194
  - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible. Also check entity identity keys: if code looks up or matches entities using a computed key (e.g., `e.id || e.externalId`), all code paths that perform the same lookup must use the same key computation — one path using `e.id` while another uses `e.id || e.externalId` causes mismatches for entities missing the primary key
141
195
  - Entity type changes without invariant revalidation — when an entity has a discriminator field (type, kind, category) and the user changes it, all type-specific invariants must be enforced on the new type AND type-specific fields from the old type must be cleared or revalidated. A job changing from `shell` to `agent` without clearing `command`, or changing to `shell` without requiring `command`, leaves the entity in an invalid hybrid state that fails at runtime or resurfaces stale data
142
196
  - Invariant relationships between configuration flags (flag A implies flag B) not enforced across all layers — UI toggle handlers, API validation schemas, server default-application functions, and serialization/deserialization must all preserve the invariant. If any layer allows setting A=true with B=false (or vice versa), cascading defaults and toggle logic produce contradictory state. Trace the invariant through: UI state handlers, form submission, route validation, service defaults, and persistence round-trip
143
197
  - Operations scoped to a specific entity subtype that don't verify the entity's type discriminator before processing — an endpoint or function designed for one account/entity type that accepts any entity by ID can corrupt state or produce wrong results when called with the wrong type. Add an explicit type guard and return a structured error
144
- - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing. Watch for validation/sanitization functions that return `null` for invalid input when `null` also means "clear/delete" downstream — malformed input silently destroys existing data. Distinguish "invalid, reject the request" from "explicitly clear this field". Also applies to normalization (trailing slashes, case, whitespace): if one path normalizes a value before comparison but the write path stores it un-normalized, comparisons against the stored value produce incorrect results — normalize at write time or normalize both sides consistently
198
+ - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing. Watch for validation/sanitization functions that return `null` for invalid input when `null` also means "clear/delete" downstream — malformed input silently destroys existing data. Distinguish "invalid, reject the request" from "explicitly clear this field". Also applies to normalization (trailing slashes, case, whitespace): if one path normalizes a value before comparison but the write path stores it un-normalized, comparisons against the stored value produce incorrect results — normalize at write time or normalize both sides consistently. **Producer-side `null`/`0` as "unknown" sentinel** is a common consumer trap: a producer that emits `placedAt: null` when the timestamp isn't available, or `lastPrice: 0` when the price is unknown, gets misinterpreted by consumers who treat the value as real (computing `Date.now() - null` = epoch → "decades old" age display; multiplying quantity × 0 → P&L/APY rendered as zero). Either omit the field entirely (so consumers can guard with `!= null`), use an explicit "unknown" marker the consumer recognizes (`stale: true`, `priceAvailable: false`), or document and verify every consumer's null/zero handling matches the producer's semantic intent
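Consumer-side guards for the sentinel trap — `placedAt`/`lastPrice` are the hypothetical producer fields from the examples above:

```javascript
// Never compute with a timestamp the producer couldn't supply:
// null coerces to 0 in subtraction, i.e. the epoch, i.e. "decades old".
function formatAge(placedAt, now = Date.now()) {
  if (placedAt == null) return 'unknown';
  return `${Math.round((now - placedAt) / 1000)}s ago`;
}

// Treat 0 as "price unavailable" when that is the producer's semantic,
// instead of rendering the position's value as a real zero.
function positionValue(quantity, lastPrice) {
  if (lastPrice == null || lastPrice === 0) return null;
  return quantity * lastPrice;
}
```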
145
199
  - Validation functions that delegate to runtime-behavior computations (next schedule occurrence, URL reachability, resource resolution) — conflating "no result within search window" or "temporarily unavailable" with "invalid input" rejects valid configurations. Validate syntax and structure independently of runtime feasibility
146
200
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks; `NaN` flowing into subprocess args (`-p NaN`) or formatted strings produces opaque downstream failures. Clamp query params to safe lower bounds. For env-var parsing in particular, the input commonly contains stray whitespace or inline `# comment` text — use `Number.parseInt(String(value).trim(), 10)` and gate with `Number.isFinite(parsed)` before falling back to the default
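A sketch of the env-int parser described above (stripping the inline `#` comment is an extra convenience beyond the `trim()`):

```javascript
// Parse an env var as a base-10 int: drop inline comments, trim whitespace,
// and gate on Number.isFinite before falling back to the default.
function envInt(value, fallback) {
  const cleaned = String(value ?? '').split('#')[0].trim();
  const parsed = Number.parseInt(cleaned, 10);
  return Number.isFinite(parsed) ? parsed : fallback;
}
```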
147
201
  - Hand-rolled regex validators for well-known formats (IP addresses, email, URLs, dates, semver) that accept invalid inputs or reject valid ones — use platform/standard library parsers instead (e.g., `net.isIP()`, `URL` constructor, `semver.valid()`) which handle edge cases the regex misses
@@ -151,6 +205,7 @@
151
205
  **Concurrency & data integrity** _[applies when: code has shared state, database writes, or multi-step mutations]_
152
206
  - Shared mutable state accessed by concurrent requests without locking or atomic writes; multi-step read-modify-write cycles that can interleave — use conditional writes/optimistic concurrency (e.g., condition expressions, version checks) to close the gap between read and write; if the conditional write fails, surface a retryable error instead of letting it bubble as a 500
153
207
  - Read-only API paths that trigger lazy initialization or one-time migration with side effects (file writes, renames) — these become unprotected write paths when called concurrently. Guard with locks or hoist initialization to startup
208
+ - Cross-platform `fs.rename` "destination exists" handling that destroys the user's output on failure. POSIX `rename` atomically replaces the destination; Windows `fs.rename` rejects when the destination exists. The naive Windows fix (`unlink(dest); rename(tmp, dest)`) is dangerous: if `rename` fails (file lock, AV scan, transient permissions, low disk), the original is gone and only the temp file remains, which the caller's error path typically deletes too — net result is data loss. Use a backup-aside rollback instead: rename the original to `<dest>.bak.<uuid>`, then rename the temp into place, unlink the .bak on success; on failure, rename the .bak back to `<dest>`. Ensure `ENOENT` on the initial backup rename is treated as "no original existed" (set backupPath = null), not a fatal error. Same pattern applies to any "replace one file with another" sequence (config rewrites, atomic-write helpers, faststart remux, sidecar updates) — never `unlink` the production file before the replacement is verifiably in place
154
209
  - Multi-table writes without a transaction — FK violations or errors leave partial state
155
210
  - Writes that replace an entire composite attribute (array, map, JSON blob) when the field is populated by multiple sources — the write discards data from other sources. Use a separate attribute, merge with the existing value, or use list/set append operations
156
211
  - Functions with early returns for "no primary fields to update" that silently skip secondary operations (relationship updates, link writes)
@@ -184,7 +239,7 @@
184
239
 
185
240
  **Lazy initialization & module loading** _[applies when: code uses dynamic imports, lazy singletons, or bootstrap sequences]_
186
241
  - Cached state getters returning null before initialization — provide async initializer or ensure-style function
187
- - Module-level side effects (file reads, SDK init) without error handling — corrupted files crash the process on import
242
+ - Module-level side effects (file reads, SDK init) without error handling — corrupted files crash the process on import. Loaders for user-editable config files (registries, settings, model catalogs) invoked at import-time must wrap BOTH the read (`readFileSync` can throw on permissions, broken symlinks, transient I/O) AND the parse (`JSON.parse` throws on malformed edits) — uncaught exceptions during import abort server boot. Beyond IO/parse: parsed-but-malformed data (missing top-level keys, wrong types like `field: {}` when an array is expected) crashes downstream consumers. Normalize the parsed object: deep-merge with defaults, coerce array-typed fields with `Array.isArray(...)`, coerce object-typed fields with `isPlainObject`, and fall back to defaults with a clear log message when shapes don't match
188
243
  - File writes that assume the parent directory exists — on fresh installs or after directory cleanup, the write fails with ENOENT. Ensure the directory exists before writing (or create it on demand)
189
244
  - Bootstrap/resilience code that imports the dependencies it's meant to install — restructure so installation precedes resolution
190
245
  - Re-exporting from heavy modules defeats lazy loading — use lightweight shared modules
@@ -202,6 +257,10 @@
202
257
  - Detached child processes with piped stdio — parent exit causes SIGPIPE. Redirect to log files or use `'ignore'`
203
258
  - Subprocess output buffered in memory without size limits — a noisy or stuck child process can cause unbounded memory growth. Cap in-memory buffers and truncate or stream to disk for long-running commands
204
259
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
260
+ - Subprocess spawns of binaries with platform-or-distro-dependent names (`pwsh` vs Windows PowerShell `powershell.exe`, `python3` vs `python`, `gh` vs absent on minimal containers) must probe (`which`/`where`) and fall back to alternates rather than assuming the modern/preferred name is installed — many Windows boxes ship Windows PowerShell only, distro-stripped Linux containers may lack `python3`. Detect once at startup or per-call; emit a clear actionable error if no candidate is found
261
+ - Platform guards using inverse logic (`if (!IS_WIN && ...)`) when the real condition is "macOS only" run platform-specific binaries on Linux/BSD/Unix where they don't exist (`caffeinate` is macOS-only, `pmset` is macOS-only, `nvidia-smi` requires NVIDIA hardware). Use the positive predicate (`process.platform === 'darwin'`, `IS_MAC`) — every "not Windows" check is wrong on every other non-Windows OS
262
+ - Setup/install scripts that switch between multiple packages providing the same import path (renamed packages, fork swaps, deprecated→replacement) must explicitly uninstall the predecessor before installing the new one (`pip uninstall -y <old>` before `pip install <new>`). Otherwise both remain installed and Python's import resolution becomes ambiguous (or the wrong one wins on path order). Suppress errors from the uninstall step so reruns don't fail when the old package is already absent
263
+ - Case-sensitive string operations on filesystem paths break on case-insensitive filesystems (Windows always, macOS HFS+/APFS by default). Deriving a sibling binary path via case-sensitive `replace(/ffmpeg$/, 'ffprobe')` against a real filesystem path that may be `FFMPEG` (or carry an `.exe` suffix) produces an unchanged path — the caller spawns the wrong binary and `existsSync` confirms it. Use case-insensitive regex flags (`/foo$/i`) and/or derive sibling paths via `path.parse()` + `path.format()` so the result is structurally guaranteed
205
264
  - Naive whitespace splitting of command strings (`str.split(/\s+/)`) breaks quoted arguments — use a proper argv parser or explicitly disallow quoted/multi-word arguments when validating shell commands
206
265
  - Subprocess output parsed from a single stream (stdout or stderr) to detect conditions (conflicts, errors, specific states) — the information may appear in the other stream or vary by tool version/config. Check both stdout and stderr, and verify the exit code, to reliably detect the condition
207
266
  - Shell expansions (brace `{a,b}`, glob `*`, tilde `~`, variable `$VAR`) suppressed by quoting context — single quotes prevent all expansion, so patterns like `--include='*.{ts,js}'` pass the literal braces to the command instead of expanding. Use multiple flags, unquoted brace expansion (bash-only), or other command-specific syntax when expansion is required
@@ -218,14 +277,17 @@
218
277
  - Destructive actions (delete, reset, revoke) in the UI without a confirmation step — compare with how similar destructive operations elsewhere in the codebase handle confirmation
219
278
 
220
279
  **Streaming & real-time protocols** _[applies when: code implements SSE, WebSocket, ReadableStream, or event-driven APIs]_
221
- - Wire protocol parsers that assume a simplified subset of the spec (e.g., `\n`-only line endings when the spec allows `\r\n`) — test against boundary representations the spec permits, not just the happy path. Parsers must (a) handle the spec's full set of separators (e.g., both `\n\n` and `\r\n\r\n` for SSE; multiple `data:` lines joined with `\n`); (b) flush remaining buffered content on EOF — otherwise the last frame is dropped when upstream closes mid-frame; (c) wrap per-frame deserialization (`JSON.parse`) so a single malformed frame doesn't terminate the entire stream
280
+ - Wire protocol parsers that assume a simplified subset of the spec (e.g., `\n`-only line endings when the spec allows `\r\n`) — test against boundary representations the spec permits, not just the happy path. Parsers must (a) handle the spec's full set of separators (e.g., both `\n\n` and `\r\n\r\n` for SSE; multiple `data:` lines joined with `\n`); (b) flush remaining buffered content on EOF — otherwise the last frame is dropped when upstream closes mid-frame; (c) wrap per-frame deserialization (`JSON.parse`) so a single malformed frame doesn't terminate the entire stream. Per-token regexes that scan stream chunks miss matches when the producer splits a single line across chunks — accumulate chunks in a rolling buffer or use `readline` to parse complete lines before applying token regexes
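A rolling-buffer line scanner of the kind described (the API shape is illustrative):

```javascript
// Apply per-line logic only to complete lines, carrying the trailing
// partial line across chunks and flushing it at EOF.
function createLineScanner(onLine) {
  let buffer = '';
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split(/\r?\n/);
      buffer = lines.pop(); // last element is an incomplete line (or '')
      for (const line of lines) onLine(line);
    },
    flush() { // EOF: emit the remainder so the last frame isn't dropped
      if (buffer) onLine(buffer);
      buffer = '';
    },
  };
}
```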
281
+ - Hand-rolled parsers for source-code or config formats (env-block extraction, brace matching, key extraction) must handle the language's full token grammar: nested braces with depth counting (`\{([^}]*)\}` truncates at first `}`, missing nested object spreads / ternaries), ALL string delimiters including backtick template literals (not just single/double quotes), escape detection by counting consecutive backslashes (odd = escaped, even = not), and optional quoting on keys (`PORT: 3000`, `'PORT': 3000`, `"PORT": 3000`). Prefer the language's own AST parser when available; flag every regex that simplifies the source grammar
222
282
  - Stateful parsers (multipart, MIME, framed protocols) must verify they reached the terminal state on `req.on('end')` / EOF — calling `finish()` while still in a body/header state accepts truncated input as success, silently corrupting partially-written uploads. Track the terminal-state transition (e.g., `STATE_DONE` after the closing `--boundary--`) and return a 400 error otherwise (and clean up any partial files). Per-part state (`currentFileMimetype`, accumulated headers, decoder state) must be reset at the start of each new part — otherwise a part with no `Content-Type` inherits the previous part's mimetype
223
283
  - Refactoring a streaming parser to "buffer-then-process" (calling `readAllBytes()` / `Buffer.concat(chunks)` / `await req.text()` before parsing) defeats the streaming contract and re-introduces an OOM/DoS vector for large uploads — verify the new implementation still respects each caller's `maxSize`/body cap WHILE reading (stop collecting once bytes exceed the cap), or restore true streaming. Watch for header comments still claiming "streams" / "never buffers entire body in memory" after such refactors — they become a documentation lie
224
284
  - Library wrappers advertising a multer/express-style contract `(req, file, cb)` must pass the real `req` through to filters/hooks, and must `await` async callbacks before continuing — treating the `cb` as synchronous when the contract permits async breaks any caller that supplied an async filter. Errors thrown from middleware/parser modules without `err.status` set are normalized to HTTP 500 — set `err.status = 400` (or 413 for size limits) and a stable `err.code` (`PAYLOAD_TOO_LARGE`, `INVALID_MULTIPART`, `VALIDATION_ERROR`) at the throw site, OR throw a typed `ServerError`/`ApiError`
225
285
  - Event handlers or effects that fire on every high-frequency event (streaming deltas, scroll, resize, keydown) without throttle, debounce, or `requestAnimationFrame` batching — causes jank and excessive re-renders
286
+ - Global event-listener `useEffect`s (`addEventListener('keydown'/'mousemove'/'resize')`) whose dependency array includes a rapidly-mutating object or callback — the listener detaches and re-attaches on every change, churning DOM and racing in-flight events. Cleanup also runs on every re-attach, so any timers/refs the cleanup clears (`flushTimerRef`, audio nodes) get reset mid-interaction. Stabilize: keep changing values in a `ref` read inside a stable handler, and depend only on truly listener-relevant inputs (`enabled`, route key)
287
+ - Background polling (`setInterval(asyncFn, N)`, recurring `fetch` chains) must (a) suppress per-iteration error toasts via the codebase's `silent` flag — transient failures otherwise spawn toast storms; (b) guard against overlapping in-flight requests with a ref boolean or convert to a `setTimeout`-after-resolve loop. `setInterval(asyncFn, N)` produces overlapping requests when N < response time, leading to out-of-order state updates. Background polling whose data appears in long-lived UI surfaces (HUD counts, headers) should subscribe to live event streams (sockets, pub/sub) rather than re-polling on a fixed cadence — one-shot fetches at mount go stale immediately
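The `setTimeout`-after-resolve loop, sketched — `fetchStatus` and its `silent` option are assumptions about the host codebase:

```javascript
// Poll with no possibility of overlap: the next cycle is scheduled only
// after the previous request has resolved (or failed).
function startPolling(fetchStatus, intervalMs) {
  let stopped = false;
  (async function loop() {
    while (!stopped) {
      try {
        await fetchStatus({ silent: true }); // no per-iteration error toasts
      } catch {
        // transient failure: swallow and retry on the next cycle
      }
      await new Promise((r) => setTimeout(r, intervalMs)); // after resolve
    }
  })();
  return () => { stopped = true; }; // caller invokes this on unmount
}
```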
226
288
 
227
289
  **Accessibility** _[applies when: code modifies UI components or interactive elements]_
228
- - Interactive elements missing accessible names, roles, or ARIA states — including labels lost or replaced with non-descriptive placeholders in conditional/compact rendering modes. Disabled interactions should have `aria-disabled`. Verify the ARIA attribute set matches established patterns used elsewhere in the codebase for the same widget type (disclosure, menu, dialog) — inconsistency degrades assistive-tech support and creates confusion for keyboard users
290
+ - Interactive elements missing accessible names, roles, or ARIA states — including labels lost or replaced with non-descriptive placeholders in conditional/compact rendering modes. Disabled interactions should have `aria-disabled`. Verify the ARIA attribute set matches established patterns used elsewhere in the codebase for the same widget type (disclosure, menu, dialog) — inconsistency degrades assistive-tech support and creates confusion for keyboard users. Icon-only buttons relying on `title` for their accessible name fail across screen readers — `title` is announced inconsistently or not at all. Add an explicit `aria-label` and mark the icon `aria-hidden="true"` since it's decorative
229
291
  - ARIA roles applied without the keyboard interactions the role contract requires — `role="menu"`/`menuitem*` expects roving focus, arrow-key navigation, Escape scoped to the menu, and focus management on open/close; `role="listbox"`/`option` expects Home/End/typeahead; `role="dialog"` expects focus trap + return focus. Shipping the role without the behavior strands screen-reader and keyboard users on a control that looks interactive to AT but doesn't respond. Either implement the full interaction pattern or drop to a simpler one (native `<button>` + disclosure) that doesn't promise more than the code delivers
230
292
  - Custom toggle/switch UI built from non-semantic elements instead of native inputs
231
293
  - Overlay or absolutely-positioned layers with broad `pointer-events-auto` that intercept clicks/hover intended for elements beneath — use `pointer-events-none` on decorative overlays and enable events only on small interactive affordances. Conversely, `pointer-events-none` on a parent kills hover/click handlers on children — verify both directions when layering positioned elements
@@ -267,6 +329,8 @@
267
329
  **Configuration & hardcoding**
268
330
  - Hardcoded values when a config field or env var already exists; dead config fields nothing consumes; unused function parameters creating false API contracts; resource names (table names, queue names, bucket names) hardcoded without accounting for environment prefixes — lookups on response objects using the wrong key silently return undefined
269
331
  - Duplicated config/constants/utilities/helper functions across modules — extract to shared module to prevent drift. Watch for behavioral inconsistencies between copies (e.g., one returns `'unknown'` for null while another returns `'never'`)
332
+ - Audit, lint, anomaly, and analytics tools that re-derive a domain concept (current cycle, active session, completed period, "is this the latest run") using a SIMPLIFIED heuristic must match the authoritative production logic — or they produce false positives/negatives that mislead operators. When production keeps a cycle active until `sellRatio < 1.0`, the audit can't use a 50% threshold. When production tracks a persisted `currentCycleId`, the audit can't fall back to "newest activity wins". When production suppresses a finding based on live exchange state, the audit can't suppress based on a possibly-stale persisted state. Either invoke the authoritative function/predicate directly, OR snapshot-test the audit's derivation against fixtures the production logic also validates. The same applies to suppression rules — gating "ignore this anomaly" on a weak/stale signal hides real issues exactly when the system needs the audit most
333
+ - Architectural pattern divergence — every new module/feature addresses a class of concern (data storage & persistence, content/template management, API endpoints & validation, authentication/authorization, error handling, structured logging, configuration loading, HTTP/transport clients, caching, background work, state management, testing infrastructure, inter-service communication, naming/path conventions), and the project usually has an established pattern for that class — a registry directory, a repository module, a middleware chain, a typed error hierarchy, a schema validator, a logger, a transport wrapper, a queue convention, a store. Before adding a parallel implementation (a new `*Prompts.js`/`*Templates.js`/`*Copy.js` hardcoding registry-managed content, a new endpoint hand-rolling validation/error shape, a new direct-fs read when peers use a repository, an inline auth check, a `console.log` when peers use a structured logger, a raw `fetch()` when peers use a transport wrapper, an inline background task when peers use a worker, a hand-rolled fixture when peers use a builder), inventory how peer features in the same service handle the same concern: sibling directories (`data/`, `templates/`, `repositories/`, `middleware/`, `clients/`, `stores/`, `queues/`, `migrations/`), helper/loader modules (`load*`, `get*`, `resolve*`, `*Registry`, `*Repository`, `*Service`, `*Client`), and centralized utilities (auth gates, logger, error classes, transport wrapper, config loader, retry helper). Compare the new code's pattern to the dominant peer pattern. If they diverge, either (i) refactor onto the established pattern, OR (ii) extend the established pattern to cover the structural gap that pushed the author away from it (missing variable type, scope, error category, response field, retry policy) — the established artifact, not the bypass, should be the durable surface. Parallel implementations break invariants the dominant pattern enforces (variable substitution, error classification, auth-scope coverage, log correlation, retry/backoff, transactional boundaries), fragment the audit surface, and force future readers to learn N implementations of the same concept
270
334
  - CI pipelines installing without lockfile pinning or version constraints — non-deterministic builds
271
335
  - Production code paths with no structured logging at entry/exit points
272
336
  - Error logs missing reproduction context (request ID, input parameters)
@@ -280,9 +344,16 @@
280
344
 
281
345
  **Test coverage**
282
346
  - New logic/schemas/services without corresponding tests when similar existing code has tests
347
+ - Refactors that consolidate logic into a SINGLE helper that becomes the source of truth for a critical shape (`buildXPayload`, `serializeY`, `formatZ`) — this helper deserves explicit unit-test coverage on its output fields/shape, because its bugs propagate to every consumer (engine, IPC, route fallback, websocket emitter) without a localized symptom. Consolidation without tests blinds the team to ALL downstream regressions and recreates the drift bug the consolidation was meant to prevent. After extracting a shared builder, add field-by-field assertions on the helper's output and at least one test per call site that exercises end-to-end emission
283
348
  - New error paths untestable because services throw generic errors instead of typed ones
284
349
  - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Includes tests that assert by inspecting function source code (string-matching implementation details) rather than calling the function and checking behavior — they break on harmless refactors while missing actual behavioral changes. Also tests that mutate global state at import time (module registries, sys.modules) without fixture-scoped cleanup — causes ordering-dependent failures across the test session
285
- - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks
350
+ - Tests depending on real wall-clock time or external dependencies when testing logic — use fake timers and mocks. Tests that shell out to system binaries (`git`, `gh`, `python`, `tailscale`) introduce environment-dependent flakiness; mock the subprocess interface (`child_process.spawn`) instead of relying on the binary being installed
351
+ - Tests gated by `if (process.platform !== 'darwin') return` (or POSIX-only filesystem tricks like `chmod`-based permission failures) silently skip on CI runners with different platforms — the new code becomes effectively untested. Factor platform-specific behavior into pure functions, mock `fs/promises` directly to throw deterministically, or run multi-platform CI. `vi.spyOn(process, 'platform', 'get')` is brittle because `process.platform` is a value property — use `Object.defineProperty(process, 'platform', { value: '<os>', configurable: true })` and restore the original descriptor in cleanup
352
+ - Tests that allocate temp directories (`mkdtempSync`, `mkdir`), spawn long-lived child processes, or write artifacts must clean up in `afterEach`/`finally` (e.g., `rmSync(dir, { recursive: true, force: true })`) — without cleanup, the OS temp dir accumulates over many test runs; concurrent test orderings can collide on shared paths
353
+ - Tests that mutate global state inside the test body (`vi.useFakeTimers()`/`jest.useFakeTimers()`, monkey-patches, `Object.defineProperty(process, 'platform', ...)`, env-var overrides, `mock.module()` setup, frozen `Date.now`, intercepted `console.log`) and only restore at the END of the happy path leak the mutation into the next test when an assertion throws midway — a flaky cascade where one failure causes unrelated tests to misbehave. Restore in a `try/finally` block inside the test body, OR move setup/teardown into `beforeEach`/`afterEach` for the describe block so the framework guarantees cleanup regardless of assertion outcome
354
+ - Tests whose name or description claims a behavior they don't actually assert (`'forwards lastImageFile'` that only checks `prompt` and `mode`) lie about the contract — the test passes even when the named behavior regresses. Either rename the test to match what's asserted or add the missing assertion
355
+ - Tests asserting a specific validation rejection (negative number, oversized string, invalid format) must provide ALL OTHER required fields in valid form — otherwise the 400 the test sees comes from a different validation path (UUID failure, missing required field) and the intended rule is never exercised. Use a valid fixture for unrelated fields so the rejection is attributable to the field under test
356
+ - Path assertions in tests using forward-slash literal substrings (`expect(p).toContain('/some/sub/path')`) fail on Windows where `path.join` produces backslashes. Use `path.join()` in expectations, assert the suffix via `endsWith(join(...))`, or use a separator-agnostic regex — otherwise the test is silently macOS/Linux-only despite running on a Windows CI matrix
286
357
  - Missing tests for trust-boundary enforcement — submit tampered values, verify server ignores them
287
358
  - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
288
359
  - Test mock state leaking between tests — mock setup APIs that configure return values often persist across tests even after clearing call history, because "clear" resets invocation counts but not configured behavior (use "reset" variants that restore original implementations). Conversely, per-call sequential mock responses couple tests to internal call count — prefer stable return values for behavior tests, sequential mocks only when verifying call order