slash-do 2.10.0 → 2.12.0

package/install.sh CHANGED
@@ -46,8 +46,8 @@ banner() {
  }
 
  COMMANDS=(
- better better-swift fpr goals help omd
- pr push release replan review rpr update
+ better better-swift depfree fpr goals help omd
+ pr pr-better push release replan review rpr scan update
  )
 
 
@@ -22,12 +22,24 @@
  - Functions that index into arrays without guarding empty arrays; aggregate operations (`every`, `some`, `reduce`) on potentially-empty collections returning vacuously true/default values that mask misconfiguration or missing data; state/variables declared but never updated or only partially wired up
  - Parallel arrays or tuples coupled by index position (e.g., a names array, a promises array, and a destructuring assignment that must stay aligned) — insertion or reordering in one silently misaligns all others. Use objects/maps keyed by a stable identifier instead
  - Shared mutable references — module-level defaults passed by reference mutate across calls (use `structuredClone()`/spread); `useCallback`/`useMemo` referencing a later `const` (temporal dead zone); object spread followed by unconditional assignment that clobbers spread values
+ - UI-framework state invariants (uniqueness, monotonicity, cap/floor) checked against the render-time value before calling the updater — rapid events or concurrent updates then violate the invariant. Move the check inside the functional updater (e.g., `setX(prev => prev.includes(id) ? prev : [...prev, id])`) so it runs against the latest state
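A minimal sketch of the functional-updater fix, assuming a React-style `setSelected` setter (names are illustrative, not from the package):

```javascript
// Pure updater: the uniqueness check runs against whatever state the
// framework hands the updater, never against a stale render-time snapshot.
const addUnique = (prev, id) => (prev.includes(id) ? prev : [...prev, id]);

// Race-prone: `selected` is the render-time value
//   if (!selected.includes(id)) setSelected([...selected, id]);
// Safe: the check moves inside the functional updater
//   setSelected((prev) => addUnique(prev, id));
```

Because `addUnique` is pure, the invariant holds even when rapid events batch several updates.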
+ - Reactive effects / observers that depend on a value they themselves write — the write retriggers the effect, producing an infinite loop or network storm. Split into separate effects (one guarded by a "not yet initialized" condition, one that refreshes on an explicit trigger), or drop the self-written value from the dependency array and update via a functional setter
  - Functions with >10 branches or >15 cyclomatic complexity — refactor into smaller units
+ - String accumulation via `+=` inside high-frequency loops (streaming frames, chunked I/O, per-event handlers) becomes O(n²) for long outputs and triggers React re-renders on growing payloads. Collect chunks into an array and `join('')` once at the end while still emitting per-chunk events to consumers
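The chunk-array pattern can be sketched as a small collector (the shape and names are illustrative):

```javascript
// Collect streamed chunks in an array and join once at the end, instead
// of `out += chunk` in a hot loop, which is O(n²) for long outputs.
function makeCollector(onChunk = () => {}) {
  const chunks = [];
  return {
    push(chunk) {
      chunks.push(chunk); // amortized O(1)
      onChunk(chunk);     // still emit per-chunk events to consumers
    },
    result() {
      return chunks.join(''); // single O(n) concatenation
    },
  };
}
```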
+ - Server-side string formatting (`toLocaleString`, `toLocaleDateString`, currency/number formatters) that depends on locale/timezone defaults produces non-deterministic outputs across deployments. For data that flows into prompts, logs, persisted records, or cross-system messages, format with explicit `Intl.DateTimeFormat({ timeZone: ... })` or ISO strings; reserve locale-aware formatting for user-visible UI layers where the user's locale is the explicit input
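A sketch of deterministic server-side formatting (the date value is illustrative):

```javascript
const when = new Date('2024-03-01T17:30:00Z');

// Non-deterministic across deployments (host locale/timezone leak in):
//   when.toLocaleString()

// Deterministic: ISO string, always UTC
const iso = when.toISOString(); // '2024-03-01T17:30:00.000Z'

// Deterministic: locale AND timezone are explicit inputs
const fmt = new Intl.DateTimeFormat('en-US', {
  timeZone: 'UTC',
  dateStyle: 'medium',
  timeStyle: 'short',
});
const stamped = fmt.format(when);
```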
+ - Required-at-use-time config values (model name, API key, endpoint URL, default selection) that may be null/undefined in the source data must be validated at the boundary before invoking the downstream API or initialization. Otherwise the upstream API responds with an opaque error far from the user's actual intent. Emit a clear, actionable error (or fail fast at startup) when a required value is missing, identifying the specific field
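A minimal boundary check along these lines (the config shape and field names are illustrative):

```javascript
// Fail fast with an actionable error naming the missing field, instead of
// letting the downstream API reply with an opaque error far from the cause.
function requireConfig(config, keys) {
  for (const key of keys) {
    const value = config[key];
    if (value === null || value === undefined || value === '') {
      throw new Error(`Missing required config value "${key}": set it before starting the client`);
    }
  }
  return config;
}
```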
+ - Child process `spawn()` calls without an `error` event handler — when the binary is missing or unexecutable, Node emits an `'error'` event and never `'close'`. Promise wrappers listening only for `'close'` hang forever; bare spawn calls with no listener crash the parent via uncaught exception. Always pair `proc.on('error', ...)` with `'close'`. SIGKILL escalation guards must check liveness via `proc.exitCode == null` (or a `closed` flag set in the close handler), not `proc.killed` — `.killed` becomes `true` immediately after `kill('SIGTERM')` is called, so guards using `if (!proc.killed)` never fire and hung children survive indefinitely. Single-process tracking ("BUSY guard", `activeProcess` global) must hold the reference until the `'close'` event fires, not until `kill()` is sent — clearing at SIGTERM opens a race window where a new job starts while the previous child is still alive
+ - `spawn`/`exec` env objects: setting a key to `undefined` may coerce to the literal string `"undefined"` instead of unsetting the variable — build the env, then `delete env.PYTHONPATH` (or set to `''` if you explicitly want it cleared)
+ - Caches that store negative/error results (`null`, "not found", probe failure) without a TTL or invalidation hook — when the missing dependency is installed mid-runtime, the cache reports "still missing" until process restart. Cache only successful lookups, OR use a short TTL for negatives, OR re-probe when the cached value is the negative sentinel
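One way to sketch the TTL-for-negatives option (the probe and clock are injected for illustration):

```javascript
// Cache successes indefinitely; give negative results a short TTL so a
// dependency installed mid-runtime is re-probed instead of reported
// "still missing" until restart.
function makeProbeCache(probe, negativeTtlMs = 5_000, now = Date.now) {
  const cache = new Map(); // key -> { value, expires }
  return (key) => {
    const hit = cache.get(key);
    if (hit && (hit.expires === null || hit.expires > now())) return hit.value;
    const value = probe(key);
    cache.set(key, {
      value,
      expires: value === null ? now() + negativeTtlMs : null, // null = forever
    });
    return value;
  };
}
```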
+ - Late-connecting clients to long-running async jobs (SSE, WebSocket subscribe-by-id) receive nothing if they connect after the terminal `complete`/`error` broadcast — the server emitted once and moved on. Persist the most-recent (or terminal) payload on the job and emit it immediately on attach, OR document that subscribers must connect before kicking off the job and update any "late connectors will get the final state" comments accordingly
+ - Server returning an empty success payload (`200` with `{ images: [] }`, `{ items: null }`, etc.) when an awaited operation succeeded but the artifact fetch failed — clients treat empty as "no work to show" and never surface the underlying error. After awaiting completion, a missing/unreadable artifact is an internal error: return non-2xx with a structured error, never an empty 200
+ - HTML `<button>` elements without an explicit `type="button"` attribute default to `type="submit"`. When the component is rendered (or could be rendered) inside a `<form>` ancestor, clicks trigger unintended form submission. Set `type="button"` on every non-submit button (close, cancel, expand, menu trigger)
 
  **API & URL safety**
  - User-supplied or system-generated values interpolated into URL paths, shell commands, file paths, subprocess arguments, or dynamically evaluated code (eval, CDP evaluate, new Function, template strings executed in browser/page context) without encoding/escaping — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries, and `JSON.stringify()` for values embedded in evaluated code strings. Generated identifiers used as URL path segments must be safe for your router/storage (no `/`, `?`, `#`; consider allowlisting characters and/or applying `encodeURIComponent()`). Identifiers derived from human-readable names (slugs) used for namespaced resources (git branches, directories) need a unique suffix (ID, hash) to prevent collisions between entities with the same or similar names
- - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
- - Request schema fields for large string/binary payloads (base64, file content, free text) without per-field size limits a total body-size limit alone doesn't prevent individual oversized fields from consuming excessive memory or exceeding downstream service limits. Add per-field `max(N)` constraints with clear error messages
+ - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`). When a route validates body/query fields, the corresponding path parameter must be validated with the same schema — skipping param validation on sibling endpoints (e.g., `PUT /:id` validates body.id but `DELETE /:id` lets the raw param fall through) causes inconsistent error classes (400 vs 404 vs 500) for the same invalid input
+ - Character-class regex validators (e.g., `^[a-z0-9-]+$`) that claim to enforce a structured format (slug, kebab-case, reverse-DNS, semver) still accept leading/trailing separators (`-foo`, `foo-`), repeated separators (`a--b`), and empty segments. Require boundary characters (`^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$`) or use a dedicated parser when the claim is a structured format rather than a character set
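The tiers can be compared directly; the segment-based variant below is an added sketch that also rejects repeated separators, which the boundary-anchored form alone does not:

```javascript
// Character set only: accepts '-foo', 'foo-', and 'a--b'
const charClass = /^[a-z0-9-]+$/;
// Boundary-anchored: rejects leading/trailing '-', still allows 'a--b'
const boundaries = /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/;
// Segment-based: rejects leading/trailing AND repeated separators
const segments = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
```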
+ - Request schema fields for large string/binary payloads (base64, file content, free text) without per-field size limits — a total body-size limit alone doesn't prevent individual oversized fields from consuming excessive memory or exceeding downstream service limits. Add per-field `max(N)` constraints with clear error messages. The same cap applies to OUTBOUND payloads stored in persisted records or sent over streaming protocols (snippets, previews, cached excerpts) — capture-time size enforcement prevents on-disk growth and SSE/WebSocket payload bloat that pure display-time truncation cannot
  - Parameterized/wildcard routes registered before specific named routes — the generic route captures requests meant for the specific endpoint (e.g., `/:id` registered before `/drafts` matches `/drafts` as `id="drafts"`). Verify route registration order or use path prefixes to disambiguate
  - Stored or external URLs rendered as clickable links (`href`, `src`, `window.open`) without protocol validation — `javascript:`, `data:`, and `vbscript:` URLs execute in the user's browser. Allowlist `http:`/`https:` (and `mailto:` if needed) before rendering; for all other schemes, render as plain text or strip the value
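A sketch of the protocol allowlist (the allowed set is illustrative; drop `mailto:` if it is not needed):

```javascript
const SAFE_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Returns a safe href, or null to signal "render as plain text".
function safeHref(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  return SAFE_PROTOCOLS.has(url.protocol) ? url.href : null;
}
```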
  - Server-side HTTP requests using user-configurable or externally-stored URLs without protocol allowlisting (http/https only) and host/network restrictions — the server becomes an SSRF proxy for reaching internal network services, cloud metadata endpoints, or localhost-bound APIs. Validate scheme and restrict to expected hosts or external-only ranges before any server-side fetch. Also check redirect handling: auto-following redirects (`redirect: 'follow'`) bypasses initial host validation when a public URL redirects to an internal IP. Disable auto-follow and revalidate each hop, or resolve DNS and block private/loopback/link-local ranges before connecting — public hostnames can resolve to internal IPs via DNS rebinding
@@ -35,7 +47,7 @@
 
  **Trust boundaries & data exposure**
  - API responses returning full objects with sensitive fields — destructure and omit across ALL response paths (GET, PUT, POST, error, socket); comments/docs claiming data isn't exposed while the code path does expose it
- - Server trusting client-provided computed/derived values (scores, totals, correctness flags, file metadata like MIME type and size) when the server can recompute or verify them — strip and recompute server-side; for file uploads, validate content type via magic bytes and size via actual buffer length rather than trusting client-supplied headers
+ - Server trusting client-provided computed/derived values (scores, totals, correctness flags, file metadata like MIME type and size) when the server can recompute or verify them — strip and recompute server-side; for file uploads, validate content type via magic bytes and size via actual buffer length rather than trusting client-supplied headers. The same principle applies to trust in persisted state: flags read from flat-file/JSON/DB records that control authorization or deletion protection (`builtIn`, `protected`, `role`, `owner`) must be derived from a trusted source (code constants, session identity) on every read — otherwise hand-editing the file or tampered sync can flip the flag and bypass the protection
  - New endpoints mounted under restricted paths (admin, internal) missing authorization verification — compare with sibling endpoints in the same route group to ensure the same access gate (role check, scope validation) is applied consistently. When new capabilities require additional OAuth scopes or API permissions, verify the scope-upgrade check covers all required scopes — a check that only tests for one scope will miss newly added scopes, causing downstream API calls to fail with insufficient permissions
  - User-controlled objects merged via `Object.assign`/spread without sanitizing keys — `__proto__`, `constructor`, and `prototype` keys enable prototype pollution. Use `Object.create(null)` for the target, whitelist allowed keys, and use `hasOwnProperty` (not `in`) to check membership. Also verify the merge can't override reserved/internal fields the system depends on
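A whitelist merge along these lines (the key list and object shapes are illustrative):

```javascript
const FORBIDDEN = new Set(['__proto__', 'constructor', 'prototype']);

// Null-prototype target + key whitelist + own-property check.
function safeMerge(allowedKeys, ...sources) {
  const target = Object.create(null); // no prototype to pollute
  for (const src of sources) {
    for (const key of allowedKeys) {
      if (FORBIDDEN.has(key)) continue;
      if (Object.prototype.hasOwnProperty.call(src, key)) {
        target[key] = src[key];
      }
    }
  }
  return target;
}
```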
  - Push events (WebSocket, SSE, pub/sub) emitted without scoping to the originating user or session — sensitive payloads (user content, tokens, progress data, images) leak to all connected clients in multi-user environments. Scope events to the requesting session via room/channel isolation or include a correlation ID the client provides at request time; verify consumers filter events by correlation ID before updating UI state
@@ -43,7 +55,7 @@
  ## Tier 2 — Check When Relevant (Data Integrity, Async, Error Handling)
 
  **Async & state consistency** _[applies when: code uses async/await, Promises, or UI state]_
- - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding
+ - Optimistic state changes (view switches, navigation, success callbacks) before async completion — if the operation fails or is cancelled, the UI is stuck with no rollback. Check return values/errors before calling success callbacks. Handle both failure and cancellation paths. Watch for `.catch(() => null)` followed by unconditional success code (toast, state update) — the catch silences the error but the success path still runs. Either let errors propagate naturally or check the return value before proceeding. Also covers optimistic placeholder IDs ('pending', 'temp_*', client-generated UUIDs): they must NOT be echoed back to the server in subsequent requests — the server validates against its real ID format and rejects them as 400s. Either disable controls bound to optimistic IDs until the server returns a real one, OR omit the field from outgoing payloads when the local value still matches the optimistic shape. Settings whose persistence model is per-record (per-conversation, per-document, per-project) must be persisted on every mutation, not just held in local component state — otherwise refresh resets to the persisted value and server-side history shows different content than the user used. Decide explicitly whether the field is per-record, per-session, or per-action — and persist accordingly
  - Multiple coupled state variables updated independently — actions that change one must update all related fields; debounced/cancelable operations must reset loading state on every exit path (cleared, stale, failed, aborted). Reference/selection sets that point to items in a data collection must be pruned when items are removed and invalidated when the collection is reloaded, filtered, paginated, or sorted — stale references send nonexistent IDs to downstream operations. Operations triggered from a confirmation dialog must re-validate preconditions (selection non-empty, items still exist) at execution time — the underlying data may change between dialog display and user confirmation. Component state initialized from props via `useState(prop)` only captures the initial value — if the prop updates asynchronously (data fetch, parent re-render), the local state goes stale. Sync with an effect when the user is not actively editing, or lift state to avoid the copy
  - Error notification at multiple layers (shared API client + component-level) — verify exactly one layer owns user-facing error messages. For periodic polling, also check that error notifications are throttled or deduplicated (only fire on state transitions like success→error, not on every failed iteration) and that failure doesn't make the UI section disappear entirely (component returning null when data is null/errored) — render an error or stale-data state instead of absence
  - Optimistic updates using full-collection snapshots for rollback — a second in-flight action gets clobbered. Use per-item rollback and functional state updaters after async gaps; sync optimistic changes to parent via callback or trigger refetch on remount. When appending items to a list optimistically, guard against duplicates (check existence before append) — concurrent or repeated operations can insert the same item multiple times
@@ -59,17 +71,33 @@
  - Side effects during React render (setState, navigation, mutations outside useEffect)
  - Interactive UI elements (buttons, inputs, drag-and-drop targets, keyboard shortcuts) that remain enabled while an async operation owns their related state — a second trigger while the first is in-flight produces concurrent state mutations or duplicate operations. All entry points for the same action must be disabled together while it is pending
  - Optimistic UI messages that substitute placeholder text when the actual payload sent to the server differs — the user sees one thing, the server stores another. Use the same fallback text in both the optimistic render and the outgoing payload, or surface the actual payload text in the UI
+ - Async functions invoked from synchronous event handlers (`onClick`, `onKeyDown`, command dispatchers) or effects without handling rejection at the call site — even when a shared request helper toasts the error, the unhandled rejection pollutes the console, leaves UI state inconsistent (optimistic mutation not reverted, palette/modal stuck open, navigation skipped), and hides failures from downstream event-loop instrumentation. Wrap the `await` in try/catch, attach `.catch(...)`, or use `void promise.catch(...)` — and only run success-path side effects (close, navigate, clear dirty) inside the success branch. After awaiting an async operation that may be cancelled or whose owning component may unmount, check `signal.aborted` (or a `mountedRef`) before subsequent state writes — otherwise React warns about updates on unmounted trees and stale state leaks through
+ - Single shared error-state variable (one `error` setter) reused by multiple independent async flows — one flow's success path clears the other flow's displayed error and vice versa. Split errors by domain (`dataError`, `layoutsError`), scope errors per operation, or only overwrite the specific error the current flow owns
+ - A page renders based on multiple independent async loads but the loading flag reflects only one of them — a slow or failed secondary load renders a blank page with no loading indicator and no error. Either include every render-gating fetch in the loading state (or a per-fetch status map) or provide explicit empty/error states for the secondary data
+ - Unsaved changes / dirty state discarded without warning when the user switches context (selecting another record in a multi-record editor, navigating away, closing a sheet) — silent data loss. Dirty-check on context change (inline confirm), auto-save drafts per record, or gate the switch control until unsaved state is committed. The `beforeunload` prompt alone does not cover in-app context switches
+ - Actions triggered from one surface (command palette, global menu, external event) that mutate data another already-mounted page fetched on mount — re-navigating to the same route does NOT remount the page (routers treat it as a no-op), so the visible state stays stale even though the mutation succeeded server-side. Propagate the change via a shared store, a pub/sub event whose name is a shared constant, a refetch-on-focus/visibility hook, or a key-based remount — and verify the mounted page actually subscribes
+ - Long-lived streaming response handlers (SSE, chunked HTTP, WebSocket) on the server must register a client-disconnect listener (`req.on('close')`, `req.on('aborted')`) and propagate cancellation through the FULL processing chain (retrieval, fetches, subprocesses) — partial threading wastes work after disconnect, leaves `write-after-end` errors, and inflates resource use. Per-request timeouts on streaming responses must remain active for the full duration of stream consumption, not cleared on initial fetch resolution — a stalled upstream that keeps the connection open hangs the consumer indefinitely. Honor write backpressure: check the boolean return of `write()` and await `'drain'` when it returns false. When attaching paired listeners for backpressure or completion (`drain` + `close`), the cleanup handler must remove ALL of them — asymmetric removal accumulates listeners across slow-client cycles. After flushing streaming headers, framework error middleware (asyncHandler, exception filters) cannot send a JSON error — wrap post-handshake logic in try/finally that translates errors into a terminal `event: error` SSE frame and ends the response gracefully (`if (!res.writableEnded && !res.destroyed) res.end()`)
+ - Client-side AbortController for an in-flight streaming/long-lived operation must be invoked when its owning UI context tears down OR navigates AWAY from that operation. When the cleanup is keyed to a route param, navigation events emitted BY the in-flight stream itself (e.g., redirecting to a permalink after the server returns an id) trigger the cleanup and abort the very operation that caused the navigation. Track the streaming operation's identity in a ref and abort only when navigating away from THAT identity, not on every param change. Mirror the abort by cancelling the stream reader (`reader.cancel()`) in a `finally` block and ignoring late events whose ID doesn't match the now-current operation
+ - Streaming UIs must preserve the deltas already received when the stream emits a terminal `error` event mid-stream (commit them as a partial result with an error indicator). Clearing the streaming buffer on error discards visible content the user already saw and breaks recovery flows. Pair this with mutually exclusive terminal events (see Error handling)
 
  **Error handling** _[applies when: code has try/catch, .catch, error responses, or external calls]_
  - Service functions throwing generic `Error` for client-caused conditions — bubbles as 500 instead of 400/404. Use typed error classes with explicit status codes; ensure consistent error responses across similar endpoints — when multiple endpoints make the same access-control decision (e.g., "resource exists but caller lacks access"), they must return the same HTTP status (typically 404 to avoid leaking existence). Include expected concurrency/conditional failures (transaction cancellations, optimistic lock conflicts) — catch and translate to 409/retry rather than letting them surface as 500
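A sketch of typed error classes plus a route-boundary mapper (class names and codes are illustrative):

```javascript
class HttpError extends Error {
  constructor(status, code, message) {
    super(message);
    this.name = new.target.name;
    this.status = status;
    this.code = code; // stable machine-readable code; never branch on message
  }
}
class NotFoundError extends HttpError {
  constructor(message = 'Resource not found') { super(404, 'NOT_FOUND', message); }
}
class ConflictError extends HttpError {
  constructor(message = 'Conflicting update') { super(409, 'CONFLICT', message); }
}

// Known classes map to their status; anything else stays a 500 so real
// server bugs are not suppressed.
function toResponse(err) {
  if (err instanceof HttpError) {
    return { status: err.status, body: { code: err.code, message: err.message } };
  }
  return { status: 500, body: { code: 'INTERNAL_ERROR', message: 'Internal error' } };
}
```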
+ - Error discrimination by string matching (`err.message.includes('not found')`, regex on error text) or by coupling to localizable/developer-facing messages — refactors, localization, or wrapper rewrites silently change behavior (HTTP status, retry policy, user message). Use explicit error codes or typed classes and branch on them
+ - Route handlers (or equivalent response mappers) that convert any exception from a service call into a single specific status — e.g., `catch (err) { throw new NotFoundError() }` — mask real server errors (file I/O, JSON parse, atomic-write failures) as domain 404s and hide outages. Map only known error classes/codes; let unknown errors surface as 500 so real bugs aren't suppressed
+ - Error wrappers that re-throw with only a subset of the original fields (e.g., `new ServerError(err.message, { status })` that drops `code`, `context`, `cause`) — downstream consumers see a generic `INTERNAL_ERROR` instead of the specific code they branch on. Preserve structured detail across wrapping: propagate `code`, `context`, `cause`, and any fields clients or logs depend on
  - Swallowed errors (empty `.catch(() => {})`), handlers that replace detailed failure info with generic messages, and error/catch handlers that exit cleanly (`exit 0`, `return`) without any user-visible output — surface a notification, propagate original context, and make failures look like failures. This includes cross-layer error propagation: if the server returns structured error detail (field-level validation messages, `details[]` arrays, error codes), the client should surface actionable detail rather than discarding the structure for a generic string. Includes external service wrappers that return `null`/empty for all non-success responses — collapsing configuration errors (missing API key), auth failures (403), rate limits (429), and server errors (5xx) into a single "not found" return masks outages and misconfiguration as normal "no match" results. Distinguish retriable from non-retriable failures and surface infrastructure errors loudly
- - SSE or streaming handlers that call `end()`/`close()` on mid-stream errors without emitting an error event — the client observes a clean stream termination and treats partial content as complete. Emit a structured `event: error` block before closing so clients can detect and surface the failure
+ - SSE or streaming handlers that call `end()`/`close()` on mid-stream errors without emitting an error event — the client observes a clean stream termination and treats partial content as complete. Emit a structured `event: error` block before closing so clients can detect and surface the failure. Conversely, named lifecycle events on streams (`error`, `done`, `complete`) must be MUTUALLY EXCLUSIVE — after emitting `error`, do NOT also emit `done`, or include explicit success/error info in the terminal frame. Otherwise clients parsing the last event treat failed runs as completed
+ - Raw `fetch()` failures (TypeError "Failed to fetch", DNS errors, ECONNREFUSED) at API client boundaries must be translated to a consistent user-friendly message matching the project's established transport-error utility — otherwise users see cryptic browser errors and can't distinguish "server unreachable" from "request rejected." Preserve `AbortError` so callers can still distinguish cancellation from failure
  - SSE or event dispatchers handling named event types but ignoring the protocol's default/unnamed event — SSE streams emitting `data:` without `event:` produce type `'message'` (the SSE spec default), which a handler processing only named types silently discards. Verify the default event type is either handled or explicitly excluded
  - Caller/callee disagreement on success/decision semantics — a function that resolves with `{ success: false }` while callers use `.catch()` for error handling means failures are treated as successes. Verify that the contract between producer and consumer is consistent: if callers branch on rejection, the function must reject on failure; if callers branch on a status field, the function must never reject. This extends beyond success/failure to any evaluation result — if a gate/check function returns `{ shouldRun: false }` on errors while the runtime treats errors as fail-open (run anyway), the API surface and runtime disagree on skip semantics. Also covers argument shape contracts — passing a wrapped object (`{ items: [...] }`) when the callee expects a bare array (or vice versa), or supplying arguments at the wrong parameter position, causes silent no-ops or partial processing; verify argument shapes match by reading the callee's signature, not assuming from naming conventions. Also check EventEmitter async handlers — `async` callbacks on `'close'`/`'error'` events create unhandled rejections because EventEmitter doesn't await handler promises; wrap in try/catch
  - Destructive operations in retry/cleanup paths assumed to succeed without their own error handling — if cleanup fails, retry logic crashes instead of reporting the intended failure
  - External service calls without configurable timeouts — a hung downstream service blocks the caller indefinitely
  - Missing fallback behavior when downstream services are unavailable (see also: retry without backoff in "Sync & replication")
  - Route handlers that call a status/health probe before delegating to the main service when the service already handles the "not configured"/"unreachable" case — the pre-probe adds a redundant upstream round-trip on every request and can fail even when the intended operation would succeed. Let the service be the authoritative source and map its structured errors to the appropriate HTTP status at the route boundary
+ - Sync-shaped route handler wrapping a service that is async-by-design (returns a job handle, writes the artifact later) — the handler must subscribe to the completion event BEFORE calling the service (so a fast/cached job can't fire `complete` before the listener attaches) AND wait for the matching job id with a timeout. Common bug: the handler reads `result.filename` from disk immediately after the service returns, gets nothing, and replies with an empty success payload. If the service is callable both async (jobId-only) and sync (await artifact), expose the sync variant explicitly (`generateAndWait` / `generateSync`) rather than mixing modes
+ - Cross-module feature-flag detection drift — when multiple modules independently determine "is feature X active?" (HTTPS enabled, OAuth scope satisfied, dark mode, tier-gated capability) using divergent checks, behavior diverges and UX contradicts itself (UI advertises an `https://` URL while the server runs HTTP; one helper checks for `cert.pem`, another requires `cert.pem && key.pem`). Centralize the predicate in a single exported helper and have every caller import it; flag any module that re-derives the same boolean inline
+ - Cross-module error classification — a low-level wrapper rethrows errors with a different `name`/`code`/`message` shape than the original (e.g., custom fetch wrapper aborts with `new Error('Request aborted')` while a downstream classifier checks `err.name === 'AbortError'`). The classifier matches nothing and the timeout/cancel branch never fires. Either preserve `name`/`code`/`cause` through the wrapper, OR have the classifier accept the union of shapes the wrapper can emit
+ - Compatibility-shim end-to-end plumbing — when a route bridges to an external API standard (A1111-style image-gen, OpenAI, S3-compatible, etc.) every documented response field must be backed by a real value chain through the provider, intermediate service, and response builder. Common bug: response shape is correct but a field like `seed`, `progress`, `eta`, `model`, or `usage.tokens` is hardcoded to a default (`0`, `null`, the request input) because nothing in the chain actually returns it. Trace each declared response field from where it's set in the route → service return shape → underlying provider/process output, and confirm the value flows end-to-end. "Always returns 0 / always undefined / always empty array" patterns signal incomplete plumbing
 
  **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
  - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown. `requestAnimationFrame` handles must be cancelled on unmount — pending frames invoke DOM operations or state updates on unmounted nodes. Blob/object URLs created via `URL.createObjectURL()` must be revoked on both item removal AND component unmount. ReadableStream / fetch readers consumed in a loop need `try/finally { reader.cancel() }` — exceptions in the loop otherwise leave the stream open; the `finally` block should catch its own errors so it doesn't mask the original exception
@@ -82,12 +110,13 @@
  - API versioning: breaking changes to public endpoints without version bump or deprecation path
  - Backward-incompatible response shape changes without client migration plan
  - Backward compatibility breaking changes — renamed/removed config keys, changed file formats, altered DB schemas, modified event payloads, renamed URL routes/paths, or restructured persisted data (localStorage, files, database rows) without a migration path or fallback that reads the old format. For route/URL renames, add redirects from old paths to preserve bookmarks and external links. Trace all consumers of the changed contract (other services, CLI versions, stored data) and verify they still work or have an upgrade path. For schema changes, require a migration script; for config/format changes, support both old and new formats during a transition period or provide a one-time converter
- - One-time migrations or initializations triggered on load/startup without a completion guard (version stamp, flag, or condition that excludes already-migrated data) — re-execute on every startup, causing unnecessary writes, duplicate processing, or state churn. Ensure the migration condition excludes records/configs that have already been migrated
+ - One-time migrations or initializations triggered on load/startup without a completion guard (version stamp, flag, or condition that excludes already-migrated data) — re-execute on every startup, causing unnecessary writes, duplicate processing, or state churn. Ensure the migration condition excludes records/configs that have already been migrated. Also covers setup/provisioning scripts invoked from hot paths (`npm start`, dev script, container entrypoint, app boot) that mutate credentials, privileges, or installed-package state (`ALTER USER`, password resets, brew installs, file ownership changes) — gate the heavy work behind a cheap readiness check so reruns are no-ops, OR refactor the script so each step is itself idempotent and detects already-applied state
  - Data migrations that silently change runtime behavior — converting records from one type/schedule/state to another must preserve the original execution semantics (frequency, enabled state, trigger behavior). Migrating an on-demand/manual entity to a scheduled one causes unexpected automated execution; migrating a fine-grained interval to a coarser default changes frequency. Unsupported source values (cron expressions, custom intervals) must be flagged or preserved, not silently dropped to defaults
  - Update/patch endpoints with explicit field allowlists (destructured picks, permitted-key arrays) — when the data model gains new configurable fields, the allowlist must be updated or the new fields are silently dropped on save. Trace from model definition to the update handler's field extraction to verify coverage
- - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete — array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
+ - New endpoints/schemas should match validation patterns of existing similar endpoints — field limits, required fields, types, error handling. If validation exists on one endpoint for a param, the same param on other endpoints needs the same validation — including path params, query params, and body fields on sibling endpoints (create/update/delete/activate). The same field should be accepted, rejected, and trimmed identically everywhere it appears, regardless of which endpoint consumes it. `z.string().min(1)` without `.trim()` accepts whitespace-only values — prefer `z.string().trim().min(1)` for user-visible names. API documentation schemas (OpenAPI, JSON Schema) must be structurally complete — array types require `items` definitions, required fields must be listed, and the documented shape must match what the implementation actually returns
  - Client-side input validation limits (max count, file size, string length, combined totals) must be consistent with — and ideally tighter than — server-side enforcement. When the client allows combinations the server rejects (e.g., 8 × 10MB files vs a 50MB JSON body limit), users hit confusing 400/413 errors. Trace all enforcement boundaries (UI, API schema, body parser, downstream service) and verify they form a coherent envelope
  - Sample config files, README examples, and documentation that reference config keys or structure must match what the implementation actually reads. Trace example keys against the config loader — stale examples teach operators to configure values the system ignores (or vice versa)
+ - Subprocess invocations must inherit the same configuration source as the parent — if the parent reads from `.env`/config files but the child only sees `process.env`, exporting those values explicitly via the `env` option is required. Otherwise a probe uses customized credentials/ports while the underlying setup runs with built-in defaults, creating an "inconsistency loop" where the probe always fails and provisioning re-applies defaults that overwrite user customization. Pass the resolved values through; do not assume children re-parse the same config files
  - Config values whose format can be validated at initialization time (URLs, port numbers, auth schemes) but are only validated at first use — misconfiguration surfaces as a cryptic runtime error deep in the call stack. Validate format and range of security-relevant config values during initialization and surface a specific diagnostic identifying the bad field
  - URL joining utilities that force paths absolute-from-origin discarding the base URL's pathname — `baseUrl=http://host/proxy` + `/v1/api` silently produces `http://host/v1/api` instead of `http://host/proxy/v1/api`. Verify URL construction utilities preserve pathname segments from the base URL, or document and enforce that base URLs must be origin-only
  - Summary/aggregation endpoints that compute counts or previews via a different query path, filter set, or data source than the detail views they link to — users see inconsistent numbers between the dashboard and the destination page. Trace the computation logic in both paths and verify they apply the same filters, exclusions, and ordering guarantees (or document the intentional difference)
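The completion-guard pattern for one-time migrations can be sketched as (hedged: `SCHEMA_VERSION`, the config shape, and the interval mapping are illustrative assumptions):

```typescript
const SCHEMA_VERSION = 2; // illustrative target version

interface Config {
  schemaVersion?: number;
  legacyInterval?: string; // old format (assumed for illustration)
  intervalMs?: number;     // new format
}

function migrate(config: Config): { config: Config; migrated: boolean } {
  // Completion guard: already at (or past) the target version, cheap no-op.
  if ((config.schemaVersion ?? 0) >= SCHEMA_VERSION) return { config, migrated: false };
  const next: Config = { ...config };
  if (next.legacyInterval !== undefined) {
    next.intervalMs = next.legacyInterval === "hourly" ? 3_600_000 : 60_000; // map known values (sketch)
    delete next.legacyInterval;
  }
  next.schemaVersion = SCHEMA_VERSION; // stamp so the next startup skips all of this
  return { config: next, migrated: true };
}
```

Rerunning `migrate` on already-migrated config returns `migrated: false` without touching the data, so the startup path stays idempotent.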
@@ -97,6 +126,16 @@
  - Schema fields accepting values downstream code can't handle; Zod/schema stripping fields the service reads (silent `undefined`); config values persisted but silently ignored by the implementation — trace each field through schema → service → consumer. Also check for parameters accepted and validated in the schema but never consumed by the implementation — dead API surface that misleads callers into believing they're configuring behavior that's silently ignored; remove unused parameters or wire them through to the implementation. Update schemas derived from create schemas (e.g., `.partial()`) must also make nested object fields optional — shallow partial on a deeply-required schema rejects valid partial updates. Additionally, `.deepPartial()` or `.partial()` on schemas with `.default()` values will apply those defaults on update, silently overwriting existing persisted values with defaults — create explicit update schemas without defaults instead
  - Multi-part UI features (e.g., table header + rows) whose rendering is gated on different prop/condition subsets — if the header checks prop A while rows check prop B, partial provision causes structural misalignment (column count mismatch, orphaned interactive elements without handlers). Derive a single enablement boolean from the complete prop set and use it consistently across all participating components
  - Entity creation without case-insensitive uniqueness checks — names differing only in case (e.g., "MyAgent" vs "myagent") cause collisions in case-insensitive contexts (file paths, git branches, URLs). Normalize to lowercase before comparing
+ - Arrays of IDs (widget ids, tag ids, member ids) persisted, returned by API, or rendered with `key={x}` without container-level deduplication — element-level validation (type check, length cap) is not enough. Duplicates cause React key collisions, inflated operation counts, and inconsistent UI updates. Enforce uniqueness via schema refinement (`zod.refine(arr => new Set(arr).size === arr.length)`), dedupe during ingestion, AND dedupe during read-path sanitization so hand-edited or legacy data can't reintroduce collisions. Apply the same logic to arrays of records keyed by id at the container level (first-wins, not just element-level shape checks)
+ - Data loaded from files or persistent stores sanitized less strictly than the API accepts on write — hand-edited, migrated, or corrupted persisted state can introduce values (oversized names, non-kebab ids, duplicate entries, bypassed limits) the API would reject, producing oversized responses, unreachable records (client renders but API rejects on mutate), or invariant violations. Apply the same length caps, regex, uniqueness, and type guards in read-path sanitization as in request-schema validation; drop or truncate out-of-range values rather than passing them through
+ - Authority / privilege flags (builtIn, protected, owner, role, immutable) read directly from persisted records without re-derivation — flipping the flag in a hand-edited file or a tampered sync source grants unintended privileges (e.g., changing `builtIn: true → false` lets "protected" records be deleted). Derive authority from code (a constant set of built-in ids, session identity, server-side role lookup), not from the persisted representation
+ - Persisted-state filename/path fields (history JSON entries, settings.json paths, manifest entries) used as filesystem operands (`path.join(BASE, item.filename)` passed to `unlink`/`readFile`, placed in `spawn` arg lists, or written into ffmpeg/ImageMagick concat manifests) without basename validation plus a path-resolve prefix check — corrupted or tampered persisted state can include `../` segments that escape the intended directory. Use a `safeUnder(base, candidate)` helper at every consumption site. For paths that further pass into exec arg strings or manifest files (e.g., ffmpeg concat-demuxer `file '...'` lines), basename validation is necessary but not sufficient — the consumer's parser has its own escaping rules: single quotes and newlines break ffmpeg manifests, backslashes on Windows are interpreted as escape characters in quoted strings, and shell metacharacters break shell-quoted args. Either reject filenames containing parser-special characters at validation time, or apply consumer-specific escaping before writing the manifest/argv
+ - Allowlists gating user-provided identifiers must use the consumer's identifier namespace, not a sibling namespace. Common bug: an allowlist of import-module names (`cv2`, `PIL`) used to gate `pip install <name>` — pip's identifier space is package specs (`opencv-python`, `pillow`), so the allowlist permits installs of typosquatted/unintended packages. Same risk for command names vs aliases, OAuth scope strings vs role names, file extensions vs MIME types. Build the allowlist from the consumer's actual valid-input set, NOT from a related-but-different list, and include a unit test that asserts every allowlist entry is a valid input to the consumer
+ - API responses returning server-internal absolute filesystem paths (`/Users/.../data/...`, `C:\app\data\...`) leak server layout, OS, and install locations to clients and couple them to filesystem structure. Return basenames or relative identifiers (`/data/loras/<filename>`) and resolve/validate server-side at consumption time
+ - Cross-module constants kept in sync by comment ("must stay in sync with X", copy-pasted regex, duplicated event names or size limits) — the comment is not enforcement and drift is a silent failure. Any constant shared across module boundaries (client↔server, route↔service, component↔component, producer↔consumer) must be a single exported value imported at both sides: event names, regex patterns, numeric limits, path segments, feature-flag keys, error codes
+ - Generator/validator structural invariants — when a generator produces values with structural guarantees (sortability via fixed-width prefix, embedded checksum, encoded version), the validator regex (and any client-side mirror) must enforce the SAME shape. Broader regexes accept inputs the generator never emits, breaking invariants the rest of the system relies on (e.g., lexical sort == chronological sort breaks once a base36 timestamp grows by a digit). Verify generator + server validator + client mirror form a closed loop. Test fixtures should use IDs/payloads that match generator output, not contrived literals — fixtures with off-spec shapes hide regressions where the production format changes
+ - Schemas accepting paired/range fields (`startDate`/`endDate`, `min`/`max`, `from`/`to`) must add a cross-field refinement (`zod .refine()`, JSON Schema `dependentSchemas`) enforcing the relationship (start ≤ end). Without it, the schema accepts inconsistent ranges that the implementation may silently swap, ignore, or interpret ambiguously. Define deterministic rules when only one bound is supplied
+ - Pure persistence/utility modules importing from orchestration/service modules just to access a constant pulls the entire downstream import graph as a transitive dependency — increases test cost, slows imports, and creates side effects on load. Move shared constants (enums, regex, size caps, valid-mode sets) into small dedicated modules so storage and utility code can be reused without dragging the service stack in
  - Code reading properties from API responses, framework-provided objects, or internal abstraction layers using field names the source doesn't populate or forward — silent `undefined`. Verify property names and nesting depth match the actual response shape (e.g., `response.items` vs `response.data.items`, `obj.placeId` vs `obj.id`, flat fields vs nested sub-objects). When building a new consumer against an existing API, check the producer's actual response — not assumed conventions. When branching on fields from a wrapped third-party API, confirm the wrapper actually requests and forwards those fields (e.g., optional response attributes that require explicit opt-in). Also verify call sites pass inputs in the format the called function actually accepts — framework constructors with non-obvious positional argument order, loaders with format-specific variants (content paths vs script paths, asset objects vs class references), and accessor APIs with distinct method-vs-property semantics. Fallback branches in multi-format dispatchers commonly use the wrong function for the input type
  - Data model fields that have different names depending on the creation/write path (e.g., `createdAt` vs `created`) — code referencing only one naming convention silently misses records created through other paths. Trace all write paths to discover the actual field names in use. When new logic (access control, UI display, queries) checks only a newly introduced field, verify it falls back to any legacy field that existing records still use — otherwise records created before the migration are silently excluded or inaccessible. Also check entity identity keys: if code looks up or matches entities using a computed key (e.g., `e.id || e.externalId`), all code paths that perform the same lookup must use the same key computation — one path using `e.id` while another uses `e.id || e.externalId` causes mismatches for entities missing the primary key
  - Entity type changes without invariant revalidation — when an entity has a discriminator field (type, kind, category) and the user changes it, all type-specific invariants must be enforced on the new type AND type-specific fields from the old type must be cleared or revalidated. A job changing from `shell` to `agent` without clearing `command`, or changing to `shell` without requiring `command`, leaves the entity in an invalid hybrid state that fails at runtime or resurfaces stale data
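The container-level dedupe rule for ID arrays can be sketched as (`sanitizeIds` is an illustrative helper; a real read path would apply the same guard after loading persisted state, not only on write):

```typescript
// First-wins dedupe plus element-level guards, applied at the container level.
function sanitizeIds(ids: unknown[]): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  for (const id of ids) {
    if (typeof id !== "string" || id.length === 0) continue; // element-level guard
    if (seen.has(id)) continue;                              // container-level dedupe, first-wins
    seen.add(id);
    out.push(id);
  }
  return out;
}
```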
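The `safeUnder(base, candidate)` helper named in the persisted-path bullet can be sketched as a resolve-then-prefix check (the `path.sep` suffix stops `/data` from matching a sibling like `/data-evil`):

```typescript
import path from "node:path";

// Returns the resolved path if it stays under `base`, else null.
function safeUnder(base: string, candidate: string): string | null {
  const resolvedBase = path.resolve(base);
  const resolved = path.resolve(resolvedBase, candidate);
  if (resolved === resolvedBase) return null; // the base dir itself is not a file under it
  return resolved.startsWith(resolvedBase + path.sep) ? resolved : null;
}
```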
@@ -104,7 +143,7 @@
  - Operations scoped to a specific entity subtype that don't verify the entity's type discriminator before processing — an endpoint or function designed for one account/entity type that accepts any entity by ID can corrupt state or produce wrong results when called with the wrong type. Add an explicit type guard and return a structured error
  - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing. Watch for validation/sanitization functions that return `null` for invalid input when `null` also means "clear/delete" downstream — malformed input silently destroys existing data. Distinguish "invalid, reject the request" from "explicitly clear this field". Also applies to normalization (trailing slashes, case, whitespace): if one path normalizes a value before comparison but the write path stores it un-normalized, comparisons against the stored value produce incorrect results — normalize at write time or normalize both sides consistently
  - Validation functions that delegate to runtime-behavior computations (next schedule occurrence, URL reachability, resource resolution) — conflating "no result within search window" or "temporarily unavailable" with "invalid input" rejects valid configurations. Validate syntax and structure independently of runtime feasibility
- - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
+ - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks; `NaN` flowing into subprocess args (`-p NaN`) or formatted strings produces opaque downstream failures. Clamp query params to safe lower bounds. For env-var parsing in particular, the input commonly contains stray whitespace or inline `# comment` text — use `Number.parseInt(String(value).trim(), 10)` and gate with `Number.isFinite(parsed)` before falling back to the default
  - Hand-rolled regex validators for well-known formats (IP addresses, email, URLs, dates, semver) that accept invalid inputs or reject valid ones — use platform/standard library parsers instead (e.g., `net.isIP()`, `URL` constructor, `semver.valid()`) which handle edge cases the regex misses
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
  - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Batch operations that report overall success while silently logging per-item failures — callers see success but partial work was done; collect and return per-item failures in the response. Silent operations in verbose sequences where all branches should print status
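The `NaN` guard for env-var parsing can be sketched as a small helper (hedged: the inline-comment stripping is an assumption about how operators hand-edit `.env` files):

```typescript
// Parse an integer from an env var, tolerating whitespace and inline comments.
function intFromEnv(value: string | undefined, fallback: number): number {
  if (value === undefined) return fallback;
  const stripped = value.split("#")[0].trim(); // drop inline "# comment" text, then trim
  const parsed = Number.parseInt(stripped, 10);
  return Number.isFinite(parsed) ? parsed : fallback; // NaN never flows into subprocess args
}
```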
@@ -116,6 +155,7 @@
  - Writes that replace an entire composite attribute (array, map, JSON blob) when the field is populated by multiple sources — the write discards data from other sources. Use a separate attribute, merge with the existing value, or use list/set append operations
  - Functions with early returns for "no primary fields to update" that silently skip secondary operations (relationship updates, link writes)
  - Functions that acquire shared state (locks, flags, markers) with exit paths that skip cleanup — leaves the system permanently locked. Trace all exit paths including error branches
+ - Modules that own a persistence schema (write to disk/DB with a known shape) should validate at the persistence boundary, not assume the API/route layer will catch everything. Direct callers (internal scripts, tests, programmatic batch jobs, future endpoints) bypass route validation and corrupt on-disk state. Keep enum/range/required checks layered: the route validates input format, the storage layer validates the persistence contract, and at minimum reject invalid enum values before writing
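The acquire-with-guaranteed-release rule above can be sketched with a `try/finally` wrapper (`withLock` and the in-memory `Set` are illustrative; real code might hold file or DB locks):

```typescript
const locks = new Set<string>();

// Every exit path (success, throw, early return) releases the lock.
async function withLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  if (locks.has(key)) throw new Error(`already locked: ${key}`);
  locks.add(key);
  try {
    return await fn();   // throws inside fn still reach the finally block
  } finally {
    locks.delete(key);   // the system can never end up permanently locked
  }
}
```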
 
  **Input handling** _[applies when: code accepts user/external input]_
  - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
@@ -137,7 +177,7 @@
 
  **Sync & replication** _[applies when: code uses pagination, batch APIs, or data sync]_
  - Upsert/`ON CONFLICT UPDATE` updating only a subset of exported fields — replicas diverge. Document deliberately omitted fields
- - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results. When a data store applies query limits before filter expressions, a fixed multiplier on the limit still under-fetches — loop with continuation tokens until the target count of post-filter results is collected
+ - Pagination using `COUNT(*)` (full table scan) instead of `limit + 1`; endpoints missing `next` token input/output; hard-capped limits silently truncating results. When a data store applies query limits before filter expressions, a fixed multiplier on the limit still under-fetches — loop with continuation tokens until the target count of post-filter results is collected. Periodic maintenance (cleanup, expiry, dedup) bolted onto a paginated read path runs only for items returned in that page — entries beyond the page boundary are never processed. Move maintenance to a background sweep, run a separate unbounded pass, OR use cheap metadata (mtime, size) for the maintenance pass while only doing expensive reads for the page actually returned. Maintenance gates that depend on parsed metadata fields will skip records where parsing returns a sentinel (0, null, "") — those records become permanent. Treat parse failure as "expired" (or fall back to filesystem mtime), or surface unparseable records for manual review
  - Pagination cursors derived from the last *scanned* item rather than the last *returned* item — if accumulated results are trimmed (e.g., sliced to a page size), the cursor advances past items that were fetched but never delivered, causing permanent skips
  - Batch/paginated API calls (database batch gets, external service calls) that don't handle partial results — unprocessed items, continuation tokens, or rate-limited responses silently dropped. Add retry loops with backoff for unprocessed items
  - Retry loops without backoff or max-attempt limits — tight loops under throttling extend latency indefinitely. Use bounded retries with exponential backoff/jitter
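The loop-with-continuation-tokens fix for post-filter under-fetch can be sketched as (`fetchPage` stands in for a store that applies `limit` before the filter expression; names are illustrative):

```typescript
type Page<T> = { items: T[]; next?: number };

// Keep following continuation tokens until `target` post-filter rows are
// collected or the store is exhausted. A fixed limit multiplier cannot
// guarantee this when the store limits before filtering.
async function collect<T>(
  fetchPage: (cursor: number | undefined, limit: number) => Promise<Page<T>>,
  filter: (item: T) => boolean,
  target: number,
): Promise<T[]> {
  const out: T[] = [];
  let cursor: number | undefined = undefined;
  do {
    const page = await fetchPage(cursor, target);
    out.push(...page.items.filter(filter));
    cursor = page.next;
  } while (out.length < target && cursor !== undefined);
  return out.slice(0, target);
}
```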
@@ -148,6 +188,7 @@
  - File writes that assume the parent directory exists — on fresh installs or after directory cleanup, the write fails with ENOENT. Ensure the directory exists before writing (or create it on demand)
  - Bootstrap/resilience code that imports the dependencies it's meant to install — restructure so installation precedes resolution
  - Re-exporting from heavy modules defeats lazy loading — use lightweight shared modules
+ - New global APIs (`AbortSignal.any`, `Promise.withResolvers`, `structuredClone`) used in code that runs on older runtimes must be feature-detected. When the codebase already provides a fallback utility for the same API (e.g., a `withTimeout` helper that polyfills `AbortSignal.any`), reuse it instead of calling the unguarded global — drift between the safe path and a new direct call reintroduces the runtime error the fallback was created to avoid
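The feature-detection rule can be sketched for `AbortSignal.any` (a hedged fallback; routing every call site through one helper is the point, so a new direct call cannot reintroduce the runtime error):

```typescript
function anySignal(signals: AbortSignal[]): AbortSignal {
  // Feature-detect instead of calling AbortSignal.any unguarded.
  const native = (AbortSignal as unknown as { any?: (s: AbortSignal[]) => AbortSignal }).any;
  if (typeof native === "function") return native.call(AbortSignal, signals);
  // Fallback for runtimes without AbortSignal.any (added in Node 20.3).
  const controller = new AbortController();
  for (const s of signals) {
    if (s.aborted) { controller.abort(s.reason); break; }
    s.addEventListener("abort", () => controller.abort(s.reason), { once: true });
  }
  return controller.signal;
}
```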
 
  **Data format portability** _[applies when: code crosses serialization boundaries — JSON, DB, IPC]_
  - Values crossing serialization boundaries may change format (arrays in JSON vs string literals in DB) — convert consistently. Datetime values are especially fragile: mixing UTC string operations (`toISOString().split('T')[0]`) with local-time `Date` methods (`setDate`/`getDate`) shifts results across timezone boundaries; appending `'Z'` to bare datetimes without verifying the source timezone converts non-UTC values incorrectly. Keep all datetime arithmetic in one consistent timezone (preferably UTC with explicit UTC methods) or use a timezone-aware library
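The datetime-consistency rule can be sketched as day arithmetic kept entirely in UTC (an illustrative helper; it assumes `YYYY-MM-DD` input):

```typescript
// Mixing toISOString() (UTC) with setDate()/getDate() (local time) shifts
// results near midnight in non-UTC zones; Date.UTC keeps everything in UTC.
function addDaysUtc(isoDate: string, days: number): string {
  const [y, m, d] = isoDate.split("-").map(Number);
  const t = Date.UTC(y, m - 1, d + days); // Date.UTC normalizes overflow (Feb 30 -> Mar 1)
  return new Date(t).toISOString().split("T")[0];
}
```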
@@ -157,34 +198,43 @@
 
  **Shell & portability** _[applies when: code spawns subprocesses, uses shell scripts, or builds CLI tools]_
  - Subprocess calls under `set -e` abort on failure; non-critical writes fail on broken pipes — use `|| true` for non-critical output
- - Interactive prompts (`read -p`, `Read-Host`, `prompt()`) in scripts that may run non-interactively (CI, cron, automation) — guard with TTY detection (`[ -t 0 ]`, `[Environment]::UserInteractive`) and default to a safe value or skip when stdin is not a terminal
+ - Interactive prompts (`read -p`, `Read-Host`, `prompt()`) in scripts that may run non-interactively (CI, cron, automation) — guard with TTY detection (`[ -t 0 ]`, `[Environment]::UserInteractive`) and default to a safe value or skip when stdin is not a terminal. Also handle EOF (Ctrl-D, closed stdin) explicitly: under `set -e`, a `read` returning non-zero on EOF will abort the entire script mid-flow — use `read ... || true` or check the return value and apply a safe default. Prompt input validation should accept the full set of expected answers (e.g., `y`/`yes`/`n`/`no` case-insensitive), not just the default — treating any non-default input as consent surprises users who type "no" expecting it to mean no
  - Detached child processes with piped stdio — parent exit causes SIGPIPE. Redirect to log files or use `'ignore'`
  - Subprocess output buffered in memory without size limits — a noisy or stuck child process can cause unbounded memory growth. Cap in-memory buffers and truncate or stream to disk for long-running commands
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
  - Naive whitespace splitting of command strings (`str.split(/\s+/)`) breaks quoted arguments — use a proper argv parser or explicitly disallow quoted/multi-word arguments when validating shell commands
  - Subprocess output parsed from a single stream (stdout or stderr) to detect conditions (conflicts, errors, specific states) — the information may appear in the other stream or vary by tool version/config. Check both stdout and stderr, and verify the exit code, to reliably detect the condition
  - Shell expansions (brace `{a,b}`, glob `*`, tilde `~`, variable `$VAR`) suppressed by quoting context — single quotes prevent all expansion, so patterns like `--include='*.{ts,js}'` pass the literal braces to the command instead of expanding. Use multiple flags, unquoted brace expansion (bash-only), or other command-specific syntax when expansion is required
+ - Arguments passed via process argv have OS-imposed length limits (notoriously low on Windows, ~32KB; ~128KB-2MB on most Unix). For variable-length payloads (prompts, JSON blobs, file contents) or anything potentially large, pipe via stdin instead of constructing a long argv. If argv must be used, enforce a strict cap and fail with a clear message before spawning
+ - PowerShell `$LASTEXITCODE` propagates from any external call and determines the script's final exit status. A step claiming to be "fail-soft" (a non-essential post-install hook, optional dashboard auto-open) that runs an external command without explicitly resetting `$LASTEXITCODE = 0` (or wrapping it in try/catch with `$global:LASTEXITCODE = 0`) leaks a non-zero exit from the soft step into the parent script's overall exit status — breaking the fail-soft contract that callers depend on
+ - Outbound `fetch()` / HTTP calls in setup/install/update scripts without an `AbortController` per-request timeout — a hung server (accepts connection, never responds) blocks the parent shell process indefinitely, breaking "fail-soft" guarantees that the parent script depends on. Use the same timeout helper the rest of the codebase uses for outbound HTTP, and treat timeout as a skip with a clean exit code
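The argv-length rule above can be sketched in Node by piping the payload through stdin instead of the argument list (the child command here is node itself, purely for illustration):

```typescript
import { spawnSync } from "node:child_process";

// Large payloads go through stdin, not argv: argv has OS-imposed limits
// (notoriously ~32KB on Windows). The child echoes the byte count it read.
function runWithPayload(payload: string): string {
  const child = spawnSync(
    process.execPath,
    ["-e", "let b='';process.stdin.on('data',c=>b+=c).on('end',()=>process.stdout.write(String(b.length)))"],
    { input: payload, encoding: "utf8" },
  );
  if (child.status !== 0) throw new Error(child.stderr);
  return child.stdout;
}
```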
 
  **Search & navigation** _[applies when: code implements search results or deep-linking]_
  - Search results linking to generic list pages instead of deep-linking to the specific record
  - Search/query code hardcoding one backend's implementation when the system supports multiple — verify option/parameter names are mapped between backends
+ - Deep-link URL contracts between sender and receiver — a URL with query parameters (`?id=...`, `?date=...`) or path segments is a contract: the receiving page/route MUST consume those parameters and use them to scroll/select/filter. Otherwise the sender misleads users (chip claims to "deep-link to event" but the route ignores the parameter and lands on the section list). Verify deep-link destinations actually route to the intended record/state by reading the receiver's route handler / page code, not just the producer's intent. If the receiver doesn't yet support the parameter, either drop it (and adjust docs/changelog claims) or wire it through end-to-end
 
  **Destructive UI operations** _[applies when: code adds delete, reset, revoke, or other destructive actions]_
  - Destructive actions (delete, reset, revoke) in the UI without a confirmation step — compare with how similar destructive operations elsewhere in the codebase handle confirmation
 
  **Streaming & real-time protocols** _[applies when: code implements SSE, WebSocket, ReadableStream, or event-driven APIs]_
- - Wire protocol parsers that assume a simplified subset of the spec (e.g., `\n`-only line endings when the spec allows `\r\n`) — test against boundary representations the spec permits, not just the happy path
+ - Wire protocol parsers that assume a simplified subset of the spec (e.g., `\n`-only line endings when the spec allows `\r\n`) — test against boundary representations the spec permits, not just the happy path. Parsers must (a) handle the spec's full set of separators (e.g., both `\n\n` and `\r\n\r\n` for SSE; multiple `data:` lines joined with `\n`); (b) flush remaining buffered content on EOF — otherwise the last frame is dropped when upstream closes mid-frame; (c) wrap per-frame deserialization (`JSON.parse`) so a single malformed frame doesn't terminate the entire stream
+ - Stateful parsers (multipart, MIME, framed protocols) must verify they reached the terminal state on `req.on('end')` / EOF — calling `finish()` while still in a body/header state accepts truncated input as success, silently corrupting partially-written uploads. Track the terminal-state transition (e.g., `STATE_DONE` after the closing `--boundary--`) and return a 400 error otherwise (and clean up any partial files). Per-part state (`currentFileMimetype`, accumulated headers, decoder state) must be reset at the start of each new part — otherwise a part with no `Content-Type` inherits the previous part's mimetype
+ - Refactoring a streaming parser to "buffer-then-process" (calling `readAllBytes()` / `Buffer.concat(chunks)` / `await req.text()` before parsing) defeats the streaming contract and re-introduces an OOM/DoS vector for large uploads — verify the new implementation still respects each caller's `maxSize`/body cap WHILE reading (stop collecting once bytes exceed the cap), or restore true streaming. Watch for header comments still claiming "streams" / "never buffers entire body in memory" after such refactors — they become a documentation lie
+ - Library wrappers advertising a multer/express-style contract `(req, file, cb)` must pass the real `req` through to filters/hooks, and must `await` async callbacks before continuing — treating the `cb` as synchronous when the contract permits async breaks any caller that supplied an async filter. Errors thrown from middleware/parser modules without `err.status` set are normalized to HTTP 500 — set `err.status = 400` (or 413 for size limits) and a stable `err.code` (`PAYLOAD_TOO_LARGE`, `INVALID_MULTIPART`, `VALIDATION_ERROR`) at the throw site, OR throw a typed `ServerError`/`ApiError`
  - Event handlers or effects that fire on every high-frequency event (streaming deltas, scroll, resize, keydown) without throttle, debounce, or `requestAnimationFrame` batching — causes jank and excessive re-renders
 
  **Accessibility** _[applies when: code modifies UI components or interactive elements]_
- - Interactive elements missing accessible names, roles, or ARIA states — including labels lost or replaced with non-descriptive placeholders in conditional/compact rendering modes. Disabled interactions should have `aria-disabled`
+ - Interactive elements missing accessible names, roles, or ARIA states — including labels lost or replaced with non-descriptive placeholders in conditional/compact rendering modes. Disabled interactions should have `aria-disabled`. Verify the ARIA attribute set matches established patterns used elsewhere in the codebase for the same widget type (disclosure, menu, dialog) — inconsistency degrades assistive-tech support and creates confusion for keyboard users
+ - ARIA roles applied without the keyboard interactions the role contract requires — `role="menu"`/`menuitem*` expects roving focus, arrow-key navigation, Escape scoped to the menu, and focus management on open/close; `role="listbox"`/`option` expects Home/End/typeahead; `role="dialog"` expects focus trap + return focus. Shipping the role without the behavior strands screen-reader and keyboard users on a control that looks interactive to AT but doesn't respond. Either implement the full interaction pattern or drop to a simpler one (native `<button>` + disclosure) that doesn't promise more than the code delivers
  - Custom toggle/switch UI built from non-semantic elements instead of native inputs
  - Overlay or absolutely-positioned layers with broad `pointer-events-auto` that intercept clicks/hover intended for elements beneath — use `pointer-events-none` on decorative overlays and enable events only on small interactive affordances. Conversely, `pointer-events-none` on a parent kills hover/click handlers on children — verify both directions when layering positioned elements
+ - Nested inputs / controls handling keyboard events (`Escape`, `Enter`, `ArrowUp`/`Down`) inside a modal or form that also handles the same key at the ancestor level — the event bubbles and the ancestor fires too (closing the modal, submitting the form, collapsing the disclosure). Call `e.stopPropagation()` (and usually `e.preventDefault()`) in the inner handler when it handles the key, or scope the ancestor handler to only fire when the event target is outside the inner control
 
  ## Tier 4 — Always Check (Quality, Conventions, AI-Generated Code)
 
  **Intent vs implementation**
- - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, an action labeled "migrated" that never creates the target, or UI actions offered for entity states where the transition is invalid (e.g., a "Reject" button on already-rejected items)
+ - Labels, comments, status messages, or documentation that describe behavior the code doesn't implement — e.g., a map named "renamed" that only deletes, an action labeled "migrated" that never creates the target, or UI actions offered for entity states where the transition is invalid (e.g., a "Reject" button on already-rejected items). Also covers doc drift on concrete facts: file paths or extensions (`foo.js` referenced when the file is `foo.jsx`), item counts ("13 widgets" when there are 15), test counts ("5 tests" when the file has 3), timeout values ("30s" when the code uses 90s), request content-type ("urlencoded" when the code sends multipart), "fail-soft" / "never crashes" claims that are belied by the actual error path, documented default entity names ("Default" vs the actual "Everything"), and route/response-shape comments that say `{ activeId }` when the handler returns `{ activeId, items }`. Test names are part of this contract too: a test called "rejects traversal X" that asserts a 200 (because the implementation strips and accepts) is lying about the security contract — rename to match what's actually tested. When reviewing, verify every factual claim in a comment, test name, changelog entry, or doc against the code it references
  - Inline code examples, command templates, and query snippets that aren't syntactically valid as written — template placeholders must use a consistent format, queries must use correct syntax for their language (e.g., single `{}` in GraphQL, not `{{}}`)
  - Cross-references between files (identifiers, parameter names, format conventions, version numbers, operational thresholds) that disagree — when one reference changes, trace all other files that reference the same entity and update them. This includes internal identifiers (route paths, file names, component names) that should be renamed when the concept they represent is renamed — a nav label saying "Rejected" pointing to `/admin/flagged` or a component named `FlaggedList` rendering rejected items creates maintenance confusion. For releases, verify version consistency across all versioned artifacts (package manifests, lockfiles, API specs, changelogs, PR metadata). Also applies to field-set enumerations: when an operation targets a set of entity fields, every predicate, filter expression, scan criteria, API doc, and UI conditional that enumerates those fields must stay in sync — an independently maintained list that omits a field causes silent skips or false positives
  - Template/workflow variables referenced (`{VAR_NAME}`) but never assigned — trace each placeholder to a definition step; undefined variables cause silent failures or confusing instructions. Also check for colliding identifiers (two distinct concepts mapped to the same slug, key, or name)
@@ -195,9 +245,11 @@
  - Sequential numbering (section numbers, step numbers) with gaps or jumps after edits — verify continuity
  - Completion markers, success flags, or status files written before the operation they attest to finishes — consumers see false success if the operation fails after the write
  - Existence checks (directory exists, file exists, module resolves) used as proof of correct/complete installation — a directory can exist but be empty, a file can exist with invalid contents. Verify the specific resource the consumer needs
+ - Readiness/health probes that rely solely on subprocess exit code without inspecting output — many CLIs (`psql`, `curl`, `kubectl`) exit 0 for empty results, missing schema, auth-only handshake, or "command accepted" when the actual condition isn't met. Capture stdout (`execFileSync(...).toString()`) and verify it contains the expected marker (a row count, a status code, a schema-table name). For tools that read user-level config (`.psqlrc`, `~/.curlrc`), pass flags that ignore those files (`-X` for psql, `-q`/`--disable` for curl) so the probe behaves the same in every environment
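The output-inspection half of this can be sketched with an injected runner standing in for an `execFileSync`-style call (the command and expected marker are hypothetical):

```javascript
// Sketch: a readiness probe that inspects output rather than trusting exit
// code alone. `runner` stands in for a subprocess invocation, e.g. one that
// wraps execFileSync('psql', ['-X', '-tAc', 'select 1']).
function probeReady(runner) {
  let stdout;
  try {
    stdout = runner().toString();
  } catch {
    return false; // non-zero exit: definitely not ready
  }
  // Exit code 0 is not enough: require the marker the consumer needs.
  return stdout.trim() === '1';
}
```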
  - Lookups that check only one scope when multiple exist — e.g., checking local git branches but not remote, checking in-memory cache but not persistent store. Trace all locations where the resource could exist and check each
  - Tracking/checkpoint files that default to empty on parse failure — causes full re-execution. Fail loudly instead. More broadly, safety/guard checks that catch errors and default to "safe to proceed" (fail-open) rather than treating errors as "unsafe, abort" (fail-closed) — a guard that silently succeeds on error provides no protection when it's needed most
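The fail-open vs fail-closed contrast, sketched with a hypothetical checkpoint loader:

```javascript
// Sketch of fail-open vs fail-closed guards; names are illustrative.
function canProceedFailOpen(loadCheckpoint) {
  try {
    return loadCheckpoint().complete === true;
  } catch {
    return true; // BUG: parse failure silently approves the operation
  }
}

function canProceedFailClosed(loadCheckpoint) {
  try {
    return loadCheckpoint().complete === true;
  } catch (e) {
    // Fail loudly: a broken checkpoint means "unsafe, abort", not "safe".
    throw new Error(`checkpoint unreadable, refusing to proceed: ${e.message}`);
  }
}
```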
  - Registering references to resources without verifying the resource exists — dangling references after failed operations
+ - Composed instructions, prompts, system messages, or rule sets that vary by mode/role/context — unconditional clauses can contradict mode-specific directives (e.g., "always cite sources inline" combined with a "draft" mode that asks for "no preamble, no commentary"). Build the composition conditionally — include each instruction block only for the modes that actually want it — or define an explicit precedence so contradictions are predictable
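A sketch of conditional composition, with made-up modes and clauses:

```javascript
// Sketch: compose mode-dependent instructions conditionally instead of
// concatenating unconditional clauses that can contradict a mode.
function buildInstructions(mode) {
  const blocks = ['Answer the user accurately.'];
  if (mode !== 'draft') {
    // The citation clause would contradict draft mode's "no commentary"
    // directive, so include it only for modes that want it.
    blocks.push('Always cite sources inline.');
  }
  if (mode === 'draft') {
    blocks.push('No preamble, no commentary.');
  }
  return blocks.join('\n');
}
```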
 
  **Automated pipeline discipline**
  - Internal code review must run on all automated remediation changes BEFORE creating PRs — never go straight from "tests pass" to PR creation
@@ -210,6 +262,7 @@
  - Commit messages or comments claiming a fix while the underlying bug remains — verify each claimed fix actually addresses the root cause, not just the symptom
  - Functions containing placeholder comments (`// TODO`, `// FIXME`, `// implement later`) or stub implementations presented as complete
  - Unnecessary defensive code: error handling for scenarios that provably cannot occur given the call site, fallbacks for internal functions that always return valid data
+ - Cleanup callbacks (useEffect return, finalizer, dispose, signal handler) containing only comments are misleading and signal missing or relocated behavior. Either implement the cleanup or remove the callback entirely; don't ship a cleanup function that does nothing
 
  **Configuration & hardcoding**
  - Hardcoded values when a config field or env var already exists; dead config fields nothing consumes; unused function parameters creating false API contracts; resource names (table names, queue names, bucket names) hardcoded without accounting for environment prefixes — lookups on response objects using the wrong key silently return undefined
@@ -234,8 +287,9 @@
  - Tests that exercise code paths depending on features the integration layer doesn't expose — they pass against mocks but the behavior can't trigger in production. Verify mocked responses match what the real dependency actually returns
  - Test mock state leaking between tests — mock setup APIs that configure return values often persist across tests even after clearing call history, because "clear" resets invocation counts but not configured behavior (use "reset" variants that restore original implementations). Conversely, per-call sequential mock responses couple tests to internal call count — prefer stable return values for behavior tests, sequential mocks only when verifying call order
  - Tests that pass but don't cover the changed code paths — passing unrelated tests is not validation
+ - Response/status assertions written as loose ranges (`status >= 400`, `status < 500`, `ok: false`) — a regression that turns a 400 validation failure into a 500 internal error still passes. Assert the specific expected status (or at least `>= 400 && < 500` for "must be a client error") so the test distinguishes validation from server failure. Same principle for assertions like "contains any error" or "thrown anything" — assert the specific error type/code the code is contracted to produce
 
  **Style & conventions**
  - Naming and patterns consistent with the rest of the codebase
- - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure. For structured files that follow a convention across sibling files (changelogs, config files, migration files), verify new entries use the same section headers, field names, and ordering as existing siblings
+ - Formatting consistency within each file — new content must match existing indentation, bullet style, heading levels, and structure. For structured files that follow a convention across sibling files (changelogs, config files, migration files), verify new entries use the same section headers, field names, and ordering as existing siblings. Within a single structured file, section headers must be unique — two `## Fixed` blocks in the same changelog (or two `[features]` tables in the same TOML) are a merge artifact that splits content downstream tools expect to find under one header. Consolidate duplicates into a single section
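A small illustrative checker for duplicated headers within one file:

```javascript
// Sketch: detect duplicate section headers within one structured file
// (e.g. two "## Fixed" blocks left behind by a merge). Purely illustrative.
function findDuplicateHeaders(markdown) {
  const seen = new Map();
  for (const line of markdown.split('\n')) {
    if (/^#{1,6}\s/.test(line)) {
      const h = line.trim();
      seen.set(h, (seen.get(h) || 0) + 1);
    }
  }
  // Return every header that appears more than once.
  return [...seen.entries()].filter(([, n]) => n > 1).map(([h]) => h);
}
```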
  - Shell/workflow instructions with destructive operations (branch deletion, file removal, force operations) must verify preconditions first — e.g., ensure you're not on a branch being deleted, confirm the target exists, and don't suppress stderr from commands where failures indicate real problems (auth errors, network issues)
@@ -19,16 +19,21 @@ whether to continue or stop — never loop indefinitely without confirmation.
  TIMEOUT SCHEDULE:
  When running parallel PR reviews (do:better), use shorter waits to avoid
  blocking other PRs:
- - Iteration 1: max wait 5 minutes
- - Iteration 2: max wait 4 minutes
- - Iteration 3: max wait 3 minutes
- - Iteration 4: max wait 2 minutes
- - Iteration 5+: max wait 1 minute
+ - Iteration 1: max wait 3 minutes
+ - Iteration 2: max wait 2 minutes
+ - Iteration 3: max wait 90 seconds
+ - Iteration 4: max wait 60 seconds
+ - Iteration 5+: max wait 45 seconds
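The schedule above, expressed as a lookup (an illustrative sketch, values in seconds):

```javascript
// Decreasing timeout schedule for parallel PR reviews, per the list above.
function maxWaitSeconds(iteration) {
  const schedule = [180, 120, 90, 60]; // iterations 1-4
  return iteration <= schedule.length ? schedule[iteration - 1] : 45; // 5+
}
```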
  When running a single-PR review (do:pr, do:release), use dynamic timing:
- check the previous Copilot review duration on this PR and wait up to 2x
- that (minimum 5 minutes, maximum 20 minutes). Copilot reviews can take
- 10-15 minutes for large diffs.
- Poll interval: 30 seconds for all iterations.
+ check the previous Copilot review duration on this PR. If no prior
+ review exists, default to 60 seconds. Set max wait to 3x the expected
+ duration (minimum 90 seconds, maximum 5 minutes); only large diffs
+ (200+ changed lines) should approach the max. Copilot reviews on small
+ diffs typically land in 30-90 seconds; large diffs may take longer.
+ Use progressive poll intervals: 5s, 5s, 10s, 10s, then 15s thereafter —
+ an early first check avoids burning a full minute on a review that's
+ already sitting in the API. For parallel PR reviews (do:better), use
+ the decreasing timeout schedule above with a 15-second poll interval.
 
  Run the following loop until Copilot returns zero new comments:
 
@@ -51,8 +56,10 @@ Run the following loop until Copilot returns zero new comments:
  - For parallel PR reviews (do:better): use the DECREASING TIMEOUT for
  the current iteration number
  - For single-PR reviews (do:pr, do:release): use dynamic timing based on
- the previous Copilot review duration on this PR (2x that, min 5 min,
- max 20 min)
+ the previous Copilot review duration on this PR (3x that, min 90 sec,
+ max 5 min). If no prior review exists, default expected duration to
+ 60 seconds. Use progressive poll intervals (5s, 5s, 10s, 10s, then
+ 15s thereafter)
  - Error detection: if the review body contains "Copilot encountered an
  error" or "unable to review this pull request", re-request (step 1)
  and resume polling. Max 3 error retries before reporting failure.
@@ -43,6 +43,9 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
  - Side effects during React render
  - Interactive UI elements (buttons, inputs, drag-and-drop targets, keyboard shortcuts) that remain enabled while an async operation owns their related state — a second trigger while the first is in-flight produces concurrent state mutations or duplicate operations. All entry points for the same operation must be disabled together while the operation is pending
  - Optimistic UI messages that substitute placeholder text when the actual payload sent to the server differs — the conversation history and server-side record will show different content than what the user saw. Use the same fallback text in both the optimistic render and the outgoing payload, or surface the actual payload text
+ - Optimistic placeholder IDs ('pending', 'temp_*', client-generated UUIDs) echoed back to the server in subsequent requests — server validates against its real ID format and rejects them as 400s. Trace from optimistic insertion → controls bound to the optimistic record (pin, promote, delete, follow-up) → outgoing request payloads. Disable controls until the server returns a real ID, OR omit the field from outgoing payloads when the local value still matches the optimistic shape
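A sketch of the omit-while-optimistic rule, with an invented id shape:

```javascript
// Sketch: strip optimistic placeholder ids from outgoing payloads so the
// server never sees a client-generated temp id. The id shapes are illustrative.
function isOptimisticId(id) {
  return id === 'pending' || String(id).startsWith('temp_');
}

function buildPinRequest(record) {
  const payload = { action: 'pin' };
  // Omit the id while it still matches the optimistic shape; the server
  // would reject 'temp_*' as a 400 against its real id format.
  if (!isOptimisticId(record.id)) payload.id = record.id;
  return payload;
}
```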
+ - Client-side AbortController for an in-flight streaming/long-lived operation must abort when the owning UI context tears down OR navigates AWAY from THAT operation. When cleanup is keyed to a route param, navigation events emitted BY the in-flight stream itself (e.g., redirecting to a permalink after the server returns an id) trigger the cleanup and abort the very operation that caused the navigation. Track the streaming operation's identity in a ref and abort only when navigating away from THAT identity, not on every param change. Mirror by cancelling the stream reader (`reader.cancel()`) in `finally` and ignoring late events whose ID doesn't match the now-current operation
+ - Settings whose persistence model is per-record (per-conversation, per-document, per-project) held only in local component state — refresh resets to the persisted value while the server-side history shows different content. Trace: UI mode/setting state → outgoing payload → persistence schema → reload path. Persist on every mutation OR derive UI from the last-persisted record
 
  ### Error Handling
 
@@ -52,9 +55,15 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
  - Destructive ops in retry/cleanup paths without own error handling
  - External service calls without configurable timeouts
  - Missing fallback for unavailable downstream services
- - SSE or streaming handlers that call `end()`/`close()` on mid-stream errors without emitting an error event — the client observes a clean stream termination and treats partial content as complete. Emit a structured `event: error` block before closing so clients can detect and surface the failure
+ - SSE or streaming handlers that call `end()`/`close()` on mid-stream errors without emitting an error event — the client observes a clean stream termination and treats partial content as complete. Emit a structured `event: error` block before closing so clients can detect and surface the failure. Conversely, named lifecycle events (`error`, `done`, `complete`) must be MUTUALLY EXCLUSIVE — after emitting `error`, do NOT also emit `done`, or include explicit success/error info in the terminal frame. Trace every exit path of the stream generator and the route's loop body to verify only one terminal event fires
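One way to guarantee a single terminal event, sketched with illustrative names:

```javascript
// Sketch: a stream finisher that guarantees exactly one terminal event
// ('error' XOR 'done'), whichever exit path fires first.
function createTerminator(emit) {
  let terminated = false;
  return {
    done(payload) {
      if (terminated) return;
      terminated = true;
      emit('done', payload);
    },
    error(err) {
      if (terminated) return;
      terminated = true;
      emit('error', { message: err.message });
    },
  };
}
```

Routing every exit path of the generator and the route's loop body through one such object makes "emit `error`, then also emit `done`" impossible by construction.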
+ - Streaming server handlers whose abort signal is wired into ONLY the final consumer (e.g., the LLM provider fetch) but not into upstream retrieval / embedding / subprocess work — disconnects don't actually stop the expensive earlier work. Trace the AbortSignal from `req.on('close')` through every async leg of the pipeline and verify each takes a `signal` parameter and propagates it
+ - Raw `fetch()` failures (TypeError "Failed to fetch", DNS errors, ECONNREFUSED) at API client boundaries must be translated to a consistent message matching the project's established transport-error utility. Trace each new API client function against existing siblings (`apiCore.request`, `apiOpenClaw.streamMessage`) and verify the same wrapper is used; preserve `AbortError` so callers distinguish cancellation from failure
  - SSE or event dispatchers that handle named event types but ignore the protocol's default/unnamed event — SSE streams that emit `data:` without `event:` produce type `'message'` (the SSE default), which a handler processing only named types will silently discard. Verify the default event type is either handled or explicitly excluded
  - Route handlers that call a status/health probe before delegating to the main service when the service already handles the "not configured"/"unreachable" case — the pre-probe adds an extra upstream round-trip on every request and can fail even when the intended operation would succeed. Let the service be the authoritative source of truth and map its structured errors to the appropriate HTTP status at the route boundary
+ - Sync-shaped route handler (`POST /generate`, `POST /txt2img`) wrapping a service that is async-by-design (returns a job handle, writes the artifact later) — the handler must subscribe to the completion event BEFORE calling the service (so a fast/cached job can't fire `complete` before the listener attaches) AND wait for the matching job id with a timeout. Common bug: the handler reads `result.filename` from disk immediately after the service returns, gets nothing, and replies with an empty success payload. Trace from route → service → completion-event/file-watcher → response builder; verify the route awaits a real readiness signal, not just the service's job-handle return. If the service is callable both async (jobId-only) and sync (await artifact), expose the sync variant explicitly (`generateAndWait` / `generateSync`) rather than mixing modes
+ - Cross-module feature-flag detection drift — when multiple modules independently determine "is feature X active?" (HTTPS enabled, OAuth scopes, dark mode, a tier-gated capability) using divergent checks, behavior diverges and the user-visible UX contradicts itself. Examples: client UI checks one cert file while the server requires both; client hardcodes an `https://` scheme while the server is running plain HTTP; one helper checks `cert.pem` exists, another checks `cert.pem && key.pem`. Centralize the predicate in a single exported helper (`hasTailscaleCert()`, `isHttpsEnabled()`, `userHasScope(scope)`) and have every caller import it. Flag any module that re-derives the same boolean inline
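A sketch of the centralized predicate (file names and the cert-pair rule are invented for illustration):

```javascript
// Sketch: one exported predicate that both client and server import, instead
// of each module re-deriving "is HTTPS enabled?" with divergent checks.
function isHttpsEnabled(files) {
  // Single source of truth: BOTH cert and key must exist.
  return files.includes('cert.pem') && files.includes('key.pem');
}
```

Any module found re-deriving this boolean inline (e.g. checking only `cert.pem`) is the drift the bullet above describes.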
+ - Cross-module error classification — a low-level wrapper rethrows errors with a different `name`/`code`/`message` shape than the original (e.g., a custom fetch wrapper aborts with `new Error('Request aborted')` while the classifier downstream checks `err.name === 'AbortError'`). The classifier matches nothing and the timeout/cancel branch never fires. Either preserve `name`/`code`/`cause` through the wrapper, OR have the classifier accept the union of shapes the wrapper can emit. Trace each error-classifying call site back to the wrapper(s) that produce its inputs and verify the contract holds
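A sketch of a wrapper that keeps the classifier's contract intact (both functions are illustrative):

```javascript
// Sketch: a fetch-style wrapper that preserves the original error's name and
// cause, so a downstream classifier checking err.name === 'AbortError'
// still matches.
async function wrappedRequest(doFetch) {
  try {
    return await doFetch();
  } catch (err) {
    if (err.name === 'AbortError') throw err; // preserve, don't re-wrap
    const wrapped = new Error(`request failed: ${err.message}`);
    wrapped.cause = err; // keep the original reachable
    throw wrapped;
  }
}

function classify(err) {
  return err.name === 'AbortError' ? 'cancelled' : 'transport-failure';
}
```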
+ - Compatibility-shim end-to-end plumbing — when a route bridges to an external API standard (A1111 SD-API, OpenAI, S3-compatible, etc.) every documented response field must be backed by a real value chain through the provider, intermediate service, and response builder. Common bug: the response shape is correct but a field like `seed`, `progress`, `eta`, `model`, or `usage.tokens` is hardcoded to a default (`0`, `null`, the request input) because nothing in the chain actually returns it. Trace each declared response field from where it's set in the route → the service's return shape → the underlying provider/process output, and confirm the value flows end-to-end; placeholder fields ("we'll plumb it later") break clients that depend on the standard. Same trace applies to "always returns 0 / always undefined / always empty array" patterns in the response — they signal incomplete plumbing
 
  ### Resource Management
 
@@ -74,9 +83,10 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
  - One-time migrations without completion guard — re-execute every startup
  - Data migrations silently changing runtime behavior — preserve execution semantics. Unsupported source values must be flagged, not defaulted
  - Update endpoints with field allowlists not covering new model fields
- - New endpoints not matching validation patterns of existing similar ones. API doc schemas must be structurally complete
+ - New endpoints not matching validation patterns of existing similar ones. The same field (id, name) accepted by multiple endpoints must be validated identically everywhere — path params, query, body, on sibling endpoints (create/update/delete/activate). Skipping param validation on one sibling turns violations into 404/500 instead of 400. `z.string().min(1)` without `.trim()` accepts whitespace-only names. API doc schemas must be structurally complete
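A sketch of one shared validator imported by every sibling handler (rules and limits are invented):

```javascript
// Sketch: one shared validator used by every sibling endpoint so create,
// update, and delete enforce identical rules, including trimming so
// whitespace-only names are rejected.
function validateName(raw) {
  const name = typeof raw === 'string' ? raw.trim() : '';
  if (name.length === 0 || name.length > 64) {
    const err = new Error('name must be 1-64 non-blank characters');
    err.status = 400; // a validation failure, not a 404/500
    throw err;
  }
  return name;
}

// Both siblings import the same validator rather than re-deriving rules.
const createHandler = (body) => ({ created: validateName(body.name) });
const updateHandler = (body) => ({ updated: validateName(body.name) });
```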
  - Client-side input validation limits (max count, file size, string length, combined totals) must be consistent with — and ideally tighter than — server-side enforcement. When the client allows combinations the server rejects (e.g., 8 × 10MB files vs a 50MB JSON body limit), users hit confusing 400/413 errors. Trace all enforcement boundaries (UI, API schema, body parser, downstream service) and verify they form a coherent envelope
  - Sample config files, README examples, and documentation that reference config keys or structure must match what the implementation actually reads. Trace example keys against the config loader — stale examples teach operators to configure values the system ignores (or vice versa)
+ - Subprocess invocations must inherit the same configuration source as the parent — if the parent reads from `.env`/config files but the child only sees `process.env`, exporting those values explicitly via the `env` option is required. Trace from config loader → invocation site → subprocess script. Otherwise a probe uses customized credentials/ports while the underlying setup runs with defaults, creating an "inconsistency loop" where the probe always fails and provisioning re-applies defaults that overwrite user customization
  - Config values whose format can be validated at initialization time (URLs, port numbers, auth schemes) but are only validated at first use — misconfiguration surfaces as a cryptic runtime error deep in the call stack. Validate format and range of security-relevant config values during initialization and surface a specific diagnostic identifying the bad field
  - URL joining utilities that force paths absolute-from-origin (stripping the base URL's pathname) — `baseUrl=http://host/proxy` + `/v1/api` silently produces `http://host/v1/api` instead of `http://host/proxy/v1/api`. Verify URL construction utilities preserve pathname segments from the base URL, or document and enforce that base URLs must be origin-only
  - Summary/aggregation endpoints using different filters/sources than detail views they link to
@@ -84,8 +94,13 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
  - Validation functions introduced for a field: trace ALL write paths. New branches must apply same validation as siblings
  - Stored config merged with shallow spread — nested objects lose new default keys on upgrade. Use deep merge
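A minimal deep-merge sketch for plain-object config (illustrative, not a library implementation):

```javascript
// Sketch: deep-merge stored config over defaults so nested objects keep
// new default keys on upgrade. Plain objects only; arrays are replaced.
function deepMerge(defaults, stored) {
  const out = { ...defaults };
  for (const [key, value] of Object.entries(stored)) {
    const base = defaults[key];
    out[key] =
      value && base && typeof value === 'object' && typeof base === 'object' &&
      !Array.isArray(value) && !Array.isArray(base)
        ? deepMerge(base, value) // recurse into nested objects
        : value;                 // stored scalar/array wins
  }
  return out;
}
```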
  - Schema fields accepting values downstream can't handle. Validated params never consumed (dead API surface). `.partial()` on nested schemas: verify nested objects also partial. `.partial()` with `.default()` silently overwrites persisted values on update
+ - Generator/validator structural invariant — when a generator produces values with structural guarantees (sortability via fixed-width prefix, embedded checksum, encoded version), the validator regex (and any client-side mirror) must enforce the SAME shape. Broader regexes accept inputs the generator never emits, breaking invariants the rest of the system relies on (e.g., lexical sort == chronological sort breaks once a base36 timestamp grows by a digit). Trace generator → server validator → client mirror as a closed loop. Test fixtures should use IDs/payloads that match generator output, not contrived literals
+ - Schemas accepting paired range fields (`startDate`/`endDate`, `min`/`max`, `from`/`to`) without a cross-field refinement (`zod .refine()`) — accepts inconsistent ranges (start > end). Trace the schema definition, route validation, and downstream consumer to confirm the range relationship is enforced somewhere (preferably at the schema)
+ - Required-at-use-time config values (model name, API key, endpoint URL, default selection) that may be null/undefined in the source data must be validated at the boundary before invoking the downstream API. Trace from config source → loading layer → use site, and verify nullable fields are guarded with a clear, actionable error before the downstream call. Otherwise the downstream API responds with an opaque error far from the user's intent
  - Multi-part UI gated on different prop subsets — derive single enablement boolean
  - Entity creation without case-insensitive uniqueness
+ - Arrays of IDs (widget ids, tag ids, member ids) persisted, returned by API, or rendered with `key={x}` without container-level dedup — element-level validation (type, length) isn't enough. Enforce uniqueness via schema refinement (`zod.refine(arr => new Set(arr).size === arr.length)`), dedupe on ingest, AND dedupe during read-path sanitization so hand-edited / legacy data can't reintroduce collisions. Apply the same first-wins dedup to arrays of records keyed by id at the container level
+ - Data loaded from files or persistent stores sanitized less strictly than the API accepts on write — hand-edited, migrated, or corrupted persisted state can introduce values (oversized names, non-kebab ids, duplicate entries) the API rejects on mutate, producing oversized responses, unreachable records, or invariant violations. Apply the same length caps, regex, uniqueness, and type guards in read-path sanitization as in request validation
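A sketch applying the same first-wins dedup and caps on the read path (field names and limits are invented):

```javascript
// Sketch: read-path sanitization enforcing the same rules as write-path
// validation, so hand-edited persisted data can't reintroduce duplicates
// or oversized values.
function sanitizeWidgets(raw) {
  const seen = new Set();
  const out = [];
  for (const w of Array.isArray(raw) ? raw : []) {
    if (!w || typeof w.id !== 'string' || seen.has(w.id)) continue; // first wins
    seen.add(w.id);
    out.push({ id: w.id, name: String(w.name ?? '').slice(0, 64) }); // same cap as write path
  }
  return out;
}
```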
  - Code reading response properties that don't exist — verify field names, nesting, actual response shape. Wrappers that don't request/forward needed fields. Call sites using wrong function variant for input format or wrong positional argument order
  - Data model fields with different names per write path. Entity identity keys inconsistent across lookup paths
  - Entity type changes without revalidating type-specific invariants and clearing old-type fields
@@ -115,6 +130,11 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
  - New external service calls must use established mock/test infrastructure
  - New UI consumers against existing APIs: verify every field name, nesting, identifier, response envelope matches actual producer response
  - Discovery/catalog endpoints: trace enumerated set against consumer's supported inputs
+ - Cross-module constants kept in sync by comment ("must stay in sync with X", duplicated regex, duplicated event name, duplicated size limit) — the comment is not enforcement and drift is a silent failure. Event names, regex patterns, numeric limits, path segments, and feature-flag keys shared across modules (client↔server, route↔service, component↔component, producer↔consumer) must be a single exported constant imported by both. Flag any instance where a comment notes "keep in sync" without the actual shared module
134
+ - New global APIs (`AbortSignal.any`, `Promise.withResolvers`, `structuredClone`) used directly when the codebase already has a fallback utility for the same API — search for an existing wrapper (`fetchWithTimeout`, `withSignal`, polyfill helpers) before adding a new direct call. Drift between the safe path and a new direct call reintroduces the runtime error the fallback was created to avoid
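For example, a hedged sketch of a `Promise.withResolvers` fallback wrapper (the wrapper name is hypothetical); once such a utility exists, new code should call it instead of the global directly:

```typescript
// Hypothetical wrapper: prefer the native API when present, fall back otherwise,
// so every call site goes through one safe path.
function withResolvers<T>(): {
  promise: Promise<T>;
  resolve: (v: T) => void;
  reject: (e: unknown) => void;
} {
  // Native implementation exists on newer runtimes (e.g. Node >= 22).
  if (typeof (Promise as any).withResolvers === "function") {
    return (Promise as any).withResolvers();
  }
  // Manual fallback for older runtimes.
  let resolve!: (v: T) => void;
  let reject!: (e: unknown) => void;
  const promise = new Promise<T>((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}
```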
135
+ - Pure persistence/utility modules importing from orchestration/service modules just to access a constant pull in the entire downstream import graph as a transitive dependency. Trace each `import` in storage / utility files; if the imported symbol is a constant (enum, regex, size cap, valid-mode set), suggest moving it to a small dedicated shared module
136
+ - Actions triggered from one surface (command palette, global menu, external event) that mutate data another already-mounted page/component fetched on mount — re-navigating to the same route doesn't remount (routers no-op it), so the visible state stays stale while the server updates. Propagate change via shared store, a pub/sub event whose name is a shared constant, focus/visibility refetch, or key-based remount — and verify the mounted page actually subscribes on its side
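A minimal sketch using Node's `EventEmitter` as the shared bus (names are hypothetical; in a browser app this could be a store subscription or `CustomEvent` instead). The event name is a single constant imported by both sides, and the receiver returns an unsubscribe function for unmount:

```typescript
import { EventEmitter } from "node:events";

// Hypothetical shared constant -- imported by BOTH the command-palette action
// and the page component, so the event name cannot drift.
const WIDGETS_CHANGED = "widgets:changed";

const bus = new EventEmitter();

// Sender side (command palette / global menu) after a successful mutation:
function notifyWidgetsChanged(): void {
  bus.emit(WIDGETS_CHANGED);
}

// Receiver side (already-mounted page) -- must actually subscribe:
function subscribeToWidgets(refetch: () => void): () => void {
  bus.on(WIDGETS_CHANGED, refetch);
  return () => bus.off(WIDGETS_CHANGED, refetch); // unsubscribe on unmount
}
```

In review, verify both halves exist: the sender emitting after the mutation AND the mounted page calling the subscribe function.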
137
+ - Modules that own a persistence schema (write to disk/DB with a known shape) should validate at the persistence boundary, not assume the API/route layer will catch everything. Trace from route validator → service call → persistence write — verify enum/range/required checks exist at the storage layer for fields the schema cares about. Direct callers (internal scripts, tests, programmatic batch jobs) bypass route validation otherwise
118
138
 
119
139
  **Cleanup/teardown side effects**
120
140
  - Cleanup functions with implicit mutations (auto-merge, auto-commit, cascade writes) — verify abort on prerequisite failure
@@ -168,6 +188,10 @@ Apply the checklist as a prompt for attention, not an exhaustive specification.
168
188
 
169
189
  **Batch/paginated consumption**
170
190
  - Batch API callers handle partial results, continuation tokens, rate limits with backoff. Resource names account for environment prefixes
191
+ - Periodic maintenance (cleanup, expiry, dedup) bolted onto a paginated read path runs only for items returned in that page — entries beyond the boundary are never processed. Trace from list endpoint → maintenance/sweep code → the iteration that bounds it. Move maintenance to a background sweep, run a separate unbounded pass, OR use cheap metadata (mtime, size) for the maintenance pass while only doing expensive reads for the page actually returned. Maintenance gates that depend on parsed metadata fields will skip records where parsing returns a sentinel (0, null, "") — those records are never cleaned up
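A sketch of the separate unbounded pass, assuming a flat directory of files and using only cheap `stat` metadata (`mtime`) so no per-record parse or pagination is involved; names are hypothetical:

```typescript
import { readdir, stat, unlink } from "node:fs/promises";
import { join } from "node:path";

// Hypothetical background sweep: walks the WHOLE directory, independent of
// whatever page the list endpoint returned, gated only on cheap metadata.
async function sweepExpired(
  dir: string,
  maxAgeMs: number,
  now: number = Date.now(),
): Promise<string[]> {
  const removed: string[] = [];
  for (const name of await readdir(dir)) {
    const full = join(dir, name);
    const info = await stat(full);
    if (!info.isFile()) continue;
    if (now - info.mtimeMs > maxAgeMs) {
      await unlink(full); // mtime gate only -- no expensive content parse
      removed.push(name);
    }
  }
  return removed;
}
```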
192
+
193
+ **Deep-link URL contract (sender ↔ receiver)**
194
+ - A URL with query parameters (`?id=...`, `?date=...`) or path segments is a contract: the receiving page/route MUST consume those parameters and use them to scroll/select/filter. Trace each new deep-link href to the destination route handler / page component and verify it reads and acts on every parameter the sender includes. If the receiver doesn't yet support the parameter, either drop it (and adjust docs/changelog claims) or wire it through end-to-end
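On the receiving side, the contract reduces to: parse every parameter the sender includes and act on it. A minimal hypothetical parser for a `?id=...&date=...` deep link:

```typescript
// Hypothetical receiver-side helper: extract every parameter the sender's
// deep link carries so the page can scroll/select/filter on them.
function parseDeepLink(url: string): { id: string | null; date: string | null } {
  // Base origin only satisfies the URL constructor for relative hrefs.
  const params = new URL(url, "http://localhost").searchParams;
  return { id: params.get("id"), date: params.get("date") };
}
```

In review, confirm the destination component actually uses both returned values; a parameter parsed but never read is still a broken contract.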
171
195
 
172
196
  **Data model vs access pattern**
173
197
  - Claims of ordering ("recent", "top") verified against key/index design — random UUIDs require full scans
@@ -24,8 +24,10 @@ For each changed file:
24
24
 
25
25
  ### Trust Boundaries & Data Exposure
26
26
 
27
- - API responses returning full objects with sensitive fields — destructure and omit across ALL paths (GET, PUT, POST, error, socket). Comments claiming data isn't exposed while the code does expose it
27
+ - API responses returning full objects with sensitive fields — destructure and omit across ALL paths (GET, PUT, POST, error, socket). Comments claiming data isn't exposed while the code does expose it. This includes server-internal absolute filesystem paths (`/Users/.../data/loras/foo.safetensors`, `C:\app\data\models\bar`) returned in catalog/list endpoints — they leak server layout, OS, and install locations to any UI user and couple the client to filesystem structure. Return basenames or relative identifiers (`/data/loras/<filename>`) and resolve/validate server-side at consumption time
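A hedged sketch of the destructure-and-omit pattern combined with basename-only exposure (the `LoraRecord` shape and field names are hypothetical):

```typescript
import { basename } from "node:path";

// Hypothetical server-side record: absolute path and token must never leave
// the server; every response path goes through this one serializer.
interface LoraRecord {
  id: string;
  absolutePath: string; // e.g. /srv/data/loras/foo.safetensors
  apiToken?: string;    // sensitive -- never serialized
}

function toPublic(record: LoraRecord): { id: string; filename: string } {
  const { apiToken, absolutePath, ...safe } = record; // destructure-and-omit
  return { ...safe, filename: basename(absolutePath) };
}
```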
28
28
  - Server trusting client-provided computed/derived values (scores, totals, correctness flags, file metadata like MIME type and size) — strip and recompute server-side. Validate uploads via magic bytes and buffer length, not headers
29
+ - Server trusting persisted-state flags (builtIn, protected, role, owner, immutable) read from flat-file/JSON/DB records to make authorization or deletion decisions — hand-editing the file or tampered sync can flip the flag and bypass protection. Derive authority on every read from a trusted source: a code-level constant set of built-in ids, session identity, or a server-side role lookup. The persisted representation can cache the flag for display, but must not be the source of truth for security decisions
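A minimal sketch: authority comes from a code-level constant set, and the persisted flag is ignored for the security decision (ids and field names are hypothetical):

```typescript
// Code-level source of truth -- cannot be flipped by editing the JSON file.
const BUILT_IN_IDS: ReadonlySet<string> = new Set(["default", "system"]);

// The persisted `builtIn` flag may be cached for display, but the deletion
// decision deliberately ignores it.
function canDelete(record: { id: string; builtIn?: boolean }): boolean {
  return !BUILT_IN_IDS.has(record.id);
}
```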
30
+ - Persisted-state filename/path fields (history JSON entries, settings.json paths, manifest entries) used as filesystem operands (`path.join(BASE, item.filename)` for `unlink`, `readFile`, `spawn` arg lists, ffmpeg/imagemagick concat manifests) without basename + path-resolve-prefix-check validation — corrupted, hand-edited, or tampered persisted state can include `../` segments that escape the intended directory and read/write/delete arbitrary files. Use a `safeUnder(base, candidate)` helper at every consumption site (delete, stitch, last-frame extract, batch ops, thumbnail). For paths that further pass into exec arg strings or manifest files (e.g., ffmpeg concat-demuxer `file '...'` lines), basename validation is necessary but not sufficient — the consumer's parser has its own escaping rules: single quotes / newlines break ffmpeg manifests, backslashes on Windows are interpreted as escape characters in quoted strings, shell metacharacters break shell-quoted args. Either reject filenames containing parser-special characters at validation time, or apply consumer-specific escaping (forward-slash normalization, quote escape, etc.) before writing the manifest/argv
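A sketch of the `safeUnder` helper the checklist names, assuming the valid inputs are plain basenames under a fixed base directory:

```typescript
import { basename, resolve, sep } from "node:path";

// Reject anything that is not a plain basename, then verify the resolved
// path still sits under the base directory. Throwing keeps unsafe values
// out of every consumption site (unlink, readFile, spawn args, manifests).
function safeUnder(base: string, candidate: string): string {
  if (candidate !== basename(candidate)) {
    throw new Error(`path segment rejected: ${candidate}`);
  }
  const full = resolve(base, candidate);
  const root = resolve(base) + sep;
  if (!full.startsWith(root)) {
    throw new Error(`escapes base directory: ${candidate}`);
  }
  return full;
}
```

Note this guards traversal only; per the checklist, filenames that later flow into ffmpeg manifests or shell-quoted argv still need consumer-specific character rejection or escaping on top.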
29
31
  - New endpoints under restricted paths (admin, internal) missing authorization — compare with sibling endpoints for same access gate (role check, scope validation). New OAuth scopes must be checked comprehensively — a check testing only one scope misses newly added scopes
30
32
  - User-controlled objects merged via `Object.assign`/spread without sanitizing keys — `__proto__`, `constructor`, `prototype` enable prototype pollution. Use `Object.create(null)`, whitelist keys, use `hasOwnProperty` not `in`
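A minimal sketch of the safe-merge pattern: null-prototype target, key whitelist, and `hasOwnProperty` instead of `in`:

```typescript
// Keys that enable prototype pollution are refused outright.
const FORBIDDEN = new Set(["__proto__", "constructor", "prototype"]);

// Hypothetical safe-merge: copy only whitelisted, own properties of
// user-controlled input into a null-prototype object.
function safeMerge(
  allowed: string[],
  input: Record<string, unknown>,
): Record<string, unknown> {
  const out = Object.create(null) as Record<string, unknown>;
  for (const key of allowed) {
    if (FORBIDDEN.has(key)) continue;
    // `hasOwnProperty`, not `in` -- `in` also matches inherited keys.
    if (Object.prototype.hasOwnProperty.call(input, key)) {
      out[key] = input[key];
    }
  }
  return out;
}
```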
31
33
  - Push events (WebSocket, SSE, pub/sub) emitted without scoping to originating user/session — sensitive payloads leak to all connected clients. Scope via room/channel isolation or server-side correlation ID
@@ -35,6 +37,7 @@ For each changed file:
35
37
  - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
36
38
  - Endpoints accepting unbounded arrays without upper limits — enforce max size. Validate element types/format, deduplicate to prevent inflated counts/repeated side effects. Internal operations fanning out unbounded parallel I/O risk EMFILE — use concurrency limiters
37
39
  - Security/sanitization functions handling only one input format when data arrives in multiple formats (JSON, shell env, URL-encoded, headers) — sensitive data leaks through unhandled format
40
+ - Allowlists gating user-provided identifiers must use the consumer's identifier namespace, not a sibling namespace. Common bug: an allowlist of import-module names (`cv2`, `PIL`) used to gate `pip install <name>` — pip's identifier space is package specs (`opencv-python`, `pillow`), so the allowlist permits installs of typosquatted/unintended packages. Same risk for: command names vs aliases, OAuth scope strings vs role names, file extensions vs MIME types, language identifiers vs runtime identifiers. Build the allowlist from the consumer's actual valid-input set (`REQUIRED_PACKAGES.map(pipNameFor)`), NOT from a related-but-different list, and include a unit test that asserts every allowlist entry is a valid input to the consumer
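A sketch of building the allowlist in the consumer's namespace (the module-to-pip mapping values here are illustrative):

```typescript
// Mapping from import-module names to pip package specs. The allowlist is
// derived in pip's namespace -- the consumer's actual valid-input set.
const PIP_NAME_FOR: Record<string, string> = {
  cv2: "opencv-python",
  PIL: "pillow",
  yaml: "pyyaml",
};

const REQUIRED_MODULES = ["cv2", "PIL"] as const;
const INSTALL_ALLOWLIST: ReadonlySet<string> = new Set(
  REQUIRED_MODULES.map((m) => PIP_NAME_FOR[m]),
);

function mayInstall(pipSpec: string): boolean {
  return INSTALL_ALLOWLIST.has(pipSpec);
}
```

The unit test the checklist asks for falls out naturally: assert every allowlist entry is a valid pip spec, and that no import-module name slipped in.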
38
41
 
39
42
  ### Hand-rolled Validators
40
43
 
@@ -56,7 +59,9 @@ For each changed file:
56
59
  **Sanitization/validation coverage**
57
60
  - If a new validation function is introduced for a field: trace ALL write paths (create, update, import, sync, bulk) — partial application means invalid data re-enters through unguarded paths
58
61
  - If a "raw" or bypass write path is added: compare normalization against what the read/parse path assumes — data through raw path must be valid on reload
62
+ - Read-path sanitization of persisted data must enforce the SAME bounds as the API schema (length caps, uniqueness, regex, per-item type guards) — hand-edited or migrated data can otherwise introduce values the API rejects on mutate, producing oversized responses, unreachable records (client renders but API rejects), or invariant violations. Drop or truncate out-of-range values rather than passing them through
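A hypothetical read-path sanitizer that mirrors the API schema's bounds (the cap, regex, and record shape are assumptions); out-of-range values are dropped or truncated rather than passed through:

```typescript
// Same bounds the API schema enforces on write -- ideally imported from a
// shared constants module rather than redeclared here.
const MAX_NAME = 64;
const KEBAB_RE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function sanitizeOnRead(raw: unknown[]): { id: string; name: string }[] {
  const seen = new Set<string>();
  const out: { id: string; name: string }[] = [];
  for (const item of raw) {
    if (typeof item !== "object" || item === null) continue;
    const { id, name } = item as { id?: unknown; name?: unknown };
    if (typeof id !== "string" || !KEBAB_RE.test(id)) continue; // drop non-kebab ids
    if (typeof name !== "string") continue;                     // per-item type guard
    if (seen.has(id)) continue;                                 // first-wins dedup
    seen.add(id);
    out.push({ id, name: name.slice(0, MAX_NAME) });            // truncate oversize
  }
  return out;
}
```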
59
63
  - If a new dispatch branch is added within a multi-type handler: verify equivalent validation as sibling branches
64
+ - Modules that own a persistence schema (write to disk/DB with a known shape) must validate at the persistence boundary — not only at the API/route layer. Direct callers (internal scripts, tests, programmatic batch jobs, future endpoints) bypass route validation and corrupt on-disk state. At minimum, reject invalid enum values, missing required fields, and out-of-range values before writing — so the storage layer enforces its own contract independent of who calls it
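A minimal sketch of a storage-boundary guard run before every write, regardless of caller (field names and ranges are hypothetical):

```typescript
// The persistence module enforces its own schema, so internal scripts,
// tests, and batch jobs that bypass route validation still cannot corrupt
// on-disk state.
type Status = "active" | "archived";
const VALID_STATUS: ReadonlySet<string> = new Set(["active", "archived"]);

interface StoredTask { id: string; status: Status; priority: number }

function assertWritable(task: StoredTask): void {
  if (!task.id) throw new Error("id is required");
  if (!VALID_STATUS.has(task.status)) {
    throw new Error(`invalid status: ${task.status}`);
  }
  if (!Number.isInteger(task.priority) || task.priority < 0 || task.priority > 5) {
    throw new Error(`priority out of range: ${task.priority}`);
  }
  // ...only after these checks pass does the record get serialized to disk/DB
}
```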
60
65
 
61
66
  **Security-sensitive configuration parsing**
62
67
  - Env vars/config affecting security (proxy trust, rate limits, CORS, token expiry): verify type and range enforcement. `Number()` accepts floats, negatives, empty-string-as-zero — use `parseInt` + `Number.isInteger` + range checks with logged safe defaults
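A sketch of the recommended strict parse (function name and defaults are hypothetical): `parseInt` plus `Number.isInteger` plus an explicit range check, with a logged safe default on any failure:

```typescript
// Strict integer env parsing: rejects floats, negatives outside range,
// empty strings, and partial parses like "0x10" or "2.5".
function readIntEnv(
  name: string,
  fallback: number,
  min: number,
  max: number,
  env: Record<string, string | undefined> = process.env,
): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback; // unset -> safe default
  const value = parseInt(raw, 10);
  // Round-trip check catches partial parses (parseInt("2.5") === 2).
  if (!Number.isInteger(value) || String(value) !== raw.trim() || value < min || value > max) {
    console.warn(`${name}=${JSON.stringify(raw)} invalid; using safe default ${fallback}`);
    return fallback;
  }
  return value;
}
```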