slash-do 2.2.0 → 2.4.0

@@ -306,6 +306,7 @@ Skip step 4 if steps 1-3 reveal the code is correct.
  - Touch-specific interactions without pointer alternatives
  - Fixed sizes that don't adapt to Mac window resizing
  - Missing `Settings` scene for macOS apps
+ - **macOS window lifecycle (App Store Guideline 4):** Missing `NSApplicationDelegate` with `applicationShouldTerminateAfterLastWindowClosed` returning `false` (app quits on window close instead of staying in Dock). Missing `applicationShouldHandleReopen(_:hasVisibleWindows:)` (Dock click does nothing when window is closed). `WindowGroup` without stable `id:` parameter prevents programmatic reopening via `openWindow(id:)`. Missing "Show Main Window" menu command (Cmd+0) in Window menu. Missing `reopenWindow` closure bridge between AppDelegate and SwiftUI `openWindow`. Menu bar commands that don't ensure main window is visible before acting = **[HIGH]**
  - watchOS complications not updated, widget timelines not refreshed
  - visionOS: missing `.windowStyle(.volumetric)` or `.immersionStyle()` where appropriate
 
@@ -15,253 +15,75 @@ If there are no changes, inform the user and stop.
 
  ## Apply Project Conventions
 
- CLAUDE.md is already loaded into your context. Use its rules (code style, error handling, logging, security model, scope exclusions) as overrides to generic best practices throughout this review. For example, if CLAUDE.md says "no auth needed — internal tool", do not flag missing authentication.
-
- <review_instructions>
+ CLAUDE.md is already loaded into your context. Use its rules (code style, error handling, logging, security model, scope exclusions) as overrides to generic best practices throughout this review. Pass relevant convention overrides to each agent so they don't flag things the project intentionally allows (e.g., "no auth needed — internal tool").
 
  ## PR-Level Coherence Check
 
- Before reviewing individual files, understand what this change set claims to do:
+ Before dispatching agents, understand what this change set claims to do:
 
  1. Read commit messages (`git log {base}...HEAD --oneline`)
- 2. After reviewing all files, verify: does the changed code actually deliver what the commits claim? Flag any claims not backed by code (e.g., "adds rate limiting" but only adds a comment).
-
- ## Large PR Strategy
-
- If the diff touches more than 15 files, split the review into batches:
- 1. Group files by module/directory
- 2. Review each batch, printing findings as you go
- 3. Delegate files beyond the first 15 to a subagent if context is getting full
-
- ## Deep File Review
-
- For **each changed file** in the diff, read the **entire file** (not just diff hunks). Reviewing only the diff misses context bugs where new code interacts incorrectly with existing code.
-
- ### Understand the Code Flow
-
- Before checking individual files against the checklist, **map the flow of changed code across all files**. This means:
-
- 1. **Trace call chains** — for each new or modified function/method, identify every caller and callee across the changed files. Read those files too if needed. You cannot evaluate whether code is duplicated or well-structured without knowing how it connects.
- 2. **Identify shared data paths** — trace data from entry point (route handler, event listener, CLI arg) through transforms, storage, and output. Understand what each layer is responsible for.
- 3. **Map responsibilities** — for each changed module/file, state its single responsibility in one sentence. If you can't, it may be doing too much.
-
- ### Evaluate Software Engineering Principles
-
- With the flow understood, evaluate the changed code against these principles:
-
- **DRY (Don't Repeat Yourself)**
- - Look for logic duplicated across changed files or between changed and existing code. Grep for similar function signatures, repeated conditional blocks, or copy-pasted patterns with minor variations.
- - If two functions do nearly the same thing with small differences, they should likely share a common implementation with the differences parameterized.
- - Duplicated validation, error formatting, or data transformation are common violations.
-
- **YAGNI (You Ain't Gonna Need It)**
- - Flag abstractions, config options, parameters, or extension points that serve no current use case. Code should solve the problem at hand, not hypothetical future problems.
- - Unnecessary wrapper functions, premature generalization (e.g., a factory that produces one type), and unused feature flags are common violations.
-
- **SOLID Principles**
- - **Single Responsibility** — each module/function should have one reason to change. If a function handles both business logic and I/O formatting, flag it.
- - **Open/Closed** — new behavior should be addable without modifying existing working code where practical (e.g., strategy patterns, plugin hooks).
- - **Liskov Substitution** — if subclasses or interface implementations exist, verify they are fully substitutable without breaking callers.
- - **Interface Segregation** — callers should not depend on methods they don't use. Large config objects or option bags passed through many layers are a smell.
- - **Dependency Inversion** — high-level modules should not import low-level implementation details directly when an abstraction boundary would be cleaner.
-
- **Separation of Concerns**
- - Business logic should not be tangled with transport (HTTP, WebSocket), storage (SQL, file I/O), or presentation (HTML, JSON formatting).
- - If a route handler contains business rules beyond simple delegation, flag it.
-
- **Naming & Readability**
- - Function and variable names should communicate intent. If you need to read the implementation to understand what a name means, it's poorly named.
- - Boolean variables/params should read as predicates (`isReady`, `hasAccess`), not ambiguous nouns.
-
- For this review, only flag principle and design violations that are **concrete and actionable** in the code changed by this PR. However, if you discover a clear, real bug or correctness issue — even in code not directly modified here — call it out and help ensure it gets fixed (in this PR or a follow-up). Never dismiss serious problems as "out of scope" or "not modified in this PR."
-
- </review_instructions>
-
- <checklist>
-
- ### Per-File Checklist
-
- Check every file against this checklist. The checklist is organized into tiers — always check Tiers 1 and 4, and check Tiers 2-3 only when the relevance filter matches the file:
-
- !`cat ~/.claude/lib/code-review-checklist.md`
-
- </checklist>
-
- <deep_checks>
-
- ### Additional deep checks (read surrounding code to verify):
-
- **Cross-file consistency**
- - If a new function/endpoint follows a pattern from an existing similar one, verify ALL aspects match (validation, error codes, response shape, cleanup). Partial copying is the #1 source of review feedback.
- - New API client functions should use the same encoding/escaping as existing ones (e.g., if other endpoints use `encodeURIComponent`, new ones must too)
- - If the PR adds a new endpoint, trace where existing endpoints are registered and verify the new one is wired in all runtime adapters (serverless handler map, framework route file, API gateway config, local dev server) — a route registered in one adapter but missing from another will silently 404 in the missing runtime
- - If the PR adds a new call to an external service that has established mock/test infrastructure (mock mode flags, test helpers, dev stubs), verify the new call uses the same patterns — bypassing them makes the new code path untestable in offline/dev environments and inconsistent with existing integrations
- - If the PR adds a new UI component or client-side consumer against an existing API endpoint, read the actual endpoint handler or response shape — verify every field name, nesting level, identifier property, and response envelope path used in the consumer matches what the producer returns. This is the #1 source of "renders empty" bugs in new views built against existing APIs
- - If the PR adds or modifies a discovery/catalog endpoint that enumerates available capabilities (actions, node types, valid options) for a downstream consumer API, trace the full enumerated set against the consumer's actual supported inputs: verify every advertised item can be consumed without error, every consumer-supported item is discoverable, and any identifier transformations (naming conventions, case conversions, key format changes) between discovery output and consumer input preserve the format the consumer expects — mismatches produce runtime errors that no amount of unit testing will catch because the two sides are tested independently
-
- **Push/real-time event scoping**
- - If the PR adds or modifies WebSocket, SSE, or pub/sub event emission, trace the event scope: does the event reach only the originating session/user, or is it broadcast to all connected clients? Check payloads for sensitive content (user inputs, images, tokens) that should not leak across sessions. If the consumer filters by a correlation ID, verify the producer includes one and that the ID is generated server-side or validated against the session
-
- **Cleanup/teardown side effect audit**
- - If the PR adds cleanup, teardown, or garbage-collection functions, trace whether the cleanup performs implicit state mutations (auto-merge into main, auto-commit of unreviewed changes, cascade writes to shared state). Verify the cleanup aborts safely if a prerequisite step fails (e.g., saving dirty state before deletion) rather than proceeding with data loss
+ 2. Note the claims; after agents return, verify whether the code actually delivers them.
 
- **Specification/standard conformance**
- - If the PR implements or extends a parser for a well-known format (cron expressions, date formats, URLs, semver, MIME types), verify boundary handling matches the specification — especially field-specific ranges (month starts at 1, not 0), normalization conventions (cron DOW 0 and 7 both mean Sunday), and step/range semantics that differ per field type
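The normalization convention named in this removed check can be illustrated with a minimal TypeScript sketch. `normalizeCronDayOfWeek` is a hypothetical helper, not part of slash-do, and it assumes numeric day-of-week values only (no names such as `SUN`).

```typescript
// Hypothetical helper: normalize a numeric cron day-of-week so 0 and 7 both mean Sunday.
function normalizeCronDayOfWeek(value: number): number {
  // Reject anything outside the 0..7 range instead of silently wrapping it.
  if (!Number.isInteger(value) || value < 0 || value > 7) {
    throw new RangeError(`day-of-week out of range: ${value}`);
  }
  return value % 7; // 7 becomes 0; 0..6 are unchanged
}
```

A parser that compares raw values without a step like this would treat `7` as a day that never occurs.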
+ ## Dispatch Review Agents
 
- **Temporal context consistency**
- - If the PR adds timezone-aware logic alongside existing non-timezone-aware comparisons in the same code flow (e.g., a weekday gate using UTC while cron matching uses user timezone), check that all temporal comparisons in the flow use the same timezone context — mixed contexts cause operations to trigger on the wrong local day/hour
+ Read the three agent instruction files, then spawn **all three in parallel** using the Agent tool. Each agent reviews ALL changed files independently.
 
- **Status/health endpoint freshness**
- - If the PR adds or modifies a status or health-check endpoint, trace whether it returns live probe results or cached data. Cached health checks mask real-time failures — a cache keyed by URL that survives URL reconfiguration reports stale status. Verify health endpoints bypass caches or use sufficiently short TTLs
+ <surface_scan_agent>
 
- **Boolean/type fidelity through serialization boundaries**
- - If the PR persists boolean flags to text-based storage (markdown metadata, flat files, query strings, form data), trace the round-trip: write path → storage format → read/parse path → consumption site. Boolean `false` serialized as the string `"false"` is truthy in JavaScript — verify all consumption sites use strict equality or a dedicated coercion function, and that the same coercion is applied consistently
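The round-trip problem described in this removed check can be shown with a short TypeScript sketch. `parseStoredBoolean` is a hypothetical name for the kind of dedicated coercion function the check asks for; the exact accepted spellings are an assumption.

```typescript
// Hypothetical helper: parse a boolean that was round-tripped through text storage.
function parseStoredBoolean(raw: unknown): boolean {
  if (typeof raw === "boolean") return raw;
  // Only the literal text "true" (any case, surrounding whitespace ignored) counts as true.
  if (typeof raw === "string") return raw.trim().toLowerCase() === "true";
  return false;
}

const naive = Boolean("false"); // true, because any non-empty string is truthy
const strict = parseStoredBoolean("false"); // false
```

Using one such function at every consumption site is what keeps the coercion consistent.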
+ ### 1. Surface Scan Agent
 
- **Cross-layer invariant enforcement**
- - If the PR introduces or modifies an invariant relationship between configuration flags (e.g., "flag A implies flag B"), trace enforcement through every layer: UI toggle handlers, form submission payloads, API validation schemas, server default-application functions, and persistence round-trip. If any layer allows the invariant to be violated, cascading defaults produce contradictory state
+ Catches per-file bugs: runtime crashes, hygiene, domain-specific issues, quality, and convention violations.
 
- **Error path completeness**
- - Trace each error path end-to-end: does the error reach the user with a helpful message and correct HTTP status? Or does it get swallowed, logged silently, or surface as a generic 500?
- - For multi-step operations (sync to N repos, batch updates): are per-item failures tracked separately from overall success? Does the status reflect partial failure accurately?
+ !`cat ~/.claude/lib/review-surface-scan.md`
 
- **Concurrency under user interaction**
- - If a component performs optimistic updates with async operations, simulate what happens when the user triggers a second action while the first is in-flight — trace whether rollback/success handlers can clobber concurrent state changes or close over stale snapshots
+ </surface_scan_agent>
 
- **State ownership across component boundaries**
- - If a child component maintains local state derived from a parent's data (e.g., optimistic UI copies), trace the ownership boundary: does the child propagate changes back to the parent? What happens on unmount/remount — does the parent's stale cache resurface?
+ <security_agent>
 
- **Data flow audit**
- - For sensitive data (secrets, tokens): trace the value from input → storage → retrieval → response. Verify it is never leaked in ANY response path (GET, PUT, POST, error responses, socket events)
- - For user input → URL/command interpolation: verify encoding/escaping at every boundary
+ ### 2. Security Audit Agent
 
- **Access scope changes**
- - If the PR widens access to an endpoint or resource (admin→public, internal→external), trace all shared dependencies the endpoint uses (rate limiters, queues, connection pools, external service quotas) and assess whether they were sized for the previous access level — in-memory/process-local limiters don't enforce limits across horizontally scaled instances
- - If the PR adds endpoints under a restricted route group (admin, internal, scoped), read sibling endpoints in the same route group and verify the new endpoint applies the same authorization gate — missing gates on admin-mounted endpoints are consistently the most dangerous review finding
+ Catches trust boundary violations, injection, SSRF, data exposure, and access control gaps.
 
- **Guard-before-cache ordering**
- - If a handler performs a pre-flight guard check (rate limit, quota, feature flag) before a cache lookup or short-circuit path, verify the guard doesn't block operations that would be served from cache without touching the guarded resource — restructure so cache hits bypass the guard
+ !`cat ~/.claude/lib/review-security-audit.md`
 
- **Sanitization/validation/normalization coverage**
- - If the PR introduces a new validation or sanitization function for a data field, trace every code path that writes to that field (create, update, import, sync, rename, raw/bulk persist) — verify they all use the same sanitization. Partial application is the #1 way invalid data re-enters through an unguarded path
- - If the PR adds a "raw" or bypass write path (e.g., `raw: true` flag, bulk import, migration backfill), compare the normalization it applies against what the standard read/parse path assumes — ID prefixes, required defaults, shape invariants. Data that passes through the raw path must still be valid when reloaded through the normal path
- - If the PR adds a new dispatch branch within a multi-type handler (e.g., coercing a new data shape, handling a new entity subtype), trace sibling branches and verify the new one applies equivalent validation, type-checking, and error-handling constraints — new branches commonly bypass validation that existing branches enforce because the author focuses on happy-path behavior
+ </security_agent>
 
- **Bootstrap/initialization ordering**
- - If the PR adds resilience or self-healing code (dependency installers, auto-repair, migration runners), trace the execution order: does the main code path resolve or import the dependencies BEFORE the resilience code runs? If so, the bootstrapper never executes when it's needed most — restructure so verification/installation precedes resolution
+ <cross_file_agent>
 
- **Self-rescheduling callback resilience**
- - If the PR adds a one-shot timer or deferred callback that re-registers itself for the next cycle, trace what happens when the callback body throws before re-registration — an unhandled error permanently stops the schedule. Verify re-registration is in a finally block or occurs before the main logic
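The "re-register in a finally block" option from this removed check could look roughly like the TypeScript sketch below. `startPolling` and the injected `Schedule` type are hypothetical names; the scheduler is passed in as a parameter only so the example does not depend on real timers.

```typescript
// Hypothetical scheduler signature, e.g. (cb, ms) => { setTimeout(cb, ms); }
type Schedule = (callback: () => void, delayMs: number) => void;

// Re-register in `finally` so a throwing tick cannot permanently stop the schedule.
function startPolling(tick: () => void, delayMs: number, schedule: Schedule): void {
  const run = (): void => {
    try {
      tick();
    } catch (error) {
      console.error("tick failed:", error);
    } finally {
      schedule(run, delayMs); // always queue the next cycle, even after an error
    }
  };
  schedule(run, delayMs);
}
```

In real use the scheduler would typically wrap `setTimeout`; the alternative the check mentions is to re-register before running the main logic.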
+ ### 3. Cross-File Tracing Agent
 
- **Periodic operation skip behavior**
- - If the PR adds skip/gate conditions to periodic operations (scheduled jobs, pollers), trace whether a skip still advances the scheduling state (lastRun, nextFireTime). A skipped execution with null/stale timing state causes immediate re-trigger loops
+ Catches contract mismatches, broken call chains, stale state propagation, lifecycle gaps, and architectural violations.
 
- **Lock/flag exit-path completeness**
- - If a function sets a shared flag or lock (in-progress, mutex, status marker), trace every exit path — early returns, error catches, platform-specific guards, and normal completion — to verify the flag is cleared. A missed path leaves the system permanently locked
+ !`cat ~/.claude/lib/review-cross-file-tracing.md`
 
- **Operation-marker ordering**
- - If the PR writes completion markers, success flags, or status files, verify they are written AFTER the operation they attest to, not before. If the operation can fail after the marker write, consumers see false success. Also check that marker-dependent startup logic validates the marker's contents rather than treating presence as unconditional success
+ </cross_file_agent>
 
- **Real-time event vs response timing**
- - If a handler emits push notifications (WebSocket, SSE, pub/sub) AND returns an HTTP response, verify clients won't receive push events before the response that gives them context to interpret those events — especially when the response contains IDs or version numbers the event consumer needs
+ ### How to dispatch
 
- **Paired lifecycle event completeness**
- - If a function emits a "started" or "begin" event, trace every exit path (success, error, early return, no-op branches for specific entity types) and verify each emits the corresponding "completed" or "failed" event. A missing completion event leaves consumers (UI spinners, progress indicators, orchestrators waiting for all-done) in a permanently stale state. Pay special attention to branches that short-circuit for specific entity subtypes — they often return early without the completion event because the author only considered the primary code path
+ For each agent, construct its prompt by combining:
+ 1. The agent's instruction content (from the sections above)
+ 2. Project convention overrides from CLAUDE.md that affect the review
+ 3. The list of changed files from the diff stat
+ 4. Instruction: "Read each changed file in full (not just diff hunks). Apply your checklist. Return structured findings."
 
- **Entity identity key consistency**
- - If the PR uses a computed identity key to look up, match, or index entities (e.g., `e.id || e.externalId`, `item.slug ?? item.name`), trace all other code paths that perform the same entity lookup and verify they use the identical key computation. Inconsistent key strategies cause mismatches — one path stores data under key A while another reads under key B, leading to phantom missing records or incorrect counts
+ Spawn all three agents simultaneously. Each returns its findings independently.
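The dispatch steps above can be sketched in TypeScript as follows. This is an illustration under stated assumptions, not how slash-do or the Agent tool is implemented: `AgentSpec`, `buildAgentPrompt`, `dispatchAll`, and the `runAgent` callback are all hypothetical names.

```typescript
// Hypothetical types and helpers; none of these names come from slash-do or the Agent tool.
interface AgentSpec {
  name: string;
  instructions: string; // contents of the agent's instruction file
}

// Combine the four prompt parts listed above into one string.
function buildAgentPrompt(
  agent: AgentSpec,
  conventionOverrides: string[],
  changedFiles: string[],
): string {
  return [
    agent.instructions,
    "Project convention overrides:",
    ...conventionOverrides.map((rule) => `- ${rule}`),
    "Changed files:",
    ...changedFiles.map((file) => `- ${file}`),
    "Read each changed file in full (not just diff hunks). Apply your checklist. Return structured findings.",
  ].join("\n");
}

// Start every agent before awaiting any of them, so the reviews run concurrently.
async function dispatchAll(
  agents: AgentSpec[],
  conventionOverrides: string[],
  changedFiles: string[],
  runAgent: (name: string, prompt: string) => Promise<string>,
): Promise<string[]> {
  const pending = agents.map((agent) =>
    runAgent(agent.name, buildAgentPrompt(agent, conventionOverrides, changedFiles)),
  );
  return Promise.all(pending);
}
```

Because `Promise.all` rejects as soon as one agent fails, an orchestrator that wants partial results might prefer `Promise.allSettled`; the source does not say which behavior is intended.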
 
- **Intent vs implementation (meta-cognitive pass)**
- - For each label, comment, docstring, status message, or inline instruction that describes behavior, verify the code actually implements that behavior. A detection mechanism must query the data it claims to detect; a migration must create the target, not just delete the source
- - If the PR contains inline code examples, command templates, or query snippets, verify they are syntactically valid for their language — run a mental parse of each example. Watch for template placeholder format inconsistencies within and across files
- - If the PR modifies a value (identifier, parameter name, format convention, threshold, timeout) that is referenced in other files, trace all cross-references and verify they agree. This includes: reviewer usernames, API names, placeholder formats, GraphQL field names, operational constants
- - If the PR adds or reorders sequential steps/instructions, verify the ordering matches execution dependencies — readers following steps in order must not perform an action before its prerequisite
+ ### Large PR handling
 
- **Transactional write integrity**
- - If the PR performs multi-item writes (database transactions, batch operations), verify each write includes condition expressions that prevent stale-read races (TOCTOU) — an unconditioned write after a read can upsert deleted records, double-count aggregates, or drive counters negative. Trace the gap between read and write for each operation. Also verify that update/modify operations won't silently create records when the target key doesn't exist — database update operations often have implicit upsert semantics (e.g., DynamoDB UpdateItem, MongoDB update with upsert) that create partial records for invalid IDs; add existence condition expressions when the operation should only modify existing records
- - If the PR catches transaction/conditional failures, verify the error is translated to a client-appropriate status (409, 404) rather than bubbling as 500 — expected concurrency failures are not server errors
+ If the diff touches more than 20 files, tell each agent to batch files by directory and process groups sequentially within their parallel run. The orchestrator does not manage batching.
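Batching by directory, as each agent is told to do here, might be sketched like this in TypeScript. `batchByDirectory` is a hypothetical helper, and it assumes forward-slash paths as printed by `git diff --stat`.

```typescript
// Hypothetical helper: group changed file paths by their parent directory.
function batchByDirectory(files: string[]): Map<string, string[]> {
  const batches = new Map<string, string[]>();
  for (const file of files) {
    const slash = file.lastIndexOf("/");
    const dir = slash === -1 ? "." : file.slice(0, slash); // top-level files go under "."
    const batch = batches.get(dir) ?? [];
    batch.push(file);
    batches.set(dir, batch);
  }
  return batches;
}
```

An agent could then walk the map's entries one group at a time while the three agents themselves still run in parallel.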
 
- **Batch/paginated API consumption**
- - If the PR calls batch or paginated external APIs (database batch gets, paginated queries, bulk service calls), verify the caller handles partial results — unprocessed items, continuation tokens, and rate-limited responses must be retried or surfaced, not silently dropped. Check that retry loops include backoff and attempt limits
- - If the PR references resource names from API responses (table names, queue names), verify lookups account for environment-prefixed names rather than hardcoding bare names
+ ## Collect & Deduplicate
 
- **Data model vs access pattern alignment**
- - If the PR adds queries that claim ordering (e.g., "recent", "top"), verify the underlying key/index design actually supports that ordering natively — random UUIDs and non-time-sortable keys require full scans and in-memory sorting, which degrades at scale
+ After all three agents return:
 
- **Deletion/lifecycle cleanup and aggregate reset completeness**
- - If the PR adds a delete or destroy function, trace all resources created during the entity's lifecycle (data directories, git branches, child records, temporary files, worktrees) and verify each is cleaned up on deletion. Compare with existing delete functions in the codebase for completeness patterns
- - If the PR adds a state transition that resets an aggregate value (counter, score, flag count), trace all individual records that contribute to that aggregate and verify they are also cleared, archived, or versioned; a reset counter with stale contributing records causes inconsistency and blocks duplicate-prevention checks on re-entry
-
- **Update schema depth**
- - If the PR derives an update/patch schema from a create schema (e.g., `.partial()`, `Partial<T>`), verify that nested objects also become partial — shallow partial on deeply-required schemas rejects valid partial updates where the caller only wants to change one nested field
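The shallow-versus-deep distinction in this removed check can be shown at the type level in TypeScript. `DeepPartial` and `CreateUser` are hypothetical names for illustration; this simple version does not special-case arrays or functions.

```typescript
// Hypothetical utility type: make every property optional at every nesting level.
type DeepPartial<T> = T extends object ? { [K in keyof T]?: DeepPartial<T[K]> } : T;

interface CreateUser {
  name: string;
  address: { city: string; zip: string };
}

// Shallow partial: `address` is optional, but if present it must be complete.
const shallow: Partial<CreateUser> = { address: { city: "Oslo", zip: "0150" } };

// Deep partial: a caller can change a single nested field.
const deep: DeepPartial<CreateUser> = { address: { city: "Oslo" } };
```

Assigning `{ address: { city: "Oslo" } }` to the `Partial<CreateUser>` variable would fail to type-check, which mirrors the rejected partial update the check describes.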
-
- **Mutation return value freshness**
- - If a function mutates an entity and returns it, verify the returned object reflects the post-mutation state, not a pre-read snapshot. Also check whether dependent scheduling/evaluation state (backoff, timers, status flags) is reset when a "force" or "trigger" operation is invoked
-
- **Responsibility relocation audit**
- - If the PR moves a responsibility from one module to another (e.g., a database write from a handler to middleware, a computation from client to server), trace all code at the old location that depended on the timing, return value, or side effects of the moved operation — guards, response fields, in-memory state updates, and downstream scheduling that assumed co-located execution. Verify the new execution point preserves these contracts or that dependents are updated. Check for dead code left behind at the old location
-
- **Read-after-write consistency**
- - If the PR writes to a data store and then immediately queries that store (especially scans, aggregations, or replica reads), check whether the store's consistency model guarantees visibility of the write. If not, flag the read as potentially stale and suggest computing from in-memory state, using consistent-read options, or adding a delay/caveat
-
- **Security-sensitive configuration parsing**
- - If the PR reads environment variables or config values that affect security behavior (proxy trust depth, rate limit thresholds, CORS origins, token expiry), verify the parsing enforces the expected type and range — e.g., integer-only via `parseInt` with `Number.isInteger` check, non-negative bounds, and a logged fallback to a safe default on invalid input. `Number()` on arbitrary strings accepts floats, negatives, and empty-string-as-zero, all of which can silently weaken security controls
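One way to get the strict parsing this removed check describes is sketched below in TypeScript. `parseNonNegativeInt` is a hypothetical helper. Note that `parseInt` alone truncates `"1.5"` to `1`, so this sketch validates the text with a digits-only pattern before converting, which is an assumption about what "integer-only" should mean.

```typescript
// Hypothetical helper: parse a non-negative integer setting, falling back to a safe default.
function parseNonNegativeInt(raw: string | undefined, fallback: number): number {
  const text = raw === undefined ? "" : raw.trim();
  // Accept plain decimal digits only, so "1.5", "-3", "" and "10abc" are all rejected.
  if (!/^\d+$/.test(text)) {
    console.warn(`Invalid integer setting ${JSON.stringify(raw)}; using ${fallback}`);
    return fallback;
  }
  const value = Number.parseInt(text, 10);
  // Guard against digit strings too large to represent exactly.
  return Number.isSafeInteger(value) ? value : fallback;
}

const lax = Number(""); // 0: the empty string silently becomes zero
```

A call such as `parseNonNegativeInt(process.env.TRUST_PROXY_DEPTH, 1)` would be typical in Node.js; the variable name there is only an example.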
-
- **Multi-source data aggregation**
- - If the PR aggregates items from multiple sources into a single collection (merging accounts, combining API results, flattening caches), verify each item retains its source identifier through the aggregation — downstream operations that need to route back to the correct source (updates, deletes, detail views) will silently break or operate on the wrong source if the origin is lost
-
- **Field-set enumeration consistency**
- - If the PR adds an operation that targets a set of entity fields (enrichment, validation, migration, sync), trace every other location that independently enumerates those fields — UI predicates, scan/query filters, API documentation, response shapes, and test assertions. Each must cover the same field set; a missed field causes silent skips or false UI state. Prefer deriving enumerations from a single source of truth (constant array, schema keys) over maintaining independent lists
-
- **Abstraction layer fidelity**
- - If the PR calls a third-party API through an internal wrapper/abstraction layer, trace whether the wrapper requests and forwards all fields the handler depends on — third-party APIs often have optional response attributes that require explicit opt-in (e.g., cancellation reasons, extended metadata). Code branching on fields the wrapper doesn't forward will silently receive `undefined` and take the wrong path. Also verify that test mocks match what the real wrapper returns, not what the underlying API could theoretically return
- - If the PR passes multiple parameters through a wrapper/abstraction layer to an underlying API, check whether any parameter combinations are mutually exclusive in the underlying API (e.g., projection expressions + count-only select modes) — the wrapper should strip conflicting parameters rather than forwarding all unconditionally, which causes validation errors at the underlying layer
- - If the PR calls framework or library functions with discriminated input formats (e.g., content paths vs script paths, different loader functions per format), trace each call site to verify the function variant used actually handles the input format being passed — especially fallback/default branches in multi-format dispatchers, where the fallback commonly uses the wrong function. Also verify positional argument order matches the called function's parameter order (not assumed from variable names) and that the object type passed matches what the API expects (e.g., asset object vs class reference, property access vs method call)
-
- **Parameter consumption tracing**
- - If the PR adds a function with validated input parameters (schema validation, input decorators, type annotations), trace each validated parameter through to where it's actually consumed in the implementation. Parameters that pass validation but are never read create dead API surface — callers believe they're configuring behavior that's silently ignored. Either wire the parameter through or remove it from the public API
-
- **Summary/aggregation endpoint consistency**
- - If the PR adds a summary or dashboard endpoint that aggregates counts/previews across multiple data sources, trace each category's computation logic against the corresponding detail view it links to — verify they apply the same filters (e.g., orphan exclusion, status filtering), the same ordering guarantees (sort keys that actually exist on the queried index), and that navigation links propagate the aggregated context (e.g., `?status=pending`) so the destination page matches what the summary promised
-
- **Data model / status lifecycle changes**
- - If the PR changes the set of valid statuses, enum values, or entity lifecycle states, sweep all dependent artifacts: API doc summaries and enum declarations, UI filter/tab options, conditional rendering branches (which actions to show per state), integration guide examples, route names derived from old status names, and test assertions. Each artifact that references the old value set must be updated — partial updates leave stale filters, invalid actions, and misleading documentation
- - If the PR renames a concept (e.g., "flagged" → "rejected"), trace all manifestations beyond user-facing labels: route paths, component/file names, variable names, CSS classes, and test descriptions. Internal identifiers using the old name create confusion even when the UI is correct
-
- **Type-discriminated entity validation**
- - If the PR modifies entities with a discriminator field (type, kind, category), trace all code paths that change the discriminator value — not just the update handler, but also migration paths, bulk operations, and UI type-switchers. Verify downstream branching logic (execution, rendering, validation) handles all type transitions without falling through to the wrong handler
-
- **Migration/initialization idempotency**
- - If the PR adds a startup-time migration or one-time initialization, verify it is idempotent — what happens when it runs a second time? Check that the migration condition excludes already-migrated records and that completion is recorded (version stamp, flag) to prevent re-entry on every load
235
-
236
- **Data migration semantic preservation**
237
- - If the PR includes a data migration, trace each migrated field's behavioral meaning before and after. Focus on: migration-time validation matching runtime validation (don't persist values that will fail at execution), concurrency protection (migrations triggered by read paths race with concurrent requests), and unsupported source values being flagged rather than silently defaulted
238
-
239
- **Formatting & structural consistency**
240
- - If the PR adds content to an existing file (list items, sections, config entries), verify the new content matches the file's existing indentation, bullet style, heading levels, and structure — rendering inconsistencies are the most common Copilot review finding
241
-
242
- **Dependent operation ordering**
243
- - If a handler orchestrates multiple operations (primary write + side effects like rewards, uploads, notifications), trace the dependency graph — verify side effects only execute after the primary operation confirms success. Watch for `Promise.all` grouping operations that should be sequential because one depends on the other's outcome, and for resource allocation (file uploads, external API calls) happening before a gate operation (lock acquisition, uniqueness check, validation) that may reject the request
244
- - If the PR handles file uploads or binary data, verify the server validates content independently of client-supplied metadata — check MIME type via magic bytes, size via buffer length, and content validity via actual parsing rather than trusting request headers or middleware-provided fields
245
-
246
- **Bulk vs single-item operation parity**
247
- - If the PR modifies a single-item CRUD operation (create, update, delete) to handle new fields or apply new logic, trace the corresponding bulk/batch operation for the same entity — it often has its own independent implementation that won't pick up the change. Verify both paths handle the same fields, apply the same validation, and preserve the same secondary data
248
-
249
- **Bulk operation selection lifecycle**
250
- - If the PR adds operations that act on a user-selected subset of items (bulk actions, batch operations), trace the complete lifecycle of the selection state: when is it cleared (data refresh, item deletion), when is it not cleared but should be (filter/sort/page changes), and whether the operation re-validates the selection at execution time (especially after confirmation dialogs where the underlying data may change between display and confirmation)
251
-
252
- **Config value provenance for auto-upgrade**
253
- - If the PR adds auto-upgrade logic that replaces config values with newer defaults (prompt versions, schema migrations, template updates), verify the code can distinguish "user customized this value" from "this is the previous default." Without provenance tracking (version stamps, customization flags, or comparison against known previous defaults), auto-upgrade will overwrite intentional user customizations or skip legitimate upgrades
254
-
255
- **Query key / stored key precision alignment**
256
- - If the PR adds queries that construct lookup keys with a different precision, encoding, or format than what the write path persists, the query will silently return zero matches. Trace the key construction in both write and read paths and verify they produce compatible values
257
-
258
- </deep_checks>
259
-
260
- <verify_findings>
79
+ 1. **Merge** all findings into a single list, tagged by source agent
80
+ 2. **Deduplicate**: if two agents flagged the same `file:line` with overlapping descriptions, keep the most detailed version and note both agents found it
81
+ 3. **PR coherence**: verify commits deliver what they claim; flag discrepancies as IMPROVEMENT findings
82
+ 4. **CLAUDE.md filter**: remove findings that conflict with explicit project conventions
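The merge and deduplicate steps in the list above can be sketched roughly as follows. This is a minimal illustration and not the package's implementation: the `RawFinding` shape, the `mergeFindings` name, and the use of description length as a stand-in for "most detailed" are all assumptions, and the sketch treats any two findings at the same `file:line` as overlapping.

```typescript
// Hypothetical finding shape; the agents' real output format is not shown in this diff.
interface RawFinding {
  file: string;
  line: number;
  description: string;
}

interface MergedFinding extends RawFinding {
  agents: string[]; // every agent that reported this location
}

// Merge per-agent lists into one list, collapsing findings that share file:line.
// Assumption: the longest description is kept as the "most detailed" version.
function mergeFindings(perAgent: { [agent: string]: RawFinding[] }): MergedFinding[] {
  const byLocation: { [key: string]: MergedFinding } = {};
  for (const agent of Object.keys(perAgent)) {
    for (const f of perAgent[agent]) {
      const key = f.file + ":" + f.line;
      const existing = byLocation[key];
      if (existing === undefined) {
        byLocation[key] = { file: f.file, line: f.line, description: f.description, agents: [agent] };
        continue;
      }
      // Same location flagged again: note the extra agent, keep the richer text.
      if (existing.agents.indexOf(agent) === -1) existing.agents.push(agent);
      if (f.description.length > existing.description.length) {
        existing.description = f.description;
      }
    }
  }
  return Object.keys(byLocation).map((key) => byLocation[key]);
}
```

A real implementation would probably compare descriptions for actual overlap before collapsing two findings, since unrelated issues can share a line.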
261
83
 
262
84
  ## Verify Findings
263
85
 
264
- For each issue found, ground it in evidence before classifying:
86
+ For each finding, ground it in evidence before classifying:
265
87
  1. **Quote the specific code line(s)** that demonstrate the issue
266
88
  2. **Explain why it's a problem** in one sentence given the surrounding context
267
89
  3. If the fix involves async/state changes, **trace the execution path** to confirm the issue is real
@@ -269,16 +91,12 @@ For each issue found, ground it in evidence before classifying:
269
91
 
270
92
  After verifying all findings, run the project's build and test commands to confirm no false positives.
271
93
 
272
- </verify_findings>
273
-
274
- <fix_and_report>
275
-
276
- ## Fix Issues Found
94
+ ## Fix Issues
277
95
 
278
- For each verified issue:
96
+ For each verified finding:
279
97
  1. Classify severity: **CRITICAL** (runtime crash, data leak, security) vs **IMPROVEMENT** (consistency, robustness, conventions)
280
98
  2. Fix all CRITICAL issues immediately
281
- 3. For IMPROVEMENT issues, fix them too — the goal is to eliminate Copilot review round-trips
99
+ 3. For IMPROVEMENT issues, fix them too — the goal is to eliminate review round-trips
282
100
  4. After fixes, run the project's test suite and build command (per project conventions already in context)
283
101
  5. Verify the test suite covers the changed code paths — passing unrelated tests is not validation
284
102
  6. Commit fixes: `refactor: address code review findings`
@@ -290,13 +108,15 @@ Print a summary table of what was reviewed and found:
290
108
  ```
291
109
  ## Review Summary
292
110
 
293
- | Category | Files Checked | Issues Found | Fixed |
294
- |----------|--------------|-------------|-------|
295
- | Hygiene | N | N | N |
296
- | ... | ... | ... | ... |
111
+ | Agent | Files Checked | Issues Found | Fixed |
112
+ |-------|--------------|-------------|-------|
113
+ | Surface Scan | N | N | N |
114
+ | Security Audit | N | N | N |
115
+ | Cross-File Tracing | N | N | N |
116
+ | **Total** | **N** | **N** | **N** |
297
117
 
298
118
  ### Issues Fixed
299
- - file:line — description of fix
119
+ - file:line — description of fix (agent: Surface/Security/Cross-File)
300
120
 
301
121
  ### Accepted As-Is (with rationale)
302
122
  - file:line — description and why it's acceptable
@@ -304,10 +124,6 @@ Print a summary table of what was reviewed and found:
304
124
 
305
125
  If no issues were found, confirm the code is clean and ready for PR.
306
126
 
307
- </fix_and_report>
308
-
309
- <pr_comment_policy>
310
-
311
127
  ## PR Comment Policy
312
128
 
313
129
  After the review and any fixes, determine whether to post review comments on the PR/MR:
@@ -318,5 +134,3 @@ After the review and any fixes, determine whether to post review comments on the
318
134
  4. **If the PR was opened by someone else**, post a review comment on the PR summarizing the findings using `gh pr review {number} --comment --body "..."`. Include the issues found, fixes applied, and any remaining items that need the author's attention.
319
135
 
320
136
  This avoids noisy self-comments on your own PRs while still providing feedback to other contributors.
321
-
322
- </pr_comment_policy>
package/install.sh CHANGED
@@ -56,6 +56,7 @@ OLD_COMMANDS=(cam good makegoals makegood optimize-md)
56
56
  LIBS=(
57
57
  code-review-checklist copilot-review-loop graphql-escaping
58
58
  remediation-agent-template swift-review-checklist
59
+ review-surface-scan review-security-audit review-cross-file-tracing
59
60
  )
60
61
 
61
62
  HOOKS=(slashdo-check-update slashdo-statusline)
@@ -7,7 +7,7 @@
7
7
  ## Tier 1 — Always Check (Runtime Crashes, Security, Hygiene)
8
8
 
9
9
  **Hygiene**
10
- - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, and uncommittable files (.env, node_modules, build artifacts)
10
+ - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, and uncommittable files (.env, node_modules, build artifacts, runtime-generated data/reports)
11
11
  - Overly broad changes that should be split into separate PRs
12
12
 
13
13
  **Imports & references**
@@ -27,7 +27,7 @@
27
27
  - Route params passed to services without format validation; path containment checks using string prefix without path separator boundary (use `path.relative()`)
28
28
  - Parameterized/wildcard routes registered before specific named routes — the generic route captures requests meant for the specific endpoint (e.g., `/:id` registered before `/drafts` matches `/drafts` as `id="drafts"`). Verify route registration order or use path prefixes to disambiguate
29
29
  - Stored or external URLs rendered as clickable links (`href`, `src`, `window.open`) without protocol validation — `javascript:`, `data:`, and `vbscript:` URLs execute in the user's browser. Allowlist `http:`/`https:` (and `mailto:` if needed) before rendering; for all other schemes, render as plain text or strip the value
30
- - Server-side HTTP requests using user-configurable or externally-stored URLs without protocol allowlisting (http/https only) and host/network restrictions — the server becomes an SSRF proxy for reaching internal network services, cloud metadata endpoints, or localhost-bound APIs. Validate scheme and restrict to expected hosts or external-only ranges before any server-side fetch
30
+ - Server-side HTTP requests using user-configurable or externally-stored URLs without protocol allowlisting (http/https only) and host/network restrictions — the server becomes an SSRF proxy for reaching internal network services, cloud metadata endpoints, or localhost-bound APIs. Validate scheme and restrict to expected hosts or external-only ranges before any server-side fetch. Also check redirect handling: auto-following redirects (`redirect: 'follow'`) bypasses initial host validation when a public URL redirects to an internal IP. Disable auto-follow and revalidate each hop, or resolve DNS and block private/loopback/link-local ranges before connecting — public hostnames can resolve to internal IPs via DNS rebinding
31
31
  - Error/fallback responses that hardcode security headers instead of using centralized policy — error paths bypass security tightening
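The protocol allowlist for stored or external URLs mentioned in the list above could look something like this. It is a sketch under assumptions: `isSafeHref` and `ALLOWED_PROTOCOLS` are illustrative names, and treating relative or unparseable values as unsafe is a design choice made here, not something the checklist prescribes.

```typescript
// Schemes considered safe to render as clickable links (mailto: is optional).
const ALLOWED_PROTOCOLS = ["http:", "https:", "mailto:"];

// Returns true only when the value parses as an absolute URL with an allowed scheme.
// The URL parser lowercases the scheme, so "JAVASCRIPT:..." is rejected as well.
function isSafeHref(raw: string): boolean {
  try {
    const parsed = new URL(raw);
    return ALLOWED_PROTOCOLS.indexOf(parsed.protocol) !== -1;
  } catch {
    // Relative or malformed input: render as plain text instead of a link.
    return false;
  }
}
```

This covers only the client-side rendering case. The server-side SSRF checks in the same list (host restrictions, redirect revalidation, DNS resolution) need separate handling.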
32
32
 
33
33
  **Trust boundaries & data exposure**
@@ -65,7 +65,7 @@
65
65
 
66
66
  **Resource management** _[applies when: code uses event listeners, timers, subscriptions, or useEffect]_
67
67
  - Event listeners, socket handlers, subscriptions, timers, and useEffect side effects are cleaned up on unmount/teardown
68
- - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry. Also check cleanup operations that perform implicit state mutations (auto-merge, auto-commit, cascade writes) as part of teardown — these can introduce unreviewed changes or silently modify shared state. Verify cleanup fails safely when a prerequisite step (e.g., saving dirty state) fails rather than proceeding with data loss
68
+ - Deletion/destroy and state-reset functions that clean up or reset the primary resource but leave orphaned or inconsistent secondary resources (data directories, git branches, child records, temporary files, per-user flag/vote items) — trace all resources created during the entity's lifecycle and verify each is removed on delete. Also check the inverse: preservation guards that prevent cleanup under overly broad conditions (e.g., preserving a branch "in case there were commits" when the ahead count is zero) — over-conservative guards leak resources over time. For state transitions that reset aggregate values (counters, scores, flags), also clear or version the individual records that contributed to those aggregates — otherwise the aggregate and its sources disagree, and duplicate-prevention checks block legitimate re-entry. Also check cleanup operations that perform implicit state mutations (auto-merge, auto-commit, cascade writes) as part of teardown — these can introduce unreviewed changes or silently modify shared state. Verify cleanup fails safely when a prerequisite step (e.g., saving dirty state) fails rather than proceeding with data loss
69
69
  - Initialization functions (schedulers, pollers, listeners) that don't guard against multiple calls — creates duplicate instances. Check for existing instances before reinitializing
70
70
  - Self-rescheduling callbacks (one-shot timers, deferred job handlers) where the next cycle is registered inside the callback body — an unhandled error before the re-registration call permanently stops the schedule. Wrap the callback body in try/finally with re-registration in the finally block, or register the next cycle before executing the current one
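The try/finally pattern for self-rescheduling callbacks described above can be illustrated with a small sketch. The `startPolling` name and the injected `schedule` and `onError` parameters are assumptions made so the example is self-contained and testable without real timers.

```typescript
// A scheduler abstraction; in production this could wrap setTimeout.
type Schedule = (callback: () => void, delayMs: number) => void;

// Runs `task` repeatedly. Re-registration lives in `finally`, so a throwing
// task cannot permanently stop the schedule.
function startPolling(
  task: () => void,
  delayMs: number,
  schedule: Schedule,
  onError: (err: unknown) => void
): void {
  const tick = (): void => {
    try {
      task();
    } catch (err) {
      onError(err); // report the failure, but keep the cycle alive
    } finally {
      schedule(tick, delayMs); // always register the next cycle
    }
  };
  schedule(tick, delayMs);
}
```

With real timers the scheduler argument might be `(cb, ms) => { setTimeout(cb, ms); }`. The alternative the checklist mentions, registering the next cycle before running the current one, gives the same guarantee without the `finally` block.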
71
71
 
@@ -92,6 +92,7 @@
92
92
  - Inconsistent "missing value" semantics across layers — one layer treats `null`/`undefined` as missing while another also treats empty strings or whitespace-only strings as missing. Query filters, update expressions, and UI predicates that disagree on what constitutes "missing" cause records to be skipped by one path but processed by another. Define a single `isMissing` predicate and use it consistently, or normalize empty/whitespace values to `null` at write time. Also applies to comparison/detection logic: coercing an absent field to a sentinel (`?? 0`, default parameters) makes the logic treat "unsupported" as a real value — guard with an explicit presence check before comparing. Watch for validation/sanitization functions that return `null` for invalid input when `null` also means "clear/delete" downstream — malformed input silently destroys existing data. Distinguish "invalid, reject the request" from "explicitly clear this field". Also applies to normalization (trailing slashes, case, whitespace): if one path normalizes a value before comparison but the write path stores it un-normalized, comparisons against the stored value produce incorrect results — normalize at write time or normalize both sides consistently
93
93
  - Validation functions that delegate to runtime-behavior computations (next schedule occurrence, URL reachability, resource resolution) — conflating "no result within search window" or "temporarily unavailable" with "invalid input" rejects valid configurations. Validate syntax and structure independently of runtime feasibility
94
94
  - Numeric values from strings used without `NaN`/type guards — `NaN` comparisons silently pass bounds checks. Clamp query params to safe lower bounds
95
+ - Hand-rolled regex validators for well-known formats (IP addresses, email, URLs, dates, semver) that accept invalid inputs or reject valid ones — use platform/standard library parsers instead (e.g., `net.isIP()`, `URL` constructor, `semver.valid()`) which handle edge cases the regex misses
95
96
  - UI elements hidden from navigation but still accessible via direct URL — enforce restrictions at the route level
96
97
  - Summary counters/accumulators that miss edge cases (removals, branch coverage, underflow on decrements — guard against going negative with lower-bound conditions); counters incremented before confirming the operation actually changed state — rejected, skipped, or no-op iterations inflate success counts. Batch operations that report overall success while silently logging per-item failures — callers see success but partial work was done; collect and return per-item failures in the response. Silent operations in verbose sequences where all branches should print status
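As an illustration of the `NaN` guard and clamping item in the list above, here is a minimal sketch. The helper name `parseBoundedInt` and its parameters are hypothetical. The point is that `NaN > max` and `NaN < min` are both `false`, so a plain bounds check lets `NaN` through unless it is tested explicitly.

```typescript
// Parses a query-string style value into an integer within [min, max].
// Falls back when the input is missing or not numeric.
// Note: parseInt tolerates trailing characters ("42abc" parses as 42);
// stricter parsing may be preferable in some APIs.
function parseBoundedInt(
  raw: string | undefined,
  min: number,
  max: number,
  fallback: number
): number {
  const parsed = parseInt(raw === undefined ? "" : raw, 10);
  if (isNaN(parsed)) return fallback; // explicit guard: NaN fails every comparison
  return Math.min(max, Math.max(min, parsed));
}
```

For example, `parseBoundedInt(req.query.limit, 1, 100, 20)` would keep a page-size parameter within a safe range, assuming `req.query.limit` is a string or undefined.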
97
98
 
@@ -148,6 +149,7 @@
148
149
  - Subprocess output buffered in memory without size limits — a noisy or stuck child process can cause unbounded memory growth. Cap in-memory buffers and truncate or stream to disk for long-running commands
149
150
  - Platform-specific assumptions — hardcoded shell interpreters, `path.join()` backslashes breaking ESM imports. Use `pathToFileURL()` for dynamic imports
150
151
  - Naive whitespace splitting of command strings (`str.split(/\s+/)`) breaks quoted arguments — use a proper argv parser or explicitly disallow quoted/multi-word arguments when validating shell commands
152
+ - Subprocess output parsed from a single stream (stdout or stderr) to detect conditions (conflicts, errors, specific states) — the information may appear in the other stream or vary by tool version/config. Check both stdout and stderr, and verify the exit code, to reliably detect the condition
151
153
  - Shell expansions (brace `{a,b}`, glob `*`, tilde `~`, variable `$VAR`) suppressed by quoting context — single quotes prevent all expansion, so patterns like `--include='*.{ts,js}'` pass the literal braces to the command instead of expanding. Use multiple flags, unquoted brace expansion (bash-only), or other command-specific syntax when expansion is required
152
154
 
153
155
  **Search & navigation** _[applies when: code implements search results or deep-linking]_
@@ -160,6 +162,7 @@
160
162
  **Accessibility** _[applies when: code modifies UI components or interactive elements]_
161
163
  - Interactive elements missing accessible names, roles, or ARIA states — including disabled interactions without `aria-disabled`
162
164
  - Custom toggle/switch UI built from non-semantic elements instead of native inputs
165
+ - Overlay or absolutely-positioned layers with broad `pointer-events-auto` that intercept clicks/hover intended for elements beneath — use `pointer-events-none` on decorative overlays and enable events only on small interactive affordances. Conversely, `pointer-events-none` on a parent kills hover/click handlers on children — verify both directions when layering positioned elements
163
166
 
164
167
  ## Tier 4 — Always Check (Quality, Conventions, AI-Generated Code)
165
168
 
@@ -0,0 +1,227 @@
1
+ # Cross-File Tracing Review Agent
2
+
3
+ ## Mandate
4
+ You review code by tracing data and control flow ACROSS files. You catch issues invisible in single-file review: mismatched contracts, broken call chains, stale state propagation, lifecycle gaps, and architectural violations.
5
+
6
+ ## Reading Strategy
7
+ 1. Read ALL changed files to understand each module's responsibility
8
+ 2. Trace call chains: for each new/modified function, identify callers and callees across files. Read unchanged files when needed to verify contracts
9
+ 3. Map data from entry points (route handlers, event listeners) through transforms, storage, and output
10
+ 4. For each module, state its single responsibility — if you can't, flag it
11
+
12
+ ## Principles to Evaluate
13
+
14
+ **DRY** — Logic duplicated across changed files or between changed and existing code. Similar function signatures, copy-pasted patterns with minor variations. Two functions doing nearly the same thing should share implementation.
15
+
16
+ **SOLID**
17
+ - Single Responsibility: each module/function has one reason to change. Route handlers with business rules beyond delegation violate this
18
+ - Open/Closed: new behavior addable without modifying working code
19
+ - Interface Segregation: callers don't depend on methods they don't use
20
+ - Dependency Inversion: high-level modules don't import low-level details directly
21
+
22
+ ## Checklist
23
+
24
+ ### Async & State Consistency
25
+
26
+ - Optimistic state changes before async completion — if operation fails, UI stuck. `.catch(() => null)` followed by unconditional success code — the catch silences but success path still runs
27
+ - Multiple coupled state variables updated independently — changes to one must update all related fields. Debounced/cancelable ops must reset loading on every exit. Selection sets must be pruned when items are removed, invalidated on reload/filter/sort/page. Ops from confirmation dialogs must re-validate at execution time. `useState(prop)` only captures initial value — sync with effect when prop updates async
28
+ - Error notification at multiple layers — verify exactly one layer owns user-facing messages. Periodic polling: throttle notifications to state transitions (success→error), don't make UI disappear on error
29
+ - Optimistic updates with full-collection rollback snapshots — second in-flight action clobbered. Use per-item rollback and functional updaters. Guard against duplicate appends
30
+ - State updates guarded by truthiness (`if (arr?.length)`) preventing clearing when source returns empty — distinguish "no response" from "empty response"
31
+ - Periodic operations with skip conditions not advancing timing state (lastRun, nextFireTime) — null/stale lastRun causes re-trigger loops. Check initial baseline: epoch makes items immediately due, "now" may prevent them from ever becoming due
32
+ - Cached values keyed without all discriminators (URL, tenant, config version) — context changes serve stale data. Health endpoints returning cached results mask real-time failures
33
+ - Mutation functions returning pre-mutation state — dependent scheduling/evaluation uses stale values
34
+ - Fire-and-forget writes: in-memory not updated (stale response) or updated unconditionally (claims unpersisted state). Side effects (rewards, notifications, uploads) before confirmed primary write. Monotonic counters advancing before write risks running ahead on failure
35
+ - Error/early-exit paths returning default status metadata (hasMore, pagination) or emitting events unconditionally — false success. Paired lifecycle events: every "started" exit path must emit "completed"/"failed" — watch short-circuit branches
36
+ - Missing `await` on async ops in error/cleanup paths that must complete before function returns
37
+ - `Promise.all` without error handling — partial load. `Promise.allSettled` without logging rejection reasons before mapping fallbacks
38
+ - Sequential processing where one throw aborts remaining — wrap per-item in try/catch
39
+ - Side effects during React render
40
+
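The item about sequential processing where one throw aborts the remaining work can be sketched as below. The `processEach` helper and `BatchResult` shape are illustrative assumptions; the sketch wraps each item in its own try/catch and returns the per-item failures instead of only logging them.

```typescript
interface BatchResult<T> {
  succeeded: T[];
  failures: { item: T; reason: string }[];
}

// Processes every item even when some fail, and reports failures to the caller.
function processEach<T>(items: T[], handle: (item: T) => void): BatchResult<T> {
  const result: BatchResult<T> = { succeeded: [], failures: [] };
  for (const item of items) {
    try {
      handle(item);
      result.succeeded.push(item);
    } catch (err) {
      // One bad item must not abort the rest of the batch.
      const reason = err instanceof Error ? err.message : String(err);
      result.failures.push({ item, reason });
    }
  }
  return result;
}
```

Returning the failures lets the caller decide whether the overall operation counts as a success, rather than reporting success while partial work was silently skipped.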
41
+ ### Error Handling
42
+
43
+ - Service functions throwing generic Error for client conditions — 500 instead of 400/404. Consistent access-control responses across endpoints. Concurrency failures → 409 not 500
44
+ - Swallowed errors; generic messages replacing detail; external service wrappers returning null for all failures (collapsing config errors, auth, rate limits, 5xx into "not found")
45
+ - Caller/callee disagreement: `{ success: false }` vs `.catch()`; gate returning `{ shouldRun: false }` on error vs fail-open runtime; argument shape mismatches (wrapped object vs bare array, wrong positional order); async EventEmitter handlers creating unhandled rejections
46
+ - Destructive ops in retry/cleanup paths without own error handling
47
+ - External service calls without configurable timeouts
48
+ - Missing fallback for unavailable downstream services
49
+
50
+ ### Resource Management
51
+
52
+ - Event listeners, sockets, subscriptions, timers, useEffect cleaned up on teardown
53
+ - Delete/destroy leaving orphaned secondary resources (data dirs, branches, child records, temp files). Over-broad preservation guards preventing cleanup when nothing worth preserving (branch preserved with 0 commits ahead). Cleanup with implicit mutations (auto-merge, auto-commit) — abort on prerequisite failure
54
+ - Initialization functions without guard against multiple calls — creates duplicates
55
+ - Self-rescheduling callbacks where error before re-registration permanently stops schedule — use try/finally
56
+
57
+ ### Validation & Consistency
58
+
59
+ - Breaking changes to public API without version bump or deprecation path
60
+ - Backward-incompatible changes (renamed config keys, file formats, schemas, event payloads, routes, persisted data) without migration or fallback. Route renames need redirects
61
+ - One-time migrations without completion guard — re-execute every startup
62
+ - Data migrations silently changing runtime behavior — preserve execution semantics. Unsupported source values must be flagged, not defaulted
63
+ - Update endpoints with field allowlists not covering new model fields
64
+ - New endpoints not matching validation patterns of existing similar ones. API doc schemas must be structurally complete
65
+ - Summary/aggregation endpoints using different filters/sources than detail views they link to
66
+ - Discovery endpoints must validate against consumer's actual supported set. Identifier transformations between producer and consumer must preserve expected format
67
+ - Validation functions introduced for a field: trace ALL write paths. New branches must apply same validation as siblings
68
+ - Stored config merged with shallow spread — nested objects lose new default keys on upgrade. Use deep merge
69
+ - Schema fields accepting values downstream can't handle. Validated params never consumed (dead API surface). `.partial()` on nested schemas: verify nested objects also partial. `.partial()` with `.default()` silently overwrites persisted values on update
70
+ - Multi-part UI gated on different prop subsets — derive single enablement boolean
71
+ - Entity creation without case-insensitive uniqueness
72
+ - Code reading response properties that don't exist — verify field names, nesting, actual response shape. Wrappers that don't request/forward needed fields. Call sites using wrong function variant for input format or wrong positional argument order
73
+ - Data model fields with different names per write path. Entity identity keys inconsistent across lookup paths
74
+ - Entity type changes without revalidating type-specific invariants and clearing old-type fields
75
+ - Config flag invariants (A implies B) not enforced across all layers: UI toggles, API validation, server defaults, persistence
76
+ - Operations scoped to entity subtype without verifying discriminator — wrong type corrupts state
77
+ - Inconsistent "missing value" semantics (null vs empty string vs whitespace) across layers. Validation returning null when null means "clear" downstream. Normalization applied inconsistently between write and comparison paths
78
+ - Validation delegating to runtime computation — conflating "no result in window" with "invalid input"
79
+ - Numeric strings without NaN/type guards. Hand-rolled regex for well-known formats — use platform parsers
80
+ - UI hidden from nav but accessible via direct URL
81
+ - Summary counters missing edge cases; counters incremented before confirming state change; batch ops reporting success while logging per-item failures
82
+
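The shallow-spread item in the checklist above is easy to demonstrate. The following sketch assumes JSON-like config objects (plain objects, arrays, and primitives only); `deepMerge` and `isPlainObject` are hypothetical helpers written for this example, not functions from the package.

```typescript
type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Merges stored config over defaults, recursing into nested objects so that
// default keys added in a newer version survive the merge.
function deepMerge(defaults: PlainObject, stored: PlainObject): PlainObject {
  const merged: PlainObject = { ...defaults };
  for (const key of Object.keys(stored)) {
    if (key === "__proto__") continue; // avoid prototype pollution from stored data
    const d = defaults[key];
    const s = stored[key];
    merged[key] = isPlainObject(d) && isPlainObject(s) ? deepMerge(d, s) : s;
  }
  return merged;
}
```

With defaults `{ retry: { attempts: 3, backoffMs: 500 } }` and stored `{ retry: { attempts: 5 } }`, a shallow spread yields `retry: { attempts: 5 }` and drops `backoffMs`, while the deep merge keeps both keys.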
83
+ ### Concurrency & Data Integrity
84
+
85
+ - Shared mutable state without locking; read-modify-write interleaving — use conditional writes/optimistic concurrency
86
+ - Read-only paths triggering lazy init with write side effects — unprotected concurrent writes
87
+ - Multi-table writes without transaction — partial state on error
88
+ - Writes replacing entire composite attribute populated by multiple sources — discards other sources' data
89
+ - Early returns for "no primary fields" skipping secondary operations
90
+ - Shared flags/locks with exit paths that skip cleanup — permanent lock
91
+
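For the last item above, shared flags or locks whose exit paths skip cleanup, a common remedy is to release in a `finally` block. This is a minimal single-process sketch; the `Lock` shape and `withLock` name are assumptions, and it does not address cross-process or distributed locking.

```typescript
interface Lock {
  held: boolean;
}

type LockResult<T> = { ran: false } | { ran: true; value: T };

// Runs `work` only if the lock is free, and releases the lock on every exit
// path: normal return, early return inside `work`, or a thrown error.
function withLock<T>(lock: Lock, work: () => T): LockResult<T> {
  if (lock.held) return { ran: false }; // another caller is already running
  lock.held = true;
  try {
    return { ran: true, value: work() };
  } finally {
    lock.held = false; // always cleared, so a failure cannot leave a permanent lock
  }
}
```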
92
+ ### Cross-File Deep Checks
93
+
94
+ **Cross-file consistency**
95
+ - New functions following existing patterns must match ALL aspects (validation, error codes, response shape, cleanup). Partial copying is #1 review feedback source
96
+ - New API client functions must use same encoding/escaping as existing ones
97
+ - New endpoints must be wired in all runtime adapters (serverless, framework routes, gateway)
98
+ - New external service calls must use established mock/test infrastructure
99
+ - New UI consumers against existing APIs: verify every field name, nesting, identifier, response envelope matches actual producer response
100
+ - Discovery/catalog endpoints: trace enumerated set against consumer's supported inputs
101
+
102
+ **Cleanup/teardown side effects**
103
+ - Cleanup functions with implicit mutations (auto-merge, auto-commit, cascade writes) — verify abort on prerequisite failure
104
+
105
+ **Specification conformance**
106
+ - Parsers for well-known formats (cron, dates, URLs, semver): verify boundary handling matches spec — field ranges, normalization, step/range semantics
107
+
108
+ **Temporal context**
109
+ - Timezone-aware logic alongside non-timezone-aware in same flow — mixed contexts trigger on wrong day/hour
110
+
111
+ **Boolean/type fidelity through serialization**
112
+ - Boolean flags persisted to text (markdown metadata, query strings, flat files): trace write → storage → read → consumption. `"false"` is truthy — verify strict equality at all consumption sites
113
+
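To make the `"false"` is truthy point above concrete, here is a small sketch of parsing a boolean flag read back from text storage. The `parseBooleanFlag` name and the choice to return `undefined` for unrecognized values are assumptions for illustration.

```typescript
// Parses a boolean that was serialized to text (query string, markdown
// metadata, flat file). Any non-empty string, including "false", is truthy
// in JavaScript, so compare against the exact expected strings instead.
function parseBooleanFlag(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const normalized = raw.trim().toLowerCase();
  if (normalized === "true") return true;
  if (normalized === "false") return false;
  return undefined; // unrecognized value: let the caller reject or default
}
```

A consumption site written as `if (flag)` on the raw string would treat `"false"` as enabled, which is the failure mode the checklist asks reviewers to trace from write through read.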
114
+ **Cross-layer invariant enforcement**
115
+ - Config flag invariants (A implies B): trace through UI toggles, form submission, route validation, server defaults, persistence round-trip
116
+
117
+ **Error path completeness**
118
+ - Each error reaches user with helpful message and correct HTTP status. Multi-step operations track per-item failures separately from overall success
119
+
120
+ **Concurrency under user interaction**
121
+ - Optimistic updates with async: second action while first in-flight — rollback/success handlers can clobber concurrent state or close over stale snapshots
122
+
123
+ **State ownership across boundaries**
124
+ - Child component local state from parent data: trace ownership, propagation back to parent, unmount/remount stale cache
125
+
126
+ **Bootstrap/initialization ordering**
127
+ - Resilience code (installers, auto-repair, migrations) importing dependencies before installing them — restructure so install precedes resolution
128
+
129
+ **Lock/flag exit-path completeness**
130
+ - Shared flags/locks: trace every exit path (early returns, catches, platform guards, normal completion) for clearing
131
+
132
+ **Operation-marker ordering**
133
+ - Completion markers, success flags written AFTER the operation, not before. Marker-dependent startup validates contents, not just presence
134
+
135
+ **Real-time event vs response timing**
136
+ - Push events (WS, SSE) before HTTP response that gives clients context to interpret them (IDs, version numbers)
137
+
138
+ **Paired lifecycle event completeness**
139
+ - "Started" event → every exit path (success, error, early return, no-op branches for specific entity types) emits "completed"/"failed"
140
+
141
+ **Entity identity key consistency**
142
+ - Computed lookup keys (e.g., `e.id || e.externalId`): trace all paths using same computation — inconsistent keys cause mismatches
143
+
144
+ **Intent vs implementation (cross-file)**
145
+ - Cross-references between files (identifiers, param names, format conventions, versions, thresholds) that disagree — trace all references when one changes. Internal identifiers renamed when concept renamed
146
+ - Modified values referenced in other files: trace all cross-references
147
+ - Responsibility relocated from one module to another: trace all dependents at old location (guards, return values, state updates). Remove dead code at old location
148
+
149
+ **Transactional write integrity**
150
+ - Multi-item writes: condition expressions preventing stale-read races (TOCTOU). Update ops that silently create records for invalid IDs (DynamoDB UpdateItem, MongoDB upsert) — add existence conditions. Caught conditional failures → 409 not 500
151
+
152
+ **Batch/paginated consumption**
153
+ - Batch API callers handle partial results, continuation tokens, rate limits with backoff. Resource names account for environment prefixes
154
+
155
+ **Data model vs access pattern**
156
+ - Claims of ordering ("recent", "top") verified against key/index design — random UUIDs require full scans
157
+
158
+ **Deletion/lifecycle cleanup**
159
+ - Delete functions: trace all lifecycle resources. State resets: clear individual contributing records — stale records block re-entry
160
+
161
+ **Update schema depth**
162
+ - Update schemas from create (`.partial()`): nested objects must also be partial
163
+
164
+ **Mutation return value freshness**
165
+ - Returned entity reflects post-mutation state. Force/trigger operations reset dependent scheduling state
166
+
167
+ **Read-after-write consistency**
168
+ - Writes then immediate scans/aggregations: check store's consistency model. Compute from in-memory state or use consistent-read options
169
+
170
+ **Multi-source data aggregation**
171
+ - Items from multiple sources: retain source identifier through aggregation for downstream routing
172
+
173
+ **Field-set enumeration consistency**
174
+ - Operations targeting field sets: trace every other enumeration (UI predicates, filters, docs, tests) — prefer single source of truth
175
+
176
+ **Abstraction layer fidelity**
177
+ - Wrappers requesting all fields handlers depend on — third-party APIs often require opt-in. Mutually exclusive params: strip conflicts. Framework function variants match input format. Positional args match called function's parameter order
178
+
179
+ **Parameter consumption tracing**
180
+ - Validated params: trace to actual consumption. Unread params create dead API surface — wire through or remove
181
+
182
+ **Summary/aggregation consistency**
183
+ - Dashboard counts vs detail views: same filters, ordering. Navigation links propagate aggregated context
184
+
185
+ **Data model / status lifecycle**
186
+ - Changed statuses/enums: sweep API docs, UI filters, conditional rendering, routes, tests. Renamed concepts: trace all manifestations (routes, components, variables, CSS, tests)
187
+
188
+ **Type-discriminated entities**
189
+ - Discriminator changes: trace all code paths (migration, bulk, UI type-switchers) — verify downstream branching handles all transitions
190
+
191
+ **Migration idempotency**
192
+ - Startup migrations: verify second run is no-op. Condition excludes already-migrated records
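
As an illustration of the idempotency check, here is a minimal sketch of a migration whose guard condition excludes already-migrated records. The record shape and the `schemaVersion` marker field are assumptions made for this example, not part of the package.

```javascript
// Hypothetical in-memory records; `schemaVersion` is an assumed marker field.
function migrateRecords(records) {
  let migrated = 0;
  for (const record of records) {
    // The guard excludes already-migrated records, so a second run changes nothing.
    if (record.schemaVersion >= 2) continue;
    record.fullName = `${record.first} ${record.last}`;
    delete record.first;
    delete record.last;
    record.schemaVersion = 2;
    migrated += 1;
  }
  return migrated;
}

const records = [
  { first: 'Ada', last: 'Lovelace' },
  { fullName: 'Alan Turing', schemaVersion: 2 },
];
const firstRun = migrateRecords(records);  // migrates only the first record
const secondRun = migrateRecords(records); // finds nothing left to migrate
```

A reviewer can apply the same idea to a real migration by asking what the guard condition evaluates to on data the migration has already touched.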
193
+
194
+ **Data migration semantics**
195
+ - Migrated fields preserve behavioral meaning. Concurrency protection for read-triggered migrations. Unsupported source values flagged not defaulted
196
+
197
+ **Dependent operation ordering**
198
+ - Side effects only after primary operation confirms success. `Promise.all` grouping sequential deps. Resource allocation before gate operations (locks, validation)
199
+
200
+ **Bulk vs single-item parity**
201
+ - Single-item CRUD changes: trace corresponding bulk operation — verify same fields, validation, secondary data
202
+
203
+ **Bulk selection lifecycle**
204
+ - Selection must be cleared on data refresh/deletion; flag selections that persist across filter/sort/page changes when they should be cleared. Re-validate at execution time after confirmation dialog
205
+
206
+ **Config auto-upgrade provenance**
207
+ - Auto-upgrade logic: distinguish user customization from previous default — without provenance, overwrites intentional customizations
208
+
209
+ **Query key / stored key alignment**
210
+ - Lookup key precision/encoding/format matching write path — mismatches return zero matches
211
+
212
+ **Subprocess condition detection**
213
+ - Subprocess output parsed to detect conditions: check both stdout and stderr plus exit code — location varies by tool version
214
+
215
+ **Formatting consistency**
216
+ - New content matches file's existing indentation, bullets, headings, structure
217
+
218
+ ## Output Format
219
+
220
+ For each finding:
221
+ ```
222
+ file:line — [CRITICAL|IMPROVEMENT|UNCERTAIN] description
223
+ Cross-file trace: file_a:line → file_b:line (what flows between them)
224
+ Evidence: `quoted code from each file`
225
+ ```
226
+
227
+ Only report verified findings with cross-file evidence. If the trace is uncertain, mark [UNCERTAIN].
@@ -0,0 +1,76 @@
1
+ # Security Audit Review Agent
2
+
3
+ ## Mandate
4
+ You review code with an adversarial mindset. Find trust boundary violations, injection vectors, data exposure, and access control gaps. Focus on security concerns that a general code reviewer would deprioritize.
5
+
6
+ ## Reading Strategy
7
+ For each changed file:
8
+ 1. Read the **ENTIRE file** (not just diff hunks)
9
+ 2. Identify all trust boundaries: client → server, user → system, external → internal
10
+ 3. Trace user/external input from entry to consumption within the file
11
+ 4. For cross-file security flows (input entering one file, consumed unsafely in another), flag as [NEEDS-TRACE] so the cross-file agent can verify
12
+
13
+ ## Checklist
14
+
15
+ ### Injection & URL Safety
16
+
17
+ - User/system values interpolated into URL paths, shell commands, file paths, subprocess args, or dynamically evaluated code (eval, CDP evaluate, new Function, template strings in page context) without encoding/escaping — use `encodeURIComponent()` for URLs, regex allowlists for execution boundaries, `JSON.stringify()` for eval'd code. Generated identifiers in URL segments must be safe (no `/`, `?`, `#`). Slugs for namespaced resources (branches, directories) need unique suffix to prevent collisions
18
+ - Server-side HTTP requests using user-configurable or externally-stored URLs without protocol allowlisting (http/https) and host restrictions — SSRF to internal services, metadata endpoints, localhost APIs. Check redirect handling: auto-follow (`redirect: 'follow'`) bypasses initial validation when redirecting to internal IPs. Resolve DNS and block private/loopback/link-local ranges — public hostnames can resolve to internal IPs via DNS rebinding
19
+ - Error/fallback responses hardcoding security headers instead of using centralized policy — error paths bypass tightening
20
+
21
+ ### Trust Boundaries & Data Exposure
22
+
23
+ - API responses returning full objects with sensitive fields — destructure and omit across ALL paths (GET, PUT, POST, error, socket). Comments claiming data isn't exposed while the code does expose it
24
+ - Server trusting client-provided computed/derived values (scores, totals, correctness flags, file metadata like MIME type and size) — strip and recompute server-side. Validate uploads via magic bytes and buffer length, not headers
25
+ - New endpoints under restricted paths (admin, internal) missing authorization — compare with sibling endpoints for same access gate (role check, scope validation). New OAuth scopes must be checked comprehensively — a check testing only one scope misses newly added scopes
26
+ - User-controlled objects merged via `Object.assign`/spread without sanitizing keys — `__proto__`, `constructor`, `prototype` enable prototype pollution. Use `Object.create(null)`, whitelist keys, use `hasOwnProperty` not `in`
27
+ - Push events (WebSocket, SSE, pub/sub) emitted without scoping to originating user/session — sensitive payloads leak to all connected clients. Scope via room/channel isolation or server-side correlation ID
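
The prototype-pollution item in the list above can be sketched roughly as follows. `safeMerge`, `BLOCKED_KEYS`, and the payload are hypothetical names chosen for this example. It combines a key allowlist with a prototype-less target object.

```javascript
// Keys that should never be copied from untrusted input.
const BLOCKED_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

// Copies only allowlisted own keys from `untrusted` onto `target`.
function safeMerge(target, untrusted, allowedKeys) {
  for (const key of Object.keys(untrusted)) {
    if (BLOCKED_KEYS.has(key)) continue;
    if (!allowedKeys.includes(key)) continue;
    target[key] = untrusted[key];
  }
  return target;
}

// JSON.parse creates "__proto__" as an ordinary own property of the payload.
const payload = JSON.parse(
  '{"theme":"dark","__proto__":{"isAdmin":true},"role":"admin"}'
);
// Object.create(null) gives a target with no prototype chain to tamper with.
const settings = safeMerge(Object.create(null), payload, ['theme', 'locale']);
```

Here `settings` ends up with only `theme`. Both the `__proto__` key and the non-allowlisted `role` key are dropped.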
28
+
29
+ ### Input Handling
30
+
31
+ - Trimming values where whitespace is significant (API keys, tokens, passwords, base64) — only trim identifiers/names
32
+ - Endpoints accepting unbounded arrays without upper limits — enforce max size. Validate element types/format, deduplicate to prevent inflated counts/repeated side effects. Internal operations fanning out unbounded parallel I/O risk EMFILE — use concurrency limiters
33
+ - Security/sanitization functions handling only one input format when data arrives in multiple formats (JSON, shell env, URL-encoded, headers) — sensitive data leaks through unhandled format
34
+
35
+ ### Hand-rolled Validators
36
+
37
+ - Hand-rolled regex for well-known formats (IP addresses, email, URLs, dates, semver) that accept invalid inputs — use platform parsers (`net.isIP()`, `URL` constructor, `semver.valid()`)
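
As a small illustration of preferring a platform parser over a hand-rolled regex, the sketch below uses the built-in `URL` constructor. `parseHttpUrl` is a hypothetical helper name. It only checks syntax and protocol, so it does not replace the host/IP restrictions described under SSRF.

```javascript
// Returns a URL object for http(s) input, or null for anything else.
function parseHttpUrl(input) {
  let url;
  try {
    url = new URL(input); // throws TypeError on unparseable input
  } catch {
    return null;
  }
  return url.protocol === 'http:' || url.protocol === 'https:' ? url : null;
}

const ok = parseHttpUrl('https://example.com/a?b=1');
const scriptUrl = parseHttpUrl('javascript:alert(1)'); // parses, but wrong protocol
const garbage = parseHttpUrl('not a url');             // does not parse at all
```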
38
+
39
+ ### Security Deep Checks
40
+
41
+ **Push/real-time event scoping**
42
+ - If the PR adds or modifies WebSocket, SSE, or pub/sub events: does the event reach only the originating session, or all clients? Check payloads for sensitive content. Verify correlation IDs are server-generated or validated against session
43
+
44
+ **Access scope changes**
45
+ - If the PR widens access (admin → public, internal → external): trace shared dependencies (rate limiters, queues, pools) — were they sized for the previous access level? Process-local limiters don't enforce across instances
46
+ - New endpoints under restricted route groups: verify same authorization gate as siblings — missing gates on admin-mounted endpoints are the most dangerous finding
47
+
48
+ **Data flow audit**
49
+ - For secrets/tokens: trace input → storage → retrieval → response. Verify never leaked in ANY response path
50
+ - For user input → URL/command interpolation: verify encoding/escaping at every boundary
51
+
52
+ **Sanitization/validation coverage**
53
+ - If a new validation function is introduced for a field: trace ALL write paths (create, update, import, sync, bulk) — partial application means invalid data re-enters through unguarded paths
54
+ - If a "raw" or bypass write path is added: compare normalization against what the read/parse path assumes — data through raw path must be valid on reload
55
+ - If a new dispatch branch is added within a multi-type handler: verify equivalent validation as sibling branches
56
+
57
+ **Security-sensitive configuration parsing**
58
+ - Env vars/config affecting security (proxy trust, rate limits, CORS, token expiry): verify type and range enforcement. `Number()` accepts floats, negatives, empty-string-as-zero — use `parseInt` + `Number.isInteger` + range checks with logged safe defaults
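
One possible shape for the type and range enforcement described above is sketched here. `parseBoundedInt` and its option names are assumptions for illustration. The strict pattern check is added because `parseInt` alone would accept trailing garbage such as `'12abc'`.

```javascript
// Parses a config string as a base-10 integer within [min, max].
// Falls back to a safe default (and logs) on anything else.
function parseBoundedInt(raw, { min, max, fallback }) {
  const text = String(raw ?? '').trim();
  const looksLikeInteger = /^-?\d+$/.test(text); // rejects '', '1.5', '1e3', '12abc'
  const value = looksLikeInteger ? Number.parseInt(text, 10) : NaN;
  if (!Number.isInteger(value) || value < min || value > max) {
    console.warn(`invalid integer config value; using default ${fallback}`);
    return fallback;
  }
  return value;
}

const limits = { min: 1, max: 1000, fallback: 100 };
const fromEmpty = parseBoundedInt('', limits);     // Number('') would have given 0
const fromFloat = parseBoundedInt('1.5', limits);
const fromNegative = parseBoundedInt('-3', limits);
const fromValid = parseBoundedInt('50', limits);
```

Only the last call returns the parsed value. The other three fall back to the default instead of silently becoming `0`, a float, or a negative limit.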
59
+
60
+ **Guard-before-cache ordering**
61
+ - Pre-flight guards (rate limit, quota, feature flag) before cache lookup: verify the guard doesn't block operations served from cache without touching the guarded resource
62
+
63
+ **Server-side fetch lifecycle**
64
+ - Server-side HTTP requests to user/external URLs: trace initial validation → DNS resolution → connection → redirect handling. Host/IP restrictions must be enforced on each redirect hop and after DNS resolution
65
+
66
+ ## Output Format
67
+
68
+ For each finding:
69
+ ```
70
+ file:line — [CRITICAL|IMPROVEMENT|NEEDS-TRACE] description
71
+ Evidence: `quoted code line(s)`
72
+ Attack scenario: brief exploitation description
73
+ ```
74
+
75
+ Security findings default to CRITICAL unless exploitation requires unlikely preconditions.
76
+ Use [NEEDS-TRACE] for cross-file security flows that require the cross-file agent to verify.
@@ -0,0 +1,161 @@
1
+ # Surface Scan Review Agent
2
+
3
+ ## Mandate
4
+ You review code for per-file correctness: bugs, quality issues, and convention violations visible within a single file. You do NOT trace call chains or data flows across files — another agent handles cross-file analysis.
5
+
6
+ ## Reading Strategy
7
+ For each changed file, read the **ENTIRE file** (not just diff hunks). New code interacting incorrectly with existing code in the same file is a common bug source. Review one file at a time.
8
+
9
+ ## Principles to Evaluate
10
+
11
+ **YAGNI** — Flag abstractions, config options, parameters, or extension points that serve no current use case. Unnecessary wrapper functions, premature generalization (factory producing one type), unused feature flags.
12
+
13
+ **Naming** — Functions and variables should communicate intent without reading the implementation. Booleans should read as predicates (`isReady`, `hasAccess`), not ambiguous nouns.
14
+
15
+ ## Checklist
16
+
17
+ ### Always Check — Runtime & Hygiene
18
+
19
+ **Hygiene**
20
+ - Leftover debug code (`console.log`, `debugger`, TODO/FIXME/HACK), hardcoded secrets/credentials, uncommittable files (.env, node_modules, build artifacts, runtime-generated data/reports)
21
+ - Overly broad changes that should be split into separate PRs
22
+
23
+ **Imports & references**
24
+ - Every symbol used is imported (missing → runtime crash); no unused imports. Also check references to framework utilities (CSS class names, directive names, component props) — a non-existent utility class or prop name silently does nothing
25
+
26
+ **Runtime correctness**
27
+ - Null/undefined access without guards; off-by-one errors; spread of null (is `{}`), spread of non-objects (string → indexed chars, array → numeric keys) — guard with plain-object check before spreading
28
+ - External/user data (parsed JSON, API responses, file reads) used without structural validation — guard parse failures, missing properties, wrong types, null elements. Optional enrichment failures should not abort the main operation
29
+ - Type coercion: `Number('')` is `0` not empty; `0` is falsy in truthy checks; `NaN` comparisons always false; `"10" < "2"` (lexicographic). Deserialized booleans: `"false"` is truthy — use `=== 'true'`. `isinstance(x, int)` accepts `bool` in Python; `typeof NaN === 'number'` in JS
30
+ - Indexing empty arrays; `every`/`some`/`reduce` on empty collections returning vacuously true; declared-but-never-updated state/variables
31
+ - Parallel arrays coupled by index position — use objects/maps keyed by stable identifier
32
+ - Shared mutable references: module-level defaults mutated across calls (use `structuredClone()`); `useCallback`/`useMemo` referencing later `const` (temporal dead zone); spread followed by unconditional assignment clobbering spread values
33
+ - Functions with >10 branches or >15 cyclomatic complexity — refactor
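
Several of the coercion and empty-collection pitfalls listed above can be checked directly in JavaScript. The snippet below is just a quick demonstration. `coercionFacts` and `parseBooleanFlag` are names invented for this example.

```javascript
// Each property records what JavaScript actually evaluates for a listed pitfall.
const coercionFacts = {
  emptyStringIsZero: Number('') === 0,
  stringCompareIsLexicographic: '10' < '2',
  falseStringIsTruthy: Boolean('false'),
  nanHasNumberType: typeof NaN === 'number',
  nanEqualsItself: NaN === NaN,
  everyOnEmptyArray: [].every((n) => n > 0),
  spreadOfNullKeys: Object.keys({ ...null }),
  spreadOfStringKeys: Object.keys({ ...'ab' }),
};

// Deserialized booleans should be compared against the literal string.
function parseBooleanFlag(raw) {
  return raw === 'true';
}
```

Everything except `nanEqualsItself` evaluates to `true` or to a key list. `spreadOfNullKeys` is empty, while `spreadOfStringKeys` contains the character indices `'0'` and `'1'`.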
34
+
35
+ **API route basics**
36
+ - Route params passed to services without format validation; path containment using string prefix without separator boundary (use `path.relative()`)
37
+ - Parameterized/wildcard routes registered before specific named routes (`/:id` before `/drafts` matches `/drafts` as `id="drafts"`)
38
+ - Stored or external URLs rendered as clickable links without protocol validation — allowlist `http:`/`https:`
39
+
40
+ **Error handling (single-file)**
41
+ - Swallowed errors (empty `.catch(() => {})`); error handlers that exit cleanly (`exit 0`, `return`) without user-visible output; handlers replacing detailed failure info with generic messages
42
+
43
+ ### Domain-Specific (check only when file type matches)
44
+
45
+ **SQL & database** _[SQL, ORM, migration files]_
46
+ - Parameterized query placeholder indices vs parameter array positions
47
+ - DB triggers clobbering explicit values; auto-increment only on INSERT not UPDATE
48
+ - Full-text search with strict parsers (`to_tsquery`) on user input — use `plainto_tsquery`
49
+ - Dead queries (results never read); N+1 patterns; O(n²) on growing data
50
+ - Performance optimizations (early exits, capped limits) that silently reduce correctness
51
+ - `CREATE TABLE IF NOT EXISTS` as sole migration — won't add columns. Use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS`
52
+ - Functions/extensions requiring unchecked database versions
53
+ - Migrations locking tables (ADD COLUMN with default, CREATE INDEX without CONCURRENTLY)
54
+ - Missing rollback/down migration
55
+
56
+ **Sync & replication** _[pagination, batch APIs, data sync]_
57
+ - Upsert/`ON CONFLICT UPDATE` updating only subset of exported fields — replicas diverge
58
+ - Pagination: `COUNT(*)` (full scan) instead of `limit + 1`; missing `next` token; hard-capped limits truncating silently; store applying limits before filters requiring loop with continuation tokens
59
+ - Pagination cursors from last scanned vs last returned item — trimmed results cause permanent skips
60
+ - Batch API calls not handling partial results — unprocessed items, continuation tokens dropped
61
+ - Retry loops without backoff or max attempts
62
+
63
+ **Lazy initialization** _[dynamic imports, lazy singletons, bootstrap]_
64
+ - Cached state getters returning null before initialization
65
+ - Module-level side effects (file reads, SDK init) without error handling
66
+ - File writes assuming parent directory exists
67
+ - Bootstrap code importing dependencies it's meant to install — restructure so install precedes resolution
68
+ - Re-exporting from heavy modules defeats lazy loading
69
+
70
+ **Data format portability** _[JSON, DB, IPC, serialization boundaries]_
71
+ - Values changing format across boundaries (arrays in JSON vs strings in DB). Datetime: mixing UTC string ops with local Date methods shifts across timezones; appending 'Z' without verifying source timezone
72
+ - Reads immediately after writes to eventually consistent stores
73
+ - BIGINT → JS Number precision loss past `MAX_SAFE_INTEGER` — use strings or BigInt
74
+ - Key/index design not supporting required query patterns (random UUIDs claiming "recent" ordering)
75
+
76
+ **Shell & portability** _[subprocesses, shell scripts, CLI tools]_
77
+ - `set -e` aborting on non-critical failures; broken pipes on non-critical writes — use `|| true`
78
+ - Interactive prompts in non-interactive contexts (CI, cron) — guard with TTY detection
79
+ - Detached processes with piped stdio — SIGPIPE on parent exit. Use `'ignore'`
80
+ - Subprocess output buffered without size limits — unbounded memory growth
81
+ - Platform-specific: hardcoded shell interpreters; `path.join()` backslashes breaking ESM imports — use `pathToFileURL()`
82
+ - Naive whitespace splitting of command strings breaks quoted arguments — use proper argv parser
83
+ - Subprocess output parsed from single stream (stdout or stderr) to detect conditions — check both streams and exit code
84
+ - Shell expansions suppressed by quoting — single quotes prevent all expansion
85
+
86
+ **Search & navigation** _[search, deep-linking]_
87
+ - Search results linking to generic list pages instead of deep-linking to specific record
88
+ - Search code hardcoding one backend when system supports multiple
89
+
90
+ **Destructive UI** _[delete, reset, revoke actions]_
91
+ - Destructive actions without confirmation step
92
+
93
+ **Accessibility** _[UI components, interactive elements]_
94
+ - Interactive elements missing accessible names, roles, or ARIA states
95
+ - Custom toggles from non-semantic elements instead of native inputs
96
+ - Overlay layers with `pointer-events-auto` intercepting clicks beneath; `pointer-events-none` on parent killing child hover handlers
97
+
98
+ ### Always Check — Quality & Conventions
99
+
100
+ **Intent vs implementation (single-file)**
101
+ - Labels, comments, status messages describing behavior the code doesn't implement
102
+ - Inline code examples or command templates that aren't syntactically valid
103
+ - Sequential numbering with gaps or jumps after edits
104
+ - Template/workflow variables referenced but never assigned — trace each placeholder to a definition
105
+ - Constraints described in preamble not enforced by conditions in procedural steps
106
+ - Duplicate or contradictory items in sequential lists
107
+ - Completion markers or success flags written before the operation they attest to
108
+ - Existence checks (directory exists, file exists) used as proof of correctness — file can exist with invalid contents
109
+ - Lookups checking only one scope when multiple exist (local branches but not remote)
110
+ - Tracking/checkpoint files defaulting to empty on parse failure — fail-open guards
111
+ - Registering references to resources without verifying resource exists
112
+
113
+ **AI-generated code quality**
114
+ - New abstractions, wrapper functions, helper files serving only one call site — inline instead
115
+ - Feature flags, config options, extension points with only one possible value
116
+ - Commit messages claiming a fix while the bug remains
117
+ - Placeholder comments (`// TODO`, `// FIXME`) or stubs presented as complete
118
+ - Unnecessary defensive code for scenarios that provably cannot occur
119
+
120
+ **Configuration & hardcoding**
121
+ - Hardcoded values when config/env var exists; dead config fields; unused function parameters
122
+ - Duplicated config/constants/helpers across modules — extract to shared module. Watch for behavioral inconsistencies between copies
123
+ - CI pipelines without lockfile pinning or version constraints
124
+ - Production code paths with no structured logging at entry/exit
125
+ - Error logs missing reproduction context (request ID, input params)
126
+ - Async flows without correlation ID propagation
127
+
128
+ **Supply chain & dependencies**
129
+ - Lockfile committed and CI uses `--frozen-lockfile`; no drift from manifest
130
+ - `npm audit` / `cargo audit` / `pip-audit` — no unaddressed HIGH/CRITICAL vulns
131
+ - No `postinstall` scripts from untrusted packages executing arbitrary code
132
+ - Overly permissive version ranges (`*`, `>=`) on deps with breaking-change history
133
+
134
+ **Test coverage**
135
+ - New logic/schemas/services without tests when similar existing code has tests
136
+ - New error paths untestable because services throw generic errors
137
+ - Tests re-implementing logic under test instead of importing real exports — pass even when real code regresses. Tests asserting by inspecting source code strings rather than calling functions
138
+ - Tests depending on real wall-clock time or external dependencies
139
+ - Missing tests for trust-boundary enforcement
140
+ - Tests exercising code paths the integration layer doesn't expose — pass against mocks but untriggerable in production
141
+ - Test mock state leaking between tests — "clear" resets invocation counts but not configured behavior; use "reset" variants
142
+
143
+ **Automated pipeline discipline**
144
+ - Internal code review must run before creating PRs — never go straight from "tests pass" to PR
145
+ - Copilot review must complete before merging
146
+ - Automated agent output must be reviewed against project conventions
147
+
148
+ **Style & conventions**
149
+ - Naming and patterns inconsistent with rest of codebase
150
+ - New content not matching existing indentation, bullet style, heading levels
151
+ - Shell instructions with destructive operations not verifying preconditions first
152
+
153
+ ## Output Format
154
+
155
+ For each finding:
156
+ ```
157
+ file:line — [CRITICAL|IMPROVEMENT|UNCERTAIN] description
158
+ Evidence: `quoted code line(s)`
159
+ ```
160
+
161
+ Only report verified findings with quoted code evidence. If you cannot quote specific code for a finding, mark as [UNCERTAIN].
@@ -106,6 +106,13 @@
106
106
  - `UserInterfaceIdiom` checks without handling `.mac` / `.pad` / `.vision` appropriately
107
107
  - Touch-specific gestures (drag, long press) without pointer/hover alternatives for macOS
108
108
  - Missing `#if targetEnvironment(macCatalyst)` handling when running iPad apps on Mac
109
+ - **macOS window lifecycle (App Store Guideline 4):**
110
+ - Missing `NSApplicationDelegate` with `applicationShouldTerminateAfterLastWindowClosed` returning `false` — app quits when the user closes the window instead of staying in the Dock
111
+ - Missing `applicationShouldHandleReopen(_:hasVisibleWindows:)` — clicking the Dock icon or Finder menu does nothing when the main window is closed
112
+ - `WindowGroup` without a stable `id:` parameter (e.g., `WindowGroup(id: "main")`) — prevents programmatic window reopening via `openWindow(id:)`
113
+ - Missing "Show Main Window" menu command (typically Cmd+0) in the Window menu — users have no way to recover the main window from the menu bar
114
+ - Missing `reopenWindow` closure bridge between `NSApplicationDelegate` and SwiftUI's `@Environment(\.openWindow)` — AppKit delegate can't create new SwiftUI windows
115
+ - Menu bar commands (File > New, Edit actions) that don't ensure the main window is visible before acting — commands fire but the user sees nothing
109
116
 
110
117
  **Data persistence** _[applies when: code uses Core Data, SwiftData, or file storage]_
111
118
  - `@FetchRequest` or `@Query` without sort descriptors — undefined ordering across launches
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "slash-do",
3
- "version": "2.2.0",
3
+ "version": "2.4.0",
4
4
  "description": "Curated slash commands for AI coding assistants — Claude Code, OpenCode, Gemini CLI, and Codex",
5
5
  "author": "Adam Eivy <adam@eivy.com>",
6
6
  "license": "MIT",
@@ -49,7 +49,7 @@ function inlineLibContent(body, libDir) {
49
49
  function toYamlFrontmatter(fm) {
50
50
  const lines = ['---'];
51
51
  for (const [key, val] of Object.entries(fm)) {
52
- lines.push(`${key}: ${val}`);
52
+ lines.push(`${key}: ${JSON.stringify(String(val))}`);
53
53
  }
54
54
  lines.push('---');
55
55
  return lines.join('\n');
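
To see what the changed line does, the function from this hunk is reproduced below as a standalone sketch. The closing brace and the sample frontmatter object are assumptions, since the hunk does not show them. The likely motivation is that values containing `: ` or ` #` are not safe as unquoted YAML scalars, whereas a JSON string literal generally reads as a YAML double-quoted scalar.

```javascript
// Standalone copy of the function as it appears after the change.
function toYamlFrontmatter(fm) {
  const lines = ['---'];
  for (const [key, val] of Object.entries(fm)) {
    // Every value is stringified and then JSON-quoted.
    lines.push(`${key}: ${JSON.stringify(String(val))}`);
  }
  lines.push('---');
  return lines.join('\n');
}

// Hypothetical frontmatter with characters that are risky when unquoted.
const output = toYamlFrontmatter({
  name: 'review',
  description: 'Review: all changes # before PR',
});
```

One consequence of `String(val)` is that non-string values such as booleans or numbers are also emitted as quoted strings, so a consumer expecting native YAML types would need to convert them.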
package/uninstall.sh CHANGED
@@ -33,6 +33,7 @@ OLD_COMMANDS=(cam good makegoals makegood optimize-md)
33
33
  LIBS=(
34
34
  code-review-checklist copilot-review-loop graphql-escaping
35
35
  remediation-agent-template swift-review-checklist
36
+ review-surface-scan review-security-audit review-cross-file-tracing
36
37
  )
37
38
 
38
39
  HOOKS=(slashdo-check-update slashdo-statusline)