npm - general-coding-tools-mcp - Versions diffs - 1.0.9 → 1.1.1 - Mend

general-coding-tools-mcp 1.0.9 → 1.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

package/dist/content.json CHANGED

	@@ -1 +1 @@
1	- {"skills":[{"id":"best-practices-audit","name":"best-practices-audit","hasReference":true},{"id":"correctness-audit","name":"correctness-audit","hasReference":true},{"id":"feature-planning","name":"feature-planning","hasReference":true},{"id":"security-audit","name":"security-audit","hasReference":true},{"id":"systematic-debugging","name":"systematic-debugging","hasReference":false},{"id":"test-deno","name":"test-deno","hasReference":true},{"id":"test-frontend","name":"test-frontend","hasReference":true},{"id":"test-pgtap","name":"test-pgtap","hasReference":true},{"id":"ui-audit","name":"ui-audit","hasReference":true}],"subagents":[{"id":"deep-research","name":"deep-research"},{"id":"update-docs","name":"update-docs"},{"id":"verifier","name":"verifier"}],"content":{"skills":{"best-practices-audit":{"content":"---\nname: best-practices-audit\ndescription: Audits code against named industry standards and coding best practices (DRY, SOLID, KISS, YAGNI, Clean Code, OWASP, etc.). Use when the user asks to check best practices, enforce standards, audit for anti-patterns, review code quality against principles, or ensure code follows industry conventions. Works on git diffs, specific files, or an entire codebase.\n---\n\n# Best Practices Audit\n\nAudit code against established industry standards and named best practices. 
Cite the specific principle violated for every finding so the developer learns which standard applies and why.\n\n## Scope\n\nDetermine what to audit based on user request and context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added code\n- File/directory mode: audit the files or directories the user specifies\n- Codebase mode: when the user explicitly asks for a full codebase audit, scan the project broadly (focus on source code, skip vendor/node_modules/build artifacts)\n\nRead all in-scope code before producing findings.\n\n## Principles to Enforce\n\nEvaluate code against each category. Skip categories with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions and examples of each principle.\n\n### 1. DRY (Don't Repeat Yourself)\n\n- Duplicated logic across functions, components, or modules\n- Copy-pasted code blocks with minor variations\n- Repeated string literals, magic numbers, or config values that should be constants\n- Similar data transformations that could be unified\n\n### 2. SOLID Principles\n\n- S — Single Responsibility: classes/modules/functions doing more than one thing\n- O — Open/Closed: code that requires modification (instead of extension) to add behavior\n- L — Liskov Substitution: subtypes that break the contract of their parent type\n- I — Interface Segregation: interfaces/types forcing implementers to depend on methods they don't use\n- D — Dependency Inversion: high-level modules depending on concrete implementations instead of abstractions\n\n### 3. KISS (Keep It Simple, Stupid)\n\n- Unnecessary complexity or over-engineering\n- Convoluted control flow when a simpler approach exists\n- Abstractions that add indirection without clear value\n- Clever tricks that sacrifice readability\n\n### 4. 
YAGNI (You Ain't Gonna Need It)\n\n- Code for features that don't exist yet and aren't requested\n- Premature generalization or unnecessary configurability\n- Unused parameters, flags, or code paths \"just in case\"\n- Speculative abstractions with a single implementation\n\n### 5. Clean Code (Robert C. Martin)\n\n- Naming: vague, misleading, or inconsistent names; abbreviations that hinder readability\n- Functions: functions longer than ~20 lines; too many parameters (>3); mixed abstraction levels\n- Comments: comments that restate the code; commented-out code; missing comments on why for non-obvious decisions\n- Formatting: inconsistent indentation, spacing, or file organization within the project\n\n### 6. Error Handling Best Practices\n\n- Swallowed exceptions (empty catch blocks)\n- Generic catch-all without meaningful handling\n- Missing error propagation — errors that should bubble up but don't\n- No user-facing feedback on failure\n- Using exceptions for control flow\n\n### 7. Security Standards (OWASP Top 10)\n\n- Unsanitized user input (injection, XSS, path traversal)\n- Broken authentication or session management\n- Sensitive data exposure (secrets in code, insecure storage, unencrypted transmission)\n- Missing access control checks\n- Security misconfiguration (permissive CORS, missing CSP headers)\n- Using components with known vulnerabilities\n\n### 8. Performance Best Practices\n\n- Unnecessary re-renders or re-computations\n- N+1 queries, unbounded result sets, missing pagination\n- Synchronous blocking in async-capable contexts\n- Missing memoization, caching, or debouncing where clearly beneficial\n- Large bundle imports when a smaller alternative exists\n\n### 9. 
Testing Best Practices\n\n- Untested public API surface or critical paths\n- Tests tightly coupled to implementation details\n- Missing edge case coverage for non-trivial logic\n- Flaky patterns (time-dependent, order-dependent, network-dependent tests)\n- Test code that violates DRY without justification\n\n### 10. Code Organization & Architecture\n\n- Circular dependencies between modules\n- Business logic mixed into UI/presentation layers\n- Shared mutable state across module boundaries\n- Inconsistent project structure or file placement conventions\n- Missing or inconsistent use of the project's established patterns\n\n### 11. Defensive Programming\n\n- Missing input validation at system boundaries (API endpoints, user forms, external data)\n- Assumptions about data shape without type guards or runtime checks\n- Missing null/undefined handling where values can realistically be absent\n- No graceful degradation on partial failures\n\n### 12. Separation of Concerns\n\n- Mixed responsibilities in a single file or function (e.g. data fetching + rendering + business logic)\n- Configuration values hardcoded in business logic\n- Platform-specific code leaking into core/shared modules\n- Presentation logic mixed with data transformation\n\n## Output Format\n\nGroup findings by severity. 
Each finding MUST name the specific principle violated.\n\n```\n## Critical\nViolations that will cause bugs, data loss, or security vulnerabilities in production.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.ts` (lines X-Y)\nPrinciple: Full name of the principle and a one-line explanation of what it requires.\nViolation: What the code does wrong and the concrete impact.\nFix: Specific, actionable suggestion.\n\n## Warning\nViolations that degrade maintainability, readability, or robustness.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices but not urgent.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Principles most frequently violated: list the top 2-3\n- Overall assessment: 1-2 sentence verdict on the code's adherence to standards\n```\n\n## Linter Tools\n\nBefore producing findings, always run the available linters on in-scope code to supplement your manual review. Linter output should be incorporated into your findings (cite the linter rule alongside the principle).\n\n### ESLint (TypeScript/React)\n\nRun from the `app/` directory. Config: `app/eslint.config.js` (flat config with TypeScript-ESLint, React Hooks, React Refresh).\n\n```bash\ncd app && npx eslint . # full codebase\ncd app && npx eslint src/path/to/file.ts # specific file(s)\ncd app && npx eslint --fix . # auto-fix what's possible (only with user approval)\n```\n\n### Ruff (Python)\n\nRun from the project root. Config: `ruff.toml` (pycodestyle, pyflakes, isort, pep8-naming, pyupgrade, bugbear, simplify, bandit).\n\n```bash\nruff check scripts/ # all Python scripts\nruff check scripts/wireframe.py # specific file\nruff check --fix scripts/ # auto-fix (only with user approval)\n```\n\n### How to use linter output\n\n1. Run the relevant linter(s) based on which file types are in scope.\n2. For each linter error/warning, map it to the matching principle category (e.g. 
`@typescript-eslint/no-unused-vars` → Clean Code / Naming, `react-hooks/set-state-in-effect` → Performance / React Best Practices, `S101` → Security / OWASP).\n3. Include linter findings in the appropriate severity section. Linter errors that indicate real bugs or security issues go under Critical; style/convention issues go under Suggestion.\n4. If the linter finds no issues for a file type, note \"ESLint: clean\" or \"Ruff: clean\" in the Summary.\n\n## Rules\n\n- Name the principle: every finding must cite the specific standard (e.g. \"DRY\", \"SRP from SOLID\", \"OWASP A03: Injection\"). This is the core value of this skill.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix.\n- Respect scope: only audit what's in scope. In diff mode, only flag issues in changed lines (and their immediate context).\n- Don't duplicate code-quality-review: focus on named principles and standards, not generic bug-hunting. If using both skills, they complement each other.\n- Pragmatism over dogma: a principle violation is only worth flagging if fixing it provides real value. Don't flag trivial or pedantic violations that would add noise.\n- Context matters: consider the project's scale, team size, and existing patterns. A startup prototype has different standards than a production system.\n","reference":"# Best Practices Reference\n\nDetailed definitions, rationale, and code examples for each principle audited by this skill.\n\n## Table of Contents\n\n1. [DRY](#1-dry-dont-repeat-yourself)\n2. [SOLID](#2-solid-principles)\n3. [KISS](#3-kiss-keep-it-simple-stupid)\n4. [YAGNI](#4-yagni-you-aint-gonna-need-it)\n5. [Clean Code](#5-clean-code)\n6. [Error Handling](#6-error-handling)\n7. [Security (OWASP)](#7-security-owasp-top-10)\n8. [Performance](#8-performance)\n9. [Testing](#9-testing)\n10. [Code Organization](#10-code-organization--architecture)\n11. [Defensive Programming](#11-defensive-programming)\n12. 
[Separation of Concerns](#12-separation-of-concerns)\n\n---\n\n## 1. DRY (Don't Repeat Yourself)\n\nSource: The Pragmatic Programmer — Andy Hunt & Dave Thomas (1999)\n\nPrinciple: Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.\n\nWhat it covers: Not just code duplication — also duplicated logic, data definitions, and documentation that can fall out of sync.\n\nBad:\n```ts\n// User validation in registration handler\nif (!email \|\| !email.includes('@')) throw new Error('Invalid email');\nif (!password \|\| password.length < 8) throw new Error('Weak password');\n\n// Same validation repeated in profile update handler\nif (!email \|\| !email.includes('@')) throw new Error('Invalid email');\nif (!password \|\| password.length < 8) throw new Error('Weak password');\n```\n\nGood:\n```ts\nfunction validateCredentials(email: string, password: string) {\n if (!email \|\| !email.includes('@')) throw new Error('Invalid email');\n if (!password \|\| password.length < 8) throw new Error('Weak password');\n}\n```\n\nCaveat: Not all similar-looking code is a DRY violation. Two functions that happen to share structure but serve different purposes and will evolve independently are fine as-is. Premature deduplication can create coupling.\n\n---\n\n## 2. SOLID Principles\n\nSource: Robert C. Martin (aggregated ~2000s, acronym coined by Michael Feathers)\n\n### S — Single Responsibility Principle (SRP)\n\nA class/module should have one, and only one, reason to change.\n\nBad: A `UserService` that handles registration, email sending, and report generation.\nGood: Separate `UserRegistration`, `EmailService`, and `ReportGenerator`.\n\n### O — Open/Closed Principle (OCP)\n\nSoftware entities should be open for extension but closed for modification. 
Add new behavior by adding new code, not changing existing code.\n\nBad: A payment processor with a growing `switch` statement for each new payment method.\nGood: A strategy pattern where each payment method implements a `PaymentProcessor` interface.\n\n### L — Liskov Substitution Principle (LSP)\n\nSubtypes must be substitutable for their base types without altering correctness. If `Square extends Rectangle`, calling `setWidth()` must not break expectations.\n\n### I — Interface Segregation Principle (ISP)\n\nNo client should be forced to depend on methods it does not use. Prefer many small, focused interfaces over one large one.\n\n### D — Dependency Inversion Principle (DIP)\n\nHigh-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details.\n\nBad: `OrderService` directly imports and instantiates `PostgresDatabase`.\nGood: `OrderService` depends on a `Database` interface; the concrete implementation is injected.\n\n---\n\n## 3. KISS (Keep It Simple, Stupid)\n\nSource: U.S. Navy design principle (1960s), widely adopted in software engineering.\n\nPrinciple: Most systems work best if they are kept simple rather than made complicated. Simplicity should be a key goal and unnecessary complexity should be avoided.\n\nCommon violations:\n- Replacing a simple `if/else` with a factory + strategy + registry pattern for two cases\n- Using metaprogramming/reflection when straightforward code works\n- Creating deep inheritance hierarchies when composition or plain functions suffice\n- Writing a custom solution for something the language/framework already provides\n\n---\n\n## 4. 
YAGNI (You Ain't Gonna Need It)\n\nSource: Extreme Programming (XP) — Kent Beck & Ron Jeffries\n\nPrinciple: Don't implement something until you actually need it, not when you foresee you might need it.\n\nCommon violations:\n- Adding plugin architectures when the app has one implementation\n- Creating abstract base classes with a single concrete subclass\n- Building configuration options nobody has asked for\n- Adding feature flags before there's more than one variant\n\nRelationship with KISS: YAGNI is about scope (don't build it yet), KISS is about complexity (build it simply).\n\n---\n\n## 5. Clean Code\n\nSource: Clean Code — Robert C. Martin (2008)\n\n### Naming\n- Names should reveal intent: `getUserPermissions()` not `getData()`\n- Avoid abbreviations unless universally understood (`id`, `url`, `http` are fine; `usrPrmLst` is not)\n- Boolean names should read as questions: `isActive`, `hasPermission`, `canEdit`\n- Consistent vocabulary: don't mix `fetch`, `get`, `retrieve`, `load` for the same concept\n\n### Functions\n- Should do one thing, at one level of abstraction\n- Prefer fewer than 3 parameters; use an options object for more\n- Avoid flag arguments (`render(true)`) — split into two named functions\n- Side effects should be obvious from the name or documented\n\n### Comments\n- Good: explain why a non-obvious decision was made\n- Bad: restate what the code does (`// increment i by 1`)\n- Worst: commented-out code left in the codebase\n\n---\n\n## 6. 
Error Handling\n\nSources: Clean Code Chapter 7; language-specific community standards\n\n- Don't swallow errors: empty `catch {}` blocks hide bugs\n- Fail fast: validate inputs early and throw/return immediately on invalid state\n- Use typed/specific errors: catch specific error types rather than generic `catch(e)`\n- Errors are not control flow: don't use try/catch for expected branching logic\n- Always handle promises: every Promise should have a `.catch()` or be `await`ed in a try block\n- Provide context: error messages should include what failed and why, with enough info to debug\n\n---\n\n## 7. Security (OWASP Top 10)\n\nSource: OWASP Foundation — updated periodically (latest: 2021)\n\n\| ID \| Category \| What to look for \|\n\|----\|----------\|-----------------\|\n\| A01 \| Broken Access Control \| Missing auth checks, IDOR, privilege escalation \|\n\| A02 \| Cryptographic Failures \| Plaintext secrets, weak hashing, unencrypted sensitive data \|\n\| A03 \| Injection \| SQL injection, XSS, command injection, path traversal \|\n\| A04 \| Insecure Design \| Missing threat modeling, no rate limiting, no abuse prevention \|\n\| A05 \| Security Misconfiguration \| Default credentials, overly permissive CORS, verbose errors in production \|\n\| A06 \| Vulnerable Components \| Outdated dependencies with known CVEs \|\n\| A07 \| Auth Failures \| Weak passwords allowed, no brute-force protection, broken session management \|\n\| A08 \| Data Integrity Failures \| Missing integrity checks, insecure deserialization \|\n\| A09 \| Logging Failures \| No audit trail, sensitive data in logs \|\n\| A10 \| SSRF \| Server making requests to user-controlled URLs without validation \|\n\n---\n\n## 8. 
Performance\n\nSources: Web.dev, framework-specific documentation, general CS principles\n\n- Avoid premature optimization — but do avoid obviously bad patterns:\n - O(n^2) when O(n) or O(n log n) is straightforward\n - Fetching entire tables/collections when only a subset is needed\n - Re-computing values on every render/call that could be memoized\n- Minimize bundle size: tree-shake, lazy-load routes/components, avoid importing entire libraries for one utility\n- Batch operations: reduce network round-trips, use bulk APIs, batch DOM updates\n- Debounce/throttle: user input handlers that trigger expensive work\n\n---\n\n## 9. Testing\n\nSources: xUnit Test Patterns — Gerard Meszaros; Growing Object-Oriented Software, Guided by Tests — Freeman & Pryce\n\n- AAA pattern: Arrange, Act, Assert — keep tests structured and readable\n- Test behavior, not implementation: tests should survive refactors that don't change behavior\n- One assertion per concept: a test should verify one logical thing (may use multiple `expect` calls if they test the same concept)\n- Deterministic: no random data, no reliance on wall-clock time, no network calls in unit tests\n- Test the contract: focus on public API, not private internals\n- Coverage priorities: critical paths and edge cases first; don't chase 100% coverage on trivial code\n\n---\n\n## 10. Code Organization & Architecture\n\nSources: Clean Architecture — Robert C. 
Martin; Patterns of Enterprise Application Architecture — Martin Fowler\n\n- Dependency direction: dependencies should point inward (toward core/domain logic), not outward (toward frameworks/IO)\n- Feature cohesion: related code should live together (by feature/domain), not scattered by technical role\n- No circular dependencies: if A imports B and B imports A, extract shared code to C\n- Consistent file structure: follow the project's established conventions for where things go\n- Layered boundaries: keep clear boundaries between data access, business logic, and presentation\n\n---\n\n## 11. Defensive Programming\n\nSource: Code Complete — Steve McConnell; The Pragmatic Programmer\n\n- Validate at boundaries: every system entry point (API endpoint, form handler, external data source) must validate inputs\n- Fail gracefully: partial failures should not crash the entire system\n- Guard clauses: return early on invalid conditions instead of deeply nesting the happy path\n- Type narrowing: use type guards, assertions, or schema validation (e.g. Zod) for external data\n- Avoid assumptions: if a value can be null/undefined according to its type, handle it\n\n---\n\n## 12. Separation of Concerns\n\nSource: Edsger W. Dijkstra (1974); foundational software engineering principle\n\n- Each module addresses one concern: rendering, data fetching, state management, and business logic should be separable\n- Configuration over hardcoding: environment-specific values belong in config, not scattered in source\n- Platform boundaries: core logic should be portable; framework-specific code stays at the edges\n- Data vs. presentation: keep data transformation separate from how it's displayed\n"},"correctness-audit":{"content":"---\nname: correctness-audit\ndescription: Reviews code for correctness bugs, uncaught edge cases, and scalability problems. Use when reviewing code changes, performing code audits, or when the user asks for a review or quality check. 
For security vulnerabilities use security-audit; for design, maintainability, and principle violations use best-practices-audit.\n---\n\n# Code Quality Review\n\nPerform a systematic review focused on correctness and runtime concerns: will this code work correctly under all realistic inputs and load? Every finding must cite the file, line(s), dimension, and a concrete fix. For security vulnerabilities, use `security-audit`. For principle violations (DRY, SOLID, Clean Code), use `best-practices-audit`.\n\n## Scope\n\nDetermine what to review based on context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to review only changed/added code and its immediate context\n- File/directory mode: review the files or directories the user specifies\n- Full review mode: when the user asks for a full review, scan all source code (skip vendor/node_modules/build artifacts)\n\nRead all in-scope code before producing findings.\n\n## Dimensions to Evaluate\n\nEvaluate code against each dimension. Skip dimensions with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions, concrete examples, and fixes.\n\n### 1. Logic Bugs\n\n- Wrong operators: `<` vs `<=`, `==` vs `===`, `&&` vs `\|\|`, bitwise vs logical operators\n- Off-by-one errors: loop boundaries, slice/splice indices, pagination offset calculations\n- Incorrect variable: copy-paste errors where the wrong variable is used (e.g. 
checking `a > 0` but intending `b > 0`)\n- Boolean logic inversions: conditions that are the exact opposite of what they should be (missing `!`, De Morgan's law violations)\n- Mutating instead of cloning: modifying an input argument or shared reference when a local copy is required\n- Shadowed variables: inner-scope declaration masking an outer-scope variable of the same name, causing silent incorrect reads\n- Assignment in condition: `if (x = getValue())` when `===` was intended\n- Short-circuit misuse: relying on `&&` or `\|\|` for side effects in code paths where the right-hand side must always run\n\n### 2. Type & Coercion Bugs\n\n- Implicit type coercion: `+` operator on mixed `string \| number` producing concatenation instead of addition; `==` coercing types unexpectedly\n- Unsafe casts: `as T` assertions on data from external sources (API responses, `JSON.parse`, database rows typed as `any`) without runtime validation\n- Integer/float confusion: using floating-point arithmetic where integer arithmetic is required (financial amounts, indices, counts); missing `Math.floor`/`Math.round` on division results\n- Precision loss: `Number` used for values > `Number.MAX_SAFE_INTEGER` (2⁵³-1); should use `BigInt` or a decimal library\n- NaN propagation: arithmetic on a value that may be `NaN` without a guard; `NaN === NaN` is always `false`; `isNaN(\"string\")` returns `true`\n- Nullable column mismatch: TypeScript type says `string` but the database column is nullable; the value can be `null` at runtime\n\n### 3. 
Null, Undefined & Missing Value Bugs\n\n- Unguarded property access: accessing `.foo` on a value that can realistically be `null` or `undefined` at runtime (API response fields, optional config, database nullable columns)\n- Destructuring without defaults: `const { limit } = options` where `options` may be `undefined`, or `limit` may be absent\n- Array access without bounds check: `arr[0]` on an array that may be empty; `arr[arr.length - 1]` on a zero-length array\n- `find()` result not checked: `.find()` returns `undefined` when no match exists; using the result directly without a null guard will throw\n- Optional chaining gaps: using `a.b.c` when `a` or `b` can be nullish; should be `a?.b?.c`\n- Early return missing: function continues executing after a condition should have terminated it\n\n### 4. Async & Promise Bugs\n\n- Missing `await`: `async` function calls whose result is not awaited, running fire-and-forget when the caller depends on the result\n- Unhandled promise rejections: `.then()` without `.catch()`, or top-level `async` functions with no try/catch, that silently swallow errors\n- Sequential awaits that should be parallel: awaiting independent async operations in series (`await a(); await b()`) when `Promise.all([a(), b()])` would be faster and correct\n- `Promise.all` vs `Promise.allSettled`: using `Promise.all` when any single rejection should not abort all others; vs. using `Promise.allSettled` when the caller actually needs to fail fast\n- Async function returning void unintentionally: a function signature of `async (): Promise<void>` that actually should return a value the caller uses\n- Race between async operations: two concurrent async paths writing to the same location (state, DB row, file) without synchronization\n- Uncleaned async resources: `setInterval`, `setTimeout`, event listeners, or subscriptions started inside a component/class that are never cleaned up when the scope is destroyed\n\n### 5. 
Stale Closures & Captured State\n\n- Stale closure over mutable variable: a callback or timeout captures a variable by reference; by the time the callback runs, the variable has changed\n- Loop variable capture: `for (var i = 0; ...)` with async/callback inside — all callbacks share the same `i` by the time they run (use `let` or pass `i` as an argument)\n- React hooks missing dependencies: a `useEffect` or `useCallback` that reads a prop or state value not listed in the dependency array — the callback sees the initial value forever\n- Event listener capturing stale props: a DOM event listener added once in a `useEffect` that captures `props.onEvent` at mount time, missing all future updates\n- Memoization with wrong keys: `useMemo` / `useCallback` / `React.memo` used with a dependency array that doesn't actually capture everything the computation depends on\n\n### 6. Resource Leaks & Missing Cleanup\n\n- Event listeners never removed: `addEventListener` called on mount, no corresponding `removeEventListener` on unmount\n- Intervals/timeouts never cleared: `setInterval` / `setTimeout` not captured in a ref or cancelled on component unmount\n- Subscriptions not cancelled: Realtime, WebSocket, or observable subscriptions opened but never `.unsubscribe()` / `.close()` called\n- File/stream handles not closed: `fs.open`, database connections, or readable streams that are opened but not closed on all exit paths (including error paths)\n- Growing in-memory collections: caches, queues, or maps that are added to but never evicted from, unbounded over time\n\n### 7. 
Uncaught Edge Cases — Inputs\n\n- Empty string: functions that receive a user-provided string and assume it is non-empty (`.split()`, `.charAt(0)`, regex matching)\n- Empty array or object: loops or transforms on collections that assume at least one element\n- Zero and negative numbers: code that divides by a user-supplied value without guarding against zero; index calculations that go negative\n- Numeric boundaries: values at or near `Number.MAX_SAFE_INTEGER`, `Number.MIN_SAFE_INTEGER`, `Infinity`, `-Infinity`, `NaN`\n- Unicode and emoji: string `.length` counts UTF-16 code units, not characters; a single emoji is 2 code units — truncation, substring, and split operations can corrupt multi-code-unit characters\n- Null bytes and control characters: untrusted strings containing `\\0`, `\\r`, `\\n` passed to file paths, log messages, or downstream systems\n- Very long inputs: strings or arrays far larger than typical — does the code O(n) scale gracefully, or does it load everything into memory?\n\n### 8. Uncaught Edge Cases — External Data & Network\n\n- Non-200 HTTP responses not handled: `fetch` resolves (does not reject) on 4xx/5xx — the caller must explicitly check `response.ok` or `response.status`\n- Partial or truncated responses: streaming or chunked data where the full payload may not arrive\n- Timeout not set: outbound HTTP calls with no timeout; one slow downstream service hangs the entire request chain indefinitely\n- Retry without backoff: immediately retrying failed network calls in a tight loop instead of using exponential backoff with jitter\n- Malformed JSON: `JSON.parse()` throws on invalid input; this must be wrapped in try/catch\n- Unexpected API shape: downstream API fields assumed to be present and correctly typed without validation; treat all external data as `unknown`\n- Stale or cached data returned on error: error handlers that silently return the last-known-good cached value without signalling the failure to the caller\n\n### 9. 
Concurrency & Shared State\n\n- Check-then-act (TOCTOU): reading a value, checking a condition, then acting — another concurrent operation can change the value between check and act\n- Non-atomic read-modify-write: incrementing a counter or appending to a list stored outside the current execution context without a lock or atomic operation\n- Reentrant function calls: an async function that can be called again before its first invocation completes, with both invocations sharing mutable state\n- Global/module-level mutable state: variables at module scope that accumulate or change across requests (dangerous in server contexts where module scope is shared between requests in the same isolate)\n- Event ordering assumptions: code that assumes async events will arrive in a specific order (e.g., \"message A always before message B\") without enforcement\n\n### 10. Scalability — Algorithmic Complexity\n\n- O(n²) or worse nested loops: an inner loop that iterates over the same or a related collection for every outer iteration; grows quadratically\n- Linear scan where constant lookup exists: using `Array.includes()`, `Array.find()`, or `Array.indexOf()` inside a loop where converting to a `Set` or `Map` would make lookups O(1)\n- Repeated sorting: sorting the same array on each render or request when it could be sorted once and cached\n- Unnecessary full-collection passes: multiple `.filter().map().reduce()` chains on the same array that could be combined into a single pass\n- Regex recompilation: constructing `new RegExp(pattern)` inside a loop when the pattern is constant — compile once outside the loop\n\n### 11. 
Scalability — Database & I/O\n\n- N+1 queries: fetching a list of N records, then issuing a separate query for each one in a loop — should be a single join or an `IN (...)` query\n- Unbounded queries: `SELECT * FROM table` or `.findAll()` without `LIMIT` — returns the entire table; grows unbounded as data grows\n- Missing pagination: API endpoints that return all results instead of pages; clients and servers both suffer as dataset grows\n- Fetching more columns than needed: `SELECT *` when only 2-3 columns are used; pulls unnecessary data across the network and into memory\n- Queries inside render or hot paths: database or API calls triggered on every render cycle or in tight loops rather than cached or batched\n- Sequential queries that could be parallel: `await db.query(A); await db.query(B)` where A and B are independent — use `Promise.all`\n- Missing index implied by access pattern: code that filters or sorts on a column that will clearly require a full table scan without an index (flag based on the access pattern — don't claim to know the schema unless you can read it)\n\n### 12. 
Scalability — Memory & Throughput\n\n- Loading full dataset into memory: reading an entire file, table, or collection into an array when streaming or cursor-based processing would avoid the memory spike\n- Unbounded `Promise.all`: `Promise.all(items.map(asyncFn))` where `items` can be very large — spawns thousands of concurrent operations, exhausting connections or memory\n- No backpressure on queues: pushing work into a queue faster than it can be consumed, with no throttling or rejection when the queue is full\n- In-memory coordination state: using a module-level `Map` or `Set` as a cache, queue, or lock that is not shared between process replicas — breaks on horizontal scale-out\n- No connection pooling: creating a new database connection per request instead of using a pool\n- Repeated expensive computation: calling an expensive pure function with the same inputs repeatedly without memoization or caching the result\n\n## Static Analysis Tools\n\nBefore producing findings, run available linters on in-scope code and incorporate their output into findings.\n\n### TypeScript compiler\n```bash\nnpx tsc --noEmit\n```\nType errors, implicit `any`, and unchecked nulls. Map findings to Dimension 2 (Type & Coercion) or Dimension 3 (Null/Undefined).\n\n### ESLint\n```bash\nnpx eslint --ext .ts,.tsx src/\n```\nKey rules that surface bugs: `no-unused-vars`, `no-undef`, `@typescript-eslint/no-floating-promises`, `@typescript-eslint/no-misused-promises`, `react-hooks/exhaustive-deps`, `no-constant-condition`, `no-self-assign`.\n\n### Ruff (Python)\n```bash\nruff check --select E,F,B,C90 .\n```\n`F` = Pyflakes (undefined names, unused imports), `B` = Bugbear (common bug patterns), `C90` = McCabe complexity.\n\n### How to use tool output\n1. Map each tool finding to its dimension (e.g., `@typescript-eslint/no-floating-promises` → Dimension 4: Async & Promise Bugs).\n2. Linter errors that indicate real runtime bugs go under Critical; style findings go under Suggestion.\n3. 
Note \"tsc: clean\" / \"ESLint: clean\" in the Summary if no issues.\n\n## Output Format\n\nGroup findings by severity, not by dimension. Each finding must name the dimension it falls under.\n\n```\n## Critical\nIssues that will cause incorrect behavior, data loss, or crashes in production.\n\n### [Dimension] Brief title\nFile: `path/to/file.ts` (lines X–Y)\nDimension: Full dimension name — one-line explanation of what correct code requires.\nProblem: What the code does wrong and the concrete runtime impact (what breaks, when, and for whom).\nFix: Specific, actionable code change.\n\n## Warning\nIssues likely to cause bugs under realistic inputs or load, or that will cause failures during future changes.\n\n(same structure)\n\n## Suggestion\nImprovements that reduce risk or improve robustness but are not urgently broken.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Dimensions most frequently violated: list top 2–3\n- Linter results: tsc: clean / ESLint: N issues / Ruff: clean (etc.)\n- Overall assessment: 1–2 sentence verdict on correctness and robustness\n```\n\n## Rules\n\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"handle null\" but \"add `if (!user) return notFound()` before line 42.\"\n- Model the failure: every Critical finding must describe what actually breaks at runtime — which input triggers it, what the symptom is.\n- Severity by real-world impact: rate by what breaks in production, not theoretical worst-case.\n- No fluff: skip dimensions with no findings. Don't praise code that is merely acceptable.\n- Respect scope: in diff mode, only flag issues in changed lines and their immediate context. Don't audit the entire file when asked about a one-line change.\n- Don't duplicate other skills: correctness bugs only — no security (use `security-audit`), no principle violations (use `best-practices-audit`). 
Edge cases and concurrency bugs that are also security vulnerabilities should be flagged here for correctness and referenced to `security-audit` for the security angle.\n","reference":"# Correctness Audit — Reference\n\nDetailed definitions, failure patterns, concrete examples, and fixes for each dimension in `SKILL.md`.\n\n---\n\n## 1. Logic Bugs\n\n### Wrong Comparison Operator\n\nThe single most common logic bug. `<` vs `<=` is the canonical off-by-one; `==` vs `===` produces silent type coercion in JavaScript.\n\nViolation:\n```ts\n// WRONG — excludes the last valid page\nif (page < totalPages) fetchPage(page); // misses page === totalPages\n\n// WRONG — \"0\" == 0 is true in JS; both branches trigger unexpectedly\nif (status == 0) handlePending();\nif (status == false) handleEmpty(); // also true for 0, \"\", null, undefined\n```\nFix:\n```ts\nif (page <= totalPages) fetchPage(page);\nif (status === 0) handlePending();\n```\n\n### Mutation of Input Arguments\n\nFunctions that mutate their arguments create invisible coupling — the caller's data changes without warning.\n\nViolation:\n```ts\nfunction normalize(items: Item[]) {\n items.sort((a, b) => a.id - b.id); // mutates the caller's array\n return items;\n}\n```\nFix:\n```ts\nfunction normalize(items: Item[]) {\n return [...items].sort((a, b) => a.id - b.id); // local copy\n}\n```\n\n### Shadowed Variable\n\nA variable declared inside an inner scope shares the name of an outer-scope variable. Reads in the inner scope silently use the inner version, ignoring the outer.\n\nViolation:\n```ts\nconst user = getCurrentUser();\nif (condition) {\n const user = await fetchUser(id); // shadows outer `user`\n applyPermissions(user); // uses inner — correct\n}\nlog(user.id); // uses outer — developer may have intended inner\n```\nFix: Use distinct names. 
Lint rule: `no-shadow`.\n\n### Boolean Logic Inversion (De Morgan)\n\nMissing or extra negations produce conditions that are the exact opposite of intent.\n\nViolation:\n```ts\n// Intent: \"allow if admin OR owner\"\n// Bug: \"allow if NOT admin AND NOT owner\" (blocks everyone who should be allowed)\nif (!isAdmin && !isOwner) return allowAccess();\n```\nFix:\n```ts\nif (isAdmin || isOwner) return allowAccess();\n```\n\n---\n\n## 2. Type & Coercion Bugs\n\n### `+` Operator on Mixed Types\n\nJavaScript's `+` operator does string concatenation when either operand is a string. A number read from an input field, query param, or JSON-as-string will concatenate instead of add.\n\nViolation:\n```ts\n// req.query.count is always a string\nconst total = req.query.count + 10; // \"510\" not 15\n```\nFix:\n```ts\nconst total = Number(req.query.count) + 10;\n// or: parseInt(req.query.count, 10) + 10\n```\n\n### Floating-Point Arithmetic in Financial Logic\n\nIEEE 754 doubles cannot represent most decimal fractions exactly. `0.1 + 0.2 === 0.30000000000000004` — do not use `number` for money.\n\nViolation:\n```ts\nconst total = price * quantity; // $10.10 * 3 = $30.299999999999997\n```\nFix: Store monetary values as integer cents in the database. Perform all arithmetic in cents. Convert to decimal only for display.\n\n### NaN Propagation\n\nArithmetic involving `NaN` always produces `NaN`. A single bad input silently corrupts all downstream calculations. `NaN === NaN` is `false`, so equality checks miss it.\n\nViolation:\n```ts\nconst score = parseInt(rawInput); // \"abc\" → NaN\nconst adjusted = score + bonus; // NaN — no warning\nif (adjusted > threshold) award(); // never triggers\n```\nFix:\n```ts\nconst score = parseInt(rawInput, 10);\nif (!Number.isFinite(score)) throw new Error(`Invalid score: ${rawInput}`);\n```\n\n### `JSON.parse` Without Validation\n\n`JSON.parse` returns `any` in TypeScript. 
Treating the result as a typed value without runtime validation means any shape mismatch (missing field, wrong type, null) silently becomes a bug downstream.\n\nViolation:\n```ts\nconst payload = JSON.parse(body) as WebhookPayload;\nprocessEvent(payload.eventType); // crashes if eventType is missing\n```\nFix:\n```ts\nconst raw: unknown = JSON.parse(body);\nconst payload = WebhookPayloadSchema.parse(raw); // throws on invalid shape\nprocessEvent(payload.eventType); // safe\n```\n\n---\n\n## 3. Null, Undefined & Missing Value Bugs\n\n### Unguarded `.find()` Result\n\n`Array.find()` returns `undefined` when no match exists. Using the result directly without checking throws at runtime.\n\nViolation:\n```ts\nconst config = configs.find(c => c.id === targetId);\nreturn config.value; // TypeError: Cannot read properties of undefined\n```\nFix:\n```ts\nconst config = configs.find(c => c.id === targetId);\nif (!config) throw new Error(`Config ${targetId} not found`);\nreturn config.value;\n```\n\n### Empty Array Access\n\n`arr[0]` on an empty array returns `undefined`, not an error. If the code then accesses a property of the result, it throws.\n\nViolation:\n```ts\nconst latest = events[0].timestamp; // undefined.timestamp if events = []\n```\nFix:\n```ts\nconst latest = events[0]?.timestamp ?? null;\n// or: if (events.length === 0) return null;\n```\n\n### Nullable Database Column Treated as Non-Null\n\nA TypeScript type may say `string` for a column that is nullable in the database. The type is wrong — any row inserted with `NULL` will produce `null` at runtime.\n\nPattern to flag: Reading `.foo` on a database row without checking if the type declaration matches the actual schema's nullable constraints.\n\n---\n\n## 4. 
Async & Promise Bugs\n\n### Missing `await` on Critical Path\n\nA fire-and-forget async call looks correct but the caller does not know if it succeeded or failed, and the function may return before the operation completes.\n\nViolation:\n```ts\nasync function deleteUser(id: string) {\n revokeTokens(id); // NOT awaited — may not complete before function returns\n await db.delete(id);\n return { success: true };\n}\n```\nFix:\n```ts\nasync function deleteUser(id: string) {\n await revokeTokens(id); // must complete before deleting the user\n await db.delete(id);\n return { success: true };\n}\n```\n\n### Unhandled Promise Rejection\n\nA `.then()` without `.catch()` silently drops errors. Since Node.js 15, an unhandled rejection terminates the process by default.\n\nViolation:\n```ts\nfetchData().then(process); // rejection from fetchData or process is silently lost\n```\nFix:\n```ts\nfetchData().then(process).catch(err => logger.error(\"fetchData failed\", err));\n// or use async/await with try/catch\n```\n\n### Sequential Awaits on Independent Operations\n\nTwo independent async operations awaited in series take `T_a + T_b` time instead of `max(T_a, T_b)`.\n\nViolation:\n```ts\nconst user = await fetchUser(id);\nconst config = await fetchConfig(); // independent — no reason to wait for user first\n```\nFix:\n```ts\nconst [user, config] = await Promise.all([fetchUser(id), fetchConfig()]);\n```\n\n### `Promise.all` Fail-Fast When Partial Failure Is Acceptable\n\n`Promise.all` rejects as soon as any promise rejects, abandoning the remaining operations. 
If partial success is acceptable, `Promise.allSettled` is correct.\n\nViolation:\n```ts\n// Sending notifications — one failure shouldn't prevent others\nawait Promise.all(users.map(u => sendNotification(u))); // one failure cancels all\n```\nFix:\n```ts\nconst results = await Promise.allSettled(users.map(u => sendNotification(u)));\nconst failures = results.filter(r => r.status === \"rejected\");\nif (failures.length > 0) logger.warn(`${failures.length} notifications failed`);\n```\n\n### Unbounded `Promise.all` on Large Array\n\nSpawning thousands of concurrent async operations exhausts database connections, file handles, or external API rate limits.\n\nViolation:\n```ts\nawait Promise.all(thousandsOfItems.map(item => processItem(item)));\n```\nFix: Use a concurrency-limited batch runner:\n```ts\n// Process in chunks of 10 at a time\nfor (let i = 0; i < items.length; i += 10) {\n await Promise.all(items.slice(i, i + 10).map(processItem));\n}\n// or use a library like p-limit\n```\n\n---\n\n## 5. Stale Closures & Captured State\n\n### Loop Variable Capture with `var`\n\n`var` is function-scoped, not block-scoped. All closures created inside the loop capture the same variable, which has its final value by the time the callbacks run.\n\nViolation:\n```ts\nfor (var i = 0; i < 5; i++) {\n setTimeout(() => console.log(i), 0); // logs \"5\" five times, not 0,1,2,3,4\n}\n```\nFix:\n```ts\nfor (let i = 0; i < 5; i++) { // `let` is block-scoped; each iteration gets its own `i`\n setTimeout(() => console.log(i), 0);\n}\n```\n\n### React `useEffect` Stale Closure\n\nA `useEffect` callback captures prop/state values at the time of the effect's creation. 
If those values change but the effect's dependency array doesn't include them, the callback operates on stale values forever.\n\nViolation:\n```tsx\nuseEffect(() => {\n const interval = setInterval(() => {\n // `count` is captured at mount and never updates\n setCount(count + 1); // always adds 1 to the initial value\n }, 1000);\n return () => clearInterval(interval);\n}, []); // missing `count` in deps\n```\nFix:\n```tsx\nuseEffect(() => {\n const interval = setInterval(() => {\n setCount(c => c + 1); // functional update — always uses current value\n }, 1000);\n return () => clearInterval(interval);\n}, []);\n```\n\n---\n\n## 6. Resource Leaks & Missing Cleanup\n\n### Event Listener Never Removed\n\nAdding a listener in a component's mount phase without removing it on unmount causes the handler to fire after the component is gone, often throwing on de-referenced state.\n\nViolation:\n```tsx\nuseEffect(() => {\n window.addEventListener(\"resize\", handleResize);\n // no cleanup — handleResize fires after unmount, references stale state\n}, []);\n```\nFix:\n```tsx\nuseEffect(() => {\n window.addEventListener(\"resize\", handleResize);\n return () => window.removeEventListener(\"resize\", handleResize);\n}, [handleResize]);\n```\n\n### Interval Not Cleared on Unmount\n\nA `setInterval` that updates component state keeps firing after unmount: the timer leaks, and React 17 and earlier log the warning `Can't perform a React state update on an unmounted component`.\n\nViolation:\n```tsx\nuseEffect(() => {\n setInterval(tick, 1000); // interval ID discarded; can never be cleared\n}, []);\n```\nFix:\n```tsx\nuseEffect(() => {\n const id = setInterval(tick, 1000);\n return () => clearInterval(id);\n}, []);\n```\n\n### Growing Unbounded Cache\n\nAn in-memory cache that is added to without eviction grows without bound and eventually exhausts memory.\n\nViolation:\n```ts\nconst cache = new Map<string, Result>(); // module-level, grows forever\nfunction getCached(key: string) {\n if (!cache.has(key)) cache.set(key, compute(key));\n 
return cache.get(key)!;\n}\n```\nFix: Add a max-size eviction policy (LRU), a TTL, or use a bounded cache library. At minimum, document that the key space must be finite and bounded.\n\n---\n\n## 7. Edge Cases — Inputs\n\n### Empty String Assumptions\n\nA function receiving a user-supplied string must handle `\"\"` explicitly — it is falsy in JavaScript, which sometimes helps but often misleads.\n\nViolation:\n```ts\nfunction getInitials(name: string) {\n return name.split(\" \").map(w => w[0]).join(\"\"); // name=\"\" → [\"\"] → w[0] is undefined\n}\n```\nFix:\n```ts\nfunction getInitials(name: string) {\n if (!name.trim()) return \"\";\n return name.trim().split(/\\s+/).map(w => w[0].toUpperCase()).join(\"\");\n}\n```\n\n### Unicode / Emoji String Length\n\nJavaScript strings are UTF-16. Emoji and many non-Latin characters are represented as surrogate pairs — two code units each. `.length`, `.slice()`, `.charAt()`, and `.split(\"\")` all operate on code units, not characters.\n\nViolation:\n```ts\nconst truncated = message.slice(0, 100); // may split a surrogate pair, producing \"?\"\nconst len = \"👋\".length; // 2, not 1\n```\nFix:\n```ts\n// Use Array.from or spread to iterate by Unicode code point\nconst chars = Array.from(message);\nconst truncated = chars.slice(0, 100).join(\"\");\nconst len = Array.from(\"👋\").length; // 1\n```\n\n### Division by Zero\n\nAny user-supplied or computed value used as a divisor must be checked.\n\nViolation:\n```ts\nconst avgScore = totalScore / userCount; // NaN or Infinity when userCount = 0\n```\nFix:\n```ts\nconst avgScore = userCount === 0 ? 0 : totalScore / userCount;\n```\n\n---\n\n## 8. Edge Cases — External Data & Network\n\n### `fetch` Does Not Reject on HTTP Errors\n\n`fetch` only rejects on network failure (DNS, timeout, no connection). 
A 400, 404, or 500 response resolves normally with `response.ok === false`.\n\nViolation:\n```ts\nconst data = await fetch(\"/api/users\").then(r => r.json()); // 500 → parsed error body, no throw\n```\nFix:\n```ts\nconst response = await fetch(\"/api/users\");\nif (!response.ok) throw new Error(`HTTP ${response.status}: ${await response.text()}`);\nconst data = await response.json();\n```\n\n### Missing Request Timeout\n\nA `fetch` call with no timeout will wait indefinitely if the server hangs. In a serverless function, this exhausts the function's max execution time and blocks the client.\n\nViolation:\n```ts\nconst response = await fetch(url); // no timeout\n```\nFix:\n```ts\nconst response = await fetch(url, { signal: AbortSignal.timeout(5_000) }); // 5 second max\n```\n\n### `JSON.parse` Not Wrapped in Try/Catch\n\n`JSON.parse` throws a `SyntaxError` on malformed input. If the input comes from an external source it can fail at any time.\n\nViolation:\n```ts\nconst data = JSON.parse(rawBody); // throws on malformed JSON; crashes the handler\n```\nFix:\n```ts\nlet data: unknown;\ntry {\n data = JSON.parse(rawBody);\n} catch {\n return badRequest(\"Invalid JSON body\");\n}\n```\n\n---\n\n## 9. Concurrency & Shared State\n\n### Non-Atomic Read-Modify-Write\n\nRead a value, compute a new value, write it back. If two concurrent operations both read the same initial value, the second write silently overwrites the first.\n\nViolation (application layer):\n```ts\nconst balance = await getBalance(userId); // both read 100\nconst newBalance = balance - amount; // both compute 50\nawait setBalance(userId, newBalance); // second write wins: 50 instead of 0\n```\nFix: Use a database-level atomic update (`UPDATE ... 
SET coins = coins - $amount WHERE coins >= $amount`), or use `SELECT FOR UPDATE` to lock the row for the duration of the transaction.\n\nViolation (JavaScript):\n```ts\nlet counter = 0;\nasync function increment() {\n const current = counter; // read\n await someAsync(); // yields — another increment may run here\n counter = current + 1; // write: first increment's result is lost\n}\n```\nFix: For in-process counters, use a mutex or perform the increment synchronously without yielding.\n\n### Reentrant Async Function\n\nAn async function that is called again before its first invocation finishes, with both invocations modifying shared state.\n\nPattern to flag:\n```ts\nlet isSyncing = false; // in-memory guard\n\nasync function sync() {\n if (isSyncing) return; // TOCTOU: two callers can both read false simultaneously\n isSyncing = true;\n await doSync();\n isSyncing = false;\n}\n```\nFix: The guard only works if `isSyncing = true` is set synchronously before the first `await`. The code above is actually fine for this reason — flag it only if there is an `await` before setting the flag. For distributed/multi-instance systems, an in-memory flag is insufficient and must be moved to a database or Redis.\n\n---\n\n## 10. Scalability — Algorithmic Complexity\n\n### Linear Scan Inside a Loop — O(n²)\n\nUsing `Array.includes()`, `Array.find()`, or `Array.indexOf()` inside a loop that iterates over a collection of size n performs n × n = n² operations.\n\nViolation:\n```ts\n// O(n²): for each item, scan all blockedIds\nconst visible = items.filter(item => !blockedIds.includes(item.id));\n```\nFix:\n```ts\n// O(n): one-time Set construction + O(1) lookups\nconst blockedSet = new Set(blockedIds);\nconst visible = items.filter(item => !blockedSet.has(item.id));\n```\n\n### Regex Recompilation in a Loop\n\n`new RegExp(pattern)` compiles the pattern every call. 
If called in a loop with a constant pattern, this is wasted work.\n\nViolation:\n```ts\nfor (const line of lines) {\n if (new RegExp(\"^ERROR:\").test(line)) handle(line); // compiles every iteration\n}\n```\nFix:\n```ts\nconst errorPattern = /^ERROR:/; // compile once\nfor (const line of lines) {\n if (errorPattern.test(line)) handle(line);\n}\n```\n\n---\n\n## 11. Scalability — Database & I/O\n\n### N+1 Queries\n\nFetching a list, then issuing one query per row in a loop, is the most common database scalability bug. It turns one round-trip into N+1 round-trips.\n\nViolation:\n```ts\nconst posts = await db.query(\"SELECT * FROM posts LIMIT 20\");\nfor (const post of posts) {\n // 20 separate queries — one per post\n post.author = await db.query(\"SELECT * FROM users WHERE id = $1\", [post.author_id]);\n}\n```\nFix:\n```ts\nconst posts = await db.query(\"SELECT * FROM posts LIMIT 20\");\nconst authorIds = posts.map(p => p.author_id);\nconst authors = await db.query(\"SELECT * FROM users WHERE id = ANY($1)\", [authorIds]);\nconst authorMap = new Map(authors.map(a => [a.id, a]));\nposts.forEach(p => { p.author = authorMap.get(p.author_id); });\n```\n\n### Unbounded Query\n\nA query with no `LIMIT` returns the entire table. Tables grow over time; this query will eventually time out or exhaust memory.\n\nViolation:\n```ts\nconst users = await db.query(\"SELECT * FROM users WHERE active = true\");\n// returns 10 rows today; returns 100,000 rows in a year\n```\nFix:\n```ts\nconst users = await db.query(\n \"SELECT id, display_name FROM users WHERE active = true LIMIT $1 OFFSET $2\",\n [pageSize, page * pageSize]\n);\n```\n\n---\n\n## 12. Scalability — Memory & Throughput\n\n### Loading Full Dataset Into Memory\n\nReading an entire file, table, or collection into an array before processing. 
Memory usage grows linearly with data size.\n\nViolation:\n```ts\nconst allEvents = await db.query(\"SELECT * FROM events\"); // 10 million rows\nconst processed = allEvents.map(transform);\n```\nFix: Use cursor-based streaming or pagination:\n```ts\nlet cursor = 0;\nwhile (true) {\n const batch = await db.query(\"SELECT * FROM events WHERE id > $1 LIMIT 1000\", [cursor]);\n if (batch.length === 0) break;\n batch.forEach(transform);\n cursor = batch[batch.length - 1].id;\n}\n```\n\n### In-Memory Coordination State That Breaks on Scale-Out\n\nA module-level `Map`, `Set`, or variable used as a cache, rate limiter, or deduplication store is not shared between multiple server instances or worker processes. When the service scales out or restarts, the state is lost or silently per-instance.\n\nViolation:\n```ts\n// Works on one instance; breaks when there are two\nconst rateLimitCache = new Map<string, number>(); // module-level\n\nfunction checkRateLimit(userId: string): boolean {\n const count = rateLimitCache.get(userId) ?? 0;\n rateLimitCache.set(userId, count + 1);\n return count < 10;\n}\n```\nFix: Move shared state to a database (Redis, PostgreSQL) that all instances can access. Flag this whenever module-level mutable state is used for coordination in a server context.\n"},"feature-planning":{"content":"---\nname: feature-planning\ndescription: Extensively plans a proposed feature before any code is written. Use when the user asks to plan, design, or spec out a feature, or when they say \"plan this feature\", \"design this\", or want to think through a feature before building it.\n---\n\n# Feature Planning\n\nEnter plan mode and produce a thorough, implementation-ready feature plan. Do not write any code until the plan is approved.\n\n## Trigger\n\nWhen this skill is invoked, immediately enter plan mode using the EnterPlanMode tool. All planning work happens inside plan mode.\n\n## Scope\n\n- User describes a feature: Treat the description as the starting point. 
Explore the codebase to understand where the feature fits before designing anything.\n- Request is vague or ambiguous: Ask clarifying questions using AskUserQuestion before proceeding. Do not assume intent. Common ambiguities to probe:\n - Who is the target user or actor?\n - What is the expected behavior vs. current behavior?\n - Are there constraints (performance, compatibility, platform)?\n - What is explicitly out of scope?\n - Are there related features this interacts with?\n- User provides a detailed spec: Validate it against the codebase. Identify gaps, contradictions, or unstated assumptions and raise them before planning.\n\nDo NOT skip clarification. A plan built on wrong assumptions wastes more time than a question.\n\n## Process\n\n### 1. Understand Context\n\n- Read the project's SPEC.md, README, CLAUDE.md, and any relevant docs to understand the system's architecture, conventions, and existing features.\n- Explore the codebase areas the feature will touch. Identify existing patterns, data models, state management, and UI conventions.\n- Map out what already exists that the feature will interact with or depend on.\n- API/tech stack verification: If the feature involves specific APIs, SDKs, or third-party services, look up the official documentation directly before designing anything. Check if available MCP tools (Supabase, Vercel, etc.) can accelerate this lookup. Never assume correct API usage from training knowledge alone — docs may have changed and wrong API usage produces security holes, not just bugs.\n- Output: A brief summary of the current system context relevant to this feature.\n\n### 2. Clarify Requirements\n\n- If any of the following are unclear, ask before continuing:\n - Functional requirements: What exactly should the feature do? What are the inputs, outputs, and user flows?\n - Non-functional requirements: Performance targets, data volume expectations, offline behavior, accessibility.\n - Boundaries: What is in scope vs. 
out of scope for this iteration?\n - Dependencies: Does this require new APIs, services, migrations, or third-party integrations?\n- Output: A clear, numbered list of confirmed requirements.\n\n### 3. Design the Feature\n\nProduce a plan that covers each of the following sections. Skip a section only if it genuinely does not apply.\n\n#### 3a. User-Facing Behavior\n- Describe the feature from the user's perspective: what they see, what they do, what happens.\n- Cover the happy path end-to-end.\n- Define error states and what the user sees when things go wrong (invalid input, network failure, permission denied, etc.).\n\n#### 3b. Data Model Changes\n- New types, interfaces, database tables, or schema changes.\n- Migrations needed and their reversibility.\n- Impact on existing data (backwards compatibility, data backfill).\n\n#### 3c. Architecture & Module Design\n- Which files/modules will be created or modified.\n- How the feature integrates with the existing architecture (state management, routing, API layer, etc.).\n- Clear responsibility boundaries: what each new module/function owns.\n\n#### 3d. API & Integration Points\n- New endpoints, webhooks, or external service calls.\n- Request/response shapes.\n- Authentication and authorization requirements.\n\n#### 3e. State Management\n- What state the feature introduces (local, global, persisted, cached).\n- State transitions and lifecycle.\n- How state syncs across components or with the backend.\n\n#### 3f. Implementation Steps\n- An ordered sequence of concrete implementation steps.\n- Each step should be small enough to be a single commit.\n- Note dependencies between steps (what must come before what).\n- Identify which steps can be done in parallel.\n\n### 4. Analyze Quality Dimensions\n\nProactively evaluate the proposed design against each of these dimensions. For each, explicitly state what risks exist and how the design addresses them. If a dimension does not apply, say so briefly. 
See [REFERENCE.md](REFERENCE.md) for named standards, plan quality criteria, templates, and anti-patterns.\n\n#### Bugs & Correctness\n(Applies `correctness-audit` — Dimensions 1–9: Logic Bugs through Concurrency & Shared State)\n\nReview the design against the `correctness-audit` dimensions. State which are highest-risk for this feature:\n- Logic bugs: off-by-one errors, boolean inversions, wrong operators in proposed conditional logic\n- Null / undefined: fields that can be absent — are they guarded? Do nullable DB columns match their TypeScript types?\n- Async & Promise: are concurrent async paths safe? Is there risk of fire-and-forget on critical writes?\n- Concurrency / TOCTOU: can concurrent requests (multiple users, tabs, or duplicate submissions) corrupt shared state? Does any step read-check-act on data another operation could change between check and act?\n\n#### Edge Cases\n(Applies `correctness-audit` — Dimensions 7 & 8: Edge Case Inputs, External Data & Network)\n\n- Empty state: what does the user see before any data exists for this feature?\n- Boundary values: max field lengths, max collection sizes, numeric overflow — are they defined and enforced at both the API and database layers?\n- Network failures: if an operation fails mid-way, what state is the system left in? Is partial completion visible to the user?\n- Reentrant / concurrent usage: double-submit, multiple tabs, back-button navigation mid-flow.\n- External data: any third-party API or webhook payload — is it validated as `unknown` before use, not cast directly to a typed shape?\n\n#### Design Quality\n(SOLID — Robert C. Martin; Clean Architecture — Robert C. 
Martin)\n\n- SRP: does each new module have one clearly stated reason to change?\n- OCP: can new behavior be added by extension without modifying existing modules?\n- DIP: do high-level modules depend on abstractions, not concrete implementations?\n- Dependency direction: do dependencies point inward (domain ← application ← infrastructure)? No domain module should depend on a framework or I/O layer.\n- Does the design follow existing project patterns, or introduce a new one? If new, is the justification explicitly stated?\n\n#### Maintainability\n(Clean Code — Robert C. Martin; The Pragmatic Programmer — Hunt & Thomas)\n\n- Will a developer unfamiliar with this feature understand it from the plan alone, without asking the author?\n- Are proposed module and function names self-documenting?\n- Are non-obvious design decisions explained in the plan's rationale, not left as tribal knowledge?\n- Are implicit contracts between modules made explicit (typed interfaces, documented invariants)?\n\n#### Modularity\n(SOLID — SRP, ISP, DIP; UNIX philosophy)\n\n- Can each new component be unit-tested in isolation, without the full stack?\n- Are new module dependencies unidirectional? 
Does the design introduce any circular imports?\n- Could any new module be replaced or reused independently of the others?\n\n#### Simplicity\n(KISS — Clarence Johnson, 1960; YAGNI — Extreme Programming, Kent Beck & Ron Jeffries)\n\n- KISS: is this the simplest design that satisfies the stated requirements?\n- YAGNI: are there components designed for hypothetical future requirements not in scope for this iteration?\n- Does the language or framework already provide something the design is building from scratch?\n- Is there unnecessary indirection — interfaces, factories, registries — with only one concrete implementation?\n\n#### Scalability\n(Applies `correctness-audit` — Dimensions 10–12: Algorithmic Complexity, Database & I/O, Memory & Throughput)\n\n- Will this design function correctly at 10× the current data volume without architectural changes?\n- Are there unbounded database queries (no `LIMIT`) or full-collection loads into memory?\n- Are there N+1 query patterns that will emerge as data grows?\n- Is any coordination state stored in-memory in a way that breaks under horizontal scale-out?\n\n#### Security\n(Applies `security-audit` — use the relevant domains for each new design element)\n\nMap each new element of the design to the applicable security-audit domains:\n- New API endpoint → §2 Authorization, §5 Input Validation, §6 API Security, §8 Rate Limiting\n- New database table or function → §7 Database Security (RLS, REVOKE, CHECK constraints)\n- New auth flow or session handling → §1 Authentication & Session Management\n- New external service call or webhook → §6 API7 SSRF, §10 webhook deduplication & signature\n- New financial operation → §10 Financial & Transaction Integrity, §9 Concurrency & Race Conditions\n- New user data stored or transmitted → §13 Data Privacy & Retention, §4 Cryptography & Secrets\n\n### 5. 
Identify Risks & Open Questions\n\n- List anything that could go wrong or that you're uncertain about.\n- Flag technical risks (performance cliffs, migration dangers, dependency on unstable APIs).\n- Flag product risks (user confusion, feature conflicts, scope creep).\n- For each risk, suggest a mitigation or note that it needs a decision.\n\n## Output Format\n\nWrite the plan to the plan file with this structure:\n\n```\n# Feature: [Name]\n\n## Context\n[Brief summary of current system state relevant to this feature]\n\n## Requirements\n1. [Confirmed requirement]\n2. ...\n\n## Design\n\n### User-Facing Behavior\n[Description with happy path and error states]\n\n### Data Model Changes\n[Types, schemas, migrations]\n\n### Architecture\n[Modules, files, integration points]\n\n### API & Integration Points\n[Endpoints, external calls]\n\n### State Management\n[State shape, transitions, sync]\n\n### Implementation Steps\n1. [Step with description]\n2. ...\n\n## Quality Analysis\n\n### Bugs & Correctness\n[Risks and mitigations]\n\n### Edge Cases\n[Identified edge cases and how they're handled]\n\n### Design Quality\n[Assessment]\n\n### Maintainability\n[Assessment]\n\n### Modularity\n[Assessment]\n\n### Simplicity\n[Assessment]\n\n### Scalability\n[Assessment]\n\n### Security\n[Assessment]\n\n## Risks & Open Questions\n- [Risk/question with proposed mitigation or decision needed]\n\n## Out of Scope\n- [What this plan explicitly does not cover]\n```\n\n## Rules\n\n- Plan mode first: Always enter plan mode before doing any planning work. The plan is written to the plan file, not output as chat.\n- No code: Do not write implementation code during planning. The plan is the deliverable.\n- Ask, don't assume: If the request is ambiguous, ask clarifying questions. Prefer one round of good questions over multiple rounds of back-and-forth.\n- Read before designing: Explore the codebase thoroughly. 
Reference actual file paths, function names, and patterns from the project.\n- Be concrete: Implementation steps should reference specific files and modules, not vague descriptions like \"update the backend.\"\n- Be honest about uncertainty: If you're unsure about something, flag it as an open question rather than making a guess that will become the plan.\n- Respect existing patterns: The plan should extend the project's architecture, not fight it. If a new pattern is warranted, justify why.\n- Scope boundaries: Clearly state what is and isn't included. Prevent scope creep by naming it.\n- Verify API usage against official docs: Before finalizing any design that uses a specific SDK, API, or third-party service, consult the official documentation to confirm correct usage. Use available MCP tools (Supabase, Vercel, etc.) where possible. Do not rely on training knowledge — incorrect API usage is a design flaw that silently becomes a security vulnerability.\n- Name the pattern: when the design follows or introduces a named pattern (Repository, Strategy, ADR, C4 Container), name it and note its source so the rationale is traceable.\n- Delegate to audit skills: the quality analysis does not re-describe what the audit skills cover in detail — it identifies which domains apply and defers to those skills for the specific checklist.\n","reference":"# Feature Planning — Reference\n\nDetailed standards, plan quality criteria, templates, and anti-patterns for the skill defined in `SKILL.md`.\n\n---\n\n## 1. Design Methodologies\n\n### C4 Model (Simon Brown)\nApplicable to: Architecture & Module Design section\n\nUse C4 vocabulary to describe architecture at the right level of detail. 
Don't describe implementation-level detail in architecture, or architecture-level detail in a code comment.\n\n- System Context: How the feature fits in the broader product and what external systems it touches.\n- Container: Major runtime components (web app, API server, database, message queue, cache). A new Edge Function or a new Supabase table is a container-level concern.\n- Component: Key modules within a container (e.g., `useNotifications` hook, `NotificationService` class). Most features are designed at this level.\n- Code: Only describe at this level for non-obvious or algorithmically critical parts.\n\nWhen writing the Architecture section, identify which C4 level is appropriate. A simple UI tweak is Code-level. A new backend service is Container-level.\n\n### Architecture Decision Records (ADR)\nApplicable to: any significant or non-obvious design choice in the plan\n\nWhen the plan makes a non-obvious design choice (e.g., \"use Realtime instead of polling\", \"store as JSONB instead of normalized columns\"), embed a mini-ADR in the rationale:\n\n```\nDecision: [What was chosen]\nContext: [Why a decision was needed; what problem this solves; what alternatives were considered]\nConsequences: [What becomes easier; what becomes harder; what is explicitly ruled out]\n```\n\nThis prevents \"we chose X\" from becoming tribal knowledge. The next developer reading the code needs to know why, not just what.\n\n### RFC-Style Specification\nApplicable to: complex or high-risk features affecting multiple systems or teams\n\nFor features that significantly affect multiple teams or carry high design risk, structure the plan to include:\n\n- Abstract: 2–3 sentence summary of the feature and its purpose.\n- Motivation: Why this is needed now. What problem it solves. Why existing solutions are insufficient.\n- Drawbacks: Reasons not to build this, or not to build it this way.\n- Alternatives: Other approaches considered and why they were rejected.\n\n---\n\n## 2. 
Plan Quality Criteria\n\nA plan section is \"done\" when it meets these criteria. Self-check before calling `ExitPlanMode`.\n\n### Context\n- [ ] References actual file paths, function names, and patterns from the real codebase (not generic descriptions).\n- [ ] Identifies all existing systems the feature will interact with or depend on.\n- [ ] Notes which existing files will change, not just what will be added.\n\n### Requirements\n- [ ] Functional requirements describe observable behavior (inputs, outputs, user flows) — not implementation details.\n- [ ] Non-functional requirements name specific targets (\"p95 latency < 200ms\", \"works offline for up to 24h\") — not vague aspirations (\"it should be fast\").\n- [ ] Out of scope is stated explicitly for anything a reader might reasonably assume is included.\n\n### User-Facing Behavior\n- [ ] Happy path is described end-to-end from the user's perspective.\n- [ ] Every error state has an explicit description of what the user sees — not \"show an error\" but \"display 'Something went wrong. Try again.' 
with a retry button.\"\n- [ ] Empty state is defined (what the user sees before any data exists for this feature).\n- [ ] Loading / pending state is defined if the feature involves async operations.\n\n### Data Model Changes\n- [ ] New tables include all columns with types, nullability, defaults, CHECK constraints, and FK `ON DELETE` behavior.\n- [ ] RLS requirements are stated for every new table.\n- [ ] Index requirements are stated based on the query access patterns described in the plan.\n- [ ] Migration is characterized as destructive / non-destructive, and whether a data backfill is needed.\n\n### Architecture\n- [ ] Lists specific files to be created and specific existing files to be modified.\n- [ ] Responsibility of each new module is stated in one sentence.\n- [ ] Dependency graph between new modules is described (what imports what).\n- [ ] No circular dependencies introduced.\n\n### API & Integration Points\n- [ ] Endpoint paths, HTTP methods, request bodies, and response shapes are defined.\n- [ ] Auth requirements are stated per endpoint.\n- [ ] Error response shapes and status codes are defined (not just the 200 case).\n\n### Implementation Steps\n- [ ] Each step is small enough to be a single commit.\n- [ ] Dependencies between steps are noted (what must come before what).\n- [ ] Steps that can be parallelized are identified.\n- [ ] The first step is always safe to merge independently (non-breaking change).\n\n---\n\n## 3. 
Plan Section Templates\n\n### Data Model Changes\n\nBad (too vague):\n> We'll add a notifications table.\n\nGood (specific):\n> New table: `notifications`\n>\n> \| Column \| Type \| Constraints \|\n> \|--------\|------\|-------------\|\n> \| `id` \| `UUID` \| `PRIMARY KEY DEFAULT gen_random_uuid()` \|\n> \| `user_id` \| `UUID` \| `NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE` \|\n> \| `type` \| `TEXT` \| `NOT NULL CHECK (type IN ('quest_complete', 'reward_earned', 'system'))` \|\n> \| `read_at` \| `TIMESTAMPTZ` \| nullable — null means unread \|\n> \| `created_at` \| `TIMESTAMPTZ` \| `NOT NULL DEFAULT now()` \|\n>\n> RLS: `USING (user_id = auth.uid())` for SELECT; no UPDATE/DELETE for users.\n> Index: `(user_id, created_at DESC)` — supports the \"latest N unread for user\" query.\n> Migration: Non-destructive (new table). No backfill required.\n\n---\n\n### Implementation Steps\n\nBad (too vague):\n> 1. Build the backend.\n> 2. Build the frontend.\n> 3. Add tests.\n\nGood (specific):\n> 1. [Migration] Add `notifications` table and RLS policy. Non-destructive; safe to ship independently.\n> 2. [Edge Function] `POST /notifications/mark-read` — Zod-validated body, updates `read_at`, returns 204. Blocked by step 1.\n> 3. [React hook] `useNotifications()` — Realtime subscription scoped to `auth.uid()`. Can be built in parallel with step 2.\n> 4. [UI] `<NotificationBell>` — badge count, dropdown list, \"mark all read\" action. Blocked by step 3.\n> 5. [Test] Integration test: verify user A cannot read user B's notifications (RLS enforcement). 
Blocked by step 1.\n\n---\n\n### API Endpoint\n\n> `POST /api/quests/:questId/complete`\n> - Auth: Requires valid JWT; `getUser()` server-side (not `getSession()`).\n> - Authorization: Verify `quest.user_id === authenticatedUser.id` before any mutation.\n> - Request body: `{ evidence: string }` — validated with Zod; `evidence` max 500 chars, non-empty.\n> - Response (200): `{ coinsAwarded: number, newBalance: number }`\n> - Response (404): Quest not found or does not belong to caller. (Do not distinguish between the two — prevents enumeration.)\n> - Response (409): Quest already completed.\n> - Response (422): Schema validation failure with field-level errors.\n\n---\n\n### Architecture Decision Record (inline)\n\n> Decision: Use Supabase Realtime for live notification updates instead of polling.\n> Context: The feature requires users to see new notifications without refreshing. Polling every N seconds introduces latency and unnecessary load. Realtime is already available in the project infrastructure.\n> Consequences: Simpler client code (no polling interval to manage); subscription must be cleaned up on component unmount to avoid leaks; does not work for users behind restrictive firewalls (acceptable for this use case).\n\n---\n\n## 4. Common Planning Anti-Patterns\n\n### Premature Generalization\n(YAGNI — Extreme Programming, Kent Beck & Ron Jeffries)\n\nThe plan designs a general-purpose system for one concrete use case. Examples: building a \"plugin architecture\" when one integration is needed; an \"event bus\" when one event type exists; an \"action system\" for a single action type.\n\nSignal: The architecture section describes abstractions (interfaces, factories, registries) where no concrete second implementation exists or is planned.\n\nRemedy: Design for the concrete case. 
Note in Out of Scope that generalization is deferred until a second concrete case exists.\n\n---\n\n### Over-Complex Control Flow\n(KISS — Clarence Johnson)\n\nThe design requires a developer to trace through several interacting systems to follow one user action. Each hop (component → service → event → consumer → database) multiplies failure modes and debugging surface.\n\nSignal: Implementation steps require more than 3 conceptual layers for a straightforward operation.\n\nRemedy: Simplify the call chain. Prefer direct calls over event-driven patterns until the added complexity is justified by a concrete requirement (e.g., \"multiple independent consumers\", \"decoupled deployment\").\n\n---\n\n### Missing Error States in User-Facing Behavior\n(Defensive Programming — Steve McConnell, Code Complete)\n\nThe user-facing behavior section describes only the happy path. Network failures, validation errors, empty states, and permission-denied cases are left undefined. These become inconsistent behavior implemented ad-hoc during implementation.\n\nSignal: The user-facing behavior section has no \"when X fails, the user sees…\" entries.\n\nRemedy: For every user-visible action, add an explicit error state: what message appears, where it appears, and whether the user can recover (retry vs. dead end).\n\n---\n\n### Unstated Assumptions\n(The Pragmatic Programmer — Hunt & Thomas: \"Don't Assume, Check\")\n\nThe plan assumes an external API contract, an existing service capability, a team decision, or an infrastructure arrangement that has not been confirmed. 
These become discovered blockers during implementation.\n\nSignal: Phrases like \"we'll integrate with X\", \"X already supports this\", or \"the infra team will handle Y\" without a reference or confirmation.\n\nRemedy: Flag every unconfirmed assumption as an explicit open question in Risks & Open Questions, with a named owner and a decision deadline if possible.\n\n---\n\n### Circular Module Dependencies\n(Clean Architecture — Robert C. Martin)\n\nThe architecture introduces a dependency cycle: A imports B, B imports C, C imports A. This prevents independent testing, makes initialization order fragile, and is a source of \"works but nobody knows why\" bugs.\n\nSignal: In the dependency graph, any arrow forms a loop.\n\nRemedy: Extract the shared dependency into a third module that neither A nor C depend on, or invert one dependency using an interface (Dependency Inversion Principle).\n\n---\n\n### Data Model Without Constraints\n(Defensive Programming; database design best practices)\n\nNew tables are defined without `NOT NULL`, `CHECK`, or explicit FK `ON DELETE` behavior. Constraints are the last line of defense — they enforce correctness even when the application layer has a bug or is bypassed (e.g., a direct DB migration, a future code path).\n\nSignal: A table definition where any column that should always have a value lacks `NOT NULL`; a financial amount column without a `CHECK (amount > 0)` constraint; a FK without a stated `ON DELETE` policy.\n\nRemedy: For every new column, explicitly state: nullable or not, default value, and any domain constraint. For every FK: `CASCADE`, `SET NULL`, or `RESTRICT` — never leave it unstated.\n"},"security-audit":{"content":"---\nname: security-audit\ndescription: Performs a thorough security audit against established industry standards (OWASP Top 10 2021, OWASP API Security Top 10 2023, CWE taxonomy, GDPR, PCI-DSS). 
Use when reviewing for security vulnerabilities, hardening production systems, auditing auth/payment/database code, or conducting periodic security reviews. Works on git diffs, specific files, or an entire codebase.\n---\n\n# Security Audit\n\nAudit code against established security standards and threat models. Every finding must cite the specific standard ID (OWASP, CWE, GDPR article, etc.) so the developer understands the authoritative source for each requirement. This skill is for security-specific review; for clean code and architecture concerns, use `best-practices-audit` instead.\n\n## Scope\n\nDetermine what to audit based on user request and context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added code and its immediate context\n- File/directory mode: audit the files or directories the user specifies\n- Full audit mode: when the user asks for a full security review, scan all source code (skip vendor/node_modules/build artifacts); prioritize files touching auth, payments, database, and external integrations\n\nRead all in-scope code before producing findings.\n\n## Domains to Evaluate\n\nCheck each domain. Skip domains with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions, standard IDs, and concrete examples.\n\n### 1. Authentication & Session Management\n(OWASP A07:2021, CWE-287, CWE-384)\n\n- Using `getSession()` instead of server-side `getUser()` for auth decisions (JWT trusting without server validation)\n- Missing token expiry enforcement; long-lived tokens without rotation\n- Weak or missing logout (session not invalidated server-side)\n- OAuth state parameter missing or not validated (CSRF on OAuth flows)\n- Trusting client-provided user identity without server-side verification\n- Credentials stored in localStorage instead of httpOnly cookies\n\n### 2. 
Authorization & Access Control\n(OWASP A01:2021, OWASP API2:2023, CWE-284, CWE-639)\n\n- BOLA/IDOR: object IDs accepted from user input without ownership verification\n- Missing Row-Level Security (RLS) policies on database tables\n- Privilege escalation paths: routes or RPCs accessible to roles that shouldn't have access\n- Broken function-level auth: admin/internal endpoints not restricted by role\n- REVOKE gaps: functions or tables accessible to PUBLIC or anon when they shouldn't be\n- Assuming the presence of a valid JWT implies authorization (JWT ≠ authz check)\n\n### 3. Injection\n(OWASP A03:2021, CWE-89, CWE-79, CWE-77, CWE-94)\n\n- SQL injection: raw string interpolation in queries; use parameterized queries or an ORM\n- XSS: unsanitized user content inserted into HTML; missing `Content-Security-Policy`\n- Command injection: user input passed to shell commands, `exec()`, `eval()`, `Function()`\n- Template injection: user-controlled strings rendered by a template engine\n- Schema pollution (PostgreSQL): SECURITY DEFINER functions without `SET search_path = ''`; attacker-controlled schemas prepended to search path\n\n### 4. Cryptography & Secrets\n(OWASP A02:2021, CWE-327, CWE-798, CWE-312, CWE-321)\n\n- Hardcoded credentials, API keys, tokens, or secrets in source code or `.env.example`\n- Secrets in environment variables loaded client-side (exposed in browser bundles)\n- Weak hashing algorithms (MD5, SHA-1) used for security purposes\n- Tokens or sensitive data stored in plaintext in the database instead of a secrets vault\n- Missing HTTPS enforcement; secrets transmitted over HTTP\n- JWT secrets that are short, guessable, or shared across environments\n\n### 5. Input Validation & Output Encoding\n(CWE-20, CWE-116, CWE-601, OWASP A03:2021)\n\n- No schema validation (Zod, Yup, JSON Schema, etc.) 
at API boundaries\n- Validation only on the client, not enforced on the server\n- Missing length/range constraints on user-supplied strings (no `maxLength`, no `CHECK` constraint)\n- Missing content-type validation on file uploads\n- Open redirects: user-controlled URL passed directly to redirect without allowlist validation\n- Missing `encodeURIComponent` on user data placed in URLs\n\n### 6. API Security\n(OWASP API Top 10 2023)\n\n- API1 — BOLA: resources returned or modified by user-supplied ID without ownership check\n- API2 — Broken Auth: unprotected endpoints, missing JWT verification, bearer token in URL\n- API3 — Broken Object Property Level Auth: response includes fields (e.g. `role`, `coins`, `internal_id`) that the caller should not see\n- API4 — Unrestricted Resource Consumption: no rate limiting, pagination, or request size limits\n- API5 — Broken Function Level Auth: non-public actions (admin, delete, ban) not verified against caller's role\n- API7 — SSRF: URL parameters or webhook URLs accepted from user input without allowlist validation\n- API8 — Security Misconfiguration: permissive CORS (`*`), verbose error messages leaking stack traces or schema details, debug endpoints in production\n- API10 — Unsafe Consumption of APIs: external API responses trusted without validation; webhooks not verified via HMAC signature\n\n### 7. 
Database Security\n(CWE-250, CWE-284, PostgreSQL Security Best Practices)\n\n- Tables created without `ENABLE ROW LEVEL SECURITY`\n- Missing `REVOKE EXECUTE` on SECURITY DEFINER functions from `PUBLIC`, `authenticated`, `anon`\n- SECURITY DEFINER functions without `SET search_path = ''` (schema pollution vector)\n- Missing `REVOKE TRUNCATE` on financial, audit, or compliance tables\n- Overly permissive RLS policies (e.g., `USING (true)` on sensitive tables)\n- Direct client-to-database connections bypassing application security layer\n- Sensitive columns (tokens, PII) stored in plaintext instead of encrypted columns or vault references\n- Missing `CHECK` constraints on financial columns (e.g., balance `>= 0`, amount sign validation)\n\n### 8. Rate Limiting & Denial-of-Service\n(OWASP API4:2023, CWE-770, CWE-400)\n\n- No rate limiting on authentication endpoints (brute force enabler)\n- No rate limiting on expensive operations (sync, export, AI calls, file uploads)\n- Rate limits implemented in-memory per process/isolate (bypassed by horizontal scaling or redeployment)\n- Missing request body size limits (memory exhaustion)\n- Unbounded database queries without `LIMIT` clause (full table scan DoS)\n- No backoff or circuit breaker for outbound calls to third-party services\n\n### 9. Concurrency & Race Conditions\n(CWE-362, CWE-367 TOCTOU)\n\n- Check-then-act patterns on financial or inventory data without database-level locking\n- Double-spend or double-grant risk: no idempotency key or `ON CONFLICT DO NOTHING` guard\n- Missing advisory locks or `SELECT FOR UPDATE` on critical rows during multi-step transactions\n- Non-atomic read-modify-write sequences on shared state (coin balance, stock count, etc.)\n- Idempotency keys that can be `NULL` (treated as distinct by PostgreSQL UNIQUE, allowing bypass)\n\n### 10. 
Financial & Transaction Integrity\n(PCI-DSS Req 6 & 10, CWE-362)\n\n- Client-side coin/credit/reward calculation (any value trusted from client is a vulnerability)\n- Missing `CHECK` constraint on transaction amount sign (credits vs. debits not enforced at DB level)\n- Coin or balance modification without an audit trail (append-only transaction log)\n- Webhook events not deduplicated by a provider-assigned event ID (replay attack enabler)\n- Webhook signature not verified (unauthenticated financial state changes)\n- Deletion of financial transaction records (violates audit trail requirements; potential legal violation)\n- Missing `NOT NULL` on idempotency key column for transaction tables\n\n### 11. Security Logging & Monitoring\n(OWASP A09:2021, CWE-778, CWE-117)\n\n- Security-relevant events not logged (auth failures, permission denials, validation failures, HMAC failures)\n- Log injection: unsanitized user input included directly in log messages\n- Sensitive data (passwords, tokens, card numbers, PII) written to logs\n- No structured logging — free-text logs that can't be queried or alerted on\n- Missing correlation between security events and user/request IDs\n- No alerting or anomaly detection on suspicious event patterns\n- Logs stored in a volatile medium (in-memory, ephemeral filesystem) that survives restarts but not scaling events\n\n### 12. Secrets & Environment Security\n(CWE-798, CWE-312, 12-Factor App)\n\n- Secrets committed to git (`.env`, private keys, API tokens in source files)\n- Fallback to insecure defaults when env vars are absent (e.g., CORS origin falling back to `*`)\n- Using the same secrets across development, staging, and production environments\n- Secrets logged or included in error messages\n- Client-side environment variables (prefixed `VITE_`, `NEXT_PUBLIC_`, etc.) containing server-side secrets\n- Secrets passed as CLI arguments (visible in process list)\n\n### 13. Data Privacy & Retention\n(GDPR Art. 
5/17/25, CCPA, CWE-359)\n\n- PII stored longer than necessary (no retention policy or purge cron)\n- No anonymization path for account deletion (right to erasure, GDPR Art. 17)\n- PII in logs, error messages, or analytics events that shouldn't be there\n- Missing `ON DELETE SET NULL` or equivalent for user-linked tables that must survive account deletion\n- Financial records with FK `ON DELETE CASCADE` that would purge legally required audit evidence\n- No consent record for data collection (GDPR Art. 6)\n- User data returned in API responses without field-level access checks (over-fetching)\n\n### 14. Security Misconfiguration\n(OWASP A05:2021, CWE-16)\n\n- Permissive CORS (`Access-Control-Allow-Origin: *` on authenticated endpoints)\n- Missing `Content-Security-Policy`, `X-Frame-Options`, `X-Content-Type-Options` headers\n- HTTP used instead of HTTPS; missing HSTS header\n- Debug/development endpoints or verbose error responses in production\n- Default credentials or example configurations deployed\n- Database or storage buckets with public access that should be private\n- Missing `SameSite` attribute on session cookies\n- JWT verification disabled on functions that handle authenticated user data\n\n### 15. Supply Chain & Dependency Security\n(OWASP A06:2021, CWE-1357)\n\n- Dependencies with known CVEs (run `npm audit`, `pip audit`, `bun audit`)\n- Unpinned dependency versions (`*`, `latest`, `^` for production dependencies)\n- Dependencies pulled from non-official registries without integrity hashing\n- Dev dependencies installed in production containers\n- Missing integrity subresource hashing on CDN-loaded scripts\n\n### 16. 
TypeScript / JavaScript Specific\n(CWE-843 Type Confusion, CWE-915 Improperly Controlled Modification)\n\n- `as any` or `as unknown as T` casts that bypass type checking on externally-sourced data\n- Prototype pollution: `Object.assign(target, userControlledObject)` or spread of unvalidated input onto objects\n- `eval()`, `new Function()`, `setTimeout(string)`, or `innerHTML =` with user-controlled content\n- `JSON.parse()` result used without validation (treat parsed JSON as `unknown`, not `any`)\n- Arithmetic on `bigint` and `number` without explicit conversion (silent precision loss)\n- Async functions missing `await` on promises that should be awaited (unhandled rejection, ordering bug)\n\n## Static Analysis Tools\n\nBefore producing findings, run available tools on in-scope code. Incorporate tool output into your findings (cite the tool rule alongside the standard ID).\n\n### npm / bun audit (dependency vulnerabilities)\n```bash\nnpm audit --audit-level=moderate # or: bun audit\n```\nMap findings to OWASP A06:2021 and the specific CVE ID.\n\n### ESLint with security plugins\n```bash\n# Check for eslint-plugin-security in devDependencies first\nnpx eslint --ext .ts,.tsx src/\n```\nKey rules to look for: `security/detect-object-injection`, `security/detect-non-literal-regexp`, `no-eval`, `no-implied-eval`.\n\n### Semgrep (if available)\n```bash\nsemgrep --config=p/owasp-top-ten .\nsemgrep --config=p/typescript .\n```\n\n### Ruff with Bandit rules (Python)\n```bash\nruff check --select S . # Bandit security rules\n```\n\n### How to use tool output\n1. Map each tool finding to its security domain (e.g., a SQL injection ESLint rule → Domain 3: Injection).\n2. Critical CVEs or injection/auth findings → Critical. Outdated deps with low-severity CVEs → Warning or Suggestion.\n3. If a tool is not present or produces no findings, note \"npm audit: clean\" etc. 
in the Summary.\n\n## API & Tech Stack Verification\n\nBefore finalizing findings, verify security-relevant API and SDK usage against official documentation:\n\n- Look up official docs: If the code uses a specific SDK, API, or service (e.g. Supabase auth, Stripe, OAuth providers), consult the official documentation to confirm the correct security usage pattern. Do not rely on training knowledge — APIs change, and incorrect usage is frequently a Critical security flaw that looks correct to a code reviewer.\n- Use available MCP tools: Check if available MCP tools (Supabase MCP, Vercel MCP, etc.) can provide faster or more authoritative access to official docs.\n- Wrong API usage = security finding: If code uses an API in a non-standard or incorrect way that bypasses security controls (e.g. trusting client-side session data instead of server-side verification), it must be reported as a finding at the appropriate severity — not treated as a style issue.\n\n## False Positive Filtering\n\nBefore including any finding in the report, apply these filters in order. A report with 3 real findings is more valuable than one with 3 real findings buried in 12 noise items.\n\n### Hard Exclusions\n\nAutomatically exclude findings that match these categories — do not report them even as Low:\n\n1. Pure DoS / resource exhaustion without an auth bypass or data-integrity component. Domain 8 items belong in the report only when combined with another vulnerability class (e.g., unbounded query + missing auth = Critical, unbounded query alone = excluded).\n2. Theoretical race conditions without a concrete exploitation path. Only report a race condition if you can describe the specific interleaving of requests that causes harm (e.g., double-spend). \"This read-modify-write could race\" is not a finding.\n3. Outdated dependency versions — these are surfaced by `npm audit` / `bun audit` output in the Summary section. 
Do not create individual findings for known CVEs in third-party libraries; that is the dependency scanner's job.\n4. Missing hardening with no attack vector — e.g., \"should add CSP header\" when there is no XSS vector in the application, or \"should add rate limiting\" on an internal-only endpoint. A missing defense layer is only a finding when the attack it defends against is actually possible.\n5. Test-only code — unit tests, fixtures, test helpers, mocks, and seed scripts. Exception: test files that contain real secrets or credentials.\n6. Log spoofing / unsanitized log output — unless the log output feeds a downstream system that parses and acts on log content (SIEM injection, log-based alerting bypass).\n7. Regex injection / ReDoS — unless the regex runs on untrusted input in a hot path with no timeout and you can demonstrate catastrophic backtracking.\n8. Documentation-only files — markdown, JSDoc comments, README content. These are not executable.\n9. Client-side validation gaps when server-side validation exists — missing Zod schema in a React form is a UX concern, not a security finding, if the API endpoint validates the same input.\n10. SSRF limited to path control — only report SSRF when the attacker can control the host or protocol. Path-only SSRF is not exploitable in practice.\n11. Memory safety issues in memory-safe languages — buffer overflows, use-after-free, etc. are impossible in TypeScript, Python, Go, Rust, and Java. Do not report them.\n12. Secrets or credentials stored on disk if they are otherwise secured (e.g., encrypted at rest, in a secrets vault, or managed by a dedicated process).\n\n### Framework & Language Precedents\n\nThese are established rulings — patterns that are NOT vulnerabilities by themselves:\n\n1. React / Angular / Vue are XSS-safe by default. Only flag XSS when using `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, `v-html`, `[innerHTML]`, or equivalent escape hatches. 
Normal JSX interpolation (`{userInput}`) is auto-escaped.\n2. UUIDs (v4) are unguessable. Don't flag UUID-based resource access as IDOR unless the real issue is missing ownership verification (the problem is the missing WHERE clause, not the identifier format).\n3. Environment variables and CLI flags are trusted input. Attacks requiring attacker-controlled env vars are invalid in standard deployment models. Do not flag `process.env.X` as \"unsanitized input.\"\n4. Client-side code does not need auth checks. The backend is responsible for authorization. Missing permission guards in React components, API client wrappers, or frontend route guards are not security findings — they are UX decisions.\n5. GitHub Actions: most injection vectors are not exploitable. Only flag when untrusted input (PR title, branch name, issue body, commit message) flows into `run:` steps via `${{ }}` expression injection without intermediate sanitization.\n6. Jupyter notebooks run locally. Only flag if untrusted external input reaches code execution, not just because a cell calls `eval()` on a hardcoded string.\n7. Shell scripts with no untrusted input are safe. Command injection requires untrusted user input flowing into the script. Scripts that only process env vars, hardcoded paths, or pipeline-internal values are not vulnerable.\n8. `JSON.parse()` is not a vulnerability. Only a finding if the parsed result is used without validation in a security-critical path (auth decisions, financial calculations, SQL query construction).\n9. Logging non-PII data is safe. Only report logging findings when secrets (passwords, tokens, API keys) or personally identifiable information is written to logs. Logging URLs, request metadata, or error messages is not a vulnerability.\n\n### Confidence Gate\n\nBefore including any finding, answer these three questions:\n\n1. Concrete attack path? Can you describe the specific HTTP request, API call, or user action an attacker would use? 
If not, it's a code smell, not a finding.\n2. Reasonable disagreement? Could a competent security engineer argue this is not a vulnerability given the application's threat model? If yes, downgrade to a \"Needs Investigation\" note in the Summary.\n3. Specific location? Does the finding have an exact file path, line number, and reproduction scenario? Vague findings (\"the app should use HTTPS somewhere\") are not actionable and must be excluded.\n\nIf any question raises doubt, do not report it as a formal finding. Instead, add a brief \"Needs Investigation\" note in the Summary section so the developer is aware without the noise.\n\n## Output Format\n\nGroup findings by severity. Each finding must name the specific standard violated.\n\n```\n## Critical\nViolations that are directly exploitable or enable data theft, privilege escalation, or financial fraud.\n\n### [DOMAIN] Brief title\nFile: `path/to/file.ts` (lines X–Y)\nStandard: OWASP A01:2021 / CWE-639 — one-line description of what the standard requires.\nViolation: What the code does wrong and the concrete attack scenario.\nFix: Specific, actionable code change or architectural remedy.\n\n## High\nViolations that create significant risk but require specific conditions or chaining to exploit.\n\n(same structure)\n\n## Medium\nDefense-in-depth gaps, missing controls, or violations that increase attack surface.\n\n(same structure)\n\n## Low\nBest-practice deviations, hardening opportunities, or compliance gaps unlikely to be directly exploited.\n\n(same structure)\n\n## Needs Investigation (optional)\nBrief notes on patterns that warrant a closer look but did not pass the Confidence Gate. 
These are not formal findings.\n\n## Summary\n- Total findings: N (X critical, Y high, Z medium, W low)\n- Highest-risk area: name the domain with the most severe findings\n- Key standards violated: list specific OWASP/CWE IDs\n- Overall security posture: 1–2 sentence verdict\n- Recommended immediate action: the single most urgent fix\n```\n\n## Rules\n\n- Cite the standard: every finding must reference a specific standard ID (OWASP A-code, CWE-NNN, GDPR Art. N, PCI-DSS Req. N). This is the core value of this skill.\n- Model the attack: every Critical or High finding must describe the realistic attack scenario, not just the code smell.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"add validation\" but \"use a Zod schema on the request body and reject with 400 if it fails.\"\n- Severity by exploitability: rate severity by real-world exploitability and impact, not theoretical worst-case. A missing CSP header with no XSS vector is Low at most. A SQL injection in a public endpoint is Critical regardless of whether a WAF might catch it.\n- Don't duplicate best-practices-audit: focus on security vulnerabilities and compliance gaps. Architecture and clean code issues belong in the other skill.\n- Minimize false positives: Apply the False Positive Filtering rules (Hard Exclusions, Framework Precedents, Confidence Gate) before including any finding. When uncertain, add a \"Needs Investigation\" note in the Summary rather than reporting a formal finding. A clean report with 3 real findings is more valuable than one with 3 real findings buried in 12 noise items.\n- Verify API usage against official docs: Do not assume an API or SDK is being used correctly based on training knowledge. If the code uses a specific SDK or service, look up the official documentation (using MCP tools where available) and verify the security-relevant usage pattern is correct. 
Incorrect API usage that bypasses security controls is a Critical finding.\n- Defense-in-depth counts: a control missing a second layer of enforcement (e.g., RLS present but no CHECK constraint) is a Medium finding even if the first layer is sound.\n","reference":"# Security Audit — Reference\n\nDetailed definitions, standard sources, violation examples, and fixes for each domain in `SKILL.md`.\n\n---\n\n## 1. Authentication & Session Management\nStandards: OWASP A07:2021 — Identification and Authentication Failures; CWE-287 Improper Authentication; CWE-384 Session Fixation; RFC 6750 Bearer Token Usage\n\n### `getSession()` vs. `getUser()` — OWASP A07:2021\n\n`getSession()` reads the JWT from the client-supplied cookie/header and parses it locally. A tampered or expired JWT can appear valid if clock skew or local validation is used. `getUser()` performs a server-side round-trip to the authorization server, guaranteeing the token is currently valid and the user account has not been revoked.\n\nViolation pattern (Supabase/TypeScript):\n```ts\n// WRONG — trusts client-supplied JWT locally\nconst { data: { session } } = await supabase.auth.getSession();\nconst userId = session?.user?.id;\n```\nFix:\n```ts\n// CORRECT — server validates the token\nconst { data: { user }, error } = await supabase.auth.getUser(authHeader);\nif (error \|\| !user) return unauthorized();\n```\n\n### OAuth State Parameter — CWE-352 CSRF\n\nThe OAuth `state` parameter must be a cryptographically random nonce stored server-side (or signed cookie). Without it, an attacker can force a victim to link their account to the attacker's OAuth token.\n\nFix: Generate `state = crypto.randomUUID()`, store in DB or signed cookie with short TTL, validate on callback before exchanging code.\n\n---\n\n## 2. 
Authorization & Access Control\nStandards: OWASP A01:2021 — Broken Access Control; OWASP API1:2023 — Broken Object Level Authorization; CWE-284 Improper Access Control; CWE-639 Authorization Bypass Through User-Controlled Key\n\n### BOLA / IDOR\n\nThe most prevalent API vulnerability class. Any time a user-controlled identifier (UUID, integer, slug) is used to look up a resource, ownership must be verified server-side — it cannot be assumed from the JWT alone.\n\nViolation pattern:\n```ts\n// WRONG — trusts caller-supplied userId\nconst { id } = req.body;\nconst resource = await db.query(\"SELECT * FROM documents WHERE id = $1\", [id]);\nreturn resource; // returns any user's document\n```\nFix:\n```ts\n// CORRECT — adds ownership column to WHERE clause\nconst resource = await db.query(\n \"SELECT * FROM documents WHERE id = $1 AND owner_id = $2\",\n [id, authenticatedUser.id]\n);\nif (!resource) return notFound(); // don't reveal existence\n```\n\n### Row-Level Security (PostgreSQL)\n\nEvery table with user-scoped data must have RLS enabled AND a policy defined. RLS enabled with no policies = no access. RLS disabled = all data visible to any authenticated DB connection.\n\nRequired pattern:\n```sql\nALTER TABLE documents ENABLE ROW LEVEL SECURITY;\n\nCREATE POLICY \"users_own_documents\"\n ON documents FOR ALL\n TO authenticated\n USING (owner_id = auth.uid());\n```\n\nHigh-risk gap: Financial tables (transactions, payment records) should have RLS but also block UPDATE/DELETE via separate policies or triggers — RLS `FOR ALL` with `USING` only controls SELECT.\n\n---\n\n## 3. Injection\nStandards: OWASP A03:2021 — Injection; CWE-89 SQL Injection; CWE-79 XSS; CWE-77 Command Injection; CWE-94 Code Injection\n\n### SQL Injection — CWE-89\n\nAny string concatenation or interpolation in a SQL query is potentially exploitable. 
The fix is always parameterized queries (also called prepared statements).\n\nViolation:\n```ts\n// WRONG\nconst result = await db.query(`SELECT * FROM users WHERE name = '${name}'`);\n```\nFix:\n```ts\n// CORRECT\nconst result = await db.query(\"SELECT * FROM users WHERE name = $1\", [name]);\n```\n\n### Schema Pollution (PostgreSQL SECURITY DEFINER) — CWE-89\n\nA function with `SECURITY DEFINER` runs with the privileges of the function's owner (often a superuser). If `search_path` is not pinned, an attacker who can create schemas may prepend a malicious schema, causing the function to resolve table names to their injected versions.\n\nViolation:\n```sql\nCREATE OR REPLACE FUNCTION credit_coins(uid uuid, amount int)\nRETURNS void\nLANGUAGE plpgsql\nSECURITY DEFINER AS $$\nBEGIN\n UPDATE profiles SET coins = coins + amount WHERE id = uid;\nEND;\n$$;\n```\nFix:\n```sql\nCREATE OR REPLACE FUNCTION public.credit_coins(uid uuid, amount int)\nRETURNS void\nLANGUAGE plpgsql\nSECURITY DEFINER\nSET search_path = '' -- pins search path; no user schema can be injected\nAS $$\nBEGIN\n UPDATE public.profiles SET coins = coins + amount WHERE id = uid;\nEND;\n$$;\n```\n\n### XSS — CWE-79\n\nNever assign user-controlled content to `innerHTML`, `outerHTML`, `document.write()`, or React's `dangerouslySetInnerHTML` without sanitization.\n\nViolation:\n```ts\nelement.innerHTML = userInput; // executes embedded <script> tags\n```\nFix:\n```ts\nelement.textContent = userInput; // text node — never executed as HTML\n// If HTML is genuinely needed, use DOMPurify:\nelement.innerHTML = DOMPurify.sanitize(userInput, { ALLOWED_TAGS: ['b', 'i'] });\n```\n\n---\n\n## 4. Cryptography & Secrets\nStandards: OWASP A02:2021 — Cryptographic Failures; CWE-327 Use of Broken Algorithm; CWE-798 Hardcoded Credentials; CWE-312 Cleartext Storage; NIST SP 800-131A\n\n### Hardcoded Secrets — CWE-798\n\nAny secret in source code is compromised the moment the repo is cloned. 
Even private repos have been breached.\n\nScan for: `apiKey =`, `password =`, `secret =`, `token =`, `-----BEGIN RSA PRIVATE KEY-----` in `.ts`, `.js`, `.json`, `.toml`, `.yaml` files.\n\nFix: Rotate immediately. Store in environment variables loaded at runtime (never in source), or a secrets manager (HashiCorp Vault, AWS Secrets Manager, Supabase Vault).\n\n### Broken Hash Algorithms — CWE-327\n\nMD5 and SHA-1 are collision-compromised. Never use for password hashing, HMAC, or integrity verification.\n\n- Passwords: use `bcrypt` (cost ≥ 12), `argon2id`, or `scrypt`.\n- HMAC: use SHA-256 minimum. `HMAC-SHA256` is the baseline for webhook signatures.\n- File integrity: SHA-256 minimum.\n\n### Client-Side Secret Exposure\n\nIn Vite: `VITE_` variables are embedded in the JS bundle and visible to any user who opens DevTools. In Next.js: `NEXT_PUBLIC_` is the same. Never put API keys or service secrets in these variables.\n\n---\n\n## 5. Input Validation & Output Encoding\nStandards: CWE-20 Improper Input Validation; CWE-116 Improper Encoding; CWE-601 Open Redirect; OWASP Input Validation Cheat Sheet\n\n### Server-Side Validation is Non-Negotiable\n\nClient-side validation (React form validation, browser `required` attributes) is UX, not security. Any attacker can send raw HTTP requests bypassing the client entirely.\n\nRequired pattern (TypeScript with Zod):\n```ts\nconst Schema = z.object({\n username: z.string().min(1).max(30),\n amount: z.number().int().positive().max(10_000),\n});\n\nconst parsed = Schema.safeParse(req.body);\nif (!parsed.success) return badRequest(parsed.error.flatten());\n// Use parsed.data — never req.body — downstream\n```\n\n### Defense-in-Depth: Database CHECK Constraints\n\nApplication validation can be bypassed (direct DB connection, migration mistake, future code path). 
CHECK constraints are the last line of defense.\n\n```sql\n-- Prevents negative balance under any race condition\nALTER TABLE profiles ADD CONSTRAINT chk_coins_non_negative CHECK (coins >= 0);\n\n-- Enforces transaction sign by type\nALTER TABLE coin_transactions ADD CONSTRAINT chk_credit_positive\n CHECK (tx_type NOT IN ('quest_reward', 'purchase') OR amount > 0);\nALTER TABLE coin_transactions ADD CONSTRAINT chk_debit_negative\n CHECK (tx_type NOT IN ('cosmetic_purchase', 'refund') OR amount < 0);\n```\n\n### Open Redirect — CWE-601\n\n```ts\n// WRONG — attacker crafts ?next=https://evil.com\nconst next = req.query.next;\nres.redirect(next);\n\n// CORRECT — validate against allowlist\nconst ALLOWED_PATHS = ['/dashboard', '/profile', '/settings'];\nconst next = req.query.next;\nif (!ALLOWED_PATHS.includes(next)) return res.redirect('/dashboard');\nres.redirect(next);\n```\n\n---\n\n## 6. API Security\nStandards: OWASP API Security Top 10 2023\n\n### API1:2023 — Broken Object Level Authorization (BOLA)\n\nSee Domain 2. Every resource access must verify ownership. This is the #1 API vulnerability.\n\n### API3:2023 — Broken Object Property Level Authorization\n\nAPIs often return full database row objects. If the object contains fields the caller should not see (other users' data, internal flags, admin properties), this is a data exposure violation.\n\nFix: Explicitly allowlist fields returned in API responses. 
Never return `SELECT *` to the client.\n\n```ts\n// WRONG\nreturn res.json(userRow); // includes password_hash, role, internal_flags\n\n// CORRECT\nreturn res.json({\n id: userRow.id,\n displayName: userRow.display_name,\n avatarUrl: userRow.avatar_url,\n});\n```\n\n### API7:2023 — Server-Side Request Forgery (SSRF)\n\nIf the application fetches a URL derived from user input, an attacker can target internal services (metadata endpoints, Redis, internal databases).\n\nViolation:\n```ts\n// WRONG — user controls the URL\nconst data = await fetch(req.body.webhookUrl);\n```\nFix: Validate URL against a strict allowlist of expected domains. Block private IP ranges (10.x, 172.16.x–172.31.x, 192.168.x, 169.254.x, ::1, fc00::/7).\n\n### API8:2023 — Security Misconfiguration\n\n- CORS `Access-Control-Allow-Origin: *` on authenticated endpoints allows any origin to read responses.\n- Verbose error messages that expose stack traces, SQL query structure, or internal paths.\n- Debug endpoints (`/debug`, `/metrics`, `/__admin`) accessible in production.\n\n---\n\n## 7. Database Security\nStandards: CWE-250 Execution with Unnecessary Privileges; PostgreSQL Security Best Practices; CIS PostgreSQL Benchmark\n\n### Principle of Least Privilege\n\nEvery database role should have only the minimum permissions required. The `public` schema grants `CREATE` to all roles by default in PostgreSQL < 15 — revoke this explicitly.\n\n```sql\nREVOKE CREATE ON SCHEMA public FROM PUBLIC;\nREVOKE ALL ON ALL TABLES IN SCHEMA public FROM PUBLIC;\n\n-- Then explicitly grant only what each role needs\nGRANT SELECT, INSERT ON public.profiles TO authenticated;\n```\n\n### REVOKE EXECUTE on SECURITY DEFINER Functions\n\nSECURITY DEFINER functions run as their owner. 
If PUBLIC or `authenticated` can call them without restriction, any logged-in user can trigger privileged operations.\n\n```sql\n-- After defining any SECURITY DEFINER function:\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM PUBLIC;\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM authenticated;\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM anon;\n-- Re-grant only to service_role or internal callers as needed\n```\n\n### REVOKE TRUNCATE on Audit Tables\n\n`TRUNCATE` bypasses RLS and row-level triggers. Any role that can TRUNCATE an audit table can silently destroy evidence.\n\n```sql\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM PUBLIC;\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM authenticated;\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM service_role;\n-- Even service_role should not be able to bulk-erase financial records\n```\n\n---\n\n## 8. Rate Limiting & Denial-of-Service\nStandards: OWASP API4:2023 — Unrestricted Resource Consumption; CWE-770 Allocation of Resources Without Limits; CWE-400 Uncontrolled Resource Consumption\n\n### In-Memory Rate Limiting Is Ineffective\n\nRate limits implemented with an in-process `Map` or `LRU` cache are reset on process restart and are not shared across horizontal replicas. An attacker simply retries after waiting for a cold deploy, or routes requests to different instances.\n\nCorrect approach: Store rate limit counters in a database (Redis, PostgreSQL) keyed by user ID and action type. 
The counter must be incremented atomically in the same transaction as the action.\n\nPostgreSQL pattern:\n```sql\n-- Atomic check-and-increment\nINSERT INTO rate_limits (user_id, action, window_start, count)\nVALUES ($1, $2, date_trunc('minute', now()), 1)\nON CONFLICT (user_id, action, window_start)\nDO UPDATE SET count = rate_limits.count + 1\nRETURNING count;\n-- If returned count > max_allowed, reject with 429\n```\n\n### Missing Rate Limits on Auth Endpoints\n\nAuthentication endpoints (login, password reset, OTP verification) without rate limiting enable brute-force and credential-stuffing attacks.\n\nRecommended limits (baseline):\n- Login: 5 attempts per minute per IP\n- Password reset: 3 per hour per email\n- OTP verification: 3 attempts per code before invalidating\n\n---\n\n## 9. Concurrency & Race Conditions\nStandards: CWE-362 Concurrent Execution Using Shared Resource with Improper Synchronization (TOCTOU); CWE-367 TOCTOU Race Condition\n\n### Check-Then-Act on Financial Data\n\nThe most dangerous race condition pattern in financial systems: read the balance, check if sufficient, then deduct. If two requests run concurrently, both checks pass against the same stale balance.\n\nViolation:\n```sql\n-- Thread 1 and Thread 2 both read balance = 100 at the same time\nSELECT coins FROM profiles WHERE id = $1; -- both see 100\n-- Both check: 100 >= 50 → true\nUPDATE profiles SET coins = coins - 50 WHERE id = $1; -- both run\n-- Result: balance = 0 instead of 50. 
Or worse, -50 if CHECK constraint absent.\n```\n\nFix — advisory lock + FOR UPDATE:\n```sql\nBEGIN;\nSELECT pg_advisory_xact_lock(hashtext($1::text)); -- serialize per user\nSELECT coins FROM profiles WHERE id = $1 FOR UPDATE; -- lock the row\n-- Now deduct safely — only one transaction holds the lock\nUPDATE profiles SET coins = coins - $2 WHERE id = $1 AND coins >= $2;\nCOMMIT;\n```\n\n### Idempotency Key Bypass\n\nIf an idempotency key column allows `NULL`, PostgreSQL's UNIQUE constraint treats each `NULL` as a distinct value — meaning `NULL` keys do not deduplicate. This allows unlimited replay of reward operations.\n\n```sql\n-- WRONG — NULLs are not unique in PostgreSQL\nidempotency_key TEXT UNIQUE -- NULL can appear unlimited times\n\n-- CORRECT\nidempotency_key TEXT NOT NULL UNIQUE -- enforces exactly-once\n```\n\n---\n\n## 10. Financial & Transaction Integrity\nStandards: PCI-DSS v4 Req. 6 (Secure Systems), Req. 10 (Audit Logs); ISO 27001 A.9; CWE-362\n\n### Server-Authoritative Coin Logic\n\nAny value computed or provided by the client that affects financial state is a vulnerability. The server must compute all rewards, deductions, and balances independently.\n\nPattern to flag:\n```ts\n// WRONG — client tells server how many coins to award\nconst { userId, coinsEarned } = req.body;\nawait creditCoins(userId, coinsEarned); // attacker sends coinsEarned = 99999\n```\n\nCorrect: The server computes the reward based on verified activity data (e.g., verified GitHub events), never from a client-supplied amount.\n\n### Append-Only Transaction Log\n\nCoin/credit transaction tables must be immutable after insert. 
Updates would allow retroactive falsification of balances; deletes destroy the audit trail.\n\n```sql\n-- Trigger blocking updates to financial records\nCREATE OR REPLACE FUNCTION block_transaction_updates()\nRETURNS trigger LANGUAGE plpgsql AS $$\nBEGIN\n RAISE EXCEPTION 'Updates to coin_transactions are not permitted';\nEND;\n$$;\n\nCREATE TRIGGER no_update_coin_transactions\nBEFORE UPDATE ON coin_transactions\nFOR EACH ROW EXECUTE FUNCTION block_transaction_updates();\n```\n\n### Webhook Deduplication — Replay Attack\n\nPayment providers may retry webhooks. Without deduplication on the provider's event ID, the same payment event can credit coins multiple times.\n\n```sql\nINSERT INTO payment_events (provider_event_id, payload, received_at)\nVALUES ($1, $2, now())\nON CONFLICT (provider_event_id) DO NOTHING;\n-- Only process coins if INSERT affected 1 row (i.e., event was new)\n```\n\n---\n\n## 11. Security Logging & Monitoring\nStandards: OWASP A09:2021 — Security Logging and Monitoring Failures; CWE-778 Insufficient Logging; CWE-117 Log Injection; NIST SP 800-92\n\n### What Must Be Logged\n\nAt minimum, log these events with timestamp, user ID, IP address, and action detail:\n- Authentication failures (wrong password, expired token, missing auth header)\n- Authorization failures (access denied to a resource)\n- Input validation failures that look like attacks (unexpected field shapes, oversized inputs)\n- Cryptographic verification failures (HMAC mismatch on webhooks)\n- Rate limit hits\n- Account actions (password change, email change, account deletion)\n- Financial anomalies (deduction larger than balance attempted)\n\n### Log Injection — CWE-117\n\nIf log messages are constructed using string interpolation with user input, an attacker can inject newlines to forge log entries.\n\nViolation:\n```ts\nlogger.info(`User logged in: ${req.body.username}`);\n// Attacker sends username = \"admin\\nSECURITY: Admin password changed\"\n```\nFix: Use structured logging 
(JSON with separate fields), never string interpolation.\n```ts\nlogger.info({ event: \"login\", username: req.body.username }); // safe\n```\n\n---\n\n## 12. Secrets & Environment Security\nStandards: CWE-798 Hardcoded Credentials; CWE-312 Cleartext Storage; The Twelve-Factor App (Factor III: Config)\n\n### Env Var Fallback to Insecure Default\n\nA common pattern in \"developer-friendly\" code is to fall back to a permissive default if an env var is missing. This silently disables security in production if the env var is misconfigured.\n\nViolation:\n```ts\n// WRONG — falls back to wildcard CORS if env var missing\nconst origin = Deno.env.get(\"ALLOWED_ORIGIN\") ?? \"*\";\n```\nFix:\n```ts\n// CORRECT — hard-error on missing config; fail secure\nconst origin = Deno.env.get(\"ALLOWED_ORIGIN\");\nif (!origin) throw new Error(\"ALLOWED_ORIGIN env var is required\");\n```\n\n---\n\n## 13. Data Privacy & Retention\nStandards: GDPR Art. 5 (data minimization), Art. 17 (right to erasure), Art. 25 (privacy by design); CCPA §1798.105; CWE-359 Exposure of Private Information\n\n### Right to Erasure — Account Deletion\n\nOn account deletion, the application must:\n1. Delete or anonymize personal data (name, email, avatar, IP, user-agent)\n2. Retain legally required financial records (PCI-DSS, EU VAT — typically 7–10 years)\n3. Preserve abuse/moderation evidence (content reports, security flags)\n4. Nullify sender references in shared records (e.g., chat messages become anonymous)\n\nCritical FK patterns:\n```sql\n-- Chat: anonymize messages, don't delete them (conversation history remains intact)\nsender_id UUID REFERENCES auth.users(id) ON DELETE SET NULL\n\n-- Transactions: retain for audit; user_id becomes orphaned (no cascade)\nuser_id UUID -- intentionally no FK constraint, or FK with ON DELETE SET NULL\n```\n\n### Data Minimization — GDPR Art. 5(1)(c)\n\nDo not collect or store more data than necessary. 
Flag:\n- IP addresses stored permanently when 30/90 day retention suffices\n- User-agent strings logged indefinitely (they are PII under GDPR)\n- Full request bodies logged when only metadata is needed for debugging\n- `SELECT *` queries that pull PII columns into contexts that don't need them\n\n---\n\n## 14. Security Misconfiguration\nStandards: OWASP A05:2021; CWE-16 Configuration; CIS Benchmarks; OWASP Secure Headers\n\n### Required Security Headers\n\n```\nContent-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'\nX-Frame-Options: DENY\nX-Content-Type-Options: nosniff\nReferrer-Policy: strict-origin-when-cross-origin\nStrict-Transport-Security: max-age=63072000; includeSubDomains; preload\nPermissions-Policy: geolocation=(), microphone=(), camera=()\n```\n\n### CORS Misconfiguration\n\n`Access-Control-Allow-Origin: *` on an authenticated endpoint effectively disables CORS protection — any origin can make credentialed requests and read the response.\n\nThe origin allowlist must be an explicit list of trusted domains, validated server-side. Never reflect the request `Origin` header without verification.\n\n```ts\n// WRONG — reflects any origin\nconst origin = req.headers.get(\"origin\");\nheaders.set(\"Access-Control-Allow-Origin\", origin ?? \"*\");\n\n// CORRECT — validate against explicit allowlist\nconst ALLOWED = new Set([\"https://app.example.com\"]);\nconst requestOrigin = req.headers.get(\"origin\") ?? \"\";\nif (ALLOWED.has(requestOrigin)) {\n headers.set(\"Access-Control-Allow-Origin\", requestOrigin);\n headers.set(\"Vary\", \"Origin\");\n}\n```\n\n---\n\n## 15. 
Supply Chain & Dependency Security\nStandards: OWASP A06:2021 — Vulnerable and Outdated Components; CWE-1357; SLSA Framework\n\n### Dependency Audit\n\nRun `npm audit` or `bun audit` and treat results as:\n- Critical/High CVEs → block deployment; patch immediately\n- Moderate CVEs → fix within the sprint\n- Low CVEs → fix in next dependency update cycle\n\n### Version Pinning\n\nUse exact versions in `package.json` for production dependencies, or lock with `package-lock.json`/`bun.lockb`. The `^` prefix allows minor version bumps that could introduce regressions or security fixes you haven't reviewed.\n\n---\n\n## 16. TypeScript / JavaScript Specific\nStandards: CWE-843 Type Confusion; CWE-915 Prototype Pollution; CWE-94 Code Injection; OWASP Cheat Sheet: DOM-based XSS\n\n### Prototype Pollution — CWE-915\n\nMerging user-controlled objects onto existing objects can overwrite properties on `Object.prototype`, affecting all objects in the process.\n\nViolation:\n```ts\nfunction mergeOptions(defaults: object, userOptions: unknown) {\n return Object.assign(defaults, userOptions); // if userOptions is {\"__proto__\": {\"admin\": true}}\n}\n```\nFix: Validate and allowlist the keys of user-controlled objects before merging. Use `Object.create(null)` for dictionaries that must not inherit from `Object.prototype`. Use schema validation (Zod) to strip unknown keys.\n\n### `as any` Type Assertions on External Data — CWE-843\n\nExternal data (API responses, webhook payloads, database query results typed as `any`, `JSON.parse()` output) must be treated as `unknown` and parsed through a validator before use. 
Using `as any` or `as ExpectedType` directly bypasses TypeScript's safety guarantees entirely.\n\n```ts\n// WRONG\nconst payload = JSON.parse(body) as WebhookPayload;\ncreditCoins(payload.userId, payload.amount); // if payload.amount is a string: NaN coins\n\n// CORRECT\nconst parsed = WebhookPayloadSchema.safeParse(JSON.parse(body));\nif (!parsed.success) return badRequest();\ncreditCoins(parsed.data.userId, parsed.data.amount); // type-safe and validated\n```\n\n### Unhandled Promise Rejections — CWE-755\n\nIn async TypeScript/JavaScript, a missing `await` means the promise runs in the background and any rejection is silently swallowed (or crashes Node.js). This is especially dangerous in financial operations where you need to know if the DB write succeeded.\n\n```ts\n// WRONG — fire-and-forget on a critical operation\nlogSecurityEvent(userId, \"auth_failure\"); // rejection silently lost\n\n// CORRECT — await or explicitly handle\nawait logSecurityEvent(userId, \"auth_failure\");\n// or: void logSecurityEvent(...).catch(err => console.error(\"Failed to log:\", err));\n```\n\n"},"systematic-debugging":{"content":"---\nname: systematic-debugging\ndescription: \"Guides root-cause analysis with a structured process: reproduce, isolate, hypothesize, verify. Use when debugging bugs, investigating failures, or when the user says something is broken or not working as expected.\"\n---\n# Systematic Debugging\n\nWork through failures in order. Don't guess at fixes until the cause is narrowed down.\n\n## Scope\n\n- User reports a bug: Clarify what \"wrong\" means (error message, wrong result, crash, hang). Get steps to reproduce or environment details if missing.\n- User points at code: Treat that as the suspected area; still reproduce and isolate before changing code.\n- Logs/stack traces provided: Use them to form hypotheses; don't ignore them.\n\n## Process\n\n### 1. Reproduce\n\n- Confirm the failure is reproducible. If not, note that and list what's needed (e.g. 
data, env, steps).\n- Identify: one-off or intermittent? In which environment (dev/staging/prod, OS, version)?\n- Output: \"Reproducible: yes/no. How: …\"\n\n### 2. Isolate\n\n- Shrink the problem: minimal input, minimal code path, or minimal config that still fails.\n- Bisect if useful: which commit, which option, which input range?\n- Remove variables (other features, network, time) to see when the failure goes away.\n- Output: \"Failure occurs when: …\" and \"Failure does not occur when: …\"\n\n### 3. Hypothesize\n\n- State one or more concrete hypotheses that explain the observed behavior (e.g. \"null passed here\", \"race between A and B\", \"wrong type at runtime\").\n- Tie each hypothesis to evidence from reproduce/isolate (logs, stack trace, line numbers).\n- Prefer the simplest hypothesis that fits the evidence.\n- Output: \"Hypothesis: …\" with \"Evidence: …\"\n\n### 4. Verify\n\n- Propose a minimal check (log, assert, unit test, or one-line change) that would confirm or rule out the top hypothesis.\n- If the user can run it, give the exact step. If you can run it (e.g. tests), do it.\n- After verification: \"Confirmed: …\" or \"Ruled out; next hypothesis: …\"\n\n### 5. Fix\n\n- Only suggest a fix after the cause is confirmed or highly likely.\n- Fix the root cause when possible; document or ticket workarounds if you suggest one.\n- Suggest a regression test or assertion so the bug doesn't come back.\n\n## Output\n\n- Prefer short bullets over long paragraphs.\n- Always cite file/line/function when pointing at code.\n- If stuck (can't reproduce, no logs), say what's missing and what would help next.\n- Don't suggest random fixes (e.g. \"try clearing cache\") without tying them to a hypothesis.\n","reference":null},"test-deno":{"content":"---\nname: test-deno\ndescription: Use when writing, reviewing, or fixing Deno integration tests for Supabase Edge Functions, or when auditing edge function tests for best practices. 
Triggers on test failures involving sanitizers, assertions, mocking, HTTP testing, or environment isolation.\n---\n\n# Deno Edge Function Testing\n\nWrite and review integration tests for Supabase Edge Functions using Deno's built-in test runner and official standard library modules. Every recommendation in this skill is sourced from official documentation — see [REFERENCE.md](REFERENCE.md) for citations.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for edge functions the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's edge function test directory (Supabase convention: `supabase/functions/tests/`).\n\n## Prerequisites\n\nBefore tests can run, the local Supabase stack must be running:\n\n```bash\n# Terminal 1: start local stack\nnpx supabase start\n\n# Terminal 2: serve functions\nnpx supabase functions serve --no-verify-jwt --env-file supabase/functions/tests/.env.local\n\n# Terminal 3: run tests\ndeno test --no-lock --env-file=supabase/functions/tests/.env.local \\\n --allow-net --allow-env --allow-read \\\n supabase/functions/tests/\n```\n\n`--no-lock` is required — Supabase Edge Runtime uses Deno v2.1.x internally, and newer Deno CLI versions generate lock file format v5 which the runtime cannot parse.\n\n## Principles to Enforce\n\n### 1. Test Structure — `Deno.test()` or BDD (`describe`/`it`)\n\nBoth styles are officially supported. Choose one and be consistent within a project.\n\n`Deno.test()` style (native):\n```ts\nDeno.test(\"function returns 200 for valid input\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { /* ... 
*/ });\n assertEquals(res.status, 200);\n});\n```\n\nBDD style (`@std/testing/bdd`):\n```ts\nimport { describe, it, beforeEach, afterEach } from \"@std/testing/bdd\";\n\ndescribe(\"my-function\", () => {\n it(\"returns 200 for valid input\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { /* ... */ });\n assertEquals(res.status, 200);\n });\n});\n```\n\nRules:\n- `describe()` and `it()` are wrappers over `Deno.test()` and `t.step()` — they are not a separate test runner\n- Hooks: `beforeAll` > `beforeEach` > test > `afterEach` > `afterAll`\n- `afterEach`/`afterAll` run even if tests fail\n- Per-test `permissions` do NOT work inside nested `describe` blocks (known limitation — tests use `t.step` internally which doesn't support permissions)\n\n### 2. Assertions — `@std/assert` or `@std/expect`\n\nTwo assertion styles are officially supported:\n\n`@std/assert` (Deno-native):\n```ts\nimport { assertEquals, assertRejects, assertThrows } from \"@std/assert\";\n```\n\n`@std/expect` (Jest-compatible):\n```ts\nimport { expect } from \"@std/expect\";\nexpect(result).toEqual(42);\n```\n\nKey assertion selection rules:\n\n\| Situation \| Use \| NOT \|\n\|-----------\|-----\|-----\|\n\| Deep equality (objects, arrays) \| `assertEquals` \| `assertStrictEquals` \|\n\| Reference/primitive equality (`===`) \| `assertStrictEquals` \| `assertEquals` \|\n\| Value is not null/undefined \| `assertExists` \| `assert(val !== null)` \|\n\| Synchronous throw \| `assertThrows(fn, ErrorClass?, msg?)` \| try/catch \|\n\| Async rejection \| `assertRejects(fn, ErrorClass?, msg?)` \| `assertThrows` \|\n\| Partial object match \| `assertObjectMatch` \| manual property checks \|\n\| String contains substring \| `assertStringIncludes` \| `assert(s.includes(...))` \|\n\| Numeric comparison \| `assertGreater`, `assertLess`, etc. 
\| `assert(a > b)` \|\n\| Unconditional fail \| `fail()` or `unreachable()` \| `assert(false)` \|\n\nAnti-pattern: Using `assertThrows` for async code — it only catches synchronous exceptions. Use `assertRejects` for promises.\n\n### 3. Integration Testing Pattern (Supabase Official)\n\nEdge Function tests should be integration tests — real HTTP requests against locally-served functions. This is the official Supabase recommendation.\n\n```ts\nconst BASE_URL = Deno.env.get(\"SUPABASE_URL\") + \"/functions/v1\";\n\nDeno.test(\"POST /my-function returns expected data\", async () => {\n const response = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: {\n \"Content-Type\": \"application/json\",\n \"Authorization\": `Bearer ${Deno.env.get(\"SB_PUBLISHABLE_KEY\")}`,\n },\n body: JSON.stringify({ name: \"Test\" }),\n });\n\n assertEquals(response.status, 200);\n const data = await response.json();\n assertEquals(data.message, \"Hello Test!\");\n});\n```\n\nWhat to test:\n- Happy-path request/response (status code, body shape)\n- Authentication enforcement (missing/invalid JWT → 401)\n- Input validation (malformed body → 400)\n- Error responses (correct status codes and error messages)\n- CORS headers (OPTIONS preflight, allowed origins)\n- Method routing (POST vs GET vs unsupported methods)\n\nWhat NOT to test here (test at the database layer instead):\n- RLS policies\n- RPC business logic\n- Trigger behavior\n\n### 4. Sanitizers — Resource, Op, and Exit\n\nSanitizers are enabled by default on every `Deno.test()`. 
They catch resource leaks and unfinished async work.\n\n\| Sanitizer \| Default \| What it catches \|\n\|-----------\|---------\|-----------------\|\n\| `sanitizeResources` \| `true` \| Open files, connections not closed \|\n\| `sanitizeOps` \| `true` \| Unawaited async operations \|\n\| `sanitizeExit` \| `true` \| Calls to `Deno.exit()` \|\n\nRules:\n- NEVER disable sanitizers globally\n- Only disable on specific tests with a comment explaining why\n- For integration tests with `fetch()`, sanitizers should pass without disabling — `await fetch()` handles cleanup\n- If a third-party library holds connections open (e.g., Supabase client), you may need `sanitizeResources: false` on specific tests\n\n```ts\nDeno.test({\n name: \"test with persistent connection\",\n sanitizeResources: false, // Supabase client keeps connection pool open\n async fn() { /* ... */ },\n});\n```\n\n### 5. Mocking — `@std/testing/mock`\n\nSpies record calls without changing behavior:\n```ts\nimport { spy, assertSpyCalls, assertSpyCall } from \"@std/testing/mock\";\nconst dbSpy = spy(database, \"save\");\n// ... test code ...\nassertSpyCalls(dbSpy, 1);\n```\n\nStubs replace function behavior:\n```ts\nimport { stub } from \"@std/testing/mock\";\nusing _stub = stub(deps, \"getUserName\", () => \"Test User\");\n// stub auto-restores when scope exits\n```\n\nRules:\n- ALWAYS restore spies/stubs — use `using` keyword (preferred) or `try/finally` with `.restore()`\n- Do NOT over-mock in integration tests — if testing HTTP behavior, use real `fetch()` against the local server\n- Do NOT mock what you don't own — mock your code's dependencies, not third-party internals\n\nFakeTime (from `@std/testing/time`):\n```ts\nimport { FakeTime } from \"@std/testing/time\";\nusing time = new FakeTime();\n// ... time.tick(3500) ...\n```\n\n### 6. 
Environment Isolation\n\n- Use `--env-file=path` to load test-specific environment variables\n- Keep a dedicated `.env.local` in `supabase/functions/tests/`\n- NEVER hardcode URLs, keys, or secrets in test files — use `Deno.env.get()`\n- Environment variables from the shell take precedence over `--env-file` values\n\n### 7. Permissions — Principle of Least Privilege\n\nGrant only what tests need:\n\n```bash\ndeno test --allow-net --allow-env --allow-read tests/\n```\n\nFor fine-grained control per test:\n```ts\nDeno.test({\n name: \"reads config file\",\n permissions: { read: [\"./config.json\"], net: false },\n fn: () => { /* ... */ },\n});\n```\n\nPer-test permissions CANNOT exceed CLI-granted permissions — they can only restrict further.\n\n### 8. Test File Naming and Organization\n\nDeno auto-discovers files matching: `{_,.,}test.{ts,tsx,mts,js,mjs,jsx}`\n\nThis means:\n- `my_function_test.ts` (Deno/Go convention)\n- `my_function.test.ts` (Node/Jest convention)\n\nBoth work. Supabase's official example uses `function-name-test.ts` with a hyphen, but note that hyphens are NOT matched by auto-discovery — you must pass the directory explicitly to `deno test`.\n\nProject structure:\n```\nsupabase/functions/\n my-function/\n index.ts\n tests/\n .env.local\n my_function_test.ts\n```\n\n### 9. Test Independence and Determinism\n\n- Tests within a file run sequentially; files can run in parallel (`--parallel`)\n- Module-level state is shared across tests in the same file\n- Database/server state persists between tests unless explicitly cleaned up\n- Use `beforeEach`/`afterEach` to reset state\n- NEVER rely on test execution order\n- NEVER use random data without seeding\n- NEVER depend on wall-clock time (use `FakeTime`)\n\n### 10. 
Error Response Testing\n\nAlways test error paths explicitly:\n\n```ts\nDeno.test(\"returns 401 for missing auth\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: { \"Content-Type\": \"application/json\" },\n body: JSON.stringify({}),\n });\n assertEquals(res.status, 401);\n const body = await res.json();\n assertStringIncludes(body.error, \"Missing\");\n});\n\nDeno.test(\"returns 400 for invalid body\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: {\n \"Content-Type\": \"application/json\",\n \"Authorization\": `Bearer ${validToken}`,\n },\n body: JSON.stringify({ wrong: \"shape\" }),\n });\n assertEquals(res.status, 400);\n});\n\nDeno.test(\"returns 405 for unsupported method\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { method: \"DELETE\" });\n assertEquals(res.status, 405);\n});\n```\n\n## Common Anti-Patterns\n\n\| Anti-Pattern \| Why it's wrong \| Fix \|\n\|---\|---\|---\|\n\| Not awaiting `fetch()` or async ops \| `sanitizeOps` will catch this; test may pass falsely \| Always `await` every async operation \|\n\| Disabling sanitizers globally \| Hides real resource leaks \| Disable only per-test with a comment \|\n\| Using `assertThrows` for async code \| Only catches synchronous exceptions \| Use `assertRejects` for promises \|\n\| Not restoring stubs/spies \| Leaks mock state to other tests \| Use `using` keyword or `try/finally` \|\n\| Hardcoding URLs and keys \| Breaks in different environments \| Use `Deno.env.get()` + `--env-file` \|\n\| Mocking `fetch` in integration tests \| Defeats the purpose of integration testing \| Use real HTTP calls to local server \|\n\| Sharing mutable state without cleanup \| Tests become order-dependent \| Reset in `beforeEach`/`afterEach` \|\n\| Using `assert(condition)` for everything \| Provides no useful failure message \| Use specific assertions (`assertEquals`, etc.) 
\|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file_test.ts` (lines X-Y)\nPrinciple: What the standard requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Rules\n\n- Only verified claims: every recommendation in this skill is backed by official Deno or Supabase documentation. See REFERENCE.md for source citations.\n- Integration over unit: for Edge Functions, prefer integration tests (real HTTP against local server) over unit tests with mocked dependencies.\n- Test the contract, not the implementation: test HTTP status codes, response bodies, and headers — not internal function calls.\n- Respect sanitizers: treat sanitizer failures as real bugs, not annoyances to disable.\n- Least privilege: grant only the permissions tests actually need.\n","reference":"# Deno Edge Function Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Test Runner](#1-test-runner)\n2. [Assertions](#2-assertions)\n3. [BDD Module](#3-bdd-module)\n4. [Mocking](#4-mocking)\n5. [Sanitizers](#5-sanitizers)\n6. [Supabase Integration Testing](#6-supabase-integration-testing)\n7. [Environment and Permissions](#7-environment-and-permissions)\n8. [CLI Reference](#8-cli-reference)\n9. [Anti-Patterns](#9-anti-patterns)\n\n---\n\n## 1. Test Runner\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\nDeno ships a built-in test runner — no external framework required. 
Tests are registered with `Deno.test()`.\n\n### File auto-discovery\n\n`deno test` auto-discovers files matching: `{_,.,}test.{ts, tsx, mts, js, mjs, jsx}`\n\nThis matches `_test.ts`, `.test.ts`, and `test.ts` — but NOT `-test.ts` (hyphenated). To run hyphenated files, pass the directory explicitly.\n\n### Test steps\n\nSub-tests within a single `Deno.test()`:\n\n```ts\nDeno.test(\"grouped tests\", async (t) => {\n await t.step(\"step one\", async () => { /* ... */ });\n await t.step(\"step two\", async () => { /* ... */ });\n});\n```\n\nSteps are awaited sequentially. Each step reports independently.\n\n---\n\n## 2. Assertions\n\nSource: [jsr.io/@std/assert](https://jsr.io/@std/assert) — version 1.0.19, 27 exports\n\n### Complete function list (verified)\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `assert(expr)` \| Truthy check \|\n\| `assertAlmostEquals(actual, expected, tolerance?)` \| Floating-point comparison \|\n\| `assertArrayIncludes(actual, expected)` \| Array contains all elements \|\n\| `assertEquals(actual, expected)` \| Deep equality \|\n\| `assertExists(actual)` \| Not null/undefined (narrows to `NonNullable<T>`) \|\n\| `assertFalse(expr)` \| Falsy check \|\n\| `assertGreater(actual, expected)` \| `actual > expected` \|\n\| `assertGreaterOrEqual(actual, expected)` \| `actual >= expected` \|\n\| `assertInstanceOf(actual, ExpectedType)` \| `instanceof` check \|\n\| `assertIsError(error, ErrorClass?, msgIncludes?)` \| Error type check \|\n\| `assertLess(actual, expected)` \| `actual < expected` \|\n\| `assertLessOrEqual(actual, expected)` \| `actual <= expected` \|\n\| `assertMatch(actual, regex)` \| String matches RegExp \|\n\| `assertNotEquals(actual, expected)` \| Deep inequality \|\n\| `assertNotInstanceOf(actual, UnexpectedType)` \| NOT `instanceof` \|\n\| `assertNotMatch(actual, regex)` \| String does NOT match RegExp \|\n\| `assertNotStrictEquals(actual, expected)` \| Reference inequality \|\n\| `assertObjectMatch(actual, expected)` \| Partial 
deep match \|\n\| `assertRejects(fn, ErrorClass?, msgIncludes?)` \| Async rejection testing \|\n\| `assertStrictEquals(actual, expected)` \| Reference equality (`===`) \|\n\| `assertStringIncludes(actual, expected)` \| String contains substring \|\n\| `assertThrows(fn, ErrorClass?, msgIncludes?)` \| Synchronous throw testing \|\n\| `equal(a, b)` \| Deep equality (returns boolean, no assertion) \|\n\| `fail(msg?)` \| Unconditional failure \|\n\| `unimplemented(msg?)` \| Marks unimplemented code \|\n\| `unreachable()` \| Marks unreachable code \|\n\n### Alternative: `@std/expect` (Jest-compatible)\n\nSource: [jsr.io/@std/expect](https://jsr.io/@std/expect) — official Deno standard library\n\n```ts\nimport { expect } from \"@std/expect\";\nexpect(x).toEqual(42);\nexpect(fn).toThrow(TypeError);\nawait expect(asyncFn()).resolves.toEqual(42);\n```\n\nSupports 23 common matchers, 10 mock-related matchers, and asymmetric matchers (`expect.anything()`, `expect.objectContaining()`, etc.).\n\n---\n\n## 3. BDD Module\n\nSource: [jsr.io/@std/testing/doc/bdd](https://jsr.io/@std/testing/doc/bdd)\n\n### Exports\n\n`describe`, `it`, `test`, `beforeAll`, `afterAll`, `beforeEach`, `afterEach`, `before` (alias of `beforeAll`), `after` (alias of `afterAll`)\n\n### How it maps to Deno.test\n\n\"Internally, `describe` and `it` are registering tests with `Deno.test` and `t.step`.\"\n\n### Modifiers\n\n- `.only()` — run only this test/suite\n- `.skip()` — skip this test/suite\n- `.ignore()` — alias of `.skip()`\n\n### Known limitation\n\n\"There is currently one limitation to this, you cannot use the permissions option on an individual test case or test suite that belongs to another test suite. That's because internally those tests are registered with `t.step` which does not support the permissions option.\"\n\n### Hook execution order\n\n\"A test suite can have multiples of each type of hook, they will be called in the order that they are registered. 
The `afterEach` and `afterAll` hooks will be called whether or not the test case passes.\"\n\n---\n\n## 4. Mocking\n\n### Spies\n\nSource: [jsr.io/@std/testing/doc/mock](https://jsr.io/@std/testing/doc/mock), [docs.deno.com/examples/mocking_tutorial](https://docs.deno.com/examples/mocking_tutorial/)\n\n\"Test spies are function stand-ins that are used to assert if a function's internal behavior matches expectations. Test spies on methods keep the original behavior but allow you to test how the method is called and what it returns.\"\n\n### Stubs\n\n\"Test stubs are an extension of test spies that also replaces the original methods behavior.\"\n\n### Cleanup\n\n\"Method spys are disposable, meaning that you can have them automatically restore themselves with the `using` keyword.\"\n\nWithout `using`, always restore in `try/finally`:\n```ts\nconst myStub = stub(obj, \"method\", () => \"mocked\");\ntry {\n // test code\n} finally {\n myStub.restore();\n}\n```\n\n### FakeTime\n\nSource: `@std/testing/time` (separate module from `@std/testing/mock`)\n\n```ts\nimport { FakeTime } from \"@std/testing/time\";\nusing time = new FakeTime();\ntime.tick(3500);\n```\n\n### Assertion helpers\n\n- `assertSpyCall(spy, callIndex, expected)` — assert specific call\n- `assertSpyCalls(spy, expectedCount)` — assert total call count\n- `assertSpyCallArg(spy, callIndex, argIndex, expected)` — assert specific argument\n- `assertSpyCallArgs(spy, callIndex, expected)` — assert all arguments\n- `returnsNext(values)` — create function returning values from iterable\n- `resolvesNext(values)` — async version of `returnsNext`\n\n---\n\n## 5. 
Sanitizers\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\n### sanitizeResources (default: true)\n\n\"Ensures that all I/O resources created during a test are closed, to prevent leaks.\"\n\n### sanitizeOps (default: true)\n\n\"Ensures that all async operations started in a test are completed before the test ends.\"\n\n### sanitizeExit (default: true)\n\n\"Ensures that tested code doesn't call `Deno.exit()`, which could signal a false test success.\"\n\n### When to disable\n\n- `sanitizeResources: false` — when a third-party library holds connections open (e.g., database pool)\n- `sanitizeOps: false` — when background tasks fire intentionally (e.g., token refresh)\n- NEVER disable globally — only per-test with a documented reason\n\n---\n\n## 6. Supabase Integration Testing\n\nSource: [supabase.com/docs/guides/functions/unit-test](https://supabase.com/docs/guides/functions/unit-test)\n\n### Official pattern\n\nThe Supabase docs title this page \"Writing Unit Tests for Edge Functions\" but the example is an integration test — it starts real services and makes HTTP calls.\n\n### Recommended structure\n\n```\nsupabase/functions/\n function-one/\n index.ts\n tests/\n .env.local\n function-one-test.ts\n```\n\n\"using the same name as the Function followed by `-test.ts`\"\n\n### Official example\n\n```ts\nimport { assert, assertEquals } from \"jsr:@std/assert@1\";\nimport { createClient } from \"npm:@supabase/supabase-js@2\";\n\nconst supabaseUrl = Deno.env.get(\"SUPABASE_URL\") ?? \"\";\nconst supabaseKey = Deno.env.get(\"SUPABASE_PUBLISHABLE_KEY\") ?? 
\"\";\n\nconst client = createClient(supabaseUrl, supabaseKey, {\n auth: {\n autoRefreshToken: false,\n persistSession: false,\n detectSessionInUrl: false,\n },\n});\n```\n\n### Running\n\n```bash\nsupabase start\nsupabase functions serve\ndeno test --allow-all supabase/functions/tests/function-one-test.ts\n```\n\n### Lock file issue\n\nSource: [github.com/orgs/supabase/discussions/39966](https://github.com/orgs/supabase/discussions/39966)\n\nThe Supabase Edge Runtime uses Deno v2.1.x. Newer Deno CLI versions generate lock file format v5, which the runtime cannot parse. Use `--no-lock` to bypass.\n\n---\n\n## 7. Environment and Permissions\n\n### `--env-file`\n\nSource: [docs.deno.com/runtime/reference/cli/test](https://docs.deno.com/runtime/reference/cli/test)\n\n\"Load environment variables from local file. Only the first environment variable with a given key is used.\" Existing env vars take precedence.\n\n### Permissions\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\n\"The `permissions` property in the `Deno.test` configuration allows you to specifically deny permissions, but does not grant them. Permissions must be provided when running the test command.\"\n\n\"Remember that any permission not explicitly granted at the command line will be denied, regardless of what's specified in the test configuration.\"\n\n---\n\n## 8. 
CLI Reference\n\nSource: [docs.deno.com/runtime/reference/cli/test](https://docs.deno.com/runtime/reference/cli/test)\n\n\| Flag \| Purpose \|\n\|---\|---\|\n\| `--env-file=<path>` \| Load env vars from file \|\n\| `--no-lock` \| Disable lock file discovery \|\n\| `--filter \"<pattern>\"` \| Run tests matching string or `/regex/` \|\n\| `--parallel` \| Run test files in parallel (defaults to CPU count) \|\n\| `--fail-fast` \| Stop after first failure \|\n\| `--watch` \| Re-run on file changes \|\n\| `--coverage=<dir>` \| Collect coverage data \|\n\| `--reporter=<type>` \| Output format (default: `pretty`) \|\n\| `--no-check` \| Skip type checking \|\n\| `--doc` \| Evaluate code blocks in JSDoc/Markdown \|\n\| `--shuffle` \| Randomize test order \|\n\| `--trace-leaks` \| Show resource leak stack traces \|\n\| `--junit-path=<path>` \| Output JUnit XML \|\n\| `--permit-no-files` \| Don't error if no test files found \|\n\n---\n\n## 9. Anti-Patterns\n\nSynthesized from official documentation warnings and sanitizer documentation:\n\n1. Not awaiting async operations — sanitizeOps exists specifically for this\n2. Leaking resources — open files/connections without closing\n3. Disabling sanitizers globally — hides real bugs\n4. Not restoring stubs/spies — leaks mock state between tests\n5. Using `assertThrows` for async code — use `assertRejects`\n6. Over-mocking in integration tests — defeats the purpose\n7. Relying on test execution order — tests should be independent\n8. Hardcoding URLs and credentials — use `Deno.env.get()` + `--env-file`\n9. Ignoring the lock file issue — use `--no-lock` with Supabase Edge Runtime\n"},"test-frontend":{"content":"---\nname: test-frontend\ndescription: Use when writing, reviewing, or fixing React component/hook tests, or when auditing frontend tests for RTL, Vitest, Zustand, or TanStack Query best practices. Triggers on query priority issues, mock leaks, flaky async tests, or Kent C. 
Dodds common-mistakes violations.\n---\n\n# React Frontend Testing\n\nWrite and review React component and hook tests using Vitest and React Testing Library (RTL). Every recommendation is sourced from official documentation — see [REFERENCE.md](REFERENCE.md) for citations.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for components/hooks the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's test directory (commonly `src/__tests__/` or `__tests__/` — check the project structure).\n\n## Prerequisites\n\n```bash\ncd app && npx vitest # run all tests (watch mode)\ncd app && npx vitest run # run all tests once\ncd app && npx vitest run src/__tests__/path/to/file.test.tsx # specific file\n```\n\n## The Core Principle\n\nSource: [testing-library.com/docs/guiding-principles](https://testing-library.com/docs/guiding-principles)\n\n> \"The more your tests resemble the way your software is used, the more confidence they can give you.\"\n\nThis means:\n- Test from the user's perspective (what they see and interact with)\n- Query elements by their accessible roles and visible text\n- Do NOT test implementation details (internal state, CSS classes, component structure)\n\n## Principles to Enforce\n\n### 1. 
Query Priority — Use the Most Accessible Query\n\nSource: [testing-library.com/docs/queries/about](https://testing-library.com/docs/queries/about)\n\nOfficial priority order (most preferred to least):\n\n\| Priority \| Query \| When to use \|\n\|----------\|-------\|-------------\|\n\| 1 \| `getByRole` \| Default choice — accessible to everyone \|\n\| 2 \| `getByLabelText` \| Form fields with labels \|\n\| 3 \| `getByPlaceholderText` \| When no label exists \|\n\| 4 \| `getByText` \| Non-interactive content \|\n\| 5 \| `getByDisplayValue` \| Filled-in form elements \|\n\| 6 \| `getByAltText` \| Images, areas, inputs \|\n\| 7 \| `getByTitle` \| Tooltip-like content \|\n\| 8 \| `getByTestId` \| Last resort only \|\n\nAnti-patterns:\n- Using `getByTestId` when `getByRole` would work\n- Using `container.querySelector()` — NEVER do this\n- Using `getByText` for buttons when `getByRole('button', { name: /text/i })` is available\n\n### 2. Query Type Selection — `getBy` vs `queryBy` vs `findBy`\n\n\| Type \| Returns \| Throws? \| Use when \|\n\|------\|---------\|---------\|----------\|\n\| `getBy` \| Element \| Yes, if not found \| Element MUST exist (default) \|\n\| `queryBy` \| Element or `null` \| No \| Asserting element does NOT exist \|\n\| `findBy` \| Promise\\<Element\\> \| Yes, after timeout \| Element appears asynchronously \|\n\| `getAllBy` \| Array \| Yes, if empty \| Multiple elements MUST exist \|\n\| `queryAllBy` \| Array (may be empty) \| No \| Checking count of elements \|\n\| `findAllBy` \| Promise\\<Array\\> \| Yes, after timeout \| Multiple elements appear async \|\n\nAnti-patterns:\n- Using `queryBy` to assert existence — use `getBy` instead\n- Wrapping `getBy` in `waitFor` — use `findBy` instead\n- Using `findBy` for synchronous elements — use `getBy` instead\n\n### 3. 
User Interactions — Use `@testing-library/user-event`\n\nSource: [testing-library.com/docs/user-event/intro](https://testing-library.com/docs/user-event/intro)\n\n`user-event` simulates complete interactions (focus, keyboard events, input events), while `fireEvent` dispatches single DOM events.\n\n```ts\nimport userEvent from '@testing-library/user-event';\n\nit('submits the form', async () => {\n const user = userEvent.setup();\n render(<MyForm />);\n\n await user.type(screen.getByRole('textbox', { name: /name/i }), 'Alice');\n await user.click(screen.getByRole('button', { name: /submit/i }));\n\n expect(screen.getByText(/success/i)).toBeVisible();\n});\n```\n\nRules:\n- Use `userEvent.setup()` before `render()`, inside each test — the official docs \"[discourage] rendering or using any userEvent functions outside of the test itself - e.g. in a `before`/`after` hook\"\n- Use `user-event` for all interactions — only fall back to `fireEvent` for events `user-event` doesn't support\n\n### 4. Async Testing — `waitFor` and `findBy`\n\n```ts\n// GOOD: findBy for elements that appear asynchronously\nconst heading = await screen.findByRole('heading', { name: /welcome/i });\n\n// GOOD: waitFor for assertions that become true asynchronously\nawait waitFor(() => {\n expect(screen.getByText(/loaded/i)).toBeVisible();\n});\n\n// BAD: wrapping getBy in waitFor (use findBy instead)\nawait waitFor(() => {\n screen.getByText(/loaded/i); // wrong — use findByText\n});\n\n// BAD: empty waitFor callback\nawait waitFor(() => {}); // does nothing useful\n\n// BAD: multiple assertions in waitFor\nawait waitFor(() => {\n expect(a).toBe(1);\n expect(b).toBe(2); // if a fails, b never runs — put one inside, rest outside\n});\n\n// BAD: side effects inside waitFor\nawait waitFor(() => {\n fireEvent.click(button); // don't do this — put side effects outside\n expect(result).toBeVisible();\n});\n```\n\n### 5. `screen` — Always Use It\n\nSource: Kent C. 
Dodds — \"Common Mistakes with React Testing Library\"\n\n```ts\n// GOOD\nrender(<MyComponent />);\nexpect(screen.getByRole('button')).toBeVisible();\n\n// BAD — destructuring from render\nconst { getByRole } = render(<MyComponent />);\nexpect(getByRole('button')).toBeVisible();\n```\n\nWhy: `screen` is always available, reduces refactoring churn, and matches the Testing Library recommended pattern.\n\n### 6. Assertions — Use `jest-dom` Matchers\n\nSource: [testing-library.com — jest-dom](https://testing-library.com/docs/ecosystem-jest-dom)\n\n```ts\n// GOOD\nexpect(button).toBeDisabled();\nexpect(element).toBeVisible();\nexpect(element).toHaveTextContent('hello');\nexpect(link).toHaveAttribute('href', '/path');\n\n// BAD — checking properties directly\nexpect(button.disabled).toBe(true);\nexpect(element.textContent).toBe('hello');\n```\n\nKey matchers:\n- `toBeVisible()` — element is visible to user\n- `toBeDisabled()` / `toBeEnabled()`\n- `toBeInTheDocument()` — element exists in DOM\n- `toHaveTextContent(text)`\n- `toHaveAttribute(attr, value?)`\n- `toHaveClass(className)`\n- `toHaveValue(value)` — for form elements\n- `toBeChecked()` — for checkboxes/radios\n\n### 7. Vitest Mocking — `vi.mock()`, `vi.spyOn()`, `vi.fn()`\n\nSource: [vitest.dev/guide/mocking](https://vitest.dev/guide/mocking), [vitest.dev/api/vi](https://vitest.dev/api/vi.html)\n\n#### `vi.mock()` hoisting\n\n\"The call to `vi.mock` is hoisted to top of the file. 
It will always be executed before all imports.\"\n\n```ts\n// This is hoisted to the top — runs before any imports\nvi.mock('../../lib/supabase', () => ({\n supabase: { channel: vi.fn(), removeChannel: vi.fn() },\n}));\n\n// These imports see the mocked module\nimport { supabase } from '../../lib/supabase';\n```\n\n#### `vi.hoisted()`\n\nUse when you need variables available to the hoisted mock factory:\n\n```ts\nconst { mockFn } = vi.hoisted(() => ({\n mockFn: vi.fn(),\n}));\n\nvi.mock('./module', () => ({ fn: mockFn }));\n```\n\n#### Mock clearing\n\n\| Method \| What it does \|\n\|--------\|-------------\|\n\| `vi.clearAllMocks()` \| Clears mock history (calls, instances). Does NOT reset implementation. \|\n\| `vi.resetAllMocks()` \| Clears history AND resets implementation to `() => undefined`. \|\n\| `vi.restoreAllMocks()` \| Restores original implementations for `vi.spyOn` spies. Does NOT clear history. \|\n\nUse in `beforeEach`:\n```ts\nbeforeEach(() => {\n vi.clearAllMocks(); // most common — clears call history between tests\n});\n```\n\n### 8. Component Testing with Providers\n\nComponents that use React Query, Router, or Zustand need provider wrappers.\n\n```ts\nimport { QueryClient, QueryClientProvider } from '@tanstack/react-query';\nimport { MemoryRouter } from 'react-router-dom';\nimport { createElement, type ReactNode } from 'react';\n\nfunction createWrapper() {\n const queryClient = new QueryClient({\n defaultOptions: { queries: { retry: false } },\n });\n return ({ children }: { children: ReactNode }) =>\n createElement(QueryClientProvider, { client: queryClient },\n createElement(MemoryRouter, null, children)\n );\n}\n\n// In test:\nrender(<MyComponent />, { wrapper: createWrapper() });\n\n// For hooks:\nrenderHook(() => useMyHook(), { wrapper: createWrapper() });\n```\n\n### 9. 
Testing React Query\n\nSource: TanStack Query official docs\n\nRules:\n- Create a new `QueryClient` per test — prevents shared cache between tests\n- Set `retry: false` — prevents tests from retrying failed queries (makes failures instant)\n- Use the `wrapper` option to provide `QueryClientProvider`\n\n```ts\nconst queryClient = new QueryClient({\n defaultOptions: { queries: { retry: false } },\n});\n```\n\n### 10. Testing Zustand Stores\n\nSource: [github.com/pmndrs/zustand — docs/learn/guides/testing.md](https://github.com/pmndrs/zustand)\n\n#### Official pattern: `__mocks__/zustand.ts`\n\nZustand's official testing guide recommends creating a mock that auto-resets stores between tests:\n\n1. Create `__mocks__/zustand.ts` (or `src/__mocks__/zustand.ts` if root is `./src`)\n2. Mock intercepts `create`/`createStore`, captures initial state, registers reset functions\n3. `afterEach` resets all stores to initial state\n\n#### Alternative: Direct `setState` in tests\n\nFor simpler cases, set store state directly before each test:\n\n```ts\nimport { useAppStore } from '../../store';\n\nbeforeEach(() => {\n useAppStore.getState().clearStore(); // if clearStore action exists\n // or\n useAppStore.setState({ key: initialValue });\n});\n```\n\nWarning for Vitest: If you change the Vitest `root` config (e.g., to `./src`), the `__mocks__` directory must be relative to that root, not the project root.\n\n### 11. Cleanup — Automatic\n\nSource: [testing-library.com/docs/react-testing-library/api](https://testing-library.com/docs/react-testing-library/api)\n\n\"Unmounts React trees that were mounted with render. This is called automatically if your testing framework...injects a global `afterEach()` function.\"\n\nRules:\n- Do NOT manually call `cleanup()` — Vitest handles it automatically\n- Do NOT import `cleanup` — it's unnecessary boilerplate\n\n### 12. 
Test Isolation\n\nSource: [vitest.dev/guide/features](https://vitest.dev/guide/features.html)\n\n\"Vitest also isolates each file's environment so env mutations in one file don't affect others.\"\n\nRules:\n- Use `beforeEach` to reset mocks and store state\n- Create fresh `QueryClient` instances per test (not shared)\n- Use `vi.clearAllMocks()` in `beforeEach` to reset call history\n- Tests within a file share module scope — don't rely on test order\n\n### 13. What to Test vs What NOT to Test\n\nTest (user-observable behavior):\n- Rendered text and accessible elements\n- User interactions (click, type, submit) and their effects\n- Navigation and route changes\n- Error states and loading states\n- Accessibility (roles, labels, ARIA attributes)\n\nDo NOT test (implementation details):\n- Internal component state\n- CSS classes or inline styles\n- Component instance methods\n- Hook internals (test via component behavior or `renderHook`)\n- That a function was called N times (unless it's the main behavior being tested)\n\n### 14. Act Warnings — When to Use `act()`\n\nSource: Kent C. Dodds — \"Common Mistakes with React Testing Library\"\n\n`render()` and `fireEvent` are already wrapped in `act()`. Do NOT wrap them again.\n\n```ts\n// BAD — unnecessary act()\nact(() => {\n render(<MyComponent />);\n});\n\n// GOOD — render already handles act()\nrender(<MyComponent />);\n```\n\nOnly use `act()` when directly triggering state updates outside of RTL utilities (e.g., calling store methods directly).\n\n## Common Anti-Patterns (Kent C. 
Dodds' Official List)\n\nSource: [kentcdodds.com/blog/common-mistakes-with-react-testing-library](https://kentcdodds.com/blog/common-mistakes-with-react-testing-library)\n\n\| # \| Anti-Pattern \| Fix \|\n\|---\|---\|---\|\n\| 1 \| Not using Testing Library ESLint plugins \| Install `eslint-plugin-testing-library` \|\n\| 2 \| Using `wrapper` as variable name for render result \| Destructure or use `screen` \|\n\| 3 \| Manually calling `cleanup` \| Remove — it's automatic \|\n\| 4 \| Not using `screen` \| Always use `screen.getByRole(...)` \|\n\| 5 \| Wrong assertion (`button.disabled` instead of matcher) \| Use `toBeDisabled()` \|\n\| 6 \| Wrapping everything in `act()` \| Remove — `render`/`fireEvent` already handle it \|\n\| 7 \| Using `getByTestId` instead of accessible queries \| Use `getByRole`, `getByText`, etc. \|\n\| 8 \| Using `container.querySelector()` \| Use `screen` queries \|\n\| 9 \| Not querying by text \| Query by visible text content \|\n\| 10 \| Not using `ByRole` most of the time \| `getByRole` is the default \|\n\| 11 \| Adding unnecessary `aria-`/`role` attributes \| Use semantic HTML \|\n\| 12 \| Using `fireEvent` instead of `user-event` \| Use `userEvent.setup()` \|\n\| 13 \| Using `query` for existence checks \| `query` is for NON-existence only \|\n\| 14 \| Using `waitFor` instead of `findBy` \| `findBy` = `waitFor` + `getBy` \|\n\| 15 \| Empty `waitFor(() => {})` callback \| Put an assertion inside \|\n\| 16 \| Multiple assertions in `waitFor` \| One assertion inside, rest outside \|\n\| 17 \| Side effects inside `waitFor` \| Put side effects outside the callback \|\n\| 18 \| Using `get` as implicit assertions \| Always use explicit `expect()` \|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.test.tsx` (lines X-Y)\nPrinciple: What the standard 
requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Linter\n\nRun ESLint with Testing Library plugin:\n\n```bash\ncd app && npx eslint src/__tests__/\n```\n\n## Rules\n\n- Only verified claims: every recommendation is backed by official Testing Library, Vitest, or framework documentation.\n- User perspective: test what users see and do, not internal implementation.\n- Accessible queries first: `getByRole` is the default; `getByTestId` is the last resort.\n- No unnecessary wrappers: don't add `act()`, `cleanup()`, or extra abstractions.\n- Fresh state per test: new QueryClient, reset store, clear mocks in `beforeEach`.\n- Explicit assertions: always use `expect()` — don't rely on `getBy` throwing as an assertion.\n","reference":"# React Frontend Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Guiding Principles](#1-guiding-principles)\n2. [Query Priority](#2-query-priority)\n3. [Query Types](#3-query-types)\n4. [User Events](#4-user-events)\n5. [Vitest Mocking](#5-vitest-mocking)\n6. [React Testing Library API](#6-react-testing-library-api)\n7. [Zustand Testing](#7-zustand-testing)\n8. [TanStack Query Testing](#8-tanstack-query-testing)\n9. [Common Mistakes](#9-common-mistakes)\n10. [jest-dom Matchers](#10-jest-dom-matchers)\n\n---\n\n## 1. Guiding Principles\n\nSource: [testing-library.com/docs/guiding-principles](https://testing-library.com/docs/guiding-principles)\n\n> \"The more your tests resemble the way your software is used, the more confidence they can give you.\"\n\nThe library emphasizes three principles:\n1. Tests should interact with DOM nodes rather than component instances\n2. 
Utilities should encourage testing applications as users would actually use them\n3. Implementations should remain simple and flexible\n\n---\n\n## 2. Query Priority\n\nSource: [testing-library.com/docs/queries/about](https://testing-library.com/docs/queries/about)\n\nOfficial order from most to least preferred:\n\n1. `getByRole` — \"query every element that is exposed in the accessibility tree\"\n2. `getByLabelText` — \"top preference\" for form fields\n3. `getByPlaceholderText` — fallback when labels unavailable\n4. `getByText` — for non-interactive elements outside forms\n5. `getByDisplayValue` — for form elements with filled-in values\n6. `getByAltText` — for elements supporting alt text\n7. `getByTitle` — least reliable semantic option\n8. `getByTestId` — only when other methods don't apply\n\n---\n\n## 3. Query Types\n\nSource: [testing-library.com/docs/queries/about](https://testing-library.com/docs/queries/about)\n\n\| Type \| 0 matches \| 1 match \| >1 matches \| Async? \|\n\|------\|-----------\|---------\|------------\|--------\|\n\| `getBy` \| Throw \| Return \| Throw \| No \|\n\| `queryBy` \| `null` \| Return \| Throw \| No \|\n\| `findBy` \| Throw \| Return \| Throw \| Yes (retries up to 1000ms) \|\n\| `getAllBy` \| Throw \| Array \| Array \| No \|\n\| `queryAllBy` \| `[]` \| Array \| Array \| No \|\n\| `findAllBy` \| Throw \| Array \| Array \| Yes \|\n\n---\n\n## 4. 
User Events\n\nSource: [testing-library.com/docs/user-event/intro](https://testing-library.com/docs/user-event/intro)\n\n`user-event` \"simulates user interactions by dispatching the events that would happen if the interaction took place in a browser.\"\n\nKey difference from `fireEvent`: `user-event` \"adds visibility and interactability checks along the way and manipulates the DOM just like a user interaction in the browser would.\"\n\nSetup:\n```ts\nconst user = userEvent.setup();\nrender(<MyComponent />);\nawait user.click(screen.getByRole('button'));\n```\n\nThe documentation \"discourages rendering or using any `userEvent` functions outside of the test itself - e.g. in a `before`/`after` hook.\"\n\n---\n\n## 5. Vitest Mocking\n\n### vi.mock() hoisting\n\nSource: [vitest.dev/api/vi](https://vitest.dev/api/vi.html)\n\n\"`vi.mock` is hoisted (in other words, _moved_) to top of the file.\"\n\n\"The call to `vi.mock` is hoisted to top of the file. It will always be executed before all imports.\"\n\n### vi.hoisted()\n\nAllows side effects before static imports are evaluated. Returns the factory function's return value.\n\n### Mock clearing methods\n\nSource: [vitest.dev/api/vi](https://vitest.dev/api/vi.html)\n\n`vi.clearAllMocks()` — Calls `.mockClear()` on all spies. \"This will clear mock history without affecting mock implementations.\"\n\n`vi.resetAllMocks()` — Calls `.mockReset()` on all spies. \"This will clear mock history and reset each mock's implementation.\"\n\n`vi.restoreAllMocks()` — \"This restores all original implementations on spies created with `vi.spyOn`.\" Does NOT clear history.\n\n### Internal vs external access warning\n\nSource: [vitest.dev/guide/mocking](https://vitest.dev/guide/mocking)\n\n\"This only mocks _external_ access. In this example, if `original` calls `mocked` internally, it will always call the function defined in the module, not in the mock factory.\"\n\n---\n\n## 6. 
React Testing Library API\n\nSource: [testing-library.com/docs/react-testing-library/api](https://testing-library.com/docs/react-testing-library/api)\n\n### render()\n\nReturns: `container`, `baseElement`, `debug`, `rerender`, `unmount`, `asFragment`, plus all bound queries.\n\nOptions: `container`, `baseElement`, `hydrate`, `legacyRoot`, `wrapper`, `queries`, `reactStrictMode`.\n\n### wrapper option\n\n\"Pass a React Component as the `wrapper` option to have it rendered around the inner element. This is most useful for creating reusable custom render functions for common data providers.\"\n\n### cleanup\n\n\"Unmounts React trees that were mounted with render. This is called automatically if your testing framework (such as mocha, Jest or Jasmine) injects a global `afterEach()` function.\"\n\n### renderHook()\n\n\"A convenience wrapper around `render` with a custom test component.\" Returns `result` (with `result.current`), `rerender`, `unmount`.\n\n---\n\n## 7. Zustand Testing\n\nSource: [github.com/pmndrs/zustand — docs/learn/guides/testing.md](https://github.com/pmndrs/zustand)\n\n### Official recommendation\n\n\"We recommend using React Testing Library (RTL) to test out React components that connect to Zustand.\"\n\n\"We also recommend using Mock Service Worker (MSW) to mock network requests.\"\n\n### Store reset pattern (Vitest)\n\n1. 
Create `__mocks__/zustand.ts`:\n\n```ts\nimport { act } from '@testing-library/react';\nimport type * as ZustandExportedTypes from 'zustand';\nexport * from 'zustand';\n\nconst { create: actualCreate, createStore: actualCreateStore } =\n await vi.importActual<typeof ZustandExportedTypes>('zustand');\n\nexport const storeResetFns = new Set<() => void>();\n\nconst createUncurried = <T>(\n stateCreator: ZustandExportedTypes.StateCreator<T>,\n) => {\n const store = actualCreate(stateCreator);\n const initialState = store.getInitialState();\n storeResetFns.add(() => { store.setState(initialState, true); });\n return store;\n};\n\nexport const create = (<T>(\n stateCreator: ZustandExportedTypes.StateCreator<T>,\n) => {\n return typeof stateCreator === 'function'\n ? createUncurried(stateCreator)\n : createUncurried;\n}) as typeof ZustandExportedTypes.create;\n\n// Similar for createStore...\n\nafterEach(() => {\n act(() => { storeResetFns.forEach((fn) => fn()); });\n});\n```\n\n2. In setup file: `vi.mock('zustand');`\n\n### Warning\n\n\"In Vitest you can change the root. Due to that, you need make sure that you are creating your `__mocks__` directory in the right place. Let's say that you change the root to `./src`, that means you need to create a `__mocks__` directory under `./src`.\"\n\n---\n\n## 8. TanStack Query Testing\n\nSource: TanStack Query official documentation\n\n### Key patterns\n\n- Create a new `QueryClient` for each test to prevent cache leaking\n- Set `retry: false` to make failures immediate:\n ```ts\n new QueryClient({ defaultOptions: { queries: { retry: false } } })\n ```\n- Provide via wrapper:\n ```ts\n const wrapper = ({ children }) =>\n createElement(QueryClientProvider, { client: queryClient }, children);\n ```\n\n---\n\n## 9. Common Mistakes\n\nSource: [kentcdodds.com/blog/common-mistakes-with-react-testing-library](https://kentcdodds.com/blog/common-mistakes-with-react-testing-library) (Kent C. 
Dodds, creator of Testing Library)\n\n### Distinct anti-patterns identified (18 items across 14 sections + sub-items)\n\n1. Not using Testing Library ESLint plugins — Install and use them\n2. Using `wrapper` as variable name for render result — Use `screen` or destructure\n3. Manually calling `cleanup` — It's automatic\n4. Not using `screen` — Always use `screen` for queries\n5. Wrong assertion — Use jest-dom matchers like `toBeDisabled()`\n6. Wrapping in `act()` unnecessarily — `render`/`fireEvent` already handle it\n7. Using wrong query — Use accessible queries, not `getByTestId`\n8. Using `container.querySelector()` — Use `screen` queries\n9. Not querying by text — Query by visible text content\n10. Not using `ByRole` — It should be the primary query\n11. Adding `aria-`/`role` incorrectly — Use semantic HTML elements\n12. Using `fireEvent` instead of `user-event` — `userEvent.setup()` is preferred\n13. Using `query` for existence — `query` is for NON-existence; use `getBy` for existence\n14. Using `waitFor` instead of `findBy` — `findBy` = `waitFor` + `getBy`\n15. Empty `waitFor` callback — Must contain an assertion\n16. Multiple assertions in `waitFor` — One inside, rest outside\n17. Side effects inside `waitFor` — Put side effects outside\n18. Using `get` as implicit assertions — Always use explicit `expect()`\n\n---\n\n## 10. 
jest-dom Matchers\n\nSource: [testing-library.com/docs/ecosystem-jest-dom](https://testing-library.com/docs/ecosystem-jest-dom)\n\nKey matchers for DOM testing:\n\n\| Matcher \| Tests for \|\n\|---\|---\|\n\| `toBeInTheDocument()` \| Element exists in DOM \|\n\| `toBeVisible()` \| Element is visible to user \|\n\| `toBeDisabled()` / `toBeEnabled()` \| Disabled state \|\n\| `toBeChecked()` \| Checkbox/radio is checked \|\n\| `toBeRequired()` \| Form element is required \|\n\| `toBeValid()` / `toBeInvalid()` \| Form validation state \|\n\| `toBeEmptyDOMElement()` \| No content \|\n\| `toHaveTextContent(text)` \| Contains text \|\n\| `toHaveAttribute(attr, value?)` \| Has HTML attribute \|\n\| `toHaveClass(className)` \| Has CSS class \|\n\| `toHaveStyle(css)` \| Has inline style \|\n\| `toHaveValue(value)` \| Form element value \|\n\| `toHaveDisplayValue(value)` \| Displayed value \|\n\| `toHaveFocus()` \| Element is focused \|\n\| `toContainElement(element)` \| Contains child element \|\n\| `toContainHTML(html)` \| Contains HTML string \|\n\| `toHaveDescription(text)` \| Has `aria-describedby` text \|\n\| `toHaveErrorMessage(text)` \| Has `aria-errormessage` text \|\n\| `toHaveAccessibleName(name)` \| Has accessible name \|\n\| `toHaveAccessibleDescription(desc)` \| Has accessible description \|\n\n### Vitest environment\n\nSource: [vitest.dev/guide/features](https://vitest.dev/guide/features.html)\n\nVitest supports both `happy-dom` and `jsdom` for DOM mocking: \"happy-dom or jsdom for DOM mocking.\" Configure via the `environment` option in vitest config.\n\n\"Vitest also isolates each file's environment so env mutations in one file don't affect others.\"\n"},"test-pgtap":{"content":"---\nname: test-pgtap\ndescription: Use when writing, reviewing, or fixing pgTAP tests for Supabase SQL migrations, or when auditing database tests for best practices. 
Triggers on plan count mismatches, transaction isolation issues, RLS policy testing, privilege verification, or assertion selection problems.\n---\n\n# pgTAP Database Testing\n\nWrite and review pgTAP tests for Supabase SQL migrations. Every recommendation is sourced from official pgTAP documentation (pgtap.org) or Supabase documentation — see [REFERENCE.md](REFERENCE.md) for citations.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for migrations the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's database test directory (Supabase convention: `supabase/tests/database/*.test.sql`).\n\n## Prerequisites\n\n```bash\nnpx supabase start # start local Supabase stack\nnpx supabase test db # run all pgTAP tests\nnpx supabase db reset # reset DB if needed (re-runs all migrations + seeds)\nnpx supabase db lint # run plpgsql_check linter\n```\n\nRequired extension: The `supabase_test_helpers` extension must be enabled for user management helpers (`tests.create_supabase_user`, `tests.authenticate_as`, etc.). Enable it in a migration:\n\n```sql\nCREATE EXTENSION IF NOT EXISTS supabase_test_helpers;\n```\n\n## Principles to Enforce\n\n### 1. Transaction Isolation — `BEGIN`/`ROLLBACK`\n\nEvery test file MUST be wrapped in a transaction:\n\n```sql\nBEGIN;\nSELECT plan(N);\n\n-- tests here\n\nSELECT * FROM finish();\nROLLBACK;\n```\n\nWhy: ROLLBACK ensures all test-created data (rows, schema changes from CREATE OR REPLACE) is cleaned up. Tests cannot leak state to other test files.\n\nRules:\n- Always `BEGIN;` as the first statement\n- Always `ROLLBACK;` as the last statement\n- NEVER use `COMMIT;` in test files\n- `finish()` must be called before `ROLLBACK` to output TAP diagnostics\n\n### 2. 
Plan Counts — Always Use `SELECT plan(N)`\n\n```sql\nSELECT plan(16); -- exactly 16 assertions will run\n```\n\npgTAP official documentation states about `no_plan()`: \"Try to avoid using this as it weakens your test.\"\n\nRules:\n- ALWAYS use `SELECT plan(N)` with an exact count\n- NEVER use `no_plan()` — it hides missing/skipped assertions\n- The plan count MUST match the actual number of assertion calls\n- When adding/removing assertions, update the plan count AND the file header comment\n- Include the count in the file header: `-- Assertion count: N`\n\n### 3. Test File Organization\n\nNaming convention: `NNNNN-description.test.sql` where NNNNN is a zero-padded number controlling execution order.\n\nFile header template:\n```sql\n-- NNNNN-description.test.sql\n-- Tests for: <what migration/feature this tests>\n--\n-- Covers:\n-- 1. <first thing tested>\n-- 2. <second thing tested>\n--\n-- Assertion count: N\n-- Dependency: <test files or seeds this depends on>\n```\n\nCategorization:\n- `00NNN` — Schema tests (table/column/index/constraint existence)\n- `00NNN` — Trigger schema + behavioral tests\n- `01NNN` — RPC/function behavioral tests\n- Files should test ONE migration or ONE logical unit\n\n### 4. 
Assertion Function Selection\n\nUse the most specific assertion for the situation:\n\n\| Situation \| Use \| NOT \|\n\|-----------\|-----\|-----\|\n\| Exact value equality \| `is(have, want, desc)` \| `ok(have = want, desc)` \|\n\| Value inequality \| `isnt(have, want, desc)` \| `ok(have != want, desc)` \|\n\| Boolean condition \| `ok(condition, desc)` \| `is(condition, true, desc)` \|\n\| Row existence \| `ok(EXISTS(SELECT ...), desc)` \| Checking count \|\n\| Exception expected \| `throws_ok(sql, errcode, errmsg, desc)` \| Manual BEGIN/EXCEPTION \|\n\| No exception expected \| `lives_ok(sql, desc)` \| Running SQL without assertion \|\n\| Row non-existence \| `ok(NOT EXISTS(SELECT ...), desc)` \| `is(count, 0, desc)` \|\n\| Exact row comparison \| `results_eq(sql, sql, desc)` \| Manual row-by-row checks \|\n\| Set equality (order-independent) \| `set_eq(sql, sql, desc)` \| `results_eq` when order doesn't matter \|\n\| Empty result set \| `is_empty(sql, desc)` \| `ok(NOT EXISTS(...))` \|\n\n`is()` uses `IS NOT DISTINCT FROM` — this correctly handles NULL comparisons (unlike `=`).\n\n### 5. 
Schema Tests — Existence and Structure\n\nTest that migrations created the expected schema objects:\n\n```sql\n-- Table exists\nSELECT has_table('public', 'profiles', 'profiles table exists');\n\n-- Column exists with correct type\nSELECT has_column('public', 'profiles', 'coins', 'profiles.coins column exists');\nSELECT col_type_is('public', 'profiles', 'coins', 'integer', 'profiles.coins is integer');\n\n-- Column constraints\nSELECT col_not_null('public', 'profiles', 'id', 'profiles.id is NOT NULL');\nSELECT col_default_is('public', 'profiles', 'coins', '100', 'profiles.coins defaults to 100');\nSELECT col_is_pk('public', 'profiles', 'id', 'profiles.id is primary key');\n\n-- Foreign key\nSELECT fk_ok('public', 'profiles', 'id', 'auth', 'users', 'id', 'profiles.id references auth.users');\n\n-- Check constraint (has_check takes schema, table, description; it does not accept a constraint name)\nSELECT has_check('public', 'profiles', 'profiles has a check constraint');\n\n-- Index\nSELECT has_index('public', 'profiles', 'idx_profiles_github', 'github index exists');\n\n-- RLS enabled\nSELECT ok(\n (SELECT rowsecurity FROM pg_tables WHERE schemaname = 'public' AND tablename = 'profiles'),\n 'RLS enabled on profiles'\n);\n```\n\n### 6. 
Behavioral Tests — RPCs and Business Logic\n\nTest PL/pgSQL function behavior by calling them and asserting outcomes:\n\n```sql\n-- Happy path\nSELECT is(\n (SELECT public.my_rpc(param1, param2)),\n expected_value,\n 'my_rpc: returns expected value'\n);\n\n-- Exception path\nSELECT throws_ok(\n format($sql$SELECT public.my_rpc(%L, %L)$sql$, bad_param1, bad_param2),\n 'P0001', -- SQLSTATE for RAISE EXCEPTION\n 'Expected error message',\n 'my_rpc: rejects bad input'\n);\n\n-- Side effects\nSELECT ok(\n EXISTS (SELECT 1 FROM public.some_table WHERE condition),\n 'my_rpc: creates expected row'\n);\n```\n\nRules for `throws_ok`:\n- First arg is a SQL string (not a function call) — wrap in `$sql$...$sql$`\n- Use `format()` with `%L` for parameter interpolation (prevents SQL injection in tests)\n- Second arg is SQLSTATE code (`'P0001'` for custom RAISE EXCEPTION)\n- Third arg is the expected error message (exact match)\n\n### 7. RLS Policy Testing\n\nTest that RLS policies exist, are configured correctly, and enforce access:\n\n```sql\n-- Policy exists (exact set)\nSELECT policies_are('public', 'profiles',\n ARRAY['profiles_select_policy', 'profiles_update_policy'],\n 'profiles has expected RLS policies'\n);\n\n-- Policy applies to correct roles\nSELECT policy_roles_are('public', 'profiles', 'profiles_select_policy',\n ARRAY['authenticated'],\n 'select policy applies to authenticated only'\n);\n\n-- Policy applies to correct command\nSELECT policy_cmd_is('public', 'profiles', 'profiles_select_policy',\n 'SELECT',\n 'select policy is SELECT-only'\n);\n```\n\nBehavioral RLS testing requires setting the role context:\n\n```sql\n-- Set authenticated user context\nSET LOCAL ROLE authenticated;\nSET LOCAL \"request.jwt.claims\" = '{\"sub\": \"user-uuid-here\"}';\n\n-- Now queries run as that user, RLS applies\nSELECT is_empty(\n $$SELECT * FROM public.profiles WHERE id != 'user-uuid-here'$$,\n 'authenticated user cannot see other profiles'\n);\n\n-- Reset role\nRESET 
ROLE;\n```\n\n### 8. SECURITY DEFINER and Privilege Testing\n\n```sql\n-- Function is SECURITY DEFINER\nSELECT is_definer('public', 'my_function', ARRAY['uuid']::name[],\n 'my_function is SECURITY DEFINER');\n\n-- Function is NOT SECURITY DEFINER\nSELECT isnt_definer('public', 'my_function', ARRAY['uuid']::name[],\n 'my_function is SECURITY INVOKER');\n\n-- No role has EXECUTE privilege (defense-in-depth)\nSELECT function_privs_are('public', 'my_function', ARRAY['uuid']::name[],\n 'anon', ARRAY[]::text[],\n 'anon: no execute on my_function');\n\nSELECT function_privs_are('public', 'my_function', ARRAY['uuid']::name[],\n 'authenticated', ARRAY[]::text[],\n 'authenticated: no execute on my_function');\n\nSELECT function_privs_are('public', 'my_function', ARRAY['uuid']::name[],\n 'service_role', ARRAY[]::text[],\n 'service_role: no execute on my_function');\n```\n\nNote: `function_privs_are` takes the parameter types as `ARRAY[]::name[]` — use empty array for no-argument functions.\n\n### 9. Trigger Testing\n\n```sql\n-- Trigger exists on table\nSELECT has_trigger('public', 'messages', 'on_message_insert',\n 'on_message_insert trigger exists on messages');\n\n-- Trigger function exists and is SECURITY DEFINER\nSELECT has_function('public', 'broadcast_new_message', 'trigger function exists');\nSELECT is_definer('public', 'broadcast_new_message', ARRAY[]::name[],\n 'broadcast_new_message is SECURITY DEFINER');\n\n-- Trigger behavior (insert data, verify side effects)\nINSERT INTO public.messages (...) VALUES (...);\nSELECT ok(\n EXISTS (SELECT 1 FROM public.expected_side_effect WHERE ...),\n 'trigger creates expected side effect'\n);\n```\n\n### 10. 
Supabase Test Helpers\n\nSupabase provides helper functions for user management in tests:\n\n```sql\n-- Create a test user (fires auth triggers)\nSELECT tests.create_supabase_user(\n 'test_user_alias',\n 'test@example.com',\n NULL, -- phone (optional)\n '{\"sub\": \"12345\", \"preferred_username\": \"testuser\"}'::jsonb -- raw_user_meta_data\n);\n\n-- Get the UUID of a test user\nSELECT tests.get_supabase_uid('test_user_alias');\n```\n\nRules:\n- Use unique aliases per test file to avoid collisions\n- Prefix aliases with the test file's theme (e.g., `auth_trigger_alice`)\n- The JSONB metadata must include `sub` and `preferred_username` for GitHub OAuth simulation\n\n### 11. Test Description Conventions\n\nEvery assertion MUST have a descriptive message:\n\n```sql\n-- Good: tells you what's being tested and what function/feature\nSELECT is(result, expected, 'my_rpc: returns correct value for edge case');\n\n-- Bad: no description\nSELECT is(result, expected);\n\n-- Bad: vague\nSELECT is(result, expected, 'test 1');\n```\n\nFormat: `'<function_or_feature>: <what is being verified>'`\n\n### 12. Determinism and Independence\n\n- Tests MUST be deterministic — same result every run\n- Use fixed values, not `random()`, `now()`, or `gen_random_uuid()` in assertions\n- Each test file should be independent — don't rely on state from other test files\n- Use `tests.create_supabase_user()` for user setup, not raw INSERT (ensures triggers fire)\n- Clean up is handled by `ROLLBACK` — no explicit DELETE needed\n\n### 13. 
`SET LOCAL` vs `SET` — Scope to the Transaction\n\nWhen changing session variables inside a test (e.g., setting role or JWT claims), always use `SET LOCAL`:\n\n```sql\n-- GOOD: scoped to the current transaction — reverted by ROLLBACK\nSET LOCAL ROLE authenticated;\nSET LOCAL \"request.jwt.claims\" = '{\"sub\": \"...\"}';\n\n-- BAD: session-scoped; persists for the rest of the session if the transaction commits or the statement runs outside BEGIN/ROLLBACK\nSET ROLE authenticated;\nSET \"request.jwt.claims\" = '{\"sub\": \"...\"}';\n```\n\n`SET LOCAL` restricts the change to the current transaction, so the variable is restored when the transaction ends, however it ends. Plain `SET` is session-scoped: PostgreSQL undoes it only if the surrounding transaction is rolled back, so it can contaminate later test files whenever it runs outside `BEGIN`/`ROLLBACK` or the transaction commits.\n\n### 14. SAVEPOINT Caveat — Avoid Sub-transactions\n\nDo NOT use `SAVEPOINT`/`ROLLBACK TO` inside pgTAP test files. Rolling back to a savepoint discards any assertions emitted after the savepoint, causing a plan count mismatch (pgTAP still expects them but they were rolled back). If you need to test that an operation fails, use `throws_ok()` instead of manually catching exceptions with savepoints.\n\n## Common Anti-Patterns\n\n\| Anti-Pattern \| Why it's wrong \| Fix \|\n\|---\|---\|---\|\n\| `no_plan()` \| Hides missing assertions \| Use `plan(N)` with exact count \|\n\| Missing `ROLLBACK` \| Test data leaks to other files \| Always end with `ROLLBACK;` \|\n\| `ok(a = b, desc)` for equality \| Fails silently on NULL \| Use `is(a, b, desc)` \|\n\| No description on assertions \| Failures are undiagnosable \| Always provide descriptive message \|\n\| Testing private internals \| Brittle, breaks on refactor \| Test public RPC behavior \|\n\| Hardcoded UUIDs \| Collides with other tests \| Use `tests.get_supabase_uid()` \|\n\| `COMMIT` in test files \| Permanently alters database \| Use `ROLLBACK` \|\n\| Plan count mismatch \| Test suite reports wrong total \| Keep count in sync with assertions \|\n\| Missing `finish()` \| No diagnostic output on failure \| Always call before 
`ROLLBACK` \|\n\| `SET` instead of `SET LOCAL` \| Session-scoped; leaks role/claims if the transaction commits or the statement runs outside it \| Always use `SET LOCAL` inside tests \|\n\| `SAVEPOINT`/`ROLLBACK TO` \| Discards assertions, breaks plan count \| Use `throws_ok()` for error testing \|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.test.sql` (lines X-Y)\nPrinciple: What the standard requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Rules\n\n- Only verified claims: every recommendation is backed by pgtap.org or Supabase official documentation.\n- Schema AND behavior: test both that objects exist (schema) and that they work correctly (behavior).\n- Transaction discipline: every file wrapped in BEGIN/ROLLBACK, no exceptions.\n- Exact plan counts: never use `no_plan()`.\n- Descriptive messages: every assertion needs a clear description.\n- Test the contract: test what RPCs accept, return, and side-effect — not internal implementation.\n","reference":"# pgTAP Database Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Test Structure](#1-test-structure)\n2. [Plan Counts](#2-plan-counts)\n3. [Core Assertions](#3-core-assertions)\n4. [Schema Testing Functions](#4-schema-testing-functions)\n5. [Column Testing Functions](#5-column-testing-functions)\n6. [Function Testing Functions](#6-function-testing-functions)\n7. [RLS Policy Functions](#7-rls-policy-functions)\n8. [Privilege Testing Functions](#8-privilege-testing-functions)\n9. [Exception Testing](#9-exception-testing)\n10. [Result Set Testing](#10-result-set-testing)\n11. 
[Supabase Helpers](#11-supabase-helpers)\n12. [Diagnostics and Utilities](#12-diagnostics-and-utilities)\n\n---\n\n## 1. Test Structure\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\nThe standard test structure shown in pgTAP documentation:\n\n```sql\nBEGIN;\nSELECT plan(N);\n-- tests\nSELECT * FROM finish();\nROLLBACK;\n```\n\n\"This ensures all changes (including function loading) are rolled back after tests complete.\"\n\nSource (Supabase): [supabase.com/docs/guides/database/extensions/pgtap](https://supabase.com/docs/guides/database/extensions/pgtap)\n\nSupabase's examples use the same `begin;`/`rollback;` pattern.\n\nRunning tests: `supabase test db`\n\nSource (Supabase): [supabase.com/docs/guides/database/testing](https://supabase.com/docs/guides/database/testing)\n\nTest files go in `./supabase/tests/database/` with `.sql` extension. \"All `sql` files use pgTAP as the test runner.\"\n\n---\n\n## 2. Plan Counts\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n`SELECT plan(N);` — declares the expected number of tests.\n\n`SELECT * FROM no_plan();` — for cases where test count is unknown. \"Try to avoid using this as it weakens your test.\"\n\n`SELECT * FROM finish();` — outputs TAP summary, reports failures. Optional parameter: `finish(true)` throws exception if any test failed.\n\n---\n\n## 3. 
Core Assertions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n### Basic\n\n\| Function \| Description \|\n\|---\|---\|\n\| `ok(boolean, description)` \| Passes if boolean is true \|\n\| `is(have, want, description)` \| Equality using `IS NOT DISTINCT FROM` (NULL-safe) \|\n\| `isnt(have, want, description)` \| Inequality using `IS DISTINCT FROM` \|\n\| `pass(description)` \| Unconditional pass \|\n\| `fail(description)` \| Unconditional fail \|\n\| `isa_ok(value, regtype, name)` \| Type checking \|\n\n### Pattern Matching\n\n\| Function \| Description \|\n\|---\|---\|\n\| `matches(have, regex, description)` \| Regex match \|\n\| `imatches(have, regex, description)` \| Case-insensitive regex match \|\n\| `doesnt_match(have, regex, description)` \| Regex non-match \|\n\| `alike(have, like_pattern, description)` \| SQL LIKE pattern \|\n\| `unalike(have, like_pattern, description)` \| LIKE non-match \|\n\| `cmp_ok(have, operator, want, description)` \| Arbitrary operator comparison \|\n\n### Comparison\n\n\| Function \| Description \|\n\|---\|---\|\n\| `is(have, want, desc)` uses `IS NOT DISTINCT FROM` \| Correctly handles NULL = NULL as true \|\n\| `=` operator \| NULL = NULL returns NULL (falsy) — DO NOT USE for equality testing \|\n\n---\n\n## 4. 
Schema Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n### Existence\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_table(schema, table, desc)` \| Table exists \|\n\| `hasnt_table(schema, table, desc)` \| Table doesn't exist \|\n\| `has_view(schema, view, desc)` \| View exists \|\n\| `has_materialized_view(schema, view, desc)` \| Materialized view exists \|\n\| `has_sequence(schema, sequence, desc)` \| Sequence exists \|\n\| `has_index(schema, table, index, desc)` \| Index exists \|\n\| `has_trigger(schema, table, trigger, desc)` \| Trigger exists \|\n\| `has_function(schema, function, desc)` \| Function exists \|\n\| `has_extension(name, desc)` \| Extension enabled \|\n\| `has_schema(name, desc)` \| Schema exists \|\n\| `has_type(schema, type, desc)` \| Type exists \|\n\| `has_enum(schema, enum, desc)` \| Enum exists \|\n\| `has_composite(schema, composite, desc)` \| Composite type exists \|\n\| `has_domain(schema, domain, desc)` \| Domain exists \|\n\| `has_role(name, desc)` \| Role exists \|\n\n### Collection assertions\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `tables_are(schema, tables_array, desc)` \| Exact set of tables \|\n\| `views_are(schema, views_array, desc)` \| Exact set of views \|\n\| `columns_are(schema, table, columns_array, desc)` \| Exact set of columns \|\n\| `indexes_are(schema, table, indexes_array, desc)` \| Exact set of indexes \|\n\| `triggers_are(schema, table, triggers_array, desc)` \| Exact set of triggers \|\n\| `functions_are(schema, functions_array, desc)` \| Exact set of functions \|\n\| `schemas_are(schemas_array, desc)` \| Exact set of schemas \|\n\| `extensions_are(schema, extensions_array, desc)` \| Exact set of extensions \|\n\| `roles_are(roles_array, desc)` \| Exact set of roles \|\n\| `enum_has_labels(schema, enum, labels_array, desc)` \| Enum has expected labels \|\n\n---\n\n## 5. 
Column Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_column(schema, table, column, desc)` \| Column exists \|\n\| `hasnt_column(schema, table, column, desc)` \| Column doesn't exist \|\n\| `col_type_is(schema, table, column, type, desc)` \| Column has expected type \|\n\| `col_not_null(schema, table, column, desc)` \| Column is NOT NULL \|\n\| `col_is_null(schema, table, column, desc)` \| Column allows NULL \|\n\| `col_has_default(schema, table, column, desc)` \| Column has a default \|\n\| `col_hasnt_default(schema, table, column, desc)` \| Column has no default \|\n\| `col_default_is(schema, table, column, default, desc)` \| Default value matches \|\n\| `col_is_pk(schema, table, column, desc)` \| Column is primary key \|\n\| `col_isnt_pk(schema, table, column, desc)` \| Column is not primary key \|\n\| `col_is_fk(schema, table, column, desc)` \| Column is foreign key \|\n\| `col_isnt_fk(schema, table, column, desc)` \| Column is not foreign key \|\n\| `col_is_unique(schema, table, column, desc)` \| Column has unique constraint \|\n\| `has_pk(schema, table, desc)` \| Table has a primary key \|\n\| `has_fk(schema, table, desc)` \| Table has a foreign key \|\n\| `fk_ok(schema, table, cols, ref_schema, ref_table, ref_cols, desc)` \| Foreign key references correct table \|\n\| `has_check(schema, table, desc)` \| Table has a check constraint \|\n\| `has_unique(schema, table, desc)` \| Table has a unique constraint \|\n\| `is_partitioned(schema, table, desc)` \| Table is partitioned \|\n\| `is_partition_of(schema, table, parent, desc)` \| Table is partition of parent \|\n\n---\n\n## 6. 
Function Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_function(schema, function, args, desc)` \| Function exists with given args \|\n\| `function_lang_is(schema, function, args, language, desc)` \| Function language (plpgsql, sql, etc.) \|\n\| `function_returns(schema, function, args, return_type, desc)` \| Return type \|\n\| `is_definer(schema, function, args, desc)` \| SECURITY DEFINER \|\n\| `isnt_definer(schema, function, args, desc)` \| NOT SECURITY DEFINER (INVOKER) \|\n\| `is_strict(schema, function, args, desc)` \| STRICT (RETURNS NULL ON NULL INPUT) \|\n\| `isnt_strict(schema, function, args, desc)` \| NOT STRICT \|\n\| `volatility_is(schema, function, args, volatility, desc)` \| IMMUTABLE, STABLE, or VOLATILE \|\n\| `is_aggregate(schema, function, args, desc)` \| Is an aggregate function \|\n\| `is_procedure(schema, function, args, desc)` \| Is a procedure \|\n\| `is_normal_function(schema, function, args, desc)` \| Is a normal function \|\n\| `trigger_is(schema, table, trigger, func_schema, function, desc)` \| Trigger calls expected function \|\n\n### Argument format\n\nFor `args`, use `ARRAY['uuid', 'text']::name[]` or `ARRAY[]::name[]` for no arguments:\n\n```sql\nSELECT is_definer('public', 'my_function', ARRAY['uuid', 'integer']::name[], 'is SECURITY DEFINER');\nSELECT is_definer('public', 'my_trigger_fn', ARRAY[]::name[], 'trigger fn is SECURITY DEFINER');\n```\n\n---\n\n## 7. 
RLS Policy Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html), [supabase.com/docs/guides/database/extensions/pgtap](https://supabase.com/docs/guides/database/extensions/pgtap)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `policies_are(schema, table, policies_array, desc)` \| Exact set of policies on table \|\n\| `policy_roles_are(schema, table, policy, roles_array, desc)` \| Policy applies to these roles \|\n\| `policy_cmd_is(schema, table, policy, command, desc)` \| Policy applies to SELECT/INSERT/UPDATE/DELETE/ALL \|\n\n### Example from Supabase docs\n\n```sql\nSELECT policies_are(\n 'public', 'profiles',\n ARRAY['Profiles are public', 'Profiles can only be updated by the owner']\n);\n```\n\n---\n\n## 8. Privilege Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `table_privs_are(schema, table, role, privs, desc)` \| Table privileges (SELECT, INSERT, UPDATE, DELETE, etc.) \|\n\| `schema_privs_are(schema, role, privs, desc)` \| Schema privileges (CREATE, USAGE) \|\n\| `function_privs_are(schema, function, args, role, privs, desc)` \| Function privileges (EXECUTE) \|\n\| `sequence_privs_are(schema, sequence, role, privs, desc)` \| Sequence privileges \|\n\| `column_privs_are(schema, table, column, role, privs, desc)` \| Column-level privileges \|\n\| `database_privs_are(database, role, privs, desc)` \| Database privileges \|\n\n### Testing REVOKE\n\nTo verify a function has NO execute privilege for a role:\n\n```sql\nSELECT function_privs_are('public', 'my_function', ARRAY['uuid']::name[],\n 'authenticated', ARRAY[]::text[], -- empty array = no privileges\n 'authenticated: no execute on my_function');\n```\n\n---\n\n## 9. 
Exception Testing\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `throws_ok(sql, errcode, errmsg, desc)` \| SQL raises expected exception \|\n\| `throws_like(sql, like_pattern, desc)` \| Exception message matches LIKE pattern \|\n\| `throws_matching(sql, regex, desc)` \| Exception message matches regex \|\n\| `lives_ok(sql, desc)` \| SQL does NOT raise an exception \|\n\| `performs_ok(sql, milliseconds, desc)` \| SQL completes within time limit \|\n\n### `throws_ok` signatures\n\n```sql\n-- Full form: SQLSTATE + message\nSELECT throws_ok(\n $$SELECT 1/0$$,\n '22012', -- SQLSTATE for division by zero\n 'division by zero',\n 'division by zero throws correct error'\n);\n\n-- Message only\nSELECT throws_ok(\n $$SELECT 1/0$$,\n 'division by zero'\n);\n\n-- SQLSTATE only\nSELECT throws_ok(\n $$SELECT 1/0$$,\n '22012'\n);\n```\n\nCommon SQLSTATE codes:\n- `P0001` — `RAISE EXCEPTION` (custom)\n- `23505` — unique_violation\n- `23503` — foreign_key_violation\n- `23514` — check_violation\n- `22012` — division_by_zero\n- `42501` — insufficient_privilege\n\n---\n\n## 10. 
Result Set Testing\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `results_eq(sql, sql, desc)` \| Exact row-by-row match (order matters) \|\n\| `results_ne(sql, sql, desc)` \| Results differ \|\n\| `set_eq(sql, sql, desc)` \| Same rows regardless of order/duplicates \|\n\| `set_ne(sql, sql, desc)` \| Different sets \|\n\| `set_has(sql, sql, desc)` \| First result is superset of second \|\n\| `set_hasnt(sql, sql, desc)` \| First result has none of second's rows \|\n\| `bag_eq(sql, sql, desc)` \| Same multiset (duplicates matter, order doesn't) \|\n\| `bag_ne(sql, sql, desc)` \| Different multisets \|\n\| `is_empty(sql, desc)` \| Query returns no rows \|\n\| `isnt_empty(sql, desc)` \| Query returns at least one row \|\n\| `row_eq(sql, record, desc)` \| Single row matches record \|\n\n---\n\n## 11. Supabase Helpers\n\nSource: [supabase.com/docs/guides/database/testing](https://supabase.com/docs/guides/database/testing), [supabase.com/docs/guides/local-development/testing/pgtap-extended](https://supabase.com/docs/guides/local-development/testing/pgtap-extended)\n\nSupabase provides a `tests` schema with helper functions for managing test users and context:\n\n### User Management\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.create_supabase_user(identifier, email, phone?, metadata?)` \| Creates `auth.users` record (fires auth triggers) \|\n\| `tests.get_supabase_uid(identifier)` \| Returns UUID of previously created test user \|\n\n```sql\nSELECT tests.create_supabase_user(\n 'my_test_user',\n 'test@example.com',\n NULL,\n '{\"sub\": \"12345\", \"preferred_username\": \"testuser\", \"avatar_url\": \"https://example.com/avatar.png\"}'::jsonb\n);\n\nSELECT tests.get_supabase_uid('my_test_user');\n-- Returns: uuid\n```\n\n### Authentication Context\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.authenticate_as(identifier)` \| Sets role to `authenticated` + JWT claims for user 
\|\n\| `tests.authenticate_as_service_role()` \| Sets role to `service_role`, clears JWT claims \|\n\| `tests.clear_authentication()` \| Sets role to `anon`, clears JWT claims \|\n\n```sql\n-- Test as authenticated user\nSELECT tests.authenticate_as('my_test_user');\n-- Now queries run with RLS applied for this user\n\n-- Test as service_role (bypasses RLS)\nSELECT tests.authenticate_as_service_role();\n\n-- Test as anonymous\nSELECT tests.clear_authentication();\n```\n\n### RLS Verification\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.rls_enabled(schema)` \| Asserts ALL tables in schema have RLS enabled \|\n\| `tests.rls_enabled(schema, table)` \| Asserts specific table has RLS enabled \|\n\n```sql\nSELECT tests.rls_enabled('public');\n```\n\n### Time Control\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.freeze_time(timestamp)` \| Freeze `now()` for deterministic time tests \|\n\| `tests.unfreeze_time()` \| Restore normal time behavior \|\n\n---\n\n## 12. Diagnostics and Utilities\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `diag(message)` \| Output diagnostic message (prefixed with `#`) \|\n\| `skip(reason, count)` \| Skip N tests with reason \|\n\| `todo(reason, count)` \| Mark N tests as to-do \|\n\| `todo_start(why)` / `todo_end()` \| Block-style todo marking \|\n\| `collect_tap(...)` \| Concatenate TAP output \|\n\| `pgtap_version()` \| pgTAP version string \|\n\| `pg_version()` \| PostgreSQL version string \|\n\| `pg_version_num()` \| Numeric PG version (e.g., 150000) \|\n\n### Ownership testing\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `table_owner_is(schema, table, owner, desc)` \| Table owner \|\n\| `view_owner_is(schema, view, owner, desc)` \| View owner \|\n\| `function_owner_is(schema, function, args, owner, desc)` \| Function owner \|\n\| `schema_owner_is(schema, owner, desc)` \| Schema owner \|\n\| `sequence_owner_is(schema, sequence, owner, 
desc)` \| Sequence owner \|\n\n---\n\n## 13. `SET LOCAL` vs `SET`\n\nSource: [PostgreSQL documentation — SET](https://www.postgresql.org/docs/current/sql-set.html)\n\n`SET LOCAL` restricts the setting to the current transaction. When the transaction ends (via `COMMIT` or `ROLLBACK`), the setting reverts to its session-level value.\n\nPlain `SET` (without `LOCAL`) changes the session-level value. Per the PostgreSQL documentation, a `SET` issued inside a transaction that is later rolled back is undone by the `ROLLBACK`, but it persists for the rest of the session once the transaction commits. In pgTAP tests, `SET ROLE` or `SET \"request.jwt.claims\"` therefore leaks into subsequent test files whenever it runs outside the `BEGIN`/`ROLLBACK` wrapper or the test ends with `COMMIT`; `SET LOCAL` cannot leak in either case.\n\n```sql\n-- Inside BEGIN/ROLLBACK:\nSET LOCAL ROLE authenticated; -- reverted at transaction end (COMMIT or ROLLBACK) ✓\nSET ROLE authenticated; -- reverted by ROLLBACK, but persists if the transaction commits ✗\n```\n\n---\n\n## 14. SAVEPOINT Caveat\n\nSource: PostgreSQL transaction semantics ([postgresql.org/docs/current/sql-savepoint.html](https://www.postgresql.org/docs/current/sql-savepoint.html))\n\nNote: pgTAP documentation does not explicitly address SAVEPOINTs. This guidance is derived from PostgreSQL transaction semantics: `ROLLBACK TO SAVEPOINT` undoes table writes made after the savepoint (sequence increments from `nextval` are not rolled back). Since pgTAP tracks test state within the transaction, rolling back to a savepoint can leave that state inconsistent and cause plan count mismatches.\n\n```sql\n-- BAD: assertions between SAVEPOINT and ROLLBACK TO are lost\nSAVEPOINT sp1;\nSELECT ok(true, 'this assertion gets rolled back'); -- counted by plan but discarded\nROLLBACK TO sp1;\n-- Plan now expects more assertions than will actually complete\n\n-- GOOD: use throws_ok() instead\nSELECT throws_ok(\n $$SELECT some_function_that_should_fail()$$,\n 'P0001', 'expected error message',\n 'function rejects bad input'\n);\n```\n"},"ui-audit":{"content":"---\nname: ui-audit\ndescription: Use after any UI edit, when reviewing UI components, or when asked for an accessibility or structure audit. 
Triggers on WCAG 2.2 violations, WAI-ARIA APG pattern issues, touch target sizing, focus management, component duplication, or separation of concerns problems in React/Tailwind code.\n---\n\n# UI Audit — Accessibility & Structure\n\nAudit React/Tailwind UI code for accessibility violations and structural anti-patterns. Every finding must cite the specific standard (WCAG SC, WAI-ARIA APG pattern, platform guideline) so the developer knows the authoritative source.\n\nSee [REFERENCE.md](REFERENCE.md) for detailed standard definitions, exact requirements, and code examples.\n\n## Scope\n\nDetermine what to audit based on context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added UI code (`.tsx`, `.css` files)\n- File/directory mode: audit the files or directories the user specifies\n- Full audit mode: when the user asks for a full UI audit, scan the project's `src/` directory (skip node_modules, build artifacts, test files)\n\nRead all in-scope code before producing findings.\n\n## Part 1 — Accessibility\n\nEvaluate against each check. Skip checks with no findings.\n\n### 1. Touch Target Size\n\nStandards: WCAG 2.5.5 (AAA) — 44x44 CSS px; WCAG 2.5.8 (AA) — 24x24 CSS px; Apple HIG — 44x44 pt; Material Design — 48x48 dp\n\nThis project targets mobile (future native app). 
Enforce 44x44px minimum (Tailwind `min-h-11` = 2.75rem = 44px at 16px root).\n\nWhat to check:\n- Every `<button>`, `<a>`, `<input>`, `<select>`, clickable `<div>`/`<li>`, and icon button must produce a tap target of at least 44x44px\n- Padding classes that produce heights below 44px on small text: `py-0.5` (~24px), `py-1` (~28px), `py-1.5` (~32px), `py-2` (~36px) on `text-sm`/`text-xs` elements\n- Toggle/switch components: the clickable area (not just the visual track) must be 44x44px\n- Close buttons (especially bare `x` character): must have padding to reach 44x44px\n\nExceptions (per WCAG 2.5.5):\n- Inline links within a sentence or block of text\n- Size determined entirely by the user agent\n\n### 2. Modal / Dialog Accessibility\n\nStandard: WAI-ARIA APG — Dialog (Modal) Pattern\n\nRequired attributes:\n- Container: `role=\"dialog\"` and `aria-modal=\"true\"`\n- Title: `aria-labelledby` pointing to the dialog's heading, OR `aria-label`\n- Optional: `aria-describedby` for descriptive content\n\nRequired focus management:\n- On open: move focus to an element inside the dialog\n- Focus trap: Tab/Shift+Tab cycle within the dialog, never escaping behind\n- On close: return focus to the element that triggered the dialog\n\nRequired keyboard:\n- Escape closes the dialog\n\nViolations to flag:\n- `role=\"presentation\"` on a modal container\n- No focus trap\n- No focus restoration on close\n- Missing `aria-labelledby` or `aria-label`\n\n### 3. 
Focus Visibility\n\nStandard: WCAG 2.4.7 (AA) — \"Any keyboard operable user interface has a mode of operation where the keyboard focus indicator is visible.\"\n\nWhat to check:\n- Every interactive element (`<button>`, `<a>`, `<input>`, `<select>`, `[role=\"button\"]`, `[role=\"tab\"]`, `[tabindex]`) must have a visible focus indicator\n- Look for `focus-visible:outline`, `focus-visible:ring`, `focus:ring`, or equivalent\n- If NO interactive elements in scope have focus styles, flag as a blanket issue\n- Custom components wrapping `<div onClick>` need `tabIndex={0}` AND a focus style\n\n### 4. Color Contrast\n\nStandard: WCAG 1.4.3 (AA) — Contrast (Minimum)\n\nRequired ratios:\n- Normal text (< 18pt or < 14pt bold): 4.5:1\n- Large text (>= 18pt or >= 14pt bold): 3:1\n- UI components and graphical objects (WCAG 1.4.11 AA): 3:1\n\nWhat to check:\n- Low-opacity text: `opacity-30`, `opacity-40`, or equivalent — compute effective contrast\n- Stacked opacity (e.g., `bg-surface/50 opacity-60`) — compound reduction likely fails\n- Placeholder text colors\n- Note: WCAG 1.4.3 exempts \"inactive\" (disabled) UI components from contrast requirements\n\n### 5. Form Label Association\n\nStandards: WCAG 1.3.1 (A) — \"Information, structure, and relationships conveyed through presentation can be programmatically determined\"; WCAG 4.1.2 (A) — \"For all user interface components, the name and role can be programmatically determined\"\n\nWhat to check:\n- Every `<input>`, `<select>`, `<textarea>` must have ONE of:\n - A `<label>` with `htmlFor` matching the input's `id`\n - `aria-label` on the input\n - `aria-labelledby` pointing to a visible label element\n- Visible label text NOT programmatically connected = violation\n- `placeholder` alone is NOT a label (it disappears on input)\n\n### 6. 
Icon-Only Buttons\n\nStandards: WCAG 1.1.1 (A) — \"All non-text content that is presented to the user has a text alternative that serves the equivalent purpose\"; WCAG 4.1.2 (A) — Name, Role, Value\n\nWhat to check:\n- Every button/link containing only an icon (SVG, icon component, single character like `x`) must have:\n - `aria-label` describing the action, OR\n - `<span className=\"sr-only\">` with descriptive text\n- Icons inside buttons with visible text should have `aria-hidden=\"true\"`\n\n### 7. ARIA Widget Patterns\n\nStandard: WAI-ARIA APG\n\nTabs (APG Tabs Pattern):\n- Container: `role=\"tablist\"`\n- Each tab: `role=\"tab\"`, `aria-selected=\"true\"/\"false\"`, `aria-controls` referencing its panel\n- Each panel: `role=\"tabpanel\"`, `aria-labelledby` referencing its tab\n- Keyboard: Left/Right arrows move between tabs; Tab moves to panel content\n\nMenu buttons (APG Menu Button Pattern):\n- Trigger: `aria-haspopup=\"menu\"` or `\"true\"`, `aria-expanded=\"true\"/\"false\"`\n- Menu container: `role=\"menu\"`\n- Items: `role=\"menuitem\"`\n- Keyboard: Enter/Space opens; Arrow keys navigate items; Escape closes\n\nAlerts (APG Alert Pattern + WCAG 4.1.3 AA):\n- Error messages / status banners: `role=\"alert\"`\n- Non-urgent status messages: `aria-live=\"polite\"`\n- Alerts must not auto-dismiss (WCAG 2.2.3)\n\n### 8. Keyboard Accessibility\n\nStandard: WCAG 2.1.1 (A) — \"All functionality of the content is operable through a keyboard interface\"\n\nWhat to check:\n- Clickable non-interactive elements (`<div onClick>`, `<li onClick>`, `<span onClick>`) must have:\n - `role=\"button\"` (or appropriate role)\n - `tabIndex={0}`\n - `onKeyDown` handler (Enter and Space should activate)\n- Context menus, dropdowns, popovers: closeable with Escape\n- Hover-only interactions (`opacity-0 group-hover:opacity-100` on buttons): invisible to keyboard — must have a keyboard-accessible alternative\n\n### 9. 
Loading States\n\nStandard: WCAG 4.1.3 (AA) — \"Status messages can be programmatically determined through role or properties such that they can be presented to the user by assistive technologies without receiving focus\"; React Suspense docs\n\nWhat to check:\n- Components that `return null` during loading = blank screen with no feedback — always show a loading indicator\n- Dynamic content regions that update should use `aria-live` or `role=\"status\"` to announce changes\n- React Suspense docs: \"Don't put a Suspense boundary around every component. Suspense boundaries should not be more granular than the loading sequence that you want the user to experience.\"\n- React Suspense docs: \"Replacing visible UI with a fallback creates a jarring user experience\" — use `startTransition` for updates to already-visible content\n\n### 10. Text Size\n\nImportant: WCAG has NO minimum font size requirement. WCAG 1.4.4 (AA) requires text to be resizable to 200% without loss of content — not a minimum size.\n\nBest practice (Apple HIG, Material Design, general UX):\n- Body text: 16px recommended\n- Secondary/caption text: 12px practical minimum\n- Text below 12px (`text-[10px]`, `text-[9px]`) is a readability concern, especially on mobile\n\nSeverity: Warning (best practice), never Critical. Always note this is NOT a WCAG requirement.\n\n## Part 2 — UI Structure\n\n### 11. Component Extraction (DRY)\n\nSources: Tailwind docs — \"Reusing Styles\"; Kent C. Dodds — AHA Programming\n\nTailwind official docs: \"If you need to reuse some styles across multiple files, the best strategy is to create a component.\"\n\nKent C. 
Dodds (AHA): \"After you've got a few places where that code is running, the commonalities will scream at you for abstraction.\"\n\nWhat to check:\n- Same Tailwind class combination (5+ utility classes forming one visual pattern) appearing 3+ times across different files — extract to a shared component\n- Common extraction candidates: Button variants, Card, Input, Badge, Modal close button\n- Utility style patterns (e.g., focus rings) repeated 10+ times — bake into base components\n\nThreshold: 3+ identical patterns across 2+ files = extract. Duplication within a single file is fine — Tailwind docs say to use multi-cursor editing for same-file duplication.\n\nDo NOT flag:\n- Single-use class combinations, even if long (this is Tailwind by-design)\n- Structural Tailwind classes that naturally repeat (`flex items-center gap-2`)\n\n### 12. Component Size & Responsibility\n\nSources: React docs — \"Thinking in React\"; Robert C. Martin — Single Responsibility Principle\n\nReact docs: a component's purpose should be describable in one sentence without \"and.\"\n\nWhat to check:\n- Components exceeding ~200 lines — likely multiple responsibilities\n- JSX return exceeding ~50 lines — consider splitting into subcomponents\n- Business logic (API calls, optimistic updates, complex state transforms) inline in render components — extract to custom hooks\n- Inline event handlers exceeding ~10 lines — extract to named functions or hooks\n- Multiple unrelated `useState`/`useEffect` clusters in one component\n\n### 13. Layout Consistency\n\nWhat to check:\n- Individual screens overriding the app-level layout constraint (e.g., screen sets `max-w-lg` when layout uses `max-w-2xl`)\n- Hardcoded heights with `calc()` and magic numbers (`calc(100vh - 140px)`) — use flex/grid layout instead; these break when surrounding layout changes\n- Inconsistent page-level spacing (one screen `p-4`, another `p-6` for the same structural role)\n\n### 14. 
Design Token Usage\n\nSource: Tailwind docs — Theme configuration\n\nWhat to check:\n- Hardcoded hex colors (`#1a1a2e`, `rgb(...)`, inline `style={{ color: '...' }}`) bypassing the project's CSS custom properties / Tailwind theme\n- Hardcoded pixel values for spacing/sizing that should use Tailwind's scale\n- Magic numbers for timeouts, thresholds, row heights, page sizes — should be named constants (Clean Code Ch. 17: numbers other than 0 and 1 should be named)\n\n### 15. Loading & Error Patterns\n\nSources: React Suspense docs; React docs — Error Boundaries\n\nWhat to check:\n- `.catch(() => {})` on user-initiated actions (buy, claim, save) — user sees nothing on failure. Note: acceptable for best-effort background operations (auto-sync, prefetch)\n- Missing error boundaries around independently-failing sections\n- Inconsistent loading patterns across screens (some `useQuery`, some manual `useState`, some `return null`)\n\n### 16. State & Hook Patterns\n\nSource: React docs — \"Reusing Logic with Custom Hooks\"\n\nWhat to check:\n- Custom hooks wrapping a single `useState` with no other hooks — React docs: \"extracting a useFormInput Hook to wrap a single useState call is probably unnecessary\"\n- Functions prefixed with `use` that don't call any React hooks — React docs: \"If your function doesn't call any Hooks, avoid the use prefix\"\n- Components with 4+ `useState` calls that could be consolidated into a custom hook or `useReducer`\n- React docs: \"Custom Hooks let you share stateful logic but not state itself. Each call to a Hook is completely independent.\"\n\n## Output Format\n\nGroup findings by severity. 
Each finding MUST name the specific standard.\n\n```\n## Critical\nViolations that directly harm users — screen reader users can't navigate, keyboard users are trapped, touch users can't tap targets.\n\n### [STANDARD] Brief title\nFile: `path/to/file.tsx` (lines X-Y)\nStandard: Full standard ID and one-line requirement.\nViolation: What the code does wrong and who is affected.\nFix: Specific, actionable code change.\n\n## Warning\nViolations that degrade usability but have workarounds, or best-practice violations with real UX impact.\n\n(same structure)\n\n## Suggestion\nImprovements that increase robustness or consistency but aren't urgently broken.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Standards most frequently violated: list top 2-3\n- Overall assessment: 1-2 sentence verdict\n```\n\n## False Positive Filtering\n\n### Hard Exclusions — do NOT report:\n\n1. Inline links within body text — exempt from touch target size per WCAG 2.5.5\n2. Disabled/inactive elements — exempt from contrast requirements per WCAG 1.4.3\n3. Purely decorative elements — exempt from text alternative requirements per WCAG 1.1.1\n4. Third-party component internals — don't audit inside node_modules\n5. Test files — skip `.test.tsx`, `.spec.tsx`\n6. Theme/token definitions — CSS variable definitions in theme config ARE the design system\n\n### Severity Calibration:\n\n- Critical: Users physically cannot complete an action (can't tap, can't navigate, can't perceive content). Screen reader users locked out.\n- Warning: Users CAN complete the action but with significant difficulty. UX best-practice violations with real impact.\n- Suggestion: Improvements that help but aren't urgently broken. 
Minor inconsistencies.\n\n## Rules\n\n- Cite the standard: every finding must reference the specific WCAG SC, ARIA APG pattern, or platform guideline.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"add aria-label\" but `aria-label=\"Close dialog\"` on line 42.\n- Measure real impact: severity by who is affected and how badly.\n- Don't over-report text size: WCAG has no minimum font size. Sub-12px = Warning (best practice), never Critical.\n- Don't over-report DRY: same-file duplication is fine per Tailwind guidance. Only flag cross-file duplication of 3+ occurrences.\n- Respect scope: in diff mode, only flag issues in changed lines and their immediate context.\n- Don't duplicate other skills: a11y and UI structure only. Logic bugs go to `correctness-audit`, security to `security-audit`, general code quality to `best-practices-audit`.\n","reference":"# UI Audit Reference\n\nDetailed definitions, exact requirements, and source citations for each check in the audit.\n\n## Table of Contents\n\n### Part 1 — Accessibility\n1. [Touch Target Size](#1-touch-target-size)\n2. [Modal / Dialog](#2-modal--dialog-accessibility)\n3. [Focus Visibility](#3-focus-visibility)\n4. [Color Contrast](#4-color-contrast)\n5. [Form Labels](#5-form-label-association)\n6. [Icon-Only Buttons](#6-icon-only-buttons)\n7. [ARIA Widget Patterns](#7-aria-widget-patterns)\n8. [Keyboard Accessibility](#8-keyboard-accessibility)\n9. [Loading States](#9-loading-states)\n10. [Text Size](#10-text-size)\n\n### Part 2 — UI Structure\n11. [Component Extraction](#11-component-extraction)\n12. [Component Size](#12-component-size--responsibility)\n13. [Layout Consistency](#13-layout-consistency)\n14. [Design Tokens](#14-design-token-usage)\n15. [Loading & Error Patterns](#15-loading--error-patterns)\n16. [State & Hook Patterns](#16-state--hook-patterns)\n\n---\n\n## 1. 
Touch Target Size\n\n### Sources\n- WCAG 2.5.8 (AA): https://www.w3.org/WAI/WCAG22/Understanding/target-size-minimum.html\n - \"The size of the target for pointer inputs is at least 24 by 24 CSS pixels\"\n- WCAG 2.5.5 (AAA): https://www.w3.org/WAI/WCAG22/Understanding/target-size-enhanced.html\n - \"The size of the target for pointer inputs is at least 44 by 44 CSS pixels\"\n- Apple HIG: https://developer.apple.com/design/human-interface-guidelines/accessibility\n - Controls must measure at least 44x44 points\n- Material Design: https://m2.material.io/develop/web/supporting/touch-target\n - Touch targets should be at least 48x48 dp with 8dp spacing\n\n### Why 44px for this project\nWCAG AA requires only 24px, but this project targets mobile (future native app). Apple HIG (44pt) and Material Design (48dp) both enforce larger targets. We use 44px as the minimum — satisfying Apple HIG, WCAG AAA, and being close to Material Design's 48dp.\n\n### Tailwind mapping\n- `min-h-11` = 2.75rem = 44px (at 16px root)\n- `py-2.5` on `text-sm` produces ~40px — still under 44px; use `py-3` or add `min-h-11`\n- `h-6` (24px), `h-8` (32px), `h-10` (40px) are all below 44px\n- `h-11` (44px) is the target\n\n### Exceptions (WCAG 2.5.5)\n1. Inline: target is in a sentence or constrained by line-height of surrounding text\n2. Equivalent: function available through a different control meeting the size requirement\n3. User agent control: size determined by browser, not author\n4. Essential: specific presentation is essential to the information\n\n---\n\n## 2. 
Modal / Dialog Accessibility\n\n### Source\n- WAI-ARIA APG — Dialog (Modal) Pattern: https://www.w3.org/WAI/ARIA/apg/patterns/dialog-modal/\n\n### Required attributes\n\| Attribute \| Element \| Requirement \|\n\|-----------\|---------\|-------------\|\n\| `role=\"dialog\"` \| Container \| Identifies the element as a dialog \|\n\| `aria-modal=\"true\"` \| Container \| Tells assistive tech content behind is inert \|\n\| `aria-labelledby` \| Container \| Points to the dialog's visible title element \|\n\| `aria-label` \| Container \| Alternative when no visible title exists \|\n\| `aria-describedby` \| Container \| Optional — points to descriptive content \|\n\n### Focus management (from APG)\n1. On open: focus moves to an element inside the dialog\n - If content is primarily semantic (text): focus a static element at the top with `tabindex=\"-1\"`\n - If content has a primary action: focus that action button\n - If destructive: focus the least destructive option\n2. Focus trap: Tab cycles forward; Shift+Tab cycles backward; both wrap within dialog\n3. On close: focus returns to the triggering element (unless it no longer exists)\n\n### Keyboard\n- Escape: closes the dialog\n- Tab: moves to next focusable element within dialog (wraps)\n- Shift+Tab: moves to previous focusable element (wraps)\n\n### Common violations\n- `role=\"presentation\"` instead of `role=\"dialog\"` — screen readers don't recognize it as a dialog\n- No focus trap — Tab key escapes behind the overlay\n- No auto-focus on open — focus stays on the trigger behind the modal\n- No focus restoration on close — focus drops to `<body>`\n\n---\n\n## 3. 
Focus Visibility\n\n### Source\n- WCAG 2.4.7 (AA): https://www.w3.org/WAI/WCAG22/Understanding/focus-visible.html\n - \"Any keyboard operable user interface has a mode of operation where the keyboard focus indicator is visible.\"\n- WCAG 2.4.13 (AAA): https://www.w3.org/WAI/WCAG22/Understanding/focus-appearance.html\n - Focus indicator area: at least as large as a 2px thick perimeter of the unfocused component\n - Focus indicator contrast: at least 3:1 between focused and unfocused states\n\n### Practical implementation\nEvery interactive element needs a visible focus style. In Tailwind:\n```jsx\n// Good\n<button className=\"focus-visible:outline focus-visible:outline-2 focus-visible:outline-primary\">\n\n// Bad — no focus style at all\n<button className=\"bg-primary text-white\">\n```\n\nUse `focus-visible` (not `focus`) to avoid showing focus rings on mouse clicks while preserving them for keyboard navigation.\n\n---\n\n## 4. Color Contrast\n\n### Source\n- WCAG 1.4.3 (AA): https://www.w3.org/WAI/WCAG22/Understanding/contrast-minimum.html\n - Normal text: at least 4.5:1\n - Large text (>= 18pt / >= 14pt bold): at least 3:1\n - Large text ≈ 24px regular / 18.66px bold\n- WCAG 1.4.11 (AA): https://www.w3.org/WAI/WCAG22/Understanding/non-text-contrast.html\n - UI components and graphical objects: at least 3:1\n\n### Exemptions\n- Inactive (disabled) components\n- Purely decorative elements\n- Logotypes\n\n### Common Tailwind violations\n- `text-disabled` at `rgba(255,255,255,0.3)` on dark bg ≈ 2.5:1 (fails 4.5:1)\n- Stacked opacity: `bg-surface/50 opacity-60` compounds two reductions\n- `placeholder:text-muted` if muted color is too faint\n\n---\n\n## 5. 
Form Label Association\n\n### Sources\n- WCAG 1.3.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/info-and-relationships.html\n - \"Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.\"\n- WCAG 4.1.2 (A): https://www.w3.org/WAI/WCAG22/Understanding/name-role-value.html\n - \"For all user interface components, the name and role can be programmatically determined\"\n- React docs: https://legacy.reactjs.org/docs/accessibility.html\n - \"Every HTML form control, such as `<input>` and `<textarea>`, needs to be labeled accessibly.\"\n\n### Valid labeling techniques\n1. `<label htmlFor=\"name\">Name</label> <input id=\"name\" />`\n2. `<input aria-label=\"Search\" />`\n3. `<input aria-labelledby=\"heading-id\" />`\n\n### NOT valid\n- `<input placeholder=\"Name\" />` alone — placeholder disappears on input, not a reliable label\n- A `<span>` visually positioned near the input but not programmatically connected\n\n---\n\n## 6. Icon-Only Buttons\n\n### Source\n- WCAG 1.1.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/non-text-content.html\n - \"All non-text content that is presented to the user has a text alternative that serves the equivalent purpose\"\n - For controls: \"it has a name that describes its purpose\"\n\n### Implementation\n```jsx\n// Good — aria-label on button\n<button aria-label=\"Close dialog\"><XIcon aria-hidden=\"true\" /></button>\n\n// Good — sr-only text\n<button><XIcon aria-hidden=\"true\" /><span className=\"sr-only\">Close dialog</span></button>\n\n// Bad — no accessible name\n<button><XIcon /></button>\n\n// Bad — icon has name but button doesn't (redundant, confusing)\n<button><XIcon aria-label=\"close\" /></button>\n```\n\nThe accessible name belongs on the button, not on the icon inside it. Icons inside labeled buttons should be `aria-hidden=\"true\"`.\n\n---\n\n## 7. 
ARIA Widget Patterns\n\n### Tabs\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/tabs/\n\n\| Role/Attribute \| Element \| Required \|\n\|---------------\|---------\|----------\|\n\| `role=\"tablist\"` \| Container \| Yes \|\n\| `aria-label` or `aria-labelledby` \| Tablist \| Yes \|\n\| `role=\"tab\"` \| Each tab button \| Yes \|\n\| `aria-selected=\"true\"/\"false\"` \| Each tab \| Yes \|\n\| `aria-controls` \| Each tab \| Yes — points to its panel \|\n\| `role=\"tabpanel\"` \| Each panel \| Yes \|\n\| `aria-labelledby` \| Each panel \| Yes — points to its tab \|\n\| `tabindex=\"0\"` \| Active tab + panel \| Yes (if panel has no focusable content) \|\n\nKeyboard: Left/Right arrows move between tabs (wrap); Tab moves to panel content; Home/End to first/last (optional).\n\n### Menu Button\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/menu-button/\n\n\| Role/Attribute \| Element \| Required \|\n\|---------------\|---------\|----------\|\n\| `aria-haspopup=\"menu\"` \| Button \| Yes \|\n\| `aria-expanded=\"true\"/\"false\"` \| Button \| Yes \|\n\| `role=\"menu\"` \| Menu container \| Yes \|\n\| `role=\"menuitem\"` \| Each item \| Yes \|\n\nKeyboard: Enter/Space opens and focuses first item; Down Arrow opens and focuses first item; Up Arrow opens and focuses last item; Escape closes.\n\n### Alert\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/alert/\n\n- Use `role=\"alert\"` for error messages and urgent notifications\n- Alerts do not move keyboard focus\n- Avoid auto-dismissing alerts (WCAG 2.2.3)\n- For non-urgent status: use `aria-live=\"polite\"` instead\n\n### Status Messages\nSource: WCAG 4.1.3 (AA) — https://www.w3.org/WAI/WCAG22/Understanding/status-messages.html\n- \"Status messages can be programmatically determined through role or properties such that they can be presented to the user by assistive technologies without receiving focus.\"\n- Use `role=\"status\"` (implicitly `aria-live=\"polite\"`) for non-urgent updates\n\n---\n\n## 8. 
Keyboard Accessibility\n\n### Source\n- WCAG 2.1.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/keyboard.html\n - \"All functionality of the content is operable through a keyboard interface without requiring specific timings for individual keystrokes\"\n\n### Custom interactive elements\nWhen using non-semantic elements as interactive controls:\n```jsx\n// Bad — keyboard users can't interact\n<div onClick={handleClick}>Click me</div>\n\n// Good — full keyboard support\n<div\n role=\"button\"\n tabIndex={0}\n onClick={handleClick}\n onKeyDown={(e) => { if (e.key === 'Enter' \|\| e.key === ' ') { e.preventDefault(); handleClick(); } }}\n>\n Click me\n</div>\n\n// Best — use a real button\n<button onClick={handleClick}>Click me</button>\n```\n\n### Hover-only patterns\n```jsx\n// Bad — invisible to keyboard users\n<div className=\"opacity-0 group-hover:opacity-100\">\n <button>Options</button>\n</div>\n\n// Good — visible on focus too\n<div className=\"opacity-0 group-hover:opacity-100 group-focus-within:opacity-100\">\n <button>Options</button>\n</div>\n```\n\n---\n\n## 9. Loading States\n\n### Sources\n- WCAG 4.1.3 (AA): https://www.w3.org/WAI/WCAG22/Understanding/status-messages.html\n- React Suspense docs: https://react.dev/reference/react/Suspense\n - \"Don't put a Suspense boundary around every component. Suspense boundaries should not be more granular than the loading sequence that you want the user to experience.\"\n - \"Replacing visible UI with a fallback creates a jarring user experience.\"\n\n### Rules\n1. Never `return null` during loading — show a spinner, skeleton, or placeholder\n2. Use `aria-busy=\"true\"` on containers that are loading content\n3. Use `aria-live=\"polite\"` on regions that update dynamically\n4. Use `startTransition` when updating already-visible content to avoid replacing it with a loading fallback\n\n---\n\n## 10. 
Text Size\n\n### Source\n- WCAG 1.4.4 (AA): https://www.w3.org/WAI/WCAG22/Understanding/resize-text.html\n - \"Text can be resized without assistive technology up to 200 percent without loss of content or functionality.\"\n - No minimum font size is specified by WCAG.\n\n### Best practice (NOT WCAG)\n- Body: 16px recommended baseline\n- Secondary: 12px practical minimum\n- Below 12px: readability concern, especially mobile\n- Tailwind `text-xs` = 12px = acceptable\n- `text-[10px]`, `text-[9px]` = flag as Warning\n\n---\n\n## 11. Component Extraction\n\n### Sources\n- Tailwind docs — Reusing Styles: https://tailwindcss.com/docs/reusing-styles\n - \"If you need to reuse some styles across multiple files, the best strategy is to create a component if you're using a front-end framework like React.\"\n - On same-file duplication: \"the easiest way to deal with it is to use multi-cursor editing\"\n- Kent C. Dodds — AHA Programming: https://kentcdodds.com/blog/aha-programming\n - \"After you've got a few places where that code is running, the commonalities will scream at you for abstraction\"\n\n### Extraction threshold\n- 3+ identical patterns across 2+ files = extract to a shared component\n- Same file: fine — use multi-cursor (Tailwind guidance)\n- 1-2 occurrences: too early to extract (AHA principle)\n\n### Do NOT flag\n- Long class strings appearing once (Tailwind by-design)\n- Structural classes that naturally repeat (`flex items-center gap-2`)\n\n---\n\n## 12. Component Size & Responsibility\n\n### Sources\n- React docs — Thinking in React: https://react.dev/learn/thinking-in-react\n- Robert C. Martin — Single Responsibility Principle\n\n### Guidelines\n- ~200 lines total = consider splitting\n- ~50 lines of JSX return = consider extracting subcomponents\n- A component's purpose should be describable in one sentence without \"and\"\n- Business logic (API calls, optimistic updates) belongs in custom hooks, not inline in render\n\n---\n\n## 13. 
Layout Consistency\n\n### Sources\n- Radix Themes — Layout: https://www.radix-ui.com/themes/docs/overview/layout\n - \"Container's sole responsibility is to provide a consistent max-width to the content it wraps\"\n- CSS-Tricks — Magic Numbers in CSS: https://css-tricks.com/magic-numbers-in-css/\n - \"Magic numbers in CSS refer to values which 'work' under some circumstances but are fragile and prone to break when those circumstances change\"\n\n### Rules\n- Max-width should be set once in a layout wrapper, not repeated per-screen\n- `calc(100vh - 140px)` is a magic number — breaks when header/footer changes. Use flex layout instead.\n- Page-level padding should come from the layout component, not individual pages\n\n---\n\n## 14. Design Token Usage\n\n### Sources\n- Tailwind docs — Theme: https://tailwindcss.com/docs/theme\n- Robert C. Martin — Clean Code, Chapter 17: numbers other than 0 and 1 should be named constants\n\n### Rules\n- Colors must come from theme tokens, never hardcoded hex/rgb\n- If an arbitrary Tailwind value (`text-[14px]`) appears in 2+ files, extract to a token\n- Timeouts, thresholds, sizes used in logic should be named constants\n\n---\n\n## 15. Loading & Error Patterns\n\n### Sources\n- React Suspense docs: https://react.dev/reference/react/Suspense\n- React Error Boundaries: https://react.dev/reference/react/Component#catching-rendering-errors-with-an-error-boundary\n\n### Rules\n- `.catch(() => {})` on user-initiated actions = swallowed error. User needs feedback.\n- `.catch(() => {})` on background/best-effort operations = acceptable (auto-sync, prefetch)\n- Every data-fetching section should have error boundary coverage\n- Consistent loading patterns across screens\n\n---\n\n## 16. 
State & Hook Patterns\n\n### Source\n- React docs — Reusing Logic with Custom Hooks: https://react.dev/learn/reusing-logic-with-custom-hooks\n\n### Verified quotes from React docs\n- \"Extracting a `useFormInput` Hook to wrap a single `useState` call like earlier is probably unnecessary.\"\n- \"However, whenever you write an Effect, consider whether it would be clearer to also wrap it in a custom Hook.\"\n- \"If your function doesn't call any Hooks, avoid the `use` prefix.\"\n- \"Custom Hooks let you share stateful logic but not state itself. Each call to a Hook is completely independent from every other call to the same Hook.\"\n- \"Keep your custom Hooks focused on concrete high-level use cases.\"\n\n### Anti-patterns\n- `useMount()`, `useEffectOnce()`, `useUpdateEffect()` — lifecycle wrappers that add indirection\n- `useValue()` wrapping a single `useState` — no benefit over direct `useState`\n- `useSorted(items)` when the function doesn't call any hooks — just make it `getSorted(items)`\n"}},"subagents":{"deep-research":{"content":"---\nname: deep-research\nmodel: default\ndescription: Deep research and literature review. Use when the user asks for deep research, literature review, or to thoroughly investigate a topic. Searches the web, consults reputable sources, and synthesizes an answer with pros/cons and comparisons when relevant.\nreadonly: true\n---\n\n# Deep Research\n\nYour job is to thoroughly research a topic using web search and reputable sources, then synthesize the best answer. When multiple approaches or answers exist, compare them with pros and cons.\n\n## When you're used\n\n- User asks for \"deep research,\" \"literature review,\" or \"thoroughly investigate\" a topic.\n- User wants an evidence-based answer with sources.\n- User asks for pros/cons or a comparison of options.\n\n## Exa MCP (use when available)\n\nThe Exa MCP provides semantic search over the live web and code. 
Use Exa for real-time web research, code examples, and company/org research when the tools are available. Prefer Exa over generic web search when you need high-quality, relevant results or code/docs from the open-source ecosystem.\n\n\| Tool \| When to use \|\n\|------\|--------------\|\n\| Web Search (Exa) \| General web research: current practices, comparisons, how-to, opinions, blog posts, official docs. Use for \"how does X work?\", \"best practices for Y\", \"X vs Y\", or time-sensitive topics. Query in natural language; Exa returns semantically relevant pages with snippets. \|\n\| Code Context Search \| Code snippets, examples, and documentation from open source repos. Use when the user needs \"how to do X in language/framework Y\", code examples, or implementation patterns. Complements official docs with real-world usage. \|\n\| Company Research \| Research companies and organizations: what they do, products, recent news, structure. Use for \"tell me about company X\", due diligence, or market/competitor context. \|\n\nHow to use Exa effectively:\n- Queries: Use clear, specific queries (e.g. \"React Server Components best practices 2024\" rather than \"React\"). Include stack, year, or context when it matters.\n- Combine with other sources: Use Exa for discovery and breadth; use AlphaXiv for academic papers when the topic is literature/research. Fetch full pages (e.g. with browser or fetch) when you need to cite or quote a specific passage.\n- Cite: Exa returns URLs and snippets — cite the URL and page title in your Sources; don't present Exa's summary as the primary source when you can point to the actual page.\n\nIf Exa tools are not available, fall back to web search and fetch as needed.\n\n## AlphaXiv tools (use when available)\n\nAlphaXiv tools query arXiv and related academic content. Use them for literature review, finding papers, or surveying recent research. 
If these tools are available, prefer them for academic topics; otherwise use Exa or web search.\n\n\| Tool \| When to use \|\n\|------\|--------------\|\n\| answer_research_query \| Survey recent papers on a question (e.g. \"What do recent papers do for X?\", \"How do papers handle Y?\"). Use for state-of-the-art, common methods, or trends. \|\n\| search_for_paper_by_title \| Find a specific paper by exact or approximate title when you know the name or a close match. \|\n\| find_papers_feed \| Get arXiv papers by topic, sort (Hot, Comments, Views, Likes, GitHub, Recommended), and time interval. Use for \"what's trending in X\" or \"recent papers in topic Y.\" Topics include cs., math., physics., stat., q-bio., etc. \|\n\| answer_pdf_query \| Answer a question about a single paper given its PDF URL (arxiv.org/pdf/..., alphaxiv.org, or Semantic Scholar). Use after you have a paper URL and need to extract a specific claim or method. \|\n\| read_files_from_github_repository \| Read files or directory from a paper's linked GitHub repo (when the paper has a codebase). Use to summarize implementation or repo structure. \|\n\| find_organizations \| Look up canonical organization names for filtering find_papers_feed by institution. \|\n\nAlphaXiv covers all of arXiv (physics, math, CS, stats, etc.), not only AI. Use find_papers_feed with the right topic (e.g. cs.LG, math.AP, quant-ph) for the domain.\n\n## Process\n\n1. Clarify the question — If the request is vague, state what you're treating as the research question in one sentence.\n2. 
Search — Use the right source for the topic:\n - Academic / literature: AlphaXiv (answer_research_query, find_papers_feed, answer_pdf_query) when available.\n - Web / practice / code / companies: Exa MCP (Web Search, Code Context Search, Company Research) when available; otherwise web search and fetch full pages when needed.\n Prefer official docs, established institutions, recent content for time-sensitive topics, and multiple viewpoints when the topic is debated.\n3. Synthesize — Answer the question clearly. If there are several valid answers or approaches:\n - Compare them (e.g. \"Option A vs Option B\").\n - List pros and cons for each where relevant.\n - State which is best for which situation, or that it depends on context.\n4. Cite — For key claims, note the source (title, site, or URL). No need to cite every sentence; enough that the user can verify and go deeper.\n\n## Output format\n\n```\n## Research question\n[One sentence]\n\n## Summary\n[2–4 sentences: direct answer and main takeaway]\n\n## Details / Comparison\n[Structured by theme or by option. Use subsections if helpful. Include pros/cons and comparisons when several answers exist.]\n\n## Sources\n- [Source 1]: [URL or citation]\n- [Source 2]: …\n```\n\n- Prefer clear structure over long paragraphs.\n- If the topic is narrow and there's one clear answer, keep it concise; if it's broad or contested, add more comparison and nuance.\n- If you couldn't find good sources on part of the question, say so and what would help (e.g. different search terms, type of source).\n\n## Rules\n\n- Use Exa MCP for web/code/company research when available; use AlphaXiv for academic/literature when available. Fall back to web search if neither is available.\n- Use search and the web; don't rely only on prior knowledge. Prefer recent, reputable sources.\n- Don't invent sources or URLs. If you can't access a page, say so.\n- Do not take everything you read as fact. 
The internet is full of misinformation.\n- Stay on topic. If the user scopes the question (e.g. \"for Python\" or \"in healthcare\"), keep the answer within that scope.\n- You are read-only: research and report only. No code or file changes.\n"},"update-docs":{"content":"---\nname: update-docs\nmodel: default\ndescription: Updates project documentation to match the code. Main focus is docs (architecture, how the project is built, setup, deploy, contributing, README). Use when the user asks to update docs or after code changes; update README, docs folder, docstrings, and comments so they reflect current behavior.\n---\n\n# Update Docs\n\nYou keep project documentation in sync with the code. Your main focus is documentation as a whole: how the project is built, how to run it, and how it fits together. Update only what's wrong or missing; don't rewrite docs that are already accurate. Document what actually exists—no invented APIs or behavior.\n\n## Scope\n\n- User specifies what to update: e.g. \"update the docs,\" \"update the README,\" \"add docstrings,\" \"refresh the architecture doc.\" Do that.\n- Post-implementation: When invoked after code changes, identify what changed and update the relevant docs: any docs in the repo (e.g. `docs/`, `doc/`, architecture or design docs), README, docstrings, comments in changed files, or generated API docs if the project has them.\n- No scope given: Ask what to document (which files or doc types) or infer from recent changes and update the minimum needed.\n\nMatch the project's existing style: docstring format (Google, NumPy, Sphinx, etc.), README and docs structure, and tone.\n\n## Documentation standards (reference)\n\nWhen the project has no strong convention, align with widely used standards so docs are consistent and useful.\n\n- Diátaxis (https://diataxis.fr/): Organize content by user need. 
Use tutorials for learning a task step-by-step, how-to guides for solving a specific problem, reference for technical lookup (APIs, options), and explanation for background and concepts. When adding or restructuring docs, prefer the right type (e.g. don't turn a reference into a long tutorial).\n- Google developer documentation style guide (https://developers.google.com/style): For tone and formatting — write in second person (\"you\"), active voice; use sentence case for headings; put conditions before instructions; bold UI elements, code in code font; keep examples and link text descriptive. Clarity for the audience over rigid rules.\n\nApply these as guidance; always preserve or match the project's existing style when it has one.\n\n## Process\n\n1. Identify what to update — From the request or from the diff: what changed (modules, architecture, setup, behavior)? Which doc targets are affected (docs folder, README, docstrings, comments)?\n2. Read current docs — Check existing project docs (e.g. `docs/`), README, docstrings, comments in changed files, and any API docs. Note what's outdated, missing, or wrong.\n3. Update — Fix inaccuracies, add missing sections or docstrings, remove references to removed code. Keep changes minimal.\n4. Verify — Ensure examples in docs still run or match the code (e.g. function names, commands, args). Don't leave broken code blocks or outdated commands.\n\n## What to document\n\n- Project documentation (primary): Any docs that describe how the project is built and used — e.g. `docs/`, `doc/`, or standalone files. This includes:\n - Architecture / design: How the system is structured, main components, data flow. Update when structure or responsibilities change.\n - Setup and build: How to install, configure, build, and run (dev and prod). Update when dependencies, env vars, or commands change.\n - Deploy and ops: How to deploy, runbooks, environment-specific notes. 
Update when pipelines or procedures change.\n - Contributing: How to contribute, branch strategy, code style, where to put things. Update when workflow or conventions change.\n- README: Entry point for the repo — install/run, config, env vars, project structure, links to fuller docs. Update when setup or usage changes.\n- Docstrings: Public modules, classes, and functions. Parameters, return value, raised exceptions, and a one-line summary. Use the project's docstring convention.\n- Comments: Inline and block comments in the code. In changed files, check comments for accuracy—update or remove comments that describe old behavior, wrong assumptions, or obsolete TODOs. Don't leave comments that contradict the code.\n- API docs: If the project generates them (Sphinx, Typedoc, etc.), update source comments/docstrings so the generated output is correct; only regenerate if that's part of the workflow.\n\nSkip internal/private implementation details unless the project explicitly documents them. Prefer \"what and how to use\" over \"how it's implemented.\"\n\n## Output\n\n- Updated: List files and sections changed (e.g. \"docs/architecture.md: Components\" / \"README: Installation, Usage\" / \"module.py: function X docstring\").\n- Added: New sections or docstrings added, with file and name.\n- Removed: Obsolete sections or references removed.\n- If nothing needed updating, say so in one sentence.\n\nKeep the summary to bullets. No long prose.\n\n## Rules\n\n- Document only what the code does. 
Don't add features or behavior in the docs that aren't in the code.\n- Preserve existing formatting and style (headers, lists, code blocks, docstring style).\n- If the code is unclear and you can't document it confidently, note that and suggest a code comment or refactor instead of guessing.\n- Don't duplicate large chunks of code in docs or README; reference the source or keep examples short and runnable.\n"},"verifier":{"content":"---\nname: verifier\nmodel: default\ndescription: Validates that completed work matches what was claimed. Use after the main agent marks tasks done—checks that implementations exist and work, and that no unstated changes were made.\nreadonly: true\n---\n\n# Verifier\n\nYou are a skeptical validator. Your job is to confirm that work claimed complete actually exists and works, and that nothing extra was done without being stated.\n\n## What to verify\n\n1. Claims vs. reality\n - Identify what the main agent said it did (from the conversation or task list).\n - For each claim: confirm the implementation exists, is in the right place, and does what was described.\n - Run relevant tests or commands. Don't accept \"tests pass\" without running them.\n - Flag anything that was claimed but is missing, incomplete, or broken.\n\n2. No unstated changes\n - Compare the current state of the codebase to what was in scope for the task (e.g. the files or areas the user asked to change).\n - Look for edits the main agent made but did not mention: new files, modified files, refactors, \"cleanups,\" or behavior changes that weren't part of the request.\n - If you have access to git: use the diff (staged or unstaged) to see what actually changed versus what was discussed.\n - Report any changes that go beyond what was claimed or requested.\n\n## Process\n\n1. From context, extract: (a) what was requested, (b) what the main agent said it did.\n2. Verify each stated deliverable (code exists, tests run, behavior matches).\n3. 
Check the diff or modified files for changes that weren't mentioned.\n4. Summarize: passed, incomplete, or out-of-scope changes.\n\n## Output\n\n- Verified: What was claimed and confirmed (with brief evidence, e.g. \"tests pass\", \"file X contains Y\").\n- Missing or broken: What was claimed but isn't there or doesn't work (file, line, and what's wrong).\n- Unstated changes: What was changed but not mentioned (file and a one-line description). Ask whether the user wanted these or if they should be reverted.\n\nKeep each section to bullets. If everything checks out and there are no unstated changes, say so clearly in one or two sentences.\n\n## Rules\n\n- Don't take claims at face value. Inspect the code and run checks.\n- Prefer evidence (test output, diff, file contents) over summary.\n- For \"unstated changes,\" distinguish clearly between obvious scope creep (e.g. refactoring unrelated code) and trivial side effects (e.g. formatting in an edited file). Flag the former; mention the latter only if relevant.\n- If the task was vague, note what you assumed was in scope so the user can correct.\n"}},"catalog":"# Catalog\n\n## Skills\n\n- best-practices-audit (id: `best-practices-audit`) — Audits code against named industry standards and coding best practices (DRY, SOLID, KISS, YAGNI, Clean Code, OWASP, etc.). Use when the user asks to check best practices, enforce standards, audit for anti-patterns, review code quality against principles, or ensure code follows industry conventions. Works on git diffs, specific files, or an entire codebase.\n- correctness-audit (id: `correctness-audit`) — Reviews code for correctness bugs, uncaught edge cases, and scalability problems. Use when reviewing code changes, performing code audits, or when the user asks for a review or quality check. 
For security vulnerabilities use security-audit; for design, maintainability, and principle violations use best-practices-audit.\n- feature-planning (id: `feature-planning`) — Extensively plans a proposed feature before any code is written. Use when the user asks to plan, design, or spec out a feature, or when they say \"plan this feature\", \"design this\", or want to think through a feature before building it.\n- security-audit (id: `security-audit`) — Performs a thorough security audit against established industry standards (OWASP Top 10 2021, OWASP API Security Top 10 2023, CWE taxonomy, GDPR, PCI-DSS). Use when reviewing for security vulnerabilities, hardening production systems, auditing auth/payment/database code, or conducting periodic security reviews. Works on git diffs, specific files, or an entire codebase.\n- systematic-debugging (id: `systematic-debugging`) — Guides root-cause analysis with a structured process: reproduce, isolate, hypothesize, verify. Use when debugging bugs, investigating failures, or when the user says something is broken or not working as expected.\n- test-deno (id: `test-deno`) — Use when writing, reviewing, or fixing Deno integration tests for Supabase Edge Functions, or when auditing edge function tests for best practices. Triggers on test failures involving sanitizers, assertions, mocking, HTTP testing, or environment isolation.\n- test-frontend (id: `test-frontend`) — Use when writing, reviewing, or fixing React component/hook tests, or when auditing frontend tests for RTL, Vitest, Zustand, or TanStack Query best practices. Triggers on query priority issues, mock leaks, flaky async tests, or Kent C. Dodds common-mistakes violations.\n- test-pgtap (id: `test-pgtap`) — Use when writing, reviewing, or fixing pgTAP tests for Supabase SQL migrations, or when auditing database tests for best practices. 
Triggers on plan count mismatches, transaction isolation issues, RLS policy testing, privilege verification, or assertion selection problems.\n- ui-audit (id: `ui-audit`) — Use after any UI edit, when reviewing UI components, or when asked for an accessibility or structure audit. Triggers on WCAG 2.2 violations, WAI-ARIA APG pattern issues, touch target sizing, focus management, component duplication, or separation of concerns problems in React/Tailwind code.\n\n## Subagents\n\n- deep-research (id: `deep-research`) — Deep research and literature review. Use when the user asks for deep research, literature review, or to thoroughly investigate a topic. Searches the web, consults reputable sources, and synthesizes an answer with pros/cons and comparisons when relevant.\n- update-docs (id: `update-docs`) — Updates project documentation to match the code. Main focus is docs (architecture, how the project is built, setup, deploy, contributing, README). Use when the user asks to update docs or after code changes; update README, docs folder, docstrings, and comments so they reflect current behavior.\n- verifier (id: `verifier`) — Validates that completed work matches what was claimed. 
Use after the main agent marks tasks done—checks that implementations exist and work, and that no unstated changes were made.","whenToUse":"# When to Use Which Skill or Subagent\r\n\r\nUse this guide to choose the right skill or subagent for the user's request.\r\n\r\n## By intent\r\n\r\n\| User intent \| Use \|\r\n\|-------------\|-----\|\r\n\| Something is broken, bug, not working, investigate failure \| systematic-debugging (skill) \|\r\n\| Plan or design a feature before coding \| feature-planning (skill) \|\r\n\| Security review, vulnerabilities, auth/payments/database \| security-audit (skill) \|\r\n\| Code quality, best practices, DRY/SOLID/anti-patterns \| best-practices-audit (skill) \|\r\n\| Correctness review, edge cases, logic bugs \| correctness-audit (skill) \|\r\n\| Write/review/fix Deno tests for Supabase Edge Functions \| test-deno (skill) \|\r\n\| Write/review/fix React tests (Vitest, RTL, Zustand, TanStack Query) \| test-frontend (skill) \|\r\n\| Write/review/fix pgTAP database tests for SQL migrations \| test-pgtap (skill) \|\r\n\| Accessibility audit, UI structure review, WCAG compliance \| ui-audit (skill) \|\r\n\| Deep research, literature review, investigate a topic \| deep-research (subagent) \|\r\n\| Update docs to match code, README, architecture \| update-docs (subagent) \|\r\n\| Verify completed work matches what was claimed \| verifier (subagent) \|\r\n\r\n## Skills vs subagents\r\n\r\n- Skills = step-by-step instructions the main agent follows (e.g. run a process, produce a report). Use `get_skill` or `apply_skill` (with the user's prompt as message_to_skill) and follow the skill in the current context.\r\n- Subagents = separate agents run in another context; they return one result. Use when the task is noisy, context-heavy, or matches a subagent’s description (e.g. 
“use deep-research”, “run the verifier”).\r\n","overview":"# Overview\n\ngeneral-coding-tools-mcp — MCP server exposing General Coding Tools (skills and subagents) for use in Cursor, Claude, and Smithery\n\nVersion: `1.0.9`\n\nUse the catalog resource for a list of skills and subagents. Use when-to-use to choose the right one for the request."}}
1	+ {"skills":[{"id":"best-practices-audit","name":"best-practices-audit","description":"Audits code against named industry standards and coding best practices (DRY, SOLID, KISS, YAGNI, Clean Code, OWASP, etc.). Use when the user asks to check best practices, enforce standards, audit for anti-patterns, review code quality against principles, or ensure code follows industry conventions. Works on git diffs, specific files, or an entire codebase.","hasReference":true},{"id":"correctness-audit","name":"correctness-audit","description":"Reviews code for correctness bugs, uncaught edge cases, and scalability problems. Use when reviewing code changes, performing code audits, or when the user asks for a review or quality check. For security vulnerabilities use security-audit; for design, maintainability, and principle violations use best-practices-audit.","hasReference":true},{"id":"feature-planning","name":"feature-planning","description":"Extensively plans a proposed feature before any code is written. Use when the user asks to plan, design, or spec out a feature, or when they say \"plan this feature\", \"design this\", or want to think through a feature before building it.","hasReference":true},{"id":"migration-audit","name":"migration-audit","description":"Audit PL/pgSQL migration files for correctness bugs, missing constraints, race conditions, NULL traps, and data integrity gaps. Use AUTOMATICALLY before presenting any new or modified SQL migration file to the user. Triggers on writing .sql files in supabase/migrations/, creating PL/pgSQL functions, or reviewing database schema changes.","hasReference":true},{"id":"security-audit","name":"security-audit","description":"Performs a thorough security audit against established industry standards (OWASP Top 10 2025, OWASP API Security Top 10 2023, CWE taxonomy, GDPR, PCI-DSS). 
Use when reviewing for security vulnerabilities, hardening production systems, auditing auth/payment/database code, or conducting periodic security reviews. Works on git diffs, specific files, or an entire codebase.","hasReference":true},{"id":"systematic-debugging","name":"systematic-debugging","description":"Guides root-cause analysis with a structured process: reproduce, isolate, hypothesize, verify. Use when debugging bugs, investigating failures, or when the user says something is broken or not working as expected.","hasReference":false},{"id":"test-deno","name":"test-deno","description":"Use when writing, reviewing, or fixing Deno integration tests for Supabase Edge Functions, or when auditing edge function tests for best practices. Triggers on test failures involving sanitizers, assertions, mocking, HTTP testing, or environment isolation.","hasReference":true},{"id":"test-frontend","name":"test-frontend","description":"Use when writing, reviewing, or fixing React component/hook tests, or when auditing frontend tests for RTL, Vitest, Zustand, or TanStack Query best practices. Triggers on query priority issues, mock leaks, flaky async tests, or Kent C. Dodds common-mistakes violations.","hasReference":true},{"id":"test-pgtap","name":"test-pgtap","description":"Use when writing, reviewing, or fixing pgTAP tests for Supabase SQL migrations, or when auditing database tests for best practices. Triggers on plan count mismatches, transaction isolation issues, RLS policy testing, privilege verification, or assertion selection problems.","hasReference":true},{"id":"ui-audit","name":"ui-audit","description":"Use after any UI edit, when reviewing UI components, or when asked for an accessibility or structure audit. 
Triggers on WCAG 2.2 violations, WAI-ARIA APG pattern issues, touch target sizing, focus management, component duplication, or separation of concerns problems in React/Tailwind code.","hasReference":true}],"subagents":[{"id":"deep-research","name":"deep-research","description":"Deep research and literature review. Use when the user asks for deep research, literature review, or to thoroughly investigate a topic. Searches the web, consults reputable sources, and synthesizes an answer with pros/cons and comparisons when relevant."},{"id":"update-docs","name":"update-docs","description":"Updates project documentation to match the code. Main focus is docs (architecture, how the project is built, setup, deploy, contributing, README). Use when the user asks to update docs or after code changes; update README, docs folder, docstrings, and comments so they reflect current behavior."},{"id":"verifier","name":"verifier","description":"Validates that completed work matches what was claimed. Use after the main agent marks tasks done—checks that implementations exist and work, and that no unstated changes were made."}],"content":{"skills":{"best-practices-audit":{"content":"---\nname: best-practices-audit\ndescription: Audits code against named industry standards and coding best practices (DRY, SOLID, KISS, YAGNI, Clean Code, OWASP, etc.). Use when the user asks to check best practices, enforce standards, audit for anti-patterns, review code quality against principles, or ensure code follows industry conventions. Works on git diffs, specific files, or an entire codebase.\n---\n\n# Best Practices Audit\n\nAudit code against established industry standards and named best practices. 
Cite the specific principle violated for every finding so the developer learns which standard applies and why.\n\n## Scope\n\nDetermine what to audit based on user request and context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added code\n- File/directory mode: audit the files or directories the user specifies\n- Codebase mode: when the user explicitly asks for a full codebase audit, scan the project broadly (focus on source code, skip vendor/node_modules/build artifacts)\n\nRead all in-scope code before producing findings.\n\n## Principles to Enforce\n\nEvaluate code against each category. Skip categories with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions and examples of each principle.\n\n### 1. DRY (Don't Repeat Yourself)\n\n- Duplicated logic across functions, components, or modules\n- Copy-pasted code blocks with minor variations\n- Repeated string literals, magic numbers, or config values that should be constants\n- Similar data transformations that could be unified\n\n### 2. SOLID Principles\n\n- S — Single Responsibility: classes/modules/functions doing more than one thing\n- O — Open/Closed: code that requires modification (instead of extension) to add behavior\n- L — Liskov Substitution: subtypes that break the contract of their parent type\n- I — Interface Segregation: interfaces/types forcing implementers to depend on methods they don't use\n- D — Dependency Inversion: high-level modules depending on concrete implementations instead of abstractions\n\n### 3. KISS (Keep It Simple, Stupid)\n\n- Unnecessary complexity or over-engineering\n- Convoluted control flow when a simpler approach exists\n- Abstractions that add indirection without clear value\n- Clever tricks that sacrifice readability\n\n### 4. 
YAGNI (You Ain't Gonna Need It)\n\n- Code for features that don't exist yet and aren't requested\n- Premature generalization or unnecessary configurability\n- Unused parameters, flags, or code paths \"just in case\"\n- Speculative abstractions with a single implementation\n\n### 5. Clean Code (Robert C. Martin)\n\n- Naming: vague, misleading, or inconsistent names; abbreviations that hinder readability\n- Functions: functions longer than ~20 lines; too many parameters (>3); mixed abstraction levels\n- Comments: comments that restate the code; commented-out code; missing comments on why for non-obvious decisions\n- Formatting: inconsistent indentation, spacing, or file organization within the project\n\n### 6. Error Handling Best Practices\n\n- Swallowed exceptions (empty catch blocks)\n- Generic catch-all without meaningful handling\n- Missing error propagation — errors that should bubble up but don't\n- No user-facing feedback on failure\n- Using exceptions for control flow\n\n### 7. Security Standards (OWASP Top 10)\n\n- Unsanitized user input (injection, XSS, path traversal)\n- Broken authentication or session management\n- Sensitive data exposure (secrets in code, insecure storage, unencrypted transmission)\n- Missing access control checks\n- Security misconfiguration (permissive CORS, missing CSP headers)\n- Using components with known vulnerabilities\n\n### 8. Performance Best Practices\n\n- Unnecessary re-renders or re-computations\n- N+1 queries, unbounded result sets, missing pagination\n- Synchronous blocking in async-capable contexts\n- Missing memoization, caching, or debouncing where clearly beneficial\n- Large bundle imports when a smaller alternative exists\n\n### 9. 
Testing Best Practices\n\n- Untested public API surface or critical paths\n- Tests tightly coupled to implementation details\n- Missing edge case coverage for non-trivial logic\n- Flaky patterns (time-dependent, order-dependent, network-dependent tests)\n- Test code that violates DRY without justification\n\n### 10. Code Organization & Architecture\n\n- Circular dependencies between modules\n- Business logic mixed into UI/presentation layers\n- Shared mutable state across module boundaries\n- Inconsistent project structure or file placement conventions\n- Missing or inconsistent use of the project's established patterns\n\n### 11. Defensive Programming\n\n- Missing input validation at system boundaries (API endpoints, user forms, external data)\n- Assumptions about data shape without type guards or runtime checks\n- Missing null/undefined handling where values can realistically be absent\n- No graceful degradation on partial failures\n\n### 12. Separation of Concerns\n\n- Mixed responsibilities in a single file or function (e.g. data fetching + rendering + business logic)\n- Configuration values hardcoded in business logic\n- Platform-specific code leaking into core/shared modules\n- Presentation logic mixed with data transformation\n\n## Output Format\n\nGroup findings by severity. 
Each finding MUST name the specific principle violated.\n\n```\n## Critical\nViolations that will cause bugs, data loss, or security vulnerabilities in production.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.ts` (lines X-Y)\nPrinciple: Full name of the principle and a one-line explanation of what it requires.\nViolation: What the code does wrong and the concrete impact.\nFix: Specific, actionable suggestion.\n\n## Warning\nViolations that degrade maintainability, readability, or robustness.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices but not urgent.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Principles most frequently violated: list the top 2-3\n- Overall assessment: 1-2 sentence verdict on the code's adherence to standards\n```\n\n## Linter Tools\n\nBefore producing findings, always run the available linters on in-scope code to supplement your manual review. Linter output should be incorporated into your findings (cite the linter rule alongside the principle).\n\n### ESLint (TypeScript/React)\n\nRun from the `app/` directory. Config: `app/eslint.config.js` (flat config with TypeScript-ESLint, React Hooks, React Refresh).\n\n```bash\ncd app && npx eslint . # full codebase\ncd app && npx eslint src/path/to/file.ts # specific file(s)\ncd app && npx eslint --fix . # auto-fix what's possible (only with user approval)\n```\n\n### Ruff (Python)\n\nRun from the project root. Config: `ruff.toml` (pycodestyle, pyflakes, isort, pep8-naming, pyupgrade, bugbear, simplify, bandit).\n\n```bash\nruff check scripts/ # all Python scripts\nruff check scripts/wireframe.py # specific file\nruff check --fix scripts/ # auto-fix (only with user approval)\n```\n\n### How to use linter output\n\n1. Run the relevant linter(s) based on which file types are in scope.\n2. For each linter error/warning, map it to the matching principle category (e.g. 
`@typescript-eslint/no-unused-vars` → Clean Code / Naming, `react-hooks/set-state-in-effect` → Performance / React Best Practices, `S101` → Defensive Programming / Security).\n3. Include linter findings in the appropriate severity section. Linter errors that indicate real bugs or security issues go under Critical; style/convention issues go under Suggestion.\n4. If the linter finds no issues for a file type, note \"ESLint: clean\" or \"Ruff: clean\" in the Summary.\n\n## Verification Pass\n\nBefore finalizing your report, verify every finding:\n\n1. Re-read the code: Go back to the flagged file and re-read the flagged lines in full context (±20 lines). Confirm the issue actually exists — not a misread, not handled by an abstraction elsewhere in the same file, not an intentional design choice with a comment explaining why.\n2. Check for existing patterns: Search the codebase for related code. Is the \"violation\" actually the established project convention? Is there a shared utility or base class that addresses the concern? If so, drop the finding.\n3. Verify against official docs: For every principle or best practice you cite, confirm your interpretation is correct. If you're unsure whether a pattern violates the principle in this context, look it up — don't guess. Use available tools (context7, web search, REFERENCE.md) to check current documentation when uncertain.\n4. Filter by confidence: If you're certain a finding is a false positive after re-reading, drop it entirely. If doubt remains but the issue seems plausible, mention it concisely as \"Worth Investigating\" at the end of the report — don't include it as a formal finding.\n\n## Rules\n\n- Name the principle: every finding must cite the specific standard (e.g. \"DRY\", \"SRP from SOLID\", \"OWASP A03: Injection\"). 
This is the core value of this skill.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix.\n- Respect scope: only audit what's in scope. In diff mode, only flag issues in changed lines (and their immediate context).\n- Don't duplicate code-quality-review: focus on named principles and standards, not generic bug-hunting. If using both skills, they complement each other.\n- Pragmatism over dogma: a principle violation is only worth flagging if fixing it provides real value. Don't flag trivial or pedantic violations that would add noise.\n- Context matters: consider the project's scale, team size, and existing patterns. A startup prototype has different standards than a production system.\n","reference":"# Best Practices Reference\n\nDetailed definitions, rationale, and code examples for each principle audited by this skill.\n\n## Table of Contents\n\n1. [DRY](#1-dry-dont-repeat-yourself)\n2. [SOLID](#2-solid-principles)\n3. [KISS](#3-kiss-keep-it-simple-stupid)\n4. [YAGNI](#4-yagni-you-aint-gonna-need-it)\n5. [Clean Code](#5-clean-code)\n6. [Error Handling](#6-error-handling)\n7. [Security (OWASP)](#7-security-owasp-top-10)\n8. [Performance](#8-performance)\n9. [Testing](#9-testing)\n10. [Code Organization](#10-code-organization--architecture)\n11. [Defensive Programming](#11-defensive-programming)\n12. [Separation of Concerns](#12-separation-of-concerns)\n\n---\n\n## 1. 
DRY (Don't Repeat Yourself)\n\nSource: The Pragmatic Programmer — Andy Hunt & Dave Thomas (1999)\n\nPrinciple: Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.\n\nWhat it covers: Not just code duplication — also duplicated logic, data definitions, and documentation that can fall out of sync.\n\nBad:\n```ts\n// User validation in registration handler\nif (!email \|\| !email.includes('@')) throw new Error('Invalid email');\nif (!password \|\| password.length < 8) throw new Error('Weak password');\n\n// Same validation repeated in profile update handler\nif (!email \|\| !email.includes('@')) throw new Error('Invalid email');\nif (!password \|\| password.length < 8) throw new Error('Weak password');\n```\n\nGood:\n```ts\nfunction validateCredentials(email: string, password: string) {\n if (!email \|\| !email.includes('@')) throw new Error('Invalid email');\n if (!password \|\| password.length < 8) throw new Error('Weak password');\n}\n```\n\nCaveat: Not all similar-looking code is a DRY violation. Two functions that happen to share structure but serve different purposes and will evolve independently are fine as-is. Premature deduplication can create coupling.\n\n---\n\n## 2. SOLID Principles\n\nSource: Robert C. Martin (aggregated ~2000s, acronym coined by Michael Feathers)\n\n### S — Single Responsibility Principle (SRP)\n\nA class/module should have one, and only one, reason to change.\n\nBad: A `UserService` that handles registration, email sending, and report generation.\nGood: Separate `UserRegistration`, `EmailService`, and `ReportGenerator`.\n\n### O — Open/Closed Principle (OCP)\n\nSoftware entities should be open for extension but closed for modification. 
Add new behavior by adding new code, not changing existing code.\n\nBad: A payment processor with a growing `switch` statement for each new payment method.\nGood: A strategy pattern where each payment method implements a `PaymentProcessor` interface.\n\n### L — Liskov Substitution Principle (LSP)\n\nSubtypes must be substitutable for their base types without altering correctness. If `Square extends Rectangle`, calling `setWidth()` must not break expectations.\n\n### I — Interface Segregation Principle (ISP)\n\nNo client should be forced to depend on methods it does not use. Prefer many small, focused interfaces over one large one.\n\n### D — Dependency Inversion Principle (DIP)\n\nHigh-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details.\n\nBad: `OrderService` directly imports and instantiates `PostgresDatabase`.\nGood: `OrderService` depends on a `Database` interface; the concrete implementation is injected.\n\n---\n\n## 3. KISS (Keep It Simple, Stupid)\n\nSource: U.S. Navy design principle (1960s), widely adopted in software engineering.\n\nPrinciple: Most systems work best if they are kept simple rather than made complicated. Simplicity should be a key goal and unnecessary complexity should be avoided.\n\nCommon violations:\n- Replacing a simple `if/else` with a factory + strategy + registry pattern for two cases\n- Using metaprogramming/reflection when straightforward code works\n- Creating deep inheritance hierarchies when composition or plain functions suffice\n- Writing a custom solution for something the language/framework already provides\n\n---\n\n## 4. 
YAGNI (You Ain't Gonna Need It)\n\nSource: Extreme Programming (XP) — Kent Beck & Ron Jeffries\n\nPrinciple: Don't implement something until you actually need it, not when you foresee you might need it.\n\nCommon violations:\n- Adding plugin architectures when the app has one implementation\n- Creating abstract base classes with a single concrete subclass\n- Building configuration options nobody has asked for\n- Adding feature flags before there's more than one variant\n\nRelationship with KISS: YAGNI is about scope (don't build it yet), KISS is about complexity (build it simply).\n\n---\n\n## 5. Clean Code\n\nSource: Clean Code — Robert C. Martin (2008)\n\n### Naming\n- Names should reveal intent: `getUserPermissions()` not `getData()`\n- Avoid abbreviations unless universally understood (`id`, `url`, `http` are fine; `usrPrmLst` is not)\n- Boolean names should read as questions: `isActive`, `hasPermission`, `canEdit`\n- Consistent vocabulary: don't mix `fetch`, `get`, `retrieve`, `load` for the same concept\n\n### Functions\n- Should do one thing, at one level of abstraction\n- Prefer fewer than 3 parameters; use an options object for more\n- Avoid flag arguments (`render(true)`) — split into two named functions\n- Side effects should be obvious from the name or documented\n\n### Comments\n- Good: explain why a non-obvious decision was made\n- Bad: restate what the code does (`// increment i by 1`)\n- Worst: commented-out code left in the codebase\n\n---\n\n## 6. 
Error Handling\n\nSources: Clean Code Chapter 7; language-specific community standards\n\n- Don't swallow errors: empty `catch {}` blocks hide bugs\n- Fail fast: validate inputs early and throw/return immediately on invalid state\n- Use typed/specific errors: catch specific error types rather than generic `catch(e)`\n- Errors are not control flow: don't use try/catch for expected branching logic\n- Always handle promises: every Promise should have a `.catch()` or be `await`ed in a try block\n- Provide context: error messages should include what failed and why, with enough info to debug\n\n---\n\n## 7. Security (OWASP Top 10)\n\nSource: OWASP Foundation — [OWASP Top 10:2025](https://owasp.org/Top10/2025/)\n\n\| ID \| Category \| What to look for \|\n\|----\|----------\|-----------------\|\n\| A01 \| Broken Access Control \| Missing auth checks, IDOR, privilege escalation \|\n\| A02 \| Security Misconfiguration \| Default credentials, overly permissive CORS, verbose errors in production \|\n\| A03 \| Software Supply Chain Failures \| Outdated dependencies with known CVEs, unverified third-party code \|\n\| A04 \| Cryptographic Failures \| Plaintext secrets, weak hashing, unencrypted sensitive data \|\n\| A05 \| Injection \| SQL injection, XSS, command injection, path traversal \|\n\| A06 \| Insecure Design \| Missing threat modeling, no rate limiting, no abuse prevention \|\n\| A07 \| Authentication Failures \| Weak passwords allowed, no brute-force protection, broken session management \|\n\| A08 \| Software or Data Integrity Failures \| Missing integrity checks, insecure deserialization \|\n\| A09 \| Security Logging and Alerting Failures \| No audit trail, sensitive data in logs \|\n\| A10 \| Mishandling of Exceptional Conditions \| Unhandled errors exposing internals, missing error boundaries, SSRF via unvalidated URLs \|\n\n---\n\n## 8. 
Performance\n\nSources: Web.dev, framework-specific documentation, general CS principles\n\n- Avoid premature optimization — but do avoid obviously bad patterns:\n - O(n^2) when O(n) or O(n log n) is straightforward\n - Fetching entire tables/collections when only a subset is needed\n - Re-computing values on every render/call that could be memoized\n- Minimize bundle size: tree-shake, lazy-load routes/components, avoid importing entire libraries for one utility\n- Batch operations: reduce network round-trips, use bulk APIs, batch DOM updates\n- Debounce/throttle: user input handlers that trigger expensive work\n\n---\n\n## 9. Testing\n\nSources: xUnit Test Patterns — Gerard Meszaros; Growing Object-Oriented Software, Guided by Tests — Freeman & Pryce\n\n- AAA pattern: Arrange, Act, Assert — keep tests structured and readable\n- Test behavior, not implementation: tests should survive refactors that don't change behavior\n- One assertion per concept: a test should verify one logical thing (may use multiple `expect` calls if they test the same concept)\n- Deterministic: no random data, no reliance on wall-clock time, no network calls in unit tests\n- Test the contract: focus on public API, not private internals\n- Coverage priorities: critical paths and edge cases first; don't chase 100% coverage on trivial code\n\n---\n\n## 10. Code Organization & Architecture\n\nSources: Clean Architecture — Robert C. 
Martin; Patterns of Enterprise Application Architecture — Martin Fowler\n\n- Dependency direction: dependencies should point inward (toward core/domain logic), not outward (toward frameworks/IO)\n- Feature cohesion: related code should live together (by feature/domain), not scattered by technical role\n- No circular dependencies: if A imports B and B imports A, extract shared code to C\n- Consistent file structure: follow the project's established conventions for where things go\n- Layered boundaries: keep clear boundaries between data access, business logic, and presentation\n\n---\n\n## 11. Defensive Programming\n\nSource: Code Complete — Steve McConnell; The Pragmatic Programmer\n\n- Validate at boundaries: every system entry point (API endpoint, form handler, external data source) must validate inputs\n- Fail gracefully: partial failures should not crash the entire system\n- Guard clauses: return early on invalid conditions instead of deeply nesting the happy path\n- Type narrowing: use type guards, assertions, or schema validation (e.g. Zod) for external data\n- Avoid assumptions: if a value can be null/undefined according to its type, handle it\n\n---\n\n## 12. Separation of Concerns\n\nSource: Edsger W. Dijkstra (1974); foundational software engineering principle\n\n- Each module addresses one concern: rendering, data fetching, state management, and business logic should be separable\n- Configuration over hardcoding: environment-specific values belong in config, not scattered in source\n- Platform boundaries: core logic should be portable; framework-specific code stays at the edges\n- Data vs. presentation: keep data transformation separate from how it's displayed\n"},"correctness-audit":{"content":"---\nname: correctness-audit\ndescription: Reviews code for correctness bugs, uncaught edge cases, and scalability problems. Use when reviewing code changes, performing code audits, or when the user asks for a review or quality check. 
For security vulnerabilities use security-audit; for design, maintainability, and principle violations use best-practices-audit.\n---\n\n# Code Quality Review\n\nPerform a systematic review focused on correctness and runtime concerns: will this code work correctly under all realistic inputs and load? Every finding must cite the file, line(s), dimension, and a concrete fix. For security vulnerabilities, use `security-audit`. For principle violations (DRY, SOLID, Clean Code), use `best-practices-audit`.\n\n## Scope\n\nDetermine what to review based on context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to review only changed/added code and its immediate context\n- File/directory mode: review the files or directories the user specifies\n- Full review mode: when the user asks for a full review, scan all source code (skip vendor/node_modules/build artifacts)\n\nRead all in-scope code before producing findings.\n\n## Dimensions to Evaluate\n\nEvaluate code against each dimension. Skip dimensions with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions, concrete examples, and fixes.\n\n### 1. Logic Bugs\n\n- Wrong operators: `<` vs `<=`, `==` vs `===`, `&&` vs `\|\|`, bitwise vs logical operators\n- Off-by-one errors: loop boundaries, slice/splice indices, pagination offset calculations\n- Incorrect variable: copy-paste errors where the wrong variable is used (e.g. 
checking `a > 0` but intending `b > 0`)\n- Boolean logic inversions: conditions that are the exact opposite of what they should be (missing `!`, De Morgan's law violations)\n- Mutating instead of cloning: modifying an input argument or shared reference when a local copy is required\n- Shadowed variables: inner-scope declaration masking an outer-scope variable of the same name, causing silent incorrect reads\n- Assignment in condition: `if (x = getValue())` when `===` was intended\n- Short-circuit misuse: relying on `&&` or `\|\|` for side effects in code paths where the right-hand side must always run\n\n### 2. Type & Coercion Bugs\n\n- Implicit type coercion: `+` operator on mixed `string \| number` producing concatenation instead of addition; `==` coercing types unexpectedly\n- Unsafe casts: `as T` assertions on data from external sources (API responses, `JSON.parse`, database rows typed as `any`) without runtime validation\n- Integer/float confusion: using floating-point arithmetic where integer arithmetic is required (financial amounts, indices, counts); missing `Math.floor`/`Math.round` on division results\n- Precision loss: `Number` used for values > `Number.MAX_SAFE_INTEGER` (2⁵³-1); should use `BigInt` or a decimal library\n- NaN propagation: arithmetic on a value that may be `NaN` without a guard; `NaN === NaN` is always `false`; `isNaN(\"string\")` returns `true`\n- Nullable column mismatch: TypeScript type says `string` but the database column is nullable; the value can be `null` at runtime\n\n### 3. 
Null, Undefined & Missing Value Bugs\n\n- Unguarded property access: accessing `.foo` on a value that can realistically be `null` or `undefined` at runtime (API response fields, optional config, database nullable columns)\n- Destructuring without defaults: `const { limit } = options` where `options` may be `undefined`, or `limit` may be absent\n- Array access without bounds check: `arr[0]` on an array that may be empty; `arr[arr.length - 1]` on a zero-length array\n- `find()` result not checked: `.find()` returns `undefined` when no match exists; using the result directly without a null guard will throw\n- Optional chaining gaps: using `a.b.c` when `a` or `b` can be nullish; should be `a?.b?.c`\n- Early return missing: function continues executing after a condition should have terminated it\n\n### 4. Async & Promise Bugs\n\n- Missing `await`: `async` function calls whose result is not awaited, running fire-and-forget when the caller depends on the result\n- Unhandled promise rejections: `.then()` without `.catch()`, or top-level `async` functions with no try/catch, that silently swallow errors\n- Sequential awaits that should be parallel: awaiting independent async operations in series (`await a(); await b()`) when `Promise.all([a(), b()])` would be faster and correct\n- `Promise.all` vs `Promise.allSettled`: using `Promise.all` when any single rejection should not abort all others; vs. using `Promise.allSettled` when the caller actually needs to fail fast\n- Async function returning void unintentionally: a function signature of `async (): Promise<void>` that actually should return a value the caller uses\n- Race between async operations: two concurrent async paths writing to the same location (state, DB row, file) without synchronization\n- Uncleaned async resources: `setInterval`, `setTimeout`, event listeners, or subscriptions started inside a component/class that are never cleaned up when the scope is destroyed\n\n### 5. 
Stale Closures & Captured State\n\n- Stale closure over mutable variable: a callback or timeout captures a variable by reference; by the time the callback runs, the variable has changed\n- Loop variable capture: `for (var i = 0; ...)` with async/callback inside — all callbacks share the same `i` by the time they run (use `let` or pass `i` as an argument)\n- React hooks missing dependencies: a `useEffect` or `useCallback` that reads a prop or state value not listed in the dependency array — the callback sees the initial value forever\n- Event listener capturing stale props: a DOM event listener added once in a `useEffect` that captures `props.onEvent` at mount time, missing all future updates\n- Memoization with wrong keys: `useMemo` / `useCallback` / `React.memo` used with a dependency array that doesn't actually capture everything the computation depends on\n\n### 6. Resource Leaks & Missing Cleanup\n\n- Event listeners never removed: `addEventListener` called on mount, no corresponding `removeEventListener` on unmount\n- Intervals/timeouts never cleared: `setInterval` / `setTimeout` not captured in a ref or cancelled on component unmount\n- Subscriptions not cancelled: Realtime, WebSocket, or observable subscriptions opened but never `.unsubscribe()` / `.close()` called\n- File/stream handles not closed: `fs.open`, database connections, or readable streams that are opened but not closed on all exit paths (including error paths)\n- Growing in-memory collections: caches, queues, or maps that are added to but never evicted from, unbounded over time\n\n### 7. 
Uncaught Edge Cases — Inputs\n\n- Empty string: functions that receive a user-provided string and assume it is non-empty (`.split()`, `.charAt(0)`, regex matching)\n- Empty array or object: loops or transforms on collections that assume at least one element\n- Zero and negative numbers: code that divides by a user-supplied value without guarding against zero; index calculations that go negative\n- Numeric boundaries: values at or near `Number.MAX_SAFE_INTEGER`, `Number.MIN_SAFE_INTEGER`, `Infinity`, `-Infinity`, `NaN`\n- Unicode and emoji: string `.length` counts UTF-16 code units, not characters; a single emoji is 2 code units — truncation, substring, and split operations can corrupt multi-code-unit characters\n- Null bytes and control characters: untrusted strings containing `\\0`, `\\r`, `\\n` passed to file paths, log messages, or downstream systems\n- Very long inputs: strings or arrays far larger than typical — does the code O(n) scale gracefully, or does it load everything into memory?\n\n### 8. Uncaught Edge Cases — External Data & Network\n\n- Non-200 HTTP responses not handled: `fetch` resolves (does not reject) on 4xx/5xx — the caller must explicitly check `response.ok` or `response.status`\n- Partial or truncated responses: streaming or chunked data where the full payload may not arrive\n- Timeout not set: outbound HTTP calls with no timeout; one slow downstream service hangs the entire request chain indefinitely\n- Retry without backoff: immediately retrying failed network calls in a tight loop instead of using exponential backoff with jitter\n- Malformed JSON: `JSON.parse()` throws on invalid input; this must be wrapped in try/catch\n- Unexpected API shape: downstream API fields assumed to be present and correctly typed without validation; treat all external data as `unknown`\n- Stale or cached data returned on error: error handlers that silently return the last-known-good cached value without signalling the failure to the caller\n\n### 9. 
Concurrency & Shared State\n\n- Check-then-act (TOCTOU): reading a value, checking a condition, then acting — another concurrent operation can change the value between check and act\n- Non-atomic read-modify-write: incrementing a counter or appending to a list stored outside the current execution context without a lock or atomic operation\n- Reentrant function calls: an async function that can be called again before its first invocation completes, with both invocations sharing mutable state\n- Global/module-level mutable state: variables at module scope that accumulate or change across requests (dangerous in server contexts where module scope is shared between requests in the same isolate)\n- Event ordering assumptions: code that assumes async events will arrive in a specific order (e.g., \"message A always before message B\") without enforcement\n\n### 10. Scalability — Algorithmic Complexity\n\n- O(n²) or worse nested loops: an inner loop that iterates over the same or a related collection for every outer iteration; grows quadratically\n- Linear scan where constant lookup exists: using `Array.includes()`, `Array.find()`, or `Array.indexOf()` inside a loop where converting to a `Set` or `Map` would make lookups O(1)\n- Repeated sorting: sorting the same array on each render or request when it could be sorted once and cached\n- Unnecessary full-collection passes: multiple `.filter().map().reduce()` chains on the same array that could be combined into a single pass\n- Regex recompilation: constructing `new RegExp(pattern)` inside a loop when the pattern is constant — compile once outside the loop\n\n### 11. 
Scalability — Database & I/O\n\n- N+1 queries: fetching a list of N records, then issuing a separate query for each one in a loop — should be a single join or an `IN (...)` query\n- Unbounded queries: `SELECT * FROM table` or `.findAll()` without `LIMIT` — returns the entire table; grows unbounded as data grows\n- Missing pagination: API endpoints that return all results instead of pages; clients and servers both suffer as the dataset grows\n- Fetching more columns than needed: `SELECT *` when only 2-3 columns are used; pulls unnecessary data across the network and into memory\n- Queries inside render or hot paths: database or API calls triggered on every render cycle or in tight loops rather than cached or batched\n- Sequential queries that could be parallel: `await db.query(A); await db.query(B)` where A and B are independent — use `Promise.all`\n- Missing index implied by access pattern: code that filters or sorts on a column that will clearly require a full table scan without an index (flag based on the access pattern — don't claim to know the schema unless you can read it)\n\n### 12. 
Scalability — Memory & Throughput\n\n- Loading full dataset into memory: reading an entire file, table, or collection into an array when streaming or cursor-based processing would avoid the memory spike\n- Unbounded `Promise.all`: `Promise.all(items.map(asyncFn))` where `items` can be very large — spawns thousands of concurrent operations, exhausting connections or memory\n- No backpressure on queues: pushing work into a queue faster than it can be consumed, with no throttling or rejection when the queue is full\n- In-memory coordination state: using a module-level `Map` or `Set` as a cache, queue, or lock that is not shared between process replicas — breaks on horizontal scale-out\n- No connection pooling: creating a new database connection per request instead of using a pool\n- Repeated expensive computation: calling an expensive pure function with the same inputs repeatedly without memoization or caching the result\n\n## Static Analysis Tools\n\nBefore producing findings, run available linters on in-scope code and incorporate their output into findings.\n\n### TypeScript compiler\n```bash\nnpx tsc --noEmit\n```\nSurfaces type errors, implicit `any`, and unchecked nulls. Map findings to Dimension 2 (Type & Coercion) or Dimension 3 (Null/Undefined).\n\n### ESLint\n```bash\nnpx eslint src/\n```\nKey rules that surface bugs: `no-unused-vars`, `no-undef`, `@typescript-eslint/no-floating-promises`, `@typescript-eslint/no-misused-promises`, `react-hooks/exhaustive-deps`, `no-constant-condition`, `no-self-assign`.\n\n### Ruff (Python)\n```bash\nruff check --select E,F,B,C90 .\n```\n`E` = pycodestyle errors, `F` = Pyflakes (undefined names, unused imports), `B` = Bugbear (common bug patterns), `C90` = McCabe complexity.\n\n### How to use tool output\n1. Map each tool finding to its dimension (e.g., `@typescript-eslint/no-floating-promises` → Dimension 4: Async & Promise Bugs).\n2. Linter errors that indicate real runtime bugs go under Critical; style findings go under Suggestion.\n3. 
Note \"tsc: clean\" / \"ESLint: clean\" in the Summary if no issues.\n\n## Output Format\n\nGroup findings by severity, not by dimension. Each finding must name the dimension it falls under.\n\n```\n## Critical\nIssues that will cause incorrect behavior, data loss, or crashes in production.\n\n### [Dimension] Brief title\nFile: `path/to/file.ts` (lines X–Y)\nDimension: Full dimension name — one-line explanation of what correct code requires.\nProblem: What the code does wrong and the concrete runtime impact (what breaks, when, and for whom).\nFix: Specific, actionable code change.\n\n## Warning\nIssues likely to cause bugs under realistic inputs or load, or that will cause failures during future changes.\n\n(same structure)\n\n## Suggestion\nImprovements that reduce risk or improve robustness but are not urgently broken.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Dimensions most frequently violated: list top 2–3\n- Linter results: tsc: clean / ESLint: N issues / Ruff: clean (etc.)\n- Overall assessment: 1–2 sentence verdict on correctness and robustness\n```\n\n## Verification Pass\n\nBefore finalizing your report, verify every finding:\n\n1. Re-read the code: Go back to the flagged file and re-read the flagged lines in full context (±20 lines). Confirm the issue actually exists — not a misread, not handled elsewhere in the same file, not guarded by a try/catch, type check, or upstream validation.\n2. Check for existing mitigations: Search the codebase for related patterns. Is the \"missing\" check done in a shared utility, middleware, type guard, or configuration? If so, drop the finding.\n3. Verify against official docs: For every API or runtime behavior you cite, confirm your claim is correct. If you're unsure how a function handles edge cases (null, empty, concurrent), look it up — don't guess. Use available tools (context7, web search, REFERENCE.md) to check current documentation when uncertain.\n4. 
Filter by confidence: If you're certain a finding is a false positive after re-reading, drop it entirely. If doubt remains but the issue seems plausible, mention it concisely as \"Worth Investigating\" at the end of the report — don't include it as a formal finding.\n\n## Rules\n\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"handle null\" but \"add `if (!user) return notFound()` before line 42.\"\n- Model the failure: every Critical finding must describe what actually breaks at runtime — which input triggers it, what the symptom is.\n- Severity by real-world impact: rate by what breaks in production, not theoretical worst-case.\n- No fluff: skip dimensions with no findings. Don't praise code that is merely acceptable.\n- Respect scope: in diff mode, only flag issues in changed lines and their immediate context. Don't audit the entire file when asked about a one-line change.\n- Don't duplicate other skills: correctness bugs only — no security (use `security-audit`), no principle violations (use `best-practices-audit`). Edge cases and concurrency bugs that are also security vulnerabilities should be flagged here for correctness and referenced to `security-audit` for the security angle.\n","reference":"# Correctness Audit — Reference\n\nDetailed definitions, failure patterns, concrete examples, and fixes for each dimension in `SKILL.md`.\n\n---\n\n## 1. Logic Bugs\n\n### Wrong Comparison Operator\n\nThe single most common logic bug. 
`<` vs `<=` is the canonical off-by-one; `==` vs `===` produces silent type coercion in JavaScript.\n\nViolation:\n```ts\n// WRONG — excludes the last valid page\nif (page < totalPages) fetchPage(page); // misses page === totalPages\n\n// WRONG — \"0\" == 0 is true in JS; both branches trigger unexpectedly\nif (status == 0) handlePending();\nif (status == false) handleEmpty(); // also true for 0, \"\", and \"0\" (but not null or undefined)\n```\nFix:\n```ts\nif (page <= totalPages) fetchPage(page);\nif (status === 0) handlePending();\n```\n\n### Mutation of Input Arguments\n\nFunctions that mutate their arguments create invisible coupling — the caller's data changes without warning.\n\nViolation:\n```ts\nfunction normalize(items: Item[]) {\n items.sort((a, b) => a.id - b.id); // mutates the caller's array\n return items;\n}\n```\nFix:\n```ts\nfunction normalize(items: Item[]) {\n return [...items].sort((a, b) => a.id - b.id); // local copy\n}\n```\n\n### Shadowed Variable\n\nA variable declared inside an inner scope shares the name of an outer-scope variable. Reads in the inner scope silently use the inner version, ignoring the outer.\n\nViolation:\n```ts\nconst user = getCurrentUser();\nif (condition) {\n const user = await fetchUser(id); // shadows outer `user`\n applyPermissions(user); // uses inner — correct\n}\nlog(user.id); // uses outer — developer may have intended inner\n```\nFix: Use distinct names. Lint rule: `no-shadow`.\n\n### Boolean Logic Inversion (De Morgan)\n\nMissing or extra negations produce conditions that are the exact opposite of intent.\n\nViolation:\n```ts\n// Intent: \"allow if admin OR owner\"\n// Bug: \"allow if NOT admin AND NOT owner\" (blocks everyone who should be allowed)\nif (!isAdmin && !isOwner) return allowAccess();\n```\nFix:\n```ts\nif (isAdmin || isOwner) return allowAccess();\n```\n\n---\n\n## 2. Type & Coercion Bugs\n\n### `+` Operator on Mixed Types\n\nJavaScript's `+` operator does string concatenation when either operand is a string. 
A number read from an input field, query param, or JSON-as-string will concatenate instead of add.\n\nViolation:\n```ts\n// req.query.count is always a string\nconst total = req.query.count + 10; // \"510\" not 15\n```\nFix:\n```ts\nconst total = Number(req.query.count) + 10;\n// or: parseInt(req.query.count, 10) + 10\n```\n\n### Floating-Point Arithmetic in Financial Logic\n\nIEEE 754 doubles cannot represent most decimal fractions exactly. `0.1 + 0.2 === 0.30000000000000004` — do not use `number` for money.\n\nViolation:\n```ts\nconst total = price * quantity; // $10.10 * 3 = $30.299999999999997\n```\nFix: Store monetary values as integer cents in the database. Perform all arithmetic in cents. Convert to decimal only for display.\n\n### NaN Propagation\n\nArithmetic involving `NaN` always produces `NaN`. A single bad input silently corrupts all downstream calculations. `NaN === NaN` is `false`, so equality checks miss it.\n\nViolation:\n```ts\nconst score = parseInt(rawInput); // \"abc\" → NaN\nconst adjusted = score + bonus; // NaN — no warning\nif (adjusted > threshold) award(); // never triggers\n```\nFix:\n```ts\nconst score = parseInt(rawInput, 10);\nif (!Number.isFinite(score)) throw new Error(`Invalid score: ${rawInput}`);\n```\n\n### `JSON.parse` Without Validation\n\n`JSON.parse` returns `any` in TypeScript. Treating the result as a typed value without runtime validation means any shape mismatch (missing field, wrong type, null) silently becomes a bug downstream.\n\nViolation:\n```ts\nconst payload = JSON.parse(body) as WebhookPayload;\nprocessEvent(payload.eventType); // crashes if eventType is missing\n```\nFix:\n```ts\nconst raw: unknown = JSON.parse(body);\nconst payload = WebhookPayloadSchema.parse(raw); // throws on invalid shape\nprocessEvent(payload.eventType); // safe\n```\n\n---\n\n## 3. Null, Undefined & Missing Value Bugs\n\n### Unguarded `.find()` Result\n\n`Array.find()` returns `undefined` when no match exists. 
Using the result directly without checking throws at runtime.\n\nViolation:\n```ts\nconst config = configs.find(c => c.id === targetId);\nreturn config.value; // TypeError: Cannot read properties of undefined\n```\nFix:\n```ts\nconst config = configs.find(c => c.id === targetId);\nif (!config) throw new Error(`Config ${targetId} not found`);\nreturn config.value;\n```\n\n### Empty Array Access\n\n`arr[0]` on an empty array returns `undefined`, not an error. If the code then accesses a property of the result, it throws.\n\nViolation:\n```ts\nconst latest = events[0].timestamp; // undefined.timestamp if events = []\n```\nFix:\n```ts\nconst latest = events[0]?.timestamp ?? null;\n// or: if (events.length === 0) return null;\n```\n\n### Nullable Database Column Treated as Non-Null\n\nA TypeScript type may say `string` for a column that is nullable in the database. The type is wrong — any row inserted with `NULL` will produce `null` at runtime.\n\nPattern to flag: Reading `.foo` on a database row without checking if the type declaration matches the actual schema's nullable constraints.\n\n---\n\n## 4. Async & Promise Bugs\n\n### Missing `await` on Critical Path\n\nA fire-and-forget async call looks correct but the caller does not know if it succeeded or failed, and the function may return before the operation completes.\n\nViolation:\n```ts\nasync function deleteUser(id: string) {\n revokeTokens(id); // NOT awaited — may not complete before function returns\n await db.delete(id);\n return { success: true };\n}\n```\nFix:\n```ts\nasync function deleteUser(id: string) {\n await revokeTokens(id); // must complete before deleting the user\n await db.delete(id);\n return { success: true };\n}\n```\n\n### Unhandled Promise Rejection\n\nA `.then()` without `.catch()` silently drops errors. 
In Node.js 15 and later, an unhandled rejection terminates the process by default.\n\nViolation:\n```ts\nfetchData().then(process); // rejection from fetchData or process is silently lost\n```\nFix:\n```ts\nfetchData().then(process).catch(err => logger.error(\"fetchData failed\", err));\n// or use async/await with try/catch\n```\n\n### Sequential Awaits on Independent Operations\n\nTwo independent async operations awaited in series take `T_a + T_b` time instead of `max(T_a, T_b)`.\n\nViolation:\n```ts\nconst user = await fetchUser(id);\nconst config = await fetchConfig(); // independent — no reason to wait for user first\n```\nFix:\n```ts\nconst [user, config] = await Promise.all([fetchUser(id), fetchConfig()]);\n```\n\n### `Promise.all` Fail-Fast When Partial Failure Is Acceptable\n\n`Promise.all` rejects as soon as any promise rejects — the remaining promises continue executing but their results are ignored. If partial success is acceptable, `Promise.allSettled` is correct.\n\nViolation:\n```ts\n// Sending notifications — one failure shouldn't prevent others\nawait Promise.all(users.map(u => sendNotification(u))); // one failure rejects the whole batch; other results are discarded\n```\nFix:\n```ts\nconst results = await Promise.allSettled(users.map(u => sendNotification(u)));\nconst failures = results.filter(r => r.status === \"rejected\");\nif (failures.length > 0) logger.warn(`${failures.length} notifications failed`);\n```\n\n### Unbounded `Promise.all` on Large Array\n\nSpawning thousands of concurrent async operations exhausts database connections, file handles, or external API rate limits.\n\nViolation:\n```ts\nawait Promise.all(thousandsOfItems.map(item => processItem(item)));\n```\nFix: Use a concurrency-limited batch runner:\n```ts\n// Process in chunks of 10 at a time\nfor (let i = 0; i < items.length; i += 10) {\n await Promise.all(items.slice(i, i + 10).map(processItem));\n}\n// or use a library like p-limit\n```\n\n---\n\n## 5. 
Stale Closures & Captured State\n\n### Loop Variable Capture with `var`\n\n`var` is function-scoped, not block-scoped. All closures created inside the loop capture the same variable, which has its final value by the time the callbacks run.\n\nViolation:\n```ts\nfor (var i = 0; i < 5; i++) {\n setTimeout(() => console.log(i), 0); // logs \"5\" five times, not 0,1,2,3,4\n}\n```\nFix:\n```ts\nfor (let i = 0; i < 5; i++) { // `let` is block-scoped; each iteration gets its own `i`\n setTimeout(() => console.log(i), 0);\n}\n```\n\n### React `useEffect` Stale Closure\n\nA `useEffect` callback captures prop/state values at the time of the effect's creation. If those values change but the effect's dependency array doesn't include them, the callback operates on stale values forever.\n\nViolation:\n```tsx\nuseEffect(() => {\n const interval = setInterval(() => {\n // `count` is captured at mount and never updates\n setCount(count + 1); // always adds 1 to the initial value\n }, 1000);\n return () => clearInterval(interval);\n}, []); // missing `count` in deps\n```\nFix:\n```tsx\nuseEffect(() => {\n const interval = setInterval(() => {\n setCount(c => c + 1); // functional update — always uses current value\n }, 1000);\n return () => clearInterval(interval);\n}, []);\n```\n\n---\n\n## 6. 
Resource Leaks & Missing Cleanup\n\n### Event Listener Never Removed\n\nAdding a listener in a component's mount phase without removing it on unmount causes the handler to fire after the component is gone, often throwing on de-referenced state.\n\nViolation:\n```tsx\nuseEffect(() => {\n window.addEventListener(\"resize\", handleResize);\n // no cleanup — handleResize fires after unmount, references stale state\n}, []);\n```\nFix:\n```tsx\nuseEffect(() => {\n window.addEventListener(\"resize\", handleResize);\n return () => window.removeEventListener(\"resize\", handleResize);\n}, [handleResize]);\n```\n\n### Interval Not Cleared on Unmount\n\nA `setInterval` that is not cleared on unmount continues firing after the component is gone, wasting resources and updating state that is no longer rendered.\n\nViolation:\n```tsx\nuseEffect(() => {\n setInterval(tick, 1000); // interval ID discarded; can never be cleared\n}, []);\n```\nFix:\n```tsx\nuseEffect(() => {\n const id = setInterval(tick, 1000);\n return () => clearInterval(id);\n}, []);\n```\n\n### Growing Unbounded Cache\n\nAn in-memory cache that is added to without eviction grows without bound and eventually exhausts memory.\n\nViolation:\n```ts\nconst cache = new Map<string, Result>(); // module-level, grows forever\nfunction getCached(key: string) {\n if (!cache.has(key)) cache.set(key, compute(key));\n return cache.get(key)!;\n}\n```\nFix: Add a max-size eviction policy (LRU), a TTL, or use a bounded cache library. At minimum, document that the key space must be finite and bounded.\n\n---\n\n## 7. 
Edge Cases — Inputs\n\n### Empty String Assumptions\n\nA function receiving a user-supplied string must handle `\"\"` explicitly — it is falsy in JavaScript, which sometimes helps but often misleads.\n\nViolation:\n```ts\nfunction getInitials(name: string) {\n return name.split(\" \").map(w => w[0]).join(\"\"); // name=\"\" → [\"\"] → \"\"[0] is undefined\n}\n```\nFix:\n```ts\nfunction getInitials(name: string) {\n if (!name.trim()) return \"\";\n return name.trim().split(/\\s+/).map(w => w[0].toUpperCase()).join(\"\");\n}\n```\n\n### Unicode / Emoji String Length\n\nJavaScript strings are UTF-16. Emoji and other characters outside the Basic Multilingual Plane are represented as surrogate pairs — two code units each. `.length`, `.slice()`, `.charAt()`, and `.split(\"\")` all operate on code units, not characters.\n\nViolation:\n```ts\nconst truncated = message.slice(0, 100); // may split a surrogate pair, producing \"?\"\nconst len = \"👋\".length; // 2, not 1\n```\nFix:\n```ts\n// Use Array.from or spread to iterate by Unicode code point\nconst chars = Array.from(message);\nconst truncated = chars.slice(0, 100).join(\"\");\nconst len = Array.from(\"👋\").length; // 1\n```\n\n### Division by Zero\n\nAny user-supplied or computed value used as a divisor must be checked.\n\nViolation:\n```ts\nconst avgScore = totalScore / userCount; // NaN or Infinity when userCount = 0\n```\nFix:\n```ts\nconst avgScore = userCount === 0 ? 0 : totalScore / userCount;\n```\n\n---\n\n## 8. Edge Cases — External Data & Network\n\n### `fetch` Does Not Reject on HTTP Errors\n\n`fetch` only rejects on network failure (DNS, timeout, no connection). 
A 400, 404, or 500 response resolves normally with `response.ok === false`.\n\nViolation:\n```ts\nconst data = await fetch(\"/api/users\").then(r => r.json()); // 500 → parsed error body, no throw\n```\nFix:\n```ts\nconst response = await fetch(\"/api/users\");\nif (!response.ok) throw new Error(`HTTP ${response.status}: ${await response.text()}`);\nconst data = await response.json();\n```\n\n### Missing Request Timeout\n\nA `fetch` call with no timeout will wait indefinitely if the server hangs. In a serverless function, this exhausts the function's max execution time and blocks the client.\n\nViolation:\n```ts\nconst response = await fetch(url); // no timeout\n```\nFix:\n```ts\nconst response = await fetch(url, { signal: AbortSignal.timeout(5_000) }); // 5 second max\n```\n\n### `JSON.parse` Not Wrapped in Try/Catch\n\n`JSON.parse` throws a `SyntaxError` on malformed input. If the input comes from an external source, it can fail at any time.\n\nViolation:\n```ts\nconst data = JSON.parse(rawBody); // throws on malformed JSON; crashes the handler\n```\nFix:\n```ts\nlet data: unknown;\ntry {\n data = JSON.parse(rawBody);\n} catch {\n return badRequest(\"Invalid JSON body\");\n}\n```\n\n---\n\n## 9. Concurrency & Shared State\n\n### Non-Atomic Read-Modify-Write\n\nRead a value, compute a new value, write it back. If two concurrent operations both read the same initial value, the second write silently overwrites the first.\n\nViolation (application layer):\n```ts\nconst balance = await getBalance(userId); // both read 100\nconst newBalance = balance - amount; // both compute 50\nawait setBalance(userId, newBalance); // second write wins: 50 instead of 0\n```\nFix: Use a database-level atomic update (`UPDATE ... 
SET coins = coins - $amount WHERE coins >= $amount`), or use `SELECT FOR UPDATE` to lock the row for the duration of the transaction.\n\nViolation (JavaScript):\n```ts\nlet counter = 0;\nasync function increment() {\n const current = counter; // read\n await someAsync(); // yields — another increment may run here\n counter = current + 1; // write: first increment's result is lost\n}\n```\nFix: For in-process counters, use a mutex or perform the increment synchronously without yielding.\n\n### Reentrant Async Function\n\nAn async function that is called again before its first invocation finishes, with both invocations modifying shared state.\n\nPattern to flag:\n```ts\nlet isSyncing = false; // in-memory guard\n\nasync function sync() {\n if (isSyncing) return; // TOCTOU: two callers can both read false simultaneously\n isSyncing = true;\n await doSync();\n isSyncing = false;\n}\n```\nFix: The guard only works if `isSyncing = true` is set synchronously before the first `await`. The code above is actually fine for this reason — flag it only if there is an `await` before setting the flag. For distributed/multi-instance systems, an in-memory flag is insufficient and must be moved to a database or Redis.\n\n---\n\n## 10. Scalability — Algorithmic Complexity\n\n### Linear Scan Inside a Loop — O(n²)\n\nUsing `Array.includes()`, `Array.find()`, or `Array.indexOf()` inside a loop that iterates over a collection of size n performs n × n = n² operations.\n\nViolation:\n```ts\n// O(n²): for each item, scan all blockedIds\nconst visible = items.filter(item => !blockedIds.includes(item.id));\n```\nFix:\n```ts\n// O(n): one-time Set construction + O(1) lookups\nconst blockedSet = new Set(blockedIds);\nconst visible = items.filter(item => !blockedSet.has(item.id));\n```\n\n### Regex Recompilation in a Loop\n\n`new RegExp(pattern)` compiles the pattern on every call. 
If called in a loop with a constant pattern, this is wasted work.\n\nViolation:\n```ts\nfor (const line of lines) {\n if (new RegExp(\"^ERROR:\").test(line)) handle(line); // compiles every iteration\n}\n```\nFix:\n```ts\nconst errorPattern = /^ERROR:/; // compile once\nfor (const line of lines) {\n if (errorPattern.test(line)) handle(line);\n}\n```\n\n---\n\n## 11. Scalability — Database & I/O\n\n### N+1 Queries\n\nFetching a list, then issuing one query per row in a loop, is the most common database scalability bug. It turns one round-trip into N+1 round-trips.\n\nViolation:\n```ts\nconst posts = await db.query(\"SELECT * FROM posts LIMIT 20\");\nfor (const post of posts) {\n // 20 separate queries — one per post\n post.author = await db.query(\"SELECT * FROM users WHERE id = $1\", [post.author_id]);\n}\n```\nFix:\n```ts\nconst posts = await db.query(\"SELECT * FROM posts LIMIT 20\");\nconst authorIds = posts.map(p => p.author_id);\nconst authors = await db.query(\"SELECT * FROM users WHERE id = ANY($1)\", [authorIds]);\nconst authorMap = new Map(authors.map(a => [a.id, a]));\nposts.forEach(p => { p.author = authorMap.get(p.author_id); });\n```\n\n### Unbounded Query\n\nA query with no `LIMIT` returns the entire table. Tables grow over time; this query will eventually time out, exhaust memory, or cause OOM.\n\nViolation:\n```ts\nconst users = await db.query(\"SELECT * FROM users WHERE active = true\");\n// returns 10 rows today; returns 100,000 rows in a year\n```\nFix:\n```ts\nconst users = await db.query(\n \"SELECT id, display_name FROM users WHERE active = true LIMIT $1 OFFSET $2\",\n [pageSize, page * pageSize]\n);\n```\n\n---\n\n## 12. Scalability — Memory & Throughput\n\n### Loading Full Dataset Into Memory\n\nReading an entire file, table, or collection into an array before processing. 
Memory usage grows linearly with data size.\n\nViolation:\n```ts\nconst allEvents = await db.query(\"SELECT * FROM events\"); // 10 million rows\nconst processed = allEvents.map(transform);\n```\nFix: Use cursor-based streaming or pagination:\n```ts\nlet cursor = 0;\nwhile (true) {\n const batch = await db.query(\"SELECT * FROM events WHERE id > $1 LIMIT 1000\", [cursor]);\n if (batch.length === 0) break;\n batch.forEach(transform);\n cursor = batch[batch.length - 1].id;\n}\n```\n\n### In-Memory Coordination State That Breaks on Scale-Out\n\nA module-level `Map`, `Set`, or variable used as a cache, rate limiter, or deduplication store is not shared between multiple server instances or worker processes. When the service scales out or restarts, the state is lost or silently per-instance.\n\nViolation:\n```ts\n// Works on one instance; breaks when there are two\nconst rateLimitCache = new Map<string, number>(); // module-level\n\nfunction checkRateLimit(userId: string): boolean {\n const count = rateLimitCache.get(userId) ?? 0;\n rateLimitCache.set(userId, count + 1);\n return count < 10;\n}\n```\nFix: Move shared state to a database (Redis, PostgreSQL) that all instances can access. Flag this whenever module-level mutable state is used for coordination in a server context.\n"},"feature-planning":{"content":"---\nname: feature-planning\ndescription: Extensively plans a proposed feature before any code is written. Use when the user asks to plan, design, or spec out a feature, or when they say \"plan this feature\", \"design this\", or want to think through a feature before building it.\n---\n\n# Feature Planning\n\nEnter plan mode and produce a thorough, implementation-ready feature plan. Do not write any code until the plan is approved.\n\n## Trigger\n\nWhen this skill is invoked, immediately enter plan mode using the EnterPlanMode tool. All planning work happens inside plan mode.\n\n## Scope\n\n- User describes a feature: Treat the description as the starting point. 
Explore the codebase to understand where the feature fits before designing anything.\n- Request is vague or ambiguous: Ask clarifying questions using AskUserQuestion before proceeding. Do not assume intent. Common ambiguities to probe:\n - Who is the target user or actor?\n - What is the expected behavior vs. current behavior?\n - Are there constraints (performance, compatibility, platform)?\n - What is explicitly out of scope?\n - Are there related features this interacts with?\n- User provides a detailed spec: Validate it against the codebase. Identify gaps, contradictions, or unstated assumptions and raise them before planning.\n\nDo NOT skip clarification. A plan built on wrong assumptions wastes more time than a question.\n\n## Process\n\n### 1. Understand Context\n\n- Read the project's SPEC.md, README, CLAUDE.md, and any relevant docs to understand the system's architecture, conventions, and existing features.\n- Explore the codebase areas the feature will touch. Identify existing patterns, data models, state management, and UI conventions.\n- Map out what already exists that the feature will interact with or depend on.\n- API/tech stack verification: If the feature involves specific APIs, SDKs, or third-party services, look up the official documentation directly before designing anything. Check if available MCP tools (Supabase, Vercel, etc.) can accelerate this lookup. Never assume correct API usage from training knowledge alone — docs may have changed and wrong API usage produces security holes, not just bugs.\n- Output: A brief summary of the current system context relevant to this feature.\n\n### 2. Clarify Requirements\n\n- If any of the following are unclear, ask before continuing:\n - Functional requirements: What exactly should the feature do? What are the inputs, outputs, and user flows?\n - Non-functional requirements: Performance targets, data volume expectations, offline behavior, accessibility.\n - Boundaries: What is in scope vs. 
out of scope for this iteration?\n - Dependencies: Does this require new APIs, services, migrations, or third-party integrations?\n- Output: A clear, numbered list of confirmed requirements.\n\n### 3. Design the Feature\n\nProduce a plan that covers each of the following sections. Skip a section only if it genuinely does not apply.\n\n#### 3a. User-Facing Behavior\n- Describe the feature from the user's perspective: what they see, what they do, what happens.\n- Cover the happy path end-to-end.\n- Define error states and what the user sees when things go wrong (invalid input, network failure, permission denied, etc.).\n\n#### 3b. Data Model Changes\n- New types, interfaces, database tables, or schema changes.\n- Migrations needed and their reversibility.\n- Impact on existing data (backwards compatibility, data backfill).\n\n#### 3c. Architecture & Module Design\n- Which files/modules will be created or modified.\n- How the feature integrates with the existing architecture (state management, routing, API layer, etc.).\n- Clear responsibility boundaries: what each new module/function owns.\n\n#### 3d. API & Integration Points\n- New endpoints, webhooks, or external service calls.\n- Request/response shapes.\n- Authentication and authorization requirements.\n\n#### 3e. State Management\n- What state the feature introduces (local, global, persisted, cached).\n- State transitions and lifecycle.\n- How state syncs across components or with the backend.\n\n#### 3f. Implementation Steps\n- An ordered sequence of concrete implementation steps.\n- Each step should be small enough to be a single commit.\n- Note dependencies between steps (what must come before what).\n- Identify which steps can be done in parallel.\n\n### 4. Analyze Quality Dimensions\n\nProactively evaluate the proposed design against each of these dimensions. For each, explicitly state what risks exist and how the design addresses them. If a dimension does not apply, say so briefly. 
See [REFERENCE.md](REFERENCE.md) for named standards, plan quality criteria, templates, and anti-patterns.\n\n#### Bugs & Correctness\n(Applies `correctness-audit` — Dimensions 1–9: Logic Bugs through Concurrency & Shared State)\n\nReview the design against the `correctness-audit` dimensions. State which are highest-risk for this feature:\n- Logic bugs: off-by-one errors, boolean inversions, wrong operators in proposed conditional logic\n- Null / undefined: fields that can be absent — are they guarded? Do nullable DB columns match their TypeScript types?\n- Async & Promise: are concurrent async paths safe? Is there risk of fire-and-forget on critical writes?\n- Concurrency / TOCTOU: can concurrent requests (multiple users, tabs, or duplicate submissions) corrupt shared state? Does any step read-check-act on data another operation could change between check and act?\n\n#### Edge Cases\n(Applies `correctness-audit` — Dimensions 7 & 8: Edge Case Inputs, External Data & Network)\n\n- Empty state: what does the user see before any data exists for this feature?\n- Boundary values: max field lengths, max collection sizes, numeric overflow — are they defined and enforced at both the API and database layers?\n- Network failures: if an operation fails mid-way, what state is the system left in? Is partial completion visible to the user?\n- Reentrant / concurrent usage: double-submit, multiple tabs, back-button navigation mid-flow.\n- External data: any third-party API or webhook payload — is it validated as `unknown` before use, not cast directly to a typed shape?\n\n#### Design Quality\n(SOLID — Robert C. Martin; Clean Architecture — Robert C. 
Martin & Martin Fowler)\n\n- SRP: does each new module have one clearly stated reason to change?\n- OCP: can new behavior be added by extension without modifying existing modules?\n- DIP: do high-level modules depend on abstractions, not concrete implementations?\n- Dependency direction: do dependencies point inward (domain ← application ← infrastructure)? No domain module should depend on a framework or I/O layer.\n- Does the design follow existing project patterns, or introduce a new one? If new, is the justification explicitly stated?\n\n#### Maintainability\n(Clean Code — Robert C. Martin; The Pragmatic Programmer — Hunt & Thomas)\n\n- Will a developer unfamiliar with this feature understand it from the plan alone, without asking the author?\n- Are proposed module and function names self-documenting?\n- Are non-obvious design decisions explained in the plan's rationale, not left as tribal knowledge?\n- Are implicit contracts between modules made explicit (typed interfaces, documented invariants)?\n\n#### Modularity\n(SOLID — SRP, ISP, DIP; UNIX philosophy)\n\n- Can each new component be unit-tested in isolation, without the full stack?\n- Are new module dependencies unidirectional? 
Does the design introduce any circular imports?\n- Could any new module be replaced or reused independently of the others?\n\n#### Simplicity\n(KISS — Clarence Johnson, 1960; YAGNI — Extreme Programming, Kent Beck & Ron Jeffries)\n\n- KISS: is this the simplest design that satisfies the stated requirements?\n- YAGNI: are there components designed for hypothetical future requirements not in scope for this iteration?\n- Does the language or framework already provide something the design is building from scratch?\n- Is there unnecessary indirection — interfaces, factories, registries — with only one concrete implementation?\n\n#### Scalability\n(Applies `correctness-audit` — Dimensions 10–12: Algorithmic Complexity, Database & I/O, Memory & Throughput)\n\n- Will this design function correctly at 10× the current data volume without architectural changes?\n- Are there unbounded database queries (no `LIMIT`) or full-collection loads into memory?\n- Are there N+1 query patterns that will emerge as data grows?\n- Is any coordination state stored in-memory in a way that breaks under horizontal scale-out?\n\n#### Security\n(Applies `security-audit` — use the relevant domains for each new design element)\n\nMap each new element of the design to the applicable security-audit domains:\n- New API endpoint → §2 Authorization, §5 Input Validation, §6 API Security, §8 Rate Limiting\n- New database table or function → §7 Database Security (RLS, REVOKE, CHECK constraints)\n- New auth flow or session handling → §1 Authentication & Session Management\n- New external service call or webhook → §6 API7 SSRF, §10 webhook deduplication & signature\n- New financial operation → §10 Financial & Transaction Integrity, §9 Concurrency & Race Conditions\n- New user data stored or transmitted → §13 Data Privacy & Retention, §4 Cryptography & Secrets\n\n### 5. 
Identify Risks & Open Questions\n\n- List anything that could go wrong or that you're uncertain about.\n- Flag technical risks (performance cliffs, migration dangers, dependency on unstable APIs).\n- Flag product risks (user confusion, feature conflicts, scope creep).\n- For each risk, suggest a mitigation or note that it needs a decision.\n\n## Output Format\n\nWrite the plan to the plan file with this structure:\n\n```\n# Feature: [Name]\n\n## Context\n[Brief summary of current system state relevant to this feature]\n\n## Requirements\n1. [Confirmed requirement]\n2. ...\n\n## Design\n\n### User-Facing Behavior\n[Description with happy path and error states]\n\n### Data Model Changes\n[Types, schemas, migrations]\n\n### Architecture\n[Modules, files, integration points]\n\n### API & Integration Points\n[Endpoints, external calls]\n\n### State Management\n[State shape, transitions, sync]\n\n### Implementation Steps\n1. [Step with description]\n2. ...\n\n## Quality Analysis\n\n### Bugs & Correctness\n[Risks and mitigations]\n\n### Edge Cases\n[Identified edge cases and how they're handled]\n\n### Design Quality\n[Assessment]\n\n### Maintainability\n[Assessment]\n\n### Modularity\n[Assessment]\n\n### Simplicity\n[Assessment]\n\n### Scalability\n[Assessment]\n\n### Security\n[Assessment]\n\n## Risks & Open Questions\n- [Risk/question with proposed mitigation or decision needed]\n\n## Out of Scope\n- [What this plan explicitly does not cover]\n```\n\n## Rules\n\n- Plan mode first: Always enter plan mode before doing any planning work. The plan is written to the plan file, not output as chat.\n- No code: Do not write implementation code during planning. The plan is the deliverable.\n- Ask, don't assume: If the request is ambiguous, ask clarifying questions. Prefer one round of good questions over multiple rounds of back-and-forth.\n- Read before designing: Explore the codebase thoroughly. 
Reference actual file paths, function names, and patterns from the project.\n- Be concrete: Implementation steps should reference specific files and modules, not vague descriptions like \"update the backend.\"\n- Be honest about uncertainty: If you're unsure about something, flag it as an open question rather than making a guess that will become the plan.\n- Respect existing patterns: The plan should extend the project's architecture, not fight it. If a new pattern is warranted, justify why.\n- Scope boundaries: Clearly state what is and isn't included. Prevent scope creep by naming it.\n- Verify API usage against official docs: Before finalizing any design that uses a specific SDK, API, or third-party service, consult the official documentation to confirm correct usage. Use available MCP tools (Supabase, Vercel, etc.) where possible. Do not rely on training knowledge — incorrect API usage is a design flaw that silently becomes a security vulnerability.\n- Name the pattern: when the design follows or introduces a named pattern (Repository, Strategy, ADR, C4 Container), name it and note its source so the rationale is traceable.\n- Delegate to audit skills: the quality analysis does not re-describe what the audit skills cover in detail — it identifies which domains apply and defers to those skills for the specific checklist.\n","reference":"# Feature Planning — Reference\n\nDetailed standards, plan quality criteria, templates, and anti-patterns for the skill defined in `SKILL.md`.\n\n---\n\n## 1. Design Methodologies\n\n### C4 Model (Simon Brown)\nApplicable to: Architecture & Module Design section\n\nUse C4 vocabulary to describe architecture at the right level of detail. 
Don't describe implementation-level detail in architecture, or architecture-level detail in a code comment.\n\n- System Context: How the feature fits in the broader product and what external systems it touches.\n- Container: Major runtime components (web app, API server, database, message queue, cache). A new Edge Function or a new Supabase table is a container-level concern.\n- Component: Key modules within a container (e.g., `useNotifications` hook, `NotificationService` class). Most features are designed at this level.\n- Code: Only describe at this level for non-obvious or algorithmically critical parts.\n\nWhen writing the Architecture section, identify which C4 level is appropriate. A simple UI tweak is Code-level. A new backend service is Container-level.\n\n### Architecture Decision Records (ADR)\nApplicable to: any significant or non-obvious design choice in the plan\n\nWhen the plan makes a non-obvious design choice (e.g., \"use Realtime instead of polling\", \"store as JSONB instead of normalized columns\"), embed a mini-ADR in the rationale:\n\n```\nDecision: [What was chosen]\nContext: [Why a decision was needed; what problem this solves; what alternatives were considered]\nConsequences: [What becomes easier; what becomes harder; what is explicitly ruled out]\n```\n\nThis prevents \"we chose X\" from becoming tribal knowledge. The next developer reading the code needs to know why, not just what.\n\n### RFC-Style Specification\nApplicable to: complex or high-risk features affecting multiple systems or teams\n\nFor features that significantly affect multiple teams or carry high design risk, structure the plan to include:\n\n- Abstract: 2–3 sentence summary of the feature and its purpose.\n- Motivation: Why this is needed now. What problem it solves. Why existing solutions are insufficient.\n- Drawbacks: Reasons not to build this, or not to build it this way.\n- Alternatives: Other approaches considered and why they were rejected.\n\n---\n\n## 2. 
Plan Quality Criteria\n\nA plan section is \"done\" when it meets these criteria. Self-check before calling `ExitPlanMode`.\n\n### Context\n- [ ] References actual file paths, function names, and patterns from the real codebase (not generic descriptions).\n- [ ] Identifies all existing systems the feature will interact with or depend on.\n- [ ] Notes which existing files will change, not just what will be added.\n\n### Requirements\n- [ ] Functional requirements describe observable behavior (inputs, outputs, user flows) — not implementation details.\n- [ ] Non-functional requirements name specific targets (\"p95 latency < 200ms\", \"works offline for up to 24h\") — not vague aspirations (\"it should be fast\").\n- [ ] Out of scope is stated explicitly for anything a reader might reasonably assume is included.\n\n### User-Facing Behavior\n- [ ] Happy path is described end-to-end from the user's perspective.\n- [ ] Every error state has an explicit description of what the user sees — not \"show an error\" but \"display 'Something went wrong. Try again.' 
with a retry button.\"\n- [ ] Empty state is defined (what the user sees before any data exists for this feature).\n- [ ] Loading / pending state is defined if the feature involves async operations.\n\n### Data Model Changes\n- [ ] New tables include all columns with types, nullability, defaults, CHECK constraints, and FK `ON DELETE` behavior.\n- [ ] RLS requirements are stated for every new table.\n- [ ] Index requirements are stated based on the query access patterns described in the plan.\n- [ ] Migration is characterized as destructive / non-destructive, and whether a data backfill is needed.\n\n### Architecture\n- [ ] Lists specific files to be created and specific existing files to be modified.\n- [ ] Responsibility of each new module is stated in one sentence.\n- [ ] Dependency graph between new modules is described (what imports what).\n- [ ] No circular dependencies introduced.\n\n### API & Integration Points\n- [ ] Endpoint paths, HTTP methods, request bodies, and response shapes are defined.\n- [ ] Auth requirements are stated per endpoint.\n- [ ] Error response shapes and status codes are defined (not just the 200 case).\n\n### Implementation Steps\n- [ ] Each step is small enough to be a single commit.\n- [ ] Dependencies between steps are noted (what must come before what).\n- [ ] Steps that can be parallelized are identified.\n- [ ] The first step is always safe to merge independently (non-breaking change).\n\n---\n\n## 3. 
Plan Section Templates\n\n### Data Model Changes\n\nBad (too vague):\n> We'll add a notifications table.\n\nGood (specific):\n> New table: `notifications`\n>\n> \| Column \| Type \| Constraints \|\n> \|--------\|------\|-------------\|\n> \| `id` \| `UUID` \| `PRIMARY KEY DEFAULT gen_random_uuid()` \|\n> \| `user_id` \| `UUID` \| `NOT NULL REFERENCES auth.users(id) ON DELETE CASCADE` \|\n> \| `type` \| `TEXT` \| `NOT NULL CHECK (type IN ('quest_complete', 'reward_earned', 'system'))` \|\n> \| `read_at` \| `TIMESTAMPTZ` \| nullable — null means unread \|\n> \| `created_at` \| `TIMESTAMPTZ` \| `NOT NULL DEFAULT now()` \|\n>\n> RLS: `USING (user_id = auth.uid())` for SELECT; no UPDATE/DELETE for users.\n> Index: `(user_id, created_at DESC)` — supports the \"latest N unread for user\" query.\n> Migration: Non-destructive (new table). No backfill required.\n\n---\n\n### Implementation Steps\n\nBad (too vague):\n> 1. Build the backend.\n> 2. Build the frontend.\n> 3. Add tests.\n\nGood (specific):\n> 1. [Migration] Add `notifications` table and RLS policy. Non-destructive; safe to ship independently.\n> 2. [Edge Function] `POST /notifications/mark-read` — Zod-validated body, updates `read_at`, returns 204. Blocked by step 1.\n> 3. [React hook] `useNotifications()` — Realtime subscription scoped to `auth.uid()`. Can be built in parallel with step 2.\n> 4. [UI] `<NotificationBell>` — badge count, dropdown list, \"mark all read\" action. Blocked by step 3.\n> 5. [Test] Integration test: verify user A cannot read user B's notifications (RLS enforcement). 
Blocked by step 1.\n\n---\n\n### API Endpoint\n\n> `POST /api/quests/:questId/complete`\n> - Auth: Requires valid JWT; `getUser()` server-side (not `getSession()`).\n> - Authorization: Verify `quest.user_id === authenticatedUser.id` before any mutation.\n> - Request body: `{ evidence: string }` — validated with Zod; `evidence` max 500 chars, non-empty.\n> - Response (200): `{ coinsAwarded: number, newBalance: number }`\n> - Response (404): Quest not found or does not belong to caller. (Do not distinguish between the two — prevents enumeration.)\n> - Response (409): Quest already completed.\n> - Response (422): Schema validation failure with field-level errors.\n\n---\n\n### Architecture Decision Record (inline)\n\n> Decision: Use Supabase Realtime for live notification updates instead of polling.\n> Context: The feature requires users to see new notifications without refreshing. Polling every N seconds introduces latency and unnecessary load. Realtime is already available in the project infrastructure.\n> Consequences: Simpler client code (no polling interval to manage); subscription must be cleaned up on component unmount to avoid leaks; does not work for users behind restrictive firewalls (acceptable for this use case).\n\n---\n\n## 4. Common Planning Anti-Patterns\n\n### Premature Generalization\n(YAGNI — Extreme Programming, Kent Beck & Ron Jeffries)\n\nThe plan designs a general-purpose system for one concrete use case. Examples: building a \"plugin architecture\" when one integration is needed; an \"event bus\" when one event type exists; an \"action system\" for a single action type.\n\nSignal: The architecture section describes abstractions (interfaces, factories, registries) where no concrete second implementation exists or is planned.\n\nRemedy: Design for the concrete case. 
Note in Out of Scope that generalization is deferred until a second concrete case exists.\n\n---\n\n### Over-Complex Control Flow\n(KISS — Clarence Johnson)\n\nThe design requires a developer to trace through several interacting systems to follow one user action. Each hop (component → service → event → consumer → database) multiplies failure modes and debugging surface.\n\nSignal: Implementation steps require more than 3 conceptual layers for a straightforward operation.\n\nRemedy: Simplify the call chain. Prefer direct calls over event-driven patterns until the added complexity is justified by a concrete requirement (e.g., \"multiple independent consumers\", \"decoupled deployment\").\n\n---\n\n### Missing Error States in User-Facing Behavior\n(Defensive Programming — Steve McConnell, Code Complete)\n\nThe user-facing behavior section describes only the happy path. Network failures, validation errors, empty states, and permission-denied cases are left undefined. These become inconsistent behavior implemented ad-hoc during implementation.\n\nSignal: The user-facing behavior section has no \"when X fails, the user sees…\" entries.\n\nRemedy: For every user-visible action, add an explicit error state: what message appears, where it appears, and whether the user can recover (retry vs. dead end).\n\n---\n\n### Unstated Assumptions\n(The Pragmatic Programmer — Hunt & Thomas: \"Don't Assume, Check\")\n\nThe plan assumes an external API contract, an existing service capability, a team decision, or an infrastructure arrangement that has not been confirmed. 
These become discovered blockers during implementation.\n\nSignal: Phrases like \"we'll integrate with X\", \"X already supports this\", or \"the infra team will handle Y\" without a reference or confirmation.\n\nRemedy: Flag every unconfirmed assumption as an explicit open question in Risks & Open Questions, with a named owner and a decision deadline if possible.\n\n---\n\n### Circular Module Dependencies\n(Clean Architecture — Robert C. Martin)\n\nThe architecture introduces a dependency cycle: A imports B, B imports C, C imports A. This prevents independent testing, makes initialization order fragile, and is a source of \"works but nobody knows why\" bugs.\n\nSignal: In the dependency graph, any arrow forms a loop.\n\nRemedy: Extract the shared dependency into a new module that both A and C depend on (and that itself depends on neither), or invert one dependency using an interface (Dependency Inversion Principle).\n\n---\n\n### Data Model Without Constraints\n(Defensive Programming; database design best practices)\n\nNew tables are defined without `NOT NULL`, `CHECK`, or explicit FK `ON DELETE` behavior. Constraints are the last line of defense — they enforce correctness even when the application layer has a bug or is bypassed (e.g., a direct DB migration, a future code path).\n\nSignal: A table definition where any column that should always have a value lacks `NOT NULL`; a financial amount column without a `CHECK (amount > 0)` constraint; a FK without a stated `ON DELETE` policy.\n\nRemedy: For every new column, explicitly state: nullable or not, default value, and any domain constraint. For every FK: `CASCADE`, `SET NULL`, or `RESTRICT` — never leave it unstated.\n"},"migration-audit":{"content":"---\nname: migration-audit\ndescription: Audit PL/pgSQL migration files for correctness bugs, missing constraints, race conditions, NULL traps, and data integrity gaps. Use AUTOMATICALLY before presenting any new or modified SQL migration file to the user. 
Triggers on writing .sql files in supabase/migrations/, creating PL/pgSQL functions, or reviewing database schema changes.\n---\n\n# PL/pgSQL Migration Audit\n\nRun this audit on EVERY migration file BEFORE presenting it to the user. Do not show the migration until all checks pass or issues are documented.\n\nSee [REFERENCE.md](REFERENCE.md) for detailed failure modes, code examples, and source citations for each check.\n\n## Pre-audit: Read Dependencies\n\nBefore auditing, read ALL referenced tables' CREATE TABLE statements to understand:\n- Column types, NOT NULL constraints, DEFAULT values\n- CHECK constraints, UNIQUE constraints, partial unique indexes\n- Foreign key relationships and ON DELETE behavior\n- Existing indexes that affect ON CONFLICT behavior\n\n## Checklist\n\n### 1. NULL Safety\n\nFor EVERY function parameter and EVERY variable used in a comparison:\n\nComparisons:\n- Does any `!=` / `<>` break on NULL? (`NULL != 'x'` → NULL → falsy, skips the block)\n- Does any `=` silently match nothing on NULL? (`WHERE col = NULL` → zero rows always)\n- FIX: `IS DISTINCT FROM` for NULL-safe inequality, `IS NOT DISTINCT FROM` for NULL-safe equality\n- Is `NOT IN` used with a subquery that could contain NULLs? (Returns ZERO rows — use `NOT EXISTS`)\n\nVariables:\n- Are any PL/pgSQL variables used before assignment? (All default to NULL — `counter + 1` = NULL)\n- Does any string concatenation include a potentially-NULL value? (`'text' \|\| NULL` = NULL)\n- Does any arithmetic include a potentially-NULL value? (`5 + NULL` = NULL)\n\nCASE expressions:\n- Is `CASE var WHEN NULL` used? (Never matches — use `CASE WHEN var IS NULL`)\n\nParameters:\n- Do NOT add cosmetic NULL guards on internal/service-role RPCs validated by the caller\n- DO add a NULL guard ONLY when NULL causes a silent logic bug (wrong result, not just ugly error)\n- For each parameter: trace what happens if NULL. If the function errors → fine. 
If it silently produces wrong results or skips a check → add a guard.\n\n### 2. TOCTOU / Race Conditions\n\nFor EVERY read-then-write sequence:\n\nIdentify the pattern:\n- SELECT/EXISTS → INSERT: Can two concurrent calls both pass the check and both INSERT?\n - FIX: `INSERT ... ON CONFLICT` or catch `unique_violation`\n- SELECT/EXISTS → UPDATE: Can another transaction modify the row between check and update?\n - FIX: Put the check in the UPDATE's WHERE clause (atomic single-statement)\n- SELECT → compute → UPDATE (read-modify-write): Is the computed value stale?\n - FIX: Atomic relative UPDATE (`SET col = col - amount WHERE col >= amount RETURNING col`)\n\nWhy atomic UPDATE is the gold standard: Under READ COMMITTED, if a concurrent UPDATE modifies the target row, a subsequent UPDATE waits for the first to commit, then re-evaluates its WHERE clause against the newly committed row. This makes single-statement UPDATE inherently TOCTOU-safe.\n\nRow locking (when atomic single-statement isn't sufficient):\n- Is `FOR UPDATE` / `FOR NO KEY UPDATE` needed to serialize a multi-statement decision?\n- Which row to lock? (Lock the row whose state determines the decision)\n- Prefer `FOR NO KEY UPDATE` when not changing PK / FK-referenced columns (allows concurrent `FOR KEY SHARE`)\n\nDeadlock risk (when locking multiple rows):\n- Are rows locked in a consistent, deterministic order? (e.g., `ORDER BY id FOR UPDATE`)\n\nAdvisory locks (when row locks don't fit):\n- Is `pg_advisory_xact_lock` appropriate? (auto-released at transaction end — preferred for short-lived locks)\n- Session-level `pg_advisory_lock` requires explicit unlock and survives rollback — use only when needed\n- Is the lock key deterministic and collision-free?\n\n### 3. Unique Constraints & ON CONFLICT\n\nFor EVERY INSERT:\n- What unique constraint catches a duplicate? 
Name it explicitly.\n- Is `unique_violation` caught in a BEGIN/EXCEPTION block where needed?\n- If an invariant is enforced by application logic only: should it ALSO be a DB constraint?\n\nFor EVERY `ON CONFLICT`:\n- Does the target constraint/index actually exist in a prior migration? Name it.\n- Is it safe to silently skip (`DO NOTHING`), or should it raise an error?\n- If using `RETURNING`: does NOT return anything on `DO NOTHING` path — is this handled?\n- Could multiple unique constraints be violated? (ON CONFLICT targets only ONE — a conflict on a DIFFERENT constraint still raises `unique_violation` uncaught)\n\n### 4. Error Handling Completeness\n\nSELECT INTO:\n- Is `FOUND` checked after every `SELECT INTO` where zero rows is possible?\n- `FOUND` = TRUE even if the returned VALUE is NULL (tracks row existence, not value)\n- Alternative: `SELECT INTO STRICT` raises `NO_DATA_FOUND` / `TOO_MANY_ROWS` automatically — use when exactly one row is expected\n\nUPDATE/DELETE:\n- Is `FOUND` checked after every UPDATE/DELETE where zero affected rows is an error?\n\nConstraint violations:\n- Could a constraint violation occur that isn't caught?\n- If caught via EXCEPTION block: is the cost justified? (Each EXCEPTION block creates a subtransaction — avoid in loops)\n\nRAISE EXCEPTION:\n- Does every RAISE include `USING DETAIL` for programmatic error handling?\n- Is there a consistent error code vocabulary across the project?\n\n### 5. JSONB Construction\n\nFor EVERY `jsonb_build_object`, `jsonb_agg`, `jsonb_object_agg`:\n- NULL inclusion: `jsonb_build_object('key', NULL)` produces `{\"key\": null}`, not omission. Is this intended?\n- Empty aggregation: `jsonb_agg` over zero rows returns NULL, not `'[]'::jsonb`. Use `COALESCE(jsonb_agg(...), '[]'::jsonb)`.\n- NULL in arrays: `jsonb_agg` includes NULL elements. Use `jsonb_agg_strict` (PG 16+) or filter with WHERE to exclude.\n- NULL values in objects: `jsonb_object_agg` includes NULL values. 
Use `jsonb_object_agg_strict` (PG 16+) or filter.\n- Use `jsonb_strip_nulls()` to remove null-valued keys from a constructed object when needed.\n\n### 6. Function Volatility\n\nFor EVERY function:\n- Is it marked `VOLATILE` (default), `STABLE`, or `IMMUTABLE`?\n- A function with side effects (INSERT, UPDATE, DELETE, writing to a sequence) must be `VOLATILE`\n- `STABLE` means: cannot modify the database, returns same result within a single statement for same args\n- `IMMUTABLE` means: cannot modify the database, returns same result forever for same args (no DB lookups at all)\n- Mismarking a writing function as `STABLE`/`IMMUTABLE` lets the planner cache or reorder calls, skipping writes\n\n### 7. Financial / Balance Safety (if applicable)\n\n- Atomic deduction pattern used? (`UPDATE ... SET bal = bal - cost WHERE bal >= cost RETURNING bal`)\n- Transaction logged with idempotency key? (prevents double-grant on retry)\n- `balance_after` from RETURNING clause (not computed separately)?\n- CHECK constraint on balance column? If intentionally omitted (e.g., chargebacks), document why.\n- Could ON DELETE CASCADE on a parent table silently destroy financial records?\n\n### 8. Security Template\n\nFor EVERY function:\n- `SECURITY DEFINER` + `SET search_path = ''` (prevents search_path hijacking — CVE-2007-2138)\n- `REVOKE EXECUTE FROM PUBLIC, anon, authenticated` (prevents direct PostgREST `/rpc/` calls)\n- `GRANT EXECUTE TO service_role` (or whichever role(s) need access)\n- All table refs fully qualified (e.g., `public.tablename` — required when search_path is empty)\n- No dynamic SQL with string concatenation (use `format(%I, %L)` or `EXECUTE ... USING`)\n- REVOKE/GRANT signature must match the CREATE FUNCTION signature exactly (including DEFAULT params)\n\n### 9. DDL Safety (if applicable — ALTER TABLE, CREATE INDEX, etc.)\n\n- Does any DDL take ACCESS EXCLUSIVE lock on a table with data? 
(blocks ALL reads/writes)\n- Should `lock_timeout` be set as a safety net?\n- Are constraints added with `NOT VALID` + separate `VALIDATE CONSTRAINT`? (non-blocking for large tables)\n- `SET NOT NULL` after a validated `CHECK (col IS NOT NULL)`? (skips table scan — PG 12+)\n- Are indexes on existing tables created `CONCURRENTLY`? (non-blocking)\n- Can `CREATE INDEX CONCURRENTLY` run inside a transaction? (NO — needs separate migration)\n- Is every DDL statement idempotent? (`IF NOT EXISTS`, `OR REPLACE`, `DROP IF EXISTS` + `CREATE`)\n\n## Verification Pass\n\nBefore finalizing your audit notes, verify every issue you found:\n\n1. Re-read the code: Go back to the flagged lines in full context. Confirm the issue actually exists — not a misread, not handled by a later statement in the same function, not guarded by a constraint or trigger you missed.\n2. Check for existing mitigations: Search the migration file and referenced tables. Is the \"missing\" constraint already defined in a prior migration? Is the race condition prevented by a unique index you didn't notice? If so, drop the finding.\n3. Verify against official docs: For every PostgreSQL behavior you cite (NULL semantics, lock levels, ON CONFLICT rules), confirm your claim is correct. If you're unsure, look it up — don't guess. Use available tools (context7, web search, REFERENCE.md) to check current documentation when uncertain.\n4. Filter by confidence: If you're certain a finding is a false positive after re-reading, drop it entirely. 
If doubt remains but the issue seems plausible, note it in the audit summary as \"Worth Investigating\" — don't fix it without confirmation.\n\n## Output\n\nAfter auditing, include a brief audit summary when presenting the migration:\n\n```\nAudit notes:\n- NULL: [which params/vars checked, any IS DISTINCT FROM needed]\n- TOCTOU: [which row locked and why, or why locking isn't needed]\n- Constraints: [which unique constraints protect each INSERT]\n- Error handling: [FOUND checks present, STRICT used where appropriate, EXCEPTION blocks justified]\n- JSONB: [NULL handling verified for build_object / agg calls]\n- Volatility: [VOLATILE/STABLE/IMMUTABLE correctly assigned]\n- Financial: [atomic deduction + idempotency key, or N/A]\n- Security: [SECURITY DEFINER + search_path + REVOKE/GRANT confirmed]\n- DDL: [lock levels, idempotency, CONCURRENTLY, or N/A]\n```\n\nIf issues are found, fix them BEFORE presenting. Do not present migrations with known issues.\n","reference":"# PL/pgSQL Migration Audit — Reference\n\nDetailed failure modes, code examples, and source citations for each checklist section in [SKILL.md](SKILL.md).\n\n---\n\n## 1. NULL Safety\n\n### Three-Valued Logic\n\nSQL comparisons with NULL yield NULL (not TRUE or FALSE). NULL is falsy in WHERE/IF contexts.\n\n> \"Ordinary comparison operators yield null (signifying 'unknown'), not true or false, when either input is null. For example, `7 = NULL` yields null, as does `7 <> NULL`.\"\n> — [PostgreSQL: Comparison Functions and Operators](https://www.postgresql.org/docs/current/functions-comparison.html)\n\n```sql\nNULL = 1 -- NULL (not FALSE)\nNULL != 1 -- NULL (not TRUE)\nNULL = NULL -- NULL (not TRUE)\nNULL != NULL -- NULL (not FALSE)\n```\n\n### The != NULL Trap\n\nThe most dangerous NULL bug in PL/pgSQL. 
When an IF block uses `!=` on a value that can be NULL, the entire block is silently skipped:\n\n```sql\n-- BUG: If v_role is NULL, this block is SKIPPED — ownership check bypassed\nIF v_role != 'admin' THEN\n RAISE EXCEPTION 'Forbidden';\nEND IF;\n\n-- FIX: IS DISTINCT FROM treats NULL as a comparable value\nIF v_role IS DISTINCT FROM 'admin' THEN\n RAISE EXCEPTION 'Forbidden';\nEND IF;\n```\n\n### IS DISTINCT FROM / IS NOT DISTINCT FROM\n\n> \"Not equal, treating null as a comparable value.\"\n> \"Equal, treating null as a comparable value.\"\n> \"Thus, these predicates effectively act as though null were a normal data value, rather than 'unknown'.\"\n> — [PostgreSQL: Comparison Functions and Operators](https://www.postgresql.org/docs/current/functions-comparison.html)\n\n\| a \| b \| `a = b` \| `a IS NOT DISTINCT FROM b` \|\n\|------\|------\|---------\|----------------------------\|\n\| 1 \| 1 \| TRUE \| TRUE \|\n\| 1 \| NULL \| NULL \| FALSE \|\n\| NULL \| NULL \| NULL \| TRUE \|\n\n### The NOT IN Trap\n\n> \"If there are no equal right-hand values and at least one right-hand row yields null, the result of the `NOT IN` construct will be null, not true.\"\n> — [PostgreSQL: Subquery Expressions](https://www.postgresql.org/docs/current/functions-subquery.html)\n\n`NOT IN (1, 2, NULL)` expands to `val != 1 AND val != 2 AND val != NULL`. Since `val != NULL` is NULL, the AND chain never evaluates to TRUE. 
The query returns zero rows.\n\n```sql\n-- BUG: Returns ZERO rows if subquery has even one NULL\nSELECT id FROM orders WHERE id NOT IN (SELECT order_id FROM details);\n\n-- FIX: NOT EXISTS is NULL-safe\nSELECT id FROM orders o\nWHERE NOT EXISTS (SELECT 1 FROM details d WHERE d.order_id = o.id);\n```\n\n### Uninitialized Variables\n\nPL/pgSQL variables default to NULL, not zero or empty string:\n\n```sql\nDECLARE\n counter INTEGER; -- NULL, not 0\nBEGIN\n counter := counter + 1; -- NULL + 1 = NULL\n```\n\n### String Concatenation and Arithmetic\n\n```sql\n'Hello' \|\| NULL -- NULL (not 'Hello')\n5 + NULL -- NULL\nNULL / 0 -- NULL (not division_by_zero!)\n```\n\nUse `COALESCE` or `concat()` (which treats NULL as empty string).\n\n### CASE WHEN NULL\n\n```sql\n-- BUG: Simple CASE uses = internally; NULL = NULL is NULL, never matches\nCASE v_status WHEN NULL THEN 'unknown' END;\n\n-- FIX: Searched CASE with IS NULL\nCASE WHEN v_status IS NULL THEN 'unknown' END;\n```\n\n### When to Add NULL Parameter Guards\n\nFor internal RPCs where the caller validates inputs, do not add blanket NULL guards. PostgreSQL's own mechanism for this is the `STRICT` keyword:\n\n> \"`RETURNS NULL ON NULL INPUT` or `STRICT` indicates that the function always returns null whenever any of its arguments are null. If this parameter is specified, the function is not executed when there are null arguments; instead a null result is assumed automatically.\"\n> — [PostgreSQL: CREATE FUNCTION](https://www.postgresql.org/docs/current/sql-createfunction.html)\n\nOnly add a guard when NULL causes a silent logic bug:\n1. Trace what happens if the parameter is NULL\n2. If the function raises an error (ugly or not) → no guard needed\n3. If it silently produces wrong results or skips a check → add a guard with a comment explaining the bug\n\n---\n\n## 2. 
TOCTOU / Race Conditions\n\n### READ COMMITTED Re-evaluation\n\nUnder READ COMMITTED (PostgreSQL default), UPDATE/DELETE/SELECT FOR UPDATE wait for concurrent transactions, then re-evaluate the WHERE clause against the updated row:\n\n> \"Such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back.\"\n>\n> \"The search condition of the command (the `WHERE` clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.\"\n> — [PostgreSQL: Transaction Isolation](https://www.postgresql.org/docs/current/transaction-iso.html)\n\nThis is why atomic single-statement UPDATE is the gold standard — the check and write are one operation, and PostgreSQL handles concurrency automatically.\n\n### Anti-Pattern: SELECT then INSERT\n\n```sql\n-- BUG: Two concurrent calls both find no row, both insert\nIF NOT EXISTS (SELECT 1 FROM tags WHERE name = p_name) THEN\n INSERT INTO tags (name) VALUES (p_name);\nEND IF;\n\n-- FIX: INSERT ON CONFLICT\nINSERT INTO tags (name) VALUES (p_name) ON CONFLICT (name) DO NOTHING;\n```\n\n### Anti-Pattern: EXISTS check then UPDATE\n\n```sql\n-- BUG: Row could be deleted/modified between check and update\nIF EXISTS (SELECT 1 FROM orders WHERE id = p_id AND status = 'pending') THEN\n UPDATE orders SET status = 'processing' WHERE id = p_id;\nEND IF;\n\n-- FIX: Atomic — put check in WHERE clause\nUPDATE orders SET status = 'processing' WHERE id = p_id AND status = 'pending';\nIF NOT FOUND THEN RAISE EXCEPTION '...'; END IF;\n```\n\n### Anti-Pattern: Read-Modify-Write\n\n```sql\n-- BUG: Two concurrent calls read same balance, both write back\nSELECT balance INTO v_bal FROM accounts WHERE id = p_id;\nv_bal := v_bal - p_amount;\nUPDATE accounts SET balance = 
v_bal WHERE id = p_id;\n\n-- FIX: Atomic relative UPDATE with WHERE guard\nUPDATE accounts SET balance = balance - p_amount\nWHERE id = p_id AND balance >= p_amount\nRETURNING balance INTO v_bal;\n```\n\n### Row Locking\n\n> \"FOR UPDATE causes the rows retrieved by the SELECT statement to be locked as though for update. This prevents them from being locked, modified or deleted by other transactions until the current transaction ends.\"\n> — [PostgreSQL: Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)\n\nFOR UPDATE vs FOR NO KEY UPDATE:\n\n> \"`FOR NO KEY UPDATE` behaves similarly to `FOR UPDATE`, except that the lock acquired is weaker: this lock will not block `SELECT FOR KEY SHARE` commands that attempt to acquire a lock on the same rows.\"\n> — [PostgreSQL: Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)\n\nPrefer `FOR NO KEY UPDATE` when not changing PK / FK-referenced columns — it avoids blocking child table FK inserts.\n\nWhich row to lock: Lock the row whose state determines the decision.\n\n### Deadlocks\n\n> \"The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.\"\n> — [PostgreSQL: Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)\n\n```sql\n-- Consistent order: always lock by ascending ID\nSELECT * FROM accounts WHERE user_id IN (p_from, p_to) ORDER BY user_id FOR UPDATE;\n```\n\n### Advisory Locks\n\nUse when row locks don't fit (e.g., locking a logical concept, not a specific row):\n\n> \"PostgreSQL provides a means for creating locks that have application-defined meanings. 
These are called advisory locks, because the system does not enforce their use — it is up to the application to use them correctly.\"\n> — [PostgreSQL: Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)\n\nTransaction-level (`pg_advisory_xact_lock`):\n> \"Transaction-level lock requests... are automatically released at the end of the transaction, and there is no explicit unlock operation.\"\n\nSession-level (`pg_advisory_lock`):\n> \"Once acquired at session level, an advisory lock is held until explicitly released or the session ends... a lock acquired during a transaction that is later rolled back will still be held following the rollback.\"\n\nTransaction-level is almost always what you want. Session-level survives rollback, which is surprising and dangerous.\n\n> \"While a flag stored in a table could be used for the same purpose, advisory locks are faster, avoid table bloat, and are automatically cleaned up by the server at the end of the session.\"\n\nDangerous pattern with LIMIT:\n> \"In certain cases... care must be taken to control the locks acquired because of the order in which SQL expressions are evaluated.\"\n\n```sql\n-- DANGEROUS: LIMIT may not be applied before the lock function executes\nSELECT pg_advisory_lock(id) FROM foo WHERE id > 12345 LIMIT 100;\n\n-- SAFE: subquery forces LIMIT first\nSELECT pg_advisory_lock(q.id) FROM (\n SELECT id FROM foo WHERE id > 12345 LIMIT 100\n) q;\n```\n\n---\n\n## 3. Unique Constraints & ON CONFLICT\n\n### ON CONFLICT Targets a Single Constraint\n\n`ON CONFLICT (col)` targets one specific unique index. 
If the table has multiple unique constraints, a conflict on a DIFFERENT constraint still raises `unique_violation`.\n\n### RETURNING Does Not Fire on DO NOTHING\n\n> \"Only rows that were successfully inserted or updated will be returned.\"\n> — [PostgreSQL: INSERT](https://www.postgresql.org/docs/current/sql-insert.html)\n\nIf you need the existing row's ID after a DO NOTHING conflict:\n\n```sql\nWITH ins AS (\n INSERT INTO tags (name) VALUES (p_name)\n ON CONFLICT (name) DO NOTHING\n RETURNING id\n)\nSELECT id FROM ins\nUNION ALL\nSELECT id FROM tags WHERE name = p_name\nLIMIT 1;\n```\n\n### Index Inference vs Named Constraint\n\n> \"It is often preferable to use unique index inference rather than naming a constraint directly using `ON CONFLICT ON CONSTRAINT`. Inference will continue to work correctly when the underlying index is replaced by another more or less equivalent index.\"\n> — [PostgreSQL: INSERT](https://www.postgresql.org/docs/current/sql-insert.html)\n\n### EXCEPTION Block for unique_violation\n\nWhen ON CONFLICT can't cover all scenarios (multiple unique constraints):\n\n```sql\nBEGIN\n INSERT INTO owned_items (user_id, item_id) VALUES (p_uid, p_id);\nEXCEPTION WHEN unique_violation THEN\n RAISE EXCEPTION 'Already owned' USING DETAIL = 'ALREADY_EXISTS';\nEND;\n```\n\nPerformance cost — see §4.\n\n---\n\n## 4. Error Handling\n\n### SELECT INTO and FOUND\n\n`SELECT INTO` (without STRICT) sets the target to NULL on no rows — no error raised. 
Must check `FOUND`:\n\n> \"If `STRICT` is not specified in the `INTO` clause, then target will be set to the first row returned by the command, or to nulls if the command returned no rows.\"\n> — [PostgreSQL: Basic Statements](https://www.postgresql.org/docs/current/plpgsql-statements.html)\n\nCritical nuance: FOUND tracks row existence, not value existence:\n\n> \"A `SELECT INTO` sets `FOUND` true if a row is assigned, false if no row is returned.\"\n> — [PostgreSQL: Basic Statements](https://www.postgresql.org/docs/current/plpgsql-statements.html)\n\nFOUND = TRUE even if the returned value is NULL.\n\n### SELECT INTO STRICT\n\nWhen exactly one row is expected, `STRICT` eliminates manual FOUND checking:\n\n> \"If the `STRICT` option is specified, the command must return exactly one row or a run-time error will be reported, either `NO_DATA_FOUND` (no rows) or `TOO_MANY_ROWS` (more than one row).\"\n> — [PostgreSQL: Basic Statements](https://www.postgresql.org/docs/current/plpgsql-statements.html)\n\n```sql\n-- Without STRICT: must check FOUND manually\nSELECT col INTO v_val FROM t WHERE id = p_id;\nIF NOT FOUND THEN RAISE EXCEPTION '...' USING DETAIL = 'NOT_FOUND'; END IF;\n\n-- With STRICT: automatic error on 0 or >1 rows\nBEGIN\n SELECT col INTO STRICT v_val FROM t WHERE id = p_id;\nEXCEPTION\n WHEN NO_DATA_FOUND THEN\n RAISE EXCEPTION '...' USING DETAIL = 'NOT_FOUND';\nEND;\n```\n\nTrade-off: STRICT requires an EXCEPTION block (subtransaction cost) if you want a custom error message. For simple \"must exist\" lookups where you control the error message, manual FOUND is cheaper. STRICT is most useful when you also need to guard against >1 row.\n\n### EXCEPTION Block Subtransaction Cost\n\n> \"A block containing an `EXCEPTION` clause is significantly more expensive to enter and exit than a block without one. 
Therefore, don't use `EXCEPTION` without need.\"\n> — [PostgreSQL: Control Structures](https://www.postgresql.org/docs/current/plpgsql-control-structures.html)\n\nThe cost comes from implicit subtransactions:\n\n> \"When an error is caught by an `EXCEPTION` clause, the local variables of the PL/pgSQL function remain as they were when the error occurred, but all changes to persistent database state within the block are rolled back.\"\n> — [PostgreSQL: Control Structures](https://www.postgresql.org/docs/current/plpgsql-control-structures.html)\n\nThe 64-subtransaction overflow: Each backend caches up to 64 subtransaction XIDs (`PGPROC_MAX_CACHED_SUBXIDS`). Beyond 64, tracking overflows to the `pg_subtrans` SLRU on disk. Under concurrency, multiple sessions accessing the 32-page SLRU cache causes lock contention and disk I/O. GitLab documented TPS drops from 360,000 to 50,000 from this issue and spent a month eliminating all subtransactions.\n\nSources:\n- [PostgresAI: Subtransactions considered harmful](https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful)\n- [GitLab: Why we spent the last month eliminating PostgreSQL subtransactions](https://about.gitlab.com/blog/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/)\n\nRule: Use `IF`/`FOUND`/`ON CONFLICT` for expected control flow. Reserve EXCEPTION blocks for truly exceptional situations, and never use them in loops.\n\n### PostgREST Error Mapping\n\nRAISE EXCEPTION fields map to JSON response fields:\n- MESSAGE → `\"message\"`\n- DETAIL → `\"details\"`\n- HINT → `\"hint\"`\n\nSQLSTATE to HTTP status:\n\n\| SQLSTATE \| HTTP \| Meaning \|\n\|----------\|------\|---------\|\n\| `P0001` \| 400 \| Default RAISE EXCEPTION \|\n\| `23503` \| 409 \| Foreign key violation \|\n\| `23505` \| 409 \| Unique violation \|\n\| `PT4xx` \| 4xx \| Custom HTTP status (PT prefix) \|\n\nSource: [PostgREST: Errors](https://docs.postgrest.org/en/v12/references/errors.html)\n\n---\n\n## 5. 
JSONB NULL Behavior\n\n### jsonb_build_object Includes NULLs\n\n`jsonb_build_object('key', NULL)` produces `{\"key\": null}` — it does NOT omit the key. If you want to omit null-valued keys, apply `jsonb_strip_nulls()` afterward:\n\n> \"Deletes all object fields that have null values from the given JSON value, recursively.\"\n> — [PostgreSQL: JSON Functions](https://www.postgresql.org/docs/current/functions-json.html)\n\n```sql\n-- Includes null: {\"name\": \"Alice\", \"bio\": null}\njsonb_build_object('name', 'Alice', 'bio', NULL)\n\n-- Strips null: {\"name\": \"Alice\"}\njsonb_strip_nulls(jsonb_build_object('name', 'Alice', 'bio', NULL))\n```\n\n### jsonb_agg Returns NULL on Empty Input, Includes NULLs in Output\n\n> \"Collects all the input values, including nulls, into a JSON array.\"\n> — [PostgreSQL: Aggregate Functions](https://www.postgresql.org/docs/current/functions-aggregate.html)\n\nOver zero rows, `jsonb_agg` returns NULL (like all aggregate functions except `count`):\n\n> \"It should be noted that except for `count`, these functions return a null value when no rows are selected.\"\n> — [PostgreSQL: Aggregate Functions](https://www.postgresql.org/docs/current/functions-aggregate.html)\n\n```sql\n-- Always wrap in COALESCE for empty-set safety\nCOALESCE(jsonb_agg(col), '[]'::jsonb)\n\n-- Use jsonb_agg_strict (PG 16+) to exclude NULL elements\n-- \"Collects all the input values, skipping nulls, into a JSON array.\"\nCOALESCE(jsonb_agg_strict(col), '[]'::jsonb)\n```\n\n### jsonb_object_agg NULL Behavior\n\n> \"Values can be null, but keys cannot.\"\n> — [PostgreSQL: Aggregate Functions](https://www.postgresql.org/docs/current/functions-aggregate.html)\n\nUse `jsonb_object_agg_strict` (PG 16+) to skip entries where the value is NULL.\n\n---\n\n## 6. 
Function Volatility\n\n> \"`IMMUTABLE` indicates that the function cannot modify the database and always returns the same result when given the same argument values.\"\n>\n> \"`STABLE` indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements.\"\n>\n> \"`VOLATILE` indicates that the function value can change even within a single table scan, so no optimizations can be made.\"\n>\n> \"Any function that has side-effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away.\"\n> — [PostgreSQL: CREATE FUNCTION](https://www.postgresql.org/docs/current/sql-createfunction.html)\n\nThe danger: If a function that writes data (INSERT/UPDATE/DELETE) is marked `STABLE` or `IMMUTABLE`, the planner may:\n- Cache the result and skip subsequent calls with the same arguments\n- Reorder calls in ways that break expected execution order\n- Fold the call into a constant during planning\n\nThe default is `VOLATILE`, which is safe. Only change it when you're certain the function meets the stricter contract.\n\n---\n\n## 7. 
Financial / Balance Safety\n\n### The Atomic Deduction Pattern\n\nThe correct pattern for deducting from a balance:\n\n```sql\nUPDATE accounts SET balance = balance - p_cost\nWHERE id = p_id AND balance >= p_cost\nRETURNING balance INTO v_balance_after;\n\nIF NOT FOUND THEN\n RAISE EXCEPTION 'Insufficient balance' USING DETAIL = 'INSUFFICIENT_FUNDS';\nEND IF;\n```\n\nWhy this is safe:\n- Atomic: the read (current balance), check (`>= p_cost`), and write (`- p_cost`) happen in one statement\n- TOCTOU-safe: under READ COMMITTED, concurrent UPDATEs wait and re-evaluate WHERE (see §2)\n- balance_after from RETURNING: the returned value is the actual post-deduction balance, not a separately computed value that could be stale\n\n### Idempotency Keys\n\nEvery financial operation should log a transaction with an idempotency key to prevent double-grant on retry:\n\n```sql\nINSERT INTO transactions (user_id, amount, type, idempotency_key, balance_after)\nVALUES (p_uid, -p_cost, 'purchase', p_idempotency_key, v_balance_after);\n-- unique constraint on idempotency_key prevents duplicate\n```\n\nThe idempotency key column must be `NOT NULL` — PostgreSQL treats each NULL as distinct for UNIQUE constraints, so a nullable column would allow unlimited duplicates with NULL keys.\n\n### CASCADE Risk on Financial Tables\n\n`ON DELETE CASCADE` on a parent table (e.g., user deletion) can silently destroy financial transaction records. Financial tables should typically use `ON DELETE RESTRICT` or `ON DELETE SET NULL` to preserve the audit trail.\n\n---\n\n## 8. Security\n\n### SECURITY DEFINER + SET search_path\n\n> \"Because a `SECURITY DEFINER` function is executed with the privileges of the user that owns it, care is needed to ensure that the function cannot be misused.\"\n>\n> \"For security, `search_path` should be set to exclude any schemas writable by untrusted users. 
This prevents malicious users from creating objects (e.g., tables, functions, and operators) that mask objects intended to be used by the function.\"\n> — [PostgreSQL: CREATE FUNCTION](https://www.postgresql.org/docs/current/sql-createfunction.html)\n\nWithout `SET search_path = ''`, an attacker can create a temp table shadowing a real table:\n\n```sql\n-- Attacker creates shadow table\nCREATE TEMP TABLE profiles (id UUID, role TEXT DEFAULT 'admin');\n-- SECURITY DEFINER function reads from temp table instead of public.profiles\n```\n\nCVE-2007-2138: [PostgreSQL Security Advisory](https://www.postgresql.org/support/security/CVE-2007-2138/)\n\n`SET search_path = ''` forces all references to be fully qualified (`public.tablename`), eliminating this attack class.\n\n### REVOKE EXECUTE FROM PUBLIC\n\nPostgreSQL grants EXECUTE to PUBLIC by default on ALL functions. In Supabase/PostgREST environments, every function in the `public` schema is callable via `/rpc/`. Without REVOKE, anyone with the anon key can call SECURITY DEFINER functions directly.\n\nSource: [Supabase: Hardening the Data API](https://supabase.com/docs/guides/database/hardening-data-api)\n\n---\n\n## 9. DDL Safety\n\n### Lock Levels\n\n\| DDL Operation \| Lock \| Blocks Reads? \| Blocks Writes? 
\|\n\|--------------\|------\|---------------\|----------------\|\n\| `ADD COLUMN` (nullable, no default) \| ACCESS EXCLUSIVE \| Yes \| Yes \|\n\| `ADD COLUMN DEFAULT` (PG 11+) \| ACCESS EXCLUSIVE \| Yes \| Yes (but instant — no rewrite) \|\n\| `ALTER COLUMN TYPE` \| ACCESS EXCLUSIVE \| Yes \| Yes (full table rewrite) \|\n\| `SET NOT NULL` \| ACCESS EXCLUSIVE \| Yes \| Yes (full table scan — see below) \|\n\| `ADD CHECK NOT VALID` \| ACCESS EXCLUSIVE \| Yes \| Yes (brief — no scan) \|\n\| `VALIDATE CONSTRAINT` \| SHARE UPDATE EXCLUSIVE \| No \| No \|\n\| `CREATE INDEX` \| SHARE \| No \| Yes \|\n\| `CREATE INDEX CONCURRENTLY` \| SHARE UPDATE EXCLUSIVE \| No \| No \|\n\n### SET NOT NULL — Skipping the Table Scan (PG 12+)\n\n`SET NOT NULL` normally requires a full table scan to verify no NULLs exist. Since PG 12, if a valid `CHECK (col IS NOT NULL)` constraint already exists, the scan is skipped:\n\n> \"`SET NOT NULL` may only be applied to a column provided none of the records in the table contain a `NULL` value for the column. 
Ordinarily this is checked during the `ALTER TABLE` by scanning the entire table; however, if a valid `CHECK` constraint exists (and is not dropped in the same command) which proves no `NULL` can exist, then the table scan is skipped.\"\n> — [PostgreSQL: ALTER TABLE](https://www.postgresql.org/docs/current/sql-altertable.html)\n\nSafe pattern for zero-downtime NOT NULL on large tables:\n\n```sql\n-- Step 1: Add CHECK without scanning (brief ACCESS EXCLUSIVE)\nALTER TABLE t ADD CONSTRAINT chk_col_nn CHECK (col IS NOT NULL) NOT VALID;\n-- Step 2: Validate (non-blocking SHARE UPDATE EXCLUSIVE scan)\nALTER TABLE t VALIDATE CONSTRAINT chk_col_nn;\n-- Step 3: SET NOT NULL (skips scan because validated CHECK exists)\nALTER TABLE t ALTER COLUMN col SET NOT NULL;\n-- Step 4: Drop redundant CHECK\nALTER TABLE t DROP CONSTRAINT chk_col_nn;\n```\n\n### Safe Constraint Addition\n\n```sql\n-- Step 1: Add without scanning (brief lock)\nALTER TABLE t ADD CONSTRAINT chk CHECK (col > 0) NOT VALID;\n-- Step 2: Validate (non-blocking scan)\nALTER TABLE t VALIDATE CONSTRAINT chk;\n```\n\n### Statements That Cannot Run in a Transaction\n\n- `CREATE INDEX CONCURRENTLY` / `DROP INDEX CONCURRENTLY`\n- `VACUUM`\n- `CREATE DATABASE` / `DROP DATABASE`\n\nNote on `ALTER TYPE ... ADD VALUE`: Since PG 12, this CAN run inside a transaction block, but the new enum value cannot be used until the transaction commits:\n\n> \"If `ALTER TYPE ... 
ADD VALUE` (the form that adds a new value to an enum type) is executed inside a transaction block, the new value cannot be used until after the transaction has been committed.\"\n> — [PostgreSQL: ALTER TYPE](https://www.postgresql.org/docs/current/sql-altertype.html)\n\nThis means it works with Supabase `db push` (which wraps migrations in a transaction), but you cannot INSERT a row using the new enum value in the same migration file.\n\n### Idempotency Patterns\n\n\| Statement \| Idempotent Pattern \|\n\|-----------\|-------------------\|\n\| Table \| `CREATE TABLE IF NOT EXISTS` \|\n\| Column \| `ALTER TABLE ADD COLUMN IF NOT EXISTS` \|\n\| Index \| `CREATE INDEX IF NOT EXISTS` \|\n\| Function \| `CREATE OR REPLACE FUNCTION` \|\n\| Trigger \| `DROP TRIGGER IF EXISTS` + `CREATE TRIGGER` \|\n\| Policy \| `DROP POLICY IF EXISTS` + `CREATE POLICY` \|\n\| Grants \| Inherently idempotent \|\n\n`CREATE OR REPLACE FUNCTION` cannot change the return type or argument types — must `DROP` + `CREATE` for those.\n\n---\n\n## Sources\n\n### PostgreSQL Official Documentation\n- [Transaction Isolation](https://www.postgresql.org/docs/current/transaction-iso.html)\n- [Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)\n- [INSERT (ON CONFLICT)](https://www.postgresql.org/docs/current/sql-insert.html)\n- [CREATE FUNCTION](https://www.postgresql.org/docs/current/sql-createfunction.html)\n- [ALTER TABLE](https://www.postgresql.org/docs/current/sql-altertable.html)\n- [ALTER TYPE](https://www.postgresql.org/docs/current/sql-altertype.html)\n- [Comparison Functions and Operators](https://www.postgresql.org/docs/current/functions-comparison.html)\n- [Subquery Expressions (NOT IN)](https://www.postgresql.org/docs/current/functions-subquery.html)\n- [JSON Functions and Operators](https://www.postgresql.org/docs/current/functions-json.html)\n- [Aggregate Functions](https://www.postgresql.org/docs/current/functions-aggregate.html)\n- [PL/pgSQL Basic 
Statements](https://www.postgresql.org/docs/current/plpgsql-statements.html)\n- [PL/pgSQL Control Structures](https://www.postgresql.org/docs/current/plpgsql-control-structures.html)\n- [PL/pgSQL Errors and Messages](https://www.postgresql.org/docs/current/plpgsql-errors-and-messages.html)\n\n### CVEs\n- [CVE-2007-2138: search_path injection](https://www.postgresql.org/support/security/CVE-2007-2138/)\n\n### Subtransaction Performance\n- [PostgresAI: Subtransactions considered harmful](https://postgres.ai/blog/20210831-postgresql-subtransactions-considered-harmful) — 64-subtransaction cache limit, SLRU overflow, 20x TPS drop\n- [GitLab: Why we spent the last month eliminating PostgreSQL subtransactions](https://about.gitlab.com/blog/why-we-spent-the-last-month-eliminating-postgresql-subtransactions/) — 360k→50k TPS drop, full elimination strategy\n\n### Supabase / PostgREST\n- [PostgREST: Error Mapping](https://docs.postgrest.org/en/v12/references/errors.html)\n- [Supabase: Hardening the Data API](https://supabase.com/docs/guides/database/hardening-data-api)\n- [Supabase: Row Level Security](https://supabase.com/docs/guides/database/postgres/row-level-security)\n\n### Industry Guides\n- [GoCardless: Zero-downtime Postgres migrations](https://gocardless.com/blog/zero-downtime-postgres-migrations-the-hard-parts/)\n- [Cybertec: Abusing SECURITY DEFINER](https://www.cybertec-postgresql.com/en/abusing-security-definer-functions/)\n"},"security-audit":{"content":"---\nname: security-audit\ndescription: Performs a thorough security audit against established industry standards (OWASP Top 10 2025, OWASP API Security Top 10 2023, CWE taxonomy, GDPR, PCI-DSS). Use when reviewing for security vulnerabilities, hardening production systems, auditing auth/payment/database code, or conducting periodic security reviews. Works on git diffs, specific files, or an entire codebase.\n---\n\n# Security Audit\n\nAudit code against established security standards and threat models. 
Every finding must cite the specific standard ID (OWASP, CWE, GDPR article, etc.) so the developer understands the authoritative source for each requirement. This skill is for security-specific review; for clean code and architecture concerns, use `best-practices-audit` instead.\n\n## Scope\n\nDetermine what to audit based on user request and context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added code and its immediate context\n- File/directory mode: audit the files or directories the user specifies\n- Full audit mode: when the user asks for a full security review, scan all source code (skip vendor/node_modules/build artifacts); prioritize files touching auth, payments, database, and external integrations\n\nRead all in-scope code before producing findings.\n\n## Domains to Evaluate\n\nCheck each domain. Skip domains with no findings. See [REFERENCE.md](REFERENCE.md) for detailed definitions, standard IDs, and concrete examples.\n\n### 1. Authentication & Session Management\n(OWASP A07:2025, CWE-287, CWE-384)\n\n- Using `getSession()` instead of server-side `getUser()` for auth decisions (trusting the JWT without server-side validation)\n- Missing token expiry enforcement; long-lived tokens without rotation\n- Weak or missing logout (session not invalidated server-side)\n- OAuth state parameter missing or not validated (CSRF on OAuth flows)\n- Trusting client-provided user identity without server-side verification\n- Credentials stored in localStorage instead of httpOnly cookies\n\n### 2. 
Authorization & Access Control\n(OWASP A01:2025, OWASP API2:2023, CWE-284, CWE-639)\n\n- BOLA/IDOR: object IDs accepted from user input without ownership verification\n- Missing Row-Level Security (RLS) policies on database tables\n- Privilege escalation paths: routes or RPCs accessible to roles that shouldn't have access\n- Broken function-level auth: admin/internal endpoints not restricted by role\n- REVOKE gaps: functions or tables accessible to PUBLIC or anon when they shouldn't be\n- Assuming the presence of a valid JWT implies authorization (JWT ≠ authz check)\n\n### 3. Injection\n(OWASP A05:2025, CWE-89, CWE-79, CWE-77, CWE-94)\n\n- SQL injection: raw string interpolation in queries; use parameterized queries or an ORM\n- XSS: unsanitized user content inserted into HTML; missing `Content-Security-Policy`\n- Command injection: user input passed to shell commands, `exec()`, `eval()`, `Function()`\n- Template injection: user-controlled strings rendered by a template engine\n- Schema pollution (PostgreSQL): SECURITY DEFINER functions without `SET search_path = ''`; attacker-controlled schemas prepended to search path\n\n### 4. Cryptography & Secrets\n(OWASP A04:2025, CWE-327, CWE-798, CWE-312, CWE-321)\n\n- Hardcoded credentials, API keys, tokens, or secrets in source code or `.env.example`\n- Secrets in environment variables loaded client-side (exposed in browser bundles)\n- Weak hashing algorithms (MD5, SHA-1) used for security purposes\n- Tokens or sensitive data stored in plaintext in the database instead of a secrets vault\n- Missing HTTPS enforcement; secrets transmitted over HTTP\n- JWT secrets that are short, guessable, or shared across environments\n\n### 5. Input Validation & Output Encoding\n(CWE-20, CWE-116, CWE-601, OWASP A05:2025)\n\n- No schema validation (Zod, Yup, JSON Schema, etc.) 
at API boundaries\n- Validation only on the client, not enforced on the server\n- Missing length/range constraints on user-supplied strings (no `maxLength`, no `CHECK` constraint)\n- Missing content-type validation on file uploads\n- Open redirects: user-controlled URL passed directly to redirect without allowlist validation\n- Missing `encodeURIComponent` on user data placed in URLs\n\n### 6. API Security\n(OWASP API Top 10 2023)\n\n- API1 — BOLA: resources returned or modified by user-supplied ID without ownership check\n- API2 — Broken Auth: unprotected endpoints, missing JWT verification, bearer token in URL\n- API3 — Broken Object Property Level Auth: response includes fields (e.g. `role`, `coins`, `internal_id`) that the caller should not see\n- API4 — Unrestricted Resource Consumption: no rate limiting, pagination, or request size limits\n- API5 — Broken Function Level Auth: non-public actions (admin, delete, ban) not verified against caller's role\n- API7 — SSRF: URL parameters or webhook URLs accepted from user input without allowlist validation\n- API8 — Security Misconfiguration: permissive CORS (`*`), verbose error messages leaking stack traces or schema details, debug endpoints in production\n- API10 — Unsafe Consumption of APIs: external API responses trusted without validation; webhooks not verified via HMAC signature\n\n### 7. 
Database Security\n(CWE-250, CWE-284, PostgreSQL Security Best Practices)\n\n- Tables created without `ENABLE ROW LEVEL SECURITY`\n- Missing `REVOKE EXECUTE` on SECURITY DEFINER functions from `PUBLIC`, `authenticated`, `anon`\n- SECURITY DEFINER functions without `SET search_path = ''` (schema pollution vector)\n- Missing `REVOKE TRUNCATE` on financial, audit, or compliance tables\n- Overly permissive RLS policies (e.g., `USING (true)` on sensitive tables)\n- Direct client-to-database connections bypassing application security layer\n- Sensitive columns (tokens, PII) stored in plaintext instead of encrypted columns or vault references\n- Missing `CHECK` constraints on financial columns (e.g., balance `>= 0`, amount sign validation)\n\n### 8. Rate Limiting & Denial-of-Service\n(OWASP API4:2023, CWE-770, CWE-400)\n\n- No rate limiting on authentication endpoints (brute force enabler)\n- No rate limiting on expensive operations (sync, export, AI calls, file uploads)\n- Rate limits implemented in-memory per process/isolate (bypassed by horizontal scaling or redeployment)\n- Missing request body size limits (memory exhaustion)\n- Unbounded database queries without `LIMIT` clause (full table scan DoS)\n- No backoff or circuit breaker for outbound calls to third-party services\n\n### 9. Concurrency & Race Conditions\n(CWE-362, CWE-367 TOCTOU)\n\n- Check-then-act patterns on financial or inventory data without database-level locking\n- Double-spend or double-grant risk: no idempotency key or `ON CONFLICT DO NOTHING` guard\n- Missing advisory locks or `SELECT FOR UPDATE` on critical rows during multi-step transactions\n- Non-atomic read-modify-write sequences on shared state (coin balance, stock count, etc.)\n- Idempotency keys that can be `NULL` (treated as distinct by PostgreSQL UNIQUE, allowing bypass)\n\n### 10. 
Financial & Transaction Integrity\n(PCI-DSS Req 6 & 10, CWE-362)\n\n- Client-side coin/credit/reward calculation (any value trusted from client is a vulnerability)\n- Missing `CHECK` constraint on transaction amount sign (credits vs. debits not enforced at DB level)\n- Coin or balance modification without an audit trail (append-only transaction log)\n- Webhook events not deduplicated by a provider-assigned event ID (replay attack enabler)\n- Webhook signature not verified (unauthenticated financial state changes)\n- Deletion of financial transaction records (violates audit trail requirements; potential legal violation)\n- Missing `NOT NULL` on idempotency key column for transaction tables\n\n### 11. Security Logging & Monitoring\n(OWASP A09:2025, CWE-778, CWE-117)\n\n- Security-relevant events not logged (auth failures, permission denials, validation failures, HMAC failures)\n- Log injection: unsanitized user input included directly in log messages\n- Sensitive data (passwords, tokens, card numbers, PII) written to logs\n- No structured logging — free-text logs that can't be queried or alerted on\n- Missing correlation between security events and user/request IDs\n- No alerting or anomaly detection on suspicious event patterns\n- Logs stored in a volatile medium (in-memory, ephemeral filesystem) that survives restarts but not scaling events\n\n### 12. Secrets & Environment Security\n(CWE-798, CWE-312, 12-Factor App)\n\n- Secrets committed to git (`.env`, private keys, API tokens in source files)\n- Fallback to insecure defaults when env vars are absent (e.g., CORS origin falling back to `*`)\n- Using the same secrets across development, staging, and production environments\n- Secrets logged or included in error messages\n- Client-side environment variables (prefixed `VITE_`, `NEXT_PUBLIC_`, etc.) containing server-side secrets\n- Secrets passed as CLI arguments (visible in process list)\n\n### 13. Data Privacy & Retention\n(GDPR Art. 
5/17/25, CCPA, CWE-359)\n\n- PII stored longer than necessary (no retention policy or purge cron)\n- No anonymization path for account deletion (right to erasure, GDPR Art. 17)\n- PII in logs, error messages, or analytics events that shouldn't be there\n- Missing `ON DELETE SET NULL` or equivalent for user-linked tables that must survive account deletion\n- Financial records with FK `ON DELETE CASCADE` that would purge legally required audit evidence\n- No consent record for data collection (GDPR Art. 6)\n- User data returned in API responses without field-level access checks (over-fetching)\n\n### 14. Security Misconfiguration\n(OWASP A02:2025, CWE-16)\n\n- Permissive CORS (reflecting request `Origin` without validation + `Access-Control-Allow-Credentials: true`)\n- Missing `Content-Security-Policy`, `X-Frame-Options`, `X-Content-Type-Options` headers\n- HTTP used instead of HTTPS; missing HSTS header\n- Debug/development endpoints or verbose error responses in production\n- Default credentials or example configurations deployed\n- Database or storage buckets with public access that should be private\n- Missing `SameSite` attribute on session cookies\n- JWT verification disabled on functions that handle authenticated user data\n\n### 15. Supply Chain & Dependency Security\n(OWASP A03:2025, CWE-1357)\n\n- Dependencies with known CVEs (run `npm audit`, `pip-audit`, `bun audit`)\n- Unpinned dependency versions (`*`, `latest`, `^` for production dependencies)\n- Dependencies pulled from non-official registries without integrity hashing\n- Dev dependencies installed in production containers\n- Missing Subresource Integrity (SRI) hashes on CDN-loaded scripts\n\n### 16. 
TypeScript / JavaScript Specific\n(CWE-843 Type Confusion, CWE-915 Improperly Controlled Modification)\n\n- `as any` or `as unknown as T` casts that bypass type checking on externally-sourced data\n- Prototype pollution: `Object.assign(target, userControlledObject)` or spread of unvalidated input onto objects\n- `eval()`, `new Function()`, `setTimeout(string)`, or `innerHTML =` with user-controlled content\n- `JSON.parse()` result used without validation (treat parsed JSON as `unknown`, not `any`)\n- Converting between `bigint` and `number` without range checks (silent precision loss above `Number.MAX_SAFE_INTEGER`)\n- Async functions missing `await` on promises that should be awaited (unhandled rejection, ordering bug)\n\n## Static Analysis Tools\n\nBefore producing findings, run available tools on in-scope code. Incorporate tool output into your findings (cite the tool rule alongside the standard ID).\n\n### npm / bun audit (dependency vulnerabilities)\n```bash\nnpm audit --audit-level=moderate # or: bun audit\n```\nMap findings to OWASP A03:2025 and the specific CVE ID.\n\n### ESLint with security plugins\n```bash\n# Check for eslint-plugin-security in devDependencies first\nnpx eslint src/\n```\nKey rules to look for: `security/detect-object-injection`, `security/detect-non-literal-regexp`, `no-eval`, `no-implied-eval`.\n\n### Semgrep (if available)\n```bash\nsemgrep --config=p/owasp-top-ten .\nsemgrep --config=p/typescript .\n```\n\n### Ruff with Bandit rules (Python)\n```bash\nruff check --select S . # Bandit security rules\n```\n\n### How to use tool output\n1. Map each tool finding to its security domain (e.g., a SQL injection ESLint rule → Domain 3: Injection).\n2. Critical CVEs or injection/auth findings → Critical. Outdated deps with low-severity CVEs → Warning or Suggestion.\n3. If a tool is not present or produces no findings, note \"npm audit: clean\" etc. 
in the Summary.\n\n## API & Tech Stack Verification\n\nBefore finalizing findings, verify security-relevant API and SDK usage against official documentation:\n\n- Look up official docs: If the code uses a specific SDK, API, or service (e.g. Supabase auth, Stripe, OAuth providers), consult the official documentation to confirm the correct security usage pattern. Do not rely on training knowledge — APIs change, and incorrect usage is frequently a Critical security flaw that looks correct to a code reviewer.\n- Use available MCP tools: Check if available MCP tools (Supabase MCP, Vercel MCP, etc.) can provide faster or more authoritative access to official docs.\n- Wrong API usage = security finding: If code uses an API in a non-standard or incorrect way that bypasses security controls (e.g. trusting client-side session data instead of server-side verification), it must be reported as a finding at the appropriate severity — not treated as a style issue.\n\n## False Positive Filtering\n\nBefore including any finding in the report, apply these filters in order. A report with 3 real findings is more valuable than one with 3 real findings buried in 12 noise items.\n\n### Hard Exclusions\n\nAutomatically exclude findings that match these categories — do not report them even as Low:\n\n1. Pure DoS / resource exhaustion without an auth bypass or data-integrity component. Domain 8 items belong in the report only when combined with another vulnerability class (e.g., unbounded query + missing auth = Critical, unbounded query alone = excluded).\n2. Theoretical race conditions without a concrete exploitation path. Only report a race condition if you can describe the specific interleaving of requests that causes harm (e.g., double-spend). \"This read-modify-write could race\" is not a finding.\n3. Outdated dependency versions — these are surfaced by `npm audit` / `bun audit` output in the Summary section. 
Do not create individual findings for known CVEs in third-party libraries; that is the dependency scanner's job.\n4. Missing hardening with no attack vector — e.g., \"should add CSP header\" when there is no XSS vector in the application, or \"should add rate limiting\" on an internal-only endpoint. A missing defense layer is only a finding when the attack it defends against is actually possible.\n5. Test-only code — unit tests, fixtures, test helpers, mocks, and seed scripts. Exception: test files that contain real secrets or credentials.\n6. Log spoofing / unsanitized log output — unless the log output feeds a downstream system that parses and acts on log content (SIEM injection, log-based alerting bypass).\n7. Regex injection / ReDoS — unless the regex runs on untrusted input in a hot path with no timeout and you can demonstrate catastrophic backtracking.\n8. Documentation-only files — markdown, JSDoc comments, README content. These are not executable.\n9. Client-side validation gaps when server-side validation exists — missing Zod schema in a React form is a UX concern, not a security finding, if the API endpoint validates the same input.\n10. SSRF limited to path control — only report SSRF when the attacker can control the host or protocol. Path-only SSRF is not exploitable in practice.\n11. Memory safety issues in memory-safe languages — buffer overflows, use-after-free, etc. are impossible in TypeScript, Python, Go, Rust, and Java. Do not report them.\n12. Secrets or credentials stored on disk if they are otherwise secured (e.g., encrypted at rest, in a secrets vault, or managed by a dedicated process).\n\n### Framework & Language Precedents\n\nThese are established rulings — patterns that are NOT vulnerabilities by themselves:\n\n1. React / Angular / Vue are XSS-safe by default. Only flag XSS when using `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, `v-html`, `[innerHTML]`, or equivalent escape hatches. 
Normal JSX interpolation (`{userInput}`) is auto-escaped.\n2. UUIDs (v4) are unguessable. Don't flag UUID-based resource access as IDOR unless the real issue is missing ownership verification (the problem is the missing WHERE clause, not the identifier format).\n3. Environment variables and CLI flags are trusted input. Attacks requiring attacker-controlled env vars are invalid in standard deployment models. Do not flag `process.env.X` as \"unsanitized input.\"\n4. Client-side code does not need auth checks. The backend is responsible for authorization. Missing permission guards in React components, API client wrappers, or frontend route guards are not security findings — they are UX decisions.\n5. GitHub Actions: most injection vectors are not exploitable. Only flag when untrusted input (PR title, branch name, issue body, commit message) flows into `run:` steps via `${{ }}` expression injection without intermediate sanitization.\n6. Jupyter notebooks run locally. Only flag if untrusted external input reaches code execution, not just because a cell calls `eval()` on a hardcoded string.\n7. Shell scripts with no untrusted input are safe. Command injection requires untrusted user input flowing into the script. Scripts that only process env vars, hardcoded paths, or pipeline-internal values are not vulnerable.\n8. `JSON.parse()` is not a vulnerability. Only a finding if the parsed result is used without validation in a security-critical path (auth decisions, financial calculations, SQL query construction).\n9. Logging non-PII data is safe. Only report logging findings when secrets (passwords, tokens, API keys) or personally identifiable information is written to logs. Logging URLs, request metadata, or error messages is not a vulnerability.\n\n### Confidence Gate\n\nBefore including any finding, answer these three questions:\n\n1. Concrete attack path? Can you describe the specific HTTP request, API call, or user action an attacker would use? 
If not, it's a code smell, not a finding.\n2. Reasonable disagreement? Could a competent security engineer argue this is not a vulnerability given the application's threat model? If yes, downgrade to a \"Needs Investigation\" note in the Summary.\n3. Specific location? Does the finding have an exact file path, line number, and reproduction scenario? Vague findings (\"the app should use HTTPS somewhere\") are not actionable and must be excluded.\n\nIf any question raises doubt, do not report it as a formal finding. Instead, add a brief \"Needs Investigation\" note in the Summary section so the developer is aware without the noise.\n\n## Output Format\n\nGroup findings by severity. Each finding must name the specific standard violated.\n\n```\n## Critical\nViolations that are directly exploitable or enable data theft, privilege escalation, or financial fraud.\n\n### [DOMAIN] Brief title\nFile: `path/to/file.ts` (lines X–Y)\nStandard: OWASP A01:2025 / CWE-639 — one-line description of what the standard requires.\nViolation: What the code does wrong and the concrete attack scenario.\nFix: Specific, actionable code change or architectural remedy.\n\n## High\nViolations that create significant risk but require specific conditions or chaining to exploit.\n\n(same structure)\n\n## Medium\nDefense-in-depth gaps, missing controls, or violations that increase attack surface.\n\n(same structure)\n\n## Low\nBest-practice deviations, hardening opportunities, or compliance gaps unlikely to be directly exploited.\n\n(same structure)\n\n## Needs Investigation (optional)\nBrief notes on patterns that warrant a closer look but did not pass the Confidence Gate. 
These are not formal findings.\n\n## Summary\n- Total findings: N (X critical, Y high, Z medium, W low)\n- Highest-risk area: name the domain with the most severe findings\n- Key standards violated: list specific OWASP/CWE IDs\n- Overall security posture: 1–2 sentence verdict\n- Recommended immediate action: the single most urgent fix\n```\n\n## Verification Pass\n\nBefore finalizing your report, verify every finding:\n\n1. Re-read the code: Go back to the flagged file and re-read the flagged lines in full context (±20 lines). Confirm the issue actually exists — not a misread, not handled elsewhere in the same file, not guarded by middleware, a wrapper, or a parent function.\n2. Check for existing mitigations: Search the codebase for related patterns. Is the \"missing\" check done in a shared middleware, auth wrapper, API gateway, or configuration file? If so, drop the finding.\n3. Verify against official docs: For every standard or API you cite, confirm your interpretation is correct. If you're unsure whether a pattern constitutes a real vulnerability, look it up — don't guess. Use available tools (context7, web search, REFERENCE.md, Supabase MCP, etc.) to check current documentation when uncertain.\n4. Filter by confidence: If you're certain a finding is a false positive after re-reading, drop it entirely. If doubt remains but the issue seems plausible, move it to \"Needs Investigation\" in the report — don't include it as a formal finding.\n\n## Rules\n\n- Cite the standard: every finding must reference a specific standard ID (OWASP A-code, CWE-NNN, GDPR Art. N, PCI-DSS Req. N). 
This is the core value of this skill.\n- Model the attack: every Critical or High finding must describe the realistic attack scenario, not just the code smell.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"add validation\" but \"use a Zod schema on the request body and reject with 400 if it fails.\"\n- Severity by exploitability: rate severity by real-world exploitability and impact, not theoretical worst-case. A missing CSP header with no XSS vector is Low at most. A SQL injection in a public endpoint is Critical regardless of whether a WAF might catch it.\n- Don't duplicate best-practices-audit: focus on security vulnerabilities and compliance gaps. Architecture and clean code issues belong in the other skill.\n- Minimize false positives: Apply the False Positive Filtering rules (Hard Exclusions, Framework Precedents, Confidence Gate) before including any finding. When uncertain, add a \"Needs Investigation\" note in the Summary rather than reporting a formal finding. A clean report with 3 real findings is more valuable than one with 3 real findings buried in 12 noise items.\n- Verify API usage against official docs: Do not assume an API or SDK is being used correctly based on training knowledge. If the code uses a specific SDK or service, look up the official documentation (using MCP tools where available) and verify the security-relevant usage pattern is correct. Incorrect API usage that bypasses security controls is a Critical finding.\n- Defense-in-depth counts: a control missing a second layer of enforcement (e.g., RLS present but no CHECK constraint) is a Medium finding even if the first layer is sound.\n","reference":"# Security Audit — Reference\n\nDetailed definitions, standard sources, violation examples, and fixes for each domain in `SKILL.md`.\n\n---\n\n## 1. 
Authentication & Session Management\nStandards: OWASP A07:2025 — Authentication Failures; CWE-287 Improper Authentication; CWE-384 Session Fixation; RFC 6750 Bearer Token Usage\n\n### `getSession()` vs. `getUser()` — OWASP A07:2025\n\n`getSession()` reads the JWT from the client-supplied cookie/header and parses it locally. A tampered or expired JWT can appear valid if clock skew or local validation is used. `getUser()` performs a server-side round-trip to the authorization server, guaranteeing the token is currently valid and the user account has not been revoked.\n\nViolation pattern (Supabase/TypeScript):\n```ts\n// WRONG — trusts client-supplied JWT locally\nconst { data: { session } } = await supabase.auth.getSession();\nconst userId = session?.user?.id;\n```\nFix:\n```ts\n// CORRECT — server validates the token\nconst { data: { user }, error } = await supabase.auth.getUser(authHeader);\nif (error || !user) return unauthorized();\n```\n\n### OAuth State Parameter — CWE-352 CSRF\n\nThe OAuth `state` parameter must be a cryptographically random nonce stored server-side (or signed cookie). Without it, an attacker can force a victim to link their account to the attacker's OAuth token.\n\nFix: Generate `state = crypto.randomUUID()`, store in DB or signed cookie with short TTL, validate on callback before exchanging code.\n\n---\n\n## 2. Authorization & Access Control\nStandards: OWASP A01:2025 — Broken Access Control; OWASP API1:2023 — Broken Object Level Authorization; CWE-284 Improper Access Control; CWE-639 Authorization Bypass Through User-Controlled Key\n\n### BOLA / IDOR\n\nThe most prevalent API vulnerability class. 
Any time a user-controlled identifier (UUID, integer, slug) is used to look up a resource, ownership must be verified server-side — it cannot be assumed from the JWT alone.\n\nViolation pattern:\n```ts\n// WRONG — trusts caller-supplied userId\nconst { id } = req.body;\nconst resource = await db.query(\"SELECT * FROM documents WHERE id = $1\", [id]);\nreturn resource; // returns any user's document\n```\nFix:\n```ts\n// CORRECT — adds ownership column to WHERE clause\nconst resource = await db.query(\n \"SELECT * FROM documents WHERE id = $1 AND owner_id = $2\",\n [id, authenticatedUser.id]\n);\nif (!resource) return notFound(); // don't reveal existence\n```\n\n### Row-Level Security (PostgreSQL)\n\nEvery table with user-scoped data must have RLS enabled AND a policy defined. RLS enabled with no policies = no access. RLS disabled = all data visible to any authenticated DB connection.\n\nRequired pattern:\n```sql\nALTER TABLE documents ENABLE ROW LEVEL SECURITY;\n\nCREATE POLICY \"users_own_documents\"\n ON documents FOR ALL\n TO authenticated\n USING (owner_id = auth.uid());\n```\n\nHigh-risk gap: Financial tables (transactions, payment records) should have RLS but also block UPDATE/DELETE via separate policies or triggers — RLS `FOR ALL` with `USING` only controls SELECT.\n\n---\n\n## 3. Injection\nStandards: OWASP A05:2025 — Injection; CWE-89 SQL Injection; CWE-79 XSS; CWE-77 Command Injection; CWE-94 Code Injection\n\n### SQL Injection — CWE-89\n\nAny string concatenation or interpolation in a SQL query is potentially exploitable. 
The fix is always parameterized queries (also called prepared statements).\n\nViolation:\n```ts\n// WRONG\nconst result = await db.query(`SELECT * FROM users WHERE name = '${name}'`);\n```\nFix:\n```ts\n// CORRECT\nconst result = await db.query(\"SELECT * FROM users WHERE name = $1\", [name]);\n```\n\n### Schema Pollution (PostgreSQL SECURITY DEFINER) — CWE-89\n\nA function with `SECURITY DEFINER` runs with the privileges of the function's owner (often a superuser). If `search_path` is not pinned, an attacker who can create schemas may prepend a malicious schema, causing the function to resolve table names to their injected versions.\n\nViolation:\n```sql\nCREATE OR REPLACE FUNCTION credit_coins(uid uuid, amount int)\nRETURNS void\nLANGUAGE plpgsql\nSECURITY DEFINER AS $$\nBEGIN\n UPDATE profiles SET coins = coins + amount WHERE id = uid;\nEND;\n$$;\n```\nFix:\n```sql\nCREATE OR REPLACE FUNCTION public.credit_coins(uid uuid, amount int)\nRETURNS void\nLANGUAGE plpgsql\nSECURITY DEFINER\nSET search_path = '' -- pins search path; no user schema can be injected\nAS $$\nBEGIN\n UPDATE public.profiles SET coins = coins + amount WHERE id = uid;\nEND;\n$$;\n```\n\n### XSS — CWE-79\n\nNever assign user-controlled content to `innerHTML`, `outerHTML`, `document.write()`, or React's `dangerouslySetInnerHTML` without sanitization.\n\nViolation:\n```ts\nelement.innerHTML = userInput; // executes embedded <script> tags\n```\nFix:\n```ts\nelement.textContent = userInput; // text node — never executed as HTML\n// If HTML is genuinely needed, use DOMPurify:\nelement.innerHTML = DOMPurify.sanitize(userInput, { ALLOWED_TAGS: ['b', 'i'] });\n```\n\n---\n\n## 4. Cryptography & Secrets\nStandards: OWASP A04:2025 — Cryptographic Failures; CWE-327 Use of Broken Algorithm; CWE-798 Hardcoded Credentials; CWE-312 Cleartext Storage; NIST SP 800-131A\n\n### Hardcoded Secrets — CWE-798\n\nAny secret in source code is compromised the moment the repo is cloned. 
Even private repos have been breached.\n\nScan for: `apiKey =`, `password =`, `secret =`, `token =`, `-----BEGIN RSA PRIVATE KEY-----` in `.ts`, `.js`, `.json`, `.toml`, `.yaml` files.\n\nFix: Rotate immediately. Store in environment variables loaded at runtime (never in source), or a secrets manager (HashiCorp Vault, AWS Secrets Manager, Supabase Vault).\n\n### Broken Hash Algorithms — CWE-327\n\nMD5 and SHA-1 are collision-compromised. Never use for password hashing, HMAC, or integrity verification.\n\n- Passwords: use `bcrypt` (cost ≥ 12), `argon2id`, or `scrypt`.\n- HMAC: use SHA-256 minimum. `HMAC-SHA256` is the baseline for webhook signatures.\n- File integrity: SHA-256 minimum.\n\n### Client-Side Secret Exposure\n\nIn Vite: `VITE_` variables are embedded in the JS bundle and visible to any user who opens DevTools. In Next.js: `NEXT_PUBLIC_` is the same. Never put API keys or service secrets in these variables.\n\n---\n\n## 5. Input Validation & Output Encoding\nStandards: CWE-20 Improper Input Validation; CWE-116 Improper Encoding; CWE-601 Open Redirect; OWASP Input Validation Cheat Sheet\n\n### Server-Side Validation is Non-Negotiable\n\nClient-side validation (React form validation, browser `required` attributes) is UX, not security. Any attacker can send raw HTTP requests bypassing the client entirely.\n\nRequired pattern (TypeScript with Zod):\n```ts\nconst Schema = z.object({\n username: z.string().min(1).max(30),\n amount: z.number().int().positive().max(10_000),\n});\n\nconst parsed = Schema.safeParse(req.body);\nif (!parsed.success) return badRequest(parsed.error.flatten());\n// Use parsed.data — never req.body — downstream\n```\n\n### Defense-in-Depth: Database CHECK Constraints\n\nApplication validation can be bypassed (direct DB connection, migration mistake, future code path). 
CHECK constraints are the last line of defense.\n\n```sql\n-- Prevents negative balance under any race condition\nALTER TABLE profiles ADD CONSTRAINT chk_coins_non_negative CHECK (coins >= 0);\n\n-- Enforces transaction sign by type\nALTER TABLE coin_transactions ADD CONSTRAINT chk_credit_positive\n CHECK (tx_type NOT IN ('quest_reward', 'purchase') OR amount > 0);\nALTER TABLE coin_transactions ADD CONSTRAINT chk_debit_negative\n CHECK (tx_type NOT IN ('cosmetic_purchase', 'refund') OR amount < 0);\n```\n\n### Open Redirect — CWE-601\n\n```ts\n// WRONG — attacker crafts ?next=https://evil.com\nconst next = req.query.next;\nres.redirect(next);\n\n// CORRECT — validate against allowlist\nconst ALLOWED_PATHS = ['/dashboard', '/profile', '/settings'];\nconst next = req.query.next;\nif (!ALLOWED_PATHS.includes(next)) return res.redirect('/dashboard');\nres.redirect(next);\n```\n\n---\n\n## 6. API Security\nStandards: OWASP API Security Top 10 2023\n\n### API1:2023 — Broken Object Level Authorization (BOLA)\n\nSee Domain 2. Every resource access must verify ownership. This is the #1 API vulnerability.\n\n### API3:2023 — Broken Object Property Level Authorization\n\nAPIs often return full database row objects. If the object contains fields the caller should not see (other users' data, internal flags, admin properties), this is a data exposure violation.\n\nFix: Explicitly allowlist fields returned in API responses. 
Never return `SELECT *` to the client.\n\n```ts\n// WRONG\nreturn res.json(userRow); // includes password_hash, role, internal_flags\n\n// CORRECT\nreturn res.json({\n id: userRow.id,\n displayName: userRow.display_name,\n avatarUrl: userRow.avatar_url,\n});\n```\n\n### API7:2023 — Server-Side Request Forgery (SSRF)\n\nIf the application fetches a URL derived from user input, an attacker can target internal services (metadata endpoints, Redis, internal databases).\n\nViolation:\n```ts\n// WRONG — user controls the URL\nconst data = await fetch(req.body.webhookUrl);\n```\nFix: Validate URL against a strict allowlist of expected domains. Block private IP ranges (10.x, 172.16.x–172.31.x, 192.168.x, 169.254.x, ::1, fc00::/7).\n\n### API8:2023 — Security Misconfiguration\n\n- CORS origin reflection without validation combined with `Access-Control-Allow-Credentials: true` allows any origin to make credentialed requests.\n- Verbose error messages that expose stack traces, SQL query structure, or internal paths.\n- Debug endpoints (`/debug`, `/metrics`, `/__admin`) accessible in production.\n\n---\n\n## 7. Database Security\nStandards: CWE-250 Execution with Unnecessary Privileges; PostgreSQL Security Best Practices; CIS PostgreSQL Benchmark\n\n### Principle of Least Privilege\n\nEvery database role should have only the minimum permissions required. The `public` schema grants `CREATE` to all roles by default in PostgreSQL < 15 — revoke this explicitly.\n\n```sql\nREVOKE CREATE ON SCHEMA public FROM PUBLIC;\nREVOKE ALL ON ALL TABLES IN SCHEMA public FROM PUBLIC;\n\n-- Then explicitly grant only what each role needs\nGRANT SELECT, INSERT ON public.profiles TO authenticated;\n```\n\n### REVOKE EXECUTE on SECURITY DEFINER Functions\n\nSECURITY DEFINER functions run as their owner. 
If PUBLIC or `authenticated` can call them without restriction, any logged-in user can trigger privileged operations.\n\n```sql\n-- After defining any SECURITY DEFINER function:\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM PUBLIC;\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM authenticated;\nREVOKE EXECUTE ON FUNCTION public.credit_coins(uuid, int) FROM anon;\n-- Re-grant only to service_role or internal callers as needed\n```\n\n### REVOKE TRUNCATE on Audit Tables\n\n`TRUNCATE` bypasses RLS and row-level triggers. Any role that can TRUNCATE an audit table can silently destroy evidence.\n\n```sql\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM PUBLIC;\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM authenticated;\nREVOKE TRUNCATE ON TABLE public.coin_transactions FROM service_role;\n-- Even service_role should not be able to bulk-erase financial records\n```\n\n---\n\n## 8. Rate Limiting & Denial-of-Service\nStandards: OWASP API4:2023 — Unrestricted Resource Consumption; CWE-770 Allocation of Resources Without Limits; CWE-400 Uncontrolled Resource Consumption\n\n### In-Memory Rate Limiting Is Ineffective\n\nRate limits implemented with an in-process `Map` or `LRU` cache are reset on process restart and are not shared across horizontal replicas. An attacker simply retries after waiting for a cold deploy, or routes requests to different instances.\n\nCorrect approach: Store rate limit counters in a database (Redis, PostgreSQL) keyed by user ID and action type. 
The counter must be incremented atomically in the same transaction as the action.\n\nPostgreSQL pattern:\n```sql\n-- Atomic check-and-increment\nINSERT INTO rate_limits (user_id, action, window_start, count)\nVALUES ($1, $2, date_trunc('minute', now()), 1)\nON CONFLICT (user_id, action, window_start)\nDO UPDATE SET count = rate_limits.count + 1\nRETURNING count;\n-- If returned count > max_allowed, reject with 429\n```\n\n### Missing Rate Limits on Auth Endpoints\n\nAuthentication endpoints (login, password reset, OTP verification) without rate limiting enable brute-force and credential-stuffing attacks.\n\nRecommended limits (baseline):\n- Login: 5 attempts per minute per IP\n- Password reset: 3 per hour per email\n- OTP verification: 3 attempts per code before invalidating\n\n---\n\n## 9. Concurrency & Race Conditions\nStandards: CWE-362 Concurrent Execution Using Shared Resource with Improper Synchronization (TOCTOU); CWE-367 TOCTOU Race Condition\n\n### Check-Then-Act on Financial Data\n\nThe most dangerous race condition pattern in financial systems: read the balance, check if sufficient, then deduct. If two requests run concurrently, both checks pass against the same stale balance.\n\nViolation:\n```sql\n-- Thread 1 and Thread 2 both read balance = 100 at the same time\nSELECT coins FROM profiles WHERE id = $1; -- both see 100\n-- Both check: 100 >= 50 → true\nUPDATE profiles SET coins = coins - 50 WHERE id = $1; -- both run\n-- Result: balance = 0 instead of 50. 
Or worse, -50 if CHECK constraint absent.\n```\n\nFix — advisory lock + FOR UPDATE:\n```sql\nBEGIN;\nSELECT pg_advisory_xact_lock(hashtext($1::text)); -- serialize per user\nSELECT coins FROM profiles WHERE id = $1 FOR UPDATE; -- lock the row\n-- Now deduct safely — only one transaction holds the lock\nUPDATE profiles SET coins = coins - $2 WHERE id = $1 AND coins >= $2;\nCOMMIT;\n```\n\n### Idempotency Key Bypass\n\nIf an idempotency key column allows `NULL`, PostgreSQL's UNIQUE constraint treats each `NULL` as a distinct value — meaning `NULL` keys do not deduplicate. This allows unlimited replay of reward operations.\n\n```sql\n-- WRONG — NULLs are not unique in PostgreSQL\nidempotency_key TEXT UNIQUE -- NULL can appear unlimited times\n\n-- CORRECT\nidempotency_key TEXT NOT NULL UNIQUE -- enforces exactly-once\n```\n\n---\n\n## 10. Financial & Transaction Integrity\nStandards: PCI-DSS v4 Req. 6 (Secure Systems), Req. 10 (Audit Logs); ISO 27001 A.9; CWE-362\n\n### Server-Authoritative Coin Logic\n\nAny value computed or provided by the client that affects financial state is a vulnerability. The server must compute all rewards, deductions, and balances independently.\n\nPattern to flag:\n```ts\n// WRONG — client tells server how many coins to award\nconst { userId, coinsEarned } = req.body;\nawait creditCoins(userId, coinsEarned); // attacker sends coinsEarned = 99999\n```\n\nCorrect: The server computes the reward based on verified activity data (e.g., verified GitHub events), never from a client-supplied amount.\n\n### Append-Only Transaction Log\n\nCoin/credit transaction tables must be immutable after insert. 
Updates would allow retroactive falsification of balances; deletes destroy the audit trail.\n\n```sql\n-- Trigger blocking updates to financial records\nCREATE OR REPLACE FUNCTION block_transaction_updates()\nRETURNS trigger LANGUAGE plpgsql AS $$\nBEGIN\n RAISE EXCEPTION 'Updates to coin_transactions are not permitted';\nEND;\n$$;\n\nCREATE TRIGGER no_update_coin_transactions\nBEFORE UPDATE ON coin_transactions\nFOR EACH ROW EXECUTE FUNCTION block_transaction_updates();\n```\n\n### Webhook Deduplication — Replay Attack\n\nPayment providers may retry webhooks. Without deduplication on the provider's event ID, the same payment event can credit coins multiple times.\n\n```sql\nINSERT INTO payment_events (provider_event_id, payload, received_at)\nVALUES ($1, $2, now())\nON CONFLICT (provider_event_id) DO NOTHING;\n-- Only process coins if INSERT affected 1 row (i.e., event was new)\n```\n\n---\n\n## 11. Security Logging & Monitoring\nStandards: OWASP A09:2025 — Security Logging and Alerting Failures; CWE-778 Insufficient Logging; CWE-117 Log Injection; NIST SP 800-92\n\n### What Must Be Logged\n\nAt minimum, log these events with timestamp, user ID, IP address, and action detail:\n- Authentication failures (wrong password, expired token, missing auth header)\n- Authorization failures (access denied to a resource)\n- Input validation failures that look like attacks (unexpected field shapes, oversized inputs)\n- Cryptographic verification failures (HMAC mismatch on webhooks)\n- Rate limit hits\n- Account actions (password change, email change, account deletion)\n- Financial anomalies (deduction larger than balance attempted)\n\n### Log Injection — CWE-117\n\nIf log messages are constructed using string interpolation with user input, an attacker can inject newlines to forge log entries.\n\nViolation:\n```ts\nlogger.info(`User logged in: ${req.body.username}`);\n// Attacker sends username = \"admin\\nSECURITY: Admin password changed\"\n```\nFix: Use structured logging 
(JSON with separate fields), never string interpolation.\n```ts\nlogger.info({ event: \"login\", username: req.body.username }); // safe\n```\n\n---\n\n## 12. Secrets & Environment Security\nStandards: CWE-798 Hardcoded Credentials; CWE-312 Cleartext Storage; The Twelve-Factor App (Factor III: Config)\n\n### Env Var Fallback to Insecure Default\n\nA common pattern in \"developer-friendly\" code is to fall back to a permissive default if an env var is missing. This silently disables security in production if the env var is misconfigured.\n\nViolation:\n```ts\n// WRONG — falls back to wildcard CORS if env var missing\nconst origin = Deno.env.get(\"ALLOWED_ORIGIN\") ?? \"*\";\n```\nFix:\n```ts\n// CORRECT — hard-error on missing config; fail secure\nconst origin = Deno.env.get(\"ALLOWED_ORIGIN\");\nif (!origin) throw new Error(\"ALLOWED_ORIGIN env var is required\");\n```\n\n---\n\n## 13. Data Privacy & Retention\nStandards: GDPR Art. 5 (data minimization), Art. 17 (right to erasure), Art. 25 (privacy by design); CCPA §1798.105; CWE-359 Exposure of Private Information\n\n### Right to Erasure — Account Deletion\n\nOn account deletion, the application must:\n1. Delete or anonymize personal data (name, email, avatar, IP, user-agent)\n2. Retain legally required financial records (PCI-DSS, EU VAT — typically 7–10 years)\n3. Preserve abuse/moderation evidence (content reports, security flags)\n4. Nullify sender references in shared records (e.g., chat messages become anonymous)\n\nCritical FK patterns:\n```sql\n-- Chat: anonymize messages, don't delete them (conversation history remains intact)\nsender_id UUID REFERENCES auth.users(id) ON DELETE SET NULL\n\n-- Transactions: retain for audit; user_id becomes orphaned (no cascade)\nuser_id UUID -- intentionally no FK constraint, or FK with ON DELETE SET NULL\n```\n\n### Data Minimization — GDPR Art. 5(1)(c)\n\nDo not collect or store more data than necessary. 
Flag:\n- IP addresses stored permanently when 30/90 day retention suffices\n- User-agent strings logged indefinitely (personal data under GDPR when combined with other identifiers like IP and timestamps, which is typical in server logs)\n- Full request bodies logged when only metadata is needed for debugging\n- `SELECT *` queries that pull PII columns into contexts that don't need them\n\n---\n\n## 14. Security Misconfiguration\nStandards: OWASP A02:2025; CWE-16 Configuration; CIS Benchmarks; OWASP Secure Headers\n\n### Required Security Headers\n\n```\nContent-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none'\nX-Frame-Options: DENY\nX-Content-Type-Options: nosniff\nReferrer-Policy: strict-origin-when-cross-origin\nStrict-Transport-Security: max-age=63072000; includeSubDomains; preload\nPermissions-Policy: geolocation=(), microphone=(), camera=()\n```\n\n### CORS Misconfiguration\n\n`Access-Control-Allow-Origin: *` allows any origin to read responses to non-credentialed requests. For cookie-based auth, browsers block credentialed requests when the wildcard is used (MDN: \"If a request includes a credential and the response includes an `Access-Control-Allow-Origin: *` header, the browser will block access to the response\"). The more dangerous misconfiguration is reflecting the request `Origin` header without validation while setting `Access-Control-Allow-Credentials: true` — this effectively allows any origin to make credentialed requests.\n\nThe origin allowlist must be an explicit list of trusted domains, validated server-side. Never reflect the request `Origin` header without verification.\n\n```ts\n// WRONG — reflects any origin\nconst origin = req.headers.get(\"origin\");\nheaders.set(\"Access-Control-Allow-Origin\", origin ?? \"\");\n\n// CORRECT — validate against explicit allowlist\nconst ALLOWED = new Set([\"https://app.example.com\"]);\nconst requestOrigin = req.headers.get(\"origin\") ?? 
\"\";\nif (ALLOWED.has(requestOrigin)) {\n headers.set(\"Access-Control-Allow-Origin\", requestOrigin);\n headers.set(\"Vary\", \"Origin\");\n}\n```\n\n---\n\n## 15. Supply Chain & Dependency Security\nStandards: OWASP A03:2025 — Software Supply Chain Failures; CWE-1357; SLSA Framework\n\n### Dependency Audit\n\nRun `npm audit` or `bun audit` and treat results as:\n- Critical/High CVEs* → block deployment; patch immediately\n- Moderate CVEs → fix within the sprint\n- Low CVEs → fix in next dependency update cycle\n\n### Version Pinning\n\nUse exact versions in `package.json` for production dependencies, or lock with `package-lock.json`/`bun.lockb`. The `^` prefix allows minor version bumps that could introduce regressions or security fixes you haven't reviewed.\n\n---\n\n## 16. TypeScript / JavaScript Specific\nStandards: CWE-843 Type Confusion; CWE-915 Prototype Pollution; CWE-94 Code Injection; OWASP Cheat Sheet: DOM-based XSS\n\n### Prototype Pollution — CWE-915\n\nMerging user-controlled objects onto existing objects can overwrite properties on `Object.prototype`, affecting all objects in the process.\n\nViolation:\n```ts\nfunction mergeOptions(defaults: object, userOptions: unknown) {\n return Object.assign(defaults, userOptions); // if userOptions is {\"__proto__\": {\"admin\": true}}\n}\n```\nFix: Validate and allowlist the keys of user-controlled objects before merging. Use `Object.create(null)` for dictionaries that must not inherit from `Object.prototype`. Use schema validation (Zod) to strip unknown keys.\n\n### `as any` Type Assertions on External Data — CWE-843\n\nExternal data (API responses, webhook payloads, database query results typed as `any`, `JSON.parse()` output) must be treated as `unknown` and parsed through a validator before use. 
Using `as any` or `as ExpectedType` directly bypasses TypeScript's safety guarantees entirely.\n\n```ts\n// WRONG\nconst payload = JSON.parse(body) as WebhookPayload;\ncreditCoins(payload.userId, payload.amount); // if payload.amount is a string: NaN coins\n\n// CORRECT\nconst parsed = WebhookPayloadSchema.safeParse(JSON.parse(body));\nif (!parsed.success) return badRequest();\ncreditCoins(parsed.data.userId, parsed.data.amount); // type-safe and validated\n```\n\n### Unhandled Promise Rejections — CWE-755\n\nIn async TypeScript/JavaScript, a missing `await` means the promise runs in the background and any rejection is silently swallowed (or crashes Node.js). This is especially dangerous in financial operations where you need to know if the DB write succeeded.\n\n```ts\n// WRONG — fire-and-forget on a critical operation\nlogSecurityEvent(userId, \"auth_failure\"); // rejection silently lost\n\n// CORRECT — await or explicitly handle\nawait logSecurityEvent(userId, \"auth_failure\");\n// or: void logSecurityEvent(...).catch(err => console.error(\"Failed to log:\", err));\n```\n\n"},"systematic-debugging":{"content":"---\nname: systematic-debugging\ndescription: \"Guides root-cause analysis with a structured process: reproduce, isolate, hypothesize, verify. Use when debugging bugs, investigating failures, or when the user says something is broken or not working as expected.\"\n---\n# Systematic Debugging\n\nWork through failures in order. Don't guess at fixes until the cause is narrowed down.\n\n## Scope\n\n- User reports a bug: Clarify what \"wrong\" means (error message, wrong result, crash, hang). Get steps to reproduce or environment details if missing.\n- User points at code: Treat that as the suspected area; still reproduce and isolate before changing code.\n- Logs/stack traces provided: Use them to form hypotheses; don't ignore them.\n\n## Process\n\n### 1. Reproduce\n\n- Confirm the failure is reproducible. If not, note that and list what's needed (e.g. 
data, env, steps).\n- Identify: one-off or intermittent? In which environment (dev/staging/prod, OS, version)?\n- Output: \"Reproducible: yes/no. How: …\"\n\n### 2. Isolate\n\n- Shrink the problem: minimal input, minimal code path, or minimal config that still fails.\n- Bisect if useful: which commit, which option, which input range?\n- Remove variables (other features, network, time) to see when the failure goes away.\n- Output: \"Failure occurs when: …\" and \"Failure does not occur when: …\"\n\n### 3. Hypothesize\n\n- State one or more concrete hypotheses that explain the observed behavior (e.g. \"null passed here\", \"race between A and B\", \"wrong type at runtime\").\n- Tie each hypothesis to evidence from reproduce/isolate (logs, stack trace, line numbers).\n- Prefer the simplest hypothesis that fits the evidence.\n- Output: \"Hypothesis: …\" with \"Evidence: …\"\n\n### 4. Verify\n\n- Propose a minimal check (log, assert, unit test, or one-line change) that would confirm or rule out the top hypothesis.\n- If the user can run it, give the exact step. If you can run it (e.g. tests), do it.\n- After verification: \"Confirmed: …\" or \"Ruled out; next hypothesis: …\"\n\n### 5. Fix\n\n- Only suggest a fix after the cause is confirmed or highly likely.\n- Fix the root cause when possible; document or ticket workarounds if you suggest one.\n- Suggest a regression test or assertion so the bug doesn't come back.\n\n## Output\n\n- Prefer short bullets over long paragraphs.\n- Always cite file/line/function when pointing at code.\n- If stuck (can't reproduce, no logs), say what's missing and what would help next.\n- Don't suggest random fixes (e.g. \"try clearing cache\") without tying them to a hypothesis.\n","reference":null},"test-deno":{"content":"---\nname: test-deno\ndescription: Use when writing, reviewing, or fixing Deno integration tests for Supabase Edge Functions, or when auditing edge function tests for best practices. 
Triggers on test failures involving sanitizers, assertions, mocking, HTTP testing, or environment isolation.\n---\n\n# Deno Edge Function Testing\n\nWrite and review integration tests for Supabase Edge Functions using Deno's built-in test runner and official standard library modules. Every recommendation in this skill is sourced from official documentation.\n\nSee [REFERENCE.md](REFERENCE.md) for detailed definitions, code examples, and official source citations.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for edge functions the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's edge function test directory (Supabase convention: `supabase/functions/tests/`).\n\n## Prerequisites\n\nBefore tests can run, the local Supabase stack must be running:\n\n```bash\n# Terminal 1: start local stack\nnpx supabase start\n\n# Terminal 2: serve functions\nnpx supabase functions serve --no-verify-jwt --env-file supabase/functions/tests/.env.local\n\n# Terminal 3: run tests\ndeno test --no-lock --env-file=supabase/functions/tests/.env.local \\\n --allow-net --allow-env --allow-read \\\n supabase/functions/tests/\n```\n\n`--no-lock` is required — Supabase Edge Runtime uses Deno v2.1.x internally, and newer Deno CLI versions generate lock file format v5 which the runtime cannot parse.\n\n## Principles to Enforce\n\n### 1. Test Structure — `Deno.test()` or BDD (`describe`/`it`)\n\n- Both styles are officially supported; choose one and be consistent within a project\n- `describe()` and `it()` are wrappers over `Deno.test()` and `t.step()` — not a separate test runner\n- Hook order: `beforeAll` > `beforeEach` > test > `afterEach` > `afterAll`; after-hooks run even on failure\n- Per-test `permissions` do NOT work inside nested `describe` blocks (known limitation)\n\n### 2. 
Assertions — `@std/assert` or `@std/expect`\n\nTwo assertion styles are officially supported: `@std/assert` (Deno-native) and `@std/expect` (Jest-compatible).\n\nAssertion selection:\n\n\| Situation \| Use \| NOT \|\n\|-----------\|-----\|-----\|\n\| Deep equality (objects, arrays) \| `assertEquals` \| `assertStrictEquals` \|\n\| Reference/primitive equality (`===`) \| `assertStrictEquals` \| `assertEquals` \|\n\| Value is not null/undefined \| `assertExists` \| `assert(val !== null)` \|\n\| Synchronous throw \| `assertThrows(fn, ErrorClass?, msg?)` \| try/catch \|\n\| Async rejection \| `assertRejects(fn, ErrorClass?, msg?)` \| `assertThrows` \|\n\| Partial object match \| `assertObjectMatch` \| manual property checks \|\n\| String contains substring \| `assertStringIncludes` \| `assert(s.includes(...))` \|\n\| Numeric comparison \| `assertGreater`, `assertLess`, etc. \| `assert(a > b)` \|\n\| Unconditional fail \| `fail()` or `unreachable()` \| `assert(false)` \|\n\n### 3. Integration Testing Pattern (Supabase Official)\n\nEdge Function tests should be integration tests — real HTTP requests against locally-served functions.\n\nWhat to test:\n- Happy-path request/response (status code, body shape)\n- Authentication enforcement (missing/invalid JWT returns 401)\n- Input validation (malformed body returns 400)\n- Error responses (correct status codes and error messages)\n- CORS headers (OPTIONS preflight, allowed origins)\n- Method routing (POST vs GET vs unsupported methods)\n\nWhat NOT to test here (test at the database layer instead):\n- RLS policies\n- RPC business logic\n- Trigger behavior\n\n### 4. Sanitizers — Resource, Op, and Exit\n\nSanitizers are enabled by default on every `Deno.test()`. 
They catch resource leaks and unfinished async work.\n\n\| Sanitizer \| Default \| What it catches \|\n\|-----------\|---------\|-----------------\|\n\| `sanitizeResources` \| `true` \| Open files, connections not closed \|\n\| `sanitizeOps` \| `true` \| Unawaited async operations \|\n\| `sanitizeExit` \| `true` \| Calls to `Deno.exit()` \|\n\n- NEVER disable sanitizers globally — only per-test with a comment explaining why\n- For integration tests with `fetch()`, sanitizers should pass without disabling\n- If a third-party library holds connections open, you may need `sanitizeResources: false` on specific tests\n\n### 5. Mocking — `@std/testing/mock`\n\n- Spies record calls without changing behavior; stubs replace behavior\n- ALWAYS restore spies/stubs — use `using` keyword (preferred) or `try/finally` with `.restore()`\n- Do NOT over-mock in integration tests — use real `fetch()` against the local server\n- Do NOT mock what you don't own — mock your code's dependencies, not third-party internals\n- FakeTime (`@std/testing/time`) — use for time-dependent tests instead of wall-clock time\n\n### 6. Environment Isolation\n\n- Use `--env-file=path` to load test-specific environment variables\n- Keep a dedicated `.env.local` in `supabase/functions/tests/`\n- NEVER hardcode URLs, keys, or secrets in test files — use `Deno.env.get()`\n- `--env-file` values take precedence over existing shell environment variables\n\n### 7. Permissions — Principle of Least Privilege\n\n- Grant only what tests need at the CLI level (`--allow-net`, `--allow-env`, `--allow-read`)\n- Per-test `permissions` config can restrict further but CANNOT exceed CLI-granted permissions\n\n### 8. 
Test File Naming and Organization\n\n- Deno auto-discovers files matching: `{*_,*.,}test.{ts,tsx,mts,js,mjs,jsx}`\n- Supabase's official example uses `function-name-test.ts` with a hyphen — hyphens are NOT auto-discovered, pass the directory explicitly\n- Place tests in `supabase/functions/tests/` with a `.env.local` for environment variables\n\n### 9. Test Independence and Determinism\n\n- Tests within a file run sequentially; files can run in parallel (`--parallel`)\n- Module-level state is shared across tests in the same file\n- Use `beforeEach`/`afterEach` to reset state; database/server state persists unless cleaned up\n- NEVER rely on test execution order, random data without seeding, or wall-clock time\n\n## Common Anti-Patterns\n\n\| Anti-Pattern \| Why it's wrong \| Fix \|\n\|---\|---\|---\|\n\| Not awaiting `fetch()` or async ops \| `sanitizeOps` will catch this; test may pass falsely \| Always `await` every async operation \|\n\| Disabling sanitizers globally \| Hides real resource leaks \| Disable only per-test with a comment \|\n\| Using `assertThrows` for async code \| Only catches synchronous exceptions \| Use `assertRejects` for promises \|\n\| Not restoring stubs/spies \| Leaks mock state to other tests \| Use `using` keyword or `try/finally` \|\n\| Hardcoding URLs and keys \| Breaks in different environments \| Use `Deno.env.get()` + `--env-file` \|\n\| Mocking `fetch` in integration tests \| Defeats the purpose of integration testing \| Use real HTTP calls to local server \|\n\| Sharing mutable state without cleanup \| Tests become order-dependent \| Reset in `beforeEach`/`afterEach` \|\n\| Using `assert(condition)` for everything \| Provides no useful failure message \| Use specific assertions (`assertEquals`, etc.) 
\|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file_test.ts` (lines X-Y)\nPrinciple: What the standard requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Rules\n\n- Only verified claims: every recommendation in this skill is backed by official Deno or Supabase documentation. See REFERENCE.md for source citations.\n- Integration over unit: for Edge Functions, prefer integration tests (real HTTP against local server) over unit tests with mocked dependencies.\n- Test the contract, not the implementation: test HTTP status codes, response bodies, and headers — not internal function calls.\n- Respect sanitizers: treat sanitizer failures as real bugs, not annoyances to disable.\n- Least privilege: grant only the permissions tests actually need.\n","reference":"# Deno Edge Function Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Test Runner](#1-test-runner)\n2. [Assertions](#2-assertions)\n3. [BDD Module](#3-bdd-module)\n4. [Mocking](#4-mocking)\n5. [Sanitizers](#5-sanitizers)\n6. [Supabase Integration Testing](#6-supabase-integration-testing)\n7. [Environment and Permissions](#7-environment-and-permissions)\n8. [CLI Reference](#8-cli-reference)\n9. [Anti-Patterns](#9-anti-patterns)\n\n---\n\n## 1. Test Runner\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\nDeno ships a built-in test runner — no external framework required. 
Tests are registered with `Deno.test()`.\n\n### File auto-discovery\n\n`deno test` auto-discovers files matching: `{*_,*.,}test.{ts, tsx, mts, js, mjs, jsx}`\n\nThis matches `*_test.ts`, `*.test.ts`, and `test.ts` — but NOT `*-test.ts` (hyphenated). To run hyphenated files, pass the directory explicitly.\n\n### `Deno.test()` style (native)\n\n```ts\nDeno.test(\"function returns 200 for valid input\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { /* ... */ });\n assertEquals(res.status, 200);\n});\n```\n\n### BDD style (`@std/testing/bdd`)\n\n```ts\nimport { describe, it, beforeEach, afterEach } from \"@std/testing/bdd\";\n\ndescribe(\"my-function\", () => {\n it(\"returns 200 for valid input\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { /* ... */ });\n assertEquals(res.status, 200);\n });\n});\n```\n\n### Test steps\n\nSub-tests within a single `Deno.test()`:\n\n```ts\nDeno.test(\"grouped tests\", async (t) => {\n await t.step(\"step one\", async () => { /* ... */ });\n await t.step(\"step two\", async () => { /* ... */ });\n});\n```\n\nSteps are awaited sequentially. Each step reports independently.\n\n---\n\n## 2. 
Assertions\n\nSource: [jsr.io/@std/assert](https://jsr.io/@std/assert)\n\n### Complete function list (verified)\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `assert(expr)` \| Truthy check \|\n\| `assertAlmostEquals(actual, expected, tolerance?)` \| Floating-point comparison \|\n\| `assertArrayIncludes(actual, expected)` \| Array contains all elements \|\n\| `assertEquals(actual, expected)` \| Deep equality \|\n\| `assertExists(actual)` \| Not null/undefined (narrows to `NonNullable<T>`) \|\n\| `assertFalse(expr)` \| Falsy check \|\n\| `assertGreater(actual, expected)` \| `actual > expected` \|\n\| `assertGreaterOrEqual(actual, expected)` \| `actual >= expected` \|\n\| `assertInstanceOf(actual, ExpectedType)` \| `instanceof` check \|\n\| `assertIsError(error, ErrorClass?, msgIncludes?)` \| Error type check \|\n\| `assertLess(actual, expected)` \| `actual < expected` \|\n\| `assertLessOrEqual(actual, expected)` \| `actual <= expected` \|\n\| `assertMatch(actual, regex)` \| String matches RegExp \|\n\| `assertNotEquals(actual, expected)` \| Deep inequality \|\n\| `assertNotInstanceOf(actual, UnexpectedType)` \| NOT `instanceof` \|\n\| `assertNotMatch(actual, regex)` \| String does NOT match RegExp \|\n\| `assertNotStrictEquals(actual, expected)` \| Reference inequality \|\n\| `assertObjectMatch(actual, expected)` \| Partial deep match \|\n\| `assertRejects(fn, ErrorClass?, msgIncludes?)` \| Async rejection testing \|\n\| `assertStrictEquals(actual, expected)` \| Reference equality (`===`) \|\n\| `assertStringIncludes(actual, expected)` \| String contains substring \|\n\| `assertThrows(fn, ErrorClass?, msgIncludes?)` \| Synchronous throw testing \|\n\| `equal(a, b)` \| Deep equality (returns boolean, no assertion) \|\n\| `fail(msg?)` \| Unconditional failure \|\n\| `unimplemented(msg?)` \| Marks unimplemented code \|\n\| `unreachable()` \| Marks unreachable code \|\n\n### Alternative: `@std/expect` (Jest-compatible)\n\nSource: 
[jsr.io/@std/expect](https://jsr.io/@std/expect) — official Deno standard library\n\n```ts\nimport { expect } from \"@std/expect\";\nexpect(x).toEqual(42);\nexpect(fn).toThrow(TypeError);\nawait expect(asyncFn()).resolves.toEqual(42);\n```\n\nSupports standard matchers plus asymmetric matchers (`expect.anything()`, `expect.objectContaining()`, etc.).\n\n---\n\n## 3. BDD Module\n\nSource: [jsr.io/@std/testing/doc/bdd](https://jsr.io/@std/testing/doc/bdd)\n\n### Exports\n\n`describe`, `it`, `test`, `beforeAll`, `afterAll`, `beforeEach`, `afterEach`, `before` (alias of `beforeAll`), `after` (alias of `afterAll`)\n\n### How it maps to Deno.test\n\n\"Internally, `describe` and `it` are registering tests with `Deno.test` and `t.step`.\"\n\n### Modifiers\n\n- `.only()` — run only this test/suite\n- `.skip()` — skip this test/suite\n- `.ignore()` — alias of `.skip()`\n\n### Known limitation\n\n\"There is currently one limitation to this, you cannot use the permissions option on an individual test case or test suite that belongs to another test suite. That's because internally those tests are registered with `t.step` which does not support the permissions option.\"\n\n### Hook execution order\n\n\"A test suite can have multiples of each type of hook, they will be called in the order that they are registered. The `afterEach` and `afterAll` hooks will be called whether or not the test case passes.\"\n\n---\n\n## 4. Mocking\n\n### Spies\n\nSource: [jsr.io/@std/testing/doc/mock](https://jsr.io/@std/testing/doc/mock), [docs.deno.com/examples/mocking_tutorial](https://docs.deno.com/examples/mocking_tutorial/)\n\n\"Test spies are function stand-ins that are used to assert if a function's internal behavior matches expectations. 
Test spies on methods keep the original behavior but allow you to test how the method is called and what it returns.\"\n\n### Spy usage example\n\n```ts\nimport { spy, assertSpyCalls, assertSpyCall } from \"@std/testing/mock\";\nconst dbSpy = spy(database, \"save\");\n// ... test code ...\nassertSpyCalls(dbSpy, 1);\n```\n\n### Stubs\n\n\"Test stubs are an extension of test spies that also replaces the original methods behavior.\"\n\n### Cleanup\n\n\"Method spys are disposable, meaning that you can have them automatically restore themselves with the `using` keyword.\"\n\nUsing `using` keyword (preferred):\n```ts\nimport { stub } from \"@std/testing/mock\";\nusing _stub = stub(deps, \"getUserName\", () => \"Test User\");\n// stub auto-restores when scope exits\n```\n\nWithout `using`, always restore in `try/finally`:\n```ts\nconst myStub = stub(obj, \"method\", () => \"mocked\");\ntry {\n // test code\n} finally {\n myStub.restore();\n}\n```\n\n### FakeTime\n\nSource: `@std/testing/time` (separate module from `@std/testing/mock`)\n\n```ts\nimport { FakeTime } from \"@std/testing/time\";\nusing time = new FakeTime();\ntime.tick(3500);\n```\n\n### Assertion helpers\n\n- `assertSpyCall(spy, callIndex, expected)` — assert specific call\n- `assertSpyCalls(spy, expectedCount)` — assert total call count\n- `assertSpyCallArg(spy, callIndex, argIndex, expected)` — assert specific argument\n- `assertSpyCallArgs(spy, callIndex, expected)` — assert all arguments\n- `returnsNext(values)` — create function returning values from iterable\n- `resolvesNext(values)` — async version of `returnsNext`\n\n---\n\n## 5. 
Sanitizers\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\n### sanitizeResources (default: true)\n\n\"Ensures that all I/O resources created during a test are closed, to prevent leaks.\"\n\n### sanitizeOps (default: true)\n\n\"Ensures that all async operations started in a test are completed before the test ends.\"\n\n### sanitizeExit (default: true)\n\n\"Ensures that tested code doesn't call `Deno.exit()`, which could signal a false test success.\"\n\n### Per-test sanitizer disable example\n\n```ts\nDeno.test({\n name: \"test with persistent connection\",\n sanitizeResources: false, // Supabase client keeps connection pool open\n async fn() { /* ... */ },\n});\n```\n\n### When to disable\n\n- `sanitizeResources: false` — when a third-party library holds connections open (e.g., database pool)\n- `sanitizeOps: false` — when background tasks fire intentionally (e.g., token refresh)\n- NEVER disable globally — only per-test with a documented reason\n\n---\n\n## 6. Supabase Integration Testing\n\nSource: [supabase.com/docs/guides/functions/unit-test](https://supabase.com/docs/guides/functions/unit-test)\n\n### Recommended structure\n\n```\nsupabase/functions/\n function-one/\n index.ts\n tests/\n .env.local\n function-one-test.ts\n```\n\n\"using the same name as the Function followed by `-test.ts`\"\n\n### Official example (Supabase client style)\n\n```ts\nimport { assert, assertEquals } from \"jsr:@std/assert@1\";\nimport { createClient } from \"npm:@supabase/supabase-js@2\";\n\nconst supabaseUrl = Deno.env.get(\"SUPABASE_URL\") ?? \"\";\nconst supabaseKey = Deno.env.get(\"SUPABASE_PUBLISHABLE_KEY\") ?? 
\"\";\n\nconst client = createClient(supabaseUrl, supabaseKey, {\n auth: {\n autoRefreshToken: false,\n persistSession: false,\n detectSessionInUrl: false,\n },\n});\n```\n\n### Direct fetch integration test pattern\n\n```ts\nconst BASE_URL = Deno.env.get(\"SUPABASE_URL\") + \"/functions/v1\";\n\nDeno.test(\"POST /my-function returns expected data\", async () => {\n const response = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: {\n \"Content-Type\": \"application/json\",\n \"Authorization\": `Bearer ${Deno.env.get(\"SB_PUBLISHABLE_KEY\")}`,\n },\n body: JSON.stringify({ name: \"Test\" }),\n });\n\n assertEquals(response.status, 200);\n const data = await response.json();\n assertEquals(data.message, \"Hello Test!\");\n});\n```\n\n### What to test\n\n- Happy-path request/response (status code, body shape)\n- Authentication enforcement (missing/invalid JWT returns 401)\n- Input validation (malformed body returns 400)\n- Error responses (correct status codes and error messages)\n- CORS headers (OPTIONS preflight, allowed origins)\n- Method routing (POST vs GET vs unsupported methods)\n\n### What NOT to test here (test at the database layer instead)\n\n- RLS policies\n- RPC business logic\n- Trigger behavior\n\n### Error response testing examples\n\nAlways test error paths explicitly:\n\n```ts\nDeno.test(\"returns 401 for missing auth\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: { \"Content-Type\": \"application/json\" },\n body: JSON.stringify({}),\n });\n assertEquals(res.status, 401);\n const body = await res.json();\n assertStringIncludes(body.error, \"Missing\");\n});\n\nDeno.test(\"returns 400 for invalid body\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, {\n method: \"POST\",\n headers: {\n \"Content-Type\": \"application/json\",\n \"Authorization\": `Bearer ${validToken}`,\n },\n body: JSON.stringify({ wrong: \"shape\" }),\n });\n assertEquals(res.status, 
400);\n});\n\nDeno.test(\"returns 405 for unsupported method\", async () => {\n const res = await fetch(`${BASE_URL}/my-function`, { method: \"DELETE\" });\n assertEquals(res.status, 405);\n});\n```\n\n### Running\n\n```bash\nsupabase start\nsupabase functions serve\ndeno test --allow-all supabase/functions/tests/function-one-test.ts\n```\n\n### Lock file issue\n\nSource: [github.com/orgs/supabase/discussions/39966](https://github.com/orgs/supabase/discussions/39966)\n\nThe Supabase Edge Runtime uses Deno v2.1.x. Newer Deno CLI versions generate lock file format v5, which the runtime cannot parse. Use `--no-lock` to bypass.\n\n---\n\n## 7. Environment and Permissions\n\n### `--env-file`\n\nSource: [docs.deno.com/runtime/reference/cli/test](https://docs.deno.com/runtime/reference/cli/test)\n\n\"Load environment variables from local file. Only the first environment variable with a given key is used.\" Values from `--env-file` take precedence over existing shell environment variables.\n\n### Permissions\n\nSource: [docs.deno.com/runtime/fundamentals/testing](https://docs.deno.com/runtime/fundamentals/testing/)\n\n\"The `permissions` property in the `Deno.test` configuration allows you to specifically deny permissions, but does not grant them. Permissions must be provided when running the test command.\"\n\n\"Remember that any permission not explicitly granted at the command line will be denied, regardless of what's specified in the test configuration.\"\n\n### Fine-grained per-test permissions example\n\n```ts\nDeno.test({\n name: \"reads config file\",\n permissions: { read: [\"./config.json\"], net: false },\n fn: () => { /* ... */ },\n});\n```\n\nPer-test permissions CANNOT exceed CLI-granted permissions — they can only restrict further.\n\n---\n\n## 8. 
CLI Reference\n\nSource: [docs.deno.com/runtime/reference/cli/test](https://docs.deno.com/runtime/reference/cli/test)\n\n\| Flag \| Purpose \|\n\|---\|---\|\n\| `--env-file=<path>` \| Load env vars from file \|\n\| `--no-lock` \| Disable lock file discovery \|\n\| `--filter \"<pattern>\"` \| Run tests matching string or `/regex/` \|\n\| `--parallel` \| Run test files in parallel (defaults to CPU count) \|\n\| `--fail-fast` \| Stop after first failure \|\n\| `--watch` \| Re-run on file changes \|\n\| `--coverage=<dir>` \| Collect coverage data \|\n\| `--reporter=<type>` \| Output format (default: `pretty`) \|\n\| `--no-check` \| Skip type checking \|\n\| `--doc` \| Evaluate code blocks in JSDoc/Markdown \|\n\| `--shuffle` \| Randomize test order \|\n\| `--trace-leaks` \| Show resource leak stack traces \|\n\| `--junit-path=<path>` \| Output JUnit XML \|\n\| `--permit-no-files` \| Don't error if no test files found \|\n\n---\n\n## 9. Anti-Patterns\n\nSynthesized from official documentation warnings and sanitizer documentation:\n\n1. Not awaiting async operations — sanitizeOps exists specifically for this\n2. Leaking resources — open files/connections without closing\n3. Disabling sanitizers globally — hides real bugs\n4. Not restoring stubs/spies — leaks mock state between tests\n5. Using `assertThrows` for async code — use `assertRejects`\n6. Over-mocking in integration tests — defeats the purpose\n7. Relying on test execution order — tests should be independent\n8. Hardcoding URLs and credentials — use `Deno.env.get()` + `--env-file`\n9. Ignoring the lock file issue — use `--no-lock` with Supabase Edge Runtime\n10. Using `assert(condition)` for everything — provides no useful failure message; use specific assertions (`assertEquals`, `assertStringIncludes`, etc.)\n11. Mocking `fetch` in integration tests — defeats the purpose of integration testing; use real HTTP calls to the local server\n12. 
Sharing mutable state without cleanup — tests become order-dependent; reset in `beforeEach`/`afterEach`\n"},"test-frontend":{"content":"---\nname: test-frontend\ndescription: Use when writing, reviewing, or fixing React component/hook tests, or when auditing frontend tests for RTL, Vitest, Zustand, or TanStack Query best practices. Triggers on query priority issues, mock leaks, flaky async tests, or Kent C. Dodds common-mistakes violations.\n---\n\n# React Frontend Testing\n\nWrite and review React component and hook tests using Vitest and React Testing Library (RTL). Every recommendation is sourced from official documentation — see [REFERENCE.md](REFERENCE.md) for citations, code examples, and detailed explanations.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for components/hooks the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's test directory (commonly `src/__tests__/` or `__tests__/` — check the project structure).\n\n## Prerequisites\n\n```bash\ncd app && npx vitest # run all tests (watch mode)\ncd app && npx vitest run # run all tests once\ncd app && npx vitest run src/__tests__/path/to/file.test.tsx # specific file\n```\n\n## The Core Principle\n\nSource: [testing-library.com/docs/guiding-principles](https://testing-library.com/docs/guiding-principles)\n\n> \"The more your tests resemble the way your software is used, the more confidence they can give you.\"\n\nThis means:\n- Test from the user's perspective (what they see and interact with)\n- Query elements by their accessible roles and visible text\n- Do NOT test implementation details (internal state, CSS classes, component structure)\n\n## Principles to Enforce\n\n### 1. 
Query Priority — Use the Most Accessible Query\n\n\| Priority \| Query \| When to use \|\n\|----------\|-------\|-------------\|\n\| 1 \| `getByRole` \| Default choice — accessible to everyone \|\n\| 2 \| `getByLabelText` \| Form fields with labels \|\n\| 3 \| `getByPlaceholderText` \| When no label exists \|\n\| 4 \| `getByText` \| Non-interactive content \|\n\| 5 \| `getByDisplayValue` \| Filled-in form elements \|\n\| 6 \| `getByAltText` \| Images, areas, inputs \|\n\| 7 \| `getByTitle` \| Tooltip-like content \|\n\| 8 \| `getByTestId` \| Last resort only \|\n\nAnti-patterns:\n- Using `getByTestId` when `getByRole` would work\n- Using `container.querySelector()` — NEVER do this\n- Using `getByText` for buttons when `getByRole('button', { name: /text/i })` is available\n\n### 2. Query Type Selection — `getBy` vs `queryBy` vs `findBy`\n\n\| Type \| Returns \| Throws? \| Use when \|\n\|------\|---------\|---------\|----------\|\n\| `getBy` \| Element \| Yes, if not found \| Element MUST exist (default) \|\n\| `queryBy` \| Element or `null` \| No \| Asserting element does NOT exist \|\n\| `findBy` \| Promise\\<Element\\> \| Yes, after timeout \| Element appears asynchronously \|\n\| `getAllBy` \| Array \| Yes, if empty \| Multiple elements MUST exist \|\n\| `queryAllBy` \| Array (may be empty) \| No \| Checking count of elements \|\n\| `findAllBy` \| Promise\\<Array\\> \| Yes, after timeout \| Multiple elements appear async \|\n\nAnti-patterns:\n- Using `queryBy` to assert existence — use `getBy` instead\n- Wrapping `getBy` in `waitFor` — use `findBy` instead\n- Using `findBy` for synchronous elements — use `getBy` instead\n\n### 3. 
User Interactions — Use `@testing-library/user-event`\n\n- Use `userEvent.setup()` before `render()`, inside each test\n- The docs discourage using userEvent functions outside the test itself (e.g., in `before`/`after` hooks)\n- Use `user-event` for all interactions — only fall back to `fireEvent` for events `user-event` doesn't support\n\n### 4. Async Testing — `waitFor` and `findBy`\n\n- Use `findBy` for elements that appear asynchronously (it combines `waitFor` + `getBy`)\n- Use `waitFor` only for assertions that become true asynchronously\n- Do NOT wrap `getBy` in `waitFor` — use `findBy` instead\n- Do NOT leave `waitFor` callbacks empty\n- Do NOT put multiple assertions inside a single `waitFor` — one inside, rest outside\n- Do NOT put side effects (like `fireEvent.click`) inside `waitFor`\n\n### 5. `screen` — Always Use It\n\n- Always use `screen.getByRole(...)` etc. instead of destructuring from `render()`\n- `screen` is always available, reduces refactoring churn, and matches the recommended pattern\n\n### 6. Assertions — Use `jest-dom` Matchers\n\n- Use semantic matchers: `toBeDisabled()`, `toBeVisible()`, `toHaveTextContent()`, `toHaveAttribute()`\n- Do NOT check DOM properties directly (e.g., `button.disabled`, `element.textContent`)\n- Key matchers: `toBeVisible()`, `toBeDisabled()`/`toBeEnabled()`, `toBeInTheDocument()`, `toHaveTextContent()`, `toHaveAttribute()`, `toHaveClass()`, `toHaveValue()`, `toBeChecked()`\n\n### 7. Vitest Mocking — `vi.mock()`, `vi.spyOn()`, `vi.fn()`\n\n- `vi.mock()` is hoisted to top of file — runs before all imports\n- Use `vi.hoisted()` when you need variables available to the hoisted mock factory\n\n\| Method \| What it does \|\n\|--------\|-------------\|\n\| `vi.clearAllMocks()` \| Clears mock history (calls, instances). Does NOT reset implementation. \|\n\| `vi.resetAllMocks()` \| Clears history AND resets implementation to `() => undefined`. 
\|\n\| `vi.restoreAllMocks()` \| Restores original implementations for `vi.spyOn` spies. Does NOT clear history. \|\n\n- Use `vi.clearAllMocks()` in `beforeEach` — most common pattern\n\n### 8. Component Testing with Providers\n\n- Components using React Query, Router, or Zustand need provider wrappers\n- Create a `createWrapper()` function that returns a provider component with `QueryClientProvider` and `MemoryRouter`\n- Use the `wrapper` option in `render()` and `renderHook()`\n\n### 9. Testing React Query\n\n- Create a new `QueryClient` per test — prevents shared cache between tests\n- Set `retry: false` — prevents tests from retrying failed queries (makes failures instant)\n- Use the `wrapper` option to provide `QueryClientProvider`\n\n### 10. Testing Zustand Stores\n\n- Official pattern: create `__mocks__/zustand.ts` that auto-resets stores between tests\n- Alternative: set store state directly via `useAppStore.setState()` or `useAppStore.getState().clearStore()` in `beforeEach`\n- Vitest warning: if you change the Vitest `root` config (e.g., to `./src`), the `__mocks__` directory must be relative to that root\n\n### 11. Cleanup — Automatic\n\n- Do NOT call `cleanup()` manually in tests: React Testing Library runs it automatically when the test framework injects a global `afterEach` (in Vitest, that requires `globals: true`; otherwise register `afterEach(cleanup)` once in a setup file)\n- Do NOT import `cleanup` in individual test files — it's unnecessary boilerplate\n\n### 12. Test Isolation\n\n- Use `beforeEach` to reset mocks and store state\n- Create fresh `QueryClient` instances per test (not shared)\n- Use `vi.clearAllMocks()` in `beforeEach` to reset call history\n- Tests within a file share module scope — don't rely on test order\n\n### 13. 
What to Test vs What NOT to Test\n\nTest (user-observable behavior):\n- Rendered text and accessible elements\n- User interactions (click, type, submit) and their effects\n- Navigation and route changes\n- Error states and loading states\n- Accessibility (roles, labels, ARIA attributes)\n\nDo NOT test (implementation details):\n- Internal component state\n- CSS classes or inline styles\n- Component instance methods\n- Hook internals (test via component behavior or `renderHook`)\n- That a function was called N times (unless it's the main behavior being tested)\n\n### 14. Act Warnings — When to Use `act()`\n\n- `render()` and `fireEvent` are already wrapped in `act()` — do NOT wrap them again\n- Only use `act()` when directly triggering state updates outside of RTL utilities (e.g., calling store methods directly)\n\n## Common Anti-Patterns (Kent C. Dodds' Official List)\n\n\| # \| Anti-Pattern \| Fix \|\n\|---\|---\|---\|\n\| 1 \| Not using Testing Library ESLint plugins \| Install `eslint-plugin-testing-library` \|\n\| 2 \| Using `wrapper` as variable name for render result \| Destructure or use `screen` \|\n\| 3 \| Manually calling `cleanup` \| Remove — it's automatic \|\n\| 4 \| Not using `screen` \| Always use `screen.getByRole(...)` \|\n\| 5 \| Wrong assertion (`button.disabled` instead of matcher) \| Use `toBeDisabled()` \|\n\| 6 \| Wrapping everything in `act()` \| Remove — `render`/`fireEvent` already handle it \|\n\| 7 \| Using `getByTestId` instead of accessible queries \| Use `getByRole`, `getByText`, etc. 
\|\n\| 8 \| Using `container.querySelector()` \| Use `screen` queries \|\n\| 9 \| Not querying by text \| Query by visible text content \|\n\| 10 \| Not using `ByRole` most of the time \| `getByRole` is the default \|\n\| 11 \| Adding unnecessary `aria-`/`role` attributes \| Use semantic HTML \|\n\| 12 \| Using `fireEvent` instead of `user-event` \| Use `userEvent.setup()` \|\n\| 13 \| Using `query` for existence checks \| `query` is for NON-existence only \|\n\| 14 \| Using `waitFor` instead of `findBy` \| `findBy` = `waitFor` + `getBy` \|\n\| 15 \| Empty `waitFor(() => {})` callback \| Put an assertion inside \|\n\| 16 \| Multiple assertions in `waitFor` \| One assertion inside, rest outside \|\n\| 17 \| Side effects inside `waitFor` \| Put side effects outside the callback \|\n\| 18 \| Using `get` as implicit assertions \| Always use explicit `expect()` \|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.test.tsx` (lines X-Y)\nPrinciple: What the standard requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Linter\n\nRun ESLint with Testing Library plugin:\n\n```bash\ncd app && npx eslint src/__tests__/\n```\n\n## Rules\n\n- Only verified claims: every recommendation is backed by official Testing Library, Vitest, or framework documentation.\n- User perspective: test what users see and do, not internal implementation.\n- Accessible queries first: `getByRole` is the default; `getByTestId` is the last resort.\n- No unnecessary wrappers: don't add `act()`, `cleanup()`, or extra abstractions.\n- Fresh state per test: new QueryClient, reset store, clear mocks in `beforeEach`.\n- 
Explicit assertions: always use `expect()` — don't rely on `getBy` throwing as an assertion.\n","reference":"# React Frontend Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Guiding Principles](#1-guiding-principles)\n2. [Query Priority](#2-query-priority)\n3. [Query Types](#3-query-types)\n4. [User Events](#4-user-events)\n5. [Vitest Mocking](#5-vitest-mocking)\n6. [React Testing Library API](#6-react-testing-library-api)\n7. [Zustand Testing](#7-zustand-testing)\n8. [TanStack Query Testing](#8-tanstack-query-testing)\n9. [Common Mistakes](#9-common-mistakes)\n10. [jest-dom Matchers](#10-jest-dom-matchers)\n\n---\n\n## 1. Guiding Principles\n\nSource: [testing-library.com/docs/guiding-principles](https://testing-library.com/docs/guiding-principles)\n\n> \"The more your tests resemble the way your software is used, the more confidence they can give you.\"\n\nThe library emphasizes three principles:\n1. Tests should interact with DOM nodes rather than component instances\n2. Utilities should encourage testing applications as users would actually use them\n3. Implementations should remain simple and flexible\n\n---\n\n## 2. Query Priority\n\nSource: [testing-library.com/docs/queries/about](https://testing-library.com/docs/queries/about)\n\nOfficial order from most to least preferred:\n\n1. `getByRole` — \"query every element that is exposed in the accessibility tree\"\n2. `getByLabelText` — \"top preference\" for form fields\n3. `getByPlaceholderText` — fallback when labels unavailable\n4. `getByText` — for non-interactive elements outside forms\n5. `getByDisplayValue` — for form elements with filled-in values\n6. `getByAltText` — for elements supporting alt text\n7. `getByTitle` — least reliable semantic option\n8. `getByTestId` — only when other methods don't apply\n\n---\n\n## 3. 
Query Types\n\nSource: [testing-library.com/docs/queries/about](https://testing-library.com/docs/queries/about)\n\n\| Type \| 0 matches \| 1 match \| >1 matches \| Async? \|\n\|------\|-----------\|---------\|------------\|--------\|\n\| `getBy` \| Throw \| Return \| Throw \| No \|\n\| `queryBy` \| `null` \| Return \| Throw \| No \|\n\| `findBy` \| Throw \| Return \| Throw \| Yes (retries up to 1000ms) \|\n\| `getAllBy` \| Throw \| Array \| Array \| No \|\n\| `queryAllBy` \| `[]` \| Array \| Array \| No \|\n\| `findAllBy` \| Throw \| Array \| Array \| Yes \|\n\n---\n\n## 4. User Events\n\nSource: [testing-library.com/docs/user-event/intro](https://testing-library.com/docs/user-event/intro)\n\n`user-event` \"simulates user interactions by dispatching the events that would happen if the interaction took place in a browser.\"\n\nKey difference from `fireEvent`: `user-event` \"adds visibility and interactability checks along the way and manipulates the DOM just like a user interaction in the browser would.\"\n\nSetup:\n```ts\nconst user = userEvent.setup();\nrender(<MyComponent />);\nawait user.click(screen.getByRole('button'));\n```\n\nFull form interaction example:\n```ts\nimport userEvent from '@testing-library/user-event';\n\nit('submits the form', async () => {\n const user = userEvent.setup();\n render(<MyForm />);\n\n await user.type(screen.getByRole('textbox', { name: /name/i }), 'Alice');\n await user.click(screen.getByRole('button', { name: /submit/i }));\n\n expect(screen.getByText(/success/i)).toBeVisible();\n});\n```\n\nThe documentation \"discourages rendering or using any `userEvent` functions outside of the test itself - e.g. 
in a `before`/`after` hook.\"\n\n### Async Testing — `waitFor` and `findBy`\n\nSource: [testing-library.com/docs/dom-testing-library/api-async](https://testing-library.com/docs/dom-testing-library/api-async)\n\n```ts\n// GOOD: findBy for elements that appear asynchronously\nconst heading = await screen.findByRole('heading', { name: /welcome/i });\n\n// GOOD: waitFor for assertions that become true asynchronously\nawait waitFor(() => {\n expect(screen.getByText(/loaded/i)).toBeVisible();\n});\n\n// BAD: wrapping getBy in waitFor (use findBy instead)\nawait waitFor(() => {\n screen.getByText(/loaded/i); // wrong — use findByText\n});\n\n// BAD: empty waitFor callback\nawait waitFor(() => {}); // does nothing useful\n\n// BAD: multiple assertions in waitFor\nawait waitFor(() => {\n expect(a).toBe(1);\n expect(b).toBe(2); // if a fails, b never runs — put one inside, rest outside\n});\n\n// BAD: side effects inside waitFor\nawait waitFor(() => {\n fireEvent.click(button); // don't do this — put side effects outside\n expect(result).toBeVisible();\n});\n```\n\n### Always Use `screen`\n\nSource: Kent C. Dodds — \"Common Mistakes with React Testing Library\"\n\n```ts\n// GOOD\nrender(<MyComponent />);\nexpect(screen.getByRole('button')).toBeVisible();\n\n// BAD — destructuring from render\nconst { getByRole } = render(<MyComponent />);\nexpect(getByRole('button')).toBeVisible();\n```\n\nWhy: `screen` is always available, reduces refactoring churn, and matches the Testing Library recommended pattern.\n\n### jest-dom Good vs Bad Examples\n\nSource: [testing-library.com/docs/ecosystem-jest-dom](https://testing-library.com/docs/ecosystem-jest-dom)\n\n```ts\n// GOOD\nexpect(button).toBeDisabled();\nexpect(element).toBeVisible();\nexpect(element).toHaveTextContent('hello');\nexpect(link).toHaveAttribute('href', '/path');\n\n// BAD — checking properties directly\nexpect(button.disabled).toBe(true);\nexpect(element.textContent).toBe('hello');\n```\n\n---\n\n## 5. 
Vitest Mocking\n\n### vi.mock() hoisting\n\nSource: [vitest.dev/api/vi](https://vitest.dev/api/vi.html)\n\n\"`vi.mock` is hoisted (in other words, _moved_) to top of the file.\"\n\n\"The call to `vi.mock` is hoisted to top of the file. It will always be executed before all imports.\"\n\n### vi.hoisted()\n\nAllows side effects before static imports are evaluated. Returns the factory function's return value.\n\n```ts\nconst { mockFn } = vi.hoisted(() => ({\n mockFn: vi.fn(),\n}));\n\nvi.mock('./module', () => ({ fn: mockFn }));\n```\n\n### Mock clearing methods\n\nSource: [vitest.dev/api/vi](https://vitest.dev/api/vi.html)\n\n`vi.clearAllMocks()` — Calls `.mockClear()` on all spies. \"This will clear mock history without affecting mock implementations.\"\n\n`vi.resetAllMocks()` — Calls `.mockReset()` on all spies. \"This will clear mock history and reset each mock's implementation.\"\n\n`vi.restoreAllMocks()` — \"This restores all original implementations on spies created with `vi.spyOn`.\" Does NOT clear history.\n\n#### Recommended `beforeEach` pattern\n\n```ts\nbeforeEach(() => {\n vi.clearAllMocks(); // most common — clears call history between tests\n});\n```\n\n### Internal vs external access warning\n\nSource: [vitest.dev/guide/mocking](https://vitest.dev/guide/mocking)\n\n\"This only mocks _external_ access. In this example, if `original` calls `mocked` internally, it will always call the function defined in the module, not in the mock factory.\"\n\n---\n\n## 6. React Testing Library API\n\nSource: [testing-library.com/docs/react-testing-library/api](https://testing-library.com/docs/react-testing-library/api)\n\n### render()\n\nReturns: `container`, `baseElement`, `debug`, `rerender`, `unmount`, `asFragment`, plus all bound queries.\n\nOptions: `container`, `baseElement`, `hydrate`, `legacyRoot`, `wrapper`, `queries`, `reactStrictMode`.\n\n### wrapper option\n\n\"Pass a React Component as the `wrapper` option to have it rendered around the inner element. 
This is most useful for creating reusable custom render functions for common data providers.\"\n\n### cleanup\n\n\"Unmounts React trees that were mounted with render. This is called automatically if your testing framework (such as mocha, Jest or Jasmine) injects a global `afterEach()` function.\"\n\n### renderHook()\n\n\"A convenience wrapper around `render` with a custom test component.\" Returns `result` (with `result.current`), `rerender`, `unmount`.\n\n### Provider Wrapper Pattern (QueryClient + MemoryRouter)\n\nComponents that use React Query, Router, or Zustand need provider wrappers:\n\n```ts\nimport { QueryClient, QueryClientProvider } from '@tanstack/react-query';\nimport { MemoryRouter } from 'react-router-dom';\nimport { createElement, type ReactNode } from 'react';\n\nfunction createWrapper() {\n const queryClient = new QueryClient({\n defaultOptions: { queries: { retry: false } },\n });\n return ({ children }: { children: ReactNode }) =>\n createElement(QueryClientProvider, { client: queryClient },\n createElement(MemoryRouter, null, children)\n );\n}\n\n// In test:\nrender(<MyComponent />, { wrapper: createWrapper() });\n\n// For hooks:\nrenderHook(() => useMyHook(), { wrapper: createWrapper() });\n```\n\n---\n\n## 7. Zustand Testing\n\nSource: [github.com/pmndrs/zustand — docs/learn/guides/testing.md](https://github.com/pmndrs/zustand)\n\n### Official recommendation\n\n\"We recommend using React Testing Library (RTL) to test out React components that connect to Zustand.\"\n\n\"We also recommend using Mock Service Worker (MSW) to mock network requests.\"\n\n### Store reset pattern (Vitest)\n\n1. 
Create `__mocks__/zustand.ts`:\n\n```ts\nimport { act } from '@testing-library/react';\nimport type * as ZustandExportedTypes from 'zustand';\nexport * from 'zustand';\n\nconst { create: actualCreate, createStore: actualCreateStore } =\n await vi.importActual<typeof ZustandExportedTypes>('zustand');\n\nexport const storeResetFns = new Set<() => void>();\n\nconst createUncurried = <T>(\n stateCreator: ZustandExportedTypes.StateCreator<T>,\n) => {\n const store = actualCreate(stateCreator);\n const initialState = store.getInitialState();\n storeResetFns.add(() => { store.setState(initialState, true); });\n return store;\n};\n\nexport const create = (<T>(\n stateCreator: ZustandExportedTypes.StateCreator<T>,\n) => {\n return typeof stateCreator === 'function'\n ? createUncurried(stateCreator)\n : createUncurried;\n}) as typeof ZustandExportedTypes.create;\n\n// Similar for createStore...\n\nafterEach(() => {\n act(() => { storeResetFns.forEach((fn) => fn()); });\n});\n```\n\n2. In setup file: `vi.mock('zustand');`\n\n### Alternative: Direct `setState` in Tests\n\nFor simpler cases, set store state directly before each test:\n\n```ts\nimport { useAppStore } from '../../store';\n\nbeforeEach(() => {\n useAppStore.getState().clearStore(); // if clearStore action exists\n // or\n useAppStore.setState({ key: initialValue });\n});\n```\n\n### Warning\n\n\"In Vitest you can change the root. Due to that, you need make sure that you are creating your `__mocks__` directory in the right place. Let's say that you change the root to `./src`, that means you need to create a `__mocks__` directory under `./src`.\"\n\n---\n\n## 8. 
TanStack Query Testing\n\nSource: TanStack Query official documentation\n\n### Key patterns\n\n- Create a new `QueryClient` for each test to prevent cache leaking\n- Set `retry: false` to make failures immediate:\n ```ts\n new QueryClient({ defaultOptions: { queries: { retry: false } } })\n ```\n- Provide via wrapper:\n ```ts\n const wrapper = ({ children }) =>\n createElement(QueryClientProvider, { client: queryClient }, children);\n ```\n\n---\n\n## 9. Common Mistakes\n\nSource: [kentcdodds.com/blog/common-mistakes-with-react-testing-library](https://kentcdodds.com/blog/common-mistakes-with-react-testing-library) (Kent C. Dodds, creator of Testing Library)\n\n1. Not using Testing Library ESLint plugins — Install and use them\n2. Using `wrapper` as variable name for render result — Use `screen` or destructure\n3. Manually calling `cleanup` — It's automatic\n4. Not using `screen` — Always use `screen` for queries\n5. Wrong assertion — Use jest-dom matchers like `toBeDisabled()`\n6. Wrapping in `act()` unnecessarily — `render`/`fireEvent` already handle it\n7. Using wrong query — Use accessible queries, not `getByTestId`\n8. Using `container.querySelector()` — Use `screen` queries\n9. Not querying by text — Query by visible text content\n10. Not using `ByRole` — It should be the primary query\n11. Adding `aria-`/`role` incorrectly — Use semantic HTML elements\n12. Using `fireEvent` instead of `user-event` — `userEvent.setup()` is preferred\n13. Using `query` for existence — `query` is for NON-existence; use `getBy` for existence\n14. Using `waitFor` instead of `findBy` — `findBy` = `waitFor` + `getBy`\n15. Empty `waitFor` callback — Must contain an assertion\n16. Multiple assertions in `waitFor` — One inside, rest outside\n17. Side effects inside `waitFor` — Put side effects outside\n18. Using `get` as implicit assertions — Always use explicit `expect()`\n\n### `act()` — When to Use It\n\nSource: Kent C. 
Dodds — \"Common Mistakes with React Testing Library\"\n\n`render()` and `fireEvent` are already wrapped in `act()`. Do NOT wrap them again.\n\n```ts\n// BAD — unnecessary act()\nact(() => {\n render(<MyComponent />);\n});\n\n// GOOD — render already handles act()\nrender(<MyComponent />);\n```\n\nOnly use `act()` when directly triggering state updates outside of RTL utilities (e.g., calling store methods directly).\n\n### What to Test vs What NOT to Test\n\nTest (user-observable behavior):\n- Rendered text and accessible elements\n- User interactions (click, type, submit) and their effects\n- Navigation and route changes\n- Error states and loading states\n- Accessibility (roles, labels, ARIA attributes)\n\nDo NOT test (implementation details):\n- Internal component state\n- CSS classes or inline styles\n- Component instance methods\n- Hook internals (test via component behavior or `renderHook`)\n- That a function was called N times (unless it's the main behavior being tested)\n\n---\n\n## 10. 
jest-dom Matchers\n\nSource: [testing-library.com/docs/ecosystem-jest-dom](https://testing-library.com/docs/ecosystem-jest-dom)\n\nKey matchers for DOM testing:\n\n\| Matcher \| Tests for \|\n\|---\|---\|\n\| `toBeInTheDocument()` \| Element exists in DOM \|\n\| `toBeVisible()` \| Element is visible to user \|\n\| `toBeDisabled()` / `toBeEnabled()` \| Disabled state \|\n\| `toBeChecked()` \| Checkbox/radio is checked \|\n\| `toBeRequired()` \| Form element is required \|\n\| `toBeValid()` / `toBeInvalid()` \| Form validation state \|\n\| `toBeEmptyDOMElement()` \| No content \|\n\| `toHaveTextContent(text)` \| Contains text \|\n\| `toHaveAttribute(attr, value?)` \| Has HTML attribute \|\n\| `toHaveClass(className)` \| Has CSS class \|\n\| `toHaveStyle(css)` \| Has inline style \|\n\| `toHaveValue(value)` \| Form element value \|\n\| `toHaveDisplayValue(value)` \| Displayed value \|\n\| `toHaveFocus()` \| Element is focused \|\n\| `toContainElement(element)` \| Contains child element \|\n\| `toContainHTML(html)` \| Contains HTML string \|\n\| `toHaveDescription(text)` \| Has `aria-describedby` text \|\n\| `toHaveErrorMessage(text)` \| Has `aria-errormessage` text \|\n\| `toHaveAccessibleName(name)` \| Has accessible name \|\n\| `toHaveAccessibleDescription(desc)` \| Has accessible description \|\n\n### Vitest environment\n\nSource: [vitest.dev/guide/features](https://vitest.dev/guide/features.html)\n\nVitest supports both `happy-dom` and `jsdom` for DOM mocking: \"happy-dom or jsdom for DOM mocking.\" Configure via the `environment` option in vitest config.\n\n\"Vitest also isolates each file's environment so env mutations in one file don't affect others.\"\n"},"test-pgtap":{"content":"---\nname: test-pgtap\ndescription: Use when writing, reviewing, or fixing pgTAP tests for Supabase SQL migrations, or when auditing database tests for best practices. 
Triggers on plan count mismatches, transaction isolation issues, RLS policy testing, privilege verification, or assertion selection problems.\n---\n\n# pgTAP Database Testing\n\nWrite and review pgTAP tests for Supabase SQL migrations. Every recommendation is sourced from official pgTAP documentation (pgtap.org) or Supabase documentation — see [REFERENCE.md](REFERENCE.md) for citations, full function reference tables, and detailed examples.\n\n## Scope\n\nDetermine what to review or write based on user request:\n\n- Write mode: write new tests for migrations the user specifies\n- Review mode: audit existing test files for anti-patterns and best practice violations\n- Fix mode: fix failing or flawed tests\n\nTest files live in the project's database test directory (Supabase convention: `supabase/tests/database/*.test.sql`).\n\n## Prerequisites\n\n```bash\nnpx supabase start # start local Supabase stack\nnpx supabase test db # run all pgTAP tests\nnpx supabase db reset # reset DB if needed (re-runs all migrations + seeds)\nnpx supabase db lint # run plpgsql_check linter\n```\n\nRequired extension: The `supabase_test_helpers` extension must be enabled for user management helpers (`tests.create_supabase_user`, `tests.authenticate_as`, etc.). Enable it in a migration with `CREATE EXTENSION IF NOT EXISTS supabase_test_helpers;`.\n\n## Principles to Enforce\n\n### 1. Transaction Isolation — `BEGIN`/`ROLLBACK`\n\nEvery test file MUST be wrapped in a `BEGIN;` ... `ROLLBACK;` transaction.\n\n- ROLLBACK ensures all test-created data is cleaned up; tests cannot leak state\n- Always `BEGIN;` as the first statement, `ROLLBACK;` as the last\n- NEVER use `COMMIT;` in test files\n- `finish()` must be called before `ROLLBACK` to output TAP diagnostics\n\n### 2. 
Plan Counts — Always Use `SELECT plan(N)`\n\npgTAP official documentation states about `no_plan()`: \"Try to avoid using this as it weakens your test.\"\n\n- ALWAYS use `SELECT plan(N)` with an exact count\n- NEVER use `no_plan()` — it hides missing/skipped assertions\n- The plan count MUST match the actual number of assertion calls\n- When adding/removing assertions, update the plan count AND the file header comment\n\n### 3. Test File Organization\n\nNaming convention: `NNNNN-description.test.sql` where NNNNN is a zero-padded number controlling execution order.\n\nFile header template:\n```sql\n-- NNNNN-description.test.sql\n-- Tests for: <what migration/feature this tests>\n--\n-- Covers:\n-- 1. <first thing tested>\n-- 2. <second thing tested>\n--\n-- Assertion count: N\n-- Dependency: <test files or seeds this depends on>\n```\n\nCategorization:\n- `00NNN` — Schema tests (table/column/index/constraint existence) and trigger tests\n- `01NNN` — RPC/function behavioral tests\n- Files should test ONE migration or ONE logical unit\n\n### 4. 
Assertion Function Selection\n\nUse the most specific assertion for the situation:\n\n\| Situation \| Use \| NOT \|\n\|-----------\|-----\|-----\|\n\| Exact value equality \| `is(have, want, desc)` \| `ok(have = want, desc)` \|\n\| Value inequality \| `isnt(have, want, desc)` \| `ok(have != want, desc)` \|\n\| Boolean condition \| `ok(condition, desc)` \| `is(condition, true, desc)` \|\n\| Row existence \| `ok(EXISTS(SELECT ...), desc)` \| Checking count \|\n\| Exception expected \| `throws_ok(sql, errcode, errmsg, desc)` \| Manual BEGIN/EXCEPTION \|\n\| No exception expected \| `lives_ok(sql, desc)` \| Running SQL without assertion \|\n\| Row non-existence \| `ok(NOT EXISTS(SELECT ...), desc)` \| `is(count, 0, desc)` \|\n\| Exact row comparison \| `results_eq(sql, sql, desc)` \| Manual row-by-row checks \|\n\| Set equality (order-independent) \| `set_eq(sql, sql, desc)` \| `results_eq` when order doesn't matter \|\n\| Empty result set \| `is_empty(sql, desc)` \| `ok(NOT EXISTS(...))` \|\n\n`is()` uses `IS NOT DISTINCT FROM` — this correctly handles NULL comparisons (unlike `=`).\n\n### 5. Schema Tests — Existence and Structure\n\nTest that migrations created expected schema objects: tables, columns (type, nullability, defaults), primary/foreign keys, check constraints, indexes, and RLS enabled status. Use `has_table`, `has_column`, `col_type_is`, `col_not_null`, `col_default_is`, `col_is_pk`, `fk_ok`, `has_check`, `has_index`, etc. See REFERENCE.md for the full function list.\n\n### 6. 
Behavioral Tests — RPCs and Business Logic\n\nTest PL/pgSQL function behavior by calling them and asserting outcomes:\n\n- Happy path: call the function, assert return value with `is()`\n- Exception path: use `throws_ok()` with SQL string wrapped in `$sql$...$sql$`\n- Side effects: verify rows created/modified with `ok(EXISTS(...))`\n- Use `format()` with `%L` for parameter interpolation in `throws_ok` SQL strings\n- SQLSTATE `'P0001'` for custom `RAISE EXCEPTION`; third arg is exact error message match\n\n### 7. RLS Policy Testing\n\n- Schema: verify policies exist with `policies_are()`, roles with `policy_roles_are()`, commands with `policy_cmd_is()`\n- Behavioral: set role context with `SET LOCAL ROLE` + `SET LOCAL \"request.jwt.claims\"`, then query and assert access enforcement\n- Prefer `tests.authenticate_as()` helper over manual `SET LOCAL` when available\n- Always `RESET ROLE` or rely on `ROLLBACK` to restore context\n\n### 8. SECURITY DEFINER and Privilege Testing\n\n- Verify security context with `is_definer()` / `isnt_definer()`\n- Verify privilege grants/revokes with `function_privs_are()` — pass `ARRAY[]::text[]` for no privileges\n- Parameter types use `ARRAY['uuid']::name[]` — use `ARRAY[]::name[]` for no-argument functions\n- Test all relevant roles: `anon`, `authenticated`, `service_role`\n\n### 9. Trigger Testing\n\n- Schema: verify trigger exists with `has_trigger()`, trigger function with `has_function()`, security context with `is_definer()`\n- Behavioral: insert data and verify side effects with `ok(EXISTS(...))`\n\n### 10. 
Supabase Test Helpers\n\n- Use `tests.create_supabase_user()` for user creation — fires auth triggers (do not raw INSERT into `auth.users`)\n- Use `tests.get_supabase_uid()` to retrieve test user UUIDs\n- Use `tests.authenticate_as()` / `tests.authenticate_as_service_role()` / `tests.clear_authentication()` for role context\n- Use unique aliases per test file; prefix with the test file's theme (e.g., `auth_trigger_alice`)\n- JSONB metadata must include `sub` and `preferred_username` for GitHub OAuth simulation\n\n### 11. Test Description Conventions\n\nEvery assertion MUST have a descriptive message.\n\nFormat: `'<function_or_feature>: <what is being verified>'`\n\nGood: `'my_rpc: returns correct value for edge case'`\nBad: no description, or vague like `'test 1'`\n\n### 12. Determinism and Independence\n\n- Tests MUST be deterministic — same result every run\n- Use fixed values, not `random()`, `now()`, or `gen_random_uuid()` in assertions\n- Each test file should be independent — don't rely on state from other test files\n- Clean up is handled by `ROLLBACK` — no explicit DELETE needed\n\n### 13. `SET LOCAL` vs `SET` — Scope to the Transaction\n\n- Always use `SET LOCAL` when changing session variables inside tests (role, JWT claims)\n- Both are reverted by `ROLLBACK`, but plain `SET` persists after `COMMIT` while `SET LOCAL` does not — use `SET LOCAL` to make scoping explicit and guard against accidental `COMMIT`\n\n### 14. 
SAVEPOINT Caveat — Avoid Sub-transactions\n\n- Do NOT use `SAVEPOINT`/`ROLLBACK TO` inside pgTAP test files\n- Rolling back to a savepoint discards assertions emitted after it, causing plan count mismatches\n- Use `throws_ok()` instead for error testing\n\n## Common Anti-Patterns\n\n\| Anti-Pattern \| Why it's wrong \| Fix \|\n\|---\|---\|---\|\n\| `no_plan()` \| Hides missing assertions \| Use `plan(N)` with exact count \|\n\| Missing `ROLLBACK` \| Test data leaks to other files \| Always end with `ROLLBACK;` \|\n\| `ok(a = b, desc)` for equality \| Fails silently on NULL \| Use `is(a, b, desc)` \|\n\| No description on assertions \| Failures are undiagnosable \| Always provide descriptive message \|\n\| Testing private internals \| Brittle, breaks on refactor \| Test public RPC behavior \|\n\| Hardcoded UUIDs \| Collides with other tests \| Use `tests.get_supabase_uid()` \|\n\| `COMMIT` in test files \| Permanently alters database \| Use `ROLLBACK` \|\n\| Plan count mismatch \| Test suite reports wrong total \| Keep count in sync with assertions \|\n\| Missing `finish()` \| No diagnostic output on failure \| Always call before `ROLLBACK` \|\n\| `SET` instead of `SET LOCAL` \| Persists after `COMMIT`; less explicit scoping \| Always use `SET LOCAL` inside tests \|\n\| `SAVEPOINT`/`ROLLBACK TO` \| Discards assertions, breaks plan count \| Use `throws_ok()` for error testing \|\n\n## Output Format (Review Mode)\n\nWhen reviewing existing tests, group findings by severity:\n\n```\n## Critical\nIssues that make tests unreliable, flaky, or misleading.\n\n### [PRINCIPLE] Brief title\nFile: `path/to/file.test.sql` (lines X-Y)\nPrinciple: What the standard requires.\nViolation: What the code does wrong.\nFix: Specific, actionable suggestion.\n\n## Warning\nIssues that weaken test value or violate conventions.\n\n(same structure)\n\n## Suggestion\nImprovements aligned with best practices.\n\n(same structure)\n```\n\n## Rules\n\n- Only verified claims: every recommendation 
is backed by pgtap.org or Supabase official documentation.\n- Schema AND behavior: test both that objects exist (schema) and that they work correctly (behavior).\n- Transaction discipline: every file wrapped in BEGIN/ROLLBACK, no exceptions.\n- Exact plan counts: never use `no_plan()`.\n- Descriptive messages: every assertion needs a clear description.\n- Test the contract: test what RPCs accept, return, and side-effect — not internal implementation.\n","reference":"# pgTAP Database Testing Reference\n\nDetailed definitions, official sources, and verified citations for each principle in this skill.\n\n## Table of Contents\n\n1. [Test Structure](#1-test-structure)\n2. [Plan Counts](#2-plan-counts)\n3. [Core Assertions](#3-core-assertions)\n4. [Schema Testing Functions](#4-schema-testing-functions)\n5. [Column Testing Functions](#5-column-testing-functions)\n6. [Function Testing Functions](#6-function-testing-functions)\n7. [RLS Policy Functions](#7-rls-policy-functions)\n8. [Privilege Testing Functions](#8-privilege-testing-functions)\n9. [Exception Testing](#9-exception-testing)\n10. [Result Set Testing](#10-result-set-testing)\n11. [Supabase Helpers](#11-supabase-helpers)\n12. [Diagnostics and Utilities](#12-diagnostics-and-utilities)\n\n---\n\n## 1. 
Test Structure\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\nThe standard test structure shown in pgTAP documentation:\n\n```sql\nBEGIN;\nSELECT plan(N);\n-- tests\nSELECT * FROM finish();\nROLLBACK;\n```\n\n\"This ensures all changes (including function loading) are rolled back after tests complete.\"\n\nSource (Supabase): [supabase.com/docs/guides/database/extensions/pgtap](https://supabase.com/docs/guides/database/extensions/pgtap)\n\nSupabase's examples use the same `begin;`/`rollback;` pattern.\n\nRunning tests: `supabase test db`\n\nSource (Supabase): [supabase.com/docs/guides/database/testing](https://supabase.com/docs/guides/database/testing)\n\nTest files go in `./supabase/tests/database/` with `.sql` extension. \"All `sql` files use pgTAP as the test runner.\"\n\n### File Naming Convention\n\nConvention: `NNNNN-description.test.sql` where NNNNN is a zero-padded number controlling execution order.\n\nCategorization ranges:\n- `00NNN` — Schema tests (table/column/index/constraint existence) and trigger schema + behavioral tests\n- `01NNN` — RPC/function behavioral tests\n\nFiles should test ONE migration or ONE logical unit.\n\n### Test Description Conventions\n\nEvery assertion MUST have a descriptive message. Format: `'<function_or_feature>: <what is being verified>'`\n\n```sql\n-- Good: tells you what's being tested and what function/feature\nSELECT is(result, expected, 'my_rpc: returns correct value for edge case');\n\n-- Bad: no description\nSELECT is(result, expected);\n\n-- Bad: vague\nSELECT is(result, expected, 'test 1');\n```\n\n---\n\n## 2. Plan Counts\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n`SELECT plan(N);` — declares the expected number of tests.\n\n`SELECT * FROM no_plan();` — for cases where test count is unknown. \"Try to avoid using this as it weakens your test.\"\n\n`SELECT * FROM finish();` — outputs TAP summary, reports failures. 
Optional parameter: `finish(true)` throws exception if any test failed.\n\n---\n\n## 3. Core Assertions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n### Basic\n\n\| Function \| Description \|\n\|---\|---\|\n\| `ok(boolean, description)` \| Passes if boolean is true \|\n\| `is(have, want, description)` \| Equality using `IS NOT DISTINCT FROM` (NULL-safe) \|\n\| `isnt(have, want, description)` \| Inequality using `IS DISTINCT FROM` \|\n\| `pass(description)` \| Unconditional pass \|\n\| `fail(description)` \| Unconditional fail \|\n\| `isa_ok(value, regtype, name)` \| Type checking \|\n\n### Pattern Matching\n\n\| Function \| Description \|\n\|---\|---\|\n\| `matches(have, regex, description)` \| Regex match \|\n\| `imatches(have, regex, description)` \| Case-insensitive regex match \|\n\| `doesnt_match(have, regex, description)` \| Regex non-match \|\n\| `alike(have, like_pattern, description)` \| SQL LIKE pattern \|\n\| `unalike(have, like_pattern, description)` \| LIKE non-match \|\n\| `cmp_ok(have, operator, want, description)` \| Arbitrary operator comparison \|\n\n---\n\n## 4. 
Schema Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n### Existence\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_table(schema, table, desc)` \| Table exists \|\n\| `hasnt_table(schema, table, desc)` \| Table doesn't exist \|\n\| `has_view(schema, view, desc)` \| View exists \|\n\| `has_materialized_view(schema, view, desc)` \| Materialized view exists \|\n\| `has_sequence(schema, sequence, desc)` \| Sequence exists \|\n\| `has_index(schema, table, index, desc)` \| Index exists \|\n\| `has_trigger(schema, table, trigger, desc)` \| Trigger exists \|\n\| `has_function(schema, function, desc)` \| Function exists \|\n\| `has_extension(name, desc)` \| Extension enabled \|\n\| `has_schema(name, desc)` \| Schema exists \|\n\| `has_type(schema, type, desc)` \| Type exists \|\n\| `has_enum(schema, enum, desc)` \| Enum exists \|\n\| `has_composite(schema, composite, desc)` \| Composite type exists \|\n\| `has_domain(schema, domain, desc)` \| Domain exists \|\n\| `has_role(name, desc)` \| Role exists \|\n\n### Collection assertions\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `tables_are(schema, tables_array, desc)` \| Exact set of tables \|\n\| `views_are(schema, views_array, desc)` \| Exact set of views \|\n\| `columns_are(schema, table, columns_array, desc)` \| Exact set of columns \|\n\| `indexes_are(schema, table, indexes_array, desc)` \| Exact set of indexes \|\n\| `triggers_are(schema, table, triggers_array, desc)` \| Exact set of triggers \|\n\| `functions_are(schema, functions_array, desc)` \| Exact set of functions \|\n\| `schemas_are(schemas_array, desc)` \| Exact set of schemas \|\n\| `extensions_are(schema, extensions_array, desc)` \| Exact set of extensions \|\n\| `roles_are(roles_array, desc)` \| Exact set of roles \|\n\| `enum_has_labels(schema, enum, labels_array, desc)` \| Enum has expected labels \|\n\n---\n\n## 5. 
Column Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_column(schema, table, column, desc)` \| Column exists \|\n\| `hasnt_column(schema, table, column, desc)` \| Column doesn't exist \|\n\| `col_type_is(schema, table, column, type, desc)` \| Column has expected type \|\n\| `col_not_null(schema, table, column, desc)` \| Column is NOT NULL \|\n\| `col_is_null(schema, table, column, desc)` \| Column allows NULL \|\n\| `col_has_default(schema, table, column, desc)` \| Column has a default \|\n\| `col_hasnt_default(schema, table, column, desc)` \| Column has no default \|\n\| `col_default_is(schema, table, column, default, desc)` \| Default value matches \|\n\| `col_is_pk(schema, table, column, desc)` \| Column is primary key \|\n\| `col_isnt_pk(schema, table, column, desc)` \| Column is not primary key \|\n\| `col_is_fk(schema, table, column, desc)` \| Column is foreign key \|\n\| `col_isnt_fk(schema, table, column, desc)` \| Column is not foreign key \|\n\| `col_is_unique(schema, table, column, desc)` \| Column has unique constraint \|\n\| `has_pk(schema, table, desc)` \| Table has a primary key \|\n\| `has_fk(schema, table, desc)` \| Table has a foreign key \|\n\| `fk_ok(schema, table, cols, ref_schema, ref_table, ref_cols, desc)` \| Foreign key references correct table \|\n\| `has_check(schema, table, desc)` \| Table has a check constraint \|\n\| `has_unique(schema, table, desc)` \| Table has a unique constraint \|\n\| `is_partitioned(schema, table, desc)` \| Table is partitioned \|\n\| `is_partition_of(schema, table, parent_schema, parent, desc)` \| Table is partition of parent \|\n\n---\n\n## 6. 
Function Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `has_function(schema, function, args, desc)` \| Function exists with given args \|\n\| `function_lang_is(schema, function, args, language, desc)` \| Function language (plpgsql, sql, etc.) \|\n\| `function_returns(schema, function, args, return_type, desc)` \| Return type \|\n\| `is_definer(schema, function, args, desc)` \| SECURITY DEFINER \|\n\| `isnt_definer(schema, function, args, desc)` \| NOT SECURITY DEFINER (INVOKER) \|\n\| `is_strict(schema, function, args, desc)` \| STRICT (RETURNS NULL ON NULL INPUT) \|\n\| `isnt_strict(schema, function, args, desc)` \| NOT STRICT \|\n\| `volatility_is(schema, function, args, volatility, desc)` \| IMMUTABLE, STABLE, or VOLATILE \|\n\| `is_aggregate(schema, function, args, desc)` \| Is an aggregate function \|\n\| `is_procedure(schema, function, args, desc)` \| Is a procedure \|\n\| `is_normal_function(schema, function, args, desc)` \| Is a normal function \|\n\| `trigger_is(schema, table, trigger, func_schema, function, desc)` \| Trigger calls expected function \|\n\n### Trigger Testing\n\nBeyond schema assertions (`has_trigger`, `trigger_is`), test trigger behavior by inserting data and verifying side effects:\n\n```sql\n-- Trigger exists on table\nSELECT has_trigger('public', 'messages', 'on_message_insert',\n 'on_message_insert trigger exists on messages');\n\n-- Trigger function exists and is SECURITY DEFINER\nSELECT has_function('public', 'broadcast_new_message', 'trigger function exists');\nSELECT is_definer('public', 'broadcast_new_message', ARRAY[]::name[],\n 'broadcast_new_message is SECURITY DEFINER');\n\n-- Trigger behavior (insert data, verify side effects)\nINSERT INTO public.messages (...) 
VALUES (...);\nSELECT ok(\n EXISTS (SELECT 1 FROM public.expected_side_effect WHERE ...),\n 'trigger creates expected side effect'\n);\n```\n\n### Argument format\n\nFor `args`, use `ARRAY['uuid', 'text']::name[]` or `ARRAY[]::name[]` for no arguments:\n\n```sql\nSELECT is_definer('public', 'my_function', ARRAY['uuid', 'integer']::name[], 'is SECURITY DEFINER');\nSELECT is_definer('public', 'my_trigger_fn', ARRAY[]::name[], 'trigger fn is SECURITY DEFINER');\n```\n\n---\n\n## 7. RLS Policy Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html), [supabase.com/docs/guides/database/extensions/pgtap](https://supabase.com/docs/guides/database/extensions/pgtap)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `policies_are(schema, table, policies_array, desc)` \| Exact set of policies on table \|\n\| `policy_roles_are(schema, table, policy, roles_array, desc)` \| Policy applies to these roles \|\n\| `policy_cmd_is(schema, table, policy, command, desc)` \| Policy applies to SELECT/INSERT/UPDATE/DELETE/ALL \|\n\n### Example from Supabase docs\n\n```sql\nSELECT policies_are(\n 'public', 'profiles',\n ARRAY['Profiles are public', 'Profiles can only be updated by the owner']\n);\n```\n\n### Behavioral RLS Testing\n\nTo test that RLS policies actually enforce access, set the role context and query as that user:\n\n```sql\n-- Set authenticated user context\nSET LOCAL ROLE authenticated;\nSET LOCAL \"request.jwt.claims\" = '{\"sub\": \"user-uuid-here\"}';\n\n-- Now queries run as that user, RLS applies\nSELECT is_empty(\n $$SELECT * FROM public.profiles WHERE id != 'user-uuid-here'$$,\n 'authenticated user cannot see other profiles'\n);\n\n-- Reset role\nRESET ROLE;\n```\n\nAlternatively, use the Supabase test helper `tests.authenticate_as()` (see section 11) which handles role and JWT claims together.\n\n---\n\n## 8. 
Privilege Testing Functions\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `table_privs_are(schema, table, role, privs, desc)` \| Table privileges (SELECT, INSERT, UPDATE, DELETE, etc.) \|\n\| `schema_privs_are(schema, role, privs, desc)` \| Schema privileges (CREATE, USAGE) \|\n\| `function_privs_are(schema, function, args, role, privs, desc)` \| Function privileges (EXECUTE) \|\n\| `sequence_privs_are(schema, sequence, role, privs, desc)` \| Sequence privileges \|\n\| `column_privs_are(schema, table, column, role, privs, desc)` \| Column-level privileges \|\n\| `database_privs_are(database, role, privs, desc)` \| Database privileges \|\n\n### Testing REVOKE\n\nTo verify a function has NO execute privilege for a role:\n\n```sql\nSELECT function_privs_are('public', 'my_function', ARRAY['uuid']::name[],\n 'authenticated', ARRAY[]::text[], -- empty array = no privileges\n 'authenticated: no execute on my_function');\n```\n\n### Behavioral Testing of Functions (RPCs)\n\nTest PL/pgSQL function behavior by calling them and asserting outcomes:\n\n```sql\n-- Happy path\nSELECT is(\n (SELECT public.my_rpc(param1, param2)),\n expected_value,\n 'my_rpc: returns expected value'\n);\n\n-- Exception path\nSELECT throws_ok(\n format($sql$SELECT public.my_rpc(%L, %L)$sql$, bad_param1, bad_param2),\n 'P0001', -- SQLSTATE for RAISE EXCEPTION\n 'Expected error message',\n 'my_rpc: rejects bad input'\n);\n\n-- Side effects\nSELECT ok(\n EXISTS (SELECT 1 FROM public.some_table WHERE condition),\n 'my_rpc: creates expected row'\n);\n```\n\n---\n\n## 9. 
Exception Testing\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `throws_ok(sql, errcode, errmsg, desc)` \| SQL raises expected exception \|\n\| `throws_like(sql, like_pattern, desc)` \| Exception message matches LIKE pattern \|\n\| `throws_matching(sql, regex, desc)` \| Exception message matches regex \|\n\| `lives_ok(sql, desc)` \| SQL does NOT raise an exception \|\n\| `performs_ok(sql, milliseconds, desc)` \| SQL completes within time limit \|\n\n### `throws_ok` signatures\n\n```sql\n-- Full form: SQLSTATE + message\nSELECT throws_ok(\n $$SELECT 1/0$$,\n '22012', -- SQLSTATE for division by zero\n 'division by zero',\n 'division by zero throws correct error'\n);\n\n-- Message only\nSELECT throws_ok(\n $$SELECT 1/0$$,\n 'division by zero'\n);\n\n-- SQLSTATE only\nSELECT throws_ok(\n $$SELECT 1/0$$,\n '22012'\n);\n```\n\n### Using `format()` with `%L` for parameter interpolation\n\nWhen passing dynamic values into `throws_ok` or `lives_ok` SQL strings, use `format()` with `%L` (literal-quoting placeholder) to safely interpolate values. This prevents SQL injection in test code:\n\n```sql\nSELECT throws_ok(\n format($sql$SELECT public.my_rpc(%L, %L)$sql$, bad_param1, bad_param2),\n 'P0001', -- SQLSTATE for RAISE EXCEPTION\n 'Expected error message',\n 'my_rpc: rejects bad input'\n);\n```\n\n### Common SQLSTATE codes\n\n- `P0001` — `RAISE EXCEPTION` (custom)\n- `23505` — unique_violation\n- `23503` — foreign_key_violation\n- `23514` — check_violation\n- `22012` — division_by_zero\n- `42501` — insufficient_privilege\n\n---\n\n## 10. 
Result Set Testing\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `results_eq(sql, sql, desc)` \| Exact row-by-row match (order matters) \|\n\| `results_ne(sql, sql, desc)` \| Results differ \|\n\| `set_eq(sql, sql, desc)` \| Same rows regardless of order/duplicates \|\n\| `set_ne(sql, sql, desc)` \| Different sets \|\n\| `set_has(sql, sql, desc)` \| First result is superset of second \|\n\| `set_hasnt(sql, sql, desc)` \| First result has none of second's rows \|\n\| `bag_eq(sql, sql, desc)` \| Same multiset (duplicates matter, order doesn't) \|\n\| `bag_ne(sql, sql, desc)` \| Different multisets \|\n\| `is_empty(sql, desc)` \| Query returns no rows \|\n\| `isnt_empty(sql, desc)` \| Query returns at least one row \|\n\| `row_eq(sql, record, desc)` \| Single row matches record \|\n\n---\n\n## 11. Supabase Helpers\n\nSource: [supabase.com/docs/guides/database/testing](https://supabase.com/docs/guides/database/testing), [supabase.com/docs/guides/local-development/testing/pgtap-extended](https://supabase.com/docs/guides/local-development/testing/pgtap-extended)\n\nSupabase provides a `tests` schema with helper functions for managing test users and context:\n\n### User Management\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.create_supabase_user(identifier, email, phone?, metadata?)` \| Creates `auth.users` record (fires auth triggers) \|\n\| `tests.get_supabase_uid(identifier)` \| Returns UUID of previously created test user \|\n\n```sql\nSELECT tests.create_supabase_user(\n 'my_test_user',\n 'test@example.com',\n NULL,\n '{\"sub\": \"12345\", \"preferred_username\": \"testuser\", \"avatar_url\": \"https://example.com/avatar.png\"}'::jsonb\n);\n\nSELECT tests.get_supabase_uid('my_test_user');\n-- Returns: uuid\n```\n\n### Authentication Context\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.authenticate_as(identifier)` \| Sets role to `authenticated` + JWT claims for user 
\|\n\| `tests.authenticate_as_service_role()` \| Sets role to `service_role`, clears JWT claims \|\n\| `tests.clear_authentication()` \| Sets role to `anon`, clears JWT claims \|\n\n```sql\n-- Test as authenticated user\nSELECT tests.authenticate_as('my_test_user');\n-- Now queries run with RLS applied for this user\n\n-- Test as service_role (bypasses RLS)\nSELECT tests.authenticate_as_service_role();\n\n-- Test as anonymous\nSELECT tests.clear_authentication();\n```\n\n### Usage Rules\n\n- Use unique aliases per test file to avoid collisions\n- Prefix aliases with the test file's theme (e.g., `auth_trigger_alice`)\n- The JSONB metadata passed to `create_supabase_user` must include `sub` and `preferred_username` for GitHub OAuth simulation\n- Use `tests.create_supabase_user()` for user setup, not raw `INSERT` into `auth.users` — this ensures auth triggers fire\n\n### RLS Verification\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.rls_enabled(schema)` \| Asserts ALL tables in schema have RLS enabled \|\n\| `tests.rls_enabled(schema, table)` \| Asserts specific table has RLS enabled \|\n\n```sql\nSELECT tests.rls_enabled('public');\n```\n\n### Time Control\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `tests.freeze_time(timestamp)` \| Freeze `now()` for deterministic time tests \|\n\| `tests.unfreeze_time()` \| Restore normal time behavior \|\n\n---\n\n## 12. 
Diagnostics and Utilities\n\nSource: [pgtap.org/documentation.html](https://pgtap.org/documentation.html)\n\n\| Function \| Purpose \|\n\|---\|---\|\n\| `diag(message)` \| Output diagnostic message (prefixed with `#`) \|\n\| `skip(reason, count)` \| Skip N tests with reason \|\n\| `todo(reason, count)` \| Mark N tests as to-do \|\n\| `todo_start(why)` / `todo_end()` \| Block-style todo marking \|\n\n### Ownership testing\n\n\| Function \| Tests for \|\n\|---\|---\|\n\| `table_owner_is(schema, table, owner, desc)` \| Table owner \|\n\| `view_owner_is(schema, view, owner, desc)` \| View owner \|\n\| `function_owner_is(schema, function, args, owner, desc)` \| Function owner \|\n\| `schema_owner_is(schema, owner, desc)` \| Schema owner \|\n\| `sequence_owner_is(schema, sequence, owner, desc)` \| Sequence owner \|\n\n---\n\n## 12b. Determinism and Independence\n\nTests MUST be deterministic — same result every run:\n\n- Use fixed values, not `random()`, `now()`, or `gen_random_uuid()` in assertions\n- Each test file should be independent — don't rely on state from other test files\n- Use `tests.create_supabase_user()` for user setup, not raw INSERT (ensures triggers fire)\n- Clean up is handled by `ROLLBACK` — no explicit DELETE needed\n- Use `tests.freeze_time()` when testing time-dependent logic\n\n---\n\n## 13. `SET LOCAL` vs `SET`\n\nSource: [PostgreSQL documentation — SET](https://www.postgresql.org/docs/current/sql-set.html)\n\n`SET LOCAL` restricts the setting to the current transaction. When the transaction ends (via `COMMIT` or `ROLLBACK`), the setting reverts to its session-level value.\n\nPlain `SET` (without `LOCAL`) changes the session-level value. Both are reverted by `ROLLBACK`, but the key difference: plain `SET` persists after `COMMIT`, while `SET LOCAL` does not. 
Since pgTAP tests use `ROLLBACK`, both are technically reverted — but `SET LOCAL` is still preferred because it makes the intent explicit and protects against accidental `COMMIT`.\n\n```sql\n-- Inside BEGIN/ROLLBACK:\nSET LOCAL ROLE authenticated; -- reverted by both COMMIT and ROLLBACK ✓\nSET ROLE authenticated; -- reverted by ROLLBACK, but persists after COMMIT ✗\n```\n\n---\n\n## 14. SAVEPOINT Caveat\n\nSource: PostgreSQL transaction semantics ([postgresql.org/docs/current/sql-savepoint.html](https://www.postgresql.org/docs/current/sql-savepoint.html))\n\nNote: pgTAP documentation does not explicitly address SAVEPOINTs. This guidance is derived from PostgreSQL transaction semantics: `ROLLBACK TO SAVEPOINT` undoes database changes such as table writes made after the savepoint (sequence increments from `nextval` are the exception and are never rolled back). Since pgTAP tracks test state within the transaction, rolling back to a savepoint can corrupt internal counters and cause plan count mismatches.\n\n```sql\n-- BAD: assertions between SAVEPOINT and ROLLBACK TO are lost\nSAVEPOINT sp1;\nSELECT ok(true, 'this assertion gets rolled back'); -- counted by plan but discarded\nROLLBACK TO sp1;\n-- Plan now expects more assertions than will actually complete\n\n-- GOOD: use throws_ok() instead\nSELECT throws_ok(\n $$SELECT some_function_that_should_fail()$$,\n 'P0001', 'expected error message',\n 'function rejects bad input'\n);\n```\n"},"ui-audit":{"content":"---\nname: ui-audit\ndescription: Use after any UI edit, when reviewing UI components, or when asked for an accessibility or structure audit. Triggers on WCAG 2.2 violations, WAI-ARIA APG pattern issues, touch target sizing, focus management, component duplication, or separation of concerns problems in React/Tailwind code.\n---\n\n# UI Audit — Accessibility & Structure\n\nAudit React/Tailwind UI code for accessibility violations and structural anti-patterns. 
Every finding must cite the specific standard (WCAG SC, WAI-ARIA APG pattern, platform guideline) so the developer knows the authoritative source.\n\nSee [REFERENCE.md](REFERENCE.md) for detailed standard definitions, exact requirements, and code examples.\n\n## Scope\n\nDetermine what to audit based on context:\n\n- Git diff mode (default when no scope specified and changes exist): run `git diff` and `git diff --cached` to audit only changed/added UI code (`.tsx`, `.css` files)\n- File/directory mode: audit the files or directories the user specifies\n- Full audit mode: when the user asks for a full UI audit, scan the project's `src/` directory (skip node_modules, build artifacts, test files)\n\nRead all in-scope code before producing findings.\n\n## Part 1 — Accessibility\n\nEvaluate against each check. Skip checks with no findings.\n\n### 1. Touch Target Size\n\nStandards: WCAG 2.5.5 (AAA) — 44x44 CSS px; WCAG 2.5.8 (AA) — 24x24 CSS px; Apple HIG — 44x44 pt; Material Design — 48x48 dp\n\nThis project targets mobile (future native app). Enforce 44x44px minimum (Tailwind `min-h-11` = 2.75rem = 44px at 16px root).\n\nWhat to check:\n- Every `<button>`, `<a>`, `<input>`, `<select>`, clickable `<div>`/`<li>`, and icon button must produce a tap target of at least 44x44px\n- Padding classes that produce heights below 44px on small text: `py-0.5` (~24px), `py-1` (~28px), `py-1.5` (~32px), `py-2` (~36px) on `text-sm`/`text-xs` elements\n- Toggle/switch components: the clickable area (not just the visual track) must be 44x44px\n- Close buttons (especially bare `x` character): must have padding to reach 44x44px\n\nExceptions (per WCAG 2.5.5):\n- Inline links within a sentence or block of text\n- Size determined entirely by the user agent\n- Equivalent control on the same page meets the size requirement\n- Specific presentation is essential to the information being conveyed\n\n### 2. 
Modal / Dialog Accessibility\n\nStandard: WAI-ARIA APG — Dialog (Modal) Pattern\n\nWhat to check (see REFERENCE.md §2 for full attribute table and focus management specs):\n- Container has `role=\"dialog\"`, `aria-modal=\"true\"`, and `aria-labelledby` or `aria-label`\n- Focus moves into the dialog on open and is trapped (Tab/Shift+Tab cycle within)\n- Focus returns to the triggering element on close\n- Escape key closes the dialog\n\nViolations to flag:\n- `role=\"presentation\"` on a modal container\n- No focus trap, no focus restoration, missing label\n\n### 3. Focus Visibility\n\nStandard: WCAG 2.4.7 (AA) — \"Any keyboard operable user interface has a mode of operation where the keyboard focus indicator is visible.\"\n\nWhat to check:\n- Every interactive element (`<button>`, `<a>`, `<input>`, `<select>`, `[role=\"button\"]`, `[role=\"tab\"]`, `[tabindex]`) must have a visible focus indicator\n- Look for `focus-visible:outline`, `focus-visible:ring`, `focus:ring`, or equivalent\n- If NO interactive elements in scope have focus styles, flag as a blanket issue\n- Custom components wrapping `<div onClick>` need `tabIndex={0}` AND a focus style\n\n### 4. Color Contrast\n\nStandard: WCAG 1.4.3 (AA) — Contrast (Minimum)\n\nRequired ratios:\n- Normal text (< 18pt or < 14pt bold): 4.5:1\n- Large text (>= 18pt or >= 14pt bold): 3:1\n- UI components and graphical objects (WCAG 1.4.11 AA): 3:1\n\nWhat to check:\n- Low-opacity text: `opacity-30`, `opacity-40`, or equivalent — compute effective contrast\n- Stacked opacity (e.g., `bg-surface/50 opacity-60`) — compound reduction likely fails\n- Placeholder text colors\n- Note: WCAG 1.4.3 exempts \"inactive\" (disabled) UI components from contrast requirements\n\n### 5. 
Form Label Association\n\nStandards: WCAG 1.3.1 (A) — \"Information, structure, and relationships conveyed through presentation can be programmatically determined\"; WCAG 4.1.2 (A) — \"For all user interface components, the name and role can be programmatically determined\"\n\nWhat to check:\n- Every `<input>`, `<select>`, `<textarea>` must have ONE of:\n - A `<label>` with `htmlFor` matching the input's `id`\n - `aria-label` on the input\n - `aria-labelledby` pointing to a visible label element\n- Visible label text NOT programmatically connected = violation\n- `placeholder` alone is NOT a label (it disappears on input)\n\n### 6. Icon-Only Buttons\n\nStandards: WCAG 1.1.1 (A) — \"All non-text content that is presented to the user has a text alternative that serves the equivalent purpose\"; WCAG 4.1.2 (A) — Name, Role, Value\n\nWhat to check:\n- Every button/link containing only an icon (SVG, icon component, single character like `x`) must have:\n - `aria-label` describing the action, OR\n - `<span className=\"sr-only\">` with descriptive text\n- Icons inside buttons with visible text should have `aria-hidden=\"true\"`\n\n### 7. ARIA Widget Patterns\n\nStandard: WAI-ARIA APG\n\nWhat to check (see REFERENCE.md §7 for full attribute tables and keyboard specs):\n- Tabs: tablist/tab/tabpanel roles, `aria-selected`, `aria-controls`/`aria-labelledby` linkage, arrow key navigation\n- Menu buttons: `aria-haspopup`, `aria-expanded`, menu/menuitem roles, Enter/Space/Arrow/Escape keyboard\n- Alerts: `role=\"alert\"` for urgent, `aria-live=\"polite\"` for non-urgent, no auto-dismiss (WCAG 2.2.3)\n\n### 8. 
Keyboard Accessibility\n\nStandard: WCAG 2.1.1 (A) — \"All functionality of the content is operable through a keyboard interface\"\n\nWhat to check:\n- Clickable non-interactive elements (`<div onClick>`, `<li onClick>`, `<span onClick>`) must have:\n - `role=\"button\"` (or appropriate role)\n - `tabIndex={0}`\n - `onKeyDown` handler (Enter and Space should activate)\n- Context menus, dropdowns, popovers: closeable with Escape\n- Hover-only interactions (`opacity-0 group-hover:opacity-100` on buttons): invisible to keyboard — must have a keyboard-accessible alternative\n\n### 9. Loading States\n\nStandard: WCAG 4.1.3 (AA) — \"Status messages can be programmatically determined through role or properties such that they can be presented to the user by assistive technologies without receiving focus\"; React Suspense docs\n\nWhat to check:\n- Components that `return null` during loading = blank screen with no feedback — always show a loading indicator\n- Dynamic content regions that update should use `aria-live` or `role=\"status\"` to announce changes\n- React Suspense docs: \"Don't put a Suspense boundary around every component. Suspense boundaries should not be more granular than the loading sequence that you want the user to experience.\"\n- React Suspense docs: \"Replacing visible UI with a fallback creates a jarring user experience\" — use `startTransition` for updates to already-visible content\n\n### 10. Text Size\n\nImportant: WCAG has NO minimum font size requirement. WCAG 1.4.4 (AA) requires text to be resizable to 200% without loss of content — not a minimum size.\n\nBest practice (Apple HIG, Material Design, general UX):\n- Body text: 16px recommended\n- Secondary/caption text: 12px practical minimum\n- Text below 12px (`text-[10px]`, `text-[9px]`) is a readability concern, especially on mobile\n\nSeverity: Warning (best practice), never Critical. Always note this is NOT a WCAG requirement.\n\n## Part 2 — UI Structure\n\n### 11. 
Component Extraction (DRY)\n\nSources: Tailwind docs — \"Reusing Styles\"; Kent C. Dodds — AHA Programming (see REFERENCE.md §11 for full quotes)\n\nWhat to check:\n- Same Tailwind class combination (5+ utility classes forming one visual pattern) appearing 3+ times across different files — extract to a shared component\n- Common extraction candidates: Button variants, Card, Input, Badge, Modal close button\n- Utility style patterns (e.g., focus rings) repeated 10+ times — bake into base components\n\nThreshold: 3+ identical patterns across 2+ files = extract. Duplication within a single file is fine.\n\nDo NOT flag:\n- Single-use class combinations, even if long (this is Tailwind by-design)\n- Structural Tailwind classes that naturally repeat (`flex items-center gap-2`)\n\n### 12. Component Size & Responsibility\n\nSources: React docs — \"Thinking in React\"; Robert C. Martin — Single Responsibility Principle\n\nSRP heuristic: a component's purpose should be describable in one sentence without \"and.\"\n\nWhat to check:\n- Components exceeding ~200 lines — likely multiple responsibilities\n- JSX return exceeding ~50 lines — consider splitting into subcomponents\n- Business logic (API calls, optimistic updates, complex state transforms) inline in render components — extract to custom hooks\n- Inline event handlers exceeding ~10 lines — extract to named functions or hooks\n- Multiple unrelated `useState`/`useEffect` clusters in one component\n\n### 13. Layout Consistency\n\nWhat to check:\n- Individual screens overriding the app-level layout constraint (e.g., screen sets `max-w-lg` when layout uses `max-w-2xl`)\n- Hardcoded heights with `calc()` and magic numbers (`calc(100vh - 140px)`) — use flex/grid layout instead; these break when surrounding layout changes\n- Inconsistent page-level spacing (one screen `p-4`, another `p-6` for the same structural role)\n\n### 14. 
Design Token Usage\n\nSource: Tailwind docs — Theme configuration\n\nWhat to check:\n- Hardcoded hex colors (`#1a1a2e`, `rgb(...)`, inline `style={{ color: '...' }}`) bypassing the project's CSS custom properties / Tailwind theme\n- Hardcoded pixel values for spacing/sizing that should use Tailwind's scale\n- Magic numbers for timeouts, thresholds, row heights, page sizes — should be named constants (Clean Code Ch. 17: numbers other than 0 and 1 should be named)\n\n### 15. Loading & Error Patterns\n\nSources: React Suspense docs; React docs — Error Boundaries\n\nWhat to check:\n- `.catch(() => {})` on user-initiated actions (buy, claim, save) — user sees nothing on failure. Note: acceptable for best-effort background operations (auto-sync, prefetch)\n- Missing error boundaries around independently-failing sections\n- Inconsistent loading patterns across screens (some `useQuery`, some manual `useState`, some `return null`)\n\n### 16. State & Hook Patterns\n\nSource: React docs — \"Reusing Logic with Custom Hooks\"\n\nWhat to check:\n- Custom hooks wrapping a single `useState` with no other hooks — React docs: \"extracting a useFormInput Hook to wrap a single useState call is probably unnecessary\"\n- Functions prefixed with `use` that don't call any React hooks — React docs: \"If your function doesn't call any Hooks, avoid the use prefix\"\n- Components with 4+ `useState` calls that could be consolidated into a custom hook or `useReducer`\n- React docs: \"Custom Hooks let you share stateful logic but not state itself. Each call to a Hook is completely independent.\"\n\n## Output Format\n\nGroup findings by severity. 
Each finding MUST name the specific standard.\n\n```\n## Critical\nViolations that directly harm users — screen reader users can't navigate, keyboard users are trapped, touch users can't tap targets.\n\n### [STANDARD] Brief title\nFile: `path/to/file.tsx` (lines X-Y)\nStandard: Full standard ID and one-line requirement.\nViolation: What the code does wrong and who is affected.\nFix: Specific, actionable code change.\n\n## Warning\nViolations that degrade usability but have workarounds, or best-practice violations with real UX impact.\n\n(same structure)\n\n## Suggestion\nImprovements that increase robustness or consistency but aren't urgently broken.\n\n(same structure)\n\n## Summary\n- Total findings: N (X critical, Y warning, Z suggestion)\n- Standards most frequently violated: list top 2-3\n- Overall assessment: 1-2 sentence verdict\n```\n\n## False Positive Filtering\n\n### Hard Exclusions — do NOT report:\n\n1. Inline links within body text — exempt from touch target size per WCAG 2.5.5\n2. Disabled/inactive elements — exempt from contrast requirements per WCAG 1.4.3\n3. Purely decorative elements — exempt from text alternative requirements per WCAG 1.1.1\n4. Third-party component internals — don't audit inside node_modules\n5. Test files — skip `.test.tsx`, `.spec.tsx`\n6. Theme/token definitions — CSS variable definitions in theme config ARE the design system\n\n### Severity Calibration:\n\n- Critical: Users physically cannot complete an action (can't tap, can't navigate, can't perceive content). Screen reader users locked out.\n- Warning: Users CAN complete the action but with significant difficulty. UX best-practice violations with real impact.\n- Suggestion: Improvements that help but aren't urgently broken. Minor inconsistencies.\n\n## Verification Pass\n\nBefore finalizing your report, verify every finding:\n\n1. Re-read the code: Go back to the flagged file and re-read the flagged lines in full context (±20 lines). 
Confirm the issue actually exists — not a misread, not handled elsewhere in the same file, not guarded by a wrapper or parent component.\n2. Check for existing mitigations: Search the codebase for related patterns. Is the \"missing\" attribute set by a shared component, layout wrapper, or design system primitive? If so, drop the finding.\n3. Verify against official docs: For every standard you cite, confirm your interpretation is correct. If you're unsure whether a pattern violates the standard, look it up — don't guess. Use available tools (context7, web search, REFERENCE.md) to check current documentation when uncertain.\n4. Filter by confidence: If you're certain a finding is a false positive after re-reading, drop it entirely. If doubt remains but the issue seems plausible, move it to a brief \"Worth Investigating\" note at the end of the report — don't include it as a formal finding.\n\n## Rules\n\n- Cite the standard: every finding must reference the specific WCAG SC, ARIA APG pattern, or platform guideline.\n- Be specific: always cite file paths and line numbers.\n- Be actionable: every finding must include a concrete fix — not \"add aria-label\" but `aria-label=\"Close dialog\"` on line 42.\n- Measure real impact: severity by who is affected and how badly.\n- Don't over-report text size: WCAG has no minimum font size. Sub-12px = Warning (best practice), never Critical.\n- Don't over-report DRY: same-file duplication is fine per Tailwind guidance. Only flag cross-file duplication of 3+ occurrences.\n- Respect scope: in diff mode, only flag issues in changed lines and their immediate context.\n- Don't duplicate other skills: a11y and UI structure only. Logic bugs go to `correctness-audit`, security to `security-audit`, general code quality to `best-practices-audit`.\n","reference":"# UI Audit Reference\n\nDetailed definitions, exact requirements, and source citations for each check in the audit.\n\n## Table of Contents\n\n### Part 1 — Accessibility\n1. 
[Touch Target Size](#1-touch-target-size)\n2. [Modal / Dialog](#2-modal--dialog-accessibility)\n3. [Focus Visibility](#3-focus-visibility)\n4. [Color Contrast](#4-color-contrast)\n5. [Form Labels](#5-form-label-association)\n6. [Icon-Only Buttons](#6-icon-only-buttons)\n7. [ARIA Widget Patterns](#7-aria-widget-patterns)\n8. [Keyboard Accessibility](#8-keyboard-accessibility)\n9. [Loading States](#9-loading-states)\n10. [Text Size](#10-text-size)\n\n### Part 2 — UI Structure\n11. [Component Extraction](#11-component-extraction)\n12. [Component Size](#12-component-size--responsibility)\n13. [Layout Consistency](#13-layout-consistency)\n14. [Design Tokens](#14-design-token-usage)\n15. [Loading & Error Patterns](#15-loading--error-patterns)\n16. [State & Hook Patterns](#16-state--hook-patterns)\n\n---\n\n## 1. Touch Target Size\n\n### Sources\n- WCAG 2.5.8 (AA): https://www.w3.org/WAI/WCAG22/Understanding/target-size-minimum.html\n - \"The size of the target for pointer inputs is at least 24 by 24 CSS pixels\"\n- WCAG 2.5.5 (AAA): https://www.w3.org/WAI/WCAG22/Understanding/target-size-enhanced.html\n - \"The size of the target for pointer inputs is at least 44 by 44 CSS pixels\"\n- Apple HIG: https://developer.apple.com/design/human-interface-guidelines/accessibility\n - Controls must measure at least 44x44 points\n- Material Design: https://m2.material.io/develop/web/supporting/touch-target\n - Touch targets should be at least 48x48 dp with 8dp spacing\n\n### Why 44px for this project\nWCAG AA requires only 24px, but this project targets mobile (future native app). Apple HIG (44pt) and Material Design (48dp) both enforce larger targets. 
We use 44px as the minimum — satisfying Apple HIG, WCAG AAA, and being close to Material Design's 48dp.\n\n### Tailwind mapping\n- `min-h-11` = 2.75rem = 44px (at 16px root)\n- `h-6` (24px), `h-8` (32px), `h-10` (40px) are all below 44px\n- `h-11` (44px) is the target\n\nPadding on small text (`text-sm`, ~20px line-height in Tailwind's default theme; `text-xs` defaults to ~16px, so subtract 4px from each total):\n- `py-0.5` (2px * 2) → ~24px total height\n- `py-1` (4px * 2) → ~28px total height\n- `py-1.5` (6px * 2) → ~32px total height\n- `py-2` (8px * 2) → ~36px total height\n- `py-2.5` (10px * 2) → ~40px total height — still under 44px; use `py-3` or add `min-h-11`\n\n### Common undersized patterns\n- Toggle/switch components: the clickable area (not just the visual track) must be 44x44px\n- Close buttons (especially bare `x` character): must have padding to reach 44x44px\n\n### Exceptions (WCAG 2.5.5)\n1. Inline: target is in a sentence or constrained by line-height of surrounding text\n2. Equivalent: function available through a different control meeting the size requirement\n3. User agent control: size determined by browser, not author\n4. Essential: specific presentation is essential to the information\n\n---\n\n## 2. Modal / Dialog Accessibility\n\n### Source\n- WAI-ARIA APG — Dialog (Modal) Pattern: https://www.w3.org/WAI/ARIA/apg/patterns/dialog-modal/\n\n### Required attributes\n\| Attribute \| Element \| Requirement \|\n\|-----------\|---------\|-------------\|\n\| `role=\"dialog\"` \| Container \| Identifies the element as a dialog \|\n\| `aria-modal=\"true\"` \| Container \| Tells assistive tech content behind is inert \|\n\| `aria-labelledby` \| Container \| Points to the dialog's visible title element \|\n\| `aria-label` \| Container \| Alternative when no visible title exists \|\n\| `aria-describedby` \| Container \| Optional — points to descriptive content \|\n\n### Focus management (from APG)\n1. 
On open: focus moves to an element inside the dialog\n - If content is primarily semantic (text): focus a static element at the top with `tabindex=\"-1\"`\n - If content has a primary action: focus that action button\n - If destructive: focus the least destructive option\n2. Focus trap: Tab cycles forward; Shift+Tab cycles backward; both wrap within dialog\n3. On close: focus returns to the triggering element (unless it no longer exists)\n\n### Keyboard\n- Escape: closes the dialog\n- Tab: moves to next focusable element within dialog (wraps)\n- Shift+Tab: moves to previous focusable element (wraps)\n\n### Common violations\n- `role=\"presentation\"` instead of `role=\"dialog\"` — screen readers don't recognize it as a dialog\n- No focus trap — Tab key escapes behind the overlay\n- No auto-focus on open — focus stays on the trigger behind the modal\n- No focus restoration on close — focus drops to `<body>`\n\n---\n\n## 3. Focus Visibility\n\n### Source\n- WCAG 2.4.7 (AA): https://www.w3.org/WAI/WCAG22/Understanding/focus-visible.html\n - \"Any keyboard operable user interface has a mode of operation where the keyboard focus indicator is visible.\"\n- WCAG 2.4.13 (AAA): https://www.w3.org/WAI/WCAG22/Understanding/focus-appearance.html\n - Focus indicator area: at least as large as a 2px thick perimeter of the unfocused component\n - Focus indicator contrast: at least 3:1 between focused and unfocused states\n\n### Practical implementation\nEvery interactive element needs a visible focus style. In Tailwind:\n```jsx\n// Good\n<button className=\"focus-visible:outline focus-visible:outline-2 focus-visible:outline-primary\">\n\n// Bad — no focus style at all\n<button className=\"bg-primary text-white\">\n```\n\nUse `focus-visible` (not `focus`) to avoid showing focus rings on mouse clicks while preserving them for keyboard navigation.\n\n---\n\n## 4. 
Color Contrast\n\n### Source\n- WCAG 1.4.3 (AA): https://www.w3.org/WAI/WCAG22/Understanding/contrast-minimum.html\n - Normal text: at least 4.5:1\n - Large text (>= 18pt / >= 14pt bold): at least 3:1\n - Large text ≈ 24px regular / 18.66px bold\n- WCAG 1.4.11 (AA): https://www.w3.org/WAI/WCAG22/Understanding/non-text-contrast.html\n - UI components and graphical objects: at least 3:1\n\n### Exemptions\n- Inactive (disabled) components\n- Purely decorative elements\n- Logotypes\n\n### Common Tailwind violations\n- `text-disabled` at `rgba(255,255,255,0.3)` on dark bg ≈ 2.5:1 (fails 4.5:1)\n- Stacked opacity: `bg-surface/50 opacity-60` compounds two reductions\n- `placeholder:text-muted` if muted color is too faint\n\n---\n\n## 5. Form Label Association\n\n### Sources\n- WCAG 1.3.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/info-and-relationships.html\n - \"Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.\"\n- WCAG 4.1.2 (A): https://www.w3.org/WAI/WCAG22/Understanding/name-role-value.html\n - \"For all user interface components, the name and role can be programmatically determined\"\n- React docs: https://legacy.reactjs.org/docs/accessibility.html\n - \"Every HTML form control, such as `<input>` and `<textarea>`, needs to be labeled accessibly.\"\n\n### Valid labeling techniques\n1. `<label htmlFor=\"name\">Name</label> <input id=\"name\" />`\n2. `<input aria-label=\"Search\" />`\n3. `<input aria-labelledby=\"heading-id\" />`\n\n### NOT valid\n- `<input placeholder=\"Name\" />` alone — placeholder disappears on input, not a reliable label\n- A `<span>` visually positioned near the input but not programmatically connected\n\n---\n\n## 6. 
Icon-Only Buttons\n\n### Source\n- WCAG 1.1.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/non-text-content.html\n - \"All non-text content that is presented to the user has a text alternative that serves the equivalent purpose\"\n - For controls: \"it has a name that describes its purpose\"\n\n### Implementation\n```jsx\n// Good — aria-label on button\n<button aria-label=\"Close dialog\"><XIcon aria-hidden=\"true\" /></button>\n\n// Good — sr-only text\n<button><XIcon aria-hidden=\"true\" /><span className=\"sr-only\">Close dialog</span></button>\n\n// Bad — no accessible name\n<button><XIcon /></button>\n\n// Bad — icon has name but button doesn't (redundant, confusing)\n<button><XIcon aria-label=\"close\" /></button>\n```\n\nThe accessible name belongs on the button, not on the icon inside it. Icons inside labeled buttons should be `aria-hidden=\"true\"`.\n\n---\n\n## 7. ARIA Widget Patterns\n\n### Tabs\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/tabs/\n\n\| Role/Attribute \| Element \| Required \|\n\|---------------\|---------\|----------\|\n\| `role=\"tablist\"` \| Container \| Yes \|\n\| `aria-label` or `aria-labelledby` \| Tablist \| Yes \|\n\| `role=\"tab\"` \| Each tab button \| Yes \|\n\| `aria-selected=\"true\"/\"false\"` \| Each tab \| Yes \|\n\| `aria-controls` \| Each tab \| Yes — points to its panel \|\n\| `role=\"tabpanel\"` \| Each panel \| Yes \|\n\| `aria-labelledby` \| Each panel \| Yes — points to its tab \|\n\| `tabindex=\"0\"` \| Active tab + panel \| Yes (if panel has no focusable content) \|\n\nKeyboard: Left/Right arrows move between tabs (wrap); Tab moves to panel content; Home/End to first/last (optional).\n\n### Menu Button\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/menu-button/\n\n\| Role/Attribute \| Element \| Required \|\n\|---------------\|---------\|----------\|\n\| `aria-haspopup=\"menu\"` \| Button \| Yes \|\n\| `aria-expanded=\"true\"/\"false\"` \| Button \| Yes \|\n\| `role=\"menu\"` \| Menu container \| Yes 
\|\n\| `role=\"menuitem\"` \| Each item \| Yes \|\n\nKeyboard: Enter/Space opens and focuses first item; Down Arrow opens and focuses first item; Up Arrow opens and focuses last item; Escape closes.\n\n### Alert\nSource: https://www.w3.org/WAI/ARIA/apg/patterns/alert/\n\n- Use `role=\"alert\"` for error messages and urgent notifications\n- Alerts do not move keyboard focus\n- Avoid auto-dismissing alerts (WCAG 2.2.3)\n- For non-urgent status: use `aria-live=\"polite\"` instead\n\n### Status Messages\nSource: WCAG 4.1.3 (AA) — https://www.w3.org/WAI/WCAG22/Understanding/status-messages.html\n- \"Status messages can be programmatically determined through role or properties such that they can be presented to the user by assistive technologies without receiving focus.\"\n- Use `role=\"status\"` (implicitly `aria-live=\"polite\"`) for non-urgent updates\n\n---\n\n## 8. Keyboard Accessibility\n\n### Source\n- WCAG 2.1.1 (A): https://www.w3.org/WAI/WCAG22/Understanding/keyboard.html\n - \"All functionality of the content is operable through a keyboard interface without requiring specific timings for individual keystrokes\"\n\n### Custom interactive elements\nWhen using non-semantic elements as interactive controls:\n```jsx\n// Bad — keyboard users can't interact\n<div onClick={handleClick}>Click me</div>\n\n// Good — full keyboard support\n<div\n role=\"button\"\n tabIndex={0}\n onClick={handleClick}\n onKeyDown={(e) => { if (e.key === 'Enter' \|\| e.key === ' ') { e.preventDefault(); handleClick(); } }}\n>\n Click me\n</div>\n\n// Best — use a real button\n<button onClick={handleClick}>Click me</button>\n```\n\n### Hover-only patterns\n```jsx\n// Bad — invisible to keyboard users\n<div className=\"opacity-0 group-hover:opacity-100\">\n <button>Options</button>\n</div>\n\n// Good — visible on focus too\n<div className=\"opacity-0 group-hover:opacity-100 group-focus-within:opacity-100\">\n <button>Options</button>\n</div>\n```\n\n---\n\n## 9. 
Loading States\n\n### Sources\n- WCAG 4.1.3 (AA): https://www.w3.org/WAI/WCAG22/Understanding/status-messages.html\n- React Suspense docs: https://react.dev/reference/react/Suspense\n - \"Don't put a Suspense boundary around every component. Suspense boundaries should not be more granular than the loading sequence that you want the user to experience.\"\n - \"Replacing visible UI with a fallback creates a jarring user experience.\"\n\n### Rules\n1. Never `return null` during loading — show a spinner, skeleton, or placeholder\n2. Use `aria-busy=\"true\"` on containers that are loading content\n3. Use `aria-live=\"polite\"` on regions that update dynamically\n4. Use `startTransition` when updating already-visible content to avoid replacing it with a loading fallback\n\n---\n\n## 10. Text Size\n\n### Source\n- WCAG 1.4.4 (AA): https://www.w3.org/WAI/WCAG22/Understanding/resize-text.html\n - \"Text can be resized without assistive technology up to 200 percent without loss of content or functionality.\"\n - No minimum font size is specified by WCAG.\n\n### Best practice (NOT WCAG)\n- Body: 16px recommended baseline\n- Secondary: 12px practical minimum\n- Below 12px: readability concern, especially mobile\n- Tailwind `text-xs` = 12px = acceptable\n- `text-[10px]`, `text-[9px]` = flag as Warning\n\n---\n\n## 11. Component Extraction\n\n### Sources\n- Tailwind docs — Reusing Styles: https://tailwindcss.com/docs/reusing-styles\n - \"If you need to reuse some styles across multiple files, the best strategy is to create a component if you're using a front-end framework like React.\"\n - On same-file duplication: \"the easiest way to deal with it is to use multi-cursor editing\"\n- Kent C. 
Dodds — AHA Programming: https://kentcdodds.com/blog/aha-programming\n - \"After you've got a few places where that code is running, the commonalities will scream at you for abstraction\"\n\n### Extraction threshold\n- 3+ identical patterns across 2+ files = extract to a shared component\n- Same file: fine — use multi-cursor (Tailwind guidance)\n- 1-2 occurrences: too early to extract (AHA principle)\n\n### Do NOT flag\n- Long class strings appearing once (Tailwind by-design)\n- Structural classes that naturally repeat (`flex items-center gap-2`)\n\n---\n\n## 12. Component Size & Responsibility\n\n### Sources\n- React docs — Thinking in React: https://react.dev/learn/thinking-in-react\n- Robert C. Martin — Single Responsibility Principle\n\n### Guidelines\n- ~200 lines total = consider splitting\n- ~50 lines of JSX return = consider extracting subcomponents\n- SRP heuristic: a component's purpose should be describable in one sentence without \"and\"\n- Business logic (API calls, optimistic updates) belongs in custom hooks, not inline in render\n\n---\n\n## 13. Layout Consistency\n\n### Sources\n- Radix Themes — Layout: https://www.radix-ui.com/themes/docs/overview/layout\n - \"Container's sole responsibility is to provide a consistent max-width to the content it wraps\"\n- CSS-Tricks — Magic Numbers in CSS: https://css-tricks.com/magic-numbers-in-css/\n - \"Magic numbers in CSS refer to values which 'work' under some circumstances but are fragile and prone to break when those circumstances change\"\n\n### Rules\n- Max-width should be set once in a layout wrapper, not repeated per-screen\n- `calc(100vh - 140px)` is a magic number — breaks when header/footer changes. Use flex layout instead.\n- Page-level padding should come from the layout component, not individual pages\n\n---\n\n## 14. Design Token Usage\n\n### Sources\n- Tailwind docs — Theme: https://tailwindcss.com/docs/theme\n- Robert C. 
Martin — Clean Code, Chapter 17: numbers other than 0 and 1 should be named constants\n\n### Rules\n- Colors must come from theme tokens, never hardcoded hex/rgb\n- If an arbitrary Tailwind value (`text-[14px]`) appears in 2+ files, extract to a token\n- Timeouts, thresholds, sizes used in logic should be named constants\n\n---\n\n## 15. Loading & Error Patterns\n\n### Sources\n- React Suspense docs: https://react.dev/reference/react/Suspense\n- React Error Boundaries: https://react.dev/reference/react/Component#catching-rendering-errors-with-an-error-boundary\n\n### Rules\n- `.catch(() => {})` on user-initiated actions = swallowed error. User needs feedback.\n- `.catch(() => {})` on background/best-effort operations = acceptable (auto-sync, prefetch)\n- Every data-fetching section should have error boundary coverage\n- Consistent loading patterns across screens\n\n---\n\n## 16. State & Hook Patterns\n\n### Source\n- React docs — Reusing Logic with Custom Hooks: https://react.dev/learn/reusing-logic-with-custom-hooks\n\n### Verified quotes from React docs\n- \"Extracting a `useFormInput` Hook to wrap a single `useState` call like earlier is probably unnecessary.\"\n- \"However, whenever you write an Effect, consider whether it would be clearer to also wrap it in a custom Hook.\"\n- \"If your function doesn't call any Hooks, avoid the `use` prefix.\"\n- \"Custom Hooks let you share stateful logic but not state itself. 
Each call to a Hook is completely independent from every other call to the same Hook.\"\n- \"Keep your custom Hooks focused on concrete high-level use cases.\"\n\n### Anti-patterns\n- `useMount()`, `useEffectOnce()`, `useUpdateEffect()` — lifecycle wrappers that add indirection\n- `useValue()` wrapping a single `useState` — no benefit over direct `useState`\n- `useSorted(items)` when the function doesn't call any hooks — just make it `getSorted(items)`\n"}},"subagents":{"deep-research":{"content":"---\nname: deep-research\nmodel: default\ndescription: Deep research and literature review. Use when the user asks for deep research, literature review, or to thoroughly investigate a topic. Searches the web, consults reputable sources, and synthesizes an answer with pros/cons and comparisons when relevant.\nreadonly: true\n---\n\n# Deep Research\n\nYour job is to thoroughly research a topic using web search and reputable sources, then synthesize the best answer. When multiple approaches or answers exist, compare them with pros and cons.\n\n## When you're used\n\n- User asks for \"deep research,\" \"literature review,\" or \"thoroughly investigate\" a topic.\n- User wants an evidence-based answer with sources.\n- User asks for pros/cons or a comparison of options.\n\n## Exa MCP (use when available)\n\nThe Exa MCP provides semantic search over the live web and code. Use Exa for real-time web research, code examples, and company/org research when the tools are available. Prefer Exa over generic web search when you need high-quality, relevant results or code/docs from the open-source ecosystem.\n\n\| Tool \| When to use \|\n\|------\|--------------\|\n\| Web Search (Exa) \| General web research: current practices, comparisons, how-to, opinions, blog posts, official docs. Use for \"how does X work?\", \"best practices for Y\", \"X vs Y\", or time-sensitive topics. Query in natural language; Exa returns semantically relevant pages with snippets. 
\|\n\| Code Context Search \| Code snippets, examples, and documentation from open source repos. Use when the user needs \"how to do X in language/framework Y\", code examples, or implementation patterns. Complements official docs with real-world usage. \|\n\| Company Research \| Research companies and organizations: what they do, products, recent news, structure. Use for \"tell me about company X\", due diligence, or market/competitor context. \|\n\nHow to use Exa effectively:\n- Queries: Use clear, specific queries (e.g. \"React Server Components best practices 2024\" rather than \"React\"). Include stack, year, or context when it matters.\n- Combine with other sources: Use Exa for discovery and breadth; use AlphaXiv for academic papers when the topic is literature/research. Fetch full pages (e.g. with browser or fetch) when you need to cite or quote a specific passage.\n- Cite: Exa returns URLs and snippets — cite the URL and page title in your Sources; don't present Exa's summary as the primary source when you can point to the actual page.\n\nIf Exa tools are not available, fall back to web search and fetch as needed.\n\n## AlphaXiv tools (use when available)\n\nAlphaXiv tools query arXiv and related academic content. Use them for literature review, finding papers, or surveying recent research. If these tools are available, prefer them for academic topics; otherwise use Exa or web search.\n\n\| Tool \| When to use \|\n\|------\|--------------\|\n\| answer_research_query \| Survey recent papers on a question (e.g. \"What do recent papers do for X?\", \"How do papers handle Y?\"). Use for state-of-the-art, common methods, or trends. \|\n\| search_for_paper_by_title \| Find a specific paper by exact or approximate title when you know the name or a close match. \|\n\| find_papers_feed \| Get arXiv papers by topic, sort (Hot, Comments, Views, Likes, GitHub, Recommended), and time interval. 
Use for \"what's trending in X\" or \"recent papers in topic Y.\" Topics include cs.*, math.*, physics.*, stat.*, q-bio.*, etc. \|\n\| answer_pdf_query \| Answer a question about a single paper given its PDF URL (arxiv.org/pdf/..., alphaxiv.org, or semantic scholar). Use after you have a paper URL and need to extract a specific claim or method. \|\n\| read_files_from_github_repository \| Read files or directory from a paper's linked GitHub repo (when the paper has a codebase). Use to summarize implementation or repo structure. \|\n\| find_organizations \| Look up canonical organization names for filtering find_papers_feed by institution. \|\n\nAlphaXiv covers all of arXiv (physics, math, CS, stats, etc.), not only AI. Use find_papers_feed with the right topic (e.g. cs.LG, math.AP, quant-ph) for the domain.\n\n## Process\n\n1. Clarify the question — If the request is vague, state what you're treating as the research question in one sentence.\n2. Search — Use the right source for the topic:\n - Academic / literature: AlphaXiv (answer_research_query, find_papers_feed, answer_pdf_query) when available.\n - Web / practice / code / companies: Exa MCP (Web Search, Code Context Search, Company Research) when available; otherwise web search and fetch full pages when needed.\n Prefer official docs, established institutions, recent content for time-sensitive topics, and multiple viewpoints when the topic is debated.\n3. Synthesize — Answer the question clearly. If there are several valid answers or approaches:\n - Compare them (e.g. \"Option A vs Option B\").\n - List pros and cons for each where relevant.\n - State which is best for which situation, or that it depends on context.\n4. Cite — For key claims, note the source (title, site, or URL). 
No need to cite every sentence; enough that the user can verify and go deeper.\n\n## Output format\n\n```\n## Research question\n[One sentence]\n\n## Summary\n[2–4 sentences: direct answer and main takeaway]\n\n## Details / Comparison\n[Structured by theme or by option. Use subsections if helpful. Include pros/cons and comparisons when several answers exist.]\n\n## Sources\n- [Source 1]: [URL or citation]\n- [Source 2]: …\n```\n\n- Prefer clear structure over long paragraphs.\n- If the topic is narrow and there's one clear answer, keep it concise; if it's broad or contested, add more comparison and nuance.\n- If you couldn't find good sources on part of the question, say so and what would help (e.g. different search terms, type of source).\n\n## Rules\n\n- Use Exa MCP for web/code/company research when available; use AlphaXiv for academic/literature when available. Fall back to web search if neither is available.\n- Use search and the web; don't rely only on prior knowledge. Prefer recent, reputable sources.\n- Don't invent sources or URLs. If you can't access a page, say so.\n- Do not take everything you read as fact. The internet is full of misinformation.\n- Stay on topic. If the user scopes the question (e.g. \"for Python\" or \"in healthcare\"), keep the answer within that scope.\n- You are read-only: research and report only. No code or file changes.\n"},"update-docs":{"content":"---\nname: update-docs\nmodel: default\ndescription: Updates project documentation to match the code. Main focus is docs (architecture, how the project is built, setup, deploy, contributing, README). Use when the user asks to update docs or after code changes; update README, docs folder, docstrings, and comments so they reflect current behavior.\n---\n\n# Update Docs\n\nYou keep project documentation in sync with the code. Your main focus is documentation as a whole: how the project is built, how to run it, and how it fits together. 
Update only what's wrong or missing; don't rewrite docs that are already accurate. Document what actually exists—no invented APIs or behavior.\n\n## Scope\n\n- User specifies what to update: e.g. \"update the docs,\" \"update the README,\" \"add docstrings,\" \"refresh the architecture doc.\" Do that.\n- Post-implementation: When invoked after code changes, identify what changed and update the relevant docs: any docs in the repo (e.g. `docs/`, `doc/`, architecture or design docs), README, docstrings, comments in changed files, or generated API docs if the project has them.\n- No scope given: Ask what to document (which files or doc types) or infer from recent changes and update the minimum needed.\n\nMatch the project's existing style: docstring format (Google, NumPy, Sphinx, etc.), README and docs structure, and tone.\n\n## Documentation standards (reference)\n\nWhen the project has no strong convention, align with widely used standards so docs are consistent and useful.\n\n- Diátaxis (https://diataxis.fr/): Organize content by user need. Use tutorials for learning a task step-by-step, how-to guides for solving a specific problem, reference for technical lookup (APIs, options), and explanation for background and concepts. When adding or restructuring docs, prefer the right type (e.g. don't turn a reference into a long tutorial).\n- Google developer documentation style guide (https://developers.google.com/style): For tone and formatting — write in second person (\"you\"), active voice; use sentence case for headings; put conditions before instructions; bold UI elements, code in code font; keep examples and link text descriptive. Clarity for the audience over rigid rules.\n\nApply these as guidance; always preserve or match the project's existing style when it has one.\n\n## Process\n\n1. Identify what to update — From the request or from the diff: what changed (modules, architecture, setup, behavior)? 
Which doc targets are affected (docs folder, README, docstrings, comments)?\n2. Read current docs — Check existing project docs (e.g. `docs/`), README, docstrings, comments in changed files, and any API docs. Note what's outdated, missing, or wrong.\n3. Update — Fix inaccuracies, add missing sections or docstrings, remove references to removed code. Keep changes minimal.\n4. Verify — Ensure examples in docs still run or match the code (e.g. function names, commands, args). Don't leave broken code blocks or outdated commands.\n\n## What to document\n\n- Project documentation (primary): Any docs that describe how the project is built and used — e.g. `docs/`, `doc/`, or standalone files. This includes:\n - Architecture / design: How the system is structured, main components, data flow. Update when structure or responsibilities change.\n - Setup and build: How to install, configure, build, and run (dev and prod). Update when dependencies, env vars, or commands change.\n - Deploy and ops: How to deploy, runbooks, environment-specific notes. Update when pipelines or procedures change.\n - Contributing: How to contribute, branch strategy, code style, where to put things. Update when workflow or conventions change.\n- README: Entry point for the repo — install/run, config, env vars, project structure, links to fuller docs. Update when setup or usage changes.\n- Docstrings: Public modules, classes, and functions. Parameters, return value, raised exceptions, and a one-line summary. Use the project's docstring convention.\n- Comments: Inline and block comments in the code. In changed files, check comments for accuracy—update or remove comments that describe old behavior, wrong assumptions, or obsolete TODOs. 
Don't leave comments that contradict the code.\n- API docs: If the project generates them (Sphinx, Typedoc, etc.), update source comments/docstrings so the generated output is correct; only regenerate if that's part of the workflow.\n\nSkip internal/private implementation details unless the project explicitly documents them. Prefer \"what and how to use\" over \"how it's implemented.\"\n\n## Output\n\n- Updated: List files and sections changed (e.g. \"docs/architecture.md: Components\" / \"README: Installation, Usage\" / \"module.py: function X docstring\").\n- Added: New sections or docstrings added, with file and name.\n- Removed: Obsolete sections or references removed.\n- If nothing needed updating, say so in one sentence.\n\nKeep the summary to bullets. No long prose.\n\n## Rules\n\n- Document only what the code does. Don't add features or behavior in the docs that aren't in the code.\n- Preserve existing formatting and style (headers, lists, code blocks, docstring style).\n- If the code is unclear and you can't document it confidently, note that and suggest a code comment or refactor instead of guessing.\n- Don't duplicate large chunks of code in docs or README; reference the source or keep examples short and runnable.\n"},"verifier":{"content":"---\nname: verifier\nmodel: default\ndescription: Validates that completed work matches what was claimed. Use after the main agent marks tasks done—checks that implementations exist and work, and that no unstated changes were made.\nreadonly: true\n---\n\n# Verifier\n\nYou are a skeptical validator. Your job is to confirm that work claimed complete actually exists and works, and that nothing extra was done without being stated.\n\n## What to verify\n\n1. Claims vs. reality\n - Identify what the main agent said it did (from the conversation or task list).\n - For each claim: confirm the implementation exists, is in the right place, and does what was described.\n - Run relevant tests or commands. 
Don't accept \"tests pass\" without running them.\n - Flag anything that was claimed but is missing, incomplete, or broken.\n\n2. No unstated changes\n - Compare the current state of the codebase to what was in scope for the task (e.g. the files or areas the user asked to change).\n - Look for edits the main agent made but did not mention: new files, modified files, refactors, \"cleanups,\" or behavior changes that weren't part of the request.\n - If you have access to git: use the diff (staged or unstaged) to see what actually changed versus what was discussed.\n - Report any changes that go beyond what was claimed or requested.\n\n## Process\n\n1. From context, extract: (a) what was requested, (b) what the main agent said it did.\n2. Verify each stated deliverable (code exists, tests run, behavior matches).\n3. Check the diff or modified files for changes that weren't mentioned.\n4. Summarize: passed, incomplete, or out-of-scope changes.\n\n## Output\n\n- Verified: What was claimed and confirmed (with brief evidence, e.g. \"tests pass\", \"file X contains Y\").\n- Missing or broken: What was claimed but isn't there or doesn't work (file, line, and what's wrong).\n- Unstated changes: What was changed but not mentioned (file and a one-line description). Ask whether the user wanted these or if they should be reverted.\n\nKeep each section to bullets. If everything checks out and there are no unstated changes, say so clearly in one or two sentences.\n\n## Rules\n\n- Don't take claims at face value. Inspect the code and run checks.\n- Prefer evidence (test output, diff, file contents) over summary.\n- For \"unstated changes,\" distinguish clearly between obvious scope creep (e.g. refactoring unrelated code) and trivial side effects (e.g. formatting in an edited file). 
Flag the former; mention the latter only if relevant.\n- If the task was vague, note what you assumed was in scope so the user can correct.\n"}},"catalog":"# Catalog\n\n## Skills\n\n- best-practices-audit (id: `best-practices-audit`) — Audits code against named industry standards and coding best practices (DRY, SOLID, KISS, YAGNI, Clean Code, OWASP, etc.). Use when the user asks to check best practices, enforce standards, audit for anti-patterns, review code quality against principles, or ensure code follows industry conventions. Works on git diffs, specific files, or an entire codebase.\n- correctness-audit (id: `correctness-audit`) — Reviews code for correctness bugs, uncaught edge cases, and scalability problems. Use when reviewing code changes, performing code audits, or when the user asks for a review or quality check. For security vulnerabilities use security-audit; for design, maintainability, and principle violations use best-practices-audit.\n- feature-planning (id: `feature-planning`) — Extensively plans a proposed feature before any code is written. Use when the user asks to plan, design, or spec out a feature, or when they say \"plan this feature\", \"design this\", or want to think through a feature before building it.\n- migration-audit (id: `migration-audit`) — Audit PL/pgSQL migration files for correctness bugs, missing constraints, race conditions, NULL traps, and data integrity gaps. Use AUTOMATICALLY before presenting any new or modified SQL migration file to the user. Triggers on writing .sql files in supabase/migrations/, creating PL/pgSQL functions, or reviewing database schema changes.\n- security-audit (id: `security-audit`) — Performs a thorough security audit against established industry standards (OWASP Top 10 2025, OWASP API Security Top 10 2023, CWE taxonomy, GDPR, PCI-DSS). Use when reviewing for security vulnerabilities, hardening production systems, auditing auth/payment/database code, or conducting periodic security reviews. 
Works on git diffs, specific files, or an entire codebase.\n- systematic-debugging (id: `systematic-debugging`) — Guides root-cause analysis with a structured process: reproduce, isolate, hypothesize, verify. Use when debugging bugs, investigating failures, or when the user says something is broken or not working as expected.\n- test-deno (id: `test-deno`) — Use when writing, reviewing, or fixing Deno integration tests for Supabase Edge Functions, or when auditing edge function tests for best practices. Triggers on test failures involving sanitizers, assertions, mocking, HTTP testing, or environment isolation.\n- test-frontend (id: `test-frontend`) — Use when writing, reviewing, or fixing React component/hook tests, or when auditing frontend tests for RTL, Vitest, Zustand, or TanStack Query best practices. Triggers on query priority issues, mock leaks, flaky async tests, or Kent C. Dodds common-mistakes violations.\n- test-pgtap (id: `test-pgtap`) — Use when writing, reviewing, or fixing pgTAP tests for Supabase SQL migrations, or when auditing database tests for best practices. Triggers on plan count mismatches, transaction isolation issues, RLS policy testing, privilege verification, or assertion selection problems.\n- ui-audit (id: `ui-audit`) — Use after any UI edit, when reviewing UI components, or when asked for an accessibility or structure audit. Triggers on WCAG 2.2 violations, WAI-ARIA APG pattern issues, touch target sizing, focus management, component duplication, or separation of concerns problems in React/Tailwind code.\n\n## Subagents\n\n- deep-research (id: `deep-research`) — Deep research and literature review. Use when the user asks for deep research, literature review, or to thoroughly investigate a topic. Searches the web, consults reputable sources, and synthesizes an answer with pros/cons and comparisons when relevant.\n- update-docs (id: `update-docs`) — Updates project documentation to match the code. 
Main focus is docs (architecture, how the project is built, setup, deploy, contributing, README). Use when the user asks to update docs or after code changes; update README, docs folder, docstrings, and comments so they reflect current behavior.\n- verifier (id: `verifier`) — Validates that completed work matches what was claimed. Use after the main agent marks tasks done—checks that implementations exist and work, and that no unstated changes were made.","whenToUse":"# When to Use Which Skill or Subagent\r\n\r\nUse this guide to choose the right skill or subagent for the user's request.\r\n\r\n## By intent\r\n\r\n\| User intent \| Use \|\r\n\|-------------\|-----\|\r\n\| Something is broken, bug, not working, investigate failure \| systematic-debugging (skill) \|\r\n\| Plan or design a feature before coding \| feature-planning (skill) \|\r\n\| Security review, vulnerabilities, auth/payments/database \| security-audit (skill) \|\r\n\| Code quality, best practices, DRY/SOLID/anti-patterns \| best-practices-audit (skill) \|\r\n\| Correctness review, edge cases, logic bugs \| correctness-audit (skill) \|\r\n\| Write/review/fix Deno tests for Supabase Edge Functions \| test-deno (skill) \|\r\n\| Write/review/fix React tests (Vitest, RTL, Zustand, TanStack Query) \| test-frontend (skill) \|\r\n\| Write/review/fix pgTAP database tests for SQL migrations \| test-pgtap (skill) \|\r\n\| Accessibility audit, UI structure review, WCAG compliance \| ui-audit (skill) \|\r\n\| Deep research, literature review, investigate a topic \| deep-research (subagent) \|\r\n\| Update docs to match code, README, architecture \| update-docs (subagent) \|\r\n\| Verify completed work matches what was claimed \| verifier (subagent) \|\r\n\r\n## Skills vs subagents\r\n\r\n- Skills = step-by-step instructions the main agent follows (e.g. run a process, produce a report). 
Use `get_skill` or `apply_skill` (with the user's prompt as message_to_skill) and follow the skill in the current context.\r\n- Subagents = separate agents run in another context; they return one result. Use when the task is noisy, context-heavy, or matches a subagent’s description (e.g. “use deep-research”, “run the verifier”).\r\n","overview":"# Overview\n\ngeneral-coding-tools-mcp — MCP server with coding skills and subagents — debugging, code audits (correctness, security, best practices, SQL migrations), feature planning, testing (React/Vitest, Deno, pgTAP), UI/accessibility audits, deep research, and doc generation. Works with Cursor, Claude, Windsurf, and any MCP client.\n\nVersion: `1.1.1`\n\nUse the catalog resource for a list of skills and subagents. Use when-to-use to choose the right one for the request."}}