npm - sanook-cli - Versions diffs - 0.4.0 - Mend

sanook-cli 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (148) hide show

package/.env.example +23 -0
package/CHANGELOG.md +38 -0
package/LICENSE +201 -0
package/README.md +239 -0
package/dist/agentContext.js +2 -0
package/dist/approval.js +78 -0
package/dist/bin.js +461 -0
package/dist/brain.js +186 -0
package/dist/commands.js +66 -0
package/dist/compaction.js +85 -0
package/dist/config.js +101 -0
package/dist/cost.js +59 -0
package/dist/diff.js +36 -0
package/dist/gateway/auth.js +32 -0
package/dist/gateway/ledger.js +94 -0
package/dist/gateway/lock.js +114 -0
package/dist/gateway/schedule.js +74 -0
package/dist/gateway/scheduler.js +87 -0
package/dist/gateway/serve.js +57 -0
package/dist/gateway/server.js +94 -0
package/dist/gateway/telegram.js +115 -0
package/dist/git.js +55 -0
package/dist/hooks.js +104 -0
package/dist/knowledge.js +68 -0
package/dist/loop.js +169 -0
package/dist/mcp.js +191 -0
package/dist/memory.js +108 -0
package/dist/providers/codex.js +86 -0
package/dist/providers/keys.js +37 -0
package/dist/providers/models.js +55 -0
package/dist/providers/registry.js +241 -0
package/dist/session.js +36 -0
package/dist/skill-install.js +190 -0
package/dist/skills.js +111 -0
package/dist/tools/bash.js +26 -0
package/dist/tools/edit.js +107 -0
package/dist/tools/git.js +68 -0
package/dist/tools/index.js +36 -0
package/dist/tools/list.js +24 -0
package/dist/tools/permission.js +30 -0
package/dist/tools/read.js +18 -0
package/dist/tools/recall.js +12 -0
package/dist/tools/remember.js +14 -0
package/dist/tools/schedule.js +61 -0
package/dist/tools/search.js +54 -0
package/dist/tools/skill.js +65 -0
package/dist/tools/task.js +46 -0
package/dist/tools/util.js +5 -0
package/dist/tools/write.js +27 -0
package/dist/ui/app.js +132 -0
package/dist/ui/banner.js +20 -0
package/dist/ui/brain-wizard.js +29 -0
package/dist/ui/render.js +57 -0
package/dist/ui/setup.js +46 -0
package/package.json +77 -0
package/second-brain/AGENTS.md +18 -0
package/second-brain/CLAUDE.md +96 -0
package/second-brain/Evals/retrieval-eval.md +30 -0
package/second-brain/GEMINI.md +15 -0
package/second-brain/Home.md +33 -0
package/second-brain/README.md +29 -0
package/second-brain/Runbooks/ingest-quarantine.md +27 -0
package/second-brain/Runbooks/sleep-time-consolidation.md +26 -0
package/second-brain/Shared/AI-Context-Index.md +52 -0
package/second-brain/Shared/Core-Facts/protected-facts.md +21 -0
package/second-brain/Shared/Decision-Memory/decision-log.md +24 -0
package/second-brain/Shared/Memory-Inbox/memory-inbox.md +23 -0
package/second-brain/Shared/Operating-State/current-state.md +30 -0
package/second-brain/Shared/Provenance/ingest-log.md +27 -0
package/second-brain/Shared/Rules/context-assembly-policy.md +28 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +33 -0
package/second-brain/Shared/Rules/skills-admission.md +30 -0
package/second-brain/Shared/User-Memory/user-preferences.md +25 -0
package/second-brain/Templates/bug.md +22 -0
package/second-brain/Templates/handoff.md +21 -0
package/second-brain/Templates/project.md +24 -0
package/second-brain/Templates/session.md +26 -0
package/second-brain/USER.md +36 -0
package/second-brain/Vault Structure Map.md +106 -0
package/skills/agent-tool-mcp-builder/SKILL.md +88 -0
package/skills/api-design-review/SKILL.md +70 -0
package/skills/async-concurrency-correctness/SKILL.md +93 -0
package/skills/audit-accessibility-wcag/SKILL.md +59 -0
package/skills/audit-technical-seo/SKILL.md +62 -0
package/skills/auth-jwt-session/SKILL.md +88 -0
package/skills/brainstorm-design/SKILL.md +73 -0
package/skills/build-etl-pipeline/SKILL.md +58 -0
package/skills/build-form-validation/SKILL.md +103 -0
package/skills/build-office-docs/SKILL.md +80 -0
package/skills/build-react-component/SKILL.md +116 -0
package/skills/build-spreadsheet/SKILL.md +106 -0
package/skills/caching-strategy/SKILL.md +75 -0
package/skills/cicd-pipeline-author/SKILL.md +65 -0
package/skills/cloud-cost-optimize/SKILL.md +91 -0
package/skills/code-comments/SKILL.md +52 -0
package/skills/code-review/SKILL.md +61 -0
package/skills/db-migration-safety/SKILL.md +67 -0
package/skills/debug-frontend-browser/SKILL.md +58 -0
package/skills/debug-root-cause/SKILL.md +54 -0
package/skills/dependency-upgrade/SKILL.md +56 -0
package/skills/deploy-release/SKILL.md +64 -0
package/skills/diff-table-parity/SKILL.md +58 -0
package/skills/dockerfile-optimize/SKILL.md +82 -0
package/skills/error-message/SKILL.md +58 -0
package/skills/estimate-work/SKILL.md +54 -0
package/skills/explore-codebase/SKILL.md +73 -0
package/skills/git-commit-pr/SKILL.md +65 -0
package/skills/gitops-deploy-workflow/SKILL.md +97 -0
package/skills/implement-from-design/SKILL.md +69 -0
package/skills/incident-response-sre/SKILL.md +78 -0
package/skills/k8s-debug-workload/SKILL.md +135 -0
package/skills/k8s-manifest-review/SKILL.md +86 -0
package/skills/llm-eval-harness/SKILL.md +63 -0
package/skills/manage-client-server-state/SKILL.md +94 -0
package/skills/mermaid-diagram/SKILL.md +61 -0
package/skills/message-queue-jobs/SKILL.md +139 -0
package/skills/naming-helper/SKILL.md +57 -0
package/skills/observability-instrument/SKILL.md +113 -0
package/skills/optimize-core-web-vitals/SKILL.md +75 -0
package/skills/optimize-sql-query/SKILL.md +67 -0
package/skills/performance-profiling/SKILL.md +65 -0
package/skills/process-pdf/SKILL.md +107 -0
package/skills/profile-dataset/SKILL.md +97 -0
package/skills/prompt-engineering/SKILL.md +70 -0
package/skills/rag-pipeline/SKILL.md +53 -0
package/skills/rate-limiting/SKILL.md +96 -0
package/skills/refactor-cleanup/SKILL.md +54 -0
package/skills/regex-build/SKILL.md +72 -0
package/skills/release-notes/SKILL.md +79 -0
package/skills/rest-graphql-contract/SKILL.md +71 -0
package/skills/scrape-structured-web-data/SKILL.md +61 -0
package/skills/secrets-management/SKILL.md +96 -0
package/skills/security-review/SKILL.md +62 -0
package/skills/shell-script-robust/SKILL.md +71 -0
package/skills/style-responsive-tailwind/SKILL.md +70 -0
package/skills/terraform-plan-review/SKILL.md +95 -0
package/skills/type-safety-strict/SKILL.md +82 -0
package/skills/validate-data-quality/SKILL.md +62 -0
package/skills/wrangle-tabular-data/SKILL.md +75 -0
package/skills/write-adr/SKILL.md +75 -0
package/skills/write-analytical-sql/SKILL.md +71 -0
package/skills/write-data-viz/SKILL.md +58 -0
package/skills/write-docs/SKILL.md +54 -0
package/skills/write-plan/SKILL.md +59 -0
package/skills/write-playwright-e2e/SKILL.md +86 -0
package/skills/write-prd/SKILL.md +65 -0
package/skills/write-rfc/SKILL.md +75 -0
package/skills/write-tests/SKILL.md +50 -0

package/skills/agent-tool-mcp-builder/SKILL.md ADDED Viewed

@@ -0,0 +1,88 @@
+---
+name: agent-tool-mcp-builder
+description: Designs agent tools and builds MCP servers (tool schemas, naming, error shapes, auth, context-efficient results) when exposing capabilities to an LLM agent or scaffolding a Model Context Protocol server.
+when_to_use: User is building an MCP server, defining tools/function schemas for an agent, or fixing tools that the model misuses, returns bloated results, or mis-routes. NOT for general REST/GraphQL APIs for humans (use rest-graphql-contract).
+---
+## When to Use
+Reach for this when an LLM agent — not a human — is the caller:
+- Scaffolding an **MCP server** (stdio or streamable HTTP) that exposes tools/resources/prompts.
+- Defining **tool / function-call schemas** for a model (the `input_schema`, descriptions, result shape).
+- **Fixing tools the model misuses**: wrong tool picked, args malformed, results so large they blow the context window, or the model can't tell success from failure.
+NOT for human-facing REST/GraphQL APIs — those optimize for client devs and stable contracts, not for a model reading schemas at inference time. Use `rest-graphql-contract` there.
+Core mental model: **the model is your end user.** It reads only the tool name + description + param schema before deciding. Every result it gets back consumes context it pays for. Design for "a smart reader who sees the schema and nothing else."
+## Steps
+1. **Pick transport + deployment before writing tools.**
+   - **stdio** (subprocess, JSON-RPC over stdin/stdout) — local single-user tools, the default for desktop/CLI agent clients. No auth layer; the OS process boundary is the trust boundary. The host spawns one process per client, so concurrency = multiple independent processes (see step 6).
+   - **Streamable HTTP** — remote/multi-tenant, web-hosted. Needs real auth (below) and must handle concurrent requests in one process.
+   - Never log to **stdout** on a stdio server — stdout is the JSON-RPC channel and any stray `print` corrupts the protocol. Log to **stderr** only.
+2. **Choose auth by deployment, not by habit.**
+   - stdio local → **none** (process isolation). Read secrets from env vars the host injects; never take API keys as tool params (they land in model context + transcripts).
+   - Remote internal → **static bearer / API key** in the `Authorization` header.
+   - Remote multi-user → **OAuth 2.0**; keep token refresh server-side. The model never sees the credential — your server injects it at egress when calling the downstream API.
+3. **Name and scope tools for a model picking from a list.**
+   - `verb_noun`, specific: `search_orders`, `create_invoice` — not `orders`, `do`, `query`. The name is the model's primary routing signal.
+   - **One tool = one job.** A `manage_users(action=create|delete|list)` mega-tool forces the model to get `action` right *and* shapes a vague schema. Split into `create_user` / `delete_user` / `list_users` so each gets a tight schema and a name the model can match.
+   - Keep the active set **focused** (rough ceiling ~20–40 well-named tools). Past that, the model mis-routes. If you have a large library, expose a search/discovery tool and load schemas on demand rather than dumping all of them into context.
+   - **bash vs. dedicated tool:** a `bash`/`run_command` tool gives breadth but an opaque string the host can't gate, render, or parallelize. Promote an action to a **dedicated tool** when you need to: gate it (irreversible/external — `send_email`, `delete_*`), enforce an invariant (reject an edit if the file changed since last read), render custom UI, or mark it parallel-safe (read-only `glob`/`grep`). Rule: start with bash for reach, promote when you must gate/render/audit/parallelize.
+4. **Write the param schema and description for the model.**
+   - JSON Schema `input_schema`: `type:"object"`, `properties` with a `description` on **every** field, `required` listing only truly-required params, `enum` for fixed value sets, `additionalProperties:false`.
+   - Descriptions are **prescriptive about *when* to call**, not just what the tool does — e.g. *"Call this when the user asks about current order status or recent shipments."* On models that reach for tools conservatively, trigger conditions in the description measurably raise should-call rate. Put the same trigger in the tool description itself, not only the system prompt.
+   - Set `strict: true` (or the server's strict-schema equivalent) when the downstream call needs exact params — it constrains the model's args to the schema instead of validating after the fact.
+   - **Tool name/type pairs are load-bearing** for built-in/server tools — `text_editor_20250728` pairs with name `str_replace_based_edit_tool`; mixing a type with the wrong name 400s. Match your versioned tool `type` to its expected `name`.
+5. **Make results discriminated and errors actionable.**
+   - Return a **discriminated shape** the model can branch on: a `status: "ok" | "error" | "empty"` (or typed result variants), never a bare string the model has to pattern-match.
+   - On failure, return the error **as a tool result with an error flag set** (`is_error: true` on the result block) and an **actionable message** — *"Location 'xyz' not found. Provide a valid city name."* — not an exception that kills the loop, and not a silent empty result. The model reads the message and retries or asks; a stack trace or `null` teaches it nothing.
+   - Validate inputs **inside** the tool before doing work; surface validation failures the same way so the model can correct its own args.
+6. **Trim results for context efficiency — this is where most tool servers fail.**
+   - **Default to summaries.** Add a `summary`/`detail` mode and a `limit`; return aggregates + top-N, not full dumps. A tool that returns 5K rows of JSON burns thousands of tokens the model rarely needs. (Mirror of the read-tool convention: `summary=true` unless individual rows are required.)
+   - **Paginate** large lists (`cursor`/`page` + `limit`); never return an unbounded set.
+   - For **large/tabular** payloads, prefer **CSV or YAML over JSON** — same data, far fewer tokens (no repeated keys/braces). Reserve JSON for small structured results the model branches on.
+   - **Offload huge outputs to a file/handle**: if a result would exceed a large threshold (tens of K tokens), write it to a path/resource and return a **preview + the handle** so the model can fetch the slice it needs. (Hosted toolsets do this automatically past ~100K tokens — replicate the pattern.)
+   - **Programmatic / composed calls:** when the model would otherwise chain N tool calls (each result hitting its context), let it run a script that calls tools as functions and returns only the final output — intermediate results never enter context. Worth it for sequential calls or large filtered-down intermediates.
+7. **Handle concurrency safely for the transport.**
+   - stdio: the host spawns **one subprocess per client** — within a process, JSON-RPC requests can interleave, so don't hold shared mutable state across `await` points without guarding it; do hold per-request state on the stack.
+   - HTTP: multiple concurrent requests share **one** process — make handlers reentrant, use connection pools (don't open a DB connection per call), and never stash request-scoped data in module globals.
+   - Mark read-only tools as parallel-safe where the framework allows; serialize anything with side effects.
+8. **Add observability + rate limiting (remote especially).**
+   - Log each call's tool name, arg keys (not secret values), latency, and result size to **stderr / your logger** — never stdout on stdio.
+   - Rate-limit per-caller on remote servers; return a clear retryable error (with retry guidance) on limit, so the agent backs off instead of hammering.
+9. **Test with the MCP inspector, then verify the model actually calls it right.**
+   - Run the server under the **MCP inspector** (`npx @modelcontextprotocol/inspector <server-cmd>`) — confirm tools enumerate, schemas validate, and a hand-entered call returns the expected discriminated shape and error shape.
+   - Then close the loop with a real model: give it a task that *should* trigger the tool and confirm it (a) picks the right tool, (b) fills args validly, (c) gets a result small enough to act on. A tool that passes the inspector but the model never calls (or always mis-calls) is not done — fix the **name/description**, not the prompt.
+## Common Errors
+- **Logging to stdout on a stdio server.** Any `print`/`console.log` corrupts the JSON-RPC stream and the client silently breaks. Everything diagnostic goes to **stderr**.
+- **Returning raw, unbounded results.** The #1 context killer. 5K rows of pretty-printed JSON = thousands of wasted tokens. Add `summary`/`limit`/pagination and prefer CSV/YAML for tabular data **before** shipping.
+- **Mega-tools with an `action` discriminator.** `manage_x(action=...)` makes the model get both the action and a loose schema right. Split into one tool per action.
+- **Vague names / what-not-when descriptions.** `query`, `data`, `do` give the model nothing to route on. A description that says *what* the tool does but not *when to call it* leaves the model guessing — and newer models under-call by default, so the trigger condition has to be explicit.
+- **Errors as exceptions or silent nulls.** An exception ends the agent loop; a `null`/empty result looks like success. Return a result with the error flag set + an actionable message so the model can recover.
+- **API keys as tool parameters.** They land in model context and the transcript/event history. Inject credentials server-side from env/vault at egress; the model never sees them.
+- **Mismatched versioned tool name/type pair** (built-in tools). The `type` and `name` are a fixed pair — swapping one without the other 400s.
+- **Per-call DB connections / shared mutable state on HTTP servers.** Concurrent requests share the process; use pools and keep request state off module globals.
+- **Schema drift the model can't see.** Changing return shape without updating the description means the model branches on a contract that no longer holds. Keep the description and the actual result shape in sync.
+## Verify
+- [ ] Transport chosen deliberately (stdio = local/no-auth; HTTP = remote/+auth); secrets come from env/vault, **never** from tool params.
+- [ ] Every tool: `verb_noun` name, single responsibility, description states **when to call**, `input_schema` has a `description` on each field + `enum`/`required`/`additionalProperties:false`.
+- [ ] Results are discriminated (`status`/typed variants); errors return an error-flagged result with an actionable message — not exceptions, not silent nulls.
+- [ ] Large results trimmed: summary mode + pagination + limit; CSV/YAML for tabular; oversized output offloaded to a handle with a preview.
+- [ ] Concurrency safe for the transport (no stdout logging on stdio; pooled connections + reentrant handlers on HTTP).
+- [ ] Rate limiting + per-call logging (to stderr/logger) in place on remote servers.
+- [ ] **Passed the MCP inspector** (tools enumerate, schemas validate, success + error shapes correct) **and** a real model picks the right tool, fills valid args, and gets a context-sized result.

package/skills/api-design-review/SKILL.md ADDED Viewed

@@ -0,0 +1,70 @@
+---
+name: api-design-review
+description: Designs and reviews HTTP/REST and RPC API surfaces — resource naming, verbs, status codes, pagination, versioning, idempotency, error envelopes, and backward compatibility. Use when creating a new endpoint or changing an existing API contract.
+when_to_use: สร้าง endpoint ใหม่; แก้ request/response schema; เปลี่ยน API contract; ออกแบบ versioning
+---
+## When to Use
+Invoke before writing or merging any change that alters an API contract:
+- Adding a new endpoint or RPC method.
+- Changing a request or response body (add/remove/rename field, change type, change nullability).
+- Changing path, query params, HTTP verb, or status codes of an existing route.
+- Designing pagination, filtering, sorting, or a versioning strategy.
+Do NOT use for: pure internal refactors with no wire-format change, or UI-only work.
+## Steps
+1. **Pin the contract surface.** List every consumer-visible element you touch: `METHOD /path`, query params, request body fields, response body fields, status codes, headers. If you can't enumerate it, you can't review it.
+2. **Model resources as nouns, actions as verbs.** Path = pluralized noun collection (`/orders`, `/orders/{id}/items`). Never put the action in the path (`/getOrder`, `/createOrder` are wrong). Map intent to verb:
+   - `GET` read, no side effects, safe + idempotent.
+   - `POST` create / non-idempotent action → returns `201` + `Location` header (or `202` for async).
+   - `PUT` full replace, idempotent. `PATCH` partial update.
+   - `DELETE` remove, idempotent → `204` (or `200` with body).
+   - Genuinely non-CRUD action → POST a sub-resource (`POST /orders/{id}/cancel`), not a verb path.
+3. **Pick the correct status code — not a blanket 200.**
+   - `201` created (with `Location`), `202` accepted-async, `204` success-no-body.
+   - `400` malformed syntax, `401` unauthenticated, `403` authenticated-but-forbidden, `404` not found, `409` conflict (e.g. duplicate, state clash), `422` well-formed but semantically invalid (validation), `429` rate-limited.
+   - `5xx` only for server faults — never for client validation failures.
+4. **Standardize pagination/filter/sort across the surface.** Pick ONE pagination style and apply it everywhere: cursor-based (`?cursor=&limit=`, opaque cursor, stable for live data — preferred) or offset (`?page=&per_page=`, simpler but drifts on inserts). Return pagination metadata consistently (e.g. `{ "data": [...], "next_cursor": "...", "has_more": true }`). Filtering/sorting via explicit allow-listed params (`?status=open&sort=-created_at`); reject unknown params instead of silently ignoring.
+5. **Use a single error envelope for every error response.** Same shape on every 4xx/5xx so clients parse once:
+   ```json
+   { "error": { "code": "VALIDATION_FAILED", "message": "human readable", "details": [ { "field": "email", "issue": "invalid format" } ] } }
+   ```
+   `code` = stable machine string (clients branch on this, never on `message`). Don't leak stack traces / internal SQL.
+6. **Require idempotency for unsafe writes.** Non-idempotent `POST` that creates resources or moves money/state must accept an `Idempotency-Key` header; server stores key→result and replays the same response on retry. `PUT`/`DELETE` must be naturally idempotent (calling twice = same end state, no error on second `DELETE` of already-gone resource → `204` or `404`, pick one and document it).
+7. **Version + run a breaking-change diff.** Diff the new contract against the existing one. Backward-compatible (safe, no version bump): adding an optional field, adding a new endpoint, adding an enum value clients must tolerate, loosening validation. Breaking (needs new version or is forbidden): removing/renaming a field, changing a type, making an optional field required, tightening validation, changing status-code semantics, changing default behavior. Version via URL (`/v2/...`) or header — pick the project's existing convention. If a change is breaking, either make it additive instead, or ship under a new version and keep the old one working.
+8. **Validate input at the edge.** Reject unknown/extra fields or define the policy explicitly. Enforce types, ranges, lengths, and required-ness before any business logic runs. Document every field: type, required?, default, constraints, example.
+## Common Errors
+- **Silent breaking change.** Renaming a response field or tightening validation looks harmless in a diff but breaks every existing client. Always diff against the deployed contract, not just the previous commit.
+- **Status code 200 for everything** (including errors, with `{"success": false}` in the body). Clients then can't rely on HTTP semantics, retries/caching/monitoring all misbehave. Use real status codes.
+- **`200` for async work.** If processing is deferred, return `202 Accepted` with a status/polling URL — not `200` implying it's done.
+- **404 vs 403 leak.** Returning `404` to hide existence is sometimes intentional; returning `403` reveals the resource exists. Decide deliberately and be consistent.
+- **POST retried twice creates duplicates** because there's no idempotency key — classic double-charge / double-order bug on flaky networks.
+- **Offset pagination on live data** skips or repeats rows when records are inserted/deleted between pages. Prefer cursor for anything mutating.
+- **Inconsistent error shapes** across endpoints force clients to write per-endpoint parsing. One envelope, always.
+- **`422` vs `400` confusion.** `400` = can't even parse the request; `422` = parsed fine but the values are invalid. Mixing them makes client retry logic wrong.
+## Verify
+- [ ] Every path is a noun; no verbs in paths.
+- [ ] Each operation uses the semantically correct verb and a specific status code (not blanket `200`).
+- [ ] Error responses across the whole surface share one envelope with a stable `code` field.
+- [ ] All create/state-changing `POST`s accept an idempotency key; `PUT`/`DELETE` are idempotent.
+- [ ] Pagination/filter/sort follow one consistent pattern; unknown params are rejected, not ignored.
+- [ ] Ran a contract diff vs the deployed version → any breaking change is either reworked to be additive or shipped under a new version with the old one intact.
+- [ ] Every request/response field is documented (type, required, default, example) and validated at the edge.
+- [ ] No internal details (stack traces, SQL, secrets) leak in any response.
+If a contract test / schema snapshot exists, run it and confirm the diff matches the intended (and documented) changes only.

package/skills/async-concurrency-correctness/SKILL.md ADDED Viewed

@@ -0,0 +1,93 @@
+---
+name: async-concurrency-correctness
+description: Writes and fixes correct async/concurrent code across Python (asyncio), TypeScript (Promises), Rust (tokio), and Go (goroutines/channels), targeting deadlocks, races, cancellation, and backpressure.
+when_to_use: User is writing or debugging async/await, event loops, goroutines, tokio tasks, locks, channels, or hits deadlocks, races, leaked tasks, blocked event loops, or unbounded concurrency. NOT for general code tidy-ups (use refactor-cleanup).
+---
+## When to Use
+Reach for this skill when the code touches more than one thing happening at once and correctness depends on *ordering, timing, or shared state*:
+- Writing/reviewing `async`/`await`, event loops, `goroutine`+`channel`, `tokio::spawn`, thread pools, locks, queues.
+- A symptom points at concurrency: hang/deadlock, intermittent test failure, data race, "works once then stalls", leaked/zombie tasks, OOM under load, event loop "blocked" warnings, requests timing out under concurrency but fine serially.
+- Adding parallelism to existing serial code (fan-out, worker pool, pipeline).
+Do NOT use for single-threaded refactors, naming, or non-timing-dependent logic bugs — that's `refactor-cleanup`. If a "race" reproduces 100% deterministically serially, it's a logic bug, not a concurrency bug — stop and treat it as one.
+First move: **identify the runtime model before writing a line.** The right fix differs per runtime.
+| Runtime | Model | Key constraint |
+|---|---|---|
+| Python asyncio | single-thread cooperative loop | one blocking call freezes *everything*; CPU work needs a process, not a thread (GIL) |
+| TypeScript/Node | single-thread microtask queue | sync CPU blocks the loop; `await` in a `for` serializes; unhandled rejection can crash |
+| Rust tokio | M:N work-stealing (multi-thread) or current-thread | blocking a worker starves the pool; `!Send` futures can't cross threads |
+| Go | M:N goroutine scheduler, preemptive | true parallelism → real data races; nil/closed channel semantics bite |
+## Steps
+1. **Pin the runtime + executor flavor.** asyncio (`asyncio.run`?) · tokio (`#[tokio::main]` multi-thread vs `flavor = "current_thread"`?) · Go (`GOMAXPROCS`?) · Node (worker_threads in play?). Decisions below depend on this.
+2. **Get blocking work off the async path.** Anything that doesn't yield blocks the whole loop/worker:
+   - asyncio: wrap sync/CPU/file-I/O in `await asyncio.to_thread(fn, ...)`; for CPU-bound use `loop.run_in_executor(ProcessPoolExecutor(), ...)` (threads won't help under the GIL).
+   - tokio: `tokio::task::spawn_blocking(move || ...).await` for blocking syscalls/CPU; never call `std::thread::sleep`, blocking `Mutex` held across `.await`, or sync file I/O on a worker.
+   - Node: move CPU work to a `worker_thread`; never a synchronous loop, `fs.readFileSync`, or `JSON.parse` of a huge string on the main thread.
+   - Go: blocking is fine (scheduler handles it), but cap goroutines spawned around it (step 5).
+3. **Never hold a sync lock across an await/yield point.** This is the #1 async deadlock.
+   - asyncio: use `asyncio.Lock` (an `async with`), not `threading.Lock`, around `await`. If you must touch a `threading.Lock`, acquire→mutate→release with *zero* awaits inside.
+   - tokio: use `tokio::sync::Mutex` only when the guard must survive `.await`. Otherwise prefer `std::sync::Mutex` and **drop the guard before** `.await` (scope it in a block, or `let v = *guard; drop(guard);`). Holding `std::sync::Mutex` across `.await` is a hang waiting to happen.
+   - Go: `sync.Mutex` is fine but never `Lock()` then `<-ch`/another `Lock()` while holding it — order all locks consistently to kill lock-ordering deadlocks.
+4. **Make cancellation and shutdown explicit.** Tasks must be reliably stoppable and joined.
+   - asyncio: prefer `asyncio.TaskGroup` (3.11+) so a child failure cancels siblings and you `await` them all; use `asyncio.timeout(s)` for deadlines. Every bare `create_task` must be stored and awaited/cancelled — fire-and-forget tasks get GC'd and silently dropped. On shutdown: cancel, then `await asyncio.gather(*tasks, return_exceptions=True)`. Catch `asyncio.CancelledError`, run cleanup, then **re-raise it** — swallowing it breaks cancellation.
+   - tokio: pass a `tokio_util::sync::CancellationToken` (or `select!` on a shutdown channel) into long tasks; wrap deadlines in `tokio::time::timeout`. Note a dropped `JoinHandle` does *not* cancel the task — call `.abort()` or use a `JoinSet`. Cancellation drops at the next `.await`, so keep state consistent at every await point.
+   - Node: thread an `AbortSignal` into fetch/timers/streams; race work against `AbortSignal.timeout(ms)`.
+   - Go: every goroutine takes `ctx context.Context` and returns on `<-ctx.Done()`; pair `context.WithTimeout`/`WithCancel` with a `defer cancel()`. Use a `sync.WaitGroup` to join before exit so you don't leak.
+5. **Bound concurrency — never unbounded fan-out.** Spawning one task per input will exhaust memory/sockets/fds.
+   - asyncio: gate with `asyncio.Semaphore(N)` around each unit, or process in chunks; `gather` over a *fixed* worker pool, not over N inputs.
+   - tokio: `Semaphore::new(N)` + `acquire_owned`, or `buffer_unordered(N)` on a `futures::stream`.
+   - Node: pool with `p-limit`/a concurrency cap; an unawaited `array.map(async ...)` fires all at once — wrap in `pLimit` or batch.
+   - Go: fixed-size worker pool reading from a jobs channel (`for w := 0; w < N; w++ { go worker() }`), or a buffered semaphore channel `sem := make(chan struct{}, N)`.
+6. **Channels/queues must have backpressure.** Bounded by default; unbounded queues just move the OOM downstream and hide the real throughput limit.
+   - Go: `make(chan T, N)` (buffered) so producers block when consumers lag; close the channel exactly once, from the *sender* side only; never send on a closed channel. Drain or `select{ case <-ctx.Done(): }` to avoid send-blocked leaks.
+   - asyncio: `asyncio.Queue(maxsize=N)`; `await queue.put` blocks the producer under load (that's the point). Use sentinel values or `task_done()`/`join()` to signal completion.
+   - Rust: `tokio::sync::mpsc::channel(N)` (bounded), not `unbounded_channel`. `send().await` applies backpressure.
+7. **Eliminate shared mutable state or make access correct.** Prefer message-passing/ownership over shared memory. If shared:
+   - Make handlers **idempotent** (retries + at-least-once delivery will re-run them).
+   - Use atomics for counters (`std::sync::atomic`, Go `atomic`, not bare `i++`); use proper guarded structures for the rest.
+   - Go: protect every shared field — a single unguarded write under `-race` is a real bug, not a warning to ignore.
+8. **Verify under contention, not just once** (see Verify).
+## Common Errors
+- **`std::sync::Mutex` guard held across `.await` (Rust):** task parks holding the lock → deadlock. Drop the guard before awaiting, or switch to `tokio::sync::Mutex`.
+- **`threading.Lock` instead of `asyncio.Lock` (Python):** blocks the whole loop or deadlocks. Use the async lock around awaits.
+- **Fire-and-forget `asyncio.create_task(...)` without keeping a reference:** the task can be garbage-collected mid-flight and exceptions vanish. Store it; await or cancel it; or use `TaskGroup`.
+- **Swallowing `CancelledError`:** `except Exception` catches it (it's `BaseException` in 3.8+ but still easy to over-catch) — clean up and re-raise, or cancellation silently no-ops.
+- **`await` inside a `for` loop when you meant parallel (JS/Python):** serializes everything. Use `Promise.all`/`asyncio.gather` — but then add a concurrency cap (step 5) so it's not unbounded.
+- **`Promise.all` for fan-out with no limit:** opens thousands of connections at once. Cap it.
+- **Unhandled promise rejection (Node):** crashes or silently drops the result. Always `.catch`/`try` around detached promises; add `process.on('unhandledRejection')` as a backstop, not a fix.
+- **`JoinHandle` dropped expecting cancellation (tokio):** dropping does NOT abort the task — it keeps running detached. Use `.abort()` or a `JoinSet`.
+- **Closing a Go channel from the receiver, or twice:** panics. Close once, from the sole sender. Sending on a closed/nil channel panics/blocks forever.
+- **`select` with no `default` and no ready case (Go):** blocks forever — a silent deadlock. Add a `ctx.Done()` case.
+- **Loop variable captured by goroutine pre-Go 1.22:** all goroutines see the last value. Pass it as an arg or shadow it.
+- **Treating `-race` / lint warnings as noise:** a reported race is a real bug. Fix the root cause; never suppress it to make CI green.
+- **CPU-bound work in `to_thread`/`spawn_blocking` expecting parallelism (Python):** GIL serializes it. Use processes for CPU.
+- **Blocking call sneaking onto a tokio worker** (`reqwest::blocking`, sync `std::fs`, `std::thread::sleep`): starves the runtime. Use async equivalents or `spawn_blocking`.
+## Verify
+A concurrency fix isn't done until it survives contention. Don't trust a single green run.
+1. **Run under the race detector / sanitizer:** Go `go test -race ./...` and `go run -race`; Rust `RUSTFLAGS="-Zsanitizer=thread"` (nightly) or at minimum `cargo test` with `tokio` `--features full` and Loom for lock-free code; Python/Node — add a stress test (below).
+2. **Stress test:** run the hot path with N concurrent callers (e.g. 100–1000), repeated 100+ iterations. Flaky pass = unfixed race. In Go run with `-count=100 -race`; in async langs spawn the op `gather`/`Promise.all`/`JoinSet` with a high count and assert invariants every run.
+3. **Deadlock check:** add a hard timeout to the test itself (`go test -timeout 30s`, `asyncio.timeout`, jest `testTimeout`) so a hang fails loudly instead of hanging CI.
+4. **Leak check:** assert no growth in task/goroutine count after the workload settles — Go `runtime.NumGoroutine()` before/after; asyncio `len(asyncio.all_tasks())`; tokio task count via a `JoinSet`/metrics. Steady-state count must return to baseline.
+5. **Backpressure check:** drive producers faster than consumers and confirm bounded memory (queue depth stays ≤ N, RSS flat) instead of unbounded growth.
+6. **Cancellation check:** trigger shutdown/timeout mid-flight and assert all tasks stopped, resources released (no leaked connections/fds/locks), and no `CancelledError`/panic escaped.
+Show the actual command + output as evidence. "Looks fine" is not verification for concurrency.

package/skills/audit-accessibility-wcag/SKILL.md ADDED Viewed

@@ -0,0 +1,59 @@
+---
+name: audit-accessibility-wcag
+description: Audits and fixes markup/JSX for WCAG 2.2 AA compliance — alt text, ARIA, heading order, contrast, keyboard nav, focus management; used before shipping UI or preparing an a11y review.
+when_to_use: When the user wants an accessibility (a11y) check or fix, mentions WCAG, screen readers, ARIA, keyboard navigation, color contrast, or is preparing for an accessibility audit.
+---
+## When to Use
+Trigger this skill when the task is to audit or fix UI for accessibility: WCAG 2.2 AA, screen readers, ARIA, keyboard nav, color contrast, focus management, or preparing an a11y review.
+NOT this skill: pure visual/layout polish (no a11y intent), or security/XSS concerns — that is `security-review`. ARIA/role injection that touches `dangerouslySetInnerHTML` or user-controlled markup overlaps both; flag the a11y part here and hand the injection risk to `security-review`.
+## Steps
+Work in this order. Earlier rules outrank later ones — never reach for ARIA to patch a problem that semantic HTML already solves.
+1. **Scope the surface.** Identify the files/components in the diff or named target. Grep for the smells that produce most findings: `rg -n 'role=|aria-|onClick=|<div|<span|tabindex|tabIndex|alt=|<img|placeholder='` across the target. Build a checklist; do not audit the whole repo unless asked.
+2. **Semantic HTML first.** Replace `<div onClick>` / `<span onClick>` with a real `<button type="button">`. Use `<a href>` only for navigation (has a destination), `<button>` for actions (no navigation). Use `<nav>`, `<main>`, `<header>`, `<footer>`, `<ul>/<li>`, `<table>` over `<div role="...">`. One `<main>` per page. A native element is keyboard-operable and announced for free; a styled `div` is not.
+3. **Images & icons.** Every `<img>` needs `alt`. Informative image → describe its meaning (`alt="Quarterly revenue up 12%"`), not the filename. Decorative/redundant image → `alt=""` (empty, not missing — missing alt makes SRs read the URL). Icon-only buttons need an accessible name via `aria-label` or visually-hidden text; an inline SVG that conveys meaning needs `role="img"` + `aria-label`, a decorative one needs `aria-hidden="true"`.
+4. **Heading hierarchy.** Exactly one `<h1>`. No skipped levels (h2 → h4 is a fail; reflows the SR outline). Headings describe structure, not font size — never pick a level for its default styling. Don't fake headings with bold `<p>`/`<div>`.
+5. **ARIA only where native fails.** First rule of ARIA: don't use ARIA. Apply it only for patterns HTML can't express (tabs, comboboxes, live regions, disclosure). When you do: every `role` must carry its required states (e.g. `role="tab"` needs `aria-selected` + `aria-controls`; `aria-expanded` on every disclosure/menu trigger; `role="dialog"` needs `aria-modal="true"` + `aria-labelledby`). Never put an interactive role on a non-interactive element without also making it focusable (`tabindex="0"`) and key-handled. Remove redundant roles (`role="button"` on a `<button>`).
+6. **Forms.** Every input has a programmatic label: `<label for>` / `htmlFor`, wrapping `<label>`, or `aria-label`/`aria-labelledby`. Placeholder is NOT a label (vanishes on input, low contrast). Required → `required` + visible indicator. Errors: link the message with `aria-describedby` on the field, set `aria-invalid="true"`, and announce dynamic errors via an `aria-live="polite"` (or `role="alert"` for assertive) region. Group radios/related fields in `<fieldset>` + `<legend>`.
+7. **Color contrast (WCAG 2.2 AA).** Body text ≥ 4.5:1; large text (≥ 24px, or ≥ 18.66px bold) ≥ 3:1; UI components, icons, and focus indicators ≥ 3:1 against adjacent colors. Compute ratios from the actual hex values (resolve CSS vars/Tailwind tokens to real colors first). Never rely on color alone to convey state — pair it with text/icon/shape. Check disabled, hover, and dark-mode states too.
+8. **Keyboard.** Everything operable by mouse must work by keyboard. Tab order follows DOM order — fix with DOM/flexbox `order` reasoning, never positive `tabindex` (1+). `tabindex="-1"` = focusable by script only; `0` = in natural order. Every interactive element needs a visible focus indicator — do NOT `outline: none` without a replacement (`:focus-visible` ring). Add a "Skip to main content" link as the first focusable element. No keyboard traps (you can Tab out of every widget). Custom widgets implement expected keys (Esc closes, Arrow keys move within tabs/menus/listbox per ARIA APG).
+9. **Focus management (dynamic UI).** On modal/dialog open: move focus into it, trap Tab within it, restore focus to the trigger on close, Esc closes. On client-side route change: move focus to the new page's `<h1>` or a focusable container so SR users aren't stranded. Toasts/async results → announce via live region, don't steal focus mid-task.
+10. **Report.** Output a prioritized list grouped **Blocker → Major → Minor**, each entry = file:line, the WCAG criterion (e.g. 1.1.1, 2.4.7), the concrete problem, and a copy-pasteable fix (before → after). Blocker = blocks a user from completing a task (unlabeled control, keyboard trap, no focus on modal). Major = significant barrier (contrast fail, skipped heading). Minor = friction (redundant role, missing `lang`). Apply the fixes when asked; otherwise leave them as ready-to-paste diffs.
+## Common Errors
+- `alt=""` vs missing `alt` — empty is intentional "decorative, skip me"; missing makes SRs announce the file path. Not interchangeable.
+- Placeholder used as the only label. Fails 1.3.1/4.1.2 and disappears once typing starts.
+- `outline: none` / `outline: 0` in a CSS reset with no `:focus-visible` replacement — silently removes the focus ring for keyboard users. Single most common keyboard fail.
+- Positive `tabindex` ("1", "2"…) to "fix" order — creates a brittle parallel tab sequence ahead of everything else. Fix DOM order instead.
+- `aria-label` on a non-interactive, non-`role`'d element (plain `<div>`/`<span>`/`<p>`) — most SRs ignore it. The name needs an element that takes a name.
+- `role="button"` on a `<div>` without `tabindex="0"` AND `onKeyDown` for Enter/Space — looks clickable, dead to keyboard. Just use `<button>`.
+- Click handler only on the icon `<svg>`/`<i>` inside a control — enlarge the hit target and put the handler + name on the `<button>`.
+- `aria-hidden="true"` on a focusable element (or an ancestor of one) — hides it from SRs while it stays in the tab order: a focusable ghost. Never wrap interactive content in `aria-hidden`.
+- Live region added to the DOM at the same time as its message — SRs only announce changes to an *already-present* region. Render the empty `aria-live` container first, then inject text.
+- Tailwind/utility color audit done on class names — `text-gray-400 on bg-white` is not a contrast value. Resolve to hex, then compute the ratio.
+- Trusting the linter as "done" — `eslint-plugin-jsx-a11y`/axe catch static markup issues only. They cannot see contrast against rendered backgrounds, focus order, trap behavior, or whether a name actually makes sense. Keyboard/SR behavior needs manual or automated runtime checks.
+## Verify
+Static (always): `rg -n 'outline:\s*(none|0)|tabindex=["\x27][1-9]|aria-hidden=["\x27]true' <target>` returns no unjustified hits. If the project has it, run `npx eslint --no-eslintrc --plugin jsx-a11y` (or the existing a11y lint task) → zero errors. Confirm exactly one `<h1>` and no skipped heading levels.
+Runtime (when a browser is available): load the page, run an axe-core scan (or `lighthouse` accessibility category) and capture the score + violations. Then keyboard-walk it: Tab from the top — every interactive element is reachable, in logical order, with a *visible* focus ring; the skip link appears first; open a modal and confirm focus enters it, Tab is trapped, Esc closes, and focus returns to the trigger.
+Contrast: for each flagged pair, state the computed ratio and the threshold it must clear (4.5:1 / 3:1) — a fix isn't done until the number passes.
+Report is complete only when every Blocker has a concrete code fix attached (not just a description) and each finding cites its WCAG success criterion.

package/skills/audit-technical-seo/SKILL.md ADDED Viewed

@@ -0,0 +1,62 @@
+---
+name: audit-technical-seo
+description: Audits and fixes technical/on-page SEO — meta tags, Open Graph/Twitter cards, JSON-LD structured data, canonicals, sitemap, robots.txt; used when improving discoverability or fixing crawlability.
+when_to_use: When the user wants SEO improvements, mentions meta tags, Open Graph, schema.org/JSON-LD structured data, sitemap, robots.txt, canonical URLs, crawlability, or indexing.
+---
+## When to Use
+Trigger this skill when the request involves: improving search discoverability/indexing, adding or fixing `<title>`/`<meta>` tags, Open Graph or Twitter Card previews, schema.org / JSON-LD structured data, canonical URLs, `sitemap.xml`, `robots.txt`, `hreflang`, or "why isn't this page showing up / previewing right on social". No API keys, crawler accounts, or paid tools required — this is static analysis + code fixes on the project's own source.
+Skip if the ask is about off-page SEO (backlinks, content strategy, keyword research) or Core Web Vitals performance — those are different jobs (use a Lighthouse/perf pass for the latter).
+## Steps
+Detect the stack first, then audit head-out. Output a **missing-items report** with concrete diffs, don't just describe.
+1. **Detect framework + render mode.** `grep -rl "next" package.json` etc. Branch:
+   - Next.js App Router → `export const metadata` / `generateMetadata()` in `layout.tsx`/`page.tsx`. Do NOT hand-write `<head>` tags; they get stripped.
+   - Next.js Pages Router → `next/head` `<Head>` component, or `_document.tsx` for site-wide.
+   - Plain HTML / Vite / Astro → edit `<head>` directly or the framework's head API.
+   Find existing tags: `grep -rniE "og:|twitter:|application/ld\+json|rel=.canonical|<title|name=.description" src/ app/ public/`.
+2. **Title + meta description.** Every indexable page needs a unique `<title>` (aim ≤60 chars so it isn't truncated in SERP) and `<meta name="description">` (≤160 chars, written for click-through, not keyword-stuffed). Flag pages sharing the same title/description as duplicate-content risk. In Next: set `title` (string or `{ default, template }`) and `description` in the metadata object.
+3. **Open Graph + Twitter Card.** Required for correct social previews:
+   - OG: `og:title`, `og:description`, `og:type` (`website`/`article`), `og:url` (absolute), `og:image` (absolute URL, 1200×630, <5MB), `og:site_name`.
+   - Twitter: `twitter:card` = `summary_large_image`, `twitter:title`, `twitter:description`, `twitter:image`.
+   In Next, use the `openGraph` and `twitter` keys (Next auto-emits the `<meta>` tags). Images must be absolute — set `metadataBase: new URL("https://example.com")` so relative image paths resolve, otherwise OG images silently 404 for crawlers.
+4. **JSON-LD structured data.** Pick the type that matches the page: `Article`/`BlogPosting` (blog), `Product` + `Offer` (commerce), `BreadcrumbList` (nav trail), `FAQPage` (Q&A), `Organization`/`WebSite` (site-wide, in root layout). Emit as `<script type="application/ld+json">` with `JSON.stringify` — in React use `dangerouslySetInnerHTML`, never `{JSON.stringify(...)}` as a text child (React escapes the quotes and breaks parsing). Include all required props for the type (e.g. `Article` needs `headline`, `image`, `datePublished`, `author`). Validate the shape against schema.org before declaring done.
+5. **Canonical URLs.** Each page must self-reference a single absolute canonical (`<link rel="canonical">` / Next `alternates.canonical`). Audit for conflicts: trailing-slash variants, `?utm_`/query params, `http` vs `https`, `www` vs apex all resolving to the same content without one canonical winner. Paginated/filtered list pages each canonical to themselves, not page 1.
+6. **sitemap.xml + robots.txt.**
+   - `sitemap.xml`: lists only indexable, canonical, 200-status URLs (no redirects, no `noindex` pages, no admin/auth routes). Next: `app/sitemap.ts`. Verify `<loc>` URLs are absolute and match canonicals exactly.
+   - `robots.txt`: must NOT `Disallow` paths you also list in the sitemap or want indexed. Confirm it points to the sitemap (`Sitemap: https://example.com/sitemap.xml`). Check no stray `Disallow: /` left from a staging config.
+7. **Heading + semantic HTML.** Exactly one `<h1>` per page, no skipped levels (h1→h3), descriptive not styling-driven. Use `<main>`, `<nav>`, `<article>`, `<header>`; `alt` on content images. This is the crawler's content map — don't leave it as `<div>` soup.
+8. **hreflang (only if i18n).** For multi-locale sites, every locale variant lists reciprocal `hreflang` tags including a self-reference and `x-default`. Skip entirely for single-locale sites — wrong/partial hreflang is worse than none.
+9. **Report.** Produce a table: page · issue · severity · concrete fix (with the diff). Apply the fixes, then re-audit.
+## Common Errors
+- **Hand-writing `<head>` in Next App Router** — React strips it on the server. Must go through the metadata API.
+- **Relative OG/canonical image URLs without `metadataBase`** — crawlers and social scrapers can't resolve them; previews break and canonicals get ignored. Always absolute, or set `metadataBase`.
+- **JSON-LD as a React text child** — `<script>{JSON.stringify(data)}</script>` HTML-escapes the quotes (`&quot;`) and the data won't parse. Use `dangerouslySetInnerHTML={{ __html: JSON.stringify(data) }}`.
+- **Sitemap listing `noindex`/redirected/non-canonical URLs** — sends mixed signals; sitemap must only contain final indexable canonicals.
+- **`robots.txt` Disallow vs sitemap conflict** — disallowing a URL that's in the sitemap (or carries `noindex`) wastes crawl budget and confuses indexing.
+- **Multiple or zero `<h1>`** — common with component-based layouts where a shared header and the page both render an h1, or neither does.
+- **Duplicate title/description across pages** — usually a layout-level default never overridden per-page; treat as a real bug, not cosmetic.
+- **`noindex` left in from staging** — grep for `noindex` / `robots: { index: false }` before shipping; a single leftover tag silently de-indexes the page.
+## Verify
+- **Render the head:** build + serve, then `curl -s <url> | grep -iE "og:|twitter:|canonical|<title|ld\+json"` (or DevTools → Elements). Tags must exist in the **server-rendered HTML**, not injected client-side after load — crawlers read the initial response.
+- **Structured data:** extract each JSON-LD block and confirm it's valid JSON and includes every required field for its `@type` per schema.org (Google Rich Results Test / Schema Markup Validator for a live check).
+- **Social preview:** scrape with the platforms' validators (Facebook Sharing Debugger, Twitter/X Card Validator) — confirm image, title, description resolve from absolute URLs.
+- **Sitemap/robots:** fetch `/sitemap.xml` (well-formed XML, absolute `<loc>`, no 404/redirect/noindex entries) and `/robots.txt` (references sitemap, doesn't block indexable paths).
+- **Canonicals:** each page's canonical is absolute and self-consistent; no two pages claim the same canonical unless intended.
+- **Done = the missing-items report has zero open high-severity items** and the above checks pass against the actually-served output, not the source file.

package/skills/auth-jwt-session/SKILL.md ADDED Viewed

@@ -0,0 +1,88 @@
+---
+name: auth-jwt-session
+description: Implements authentication and session management (JWT issuing/verification, refresh rotation, sessions, cookies, OAuth2/OIDC flows, RBAC checks) when building or fixing how a backend logs users in and authorizes requests.
+when_to_use: User is implementing login/logout, JWT or session handling, refresh tokens, cookie config, OAuth/OIDC, or RBAC/permission checks. This is implementation; broad vulnerability auditing of a diff is security-review.
+---
+## When to Use
+Use when writing or fixing the code that logs a user in and authorizes their requests:
+- Implementing `login` / `logout` / `whoami` endpoints or commands
+- Issuing or verifying JWTs (access + refresh)
+- Refresh-token rotation, revocation, denylist/allowlist
+- Setting auth cookies (flags, scope, expiry)
+- Wiring an OAuth2 / OIDC provider (authorization-code + PKCE)
+- Adding RBAC / scope / permission checks to protected routes
+NOT for: auditing an existing diff for vulnerabilities (use `security-review`), or pure DB/RLS work in a managed Postgres stack (use the `supabase` skill).
+## Steps
+**1. Choose the session model first — write it down before coding.**
+- Stateless JWT: no server lookup per request; cannot instantly revoke. Use short access TTL (5–15 min) to bound the blast radius.
+- Server session (opaque ID → store): instant revocation, server round-trip per request.
+- Default for most APIs: **access JWT (short) + refresh token (long, server-tracked)**. This is the rest of these steps.
+**2. Pick the signing algorithm and lock it.**
+- Symmetric `HS256`: one secret signs AND verifies — only for a single-service monolith.
+- Asymmetric `RS256`/`ES256`: private key signs, public key (JWKS) verifies — required when a separate service verifies tokens.
+- On verify, **pass an explicit allowlist** of algorithms (e.g. `algorithms: ["RS256"]`). Never let the library read `alg` from the token header.
+- Store signing keys outside the repo: env var / secret manager / KMS. Never commit a key or check a literal secret into source.
+**3. Build the access token with minimal, validated claims.**
+- Set `exp` (short), `iat`, `iss`, `aud`, and a stable `sub` (user id). Add a `jti` if you ever need per-token revocation.
+- Put role/scope claims in only if every verifier trusts the issuer; otherwise look up authz server-side.
+- On verify, check signature AND `exp`, `iss`, `aud` — a valid signature on a token for another audience is still invalid here.
+**4. Implement refresh rotation + revocation.**
+- Store each refresh token (hashed, never plaintext) with `user_id`, `expires_at`, `revoked_at`, and a `family_id`.
+- On refresh: verify it exists, is unexpired, unrevoked → issue NEW access + NEW refresh, then mark the old one revoked (rotation).
+- **Reuse detection:** if an already-revoked refresh token is presented, revoke the entire `family_id` — that means a stolen token was replayed.
+- On logout: revoke the refresh token (and family). Access tokens stay valid until `exp` — that's expected with stateless JWT; keep access TTL short so this window is tiny.
+**5. Set cookies correctly (if browser-facing).**
+- Refresh token in cookie: `HttpOnly; Secure; SameSite=Lax` (or `Strict` if no cross-site nav), `Path=/auth/refresh` to limit where it's sent.
+- Never put a token in `localStorage` if XSS is in scope — `HttpOnly` cookie is the point.
+- `SameSite=None` REQUIRES `Secure` and only for genuine cross-site needs; pair with CSRF protection.
+**6. OAuth2 / OIDC: authorization-code + PKCE only.**
+- Generate `code_verifier` (random) → `code_challenge = S256(verifier)`. Send challenge on the authorize request; send verifier on the token exchange.
+- Generate a random `state`, store it tied to the session, and verify it on callback (CSRF defense).
+- For OIDC, generate a `nonce`, send it, and verify it in the returned ID token.
+- Validate the ID token: signature via the provider's JWKS, plus `iss`, `aud`, `exp`, `nonce`. Never trust the `userinfo` response without validating the token first.
+- Do not use the implicit flow. Do not embed a client secret in a public/CLI/SPA client — use PKCE without a secret.
+**7. Authorize every protected route — deny by default.**
+- Centralize: one middleware/guard that runs on all protected routes and returns 401/403 unless an explicit check passes. No route is public unless explicitly marked.
+- Check the specific permission/scope for the action, not just "is logged in." A valid token is authentication, not authorization.
+- Re-check ownership for resource access (does THIS user own THIS object), not only role.
+## Common Errors
+- **`alg: none` / algorithm confusion** — verifier trusts the token's own `alg`. An attacker sends `alg: none` (no signature) or swaps `RS256→HS256` and signs with the public key as an HMAC secret. Fix: hardcode the accepted algorithm list on verify.
+- **Verifying signature but not `exp`/`aud`/`iss`** — many libraries verify the signature only unless you opt in to claim validation. An expired or wrong-audience token passes. Explicitly require these claims.
+- **Refresh rotation without reuse detection** — you rotate tokens but don't revoke the family when an old one reappears, so a stolen token works until it naturally expires. Revoke the whole family on reuse.
+- **Refresh tokens stored in plaintext** — a DB leak hands out live sessions. Store a hash; compare by hash.
+- **Long-lived access tokens** — using a multi-hour access TTL "to avoid refresh complexity" means logout/ban does nothing for hours. Keep access short; put longevity in the refresh path.
+- **Cookie missing `HttpOnly`/`Secure`** — token readable by JS (XSS) or sent over plaintext. Both flags on the auth cookie, always.
+- **Same secret/key across environments** — a leaked staging key forges prod tokens. Separate keys per environment.
+- **Timing-unsafe token compare** — comparing opaque tokens/secrets with `==` leaks length/content via timing. Use a constant-time compare.
+- **Logging tokens** — access/refresh tokens or `Authorization` headers in request logs. Redact before logging.
+- **Authn mistaken for authz** — "user is logged in" used as the only gate; any logged-in user reaches admin routes. Check the specific scope/role/ownership.
+## Verify
+Write tests that prove rejection, not just the happy path:
+1. **Tampered token** — flip one byte of the signature → verify returns 401, never the user.
+2. **`alg: none`** — craft an unsigned token with valid-looking claims → rejected.
+3. **Expired token** — `exp` in the past → 401 (confirms claim validation runs, not just signature).
+4. **Wrong audience/issuer** — valid signature, wrong `aud` → rejected.
+5. **Refresh reuse** — use a refresh token, then present the old (rotated) one → that token AND its family are revoked; subsequent refresh fails.
+6. **Logout** — after logout, the refresh token no longer mints access tokens.
+7. **Authz** — a valid token lacking the required scope/role → 403 on a protected route; deny-by-default holds on an unmapped route.
+8. **OAuth state/PKCE** — callback with mismatched `state` → rejected; token exchange with wrong `code_verifier` → fails.
+9. **Secret scan** — `git grep` / a secrets scanner over the diff finds no literal keys or tokens; confirm logs redact `Authorization`.
+Run the test suite to green before declaring done. A passing login flow with no rejection tests is not done.

package/skills/brainstorm-design/SKILL.md ADDED Viewed

@@ -0,0 +1,73 @@
+---
+name: brainstorm-design
+description: Run a structured design conversation that explores requirements, proposes 2-3 approaches with tradeoffs, and converges on a validated design BEFORE any code is written — used when a feature/idea is fuzzy or under-specified.
+when_to_use: User describes a feature/idea/problem that is vague, open-ended, or has multiple viable approaches; before write-plan or implementation; when 'should I build X or Y' framing appears. Skip for one-line trivial changes (typo, rename, log line).
+---
+## When to Use
+Invoke when the request is a **design decision, not a typing task**. Concrete triggers:
+- The ask is a goal, not a mechanism: "add caching", "make onboarding smoother", "let users export data" — *how* is undecided.
+- "Should I build X or Y?" / "What's the best way to do Z?" framing.
+- The change touches ≥2 modules, a public API/schema, or a data model (decisions are expensive to reverse later).
+- Success criteria, target users, scale, or constraints are unstated.
+**Skip** (go straight to edit/write-plan) when the diff fits in one sentence: typo, rename, log line, version bump, copy tweak, adding a flag to an existing call. Burning a design round on these annoys the user.
+This skill **stops at a validated design brief**. It never writes implementation code. Hand the brief to `write-plan` (or implementation) next.
+## Steps
+1. **Read the surrounding code first, then ask.** Before any questions, grep the repo for the relevant module/feature so questions are informed, not generic. Skip questions the code already answers (existing patterns, current schema, conventions). Asking what you could have read yourself wastes the user's turn.
+2. **Ask 3-5 narrowing questions in ONE message** (not drip-fed one at a time). Cover the axes that actually change the design:
+   - **Scope / non-goals** — what is explicitly out of scope for v1?
+   - **Users & trigger** — who hits this, how often, from where (UI, API, CLI, cron)?
+   - **Constraints** — latency budget, data volume, existing stack/deps you must reuse, deadline, backwards-compat?
+   - **Success criteria** — how do we know it works? (a measurable bar, not "it's good")
+   - **Failure tolerance** — what happens on error/empty/duplicate input — block, retry, or degrade?
+   Ask only the axes that are genuinely ambiguous. If the user already pinned scope, don't re-ask it.
+3. **Propose 2-3 candidate approaches as a tradeoff table.** One row per approach, columns: `Approach | Effort | Risk | Reversibility | Fit`. Approaches must be *materially* different (e.g. "sync in-request" vs "background queue" vs "precompute on write") — not three flavors of the same thing. Each cell is a short concrete phrase, not a paragraph. If you can only think of one real approach, say so and skip the table — don't invent strawmen to pad to three.
+4. **Recommend one** with 1-2 sentences of reasoning tied to the user's stated constraints from step 2 (e.g. "Given the 'no new infra' constraint, B avoids a queue dep while still meeting the 200ms budget"). Take a position — this is a design partner, not a menu.
+5. **Probe the chosen approach for edge cases and failure modes** before locking it: empty/null input, concurrent writes, partial failure, scale ceiling, security/auth surface, migration of existing data. List the ones that genuinely apply; note how the design handles each or flag it as an open question.
+6. **Emit the design brief** and stop. Exactly these sections, kept tight:
+   ```
+   ## Design Brief: <feature>
+   **Problem:** <1-2 sentences — the user-facing need>
+   **Chosen approach:** <name + 2-3 sentence mechanism>
+   **Why:** <the tradeoff that decided it>
+   **Non-goals:** <explicit out-of-scope for this iteration>
+   **Edge cases handled:** <bullets>
+   **Open questions:** <unresolved decisions needing user input — or "none">
+   ```
+   Then hand off: "Ready for `write-plan` / implementation." Do not start coding.
+## Common Errors
+- **Skipping questions and jumping to a table.** If you guess the constraints, the "best" approach is best for the wrong problem. Always run step 2 unless the user already pinned every axis.
+- **Drip-feeding questions** one per turn — exhausting and slow. Batch all of them into a single message.
+- **Three fake approaches.** Listing minor variants of one idea to fill the table is noise. Two genuinely distinct options beat three near-duplicates.
+- **Refusing to recommend** ("it depends, here are options"). The user invoked a design *partner*. Pick one and defend it; they can override.
+- **Re-asking what's in the repo or in chat.** Read the code and the existing message first. Asking about conventions the codebase already demonstrates reads as not having looked.
+- **Sliding into implementation** — pseudocode, function signatures, file edits. The deliverable is a brief, not a diff. Stop at the brief even when the design feels obvious.
+- **Brief with no non-goals or open questions.** "Non-goals" prevents scope creep in the next phase; omitting it is the #1 cause of bloated plans downstream.
+## Verify
+Before declaring the design done, confirm all of:
+- [ ] Read the relevant existing code/schema (not designing blind).
+- [ ] User answered the narrowing questions, or explicitly waved them off.
+- [ ] ≥2 materially distinct approaches were surfaced (or a clear reason only one is viable).
+- [ ] A single approach is recommended with reasoning tied to a stated constraint.
+- [ ] Edge cases / failure modes were enumerated and addressed or parked.
+- [ ] Brief contains all sections including **Non-goals** and **Open questions**.
+- [ ] **No implementation code was written** — output is the brief, handed to the next phase.
+If any box is unchecked, the design is not ready to hand off — finish it before moving on.