npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.1 - Mend

sanook-cli 0.4.0 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (238)

package/.env.example +19 -0
package/CHANGELOG.md +173 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +405 -57
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +21 -7
package/dist/providers/keys.js +3 -2
package/dist/providers/models.js +22 -6
package/dist/providers/registry.js +155 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +228 -31
package/dist/ui/banner.js +4 -9
package/dist/ui/brain-wizard.js +2 -2
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/render.js +55 -15
package/dist/ui/setup.js +97 -12
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/harden-llm-app-reliability/SKILL.md ADDED

@@ -0,0 +1,126 @@
+---
+name: harden-llm-app-reliability
+description: Hardens LLM API calls for production with per-call timeouts and cancellation, exponential-backoff-plus-full-jitter retries on 429/500/529 that honor Retry-After, model fallback, one-round structured-output repair, refusal/stop_reason handling, and a circuit-breaker degraded mode so a flaky provider never breaks the feature.
+when_to_use: Shipping an LLM feature where provider errors, timeouts, rate limits, or refusals must not crash the UX. Distinct from optimize-llm-cost-latency (speed/spend), defend-llm-prompt-injection (security of inputs), and rate-limiting (protecting your own API from callers, not surviving a provider's limits).
+---
+## When to Use
+Reach for this skill when the failure mode you fear is **the provider**, not your code or your callers:
+- "The model call sometimes hangs / times out and the request just spins forever"
+- "We get 429s / 529s / 500s in bursts and the feature errors out"
+- "Wrap the LLM call so a bad response or refusal degrades gracefully instead of throwing"
+- "Add fallback to a cheaper/other model when the primary is down or refuses"
+- "JSON-mode output is occasionally malformed and crashes the parser"
+- "Mid-stream the connection drops and the user sees half an answer"
+NOT this skill:
+- Making calls *cheaper or faster* (model routing for cost, prompt caching, token trimming) → optimize-llm-cost-latency
+- Defending the prompt against injection / untrusted-content attacks → defend-llm-prompt-injection
+- Limiting how often *your callers* hit *your* API (token bucket, quotas, your own 429s) → rate-limiting
+- Designing the prompt + structured-output schema itself → prompt-engineering
+- Measuring output quality across prompt/model changes → llm-eval-harness
+- Offloading the whole LLM job to a durable background queue with DLQ → message-queue-jobs
+This skill is the resilience wrapper *around* one logical LLM call. It assumes the prompt is already written.
+## Steps
+1. **Wrap every call in a timeout + cancellation token. No naked `await`.** A hung socket must die on a deadline you own, not the SDK default (often 600s+). Two clocks: a per-attempt timeout (the request) and a total deadline (all retries combined). Stream long calls so the per-attempt timeout measures *time-to-first-byte*, not total generation.
+   ```ts
+   const TOTAL_DEADLINE_MS = 30_000;   // whole operation, retries included
+   const PER_ATTEMPT_MS    = 12_000;   // one HTTP attempt (TTFB for streams)
+   // remainingMs = total deadline left for this attempt (computed by the caller, step 2)
+   async function callWithDeadline(fn, remainingMs) {
+     const ctrl = new AbortController();
+     const budget = Math.max(0, Math.min(PER_ATTEMPT_MS, remainingMs));
+     const t = setTimeout(() => ctrl.abort(), budget);
+     try { return await fn(ctrl.signal); }
+     finally { clearTimeout(t); }
+   }
+   ```
+   Pass `signal` into the SDK (`client.messages.create({...}, { signal })`). On the user side, wire the inbound request's abort signal through so that a closed browser tab cancels the upstream call instead of burning tokens.
+2. **Retry only what's retryable, with exponential backoff + full jitter, and honor `Retry-After`.** Classify the error before you retry — retrying a 400 is just slower failure.
+   | Status / condition | Retry? | Wait |
+   |---|---|---|
+   | 429 rate-limited | Yes | `Retry-After` header if present, else backoff |
+   | 529 overloaded (Anthropic) / 503 | Yes | backoff + jitter |
+   | 500 / 502 / 504 / gateway | Yes | backoff + jitter |
+   | Network reset / timeout / ECONNRESET | Yes | backoff + jitter |
+   | 408 request timeout | Yes | backoff |
+   | 400 / 422 bad request | **No** | fix the request, not the retry |
+   | 401 / 403 auth | **No** | rotate key / fix scope |
+   | 413 too large | **No** | trim input |
+   | Refusal / `stop_reason` | **No retry — fall back** (step 4) | — |
+   Defaults: **max 4 attempts**, base 500ms, cap 8s, **full jitter** (`sleep = random(0, min(cap, base * 2**attempt))`). Full jitter beats fixed/equal backoff because synchronized clients (a 429 storm) otherwise retry in lockstep and re-stampede. Always clamp the wait to the remaining total deadline — never sleep past it.
+   ```ts
+   // Runs inside an async function; isRetryable, retryAfterMs, and sleep are helpers you supply.
+   const start = Date.now();
+   const elapsed = () => Date.now() - start;
+   for (let attempt = 0; attempt < 4; attempt++) {
+     try { return await callWithDeadline(fn, TOTAL_DEADLINE_MS - elapsed()); }
+     catch (e) {
+       if (!isRetryable(e) || attempt === 3 || elapsed() > TOTAL_DEADLINE_MS) throw e;
+       const ra = retryAfterMs(e);                       // parse header, seconds or http-date
+       const backoff = Math.random() * Math.min(8000, 500 * 2 ** attempt);
+       await sleep(Math.min(ra ?? backoff, TOTAL_DEADLINE_MS - elapsed()));
+     }
+   }
+   ```
+   LLM calls are **non-idempotent and billed** — a retry after a *partial* success double-charges. Only retry attempts that demonstrably failed before producing a usable response (connection error, non-2xx, timeout-before-first-byte). Never retry a call that already streamed a full body.
+3. **Validate structured output; repair once; then fail safe — never crash on malformed JSON.** When you asked for JSON, do not feed the raw model string straight into `JSON.parse` + a schema and let it throw to the user.
+   - Parse → validate against the schema (Zod / Pydantic / JSON Schema).
+   - On failure, **one** repair round: send the model the broken output + the validator error, ask for corrected JSON only. Strip code fences and prose first.
+   - Still invalid → return a typed safe default (e.g. `{ status: "unavailable" }`) or route to degraded mode. Log the raw output. Do **not** loop repairs (cost + latency blowup).
+   Prefer the SDK's native enforcement (tool/`tool_choice` forcing, strict JSON mode) over free-text + regex — it eliminates most repairs. Repair is the safety net, not the plan.
+4. **Fall back to another model on persistent failure or refusal.** When the primary is exhausted (retries spent, circuit open) or returns a refusal, try a fallback before giving up. Order by capability-then-availability, e.g. primary Sonnet → fallback Haiku, or cross-provider if you run multi-vendor.
+   - A **refusal** (`stop_reason: "refusal"`, or the model declining) is not a transport error — do not retry the same model; either fall back or return the refusal as a first-class result.
+   - Treat `stop_reason: "max_tokens"` as a *truncated* (not failed) result: the JSON is incomplete — repair or raise `max_tokens` and retry once, don't ship the cutoff.
+   - Cap fallback depth at 1–2 models. Record which model actually served the response.
+5. **Stream with a heartbeat; discard partials on mid-stream error.** Long generations should stream so the user sees progress and you detect stalls. Set an **inter-chunk idle timeout** (e.g. 20s with no new token → abort) — a stream can hang open without erroring. If the stream errors or aborts mid-way, **discard the accumulated partial** and either retry from scratch (step 2 rules) or degrade; never persist or render a half-message as if complete. Buffer to a scratch variable and only commit on the terminal `message_stop`.
+6. **Circuit-breaker around the provider → degraded mode.** Per-provider breaker: after N consecutive failures (e.g. 5) or a failure rate over a window, **open** the circuit and stop calling for a cooldown (e.g. 30s), then **half-open** one probe. While open, skip the doomed call and serve degraded mode immediately: a cached previous answer, a canned/templated response, or a clear "this feature is temporarily unavailable" — chosen per feature, decided *before* the incident. This stops a provider outage from turning into 30s timeouts on every request and exhausting your own connection pool.
+7. **Never lose user input on failure.** Before the call, persist the user's prompt/turn so any failure path (timeout, all-retries-exhausted, circuit open) returns a retryable state, not a black hole. The user should be able to resend with one tap, or the system auto-resumes — input is never silently dropped. For expensive multi-step agent runs, checkpoint so you resume from the failed step, not step 1.
+## Common Errors
+- **Relying on the SDK default timeout.** It's often minutes. A spike of hung sockets exhausts your connection pool and takes the whole service down. Set an explicit per-attempt timeout you own.
+- **Retrying non-retryable errors.** Looping on a 400/401/413 wastes the deadline and (for auth) can lock the key. Classify first; only retry 408/429/5xx/network.
+- **Fixed or equal-jitter backoff.** All clients that got 429'd retry at the same instant and re-stampede the provider. Use full jitter: `random(0, min(cap, base·2^n))`.
+- **Ignoring `Retry-After`.** The provider told you exactly when to come back; backoff math that retries sooner just earns another 429. Parse the header (seconds *or* HTTP-date) and prefer it.
+- **Retrying a partially-streamed call.** It already cost tokens and may have half-applied a side effect; the retry double-charges and can double-act. Only retry failures that occurred before a usable response.
+- **`JSON.parse` straight onto the response.** One malformed token throws an unhandled exception to the user. Always validate, repair once, then fail to a typed default.
+- **Infinite repair loop.** Re-asking the model until JSON is valid can run forever and 10x the bill. Exactly one repair round, then degrade.
+- **Treating a refusal as a 5xx.** Retrying the identical prompt on the same model just refuses again. Fall back or surface it; don't burn retries.
+- **Shipping a `max_tokens` cutoff as complete.** Truncated JSON silently corrupts downstream. Check `stop_reason`; repair or re-call with higher limit.
+- **Rendering the mid-stream partial.** A dropped stream leaves a half-answer the user reads as final. Buffer and only commit on `message_stop`; discard on error.
+- **No circuit breaker.** During a provider outage every request pays the full timeout × retries before failing — your latency and pool collapse. Trip the breaker and serve degraded mode fast.
+- **Dropping user input on the failure path.** The user retypes everything. Persist the turn before the call; make every failure resumable.
+- **Sharing one breaker/timeout budget across unrelated features.** A flaky batch job opens the circuit for your latency-critical chat path. Scope breakers per provider+route.
+## Verify
+Prove resilience with **fault injection**, not hope. Force each failure and assert the wrapper holds — don't wait for prod to hit them.
+1. **Forced 429 storm:** Stub the client to return `429` with `Retry-After: 2` for the first 3 calls, then `200`. Assert: exactly 4 attempts, waits honor `Retry-After` (≈2s, not the backoff curve), final result returned, total stays under the deadline.
+2. **Forced timeout:** Stub a response slower than `PER_ATTEMPT_MS`. Assert: the attempt aborts at the deadline (not the SDK default), the `AbortController` fired, and either a retry or a clean degraded response — never a hang.
+3. **Non-retryable:** Stub a `400`. Assert: **zero** retries, immediate failure, deadline barely consumed.
+4. **Malformed JSON:** Stub output that fails the schema, then valid on the repair call. Assert: exactly one repair round, valid object returned. Then stub it invalid twice → assert the typed safe default, no thrown exception.
+5. **Refusal / cutoff:** Stub `stop_reason: "refusal"` → assert fallback model is tried (no same-model retry). Stub `stop_reason: "max_tokens"` → assert truncation is detected, not shipped as complete.
+6. **Mid-stream drop:** Start a stream, kill the connection after 2 chunks. Assert: the partial is discarded (not rendered/persisted), and retry-or-degrade fires.
+7. **Circuit breaker:** Force N consecutive failures → assert the circuit opens, subsequent calls return degraded mode **immediately** (no timeout wait), and after the cooldown a half-open probe closes the circuit on recovery.
+8. **Input preservation:** Trigger total failure → assert the user's input is still retrievable/resumable, returned as retryable state, never silently lost.
+9. **Idempotency/billing:** Assert a fully-streamed-then-errored response is **not** retried (no double charge).
+Done = fault-injection tests 1–9 pass, every LLM call has an explicit per-attempt timeout + total deadline, retries use full-jitter backoff that honors `Retry-After` and never fires on non-retryable or already-served calls, malformed/refused/truncated output degrades to a typed safe path instead of throwing, the circuit breaker serves degraded mode under a forced outage without paying timeouts, and no failure path loses user input.
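
Note on step 6 of the skill above: the per-provider circuit breaker is described only in prose. A minimal TypeScript sketch of the closed → open → half-open cycle might look like the following. The `CircuitBreaker` class, its constructor parameters, and the injectable `now` clock are illustrative assumptions, not code from the package, and the sketch does not limit how many concurrent probes run while half-open.

```typescript
// Hypothetical sketch of the breaker from step 6; names are illustrative only.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0; // consecutive failures since the last success
  private openedAt = 0; // clock reading when the circuit last opened
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    threshold = 5,                              // consecutive failures before opening
    cooldownMs = 30_000,                        // how long to stay open
    now: () => number = () => Date.now(),       // injectable clock for tests
  ) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    // Once the cooldown has elapsed, let the next call through as a probe.
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open";
    }
    return this.state;
  }

  async run<T>(call: () => Promise<T>, degraded: () => T): Promise<T> {
    // While open, skip the doomed call and serve degraded mode immediately.
    if (this.getState() === "open") return degraded();
    try {
      const result = await call();
      this.failures = 0;
      this.state = "closed"; // a successful probe (or normal call) closes the circuit
      return result;
    } catch {
      this.failures += 1;
      // A failed probe re-opens at once; otherwise open at the threshold.
      if (this.state === "half-open" || this.failures >= this.threshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      return degraded();
    }
  }
}
```

In use, one breaker would typically be kept per provider and route, for example `await chatBreaker.run(() => callModel(prompt), () => ({ status: "unavailable" }))`, where `callModel` stands in for the retry-wrapped LLM call from steps 1 and 2.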

package/skills/i18n-localization-setup/SKILL.md ADDED

@@ -0,0 +1,113 @@
+---
+name: i18n-localization-setup
+description: Externalizes user-facing text into message catalogs keyed by stable IDs and wires locale-correct rendering — ICU MessageFormat plurals/gender/select, named-placeholder interpolation, Intl/CLDR number/date/list/relative-time formatting, RTL/bidi via logical CSS, and an extract→translate→compile pipeline with pseudo-localization.
+when_to_use: Making a product support multiple languages/locales, or auditing existing i18n — hardcoded UI strings, sentence concatenation, English-only `if(n===1)` plurals, missing RTL, locale-blind number/date formatting, or wiring i18next/react-intl/gettext/Rails i18n/Fluent. Distinct from style-responsive-tailwind (visual layout) and audit-accessibility-wcag (a11y conformance — i18n only owns translatable a11y *attribute text*).
+---
+## When to Use
+Reach for this when text must render correctly in **more than one language/locale**, not just look right:
+- "Add Spanish/Arabic/Japanese — what's the right way to externalize strings?"
+- "Our plurals break in Polish/Russian" or "we do `count === 1 ? 'item' : 'items'` everywhere"
+- "Dates show as `6/15/2026` for everyone" / "numbers still use `,` for thousands in `de-DE`, where the separator should be `.`"
+- "Arabic/Hebrew layout is broken — everything's still left-to-right"
+- "Translators can't reorder words — we concatenate `'Deleted ' + n + ' files'`"
+- "Set up the extraction pipeline: extract → PO/XLIFF/JSON → translate → compile" + catch missing keys before ship
+- Auditing an app that's "already i18n'd" for the traps below
+NOT this skill:
+- Visual responsive layout, breakpoints, container sizing → **style-responsive-tailwind** (i18n owns *logical* CSS props + `dir`, not the design system)
+- WCAG conformance, screen-reader semantics, contrast → **audit-accessibility-wcag** (i18n only owns making `aria-label`/`alt`/`title` *translatable*)
+- `hreflang`, localized URLs, sitemap per-locale, canonical → **audit-technical-seo**
+- Validating/parsing user-entered locale data (phone, postal) → **build-form-validation**
+- Wrapping a single component's copy as you build it → **build-react-component** (use this skill when standing up the *catalog system*)
+- UTC storage, DST, IANA conversion math behind a displayed timestamp → **datetime-timezone-correctness** (i18n only *formats* the instant per locale; it doesn't compute it)
+## Steps
+1. **Externalize every user-facing string into a catalog keyed by a stable ID — kill concatenation.** A string is translatable if a human ever reads it: labels, buttons, errors, emails, `alt`/`aria-label`/`title`/`placeholder`, `<title>`, push/toast text. Key by **semantic ID**, never by English source (English changes → key shouldn't). Co-locate by feature: `checkout.cart.empty`, not `string_447`.
+   ```jsonc
+   // en.json — one message = one full sentence with named placeholders
+   { "checkout.items": "{count, plural, one {# item} other {# items}}",
+     "profile.greeting": "Welcome back, {name}!" }
+   ```
+   Never build sentences from fragments. `t('deleted') + ' ' + n + ' ' + t('files')` is **untranslatable** — word order, plural agreement, and gender all vary by language. One key = one whole sentence.
+2. **Pluralize with ICU MessageFormat / CLDR categories — never `if (n === 1)`.** English has 2 forms; Arabic has **6** (zero/one/two/few/many/other), Polish/Russian have 4. Provide every category the *target* locale's CLDR rules require; `other` is the mandatory fallback. Same mechanism for gender/choice via `select`. Use `#` for the count (auto-formatted per locale), not `{count}` re-interpolated.
+   | Need | ICU construct | Anti-pattern it replaces |
+   |---|---|---|
+   | Count agreement | `{n, plural, one {…} few {…} many {…} other {…}}` | `n === 1 ? 'x' : 'xs'` |
+   | Ordinals (1st/2nd) | `{n, selectordinal, one {#st} two {#nd} few {#rd} other {#th}}` | string-suffix hacks |
+   | Gender / enum | `{g, select, female {…} male {…} other {…}}` | branching in code, concatenating |
+   | Money/percent inside text | `{amt, number, ::currency/EUR}` | manual `$` + `toFixed(2)` |
+   Translators supply the categories *their* language needs — don't hardcode the English set into the message shape.
+3. **Interpolate with named placeholders so translators can reorder.** `"{count} {unit} remaining"` lets a translator emit `"quedan {count} {unit}"`. Positional `{0}`/`%s`/`printf` ordering is fixed and breaks under reordering — use named only. Pass an explicit values object: `t('checkout.items', { count })`. Escape literal braces per ICU (`'{'`). Auto-escape interpolated values for the sink (HTML) to avoid injection.
+4. **Format numbers/dates/lists/units via `Intl` (CLDR) in the user's locale — never roll your own.** Locale decides separators (`1,234.5` vs `1.234,5`), date order, AM/PM vs 24h, RTL digit shaping, list conjunctions. Always pass the resolved locale explicitly; relying on the host default is non-deterministic.
+   ```js
+   // locale = resolved BCP-47 tag, d = a Date; each comment names the locale that yields the sample output
+   new Intl.NumberFormat(locale, { style: 'currency', currency: 'JPY' }).format(1234)   // en-US: "¥1,234"
+   new Intl.DateTimeFormat(locale, { dateStyle: 'long' }).format(d)                       // es: "15 de junio de 2026"
+   new Intl.RelativeTimeFormat(locale, { numeric: 'auto' }).format(-1, 'day')             // en: "yesterday"
+   new Intl.ListFormat(locale, { type: 'conjunction' }).format(['a','b','c'])             // en: "a, b, and c"
+   ```
+   `currency` is data, not locale — `de` user paying USD shows `1.234,00 $`. Never store formatted strings; format at render time from raw numbers + ISO/epoch timestamps. Default time storage to **UTC**, convert to the user's IANA timezone for display (the conversion/DST math itself lives in **datetime-timezone-correctness**).
+5. **Make layout direction-agnostic: logical CSS + `dir` + bidi isolation.** Set `<html dir="rtl" lang="ar">` from the locale (RTL set: ar, he, fa, ur). Replace physical properties with logical ones so one stylesheet serves both directions:
+   | Physical (breaks RTL) | Logical (correct both) |
+   |---|---|
+   | `margin-left` / `padding-right` | `margin-inline-start` / `padding-inline-end` |
+   | `left` / `right` | `inset-inline-start` / `inset-inline-end` |
+   | `text-align: left` | `text-align: start` |
+   | `border-left` | `border-inline-start` |
+   Wrap user/dynamic content of unknown direction in `<bdi>` or `unicode-bidi: isolate` so an Arabic username doesn't scramble surrounding LTR punctuation. Mirror directional icons (back/forward arrows) via `[dir=rtl]` or transform; don't mirror logos.
+6. **Stand up the pipeline: extract → catalog → translate → compile, gated by pseudo-loc + missing-key detection.** Source code is the single source of truth for *keys*; translators own values. Pick the format by toolchain:
+   | Format | Use with | Plurals |
+   |---|---|---|
+   | **JSON / ICU** | i18next, react-intl/FormatJS | native ICU |
+   | **PO/POT** (gettext) | Rails (`gettext`), Python, PHP, C | `nplurals` header |
+   | **XLIFF** | Angular, enterprise TMS handoff | ICU or `<plural>` |
+   | **FTL** (Fluent) | Mozilla stack, attribute-rich UI | built-in selectors |
+   Pipeline: (1) `extract` keys from source (`i18next-parser`, `formatjs extract`, `xgettext`) → POT/template; (2) merge into per-locale catalogs without dropping existing translations; (3) translate / push to TMS; (4) `compile` to runtime bundles (`formatjs compile`, `msgfmt`). Generate a **pseudo-locale** (`en-XA`: `[!!! Ŝéàŕçĥ ~~~]`) — accent + ~40% length padding + bracket markers — to surface hardcoded strings, truncation, and concatenation in CI before any human translates. Fail the build on missing keys / unknown ICU vars.
+7. **Negotiate locale, fall back, and support runtime switching.** Resolve in priority order: explicit user setting → URL/cookie → `Accept-Language` → app default. Match with BCP-47 lookup (`fr-CA` → `fr` → default); never 404 on an unsupported region — fall back to base language, then to source locale. Lazy-load the active locale's bundle (don't ship all 30); switching locale re-renders messages **and** updates `lang`/`dir` on `<html>`. Use a real BCP-47 matcher — `@formatjs/intl-localematcher` (`match()`) or `accept-language-parser` for the header, canonicalized via `Intl.getCanonicalLocales` — never naive string equality (there is no `Intl.LocaleMatcher` global; locale matching is the `localeMatcher` *option* on `Intl` constructors or a library).
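The resolution order and fallback chain in step 7 can be sketched in a few lines. This is only an illustration of the RFC 4647 "lookup" idea (truncate subtags right to left until something supported matches); the function names `parse_accept_language`, `lookup`, and `resolve_locale` are hypothetical, the tags are not canonicalized, and a production app would still delegate to a real matcher library as the step recommends.

```python
from typing import Iterable, List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return the tags of an Accept-Language header, highest q-value first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        tag, _, params = part.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original header position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def lookup(requested: Iterable[str], supported: Iterable[str], default: str) -> str:
    """Lookup-style matching: fr-CA -> fr -> default, compared case-insensitively."""
    by_lower = {tag.lower(): tag for tag in supported}
    for tag in requested:
        subtags = tag.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # A dangling single-letter subtag (such as "x" or "u") is dropped too.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


def resolve_locale(
    user_setting: Optional[str],
    url_or_cookie: Optional[str],
    accept_language: Optional[str],
    supported: Iterable[str],
    default: str,
) -> str:
    """Priority: explicit user setting, then URL/cookie, then header, then default."""
    requested = [tag for tag in (user_setting, url_or_cookie) if tag]
    requested += parse_accept_language(accept_language or "")
    return lookup(requested, list(supported), default)
```

For example, `resolve_locale(None, None, "fr-CA,fr;q=0.9,en;q=0.8", ["en", "fr", "de"], "en")` resolves to `fr`, and an unsupported request falls through to the default instead of raising.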
+## Common Errors
+- **`count === 1 ? x : xs`.** Breaks every language with ≠2 plural forms (Arabic, Polish, Russian, Welsh). Use ICU `plural` with CLDR categories.
+- **Sentence concatenation** (`t('sent') + name + t('a_msg')`). Word order/agreement/gender vary; translators can't fix it. One key = one full sentence with named placeholders.
+- **Keying by English source text.** Editing the copy silently orphans the translation. Key by stable semantic ID.
+- **Hand-formatted numbers/dates** (`'$' + n.toFixed(2)`, `MM/DD/YYYY`). Wrong separators/order/currency per locale. Use `Intl.NumberFormat`/`DateTimeFormat` with an explicit locale.
+- **Conflating locale with currency/timezone.** A `de` user can pay in USD in `America/New_York`. Format with the user's *locale* but the transaction's *currency* and the event's *timezone*; store UTC + ISO currency code.
+- **Physical CSS** (`margin-left`, `float: right`). Layout breaks in RTL. Use logical properties + `dir`.
+- **No bidi isolation.** An RTL name/number injected into LTR text reorders adjacent punctuation/brackets. Wrap unknown-direction content in `<bdi>`/`unicode-bidi: isolate`.
+- **Forgetting non-`textContent` text.** `alt`, `aria-label`, `title`, `placeholder`, `<title>`, email subjects, validation messages are all translatable — and untranslated `aria-label` regresses a11y.
+- **No length budget.** German/Finnish run ~35% longer than English; pseudo-loc padding exposes truncation/overflow before translators do.
+- **Locale-blind sort/case.** JS `.sort()` compares UTF-16 code units by default (`Ä` after `Z`); Turkish `i`↔`İ`/`ı` breaks `toUpperCase()`. Use `Intl.Collator(locale)` for sorting and `toLocaleUpperCase(locale)` for case.
+- **Inventing `Intl.LocaleMatcher`.** No such global exists — locale matching is the `localeMatcher` option on `Intl` constructors or a library (`@formatjs/intl-localematcher`). Don't string-compare BCP-47 tags.
+- **Shipping all locales eagerly / hard 404 on unknown region.** Lazy-load active locale; fall back `region → language → source`, never error.
+- **Pluralizing the count with `#` but re-interpolating `{count}` raw.** `#` is locale-formatted (`1,000`); a separate `{count}` isn't. Use `#` inside `plural`.
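Several of the errors above (hardcoded strings, concatenation, no length budget) are the ones the pseudo-locale from step 6 is meant to expose. A minimal sketch of such a generator follows, assuming flat messages with simple `{name}` placeholders only; nested ICU `plural`/`select` sub-messages would need a real parser, and the accent map and the `pseudo_localize` name are illustrative choices rather than the output of any specific tool.

```python
import math
import re

# Hypothetical accent map; any visually distinct substitution works.
ACCENTS = str.maketrans({
    "a": "à", "c": "ç", "e": "é", "i": "î", "n": "ñ",
    "o": "ô", "r": "ŕ", "s": "š", "u": "û", "y": "ý",
    "A": "À", "C": "Ç", "E": "É", "I": "Î", "N": "Ñ",
    "O": "Ô", "R": "Ŕ", "S": "Š", "U": "Û", "Y": "Ý",
})

# Simple {name} placeholders must survive untouched so interpolation still works.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_localize(message: str, expansion: float = 0.4) -> str:
    """Accent the text, pad it by roughly `expansion`, and wrap it in brackets."""
    parts = PLACEHOLDER.split(message)
    accented = "".join(
        part if PLACEHOLDER.fullmatch(part) else part.translate(ACCENTS)
        for part in parts
    )
    padding = "~" * max(1, math.ceil(len(message) * expansion))
    return f"[!!! {accented} {padding}]"
```

Any string that shows up in the UI without brackets and accents did not come from the catalog, and any string whose closing bracket is clipped has a truncation problem.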
+## Verify
+1. **No hardcoded strings:** lint (`eslint-plugin-formatjs`, `i18next` no-literal rule) reports zero user-facing literals outside the catalog.
+2. **Pseudo-loc pass:** run UI in `en-XA` — every visible string is accented+bracketed (no bare English = no missed key), nothing truncates or overflows, no concatenated fragments appear.
+3. **Plural matrix:** render the count message at `n = 0,1,2,5,11,100` in `en`, `pl` (4 forms), and `ar` (6 forms); each picks the CLDR-correct category. `if(n===1)` cannot pass this.
+4. **Reordering:** a target locale that reverses placeholder order renders correctly (proves named, not positional, interpolation).
+5. **Formatting:** the same number/date/currency/list renders per-locale separators/order (`1,234.5`↔`1.234,5`, `06/15`↔`15/06`, currency symbol placement) — assert against `Intl` golden strings.
+6. **RTL:** load `ar`/`he` → `<html dir="rtl">`, layout mirrors via logical props, directional icons flip, bidi-isolated names don't scramble punctuation.
+7. **Missing-key gate:** delete a key from a non-source catalog → CI fails (or falls back to source) — it must never render a raw key like `checkout.items` to a user.
+8. **Negotiation:** `Accept-Language: fr-CA` with only `fr` available resolves to `fr` (not default/404) via a real matcher; switching locale at runtime updates messages **and** `lang`/`dir`.
+9. **Sort/case:** a localized list sorts via `Intl.Collator(locale)` (e.g. Swedish `å/ä/ö` last); Turkish case round-trips with `toLocaleUpperCase('tr')`.
+Done = zero hardcoded user-facing strings, pseudo-loc clean, the plural matrix passes for a 4-form and a 6-form locale, RTL renders with logical CSS + bidi isolation, all formatting goes through `Intl` with explicit locale, locale negotiation uses a real BCP-47 matcher, and CI fails on any missing key or unknown ICU variable.
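The missing-key gate from check 7 can be approximated with a small catalog comparison run in CI. The sketch below assumes flat `key -> message` dictionaries and only understands simple `{name}` placeholders; the `check_catalog` helper is hypothetical, and messages that use ICU `plural`/`select` need a real ICU parser to validate their variables.

```python
import re
from typing import Dict, List, Set

SIMPLE_PLACEHOLDER = re.compile(r"\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}")


def placeholders(message: str) -> Set[str]:
    """Names of the simple {name} placeholders used in a message."""
    return set(SIMPLE_PLACEHOLDER.findall(message))


def check_catalog(source: Dict[str, str], target: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the gate passes."""
    problems = []
    for key, source_message in sorted(source.items()):
        if key not in target:
            problems.append(f"missing key: {key}")
            continue
        unknown = placeholders(target[key]) - placeholders(source_message)
        for name in sorted(unknown):
            problems.append(f"unknown placeholder {{{name}}} in {key}")
    for key in sorted(set(target) - set(source)):
        problems.append(f"orphaned key: {key}")
    return problems
```

A CI step would load the source and target catalogs, call `check_catalog`, print the problems, and exit non-zero when the list is not empty.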

package/skills/idempotency-keys/SKILL.md ADDED

@@ -0,0 +1,107 @@
+---
+name: idempotency-keys
+description: Makes operations safe to repeat so retries and at-least-once delivery don't double-charge or double-create — idempotency by design first (PUT/upsert, conditional writes with version/ETag/If-Match, natural deterministic keys, set-don't-increment) and by key second (client Idempotency-Key header, a dedup table keyed unique on the key that stores request fingerprint + status + response and replays the SAME response, 409 in-progress lock for concurrent duplicates, 422 on key-reuse-with-different-body), plus consumer-side dedup (processed-event-id store / dedup window), the outbox pattern for atomic write+publish, and DB mechanics (ON CONFLICT, SELECT FOR UPDATE / advisory locks). Effectively-once via dedup, because exactly-once delivery is a myth.
+when_to_use: An operation can run more than once and must not have double effects — a POST that creates/charges behind a client/proxy/SDK retry, an at-least-once queue or webhook consumer that may redeliver, a job that may run twice, or you're adding an Idempotency-Key header or a dedup table. Distinct from resilience-timeouts-retries (decides WHEN/how to retry; this skill makes the target safe to retry into) and deliver-webhooks (the sender side — at-least-once delivery + signed retries; this skill is what makes the receiver safe under that redelivery).
+---
+## When to Use
+Reach for this skill when the same operation may execute more than once and a second execution must NOT produce a second effect:
+- "A POST timed out, the client retried, and we charged/created twice"
+- "Our SDK/proxy/load balancer retries — make the create idempotent"
+- "Add an Idempotency-Key header so replays return the original response"
+- "The queue/webhook is at-least-once; the consumer ran the same event twice"
+- "Make this job safe to run twice" / "dedup redelivered events"
+- "Atomically write a row AND publish an event without dual-write loss" (outbox)
+NOT this skill:
+- Deciding the retry policy itself — backoff, jitter, retry budget, circuit breaker, which errors are retryable → resilience-timeouts-retries (it generates the duplicate calls; this skill absorbs them safely)
+- The webhook *sender*: at-least-once dispatch, signing, retry schedule, DLQ for failed deliveries → deliver-webhooks
+- The webhook *receiver's* signature/replay-window verification (HMAC over raw body, timestamp window) → ingest-webhook-secure (this skill is the dedup-on-event-id half it hands off to)
+- Building the queue/worker, DLQ, poison-message handling → message-queue-jobs (this skill specifies the *idempotent consumer* it needs)
+- Idempotent PSP charges + subscription/proration/ledger reconciliation → payments-billing-integration (it owns billing state and calls this skill's key pattern for money-mutating calls)
+- The rounding/allocation math of the amounts → money-decimal-arithmetic
+## Steps
+1. **Make it idempotent BY DESIGN before reaching for a key — that's cheaper and self-healing.** A surprising amount of "double effect" disappears if the operation is naturally repeatable:
+   | Technique | How | Why it's idempotent |
+   |---|---|---|
+   | **PUT / upsert** to a client-chosen id | `PUT /orders/{client_uuid}` → `INSERT ... ON CONFLICT (id) DO NOTHING/UPDATE` | second call hits the same row, no new row |
+   | **Conditional write** (optimistic concurrency) | `If-Match: <etag>` / `WHERE version = N` → bump version | stale retry's precondition fails → no double-apply |
+   | **Natural / deterministic key** | derive id from stable inputs (`hash(order_id+sku)`, not `uuid()`) | same inputs → same id → conflict, not insert |
+   | **Set, don't increment** | `balance = 100` not `balance += 10`; `status = 'paid'` | reapplying the same set is a no-op |
+   | **DELETE / "ensure absent"** | delete-by-id, "cancel if active" | already-gone is success, not error |
+   Increments, "append a row", and server-generated ids on POST are the *non*-idempotent shapes that force you to step 2.
+2. **For non-idempotent POSTs, use a client-supplied Idempotency-Key.** The client (not the server, not per-retry) generates ONE key for a logical operation and sends it on the original request AND every retry — header `Idempotency-Key: <opaque-uuid>`. The key must be **stable across retries and unique per operation**: generate it once before the first send, store it with the in-flight request, reuse it on retry. This is the Stripe model and the reference semantics to copy.
+3. **Persist the key with a dedup table — fingerprint, status, and the stored response.** One row per key:
+   ```sql
+   CREATE TABLE idempotency_keys (
+     id_key        text NOT NULL,
+     scope         text NOT NULL,            -- e.g. (user_id || ':' || endpoint)
+     request_hash  text NOT NULL,            -- SHA-256 of canonical request body+route
+     status        text NOT NULL,            -- 'in_progress' | 'completed'
+     response_code int,
+     response_body jsonb,
+     created_at    timestamptz NOT NULL DEFAULT now(),
+     expires_at    timestamptz NOT NULL,     -- TTL: now() + 24h..7d
+     PRIMARY KEY (scope, id_key)              -- UNIQUE on (scope, key)
+   );
+   ```
+   Scope the key per `(user/tenant, endpoint)` so one client's key can't collide with or replay another's. Retention 24h–7d (Stripe = 24h); a TTL/cron purges expired rows so storage is bounded.
+4. **Server flow — claim, execute, store, replay — atomically.** On each request:
+   1. Compute `request_hash` from the canonical (sorted/normalized) body + method + route.
+   2. **Atomically claim** the key: `INSERT (scope, id_key, request_hash, status='in_progress') ON CONFLICT (scope, id_key) DO NOTHING`. The insert *is* the lock.
+   3. **If the insert won** (0 conflicts): run the real operation, then `UPDATE ... SET status='completed', response_code, response_body` and return the response.
+   4. **If it conflicted**, read the existing row:
+      - `status='completed'` **and** `request_hash` matches → return the **stored** `response_code`/`response_body` verbatim (the replay path — same result, no re-execution).
+      - `status='in_progress'` → a concurrent duplicate is still running → return **409 Conflict** (or `425`-style "in progress"); the client should retry-after, not re-execute.
+      - `request_hash` **differs** (same key, different body) → return **422 Unprocessable Entity** — the key was reused for a different operation; never run it.
+5. **Hold a lock for the in-flight window so concurrent duplicates don't both execute.** The `INSERT ... ON CONFLICT DO NOTHING` claim handles most of it, but if you read-then-write, take a row lock: `SELECT ... FOR UPDATE` on the key row, or a Postgres advisory lock `pg_advisory_xact_lock(hashtext(scope||id_key))` around the whole claim+execute. Without this, two parallel retries can both see "no row" and both run. Wrap claim + business write + result-store in **one transaction** (or make the business write itself idempotent via step 1) so a crash between execute and store doesn't lose the recorded response.
+6. **Consumer-side dedup for at-least-once queues and webhooks — "exactly-once delivery" is a myth; you get effectively-once.** Brokers (SQS, Kafka, RabbitMQ) and webhook senders redeliver on ack timeout, so the *consumer* must be idempotent. Two patterns:
+   - **Processed-event-id store:** a `processed_events(event_id PRIMARY KEY, processed_at)` table. Before handling, `INSERT ... ON CONFLICT DO NOTHING`; if 0 rows inserted, it's a duplicate → ack and skip. Dedup on the **provider's stable event id** (not your own per-receipt uuid). Pairs with ingest-webhook-secure / message-queue-jobs.
+   - **Dedup window:** a bounded TTL set (Redis `SET <key> <value> NX EX <window-seconds>`) when full history is too large; only safe if redelivery is bounded within the window.
+   Best of all: make the *handler* naturally idempotent (step 1: upsert by event-derived key, set-don't-increment) so even a missed dedup is harmless.
+7. **Atomic write + publish → outbox pattern (no dual-write).** Writing the DB row and publishing the event as two separate calls can crash between them (row saved, event lost — or vice versa). Instead, in **one transaction** write the business row AND an `outbox` row; a separate relay polls/CDC-tails the outbox and publishes (at-least-once → consumers dedup per step 6). The transaction guarantees the event is recorded iff the state changed.
+8. **DB mechanics cheat-sheet.**
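+   - Postgres/SQLite: `INSERT ... ON CONFLICT (cols) DO NOTHING` (claim/dedup) or `DO UPDATE SET ...` (upsert). MySQL: `INSERT ... ON DUPLICATE KEY UPDATE`. The **UNIQUE constraint/index is what makes it safe**: Postgres and SQLite reject `ON CONFLICT (cols)` when no unique index matches `cols`, while a target-less `ON CONFLICT DO NOTHING` or MySQL's `ON DUPLICATE KEY UPDATE` on a table with no unique key inserts every row and silently doesn't dedup.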
+   - Serialize the in-flight window with `SELECT ... FOR UPDATE` (row) or `pg_advisory_xact_lock` (cross-row/logical) — released at transaction end.
+   - Check the *result* of the upsert (rows affected, or in Postgres `RETURNING (xmax = 0)` on a `DO UPDATE`) to know whether you inserted or hit an existing row.
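The claim, execute, store, replay flow from steps 3 to 5 can be exercised end to end with SQLite standing in for the shared store. This is a sketch under stated assumptions: the `handle`, `request_hash`, and `create_charge` names and the `charges` table are hypothetical, `scope`/`expires_at` handling is trimmed, and SQLite 3.24+ is assumed for `ON CONFLICT`. Because the claim and the business write share one transaction here, a concurrent duplicate on another connection would wait on the write lock rather than observe `in_progress`; the 409 branch is reached only when a claim was committed on its own.

```python
import hashlib
import json
import sqlite3
from typing import Callable, Tuple

SCHEMA = """
CREATE TABLE idempotency_keys (
  scope         TEXT NOT NULL,
  id_key        TEXT NOT NULL,
  request_hash  TEXT NOT NULL,
  status        TEXT NOT NULL,          -- 'in_progress' | 'completed'
  response_code INTEGER,
  response_body TEXT,
  PRIMARY KEY (scope, id_key)
);
CREATE TABLE charges (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL);
"""

Response = Tuple[int, dict]


def request_hash(method: str, route: str, body: dict) -> str:
    """SHA-256 over method, route, and the key-sorted JSON body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{method} {route}\n{canonical}".encode()).hexdigest()


def create_charge(conn: sqlite3.Connection, body: dict) -> Response:
    """Hypothetical non-idempotent business operation: appends a row."""
    cursor = conn.execute("INSERT INTO charges (amount) VALUES (?)", (body["amount"],))
    return 201, {"charge_id": cursor.lastrowid, "amount": body["amount"]}


def handle(conn: sqlite3.Connection, scope: str, id_key: str, method: str, route: str,
           body: dict, operation: Callable[[sqlite3.Connection, dict], Response]) -> Response:
    fingerprint = request_hash(method, route, body)
    with conn:  # one transaction: claim + business write + stored response
        claim = conn.execute(
            "INSERT INTO idempotency_keys (scope, id_key, request_hash, status) "
            "VALUES (?, ?, ?, 'in_progress') ON CONFLICT (scope, id_key) DO NOTHING",
            (scope, id_key, fingerprint),
        )
        if claim.rowcount == 1:  # the insert won, so this request owns the key
            code, response = operation(conn, body)
            conn.execute(
                "UPDATE idempotency_keys SET status = 'completed', "
                "response_code = ?, response_body = ? WHERE scope = ? AND id_key = ?",
                (code, json.dumps(response), scope, id_key),
            )
            return code, response
        stored_hash, status, code, stored_body = conn.execute(
            "SELECT request_hash, status, response_code, response_body "
            "FROM idempotency_keys WHERE scope = ? AND id_key = ?",
            (scope, id_key),
        ).fetchone()
    if stored_hash != fingerprint:
        return 422, {"error": "idempotency key reused with a different request"}
    if status == "in_progress":
        return 409, {"error": "request with this key is still in progress"}
    return code, json.loads(stored_body)  # replay the stored response verbatim
```

If `operation` raises, the `with conn:` block rolls back the claim together with the business write, so a later retry with the same key starts clean.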
+## Common Errors
+- **Generating the key per-retry (`uuid()` / `now()` inside the retry loop).** Every attempt gets a fresh key → zero dedup → still double-charges. Fix: generate ONCE before the first send; reuse the identical key on every retry.
+- **No request-fingerprint check.** Same key replayed with a *different* body silently returns the old response (or runs the new op). Fix: store `request_hash`; on mismatch return 422, never execute.
+- **Racing duplicates with no lock.** Two parallel retries both `SELECT` (no row), both execute, both insert. Fix: atomic `INSERT ... ON CONFLICT DO NOTHING` as the claim, or `FOR UPDATE` / advisory lock around read-modify-write.
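+- **`ON CONFLICT` / upsert without a UNIQUE index on the key.** With no conflict target (or MySQL `ON DUPLICATE KEY UPDATE` on a table with no unique key) no conflict ever fires → no dedup, duplicate rows; with an explicit target, Postgres/SQLite reject the statement outright. Fix: enforce a unique constraint on `(scope, id_key)` (or the natural key).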
+- **Unbounded key storage.** The dedup table grows forever. Fix: `expires_at` + a purge job; pick 24h–7d retention.
+- **Treating a non-idempotent op as idempotent.** Retrying `balance += 10` or "append row" doubles the effect even *with* a key if you don't replay the stored response. Fix: replay the stored response on hit; or redesign to set-don't-increment (step 1).
+- **Recording the result in a separate step from the business write.** Crash in between → next retry re-executes a completed op. Fix: same transaction, or idempotent business write so re-execution is a no-op.
+- **Believing the broker gives exactly-once.** "Exactly-once delivery" doesn't exist over a network; redelivery happens. Fix: idempotent consumer + processed-event-id dedup = effectively-once.
+- **Dual-write (DB then publish, two calls).** A crash loses one side. Fix: outbox in the same transaction + a relay.
+- **Acking before the work is durable.** Ack-then-process loses the message on a crash. Fix: process (idempotently) and commit, *then* ack.
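The first error in this list is easiest to see from the client side. A minimal sketch, assuming a hypothetical `send(headers, body)` transport and a hypothetical `TransientError`; backoff and retry budgets are deliberately left out because they belong to the retry-policy skill.

```python
import uuid
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures that are safe to retry (timeouts, 503s)."""


def send_with_retries(
    send: Callable[[dict, dict], dict],
    body: dict,
    max_attempts: int = 3,
) -> dict:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generated ONCE, before the first attempt, and reused on every retry.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error: Optional[TransientError] = None
    for _attempt in range(max_attempts):
        try:
            return send(headers, body)
        except TransientError as error:
            last_error = error  # same key goes out again on the next attempt
    assert last_error is not None
    raise last_error
```

Moving the `uuid.uuid4()` call inside the loop would reproduce the per-retry-key bug: every attempt would look like a new operation to the server.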
+## Verify
+1. **Duplicate POST is a no-op:** send the same request with the same `Idempotency-Key` twice → exactly one effect (one charge/row) and the second response is byte-identical to the first.
+2. **Concurrent duplicates:** fire N parallel requests with the same key → exactly one executes; the rest get the stored response or `409 in-progress`, never a second effect. (This is the race test — run it against the real shared store.)
+3. **Key reuse, different body:** same key + changed payload → `422`, and no operation runs.
+4. **Per-retry-key bug guard:** confirm the client generates the key once and reuses it (grep the retry path for `uuid()`/`now()` *inside* the loop).
+5. **Consumer redelivery:** deliver the same event id to the queue/webhook consumer twice → handled once (processed-events insert conflicts on the second); effect is identical to single delivery.
+6. **By-design ops:** issue the same `PUT`/upsert / conditional write twice → one row, version advances once; a stale `If-Match` retry is rejected, not double-applied.
+7. **Outbox atomicity:** kill the process between the business write and publish → on restart the relay still publishes (event recorded iff state changed); no orphan event, no lost event.
+8. **Retention bounded:** expired keys are purged; an old key past TTL behaves as a fresh request (documented), and the table doesn't grow without bound.
+Done = duplicate and concurrent requests produce exactly one effect with an identical replayed response, same-key/different-body returns 422, the in-flight window is locked, consumers dedup at-least-once delivery on stable event ids, write+publish is atomic via the outbox, and key storage is TTL-bounded, all proven by checks 1–8.
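Check 5 (consumer redelivery) can be turned into a small executable test. The sketch below uses SQLite in place of the real store and deliberately applies a non-idempotent increment so the dedup table is the only thing preventing a double effect; the `accounts` table and the `handle_event` name are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE processed_events (
  event_id     TEXT PRIMARY KEY,
  processed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
INSERT INTO accounts (id, balance) VALUES (1, 0);
"""


def handle_event(conn: sqlite3.Connection, event_id: str, amount: int) -> bool:
    """Apply the event once. Returns False for a redelivered duplicate."""
    with conn:  # dedup insert and business write commit together
        inserted = conn.execute(
            "INSERT INTO processed_events (event_id) VALUES (?) "
            "ON CONFLICT (event_id) DO NOTHING",
            (event_id,),
        ).rowcount == 1
        if not inserted:
            return False  # duplicate: the caller should ack and skip
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = 1", (amount,)
        )
    return True
```

The caller acks the message only after `handle_event` returns, whether the result is `True` (processed and committed) or `False` (already processed).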

package/skills/implement-push-notifications/SKILL.md ADDED

@@ -0,0 +1,142 @@
+---
+name: implement-push-notifications
+description: Implements end-to-end mobile push — APNs token-auth and FCM HTTP v1 provider setup, device-token registration and rotation, alert vs silent/data payload schemas, the server send path, foreground/background/killed receipt handling, tap-to-deep-link routing, rich media via service extensions, and permission-prompt UX.
+when_to_use: Adding or debugging push on iOS/Android (native or RN/Flutter) — token registration/rotation, payload design, foreground/background/killed delivery, tap deep-linking, silent data pushes, or permission timing. Distinct from message-queue-jobs (server-side fan-out/retry) and build-native-mobile-ui (the deep-link router/navigation it taps into).
+---
+## When to Use
+Reach for this skill when the work is **getting a notification onto a device and reacting to it** — the client↔provider↔server push loop:
+- "Register the device for push and store its APNs/FCM token against the user"
+- "Token keeps changing / notifications stopped after reinstall — handle refresh"
+- "Send a push from the backend and have the tap open a specific screen"
+- "Silent/background push to sync data without showing an alert"
+- "Notification isn't showing when the app is in foreground / killed"
+- "Add an image + action buttons to the notification (rich push)"
+- "When and how should we ask for notification permission?"
+NOT this skill:
+- Server-side queueing, retry, and fan-out of the send jobs to millions of tokens → message-queue-jobs
+- Delivery-rate dashboards, open-rate funnels, alerting on send failures → observability-instrument
+- Designing the REST/GraphQL endpoint that receives the token from the client → rest-graphql-contract
+- Who the user is / signing the request that registers the token → auth-jwt-session
+- Throttling how often you send to one user → rate-limiting
+- In-app realtime state sync (WebSocket/SSE, not OS push) → manage-client-server-state
+- Building the in-app router / navigation stack the tap hands off to → build-native-mobile-ui
+- Code signing, push capability provisioning, APNs auth-key upload, TestFlight/Play distribution → ship-mobile-app-store-release
+## Steps
+1. **Pick the transport per platform: there is exactly one right answer each.** Use **APNs token-based auth (`.p8` key + JWT)** for iOS, never the legacy `.p12` cert (certs expire yearly and are per-app; one `.p8` covers all your bundle IDs). Use **FCM HTTP v1** (`https://fcm.googleapis.com/v1/projects/{id}/messages:send`, OAuth2 bearer) for Android and as a unified façade for both, never the legacy `key=` server-key API (deprecated June 2023; shutdown began in July 2024). On iOS, hand the APNs device token to the Firebase SDK so the backend stores a single FCM token format for both platforms.
+   | Concern | iOS | Android |
+   |---|---|---|
+   | Provider | APNs (direct) or FCM→APNs | FCM |
+   | Server auth | `.p8` key → ES256 JWT (`apns-topic`=bundle id) | OAuth2 SA token → FCM v1 |
+   | Token source | `application(_:didRegisterForRemoteNotificationsWithDeviceToken:)` deviceToken, or FCM token | FCM `getToken()` |
+   | Capability | Xcode **Push Notifications** + **Background Modes→Remote notifications** | none (FCM in `google-services.json`) |
+   | Silent push | `content-available:1`, **no** `alert` | `data`-only message, `priority:"high"` |
+2. **Time the permission prompt: never on first launch.** Show a pre-permission *value* screen, then call the OS prompt only on a user action ("Turn on alerts"). iOS: `UNUserNotificationCenter.current().requestAuthorization(options: [.alert, .sound, .badge])` is a one-shot prompt; if the user denies it you cannot re-prompt and must deep-link to Settings, so don't waste it. Android 13+ (API 33) requires the runtime `POST_NOTIFICATIONS` permission; target SDK 33+ and request it explicitly, or notifications are silently never shown. iOS provisional auth (`.provisional`) delivers quietly to Notification Center with no prompt, a good default for low-stakes apps.
+3. **Obtain the token, then push it to the backend — and re-push on every refresh.** The token is not stable: it rotates on reinstall, restore-to-new-device, and at the OS's discretion. Treat the refresh callback as the source of truth, not the one-time fetch at startup.
+   ```kotlin
+   // Android — fires on first token AND every rotation
+   override fun onNewToken(token: String) {
+     api.registerDevice(token, platform = "android", appVersion = BuildConfig.VERSION_NAME)
+   }
+   ```
+   ```swift
+   // iOS via Firebase — delegate fires on rotation too
+   func messaging(_ m: Messaging, didReceiveRegistrationToken token: String?) {
+     guard let token else { return }
+     Api.registerDevice(token, platform: "ios", bundle: Bundle.main.bundleIdentifier!)
+   }
+   ```
+   Send `Authorization` from the logged-in session so the token binds to the user. Re-register on **login** and **app foreground** too — a token issued while logged out must be re-bound after sign-in.
+4. **Store tokens keyed by (user, device) with an upsert — dedupe and invalidate.** A user has many devices; a device's token changes. Key the row on a stable `device_id` (vendor id / install id), not the token, and **upsert** so rotation updates in place instead of accumulating dead rows.
+   ```sql
+   CREATE TABLE device_tokens (
+     user_id    uuid    NOT NULL,
+     device_id  text    NOT NULL,          -- stable per install
+     token      text    NOT NULL,
+     platform   text    NOT NULL,          -- 'ios' | 'android'
+     updated_at timestamptz NOT NULL DEFAULT now(),
+     PRIMARY KEY (user_id, device_id)
+   );
+   CREATE UNIQUE INDEX ON device_tokens(token);   -- a token belongs to one user
+   ```
+   On send failure, the provider tells you a token is dead (see step 8) — **delete it then**, not on a guessed schedule. On logout, delete that device's row so a reassigned phone doesn't get the previous user's pushes.
+5. **Design the payload: alert vs data vs silent, and keep them distinct.** Put display fields in the platform alert block; put routing/business fields in a **custom data** block your code reads on tap. An FCM v1 unified body:
+   ```json
+   {"message": {
+     "token": "<device-token>",
+     "notification": {"title": "New reply", "body": "Pim replied to your post"},
+     "data": {"deeplink": "app://thread/8412", "type": "reply"},
+     "android": {"priority": "high", "notification": {"channel_id": "social", "image": "https://…/t.jpg"}},
+     "apns": {
+       "headers": {"apns-priority": "10", "apns-push-type": "alert", "apns-collapse-id": "thread-8412"},
+       "payload": {"aps": {"alert": {"title":"New reply","body":"Pim replied"},
+                           "sound":"default","badge":3,"mutable-content":1,"category":"REPLY"}}}
+   }}
+   ```
+   Rules: **`data` values must be strings** in FCM. **Silent push** = `content-available:1` / data-only, `apns-push-type:"background"`, `apns-priority:"5"`, **omit `alert`/`sound`/`badge`** entirely — any alert field makes it a visible push. Use **`apns-collapse-id` / FCM `collapse_key`** so a newer update replaces a stale one instead of stacking. Set `mutable-content:1` (iOS) / include `image` (Android) only when a service extension / Notifee will render rich content.
+6. **Handle receipt in all three app states — they are different code paths.** Foreground delivery does **not** show a banner unless you opt in. Cold-start-from-tap gives you the payload via a *different* entry point than a tap while running. Wire every one:
+   | State | iOS handler | Android handler |
+   |---|---|---|
+   | Foreground arrives | `userNotificationCenter(_:willPresent:)` → return `[.banner,.sound]` to show | `onMessageReceived` (data msgs) → build local notification |
+   | Background/locked tap | `didReceive response` | launcher Activity `intent.extras` |
+   | Killed → tap launches | `didFinishLaunching` `launchOptions[.remoteNotification]` | launch `Intent` extras (React Native: `getInitialNotification()`) |
+   | Silent/background data | `didReceiveRemoteNotification` (call completion handler!) | `onMessageReceived` (no notification block) |
+   On tap, read `data.deeplink` and resolve it through the app's **central router** (the same one handling universal links — owned by build-native-mobile-ui; this skill only hands the URL to it). Never inline screen logic in the notification handler — funnel to one `route(url)` so cold-start and warm-tap reach the identical destination.
+7. **Rich push needs platform-native rendering, not just an `image` URL.** iOS: add a **Notification Service Extension**; on receipt download the media in `didReceive(_:withContentHandler:)`, attach via `UNNotificationAttachment`, and call the handler within ~30s or the OS drops the attachment. Buttons: register a `UNNotificationCategory` whose `identifier` matches the payload `category`, with `UNNotificationAction`s. Android: pass `image` for a `BigPictureStyle`; add buttons with `addAction(PendingIntent)`. RN/Flutter: use **Notifee** (`@notifee/react-native` / `notifee` Flutter) — it does the channels, big-picture, actions, and full-screen intents both native SDKs require, and it's the only sane cross-platform path for actionable/rich notifications.
+8. **Verify delivery and reap dead tokens from the provider's response — don't guess.** A 200 from APNs/FCM means *accepted*, not *delivered*; you only learn a token is dead from a specific error. Delete on these, retry/backoff on those:
+   | Signal | Meaning | Action |
+   |---|---|---|
+   | APNs `410` / reason `Unregistered` | token dead (uninstall) | **delete token** |
+   | APNs `400 BadDeviceToken` / `DeviceTokenNotForTopic` | wrong env or topic | fix env (sandbox vs prod) / `apns-topic`; delete if truly invalid |
+   | FCM `UNREGISTERED` / `INVALID_ARGUMENT`(token) | dead / malformed token | **delete token** |
+   | APNs `429 TooManyRequests` / FCM `QUOTA_EXCEEDED`(429) | throttled | exponential backoff + retry |
+   | FCM `UNAVAILABLE`(503) / APNs `503` | transient | retry with `Retry-After` |
+   Match the APNs **environment** to the build: development builds (run from Xcode with a development provisioning profile) get APNs *sandbox* tokens; TestFlight, Ad Hoc, and App Store builds get *production* tokens. Sending a sandbox token to the prod gateway returns `BadDeviceToken`, the classic "works on my phone, dead in prod" bug. (The build channel and signing that decide that env are owned by ship-mobile-app-store-release; here you only route the token to the matching gateway.)
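The signal table in step 8 maps naturally onto a small classifier that the send path can call on every provider response. A sketch under stated assumptions: the `Action` enum, both function names, and the `token_at_fault` flag are hypothetical, and only the status codes and reason strings listed in the table are handled; everything else is routed to manual investigation.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                  # accepted by the provider
    DELETE_TOKEN = "delete_token"  # token is dead, remove the row
    FIX_CONFIG = "fix_config"      # wrong environment or apns-topic
    RETRY = "retry"                # throttled or transient, back off and retry
    INVESTIGATE = "investigate"    # anything the table does not cover


def classify_apns(status: int, reason: str = "") -> Action:
    if status == 200:
        return Action.NONE
    if status == 410 or reason == "Unregistered":
        return Action.DELETE_TOKEN
    if reason in ("BadDeviceToken", "DeviceTokenNotForTopic"):
        return Action.FIX_CONFIG
    if status in (429, 503):
        return Action.RETRY
    return Action.INVESTIGATE


def classify_fcm(error_code: str, token_at_fault: bool = False) -> Action:
    """`token_at_fault` is an assumption of this sketch: the caller sets it
    once it has confirmed INVALID_ARGUMENT refers to the token, not the payload."""
    if not error_code:
        return Action.NONE
    if error_code == "UNREGISTERED":
        return Action.DELETE_TOKEN
    if error_code == "INVALID_ARGUMENT":
        return Action.DELETE_TOKEN if token_at_fault else Action.INVESTIGATE
    if error_code in ("QUOTA_EXCEEDED", "UNAVAILABLE"):
        return Action.RETRY
    return Action.INVESTIGATE
```

Keeping the decision in one function means the reaper deletes rows only on an explicit dead-token signal, never on a timer.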
+## Common Errors
+- **Legacy FCM `key=AAAA…` server key.** Deprecated June 2023, shutdown began July 2024; it returns 404. Use HTTP v1 with an OAuth2 bearer from a service account.
+- **APNs sandbox vs production mismatch.** Development (Xcode) builds = sandbox; TestFlight and App Store = production; crossing them yields `BadDeviceToken`. Pick the gateway from the build channel, not a global flag.
+- **Storing only one token per user.** Overwrites the user's other devices; only the last-registered phone gets pushes. Key on `(user, device_id)`.
+- **Keying the row on the token.** Token rotates → orphan rows pile up and you spray dead tokens. Key on stable `device_id`, upsert the token.
+- **Silent push with an `alert`/`sound`/`badge` field.** It becomes a *visible* push and the OS may also throttle your background budget. Background pushes carry `content-available:1` and nothing displayable.
+- **Expecting a foreground banner for free.** iOS suppresses it unless `willPresent` returns presentation options; on Android, `notification` messages that arrive in foreground go to `onMessageReceived` and are not displayed automatically, so post a local notification from there.
+- **Android 13+ with no `POST_NOTIFICATIONS` request.** Silent zero delivery, no error. Target SDK 33+ and request the runtime permission.
+- **Missing Android notification channel.** On API 26+ a notification with no created channel never shows. Create channels at startup; set `channel_id` in the payload.
+- **Not calling the silent-push completion handler.** iOS `didReceiveRemoteNotification` must call `completionHandler(.newData)` fast, or iOS throttles future background pushes for the app.
+- **`data` values as numbers/objects in FCM.** v1 requires all `data` values be strings; non-strings 400 the request. Stringify, parse on the client.
+- **Deleting dead tokens on a timer.** You evict live tokens and keep dead ones. Delete only on `Unregistered`/`UNREGISTERED` from the actual send response.
+- **Re-prompting after iOS denial.** The grant is one-shot; a second `requestAuthorization` no-ops. Detect denied and deep-link to system Settings instead.
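Two of the payload errors above (non-string `data` values, and a silent push that carries displayable fields) can be prevented at construction time on the server. A minimal sketch of builders that follow the rules from step 5; the function names are hypothetical and the field names are the ones used in the step's example body.

```python
import json
from typing import Any, Dict


def fcm_data(fields: Dict[str, Any]) -> Dict[str, str]:
    """FCM v1 `data` values must be strings: JSON-encode anything that is not."""
    return {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in fields.items()
    }


def alert_message(token: str, title: str, body: str, data: Dict[str, Any]) -> dict:
    """Visible push: display fields in `notification`, routing fields in `data`."""
    return {"message": {
        "token": token,
        "notification": {"title": title, "body": body},
        "data": fcm_data(data),
        "apns": {"headers": {"apns-push-type": "alert", "apns-priority": "10"}},
    }}


def silent_message(token: str, data: Dict[str, Any]) -> dict:
    """Background push: content-available only, nothing displayable."""
    return {"message": {
        "token": token,
        "data": fcm_data(data),
        "android": {"priority": "high"},
        "apns": {
            "headers": {"apns-push-type": "background", "apns-priority": "5"},
            "payload": {"aps": {"content-available": 1}},
        },
    }}
```

The client parses the stringified values back (for example `int(data["thread_id"])`) after reading them on tap or in the background handler.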
+## Verify
+1. **Round-trip per state:** with a real token, send and confirm a banner appears in **foreground, background, and killed**. Tapping each opens the screen named by `data.deeplink` — cold-start tap and warm tap land on the *same* screen.
+2. **Token rotation:** reinstall the app → `onNewToken`/refresh fires → backend row is **updated in place** (no second row), and a push to the new token arrives while the old one returns `Unregistered`.
+3. **Silent push:** send `content-available:1` / data-only → app wakes and runs the handler with **no visible banner**; iOS completion handler is called.
+4. **Dead-token reap:** uninstall, then send → provider returns `410 Unregistered` / FCM `UNREGISTERED` and the backend **deletes** that row. A subsequent send skips it.
+5. **Env correctness:** a TestFlight / App Store build's token is accepted by the **production** APNs gateway (no `BadDeviceToken`); a development build's token is accepted by sandbox.
+6. **Permission UX:** fresh install shows the OS prompt only after the in-app value screen / user action; on Android 13+ the `POST_NOTIFICATIONS` dialog appears; denying then re-trying routes to Settings rather than silently failing.
+7. **Rich push:** a payload with an image + actions renders the picture and buttons; each button fires its intended action/deeplink.
+8. **Collapse:** two updates with the same `apns-collapse-id`/`collapse_key` show as **one** replaced notification, not two stacked.
+Done = a real device receives and correctly deep-links a push in all three app states, tokens upsert-and-rotate without duplicate or stale rows, dead tokens are deleted on the provider's `Unregistered`/`UNREGISTERED` signal, silent pushes wake the app without a banner, and the prod build hits the prod APNs gateway with zero `BadDeviceToken`.
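The "updated in place, no second row" expectation from check 2 can be tested against the table from step 4. The sketch below ports that schema to SQLite (3.24+ assumed for upsert); `register_device` and `delete_token` are hypothetical helpers, and evicting another row that still holds the same token is an assumption of this sketch about how to honour the unique index when a phone changes hands.

```python
import sqlite3

SCHEMA = """
CREATE TABLE device_tokens (
  user_id    TEXT NOT NULL,
  device_id  TEXT NOT NULL,              -- stable per install
  token      TEXT NOT NULL,
  platform   TEXT NOT NULL,              -- 'ios' | 'android'
  updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (user_id, device_id)
);
CREATE UNIQUE INDEX device_tokens_token ON device_tokens(token);
"""


def register_device(conn: sqlite3.Connection, user_id: str, device_id: str,
                    token: str, platform: str) -> None:
    with conn:
        # A token belongs to one row: evict any other (user, device) holding it.
        conn.execute(
            "DELETE FROM device_tokens "
            "WHERE token = ? AND NOT (user_id = ? AND device_id = ?)",
            (token, user_id, device_id),
        )
        # Rotation updates the existing row instead of adding a new one.
        conn.execute(
            "INSERT INTO device_tokens (user_id, device_id, token, platform) "
            "VALUES (?, ?, ?, ?) "
            "ON CONFLICT (user_id, device_id) DO UPDATE SET "
            "token = excluded.token, platform = excluded.platform, "
            "updated_at = CURRENT_TIMESTAMP",
            (user_id, device_id, token, platform),
        )


def delete_token(conn: sqlite3.Connection, token: str) -> int:
    """Called when the provider reports the token dead. Returns rows removed."""
    with conn:
        return conn.execute(
            "DELETE FROM device_tokens WHERE token = ?", (token,)
        ).rowcount
```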