npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.0 - Mend

sanook-cli 0.4.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (235) hide show

package/.env.example +19 -0
package/CHANGELOG.md +144 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +394 -51
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +2 -2
package/dist/providers/keys.js +3 -2
package/dist/providers/registry.js +133 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +218 -27
package/dist/ui/banner.js +4 -9
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/setup.js +6 -5
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/payments-billing-integration/SKILL.md ADDED Viewed

@@ -0,0 +1,114 @@
+---
+name: payments-billing-integration
+description: Integrates payment, subscription, and billing flows against a payment provider — hosted/PCI-offloaded checkout and payment-intent surfaces, idempotency-keyed money-mutating calls that survive retries, webhook-driven order/subscription state reconciliation keyed on stored provider event ids, subscription lifecycle (trial/upgrade/downgrade/proration/cancel/dunning), and an append-only ledger of charges, refunds, and credits that reconciles to the provider balance.
+when_to_use: Integrating a payment provider (checkout, PaymentIntents, subscriptions), handling plan changes/proration/cancellations, processing payment webhooks, preventing double-charges, reconciling payment state, or implementing dunning. Distinct from ingest-webhook-secure (verifies the generic signature/replay/dedup of any inbound webhook — this skill drives billing state from those verified events), money-decimal-arithmetic (the rounding/allocation/FX math this skill calls into for totals), and auth-jwt-session (your users' identity, not a PSP charge).
+---
+## When to Use
+Reach for this when the request mutates money or subscription state through a payment provider (Stripe, Adyen, Braintree, PayPal):
+- "Add a checkout / let users pay for X"
+- "Set up subscription billing with monthly/annual plans and a free trial"
+- "Handle upgrade/downgrade with proration" or "cancel at period end vs immediately"
+- "We double-charged a customer / a retry created two charges" — make money calls idempotent
+- "Order shows paid but the webhook said failed" — reconcile state from webhooks, not the redirect
+- "Implement dunning / retry failed renewals / grace period before we revoke access"
+- "Refund or partially refund, issue account credit, keep the ledger auditable"
+NOT this skill:
+- Verifying the raw signature, timestamp window, and replay/dedup of the inbound webhook itself → ingest-webhook-secure (this skill consumes an already-verified, deduped event and decides what billing state it changes)
+- Rounding cents, splitting a charge across line items so it sums exactly, banker's rounding, FX triangulation → money-decimal-arithmetic (call into it; don't re-implement allocation here)
+- Authenticating *your* logged-in user before they reach checkout → auth-jwt-session
+- The background worker/queue that processes a handed-off event → message-queue-jobs
+- Storing the PSP secret/webhook signing key → secrets-management
+## Steps
+1. **Never let raw card data or client-supplied amounts touch your server.** Use a hosted/PCI-offloaded surface so PAN never hits your backend — keeps you in SAQ-A, not SAQ-D.
+   | Need | Use | Why |
+   |---|---|---|
+   | Fastest, lowest PCI scope, one-off or sub | **Hosted checkout** (Stripe Checkout / Adyen Drop-in / PayPal Smart Buttons) | provider hosts the card form, redirects back |
+   | Custom in-page UI, still PCI-offloaded | **PaymentIntents + provider Elements/SDK** | card tokenized client-side; you only see a token + intent id |
+   | Recurring | **Subscriptions API** on a saved payment method | provider runs the renewal schedule + retries |
+   | You handle raw PAN | **don't** | SAQ-D, audits, liability — almost never justified |
+   The **provider is the source of truth for amount and currency**. Compute the price server-side from your catalog, create the intent server-side with that amount, and ignore any amount the client posts. A client that sends `amount=1` for a $100 cart must still be charged $100.
+2. **Every money-mutating call carries an idempotency key — no exceptions.** Charges, captures, refunds, and subscription creates must be safe to retry after a timeout, because "request timed out" does NOT mean "charge didn't happen." Derive the key deterministically from your own intent (e.g. `charge:order_42:attempt_1`), persist it before the call, and reuse the *same* key on retry.
+   ```python
+   # Stripe — header makes the create idempotent for 24h
+   intent = stripe.PaymentIntent.create(
+       amount=order.total_minor,          # integer minor units, computed server-side
+       currency=order.currency,           # ISO 4217, lowercased for Stripe
+       customer=order.customer_id,
+       idempotency_key=f"pi:order:{order.id}",   # SAME key on every retry of THIS order
+       metadata={"order_id": order.id},   # your id, so webhooks map back
+   )
+   ```
+   Rules: a new key per *logical* operation, the same key across *retries* of that operation. Never reuse a key for a different amount (providers return a conflict/error). Generating a fresh UUID per HTTP attempt defeats the entire mechanism — that's how double-charges happen.
+3. **Drive durable state from verified webhooks, not the redirect.** The browser redirect/return is a UX hint only — the user may close the tab, the network may drop, or the 3DS challenge may resolve seconds later. Treat the synchronous result as "pending"; flip to `paid`/`active`/`failed` **only** when the matching verified webhook arrives.
+   - Inbound verification (signature over raw body, timestamp window, replay/seen-id dedup) is owned by **ingest-webhook-secure** — do that first.
+   - Store the **provider event id** (`evt_…`) in a `processed_events` table with a unique constraint; INSERT-or-skip makes re-delivery a no-op.
+   - Map by **your** id from `metadata` (set in step 2), not by position or amount.
+   - Return `2xx` fast so the provider stops retrying; do the heavy lifting async (hand to message-queue-jobs).
+4. **Make state transitions a guarded machine, tolerant of out-of-order delivery.** Webhooks arrive out of order and duplicated; a `payment_failed` can land after a later `payment_succeeded`. Never overwrite blindly — apply only forward transitions.
+   ```
+   pending ──succeeded──▶ paid ──refunded──▶ refunded
+      │                    ▲
+      └──failed──▶ failed ─┘ (manual retry / new intent)
+   ```
+   Guard: ignore a `failed` event for an intent already `paid` by a later event; key the decision on the event's intent status + your stored status, not arrival order. Use the event's own timestamp/sequence to drop stale ones.
+5. **Subscription lifecycle — pick defaults, don't hand-roll proration.**
+   | Change | Default behavior | How |
+   |---|---|---|
+   | Trial → paid | charge at trial end; webhook flips `trialing`→`active` | provider `trial_period_days` + `payment_behavior=default_incomplete` so a failed first charge stays `incomplete` instead of silently activating; gate access on the webhook-confirmed status |
+   | **Upgrade** (to pricier plan) | **immediate**, prorate, charge the difference now | swap price with `proration_behavior=create_prorations` and invoice now |
+   | **Downgrade** | **at period end** (avoid mid-cycle credit/refund churn) | schedule the price change for next period |
+   | Cancel | **at period end** by default (keep paid access they bought); offer immediate+refund only if asked | `cancel_at_period_end=true`; immediate = cancel now + prorated credit/refund |
+   | Quantity/seats | prorate immediately | update quantity, `create_prorations` |
+   Let the provider compute proration — it knows the exact second of the cycle. **Gate feature access on the subscription's webhook-confirmed status** (`active`/`trialing`/`past_due`), never on "they clicked upgrade."
+6. **Dunning — let the provider retry, you handle the lifecycle.** On a failed renewal the provider enters smart-retries and moves the subscription to `past_due`. Subscribe to `invoice.payment_failed` (notify + start grace), `invoice.payment_succeeded` (recovered → `active`), and `subscription.deleted`/`unpaid` (retries exhausted → revoke). Default grace: keep access through `past_due`, revoke only on terminal `unpaid`/`canceled`. Don't build your own retry timer — you'll race the provider's.
+7. **Ledger and invoice correctness — append-only, money math delegated.** Record every money event as an immutable ledger row (`charge`/`refund`/`credit`/`fee` with provider id, minor-unit integer amount, currency, timestamp); never UPDATE an amount in place — post a compensating row. Store amounts as integer minor units or `NUMERIC`, never float. **All splitting/rounding/tax/FX goes through money-decimal-arithmetic** so line items reconcile to the captured total exactly. Refunds reference the original charge and can't exceed it (track remaining refundable). Reconcile the ledger sum against the provider's balance/payout for each charge.
+8. **Periodically reconcile against the provider** — webhooks get missed (endpoint down, dropped delivery). Run a scheduled job that lists provider charges/subscriptions since the last cursor and repairs any local row that drifted (missing `paid`, stale `active`). The provider is authoritative; your DB is a cache that must converge.
+## Common Errors
+- **Acting on the redirect instead of the webhook.** User bounces before the success URL → order stuck `pending` though they paid; or the redirect fires before the charge settles → premature fulfillment. Fulfill on the verified webhook only.
+- **Fresh idempotency key per HTTP retry.** A timeout retried with a new key creates a second charge. Key must be deterministic per logical operation and identical across retries.
+- **Trusting the client amount/currency.** Always compute price server-side from your catalog; the client value is display-only and spoofable.
+- **No `processed_events` dedup.** Providers deliver each event at-least-once; processing a redelivered `payment_succeeded` double-fulfills or double-credits. Unique-constrain the provider event id and skip on conflict.
+- **Overwriting state on out-of-order events.** A late `payment_failed` clobbers a `paid` order. Apply forward-only transitions guarded by stored status + event status, not arrival order.
+- **Hand-rolling proration math.** Off-by-cents and wrong on leap/short months. Let the provider prorate; if you must do money math, route it through money-decimal-arithmetic.
+- **Granting access on the click, not the confirmed status.** Failed first charge on a trial → user gets the product free. Gate on webhook-confirmed `active`/`trialing`.
+- **Floating-point money.** `0.1 + 0.2 != 0.3`; totals drift by a cent. Integer minor units or `NUMERIC` only — see money-decimal-arithmetic.
+- **Refund without a remaining-refundable check.** Two partial refunds can exceed the charge or the provider rejects the second. Track refunded-so-far against the original charge.
+- **Slow webhook handler.** Doing DB writes + emails synchronously blows the provider's timeout → it retries → storms. `2xx` fast, process async.
+- **Logging the full PAN / CVV / signing key.** PCI violation and secret leak. Never log card data; keep the webhook signing key in secrets-management.
+- **Testing only the happy path.** Ship without simulating `card_declined`, `insufficient_funds`, 3DS challenge, expired card, or webhook redelivery and you'll discover them in production.
+## Verify
+1. **Double-charge under retry:** create one PaymentIntent, fire the create twice with the **same** idempotency key (or kill the first mid-flight and retry) → exactly one charge on the provider dashboard, one ledger row.
+2. **Redirect-independent fulfillment:** complete a test payment but **don't** follow the success redirect (close the tab) → the webhook still flips the order to `paid`. Then never deliver the webhook → order stays `pending` (proves you don't fulfill on redirect).
+3. **Webhook dedup:** replay the same `evt_…` (provider "Resend" or `stripe trigger` + manual re-POST) → second delivery is a no-op; one fulfillment, one ledger entry.
+4. **Out-of-order:** deliver `payment_succeeded` then a stale `payment_failed` for the same intent → final state stays `paid`.
+5. **Lifecycle:** in provider test mode run trial→active, upgrade (immediate prorated charge appears), downgrade (applies next period), cancel-at-period-end (access persists to period end) — each confirmed by the corresponding webhook.
+6. **Dunning:** force a renewal decline (`4000000000000341` / provider test card) → subscription `past_due`, grace access holds, then succeed a retry → back to `active`; exhaust retries → access revoked on terminal event.
+7. **Failure paths:** simulate `card_declined`, `insufficient_funds`, expired card, and a 3DS-required card → each yields the correct user-facing error and no partial fulfillment.
+8. **Ledger reconciliation:** for a charge + partial refund, the ledger sum equals the provider's net for that charge; a refund exceeding remaining refundable is rejected.
+9. **Drift repair:** delete a local order row (simulate a missed webhook), run the reconcile job → it re-creates/repairs from the provider.
+Done = under retried/duplicated/out-of-order webhooks there is exactly one charge and one fulfillment per order, durable state is driven only by verified+deduped webhooks (never the redirect), the full subscription lifecycle + dunning are exercised in provider test mode, the ledger reconciles to the provider's balance, and every failure path is tested.

package/skills/pin-toolchain-versions/SKILL.md ADDED Viewed

@@ -0,0 +1,116 @@
+---
+name: pin-toolchain-versions
+description: Pins language/runtime/CLI versions for identical toolchains across machines and CI — a version manager (mise/asdf/Volta/nvm), exact .tool-versions/.mise.toml pins, engines + packageManager via corepack, frozen-lockfile installs, auto-switch on cd, and CI reading the same pin file.
+when_to_use: Version drift across machines or CI ("wrong node/python version"), painful onboarding, or a "works locally, fails in CI" toolchain mismatch. NOT bumping library deps (dependency-upgrade), a containerized env (setup-devcontainer-env), or workspace task orchestration (setup-monorepo-tooling).
+---
+## When to Use
+Reach for this skill when the problem is **which version of the tool runs**, not which library version is installed:
+- "CI uses node 18, my laptop has 20 — build passes locally, fails in CI"
+- "New hire spent a day getting the right Python/Ruby/Go installed"
+- "`pnpm install` produces a different lockfile on the CI runner"
+- "Every machine must resolve the exact same node + package-manager version"
+- A flaky build traced to a runtime/CLI version that differs by host
+NOT this skill:
+- Bumping a library/framework dependency and fixing the breakage → dependency-upgrade
+- Reproducibility via a container/devcontainer image → setup-devcontainer-env
+- Wiring up workspace task runners / package-manager workspaces → setup-monorepo-tooling
+- Publishing the resulting package to a registry → publish-package-registry
+## Steps
+1. **Pick one manager and commit to it — never run two.** Two managers fighting over `PATH` shims is the #1 cause of "it switched back."
+   | Manager | Use when | Notes |
+   |---|---|---|
+   | **mise** | **Default.** Polyglot (node/python/go/ruby/rust/…), fast Rust shims, reads `.tool-versions` *and* `.mise.toml`, runs tasks + env | One tool for every language; drop-in upgrade path from asdf |
+   | asdf | Already standardized on it org-wide | Plugin-per-language, slower; mise reads its files unchanged |
+   | Volta | JS/TS-only repo, want the toolchain pinned *in package.json* | Pins node+pm under `"volta"`, no separate file |
+   | nvm | Minimal, node-only, can't install other tools | `.nvmrc` only, no pm pin, manual `nvm use` |
+   Default to **mise** unless the repo is JS-only and you specifically want package.json-embedded pins (Volta).
+2. **Pin EXACT versions — never a range, `latest`, or `lts`.** A range re-introduces drift the moment a new patch ships. `.mise.toml`:
+   ```toml
+   [tools]
+   node   = "20.18.1"
+   python = "3.12.7"
+   pnpm   = "9.15.0"
+   ```
+   Or `.tool-versions` (asdf/mise compatible):
+   ```
+   node 20.18.1
+   python 3.12.7
+   pnpm 9.15.0
+   ```
+   Full `MAJOR.MINOR.PATCH`. `node = "20"` or `"lts"` resolves differently on a machine that synced its index yesterday vs today — that defeats the purpose.
+3. **Pin the package manager too — runtime alone is not enough.** A pinned node with a floating pnpm still produces different lockfiles. Declare both in `package.json` and let corepack enforce it:
+   ```jsonc
+   {
+     "packageManager": "pnpm@9.15.0",            // corepack pins the exact pm
+     "engines": { "node": "20.18.1", "pnpm": "9.15.0" }
+   }
+   ```
+   `corepack enable` makes the `pnpm`/`yarn` shim resolve that exact version. Add `engine-strict=true` to `.npmrc` so an out-of-range node **errors** instead of warning. (Volta users: put `"volta": { "node": "20.18.1", "pnpm": "9.15.0" }` in package.json instead — it owns both.)
+4. **Commit the lockfile and install frozen in CI.** Pinned tools are wasted if installs still resolve fresh versions. Commit `pnpm-lock.yaml` / `package-lock.json` / `poetry.lock` / `Cargo.lock`, and in CI use the **frozen** install that fails on any drift, never re-resolves:
+   | PM | Local install | CI (must fail on drift) |
+   |---|---|---|
+   | pnpm | `pnpm install` | `pnpm install --frozen-lockfile` |
+   | npm | `npm install` | `npm ci` |
+   | yarn (berry) | `yarn install` | `yarn install --immutable` |
+   | poetry | `poetry install` | `poetry install` after `poetry lock --check` |
+   | cargo | `cargo build` | `cargo build --locked` |
+5. **Auto-switch on `cd` so nobody runs the wrong version by hand.** Hook the manager into the shell once: append `mise activate zsh` (or `bash`/`fish`) to the rc file; nvm users add `.nvmrc` auto-use logic. Entering the repo now selects the pinned tools automatically — no `nvm use`, no stale shell. Run `mise install` once to materialize the versions and `mise trust` to allow the repo's config.
+6. **Make CI read the SAME pin file — never retype versions in YAML.** A hardcoded `node-version: 20` in the workflow is a second source of truth that silently drifts. GitHub Actions:
+   ```yaml
+   # Option A: native setup reads the pin file directly
+   - uses: actions/setup-node@v4
+     with:
+       node-version-file: '.tool-versions'   # or .nvmrc / package.json
+   - uses: actions/setup-python@v5
+     with:
+       python-version-file: '.tool-versions'
+   # Option B (polyglot, simplest): let mise install everything
+   - uses: jdx/mise-action@v2                 # reads .mise.toml / .tool-versions
+   ```
+   `mise-action` is cleanest when you pin >2 languages — one step, the same file the dev uses.
+7. **Go hermetic only when "same versions" isn't enough.** A version manager pins the tool but still links the host's system libraries (openssl, glibc), so two "same node" builds can still differ. For bit-for-bit reproducibility (security/compliance), use **Nix flakes** (`flake.nix` + `flake.lock`, `nix develop`) or **devbox** (`devbox.json`, mise-like UX over Nix). Reserve this for projects that genuinely need it — it's heavier and steeper than mise.
+## Common Errors
+- **Pinning a range or `latest`/`lts`.** `node = "20"` resolves to different patches over time and across machines. Always full `MAJOR.MINOR.PATCH`.
+- **Pinning the runtime but not the package manager.** Floating pnpm/yarn/npm produces divergent lockfiles even on identical node. Pin via `packageManager` + corepack.
+- **Two managers installed (nvm + mise, or asdf + Volta).** Their `PATH` shims clash and one wins nondeterministically per shell. Pick one; uninstall the other's shell hook.
+- **`engine-strict` not set, so `engines` is just a warning.** npm ignores an `engines` mismatch by default. Set `engine-strict=true` in `.npmrc` to make it a hard error.
+- **CI hardcodes the version in YAML.** `node-version: 20` drifts from the repo's pin file the day someone bumps one and not the other. Use `node-version-file:` / `mise-action`.
+- **Non-frozen install in CI.** Plain `npm install` / `pnpm install` re-resolves and can pick newer deps than the lockfile. Use `npm ci` / `--frozen-lockfile` / `--immutable` / `--locked`.
+- **Forgot `corepack enable`.** Then the `pnpm` on PATH is whatever was globally installed, ignoring `packageManager`. Enable corepack locally and in CI before install.
+- **Lockfile not committed (or in `.gitignore`).** Frozen install has nothing to enforce against. Commit every lockfile.
+- **`.mise.toml` not trusted on a fresh clone.** mise refuses untrusted config and silently skips it. Run `mise trust` (or set `MISE_TRUSTED_CONFIG_PATHS`) in onboarding/CI.
+- **Pinning the tool but using system libs.** A "same node version" build still differs if it links a different openssl. If that bites you, go hermetic (Nix/devbox), not just a version manager.
+## Verify
+1. **Pin file is exact:** `grep -E '[0-9]+\.[0-9]+\.[0-9]+' .mise.toml .tool-versions package.json` shows full triples — no bare majors, no `latest`/`lts`/`*`.
+2. **Single source of truth:** the version in the pin file, `package.json` `engines`/`packageManager`, and the CI workflow all match — no hardcoded version in YAML that diverges from the file.
+3. **Clean machine A:** fresh clone → `mise install && corepack enable` → `node -v`, `python -V`, `pnpm -v` print the pinned versions with zero manual selection.
+4. **Clean machine B (or container):** repeat step 3 on a second OS/host → identical version strings.
+5. **CI parity:** the CI job logs the same `node -v`/`pnpm -v` as the two machines (read the pin file via `setup-*` `version-file` or `mise-action`).
+6. **Frozen install holds the line:** bump a dep without updating the lock → `npm ci` / `pnpm install --frozen-lockfile` **fails** (drift is rejected, not silently resolved).
+7. **Auto-switch works:** `cd` out of the repo and back → the active `node`/`python` flips to the pinned versions with no manual command.
+8. **engine-strict bites:** temporarily set a wrong node in the pin file → install **errors** on the mismatch instead of warning.
+Done = two clean machines and CI all print identical tool + package-manager versions resolved from one committed pin file, the frozen install fails on any lockfile drift, and switching is automatic on `cd`.

package/skills/plan-strangler-migration/SKILL.md ADDED Viewed

@@ -0,0 +1,95 @@
+---
+name: plan-strangler-migration
+description: Plans and executes incremental legacy modernization with the strangler-fig pattern — pins current behavior with characterization/golden-master tests, carves the narrowest seam, routes traffic old↔new behind a flag (shadow then canary), compares parity on real traffic, migrates data via expand-contract, then flips the default and retires the old path against a tracked kill-list.
+when_to_use: Replacing or rewriting a live legacy system/module slice by slice with rollback at every step — monolith→service, framework v-old→v-new, on-prem→cloud, rewriting an untested critical component, or peeling a god-class apart, when a big-bang cutover is too risky. Distinct from feature-flags-rollout (owns the flag/bucketing/ramp mechanics this skill drives), db-migration-safety (owns the DDL/lock safety of the expand-contract step), and diff-table-parity (diffs two static datasets; this diffs live request streams).
+---
+## When to Use
+Reach for this when you must **swap an implementation while the system stays live**, with rollback at every step:
+- "Move this endpoint/module off the monolith into a new service without a freeze"
+- "Rewrite this untested payment/pricing component — I'm scared to touch it"
+- "Migrate from <old framework/runtime> to <new> without a big-bang cutover"
+- "Break this 4,000-line god-class apart safely"
+- "Re-platform on-prem → cloud, one slice at a time, provably reversible"
+- "We tried a rewrite-and-switch and it blew up — do it incrementally instead"
+NOT this skill:
+- Building the flag, hashed bucketing, and 1→10→50→100 ramp itself → feature-flags-rollout (this skill *drives* a flag; that skill *builds* it)
+- Lock contention, blocking DDL, destructive ops in the expand-contract step → db-migration-safety
+- Diffing two **static** tables/query results to prove a migration matched → diff-table-parity (this skill diffs **live shadowed request streams** in flight)
+- Writing tests for code whose contract you *know and trust* (TDD a new feature) → write-tests (here you pin **observed** behavior, bugs included, not desired behavior)
+- Behavior-preserving cleanup once the new path already works → refactor-cleanup
+- Sequencing an already-decided plan into batched steps → write-plan
+## Steps
+1. **Characterize BEFORE you touch anything — pin observed behavior, not intended behavior.** Write golden-master/characterization tests that capture what the legacy path *actually does today, bugs included*. These are your regression witness; without them you cannot prove the new path is equivalent. If the unit is untestable in isolation, characterize at the next boundary out (HTTP, queue message, CLI stdout). Don't write the assertion by hand — record the real output and snapshot it:
+   ```python
+   # Pin CURRENT behavior. If legacy returns a wrong-but-shipped value, the snapshot
+   # captures the wrong value on purpose — parity first, fix bugs in a LATER slice.
+   @pytest.mark.parametrize("case", load_real_inputs("prod_samples.jsonl"))
+   def test_golden_master(case, snapshot):
+       assert snapshot == legacy.handle(case)   # `pytest --snapshot-update` once, then freeze
+   ```
+   Feed it **real recorded inputs** (sampled prod traffic / a replay log), not invented ones — invented inputs miss the quirks that break the rewrite.
+2. **Find the seam — the narrowest interface where old and new can swap.** Pick the boundary by cost; narrowest viable wins.
+   | Seam type | Swap point | Use when | Rollback granularity |
+   |---|---|---|---|
+   | **HTTP route / reverse proxy** | nginx/Envoy/API-gateway path rule | monolith→service, per-endpoint carve | per route, instant |
+   | **Function/interface (facade)** | inject impl behind one interface | god-class split, in-process rewrite | per call, per deploy |
+   | **Message/queue consumer** | new consumer group on same topic | async pipeline, event handler | per topic/partition |
+   | **Branch-by-abstraction** | abstraction layer both impls satisfy | can't split a release; long-lived migration on `main` | per flag, no long branch |
+   Default to **branch-by-abstraction behind a flag** for in-process work and a **proxy/gateway route split** for service extraction. Avoid long-lived feature branches — they rot; keep both impls on `main` behind the seam.
+3. **Stand up the new impl behind the seam and route a thin slice via flag — start in shadow.** Do not send real users to unproven code first. Ramp the *mode*, then the *percentage*:
+   - **Shadow (mirror):** run new alongside old on real traffic, **discard new's result, serve old**, log the diff. Zero user risk — this is how you earn trust.
+   - **Parallel-run (canary):** serve new to 1% (sticky by user/key), keep old as the authority for everything else.
+   - The flag/bucketing/ramp mechanics are **feature-flags-rollout's** job — wire to it; don't reinvent hashing here.
+   ```
+   result_old = legacy.handle(req)
+   if flags.enabled("strangler.charges", req.user):     # sticky bucket
+       result_new = newimpl.handle(req)
+       compare.record(req, result_old, result_new)      # async, never blocks the response
+   return result_old                                    # shadow phase: OLD is still the truth
+   ```
+4. **Compare old vs new on real traffic; widen only while parity holds.** Diff every shadowed request and alert on mismatch rate, not just errors. Normalize away legitimate noise — timestamps, map ordering, float epsilon — *before* diffing, or you'll drown in false diffs. Set a hard gate: **promote a slice only after ≥10k shadow requests at <0.1% semantic mismatch.** A persistent diff is a finding: either a real new-impl bug, or an undocumented legacy quirk you must replicate. Never widen past an unexplained diff.
+5. **Migrate data/state with expand-contract — keep both readable until cutover.** Never rename/drop in place. Three phases, each independently deployable and reversible: **expand** (add new column/table/store, backfill, dual-write old+new) → **migrate** (reads shift to new, writes still hit both) → **contract** (stop writing old, drop it — only after the new path is the default and stable). Dual-write so a rollback at any moment still finds consistent data on the old side. The DDL safety of each phase (lock time, backfill batching, online index) belongs to **db-migration-safety** — route the schema change there.
+6. **Flip the default, keep the rollback flag live, retire in two separate steps.** When a slice holds parity at 100%, flip its flag default to **new** but **leave the flag in place** so one toggle reverts. Bake in for one full business cycle (covers month-end/batch/cron paths). Only then: (a) delete the old code path, (b) **in a separate later commit**, remove the now-unused flag and dead branches. Collapsing flip + delete into one change throws away your rollback the moment you might need it.
+7. **Track a kill-list so "done" is provable.** Maintain a checklist of every legacy unit (route, function, table, consumer) with its state: `characterized → shadowing → canary → default-new → old-deleted → flag-removed`. The migration is done when every row reaches `flag-removed` and zero callers reference the legacy module. No kill-list = no way to prove completion, and stranded half-migrated code lives forever.
+## Common Errors
+- **Rewriting before characterizing.** You have nothing to compare against, so "it works" is a guess. Always pin observed behavior (step 1) first — even ugly snapshot tests beat none.
+- **Pinning intended behavior instead of actual.** You "fix" a legacy bug while characterizing, the snapshot now disagrees with prod, every shadow diff is noise. Capture reality; fix bugs as a *later, separate* slice with its own test change.
+- **Big-bang seam — boundary too wide.** Carving a whole subsystem at once = no thin slice, no cheap rollback. Find a narrower interface (one route, one function) even if it means more iterations.
+- **Going straight to canary, skipping shadow.** Real users hit unproven code before you've seen a single diff. Shadow first; users only after the mismatch rate is provably near-zero.
+- **Comparison blocks the response / mutates state twice.** Synchronous diffing adds new-impl latency to every request, and a non-idempotent new path in shadow double-charges/double-sends. Record diffs async and keep shadow side-effect-free (no writes, no emails, no charges).
+- **Diffing raw output without normalization.** Timestamps, map ordering, and float jitter flood you with fake mismatches and you stop trusting the signal. Canonicalize both sides before comparing.
+- **Rename/drop-in-place migration.** Destroys the old read path, so rollback corrupts data. Use expand-contract with dual-write; drop only in the final contract phase.
+- **Deleting the old path in the same change that flips the default.** The instant you need to roll back, there's nothing to roll back to. Flip, bake, then delete in a later commit; remove the flag in a third.
+- **Long-lived rewrite branch.** It diverges from `main` for months and the merge is its own big-bang. Keep both impls on `main` behind the seam (branch-by-abstraction).
+- **No kill-list / orphaned flags.** Half-migrated routes and permanent "temporary" flags accumulate; nobody can say what's done. Track every unit to `flag-removed`.
+## Verify
+1. **Characterization exists and is green on legacy:** the golden-master suite runs against the *unmodified* old path and passes from real recorded inputs (not hand-written cases).
+2. **Seam is reversible by config:** a single flag/route toggle (no redeploy) flips a slice old↔new and back; demonstrate the round-trip live.
+3. **Shadow parity gate met:** the promoted slice logged ≥10k shadowed requests at <0.1% semantic mismatch, and every residual diff is explained (real bug filed, or quirk now replicated).
+4. **Shadow is side-effect-free:** with the new path shadowed at 100%, downstream effects (writes, charges, emails) occur exactly once — proven by counts, not inspection.
+5. **Data is dual-readable mid-migration:** after cutover-to-new-reads but before contract, force the rollback flag → the old read path still returns consistent data (dual-write working).
+6. **Rollback works after flip:** with default=new in prod, toggle the flag → traffic returns to old with no errors and no data divergence.
+7. **Old path retired separately and fully:** dead legacy code is deleted in its own commit, the flag removed in another, and a code search shows zero remaining references to the legacy module.
+8. **Kill-list closed:** every unit on the list is at `flag-removed`.
+Done = the golden-master suite passes on the new path, every kill-list unit reached `flag-removed`, no live caller references the legacy module, and a single toggle could have reverted each slice up until its flag was removed.

package/skills/property-based-testing/SKILL.md ADDED Viewed

@@ -0,0 +1,108 @@
+---
+name: property-based-testing
+description: Finds bugs example tests miss by asserting properties over thousands of generated inputs instead of hand-picked cases — pick the invariant (round-trip encode/decode, idempotence f(f(x))==f(x), oracle/reference equivalence, metamorphic relations, commutativity/associativity, conservation/no-loss), build generators that hit edge cases (empty, huge, Unicode, NaN, negative-zero), and let the framework auto-shrink a failure to a minimal counterexample with a reproducible seed. Covers Hypothesis (Python), fast-check (JS/TS), QuickCheck/Hedgehog (Haskell), proptest/quickcheck (Rust), jqwik (Java), and stateful/model-based testing that drives a system through random command sequences checking it against a model. Distinct from example tests: you specify what's always true, not what one input returns.
+when_to_use: You have a function/codec/parser/data structure with a property that holds for ALL inputs (round-trips, idempotent ops, an invariant, or a slow-but-correct reference to check against), example tests feel like they're missing edge cases, or you want a stateful model test that hammers an API/state machine with random command sequences. Distinct from write-tests (curates specific example-based cases for known behavior; this generates inputs + shrinks counterexamples for universal properties) and fuzz-dynamic-security-test (throws malformed bytes to find crashes/memory-safety/DoS, no correctness oracle; PBT checks a stated invariant holds).
+---
+## When to Use
+Reach for this skill when correctness can be stated as a rule true for **every** input, not just the cases you thought of:
+- "Test this encoder/decoder / serializer / parser — `decode(encode(x)) == x` for any `x`"
+- "This operation should be idempotent / commutative / order-independent — prove it over random inputs"
+- "I have a slow-but-obviously-correct reference (or the old impl); check the fast/new one matches it"
+- "Example tests pass but prod keeps hitting edge cases (empty, Unicode, huge, negative-zero, DST)"
+- "Hammer this stateful API / cache / state machine with random valid command sequences and check invariants"
+- "A property test failed — minimize it to the smallest reproducing input and pin the seed"
+NOT this skill:
+- Curating specific input→output example cases for known/spec'd behavior, organizing the suite, fixtures/mocks → write-tests (it structures example-based tests; this one *generates* inputs and *shrinks* counterexamples for universal properties)
+- Throwing malformed/adversarial bytes to find crashes, OOM, panics, memory-safety, ReDoS, parser DoS — with no correctness oracle → fuzz-dynamic-security-test (security crash-finding; PBT asserts a stated invariant, not "didn't crash")
+- A test that fails non-deterministically and you need to stabilize/quarantine it → debug-flaky-tests (note: PBT failures look flaky but are *real* bugs found by a different seed — capture the seed, don't retry-til-green)
+- Building reusable typed input builders/fixtures for example tests → test-data-factories (a factory can *seed* a PBT generator, but generators add ranges + shrinking)
+- Validating a real dataset for nulls/outliers/dupes → validate-data-quality; precision/rounding invariants of money → money-decimal-arithmetic (this skill is *how* you'd test those invariants)
+- API request/response contract conformance across services → contract-testing
+## Steps
+1. **First find the property — this is the hard part, not the framework.** A property is a predicate true for all valid inputs. The reusable archetypes (memorize these; most code fits one):
+   | Property | Shape | Good for |
+   |---|---|---|
+   | **Round-trip / inverse** | `decode(encode(x)) == x`, `parse(render(x)) == x`, `decompress(compress(x)) == x` | codecs, serializers, parsers, ORMs, URL/path builders |
+   | **Idempotence** | `f(f(x)) == f(x)` | normalize, dedupe, sort, sanitize, `PUT`, migrations, formatters |
+   | **Oracle / reference** | `fast(x) == slow_obviously_correct(x)`, or `new(x) == old(x)` | optimizations, rewrites, replacing a lib, regression vs prod |
+   | **Metamorphic** | relate two runs without knowing the answer: `sin(x)==sin(π−x)`, `len(sort(xs))==len(xs)`, `f(x)+f(y)==f(x∪y)`, search results superset of stricter query | ML, numeric, search/ranking, anything with no easy oracle |
+   | **Invariant / postcondition** | output always satisfies P: sorted is ordered, balanced tree stays balanced, total preserved, no PII leaks | data structures, allocators, accounting |
+   | **Algebraic laws** | commutativity `a∘b==b∘a`, associativity, identity, distributivity | merges, set ops, CRDTs, query builders |
+   | **Conservation / no-loss** | nothing created or destroyed: `sum(split(x))==x`, `count in == count out`, partition reassembles | sharding, money allocation, ETL, pagination |
+   If you can't state a property, you're not ready for PBT — fall back to write-tests. The classic trap: re-implementing the function inside the test (tautology). Prefer round-trip/metamorphic/oracle, which don't need a second copy of the logic.
+2. **Pick the framework and learn its three primitives — generator, runner, shrinker.**
+   | Lang | Library | Generate | Decorator/runner | Reproduce a failure |
+   |---|---|---|---|---|
+   | Python | **Hypothesis** | `@given(st.integers())`, `st.text()`, `st.lists(...)` | `@given(...)` on a test fn | prints `@reproduce_failure` / `@example`; `--hypothesis-seed=` |
+   | JS/TS | **fast-check** | `fc.integer()`, `fc.string()`, `fc.record({...})` | `fc.assert(fc.property(gen, pred))` | prints `seed` + `path`; `{ seed, path }` in `fc.assert` |
+   | Rust | **proptest** / quickcheck | `proptest!{ \|(x in 0..100u32)\| {...} }`, `any::<T>()` | `proptest! { ... }` macro | failures persisted to `proptest-regressions/*.txt` (commit it) |
+   | Haskell | **QuickCheck** / Hedgehog | `Arbitrary`, `Gen`; Hedgehog integrated shrinking | `prop> forAll gen $ \x -> ...` | `--quickcheck-replay=`, Hedgehog prints seed |
+   | Java/Kotlin | **jqwik** | `@ForAll`, `@Provide` Arbitraries | `@Property` method | `@Property(seed = "...")` |
+   | Go | testing/quick (basic) or **rapid** | `rapid.Int()`, `rapid.Custom` | `rapid.Check(t, func(t){...})` | rapid prints `-rapid.seed=`/`-rapid.failfile=` |
+   Defaults to bump: run **≥1000 cases** in CI for cheap properties (Hypothesis defaults 100, fast-check 100, proptest 256). Set `max_examples`/`numRuns`/`PROPTEST_CASES` higher for critical codecs; lower (and a deadline) for slow ones.
+3. **Write generators that actually reach the bug — composition + shaping, not just `random int`.** Build complex inputs from primitives, then constrain:
+   - **Compose:** `st.lists(st.builds(User, name=st.text(), age=st.integers(0, 130)))` (Hypothesis) / `fc.array(fc.record({ name: fc.string(), age: fc.nat(130) }))` (fast-check). Generate the *whole* domain object, not field-by-field manual loops.
+   - **Constrain with `map`/`filter`/`assume`, but prefer construction.** `filter`/`assume` that rejects >~50% of inputs starves the run (Hypothesis raises `FailedHealthCheck`). Instead `map` into the valid space: to get even numbers use `integers().map(lambda n: n*2)`, not `filter(is_even)`. For "sorted pair", generate two and sort — don't reject unsorted.
+   - **Force the edge cases generators under-sample.** Add `@example(...)` (Hypothesis) / explicit `fc.constantFrom` mixes for: empty string/list/dict, single element, the boundary value, `0`, `-0.0`, `NaN`/`Infinity`, max int, surrogate-pair & combining-char Unicode, duplicate keys. Hypothesis already biases toward these; fast-check less so — seed them.
+   - **Stateful/model generators** generate *command sequences*, not single inputs (step 6).
+4. **Trust automatic shrinking — it's the feature that makes PBT worth it; don't shrink by hand.** When a property fails, the framework re-runs with progressively simpler inputs (smaller numbers toward 0, shorter lists, shorter strings) until it finds a **minimal counterexample** — the smallest input that still fails. A raw failure of `[8348, -2, 991, 0, 17]` shrinks to `[0, 0]` or `[1]`, which points straight at the bug. Pitfalls that break shrinking:
+   - **`assume()`/`filter` mid-test** that discards the shrunk candidate → shrinker stalls. Constrain via the generator (step 3) so every generated value is valid.
+   - **Hand-rolled generators without a shrinker** (custom `fc.constantFrom` of opaque blobs, or returning a closure) shrink poorly. Use built-in combinators that carry shrink logic; in Hedgehog/Hypothesis shrinking is integrated so composed generators shrink for free.
+   - **Mutable shared state / non-determinism in the property** → the shrunk case "doesn't reproduce." Make the property a pure function of its inputs; reset state each run.
+5. **Pin the seed and persist regressions — a PBT failure is a real bug, capture it, never "rerun until green."** Each framework prints a seed/replay token on failure:
+   - **Hypothesis:** maintains a `.hypothesis/examples` DB that auto-replays the last failing case; copy the printed `@reproduce_failure(...)` or add `@example(...)` to lock it permanently. Set `derandomize=True` or `--hypothesis-seed=0` for fully deterministic CI.
+   - **fast-check:** copy the reported `seed` and `path` into `fc.assert(prop, { seed, path })` to replay exactly; commit it as a regression test.
+   - **proptest:** auto-writes the failing input to `proptest-regressions/<test>.txt` — **commit that file**; it's replayed first on every future run.
+   - **jqwik:** add `@Property(seed = "…")`; rapid: `-rapid.seed=`. Treat a flake-looking PBT failure as a found bug (a different seed exercised a real path), not noise → fix it, don't quarantine (that's debug-flaky-tests territory only if the *property itself* is non-deterministic).
+6. **Stateful / model-based testing — drive the system through random command sequences and check it against a simple model.** For stateful systems (caches, queues, key-value stores, allocators, an API, a shopping cart, a state machine), single-input properties miss interaction bugs. The pattern:
+   - Define a **model**: a trivial in-memory reference (a `dict` for a KV store, a `list` for a queue) that's obviously correct.
+   - Define **commands** with preconditions (when valid), the real action (mutate the SUT), and a postcondition (assert SUT result matches model).
+   - The framework generates a random *valid sequence* of commands, runs both, and asserts they agree at every step; on failure it **shrinks the sequence** to the shortest failing trace (e.g. `put(a,1); delete(a); get(a)`).
+   - Tools: **Hypothesis** `RuleBasedStateMachine` (`@rule`, `@precondition`, `@invariant`); **fast-check** `fc.commands([...])` + `fc.modelRun`; **proptest-state-machine**; QuickCheck `quickcheck-state-machine`. This finds ordering/concurrency/leak bugs example tests never reach.
+7. **Wire into CI with bounded time and a fixed seed — and keep the corpus.** Make runs deterministic and budgeted:
+   - Set a **per-property deadline/timeout** (Hypothesis `deadline=`, fast-check `interruptAfterTimeLimit`) so one slow generator can't hang CI.
+   - Fix the CI seed for reproducibility but **also run a nightly job with a random/rotating seed and more examples** (`max_examples=10000`) to keep discovering — a single fixed seed eventually stops finding anything.
+   - Commit the regression corpus (`proptest-regressions/`, `.hypothesis/` cache as appropriate, pinned `@example`/`seed` cases) so every found bug stays found.
+8. **When PBT beats example tests (and when it doesn't).** Reach for PBT when: the input space is large/structured (parsers, codecs, numeric, collections), you have an oracle or invariant, or bugs cluster at edges you keep missing. **Skip it** when: there's no expressible property (just "this specific input returns this specific value" — that's write-tests); the function calls non-deterministic externals you can't model; or a 3-line pure function where one example *is* the spec. Best practice: a **thin layer of example tests** (documentation + spec'd corner cases) **plus** properties (the invariants) — they're complementary, not either/or.
+## Common Errors
+- **No real property — testing a tautology.** Re-implementing the function inside the test (`assert add(a,b) == a+b`) proves nothing. Fix: use round-trip/metamorphic/oracle/invariant shapes that don't restate the logic.
+- **`filter`/`assume` that rejects most inputs.** Starves the generator, triggers `FailedHealthCheck`, and breaks shrinking. Fix: `map`/construct into the valid space instead of filtering out of the invalid one.
+- **Forgetting the edge cases generators under-sample.** Empty, single-element, `0`, `-0.0`, `NaN`, max int, surrogate-pair/combining Unicode, duplicate keys. Fix: add explicit `@example`/`constantFrom` for them.
+- **Treating a failure as flaky and rerunning until green.** A different seed found a *real* bug. Fix: capture the seed/minimal case, add it as a regression, fix the code.
+- **Not committing the regression corpus.** `proptest-regressions/*.txt` / pinned `@example` get dropped → the same bug returns. Fix: commit them; they replay first.
+- **Non-deterministic or stateful property body.** Shared mutable state / clocks / RNG make the shrunk case not reproduce. Fix: pure property, reset state per run, inject the clock/seed.
+- **Too few runs.** 100 default cases barely scratch a large space. Fix: ≥1000 in CI for cheap props; nightly 10k with rotating seed.
+- **Hand-rolled generators that don't shrink.** Opaque blobs/closures give you a 4000-element counterexample. Fix: build from library combinators that carry shrink logic.
+- **No deadline on slow properties.** One expensive generator hangs CI. Fix: per-property timeout/deadline.
+- **Using PBT where there's no invariant.** Forcing a property onto "input X → output Y" is awkward and weak. Fix: write-tests for spec'd examples; PBT for universal rules — layer both.
+## Verify
+1. **The property is non-tautological:** it's a round-trip/metamorphic/oracle/invariant — not a second copy of the implementation. Mutate the code under test (flip a sign, drop an element) and confirm the property *fails*; a property that never fails on injected bugs is testing nothing.
+2. **Edge cases are reached:** the run includes (or has `@example` for) empty, single, boundary, `0`/`-0.0`/`NaN`, max, and tricky-Unicode inputs; coverage/`Hypothesis statistics` shows them exercised.
+3. **Failures shrink to minimal:** introduce a real bug → the reported counterexample is *small and pointed* (e.g. `[0,0]`, `""`, `1`), not a giant random blob. If it doesn't shrink, fix the generator/`assume` (step 4).
+4. **Reproducible:** re-running with the printed seed/`@reproduce_failure`/`seed+path`/regression file reproduces the *same* failure deterministically; the regression artifact is committed.
+5. **Run count + budget:** CI runs ≥1000 cases per cheap property within a per-property deadline; a nightly/extended job runs more with a rotating seed.
+6. **Stateful (if applicable):** the model-based test drives random command sequences, checks SUT==model at each step, and shrinks a failure to the shortest failing command trace.
+7. **Layered:** example tests cover the documented/spec corners; properties cover the universal invariants — both present, neither doing the other's job.
+Done = each function/codec/state machine has at least one non-tautological property (round-trip, idempotence, oracle, metamorphic, invariant, algebraic, or conservation), generators construct valid inputs (not filter) and hit known edges, failures auto-shrink to a minimal reproducible counterexample with a committed seed/regression, stateful systems are checked against a model via random command sequences, and runs are deterministic-but-budgeted in CI with an extended nightly sweep — proven by the bug-injection and shrink checks in 1–3.

package/skills/publish-package-registry/SKILL.md ADDED Viewed

@@ -0,0 +1,130 @@
+---
+name: publish-package-registry
+description: Publishes a library to a package registry (npm/PyPI/crates) safely — semver decision, correct artifacts (dual ESM/CJS + types, files allowlist), provenance/signing via OIDC, a pre-publish gate, scoped least-privilege access, and tag-triggered CI release.
+when_to_use: Shipping or fixing a library release to npm/PyPI/crates — oversized/broken publish, missing types on install, error-prone manual releases. NOT deploying a running app/service (deploy-release), writing changelog text (release-notes), or auditing consumed dependencies (supply-chain-sbom-provenance — this PRODUCES and attests your OWN package).
+---
+## When to Use
+Reach for this skill when you are **publishing a library others install**, not deploying a service:
+- "Publish v2 to npm" / "release this crate" / "push the wheel to PyPI"
+- "Consumers get `Could not find a declaration file` — types are missing on install"
+- "Our tarball is 40 MB / shipped `src/` and tests / leaked a `.env`"
+- "Replace our manual `npm publish` with a tag-triggered CI release"
+- "Add provenance / sign the artifact so installs are verifiable"
+- "Ship a prerelease on a `next` dist-tag without moving `latest`"
+NOT this skill:
+- Deploying a running app/service/container to an environment → deploy-release
+- Writing the human-readable changelog / release-notes text → release-notes
+- Auditing/attesting dependencies you *consume* (SBOM of third-party deps) → supply-chain-sbom-provenance (this skill produces+attests the package you *own*)
+- Authoring the bundler/`tsup`/Rollup config that emits the artifacts → configure-bundler-build
+- Wiring versioning across many packages in one repo → setup-monorepo-tooling
+## Steps
+1. **Run the pre-publish gate — never publish off an unverified working tree.** A publish is irreversible (you can't re-publish the same version; npm unpublish is restricted to 72h). Gate, in order, and abort on the first failure:
+   ```bash
+   git status --porcelain        # MUST be empty — publish only committed, tagged code
+   <build>                       # tsup/rollup/maturin/cargo build — emit dist/ fresh
+   <test> && <typecheck>         # vitest/pytest + tsc --noEmit; green or stop
+   npm pack --dry-run            # npm: list the EXACT files + unpacked size
+   ```
+   For Python: `python -m build && twine check dist/*`. For crates: `cargo publish --dry-run` and `cargo package --list`. Read the file list out loud — if it contains `src/`, tests, `.env`, `*.map` you didn't intend, or the size jumped, fix the allowlist (step 4) before going further.
+2. **Decide the semver bump from the diff, not vibes.** Compare the public API surface, not the commit count.
+   | Change | Bump | Example |
+   |---|---|---|
+   | Removed/renamed export, changed signature, dropped Node/Py version, behavior break | **major** | `0.x` exception: any break is allowed, but prefer minor and document it |
+   | New export, new optional param, new overload — old code still compiles | **minor** | added `parse(opts?)` |
+   | Bugfix, perf, types-only fix, docs, internal refactor — public API identical | **patch** | fixed off-by-one |
+   Pre-1.0 (`0.y.z`): treat `0.y` like major (breaking bumps `y`), `0.y.z` like minor/patch. Don't hand-bump if you use Changesets/release-please (step 6) — let the tool compute it from change intents. Never reuse or downgrade a published version.
+3. **Make the package importable both ways with types — this is the #1 broken-install cause.** Ship dual ESM+CJS plus a `.d.ts`, and wire `exports` so resolvers actually find them:
+   ```jsonc
+   {
+     "name": "@scope/lib",
+     "version": "2.0.0",
+     "type": "module",
+     "exports": {
+       ".": {
+         "types": "./dist/index.d.ts",   // types FIRST — resolution is order-sensitive
+         "import": "./dist/index.mjs",
+         "require": "./dist/index.cjs"
+       },
+       "./package.json": "./package.json"
+     },
+     "main": "./dist/index.cjs",          // fallback for old resolvers
+     "module": "./dist/index.mjs",
+     "types": "./dist/index.d.ts",
+     "files": ["dist"],                   // allowlist — ONLY dist ships
+     "sideEffects": false,                // lets bundlers tree-shake consumers
+     "repository": { "type": "git", "url": "git+https://github.com/org/lib.git" },
+     "license": "MIT",
+     "engines": { "node": ">=18" }
+   }
+   ```
+   Validate the resolution with `attw --pack` (`@arethetypeswrong/cli`) and `publint` — they catch missing `types` condition, ESM/CJS mismatch, and bad `exports` before users do. Python equivalent: `pyproject.toml` with `[project]` (name, version, license, `requires-python`, `urls`), `py.typed` shipped in the package, and SPDX `license` string. Crates: `Cargo.toml` `[package]` with `description`, `license`, `repository`, `readme`, and an `include = [...]` list.
+4. **Control exactly what ships with an allowlist, not a denylist.** Prefer `files` in `package.json` (allowlist) over `.npmignore` (denylist) — a forgotten denylist entry leaks files; an allowlist fails safe. Note `package.json`, `README`, `LICENSE`, and the `main`/`types` targets are always included. Re-run `npm pack --dry-run` after editing and confirm the count dropped. crates: `include`/`exclude` in `Cargo.toml`. Python: `MANIFEST.in` + `tool.setuptools.packages.find` / hatch `[tool.hatch.build.targets.wheel]`.
+5. **Authenticate with a short-lived, least-privilege credential — never a personal long-lived token in CI.** Order of preference:
+   - **OIDC / trusted publishing (best, no stored secret):** npm provenance + GitHub OIDC, PyPI "Trusted Publisher", crates.io GitHub OIDC. The registry trusts the CI identity directly; nothing to leak or rotate.
+   - **Automation/CI token (next best):** npm *Automation* token (granular, bypasses 2FA prompt in CI), PyPI *project-scoped* API token, crates `CARGO_REGISTRY_TOKEN`. Store in CI secrets, scope to the single package, never to your account.
+   - Enforce **2FA = auth-and-publish** on the package for any human-initiated publish. First publish of a *public* scoped package needs `--access public` (scoped defaults to restricted and will 402/403 otherwise).
+6. **Automate the release on a tag — kill the manual `npm publish`.** Manual publishes drift (wrong branch, dirty tree, forgotten build). Use Changesets (or release-please/semantic-release) so the bump+changelog+tag is mechanical, and let CI do the publish with provenance:
+   ```yaml
+   # .github/workflows/release.yml
+   permissions:
+     contents: write
+     id-token: write            # REQUIRED for npm --provenance / PyPI trusted publishing
+   jobs:
+     release:
+       runs-on: ubuntu-latest
+       steps:
+         - uses: actions/checkout@v4
+         - uses: actions/setup-node@v4
+           with: { node-version: 20, registry-url: 'https://registry.npmjs.org' }
+         - run: npm ci && npm run build && npm test
+         - run: npm publish --provenance --access public --tag latest
+           env: { NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }} }
+   ```
+   `--provenance` (with `id-token: write`) cryptographically links the published tarball to the source commit + workflow — visible as a verified badge on npm. PyPI/crates get equivalent attestation via sigstore/cosign keyless signing in the same OIDC flow. Use **dist-tags** deliberately: prereleases → `--tag next` (or `beta`), so `npm install pkg` (which resolves `latest`) never silently jumps to an unstable build. Promote later with `npm dist-tag add pkg@x.y.z latest`.
+7. **Verify the real artifact in a clean environment** (see Verify) — building locally proves nothing about what consumers actually receive.
+## Common Errors
+- **Publishing a dirty/untagged tree.** The tarball includes uncommitted changes that no commit reproduces. Gate on `git status --porcelain` empty + a matching tag before publish.
+- **Missing `types` on install.** No `"types"` condition in `exports` (or it's listed *after* `import`/`require`) → consumers get `any`/`Could not find a declaration file`. Put `types` first in each `exports` entry; verify with `attw --pack`.
+- **ESM/CJS half-shipped.** Only `.mjs` exists but `require` points at it (or vice versa) → `ERR_REQUIRE_ESM` / `Cannot use import`. Emit both, wire both conditions; `publint` flags the mismatch.
+- **Denylist leak.** `.npmignore` forgot `test/` or a fixture with secrets → it ships. Switch to a `files` allowlist; re-check `npm pack --dry-run`.
+- **Oversized tarball.** Shipping `src/`, sourcemaps, `node_modules`, or `.map` blows up install size. Allowlist `dist` only; confirm unpacked size in the pack dry-run.
+- **First public scoped publish fails with 402/403.** Scoped packages default to restricted. Add `--access public` on the first publish.
+- **`id-token: write` missing → provenance silently absent or publish errors.** Provenance and trusted publishing both need that permission on the job; without it `--provenance` fails or no attestation is produced.
+- **Prerelease moved `latest`.** Publishing `2.0.0-beta.1` without `--tag` makes it `latest`, so every fresh install gets the beta. Always tag prereleases `next`/`beta`.
+- **Long-lived personal token in CI.** A leaked account-scoped token can publish *any* of your packages. Use OIDC trusted publishing, or a package-scoped automation token at minimum.
+- **Reusing/forcing a version.** The registry rejects a duplicate version, and unpublish windows are tiny. Bump forward — there is no "fix the same version" path.
+- **`sideEffects` unset on a side-effect-free lib.** Consumers can't tree-shake your exports; bundle size leaks downstream. Set `"sideEffects": false` (or list the few files that do have side effects).
+## Verify
+1. **Pack and inspect:** `npm pack` (or `python -m build` / `cargo package`) and list the tarball contents — it contains *only* the allowlisted build output, expected size, no `src`/tests/secrets/maps.
+2. **Type/contract lint:** `attw --pack` and `publint` report zero errors (npm); `twine check dist/*` passes (PyPI). No missing/mis-ordered `exports` conditions.
+3. **Clean-room install from the tarball:** in an empty temp dir, `npm init -y && npm i ../lib-2.0.0.tgz` (or `pip install dist/lib-2.0.0-*.whl` in a fresh venv). Install must succeed with no peer/engine warnings you didn't intend.
+4. **Import both module systems with types:** in that clean project, `node -e "import('lib').then(m=>console.log(m.default))"` AND `node -e "require('lib')"` both resolve; a `.ts` file importing `lib` typechecks under `tsc` with no `any`. Python: `import lib` in the fresh venv and `mypy` sees the shipped `py.typed`.
+5. **Semver matches the diff:** the chosen bump (step 2) is justified by the actual public-API delta, and the new version is strictly greater than the latest published one (`npm view pkg version`).
+6. **Provenance/signature present:** after a CI publish, the npm page shows the verified provenance badge (or `cosign verify`/sigstore attestation validates for PyPI/crates), tracing the artifact to the source commit + workflow run.
+7. **Dist-tag correct:** `npm dist-tag ls pkg` shows the prerelease on `next`/`beta` and `latest` still points at the last stable — a default `npm i pkg` does not pull the prerelease.
+Done = the gate is green on a clean tagged tree, the packed tarball installs and imports both ESM and CJS with working types in a clean room, the semver bump matches the API diff, and CI published it on the correct dist-tag with verifiable provenance.