npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.1 - Mend

sanook-cli 0.4.0 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (238)

package/.env.example +19 -0
package/CHANGELOG.md +173 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +405 -57
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +21 -7
package/dist/providers/keys.js +3 -2
package/dist/providers/models.js +22 -6
package/dist/providers/registry.js +155 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +228 -31
package/dist/ui/banner.js +4 -9
package/dist/ui/brain-wizard.js +2 -2
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/render.js +55 -15
package/dist/ui/setup.js +97 -12
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/plan-strangler-migration/SKILL.md ADDED

@@ -0,0 +1,95 @@
+---
+name: plan-strangler-migration
+description: Plans and executes incremental legacy modernization with the strangler-fig pattern — pins current behavior with characterization/golden-master tests, carves the narrowest seam, routes traffic old↔new behind a flag (shadow then canary), compares parity on real traffic, migrates data via expand-contract, then flips the default and retires the old path against a tracked kill-list.
+when_to_use: Replacing or rewriting a live legacy system/module slice by slice with rollback at every step — monolith→service, framework v-old→v-new, on-prem→cloud, rewriting an untested critical component, or peeling a god-class apart, when a big-bang cutover is too risky. Distinct from feature-flags-rollout (owns the flag/bucketing/ramp mechanics this skill drives), db-migration-safety (owns the DDL/lock safety of the expand-contract step), and diff-table-parity (diffs two static datasets; this diffs live request streams).
+---
+## When to Use
+Reach for this when you must **swap an implementation while the system stays live**, with rollback at every step:
+- "Move this endpoint/module off the monolith into a new service without a freeze"
+- "Rewrite this untested payment/pricing component — I'm scared to touch it"
+- "Migrate from <old framework/runtime> to <new> without a big-bang cutover"
+- "Break this 4,000-line god-class apart safely"
+- "Re-platform on-prem → cloud, one slice at a time, provably reversible"
+- "We tried a rewrite-and-switch and it blew up — do it incrementally instead"
+NOT this skill:
+- Building the flag, hashed bucketing, and 1→10→50→100 ramp itself → feature-flags-rollout (this skill *drives* a flag; that skill *builds* it)
+- Lock contention, blocking DDL, destructive ops in the expand-contract step → db-migration-safety
+- Diffing two **static** tables/query results to prove a migration matched → diff-table-parity (this skill diffs **live shadowed request streams** in flight)
+- Writing tests for code whose contract you *know and trust* (TDD a new feature) → write-tests (here you pin **observed** behavior, bugs included, not desired behavior)
+- Behavior-preserving cleanup once the new path already works → refactor-cleanup
+- Sequencing an already-decided plan into batched steps → write-plan
+## Steps
+1. **Characterize BEFORE you touch anything — pin observed behavior, not intended behavior.** Write golden-master/characterization tests that capture what the legacy path *actually does today, bugs included*. These are your regression witness; without them you cannot prove the new path is equivalent. If the unit is untestable in isolation, characterize at the next boundary out (HTTP, queue message, CLI stdout). Don't write the assertion by hand — record the real output and snapshot it:
+   ```python
+   import pytest
+
+   # `legacy` and `load_real_inputs` are assumed to be this project's own module and helper;
+   # `snapshot` is the fixture from a snapshot plugin such as syrupy.
+   # Pin CURRENT behavior. If legacy returns a wrong-but-shipped value, the snapshot
+   # captures the wrong value on purpose: parity first, fix bugs in a LATER slice.
+   @pytest.mark.parametrize("case", load_real_inputs("prod_samples.jsonl"))
+   def test_golden_master(case, snapshot):
+       assert legacy.handle(case) == snapshot   # `pytest --snapshot-update` once, then freeze
+   ```
+   Feed it **real recorded inputs** (sampled prod traffic / a replay log), not invented ones — invented inputs miss the quirks that break the rewrite.
+2. **Find the seam — the narrowest interface where old and new can swap.** Pick the boundary by cost; narrowest viable wins.
+   | Seam type | Swap point | Use when | Rollback granularity |
+   |---|---|---|---|
+   | **HTTP route / reverse proxy** | nginx/Envoy/API-gateway path rule | monolith→service, per-endpoint carve | per route, instant |
+   | **Function/interface (facade)** | inject impl behind one interface | god-class split, in-process rewrite | per call, per deploy |
+   | **Message/queue consumer** | new consumer group on same topic | async pipeline, event handler | per topic/partition |
+   | **Branch-by-abstraction** | abstraction layer both impls satisfy | can't split a release; long-lived migration on `main` | per flag, no long branch |
+   Default to **branch-by-abstraction behind a flag** for in-process work and a **proxy/gateway route split** for service extraction. Avoid long-lived feature branches — they rot; keep both impls on `main` behind the seam.
+3. **Stand up the new impl behind the seam and route a thin slice via flag — start in shadow.** Do not send real users to unproven code first. Ramp the *mode*, then the *percentage*:
+   - **Shadow (mirror):** run new alongside old on real traffic, **discard new's result, serve old**, log the diff. Zero user risk — this is how you earn trust.
+   - **Parallel-run (canary):** serve new to 1% (sticky by user/key), keep old as the authority for everything else.
+   - The flag/bucketing/ramp mechanics are **feature-flags-rollout's** job — wire to it; don't reinvent hashing here.
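+   ```python
+   import logging
+
+   def handle(req):
+       result_old = legacy.handle(req)
+       if flags.enabled("strangler.charges", req.user):     # sticky bucket
+           try:
+               result_new = newimpl.handle(req)             # shadow call: must be side-effect-free
+               compare.record(req, result_old, result_new)  # async, never blocks the response
+           except Exception:
+               logging.exception("shadow path failed")      # a new-impl crash must never reach the user
+       return result_old                                    # shadow phase: OLD is still the truth
+   ```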
+4. **Compare old vs new on real traffic; widen only while parity holds.** Diff every shadowed request and alert on mismatch rate, not just errors. Normalize away legitimate noise — timestamps, map ordering, float epsilon — *before* diffing, or you'll drown in false diffs. Set a hard gate: **promote a slice only after ≥10k shadow requests at <0.1% semantic mismatch.** A persistent diff is a finding: either a real new-impl bug, or an undocumented legacy quirk you must replicate. Never widen past an unexplained diff.
+5. **Migrate data/state with expand-contract — keep both readable until cutover.** Never rename/drop in place. Three phases, each independently deployable and reversible: **expand** (add new column/table/store, backfill, dual-write old+new) → **migrate** (reads shift to new, writes still hit both) → **contract** (stop writing old, drop it — only after the new path is the default and stable). Dual-write so a rollback at any moment still finds consistent data on the old side. The DDL safety of each phase (lock time, backfill batching, online index) belongs to **db-migration-safety** — route the schema change there.
+6. **Flip the default, keep the rollback flag live, retire in two separate steps.** When a slice holds parity at 100%, flip its flag default to **new** but **leave the flag in place** so one toggle reverts. Bake in for one full business cycle (covers month-end/batch/cron paths). Only then: (a) delete the old code path, (b) **in a separate later commit**, remove the now-unused flag and dead branches. Collapsing flip + delete into one change throws away your rollback the moment you might need it.
+7. **Track a kill-list so "done" is provable.** Maintain a checklist of every legacy unit (route, function, table, consumer) with its state: `characterized → shadowing → canary → default-new → old-deleted → flag-removed`. The migration is done when every row reaches `flag-removed` and zero callers reference the legacy module. No kill-list = no way to prove completion, and stranded half-migrated code lives forever.
+## Common Errors
+- **Rewriting before characterizing.** You have nothing to compare against, so "it works" is a guess. Always pin observed behavior (step 1) first — even ugly snapshot tests beat none.
+- **Pinning intended behavior instead of actual.** You "fix" a legacy bug while characterizing, the snapshot now disagrees with prod, every shadow diff is noise. Capture reality; fix bugs as a *later, separate* slice with its own test change.
+- **Big-bang seam — boundary too wide.** Carving a whole subsystem at once = no thin slice, no cheap rollback. Find a narrower interface (one route, one function) even if it means more iterations.
+- **Going straight to canary, skipping shadow.** Real users hit unproven code before you've seen a single diff. Shadow first; users only after the mismatch rate is provably near-zero.
+- **Comparison blocks the response / mutates state twice.** Synchronous diffing adds new-impl latency to every request, and a non-idempotent new path in shadow double-charges/double-sends. Record diffs async and keep shadow side-effect-free (no writes, no emails, no charges).
+- **Diffing raw output without normalization.** Timestamps, map ordering, and float jitter flood you with fake mismatches and you stop trusting the signal. Canonicalize both sides before comparing.
+- **Rename/drop-in-place migration.** Destroys the old read path, so rollback corrupts data. Use expand-contract with dual-write; drop only in the final contract phase.
+- **Deleting the old path in the same change that flips the default.** The instant you need to roll back, there's nothing to roll back to. Flip, bake, then delete in a later commit; remove the flag in a third.
+- **Long-lived rewrite branch.** It diverges from `main` for months and the merge is its own big-bang. Keep both impls on `main` behind the seam (branch-by-abstraction).
+- **No kill-list / orphaned flags.** Half-migrated routes and permanent "temporary" flags accumulate; nobody can say what's done. Track every unit to `flag-removed`.
+## Verify
+1. **Characterization exists and is green on legacy:** the golden-master suite runs against the *unmodified* old path and passes from real recorded inputs (not hand-written cases).
+2. **Seam is reversible by config:** a single flag/route toggle (no redeploy) flips a slice old↔new and back; demonstrate the round-trip live.
+3. **Shadow parity gate met:** the promoted slice logged ≥10k shadowed requests at <0.1% semantic mismatch, and every residual diff is explained (real bug filed, or quirk now replicated).
+4. **Shadow is side-effect-free:** with the new path shadowed at 100%, downstream effects (writes, charges, emails) occur exactly once — proven by counts, not inspection.
+5. **Data is dual-readable mid-migration:** after cutover-to-new-reads but before contract, force the rollback flag → the old read path still returns consistent data (dual-write working).
+6. **Rollback works after flip:** with default=new in prod, toggle the flag → traffic returns to old with no errors and no data divergence.
+7. **Old path retired separately and fully:** dead legacy code is deleted in its own commit, the flag removed in another, and a code search shows zero remaining references to the legacy module.
+8. **Kill-list closed:** every unit on the list is at `flag-removed`.
+Done = the golden-master suite passes on the new path, every kill-list unit reached `flag-removed`, no live caller references the legacy module, and a single toggle could have reverted each slice up until its flag was removed.
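
Reviewer sketch (not part of the package): the parity gate from steps 3 and 4 of the skill above could look roughly like the following. It is a minimal illustration under stated assumptions: responses are JSON-like dicts and lists, the noisy fields are named `timestamp` and `request_id`, and `ParityGate`, `normalize`, and `equivalent` are hypothetical names that do not appear in sanook-cli. The 10,000-request and 0.1% thresholds are the ones the skill states.

```python
import math
from dataclasses import dataclass

# Hypothetical field names; replace with whatever is legitimately noisy in your payloads.
VOLATILE_KEYS = {"timestamp", "request_id"}
FLOAT_TOL = 1e-9  # assumed tolerance for float jitter


def normalize(value):
    """Drop volatile keys recursively so only semantic content is compared."""
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items() if k not in VOLATILE_KEYS}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


def equivalent(a, b):
    """Structural equality that tolerates float jitter (dict key order never matters)."""
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(equivalent(a[k], b[k]) for k in a)
    if isinstance(a, list) and isinstance(b, list):
        return len(a) == len(b) and all(equivalent(x, y) for x, y in zip(a, b))
    if isinstance(a, float) and isinstance(b, float):
        return math.isclose(a, b, rel_tol=FLOAT_TOL, abs_tol=FLOAT_TOL)
    return a == b


@dataclass
class ParityGate:
    """Counts shadow comparisons and decides whether a slice may be widened."""
    min_requests: int = 10_000
    max_mismatch_rate: float = 0.001  # 0.1%
    total: int = 0
    mismatches: int = 0

    def record(self, old, new) -> bool:
        """Compare one shadowed request; returns True when old and new agree."""
        same = equivalent(normalize(old), normalize(new))
        self.total += 1
        if not same:
            self.mismatches += 1
        return same

    def may_promote(self) -> bool:
        """True only with enough traffic AND a mismatch rate under the limit."""
        if self.total < self.min_requests:
            return False
        return self.mismatches / self.total < self.max_mismatch_rate
```

A caller would invoke `gate.record(result_old, result_new)` from the asynchronous comparison path and consult `gate.may_promote()` before raising the flag percentage. The gate only reports the rate; explaining each residual diff remains a manual step.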

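Reviewer sketch (not part of the package): the kill-list from step 7 of plan-strangler-migration can be modeled as a small state tracker. The stage names are the six the skill lists; `Stage` and `KillList` are hypothetical names, and the sketch assumes a unit is added to the list once it has been characterized and only ever moves forward one stage at a time.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The six kill-list states named in the skill, in order."""
    CHARACTERIZED = 0
    SHADOWING = 1
    CANARY = 2
    DEFAULT_NEW = 3
    OLD_DELETED = 4
    FLAG_REMOVED = 5


class KillList:
    """Tracks every legacy unit (route, function, table, consumer) to completion."""

    def __init__(self, units):
        # Assumption: units join the list already characterized.
        self._stage = {unit: Stage.CHARACTERIZED for unit in units}

    def advance(self, unit) -> Stage:
        """Move one unit to the next stage; skipping stages is not possible."""
        current = self._stage[unit]
        if current == Stage.FLAG_REMOVED:
            raise ValueError(f"{unit} is already fully retired")
        self._stage[unit] = Stage(current + 1)
        return self._stage[unit]

    def remaining(self):
        """Units that have not yet reached flag-removed, sorted by name."""
        return sorted(u for u, s in self._stage.items() if s != Stage.FLAG_REMOVED)

    def done(self) -> bool:
        """The migration is provably complete only when nothing remains."""
        return not self.remaining()
```

Keeping the list in code or in a checked-in file makes `done()` something CI can assert, rather than a status someone has to remember.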
package/skills/property-based-testing/SKILL.md ADDED

@@ -0,0 +1,108 @@
+---
+name: property-based-testing
+description: Finds bugs example tests miss by asserting properties over thousands of generated inputs instead of hand-picked cases — pick the invariant (round-trip encode/decode, idempotence f(f(x))==f(x), oracle/reference equivalence, metamorphic relations, commutativity/associativity, conservation/no-loss), build generators that hit edge cases (empty, huge, Unicode, NaN, negative-zero), and let the framework auto-shrink a failure to a minimal counterexample with a reproducible seed. Covers Hypothesis (Python), fast-check (JS/TS), QuickCheck/Hedgehog (Haskell), proptest/quickcheck (Rust), jqwik (Java), and stateful/model-based testing that drives a system through random command sequences checking it against a model. Distinct from example tests: you specify what's always true, not what one input returns.
+when_to_use: You have a function/codec/parser/data structure with a property that holds for ALL inputs (round-trips, idempotent ops, an invariant, or a slow-but-correct reference to check against), example tests feel like they're missing edge cases, or you want a stateful model test that hammers an API/state machine with random command sequences. Distinct from write-tests (curates specific example-based cases for known behavior; this generates inputs + shrinks counterexamples for universal properties) and fuzz-dynamic-security-test (throws malformed bytes to find crashes/memory-safety/DoS, no correctness oracle; PBT checks a stated invariant holds).
+---
+## When to Use
+Reach for this skill when correctness can be stated as a rule true for **every** input, not just the cases you thought of:
+- "Test this encoder/decoder / serializer / parser — `decode(encode(x)) == x` for any `x`"
+- "This operation should be idempotent / commutative / order-independent — prove it over random inputs"
+- "I have a slow-but-obviously-correct reference (or the old impl); check the fast/new one matches it"
+- "Example tests pass but prod keeps hitting edge cases (empty, Unicode, huge, negative-zero, DST)"
+- "Hammer this stateful API / cache / state machine with random valid command sequences and check invariants"
+- "A property test failed — minimize it to the smallest reproducing input and pin the seed"
+NOT this skill:
+- Curating specific input→output example cases for known/spec'd behavior, organizing the suite, fixtures/mocks → write-tests (it structures example-based tests; this one *generates* inputs and *shrinks* counterexamples for universal properties)
+- Throwing malformed/adversarial bytes to find crashes, OOM, panics, memory-safety, ReDoS, parser DoS — with no correctness oracle → fuzz-dynamic-security-test (security crash-finding; PBT asserts a stated invariant, not "didn't crash")
+- A test that fails non-deterministically and you need to stabilize/quarantine it → debug-flaky-tests (note: PBT failures look flaky but are *real* bugs found by a different seed — capture the seed, don't retry-til-green)
+- Building reusable typed input builders/fixtures for example tests → test-data-factories (a factory can *seed* a PBT generator, but generators add ranges + shrinking)
+- Validating a real dataset for nulls/outliers/dupes → validate-data-quality; precision/rounding invariants of money → money-decimal-arithmetic (this skill is *how* you'd test those invariants)
+- API request/response contract conformance across services → contract-testing
+## Steps
+1. **First find the property — this is the hard part, not the framework.** A property is a predicate true for all valid inputs. The reusable archetypes (memorize these; most code fits one):
+   | Property | Shape | Good for |
+   |---|---|---|
+   | **Round-trip / inverse** | `decode(encode(x)) == x`, `parse(render(x)) == x`, `decompress(compress(x)) == x` | codecs, serializers, parsers, ORMs, URL/path builders |
+   | **Idempotence** | `f(f(x)) == f(x)` | normalize, dedupe, sort, sanitize, `PUT`, migrations, formatters |
+   | **Oracle / reference** | `fast(x) == slow_obviously_correct(x)`, or `new(x) == old(x)` | optimizations, rewrites, replacing a lib, regression vs prod |
+   | **Metamorphic** | relate two runs without knowing the answer: `sin(x)≈sin(π−x)` (within float tolerance), `len(sort(xs))==len(xs)`, `f(x)+f(y)==f(x∪y)` for disjoint `x`, `y` and an additive `f` (e.g. a count), results of a stricter query are a subset of the looser query's results | ML, numeric, search/ranking, anything with no easy oracle |
+   | **Invariant / postcondition** | output always satisfies P: sorted is ordered, balanced tree stays balanced, total preserved, no PII leaks | data structures, allocators, accounting |
+   | **Algebraic laws** | commutativity `a∘b==b∘a`, associativity, identity, distributivity | merges, set ops, CRDTs, query builders |
+   | **Conservation / no-loss** | nothing created or destroyed: `sum(split(x))==x`, `count in == count out`, partition reassembles | sharding, money allocation, ETL, pagination |
+   If you can't state a property, you're not ready for PBT — fall back to write-tests. The classic trap: re-implementing the function inside the test (tautology). Prefer round-trip/metamorphic/oracle, which don't need a second copy of the logic.
+2. **Pick the framework and learn its three primitives — generator, runner, shrinker.**
+   | Lang | Library | Generate | Decorator/runner | Reproduce a failure |
+   |---|---|---|---|---|
+   | Python | **Hypothesis** | `st.integers()`, `st.text()`, `st.lists(...)` | `@given(...)` on a test fn | prints `@reproduce_failure` / `@example`; `--hypothesis-seed=` |
+   | JS/TS | **fast-check** | `fc.integer()`, `fc.string()`, `fc.record({...})` | `fc.assert(fc.property(gen, pred))` | prints `seed` + `path`; `{ seed, path }` in `fc.assert` |
+   | Rust | **proptest** / quickcheck | `proptest!{ \|(x in 0..100u32)\| {...} }`, `any::<T>()` | `proptest! { ... }` macro | failures persisted to `proptest-regressions/*.txt` (commit it) |
+   | Haskell | **QuickCheck** / Hedgehog | `Arbitrary`, `Gen`; Hedgehog integrated shrinking | `prop> forAll gen $ \x -> ...` | `--quickcheck-replay=`, Hedgehog prints seed |
+   | Java/Kotlin | **jqwik** | `@ForAll`, `@Provide` Arbitraries | `@Property` method | `@Property(seed = "...")` |
+   | Go | testing/quick (basic) or **rapid** | `rapid.Int()`, `rapid.Custom` | `rapid.Check(t, func(t *rapid.T){...})` | rapid prints `-rapid.seed=`/`-rapid.failfile=` |
+   Defaults to bump: run **≥1000 cases** in CI for cheap properties (Hypothesis defaults 100, fast-check 100, proptest 256). Set `max_examples`/`numRuns`/`PROPTEST_CASES` higher for critical codecs; lower (and a deadline) for slow ones.
+3. **Write generators that actually reach the bug — composition + shaping, not just `random int`.** Build complex inputs from primitives, then constrain:
+   - **Compose:** `st.lists(st.builds(User, name=st.text(), age=st.integers(0, 130)))` (Hypothesis) / `fc.array(fc.record({ name: fc.string(), age: fc.nat(130) }))` (fast-check). Generate the *whole* domain object, not field-by-field manual loops.
+   - **Constrain with `map`/`filter`/`assume`, but prefer construction.** `filter`/`assume` that rejects >~50% of inputs starves the run (Hypothesis raises `FailedHealthCheck`). Instead `map` into the valid space: to get even numbers use `integers().map(lambda n: n*2)`, not `filter(is_even)`. For "sorted pair", generate two and sort — don't reject unsorted.
+   - **Force the edge cases generators under-sample.** Add `@example(...)` (Hypothesis) / explicit `fc.constantFrom` mixes for: empty string/list/dict, single element, the boundary value, `0`, `-0.0`, `NaN`/`Infinity`, max int, surrogate-pair & combining-char Unicode, duplicate keys. Hypothesis already biases toward these; fast-check less so — seed them.
+   - **Stateful/model generators** generate *command sequences*, not single inputs (step 6).
+4. **Trust automatic shrinking — it's the feature that makes PBT worth it; don't shrink by hand.** When a property fails, the framework re-runs with progressively simpler inputs (smaller numbers toward 0, shorter lists, shorter strings) until it finds a **minimal counterexample** — the smallest input that still fails. A raw failure of `[8348, -2, 991, 0, 17]` shrinks to `[0, 0]` or `[1]`, which points straight at the bug. Pitfalls that break shrinking:
+   - **`assume()`/`filter` mid-test** that discards the shrunk candidate → shrinker stalls. Constrain via the generator (step 3) so every generated value is valid.
+   - **Hand-rolled generators without a shrinker** (custom `fc.constantFrom` of opaque blobs, or returning a closure) shrink poorly. Use built-in combinators that carry shrink logic; in Hedgehog/Hypothesis shrinking is integrated so composed generators shrink for free.
+   - **Mutable shared state / non-determinism in the property** → the shrunk case "doesn't reproduce." Make the property a pure function of its inputs; reset state each run.
+5. **Pin the seed and persist regressions — a PBT failure is a real bug, capture it, never "rerun until green."** Each framework prints a seed/replay token on failure:
+   - **Hypothesis:** maintains a `.hypothesis/examples` DB that auto-replays the last failing case; copy the printed `@reproduce_failure(...)` or add `@example(...)` to lock it permanently. Set `derandomize=True` or `--hypothesis-seed=0` for fully deterministic CI.
+   - **fast-check:** copy the reported `seed` and `path` into `fc.assert(prop, { seed, path })` to replay exactly; commit it as a regression test.
+   - **proptest:** auto-writes the failing input to `proptest-regressions/<test>.txt` — **commit that file**; it's replayed first on every future run.
+   - **jqwik:** add `@Property(seed = "…")`; rapid: `-rapid.seed=`. Treat a flake-looking PBT failure as a found bug (a different seed exercised a real path), not noise → fix it, don't quarantine (that's debug-flaky-tests territory only if the *property itself* is non-deterministic).
+6. **Stateful / model-based testing — drive the system through random command sequences and check it against a simple model.** For stateful systems (caches, queues, key-value stores, allocators, an API, a shopping cart, a state machine), single-input properties miss interaction bugs. The pattern:
+   - Define a **model**: a trivial in-memory reference (a `dict` for a KV store, a `list` for a queue) that's obviously correct.
+   - Define **commands** with preconditions (when valid), the real action (mutate the SUT), and a postcondition (assert SUT result matches model).
+   - The framework generates a random *valid sequence* of commands, runs both, and asserts they agree at every step; on failure it **shrinks the sequence** to the shortest failing trace (e.g. `put(a,1); delete(a); get(a)`).
+   - Tools: **Hypothesis** `RuleBasedStateMachine` (`@rule`, `@precondition`, `@invariant`); **fast-check** `fc.commands([...])` + `fc.modelRun`; **proptest-state-machine**; QuickCheck `quickcheck-state-machine`. This finds ordering/concurrency/leak bugs example tests never reach.
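The model-based pattern in step 6 can be sketched in plain Python before reaching for `RuleBasedStateMachine` or `fc.commands`. Everything here is an assumption made for illustration: the `GoodStore`/`BuggyStore` classes, the three-command vocabulary, and the use of `random.Random` as the sequence generator. Unlike the real tools, it has no preconditions and does not shrink the failing sequence.

```python
import random
from typing import Dict, List, Optional, Tuple

Command = Tuple[str, str, int]  # (operation, key, value); value is ignored by delete/get

class GoodStore:
    """System under test: a tiny key-value store."""
    def __init__(self) -> None:
        self._data: Dict[str, int] = {}
    def put(self, key: str, value: int) -> None:
        self._data[key] = value
    def delete(self, key: str) -> None:
        self._data.pop(key, None)
    def get(self, key: str) -> Optional[int]:
        return self._data.get(key)

class BuggyStore(GoodStore):
    def delete(self, key: str) -> None:
        pass  # deliberate bug: the key is never removed

def first_disagreement(store: GoodStore, commands: List[Command]) -> Optional[int]:
    """Run commands against the store and a dict model; return the first failing step."""
    model: Dict[str, int] = {}
    for step, (op, key, value) in enumerate(commands):
        if op == "put":
            store.put(key, value)
            model[key] = value
        elif op == "delete":
            store.delete(key)
            model.pop(key, None)
        elif store.get(key) != model.get(key):   # "get": the observable check
            return step
    return None

def random_commands(rng: random.Random, length: int) -> List[Command]:
    """Generate a random command sequence over two keys so collisions are frequent."""
    ops = ("put", "delete", "get")
    return [(rng.choice(ops), rng.choice("ab"), rng.randint(1, 3)) for _ in range(length)]

trace: List[Command] = [("put", "a", 1), ("delete", "a", 0), ("get", "a", 0)]
print(first_disagreement(BuggyStore(), trace))  # 2
print(first_disagreement(GoodStore(), trace))   # None
```

Feeding `random_commands(random.Random(seed), 30)` through `first_disagreement` for many seeds is the random-sequence half of the technique; the fixed `trace` is the kind of short counterexample a real framework would shrink such a failure down to.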
+7. **Wire into CI with bounded time and a fixed seed — and keep the corpus.** Make runs deterministic and budgeted:
+   - Set a **per-property deadline/timeout** (Hypothesis `deadline=`, fast-check `interruptAfterTimeLimit`) so one slow generator can't hang CI.
+   - Fix the CI seed for reproducibility but **also run a nightly job with a random/rotating seed and more examples** (`max_examples=10000`) to keep discovering — a single fixed seed eventually stops finding anything.
+   - Commit the regression corpus (`proptest-regressions/`, `.hypothesis/` cache as appropriate, pinned `@example`/`seed` cases) so every found bug stays found.
+8. **When PBT beats example tests (and when it doesn't).** Reach for PBT when: the input space is large/structured (parsers, codecs, numeric, collections), you have an oracle or invariant, or bugs cluster at edges you keep missing. **Skip it** when: there's no expressible property (just "this specific input returns this specific value" — that's write-tests); the function calls non-deterministic externals you can't model; or a 3-line pure function where one example *is* the spec. Best practice: a **thin layer of example tests** (documentation + spec'd corner cases) **plus** properties (the invariants) — they're complementary, not either/or.
+## Common Errors
+- **No real property — testing a tautology.** Re-implementing the function inside the test (`assert add(a,b) == a+b`) proves nothing. Fix: use round-trip/metamorphic/oracle/invariant shapes that don't restate the logic.
+- **`filter`/`assume` that rejects most inputs.** Starves the generator, triggers `FailedHealthCheck`, and breaks shrinking. Fix: `map`/construct into the valid space instead of filtering out of the invalid one.
+- **Forgetting the edge cases generators under-sample.** Empty, single-element, `0`, `-0.0`, `NaN`, max int, surrogate-pair/combining Unicode, duplicate keys. Fix: add explicit `@example`/`constantFrom` for them.
+- **Treating a failure as flaky and rerunning until green.** A different seed found a *real* bug. Fix: capture the seed/minimal case, add it as a regression, fix the code.
+- **Not committing the regression corpus.** `proptest-regressions/*.txt` / pinned `@example` get dropped → the same bug returns. Fix: commit them; they replay first.
+- **Non-deterministic or stateful property body.** Shared mutable state / clocks / RNG make the shrunk case not reproduce. Fix: pure property, reset state per run, inject the clock/seed.
+- **Too few runs.** 100 default cases barely scratch a large space. Fix: ≥1000 in CI for cheap props; nightly 10k with rotating seed.
+- **Hand-rolled generators that don't shrink.** Opaque blobs/closures give you a 4000-element counterexample. Fix: build from library combinators that carry shrink logic.
+- **No deadline on slow properties.** One expensive generator hangs CI. Fix: per-property timeout/deadline.
+- **Using PBT where there's no invariant.** Forcing a property onto "input X → output Y" is awkward and weak. Fix: write-tests for spec'd examples; PBT for universal rules — layer both.
+## Verify
+1. **The property is non-tautological:** it's a round-trip/metamorphic/oracle/invariant — not a second copy of the implementation. Mutate the code under test (flip a sign, drop an element) and confirm the property *fails*; a property that never fails on injected bugs is testing nothing.
+2. **Edge cases are reached:** the run includes (or has `@example` for) empty, single, boundary, `0`/`-0.0`/`NaN`, max, and tricky-Unicode inputs; coverage/`Hypothesis statistics` shows them exercised.
+3. **Failures shrink to minimal:** introduce a real bug → the reported counterexample is *small and pointed* (e.g. `[0,0]`, `""`, `1`), not a giant random blob. If it doesn't shrink, fix the generator/`assume` (step 4).
+4. **Reproducible:** re-running with the printed seed/`@reproduce_failure`/`seed+path`/regression file reproduces the *same* failure deterministically; the regression artifact is committed.
+5. **Run count + budget:** CI runs ≥1000 cases per cheap property within a per-property deadline; a nightly/extended job runs more with a rotating seed.
+6. **Stateful (if applicable):** the model-based test drives random command sequences, checks SUT==model at each step, and shrinks a failure to the shortest failing command trace.
+7. **Layered:** example tests cover the documented/spec corners; properties cover the universal invariants — both present, neither doing the other's job.
+Done = each function/codec/state machine has at least one non-tautological property (round-trip, idempotence, oracle, metamorphic, invariant, algebraic, or conservation), generators construct valid inputs (not filter) and hit known edges, failures auto-shrink to a minimal reproducible counterexample with a committed seed/regression, stateful systems are checked against a model via random command sequences, and runs are deterministic-but-budgeted in CI with an extended nightly sweep — proven by the bug-injection and shrink checks in 1–3.

package/skills/publish-package-registry/SKILL.md ADDED

@@ -0,0 +1,130 @@
+---
+name: publish-package-registry
+description: Publishes a library to a package registry (npm/PyPI/crates) safely — semver decision, correct artifacts (dual ESM/CJS + types, files allowlist), provenance/signing via OIDC, a pre-publish gate, scoped least-privilege access, and tag-triggered CI release.
+when_to_use: Shipping or fixing a library release to npm/PyPI/crates — oversized/broken publish, missing types on install, error-prone manual releases. NOT deploying a running app/service (deploy-release), writing changelog text (release-notes), or auditing consumed dependencies (supply-chain-sbom-provenance — this PRODUCES and attests your OWN package).
+---
+## When to Use
+Reach for this skill when you are **publishing a library others install**, not deploying a service:
+- "Publish v2 to npm" / "release this crate" / "push the wheel to PyPI"
+- "Consumers get `Could not find a declaration file` — types are missing on install"
+- "Our tarball is 40 MB / shipped `src/` and tests / leaked a `.env`"
+- "Replace our manual `npm publish` with a tag-triggered CI release"
+- "Add provenance / sign the artifact so installs are verifiable"
+- "Ship a prerelease on a `next` dist-tag without moving `latest`"
+NOT this skill:
+- Deploying a running app/service/container to an environment → deploy-release
+- Writing the human-readable changelog / release-notes text → release-notes
+- Auditing/attesting dependencies you *consume* (SBOM of third-party deps) → supply-chain-sbom-provenance (this skill produces+attests the package you *own*)
+- Authoring the bundler/`tsup`/Rollup config that emits the artifacts → configure-bundler-build
+- Wiring versioning across many packages in one repo → setup-monorepo-tooling
+## Steps
+1. **Run the pre-publish gate: never publish off an unverified working tree.** A publish is effectively irreversible: a version number can never be reused, even after unpublishing, and npm restricts unpublish once a version is more than 72 hours old. Gate, in order, and abort on the first failure:
+   ```bash
+   git status --porcelain        # MUST be empty — publish only committed, tagged code
+   <build>                       # tsup/rollup/maturin/cargo build — emit dist/ fresh
+   <test> && <typecheck>         # vitest/pytest + tsc --noEmit; green or stop
+   npm pack --dry-run            # npm: list the EXACT files + unpacked size
+   ```
+   For Python: `python -m build && twine check dist/*`. For crates: `cargo publish --dry-run` and `cargo package --list`. Read the file list out loud — if it contains `src/`, tests, `.env`, `*.map` you didn't intend, or the size jumped, fix the allowlist (step 4) before going further.
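"Reading the file list out loud" in step 1 can also be automated. The sketch below assumes the JSON report printed by `npm pack --dry-run --json` (an array whose first entry has a `files` list of objects with a `path` key); the `FORBIDDEN` patterns and the function name are hypothetical and should be adapted to the project.

```python
import json
from fnmatch import fnmatch
from typing import List, Sequence

# Hypothetical patterns for files that should never ship; adjust per project.
FORBIDDEN = ("src/*", "test/*", "tests/*", "*.map", ".env", ".env.*")

def unexpected_files(pack_report_json: str, forbidden: Sequence[str] = FORBIDDEN) -> List[str]:
    """Return the packed paths that match any forbidden pattern."""
    report = json.loads(pack_report_json)[0]          # one entry per packed package
    paths = [entry["path"] for entry in report["files"]]
    return [p for p in paths if any(fnmatch(p, pattern) for pattern in forbidden)]

# Hand-written sample shaped like the npm report, not real tool output.
sample_report = json.dumps([{
    "files": [
        {"path": "package.json"},
        {"path": "dist/index.mjs"},
        {"path": "dist/index.mjs.map"},
        {"path": "src/index.ts"},
        {"path": ".env"},
    ]
}])
print(unexpected_files(sample_report))
# ['dist/index.mjs.map', 'src/index.ts', '.env']
```

A gate script could pipe the real report into `unexpected_files` and abort the publish when the result is non-empty.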
+2. **Decide the semver bump from the diff, not vibes.** Compare the public API surface, not the commit count.
+   | Change | Bump | Example |
+   |---|---|---|
+   | Removed/renamed export, changed signature, dropped Node/Py version, behavior break | **major** | removed `parse()`; in `0.x` a break may ship without a major, but bump the minor and document it |
+   | New export, new optional param, new overload — old code still compiles | **minor** | added `parse(opts?)` |
+   | Bugfix, perf, types-only fix, docs, internal refactor — public API identical | **patch** | fixed off-by-one |
+   Pre-1.0 (`0.y.z`): treat `0.y` like major (breaking bumps `y`), `0.y.z` like minor/patch. Don't hand-bump if you use Changesets/release-please (step 6) — let the tool compute it from change intents. Never reuse or downgrade a published version.
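The bump table and the pre-1.0 rule in step 2 can be written down as a small function. This is a minimal sketch of that rule as stated here, not the algorithm Changesets or release-please use; the `change` labels (`breaking`, `feature`, `fix`) and the function name are hypothetical.

```python
def next_version(current: str, change: str) -> str:
    """Compute the next version from the kind of public-API change."""
    if change not in {"breaking", "feature", "fix"}:
        raise ValueError(f"unknown change kind: {change!r}")
    major, minor, patch = (int(part) for part in current.split("."))
    if major == 0:
        # Pre-1.0: a break bumps y; features and fixes both bump z.
        if change == "breaking":
            return f"0.{minor + 1}.0"
        return f"0.{minor}.{patch + 1}"
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"

print(next_version("1.4.2", "breaking"))  # 2.0.0
print(next_version("1.4.2", "feature"))   # 1.5.0
print(next_version("0.4.0", "breaking"))  # 0.5.0
```

The function only ever moves forward, which matches the rule that a published version is never reused or downgraded.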
+3. **Make the package importable both ways with types — this is the #1 broken-install cause.** Ship dual ESM+CJS plus a `.d.ts`, and wire `exports` so resolvers actually find them:
+   ```jsonc
+   {
+     "name": "@scope/lib",
+     "version": "2.0.0",
+     "type": "module",
+     "exports": {
+       ".": {
+         "types": "./dist/index.d.ts",   // types FIRST — resolution is order-sensitive
+         "import": "./dist/index.mjs",
+         "require": "./dist/index.cjs"
+       },
+       "./package.json": "./package.json"
+     },
+     "main": "./dist/index.cjs",          // fallback for old resolvers
+     "module": "./dist/index.mjs",
+     "types": "./dist/index.d.ts",
+     "files": ["dist"],                   // allowlist — ONLY dist ships
+     "sideEffects": false,                // lets bundlers tree-shake consumers
+     "repository": { "type": "git", "url": "git+https://github.com/org/lib.git" },
+     "license": "MIT",
+     "engines": { "node": ">=18" }
+   }
+   ```
+   Validate the resolution with `attw --pack` (`@arethetypeswrong/cli`) and `publint` — they catch missing `types` condition, ESM/CJS mismatch, and bad `exports` before users do. Python equivalent: `pyproject.toml` with `[project]` (name, version, license, `requires-python`, `urls`), `py.typed` shipped in the package, and SPDX `license` string. Crates: `Cargo.toml` `[package]` with `description`, `license`, `repository`, `readme`, and an `include = [...]` list.
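The ordering rule for `exports` conditions in step 3 is easy to check mechanically. The sketch below covers only that one rule for the subpath-keyed form shown in the manifest; it is not a substitute for `attw` or `publint`, and the function name and messages are hypothetical.

```python
import json
from typing import List

def exports_problems(package_json_text: str) -> List[str]:
    """Report subpaths whose condition object lacks `types` or lists it late."""
    exports = json.loads(package_json_text).get("exports", {})
    problems: List[str] = []
    for subpath, target in exports.items():
        if not isinstance(target, dict):
            continue  # plain string target such as "./package.json"
        conditions = list(target)  # json.loads keeps the key order of the file
        if "types" not in conditions:
            problems.append(f"{subpath}: missing types")
        elif conditions[0] != "types":
            problems.append(f"{subpath}: types not first")
    return problems

good = '{"exports": {".": {"types": "./dist/index.d.ts", "import": "./dist/index.mjs", "require": "./dist/index.cjs"}, "./package.json": "./package.json"}}'
bad = '{"exports": {".": {"import": "./dist/index.mjs", "types": "./dist/index.d.ts"}}}'
print(exports_problems(good))  # []
print(exports_problems(bad))   # ['.: types not first']
```

Because the check relies on key order, it has to run against the raw `package.json` text rather than a re-serialized copy that may have reordered keys.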
+4. **Control exactly what ships with an allowlist, not a denylist.** Prefer `files` in `package.json` (allowlist) over `.npmignore` (denylist): a forgotten denylist entry leaks files; an allowlist fails safe. Note that npm always includes `package.json`, `README`, `LICENSE`, and the `main`/`bin` targets regardless of `files`; the `types` target ships only if the allowlist covers it. Re-run `npm pack --dry-run` after editing and confirm the count dropped. crates: `include`/`exclude` in `Cargo.toml`. Python: `MANIFEST.in` + `tool.setuptools.packages.find` / hatch `[tool.hatch.build.targets.wheel]`.
+5. **Authenticate with a short-lived, least-privilege credential — never a personal long-lived token in CI.** Order of preference:
+   - **OIDC / trusted publishing (best, no stored secret):** npm provenance + GitHub OIDC, PyPI "Trusted Publisher", crates.io GitHub OIDC. The registry trusts the CI identity directly; nothing to leak or rotate.
+   - **Automation/CI token (next best):** npm *Automation* token (granular, bypasses 2FA prompt in CI), PyPI *project-scoped* API token, crates `CARGO_REGISTRY_TOKEN`. Store in CI secrets, scope to the single package, never to your account.
+   - Enforce **2FA for writes** (npm's `auth-and-writes` mode, or the package-level setting that requires 2FA to publish) for any human-initiated publish. First publish of a *public* scoped package needs `--access public` (scoped defaults to restricted and will 402/403 otherwise).
+6. **Automate the release on a tag — kill the manual `npm publish`.** Manual publishes drift (wrong branch, dirty tree, forgotten build). Use Changesets (or release-please/semantic-release) so the bump+changelog+tag is mechanical, and let CI do the publish with provenance:
+   ```yaml
+   # .github/workflows/release.yml
+   on:
+     push:
+       tags: ['v*']             # tag-triggered; assumes release tags look like v2.0.0
+   permissions:
+     contents: write
+     id-token: write            # REQUIRED for npm --provenance / PyPI trusted publishing
+   jobs:
+     release:
+       runs-on: ubuntu-latest
+       steps:
+         - uses: actions/checkout@v4
+         - uses: actions/setup-node@v4
+           with: { node-version: 20, registry-url: 'https://registry.npmjs.org' }
+         - run: npm ci && npm run build && npm test
+         - run: npm publish --provenance --access public --tag latest
+           env:
+             NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}   # block style: an unquoted ${{ }} breaks a { } flow mapping
+   ```
+   `--provenance` (with `id-token: write`) cryptographically links the published tarball to the source commit + workflow, visible as a verified badge on npm. PyPI's Trusted Publisher flow can produce comparable Sigstore-backed attestations through the same OIDC identity; for crates, check what crates.io currently supports before promising signed artifacts. Use **dist-tags** deliberately: prereleases → `--tag next` (or `beta`), so `npm install pkg` (which resolves `latest`) never silently jumps to an unstable build. Promote later with `npm dist-tag add pkg@x.y.z latest`.
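The dist-tag rule in step 6 can be derived from the version string itself, so a release script never has to remember it. This is a small sketch assuming plain semver strings; the function name and the default `next` tag are illustrative choices.

```python
def dist_tag_for(version: str, prerelease_tag: str = "next") -> str:
    """Pick the npm dist-tag: prereleases stay off `latest`."""
    core, _, _build = version.partition("+")   # build metadata may contain hyphens; ignore it
    return prerelease_tag if "-" in core else "latest"

print(dist_tag_for("2.0.0"))           # latest
print(dist_tag_for("2.0.0-beta.1"))    # next
print(dist_tag_for("2.0.0+build-5"))   # latest
```

A CI step could pass the result to `npm publish --tag`, which removes the chance of a beta silently becoming `latest`.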
+7. **Verify the real artifact in a clean environment** (see Verify) — building locally proves nothing about what consumers actually receive.
+## Common Errors
+- **Publishing a dirty/untagged tree.** The tarball includes uncommitted changes that no commit reproduces. Gate on `git status --porcelain` empty + a matching tag before publish.
+- **Missing `types` on install.** No `"types"` condition in `exports` (or it's listed *after* `import`/`require`) → consumers get `any`/`Could not find a declaration file`. Put `types` first in each `exports` entry; verify with `attw --pack`.
+- **ESM/CJS half-shipped.** Only `.mjs` exists but `require` points at it (or vice versa) → `ERR_REQUIRE_ESM` / `Cannot use import`. Emit both, wire both conditions; `publint` flags the mismatch.
+- **Denylist leak.** `.npmignore` forgot `test/` or a fixture with secrets → it ships. Switch to a `files` allowlist; re-check `npm pack --dry-run`.
+- **Oversized tarball.** Shipping `src/`, sourcemaps, `node_modules`, or `.map` blows up install size. Allowlist `dist` only; confirm unpacked size in the pack dry-run.
+- **First public scoped publish fails with 402/403.** Scoped packages default to restricted. Add `--access public` on the first publish.
+- **`id-token: write` missing → provenance silently absent or publish errors.** Provenance and trusted publishing both need that permission on the job; without it `--provenance` fails or no attestation is produced.
+- **Prerelease moved `latest`.** Publishing `2.0.0-beta.1` without `--tag` makes it `latest`, so every fresh install gets the beta. Always tag prereleases `next`/`beta`.
+- **Long-lived personal token in CI.** A leaked account-scoped token can publish *any* of your packages. Use OIDC trusted publishing, or a package-scoped automation token at minimum.
+- **Reusing/forcing a version.** The registry rejects a duplicate version, and unpublish windows are tiny. Bump forward — there is no "fix the same version" path.
+- **`sideEffects` unset on a side-effect-free lib.** Consumers can't tree-shake your exports; bundle size leaks downstream. Set `"sideEffects": false` (or list the few files that do have side effects).
+## Verify
+1. **Pack and inspect:** `npm pack` (or `python -m build` / `cargo package`) and list the tarball contents — it contains *only* the allowlisted build output, expected size, no `src`/tests/secrets/maps.
+2. **Type/contract lint:** `attw --pack` and `publint` report zero errors (npm); `twine check dist/*` passes (PyPI). No missing/mis-ordered `exports` conditions.
+3. **Clean-room install from the tarball:** in an empty temp dir, `npm init -y && npm i ../lib-2.0.0.tgz` (or `pip install dist/lib-2.0.0-*.whl` in a fresh venv). Install must succeed with no peer/engine warnings you didn't intend.
+4. **Import both module systems with types:** in that clean project, `node -e "import('lib').then(m=>console.log(m.default))"` AND `node -e "require('lib')"` both resolve; a `.ts` file importing `lib` typechecks under `tsc` with no `any`. Python: `import lib` in the fresh venv and `mypy` sees the shipped `py.typed`.
+5. **Semver matches the diff:** the chosen bump (step 2) is justified by the actual public-API delta, and the new version is strictly greater than the latest published one (`npm view pkg version`).
+6. **Provenance/signature present:** after a CI publish, the npm page shows the verified provenance badge (or `cosign verify`/sigstore attestation validates for PyPI/crates), tracing the artifact to the source commit + workflow run.
+7. **Dist-tag correct:** `npm dist-tag ls pkg` shows the prerelease on `next`/`beta` and `latest` still points at the last stable — a default `npm i pkg` does not pull the prerelease.
+Done = the gate is green on a clean tagged tree, the packed tarball installs and imports both ESM and CJS with working types in a clean room, the semver bump matches the API diff, and CI published it on the correct dist-tag with verifiable provenance.

package/skills/recover-git-state/SKILL.md ADDED

@@ -0,0 +1,119 @@
+---
+name: recover-git-state
+description: Recovers lost or broken git state — restores dropped commits/branches/stashes via reflog and fsck, pins a regression with git bisect, and safely undoes a bad reset/rebase/merge with revert or reset --soft/--mixed/--hard — without destroying still-recoverable objects.
+when_to_use: Work appears gone after a reset --hard, bad rebase, deleted branch, dropped stash, or detached HEAD, or you need to pin which commit introduced a bug. NOT intentional history editing (rewrite-git-history), conflict-marker resolution (resolve-merge-rebase-conflict), or diagnosing a non-git code failure (debug-root-cause).
+---
+## When to Use
+Reach for this when work *appears* gone or HEAD landed somewhere wrong — the commits almost always still exist as unreachable objects:
+- "I lost my commits" / "my branch is empty" after a `reset --hard`, bad `rebase`, or force-fetch
+- "I deleted the wrong branch" (`git branch -D feature`)
+- "I'm in detached HEAD and made commits — did I lose them?"
+- "I `git stash drop`'d / `stash pop`'d into a conflict and lost a stash"
+- "Which commit broke this?" — a regression to pin across a known-good..bad range
+- A merge/rebase/`reset` made things worse and you want it back exactly as it was
+NOT this skill:
+- *Intentionally* rewriting history (squash, reword, rebase to clean up, strip a secret) → rewrite-git-history
+- Resolving the conflict markers from an in-progress merge/rebase/cherry-pick → resolve-merge-rebase-conflict
+- The code itself is failing (test/crash/wrong output) and git history is fine → debug-root-cause
+- The file was never committed and isn't in any stash → git can't recover it; check editor local history / backups
+## Steps
+1. **STOP and snapshot before touching anything.** Recovery commands can themselves overwrite refs. Freeze current state so you can't make it worse:
+   ```bash
+   git stash list; git status; git log --oneline -5   # what do we actually have?
+   git branch _backup_$(date +%s)                       # pin current HEAD as a named ref
+   ```
+   Never run `reset --hard`, `checkout`, `rebase`, or `gc` until you've located the SHA you want back.
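+2. **Find the lost SHA: reflog is the safety net.** Every move of HEAD (and of each branch, while that branch exists) is logged, even after `reset --hard`. By default entries are kept for 90 days, or 30 days once the commit is no longer reachable from the ref's current tip. Deleting a branch deletes that branch's own reflog, but HEAD's reflog still records the commits you had checked out: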
+   ```bash
+   git reflog                       # every HEAD move: HEAD@{0}, HEAD@{1}, ...
+   git reflog show feature          # moves of one branch ref specifically
+   git reflog --date=relative | grep -iE 'commit|reset|rebase|checkout'
+   ```
+   The entry *just before* the bad operation is your target (e.g. `HEAD@{2}` = "before the reset"). Copy its SHA.
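Picking "the entry just before the bad operation" from step 2 can be sketched as a small parser. It assumes the default undecorated one-line format (`<sha> HEAD@{n}: <action>: <message>`, for example from `git reflog --no-decorate`); the sample lines and SHAs are invented for illustration.

```python
import re
from typing import Iterable, Optional

# One reflog line: "<abbrev-sha> <ref>@{n}: <action and message>"
REFLOG_LINE = re.compile(r"^(?P<sha>[0-9a-f]+) (?P<ref>\S+@\{\d+\}): (?P<action>.*)$")

def sha_before(reflog_lines: Iterable[str], bad_action: str) -> Optional[str]:
    """Return the SHA HEAD pointed at just before the newest `bad_action` entry."""
    entries = [m for line in reflog_lines if (m := REFLOG_LINE.match(line.strip()))]
    for index, entry in enumerate(entries):          # newest entry comes first
        if entry["action"].startswith(bad_action):
            older = entries[index + 1:index + 2]     # the next line is the prior state
            return older[0]["sha"] if older else None
    return None

sample = [
    "1a2b3c4 HEAD@{0}: reset: moving to HEAD~2",
    "9f8e7d6 HEAD@{1}: commit: add retry logic",
    "5c4b3a2 HEAD@{2}: commit: parse config",
]
print(sha_before(sample, "reset"))  # 9f8e7d6
```

The returned SHA is what you would hand to `git branch rescue <sha>`; the script only reads text and never touches the repository.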
+3. **Recover by creating a ref — never `checkout` a loose SHA bare.** Pick the action by what you lost:
+   | Lost | Command | Note |
+   |---|---|---|
+   | A deleted branch | `git branch feature <sha>` | `<sha>` from `git reflog` (HEAD's log; the branch's own reflog is deleted with the branch) or `fsck` |
+   | Commits after a `reset --hard` | `git reset --hard HEAD@{1}` | moves current branch back; **discards** anything since |
+   | Commits, but keep current work | `git branch rescue <sha>` | safe — inspect/`cherry-pick` from `rescue`, then delete it |
+   | Detached-HEAD commits | `git branch keep <sha>` *before* you `checkout` away | bare checkout-away orphans them |
+   Default to `git branch rescue <sha>` — it's non-destructive. Use `reset --hard HEAD@{n}` only when you're sure everything after it is garbage.
+4. **Recover a dropped/popped stash via fsck.** A dropped stash isn't in `reflog` but lives as an unreachable commit:
+   ```bash
+   git fsck --no-reflogs --unreachable | grep commit | awk '{print $3}' \
+     | xargs -I{} git log -1 --format='%H %ci %s' {} | grep -i 'WIP on'
+   git stash apply <sha>            # or: git branch stash_recovered <sha>
+   ```
+   A stash commit's subject starts with `WIP on <branch>:`. `git fsck --lost-found` also drops them under `.git/lost-found/`.
+5. **Undo by audience: published → `revert`, local → `reset`.** This is the one rule that prevents a second disaster:
+   | Situation | Use | Why |
+   |---|---|---|
+   | Bad commit already pushed / shared | `git revert <sha>` | adds an inverse commit — no history rewrite, safe for collaborators |
+   | Local mistake, want it gone | `git reset` (see below) | rewrites your local branch; never on shared history |
+   `reset` mode, decided by where you want the changes to land:
+   - `--soft HEAD~1` → undo the commit, keep changes **staged** (re-commit cleanly)
+   - `--mixed HEAD~1` (default) → undo commit, keep changes in **working tree**, unstaged
+   - `--hard HEAD~1` → undo commit **and discard** the changes — destructive; only after step 1's backup
+6. **Restore working-tree files (not commits) with `git restore`.** Discarded edits or a wrong file version:
+   ```bash
+   git restore path/to/file              # revert working-tree file to HEAD (uncommitted edits gone)
+   git restore --source=<sha> file       # pull one file's content from a specific commit
+   git restore --staged path             # unstage only (keep working-tree edits)
+   ```
+   `git checkout -- file` is the old spelling; prefer `restore`.
+7. **Pin a regression with `git bisect`.** Binary-search the first bad commit across a known-good..bad range:
+   ```bash
+   git bisect start
+   git bisect bad                 # current HEAD is broken
+   git bisect good v1.4.0         # last known-good tag/sha
+   # git checks out the midpoint; test it, then mark good/bad and repeat
+   git bisect good   # or: git bisect bad
+   ```
+   Automate it — let git drive every step with an exit-coded script:
+   ```bash
+   git bisect run ./test.sh       # script: exit 0 = good, 1-127 except 125 = bad, 125 = skip (untestable); above 127 aborts the bisect
+   ```
+   When done it prints `<sha> is the first bad commit`. **Always** `git bisect reset` to return to your original HEAD.
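The exit-code contract for `git bisect run` in step 7 is the part scripts most often get wrong, so here is a minimal sketch of a driver that maps "build broken" to skip and "tests failed" to bad. The `make build` and `make test` commands are placeholders for whatever the project actually uses, and `classify` and `main` are hypothetical names.

```python
import subprocess
from typing import List, Optional

GOOD, BAD, SKIP = 0, 1, 125   # exit codes understood by `git bisect run`

def classify(build_returncode: int, test_returncode: Optional[int]) -> int:
    """Translate build/test results into a bisect verdict."""
    if build_returncode != 0:
        return SKIP            # cannot build: untestable commit, not a bad one
    return GOOD if test_returncode == 0 else BAD

def main(build_cmd: Optional[List[str]] = None, test_cmd: Optional[List[str]] = None) -> int:
    """Build, then test, and return the verdict for the checked-out commit."""
    build = subprocess.run(build_cmd or ["make", "build"])   # placeholder command
    if build.returncode != 0:
        return classify(build.returncode, None)
    test = subprocess.run(test_cmd or ["make", "test"])      # placeholder command
    return classify(0, test.returncode)

print(classify(2, None))  # 125
print(classify(0, 0))     # 0
print(classify(0, 3))     # 1
```

Saved as a script that ends with `sys.exit(main())`, this could be passed to `git bisect run`. Collapsing every test failure to exit code 1 also keeps a crashing test runner from returning a code above 127 and aborting the bisect.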
+8. **Run `git gc` only after you've recovered and verified.** Unreachable objects survive until garbage collection. Don't run `git gc --prune=now` or `git reflog expire` while a recovery is pending — that's what permanently deletes the SHAs you're hunting.
+## Common Errors
+- **`checkout`ing a loose SHA, making commits, then leaving — they're orphaned again.** Always `git branch <name> <sha>` *first*; commit onto a real ref.
+- **`reset --hard HEAD@{1}` to "go back," wiping current uncommitted work.** `--hard` discards the working tree. Stash or `git branch _backup` first (step 1); use `git branch rescue` instead when unsure.
+- **`reflog` is empty / "ambiguous argument".** You're in a fresh clone or a different repo — reflog is per-local-clone and not pushed. Use `git fsck --lost-found` to find unreachable commits by content instead.
+- **`revert`ing a merge commit fails or reverts the wrong side.** Pass the parent: `git revert -m 1 <merge-sha>` (mainline = parent 1). Reverting a merge has its own re-merge gotcha; for shared history that's still safer than rewriting.
+- **`stash pop` hit a conflict and you assumed the stash was lost.** A *conflicted* `pop` does **not** drop the stash — it's still in `git stash list`. Resolve, then `git stash drop`.
+- **Ran `git gc` / `git reflog expire --expire=now --all` and now the SHA is gone.** Pruning deleted the unreachable objects. Recover the ref *before* any gc; check `.git/lost-found/` and cloned mirrors.
+- **`bisect` mislabels because the build is broken at some midpoints.** Have the script `exit 125` (skip) for uncompilable commits so bisect routes around them instead of falsely marking bad.
+- **Forgot `git bisect reset`.** You're left in detached HEAD on a midpoint commit, confusing later work. Reset returns you to the pre-bisect branch.
+- **Using `reset` to undo *pushed* commits, then force-pushing.** Rewrites shared history and breaks teammates. On published commits use `git revert`; rewriting is rewrite-git-history's job, not a recovery.
+## Verify
+Recovery is complete and correct when:
+1. **The recovered ref points at the right tree.** `git diff <known-good-sha> <recovered-ref>` is empty (or shows exactly the intended delta) — confirms you grabbed the right SHA, not a neighbor.
+2. **The expected commits are reachable.** `git log --oneline <recovered-ref>` shows the commits that were "lost," and `git status` is clean (no surprise staged/modified files).
+3. **A recovered stash applied cleanly.** `git stash show -p <sha>` matches the work you expected; after `apply` the files contain the WIP changes.
+4. **An `undo` did what its audience requires.** After `revert`: a new inverse commit exists and `git log` history is intact (nothing rewritten). After `reset`: `git status` shows the changes in the intended state (staged for `--soft`, unstaged for `--mixed`, gone for `--hard`).
+5. **Bisect named one culprit and cleaned up.** Output ended with `<sha> is the first bad commit`, `git show <sha>` plausibly explains the regression, and `git bisect reset` returned HEAD to the starting branch (`git status` confirms the original branch, not detached).
+6. **Nothing was pruned mid-recovery.** No `git gc`/`reflog expire` ran before the ref was secured; the `_backup` branch from step 1 still exists as a fallback.
+Done = the recovered commits/branch/stash are on a named ref whose tree matches the known-good SHA (`git diff` empty), any undo matched its published-vs-local audience (revert vs reset mode), and bisect (if used) named the first bad commit with `git bisect reset` leaving HEAD on the original branch.

package/skills/remediate-web-vulnerabilities/SKILL.md ADDED

@@ -0,0 +1,125 @@
+---
+name: remediate-web-vulnerabilities
+description: Fixes specific web vulnerability classes — SQL/command injection, XSS, CSRF, SSRF, IDOR/broken access, insecure deserialization — by applying the canonical hardening (parameterized queries, args-array exec, context-aware output encoding + CSP, SameSite + synchronizer tokens, egress allowlists, per-owner authorization, safe deserialization) and proving each fix with a regression test that replays the exploit.
+when_to_use: A specific vuln was found (review, scan, pentest) or an input/output path needs proactive hardening. Distinct from security-review (finds and reports vulns, does not fix), design-authorization-model (authZ architecture, not a single IDOR patch), and defend-llm-prompt-injection (LLM/prompt-specific, not classic web).
+---
+## When to Use
+- "We got a finding for SQL injection / XSS / SSRF — fix it"
+- "A pentest flagged IDOR on `GET /orders/:id` — anyone can read any order"
+- "Sanitize this user input before it hits the shell / the DB / the page"
+- "Harden this URL-fetch endpoint so it can't hit our metadata service"
+- "Stop us deserializing untrusted JSON/pickle/YAML into objects"
+- Proactively hardening a newly added input or output path before ship
+NOT this skill:
+- *Finding* and reporting vulns in a diff (no specific one yet) → security-review
+- Designing the overall authZ model (roles, policies, tenancy) instead of patching one missing ownership check → design-authorization-model
+- Prompt injection / tool-abuse / jailbreaks against an LLM → defend-llm-prompt-injection
+- Stress-finding *new* input bugs by mutation/fuzzing → fuzz-dynamic-security-test
+- Moving plaintext secrets out of code/IaC and rotating them → secrets-management
+- WAF rules / managed rulesets at the edge as a *compensating* control → setup-cdn-edge-waf (a WAF is defense-in-depth, never the fix)
+Fix the **class, not the instance**: when you find one concatenated query, grep the codebase and fix every sibling. A WAF rule or input blocklist is a band-aid — remove the unsafe construct.
+## Steps
+1. **Identify the class, then apply its canonical fix.** Do not invent ad-hoc escaping — each class has one correct construct:
+   | Class | Root cause | Canonical fix (do this) | Never (the band-aid) |
+   |---|---|---|---|
+   | **SQLi** | String-built query | Parameterized query / bound params; ORM with bindings | Escaping quotes, blocklisting `'`/`;`/`--` |
+   | **Command injection** | Shell-interpreted string | No shell: exec with **args array**; allowlist binaries/flags | `escapeshellarg`-then-concat into `sh -c` |
+   | **XSS** | Untrusted data in HTML | Framework auto-escaping + **context-aware** encoding + CSP; sanitize HTML with DOMPurify | Regex strip of `<script>`, `innerHTML` of user data |
+   | **CSRF** | Ambient cookie auth | `SameSite=Lax/Strict` + **synchronizer token** on state-changers | Checking `Referer` only |
+   | **SSRF** | User-controlled fetch URL | **Allowlist** dest hosts; block link-local/metadata/private ranges; pin resolved IP | Blocklisting `localhost`/`127.0.0.1` strings |
+   | **IDOR / broken access** | authN ≠ authZ | Authorize **every object by owner/tenant**, server-side, on the resolved row | Hiding the ID in the UI; UUIDs as "security" |
+   | **Insecure deserialization** | Untrusted bytes → objects | Don't deserialize untrusted data; use data-only formats (JSON) + schema validate | `pickle.loads`, `yaml.load`, Java native, `unserialize()` on input |
+2. **SQLi — parameterize, never concatenate.** Pass user data as bound parameters so it is never parsed as SQL. Identifiers (table/column names) can't be parameters — allowlist them against a fixed set.
+   ```python
+   # BAD: db.execute(f"SELECT * FROM users WHERE email = '{email}'")
+   db.execute("SELECT * FROM users WHERE email = %s", (email,))           # psycopg
+   # ORM: User.query.filter_by(email=email)   # SQLAlchemy binds automatically
+   # Dynamic column: pick from an allowlist, don't interpolate the raw value
+   ALLOWED = {"name", "created_at"}
+   if sort not in ALLOWED: raise ValueError("bad sort column")            # reject, don't interpolate
+   ```
+3. **Command injection — drop the shell, pass an args array.** The shell is the vuln; `shell=True` / `sh -c` / `os.system` interpret metacharacters. Pass argv as a list so the OS execs the binary directly with no parsing.
+   ```python
+   # BAD: subprocess.run(f"convert {path} out.png", shell=True)
+   subprocess.run(["convert", path, "out.png"], shell=False, check=True)   # path is one argv element, never re-parsed
+   ```
+   If a value must be a flag, allowlist it (`if mode not in {"fast","hq"}: reject`). Never build a flag string from user input.
+4. **XSS — encode for the output context, add CSP, sanitize HTML only with a vetted lib.** Escaping for HTML body ≠ for an attribute ≠ for JS ≠ for a URL. Let the template engine auto-escape (Jinja `autoescape`, React `{}` text nodes) and don't defeat it (`| safe`, `dangerouslySetInnerHTML`, `v-html`, `innerHTML`). If you must render user HTML, run it through **DOMPurify** first:
+   ```js
+   el.textContent = userInput;                       // default: text, auto-safe
+   el.innerHTML = DOMPurify.sanitize(userHtml);      // ONLY when HTML is required
+   // Never build href/src from raw input — reject non-http(s) schemes (blocks javascript:)
+   ```
+   Add a strict CSP as defense-in-depth: `Content-Security-Policy: default-src 'self'; object-src 'none'; base-uri 'none'` — no `unsafe-inline`/`unsafe-eval`. CSP is a second wall, not the fix.
+5. **CSRF — `SameSite` cookies + synchronizer token.** Set session cookies `SameSite=Lax` (or `Strict`), `Secure`, `HttpOnly`. For every state-changing request (`POST/PUT/PATCH/DELETE`), require a per-session CSRF token (double-submit cookie or server-stored), compared with a constant-time check. Pure token-auth APIs (`Authorization: Bearer`, no cookies) are not CSRF-prone — don't bolt tokens onto those.
+6. **SSRF — allowlist egress, block internal ranges, fetch the pinned IP.** If the destination is user-controlled, default-deny:
+   - Allowlist the exact hostnames/domains you intend to reach; reject everything else.
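+   - Resolve the host, then **reject** `127.0.0.0/8`, `::1`, `10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`, `169.254.0.0/16` (link-local → cloud metadata `169.254.169.254`), `fc00::/7`, `fe80::/10`, and `0.0.0.0/8`. Unwrap IPv4-mapped IPv6 addresses (`::ffff:127.0.0.1`) before checking.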
+   - Resolve once, validate that IP, and connect to **that IP** (pass it explicitly / pin it) to kill DNS-rebinding TOCTOU.
+   - Disable redirects, or re-validate the target on each hop. Never follow a redirect to an unvalidated host.
+7. **IDOR / broken access — authorize every object by owner, server-side.** Authentication tells you *who*; you still must check *they own this row*. Scope the query or assert ownership on the resolved object — never trust an ID from the request as proof of access.
+   ```python
+   # BAD: order = Order.get(request.params["id"])   # any id -> anyone's order
+   order = Order.get(id=request.params["id"], owner_id=current_user.id)    # scope to caller
+   if order is None: raise NotFound()   # 404, not 403 — don't confirm the row exists
+   ```
+   This is the single-instance patch. If you're (re)designing roles/policies/multi-tenancy, that's design-authorization-model.
+8. **Insecure deserialization — never deserialize untrusted bytes.** Object-deserializers (`pickle`, `yaml.load`, Java `ObjectInputStream`, PHP `unserialize`, .NET `BinaryFormatter`) can execute code or instantiate arbitrary types. Accept only data formats and validate against a schema:
+   ```python
+   # BAD: obj = pickle.loads(body)   /   data = yaml.load(body, Loader=yaml.Loader)
+   data = json.loads(body)              # data only, no code execution
+   # data = yaml.safe_load(body)        # use instead of json.loads if YAML is required
+   Payload.model_validate(data)         # pydantic: enforce shape/types
+   ```
+   If signed objects are unavoidable, verify an HMAC over the bytes *before* deserializing.
+9. **Sweep the class and write the regression test.** Grep for every sibling of the fixed pattern (`grep -rn "shell=True"`, `f"SELECT`, `innerHTML =`, `pickle.loads`, `yaml.load(`, `dangerouslySetInnerHTML`). Then write a test that sends the **actual exploit payload** and asserts it no longer triggers (Verify).
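The sweep in step 9 can also be scripted so it runs in CI instead of by hand. The following is a minimal sketch, assuming the app code is plain-text source under one directory; `sweep`, `UNSAFE_PATTERNS`, and the default excluded directories are hypothetical names chosen for illustration, and the patterns simply mirror the grep targets listed above.

```python
import re
from pathlib import Path

# Unsafe constructs taken from the grep list in step 9; extend per codebase.
UNSAFE_PATTERNS = {
    "shell=True": re.compile(r"shell\s*=\s*True"),
    "f-string SQL": re.compile(r"""f["']\s*SELECT""", re.IGNORECASE),
    "innerHTML assignment": re.compile(r"innerHTML\s*="),
    "pickle.loads": re.compile(r"pickle\.loads\("),
    "yaml.load": re.compile(r"yaml\.load\("),  # does not match yaml.safe_load(
    "dangerouslySetInnerHTML": re.compile(r"dangerouslySetInnerHTML"),
}


def sweep(root, exclude_dirs=("tests", "node_modules", ".git")):
    """Return (path, line_number, label) for each unsafe construct under root."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Skip fixtures and vendored code (assumed directory names).
        if any(part in exclude_dirs for part in path.relative_to(root).parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file
        for lineno, line in enumerate(text.splitlines(), start=1):
            for label, pattern in UNSAFE_PATTERNS.items():
                if pattern.search(line):
                    hits.append((str(path), lineno, label))
    return hits
```

A CI job could call `sweep("src")` and fail the build when the returned list is non-empty. A regex sweep only finds textual siblings; it is a way to locate candidates for review, not proof that none remain.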
+## Common Errors
+- **Escaping/blocklisting instead of parameterizing.** Quote-escaping and `'`/`;` blocklists are bypassable (unicode, comments, encoding). Use bound parameters — fix the construct.
+- **`shell=True` "but I escaped it".** `escapeshellarg`-then-concat still goes through the shell and gets bypassed. Pass an argv array with `shell=False`; no shell, no metacharacters.
+- **Sanitizing on input, rendering elsewhere.** Input sanitization can't know the output context. Encode at the point of output (HTML vs attr vs JS vs URL); store data raw.
+- **Trusting a custom XSS regex.** Hand-rolled HTML filters miss `onerror=`, `javascript:`, SVG, mutation XSS. Use DOMPurify; never `innerHTML` raw input.
+- **CSP with `unsafe-inline`/`unsafe-eval`.** Negates the protection against injected `<script>`. Use nonces/hashes; if you can't, you haven't fixed the XSS — CSP was only the backstop.
+- **SSRF fix that blocks strings, not IPs.** Blocking `"localhost"` misses `0`, `0x7f.1`, `[::1]`, decimal IPs, and DNS-rebinding. Resolve and check the **IP** against CIDR ranges, then connect to that resolved IP.
+- **Following redirects after SSRF validation.** Validated host 302-redirects to `169.254.169.254`. Disable redirects or re-validate every hop.
+- **IDOR "fixed" by switching to UUIDs / hiding the ID.** Obscurity isn't authorization. The fix is the server-side owner/tenant check on the resolved object.
+- **Adding authN where authZ is missing.** Logged-in ≠ authorized for *this* object. Scope the lookup to the caller.
+- **`yaml.load` left as the "safe" one.** `yaml.load` with `Loader=yaml.Loader` (or plain `yaml.load(body)` on PyYAML before 5.1) constructs arbitrary objects. Use `yaml.safe_load`. Likewise, `pickle`/`BinaryFormatter`/`unserialize` on input are never safe; switch to JSON + schema.
+- **Fixing the reported instance, leaving the class.** The scanner found one; the same pattern lives in ten other files. Grep and fix all, or it regresses next sprint.
+- **Calling a WAF rule the fix.** A blocked payload at the edge while the unsafe code remains is unfixed — the next encoding gets through. WAF is defense-in-depth (setup-cdn-edge-waf), not remediation.
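The two SSRF items above come down to validating the resolved IP rather than the hostname string. The sketch below uses only the standard library; `ALLOWED_HOSTS`, `is_internal`, `resolve_and_validate`, and the example hostname are hypothetical, and it assumes the caller then connects to the returned IP with redirects disabled instead of resolving the name a second time.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist of intended destinations

BLOCKED_NETS = [ipaddress.ip_network(n) for n in (
    "0.0.0.0/8", "127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12",
    "192.168.0.0/16", "169.254.0.0/16", "::1/128", "fc00::/7", "fe80::/10",
)]


def is_internal(ip_text):
    """True if the address falls in a loopback, private, or link-local range."""
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 such as ::ffff:127.0.0.1 before checking.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return any(ip in net for net in BLOCKED_NETS)


def resolve_and_validate(url, resolver=socket.getaddrinfo):
    """Return (ip, port) to connect to, or raise ValueError if the URL is not allowed."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        raise ValueError("unsupported URL")
    if parts.hostname not in ALLOWED_HOSTS:
        raise ValueError("host not allowlisted")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # Resolve once; every returned address must pass, then the caller pins one.
    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    ips = [info[4][0] for info in infos]
    if not ips or any(is_internal(ip) for ip in ips):
        raise ValueError("resolves to an internal address")
    return ips[0], port
```

The `resolver` parameter exists so tests can inject a fake DNS answer. When a redirect has to be followed, the same function can be run on each `Location` target before connecting.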
+## Verify
+A fix is proven only when an automated test reproduces the original exploit and asserts it's now inert:
+1. **SQLi:** Send `' OR '1'='1` / `'; DROP TABLE--` as the input → response is a normal empty/auth-fail result, no extra rows, no error leaking SQL; query log shows it ran as a bound parameter.
+2. **Command injection:** Send `; id`, `$(id)`, `` `id` ``, `| cat /etc/passwd` as the value → no extra process runs, no command output in the response; a sentinel file the payload tries to create does not exist.
+3. **XSS:** Submit `<img src=x onerror=alert(1)>` and `javascript:alert(1)` → response renders them as **escaped text** (`&lt;img...`), not live markup; assert the raw, unescaped `<img`/`<script` tag sequence is absent from the HTML. Confirm the `Content-Security-Policy` header is present and free of `unsafe-inline`.
+4. **CSRF:** Replay a state-changing `POST` with a valid session cookie but **no/forged** CSRF token → rejected (`403`); the same request with a valid token → succeeds.
+5. **SSRF:** Request each blocked target (`http://169.254.169.254/`, `http://127.0.0.1`, `http://[::1]`, a decimal-IP form such as `http://2130706433/`, and a host that 30x-redirects to `169.254.169.254`) → all rejected, and the test asserts that no socket to an internal range was ever opened; only an explicitly allowlisted host returns `200`.
+6. **IDOR:** As user A, request user B's object ID → `404` (not the row, not a `403` that confirms existence); as the owner → `200`. Run it for every object-scoped route you touched.
+7. **Deserialization:** Feed a malicious pickle/`!!python/object` YAML/gadget payload → rejected at schema validation, no object instantiated, no code executed (sentinel side-effect absent).
+8. **Class sweep:** The grep for the unsafe construct returns zero remaining hits in app code (excluding the test fixtures that hold the payloads).
+Done = the exploit-replay test for the fixed class **failed before** the change and **passes after**, the canonical construct (not a blocklist/WAF rule) is in place, the class-wide grep is clean, and no Critical regression remains.
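An exploit-replay test for the SQLi case might look like the sketch below. It uses an in-memory SQLite database as a stand-in for the real one; the `users` table and the `find_user_unsafe` / `find_user` helpers are hypothetical and exist only to show the failed-before, passes-after shape.

```python
import sqlite3

PAYLOAD = "' OR '1'='1"  # the SQLi payload from the Verify list


def make_db():
    """Build a throwaway database with two users."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    db.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    return db


def find_user_unsafe(db, email):
    # Pre-fix construct: the input is spliced into the SQL text.
    return db.execute(f"SELECT id FROM users WHERE email = '{email}'").fetchall()


def find_user(db, email):
    # Fixed construct: the input travels as a bound parameter (sqlite3 uses ?).
    return db.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchall()


def test_sqli_payload_is_inert():
    db = make_db()
    # The exploit reproduces against the old code: every row comes back.
    assert len(find_user_unsafe(db, PAYLOAD)) == 2
    # With a bound parameter the payload is just a string that matches nothing.
    assert find_user(db, PAYLOAD) == []
    # A normal lookup still works.
    assert find_user(db, "a@example.com") == [(1,)]
```

In a real suite the unsafe helper would not be kept around; the first assertion stands in for running the same test against the pre-fix commit and seeing it fail.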