npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.0 - Mend

sanook-cli 0.4.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (235)

package/.env.example +19 -0
package/CHANGELOG.md +144 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +394 -51
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +2 -2
package/dist/providers/keys.js +3 -2
package/dist/providers/registry.js +133 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +218 -27
package/dist/ui/banner.js +4 -9
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/setup.js +6 -5
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/build-audit-logging/SKILL.md ADDED

@@ -0,0 +1,112 @@
+---
+name: build-audit-logging
+description: Builds tamper-evident audit logging — structured actor/action/target/result records for security-relevant events, append-only hash-chained or WORM/object-lock storage, PII-safe payloads that log references not raw data, and regulation-driven retention — to satisfy SOC2/HIPAA-style controls and support incident forensics.
+when_to_use: A system needs a defensible, queryable record of sensitive actions (access, permission/config changes, admin ops) for compliance or forensics. Distinct from observability-instrument (operational logs/metrics/traces for debugging) and map-privacy-data-gdpr (data-subject rights and lawful-basis mapping).
+---
+## When to Use
+Reach for this skill when the requirement is a **defensible record of who did what to whom**, not operational telemetry:
+- "We need an audit trail for SOC2 / HIPAA / PCI — access, admin actions, config changes"
+- "Auditors want to know who changed this permission / exported this report / read this patient record"
+- "After the breach, prove what the attacker touched and that nobody edited the logs"
+- "Log every admin override / impersonation / data export, immutably"
+- "Make sensitive-action history queryable for investigations and legal hold"
+NOT this skill:
+- Debugging latency/errors with logs, metrics, traces, dashboards → observability-instrument (operational, sampled, short-retention — the opposite of an audit log)
+- Data-subject access/erasure requests, consent, lawful basis, retention *policy* for personal data → map-privacy-data-gdpr
+- *Deciding* whether an action is allowed (the policy engine itself) → design-authorization-model (audit logging records the decision; it does not make it)
+- An immutable append-only store as the system of record for business state (rebuildable projections) → design-event-sourcing-cqrs
+- Storing/rotating the secrets and signing keys this log references → secrets-management
+- Running the actual breach investigation/postmortem → incident-response-sre (this skill makes that investigation *possible*)
+## Steps
+1. **Enumerate auditable events first — code to a closed list, not "log everything."** An audit log with too much noise is as useless as one with gaps. Audit exactly the security-relevant control points:
+   | Category | Examples | SOC2 (TSC) | HIPAA |
+   |---|---|---|---|
+   | Authentication | login success/fail, MFA, logout, password/key change, session revoke | CC6.1 | §164.312(b) |
+   | Authorization decisions | access denied, privilege grant/revoke, role change, impersonation start/stop | CC6.3 | §164.308(a)(4) |
+   | Sensitive data access | read/export/print of PII/PHI/financial records, bulk query, report download | CC6.1 / CC7.2 | §164.312(b) audit controls |
+   | Config / security changes | feature flag, retention policy, encryption setting, integration/webhook, IAM policy | CC8.1 | §164.308(a)(1) |
+   | Admin / break-glass ops | user delete, data purge, override, prod DB access, support impersonation | CC6.1 | §164.308(a)(3) |
+   Define this list with security/compliance, not ad hoc per feature. Each event gets a stable `action` constant (e.g. `user.role.granted`, `record.exported`) — never a free-text string you can't query or version.
+2. **Fix one structured schema and emit it everywhere.** Required fields, machine-parseable (JSON), one shape across services:
+   ```json
+   {
+     "id": "01J8...ULID",                  // unique, sortable, dedup key
+     "ts": "2026-06-15T09:41:02.117Z",     // UTC, ISO-8601, ms precision, server clock
+     "action": "record.exported",          // from the closed list, dotted, versioned
+     "actor": { "type": "user", "id": "u_8821", "auth": "session", "on_behalf_of": "support_agent_31" },
+     "target": { "type": "patient_record", "id": "p_5567" },   // id/reference ONLY
+     "result": "allow",                     // allow | deny | error
+     "reason": "policy:export.phi.granted", // why, esp. for deny
+     "source_ip": "203.0.113.7",            // normalized from trusted proxy header
+     "user_agent": "...",
+     "request_id": "req_9f3c",              // correlation id → ties to app/trace logs
+     "tenant": "org_204",
+     "meta": { "row_count": 1420, "format": "csv" }  // counts/refs, NEVER raw payload
+   }
+   ```
+   Use **ULID/UUIDv7** for `id` (sortable + a natural dedup key for at-least-once emitters). `on_behalf_of` is mandatory whenever an admin/support acts as another user — impersonation without it is an audit gap auditors will flag.
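A minimal Python sketch of steps 1 and 2 above, assuming a hypothetical closed action list and helper names (`ALLOWED_ACTIONS`, `new_event_id`, `build_event`) that are not part of the package. The id is a simplified time-sortable stand-in for ULID/UUIDv7, not an implementation of either format.

```python
import os
import time
from datetime import datetime, timezone

# Hypothetical closed list; a real one is agreed with security/compliance.
ALLOWED_ACTIONS = frozenset({"user.role.granted", "record.exported", "auth.login.failed"})
ALLOWED_RESULTS = frozenset({"allow", "deny", "error"})


def new_event_id() -> str:
    """Time-sortable id: 48-bit millisecond timestamp followed by 80 random bits (hex)."""
    ms = time.time_ns() // 1_000_000
    return f"{ms:012x}{os.urandom(10).hex()}"


def build_event(action, actor, target, result, reason, request_id, tenant, meta=None):
    """Build one audit record in the fixed schema, rejecting free-text actions."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown audit action: {action!r}")
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"unknown result: {result!r}")
    # UTC, ISO-8601, millisecond precision, taken from the server clock.
    ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return {
        "id": new_event_id(),
        "ts": ts,
        "action": action,
        "actor": actor,          # taken from the authenticated session, never the request body
        "target": target,        # type and id only
        "result": result,
        "reason": reason,
        "request_id": request_id,
        "tenant": tenant,
        "meta": dict(meta or {}),
    }
```

Rejecting unknown actions at build time is what keeps the list closed: a new event type has to be added to the list before any service can emit it.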
+3. **Keep the audit log physically separate from application logs.** Different store, different write credentials, different retention. App logs are mutable, sampled, debug-grade; audit logs are not. Mixing them means a developer with log-write access can forge or delete audit history. Ship audit events to a **dedicated append-only sink** (dedicated Postgres table with revoked UPDATE/DELETE, a WORM object store, or a managed audit service) — never the same index/bucket as `console.log` output.
+4. **Make tamper-evidence structural, not a promise. Pick by threat model:**
+   | Mechanism | Detects | Use when | Cost |
+   |---|---|---|---|
+   | **Hash chain** (each row stores `hash(prev_hash + entry)`) | any edit/delete/reorder of past rows | default — works in any DB, cheap, verifiable offline | 1 hash/write + periodic verify job |
+   | **WORM / object-lock** (S3 Object Lock COMPLIANCE, GCS retention lock) | deletion/overwrite before retention expiry, even by root | regulated retention, untrusted operators | storage + immutable retention window |
+   | **Digital signature** (HSM/KMS signs each entry or each batch) | forgery + proves origin/non-repudiation | strict non-repudiation, third-party verifier | KMS calls, key mgmt |
+   | **External anchoring** (periodic chain-head to a notary/transparency log) | insider with full DB+app access rewriting the whole chain | high-value targets, hostile-insider model | scheduled external write |
+   **Default: hash chain + WORM storage.** The chain proves *no row was altered*; object-lock proves *no row was deleted*. Hash chain alone doesn't stop a truncate-and-rebuild by someone with full write access — pair it with object-lock or external anchoring for that threat. Restrict write access to an **append-only path** (DB role with `INSERT` only; bucket policy allowing `PutObject` but not `DeleteObject`/overwrite); **nobody — including the app service account — gets row-level update/delete.**
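The hash-chain mechanism from the table above can be sketched as follows. This is an illustration under stated assumptions: SHA-256 over canonical JSON, an all-zero genesis hash, and an in-memory list standing in for the append-only store. The function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the first row


def entry_hash(prev_hash: str, entry: dict) -> str:
    """hash(prev_hash + entry), with canonical JSON so equal entries hash equally."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, entry: dict) -> None:
    """Append a row that commits to the hash of the row before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "prev_hash": prev, "hash": entry_hash(prev, entry)})


def first_broken_index(chain: list):
    """Recompute the chain; return the index of the first bad row, or None if intact."""
    prev = GENESIS
    for i, row in enumerate(chain):
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["entry"]):
            return i
        prev = row["hash"]
    return None
```

Note what the verifier cannot see: removing rows from the tail leaves a shorter but internally valid chain, which is why the step pairs the chain with object-lock or external anchoring.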
+5. **Never log secrets or raw PII/PHI — log references and minimized metadata.** The audit log is high-value, long-retention, and widely readable by auditors; a raw payload in it is a second copy of your most sensitive data with the worst blast radius. Log the *id* of the record touched, not its contents. For changes, log a **field-name diff** (`changed: ["role","status"]`) or hashed before/after, never the literal old/new PII values. Run a serializer allowlist + a secret/PII scrubber on the `meta` object before write; drop anything not on the allowlist. Tokens, passwords, full card/SSN, message bodies, query result rows → never.
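A small sketch of the allowlist and field-name diff described in step 5. The allowlisted keys and both function names are hypothetical; a real deployment would also run a dedicated secret/PII scrubber, since an allowlisted string value can still carry sensitive text.

```python
# Hypothetical allowlist of meta keys that may be written to the audit log.
META_ALLOWLIST = frozenset({"row_count", "format", "changed"})


def scrub_meta(meta: dict) -> dict:
    """Drop every key not on the allowlist and every value that is not a simple scalar
    or a list of strings (for example field names)."""
    clean = {}
    for key, value in meta.items():
        if key not in META_ALLOWLIST:
            continue
        is_scalar = isinstance(value, (int, str, bool))
        is_name_list = isinstance(value, list) and all(isinstance(v, str) for v in value)
        if is_scalar or is_name_list:
            clean[key] = value
    return clean


def changed_fields(before: dict, after: dict) -> list:
    """Names of fields whose values differ; the values themselves are never returned."""
    return sorted(k for k in before.keys() | after.keys() if before.get(k) != after.get(k))
```

For a profile edit, `changed_fields(old, new)` would go into `meta["changed"]`, so the log records that `role` and `status` changed without recording either value.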
+6. **Set retention per regulation, then enforce it in the store — don't rely on a cron `DELETE`.** Map each event category to its longest applicable requirement and configure the immutable window so deletion *can't* happen early and *does* happen on schedule:
+   | Regime | Typical minimum | Enforce via |
+   |---|---|---|
+   | HIPAA | 6 years | object-lock retention 6y + lifecycle expiry |
+   | SOC2 | 1 year (often 7 for evidence) | partition + lifecycle policy |
+   | PCI-DSS | 1 year (3 months hot) | hot tier + cold archive |
+   Use **time-partitioned tables or object lifecycle rules** so expiry is declarative and audited, not a script someone can disable. Don't over-retain past the requirement (that's its own liability under privacy law — see map-privacy-data-gdpr).
+7. **Make it queryable for investigations.** A trail you can't search is forensically useless. Index `actor.id`, `target.id`, `action`, `ts`, `tenant`, `request_id`. The two queries every investigation needs: *"everything actor X did in window T"* and *"everyone who touched target Y."* Tie `request_id` back to operational traces (observability-instrument owns those) so an investigator can pivot from an audit entry to the full request. Provide a read-only investigator role separate from the write path.
+8. **Emit exactly once, synchronously to the decision, fail-closed on sensitive actions.** Write the audit record **in the same transaction/critical path as the action it records** (or via a transactional outbox) so an action can never succeed without its record. For sensitive control points (data export, permission grant, break-glass), if the audit write fails, **deny the action** — an unlogged privileged action is worse than a blocked one. Dedup downstream consumers on `id`. Never fire-and-forget an audit write for a security-critical event.
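One way to read step 8 in code, assuming SQLite and hypothetical `users` and `audit_log` tables with a primary key on the audit `id`. When the audit sink is a separate store, the second statement would write an outbox row instead of the audit row itself.

```python
import sqlite3


def grant_role(conn: sqlite3.Connection, user_id: str, role: str,
               actor_id: str, event_id: str) -> bool:
    """Apply the grant and its audit record atomically; deny the action if either fails."""
    try:
        with conn:  # commits on success, rolls back both statements on any exception
            conn.execute("UPDATE users SET role = ? WHERE id = ?", (role, user_id))
            conn.execute(
                "INSERT INTO audit_log (id, action, actor_id, target_id, result) "
                "VALUES (?, ?, ?, ?, ?)",
                (event_id, "user.role.granted", actor_id, user_id, "allow"),
            )
    except sqlite3.Error:
        return False  # fail-closed: no audit record means no privileged action
    return True
```

Because the primary key on `id` rejects a repeated `event_id`, a retried call with the same id is refused rather than recorded twice, which is the dedup behaviour the step asks for.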
+## Common Errors
+- **Audit log shares the store/credentials with app logs.** Anyone who can write debug logs can then forge or wipe audit history. Separate store, separate INSERT-only credential, separate retention.
+- **Logging raw PII/PHI or secrets in the payload.** Creates a long-retention, broadly-read second copy of your crown jewels. Log ids and field-name diffs; scrub `meta` against an allowlist before write.
+- **"Append-only" that the app account can still UPDATE/DELETE.** That's not append-only. Revoke update/delete at the DB-role / bucket-policy level; verify with an attempted delete that must fail.
+- **Hash chain with no verification job.** An undetected break = no tamper evidence at all. Run a scheduled verifier that recomputes the chain and alerts on the first mismatch; anchor the chain head externally if insiders are in scope.
+- **Async fire-and-forget emit.** The action commits, the audit write is dropped on a queue overflow or crash, and you have a silent gap. Write in-transaction or via outbox; fail-closed for sensitive actions.
+- **Free-text `action` strings.** `"User exported the data"` can't be queried, aggregated, or mapped to a control. Use a versioned closed enum.
+- **Trusting client-supplied `X-Forwarded-For` / actor id.** Both are spoofable. Take `source_ip` only from the header your trusted proxy sets; take `actor.id` from the authenticated session, never from the request body.
+- **Missing impersonation provenance.** Support acts "as" a user and the log shows only the end user — auditors flag this as a control gap. Always populate `on_behalf_of`.
+- **Cron-job retention instead of store-enforced.** A disabled or buggy cron either leaks data forever or deletes evidence early. Use object-lock / partition lifecycle so the store enforces it.
+- **No timezone discipline.** Mixed local timestamps make a forensic timeline unreconstructable. UTC + ISO-8601 + ms, server clock, everywhere.
+- **Recording allows but dropping denies.** Auditors and investigators care most about blocked attempts. Record `result: "deny"` with `reason`, not just successful actions.
+## Verify
+1. **Exactly-once coverage:** For each event in the closed list, perform the action and confirm **one** audit record is written with all required fields populated; perform a sensitive action whose audit write is forced to fail and confirm the action is **denied** (fail-closed), not silently completed.
+2. **Tamper detection:** Directly mutate one stored row (or delete one), run the chain verifier → it flags the exact broken entry. Re-run on the untouched log → clean. This is the test that proves the tamper-evidence is real, not decorative.
+3. **Immutability of the write path:** As the **application service account**, attempt `UPDATE`/`DELETE` on an audit row (and overwrite/delete on the object store) → both must be rejected by the role/bucket policy. Only `INSERT`/`PutObject` succeeds.
+4. **No leakage:** Trigger actions involving secrets and PII/PHI (export a record, change a password, edit a profile), then grep the stored audit entries for the raw secret, the password, and the literal PII values → **zero hits**; only ids, field names, and counts appear.
+5. **Retention enforced by the store:** Confirm the object-lock/partition policy is configured for the regulated window and that no role (including admin/root) can delete before expiry; confirm entries past the window expire automatically without a manual job.
+6. **Investigation queries:** Run *"all actions by actor X in window T"* and *"all actors who touched target Y"* → both return correct, complete results in interactive time on indexed fields, and a `request_id` pivots to the matching operational trace.
+7. **Provenance:** An impersonated action shows both the acting agent and `on_behalf_of`; a denied action shows `result: "deny"` + `reason`; `source_ip` matches the trusted-proxy value, not a spoofed body field.
+Done = every event in the closed list emits exactly one complete record on a physically separate, INSERT-only, retention-locked store; the chain verifier detects any edit/delete; no secret or raw PII/PHI appears in any entry; and the two core investigation queries return complete, correct results mapped to their SOC2/HIPAA controls.

package/skills/build-cdc-streaming-pipeline/SKILL.md ADDED

@@ -0,0 +1,123 @@
+---
+name: build-cdc-streaming-pipeline
+description: Designs change-data-capture and streaming pipelines — log-based CDC off a DB transaction log (Debezium/WAL/binlog), topic-per-table fan-out onto Kafka/Kinesis, consumer-group/offset/rebalance correctness, windowed/stateful stream processing with watermarks, exactly-once vs at-least-once-plus-idempotent delivery, and Avro/Protobuf schema-registry evolution.
+when_to_use: When row changes (incl. deletes) must propagate continuously and low-latency rather than on a schedule — capturing off a transaction log, fanning onto a partitioned stream bus, consuming with correct offset/rebalance/ordering, windowed joins/aggregations, and sinking to a search index/warehouse/cache kept in sync. Distinct from build-etl-pipeline (scheduled batch/incremental loads) and message-queue-jobs (durable server-to-server task queues, not a replayable change log).
+---
+## When to Use
+Reach for this skill when data must **flow as a continuous change stream**, not land in scheduled batches:
+- "Stream every row change out of Postgres/MySQL into Kafka and keep Elasticsearch in sync"
+- "Mirror a table into the warehouse in near-real-time, including deletes"
+- "My consumer is reprocessing / skipping events after a deploy or rebalance"
+- "Consumer-group lag is climbing; ordering is wrong; one partition is hot"
+- "Join an orders stream against an enrichment stream with a 5-minute window"
+- "Late/out-of-order events are dropped or double-counted"
+- "Producer schema changed and consumers broke" / "map Debezium op codes to upserts and deletes"
+NOT this skill:
+- Scheduled/incremental batch loads to a warehouse (Airflow/dbt, nightly, `updated_at` cursor) → build-etl-pipeline
+- Durable server-to-server work/task queue (enqueue a job, one worker runs it once) → message-queue-jobs
+- Client-facing live push over WebSocket/SSE (chat, dashboards) → build-realtime-channel
+- Offline client store + delta pull + conflict resolution → build-offline-first-sync
+- The replication slot / logical-decoding DDL impact on the **source** DB itself → db-migration-safety
+- Embedding/indexing documents for retrieval as the sink semantics → rag-pipeline
+## Steps
+1. **Confirm it's actually streaming, then capture log-based — not query polling.** If freshness tolerance is minutes/hours and deletes don't need to propagate, stop and use build-etl-pipeline. If changes (incl. deletes) must land in seconds, do CDC. Pick the capture method:
+   | Method | Captures deletes | Source load | Ordering | Use when |
+   |---|---|---|---|---|
+   | Query polling (`WHERE updated_at > :cursor`) | ❌ no (row is gone) | full table scan / index pressure | by `updated_at` only | no log access; deletes don't matter; small tables |
+   | **Log-based CDC (Debezium on WAL/binlog/redo)** | ✅ yes | low — reads the log the DB already writes | exact commit order per table | **default** — full fidelity, deletes, low impact |
+   | Trigger-based | ✅ yes | write amplification on every DML | by trigger | log unavailable but deletes needed |
+   Default: **Debezium connectors** — Postgres (`pgoutput` logical decoding + replication slot), MySQL (`binlog`, `binlog_format=ROW`, `binlog_row_image=FULL`), Mongo (change streams). Set Postgres `wal_level=logical`, `REPLICA IDENTITY FULL` on tables whose before-image (for deletes/diffs) you need.
+2. **Get the snapshot→stream handoff right, or you lose or double rows at startup.** A new connector must read existing rows (snapshot) then switch to live log without a gap. Use `snapshot.mode=initial` (snapshot once, then stream) — the connector records the log position at snapshot start and streams from there. For huge tables use **incremental snapshot** (`signal`-driven, chunked) so streaming isn't blocked and the connector is resumable. **Never** drop the replication slot while paused — Postgres then discards WAL the connector hasn't read and you get a permanent gap (full re-snapshot required). Monitor `pg_replication_slots.confirmed_flush_lsn`; an abandoned slot also pins WAL and fills the disk.
+3. **Shape the bus: topic-per-table, partition key = entity id, choose retention vs compaction.** One topic per source table/aggregate (`server.table` → `dbserver1.public.orders`). Partition **by primary key** so all events for one entity land on one partition → per-entity ordering is preserved; Kafka guarantees order **only within a partition**, never across. Do not key by a low-cardinality column (creates hot partitions) or leave keys null (round-robin → ordering lost).
+   | Topic config | Use for | Effect |
+   |---|---|---|
+   | `cleanup.policy=delete` + `retention.ms` | event/audit streams, replay window | drops old segments by time/size |
+   | `cleanup.policy=compact` | **CDC table mirrors** (latest state per key) | keeps newest value per key forever; tombstone (`value=null`) deletes the key |
+   | `compact,delete` | mirror + bounded history | compacted, and segments older than `retention.ms` are deleted as well; tombstones stay readable for `delete.retention.ms` before compaction removes them |
+   Default for a table mirror: **log compaction**, keyed by PK. Kinesis equivalent: shard by partition key = PK; remember Kinesis ordering is per-shard and resharding rehashes keys.
+4. **Map CDC op codes to sink operations explicitly.** Debezium envelope `op`: `c`(create)/`r`(read/snapshot)/`u`(update) → **upsert** by PK; `d`(delete) → emit a **tombstone** (`key=PK, value=null`) so compaction and downstream deletes work. Configure `ExtractNewRecordState` SMT to unwrap the envelope and `delete.handling.mode=rewrite` (or `drop`) per sink needs. A sink that treats `d` as an upsert of nulls instead of a delete silently resurrects deleted rows.
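The op-code mapping in step 4 can be sketched like this, assuming the standard Debezium envelope fields `op` and `after` and a plain dict standing in for the sink. The function names are hypothetical, and other op codes (such as truncate) are rejected rather than guessed at.

```python
def to_sink_op(key, envelope):
    """Map a Debezium change event to ("upsert", key, row) or ("delete", key, None)."""
    if envelope is None:
        return ("delete", key, None)  # tombstone record: key set, value null
    op = envelope["op"]
    if op in ("c", "r", "u"):         # create, snapshot read, update
        return ("upsert", key, envelope["after"])
    if op == "d":
        return ("delete", key, None)  # a real delete, not an upsert of nulls
    raise ValueError(f"unhandled op code: {op!r}")


def apply_change(sink: dict, key, envelope) -> None:
    """Apply one change idempotently so at-least-once redelivery is harmless."""
    kind, k, row = to_sink_op(key, envelope)
    if kind == "upsert":
        sink[k] = row
    else:
        sink.pop(k, None)
```

Both branches are idempotent: replaying an upsert rewrites the same row, and replaying a delete of a missing key is a no-op.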
+5. **Consume correctly — offset-commit timing is the core bug.** The consumer group assigns partitions to members; each commits the offset of records it has processed. **Commit after the side effect is durable, never before.**
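+   - **Auto-commit is at-least-once at best and silently lossy at worst:** with `enable.auto.commit=true` the client commits the offsets returned by earlier `poll()` calls on a timer (`auto.commit.interval.ms`), with no knowledge of whether your handler's side effect is durable yet (work handed to another thread, writes still buffered in the sink). A crash after commit-but-before-processing → message lost. **Set `enable.auto.commit=false`** and commit manually after the sink write succeeds.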
+   ```java
+   // at-least-once done right: process → flush sink → THEN commit
+   props.put("enable.auto.commit", "false");
+   props.put("isolation.level", "read_committed"); // skip aborted txn records
+   props.put("max.poll.records", "500");
+   while (running) {
+     var records = consumer.poll(Duration.ofMillis(500));
+     for (var r : records) sink.upsert(key(r), value(r));  // idempotent
+     sink.flush();                                          // durable side effect first
+     consumer.commitSync();                                 // commit only after flush
+   }
+   ```
+   Order is load-bearing: process → flush → commit. Commit-before-process loses on crash; commit-per-record kills throughput.
+   - **Cooperative rebalance, not eager (stop-the-world):** set `partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor` (the property takes the fully qualified class name) so a join/leave revokes only the moved partitions instead of pausing the whole group. Commit in the `onPartitionsRevoked` callback so the new owner resumes from the right place.
+   - **Avoid spurious rebalances:** if processing a poll batch can exceed `max.poll.interval.ms` (default 5 min), the consumer is treated as failed, leaves the group, and a rebalance starts mid-work. Either lower `max.poll.records` or raise the interval. Keep `heartbeat.interval.ms` at roughly one third of `session.timeout.ms` or lower.
+   - **Lag, not just throughput:** alert on consumer-group lag (`kafka-consumer-groups --describe`, or Burrow/CMAK). Scale by adding consumers **up to the partition count** — extra consumers past `#partitions` sit idle. More throughput needs more partitions (and partition count can only go *up*; increasing it rehashes keys and breaks ordering for in-flight keys).
+   - **Poison record → DLQ, don't block the partition.** A record that always fails (bad schema, sink rejects) will halt the partition forever if you retry in place. After N attempts, route it to a dead-letter topic with headers (original topic/partition/offset/exception), commit past it, continue.
+6. **Process: stateless map vs windowed/stateful — pick the window and a watermark.** Stateless filter/transform/route → a plain consumer or single-operator stream. Joins/aggregations need **state + a window + a watermark** (event-time progress marker that says "no events older than T will arrive"):
+   | Window | Use for | Note |
+   |---|---|---|
+   | Tumbling (fixed, non-overlapping) | per-minute counts, billing buckets | each event in exactly one window |
+   | Hopping/sliding (overlapping) | moving averages, "last 5 min every 1 min" | event in multiple windows |
+   | Session (gap-based) | user sessions, bursts | window closes after inactivity gap |
+   Use **event time** (the row's commit/`ts_ms`), never processing time, or replay and out-of-order delivery corrupt results. Set `allowed_lateness`/grace so late events update an already-emitted window instead of being dropped; send events past the grace period to a side-output, don't silently discard. Keep operator state in a durable, checkpointed store (Kafka Streams `RocksDB` + changelog topic, or Flink checkpoints) so a restart restores aggregates instead of recomputing from zero.
+7. **Choose delivery semantics deliberately — exactly-once is opt-in and not free.** Default and simplest: **at-least-once + idempotent sink.** Make the sink absorb duplicates (upsert by PK, `INSERT ... ON CONFLICT DO UPDATE`, dedup table on event id) so reprocessing after a rebalance/replay is harmless. Reach for true **exactly-once** only when the sink can't be made idempotent (e.g. incrementing counters, append-only ledgers):
+   - Kafka→Kafka: enable EOS — `processing.guarantee=exactly_once_v2` (Kafka Streams) or transactional producer (`enable.idempotence=true`, `transactional.id`) + consumer `isolation.level=read_committed`. This is a transactional read-process-write **within Kafka only**; it does not extend to an external DB.
+   - Kafka→external store: use **idempotent upserts**, or a two-phase/transactional sink connector that stores the consumed offset in the *same* transaction as the data.
+   - **Replay** is a first-class operation: reset the group to an offset/timestamp (`kafka-consumer-groups --reset-offsets --to-datetime`) and reprocess. This only produces correct results **because** the sink is idempotent or transactional — design for replay from day one.
+8. **Schema registry + compatibility, or producers will break consumers.** Serialize with **Avro or Protobuf via a schema registry** (not raw JSON) so every record carries a schema id and the registry enforces compatibility on register. Default compatibility: **BACKWARD** (new schema can read old data) — consumers upgrade first. Rules that keep it safe: add fields **with defaults**, never rename/retype a field in place (add new + dual-write + retire), never remove a required field. Pin Debezium key/value converters to the registry. For Kafka Connect sinks, the registry + compatibility check is what stops a bad producer from poisoning every downstream consumer at 3am.
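The op-code mapping from step 4 can be sketched as a small pure function. This is a minimal illustration, not Debezium's or any connector's API: only the `op` codes come from the text above, while `ChangeRecord`, `SinkOp`, `toSinkOp`, and `applyToSink` are hypothetical names, and the record key is assumed to carry the primary key.

```typescript
// Sketch only: `op` uses the Debezium codes from step 4; every other name is hypothetical.
type DebeziumOp = "c" | "r" | "u" | "d";

interface ChangeRecord<Row> {
  key: string;       // record key, assumed to be the primary key
  op: DebeziumOp;
  after: Row | null; // row image after the change; null for deletes
}

type SinkOp<Row> =
  | { kind: "upsert"; key: string; value: Row }
  | { kind: "delete"; key: string };

// c / r / u become an upsert by PK; d becomes a delete, never an upsert of nulls.
function toSinkOp<Row>(rec: ChangeRecord<Row>): SinkOp<Row> {
  switch (rec.op) {
    case "c":
    case "r":
    case "u":
      if (rec.after === null) {
        throw new Error("op=" + rec.op + " for key " + rec.key + " has no after image");
      }
      return { kind: "upsert", key: rec.key, value: rec.after };
    case "d":
      return { kind: "delete", key: rec.key };
    default:
      throw new Error("unknown op for key " + rec.key);
  }
}

// In-memory stand-in for an idempotent sink: replaying the same ops gives the same state.
function applyToSink<Row>(sink: Record<string, Row>, op: SinkOp<Row>): void {
  if (op.kind === "upsert") {
    sink[op.key] = op.value;
  } else {
    delete sink[op.key];
  }
}
```

Because both branches are keyed by PK and carry no counters, applying the same change log twice leaves the sink unchanged, which is the property the replay guidance in step 7 depends on.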
+## Common Errors
+- **`enable.auto.commit=true` treated as exactly-once.** It's a timer that commits independent of your handler — a crash loses or reprocesses. Set it `false` and commit after the sink flush.
+- **Committing the offset before the side effect is durable.** Crash in the gap = silent data loss. Strict order: process → flush sink → commit.
+- **Dropping/recreating the Postgres replication slot to "reset".** WAL the connector hasn't consumed is discarded → permanent gap, forces a full re-snapshot. Pause the connector, keep the slot; never delete a slot with unconsumed WAL.
+- **Abandoned/lagging slot fills the source disk.** A stopped consumer pins WAL forever. Alert on `confirmed_flush_lsn` lag and slot age; clean up dead connectors.
+- **Null or low-cardinality partition key.** Null key → round-robin → cross-partition reordering of one entity's events. Low-cardinality key → hot partition. Key by primary key.
+- **Increasing partition count on a live keyed topic.** Rehashes keys → an entity's new events go to a different partition than its in-flight ones → ordering broken. Plan partition count up front; treat increases as a migration.
+- **Treating a Debezium `d` (delete) as an upsert.** Resurrects deleted rows in the sink. Emit a tombstone (`value=null`) and let the sink delete; use the `ExtractNewRecordState` SMT.
+- **Reading uncommitted transactional records.** Without `isolation.level=read_committed`, consumers see aborted-transaction records and double-count. Set it whenever producers use transactions.
+- **Poison record retried in place.** One un-processable record halts its partition forever and lag explodes. Bounded retries → DLQ topic → commit past it.
+- **Processing on the poll thread longer than `max.poll.interval.ms`.** Broker thinks the consumer died, rebalances mid-batch, you reprocess. Shrink `max.poll.records` or raise the interval; offload slow work.
+- **Eager (stop-the-world) rebalance assignor by default.** Every scale event pauses the whole group. Use `CooperativeStickyAssignor`.
+- **Windowing on processing time.** Replay and out-of-order delivery silently corrupt aggregates. Window on event time with a watermark; route past-grace events to a side-output.
+- **Raw JSON with no registry.** A producer field rename breaks every consumer with no guardrail. Use Avro/Protobuf + registry with BACKWARD compatibility.
+- **Scaling consumers past partition count.** Extra members sit idle. Add partitions (carefully — see above) or split the workload differently.
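The "bounded retries → DLQ → commit past it" rule from the poison-record entry can be sketched as follows. This is a synchronous, in-memory illustration under stated assumptions: `processWithDlq`, the `dlq.*` header names, and the array standing in for a dead-letter topic are all hypothetical, and a real consumer would produce to a topic and add backoff between attempts.

```typescript
// Sketch only: the header names and the in-memory DLQ array are hypothetical stand-ins.
interface SourceRecord {
  topic: string;
  partition: number;
  offset: number;
  value: string;
}

interface DeadLetter {
  value: string;
  headers: Record<string, string>;
}

type Outcome = "processed" | "dead-lettered";

// Try the handler up to maxAttempts times, then park the record in the DLQ.
// Either outcome means the caller may commit past this offset.
function processWithDlq(
  rec: SourceRecord,
  handler: (r: SourceRecord) => void,
  dlq: DeadLetter[],
  maxAttempts: number = 3
): Outcome {
  let lastError = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      handler(rec);
      return "processed";
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  dlq.push({
    value: rec.value,
    headers: {
      "dlq.origin.topic": rec.topic,
      "dlq.origin.partition": String(rec.partition),
      "dlq.origin.offset": String(rec.offset),
      "dlq.exception": lastError,
    },
  });
  return "dead-lettered";
}
```

The function never rethrows, so one bad record cannot stall the loop; the origin headers keep enough context to replay the record from the DLQ later.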
+## Verify
+1. **Capture fidelity incl. deletes:** `INSERT`, `UPDATE`, then `DELETE` a row on the source → consumer observes a create, an update (with correct before/after), and a tombstone, in commit order. A delete that produces no tombstone is a fail.
+2. **Snapshot→stream no-gap:** seed N rows, start the connector, then write M more **during** the snapshot → exactly N+M distinct rows arrive downstream, none missing, none duplicated past idempotency.
+3. **Per-entity ordering:** rapidly emit 3 updates to one PK → consumer receives them in source order on a single partition (events for that key never interleave out of order).
+4. **Offset correctness across restart:** kill the consumer mid-batch, restart → no committed-but-unprocessed record is lost and no already-sunk record corrupts the sink (idempotency holds). Lag returns to ~0.
+5. **Rebalance correctness:** add then remove a consumer under load with `CooperativeStickyAssignor` → no record is processed by two members and none is skipped; only moved partitions are revoked (check logs).
+6. **Replay = same result:** `--reset-offsets --to-earliest` and reprocess → final sink state is byte-identical to before the replay (proves the sink is idempotent/transactional).
+7. **Poison handling:** inject a record the sink rejects → it lands in the DLQ with origin headers, the partition keeps flowing, lag does not climb.
+8. **Late event:** emit an event with an event-time inside a closed-but-within-grace window → the window result updates; past grace → it appears in the side-output, not silently dropped.
+9. **Schema evolution:** register a new schema adding a field with a default under BACKWARD compatibility → old consumers keep running; attempt an incompatible change → registry rejects the register (does not reach consumers).
+10. **Lag SLO:** under sustained source write load, consumer-group lag stays bounded (returns toward 0), not monotonically rising.
+Done = deletes propagate as tombstones, snapshot→stream is gap-free, per-entity ordering holds on one partition, a kill-restart and a full replay both leave the sink state correct (idempotent or transactional), poison records go to a DLQ without blocking the partition, late events hit grace/side-output (never silently dropped), and an incompatible schema is rejected at the registry before it reaches any consumer.
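The late-event behavior checked in Verify item 8 can be illustrated with a toy tumbling-window counter. It is a sketch under a loud assumption: the watermark here is simply the largest event time seen so far, whereas stream engines usually subtract an out-of-orderness bound. `TumblingCounter`, `Reading`, and `sideOutput` are hypothetical names, not Kafka Streams or Flink APIs.

```typescript
// Sketch only. Assumption: the watermark is the largest event time seen so far;
// real engines usually subtract an out-of-orderness bound.
interface Reading {
  key: string;
  eventTimeMs: number;
}

class TumblingCounter {
  readonly sideOutput: Reading[] = [];
  private counts: Record<number, number> = {};
  private watermarkMs = Number.NEGATIVE_INFINITY;
  private sizeMs: number;
  private graceMs: number;

  constructor(sizeMs: number, graceMs: number) {
    this.sizeMs = sizeMs;
    this.graceMs = graceMs;
  }

  add(reading: Reading): void {
    this.watermarkMs = Math.max(this.watermarkMs, reading.eventTimeMs);
    // Tumbling: each event belongs to exactly one window, found from its event time.
    const windowStart = Math.floor(reading.eventTimeMs / this.sizeMs) * this.sizeMs;
    const closesAt = windowStart + this.sizeMs + this.graceMs;
    if (this.watermarkMs >= closesAt) {
      this.sideOutput.push(reading); // past grace: keep it visible, never drop silently
      return;
    }
    this.counts[windowStart] = (this.counts[windowStart] || 0) + 1;
  }

  count(windowStartMs: number): number {
    return this.counts[windowStartMs] || 0;
  }
}
```

Windowing on `eventTimeMs` rather than arrival time is what keeps a replay deterministic: the same events land in the same windows regardless of when they are processed.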

package/skills/build-cli-tool/SKILL.md ADDED

@@ -0,0 +1,108 @@
+---
+name: build-cli-tool
+description: Designs the UX and contract of a command-line program in any language — argument parsing via a real lib (commander/yargs, click/typer, cobra, clap), meaningful exit codes, the stdout=data / stderr=logs split so the tool pipes cleanly, TTY-aware color/spinners that auto-plain when redirected, a --json machine mode, layered config precedence, signal cleanup, and shell completion. Covers the whole interface contract that makes a CLI scriptable, composable, and safe — not the language-internal logic.
+when_to_use: Building a new CLI/terminal program or fixing one that misbehaves in pipes, CI, or non-TTY contexts (logs on stdout, colors in files, wrong exit codes, secrets in flags). Distinct from shell-script-robust (writing a robust Bash script — set -euo pipefail, quoting, traps; this skill DESIGNS the CLI program/UX in any language) and publish-package-registry (PUBLISHING the finished tool to npm/PyPI/crates; this skill DESIGNS it).
+---
+## When to Use
+- "I'm writing a CLI — how should I structure subcommands, flags, and help?"
+- "My tool breaks when I pipe it (`tool | jq`) or redirect to a file — output is garbled / has color codes."
+- "CI can't tell why my command failed — every error exits 1."
+- "I need a `--json` mode so scripts can parse my output."
+- "Colors/spinners show up in log files but shouldn't" / "respect `NO_COLOR`."
+- "How do I take a secret without it leaking in `ps` / shell history?"
+- "Add shell completion / a `--dry-run` / proper Ctrl-C cleanup."
+NOT this skill:
+- Writing a robust **Bash** script (strict mode, quoting, trap cleanup) → **shell-script-robust** (that's a shell *implementation*; this is CLI *interface design* in any language).
+- **Publishing** the built tool to npm/PyPI/crates (bin field, OIDC, semver) → **publish-package-registry**.
+- The exact **wording** of a failure string (what/why/next) → **error-message** (use it for message copy; this skill decides the *channel* and *exit code*).
+- Choosing **names** for commands/flags/config keys → **naming-helper**.
+- Hardening the language-internal correctness (concurrency, types, money math) → the respective domain skills.
+## Steps
+The contract in one line: **stdout = data, stderr = everything else, exit code = the verdict.** Get those three right and the tool composes with Unix.
+1. **Pick a parser library, never hand-roll.** Hand-rolled `process.argv` parsing misses `--`, `=`, bundled short flags, and negation. Use the idiomatic one:
+   | Lang | Library | Notes |
+   |---|---|---|
+   | Node | **commander** (simple) / **yargs** (rich) / **clipanion** (class-based, typed) | commander for most; yargs for middleware/completion |
+   | Python | **typer** (type-hint driven) / **click** / `argparse` (stdlib, zero-dep) | typer = click + types; argparse if no deps allowed |
+   | Go | **cobra** (+ pflag/viper) | kubectl/gh use it; gives completion + config for free |
+   | Rust | **clap** (derive) | derive macro → struct = the CLI |
+   Define subcommands (`tool sync`, `tool config get`), flags with **both short and long** (`-v/--verbose`), positionals, and let the lib handle `--`: everything after it is a positional, never a flag, which is why `rm -- -weird-file` removes a file literally named `-weird-file`. Support `--flag=value` and `--flag value`.
+2. **Generate `--help` and include examples + a one-line summary.** Every command and subcommand needs `--help`; the lib auto-generates usage from the spec — your job is to add a one-line summary and **real examples** (most help is useless without them):
+   ```
+   sync — mirror a local dir to remote storage
+   Usage: tool sync [options] <src> <dest>
+   Examples:
+     tool sync ./build s3://bucket/site      # one-shot
+     tool sync --dry-run ./build s3://...    # preview, no writes
+   ```
+   Provide `--version` (print version + exit 0). Unknown flag → usage error on **stderr** + exit 2, not a stack trace.
+3. **Define exit codes that mean something.** Scripts and CI branch on `$?`. Don't return 1 for everything:
+   | Code | Meaning |
+   |---|---|
+   | 0 | success |
+   | 1 | generic/expected failure (operation didn't succeed) |
+   | 2 | **usage error** (bad flag/arg) — convention; argparse/clap use it |
+   | 3+ | distinct codes per failure class (e.g. 3 = network, 4 = auth, 5 = not-found) — document them |
+   | 130 | interrupted by SIGINT (128 + 2); 143 for SIGTERM (128 + 15) |
+   Document the table in `--help` or the README so callers can `case $? in ...`.
+4. **Enforce stdout=data / stderr=logs (the cardinal rule).** Primary results → **stdout**. Logs, progress, spinners, prompts, warnings, errors → **stderr**. This is what makes `tool | jq`, `tool > out.json`, and `tool 2>/dev/null` work. **Never** print a log line, banner, or "✓ done" to stdout — it corrupts the data stream. A `--quiet` run with a clean pipe should emit *only* the payload on stdout.
+5. **Detect TTY; degrade gracefully when not interactive.** Color, spinners, progress bars, and interactive prompts are only valid on a terminal. Check before emitting them:
+   - Node: `process.stdout.isTTY` / `process.stderr.isTTY`
+   - Python: `sys.stdout.isatty()`
+   - Go: `term.IsTerminal(int(os.Stdout.Fd()))`
+   Piped/redirected (not a TTY) → auto-plain: no ANSI, no spinner, no prompt (fail with an error instead: "stdin is not a tty; pass --yes or --input"). Honor env + flag precedence for color: **`--color=never` > `NO_COLOR` (any non-empty value disables) > `--color=always`/`FORCE_COLOR` > `--color=auto` (default: color only if stdout isTTY).**
+6. **Add a `--json` / machine-readable mode.** Human tables for the TTY, structured output for scripts. `--json` emits one JSON document (or NDJSON per record for streams) to stdout, *nothing else* — no log noise, no color. This is more robust than asking users to `grep`/`awk` your pretty output. Keep the schema stable; version it if it may change.
+7. **Stream output; don't buffer huge results.** Write records as you produce them (NDJSON line-by-line, or flush rows incrementally) so `tool export | head` exits fast and memory stays flat on large datasets. Buffering everything then printing at the end breaks `head`/`less` and OOMs on big runs.
+8. **Layer config with a documented precedence.** Highest wins, document the order:
+   ```
+   CLI flags  >  env vars  >  project config (./.toolrc)  >  user config (~/.config/tool/config.toml)  >  built-in defaults
+   ```
+   viper (Go), a small merge (Node/Python), or click's `auto_envvar_prefix` give this. Print the resolved source on `--verbose` so users can debug "why is this value set?".
+9. **Never accept secrets as CLI flags.** `--password hunter2` leaks into `ps aux`, shell history, and CI logs. Accept secrets via **env var** (`TOOL_TOKEN`), a **file** (`--token-file`), or **stdin** (`--password-stdin`, like `docker login`). If a flag like `--token` must exist, mark it deprecated and warn on use.
+10. **Handle signals and clean up.** On SIGINT/SIGTERM: remove temp files, restore terminal state (cursor, raw mode, `\e[?25h` to show cursor), flush partial output, then exit 130/143 — don't leave a half-written file or a hidden cursor. Node: `process.on('SIGINT', cleanup)`; Python: `signal.signal` / `try/finally` + `KeyboardInterrupt`; Go: `signal.NotifyContext`. Make operations idempotent so a re-run after interruption is safe.
+11. **Verbosity, dry-run, and destructive guards.** Levels: `-q/--quiet` (errors only), default, `-v`, `-vv` (stackable → log level). Destructive actions (`delete`, `reset`, overwrite) must offer `--dry-run` (print exactly what *would* happen, change nothing) and must require either an interactive confirm (TTY only) or an explicit `--yes`/`--force` for non-interactive use. Prefer idempotent operations so partial failures are recoverable.
+12. **Ship shell completion + handle cross-platform.** Generate completion for bash/zsh/fish (cobra/clap/yargs do this; expose `tool completion zsh`). Cross-platform care: use the lib's path join (not hard-coded `/`), write `\n` not `\r\n` to data streams, and on Windows enable ANSI (modern terminals support it; older need a `colorama`-style shim or `FORCE_COLOR`). Distribution is a separate step — `bin` in package.json + npx, `pipx`, a single static binary (Go/Rust), or a Homebrew formula — but PUBLISHING is **publish-package-registry**.
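The color precedence from step 5 can be sketched as one pure function, which keeps it testable without a real terminal. This is a minimal sketch, not any library's API: `shouldUseColor` and `ColorInputs` are hypothetical names, and two details are assumptions rather than anything stated above: an empty `NO_COLOR` is treated as unset, and `FORCE_COLOR=0` is not treated as forcing color.

```typescript
// Sketch only: names are hypothetical; the branch order follows the precedence in step 5.
type ColorFlag = "never" | "always" | "auto";

interface ColorInputs {
  flag?: ColorFlag;                        // parsed from --color=<value>; undefined if not passed
  env: Record<string, string | undefined>; // e.g. process.env
  stdoutIsTTY: boolean;                    // e.g. Boolean(process.stdout.isTTY)
}

function isSet(value: string | undefined): boolean {
  // Assumption: an empty string counts as unset.
  return value !== undefined && value !== "";
}

function shouldUseColor(inputs: ColorInputs): boolean {
  if (inputs.flag === "never") return false;          // 1. --color=never
  if (isSet(inputs.env["NO_COLOR"])) return false;    // 2. NO_COLOR
  if (inputs.flag === "always") return true;          // 3. --color=always
  const force = inputs.env["FORCE_COLOR"];
  if (isSet(force) && force !== "0") return true;     // 3. FORCE_COLOR (assumption: "0" does not force)
  return inputs.stdoutIsTTY;                          // 4. auto: color only on a terminal
}
```

In a Node entry point this would be called once at startup, for example with `env: process.env` and `stdoutIsTTY: Boolean(process.stdout.isTTY)`, and the result passed to whatever renders output. Some tools instead let an explicit `--color=always` override `NO_COLOR`; swapping branches 2 and 3 gives that behavior.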
+## Common Errors
+- **Logs on stdout.** A `console.log("Done!")` or progress bar to stdout silently corrupts `tool | jq` and `tool > file`. The single most common CLI bug — route all non-data to stderr.
+- **Everything exits 1.** CI can't distinguish "bad input" from "network down". Use distinct codes (Step 3) and 2 for usage errors.
+- **Color codes in files.** Forgetting the isTTY check writes raw `\e[31m` into redirected output. Auto-plain when not a TTY; honor `NO_COLOR`.
+- **Secret in a flag.** `--api-key sk-...` is visible to every user via `ps` and saved in `~/.zsh_history`. Use env/file/stdin (Step 9).
+- **Buffering huge output** then printing at the end → `head` hangs, memory blows up. Stream (Step 7).
+- **No `--` handling** → `tool rm -weird-name` treats the filename as a flag. The parser lib handles `--`; don't hand-roll past it.
+- **Prompting in a non-TTY** → CI hangs forever waiting on stdin. Detect TTY; require `--yes`/`--input` otherwise.
+- **Leaving temp files / a hidden cursor on Ctrl-C** — register signal cleanup (Step 10) before creating temps.
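The layered precedence from step 8 ("highest wins") reduces to a first-match merge that also records which layer supplied each value, which is what a `--verbose` "why is this value set?" line needs. A minimal sketch, assuming string-valued settings; `resolveConfig`, `ConfigLayer`, and the layer names are hypothetical.

```typescript
// Sketch only: layer names and keys are hypothetical.
interface ConfigLayer {
  source: string; // e.g. "flag", "env", "project", "user", "default"
  values: Record<string, string | undefined>;
}

interface ResolvedValue {
  value: string;
  source: string; // which layer won, for --verbose output
}

// Layers are passed highest precedence first; the first layer that defines a key wins.
function resolveConfig(layers: ConfigLayer[]): Record<string, ResolvedValue> {
  const resolved: Record<string, ResolvedValue> = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer.values)) {
      const value = layer.values[key];
      if (value === undefined || resolved[key] !== undefined) continue;
      resolved[key] = { value: value, source: layer.source };
    }
  }
  return resolved;
}
```

An unset flag is passed as `undefined` rather than omitted, so a parser that always emits every option key does not shadow lower layers by accident.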
+## Verify
+- `tool sub --json | jq .` succeeds and `tool sub > out.txt` produces clean data — **zero** log lines or ANSI in stdout.
+- `tool sub 2>/dev/null` still prints the full payload; `tool sub >/dev/null` still shows progress (proves the stream split).
+- `tool --color=never | cat` has no escape codes; `NO_COLOR=1 tool` is plain; piped output auto-plains without any flag.
+- Bad flag → exit 2 + usage on stderr; a real failure → documented non-zero code; success → 0. `echo $?` after each.
+- Ctrl-C mid-run → exit 130, no temp file left, cursor visible, terminal usable.
+- `ps aux | grep tool` during a run shows **no** secret; `--help` lists examples, exit codes, and config precedence.
+- `tool completion zsh` emits a valid script; a non-TTY run with a destructive command refuses without `--yes`/`--dry-run`.

package/skills/build-data-table/SKILL.md ADDED

@@ -0,0 +1,141 @@
+---
+name: build-data-table
+description: Builds production data grids that stay fast and accessible at 10k–1M+ rows — decide server-side vs client-side sort/filter/paginate by the dataset-fits-in-memory test (client only under ~10k rows, otherwise push to the API and treat the table as a controlled view of server state), ROW-VIRTUALIZE with TanStack Virtual or react-window so only the visible window mounts (fixed estimateSize, overscan 5–10, measureElement for dynamic rows, contain:strict, and a real scroll container — never table-layout:auto over thousands of rows), build the headless logic with TanStack Table v8 (or AG Grid when you need pinning/grouping/enterprise out of the box), add column resize/reorder/pin, inline edit with optimistic update + rollback on error, row selection with a stable rowId, full keyboard nav with roving tabindex over an ARIA grid (role=grid/row/gridcell, aria-sort, aria-rowcount/aria-rowindex so virtualization stays announced), position:sticky headers, streaming CSV export that doesn't block the main thread, and explicit empty/loading-skeleton/error/no-results states.
+when_to_use: Building a sortable/filterable/paginated table, an editable grid, or any list that must render thousands+ of rows without jank — virtualization, column pin/resize/reorder, inline edit, keyboard grid nav, or CSV export. Distinct from build-react-component (scaffolds one component's props/server-vs-client boundary; this is the full grid subsystem) and design-api-pagination (defines the backend cursor/keyset paging contract; this consumes it for server-side mode) — and from optimize-react-rerenders (fixes wasted renders in general React; this owns the table-specific row-memoization + virtualization).
+---
+## When to Use
+Reach for this skill when you're building a real data grid, not a static `<table>`:
+- "Make this table sortable/filterable and paginated" — and decide server vs client
+- "The table janks / freezes scrolling 50k rows" → you need virtualization
+- "Let users resize, reorder, and pin columns; persist the layout"
+- "Inline-edit a cell and save it optimistically with rollback on failure"
+- "Add row selection (checkboxes, select-all-across-pages) and bulk actions"
+- "Make the grid keyboard-navigable and screen-reader accessible (ARIA grid)"
+- "Export the current (filtered/sorted) view to CSV"
+NOT this skill:
+- Scaffolding a single component's props contract / Server-vs-Client boundary, not a grid subsystem → build-react-component
+- The backend list endpoint's cursor/keyset contract, page_size caps, `{data,next_cursor,has_more}` envelope → design-api-pagination (this skill *consumes* that contract in server-side mode)
+- Generic "why is React re-rendering" wasted-render diagnosis outside the table → optimize-react-rerenders (this owns only the row/cell memoization the grid needs)
+- Wiring TanStack Query caching/mutations/optimistic infra in general → manage-client-server-state (this skill calls into it for the data layer)
+- A spreadsheet with formulas/multi-sheet/cell-range math → build-spreadsheet (a grid is read-mostly tabular UI, not a calc engine)
+- Field-level form rules across a `<form>` (not per-cell inline edit) → build-form-validation
+- Deep WCAG audit of the finished UI → audit-accessibility-wcag (this skill builds the grid a11y baseline it then verifies)
+- Charts/heatmaps from the data → write-data-viz; cleaning/reshaping the rows before display → wrangle-tabular-data
+- Live-updating rows over a socket → build-realtime-channel feeds this grid; merge into rows keyed by stable id
+## Steps
+1. **Decide server-side vs client-side FIRST — it changes the whole architecture.** The test is "does the full dataset fit in memory and stay responsive to filter/sort in the browser?"
+   | | Client-side | Server-side |
+   |---|---|---|
+   | Row count | ≲ 10k (hard ceiling ~50k) | 10k → millions |
+   | Sort/filter/paginate | in JS, instant | the API does it; table is a *controlled view* |
+   | Source of truth | the loaded array | the server query (sort/filter/page in the request) |
+   | TanStack flag | `getSortedRowModel`, `getFilteredRowModel`, `getPaginationRowModel` | `manualSorting/manualFiltering/manualPagination: true` + `pageCount`/`rowCount` |
+   In server-side mode, debounce filter input (~300ms), send `sort`, `filter`, and the **cursor** (from design-api-pagination: keyset, not OFFSET) to the API, and keep table state controlled (pass `state: { sorting, columnFilters, pagination }` plus `onSortingChange` etc. in the `useReactTable` options). Never load 200k rows to filter client-side "because it's simpler"; it OOMs the tab.
+2. **Virtualize rows whenever you render more than ~100 at once; this is non-negotiable for big grids.** Mounting 10k `<tr>` nodes blows the DOM budget and kills scroll. Use **TanStack Virtual** (`@tanstack/react-virtual` is the React adapter over a framework-agnostic core; the default choice) or **react-window** (lighter, fixed/variable list). Only the visible window + overscan mounts.
+   ```tsx
+   import { useVirtualizer } from '@tanstack/react-virtual';
+   const rowVirtualizer = useVirtualizer({
+     count: rows.length,
+     getScrollElement: () => scrollRef.current,
+     estimateSize: () => 36,        // measured row height in px
+     overscan: 8,                   // render 8 extra each side; smooths fast scroll
+     measureElement: el => el.getBoundingClientRect().height, // only if rows vary
+   });
+   // render: a tall spacer div (totalSize) + absolutely-positioned visible rows
+   ```
+   Rules: give the scroll container a **fixed height** and `overflow:auto`; set `contain: strict` (or `layout paint`) on it; use `transform: translateY()` for row offset, not `top`. Do **not** use a native `<table>` with `table-layout:auto` over thousands of rows — the browser re-measures every column on each row; switch to `display:grid`/explicit `<col>` widths or `table-layout:fixed`. For dynamic row heights, `measureElement` + `data-index`; expect a one-frame jump unless you pre-measure.
+3. **Build the logic headless with TanStack Table v8; pick AG Grid only when you need its enterprise features turnkey.** TanStack Table is *headless*: it computes the models and you render every DOM node (full control, ~14kb, pairs with Virtual). Define columns with `createColumnHelper<Row>()` for type-safe accessors:
+   ```tsx
+   import { createColumnHelper, getCoreRowModel, useReactTable } from '@tanstack/react-table';
+   const col = createColumnHelper<Person>();
+   const columns = [
+     col.accessor('name', { header: 'Name', enableSorting: true }),
+     col.accessor('amount', { cell: c => fmtMoney(c.getValue()), enableColumnFilter: true }),
+   ];
+   const table = useReactTable({ data, columns, getRowId: r => r.id,
+     getCoreRowModel: getCoreRowModel() });
+   ```
+   Choose **AG Grid** instead when you need row grouping/tree data, pivoting, built-in column pinning + Excel export, or a million-row server-row-model without hand-rolling it — it ships those, but it's heavier and styling is its own system. Don't reach for a styled mega-component (`<DataGrid>` Material/MUI X) if you need fine control; you'll fight it.
+4. **Column resize / reorder / pin — wire the table's column features and persist the layout.** TanStack: `enableColumnResizing: true` + `columnResizeMode:'onChange'` (live) vs `'onEnd'` (commit on mouseup, cheaper); read width via `header.getSize()` and apply with CSS vars so resizing doesn't re-render every cell. Reorder = drag the header and reorder `columnOrder` state (use the table's `setColumnOrder`; a `dnd-kit` sortable context for the drag). **Pin** with `column.getIsPinned()` + `position: sticky; left: <accumulated width>; z-index` and a shadow on the last pinned column. Persist `{columnOrder, columnSizing, columnPinning, columnVisibility}` to localStorage (or the user profile) keyed by table id and rehydrate as initial state.
+5. **Inline edit = optimistic update + rollback, never block on the network.** On commit (Enter / blur), write the new value into the cached rows immediately, fire the mutation, and roll back on error. With TanStack Query:
+   ```tsx
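+   import { useMutation, useQueryClient } from '@tanstack/react-query';
+   // assumes: const qc = useQueryClient(); queryKey, patchCell, and patch are app-specific
+   useMutation({ mutationFn: patchCell,
+     onMutate: async (next) => {
+       await qc.cancelQueries({ queryKey });
+       const prev = qc.getQueryData(queryKey);
+       qc.setQueryData(queryKey, patch(prev, next)); // optimistic
+       return { prev };
+     },
+     onError: (_e, _v, ctx) => qc.setQueryData(queryKey, ctx?.prev), // rollback; ctx is undefined if onMutate threw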
+     onSettled: () => qc.invalidateQueries({ queryKey }),
+   });
+   ```
+   Edit mode: single cell, `aria-readonly` off, focus the input, Esc cancels, Enter commits + moves down, Tab moves right. Validate per-cell before the optimistic write (type/range); show the error inline and keep the old value. This is the per-cell case — multi-field `<form>` rules belong to build-form-validation.
+6. **Row selection: use a stable `getRowId` and decide select-all semantics.** TanStack `enableRowSelection`, state `rowSelection: Record<rowId, boolean>` — keyed by **your row id**, not the index, so selection survives sort/filter/page. The header checkbox has three states (none/some/all) via `table.getIsSomePageRowsSelected()` → `indeterminate`. Critical decision: "select all" = **current page** or **entire matching dataset**? In server-side mode you can't select rows you haven't loaded — implement "select all N matching" as a *predicate* (the active filter) sent to the bulk endpoint, not a list of ids, and show "All 4,213 selected" with a clear-selection affordance.
+7. **Keyboard nav + ARIA grid: roving tabindex over a real grid role, virtualization-aware.** A data grid is **one tab stop**: the container/active cell has `tabindex=0`, every other cell `tabindex=-1`; arrow keys move the active cell, and `tabindex=0` moves with it. Roles: container `role="grid"`, rows `role="row"`, cells `role="gridcell"`, header cells `role="columnheader"` with `aria-sort="ascending|descending|none"`. Because virtualization removes off-screen rows from the DOM, **you must** set `aria-rowcount={totalRows}` on the grid and `aria-rowindex` (1-based, header = 1) on every rendered row, and `aria-colcount`/`aria-colindex` for horizontally virtualized columns; otherwise screen readers announce "row 12 of 30" instead of "row 12 of 50000". Key map:
+   | Key | Action |
+   |---|---|
+   | ↑ ↓ ← → | move active cell |
+   | Home / End | first / last cell in row |
+   | Ctrl+Home / Ctrl+End | first / last cell in grid |
+   | PageUp / PageDown | scroll a viewport of rows (and move focus) |
+   | Enter / F2 | enter edit mode; Esc exits |
+   | Space | toggle row selection |
+   When focus moves to a virtualized row that's scrolled out, call `rowVirtualizer.scrollToIndex(i)` before focusing so the node exists. Sort headers must be operable with Enter/Space and update `aria-sort`. Deep WCAG conformance → audit-accessibility-wcag.
+8. **Sticky headers (and sticky pinned columns) with `position: sticky`.** Header row: `position: sticky; top: 0; z-index: 2` inside the scroll container (sticky is scoped to the nearest scrolling ancestor — the header must live *inside* the same `overflow:auto` element as the rows, not above it). Pinned column cells: `sticky; left: 0; z-index: 1`; the top-left corner (sticky header + pinned col) needs the higher `z-index`. Give sticky cells an opaque `background` (transparent sticky cells show rows bleeding through) and a bottom/right `box-shadow` so the freeze line reads.
+9. **CSV export of the *current view*, off the main thread for large sets.** Export reflects the active sort/filter/column-visibility, not the raw data. Build CSV correctly: quote fields containing `, " \n`, double internal quotes (`"a ""b"" c"`), prefix the file with `\uFEFF` (UTF-8 BOM) so Excel reads UTF-8, and **defend against CSV injection**: prefix any cell starting with `= + - @ \t \r` with a `'` (formula injection in spreadsheets). For server-side / huge datasets, hit a streaming export endpoint (the server pages with the keyset cursor and streams rows) or generate in a Web Worker + `Blob` so a 100k-row export doesn't freeze the UI; trigger download via an object URL.
+10. **Every grid has four states — design them, don't default to a blank box.** *Loading* → skeleton rows matching column widths (not a centered spinner; preserves layout, no shift). *Empty* (no data exists yet) → illustration + primary action ("Add your first record"). *No results* (filters exclude everything) → "No matches" + a **Clear filters** button (distinct from empty — the fix differs). *Error* → message + Retry that refires the query, keeping prior data visible if you have it. In server-side infinite scroll, show a row-level loading sentinel at the bottom and an error row with retry, not a full-table swap.
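The CSV rules from step 9 (quote, double internal quotes, BOM, injection guard) fit in two small functions. A minimal sketch for string cells only; `escapeCsvField` and `toCsv` are hypothetical helper names, and the CRLF row separator follows RFC 4180 rather than anything stated above. Note that the leading-character guard, applied exactly as written in step 9, also prefixes legitimate values such as negative numbers.

```typescript
// Sketch only: helper names are hypothetical; cells are assumed to be strings already.
const FORMULA_TRIGGERS = ["=", "+", "-", "@", "\t", "\r"];

function escapeCsvField(raw: string): string {
  let field = raw;
  // Injection guard: neutralize a leading formula character with an apostrophe.
  if (field.length > 0 && FORMULA_TRIGGERS.indexOf(field.charAt(0)) !== -1) {
    field = "'" + field;
  }
  // Quote when the field contains a comma, quote, or line break; double internal quotes.
  if (/[",\r\n]/.test(field)) {
    field = '"' + field.replace(/"/g, '""') + '"';
  }
  return field;
}

function toCsv(header: string[], rows: string[][]): string {
  const lines = [header].concat(rows).map(function (cells) {
    return cells.map(escapeCsvField).join(",");
  });
  // Leading BOM so Excel detects UTF-8; CRLF between rows per RFC 4180.
  return "\uFEFF" + lines.join("\r\n") + "\r\n";
}
```

For the export to reflect the current view, `rows` would be built from the table's visible columns and its current sorted and filtered row model, not from the raw data array.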
+## Common Errors
+- **Rendering all rows, then "optimizing" later.** 10k `<tr>` is already janky; virtualize from the start (step 2). Bolting it on after the layout and CSS already assume a normal `<table>` is the biggest rewrite.
+- **Client-side sort/filter on a server-scale dataset.** Loading 100k+ rows to filter in JS can run the tab out of memory and stacks up sequential page fetches (a request waterfall). Use `manualSorting/Filtering/Pagination` + the paged API (step 1).
+- **`<table>` with `table-layout:auto` over thousands of rows.** The browser has to measure every cell to size the columns, so layout cost grows with the row count and content changes can re-trigger it. Use `table-layout:fixed` / `display:grid` with explicit widths (step 2).
+- **Selection/edit keyed by row index.** Sort or filter and the wrong rows are selected/edited. Key by a stable `getRowId` (steps 5–6).
+- **No `aria-rowcount`/`aria-rowindex` with virtualization.** Screen readers announce the rendered window ("12 of 30"), not the real total. Set them from the full count (step 7).
+- **Every cell in the tab order.** Tabbing through 50 columns × visible rows is unusable. One tab stop + roving tabindex + arrow keys (step 7).
+- **Sticky header outside the scroll container.** `position:sticky` only sticks within its scrolling ancestor — a header above the `overflow:auto` div won't stick. Put it inside (step 8).
+- **Transparent sticky/pinned cells.** Rows show through the frozen header/column. Opaque background + shadow (step 8).
+- **Inline edit that awaits the server before updating UI.** Feels broken on slow networks. Optimistic write + rollback on error (step 5).
+- **CSV without quoting / injection guard.** Commas/newlines corrupt columns; a cell starting `=cmd|...` executes in Excel. Quote + escape + prefix dangerous leading chars + BOM (step 9).
+- **Exporting raw data instead of the current view.** Users expect the filtered/sorted/visible columns they see. Export from the table's current row model (step 9).
+- **One "no data" state for both empty and filtered-out.** Users can't tell "nothing exists" from "filters hide everything." Split them; give no-results a Clear-filters button (step 10).
+- **Re-creating `columns`/`data` inline each render.** New array identity busts memoization and re-runs every row model. Define `columns` module-level or `useMemo`, keep `data` referentially stable (defer deep render perf to optimize-react-rerenders).
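The index-versus-id selection error above is easy to reproduce in isolation. The following is a minimal sketch in plain TypeScript with no table library; `Row`, `toggle`, and `selectedRows` are hypothetical names used only for illustration.

```typescript
type Row = { id: string; name: string };

// Selection is a set of stable row ids, never row positions.
function toggle(selected: Set<string>, id: string): Set<string> {
  const next = new Set(selected);
  // delete() returns false when the id was absent, so add it in that case.
  if (!next.delete(id)) next.add(id);
  return next;
}

// Resolve the selection against whatever order the rows are currently in.
function selectedRows(rows: Row[], selected: Set<string>): Row[] {
  return rows.filter((row) => selected.has(row.id));
}

const rows: Row[] = [
  { id: "u1", name: "Chen" },
  { id: "u2", name: "Adams" },
  { id: "u3", name: "Baker" },
];

// The user selects the first visible row (Chen), stored by id.
const selected = toggle(new Set<string>(), rows[0].id);

// After sorting by name, index 0 is Adams, but the id-keyed selection still points at Chen.
const sorted = [...rows].sort((a, b) => a.name.localeCompare(b.name));
const stillSelected = selectedRows(sorted, selected);
```

Had the selection been stored as index `0`, the sort would have silently moved it from Chen to Adams.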
+## Verify
+1. **Scale:** load the target row count (10k / 100k) and scroll fast top-to-bottom — DOM node count stays bounded (only window + overscan in the inspector), no dropped frames.
+2. **Server mode:** sort/filter/page issue new API requests with the right params (keyset cursor, not OFFSET); the table never holds the full dataset; filter input is debounced.
+3. **Columns:** resize, reorder, pin, hide — layout holds, pinned columns freeze with a shadow, and the layout persists across reload.
+4. **Inline edit:** commit shows the new value instantly; force the mutation to fail → it rolls back to the old value and surfaces an error; Esc cancels, Enter advances.
+5. **Selection:** select rows, then sort/filter/page → the *same* rows stay selected (id-keyed); header checkbox shows indeterminate; "select all matching" sends a predicate, not loaded ids.
+6. **Keyboard:** Tab reaches the grid once; arrows/Home/End/Ctrl+Home move the active cell; focusing a scrolled-out row scrolls it into view first; sort headers fire on Enter/Space.
+7. **A11y:** screen reader announces `role=grid`, column headers with `aria-sort`, and the **real** total via `aria-rowcount` (not the virtualized window); run audit-accessibility-wcag for full conformance.
+8. **Sticky:** header stays pinned on vertical scroll, pinned columns on horizontal scroll, corner z-index correct, no bleed-through.
+9. **Export:** CSV of a filtered+sorted view opens in Excel with UTF-8 intact, fields with commas/quotes/newlines are correct, a `=`-leading cell is neutralized, and a 100k-row export doesn't freeze the tab.
+10. **States:** empty, no-results (with Clear-filters), loading skeleton, and error (with Retry) each render distinctly and the error path recovers.
+Done = the grid renders the target scale without jank (virtualized, bounded DOM), sort/filter/paginate run server-side for large datasets against the keyset API, columns resize/reorder/pin and persist, inline edit is optimistic with rollback, selection and edit are id-stable, the grid is one keyboard tab stop with a correct ARIA grid (rowcount/rowindex aware of virtualization), headers and pinned columns stick opaquely, CSV export of the current view is correctly quoted + injection-safe, and all four data states are designed — all proven by checks 1–10.
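One way to keep the four data states from step 10 distinct is to derive the view from query state in a single place. This is a sketch under assumptions: `GridQueryState`, `GridView`, and `resolveGridView` are hypothetical names, and the field shapes are not taken from any particular data-fetching library.

```typescript
type GridView = "data" | "loading" | "error" | "empty" | "no-results";

// Hypothetical shape of what the grid knows about its current query.
interface GridQueryState {
  isLoading: boolean;
  error: unknown;            // null/undefined when the last request succeeded
  rowCount: number;          // rows currently held, including prior data
  hasActiveFilters: boolean;
}

function resolveGridView(state: GridQueryState): GridView {
  // Prior data stays visible during a refetch or after a failed one;
  // the error is then surfaced as a banner or error row, not a table swap.
  if (state.rowCount > 0) return "data";
  if (state.isLoading) return "loading";
  if (state.error != null) return "error";
  // Zero rows and no error: the fix differs depending on whether filters caused it.
  return state.hasActiveFilters ? "no-results" : "empty";
}
```

A component can then switch on the result to render skeleton rows, the Retry message, the "Add your first record" prompt, or "No matches" with a Clear filters button.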