npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.0 - Mend

sanook-cli 0.4.0 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (235)

package/.env.example +19 -0
package/CHANGELOG.md +144 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +394 -51
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +2 -2
package/dist/providers/keys.js +3 -2
package/dist/providers/registry.js +133 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +218 -27
package/dist/ui/banner.js +4 -9
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/setup.js +6 -5
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/scaffold-cross-platform-app/SKILL.md ADDED

@@ -0,0 +1,137 @@
+---
+name: scaffold-cross-platform-app
+description: Scaffolds React Native (Expo Router) and Flutter app shells — feature-first folder layout, typed navigation + deep links, client-store wiring (Zustand/Redux Toolkit/Riverpod/Bloc), platform-divergent code and native bridges (Expo config plugin/Flutter platform channel), token-driven theming with dark mode, and env/build-flavor tooling.
+when_to_use: Standing up or restructuring a whole React Native (Expo) or Flutter app — choosing navigation, client state, platform-conditional code, bridging a native module, theming, and build flavors. Distinct from build-native-mobile-ui (SwiftUI/Compose screens, not RN/Flutter), manage-client-server-state (server cache/data fetching), design-token-system (the token pipeline this skill consumes), and ship-mobile-app-store-release (signing + store upload).
+---
+## When to Use
+Reach for this skill when the request is about **standing up or reorganizing a whole RN/Flutter app**, not a single screen:
+- "Set up a new Expo app with tabs + a typed navigation stack and deep links"
+- "Start a Flutter app with go_router and Riverpod, organized by feature"
+- "Pick state management — Redux Toolkit vs Zustand / Bloc vs Riverpod — and wire it"
+- "I need iOS-only and Android-only versions of this code / an adaptive widget"
+- "Bridge a native module / write an Expo config plugin / add a Flutter platform channel"
+- "Apply our design tokens + dark mode across the app shell"
+- "Add dev/staging/prod flavors with separate env, bundle IDs, and icons"
+NOT this skill:
+- A **native** iOS/Android screen in SwiftUI or Jetpack Compose (not RN/Flutter) → build-native-mobile-ui
+- Building one reusable RN component in an existing tree → build-react-component
+- Server-cache, fetching, optimistic updates, query invalidation → manage-client-server-state (this skill wires *client* state only)
+- Designing the token **architecture/pipeline** (primitive/semantic tiers, Style Dictionary, W3C export) → design-token-system (this skill *consumes* the exported tokens)
+- Pixel-matching a Figma/screenshot for a screen → implement-from-design
+- Tailwind/responsive web layout → style-responsive-tailwind
+- E2E flows on the running app → write-playwright-e2e
+- Code signing, keystores, TestFlight/Play upload, phased rollout → ship-mobile-app-store-release
+- The CI workflow that calls build/sign/upload lanes (EAS/Codemagic/Fastlane in CI) → cicd-pipeline-author
+- Storing signing keys / API secrets safely → secrets-management
+## Steps
+1. **Pick the framework lane and don't drift mid-project.** Default to **Expo (managed) + Expo Router** for RN, **Flutter stable + go_router** for Dart. Go bare RN only when a dependency needs native build config the managed prebuild can't express.
+   | Need | RN choice | Flutter choice |
+   |---|---|---|
+   | Standard app, OTA updates, fast start | **Expo managed** + `expo-dev-client` | Flutter stable |
+   | Custom native code you control | Expo + **config plugin** (stay managed) | Flutter + plugin/FFI |
+   | Native build settings Expo can't model | bare RN (`expo prebuild` then own `ios/`,`android/`) | n/a |
+   | Routing | **Expo Router** (file-based, typed) | **go_router** (typed routes) |
+   | New project command | `npx create-expo-app@latest -t default` | `flutter create --org com.acme app` |
+   Reject React-Navigation-only (no router) for new apps: Expo Router *is* React Navigation underneath but gives file-based deep linking for free.
+2. **Lay out feature-first, not type-first.** Group by domain so a feature is one deletable folder. Avoid the top-level `screens/ components/ reducers/` split — it scatters every feature across the tree.
+   ```
+   src/
+     app/                 # Expo Router routes (file = route). Flutter: lib/routing/
+       (tabs)/index.tsx   # deep link: myapp://  →  /
+       (tabs)/profile.tsx
+       post/[id].tsx      # myapp://post/42
+       _layout.tsx        # Stack/Tabs + theme provider
+     features/
+       auth/  { ui/  store.ts  api.ts  types.ts }
+       feed/  { ui/  store.ts  api.ts }
+     shared/  { ui/  hooks/  theme/  lib/ }
+     platform/            # *.ios.tsx / *.android.tsx live next to use site
+   ```
+   Flutter mirror: `lib/features/<x>/{presentation,application,data,domain}`, `lib/core/theme`, `lib/routing/app_router.dart`.
+3. **Make routes typed and deep-linkable from day one.**
+   - **Expo Router:** enable typed routes in `app.json` → `"experiments": { "typedRoutes": true }`. Set `scheme` in `app.json` (`"scheme": "myapp"`) so `myapp://post/42` resolves; for universal/app links add `expo-router` `+native-intent` or `associatedDomains`. Nest with `_layout.tsx`: a `(tabs)` group holds `<Tabs>`, a sibling `_layout` holds a `<Stack>` for modals/detail. Navigate with `router.push({ pathname: '/post/[id]', params: { id } })` — params are type-checked.
+   - **go_router:** define routes once, use `GoRoute` + `context.goNamed('post', pathParameters: {'id': id})`. Configure `MaterialApp.router(routerConfig: appRouter)`. Deep links work via the platform `<intent-filter>` (Android) / `CFBundleURLTypes` (iOS) — wire `uriPrefix`/`scheme` to match the route table.
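The idea behind typed, deep-linkable routes in step 3 can be sketched in plain TypeScript. This is not Expo Router's or go_router's implementation; `RouteParams`, `buildPath`, and `resolveDeepLink` are hypothetical names, and the route table simply mirrors the `(tabs)` / `post/[id]` layout shown in step 2.

```typescript
// Hypothetical route table: each pattern maps to the params it requires.
type RouteParams = {
  "/": {};
  "/profile": {};
  "/post/[id]": { id: string };
};
type RoutePattern = keyof RouteParams;

const ROUTE_PATTERNS: RoutePattern[] = ["/", "/profile", "/post/[id]"];

// Fill a pattern's [segments] from a params object.
// A missing or misspelled key fails type-checking at the call site.
function buildPath<P extends RoutePattern>(pattern: P, params: RouteParams[P]): string {
  const values = params as unknown as Record<string, string>;
  return pattern.replace(/\[(\w+)\]/g, (_m: string, key: string) =>
    encodeURIComponent(values[key])
  );
}

type ResolvedLink = { pattern: RoutePattern; params: Record<string, string> };

// Resolve e.g. "myapp://post/42" against the route table.
// Returns null when the scheme or the path is not in the table.
function resolveDeepLink(url: string, scheme: string): ResolvedLink | null {
  const prefix = scheme + "://";
  if (url.indexOf(prefix) !== 0) return null;
  const segments = url.slice(prefix.length).split("/").filter((s) => s.length > 0);
  for (const pattern of ROUTE_PATTERNS) {
    const expected = pattern.split("/").filter((s) => s.length > 0);
    if (expected.length !== segments.length) continue;
    const params: Record<string, string> = {};
    let matched = true;
    for (let i = 0; i < expected.length; i++) {
      const dynamic = /^\[(\w+)\]$/.exec(expected[i]);
      if (dynamic) {
        params[dynamic[1]] = decodeURIComponent(segments[i]);
      } else if (expected[i] !== segments[i]) {
        matched = false;
        break;
      }
    }
    if (matched) return { pattern, params };
  }
  return null;
}
```

With this table, `resolveDeepLink("myapp://post/42", "myapp")` yields the `/post/[id]` pattern with `id` set to `"42"`, and a URL with a different scheme resolves to `null`, which is the "silent no-op" case the scheme setting guards against.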
+4. **Wire client state by app shape — opinionated defaults, no "it depends":**
+   | App | RN | Flutter | Why |
+   |---|---|---|---|
+   | Small/medium, mostly local UI state | **Zustand** | **Riverpod** | Minimal boilerplate, no provider-tree gymnastics |
+   | Large, many devs, time-travel/devtools, strict conventions | **Redux Toolkit** | **Bloc** | Enforced structure, traceable events, predictable reducers |
+   | Server data (lists, caches, mutations) | **TanStack Query** | Riverpod `AsyncNotifier` / `dio` | Don't hand-roll cache in the store → manage-client-server-state |
+   Default to **Zustand** (RN) / **Riverpod** (Flutter) unless team size or audit needs push you to RTK/Bloc. **Boundary rule:** keep *server cache* out of the global store; the store holds session, auth, theme, navigation-adjacent UI state. One `store.ts`/notifier per feature; compose at app root, never one god-store.
+   ```ts
+   // features/auth/store.ts — Zustand slice, typed, selector-friendly
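+   import { create } from 'zustand';
+   import * as api from './api';              // assumed: api.ts in this feature exposes login()
+   import type { AuthState } from './types';  // assumed: AuthState is declared in types.ts
+   export const useAuth = create<AuthState>()((set) => ({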
+     user: null, token: null,
+     signIn: async (c) => { const { user, token } = await api.login(c); set({ user, token }); },
+     signOut: () => set({ user: null, token: null }),
+   }));
+   // read narrowly to avoid re-renders: const user = useAuth(s => s.user)
+   ```
+5. **Diverge by platform with the cheapest tool that works.** Escalate only as needed:
+   - **One value differs:** `Platform.select({ ios: 12, android: 8, default: 8 })` or `Platform.OS === 'ios'`. Flutter: `Theme.of(context).platform == TargetPlatform.iOS` or `defaultTargetPlatform`.
+   - **A whole component differs:** split files — `Button.ios.tsx` / `Button.android.tsx`; import `./Button` and Metro resolves per-platform. Flutter: `Platform.isIOS ? CupertinoButton(...) : ElevatedButton(...)`, or conditional imports for web vs native.
+   - **Adaptive by design:** Flutter `Switch.adaptive`, `CupertinoIcons` on iOS; RN use a wrapper that picks the native control. Never branch on `Platform.OS` deep inside business logic — isolate divergence at the UI/platform layer.
+6. **Bridge native code through the framework's official channel — never patch generated folders by hand.**
+   - **Expo config plugin** (stay managed): write a plugin that mutates native config at prebuild, e.g. `withInfoPlist` / `withAndroidManifest`, register in `app.json` `"plugins": ["./plugins/with-foo"]`. For real native APIs use the **Expo Modules API** (`createModule`, Swift/Kotlin) — typed JS interface, no manual bridge boilerplate.
+   - **Bare RN:** Turbo/Native Module — declare a TS spec, run Codegen, implement on iOS (Swift/ObjC) + Android (Kotlin/Java).
+   - **Flutter platform channel:** `MethodChannel('com.acme/foo')` on Dart side; implement the matching handler in `AppDelegate.swift` and `MainActivity.kt`. Keep the channel name and method strings in one shared constants file so both sides can't drift.
+   ```dart
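+   import 'package:flutter/services.dart';
+   const _ch = MethodChannel('com.acme/battery');
+   // Returns -1 when the native handler replies with null.
+   Future<int> level() async => await _ch.invokeMethod<int>('getLevel') ?? -1;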
+   ```
+   After any native change run `expo prebuild --clean` (Expo) or `flutter clean` and rebuild — JS/Dart hot reload will NOT pick up native edits.
+7. **Consume design tokens at the app shell; theme from them, don't hardcode.** Build the token source/pipeline with design-token-system; *this* step wires its output into RN/Flutter theming. One `theme/tokens.ts` (or `core/theme/tokens.dart`) holds colors/spacing/radii/typography. Build light+dark from the same tokens; resolve via system scheme.
+   - **RN:** export a `light`/`dark` theme object keyed off tokens; read `useColorScheme()`; pass to a `ThemeProvider` (or Expo Router's `<ThemeProvider value={scheme === 'dark' ? Dark : Light}>`). Never inline hex in components — pull from theme.
+   - **Flutter:** `MaterialApp(theme: lightFromTokens, darkTheme: darkFromTokens, themeMode: ThemeMode.system)`; build `ColorScheme.fromSeed(seedColor: tokens.brand)`; use `CupertinoTheme` where you ship iOS-native chrome. Dark mode = the dark token set + `themeMode`, not ad-hoc `if (isDark)` checks.
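As a rough illustration of step 7 on the RN side, the sketch below derives a light and a dark theme from one token object and picks between them from a system scheme value. The token names and hex values are made up for the example, and `themeFor` is a hypothetical helper rather than part of any library; in an app the `scheme` argument would come from something like `useColorScheme()`.

```typescript
// Hypothetical token values; a real app would import the exported token file instead.
const TOKENS = {
  color: { brand: "#2563eb", white: "#ffffff", ink: "#111827" },
  space: { sm: 8, md: 16 },
  radius: { md: 12 },
};

interface AppTheme {
  background: string;
  text: string;
  primary: string;
  space: { sm: number; md: number };
  radius: { md: number };
}

// Both themes read from the same tokens; no hex literal appears below this line.
const LIGHT_THEME: AppTheme = {
  background: TOKENS.color.white,
  text: TOKENS.color.ink,
  primary: TOKENS.color.brand,
  space: TOKENS.space,
  radius: TOKENS.radius,
};

const DARK_THEME: AppTheme = {
  ...LIGHT_THEME,
  background: TOKENS.color.ink,
  text: TOKENS.color.white,
};

// Shape of a system color-scheme value: "light", "dark", or nothing reported.
type SystemScheme = "light" | "dark" | null | undefined;

// Fall back to the light theme when the system reports no preference.
function themeFor(scheme: SystemScheme): AppTheme {
  return scheme === "dark" ? DARK_THEME : LIGHT_THEME;
}
```

Components then read `theme.background` or `theme.space.md` rather than literals, so a rebrand only touches `TOKENS`.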
+8. **Set up tooling once so the app is reproducible:**
+   - **Env/flavors:** RN — `app.config.ts` reading `process.env`, build profiles in `eas.json` (`development`/`preview`/`production`), distinct `bundleIdentifier`/`package` per profile. Flutter — `--flavor dev|staging|prod` with `--dart-define-from-file=env/dev.json`, matching Xcode schemes + Android `productFlavors`. **Secrets never in `app.json`/committed `.env`** → secrets-management. **Signing certs, keystores, and store upload** are out of scope → ship-mobile-app-store-release.
+   - **Fonts/assets:** RN `expo-font` `useFonts()` (or `expo-asset` preload), gate render on loaded; Flutter declare under `pubspec.yaml` `fonts:`/`assets:`.
+   - **Types/lint:** TS `strict: true`, `eslint` + `eslint-config-expo`, `prettier`; Flutter `flutter analyze` + `flutter_lints`. Add a `typecheck` script (`tsc --noEmit`) to CI.
+   - **Fast refresh** is on by default — if it stops working, it's almost always a non-component export or a circular import, not the bundler.
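The flavor rules in step 8 (one config per flavor, distinct ids, fail on unknown values) might look like this in TypeScript. It is only a sketch: the ids under `com.acme` and the `example.com` URLs are placeholders, and `resolveFlavor` / `duplicateBundleIds` are hypothetical helpers, not Expo or EAS APIs.

```typescript
interface FlavorConfig {
  bundleId: string;     // iOS bundleIdentifier / Android applicationId
  displayName: string;
  apiBaseUrl: string;   // non-secret config only; secrets stay out of source
}

// Placeholder values for illustration.
const FLAVORS: Record<string, FlavorConfig> = {
  dev: { bundleId: "com.acme.app.dev", displayName: "Acme Dev", apiBaseUrl: "https://dev.example.com" },
  staging: { bundleId: "com.acme.app.staging", displayName: "Acme Staging", apiBaseUrl: "https://staging.example.com" },
  prod: { bundleId: "com.acme.app", displayName: "Acme", apiBaseUrl: "https://api.example.com" },
};

// Pick the flavor from an env-style value, failing loudly on anything unknown
// instead of silently building the wrong variant.
function resolveFlavor(value: string | undefined): FlavorConfig {
  if (value === undefined || !Object.prototype.hasOwnProperty.call(FLAVORS, value)) {
    throw new Error("Unknown or missing flavor: " + String(value));
  }
  return FLAVORS[value];
}

// Guard against the "same id across flavors" mistake:
// returns every bundle id that more than one flavor uses.
function duplicateBundleIds(flavors: Record<string, FlavorConfig>): string[] {
  const seen: Record<string, number> = {};
  for (const key of Object.keys(flavors)) {
    const id = flavors[key].bundleId;
    seen[id] = (seen[id] || 0) + 1;
  }
  return Object.keys(seen).filter((id) => seen[id] > 1);
}
```

A dynamic config file could call `resolveFlavor` with the value of an environment variable, and a CI check could assert that `duplicateBundleIds(FLAVORS)` is empty.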
+## Common Errors
+- **Type-first folders (`screens/`, `reducers/`, `components/`).** Every feature smears across the tree; deleting a feature touches 6 folders. Group by feature, share only truly shared code in `shared/`.
+- **One global store for everything including server data.** Caching API responses in Zustand/Redux means manual invalidation and stale UI. Put server cache in TanStack Query / Riverpod `AsyncNotifier`; keep the store for session/UI state.
+- **`Platform.OS` checks buried in business logic.** Divergence leaks everywhere and is untestable. Isolate it at the UI/platform layer via `.ios`/`.android` files or `Platform.select`.
+- **Editing `ios/` or `android/` by hand on a managed Expo app.** The next `prebuild` wipes it. Express native changes as a **config plugin** or Expo Module instead.
+- **Native change with no rebuild.** Hot reload/Fast Refresh only reloads JS/Dart. A new native module or channel needs `expo prebuild --clean` / `flutter clean` + a fresh native build, or you'll debug a phantom "method not found."
+- **Hardcoded hex colors / magic spacing.** Dark mode and rebrands become a find-and-replace. Pull every color/space/radius from the token theme; derive light+dark from one source.
+- **Missing `scheme` / intent-filter, so deep links silently no-op.** Set `scheme` in `app.json` (RN) and the Android `<intent-filter>` + iOS `CFBundleURLTypes` (Flutter) to match the route table, or `myapp://post/42` opens the app to the home screen.
+- **Mismatched platform-channel/method names across Dart↔native.** A typo compiles cleanly and surfaces only at runtime, as a `MissingPluginException`. Keep channel + method strings in one shared constant referenced by both sides.
+- **Same `bundleIdentifier`/`applicationId` across flavors.** Dev and prod overwrite each other on-device and can't coexist. Give each flavor a distinct id + icon + display name.
+- **Untyped navigation params.** `router.push('/post/' + id)` loses type-checking and breaks on refactor. Enable typed routes (Expo) / named go_router routes and pass params as objects.
+## Verify
+Run on **both** an iOS simulator and an Android emulator/device — a single-platform pass proves nothing cross-platform.
+1. **Boots clean both OSes:** `npx expo run:ios` and `npx expo run:android` (or `flutter run -d ios` / `-d android`) start with **no red box / no exception**, app reaches the first screen.
+2. **Typed navigation + deep links:** a wrong route param fails `tsc --noEmit`/`flutter analyze`. `xcrun simctl openurl booted myapp://post/42` and `adb shell am start -a android.intent.action.VIEW -d "myapp://post/42"` both open the correct detail screen with the right id.
+3. **State wiring:** an action mutates the store and exactly the subscribed components re-render (verify with a render log/devtools); unrelated screens do not. Server data lives in the query cache, not the store.
+4. **Platform divergence resolves:** the `.ios`/`.android` (or adaptive) variant renders the native-looking control on each OS — confirm by screenshot, not assumption.
+5. **Native bridge round-trips:** call the module/channel method on both platforms and get a real value back (not `-1`/`MissingPluginException`); confirm a rebuild was done after the native edit.
+6. **Theming + dark mode:** toggle system appearance on each OS → colors/typography flip via tokens, no hardcoded color survives; no contrast regressions.
+7. **Flavors:** build `dev` and `prod` → distinct bundle id + icon + name, each reading its own env, no committed secret in the bundle.
+8. **Lint/types green:** `tsc --noEmit` + `eslint .` (or `flutter analyze`) pass with zero errors.
+Done = the app builds and runs on iOS *and* Android, deep links and typed nav resolve on both, state/theming/native-bridge round-trip correctly per platform, and lint + typecheck are green.

package/skills/schema-evolution-compatibility/SKILL.md ADDED

@@ -0,0 +1,121 @@
+---
+name: schema-evolution-compatibility
+description: Evolves shared data contracts (events, API payloads, DB columns, protobuf/avro) without breaking live consumers — additive-only changes with optional+default fields, NEVER remove/rename/repurpose a field or reuse a protobuf tag / avro position (reserve them with `reserved`/aliases instead), backward vs forward vs full compatibility chosen per producer/consumer upgrade order, expand-then-contract (dual-write/dual-read) migrations for renames and type changes, and a schema registry (Confluent/Buf) wired into CI to mechanically reject incompatible diffs before merge. Tolerant reader, unknown-field preservation, and explicit versioning when a true break is unavoidable.
+when_to_use: Changing a schema that something else already reads or writes — adding/removing/renaming a field on a Kafka event, API JSON payload, protobuf/avro/JSON-Schema, or a DB column other services depend on; deciding if a change is safe to deploy and in what order; or wiring registry compat checks into CI. Distinct from design-protobuf-grpc-service (designs the IDL/RPCs from scratch; this evolves an existing one safely) and db-migration-safety (runs the ALTER without locking/downtime; this decides whether the column change breaks readers at all).
+---
+## When to Use
+Reach for this skill when a contract that *another* process already produces or consumes is changing and you must not break it mid-deploy:
+- "Add a field to this Kafka event / API response — will old consumers still parse it?"
+- "Rename / remove / change the type of a field that other services read"
+- "Which compatibility mode (backward/forward/full) for this Avro subject?"
+- "We reused a protobuf field number and a consumer is reading garbage"
+- "Deploy producers or consumers first? what's the safe order?"
+- "Wire `buf breaking` / Confluent compat checks into CI so bad diffs get blocked"
+- "Migrate a column/field rename with zero downtime across services"
+NOT this skill:
+- Designing the proto/gRPC service, message shapes, and RPCs from scratch → design-protobuf-grpc-service (this skill *evolves* an IDL that already has live readers)
+- Running the `ALTER TABLE` itself without locks/downtime (lock-free index, batched backfill, `NOT VALID` constraints) → db-migration-safety (it makes the DDL safe; this skill decides if the column change breaks consumers)
+- Designing the relational schema / normalization / keys → design-relational-schema
+- The REST/GraphQL field-type and nullability contract for one endpoint → rest-graphql-contract
+- API versioning policy, deprecation headers, pagination contracts → api-design-review / design-api-pagination
+- Validating one payload against a schema at the edge (request validation) → build-form-validation / validate-data-quality
+- Verifying producer and consumer agree via recorded pacts → contract-testing (it tests the agreement; this skill governs how the schema may change)
+- Big phased rewrite/cutover of a whole system → plan-strangler-migration
+## Steps
+1. **Pick the compatibility mode from your upgrade order — it's the whole game.** Compatibility is asymmetric and defined by *who reads data written under the other schema*:
+   | Mode | Guarantees | Allowed change | Upgrade FIRST |
+   |---|---|---|---|
+   | **BACKWARD** | new consumer reads data from old + new producers | **add** optional field (w/ default), **delete** optional field | **consumers** |
+   | **FORWARD** | old consumer reads data from new producer | **add** optional field, **delete** field that had a default | **producers** |
+   | **FULL** | both directions | **only** add/remove **optional fields with defaults** | either |
+   | **\*_TRANSITIVE** | same, but vs **all** prior versions not just the last | — | — |
+   Default to **BACKWARD** for events/topics (Confluent's default — consumers lag and replay history, so the new reader must handle old records). Use **FORWARD** when producers ship ahead of consumers. Use **FULL_TRANSITIVE** for long-lived event logs you replay from the beginning. The rule of thumb: **add a field → forward-safe; remove a field → backward-safe; do both safely → only optional+default**.
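The rule of thumb in step 1 can be checked mechanically. The sketch below is a deliberately simplified, Avro-style model (fields are only "present" and "has a default"; types, unions, and transitive checks are ignored), and `canRead`, `classifyChange`, and `upgradeFirst` are hypothetical names, not a registry API.

```typescript
interface FieldSpec {
  hasDefault: boolean;
}
type RecordSchema = Record<string, FieldSpec>;
type Compat = "FULL" | "BACKWARD" | "FORWARD" | "NONE";

// A reader can decode a writer's data if every field the reader expects
// is either written by the writer or has a default the reader can fall back to.
function canRead(reader: RecordSchema, writer: RecordSchema): boolean {
  return Object.keys(reader).every(
    (field) => Object.prototype.hasOwnProperty.call(writer, field) || reader[field].hasDefault
  );
}

function classifyChange(oldSchema: RecordSchema, newSchema: RecordSchema): Compat {
  const backward = canRead(newSchema, oldSchema); // new reader, old data
  const forward = canRead(oldSchema, newSchema);  // old reader, new data
  if (backward && forward) return "FULL";
  if (backward) return "BACKWARD";
  if (forward) return "FORWARD";
  return "NONE";
}

// Which side to deploy first, following the table above.
function upgradeFirst(compat: Compat): string {
  if (compat === "BACKWARD") return "consumers";
  if (compat === "FORWARD") return "producers";
  if (compat === "FULL") return "either";
  return "neither: phase the change";
}
```

Under this model, adding a field with a default classifies as `FULL`, adding a field without one as `FORWARD` only, removing a field without a default as `BACKWARD` only, and a bare rename as `NONE`.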
+2. **Additive-only is the safe default. Every new field is optional with a default, never required.** A new *required* field is a backward break twice over: a reader on the new schema rejects what not-yet-upgraded producers still send, and it rejects every old record already stored. Concretely:
+   - **JSON / JSON-Schema:** add the key, do NOT add it to `required`, give consumers a default. Keep `additionalProperties` permissive (or `unevaluatedProperties` in 2020-12) so old readers tolerate fields they don't know.
+   - **protobuf (proto3):** every field is already optional; new scalar fields default to `0`/`""`/`false`. Just append with a **fresh field number**. Use `optional` (proto3 explicit presence) when you must distinguish "unset" from "zero".
+   - **Avro:** a new field **must** carry a `"default"`, or the change is not backward-compatible (a reader on the new schema has nothing to fill the field with for old records): `{"name":"x","type":["null","string"],"default":null}`. This is the #1 Avro footgun.
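For the JSON case in step 2, a tolerant reader might look like the following sketch. The `UserV2` shape, the `nickname` field, and the `readUser` / `writeUser` helpers are all hypothetical; the point is only the pattern: default the new optional field, and carry unknown fields through untouched.

```typescript
interface UserV2 {
  id: string;
  nickname: string;                        // new optional field; defaulted when absent
  unknownFields: Record<string, unknown>;  // keys this reader does not know, kept for re-emit
}

const KNOWN_KEYS = ["id", "nickname"];

// Parse a payload written by any producer version.
function readUser(json: string): UserV2 {
  const raw = JSON.parse(json) as Record<string, unknown>;
  const id = raw["id"];
  if (typeof id !== "string") throw new Error("id is required");
  const nickname = raw["nickname"];
  const unknownFields: Record<string, unknown> = {};
  for (const key of Object.keys(raw)) {
    if (KNOWN_KEYS.indexOf(key) === -1) unknownFields[key] = raw[key];
  }
  return {
    id,
    nickname: typeof nickname === "string" ? nickname : "", // default for old payloads
    unknownFields,
  };
}

// Re-serialize without dropping fields a newer producer added.
function writeUser(user: UserV2): string {
  return JSON.stringify({ ...user.unknownFields, id: user.id, nickname: user.nickname });
}
```

An old payload such as `{"id":"u1"}` reads with an empty `nickname`, and a payload from a newer producer keeps its extra keys across a read/write round trip.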
+3. **NEVER remove, rename, or repurpose a field in place — and NEVER reuse a tag/number/position.** Renaming = remove + add to every consumer; changing a field's *meaning* while keeping its name/number is the worst break because it passes schema checks but silently corrupts data. Reuse of an identifier makes old payloads decode into the wrong field. Reserve instead:
+   - **protobuf** — when you drop field `7` (name `email`), reserve both so the number and name can never be re-added:
+     ```proto
+     message User {
+       reserved 7, 9 to 11;          // numbers
+       reserved "email", "legacy_id"; // names
+       string username = 3;
+     }
+     ```
+   - **Avro:** never reuse a removed field's name; to *rename*, declare the old name on the renamed field in the new schema via `"aliases": ["old_name"]` so readers on the new schema still resolve data written under the old name.
+   - **JSON** — treat a removed key as permanently retired; never recycle a key name for a different type/meaning.
+   A type change (e.g. `int32 → string`, `string → enum`) is **not** additive even if the name stays — it's a remove-and-add. Wire-compatible widenings exist in proto (`int32`/`int64`/`uint32`/`bool` are interchangeable on the wire; `sint*`/`fixed*` are **not**) but treat them as breaking unless you've verified the exact pair.
+4. **For a true rename or type change, run expand → migrate → contract (dual-write/dual-read).** You cannot atomically change a field across N independently-deployed services. Phase it:
+   | Phase | Producer | Consumer | DB column |
+   |---|---|---|---|
+   | **1 Expand** | write BOTH `old` + `new` | still reads `old` | add `new` col, backfill, dual-write trigger |
+   | **2 Migrate** | writes both | switch reads to `new` (fallback to `old`) | — |
+   | **3 Contract** | stop writing `old`; reserve it | reads `new` only | drop `old` col (after grace + replay window) |
+   Each phase is independently deployable and rollback-safe. The grace window between expand and contract must exceed your **longest consumer lag + replay/retention window** (e.g. Kafka topic retention) so no in-flight or replayed record still needs the old field. The DB column drop is where db-migration-safety takes over.
+5. **Deploy in the order the compatibility mode dictates — getting this backwards is the classic outage.**
+   - **BACKWARD** change (added/removed optional): deploy **consumers first**, then producers. New consumers can read both shapes; once all consumers handle the new shape, flip producers.
+   - **FORWARD** change: deploy **producers first** — old consumers tolerate the new field (they ignore unknowns), then upgrade consumers to use it.
+   - **FULL**: either order, but still roll out gradually and watch dead-letter/parse-error metrics during the canary.
+   - Never deploy producer and consumer in lockstep assuming atomicity — there is always a window where mixed versions run.
+6. **Run a schema registry with mechanical compatibility checks, and gate CI on them.** Humans miss breaks; the registry doesn't.
+   - **Confluent Schema Registry** (Avro/Protobuf/JSON-Schema over Kafka): set the per-subject mode and test the candidate before publishing: `curl -X PUT .../config/<subject> -d '{"compatibility":"BACKWARD_TRANSITIVE"}'`, then `POST .../compatibility/subjects/<subject>/versions/latest` returns `{"is_compatible": true|false}`. The Maven plugin's `schema-registry:test-compatibility` goal (or an equivalent Gradle task) does this in CI.
+   - **protobuf** → **`buf breaking --against '.git#branch=main'`** in CI; rules `FIELD_NO_DELETE` (forces `reserved`), `FIELD_SAME_TYPE`, `RESERVED_*` catch exactly the breaks above. Pair with `buf lint`.
+   - **Avro** standalone → `java -jar avro-tools` or the `avro-compatibility` checker; gate the PR.
+   - **JSON-Schema** → `json-schema-diff` / `oasdiff` (for OpenAPI) flag breaking changes.
+   Make the check **fail the build**, not warn. The registry's `compatibility` setting per subject is the contract; CI is the enforcement.
+7. **Write consumers as tolerant readers — ignore unknown fields, never hard-fail on them.** Forward compatibility depends on the *reader's* behavior as much as the schema:
+   - JSON: don't use a strict/closed deserializer that throws on unknown keys. Jackson → `@JsonIgnoreProperties(ignoreUnknown = true)` / `FAIL_ON_UNKNOWN_PROPERTIES=false`; Go `encoding/json` ignores unknowns by default (avoid `DisallowUnknownFields`); Pydantic → `model_config = ConfigDict(extra="ignore")` (NOT `"forbid"`).
+   - **Preserve, don't drop, unknown fields** on a read-modify-write path, or a round-trip through an old service silently deletes data a newer one added. protobuf keeps unknown fields by default; for JSON, capture them (`@JsonAnySetter`, `additionalProperties` map) and re-emit. This is the subtle one — a "harmless" old service in the middle of a pipeline strips new fields.
+   - Always provide a default when a field is absent; don't assume presence.
+8. **When a break is genuinely unavoidable, version explicitly — don't mutate in place.** Some changes (splitting one field into two, restructuring nesting, semantic redefinition) can't be made compatible. Then:
+   - **Events:** new schema = **new subject / new topic** (`orders.v2`) or an explicit `schema_version` field; run v1 and v2 in parallel; migrate consumers; retire v1 after the replay window. Never silently change `v1`'s meaning.
+   - **APIs:** new path/header version (`/v2`, `Accept: application/vnd.api.v2+json`); deprecate v1 with a sunset header and timeline.
+   Versioning is the escape hatch, not the default — additive evolution avoids a version bump for the 90% case.
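The tolerant-reader rule in step 7 can be sketched in a few lines. This is a minimal illustration using only the standard library, not any particular framework's API: the `User` model, `KNOWN_FIELDS`, `decode`, and `encode` are hypothetical names. It shows a consumer that defaults absent fields, ignores keys it does not know, and still re-emits them on a read-modify-write path.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Hypothetical: the only fields this (older) consumer version understands.
KNOWN_FIELDS = {"id", "name"}


@dataclass
class User:
    id: int
    name: str = ""  # default used when the field is absent
    # Keys this version does not know; kept so a round-trip does not erase them.
    unknown: dict[str, Any] = field(default_factory=dict)


def decode(payload: str) -> User:
    """Parse a record, tolerating and preserving unknown keys."""
    raw = json.loads(payload)
    extras = {k: v for k, v in raw.items() if k not in KNOWN_FIELDS}
    return User(id=raw["id"], name=raw.get("name", ""), unknown=extras)


def encode(user: User) -> str:
    """Re-emit the record, including fields added by newer producers."""
    return json.dumps({**user.unknown, "id": user.id, "name": user.name})
```

A record carrying an extra `tier` key passes through `decode` and `encode` with `tier` intact, which is the property the "harmless old service in the middle of a pipeline" needs.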
+## Common Errors
+- **Adding a required field.** Breaks every old producer and every historical record at once. Fix: optional + default, always.
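+- **Avro field with no `default`.** Adding one breaks backward compat (new readers cannot decode old records); removing one breaks forward compat (old readers cannot decode new records). Fix: every Avro field added/removed needs an explicit `"default"`.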
+- **Reusing a protobuf field number (or Avro position).** Old payloads decode into the wrong field — type-confused garbage that passes schema checks. Fix: `reserved` the number AND the name; only ever append fresh numbers.
+- **Renaming a field in place.** It's a delete + add to every consumer simultaneously. Fix: expand→migrate→contract, or Avro `aliases`.
+- **Repurposing a field's meaning while keeping its name.** Passes all mechanical checks, silently corrupts semantics. Fix: new field; reserve the old one.
+- **Wrong deploy order for the compat mode.** Backward change with producers-first (or forward with consumers-first) → mixed-version outage. Fix: consumers-first for backward, producers-first for forward.
+- **Strict deserializer that throws on unknown fields.** Kills forward compatibility the moment a producer adds a field. Fix: tolerant reader (`ignoreUnknown`, `extra="ignore"`, no `DisallowUnknownFields`).
+- **Dropping unknown fields on read-modify-write.** An older service in the pipeline silently erases data newer services added. Fix: preserve and re-emit unknown fields.
+- **Treating a type widening as free.** `int32→string` or `string→enum` is a break even with the same name; not all proto widenings are wire-safe. Fix: verify the exact pair or run expand→contract.
+- **No registry / CI gate.** Relying on review to catch breaks. Fix: `buf breaking` / Confluent compat check that **fails the build**.
+- **Checking only against the latest version, not all.** A change compatible with v3 but not v1 breaks replay. Fix: `*_TRANSITIVE` mode for replayable logs.
+- **Contracting before the replay/retention window passes.** Dropping the old field while replayable records still reference it. Fix: grace window > longest consumer lag + topic retention.
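Several of the errors above come down to skipping the staged rename. A rough sketch of the producer and consumer sides of expand → migrate → contract follows, assuming a hypothetical rename of `email` to `contact_email` and plain dict records; the phase names and helper functions are illustrative only.

```python
from typing import Optional

# Hypothetical rename used for illustration.
OLD, NEW = "email", "contact_email"


def produce(value: str, phase: str) -> dict:
    """Build the outgoing record for a given rollout phase."""
    if phase in ("expand", "migrate"):
        return {OLD: value, NEW: value}  # dual-write: both names carry the value
    if phase == "contract":
        return {NEW: value}  # old name retired (and reserved in the schema)
    raise ValueError(f"unknown phase: {phase}")


def consume(record: dict) -> Optional[str]:
    """Migrate-phase reader: prefer the new field, fall back to the old one."""
    if NEW in record:
        return record[NEW]
    return record.get(OLD)  # replayed or in-flight records may only have OLD
```

The fallback in `consume` is what makes the contract phase safe to delay: records written before the expand phase remain readable until the replay window has passed.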
+## Verify
+1. **Mechanical compat check passes in CI:** `buf breaking` / Confluent `is_compatible:true` / Avro checker runs on the PR diff and **fails the build** on an incompatible change — proven by intentionally introducing a remove/rename and watching CI go red.
+2. **Old-schema read of new data, and vice versa:** serialize a record with the new schema, deserialize with the old (forward); serialize with old, read with new (backward) — both succeed, defaults fill absent fields. This is the literal compatibility definition; test it, don't assume it.
+3. **No required field added, every new field has a default:** grep the diff — new fields are optional and defaulted (`"default"` in Avro, not in JSON `required`, appended proto numbers).
+4. **Removed fields are reserved:** any dropped proto field has its number AND name in `reserved`; any renamed Avro field has `aliases`; no identifier is reused.
+5. **Tolerant reader confirmed:** feed a consumer a payload with an extra unknown field → it parses and ignores it (no exception); on read-modify-write, the unknown field survives the round-trip.
+6. **Deploy order documented and rehearsed:** the rollout plan states consumers-first (backward) or producers-first (forward), and a mixed-version canary shows zero parse errors / dead-letters during the window.
+7. **Rename via expand→contract, not in place:** the migration is staged (dual-write, switch reads, then drop + reserve) and each phase is independently rollback-safe; the old field is dropped only after the replay window.
+8. **Transitive check for replayable logs:** for an event log replayed from offset 0, compat mode is `*_TRANSITIVE` and a candidate is checked against all prior versions, not just latest.
+Done = the change is additive (optional + defaulted) or staged through expand→migrate→contract, no field/tag/position is ever removed-without-reserving or repurposed, the compatibility mode matches the deploy order, consumers are tolerant readers that preserve unknowns, and a schema-registry compat check fails CI on any incompatible diff — all proven by the old↔new round-trip and the red-CI test in checks 1–2.
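Check 2 (old-schema read of new data and the reverse) can be exercised even without a registry. The sketch below is a simplified stand-in for schema resolution, not Avro's or protobuf's actual algorithm: a schema is modeled as a hypothetical mapping of field name to default, with a `REQUIRED` sentinel for fields that have none.

```python
# Sentinel marking a field that has no default (hypothetical convention).
REQUIRED = object()


def read_with(reader_fields: dict, record: dict) -> dict:
    """Decode `record` as a reader with `reader_fields` would see it."""
    out = {}
    for name, default in reader_fields.items():
        if name in record:
            out[name] = record[name]
        elif default is REQUIRED:
            raise KeyError(f"missing required field: {name}")
        else:
            out[name] = default  # absent field filled from the default
    return out  # fields only the writer knows are ignored


V1 = {"id": REQUIRED}
V2 = {"id": REQUIRED, "nickname": None}          # additive: optional + default
V2_BAD = {"id": REQUIRED, "nickname": REQUIRED}  # new required field
```

Reading a V1 record with `V2` succeeds and fills `nickname` from its default, reading a V2 record with `V1` succeeds and drops the unknown field, and reading a V1 record with `V2_BAD` raises, which is the backward break the steps warn about.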

package/skills/send-transactional-email/SKILL.md ADDED

@@ -0,0 +1,126 @@
+---
+name: send-transactional-email
+description: Ships reliable transactional email (password resets, receipts, verification, alerts) where the hard part is deliverability, not the API call — authenticate the From domain with SPF/DKIM/DMARC alignment, send through a provider (SES/Postmark/SendGrid/Resend/Mailgun) instead of a cold self-hosted MTA, isolate transactional from marketing streams, build inlined-CSS multipart emails, send idempotently via a job runner, and process bounce/complaint webhooks into a suppression list so mail actually lands in the inbox.
+when_to_use: Sending or fixing delivery of transactional email — auth/verification/reset/receipt mail landing in spam, domain authentication (SPF/DKIM/DMARC), bounce/complaint handling, suppression lists, or rendering. Distinct from implement-push-notifications (the mobile/web PUSH channel, a different transport entirely) and message-queue-jobs (the async job system that ENQUEUES the send and owns retry/DLQ — this skill owns the email-specific deliverability, content, and feedback loop).
+---
+## When to Use
+Reach for this skill when the work is **getting a transactional email into the inbox and reacting to what bounces** — domain auth, provider routing, content, and the feedback loop:
+- "Password-reset / verification emails are landing in spam (or vanishing) — fix deliverability"
+- "Set up SPF / DKIM / DMARC so our From domain authenticates and aligns"
+- "Pick and wire a provider (SES, Postmark, SendGrid, Resend, Mailgun) for receipts/alerts"
+- "Our marketing blasts are tanking password-reset delivery — separate the streams"
+- "Process bounce + complaint webhooks and stop re-sending to dead addresses"
+- "Build the email so it renders right in Outlook/Gmail/dark mode with a plain-text fallback"
+- "A retry double-sent the receipt / verification email — make sends idempotent"
+NOT this skill:
+- The async job/queue that **enqueues** the send, owns retry-with-backoff, DLQ, poison-message handling → message-queue-jobs (this skill is what runs *inside* that job)
+- The idempotency-key store/dedup primitive that makes the enqueue+send exactly-once → idempotency-keys
+- Mobile/web **PUSH** notifications (APNs/FCM/Web Push) — a different transport, not email → implement-push-notifications
+- The raw DNS record mechanics (TTL, zone editing, how a TXT/CNAME is published) → configure-dns-tls (this skill tells you *which* records; that skill publishes them)
+- Tracking-pixel/open-tracking consent, unsubscribe-data handling, PII retention/erasure → map-privacy-data-gdpr
+- Throttling how many emails one user can trigger → rate-limiting
+- Marketing campaigns, newsletters, drip sequences, segmentation → (out of scope — a different sending stream entirely; see step 3)
+This skill owns **domain authentication, provider/stream choice, email content, idempotent sending, and feedback processing**. It hands the actual job-running to message-queue-jobs.
+## Steps
+1. **Authenticate the sending domain — this is the gate, not optional.** Gmail/Yahoo require SPF + DKIM + DMARC on bulk and increasingly on all mail; without alignment you go to spam or get rejected. Publish all three on the From domain (records owned by configure-dns-tls; *values* below). Use a dedicated subdomain like `mail.example.com` / `txn.example.com` so reputation is scoped.
+   | Record | Where | Value (shape) | Purpose |
+   |---|---|---|---|
+   | **SPF** | `TXT` at sending domain | `v=spf1 include:amazonses.com ~all` (one TXT, ≤10 DNS lookups, `~all` not `-all` until verified) | authorizes the provider's IPs to send for the `Return-Path` (envelope MAIL FROM) domain |
+   | **DKIM** | provider-given `CNAME`s (SES, Resend) or `TXT` (`<sel>._domainkey`) | provider publishes the public key; mail is signed `d=example.com` | cryptographic signature, survives forwarding |
+   | **DMARC** | `TXT` at `_dmarc.example.com` | `v=DMARC1; p=none; rua=mailto:dmarc@example.com; adkim=r; aspf=r` | tells receivers what to do on auth fail + reports |
+   **Alignment is the part people miss:** DMARC passes only if SPF *or* DKIM passes **and** its domain aligns with the **visible `From:`** domain. Under relaxed alignment (`adkim=r`/`aspf=r`, the default) a subdomain of the same organizational domain aligns; strict (`s`) demands an exact match, which the subdomain setup used here would fail for SPF. `Return-Path: bounces@provider.com` aligns SPF to the provider, **not** to your From, so DKIM `d=` must be your From domain. Set a **custom Return-Path / MAIL FROM** subdomain (`bounce.example.com`) at the provider for SPF alignment too. Roll DMARC `p=none` → monitor `rua` reports for 1–4 weeks → `p=quarantine` → `p=reject`. Never start at `reject`; you'll blackhole your own mail.
+2. **Send through a reputable provider — do NOT run your own SMTP MTA on cold IPs.** A fresh cloud IP has zero reputation and is often already on a blocklist; running Postfix yourself means you own PTR, warmup, FBL enrollment, and blocklist fights. Use a provider:
+   | Provider | Best for | Notes |
+   |---|---|---|
+   | **Postmark** | pure transactional, fastest inbox | hard-blocks marketing on transactional streams; great deliverability |
+   | **Amazon SES** | volume, cost | cheapest; you do more setup; sandbox until prod access granted |
+   | **Resend** | DX-first, modern stacks | React-email native; simple DKIM CNAMEs |
+   | **SendGrid / Mailgun** | scale, both streams | bigger surface, more knobs |
+   If you self-host anyway (rare): set **PTR / reverse DNS** so the IP resolves back to your HELO hostname (no PTR ≈ instant spam), enroll in every provider's **FBL**, and warm the IP. For 99% of cases, a provider is the answer.
+3. **Separate TRANSACTIONAL from MARKETING — different subdomains, IPs, and streams.** A marketing complaint must **never** be able to poison password-reset delivery. Use `txn.example.com` (or a dedicated transactional stream/IP pool) for resets/receipts/verification, and `news.example.com` (separate IP/stream) for campaigns. Postmark enforces this with separate Streams; SES uses separate **configuration sets** + dedicated IP pools. Mixing them means one bad newsletter tanks your ability to log users in.
+4. **Dedicated vs shared IP, and warm up before volume.** Shared IP (provider's pool) is fine and *better* at low/spiky volume — you inherit the pool's warm reputation. Move to a **dedicated IP** only above ~100k/month steady, then **warm it**: ramp send volume gradually so receivers learn the IP is legit.
+   | Day | Max sends/day (rough) |
+   |---|---|
+   | 1–2 | 50 → 100 |
+   | 3–5 | 500 → 1,000 |
+   | 6–10 | 5,000 → 20,000 |
+   | 11–20 | double daily toward target |
+   Send your **best, most-engaged traffic first** during warmup; complaints early on a cold dedicated IP are very expensive.
+5. **Build the email so it actually renders — inline CSS, multipart, dark-mode, accessible.** Email clients (esp. Outlook/Word engine, Gmail) strip `<style>`, ignore flexbox/grid, and need table layout. Use **MJML** (compiles to bulletproof tables) or a templating tool with a **CSS inliner** (`juice`, premailer) — never raw `<div>` flexbox.
+   - **Always send `multipart/alternative`** with both `text/plain` AND `text/html`. A missing/empty plain-text part is a strong spam signal and breaks watches/screen readers.
+   - **Inline every style** (`style="…"` on elements); media queries in `<head>` for mobile/dark-mode are progressive enhancement only.
+   - **Dark mode:** set `<meta name="color-scheme" content="light dark">` and `supported-color-schemes`; don't rely on transparent PNG logos (add a background).
+   - **Accessible:** real `alt` on images (many clients block images by default — the email must make sense with images off), sufficient contrast, semantic headings, descriptive link text (not "click here").
+   - Put the critical action (reset link, code) in **text**, not baked into an image.
+6. **Set From / Reply-To / Return-Path correctly.** `From:` = a real, branded, *authenticated* address on your sending domain (`noreply@txn.example.com` is fine but a monitored `Reply-To` is friendlier). `Reply-To:` → where humans actually reach you (`support@example.com`). **`Return-Path` / envelope MAIL FROM** → the provider's/your bounce-handling address on an SPF-aligned subdomain; this is where bounces go and what SPF checks — **never** your visible From. Mismatched/spoofed From domains fail DMARC.
+7. **Make every send idempotent — a retry must not double-send.** The job runner (message-queue-jobs) will retry on transient failure; without a guard, the user gets two receipts. Compute a stable **idempotency key** per logical email (e.g. `sha256(user_id + email_type + event_id)`) and record it transactionally before/with the send. Most providers also accept a request-level idempotency/dedup token — pass it. (The dedup-store primitive is idempotency-keys; this skill defines *what makes an email send unique*.)
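+   ```python
+   from hashlib import sha256
+   # claim_idempotency_key and provider.send stand in for your dedup store and provider SDK
+   key = sha256(f"{user_id}:password_reset:{reset_request_id}".encode()).hexdigest()  # sha256 takes bytes
+   if not claim_idempotency_key(key):      # atomic INSERT … ON CONFLICT DO NOTHING
+       return                              # already sent: silently no-op
+   provider.send(msg, idempotency_key=key) # provider-level dedup too
+   ```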
+   Enqueue the send as a job rather than sending inline in the request path, so a slow provider or 5xx doesn't fail the user's HTTP request — see message-queue-jobs.
+8. **Process bounces + complaints and maintain a suppression list — never re-send to dead/complained addresses.** Wire the provider's **webhooks** (SES→SNS, Postmark/SendGrid/Mailgun event webhooks) and feed a `suppression` table that the send path checks *before* every send. Verify webhook signatures (these are untrusted inbound — see ingest-webhook-secure).
+   | Event | Meaning | Action |
+   |---|---|---|
+   | **Hard bounce** | address doesn't exist | **suppress permanently**, never retry |
+   | **Soft bounce** | mailbox full / temporary | retry a few times, then suppress if persistent |
+   | **Complaint (FBL)** | user hit "spam" | **suppress permanently**; investigate — this is reputation poison |
+   | **Spam / blocked** | content/IP blocked | pause stream, inspect content/reputation |
+   A single complaint costs far more than a lost email. Re-sending to a hard bounce or complainer destroys sender reputation for *everyone* on the stream.
+9. **Honor unsubscribe — `List-Unsubscribe` + One-Click — even on transactional.** Gmail/Yahoo bulk rules require a `List-Unsubscribe` header with **one-click** support; even for transactional mail it's good practice (and required if there's any promotional content). Pure system mail (password reset) can be exempt, but adding the header never hurts.
+   ```
+   List-Unsubscribe: <https://example.com/u/abc123>, <mailto:unsub@example.com>
+   List-Unsubscribe-Post: List-Unsubscribe=One-Click
+   ```
+   A POST to the URL must unsubscribe with no further interaction. (Consent/unsubscribe *data* handling → map-privacy-data-gdpr.)
+10. **Monitor reputation and stay under the thresholds.** Enroll the domain in **Google Postmaster Tools** and watch your provider's dashboards. Stay inside these limits or expect throttling/blocking: **complaint rate < 0.1%** (Gmail's red line is 0.3%, but treat 0.1% as the ceiling), bounce rate in the low single digits, no blocklist hits. Set alerting on a spike (observability-instrument). A climbing complaint rate is an early warning before a hard block.
+11. **Test in a sandbox — NEVER send to real addresses from staging.** Catch a misconfigured loop emailing 50k real users *before* prod.
+    - **Local/CI:** capture all SMTP into **Mailpit** or **MailHog** (a fake inbox); assert subject, both MIME parts, and rendered HTML in tests.
+    - **Provider sandbox:** **SES sandbox** only delivers to verified addresses; Postmark has a test API token that accepts-but-doesn't-deliver.
+    - **Inbox placement / seed list:** before a big change, send to a seed list (GlockApps/provider tools) to see Gmail/Outlook/Yahoo inbox-vs-spam placement.
+    - Gate the real provider behind an env flag so staging can only hit Mailpit/sandbox — never live SMTP.
+12. **Mind tracking privacy and don't trust open rates.** Open tracking = a 1×1 pixel; **Apple Mail Privacy Protection (MPP)** pre-fetches it, **inflating opens to near-100%** and making opens worthless for engagement. Tracking pixels are personal-data processing under GDPR: they need a lawful basis and arguably consent (→ map-privacy-data-gdpr). For transactional mail, prefer **no open tracking**; if you wrap links for click tracking, keep redirects fast and on your own domain so they don't trip spam filters or break the link on failure.
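Steps 5, 6, and 9 together describe the shape of the message itself. As a minimal sketch using Python's standard `email.message.EmailMessage` (the addresses, subject, and `build_reset_email` helper are illustrative assumptions, and real HTML would come from MJML or an inliner rather than an f-string), a `multipart/alternative` message with both parts and the one-click unsubscribe headers might be built like this:

```python
from email.message import EmailMessage


def build_reset_email(to_addr: str, reset_url: str, unsub_url: str) -> EmailMessage:
    """Build a multipart/alternative password-reset email (illustrative values)."""
    msg = EmailMessage()
    msg["From"] = "Example <noreply@txn.example.com>"  # authenticated sending subdomain
    msg["Reply-To"] = "support@example.com"            # monitored human inbox
    msg["To"] = to_addr
    msg["Subject"] = "Reset your password"
    msg["List-Unsubscribe"] = f"<{unsub_url}>, <mailto:unsub@example.com>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"

    # text/plain part first; the critical link is in text, not an image.
    msg.set_content(f"Reset your password: {reset_url}\n")
    # text/html alternative with styles inlined on the element.
    msg.add_alternative(
        f'<p><a href="{reset_url}" style="color:#0b5fff">Reset your password</a></p>',
        subtype="html",
    )
    return msg
```

The envelope MAIL FROM (Return-Path) is not a header set here; it is supplied at send time by the SMTP client or the provider's API.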
+## Verify
+1. **Auth passes and aligns:** send to `check-auth@verifier.port25.com` or mail-tester.com — SPF `pass`, DKIM `pass` with `d=` your From domain, DMARC `pass` with **alignment**. `dig TXT _dmarc.example.com` shows the policy; `dig TXT <sel>._domainkey.example.com` resolves.
+2. **DMARC ramped safely:** `rua` aggregate reports show your legit mail passing for 1–4 weeks at `p=none` *before* you move to `quarantine`/`reject`.
+3. **Streams isolated:** a forced complaint/bounce on the marketing stream does **not** appear in or degrade the transactional stream's reputation/dashboards.
+4. **Renders everywhere:** the HTML shows correctly in Gmail, Outlook (Word engine), Apple Mail, and dark mode; with images blocked the email is still actionable (alt text, text link); a `text/plain` part exists and is non-empty.
+5. **Idempotent:** trigger the same logical email twice (or force a job retry) → exactly **one** message is delivered; the second is a no-op.
+6. **Feedback loop works:** send to a provider seed/simulator bounce + complaint address → webhook fires, the address lands in the **suppression** table, and a subsequent send to it is **skipped before** hitting the provider.
+7. **Unsubscribe one-click:** a POST to the `List-Unsubscribe` URL unsubscribes with no extra step; Gmail shows the unsubscribe affordance.
+8. **No real mail from non-prod:** staging/CI sends are captured by Mailpit/MailHog/sandbox and cannot reach a real inbox; a deliberate "send to a real address" from staging is blocked.
+9. **Reputation green:** Google Postmaster shows domain reputation High/Medium, complaint rate **< 0.1%**, no blocklist entries.
+Done = the From domain passes SPF/DKIM/DMARC with alignment (DMARC ramped p=none→quarantine→reject on real report data), transactional mail goes through a provider on a stream isolated from marketing, emails render with inlined CSS + a plain-text part, sends are idempotent under retry, bounces/complaints flow into a suppression list that the send path honors, and no staging environment can email a real user.
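Verify checks 5 and 6 (idempotent under retry, suppressed before the provider is hit) can be prototyped against an in-memory SQLite database. This is a sketch under stated assumptions: the `sent` and `suppression` tables, the event names, and the `deliver` callback are hypothetical stand-ins for a real dedup store, webhook payloads, and provider SDK.

```python
import hashlib
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with the two tables the send path consults."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sent (key TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE suppression (email TEXT PRIMARY KEY, reason TEXT)")
    return db


def record_feedback(db: sqlite3.Connection, email: str, event: str) -> None:
    """Webhook handler body: permanently suppress on hard bounce or complaint."""
    if event in ("hard_bounce", "complaint"):
        db.execute("INSERT OR IGNORE INTO suppression VALUES (?, ?)", (email.lower(), event))
        db.commit()


def try_send(db: sqlite3.Connection, email: str, email_type: str, event_id: str, deliver) -> str:
    """Check suppression, claim the idempotency key, then deliver at most once."""
    if db.execute("SELECT 1 FROM suppression WHERE email = ?", (email.lower(),)).fetchone():
        return "suppressed"
    key = hashlib.sha256(f"{email}:{email_type}:{event_id}".encode()).hexdigest()
    cur = db.execute("INSERT OR IGNORE INTO sent VALUES (?)", (key,))
    db.commit()
    if cur.rowcount == 0:  # key already claimed by an earlier attempt
        return "duplicate"
    deliver(email)
    return "sent"
```

Claiming the key before delivering gives at-most-once behavior: a crash between the claim and the send loses that email rather than duplicating it. Whether that trade-off fits depends on the message type.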

package/skills/serve-deploy-ml-model/SKILL.md ADDED

@@ -0,0 +1,107 @@
+---
+name: serve-deploy-ml-model
+description: Deploys a trained ML model to production — packaging it with the identical training-time preprocessing, registering a versioned model+code+data triple, serving via batch or online REST/gRPC behind a runtime (BentoML/TorchServe/Triton/ONNX), with autoscaling/warmup and canary/shadow rollout — so served predictions reproducibly match offline scoring.
+when_to_use: Taking a trained model to production to generate predictions (package, register, serve, scale, roll out). Distinct from train-evaluate-ml-model (building/evaluating the model), monitor-ml-drift (post-deployment drift/quality monitoring), and deploy-release (generic application deploys with no model artifact).
+---
+## When to Use
+Reach for this skill when a working model needs to **serve predictions in production**, not when it's still being built:
+- "Deploy this model so the app/service can call it"
+- "Stand up a REST/gRPC inference endpoint for `model.pkl`/`model.pt`"
+- "Run nightly batch scoring over the warehouse table"
+- "Roll the new model out behind the old one (shadow/canary) before cutting over"
+- "Our predictions in prod don't match what we got in the notebook" (train/serve skew)
+- "Speed up / scale the inference service" (ONNX export, autoscaling, warmup)
+NOT this skill:
+- Training, hyperparameter search, offline metrics, choosing the model → train-evaluate-ml-model
+- Watching the *live* model for input drift, label delay, quality decay, alerting → monitor-ml-drift
+- Shipping a normal app/service with no model artifact (web app, API, worker) → deploy-release
+- Percentage ramps, kill switches, sticky bucketing for *any* change → feature-flags-rollout (this skill uses it for the model rollout)
+- Latency/cost tuning of an *LLM* prompt/provider path → optimize-llm-cost-latency
+## Steps
+1. **Package the model WITH its exact preprocessing — this is the #1 cause of train/serve skew.** The artifact must contain the *same* feature/transform code that produced training inputs, not a reimplementation. Fit transforms on train data, serialize the fitted objects, and apply the identical pipeline at serve time.
+   ```python
+   # train.py — ONE fitted pipeline = preprocessing + model, saved as a unit
+   from sklearn.pipeline import Pipeline
+   from sklearn.compose import ColumnTransformer
+   import mlflow, mlflow.sklearn
+   pipe = Pipeline([("prep", ColumnTransformer(...)), ("model", clf)]).fit(X_tr, y_tr)
+   with mlflow.start_run():
+       mlflow.sklearn.log_model(
+           pipe, "model",
+           registered_model_name="churn",
+           input_example=X_tr.iloc[:5],          # captures schema + dtypes
+           signature=mlflow.models.infer_signature(X_tr, pipe.predict(X_tr)),
+           pip_requirements="requirements.lock",  # pinned, == training env
+       )
+   ```
+   Rules: never re-derive features in the serving codebase; serve the fitted `prep+model` as one object. For deep nets, save the transform graph (e.g. `torchvision`/`torchaudio` transforms or a `tf.function` preprocessing layer) *inside* the exported module so the runtime applies it. Stateful features (counts, embeddings, aggregates) computed from a feature store at train time must be read from the **same** store online — recomputing them in app code drifts.
+2. **Register and pin model + code + data together.** A model version is meaningless without the code and data snapshot that produced it. Push to a registry (MLflow Model Registry, SageMaker, Vertex, or a tagged OCI artifact) and record, in the run/version metadata: git SHA, training-data version/hash (DVC/Delta/snapshot id), and the locked dependency file. Use registry **stages** (`Staging` → `Production`) or aliases; deploy by *immutable version*, never "latest".
+3. **Pick the serving pattern by latency need — decide, don't hedge.**
+   | Pattern | Use when | Interface | Default runtime |
+   |---|---|---|---|
+   | **Batch / offline** | No realtime need; score a table/file on a schedule | Job writes predictions to warehouse/S3 | Spark / Ray / a plain container in cron/Airflow |
+   | **Online (sync)** | A user request blocks on the prediction; p99 budget < ~200 ms | **REST** (simple, debuggable) default; **gRPC** when p99 < 20 ms or high QPS | BentoML / TorchServe / Triton |
+   | **Streaming** | React to an event flow (clicks, transactions) continuously | Consume Kafka/Kinesis → predict → emit | Flink / Faust / a Ray Serve consumer |
+   Defaults: **batch unless something blocks on the result** — it's cheaper, simpler, and trivially reproducible. For online, start with **REST + JSON** and only move to gRPC/protobuf when a measured latency budget forces it. Do not build an online endpoint for a nightly report.
+4. **Choose the runtime; export to ONNX/TensorRT only when you need the speed.** Server defaults: **BentoML** (Python-first, easy custom logic, batching) for most teams; **Triton** for multi-framework, GPU, dynamic batching at scale; **TorchServe** for pure PyTorch shops. Convert to **ONNX Runtime** (CPU) or **TensorRT** (GPU) when profiling shows the framework runtime is the bottleneck — and **re-verify outputs match** the original within tolerance (atol≈1e-4) before trusting it; quantization/op-set changes silently alter predictions.
+   ```python
+   # bento service.py — load a PINNED model version (never "latest"), server-side batching
+   import bentoml
+   from bentoml.io import JSON
+   runner = bentoml.mlflow.get("churn:prod").to_runner()   # alias -> immutable version; never churn:latest
+   svc = bentoml.Service("churn", runners=[runner])
+   @svc.api(input=JSON(), output=JSON())   # set batchable=True + max_batch_size on the runner config for throughput
+   async def predict(rows: list[dict]) -> list[dict]:
+       return await runner.predict.async_run(rows)
+   ```
+5. **Add warmup, resource limits, and autoscaling — in that order.** Cold models cause p99 spikes: run a synthetic prediction at startup (load weights, JIT/CUDA-warm, fill caches) and gate the readiness probe on it so traffic only arrives warm. Set CPU/memory/GPU **requests and limits** from a load test (see load-stress-test), not by guessing. Autoscale on the right signal — **request concurrency / queue depth / GPU util**, not CPU% for GPU models — with `minReplicas ≥ 2` (no cold-start on scale-from-zero for latency-critical paths) and a scale-down stabilization window so it doesn't flap. Pin threads (`OMP_NUM_THREADS`) to avoid oversubscription under the container limit.
+6. **Roll out shadow → canary against the current model; keep an instant rollback.** Never hard-cut. **Shadow** first: mirror live traffic to the new version, log its predictions, serve the old model's response to users — compares behavior on real traffic at zero user risk. Then **canary**: route 1% → 10% → 50% → 100% by sticky hashed bucketing, watching guardrail metrics (latency, error rate, and prediction distribution vs the incumbent); auto-halt and revert on breach. Drive the ramp/kill switch with feature-flags-rollout. Rollback = repoint the alias/route to the previous **registered version** (still deployed) — must be one command, seconds, no rebuild.
+7. **Lock inference reproducibility end to end.** Serve from the **locked** requirements captured at registration (same library versions, same op-set), pin the base image by digest, set seeds where any stochasticity exists, and freeze the feature-store read path. The contract: the same input row produces a bit-identical (or within-tolerance) prediction in the notebook, the batch job, and the online endpoint.
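The "sticky hashed bucketing" in step 6 is easy to get subtly wrong, so a small sketch may help. It assumes a hypothetical `unit_id` (user or account id) and a made-up salt; in practice the feature-flag system would own this logic.

```python
import hashlib


def bucket(unit_id: str, salt: str = "churn-canary") -> int:
    """Map a unit to a stable bucket in [0, 100) using a salted hash."""
    digest = hashlib.sha256(f"{salt}:{unit_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(unit_id: str, canary_percent: int) -> str:
    """Send the unit to the canary model if its bucket falls under the ramp."""
    return "canary" if bucket(unit_id) < canary_percent else "incumbent"
```

Because the bucket depends only on the salt and the id, a unit that lands on the canary at 10% stays on it at 50% and 100%, and dropping the percentage to 0 acts as the kill switch. Python's built-in `hash()` is avoided here because string hashing is randomized per process.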
+## Common Errors
+- **Reimplementing preprocessing in the serving code.** The serving normalizer/encoder/tokenizer drifts from the training one → skew. Serialize and serve the *fitted* pipeline as one artifact; never rewrite the transforms.
+- **Fitting a transform at serve time** (e.g. `StandardScaler().fit(request_batch)`, or imputing with the request's own mean). Must use stats fitted on **training** data, frozen in the artifact.
+- **Deploying "latest"/an unpinned stage.** A retrain silently swaps the model under prod. Deploy an immutable version id; promote via alias (`churn:prod`), not `churn:latest`.
+- **Env mismatch between train and serve.** Different numpy/sklearn/torch/CUDA or ONNX op-set changes outputs. Serve from the exact locked requirements; pin the image by digest.
+- **ONNX/TensorRT export assumed equivalent.** Quantization, fused ops, or op-set bumps shift predictions. Always diff converted vs original outputs on a fixed sample before shipping.
+- **No warmup → readiness flaps.** First requests hit an unloaded/un-JIT'd model and time out; the cold pod is added to the pool before it can serve. Warm at startup and gate readiness on it.
+- **Online endpoint for a batch problem.** Standing up a low-latency REST service to score a table on a schedule wastes cost and adds failure modes. Use a batch job.
+- **Hard cutover with no shadow/canary.** A skew or perf regression hits 100% of traffic instantly. Shadow, then ramp, with auto-rollback.
+- **Single replica / scale-to-zero on a latency path.** Any restart or scale event becomes a user-visible cold start. Keep `minReplicas ≥ 2`.
+- **Autoscaling GPU models on CPU%.** CPU sits low while the GPU saturates → it never scales and latency explodes. Scale on concurrency/queue depth/GPU util.
+- **Stateful features recomputed in app code.** Online aggregates/counts computed in application code diverge from the values the training feature store produced, so the model sees skewed inputs on every request. Read from the same store.
+- **No rollback artifact.** The previous version was torn down, so "revert" means a rebuild. Keep the prior registered version deployed and one alias-flip away.
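To make the "fitting a transform at serve time" error concrete, here is a small stdlib-only sketch of a scaler whose statistics are fitted once on training data, serialized with the artifact, and only applied at serve time. `FrozenScaler` and its methods are hypothetical names for illustration; a real pipeline would normally serialize the fitted preprocessing object from its own framework instead. The sketch assumes the training values are not all identical, so the standard deviation is non-zero.

```python
import json
import statistics
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FrozenScaler:
    """Standardizer whose mean/std come from training data only."""

    mean: float
    std: float

    @classmethod
    def fit(cls, training_values):
        # Fit happens once, offline, on the training set.
        return cls(
            mean=statistics.fmean(training_values),
            std=statistics.pstdev(training_values),
        )

    def transform(self, values):
        # Serve time only applies the frozen statistics; it never refits.
        return [(v - self.mean) / self.std for v in values]

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload: str) -> "FrozenScaler":
        return cls(**json.loads(payload))


if __name__ == "__main__":
    scaler = FrozenScaler.fit([10.0, 20.0, 30.0, 40.0, 50.0])
    artifact = scaler.to_json()  # stored alongside the model weights

    served = FrozenScaler.from_json(artifact)
    request_batch = [48.0, 50.0, 52.0]
    print("frozen stats:   ", served.transform(request_batch))
    # The anti-pattern: refitting on the request batch recentres it around 0.
    print("refit per batch:", FrozenScaler.fit(request_batch).transform(request_batch))
```

The two printed rows differ for the same inputs, which is exactly the train/serve skew the bullet warns about.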
+## Verify
+1. **Parity (the skew gate):** Take a **fixed** holdout sample, score it three ways — training notebook, the batch job, and the online endpoint — and assert predictions match within tolerance (exact for classification labels; `atol≤1e-4` for probabilities/regression). Any mismatch blocks the deploy. This is the single most important check.
+2. **ONNX/quantized parity:** If exported, diff converted-runtime outputs vs the original framework on the same sample within tolerance.
+3. **Schema/contract:** Send a malformed/missing-field request → a clean 4xx, not a 500 or a silently wrong prediction. The logged input signature matches the registered one.
+4. **Latency/throughput:** Under the target arrival rate (load-stress-test), p95/p99 and sustained QPS meet the documented SLO **with warmup applied** — measure warm, not cold.
+5. **Warmup/readiness:** A freshly started replica reports ready only after a successful synthetic prediction; first real request is not a cold spike.
+6. **Autoscaling:** Drive load past the per-replica knee → replicas scale up on the chosen signal and back down after the stabilization window; `minReplicas` is honored at idle.
+7. **Shadow:** New version receives mirrored traffic and logs predictions while users still get the incumbent's response; compare the two prediction distributions and confirm they agree within the guardrail thresholds before any canary.
+8. **Rollback:** Flip the alias to the previous version and confirm traffic serves the old model within seconds, no rebuild.
+9. **Reproducibility pin:** The deployed image digest, model version, training-data hash, and git SHA are all recorded together and resolvable from the running service.
+Done = served predictions match offline scoring on the fixed sample within tolerance, latency/throughput meet the SLO warm, shadow/canary ran with guardrails, and a one-command rollback to the prior registered version is proven.
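The parity gate from Verify item 1 could look roughly like the following. It is a sketch under stated assumptions: the path names (`notebook`, `batch`, `online`) and the helper names are made up for the example, and the scores are assumed to have already been collected from each path for the same fixed sample, in the same row order. Pass `atol=0.0` to require exact equality, for instance when labels are encoded as numbers.

```python
import math


def parity_failures(reference, candidate, atol=1e-4):
    """Return the row indices where candidate differs from reference by more than atol."""
    if len(reference) != len(candidate):
        raise ValueError("both paths must score the same fixed sample")
    return [
        i
        for i, (ref, cand) in enumerate(zip(reference, candidate))
        # rel_tol=0.0 makes this a purely absolute-tolerance comparison.
        if not math.isclose(ref, cand, rel_tol=0.0, abs_tol=atol)
    ]


def assert_parity(scores_by_path, baseline="notebook", atol=1e-4):
    """Compare every serving path against the baseline; raise on any mismatch."""
    reference = scores_by_path[baseline]
    for name, scores in scores_by_path.items():
        if name == baseline:
            continue
        bad_rows = parity_failures(reference, scores, atol)
        if bad_rows:
            raise AssertionError(
                f"{name} disagrees with {baseline} on rows {bad_rows}; block the deploy"
            )


if __name__ == "__main__":
    assert_parity(
        {
            "notebook": [0.12, 0.80, 0.55],
            "batch": [0.12, 0.80, 0.55],
            "online": [0.12, 0.80005, 0.55],  # within 1e-4 of the baseline
        }
    )
    print("parity gate passed")
```

Running this in CI against the fixed holdout sample gives a hard failure, with the offending row indices, whenever any path drifts from the notebook scores.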