npm - sanook-cli - Versions diffs - 0.4.0 → 0.5.1 - Mend

sanook-cli 0.4.0 → 0.5.1


Files changed (238)

package/.env.example +19 -0
package/CHANGELOG.md +173 -0
package/README.md +153 -20
package/README.th.md +136 -0
package/dist/agentContext.js +4 -0
package/dist/approval.js +6 -0
package/dist/bin.js +405 -57
package/dist/brain.js +92 -59
package/dist/brand.js +47 -0
package/dist/checkpoint.js +37 -0
package/dist/commands.js +86 -6
package/dist/compaction.js +76 -5
package/dist/config.js +100 -12
package/dist/cost.js +60 -3
package/dist/doctor.js +92 -0
package/dist/gateway/auth.js +2 -2
package/dist/gateway/ledger.js +2 -2
package/dist/gateway/scheduler.js +1 -0
package/dist/gateway/serve.js +6 -4
package/dist/gateway/server.js +10 -2
package/dist/git.js +11 -2
package/dist/hooks.js +43 -17
package/dist/knowledge.js +48 -49
package/dist/loop.js +182 -66
package/dist/lsp/client.js +173 -0
package/dist/lsp/framing.js +56 -0
package/dist/lsp/index.js +138 -0
package/dist/lsp/servers.js +82 -0
package/dist/mcp-server.js +244 -0
package/dist/mcp.js +184 -29
package/dist/memory-store.js +559 -0
package/dist/memory.js +143 -29
package/dist/orchestrate.js +150 -0
package/dist/providers/codex.js +21 -7
package/dist/providers/keys.js +3 -2
package/dist/providers/models.js +22 -6
package/dist/providers/registry.js +155 -1
package/dist/repomap.js +93 -0
package/dist/search/chunk.js +158 -0
package/dist/search/embed-store.js +187 -0
package/dist/search/engine.js +203 -0
package/dist/search/fuse.js +35 -0
package/dist/search/index-core.js +187 -0
package/dist/search/indexer.js +241 -0
package/dist/search/store.js +77 -0
package/dist/session.js +42 -8
package/dist/skill-install.js +10 -10
package/dist/skills.js +12 -9
package/dist/summarize.js +31 -0
package/dist/tools/bash.js +21 -2
package/dist/tools/diagnostics.js +41 -0
package/dist/tools/edit.js +29 -7
package/dist/tools/index.js +8 -1
package/dist/tools/list.js +7 -2
package/dist/tools/permission.js +90 -9
package/dist/tools/read.js +23 -4
package/dist/tools/remember.js +1 -1
package/dist/tools/sandbox.js +61 -0
package/dist/tools/search.js +105 -4
package/dist/tools/task.js +195 -29
package/dist/tools/timeout.js +35 -0
package/dist/tools/util.js +10 -0
package/dist/tools/write.js +6 -4
package/dist/trust.js +89 -0
package/dist/ui/app.js +228 -31
package/dist/ui/banner.js +4 -9
package/dist/ui/brain-wizard.js +2 -2
package/dist/ui/history.js +30 -0
package/dist/ui/mentions.js +44 -0
package/dist/ui/render.js +55 -15
package/dist/ui/setup.js +97 -12
package/dist/ui/useEditor.js +83 -0
package/dist/update.js +114 -0
package/dist/worktree.js +173 -0
package/package.json +11 -5
package/scripts/postinstall.mjs +33 -0
package/second-brain/.agents/_Index.md +30 -0
package/second-brain/.agents/skills/_Index.md +30 -0
package/second-brain/.agents/workflows/_Index.md +30 -0
package/second-brain/AGENTS.md +4 -4
package/second-brain/Acceptance/_Index.md +30 -0
package/second-brain/Acceptance/golden-case-template.md +39 -0
package/second-brain/Areas/_Index.md +30 -0
package/second-brain/Bugs/System-OS/_Index.md +30 -0
package/second-brain/Bugs/_Index.md +30 -0
package/second-brain/CLAUDE.md +4 -1
package/second-brain/Checklists/_Index.md +30 -0
package/second-brain/Checklists/preflight-postflight-template.md +29 -0
package/second-brain/Distillations/_Index.md +30 -0
package/second-brain/Entities/_Index.md +30 -0
package/second-brain/Entities/entity-template.md +33 -0
package/second-brain/Evals/_Index.md +30 -0
package/second-brain/Evals/correction-pairs.md +24 -0
package/second-brain/Evals/failure-taxonomy.md +24 -0
package/second-brain/Evals/golden-set.md +25 -0
package/second-brain/Evals/quality-ledger.md +23 -0
package/second-brain/Evals/self-eval-rubric.md +23 -0
package/second-brain/GEMINI.md +4 -4
package/second-brain/Goals/_Index.md +30 -0
package/second-brain/Handoffs/_Index.md +30 -0
package/second-brain/Home.md +7 -0
package/second-brain/Intake/Raw Sources/_Index.md +30 -0
package/second-brain/Intake/_Index.md +30 -0
package/second-brain/Intake/_Quarantine/_Index.md +30 -0
package/second-brain/Learning/_Index.md +30 -0
package/second-brain/Playbooks/_Index.md +30 -0
package/second-brain/Playbooks/playbook-template.md +23 -0
package/second-brain/Projects/_Index.md +30 -0
package/second-brain/Prompts/_Index.md +30 -0
package/second-brain/README.md +2 -1
package/second-brain/Research/_Index.md +30 -0
package/second-brain/Retrospectives/_Index.md +30 -0
package/second-brain/Reviews/_Index.md +30 -0
package/second-brain/Runbooks/_Index.md +30 -0
package/second-brain/Runbooks/eval-loop.md +24 -0
package/second-brain/Sessions/_Index.md +30 -0
package/second-brain/Shared/AI-Context-Index.md +20 -0
package/second-brain/Shared/AI-Threads/_Index.md +30 -0
package/second-brain/Shared/Archive/_Index.md +30 -0
package/second-brain/Shared/Assets/_Index.md +30 -0
package/second-brain/Shared/Context-Packs/_Index.md +30 -0
package/second-brain/Shared/Context7-Docs/_Index.md +30 -0
package/second-brain/Shared/Coordination/NOW.md +28 -0
package/second-brain/Shared/Coordination/_Index.md +30 -0
package/second-brain/Shared/Coordination/agent-registry.md +24 -0
package/second-brain/Shared/Coordination/task-board/_Index.md +30 -0
package/second-brain/Shared/Coordination/task-board/task-template.md +43 -0
package/second-brain/Shared/Coordination/task-board.md +32 -0
package/second-brain/Shared/Core-Facts/_Index.md +30 -0
package/second-brain/Shared/Decision-Memory/_Index.md +30 -0
package/second-brain/Shared/Glossary/_Index.md +30 -0
package/second-brain/Shared/Memory-Inbox/_Index.md +30 -0
package/second-brain/Shared/Operating-State/_Index.md +30 -0
package/second-brain/Shared/Prompting/_Index.md +30 -0
package/second-brain/Shared/Provenance/_Index.md +30 -0
package/second-brain/Shared/Rules/_Index.md +30 -0
package/second-brain/Shared/Rules/contextual-note-rule.md +30 -0
package/second-brain/Shared/Rules/frontmatter-standard.md +10 -0
package/second-brain/Shared/Rules/memory-write-protocol.md +28 -0
package/second-brain/Shared/Rules/procedural-runbook-header.md +40 -0
package/second-brain/Shared/Rules/review-and-staleness-policy.md +22 -0
package/second-brain/Shared/Rules/rules-formatting.md +34 -0
package/second-brain/Shared/Scripts/_Index.md +30 -0
package/second-brain/Shared/Scripts-Archive/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/_Index.md +30 -0
package/second-brain/Shared/Tech-Standards/verification-standard.md +40 -0
package/second-brain/Shared/User-Memory/_Index.md +30 -0
package/second-brain/Shared/User-Persona/_Index.md +30 -0
package/second-brain/Shared/User-Persona/owner-profile.md +25 -0
package/second-brain/Shared/Working-Memory/_Index.md +30 -0
package/second-brain/Shared/_Index.md +30 -0
package/second-brain/Shared/mcp-servers/_Index.md +30 -0
package/second-brain/Skills/_Index.md +30 -0
package/second-brain/Templates/_Index.md +30 -0
package/second-brain/Templates/bug.md +2 -0
package/second-brain/Templates/handoff.md +2 -0
package/second-brain/Templates/session.md +2 -0
package/second-brain/Tools/_Index.md +30 -0
package/second-brain/Traces/_Index.md +30 -0
package/second-brain/Vault Structure Map.md +33 -1
package/second-brain/copilot/_Index.md +30 -0
package/skills/audit-license-compliance/SKILL.md +117 -0
package/skills/author-codemod/SKILL.md +110 -0
package/skills/build-audit-logging/SKILL.md +112 -0
package/skills/build-cdc-streaming-pipeline/SKILL.md +123 -0
package/skills/build-cli-tool/SKILL.md +108 -0
package/skills/build-data-table/SKILL.md +141 -0
package/skills/build-native-mobile-ui/SKILL.md +154 -0
package/skills/build-offline-first-sync/SKILL.md +118 -0
package/skills/build-realtime-channel/SKILL.md +122 -0
package/skills/build-vector-search/SKILL.md +131 -0
package/skills/compose-local-dev-stack/SKILL.md +149 -0
package/skills/configure-bundler-build/SKILL.md +166 -0
package/skills/configure-dns-tls/SKILL.md +142 -0
package/skills/configure-reverse-proxy-lb/SKILL.md +129 -0
package/skills/configure-security-headers-csp/SKILL.md +122 -0
package/skills/contract-testing/SKILL.md +140 -0
package/skills/datetime-timezone-correctness/SKILL.md +125 -0
package/skills/debug-ci-pipeline-failure/SKILL.md +134 -0
package/skills/debug-flaky-tests/SKILL.md +128 -0
package/skills/defend-llm-prompt-injection/SKILL.md +110 -0
package/skills/deliver-webhooks/SKILL.md +116 -0
package/skills/design-api-pagination/SKILL.md +144 -0
package/skills/design-authorization-model/SKILL.md +119 -0
package/skills/design-backup-dr-recovery/SKILL.md +113 -0
package/skills/design-event-sourcing-cqrs/SKILL.md +143 -0
package/skills/design-multi-tenancy/SKILL.md +100 -0
package/skills/design-protobuf-grpc-service/SKILL.md +146 -0
package/skills/design-relational-schema/SKILL.md +129 -0
package/skills/design-search-index-infra/SKILL.md +151 -0
package/skills/design-state-machine/SKILL.md +108 -0
package/skills/design-token-system/SKILL.md +109 -0
package/skills/distributed-locks-leases/SKILL.md +120 -0
package/skills/encrypt-sensitive-data/SKILL.md +148 -0
package/skills/feature-flags-rollout/SKILL.md +130 -0
package/skills/file-upload-object-storage/SKILL.md +107 -0
package/skills/fuzz-dynamic-security-test/SKILL.md +111 -0
package/skills/harden-llm-app-reliability/SKILL.md +126 -0
package/skills/i18n-localization-setup/SKILL.md +113 -0
package/skills/idempotency-keys/SKILL.md +107 -0
package/skills/implement-push-notifications/SKILL.md +142 -0
package/skills/ingest-webhook-secure/SKILL.md +120 -0
package/skills/integrate-oauth-oidc/SKILL.md +126 -0
package/skills/load-stress-test/SKILL.md +129 -0
package/skills/map-privacy-data-gdpr/SKILL.md +146 -0
package/skills/model-nosql-data/SKILL.md +118 -0
package/skills/money-decimal-arithmetic/SKILL.md +123 -0
package/skills/monitor-ml-drift/SKILL.md +109 -0
package/skills/numeric-precision-units/SKILL.md +144 -0
package/skills/optimize-llm-cost-latency/SKILL.md +103 -0
package/skills/optimize-react-rerenders/SKILL.md +124 -0
package/skills/orchestrate-agent-workflow/SKILL.md +100 -0
package/skills/payments-billing-integration/SKILL.md +114 -0
package/skills/pin-toolchain-versions/SKILL.md +116 -0
package/skills/plan-strangler-migration/SKILL.md +95 -0
package/skills/property-based-testing/SKILL.md +108 -0
package/skills/publish-package-registry/SKILL.md +130 -0
package/skills/recover-git-state/SKILL.md +119 -0
package/skills/remediate-web-vulnerabilities/SKILL.md +125 -0
package/skills/resilience-timeouts-retries/SKILL.md +104 -0
package/skills/resolve-merge-rebase-conflict/SKILL.md +97 -0
package/skills/rewrite-git-history/SKILL.md +109 -0
package/skills/scaffold-cross-platform-app/SKILL.md +137 -0
package/skills/schema-evolution-compatibility/SKILL.md +121 -0
package/skills/send-transactional-email/SKILL.md +126 -0
package/skills/serve-deploy-ml-model/SKILL.md +107 -0
package/skills/setup-cdn-edge-waf/SKILL.md +107 -0
package/skills/setup-devcontainer-env/SKILL.md +131 -0
package/skills/setup-lint-format-precommit/SKILL.md +140 -0
package/skills/setup-monorepo-tooling/SKILL.md +125 -0
package/skills/ship-mobile-app-store-release/SKILL.md +137 -0
package/skills/structured-output-llm/SKILL.md +86 -0
package/skills/supply-chain-sbom-provenance/SKILL.md +120 -0
package/skills/test-data-factories/SKILL.md +158 -0
package/skills/threat-model-stride/SKILL.md +123 -0
package/skills/train-evaluate-ml-model/SKILL.md +109 -0
package/skills/unicode-text-correctness/SKILL.md +109 -0
package/skills/visual-regression-testing/SKILL.md +120 -0

package/skills/unicode-text-correctness/SKILL.md ADDED

@@ -0,0 +1,109 @@
+---
+name: unicode-text-correctness
+description: Implements and fixes correct text/Unicode handling — pinning UTF-8 end-to-end, detecting BOM/legacy charsets, NFC/NFD normalization, grapheme-aware length/slicing/truncation/reversal, locale-aware collation and full case-folding, and homoglyph/confusable/bidi spoofing defenses.
+when_to_use: Code measures, slices, truncates, reverses, sorts, lowercases, or compares strings containing emoji, combining marks, or CJK; or bugs show mojibake, emoji counting as length 4, truncation splitting a character, equal-looking usernames comparing unequal, broken accented sorting, or double-encoding. Distinct from regex-build (pattern matching) and validate-data-quality (column-level rules, not character semantics).
+---
+## When to Use
+Reach for this when the bug is about **what a character *is*** — its bytes, boundaries, identity, or order — not about pattern matching or business rules:
+- "Emoji `👨‍👩‍👧` counts as length 8 / truncates to a broken `�` / reverses into garbage"
+- "Twitter-style `120 chars` limit cuts a flag emoji or `é` in half"
+- "Two usernames look identical but `==` says they differ" (or the reverse: a spoof passes)
+- "Accented words sort after `z` / `ä` doesn't sort near `a`"
+- "Text came in as `Ã©` / `â€™` / `é` — mojibake / double-encoding"
+- "`.toLowerCase()` breaks Turkish `İ`, German `ß`, or fails to match `İstanbul`"
+- "MySQL stores emoji as `????` / IDN domain `аpple.com` (Cyrillic а) phishes users"
+NOT this skill:
+- Writing/debugging a regex pattern (email/slug/`\d` over-matching) → **regex-build**
+- Column-level assertions (no nulls/dupes, value ranges, freshness) → **validate-data-quality**
+- Schema/charset migration mechanics (lock contention, rollback of an `ALTER`) → **db-migration-safety**
+- Whether a confusable username is an actual attack you must *report* in a diff → **security-review** (this skill *builds* the defense; security-review *audits* for its absence)
+## Steps
+1. **Know the four length units — pick one deliberately, never let the language pick for you.** Most "Unicode bugs" are using the wrong unit.
+   | Unit | What it counts | `"é"` (NFD) | `"👨‍👩‍👧"` | Use for |
+   |---|---|---|---|---|
+   | Bytes | UTF-8 octets | 3 | 18 | storage size, network frames, DB byte limits |
+   | Code units | UTF-16 slots (JS `.length`, Java `char`) | 2 | 8 | **almost never: this is the trap** |
+   | Code points | Unicode scalars | 2 | 5 | normalization input, codepoint ranges |
+   | **Grapheme clusters** | user-perceived characters | **1** | **1** | length shown to users, truncation, cursor, slicing |
+   Default for any **user-facing** length, limit, slice, or reverse: **grapheme clusters**. JS `"👨‍👩‍👧".length === 8` and `[..."👨‍👩‍👧"].length === 5` are *both wrong* for "how many characters"; only a segmenter gives 1.
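The four units in the table can be checked directly. This is a minimal sketch, assuming a runtime with `Intl.Segmenter` and `TextEncoder` (Node 16+ or a current browser); `lengthUnits` is a name made up for the illustration, and the strings are written with escapes so the ZWJ and the combining mark are visible.

```javascript
// Hypothetical helper: report one string's length in all four units.
const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

function lengthUnits(s) {
  return {
    bytes: new TextEncoder().encode(s).length,           // UTF-8 octets
    codeUnits: s.length,                                 // UTF-16 slots
    codePoints: [...s].length,                           // the string iterator walks code points
    graphemes: [...graphemeSegmenter.segment(s)].length, // user-perceived characters
  };
}

const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"; // man + ZWJ + woman + ZWJ + girl
const eAcuteNfd = "e\u0301";                              // e + combining acute (NFD)

console.log(lengthUnits(family));    // { bytes: 18, codeUnits: 8, codePoints: 5, graphemes: 1 }
console.log(lengthUnits(eAcuteNfd)); // { bytes: 3, codeUnits: 2, codePoints: 2, graphemes: 1 }
```

Only the last field matches what a user would call "one character"; the other three are still the right answer for storage limits, UTF-16 APIs, and normalization respectively.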
+2. **Count and slice on grapheme boundaries — use a real segmenter, do not split on code points.** Built-ins:
+   ```js
+   const seg = new Intl.Segmenter(undefined, { granularity: "grapheme" });
+   const graphemes = [...seg.segment(s)].map(x => x.segment);
+   const len = graphemes.length;                 // user-visible length
+   const head = graphemes.slice(0, 120).join(""); // truncate to 120 chars, never split
+   const reversed = [...graphemes].reverse().join(""); // reverse a copy (reverse() mutates) without scrambling 👨‍👩‍👧
+   ```
+   - Python: `regex` module `\X` (`regex.findall(r"\X", s)`) — stdlib `re`/`len()` give code points, not graphemes.
+   - Rust: `unicode-segmentation` `.graphemes(true)`. Go: `rivo/uniseg`. Swift: `String` is already grapheme-correct (`.count`).
+   - **Truncate, then re-append an ellipsis as its own grapheme**; if a byte cap (e.g. DB `VARCHAR(n)` is bytes) also applies, trim graphemes until `utf8Bytes(result) <= cap` — never cut at byte `n` directly.
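The truncate-then-ellipsis rule with an optional byte cap could look roughly like this. It is a sketch under stated assumptions: `truncateGraphemes` and `utf8Bytes` are hypothetical names, the ellipsis counts as one grapheme of the limit, and the byte cap is assumed to be at least large enough to hold the ellipsis itself.

```javascript
const seg = new Intl.Segmenter(undefined, { granularity: "grapheme" });
const utf8Bytes = (s) => new TextEncoder().encode(s).length;

// Hypothetical helper: cap by grapheme count first, then by UTF-8 byte size.
function truncateGraphemes(s, maxGraphemes, maxBytes = Infinity, ellipsis = "\u2026") {
  const graphemes = [...seg.segment(s)].map((x) => x.segment);
  if (graphemes.length <= maxGraphemes && utf8Bytes(s) <= maxBytes) return s;

  // Leave room for the ellipsis, which is a grapheme of its own.
  const kept = graphemes.slice(0, Math.max(0, maxGraphemes - 1));
  // Drop whole graphemes (never raw bytes) until the byte cap is met.
  while (kept.length > 0 && utf8Bytes(kept.join("") + ellipsis) > maxBytes) kept.pop();
  return kept.join("") + ellipsis;
}

const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
const input = "ab" + family + "cd"; // 5 graphemes

console.log(truncateGraphemes(input, 4) === "ab" + family + "\u2026"); // true
console.log(truncateGraphemes(input, 4, 10) === "ab\u2026");           // true: the 18-byte emoji is dropped whole
```

Popping from the array of graphemes is what guarantees the result never ends in half a ZWJ sequence, whichever of the two caps turns out to be the binding one.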
+3. **Normalize to NFC at every boundary you store, compare, hash, or index.** `"é"` has two encodings (NFC U+00E9 = 1 codepoint; NFD U+0065 U+0301 = 2). They render identically but are `!=` and hash differently. Rule:
+   - **NFC on input** (ingest/form submit/API request) — canonical, shortest, what the web expects.
+   - Compare/hash/dedup/`UNIQUE` index **only on NFC** strings — never store one form and look up another (macOS filesystem returns **NFD**; HTTP/most input is NFC → a path from disk won't match a stored key without normalizing both sides).
+   - Apply NFC **before** truncation (combining mark must ride with its base) and **before** case-folding.
+   ```js
+   const key = s.normalize("NFC");            // JS
+   ```
+   ```python
+   import unicodedata; key = unicodedata.normalize("NFC", s)  # Python
+   ```
+   Use **NFKC** (compatibility) only for *identifiers/search keys* where you want `①`→`1`, `ﬁ`→`fi`, full-width `Ａ`→`A` folded together — it is lossy, so never NFKC user display text.
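To make the NFC-at-the-boundary rule concrete, here is a small sketch; the `usernames` map and `register` function are hypothetical stand-ins for whatever store or `UNIQUE` index is actually in use.

```javascript
const nfc = "caf\u00E9";  // é as one code point (NFC)
const nfd = "cafe\u0301"; // e + combining acute (NFD)

console.log(nfc === nfd);                                   // false
console.log(nfc.normalize("NFC") === nfd.normalize("NFC")); // true

// Hypothetical registry keyed on the NFC form, so both spellings dedupe.
const usernames = new Map();
function register(name) {
  const key = name.normalize("NFC");
  if (usernames.has(key)) return false; // already taken, whichever form was typed
  usernames.set(key, name);
  return true;
}

console.log(register(nfc)); // true
console.log(register(nfd)); // false

// NFKC also folds compatibility characters; lossy, so for search/identifier keys only.
console.log("\uFB01\u2460\uFF21".normalize("NFKC")); // "fi1A"
```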
+4. **Compare case-insensitively with full case-folding, not `lower()`; sort with a locale collator, not byte order.**
+   - Case-insensitive equality: full case-folding (`str.casefold()` in Python) is the floor; plain lowercasing (`.toLowerCase()`/`.toUpperCase()` in JS, `str::to_lowercase` in Rust, which leaves `ß` unchanged) is *not* enough. `"ß".casefold() == "ss"`; Turkish `"İ"` vs `"i"` differ by locale. Never use `.toLowerCase()` for a *security or identity* comparison. Normalize to NFC, then fold both sides: compare `a.normalize("NFC")` folded vs `b.normalize("NFC")` folded.
+   - Sorting: byte/codepoint order puts `Z`(0x5A) before `a`(0x61) and accented letters after `z`. Use an **ICU/CLDR collator**: `new Intl.Collator("de", { sensitivity: "base" }).compare(a, b)` (JS), `PyICU.Collator` or `locale.strxfrm` (Python), `COLLATE "de-x-icu"` (Postgres). Pin the locale explicitly — the "right" order for `ä`/`ö` differs by language (German vs Swedish).
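A sketch of both halves of this step in JS. Two assumptions to flag loudly: JS has no built-in `casefold()`, so `approxFold` below is an approximation (uppercase, then lowercase) and not full Unicode case folding; and the collator output assumes a runtime with ICU locale data, which is the default in current Node builds.

```javascript
// Approximate fold: an assumption-laden stand-in for a real case-fold.
const approxFold = (s) => s.normalize("NFC").toUpperCase().toLowerCase();

const lower = "Stra\u00DFe"; // "Straße"
console.log(lower.toLowerCase() === "STRASSE".toLowerCase()); // false: ß survives lowercasing
console.log(approxFold(lower) === approxFold("STRASSE"));     // true: ß uppercases to "SS"

const words = ["z", "\u00E4", "a", "Z"]; // "\u00E4" is ä
const byCodeUnit = [...words].sort();    // default sort compares UTF-16 code units
const byGerman = [...words].sort(new Intl.Collator("de").compare);

console.log(byCodeUnit); // [ 'Z', 'a', 'z', 'ä' ]
console.log(byGerman);   // [ 'a', 'ä', 'z', 'Z' ]
```

For identity or security comparisons, a library implementing the Unicode `CaseFolding.txt` mapping is the safer choice; the approximation is here only to show why plain lowercasing falls short.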
+5. **Defend identifiers (usernames, domains, package names) against confusables and mixed-script spoofing.** Equal-*looking* must mean equal-*compared*, and visually-deceptive must be rejected:
+   - **Skeleton/confusable check** (UTS #39): map each char to its prototype (`раypal`→`paypal`) via the Unicode confusables table (`confusable_homoglyphs`, ICU `uspoof`, `unicode-security` crate) and compare skeletons against existing identifiers.
+   - **Mixed-script reject:** allow a single script run per identifier (Latin *or* Cyrillic, not `аpple` mixing Cyrillic `а` + Latin); permit only known-safe combos (Latin+Han+Hiragana for JP). Reject whole-script confusables (all-Cyrillic `аррӏе`).
+   - **Strip/reject bidi overrides** `U+202A–202E`, `U+2066–2069`, and zero-width `U+200B/200C/200D/FEFF` in identifiers and filenames — `safe.txt‮gpj.exe` displays as `safe.txtexe.jpg` (Trojan Source). NFKC-fold identifiers before storing.
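The mixed-script and bidi checks can be sketched with Unicode property escapes alone. This is deliberately partial: `checkIdentifier`, the short script list, and the strict "exactly one script" policy are all assumptions for illustration, and the sketch does *not* implement the UTS #39 skeleton, so an all-Cyrillic whole-script confusable would still pass it.

```javascript
// Scripts this sketch knows about; a real policy would cover far more.
const SCRIPTS = ["Latin", "Cyrillic", "Greek", "Han", "Hiragana", "Katakana"];
const SCRIPT_TESTS = SCRIPTS.map((name) => [name, new RegExp(`\\p{Script=${name}}`, "u")]);

// Bidi embeddings/overrides/isolates, zero-width characters, and the BOM.
const INVISIBLES = /[\u202A-\u202E\u2066-\u2069\u200B-\u200D\uFEFF]/;

function scriptsIn(identifier) {
  const found = new Set();
  for (const ch of identifier) {
    for (const [name, re] of SCRIPT_TESTS) if (re.test(ch)) found.add(name);
  }
  return found; // digits and punctuation are Script=Common and are not counted
}

// Hypothetical policy: no invisible controls, then NFKC, then a single script.
function checkIdentifier(raw) {
  if (INVISIBLES.test(raw)) return { ok: false, reason: "invisible-or-bidi-control" };
  const id = raw.normalize("NFKC");
  if (scriptsIn(id).size > 1) return { ok: false, reason: "mixed-script" };
  return { ok: true, id };
}

console.log(checkIdentifier("paypal"));                 // { ok: true, id: 'paypal' }
console.log(checkIdentifier("\u0440\u0430ypal"));       // { ok: false, reason: 'mixed-script' }
console.log(checkIdentifier("safe.txt\u202Egpj.exe"));  // { ok: false, reason: 'invisible-or-bidi-control' }
```

A production policy would relax the single-script rule for the known-safe combinations mentioned above and add the skeleton comparison against existing identifiers.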
+6. **Pin UTF-8 across storage and transport — no implicit charset anywhere.**
+   - DB: MySQL **`utf8mb4`** (the 3-byte `utf8` alias silently drops emoji → `????`); set table *and* connection charset + a `_unicode_ci`/`utf8mb4_0900_ai_ci` collation. Postgres: `ENCODING 'UTF8'` + ICU collation per UTF-8 column.
+   - HTTP: send `Content-Type: …; charset=utf-8`; read `charset` from the response header, fall back to BOM, then to the declared meta — never assume Latin-1.
+   - **BOM:** strip a leading `U+FEFF` on read (it corrupts JSON parse and the first field of CSV); do **not** emit a BOM in UTF-8 output unless a consumer (Excel CSV) demands it.
+   - **Legacy ingest:** detect with `chardet`/`charset-normalizer`/ICU `CharsetDetector`, decode once to Unicode, then work in UTF-8 — and **never re-decode an already-decoded string** (the cause of `â€™` double-encoding mojibake).
+   - URLs/IDN: percent-encode the UTF-8 bytes of the path/query; convert IDN hostnames to **Punycode** (`xn--…`) for transport, but display the Unicode form *only after* the confusable check in step 5.
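Several of the transport rules above can be reproduced in a few lines, which helps when writing the regression tests for them. A sketch, assuming Node's built-in `TextDecoder`, `JSON`, and WHATWG `URL`; `stripBom` and the `münchen.example` host are illustrative only.

```javascript
const eAcuteBytes = new TextEncoder().encode("\u00E9"); // é as UTF-8: C3 A9

// Decoding UTF-8 bytes as Latin-1/windows-1252 is the classic mojibake source.
console.log(new TextDecoder("latin1").decode(eAcuteBytes)); // "Ã©"
console.log(new TextDecoder("utf-8").decode(eAcuteBytes));  // "é"

// A leading BOM is not JSON whitespace, so JSON.parse rejects it.
const withBom = "\uFEFF" + '{"a":1}';
let bomBreaksJson = false;
try { JSON.parse(withBom); } catch { bomBreaksJson = true; }
console.log(bomBreaksJson); // true

const stripBom = (s) => (s.charCodeAt(0) === 0xfeff ? s.slice(1) : s);
console.log(JSON.parse(stripBom(withBom)).a); // 1

// TextDecoder drops a UTF-8 BOM by default (ignoreBOM: false).
const bomBytes = new Uint8Array([0xef, 0xbb, 0xbf, 0x68, 0x69]);
console.log(new TextDecoder("utf-8").decode(bomBytes) === "hi"); // true

// Percent-encoding operates on the UTF-8 bytes of the text.
console.log(encodeURIComponent("caf\u00E9")); // "caf%C3%A9"

// The WHATWG URL parser converts an IDN host to Punycode for transport.
console.log(new URL("https://m\u00FCnchen.example/").hostname); // "xn--mnchen-3ya.example"
```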
+7. **Lock the behavior with adversarial test strings** (see Verify) before declaring text handling correct.
+## Common Errors
+- **Using `.length` (JS/Java UTF-16) as character count.** Counts code units → one emoji = 2–8 or more, BMP CJK = 1. Fix: `Intl.Segmenter` graphemes for user counts.
+- **Splitting on code points and calling it grapheme-safe.** `[...str]` keeps `é`(NFC) whole but shatters `👨‍👩‍👧` (5 codepoints) and a base+combining `e+◌́`. Fix: segment graphemes, not codepoints.
+- **Byte-cap truncation (`s[:200]`, `substr`).** Cuts mid-codepoint → `�`, or splits a base from its combining mark / a ZWJ sequence. Fix: trim whole graphemes until under the byte cap.
+- **Comparing/indexing without normalizing.** NFC `café` ≠ NFD `café`; one inserts, the other duplicates past a `UNIQUE` constraint. Fix: NFC both sides before `==`, hash, and the DB write.
+- **`toLowerCase()` for identity/security checks.** Misses `ß`/`ss`, breaks Turkish `İ/ı`, locale-dependent. Fix: full case-fold (`casefold()`), NFC first.
+- **Sorting by codepoint/byte.** `Z` before `a`, accents dumped after `z`, wrong per language. Fix: ICU/CLDR collator with an explicit locale.
+- **MySQL `utf8` (3-byte alias).** Silently stores emoji/4-byte chars as `????` or errors. Fix: `utf8mb4` everywhere — column, table, connection.
+- **Double-decoding / re-encoding.** Decoding an already-`str` value (or treating UTF-8 bytes as Latin-1 then re-encoding) → `Ã©`, `â€™`. Fix: decode exactly once at the boundary; keep Unicode internally.
+- **Not stripping the BOM.** Leading `U+FEFF` breaks `JSON.parse` and hides an invisible prefix in the first CSV column key. Fix: strip a leading `U+FEFF` on read.
+- **Reversing a string by codepoint/char.** Scrambles emoji ZWJ sequences and detaches combining marks (`á` → `́a`). Fix: reverse grapheme clusters.
+- **NFKC on display text.** Lossy: `²`→`2`, `ﬁ`→`fi`, full-width collapses. Fix: NFKC only for fold-keys/identifiers; store NFC for display.
+- **Trusting Unicode display of IDN/filenames.** Bidi override + homoglyph spoofs the eye. Fix: render Punycode / run the confusable+bidi check before showing.
+## Verify
+Test every text op against a fixed adversarial corpus. At minimum: `"á"` written as `a` + combining acute U+0301 (NFD), `"á"` as the single code point U+00E1 (NFC), `"👨‍👩‍👧‍👦"` (ZWJ family), `"🇯🇵"` (regional-indicator flag), `"ẹ́"` (stacked combining marks), `"한국어"` (Hangul), `"Ｈｅｌｌｏ"` (full-width), `"раypal"` (mixed-script Cyrillic), `"safe\u202Etxt.exe"` (bidi override U+202E), `"\uFEFFhi"` (leading BOM), `"café"` in NFC and NFD.
+1. **Grapheme length:** the ZWJ family and a flag emoji each report length **1**; `"á"` reports **1**. Not 2, 4, or 7.
+2. **Truncation:** truncating the corpus to N graphemes never yields a `�`, never splits a ZWJ sequence, and never strands a combining mark; `utf8Bytes(result) <= byteCap` when a byte cap applies.
+3. **Reverse:** reversing `"👨‍👩‍👧"` returns it unchanged (single grapheme); reversing `"áb"` keeps `á` intact.
+4. **Normalization equality:** NFC `"café"` and NFD `"café"` compare **equal** and hash equal after `.normalize("NFC")`; inserting both into a table with a `UNIQUE(NFC)` key yields one row.
+5. **Case-fold:** `"ß"` matches `"SS"`/`"ss"` under full case-fold; `"İstanbul"` matches per Turkish locale and is *not* silently mangled in the default locale.
+6. **Collation:** sorting `["z","ä","a","Z"]` under `de` collator puts `ä` adjacent to `a` and is *not* codepoint order (`Z` before `a`).
+7. **Confusable/bidi:** `"раypal"` is flagged confusable with an existing `"paypal"` and mixed-script-rejected; the bidi-override string is rejected or its overrides stripped before storage/display.
+8. **Round-trip:** a string written to the DB (`utf8mb4`) and read back is byte-identical including emoji; a BOM-prefixed file parses with no phantom first key; an IDN host round-trips through Punycode and back.
+Done = grapheme-unit length/slice/truncate/reverse are all correct on the ZWJ + flag + combining-mark corpus, NFC-normalized values compare/hash/dedup equal across forms, case-insensitive matching uses full case-folding and sorting uses a locale collator, confusable + mixed-script + bidi spoofs are rejected, and emoji round-trip cleanly through the `utf8mb4` store with no `????`/`�`/mojibake.
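Part of the Verify list can be pinned down as a table-driven check. The sketch below covers only grapheme length, truncation, and reversal, with the corpus written as escapes so the exact code points are unambiguous; the `corpus` layout and helper names are assumptions, not something the package ships.

```javascript
const seg = new Intl.Segmenter(undefined, { granularity: "grapheme" });
const graphemes = (s) => [...seg.segment(s)].map((x) => x.segment);
const reverseGraphemes = (s) => graphemes(s).reverse().join("");

// [label, string, expected grapheme count]
const corpus = [
  ["NFD a-acute", "a\u0301", 1],
  ["NFC a-acute", "\u00E1", 1],
  ["ZWJ family of four", "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}", 1],
  ["flag JP", "\u{1F1EF}\u{1F1F5}", 1],
  ["stacked combining marks", "e\u0323\u0301", 1],
  ["Hangul", "\uD55C\uAD6D\uC5B4", 3],
  ["full-width Hello", "\uFF28\uFF45\uFF4C\uFF4C\uFF4F", 5],
];

const failures = [];
for (const [label, text, expected] of corpus) {
  const parts = graphemes(text);
  if (parts.length !== expected) {
    failures.push(`${label}: length ${parts.length}, expected ${expected}`);
  }
  // Truncating to one grapheme must survive a UTF-8 round trip without U+FFFD.
  const roundTrip = new TextDecoder().decode(new TextEncoder().encode(parts[0]));
  if (roundTrip !== parts[0] || roundTrip.includes("\uFFFD")) {
    failures.push(`${label}: truncation damaged the first grapheme`);
  }
  // A single-grapheme string must reverse to itself.
  if (expected === 1 && reverseGraphemes(text) !== text) {
    failures.push(`${label}: reverse changed a single grapheme`);
  }
}
if (reverseGraphemes("a\u0301b") !== "ba\u0301") failures.push("reverse detached a combining mark");

console.log(failures.length === 0 ? "corpus OK" : failures);
```

Normalization, case-folding, collation, and confusable checks from the list would each get their own table in the same style.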

package/skills/visual-regression-testing/SKILL.md ADDED

@@ -0,0 +1,120 @@
+---
+name: visual-regression-testing
+description: Catches unintended UI pixel changes by snapshotting rendered output and diffing against approved baselines — make snapshots deterministic (disable CSS animations/transitions/caret, mask dynamic regions like dates/avatars/ads, freeze the clock and seed randomness, preload+wait for fonts, pin viewport + deviceScaleFactor, force reduced-motion and a fixed color-scheme), generate per-browser/per-OS baselines (never share a Linux baseline with a dev's macOS), tune the diff threshold (maxDiffPixelRatio / anti-alias mode) instead of inflating it to hide flake, run baselines in ONE pinned container so subpixel/font rendering is identical, and wire a human review/approve flow (Playwright --update-snapshots, Chromatic/Percy approve UI) — at component level (isolated, fast) and page level (integration). Effectively a pixel contract: a diff is a question for a human, not an auto-pass.
+when_to_use: You want to detect visual UI regressions — a CSS/refactor/dependency bump silently shifted layout/color/spacing, you're adding toHaveScreenshot/Chromatic/Percy/BackstopJS, baselines flake across machines, or you're tuning diff thresholds and the review/approve flow. Distinct from write-playwright-e2e (asserts functional behavior and DOM state, not pixels — this skill is the screenshot-diff layer) and audit-accessibility-wcag (WCAG conformance / contrast / semantics, not whether pixels changed).
+---
+## When to Use
+Reach for this skill when the goal is **detecting unintended pixel/visual changes against an approved baseline**, not functional behavior or a11y conformance:
+- "A CSS refactor / Tailwind upgrade / design-token change silently broke a layout somewhere"
+- "Add visual regression / screenshot tests to this component library or these pages"
+- "Set up Playwright `toHaveScreenshot`, Chromatic, Percy, or BackstopJS"
+- "Snapshots flake — they pass on CI but fail on my Mac, or fail randomly"
+- "Tune the diff threshold / mask the date+avatar regions / freeze animations"
+- "Wire the baseline review-and-approve flow into PRs"
+NOT this skill:
+- Asserting a button click opens a modal, a form submits, navigation/DOM state, network mocking → write-playwright-e2e (functional E2E; this skill is the screenshot-diff layer that *also* runs on a stabilized page)
+- WCAG conformance, contrast ratios, ARIA, keyboard/focus order, screen-reader semantics → audit-accessibility-wcag (correct *semantics*, not whether pixels match a baseline)
+- A snapshot/screenshot test that's flaky for timing/ordering reasons → debug-flaky-tests (root-causing nondeterminism in general; this skill prescribes the *visual-specific* stabilizers)
+- Structuring the test suite, fixtures, assertions for unit/integration tests → write-tests
+- Driving a real browser to manually inspect/debug a rendering bug → debug-frontend-browser
+- Catching LCP/CLS/perf regressions (layout shift as a metric, not a pixel diff) → optimize-core-web-vitals
+- Defining the tokens (color/space/type scale) whose changes you're guarding → design-token-system
+## Steps
+1. **Pick the tier by what you own.** Each is a screenshot + perceptual diff against a stored baseline; they differ in where baselines live and review happens.
+   | Tool | Baseline storage | Review/approve | Best for |
+   |---|---|---|---|
+   | **Playwright `toHaveScreenshot`** | git (PNGs committed per project) | `--update-snapshots` + PR diff of `.png` | self-hosted, full control, free; you own the render env |
+   | **Chromatic** | cloud (Storybook) | hosted UI, per-story approve, branch baselines | Storybook component libs; TurboSnap diffs only changed stories |
+   | **Percy (BrowserStack)** | cloud | hosted UI, approve per snapshot | cross-browser cloud render, framework-agnostic SDK |
+   | **BackstopJS** | git/local | `approve` CLI, HTML report | legacy/no-cloud, reference+test+report flow |
+   Default to **Playwright `toHaveScreenshot`** when you control the runner (commit baselines, run in a pinned container); reach for **Chromatic/Percy** when you can't pin a render env or want cross-browser cloud baselines without managing them.
+2. **Render env is the baseline — pin it or every diff is noise.** Font hinting and subpixel antialiasing differ across OS/GPU, so a macOS-generated PNG will *never* match a Linux CI PNG. Generate and verify baselines in **one** environment:
+   - Playwright: pin the Docker image to your exact version — `mcr.microsoft.com/playwright:v1.50.0-noble` — and run *baseline generation and CI in the same image*. Never commit a baseline produced on a dev's machine.
+   - Snapshot filenames already encode browser/OS (`button-chromium-linux.png`). Keep that suffix; do **not** force a single platform name to "share" baselines across OSes — generate one baseline per `(browser, platform)` you actually test.
+   - Run `npx playwright test --update-snapshots` locally only via `docker run` in that image, or in a dedicated CI "update baselines" job, so the bytes match CI.
+3. **Kill animation and motion before the shot.** A mid-transition frame is the #1 flake source.
+   ```ts
+   // playwright.config.ts
+   expect: { toHaveScreenshot: { animations: 'disabled', caret: 'hide', scale: 'css' } }
+   ```
+   `animations: 'disabled'` fast-forwards finite CSS animations to their end state and disables transitions; `caret: 'hide'` removes the blinking text cursor; `scale: 'css'` ignores DPR so HiDPI vs 1x render the same logical pixels. For motion that CSS can't reach, also inject:
+   ```ts
+   await page.emulateMedia({ reducedMotion: 'reduce', colorScheme: 'light' });
+   await page.addStyleTag({ content: `*,*::before,*::after{transition:none!important;animation:none!important;}` });
+   ```
+4. **Pin viewport + DPR + color-scheme deterministically.** Layout depends on width; rendering depends on DPR and scheme. Set them explicitly per project, never inherit the runner's screen:
+   ```ts
+   use: { viewport: { width: 1280, height: 720 }, deviceScaleFactor: 1, colorScheme: 'light' }
+   ```
+   Test responsive breakpoints as **separate named snapshots** (`card-mobile-375.png`, `card-desktop-1280.png`) — don't rely on a default window size. For full-page shots, set `fullPage: true` only when the page height is stable; otherwise prefer clipping a component.
+5. **Freeze time, randomness, and anything non-deterministic in content.** "Updated 3 minutes ago", `Math.random()` ids, and animated counters all churn pixels:
+   - Clock: Playwright `await page.clock.setFixedTime(new Date('2025-01-01T00:00:00Z'))` (or `page.clock.install`) before navigation, so `Date.now()`/timers are frozen.
+   - Seed PRNGs / stub `Math.random` and `crypto.randomUUID` via `addInitScript` so generated ids/charts are stable.
+   - Stub network: route API calls to **fixtures** (deterministic data) — a live API means live data means flake. This is where it overlaps with write-playwright-e2e's mocking, but here the goal is *stable pixels*, not asserting a request.
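For the "seed PRNGs / stub `Math.random`" bullet, one possible init script is sketched here. `installSeededRandom` and the seed value are hypothetical; the generator is the small public-domain mulberry32 PRNG, chosen only because it fits in a few lines. The function is kept closure-free so it can be serialized into the page, presumably via `page.addInitScript(installSeededRandom, 42)`.

```javascript
// Self-contained on purpose: an init script cannot close over outer variables.
function installSeededRandom(seed) {
  let a = seed >>> 0;
  // mulberry32: a tiny 32-bit PRNG, adequate for stable test fixtures (not for crypto).
  Math.random = function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Demonstration outside the browser: the same seed yields the same sequence.
const originalRandom = Math.random;

installSeededRandom(42);
const firstRun = [Math.random(), Math.random(), Math.random()];
installSeededRandom(42);
const secondRun = [Math.random(), Math.random(), Math.random()];

Math.random = originalRandom; // restore the real generator after the demo

console.log(firstRun.every((v, i) => v === secondRun[i])); // true
console.log(firstRun.every((v) => v >= 0 && v < 1));       // true
```

`crypto.randomUUID` would need a separate stub in the same init script, for example one that formats successive `Math.random()` outputs into a UUID-shaped string.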
+6. **Wait for the page to be visually settled — not just `load`.** Diff what's actually rendered:
+   - **Fonts:** a FOUT (fallback → web font swap) changes glyph metrics. `await page.evaluate(() => document.fonts.ready)` before the shot, and self-host/preload fonts so they're not network-flaky.
+   - **Lazy images / skeletons:** wait for the specific `<img>` `decode()`/`load`, or assert the skeleton is gone (`await expect(skeleton).toBeHidden()`), not a blanket `networkidle` (discouraged by Playwright and flaky).
+   - **Layout stability:** `await page.waitForFunction` on a render-complete signal, or `expect(locator).toHaveScreenshot()` which **auto-retries until two consecutive shots match** — lean on that built-in stabilization rather than `waitForTimeout`.
+7. **Mask the regions you can't make deterministic — don't widen the threshold to swallow them.** Ads, avatars, timestamps, maps, video, third-party embeds:
+   ```ts
+   await expect(page).toHaveScreenshot('dashboard.png', {
+     mask: [page.locator('.ad-slot'), page.locator('[data-testid="avatar"]')],
+     maskColor: '#FF00FF',
+   });
+   ```
+   Masking paints those areas a solid color in both baseline and actual, so they're excluded from the diff while the rest stays pixel-exact. This is strictly better than raising the global threshold, which blinds you to real regressions everywhere.
+8. **Tune the threshold tight; treat a loose threshold as a bug.** There are two kinds of knob; prefer the pixel-count kind:
+   - `maxDiffPixelRatio` (fraction of differing pixels, e.g. `0.01`) or `maxDiffPixels` (absolute count) — set as low as your env allows. Start at `0` and raise only to the floor that survives a no-change re-run.
+   - `threshold` (per-pixel color sensitivity, 0–1, default `0.2`) — handles antialias jitter; lowering it makes diffs *stricter*.
+   - **Anti-pattern:** bumping `maxDiffPixelRatio` to `0.1` to "stop flake." That hides a 9%-of-the-screen regression. Fix the nondeterminism (steps 3–6) instead; reserve a small ratio purely for subpixel antialiasing noise.
+9. **Component vs page level — run both, weight toward component.** Component snapshots (Storybook + Chromatic, or Playwright `mount`/component testing) are isolated, fast, and pinpoint *which* component changed; a wall of full-page snapshots is slow and every page that embeds a changed header fails at once (noisy, hard to triage). Use a **pyramid**: many small component/story snapshots, a handful of critical full-page integration snapshots (login, checkout, dashboard). Snapshot **states**, not just the default: hover, focus, error, empty, loading, RTL, dark mode — each as its own baseline.
+10. **A diff is a question for a human — never auto-update on CI.** The review/approve flow is the whole point:
+    - **Failing build is correct behavior** when pixels change — the PR must show the diff image (Playwright attaches `expected/actual/diff` to the HTML report and `test-results/`; Chromatic/Percy link a hosted diff).
+    - Approve intentional changes deliberately: Playwright → run the dedicated `--update-snapshots` job and **commit the new PNGs in the same PR** (reviewers see the pixel diff in git); Chromatic/Percy → click *approve* which moves the branch baseline.
+    - **Never** run `--update-snapshots` automatically in the main test job or on every CI run — that auto-blesses regressions and the test becomes worthless. Updating baselines is a reviewed, intentional act.
+11. **Keep baselines healthy.** Commit PNGs via **Git LFS** (binary churn bloats history); delete stale baselines when a component is removed (orphan PNGs test nothing and only rot); regenerate the whole set deliberately after an intentional global change (font swap, token update) in a single isolated PR titled as such, so reviewers know the diff is wholesale, not a regression slipping through.
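
As an illustration of the "seed PRNGs / stub `Math.random`" advice in step 5, here is a minimal sketch of a seeded replacement for `Math.random`. The helper names `installSeededRandom` and `draw` are hypothetical, and the generator is the well-known mulberry32 algorithm, chosen here only because it is short. The function is written without closures over outer variables, on the assumption that it would be handed to Playwright's `page.addInitScript(fn, arg)`, which serializes the function into the page.

```typescript
// Hypothetical helper, not a Playwright API: replaces Math.random with a
// seeded mulberry32 generator. It is self-contained (no outer closures),
// so it could plausibly be injected with something like
//   await page.addInitScript(installSeededRandom, 42);
function installSeededRandom(seed: number): void {
  let state = seed >>> 0; // keep the state as an unsigned 32-bit integer
  Math.random = (): number => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Scale the 32-bit result into [0, 1), like the native Math.random.
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Draw n values from whatever Math.random currently is.
function draw(n: number): number[] {
  return Array.from({ length: n }, () => Math.random());
}

installSeededRandom(42);
const firstRun = draw(5);
installSeededRandom(42);
const secondRun = draw(5);

// Same seed, same sequence: ids or chart data derived from it stay stable.
console.log(firstRun.every((value, i) => value === secondRun[i])); // true
```

Anything that derives ids or chart points from `Math.random` then renders identically on every run. `crypto.randomUUID` would need its own stub along the same lines.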
+## Common Errors
+- **Baseline made on macOS, CI runs Linux.** Font/subpixel rendering differs → every snapshot "fails." Fix: generate and run in one pinned container image (`mcr.microsoft.com/playwright:vX.Y-noble`); never commit a dev-machine baseline.
+- **Animations/transitions not disabled.** Mid-flight frame captured → random diffs. Fix: `animations:'disabled'`, `caret:'hide'`, inject `transition/animation:none!important`, `emulateMedia({reducedMotion:'reduce'})`.
+- **Web font swaps after the shot (FOUT).** Glyph metrics shift → text diffs. Fix: `await document.fonts.ready` + self-host/preload fonts.
+- **Live time/random/data.** "2 min ago", uuids, live API → churns pixels. Fix: `page.clock.setFixedTime`, seed/stub `Math.random`/`randomUUID`, route APIs to fixtures.
+- **Raising `maxDiffPixelRatio` to stop flake.** Hides real regressions across the whole frame. Fix: eliminate nondeterminism (steps 3–6) and *mask* dynamic regions; keep the threshold near zero.
+- **`waitForTimeout`/`networkidle` instead of a render signal.** Flaky on slow CI, and both are discouraged by Playwright. Fix: wait on `fonts.ready`, specific image `decode()`, or rely on `toHaveScreenshot`'s built-in retry-until-stable.
+- **Forcing one platform name to share baselines.** A "shared" baseline matches no real env. Fix: one baseline per `(browser, platform)`; keep the OS suffix in the filename.
+- **Auto-running `--update-snapshots` in CI.** Silently re-baselines regressions → the test never fails on a real change. Fix: dedicated, reviewed update job; commit PNGs in the PR.
+- **Only the default/happy state snapshotted.** Hover/error/empty/dark/RTL regressions slip through. Fix: a baseline per meaningful state.
+- **No DPR pin.** HiDPI runner doubles pixels vs 1x → size mismatch. Fix: `deviceScaleFactor:1` + `scale:'css'`.
+- **Giant full-page snapshots only.** One header change fails 40 pages; slow, untriageable. Fix: component-level pyramid + a few critical page shots.
+- **Baselines committed as raw blobs.** Binary churn bloats the repo. Fix: Git LFS; prune orphaned PNGs.
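
To make the "raising `maxDiffPixelRatio`" error concrete, the arithmetic can be sketched for the 1280×720 viewport used earlier in this guide. The helpers `allowedDiffPixels` and `fullWidthRows` are hypothetical names for this illustration. They approximate the pixel budget a ratio implies and do not reproduce Playwright's internal comparison.

```typescript
// Hypothetical helper: approximate number of pixels that may differ before
// a comparison with the given maxDiffPixelRatio would fail.
function allowedDiffPixels(width: number, height: number, ratio: number): number {
  return Math.round(width * height * ratio);
}

// Hypothetical helper: how many complete full-width rows fit in that budget.
function fullWidthRows(budget: number, width: number): number {
  return Math.floor(budget / width);
}

const width = 1280;
const height = 720;

const tightBudget = allowedDiffPixels(width, height, 0.01);
const looseBudget = allowedDiffPixels(width, height, 0.1);

console.log(tightBudget); // 9216 pixels, roughly a 96x96 square
console.log(looseBudget); // 92160 pixels
console.log(fullWidthRows(looseBudget, width)); // 72 full-width rows
```

At a ratio of `0.1`, a full-width band 72 pixels tall (a typical header) could change completely and the snapshot would still pass, which is why masking and fixing nondeterminism beat loosening the ratio.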
+## Verify
+1. **Determinism re-run:** run the suite twice back-to-back with **no code change** in the pinned CI image → zero diffs. Any nonzero diff on a clean re-run is leftover nondeterminism — fix it before trusting the suite.
+2. **Env parity:** generate a baseline in the container and run CI in the same container → match; confirm filenames carry the `(browser, platform)` suffix and no baseline was produced on a dev machine.
+3. **Real regression is caught:** deliberately change a color/padding/font-size by a few px → the relevant snapshot fails and the report shows a highlighted `diff.png`; the build goes red.
+4. **Masking works, threshold is tight:** a masked region (avatar/clock) churning its content produces **no** diff, while an unmasked 1% layout shift **does** fail — proving the threshold isn't swallowing real changes.
+5. **Stabilizers active:** animations disabled, `document.fonts.ready` awaited, clock fixed, randomness seeded, APIs stubbed to fixtures — grep the config/setup for each; a snapshot taken mid-animation or with a live `Date.now()` would fail check 1.
+6. **Approve flow is manual:** confirm no job runs `--update-snapshots`/auto-approve on the main path; an intentional change requires committing new PNGs (or clicking approve) in a reviewed PR, and that PR's diff shows the pixel change.
+7. **State coverage:** the critical components have baselines for hover/focus/error/empty/dark/RTL, not just default; responsive breakpoints are separate named snapshots.
+Done = snapshots are pixel-stable on a clean re-run in one pinned render env, dynamic regions are masked (not threshold-inflated), per-`(browser,platform)` baselines live in version control via LFS, a real few-pixel change goes red with a visible diff, and every baseline update is a deliberate, reviewed human approval, never an automatic CI step.
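
Verify check 6 (no automatic `--update-snapshots` on the main path) could be partly automated with a small lint over the CI config text. The sketch below is one way to do that. The helper name `findSnapshotUpdateLines` and the sample config are hypothetical, and it only matches the Playwright CLI flag and its short form `-u` on lines that mention `playwright`. A dedicated, reviewed update job would need to be excluded by whoever adopts this.

```typescript
// Hypothetical lint helper: returns the 1-based line numbers in a CI config
// where a Playwright command carries the snapshot-update flag.
function findSnapshotUpdateLines(ciConfig: string): number[] {
  // Match the flag as a standalone token: --update-snapshots or -u.
  const updateFlag = /(^|\s)(--update-snapshots|-u)(=|\s|$)/;
  return ciConfig
    .split(/\r?\n/)
    .map((line, index) => ({ line, lineNumber: index + 1 }))
    .filter(({ line }) => line.includes("playwright") && updateFlag.test(line))
    .map(({ lineNumber }) => lineNumber);
}

// Illustrative config only; not taken from any real pipeline.
const sampleConfig = [
  "steps:",
  "  - run: npx playwright test",
  "  - run: npx playwright test --update-snapshots",
].join("\n");

// Only the third line is flagged.
console.log(findSnapshotUpdateLines(sampleConfig));
```

A text match like this is a coarse guard. It does not replace reading the pipeline definition, and it would not catch an `updateSnapshots` setting placed in the Playwright config file instead of on the command line.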