sanook-cli 0.4.0

Files changed (148)
  1. package/.env.example +23 -0
  2. package/CHANGELOG.md +38 -0
  3. package/LICENSE +201 -0
  4. package/README.md +239 -0
  5. package/dist/agentContext.js +2 -0
  6. package/dist/approval.js +78 -0
  7. package/dist/bin.js +461 -0
  8. package/dist/brain.js +186 -0
  9. package/dist/commands.js +66 -0
  10. package/dist/compaction.js +85 -0
  11. package/dist/config.js +101 -0
  12. package/dist/cost.js +59 -0
  13. package/dist/diff.js +36 -0
  14. package/dist/gateway/auth.js +32 -0
  15. package/dist/gateway/ledger.js +94 -0
  16. package/dist/gateway/lock.js +114 -0
  17. package/dist/gateway/schedule.js +74 -0
  18. package/dist/gateway/scheduler.js +87 -0
  19. package/dist/gateway/serve.js +57 -0
  20. package/dist/gateway/server.js +94 -0
  21. package/dist/gateway/telegram.js +115 -0
  22. package/dist/git.js +55 -0
  23. package/dist/hooks.js +104 -0
  24. package/dist/knowledge.js +68 -0
  25. package/dist/loop.js +169 -0
  26. package/dist/mcp.js +191 -0
  27. package/dist/memory.js +108 -0
  28. package/dist/providers/codex.js +86 -0
  29. package/dist/providers/keys.js +37 -0
  30. package/dist/providers/models.js +55 -0
  31. package/dist/providers/registry.js +241 -0
  32. package/dist/session.js +36 -0
  33. package/dist/skill-install.js +190 -0
  34. package/dist/skills.js +111 -0
  35. package/dist/tools/bash.js +26 -0
  36. package/dist/tools/edit.js +107 -0
  37. package/dist/tools/git.js +68 -0
  38. package/dist/tools/index.js +36 -0
  39. package/dist/tools/list.js +24 -0
  40. package/dist/tools/permission.js +30 -0
  41. package/dist/tools/read.js +18 -0
  42. package/dist/tools/recall.js +12 -0
  43. package/dist/tools/remember.js +14 -0
  44. package/dist/tools/schedule.js +61 -0
  45. package/dist/tools/search.js +54 -0
  46. package/dist/tools/skill.js +65 -0
  47. package/dist/tools/task.js +46 -0
  48. package/dist/tools/util.js +5 -0
  49. package/dist/tools/write.js +27 -0
  50. package/dist/ui/app.js +132 -0
  51. package/dist/ui/banner.js +20 -0
  52. package/dist/ui/brain-wizard.js +29 -0
  53. package/dist/ui/render.js +57 -0
  54. package/dist/ui/setup.js +46 -0
  55. package/package.json +77 -0
  56. package/second-brain/AGENTS.md +18 -0
  57. package/second-brain/CLAUDE.md +96 -0
  58. package/second-brain/Evals/retrieval-eval.md +30 -0
  59. package/second-brain/GEMINI.md +15 -0
  60. package/second-brain/Home.md +33 -0
  61. package/second-brain/README.md +29 -0
  62. package/second-brain/Runbooks/ingest-quarantine.md +27 -0
  63. package/second-brain/Runbooks/sleep-time-consolidation.md +26 -0
  64. package/second-brain/Shared/AI-Context-Index.md +52 -0
  65. package/second-brain/Shared/Core-Facts/protected-facts.md +21 -0
  66. package/second-brain/Shared/Decision-Memory/decision-log.md +24 -0
  67. package/second-brain/Shared/Memory-Inbox/memory-inbox.md +23 -0
  68. package/second-brain/Shared/Operating-State/current-state.md +30 -0
  69. package/second-brain/Shared/Provenance/ingest-log.md +27 -0
  70. package/second-brain/Shared/Rules/context-assembly-policy.md +28 -0
  71. package/second-brain/Shared/Rules/frontmatter-standard.md +33 -0
  72. package/second-brain/Shared/Rules/skills-admission.md +30 -0
  73. package/second-brain/Shared/User-Memory/user-preferences.md +25 -0
  74. package/second-brain/Templates/bug.md +22 -0
  75. package/second-brain/Templates/handoff.md +21 -0
  76. package/second-brain/Templates/project.md +24 -0
  77. package/second-brain/Templates/session.md +26 -0
  78. package/second-brain/USER.md +36 -0
  79. package/second-brain/Vault Structure Map.md +106 -0
  80. package/skills/agent-tool-mcp-builder/SKILL.md +88 -0
  81. package/skills/api-design-review/SKILL.md +70 -0
  82. package/skills/async-concurrency-correctness/SKILL.md +93 -0
  83. package/skills/audit-accessibility-wcag/SKILL.md +59 -0
  84. package/skills/audit-technical-seo/SKILL.md +62 -0
  85. package/skills/auth-jwt-session/SKILL.md +88 -0
  86. package/skills/brainstorm-design/SKILL.md +73 -0
  87. package/skills/build-etl-pipeline/SKILL.md +58 -0
  88. package/skills/build-form-validation/SKILL.md +103 -0
  89. package/skills/build-office-docs/SKILL.md +80 -0
  90. package/skills/build-react-component/SKILL.md +116 -0
  91. package/skills/build-spreadsheet/SKILL.md +106 -0
  92. package/skills/caching-strategy/SKILL.md +75 -0
  93. package/skills/cicd-pipeline-author/SKILL.md +65 -0
  94. package/skills/cloud-cost-optimize/SKILL.md +91 -0
  95. package/skills/code-comments/SKILL.md +52 -0
  96. package/skills/code-review/SKILL.md +61 -0
  97. package/skills/db-migration-safety/SKILL.md +67 -0
  98. package/skills/debug-frontend-browser/SKILL.md +58 -0
  99. package/skills/debug-root-cause/SKILL.md +54 -0
  100. package/skills/dependency-upgrade/SKILL.md +56 -0
  101. package/skills/deploy-release/SKILL.md +64 -0
  102. package/skills/diff-table-parity/SKILL.md +58 -0
  103. package/skills/dockerfile-optimize/SKILL.md +82 -0
  104. package/skills/error-message/SKILL.md +58 -0
  105. package/skills/estimate-work/SKILL.md +54 -0
  106. package/skills/explore-codebase/SKILL.md +73 -0
  107. package/skills/git-commit-pr/SKILL.md +65 -0
  108. package/skills/gitops-deploy-workflow/SKILL.md +97 -0
  109. package/skills/implement-from-design/SKILL.md +69 -0
  110. package/skills/incident-response-sre/SKILL.md +78 -0
  111. package/skills/k8s-debug-workload/SKILL.md +135 -0
  112. package/skills/k8s-manifest-review/SKILL.md +86 -0
  113. package/skills/llm-eval-harness/SKILL.md +63 -0
  114. package/skills/manage-client-server-state/SKILL.md +94 -0
  115. package/skills/mermaid-diagram/SKILL.md +61 -0
  116. package/skills/message-queue-jobs/SKILL.md +139 -0
  117. package/skills/naming-helper/SKILL.md +57 -0
  118. package/skills/observability-instrument/SKILL.md +113 -0
  119. package/skills/optimize-core-web-vitals/SKILL.md +75 -0
  120. package/skills/optimize-sql-query/SKILL.md +67 -0
  121. package/skills/performance-profiling/SKILL.md +65 -0
  122. package/skills/process-pdf/SKILL.md +107 -0
  123. package/skills/profile-dataset/SKILL.md +97 -0
  124. package/skills/prompt-engineering/SKILL.md +70 -0
  125. package/skills/rag-pipeline/SKILL.md +53 -0
  126. package/skills/rate-limiting/SKILL.md +96 -0
  127. package/skills/refactor-cleanup/SKILL.md +54 -0
  128. package/skills/regex-build/SKILL.md +72 -0
  129. package/skills/release-notes/SKILL.md +79 -0
  130. package/skills/rest-graphql-contract/SKILL.md +71 -0
  131. package/skills/scrape-structured-web-data/SKILL.md +61 -0
  132. package/skills/secrets-management/SKILL.md +96 -0
  133. package/skills/security-review/SKILL.md +62 -0
  134. package/skills/shell-script-robust/SKILL.md +71 -0
  135. package/skills/style-responsive-tailwind/SKILL.md +70 -0
  136. package/skills/terraform-plan-review/SKILL.md +95 -0
  137. package/skills/type-safety-strict/SKILL.md +82 -0
  138. package/skills/validate-data-quality/SKILL.md +62 -0
  139. package/skills/wrangle-tabular-data/SKILL.md +75 -0
  140. package/skills/write-adr/SKILL.md +75 -0
  141. package/skills/write-analytical-sql/SKILL.md +71 -0
  142. package/skills/write-data-viz/SKILL.md +58 -0
  143. package/skills/write-docs/SKILL.md +54 -0
  144. package/skills/write-plan/SKILL.md +59 -0
  145. package/skills/write-playwright-e2e/SKILL.md +86 -0
  146. package/skills/write-prd/SKILL.md +65 -0
  147. package/skills/write-rfc/SKILL.md +75 -0
  148. package/skills/write-tests/SKILL.md +50 -0
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: write-plan
3
+ description: Converts an approved design/PRD/RFC into a concrete, batched implementation plan — ordered steps, files to touch, dependencies, checkpoints, and a verification step per phase — ready to execute or hand to subagents.
4
+ when_to_use: After brainstorm/design sign-off, or for a migration/refactor/multi-file feature where the approach is settled but execution needs sequencing ("make a plan to build X"). Skip for single-file trivial edits that you can describe in one sentence.
5
+ ---
6
+
7
+ ## When to Use
8
+
9
+ Use after the *approach* is decided and you need to turn it into ordered, verifiable execution. Inputs you should already have (if missing, go get them first — a plan on a guessed design is worthless):
10
+
11
+ - An approved design / PRD / RFC, OR a sign-off from a brainstorm step.
12
+ - The actual current state of the code (read it — do not plan against assumptions).
13
+
14
+ **Skip this skill** when the change fits in one file and you can state the diff in one sentence (typo, rename, add a log line, bump a version). Just do it.
15
+
16
+ **Stop and escalate instead of planning** when the design is still ambiguous, success criteria are undefined, or two steps contradict each other. A plan cannot resolve an unresolved decision — kick it back to design.
17
+
18
+ This skill produces a *plan only*. No code is written here. Output is consumed by an execute step or dispatched to subagents.
19
+
20
+ ## Steps
21
+
22
+ 1. **Anchor the goal in one line.** Write the single outcome this plan delivers and its Definition of Done (DoD) — observable, testable conditions, not "it works". Example: "`POST /import` accepts a CSV, writes rows to `imports` table, returns 202 + job id; integration test green; existing endpoints unchanged." Everything below must serve this line.
23
+
24
+ 2. **Map the real current state.** Grep/read the modules the design touches. List the exact files and the functions/types/configs each change will land in. If you can't name the file a step edits, you don't understand it yet — read more before continuing. Note existing tests covering this area (they're your regression net).
25
+
26
+ 3. **Decompose into checkpointed steps.** Break the work into the smallest units that each end at a *verifiable* state. Heuristic: a step that touches >3 files or can't be verified on its own is too big — split it. Each step must leave the build/tests green (or explicitly red-by-design in TDD: test committed first, failing, then made to pass).
27
+
28
+ 4. **Annotate every step** with these four fields — no step ships without all four:
29
+ - **Touches:** exact files/modules (e.g. `src/import/handler.ts`, `db/migrations/004_imports.sql`).
30
+ - **Depends on:** which prior step(s) must land first, and why (schema before handler, types before callers).
31
+ - **Verify:** the *concrete* command or check that proves the step is done: `pytest tests/import_test.py::test_csv_accepted`, `npm run build`, `curl … | jq -e '.status == "queued"'`, or a named manual check. "Looks right" is not a verification.
32
+ - **Risk/backout:** only if the step is risky (migration, deletes data, touches auth/payments/shared interface). State the failure mode and how to undo (revert migration, feature-flag off, keep old path until cutover).
33
+
34
+ 5. **Separate sequential from parallel.** Mark which steps share no files and have no dependency — those can run concurrently (good subagent candidates). Mark the critical path: the longest dependency chain that gates DoD. Put risky/irreversible steps as late as safely possible and behind a flag where feasible.
35
+
36
+ 6. **Order for fail-fast.** Front-load the step most likely to invalidate the design (the spike, the unknown API, the perf-critical query). If it's going to break the plan, break it on step 2, not step 9.
37
+
38
+ 7. **Emit the plan as an executable checklist** — a `- [ ]` todo list grouped into phases, each phase ending in its Verify line, plus the overall DoD at the top. Format so an execute step or a subagent can pick up any item and know its files, deps, and done-check without re-reading the design. Then stop — do not start coding.
39
+
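Steps 4, 5, and 7 above can be sketched as a small data model. This is a minimal illustration, not part of the skill: the `PlanStep` shape, the helper names, the `npm run migrate` command, and `src/types.ts` are all hypothetical. It groups steps into dependency-ordered batches, flags files shared inside a batch (the "false parallelism" case), and renders a `- [ ]` checklist.

```typescript
// Hypothetical model of one plan step, mirroring the four fields from step 4.
interface PlanStep {
  id: string;
  title: string;
  touches: string[];   // exact files/modules
  dependsOn: string[]; // ids of steps that must land first
  verify: string;      // runnable command or specific observable
  backout?: string;    // only for risky steps
}

// Group steps into batches: every step in a batch has all of its dependencies
// satisfied by earlier batches, so a batch is a candidate for parallel work.
// Throws on unknown ids or dependency cycles.
function toBatches(steps: PlanStep[]): PlanStep[][] {
  const ids = steps.map((s) => s.id);
  steps.forEach((s) => s.dependsOn.forEach((d) => {
    if (ids.indexOf(d) === -1) throw new Error(`step ${s.id} depends on unknown step ${d}`);
  }));
  const done: string[] = [];
  const batches: PlanStep[][] = [];
  let remaining = steps.slice();
  while (remaining.length > 0) {
    const ready = remaining.filter((s) => s.dependsOn.every((d) => done.indexOf(d) !== -1));
    if (ready.length === 0) throw new Error("dependency cycle in plan");
    ready.forEach((s) => done.push(s.id));
    remaining = remaining.filter((s) => done.indexOf(s.id) === -1);
    batches.push(ready);
  }
  return batches;
}

// Files touched by more than one step in the same batch: those steps are not
// truly parallel even though no dependency was declared between them.
function sharedFiles(batch: PlanStep[]): string[] {
  const owners: Record<string, string[]> = {};
  batch.forEach((s) => s.touches.forEach((f) => {
    owners[f] = owners[f] || [];
    if (owners[f].indexOf(s.id) === -1) owners[f].push(s.id);
  }));
  return Object.keys(owners).filter((f) => owners[f].length > 1).sort();
}

// Render the plan as a checklist: DoD at the top, one phase per batch.
function renderChecklist(dod: string, steps: PlanStep[]): string {
  const lines = [`DoD: ${dod}`];
  toBatches(steps).forEach((batch, i) => {
    lines.push(`Phase ${i + 1}`);
    batch.forEach((s) => {
      const deps = s.dependsOn.length > 0 ? s.dependsOn.join(", ") : "none";
      lines.push(`- [ ] ${s.id} ${s.title} | Touches: ${s.touches.join(", ")} | Depends on: ${deps} | Verify: ${s.verify}`);
      if (s.backout) lines.push(`  - Backout: ${s.backout}`);
    });
  });
  return lines.join("\n");
}

// Hypothetical example plan; S2 and S3 both edit src/types.ts.
const examplePlan: PlanStep[] = [
  { id: "S1", title: "Add imports table", touches: ["db/migrations/004_imports.sql"], dependsOn: [], verify: "npm run migrate", backout: "revert migration 004" },
  { id: "S2", title: "Add handler", touches: ["src/import/handler.ts", "src/types.ts"], dependsOn: ["S1"], verify: "npm run build" },
  { id: "S3", title: "Add job status type", touches: ["src/types.ts"], dependsOn: ["S1"], verify: "npm run build" },
];
```

In this example `toBatches(examplePlan)` puts S2 and S3 in the same batch, and `sharedFiles` on that batch reports `src/types.ts`, which is the signal to serialize the two steps or split the shared file out first.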
40
+ ## Common Errors
41
+
42
+ - **Planning against imagined code.** The #1 failure. Steps reference files/functions that don't exist or have moved. Fix: step 2 is mandatory — read before you plan.
43
+ - **Checkpoints with no real verify.** "Verify: confirm it works" is theater. Every checkpoint needs a runnable command or a specific observable. If a step genuinely can't be verified in isolation, it's mis-sized — merge or split until it can.
44
+ - **Hidden dependency = false parallelism.** Two "independent" steps both edit a shared types file / route table / migration sequence and collide when run in parallel. Trace shared files explicitly in step 5 before declaring anything parallel.
45
+ - **Migration/refactor with no backout.** Irreversible step buried mid-plan with no undo path. Any data migration, destructive change, or shared-interface break must carry a backout and sit behind a flag or run last.
46
+ - **Boil-the-ocean step.** "Implement the feature" as one line. Useless — can't checkpoint, can't verify, can't hand off. Decompose until each step is a verifiable unit.
47
+ - **Plan that drifts from DoD.** Steps accrete scope unrelated to the one-line goal. Re-check every step against the DoD; cut anything that doesn't serve it.
48
+ - **Writing code here.** This skill plans only. Slipping into implementation skips the review/approval the plan exists to enable.
49
+
50
+ ## Verify
51
+
52
+ The plan is done when all of these hold:
53
+
54
+ - DoD is stated up top in observable/testable terms.
55
+ - Every step lists Touches + Depends-on + Verify; risky steps also list a backout.
56
+ - Every Verify is a runnable command or a specific observable, not "looks right".
57
+ - Sequential vs parallel is explicit; the critical path and any irreversible steps are flagged.
58
+ - Following the checklist top-to-bottom satisfies the DoD with no gaps and no unstated prerequisites.
59
+ - Output is a `- [ ]` checklist ready for an execute step or subagent dispatch — and zero production code was written.
@@ -0,0 +1,86 @@
1
+ ---
2
+ name: write-playwright-e2e
3
+ description: Designs and stabilizes Playwright end-to-end tests — Page Object Model, role/data-testid selectors, cross-browser, network mocking, visual regression; used when adding or de-flaking browser tests.
4
+ when_to_use: When the user wants end-to-end/browser tests, mentions Playwright, Page Object Model, flaky E2E tests, cross-browser testing, or testing real user flows in a browser.
5
+ ---
6
+
21
+ ## When to Use
22
+
23
+ - Adding a new browser test for a real user flow (login, checkout, search→result, form submit).
24
+ - A flow spans pages/redirects/auth and can't be covered by a unit or integration test.
25
+ - An existing E2E test is **flaky** (passes locally, fails in CI, or fails ~1 in N runs) and needs de-flaking.
26
+ - The user explicitly names Playwright, Page Object Model, cross-browser, or visual regression.
27
+
28
+ Skip — and reach for `write-tests` (unit/integration) instead — when the logic under test is a pure function, a parser, an API handler, or anything you can drive without a real DOM. E2E is the slowest, most brittle tier; only put a flow here when the value *is* the browser+network+rendering integration. One or two E2E tests per critical flow, not one per assertion.
29
+
30
+ ## Steps
31
+
32
+ 1. **Bootstrap once if Playwright is absent.** Check `package.json` devDeps for `@playwright/test`. If missing: `npm init playwright@latest` (or `npm i -D @playwright/test && npx playwright install --with-deps`). Confirm a `playwright.config.ts` exists; tests live in `e2e/` or `tests/` (match the repo's existing convention — never invent a parallel folder). Add an npm script `"test:e2e": "playwright test"` if none exists.
33
+
34
+ 2. **Set config invariants before writing any test.** In `playwright.config.ts`:
35
+ - `use.baseURL` → so tests call `page.goto('/path')`, never hardcoded hosts.
36
+ - `webServer: { command, url, reuseExistingServer: !process.env.CI }` → Playwright boots/awaits the app itself; no manual "start the server first".
37
+ - `use.trace: 'on-first-retry'`, `screenshot: 'only-on-failure'`, `video: 'retain-on-failure'` → failure forensics without bloating green runs.
38
+ - `expect.timeout` left at default (5s); raise per-assertion only where justified, never globally to mask slowness.
39
+ - `forbidOnly: !!process.env.CI` → a stray `.only` fails CI instead of silently shrinking the suite.
40
+
41
+ 3. **Write one user scenario per `test`, arrange→act→assert.** Name it after the behavior (`'user can reset password from the login page'`), not the implementation. Use `test.describe` to group a feature, `test.beforeEach` for shared navigation/setup. One scenario = one reason to fail; don't chain five unrelated flows into a mega-test where failure 1 hides 2–5.
42
+
43
+ 4. **Select by role/label first, `data-testid` only as fallback — never brittle CSS/text.** Priority order:
44
+ - `page.getByRole('button', { name: 'Submit' })`, `getByLabel`, `getByPlaceholder`, `getByText` for stable user-visible content → these mirror what a user/assistive-tech sees and survive refactors.
45
+ - `page.getByTestId('cart-total')` when there's no accessible handle. If the element lacks one, **add a `data-testid` to the app source** rather than reaching for `.css > .selectors:nth-child(3)`.
46
+ - Banned: deep CSS/XPath chains, `nth-child`, class names from a CSS framework, and matching on copy that translates/changes. These are the #1 source of false failures.
47
+
48
+ 5. **Lean entirely on auto-waiting + web-first assertions — zero manual sleeps.** Every locator action (`click`, `fill`) auto-waits for actionability; every `expect(locator)` retries until the timeout. Write `await expect(page.getByRole('alert')).toBeVisible()`, `toHaveText`, `toHaveURL`, `toBeEnabled`. **Never** `page.waitForTimeout(ms)` / `sleep` — it's either flaky (too short) or slow (too long). To wait on the network, await the round-trip explicitly: `await page.waitForResponse(r => r.url().includes('/api/x') && r.ok())` or `Promise.all([waitForResponse(...), button.click()])`.
49
+
50
+ 6. **Extract reusable flows into Page Objects.** One class per page/major component under `e2e/pages/` (e.g. `LoginPage`). The class takes `page` in its constructor, exposes locators as readonly fields and **intent methods** (`async login(user, pass)`), and returns the next Page Object when navigation crosses pages. Tests then read as prose: `await new LoginPage(page).login(...)`. Assertions stay in the test, not buried in the POM. Promote a flow to a POM the moment a second test needs it — don't copy-paste selectors.
51
+
52
+ 7. **Isolate from the real backend with network interception + fixtures.** For deterministic tests, mock at the network edge: `await page.route('**/api/orders', route => route.fulfill({ json: fixture }))`. Keep payloads in `e2e/fixtures/*.json`. Seed auth via storage state — log in once in a setup project, save `storageState`, and reuse it (`use.storageState`) so most tests skip the login UI. Decide explicitly per suite: mock (fast, deterministic, no env deps) vs. hit a seeded test backend (higher fidelity). Don't half-do it — a test that mocks some calls and lets others hit prod is the worst of both.
53
+
54
+ 8. **Cover cross-browser + viewport via `projects`, not duplicated tests.** Define `projects` for `chromium`, `firefox`, `webkit`, plus mobile via `devices['Pixel 5']` / `devices['iPhone 13']`. The same test file runs across all of them. Gate genuinely engine-specific behavior with `test.skip(browserName === 'webkit', 'reason')` — sparingly, with a written reason. For responsive breakpoints, set `viewport` per project rather than resizing mid-test.
55
+
56
+ 9. **(Optional) Add visual regression where pixels are the contract.** `await expect(page).toHaveScreenshot('checkout.png')` for layout-critical surfaces. Mask dynamic regions (`mask: [page.getByTestId('timestamp')]`), disable animations (`animations: 'disabled'`), and pin a fixed viewport. Generate baselines with `--update-snapshots`, **commit them**, and review the baseline image like code. Use sparingly — snapshots are high-maintenance; don't snapshot pages full of live/random data.
57
+
58
+ 10. **De-flake methodically — retries are a last resort, not a fix.** When a test flakes, run `playwright test --repeat-each=20 <file>` (or `--retries=0` to surface it) to reproduce, then open the **trace** (`npx playwright show-trace`) to see the exact failing step. Root causes, in order of likelihood: a manual sleep masking a real wait; asserting before the network settled; a non-deterministic selector (text/index); shared mutable state between tests; an animation/transition mid-action. Fix the cause. Only after that, set `retries: 2` in CI config as a safety net for genuine infra blips — never to paper over a known race.
59
+
60
+ 11. **Run green across all projects before declaring done.** Run `npm run test:e2e` (all browsers), then re-run the new/changed file with `--repeat-each=10` to check for flakiness. Confirm CI runs headless. Report: which flows are covered, which browsers, and any deliberately skipped engine.
61
+
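The config invariants from steps 2 and 8 can be collected into one object. This is a sketch under stated assumptions: the start command, port, and `testDir` are placeholders, and the object is written as a plain literal so it stands alone without importing `@playwright/test`.

```typescript
// CI detection without Node typings; a real config would typically read
// process.env.CI directly.
const isCI = Boolean((globalThis as any).process?.env?.CI);

// Hypothetical values: start command, port, and test directory are placeholders.
const config = {
  testDir: "./e2e",
  forbidOnly: isCI,        // a stray `.only` fails CI instead of shrinking the suite
  retries: isCI ? 2 : 0,   // safety net for infra blips only (see step 10)
  // expect.timeout is deliberately not set, so the default stays in effect.
  use: {
    baseURL: "http://localhost:3000", // tests call page.goto('/path')
    trace: "on-first-retry",
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
  webServer: {
    command: "npm run start",         // placeholder start command
    url: "http://localhost:3000",     // Playwright waits for this URL
    reuseExistingServer: !isCI,
  },
  // The same test files run once per project.
  projects: [
    { name: "chromium", use: { browserName: "chromium" } },
    { name: "firefox", use: { browserName: "firefox" } },
    { name: "webkit", use: { browserName: "webkit" } },
  ],
};
```

In an actual `playwright.config.ts` this object would usually be passed to `defineConfig` from `@playwright/test` and exported as the default export. Mobile projects are usually built by spreading entries from that package's `devices` map, which is omitted here because it needs the import.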
62
+ ## Common Errors
63
+
64
+ - **`waitForTimeout` / arbitrary sleeps.** The single biggest cause of flake *and* slowness. There is always a condition to await instead (`expect().toBeVisible`, `waitForResponse`, `waitForURL`). Treat any sleep in an E2E test as a bug.
65
+ - **Race on click→assert without awaiting the trigger's effect.** Clicking submit then immediately asserting the next page fails intermittently because navigation/fetch is in flight. Either use a web-first assertion (which retries) or `Promise.all([page.waitForURL('**/success'), submit.click()])`.
66
+ - **Brittle selectors.** `nth-child`, framework class names, and copy-text break on the next refactor/translation and produce false failures that erode trust in the suite. Role/label/testid only.
67
+ - **Forgetting `await`.** Playwright actions and web-first assertions return promises. A missing `await` lets the test move on (or finish) before the assertion settles, so it can pass vacuously: a test that can never fail. Enable `@typescript-eslint/no-floating-promises` to catch these.
68
+ - **`reuseExistingServer` left on in CI**, or no `webServer` block at all → tests race a not-yet-ready app and fail on the first `goto`. Let Playwright own server lifecycle and await `url`.
69
+ - **Tests depending on order / shared state.** Each test must set up and tear down its own data; Playwright runs files in parallel and order isn't guaranteed. Cross-test coupling produces "passes alone, fails in suite."
70
+ - **Cross-origin `page.route` misses.** A glob like `/api/*` won't match an absolute `https://api.example.com/...`. Use `**/api/**` and verify the route actually fired (route handlers that never match silently fall through to the real network).
71
+ - **Uncommitted or machine-specific snapshot baselines.** Visual diffs fail in CI when baselines aren't committed, or were generated on a different OS/font stack. Generate in (or matching) the CI environment and commit them.
72
+ - **Global timeout inflation to "fix" flake.** Bumping `expect.timeout` to 30s hides a race and makes every failure take 30s. Fix the wait condition; keep timeouts tight.
73
+
74
+ ## Verify
75
+
76
+ The work is done when:
77
+
78
+ - Each test covers exactly one user scenario, named for behavior, structured arrange→act→assert.
79
+ - Zero `waitForTimeout`/`sleep`; all waits are web-first assertions or explicit `waitFor*` conditions.
80
+ - Selectors are role/label-based, with `data-testid` only as a documented fallback — no positional CSS/XPath or copy-text matching.
81
+ - Reused flows live in Page Objects (`e2e/pages/`); duplicated selector blocks have been extracted.
82
+ - The suite is deterministic: network mocked via `page.route` + committed fixtures, or pointed at a seeded backend — chosen explicitly, applied consistently.
83
+ - `playwright.config.ts` sets `baseURL`, `webServer` (with `reuseExistingServer: !process.env.CI`), trace/screenshot/video on failure, and `forbidOnly` in CI.
84
+ - Cross-browser coverage exists via `projects` (chromium/firefox/webkit + at least one mobile device) on the same test files.
85
+ - New/changed tests pass `--repeat-each=10` with no flake, and the full `npm run test:e2e` is green headless. Any retries are documented as an infra safety net, not a race patch.
86
+ - (If used) visual snapshots are masked/animation-disabled, viewport-pinned, and the baselines are committed.
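A few items in the checklist above (no sleeps, no positional selectors, no stray `.only`) can be checked mechanically. The following is a rough sketch, not an existing tool: the rule names and `scanSpec` are hypothetical, and the regex matching is deliberately naive compared with a real AST-based linter.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
}

// Hypothetical, regex-based rules. They can produce false positives
// (for example inside comments or strings).
const rules: { rule: string; pattern: RegExp }[] = [
  { rule: "no-manual-sleep", pattern: /\bwaitForTimeout\s*\(|\bsleep\s*\(/ },
  { rule: "no-positional-selector", pattern: /nth-child/ },
  { rule: "no-focused-test", pattern: /\b(test|describe)\.only\s*\(/ },
];

// Scan a spec file's text line by line and report every rule that matches.
function scanSpec(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, i) => {
    rules.forEach((r) => {
      if (r.pattern.test(text)) findings.push({ line: i + 1, rule: r.rule });
    });
  });
  return findings;
}

// Sample spec text with two deliberate violations (lines 3 and 4).
const sampleSpec = [
  "test('user can reset password from the login page', async ({ page }) => {",
  "  await page.goto('/login');",
  "  await page.waitForTimeout(2000);",
  "  await page.locator('ul > li:nth-child(3)').click();",
  "  await expect(page.getByRole('alert')).toBeVisible();",
  "});",
].join("\n");

const sampleFindings = scanSpec(sampleSpec);
```

A check like this only covers the textual items. Determinism, Page Object extraction, and cross-browser coverage still need the `--repeat-each` run and a review of the config.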
@@ -0,0 +1,65 @@
1
+ ---
2
+ name: write-prd
3
+ description: Produces an opinionated, implementation-ready Product Requirements Document (PRD) via interactive discovery — evidence-first problem statement, measurable goals with counter-metrics, scoped requirements, and explicit prioritization.
4
+ when_to_use: User asks to write/draft a PRD, product spec, feature requirements, or 'spec this out'; product-side framing before engineering planning. Use when business intent must be structured before build.
5
+ ---
6
+
7
+ ## When to Use
8
+
9
+ - User says "write a PRD", "draft a spec", "spec this out", "feature requirements", or describes a feature/product to be built but hasn't framed it.
10
+ - Business intent exists but is not yet structured: vague goal, no success metric, fuzzy scope.
11
+ - The handoff target is engineering planning (a PRD precedes a tech design / implementation plan — do NOT put architecture, schemas, or code here).
12
+
13
+ Do NOT use for: a tech design doc, an RFC about *how* to build, a bug report, or a one-line change. If the user already knows the exact change, skip the PRD and plan the build.
14
+
15
+ ## Steps
16
+
17
+ 1. **Gate on context.** If the user gave a one-liner, do NOT start writing. Ask 3-5 discovery questions, then stop and wait. Cover these five gaps (skip any the user already answered):
18
+ - **Problem + evidence** — what's broken, and what data/observation proves it (not a hunch)?
19
+ - **Target user** — who specifically hits this, and how often?
20
+ - **Success metric** — one number that proves it worked, with current baseline and target.
21
+ - **Constraints** — deadline, platform, team size, regulatory/tech limits.
22
+ - **Scope boundary** — what is explicitly OUT of this version.
23
+ Keep questions concrete and answerable in one line each. Don't ask what you can reasonably infer.
24
+
25
+ 2. **Draft the PRD** in this exact section order. Each section is mandatory; write "None known" rather than deleting a heading.
26
+ 1. **Problem** — 2-4 sentences, evidence-backed. Lead with the observation/data, then the cost of inaction. Ban adjectives that aren't measured.
27
+ 2. **Goals** — 1-3 bullets, each a measurable outcome (`metric: baseline → target by date`). Every goal MUST be paired with a **counter-metric** (the thing you must NOT break while chasing it, e.g. "increase signup conversion without raising 30-day churn").
28
+ 3. **Non-goals** — explicit list of what this version will not do. This is where you kill scope creep on paper.
29
+ 4. **Requirements** — numbered, each tagged **P0 / P1 / P2** (P0 = ship-blocker, P1 = strongly wanted, P2 = nice-to-have). Each requirement has its own **acceptance criteria** written as testable Given/When/Then or a binary checklist. No requirement ships without criteria.
30
+ 5. **UX notes** — flows, states, edge/empty/error states. Words or ASCII, not pixel design. Note where design is still open.
31
+ 6. **Risks & dependencies** — what could sink this, what it relies on (other teams, APIs, data). One mitigation per risk.
32
+ 7. **Open questions** — unresolved decisions with an owner or a "needs decision by <date>".
33
+
34
+ 3. **Prioritize honestly.** If everything is P0, nothing is — force-rank so P0 is the minimum shippable set that satisfies the Goals. State the cut line ("P0 = v1 release; P1+ = fast-follow").
35
+
36
+ 4. **Flag every assumption inline** with a bold `**Assumption:**` prefix wherever you filled a gap the user didn't confirm. Do not silently invent a metric, user, or constraint.
37
+
38
+ 5. **Hold the length budget.** Target ~1200 words, hard ceiling ~1500. If you're over, cut prose — never cut a required section or acceptance criteria. Clarity over volume.
39
+
40
+ 6. **Output as a single Markdown document** (the PRD itself), not a conversational summary. End with a short "Assumptions & Open Questions to confirm" recap so the reader knows what's still soft.
41
+
42
+ ## Common Errors
43
+
44
+ - **Writing the PRD from the one-liner without discovery.** The #1 failure. A PRD built on guesses is worse than no PRD. Always run step 1 first unless the user explicitly says "just draft it, I'll fix details."
45
+ - **Goals with no number.** "Improve onboarding" is not a goal. If you can't attach `baseline → target`, it's a vision statement — push back and ask for the metric.
46
+ - **Forgetting counter-metrics.** A goal with no guardrail invites the team to game it (e.g. boost conversion by adding dark patterns that spike churn). Every goal needs its "without breaking X" clause.
47
+ - **Requirements without acceptance criteria.** Without testable criteria, engineering can't tell "done" from "looks done." Each requirement is incomplete until it has a binary pass/fail check.
48
+ - **Leaking solution into the problem.** "We need a Redis cache" is a solution, not a problem. The Problem section describes user/business pain; *how* belongs in the downstream tech design, not here.
49
+ - **P0 inflation.** Marking everything P0 defeats prioritization. Force a cut line; if pressed, the P0 set should be the smallest thing that hits the Goals.
50
+ - **Silent assumptions.** Inventing the target user or success metric and presenting it as fact. Always tag with `**Assumption:**` so it gets challenged.
51
+ - **Bloating past budget with backstory.** Long context-setting prose buries the requirements. Trim narrative, keep the spec.
52
+
53
+ ## Verify
54
+
55
+ Before declaring the PRD done, confirm every item — fix and re-check any that fail:
56
+
57
+ - [ ] All 7 sections present and in order (Problem → Goals → Non-goals → Requirements → UX → Risks/deps → Open questions).
58
+ - [ ] Problem cites at least one piece of evidence (data/observation), not just opinion.
59
+ - [ ] Every Goal is measurable (`baseline → target`) AND has a paired counter-metric.
60
+ - [ ] Every Requirement has a P0/P1/P2 tag AND testable acceptance criteria.
61
+ - [ ] A clear P0 cut line exists (P0 ≠ "everything").
62
+ - [ ] Every gap you filled is marked `**Assumption:**`.
63
+ - [ ] Word count is near the ~1200 target and no more than 1500 (hard cap).
64
+ - [ ] No architecture/schema/code — those belong to the downstream design doc.
65
+ - [ ] Output is a Markdown PRD, ending with the Assumptions & Open Questions recap.
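Several of these checks are structural, so they can be expressed as a small lint over a parsed draft. The sketch below assumes a hypothetical `PrdDraft` shape (nothing in the skill defines one), and the sample metric values are invented purely for illustration.

```typescript
type Priority = "P0" | "P1" | "P2";

interface Goal { metric: string; baseline: string; target: string; counterMetric?: string; }
interface Requirement { id: string; priority: Priority; acceptance: string[]; }
interface PrdDraft { sections: string[]; goals: Goal[]; requirements: Requirement[]; wordCount: number; }

// Section order taken from step 2 of the skill.
const REQUIRED_SECTIONS = [
  "Problem", "Goals", "Non-goals", "Requirements",
  "UX notes", "Risks & dependencies", "Open questions",
];

// Return a list of human-readable problems; an empty list means the
// structural checks pass (evidence quality still needs a human read).
function lintPrd(prd: PrdDraft): string[] {
  const problems: string[] = [];
  if (prd.sections.join("|") !== REQUIRED_SECTIONS.join("|")) {
    problems.push("sections missing or out of order");
  }
  prd.goals.forEach((g) => {
    if (!g.baseline || !g.target) problems.push(`goal "${g.metric}" lacks baseline/target`);
    if (!g.counterMetric) problems.push(`goal "${g.metric}" has no counter-metric`);
  });
  prd.requirements.forEach((r) => {
    if (r.acceptance.length === 0) problems.push(`${r.id} has no acceptance criteria`);
  });
  if (prd.requirements.length > 1 && prd.requirements.every((r) => r.priority === "P0")) {
    problems.push("no P0 cut line: every requirement is P0");
  }
  if (prd.wordCount > 1500) problems.push("over the 1500-word hard cap");
  return problems;
}

// Hypothetical draft with three deliberate defects.
const draft: PrdDraft = {
  sections: REQUIRED_SECTIONS.slice(),
  goals: [{ metric: "signup conversion", baseline: "3.1%", target: "4.0%" }],
  requirements: [
    { id: "R1", priority: "P0", acceptance: ["Given a new visitor, when they submit the form, then an account is created"] },
    { id: "R2", priority: "P0", acceptance: [] },
  ],
  wordCount: 1320,
};

const draftProblems = lintPrd(draft);
```

For this draft the lint reports the missing counter-metric, the requirement without acceptance criteria, and the absent P0 cut line.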
@@ -0,0 +1,75 @@
1
+ ---
2
+ name: write-rfc
3
+ description: Sanook drafts an engineering RFC / design doc / technical proposal for a non-trivial change — motivation, proposed design, alternatives, tradeoffs, rollout/migration, risks, and open questions — structured for team review and sign-off.
4
+ when_to_use: User asks to write an RFC, design doc, tech proposal, or 'design document' for a system/feature/migration that needs review before building; larger than an ADR (which records one decision).
5
+ ---
6
+
7
+ ## When to Use
8
+
9
+ Use for a non-trivial change that needs **review before building**: a new service, a cross-cutting refactor, a data migration, an API redesign, a build-vs-buy decision. Reach for this when the change touches multiple teams/systems, is expensive to reverse, or has more than one viable approach worth comparing.
10
+
11
+ Do NOT use when:
12
+ - The decision is already made and you just need to record *what* and *why* → write an **ADR** (single decision, ~1 page), not an RFC.
13
+ - The change is a one-liner you can describe in a PR description → just open the PR.
14
+ - It's a pure incident/postmortem → use a postmortem template.
15
+
16
+ An RFC is bigger than an ADR: it explores the *solution space* and asks reviewers to choose. An ADR records *one locked decision*. If an RFC locks a sub-decision, spawn an ADR for it and link.
17
+
18
+ ## Steps
19
+
20
+ 1. **Locate the repo's RFC convention first.** Look for an existing `docs/rfc/`, `rfcs/`, or `docs/adr/` dir and copy the latest file's structure, numbering, and frontmatter. Match the team's house style over this template. New file: `docs/rfc/NNNN-kebab-title.md` where `NNNN` = (highest existing number + 1), zero-padded. If no convention exists, create `docs/rfc/0001-<title>.md`.
21
+
22
+ 2. **Write the metadata header** (status drives review): `RFC #`, `Title`, `Author(s)`, `Status: Draft` (lifecycle: Draft → In Review → Accepted / Rejected / Superseded), `Created` date, `Reviewers` (named, not "the team"), `Related: ` links to prior RFCs/ADRs/issues.
23
+
24
+ 3. **Lead with a TL;DR / Summary** (3-5 sentences max): what you're proposing, the *one* recommended option, and **why now** (the forcing function — what breaks or gets blocked if we don't). A reviewer who reads only this paragraph should know what they're approving.
25
+
26
+ 4. **Motivation** — state the concrete problem with evidence (a metric, an incident, a scaling limit, a recurring support load), not "it would be nice." Then **Goals** and **Non-Goals** as two explicit bullet lists. Non-Goals is the highest-leverage section for killing scope-creep arguments in review — name what's deliberately out.
27
+
28
+ 5. **Proposed Design** — the recommended option, in enough detail to estimate and critique:
29
+ - Architecture / data flow. Add a `mermaid` diagram when components or sequence matter:
30
+ ```mermaid
31
+ flowchart LR
32
+ Client --> API --> Queue --> Worker --> DB
33
+ ```
34
+ - API / schema / interface changes — show the actual signatures, table DDL, or message shapes.
35
+ - Key behaviors, failure modes, and concurrency/consistency assumptions.
36
+
37
+ 6. **Alternatives Considered** — at least 2 real options (one is usually "do nothing / status quo"). For each: a one-line description + **why not** chosen. Reviewers trust a recommendation more when they see the rejected paths. This is the section that separates an RFC from a spec.
38
+
39
+ 7. **Tradeoffs** — a comparison table across the live options on the axes that matter here (e.g. complexity, cost, latency, migration effort, blast radius, lock-in). Pick axes specific to *this* decision; don't ship a generic grid.
40
+
41
+ 8. **Migration / Rollout** — ordered, runnable steps: feature flag → backfill → dual-write/shadow → cutover → cleanup. State the **backout plan** (how to revert at each phase) and whether each step is reversible. Note data backfill and any irreversible point-of-no-return explicitly.
42
+
43
+ 9. **Risks & Mitigations** — table of `Risk | Likelihood | Impact | Mitigation`. Include the failure that keeps you up at night, not just easy ones.
44
+
45
+ 10. **Security & Performance Impact** — new attack surface, authz/data-exposure changes, PII handling, new dependencies; expected latency/throughput/cost delta and how you'll measure it. If the change touches auth, input handling, or secrets, say so loudly here so reviewers route it to a security pass.
46
+
47
+ 11. **Open Questions** — honest unknowns you want input on. An RFC with zero open questions usually means you haven't thought hard enough.
48
+
49
+ 12. **Decisions Needed** — end with a numbered list of explicit asks: each item = a question + the options + your recommendation, phrased so a reviewer can reply "approve #1, #3; let's discuss #2." This is what unblocks sign-off; don't bury it.
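The numbering rule from step 1 (`NNNN` = highest existing number + 1, zero-padded) can be sketched as below. This is a minimal illustration, not part of the skill: `next_rfc_filename` and its `width` parameter are hypothetical names, and the four-digit width is an assumption that should give way to whatever the repo already uses.

```python
import re


def next_rfc_filename(existing: list[str], title: str, width: int = 4) -> str:
    """Return 'NNNN-kebab-title.md', where NNNN = highest existing number + 1.

    `existing` is the list of file names already in the RFC directory.
    Names that do not start with '<digits>-' (README.md, templates) are ignored.
    """
    numbers = []
    for name in existing:
        match = re.match(r"(\d+)-", name)
        if match:
            numbers.append(int(match.group(1)))

    # With no numbered files yet, default=0 makes the first RFC number 1.
    next_number = max(numbers, default=0) + 1

    # Kebab-case: lowercase, collapse each run of non-alphanumerics to one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{next_number:0{width}d}-{slug}.md"
```

For example, calling it with `["0001-intro.md", "0007-queue-redesign.md", "README.md"]` and the title `"Move Auth to Gateway"` yields `0008-move-auth-to-gateway.md`. Taking the maximum rather than counting files keeps the numbering correct when earlier RFCs have been deleted or skipped.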
50
+
51
+ ## Common Errors
52
+
53
+ - **Describing the solution before the problem.** If Motivation/Goals are weak, reviewers argue about the design forever. Lock the problem statement first.
54
+ - **Strawman alternatives.** Listing options you obviously dismiss signals you didn't really explore. Each alternative needs a *plausible* reason someone would pick it, then your honest why-not.
55
+ - **No backout plan.** "We'll roll forward" is not a plan. Every migration phase needs a documented revert path; flag irreversible steps in bold.
56
+ - **Generic tradeoff table.** A grid of complexity/cost/maintainability that could apply to any RFC adds nothing. Choose axes that actually differ between *these* options.
57
+ - **Vague ownership.** "Reviewers: the team" and "Author: TBD" stall sign-off. Name people. "Decisions needed" with no recommendation forces reviewers to do your synthesis.
58
+ - **RFC that's actually an ADR.** If there's only one option and the decision is made, you're padding a 1-page ADR into 5 pages. Downgrade it.
59
+ - **Mermaid that doesn't render.** A wrong fence (it must open with ` ```mermaid `) or a typo in `flowchart`/`sequenceDiagram` breaks the diagram silently in the doc viewer. Paste-test before shipping.
60
+ - **Letting status go stale.** A doc stuck on `Status: Draft` after it shipped misleads future readers. Update to Accepted/Superseded on resolution.
61
+
62
+ ## Verify
63
+
64
+ Before declaring the RFC ready for review, confirm:
65
+
66
+ - [ ] A reviewer can read **only the TL;DR** and know what they're approving and why now.
67
+ - [ ] **Goals AND Non-Goals** are both present and explicit.
68
+ - [ ] **≥ 2 alternatives** each have a concrete *why-not* (status quo counts as one).
69
+ - [ ] **Tradeoff table** uses axes specific to this decision, not boilerplate.
70
+ - [ ] **Migration section has a backout/revert path**, with irreversible steps flagged.
71
+ - [ ] **Security & performance impact** is addressed (even if "none, because…").
72
+ - [ ] Doc ends with a **numbered "Decisions Needed"** list, each with a recommendation.
73
+ - [ ] Any **mermaid diagram renders** (correct fence + valid syntax).
74
+ - [ ] Every locked sub-decision links to (or spawns) an **ADR**; metadata names real **author + reviewers** and `Status`.
75
+ - [ ] File lives under the repo's RFC dir with correct sequential numbering and matches the house template if one exists.
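Part of this checklist can be pre-screened mechanically. The sketch below reports which expected sections a draft lacks. It is only an illustration under stated assumptions: the section names are taken from the Steps above, `missing_sections` and `section_labels` are hypothetical helpers, and a "section" is assumed to be either a Markdown heading or a line that opens with a bold label (since Goals and Non-Goals may be bullet lists under Motivation). It cannot judge quality, only presence.

```python
import re

# Assumed section names, drawn from the Steps; adjust to the house template.
REQUIRED_SECTIONS = [
    "Summary", "Motivation", "Goals", "Non-Goals", "Proposed Design",
    "Alternatives Considered", "Tradeoffs", "Migration", "Risks",
    "Security", "Open Questions", "Decisions Needed",
]


def section_labels(markdown: str) -> list[str]:
    """Collect lowercase heading texts and leading bold labels."""
    labels = []
    for line in markdown.splitlines():
        stripped = line.strip()
        heading = re.match(r"#{1,6}\s+(.*)", stripped)
        bold = re.match(r"\*\*(.+?)\*\*", stripped)
        if heading:
            labels.append(heading.group(1).lower())
        elif bold:
            labels.append(bold.group(1).lower())
    return labels


def missing_sections(markdown: str) -> list[str]:
    """Return the required sections that no label in the draft mentions."""
    labels = section_labels(markdown)
    missing = []
    for section in REQUIRED_SECTIONS:
        # The lookbehind stops "Goals" from being satisfied by "Non-Goals".
        pattern = rf"(?<![\w-]){re.escape(section.lower())}"
        if not any(re.search(pattern, label) for label in labels):
            missing.append(section)
    return missing
```

An empty result means every expected section is present somewhere; the remaining checklist items (specific tradeoff axes, real reviewers, rendering diagrams) still need a human read.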
@@ -0,0 +1,50 @@
1
+ ---
2
+ name: write-tests
3
+ description: Writes new automated tests for a function, API, parser, or module using a real test runner — TDD-first when the contract is clear (write test, run to FAIL, then implement). Use when adding test coverage or building a feature with a defined contract. Does NOT diagnose existing failures.
4
+ when_to_use: Adding coverage; building a feature with a clear contract (function/API/parser/bot) using TDD; before fixing a regression
5
+ ---
6
+
7
+ ## When to Use
8
+
9
+ - Adding test coverage to existing code whose behavior is already stable.
10
+ - Building a new feature with a clear contract (function signature, API request/response, parser input→output, bot command). Use TDD: write the test first, watch it fail, then implement.
11
+ - Reproducing a regression as a failing test BEFORE fixing the bug.
12
+
13
+ Do NOT use this to diagnose why existing tests fail — that is a debugging task, not a test-writing one.
14
+
15
+ ## Steps
16
+
17
+ 1. **Pin the contract before writing any code.** State in one line: inputs, expected output, and side effects. List edge cases explicitly — empty input, null/undefined, zero/negative, max boundary, malformed input, error path. If the contract is ambiguous, resolve it from the spec/types/caller, not from the current implementation.
18
+
19
+ 2. **Detect the repo's test runner — never introduce a new one.** Check config/manifest:
20
+ - Node: `package.json` `scripts.test` and devDeps → `jest` / `vitest` / `node:test` / `mocha`.
21
+ - Python: `pytest.ini` / `pyproject.toml` `[tool.pytest]` / `tox.ini` → `pytest`; else `unittest`.
22
+ - Go: `go test ./...`. Rust: `cargo test`.
23
+ Match an existing test file's location, naming (`*.test.ts`, `*.spec.ts`, `test_*.py`, `*_test.go`), and import style.
24
+
25
+ 3. **Write tests covering happy path + every edge case + error path.** One assertion target per test, descriptive names (`returns_empty_array_when_input_is_null`). Assert real values, not just "no throw". For error cases, assert the specific error type/message, not a bare catch.
26
+
27
+ 4. **Mock only true external boundaries** (network, clock, filesystem, DB). Do NOT mock the unit under test or pure logic. If a test mocks so much that it only asserts the mock was called, it tests nothing — delete or rewrite it.
28
+
29
+ 5. **Run the tests and confirm RED.** For TDD, the test MUST fail because the feature is unimplemented — not because of an import error or typo. Read the failure message and verify it fails for the contract reason. A test that passes immediately on unwritten code is broken.
30
+
31
+ 6. **For TDD: commit the failing test BEFORE writing implementation.** This makes the red→green transition visible in history, so anyone can verify that the test came first. Then implement until green.
32
+
33
+ 7. **Implement until all tests pass. Never weaken a test to make it green** — no deleting assertions, loosening matchers, or `skip`/`xit`. If a test is genuinely wrong about the contract, fix the contract understanding explicitly, don't silently soften the assertion. Fix failures at root cause; do not suppress errors or catch-and-ignore.
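As an illustration of steps 1, 3, and 7, here is a minimal sketch using Python's standard `unittest` runner. The unit under test, `parse_port`, is hypothetical, and its contract (a decimal string in 1..65535 returns an `int`; anything else raises `ValueError` with a specific message) is assumed for the example. In a real TDD flow the test class would be written and seen to fail before the function body exists; both are shown together here only so the block runs as-is.

```python
import unittest


def parse_port(text: str) -> int:
    """Hypothetical unit under test: parse a TCP port in 1..65535."""
    if not text.strip().isdecimal():
        raise ValueError(f"not a port number: {text!r}")
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    # Happy path: assert the real value, not just "no throw".
    def test_returns_int_for_valid_port(self):
        self.assertEqual(parse_port("8080"), 8080)

    # Max boundary from the contract.
    def test_accepts_max_boundary(self):
        self.assertEqual(parse_port("65535"), 65535)

    # Error paths: assert the specific error type and message.
    def test_rejects_zero_as_out_of_range(self):
        with self.assertRaisesRegex(ValueError, "out of range"):
            parse_port("0")

    def test_rejects_empty_input(self):
        with self.assertRaisesRegex(ValueError, "not a port number"):
            parse_port("")

    def test_rejects_negative_number(self):
        with self.assertRaisesRegex(ValueError, "not a port number"):
            parse_port("-1")
```

Each test names one behavior and pins one assertion target, so a failure message points directly at the violated part of the contract. Nothing is mocked because `parse_port` is pure logic with no external boundary.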
34
+
35
+ ## Common Errors
36
+
37
+ - **Over-mocking → vacuous test.** Mocking the function under test means you assert your own mock. Rule: if removing the assertion changes nothing, the test is dead.
38
+ - **Test written against implementation, not contract.** Asserting internal call order or private state couples the test to refactors. Assert observable behavior (return value, emitted event, persisted row) instead.
39
+ - **Test never actually ran red.** Skipping step 5 hides tests that pass for the wrong reason (e.g. function returns `undefined`, assertion is `toBeFalsy`). Always see it fail first.
40
+ - **Non-deterministic test.** Real time (`Date.now()`), randomness, unordered map iteration, or network calls make tests flaky. Inject a fixed clock/seed; sort before comparing; stub the network.
41
+ - **Wrong runner / orphan test file.** A Vitest file in a Jest repo (or vice versa) won't be picked up by the test command and silently never runs. Confirm the file is collected by the existing `test` command.
42
+ - **Async not awaited.** Missing `await`/`return` on a promise assertion → test passes before the assertion runs. Await every async assertion.
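The fixes for the non-deterministic case (inject a fixed clock, sort before comparing) might look like the sketch below. `is_expired` and `unique_tags` are hypothetical functions invented for the example; the only technique being shown is passing the clock in as a parameter so the test controls it, and normalizing order before asserting on an unordered collection.

```python
import unittest
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable


def utc_now() -> datetime:
    """Real clock, used only as the production default."""
    return datetime.now(timezone.utc)


def is_expired(expires_at: datetime, now: Callable[[], datetime] = utc_now) -> bool:
    """Hypothetical unit: expired once the clock reaches the deadline."""
    return now() >= expires_at


def unique_tags(tags: Iterable[str]) -> set[str]:
    """Hypothetical unit returning an unordered collection."""
    return {tag.strip().lower() for tag in tags}


FIXED_NOW = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


def fixed_clock() -> datetime:
    """Test clock: always returns the same instant."""
    return FIXED_NOW


class DeterministicTests(unittest.TestCase):
    def test_not_expired_one_second_before_deadline(self):
        deadline = FIXED_NOW + timedelta(seconds=1)
        self.assertFalse(is_expired(deadline, now=fixed_clock))

    def test_expired_exactly_at_deadline(self):
        self.assertTrue(is_expired(FIXED_NOW, now=fixed_clock))

    def test_tags_are_deduplicated_regardless_of_order(self):
        # Sort before comparing so set iteration order cannot matter.
        result = sorted(unique_tags(["Beta", "alpha ", "beta"]))
        self.assertEqual(result, ["alpha", "beta"])
```

Because the clock is an argument rather than a patched global, the boundary case (exactly at the deadline) can be asserted precisely, which a test reading the real time could never do reliably.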
43
+
44
+ ## Verify
45
+
46
+ - Run the repo's actual test command (`npm test` / `pytest` / `go test ./...`) — the new tests appear in the run count and pass.
47
+ - Temporarily break the implementation (flip a return, comment a line) → the new test goes RED. Revert. This proves the test actually exercises the code.
48
+ - For TDD: `git log` / `git diff` shows the failing-test commit landed before the implementation commit.
49
+ - Coverage report (if available) shows the new branches/lines are hit — confirm edge and error paths execute, not just the happy path.
50
+ - Tests are deterministic: run them twice (or serially, e.g. with Jest's `--runInBand` or the runner's equivalent no-parallelism option) and get identical results.
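The "temporarily break the implementation" check can also be rehearsed in code. The sketch below runs one test suite against a correct function and against a deliberately broken copy, and expects green for the first and red for the second. Everything here is hypothetical (`clamp`, `broken_clamp`, `make_suite`, `suite_passes`); it illustrates the idea of the manual check rather than replacing it, and dedicated mutation-testing tools do this far more thoroughly.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical implementation under test."""
    return max(low, min(value, high))


def broken_clamp(value, low, high):
    """Deliberately broken variant: forgets the upper bound."""
    return max(low, value)


def make_suite(impl) -> unittest.TestSuite:
    """Build the same tests against whichever implementation is passed in."""

    class ClampTest(unittest.TestCase):
        def test_value_inside_range_is_unchanged(self):
            self.assertEqual(impl(5, 0, 10), 5)

        def test_value_above_range_is_clamped_to_high(self):
            self.assertEqual(impl(99, 0, 10), 10)

        def test_value_below_range_is_clamped_to_low(self):
            self.assertEqual(impl(-3, 0, 10), 0)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ClampTest)


def suite_passes(impl) -> bool:
    """Run the suite quietly and report whether every test passed."""
    result = unittest.TestResult()
    make_suite(impl).run(result)
    return result.wasSuccessful()
```

If `suite_passes(broken_clamp)` came back true, the suite would not be exercising the upper bound at all, which is exactly the gap this Verify step is meant to expose.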