safeword 0.2.3 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (235)
  1. package/.claude/commands/arch-review.md +32 -0
  2. package/.claude/commands/lint.md +6 -0
  3. package/.claude/commands/quality-review.md +13 -0
  4. package/.claude/commands/setup-linting.md +6 -0
  5. package/.claude/hooks/auto-lint.sh +6 -0
  6. package/.claude/hooks/auto-quality-review.sh +170 -0
  7. package/.claude/hooks/check-linting-sync.sh +17 -0
  8. package/.claude/hooks/inject-timestamp.sh +6 -0
  9. package/.claude/hooks/question-protocol.sh +12 -0
  10. package/.claude/hooks/run-linters.sh +8 -0
  11. package/.claude/hooks/run-quality-review.sh +76 -0
  12. package/.claude/hooks/version-check.sh +10 -0
  13. package/.claude/mcp/README.md +96 -0
  14. package/.claude/mcp/arcade.sample.json +9 -0
  15. package/.claude/mcp/context7.sample.json +7 -0
  16. package/.claude/mcp/playwright.sample.json +7 -0
  17. package/.claude/settings.json +62 -0
  18. package/.claude/skills/quality-reviewer/SKILL.md +190 -0
  19. package/.claude/skills/safeword-quality-reviewer/SKILL.md +13 -0
  20. package/.env.arcade.example +4 -0
  21. package/.env.example +11 -0
  22. package/.gitmodules +4 -0
  23. package/.safeword/SAFEWORD.md +33 -0
  24. package/.safeword/eslint/eslint-base.mjs +101 -0
  25. package/.safeword/guides/architecture-guide.md +404 -0
  26. package/.safeword/guides/code-philosophy.md +174 -0
  27. package/.safeword/guides/context-files-guide.md +405 -0
  28. package/.safeword/guides/data-architecture-guide.md +183 -0
  29. package/.safeword/guides/design-doc-guide.md +165 -0
  30. package/.safeword/guides/learning-extraction.md +515 -0
  31. package/.safeword/guides/llm-instruction-design.md +239 -0
  32. package/.safeword/guides/llm-prompting.md +95 -0
  33. package/.safeword/guides/tdd-best-practices.md +570 -0
  34. package/.safeword/guides/test-definitions-guide.md +243 -0
  35. package/.safeword/guides/testing-methodology.md +573 -0
  36. package/.safeword/guides/user-story-guide.md +237 -0
  37. package/.safeword/guides/zombie-process-cleanup.md +214 -0
  38. package/{templates → .safeword}/hooks/agents-md-check.sh +0 -0
  39. package/{templates → .safeword}/hooks/post-tool.sh +0 -0
  40. package/{templates → .safeword}/hooks/pre-commit.sh +0 -0
  41. package/.safeword/planning/002-user-story-quality-evaluation.md +1840 -0
  42. package/.safeword/planning/003-langsmith-eval-setup-prompt.md +363 -0
  43. package/.safeword/planning/004-llm-eval-test-cases.md +3226 -0
  44. package/.safeword/planning/005-architecture-enforcement-system.md +169 -0
  45. package/.safeword/planning/006-reactive-fix-prevention-research.md +135 -0
  46. package/.safeword/planning/011-cli-ux-vision.md +330 -0
  47. package/.safeword/planning/012-project-structure-cleanup.md +154 -0
  48. package/.safeword/planning/README.md +39 -0
  49. package/.safeword/planning/automation-plan-v2.md +1225 -0
  50. package/.safeword/planning/automation-plan-v3.md +1291 -0
  51. package/.safeword/planning/automation-plan.md +3058 -0
  52. package/.safeword/planning/design/005-cli-implementation.md +343 -0
  53. package/.safeword/planning/design/013-cli-self-contained-templates.md +596 -0
  54. package/.safeword/planning/design/013a-eslint-plugin-suite.md +256 -0
  55. package/.safeword/planning/design/013b-implementation-snippets.md +385 -0
  56. package/.safeword/planning/design/013c-config-isolation-strategy.md +242 -0
  57. package/.safeword/planning/design/code-philosophy-improvements.md +60 -0
  58. package/.safeword/planning/mcp-analysis.md +545 -0
  59. package/.safeword/planning/phase2-subagents-vs-skills-analysis.md +451 -0
  60. package/.safeword/planning/settings-improvements.md +970 -0
  61. package/.safeword/planning/test-definitions/005-cli-implementation.md +1301 -0
  62. package/.safeword/planning/test-definitions/cli-self-contained-templates.md +205 -0
  63. package/.safeword/planning/user-stories/001-guides-review-user-stories.md +1381 -0
  64. package/.safeword/planning/user-stories/003-reactive-fix-prevention.md +132 -0
  65. package/.safeword/planning/user-stories/004-technical-constraints.md +86 -0
  66. package/.safeword/planning/user-stories/005-cli-implementation.md +311 -0
  67. package/.safeword/planning/user-stories/cli-self-contained-templates.md +172 -0
  68. package/.safeword/planning/versioned-distribution.md +740 -0
  69. package/.safeword/prompts/arch-review.md +43 -0
  70. package/.safeword/prompts/quality-review.md +11 -0
  71. package/.safeword/scripts/arch-review.sh +235 -0
  72. package/.safeword/scripts/check-linting-sync.sh +58 -0
  73. package/.safeword/scripts/setup-linting.sh +559 -0
  74. package/.safeword/templates/architecture-template.md +136 -0
  75. package/.safeword/templates/ci/architecture-check.yml +79 -0
  76. package/.safeword/templates/design-doc-template.md +127 -0
  77. package/.safeword/templates/test-definitions-feature.md +100 -0
  78. package/.safeword/templates/ticket-template.md +74 -0
  79. package/.safeword/templates/user-stories-template.md +82 -0
  80. package/.safeword/tickets/001-guides-review-user-stories.md +83 -0
  81. package/.safeword/tickets/002-architecture-enforcement.md +211 -0
  82. package/.safeword/tickets/003-reactive-fix-prevention.md +57 -0
  83. package/.safeword/tickets/004-technical-constraints-in-user-stories.md +39 -0
  84. package/.safeword/tickets/005-cli-implementation.md +248 -0
  85. package/.safeword/tickets/006-flesh-out-skills.md +43 -0
  86. package/.safeword/tickets/007-flesh-out-questioning.md +44 -0
  87. package/.safeword/tickets/008-upgrade-questioning.md +58 -0
  88. package/.safeword/tickets/009-naming-conventions.md +41 -0
  89. package/.safeword/tickets/010-safeword-md-cleanup.md +34 -0
  90. package/.safeword/tickets/011-cursor-setup.md +86 -0
  91. package/.safeword/tickets/README.md +73 -0
  92. package/.safeword/version +1 -0
  93. package/AGENTS.md +59 -0
  94. package/CLAUDE.md +12 -0
  95. package/README.md +347 -0
  96. package/docs/001-cli-implementation-plan.md +856 -0
  97. package/docs/elite-dx-implementation-plan.md +1034 -0
  98. package/framework/README.md +131 -0
  99. package/framework/mcp/README.md +96 -0
  100. package/framework/mcp/arcade.sample.json +8 -0
  101. package/framework/mcp/context7.sample.json +6 -0
  102. package/framework/mcp/playwright.sample.json +6 -0
  103. package/framework/scripts/arch-review.sh +235 -0
  104. package/framework/scripts/check-linting-sync.sh +58 -0
  105. package/framework/scripts/load-env.sh +49 -0
  106. package/framework/scripts/setup-claude.sh +223 -0
  107. package/framework/scripts/setup-linting.sh +559 -0
  108. package/framework/scripts/setup-quality.sh +477 -0
  109. package/framework/scripts/setup-safeword.sh +550 -0
  110. package/framework/templates/ci/architecture-check.yml +78 -0
  111. package/learnings/ai-sdk-v5-breaking-changes.md +178 -0
  112. package/learnings/e2e-test-zombie-processes.md +231 -0
  113. package/learnings/milkdown-crepe-editor-property.md +96 -0
  114. package/learnings/prosemirror-fragment-traversal.md +119 -0
  115. package/package.json +19 -43
  116. package/packages/cli/AGENTS.md +1 -0
  117. package/packages/cli/ARCHITECTURE.md +279 -0
  118. package/packages/cli/package.json +51 -0
  119. package/packages/cli/src/cli.ts +63 -0
  120. package/packages/cli/src/commands/check.ts +166 -0
  121. package/packages/cli/src/commands/diff.ts +209 -0
  122. package/packages/cli/src/commands/reset.ts +190 -0
  123. package/packages/cli/src/commands/setup.ts +325 -0
  124. package/packages/cli/src/commands/upgrade.ts +163 -0
  125. package/packages/cli/src/index.ts +3 -0
  126. package/packages/cli/src/templates/config.ts +58 -0
  127. package/packages/cli/src/templates/content.ts +18 -0
  128. package/packages/cli/src/templates/index.ts +12 -0
  129. package/packages/cli/src/utils/agents-md.ts +66 -0
  130. package/packages/cli/src/utils/fs.ts +179 -0
  131. package/packages/cli/src/utils/git.ts +124 -0
  132. package/packages/cli/src/utils/hooks.ts +29 -0
  133. package/packages/cli/src/utils/output.ts +60 -0
  134. package/packages/cli/src/utils/project-detector.test.ts +185 -0
  135. package/packages/cli/src/utils/project-detector.ts +44 -0
  136. package/packages/cli/src/utils/version.ts +28 -0
  137. package/packages/cli/src/version.ts +6 -0
  138. package/packages/cli/templates/SAFEWORD.md +776 -0
  139. package/packages/cli/templates/doc-templates/architecture-template.md +136 -0
  140. package/packages/cli/templates/doc-templates/design-doc-template.md +134 -0
  141. package/packages/cli/templates/doc-templates/test-definitions-feature.md +131 -0
  142. package/packages/cli/templates/doc-templates/ticket-template.md +82 -0
  143. package/packages/cli/templates/doc-templates/user-stories-template.md +92 -0
  144. package/packages/cli/templates/guides/architecture-guide.md +423 -0
  145. package/packages/cli/templates/guides/code-philosophy.md +195 -0
  146. package/packages/cli/templates/guides/context-files-guide.md +457 -0
  147. package/packages/cli/templates/guides/data-architecture-guide.md +200 -0
  148. package/packages/cli/templates/guides/design-doc-guide.md +171 -0
  149. package/packages/cli/templates/guides/learning-extraction.md +552 -0
  150. package/packages/cli/templates/guides/llm-instruction-design.md +248 -0
  151. package/packages/cli/templates/guides/llm-prompting.md +102 -0
  152. package/packages/cli/templates/guides/tdd-best-practices.md +615 -0
  153. package/packages/cli/templates/guides/test-definitions-guide.md +334 -0
  154. package/packages/cli/templates/guides/testing-methodology.md +618 -0
  155. package/packages/cli/templates/guides/user-story-guide.md +256 -0
  156. package/packages/cli/templates/guides/zombie-process-cleanup.md +219 -0
  157. package/packages/cli/templates/hooks/agents-md-check.sh +27 -0
  158. package/packages/cli/templates/hooks/post-tool.sh +4 -0
  159. package/packages/cli/templates/hooks/pre-commit.sh +10 -0
  160. package/packages/cli/templates/prompts/arch-review.md +43 -0
  161. package/packages/cli/templates/prompts/quality-review.md +10 -0
  162. package/packages/cli/templates/skills/safeword-quality-reviewer/SKILL.md +207 -0
  163. package/packages/cli/tests/commands/check.test.ts +129 -0
  164. package/packages/cli/tests/commands/cli.test.ts +89 -0
  165. package/packages/cli/tests/commands/diff.test.ts +115 -0
  166. package/packages/cli/tests/commands/reset.test.ts +310 -0
  167. package/packages/cli/tests/commands/self-healing.test.ts +170 -0
  168. package/packages/cli/tests/commands/setup-blocking.test.ts +71 -0
  169. package/packages/cli/tests/commands/setup-core.test.ts +135 -0
  170. package/packages/cli/tests/commands/setup-git.test.ts +139 -0
  171. package/packages/cli/tests/commands/setup-hooks.test.ts +334 -0
  172. package/packages/cli/tests/commands/setup-linting.test.ts +189 -0
  173. package/packages/cli/tests/commands/setup-noninteractive.test.ts +80 -0
  174. package/packages/cli/tests/commands/setup-templates.test.ts +181 -0
  175. package/packages/cli/tests/commands/upgrade.test.ts +215 -0
  176. package/packages/cli/tests/helpers.ts +243 -0
  177. package/packages/cli/tests/npm-package.test.ts +83 -0
  178. package/packages/cli/tests/technical-constraints.test.ts +96 -0
  179. package/packages/cli/tsconfig.json +25 -0
  180. package/packages/cli/tsup.config.ts +11 -0
  181. package/packages/cli/vitest.config.ts +23 -0
  182. package/promptfoo.yaml +3270 -0
  183. package/dist/check-3NGQ4NR5.js +0 -129
  184. package/dist/check-3NGQ4NR5.js.map +0 -1
  185. package/dist/chunk-2XWIUEQK.js +0 -190
  186. package/dist/chunk-2XWIUEQK.js.map +0 -1
  187. package/dist/chunk-GZRQL3SX.js +0 -146
  188. package/dist/chunk-GZRQL3SX.js.map +0 -1
  189. package/dist/chunk-ORQHKDT2.js +0 -10
  190. package/dist/chunk-ORQHKDT2.js.map +0 -1
  191. package/dist/chunk-W66Z3C5H.js +0 -21
  192. package/dist/chunk-W66Z3C5H.js.map +0 -1
  193. package/dist/cli.d.ts +0 -1
  194. package/dist/cli.js +0 -34
  195. package/dist/cli.js.map +0 -1
  196. package/dist/diff-Y6QTAW4O.js +0 -166
  197. package/dist/diff-Y6QTAW4O.js.map +0 -1
  198. package/dist/index.d.ts +0 -11
  199. package/dist/index.js +0 -7
  200. package/dist/index.js.map +0 -1
  201. package/dist/reset-3ACTIYYE.js +0 -143
  202. package/dist/reset-3ACTIYYE.js.map +0 -1
  203. package/dist/setup-RR4M334C.js +0 -266
  204. package/dist/setup-RR4M334C.js.map +0 -1
  205. package/dist/upgrade-6AR3DHUV.js +0 -134
  206. package/dist/upgrade-6AR3DHUV.js.map +0 -1
  207. package/{templates → framework}/SAFEWORD.md +0 -0
  208. package/{templates → framework}/guides/architecture-guide.md +0 -0
  209. package/{templates → framework}/guides/code-philosophy.md +0 -0
  210. package/{templates → framework}/guides/context-files-guide.md +0 -0
  211. package/{templates → framework}/guides/data-architecture-guide.md +0 -0
  212. package/{templates → framework}/guides/design-doc-guide.md +0 -0
  213. package/{templates → framework}/guides/learning-extraction.md +0 -0
  214. package/{templates → framework}/guides/llm-instruction-design.md +0 -0
  215. package/{templates → framework}/guides/llm-prompting.md +0 -0
  216. package/{templates → framework}/guides/tdd-best-practices.md +0 -0
  217. package/{templates → framework}/guides/test-definitions-guide.md +0 -0
  218. package/{templates → framework}/guides/testing-methodology.md +0 -0
  219. package/{templates → framework}/guides/user-story-guide.md +0 -0
  220. package/{templates → framework}/guides/zombie-process-cleanup.md +0 -0
  221. package/{templates → framework}/prompts/arch-review.md +0 -0
  222. package/{templates → framework}/prompts/quality-review.md +0 -0
  223. package/{templates/skills/safeword-quality-reviewer → framework/skills/quality-reviewer}/SKILL.md +0 -0
  224. package/{templates/doc-templates → framework/templates}/architecture-template.md +0 -0
  225. package/{templates/doc-templates → framework/templates}/design-doc-template.md +0 -0
  226. package/{templates/doc-templates → framework/templates}/test-definitions-feature.md +0 -0
  227. package/{templates/doc-templates → framework/templates}/ticket-template.md +0 -0
  228. package/{templates/doc-templates → framework/templates}/user-stories-template.md +0 -0
  229. package/{templates → packages/cli/templates}/commands/arch-review.md +0 -0
  230. package/{templates → packages/cli/templates}/commands/lint.md +0 -0
  231. package/{templates → packages/cli/templates}/commands/quality-review.md +0 -0
  232. package/{templates → packages/cli/templates}/hooks/inject-timestamp.sh +0 -0
  233. package/{templates → packages/cli/templates}/lib/common.sh +0 -0
  234. package/{templates → packages/cli/templates}/lib/jq-fallback.sh +0 -0
  235. package/{templates → packages/cli/templates}/markdownlint.jsonc +0 -0
@@ -0,0 +1,1381 @@
# Guides Review → User Story Extraction Plan

This document defines how we will extract user stories from each guide in `framework/guides/` and serves as the single place to collect them.

## Method

- Treat a “concept” as any method, pattern, workflow, anti-pattern, or decision rule that implies user behavior or developer workflow.
- For each concept, write at least one user story and acceptance criteria.
- Prefer concise, testable stories. Consolidate duplicates across guides.

### User Story Template

As a [role], I want [capability], so that [outcome].

Acceptance Criteria:

- [ ] Condition(s) that trigger behavior
- [ ] Observable result(s) and error states
- [ ] Success/Done definition

## Per-Guide Checklist

- [x] architecture-guide.md — extracted
- [x] code-philosophy.md — extracted
- [x] context-files-guide.md — extracted
- [x] data-architecture-guide.md — extracted
- [x] design-doc-guide.md — extracted
- [x] learning-extraction.md — extracted
- [x] llm-instruction-design.md — extracted
- [x] llm-prompting.md — extracted
- [x] tdd-best-practices.md — extracted (renamed from tdd-templates.md)
- [x] test-definitions-guide.md — extracted
- [x] testing-methodology.md — extracted
- [x] user-story-guide.md — extracted
- [x] zombie-process-cleanup.md — extracted

## Extracted User Stories

### architecture-guide.md

1. Single Comprehensive Architecture Doc
As a project maintainer, I want one comprehensive architecture document per project/package, so that architecture context isn’t fragmented.
Acceptance Criteria:

- [ ] `ARCHITECTURE.md` exists at project/package root
- [ ] Required sections present (header, TOC, overview, data principles, model, components, flows, decisions, best practices, migration)
- [ ] Updated in place with version and status

2. Design Docs for Features
As a feature developer, I want concise design docs referencing the architecture doc, so that feature scope and approach are clear.
Acceptance Criteria:

- [ ] Location under `planning/design/` (or `docs/design/`)
- [ ] ~2–3 pages with user flow, component interactions, test mapping
- [ ] References architecture doc for broader decisions

3. Quick Doc-Type Decision
As a developer, I want a quick matrix to decide architecture vs design doc, so that I pick the right doc type.
Acceptance Criteria:

- [ ] Tech choices/data model → architecture doc
- [ ] New feature implementation → design doc
- [ ] Trade-offs can appear in both (brief in design doc)

4. Document Why, Not Just What
As a maintainer, I want decisions to include What/Why/Trade-offs/Alternatives, so that rationale is explicit.
Acceptance Criteria:

- [ ] Every key decision includes rationale and trade-offs
- [ ] Alternatives considered, with reasons for rejection
- [ ] Links to relevant code or prior decisions

5. Code References in Docs
As a doc author, I want to reference real code paths (with line ranges when helpful), so that readers can verify implementations.
Acceptance Criteria:

- [ ] Paths use `[module]/[file].[ext]:[start]-[end]` pattern when applicable
- [ ] Examples point to current code, not stale references
- [ ] References updated when code moves

6. Versioning and Status
As a maintainer, I want current vs proposed sections with version and status, so that readers know what’s live now.
Acceptance Criteria:

- [ ] Header includes Version and Status
- [ ] Sections separate current schema vs proposed
- [ ] Proposed items marked clearly

7. TDD Workflow Integration
As a developer, I want a docs-first, tests-first workflow, so that implementation follows clear definitions.
Acceptance Criteria:

- [ ] Order: User Stories → Test Definitions → Design Doc → Check Architecture → Implement → Update Architecture if needed
- [ ] Architecture doc reviewed before coding
- [ ] Updates recorded after new patterns emerge

8. Triggers to Update Architecture Doc
As a developer, I want clear triggers for architecture updates, so that docs stay accurate.
Acceptance Criteria:

- [ ] Update when: new data model concepts, tech choices, new patterns/conventions, architectural insights
- [ ] Don’t update for: single feature implementation, bug fixes, refactors without architectural impact

9. Avoid Common Mistakes
As a doc reviewer, I want checks that prevent doc anti-patterns, so that documentation stays useful.
Acceptance Criteria:

- [ ] No ADR sprawl; keep one comprehensive architecture doc
- [ ] Decisions always include rationale and status
- [ ] Design docs avoid duplicating architecture content

10. Standard File Organization
As an architect, I want a clear directory layout, so that docs are easy to find and maintain.
Acceptance Criteria:

- [ ] `ARCHITECTURE.md` at root; feature design docs under planning/design
- [ ] User stories and test definitions kept in planning subfolders
- [ ] Multiple architecture docs only for major subsystems (clearly scoped)

11. Data Architecture Guidance
As a data architect, I want a linked data architecture guide, so that data-heavy projects document models, flows, and policies properly.
Acceptance Criteria:

- [ ] Data architecture doc created when applicable
- [ ] Principles, models (conceptual/logical/physical), flows, and policies documented
- [ ] Integrated with TDD workflow and versioning

### code-philosophy.md

1. Response JSON Summary
As a developer using the agent, I want every response to end with a standard JSON summary, so that automations can reliably parse outcomes.
Acceptance Criteria:

- [ ] Summary includes keys: `proposedChanges`, `madeChanges`, `askedQuestion`
- [ ] Boolean values reflect the actual actions taken
- [ ] Missing or malformed summaries are flagged during review
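A summary meeting these criteria could be checked with a sketch like the following; the `ResponseSummary` interface and `isValidSummary` helper are illustrative, not part of the package — only the three key names come from the criteria above.

```typescript
// Shape of the end-of-response JSON summary described above.
interface ResponseSummary {
  proposedChanges: boolean;
  madeChanges: boolean;
  askedQuestion: boolean;
}

// Guard that flags missing or malformed summaries before automation trusts them.
function isValidSummary(value: unknown): value is ResponseSummary {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return ["proposedChanges", "madeChanges", "askedQuestion"].every(
    (key) => typeof record[key] === "boolean",
  );
}

const summary: unknown = JSON.parse(
  '{"proposedChanges": true, "madeChanges": true, "askedQuestion": false}',
);
console.log(isValidSummary(summary)); // a well-formed summary passes the guard
```

A reviewer (or hook) can reject any response whose trailing JSON fails this guard.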

2. Avoid Bloat, Prefer Elegant Code
As a maintainer, I want simple, focused solutions, so that the codebase remains easy to read and change.
Acceptance Criteria:

- [ ] PRs that add features include justification of necessity
- [ ] Redundant or speculative code is removed before merge
- [ ] Readability is prioritized over cleverness

3. Self-Documenting Code
As a reviewer, I want clear naming and structure with minimal comments, so that intent is obvious without verbose annotations.
Acceptance Criteria:

- [ ] Names convey purpose without abbreviations
- [ ] Comments explain non-obvious decisions only
- [ ] Functions stay small with clear responsibilities

4. Explicit Error Handling
As a developer, I want explicit error handling, so that failures are visible and traceable.
Acceptance Criteria:

- [ ] No swallowed errors; failures are surfaced
- [ ] Error messages include action context
- [ ] Control flow avoids blanket try/catch without handling

5. Documentation Verification
As a developer, I want to verify current docs and versions before coding, so that I don’t rely on outdated APIs.
Acceptance Criteria:

- [ ] Library versions are checked before implementation
- [ ] Feature assumptions are validated against docs
- [ ] Version-specific caveats are noted in PR/commit

6. TDD Workflow
As a developer, I want tests written first (RED → GREEN → REFACTOR), so that behavior is defined and changes are safe.
Acceptance Criteria:

- [ ] A failing test exists before feature implementation
- [ ] Only minimal code is added to pass the test
- [ ] Refactors keep tests green
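The RED → GREEN → REFACTOR loop in miniature, using a hypothetical `slugify` helper; any runner works, and a plain assertion keeps the sketch self-contained.

```typescript
// RED: the assertion at the bottom is written first and fails while slugify
// does not exist yet.
// GREEN: this minimal implementation is added only to make that assertion pass.
function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/\s+/g, "-");
}

// REFACTOR: internals can now be cleaned up while this stays green.
if (slugify("  Hello World ") !== "hello-world") {
  throw new Error("RED: slugify does not produce the expected slug");
}
```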

7. Self-Testing Before Completion
As a developer, I want to run tests myself before declaring completion, so that users aren’t asked to verify my work.
Acceptance Criteria:

- [ ] Relevant tests run and pass locally
- [ ] Evidence of test run included in work notes/PR
- [ ] No request for user verification of correctness

8. Debug Logging Hygiene
As a developer, I want to log actual vs expected while debugging and remove logs after, so that code stays clean.
Acceptance Criteria:

- [ ] Debug logs show inputs and expected/actual deltas
- [ ] Temporary logs removed before merge
- [ ] Complex objects serialized meaningfully when needed
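A sketch of the expected-vs-actual logging pattern; the `parseVersion` function is illustrative.

```typescript
// Hypothetical function under investigation.
function parseVersion(raw: string): number[] {
  return raw.split(".").map(Number);
}

const input = "0.2.4";
const expected = [0, 2, 4];
const actual = parseVersion(input);
// Temporary debug log: input plus expected/actual side by side, serialized as
// JSON (%j) so complex values stay readable. Delete before merge.
console.log("parseVersion input=%s expected=%j actual=%j", input, expected, actual);
```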

9. Cross-Platform Paths
As a developer, I want path handling that supports `/` and `\`, so that the code works on macOS, Windows, and Linux.
Acceptance Criteria:

- [ ] Join/resolve functions used instead of string concat
- [ ] Tests or manual checks on at least two platforms/CI
- [ ] No hard-coded separators
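One way to satisfy these criteria in Node; the `toNativePath` helper is an illustrative sketch for relative paths, not part of the package.

```typescript
import path from "node:path";

// Join segments instead of concatenating with a hard-coded "/" or "\\";
// path.join emits the platform's separator.
const hookPath = path.join(".claude", "hooks", "auto-lint.sh");

// Accept relative input written with either separator by splitting on both
// before re-joining natively.
function toNativePath(input: string): string {
  return path.join(...input.split(/[\\/]+/));
}

console.log(hookPath, toNativePath("packages\\cli/src"));
```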

10. Best Practices Research
As a developer, I want to consult tool, domain, and UX best practices, so that implementations align with conventions.
Acceptance Criteria:

- [ ] PR/notes link to relevant best-practice references
- [ ] Design decisions mention trade-offs briefly
- [ ] Anti-patterns avoided when documented

11. Self-Review Gate
As a developer, I want a pre-merge self-review, so that obvious issues are caught early.
Acceptance Criteria:

- [ ] Checklist: correctness, elegance, best practices, docs/version, tests
- [ ] Blockers addressed or explicitly deferred with rationale
- [ ] Final pass confirms user-facing functionality

12. Question-Asking Protocol
As a developer, I want to ask questions only after due diligence, so that I respect the user’s time.
Acceptance Criteria:

- [ ] Attempts include docs/code/test exploration
- [ ] Questions focus on domain preferences or unknowns
- [ ] Summary of attempted paths provided

13. Tooling Currency
As a devops-minded contributor, I want critical CLIs updated, so that workflows remain reliable and secure.
Acceptance Criteria:

- [ ] Update cadence or check noted in docs/automation
- [ ] Breaking changes reviewed before rollout
- [ ] Version pinning strategy documented if needed

14. Git Workflow
As a developer, I want frequent, descriptive commits, so that progress can be checkpointed and reviewed easily.
Acceptance Criteria:

- [ ] Small, atomic commits with clear messages
- [ ] Commit after each meaningful change/test pass
- [ ] Avoid “misc” or ambiguous messages

### context-files-guide.md

1. Choose the Right Context File(s)
As a maintainer, I want to create the context file(s) matching our tools, so that agents load the right guidance.
Acceptance Criteria:

- [ ] Use `CLAUDE.md` for Claude-specific guidance
- [ ] Use `CURSOR.md` for Cursor-specific guidance
- [ ] Use `AGENTS.md` for tool-agnostic project context
- [ ] Subdirectory files created only when >3 unique rules or specialized context is needed

2. SAFEWORD Trigger Required
As a doc author, I want every project-level context file to start with a SAFEWORD trigger, so that global patterns are always loaded.
Acceptance Criteria:

- [ ] Top of file includes “ALWAYS READ FIRST: @./.safeword/SAFEWORD.md”
- [ ] Brief rationale explains why SAFEWORD is referenced
- [ ] Applies to `CLAUDE.md`, `CURSOR.md`, and `AGENTS.md`

3. Respect Auto-Loading Behavior
As a contributor, I want root + subdirectory context to load predictably, so that guidance is layered without duplication.
Acceptance Criteria:

- [ ] Subdirectory files assume root is loaded and use cross-references
- [ ] No duplication of root content in subdirectory files
- [ ] Reliability note: explicitly reference the file in conversations when needed

4. Modular File Structure
As a maintainer, I want a modular context structure with imports, so that files stay concise and scannable.
Acceptance Criteria:

- [ ] Root context imports `docs/architecture.md` and `docs/conventions.md` (or equivalents)
- [ ] Imports use `@docs/...` or `@.safeword/...` patterns
- [ ] Recursive imports limited to depth ≤5; code spans/blocks don’t resolve imports
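A root context file following this pattern might look like the fragment below; the file names come from the criteria above, while the section titles are illustrative.

```markdown
ALWAYS READ FIRST: @./.safeword/SAFEWORD.md

## Architecture
@docs/architecture.md

## Coding Standards
@docs/conventions.md
```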

5. Content Inclusion/Exclusion Rules
As a doc reviewer, I want clear guidelines on what belongs in context files, so that they stay high-signal.
Acceptance Criteria:

- [ ] Include why-over-what, project-specific conventions, domain requirements, examples, gotchas, cross-refs
- [ ] Exclude setup/API docs, generic advice, feature lists, test commands (move to README/tests/other docs)
- [ ] No implementation details (paths/line numbers) in root; put specifics in subdirectory files

6. Size Targets and Modularity
As a maintainer, I want size targets for context files, so that token usage stays efficient.
Acceptance Criteria:

- [ ] Root 100–200 lines; subdirectories 60–100; total <500 across project
- [ ] If >200 lines, extract to subdirectory or use imports
- [ ] Keep under ~50KB and modularize as needed

7. Cross-Reference Pattern
As a doc author, I want a standard cross-reference pattern, so that readers can jump between root and subdirectories.
Acceptance Criteria:

- [ ] Root uses “Agents (path) – See /AGENTS.md” style
- [ ] Subdirectory files reference root SAFEWORD/AGENTS for architecture
- [ ] Import examples show architecture, coding standards, and git workflow sections

8. Maintenance Rules
As a team, I want explicit maintenance rules, so that context stays current and lean.
Acceptance Criteria:

- [ ] Update on architecture changes; remove outdated sections immediately
- [ ] Consolidate overlapping content across files
- [ ] Verify hierarchical loading matches intent

9. Domain Requirements Section (Optional)
As a product/domain lead, I want domain requirements captured when needed, so that the AI respects specialized rules.
Acceptance Criteria:

- [ ] Add only for non-obvious domain knowledge
- [ ] Use clear structure (domains → principles with rationale)
- [ ] Provide concrete, testable guidance and resources

10. LLM Comprehension Checklist
As an author, I want a pre-commit checklist for LLM readability, so that instructions are reliable.
Acceptance Criteria:

- [ ] MECE decision logic; terms defined; no contradictions
- [ ] Examples for rules and explicit edge cases
- [ ] No redundancy; under 200 lines or uses imports

11. Conciseness, Effectiveness, Token Budget
As a maintainer, I want concise, effective context that respects token budgets, so that prompts remain efficient.
Acceptance Criteria:

- [ ] Short, declarative bullets; remove commentary/redundancy
- [ ] Treat as living docs; add emphasis for critical rules
- [ ] Keep files small; modularize via imports

### data-architecture-guide.md

1. Decide Where to Document
As an architect, I want a clear decision tree for data documentation, so that data changes land in the right doc.
Acceptance Criteria:

- [ ] Follow ordered decisions (project init → new store → model change → flow integration → single feature)
- [ ] Edge cases handled (schema change always in architecture doc; 3+ entities → architecture)
- [ ] Design doc used when a feature only uses an existing model

2. Define Data Principles First
As a maintainer, I want core data principles documented first, so that models and flows follow a stable foundation.
Acceptance Criteria:

- [ ] Data Quality, Governance, Accessibility, Living Documentation defined with What/Why/Document/Example
- [ ] Validation checkpoints include file paths/line refs where applicable
- [ ] Source of truth identified for each entity

3. Model at Three Levels
As a designer, I want conceptual, logical, and physical models, so that readers see the system from high-level to storage details.
Acceptance Criteria:

- [ ] Conceptual: entities with descriptions
- [ ] Logical: attributes, types, relationships, constraints
- [ ] Physical: storage tech, indexes, and rationale/trade-offs
358
+
359
+ 4. Document Data Flows
360
+ As a developer, I want sources → transformations → destinations with error handling, so that flows are predictable and testable.
361
+ Acceptance Criteria:
362
+
363
+ - [ ] Each step includes validation, business logic, persistence, UI updates
364
+ - [ ] Error handling covered for each step (not just happy path)
365
+ - [ ] External integrations called out when applicable
366
+
367
+ 5. Specify Data Policies
368
+ As a security-conscious maintainer, I want access, validation, and lifecycle policies, so that data is protected and consistent.
369
+ Acceptance Criteria:
370
+
371
+ - [ ] Read/write/delete roles and mechanisms defined
372
+ - [ ] Lifecycle rules (create/update/delete/purge) documented
373
+ - [ ] Conflict resolution strategy selected and justified
374
+
375
+ 6. TDD Integration Triggers
376
+ As a developer, I want data-specific triggers for updating architecture docs, so that documentation stays current.
377
+ Acceptance Criteria:
378
+
379
+ - [ ] Update on new entities, schema changes, storage tech changes, perf bottlenecks
380
+ - [ ] Cross-reference from `ARCHITECTURE.md` or `SAFEWORD.md`
381
+ - [ ] Version/status and migration strategy updated
382
+
383
+ 7. Avoid Common Mistakes
384
+ As a reviewer, I want checks that prevent data doc anti-patterns, so that docs remain trustworthy.
385
+ Acceptance Criteria:
386
+
387
+ - [ ] Source of truth defined; validation rules present
388
+ - [ ] Migration strategy included for breaking changes
389
+ - [ ] Performance targets concrete; implementation details kept out of architecture doc
390
+
391
+ 8. Best Practices Checklist Compliance
392
+ As a maintainer, I want a pre-merge checklist, so that data docs meet quality standards.
393
+ Acceptance Criteria:
394
+
395
+ - [ ] Four+ principles documented; entities and models complete
396
+ - [ ] Flows include error handling; validation checkpoints have line numbers
397
+ - [ ] Version/status aligned with code; cross-references exist
398
+
399
+ ### design-doc-guide.md
+
+ 1. Verify Prerequisites
+ As a developer, I want to confirm user stories and test definitions before writing a design doc, so that the design aligns with validated behavior.
+ Acceptance Criteria:
+
+ - [ ] User stories exist and are linked
+ - [ ] Test definitions exist and are linked
+ - [ ] Design doc references both; no duplication
+
+ 2. Use Standard Template
+ As a contributor, I want to use the standard design doc template, so that docs are consistent and complete.
+ Acceptance Criteria:
+
+ - [ ] Structure follows `templates/design-doc-template.md`
+ - [ ] All required sections are present or marked "(if applicable)"
+ - [ ] Saved under `planning/design/[feature]-design.md`
+
+ 3. Architecture Section
+ As a designer, I want a concise architecture section, so that the high-level approach is clear.
+ Acceptance Criteria:
+
+ - [ ] 1–2 paragraphs describing overall approach and fit
+ - [ ] Optional diagram if helpful
+ - [ ] References architecture doc where relevant
+
+ 4. Components with [N]/[N+1] Pattern
+ As a developer, I want concrete component examples with interfaces and tests, so that patterns are repeatable.
+ Acceptance Criteria:
+
+ - [ ] Define Component [N] (name, responsibility, interface, dependencies, tests)
+ - [ ] Define Component [N+1] (different example showing variation)
+ - [ ] Add additional components as needed
+
+ 5. Data Model (If Applicable)
+ As a developer, I want the design doc to describe the data model when relevant, so that types and flows are explicit.
+ Acceptance Criteria:
+
+ - [ ] State shape or schema outlined
+ - [ ] Relationships and flows between types shown
+ - [ ] Interaction with components clear
+
+ 6. Component Interaction (If Applicable)
+ As a developer, I want to document component communication, so that integration is predictable.
+ Acceptance Criteria:
+
+ - [ ] Events/method calls documented
+ - [ ] Data flow between components shown (N → N+1)
+ - [ ] Edge cases in interactions noted
+
+ 7. User Flow
+ As a product-focused developer, I want a step-by-step user flow, so that UX is concrete and testable.
+ Acceptance Criteria:
+
+ - [ ] Concrete steps (e.g., keyboard shortcuts, buttons)
+ - [ ] Aligns with user stories
+ - [ ] Maps to test definitions
+
+ 8. Key Decisions with Trade-offs
+ As a maintainer, I want key decisions documented with rationale and trade-offs, so that choices are explicit.
+ Acceptance Criteria:
+
+ - [ ] Each decision includes what, why (with specifics), and trade-off
+ - [ ] Multiple decisions show [N]/[N+1] pattern
+ - [ ] Links to benchmarks/analysis where relevant
+
+ 9. Implementation Notes (If Applicable)
+ As an engineer, I want constraints, error handling, and gotchas documented, so that implementation risks are known.
+ Acceptance Criteria:
+
+ - [ ] Constraints, error handling approach, and gotchas listed
+ - [ ] Open questions enumerated
+ - [ ] References to ADRs/POCs included if applicable
+
+ 10. Quality Checklist
+ As a reviewer, I want a design doc quality checklist, so that docs are concise and LLM-optimized.
+ Acceptance Criteria:
+
+ - [ ] References user stories and tests, does not duplicate them
+ - [ ] Has Component [N] and [N+1] examples
+ - [ ] ~121 lines target, clear and concise
+
+ ### learning-extraction.md
+
+ 1. Trigger-Based Extraction
+ As a developer, I want clear triggers to extract learnings, so that reusable knowledge is captured when it matters.
+ Acceptance Criteria:
+
+ - [ ] Observable complexity, trial-and-error, undocumented gotcha, integration struggle, testing trap, or architectural insight triggers extraction
+ - [ ] Forward-looking filter applied (“will this save future time?”)
+ - [ ] Extraction deferred until fix confirmed (no mid-debug extraction)
+
+ 2. Check Existing Learnings First
+ As a contributor, I want to check for existing learnings before creating new ones, so that we prevent duplication.
+ Acceptance Criteria:
+
+ - [ ] Proactive search by keyword in project and global learnings
+ - [ ] Update existing learning when similar concept exists
+ - [ ] Reference existing vs new with explicit difference when partially overlapping
+
+ 3. Place Learnings in Correct Location
+ As a maintainer, I want consistent locations for learnings, so that the knowledge base stays organized.
+ Acceptance Criteria:
+
+ - [ ] Global patterns → `~/.safeword/learnings/`
+ - [ ] Project-specific patterns → `./.safeword/learnings/`
+ - [ ] One-off narratives → `./.safeword/learnings/archive/`
+
+ 4. Respect Instruction Precedence
+ As an agent, I want to follow cascading precedence, so that project-specific guidance overrides global defaults.
+ Acceptance Criteria:
+
+ - [ ] Order applied: explicit user > project learnings > global learnings > SAFEWORD.md
+ - [ ] Conflicts resolved in favor of higher precedence
+
+ 5. Use Templates
+ As a doc author, I want standard templates for learnings and narratives, so that documents are consistent and actionable.
+ Acceptance Criteria:
+
+ - [ ] Forward-looking learning includes Principle, Gotcha, Good/Bad, Why, Examples, Testing Trap, Reference
+ - [ ] Narrative includes Problem, Investigation, Solution (diff), Lesson
+ - [ ] Examples are concrete and up to date
+
+ 6. SAFEWORD.md Cross-Reference
+ As a maintainer, I want to cross-reference new learnings in SAFEWORD.md, so that discoverability stays high.
+ Acceptance Criteria:
+
+ - [ ] Add short entry under Common Gotchas or Architecture as appropriate
+ - [ ] Include bold name + one-liner + link
+ - [ ] Update references when files move or split
+
+ 7. Suggest Extraction at the Right Time
+ As an assistant, I want to suggest learnings at appropriate confidence levels, so that we don’t create noise.
+ Acceptance Criteria:
+
+ - [ ] High confidence: suggest during debugging for complex/stuck patterns
+ - [ ] Medium confidence: ask after completion
+ - [ ] Low confidence: don’t suggest for trivial or well-documented cases
+
+ 8. Review and Maintenance Cycle
+ As a maintainer, I want periodic review of learnings, so that guidance stays relevant.
+ Acceptance Criteria:
+
+ - [ ] Monthly relevance review; quarterly archive obsolete items
+ - [ ] Split learnings >200 lines or covering multiple concepts
+ - [ ] Consolidate overlapping learnings
+
+ 9. Feedback Loop
+ As a team, I want to tune suggestion thresholds, so that learnings reflect real value.
+ Acceptance Criteria:
+
+ - [ ] Track acceptance rate of suggestions
+ - [ ] If acceptance <30%, raise suggestion threshold
+ - [ ] Monitor references of learnings in future work
+
+ 10. Workflow Integration
+ As a developer, I want a clear extraction workflow during and after development, so that documentation fits naturally into delivery.
+ Acceptance Criteria:
+
+ - [ ] During dev: recognize trigger → assess scope → choose location → extract → cross-ref
+ - [ ] After feature: review → extract if threshold met → update cross-refs → commit with learning
+ - [ ] Commit messages include learning references when applicable
+
+ 11. Anti-Patterns to Avoid
+ As a reviewer, I want to block low-value extractions, so that the knowledge base stays high-signal.
+ Acceptance Criteria:
+
+ - [ ] No trivial entries, one-liners, opinions without rationale, or steps without a lesson
+ - [ ] Don’t duplicate official docs
+ - [ ] Remove stale learning references after archiving/replacement
+
+ 12. Directory & Size Standards
+ As a doc author, I want directory and size guidelines, so that files are easy to navigate and maintain.
+ Acceptance Criteria:
+
+ - [ ] Follow standard directory structure for global/project/archive
+ - [ ] Forward-looking: 50–150 lines; Narratives: 30–100 lines
+ - [ ] Split oversized files into focused concepts
+
+ ### llm-instruction-design.md
+
+ 1. MECE Decision Trees
+ As a documentation author, I want decision trees that are mutually exclusive and collectively exhaustive, so that LLMs follow unambiguous paths.
+ Acceptance Criteria:
+
+ - [ ] Branches do not overlap; first-match stops evaluation
+ - [ ] All relevant cases covered; none fall through
+ - [ ] Example tree included
+
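To make "Example tree included" concrete, here is a minimal sketch of a MECE, first-match-wins tree; the branch topics echo the data-architecture decisions earlier in this document, and the exact wording is illustrative:

```
Q1: Does the change affect 3+ entities?    → document in the architecture doc (stop)
Q2: Does it change a schema?               → document in the architecture doc (stop)
Q3: Does only one feature use the model?   → document in the feature design doc (stop)
Otherwise                                  → ask; do not guess a location
```
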
+ 2. Explicit Definitions
+ As a documentation author, I want all terms defined explicitly, so that LLMs don’t assume meanings.
+ Acceptance Criteria:
+
+ - [ ] Ambiguous terms replaced with precise definitions
+ - [ ] Examples clarify common misunderstandings
+ - [ ] “Lowest level” and similar phrases rewritten to be actionable
+
+ 3. No Contradictions
+ As a maintainer, I want consistent guidance across sections, so that LLMs don’t receive conflicting rules.
+ Acceptance Criteria:
+
+ - [ ] Cross-check updates for conflicting statements
+ - [ ] Overlapping guidance reconciled into single rule
+ - [ ] Example of corrected contradiction included
+
+ 4. Concrete Examples (Good vs Bad)
+ As a documentation author, I want 2–3 concrete examples per rule, so that LLMs learn patterns.
+ Acceptance Criteria:
+
+ - [ ] Each rule includes at least one BAD and one GOOD example
+ - [ ] Examples are brief and domain-relevant
+ - [ ] Examples updated when guidance changes
+
+ 5. Edge Cases Explicit
+ As a writer, I want edge cases listed under each rule, so that LLMs handle tricky scenarios.
+ Acceptance Criteria:
+
+ - [ ] Edge cases listed and addressed
+ - [ ] Non-deterministic and environment-bound cases covered
+ - [ ] Clear routing of mixed concerns
+
+ 6. Actionable, Not Vague
+ As a reader, I want actionable rules with optimization guidance, so that outcomes are consistent.
+ Acceptance Criteria:
+
+ - [ ] Subjective terms replaced by concrete rules and red flags
+ - [ ] Rules describe what to do, not opinions
+ - [ ] Checklist item ensures actionability
+
+ 7. Sequential Decision Trees
+ As a maintainer, I want ordered questions, so that LLMs stop at the first match.
+ Acceptance Criteria:
+
+ - [ ] Questions presented in strict order
+ - [ ] “Stop at first match” called out
+ - [ ] Parallel structures removed
+
+ 8. Tie-Breaking Rules
+ As a user, I want tie-breakers documented, so that ambiguous choices resolve deterministically.
+ Acceptance Criteria:
+
+ - [ ] Global tie-breaker declared (e.g., choose fastest test)
+ - [ ] References embedded in each decision tree
+ - [ ] Conflicts refer to the same tie-breaker
+
+ 9. Lookup Tables for Complex Logic
+ As an author, I want simple tables for 3+ branch decisions, so that LLMs can map inputs to outputs cleanly.
+ Acceptance Criteria:
+
+ - [ ] Table includes clear cases without caveats in cells
+ - [ ] Complex logic extracted into a table
+ - [ ] Example table provided
+
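A sketch of what such a lookup table might look like — the rows are illustrative, drawn from the test-layer mapping later in this document:

```
| Scenario                        | Test type   |
| ------------------------------- | ----------- |
| Pure function, no I/O           | Unit        |
| Agent + real LLM call           | Integration |
| Full user journey in a browser  | E2E         |
```
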
+ 10. No Caveats in Tables
+ As an author, I want caveats expressed as separate rows, so that tables remain pattern-friendly.
+ Acceptance Criteria:
+
+ - [ ] Parentheticals removed from cells
+ - [ ] Caveats represented by additional rows
+ - [ ] Table remains simple to parse
+
+ 11. Percentages with Context
+ As an author, I want percentage guidance accompanied by adjustments, so that LLMs adapt sensibly.
+ Acceptance Criteria:
+
+ - [ ] Baseline + context-specific adjustments or principles-only alternative
+ - [ ] Avoid standalone percentages without rules
+ - [ ] Examples show adjustments
+
+ 12. Specific Questions
+ As a writer, I want precise questions, so that LLMs choose correct tools.
+ Acceptance Criteria:
+
+ - [ ] Use tool-specific wording (e.g., real browser vs jsdom)
+ - [ ] Clarify commonly confused tools
+ - [ ] Examples included
+
+ 13. Re-evaluation Paths
+ As a user, I want next steps when rules don’t fit, so that I can decompose the problem.
+ Acceptance Criteria:
+
+ - [ ] 3-step fallback path documented with example
+ - [ ] Encourages decomposition (pure vs I/O vs UI)
+ - [ ] Concrete example (e.g., login validation)
+
+ 14. Anti-Patterns Guard
+ As a reviewer, I want to block common anti-patterns, so that docs stay reliable.
+ Acceptance Criteria:
+
+ - [ ] No visual metaphors, undefined jargon, outdated references
+ - [ ] Single decision framework per topic (no competition)
+ - [ ] Update all mentions when concepts are removed
+
+ 15. Quality Checklist Compliance
+ As a maintainer, I want a pre-commit checklist for LLM docs, so that guidance is consistent.
+ Acceptance Criteria:
+
+ - [ ] MECE, explicit definitions, no contradictions
+ - [ ] Examples, edge cases, tie-breakers, lookup tables as needed
+ - [ ] Re-evaluation path present
+
+ ### llm-prompting.md
+
+ 1. Concrete Examples in Prompts
+ As a prompt author, I want GOOD vs BAD code examples, so that guidance is concrete and learnable.
+ Acceptance Criteria:
+
+ - [ ] Each rule has at least one BAD and one GOOD example
+ - [ ] Examples are short and domain-relevant
+ - [ ] Examples updated as patterns evolve
+
+ 2. Structured Outputs via JSON
+ As an engineer, I want LLM responses to follow JSON schemas, so that outputs are predictable and easily validated.
+ Acceptance Criteria:
+
+ - [ ] Prompts request JSON output with explicit schema
+ - [ ] Responses validated against schema
+ - [ ] Free-form prose avoided for machine consumption
+
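A minimal sketch of the "request JSON, then validate" loop; the `{ title, priority }` shape and the `validateTask` helper are hypothetical examples, not a schema defined by safeword:

```javascript
// Minimal hand-rolled schema check for an LLM response. In practice a
// library like Zod or Ajv would replace this, but the idea is the same:
// parse first (free-form prose fails fast), then check each field.
function validateTask(raw) {
  const data = JSON.parse(raw); // throws if the model replied with prose
  if (typeof data.title !== "string" || data.title.length === 0) {
    throw new Error("title: expected non-empty string");
  }
  if (!["low", "medium", "high"].includes(data.priority)) {
    throw new Error("priority: expected low|medium|high");
  }
  return data;
}

// A response that follows the schema parses cleanly:
const ok = validateTask('{"title": "Fix login bug", "priority": "high"}');
console.log(ok.priority); // "high"
```
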
+ 3. Prompt Caching for Cost Reduction
+ As an agent developer, I want static rules cached with cache_control: ephemeral, so that repeated calls are cheaper.
+ Acceptance Criteria:
+
+ - [ ] Static content in system prompt marked cacheable (ephemeral)
+ - [ ] Dynamic state placed in user messages (non-cached)
+ - [ ] Cache hit behavior verified; changes to cached blocks minimized
+
+ 4. Message Architecture (Static vs Dynamic)
+ As an implementer, I want clean separation of static rules and dynamic inputs, so that caching and clarity improve.
+ Acceptance Criteria:
+
+ - [ ] No dynamic state interpolated into cached system prompts
+ - [ ] User messages contain dynamic state and inputs
+ - [ ] Example snippet provided showing separation
+
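One way the separation might look in an Anthropic-style request body; the rule text and task strings are placeholders, and the `cache_control` field shape should be verified against the provider's current API docs before relying on it:

```javascript
// Static rules go in the system blocks (cacheable); dynamic state goes in
// the user message, which is never cached. Any edit to the cached block
// invalidates the cache, so it should hold only stable content.
function buildRequest(staticRules, dynamicState, userInput) {
  return {
    system: [
      {
        type: "text",
        text: staticRules, // identical across calls → eligible for cache hits
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [
      // Dynamic, per-call content stays out of the cached block.
      { role: "user", content: `State:\n${dynamicState}\n\nTask:\n${userInput}` },
    ],
  };
}

const req = buildRequest("Always answer in JSON.", "step=3", "Summarize the diff.");
```
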
+ 5. Cache Invalidation Discipline
+ As a maintainer, I want to change cached blocks sparingly, so that we avoid widespread cache invalidation.
+ Acceptance Criteria:
+
+ - [ ] Acknowledge “any change breaks all caches”
+ - [ ] Batch edits to cached sections
+ - [ ] Document rebuild cost when caches invalidate
+
+ 6. LLM-as-Judge Evaluations
+ As a tester, I want rubric-driven LLM evaluations, so that nuanced qualities can be tested reliably.
+ Acceptance Criteria:
+
+ - [ ] Rubrics define EXCELLENT/ACCEPTABLE/POOR with criteria
+ - [ ] Avoid brittle keyword checks for creative outputs
+ - [ ] Evaluations integrated into test suite where applicable
+
+ 7. Evaluation Framework Mapping
+ As a test planner, I want clear guidance on Unit, Integration, and LLM Evals, so that we test at the right layer.
+ Acceptance Criteria:
+
+ - [ ] Unit: pure functions; Integration: agent + LLM calls; LLM Evals: judgment quality
+ - [ ] Real browser required only for E2E scenarios
+ - [ ] Examples for mapping common cases
+
+ 8. Cost Awareness for Evals
+ As a maintainer, I want evals sized and cached thoughtfully, so that costs stay predictable.
+ Acceptance Criteria:
+
+ - [ ] Note typical scenario counts and approximate costs with caching
+ - [ ] Use caching for static rubrics and examples
+ - [ ] Document budget expectations in CI
+
+ 9. “Why” Over “What” in Prompts
+ As a prompt author, I want rationales with numbers, so that trade-offs are explicit.
+ Acceptance Criteria:
+
+ - [ ] Prompts include brief rationale for critical rules
+ - [ ] Where possible, include metrics or concrete targets
+ - [ ] Trade-offs and gotchas stated explicitly
+
+ 10. Precise Technical Terms
+ As a writer, I want specific terms (e.g., real browser vs jsdom), so that tool selection is correct.
+ Acceptance Criteria:
+
+ - [ ] Prompts ask "Does this require a real browser (Playwright/Cypress)?"
+ - [ ] Clarify React Testing Library is not a real browser
+ - [ ] Common confusions documented
+
+ ### tdd-best-practices.md (formerly tdd-templates.md)
+
+ 1. Select Correct Template
+ As a planner, I want to select the right template (user stories, test definitions, design doc, architecture), so that documentation aligns with TDD workflow.
+ Acceptance Criteria:
+
+ - [ ] Feature/issues → user stories template
+ - [ ] Feature test suites → test definitions template
+ - [ ] Feature implementation → design doc template
+ - [ ] Project-wide decisions → architecture doc (no template)
+ - [ ] Example prompts guide correct selection
+
+ 2. Choose Story Format
+ As a writer, I want to choose the appropriate story format (standard, Given-When-Then, job story), so that intent is clear.
+ Acceptance Criteria:
+
+ - [ ] Standard format (As a / I want / So that) used by default
+ - [ ] Given-When-Then for behavior-focused stories
+ - [ ] Job story for outcome-focused cases
+ - [ ] Each format includes example
+
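The three formats side by side; the story content is illustrative, not taken from the templates:

```
Standard:        As a reviewer, I want inline diff comments, so that feedback stays in context.
Given-When-Then: Given an open PR, when I select a line, then a comment box appears on that line.
Job story:       When reviewing a large PR, I want to filter by file, so I can focus on one area at a time.
```
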
+ 3. Write Quality Acceptance Criteria
+ As a product owner, I want specific, testable AC with out-of-scope sections, so that delivery is measurable.
+ Acceptance Criteria:
+
+ - [ ] AC specify measurable behavior (e.g., "<200ms", "within 5 clicks")
+ - [ ] Out-of-scope listed to prevent scope creep
+ - [ ] Good examples demonstrate testable conditions
+ - [ ] Bad examples show what to avoid
+
+ 4. Block Story Anti-Patterns
+ As a reviewer, I want to reject vague, technical-task, or bundled stories, so that stories remain valuable.
+ Acceptance Criteria:
+
+ - [ ] Vague value or missing AC is blocked (example provided)
+ - [ ] Technical-only stories redirected to task/spike
+ - [ ] Bundled features (3+) split into multiple stories
+ - [ ] Missing "So that" flagged
+
+ 5. Use Unit Test Template
+ As a developer, I want unit tests following AAA pattern, so that behavior and edge cases are covered.
+ Acceptance Criteria:
+
+ - [ ] Template includes Arrange/Act/Assert structure
+ - [ ] Happy path, error path, and edge cases present
+ - [ ] describe/it nesting shown with naming examples
+
+ 6. Use Integration Test Template
+ As a developer, I want integration tests with setup/teardown and realistic flows, so that components work together.
+ Acceptance Criteria:
+
+ - [ ] beforeEach/afterEach for fixtures
+ - [ ] Full workflow with assertions on outcomes
+ - [ ] Failure cases test rollback/consistency
+
+ 7. Use E2E Test Template
+ As a QA engineer, I want E2E tests using real browser patterns, so that user journeys are validated.
+ Acceptance Criteria:
+
+ - [ ] Playwright/Cypress patterns demonstrated
+ - [ ] UI interactions and state assertions included
+ - [ ] Multi-step flows shown
+
+ 8. Apply Test Naming Conventions
+ As a reviewer, I want descriptive test names, so that intent is obvious without reading code.
+ Acceptance Criteria:
+
+ - [ ] Good examples: behavior + condition ("should return X when Y")
+ - [ ] Bad examples: vague or implementation-focused
+ - [ ] Rename guidance provided
+
+ 9. Enforce Test Independence
+ As a developer, I want tests isolated with fresh state, so that order doesn't matter.
+ Acceptance Criteria:
+
+ - [ ] Good example: beforeEach creates fresh state
+ - [ ] Bad example: shared mutable state
+ - [ ] No test depends on another test's side effects
+
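A minimal sketch of the good pattern: a setup helper plays the `beforeEach` role and returns fresh state per test, so execution order cannot matter (the cart example is hypothetical):

```javascript
// GOOD: each test builds its own state via a helper — nothing is shared.
function makeCart() {
  return { items: [], total: 0 };
}

function addItem(cart, name, price) {
  cart.items.push({ name, price });
  cart.total += price;
  return cart;
}

function testAddsFirstItem() {
  const cart = makeCart(); // fresh state (the beforeEach role)
  addItem(cart, "book", 12);
  if (cart.total !== 12) throw new Error("total not updated");
}

function testStartsEmpty() {
  const cart = makeCart(); // unaffected by any other test
  if (cart.items.length !== 0) throw new Error("state leaked between tests");
}

// BAD (for contrast): `const shared = makeCart()` at module scope, mutated
// by every test — pass/fail would then depend on execution order.

testAddsFirstItem();
testStartsEmpty();
```
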
+ 10. Know What to Test
+ As a developer, I want clear guidance on what to test vs skip, so that effort is focused.
+ Acceptance Criteria:
+
+ - [ ] Test: public API, user features, edge cases, error handling, integrations
+ - [ ] Don't test: private internals, third-party internals, trivial code
+ - [ ] Examples clarify boundaries
+
+ 11. Use Test Data Builders
+ As a developer, I want reusable builders for complex test data, so that tests are concise.
+ Acceptance Criteria:
+
+ - [ ] Builder function with override support shown
+ - [ ] Example demonstrates customization per test
+ - [ ] Reduces duplication across tests
+
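A builder with override support might look like this; the user shape is illustrative:

```javascript
// Test data builder: sensible defaults, per-test overrides via spread.
function buildUser(overrides = {}) {
  return {
    id: "user-1",
    name: "Test User",
    role: "member",
    active: true,
    ...overrides,
  };
}

// Only the field under test is spelled out; everything else stays default.
const admin = buildUser({ role: "admin" });
const inactive = buildUser({ active: false });
```
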
+ 12. Apply LLM Testing Patterns
+ As a tester, I want rubric-based LLM evals, so that AI quality is testable.
+ Acceptance Criteria:
+
+ - [ ] Promptfoo template with llm-rubric shown
+ - [ ] EXCELLENT/ACCEPTABLE/POOR grading criteria defined
+ - [ ] Integration test with real LLM call demonstrated
+
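A sketch of a Promptfoo-style assertion using the `llm-rubric` grader; the field names follow Promptfoo's published config shape, but verify against its current docs (and the sample content is invented):

```yaml
tests:
  - vars:
      summary: "The null check in parse() was missing; add a guard before dereferencing."
    assert:
      - type: llm-rubric
        value: |
          EXCELLENT: names the root cause and a concrete fix
          ACCEPTABLE: identifies the problem area only
          POOR: generic praise or restates the input
```
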
+ 13. INVEST Gate for Stories
+ As a product owner, I want every story to pass INVEST, so that it's deliverable and testable.
+ Acceptance Criteria:
+
+ - [ ] Independent, Negotiable, Valuable, Estimable, Small, Testable
+ - [ ] Failing any criterion triggers refinement or split
+ - [ ] Checklist provided for validation
+
+ 14. Use Red Flags Quick Reference
+ As a reviewer, I want quick red flag checklists, so that common mistakes are caught fast.
+ Acceptance Criteria:
+
+ - [ ] User story red flags listed (no AC, >3 AC, technical details, no value)
+ - [ ] Test red flags listed (vague name, shared state, >50 lines, tests implementation)
+ - [ ] E2E vs integration vs unit decision guidance included
+
+ ### test-definitions-guide.md
+
+ 1. Use the Standard Test Definitions Template
+ As a tester, I want to use the provided test definitions template, so that feature tests are consistent and complete.
+ Acceptance Criteria:
+
+ - [ ] Read and fill `templates/test-definitions-feature.md`
+ - [ ] Include feature name, issue number, and test file path
+ - [ ] Save under `planning/test-definitions/[id]-[feature]-test-definitions.md`
+
+ 2. Organize Tests into Suites
+ As a maintainer, I want tests grouped logically, so that coverage is easy to navigate.
+ Acceptance Criteria:
+
+ - [ ] Suites reflect layout/structure, interactions, state, accessibility, edge cases
+ - [ ] Suite names/descriptions explain scope
+ - [ ] Tests numbered (e.g., 1.1, 1.2)
+
+ 3. Track Test Status
+ As a contributor, I want consistent status indicators, so that progress is visible.
+ Acceptance Criteria:
+
+ - [ ] Use ✅ Passing, ⏭️ Skipped (with rationale), ❌ Not Implemented, 🔴 Failing
+ - [ ] Status listed per test
+ - [ ] "Last Updated" maintained
+
+ 4. Write Clear Steps
+ As a tester, I want actionable, numbered steps, so that tests are reproducible.
+ Acceptance Criteria:
+
+ - [ ] Steps are concrete and minimal
+ - [ ] Avoid vague phrasing like "check it works"
+ - [ ] Example step blocks used as reference
+
+ 5. Define Specific Expected Outcomes
+ As a tester, I want specific assertions, so that pass/fail is unambiguous.
+ Acceptance Criteria:
+
+ - [ ] Expected outcomes are measurable
+ - [ ] Avoid "everything works"
+ - [ ] Example assertion blocks used as reference
+
+ 6. Coverage Summary
+ As a maintainer, I want a coverage breakdown, so that gaps are obvious.
+ Acceptance Criteria:
+
+ - [ ] Totals and percentages per status
+ - [ ] Coverage by feature table (if applicable)
+ - [ ] Rationale for skipped tests
+
+ 7. Test Naming
+ As a reviewer, I want descriptive test names, so that intent is clear.
+ Acceptance Criteria:
+
+ - [ ] Names describe observable behavior
+ - [ ] Avoid "Test 1" or implementation details
+ - [ ] Unique names across suite
+
+ 8. Test Execution Commands
+ As a developer, I want practical commands documented, so that I can run tests quickly.
+ Acceptance Criteria:
+
+ - [ ] Include commands to run all tests for the feature
+ - [ ] Include grep/example to run a specific test
+ - [ ] Commands match project tooling
+
+ 9. TDD Workflow Integration
+ As a developer, I want tests defined before implementation and updated during delivery, so that TDD is enforced.
+ Acceptance Criteria:
+
+ - [ ] Created before implementation and alongside user stories
+ - [ ] Status updated as tests pass/skip/fail
+ - [ ] "Last Updated" timestamp maintained
+
+ 10. Map to User Stories
+ As a PM/dev, I want tests to map directly to user story AC, so that verification is complete.
+ Acceptance Criteria:
+
+ - [ ] Each user story AC has at least one test
+ - [ ] Edge cases and errors included beyond AC
+ - [ ] References to test file locations included
+
+ 11. Avoid Common Mistakes
+ As a reviewer, I want to block anti-patterns, so that definitions are high quality.
+ Acceptance Criteria:
+
+ - [ ] No implementation details tested
+ - [ ] No vague steps or missing coverage summaries
+ - [ ] No duplicate test descriptions
+
+ 12. Apply LLM Instruction Design
+ As an author, I want LLM-optimized definitions, so that agents follow them reliably.
+ Acceptance Criteria:
+
+ - [ ] Use MECE decision trees; define terms explicitly
+ - [ ] Include concrete examples and edge cases
+ - [ ] Use actionable language throughout
+
+ 1. Choose the Right Template
1000
+ As a planner, I want to select the correct template (user stories, test definitions, design doc, architecture doc), so that documentation aligns with TDD workflow.
1001
+ Acceptance Criteria:
1002
+
1003
+ - [ ] Feature/issues → user stories template
1004
+ - [ ] Feature test suites → test definitions template
1005
+ - [ ] Feature/system implementation → design doc template
1006
+ - [ ] Project/package-wide decisions → architecture document
1007
+
1008
+ 2. Story Format Selection
1009
+ As a writer, I want to choose the appropriate story format (standard, Given-When-Then, job story), so that intent is clear.
1010
+ Acceptance Criteria:
1011
+
1012
+ - [ ] Standard format used by default
1013
+ - [ ] Given-When-Then for behavior-focused stories
1014
+ - [ ] Job story for outcome-focused cases
1015
+
1016
+ 3. Story Acceptance Criteria and Scope
1017
+ As a product owner, I want clear AC and explicit out-of-scope sections, so that delivery is measurable.
1018
+ Acceptance Criteria:
1019
+
1020
+ - [ ] 2–5 specific, testable AC
1021
+ - [ ] Out-of-scope listed to prevent creep
1022
+ - [ ] Links to tests/design doc when available
1023
+
1024
+ 4. Block Story Anti-Patterns
1025
+ As a reviewer, I want to reject vague, technical-task, or bundled stories, so that stories remain valuable and small.
1026
+ Acceptance Criteria:
1027
+
1028
+ - [ ] Vague value or missing AC is blocked
1029
+ - [ ] Implementation-only “story” redirected to a task/spike
1030
+ - [ ] Bundled features split into multiple stories
1031
+
1032
+ 5. Create Test Definitions per Feature
1033
+ As a tester, I want structured test definitions with status and coverage, so that verification is transparent.
1034
+ Acceptance Criteria:
1035
+
1036
+ - [ ] Suites and individual tests organized
1037
+ - [ ] Status per test: Passing/Skipped/Not Implemented/Failing
1038
+ - [ ] Coverage summary and execution commands included
1039
+
1040
+ 6. Unit Test Template Usage
1041
+ As a developer, I want unit tests to follow the provided template, so that behavior and edge cases are covered.
1042
+ Acceptance Criteria:
1043
+
1044
+ - [ ] AAA structure used
1045
+ - [ ] Happy path, error path, and edge cases present
1046
+ - [ ] Deterministic assertions, no flaky timers/randomness
1047
+
1048
+ 7. Integration Test Template Usage
1049
+ As a developer, I want integration tests with setup/teardown and realistic flows, so that components work together.
1050
+ Acceptance Criteria:
1051
+
1052
+ - [ ] Fixtures for DB/APIs handled in beforeEach/afterEach
1053
+ - [ ] Full workflow exercised with assertions on outcomes
1054
+ - [ ] Failure cases test rollback/consistency
1055
+
+ 8. E2E Test Template Usage
+ As a QA engineer, I want E2E tests using a real browser, so that user journeys are validated.
+ Acceptance Criteria:
+
+ - [ ] UI interactions scripted; URL/state assertions included
+ - [ ] Page structure assertions reflect UX
+ - [ ] Keep flows focused; long flows split
+
+ 9. Test Naming Conventions
+ As a reviewer, I want descriptive test names, so that intent is obvious.
+ Acceptance Criteria:
+
+ - [ ] Names describe observable behavior and conditions
+ - [ ] No vague or implementation-focused names
+ - [ ] Rename flagged vague tests
+
+ 10. Test Independence
+ As a developer, I want tests isolated with fresh state, so that order does not matter.
+ Acceptance Criteria:
+
+ - [ ] No shared mutable state across tests
+ - [ ] Fresh fixtures per test via setup helpers
+ - [ ] Flaky order-dependent tests prohibited
+
+ 11. What to Test vs Not Test
+ As a reviewer, I want guidance applied to focus on behavior, so that tests are high ROI.
+ Acceptance Criteria:
+
+ - [ ] Public API and user features tested
+ - [ ] Avoid private internals and third-party internals
+ - [ ] Trivial code not tested unless business logic
+
+ 12. Test Data Builders
+ As a developer, I want reusable builders for complex data, so that tests are concise and maintainable.
+ Acceptance Criteria:
+
+ - [ ] Builder functions with override support exist
+ - [ ] Examples demonstrate customization
+ - [ ] Builders shared across suites where sensible
+
+ 13. LLM-as-Judge Rubrics
+ As a tester, I want rubric-based evals for narrative/reasoning, so that subjective qualities are testable.
+ Acceptance Criteria:
+
+ - [ ] Rubrics specify EXCELLENT/ACCEPTABLE/POOR criteria
+ - [ ] Promptfoo (or equivalent) templates used when applicable
+ - [ ] Avoid brittle keyword matching
+
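One way such a rubric might look as a promptfoo config; the provider, prompt, and rubric wording below are illustrative, not taken from this package:

```yaml
prompts:
  - "Summarize the incident report: {{report}}"
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      report: "Service degraded for 12 minutes due to a bad deploy."
    assert:
      - type: llm-rubric
        value: |
          EXCELLENT: states cause, duration, and impact accurately.
          ACCEPTABLE: states cause and impact with minor omissions.
          POOR: vague, keyword-stuffed, or misstates the incident.
```

Grading against a rubric like this avoids the brittle keyword matching the criteria warn about.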
+ 14. Integration with Real LLM
+ As an integration tester, I want real LLM calls for structured outputs, so that agent behavior is validated.
+ Acceptance Criteria:
+
+ - [ ] Tests assert structured fields (not prose)
+ - [ ] Costs acknowledged and minimized (e.g., ~$0.01 per test)
+ - [ ] Sensitive secrets isolated in CI
+
+ 15. INVEST Gate for Stories
+ As a product owner, I want every story to pass INVEST, so that it’s deliverable and testable.
+ Acceptance Criteria:
+
+ - [ ] Independent, Negotiable, Valuable, Estimable, Small, Testable
+ - [ ] Failing any criterion triggers refinement
+ - [ ] Split large stories before acceptance
+
+ 16. Red Flags and Ratios
+ As a maintainer, I want red flags enforced and the test mix monitored, so that the suite stays efficient.
+ Acceptance Criteria:
+
+ - [ ] Red flags checklist applied (long tests, dependencies, vague names)
+ - [ ] Ratio baseline 70/20/10 (unit/integration/E2E) with documented adjustments per project
+ - [ ] Prefer the “as many fast tests as possible” principle, with exceptions noted
+
+ ### testing-methodology.md
+
+ 1. Fastest-Effective Test Rule
+ As a developer, I want to choose the fastest test type that can catch a bug, so that feedback loops stay quick and cheap.
+ Acceptance Criteria:
+
+ - [ ] Apply speed hierarchy: Unit → Integration → LLM Eval → E2E
+ - [ ] If multiple apply, choose the faster one (tie-breaker)
+ - [ ] Anti-patterns avoided (e.g., business logic via E2E)
+
+ 2. Component vs Flow Testing
+ As a tester, I want component behavior verified with integration tests and multi-page flows with E2E tests, so that each level is validated appropriately.
+ Acceptance Criteria:
+
+ - [ ] UI components verified with integration tests (no real browser)
+ - [ ] Multi-page flows use E2E with a real browser
+ - [ ] Examples guide common cases
+
+ 3. Target Distribution Guidance
+ As a maintainer, I want guidance favoring fast tests, so that the suite remains efficient.
+ Acceptance Criteria:
+
+ - [ ] Emphasize unit+integration; E2E only for critical paths
+ - [ ] Add LLM evals only for AI features needing judgment
+ - [ ] Red flag documented: more E2E than integration is too slow
+
+ 4. TDD Phases with Guardrails
+ As a developer, I want explicit steps for RED → GREEN → REFACTOR, so that TDD is followed correctly.
+ Acceptance Criteria:
+
+ - [ ] RED: write failing test, verify it fails for the correct reason, commit test
+ - [ ] GREEN: minimum code to pass, no extra features (YAGNI)
+ - [ ] REFACTOR: improve design with tests green; optional subagent validation
+
+ 5. Test Type Decision Tree
+ As a tester, I want a sequential decision tree, so that I can deterministically select the test type.
+ Acceptance Criteria:
+
+ - [ ] Ordered questions: LLM Eval → E2E → Integration → Unit
+ - [ ] Edge cases: non-determinism → mock; env deps → integration; mixed → split
+ - [ ] Re-evaluation path documented with example (login validation)
+
+ 6. Bug-to-Test Mapping Table
+ As a planner, I want a lookup table mapping bug types to test types, so that choices are consistent.
+ Acceptance Criteria:
+
+ - [ ] Include calculation, API, DB, state, CSS, navigation cases
+ - [ ] “Best choice” favors the fastest viable test
+ - [ ] Aligns with the decision tree
+
+ 7. E2E Dev/Test Server Isolation
+ As a QA engineer, I want isolated ports and processes for E2E, so that we avoid port conflicts and zombie processes.
+ Acceptance Criteria:
+
+ - [ ] Dev on stable port; tests on devPort+1000 (or ephemeral fallback)
+ - [ ] Playwright config uses isolated port and reuse rules
+ - [ ] Package scripts include dev and dev:test commands
+
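A sketch of what the isolated-port rule can look like in `playwright.config.ts`; the port numbers and script name are assumptions to adapt per project:

```typescript
import { defineConfig } from "@playwright/test";

const devPort = 3000;             // interactive dev server stays here
const testPort = devPort + 1000;  // E2E runs against an isolated port

export default defineConfig({
  use: { baseURL: `http://localhost:${testPort}` },
  webServer: {
    command: "npm run dev:test",           // package script that serves on testPort
    port: testPort,
    reuseExistingServer: !process.env.CI,  // reuse locally, always fresh in CI
  },
});
```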
+ 8. LLM Evaluations Usage
+ As a tester, I want to use LLM evals with rubrics when judging quality, so that nuanced outputs are verified.
+ Acceptance Criteria:
+
+ - [ ] Use programmatic assertions for structure; use LLM-as-judge for tone/reasoning
+ - [ ] Skip evals when simple/deterministic
+ - [ ] Costs acknowledged; caching used
+
+ 9. Cost Controls for Evals
+ As a maintainer, I want to reduce eval costs, so that CI remains affordable.
+ Acceptance Criteria:
+
+ - [ ] Cache static prompts/examples
+ - [ ] Batch scenarios; schedule full evals (e.g., per PR/weekly)
+ - [ ] Document expected costs and ROI
+
+ 10. Coverage Goals and Critical Paths
+ As a team, I want clear coverage goals and critical path definitions, so that tests cover what matters.
+ Acceptance Criteria:
+
+ - [ ] Unit: 80%+ for pure functions
+ - [ ] E2E: all critical multi-page flows; Integration: all critical paths
+ - [ ] “Critical” defined (auth, payment, data loss, core flows)
+
+ 11. Test Quality Practices
+ As a developer, I want clear guidance on writing effective tests, so that tests are reliable and maintainable.
+ Acceptance Criteria:
+
+ - [ ] AAA pattern enforced; descriptive naming; independent tests
+ - [ ] Async uses polling/selectors (no arbitrary timeouts)
+ - [ ] What not to test documented (implementation details, trivial code, third-party internals)
+
+ 12. CI/CD Testing Cadence
+ As a maintainer, I want a CI plan that balances speed and confidence, so that pipelines are efficient.
+ Acceptance Criteria:
+
+ - [ ] Unit+integration on every commit; E2E on PR; evals on schedule
+ - [ ] No skipped tests without justification
+ - [ ] Coverage thresholds enforced
+
+ 13. Project-Specific Testing Doc
+ As a maintainer, I want `tests/SAFEWORD.md` to document stack/commands/patterns, so that contributors can run tests correctly.
+ Acceptance Criteria:
+
+ - [ ] File exists with stack, commands, setup, file structure, patterns
+ - [ ] TDD expectations, coverage, and PR requirements included
+ - [ ] If missing, ask where testing docs are and create/update
+
+ 14. Test Integrity Guardrails
+ As a developer, I want explicit rules preventing test modifications without approval, so that tests remain trusted specifications.
+ Acceptance Criteria:
+
+ - [ ] Tests not modified/skipped/deleted without human approval
+ - [ ] Forbidden actions enforced (changing assertions, .skip(), weakening, deleting)
+ - [ ] Implementation fixed when a test fails (not the test)
+ - [ ] Requirement changes discussed before test updates
+
+ ### user-story-guide.md
+
+ 1. Use the Standard User Stories Template
+ As a planner, I want to use the provided user stories template, so that stories are consistent and easy to track.
+ Acceptance Criteria:
+
+ - [ ] Fill `templates/user-stories-template.md` with feature name, issue number, status
+ - [ ] Number stories (Story 1, Story 2, …)
+ - [ ] Save to `planning/user-stories/[id]-[feature].md` (or project convention)
+
+ 2. Include Tracking Metadata
+ As a PM, I want status, test refs, and completion summaries, so that progress is visible.
+ Acceptance Criteria:
+
+ - [ ] ✅/❌ per story and per AC
+ - [ ] Test file references included
+ - [ ] Completion % and phase tracking present; next steps listed
+
+ 3. INVEST Validation Gate
+ As a reviewer, I want INVEST validation before saving, so that stories are deliverable.
+ Acceptance Criteria:
+
+ - [ ] Independent, Negotiable, Valuable, Estimable, Small, Testable
+ - [ ] If any fail → refine or split before merge
+ - [ ] Validation documented in the story file or PR notes
+
+ 4. Write Good Acceptance Criteria
+ As a writer, I want specific, user-facing, testable AC, so that done is unambiguous.
+ Acceptance Criteria:
+
+ - [ ] AC specify measurable behavior (e.g., “<200ms”)
+ - [ ] Avoid technical/implementation details
+ - [ ] Avoid vague phrasing (“works”, “better”)
+
+ 5. Size Guidelines Enforcement
+ As a planner, I want size checks, so that stories are right-sized.
+ Acceptance Criteria:
+
+ - [ ] Split if 6+ AC, multiple personas, multiple screens, or >6 days
+ - [ ] Combine if trivial (<1 hour, no AC)
+ - [ ] Target: 1–5 AC, 1–2 screens/personas, 1–5 days
+
+ 6. Good/Bad Examples Reference
+ As a contributor, I want examples of good vs bad stories, so that quality is consistent.
+ Acceptance Criteria:
+
+ - [ ] Provide at least one good story example in the doc
+ - [ ] Provide a “too big” and a “no value” anti-example
+ - [ ] Technical tasks labeled as tasks/spikes (not user stories)
+
+ 7. Conversation, Not Contract
+ As a team, I want stories to be conversation starters, so that details emerge collaboratively.
+ Acceptance Criteria:
+
+ - [ ] Discuss edge cases, approach, and open questions during planning
+ - [ ] Story avoids implementation details and test strategies
+ - [ ] Link to UI mockups instead of embedding them
+
+ 8. LLM-Optimized Wording
+ As a writer, I want LLM-friendly wording, so that agents follow stories reliably.
+ Acceptance Criteria:
+
+ - [ ] Use specific, concrete language with numbers where helpful
+ - [ ] Define terms explicitly; avoid generic phrases
+ - [ ] Use examples over abstract rules
+
+ 9. Token Efficiency of Template
+ As a maintainer, I want minimal template overhead, so that prompting cost stays low.
+ Acceptance Criteria:
+
+ - [ ] Keep template lean (≈9 lines)
+ - [ ] Flat structure, no nested sections or validation metadata
+ - [ ] Reuse standard sections across stories
+
+ 10. File Naming Conventions
+ As a contributor, I want descriptive filenames, so that stories are discoverable.
+ Acceptance Criteria:
+
+ - [ ] Use descriptive slugs (e.g., `campaign-switching.md`)
+ - [ ] Avoid generic or bloated names
+ - [ ] Place under `docs/stories/` when appropriate
+
+ ### zombie-process-cleanup.md
+
+ 1. Prefer Port-Based Cleanup
+ As a developer, I want to kill processes by port, so that I don’t affect other projects.
+ Acceptance Criteria:
+
+ - [ ] Use `lsof -ti:PORT | xargs kill -9` for dev servers
+ - [ ] Avoid blanket `killall node` or `pkill -9 node`
+ - [ ] Verify port uniqueness per project
+
+ 2. Project-Specific Cleanup Script
+ As a maintainer, I want a `scripts/cleanup.sh`, so that cleanup is safe and repeatable.
+ Acceptance Criteria:
+
+ - [ ] Script kills by project port and filters by current directory
+ - [ ] Handles dev server, Playwright browsers, and test runners
+ - [ ] Mark executable and document usage
+
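A sketch of what `scripts/cleanup.sh` can look like; the default port and process patterns are assumptions to adapt per project, and every step tolerates "nothing found" so the script is safe to re-run:

```shell
#!/usr/bin/env bash
PORT="${PORT:-3001}"

kill_port() {
  # lsof -t prints bare PIDs and exits non-zero when the port is free, hence || true
  local pids
  pids="$(lsof -ti:"$1" 2>/dev/null || true)"
  if [ -n "$pids" ]; then
    echo "$pids" | xargs kill -9
  fi
}

# 1. Dev/test server bound to this project's port only
kill_port "$PORT"

# 2. Playwright browsers and test runners launched from this checkout only
pkill -9 -f "playwright.*$(pwd)" 2>/dev/null || true
pkill -9 -f "vitest.*$(pwd)" 2>/dev/null || true

echo "cleanup complete for port $PORT"
```

Mark it executable with `chmod +x scripts/cleanup.sh` and note its usage in the README.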
+ 3. Unique Port Assignment
+ As a team, I want unique ports per project, so that cleanup is unambiguous.
+ Acceptance Criteria:
+
+ - [ ] Set explicit `PORT` per project (e.g., 3000, 3001)
+ - [ ] Document ports in README or env
+ - [ ] Verify in CI/local scripts
+
+ 4. tmux/Screen Isolation (Optional)
+ As a developer, I want isolated terminal sessions, so that I can kill everything for one project safely.
+ Acceptance Criteria:
+
+ - [ ] Start dev in a named session
+ - [ ] One command kills only that session
+ - [ ] Trade-offs documented (learning curve)
+
+ 5. Debugging Zombie Processes
+ As a developer, I want quick commands to find processes, so that I can target cleanup precisely.
+ Acceptance Criteria:
+
+ - [ ] Commands to find by port, by process type, and by project directory
+ - [ ] Guidance for listing node/playwright/chromium processes
+ - [ ] Prefer filtered kills using the `$(pwd)` pattern
+
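The find-first commands can stay read-only so they are safe to paste; the port and patterns below are examples to adapt:

```shell
PORT="${PORT:-3000}"

# By port: who is listening? (silent no-op when the port is free)
lsof -i:"$PORT" 2>/dev/null || true

# By process type: any node/playwright/chromium still running?
ps aux | grep -E "node|playwright|chromium" | grep -v grep || true

# By project directory: only processes started from this checkout
ps aux | grep -F "$(pwd)" | grep -v grep || true

# Once a culprit is identified, prefer a filtered kill over a global one, e.g.:
#   pkill -9 -f "node.*$(pwd)"
```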
+ 6. Best Practices
+ As a maintainer, I want a short best-practices list, so that the team consistently avoids cross-project kills.
+ Acceptance Criteria:
+
+ - [ ] Assign unique ports; use port-based cleanup first
+ - [ ] Create and use project cleanup scripts
+ - [ ] Clean before start; check with `lsof -i:PORT`
+
+ 7. Quick Reference
+ As a developer, I want a quick-reference table, so that I can copy/paste safe commands.
+ Acceptance Criteria:
+
+ - [ ] Include kill-by-port, kill Playwright for this project, and the full cleanup script
+ - [ ] Include commands to check ports and find zombies
+ - [ ] Warn against dangerous global kills