safeword 0.2.3 → 0.2.4

Files changed (235)
  1. package/.claude/commands/arch-review.md +32 -0
  2. package/.claude/commands/lint.md +6 -0
  3. package/.claude/commands/quality-review.md +13 -0
  4. package/.claude/commands/setup-linting.md +6 -0
  5. package/.claude/hooks/auto-lint.sh +6 -0
  6. package/.claude/hooks/auto-quality-review.sh +170 -0
  7. package/.claude/hooks/check-linting-sync.sh +17 -0
  8. package/.claude/hooks/inject-timestamp.sh +6 -0
  9. package/.claude/hooks/question-protocol.sh +12 -0
  10. package/.claude/hooks/run-linters.sh +8 -0
  11. package/.claude/hooks/run-quality-review.sh +76 -0
  12. package/.claude/hooks/version-check.sh +10 -0
  13. package/.claude/mcp/README.md +96 -0
  14. package/.claude/mcp/arcade.sample.json +9 -0
  15. package/.claude/mcp/context7.sample.json +7 -0
  16. package/.claude/mcp/playwright.sample.json +7 -0
  17. package/.claude/settings.json +62 -0
  18. package/.claude/skills/quality-reviewer/SKILL.md +190 -0
  19. package/.claude/skills/safeword-quality-reviewer/SKILL.md +13 -0
  20. package/.env.arcade.example +4 -0
  21. package/.env.example +11 -0
  22. package/.gitmodules +4 -0
  23. package/.safeword/SAFEWORD.md +33 -0
  24. package/.safeword/eslint/eslint-base.mjs +101 -0
  25. package/.safeword/guides/architecture-guide.md +404 -0
  26. package/.safeword/guides/code-philosophy.md +174 -0
  27. package/.safeword/guides/context-files-guide.md +405 -0
  28. package/.safeword/guides/data-architecture-guide.md +183 -0
  29. package/.safeword/guides/design-doc-guide.md +165 -0
  30. package/.safeword/guides/learning-extraction.md +515 -0
  31. package/.safeword/guides/llm-instruction-design.md +239 -0
  32. package/.safeword/guides/llm-prompting.md +95 -0
  33. package/.safeword/guides/tdd-best-practices.md +570 -0
  34. package/.safeword/guides/test-definitions-guide.md +243 -0
  35. package/.safeword/guides/testing-methodology.md +573 -0
  36. package/.safeword/guides/user-story-guide.md +237 -0
  37. package/.safeword/guides/zombie-process-cleanup.md +214 -0
  38. package/{templates → .safeword}/hooks/agents-md-check.sh +0 -0
  39. package/{templates → .safeword}/hooks/post-tool.sh +0 -0
  40. package/{templates → .safeword}/hooks/pre-commit.sh +0 -0
  41. package/.safeword/planning/002-user-story-quality-evaluation.md +1840 -0
  42. package/.safeword/planning/003-langsmith-eval-setup-prompt.md +363 -0
  43. package/.safeword/planning/004-llm-eval-test-cases.md +3226 -0
  44. package/.safeword/planning/005-architecture-enforcement-system.md +169 -0
  45. package/.safeword/planning/006-reactive-fix-prevention-research.md +135 -0
  46. package/.safeword/planning/011-cli-ux-vision.md +330 -0
  47. package/.safeword/planning/012-project-structure-cleanup.md +154 -0
  48. package/.safeword/planning/README.md +39 -0
  49. package/.safeword/planning/automation-plan-v2.md +1225 -0
  50. package/.safeword/planning/automation-plan-v3.md +1291 -0
  51. package/.safeword/planning/automation-plan.md +3058 -0
  52. package/.safeword/planning/design/005-cli-implementation.md +343 -0
  53. package/.safeword/planning/design/013-cli-self-contained-templates.md +596 -0
  54. package/.safeword/planning/design/013a-eslint-plugin-suite.md +256 -0
  55. package/.safeword/planning/design/013b-implementation-snippets.md +385 -0
  56. package/.safeword/planning/design/013c-config-isolation-strategy.md +242 -0
  57. package/.safeword/planning/design/code-philosophy-improvements.md +60 -0
  58. package/.safeword/planning/mcp-analysis.md +545 -0
  59. package/.safeword/planning/phase2-subagents-vs-skills-analysis.md +451 -0
  60. package/.safeword/planning/settings-improvements.md +970 -0
  61. package/.safeword/planning/test-definitions/005-cli-implementation.md +1301 -0
  62. package/.safeword/planning/test-definitions/cli-self-contained-templates.md +205 -0
  63. package/.safeword/planning/user-stories/001-guides-review-user-stories.md +1381 -0
  64. package/.safeword/planning/user-stories/003-reactive-fix-prevention.md +132 -0
  65. package/.safeword/planning/user-stories/004-technical-constraints.md +86 -0
  66. package/.safeword/planning/user-stories/005-cli-implementation.md +311 -0
  67. package/.safeword/planning/user-stories/cli-self-contained-templates.md +172 -0
  68. package/.safeword/planning/versioned-distribution.md +740 -0
  69. package/.safeword/prompts/arch-review.md +43 -0
  70. package/.safeword/prompts/quality-review.md +11 -0
  71. package/.safeword/scripts/arch-review.sh +235 -0
  72. package/.safeword/scripts/check-linting-sync.sh +58 -0
  73. package/.safeword/scripts/setup-linting.sh +559 -0
  74. package/.safeword/templates/architecture-template.md +136 -0
  75. package/.safeword/templates/ci/architecture-check.yml +79 -0
  76. package/.safeword/templates/design-doc-template.md +127 -0
  77. package/.safeword/templates/test-definitions-feature.md +100 -0
  78. package/.safeword/templates/ticket-template.md +74 -0
  79. package/.safeword/templates/user-stories-template.md +82 -0
  80. package/.safeword/tickets/001-guides-review-user-stories.md +83 -0
  81. package/.safeword/tickets/002-architecture-enforcement.md +211 -0
  82. package/.safeword/tickets/003-reactive-fix-prevention.md +57 -0
  83. package/.safeword/tickets/004-technical-constraints-in-user-stories.md +39 -0
  84. package/.safeword/tickets/005-cli-implementation.md +248 -0
  85. package/.safeword/tickets/006-flesh-out-skills.md +43 -0
  86. package/.safeword/tickets/007-flesh-out-questioning.md +44 -0
  87. package/.safeword/tickets/008-upgrade-questioning.md +58 -0
  88. package/.safeword/tickets/009-naming-conventions.md +41 -0
  89. package/.safeword/tickets/010-safeword-md-cleanup.md +34 -0
  90. package/.safeword/tickets/011-cursor-setup.md +86 -0
  91. package/.safeword/tickets/README.md +73 -0
  92. package/.safeword/version +1 -0
  93. package/AGENTS.md +59 -0
  94. package/CLAUDE.md +12 -0
  95. package/README.md +347 -0
  96. package/docs/001-cli-implementation-plan.md +856 -0
  97. package/docs/elite-dx-implementation-plan.md +1034 -0
  98. package/framework/README.md +131 -0
  99. package/framework/mcp/README.md +96 -0
  100. package/framework/mcp/arcade.sample.json +8 -0
  101. package/framework/mcp/context7.sample.json +6 -0
  102. package/framework/mcp/playwright.sample.json +6 -0
  103. package/framework/scripts/arch-review.sh +235 -0
  104. package/framework/scripts/check-linting-sync.sh +58 -0
  105. package/framework/scripts/load-env.sh +49 -0
  106. package/framework/scripts/setup-claude.sh +223 -0
  107. package/framework/scripts/setup-linting.sh +559 -0
  108. package/framework/scripts/setup-quality.sh +477 -0
  109. package/framework/scripts/setup-safeword.sh +550 -0
  110. package/framework/templates/ci/architecture-check.yml +78 -0
  111. package/learnings/ai-sdk-v5-breaking-changes.md +178 -0
  112. package/learnings/e2e-test-zombie-processes.md +231 -0
  113. package/learnings/milkdown-crepe-editor-property.md +96 -0
  114. package/learnings/prosemirror-fragment-traversal.md +119 -0
  115. package/package.json +19 -43
  116. package/packages/cli/AGENTS.md +1 -0
  117. package/packages/cli/ARCHITECTURE.md +279 -0
  118. package/packages/cli/package.json +51 -0
  119. package/packages/cli/src/cli.ts +63 -0
  120. package/packages/cli/src/commands/check.ts +166 -0
  121. package/packages/cli/src/commands/diff.ts +209 -0
  122. package/packages/cli/src/commands/reset.ts +190 -0
  123. package/packages/cli/src/commands/setup.ts +325 -0
  124. package/packages/cli/src/commands/upgrade.ts +163 -0
  125. package/packages/cli/src/index.ts +3 -0
  126. package/packages/cli/src/templates/config.ts +58 -0
  127. package/packages/cli/src/templates/content.ts +18 -0
  128. package/packages/cli/src/templates/index.ts +12 -0
  129. package/packages/cli/src/utils/agents-md.ts +66 -0
  130. package/packages/cli/src/utils/fs.ts +179 -0
  131. package/packages/cli/src/utils/git.ts +124 -0
  132. package/packages/cli/src/utils/hooks.ts +29 -0
  133. package/packages/cli/src/utils/output.ts +60 -0
  134. package/packages/cli/src/utils/project-detector.test.ts +185 -0
  135. package/packages/cli/src/utils/project-detector.ts +44 -0
  136. package/packages/cli/src/utils/version.ts +28 -0
  137. package/packages/cli/src/version.ts +6 -0
  138. package/packages/cli/templates/SAFEWORD.md +776 -0
  139. package/packages/cli/templates/doc-templates/architecture-template.md +136 -0
  140. package/packages/cli/templates/doc-templates/design-doc-template.md +134 -0
  141. package/packages/cli/templates/doc-templates/test-definitions-feature.md +131 -0
  142. package/packages/cli/templates/doc-templates/ticket-template.md +82 -0
  143. package/packages/cli/templates/doc-templates/user-stories-template.md +92 -0
  144. package/packages/cli/templates/guides/architecture-guide.md +423 -0
  145. package/packages/cli/templates/guides/code-philosophy.md +195 -0
  146. package/packages/cli/templates/guides/context-files-guide.md +457 -0
  147. package/packages/cli/templates/guides/data-architecture-guide.md +200 -0
  148. package/packages/cli/templates/guides/design-doc-guide.md +171 -0
  149. package/packages/cli/templates/guides/learning-extraction.md +552 -0
  150. package/packages/cli/templates/guides/llm-instruction-design.md +248 -0
  151. package/packages/cli/templates/guides/llm-prompting.md +102 -0
  152. package/packages/cli/templates/guides/tdd-best-practices.md +615 -0
  153. package/packages/cli/templates/guides/test-definitions-guide.md +334 -0
  154. package/packages/cli/templates/guides/testing-methodology.md +618 -0
  155. package/packages/cli/templates/guides/user-story-guide.md +256 -0
  156. package/packages/cli/templates/guides/zombie-process-cleanup.md +219 -0
  157. package/packages/cli/templates/hooks/agents-md-check.sh +27 -0
  158. package/packages/cli/templates/hooks/post-tool.sh +4 -0
  159. package/packages/cli/templates/hooks/pre-commit.sh +10 -0
  160. package/packages/cli/templates/prompts/arch-review.md +43 -0
  161. package/packages/cli/templates/prompts/quality-review.md +10 -0
  162. package/packages/cli/templates/skills/safeword-quality-reviewer/SKILL.md +207 -0
  163. package/packages/cli/tests/commands/check.test.ts +129 -0
  164. package/packages/cli/tests/commands/cli.test.ts +89 -0
  165. package/packages/cli/tests/commands/diff.test.ts +115 -0
  166. package/packages/cli/tests/commands/reset.test.ts +310 -0
  167. package/packages/cli/tests/commands/self-healing.test.ts +170 -0
  168. package/packages/cli/tests/commands/setup-blocking.test.ts +71 -0
  169. package/packages/cli/tests/commands/setup-core.test.ts +135 -0
  170. package/packages/cli/tests/commands/setup-git.test.ts +139 -0
  171. package/packages/cli/tests/commands/setup-hooks.test.ts +334 -0
  172. package/packages/cli/tests/commands/setup-linting.test.ts +189 -0
  173. package/packages/cli/tests/commands/setup-noninteractive.test.ts +80 -0
  174. package/packages/cli/tests/commands/setup-templates.test.ts +181 -0
  175. package/packages/cli/tests/commands/upgrade.test.ts +215 -0
  176. package/packages/cli/tests/helpers.ts +243 -0
  177. package/packages/cli/tests/npm-package.test.ts +83 -0
  178. package/packages/cli/tests/technical-constraints.test.ts +96 -0
  179. package/packages/cli/tsconfig.json +25 -0
  180. package/packages/cli/tsup.config.ts +11 -0
  181. package/packages/cli/vitest.config.ts +23 -0
  182. package/promptfoo.yaml +3270 -0
  183. package/dist/check-3NGQ4NR5.js +0 -129
  184. package/dist/check-3NGQ4NR5.js.map +0 -1
  185. package/dist/chunk-2XWIUEQK.js +0 -190
  186. package/dist/chunk-2XWIUEQK.js.map +0 -1
  187. package/dist/chunk-GZRQL3SX.js +0 -146
  188. package/dist/chunk-GZRQL3SX.js.map +0 -1
  189. package/dist/chunk-ORQHKDT2.js +0 -10
  190. package/dist/chunk-ORQHKDT2.js.map +0 -1
  191. package/dist/chunk-W66Z3C5H.js +0 -21
  192. package/dist/chunk-W66Z3C5H.js.map +0 -1
  193. package/dist/cli.d.ts +0 -1
  194. package/dist/cli.js +0 -34
  195. package/dist/cli.js.map +0 -1
  196. package/dist/diff-Y6QTAW4O.js +0 -166
  197. package/dist/diff-Y6QTAW4O.js.map +0 -1
  198. package/dist/index.d.ts +0 -11
  199. package/dist/index.js +0 -7
  200. package/dist/index.js.map +0 -1
  201. package/dist/reset-3ACTIYYE.js +0 -143
  202. package/dist/reset-3ACTIYYE.js.map +0 -1
  203. package/dist/setup-RR4M334C.js +0 -266
  204. package/dist/setup-RR4M334C.js.map +0 -1
  205. package/dist/upgrade-6AR3DHUV.js +0 -134
  206. package/dist/upgrade-6AR3DHUV.js.map +0 -1
  207. /package/{templates → framework}/SAFEWORD.md +0 -0
  208. /package/{templates → framework}/guides/architecture-guide.md +0 -0
  209. /package/{templates → framework}/guides/code-philosophy.md +0 -0
  210. /package/{templates → framework}/guides/context-files-guide.md +0 -0
  211. /package/{templates → framework}/guides/data-architecture-guide.md +0 -0
  212. /package/{templates → framework}/guides/design-doc-guide.md +0 -0
  213. /package/{templates → framework}/guides/learning-extraction.md +0 -0
  214. /package/{templates → framework}/guides/llm-instruction-design.md +0 -0
  215. /package/{templates → framework}/guides/llm-prompting.md +0 -0
  216. /package/{templates → framework}/guides/tdd-best-practices.md +0 -0
  217. /package/{templates → framework}/guides/test-definitions-guide.md +0 -0
  218. /package/{templates → framework}/guides/testing-methodology.md +0 -0
  219. /package/{templates → framework}/guides/user-story-guide.md +0 -0
  220. /package/{templates → framework}/guides/zombie-process-cleanup.md +0 -0
  221. /package/{templates → framework}/prompts/arch-review.md +0 -0
  222. /package/{templates → framework}/prompts/quality-review.md +0 -0
  223. /package/{templates/skills/safeword-quality-reviewer → framework/skills/quality-reviewer}/SKILL.md +0 -0
  224. /package/{templates/doc-templates → framework/templates}/architecture-template.md +0 -0
  225. /package/{templates/doc-templates → framework/templates}/design-doc-template.md +0 -0
  226. /package/{templates/doc-templates → framework/templates}/test-definitions-feature.md +0 -0
  227. /package/{templates/doc-templates → framework/templates}/ticket-template.md +0 -0
  228. /package/{templates/doc-templates → framework/templates}/user-stories-template.md +0 -0
  229. /package/{templates → packages/cli/templates}/commands/arch-review.md +0 -0
  230. /package/{templates → packages/cli/templates}/commands/lint.md +0 -0
  231. /package/{templates → packages/cli/templates}/commands/quality-review.md +0 -0
  232. /package/{templates → packages/cli/templates}/hooks/inject-timestamp.sh +0 -0
  233. /package/{templates → packages/cli/templates}/lib/common.sh +0 -0
  234. /package/{templates → packages/cli/templates}/lib/jq-fallback.sh +0 -0
  235. /package/{templates → packages/cli/templates}/markdownlint.jsonc +0 -0
@@ -0,0 +1,3058 @@
+ # Claude Code Automation Plan: Quality Control Workflow
+
+ **Goal**: Eliminate repetitive "double check and critique" prompts while maintaining quality standards
+
+ **Based on**: Conversation pattern analysis across soulless-monorepo (1,319 messages) and bitd (8,732 messages)
+
+ **Key Pattern Identified**: 7% of soulless-monorepo and 4% of bitd prompts are quality control rituals that can be automated
+
+ **Updated**: 2025-10-30 - Integrated Anthropic 2025 best practices: Explore→Plan→Code workflow, context management, split Quality Reviewer into 3 focused Skills, replaced UserPromptSubmit Hook with Stop Hook
+
+ ---
+
+ ## Executive Summary
+
+ After comprehensive analysis of all automation mechanisms (Hooks, Skills, Subagents, MCP Servers), the recommended approach is:
+
+ **Phase 1** (START HERE): Enhanced CLAUDE.md + Slash Commands → 60% reduction in prompt length
+ **Phase 2** (⭐ RECOMMENDED): Skills + PostToolUse Hooks → Eliminate all 347 quality check prompts
+ **Phase 3** (OPTIONAL): UserPromptSubmit Hook → If Phase 2 insufficient
+ **Phase 4** (FUTURE): MCP Servers → Only for external integrations (GitHub PR automation, database validation)
+
+ **Why NOT Subagents or MCP Servers for Phase 2**:
+
+ - ❌ Subagents: Too slow (context switching), lack conversation history
+ - ❌ MCP Servers: External dependencies, cost, complexity, no conversation context
+ - ✅ Skills: Fast, context-aware, automatic activation, perfect fit
+
+ ### Quick Reference: Top Automation Opportunities
+
+ **Updated based on Anthropic 2025 best practices**
+
+ | Mechanism | Name | Priority | Impact | Effort | ROI | Source Guide |
+ | ------------- | --------------------------- | ---------- | ------------------------------------------------------------- | ------- | --------- | -------------------------------------- |
+ | **Skill** | Explore-First Enforcer | ⭐⭐⭐⭐⭐ | Prevents jumping to code (Anthropic #1 best practice) | 30 min | Very High | Anthropic 2025 best practices |
+ | **Skill** | Docs Verifier | ⭐⭐⭐⭐⭐ | Check latest docs (was part of Quality Reviewer) | 45 min | Very High | code-philosophy.md |
+ | **Skill** | Standards Checker | ⭐⭐⭐⭐⭐ | Validate conventions (was part of Quality Reviewer) | 30 min | Very High | code-philosophy.md |
+ | **Skill** | Quality Gates | ⭐⭐⭐⭐⭐ | Final review before Write/Edit (was part of Quality Reviewer) | 45 min | Very High | code-philosophy.md |
+ | **Hook** | SessionStart - Load Context | ⭐⭐⭐⭐ | Load quality standards at session start | 20 min | High | Official docs |
+ | **Hook** | Stop - Context Management | ⭐⭐⭐⭐ | Prevent "dumber after compaction" issue | 10 min | High | Anthropic 2025 best practices |
+ | **Hook** | PostToolUse - Test Runner | ⭐⭐⭐⭐ | Auto-run tests after every edit | 30 min | High | testing-methodology.md |
+ | **Hook** | PreToolUse - Format Code | ⭐⭐⭐ | Auto-format before Write/Edit | 1-2 hrs | Medium | Official docs v2.0.10+ |
+ | **Skill** | Context Manager | ⭐⭐⭐⭐ | Suggest /clear between tasks | 20 min | High | Anthropic 2025 best practices |
+ | **Skill** | Feature Kickoff | ⭐⭐⭐⭐ | Auto-find user stories/tests/design docs | 1 hr | High | CLAUDE.md |
+ | **Hook** | SessionEnd - Log Stats | ⭐⭐ | Log session statistics | 15 min | Low | Official docs |
+ | **Skill** | TDD Enforcer | ⭐⭐⭐ | Enforce tests BEFORE implementation | 30 min | Medium | testing-methodology.md, Anthropic 2025 |
+ | **Slash Cmd** | /critique | ⭐⭐⭐ | Manual quality check shortcut | 15 min | Medium | Phase 1 plan |
+
+ **Total identified**: 17 Skills, 7 Hooks, 5 Slash Commands (29 automation opportunities)
+
+ **Hook types available**:
+
+ - **Tool-related**: PreToolUse, PostToolUse
+ - **User interaction**: UserPromptSubmit, Notification, Stop
+ - **Session**: SessionStart, SessionEnd
+ - **Subagent**: SubagentStop
+ - **Maintenance**: PreCompact
+
+ **Key changes from original plan**:
+
+ - Split Quality Reviewer into 3 focused Skills (Anthropic: "keep Skills lean")
+ - Added Explore-First Enforcer (Anthropic #1 best practice)
+ - Replaced UserPromptSubmit Hook with Stop Hook (prevents context bloat)
+ - Added Context Manager Skill (addresses "dumber after compaction" issue)
+
+ **Implementation recommendation**: Start with Explore-First + Stop Hook (Phase 1), then Docs Verifier + Quality Gates (Phase 2).
+
+ ---
+
+ ## Current State Analysis
+
+ ### Repetitive Patterns Identified
+
+ 1. **Double Check Ritual** (appears 347+ times in bitd alone):
+
+    ```
+    "double check and critique your work again just in case.
+    is it correct? is it elegant? does it adhere to [plan/template]?
+    does it follow the latest best practices and documentation for [stack/domain]?
+    avoid bloat."
+    ```
+
+ 2. **Approval Cycle**:
+    - "yes" / "yes please" / "ok" / "proceed" / "do it"
+    - User reviews, gives terse approval, expects execution
+
+ 3. **Latest Documentation Emphasis**:
+    - "latest best practices"
+    - "very latest documentation"
+    - "check [tool]'s very latest docs"
+
+ 4. **Self-Sufficiency Directive**:
+    - "ask any non-obvious questions you need answered that you can't research yourself"
+    - "don't be lazy"
+
+ 5. **Anti-Bloat Vigilance**:
+    - "avoid bloat" appears in most critique requests
+
+ 6. **Quality Triad**:
+    - "is it correct?"
+    - "is it elegant?"
+    - "does it adhere to [standard]?"
+
+ ---
+
+ ## Automation Strategy: All Available Mechanisms
+
+ Claude Code provides **7 automation mechanisms**. After testing all approaches against your workflow, here's the analysis:
+
+ | Mechanism | Power | Complexity | Best For | For Quality Checks |
+ | ------------------------- | ---------- | ---------- | --------------------- | ------------------------------ |
+ | **Skills** | ⭐⭐⭐⭐⭐ | Low | Auto quality checks | ✅ Context-aware, proactive |
+ | **PostToolUse Hook** | ⭐⭐⭐⭐ | Medium | Post-code validation | ✅ Deterministic, immediate |
+ | **UserPromptSubmit Hook** | ⭐⭐⭐⭐⭐ | Medium | Prompt enrichment | ⚠️ No context, false positives |
+ | **Enhanced CLAUDE.md** | ⭐⭐⭐ | Low | Baseline behavior | ✅ Simple, foundation |
+ | **Slash Commands** | ⭐⭐⭐⭐ | Low | Quick shortcuts | ✅ Manual, explicit |
+ | **Subagents** | ⭐⭐⭐⭐ | High | Long analysis | ❌ Too slow, no context |
+ | **MCP Servers** | ⭐⭐⭐⭐ | High | External integrations | ❌ External API costs |
+
+ ### Approach 1: Stop Hook for Context Management ⭐ RECOMMENDED
+
+ **Power Level**: ⭐⭐⭐⭐
+ **Effort**: Low (10 min)
+ **Impact**: Prevents "dumber after compaction" issue
+
+ **What it does**: Reminds the user to run `/clear` after completing a task
+
+ **Why not UserPromptSubmit Hook**: Text manipulation adds to context bloat, accelerating compaction issues. Stop event is cleaner.
+
+ **Configuration**: Add to `~/.claude/settings.json`
+
+ ```json
+ {
+   "hooks": {
+     "Stop": [
+       {
+         "matcher": "*",
+         "hooks": [
+           {
+             "type": "command",
+             "command": "echo '💡 Task complete. Consider: /clear before next task (prevents context pollution)'"
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ **How it works**:
+
+ - Claude finishes responding (Stop event fires)
+ - Hook outputs suggestion to use `/clear`
+ - User decides whether to clear context
+ - No text manipulation, no false positives
+
+ **Pros**:
+
+ - Non-invasive (suggestion, not manipulation)
+ - Addresses "dumber after compaction" issue
+ - No context bloat
+ - Works across all projects
+
+ **Cons**:
+
+ - User must manually type `/clear`
+ - May be repetitive if working on single task
+
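The second con above (a reminder after every response during one long task) could be softened by throttling the message. The following is a minimal sketch, not part of the plan itself: the `remind_clear` name, the stamp-file location, and the 30-minute interval are all hypothetical choices made for illustration.

```shell
#!/bin/bash
# Hypothetical throttle: print the /clear reminder at most once per interval.
# $1 = stamp file path (assumed location), $2 = interval in seconds (assumed default).
remind_clear() {
  local stamp=${1:-"$HOME/.claude/.clear-reminder-stamp"} interval=${2:-1800}
  local now last=0
  now=$(date +%s)
  # Read the time of the previous reminder, if one was recorded.
  [ -f "$stamp" ] && last=$(cat "$stamp")
  if (( now - last >= interval )); then
    echo "💡 Task complete. Consider: /clear before next task (prevents context pollution)"
    echo "$now" > "$stamp"
  fi
}

# Demo against a throwaway stamp file.
demo_dir=$(mktemp -d)
remind_clear "$demo_dir/stamp"   # prints the reminder
remind_clear "$demo_dir/stamp"   # silent: still inside the interval
```

Saved as a script, this could stand in for the inline `echo` in the Stop hook's `command` field; whether the extra state file is worth it depends on how noisy the reminder turns out to be in practice.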
+ **Decision**: Phase 2 RECOMMENDED - Pair with Skills for complete automation
+
+ ---
+
+ ### Approach 1a: SessionStart Hook (LOAD CONTEXT AT STARTUP) ⭐⭐⭐⭐
+
+ **Power Level**: ⭐⭐⭐⭐
+ **Effort**: Low (20 min)
+ **Impact**: Loads development context and environment at session start
+
+ **What it does**: Runs when Claude Code starts a new session or resumes an existing session
+
+ **Use Cases**:
+
+ - Load quality standards from CLAUDE.md into context
+ - Install dependencies automatically
+ - Set up environment variables
+ - Display project-specific reminders
+
+ **Configuration**: Add to `~/.claude/settings.json`
+
+ ```json
+ {
+   "hooks": {
+     "SessionStart": [
+       {
+         "matcher": "*",
+         "hooks": [
+           {
+             "type": "command",
+             "command": "echo '🎯 Loading project quality standards...' && head -50 ~/.claude/CLAUDE.md"
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ **Special Features**:
+
+ **1. Add Context to Session**
+
+ Output from SessionStart hook is automatically added to Claude's context:
+
+ ```bash
+ #!/bin/bash
+ echo "Project: My App"
+ echo "Quality Standards:"
+ echo "- Always check latest docs"
+ echo "- Run tests before committing"
+ echo "- Avoid bloat"
+ ```
+
+ This text appears in Claude's initial context window.
+
+ **2. Persist Environment Variables**
+
+ Use `CLAUDE_ENV_FILE` to persist env vars for subsequent bash commands:
+
+ ```bash
+ #!/bin/bash
+ # Write to env file for later bash commands
+ echo "export PROJECT_ROOT=/Users/alex/project" >> "$CLAUDE_ENV_FILE"
+ echo "export DEBUG=1" >> "$CLAUDE_ENV_FILE"
+ ```
+
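The `echo "export …"` lines above work as long as the values contain no spaces or shell metacharacters. A minimal sketch of a more defensive variant follows, assuming the env file is later sourced by bash (which the `export` syntax above suggests). `persist_env` is a hypothetical helper name, not part of Claude Code, and the temp-file fallback exists only so the sketch can run outside a SessionStart hook.

```shell
#!/bin/bash
# Fallback is an assumption for demo purposes; inside a SessionStart hook
# CLAUDE_ENV_FILE is expected to be set already.
CLAUDE_ENV_FILE="${CLAUDE_ENV_FILE:-$(mktemp)}"

# Hypothetical helper: append a shell-quoted export line to $CLAUDE_ENV_FILE.
persist_env() {
  local name=$1 value=$2
  # Reject anything that is not a valid shell variable name.
  if [[ ! $name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]; then
    echo "persist_env: invalid variable name: $name" >&2
    return 1
  fi
  # %q quotes the value so it survives being sourced by bash.
  printf 'export %s=%q\n' "$name" "$value" >> "$CLAUDE_ENV_FILE"
}

persist_env PROJECT_ROOT "/Users/alex/my project"
persist_env DEBUG 1
```

The design choice here is to quote at write time rather than rely on every caller remembering to escape values by hand.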
+ **3. Structured Output for Additional Context**
+
+ Return JSON to add formatted context:
+
+ ```json
+ {
+   "hookSpecificOutput": {
+     "hookEventName": "SessionStart",
+     "additionalContext": "Quality Standards:\n- Check docs\n- Run tests\n- Avoid bloat"
+   }
+ }
+ ```
+
+ **Available Environment Variables**:
+
+ - `CLAUDE_ENV_FILE` - File path where you can persist environment variables (SessionStart only)
+ - `CLAUDE_PROJECT_DIR` - Absolute path to project root
+ - `CLAUDE_CODE_REMOTE` - "true" if remote, empty if local
+
+ **Example: Load Project Context**
+
+ Create `~/.claude/hooks/session-start.sh`:
+
+ ```bash
+ #!/bin/bash
+
+ echo "🚀 Session started for: $CLAUDE_PROJECT_DIR"
+ echo ""
+
+ # Check if project has CLAUDE.md
+ if [ -f "$CLAUDE_PROJECT_DIR/.claude/CLAUDE.md" ]; then
+   echo "📋 Project Guidelines (first 30 lines):"
+   head -30 "$CLAUDE_PROJECT_DIR/.claude/CLAUDE.md"
+   echo ""
+ fi
+
+ # Set project env vars (value quoted so paths with spaces survive)
+ echo "export PROJECT_ROOT=\"$CLAUDE_PROJECT_DIR\"" >> "$CLAUDE_ENV_FILE"
+
+ # Check for uncommitted changes (skip quietly if the directory is unavailable)
+ cd "$CLAUDE_PROJECT_DIR" || exit 0
+ if ! git diff --quiet 2>/dev/null; then
+   echo "⚠️ Warning: Uncommitted changes detected"
+ fi
+ ```
+
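One caveat on the script above: `git diff --quiet` compares the working tree against the index, so it does not notice staged changes or untracked files. A sketch of a broader check follows, assuming `git status --porcelain` output is an acceptable signal for "something is uncommitted". `has_uncommitted_changes` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if the repo at $1 has staged, unstaged,
# or untracked changes; fail otherwise, including when $1 is not a git repo.
has_uncommitted_changes() {
  local dir=${1:-.} status
  # --porcelain prints one line per changed or untracked path, nothing when clean.
  status=$(git -C "$dir" status --porcelain 2>/dev/null) || return 1
  [ -n "$status" ]
}

if has_uncommitted_changes "${CLAUDE_PROJECT_DIR:-.}"; then
  echo "⚠️ Warning: Uncommitted changes detected"
fi
```

Using `git -C` also avoids the `cd`, so the hook's working directory is left alone.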
+ Register in settings.json:
+
+ ```json
+ {
+   "hooks": {
+     "SessionStart": [
+       {
+         "matcher": "*",
+         "hooks": [
+           {
+             "type": "command",
+             "command": "~/.claude/hooks/session-start.sh"
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ **Pros**:
+
+ - Loads context automatically (no manual reminders)
+ - Sets up environment consistently
+ - Displays project-specific warnings
+ - Runs only once per session (not repeatedly)
+
+ **Cons**:
+
+ - Adds to initial context window (uses tokens)
+ - Can't be canceled once session starts
+ - Errors may prevent session from starting
+
+ **Decision**: Phase 2 RECOMMENDED - Load quality standards and environment setup
+
+ ---
+
+ ### Approach 1c: SessionEnd Hook (CLEANUP AND LOGGING)
+
+ **Power Level**: ⭐⭐⭐
+ **Effort**: Low (15 min)
+ **Impact**: Cleanup tasks, session statistics logging
+
+ **What it does**: Runs when Claude Code session ends
+
+ **Use Cases**:
+
+ - Log session statistics (commands run, files modified, errors)
+ - Save session state for resume
+ - Clean up temporary files
+ - Export metrics for analysis
+
+ **⚠️ Important**: SessionEnd hooks **cannot block session termination**. They run in background for cleanup only.
+
+ **Configuration**: Add to `~/.claude/settings.json`
+
+ ```json
+ {
+   "hooks": {
+     "SessionEnd": [
+       {
+         "matcher": "*",
+         "hooks": [
+           {
+             "type": "command",
+             "command": "echo \"Session ended: $(date)\" >> ~/.claude/session-log.txt && echo \"Project: $CLAUDE_PROJECT_DIR\" >> ~/.claude/session-log.txt"
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ **Example: Session Statistics**
+
+ Create `~/.claude/hooks/session-end.sh`:
+
+ ```bash
+ #!/bin/bash
+
+ LOG_FILE=~/.claude/quality-sessions.log
+
+ {
+   echo "=================="
+   echo "Session ended: $(date)"
+   echo "Project: $CLAUDE_PROJECT_DIR"
+   echo "Duration: [tracked elsewhere]"
+
+   # Count files in project
+   file_count=$(find "$CLAUDE_PROJECT_DIR" -type f | wc -l)
+   echo "Files in project: $file_count"
+
+   # Check if tests were run
+   if [ -f "$CLAUDE_PROJECT_DIR/.test-results" ]; then
+     echo "Tests run: Yes"
+   else
+     echo "Tests run: No"
+   fi
+
+   echo ""
+ } >> "$LOG_FILE"
+ ```
+
+ Register in settings.json:
+
+ ```json
+ {
+   "hooks": {
+     "SessionEnd": [
+       {
+         "matcher": "*",
+         "hooks": [
+           {
+             "type": "command",
+             "command": "~/.claude/hooks/session-end.sh"
+           }
+         ]
+       }
+     ]
+   }
+ }
+ ```
+
+ **Special Feature: System Messages**
+
+ SessionEnd hooks support a top-level `systemMessage` field in their JSON output to display final messages:
+
+ ```json
+ {
+   "systemMessage": "✅ Session statistics saved to ~/.claude/quality-sessions.log"
+ }
+ ```
+
+ **Pros**:
+
+ - Automatic logging (no manual tracking)
+ - Preserves session history
+ - Can trigger external tools (backups, notifications)
+
+ **Cons**:
+
+ - Can't block session end (always runs in background)
+ - No user interaction possible
+ - May not complete if system shuts down abruptly
+
+ **Decision**: Phase 3 OPTIONAL - Useful for tracking metrics over time
+
+ ---
+
+ ### Approach 1b: PreToolUse Hook (PROACTIVE MODIFICATION) ⭐ NEW v2.0.10+
+
+ **Power Level**: ⭐⭐⭐⭐⭐
+ **Effort**: High (1-2 hours - complex JSON handling, testing required)
+ **Impact**: Modifies tool inputs BEFORE execution (cleaner than PostToolUse validation)
+
+ **What's New**: Starting in Claude Code v2.0.10, PreToolUse hooks can **modify tool inputs** before execution using the `updatedInput` field.
+
+ **From official docs**:
+
+ > "hooks can modify tool inputs before execution using `updatedInput`"
+
+ **Use Cases**:
+
+ - Auto-format code BEFORE Write/Edit executes (cleaner than PostToolUse fixing)
+ - Inject environment variables or credentials
+ - Add required headers/imports automatically
+ - Sanitize or validate file paths
+
+ ---
+
+ **⚠️ CRITICAL: How PreToolUse Hooks Work**
+
+ **Input**: Hooks receive JSON via **stdin** (not environment variables)
+ **Output**: Hooks must output JSON to **stdout** with specific structure
+
+ **Input JSON structure**:
+
+ ```json
+ {
+   "session_id": "abc123",
+   "hook_event_name": "PreToolUse",
+   "tool_name": "Write",
+   "tool_input": {
+     "file_path": "/path/to/file.ts",
+     "content": "unformatted code"
+   }
+ }
+ ```
+
+ **Output JSON structure** (to modify inputs):
+
+ ```json
+ {
+   "hookSpecificOutput": {
+     "hookEventName": "PreToolUse",
+     "permissionDecision": "allow",
+     "updatedInput": {
+       "content": "formatted code"
+     }
+   }
+ }
+ ```
+
+ **Key points**:
+
+ - Only include fields you want to **change** in `updatedInput`
+ - Unchanged fields are preserved automatically
+ - Must set `"permissionDecision": "allow"` to proceed
+ - Use `"permissionDecision": "deny"` to block tool execution
+
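As a small illustration of the `allow`/`deny` choice described above, the sketch below denies writes to dotenv files and allows everything else. The `.env` policy and the `decide_for_path` name are hypothetical; only the JSON fields shown in this section are used. In a real hook the path would come from stdin, for example via `jq -r '.tool_input.file_path // empty'`, the same extraction the plan's `format-code.sh` example uses.

```shell
#!/bin/bash
# Hypothetical policy: deny writes to .env files, allow everything else.
# Prints a PreToolUse decision as a single line of JSON.
decide_for_path() {
  local file_path=$1 decision=allow
  case "$(basename -- "$file_path")" in
    .env | .env.*) decision=deny ;;
  esac
  # The decision is one of two fixed words, so no JSON escaping is needed.
  printf '{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"%s"}}\n' "$decision"
}

decide_for_path "/path/to/project/.env.local"
decide_for_path "/path/to/project/src/app.ts"
```

Keeping the policy in a function that takes a plain path makes it easy to exercise without constructing hook payloads.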
493
+ ---
494
+
495
+ **Configuration**: Add to `~/.claude/settings.json`
496
+
497
+ **Example 1: Simple text modification**
498
+
499
+ Create hook script `~/.claude/hooks/format-code.sh`:
500
+
501
+ ```bash
502
+ #!/bin/bash
503
+
504
+ # Read JSON from stdin
505
+ input=$(cat)
506
+
507
+ # Extract fields
508
+ tool_name=$(echo "$input" | jq -r '.tool_name')
509
+ file_path=$(echo "$input" | jq -r '.tool_input.file_path // empty')
510
+ content=$(echo "$input" | jq -r '.tool_input.content // empty')
511
+
512
+ # Only format if it's a Write tool with .ts/.js file
513
+ if [[ "$tool_name" == "Write" ]] && [[ "$file_path" =~ \.(ts|js)$ ]]; then
514
+ # Format with prettier (fallback to original on error)
515
+ formatted=$(printf '%s\n' "$content" | prettier --stdin-filepath "$file_path" 2>/dev/null || printf '%s\n' "$content")
516
+
517
+ # Output modified JSON
518
+ jq -n \
519
+ --arg content "$formatted" \
520
+ '{
521
+ hookSpecificOutput: {
522
+ hookEventName: "PreToolUse",
523
+ permissionDecision: "allow",
524
+ updatedInput: {
525
+ content: $content
526
+ }
527
+ }
528
+ }'
529
+ else
530
+ # No modification, just approve
531
+ jq -n '{
532
+ hookSpecificOutput: {
533
+ hookEventName: "PreToolUse",
534
+ permissionDecision: "allow"
535
+ }
536
+ }'
537
+ fi
538
+ ```
539
+
540
+ **Make executable**:
541
+
542
+ ```bash
543
+ chmod +x ~/.claude/hooks/format-code.sh
544
+ ```
545
+
546
+ **Register in settings.json**:
547
+
548
+ ```json
549
+ {
550
+ "hooks": {
551
+ "PreToolUse": [
552
+ {
553
+ "matcher": "Write",
554
+ "hooks": [
555
+ {
556
+ "type": "command",
557
+ "command": "~/.claude/hooks/format-code.sh"
558
+ }
559
+ ]
560
+ }
561
+ ]
562
+ }
563
+ }
564
+ ```
565
+
566
+ ---
567
+
568
+ **Example 2: Inline jq transformation**
569
+
570
+ ```json
571
+ {
572
+ "hooks": {
573
+ "PreToolUse": [
574
+ {
575
+ "matcher": "Bash",
576
+ "hooks": [
577
+ {
578
+ "type": "command",
579
+ "command": "jq '{hookSpecificOutput: {hookEventName: \"PreToolUse\", permissionDecision: \"allow\", updatedInput: {command: (\"export DEBUG=1; \" + .tool_input.command)}}}'"
580
+ }
581
+ ]
582
+ }
583
+ ]
584
+ }
585
+ }
586
+ ```
587
+
588
+ This prepends `export DEBUG=1;` to all Bash commands.
589
+
590
+ ---
591
+
592
+ **Available Environment Variables**:
593
+
594
+ - `CLAUDE_PROJECT_DIR` - Absolute path to project root
595
+ - `CLAUDE_CODE_REMOTE` - "true" if remote, empty if local
596
+ - `CLAUDE_FILE_PATHS` - Space-separated files (PostToolUse only, not PreToolUse)
597
+
598
+ **Important**: Tool input JSON comes via **stdin**, not via environment variables.
599
+
600
+ ---
601
+
602
+ **How it works**:
603
+
604
+ 1. Claude decides to use Write/Edit/Bash tool
605
+ 2. Hook receives JSON via stdin
606
+ 3. Hook script processes JSON, modifies `updatedInput` fields
607
+ 4. Hook outputs modified JSON to stdout
608
+ 5. Tool executes with modified input
609
+ 6. No post-validation needed (input pre-corrected)
610
+
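The six steps above can be sketched in Python as a pure function from the stdin JSON to the stdout JSON. This is a hedged sketch, not the guide's own hook: `handle_pre_tool_use` and the whitespace-trimming `transform` are hypothetical stand-ins for a real formatter. It sends the full `tool_input` with the changed field overlaid, so the outcome is the same whether the runtime merges or replaces `updatedInput` (an assumption worth verifying).

```python
import json
from typing import Callable


def handle_pre_tool_use(raw: str, transform: Callable[[str], str]) -> str:
    """Turn the stdin JSON of a PreToolUse event into the stdout JSON response."""
    event = json.loads(raw)  # step 2: JSON arrives on stdin
    tool_input = event.get("tool_input", {})
    output = {"hookEventName": "PreToolUse", "permissionDecision": "allow"}

    if event.get("tool_name") == "Write" and "content" in tool_input:
        try:
            new_content = transform(tool_input["content"])  # step 3
        except Exception:
            new_content = tool_input["content"]  # fall back to the original
        # Overlay the changed field on the original input.
        output["updatedInput"] = {**tool_input, "content": new_content}

    return json.dumps({"hookSpecificOutput": output})  # step 4: goes to stdout


sample = json.dumps({
    "hook_event_name": "PreToolUse",
    "tool_name": "Write",
    "tool_input": {"file_path": "/path/to/file.ts", "content": "const a=1   \n"},
})
print(handle_pre_tool_use(sample, lambda text: text.rstrip() + "\n"))
```

Keeping the logic in a function that takes and returns strings makes the hook testable without pipes, which addresses the debugging concern listed under Cons.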
611
+ **Pros**:
612
+
613
+ - Proactive correction (not reactive validation)
614
+ - Cleaner than PostToolUse → Read → Edit cycle
615
+ - No second round-trip (input corrected before execution)
616
+ - Transparent to Claude (sees final result)
617
+
618
+ **Cons**:
619
+
620
+ - Complex JSON manipulation required
621
+ - Must output valid JSON structure exactly
622
+ - Debugging harder (input/output via pipes)
623
+ - Formatting errors can break tool execution
624
+
625
+ **Decision**: Phase 2 OPTIONAL - Use when proactive input modification is needed; otherwise PostToolUse is simpler and easier to debug
626
+
627
+ ---
628
+
629
+ ### Approach 2: PostToolUse Hook (AUTO-VALIDATION) ⭐ RECOMMENDED
630
+
631
+ **Power Level**: ⭐⭐⭐⭐
632
+ **Effort**: Medium (30 min - requires testing)
633
+ **Impact**: Catches issues immediately after changes
634
+
635
+ **What it does**: Automatically runs quality checks AFTER Claude makes changes
636
+
637
+ **⚠️ SECURITY WARNING**: Hooks run automatically with your environment's credentials. Review carefully before adding. Malicious hooks can exfiltrate data.
638
+
639
+ **Configuration**: Add to `~/.claude/settings.json`
640
+
641
+ ```json
642
+ {
643
+ "hooks": {
644
+ "PostToolUse": [
645
+ {
646
+ "matcher": "Write",
647
+ "hooks": [
648
+ {
649
+ "type": "command",
650
+ "command": "bash -c 'echo \"🔍 Validating changes...\"; npm run lint 2>&1 | tail -5 || true; [ -f tsconfig.json ] && npx tsc --noEmit 2>&1 | tail -10 || true; echo \"✅ Validation complete\"'"
651
+ }
652
+ ]
653
+ },
654
+ {
655
+ "matcher": "Edit",
656
+ "hooks": [
657
+ {
658
+ "type": "command",
659
+ "command": "bash -c 'echo \"🔍 Validating changes...\"; npm run lint 2>&1 | tail -5 || true; [ -f tsconfig.json ] && npx tsc --noEmit 2>&1 | tail -10 || true; echo \"✅ Validation complete\"'"
660
+ }
661
+ ]
662
+ }
663
+ ]
664
+ }
665
+ }
666
+ ```
667
+
668
+ **Environment variables available**:
669
+
670
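+ - `$CLAUDE_TOOL_NAME` - Tool that was used (Write, Edit, etc.); the same value arrives as `tool_name` in the stdin JSON
671
+ - `$CLAUDE_FILE_PATHS` - Space-separated list of modified files; the edited path also arrives as `tool_input.file_path` in the stdin JSON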
672
+
673
+ **How it works**:
674
+
675
+ 1. Claude uses Write or Edit tool
676
+ 2. Hook fires automatically after tool completes
677
+ 3. Runs linter + type checker
678
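+ 4. The tool has already run, so the hook cannot undo it; it can only report
677
+ 5. Claude responds to whatever output reaches it (with exit code 0, stdout may appear only in the transcript; exit code 2 sends stderr back to Claude, so verify against the hooks reference for your version)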
680
+
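The flow above can be sketched as a script-style hook that reads the edited path from the stdin JSON and reports failures. This is a minimal sketch under stated assumptions: `validate`, `CHECKED_SUFFIXES`, and the injected `run_check` callable are hypothetical, and it assumes exit code 2 is how a hook reports a problem back to Claude.

```python
import json
from typing import Callable, Tuple

CHECKED_SUFFIXES = (".ts", ".tsx", ".js", ".jsx")  # assumed file filter


def validate(
    raw: str,
    run_check: Callable[[str], Tuple[int, str]],
    max_lines: int = 10,
) -> Tuple[int, str]:
    """Return (exit_code, message) for a PostToolUse event."""
    event = json.loads(raw)
    file_path = event.get("tool_input", {}).get("file_path", "")
    if not file_path.endswith(CHECKED_SUFFIXES):
        return 0, ""  # not a source file: stay silent
    returncode, output = run_check(file_path)
    if returncode == 0:
        return 0, ""  # checks passed: nothing to report
    # Keep only the tail of the output to limit noise, like `tail -10`.
    tail = "\n".join(output.splitlines()[-max_lines:])
    return 2, f"Validation failed for {file_path}:\n{tail}"


# Hypothetical event and a fake checker that always reports one error.
event = json.dumps({"tool_name": "Edit", "tool_input": {"file_path": "src/a.ts"}})
print(validate(event, lambda path: (1, "src/a.ts:3 unused variable 'x'")))
```

A real hook would pass the message to stderr and call `sys.exit` with the returned code; `run_check` could wrap `subprocess.run` around the project's linter.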
681
+ **Pros**:
682
+
683
+ - Deterministic - always runs after Write/Edit
684
+ - Catches issues immediately
685
+ - Integrates with existing tooling (npm scripts)
686
+ - No false positives (only fires on actual code changes)
687
+
688
+ **Cons**:
689
+
690
+ - Tool already executed (can't block, only validate)
691
+ - May slow iteration if tools are slow
692
+ - Requires npm scripts configured
693
+ - Noise if many warnings
694
+
695
+ **Decision**: Phase 2 RECOMMENDED - Pair with Quality Reviewer Skill
696
+
697
+ ---
698
+
699
+ ### Approach 3: Custom Slash Commands (WORKFLOW SHORTCUTS)
700
+
701
+ **Power Level**: ⭐⭐⭐⭐
702
+ **Effort**: Low (15 min)
703
+ **Impact**: 60% reduction in prompt length
704
+
705
+ **What it does**: Replace long prompts with short commands
706
+
707
+ #### Command 1: `/critique`
708
+
709
+ **File**: `~/.claude/commands/critique.md`
710
+
711
+ ```markdown
712
+ ---
713
+ description: Run quality critique on current work
714
+ ---
715
+
716
+ Double check and critique your work again just in case.
717
+
718
+ Evaluate against these criteria:
719
+
720
+ 1. **Correctness**
721
+ - Will it actually work?
722
+ - Edge cases handled?
723
+ - Error handling complete?
724
+
725
+ 2. **Elegance**
726
+ - Simplest solution possible?
727
+ - Any bloat or over-engineering?
728
+ - Readable and maintainable?
729
+
730
+ 3. **Standards Adherence**
731
+ - Follows project CLAUDE.md conventions?
732
+ - Matches existing code patterns?
733
+ - Latest best practices for: $ARGUMENTS
734
+
735
+ 4. **Testing**
736
+ - Can we test this?
737
+ - Test strategy clear?
738
+
739
+ Ask any non-obvious questions you need answered that you can't research yourself online or in the codebase. Don't be lazy.
740
+
741
+ Provide your critique, then wait for approval before implementing.
742
+ ```
743
+
744
+ **Usage**:
745
+
746
+ - `/critique react typescript blades`
747
+ - `/critique electron desktop-app`
748
+
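To make the `$ARGUMENTS` placeholder concrete, the sketch below shows the text Claude would receive for the first usage example. It only illustrates the substitution's effect; `expand_arguments` is a hypothetical helper, not how Claude Code implements commands.

```python
def expand_arguments(template: str, invocation: str) -> str:
    """Substitute everything after the command name for $ARGUMENTS."""
    _, _, arguments = invocation.partition(" ")
    return template.replace("$ARGUMENTS", arguments.strip())


template = "- Latest best practices for: $ARGUMENTS"
print(expand_arguments(template, "/critique react typescript blades"))
```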
749
+ #### Command 2: `/implement-quality`
750
+
751
+ **File**: `~/.claude/commands/implement-quality.md`
752
+
753
+ ```markdown
754
+ ---
755
+ description: Implement proposed changes with quality verification
756
+ ---
757
+
758
+ Yes, implement your suggested changes.
759
+
760
+ But first, run through this quality checklist:
761
+
762
+ 1. ✓ Verify against latest documentation for all libraries used
763
+ 2. ✓ Ensure correct, elegant, and avoids bloat
764
+ 3. ✓ Double-check adherence to project standards (read CLAUDE.md if needed)
765
+ 4. ✓ Confirm test coverage is adequate
766
+ 5. ✓ Consider edge cases and error handling
767
+
768
+ Then proceed with implementation.
769
+
770
+ After implementation:
771
+
772
+ - Run all relevant tests
773
+ - Report results before marking complete
774
+ ```
775
+
776
+ **Usage**: `/implement-quality` (replaces "yes please")
777
+
778
+ #### Command 3: `/latest-docs`
779
+
780
+ **File**: `~/.claude/commands/latest-docs.md`
781
+
782
+ ```markdown
783
+ ---
784
+ description: Look up latest documentation before proceeding
785
+ ---
786
+
787
+ Before proceeding, look up the very latest documentation for: $ARGUMENTS
788
+
789
+ For each library/framework, verify:
790
+
791
+ - API compatibility with our version (check package.json)
792
+ - Best practices haven't changed since your training data
793
+ - No deprecated patterns in our current approach
794
+ - New features that might be better
795
+
796
+ Then provide your findings and continue with your recommendation.
797
+ ```
798
+
799
+ **Usage**:
800
+
801
+ - `/latest-docs @anthropic/sdk react-19`
802
+ - `/latest-docs playwright zustand`
803
+
804
+ #### Command 4: `/check-and-proceed`
805
+
806
+ **File**: `~/.claude/commands/check-and-proceed.md`
807
+
808
+ ```markdown
809
+ ---
810
+ description: Comprehensive quality check then immediate implementation
811
+ ---
812
+
813
+ Run full quality verification, then implement if passing:
814
+
815
+ **STEP 1: Quality Check**
816
+
817
+ - Correct? Elegant? Standards-compliant?
818
+ - Latest docs verified?
819
+ - Tests planned?
820
+ - Bloat avoided?
821
+
822
+ **STEP 2: If all checks pass**
823
+
824
+ - Implement immediately (approval implicit)
825
+ - Run tests
826
+ - Report results
827
+
828
+ **STEP 3: If issues found**
829
+
830
+ - List concerns
831
+ - Wait for user guidance
832
+
833
+ This is a "one-shot" command - check thoroughly, then proceed automatically if confident.
834
+ ```
835
+
836
+ **Usage**: `/check-and-proceed` (combines critique + approval + implementation)
837
+
838
+ **Pros**:
839
+
840
+ - Immediate productivity boost
841
+ - Easy to create and modify
842
+ - Works across all projects (global commands in ~/.claude/commands/)
843
+ - No complex scripting required
844
+
845
+ **Cons**:
846
+
847
+ - Still requires manual invocation
848
+ - Need to remember which command to use when
849
+
850
+ **Decision**: Implement in Phase 1 (TODAY)
851
+
852
+ ---
853
+
854
+ ### Approach 4: Agent Skills (AUTO-INVOKED QUALITY REVIEW) ⭐ RECOMMENDED
855
+
856
+ **Power Level**: ⭐⭐⭐⭐⭐
857
+ **Effort**: High (1-2 hours - includes testing)
858
+ **Impact**: Eliminates need for manual quality check invocation
859
+
860
+ **What it does**: Claude automatically invokes quality checks when relevant (no manual trigger)
861
+
862
+ **How Skills Work**: Model-invoked—Claude autonomously decides when to use them based on your request and the Skill's description. Skills are tools Claude calls, like Read or WebFetch.
863
+
864
+ **File Structure**:
865
+
866
+ ```
867
+ ~/.claude/skills/quality-reviewer/
868
+ ├── SKILL.md # Main Skill definition (YAML frontmatter + instructions)
869
+ └── examples.md # Optional: Supporting documentation
870
+ ```
871
+
872
+ **SKILL.md Template** (condensed—see full implementation in appendix):
873
+
874
+ ```yaml
875
+ ---
876
+ name: quality-reviewer
877
+ description: |
878
+ Automatically review code changes for correctness, elegance, and standards adherence.
879
+
880
+ Use PROACTIVELY when:
881
+ - About to Write or Edit code files
882
+ - User says "yes"/"proceed"/"implement"
883
+ - Proposing architectural changes
884
+
885
+ Perform checks:
886
+ - Latest docs verified (WebFetch/WebSearch package.json versions)
887
+ - Correctness (edge cases, errors, type safety)
888
+ - Elegance (avoid bloat, simplest solution)
889
+ - Standards (read CLAUDE.md, match patterns)
890
+ - Testability (test strategy clear)
891
+
892
+ Return: PROCEED / REVISE / USER INPUT
893
+
894
+ allowed-tools:
895
+ - Read
896
+ - Grep
897
+ - Glob
898
+ - WebFetch
899
+ - WebSearch
900
+ ---
901
+ # Quality Review Protocol
902
+
903
+ [Full protocol with 6 steps: Doc Verification, Standards Check, Quality Eval, Recommendation, Escalation Rules, Quality Gates]
911
+ ```
912
+
913
+ **Key Features**:
914
+
915
+ - **Dynamic doc discovery**: Reads package.json → searches "[library] v[version] documentation"
916
+ - **Context-aware**: Infers task context from keywords + file paths
917
+ - **Version-specific**: Uses `<env>` current date for recency validation
918
+ - **Structured output**: Returns PROCEED/REVISE/USER INPUT with reasoning
919
+
920
+ **⚠️ WARNING**: Keep Skill description under 1024 chars—it's critical for Claude's activation logic. Split into multiple focused Skills if needed.
921
+
922
+ **Pros**:
923
+
924
+ - Completely automatic - no manual invocation
925
+ - Works across all projects
926
+ - Most intelligent approach (Claude decides when to use)
927
+ - Comprehensive quality checks
928
+ - Has conversation context (unlike Hooks/MCP)
929
+
930
+ **Cons**:
931
+
932
+ - Complex to set up and test
933
+ - May fire at wrong times (needs description tuning)
934
+ - Harder to debug if misbehaving
935
+ - Requires clear, concise description (<1024 chars)
936
+
937
+ **Decision**: Phase 2 RECOMMENDED - Primary automation mechanism
938
+
939
+ **Note**: The full Skill implementation with examples is available in the appendix; alternatively, create it as a separate file at `~/.claude/skills/quality-reviewer/SKILL.md`
940
+
941
+ ---
942
+
943
+ ### Skills: Activation Reliability (CRITICAL FOR SUCCESS)
944
+
945
+ **Problem**: Skills with vague descriptions don't activate reliably. Claude uses the description to decide when to invoke your Skill.
946
+
947
+ **⚠️ Common Anti-Patterns That Prevent Activation**:
948
+
949
+ #### ❌ Anti-Pattern 1: Too Vague
950
+
951
+ ```yaml
952
+ ---
953
+ name: document-helper
954
+ description: |
955
+ Helps with documents and files. Use when working with documents.
956
+ ---
957
+ ```
958
+
959
+ **Why it fails**:
960
+
961
+ - "Helps with" is not actionable
962
+ - "documents" is too broad (code? markdown? PDFs?)
963
+ - No specific trigger terms
964
+
965
+ #### ✅ GOOD: Specific and Actionable
966
+
967
+ ```yaml
968
+ ---
969
+ name: pdf-processor
970
+ description: |
971
+ Extract text and tables from PDF files, fill forms, merge documents.
972
+
973
+ Use PROACTIVELY when:
974
+ - User mentions "PDF", "extract", "form", or "merge"
975
+ - File path ends with .pdf
976
+ - Task involves document manipulation
977
+
978
+ Returns: Extracted text, table data, or confirmation of merge/fill
979
+
980
+ allowed-tools: Read, Bash
981
+ ---
982
+ ```
983
+
984
+ **Why it works**:
985
+
986
+ - Specific actions (extract, fill, merge)
987
+ - Clear trigger terms (PDF, extract, form, merge)
988
+ - File pattern (.pdf)
989
+ - States what it returns
990
+
991
+ ---
992
+
993
+ #### ❌ Anti-Pattern 2: Too Broad Scope
994
+
995
+ ```yaml
996
+ ---
997
+ name: code-reviewer
998
+ description: |
999
+ Reviews all code for any issues. Use whenever writing code.
1000
+ ---
1001
+ ```
1002
+
1003
+ **Why it fails**:
1004
+
1005
+ - "all code" → fires on every Write/Edit (false positives)
1006
+ - "any issues" → unclear what to check
1007
+ - No prioritization
1008
+
1009
+ #### ✅ GOOD: Narrow, Focused Scope
1010
+
1011
+ ```yaml
1012
+ ---
1013
+ name: quality-gates
1014
+ description: |
1015
+ Final review before Write/Edit tools for TypeScript/JavaScript files.
1016
+
1017
+ Use when Claude is ABOUT TO:
1018
+ - Write new .ts/.tsx/.js/.jsx files
1019
+ - Edit existing source files (not config/markdown)
1020
+
1021
+ Checks:
1022
+ - Latest docs verified (WebFetch package.json versions)
1023
+ - Edge cases handled (null, undefined, empty)
1024
+ - Avoids bloat (simplest solution)
1025
+
1026
+ Returns: PROCEED / REVISE / USER INPUT
1027
+
1028
+ allowed-tools: Read, Grep, WebFetch, WebSearch
1029
+ ---
1030
+ ```
1031
+
1032
+ **Why it works**:
1033
+
1034
+ - Specific timing ("ABOUT TO Write/Edit")
1035
+ - File type restrictions (.ts/.tsx/.js/.jsx)
1036
+ - Concrete checks (docs, edge cases, bloat)
1037
+ - Structured output (PROCEED/REVISE/USER INPUT)
1038
+
1039
+ ---
1040
+
1041
+ #### ❌ Anti-Pattern 3: No Trigger Keywords
1042
+
1043
+ ```yaml
1044
+ ---
1045
+ name: architecture-helper
1046
+ description: |
1047
+ Assists with design and structure decisions.
1048
+ ---
1049
+ ```
1050
+
1051
+ **Why it fails**:
1052
+
1053
+ - No keywords for Claude to match
1054
+ - Vague verb ("assists") instead of a concrete action
1055
+ - No examples of when to use
1056
+
1057
+ #### ✅ GOOD: Explicit Trigger Keywords
1058
+
1059
+ ```yaml
1060
+ ---
1061
+ name: architecture-monitor
1062
+ description: |
1063
+ Suggest ARCHITECTURE.md updates when making architectural decisions.
1064
+
1065
+ Use when user discusses:
1066
+ - Technology choices (state management, database, frameworks)
1067
+ - Data model design (entities, relationships, schema)
1068
+ - Project-wide patterns (error handling, API structure)
1069
+ - "Why" decisions (trade-offs, alternatives considered)
1070
+
1071
+ Prompt: "Should I document this decision in ARCHITECTURE.md?"
1072
+
1073
+ allowed-tools: Read, Grep
1074
+ ---
1075
+ ```
1076
+
1077
+ **Why it works**:
1078
+
1079
+ - Specific keywords (technology, database, frameworks, data model, schema)
1080
+ - Clear action ("Suggest updates", "Prompt")
1081
+ - Examples of trigger phrases
1082
+
1083
+ ---
1084
+
1085
+ #### ❌ Anti-Pattern 4: Description Too Long
1086
+
1087
+ ```yaml
1088
+ ---
1089
+ name: quality-reviewer
1090
+ description: |
1091
+ [650 lines of detailed checking logic, edge cases, examples,
1092
+ comprehensive documentation about every possible scenario,
1093
+ full decision trees, extensive examples...]
1094
+ ---
1095
+ ```
1096
+
1097
+ **Why it fails**:
1098
+
1099
+ - Descriptions over 1024 chars reduce activation reliability (documented limit)
1100
+ - Claude can't parse quickly during activation decision
1101
+ - Should use progressive disclosure (supporting files)
1102
+
1103
+ #### ✅ GOOD: Concise Core, Progressive Disclosure
1104
+
1105
+ ```yaml
1106
+ ---
1107
+ name: quality-reviewer
1108
+ description: |
1109
+ Review code before Write/Edit for correctness, elegance, standards.
1110
+
1111
+ Use when:
1112
+ - About to propose code changes
1113
+ - User approves implementation ("yes", "proceed")
1114
+
1115
+ Checks latest docs, edge cases, bloat avoidance, project conventions.
1116
+
1117
+ Returns: PROCEED / REVISE / USER INPUT
1118
+
1119
+ (See examples.md for full review protocol)
1120
+
1121
+ allowed-tools: Read, Grep, WebFetch, WebSearch
1122
+ ---
1123
+ ```
1124
+
1125
+ **Why it works**:
1126
+
1127
+ - Core description <1024 chars
1128
+ - Supporting details in examples.md (loaded only when Skill runs)
1129
+ - Clear, scannable structure
1130
+
1131
+ ---
1132
+
1133
+ ### Skill Naming Constraints (CRITICAL)
1134
+
1135
+ **`name` field requirements**:
1136
+
1137
+ - **Max 64 characters**
1138
+ - **Lowercase letters, numbers, and hyphens only**
1139
+ - No uppercase, no underscores, no spaces
1140
+ - Must be unique across all Skills
1141
+
1142
+ **Examples**:
1143
+
1144
+ ✅ **VALID**:
1145
+
1146
+ ```yaml
1147
+ name: quality-reviewer # ✅ Lowercase, hyphens
1148
+ name: docs-verifier-v2 # ✅ Numbers allowed
1149
+ name: test-advisor # ✅ Well under the 64-char limit
1150
+ ```
1151
+
1152
+ ❌ **INVALID**:
1153
+
1154
+ ```yaml
1155
+ name: QualityReviewer # ❌ Uppercase not allowed
1156
+ name: quality_reviewer # ❌ Underscores not allowed
1157
+ name: quality reviewer # ❌ Spaces not allowed
1158
+ name: my-very-long-skill-name-that-exceeds-sixty-four-character-maximum # ❌ >64 chars
1159
+ ```
1160
+
1161
+ **What happens if invalid**: Skill won't load, Claude can't use it, no error message shown
1162
+
1163
+ **Verify before creating**:
1164
+
1165
+ ```bash
1166
+ # Check name length
1167
+ echo -n "quality-reviewer" | wc -c # Should be ≤64
1168
+
1169
+ # Check format (prints "Valid" only if the name is all lowercase letters, numbers, and hyphens)
1170
+ echo "quality-reviewer" | grep -qE '^[a-z0-9-]+$' && echo "Valid" || echo "Invalid"
1171
+ ```
1172
+
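The two shell checks can also be combined into one function that reports why a name is rejected. This is a minimal sketch of the constraints listed above; `skill_name_problems` is a hypothetical helper, and it does not check uniqueness across Skills.

```python
import re

NAME_PATTERN = re.compile(r"[a-z0-9-]+")
MAX_NAME_LENGTH = 64


def skill_name_problems(name: str) -> list[str]:
    """Return a list of constraint violations; an empty list means valid."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"{len(name)} chars, limit is {MAX_NAME_LENGTH}")
    if not NAME_PATTERN.fullmatch(name):
        problems.append("only lowercase letters, numbers, and hyphens are allowed")
    return problems


for candidate in ("quality-reviewer", "QualityReviewer", "quality_reviewer"):
    print(candidate, skill_name_problems(candidate) or "valid")
```

Because an invalid name fails silently, running a check like this before creating the Skill gives the error message that loading does not.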
1173
+ ---
1174
+
1175
+ ### Best Practices for Skill Descriptions
1176
+
1177
+ **✅ DO**:
1178
+
1179
+ - Use specific technical terms ("TypeScript", "React hooks", "Electron IPC")
1180
+ - List exact trigger keywords Claude should look for
1181
+ - State what the Skill returns (structured output)
1182
+ - Include file patterns if relevant (.ts, .test.js, package.json)
1183
+ - Use active voice ("Check", "Verify", "Suggest")
1184
+ - Keep <1024 chars for core description
1185
+ - Use allowed-tools to restrict capabilities
1186
+ - **Verify name field follows constraints** (see above)
1187
+
1188
+ **❌ DON'T**:
1189
+
1190
+ - Use vague terms ("helps", "assists", "works with")
1191
+ - Say "Use when needed" (Claude can't determine "needed")
1192
+ - Omit trigger keywords
1193
+ - Write >1024 char descriptions (reduces reliability)
1194
+ - Use passive voice ("can be used for")
1195
+ - Leave scope unbounded ("all code", "any files")
1196
+ - **Use invalid characters in name field** (uppercase, underscores, spaces)
1197
+
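The DO/DON'T list above lends itself to a rough automated check. The sketch below is a heuristic under stated assumptions: the phrase lists are taken from this section's examples, `lint_description` is a hypothetical helper, and passing it says nothing about whether Claude will actually activate the Skill.

```python
VAGUE_TERMS = ("helps", "assists", "works with", "use when needed", "can be used for")
UNBOUNDED_SCOPE = ("all code", "any files", "any issues")
MAX_DESCRIPTION_CHARS = 1024


def lint_description(description: str) -> list[str]:
    """Return warnings for patterns this guide associates with poor activation."""
    text = description.lower()
    warnings = []
    if len(description) > MAX_DESCRIPTION_CHARS:
        warnings.append(f"{len(description)} chars, limit is {MAX_DESCRIPTION_CHARS}")
    for term in VAGUE_TERMS:
        if term in text:
            warnings.append(f'vague phrasing: "{term}"')
    for term in UNBOUNDED_SCOPE:
        if term in text:
            warnings.append(f'unbounded scope: "{term}"')
    # Look for an explicit trigger section such as "Use when:".
    if "use when" not in text and "use proactively when" not in text:
        warnings.append("no explicit trigger section")
    return warnings


print(lint_description("Assists with design and structure decisions."))
```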
1198
+ ---
1199
+
1200
+ ### Testing Skill Activation
1201
+
1202
+ **After creating a Skill, test it:**
1203
+
1204
+ 1. **Positive test** (should activate):
1205
+
1206
+ ```
1207
+ You: [Say phrase with trigger keywords]
1208
+ Claude: [Should invoke Skill as tool]
1209
+ ```
1210
+
1211
+ 2. **Negative test** (should NOT activate):
1212
+
1213
+ ```
1214
+ You: [Say unrelated phrase]
1215
+ Claude: [Should NOT invoke Skill]
1216
+ ```
1217
+
1218
+ 3. **Edge case test**:
1219
+ ```
1220
+ You: [Ambiguous phrase - could trigger or not]
1221
+ Claude: [Document behavior for refinement]
1222
+ ```
1223
+
1224
+ **Refinement loop**:
1225
+
1226
+ - False positive (activated when shouldn't) → Narrow scope, add exclusions
1227
+ - False negative (didn't activate when should) → Add trigger keywords, broaden slightly
1228
+ - Iterate description based on real usage patterns
1229
+
1230
+ ---
1231
+
1232
+ ### Skills Activation Reality Check (CRITICAL)
1233
+
1234
+ **⚠️ SET REALISTIC EXPECTATIONS: Skills Don't Auto-Activate Reliably**
1235
+
1236
+ **Plan assumption**: Skills will auto-activate reliably when trigger conditions are met, eliminating need for manual prompts.
1237
+
1238
+ **Actual behavior reported by users**:
1239
+
1240
+ > "The #1 problem with Claude Code skills is that they don't activate on their own. Claude Code skills just sit there and you have to remember to use them."
1241
+
1242
+ **Reality**:
1243
+
1244
+ - ⚠️ **Skills may not activate** even with well-written descriptions
1245
+ - ⚠️ **Activation rate varies** significantly (users report 30-70% success rate)
1246
+ - ⚠️ **No way to force activation** - Claude decides autonomously
1247
+ - ⚠️ **Description tuning is trial-and-error** - no guaranteed formula
1248
+ - ⚠️ **Unpredictable** - May activate perfectly for days, then stop
1249
+
1250
+ **Impact on automation plan**:
1251
+
1252
+ **Original expectation**: Phase 2-3 will "eliminate all 347 quality check prompts" (100% automation)
1253
+
1254
+ **Realistic expectation**: Skills will eliminate 50-70% of those prompts (roughly 175-245 of 347), leaving about 100-170 to trigger manually
1255
+
1256
+ ---
1257
+
1258
+ ### Mitigation Strategies
1259
+
1260
+ **1. Always Provide Slash Command Alternatives**
1261
+
1262
+ For every Skill, create an equivalent slash command:
1263
+
1264
+ | Skill | Slash Command Alternative | Use When |
1265
+ | ---------------------- | ------------------------- | ------------------------ |
1266
+ | Quality Reviewer Skill | `/critique` | Skill doesn't activate |
1267
+ | Docs Verifier Skill | `/latest-docs [library]` | Need to force docs check |
1268
+ | Feature Kickoff Skill | `/feature-start` | Starting new feature |
1269
+
1270
+ **2. Track Activation Rate**
1271
+
1272
+ Monitor how often Skills activate vs. expected:
1273
+
1274
+ ```bash
1275
+ # Create activation log
1276
+ touch ~/.claude/skills/activation-log.txt
1277
+
1278
+ # Format: Date | Skill Name | Activated (Y/N) | Expected (Y/N)
1279
+ # 2025-10-30 | quality-reviewer | N | Y
1280
+ # 2025-10-30 | docs-verifier | Y | Y
1281
+ # 2025-10-30 | quality-reviewer | Y | Y
1282
+ ```
1283
+
1284
+ Calculate success rate:
1285
+
1286
+ ```bash
1287
+ # After 1 week, check activation rate
1288
+ grep quality-reviewer ~/.claude/skills/activation-log.txt | \
1289
+ awk -F' *[|] *' '$4=="Y" {expected++; if ($3=="Y") hit++} END {if (expected) printf "Rate: %.0f%%\n", hit/expected*100}'
1290
+ ```
1291
+
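The same pipe-delimited log can be parsed in Python, which also makes it easy to separate false negatives from false positives for the refinement loop. This is a minimal sketch; `activation_stats` and its result keys are hypothetical names.

```python
from collections import Counter


def activation_stats(log_text: str, skill: str) -> dict:
    """Summarize 'Date | Skill Name | Activated | Expected' lines for one Skill."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 4 or fields[1] != skill:
            continue  # skips blank lines, the format comment, and other Skills
        _, _, activated, expected = fields
        if expected == "Y":
            counts["hit" if activated == "Y" else "false_negative"] += 1
        elif activated == "Y":
            counts["false_positive"] += 1
    expected_total = counts["hit"] + counts["false_negative"]
    rate = 100 * counts["hit"] / expected_total if expected_total else None
    return {
        "rate_percent": rate,
        "hits": counts["hit"],
        "false_negatives": counts["false_negative"],
        "false_positives": counts["false_positive"],
    }


log = """2025-10-30 | quality-reviewer | N | Y
2025-10-30 | docs-verifier | Y | Y
2025-10-30 | quality-reviewer | Y | Y
"""
print(activation_stats(log, "quality-reviewer"))
```

The rate is computed only over entries where activation was expected, so unexpected activations are counted separately instead of inflating the result.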
1292
+ **3. Iterate Descriptions Based on Real Usage**
1293
+
1294
+ **Week 1**: Deploy Skill with initial description
1295
+ **Week 2**: Review activation log, identify false negatives
1296
+ **Week 3**: Add missed trigger keywords to description
1297
+ **Week 4**: Re-test and measure improvement
1298
+
1299
+ **Example iteration**:
1300
+
1301
+ ```yaml
1302
+ # v1 (Week 1) - 40% activation rate
1303
+ description: "Review code for quality before Write/Edit"
1304
+
1305
+ # v2 (Week 3) - Added trigger keywords - 60% activation rate
1306
+ description: "Review code before Write/Edit. Use when proposing changes, user says 'implement', 'fix', 'add', 'create', or 'yes'."
1307
+
1308
+ # v3 (Week 5) - Added file patterns - 70% activation rate
1309
+ description: "Review TypeScript/JavaScript code (.ts, .tsx, .js, .jsx) before Write/Edit. Use when proposing changes, user says 'implement', 'fix', 'add', 'create', or 'yes'."
1310
+ ```
1311
+
1312
+ **4. Set Realistic Automation Goals**
1313
+
1314
+ **❌ Unrealistic**: "Eliminate 100% of quality check prompts"
1315
+ **✅ Realistic**: "Reduce quality check prompts by 50-70%"
1316
+
1317
+ **Success metrics adjustment**:
1318
+
1319
+ | Metric | Original Target | Realistic Target |
1320
+ | -------------------------------- | --------------- | ---------------- |
1321
+ | Quality check prompts eliminated | 347 (100%) | 175-245 (50-70%) |
1322
+ | Prompt length reduction | 80% | 40-60% |
1323
+ | Manual invocation still needed | 0 | 30-50% |
1324
+
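The ranges in the table follow from simple arithmetic on the 347 prompts. The sketch below works it out with integer math, assuming the 50-70% activation range quoted above; the table's 175-245 is the same result rounded, and `automation_range` is a hypothetical helper.

```python
TOTAL_PROMPTS = 347  # quality check prompts counted in the plan


def automation_range(total: int, low_pct: int, high_pct: int):
    """Return ((min, max) automated, (min, max) still manual) prompt counts."""
    automated = (total * low_pct // 100, total * high_pct // 100)
    still_manual = (total - automated[1], total - automated[0])
    return automated, still_manual


automated, still_manual = automation_range(TOTAL_PROMPTS, 50, 70)
print("automated:", automated, "still manual:", still_manual)
```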
1325
+ **5. Document Manual Fallback Procedures**
1326
+
1327
+ **When Skill doesn't activate, user should**:
1328
+
1329
+ 1. Use slash command alternative (`/critique`)
1330
+ 2. OR explicitly mention Skill name: "use the quality-reviewer skill"
1331
+ 3. OR use manual prompt: "double check and critique this"
1332
+
1333
+ **Add to project CLAUDE.md**:
1334
+
1335
+ ```markdown
1336
+ ## When Skills Don't Activate
1337
+
1338
+ If Quality Reviewer Skill doesn't activate:
1339
+
1340
+ - Type: `/critique [stack]`
1341
+ - Or say: "use the quality-reviewer skill"
1342
+ - Or manual: "double check: correct? elegant? standards? latest docs?"
1343
+ ```
1344
+
1345
+ ---
1346
+
1347
+ ### Adjusted Success Criteria
1348
+
1349
+ **Phase 2 success** (Skills + Hooks):
1350
+
1351
+ **Original**:
1352
+
1353
+ - ✓ Quality Reviewer Skill activates before implementations (100%)
1354
+ - ✓ Zero manual quality check invocations
1355
+
1356
+ **Realistic**:
1357
+
1358
+ - ✓ Quality Reviewer Skill activates 50-70% of the time
1359
+ - ✓ Slash commands provide 1-step fallback for missed activations
1360
+ - ✓ Overall prompt reduction: 50-70% (not 100%)
1361
+ - ✓ Manual quality checks still needed 30-50% of time
1362
+
1363
+ **Phase 3 success adjusted accordingly**.
1364
+
1365
+ ---
1366
+
1367
+ ### Approach 5: Enhanced CLAUDE.md (BASELINE BEHAVIOR)
1368
+
1369
+ **Power Level**: ⭐⭐⭐
1370
+ **Effort**: Low (10 min)
1371
+ **Impact**: Establishes baseline expectations
1372
+
1373
+ **What it does**: Embeds quality standards directly in Claude's instructions (every session)
1374
+
1375
+ **File**: `~/.claude/CLAUDE.md` (add new section)
1376
+
1377
+ ```markdown
1378
+ ## Workflow Standards (CRITICAL - Always Follow)
1379
+
1380
+ ### 0. Explore → Plan → Code Workflow (ALWAYS USE)
1381
+
1382
+ **Critical Best Practice** (Anthropic 2025): Claude tends to jump straight to coding without this.
1383
+
1384
+ **Before ANY code changes:**
1385
+
1386
+ **EXPLORE FIRST:**
1387
+
1388
+ 1. Read relevant existing files (do NOT write code yet)
1389
+ 2. Understand current patterns and architecture
1390
+ 3. Identify what needs to change
1391
+
1392
+ **THEN PLAN:**
+
+ 4. Use "think hard" or "ultrathink" for complex problems
+ 5. Create implementation plan
+ 6. Get user approval on plan
1393
+
1394
+ **THEN CODE:**
+
+ 7. Implement with explicit verification steps
+ 8. Run tests
+ 9. Commit results
1395
+
1396
+ **Example:**
1397
+ ```
1398
+
1399
+ User: "Add dark mode toggle"
1400
+
1401
+ ❌ BAD: Immediately propose component code
1402
+ ✅ GOOD:
1403
+ "Let me explore first:
1404
+
1405
+ 1. Reading existing theme system...
1406
+ 2. Checking current state management...
1407
+ 3. Understanding toggle patterns in codebase...
1408
+
1409
+ [After exploration]
1410
+ Now I'll think hard about the implementation plan:
1411
+
1412
+ - Option A: CSS variables + context
1413
+ - Option B: Tailwind dark mode classes
1414
+ - Recommend A because [reasoning]
1415
+
1416
+ Should I proceed with detailed plan?"
1417
+
1418
+ ```
1419
+
1420
+ **When to skip exploration:**
1421
+ - Trivial changes (typos, formatting)
1422
+ - User explicitly says "just do it"
1423
+
1424
+ ---
1425
+
1426
+ ### 1. Latest Documentation Check
1427
+
1428
+ NEVER assume API compatibility. Training data is stale (cutoff: January 2025).
1429
+
1430
+ **Process:**
1431
+ - Identify all external libraries/frameworks used
1432
+ - Check package.json for versions (if applicable)
1433
+ - Look up latest documentation using WebFetch or WebSearch
1434
+ - Verify API still exists and works as expected
1435
+ - Note any deprecated patterns or better alternatives
1436
+ - Report findings before proposing implementation
1437
+
1438
+ **Example:**
1439
+ ```
1440
+
1441
+ 📚 Verified latest docs:
1442
+
1443
+ - @anthropic/sdk v0.38.0: streaming API unchanged ✓
1444
+ - react v18.3.1: no breaking changes ✓
1445
+
1446
+ ```
1447
+
1448
+ ### 2. Self-Critique Against Quality Criteria
1449
+
1450
+ Every proposal must pass these gates:
1451
+
1452
+ #### Correctness
1453
+ - Will it actually work?
1454
+ - Edge cases handled? (null, undefined, empty, boundaries)
1455
+ - Error handling complete?
1456
+ - Type-safe (TypeScript projects)?
1457
+
1458
+ #### Elegance
1459
+ - Simplest solution possible?
1460
+ - Any bloat or over-engineering?
1461
+ - Could it be simpler?
1462
+ - Readable and maintainable?
1463
+
1464
+ #### Standards Adherence
1465
+ - Matches project CLAUDE.md conventions?
1466
+ - Follows existing code patterns?
1467
+ - File/folder organization correct?
1468
+ - Naming conventions followed?
1469
+
1470
+ #### Testability
1471
+ - Can we write tests for this?
1472
+ - Test strategy clear? (unit/integration/e2e)
1473
+ - Edge cases testable?
1474
+
1475
+ ### 3. Output Format for Proposals
1476
+
1477
+ Always structure proposals like this:
1478
+
1479
+ ```
1480
+
1481
+ PROPOSAL:
1482
+ [What you want to implement]
1483
+
1484
+ QUALITY CHECK:
1485
+ ✓ Latest docs verified: [libraries checked]
1486
+ ✓ Correct: [reasoning - edge cases, errors]
1487
+ ✓ Elegant: [why this approach, not simpler]
1488
+ ✓ Standards: [how it matches conventions]
1489
+ ✓ Testable: [testing strategy]
1490
+
1491
+ CONCERNS (if any):
1492
+ ⚠️ [Trade-offs, limitations, or uncertainties]
1493
+
1494
+ READY FOR APPROVAL
1495
+ (Awaiting "yes"/"proceed"/"implement" to execute)
1496
+
1497
+ ```
1498
+
1499
+ ### 4. When User Approves
1500
+
1501
+ When user says **"yes"**, **"proceed"**, **"implement"**, or similar:
1502
+
1503
+ 1. ✓ Implementation approved - execute immediately
1504
+ 2. ✓ Run relevant tests after implementation
1505
+ 3. ✓ Report test results + status
1506
+ 4. ✓ Only then mark task complete
1507
+
1508
+ **Do not:**
1509
+ - Ask for approval again (already given)
1510
+ - Propose without implementing (approval was the green light)
1511
+ - Skip tests (always run if available)
1512
+
1513
+ ### 5. Non-Obvious Questions Only
1514
+
1515
+ **Ask user ONLY if:**
1516
+ - Multiple valid approaches with unclear trade-offs (A vs B decision)
1517
+ - Domain knowledge required (business rules, game mechanics, UX preferences)
1518
+ - Breaking changes need approval (user data migration, API changes)
1519
+ - Genuinely can't find answer in docs/codebase after thorough search
1520
+
1521
+ **DO NOT ask:**
1522
+ - Questions answered in official library docs → Look them up
1523
+ - Implementation details you can figure out → Figure them out
1524
+ - Preferences documented in CLAUDE.md → Read CLAUDE.md
1525
+ - Standard engineering practices → Research best practices
1526
+ - "Should I check X?" → Just check X
1527
+
1528
+ **Remember**: "Don't be lazy" - research first, ask only when stuck.
1529
+
1530
+ ### 6. Avoid Bloat
1531
+
1532
+ **Red flags:**
1533
+ - Adding dependencies when stdlib/existing libs sufficient
1534
+ - Over-abstraction (framework for 1 use case)
1535
+ - Premature optimization (no performance issue)
1536
+ - Duplicating existing functionality
1537
+ - Config for things that don't need config
1538
+
1539
+ **Principle**: Simplest solution that solves the problem. No more, no less.
1540
+
1541
+ ### 7. Context Management (Prevent "Dumber After Compaction")
1542
+
1543
+ **Critical Issue** (Community 2025): "Claude is definitely dumber after compaction, doesn't know what files it was looking at."
1544
+
1545
+ **Solution**: Use `/clear` frequently to reset context between tasks.
1546
+
1547
+ **When to use /clear:**
1548
+ - ✓ After completing a task (before starting next)
1549
+ - ✓ When switching topics/features
1550
+ - ✓ After fixing a bug (before new work)
1551
+ - ✓ When conversation feels unfocused
1552
+
1553
+ **When NOT to use /clear:**
1554
+ - ✗ In middle of multi-step task
1555
+ - ✗ During active debugging
1556
+ - ✗ When building on previous work
1557
+
1558
+ **Proactive suggestion:**
1559
+ After completing tasks, say: "Task complete. Should I /clear context before moving to next task? (Maintains performance and prevents context pollution)"
1560
+
1561
+ ### 8. CLAUDE.md as Living Document
1562
+
1563
+ **Critical Principle** (Anthropic 2025): "Treat CLAUDE.md files as living documents. Iterate on their effectiveness rather than simply accumulating content."
1564
+
1565
+ **Iteration Strategy:**
1566
+ 1. **Start minimal** (50-100 lines for baseline)
1567
+ 2. **Use `#` key during sessions** - Claude auto-incorporates effective instructions into CLAUDE.md
1568
+ 3. **Review weekly** - Remove instructions that don't work
1569
+ 4. **Add emphasis** - Use "IMPORTANT" or "YOU MUST" for critical rules
1570
+ 5. **Test effectiveness** - Try same task with/without instruction to verify impact
1571
+ 6. **Refine continuously** - Like tuning a prompt, CLAUDE.md requires iteration
1572
+
1573
+ **Anti-pattern:** Accumulating content without testing if it improves Claude's behavior.
1574
+
1575
+ ---
1576
+
1577
+ ## Testing Standards
1578
+
1579
+ **After any code changes:**
1580
+ - Run existing test suite (if available)
1581
+ - Report results before completion
1582
+ - If tests fail: fix them (don't ask user to run tests)
1583
+
1584
+ **Test pyramid**: 70% unit, 20% integration, 10% e2e
1585
+
1586
+ **For test writing**: See `@./.safeword/guides/testing-methodology.md` for comprehensive TDD workflow.
1587
+
1588
+ ---
1589
+
1590
+ ## Workflow: Feature Development
1591
+
1592
+ **CRITICAL: Always follow this sequence** (see `@~/.claude/CLAUDE.md` → Feature Development Workflow)
1593
+
1594
+ 1. Check for user stories → ask if not found
1595
+ 2. Check for test definitions → ask if not found
1596
+ 3. Check for design doc (complex features) → ask if needed
1597
+ 4. Follow TDD: RED → GREEN → REFACTOR
1598
+ 5. Run tests yourself (don't ask user to verify)
1599
+
1600
+ **See:** Full workflow in global CLAUDE.md
1601
+ ```
1602
+
1603
+ **Pros**:
1604
+
1605
+ - Works immediately (no new files to create)
1606
+ - Applies to every session automatically
1607
+ - Establishes baseline expectations
1608
+ - Easy to update and refine
1609
+
1610
+ **Cons**:
1611
+
1612
+ - Claude may not always follow (instructions can be ignored)
1613
+ - Less deterministic than hooks/skills
1614
+ - Requires periodic refinement
1615
+ - Token cost (loaded every session)
1616
+
1617
+ **Decision**: Implement in Phase 1 (TODAY) - foundation for other approaches
1618
+
1619
+ ---
1620
+
1621
+ ### Approach 6: Subagents (Task Tool) - NOT RECOMMENDED
1622
+
1623
+ **Power Level**: ⭐⭐⭐⭐
1624
+ **Effort**: High (1+ hours to create and test)
1625
+ **Impact**: Powerful but WRONG TOOL for quality checks
1626
+
1627
+ **What it does**: Delegates tasks to specialized AI agents in separate context windows
1628
+
1629
+ **How it works**:
1630
+
1631
+ - Invoked via `Task` tool with `subagent_type` parameter
1632
+ - Runs in **separate context window** (doesn't see main conversation)
1633
+ - Can be automatic (Claude decides) or explicit (user requests)
1634
+ - Configured as `.claude/agents/*.md` files
1635
+
1636
+ **Built-in subagent types** (from system):
1637
+
1638
+ - `general-purpose` - Multi-step tasks, complex questions
1639
+ - `Explore` - Fast codebase exploration (glob/grep patterns)
1640
+ - `statusline-setup` - Configure status line
1641
+ - `output-style-setup` - Create output styles
1642
+
1643
+ **Your actual subagent usage** (from bitd conversation history):
1644
+
1645
+ - ✓ "Search for remaining NPC mechanics" - Explore subagent (CORRECT use)
1646
+ - ✓ "Search exhaustively for character mechanics" - Explore subagent (CORRECT use)
1647
+ - ✓ "Deep search for missing mechanics" - Explore subagent (CORRECT use)
1648
+
1649
+ **Pattern**: All 3 uses were long, exhaustive searches. NOT quality checks.
1650
+
1651
+ **Example custom subagent**:
1652
+
1653
+ ```yaml
1654
+ ---
1655
+ name: architecture-reviewer
1656
+ description: |
1657
+ Reviews design docs for consistency with ARCHITECTURE.md.
1658
+ Use for comprehensive architecture audits.
1659
+ tools: Read, Grep, Glob
1660
+ ---
1661
+ # Architecture Review Protocol
1662
+ [Detailed analysis instructions...]
1663
+ ```
1664
+
1665
+ **Pros**:
1666
+
1667
+ - ✅ Separate context = focused analysis
1668
+ - ✅ Own token budget (can be very detailed)
1669
+ - ✅ Returns comprehensive report
1670
+ - ✅ Good for long analysis tasks
1671
+
1672
+ **Cons for quality checks**:
1673
+
1674
+ - ❌ Slow (context switching overhead)
1675
+ - ❌ No conversation history (can't see your requirements)
1676
+ - ❌ Can't access main conversation context
1677
+ - ❌ Overkill for simple quality checks
1678
+ - ❌ Better suited for exhaustive searches (your actual usage)
1679
+
1680
+ **Decision**: ❌ DON'T USE for Phase 2
1681
+
1682
+ **Why NOT**: Skills are faster, have conversation context, and auto-activate. Subagents are for long analysis tasks, not quick quality checks.
1683
+
1684
+ **When to use instead**:
1685
+
1686
+ - ✅ Comprehensive PR reviews ("Review all changes in this PR")
1687
+ - ✅ Architecture audits ("Analyze design doc against ARCHITECTURE.md")
1688
+ - ✅ Exhaustive code searches (you're already using Explore correctly)
1689
+
1690
+ ---
1691
+
1692
+ ### Approach 7: MCP Servers - NOT RECOMMENDED
1693
+
1694
+ **Power Level**: ⭐⭐⭐⭐
1695
+ **Effort**: High (install + configure + API keys)
1696
+ **Impact**: Powerful for external integrations, but WRONG TOOL for quality checks
1697
+
1698
+ **What it does**: Connects Claude Code to external tools, APIs, and data sources
1699
+
1700
+ **How it works**:
1701
+
1702
+ - MCP (Model Context Protocol) = Standard for external integrations
1703
+ - Three transport types: HTTP (remote), SSE (deprecated), Stdio (local)
1704
+ - Provides access to GitHub, databases, APIs, external linters
1705
+ - Installed via: `claude mcp add --transport http [name] [url]`
1706
+
1707
+ **Available MCP servers for code quality**:
1708
+
1709
+ 1. **praneybehl/code-review-mcp**
1710
+ - Uses GPT-4, Gemini, or Claude API for reviews
1711
+ - Analyzes git diffs (staged changes, branches)
1712
+ - Returns external LLM opinion
1713
+ - ❌ Cost: External API fees ($$$)
1714
+ - ❌ Latency: Network calls
1715
+ - ❌ No conversation context
1716
+
1717
+ 2. **MCP Server Analyzer (Python)**
1718
+ - RUFF linting + VULTURE dead code detection
1719
+ - Python-specific
1720
+ - ❌ Limited to one language
1721
+ - ❌ Hooks simpler for linting
1722
+
1723
+ 3. **GitHub MCP Server**
1724
+ - Fetch PRs, post comments, query issues
1725
+ - ✅ Useful for PR automation
1726
+ - ❌ Not needed for local quality checks
1727
+
1728
+ **Your current MCP config**:
1729
+
1730
+ ```json
1731
+ {
1732
+ "context7": "allow", // Up-to-date library documentation lookup
1733
+ "playwright": "allow", // Browser automation
1734
+ "fetch": "allow", // HTTP requests
1735
+ "websearch": "allow" // Web search
1736
+ }
1737
+ ```
1738
+
1739
+ **Comparison: MCP vs Skills for quality checks**:
1740
+
1741
+ | Requirement | MCP Servers | Skills | Winner |
1742
+ | -------------------- | ------------------ | -------------------- | ------ |
1743
+ | Conversation context | ❌ No | ✅ Yes | Skills |
1744
+ | Speed | ⚠️ Network latency | ✅ Instant | Skills |
1745
+ | Cost | ⚠️ External API $$ | ✅ Included | Skills |
1746
+ | Setup complexity | ❌ Complex | ✅ Simple | Skills |
1747
+ | Standards checking | ❌ Needs config | ✅ Reads CLAUDE.md | Skills |
1748
+ | Elegance evaluation | ❌ Hard externally | ✅ Claude's strength | Skills |
1749
+
1750
+ **Pros**:
1751
+
1752
+ - ✅ Integrates external systems (GitHub, databases)
1753
+ - ✅ Can run external linters/tools
1754
+ - ✅ Second opinion from a different LLM
1755
+
1756
+ **Cons for quality checks**:
1757
+
1758
+ - ❌ External API costs ($$ for GPT-4/Gemini calls)
1759
+ - ❌ Network latency (slower than in-context)
1760
+ - ❌ No conversation context (doesn't see requirements, CLAUDE.md)
1761
+ - ❌ Complex setup (API keys, configuration)
1762
+ - ❌ Security concern (code sent to external services)
1763
+ - ❌ Hooks already handle linting (simpler)
1764
+
1765
+ **Decision**: ❌ DON'T USE for Phase 2
1766
+
1767
+ **Why NOT**: Skills + Hooks provide better, faster, cheaper, context-aware quality checks without external dependencies.
1768
+
1769
+ **When to use instead** (Phase 4 - Future):
1770
+
1771
+ - ✅ GitHub PR automation ("Review PR #123 and post comments")
1772
+ - ✅ Database validation ("Will this schema change break production?")
1773
+ - ✅ External company-specific tools
1774
+ - ✅ Second LLM opinion for critical decisions
1775
+
1776
+ **Example future workflow**:
1777
+
1778
+ ```bash
1779
+ # Phase 4 (optional)
1780
+ claude mcp add --transport http github [url]
1781
+
1782
+ # Usage (conversation, not shell commands):
1783
+ # You: "Review PR #456 and ensure it follows our standards"
1784
+ # Claude: [Uses GitHub MCP to fetch PR]
1785
+ # Claude: [Uses Quality Reviewer Skill to evaluate with context]
1786
+ # Claude: [Posts review comment via GitHub MCP]
1787
+ ```
1788
+
1789
+ ---
1790
+
1791
+ ## Implementation Plan
1792
+
1793
+ ### Phase 1: Quick Wins (TODAY - 30 min)
1794
+
1795
+ **Goal**: Immediate 60% reduction in repetitive prompts
1796
+
1797
+ 1. **Enhanced CLAUDE.md** (10 min)
1798
+ - Add "Quality Standards" section to `~/.claude/CLAUDE.md`
1799
+ - Establishes baseline behavior
1800
+ - ✅ No new tools needed
1801
+
1802
+ 2. **Slash Commands** (20 min)
1803
+ - Create 4 commands in `~/.claude/commands/`:
1804
+ - `critique.md` - Quality review
1805
+ - `implement-quality.md` - Approve with quality check
1806
+ - `latest-docs.md` - Documentation lookup
1807
+ - `check-and-proceed.md` - One-shot command
1808
+ - Test each command in both projects
1809
+ - ✅ Immediate usability
1810
+
1811
+ **Success criteria**:
1812
+
1813
+ - Can type `/critique` instead of full prompt
1814
+ - CLAUDE.md quality standards visible in Claude's proposals
1815
+
1816
+ ---
1817
+
1818
+ ### Phase 2: Automation (THIS WEEK - 2 hours)
1819
+
1820
+ **Goal**: Auto-invoked quality checks, post-implementation validation
1821
+
1822
+ 1. **Quality Reviewer Skill** (1 hour)
1823
+ - Create `~/.claude/skills/quality-reviewer/SKILL.md`
1824
+ - Create `~/.claude/skills/quality-reviewer/examples.md`
1825
+ - Test skill activation across scenarios:
1826
+ - Proposes code change → skill activates
1827
+ - Answers question → skill does NOT activate
1828
+ - Reads files → skill does NOT activate
1829
+ - Refine description if activation inconsistent
1830
+ - ✅ Skills are most complex (require testing)
1831
+
1832
+ 2. **PostToolUse Hook** (1 hour)
1833
+ - Create `~/.claude/hooks/post-code-validation.yaml`
1834
+ - Test in project with npm scripts
1835
+ - Verify formatters/linters run after Write/Edit
1836
+ - Check that Claude sees results and responds
1837
+ - Adjust if too noisy or slow
1838
+ - ✅ Immediate feedback after changes
1839
+
1840
+ **Success criteria**:
1841
+
1842
+ - Quality Reviewer Skill activates before implementations
1843
+ - PostToolUse Hook runs linters automatically
1844
+ - Both reduce manual verification steps
1845
+
1846
+ ---
1847
+
1848
+ ### Phase 3: Full Automation (OPTIONAL - 30 min)
1849
+
1850
+ **Goal**: Zero manual quality check invocations
1851
+
1852
+ 1. **UserPromptSubmit Hook** (30 min)
1853
+ - Create `~/.claude/hooks/quality-auto-append.yaml`
1854
+ - Test trigger patterns (yes/proceed/implement)
1855
+ - Refine regex to avoid false positives
1856
+ - Consider toggling on/off per project
1857
+ - ✅ Most powerful but most invasive
1858
+
1859
+ **Success criteria**:
1860
+
1861
+ - Type "yes" → quality checklist auto-appended
1862
+ - No false triggers on conversational prompts
1863
+
1864
+ **Decision point**: May not need if Phase 2 works well
1865
+
1866
+ ---
1867
+
1868
+ ### Phase 4: External Integrations (FUTURE - Variable)
1869
+
1870
+ **Goal**: Integrate external tools and services for advanced workflows
1871
+
1872
+ **When to implement**: Only if you need external system integrations
1873
+
1874
+ **Option A: Subagents for Long Analysis**
1875
+
1876
+ **Create**: Custom subagents in `~/.claude/agents/`
1877
+
1878
+ **Use cases**:
1879
+
1880
+ - Comprehensive PR reviews (entire diff analysis)
1881
+ - Architecture consistency audits
1882
+ - Already using Explore subagent correctly for codebase searches
1883
+
1884
+ **Effort**: 1-2 hours per custom subagent
1885
+
1886
+ **Examples**:
1887
+
1888
+ - `code-reviewer` subagent for full PR analysis
1889
+ - `architecture-auditor` subagent for design doc reviews
1890
+
1891
+ ---
1892
+
1893
+ **Option B: MCP Servers for External Tools**
1894
+
1895
+ **Install**: MCP servers for specific integrations
1896
+
1897
+ **Use cases**:
1898
+
1899
+ - GitHub PR automation (`claude mcp add --transport http github [url]`)
1900
+ - Database validation (`claude mcp add postgres [connection]`)
1901
+ - Company-specific tools
1902
+
1903
+ **Effort**: Variable (depends on service complexity)
1904
+
1905
+ **Examples**:
1906
+
1907
+ ```bash
1908
+ # GitHub integration
1909
+ claude mcp add --transport http github https://github-mcp-server.example.com
1910
+
1911
+ # Usage (conversation, not shell commands):
1912
+ # You: "Review PR #456 and post review comments"
1913
+ # Claude: [Fetches PR via GitHub MCP]
1914
+ # Claude: [Uses Quality Reviewer Skill with context]
1915
+ # Claude: [Posts review via GitHub MCP]
1916
+ ```
1917
+
1918
+ **When NOT to use**:
1919
+
1920
+ - ❌ Local quality checks (Skills better)
1921
+ - ❌ Post-code validation (Hooks better)
1922
+ - ❌ Local linting (Hooks simpler than MCP)
1923
+
1924
+ **Success criteria**: External integrations streamline cross-system workflows
1925
+
1926
+ ---
1927
+
1928
+ ### Phase 5: Plugin Distribution (FUTURE)
1929
+
1930
+ **Goal**: Bundle and distribute your automation for reuse and sharing
1931
+
1932
+ **When to implement**: After Phases 2-4 are validated and working well
1933
+
1934
+ **What**: Package Skills, Hooks, Commands into distributable plugins
1935
+
1936
+ **From October 2025 release**:
1937
+
1938
+ - Plugin system released with 227+ community plugins
1939
+ - `/plugin install [name]` - Install from marketplace
1940
+ - `/plugin marketplace add [url]` - Add custom marketplace
1941
+ - Plugins bundle Skills, Hooks, Commands, MCP servers together
1942
+
1943
+ **Use cases**:
1944
+
1945
+ 1. **Personal reuse** - Install same automation on multiple machines
1946
+ 2. **Team sharing** - Distribute quality standards to team members
1947
+ 3. **Community contribution** - Share with broader developer community
1948
+ 4. **Version control** - Track automation evolution with versioning
1949
+
1950
+ **Example plugin structure**:
1951
+
1952
+ ```
1953
+ ~/.claude/plugins/quality-automation/
1954
+ ├── plugin.yaml # Metadata and version info
1955
+ ├── skills/
1956
+ │ ├── docs-verifier/
1957
+ │ │ └── SKILL.md
1958
+ │ ├── standards-checker/
1959
+ │ │ └── SKILL.md
1960
+ │ └── quality-gates/
1961
+ │ └── SKILL.md
1962
+ ├── hooks/
1963
+ │ └── stop-context-manager/
1964
+ │ └── hook.json
1965
+ └── commands/
1966
+ ├── critique.md
1967
+ └── feature-start.md
1968
+ ```
1969
+
1970
+ **plugin.yaml example**:
1971
+
1972
+ ```yaml
1973
+ name: quality-automation
1974
+ version: 1.0.0
1975
+ description: |
1976
+ Automated quality control workflow for Claude Code.
1977
+ Eliminates repetitive "double check and critique" prompts.
1978
+
1979
+ author: Your Name
1980
+ repository: https://github.com/yourusername/claude-quality-automation
1981
+
1982
+ skills:
1983
+ - skills/docs-verifier/
1984
+ - skills/standards-checker/
1985
+ - skills/quality-gates/
1986
+
1987
+ hooks:
1988
+ - hooks/stop-context-manager/
1989
+
1990
+ commands:
1991
+ - commands/critique.md
1992
+ - commands/feature-start.md
1993
+
1994
+ dependencies:
1995
+ - prettier
1996
+ - eslint
1997
+ ```
1998
+
1999
+ **Installation flow**:
2000
+
2001
+ ```bash
2002
+ # User installs your plugin
2003
+ /plugin install quality-automation
2004
+
2005
+ # Or from custom marketplace
2006
+ /plugin marketplace add https://your-marketplace.com/plugins.json
2007
+ /plugin install quality-automation
2008
+
2009
+ # Updates
2010
+ /plugin update quality-automation
2011
+
2012
+ # Uninstall
2013
+ /plugin uninstall quality-automation
2014
+ ```
2015
+
2016
+ **Benefits**:
2017
+
2018
+ ✅ **One-command setup** - No manual file creation
2019
+ ✅ **Version control** - Update automation centrally
2020
+ ✅ **Dependency management** - Specify required tools
2021
+ ✅ **Documentation** - README shown during install
2022
+ ✅ **Discoverability** - Users can browse marketplace
2023
+
2024
+ **Distribution options**:
2025
+
2026
+ 1. **Private** - Share via git repo (team use)
2027
+ 2. **Public marketplace** - Submit to community marketplace
2028
+ 3. **Custom marketplace** - Host your own plugin registry
2029
+
2030
+ **Publishing steps**:
2031
+
2032
+ 1. Test plugin locally (install from local path)
2033
+ 2. Create GitHub repository with plugin structure
2034
+ 3. Submit to marketplace (if public)
2035
+ 4. Document usage in README
2036
+ 5. Version with semver (1.0.0, 1.1.0, 2.0.0)
2037
+
2038
+ **When NOT to use plugins**:
2039
+
2040
+ - Still iterating on automation (keep as local files)
2041
+ - Highly personal/specific to your workflow
2042
+ - Experimental features not ready for distribution
2043
+
2044
+ **Success criteria**: Team members or community users successfully install and use your automation plugin
2045
+
2046
+ ---
2047
+
2048
+ ## Testing Strategy
2049
+
2050
+ ### Phase 1 Testing (Slash Commands)
2051
+
2052
+ **Test in soulless-monorepo:**
2053
+
2054
+ 1. Make a small change proposal
2055
+ 2. Type `/critique react typescript`
2056
+ 3. Verify comprehensive review appears
2057
+ 4. Type `/implement-quality`
2058
+ 5. Verify implementation happens with final check
2059
+
2060
+ **Test in bitd:**
2061
+
2062
+ 1. Propose game mechanic implementation
2063
+ 2. Type `/critique blades typescript`
2064
+ 3. Verify domain-specific considerations appear
2065
+ 4. Type `/check-and-proceed`
2066
+ 5. Verify one-shot execution
2067
+
2068
+ **Pass criteria**: Commands work, save typing, maintain quality
2069
+
2070
+ ---
2071
+
2072
+ ### Phase 2 Testing (Skills + Hooks)
2073
+
2074
+ **Test Quality Reviewer Skill:**
2075
+
2076
+ 1. **Positive test** (should activate):
2077
+
2078
+ ```
2079
+ You: "Add a loading spinner component"
2080
+ Claude: [Quality Reviewer Skill activates]
2081
+ Claude: [Comprehensive review with docs check]
2082
+ Claude: "RECOMMENDATION: PROCEED"
2083
+ You: "yes"
2084
+ Claude: [Implements]
2085
+ ```
2086
+
2087
+ 2. **Negative test** (should NOT activate):
2088
+
2089
+ ```
2090
+ You: "What's the current architecture?"
2091
+ Claude: [NO skill activation - just answers]
2092
+ Claude: [Reads files, explains architecture]
2093
+ ```
2094
+
2095
+ 3. **Edge case test**:
2096
+ ```
2097
+ You: "yes" (approving previous proposal)
2098
+ Claude: [Skill activates for final check]
2099
+ Claude: [Brief review, then implements]
2100
+ ```
2101
+
2102
+ **Test PostToolUse Hook:**
2103
+
2104
+ 1. Make code change
2105
+ 2. Verify linter/formatter runs automatically
2106
+ 3. Check Claude responds to results
2107
+ 4. Verify test suite runs (if configured)
2108
+
2109
+ **Pass criteria**:
2110
+
2111
+ - Skills activate at the right times (no false positives)
2112
+ - Hooks run reliably after tool use
2113
+ - Quality maintained or improved
2114
+
2115
+ ---
2116
+
2117
+ ### Phase 3 Testing (UserPromptSubmit Hook)
2118
+
2119
+ **Test trigger patterns:**
2120
+
2121
+ 1. Type "yes" after proposal → checklist appended ✓
2122
+ 2. Type "proceed" → checklist appended ✓
2123
+ 3. Type "yes, but change X" → checklist appended (may need refinement)
2124
+ 4. Type "yes, I understand" (conversational) → should NOT trigger
2125
+
2126
+ **Pass criteria**:
2127
+
2128
+ - High precision (few false positives)
2129
+ - High recall (catches all approvals)
2130
+ - Doesn't disrupt conversational flow
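The trigger cases above can be sketched as a strict matcher. This is a minimal sketch, not the hook's actual implementation: the function name `is_approval` is hypothetical, and accepting only a bare approval word (optionally followed by `.` or `!`) is an assumption that favors precision over recall. Under that assumption, "yes, but change X" is deliberately treated as a non-trigger.

```shell
# Hypothetical helper: succeed only when the whole prompt is a bare approval.
is_approval() {
  printf '%s\n' "$1" |
    grep -Eiq '^[[:space:]]*(yes|proceed|implement)[[:space:]]*[.!]?[[:space:]]*$'
}

# Walk through the test prompts and report which ones would trigger the hook.
for prompt in "yes" "Proceed." "yes, but change X" "yes, I understand"; do
  if is_approval "$prompt"; then
    echo "trigger: $prompt"
  else
    echo "skip:    $prompt"
  fi
done
```

Loosening the pattern (for example, allowing trailing text after a comma) raises recall but brings back conversational false positives such as "yes, I understand".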
2131
+
2132
+ ---
2133
+
2134
+ ## Rollback Plan
2135
+
2136
+ If any approach causes issues:
2137
+
2138
+ ### Phase 1 (Slash Commands)
2139
+
2140
+ - **Issue**: Command doesn't work as expected
2141
+ - **Fix**: Edit `.md` file, reload Claude session
2142
+ - **Rollback**: Delete command file
2143
+
2144
+ ### Phase 2 (Skills/Hooks)
2145
+
2146
+ - **Issue**: Skill fires at wrong times
2147
+ - **Fix**: Refine description in SKILL.md
2148
+ - **Rollback**: Delete skill directory
2149
+
2150
+ - **Issue**: Hook breaks workflow
2151
+ - **Fix**: Adjust script or conditions
2152
+ - **Rollback**: Delete hook file or set `decision: allow`
2153
+
2154
+ ### Phase 3 (UserPromptSubmit Hook)
2155
+
2156
+ - **Issue**: Too many false triggers
2157
+ - **Fix**: Refine regex pattern
2158
+ - **Rollback**: Delete hook file (most invasive, easiest to remove)
2159
+
2160
+ **Recovery**: All automation is additive - removing files returns to manual workflow
2161
+
2162
+ ---
2163
+
2164
+ ### Quick Recovery: Checkpoints (NEW - v2.0.10+)
2165
+
2166
+ **Feature**: Press **Esc twice** to rewind conversation to last checkpoint.
2167
+
2168
+ **From changelog** (October 2025):
2169
+
2170
+ > "Checkpoints: Rewind conversations without losing context. Press Esc twice to return to checkpoint."
2171
+
2172
+ **Use when**:
2173
+
2174
+ - ✅ Skill activated incorrectly → Esc Esc to undo
2175
+ - ✅ Hook broke workflow → Esc Esc to revert
2176
+ - ✅ Want to retry with different approach → Esc Esc and try again
2177
+ - ✅ Automation caused unexpected behavior → Esc Esc back to before
2178
+
2179
+ **Benefits**:
2180
+
2181
+ - **Faster than**: Deleting files, restarting session, or manual undo
2182
+ - **Safer experimentation**: Can rewind mistakes instantly
2183
+ - **Context preserved**: Unlike /clear, checkpoint keeps full conversation history
2184
+ - **No commit needed**: Test automation changes without permanent effects
2185
+
2186
+ **Example flow**:
2187
+
2188
+ ```
2189
+ You: "implement feature X"
2190
+ Claude: [Quality Reviewer Skill activates incorrectly]
2191
+ You: [Press Esc Esc]
2192
+ → Returns to "implement feature X" prompt
2193
+ → Conversation state restored
2194
+ → Can refine Skill description or try different approach
2195
+ ```
2196
+
2197
+ **Limitation**: Can only rewind to last checkpoint (not arbitrary points in history)
2198
+
2199
+ ---
2200
+
2201
+ ## 🚨 CRITICAL: Hook System Bug in Current Versions
2202
+
2203
+ **⚠️ AS OF OCTOBER 30, 2025: Hooks are broken in Claude Code v2.0.27 and v2.0.29**
2204
+
2205
+ **Source**: [GitHub Issue #10399](https://github.com/anthropics/claude-code/issues/10399) (OPEN, actively reported)
2206
+
2207
+ ### What's Broken
2208
+
2209
+ Multiple hook types stopped executing completely:
2210
+
2211
+ - ❌ **Stop hooks** - Don't fire when Claude finishes responding
2212
+ - ❌ **SessionEnd hooks** - Don't fire when session ends
2213
+ - ❌ **SessionStart hooks** - Don't fire when session starts
2214
+ - ❌ **PostToolUse hooks** - Don't fire after tool execution
2215
+
2216
+ **Symptom**: Hooks silently fail to execute. No error messages. No output.
2217
+
2218
+ **Pattern**: "Hooks worked perfectly until a few days ago" - multiple users across macOS and Windows
2219
+
2220
+ ### Working Version
2221
+
2222
+ ✅ **v2.0.25** - Hooks work correctly (confirmed by multiple users downgrading)
2223
+
2224
+ ### Check Your Version
2225
+
2226
+ ```bash
2227
+ claude --version
2228
+ ```
2229
+
2230
+ If you see v2.0.27 or v2.0.29, your hooks will not work without a workaround.
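The version check can be scripted. A minimal sketch, assuming `claude --version` prints a string that contains the version number; the helper name `is_affected_version` is hypothetical, and the substring match is intentionally loose (it would also match a longer number such as 2.0.270).

```shell
# Hypothetical helper: succeed when a version string names an affected release.
is_affected_version() {
  case "$1" in
    *2.0.27*|*2.0.29*) return 0 ;;
    *)                 return 1 ;;
  esac
}

# Only run the check when the claude CLI is actually installed.
if command -v claude >/dev/null 2>&1; then
  version=$(claude --version 2>/dev/null) || version=""
  if is_affected_version "$version"; then
    echo "Affected version ($version): use --debug or downgrade"
  else
    echo "Version '$version' is not on the affected list"
  fi
fi
```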
2231
+
2232
+ ### Workaround: Use `--debug` Flag
2233
+
2234
+ **Temporary fix discovered by community**:
2235
+
2236
+ ```bash
2237
+ claude --debug
2238
+ ```
2239
+
2240
+ **Why it works**: Hook initialization appears to be incorrectly gated behind debug-mode-only code paths in v2.0.27+
2241
+
2242
+ **Create alias for convenience**:
2243
+
2244
+ ```bash
2245
+ # Add to ~/.zshrc or ~/.bashrc
2246
+ alias claude='claude --debug'
2247
+ ```
2248
+
2249
+ **Verify hooks work**:
2250
+
2251
+ ```bash
2252
+ # Create simple test hook
2253
+ mkdir -p ~/.claude/hooks
2254
+ cat > ~/.claude/hooks/test.sh << 'EOF'
2255
+ #!/bin/bash
2256
+ echo "🎉 Hook fired successfully!"
2257
+ EOF
2258
+ chmod +x ~/.claude/hooks/test.sh
2259
+
2260
+ # Add to settings.json
2261
+ jq '.hooks.Stop = [{matcher: "*", hooks: [{type: "command", command: "~/.claude/hooks/test.sh"}]}]' ~/.claude/settings.json > tmp.json && mv tmp.json ~/.claude/settings.json
2262
+
2263
+ # Launch with debug and test
2264
+ claude --debug
2265
+ # (Ask Claude something, when it finishes you should see "🎉 Hook fired successfully!")
2266
+ ```
2267
+
2268
+ ### Downgrade Option (If Workaround Fails)
2269
+
2270
+ **Not officially supported, but reported to work**:
2271
+
2272
+ ```bash
2273
+ # Backup current version
2274
+ cp "$(command -v claude)" ~/claude-backup
2275
+
2276
+ # Install specific version (method depends on installation)
2277
+ # Homebrew:
2278
+ brew uninstall claude-code
2279
+ brew install claude-code@2.0.25 # If available
2280
+
2281
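+ # Or, for npm-based installs (if that version is published): npm install -g @anthropic-ai/claude-code@2.0.25; otherwise check the Claude Code documentation for version pinning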
2282
+ ```
2283
+
2284
+ ### Impact on This Automation Plan
2285
+
2286
+ **Phase 1** (Slash Commands + CLAUDE.md): ✅ **NOT AFFECTED** - No hooks used
2287
+
2288
+ **Phase 2** (Skills + Hooks): 🚨 **CRITICALLY AFFECTED**
2289
+
2290
+ - PostToolUse Hook automation **will not work** without the workaround
2291
+ - Stop Hook context management **will not work** without the workaround
2292
+ - SessionStart Hook **will not work** without the workaround
2293
+
2294
+ **Phase 3** (Advanced Hooks): 🚨 **CRITICALLY AFFECTED**
2295
+
2296
+ - All hook-based automation requires `--debug` flag or downgrade
2297
+
2298
+ ### Recommended Action Plan
2299
+
2300
+ **Option A: Wait for fix** (if not urgent)
2301
+
2302
+ - Implement Phase 1 only (Slash Commands)
2303
+ - Monitor GitHub issue #10399 for resolution
2304
+ - Proceed to Phase 2 when bug is fixed
2305
+
2306
+ **Option B: Use workaround** (proceed with automation)
2307
+
2308
+ - Launch Claude Code with `--debug` flag always
2309
+ - Create alias: `alias claude='claude --debug'`
2310
+ - Proceed with Phase 2 implementation
2311
+ - Test hooks immediately to verify they work
2312
+
2313
+ **Option C: Downgrade** (if workaround doesn't work)
2314
+
2315
+ - Downgrade to v2.0.25
2316
+ - Proceed with full automation plan
2317
+ - Pin version to prevent auto-updates
2318
+
2319
+ ### Testing Checklist Before Phase 2
2320
+
2321
+ **BEFORE implementing any hooks, verify they work**:
2322
+
2323
+ 1. Check version: `claude --version`
2324
+ 2. If v2.0.27/v2.0.29:
2325
+ - Launch with `--debug` flag
2326
+ - Create test hook (see example above)
2327
+ - Verify hook fires with "🎉" message
2328
+ 3. If hook still doesn't fire:
2329
+ - Consider downgrading to v2.0.25
2330
+ - Or wait for official fix
2331
+
2332
+ **DO NOT PROCEED TO PHASE 2 WITHOUT CONFIRMING HOOKS WORK**
2333
+
2334
+ ---
2335
+
2336
+ ## Common Implementation Mistakes
2337
+
2338
+ **Context**: Based on community issues and official documentation, these are the most common failure modes when implementing Skills and Hooks.
2339
+
2340
+ ### Skills Mistakes
2341
+
2342
+ **1. YAML Validation Errors**
2343
+
2344
+ ```yaml
2345
+ # ❌ WRONG - Invalid YAML
2346
+ ---
2347
+ name: quality-reviewer
2348
+ description: "Reviews code for quality
2349
+ # ❌ The opening quote above is never closed,
2350
+ # so the parser reads every following line as part of the string
2351
+ allowed-tools: Read
2352
+ # ❌ The closing --- separator is missing here
2353
+ ```
2354
+
2355
+ **Fix**: Run the frontmatter through a YAML validator (for example yamllint.com) before testing
2356
+
2357
+ **Verify**:
2358
+
2359
+ ```bash
2360
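+ for f in ~/.claude/skills/*/SKILL.md; do awk '/^---$/{n++; next} n==1' "$f" | yamllint - || echo "Invalid frontmatter: $f"; done  # lint the frontmatter only; the Markdown body is not YAML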
2361
+ ```
2362
+
2363
+ ---
2364
+
2365
+ **2. Description Too Long (>1024 chars)**
2366
+
2367
+ **Problem**: Skill activates unreliably or not at all
2368
+
2369
+ **Fix**:
2370
+
2371
+ - Count chars in description field only (not entire file)
2372
+ - Move detailed instructions to separate .md files
2373
+ - Use progressive disclosure pattern (see "Skills Anti-Patterns" section above)
2374
+
2375
+ **Check length**:
2376
+
2377
+ ```bash
2378
+ # Rough upper bound: counts the description line plus up to 50 lines after it
2379
+ grep -A 50 "description: |" ~/.claude/skills/quality-reviewer/SKILL.md | wc -c
2380
+ ```
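The `grep -A 50` pipeline above also counts the `description:` line itself and any fields that follow it, so its result is only an upper bound. A tighter count can be sketched with `awk`. This is an illustrative helper under stated assumptions: the name `description_bytes` is hypothetical, the description is assumed to be a `description: |` block scalar, and the result is in bytes (newlines included), which still slightly over-counts multi-byte characters.

```shell
# Hypothetical helper: print the byte count of the description block
# in a SKILL.md frontmatter (bytes, so an upper bound on characters).
description_bytes() {
  awk '
    /^description:[ \t]*\|/ { in_desc = 1; next }  # block scalar starts
    in_desc && /^[^ \t]/    { exit }               # next key or --- ends it
    in_desc                 { sub(/^[ \t]+/, ""); print }
  ' "$1" | wc -c | tr -d "[:space:]"
}

# Report the size for the skill used in this plan, if the file exists.
skill="$HOME/.claude/skills/quality-reviewer/SKILL.md"
if [ -f "$skill" ]; then
  echo "description: $(description_bytes "$skill") bytes (limit: 1024)"
fi
```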
2381
+
2382
+ ---
2383
+
2384
+ **3. Missing Trigger Keywords**
2385
+
2386
+ **Problem**: Skill never activates because Claude doesn't know when to use it
2387
+
2388
+ **Fix**: Add explicit keywords from user's vocabulary:
2389
+
2390
+ - "test" → testing skills
2391
+ - "implement", "fix", "add" → quality check skills
2392
+ - "PDF", "document" → document processing skills
2393
+ - File extensions (.ts, .test.js) → file-type-specific skills
2394
+
2395
+ **Before**:
2396
+
2397
+ ```yaml
2398
+ description: |
2399
+ Helps with architecture decisions.
2400
+ ```
2401
+
2402
+ **After**:
2403
+
2404
+ ```yaml
2405
+ description: |
2406
+ Suggest ARCHITECTURE.md updates when discussing:
2407
+ - Technology choices (keywords: database, state management, framework)
2408
+ - Data model design (keywords: entity, relationship, schema)
2409
+ - Patterns (keywords: error handling, API structure)
2410
+ ```
2411
+
2412
+ ---
2413
+
2414
+ **4. No Test of Activation**
2415
+
2416
+ **Problem**: Skill created but never verified if it actually works
2417
+
2418
+ **Fix**: Test immediately after creating:
2419
+
2420
+ ```
2421
+ # Create Skill
2422
+ ~/.claude/skills/test-advisor/SKILL.md
2423
+
2424
+ # Test activation
2425
+ You: "should I write unit tests or integration tests for this?"
2426
+ Expected: Test Advisor Skill activates
2427
+ Actual: [Document what happens]
2428
+
2429
+ # Refine description based on test results
2430
+ ```
2431
+
2432
+ ---
2433
+
2434
+ ### Hooks Mistakes
2435
+
2436
+ **1. Hook Script Not Executable**
2437
+
2438
+ ```bash
2439
+ # Problem: Hook file created but no execute permission
2440
+ $ ls -la ~/.claude/hooks/quality-check.sh
2441
+ -rw-r--r-- ~/.claude/hooks/quality-check.sh
2442
+
2443
+ # Symptom: Hook never fires, no error message
2444
+
2445
+ # Fix: Make executable
2446
+ chmod +x ~/.claude/hooks/quality-check.sh
2447
+
2448
+ # Verify
2449
+ ls -la ~/.claude/hooks/quality-check.sh
2450
+ -rwxr-xr-x ~/.claude/hooks/quality-check.sh
2451
+ ```
2452
+
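The permission check above can also be scripted, for example in a setup step that verifies every hook before use. A minimal sketch, assuming POSIX permissions; `is_executable` and `make_executable` are hypothetical helper names, not part of Claude Code.

```python
import os
import stat


def is_executable(path: str) -> bool:
    """True if the owner execute bit is set on `path` (POSIX permissions)."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def make_executable(path: str) -> None:
    """Roughly `chmod +x`: add execute bits for user, group, and other."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Calling `is_executable` on each file under the hooks directory gives an early warning for the "hook never fires, no error message" symptom.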
2453
+ ---
2454
+
2455
+ **2. Wrong Matcher (Tool Name vs File Pattern)**
2456
+
2457
+ ```json
2458
+ {
2459
+ "hooks": {
2460
+ "PostToolUse": [{
2461
+ "matcher": "*.ts", // ❌ WRONG - matcher is TOOL NAME, not file pattern
2462
+ "hooks": [...]
2463
+ }]
2464
+ }
2465
+ }
2466
+ ```
2467
+
2468
+ **Fix**: The matcher is a tool name (Write, Edit, Read, Bash, etc.); filter files inside the command
2469
+
2470
+ ```json
2471
+ {
2472
+ "hooks": {
2473
+ "PostToolUse": [
2474
+ {
2475
+ "matcher": "Write", // ✅ CORRECT - Tool name
2476
+ "hooks": [
2477
+ {
2478
+ "type": "command",
2479
+ "command": "bash -c 'if [[ \"${CLAUDE_FILE_PATHS}\" =~ \\.ts$ ]]; then npm run lint; fi'"
2480
+ }
2481
+ ]
2482
+ }
2483
+ ]
2484
+ }
2485
+ }
2486
+ ```
2487
+
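When the inline `bash -c` filter grows unwieldy, the same file filtering can live in a small script that the hook command calls. The sketch below assumes, as this guide does, that edited paths arrive space-separated in `CLAUDE_FILE_PATHS`; `typescript_files` is a hypothetical helper, and paths containing spaces would not survive this split.

```python
import os


def typescript_files(file_paths: str) -> list[str]:
    """Pick .ts files out of a space-separated path list."""
    return [p for p in file_paths.split() if p.endswith(".ts")]


# Assumption: the hook environment provides CLAUDE_FILE_PATHS (empty if unset).
paths = typescript_files(os.environ.get("CLAUDE_FILE_PATHS", ""))
print("lint" if paths else "skip")
```

The matcher still selects the tool (`Write`), and the script decides whether the touched files are worth linting.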
2488
+ ---
2489
+
2490
+ **3. Exit Code Ignored (Tests Falsely Pass)**
2491
+
2492
+ ```bash
2493
+ # ❌ Problem: Claude reports "tests passed" but command failed
2494
+ command: "npm test 2>&1 | tee /tmp/test-output.txt"
2494
+
2495
+ # Why: A pipeline exits with the status of its last command (tee), so npm test's failure is lost
2497
+
2498
+ # ✅ Fix: Check exit code explicitly
2499
+ command: "bash -c 'npm test 2>&1 | tee /tmp/test-output.txt; exit ${PIPESTATUS[0]}'"
2500
+ ```
2501
+
2502
+ **Key**: Use `${PIPESTATUS[0]}` to preserve npm test's exit code after piping to tee
2503
+
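The same idea, log the output but keep the original exit code, can be sketched without a shell pipeline. This is an illustration only: `run_and_log` is a hypothetical helper, and the log path is whatever the caller chooses.

```python
import subprocess
import sys


def run_and_log(cmd: list[str], log_path: str) -> int:
    """Run cmd, echo and save its combined output, return its exit code."""
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    with open(log_path, "w", encoding="utf-8") as log:
        log.write(result.stdout)
    # Echo the output so it stays visible to whoever invoked the hook.
    sys.stdout.write(result.stdout)
    return result.returncode
```

A hook script built on this would finish with `sys.exit(run_and_log(["npm", "test"], "/tmp/test-output.txt"))`, so a failing test run still produces a non-zero exit.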
2504
+ ---
2505
+
2506
+ **4. Output Lost / Not Visible**
2507
+
2508
+ ```bash
2509
+ # Problem: Hook runs but Claude doesn't see results
2510
+ command: "npm run lint > /dev/null 2>&1"
2511
+
2512
+ # Fix: Ensure output goes to stdout/stderr (Claude captures both)
2513
+ command: "npm run lint 2>&1"
2514
+
2515
+ # Or: Use tee to log AND display
2516
+ command: "npm run lint 2>&1 | tee /tmp/lint-output.txt"
2517
+ ```
2518
+
2519
+ ---
2520
+
2521
+ **5. Environment Variables Not Set**
2522
+
2523
+ ```bash
2524
+ # Problem: $CLAUDE_FILE_PATHS empty when expected
2525
+ command: "echo File: $CLAUDE_FILE_PATHS"
2526
+ # Output: "File: " (empty)
2527
+
2528
+ # Debug: Check which variables are available
2529
+ command: "env | grep CLAUDE"
2530
+
2531
+ # Available variables (PostToolUse):
2532
+ # CLAUDE_TOOL_NAME = "Write" or "Edit"
2533
+ # CLAUDE_FILE_PATHS = space-separated list of files
2534
+
2535
+ # Available variables (PreToolUse):
2536
+ # CLAUDE_TOOL_INPUT = JSON input to tool
2537
+ # CLAUDE_TOOL_NAME = tool being called
2538
+ ```
2539
+
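The `env | grep CLAUDE` debug step can be reproduced inside a hook script to record exactly what the hook receives. A small sketch; `claude_variables` is a hypothetical helper, and which variables actually appear depends on the Claude Code version in use.

```python
import os


def claude_variables(environ: dict[str, str]) -> dict[str, str]:
    """Return only the variables whose names start with CLAUDE_, sorted by name."""
    return {k: environ[k] for k in sorted(environ) if k.startswith("CLAUDE_")}


# Print whatever the current process received; prints nothing outside a hook.
for name, value in claude_variables(dict(os.environ)).items():
    print(f"{name}={value!r}")
```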
2540
+ ---
2541
+
2542
+ ### Testing Checklist
2543
+
2544
+ **Before deploying automation:**
2545
+
2546
+ **Skills**:
2547
+
2548
+ - [ ] YAML validates (yamllint)
2549
+ - [ ] Description <1024 chars
2550
+ - [ ] Includes specific trigger keywords
2551
+ - [ ] Tested positive case (should activate)
2552
+ - [ ] Tested negative case (should NOT activate)
2553
+ - [ ] Returns structured output documented
2554
+
2555
+ **Hooks**:
2556
+
2557
+ - [ ] Script executable (`chmod +x`)
2558
+ - [ ] Matcher is tool name (Write, Edit, etc.)
2559
+ - [ ] Exit code checked (`${PIPESTATUS[0]}`)
2560
+ - [ ] Output visible (stdout/stderr, not /dev/null)
2561
+ - [ ] Environment variables tested (`echo $CLAUDE_*`)
2562
+ - [ ] Hook fires on expected events (test manually)
2563
+
2564
+ **Debugging Commands**:
2565
+
2566
+ ```bash
2567
+ # List all Skills
2568
+ ls -la ~/.claude/skills/
2569
+
2570
+ # Validate YAML
2571
+ yamllint ~/.claude/skills/*/SKILL.md
2572
+
2573
+ # Check hook permissions
2574
+ ls -la ~/.claude/hooks/
2575
+
2576
+ # Test hook script manually
2577
+ bash -c 'export CLAUDE_FILE_PATHS="test.ts"; ~/.claude/hooks/quality-check.sh'
2578
+
2579
+ # View hook configuration
2580
+ jq '.hooks' ~/.claude/settings.json
2581
+
2582
+ # Check Skill description length (rough upper bound: also counts lines after the description)
2583
+ grep -A 100 "description: |" ~/.claude/skills/quality-reviewer/SKILL.md | head -50 | wc -c
2584
+ ```
2585
+
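For a tighter check against the 1024-character limit in the checklist, the description can be measured on its own. This is a minimal sketch using only the standard library: `description_length` is a hypothetical helper, it assumes a top-level `description:` key written either on one line or as an indented block scalar, and it is not a full YAML parser.

```python
def description_length(skill_md: str) -> int:
    """Approximate character count of the `description` field in SKILL.md text."""
    lines = skill_md.splitlines()
    for i, line in enumerate(lines):
        if not line.startswith("description:"):
            continue
        value = line[len("description:"):].strip()
        if value not in ("|", ">", "|-", ">-"):
            return len(value)  # single-line description
        block = []
        for following in lines[i + 1:]:
            # The block ends at the first non-empty, non-indented line.
            if following.strip() and not following.startswith((" ", "\t")):
                break
            block.append(following.strip())
        return len("\n".join(block).strip())
    return 0
```

Reading the file with `open(path, encoding="utf-8").read()` and comparing the result against 1024 covers the "Description <1024 chars" item.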
2586
+ ---
2587
+
2588
+ ## Success Metrics
2589
+
2590
+ **⚠️ UPDATED**: Targets revised based on realistic Skills activation rates (50-70%, not 100%)
2591
+
2592
+ Track these before/after metrics:
2593
+
2594
+ ### Quantitative (Revised Targets)
2595
+
2596
+ | Metric | Before | Target After Phase 1 | Target After Phase 2 | Target After Phase 3 |
2597
+ | --------------------------------- | ------ | -------------------- | -------------------- | -------------------- |
2598
+ | Avg prompt length (chars) | 250 | 150 (-40%) | 125 (-50%) | 100 (-60%) |
2599
+ | Quality check prompts per session | 15 | 6 (-60%) | 5-8 (-50-65%) | 4-6 (-60-70%) |
2600
+ | Time to implementation (min) | 10 | 7 (-30%) | 6 (-40%) | 5 (-50%) |
2601
+ | Docs lookup prompts | 8 | 3 (-62%) | 2-3 (-62-75%) | 1-2 (-75-87%) |
2602
+ | **Skills activation rate** | N/A | N/A | **50-70%** | **60-75%** |
2603
+ | **Manual fallback usage** | N/A | N/A | **30-50%** | **25-40%** |
2604
+
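The percentages in the table follow from a simple before/after calculation, sketched below so you can recompute them with your own baselines. `reduction_pct` is a hypothetical helper. Note that the ranges in the table are rounded: 15 → 5-8 prompts works out to roughly 47-67%.

```python
def reduction_pct(before: float, after: float) -> int:
    """Percentage reduction from `before` to `after`, rounded to a whole number."""
    return round((before - after) / before * 100)


# Avg prompt length, Phase 1 target: 250 -> 150 chars.
print(reduction_pct(250, 150))
# Quality check prompts, Phase 2 range: 15 -> 8 and 15 -> 5.
print(reduction_pct(15, 8), reduction_pct(15, 5))
```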
2605
+ **Key changes from original plan**:
2606
+
2607
+ - ❌ Removed "0 (-100%)" targets - unrealistic with current Skills reliability
2608
+ - ✅ Added ranges reflecting 50-70% automation success rate
2609
+ - ✅ Added Skills activation rate tracking (critical metric)
2610
+ - ✅ Added manual fallback usage tracking
2611
+
2612
+ **Why targets are conservative**:
2613
+
2614
+ - Skills don't activate 100% of time (30-70% typical)
2615
+ - Hooks may fail in v2.0.27/v2.0.29 without workaround
2616
+ - Description tuning takes iterative refinement
2617
+ - User must remember slash commands when Skills don't fire
2618
+
2619
+ ### Quantitative Goals by Phase
2620
+
2621
+ **Phase 1** (Slash Commands):
2622
+
2623
+ - 40% reduction in prompt length (250 → 150 chars)
2624
+ - Baseline behavior established via CLAUDE.md
2625
+ - No automation (all manual, but shorter commands)
2626
+
2627
+ **Phase 2** (Skills + Hooks):
2628
+
2629
+ - 50-65% reduction in quality check prompts (15 → 5-8 per session)
2630
+ - Skills activate 50-70% of time (track weekly)
2631
+ - Slash commands provide fallback for 30-50% of cases
2632
+ - Hooks automate linting/testing (if working in your version)
2633
+
2634
+ **Phase 3** (Advanced Automation):
2635
+
2636
+ - 60-70% reduction in quality check prompts (15 → 4-6 per session)
2637
+ - Skills activation improves to 60-75% after description tuning
2638
+ - Manual fallback needed 25-40% of time
2639
+ - UserPromptSubmit Hook (optional) catches some missed activations
2640
+
2641
+ **Target NOT 100% automation** - Accept that 25-40% will remain manual
2642
+
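Tracking the activation rate weekly only needs a tally of relevant prompts and whether the Skill fired. A minimal sketch, assuming you record one boolean per relevant prompt by hand; `activation_rate` and `meets_phase2_target` are hypothetical names, and the 50% threshold is the lower bound of the Phase 2 target above.

```python
def activation_rate(outcomes: list[bool]) -> float:
    """Share of relevant prompts (0.0 to 1.0) where the Skill activated."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def meets_phase2_target(outcomes: list[bool]) -> bool:
    """Phase 2 target from this plan: at least 50% activation."""
    return activation_rate(outcomes) >= 0.50


# One week of hand-recorded results: True means the Skill activated.
week = [True, True, False, True, False]
print(activation_rate(week), meets_phase2_target(week))
```

Manual fallback usage is the complement, `1 - activation_rate(week)`, which is why the two rows in the table roughly mirror each other.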
2643
+ ### Qualitative
2644
+
2645
+ - **Quality maintained**: No increase in bugs or refactoring needs
2646
+ - **Developer experience**: Feels faster, less repetitive _even with 30-50% manual_
2647
+ - **False positives**: <5% of hook/skill triggers are wrong
2648
+ - **Comprehensiveness**: Automated checks as thorough as manual
2649
+ - **Skill activation trending up**: Activation rate improves week-over-week
2650
+ - **Fallback workflow smooth**: Slash commands feel natural, not frustrating
2651
+
2652
+ **Measure at**: Week 1 (Phase 1), Week 2 (Phase 2), Week 4 (Phase 2 iteration), Week 6 (Phase 3)
2653
+
2654
+ ---
2655
+
2656
+ ## Maintenance
2657
+
2658
+ ### Weekly Review
2659
+
2660
+ **Check:**
2661
+
2662
+ - Are slash commands being used? (check history)
2663
+ - Are skills activating correctly? (check logs)
2664
+ - Are hooks causing issues? (monitor for complaints)
2665
+ - Quality standards still being met? (code review)
2666
+
2667
+ **Refine:**
2668
+
2669
+ - Update CLAUDE.md if patterns emerge
2670
+ - Adjust hook conditions if false positives occur
2671
+ - Enhance skill descriptions if activation is inconsistent
2672
+
2673
+ ### Monthly Review
2674
+
2675
+ **Assess:**
2676
+
2677
+ - Metrics: Are targets being met?
2678
+ - Workflow: Any new repetitive patterns?
2679
+ - Tools: Any new automation opportunities?
2680
+
2681
+ **Evolve:**
2682
+
2683
+ - Archive unused commands
2684
+ - Create new commands for new patterns
2685
+ - Update quality criteria based on learnings
2686
+
2687
+ ---
2688
+
2689
+ ## Future Enhancements (Phase 4 and Beyond)
2690
+
2691
+ ### 1. Project-Specific Automation
2692
+
2693
+ Create per-project hooks/skills in `.claude/`:
2694
+
2695
+ **soulless-monorepo**:
2696
+
2697
+ - Custom commands for cross-package operations
2698
+ - Hooks for Electron-specific validations
2699
+ - Skills for desktop app patterns
2700
+
2701
+ **bitd**:
2702
+
2703
+ - Commands for Blades mechanics validation
2704
+ - Skills for game design review
2705
+ - Hooks for test coverage enforcement
2706
+
2707
+ ### 2. CI/CD Integration
2708
+
2709
+ **GitHub Actions** (see docs.claude.com/en/docs/claude-code/github-actions.md):
2710
+
2711
+ - Automated PR reviews using Claude
2712
+ - Quality checks in CI pipeline
2713
+ - Custom automation with prompts
2714
+
2715
+ ### 3. Team Collaboration
2716
+
2717
+ **Share automation across team**:
2718
+
2719
+ - Commit `.claude/commands/` to git
2720
+ - Standardize quality checks
2721
+ - Document team-specific slash commands
2722
+
2723
+ ### 4. External Integrations (Phase 4)
2724
+
2725
+ **Subagents for specialized analysis**:
2726
+
2727
+ - `code-reviewer` - Comprehensive PR diff analysis (separate context)
2728
+ - `architecture-auditor` - Design doc consistency checks
2729
+ - Continue using `Explore` for exhaustive codebase searches
2730
+
2731
+ **MCP Servers for external tools**:
2732
+
2733
+ - GitHub integration for PR automation
2734
+ - Database servers for production validation
2735
+ - Company-specific linters and tools
2736
+
2737
+ **When to implement**: Only when you need these specific integrations
2738
+
2739
+ ### 5. Advanced Skills (Beyond Phase 4)
2740
+
2741
+ **Additional skills to build**:
2742
+
2743
+ - `architecture-reviewer` - Inline checks against ARCHITECTURE.md
2744
+ - `test-coverage-analyzer` - Identifies untested code paths
2745
+ - `dependency-auditor` - Checks for outdated/vulnerable deps
2746
+ - `performance-profiler` - Reviews for common performance issues
2747
+
2748
+ ---
2749
+
2750
+ ## Decision Log
2751
+
2752
+ ### Decision 1: Start with Slash Commands (Phase 1)
2753
+
2754
+ **Rationale**:
2755
+
2756
+ - Lowest effort, immediate value
2757
+ - Easy to test and iterate
2758
+ - No risk of automation misbehaving
2759
+ - Builds muscle memory for quality checks
2760
+
2761
+ **Alternative considered**: Start with Skills
2762
+ **Why not**: Skills are complex, harder to debug, require more testing
2763
+
2764
+ ---
2765
+
2766
+ ### Decision 2: Quality Reviewer Skill in Phase 2
2767
+
2768
+ **Rationale**:
2769
+
2770
+ - Most powerful automation after slash commands proven
2771
+ - Auto-invoked = no manual work when the Skill activates
2772
+ - Comprehensive checks in one place
2773
+
2774
+ **Alternative considered**: UserPromptSubmit Hook first
2775
+ **Why not**: Hook is more invasive, Skills allow more nuance
2776
+
2777
+ ---
2778
+
2779
+ ### Decision 3: UserPromptSubmit Hook Optional (Phase 3)
2780
+
2781
+ **Rationale**:
2782
+
2783
+ - Phase 2 may be sufficient (skills + post-hooks)
2784
+ - Most invasive = highest risk of disruption
2785
+ - Can delay if Phase 2 meets needs
2786
+
2787
+ **Alternative considered**: Make it Phase 1 for max impact
2788
+ **Why not**: Too risky to start with, need to validate approach first
2789
+
2790
+ ---
2791
+
2792
+ ### Decision 4: Global vs Project-Specific
2793
+
2794
+ **Rationale**:
2795
+
2796
+ - Start global (`~/.claude/`) for consistency
2797
+ - Add project-specific (`.claude/`) when patterns diverge
2798
+ - Easier to manage one set initially
2799
+
2800
+ **Pattern**: Global by default, project-specific when needed
2801
+
2802
+ ---
2803
+
2804
+ ### Decision 5: NOT Using Subagents for Quality Checks (Phase 2)
2805
+
2806
+ **Rationale**:
2807
+
2808
+ - Too slow (context switching overhead)
2809
+ - Lack conversation history (can't see requirements, CLAUDE.md)
2810
+ - Better suited for long analysis (your actual usage: Explore for searches)
2811
+ - Skills provide same intelligence with context access
2812
+
2813
+ **Alternative considered**: Custom quality-reviewer subagent
2814
+ **Why not**: Skills are faster, have context, and auto-activate
2815
+
2816
+ **When to use subagents**: Comprehensive PR reviews, architecture audits, exhaustive searches (already doing this correctly)
2817
+
2818
+ ---
2819
+
2820
+ ### Decision 6: NOT Using MCP Servers for Quality Checks (Phase 2)
2821
+
2822
+ **Rationale**:
2823
+
2824
+ - External API costs ($$ for GPT-4/Gemini)
2825
+ - Network latency (slower than in-context)
2826
+ - No conversation context (doesn't see requirements)
2827
+ - Complex setup (install + config + API keys)
2828
+ - Hooks already handle linting (simpler)
2829
+
2830
+ **Alternative considered**: praneybehl/code-review-mcp
2831
+ **Why not**: Skills provide better, faster, cheaper, context-aware checks
2832
+
2833
+ **When to use MCP**: GitHub PR automation, database validation, external company tools (Phase 4 - future)
2834
+
2835
+ ---
2836
+
2837
+ ### Decision 7: Trigger Mechanism - NOT Parsing JSON Responses
2838
+
2839
+ **Context**: The `{"proposedChanges": boolean, "madeChanges": boolean}` format is Claude's OUTPUT (appended to every response per code-philosophy.md), not an INPUT that triggers automation.
2840
+
2841
+ **Rationale for NOT parsing JSON**:
2842
+
2843
+ - **Architectural mismatch** - Skills are tools Claude invokes BEFORE taking action, not analyzers that react to Claude's own output
2844
+ - **No conversation history access** - Skills don't have access to read previous assistant messages
2845
+ - **Circular dependency** - "Claude responds → Parse response → Tell Claude what to do" breaks the automation flow
2846
+ - **Deterministic alternatives exist** - Tool use events (Write/Edit) are observable and reliable triggers
2847
+
2848
+ **Correct trigger mechanisms**:
2849
+
2850
+ | Use Case | Trigger | Mechanism | When Fires |
2851
+ | ----------------------- | --------------------------------- | ---------------------- | ----------------------------------- |
2852
+ | `proposedChanges: true` | Claude about to propose/implement | Quality Reviewer Skill | BEFORE Write/Edit tools (proactive) |
2853
+ | `madeChanges: true` | Claude modified files | PostToolUse Hook | AFTER Write/Edit tools (reactive) |
2854
+
2855
+ **Skills = Proactive**: Claude invokes them as tools BEFORE taking action (like Read, WebFetch, Grep)
2856
+ **Hooks = Reactive**: Respond to events (tool use, prompt submission) with deterministic triggers
2857
+
2858
+ **Role of JSON format**:
2859
+
2860
+ - ✅ **Keep it** for transparency and debugging (user can see what happened)
2861
+ - ✅ **Document** whether Claude proposed or made changes
2862
+ - ❌ **Don't use as trigger** - Automation uses Skills/Hooks based on behavioral events, not text parsing
2863
+
2864
+ **Example flow**:
2865
+
2866
+ ```
2867
+ 1. User: "implement feature X"
2868
+ 2. Claude thinks: "I should check quality first" (CLAUDE.md instruction)
2869
+ 3. Claude invokes Quality Reviewer Skill (proactive tool use)
2870
+ 4. Skill performs review, returns PROCEED/REVISE/USER INPUT
2871
+ 5. Claude proceeds: Uses Write tool
2872
+ 6. PostToolUse Hook fires (reactive to Write tool)
2873
+ 7. Hook runs linters/tests, provides feedback
2874
+ 8. Claude responds with results + {"proposedChanges": false, "madeChanges": true}
2875
+ ```
2876
+
2877
+ **Alternative considered**: Parse JSON from previous assistant message in a Skill
2878
+ **Why not**: Skills can't read conversation history, and even if they could, tool use events are cleaner triggers
2879
+
2880
+ **Alternative considered**: UserPromptSubmit Hook that checks for "yes" and appends quality checklist
2881
+ **Why not**: Too invasive (false positives on conversational "yes"), Skills provide context-aware intelligence
2882
+
2883
+ **Key insight**: Don't fight the architecture - use Skills for proactive quality checks, Hooks for reactive validation, and keep JSON as documentation.
2884
+
2885
+ ---
2886
+
2887
+ ### Decision 8: Dynamic Documentation Discovery (Not Static Tables) 🚧 PROPOSED
2888
+
2889
+ **⚠️ STATUS**: PROPOSED - NEEDS TESTING. This approach is theoretically sound but unvalidated with actual Claude Code Skills. Implement as prototype and iterate based on results.
2890
+
2891
+ **Problem**: The "latest documentation and best practices" varies by stack (Electron, TypeScript), context (testing, UX), and domain (conversational agents, document editing). Static lookup tables don't scale.
2892
+
2893
+ **Solution**: Process-based discovery, not catalog-based enumeration.
2894
+
2895
+ **Discovery Algorithm for Quality Reviewer Skill:**
2896
+
2897
+ ```
2898
+ 1. AUTO-DETECT STACK
2899
+ - Read package.json dependencies (if exists)
2900
+ - Extract libraries/frameworks in use
2901
+ - Note versions (critical for compatibility checks)
2902
+
2903
+ 2. INFER CONTEXT FROM TASK
2904
+ - User keywords: "test" → testing, "style" → UX, "slow" → performance
2905
+ - Files modified: *.test.ts → testing, *.css → UX, *.md → docs
2906
+
2907
+ 3. READ PROJECT CLAUDE.MD
2908
+ - Check for "Key Technologies" section
2909
+ - Look for domain context (game, desktop app, conversational AI)
2910
+ - Note any project-specific patterns documented
2911
+
2912
+ 4. BUILD VERSION-SPECIFIC SEARCH QUERIES
2913
+ - Format: "[library] v[X.Y] documentation [context]"
2914
+ - Example: "vitest v2.1 documentation async testing"
2915
+ - Fallback: "[library] latest documentation [context]"
2916
+ - Use current year from <env> if helpful: "electron {CURRENT_YEAR} IPC patterns"
2917
+
2918
+ 5. SEARCH WITH WEBSEARCH/WEBFETCH
2919
+ - Prioritize official docs (.dev, .io, .org, official GitHub)
2920
+ - Prefer /latest/ or /stable/ URL paths (auto-redirect to current)
2921
+ - Check publish dates in search results against <env> current date
2922
+ - Flag results older than package.json version release date
2923
+ ```
2924
+
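Steps 1, 2, and 4 of the algorithm above are mechanical enough to sketch in code. The version below is illustrative only and matches the proposed, untested status of this decision: the function names and the keyword tables are hypothetical (the table entries are copied from step 2), and steps 3 and 5 (reading CLAUDE.md and running the searches) are left out.

```python
import json
import re

# Hypothetical lookup tables, taken from the examples in step 2.
KEYWORD_CONTEXTS = {"test": "testing", "style": "UX", "slow": "performance"}
SUFFIX_CONTEXTS = {".test.ts": "testing", ".css": "UX", ".md": "docs"}


def detect_stack(package_json: str) -> dict[str, str]:
    """Step 1: map each dependency to its 'major.minor' version."""
    data = json.loads(package_json)
    deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})}
    stack = {}
    for name, spec in deps.items():
        match = re.search(r"(\d+)\.(\d+)", spec)
        if match:  # specs without a numeric version (e.g. "latest") are skipped
            stack[name] = f"{match.group(1)}.{match.group(2)}"
    return stack


def infer_contexts(request: str, files: list[str]) -> list[str]:
    """Step 2: collect contexts from request keywords and file suffixes."""
    found = []
    lowered = request.lower()
    for keyword, context in KEYWORD_CONTEXTS.items():
        if keyword in lowered and context not in found:
            found.append(context)
    for path in files:
        for suffix, context in SUFFIX_CONTEXTS.items():
            if path.endswith(suffix) and context not in found:
                found.append(context)
    return found


def build_queries(stack: dict[str, str], contexts: list[str]) -> list[str]:
    """Step 4: format '[library] v[X.Y] documentation [context]' queries."""
    return [
        f"{lib} v{version} documentation {context}"
        for lib, version in stack.items()
        for context in contexts
    ]
```

In a Skill these steps would be carried out by Claude following the written process rather than by a script, so treat this as a way to check that the process is unambiguous.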
2925
+ **Handling "Latest" Documentation:**
2926
+
2927
+ **Problem**: LLM training data has a cutoff (January 2025), but <env> provides the current date.
2928
+
2929
+ **Solutions (in priority order):**
2930
+
2931
+ 1. **Use package.json version** - Most reliable
2932
+
2933
+ ```
2934
+ package.json has "vitest": "^2.1.0"
2935
+ → Search: "vitest v2.1 documentation"
2936
+ → Verify: Check if docs mention v2.1 features
2937
+ ```
2938
+
2939
+ 2. **Use /latest/ URLs** - Auto-redirect to current
2940
+
2941
+ ```
2942
+ https://vitest.dev/guide/ (always current)
2943
+ https://www.electronjs.org/docs/latest/ (version-agnostic)
2944
+ ```
2945
+
2946
+ 3. **Omit year from searches** - Let search engine find newest
2947
+
2948
+ ```
2949
+ ✅ "react hooks best practices"
2950
+ ❌ "react hooks best practices 2024" (assumes year)
2951
+ ```
2952
+
2953
+ 4. **Check dates in results** - Use <env> current date for comparison
2954
+
2955
+ ```
2956
+ <env> says: Today's date: 2025-10-27
2957
+ Search result says: Published: 2025-09-15
2958
+ → Recent enough (within 2 months)
2959
+ ```
2960
+
2961
+ 5. **Version compatibility warning**
2962
+ ```
2963
+ package.json: "react": "18.3.1"
2964
+ Docs found: React 19 documentation
2965
+ → Flag: "Docs are for newer version, check compatibility"
2966
+ ```
2967
+
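The version compatibility warning in item 5 comes down to comparing major versions. A minimal sketch, assuming the docs' version has already been extracted from the page; `major_version` and `compatibility_flag` are hypothetical helpers, and comparing only the major number is a simplification.

```python
import re
from typing import Optional


def major_version(spec: str) -> Optional[int]:
    """First integer in a version string such as '18.3.1' or '^2.1.0'."""
    match = re.search(r"\d+", spec)
    return int(match.group()) if match else None


def compatibility_flag(installed: str, docs_version: str) -> Optional[str]:
    """Return a warning when the docs' major version differs from the installed one."""
    have, docs = major_version(installed), major_version(docs_version)
    if have is None or docs is None or have == docs:
        return None
    direction = "newer" if docs > have else "older"
    return f"Docs are for {direction} version, check compatibility"


# The example from item 5: React 18.3.1 installed, React 19 docs found.
print(compatibility_flag("18.3.1", "19"))
```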
2968
+ **What to Document in CLAUDE.md (Process, Not Tables):**
2969
+
2970
+ ```markdown
2971
+ ## Documentation Verification (For Quality Reviewer Skill)
2972
+
2973
+ **Discovery Process:**
2974
+
2975
+ 1. **Detect stack** - Read package.json dependencies + versions
2976
+ 2. **Infer context** - Parse user request keywords + file paths
2977
+ 3. **Check project docs** - Read project CLAUDE.md for domain/patterns
2978
+ 4. **Build version-specific searches**:
2979
+ - "[library] v[version] documentation [context]"
2980
+ - Prefer /latest/ or /stable/ URL paths
2981
+ - Use current date from <env> for recency checks
2982
+ 5. **Prioritize official sources** - .dev, .io, .org, GitHub official repos
2983
+
2984
+ **Recency Validation:**
2985
+
2986
+ - Compare search result dates against <env> current date
2987
+ - Flag results >6 months old (may be stale)
2988
+ - Check if docs version matches package.json version
2989
+ - Warn if using docs for newer/older version than installed
2990
+
2991
+ **Red flags to watch for:**
2992
+
2993
+ - Deprecated APIs (search "[library] deprecated [version]")
2994
+ - Version mismatches (package.json v2.1, docs for v1.x)
2995
+ - Security advisories (check GitHub security tab)
2996
+ - Stale results (published before package.json version release)
2997
+
2998
+ **When to skip doc checks:**
2999
+
3000
+ - Standard library features (JS Array.map, Promise, etc.)
3001
+ - Well-known stable APIs (console.log, localStorage, etc.)
3002
+ - User explicitly says "use existing pattern from codebase"
3003
+ ```
3004
+
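The recency rule in the block above ("flag results >6 months old") can be expressed as a date comparison. A small sketch: `is_stale` is a hypothetical helper, and approximating six months as 183 days is an assumption.

```python
from datetime import date

MAX_AGE_DAYS = 183  # assumption: "6 months" approximated as 183 days


def is_stale(published: date, today: date, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """True when a search result is older than the allowed age."""
    return (today - published).days > max_age_days


# Dates from this document's example: <env> date 2025-10-27.
today = date(2025, 10, 27)
print(is_stale(date(2025, 9, 15), today))
print(is_stale(date(2024, 12, 1), today))
```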
3005
+ **Example Flow:**
3006
+
3007
+ ```
3008
+ User: "write tests for the IPC handler"
3009
+
3010
+ Skill detects:
3011
+ ├─ package.json → electron: "^32.0.0", vitest: "^2.1.3"
3012
+ ├─ User keywords → "tests", "IPC"
3013
+ ├─ File context → src/main/ipc-handler.ts (Electron main process)
3014
+ └─ Project CLAUDE.md → Desktop app, Electron IPC documented in ARCHITECTURE.md
3015
+
3016
+ Skill searches:
3017
+ 1. "vitest v2.1 documentation" → https://vitest.dev/guide/
3018
+ 2. "electron v32 IPC documentation" → https://www.electronjs.org/docs/latest/api/ipc-main
3019
+ 3. "electron testing IPC handlers best practices"
3020
+
3021
+ Skill reviews:
3022
+ - Vitest v2.1 async testing patterns (matches version ✓)
3023
+ - Electron v32 contextBridge security (current major version ✓)
3024
+ - Checks publish dates: All within 3 months of <env> date (2025-10-27) ✓
3025
+ - Warns: "ipcRenderer.sendSync blocks the renderer; Electron docs recommend the invoke pattern"
3026
+
3027
+ Quality check includes:
3028
+ ✓ Using latest Vitest patterns (async/await, not callbacks)
3029
+ ✓ Following Electron v32 IPC security (contextBridge preload)
3030
+ ✓ Avoiding sendSync in favor of invoke (per Electron docs)
3031
+ ```
3032
+
3033
+ **Advantages:**
3034
+
3035
+ ✅ **No maintenance** - No tables to update when new libraries are released
3036
+ ✅ **Infinite extensibility** - Works for any stack/context/domain
3037
+ ✅ **Version-aware** - Checks compatibility with installed versions
3038
+ ✅ **Recency-aware** - Uses <env> date to validate freshness
3039
+ ✅ **Context-specific** - Only checks what's relevant to task
3040
+
3041
+ **Rationale**: Static tables go stale as soon as a library ships a new version. Process-based discovery needs no per-library upkeep and picks up new versions through WebSearch.
3042
+
3043
+ ---
3044
+
3045
+ ## Next Steps
3046
+
3047
+ 1. **Review this plan** - Critique against your workflow needs
3048
+ 2. **Approve Phase 1** - Start with slash commands + CLAUDE.md
3049
+ 3. **Test in real usage** - Try for 1 week in both projects
3050
+ 4. **Measure results** - Track metrics (prompt length, time saved)
3051
+ 5. **Decide Phase 2** - Proceed if Phase 1 successful
3052
+ 6. **Iterate** - Refine based on actual usage patterns
3053
+
3054
+ **Ready to proceed?** Say:
3055
+
3056
+ - "implement phase 1" → I'll create the files
3057
+ - "show me examples first" → I'll demonstrate each command
3058
+ - "modify the plan" → Tell me what to change