@simplysm/sd-claude 13.0.78 → 13.0.81

Files changed (68)
  1. package/claude/rules/sd-claude-rules.md +4 -63
  2. package/claude/rules/sd-simplysm-usage.md +7 -0
  3. package/claude/sd-session-start.sh +10 -0
  4. package/claude/sd-statusline.py +249 -0
  5. package/claude/skills/sd-api-review/SKILL.md +89 -0
  6. package/claude/skills/sd-check/SKILL.md +55 -57
  7. package/claude/skills/sd-commit/SKILL.md +37 -42
  8. package/claude/skills/sd-debug/SKILL.md +75 -265
  9. package/claude/skills/sd-document/SKILL.md +63 -53
  10. package/claude/skills/sd-document/_common.py +94 -0
  11. package/claude/skills/sd-document/extract_docx.py +19 -48
  12. package/claude/skills/sd-document/extract_pdf.py +22 -50
  13. package/claude/skills/sd-document/extract_pptx.py +17 -40
  14. package/claude/skills/sd-document/extract_xlsx.py +19 -40
  15. package/claude/skills/sd-email-analyze/SKILL.md +23 -31
  16. package/claude/skills/sd-email-analyze/email-analyzer.py +79 -65
  17. package/claude/skills/sd-init/SKILL.md +133 -0
  18. package/claude/skills/sd-plan/SKILL.md +69 -120
  19. package/claude/skills/sd-readme/SKILL.md +106 -131
  20. package/claude/skills/sd-review/SKILL.md +38 -155
  21. package/claude/skills/sd-simplify/SKILL.md +59 -0
  22. package/dist/commands/install.js +20 -6
  23. package/dist/commands/install.js.map +1 -1
  24. package/package.json +3 -2
  25. package/src/commands/install.ts +29 -7
  26. package/README.md +0 -297
  27. package/claude/refs/sd-angular.md +0 -127
  28. package/claude/refs/sd-code-conventions.md +0 -155
  29. package/claude/refs/sd-directories.md +0 -7
  30. package/claude/refs/sd-library-issue.md +0 -7
  31. package/claude/refs/sd-migration.md +0 -7
  32. package/claude/refs/sd-orm-v12.md +0 -81
  33. package/claude/refs/sd-orm.md +0 -23
  34. package/claude/refs/sd-service.md +0 -5
  35. package/claude/refs/sd-simplysm-docs.md +0 -52
  36. package/claude/refs/sd-solid.md +0 -68
  37. package/claude/refs/sd-workflow.md +0 -25
  38. package/claude/rules/sd-refs-linker.md +0 -52
  39. package/claude/sd-statusline.js +0 -296
  40. package/claude/skills/sd-api-name-review/SKILL.md +0 -154
  41. package/claude/skills/sd-brainstorm/SKILL.md +0 -215
  42. package/claude/skills/sd-debug/condition-based-waiting-example.ts +0 -158
  43. package/claude/skills/sd-debug/condition-based-waiting.md +0 -114
  44. package/claude/skills/sd-debug/defense-in-depth.md +0 -128
  45. package/claude/skills/sd-debug/find-polluter.sh +0 -64
  46. package/claude/skills/sd-debug/root-cause-tracing.md +0 -168
  47. package/claude/skills/sd-discuss/SKILL.md +0 -91
  48. package/claude/skills/sd-explore/SKILL.md +0 -118
  49. package/claude/skills/sd-plan-dev/SKILL.md +0 -294
  50. package/claude/skills/sd-plan-dev/code-quality-reviewer-prompt.md +0 -49
  51. package/claude/skills/sd-plan-dev/final-review-prompt.md +0 -50
  52. package/claude/skills/sd-plan-dev/implementer-prompt.md +0 -60
  53. package/claude/skills/sd-plan-dev/spec-reviewer-prompt.md +0 -45
  54. package/claude/skills/sd-review/api-reviewer-prompt.md +0 -75
  55. package/claude/skills/sd-review/code-reviewer-prompt.md +0 -82
  56. package/claude/skills/sd-review/convention-checker-prompt.md +0 -61
  57. package/claude/skills/sd-review/refactoring-analyzer-prompt.md +0 -92
  58. package/claude/skills/sd-skill/SKILL.md +0 -417
  59. package/claude/skills/sd-skill/anthropic-best-practices.md +0 -156
  60. package/claude/skills/sd-skill/cso-guide.md +0 -161
  61. package/claude/skills/sd-skill/examples/CLAUDE_MD_TESTING.md +0 -200
  62. package/claude/skills/sd-skill/persuasion-principles.md +0 -220
  63. package/claude/skills/sd-skill/testing-skills-with-subagents.md +0 -408
  64. package/claude/skills/sd-skill/writing-guide.md +0 -159
  65. package/claude/skills/sd-tdd/SKILL.md +0 -385
  66. package/claude/skills/sd-tdd/testing-anti-patterns.md +0 -317
  67. package/claude/skills/sd-use/SKILL.md +0 -67
  68. package/claude/skills/sd-worktree/SKILL.md +0 -78
@@ -1,61 +0,0 @@
1
- # Convention Checker Prompt
2
-
3
- Template for `Agent(general-purpose)`. Fill in `[CONVENTIONS]` and `[EXPLORE_FILES]`.
4
-
5
- ```
6
- You are checking code against project conventions.
7
- Your question: "Does this code violate any project-defined rules?"
8
-
9
- ## Context
10
-
11
- 1. Review the following project conventions (these are your Grep criteria):
12
-
13
- [CONVENTIONS]
14
-
15
- 2. Read these explore result files: [EXPLORE_FILES]
16
- 3. Collect ALL file paths from the **File Summaries** sections — these are your Grep scope
17
-
18
- ## Step 1: Extract Grep-searchable patterns
19
-
20
- From the conventions, extract rules that can be checked via text pattern matching.
21
-
22
- Examples of Grep-searchable rules:
23
- - `as unknown as` — prohibited
24
- - `as any` — prohibited in public-facing types
25
- - `export * from` or `export { } from` outside `src/index.ts` — prohibited
26
-
27
- **Skip rules that require semantic understanding** (e.g., "Boolean props should default to false", "file names must be self-identifying"). Only check patterns that can be matched syntactically.
28
-
29
- ## Step 2: Grep for each pattern
30
-
31
- For each prohibited pattern, run Grep across all files in scope.
32
-
33
- For patterns with justified exceptions (e.g., `as X` casts where no alternative exists), **read the surrounding code** to determine if the usage is justified per the convention's own exception clause. Report only unjustified matches.
34
-
35
- Do NOT skip or dismiss matches for these reasons:
36
- - "Widespread usage" is NOT an exception — it means widespread violation
37
- - "Codebase pattern" is NOT an exception — conventions define what's correct
38
-
39
- ## Step 3: Report
40
-
41
- ### [WARNING] title
42
-
43
- - **File**: path/to/file.ts:42
44
- - **Convention**: which rule from which convention
45
- - **Evidence**: the matching code (include snippet)
46
- - **Suggestion**: the fix recommended by the convention
47
-
48
- All violations are WARNING. Use CRITICAL only if the violation causes an immediate runtime bug.
49
-
50
- Start with:
51
-
52
- ## Convention Check Results
53
-
54
- ### Summary
55
- - Files checked: N
56
- - Conventions referenced: [list]
57
- - Violations found: N
58
-
59
- ### Violations
60
- [findings here]
61
- ```
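For readers who want to see what the removed prompt's "Grep-searchable patterns" step amounts to, here is a minimal sketch in Python. It is not part of the package. The `RULES` table and the `scan` function are hypothetical names, and the regular expressions are one possible reading of the three example rules. The sketch only collects candidate matches. Deciding whether an `as any` sits in a public-facing type, or whether a cast is justified, still requires reading the surrounding code, as the prompt says.

```python
import re

# Hypothetical rule table: (rule name, pattern, path suffix exempt from the rule or None).
RULES = [
    ("as unknown as", re.compile(r"\bas\s+unknown\s+as\b"), None),
    ("as any", re.compile(r"\bas\s+any\b"), None),
    (
        "re-export outside src/index.ts",
        re.compile(r"^\s*export\s+(\*|\{[^}]*\})\s+from\b"),
        "src/index.ts",
    ),
]


def scan(files):
    """Scan a mapping of path -> source text.

    Returns a list of (path, line_number, rule_name, snippet) candidates.
    """
    hits = []
    for path, text in files.items():
        normalized = path.replace("\\", "/")
        for line_no, line in enumerate(text.splitlines(), start=1):
            for name, pattern, exempt_suffix in RULES:
                # Skip the rule entirely for the file that is allowed to re-export.
                if exempt_suffix and normalized.endswith(exempt_suffix):
                    continue
                if pattern.search(line):
                    hits.append((path, line_no, name, line.strip()))
    return hits
```

Each returned tuple maps onto the report fields in the prompt: file and line, the convention that matched, and the evidence snippet.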
@@ -1,92 +0,0 @@
1
- # Refactoring Analyzer Prompt
2
-
3
- Template for `Agent(general-purpose)`. Fill in `[CONVENTIONS]` and `[EXPLORE_FILES]`.
4
-
5
- ```
6
- You are analyzing code for structural improvement and simplification.
7
- Your question: "Can this code be simpler or better organized without changing its behavior?"
8
-
9
- ## Context
10
-
11
- 1. Review the following project conventions relevant to code structure:
12
-
13
- [CONVENTIONS]
14
-
15
- 2. Read these explore result files: [EXPLORE_FILES]
16
- 3. From the explore results' **Tagged Files → REFACTOR** sections, collect all entries — these are your deep-read targets
17
-
18
- ## Step 1: Deep Review
19
-
20
- Read each file from the REFACTOR tagged list. For each:
21
- 1. Verify suspected structural issues from screening
22
- 2. Look for additional opportunities
23
-
24
- Look for:
25
-
26
- **Simplification:**
27
- - Unnecessary complexity: over-abstraction, needless indirection, complex generics
28
- - Duplication: same logic repeated across files, similar functions that could be unified
29
- - Readability: hard-to-follow control flow, unclear variable names, implicit behavior
30
-
31
- **Structure:**
32
- - Responsibility mixing: single module handling concerns that should be separate
33
- - Abstraction level mismatch: high-level orchestration mixed with low-level details
34
- - Module organization: related functionality scattered, or unrelated functionality grouped
35
- - Leaking abstractions: internal details exposed through public API
36
- - Coupling hotspots: changes that would cascade widely
37
-
38
- ## CRITICAL — Scope boundaries
39
-
40
- Do NOT report:
41
- - Bugs, security, logic errors, race conditions → code review
42
- - Naming consistency, API design, type quality → API review
43
- - Convention violations → convention checker
44
- - Documentation gaps, style preferences, import ordering
45
- - Performance optimization (unless also a structural improvement)
46
- - Magic numbers with clear adjacent comments
47
- - Small interface duplication (< 10 fields) where extraction adds indirection
48
- - Issues outside the reviewed files
49
-
50
- **Test each finding:** "Is this about CODE STRUCTURE, or about something else?"
51
-
52
- ## Step 2: Self-verify
53
-
54
- 1. **Structure test**: genuinely structural? not a bug, convention, or doc issue?
55
- 2. **Impact test**: would a developer actually struggle with this?
56
- 3. **Intentional pattern**: used consistently across the codebase? → by-design, drop.
57
- 4. **Separation benefit**: < ~150 lines AND tightly coupled? → splitting adds overhead, drop.
58
- 5. **Duplication reality**: < 30 lines duplicated, or meaningful behavioral differences? → drop.
59
-
60
- **Quality over quantity: 3 verified structural findings > 10 mixed findings.**
61
-
62
- ## Constraints
63
-
64
- - Analysis only. Do NOT modify any files.
65
- - Do NOT provide corrected code blocks. Describe issues in words only.
66
- - Only report structural issues with real evidence.
67
-
68
- ## Output Format
69
-
70
- ### [HIGH|MEDIUM|LOW] title
71
-
72
- - **File**: path/to/file.ts:42
73
- - **Evidence**: what you observed (include code snippet)
74
- - **Issue**: what the structural problem is
75
- - **Suggestion**: how to improve it (in words, not code)
76
-
77
- Impact levels:
78
- - HIGH: Major structural problem. Significantly harder to understand or modify safely.
79
- - MEDIUM: Notable concern. Unnecessary complexity or meaningful duplication.
80
- - LOW: Improvement opportunity. Cleaner structure exists but current is workable.
81
-
82
- Start with:
83
-
84
- ## Refactoring Analysis Results
85
-
86
- ### Summary
87
- - Files deep-reviewed: N (list them)
88
- - Findings: X HIGH, Y MEDIUM, Z LOW
89
-
90
- ### Findings
91
- [findings here]
92
- ```
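The removed prompt asks for a summary line of the form "Findings: X HIGH, Y MEDIUM, Z LOW" above findings headed `### [HIGH|MEDIUM|LOW] title`. As an illustration only, the sketch below shows how a consumer of such a report could recompute that line from the headings. `HEADING_RE` and `tally` are hypothetical names, not something the package ships.

```python
import re
from collections import Counter

# Matches finding headings such as "### [HIGH] Mixed responsibilities".
HEADING_RE = re.compile(r"^### \[(HIGH|MEDIUM|LOW)\] (.+)$", re.MULTILINE)


def tally(report: str) -> str:
    """Count finding headings by impact level and format the summary line."""
    counts = Counter(level for level, _title in HEADING_RE.findall(report))
    return (
        f"Findings: {counts['HIGH']} HIGH, "
        f"{counts['MEDIUM']} MEDIUM, {counts['LOW']} LOW"
    )
```

Comparing this recomputed line with the one the agent wrote is a cheap consistency check on the report.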
@@ -1,417 +0,0 @@
1
- ---
2
- name: sd-skill
3
- description: "Skill creation and editing (explicit invocation only)"
4
- ---
5
-
6
- # Writing Skills
7
-
8
- ## Overview
9
-
10
- **Writing skills IS Test-Driven Development applied to process documentation.**
11
-
12
- You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
13
-
14
- **Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
15
-
16
- **REQUIRED BACKGROUND:** You MUST understand sd-tdd before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
17
-
18
- **Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
19
-
20
- ## What is a Skill?
21
-
22
- A **skill** is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
23
-
24
- **Skills are:** Reusable techniques, patterns, tools, reference guides
25
-
26
- **Skills are NOT:** Narratives about how you solved a problem once
27
-
28
- ## TDD Mapping for Skills
29
-
30
- | TDD Concept | Skill Creation |
31
- | ----------------------- | ------------------------------------------------ |
32
- | **Test case** | Pressure scenario with subagent |
33
- | **Production code** | Skill document (SKILL.md) |
34
- | **Test fails (RED)** | Agent violates rule without skill (baseline) |
35
- | **Test passes (GREEN)** | Agent complies with skill present |
36
- | **Refactor** | Close loopholes while maintaining compliance |
37
- | **Write test first** | Run baseline scenario BEFORE writing skill |
38
- | **Watch it fail** | Document exact rationalizations agent uses |
39
- | **Minimal code** | Write skill addressing those specific violations |
40
- | **Watch it pass** | Verify agent now complies |
41
- | **Refactor cycle** | Find new rationalizations → plug → re-verify |
42
-
43
- The entire skill creation process follows RED-GREEN-REFACTOR.
44
-
45
- ## When to Create a Skill
46
-
47
- **Create when:**
48
-
49
- - Technique wasn't intuitively obvious to you
50
- - You'd reference this again across projects
51
- - Pattern applies broadly (not project-specific)
52
- - Others would benefit
53
-
54
- **Don't create for:**
55
-
56
- - One-off solutions
57
- - Standard practices well-documented elsewhere
58
- - Project-specific conventions (put in CLAUDE.md)
59
- - Mechanical constraints (if it's enforceable with regex/validation, automate it—save documentation for judgment calls)
60
-
61
- ## Skill Types
62
-
63
- ### Technique
64
-
65
- Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
66
-
67
- ### Pattern
68
-
69
- Way of thinking about problems (flatten-with-flags, test-invariants)
70
-
71
- ### Reference
72
-
73
- API docs, syntax guides, tool documentation (office docs)
74
-
75
- ## Directory Structure
76
-
77
- ```
78
- skills/
79
- skill-name/
80
- SKILL.md # Main reference (required)
81
- supporting-file.* # Only if needed
82
- ```
83
-
84
- **Flat namespace** - all skills in one searchable namespace
85
-
86
- **Separate files for:**
87
-
88
- 1. **Heavy reference** (100+ lines) - API docs, comprehensive syntax
89
- 2. **Reusable tools** - Scripts, utilities, templates
90
-
91
- **Keep inline:**
92
-
93
- - Principles and concepts
94
- - Code patterns (< 50 lines)
95
- - Everything else
96
-
97
- ## SKILL.md Structure
98
-
99
- **Frontmatter (YAML):**
100
-
101
- - Only two fields supported: `name` and `description`
102
- - Max 1024 characters total
103
- - `name`: Use letters, numbers, and hyphens only (no parentheses, special chars)
104
- - `description`: Third-person, describes ONLY when to use (NOT what it does)
105
- - Start with "Use when..." to focus on triggering conditions
106
- - Include specific symptoms, situations, and contexts
107
- - **NEVER summarize the skill's process or workflow** (see cso-guide.md for why)
108
- - Keep under 500 characters if possible
109
-
110
- ```markdown
111
- ---
112
- name: Skill-Name-With-Hyphens
113
- description: Use when [specific triggering conditions and symptoms]
114
- ---
115
-
116
- # Skill Name
117
-
118
- ## Overview
119
-
120
- What is this? Core principle in 1-2 sentences.
121
-
122
- ## When to Use
123
-
124
- [Small inline flowchart IF decision non-obvious]
125
-
126
- Bullet list with SYMPTOMS and use cases
127
- When NOT to use
128
-
129
- ## Core Pattern (for techniques/patterns)
130
-
131
- Before/after code comparison
132
-
133
- ## Quick Reference
134
-
135
- Table or bullets for scanning common operations
136
-
137
- ## Implementation
138
-
139
- Inline code for simple patterns
140
- Link to file for heavy reference or reusable tools
141
-
142
- ## Common Mistakes
143
-
144
- What goes wrong + fixes
145
-
146
- ## Real-World Impact (optional)
147
-
148
- Concrete results
149
- ```
150
-
151
- ## Claude Search Optimization (CSO)
152
-
153
- **Critical for discovery.** See **cso-guide.md** for the complete guide covering description fields, keyword coverage, naming, token efficiency, and cross-referencing.
154
-
155
- ## Writing Guidelines
156
-
157
- **See writing-guide.md** for flowchart usage, code examples, file organization, and bulletproofing techniques.
158
-
159
- ## The Iron Law (Same as TDD)
160
-
161
- ```
162
- NO SKILL WITHOUT A FAILING TEST FIRST
163
- ```
164
-
165
- This applies to NEW skills AND EDITS to existing skills.
166
-
167
- Write skill before testing? Delete it. Start over.
168
- Edit skill without testing? Same violation.
169
-
170
- **No exceptions:**
171
-
172
- - Not for "simple additions"
173
- - Not for "just adding a section"
174
- - Not for "documentation updates"
175
- - Don't keep untested changes as "reference"
176
- - Don't "adapt" while running tests
177
- - Delete means delete
178
-
179
- **Only exemption — pure mechanical edits:** Typo fixes, tool/variable renames where the behavioral guidance is identical (e.g., `TodoWrite` → `TaskCreate`). If you're changing what the skill *teaches*, it's not mechanical — test it.
180
-
181
- **REQUIRED BACKGROUND:** The sd-tdd skill explains why this matters. Same principles apply to documentation.
182
-
183
- ## Testing All Skill Types
184
-
185
- Different skill types need different test approaches:
186
-
187
- ```mermaid
188
- flowchart TD
189
- A{"What type of skill?"}
190
- A -->|"Discipline (rules/requirements)"| B["Pressure test<br>(compliance under stress)"]
191
- A -->|"Technique (how-to guides)"| C["Application test<br>(correct technique usage)"]
192
- A -->|"Pattern (mental models)"| D["Recognition test<br>(when/how to apply)"]
193
- A -->|"Reference (docs/APIs)"| E["Retrieval test<br>(find & use reference)"]
194
- ```
195
-
196
- ### Discipline-Enforcing Skills (rules/requirements)
197
-
198
- **Examples:** TDD, verification-before-completion, designing-before-coding
199
-
200
- **Test with:**
201
-
202
- - Academic questions: Do they understand the rules?
203
- - Pressure scenarios: Do they comply under stress?
204
- - Multiple pressures combined: time + sunk cost + exhaustion
205
- - Identify rationalizations and add explicit counters
206
-
207
- **Success criteria:** Agent follows rule under maximum pressure
208
-
209
- ### Technique Skills (how-to guides)
210
-
211
- **Examples:** condition-based-waiting, root-cause-tracing, defensive-programming
212
-
213
- **Test with:**
214
-
215
- - Application scenarios: Can they apply the technique correctly?
216
- - Variation scenarios: Do they handle edge cases?
217
- - Missing information tests: Do instructions have gaps?
218
-
219
- **How to test:** Give a subagent a problem the technique solves, WITHOUT the skill. Observe what approach they use naturally. Then give the SAME problem WITH the skill and verify they apply the technique correctly.
220
-
221
- ```
222
- Example: Testing a "condition-based-waiting" skill
223
- 1. Ask subagent: "Fix this flaky test that uses setTimeout(500)"
224
- 2. WITHOUT skill: Agent increases timeout to 2000ms (wrong approach)
225
- 3. WITH skill: Agent replaces with polling/condition check (correct)
226
- ```
227
-
228
- **Success criteria:** Agent successfully applies technique to new scenario
229
-
230
- ### Pattern Skills (mental models)
231
-
232
- **Examples:** reducing-complexity, information-hiding concepts
233
-
234
- **Test with:**
235
-
236
- - Recognition scenarios: Do they recognize when pattern applies?
237
- - Application scenarios: Can they use the mental model?
238
- - Counter-examples: Do they know when NOT to apply?
239
-
240
- **Success criteria:** Agent correctly identifies when/how to apply pattern
241
-
242
- ### Reference Skills (documentation/APIs)
243
-
244
- **Examples:** API documentation, command references, library guides
245
-
246
- **Test with:**
247
-
248
- - Retrieval scenarios: Can they find the right information?
249
- - Application scenarios: Can they use what they found correctly?
250
- - Gap testing: Are common use cases covered?
251
-
252
- **Success criteria:** Agent finds and correctly applies reference information
253
-
254
- ## Common Rationalizations for Skipping Testing
255
-
256
- | Excuse | Reality |
257
- | -------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
258
- | "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
259
- | "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
260
- | "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
261
- | "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
262
- | "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
263
- | "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
264
- | "Academic review is enough" | Reading ≠ using. Test application scenarios. |
265
- | "No time to test" | Deploying untested skill wastes more time fixing it later. |
266
- | "I already know the baseline failures" | You know what YOU think the failures are. Run a subagent to see what ACTUALLY happens. Knowledge ≠ observation. |
267
- | "This is process theater" | If the process catches even one issue you missed, it paid for itself. "Theater" is what you call process before it saves you. |
268
- | "It applies the wrong test methodology" | Different skill types need different tests (pressure vs retrieval), but ALL types need testing. No type is exempt. |
269
-
270
- **All of these mean: Test before deploying. No exceptions.**
271
-
272
- ## Bulletproofing Skills Against Rationalization
273
-
274
- Skills that enforce discipline need to resist rationalization. **See writing-guide.md** for detailed techniques on closing loopholes, spirit-vs-letter arguments, rationalization tables, and red flags lists.
275
-
276
- ## RED-GREEN-REFACTOR for Skills
277
-
278
- Follow the TDD cycle:
279
-
280
- ### Subagent Rules
281
-
282
- **NEVER use `isolation: "worktree"` when launching subagents.** Worktrees break lint/build tooling. Always run subagents in the default (non-isolated) mode.
283
-
284
- ### RED: Write Failing Test (Baseline)
285
-
286
- Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
287
-
288
- - What choices did they make?
289
- - What rationalizations did they use (verbatim)?
290
- - Which pressures triggered violations?
291
-
292
- This is "watch the test fail" - you must see what agents naturally do before writing the skill.
293
-
294
- **You MUST actually run a subagent.** Do not substitute your own knowledge of "what agents would probably do." Your prediction of baseline behavior ≠ observed baseline behavior. Run the subagent, read the output, document what actually happened.
295
-
296
- ### GREEN: Write Minimal Skill
297
-
298
- Write skill that addresses those specific rationalizations. Don't add extra content for hypothetical cases.
299
-
300
- Run same scenarios WITH skill. Agent should now comply.
301
-
302
- ### REFACTOR: Close Loopholes
303
-
304
- Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
305
-
306
- **Testing methodology:** See testing-skills-with-subagents.md for the complete testing methodology:
307
-
308
- - How to write pressure scenarios
309
- - Pressure types (time, sunk cost, authority, exhaustion)
310
- - Plugging holes systematically
311
- - Meta-testing techniques
312
-
313
- ## Anti-Patterns
314
-
315
- ### ❌ Narrative Example
316
-
317
- "In session 2025-10-03, we found empty projectDir caused..."
318
- **Why bad:** Too specific, not reusable
319
-
320
- ### ❌ Multi-Language Dilution
321
-
322
- example-js.js, example-py.py, example-go.go
323
- **Why bad:** Mediocre quality, maintenance burden
324
-
325
- ### ❌ Code in Flowcharts
326
-
327
- ```mermaid
328
- flowchart TD
329
- A["import fs"] --> B["read file"]
330
- ```
331
-
332
- **Why bad:** Can't copy-paste, hard to read
333
-
334
- ### ❌ Generic Labels
335
-
336
- helper1, helper2, step3, pattern4
337
- **Why bad:** Labels should have semantic meaning
338
-
339
- ## STOP: Before Moving to Next Skill
340
-
341
- **After writing ANY skill, you MUST STOP and complete the deployment process.**
342
-
343
- **Do NOT:**
344
-
345
- - Create multiple skills in batch without testing each
346
- - Move to next skill before current one is verified
347
- - Skip testing because "batching is more efficient"
348
-
349
- **The deployment checklist below is MANDATORY for EACH skill.**
350
-
351
- Deploying untested skills = deploying untested code. It's a violation of quality standards.
352
-
353
- ## Skill Creation Checklist (TDD Adapted)
354
-
355
- **IMPORTANT: Use TaskCreate to create todos for EACH checklist item below.**
356
-
357
- **RED Phase - Write Failing Test:**
358
-
359
- - [ ] Create pressure scenarios (3+ combined pressures for discipline skills)
360
- - [ ] Run scenarios WITHOUT skill - document baseline behavior verbatim
361
- - [ ] Identify patterns in rationalizations/failures
362
-
363
- **GREEN Phase - Write Minimal Skill:**
364
-
365
- - [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
366
- - [ ] YAML frontmatter with only name and description (max 1024 chars)
367
- - [ ] Description starts with "Use when..." and includes specific triggers/symptoms
368
- - [ ] Description written in third person
369
- - [ ] Keywords throughout for search (errors, symptoms, tools)
370
- - [ ] Clear overview with core principle
371
- - [ ] Address specific baseline failures identified in RED
372
- - [ ] Code inline OR link to separate file
373
- - [ ] One excellent example (not multi-language)
374
- - [ ] Run scenarios WITH skill - verify agents now comply
375
-
376
- **REFACTOR Phase - Close Loopholes:**
377
-
378
- - [ ] Identify NEW rationalizations from testing
379
- - [ ] Add explicit counters (if discipline skill)
380
- - [ ] Build rationalization table from all test iterations
381
- - [ ] Create red flags list
382
- - [ ] Re-test until bulletproof
383
-
384
- **Quality Checks:**
385
-
386
- - [ ] Small flowchart only if decision non-obvious
387
- - [ ] Quick reference table
388
- - [ ] Common mistakes section
389
- - [ ] No narrative storytelling
390
- - [ ] Supporting files only for tools or heavy reference
391
-
392
- **Deployment:**
393
-
394
- - [ ] Commit skill to git and push to your fork (if configured)
395
- - [ ] Consider contributing back via PR (if broadly useful)
396
-
397
- ## Discovery Workflow
398
-
399
- How future Claude finds your skill:
400
-
401
- 1. **Encounters problem** ("tests are flaky")
402
- 2. **Finds SKILL** (description matches)
403
- 3. **Scans overview** (is this relevant?)
404
- 4. **Reads patterns** (quick reference table)
405
- 5. **Loads example** (only when implementing)
406
-
407
- **Optimize for this flow** - put searchable terms early and often.
408
-
409
- ## The Bottom Line
410
-
411
- **Creating skills IS TDD for process documentation.**
412
-
413
- Same Iron Law: No skill without failing test first.
414
- Same cycle: RED (baseline) → GREEN (write skill) → REFACTOR (close loopholes).
415
- Same benefits: Better quality, fewer surprises, bulletproof results.
416
-
417
- If you follow TDD for code, follow it for skills. It's the same discipline applied to documentation.
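The removed skill states several mechanical frontmatter rules (only `name` and `description`, at most 1024 characters, a name made of letters, numbers, and hyphens, a description starting with "Use when") and also advises automating anything enforceable by validation. A minimal sketch of such a check follows. It is an assumption-laden illustration, not the package's code: `check_frontmatter` is a hypothetical name, the parser handles only single-line `key: value` pairs rather than full YAML, and the 1024-character limit is assumed to apply to the text between the `---` markers.

```python
import re

NAME_RE = re.compile(r"^[A-Za-z0-9-]+$")


def check_frontmatter(skill_md: str) -> list[str]:
    """Return a list of problems found in a SKILL.md frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---'"]
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return ["missing closing '---'"]

    problems = []
    body = lines[1:end]
    # Assumption: the limit counts the text between the two '---' markers.
    if len("\n".join(body)) > 1024:
        problems.append("frontmatter exceeds 1024 characters")

    fields = {}
    for line in body:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            problems.append(f"unparseable line: {line!r}")
            continue
        fields[key.strip()] = value.strip().strip('"')

    for extra in sorted(set(fields) - {"name", "description"}):
        problems.append(f"unexpected field: {extra}")
    for required in ("name", "description"):
        if required not in fields:
            problems.append(f"missing field: {required}")
    if "name" in fields and not NAME_RE.match(fields["name"]):
        problems.append("name must use only letters, numbers, and hyphens")
    if "description" in fields and not fields["description"].startswith("Use when"):
        problems.append("description should start with 'Use when'")
    return problems
```

An empty list means the frontmatter passes these mechanical checks. Whether the description names the right triggering conditions remains a judgment call that a script like this cannot settle.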