opencodekit 0.5.0 → 0.6.1
- package/dist/index.js +1 -1
- package/dist/template/.opencode/AGENTS.md +12 -1
- package/dist/template/.opencode/command/skill-create.md +3 -3
- package/dist/template/.opencode/command/skill-optimize.md +1 -1
- package/dist/template/.opencode/command/summarize.md +71 -0
- package/dist/template/.opencode/opencode.json +23 -5
- package/dist/template/.opencode/package.json +1 -1
- package/dist/template/.opencode/plugin/sessions.ts +26 -1
- package/dist/template/.opencode/plugin/skill.ts +275 -0
- package/dist/template/.opencode/{skills → skill}/accessibility-audit/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/brainstorming/SKILL.md +2 -2
- package/dist/template/.opencode/{skills → skill}/design-system-audit/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/executing-plans/SKILL.md +13 -2
- package/dist/template/.opencode/{skills → skill}/frontend-aesthetics/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/mockup-to-code/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/requesting-code-review/SKILL.md +16 -6
- package/dist/template/.opencode/{skills → skill}/subagent-driven-development/SKILL.md +38 -17
- package/dist/template/.opencode/{skills → skill}/systematic-debugging/SKILL.md +28 -18
- package/dist/template/.opencode/{skills → skill}/testing-skills-with-subagents/SKILL.md +1 -1
- package/dist/template/.opencode/{skills → skill}/ui-ux-research/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/visual-analysis/SKILL.md +5 -0
- package/dist/template/.opencode/{skills → skill}/writing-plans/SKILL.md +3 -3
- package/dist/template/.opencode/{skills → skill}/writing-skills/SKILL.md +101 -41
- package/package.json +1 -1
- package/dist/template/.opencode/plugin/superpowers.ts +0 -271
- package/dist/template/.opencode/superpowers/.claude/settings.local.json +0 -141
- package/dist/template/.opencode/superpowers/.claude-plugin/marketplace.json +0 -20
- package/dist/template/.opencode/superpowers/.claude-plugin/plugin.json +0 -13
- package/dist/template/.opencode/superpowers/.codex/INSTALL.md +0 -35
- package/dist/template/.opencode/superpowers/.codex/superpowers-bootstrap.md +0 -33
- package/dist/template/.opencode/superpowers/.codex/superpowers-codex +0 -267
- package/dist/template/.opencode/superpowers/.github/FUNDING.yml +0 -3
- package/dist/template/.opencode/superpowers/.opencode/INSTALL.md +0 -135
- package/dist/template/.opencode/superpowers/.opencode/plugin/superpowers.js +0 -215
- package/dist/template/.opencode/superpowers/LICENSE +0 -21
- package/dist/template/.opencode/superpowers/README.md +0 -165
- package/dist/template/.opencode/superpowers/RELEASE-NOTES.md +0 -493
- package/dist/template/.opencode/superpowers/agents/code-reviewer.md +0 -48
- package/dist/template/.opencode/superpowers/commands/brainstorm.md +0 -5
- package/dist/template/.opencode/superpowers/commands/execute-plan.md +0 -5
- package/dist/template/.opencode/superpowers/commands/write-plan.md +0 -5
- package/dist/template/.opencode/superpowers/docs/README.codex.md +0 -153
- package/dist/template/.opencode/superpowers/docs/README.opencode.md +0 -234
- package/dist/template/.opencode/superpowers/docs/plans/2025-11-22-opencode-support-design.md +0 -294
- package/dist/template/.opencode/superpowers/docs/plans/2025-11-22-opencode-support-implementation.md +0 -1095
- package/dist/template/.opencode/superpowers/hooks/hooks.json +0 -15
- package/dist/template/.opencode/superpowers/hooks/session-start.sh +0 -34
- package/dist/template/.opencode/superpowers/lib/skills-core.js +0 -208
- package/dist/template/.opencode/superpowers/tests/opencode/run-tests.sh +0 -165
- package/dist/template/.opencode/superpowers/tests/opencode/setup.sh +0 -73
- package/dist/template/.opencode/superpowers/tests/opencode/test-plugin-loading.sh +0 -81
- package/dist/template/.opencode/superpowers/tests/opencode/test-priority.sh +0 -198
- package/dist/template/.opencode/superpowers/tests/opencode/test-skills-core.sh +0 -440
- package/dist/template/.opencode/superpowers/tests/opencode/test-tools.sh +0 -104
- /package/dist/template/.opencode/{skills → skill}/condition-based-waiting/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/condition-based-waiting/example.ts +0 -0
- /package/dist/template/.opencode/{skills → skill}/defense-in-depth/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/dispatching-parallel-agents/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/finishing-a-development-branch/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/gemini-large-context/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/receiving-code-review/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills/requesting-code-review/code-reviewer.md → skill/requesting-code-review/review.md} +0 -0
- /package/dist/template/.opencode/{skills → skill}/root-cause-tracing/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/root-cause-tracing/find-polluter.sh +0 -0
- /package/dist/template/.opencode/{skills → skill}/sharing-skills/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/systematic-debugging/CREATION-LOG.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/systematic-debugging/test-academic.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/systematic-debugging/test-pressure-1.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/systematic-debugging/test-pressure-2.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/systematic-debugging/test-pressure-3.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/test-driven-development/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/testing-anti-patterns/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/testing-skills-with-subagents/examples/CLAUDE_MD_TESTING.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/using-git-worktrees/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/using-superpowers/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/verification-before-completion/SKILL.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/writing-skills/anthropic-best-practices.md +0 -0
- /package/dist/template/.opencode/{skills → skill}/writing-skills/graphviz-conventions.dot +0 -0
- /package/dist/template/.opencode/{skills → skill}/writing-skills/persuasion-principles.md +0 -0
@@ -9,13 +9,13 @@ description: Use when creating new skills, editing existing skills, or verifying
 
 **Writing skills IS Test-Driven Development applied to process documentation.**
 
-**Personal skills live in agent-specific directories (`~/.claude/skills` for Claude Code, `~/.codex/skills` for Codex)**
+**Personal skills live in agent-specific directories (`~/.claude/skills` for Claude Code, `~/.codex/skills` for Codex)**
 
 You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
 
 **Core principle:** If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
 
-**REQUIRED BACKGROUND:** You MUST understand
+**REQUIRED BACKGROUND:** You MUST understand test-driven-development before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
 
 **Official guidance:** For Anthropic's official skill authoring best practices, see anthropic-best-practices.md. This document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
 
@@ -29,30 +29,32 @@ A **skill** is a reference guide for proven techniques, patterns, or tools. Skil
 
 ## TDD Mapping for Skills
 
-| TDD Concept
-
-| **Test case**
-| **Production code**
-| **Test fails (RED)**
-| **Test passes (GREEN)** | Agent complies with skill present
-| **Refactor**
-| **Write test first**
-| **Watch it fail**
-| **Minimal code**
-| **Watch it pass**
-| **Refactor cycle**
+| TDD Concept | Skill Creation |
+| ----------------------- | ------------------------------------------------ |
+| **Test case** | Pressure scenario with subagent |
+| **Production code** | Skill document (SKILL.md) |
+| **Test fails (RED)** | Agent violates rule without skill (baseline) |
+| **Test passes (GREEN)** | Agent complies with skill present |
+| **Refactor** | Close loopholes while maintaining compliance |
+| **Write test first** | Run baseline scenario BEFORE writing skill |
+| **Watch it fail** | Document exact rationalizations agent uses |
+| **Minimal code** | Write skill addressing those specific violations |
+| **Watch it pass** | Verify agent now complies |
+| **Refactor cycle** | Find new rationalizations → plug → re-verify |
 
 The entire skill creation process follows RED-GREEN-REFACTOR.
 
 ## When to Create a Skill
 
 **Create when:**
+
 - Technique wasn't intuitively obvious to you
 - You'd reference this again across projects
 - Pattern applies broadly (not project-specific)
 - Others would benefit
 
 **Don't create for:**
+
 - One-off solutions
 - Standard practices well-documented elsewhere
 - Project-specific conventions (put in CLAUDE.md)
@@ -60,17 +62,19 @@ The entire skill creation process follows RED-GREEN-REFACTOR.
 ## Skill Types
 
 ### Technique
+
 Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
 
 ### Pattern
+
 Way of thinking about problems (flatten-with-flags, test-invariants)
 
 ### Reference
+
 API docs, syntax guides, tool documentation (office docs)
 
 ## Directory Structure
 
-
 ```
 skills/
 skill-name/
@@ -81,10 +85,12 @@ skills/
 **Flat namespace** - all skills in one searchable namespace
 
 **Separate files for:**
+
 1. **Heavy reference** (100+ lines) - API docs, comprehensive syntax
 2. **Reusable tools** - Scripts, utilities, templates
 
 **Keep inline:**
+
 - Principles and concepts
 - Code patterns (< 50 lines)
 - Everything else
@@ -92,6 +98,7 @@ skills/
 ## SKILL.md Structure
 
 **Frontmatter (YAML):**
+
 - Only two fields supported: `name` and `description`
 - Max 1024 characters total
 - `name`: Use letters, numbers, and hyphens only (no parentheses, special chars)
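The frontmatter rules in this hunk (two supported fields, a 1024-character cap, and a restricted character set for `name`) lend themselves to a mechanical check. The following is a minimal sketch in Python, not part of the package: `check_frontmatter` is a hypothetical helper that receives already-parsed fields, the "Use when" check comes from the description guidance elsewhere in this file, and counting the cap as the combined length of `name` and `description` is an assumption, since the text does not say how the total is measured.

```python
import re

MAX_TOTAL_CHARS = 1024  # "Max 1024 characters total"; how it is counted is an assumption
NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")  # letters, numbers, and hyphens only
ALLOWED_FIELDS = {"name", "description"}


def check_frontmatter(fields: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []

    # Only `name` and `description` are supported.
    extra = sorted(set(fields) - ALLOWED_FIELDS)
    if extra:
        problems.append(f"unsupported fields: {', '.join(extra)}")
    missing = sorted(ALLOWED_FIELDS - set(fields))
    if missing:
        problems.append(f"missing fields: {', '.join(missing)}")

    name = fields.get("name", "")
    if name and not NAME_PATTERN.fullmatch(name):
        problems.append("name may contain only letters, numbers, and hyphens")

    description = fields.get("description", "")
    if description and not description.startswith("Use when"):
        problems.append('description should start with "Use when"')

    # Assumed interpretation of the cap: combined length of both values.
    if len(name) + len(description) > MAX_TOTAL_CHARS:
        problems.append(f"name + description exceed {MAX_TOTAL_CHARS} characters")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass, which suits a checklist-style review.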
@@ -109,32 +116,38 @@ description: Use when [specific triggering conditions and symptoms] - [what the
 # Skill Name
 
 ## Overview
+
 What is this? Core principle in 1-2 sentences.
 
 ## When to Use
+
 [Small inline flowchart IF decision non-obvious]
 
 Bullet list with SYMPTOMS and use cases
 When NOT to use
 
 ## Core Pattern (for techniques/patterns)
+
 Before/after code comparison
 
 ## Quick Reference
+
 Table or bullets for scanning common operations
 
 ## Implementation
+
 Inline code for simple patterns
 Link to file for heavy reference or reusable tools
 
 ## Common Mistakes
+
 What goes wrong + fixes
 
 ## Real-World Impact (optional)
+
 Concrete results
 ```
 
-
 ## Claude Search Optimization (CSO)
 
 **Critical for discovery:** Future Claude needs to FIND your skill
@@ -146,8 +159,9 @@ Concrete results
 **Format:** Start with "Use when..." to focus on triggering conditions, then explain what it does
 
 **Content:**
+
 - Use concrete triggers, symptoms, and situations that signal this skill applies
-- Describe the
+- Describe the _problem_ (race conditions, inconsistent behavior) not _language-specific symptoms_ (setTimeout, sleep)
 - Keep triggers technology-agnostic unless the skill itself is technology-specific
 - If skill is technology-specific, make that explicit in the trigger
 - Write in third person (injected into system prompt)
@@ -172,6 +186,7 @@ description: Use when using React Router and handling authentication redirects -
 ### 2. Keyword Coverage
 
 Use words Claude would search for:
+
 - Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
 - Symptoms: "flaky", "hanging", "zombie", "pollution"
 - Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
@@ -180,6 +195,7 @@ Use words Claude would search for:
 ### 3. Descriptive Naming
 
 **Use active voice, verb-first:**
+
 - ✅ `creating-skills` not `skill-creation`
 - ✅ `testing-skills-with-subagents` not `subagent-skill-testing`
 
@@ -188,6 +204,7 @@ Use words Claude would search for:
 **Problem:** getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
 
 **Target word counts:**
+
 - getting-started workflows: <150 words each
 - Frequently-loaded skills: <200 words total
 - Other skills: <500 words (still be concise)
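The word-count targets in this hunk can be checked programmatically as well as with the `wc -w` command the skill uses for verification. A rough sketch in Python, under the assumption that a whitespace split is close enough to what `wc -w` reports; the category keys and both function names are hypothetical labels introduced here, not names from the package.

```python
# Targets quoted from the skill text; the dictionary keys are hypothetical labels.
WORD_LIMITS = {
    "getting-started": 150,
    "frequently-loaded": 200,
    "other": 500,
}


def count_words(text: str) -> int:
    """Count whitespace-separated tokens, approximating `wc -w`."""
    return len(text.split())


def within_budget(text: str, category: str = "other") -> bool:
    """True when the text is strictly under the target for its category."""
    return count_words(text) < WORD_LIMITS[category]
```

The comparison is strict because the targets are written as `<150`, `<200`, and `<500`, so a text of exactly 150 words misses the getting-started target.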
@@ -195,6 +212,7 @@ Use words Claude would search for:
 **Techniques:**
 
 **Move details to tool help:**
+
 ```bash
 # ❌ BAD: Document all flags in SKILL.md
 search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
@@ -204,34 +222,42 @@ search-conversations supports multiple modes and filters. Run --help for details
 ```
 
 **Use cross-references:**
+
 ```markdown
 # ❌ BAD: Repeat workflow details
+
 When searching, dispatch subagent with template...
 [20 lines of repeated instructions]
 
 # ✅ GOOD: Reference other skill
+
 Always use subagents (50-100x context savings). REQUIRED: Use [other-skill-name] for workflow.
 ```
 
 **Compress examples:**
+
 ```markdown
 # ❌ BAD: Verbose example (42 words)
+
 your human partner: "How did we handle authentication errors in React Router before?"
 You: I'll search past conversations for React Router authentication patterns.
 [Dispatch subagent with search query: "React Router authentication error handling 401"]
 
 # ✅ GOOD: Minimal example (20 words)
+
 Partner: "How did we handle auth errors in React Router?"
 You: Searching...
 [Dispatch subagent → synthesis]
 ```
 
 **Eliminate redundancy:**
+
 - Don't repeat what's in cross-referenced skills
 - Don't explain what's obvious from command
 - Don't include multiple examples of same pattern
 
 **Verification:**
+
 ```bash
 wc -w skills/path/SKILL.md
 # getting-started workflows: aim for <150 each
@@ -239,12 +265,14 @@ wc -w skills/path/SKILL.md
 ```
 
 **Name by what you DO or core insight:**
+
 - ✅ `condition-based-waiting` > `async-test-helpers`
 - ✅ `using-skills` not `skill-usage`
 - ✅ `flatten-with-flags` > `data-structure-refactoring`
 - ✅ `root-cause-tracing` > `debugging-techniques`
 
 **Gerunds (-ing) work well for processes:**
+
 - `creating-skills`, `testing-skills`, `debugging-with-logs`
 - Active, describes the action you're taking
 
@@ -253,8 +281,9 @@ wc -w skills/path/SKILL.md
 **When writing documentation that references other skills:**
 
 Use skill name only, with explicit requirement markers:
-
-- ✅ Good: `**REQUIRED
+
+- ✅ Good: `**REQUIRED SUB-SKILL:** Use use_skill("test-driven-development")`
+- ✅ Good: `**REQUIRED BACKGROUND:** You MUST understand systematic-debugging`
 - ❌ Bad: `See skills/testing/test-driven-development` (unclear if required)
 - ❌ Bad: `@skills/testing/test-driven-development/SKILL.md` (force-loads, burns context)
 
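The two "Bad" reference styles in this hunk are regular enough to catch with a simple scan. The sketch below is a heuristic in Python, not something the package ships: `lint_cross_references` and both patterns are assumptions made for illustration, and the path pattern will also flag legitimate mentions of a `skills/...` path, such as a shell command.

```python
import re

# "@skills/..." is the force-loading form the skill text warns against.
FORCE_LOAD = re.compile(r"@skills/[\w./-]+")
# A bare "skills/..." path (not preceded by "@") leaves the requirement unclear.
PATH_REF = re.compile(r"(?<!@)\bskills/[\w./-]+")


def lint_cross_references(text: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for suspicious skill references."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in FORCE_LOAD.finditer(line):
            findings.append((lineno, f"force-load reference: {match.group()}"))
        for match in PATH_REF.finditer(line):
            findings.append((lineno, f"path reference, requirement unclear: {match.group()}"))
    return findings
```

References written as a bare skill name with a `REQUIRED` marker pass untouched, which matches the "Good" examples.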
@@ -276,11 +305,13 @@ digraph when_flowchart {
 ```
 
 **Use flowcharts ONLY for:**
+
 - Non-obvious decision points
 - Process loops where you might stop too early
 - "When to use A vs B" decisions
 
 **Never use flowcharts for:**
+
 - Reference material → Tables, lists
 - Code examples → Markdown blocks
 - Linear instructions → Numbered lists
@@ -293,11 +324,13 @@ See @graphviz-conventions.dot for graphviz style rules.
 **One excellent example beats many mediocre ones**
 
 Choose most relevant language:
+
 - Testing techniques → TypeScript/JavaScript
 - System debugging → Shell/Python
 - Data processing → Python
 
 **Good example:**
+
 - Complete and runnable
 - Well-commented explaining WHY
 - From real scenario
@@ -305,6 +338,7 @@ Choose most relevant language:
 - Ready to adapt (not generic template)
 
 **Don't:**
+
 - Implement in 5+ languages
 - Create fill-in-the-blank templates
 - Write contrived examples
@@ -314,21 +348,26 @@ You're good at porting - one great example is enough.
 ## File Organization
 
 ### Self-Contained Skill
+
 ```
 defense-in-depth/
 SKILL.md # Everything inline
 ```
+
 When: All content fits, no heavy reference needed
 
 ### Skill with Reusable Tool
+
 ```
 condition-based-waiting/
 SKILL.md # Overview + patterns
 example.ts # Working helpers to adapt
 ```
+
 When: Tool is reusable code, not just narrative
 
 ### Skill with Heavy Reference
+
 ```
 pptx/
 SKILL.md # Overview + workflows
@@ -336,6 +375,7 @@ pptx/
 ooxml.md # 500 lines XML structure
 scripts/ # Executable tools
 ```
+
 When: Reference material too large for inline
 
 ## The Iron Law (Same as TDD)
@@ -350,6 +390,7 @@ Write skill before testing? Delete it. Start over.
 Edit skill without testing? Same violation.
 
 **No exceptions:**
+
 - Not for "simple additions"
 - Not for "just adding a section"
 - Not for "documentation updates"
@@ -357,7 +398,7 @@ Edit skill without testing? Same violation.
 - Don't "adapt" while running tests
 - Delete means delete
 
-**REQUIRED BACKGROUND:** The
+**REQUIRED BACKGROUND:** The test-driven-development skill explains why this matters. Same principles apply to documentation.
 
 ## Testing All Skill Types
 
@@ -368,6 +409,7 @@ Different skill types need different test approaches:
 **Examples:** TDD, verification-before-completion, designing-before-coding
 
 **Test with:**
+
 - Academic questions: Do they understand the rules?
 - Pressure scenarios: Do they comply under stress?
 - Multiple pressures combined: time + sunk cost + exhaustion
@@ -380,6 +422,7 @@ Different skill types need different test approaches:
 **Examples:** condition-based-waiting, root-cause-tracing, defensive-programming
 
 **Test with:**
+
 - Application scenarios: Can they apply the technique correctly?
 - Variation scenarios: Do they handle edge cases?
 - Missing information tests: Do instructions have gaps?
@@ -391,6 +434,7 @@ Different skill types need different test approaches:
 **Examples:** reducing-complexity, information-hiding concepts
 
 **Test with:**
+
 - Recognition scenarios: Do they recognize when pattern applies?
 - Application scenarios: Can they use the mental model?
 - Counter-examples: Do they know when NOT to apply?
@@ -402,6 +446,7 @@ Different skill types need different test approaches:
 **Examples:** API documentation, command references, library guides
 
 **Test with:**
+
 - Retrieval scenarios: Can they find the right information?
 - Application scenarios: Can they use what they found correctly?
 - Gap testing: Are common use cases covered?
@@ -410,16 +455,16 @@ Different skill types need different test approaches:
 
 ## Common Rationalizations for Skipping Testing
 
-| Excuse
-
-| "Skill is obviously clear"
-| "It's just a reference"
-| "Testing is overkill"
-| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying.
-| "Too tedious to test"
-| "I'm confident it's good"
-| "Academic review is enough"
-| "No time to test"
+| Excuse | Reality |
+| ------------------------------ | ---------------------------------------------------------------- |
+| "Skill is obviously clear" | Clear to you ≠ clear to other agents. Test it. |
+| "It's just a reference" | References can have gaps, unclear sections. Test retrieval. |
+| "Testing is overkill" | Untested skills have issues. Always. 15 min testing saves hours. |
+| "I'll test if problems emerge" | Problems = agents can't use skill. Test BEFORE deploying. |
+| "Too tedious to test" | Testing is less tedious than debugging bad skill in production. |
+| "I'm confident it's good" | Overconfidence guarantees issues. Test anyway. |
+| "Academic review is enough" | Reading ≠ using. Test application scenarios. |
+| "No time to test" | Deploying untested skill wastes more time fixing it later. |
 
 **All of these mean: Test before deploying. No exceptions.**
 
@@ -444,11 +489,13 @@ Write code before test? Delete it.
 Write code before test? Delete it. Start over.
 
 **No exceptions:**
+
 - Don't keep it as "reference"
 - Don't "adapt" it while writing tests
 - Don't look at it
 - Delete means delete
-
+
+````
 </Good>
 
 ### Address "Spirit vs Letter" Arguments
@@ -457,7 +504,7 @@ Add foundational principle early:
 
 ```markdown
 **Violating the letter of the rules is violating the spirit of the rules.**
-
+````
 
 This cuts off entire class of "I'm following the spirit" rationalizations.
 
@@ -466,10 +513,10 @@ This cuts off entire class of "I'm following the spirit" rationalizations.
 Capture rationalizations from baseline testing (see Testing section below). Every excuse agents make goes in the table:
 
 ```markdown
-| Excuse
-
-| "Too simple to test"
-| "I'll test after"
+| Excuse | Reality |
+| -------------------------------- | ----------------------------------------------------------------------- |
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
 | "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
 ```
 
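Since every excuse captured during baseline testing goes into an Excuse/Reality table, and the new version of the file pads its table columns to equal width, the table could be generated from collected pairs instead of maintained by hand. A small sketch in Python, assuming the pairs are already gathered as tuples; `render_table` is a hypothetical helper, not part of the package.

```python
def render_table(
    rows: list[tuple[str, str]],
    headers: tuple[str, str] = ("Excuse", "Reality"),
) -> str:
    """Render two-column rows as a Markdown table with padded, aligned columns."""
    # Each column is as wide as its longest cell, header included.
    widths = [max(len(cells[i]) for cells in [headers, *rows]) for i in range(2)]

    def fmt(cells: tuple[str, str]) -> str:
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return "| " + " | ".join(padded) + " |"

    separator = "| " + " | ".join("-" * width for width in widths) + " |"
    return "\n".join([fmt(headers), separator, *(fmt(row) for row in rows)])
```

Calling `render_table([('"I\'ll test after"', "Tests passing immediately prove nothing.")])` produces a three-line table whose lines all have the same length.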
@@ -504,6 +551,7 @@ Follow the TDD cycle:
 ### RED: Write Failing Test (Baseline)
 
 Run pressure scenario with subagent WITHOUT the skill. Document exact behavior:
+
 - What choices did they make?
 - What rationalizations did they use (verbatim)?
 - Which pressures triggered violations?
@@ -520,7 +568,8 @@ Run same scenarios WITH skill. Agent should now comply.
 
 Agent found new rationalization? Add explicit counter. Re-test until bulletproof.
 
-**REQUIRED SUB-SKILL:** Use
+**REQUIRED SUB-SKILL:** Use use_skill("testing-skills-with-subagents") for the complete testing methodology:
+
 - How to write pressure scenarios
 - Pressure types (time, sunk cost, authority, exhaustion)
 - Plugging holes systematically
@@ -529,21 +578,26 @@ Agent found new rationalization? Add explicit counter. Re-test until bulletproof
 ## Anti-Patterns
 
 ### ❌ Narrative Example
+
 "In session 2025-10-03, we found empty projectDir caused..."
 **Why bad:** Too specific, not reusable
 
 ### ❌ Multi-Language Dilution
+
 example-js.js, example-py.py, example-go.go
 **Why bad:** Mediocre quality, maintenance burden
 
 ### ❌ Code in Flowcharts
+
 ```dot
 step1 [label="import fs"];
 step2 [label="read file"];
 ```
+
 **Why bad:** Can't copy-paste, hard to read
 
 ### ❌ Generic Labels
+
 helper1, helper2, step3, pattern4
 **Why bad:** Labels should have semantic meaning
 
@@ -552,6 +606,7 @@ helper1, helper2, step3, pattern4
 **After writing ANY skill, you MUST STOP and complete the deployment process.**
 
 **Do NOT:**
+
 - Create multiple skills in batch without testing each
 - Move to next skill before current one is verified
 - Skip testing because "batching is more efficient"
@@ -565,11 +620,13 @@ Deploying untested skills = deploying untested code. It's a violation of quality
 **IMPORTANT: Use TodoWrite to create todos for EACH checklist item below.**
 
 **RED Phase - Write Failing Test:**
+
 - [ ] Create pressure scenarios (3+ combined pressures for discipline skills)
 - [ ] Run scenarios WITHOUT skill - document baseline behavior verbatim
 - [ ] Identify patterns in rationalizations/failures
 
 **GREEN Phase - Write Minimal Skill:**
+
 - [ ] Name uses only letters, numbers, hyphens (no parentheses/special chars)
 - [ ] YAML frontmatter with only name and description (max 1024 chars)
 - [ ] Description starts with "Use when..." and includes specific triggers/symptoms
@@ -582,6 +639,7 @@ Deploying untested skills = deploying untested code. It's a violation of quality
 - [ ] Run scenarios WITH skill - verify agents now comply
 
 **REFACTOR Phase - Close Loopholes:**
+
 - [ ] Identify NEW rationalizations from testing
 - [ ] Add explicit counters (if discipline skill)
 - [ ] Build rationalization table from all test iterations
@@ -589,6 +647,7 @@ Deploying untested skills = deploying untested code. It's a violation of quality
 - [ ] Re-test until bulletproof
 
 **Quality Checks:**
+
 - [ ] Small flowchart only if decision non-obvious
 - [ ] Quick reference table
 - [ ] Common mistakes section
@@ -596,6 +655,7 @@ Deploying untested skills = deploying untested code. It's a violation of quality
 - [ ] Supporting files only for tools or heavy reference
 
 **Deployment:**
+
 - [ ] Commit skill to git and push to your fork (if configured)
 - [ ] Consider contributing back via PR (if broadly useful)
 
@@ -604,10 +664,10 @@ Deploying untested skills = deploying untested code. It's a violation of quality
 How future Claude finds your skill:
 
 1. **Encounters problem** ("tests are flaky")
-
-
-
-
+2. **Finds SKILL** (description matches)
+3. **Scans overview** (is this relevant?)
+4. **Reads patterns** (quick reference table)
+5. **Loads example** (only when implementing)
 
 **Optimize for this flow** - put searchable terms early and often.
 