clawpowers 1.1.3 → 2.0.0

Files changed (74)
  1. package/CHANGELOG.md +94 -0
  2. package/LICENSE +44 -0
  3. package/README.md +202 -384
  4. package/SECURITY.md +72 -0
  5. package/dist/index.d.ts +844 -0
  6. package/dist/index.js +2536 -0
  7. package/dist/index.js.map +1 -0
  8. package/package.json +52 -42
  9. package/.claude-plugin/manifest.json +0 -19
  10. package/.codex/INSTALL.md +0 -36
  11. package/.cursor-plugin/manifest.json +0 -21
  12. package/.opencode/INSTALL.md +0 -52
  13. package/ARCHITECTURE.md +0 -69
  14. package/bin/clawpowers.js +0 -625
  15. package/bin/clawpowers.sh +0 -91
  16. package/docs/demo/clawpowers-demo.cast +0 -197
  17. package/docs/demo/clawpowers-demo.gif +0 -0
  18. package/docs/launch-images/25-skills-breakdown.jpg +0 -0
  19. package/docs/launch-images/clawpowers-vs-superpowers.jpg +0 -0
  20. package/docs/launch-images/economic-code-optimization.jpg +0 -0
  21. package/docs/launch-images/native-vs-bridge-2.jpg +0 -0
  22. package/docs/launch-images/native-vs-bridge.jpg +0 -0
  23. package/docs/launch-images/post1-hero-lobster.jpg +0 -0
  24. package/docs/launch-images/post2-dashboard.jpg +0 -0
  25. package/docs/launch-images/post3-superpowers.jpg +0 -0
  26. package/docs/launch-images/post4-before-after.jpg +0 -0
  27. package/docs/launch-images/post5-install-now.jpg +0 -0
  28. package/docs/launch-images/ultimate-stack.jpg +0 -0
  29. package/docs/launch-posts.md +0 -76
  30. package/docs/quickstart-first-transaction.md +0 -204
  31. package/gemini-extension.json +0 -32
  32. package/hooks/session-start +0 -205
  33. package/hooks/session-start.cmd +0 -43
  34. package/hooks/session-start.js +0 -163
  35. package/runtime/demo/README.md +0 -78
  36. package/runtime/demo/x402-mock-server.js +0 -230
  37. package/runtime/feedback/analyze.js +0 -621
  38. package/runtime/feedback/analyze.sh +0 -546
  39. package/runtime/init.js +0 -210
  40. package/runtime/init.sh +0 -178
  41. package/runtime/metrics/collector.js +0 -361
  42. package/runtime/metrics/collector.sh +0 -308
  43. package/runtime/payments/ledger.js +0 -305
  44. package/runtime/payments/ledger.sh +0 -262
  45. package/runtime/payments/pipeline.js +0 -459
  46. package/runtime/persistence/store.js +0 -433
  47. package/runtime/persistence/store.sh +0 -303
  48. package/skill.json +0 -106
  49. package/skills/agent-bounties/SKILL.md +0 -553
  50. package/skills/agent-payments/SKILL.md +0 -479
  51. package/skills/brainstorming/SKILL.md +0 -233
  52. package/skills/content-pipeline/SKILL.md +0 -282
  53. package/skills/cross-project-knowledge/SKILL.md +0 -345
  54. package/skills/dispatching-parallel-agents/SKILL.md +0 -305
  55. package/skills/economic-code-optimization/SKILL.md +0 -265
  56. package/skills/executing-plans/SKILL.md +0 -255
  57. package/skills/finishing-a-development-branch/SKILL.md +0 -260
  58. package/skills/formal-verification-lite/SKILL.md +0 -441
  59. package/skills/learn-how-to-learn/SKILL.md +0 -235
  60. package/skills/market-intelligence/SKILL.md +0 -323
  61. package/skills/meta-skill-evolution/SKILL.md +0 -325
  62. package/skills/prospecting/SKILL.md +0 -454
  63. package/skills/receiving-code-review/SKILL.md +0 -225
  64. package/skills/requesting-code-review/SKILL.md +0 -206
  65. package/skills/security-audit/SKILL.md +0 -353
  66. package/skills/self-healing-code/SKILL.md +0 -369
  67. package/skills/subagent-driven-development/SKILL.md +0 -244
  68. package/skills/systematic-debugging/SKILL.md +0 -355
  69. package/skills/test-driven-development/SKILL.md +0 -416
  70. package/skills/using-clawpowers/SKILL.md +0 -160
  71. package/skills/using-git-worktrees/SKILL.md +0 -261
  72. package/skills/verification-before-completion/SKILL.md +0 -254
  73. package/skills/writing-plans/SKILL.md +0 -276
  74. package/skills/writing-skills/SKILL.md +0 -260
@@ -1,276 +0,0 @@
1
- ---
2
- name: writing-plans
3
- description: Transform a specification or goal into a sequenced implementation plan of concrete 2-5 minute tasks with a dependency graph. Activate when someone asks you to plan work, before starting any substantial feature.
4
- version: 1.0.0
5
- requires:
6
-   tools: []
7
-   runtime: false
8
- metrics:
9
-   tracks: [plan_accuracy, task_count, estimation_error, dependency_violations]
10
-   improves: [task_granularity, estimation_calibration, dependency_detection]
11
- ---
12
-
13
- # Writing Plans
14
-
15
- ## When to Use
16
-
17
- Apply this skill when:
18
-
19
- - You receive a specification that requires multiple distinct steps
20
- - Work will take more than 30 minutes total
21
- - Multiple people or agents will execute the work
22
- - The execution order matters (dependencies exist)
23
- - You need to communicate progress against milestones
24
- - The risk of missing a step is high
25
-
26
- **Skip when:**
27
- - The task is a single, obvious action (just do it)
28
- - The task is exploratory — you can't plan what you haven't understood yet
29
- - The plan would take longer to write than the work itself
30
-
31
- **Decision tree:**
32
- ```
33
- Is the task > 30 min or > 1 context window?
34
- ├── No → execute directly
35
- └── Yes → Do you understand the full scope?
36
-     ├── No → brainstorming first, then writing-plans
36
-     └── Yes → writing-plans ← YOU ARE HERE
38
- ```
39
-
40
- ## Core Methodology
41
-
42
- ### Phase 1: Specification Analysis
43
-
44
- Before writing a single task, decompose the spec into its logical components:
45
-
46
- 1. **Parse the goal** — What is the desired end state? Not "build auth" but "users can register, log in, and maintain sessions securely"
47
- 2. **Identify components** — What distinct systems or modules need to exist?
48
- 3. **Find dependencies** — Which components require others to exist first?
49
- 4. **Identify risks** — What's most likely to go wrong? Plan for it early.
50
- 5. **Define done criteria** — How will you know the goal is achieved?
51
-
52
- **Specification analysis template:**
53
- ```markdown
54
- ## Goal
55
- [Single sentence: what exists when this is done that didn't exist before]
56
-
57
- ## Components
58
- - [Component A]: [what it is and what it does]
59
- - [Component B]: [what it is and what it does]
60
-
61
- ## Dependencies
62
- - Component B requires Component A's [interface/data/service]
63
- - Component C requires both A and B
64
-
65
- ## Risks
66
- 1. [Risk]: [mitigation]
67
- 2. [Risk]: [mitigation]
68
-
69
- ## Done Criteria
70
- - [ ] [Observable, testable condition]
71
- - [ ] [Observable, testable condition]
72
- ```
73
-
74
- ### Phase 2: Task Sequencing
75
-
76
- Break each component into atomic tasks. Rules for task granularity:
77
-
78
- **Target size:** 2-5 minutes of execution time (not wall clock time — agent execution time)
79
- **Signs a task is too large:**
80
- - It contains "and" (two things)
81
- Its done criteria have more than 3 bullet points
82
- - It could fail in more than 2 different ways
83
- - It would produce more than 200 lines of code
84
-
85
- **Signs a task is too small:**
86
- It produces fewer than 10 lines of code
87
- - Its setup (imports, scaffolding) outweighs its work
88
- - It has zero decision points
89
-
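The size heuristics above can be turned into a quick lint. This is a minimal sketch, not part of the skill: the `Task` fields and the `size_warnings` helper are hypothetical names, and only the checks that can be read off a written task (title, done criteria, rough line estimate) are covered.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    # Hypothetical record for one planned task.
    title: str
    done_criteria: list = field(default_factory=list)
    estimated_lines: int = 0  # 0 means "no estimate given"


def size_warnings(task: Task) -> list:
    """Return the granularity warnings that apply to a task."""
    warnings = []
    if "and" in task.title.lower().split():
        warnings.append("too large: title contains 'and'")
    if len(task.done_criteria) > 3:
        warnings.append("too large: more than 3 done criteria")
    if task.estimated_lines > 200:
        warnings.append("too large: more than 200 lines of code")
    if 0 < task.estimated_lines < 10:
        warnings.append("too small: fewer than 10 lines of code")
    return warnings


if __name__ == "__main__":
    print(size_warnings(Task("Add 429 response and Retry-After header", ["TTL correct"], 40)))
```

A lint like this is deliberately crude: an "and" in a title is a prompt to reconsider the split, not proof that the task is too large.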
90
- **Task format:**
91
- ```markdown
92
- ### Task N: [Action verb] [specific thing]
93
-
94
- **Input:** [What this task needs to exist before it can run]
95
- **Output:** [Exact file, function, or artifact produced]
96
- **Duration:** [2-5 min]
97
- **Done when:**
98
- - [ ] [Specific, verifiable criterion]
99
- - [ ] Tests pass (if applicable)
100
-
101
- **Notes:** [Edge cases, non-obvious decisions, references]
102
- ```
103
-
104
- ### Phase 3: Dependency Graph
105
-
106
- After listing tasks, draw the dependency graph explicitly:
107
-
108
- ```
109
- Task 1: Database schema ──→ Task 3: Repository layer
110
- Task 2: Domain models ──→ Task 3: Repository layer
111
- Task 3: Repository layer ──→ Task 5: Service layer
112
- Task 4: Auth middleware ──→ Task 6: Protected routes
113
- Task 5: Service layer ──→ Task 6: Protected routes
114
- Task 6: Protected routes ──→ Task 7: Integration tests
115
- Task 7: Integration tests ──→ Task 8: Documentation
116
- ```
117
-
118
- **Parallel execution opportunities** (tasks with no dependency path between them):
119
- - Task 1 and Task 2 can run in parallel
120
- - Task 4 can run in parallel with Tasks 1-3
121
-
122
- Label these explicitly. If using `subagent-driven-development`, parallel tasks become parallel subagent dispatches.
123
-
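One way to find the parallel groups mechanically is to peel the graph in layers: every task whose prerequisites are already done goes into the next batch. The sketch below applies that idea to the Phase 3 example graph. The `parallel_batches` name and the dict layout are assumptions made for illustration, and every prerequisite is assumed to appear as a key.

```python
def parallel_batches(deps: dict) -> list:
    """Group tasks into batches; tasks in the same batch can run concurrently.

    deps maps each task number to the set of task numbers it depends on.
    """
    remaining = {task: set(reqs) for task, reqs in deps.items()}
    batches = []
    while remaining:
        # Tasks with no unmet prerequisites are ready now.
        ready = sorted(t for t, reqs in remaining.items() if not reqs)
        if not ready:
            raise ValueError(f"dependency cycle among or blocking tasks {sorted(remaining)}")
        batches.append(ready)
        for t in ready:
            del remaining[t]
        for reqs in remaining.values():
            reqs.difference_update(ready)
    return batches


# The Phase 3 example graph, written as task -> prerequisites.
PHASE3 = {1: set(), 2: set(), 3: {1, 2}, 4: set(), 5: {3}, 6: {4, 5}, 7: {6}, 8: {7}}

if __name__ == "__main__":
    print(parallel_batches(PHASE3))  # [[1, 2, 4], [3], [5], [6], [7], [8]]
```

The first batch shows the same opportunities listed above: Tasks 1, 2, and 4 have no prerequisites and can start together.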
124
- ### Phase 4: Risk-First Ordering
125
-
126
- Within the constraint of the dependency graph, sequence tasks to:
127
- 1. **Prove the spike first** — If there's a technical uncertainty, make a task that resolves it early
128
- 2. **Hard tasks early** — Don't save the hardest part for last (discovery of blockers costs less time early)
129
- 3. **Reviewable checkpoints** — Insert verification tasks at natural boundaries
130
-
131
- **Risk-first example:**
132
- ```
133
- # BAD ordering (risk deferred to end)
134
- Task 1: Build entire frontend
135
- Task 2: Build entire backend
136
- Task 3: Integrate (discovery: API shape is wrong, redo Task 1)
137
-
138
- # GOOD ordering (risk surfaced early)
139
- Task 1: Define API contract (OpenAPI spec)
140
- Task 2: Backend stub that satisfies contract
141
- Task 3: Frontend stub that calls contract
142
- Task 4: Integration smoke test ← risk surfaced HERE, at task 4 not task 20
143
- Task 5: Full backend implementation
144
- Task 6: Full frontend implementation
145
- ```
146
-
147
- ### Phase 5: The Written Plan
148
-
149
- Final output format:
150
-
151
- ```markdown
152
- # Plan: [Goal Name]
153
-
154
- **Goal:** [Single sentence]
155
- **Total tasks:** N
156
- **Estimated duration:** [sum of task durations]
157
- **Parallel opportunities:** [task numbers that can run concurrently]
158
-
159
- ## Done Criteria
160
- - [ ] [Observable, testable condition]
161
- - [ ] [Observable, testable condition]
162
-
163
- ## Dependency Graph
164
- [ASCII or description]
165
-
166
- ## Tasks
167
-
168
- ### Task 1: [Name]
169
- [Full task block]
170
-
171
- ### Task 2: [Name]
172
- [Full task block]
173
-
174
- ...
175
- ```
176
-
177
- ## ClawPowers Enhancement
178
-
179
- When `~/.clawpowers/` runtime is initialized:
180
-
181
- **Historical Estimation Calibration:**
182
-
183
- Plans get compared to actual execution. After 5+ plans, calibration data is available:
184
-
185
- ```bash
186
- # After plan execution completes
187
- bash runtime/persistence/store.sh set "plan:auth-service:estimated_duration" "180"
188
- bash runtime/persistence/store.sh set "plan:auth-service:actual_duration" "240"
189
-
190
- # Read calibration
191
- bash runtime/feedback/analyze.sh --skill writing-plans
192
- # Output: Your 2-5 min tasks average 7.3 min actual. Adjust estimates by 1.5x.
193
- ```
194
-
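The source does not show how the analyzer derives its adjustment. As a rough sketch of one plausible calculation, the factor can be taken as total actual time divided by total estimated time across recorded plans. The `calibration_factor` helper and the list-of-pairs input are assumptions, not the runtime's API.

```python
def calibration_factor(history: list) -> float:
    """Ratio of total actual to total estimated duration.

    history is a list of (estimated, actual) pairs in the same unit.
    A result above 1.0 means plans have been underestimating.
    """
    total_estimated = sum(estimated for estimated, _ in history)
    total_actual = sum(actual for _, actual in history)
    if total_estimated <= 0:
        raise ValueError("need at least one plan with a positive estimate")
    return total_actual / total_estimated


if __name__ == "__main__":
    # The auth-service figures stored above: 180 estimated, 240 actual.
    factor = calibration_factor([(180, 240)])
    print(round(factor, 2))  # 1.33
```

Multiplying a new plan's raw estimate by this factor gives a calibrated estimate. With a single data point the factor is noisy, which fits the note above that calibration needs several plans.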
195
- **Dependency Graph Validation:**
196
-
197
- Before executing a plan, validate that the dependency graph has no cycles and that all task inputs exist:
198
-
199
- ```bash
200
- bash runtime/persistence/store.sh set "plan:current:task_count" "8"
201
- bash runtime/persistence/store.sh set "plan:current:deps" "1:- 2:- 3:1,2 4:- 5:3,4 6:4 7:5,6 8:7"
202
- # Analyzer checks for cycles and unreachable tasks
203
- ```
204
-
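The analyzer's own checks are not shown in the source. The sketch below illustrates what such a validation could look like, assuming the `task:dep,dep` string format used above with `-` meaning "no dependencies". `parse_deps` and `validate` are hypothetical helpers.

```python
def parse_deps(spec: str) -> dict:
    """Parse a string such as "3:1,2 4:- 5:3" into {task: set_of_prerequisites}."""
    deps = {}
    for entry in spec.split():
        task, _, reqs = entry.partition(":")
        deps[int(task)] = set() if reqs in ("", "-") else {int(r) for r in reqs.split(",")}
    return deps


def validate(deps: dict, task_count: int) -> list:
    """Return a list of problems; an empty list means the graph looks valid."""
    problems = []
    for task, reqs in sorted(deps.items()):
        for req in sorted(reqs):
            if not 1 <= req <= task_count:
                problems.append(f"task {task} depends on unknown task {req}")
    # Tasks missing from the spec are treated as having no prerequisites.
    remaining = {
        t: {r for r in deps.get(t, set()) if 1 <= r <= task_count}
        for t in range(1, task_count + 1)
    }
    while remaining:
        ready = [t for t, reqs in remaining.items() if not reqs]
        if not ready:
            problems.append(f"cycle among or blocking tasks {sorted(remaining)}")
            break
        for t in ready:
            del remaining[t]
        for reqs in remaining.values():
            reqs.difference_update(ready)
    return problems


if __name__ == "__main__":
    print(validate(parse_deps("3:1,2 4:- 5:3 6:4,5 7:6 8:7"), 8))  # []
```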
205
- **Plan Quality Scoring:**
206
-
207
- Stored metrics enable quality scoring of plans over time:
208
- - Estimation accuracy (actual / estimated)
209
- - Task rework rate (tasks that required re-execution)
210
- - Dependency violation rate (tasks executed out of order)
211
- - Done criteria completeness (criteria met on first attempt)
212
-
213
- ## Anti-Patterns
214
-
215
- | Anti-Pattern | Why It Fails | Correct Approach |
216
- |-------------|-------------|-----------------|
217
- | Tasks with "and" | Two risks, two outcomes, ambiguous done criteria | Split into two tasks |
218
- | Vague done criteria | "Done" is subjective, causes rework debates | Observable, testable criteria only |
219
- | Skipping dependency mapping | Execution order violations cause rework | Always draw the dependency graph |
220
- | Planning with no spike | Technical unknown bites you at Task 15 | Schedule spike task first if uncertainty exists |
221
- | Giant tasks (2 hours each) | Hard to track progress, hard to parallelize | Break down to 2-5 min granularity |
222
- | Tiny tasks (under 2 min each) | Plan overhead exceeds value | Group related micro-tasks |
223
- | Over-planning volatile specs | Plan becomes invalid before execution starts | Plan only what's stable, leave flexibility for the rest |
224
-
225
- ## Examples
226
-
227
- ### Example 1: Small Plan (4 tasks)
228
-
229
- **Goal:** Add rate limiting to the API
230
-
231
- ```markdown
232
- # Plan: API Rate Limiting
233
-
234
- **Goal:** All API endpoints enforce per-user rate limits with 429 response on exceeded limits.
235
- **Total tasks:** 4 | **Estimated:** 14 min
236
-
237
- ## Done Criteria
238
- - [ ] Requests beyond limit receive 429 with Retry-After header
239
- - [ ] Rate limit state persists across server restarts
240
- - [ ] Tests cover limit enforcement and reset behavior
241
-
242
- ## Tasks
243
-
244
- ### Task 1: Write rate limit tests (RED)
245
- **Input:** None | **Output:** tests/test_rate_limit.py (failing) | **Duration:** 3 min
246
- **Done when:** Tests run and fail with ImportError
247
-
248
- ### Task 2: Implement RedisRateLimiter
249
- **Input:** tests/test_rate_limit.py | **Output:** src/rate_limiter.py | **Duration:** 5 min
250
- **Done when:** All rate limiter unit tests pass
251
-
252
- ### Task 3: Integrate rate limiter into middleware
253
- **Input:** src/rate_limiter.py | **Output:** src/middleware/rate_limit.py | **Duration:** 4 min
254
- **Done when:** Integration tests pass, middleware applies limits per endpoint
255
-
256
- ### Task 4: Add 429 response and Retry-After header
257
- **Input:** src/middleware/rate_limit.py | **Output:** Modified middleware | **Duration:** 2 min
258
- **Done when:** 429 response includes Retry-After with correct TTL
259
- ```
260
-
261
- ### Example 2: Dependency-heavy Plan (8 tasks with parallel opportunities)
262
-
263
- **Goal:** Build notification service
264
-
265
- **Parallel opportunities:** Tasks 1+2 concurrent, Tasks 4+5 concurrent
266
-
267
- ```markdown
268
- ### Task 1: Define notification event schema [parallel with Task 2]
269
- ### Task 2: Database migration for notification store [parallel with Task 1]
270
- ### Task 3: NotificationRepository (depends on 1, 2)
271
- ### Task 4: Email provider integration (depends on 1) [parallel with Task 5]
272
- ### Task 5: Push notification provider integration (depends on 1) [parallel with Task 4]
273
- ### Task 6: NotificationService orchestrator (depends on 3, 4, 5)
274
- ### Task 7: API endpoints (depends on 6)
275
- ### Task 8: Integration tests (depends on 7)
276
- ```
@@ -1,260 +0,0 @@
1
- ---
2
- name: writing-skills
3
- description: Create new ClawPowers skills using TDD methodology — write test scenarios, watch the agent fail without the skill, write the skill, verify the agent passes. Activate when you need a new skill that ClawPowers doesn't have.
4
- version: 1.0.0
5
- requires:
6
-   tools: [bash]
7
-   runtime: false
8
- metrics:
9
-   tracks: [skills_written, skill_quality_scores, test_coverage, anti_pattern_count]
10
-   improves: [skill_structure_quality, when_to_use_clarity, example_relevance]
11
- ---
12
-
13
- # Writing Skills
14
-
15
- ## When to Use
16
-
17
- Apply this skill when:
18
-
19
- - ClawPowers lacks a skill you need repeatedly
20
- - You've solved a non-trivial problem 3+ times and want to codify the approach
21
- - A team has domain-specific methodologies that should be agent-accessible
22
- - An existing skill is missing important context or examples
23
- - You're improving an existing skill that consistently produces suboptimal results
24
-
25
- **Skip when:**
26
- - The skill would be a one-off
27
- - The methodology is already captured in a skill that could be extended
28
- - You don't have enough real experience with the problem to write a useful skill (skills written from theory, not experience, are worse than no skill)
29
-
30
- **Decision tree:**
31
- ```
32
- Have you solved this problem multiple times manually?
33
- ├── No → Solve it first. Document later.
34
- └── Yes → Is it covered by an existing skill?
35
-     ├── Yes → Extend the existing skill (new section, new example)
35
-     └── No → writing-skills ← YOU ARE HERE
37
- ```
38
-
39
- ## Core Methodology
40
-
41
- ### TDD for Skills
42
-
43
- Skills are tested by measuring whether an agent WITHOUT the skill fails a scenario that an agent WITH the skill handles correctly. This is behavioral testing — not unit testing of code.
44
-
45
- ### Phase 1: Write Test Scenarios (RED)
46
-
47
- Before writing any skill content, write test scenarios that the skill must handle.
48
-
49
- **Test scenario format:**
50
- ```markdown
51
- ## Scenario N: [Descriptive name]
52
-
53
- **Agent state:** Agent has no knowledge of [skill domain]
54
- **Input:** [Exact prompt or situation the agent receives]
55
- **Without skill:** [Specific wrong behavior the agent exhibits]
56
- **With skill:** [Specific correct behavior the skill produces]
57
- **Success criteria:**
58
- - [ ] [Observable, verifiable outcome]
59
- - [ ] [Observable, verifiable outcome]
60
- ```
61
-
62
- **Minimum scenarios before writing the skill:**
63
- - 1 happy path (common use case)
64
- - 1 edge case (unusual but valid)
65
- - 1 failure case (when NOT to use the skill)
66
- - 1 anti-pattern case (common mistake the skill prevents)
67
-
68
- **Example scenarios for a hypothetical "database-migration" skill:**
69
-
70
- ```markdown
71
- ## Scenario 1: Running migrations safely
72
- Without skill: Agent runs `alembic upgrade head` without backing up first
73
- With skill: Agent follows backup → dry-run → verify → apply sequence
74
-
75
- ## Scenario 2: Rolling back a bad migration
76
- Without skill: Agent manually deletes rows or drops columns (data loss)
77
- With skill: Agent runs `alembic downgrade -1`, verifies schema, identifies root cause
78
-
79
- ## Scenario 3: When NOT to use this skill
80
- Input: "Update the user model to add an index"
81
- Without skill: Agent triggers migration skill for every schema change
82
- With skill: Agent recognizes index-only changes don't need this protocol
83
-
84
- ## Scenario 4: Anti-pattern — concurrent migrations
85
- Without skill: Agent runs migrations in parallel across multiple servers
86
- With skill: Agent ensures single-server serial execution with distributed lock
87
- ```
88
-
89
- ### Phase 2: Verify Failure (The "RED" Moment)
90
-
91
- Before writing the skill, verify that an agent without it fails at least Scenario 1. This confirms:
92
- - The skill is actually needed
93
- - The test scenarios are meaningful
94
- - The skill will produce measurable improvement
95
-
96
- If an agent without the skill already handles the scenarios correctly, you don't need the skill.
97
-
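The RED check can be expressed as a small harness: run the same scenario against an agent without the skill and an agent with it, then compare. The sketch below is illustrative only. The `Scenario` type, the `red_green` function, and the two stub agents are hypothetical stand-ins, since the source does not define how agents are invoked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    name: str
    prompt: str
    passes: Callable[[str], bool]  # success criteria applied to the agent's output


def red_green(scenario: Scenario, without_skill: Callable[[str], str],
              with_skill: Callable[[str], str]) -> str:
    """Classify a scenario by comparing baseline and skilled agent behavior."""
    baseline_ok = scenario.passes(without_skill(scenario.prompt))
    skilled_ok = scenario.passes(with_skill(scenario.prompt))
    if baseline_ok:
        return "skill not needed"  # no RED: the agent already handles it
    return "skill verified" if skilled_ok else "skill incomplete"


def backup_before_upgrade(output: str) -> bool:
    # Success means a backup step appears before the upgrade command.
    upgrade = "alembic upgrade head"
    return "backup" in output and upgrade in output and output.index("backup") < output.index(upgrade)


# Stub agents standing in for real agent runs.
def baseline_agent(prompt: str) -> str:
    return "alembic upgrade head"


def skilled_agent(prompt: str) -> str:
    return "backup; dry-run; verify; alembic upgrade head"


if __name__ == "__main__":
    scenario = Scenario("Running migrations safely", "Apply pending migrations", backup_before_upgrade)
    print(red_green(scenario, baseline_agent, skilled_agent))  # skill verified
```

In practice the success criteria are usually judged by reading a transcript rather than by string matching, so a check like `backup_before_upgrade` is best treated as a placeholder for that review.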
98
- ### Phase 3: Write the Skill
99
-
100
- Use the ClawPowers skill template:
101
-
102
- ```markdown
103
- ---
104
- name: skill-name-kebab-case
105
- description: [One sentence: when to trigger this skill. Start with "Activate when..."]
106
- version: 1.0.0
107
- requires:
108
-   tools: [tool1, tool2] # Only tools the skill actually requires
109
-   runtime: false # true if skill needs ~/.clawpowers/
110
- metrics:
111
-   tracks: [metric1, metric2] # Observable outcomes
112
-   improves: [param1, param2] # Parameters RSI can tune
113
- ---
114
-
115
- # [Skill Name]
116
-
117
- ## When to Use
118
-
119
- [Decision tree. Include when to skip the skill.]
120
-
121
- ## Core Methodology
122
-
123
- [The actual methodology. Numbered steps. Concrete, not abstract.]
124
-
125
- ## ClawPowers Enhancement
126
-
127
- [What the runtime layer adds. Only if runtime: true or if runtime is optional benefit.]
128
-
129
- ## Anti-Patterns
130
-
131
- [Table of common mistakes, why they fail, correct approach.]
132
-
133
- ## Examples
134
-
135
- [1-3 concrete examples with real code/commands, not hypotheticals.]
136
- ```
137
-
138
- ### Phase 4: Quality Gates
139
-
140
- Before the skill is "done", it must pass these gates:
141
-
142
- **Gate 1: When to Use is a decision tree, not a list**
143
- - Does it tell you when NOT to use the skill?
144
- - Does it handle edge cases in the decision?
145
-
146
- **Gate 2: Core Methodology is actionable**
147
- - Can someone follow these steps without guessing?
148
- - Does every step produce a verifiable output?
149
- - Are code/command examples real, not pseudocode?
150
-
151
- **Gate 3: Anti-Patterns are specific**
152
- - Each anti-pattern names a specific behavior, not a category
153
- - Each explains WHY it fails (not just "don't do this")
154
- - Each provides a concrete correct approach
155
-
156
- **Gate 4: Examples are real**
157
- - Examples use plausible real names (not `foo`, `bar`, `example`)
158
- - Code examples are syntactically correct
159
- - Examples cover the most common real-world use case
160
-
161
- **Gate 5: No stubs**
162
- - No "TODO: add examples here"
163
- - No "coming soon" sections
164
- - No placeholder text
165
-
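Gate 5 is the easiest gate to automate. A minimal sketch, assuming the skill text is available as a string: scan each line for common stub markers. The pattern list and the `find_stubs` helper are illustrative choices, not something the package provides.

```python
import re

# Markers that usually indicate unfinished content (an illustrative list).
STUB_PATTERNS = [
    re.compile(r"\bTODO\b", re.IGNORECASE),
    re.compile(r"\bcoming soon\b", re.IGNORECASE),
    re.compile(r"\b(placeholder|lorem ipsum|TBD)\b", re.IGNORECASE),
]


def find_stubs(skill_markdown: str) -> list:
    """Return (line_number, line_text) for every line that looks like a stub."""
    hits = []
    for number, line in enumerate(skill_markdown.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STUB_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    draft = "## Examples\nTODO: add examples here\n\n## Anti-Patterns\nReal content"
    print(find_stubs(draft))  # [(2, 'TODO: add examples here')]
```

A scan like this produces false positives on skills that legitimately discuss stubs, so hits are a prompt for review rather than an automatic failure.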
166
- ### Phase 5: Verify Pass (The "GREEN" Moment)
167
-
168
- Apply the test scenarios to the completed skill. Verify:
169
- - Scenario 1 (happy path): skill guides agent to correct outcome
170
- - Scenario 2 (edge case): skill handles it explicitly or provides guidance
171
- - Scenario 3 (skip case): skill's "When to Use" correctly excludes this
172
- - Scenario 4 (anti-pattern): skill's Anti-Patterns section covers it
173
-
174
- ### Phase 6: Register the Skill
175
-
176
- Add the skill to `skills/using-clawpowers/SKILL.md`:
177
-
178
- ```markdown
179
- # In the "Quick Reference" section, add:
180
- 25. `database-migration` — Safe migration sequence with backup, dry-run, and rollback
181
- ```
182
-
183
- Add trigger pattern to the pattern map:
184
- ```markdown
185
- | Running or planning a database schema change | `database-migration` |
186
- ```
187
-
188
- ## ClawPowers Enhancement
189
-
190
- When `~/.clawpowers/` runtime is initialized:
191
-
192
- **Skill Quality Scoring:**
193
-
194
- Each skill is scored on:
195
- - Scenario coverage (how many test scenarios does it pass?)
196
- - Usage frequency (how often is it triggered per session?)
197
- - Outcome rate (when triggered, what % of executions succeed?)
198
- - Anti-pattern prevention (how often does it prevent a documented anti-pattern?)
199
-
200
- ```bash
201
- bash runtime/persistence/store.sh set "skill-quality:database-migration:scenario_coverage" "4/4"
202
- bash runtime/persistence/store.sh set "skill-quality:database-migration:outcome_rate" "0.92"
203
- ```
204
-
205
- **Anti-Pattern Detection:**
206
-
207
- The feedback engine monitors skill usage for anti-patterns not covered by the skill:
208
- ```bash
209
- bash runtime/feedback/analyze.sh --skill database-migration
210
- # → New anti-pattern detected: agents omit the verify step after applying migrations
211
- # → Recommend adding explicit verify step to Core Methodology
212
- ```
213
-
214
- **Skill Evolution:**
215
-
216
- When a skill's outcome rate drops below threshold (< 80%), the feedback engine flags it for revision:
217
- ```bash
218
- bash runtime/metrics/collector.sh record \
219
- --skill writing-skills \
220
- --outcome success \
221
- --notes "database-migration: 4 scenarios, all passing, quality gates cleared"
222
- ```
223
-
224
- ## Anti-Patterns
225
-
226
- | Anti-Pattern | Why It Fails | Correct Approach |
227
- |-------------|-------------|-----------------|
228
- | Writing from theory | Skill misses real-world edge cases | Write skills from real experience only |
229
- | Skipping test scenarios | No way to verify the skill works | Write scenarios first (TDD) |
230
- | Vague "When to Use" | Skill triggers at wrong times | Decision tree with explicit skip conditions |
231
- | Placeholder sections | Skill is deployed incomplete | All sections must be complete before registration |
232
- | One giant methodology section | Agents lose track of where they are | Numbered steps with verifiable outputs |
233
- | No anti-patterns section | Common mistakes recur | Always include anti-patterns |
234
- | Examples with foo/bar names | Low signal — agents don't recognize applicability | Use realistic domain names in examples |
235
-
236
- ## Examples
237
-
238
- ### Example 1: Simple Skill (2 scenarios)
239
-
240
- **Skill:** `git-submodule-update`
241
- **Problem:** Agents frequently forget to update submodules after `git pull`
242
-
243
- **Scenarios:**
244
- 1. After `git pull`, submodule code is stale — skill ensures `git submodule update --init --recursive`
245
- 2. New submodule added — skill ensures `git submodule update --init` for new submodules only
246
-
247
- **Skill structure:** When to Use → 3-step methodology (detect stale, update, verify) → 2 anti-patterns → 1 example
248
-
249
- ### Example 2: Complex Skill (4 scenarios)
250
-
251
- **Skill:** `zero-downtime-deployment`
252
- **Problem:** Agents deploy without considering traffic impact
253
-
254
- **Scenarios:**
255
- 1. Deploying new version (happy path) — blue-green or rolling strategy
256
- 2. Deploying with schema migration — migration-first, app second
257
- 3. Rolling back a bad deploy — revert app before revert migration
258
- 4. Skip case — deploying to a dev environment with no traffic
259
-
260
- **Skill structure:** When to Use (with explicit skip for dev) → 5-step methodology → 4 anti-patterns → 2 examples (with and without migration)