clawpowers 1.1.3 → 2.0.0
- package/CHANGELOG.md +94 -0
- package/LICENSE +44 -0
- package/README.md +202 -384
- package/SECURITY.md +72 -0
- package/dist/index.d.ts +844 -0
- package/dist/index.js +2536 -0
- package/dist/index.js.map +1 -0
- package/package.json +52 -42
- package/.claude-plugin/manifest.json +0 -19
- package/.codex/INSTALL.md +0 -36
- package/.cursor-plugin/manifest.json +0 -21
- package/.opencode/INSTALL.md +0 -52
- package/ARCHITECTURE.md +0 -69
- package/bin/clawpowers.js +0 -625
- package/bin/clawpowers.sh +0 -91
- package/docs/demo/clawpowers-demo.cast +0 -197
- package/docs/demo/clawpowers-demo.gif +0 -0
- package/docs/launch-images/25-skills-breakdown.jpg +0 -0
- package/docs/launch-images/clawpowers-vs-superpowers.jpg +0 -0
- package/docs/launch-images/economic-code-optimization.jpg +0 -0
- package/docs/launch-images/native-vs-bridge-2.jpg +0 -0
- package/docs/launch-images/native-vs-bridge.jpg +0 -0
- package/docs/launch-images/post1-hero-lobster.jpg +0 -0
- package/docs/launch-images/post2-dashboard.jpg +0 -0
- package/docs/launch-images/post3-superpowers.jpg +0 -0
- package/docs/launch-images/post4-before-after.jpg +0 -0
- package/docs/launch-images/post5-install-now.jpg +0 -0
- package/docs/launch-images/ultimate-stack.jpg +0 -0
- package/docs/launch-posts.md +0 -76
- package/docs/quickstart-first-transaction.md +0 -204
- package/gemini-extension.json +0 -32
- package/hooks/session-start +0 -205
- package/hooks/session-start.cmd +0 -43
- package/hooks/session-start.js +0 -163
- package/runtime/demo/README.md +0 -78
- package/runtime/demo/x402-mock-server.js +0 -230
- package/runtime/feedback/analyze.js +0 -621
- package/runtime/feedback/analyze.sh +0 -546
- package/runtime/init.js +0 -210
- package/runtime/init.sh +0 -178
- package/runtime/metrics/collector.js +0 -361
- package/runtime/metrics/collector.sh +0 -308
- package/runtime/payments/ledger.js +0 -305
- package/runtime/payments/ledger.sh +0 -262
- package/runtime/payments/pipeline.js +0 -459
- package/runtime/persistence/store.js +0 -433
- package/runtime/persistence/store.sh +0 -303
- package/skill.json +0 -106
- package/skills/agent-bounties/SKILL.md +0 -553
- package/skills/agent-payments/SKILL.md +0 -479
- package/skills/brainstorming/SKILL.md +0 -233
- package/skills/content-pipeline/SKILL.md +0 -282
- package/skills/cross-project-knowledge/SKILL.md +0 -345
- package/skills/dispatching-parallel-agents/SKILL.md +0 -305
- package/skills/economic-code-optimization/SKILL.md +0 -265
- package/skills/executing-plans/SKILL.md +0 -255
- package/skills/finishing-a-development-branch/SKILL.md +0 -260
- package/skills/formal-verification-lite/SKILL.md +0 -441
- package/skills/learn-how-to-learn/SKILL.md +0 -235
- package/skills/market-intelligence/SKILL.md +0 -323
- package/skills/meta-skill-evolution/SKILL.md +0 -325
- package/skills/prospecting/SKILL.md +0 -454
- package/skills/receiving-code-review/SKILL.md +0 -225
- package/skills/requesting-code-review/SKILL.md +0 -206
- package/skills/security-audit/SKILL.md +0 -353
- package/skills/self-healing-code/SKILL.md +0 -369
- package/skills/subagent-driven-development/SKILL.md +0 -244
- package/skills/systematic-debugging/SKILL.md +0 -355
- package/skills/test-driven-development/SKILL.md +0 -416
- package/skills/using-clawpowers/SKILL.md +0 -160
- package/skills/using-git-worktrees/SKILL.md +0 -261
- package/skills/verification-before-completion/SKILL.md +0 -254
- package/skills/writing-plans/SKILL.md +0 -276
- package/skills/writing-skills/SKILL.md +0 -260
@@ -1,276 +0,0 @@
----
-name: writing-plans
-description: Transform a specification or goal into a sequenced implementation plan of concrete 2-5 minute tasks with dependency graph. Activate when someone asks you to plan work, before starting any substantial feature.
-version: 1.0.0
-requires:
-  tools: []
-  runtime: false
-metrics:
-  tracks: [plan_accuracy, task_count, estimation_error, dependency_violations]
-  improves: [task_granularity, estimation_calibration, dependency_detection]
----
-
-# Writing Plans
-
-## When to Use
-
-Apply this skill when:
-
-- You receive a specification that requires multiple distinct steps
-- Work will take more than 30 minutes total
-- Multiple people or agents will execute the work
-- The execution order matters (dependencies exist)
-- You need to communicate progress against milestones
-- The risk of missing a step is high
-
-**Skip when:**
-- The task is a single, obvious action (just do it)
-- The task is exploratory — you can't plan what you haven't understood yet
-- The plan would take longer to write than the work itself
-
-**Decision tree:**
-```
-Is the task > 30 min or > 1 context window?
-├── No → execute directly
-└── Yes → Do you understand the full scope?
-    ├── No → brainstorming first, then writing-plans
-    └── Yes → writing-plans ← YOU ARE HERE
-```
-
-## Core Methodology
-
-### Phase 1: Specification Analysis
-
-Before writing a single task, decompose the spec into its logical components:
-
-1. **Parse the goal** — What is the desired end state? Not "build auth" but "users can register, log in, and maintain sessions securely"
-2. **Identify components** — What distinct systems or modules need to exist?
-3. **Find dependencies** — Which components require others to exist first?
-4. **Identify risks** — What's most likely to go wrong? Plan for it early.
-5. **Define done criteria** — How will you know the goal is achieved?
-
-**Specification analysis template:**
-```markdown
-## Goal
-[Single sentence: what exists when this is done that didn't exist before]
-
-## Components
-- [Component A]: [what it is and what it does]
-- [Component B]: [what it is and what it does]
-
-## Dependencies
-- Component B requires Component A's [interface/data/service]
-- Component C requires both A and B
-
-## Risks
-1. [Risk]: [mitigation]
-2. [Risk]: [mitigation]
-
-## Done Criteria
-- [ ] [Observable, testable condition]
-- [ ] [Observable, testable condition]
-```
-
-### Phase 2: Task Sequencing
-
-Break each component into atomic tasks. Rules for task granularity:
-
-**Target size:** 2-5 minutes of execution time (not wall clock time — agent execution time)
-**Signs a task is too large:**
-- It contains "and" (two things)
-- Its done criteria has more than 3 bullet points
-- It could fail in more than 2 different ways
-- It would produce more than 200 lines of code
-
-**Signs a task is too small:**
-- It produces less than 10 lines of code
-- Its setup (imports, scaffolding) outweighs its work
-- It has zero decision points
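
The size heuristics above can be expressed as a rough lint. This is a minimal sketch, not part of ClawPowers: the `Task` fields and the `granularity_warnings` function are hypothetical names, and the thresholds are copied from the two lists above. The "setup outweighs its work" sign is left out because it is hard to quantify.

```python
from dataclasses import dataclass
import re


@dataclass
class Task:
    """Hypothetical task record; field names are illustrative only."""
    title: str
    done_criteria: list[str]
    failure_modes: int     # distinct ways the task could fail
    estimated_loc: int     # expected lines of code produced
    decision_points: int   # choices the executor has to make


def granularity_warnings(task: Task) -> list[str]:
    """Return warnings based on the 'too large' and 'too small' signs."""
    warnings = []
    # Too large: the title joins two actions with "and".
    if re.search(r"\band\b", task.title, flags=re.IGNORECASE):
        warnings.append("too large: title contains 'and'")
    if len(task.done_criteria) > 3:
        warnings.append("too large: more than 3 done criteria")
    if task.failure_modes > 2:
        warnings.append("too large: more than 2 failure modes")
    if task.estimated_loc > 200:
        warnings.append("too large: more than 200 lines of code")
    # Too small: trivial output or nothing to decide.
    if task.estimated_loc < 10:
        warnings.append("too small: fewer than 10 lines of code")
    if task.decision_points == 0:
        warnings.append("too small: zero decision points")
    return warnings
```

An empty list means the task passes every check the sketch knows about; it does not prove the task is well sized.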
-
-**Task format:**
-```markdown
-### Task N: [Action verb] [specific thing]
-
-**Input:** [What this task needs to exist before it can run]
-**Output:** [Exact file, function, or artifact produced]
-**Duration:** [2-5 min]
-**Done when:**
-- [ ] [Specific, verifiable criterion]
-- [ ] Tests pass (if applicable)
-
-**Notes:** [Edge cases, non-obvious decisions, references]
-```
-
-### Phase 3: Dependency Graph
-
-After listing tasks, draw the dependency graph explicitly:
-
-```
-Task 1: Database schema ──→ Task 3: Repository layer
-Task 2: Domain models ──→ Task 3: Repository layer
-Task 3: Repository layer ──→ Task 5: Service layer
-Task 4: Auth middleware ──→ Task 6: Protected routes
-Task 5: Service layer ──→ Task 6: Protected routes
-Task 6: Protected routes ──→ Task 7: Integration tests
-Task 7: Integration tests ──→ Task 8: Documentation
-```
-
-**Parallel execution opportunities** — tasks with no shared dependencies:
-- Task 1 and Task 2 can run in parallel
-- Task 4 can run in parallel with Tasks 1-3
-
-Label these explicitly. If using `subagent-driven-development`, parallel tasks become parallel subagent dispatches.
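
One way to find the parallel opportunities mechanically is to group tasks into "waves", where every task in a wave has all of its prerequisites in earlier waves. The sketch below is an illustration under assumptions, not ClawPowers code: `parallel_waves` and `AUTH_PLAN` are hypothetical names, and the edges are transcribed from the graph above as task → prerequisites.

```python
def parallel_waves(deps: dict[int, set[int]]) -> list[list[int]]:
    """Group tasks into waves; tasks within one wave can run concurrently."""
    remaining = {task: set(prereqs) for task, prereqs in deps.items()}
    done: set[int] = set()
    waves = []
    while remaining:
        # A task is ready once all of its prerequisites are done.
        ready = sorted(t for t, prereqs in remaining.items() if prereqs <= done)
        if not ready:
            raise ValueError(
                f"cycle or missing prerequisite among tasks {sorted(remaining)}"
            )
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Edges from the graph above, written as task -> prerequisites.
AUTH_PLAN = {1: set(), 2: set(), 3: {1, 2}, 4: set(), 5: {3}, 6: {4, 5}, 7: {6}, 8: {7}}
print(parallel_waves(AUTH_PLAN))  # [[1, 2, 4], [3], [5], [6], [7], [8]]
```

The first wave contains Tasks 1, 2, and 4, matching the parallel opportunities listed above. Task 4 is placed as early as possible here, although it only has to finish before Task 6 starts.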
-
-### Phase 4: Risk-First Ordering
-
-Within the constraint of the dependency graph, sequence tasks to:
-1. **Prove the spike first** — If there's a technical uncertainty, make a task that resolves it early
-2. **Hard tasks early** — Don't save the hardest part for last (discovery of blockers costs less time early)
-3. **Reviewable checkpoints** — Insert verification tasks at natural boundaries
-
-**Risk-first example:**
-```
-# BAD ordering (risk deferred to end)
-Task 1: Build entire frontend
-Task 2: Build entire backend
-Task 3: Integrate (discovery: API shape is wrong, redo Task 1)
-
-# GOOD ordering (risk surfaced early)
-Task 1: Define API contract (OpenAPI spec)
-Task 2: Backend stub that satisfies contract
-Task 3: Frontend stub that calls contract
-Task 4: Integration smoke test ← risk surfaced HERE, at task 4 not task 20
-Task 5: Full backend implementation
-Task 6: Full frontend implementation
-```
-
-### Phase 5: The Written Plan
-
-Final output format:
-
-```markdown
-# Plan: [Goal Name]
-
-**Goal:** [Single sentence]
-**Total tasks:** N
-**Estimated duration:** [sum of task durations]
-**Parallel opportunities:** [task numbers that can run concurrently]
-
-## Done Criteria
-- [ ] [Observable, testable condition]
-- [ ] [Observable, testable condition]
-
-## Dependency Graph
-[ASCII or description]
-
-## Tasks
-
-### Task 1: [Name]
-[Full task block]
-
-### Task 2: [Name]
-[Full task block]
-
-...
-```
-
-## ClawPowers Enhancement
-
-When `~/.clawpowers/` runtime is initialized:
-
-**Historical Estimation Calibration:**
-
-Plans get compared to actual execution. After 5+ plans, calibration data is available:
-
-```bash
-# After plan execution completes
-bash runtime/persistence/store.sh set "plan:auth-service:estimated_duration" "180"
-bash runtime/persistence/store.sh set "plan:auth-service:actual_duration" "240"
-
-# Read calibration
-bash runtime/feedback/analyze.sh --skill writing-plans
-# Output: Your 2-5 min tasks average 7.3 min actual. Adjust estimates by 1.5x.
-```
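
How `analyze.sh` derives its adjustment is not shown in this file. A plausible reading is a ratio of actual to estimated duration, which the sketch below computes. The function names are hypothetical, and both durations are assumed to be in the same (unspecified) unit.

```python
def calibration_factor(history: list[tuple[float, float]]) -> float:
    """Ratio of total actual to total estimated duration over past plans.

    `history` holds (estimated, actual) pairs in the same unit.
    """
    if not history:
        raise ValueError("no plan history recorded")
    total_estimated = sum(estimated for estimated, _ in history)
    total_actual = sum(actual for _, actual in history)
    if total_estimated <= 0:
        raise ValueError("estimated durations must sum to a positive value")
    return total_actual / total_estimated


def calibrated(estimate: float, factor: float) -> float:
    """Scale a fresh estimate by the historical factor."""
    return estimate * factor


# The single (estimated, actual) pair stored above: 180 vs 240.
factor = calibration_factor([(180, 240)])
print(round(factor, 2))  # 1.33
```

A single pair is used only to keep the example short; the text above suggests waiting for at least five plans before trusting the factor.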
-
-**Dependency Graph Validation:**
-
-Before executing a plan, validate the dependency graph has no cycles and all task inputs exist:
-
-```bash
-bash runtime/persistence/store.sh set "plan:current:task_count" "8"
-bash runtime/persistence/store.sh set "plan:current:deps" "3:1,2 4:- 5:3,4 6:4 7:5,6 8:7"
-# Analyzer checks for cycles and unreachable tasks
-```
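
The analyzer's own checks are not shown here, so the following is only a sketch of what validating such a string could look like. It assumes the format is space-separated `task:prereq,prereq` entries, that `-` means "no prerequisites", and that tasks without an entry have none. `parse_deps` and `find_cycle` are hypothetical names. The sketch covers cycles and unknown task ids; it does not attempt the "unreachable tasks" check, which the source does not define.

```python
def parse_deps(spec: str, task_count: int) -> dict[int, set[int]]:
    """Parse '3:1,2 4:- ...' into {task: prerequisites} for tasks 1..task_count."""
    deps: dict[int, set[int]] = {task: set() for task in range(1, task_count + 1)}
    for entry in spec.split():
        task_text, _, prereq_text = entry.partition(":")
        task = int(task_text)
        if prereq_text in ("", "-"):
            prereqs = set()
        else:
            prereqs = {int(p) for p in prereq_text.split(",")}
        unknown = ({task} | prereqs) - set(deps)
        if unknown:
            raise ValueError(f"unknown task ids: {sorted(unknown)}")
        deps[task] = prereqs
    return deps


def find_cycle(deps: dict[int, set[int]]):
    """Return one dependency cycle as a list of task ids, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {task: WHITE for task in deps}
    path = []

    def visit(task):
        colour[task] = GREY
        path.append(task)
        for prereq in sorted(deps[task]):
            if colour[prereq] == GREY:  # back edge: prereq is on the current path
                return path[path.index(prereq):] + [prereq]
            if colour[prereq] == WHITE:
                cycle = visit(prereq)
                if cycle:
                    return cycle
        path.pop()
        colour[task] = BLACK
        return None

    for task in sorted(deps):
        if colour[task] == WHITE:
            cycle = visit(task)
            if cycle:
                return cycle
    return None


deps = parse_deps("3:1,2 4:- 5:3,4 6:4 7:5,6 8:7", task_count=8)
print(find_cycle(deps))  # None
```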
-
-**Plan Quality Scoring:**
-
-Stored metrics enable quality scoring of plans over time:
-- Estimation accuracy (actual / estimated)
-- Task rework rate (tasks that required re-execution)
-- Dependency violation rate (tasks executed out of order)
-- Done criteria completeness (criteria met on first attempt)
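
The four metrics above are simple ratios. A minimal sketch follows, assuming a hypothetical `PlanRun` record per executed plan; the field names and the choice of denominators are illustrative, since the source names the metrics but does not define them precisely.

```python
from dataclasses import dataclass


@dataclass
class PlanRun:
    """Hypothetical record of one executed plan."""
    estimated_minutes: float
    actual_minutes: float
    tasks_total: int
    tasks_reworked: int
    tasks_out_of_order: int
    criteria_total: int
    criteria_met_first_try: int


def quality_metrics(run: PlanRun) -> dict[str, float]:
    """Compute the four plan-quality ratios for a single run."""
    return {
        "estimation_accuracy": run.actual_minutes / run.estimated_minutes,
        "rework_rate": run.tasks_reworked / run.tasks_total,
        "dependency_violation_rate": run.tasks_out_of_order / run.tasks_total,
        "done_criteria_completeness": run.criteria_met_first_try / run.criteria_total,
    }
```

With this definition an estimation accuracy above 1.0 means the plan took longer than estimated.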
-
-## Anti-Patterns
-
-| Anti-Pattern | Why It Fails | Correct Approach |
-|-------------|-------------|-----------------|
-| Tasks with "and" | Two risks, two outcomes, ambiguous done criteria | Split into two tasks |
-| Vague done criteria | "Done" is subjective, causes rework debates | Observable, testable criteria only |
-| Skipping dependency mapping | Execution order violations cause rework | Always draw the dependency graph |
-| Planning with no spike | Technical unknown bites you at Task 15 | Schedule spike task first if uncertainty exists |
-| Giant tasks (2 hours each) | Hard to track progress, hard to parallelize | Break down to 2-5 min granularity |
-| Tiny tasks (1-2 min each) | Plan overhead exceeds value | Group related micro-tasks |
-| Over-planning volatile specs | Plan becomes invalid before execution starts | Plan only what's stable, leave flexibility for the rest |
-
-## Examples
-
-### Example 1: Small Plan (4 tasks)
-
-**Goal:** Add rate limiting to the API
-
-```markdown
-# Plan: API Rate Limiting
-
-**Goal:** All API endpoints enforce per-user rate limits with 429 response on exceeded limits.
-**Total tasks:** 4 | **Estimated:** 14 min
-
-## Done Criteria
-- [ ] Requests beyond limit receive 429 with Retry-After header
-- [ ] Rate limit state persists across server restarts
-- [ ] Tests cover limit enforcement and reset behavior
-
-## Tasks
-
-### Task 1: Write rate limit tests (RED)
-**Input:** None | **Output:** tests/test_rate_limit.py (failing) | **Duration:** 3 min
-**Done when:** Tests run and fail with ImportError
-
-### Task 2: Implement RedisRateLimiter
-**Input:** tests/test_rate_limit.py | **Output:** src/rate_limiter.py | **Duration:** 5 min
-**Done when:** All rate limiter unit tests pass
-
-### Task 3: Integrate rate limiter into middleware
-**Input:** src/rate_limiter.py | **Output:** src/middleware/rate_limit.py | **Duration:** 4 min
-**Done when:** Integration tests pass, middleware applies limits per endpoint
-
-### Task 4: Add 429 response and Retry-After header
-**Input:** src/middleware/rate_limit.py | **Output:** Modified middleware | **Duration:** 2 min
-**Done when:** 429 response includes Retry-After with correct TTL
-```
-
-### Example 2: Dependency-heavy Plan (8 tasks with parallel opportunities)
-
-**Goal:** Build notification service
-
-**Parallel opportunities:** Tasks 1+2 concurrent, Tasks 4+5 concurrent
-
-```markdown
-### Task 1: Define notification event schema [parallel with Task 2]
-### Task 2: Database migration for notification store [parallel with Task 1]
-### Task 3: NotificationRepository (depends on 1, 2)
-### Task 4: Email provider integration (depends on 1) [parallel with Task 5]
-### Task 5: Push notification provider integration (depends on 1) [parallel with Task 4]
-### Task 6: NotificationService orchestrator (depends on 3, 4, 5)
-### Task 7: API endpoints (depends on 6)
-### Task 8: Integration tests (depends on 7)
-```
@@ -1,260 +0,0 @@
----
-name: writing-skills
-description: Create new ClawPowers skills using TDD methodology — write test scenarios, watch the agent fail without the skill, write the skill, verify the agent passes. Activate when you need a new skill that ClawPowers doesn't have.
-version: 1.0.0
-requires:
-  tools: [bash]
-  runtime: false
-metrics:
-  tracks: [skills_written, skill_quality_scores, test_coverage, anti_pattern_count]
-  improves: [skill_structure_quality, when_to_use_clarity, example_relevance]
----
-
-# Writing Skills
-
-## When to Use
-
-Apply this skill when:
-
-- ClawPowers lacks a skill you need repeatedly
-- You've solved a non-trivial problem 3+ times and want to codify the approach
-- A team has domain-specific methodologies that should be agent-accessible
-- An existing skill is missing important context or examples
-- You're improving an existing skill that consistently produces suboptimal results
-
-**Skip when:**
-- The skill would be a one-off
-- The methodology is already captured in a skill that could be extended
-- You don't have enough real experience with the problem to write a useful skill (skills written from theory, not experience, are worse than no skill)
-
-**Decision tree:**
-```
-Have you solved this problem multiple times manually?
-├── No → Solve it first. Document later.
-└── Yes → Is it covered by an existing skill?
-    ├── Yes → Extend the existing skill (new section, new example)
-    └── No → writing-skills ← YOU ARE HERE
-```
-
-## Core Methodology
-
-### TDD for Skills
-
-Skills are tested by measuring whether an agent WITHOUT the skill fails a scenario that an agent WITH the skill handles correctly. This is behavioral testing — not unit testing of code.
-
-### Phase 1: Write Test Scenarios (RED)
-
-Before writing any skill content, write test scenarios that the skill must handle.
-
-**Test scenario format:**
-```markdown
-## Scenario N: [Descriptive name]
-
-**Agent state:** Agent has no knowledge of [skill domain]
-**Input:** [Exact prompt or situation the agent receives]
-**Without skill:** [Specific wrong behavior the agent exhibits]
-**With skill:** [Specific correct behavior the skill produces]
-**Success criteria:**
-- [ ] [Observable, verifiable outcome]
-- [ ] [Observable, verifiable outcome]
-```
-
-**Minimum scenarios before writing the skill:**
-- 1 happy path (common use case)
-- 1 edge case (unusual but valid)
-- 1 failure case (when NOT to use the skill)
-- 1 anti-pattern case (common mistake the skill prevents)
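
The minimum-scenario rule can be checked before any skill text is written. The sketch below assumes each scenario is tagged with a `kind` label, which the scenario format above does not include; that field, the label values, and `missing_scenario_kinds` are all hypothetical.

```python
# The four scenario kinds required before writing a skill.
REQUIRED_KINDS = ("happy_path", "edge_case", "failure_case", "anti_pattern")


def missing_scenario_kinds(scenarios: list[dict]) -> list[str]:
    """Return the required kinds that no scenario covers, in a stable order."""
    present = {scenario.get("kind") for scenario in scenarios}
    return [kind for kind in REQUIRED_KINDS if kind not in present]


drafted = [
    {"name": "Running migrations safely", "kind": "happy_path"},
    {"name": "Rolling back a bad migration", "kind": "edge_case"},
]
print(missing_scenario_kinds(drafted))  # ['failure_case', 'anti_pattern']
```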
-
-**Example scenarios for a hypothetical "database-migration" skill:**
-
-```markdown
-## Scenario 1: Running migrations safely
-Without skill: Agent runs `alembic upgrade head` without backing up first
-With skill: Agent follows backup → dry-run → verify → apply sequence
-
-## Scenario 2: Rolling back a bad migration
-Without skill: Agent manually deletes rows or drops columns (data loss)
-With skill: Agent runs `alembic downgrade -1`, verifies schema, identifies root cause
-
-## Scenario 3: When NOT to use this skill
-Input: "Update the user model to add an index"
-Without skill: Agent triggers migration skill for every schema change
-With skill: Agent recognizes index-only changes don't need this protocol
-
-## Scenario 4: Anti-pattern — concurrent migrations
-Without skill: Agent runs migrations in parallel across multiple servers
-With skill: Agent ensures single-server serial execution with distributed lock
-```
-
-### Phase 2: Verify Failure (The "RED" Moment)
-
-Before writing the skill, verify that an agent without it fails at least Scenario 1. This confirms:
-- The skill is actually needed
-- The test scenarios are meaningful
-- The skill will produce measurable improvement
-
-If an agent without the skill already handles the scenarios correctly, you don't need the skill.
-
-### Phase 3: Write the Skill
-
-Use the ClawPowers skill template:
-
-```markdown
----
-name: skill-name-kebab-case
-description: [One sentence: when to trigger this skill. Start with "Activate when..."]
-version: 1.0.0
-requires:
-  tools: [tool1, tool2]  # Only tools the skill actually requires
-  runtime: false  # true if skill needs ~/.clawpowers/
-metrics:
-  tracks: [metric1, metric2]  # Observable outcomes
-  improves: [param1, param2]  # Parameters RSI can tune
----
-
-# [Skill Name]
-
-## When to Use
-
-[Decision tree. Include when to skip the skill.]
-
-## Core Methodology
-
-[The actual methodology. Numbered steps. Concrete, not abstract.]
-
-## ClawPowers Enhancement
-
-[What the runtime layer adds. Only if runtime: true or if runtime is optional benefit.]
-
-## Anti-Patterns
-
-[Table of common mistakes, why they fail, correct approach.]
-
-## Examples
-
-[1-3 concrete examples with real code/commands, not hypotheticals.]
-```
-
-### Phase 4: Quality Gates
-
-Before the skill is "done", it must pass these gates:
-
-**Gate 1: When to Use is a decision tree, not a list**
-- Does it tell you when NOT to use the skill?
-- Does it handle edge cases in the decision?
-
-**Gate 2: Core Methodology is actionable**
-- Can someone follow these steps without guessing?
-- Does every step produce a verifiable output?
-- Are code/command examples real, not pseudocode?
-
-**Gate 3: Anti-Patterns are specific**
-- Each anti-pattern names a specific behavior, not a category
-- Each explains WHY it fails (not just "don't do this")
-- Each provides a concrete correct approach
-
-**Gate 4: Examples are real**
-- Examples use plausible real names (not `foo`, `bar`, `example`)
-- Code examples are syntactically correct
-- Examples cover the most common real-world use case
-
-**Gate 5: No stubs**
-- No "TODO: add examples here"
-- No "coming soon" sections
-- No placeholder text
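
Gate 5 is the easiest gate to automate. The following sketch scans skill text for the stub markers listed above; the patterns and the `find_stubs` name are assumptions, not a ClawPowers tool. Expect false positives on skills that legitimately discuss placeholders, as this one does.

```python
import re

# Markers taken from Gate 5; matching is deliberately loose.
STUB_PATTERNS = [
    re.compile(r"\bTODO\b"),
    re.compile(r"coming soon", re.IGNORECASE),
    re.compile(r"\bplaceholder\b", re.IGNORECASE),
]


def find_stubs(skill_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that look like stubs."""
    hits = []
    for number, line in enumerate(skill_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in STUB_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

An empty result is necessary for the gate but not sufficient: a section can be thin without containing any of these markers.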
-
-### Phase 5: Verify Pass (The "GREEN" Moment)
-
-Apply the test scenarios to the completed skill. Verify:
-- Scenario 1 (happy path): skill guides agent to correct outcome
-- Scenario 2 (edge case): skill handles it explicitly or provides guidance
-- Scenario 3 (skip case): skill's "When to Use" correctly excludes this
-- Scenario 4 (anti-pattern): skill's Anti-Patterns section covers it
-
-### Phase 6: Register the Skill
-
-Add the skill to `skills/using-clawpowers/SKILL.md`:
-
-```markdown
-# In the "Quick Reference" section, add:
-25. `database-migration` — Safe migration sequence with backup, dry-run, and rollback
-```
-
-Add trigger pattern to the pattern map:
-```markdown
-| Running or planning a database schema change | `database-migration` |
-```
-
-## ClawPowers Enhancement
-
-When `~/.clawpowers/` runtime is initialized:
-
-**Skill Quality Scoring:**
-
-Each skill is scored on:
-- Scenario coverage (how many test scenarios does it pass?)
-- Usage frequency (how often is it triggered per session?)
-- Outcome rate (when triggered, what % of executions succeed?)
-- Anti-pattern prevention (how often does it prevent a documented anti-pattern?)
-
-```bash
-bash runtime/persistence/store.sh set "skill-quality:database-migration:scenario_coverage" "4/4"
-bash runtime/persistence/store.sh set "skill-quality:database-migration:outcome_rate" "0.92"
-```
-
-**Anti-Pattern Detection:**
-
-The feedback engine monitors skill usage for anti-patterns not covered by the skill:
-```bash
-bash runtime/feedback/analyze.sh --skill database-migration
-# → New anti-pattern detected: agents omit the verify step after applying migrations
-# → Recommend adding explicit verify step to Core Methodology
-```
-
-**Skill Evolution:**
-
-When a skill's outcome rate drops below threshold (< 80%), the feedback engine flags it for revision:
-```bash
-bash runtime/metrics/collector.sh record \
-  --skill writing-skills \
-  --outcome success \
-  --notes "database-migration: 4 scenarios, all passing, quality gates cleared"
-```
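
The revision rule above is a threshold comparison on the stored values. The sketch below interprets the `"4/4"` and `"0.92"` strings from the scoring commands; `parse_coverage` and `needs_revision` are hypothetical helpers, and how the feedback engine actually evaluates the threshold is not shown in this file.

```python
def parse_coverage(value: str) -> float:
    """Turn a 'passed/total' string such as '4/4' into a fraction."""
    passed, total = (int(part) for part in value.split("/"))
    if total <= 0 or not 0 <= passed <= total:
        raise ValueError(f"invalid coverage value: {value!r}")
    return passed / total


def needs_revision(outcome_rate: float, threshold: float = 0.80) -> bool:
    """Flag a skill whose outcome rate is strictly below the threshold."""
    return outcome_rate < threshold


print(parse_coverage("4/4"))          # 1.0
print(needs_revision(float("0.92")))  # False
```

The comparison is strict, so a skill sitting at exactly 80% is not flagged, matching the "< 80%" wording above.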
-
-## Anti-Patterns
-
-| Anti-Pattern | Why It Fails | Correct Approach |
-|-------------|-------------|-----------------|
-| Writing from theory | Skill misses real-world edge cases | Write skills from real experience only |
-| Skipping test scenarios | No way to verify the skill works | Write scenarios first (TDD) |
-| Vague "When to Use" | Skill triggers at wrong times | Decision tree with explicit skip conditions |
-| Placeholder sections | Skill is deployed incomplete | All sections must be complete before registration |
-| One giant methodology section | Agents lose track of where they are | Numbered steps with verifiable outputs |
-| No anti-patterns section | Common mistakes recur | Always include anti-patterns |
-| Examples with foo/bar names | Low signal — agents don't recognize applicability | Use realistic domain names in examples |
-
-## Examples
-
-### Example 1: Simple Skill (2 scenarios)
-
-**Skill:** `git-submodule-update`
-**Problem:** Agents frequently forget to update submodules after `git pull`
-
-**Scenarios:**
-1. After `git pull`, submodule code is stale — skill ensures `git submodule update --init --recursive`
-2. New submodule added — skill ensures `git submodule update --init` for new submodules only
-
-**Skill structure:** When to Use → 3-step methodology (detect stale, update, verify) → 2 anti-patterns → 1 example
-
-### Example 2: Complex Skill (4 scenarios)
-
-**Skill:** `zero-downtime-deployment`
-**Problem:** Agents deploy without considering traffic impact
-
-**Scenarios:**
-1. Deploying new version (happy path) — blue-green or rolling strategy
-2. Deploying with schema migration — migration-first, app second
-3. Rolling back a bad deploy — revert app before revert migration
-4. Skip case — deploying to a dev environment with no traffic
-
-**Skill structure:** When to Use (with explicit skip for dev) → 5-step methodology → 4 anti-patterns → 2 examples (with and without migration)