clawpowers 1.1.3 → 2.0.0
- package/CHANGELOG.md +94 -0
- package/LICENSE +44 -0
- package/README.md +202 -384
- package/SECURITY.md +72 -0
- package/dist/index.d.ts +844 -0
- package/dist/index.js +2536 -0
- package/dist/index.js.map +1 -0
- package/package.json +52 -42
- package/.claude-plugin/manifest.json +0 -19
- package/.codex/INSTALL.md +0 -36
- package/.cursor-plugin/manifest.json +0 -21
- package/.opencode/INSTALL.md +0 -52
- package/ARCHITECTURE.md +0 -69
- package/bin/clawpowers.js +0 -625
- package/bin/clawpowers.sh +0 -91
- package/docs/demo/clawpowers-demo.cast +0 -197
- package/docs/demo/clawpowers-demo.gif +0 -0
- package/docs/launch-images/25-skills-breakdown.jpg +0 -0
- package/docs/launch-images/clawpowers-vs-superpowers.jpg +0 -0
- package/docs/launch-images/economic-code-optimization.jpg +0 -0
- package/docs/launch-images/native-vs-bridge-2.jpg +0 -0
- package/docs/launch-images/native-vs-bridge.jpg +0 -0
- package/docs/launch-images/post1-hero-lobster.jpg +0 -0
- package/docs/launch-images/post2-dashboard.jpg +0 -0
- package/docs/launch-images/post3-superpowers.jpg +0 -0
- package/docs/launch-images/post4-before-after.jpg +0 -0
- package/docs/launch-images/post5-install-now.jpg +0 -0
- package/docs/launch-images/ultimate-stack.jpg +0 -0
- package/docs/launch-posts.md +0 -76
- package/docs/quickstart-first-transaction.md +0 -204
- package/gemini-extension.json +0 -32
- package/hooks/session-start +0 -205
- package/hooks/session-start.cmd +0 -43
- package/hooks/session-start.js +0 -163
- package/runtime/demo/README.md +0 -78
- package/runtime/demo/x402-mock-server.js +0 -230
- package/runtime/feedback/analyze.js +0 -621
- package/runtime/feedback/analyze.sh +0 -546
- package/runtime/init.js +0 -210
- package/runtime/init.sh +0 -178
- package/runtime/metrics/collector.js +0 -361
- package/runtime/metrics/collector.sh +0 -308
- package/runtime/payments/ledger.js +0 -305
- package/runtime/payments/ledger.sh +0 -262
- package/runtime/payments/pipeline.js +0 -459
- package/runtime/persistence/store.js +0 -433
- package/runtime/persistence/store.sh +0 -303
- package/skill.json +0 -106
- package/skills/agent-bounties/SKILL.md +0 -553
- package/skills/agent-payments/SKILL.md +0 -479
- package/skills/brainstorming/SKILL.md +0 -233
- package/skills/content-pipeline/SKILL.md +0 -282
- package/skills/cross-project-knowledge/SKILL.md +0 -345
- package/skills/dispatching-parallel-agents/SKILL.md +0 -305
- package/skills/economic-code-optimization/SKILL.md +0 -265
- package/skills/executing-plans/SKILL.md +0 -255
- package/skills/finishing-a-development-branch/SKILL.md +0 -260
- package/skills/formal-verification-lite/SKILL.md +0 -441
- package/skills/learn-how-to-learn/SKILL.md +0 -235
- package/skills/market-intelligence/SKILL.md +0 -323
- package/skills/meta-skill-evolution/SKILL.md +0 -325
- package/skills/prospecting/SKILL.md +0 -454
- package/skills/receiving-code-review/SKILL.md +0 -225
- package/skills/requesting-code-review/SKILL.md +0 -206
- package/skills/security-audit/SKILL.md +0 -353
- package/skills/self-healing-code/SKILL.md +0 -369
- package/skills/subagent-driven-development/SKILL.md +0 -244
- package/skills/systematic-debugging/SKILL.md +0 -355
- package/skills/test-driven-development/SKILL.md +0 -416
- package/skills/using-clawpowers/SKILL.md +0 -160
- package/skills/using-git-worktrees/SKILL.md +0 -261
- package/skills/verification-before-completion/SKILL.md +0 -254
- package/skills/writing-plans/SKILL.md +0 -276
- package/skills/writing-skills/SKILL.md +0 -260
|
@@ -1,369 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: self-healing-code
|
|
3
|
-
description: On test failure, automatically capture the failure, run hypothesis-driven debugging, generate ≥2 candidate patches, apply and measure each, auto-commit the winner or escalate with full context. Max 3 iteration cycles with coverage guard.
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
requires:
|
|
6
|
-
tools: [bash, git]
|
|
7
|
-
runtime: true
|
|
8
|
-
metrics:
|
|
9
|
-
tracks: [healing_attempts, auto_commits, escalations, patches_generated, coverage_delta, cycles_used]
|
|
10
|
-
improves: [patch_quality, hypothesis_accuracy, escalation_context_completeness]
|
|
11
|
-
---
|
|
12
|
-
|
|
13
|
-
# Self-Healing Code
|
|
14
|
-
|
|
15
|
-
## When to Use
|
|
16
|
-
|
|
17
|
-
Apply this skill when:
|
|
18
|
-
|
|
19
|
-
- A CI run or local test suite produces a failure
|
|
20
|
-
- A previously green test suite goes red after a code change
|
|
21
|
-
- An automated pipeline fails and needs remediation without human intervention
|
|
22
|
-
- A test run prints `FAILED` or `ERROR`, or exits with a non-zero code
|
|
23
|
-
|
|
24
|
-
**Skip when:**
|
|
25
|
-
- Tests fail because of a missing environment variable or missing external service (that's a configuration issue, not a code defect)
|
|
26
|
-
- The failure is a flaky test known to fail intermittently — check `~/.clawpowers/state/known-flaky.json` first
|
|
27
|
-
- A previous healing cycle for this exact error is already in progress (check `~/.clawpowers/state/healing-lock.json`)
|
|
28
|
-
|
|
29
|
-
**Decision tree:**
|
|
30
|
-
```
|
|
31
|
-
Did the test suite produce a failure?
|
|
32
|
-
├── No → no action
|
|
33
|
-
└── Yes → Is this a known flaky test?
|
|
34
|
-
├── Yes → skip, add flaky annotation, report
|
|
35
|
-
└── No → Is a healing cycle already running for this error?
|
|
36
|
-
├── Yes → wait for completion or check lock age
|
|
37
|
-
└── No → self-healing-code ← YOU ARE HERE
|
|
38
|
-
```
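The decision tree above can be read as a chain of early returns. The following is a minimal sketch, not part of the skill itself; the `Action` enum and `choose_action` function are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Action(Enum):
    # Leaves of the decision tree (hypothetical names, wording taken from the tree)
    NONE = "no action"
    SKIP_FLAKY = "skip, add flaky annotation, report"
    WAIT = "wait for completion or check lock age"
    HEAL = "self-healing-code"


def choose_action(failed: bool, known_flaky: bool, healing_in_progress: bool) -> Action:
    """Walk the tree top to bottom and return the first leaf that applies."""
    if not failed:
        return Action.NONE
    if known_flaky:
        return Action.SKIP_FLAKY
    if healing_in_progress:
        return Action.WAIT
    return Action.HEAL
```

The order of the checks matters: a known flaky test is skipped even if a healing lock exists for it.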
|
|
39
|
-
|
|
40
|
-
## Core Methodology
|
|
41
|
-
|
|
42
|
-
### Guardrails (enforce before any healing action)
|
|
43
|
-
|
|
44
|
-
```bash
|
|
45
|
-
# Max cycles — never exceed 3 healing iterations per error
|
|
46
|
-
MAX_CYCLES=3
|
|
47
|
-
HEALING_STATE=~/.clawpowers/state/healing-$ERROR_SIG.json  # ERROR_SIG is already a hash (see Step 1); hashing it again would not match the cleanup glob in Step 5
|
|
48
|
-
CURRENT_CYCLE=$(cat "$HEALING_STATE" 2>/dev/null | node -e "const d=JSON.parse(require('fs').readFileSync(0,'utf8'));console.log(d.cycle||0)" 2>/dev/null || echo 0)
|
|
49
|
-
|
|
50
|
-
if (( CURRENT_CYCLE >= MAX_CYCLES )); then
|
|
51
|
-
echo "Max cycles ($MAX_CYCLES) reached. Escalating."
|
|
52
|
-
# → go to Step 7: Escalation Package
|
|
53
|
-
fi
|
|
54
|
-
|
|
55
|
-
# Coverage guard — baseline before any patch
|
|
56
|
-
COVERAGE_BASELINE=$(bash runtime/persistence/store.sh get "coverage:baseline:$PROJECT" 2>/dev/null || echo "0")
|
|
57
|
-
```
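The cycle guard depends on reading a small JSON state file that may be missing or corrupt. As a hedged sketch of the same check, assuming the state file holds an object with a `cycle` field as written in Step 6, the helpers below (`read_cycle` and `must_escalate`, both hypothetical names) fall back to 0 on any read or parse problem.

```python
import json
from pathlib import Path

MAX_CYCLES = 3  # same hard limit as the guardrail above


def read_cycle(state_file: Path) -> int:
    """Return the stored cycle count, or 0 if the file is missing or unreadable."""
    try:
        data = json.loads(state_file.read_text())
    except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
        return 0
    cycle = data.get("cycle", 0) if isinstance(data, dict) else 0
    return cycle if isinstance(cycle, int) and cycle >= 0 else 0


def must_escalate(state_file: Path) -> bool:
    """True once this error has used up all healing cycles."""
    return read_cycle(state_file) >= MAX_CYCLES
```

Treating an unreadable file as cycle 0 mirrors the `|| echo 0` fallback in the shell version; a stricter variant could escalate instead.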
|
|
58
|
-
|
|
59
|
-
### Step 1: Capture the Failure
|
|
60
|
-
|
|
61
|
-
Collect everything needed to understand and reproduce the failure:
|
|
62
|
-
|
|
63
|
-
```bash
|
|
64
|
-
# Run tests and capture full output
|
|
65
|
-
TEST_OUTPUT=$(bash -c "$TEST_CMD 2>&1")  # no "|| true" here: it would reset $? to 0 before EXIT_CODE is read
|
|
66
|
-
EXIT_CODE=$?
|
|
67
|
-
|
|
68
|
-
# Extract structured fields
|
|
69
|
-
TEST_NAME=$(echo "$TEST_OUTPUT" | grep -E "^(FAILED|FAIL|Error in)" | head -1)
|
|
70
|
-
ERROR_MSG=$(echo "$TEST_OUTPUT" | grep -A5 "AssertionError\|Error:\|Exception:" | head -10)
|
|
71
|
-
STACK_TRACE=$(echo "$TEST_OUTPUT" | grep -A20 "Traceback\|at [A-Za-z]" | head -30)
|
|
72
|
-
|
|
73
|
-
# Diff from last green commit
|
|
74
|
-
LAST_GREEN=$(bash runtime/persistence/store.sh get "last-green:$PROJECT" 2>/dev/null || git log --oneline | grep -i "green\|pass\|ci:" | head -1 | awk '{print $1}')
|
|
75
|
-
DIFF_FROM_GREEN=""
|
|
76
|
-
if [[ -n "$LAST_GREEN" ]]; then
|
|
77
|
-
DIFF_FROM_GREEN=$(git diff "$LAST_GREEN" HEAD -- . 2>/dev/null | head -200)
|
|
78
|
-
fi
|
|
79
|
-
|
|
80
|
-
# Error signature hash (for dedup and state tracking)
|
|
81
|
-
ERROR_SIG=$(echo "${TEST_NAME}${ERROR_MSG}" | { md5sum 2>/dev/null || md5; } | awk '{print $1}')  # md5sum on Linux, md5 on macOS
|
|
82
|
-
|
|
83
|
-
# Log the capture
|
|
84
|
-
CAPTURE_RECORD=~/.clawpowers/state/healing-$ERROR_SIG-capture.json
|
|
85
|
-
cat > "$CAPTURE_RECORD" <<EOF
|
|
86
|
-
{
|
|
87
|
-
"timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
|
|
88
|
-
"test_name": $(echo "$TEST_NAME" | node -e "process.stdin.on('data',d=>console.log(JSON.stringify(d.toString().trim())))"),
|
|
89
|
-
"error_msg": $(echo "$ERROR_MSG" | node -e "process.stdin.on('data',d=>console.log(JSON.stringify(d.toString().trim())))"),
|
|
90
|
-
"exit_code": $EXIT_CODE,
|
|
91
|
-
"last_green_commit": "$LAST_GREEN",
|
|
92
|
-
"error_signature": "$ERROR_SIG"
|
|
93
|
-
}
|
|
94
|
-
EOF
|
|
95
|
-
```
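Building the capture record with a JSON library avoids hand-escaping quotes and newlines in test output. The sketch below is one possible equivalent of the heredoc above, under the assumption that the same six fields are wanted; `error_signature` and `capture_record` are hypothetical helper names. The digest will not equal the shell value, because `echo` appends a trailing newline before hashing.

```python
import hashlib
import json
from datetime import datetime, timezone


def error_signature(test_name: str, error_msg: str) -> str:
    """Stable hex digest used for deduplication and state file names."""
    # MD5 serves only as a short fingerprint here, not as a security measure.
    return hashlib.md5(f"{test_name}{error_msg}".encode("utf-8")).hexdigest()


def capture_record(test_name: str, error_msg: str, exit_code: int, last_green: str) -> str:
    """Return the capture record as a JSON string with correct escaping."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "test_name": test_name.strip(),
        "error_msg": error_msg.strip(),
        "exit_code": exit_code,
        "last_green_commit": last_green,
        "error_signature": error_signature(test_name, error_msg),
    }
    return json.dumps(record, indent=2)
```

Because the same inputs always give the same signature, a repeated failure maps to the same state and patch directories.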
|
|
96
|
-
|
|
97
|
-
**Capture checklist:**
|
|
98
|
-
- [ ] Test name (exact test identifier)
|
|
99
|
-
- [ ] Full error message (not truncated)
|
|
100
|
-
- [ ] Stack trace (full, not just last frame)
|
|
101
|
-
- [ ] Diff from last green commit
|
|
102
|
-
- [ ] Environment snapshot (language version, key deps)
|
|
103
|
-
|
|
104
|
-
### Step 2: Hypothesis Tree (Systematic Debugging Integration)
|
|
105
|
-
|
|
106
|
-
Apply the `systematic-debugging` methodology to form ranked hypotheses. This is not optional — random patches without hypotheses produce random results.
|
|
107
|
-
|
|
108
|
-
```bash
|
|
109
|
-
# Check persistent hypothesis memory first (see systematic-debugging enhancement)
|
|
110
|
-
KNOWN_HYP=$(bash runtime/persistence/store.sh get "debug:hypothesis:$ERROR_SIG" 2>/dev/null)
|
|
111
|
-
if [[ -n "$KNOWN_HYP" ]]; then
|
|
112
|
-
echo "Known error pattern found. Starting with previously successful hypothesis."
|
|
113
|
-
echo "$KNOWN_HYP"
|
|
114
|
-
fi
|
|
115
|
-
```
|
|
116
|
-
|
|
117
|
-
**Hypothesis template for common failure patterns:**
|
|
118
|
-
|
|
119
|
-
| Failure pattern | Likely hypothesis | Experiment |
|
|
120
|
-
|----------------|------------------|-----------|
|
|
121
|
-
| `AttributeError: 'NoneType' has no attribute X` | Null not guarded in refactored path | Add null check before access |
|
|
122
|
-
| `AssertionError: expected X, got Y` | Logic changed in upstream function | Bisect to find commit, inspect callers |
|
|
123
|
-
| `ConnectionRefusedError` | Service not started or port changed | Check env config, not a code fix |
|
|
124
|
-
| `KeyError: 'field_name'` | Schema changed, consumer not updated | Find all consumers of that key |
|
|
125
|
-
| `TypeError: expected str, got int` | Type coercion removed | Restore coercion or fix caller |
|
|
126
|
-
|
|
127
|
-
Form 2-4 specific hypotheses before generating patches.
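The table above can seed the hypothesis list automatically. This is only an illustrative sketch: the regular expressions are assumptions about how each failure pattern appears in Python test output, and `PATTERNS` and `seed_hypotheses` are hypothetical names.

```python
import re

# (regex over the error message, likely hypothesis, experiment), mirroring the table
PATTERNS = [
    (r"AttributeError: 'NoneType'", "Null not guarded in refactored path",
     "Add null check before access"),
    (r"AssertionError", "Logic changed in upstream function",
     "Bisect to find commit, inspect callers"),
    (r"ConnectionRefusedError", "Service not started or port changed",
     "Check env config, not a code fix"),
    (r"KeyError", "Schema changed, consumer not updated",
     "Find all consumers of that key"),
    (r"TypeError", "Type coercion removed",
     "Restore coercion or fix caller"),
]


def seed_hypotheses(error_msg: str) -> list[tuple[str, str]]:
    """Return (hypothesis, experiment) pairs whose pattern matches the message."""
    return [(hyp, exp) for pattern, hyp, exp in PATTERNS if re.search(pattern, error_msg)]
```

An empty result means the table offers no starting point, and the hypotheses have to come from the stack trace and the diff from the last green commit.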
|
|
128
|
-
|
|
129
|
-
### Step 3: Generate Candidate Patches (Minimum 2)
|
|
130
|
-
|
|
131
|
-
For each top hypothesis, generate a candidate patch. Generate patches **before** applying any:
|
|
132
|
-
|
|
133
|
-
```bash
|
|
134
|
-
# Stash current state for rollback safety
|
|
135
|
-
git stash push -m "self-healing-pre-patch-$ERROR_SIG-$(date +%s)"
|
|
136
|
-
STASH_REF=$(git stash list | head -1 | awk '{print $1}' | tr -d ':')
|
|
137
|
-
|
|
138
|
-
# Generate patch candidates (store as files, don't apply yet)
|
|
139
|
-
mkdir -p ~/.clawpowers/state/patches/$ERROR_SIG
|
|
140
|
-
```
|
|
141
|
-
|
|
142
|
-
**Patch generation principles:**
|
|
143
|
-
- **Patch A:** Minimal fix — smallest change that addresses the hypothesis (prefer this)
|
|
144
|
-
- **Patch B:** Alternative approach — different mechanism, same outcome
|
|
145
|
-
- **Patch C (if needed):** Defensive fix — add guards to prevent the class of error
|
|
146
|
-
|
|
147
|
-
**Example (Python null guard):**
|
|
148
|
-
```python
|
|
149
|
-
# Patch A — minimal: add None check at the failure site
|
|
150
|
-
# Before:
|
|
151
|
-
result = user.profile.settings["theme"]
|
|
152
|
-
# After:
|
|
153
|
-
result = user.profile.settings.get("theme", "default") if user.profile else "default"
|
|
154
|
-
|
|
155
|
-
# Patch B — alternative: fix upstream to guarantee non-null
|
|
156
|
-
# Before:
|
|
157
|
-
def get_user(user_id):
|
|
158
|
-
return db.query(User).filter_by(id=user_id).first() # can return None
|
|
159
|
-
# After:
|
|
160
|
-
def get_user(user_id):
|
|
161
|
-
user = db.query(User).filter_by(id=user_id).first()
|
|
162
|
-
if user is None:
|
|
163
|
-
raise UserNotFoundError(f"User {user_id} not found")
|
|
164
|
-
return user
|
|
165
|
-
```
|
|
166
|
-
|
|
167
|
-
Write each patch to a file:
|
|
168
|
-
```bash
|
|
169
|
-
# Write patches to staging area
|
|
170
|
-
cat > ~/.clawpowers/state/patches/$ERROR_SIG/patch-a.diff <<'EOF'
|
|
171
|
-
[patch content here]
|
|
172
|
-
EOF
|
|
173
|
-
|
|
174
|
-
# Capture reasoning for each patch
|
|
175
|
-
echo '{"patch":"a","hypothesis":"null not guarded","mechanism":"add get() with default","confidence":"high"}' \
|
|
176
|
-
> ~/.clawpowers/state/patches/$ERROR_SIG/patch-a-meta.json
|
|
177
|
-
```
|
|
178
|
-
|
|
179
|
-
### Step 4: Apply, Test, Measure
|
|
180
|
-
|
|
181
|
-
Apply patches in order, testing each. Stop at the first winner.
|
|
182
|
-
|
|
183
|
-
```bash
|
|
184
|
-
# Measure baseline coverage before any patch
|
|
185
|
-
COVERAGE_BEFORE=$(bash -c "$COVERAGE_CMD 2>&1" | grep -E "TOTAL.*[0-9]+%" | grep -oE "[0-9]+%" | tail -1)
|
|
186
|
-
|
|
187
|
-
for PATCH in a b c; do
|
|
188
|
-
PATCH_FILE=~/.clawpowers/state/patches/$ERROR_SIG/patch-$PATCH.diff
|
|
189
|
-
[[ -f "$PATCH_FILE" ]] || continue
|
|
190
|
-
|
|
191
|
-
echo "=== Applying patch $PATCH ==="
|
|
192
|
-
|
|
193
|
-
# Restore clean state from stash before each patch
|
|
194
|
-
git stash pop 2>/dev/null || true
|
|
195
|
-
git stash push -m "self-healing-between-patches-$ERROR_SIG" 2>/dev/null || true
|
|
196
|
-
git checkout -- . 2>/dev/null || true
|
|
197
|
-
|
|
198
|
-
# Apply the patch
|
|
199
|
-
git apply "$PATCH_FILE" 2>/dev/null || patch -p1 < "$PATCH_FILE" 2>/dev/null
|
|
200
|
-
|
|
201
|
-
# Run full test suite
|
|
202
|
-
TEST_RESULT=$(bash -c "$TEST_CMD 2>&1")
|
|
203
|
-
TEST_EXIT=$?
|
|
204
|
-
|
|
205
|
-
# Measure coverage after patch
|
|
206
|
-
COVERAGE_AFTER=$(bash -c "$COVERAGE_CMD 2>&1" | grep -E "TOTAL.*[0-9]+%" | grep -oE "[0-9]+%" | tail -1)
|
|
207
|
-
|
|
208
|
-
# Coverage guard: never reduce
|
|
209
|
-
COVERAGE_OK=true
|
|
210
|
-
if [[ -n "$COVERAGE_BEFORE" && -n "$COVERAGE_AFTER" ]]; then
|
|
211
|
-
BEFORE_NUM=$(echo "$COVERAGE_BEFORE" | tr -d '%')
|
|
212
|
-
AFTER_NUM=$(echo "$COVERAGE_AFTER" | tr -d '%')
|
|
213
|
-
if (( AFTER_NUM < BEFORE_NUM )); then
|
|
214
|
-
COVERAGE_OK=false
|
|
215
|
-
echo "Coverage dropped: $COVERAGE_BEFORE → $COVERAGE_AFTER. Patch $PATCH rejected."
|
|
216
|
-
fi
|
|
217
|
-
fi
|
|
218
|
-
|
|
219
|
-
if [[ $TEST_EXIT -eq 0 && "$COVERAGE_OK" == "true" ]]; then
|
|
220
|
-
echo "Patch $PATCH PASSED all tests. Coverage: $COVERAGE_BEFORE → $COVERAGE_AFTER"
|
|
221
|
-
WINNING_PATCH=$PATCH
|
|
222
|
-
break
|
|
223
|
-
else
|
|
224
|
-
echo "Patch $PATCH FAILED. Exit: $TEST_EXIT. Coverage OK: $COVERAGE_OK"
|
|
225
|
-
fi
|
|
226
|
-
done
|
|
227
|
-
```
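The acceptance rule in the loop above has two parts: the test run exits with 0, and the coverage total does not drop. The following sketch isolates that rule, assuming a coverage report with a `TOTAL ... NN%` summary line as the `grep` calls expect; `parse_total_coverage` and `patch_accepted` are hypothetical names.

```python
import re
from typing import Optional


def parse_total_coverage(report: str) -> Optional[int]:
    """Return the last percentage found on a TOTAL line, or None if there is none."""
    percents = []
    for line in report.splitlines():
        if re.search(r"TOTAL.*\d+%", line):
            percents.extend(re.findall(r"(\d+)%", line))
    return int(percents[-1]) if percents else None


def patch_accepted(test_exit: int, before: Optional[int], after: Optional[int]) -> bool:
    """Accept a patch only if tests pass and coverage did not decrease."""
    # As in the shell loop, a missing measurement does not block the patch.
    coverage_ok = before is None or after is None or after >= before
    return test_exit == 0 and coverage_ok
```

Allowing a patch through when coverage could not be measured matches the shell logic, but it weakens the guard; a stricter setup might reject the patch in that case.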
|
|
228
|
-
|
|
229
|
-
### Step 5: Auto-Commit the Winner
|
|
230
|
-
|
|
231
|
-
If a patch passes all tests and maintains coverage:
|
|
232
|
-
|
|
233
|
-
```bash
|
|
234
|
-
if [[ -n "$WINNING_PATCH" ]]; then
|
|
235
|
-
# Commit with full context
|
|
236
|
-
git add -A
|
|
237
|
-
git commit -m "fix: self-healing patch for ${TEST_NAME}
|
|
238
|
-
|
|
239
|
-
Error signature: $ERROR_SIG
|
|
240
|
-
Patch applied: $WINNING_PATCH
|
|
241
|
-
Hypothesis: $(cat ~/.clawpowers/state/patches/$ERROR_SIG/patch-$WINNING_PATCH-meta.json | node -e "const d=JSON.parse(require('fs').readFileSync(0,'utf8'));console.log(d.hypothesis)")
|
|
242
|
-
Coverage: $COVERAGE_BEFORE → $COVERAGE_AFTER
|
|
243
|
-
Cycles used: $((CURRENT_CYCLE + 1))/$MAX_CYCLES
|
|
244
|
-
|
|
245
|
-
[self-healing-code]"
|
|
246
|
-
|
|
247
|
-
# Store last-green reference
|
|
248
|
-
bash runtime/persistence/store.sh set "last-green:$PROJECT" "$(git rev-parse HEAD)"
|
|
249
|
-
|
|
250
|
-
# Record success
|
|
251
|
-
bash runtime/metrics/collector.sh record \
|
|
252
|
-
--skill self-healing-code \
|
|
253
|
-
--outcome success \
|
|
254
|
-
--notes "patch-$WINNING_PATCH won, coverage $COVERAGE_BEFORE→$COVERAGE_AFTER, cycle $((CURRENT_CYCLE+1))/$MAX_CYCLES"
|
|
255
|
-
|
|
256
|
-
# Clean up healing state
|
|
257
|
-
rm -rf ~/.clawpowers/state/patches/$ERROR_SIG
|
|
258
|
-
rm -f ~/.clawpowers/state/healing-$ERROR_SIG*.json
|
|
259
|
-
fi
|
|
260
|
-
```
|
|
261
|
-
|
|
262
|
-
### Step 6: Rollback Protocol
|
|
263
|
-
|
|
264
|
-
If no patch wins after all candidates are tried:
|
|
265
|
-
|
|
266
|
-
```bash
|
|
267
|
-
if [[ -z "$WINNING_PATCH" ]]; then
|
|
268
|
-
# Restore to pre-healing state
|
|
269
|
-
git checkout -- .
|
|
270
|
-
git stash drop 2>/dev/null || true
|
|
271
|
-
echo "All patches failed. State restored to pre-healing baseline."
|
|
272
|
-
|
|
273
|
-
# Increment cycle counter
|
|
274
|
-
NEW_CYCLE=$((CURRENT_CYCLE + 1))
|
|
275
|
-
echo "{\"cycle\": $NEW_CYCLE, \"error_sig\": \"$ERROR_SIG\"}" > "$HEALING_STATE"
|
|
276
|
-
|
|
277
|
-
if (( NEW_CYCLE < MAX_CYCLES )); then
|
|
278
|
-
echo "Cycle $NEW_CYCLE/$MAX_CYCLES complete. Forming new hypotheses."
|
|
279
|
-
# → Loop back to Step 2 with refined hypotheses
|
|
280
|
-
else
|
|
281
|
-
# → Escalate
|
|
282
|
-
echo "Max cycles reached. Escalating with full context."
|
|
283
|
-
fi
|
|
284
|
-
fi
|
|
285
|
-
```
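The cycle bookkeeping at the end of a failed round reduces to one write and one comparison. A minimal sketch, assuming the same `{"cycle", "error_sig"}` state layout as the shell snippet; `record_failed_cycle` is a hypothetical name and the `"retry"` / `"escalate"` return values are illustrative labels.

```python
import json
from pathlib import Path

MAX_CYCLES = 3  # same hard limit as the guardrails


def record_failed_cycle(state_file: Path, error_sig: str, current_cycle: int) -> str:
    """Persist the incremented cycle count and say what should happen next."""
    new_cycle = current_cycle + 1
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps({"cycle": new_cycle, "error_sig": error_sig}))
    # Below the limit: loop back to Step 2 with refined hypotheses.
    # At the limit: hand over to the escalation step.
    return "retry" if new_cycle < MAX_CYCLES else "escalate"
```

Writing the counter before deciding keeps the state file accurate even if the session is interrupted right after a failed round.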
|
|
286
|
-
|
|
287
|
-
### Step 7: Escalation Package
|
|
288
|
-
|
|
289
|
-
When all cycles are exhausted, escalate with enough context that a human can immediately begin debugging:
|
|
290
|
-
|
|
291
|
-
```markdown
|
|
292
|
-
## Self-Healing Escalation Report
|
|
293
|
-
|
|
294
|
-
**Error:** [test_name]
|
|
295
|
-
**Error signature:** [hash]
|
|
296
|
-
**Cycles attempted:** 3/3
|
|
297
|
-
**Time spent:** [duration]
|
|
298
|
-
|
|
299
|
-
### Failure Details
|
|
300
|
-
[Full test output — not truncated]
|
|
301
|
-
|
|
302
|
-
### Patches Attempted
|
|
303
|
-
1. Patch A — [hypothesis] — [outcome]
|
|
304
|
-
2. Patch B — [hypothesis] — [outcome]
|
|
305
|
-
3. Patch C — [hypothesis] — [outcome]
|
|
306
|
-
|
|
307
|
-
### Diff from Last Green
|
|
308
|
-
[git diff output]
|
|
309
|
-
|
|
310
|
-
### Recommended Next Step
|
|
311
|
-
[Best remaining hypothesis with suggested experiment]
|
|
312
|
-
|
|
313
|
-
### Relevant Files
|
|
314
|
-
[files touched by failing test]
|
|
315
|
-
```
|
|
316
|
-
|
|
317
|
-
```bash
|
|
318
|
-
# Record escalation
|
|
319
|
-
bash runtime/metrics/collector.sh record \
|
|
320
|
-
--skill self-healing-code \
|
|
321
|
-
--outcome failure \
|
|
322
|
-
--notes "escalated: $MAX_CYCLES cycles, $PATCHES_TRIED patches, test: $TEST_NAME"
|
|
323
|
-
```
|
|
324
|
-
|
|
325
|
-
## ClawPowers Enhancement
|
|
326
|
-
|
|
327
|
-
When `~/.clawpowers/` runtime is initialized:
|
|
328
|
-
|
|
329
|
-
**Healing state persistence (resumable across sessions):**
|
|
330
|
-
|
|
331
|
-
```bash
|
|
332
|
-
# Save healing progress
|
|
333
|
-
bash runtime/persistence/store.sh set "healing:$ERROR_SIG:cycle" "$CURRENT_CYCLE"
|
|
334
|
-
bash runtime/persistence/store.sh set "healing:$ERROR_SIG:stash" "$STASH_REF"
|
|
335
|
-
bash runtime/persistence/store.sh set "healing:$ERROR_SIG:patches_tried" "$PATCHES_TRIED"
|
|
336
|
-
|
|
337
|
-
# Resume an interrupted healing session
|
|
338
|
-
ERROR_SIG="<hash>"
|
|
339
|
-
CYCLE=$(bash runtime/persistence/store.sh get "healing:$ERROR_SIG:cycle")
|
|
340
|
-
STASH=$(bash runtime/persistence/store.sh get "healing:$ERROR_SIG:stash")
|
|
341
|
-
echo "Resuming healing cycle $CYCLE for error $ERROR_SIG"
|
|
342
|
-
```
|
|
343
|
-
|
|
344
|
-
**Regression detection:**
|
|
345
|
-
```bash
|
|
346
|
-
# After auto-commit, verify no regressions in related tests
|
|
347
|
-
RELATED_TESTS=$(git diff HEAD~1 HEAD --name-only | xargs grep -l "def test_" 2>/dev/null | head -10)
|
|
348
|
-
bash -c "$TEST_CMD $RELATED_TESTS 2>&1"
|
|
349
|
-
```
|
|
350
|
-
|
|
351
|
-
**Pattern learning (feeds systematic-debugging):**
|
|
352
|
-
```bash
|
|
353
|
-
# After successful heal, store the winning pattern
|
|
354
|
-
bash runtime/persistence/store.sh set "debug:hypothesis:$ERROR_SIG" \
|
|
355
|
-
"$(cat ~/.clawpowers/state/patches/$ERROR_SIG/patch-$WINNING_PATCH-meta.json)"
|
|
356
|
-
```
|
|
357
|
-
|
|
358
|
-
## Anti-Patterns
|
|
359
|
-
|
|
360
|
-
| Anti-Pattern | Why It Fails | Correct Approach |
|
|
361
|
-
|-------------|-------------|-----------------|
|
|
362
|
-
| Apply patches without stashing first | No rollback path if all patches fail | Always stash before first patch |
|
|
363
|
-
| Skip hypothesis formation | Random patches waste all 3 cycles | Form ranked hypotheses before any patch |
|
|
364
|
-
| Generate only 1 patch | Single point of failure | Always generate ≥ 2 patches before applying |
|
|
365
|
-
| Skip coverage check | Patches that delete tests always "pass" | Coverage guard is non-negotiable |
|
|
366
|
-
| Apply patches sequentially without reset | Patches contaminate each other | Reset to clean state between each patch |
|
|
367
|
-
| Commit without full test suite pass | Partial fixes break other tests | Run full suite, not just the failing test |
|
|
368
|
-
| Exceed 3 cycles | Spiraling into a rabbit hole | Hard limit at 3; escalate cleanly |
|
|
369
|
-
| Escalate without full context | Human must re-investigate from scratch | Escalation package must include all evidence |
|
|
@@ -1,244 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: subagent-driven-development
|
|
3
|
-
description: Orchestrate complex tasks by dispatching fresh subagents with isolated context, two-stage review, and Git worktree isolation. Activate when a task is large enough to benefit from parallelism or context separation.
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
requires:
|
|
6
|
-
tools: [git, bash]
|
|
7
|
-
runtime: false
|
|
8
|
-
metrics:
|
|
9
|
-
tracks: [tasks_dispatched, subagent_success_rate, review_pass_rate, time_to_completion]
|
|
10
|
-
improves: [task_decomposition_quality, spec_clarity, review_threshold]
|
|
11
|
-
---
|
|
12
|
-
|
|
13
|
-
# Subagent-Driven Development
|
|
14
|
-
|
|
15
|
-
## When to Use
|
|
16
|
-
|
|
17
|
-
Apply this skill when you encounter:
|
|
18
|
-
|
|
19
|
-
- A task with 3+ logically independent workstreams
|
|
20
|
-
- A task so large it would exhaust a single context window
|
|
21
|
-
- A feature requiring multiple specialists (frontend + backend + tests + docs)
|
|
22
|
-
- Any work where a bug in one component shouldn't block another
|
|
23
|
-
- A task with clear interfaces between components (you can spec them up front)
|
|
24
|
-
|
|
25
|
-
**Skip this skill when:**
|
|
26
|
-
- The task is tightly coupled — one change cascades everywhere
|
|
27
|
-
- You need to maintain narrative continuity across all components
|
|
28
|
-
- The task is < 2 hours of work for a single agent
|
|
29
|
-
- You don't have enough information to spec subagent boundaries yet
|
|
30
|
-
|
|
31
|
-
**Decision tree:**
|
|
32
|
-
```
|
|
33
|
-
Can the task be split into N parts with defined interfaces?
|
|
34
|
-
├── No → single-agent execution
|
|
35
|
-
└── Yes → Can subagents work concurrently without blocking each other?
|
|
36
|
-
├── No → sequential execution with checkpointing (executing-plans)
|
|
37
|
-
└── Yes → subagent-driven-development ← YOU ARE HERE
|
|
38
|
-
```
|
|
39
|
-
|
|
40
|
-
## Core Methodology
|
|
41
|
-
|
|
42
|
-
### Stage 0: Task Decomposition (do this yourself, not in a subagent)
|
|
43
|
-
|
|
44
|
-
Before dispatching anything, produce:
|
|
45
|
-
|
|
46
|
-
1. **Task tree** — hierarchical breakdown of the full work
|
|
47
|
-
2. **Subagent boundaries** — where one agent's output is another's input
|
|
48
|
-
3. **Interface contracts** — what each subagent accepts and delivers
|
|
49
|
-
4. **Dependency order** — which can run in parallel, which must sequence
|
|
50
|
-
|
|
51
|
-
**Decomposition heuristic:** Each subagent task should be completable in 1 context window (roughly 2-5K tokens of output). If larger, decompose further.
|
|
52
|
-
|
|
53
|
-
**Example decomposition for "Build authentication service":**
|
|
54
|
-
```
|
|
55
|
-
auth-service/
|
|
56
|
-
├── Subagent A: API design + OpenAPI spec [no dependencies]
|
|
57
|
-
├── Subagent B: Database schema + migrations [no dependencies]
|
|
58
|
-
├── Subagent C: Core auth logic (JWT, bcrypt) [depends on: A, B specs]
|
|
59
|
-
├── Subagent D: Integration tests [depends on: C output]
|
|
60
|
-
└── Subagent E: Documentation [depends on: A, C, D output]
|
|
61
|
-
```
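The dependency annotations in the example tree determine which subagents can be dispatched together. As an illustration only, the sketch below derives dispatch waves with Python's standard `graphlib` module (Python 3.9+); the `DEPENDS_ON` mapping restates the example above, and `dispatch_waves` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# task -> set of tasks it depends on, taken from the auth-service example
DEPENDS_ON = {
    "A": set(),
    "B": set(),
    "C": {"A", "B"},
    "D": {"C"},
    "E": {"A", "C", "D"},
}


def dispatch_waves(depends_on: dict) -> list:
    """Group tasks into waves; every task in a wave can run in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(dispatch_waves(DEPENDS_ON))  # [['A', 'B'], ['C'], ['D'], ['E']]
```

Flattening the waves in order also gives a valid merge order for the integration stage.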
|
|
62
|
-
|
|
63
|
-
### Stage 1: Spec Writing (per subagent)
|
|
64
|
-
|
|
65
|
-
For each subagent, write a precise spec that includes:
|
|
66
|
-
|
|
67
|
-
```markdown
|
|
68
|
-
## Subagent Spec: [Component Name]
|
|
69
|
-
|
|
70
|
-
**Objective:** [Single sentence — what this subagent produces]
|
|
71
|
-
|
|
72
|
-
**Context provided:**
|
|
73
|
-
- [File or artifact they receive as input]
|
|
74
|
-
- [Interface contract from upstream subagent]
|
|
75
|
-
|
|
76
|
-
**Deliverables:**
|
|
77
|
-
- [Specific file or artifact, not vague output]
|
|
78
|
-
- [Test file covering the deliverable]
|
|
79
|
-
|
|
80
|
-
**Constraints:**
|
|
81
|
-
- [Language/framework requirements]
|
|
82
|
-
- [Performance requirements if applicable]
|
|
83
|
-
- [Must not break: existing interfaces]
|
|
84
|
-
|
|
85
|
-
**Done criteria:**
|
|
86
|
-
- [ ] All tests pass
|
|
87
|
-
- [ ] Interface contract satisfied
|
|
88
|
-
- [ ] No TODOs or stubs in production code
|
|
89
|
-
```
|
|
90
|
-
|
|
91
|
-
**Anti-pattern:** Vague specs produce vague output. "Build the auth logic" is not a spec. "Implement JWT issuance and validation with RS256, returning {token, expiresAt, userId} from issue() and {valid, userId, error} from validate()" is a spec.
|
|
92
|
-
|
|
93
|
-
### Stage 2: Worktree Isolation
|
|
94
|
-
|
|
95
|
-
Each subagent works in an isolated Git worktree to prevent interference:
|
|
96
|
-
|
|
97
|
-
```bash
|
|
98
|
-
# Create worktrees for parallel subagents
|
|
99
|
-
git worktree add ../task-auth-api feature/auth-api
|
|
100
|
-
git worktree add ../task-auth-db feature/auth-db
|
|
101
|
-
git worktree add ../task-auth-core feature/auth-core
|
|
102
|
-
|
|
103
|
-
# Verify isolation
|
|
104
|
-
git worktree list
|
|
105
|
-
```
|
|
106
|
-
|
|
107
|
-
Worktrees share the repo history but have independent working directories. A subagent working in `../task-auth-api` cannot accidentally overwrite files in `../task-auth-core`.
|
|
108
|
-
|
|
109
|
-
See: `skills/using-git-worktrees/SKILL.md` for full worktree management protocol.
|
|
110
|
-
|
|
111
|
-
### Stage 3: Subagent Dispatch
|
|
112
|
-
|
|
113
|
-
Dispatch each subagent with:
|
|
114
|
-
1. The spec (complete, not abbreviated)
|
|
115
|
-
2. All input artifacts (relevant files, interface contracts)
|
|
116
|
-
3. Access to their assigned worktree
|
|
117
|
-
4. No instruction to "skip complicated parts" or "use a stub"
|
|
118
|
-
|
|
119
|
-
**Dispatch instruction template:**
|
|
120
|
-
```
|
|
121
|
-
You are implementing [component]. Your spec is below. Work only in the provided
|
|
122
|
-
worktree directory. Produce real, working code with tests — no stubs, no TODOs.
|
|
123
|
-
Deliver: [specific files]. When done, output a JSON summary of what you built.
|
|
124
|
-
|
|
125
|
-
[Full spec here]
|
|
126
|
-
```
|
|
127
|
-
|
|
128
|
-
### Stage 4: Two-Stage Review
|
|
129
|
-
|
|
130
|
-
**Stage 4a: Spec review** — Before running any subagent code, review that:
|
|
131
|
-
- The output matches the spec's deliverables
|
|
132
|
-
- Interface contracts are satisfied (types match, method signatures match)
|
|
133
|
-
- No stubs or mocks in production code paths
|
|
134
|
-
- Tests exist and cover the critical paths
|
|
135
|
-
|
|
136
|
-
**Stage 4b: Quality review** — After running the code:
|
|
137
|
-
- All tests pass (zero failing)
|
|
138
|
-
- No linting errors
|
|
139
|
-
- Performance meets requirements
|
|
140
|
-
- Security: no hardcoded credentials, no SQL injection vectors, no unvalidated inputs
|
|
141
|
-
|
|
142
|
-
**Review failure protocol:**
|
|
143
|
-
```
|
|
144
|
-
If Stage 4a fails → return spec to subagent with specific failure reason
|
|
145
|
-
If Stage 4b fails → return to subagent with exact failing test output
|
|
146
|
-
Never merge code that fails either review stage
|
|
147
|
-
```
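The review failure protocol is a two-gate check in a fixed order. A small sketch of that routing follows; `ReviewResult`, its fields, and `next_step` are hypothetical names, and the returned strings are illustrative rather than a defined message format.

```python
from dataclasses import dataclass


@dataclass
class ReviewResult:
    spec_ok: bool      # Stage 4a: deliverables and interface contracts match the spec
    quality_ok: bool   # Stage 4b: tests, lint, performance, security
    detail: str = ""   # specific failure reason or exact failing test output


def next_step(result: ReviewResult) -> str:
    """Route a reviewed subagent result; nothing merges unless both stages pass."""
    if not result.spec_ok:
        return f"return to subagent (spec review failed): {result.detail}"
    if not result.quality_ok:
        return f"return to subagent (quality review failed): {result.detail}"
    return "ready to merge"
```

Checking the spec stage first means code that does not match its spec is sent back without being run.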
|
|
148
|
-
|
|
149
|
-
### Stage 5: Integration
|
|
150
|
-
|
|
151
|
-
After all subagents pass review:
|
|
152
|
-
|
|
153
|
-
1. Merge worktrees in dependency order
|
|
154
|
-
2. Run full integration test suite
|
|
155
|
-
3. Resolve any interface mismatches (typically minor type issues)
|
|
156
|
-
4. Clean up worktrees
|
|
157
|
-
|
|
158
|
-
```bash
|
|
159
|
-
# Merge in dependency order (A and B are independent, so their relative order does not matter)
|
|
160
|
-
git checkout main
|
|
161
|
-
git merge feature/auth-db
|
|
162
|
-
git merge feature/auth-api
|
|
163
|
-
git merge feature/auth-core # depends on both
|
|
164
|
-
git merge feature/auth-tests
|
|
165
|
-
git merge feature/auth-docs
|
|
166
|
-
|
|
167
|
-
# Clean up
|
|
168
|
-
git worktree remove ../task-auth-api
|
|
169
|
-
git worktree remove ../task-auth-db
|
|
170
|
-
# ... etc
|
|
171
|
-
```
|
|
172
|
-
|
|
173
|
-
## ClawPowers Enhancement
|
|
174
|
-
|
|
175
|
-
When `~/.clawpowers/` runtime is initialized:
|
|
176
|
-
|
|
177
|
-
**Persistent Execution DB:** Every subagent dispatch is logged with spec hash, start time, subagent ID, and outcome. If a session is interrupted, you know exactly which subagents completed and which to re-run.
|
|
178
|
-
|
|
179
|
-
```bash
|
|
180
|
-
# Record dispatch
|
|
181
|
-
bash runtime/persistence/store.sh set "subagent:auth-api:status" "dispatched"
|
|
182
|
-
bash runtime/persistence/store.sh set "subagent:auth-api:spec_hash" "$(echo "$SPEC" | sha256sum | cut -c1-8)"
|
|
183
|
-
|
|
184
|
-
# Check on resume
|
|
185
|
-
bash runtime/persistence/store.sh get "subagent:auth-api:status"
|
|
186
|
-
```
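On resume, the stored statuses decide which subagents need another run. The sketch below is an assumption-laden illustration: the source only shows the `"dispatched"` status, so the `"completed"` value is hypothetical, as are the `spec_hash` and `subagents_to_rerun` names.

```python
import hashlib


def spec_hash(spec: str) -> str:
    """First 8 hex characters of the SHA-256 digest of the spec text."""
    # The trailing newline mirrors `echo "$SPEC" | sha256sum`, which hashes one too.
    return hashlib.sha256((spec + "\n").encode("utf-8")).hexdigest()[:8]


def subagents_to_rerun(statuses: dict[str, str]) -> list[str]:
    """Names whose stored status is anything other than 'completed' (assumed value)."""
    return sorted(name for name, status in statuses.items() if status != "completed")
```

Comparing the stored hash with a fresh `spec_hash` value also reveals whether a spec changed since the subagent was dispatched.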
|
|
187
|
-
|
|
188
|
-
**Resumable Checkpoints:** The framework saves the task tree and each subagent's completion state. A session that crashes mid-dispatch resumes from the last successful checkpoint, not from scratch.
|
|
189
|
-
|
|
190
|
-
**Outcome Metrics:** After integration, record:
|
|
191
|
-
```bash
|
|
192
|
-
bash runtime/metrics/collector.sh record \
|
|
193
|
-
--skill subagent-driven-development \
|
|
194
|
-
--outcome success \
|
|
195
|
-
--duration 3600 \
|
|
196
|
-
--notes "auth-service: 5 subagents, 2 review cycles, 0 integration failures"
|
|
197
|
-
```
|
|
198
|
-
|
|
199
|
-
**Metric-driven decomposition:** After 10+ executions, `runtime/feedback/analyze.sh` identifies your optimal subagent granularity — tasks that are too small (high coordination overhead) or too large (high review failure rate).
|
|
200
|
-
|
|
201
|
-
## Anti-Patterns
|
|
202
|
-
|
|
203
|
-
| Anti-Pattern | Why It Fails | Correct Approach |
|
|
204
|
-
|-------------|-------------|-----------------|
|
|
205
|
-
| Vague spec ("build the auth thing") | Subagent guesses, output is wrong | Write spec with deliverables and done criteria |
|
|
206
|
-
| Skip the failure witness | Review catches nothing | Require all tests to pass in the review stage |
|
|
207
|
-
| Merge before review | Bad code enters main | Two-stage review is non-negotiable |
|
|
208
|
-
| Single worktree for multiple agents | Files overwrite each other | One worktree per subagent, always |
|
|
209
|
-
| Decompose too fine | Excessive coordination cost | Target 1-context-window tasks (2-5K token output) |
|
|
210
|
-
| Decompose too coarse | Subagent context exhaustion | If output > 1 context window, split further |
|
|
211
|
-
| Stub the hard parts | Tech debt accumulates | "No stubs" is a hard constraint in the spec |
|
|
212
|
-
|
|
213
|
-
## Examples
|
|
214
|
-
|
|
215
|
-
### Example 1: Simple (2 subagents)
|
|
216
|
-
|
|
217
|
-
**Task:** Add email verification to existing user signup
|
|
218
|
-
|
|
219
|
-
**Decomposition:**
|
|
220
|
-
- Subagent A: Email service integration (SendGrid/SES wrapper, template rendering)
|
|
221
|
-
- Subagent B: Verification flow (token generation, storage, verification endpoint)
|
|
222
|
-
- Sequential: B depends on A's interface
|
|
223
|
-
|
|
224
|
-
**Specs:** A delivers `EmailService` class with `send(to, template, vars)` → B uses that interface
|
|
225
|
-
|
|
226
|
-
### Example 2: Complex (5 subagents)
|
|
227
|
-
|
|
228
|
-
**Task:** Build real-time dashboard
|
|
229
|
-
|
|
230
|
-
**Decomposition:**
|
|
231
|
-
- Subagent A: WebSocket server (connection mgmt, message routing) [parallel]
|
|
232
|
-
- Subagent B: Data aggregation service (query engine, caching) [parallel]
|
|
233
|
-
- Subagent C: Frontend dashboard components (React, chart library) [parallel]
|
|
234
|
-
- Subagent D: Integration tests (WebSocket + aggregation E2E) [depends on A, B]
|
|
235
|
-
- Subagent E: Dashboard state management (connects C to A/B) [depends on A, B, C]
|
|
236
|
-
|
|
237
|
-
**Parallel dispatch:** A, B, C run concurrently. D and E run after A, B, C complete review.
|
|
238
|
-
|
|
239
|
-
## Integration with Other Skills
|
|
240
|
-
|
|
241
|
-
- Use `writing-plans` first if you don't have a clear task tree yet
|
|
242
|
-
- Apply `using-git-worktrees` for worktree lifecycle management
|
|
243
|
-
- Use `dispatching-parallel-agents` if subagents run as independent processes
|
|
244
|
-
- Apply `verification-before-completion` before final integration merge
|