clawpowers 1.0.1 → 1.1.0

@@ -0,0 +1,325 @@
1
+ ---
2
+ name: meta-skill-evolution
3
+ description: Recursive self-improvement (RSI) for the coding methodology itself. After every 50 completed tasks, analyze outcome patterns, identify the weakest skill, surgically improve it, and commit the evolution, so agents improve their own methodology over time.
4
+ version: 1.0.0
5
+ requires:
+   tools: [bash, git, node]
+   runtime: true
+ metrics:
+   tracks: [evolutions_triggered, skills_improved, success_rate_delta, version_bumps, evolution_duration]
+   improves: [skill_selection_accuracy, weakest_skill_identification, surgical_edit_quality]
11
+ ---
12
+
13
+ # Meta-Skill Evolution
14
+
15
+ ## When to Use
16
+
17
+ Apply this skill when:
18
+
19
+ - The task counter reaches a multiple of 50 (tracked in `~/.clawpowers/state/task-counter.json`)
20
+ - A skill consistently shows < 70% success rate over the last 20 uses
21
+ - `runtime/feedback/analyze.sh` surfaces a skill with declining trend
22
+ - Bill explicitly requests "evolve the skills" or "improve methodology"
23
+ - A cluster of related task failures points to a methodology gap
24
+
25
+ **Skip when:**
26
+ - Fewer than 50 total tasks have been completed (insufficient signal)
27
+ - The runtime directory `~/.clawpowers/` doesn't exist (static mode)
28
+ - A previous evolution cycle completed within the last 10 tasks (cooling period)
29
+
30
+ **Decision tree:**
31
+ ```
32
+ Has task counter hit a multiple of 50?
+ ├── No → continue working; check counter at next task completion
+ └── Yes → Run evolution cycle
+     └── Does weakest skill have < 80% success rate?
+         ├── No → log "all skills healthy", increment counter, skip
+         └── Yes → identify weakest section → surgical edit → version bump → commit
38
+ ```
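The trigger rules above can be condensed into a single function. This is a minimal sketch in Python, used here only for illustration since the skill itself runs on bash and node. The name `evolution_decision` and its parameters are hypothetical and not part of ClawPowers; the sketch assumes the 80% threshold from the decision tree and the 10-task cooldown from the skip list.

```python
def evolution_decision(task_count, weakest_rate, tasks_since_last_evolution,
                       cycle_size=50, cooldown=10, healthy_threshold=0.80):
    """Return "continue", "skip", or "evolve" for the current task count.

    weakest_rate is a 0-1 fraction, or None when no skill has enough data.
    tasks_since_last_evolution is None when no evolution has run yet.
    """
    # Not at a multiple of the cycle size (or no tasks yet): keep working.
    if task_count < cycle_size or task_count % cycle_size != 0:
        return "continue"
    # Cooling period: a previous evolution finished too recently.
    if tasks_since_last_evolution is not None and tasks_since_last_evolution < cooldown:
        return "skip"
    # All skills healthy, or nothing measurable: log and skip.
    if weakest_rate is None or weakest_rate >= healthy_threshold:
        return "skip"
    return "evolve"
```

For example, `evolution_decision(100, 0.6, 50)` selects an evolution cycle, while the same call at task 99 simply continues.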
39
+
40
+ ## Core Methodology
41
+
42
+ ### Step 1: Trigger and Task Counter
43
+
44
+ Every completed task increments a persistent counter. After each task:
45
+
46
+ ```bash
47
+ # Increment task counter
+ COUNTER_FILE=~/.clawpowers/state/task-counter.json
+ mkdir -p "$(dirname "$COUNTER_FILE")"
+ # Read the count by parsing the file as JSON
+ # (require('/dev/stdin') would load the input as JavaScript, not JSON, and fall back to 0)
+ CURRENT=$(node -e "try { console.log(JSON.parse(require('fs').readFileSync(process.argv[1], 'utf8')).count || 0) } catch (e) { console.log(0) }" "$COUNTER_FILE")
+ NEXT=$((CURRENT + 1))
+ echo "{\"count\": $NEXT, \"last_updated\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}" > "$COUNTER_FILE"
+
+ # Check if evolution cycle is due
+ if (( NEXT % 50 == 0 )); then
+   echo "Evolution cycle triggered at task $NEXT"
+   # → proceed to Step 2
+ fi
58
+ ```
59
+
60
+ **Recording task completion (run this after every task):**
61
+ ```bash
62
+ bash runtime/metrics/collector.sh record \
63
+ --skill <active-skill-name> \
64
+ --outcome success|failure \
65
+ --duration <seconds> \
66
+ --notes "<brief description>"
67
+ ```
68
+
69
+ ### Step 2: Outcome Pattern Analysis
70
+
71
+ Pull the last 50 task records and compute per-skill success rates:
72
+
73
+ ```bash
74
+ # Analyze outcomes for the last 50 tasks
75
+ METRICS_FILE=~/.clawpowers/metrics/outcomes.jsonl
76
+
77
+ # Per-skill success rate (requires node)
78
+ node - <<'EOF'
79
+ const fs = require('fs');
80
+ const lines = fs.readFileSync(process.env.HOME + '/.clawpowers/metrics/outcomes.jsonl', 'utf8')
81
+ .trim().split('\n').filter(Boolean).slice(-50)
82
+ .map(l => JSON.parse(l));
83
+
84
+ const stats = {};
85
+ for (const rec of lines) {
86
+ const s = rec.skill || 'unknown';
87
+ if (!stats[s]) stats[s] = { success: 0, failure: 0, durations: [] };
88
+ stats[s][rec.outcome === 'success' ? 'success' : 'failure']++;
89
+ if (rec.duration) stats[s].durations.push(rec.duration);
90
+ }
91
+
92
+ const report = Object.entries(stats).map(([skill, d]) => {
93
+ const total = d.success + d.failure;
94
+ const rate = total > 0 ? (d.success / total) : null;
95
+ const avgDuration = d.durations.length > 0
96
+ ? Math.round(d.durations.reduce((a,b)=>a+b,0) / d.durations.length)
97
+ : null;
98
+ return { skill, success_rate: rate, total_tasks: total, avg_duration_s: avgDuration };
99
+ }).sort((a, b) => (a.success_rate ?? 1) - (b.success_rate ?? 1));
100
+
101
+ console.log(JSON.stringify(report, null, 2));
102
+ EOF
103
+ ```
104
+
105
+ **What to look for:**
106
+ - Lowest `success_rate` → weakest skill candidate
107
+ - Rising `avg_duration_s` → methodology is too slow or unclear
108
+ - High failure count on a single skill → systemic gap, not random noise
109
+
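The second signal, a rising `avg_duration_s`, is not computed by the report script, which only prints a single average per skill. One possible way to flag it, sketched under the assumption that a skill's durations are available in chronological order, is to compare the mean of the later half of the window against the earlier half. The helper name `duration_trend` and the four-sample minimum are illustrative choices, not part of ClawPowers.

```python
from statistics import mean


def duration_trend(durations, min_samples=4):
    """Return later-half mean / earlier-half mean, or None if there is too little data.

    A value above 1.0 means tasks with this skill are getting slower.
    """
    # Need at least two samples so that both halves are non-empty.
    if len(durations) < max(min_samples, 2):
        return None
    midpoint = len(durations) // 2
    earlier, later = durations[:midpoint], durations[midpoint:]
    baseline = mean(earlier)
    if baseline == 0:
        return None  # avoid dividing by zero on degenerate data
    return mean(later) / baseline
```

A ratio well above 1.0 would suggest the methodology is becoming slower or less clear; where to draw that line is a judgment call the source does not specify.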
110
+ ### Step 3: Identify the Weakest Skill
111
+
112
+ ```bash
113
+ # Get weakest skill (lowest success rate with ≥ 3 data points)
114
+ WEAKEST=$(node - <<'EOF'
115
+ const fs = require('fs');
116
+ const lines = fs.readFileSync(process.env.HOME + '/.clawpowers/metrics/outcomes.jsonl', 'utf8')
117
+ .trim().split('\n').filter(Boolean).slice(-50).map(l => JSON.parse(l));
118
+ const stats = {};
119
+ for (const rec of lines) {
120
+ const s = rec.skill || 'unknown';
121
+ if (!stats[s]) stats[s] = { success: 0, failure: 0 };
122
+ stats[s][rec.outcome === 'success' ? 'success' : 'failure']++;
123
+ }
124
+ const ranked = Object.entries(stats)
125
+ .filter(([_, d]) => (d.success + d.failure) >= 3)
126
+ .map(([skill, d]) => ({ skill, rate: d.success / (d.success + d.failure) }))
127
+ .sort((a, b) => a.rate - b.rate);
128
+ console.log(ranked[0]?.skill || '');
129
+ EOF
130
+ )
131
+
132
+ echo "Weakest skill: $WEAKEST"
133
+
134
+ # Read the skill file
135
+ SKILL_FILE="skills/$WEAKEST/SKILL.md"
136
+ if [[ ! -f "$SKILL_FILE" ]]; then
137
+ echo "Skill file not found: $SKILL_FILE — skipping evolution"
138
+ exit 0
139
+ fi
140
+ ```
141
+
142
+ ### Step 4: Diagnose the Specific Weakness
143
+
144
+ Before editing, analyze *why* the skill is failing. Read the failure notes:
145
+
146
+ ```bash
147
+ # Extract failure notes for the weakest skill
148
+ node - <<EOF
149
+ const fs = require('fs');
150
+ const skill = '$WEAKEST';
151
+ const lines = fs.readFileSync(process.env.HOME + '/.clawpowers/metrics/outcomes.jsonl', 'utf8')
152
+ .trim().split('\n').filter(Boolean).slice(-50)
153
+ .map(l => JSON.parse(l))
154
+ .filter(r => r.skill === skill && r.outcome === 'failure' && r.notes);
155
+ lines.forEach(r => console.log(r.timestamp, '|', r.notes));
156
+ EOF
157
+ ```
158
+
159
+ **Diagnosis patterns:**
160
+
161
+ | Failure note pattern | Likely weak section | Fix strategy |
162
+ |---------------------|-------------------|-------------|
163
+ | "step X was unclear" | Core Methodology step X | Add concrete example, remove ambiguity |
164
+ | "forgot to check Y" | Anti-Patterns table | Add the missed check as an explicit anti-pattern |
165
+ | "didn't know when to apply" | When to Use decision tree | Sharpen the decision tree with new branch |
166
+ | "ClawPowers commands failed" | ClawPowers Enhancement | Fix command syntax or add error handling |
167
+ | "took too long on Z" | Core Methodology step Z | Add shortcut or restructure step ordering |
168
+
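The table can be read as a small rule set. The sketch below is an illustration, not the skill's actual diagnosis logic: it matches each failure note against one keyword pattern per table row and reports the section implicated most often. The regular expressions and the name `implicated_section` are assumptions chosen to mirror the example note patterns.

```python
import re
from collections import Counter

# One (pattern, section) pair per row of the diagnosis table.
SECTION_RULES = [
    (re.compile(r"unclear", re.IGNORECASE), "Core Methodology"),
    (re.compile(r"forgot to check", re.IGNORECASE), "Anti-Patterns"),
    (re.compile(r"when to apply", re.IGNORECASE), "When to Use"),
    (re.compile(r"commands? failed", re.IGNORECASE), "ClawPowers Enhancement"),
    (re.compile(r"took too long", re.IGNORECASE), "Core Methodology"),
]


def implicated_section(failure_notes):
    """Return the section most failure notes point at, or None if none match."""
    votes = Counter()
    for note in failure_notes:
        for pattern, section in SECTION_RULES:
            if pattern.search(note):
                votes[section] += 1
                break  # first matching rule wins for this note
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

Notes that match no rule are ignored rather than guessed at, so a result of `None` means the failure notes need a manual read.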
169
+ ### Step 5: Surgical Edit (Not Wholesale Replacement)
170
+
171
+ **Critical rule:** Edit specific sections, not the entire file. Wholesale rewrites lose working methodology.
172
+
173
+ ```bash
174
+ # Read the current skill version
175
+ CURRENT_VERSION=$(grep '^version:' "$SKILL_FILE" | head -1 | awk '{print $2}' | tr -d '"')
176
+ IFS=. read -r MAJOR MINOR PATCH <<< "$CURRENT_VERSION"
+ NEW_VERSION="$MAJOR.$MINOR.$((PATCH + 1))"
180
+
181
+ echo "Evolving $WEAKEST from v$CURRENT_VERSION → v$NEW_VERSION"
182
+ ```
183
+
184
+ **Surgical edit guidelines:**
185
+ - If the "When to Use" decision tree is wrong → edit only that block
186
+ - If a Core Methodology step is incomplete → add one concrete example under that step
187
+ - If an Anti-Pattern is missing → append one row to the table
188
+ - If ClawPowers commands are broken → fix only the broken command block
189
+ - Never touch sections that aren't implicated in the failures
190
+ - Max lines changed per evolution cycle: 30 (forces focus)
191
+
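The 30-line limit can be checked mechanically before committing. The following is a rough sketch using Python's standard `difflib`; the function names and the choice to count added and removed lines separately (so a modified line counts twice) are assumptions, since the source does not define how changed lines are measured.

```python
import difflib


def changed_line_count(old_text, new_text):
    """Count lines added plus lines removed between two versions of a file."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return sum(1 for line in diff if line.startswith(("+ ", "- ")))


def within_edit_budget(old_text, new_text, max_changed=30):
    """True when the edit stays inside the per-cycle line budget."""
    return changed_line_count(old_text, new_text) <= max_changed
```

Calling `within_edit_budget(before, after)` on the skill file's contents before and after the edit gives a quick gate ahead of the version bump.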
192
+ **Apply the edit and bump version:**
193
+ ```bash
194
+ # After making the targeted edit in SKILL_FILE:
195
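+ # -i.bak works with both GNU and BSD/macOS sed (BSD sed requires a suffix after -i)
+ sed -i.bak "s/^version: $CURRENT_VERSION/version: $NEW_VERSION/" "$SKILL_FILE" && rm -f "$SKILL_FILE.bak"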
196
+ ```
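For comparison, the same patch-level bump can be expressed outside the shell. This is a sketch under the assumption that the front matter holds a single `version: X.Y.Z` line, optionally quoted; `bump_patch` and `bump_front_matter` are hypothetical helper names.

```python
import re


def bump_patch(version):
    """Increment the patch component of a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.strip().strip('"').split("."))
    return f"{major}.{minor}.{patch + 1}"


def bump_front_matter(text):
    """Bump the first `version:` line in a skill file, leaving everything else untouched."""
    pattern = re.compile(r'^(version:[ \t]*)"?(\d+\.\d+\.\d+)"?[ \t]*$', re.MULTILINE)
    # count=1 restricts the edit to the front-matter line; quotes, if any, are dropped.
    return pattern.sub(lambda m: m.group(1) + bump_patch(m.group(2)), text, count=1)
```

Only the first matching line is rewritten, so a `version:` string that appears later in the skill body is not changed by accident.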
197
+
198
+ ### Step 6: Commit the Evolution
199
+
200
+ ```bash
201
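+ # TASK_COUNT, RATE, SECTION_EDITED, and ROOT_CAUSE are not assigned by the earlier
+ # snippets; set them first. RATE is a 0-1 fraction (Steps 7 and 8 treat it that way).
+ # Stage and commit
+ git add "$SKILL_FILE"
+ git commit -m "skill-evolution: $WEAKEST v$CURRENT_VERSION → v$NEW_VERSION
+
+ Triggered at task $TASK_COUNT. Success rate was $RATE (0-1 scale).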
206
+ Section edited: $SECTION_EDITED
207
+ Root cause: $ROOT_CAUSE
208
+
209
+ [meta-skill-evolution]"
210
+
211
+ # Copy evolved skill to ~/.clawpowers/skills/ if exists
212
+ MANAGED_SKILLS_DIR=~/.clawpowers/skills
213
+ if [[ -d "$MANAGED_SKILLS_DIR" ]]; then
214
+ mkdir -p "$MANAGED_SKILLS_DIR/$WEAKEST"
215
+ cp "$SKILL_FILE" "$MANAGED_SKILLS_DIR/$WEAKEST/SKILL.md"
216
+ fi
217
+ ```
218
+
219
+ ### Step 7: Log Evolution History
220
+
221
+ Every evolution is appended to a persistent log:
222
+
223
+ ```bash
224
+ EVOLUTION_LOG=~/.clawpowers/feedback/evolution-log.jsonl
225
+ mkdir -p "$(dirname "$EVOLUTION_LOG")"
226
+
227
+ cat >> "$EVOLUTION_LOG" <<EOF
228
+ {"timestamp":"$(date -u +%Y-%m-%dT%H:%M:%SZ)","task_count":$TASK_COUNT,"skill":"$WEAKEST","version_from":"$CURRENT_VERSION","version_to":"$NEW_VERSION","success_rate_before":$RATE,"section_edited":"$SECTION_EDITED","root_cause":"$ROOT_CAUSE","commit":"$(git rev-parse --short HEAD)"}
229
+ EOF
230
+ ```
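The heredoc interpolates `$ROOT_CAUSE` and `$SECTION_EDITED` straight into the JSON text, so a double quote or newline in either value would produce an invalid log line. A serializer avoids that. The sketch below builds the same fields with Python's `json` module; the function name `evolution_log_line` and the `now` parameter (added only to make the timestamp testable) are assumptions.

```python
import json
from datetime import datetime, timezone


def evolution_log_line(task_count, skill, version_from, version_to,
                       success_rate_before, section_edited, root_cause,
                       commit, now=None):
    """Return one JSONL line with the same fields as the evolution log entry."""
    timestamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = {
        "timestamp": timestamp,
        "task_count": task_count,
        "skill": skill,
        "version_from": version_from,
        "version_to": version_to,
        "success_rate_before": success_rate_before,
        "section_edited": section_edited,
        "root_cause": root_cause,
        "commit": commit,
    }
    # json.dumps escapes quotes and newlines, so the entry always fits on one line.
    return json.dumps(entry, ensure_ascii=False)
```

Appending the returned string plus a newline to `evolution-log.jsonl` keeps the file parseable by the review script regardless of what the free-text fields contain.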
231
+
232
+ **Review evolution history:**
233
+ ```bash
234
+ # See all past evolutions
235
+ cat ~/.clawpowers/feedback/evolution-log.jsonl | node -e "
236
+ const lines = require('fs').readFileSync('/dev/stdin','utf8').trim().split('\n').map(JSON.parse);
237
+ lines.forEach(e => console.log(e.timestamp.slice(0,10), e.skill, e.version_from, '→', e.version_to, 'rate:', (e.success_rate_before*100).toFixed(0)+'%'));
238
+ "
239
+
240
+ # Check if an evolution helped (compare rate before vs after)
241
+ # Re-run outcome analysis after 10 more tasks to measure improvement
242
+ ```
243
+
244
+ ### Step 8: Validate the Evolution
245
+
246
+ After 10 more tasks using the evolved skill, check if the success rate improved:
247
+
248
+ ```bash
249
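+ # Post-evolution check (run after 10+ tasks that used the evolved skill)
+ NEW_RATE=$(node -e "
+ const fs = require('fs');
+ // Filter by skill first, then keep the 10 most recent records for that skill
+ const lines = fs.readFileSync(process.env.HOME + '/.clawpowers/metrics/outcomes.jsonl','utf8')
+   .trim().split('\n').filter(Boolean)
+   .map(l => JSON.parse(l))
+   .filter(r => r.skill === '$WEAKEST')
+   .slice(-10);
+ const success = lines.filter(r => r.outcome === 'success').length;
+ // Print nothing when there are no records, rather than NaN
+ if (lines.length > 0) console.log((success/lines.length).toFixed(2));
+ ")
+ echo "Post-evolution success rate for $WEAKEST: ${NEW_RATE:-no data yet}"
+
+ # If the rate dropped, revert the evolution commit. HEAD has moved on after 10+ tasks,
+ # so look the commit up in the evolution log (assumes its last entry is this evolution).
+ if [[ -z "$NEW_RATE" ]]; then
+   echo "Not enough data to validate yet."
+ elif node -e "process.exit(parseFloat('$NEW_RATE') < parseFloat('$RATE') ? 1 : 0)"; then
+   echo "Evolution did not reduce the success rate. Rate: $RATE → $NEW_RATE"
+ else
+   echo "WARNING: Evolution did not help. Reverting."
+   EVOLUTION_COMMIT=$(tail -n 1 ~/.clawpowers/feedback/evolution-log.jsonl | node -e "console.log(JSON.parse(require('fs').readFileSync(0, 'utf8')).commit)")
+   git revert --no-edit "$EVOLUTION_COMMIT"
+ fi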
268
+ ```
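The validation rule reduces to two small functions: measure the skill's recent success rate, then compare it with the rate recorded before the evolution. This sketch assumes the records passed in were logged after the evolution commit and that fewer than 10 uses means "keep waiting"; the names `post_evolution_rate` and `validation_verdict` are hypothetical.

```python
def post_evolution_rate(records, skill, window=10):
    """Success rate over the skill's most recent `window` records, or None if too few."""
    recent = [r for r in records if r.get("skill") == skill][-window:]
    if len(recent) < window:
        return None
    successes = sum(1 for r in recent if r.get("outcome") == "success")
    return successes / len(recent)


def validation_verdict(rate_before, rate_after):
    """Return "wait", "keep", or "revert" for an evolution."""
    if rate_after is None:
        return "wait"  # not enough post-evolution data yet
    return "keep" if rate_after >= rate_before else "revert"
```

Treating an unchanged rate as "keep" mirrors the shell comparison, which only reverts when the new rate is strictly lower.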
269
+
270
+ ## ClawPowers Enhancement
271
+
272
+ When `~/.clawpowers/` runtime is initialized:
273
+
274
+ **Full evolution pipeline:**
275
+
276
+ ```bash
277
+ # Store evolution state for resumability
278
+ bash runtime/persistence/store.sh set "meta-evolution:current:task_count" "$TASK_COUNT"
279
+ bash runtime/persistence/store.sh set "meta-evolution:current:weakest_skill" "$WEAKEST"
280
+ bash runtime/persistence/store.sh set "meta-evolution:current:phase" "diagnosis|editing|committed|validated"
281
+
282
+ # Record the evolution outcome
283
+ bash runtime/metrics/collector.sh record \
284
+ --skill meta-skill-evolution \
285
+ --outcome success \
286
+ --duration "$DURATION" \
287
+ --notes "$WEAKEST v$CURRENT_VERSION→v$NEW_VERSION rate:$RATE→$NEW_RATE"
288
+ ```
289
+
290
+ **Analyze evolution effectiveness over time:**
291
+
292
+ ```bash
293
+ bash runtime/feedback/analyze.sh --filter meta-skill-evolution
294
+ # Shows: how many evolutions triggered, average rate improvement per evolution,
295
+ # which skills have been evolved most, correlation between evolution and task success
296
+ ```
297
+
298
+ **Track cumulative improvement:**
299
+ ```bash
300
+ # Evolution impact report
301
+ cat ~/.clawpowers/feedback/evolution-log.jsonl | node -e "
302
+ const lines = require('fs').readFileSync('/dev/stdin','utf8').trim().split('\n').map(JSON.parse);
303
+ const bySkill = {};
304
+ lines.forEach(e => {
305
+ if (!bySkill[e.skill]) bySkill[e.skill] = [];
306
+ bySkill[e.skill].push(e);
307
+ });
308
+ Object.entries(bySkill).forEach(([skill, evos]) => {
309
+ console.log(skill + ': ' + evos.length + ' evolutions, versions: ' + evos.map(e=>e.version_to).join(', '));
310
+ });
311
+ "
312
+ ```
313
+
314
+ ## Anti-Patterns
315
+
316
+ | Anti-Pattern | Why It Fails | Correct Approach |
317
+ |-------------|-------------|-----------------|
318
+ | Rewrite the whole skill | Destroys working methodology, no signal on what improved | Surgical edits only — max 30 lines changed |
319
+ | Evolve based on < 3 data points | Statistical noise triggers false evolution | Require ≥ 3 uses before a skill is eligible |
320
+ | Evolve during the cooling period | Too-frequent changes create instability | Enforce 10-task cooldown between evolutions |
321
+ | Skip the validation step | Bad evolutions compound over time | Always measure rate before vs after |
322
+ | Edit non-implicated sections | Changes unrelated things, pollutes signal | Only edit sections linked to failure notes |
323
+ | Forget to bump version | Can't track evolution history | Version bump is mandatory before commit |
324
+ | No evolution log entry | History is lost; can't audit what improved | Always append to evolution-log.jsonl |
325
+ | Evolve the meta-skill-evolution skill first | Circular improvement without baseline | Evolve leaf skills first; evolve this skill only after 5+ other evolutions |
@@ -0,0 +1,369 @@
1
+ ---
2
+ name: self-healing-code
3
+ description: On test failure, automatically capture the failure, run hypothesis-driven debugging, generate ≥2 candidate patches, apply and measure each, auto-commit the winner or escalate with full context. Max 3 iteration cycles with coverage guard.
4
+ version: 1.0.0
5
+ requires:
+   tools: [bash, git]
+   runtime: true
+ metrics:
+   tracks: [healing_attempts, auto_commits, escalations, patches_generated, coverage_delta, cycles_used]
+   improves: [patch_quality, hypothesis_accuracy, escalation_context_completeness]
11
+ ---
12
+
13
+ # Self-Healing Code
14
+
15
+ ## When to Use
16
+
17
+ Apply this skill when:
18
+
19
+ - A CI run or local test suite produces a failure
20
+ - A previously green test suite goes red after a code change
21
+ - An automated pipeline fails and needs remediation without human intervention
22
+ - Bill runs tests and the output contains `FAILED` or `ERROR`, or the command exits with a non-zero code
23
+
24
+ **Skip when:**
25
+ - Tests fail because of a missing environment variable or missing external service (that's a configuration issue, not a code defect)
26
+ - The failure is a flaky test known to fail intermittently — check `~/.clawpowers/state/known-flaky.json` first
27
+ - A previous healing cycle for this exact error is already in progress (check `~/.clawpowers/state/healing-lock.json`)
28
+
29
+ **Decision tree:**
30
+ ```
31
+ Did the test suite produce a failure?
+ ├── No → no action
+ └── Yes → Is this a known flaky test?
+     ├── Yes → skip, add flaky annotation, report
+     └── No → Is a healing cycle already running for this error?
+         ├── Yes → wait for completion or check lock age
+         └── No → self-healing-code ← YOU ARE HERE
38
+ ```
39
+
40
+ ## Core Methodology
41
+
42
+ ### Guardrails (enforce before any healing action)
43
+
44
+ ```bash
45
+ # Max cycles: never exceed 3 healing iterations per error
+ MAX_CYCLES=3
+ # ERROR_SIG is computed in Step 1. It is already a hash, so use it directly
+ # (the capture record and the cleanup glob use the same healing-$ERROR_SIG prefix).
+ HEALING_STATE=~/.clawpowers/state/healing-$ERROR_SIG.json
+ CURRENT_CYCLE=$(node -e "try { console.log(JSON.parse(require('fs').readFileSync(process.argv[1], 'utf8')).cycle || 0) } catch (e) { console.log(0) }" "$HEALING_STATE")
+
+ if (( CURRENT_CYCLE >= MAX_CYCLES )); then
+   echo "Max cycles ($MAX_CYCLES) reached. Escalating."
+   # → go to Step 7: Escalation Package
+ fi
54
+
55
+ # Coverage guard — baseline before any patch
56
+ COVERAGE_BASELINE=$(bash runtime/persistence/store.sh get "coverage:baseline:$PROJECT" 2>/dev/null || echo "0")
57
+ ```
58
+
59
+ ### Step 1: Capture the Failure
60
+
61
+ Collect everything needed to understand and reproduce the failure:
62
+
63
+ ```bash
64
+ # Run tests and capture full output
65
+ # Record the real exit status (appending "|| true" before reading $? would always yield 0)
+ TEST_OUTPUT=$(bash -c "$TEST_CMD 2>&1") && EXIT_CODE=0 || EXIT_CODE=$?
67
+
68
+ # Extract structured fields
69
+ TEST_NAME=$(echo "$TEST_OUTPUT" | grep -E "^(FAILED|FAIL|Error in)" | head -1)
70
+ ERROR_MSG=$(echo "$TEST_OUTPUT" | grep -A5 "AssertionError\|Error:\|Exception:" | head -10)
71
+ STACK_TRACE=$(echo "$TEST_OUTPUT" | grep -A20 "Traceback\|at [A-Za-z]" | head -30)
72
+
73
+ # Diff from last green commit
74
+ LAST_GREEN=$(bash runtime/persistence/store.sh get "last-green:$PROJECT" 2>/dev/null || git log --oneline | grep -i "green\|pass\|ci:" | head -1 | awk '{print $1}')
75
+ DIFF_FROM_GREEN=""
76
+ if [[ -n "$LAST_GREEN" ]]; then
77
+ DIFF_FROM_GREEN=$(git diff "$LAST_GREEN" HEAD -- . 2>/dev/null | head -200)
78
+ fi
79
+
80
+ # Error signature hash (for dedup and state tracking)
81
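+ # md5sum on Linux, md5 on macOS; awk keeps only the hex digest
+ ERROR_SIG=$(echo "${TEST_NAME}${ERROR_MSG}" | { md5sum 2>/dev/null || md5; } | awk '{print $1}')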
82
+
83
+ # Log the capture
84
+ CAPTURE_RECORD=~/.clawpowers/state/healing-$ERROR_SIG-capture.json
85
+ cat > "$CAPTURE_RECORD" <<EOF
86
+ {
87
+ "timestamp": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
88
+ "test_name": $(echo "$TEST_NAME" | node -e "process.stdin.on('data',d=>console.log(JSON.stringify(d.toString().trim())))"),
89
+ "error_msg": $(echo "$ERROR_MSG" | node -e "process.stdin.on('data',d=>console.log(JSON.stringify(d.toString().trim())))"),
90
+ "exit_code": $EXIT_CODE,
91
+ "last_green_commit": "$LAST_GREEN",
92
+ "error_signature": "$ERROR_SIG"
93
+ }
94
+ EOF
95
+ ```
96
+
97
+ **Capture checklist:**
98
+ - [ ] Test name (exact test identifier)
99
+ - [ ] Full error message (not truncated)
100
+ - [ ] Stack trace (full, not just last frame)
101
+ - [ ] Diff from last green commit
102
+ - [ ] Environment snapshot (language version, key deps)
103
+
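The error signature is what ties the capture record, patch directory, and healing state together, so it needs to be stable across runs and platforms. As a sketch of the same idea without depending on `md5` or `md5sum` being installed, Python's `hashlib` can compute it. The function name `error_signature` is hypothetical, and because this version trims whitespace and joins the fields with a newline, its digests will not match those produced by the shell pipeline.

```python
import hashlib


def error_signature(test_name, error_msg):
    """Return a stable 32-character hex signature for a test failure.

    MD5 is used only for deduplication here, not for security.
    """
    payload = (test_name.strip() + "\n" + error_msg.strip()).encode("utf-8")
    return hashlib.md5(payload).hexdigest()
```

Trimming the inputs means trailing newlines from captured test output do not change the signature.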
104
+ ### Step 2: Hypothesis Tree (Systematic Debugging Integration)
105
+
106
+ Apply the `systematic-debugging` methodology to form ranked hypotheses. This is not optional — random patches without hypotheses produce random results.
107
+
108
+ ```bash
109
+ # Check persistent hypothesis memory first (see systematic-debugging enhancement)
110
+ KNOWN_HYP=$(bash runtime/persistence/store.sh get "debug:hypothesis:$ERROR_SIG" 2>/dev/null)
111
+ if [[ -n "$KNOWN_HYP" ]]; then
112
+ echo "Known error pattern found. Starting with previously successful hypothesis."
113
+ echo "$KNOWN_HYP"
114
+ fi
115
+ ```
116
+
117
+ **Hypothesis template for common failure patterns:**
118
+
119
+ | Failure pattern | Likely hypothesis | Experiment |
120
+ |----------------|------------------|-----------|
121
+ | `AttributeError: 'NoneType' has no attribute X` | Null not guarded in refactored path | Add null check before access |
122
+ | `AssertionError: expected X, got Y` | Logic changed in upstream function | Bisect to find commit, inspect callers |
123
+ | `ConnectionRefusedError` | Service not started or port changed | Check env config, not a code fix |
124
+ | `KeyError: 'field_name'` | Schema changed, consumer not updated | Find all consumers of that key |
125
+ | `TypeError: expected str, got int` | Type coercion removed | Restore coercion or fix caller |
126
+
127
+ Form 2-4 specific hypotheses before generating patches.
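The template table lends itself to a simple lookup that seeds the hypothesis list. The sketch below is illustrative only: the regular expressions are assumptions derived from the failure patterns in the table, and `seed_hypotheses` is a hypothetical helper, not part of `systematic-debugging`.

```python
import re

# (pattern, likely hypothesis, experiment), one entry per row of the table.
HYPOTHESIS_TABLE = [
    (r"AttributeError: 'NoneType'", "Null not guarded in refactored path",
     "Add null check before access"),
    (r"AssertionError", "Logic changed in upstream function",
     "Bisect to find commit, inspect callers"),
    (r"ConnectionRefusedError", "Service not started or port changed",
     "Check env config, not a code fix"),
    (r"KeyError", "Schema changed, consumer not updated",
     "Find all consumers of that key"),
    (r"TypeError", "Type coercion removed",
     "Restore coercion or fix caller"),
]


def seed_hypotheses(error_msg):
    """Return the table rows whose pattern appears in the error message."""
    return [
        {"hypothesis": hypothesis, "experiment": experiment}
        for pattern, hypothesis, experiment in HYPOTHESIS_TABLE
        if re.search(pattern, error_msg)
    ]
```

An empty result means the error matches none of the common patterns, and the 2-4 hypotheses have to be formed from the stack trace and the diff instead.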
128
+
129
+ ### Step 3: Generate Candidate Patches (Minimum 2)
130
+
131
+ For each top hypothesis, generate a candidate patch. Generate patches **before** applying any:
132
+
133
+ ```bash
134
+ # Stash current state for rollback safety
135
+ git stash push -m "self-healing-pre-patch-$ERROR_SIG-$(date +%s)"
136
+ STASH_REF=$(git stash list | head -1 | awk '{print $1}' | tr -d ':')
137
+
138
+ # Generate patch candidates (store as files, don't apply yet)
139
+ mkdir -p ~/.clawpowers/state/patches/$ERROR_SIG
140
+ ```
141
+
142
+ **Patch generation principles:**
143
+ - **Patch A:** Minimal fix — smallest change that addresses the hypothesis (prefer this)
144
+ - **Patch B:** Alternative approach — different mechanism, same outcome
145
+ - **Patch C (if needed):** Defensive fix — add guards to prevent the class of error
146
+
147
+ **Example (Python null guard):**
148
+ ```python
149
+ # Patch A — minimal: add None check at the failure site
150
+ # Before:
151
+ result = user.profile.settings["theme"]
152
+ # After:
153
+ result = user.profile.settings.get("theme", "default") if user.profile else "default"
154
+
155
+ # Patch B — alternative: fix upstream to guarantee non-null
156
+ # Before:
157
+ def get_user(user_id):
+     return db.query(User).filter_by(id=user_id).first()  # can return None
+ # After:
+ def get_user(user_id):
+     user = db.query(User).filter_by(id=user_id).first()
+     if user is None:
+         raise UserNotFoundError(f"User {user_id} not found")
+     return user
165
+ ```
166
+
167
+ Write each patch to a file:
168
+ ```bash
169
+ # Write patches to staging area
170
+ cat > ~/.clawpowers/state/patches/$ERROR_SIG/patch-a.diff <<'EOF'
171
+ [patch content here]
172
+ EOF
173
+
174
+ # Capture reasoning for each patch
175
+ echo '{"patch":"a","hypothesis":"null not guarded","mechanism":"add get() with default","confidence":"high"}' \
176
+ > ~/.clawpowers/state/patches/$ERROR_SIG/patch-a-meta.json
177
+ ```
178
+
179
+ ### Step 4: Apply, Test, Measure
180
+
181
+ Apply patches in order, testing each. Stop at the first winner.
182
+
183
+ ```bash
184
+ # Measure baseline coverage before any patch
185
+ COVERAGE_BEFORE=$(bash -c "$COVERAGE_CMD 2>&1" | grep -E "TOTAL.*[0-9]+%" | grep -oE "[0-9]+%" | tail -1)
186
+
187
+ for PATCH in a b c; do
188
+ PATCH_FILE=~/.clawpowers/state/patches/$ERROR_SIG/patch-$PATCH.diff
189
+ [[ -f "$PATCH_FILE" ]] || continue
190
+
191
+ echo "=== Applying patch $PATCH ==="
192
+
193
+ # Reset to the clean state left by the Step 3 stash before each patch.
+ # Popping and re-pushing the stash here would fold the previous candidate into it.
+ git checkout -- . 2>/dev/null || true
+
+ # Apply the patch; skip this candidate if it does not apply cleanly
+ if ! git apply "$PATCH_FILE" 2>/dev/null && ! patch -p1 < "$PATCH_FILE" 2>/dev/null; then
+   echo "Patch $PATCH did not apply. Skipping."
+   continue
+ fi
200
+
201
+ # Run full test suite
202
+ TEST_RESULT=$(bash -c "$TEST_CMD 2>&1")
203
+ TEST_EXIT=$?
204
+
205
+ # Measure coverage after patch
206
+ COVERAGE_AFTER=$(bash -c "$COVERAGE_CMD 2>&1" | grep -E "TOTAL.*[0-9]+%" | grep -oE "[0-9]+%" | tail -1)
207
+
208
+ # Coverage guard: never reduce
209
+ COVERAGE_OK=true
210
+ if [[ -n "$COVERAGE_BEFORE" && -n "$COVERAGE_AFTER" ]]; then
211
+ BEFORE_NUM=$(echo "$COVERAGE_BEFORE" | tr -d '%')
212
+ AFTER_NUM=$(echo "$COVERAGE_AFTER" | tr -d '%')
213
+ if (( AFTER_NUM < BEFORE_NUM )); then
214
+ COVERAGE_OK=false
215
+ echo "Coverage dropped: $COVERAGE_BEFORE → $COVERAGE_AFTER. Patch $PATCH rejected."
216
+ fi
217
+ fi
218
+
219
+ if [[ $TEST_EXIT -eq 0 && "$COVERAGE_OK" == "true" ]]; then
220
+ echo "Patch $PATCH PASSED all tests. Coverage: $COVERAGE_BEFORE → $COVERAGE_AFTER"
221
+ WINNING_PATCH=$PATCH
222
+ break
223
+ else
224
+ echo "Patch $PATCH FAILED. Exit: $TEST_EXIT. Coverage OK: $COVERAGE_OK"
225
+ fi
226
+ done
227
+ ```
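The acceptance rule inside the loop has two parts: the full suite must exit 0, and the total coverage must not fall. A compact sketch of that rule follows. It assumes the coverage report contains a `TOTAL` line ending in an integer percentage (the format the `grep` pipeline expects); `total_coverage` and `patch_accepted` are hypothetical names.

```python
import re


def total_coverage(report_text):
    """Return the last integer percentage on the last TOTAL line, or None."""
    result = None
    for line in report_text.splitlines():
        if "TOTAL" in line:
            percentages = re.findall(r"(\d+)%", line)
            if percentages:
                result = int(percentages[-1])
    return result


def patch_accepted(test_exit_code, coverage_before, coverage_after):
    """A patch wins only if tests pass and coverage does not drop."""
    if test_exit_code != 0:
        return False
    # When either measurement is missing, the guard cannot be evaluated and is skipped,
    # matching the shell logic above.
    if coverage_before is not None and coverage_after is not None:
        if coverage_after < coverage_before:
            return False
    return True
```

Because the comparison uses whole percentages, a drop smaller than one point can go undetected; a stricter guard would need a coverage tool that reports decimals.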
228
+
229
+ ### Step 5: Auto-Commit the Winner
230
+
231
+ If a patch passes all tests and maintains coverage:
232
+
233
+ ```bash
234
+ if [[ -n "$WINNING_PATCH" ]]; then
235
+ # Commit with full context
236
+ git add -A
237
+ git commit -m "fix: self-healing patch for ${TEST_NAME}
238
+
239
+ Error signature: $ERROR_SIG
240
+ Patch applied: $WINNING_PATCH
241
+ Hypothesis: $(node -e "console.log(JSON.parse(require('fs').readFileSync(process.argv[1], 'utf8')).hypothesis)" "$HOME/.clawpowers/state/patches/$ERROR_SIG/patch-$WINNING_PATCH-meta.json")
242
+ Coverage: $COVERAGE_BEFORE → $COVERAGE_AFTER
243
+ Cycles used: $((CURRENT_CYCLE + 1))/$MAX_CYCLES
244
+
245
+ [self-healing-code]"
246
+
247
+ # Store last-green reference
248
+ bash runtime/persistence/store.sh set "last-green:$PROJECT" "$(git rev-parse HEAD)"
249
+
250
+ # Record success
251
+ bash runtime/metrics/collector.sh record \
252
+ --skill self-healing-code \
253
+ --outcome success \
254
+ --notes "patch-$WINNING_PATCH won, coverage $COVERAGE_BEFORE→$COVERAGE_AFTER, cycle $((CURRENT_CYCLE+1))/$MAX_CYCLES"
255
+
256
+ # Clean up healing state
257
+ rm -rf ~/.clawpowers/state/patches/$ERROR_SIG
258
+ rm -f ~/.clawpowers/state/healing-$ERROR_SIG*.json
259
+ fi
260
+ ```
261
+
262
+ ### Step 6: Rollback Protocol
263
+
264
+ If no patch wins after all candidates are tried:
265
+
266
+ ```bash
267
+ if [[ -z "$WINNING_PATCH" ]]; then
268
+ # Restore to pre-healing state
269
+ git checkout -- .
270
+ git stash drop 2>/dev/null || true
271
+ echo "All patches failed. State restored to pre-healing baseline."
272
+
273
+ # Increment cycle counter
274
+ NEW_CYCLE=$((CURRENT_CYCLE + 1))
275
+ echo "{\"cycle\": $NEW_CYCLE, \"error_sig\": \"$ERROR_SIG\"}" > "$HEALING_STATE"
276
+
277
+ if (( NEW_CYCLE < MAX_CYCLES )); then
278
+ echo "Cycle $NEW_CYCLE/$MAX_CYCLES complete. Forming new hypotheses."
279
+ # → Loop back to Step 2 with refined hypotheses
280
+ else
281
+ # → Escalate
282
+ echo "Max cycles reached. Escalating with full context."
283
+ fi
284
+ fi
285
+ ```
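The cycle bookkeeping after a failed round is a small state transition: increment the counter, persist it, then either retry or escalate. The sketch below mirrors that logic; `after_failed_cycle` is a hypothetical name, and returning the state as a JSON string (rather than writing the file) is a simplification for illustration.

```python
import json


def after_failed_cycle(current_cycle, error_sig, max_cycles=3):
    """Return (state_json, action) once every candidate patch in a cycle has failed."""
    new_cycle = current_cycle + 1
    # Same fields the shell snippet writes to the healing state file.
    state_json = json.dumps({"cycle": new_cycle, "error_sig": error_sig})
    action = "retry" if new_cycle < max_cycles else "escalate"
    return state_json, action
```

With the default limit, the third failed cycle returns `"escalate"`, which corresponds to building the escalation package.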
286
+
287
+ ### Step 7: Escalation Package
288
+
289
+ When all cycles are exhausted, escalate with enough context that a human can immediately begin debugging:
290
+
291
+ ```markdown
292
+ ## Self-Healing Escalation Report
293
+
294
+ **Error:** [test_name]
295
+ **Error signature:** [hash]
296
+ **Cycles attempted:** 3/3
297
+ **Time spent:** [duration]
298
+
299
+ ### Failure Details
300
+ [Full test output — not truncated]
301
+
302
+ ### Patches Attempted
303
+ 1. Patch A — [hypothesis] — [outcome]
304
+ 2. Patch B — [hypothesis] — [outcome]
305
+ 3. Patch C — [hypothesis] — [outcome]
306
+
307
+ ### Diff from Last Green
308
+ [git diff output]
309
+
310
+ ### Recommended Next Step
311
+ [Best remaining hypothesis with suggested experiment]
312
+
313
+ ### Relevant Files
314
+ [files touched by failing test]
315
+ ```
316
+
317
+ ```bash
318
+ # Record escalation
319
+ bash runtime/metrics/collector.sh record \
320
+ --skill self-healing-code \
321
+ --outcome failure \
322
+ --notes "escalated: $MAX_CYCLES cycles, $PATCHES_TRIED patches, test: $TEST_NAME"
323
+ ```
324
+
325
+ ## ClawPowers Enhancement
326
+
327
+ When `~/.clawpowers/` runtime is initialized:
328
+
329
+ **Healing state persistence (resumable across sessions):**
330
+
331
+ ```bash
332
+ # Save healing progress
333
+ bash runtime/persistence/store.sh set "healing:$ERROR_SIG:cycle" "$CURRENT_CYCLE"
334
+ bash runtime/persistence/store.sh set "healing:$ERROR_SIG:stash" "$STASH_REF"
335
+ bash runtime/persistence/store.sh set "healing:$ERROR_SIG:patches_tried" "$PATCHES_TRIED"
336
+
337
+ # Resume an interrupted healing session
338
+ ERROR_SIG="<hash>"
339
+ CYCLE=$(bash runtime/persistence/store.sh get "healing:$ERROR_SIG:cycle")
340
+ STASH=$(bash runtime/persistence/store.sh get "healing:$ERROR_SIG:stash")
341
+ echo "Resuming healing cycle $CYCLE for error $ERROR_SIG"
342
+ ```
343
+
344
+ **Regression detection:**
345
+ ```bash
346
+ # After auto-commit, verify no regressions in related tests
347
+ RELATED_TESTS=$(git diff HEAD~1 HEAD --name-only | xargs grep -l "def test_" 2>/dev/null | head -10)
348
+ bash -c "$TEST_CMD $RELATED_TESTS 2>&1"
349
+ ```
350
+
351
+ **Pattern learning (feeds systematic-debugging):**
352
+ ```bash
353
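+ # After a successful heal, store the winning pattern. Run this before the Step 5 cleanup,
+ # which deletes the patch directory that holds the meta file.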
354
+ bash runtime/persistence/store.sh set "debug:hypothesis:$ERROR_SIG" \
355
+ "$(cat ~/.clawpowers/state/patches/$ERROR_SIG/patch-$WINNING_PATCH-meta.json)"
356
+ ```
357
+
358
+ ## Anti-Patterns
359
+
360
+ | Anti-Pattern | Why It Fails | Correct Approach |
361
+ |-------------|-------------|-----------------|
362
+ | Apply patches without stashing first | No rollback path if all patches fail | Always stash before first patch |
363
+ | Skip hypothesis formation | Random patches waste all 3 cycles | Form ranked hypotheses before any patch |
364
+ | Generate only 1 patch | Single point of failure | Always generate ≥ 2 patches before applying |
365
+ | Skip coverage check | Patches that delete tests always "pass" | Coverage guard is non-negotiable |
366
+ | Apply patches sequentially without reset | Patches contaminate each other | Reset to clean state between each patch |
367
+ | Commit without full test suite pass | Partial fixes break other tests | Run full suite, not just the failing test |
368
+ | Exceed 3 cycles | Spiraling into a rabbit hole | Hard limit at 3; escalate cleanly |
369
+ | Escalate without full context | Human must re-investigate from scratch | Escalation package must include all evidence |