clawpowers 1.1.4 → 2.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +126 -0
- package/COMPATIBILITY.md +13 -0
- package/KNOWN_LIMITATIONS.md +19 -0
- package/LICENSE +44 -0
- package/LICENSING.md +10 -0
- package/README.md +378 -210
- package/SECURITY.md +52 -0
- package/dist/index.d.ts +1477 -0
- package/dist/index.js +3464 -0
- package/dist/index.js.map +1 -0
- package/native/Cargo.lock +4863 -0
- package/native/Cargo.toml +73 -0
- package/native/crates/canonical/Cargo.toml +24 -0
- package/native/crates/canonical/src/lib.rs +673 -0
- package/native/crates/compression/Cargo.toml +20 -0
- package/native/crates/compression/benches/compression_bench.rs +42 -0
- package/native/crates/compression/src/lib.rs +393 -0
- package/native/crates/evm-eth/Cargo.toml +13 -0
- package/native/crates/evm-eth/src/lib.rs +105 -0
- package/native/crates/fee/Cargo.toml +15 -0
- package/native/crates/fee/src/lib.rs +281 -0
- package/native/crates/index/Cargo.toml +16 -0
- package/native/crates/index/src/lib.rs +277 -0
- package/native/crates/policy/Cargo.toml +17 -0
- package/native/crates/policy/src/lib.rs +614 -0
- package/native/crates/security/Cargo.toml +22 -0
- package/native/crates/security/src/lib.rs +478 -0
- package/native/crates/tokens/Cargo.toml +13 -0
- package/native/crates/tokens/src/lib.rs +534 -0
- package/native/crates/verification/Cargo.toml +23 -0
- package/native/crates/verification/src/lib.rs +333 -0
- package/native/crates/wallet/Cargo.toml +20 -0
- package/native/crates/wallet/src/lib.rs +261 -0
- package/native/crates/x402/Cargo.toml +30 -0
- package/native/crates/x402/src/lib.rs +423 -0
- package/native/ffi/Cargo.toml +34 -0
- package/native/ffi/build.rs +4 -0
- package/native/ffi/index.node +0 -0
- package/native/ffi/src/lib.rs +352 -0
- package/native/ffi/tests/integration.rs +354 -0
- package/native/pyo3/Cargo.toml +26 -0
- package/native/pyo3/pyproject.toml +16 -0
- package/native/pyo3/src/lib.rs +407 -0
- package/native/pyo3/tests/test_smoke.py +180 -0
- package/native/wasm/Cargo.toml +44 -0
- package/native/wasm/pkg/.gitignore +6 -0
- package/native/wasm/pkg/clawpowers_wasm.d.ts +208 -0
- package/native/wasm/pkg/clawpowers_wasm.js +872 -0
- package/native/wasm/pkg/clawpowers_wasm_bg.wasm +0 -0
- package/native/wasm/pkg/clawpowers_wasm_bg.wasm.d.ts +40 -0
- package/native/wasm/pkg/package.json +17 -0
- package/native/wasm/pkg-node/.gitignore +6 -0
- package/native/wasm/pkg-node/clawpowers_wasm.d.ts +143 -0
- package/native/wasm/pkg-node/clawpowers_wasm.js +798 -0
- package/native/wasm/pkg-node/clawpowers_wasm_bg.wasm +0 -0
- package/native/wasm/pkg-node/clawpowers_wasm_bg.wasm.d.ts +40 -0
- package/native/wasm/pkg-node/package.json +13 -0
- package/native/wasm/src/lib.rs +433 -0
- package/package.json +71 -44
- package/src/skills/catalog.ts +435 -0
- package/src/skills/executor.ts +56 -0
- package/src/skills/index.ts +3 -0
- package/src/skills/itp/SKILL.md +112 -0
- package/src/skills/loader.ts +193 -0
- package/.claude-plugin/manifest.json +0 -19
- package/.codex/INSTALL.md +0 -36
- package/.cursor-plugin/manifest.json +0 -21
- package/.opencode/INSTALL.md +0 -52
- package/ARCHITECTURE.md +0 -69
- package/bin/clawpowers.js +0 -625
- package/bin/clawpowers.sh +0 -91
- package/docs/demo/clawpowers-demo.cast +0 -197
- package/docs/demo/clawpowers-demo.gif +0 -0
- package/docs/launch-images/25-skills-breakdown.jpg +0 -0
- package/docs/launch-images/clawpowers-vs-superpowers.jpg +0 -0
- package/docs/launch-images/economic-code-optimization.jpg +0 -0
- package/docs/launch-images/native-vs-bridge-2.jpg +0 -0
- package/docs/launch-images/native-vs-bridge.jpg +0 -0
- package/docs/launch-images/post1-hero-lobster.jpg +0 -0
- package/docs/launch-images/post2-dashboard.jpg +0 -0
- package/docs/launch-images/post3-superpowers.jpg +0 -0
- package/docs/launch-images/post4-before-after.jpg +0 -0
- package/docs/launch-images/post5-install-now.jpg +0 -0
- package/docs/launch-images/ultimate-stack.jpg +0 -0
- package/docs/launch-posts.md +0 -76
- package/docs/quickstart-first-transaction.md +0 -204
- package/gemini-extension.json +0 -32
- package/hooks/session-start +0 -205
- package/hooks/session-start.cmd +0 -43
- package/hooks/session-start.js +0 -163
- package/runtime/demo/README.md +0 -78
- package/runtime/demo/x402-mock-server.js +0 -230
- package/runtime/feedback/analyze.js +0 -621
- package/runtime/feedback/analyze.sh +0 -546
- package/runtime/init.js +0 -210
- package/runtime/init.sh +0 -178
- package/runtime/metrics/collector.js +0 -361
- package/runtime/metrics/collector.sh +0 -308
- package/runtime/payments/ledger.js +0 -305
- package/runtime/payments/ledger.sh +0 -262
- package/runtime/payments/pipeline.js +0 -455
- package/runtime/persistence/store.js +0 -433
- package/runtime/persistence/store.sh +0 -303
- package/skill.json +0 -106
- package/skills/agent-bounties/SKILL.md +0 -553
- package/skills/agent-payments/SKILL.md +0 -479
- package/skills/brainstorming/SKILL.md +0 -233
- package/skills/content-pipeline/SKILL.md +0 -282
- package/skills/cross-project-knowledge/SKILL.md +0 -345
- package/skills/dispatching-parallel-agents/SKILL.md +0 -305
- package/skills/economic-code-optimization/SKILL.md +0 -265
- package/skills/executing-plans/SKILL.md +0 -255
- package/skills/finishing-a-development-branch/SKILL.md +0 -260
- package/skills/formal-verification-lite/SKILL.md +0 -441
- package/skills/learn-how-to-learn/SKILL.md +0 -235
- package/skills/market-intelligence/SKILL.md +0 -323
- package/skills/meta-skill-evolution/SKILL.md +0 -325
- package/skills/prospecting/SKILL.md +0 -454
- package/skills/receiving-code-review/SKILL.md +0 -225
- package/skills/requesting-code-review/SKILL.md +0 -206
- package/skills/security-audit/SKILL.md +0 -353
- package/skills/self-healing-code/SKILL.md +0 -369
- package/skills/subagent-driven-development/SKILL.md +0 -244
- package/skills/systematic-debugging/SKILL.md +0 -355
- package/skills/test-driven-development/SKILL.md +0 -416
- package/skills/using-clawpowers/SKILL.md +0 -160
- package/skills/using-git-worktrees/SKILL.md +0 -261
- package/skills/validator/SKILL.md +0 -281
- package/skills/verification-before-completion/SKILL.md +0 -254
- package/skills/writing-plans/SKILL.md +0 -276
- package/skills/writing-skills/SKILL.md +0 -260
|
@@ -1,265 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: economic-code-optimization
|
|
3
|
-
description: Autonomously allocate micro-budgets to improve code quality - premium models, cloud compute, expert reviews, specialized tools. ROI-tracked spending that compounds agent performance over time.
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
requires:
|
|
6
|
-
tools: []
|
|
7
|
-
runtime: true
|
|
8
|
-
skills: [agent-payments]
|
|
9
|
-
metrics:
|
|
10
|
-
tracks: [spend_decisions, amount_spent, roi_ratio, quality_delta, upgrades_purchased]
|
|
11
|
-
improves: [spend_efficiency, roi_accuracy, tier_selection, budget_utilization]
|
|
12
|
-
---
|
|
13
|
-
|
|
14
|
-
# Economic Code Optimization
|
|
15
|
-
|
|
16
|
-
## When to Use
|
|
17
|
-
|
|
18
|
-
Apply this skill when:
|
|
19
|
-
|
|
20
|
-
- A task is complex enough (estimated 2+ hours) that premium resources would meaningfully improve quality or speed
|
|
21
|
-
- Confidence is low on a critical PR and a paid expert review would catch issues you'd miss
|
|
22
|
-
- Mutation testing or formal verification needs more compute than the local machine provides
|
|
23
|
-
- An external API or tool (static analyzer, security scanner, profiler) would produce better results than manual inspection
|
|
24
|
-
- The task has high business value and the cost of a bug outweighs the cost of premium resources
|
|
25
|
-
|
|
26
|
-
Do NOT use when:
|
|
27
|
-
- The task is routine (simple CRUD, config changes, docs updates)
|
|
28
|
-
- The owner hasn't configured spending limits in `~/.clawpowers/config.json`
|
|
29
|
-
- The wallet balance is below the minimum threshold
|
|
30
|
-
- The task value doesn't justify any spend (ROI would be negative)
|
|
31
|
-
|
|
32
|
-
## Background: Why Agents Should Spend Money on Code
|
|
33
|
-
|
|
34
|
-
Static frameworks assume all resources are equal. They're not.
|
|
35
|
-
|
|
36
|
-
A $0.003 API call to a premium static analyzer catches bugs that 200 lines of manual review miss. A $0.10 GPU burst runs 10,000 mutation tests in 30 seconds instead of 20 minutes locally. A $2.00 expert review on a security-critical PR prevents a $200,000 vulnerability.
|
|
37
|
-
|
|
38
|
-
The question isn't whether to spend. It's how much, on what, and whether the ROI justifies it. This skill makes that decision automatically, within hard limits the owner controls.
|
|
39
|
-
|
|
40
|
-
## Setup
|
|
41
|
-
|
|
42
|
-
### Configure Spending Limits
|
|
43
|
-
|
|
44
|
-
Create or update `~/.clawpowers/config.json`:
|
|
45
|
-
|
|
46
|
-
```json
|
|
47
|
-
{
|
|
48
|
-
"economic_optimization": {
|
|
49
|
-
"enabled": true,
|
|
50
|
-
"global_daily_limit_usd": 5.00,
|
|
51
|
-
"per_task_max_usd": 1.00,
|
|
52
|
-
"min_task_value_usd": 50.00,
|
|
53
|
-
"auto_approve_below_usd": 0.10,
|
|
54
|
-
"require_owner_approval_above_usd": 2.00,
|
|
55
|
-
"allowed_categories": ["premium_model", "compute", "static_analysis", "security_scan", "expert_review"]
|
|
56
|
-
}
|
|
57
|
-
}
|
|
58
|
-
```
|
|
59
|
-
|
|
60
|
-
If no config exists, the skill operates in dry-run mode - it calculates what it would spend and logs the decision, but doesn't execute any payments.
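The dry-run fallback described above can be sketched as follows. This is a minimal illustration, not the package's actual loader: the function name `load_spend_config` and the `dry_run` flag are assumptions, and only the `economic_optimization` key names come from the config example.

```python
import json
from pathlib import Path


def load_spend_config(path: Path) -> dict:
    """Return the economic_optimization settings plus a derived dry_run flag.

    Hypothetical helper: a missing or unreadable config yields dry-run mode,
    mirroring the behaviour described in the text.
    """
    try:
        raw = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # No usable config: nothing is enabled, so only log hypothetical spend.
        return {"enabled": False, "dry_run": True}
    if not isinstance(raw, dict):
        return {"enabled": False, "dry_run": True}
    settings = dict(raw.get("economic_optimization", {}))
    # Dry-run whenever spending has not been explicitly enabled.
    settings["dry_run"] = not settings.get("enabled", False)
    return settings
```

Calling `load_spend_config(Path.home() / ".clawpowers" / "config.json")` would then give the caller a single place to check `dry_run` before any payment is attempted.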
|
|
61
|
-
|
|
62
|
-
### Verify Prerequisites
|
|
63
|
-
|
|
64
|
-
```bash
|
|
65
|
-
# Check that agent-payments skill is available
|
|
66
|
-
npx clawpowers store get "skill:agent-payments:configured"
|
|
67
|
-
|
|
68
|
-
# Check wallet balance (if configured)
|
|
69
|
-
npx clawpowers store get "wallet:balance:usd"
|
|
70
|
-
```
|
|
71
|
-
|
|
72
|
-
## Core Methodology
|
|
73
|
-
|
|
74
|
-
### Step 1: Task Value Assessment
|
|
75
|
-
|
|
76
|
-
Before any spend decision, estimate the task's business value. Be conservative.
|
|
77
|
-
|
|
78
|
-
**Value heuristics:**
|
|
79
|
-
|
|
80
|
-
| Task Type | Estimated Value | Rationale |
|
|
81
|
-
|-----------|----------------|-----------|
|
|
82
|
-
| Security-critical fix | $500-5,000 | Vulnerability cost if shipped |
|
|
83
|
-
| Core business logic | $200-1,000 | Revenue impact of bugs |
|
|
84
|
-
| Public API change | $100-500 | Breaking changes affect users |
|
|
85
|
-
| Performance optimization | $50-200 | Compute savings over time |
|
|
86
|
-
| Internal tooling | $20-100 | Developer time savings |
|
|
87
|
-
| Docs/config changes | $5-20 | Low risk, low impact |
|
|
88
|
-
|
|
89
|
-
Record the assessment:
|
|
90
|
-
|
|
91
|
-
```bash
|
|
92
|
-
npx clawpowers store set "eco:${TASK_ID}:estimated_value" "500"
|
|
93
|
-
npx clawpowers store set "eco:${TASK_ID}:task_type" "security-critical"
|
|
94
|
-
```
|
|
95
|
-
|
|
96
|
-
### Step 2: Spend Tier Calculation
|
|
97
|
-
|
|
98
|
-
Calculate the optimal spend as a percentage of task value:
|
|
99
|
-
|
|
100
|
-
```
|
|
101
|
-
spend_budget = min(task_value * spend_ratio, per_task_max)
|
|
102
|
-
|
|
103
|
-
Spend ratios by complexity score (1-10):
|
|
104
|
-
Simple (1-3): 0% - no spend needed
|
|
105
|
-
Medium (4-6): 0.5% of task value
|
|
106
|
-
Complex (7-8): 2% of task value
|
|
107
|
-
Critical (9-10): 5% of task value
|
|
108
|
-
```
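As a rough sketch, the formula and ratio table above could be expressed like this. The function name and the assumption that complexity arrives as an integer score from 1 to 10 are illustrative; the ratios and the cap are taken from the text.

```python
# Ratios copied from the table above, keyed by complexity score range.
SPEND_RATIOS = [
    (range(1, 4), 0.0),     # Simple (1-3): no spend
    (range(4, 7), 0.005),   # Medium (4-6): 0.5% of task value
    (range(7, 9), 0.02),    # Complex (7-8): 2% of task value
    (range(9, 11), 0.05),   # Critical (9-10): 5% of task value
]


def spend_budget(task_value_usd: float, complexity: int, per_task_max_usd: float) -> float:
    """spend_budget = min(task_value * spend_ratio, per_task_max), rounded to cents."""
    for scores, ratio in SPEND_RATIOS:
        if complexity in scores:
            return round(min(task_value_usd * ratio, per_task_max_usd), 2)
    raise ValueError(f"complexity must be an integer from 1 to 10, got {complexity!r}")
```

Note that with the example limits (`per_task_max_usd` of 1.00), any high-value critical task hits the cap rather than the ratio.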
|
|
109
|
-
|
|
110
|
-
**Decision tree:**
|
|
111
|
-
|
|
112
|
-
1. Is `economic_optimization.enabled` true? If no, stop.
|
|
113
|
-
2. Is `estimated_value >= min_task_value_usd`? If no, stop - task too small to justify spend.
|
|
114
|
-
3. Calculate `spend_budget` using complexity ratio.
|
|
115
|
-
4. Is `spend_budget < auto_approve_below_usd`? If yes, proceed automatically.
|
|
116
|
-
5. Is `spend_budget > require_owner_approval_above_usd`? If yes, queue for approval and continue with base resources.
|
|
117
|
-
6. Otherwise, proceed with spend.
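The decision tree above can be sketched as a small function. The `SpendLimits` class, the `decide` name, and the returned labels are assumptions made for illustration; the thresholds default to the values from the example config, and step 3 (computing the budget) is taken as an input.

```python
from dataclasses import dataclass


@dataclass
class SpendLimits:
    """Hypothetical container for the config keys used by the decision tree."""
    enabled: bool = True
    min_task_value_usd: float = 50.00
    auto_approve_below_usd: float = 0.10
    require_owner_approval_above_usd: float = 2.00


def decide(estimated_value_usd: float, budget_usd: float, limits: SpendLimits) -> str:
    """Walk the decision tree in order and return one illustrative label."""
    if not limits.enabled:
        return "stop"                      # step 1: optimization disabled
    if estimated_value_usd < limits.min_task_value_usd:
        return "stop"                      # step 2: task too small
    if budget_usd < limits.auto_approve_below_usd:
        return "auto"                      # step 4: proceed automatically
    if budget_usd > limits.require_owner_approval_above_usd:
        return "queue_for_approval"        # step 5: continue with base resources
    return "proceed"                       # step 6
```

For example, `decide(500, 0.50, SpendLimits())` falls through to the final branch, since 0.50 sits between the auto-approve and owner-approval thresholds.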
|
|
118
|
-
|
|
119
|
-
Record the decision:
|
|
120
|
-
|
|
121
|
-
```bash
|
|
122
|
-
npx clawpowers store set "eco:${TASK_ID}:spend_tier" "medium"
|
|
123
|
-
npx clawpowers store set "eco:${TASK_ID}:budget_usd" "0.50"
|
|
124
|
-
npx clawpowers store set "eco:${TASK_ID}:approved" "auto"
|
|
125
|
-
```
|
|
126
|
-
|
|
127
|
-
### Step 3: Resource Allocation
|
|
128
|
-
|
|
129
|
-
Based on the budget, select upgrades from this priority list:
|
|
130
|
-
|
|
131
|
-
**Tier 1: Free optimizations (always apply)**
|
|
132
|
-
- Use cached results from `cross-project-knowledge` pattern library
|
|
133
|
-
- Apply known-good patterns from previous projects
|
|
134
|
-
- Run local linters and type checkers
|
|
135
|
-
|
|
136
|
-
**Tier 2: Micro-spend ($0.01-0.10)**
|
|
137
|
-
- Premium static analysis API call (Semgrep Pro, SonarCloud)
|
|
138
|
-
- Extended mutation testing run (2x-5x normal iterations)
|
|
139
|
-
- Dependency vulnerability deep scan
|
|
140
|
-
|
|
141
|
-
**Tier 3: Small spend ($0.10-1.00)**
|
|
142
|
-
- Cloud GPU burst for heavy formal verification (1000+ property tests)
|
|
143
|
-
- Premium model API call for complex code review (Claude Opus, GPT-4o)
|
|
144
|
-
- Performance profiling service for optimization tasks
|
|
145
|
-
|
|
146
|
-
**Tier 4: Significant spend ($1.00-5.00)**
|
|
147
|
-
- Paid expert review routing for security-critical PRs
|
|
148
|
-
- Multi-model consensus (run 3 premium models, majority vote on approach)
|
|
149
|
-
- Extended cloud compute for exhaustive test generation
|
|
150
|
-
|
|
151
|
-
Record allocations:
|
|
152
|
-
|
|
153
|
-
```bash
|
|
154
|
-
npx clawpowers store set "eco:${TASK_ID}:allocations" "premium_model,mutation_testing_5x"
|
|
155
|
-
```
|
|
156
|
-
|
|
157
|
-
### Step 4: Execute Spend
|
|
158
|
-
|
|
159
|
-
Use the `agent-payments` skill for all financial transactions. Never bypass spending limits.
|
|
160
|
-
|
|
161
|
-
```bash
|
|
162
|
-
# Record the spend decision
|
|
163
|
-
npx clawpowers metrics record \
|
|
164
|
-
--skill economic-code-optimization \
|
|
165
|
-
--outcome success \
|
|
166
|
-
--duration 0 \
|
|
167
|
-
--notes "Allocated \$0.50: premium model (\$0.30) + extended mutation testing (\$0.20) for task ${TASK_ID}"
|
|
168
|
-
```
|
|
169
|
-
|
|
170
|
-
For each purchased resource, track what was bought and the immediate result:
|
|
171
|
-
|
|
172
|
-
```bash
|
|
173
|
-
npx clawpowers store set "eco:${TASK_ID}:purchase:premium_model:cost" "0.30"
|
|
174
|
-
npx clawpowers store set "eco:${TASK_ID}:purchase:premium_model:result" "3 additional edge cases identified"
|
|
175
|
-
npx clawpowers store set "eco:${TASK_ID}:purchase:mutation_5x:cost" "0.20"
|
|
176
|
-
npx clawpowers store set "eco:${TASK_ID}:purchase:mutation_5x:result" "mutation score 72% -> 91%"
|
|
177
|
-
```
|
|
178
|
-
|
|
179
|
-
### Step 5: Measure ROI
|
|
180
|
-
|
|
181
|
-
After the task completes, calculate actual return on investment:
|
|
182
|
-
|
|
183
|
-
```
|
|
184
|
-
roi = (quality_improvement_value + time_saved_value) / amount_spent
|
|
185
|
-
|
|
186
|
-
Quality improvement value:
|
|
187
|
-
- Bugs caught before merge × estimated bug cost = quality value
|
|
188
|
-
- Mutation score improvement × coverage confidence factor
|
|
189
|
-
|
|
190
|
-
Time saved value:
|
|
191
|
-
- Hours saved by premium resources × hourly rate
|
|
192
|
-
```
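A minimal sketch of the ROI formula above, under stated assumptions: the function and parameter names are hypothetical, and the mutation-score term is left out because the text does not define its "coverage confidence factor".

```python
def roi(bugs_caught: int, est_bug_cost_usd: float,
        hours_saved: float, hourly_rate_usd: float,
        amount_spent_usd: float) -> float:
    """roi = (quality_improvement_value + time_saved_value) / amount_spent."""
    if amount_spent_usd <= 0:
        # Zero-spend tasks have no ROI; callers should skip them instead.
        raise ValueError("ROI is undefined when nothing was spent")
    quality_value = bugs_caught * est_bug_cost_usd   # bugs caught x estimated bug cost
    time_value = hours_saved * hourly_rate_usd       # hours saved x hourly rate
    return (quality_value + time_value) / amount_spent_usd
```

A result below 1.0 means the purchase cost more than the value it returned, which is the signal the anti-patterns section relies on.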
|
|
193
|
-
|
|
194
|
-
Record ROI:
|
|
195
|
-
|
|
196
|
-
```bash
|
|
197
|
-
npx clawpowers store set "eco:${TASK_ID}:roi" "4.2"
|
|
198
|
-
npx clawpowers store set "eco:${TASK_ID}:bugs_caught" "3"
|
|
199
|
-
npx clawpowers store set "eco:${TASK_ID}:time_saved_min" "45"
|
|
200
|
-
|
|
201
|
-
npx clawpowers metrics record \
|
|
202
|
-
--skill economic-code-optimization \
|
|
203
|
-
--outcome success \
|
|
204
|
-
--duration 2700 \
|
|
205
|
-
--notes "ROI 4.2x: spent \$0.50, saved ~\$2.10 (3 bugs caught pre-merge, 45 min saved)"
|
|
206
|
-
```
|
|
207
|
-
|
|
208
|
-
### Step 6: Feed Back to Meta-Skill-Evolution
|
|
209
|
-
|
|
210
|
-
After 10+ economic optimization cycles, patterns emerge:
|
|
211
|
-
|
|
212
|
-
- Which spend categories produce the highest ROI?
|
|
213
|
-
- What's the optimal spend ratio for each task complexity level?
|
|
214
|
-
- Are certain tools/services consistently worth their cost?
|
|
215
|
-
- Where is the agent over-spending or under-spending?
|
|
216
|
-
|
|
217
|
-
The `meta-skill-evolution` skill picks up these patterns and adjusts the spend ratios automatically. After roughly 50 cycles, the ratios should be well calibrated: the agent has enough ROI history to tell when premium resources pay for themselves.
|
|
218
|
-
|
|
219
|
-
```bash
|
|
220
|
-
# The meta-skill-evolution cycle reads this data automatically
|
|
221
|
-
npx clawpowers store list "eco:" | head -20
|
|
222
|
-
```
|
|
223
|
-
|
|
224
|
-
## ClawPowers Enhancement
|
|
225
|
-
|
|
226
|
-
This skill requires runtime persistence. Without `~/.clawpowers/`:
|
|
227
|
-
- Spend decisions can't reference historical ROI
|
|
228
|
-
- The agent can't learn which purchases are worthwhile
|
|
229
|
-
- Budget tracking across sessions is impossible
|
|
230
|
-
|
|
231
|
-
With runtime:
|
|
232
|
-
- Full spend history with ROI tracking
|
|
233
|
-
- Automatic spend ratio optimization over time
|
|
234
|
-
- Cross-project spend intelligence (some tools are worth it for all projects)
|
|
235
|
-
- Budget compliance verification
|
|
236
|
-
|
|
237
|
-
## Anti-Patterns
|
|
238
|
-
|
|
239
|
-
1. **Spending on every task.** Most tasks don't need premium resources. The skill should result in $0 spend on 70%+ of tasks.
|
|
240
|
-
|
|
241
|
-
2. **Ignoring ROI data.** If a spend category consistently produces ROI < 1.0, stop spending on it. The data is there - use it.
|
|
242
|
-
|
|
243
|
-
3. **Over-spending on low-value tasks.** A $2 expert review on a README typo fix is waste. The `min_task_value_usd` threshold exists for a reason.
|
|
244
|
-
|
|
245
|
-
4. **Bypassing spending limits.** Never modify `config.json` programmatically to increase limits. Only the owner adjusts caps.
|
|
246
|
-
|
|
247
|
-
5. **Spending without tracking.** Every cent spent must be recorded with task ID, category, amount, and result. Untracked spend is unaccountable spend.
|
|
248
|
-
|
|
249
|
-
6. **Assuming spend equals quality.** Premium resources help, but they don't replace good methodology. Always apply free optimizations (Tier 1) first. Spend only when free options are insufficient.
|
|
250
|
-
|
|
251
|
-
## Dry-Run Mode
|
|
252
|
-
|
|
253
|
-
When `economic_optimization.enabled` is false or the config doesn't exist, the skill runs in dry-run (observation-only) mode:
|
|
254
|
-
|
|
255
|
-
- Calculates what it would spend on each task
|
|
256
|
-
- Logs hypothetical ROI based on task outcomes
|
|
257
|
-
- After 20+ dry-run cycles, the agent can show the owner: "Here's what I would have spent and the estimated ROI"
|
|
258
|
-
- Owner can then enable with confidence, knowing the agent's spending judgment is calibrated
|
|
259
|
-
|
|
260
|
-
## References
|
|
261
|
-
|
|
262
|
-
- `agent-payments` skill: payment execution and wallet interaction
|
|
263
|
-
- `meta-skill-evolution` skill: automatic spend ratio optimization
|
|
264
|
-
- `agentwallet-sdk` v6.0: non-custodial wallets, spending policies, x402 protocol
|
|
265
|
-
- ERC-6551: token-bound accounts with smart-contract spending enforcement
|
|
@@ -1,255 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: executing-plans
|
|
3
|
-
description: Execute an existing plan with progress tracking, interruption recovery, and milestone verification. Activate when you have a written plan and are ready to implement it.
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
requires:
|
|
6
|
-
tools: [bash, git]
|
|
7
|
-
runtime: false
|
|
8
|
-
metrics:
|
|
9
|
-
tracks: [tasks_completed, rework_rate, interruption_recovery_time, milestone_hit_rate]
|
|
10
|
-
improves: [task_sequencing, checkpoint_frequency, verification_rigor]
|
|
11
|
-
---
|
|
12
|
-
|
|
13
|
-
# Executing Plans
|
|
14
|
-
|
|
15
|
-
## When to Use
|
|
16
|
-
|
|
17
|
-
Apply this skill when:
|
|
18
|
-
|
|
19
|
-
- You have a written plan (from `writing-plans` or equivalent) ready to execute
|
|
20
|
-
- Executing a multi-task sequence where progress matters
|
|
21
|
-
- You need to be able to pause and resume without losing context
|
|
22
|
-
- You're executing work that someone else is tracking
|
|
23
|
-
|
|
24
|
-
**Skip when:**
|
|
25
|
-
- You don't have a plan yet (use `writing-plans` first)
|
|
26
|
-
- The task is a single step (just execute it)
|
|
27
|
-
- You're mid-execution and don't need the overhead
|
|
28
|
-
|
|
29
|
-
**Relationship to other skills:**
|
|
30
|
-
```
|
|
31
|
-
writing-plans → executing-plans → verification-before-completion → finishing-a-development-branch
|
|
32
|
-
```
|
|
33
|
-
|
|
34
|
-
## Core Methodology
|
|
35
|
-
|
|
36
|
-
### Pre-Execution Setup
|
|
37
|
-
|
|
38
|
-
Before executing the first task:
|
|
39
|
-
|
|
40
|
-
1. **Read the full plan** — Don't start mid-plan. Read it completely.
|
|
41
|
-
2. **Verify preconditions** — All inputs for Task 1 must exist. If they don't, stop and get them.
|
|
42
|
-
3. **Create execution checkpoint** — Save plan state to resume on interruption.
|
|
43
|
-
4. **Identify parallel tasks** — Group concurrent tasks from the dependency graph.
|
|
44
|
-
|
|
45
|
-
**Checkpoint structure (file-based, no runtime required):**
|
|
46
|
-
```json
|
|
47
|
-
{
|
|
48
|
-
"plan_name": "auth-service",
|
|
49
|
-
"started_at": "2026-03-21T14:00:00Z",
|
|
50
|
-
"tasks": {
|
|
51
|
-
"1": {"status": "pending"},
|
|
52
|
-
"2": {"status": "pending"},
|
|
53
|
-
"3": {"status": "pending"}
|
|
54
|
-
},
|
|
55
|
-
"current_task": null
|
|
56
|
-
}
|
|
57
|
-
```
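One way to create and persist a checkpoint with the structure shown above is sketched here. The helper names and the temp-file-then-rename write are assumptions for illustration; the field names come from the JSON example.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def new_checkpoint(plan_name: str, task_count: int) -> dict:
    """Build a checkpoint with every task pending, matching the structure above."""
    return {
        "plan_name": plan_name,
        "started_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tasks": {str(n): {"status": "pending"} for n in range(1, task_count + 1)},
        "current_task": None,
    }


def save_checkpoint(checkpoint: dict, path: Path) -> None:
    """Write to a temp file, then rename, so an interruption cannot leave a partial file."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(checkpoint, indent=2), encoding="utf-8")
    tmp.replace(path)
```

Writing through a temporary file matters here because the checkpoint is exactly what gets read after a crash.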
|
|
58
|
-
|
|
59
|
-
If runtime is available:
|
|
60
|
-
```bash
|
|
61
|
-
bash runtime/persistence/store.sh set "execution:plan_name" "auth-service"
|
|
62
|
-
bash runtime/persistence/store.sh set "execution:task_1:status" "pending"
|
|
63
|
-
```
|
|
64
|
-
|
|
65
|
-
### Task Execution Loop
|
|
66
|
-
|
|
67
|
-
For each task in the plan (in dependency order):
|
|
68
|
-
|
|
69
|
-
**Step 1: Mark task in progress**
|
|
70
|
-
```bash
|
|
71
|
-
# With runtime
|
|
72
|
-
bash runtime/persistence/store.sh set "execution:task_N:status" "in_progress"
|
|
73
|
-
bash runtime/persistence/store.sh set "execution:task_N:started_at" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
|
74
|
-
```
|
|
75
|
-
|
|
76
|
-
**Step 2: Execute the task**
|
|
77
|
-
- Follow the task spec exactly — scope, deliverables, done criteria
|
|
78
|
-
- Do not expand scope ("while I'm here, I'll also...")
|
|
79
|
-
- Do not shrink scope ("this is probably good enough...")
|
|
80
|
-
|
|
81
|
-
**Step 3: Verify done criteria**
|
|
82
|
-
|
|
83
|
-
Each done criterion must be checked explicitly:
|
|
84
|
-
```
|
|
85
|
-
Task 2 done criteria:
|
|
86
|
-
- [ ] Repository layer exists at src/repos/user_repo.py → CHECK: file exists ✓
|
|
87
|
-
- [ ] All repository tests pass → CHECK: pytest tests/test_user_repo.py → 8 passed ✓
|
|
88
|
-
- [ ] No raw SQL in service layer → CHECK: grep -r "SELECT\|INSERT\|UPDATE" src/services/ → 0 results ✓
|
|
89
|
-
```
|
|
90
|
-
|
|
91
|
-
If any criterion fails: **stop, diagnose, fix, re-verify** — do not proceed to the next task.
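The explicit check-every-criterion rule can be sketched as below. This is an illustrative helper, not part of the package: it assumes each criterion maps to a command that exits 0 on success, so an "expect no matches" check such as the grep above would need to be wrapped or inverted before being passed in.

```python
import subprocess
from typing import Dict, List


def verify_done_criteria(checks: Dict[str, List[str]]) -> Dict[str, bool]:
    """Run each check command and record whether it exited with status 0."""
    results = {}
    for criterion, command in checks.items():
        completed = subprocess.run(command, capture_output=True, text=True)
        results[criterion] = completed.returncode == 0
    return results


def may_proceed(results: Dict[str, bool]) -> bool:
    """Only move to the next task when every criterion passed."""
    return all(results.values())
```

A call might look like `verify_done_criteria({"repository tests pass": ["pytest", "tests/test_user_repo.py"]})`, with `may_proceed` gating the "mark task complete" step.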
|
|
92
|
-
|
|
93
|
-
**Step 4: Mark task complete**
|
|
94
|
-
```bash
|
|
95
|
-
bash runtime/persistence/store.sh set "execution:task_N:status" "complete"
|
|
96
|
-
bash runtime/persistence/store.sh set "execution:task_N:completed_at" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
|
|
97
|
-
```
|
|
98
|
-
|
|
99
|
-
**Step 5: Git commit the task output**
|
|
100
|
-
|
|
101
|
-
Each completed task gets its own commit:
|
|
102
|
-
```bash
|
|
103
|
-
git add [task output files]
|
|
104
|
-
git commit -m "feat(auth): implement UserRepository with connection pooling
|
|
105
|
-
|
|
106
|
-
Task 2/8 of auth-service plan. Completes repository layer.
|
|
107
|
-
All 8 repository tests passing. Zero raw SQL in service layer."
|
|
108
|
-
```
|
|
109
|
-
|
|
110
|
-
This makes the plan's execution history visible in git log and enables rollback to any task boundary.
|
|
111
|
-
|
|
112
|
-
### Handling Parallel Tasks
|
|
113
|
-
|
|
114
|
-
When the plan identifies parallel tasks:
|
|
115
|
-
|
|
116
|
-
1. Dispatch them concurrently (via `dispatching-parallel-agents` when the work goes to agents, otherwise by running the tasks concurrently yourself)
|
|
117
|
-
2. Wait for ALL parallel tasks to complete before proceeding
|
|
118
|
-
3. Verify all parallel task done criteria before moving to dependent tasks
|
|
119
|
-
4. If one parallel task fails, others continue — don't cancel them
|
|
120
|
-
|
|
121
|
-
```
|
|
122
|
-
parallel_group = [Task 1, Task 2] # Both have no dependencies
|
|
123
|
-
|
|
124
|
-
Execute Task 1 and Task 2 concurrently
|
|
125
|
-
Wait for both → verify both → only then execute Task 3
|
|
126
|
-
```
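For tasks that are plain callables rather than agents, the wait-for-all behaviour above could look roughly like this. The function names are assumptions, and a thread pool is just one possible way to run the group concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Tuple


def run_parallel_group(tasks: Dict[str, Callable[[], Any]]) -> Dict[str, Tuple[bool, Any]]:
    """Run every task concurrently; return name -> (succeeded, result_or_exception)."""
    outcomes = {}
    with ThreadPoolExecutor(max_workers=len(tasks) or 1) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
        # Collect every future: a failure in one task never cancels the others.
        for name, future in futures.items():
            try:
                outcomes[name] = (True, future.result())
            except Exception as exc:
                outcomes[name] = (False, exc)
    return outcomes


def all_succeeded(outcomes: Dict[str, Tuple[bool, Any]]) -> bool:
    """Dependent tasks may start only when the whole group succeeded."""
    return all(ok for ok, _ in outcomes.values())
```

Done criteria for each task still need to be verified separately before the dependent task starts; a callable returning without error is not the same as its criteria passing.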
|
|
127
|
-
|
|
128
|
-
### Milestone Verification
|
|
129
|
-
|
|
130
|
-
At natural boundaries (end of a logical phase), run a milestone verification:
|
|
131
|
-
|
|
132
|
-
1. All tasks in the phase are marked complete
|
|
133
|
-
2. All done criteria are checked
|
|
134
|
-
3. Integration between phase tasks is verified (not just individual tasks)
|
|
135
|
-
4. Run any integration tests that cover the phase boundary
|
|
136
|
-
|
|
137
|
-
**Example milestone:** After implementing the repository and service layers:
|
|
138
|
-
```bash
|
|
139
|
-
# Milestone: data layer complete
|
|
140
|
-
pytest tests/ -k "repository or service" # All must pass
|
|
141
|
-
# Verify no circular imports
|
|
142
|
-
python -c "from src.services.user import UserService"
|
|
143
|
-
# Verify interface contracts
|
|
144
|
-
python -m mypy src/repos/ src/services/
|
|
145
|
-
```
|
|
146
|
-
|
|
147
|
-
### Interruption Recovery
|
|
148
|
-
|
|
149
|
-
If execution is interrupted (session ends, error halts, requirement change):
|
|
150
|
-
|
|
151
|
-
**With runtime:**
|
|
152
|
-
```bash
|
|
153
|
-
# On resume
|
|
154
|
-
bash runtime/persistence/store.sh get "execution:plan_name"
|
|
155
|
-
# → auth-service
|
|
156
|
-
|
|
157
|
-
# Find last completed task
|
|
158
|
-
bash runtime/persistence/store.sh list "execution:task_*:status"
|
|
159
|
-
# → task_1: complete, task_2: complete, task_3: in_progress, task_4: pending
|
|
160
|
-
|
|
161
|
-
# Assess task_3: was it actually completed?
|
|
162
|
-
# Check: does the output exist? Do tests pass?
|
|
163
|
-
# If yes → mark complete, continue from task_4
|
|
164
|
-
# If no → re-execute task_3 from scratch
|
|
165
|
-
```
|
|
166
|
-
|
|
167
|
-
**Without runtime:** Check git log for the last committed task, verify its done criteria, continue from the next task.
|
|
168
|
-
|
|
169
|
-
**Key principle:** Never assume a task is complete because it was started. Verify the done criteria on resume.
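The resume logic can be sketched as a lookup over task statuses. The function name and return shape are hypothetical; the status values match the ones used in the checkpoint examples.

```python
from typing import Dict, Optional, Tuple


def resume_point(statuses: Dict[int, str]) -> Tuple[Optional[int], bool]:
    """Return (task_number, needs_verification) for the task to pick up next.

    An in-progress task is never assumed complete: it is returned with
    needs_verification=True so its done criteria get checked first.
    """
    for task in sorted(statuses):
        status = statuses[task]
        if status == "in_progress":
            return task, True
        if status == "pending":
            return task, False
    return None, False  # every task is complete
```

With the statuses from the example above (tasks 1 and 2 complete, 3 in progress, 4 pending), this points at task 3 and flags it for verification rather than skipping ahead to task 4.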
|
|
170
|
-
|
|
171
|
-
### Scope Change During Execution
|
|
172
|
-
|
|
173
|
-
If requirements change mid-execution:
|
|
174
|
-
|
|
175
|
-
1. **Stop** — don't continue executing the current plan
|
|
176
|
-
2. **Assess** — how many tasks are invalidated by the change?
|
|
177
|
-
3. **If < 20% of tasks affected:** modify affected tasks in place, re-verify done criteria
|
|
178
|
-
4. **If 20% or more of tasks affected:** return to `writing-plans`; the plan needs revision
|
|
179
|
-
5. **Document the change** — what changed and why, update the plan document
|
|
180
|
-
|
|
181
|
-
Never silently adjust scope while executing. Make the change explicit.
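The 20% rule can be written down as a tiny helper. This is only a sketch: the function name and return labels are made up for illustration, and treating exactly 20% as "replan" is an assumption on the conservative side.

```python
def scope_change_action(affected_tasks: int, total_tasks: int, threshold: float = 0.20) -> str:
    """Decide between patching tasks in place and going back to planning."""
    if total_tasks <= 0:
        raise ValueError("total_tasks must be positive")
    share = affected_tasks / total_tasks
    # Below the threshold: modify affected tasks and re-verify their done criteria.
    return "modify_in_place" if share < threshold else "return_to_writing_plans"
```

For an 8-task plan, one affected task (12.5%) stays in place, while two (25%) sends the plan back for revision.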
|
|
182
|
-
|
|
183
|
-
### Progress Reporting
|
|
184
|
-
|
|
185
|
-
When asked for progress, report against the plan:
|
|
186
|
-
|
|
187
|
-
```
|
|
188
|
-
Plan: auth-service (8 tasks)
|
|
189
|
-
Progress: 5/8 complete (62.5%)
|
|
190
|
-
|
|
191
|
-
✓ Task 1: Database schema
|
|
192
|
-
✓ Task 2: Repository layer
|
|
193
|
-
✓ Task 3: Service layer
|
|
194
|
-
✓ Task 4: Auth middleware
|
|
195
|
-
✓ Task 5: JWT utilities
|
|
196
|
-
⟳ Task 6: Protected routes (in progress)
|
|
197
|
-
Task 7: Integration tests (pending)
|
|
198
|
-
Task 8: Documentation (pending)
|
|
199
|
-
|
|
200
|
-
Current: Implementing route guards for admin endpoints
|
|
201
|
-
ETA: ~12 min remaining
|
|
202
|
-
Blockers: None
|
|
203
|
-
```
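The headline figure in a report like the one above can be derived from task statuses. A minimal sketch, assuming statuses are stored as in the checkpoint examples; the function name is hypothetical.

```python
from typing import Dict


def progress_line(statuses: Dict[int, str]) -> str:
    """Summarise completed tasks against the plan total."""
    total = len(statuses)
    if total == 0:
        return "Progress: 0/0 complete"
    done = sum(1 for status in statuses.values() if status == "complete")
    return f"Progress: {done}/{total} complete ({100 * done / total:.1f}%)"
```

Deriving the line from stored statuses, rather than typing it by hand, keeps the report consistent with what was actually verified.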
|
|
204
|
-
|
|
205
|
-
## ClawPowers Enhancement
|
|
206
|
-
|
|
207
|
-
When `~/.clawpowers/` runtime is initialized:
|
|
208
|
-
|
|
209
|
-
**Milestone Persistence:** Every task completion and milestone hit is saved to `~/.clawpowers/state/`. If your laptop crashes during Task 6 of 8, you resume from Task 6 after re-verifying its done criteria, not from Task 1.
|
|
210
|
-
|
|
211
|
-
```bash
|
|
212
|
-
# Full execution history on resume
|
|
213
|
-
bash runtime/persistence/store.sh list "execution:*"
|
|
214
|
-
```
|
|
215
|
-
|
|
216
|
-
**Progress Dashboard:** Generate a real-time execution report:
|
|
217
|
-
|
|
218
|
-
```bash
|
|
219
|
-
bash runtime/feedback/analyze.sh --plan auth-service
|
|
220
|
-
# Output:
|
|
221
|
-
# Plan: auth-service
|
|
222
|
-
# Duration so far: 47 min (estimated 60 min total)
|
|
223
|
-
# Tasks: 5/8 complete
|
|
224
|
-
# Velocity: 1 task / 9.4 min (plan estimated: 1/7.5 min)
|
|
225
|
-
# Projected completion: +13 min
|
|
226
|
-
# Warning: Task 3 took 24 min vs estimated 5 min (spec was underspecified)
|
|
227
|
-
```
|
|
228
|
-
|
|
229
|
-
**Interruption Recovery Statistics:** Tracks how often execution is interrupted and how long recovery takes, informing optimal checkpoint frequency.
|
|
230
|
-
|
|
231
|
-
**Rework Tracking:** When a task must be re-executed (done criteria failed), records the cause:
|
|
232
|
-
- Spec was ambiguous (improve `writing-plans`)
|
|
233
|
-
- Dependency missing (improve dependency mapping)
|
|
234
|
-
- Requirement changed (flag as scope change, not rework)
|
|
235
|
-
- Implementation error (improve task verification step)
|
|
236
|
-
|
|
237
|
-
## Anti-Patterns
|
|
238
|
-
|
|
239
|
-
| Anti-Pattern | Why It Fails | Correct Approach |
|
|
240
|
-
|-------------|-------------|-----------------|
|
|
241
|
-
| Skipping done criteria verification | Tasks appear complete but aren't | Verify every criterion explicitly |
|
|
242
|
-
| Expanding task scope mid-execution | Creates unanticipated dependencies | Strict scope adherence; new scope = new task |
|
|
243
|
-
| Proceeding past a failed task | Subsequent tasks build on broken foundation | Stop, fix, re-verify, then continue |
|
|
244
|
-
| Not committing per task | Can't identify which task introduced a bug | Commit every task completion |
|
|
245
|
-
| Ignoring parallel opportunities | Sequential execution of parallel-safe tasks wastes time | Dispatch parallel tasks concurrently |
|
|
246
|
-
| Silently adjusting requirements | Plan/reality divergence | Explicit scope change protocol |
|
|
247
|
-
| Skipping milestone verification | Integration problems discovered late | Verify at every phase boundary |
|
|
248
|
-
|
|
249
|
-
## Integration with Other Skills
|
|
250
|
-
|
|
251
|
-
- Preceded by `writing-plans` (plan must exist before execution)
|
|
252
|
-
- Use `subagent-driven-development` for parallel task dispatch
|
|
253
|
-
- Use `using-git-worktrees` for concurrent task isolation
|
|
254
|
-
- Followed by `verification-before-completion` before merging
|
|
255
|
-
- Use `systematic-debugging` when a task fails unexpectedly
|