clawpowers 1.1.0 → 1.1.1

package/README.md CHANGED
@@ -27,7 +27,7 @@ ClawPowers gives your coding agent superpowers that go beyond instructions. Whil
27
27
  | Windows native support | ✅ | ❌ |
28
28
  | Zero dependencies | ✅ | ✅ |
29
29
 
30
- **24 skills.** 14 cover everything static frameworks do (TDD, subagent dev, debugging, planning, code review, git worktrees). 6 go where they can't — payments, security, content, prospecting, market intelligence, and metacognitive learning.
30
+ **25 skills.** 14 cover everything static frameworks do (TDD, subagent dev, debugging, planning, code review, git worktrees). 6 go where they can't: payments, security, content, prospecting, market intelligence, and metacognitive learning. 5 are things no other framework even attempts: self-healing code, agents that rewrite their own methodology, cross-project knowledge transfer, property-based formal verification, and ROI-tracked spending on premium resources.
31
31
 
32
32
  ## Requirements
33
33
 
@@ -214,11 +214,23 @@ Static frameworks stop at coding methodology. ClawPowers includes skills for:
214
214
  | `market-intelligence` | Competitive analysis, trend detection, opportunity scoring | Requires web access, data aggregation, persistent tracking |
215
215
  | `prospecting` | Lead generation, contact enrichment, CRM sync | Requires API calls (Exa, Apollo), structured output |
216
216
 
217
+ ### RSI Intelligence Layer (5 skills)
218
+
219
+ These skills don't exist in any other framework. They require runtime execution, persistent state, and self-modification capabilities that static prompt collections can never deliver.
220
+
221
+ | Skill | What It Does | Why This Changes Everything |
222
+ |-------|-------------|----------------------------|
223
+ | `meta-skill-evolution` | Every 50 tasks, analyzes outcome patterns, identifies the weakest skill, surgically rewrites its methodology, version bumps | Your agent's coding discipline improves autonomously over time. After 30 days it's measurably better than any static install |
224
+ | `self-healing-code` | On test failure: captures error → builds hypothesis tree → generates 2+ patches → applies with coverage guard → auto-commits winner | 3-cycle max with rollback. Turns red tests into green tests without human intervention |
225
+ | `cross-project-knowledge` | Persistent pattern library across ALL repos. Bug fixes, architecture decisions, and performance optimizations transfer between projects | Agent working on Project B benefits from everything learned on Projects A, C, D. Knowledge compounds |
226
+ | `formal-verification-lite` | Property-based testing with fast-check (JS), Hypothesis (Python), QuickCheck (Haskell). 5 property templates, 1000+ examples per property | Goes beyond "tests pass" to "tests actually prove correctness." Catches edge cases unit tests miss |
227
+ | `economic-code-optimization` | Autonomously spends micro-budgets on premium models, cloud GPUs, expert reviews when ROI justifies it. Tracks every cent and learns optimal spend ratios | Agents literally invest in their own performance. Spending efficiency improves over time via RSI feedback loop |
228
+
217
229
  ## Architecture
218
230
 
219
231
  ```
220
232
  clawpowers/
221
- ├── skills/ # 20 skill directories, each with SKILL.md
233
+ ├── skills/ # 25 skill directories, each with SKILL.md
222
234
  ├── runtime/
223
235
  │ ├── persistence/ # Cross-session state (store.js + store.sh)
224
236
  │ ├── metrics/ # Outcome tracking (collector.js + collector.sh)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "clawpowers",
3
- "version": "1.1.0",
3
+ "version": "1.1.1",
4
4
  "description": "The skills framework that actually does something — runtime execution, persistent memory, self-improvement, and autonomous payments for coding agents.",
5
5
  "license": "MIT",
6
6
  "author": "AI Agent Economy <https://github.com/up2itnow0822>",
@@ -0,0 +1,265 @@
1
+ ---
2
+ name: economic-code-optimization
3
+ description: Autonomously allocate micro-budgets to improve code quality - premium models, cloud compute, expert reviews, specialized tools. ROI-tracked spending that compounds agent performance over time.
4
+ version: 1.0.0
5
+ requires:
6
+ tools: []
7
+ runtime: true
8
+ skills: [agent-payments]
9
+ metrics:
10
+ tracks: [spend_decisions, amount_spent, roi_ratio, quality_delta, upgrades_purchased]
11
+ improves: [spend_efficiency, roi_accuracy, tier_selection, budget_utilization]
12
+ ---
13
+
14
+ # Economic Code Optimization
15
+
16
+ ## When to Use
17
+
18
+ Apply this skill when:
19
+
20
+ - A task is complex enough (estimated 2+ hours) that premium resources would meaningfully improve quality or speed
21
+ - Confidence is low on a critical PR and a paid expert review would catch issues you'd miss
22
+ - Mutation testing or formal verification needs more compute than the local machine provides
23
+ - An external API or tool (static analyzer, security scanner, profiler) would produce better results than manual inspection
24
+ - The task has high business value and the cost of a bug outweighs the cost of premium resources
25
+
26
+ Do NOT use when:
27
+ - The task is routine (simple CRUD, config changes, docs updates)
28
+ - The owner hasn't configured spending limits in `~/.clawpowers/config.json`
29
+ - The wallet balance is below the minimum threshold
30
+ - The task value doesn't justify any spend (ROI would be negative)
31
+
32
+ ## Background: Why Agents Should Spend Money on Code
33
+
34
+ Static frameworks assume all resources are equal. They're not.
35
+
36
+ A $0.003 API call to a premium static analyzer catches bugs that 200 lines of manual review miss. A $0.10 GPU burst runs 10,000 mutation tests in 30 seconds instead of 20 minutes locally. A $2.00 expert review on a security-critical PR prevents a $200,000 vulnerability.
37
+
38
+ The question isn't whether to spend. It's how much, on what, and whether the ROI justifies it. This skill makes that decision automatically, within hard limits the owner controls.
39
+
40
+ ## Setup
41
+
42
+ ### Configure Spending Limits
43
+
44
+ Create or update `~/.clawpowers/config.json`:
45
+
46
+ ```json
47
+ {
48
+ "economic_optimization": {
49
+ "enabled": true,
50
+ "global_daily_limit_usd": 5.00,
51
+ "per_task_max_usd": 1.00,
52
+ "min_task_value_usd": 50.00,
53
+ "auto_approve_below_usd": 0.10,
54
+ "require_owner_approval_above_usd": 2.00,
55
+ "allowed_categories": ["premium_model", "compute", "static_analysis", "security_scan", "expert_review"]
56
+ }
57
+ }
58
+ ```
59
+
60
+ If no config exists, the skill operates in dry-run mode - it calculates what it would spend and logs the decision, but doesn't execute any payments.
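The fallback described above can be sketched as follows. This is a minimal illustration, not the skill's actual loader: the function name `load_mode` and the choice to treat an unreadable or malformed file as dry-run are assumptions. Only the `economic_optimization.enabled` key and the `~/.clawpowers/config.json` path come from the text.

```python
import json
from pathlib import Path


def load_mode(config_path: Path) -> tuple[str, dict]:
    """Return ("live" or "dry-run", settings) for the economic_optimization block.

    Hypothetical helper: a missing or malformed config is treated as dry-run.
    """
    try:
        config = json.loads(config_path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return "dry-run", {}
    settings = config.get("economic_optimization", {}) if isinstance(config, dict) else {}
    # Only an explicit `true` enables real payments; anything else stays dry-run.
    mode = "live" if settings.get("enabled") is True else "dry-run"
    return mode, settings


mode, settings = load_mode(Path.home() / ".clawpowers" / "config.json")
print(mode)
```

Requiring an explicit `true` keeps the default on the safe side: a typo in the config results in logged decisions rather than payments.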
61
+
62
+ ### Verify Prerequisites
63
+
64
+ ```bash
65
+ # Check that agent-payments skill is available
66
+ npx clawpowers store get "skill:agent-payments:configured"
67
+
68
+ # Check wallet balance (if configured)
69
+ npx clawpowers store get "wallet:balance:usd"
70
+ ```
71
+
72
+ ## Core Methodology
73
+
74
+ ### Step 1: Task Value Assessment
75
+
76
+ Before any spend decision, estimate the task's business value. Be conservative.
77
+
78
+ **Value heuristics:**
79
+
80
+ | Task Type | Estimated Value | Rationale |
81
+ |-----------|----------------|-----------|
82
+ | Security-critical fix | $500-5,000 | Vulnerability cost if shipped |
83
+ | Core business logic | $200-1,000 | Revenue impact of bugs |
84
+ | Public API change | $100-500 | Breaking changes affect users |
85
+ | Performance optimization | $50-200 | Compute savings over time |
86
+ | Internal tooling | $20-100 | Developer time savings |
87
+ | Docs/config changes | $5-20 | Low risk, low impact |
88
+
89
+ Record the assessment:
90
+
91
+ ```bash
92
+ npx clawpowers store set "eco:${TASK_ID}:estimated_value" "500"
93
+ npx clawpowers store set "eco:${TASK_ID}:task_type" "security-critical"
94
+ ```
95
+
96
+ ### Step 2: Spend Tier Calculation
97
+
98
+ Calculate the optimal spend as a percentage of task value:
99
+
100
+ ```
101
+ spend_budget = min(task_value * spend_ratio, per_task_max)
102
+
103
+ Spend ratios by complexity score (1-10 scale):
104
+ Simple (1-3): 0% - no spend needed
105
+ Medium (4-6): 0.5% of task value
106
+ Complex (7-8): 2% of task value
107
+ Critical (9-10): 5% of task value
108
+ ```
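As a worked version of the formula above, here is a small sketch. The ratios and score bands are taken from the block; the function name, the rounding to whole cents, and the error on out-of-range scores are assumptions made for illustration.

```python
# Ratios copied from the block above; score bands are inclusive.
SPEND_RATIOS = [
    (range(1, 4), 0.0),     # Simple (1-3): no spend
    (range(4, 7), 0.005),   # Medium (4-6): 0.5% of task value
    (range(7, 9), 0.02),    # Complex (7-8): 2% of task value
    (range(9, 11), 0.05),   # Critical (9-10): 5% of task value
]


def spend_budget(task_value_usd: float, complexity: int, per_task_max_usd: float) -> float:
    """spend_budget = min(task_value * spend_ratio, per_task_max), rounded to cents."""
    for band, ratio in SPEND_RATIOS:
        if complexity in band:
            return round(min(task_value_usd * ratio, per_task_max_usd), 2)
    raise ValueError(f"complexity must be an integer from 1 to 10, got {complexity!r}")


print(spend_budget(100, 5, 1.00))  # 0.5  (0.5% of $100)
print(spend_budget(500, 9, 1.00))  # 1.0  (5% of $500 is $25, capped at per_task_max)
```

The second call shows the cap doing the work: for high-value tasks the per-task maximum, not the ratio, determines the budget.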
109
+
110
+ **Decision tree:**
111
+
112
+ 1. Is `economic_optimization.enabled` true? If no, stop.
113
+ 2. Is `estimated_value >= min_task_value_usd`? If no, stop - task too small to justify spend.
114
+ 3. Calculate `spend_budget` using complexity ratio.
115
+ 4. Is `spend_budget < auto_approve_below_usd`? If yes, proceed automatically.
116
+ 5. Is `spend_budget > require_owner_approval_above_usd`? If yes, queue for approval and continue with base resources.
117
+ 6. Otherwise, proceed with spend.
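The six steps above can be sketched as a single function. The config keys match the JSON in the Setup section; the function name and the returned labels (`"stop"`, `"auto"`, `"queue_for_approval"`, `"proceed"`) are hypothetical, and step 3 is assumed to have been computed by the caller.

```python
def spend_decision(cfg: dict, estimated_value: float, spend_budget: float) -> str:
    """Walk the decision tree; return labels are illustrative, not the skill's own."""
    if not cfg.get("enabled", False):
        return "stop"                      # step 1: feature disabled
    if estimated_value < cfg["min_task_value_usd"]:
        return "stop"                      # step 2: task too small to justify spend
    # step 3: spend_budget is assumed to be precomputed from the complexity ratio
    if spend_budget < cfg["auto_approve_below_usd"]:
        return "auto"                      # step 4: proceed automatically
    if spend_budget > cfg["require_owner_approval_above_usd"]:
        return "queue_for_approval"        # step 5: continue with base resources meanwhile
    return "proceed"                       # step 6: between the two thresholds


sample_cfg = {
    "enabled": True,
    "min_task_value_usd": 50.00,
    "auto_approve_below_usd": 0.10,
    "require_owner_approval_above_usd": 2.00,
}
print(spend_decision(sample_cfg, 500, 0.05))  # auto
print(spend_decision(sample_cfg, 500, 0.50))  # proceed
print(spend_decision(sample_cfg, 20, 0.50))   # stop
```

With the sample configuration, `per_task_max_usd` (1.00) is below `require_owner_approval_above_usd` (2.00), so a budget capped as in the Step 2 formula would never reach the approval branch. The diff does not say whether that is intended.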
118
+
119
+ Record the decision:
120
+
121
+ ```bash
122
+ npx clawpowers store set "eco:${TASK_ID}:spend_tier" "medium"
123
+ npx clawpowers store set "eco:${TASK_ID}:budget_usd" "0.50"
124
+ npx clawpowers store set "eco:${TASK_ID}:approved" "auto"
125
+ ```
126
+
127
+ ### Step 3: Resource Allocation
128
+
129
+ Based on the budget, select upgrades from this priority list:
130
+
131
+ **Tier 1: Free optimizations (always apply)**
132
+ - Use cached results from `cross-project-knowledge` pattern library
133
+ - Apply known-good patterns from previous projects
134
+ - Run local linters and type checkers
135
+
136
+ **Tier 2: Micro-spend ($0.01-0.10)**
137
+ - Premium static analysis API call (Semgrep Pro, SonarCloud)
138
+ - Extended mutation testing run (2x-5x normal iterations)
139
+ - Dependency vulnerability deep scan
140
+
141
+ **Tier 3: Small spend ($0.10-1.00)**
142
+ - Cloud GPU burst for heavy formal verification (1000+ property tests)
143
+ - Premium model API call for complex code review (Claude Opus, GPT-4o)
144
+ - Performance profiling service for optimization tasks
145
+
146
+ **Tier 4: Significant spend ($1.00-5.00)**
147
+ - Paid expert review routing for security-critical PRs
148
+ - Multi-model consensus (run 3 premium models, majority vote on approach)
149
+ - Extended cloud compute for exhaustive test generation
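One way to map a budget onto the tiers above is to pick the highest tier whose lower price bound the budget reaches. This is a sketch under assumptions: the tier headings share boundary values ($0.10 and $1.00), and treating each lower bound as inclusive is a choice made here, not something the skill specifies. The function name is hypothetical.

```python
# (tier, lower bound in USD) from the tier headings, highest tier first.
TIER_FLOORS_USD = [(4, 1.00), (3, 0.10), (2, 0.01), (1, 0.00)]


def highest_tier(budget_usd: float) -> int:
    """Return the highest tier whose lower bound is covered by the budget."""
    if budget_usd < 0:
        raise ValueError("budget cannot be negative")
    for tier, floor in TIER_FLOORS_USD:
        if budget_usd >= floor:
            return tier
    raise AssertionError("unreachable: tier 1 has a floor of 0")


for budget in (0.0, 0.05, 0.10, 0.50, 1.00):
    print(budget, "->", highest_tier(budget))
```

Tier 1 is always returned for a zero budget, which matches the rule that free optimizations always apply.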
150
+
151
+ Record allocations:
152
+
153
+ ```bash
154
+ npx clawpowers store set "eco:${TASK_ID}:allocations" "premium_model,mutation_testing_5x"
155
+ ```
156
+
157
+ ### Step 4: Execute Spend
158
+
159
+ Use the `agent-payments` skill for all financial transactions. Never bypass spending limits.
160
+
161
+ ```bash
162
+ # Record the spend decision
163
+ npx clawpowers metrics record \
164
+ --skill economic-code-optimization \
165
+ --outcome success \
166
+ --duration 0 \
167
+ --notes "Allocated \$0.50: premium model (\$0.30) + extended mutation testing (\$0.20) for task ${TASK_ID}"
168
+ ```
169
+
170
+ For each purchased resource, track what was bought and the immediate result:
171
+
172
+ ```bash
173
+ npx clawpowers store set "eco:${TASK_ID}:purchase:premium_model:cost" "0.30"
174
+ npx clawpowers store set "eco:${TASK_ID}:purchase:premium_model:result" "3 additional edge cases identified"
175
+ npx clawpowers store set "eco:${TASK_ID}:purchase:mutation_5x:cost" "0.20"
176
+ npx clawpowers store set "eco:${TASK_ID}:purchase:mutation_5x:result" "mutation score 72% -> 91%"
177
+ ```
178
+
179
+ ### Step 5: Measure ROI
180
+
181
+ After the task completes, calculate actual return on investment:
182
+
183
+ ```
184
+ roi = (quality_improvement_value + time_saved_value) / amount_spent
185
+
186
+ Quality improvement value:
187
+ - Bugs caught before merge × estimated bug cost = quality value
188
+ - Mutation score improvement × coverage confidence factor
189
+
190
+ Time saved value:
191
+ - Hours saved by premium resources × hourly rate
192
+ ```
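The ROI formula above can be written out as below. This is an illustrative sketch: the parameter names are assumptions, the figures in the example are invented, and the "mutation score improvement × coverage confidence factor" term is left out because the skill does not define that factor.

```python
def roi(bugs_caught: int, est_bug_cost_usd: float,
        hours_saved: float, hourly_rate_usd: float,
        amount_spent_usd: float) -> float:
    """roi = (quality_improvement_value + time_saved_value) / amount_spent."""
    if amount_spent_usd <= 0:
        raise ValueError("ROI is undefined when nothing was spent")
    quality_value = bugs_caught * est_bug_cost_usd   # bugs caught before merge x bug cost
    time_value = hours_saved * hourly_rate_usd       # hours saved x hourly rate
    return (quality_value + time_value) / amount_spent_usd


# Hypothetical figures: 2 bugs at $10 each, half an hour at $40/h, $0.50 spent.
print(roi(2, 10.0, 0.5, 40.0, 0.50))  # 80.0
```

A result below 1.0 means the purchase returned less than it cost.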
193
+
194
+ Record ROI:
195
+
196
+ ```bash
197
+ npx clawpowers store set "eco:${TASK_ID}:roi" "4.2"
198
+ npx clawpowers store set "eco:${TASK_ID}:bugs_caught" "3"
199
+ npx clawpowers store set "eco:${TASK_ID}:time_saved_min" "45"
200
+
201
+ npx clawpowers metrics record \
202
+ --skill economic-code-optimization \
203
+ --outcome success \
204
+ --duration 2700 \
205
+ --notes 'ROI 4.2x: spent $0.50, saved ~$2.10 (3 bugs caught pre-merge, 45 min saved)'
206
+ ```
207
+
208
+ ### Step 6: Feed Back to Meta-Skill-Evolution
209
+
210
+ After 10+ economic optimization cycles, patterns emerge:
211
+
212
+ - Which spend categories produce the highest ROI?
213
+ - What's the optimal spend ratio for each task complexity level?
214
+ - Are certain tools/services consistently worth their cost?
215
+ - Where is the agent over-spending or under-spending?
216
+
217
+ The `meta-skill-evolution` skill picks up these patterns and adjusts the spend ratios automatically. After 50 cycles, the agent's spending becomes highly efficient - it knows exactly when premium resources pay for themselves.
218
+
219
+ ```bash
220
+ # The meta-skill-evolution cycle reads this data automatically
221
+ npx clawpowers store list "eco:" | head -20
222
+ ```
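The first question in the list above (which categories produce the highest ROI) amounts to a per-category aggregation. The sketch below assumes each purchase is available as a `(category, cost_usd, value_returned_usd)` tuple. That record shape is hypothetical, since the store keys shown in this skill record a cost and a free-text result rather than a dollar value. The 1.0 cutoff follows the "ROI < 1.0" rule in the Anti-Patterns section.

```python
from collections import defaultdict


def roi_by_category(purchases):
    """Aggregate ROI per category from (category, cost_usd, value_returned_usd) tuples."""
    cost = defaultdict(float)
    value = defaultdict(float)
    for category, cost_usd, value_usd in purchases:
        cost[category] += cost_usd
        value[category] += value_usd
    # Categories with zero total cost are skipped: their ROI is undefined.
    return {cat: value[cat] / cost[cat] for cat in cost if cost[cat] > 0}


def categories_to_drop(purchases, threshold=1.0):
    """Categories whose aggregate ROI falls below the threshold."""
    return sorted(cat for cat, r in roi_by_category(purchases).items() if r < threshold)


# Invented history for illustration only.
history = [
    ("premium_model", 0.25, 1.50),
    ("premium_model", 0.25, 0.50),
    ("security_scan", 0.50, 0.25),
]
print(roi_by_category(history))     # {'premium_model': 4.0, 'security_scan': 0.5}
print(categories_to_drop(history))  # ['security_scan']
```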
223
+
224
+ ## ClawPowers Enhancement
225
+
226
+ This skill requires runtime persistence. Without `~/.clawpowers/`:
227
+ - Spend decisions can't reference historical ROI
228
+ - The agent can't learn which purchases are worthwhile
229
+ - Budget tracking across sessions is impossible
230
+
231
+ With runtime:
232
+ - Full spend history with ROI tracking
233
+ - Automatic spend ratio optimization over time
234
+ - Cross-project spend intelligence (some tools are worth it for all projects)
235
+ - Budget compliance verification
236
+
237
+ ## Anti-Patterns
238
+
239
+ 1. **Spending on every task.** Most tasks don't need premium resources. The skill should result in $0 spend on 70%+ of tasks.
240
+
241
+ 2. **Ignoring ROI data.** If a spend category consistently produces ROI < 1.0, stop spending on it. The data is there - use it.
242
+
243
+ 3. **Over-spending on low-value tasks.** A $2 expert review on a README typo fix is waste. The `min_task_value_usd` threshold exists for a reason.
244
+
245
+ 4. **Bypassing spending limits.** Never modify `config.json` programmatically to increase limits. Only the owner adjusts caps.
246
+
247
+ 5. **Spending without tracking.** Every cent spent must be recorded with task ID, category, amount, and result. Untracked spend is unaccountable spend.
248
+
249
+ 6. **Assuming spend equals quality.** Premium resources help, but they don't replace good methodology. Always apply free optimizations (Tier 1) first. Spend only when free options are insufficient.
250
+
251
+ ## Dry-Run Mode
252
+
253
+ When `economic_optimization.enabled` is false or the config doesn't exist, the skill runs in dry-run mode, observing without paying:
254
+
255
+ - Calculates what it would spend on each task
256
+ - Logs hypothetical ROI based on task outcomes
257
+ - After 20+ dry-run cycles, the agent can show the owner: "Here's what I would have spent and the estimated ROI"
258
+ - Owner can then enable with confidence, knowing the agent's spending judgment is calibrated
259
+
260
+ ## References
261
+
262
+ - `agent-payments` skill: payment execution and wallet interaction
263
+ - `meta-skill-evolution` skill: automatic spend ratio optimization
264
+ - `agentwallet-sdk` v6.0: non-custodial wallets, spending policies, x402 protocol
265
+ - ERC-6551: token-bound accounts with smart-contract spending enforcement
@@ -64,6 +64,8 @@ Skills activate automatically when you recognize a matching task pattern. You do
64
64
  | After fixing a bug or architecture decision; want to store the pattern | `cross-project-knowledge` |
65
65
  | TDD GREEN phase complete; want invariant property tests | `formal-verification-lite` |
66
66
  | Need roundtrip/idempotence/commutativity tests for a pure function | `formal-verification-lite` |
67
+ | Complex task where premium resources would improve quality | `economic-code-optimization` |
68
+ | Deciding whether to pay for expert review or premium model | `economic-code-optimization` |
67
69
 
68
70
  ## Reading a Skill
69
71
 
@@ -143,7 +145,8 @@ You never need to check the mode. Skills detect it themselves and adapt their in
143
145
  22. `self-healing-code` — Test failure → hypothesis tree → ≥2 candidate patches → auto-commit winner or escalate
144
146
  23. `cross-project-knowledge` — Persistent pattern KB across all projects; search before tasks, store after fixes
145
147
  24. `formal-verification-lite` — Property-based testing (fast-check/Hypothesis) after TDD GREEN; 1000+ iterations per invariant
148
+ 25. `economic-code-optimization` — Autonomously spend micro-budgets on premium models, compute, expert reviews when ROI justifies it
146
149
 
147
150
  ## Session Initialization Complete
148
151
 
149
- ClawPowers is ready. 24 skills active. Skills activate on pattern recognition. Runtime enhancements available when `~/.clawpowers/` exists. RSI Intelligence Layer (meta-skill-evolution, self-healing-code, cross-project-knowledge, formal-verification-lite) provides persistent learning across sessions and projects.
152
+ ClawPowers is ready. 25 skills active. Skills activate on pattern recognition. Runtime enhancements available when `~/.clawpowers/` exists. RSI Intelligence Layer (meta-skill-evolution, self-healing-code, cross-project-knowledge, formal-verification-lite, economic-code-optimization) provides persistent learning across sessions and projects.