@deeflectcom/smart-spawn 1.0.0 → 1.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +352 -0
- package/package.json +3 -2
package/README.md
ADDED
@@ -0,0 +1,352 @@

# Smart Spawn

Intelligent model routing for [OpenClaw](https://github.com/openclaw/openclaw). Automatically picks the best AI model for any task based on real benchmark data from 5 sources.

Instead of hardcoding models or guessing, Smart Spawn analyzes what you're doing and routes to the optimal model for the job — factoring in task type, budget, benchmarks, speed, and your own feedback history.

## Quick Start (OpenClaw Plugin)

You don't need to host anything. The public API runs at `ss.deeflect.com`.

**Install the plugin:**

```bash
openclaw plugins install @deeflectcom/smart-spawn
openclaw gateway restart
```

**Use it in conversation:**

> "Research the latest developments in WebGPU"
>
> Smart Spawn picks Gemini 2.5 Flash (fast, free, great context) and spawns a research sub-agent on it.

> "Build me a React dashboard with auth"
>
> Smart Spawn picks the best coding model in your budget tier and spawns a coder sub-agent.

**Plugin config** (optional — add to your OpenClaw config under `extensions.smart-spawn`):

```json
{
  "apiUrl": "https://ss.deeflect.com",
  "defaultBudget": "medium",
  "defaultMode": "single"
}
```

| Setting | Default | Options |
|---------|---------|---------|
| `apiUrl` | `https://ss.deeflect.com` | Your own API URL if self-hosting |
| `defaultBudget` | `medium` | `low`, `medium`, `high`, `any` |
| `defaultMode` | `single` | `single`, `collective`, `cascade`, `swarm` |

### Spawn Modes

- **Single** — Pick one best model, spawn one agent
- **Collective** — Pick N diverse models, spawn parallel agents, merge results
- **Cascade** — Start cheap, escalate to premium if quality is insufficient
- **Swarm** — Decompose complex tasks into a DAG of sub-tasks with an optimal model per step

---

## How It Works

```
┌──────────────────────────────────────────────────────┐
│                   Data Sources (5)                   │
│                                                      │
│ OpenRouter ─── model catalog, pricing, capabilities  │
│ Artificial Analysis ─── intelligence/coding/math idx │
│ HuggingFace Open LLM Leaderboard ─── MMLU, BBH, etc. │
│ LMArena (Chatbot Arena) ─── ELO from human prefs     │
│ LiveBench ─── contamination-free coding/reasoning    │
└───────────────────────┬──────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────┐
│                 Enrichment Pipeline                  │
│                                                      │
│ 1. Pull raw data from all 5 sources                  │
│ 2. Alias matching (map model names across sources)   │
│ 3. Z-score normalization per benchmark               │
│ 4. Category scoring (coding/reasoning/creative/...)  │
│ 5. Cost-efficiency calculation                       │
│ 6. Tier + capability classification                  │
│ 7. Blend: benchmarks + personal + community scores   │
│                                                      │
│ Refreshes every 6 hours automatically                │
└───────────────────────┬──────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────┐
│         SQLite Cache → API → Plugin → Agent          │
└──────────────────────────────────────────────────────┘
```
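Step 2 is the glue: each source spells model names differently, so scores can only be merged once every name resolves to a single canonical ID. A minimal TypeScript sketch of the idea (the entries and the helper are invented examples; the real table lives in `src/enrichment/alias-map.ts`):

```typescript
// Sketch of cross-source alias matching. The entries below are invented
// examples; the real mapping lives in src/enrichment/alias-map.ts.
const ALIASES: Record<string, string> = {
  "claude opus 4.6": "anthropic/claude-opus-4.6",
  "gpt 5.2": "openai/gpt-5.2",
};

// Resolve a source-specific spelling to a canonical OpenRouter-style ID.
function canonicalId(sourceName: string): string | undefined {
  // Collapse separators and case so "Claude-Opus-4.6" and
  // "claude opus 4.6" hit the same key.
  const key = sourceName.trim().toLowerCase().replace(/[_\-\s]+/g, " ");
  return ALIASES[key];
}

canonicalId("Claude-Opus-4.6"); // → "anthropic/claude-opus-4.6"
```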

### Scoring System

**Z-score normalization** — Each benchmark source uses a different scale. An "intelligence index" of 65 from Artificial Analysis means something completely different from an Arena ELO of 1350. We normalize everything:

1. Compute the mean and standard deviation of each benchmark across all models
2. Convert to z-scores: `(value - mean) / stddev`
3. Map to a 0-100 scale: z=-2.5→0, z=0→50, z=+1→70, z=+2→90

This means a model that's 2σ above average on LiveCodeBench gets the same score as one that's 2σ above average on Arena ELO — both are "equally exceptional" on their own metric.
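The anchor points above are collinear: the map is linear at 20 points per standard deviation, centered at 50, and clamped to [0, 100]. A minimal TypeScript sketch (the function name is ours, not from the codebase):

```typescript
// Normalize raw benchmark values onto a shared 0-100 scale via z-scores.
// The anchor points (z=-2.5→0, z=0→50, z=+1→70, z=+2→90) imply a linear
// map of 20 points per standard deviation, clamped to [0, 100].
function normalize(values: number[]): number[] {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance =
    values.reduce((a, b) => a + (b - mean) ** 2, 0) / values.length;
  const stddev = Math.sqrt(variance) || 1; // guard against zero spread
  return values.map((v) => {
    const z = (v - mean) / stddev;
    return Math.min(100, Math.max(0, 50 + 20 * z));
  });
}

// Example: Arena ELOs and 0-100 indexes land on the same scale,
// so a 2σ outlier scores 90 on either metric.
normalize([1350, 1280, 1310]); // comparable to normalize([65, 48, 57])
```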

**Category scores** — Models are scored per category (coding, reasoning, creative, vision, research, fast-cheap, general) using weighted combinations of the relevant benchmarks:

| Category | Key Benchmarks |
|----------|----------------|
| Coding | LiveCodeBench, Agentic Coding, Coding Index |
| Reasoning | GPQA, Arena ELO, MATH-500, BBH |
| Creative | Arena ELO (human preference), LiveBench Language |
| Vision | Intelligence Index (vision-capable models) |
| Research | Arena ELO, context length bonus |
| Fast-cheap | Speed (tokens/sec), low pricing |

**Score blending** — The final score is a weighted mix of the following (sketched right after this list):
- Benchmark score (primary)
- Personal feedback (your own ratings from past spawns)
- Community scores (anonymous aggregated ratings from other instances)
- Context boost (task-specific signals like "needs vision" or "long context")
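A minimal sketch of that blend, with made-up weights and field names (the actual blending logic lives in `src/model-selection.ts` and is not specified in this README):

```typescript
// Illustrative blend of the four signals above. Weights and field names
// are assumptions for this sketch, not the implementation's values.
interface ScoreInputs {
  benchmark: number;     // 0-100, from the enrichment pipeline
  personal?: number;     // 0-100, from your own outcome ratings
  community?: number;    // 0-100, aggregated anonymous ratings
  contextBoost?: number; // additive bonus, e.g. for "needs vision"
}

function blend(s: ScoreInputs): number {
  // Benchmarks dominate; feedback signals nudge the result.
  let score = 0.7 * s.benchmark;
  score += 0.2 * (s.personal ?? s.benchmark);  // fall back when no history
  score += 0.1 * (s.community ?? s.benchmark);
  score += s.contextBoost ?? 0;
  return Math.min(100, score);
}
```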

### Budget Tiers

| Budget | Price Range (per 1M input tokens) | Examples |
|--------|-----------------------------------|----------|
| `low` | $0 – $1 | DeepSeek, Kimi K2.5, Gemini Flash |
| `medium` | $0 – $5 | Claude Sonnet, GPT-4o, Gemini Pro |
| `high` | $2 – $20 | Claude Opus, GPT-5, o3 |
| `any` | No limit | Best available regardless of cost |

### Model Classification

Every model is automatically classified with:
- **Tier**: premium / standard / budget (based on provider + pricing)
- **Categories**: which task types it's good at (derived from benchmarks + capabilities)
- **Tags**: specific traits like "fast", "vision", "reasoning", "large-context"
- **Cost efficiency**: quality-per-dollar ratio per category

---

## API Reference

Base URL: `https://ss.deeflect.com`

### GET /pick

Pick the single best model for a task.

```bash
curl "https://ss.deeflect.com/pick?task=build+a+react+app&budget=medium"
```

| Param | Required | Description |
|-------|----------|-------------|
| `task` | Yes | Task description or category name |
| `budget` | No | `low`, `medium`, `high`, `any` (default: `medium`) |
| `exclude` | No | Comma-separated model IDs to skip |
| `context` | No | Context tags (e.g. `vision,long-context`) |

Example response:

```json
{
  "data": {
    "id": "anthropic/claude-opus-4.6",
    "name": "Claude Opus 4.6",
    "score": 86,
    "pricing": { "prompt": 5, "completion": 25 },
    "budget": "medium",
    "reason": "Best general model at medium budget ($0-5/M) — score: 86"
  }
}
```
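From code instead of curl, the same call is a single fetch. A minimal TypeScript sketch, reusing the documented params and response shape:

```typescript
// Call GET /pick and read the documented response shape.
// Params mirror the table above.
const params = new URLSearchParams({
  task: "build a react app",
  budget: "medium",
});

const res = await fetch(`https://ss.deeflect.com/pick?${params}`);
if (!res.ok) throw new Error(`pick failed: ${res.status}`);

const { data } = await res.json();
console.log(data.id, data.score, data.reason);
// → "anthropic/claude-opus-4.6" 86 "Best general model at medium budget ..."
```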

### GET /recommend

Get multiple model recommendations with provider diversity.

```bash
curl "https://ss.deeflect.com/recommend?task=coding&budget=low&count=3"
```

| Param | Required | Description |
|-------|----------|-------------|
| `task` or `category` | Yes | Task description or category name |
| `budget` | No | Budget tier (default: `medium`) |
| `count` | No | Number of recommendations, 1-5 (default: `1`) |
| `exclude` | No | Comma-separated model IDs to skip |
| `require` | No | Required capabilities: `vision`, `functionCalling`, `json`, `reasoning` |
| `minContext` | No | Minimum context window length |
| `context` | No | Context tags for routing boost |

### GET /compare

Side-by-side model comparison.

```bash
curl "https://ss.deeflect.com/compare?models=anthropic/claude-opus-4.6,openai/gpt-5.2"
```

| Param | Required | Description |
|-------|----------|-------------|
| `models` | Yes | Comma-separated OpenRouter model IDs |

### GET /models

Browse the full model catalog.

```bash
curl "https://ss.deeflect.com/models?category=coding&sort=score&limit=10"
```

| Param | Required | Description |
|-------|----------|-------------|
| `category` | No | Filter by category |
| `tier` | No | Filter by tier: `premium`, `standard`, `budget` |
| `sort` | No | `score` (default), `cost`, `efficiency`, or any category name |
| `limit` | No | Results to return, 1-500 (default: `50`) |

### POST /decompose

Break a complex task into sequential steps with an optimal model per step.

```bash
curl -X POST "https://ss.deeflect.com/decompose" \
  -H "Content-Type: application/json" \
  -d '{"task": "Build and deploy a SaaS landing page", "budget": "medium"}'
```

### POST /swarm

Decompose a task into a parallel DAG of sub-tasks with dependency tracking.

```bash
curl -X POST "https://ss.deeflect.com/swarm" \
  -H "Content-Type: application/json" \
  -d '{"task": "Research competitors and build a pitch deck", "budget": "low"}'
```

### GET /status

API health and data freshness.

```bash
curl "https://ss.deeflect.com/status"
```

### POST /refresh

Force a data refresh (pulls from all 5 sources). Protected by an API key if `REFRESH_API_KEY` is set.

```bash
curl -X POST "https://ss.deeflect.com/refresh" \
  -H "Authorization: Bearer YOUR_KEY"
```

### POST /spawn-log

Log a spawn event (used by the plugin for feedback/learning).

### POST /spawn-log/outcome

Report a task outcome rating (1-5) for the learning loop.
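The request body for these two logging endpoints isn't documented in this README. Purely as a hypothetical illustration (the field names below are guesses, not the real schema; check `src/routes/` for the actual shape):

```typescript
// HYPOTHETICAL payload: field names are illustrative guesses, not the
// documented schema. Only the 1-5 rating range comes from this README.
await fetch("https://ss.deeflect.com/spawn-log/outcome", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "anthropic/claude-opus-4.6", // which model handled the spawn
    rating: 4,                          // documented range: 1-5
  }),
});
```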

### POST /community/report

Anonymous community outcome report for shared intelligence.

### POST /roles/compose

Compose a role-enriched prompt from persona/stack/domain blocks.

---

## Self-Hosting

The API is open source. Run your own instance if you want full control.

### Local Development

```bash
git clone https://github.com/deeflect/smart-spawn.git
cd smart-spawn
bun install
bun run dev   # starts on http://localhost:3000
```

### Docker

```bash
docker build -t smart-spawn .
docker run -p 3000:3000 -v smart-spawn-data:/app/data smart-spawn
```

### Railway

[Deploy on Railway](https://railway.com/template)

The repo includes `railway.json` and a `Dockerfile`. Just connect your repo and deploy.

### Environment Variables

| Variable | Required | Description |
|----------|----------|-------------|
| `PORT` | No | Server port (default: `3000`) |
| `REFRESH_API_KEY` | No | Protects the `/refresh` endpoint. If set, requires `Authorization: Bearer <key>` |

### Rate Limits

- **200 requests/min** per IP (all endpoints)
- **2 requests/hour** per IP on `/refresh`
- Returns `429 Too Many Requests` with a `Retry-After` header

These limits are generous enough for agent use. If you're hitting them, self-host.
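A client that honors `Retry-After` instead of failing outright can look like this minimal sketch (the helper name and retry count are illustrative):

```typescript
// Retry helper for the documented 429 + Retry-After behavior.
// After `retries` attempts it returns the last response for the
// caller to handle.
async function fetchWithRetry(url: string, retries = 3): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url);
    if (res.status !== 429 || attempt === retries) return res;

    // Retry-After is in seconds; fall back to 1s if missing/unparsable.
    const wait = Number(res.headers.get("Retry-After")) || 1;
    await new Promise((r) => setTimeout(r, wait * 1000));
  }
}
```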

---

## Architecture

```
smart-spawn/
├── src/                      # API server
│   ├── index.ts              # Hono app, middleware, startup
│   ├── db.ts                 # SQLite (cache, spawn logs, scores)
│   ├── types.ts              # All TypeScript types
│   ├── model-selection.ts    # Score sorting, blending logic
│   ├── scoring-utils.ts      # Category classification, score helpers
│   ├── context-signals.ts    # Context tag parsing and boost calculation
│   ├── task-splitter.ts      # Task decomposition for cascade/swarm
│   ├── enrichment/
│   │   ├── pipeline.ts       # Main pipeline: pull → enrich → cache
│   │   ├── scoring.ts        # Z-score normalization, score computation
│   │   ├── rules.ts          # Tier classification, category derivation
│   │   ├── alias-map.ts      # Cross-source model name matching
│   │   └── sources/          # Data source adapters
│   │       ├── openrouter.ts     # OpenRouter model catalog
│   │       ├── artificial.ts     # Artificial Analysis benchmarks
│   │       ├── hf-leaderboard.ts # HuggingFace Open LLM Leaderboard
│   │       ├── lmarena.ts        # LMArena / Chatbot Arena ELO
│   │       └── livebench.ts      # LiveBench scores
│   ├── routes/               # API endpoints
│   ├── roles/                # Role composition blocks
│   ├── middleware/           # Rate limiting, response caching
│   └── utils/                # Input validation
├── smart-spawn/              # OpenClaw plugin
│   ├── index.ts              # Plugin entry point (tool registration)
│   ├── openclaw.plugin.json  # Plugin manifest
│   ├── src/api-client.ts     # API client for plugin
│   └── skills/smart-spawn/   # Companion SKILL.md
├── data/                     # SQLite database (auto-created)
├── Dockerfile
├── railway.json
└── .env.example
```

---

## License

MIT — see [LICENSE](LICENSE).

Built by [@deeflect](https://github.com/deeflect).

package/package.json
CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@deeflectcom/smart-spawn",
-  "version": "1.0.0",
+  "version": "1.0.1",
   "description": "Intelligent sub-agent spawning with automatic model selection for OpenClaw",
   "main": "index.ts",
   "type": "module",

@@ -8,7 +8,8 @@
   "index.ts",
   "src/",
   "skills/",
-  "openclaw.plugin.json"
+  "openclaw.plugin.json",
+  "README.md"
 ],
 "openclaw": {
   "extensions": ["./index.ts"]