wolverine-ai 1.0.0

Files changed (79)
  1. package/PLATFORM.md +442 -0
  2. package/README.md +475 -0
  3. package/SERVER_BEST_PRACTICES.md +62 -0
  4. package/TELEMETRY.md +108 -0
  5. package/bin/wolverine.js +95 -0
  6. package/examples/01-basic-typo.js +31 -0
  7. package/examples/02-multi-file/routes/users.js +15 -0
  8. package/examples/02-multi-file/server.js +25 -0
  9. package/examples/03-syntax-error.js +23 -0
  10. package/examples/04-secret-leak.js +14 -0
  11. package/examples/05-expired-key.js +27 -0
  12. package/examples/06-json-config/config.json +13 -0
  13. package/examples/06-json-config/server.js +28 -0
  14. package/examples/07-rate-limit-loop.js +11 -0
  15. package/examples/08-sandbox-escape.js +20 -0
  16. package/examples/buggy-server.js +39 -0
  17. package/examples/demos/01-basic-typo/index.js +20 -0
  18. package/examples/demos/01-basic-typo/routes/api.js +13 -0
  19. package/examples/demos/01-basic-typo/routes/health.js +4 -0
  20. package/examples/demos/02-multi-file/index.js +24 -0
  21. package/examples/demos/02-multi-file/routes/api.js +13 -0
  22. package/examples/demos/02-multi-file/routes/health.js +4 -0
  23. package/examples/demos/03-syntax-error/index.js +18 -0
  24. package/examples/demos/04-secret-leak/index.js +16 -0
  25. package/examples/demos/05-expired-key/index.js +21 -0
  26. package/examples/demos/06-json-config/config.json +9 -0
  27. package/examples/demos/06-json-config/index.js +20 -0
  28. package/examples/demos/07-null-crash/index.js +16 -0
  29. package/examples/run-demo.js +110 -0
  30. package/package.json +67 -0
  31. package/server/config/settings.json +62 -0
  32. package/server/index.js +33 -0
  33. package/server/routes/api.js +12 -0
  34. package/server/routes/health.js +16 -0
  35. package/server/routes/time.js +12 -0
  36. package/src/agent/agent-engine.js +727 -0
  37. package/src/agent/goal-loop.js +140 -0
  38. package/src/agent/research-agent.js +120 -0
  39. package/src/agent/sub-agents.js +176 -0
  40. package/src/backup/backup-manager.js +321 -0
  41. package/src/brain/brain.js +315 -0
  42. package/src/brain/embedder.js +131 -0
  43. package/src/brain/function-map.js +263 -0
  44. package/src/brain/vector-store.js +267 -0
  45. package/src/core/ai-client.js +387 -0
  46. package/src/core/cluster-manager.js +144 -0
  47. package/src/core/config.js +89 -0
  48. package/src/core/error-parser.js +87 -0
  49. package/src/core/health-monitor.js +129 -0
  50. package/src/core/models.js +132 -0
  51. package/src/core/patcher.js +55 -0
  52. package/src/core/runner.js +464 -0
  53. package/src/core/system-info.js +141 -0
  54. package/src/core/verifier.js +146 -0
  55. package/src/core/wolverine.js +290 -0
  56. package/src/dashboard/server.js +1332 -0
  57. package/src/index.js +94 -0
  58. package/src/logger/event-logger.js +237 -0
  59. package/src/logger/pricing.js +96 -0
  60. package/src/logger/repair-history.js +109 -0
  61. package/src/logger/token-tracker.js +277 -0
  62. package/src/mcp/mcp-client.js +224 -0
  63. package/src/mcp/mcp-registry.js +228 -0
  64. package/src/mcp/mcp-security.js +152 -0
  65. package/src/monitor/perf-monitor.js +300 -0
  66. package/src/monitor/process-monitor.js +231 -0
  67. package/src/monitor/route-prober.js +191 -0
  68. package/src/notifications/notifier.js +227 -0
  69. package/src/platform/heartbeat.js +93 -0
  70. package/src/platform/queue.js +53 -0
  71. package/src/platform/register.js +64 -0
  72. package/src/platform/telemetry.js +76 -0
  73. package/src/security/admin-auth.js +150 -0
  74. package/src/security/injection-detector.js +174 -0
  75. package/src/security/rate-limiter.js +152 -0
  76. package/src/security/sandbox.js +128 -0
  77. package/src/security/secret-redactor.js +217 -0
  78. package/src/skills/skill-registry.js +129 -0
  79. package/src/skills/sql.js +375 -0
package/PLATFORM.md ADDED
@@ -0,0 +1,442 @@
1
+ # Wolverine Platform — Multi-Server Analytics & Management
2
+
3
+ ## Overview
4
+
5
+ The Wolverine Platform aggregates data from hundreds or thousands of wolverine server instances into a single backend and frontend dashboard. Each wolverine instance runs independently and broadcasts lightweight telemetry to the platform.
6
+
7
+ ```
8
+ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
9
+ │ Wolverine #1 │ │ Wolverine #2 │ │ Wolverine #3 │ ... (N instances)
10
+ │ server:3000 │ │ server:4000 │ │ server:5000 │
11
+ │ dash:3001 │ │ dash:4001 │ │ dash:5001 │
12
+ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘
13
+ │ │ │
14
+ │ heartbeat │ heartbeat │ heartbeat
15
+ │ (every 60s) │ (every 60s) │ (every 60s)
16
+ ▼ ▼ ▼
17
+ ┌─────────────────────────────────────────────────┐
18
+ │ Wolverine Platform Backend │
19
+ │ │
20
+ │ POST /api/v1/heartbeat ← receive telemetry │
21
+ │ GET /api/v1/servers ← list all instances │
22
+ │ GET /api/v1/servers/:id ← single instance │
23
+ │ GET /api/v1/analytics ← aggregated stats │
24
+ │ GET /api/v1/alerts ← active alerts │
25
+ │ WS /ws/live ← real-time stream │
26
+ │ │
27
+ │ Database: PostgreSQL (time-series optimized) │
28
+ │ Cache: Redis (live state, pub/sub) │
29
+ │ Queue: Bull/BullMQ (alert processing) │
30
+ └─────────────────────────────────────────────────┘
31
+
32
+
33
+ ┌─────────────────────────────────────────────────┐
34
+ │ Wolverine Platform Frontend │
35
+ │ │
36
+ │ Fleet overview — all servers at a glance │
37
+ │ Per-server deep dive — events, repairs, usage │
38
+ │ Cost analytics — tokens, USD, by model │
39
+ │ Alert management — acknowledge, escalate │
40
+ │ Uptime history — SLA tracking over time │
41
+ └─────────────────────────────────────────────────┘
42
+ ```
43
+
44
+ ---
45
+
46
+ ## Telemetry Protocol
47
+
48
+ ### Heartbeat Payload
49
+
50
+ Each wolverine instance sends a heartbeat every **60 seconds** (configurable via `WOLVERINE_HEARTBEAT_INTERVAL_MS`). This is the only traffic sent to the platform, so the network impact is minimal.
51
+
52
+ ```http
53
+ POST /api/v1/heartbeat
54
+ Authorization: Bearer <PLATFORM_API_KEY>
55
+ Content-Type: application/json
56
+
57
+ {
58
+ "instanceId": "wlv_a1b2c3d4",
59
+ "version": "0.1.0",
60
+ "timestamp": 1775073247574,
61
+
62
+ "server": {
63
+ "name": "my-api",
64
+ "port": 3000,
65
+ "uptime": 86400,
66
+ "status": "healthy",
67
+ "pid": 12345
68
+ },
69
+
70
+ "process": {
71
+ "memoryMB": 128,
72
+ "cpuPercent": 12,
73
+ "peakMemoryMB": 256
74
+ },
75
+
76
+ "routes": {
77
+ "total": 8,
78
+ "healthy": 8,
79
+ "unhealthy": 0,
80
+ "slowest": { "path": "/api/search", "avgMs": 450 }
81
+ },
82
+
83
+ "repairs": {
84
+ "total": 3,
85
+ "successes": 2,
86
+ "failures": 1,
87
+ "lastRepair": {
88
+ "error": "TypeError: Cannot read property 'id' of undefined",
89
+ "resolution": "Added null check before accessing user.id",
90
+ "tokens": 1820,
91
+ "cost": 0.0045,
92
+ "mode": "fast",
93
+ "timestamp": 1775073200000
94
+ }
95
+ },
96
+
97
+ "usage": {
98
+ "totalTokens": 45000,
99
+ "totalCost": 0.12,
100
+ "totalCalls": 85,
101
+ "byCategory": {
102
+ "heal": { "tokens": 12000, "cost": 0.04, "calls": 5 },
103
+ "chat": { "tokens": 25000, "cost": 0.05, "calls": 60 },
104
+ "classify": { "tokens": 3000, "cost": 0.001, "calls": 15 },
105
+ "develop": { "tokens": 5000, "cost": 0.03, "calls": 5 }
106
+ }
107
+ },
108
+
109
+ "brain": {
110
+ "totalMemories": 45,
111
+ "namespaces": { "docs": 23, "functions": 12, "errors": 5, "fixes": 3, "learnings": 2 }
112
+ },
113
+
114
+ "backups": {
115
+ "total": 8,
116
+ "stable": 3,
117
+ "verified": 2,
118
+ "unstable": 3
119
+ },
120
+
121
+ "alerts": [
122
+ {
123
+ "type": "memory_leak",
124
+ "message": "Memory growing: +50MB over 10 samples",
125
+ "severity": "warn",
126
+ "timestamp": 1775073100000
127
+ }
128
+ ]
129
+ }
130
+ ```
131
+
132
+ ### Design Principles
133
+
134
+ - **Infrequent**: 1 heartbeat per 60 seconds = 1440/day per instance
135
+ - **Small**: ~2KB per payload, gzipped < 500 bytes
136
+ - **Idempotent**: the same heartbeat can be sent twice safely (upsert by instanceId + timestamp, which requires a unique constraint on those two columns)
137
+ - **Offline-resilient**: if platform is down, wolverine queues heartbeats and replays on reconnect
138
+ - **No PII**: never send secrets, user data, or source code in heartbeats
139
+
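The volume figures above can be checked with a few lines of Node.js. This is a minimal sketch, not code from the package: `heartbeatsPerDay` and `payloadSizes` are hypothetical helper names, and the compressed size depends on the actual payload, so the ~2KB and < 500 byte figures are best measured rather than assumed.

```javascript
const zlib = require('node:zlib');

// Number of heartbeats one instance sends per day at a given interval.
function heartbeatsPerDay(intervalMs) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor(msPerDay / intervalMs);
}

// Serialized size of a payload before and after gzip, in bytes.
function payloadSizes(payload) {
  const raw = Buffer.from(JSON.stringify(payload), 'utf8');
  return { rawBytes: raw.length, gzipBytes: zlib.gzipSync(raw).length };
}
```

With the default 60000 ms interval, `heartbeatsPerDay` gives the 1440/day figure quoted above. Running `payloadSizes` against a real heartbeat object is one way to confirm the payload stays within the stated size budget.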
140
+ ---
141
+
142
+ ## Platform Backend Architecture
143
+
144
+ ### Database Schema (PostgreSQL)
145
+
146
+ ```sql
147
+ -- Servers — one row per wolverine instance
148
+ CREATE TABLE servers (
149
+ id TEXT PRIMARY KEY, -- "wlv_a1b2c3d4"
150
+ name TEXT NOT NULL,
151
+ version TEXT,
152
+ first_seen TIMESTAMPTZ NOT NULL DEFAULT NOW(),
153
+ last_heartbeat TIMESTAMPTZ NOT NULL,
154
+ status TEXT NOT NULL DEFAULT 'unknown', -- healthy, degraded, down, unknown
155
+ config JSONB -- port, models, etc.
156
+ );
157
+
158
+ -- Time-series heartbeats — partitioned by day for scale
159
+ CREATE TABLE heartbeats (
160
+ id BIGSERIAL,
161
+ server_id TEXT NOT NULL REFERENCES servers(id),
162
+ timestamp TIMESTAMPTZ NOT NULL,
163
+ uptime INTEGER,
164
+ memory_mb INTEGER,
165
+ cpu_percent INTEGER,
166
+ routes_total INTEGER,
167
+ routes_healthy INTEGER,
168
+ routes_unhealthy INTEGER,
169
+ tokens_total INTEGER,
170
+ cost_total NUMERIC(10,6),
171
+ repairs_total INTEGER,
172
+ repairs_successes INTEGER,
173
+ payload JSONB -- full heartbeat for deep queries
174
+ ) PARTITION BY RANGE (timestamp);
175
+
176
+ -- Create daily partitions automatically (pg_partman or manual)
177
+ -- This allows dropping old data by partition instead of DELETE
178
+
179
+ -- Repairs — detailed log of every fix
180
+ CREATE TABLE repairs (
181
+ id BIGSERIAL PRIMARY KEY,
182
+ server_id TEXT NOT NULL REFERENCES servers(id),
183
+ timestamp TIMESTAMPTZ NOT NULL,
184
+ error TEXT,
185
+ resolution TEXT,
186
+ success BOOLEAN,
187
+ mode TEXT, -- fast, agent, sub-agents
188
+ model TEXT,
189
+ tokens INTEGER,
190
+ cost NUMERIC(10,6),
191
+ iteration INTEGER,
192
+ duration_ms INTEGER
193
+ );
194
+
195
+ -- Alerts — active and historical
196
+ CREATE TABLE alerts (
197
+ id BIGSERIAL PRIMARY KEY,
198
+ server_id TEXT NOT NULL REFERENCES servers(id),
199
+ type TEXT NOT NULL, -- memory_leak, route_down, crash_loop, etc.
200
+ message TEXT,
201
+ severity TEXT, -- info, warn, error, critical
202
+ created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
203
+ acknowledged_at TIMESTAMPTZ,
204
+ resolved_at TIMESTAMPTZ,
205
+ acknowledged_by TEXT
206
+ );
207
+
208
+ -- Usage aggregates — hourly rollups for fast analytics
209
+ CREATE TABLE usage_hourly (
210
+ server_id TEXT NOT NULL REFERENCES servers(id),
211
+ hour TIMESTAMPTZ NOT NULL,
212
+ tokens_total INTEGER DEFAULT 0,
213
+ cost_total NUMERIC(10,6) DEFAULT 0,
214
+ calls_total INTEGER DEFAULT 0,
215
+ tokens_by_category JSONB,
216
+ PRIMARY KEY (server_id, hour)
217
+ );
218
+
219
+ -- Indexes for common queries
220
+ CREATE INDEX idx_heartbeats_server_time ON heartbeats (server_id, timestamp DESC);
221
+ CREATE INDEX idx_repairs_server_time ON repairs (server_id, timestamp DESC);
222
+ CREATE INDEX idx_alerts_active ON alerts (server_id) WHERE resolved_at IS NULL;
223
+ CREATE INDEX idx_servers_status ON servers (status);
224
+ ```
225
+
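The schema leaves daily partition creation to pg_partman or a manual job. For the manual route, a small helper could generate the DDL for one day. This is a sketch under assumptions: the `heartbeats_YYYY_MM_DD` naming scheme and the `dailyPartitionDDL` helper are illustrative and not part of the package, and day boundaries are taken in UTC.

```javascript
// Build the CREATE TABLE statement for the daily partition containing `day`.
// Assumption: partitions are named heartbeats_YYYY_MM_DD and cut at UTC midnight.
function dailyPartitionDDL(day) {
  const start = new Date(Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate()));
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  const isoDate = (d) => d.toISOString().slice(0, 10); // "YYYY-MM-DD"
  const name = `heartbeats_${isoDate(start).replace(/-/g, '_')}`;
  // Range partitions include the lower bound and exclude the upper bound.
  return (
    `CREATE TABLE IF NOT EXISTS ${name} PARTITION OF heartbeats ` +
    `FOR VALUES FROM ('${isoDate(start)} 00:00:00+00') TO ('${isoDate(end)} 00:00:00+00');`
  );
}
```

A scheduled job could run this for tomorrow's date once a day and execute the result. Dropping an old day is then a `DROP TABLE` on the matching partition.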
226
+ ### API Endpoints
227
+
228
+ ```
229
+ Authentication: Bearer token (PLATFORM_API_KEY)
230
+
231
+ POST /api/v1/heartbeat ← Receive heartbeat from wolverine instance
232
+ → Upsert server, insert heartbeat, process alerts
233
+ → Returns: { received: true, serverTime: "..." }
234
+
235
+ GET /api/v1/servers ← List all instances
236
+ → Query: ?status=healthy&sort=last_heartbeat&limit=50&offset=0
237
+ → Returns: { servers: [...], total: 150, limit: 50, offset: 0 }
238
+
239
+ GET /api/v1/servers/:id ← Single instance detail
240
+ → Returns: full server state + recent heartbeats + repairs + alerts
241
+
242
+ GET /api/v1/servers/:id/heartbeats ← Heartbeat history
243
+ → Query: ?from=2026-04-01&to=2026-04-02&interval=5m
244
+ → Returns: time-series data for charting
245
+
246
+ GET /api/v1/servers/:id/repairs ← Repair history for one server
247
+ → Query: ?limit=50&success=true
248
+ → Returns: { repairs: [...], stats: { total, successes, avgTokens } }
249
+
250
+ GET /api/v1/analytics ← Fleet-wide aggregates
251
+ → Query: ?period=24h or ?from=...&to=...
252
+ → Returns: {
253
+ totalServers, activeServers, totalRepairs, successRate,
254
+ totalTokens, totalCost, tokensByCategory, costByModel,
255
+ uptimePercent, avgResponseTime
256
+ }
257
+
258
+ GET /api/v1/analytics/cost ← Cost breakdown
259
+ → Query: ?period=7d&groupBy=server|model|category
260
+ → Returns: cost time-series + breakdown
261
+
262
+ GET /api/v1/alerts ← Active alerts across fleet
263
+ → Query: ?severity=critical&acknowledged=false
264
+ → Returns: { alerts: [...], total: 5 }
265
+
266
+ PATCH /api/v1/alerts/:id ← Acknowledge/resolve alert
267
+ → Body: { action: "acknowledge" | "resolve", by: "admin@..." }
268
+
269
+ WS /ws/live ← Real-time WebSocket stream
270
+ → Streams: heartbeats, alerts, repairs as they arrive
271
+ → Subscribe: { subscribe: ["heartbeat", "alert", "repair"] }
272
+ → Filter: { servers: ["wlv_a1b2c3d4"] }
273
+ ```
274
+
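The `/ws/live` subscription and filter messages suggest a per-client delivery check on the backend. The sketch below shows one possible shape. It assumes the Subscribe and Filter messages are merged into a single `subscription` object and that each streamed event carries `type` and `serverId` fields. Neither detail is specified above, and `shouldDeliver` is a hypothetical name.

```javascript
// Decide whether a streamed event should be forwarded to a client.
// subscription: { subscribe: ["heartbeat", ...], servers: ["wlv_..."] } (servers optional)
// event:        { type: "heartbeat" | "alert" | "repair", serverId: "wlv_..." } (assumed shape)
function shouldDeliver(subscription, event) {
  const types = subscription.subscribe || [];
  if (!types.includes(event.type)) return false;
  const servers = subscription.servers;
  // A missing or empty server filter is treated as "all servers".
  if (!Array.isArray(servers) || servers.length === 0) return true;
  return servers.includes(event.serverId);
}
```

Keeping this check as a pure function makes it easy to reuse on every backend node that receives events from the Redis pub/sub channels.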
275
+ ### Scaling Strategy
276
+
277
+ ```
278
+ 10 servers: Single PostgreSQL, single Node.js backend
279
+ 100 servers: PostgreSQL with connection pooling (pgBouncer), Redis cache
280
+ 1,000 servers: Partitioned heartbeats table, read replicas, queue workers
281
+ 10,000 servers: TimescaleDB for time-series, horizontal API scaling, Kafka for ingestion
282
+ 100,000+: Sharded by server_id, dedicated ingestion pipeline, ClickHouse for analytics
283
+ ```
284
+
285
+ **Key scaling decisions:**
286
+ - Heartbeats are **append-only** — no updates, only inserts → perfect for time-series DBs
287
+ - Hourly rollups in `usage_hourly` prevent expensive full-table scans for analytics
288
+ - Partitioned by day → drop old data by partition (instant, no vacuum)
289
+ - Redis caches the "current state" of each server (latest heartbeat) → fast fleet overview
290
+ - WebSocket uses Redis pub/sub → horizontal scaling of frontend connections
291
+ - Alert processing is async via job queue → doesn't block heartbeat ingestion
292
+
293
+ ### Redis Structure
294
+
295
+ ```
296
+ wolverine:server:{id}:state ← Latest heartbeat (JSON, TTL 5min)
297
+ wolverine:server:{id}:uptime ← Uptime counter (INCR every heartbeat)
298
+ wolverine:servers:active ← Sorted set (score = last_heartbeat timestamp)
299
+ wolverine:alerts:active ← Set of active alert IDs
300
+ wolverine:stats:fleet ← Cached fleet-wide aggregates (TTL 30s)
301
+ wolverine:pubsub:heartbeats ← Pub/sub channel for real-time streaming
302
+ wolverine:pubsub:alerts ← Pub/sub channel for alert notifications
303
+ ```
304
+
305
+ ---
306
+
307
+ ## Platform Frontend
308
+
309
+ ### Pages
310
+
311
+ **1. Fleet Overview**
312
+ - Grid/list of all server instances
313
+ - Color-coded status: green (healthy), yellow (degraded), red (down), gray (unknown)
314
+ - Sortable by: status, uptime, memory, cost, last repair
315
+ - Search/filter by name, status, tags
316
+ - Fleet-wide stats bar: total servers, active, repairs today, cost today
317
+
318
+ **2. Server Detail**
319
+ - Real-time stats: memory, CPU, uptime, routes
320
+ - Event timeline (same as local dashboard but from platform data)
321
+ - Repair history with resolution details + token cost
322
+ - Usage chart: tokens over time, cost over time
323
+ - Route health table with response time trends
324
+ - Backup status
325
+ - Brain stats
326
+
327
+ **3. Analytics**
328
+ - Fleet-wide token usage over time (by day/hour)
329
+ - Cost breakdown: by server, by model, by category
330
+ - Repair success rate over time
331
+ - Mean time to repair (MTTR) trend
332
+ - Most expensive servers / most repaired servers
333
+ - Uptime SLA tracking (99.9% target)
334
+ - Response time percentiles across fleet
335
+
336
+ **4. Alerts**
337
+ - Active alerts sorted by severity
338
+ - Acknowledge / resolve workflow
339
+ - Alert history with resolution notes
340
+ - Alert rules configuration (memory threshold, crash count, response time)
341
+
342
+ **5. Cost Management**
343
+ - Total spend by period (day/week/month)
344
+ - Per-server cost ranking
345
+ - Per-model cost ranking
346
+ - Projected monthly cost based on current usage
347
+ - Budget alerts (notify when approaching limit)
348
+
349
+ ### Tech Stack Recommendation
350
+
351
+ ```
352
+ Frontend: Next.js + Tailwind + Recharts (or Tremor for dashboard components)
353
+ Backend: Node.js + Express + PostgreSQL + Redis + BullMQ
354
+ Auth: NextAuth.js or Clerk (team management)
355
+ Hosting: Vercel (frontend) + Railway/Fly.io (backend) + Supabase (PostgreSQL)
356
+ WebSocket: Socket.io or native WS through the backend
357
+ ```
358
+
359
+ ---
360
+
361
+ ## Wolverine Client Integration
362
+
363
+ ### New env variables for the wolverine instance:
364
+
365
+ ```env
366
+ # Platform telemetry (optional — wolverine works fine without it)
367
+ WOLVERINE_PLATFORM_URL=https://api.wolverine.dev
368
+ WOLVERINE_PLATFORM_KEY=wlvk_your_api_key_here
369
+ WOLVERINE_INSTANCE_NAME=my-api-prod
370
+ WOLVERINE_HEARTBEAT_INTERVAL_MS=60000
371
+ ```
372
+
373
+ ### Telemetry module to build in wolverine:
374
+
375
+ ```
376
+ src/platform/
377
+ ├── telemetry.js ← Collects heartbeat data from all subsystems
378
+ ├── heartbeat.js ← Sends heartbeat to platform on interval
379
+ └── queue.js ← Queues heartbeats when platform is unreachable
380
+ ```
381
+
382
+ **telemetry.js** gathers data from:
383
+ - `processMonitor.getMetrics()` → memory, CPU
384
+ - `routeProber.getMetrics()` → route health
385
+ - `tokenTracker.getAnalytics()` → usage
386
+ - `repairHistory.getStats()` → repairs
387
+ - `backupManager.getStats()` → backups
388
+ - `brain.getStats()` → brain
389
+ - `notifier` → active alerts
390
+
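The collection step above could be sketched as a single function that takes the subsystems as dependencies. The method names come from the list above, but their return shapes are not specified here, so the sketch passes them through unchanged. `collectHeartbeat`, `safe`, and the `getActiveAlerts` callback (standing in for the notifier, whose API is not shown) are assumptions.

```javascript
// Run a collector and fall back to a default if the subsystem throws,
// so one failing subsystem does not suppress the whole heartbeat.
function safe(fn, fallback = null) {
  try {
    return fn();
  } catch {
    return fallback;
  }
}

// Assemble one heartbeat from the subsystems listed above.
// deps: { processMonitor, routeProber, tokenTracker, repairHistory,
//         backupManager, brain, getActiveAlerts }  (getActiveAlerts is hypothetical)
function collectHeartbeat(deps, meta) {
  return {
    instanceId: meta.instanceId,
    version: meta.version,
    timestamp: Date.now(),
    process: safe(() => deps.processMonitor.getMetrics()),
    routes: safe(() => deps.routeProber.getMetrics()),
    usage: safe(() => deps.tokenTracker.getAnalytics()),
    repairs: safe(() => deps.repairHistory.getStats()),
    backups: safe(() => deps.backupManager.getStats()),
    brain: safe(() => deps.brain.getStats()),
    alerts: safe(() => deps.getActiveAlerts(), []),
  };
}
```

Passing the subsystems in explicitly keeps the collector testable with stubs and avoids importing every module from the telemetry file.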
391
+ **heartbeat.js** sends it:
392
+ - HTTP POST to platform every 60s
393
+ - Gzip compressed
394
+ - Timeout: 5s (don't block if platform is slow)
395
+ - On failure: queue locally, retry with exponential backoff
396
+ - On reconnect: replay queued heartbeats
397
+
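Two parts of the sending step are easy to isolate: building the gzip-compressed request and computing the retry delay. The following is a sketch rather than the package's implementation. `buildHeartbeatRequest` and `backoffDelayMs` are hypothetical names, and the 1 s base and 60 s cap for the backoff are assumed values, since the document only says "exponential backoff".

```javascript
const zlib = require('node:zlib');

// Build fetch()-style options for POST /api/v1/heartbeat with a gzip body.
// The 5 s timeout could be applied by the caller, for example by adding
// `signal: AbortSignal.timeout(5000)` to these options before calling fetch.
function buildHeartbeatRequest(payload, apiKey) {
  return {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
      'Content-Encoding': 'gzip',
    },
    body: zlib.gzipSync(JSON.stringify(payload)),
  };
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
// baseMs and maxMs are assumed defaults, not values from the document.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

The platform would need to decompress request bodies that arrive with `Content-Encoding: gzip`. Capping the delay keeps a long outage from pushing retries further apart than the heartbeat interval itself.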
398
+ **queue.js** handles offline resilience:
399
+ - Append to `.wolverine/heartbeat-queue.jsonl` when platform unreachable
400
+ - On next successful heartbeat, drain the queue (oldest first)
401
+ - Max queue size: 1440 entries (24 hours of heartbeats)
402
+ - Once the queue is full (24 hours of heartbeats at the default interval), drop the oldest entries (stale data isn't useful)
403
+
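The queue behaviour described above can be sketched as a bounded JSONL file. This is an illustration under assumptions, not the contents of `queue.js`: the function names are hypothetical, the file is rewritten on each enqueue to enforce the cap (a plain append would need a separate trim step), and the directory holding the queue file is assumed to exist already.

```javascript
const fs = require('node:fs');

const MAX_ENTRIES = 1440; // 24 hours of heartbeats at one per minute

// Read all queued heartbeats, oldest first. A missing file means an empty queue.
function readQueue(file) {
  if (!fs.existsSync(file)) return [];
  return fs
    .readFileSync(file, 'utf8')
    .split('\n')
    .filter(Boolean)
    .map((line) => JSON.parse(line));
}

// Write the queue back as one JSON document per line.
function writeQueue(file, entries) {
  const body = entries.map((entry) => JSON.stringify(entry)).join('\n');
  fs.writeFileSync(file, body ? body + '\n' : '');
}

// Add a heartbeat and drop the oldest entries beyond the cap.
function enqueue(file, heartbeat, maxEntries = MAX_ENTRIES) {
  const entries = readQueue(file);
  entries.push(heartbeat);
  writeQueue(file, entries.slice(-maxEntries));
}

// Replay queued heartbeats oldest first. `send` is a caller-supplied async
// function; draining stops at the first failure and keeps the remainder.
async function drain(file, send) {
  const entries = readQueue(file);
  let sent = 0;
  for (const entry of entries) {
    try {
      await send(entry);
    } catch {
      break;
    }
    sent += 1;
  }
  writeQueue(file, entries.slice(sent));
  return sent;
}
```

Stopping the drain at the first failed send preserves ordering, so the platform receives replayed heartbeats in the order they were produced.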
404
+ ---
405
+
406
+ ## Security Considerations
407
+
408
+ - **Platform API key** per instance — revokable, rotatable
409
+ - **Secret redactor** runs on heartbeat payload before sending (no env values leak)
410
+ - **No source code** in heartbeats — only metrics, error messages (redacted), and stats
411
+ - **TLS only** — platform endpoint must be HTTPS
412
+ - **Rate limiting** on platform ingestion — max 1 heartbeat/second per instance
413
+ - **Tenant isolation** — multi-tenant platform must scope data by organization
414
+ - **Audit log** — track who acknowledged/resolved alerts
415
+
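The ingestion limit of one heartbeat per second per instance could be enforced with a small in-memory check. The sketch below assumes a single backend process. `createHeartbeatLimiter` is a hypothetical name, and a multi-node deployment would more likely keep this state in Redis.

```javascript
// Returns an allow(instanceId, nowMs) function that accepts at most one
// heartbeat per `minIntervalMs` for each instance.
function createHeartbeatLimiter(minIntervalMs = 1000) {
  const lastAccepted = new Map(); // instanceId -> time of last accepted heartbeat (ms)
  return function allow(instanceId, nowMs = Date.now()) {
    const last = lastAccepted.get(instanceId);
    if (last !== undefined && nowMs - last < minIntervalMs) return false;
    lastAccepted.set(instanceId, nowMs);
    return true;
  };
}
```

Rejected heartbeats do not reset the window, so a client that retries too quickly is accepted again one interval after its last accepted heartbeat. The map grows with the number of instances seen, which is acceptable for a sketch but would need eviction in a long-running service.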
416
+ ---
417
+
418
+ ## Implementation Priority
419
+
420
+ ### Phase 1: Core (1-2 weeks)
421
+ 1. Platform backend: heartbeat ingestion + server listing + basic API
422
+ 2. Wolverine telemetry module: collect + send heartbeats
423
+ 3. Frontend: fleet overview + server detail page
424
+ 4. PostgreSQL schema + Redis caching
425
+
426
+ ### Phase 2: Analytics (1 week)
427
+ 1. Hourly usage rollups
428
+ 2. Cost analytics page
429
+ 3. Repair history aggregation
430
+ 4. Uptime tracking
431
+
432
+ ### Phase 3: Alerting (1 week)
433
+ 1. Alert rules engine
434
+ 2. Acknowledge/resolve workflow
435
+ 3. Email/Slack/webhook notifications
436
+ 4. Alert history
437
+
438
+ ### Phase 4: Scale (ongoing)
439
+ 1. TimescaleDB migration for heartbeats
440
+ 2. Horizontal API scaling
441
+ 3. WebSocket real-time streaming
442
+ 4. Team management + RBAC