groove-dev 0.27.112 → 0.27.115

Files changed (55)
  1. package/CENTRAL_COMMAND_REBUILD.md +689 -0
  2. package/EMBEDDING_DIAGNOSTIC.md +197 -0
  3. package/TRAINING_DATA_v4.md +3 -0
  4. package/moe-training/client/parsers/codex.js +3 -3
  5. package/moe-training/client/parsers/gemini.js +2 -2
  6. package/moe-training/client/step-classifier.js +2 -2
  7. package/moe-training/test/client/step-classifier.test.js +63 -7
  8. package/node_modules/@groove-dev/cli/package.json +1 -1
  9. package/node_modules/@groove-dev/cli/src/commands/team.js +43 -1
  10. package/node_modules/@groove-dev/daemon/package.json +1 -1
  11. package/node_modules/@groove-dev/daemon/src/api.js +75 -15
  12. package/node_modules/@groove-dev/daemon/src/filewatcher.js +45 -0
  13. package/node_modules/@groove-dev/daemon/src/index.js +36 -10
  14. package/node_modules/@groove-dev/daemon/src/teams.js +100 -6
  15. package/node_modules/@groove-dev/daemon/src/tunnel-manager.js +75 -43
  16. package/node_modules/@groove-dev/gui/dist/assets/{index-CHu5w3i3.js → index-BKCiOUDb.js} +593 -593
  17. package/node_modules/@groove-dev/gui/dist/assets/index-D4Q72afD.css +1 -0
  18. package/node_modules/@groove-dev/gui/dist/index.html +2 -2
  19. package/node_modules/@groove-dev/gui/package.json +1 -1
  20. package/node_modules/@groove-dev/gui/src/components/agents/workspace-mode.jsx +0 -22
  21. package/node_modules/@groove-dev/gui/src/components/layout/status-bar.jsx +43 -45
  22. package/node_modules/@groove-dev/gui/src/components/preview/preview-workspace.jsx +3 -1
  23. package/node_modules/@groove-dev/gui/src/components/settings/quick-connect.jsx +2 -1
  24. package/node_modules/@groove-dev/gui/src/stores/groove.js +57 -8
  25. package/node_modules/@groove-dev/gui/src/views/agents.jsx +31 -3
  26. package/node_modules/@groove-dev/gui/src/views/editor.jsx +1 -20
  27. package/node_modules/@groove-dev/gui/src/views/teams.jsx +106 -3
  28. package/node_modules/moe-training/client/parsers/codex.js +3 -3
  29. package/node_modules/moe-training/client/parsers/gemini.js +2 -2
  30. package/node_modules/moe-training/client/step-classifier.js +2 -2
  31. package/node_modules/moe-training/test/client/step-classifier.test.js +63 -7
  32. package/package.json +1 -1
  33. package/packages/cli/package.json +1 -1
  34. package/packages/cli/src/commands/team.js +43 -1
  35. package/packages/daemon/package.json +1 -1
  36. package/packages/daemon/src/api.js +75 -15
  37. package/packages/daemon/src/filewatcher.js +45 -0
  38. package/packages/daemon/src/index.js +36 -10
  39. package/packages/daemon/src/teams.js +100 -6
  40. package/packages/daemon/src/tunnel-manager.js +75 -43
  41. package/packages/gui/dist/assets/{index-CHu5w3i3.js → index-BKCiOUDb.js} +593 -593
  42. package/packages/gui/dist/assets/index-D4Q72afD.css +1 -0
  43. package/packages/gui/dist/index.html +2 -2
  44. package/packages/gui/package.json +1 -1
  45. package/packages/gui/src/components/agents/workspace-mode.jsx +0 -22
  46. package/packages/gui/src/components/layout/status-bar.jsx +43 -45
  47. package/packages/gui/src/components/preview/preview-workspace.jsx +3 -1
  48. package/packages/gui/src/components/settings/quick-connect.jsx +2 -1
  49. package/packages/gui/src/stores/groove.js +57 -8
  50. package/packages/gui/src/views/agents.jsx +31 -3
  51. package/packages/gui/src/views/editor.jsx +1 -20
  52. package/packages/gui/src/views/teams.jsx +106 -3
  53. package/TRAINING_DATA_v2.md +0 -9
  54. package/node_modules/@groove-dev/gui/dist/assets/index-DAlSbVyK.css +0 -1
  55. package/packages/gui/dist/assets/index-DAlSbVyK.css +0 -1
@@ -0,0 +1,689 @@
1
+ # Central Command — Complete Rebuild Specification
2
+
3
+ This document contains everything needed to rebuild the Central Command ingestion server from scratch. Central Command is a Node.js Express server that receives training trajectory envelopes from Groove clients, verifies HMAC attestation, stores data as JSONL, stitches multi-chunk sessions, scores trajectories, tracks contributor credits, and serves a 384-dim embedding endpoint.
4
+
5
+ ## Architecture Overview
6
+
7
+ ```
8
+ Groove Client Central Command (AWS)
9
+ ───────────── ────────────────────
10
+ SessionAttestation SessionRegistry (SQLite)
11
+ → POST /v1/sessions/open ──────► ECDH key exchange
12
+ ← server_public_key shared_secret derived
13
+ sequence counter = 0
14
+ TrajectoryCapture
15
+ → signs envelope with HMAC
16
+ → POST /v1/training/ingest ──────► EnvelopeVerifier
17
+ HMAC verification
18
+ Sequence check
19
+ Schema validation
20
+ EnvelopeStorage (JSONL)
21
+ TrajectoryStitcher
22
+ TrajectoryScorer
23
+ ContributorLedger (SQLite)
24
+
25
+ DomainTagger
26
+ → POST /v1/embed ──────► EmbeddingService (ONNX)
27
+ ← 384-dim vector all-MiniLM-L6-v2
28
+ ```
29
+
30
+ ## Server Entry Point
31
+
32
+ **Port:** 8443 (env: `GROOVE_CENTRAL_PORT`)
33
+
34
+ **Dependencies:**
35
+ ```json
36
+ {
37
+ "dependencies": {
38
+ "express": "^4.18.0",
39
+ "better-sqlite3": "^12.9.0",
40
+ "uuid": "^9.0.0",
41
+ "@xenova/transformers": "^2.x"
42
+ }
43
+ }
44
+ ```
45
+
46
+ **`server/index.js`** — Express app with:
47
+ - CORS: `Access-Control-Allow-Origin: *`, methods GET/POST/OPTIONS
48
+ - Body parser: `express.json({ limit: '5mb' })`
49
+ - Per-IP rate limiting: 100/minute, 1000/hour (in-memory Map, cleaned every 5 min)
50
+ - Request logging: `[ISO timestamp] METHOD /path STATUS DURATIONms`
51
+ - Health endpoint: `GET /health` → `{ status: 'ok', uptime: process.uptime() }`
52
+ - Graceful shutdown on SIGTERM/SIGINT (5s timeout)
53
+
54
+ **Component initialization order:**
55
+ ```javascript
56
+ const sessionRegistry = new SessionRegistry(); // ./data/sessions.db
57
+ const storage = new EnvelopeStorage(); // ./data/envelopes/
58
+ const ledger = new ContributorLedger(); // ./data/ledger.db
59
+ const verifier = new EnvelopeVerifier(sessionRegistry);
60
+ const stitcher = new TrajectoryStitcher(storage);
61
+ const scorer = new TrajectoryScorer({ MODEL_TIERS, QUALITY_MULTIPLIERS });
62
+ const enrichment = new EnrichmentPipeline();
63
+ const centralStats = new CentralStats(storage, ledger, sessionRegistry);
64
+ ```
65
+
66
+ **Route mounting:**
67
+ ```javascript
68
+ app.use(createSessionRoutes(sessionRegistry));
69
+ app.use(createIngestRoutes(verifier, storage, stitcher, scorer, enrichment, ledger, sessionRegistry));
70
+ app.use(createStatsRoutes(centralStats));
71
+ app.use(createEmbedRoutes()); // NEW: embedding service
72
+ initEmbedding(); // NEW: load ONNX model in background
73
+ ```
74
+
75
+ ---
76
+
77
+ ## Endpoint Reference
78
+
79
+ ### 1. `POST /v1/sessions/open` — ECDH Key Exchange
80
+
81
+ **Purpose:** The client opens a session; the server generates an ECDH keypair and derives the shared secret used for HMAC signing.
82
+
83
+ **Request:**
84
+ ```json
85
+ {
86
+ "session_id": "sess_<uuid>",
87
+ "public_key": "<base64 ECDH public key>",
88
+ "provider": "claude-code|codex|gemini",
89
+ "model": "claude-opus-4-6",
90
+ "machine_fingerprint": "<sha256 hex>",
91
+ "app_version_hash": "<sha256 hex>",
92
+ "groove_version": "0.27.113"
93
+ }
94
+ ```
95
+ All fields are required. The server returns 400 if any field is missing.
96
+
97
+ **Response (200):**
98
+ ```json
99
+ { "server_public_key": "<base64 ECDH public key>" }
100
+ ```
101
+
102
+ **Response (429):** Rate limited — max 20 sessions per machine fingerprint per hour.
103
+
104
+ **Crypto details:**
105
+ - Curve: `prime256v1` (P-256)
106
+ - `generateECDHKeypair()` → creates ECDH, returns `{ publicKey, privateKey }` as base64
107
+ - `deriveSharedSecret(serverPrivateKey, clientPublicKey)` → `ecdh.computeSecret()` as base64
108
+ - Shared secret stored in SQLite alongside session metadata
109
+ - Sequence counter starts at 0
110
+
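The key exchange described above can be sketched with Node's built-in `node:crypto` module. This is a minimal illustration, not the package's actual code: the function names mirror the ones listed in the spec, but their bodies are an assumption, as is the choice to pass keys around as base64 strings in uncompressed form.

```javascript
const { createECDH } = require('node:crypto');

// Generate a P-256 keypair and return both halves as base64 strings.
function generateECDHKeypair() {
  const ecdh = createECDH('prime256v1');
  ecdh.generateKeys();
  return {
    publicKey: ecdh.getPublicKey('base64'),
    privateKey: ecdh.getPrivateKey('base64'),
  };
}

// Derive the shared secret from our private key and the peer's public key.
function deriveSharedSecret(privateKeyB64, peerPublicKeyB64) {
  const ecdh = createECDH('prime256v1');
  ecdh.setPrivateKey(privateKeyB64, 'base64');
  return ecdh.computeSecret(peerPublicKeyB64, 'base64', 'base64');
}

// Both sides arrive at the same secret without ever sending it over the wire.
const client = generateECDHKeypair();
const server = generateECDHKeypair();
const clientSecret = deriveSharedSecret(client.privateKey, server.publicKey);
const serverSecret = deriveSharedSecret(server.privateKey, client.publicKey);
console.log(clientSecret === serverSecret); // true
```

Only the public keys cross the network in `POST /v1/sessions/open`; each side computes the secret locally and uses it as the HMAC key for later envelopes.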
111
+ ### 2. `POST /v1/sessions/close` — Session Close (HTTP)
112
+
113
+ **Request:**
114
+ ```json
115
+ { "session_id": "sess_<uuid>" }
116
+ ```
117
+
118
+ **Response:** `{ closed: true }` or `{ closed: true, already_closed: true }`
119
+
120
+ ### 3. `POST /v1/training/ingest` — Envelope Ingestion
121
+
122
+ **Purpose:** Receives signed trajectory envelopes. Handles three envelope types: CHUNK, SESSION_CLOSE, USER_FEEDBACK.
123
+
124
+ **Common validation for all types:**
125
+ - Envelope must have `session_id`
126
+ - Server generates `envelope_id` (not trusted from client): `env_<uuid>`
127
+ - Dedup check via `processed_envelopes` table
128
+
129
+ #### 3a. Regular CHUNK Envelopes
130
+
131
+ **Request shape:**
132
+ ```json
133
+ {
134
+ "session_id": "sess_<uuid>",
135
+ "chunk_sequence": 0,
136
+ "contributor_id": "<32-char hex>",
137
+ "attestation": {
138
+ "session_hmac": "<64-char hex>",
139
+ "sequence": 0,
140
+ "app_version_hash": "<64-char hex>"
141
+ },
142
+ "metadata": {
143
+ "provider": "codex",
144
+ "model_engine": "gpt-5.5",
145
+ "agent_role": "frontend",
146
+ "agent_id": "frontend-3",
147
+ "team_size": 3,
148
+ "task_complexity": "medium",
149
+ "groove_version": "0.27.113",
150
+ "domain_tags": null,
151
+ "session_embedding": null,
152
+ "routing": null,
153
+ "leaf_context": null,
154
+ "session_quality": 50
155
+ },
156
+ "trajectory_log": [
157
+ {
158
+ "step": 1,
159
+ "type": "thought",
160
+ "timestamp": 1745794234.567,
161
+ "content": "I'll scaffold the project...",
162
+ "token_count": 45
163
+ },
164
+ {
165
+ "step": 2,
166
+ "type": "action",
167
+ "timestamp": 1745794235.123,
168
+ "tool": "command_execution",
169
+ "arguments": { "command": "mkdir -p src" },
170
+ "content": "Executing: mkdir -p src",
171
+ "token_count": 12
172
+ }
173
+ ]
174
+ }
175
+ ```
176
+
177
+ **Verification flow:**
178
+ 1. Session must exist and be `active`
179
+ 2. Attestation must have non-empty HMAC string
180
+ 3. HMAC verification: strip `attestation` from envelope, JSON.stringify the rest, verify against stored shared_secret with sequence number
181
+ 4. Atomic sequence check + increment (prevents replay)
182
+ 5. Schema validation (see Schema section below)
183
+ 6. Per-session envelope limit: max 200
184
+
185
+ **HMAC computation:**
186
+ ```javascript
187
+ // payload = uint32BE(sequence) + Buffer(JSON.stringify(envelopeWithoutAttestation))
188
+ const key = Buffer.from(sharedSecretB64, 'base64');
189
+ const seqBuf = Buffer.alloc(4);
190
+ seqBuf.writeUInt32BE(sequenceNumber);
191
+ const payload = Buffer.concat([seqBuf, Buffer.from(envelopeBytes)]);
192
+ const hmac = createHmac('sha256', key).update(payload).digest('hex');
193
+ ```
194
+
195
+ HMAC verification uses a **constant-time comparison**: XOR each character pair and OR the result into an accumulator, so that timing does not reveal where two strings first differ.
196
+
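Putting the HMAC computation and the constant-time comparison together gives roughly the following. It is a sketch under stated assumptions: `canonicalBytes` and `constantTimeEqual` are hypothetical helper names, and the demo secret is a made-up value rather than a real ECDH output.

```javascript
const { createHmac } = require('node:crypto');

// HMAC-SHA256 over uint32BE(sequence) || envelopeBytes, hex-encoded.
function signEnvelope(sharedSecretB64, envelopeBytes, sequence) {
  const key = Buffer.from(sharedSecretB64, 'base64');
  const seqBuf = Buffer.alloc(4);
  seqBuf.writeUInt32BE(sequence);
  const payload = Buffer.concat([seqBuf, Buffer.from(envelopeBytes)]);
  return createHmac('sha256', key).update(payload).digest('hex');
}

// XOR each char code and OR it into an accumulator; no early exit on mismatch.
function constantTimeEqual(a, b) {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

function verifyEnvelope(sharedSecretB64, envelopeBytes, sequence, hmacHex) {
  const expected = signEnvelope(sharedSecretB64, envelopeBytes, sequence);
  return constantTimeEqual(expected, hmacHex);
}

// Serialize the envelope with its `attestation` field removed.
function canonicalBytes(envelope) {
  const { attestation, ...rest } = envelope;
  return JSON.stringify(rest);
}

const secret = Buffer.from('example-shared-secret').toString('base64'); // made-up value
const envelope = { session_id: 'sess_demo', chunk_sequence: 0, attestation: {} };
const bytes = canonicalBytes(envelope);
const hmac = signEnvelope(secret, bytes, 0);
console.log(verifyEnvelope(secret, bytes, 0, hmac)); // true
console.log(verifyEnvelope(secret, bytes, 1, hmac)); // false: the sequence is part of the MAC input
```

Because the signature covers `JSON.stringify` output, client and server must serialize keys in the same order. Node also ships `crypto.timingSafeEqual`, which could replace the manual loop when both values are equal-length buffers.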
197
+ #### 3b. SESSION_CLOSE Envelopes
198
+
199
+ **Request shape:**
200
+ ```json
201
+ {
202
+ "session_id": "sess_<uuid>",
203
+ "type": "SESSION_CLOSE",
204
+ "attestation": { "session_hmac": "...", "sequence": 5, "app_version_hash": "..." },
205
+ "metadata": {
206
+ "provider": "codex",
207
+ "model_engine": "gpt-5.5",
208
+ "agent_role": "frontend",
209
+ "agent_id": "frontend-3",
210
+ "domain_tags": {
211
+ "primary": { "domain": "react_frontend", "confidence": 0.4932 },
212
+ "secondary": { "domain": "vue_frontend", "confidence": 0.4729 },
213
+ "tertiary": { "domain": "typescript_node", "confidence": 0.4001 }
214
+ },
215
+ "session_embedding": {
216
+ "model": "sentence-transformers/all-MiniLM-L6-v2",
217
+ "vector": [0.0123, -0.0456, "...384 floats"],
218
+ "source_text": "frontend\nI'll scaffold the Vite game project..."
219
+ },
220
+ "session_quality": 80
221
+ },
222
+ "outcome": {
223
+ "status": "SUCCESS",
224
+ "session_quality": 80,
225
+ "quality_tier": "TIER_A",
226
+ "quality_tier_reason": "high_quality_no_errors",
227
+ "user_interventions": 0,
228
+ "total_steps": 45,
229
+ "total_chunks": 2,
230
+ "total_tokens": 32606,
231
+ "duration_seconds": 337,
232
+ "files_modified": 12,
233
+ "errors_encountered": 1,
234
+ "errors_recovered": 1,
235
+ "coordination_events": 0,
236
+ "training_eligible": true,
237
+ "training_exclusion_reason": null
238
+ }
239
+ }
240
+ ```
241
+
242
+ **After verification + storage:**
243
+ 1. Marks session as `closed` in registry
244
+ 2. Stitches all chunks for the session into a single trajectory
245
+ 3. Scores the stitched trajectory
246
+ 4. Credits the contributor in the ledger
247
+
248
+ #### 3c. USER_FEEDBACK Envelopes
249
+
250
+ **Request shape:**
251
+ ```json
252
+ {
253
+ "session_id": "sess_<uuid>",
254
+ "type": "USER_FEEDBACK",
255
+ "attestation": { "session_hmac": "...", "sequence": 6, "app_version_hash": "..." },
256
+ "feedback": {
257
+ "signal": "accepted|modified|rejected|iterated",
258
+ "timestamp": 1745794500.0,
259
+ "context": "session completed successfully with no user interventions",
260
+ "target_step": 45,
261
+ "revision_rounds": 0,
262
+ "delta_summary": null
263
+ }
264
+ }
265
+ ```
266
+
267
+ **Verification:** Same HMAC check but session can be active OR closed (feedback may arrive after close). No sequence increment for feedback.
268
+
269
+ ### 4. `POST /v1/embed` — Embedding Service
270
+
271
+ **Purpose:** Returns 384-dim sentence embedding for domain tagging and session fingerprinting.
272
+
273
+ **Request:**
274
+ ```json
275
+ {
276
+ "input": "React TypeScript frontend development",
277
+ "model": "sentence-transformers/all-MiniLM-L6-v2"
278
+ }
279
+ ```
280
+ - `input` — string, required, server truncates to 512 chars
281
+ - `model` — string, optional (only one model, ignore or validate)
282
+
283
+ **Response (200):**
284
+ ```json
285
+ {
286
+ "data": [
287
+ {
288
+ "embedding": [0.0123, -0.0456, "...384 floats"],
289
+ "index": 0
290
+ }
291
+ ],
292
+ "model": "sentence-transformers/all-MiniLM-L6-v2"
293
+ }
294
+ ```
295
+
296
+ **Client reads:** `data?.data?.[0]?.embedding` — must be `Array<number>`, length 384.
297
+
298
+ **Error responses:**
299
+ - `400` — missing `input`
300
+ - `503` — model not loaded yet (server startup)
301
+
302
+ **Health probe:** On init, the client sends `{ "input": "health check" }` with a 5s timeout. On a 200 response it switches to embedding mode; on any failure it silently falls back to keyword matching.
303
+
304
+ **Implementation:**
305
+ ```javascript
306
+ import { pipeline } from '@xenova/transformers';
307
+
308
+ let embedder = null;
309
+ let loading = false;
310
+
311
+ export async function initEmbedding() {
312
+ if (embedder || loading) return;
313
+ loading = true;
314
+ try {
315
+ embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
316
+ console.log('[embedding] model loaded');
317
+ } catch (err) {
318
+ console.error('[embedding] failed to load model:', err.message);
319
+ }
320
+ loading = false;
321
+ }
322
+
323
+ export async function embed(text) {
324
+ if (!embedder) throw new Error('Model not loaded');
325
+ const result = await embedder(text, { pooling: 'mean', normalize: true });
326
+ return Array.from(result.data);
327
+ }
328
+
329
+ export function isReady() {
330
+ return embedder !== null;
331
+ }
332
+ ```
333
+
334
+ - Model loads once at startup (~2-5s), ~200MB memory
335
+ - Inference: ~5-15ms per call on CPU
336
+ - `initEmbedding()` called at server start, non-blocking (don't await)
337
+
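The request and response contract for `/v1/embed` can be expressed as a small framework-agnostic handler. This is an illustrative sketch: `handleEmbed`, its injected `isReady`/`embed` dependencies, and the `{ error: ... }` body shape are assumptions, and the stub embedder stands in for the ONNX pipeline.

```javascript
// Returns { status, body } so the result can be wired into an Express route.
async function handleEmbed(reqBody, { isReady, embed }) {
  const input = reqBody && reqBody.input;
  if (typeof input !== 'string' || input.length === 0) {
    return { status: 400, body: { error: 'input is required' } };
  }
  if (!isReady()) {
    return { status: 503, body: { error: 'model not loaded' } };
  }
  // The spec says the server truncates input to 512 characters.
  const vector = await embed(input.slice(0, 512));
  return {
    status: 200,
    body: {
      data: [{ embedding: vector, index: 0 }],
      model: 'sentence-transformers/all-MiniLM-L6-v2',
    },
  };
}

// Stub dependencies for demonstration only: a zero vector of the right size.
const stubDeps = {
  isReady: () => true,
  embed: async () => new Array(384).fill(0),
};

const embedDemo = handleEmbed({ input: 'React TypeScript frontend development' }, stubDeps)
  .then((res) => {
    console.log(res.status, res.body.data[0].embedding.length);
    return res;
  });
```

In an Express app, the route would call `handleEmbed(req.body, { isReady, embed })` and forward the returned status and body to `res.status(...).json(...)`.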
338
+ ### 5. Stats Endpoints
339
+
340
+ | Endpoint | Method | Returns |
341
+ |----------|--------|---------|
342
+ | `/v1/stats/summary` | GET | `{ totalEnvelopes, totalSteps, totalSessions, activeSessions, uniqueContributors, storageSizeMb, totalPointsAwarded }` |
343
+ | `/v1/stats/daily?days=7` | GET | Array of `{ date, envelopes, steps, sessions, points }` |
344
+ | `/v1/stats/models` | GET | Object keyed by model name: `{ sessions, steps, points, percentage }` |
345
+ | `/v1/stats/providers` | GET | Object keyed by provider: `{ sessions, steps, points }` |
346
+ | `/v1/stats/leaderboard?limit=10` | GET | Array of `{ contributor_id (truncated), total_points, total_sessions }` |
347
+
348
+ ---
349
+
350
+ ## Data Storage
351
+
352
+ ### SQLite: `./data/sessions.db` (SessionRegistry)
353
+
354
+ **Table: `sessions`**
355
+ ```sql
356
+ CREATE TABLE sessions (
357
+ session_id TEXT PRIMARY KEY,
358
+ server_private_key TEXT NOT NULL,
359
+ server_public_key TEXT NOT NULL,
360
+ shared_secret TEXT NOT NULL,
361
+ client_public_key TEXT NOT NULL,
362
+ provider TEXT NOT NULL,
363
+ model TEXT NOT NULL,
364
+ machine_fingerprint TEXT NOT NULL,
365
+ app_version_hash TEXT NOT NULL,
366
+ groove_version TEXT NOT NULL,
367
+ expected_sequence INTEGER DEFAULT 0,
368
+ envelope_count INTEGER DEFAULT 0,
369
+ status TEXT DEFAULT 'active',
370
+ created_at TEXT NOT NULL,
371
+ closed_at TEXT
372
+ );
373
+ ```
374
+
375
+ **Table: `processed_envelopes`**
376
+ ```sql
377
+ CREATE TABLE processed_envelopes (
378
+ envelope_id TEXT PRIMARY KEY,
379
+ session_id TEXT NOT NULL,
380
+ processed_at TEXT NOT NULL
381
+ );
382
+ ```
383
+
384
+ - WAL mode, foreign keys ON
385
+ - `checkAndIncrementSequence()` runs in an IMMEDIATE transaction (atomic)
386
+ - Rate limit: max 20 sessions per `machine_fingerprint` per hour
387
+ - Data dir created with mode `0o700`
388
+
389
+ ### SQLite: `./data/ledger.db` (ContributorLedger)
390
+
391
+ **Table: `credits`**
392
+ ```sql
393
+ CREATE TABLE credits (
394
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
395
+ contributor_id TEXT NOT NULL,
396
+ session_id TEXT NOT NULL,
397
+ points REAL NOT NULL,
398
+ base_points INTEGER NOT NULL,
399
+ multiplier_breakdown TEXT NOT NULL,
400
+ created_at TEXT NOT NULL
401
+ );
402
+ ```
403
+
404
+ **Table: `balances`**
405
+ ```sql
406
+ CREATE TABLE balances (
407
+ contributor_id TEXT PRIMARY KEY,
408
+ total_points REAL NOT NULL DEFAULT 0,
409
+ total_sessions INTEGER NOT NULL DEFAULT 0,
410
+ last_credit_at TEXT,
411
+ trust_score REAL NOT NULL DEFAULT 1.0
412
+ );
413
+ ```
414
+
415
+ - Credits run in a transaction (insert credit + upsert balance)
416
+ - Trust score clamped 0-10
417
+
418
+ ### JSONL: `./data/envelopes/YYYY-MM-DD.jsonl` (EnvelopeStorage)
419
+
420
+ - One JSON object per line, one file per day
421
+ - Files created with mode `0o600`
422
+ - 50GB quota, warns at 40GB
423
+ - Reads parse all lines, filter by session_id for stitching
424
+
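The JSONL layout above amounts to an append-only log with one file per day. The sketch below shows one way to write and read it; `appendEnvelope` and `readSession` are hypothetical names, the UTC-based file date is an assumption, and quota tracking is left out.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Append one envelope as a single JSON line to the current day's file.
function appendEnvelope(dir, envelope, now = new Date()) {
  fs.mkdirSync(dir, { recursive: true, mode: 0o700 });
  const file = path.join(dir, `${now.toISOString().slice(0, 10)}.jsonl`);
  fs.appendFileSync(file, JSON.stringify(envelope) + '\n', { mode: 0o600 });
  return file;
}

// Parse every line of every daily file and keep the envelopes of one session.
function readSession(dir, sessionId) {
  const matches = [];
  const names = fs.readdirSync(dir).filter((n) => n.endsWith('.jsonl')).sort();
  for (const name of names) {
    const lines = fs.readFileSync(path.join(dir, name), 'utf8').split('\n');
    for (const line of lines) {
      if (!line.trim()) continue; // skip the trailing empty line
      const envelope = JSON.parse(line);
      if (envelope.session_id === sessionId) matches.push(envelope);
    }
  }
  return matches;
}

// Demo against a throwaway temp directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'envelopes-'));
appendEnvelope(demoDir, { session_id: 'sess_a', chunk_sequence: 0 });
appendEnvelope(demoDir, { session_id: 'sess_b', chunk_sequence: 0 });
appendEnvelope(demoDir, { session_id: 'sess_a', chunk_sequence: 1 });
console.log(readSession(demoDir, 'sess_a').length); // 2
```

A full scan per session is simple but grows linearly with stored data, which is presumably why the spec adds file-size and line-count caps in `CentralStats`.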
425
+ ---
426
+
427
+ ## Shared Modules (used by both client and server)
428
+
429
+ ### `shared/crypto.js` — ECDH + HMAC
430
+
431
+ ```javascript
432
+ import { createECDH, createHmac, createHash, randomBytes } from 'node:crypto';
433
+ import { readFileSync } from 'node:fs';
434
+
435
+ // Curve: prime256v1 (P-256)
436
+ generateECDHKeypair() → { publicKey: base64, privateKey: base64 }
437
+ deriveSharedSecret(priv, pub) → base64 shared secret
438
+ signEnvelope(secret, bytes, seq) → hex HMAC
439
+ verifyEnvelope(secret, bytes, seq, hmac) → boolean (constant-time)
440
+ computeAppHash(filePath) → sha256 hex of file contents
441
+ ```
442
+
443
+ **HMAC payload:** `uint32BE(sequence) || envelopeBytes` where `envelopeBytes` is `JSON.stringify(envelope)` with `attestation` field removed.
444
+
445
+ ### `shared/constants.js`
446
+
447
+ ```javascript
448
+ export const CURRENT_CONSENT_VERSION = '1.0';
449
+ export const CHUNK_SIZE = 200;
450
+ export const CHUNK_TIMEOUT_MS = 300_000; // 5 min
451
+ export const MAX_QUEUE_SIZE = 10_000;
452
+ export const OBSERVATION_TRUNCATE_HEAD = 50;
453
+ export const OBSERVATION_TRUNCATE_TAIL = 20;
454
+ export const SUPPORTED_PROVIDERS = ['claude-code', 'codex', 'gemini'];
455
+
456
+ export const MODEL_TIERS = {
457
+ 'claude-opus-4-6': 5,
458
+ 'claude-opus-4-7': 5,
459
+ 'claude-sonnet-4-6': 3,
460
+ 'gpt-4.5': 5,
461
+ 'gpt-5.5': 5,
462
+ 'o3': 5,
463
+ 'o4-mini': 2,
464
+ 'gemini-2.5-pro': 3,
465
+ 'gemini-2.5-flash': 1.5,
466
+ };
467
+
468
+ export const QUALITY_MULTIPLIERS = {
469
+ correction: 10,
470
+ coordination: 5,
471
+ errorRecovery: 3,
472
+ heavyTask: 2,
473
+ highQuality: 1.5,
474
+ };
475
+
476
+ export const OBSERVATION_TOKEN_LIMIT = 4096;
477
+ export const TIER_A_MIN_QUALITY = 70;
478
+ export const TIER_B_MIN_QUALITY = 50;
479
+ export const TRAINING_MIN_STEPS = 5;
480
+ export const TRAINING_MIN_TOKENS = 500;
481
+ export const TRAINING_MIN_DURATION = 10;
482
+ export const TRAINING_EXCLUSION_REASONS = ['too_few_steps', 'no_actions', 'no_observations', 'insufficient_tokens', 'too_short'];
483
+
484
+ export const USER_MESSAGE_MAX_CHARS = 2000;
485
+
486
+ export const CENTRAL_COMMAND_URL = process.env.GROOVE_CENTRAL_URL || 'https://api.groovedev.ai';
487
+ export const EMBEDDING_SERVICE_URL = process.env.EMBEDDING_SERVICE_URL || `${CENTRAL_COMMAND_URL}/v1/embed`;
488
+ ```
489
+
490
+ ### `shared/envelope-schema.js` — Validation
491
+
492
+ **Valid step types:** `thought`, `action`, `observation`, `correction`, `resolution`, `error`, `coordination`, `edit`, `instruction`, `clarification`, `approval`
493
+
494
+ **Valid quality tiers:** `TIER_A`, `TIER_B`, `TIER_C`
495
+
496
+ **Valid feedback signals:** `accepted`, `modified`, `rejected`, `iterated`
497
+
498
+ **Valid outcome statuses:** `SUCCESS`, `CRASH`, `KILLED`
499
+
500
+ **Valid complexities:** `light`, `medium`, `heavy`
501
+
502
+ **Limits:**
503
+ - Max 500 steps per envelope
504
+ - Max 10,000 chars per step content
505
+ - Max 100,000 token_count per step
506
+ - Max 50,000 step number
507
+ - Timestamps: within last 7 days, max 1 hour in future
508
+ - Sequence: 0 to 1,000,000
509
+ - contributor_id: 32-char hex
510
+ - session_hmac: 64-char hex
511
+ - app_version_hash: 64-char hex
512
+ - agent_role: max 50 chars
513
+ - agent_id: max 100 chars
514
+ - team_size: 1-50
515
+ - groove_version: max 20 chars
516
+
517
+ **Metadata fields validated:**
518
+ - `domain_tags` — object or null, with primary/secondary/tertiary each having domain (string) and confidence (0-1)
519
+ - `session_embedding` — object or null, with model (string), vector (non-empty array of finite numbers), source_text (string)
520
+ - `routing` — object or null, with leaf_id (string), routing_confidence (0-1), fallback_used (boolean), optional leaf_lifecycle_stage (string), optional parent_leaf_id (string or null)
521
+ - `leaf_context` — object or null, with leaf_id, leaf_version, confidence_at_route (0-1), chassis_model
522
+
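A subset of the limits above, applied to a CHUNK envelope, might look like the following. It is a sketch rather than the real `validateEnvelope()`: the `{ valid, errors }` return shape, the error labels, accepting upper-case hex, and treating timestamps as Unix seconds (as in the example payloads) are all assumptions, and metadata checks are omitted.

```javascript
const STEP_TYPES = new Set(['thought', 'action', 'observation', 'correction', 'resolution',
  'error', 'coordination', 'edit', 'instruction', 'clarification', 'approval']);

const isHex = (value, length) =>
  typeof value === 'string' && value.length === length && /^[0-9a-fA-F]+$/.test(value);

function validateEnvelope(envelope, nowSeconds = Date.now() / 1000) {
  const errors = [];
  const att = envelope.attestation || {};
  if (!isHex(envelope.contributor_id, 32)) errors.push('contributor_id');
  if (!isHex(att.session_hmac, 64)) errors.push('attestation.session_hmac');
  if (!isHex(att.app_version_hash, 64)) errors.push('attestation.app_version_hash');
  if (!Number.isInteger(att.sequence) || att.sequence < 0 || att.sequence > 1_000_000) {
    errors.push('attestation.sequence');
  }

  const steps = Array.isArray(envelope.trajectory_log) ? envelope.trajectory_log : [];
  if (steps.length > 500) errors.push('trajectory_log.length');
  steps.forEach((s, i) => {
    const at = `trajectory_log[${i}]`;
    if (!STEP_TYPES.has(s.type)) errors.push(`${at}.type`);
    if (!Number.isInteger(s.step) || s.step < 0 || s.step > 50_000) errors.push(`${at}.step`);
    if (typeof s.content === 'string' && s.content.length > 10_000) errors.push(`${at}.content`);
    if (s.token_count > 100_000) errors.push(`${at}.token_count`);
    // Timestamps must fall within the last 7 days and at most 1 hour ahead.
    const tooOld = s.timestamp < nowSeconds - 7 * 86_400;
    const tooNew = s.timestamp > nowSeconds + 3_600;
    if (typeof s.timestamp !== 'number' || tooOld || tooNew) errors.push(`${at}.timestamp`);
  });
  return { valid: errors.length === 0, errors };
}

const demoNow = 1745794300; // fixed clock so the example is reproducible
const goodEnvelope = {
  contributor_id: 'a'.repeat(32),
  attestation: { session_hmac: 'b'.repeat(64), sequence: 0, app_version_hash: 'c'.repeat(64) },
  trajectory_log: [
    { step: 1, type: 'thought', timestamp: 1745794234.567, content: 'hi', token_count: 2 },
  ],
};
const badEnvelope = {
  ...goodEnvelope,
  contributor_id: 'xyz',
  trajectory_log: [{ ...goodEnvelope.trajectory_log[0], type: 'bogus' }],
};
console.log(validateEnvelope(goodEnvelope, demoNow).valid); // true
console.log(validateEnvelope(badEnvelope, demoNow).errors);
```

Collecting every violation instead of stopping at the first makes rejected envelopes easier to debug on the client side.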
523
+ ---
524
+
525
+ ## Server Components (detailed)
526
+
527
+ ### TrajectoryStitcher
528
+
529
+ Stitches multi-chunk sessions into a single trajectory on SESSION_CLOSE.
530
+
531
+ ```javascript
532
+ stitch(sessionId):
533
+ 1. Get all envelopes for session from storage
534
+ 2. Separate chunks (non-SESSION_CLOSE) from close envelope
535
+ 3. Sort chunks by chunk_sequence
536
+ 4. Concatenate all trajectory_log steps, sort by step number
537
+ 5. Compute: unique tools used, step type distribution, total tokens
538
+ 6. Return {
539
+ session_id, contributor_id, metadata (from first chunk),
540
+ trajectory_log, outcome (from close envelope),
541
+ total_steps, total_tokens, unique_tools_used,
542
+ step_type_distribution, total_chunks
543
+ }
544
+ ```
545
+
546
+ Also has `linkCoordination(trajectory)` — annotates coordination steps with partner info.
547
+
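The `stitch` steps listed above translate fairly directly into runnable JavaScript. The version below is a sketch that takes the session's envelopes as an argument instead of reading them from storage. Treating only envelopes that carry a `trajectory_log` array as chunks (so that `USER_FEEDBACK` envelopes are skipped) is an assumption, as is returning the tool list as an array.

```javascript
function stitch(sessionId, envelopes) {
  const own = envelopes.filter((e) => e.session_id === sessionId);
  const close = own.find((e) => e.type === 'SESSION_CLOSE') || null;
  const chunks = own
    .filter((e) => e.type !== 'SESSION_CLOSE' && Array.isArray(e.trajectory_log))
    .sort((a, b) => a.chunk_sequence - b.chunk_sequence);

  // Concatenate all steps, then order them by step number.
  const steps = chunks.flatMap((c) => c.trajectory_log).sort((a, b) => a.step - b.step);

  const tools = new Set();
  const distribution = {};
  let totalTokens = 0;
  for (const s of steps) {
    if (s.tool) tools.add(s.tool);
    distribution[s.type] = (distribution[s.type] || 0) + 1;
    totalTokens += s.token_count || 0;
  }

  const first = chunks[0] || {};
  return {
    session_id: sessionId,
    contributor_id: first.contributor_id ?? null,
    metadata: first.metadata ?? null, // taken from the first chunk
    trajectory_log: steps,
    outcome: close ? close.outcome : null, // taken from the close envelope
    total_steps: steps.length,
    total_tokens: totalTokens,
    unique_tools_used: [...tools],
    step_type_distribution: distribution,
    total_chunks: chunks.length,
  };
}

// Chunks deliberately arrive out of order.
const stitched = stitch('sess_demo', [
  { session_id: 'sess_demo', chunk_sequence: 1, contributor_id: 'c1', metadata: { provider: 'codex' },
    trajectory_log: [{ step: 3, type: 'observation', token_count: 7 }] },
  { session_id: 'sess_demo', type: 'SESSION_CLOSE', outcome: { status: 'SUCCESS' } },
  { session_id: 'sess_demo', chunk_sequence: 0, contributor_id: 'c1', metadata: { provider: 'codex' },
    trajectory_log: [
      { step: 1, type: 'thought', token_count: 45 },
      { step: 2, type: 'action', tool: 'command_execution', token_count: 12 },
    ] },
]);
console.log(stitched.total_steps, stitched.total_tokens, stitched.unique_tools_used);
```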
548
+ ### TrajectoryScorer
549
+
550
+ Scores stitched trajectories for contributor credits. All scoring is derived from the actual steps; client-reported values are never trusted.
551
+
552
+ ```javascript
553
+ score(stitchedTrajectory):
554
+ - basePoints = min(steps.length, 5000)
555
+ - modelMultiplier = MODEL_TIERS[model_engine] || 1
556
+ - correctionBonus = min(correctionSteps, 30% of trajectory) * 10
557
+ - coordinationBonus = min(coordSteps, 20% of trajectory) * 5
558
+ - errorRecoveryBonus = min(errorsRecovered, errorSteps) * 3
559
+ - complexityBonus = (heavy task) ? basePoints * 1 : 0
560
+ - subtotal = (basePoints * modelMultiplier) + all bonuses
561
+ - qualityBonus = (hasResolution && 5-5000 steps) ? floor(subtotal * 0.1) : 0
562
+ - totalPoints = subtotal + qualityBonus
563
+ ```
564
+
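To make the scoring formula concrete, here is a runnable sketch with a worked example. Several details are assumptions where the pseudocode is ambiguous: the percentage caps are rounded down, `errorsRecovered` is read from the close envelope's `outcome` and capped by the counted `error` steps, "heavy task" means `metadata.task_complexity === 'heavy'`, and only two entries of `MODEL_TIERS` are reproduced.

```javascript
const MODEL_TIERS = { 'gpt-5.5': 5, 'claude-sonnet-4-6': 3 }; // subset of the spec's table

function score(trajectory) {
  const steps = trajectory.trajectory_log;
  const n = steps.length;
  const count = (type) => steps.filter((s) => s.type === type).length;

  const basePoints = Math.min(n, 5000);
  const modelMultiplier = MODEL_TIERS[trajectory.metadata.model_engine] || 1;
  const correctionBonus = Math.min(count('correction'), Math.floor(n * 0.3)) * 10;
  const coordinationBonus = Math.min(count('coordination'), Math.floor(n * 0.2)) * 5;
  const errorsRecovered = (trajectory.outcome && trajectory.outcome.errors_recovered) || 0;
  const errorRecoveryBonus = Math.min(errorsRecovered, count('error')) * 3;
  const complexityBonus = trajectory.metadata.task_complexity === 'heavy' ? basePoints : 0;

  const subtotal = basePoints * modelMultiplier
    + correctionBonus + coordinationBonus + errorRecoveryBonus + complexityBonus;
  const hasResolution = count('resolution') > 0;
  const qualityBonus = hasResolution && n >= 5 && n <= 5000 ? Math.floor(subtotal * 0.1) : 0;

  return { basePoints, modelMultiplier, subtotal, qualityBonus, totalPoints: subtotal + qualityBonus };
}

// Worked example: 10 steps (6 thought, 2 correction, 1 error, 1 resolution).
const stepTypes = ['thought', 'thought', 'correction', 'error', 'thought',
  'correction', 'thought', 'thought', 'thought', 'resolution'];
const scored = score({
  metadata: { model_engine: 'gpt-5.5', task_complexity: 'medium' },
  outcome: { errors_recovered: 1 },
  trajectory_log: stepTypes.map((type, i) => ({ step: i + 1, type })),
});
// base 10 * 5 = 50, corrections 2 * 10 = 20, recovery 1 * 3 = 3 -> subtotal 73
// quality bonus floor(73 * 0.1) = 7 -> total 80
console.log(scored.totalPoints); // 80
```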
565
+ ### EnrichmentPipeline
566
+
567
+ Currently a stub — returns trajectory with `enrichment: { cognitive_target: 'pending', model_verified: 'pending', quality_assessment: 'pending' }`. Placeholder for future LLM-as-a-Judge enrichment.
568
+
569
+ ### CentralStats
570
+
571
+ Aggregation layer over storage + ledger + registry.
572
+
573
+ - `summary()` — total envelopes, steps, sessions, active sessions, unique contributors, storage size, total points
574
+ - `dailyGrowth(days)` — per-day envelope/step/session/point counts
575
+ - `modelBreakdown()` — per-model session/step counts with percentage
576
+ - `providerBreakdown()` — per-provider session/step counts
577
+ - `topContributors(limit)` — leaderboard (contributor_id truncated to 8 chars)
578
+ - File size safety: skips files > 100MB, caps at 100K lines per file
579
+
580
+ ---
581
+
582
+ ## Rate Limiting
583
+
584
+ ### Per-IP (middleware)
585
+ - 100 requests per minute per IP
586
+ - 1000 requests per hour per IP
587
+ - IP from `X-Forwarded-For` header (first entry) or `req.ip`
588
+ - Returns 429 with `retryAfter` seconds
589
+ - Stale entries cleaned every 5 minutes (>120s since minute window AND >7200s since hour window)
590
+
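The per-IP limits can be implemented with an in-memory `Map`, as the spec indicates. The sketch below uses fixed windows, which is an assumption since the spec does not say whether windows are fixed or sliding; `createRateLimiter` and the `{ allowed, retryAfter }` result shape are hypothetical.

```javascript
function createRateLimiter(limits = { perMinute: 100, perHour: 1000 }) {
  const buckets = new Map(); // ip -> { minuteStart, minuteCount, hourStart, hourCount }

  function check(ip, nowMs = Date.now()) {
    let b = buckets.get(ip);
    if (!b) {
      b = { minuteStart: nowMs, minuteCount: 0, hourStart: nowMs, hourCount: 0 };
      buckets.set(ip, b);
    }
    // Start a fresh window once the previous one has elapsed.
    if (nowMs - b.minuteStart >= 60_000) { b.minuteStart = nowMs; b.minuteCount = 0; }
    if (nowMs - b.hourStart >= 3_600_000) { b.hourStart = nowMs; b.hourCount = 0; }

    if (b.minuteCount >= limits.perMinute) {
      return { allowed: false, retryAfter: Math.ceil((b.minuteStart + 60_000 - nowMs) / 1000) };
    }
    if (b.hourCount >= limits.perHour) {
      return { allowed: false, retryAfter: Math.ceil((b.hourStart + 3_600_000 - nowMs) / 1000) };
    }
    b.minuteCount += 1;
    b.hourCount += 1;
    return { allowed: true };
  }

  // Drop entries that are stale in both windows (>120s AND >7200s).
  function cleanup(nowMs = Date.now()) {
    for (const [ip, b] of buckets) {
      if (nowMs - b.minuteStart > 120_000 && nowMs - b.hourStart > 7_200_000) buckets.delete(ip);
    }
  }

  return { check, cleanup, size: () => buckets.size };
}

const limiter = createRateLimiter();
const t0 = 1_000_000; // fixed clock for a reproducible demo
let allowed = 0;
for (let i = 0; i < 100; i++) {
  if (limiter.check('203.0.113.7', t0).allowed) allowed += 1;
}
const blocked = limiter.check('203.0.113.7', t0);
console.log(allowed, blocked); // 100 { allowed: false, retryAfter: 60 }
```

In middleware, a denied result would map to a 429 response carrying `retryAfter`, and `cleanup` would run on a five-minute `setInterval`.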
591
+ ### Per-Machine (session open)
592
+ - Max 20 sessions per `machine_fingerprint` per hour
593
+ - Returns 429 on session open attempt
594
+
595
+ ### Per-Session (envelope ingest)
596
+ - Max 200 envelopes per session
597
+ - Returns 429 on exceeding
598
+
599
+ ---
600
+
601
+ ## Verification Flow (exact implementation)
602
+
603
+ Both `verify()` and `verifyClose()` follow the same pattern:
604
+
605
+ ```
606
+ 1. Check session_id exists
607
+ 2. Check session status is 'active'
608
+ 3. Check attestation block exists
609
+ 4. Check session_hmac is non-empty string
610
+ 5. Reconstruct HMAC:
611
+ a. Copy envelope, delete attestation key
612
+ b. JSON.stringify the copy
613
+ c. verifyEnvelope(shared_secret, jsonBytes, sequence, hmac)
614
+ 6. Atomic sequence check + increment
615
+ 7. Schema validation via validateEnvelope()
616
+ ```
617
+
618
+ `verifyFeedback()` is the same except:
619
+ - Session can be active OR closed (no status check)
620
+ - No sequence check/increment
621
+
622
+ `verifyClose()` additionally calls `sessionRegistry.closeSession(sessionId)` after validation.
623
+
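The ordering of these checks matters: the HMAC is verified before the sequence counter is touched, so a forged envelope cannot advance the counter. The sketch below illustrates that ordering with an in-memory `Map` standing in for the SQLite-backed registry. The `verify` options, the `reason` strings, and the use of `crypto.timingSafeEqual` in place of the manual XOR loop are assumptions; schema validation is omitted.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

function hmacHex(secretB64, bytes, sequence) {
  const seqBuf = Buffer.alloc(4);
  seqBuf.writeUInt32BE(sequence);
  return createHmac('sha256', Buffer.from(secretB64, 'base64'))
    .update(Buffer.concat([seqBuf, Buffer.from(bytes)]))
    .digest('hex');
}

const sessions = new Map(); // stand-in for SessionRegistry

function verify(envelope, { allowClosed = false, checkSequence = true } = {}) {
  const session = sessions.get(envelope.session_id);
  if (!session) return { ok: false, reason: 'unknown_session' };
  if (!allowClosed && session.status !== 'active') return { ok: false, reason: 'session_not_active' };

  const att = envelope.attestation;
  if (!att) return { ok: false, reason: 'missing_attestation' };
  if (typeof att.session_hmac !== 'string' || att.session_hmac === '') {
    return { ok: false, reason: 'missing_hmac' };
  }
  if (!Number.isInteger(att.sequence) || att.sequence < 0 || att.sequence > 0xffffffff) {
    return { ok: false, reason: 'bad_sequence' };
  }

  // Rebuild the signed bytes: the envelope without its attestation block.
  const { attestation, ...rest } = envelope;
  const expected = Buffer.from(hmacHex(session.shared_secret, JSON.stringify(rest), att.sequence));
  const actual = Buffer.from(att.session_hmac);
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return { ok: false, reason: 'bad_hmac' };
  }

  // Synchronous check + increment; the real server uses an IMMEDIATE transaction.
  if (checkSequence) {
    if (att.sequence !== session.expected_sequence) return { ok: false, reason: 'bad_sequence' };
    session.expected_sequence += 1;
  }
  return { ok: true };
}

const demoSecret = Buffer.from('demo-secret').toString('base64'); // made-up value
sessions.set('sess_demo', { shared_secret: demoSecret, status: 'active', expected_sequence: 0 });

const unsigned = { session_id: 'sess_demo', chunk_sequence: 0 };
const signed = {
  ...unsigned,
  attestation: { session_hmac: hmacHex(demoSecret, JSON.stringify(unsigned), 0), sequence: 0 },
};
const firstResult = verify(signed);
const replayResult = verify(signed);
const tamperedResult = verify({ ...signed, chunk_sequence: 9 });
console.log(firstResult, replayResult, tamperedResult);
```

Calling `verify(envelope, { allowClosed: true, checkSequence: false })` corresponds to the relaxed `verifyFeedback()` path described above.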
624
+ ---
625
+
626
+ ## Deployment Notes
627
+
628
+ ### Install
629
+ ```bash
630
+ npm install express better-sqlite3 uuid @xenova/transformers
631
+ ```
632
+
633
+ ### Environment Variables
634
+ ```bash
635
+ GROOVE_CENTRAL_PORT=8443 # server port (default 8443)
636
+ ```
637
+
638
+ ### Data Directories (auto-created)
639
+ ```
640
+ ./data/sessions.db # session registry
641
+ ./data/ledger.db # contributor ledger
642
+ ./data/envelopes/ # JSONL storage (YYYY-MM-DD.jsonl)
643
+ ```
644
+
645
+ All created with restrictive permissions (0o700 dirs, 0o600 files).
646
+
647
+ ### First Run
648
+ The ONNX model (`Xenova/all-MiniLM-L6-v2`) downloads from Hugging Face on first startup. ~80MB download, cached locally after first run. Server starts accepting requests immediately — embedding endpoint returns 503 until model loads (~2-5s).
649
+
650
+ ### Verification Test
651
+ ```bash
652
+ # Health check
653
+ curl https://api.groovedev.ai/health
654
+ # Expected: {"status":"ok","uptime":...}
655
+
656
+ # Embedding test
657
+ curl -s -X POST https://api.groovedev.ai/v1/embed \
658
+ -H "Content-Type: application/json" \
659
+ -d '{"input": "React TypeScript frontend development"}' \
660
+ | python3 -c "import sys,json; d=json.load(sys.stdin); print(f'Vector length: {len(d[\"data\"][0][\"embedding\"])}')"
661
+ # Expected: Vector length: 384
662
+
663
+ # Stats
664
+ curl https://api.groovedev.ai/v1/stats/summary
665
+ ```
666
+
667
+ ---
668
+
669
+ ## Critical Contract: What the Client Sends
670
+
671
+ The Groove client (`moe-training/client/`) sends to these exact URLs:
672
+
673
+ | Client Component | Endpoint | When |
674
+ |-----------------|----------|------|
675
+ | `SessionAttestation.openSession()` | `POST /v1/sessions/open` | Agent spawn |
676
+ | `TransmissionQueue._drain()` | `POST /v1/training/ingest` | Every chunk + session close + feedback |
677
+ | `SessionAttestation.closeSession()` | `POST /v1/sessions/close` | Agent complete/crash |
678
+ | `DomainTagger.init()` | `POST /v1/embed` | DomainTagger init (health probe) |
679
+ | `DomainTagger._embed()` | `POST /v1/embed` | Domain tagging + session embedding |
680
+ | `TrajectoryCapture._retryOfflineQueue()` | `GET /health` | Every 60s when offline queue has items |
681
+
682
+ Base URL: `https://api.groovedev.ai` (from `CENTRAL_COMMAND_URL` constant).
683
+ Embedding URL: `https://api.groovedev.ai/v1/embed` (from `EMBEDDING_SERVICE_URL` constant, defaults to `CENTRAL_COMMAND_URL + /v1/embed`).
684
+
685
+ All requests use `AbortSignal.timeout()`:
686
+ - Session open/close: 10s
687
+ - Ingest: 30s (with 5 retries, exponential backoff up to 60s)
688
+ - Embed: 10s (regular calls), 5s (init health probe)
689
+ - Health check: 5s
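The ingest timeout and retry policy can be sketched as follows. Only the 30s timeout, the 5 retries, and the 60s cap come from the spec; the 1s base delay, the doubling factor, retrying on any non-2xx response, and the injectable `fetchImpl`/`sleep` parameters are assumptions made so the example can run without a network.

```javascript
// Exponential backoff in milliseconds, capped at 60s.
function backoffDelay(attempt, baseMs = 1000, capMs = 60_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// POST JSON with a per-attempt timeout; retry on failure up to `retries` times.
async function postWithRetry(url, body, options = {}) {
  const { timeoutMs = 30_000, retries = 5, fetchImpl = fetch, sleep = defaultSleep } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const res = await fetchImpl(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(timeoutMs),
      });
      if (res.ok) return res;
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network error or timeout abort
    }
    if (attempt < retries) await sleep(backoffDelay(attempt));
  }
  throw lastError;
}

// Demo with a fake fetch that fails twice, then succeeds; sleeps are recorded, not awaited.
const recordedDelays = [];
let fetchCalls = 0;
const fakeFetch = async () => {
  fetchCalls += 1;
  if (fetchCalls < 3) throw new Error('network down');
  return { ok: true, status: 200 };
};
const retryDemo = postWithRetry('https://example.invalid/v1/training/ingest', { demo: true }, {
  fetchImpl: fakeFetch,
  sleep: async (ms) => { recordedDelays.push(ms); },
}).then((res) => {
  console.log(res.status, recordedDelays);
  return res;
});
```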