lynkr 8.0.0 → 9.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (128)
  1. package/.lynkr/telemetry.db +0 -0
  2. package/.lynkr/telemetry.db-shm +0 -0
  3. package/.lynkr/telemetry.db-wal +0 -0
  4. package/README.md +196 -322
  5. package/lynkr-skill.tar.gz +0 -0
  6. package/package.json +4 -3
  7. package/src/api/openai-router.js +64 -13
  8. package/src/api/providers-handler.js +171 -3
  9. package/src/api/router.js +9 -2
  10. package/src/clients/circuit-breaker.js +10 -247
  11. package/src/clients/codex-process.js +342 -0
  12. package/src/clients/codex-utils.js +143 -0
  13. package/src/clients/databricks.js +210 -63
  14. package/src/clients/resilience.js +540 -0
  15. package/src/clients/retry.js +22 -167
  16. package/src/clients/standard-tools.js +23 -0
  17. package/src/config/index.js +77 -0
  18. package/src/context/compression.js +42 -9
  19. package/src/context/distill.js +492 -0
  20. package/src/orchestrator/index.js +48 -8
  21. package/src/routing/complexity-analyzer.js +258 -5
  22. package/src/routing/index.js +12 -2
  23. package/src/routing/latency-tracker.js +148 -0
  24. package/src/routing/model-tiers.js +2 -0
  25. package/src/routing/quality-scorer.js +113 -0
  26. package/src/routing/telemetry.js +464 -0
  27. package/src/server.js +13 -12
  28. package/src/tools/code-graph.js +538 -0
  29. package/src/tools/code-mode.js +304 -0
  30. package/src/tools/index.js +4 -0
  31. package/src/tools/lazy-loader.js +18 -0
  32. package/src/tools/mcp-remote.js +7 -0
  33. package/src/tools/smart-selection.js +11 -0
  34. package/src/tools/tinyfish.js +358 -0
  35. package/src/tools/truncate.js +1 -0
  36. package/src/utils/payload.js +206 -0
  37. package/src/utils/perf-timer.js +80 -0
  38. package/.github/FUNDING.yml +0 -15
  39. package/.github/workflows/README.md +0 -215
  40. package/.github/workflows/ci.yml +0 -69
  41. package/.github/workflows/index.yml +0 -62
  42. package/.github/workflows/web-tools-tests.yml +0 -56
  43. package/CITATIONS.bib +0 -6
  44. package/DEPLOYMENT.md +0 -1001
  45. package/LYNKR-TUI-PLAN.md +0 -984
  46. package/PERFORMANCE-REPORT.md +0 -866
  47. package/PLAN-per-client-model-routing.md +0 -252
  48. package/docs/42642f749da6234f41b6b425c3bb07c9.txt +0 -1
  49. package/docs/BingSiteAuth.xml +0 -4
  50. package/docs/docs-style.css +0 -478
  51. package/docs/docs.html +0 -198
  52. package/docs/google5be250e608e6da39.html +0 -1
  53. package/docs/index.html +0 -577
  54. package/docs/index.md +0 -584
  55. package/docs/robots.txt +0 -4
  56. package/docs/sitemap.xml +0 -44
  57. package/docs/style.css +0 -1223
  58. package/docs/toon-integration-spec.md +0 -130
  59. package/documentation/README.md +0 -101
  60. package/documentation/api.md +0 -806
  61. package/documentation/claude-code-cli.md +0 -679
  62. package/documentation/codex-cli.md +0 -397
  63. package/documentation/contributing.md +0 -571
  64. package/documentation/cursor-integration.md +0 -734
  65. package/documentation/docker.md +0 -874
  66. package/documentation/embeddings.md +0 -762
  67. package/documentation/faq.md +0 -713
  68. package/documentation/features.md +0 -403
  69. package/documentation/headroom.md +0 -519
  70. package/documentation/installation.md +0 -758
  71. package/documentation/memory-system.md +0 -476
  72. package/documentation/production.md +0 -636
  73. package/documentation/providers.md +0 -1009
  74. package/documentation/routing.md +0 -476
  75. package/documentation/testing.md +0 -629
  76. package/documentation/token-optimization.md +0 -325
  77. package/documentation/tools.md +0 -697
  78. package/documentation/troubleshooting.md +0 -969
  79. package/final-test.js +0 -33
  80. package/headroom-sidecar/config.py +0 -93
  81. package/headroom-sidecar/requirements.txt +0 -14
  82. package/headroom-sidecar/server.py +0 -451
  83. package/monitor-agents.sh +0 -31
  84. package/scripts/audit-log-reader.js +0 -399
  85. package/scripts/compact-dictionary.js +0 -204
  86. package/scripts/test-deduplication.js +0 -448
  87. package/src/db/database.sqlite +0 -0
  88. package/te +0 -11622
  89. package/test/README.md +0 -212
  90. package/test/azure-openai-config.test.js +0 -213
  91. package/test/azure-openai-error-resilience.test.js +0 -238
  92. package/test/azure-openai-format-conversion.test.js +0 -354
  93. package/test/azure-openai-integration.test.js +0 -287
  94. package/test/azure-openai-routing.test.js +0 -175
  95. package/test/azure-openai-streaming.test.js +0 -171
  96. package/test/bedrock-integration.test.js +0 -457
  97. package/test/comprehensive-test-suite.js +0 -928
  98. package/test/config-validation.test.js +0 -207
  99. package/test/cursor-integration.test.js +0 -484
  100. package/test/format-conversion.test.js +0 -578
  101. package/test/hybrid-routing-integration.test.js +0 -269
  102. package/test/hybrid-routing-performance.test.js +0 -428
  103. package/test/llamacpp-integration.test.js +0 -882
  104. package/test/lmstudio-integration.test.js +0 -347
  105. package/test/memory/extractor.test.js +0 -398
  106. package/test/memory/retriever.test.js +0 -613
  107. package/test/memory/retriever.test.js.bak +0 -585
  108. package/test/memory/search.test.js +0 -537
  109. package/test/memory/search.test.js.bak +0 -389
  110. package/test/memory/store.test.js +0 -344
  111. package/test/memory/store.test.js.bak +0 -312
  112. package/test/memory/surprise.test.js +0 -300
  113. package/test/memory-performance.test.js +0 -472
  114. package/test/openai-integration.test.js +0 -683
  115. package/test/openrouter-error-resilience.test.js +0 -418
  116. package/test/passthrough-mode.test.js +0 -385
  117. package/test/performance-benchmark.js +0 -351
  118. package/test/performance-tests.js +0 -528
  119. package/test/routing.test.js +0 -225
  120. package/test/toon-compression.test.js +0 -131
  121. package/test/web-tools.test.js +0 -329
  122. package/test-agents-simple.js +0 -43
  123. package/test-cli-connection.sh +0 -33
  124. package/test-learning-unit.js +0 -126
  125. package/test-learning.js +0 -112
  126. package/test-parallel-agents.sh +0 -124
  127. package/test-parallel-direct.js +0 -155
  128. package/test-subagents.sh +0 -117
@@ -1,519 +0,0 @@
1
- # Headroom Context Compression
2
-
3
- Headroom is an intelligent context compression system that reduces LLM token usage by 47-92% while preserving semantic meaning. It runs as a Python sidecar container that Lynkr manages automatically via Docker.
4
-
5
- ---
6
-
7
- ## Overview
8
-
9
- ### What is Headroom?
10
-
11
- Headroom is a context optimization SDK that compresses LLM prompts and tool outputs using:
12
-
13
- 1. **Smart Crusher** - Statistical JSON compression based on field analysis
14
- 2. **Cache Aligner** - Stabilizes dynamic content (UUIDs, timestamps) for provider cache hits
15
- 3. **CCR (Compress-Cache-Retrieve)** - Reversible compression with on-demand retrieval
16
- 4. **Rolling Window** - Token budget enforcement with turn-based windowing
17
- 5. **LLMLingua** (optional) - ML-based 20x compression using BERT
18
-
19
- ### Benefits
20
-
21
- | Metric | Without Headroom | With Headroom |
22
- |--------|-----------------|---------------|
23
- | Token usage | 100% | 8-53% (47-92% reduction) |
24
- | Cache hit rate | ~20% | ~60-80% |
25
- | Cost per request | $0.01-0.05 | $0.002-0.02 |
26
- | Context overflow | Common | Rare |
27
-
28
- ---
29
-
30
- ## Quick Start
31
-
32
- ### 1. Enable Headroom
33
-
34
- Add to your `.env`:
35
-
36
- ```bash
37
- # Enable Headroom compression
38
- HEADROOM_ENABLED=true
39
- ```
40
-
41
- ### 2. Start Lynkr
42
-
43
- ```bash
44
- npm start
45
- ```
46
-
47
- Lynkr will automatically:
48
- 1. Pull the `lynkr/headroom-sidecar:latest` Docker image
49
- 2. Start the container with configured settings
50
- 3. Wait for health checks to pass
51
- 4. Begin compressing requests
52
-
53
- ### 3. Verify It's Working
54
-
55
- Check the health endpoint:
56
-
57
- ```bash
58
- curl http://localhost:8081/health/headroom
59
- ```
60
-
61
- Expected response:
62
- ```json
63
- {
64
- "enabled": true,
65
- "healthy": true,
66
- "service": {
67
- "available": true,
68
- "ccrEnabled": true,
69
- "llmlinguaEnabled": false
70
- },
71
- "docker": {
72
- "running": true,
73
- "status": "running",
74
- "health": "healthy"
75
- }
76
- }
77
- ```
78
-
79
- ---
80
-
81
- ## How It Works
82
-
83
- ### Transform Pipeline
84
-
85
- When a request arrives, Headroom processes it through a three-stage pipeline:
86
-
87
- ```
88
- Request → Cache Aligner → Smart Crusher → Context Manager → Compressed Request
89
- ↓ ↓ ↓
90
- Stabilize IDs Compress JSON Enforce budget
91
- ```
92
-
93
- ### 1. Cache Aligner
94
-
95
- **Problem**: Dynamic content like UUIDs and timestamps change every request, preventing provider cache hits.
96
-
97
- **Solution**: Replace dynamic values with stable placeholders:
98
-
99
- ```json
100
- // Before
101
- {"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479", "created": "2024-01-15T10:30:00Z"}
102
-
103
- // After
104
- {"id": "[ID:1]", "created": "[TS:1]"}
105
- ```
106
-
107
- **Result**: 60-80% cache hit rate instead of ~20%.
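Reviewer note: the placeholder substitution described in the removed Cache Aligner section can be sketched as follows. This is a minimal illustration, not Headroom's actual code: the function name `alignForCache` and both regular expressions are assumptions. Only the `[ID:n]` and `[TS:n]` placeholder shapes come from the removed text.

```javascript
// Hypothetical sketch of cache alignment; names and patterns are assumptions.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
const ISO_TS_RE = /\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})/g;

function alignForCache(text) {
  const ids = new Map();        // original UUID -> placeholder
  const timestamps = new Map(); // original timestamp -> placeholder

  // Reuse the same placeholder when a value repeats within one request.
  const placeholder = (table, tag, value) => {
    if (!table.has(value)) table.set(value, `[${tag}:${table.size + 1}]`);
    return table.get(value);
  };

  const aligned = text
    .replace(UUID_RE, (m) => placeholder(ids, "ID", m.toLowerCase()))
    .replace(ISO_TS_RE, (m) => placeholder(timestamps, "TS", m));

  // The maps are returned so a caller could restore the original values later.
  return { aligned, ids, timestamps };
}
```

Two requests that differ only in their UUIDs and timestamps produce the same `aligned` string, which is the property a provider-side prompt cache needs to register a hit.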
108
-
109
- ### 2. Smart Crusher
110
-
111
- **Problem**: Tool outputs often contain repetitive JSON with many similar items.
112
-
113
- **Solution**: Statistical analysis to identify and compress redundant fields:
114
-
115
- ```json
116
- // Before (100 search results, ~50KB)
117
- [
118
- {"title": "Result 1", "url": "...", "snippet": "...", "score": 0.95, ...},
119
- {"title": "Result 2", "url": "...", "snippet": "...", "score": 0.93, ...},
120
- // ... 98 more items
121
- ]
122
-
123
- // After (~5KB)
124
- {
125
- "_meta": {"compressed": true, "original_count": 100, "kept": 12},
126
- "items": [
127
- // Top 12 most relevant items with essential fields only
128
- ]
129
- }
130
- ```
131
-
132
- **Compression strategies**:
133
- - **High-variance fields**: Keep (they're informative)
134
- - **Low-variance fields**: Remove (they're redundant)
135
- - **Unique fields**: Keep first occurrence only
136
- - **Repetitive arrays**: Sample representative items
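Reviewer note: a rough sketch of the field-pruning and sampling idea from the removed Smart Crusher section. It simplifies heavily and is not Headroom's algorithm: "low variance" is approximated as "identical in every item", items are ranked by a numeric field assumed to be called `score`, and the function name `crushItems` is hypothetical. The `_meta` wrapper mirrors the shape shown in the removed example.

```javascript
// Hypothetical sketch: drop fields that never vary, keep the top-ranked items.
function crushItems(items, { maxItems = 12, rankBy = "score" } = {}) {
  if (!Array.isArray(items) || items.length <= maxItems) return items;

  // Collect the distinct serialized values seen for each field.
  const distinct = new Map();
  for (const item of items) {
    for (const [key, value] of Object.entries(item)) {
      if (!distinct.has(key)) distinct.set(key, new Set());
      distinct.get(key).add(JSON.stringify(value));
    }
  }

  // A field with a single distinct value carries no per-item information.
  const informative = [...distinct.keys()].filter((k) => distinct.get(k).size > 1);

  const kept = [...items]
    .sort((a, b) => (b[rankBy] ?? 0) - (a[rankBy] ?? 0))
    .slice(0, maxItems)
    .map((item) =>
      Object.fromEntries(informative.filter((k) => k in item).map((k) => [k, item[k]]))
    );

  return {
    _meta: { compressed: true, original_count: items.length, kept: kept.length },
    items: kept,
  };
}
```

A statistical implementation would presumably score fields by entropy or variance rather than the all-or-nothing test used here. The sketch only shows where such a score would plug in.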
137
-
138
- ### 3. CCR (Compress-Cache-Retrieve)
139
-
140
- **Problem**: Sometimes you need to retrieve compressed content later.
141
-
142
- **Solution**: Hash-based reversible compression:
143
-
144
- ```json
145
- // Compressed message
146
- {
147
- "content": "[CCR:abc123] 100 files found. Use ccr_retrieve to explore.",
148
- "ccr_available": true
149
- }
150
-
151
- // Tool definition injected
152
- {
153
- "name": "ccr_retrieve",
154
- "description": "Retrieve compressed content by hash",
155
- "input_schema": {
156
- "hash": "string",
157
- "query": "string (optional search within results)"
158
- }
159
- }
160
- ```
161
-
162
- When the LLM calls `ccr_retrieve`, Headroom returns the full original content.
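Reviewer note: the compress/retrieve round trip described in the removed CCR section could look roughly like this. It is a sketch under assumptions: the class name `CcrStore`, the SHA-256 truncation to 12 hex characters, the line-based `query` filter, and the extra `hash` field on the returned message are all illustrative. The 300-second default mirrors `HEADROOM_CCR_TTL` from the removed configuration block.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical in-memory CCR store with TTL-based expiry.
class CcrStore {
  constructor({ ttlSeconds = 300, now = () => Date.now() } = {}) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now; // injectable clock, useful for tests
    this.entries = new Map();
  }

  // Store the full content and return a short stand-in message.
  compress(content, summary) {
    const hash = createHash("sha256").update(content).digest("hex").slice(0, 12);
    this.entries.set(hash, { content, expiresAt: this.now() + this.ttlMs });
    return {
      content: `[CCR:${hash}] ${summary} Use ccr_retrieve to explore.`,
      ccr_available: true,
      hash, // convenience field for this sketch only
    };
  }

  // Return the original content, optionally filtered to lines containing `query`.
  retrieve(hash, query) {
    const entry = this.entries.get(hash);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(hash);
      return null;
    }
    if (!query) return entry.content;
    return entry.content
      .split("\n")
      .filter((line) => line.includes(query))
      .join("\n");
  }
}
```

An expired or unknown hash returns `null`, which matches the removed troubleshooting advice to check the TTL when retrieval fails.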
163
-
164
- ---
165
-
166
- ## Configuration
167
-
168
- ### Basic Settings
169
-
170
- ```bash
171
- # Enable/disable Headroom
172
- HEADROOM_ENABLED=true
173
-
174
- # Sidecar endpoint
175
- HEADROOM_ENDPOINT=http://localhost:8787
176
-
177
- # Request timeout (ms)
178
- HEADROOM_TIMEOUT_MS=5000
179
-
180
- # Skip compression for small requests (tokens)
181
- HEADROOM_MIN_TOKENS=500
182
-
183
- # Mode: "audit" (observe) or "optimize" (apply)
184
- HEADROOM_MODE=optimize
185
- ```
186
-
187
- ### Docker Settings
188
-
189
- ```bash
190
- # Enable automatic container management
191
- HEADROOM_DOCKER_ENABLED=true
192
-
193
- # Docker image
194
- HEADROOM_DOCKER_IMAGE=lynkr/headroom-sidecar:latest
195
-
196
- # Container name
197
- HEADROOM_DOCKER_CONTAINER_NAME=lynkr-headroom
198
-
199
- # Port mapping
200
- HEADROOM_DOCKER_PORT=8787
201
-
202
- # Resource limits
203
- HEADROOM_DOCKER_MEMORY_LIMIT=512m
204
- HEADROOM_DOCKER_CPU_LIMIT=1.0
205
-
206
- # Restart policy
207
- HEADROOM_DOCKER_RESTART_POLICY=unless-stopped
208
- ```
209
-
210
- ### Transform Settings
211
-
212
- ```bash
213
- # Smart Crusher (statistical JSON compression)
214
- HEADROOM_SMART_CRUSHER=true
215
- HEADROOM_SMART_CRUSHER_MIN_TOKENS=200
216
- HEADROOM_SMART_CRUSHER_MAX_ITEMS=15
217
-
218
- # Tool Crusher (fixed-rules compression)
219
- HEADROOM_TOOL_CRUSHER=true
220
-
221
- # Cache Aligner (stabilize dynamic content)
222
- HEADROOM_CACHE_ALIGNER=true
223
-
224
- # Rolling Window (context overflow management)
225
- HEADROOM_ROLLING_WINDOW=true
226
- HEADROOM_KEEP_TURNS=3
227
- ```
228
-
229
- ### CCR Settings
230
-
231
- ```bash
232
- # Enable CCR for reversible compression
233
- HEADROOM_CCR=true
234
-
235
- # Cache TTL in seconds
236
- HEADROOM_CCR_TTL=300
237
- ```
238
-
239
- ### LLMLingua Settings (Optional)
240
-
241
- LLMLingua provides ML-based compression using BERT token classification. Requires GPU for reasonable performance.
242
-
243
- ```bash
244
- # Enable LLMLingua (default: false)
245
- HEADROOM_LLMLINGUA=true
246
-
247
- # Device: cuda, cpu, auto
248
- HEADROOM_LLMLINGUA_DEVICE=cuda
249
- ```
250
-
251
- **Note**: LLMLingua adds 100-500ms latency per request. Only enable if you have a GPU and need maximum compression.
252
-
253
- ---
254
-
255
- ## API Endpoints
256
-
257
- ### Health Check
258
-
259
- ```bash
260
- GET /health/headroom
261
- ```
262
-
263
- Returns Headroom health status including container and service state.
264
-
265
- ### Compression Metrics
266
-
267
- ```bash
268
- GET /metrics/compression
269
- ```
270
-
271
- Returns compression statistics:
272
-
273
- ```json
274
- {
275
- "enabled": true,
276
- "endpoint": "http://localhost:8787",
277
- "client": {
278
- "totalCalls": 150,
279
- "successfulCompressions": 120,
280
- "skippedCompressions": 25,
281
- "failures": 5,
282
- "totalTokensSaved": 450000,
283
- "averageLatencyMs": 45,
284
- "compressionRate": 80,
285
- "failureRate": 3
286
- },
287
- "server": {
288
- "requests_total": 150,
289
- "compressions_applied": 120,
290
- "average_compression_ratio": 0.35,
291
- "ccr_retrievals": 45
292
- }
293
- }
294
- ```
295
-
296
- ### Detailed Status
297
-
298
- ```bash
299
- GET /headroom/status
300
- ```
301
-
302
- Returns full status including configuration, metrics, and recent logs.
303
-
304
- ### Container Restart
305
-
306
- ```bash
307
- POST /headroom/restart
308
- ```
309
-
310
- Restarts the Headroom container (useful for applying config changes).
311
-
312
- ### Container Logs
313
-
314
- ```bash
315
- GET /headroom/logs?tail=100
316
- ```
317
-
318
- Returns recent container logs for debugging.
319
-
320
- ---
321
-
322
- ## Monitoring
323
-
324
- ### Health Check Integration
325
-
326
- Headroom status is included in the `/health/ready` endpoint:
327
-
328
- ```json
329
- {
330
- "status": "ready",
331
- "checks": {
332
- "database": { "healthy": true },
333
- "memory": { "healthy": true },
334
- "headroom": {
335
- "healthy": true,
336
- "enabled": true,
337
- "service": "available",
338
- "docker": "running"
339
- }
340
- }
341
- }
342
- ```
343
-
344
- **Note**: Headroom is non-critical. If it fails, Lynkr continues without compression.
345
-
346
- ### Logging
347
-
348
- Headroom logs compression events:
349
-
350
- ```
351
- INFO: Headroom compression applied
352
- tokensBefore: 15000
353
- tokensAfter: 5200
354
- savingsPercent: 65.3
355
- latencyMs: 42
356
- transforms: ["cache_aligner", "smart_crusher"]
357
- ```
358
-
359
- ---
360
-
361
- ## Troubleshooting
362
-
363
- ### Container Won't Start
364
-
365
- **Check Docker is running:**
366
- ```bash
367
- docker ps
368
- ```
369
-
370
- **Check for port conflicts:**
371
- ```bash
372
- lsof -i :8787
373
- ```
374
-
375
- **View container logs:**
376
- ```bash
377
- curl http://localhost:8081/headroom/logs
378
- # or
379
- docker logs lynkr-headroom
380
- ```
381
-
382
- ### High Latency
383
-
384
- 1. **Reduce transforms**: Disable LLMLingua if not needed
385
- 2. **Increase resources**: Raise `HEADROOM_DOCKER_MEMORY_LIMIT`
386
- 3. **Skip small requests**: Increase `HEADROOM_MIN_TOKENS`
387
-
388
- ### Compression Not Applied
389
-
390
- Check:
391
- 1. `HEADROOM_ENABLED=true` in `.env`
392
- 2. Request has more than `HEADROOM_MIN_TOKENS` tokens
393
- 3. Health endpoint shows `healthy: true`
394
-
395
- ### CCR Retrieval Fails
396
-
397
- 1. Check `HEADROOM_CCR=true`
398
- 2. Verify TTL hasn't expired (`HEADROOM_CCR_TTL`)
399
- 3. Ensure same session is used (CCR is session-scoped)
400
-
401
- ---
402
-
403
- ## Architecture
404
-
405
- ### System Diagram
406
-
407
- ```
408
- ┌─────────────────────────────────────────────────────────────────┐
409
- │ Lynkr (Node.js) │
410
- │ ┌──────────────────────────────────────────────────────────┐ │
411
- │ │ Request Handler │ │
412
- │ │ ↓ │ │
413
- │ │ src/headroom/client.js ──HTTP──→ Headroom Sidecar │ │
414
- │ │ ↓ (Python Container) │ │
415
- │ │ Compressed Request │ │ │
416
- │ │ ↓ ↓ │ │
417
- │ │ LLM Provider ┌─────────────┐ │ │
418
- │ │ │ Transforms │ │ │
419
- │ └──────────────────────────────────│ - Aligner │─────────┘ │
420
- │ │ - Crusher │ │
421
- │ │ - CCR Store │ │
422
- │ │ - LLMLingua │ │
423
- │ └─────────────┘ │
424
- └─────────────────────────────────────────────────────────────────┘
425
- ```
426
-
427
- ### Request Flow
428
-
429
- 1. **Request arrives** at Lynkr
430
- 2. **Token estimation** - Skip if below `HEADROOM_MIN_TOKENS`
431
- 3. **Send to sidecar** - HTTP POST to `/compress`
432
- 4. **Transform pipeline** executes:
433
- - Cache Aligner stabilizes dynamic content
434
- - Smart Crusher compresses JSON structures
435
- - Context Manager enforces token budget
436
- 5. **Return compressed** messages and tools
437
- 6. **Forward to LLM** provider
438
- 7. **On CCR tool call** - Retrieve original content
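Reviewer note: steps 2, 3, and 5 of the removed request flow, together with the documented fail-open behavior, can be sketched as below. The function name `maybeCompress`, the characters-divided-by-four token estimate, and the request and response body shapes are assumptions. Only the `/compress` path, the `http://localhost:8787` default, the 500-token threshold, and the 5000 ms timeout come from the removed text. The sketch assumes a Node.js version with global `fetch` and `AbortSignal.timeout`.

```javascript
// Hypothetical sketch: skip small requests, call the sidecar, fall back on failure.
async function maybeCompress(payload, opts = {}) {
  const {
    endpoint = "http://localhost:8787",
    minTokens = 500,
    timeoutMs = 5000,
    // Crude estimate (about 4 characters per token); an assumption, not Lynkr's estimator.
    estimateTokens = (p) => Math.ceil(JSON.stringify(p).length / 4),
    fetchImpl = fetch,
  } = opts;

  if (estimateTokens(payload) < minTokens) {
    return { payload, compressed: false, reason: "below_min_tokens" };
  }

  try {
    const res = await fetchImpl(`${endpoint}/compress`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(payload),
      signal: AbortSignal.timeout(timeoutMs),
    });
    if (!res.ok) throw new Error(`sidecar responded ${res.status}`);
    return { payload: await res.json(), compressed: true };
  } catch (err) {
    // Compression is non-critical: forward the original payload unchanged.
    return { payload, compressed: false, reason: err.message };
  }
}
```

Injecting `fetchImpl` keeps the sketch testable without a running sidecar. The fallback branch reflects the removed note that Lynkr continues without compression when Headroom is unavailable.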
439
-
440
- ### File Structure
441
-
442
- ```
443
- src/headroom/
444
- ├── index.js # HeadroomManager singleton, exports
445
- ├── launcher.js # Docker container lifecycle (dockerode)
446
- ├── client.js # HTTP client for sidecar API
447
- └── health.js # Health check functionality
448
- ```
449
-
450
- ---
451
-
452
- ## Best Practices
453
-
454
- ### 1. Start with Defaults
455
-
456
- The default configuration is optimized for most use cases:
457
- - Smart Crusher: Enabled
458
- - Cache Aligner: Enabled
459
- - CCR: Enabled
460
- - LLMLingua: Disabled (enable only with GPU)
461
-
462
- ### 2. Monitor Compression Rates
463
-
464
- Check `/metrics/compression` regularly:
465
- - **Good**: 60-80% compression rate
466
- - **Warning**: Below 40% (check transform settings)
467
- - **Issue**: High failure rate (check container health)
468
-
469
- ### 3. Tune for Your Workload
470
-
471
- | Workload | Recommended Settings |
472
- |----------|---------------------|
473
- | Code assistance | `SMART_CRUSHER_MAX_ITEMS=20` |
474
- | Search-heavy | `SMART_CRUSHER_MAX_ITEMS=10`, CCR enabled |
475
- | Long conversations | `ROLLING_WINDOW=true`, `KEEP_TURNS=5` |
476
- | Cost-sensitive | Enable LLMLingua with GPU |
477
-
478
- ### 4. Use Audit Mode First
479
-
480
- Test compression without applying it:
481
-
482
- ```bash
483
- HEADROOM_MODE=audit
484
- ```
485
-
486
- This logs what would be compressed without modifying requests.
487
-
488
- ---
489
-
490
- ## FAQ
491
-
492
- ### Does Headroom affect response quality?
493
-
494
- Minimal impact. Smart Crusher preserves high-variance (informative) fields and CCR allows full retrieval when needed. LLMLingua may have ~1.5% quality reduction.
495
-
496
- ### Can I use Headroom without Docker?
497
-
498
- Yes. Disable Docker management and run the sidecar manually:
499
-
500
- ```bash
501
- HEADROOM_DOCKER_ENABLED=false
502
- HEADROOM_ENDPOINT=http://your-headroom-server:8787
503
- ```
504
-
505
- ### Is Headroom required?
506
-
507
- No. If Headroom fails or is disabled, Lynkr works normally without compression.
508
-
509
- ### What providers benefit most?
510
-
511
- All providers benefit from compression. Anthropic and OpenAI see additional benefits from Cache Aligner improving cache hit rates.
512
-
513
- ---
514
-
515
- ## References
516
-
517
- - [Headroom GitHub Repository](https://github.com/chopratejas/headroom)
518
- - [LLMLingua Paper](https://arxiv.org/abs/2310.05736)
519
- - [Anthropic Prompt Caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)