lynkr 7.2.5 → 8.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (124)
  1. package/README.md +3 -3
  2. package/config/model-tiers.json +89 -0
  3. package/install.sh +6 -1
  4. package/package.json +4 -2
  5. package/scripts/setup.js +0 -1
  6. package/src/agents/executor.js +14 -6
  7. package/src/api/middleware/session.js +15 -2
  8. package/src/api/openai-router.js +162 -37
  9. package/src/api/providers-handler.js +15 -1
  10. package/src/api/router.js +107 -2
  11. package/src/budget/index.js +4 -3
  12. package/src/clients/databricks.js +431 -234
  13. package/src/clients/gpt-utils.js +181 -0
  14. package/src/clients/ollama-utils.js +66 -140
  15. package/src/clients/routing.js +0 -1
  16. package/src/clients/standard-tools.js +99 -3
  17. package/src/config/index.js +133 -35
  18. package/src/context/toon.js +173 -0
  19. package/src/logger/index.js +23 -0
  20. package/src/orchestrator/index.js +688 -213
  21. package/src/routing/agentic-detector.js +320 -0
  22. package/src/routing/complexity-analyzer.js +202 -2
  23. package/src/routing/cost-optimizer.js +305 -0
  24. package/src/routing/index.js +168 -159
  25. package/src/routing/model-tiers.js +365 -0
  26. package/src/server.js +4 -14
  27. package/src/sessions/cleanup.js +3 -3
  28. package/src/sessions/record.js +10 -1
  29. package/src/sessions/store.js +7 -2
  30. package/src/tools/agent-task.js +48 -1
  31. package/src/tools/index.js +19 -2
  32. package/src/tools/lazy-loader.js +7 -0
  33. package/src/tools/tinyfish.js +358 -0
  34. package/src/tools/truncate.js +1 -0
  35. package/.github/FUNDING.yml +0 -15
  36. package/.github/workflows/README.md +0 -215
  37. package/.github/workflows/ci.yml +0 -69
  38. package/.github/workflows/index.yml +0 -62
  39. package/.github/workflows/web-tools-tests.yml +0 -56
  40. package/CITATIONS.bib +0 -6
  41. package/CLAWROUTER_ROUTING_PLAN.md +0 -910
  42. package/DEPLOYMENT.md +0 -1001
  43. package/LYNKR-TUI-PLAN.md +0 -984
  44. package/PERFORMANCE-REPORT.md +0 -866
  45. package/PLAN-per-client-model-routing.md +0 -252
  46. package/ROUTER_COMPARISON.md +0 -173
  47. package/TIER_ROUTING_PLAN.md +0 -771
  48. package/docs/42642f749da6234f41b6b425c3bb07c9.txt +0 -1
  49. package/docs/BingSiteAuth.xml +0 -4
  50. package/docs/docs-style.css +0 -478
  51. package/docs/docs.html +0 -197
  52. package/docs/google5be250e608e6da39.html +0 -1
  53. package/docs/index.html +0 -577
  54. package/docs/index.md +0 -577
  55. package/docs/robots.txt +0 -4
  56. package/docs/sitemap.xml +0 -44
  57. package/docs/style.css +0 -1223
  58. package/documentation/README.md +0 -100
  59. package/documentation/api.md +0 -806
  60. package/documentation/claude-code-cli.md +0 -672
  61. package/documentation/codex-cli.md +0 -397
  62. package/documentation/contributing.md +0 -571
  63. package/documentation/cursor-integration.md +0 -731
  64. package/documentation/docker.md +0 -867
  65. package/documentation/embeddings.md +0 -760
  66. package/documentation/faq.md +0 -659
  67. package/documentation/features.md +0 -396
  68. package/documentation/headroom.md +0 -519
  69. package/documentation/installation.md +0 -706
  70. package/documentation/memory-system.md +0 -476
  71. package/documentation/production.md +0 -601
  72. package/documentation/providers.md +0 -906
  73. package/documentation/testing.md +0 -629
  74. package/documentation/token-optimization.md +0 -323
  75. package/documentation/tools.md +0 -697
  76. package/documentation/troubleshooting.md +0 -893
  77. package/final-test.js +0 -33
  78. package/headroom-sidecar/config.py +0 -93
  79. package/headroom-sidecar/requirements.txt +0 -14
  80. package/headroom-sidecar/server.py +0 -451
  81. package/monitor-agents.sh +0 -31
  82. package/scripts/audit-log-reader.js +0 -399
  83. package/scripts/compact-dictionary.js +0 -204
  84. package/scripts/test-deduplication.js +0 -448
  85. package/src/db/database.sqlite +0 -0
  86. package/test/README.md +0 -212
  87. package/test/azure-openai-config.test.js +0 -204
  88. package/test/azure-openai-error-resilience.test.js +0 -238
  89. package/test/azure-openai-format-conversion.test.js +0 -354
  90. package/test/azure-openai-integration.test.js +0 -281
  91. package/test/azure-openai-routing.test.js +0 -177
  92. package/test/azure-openai-streaming.test.js +0 -171
  93. package/test/bedrock-integration.test.js +0 -471
  94. package/test/comprehensive-test-suite.js +0 -928
  95. package/test/config-validation.test.js +0 -207
  96. package/test/cursor-integration.test.js +0 -484
  97. package/test/format-conversion.test.js +0 -578
  98. package/test/hybrid-routing-integration.test.js +0 -254
  99. package/test/hybrid-routing-performance.test.js +0 -418
  100. package/test/llamacpp-integration.test.js +0 -863
  101. package/test/lmstudio-integration.test.js +0 -335
  102. package/test/memory/extractor.test.js +0 -398
  103. package/test/memory/retriever.test.js +0 -613
  104. package/test/memory/retriever.test.js.bak +0 -585
  105. package/test/memory/search.test.js +0 -537
  106. package/test/memory/search.test.js.bak +0 -389
  107. package/test/memory/store.test.js +0 -344
  108. package/test/memory/store.test.js.bak +0 -312
  109. package/test/memory/surprise.test.js +0 -300
  110. package/test/memory-performance.test.js +0 -472
  111. package/test/openai-integration.test.js +0 -686
  112. package/test/openrouter-error-resilience.test.js +0 -418
  113. package/test/passthrough-mode.test.js +0 -385
  114. package/test/performance-benchmark.js +0 -351
  115. package/test/performance-tests.js +0 -528
  116. package/test/routing.test.js +0 -219
  117. package/test/web-tools.test.js +0 -329
  118. package/test-agents-simple.js +0 -43
  119. package/test-cli-connection.sh +0 -33
  120. package/test-learning-unit.js +0 -126
  121. package/test-learning.js +0 -112
  122. package/test-parallel-agents.sh +0 -124
  123. package/test-parallel-direct.js +0 -155
  124. package/test-subagents.sh +0 -117
@@ -1,323 +0,0 @@
1
- # Token Optimization Guide
2
-
3
- Comprehensive guide to Lynkr's token optimization strategies that achieve 60-80% cost reduction.
4
-
5
- ---
6
-
7
- ## Overview
8
-
9
- Lynkr reduces token usage by **60-80%** through 6 intelligent optimization phases. At 100,000 requests/month, this translates to **$77k-$115k annual savings**.
10
-
11
- ---
12
-
13
- ## Cost Savings Breakdown
14
-
15
- ### Real-World Example
16
-
17
- **Scenario:** 100,000 requests/month, 50k input tokens, 2k output tokens per request
18
-
19
- | Provider | Without Lynkr | With Lynkr (60% savings) | Monthly Savings | Annual Savings |
20
- |----------|---------------|-------------------------|-----------------|----------------|
21
- | **Claude Sonnet 4.5** | $16,000 | $6,400 | **$9,600** | **$115,200** |
22
- | **GPT-4o** | $12,000 | $4,800 | **$7,200** | **$86,400** |
23
- | **Ollama (Local)** | API costs | **$0** | **$12,000+** | **$144,000+** |
24
-
25
- ---
26
-
27
- ## 6 Optimization Phases
28
-
29
- ### Phase 1: Smart Tool Selection (50-70% reduction)
30
-
31
- **Problem:** Sending all tools to every request wastes tokens.
32
-
33
- **Solution:** Intelligently filter tools based on request type.
34
-
35
- **How it works:**
36
- - **Chat queries** → Only send Read tool
37
- - **File operations** → Send Read, Write, Edit tools
38
- - **Git operations** → Send git_* tools
39
- - **Code execution** → Send Bash tool
40
-
41
- **Example:**
42
- ```
43
- Original: 30 tools × 150 tokens = 4,500 tokens
44
- Optimized: 3 tools × 150 tokens = 450 tokens
45
- Savings: 90% (4,050 tokens saved)
46
- ```
47
-
48
- **Configuration:**
49
- ```bash
50
- # Automatic - no configuration needed
51
- # Lynkr detects request type and filters tools
52
- ```
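The tool filtering described in the removed Phase 1 text can be sketched roughly as follows. This is an illustration only, not Lynkr's actual code: the `TOOLS_BY_REQUEST_TYPE` table, the `selectTools` and `estimateToolTokens` helpers, and the request-type labels are all hypothetical, and the 150-tokens-per-tool figure is taken from the example above. How a request gets classified is left out.

```javascript
// Assumed per-tool cost, taken from the example in the removed guide.
const TOKENS_PER_TOOL = 150;

// Hypothetical mapping from request type to the tools worth sending.
// A rule is either a list of tool names or a predicate on the tool name.
const TOOLS_BY_REQUEST_TYPE = {
  chat: ["Read"],
  file: ["Read", "Write", "Edit"],
  git: (name) => name.startsWith("git_"),
  exec: ["Bash"],
};

// Return only the tools relevant to the given request type.
// Unknown request types fall back to sending every tool.
function selectTools(requestType, allTools) {
  const rule = TOOLS_BY_REQUEST_TYPE[requestType];
  if (!rule) return allTools;
  if (typeof rule === "function") {
    return allTools.filter((tool) => rule(tool.name));
  }
  return allTools.filter((tool) => rule.includes(tool.name));
}

// Rough token estimate for a tool list under the flat-cost assumption.
function estimateToolTokens(tools) {
  return tools.length * TOKENS_PER_TOOL;
}
```

Under these assumptions, a 30-tool catalogue filtered for a file operation shrinks to three tools, which matches the 4,500 to 450 token example in the guide.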
53
-
54
- ---
55
-
56
- ### Phase 2: Prompt Caching (30-45% reduction)
57
-
58
- **Problem:** Repeated system prompts consume tokens.
59
-
60
- **Solution:** Cache and reuse prompts across requests.
61
-
62
- **How it works:**
63
- - SHA-256 hash of prompt
64
- - LRU cache with TTL (default: 5 minutes)
65
- - Cache hit = free tokens
66
-
67
- **Example:**
68
- ```
69
- First request: 2,000 token system prompt
70
- Subsequent requests: 0 tokens (cache hit)
71
- 10 requests: Save 18,000 tokens (90% reduction)
72
- ```
73
-
74
- **Configuration:**
75
- ```bash
76
- # Enable prompt caching (default: enabled)
77
- PROMPT_CACHE_ENABLED=true
78
-
79
- # Cache TTL in milliseconds (default: 300000 = 5 minutes)
80
- PROMPT_CACHE_TTL_MS=300000
81
-
82
- # Max cached entries (default: 64)
83
- PROMPT_CACHE_MAX_ENTRIES=64
84
- ```
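A minimal sketch of a SHA-256 keyed LRU cache with a TTL, as the removed Phase 2 text describes it. The `PromptCache` class and its method names are hypothetical and not taken from the package; the defaults mirror `PROMPT_CACHE_TTL_MS` and `PROMPT_CACHE_MAX_ENTRIES` above. The `now` option is an assumption added so expiry can be exercised without waiting.

```javascript
const { createHash } = require("node:crypto");

class PromptCache {
  constructor({ ttlMs = 300000, maxEntries = 64, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
    // A Map iterates in insertion order, so it doubles as the LRU order.
    this.entries = new Map();
  }

  // Cache key: hex-encoded SHA-256 digest of the prompt text.
  static keyFor(prompt) {
    return createHash("sha256").update(prompt).digest("hex");
  }

  get(prompt) {
    const key = PromptCache.keyFor(prompt);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark this entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(prompt, value) {
    const key = PromptCache.keyFor(prompt);
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in the Map).
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}
```

In this sketch a read refreshes an entry's recency but not its age, so an entry still expires `ttlMs` after it was stored.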
85
-
86
- ---
87
-
88
- ### Phase 3: Memory Deduplication (20-30% reduction)
89
-
90
- **Problem:** Duplicate memories inject redundant context.
91
-
92
- **Solution:** Deduplicate memories before injection.
93
-
94
- **How it works:**
95
- - Track last N memories injected
96
- - Skip if same memory was in last 5 requests
97
- - Only inject novel context
98
-
99
- **Example:**
100
- ```
101
- Original: 5 memories × 200 tokens × 10 requests = 10,000 tokens
102
- With dedup: 5 memories × 200 tokens + 3 new × 200 = 1,600 tokens
103
- Savings: 84% (8,400 tokens saved)
104
- ```
105
-
106
- **Configuration:**
107
- ```bash
108
- # Enable memory deduplication (default: enabled)
109
- MEMORY_DEDUP_ENABLED=true
110
-
111
- # Lookback window for dedup (default: 5)
112
- MEMORY_DEDUP_LOOKBACK=5
113
- ```
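The lookback-window deduplication described in the removed Phase 3 text might look roughly like this. The `MemoryDeduplicator` class is hypothetical, and memories are assumed to be identified by a stable ID; the default window mirrors `MEMORY_DEDUP_LOOKBACK=5`.

```javascript
class MemoryDeduplicator {
  constructor(lookback = 5) {
    this.lookback = lookback;
    // One Set of injected memory IDs per recent request, oldest first.
    this.recent = [];
  }

  // Return the IDs that were not injected in any of the last `lookback`
  // requests, and record them as injected for the current request.
  filter(memoryIds) {
    const seen = new Set(this.recent.flatMap((ids) => [...ids]));
    const novel = memoryIds.filter((id) => !seen.has(id));
    this.recent.push(new Set(novel));
    if (this.recent.length > this.lookback) {
      this.recent.shift(); // drop the request that fell out of the window
    }
    return novel;
  }
}
```

Only memories that were actually injected are recorded, so a memory becomes eligible again once the request that injected it leaves the window.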
114
-
115
- ---
116
-
117
- ### Phase 4: Tool Response Truncation (15-25% reduction)
118
-
119
- **Problem:** Long tool outputs (file contents, bash output) waste tokens.
120
-
121
- **Solution:** Intelligently truncate tool responses.
122
-
123
- **How it works:**
124
- - File Read: Limit to 2,000 lines
125
- - Bash output: Limit to 1,000 lines
126
- - Keep most relevant portions
127
- - Add truncation indicator
128
-
129
- **Example:**
130
- ```
131
- Original file read: 10,000 lines = 50,000 tokens
132
- Truncated: 2,000 lines = 10,000 tokens
133
- Savings: 80% (40,000 tokens saved)
134
- ```
135
-
136
- **Configuration:**
137
- ```bash
138
- # Automatic - no configuration needed
139
- # Built into Read and Bash tools
140
- ```
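A small sketch of line-based truncation with an indicator, under the limits the removed Phase 4 text gives (2,000 lines for file reads, 1,000 for Bash output). The `truncateLines` helper and the marker wording are hypothetical. The guide says the "most relevant portions" are kept without saying how; this sketch simply keeps the head of the output, which is an assumption.

```javascript
// Assumed limits, taken from the removed guide.
const MAX_READ_LINES = 2000;
const MAX_BASH_LINES = 1000;

// Keep the first `maxLines` lines and append a marker stating how many
// lines were dropped. Text within the limit is returned unchanged.
function truncateLines(text, maxLines) {
  const lines = text.split("\n");
  if (lines.length <= maxLines) return text;
  const omitted = lines.length - maxLines;
  const kept = lines.slice(0, maxLines).join("\n");
  return `${kept}\n[... ${omitted} lines truncated ...]`;
}
```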
141
-
142
- ---
143
-
144
- ### Phase 5: Dynamic System Prompts (10-20% reduction)
145
-
146
- **Problem:** Long system prompts for simple queries.
147
-
148
- **Solution:** Adapt prompt complexity to request type.
149
-
150
- **How it works:**
151
- - **Simple chat**: Minimal system prompt (500 tokens)
152
- - **File operations**: Medium prompt (1,000 tokens)
153
- - **Complex multi-tool**: Full prompt (2,000 tokens)
154
-
155
- **Example:**
156
- ```
157
- 10 simple queries with full prompt: 10 × 2,000 = 20,000 tokens
158
- 10 simple queries with minimal: 10 × 500 = 5,000 tokens
159
- Savings: 75% (15,000 tokens saved)
160
- ```
161
-
162
- **Configuration:**
163
- ```bash
164
- # Automatic - no configuration needed
165
- # Lynkr detects request complexity
166
- ```
167
-
168
- ---
169
-
170
- ### Phase 6: Conversation Compression (15-25% reduction)
171
-
172
- **Problem:** Long conversation history accumulates tokens.
173
-
174
- **Solution:** Compress old messages while keeping recent ones detailed.
175
-
176
- **How it works:**
177
- - Last 5 messages: Full detail
178
- - Messages 6-20: Summarized
179
- - Messages 21+: Archived (not sent)
180
-
181
- **Example:**
182
- ```
183
- 20-turn conversation without compression: 100,000 tokens
184
- With compression: 25,000 tokens (last 5 full + 15 summarized)
185
- Savings: 75% (75,000 tokens saved)
186
- ```
187
-
188
- **Configuration:**
189
- ```bash
190
- # Automatic - no configuration needed
191
- # Lynkr manages conversation history
192
- ```
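The three-tier history policy in the removed Phase 6 text (last 5 messages in full, messages 6-20 summarized, older ones not sent) can be sketched as below. The `compressHistory` function is hypothetical, and `summarize` is a caller-supplied stand-in; the guide does not say how summaries are produced.

```javascript
// messages: array of { role, content }, oldest first.
// summarize: function mapping a message's content to a shorter string.
function compressHistory(
  messages,
  summarize,
  { fullCount = 5, summarizedCount = 15 } = {}
) {
  const fullStart = Math.max(0, messages.length - fullCount);
  const summaryStart = Math.max(0, fullStart - summarizedCount);

  // Messages 6-20 (counting back from the newest) are summarized.
  const summarized = messages
    .slice(summaryStart, fullStart)
    .map((message) => ({ ...message, content: summarize(message.content) }));

  // The newest `fullCount` messages are passed through untouched;
  // anything older than `summaryStart` is dropped from the payload.
  return [...summarized, ...messages.slice(fullStart)];
}
```

For example, `compressHistory(history, (text) => text.slice(0, 200))` would use plain character truncation as a crude placeholder for summarization.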
193
-
194
- ---
195
-
196
- ## Combined Savings
197
-
198
- When all 6 phases work together:
199
-
200
- **Example Request Flow:**
201
-
202
- 1. **Original request**: 50,000 input tokens
203
- - System prompt: 2,000 tokens
204
- - Tools: 4,500 tokens (30 tools)
205
- - Memories: 1,000 tokens (5 memories)
206
- - Conversation: 20,000 tokens (20 messages)
207
- - User query: 22,500 tokens
208
-
209
- 2. **After optimization**: 28,150 input tokens
210
- - System prompt: 0 tokens (cache hit)
211
- - Tools: 450 tokens (3 relevant tools)
212
- - Memories: 200 tokens (deduplicated)
213
- - Conversation: 5,000 tokens (compressed)
214
- - User query: 22,500 tokens (same)
215
-
216
- 3. **Savings**: about 44% reduction (21,850 tokens saved)
217
-
218
- ---
219
-
220
- ## Monitoring Token Usage
221
-
222
- ### Real-Time Tracking
223
-
224
- ```bash
225
- # Check metrics endpoint
226
- curl http://localhost:8081/metrics | grep lynkr_tokens
227
-
228
- # Output:
229
- # lynkr_tokens_input_total{provider="databricks"} 1234567
230
- # lynkr_tokens_output_total{provider="databricks"} 234567
231
- # lynkr_tokens_cached_total 500000
232
- ```
233
-
234
- ### Per-Request Logging
235
-
236
- ```bash
237
- # Enable token logging
238
- LOG_LEVEL=info
239
-
240
- # Logs show:
241
- # {"level":"info","tokens":{"input":1250,"output":234,"cached":750}}
242
- ```
243
-
244
- ---
245
-
246
- ## Best Practices
247
-
248
- ### 1. Enable All Optimizations
249
-
250
- ```bash
251
- # All optimizations are enabled by default
252
- # No configuration needed
253
- ```
254
-
255
- ### 2. Use Hybrid Routing
256
-
257
- ```bash
258
- # Route simple requests to free Ollama
259
- PREFER_OLLAMA=true
260
- FALLBACK_ENABLED=true
261
-
262
- # Complex requests automatically go to cloud
263
- FALLBACK_PROVIDER=databricks
264
- ```
265
-
266
- ### 3. Monitor and Tune
267
-
268
- ```bash
269
- # Check cache hit rate
270
- curl http://localhost:8081/metrics | grep cache_hits
271
-
272
- # Adjust cache size if needed
273
- PROMPT_CACHE_MAX_ENTRIES=128 # Increase for more caching
274
- ```
275
-
276
- ---
277
-
278
- ## ROI Calculator
279
-
280
- Calculate your potential savings:
281
-
282
- **Formula:**
283
- ```
284
- Monthly Requests = 100,000
285
- Avg Input Tokens = 50,000
286
- Avg Output Tokens = 2,000
287
- Cost per 1M Input = $3.00
288
- Cost per 1M Output = $15.00
289
-
290
- Without Lynkr:
291
- Input Cost = (100,000 × 50,000 ÷ 1,000,000) × $3 = $15,000
292
- Output Cost = (100,000 × 2,000 ÷ 1,000,000) × $15 = $3,000
293
- Total = $18,000/month
294
-
295
- With Lynkr (60% savings):
296
- Total = $7,200/month
297
-
298
- Savings = $10,800/month = $129,600/year
299
- ```
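The formula above can be reproduced in a few lines. The `monthlyCost` and `projectedSavings` helpers are hypothetical names; the inputs are the example figures from the removed guide, and the reduction is applied to the whole bill, as the guide's example does.

```javascript
// Monthly cost from request volume, average token counts,
// and per-million-token prices (all in the same currency).
function monthlyCost({ requests, inputTokens, outputTokens, inputPricePerM, outputPricePerM }) {
  const input = ((requests * inputTokens) / 1e6) * inputPricePerM;
  const output = ((requests * outputTokens) / 1e6) * outputPricePerM;
  return { input, output, total: input + output };
}

// Apply a flat reduction (0.6 = 60%) to the total bill.
function projectedSavings(total, reduction) {
  const optimized = total * (1 - reduction);
  const monthly = total - optimized;
  return { optimized, monthly, annual: monthly * 12 };
}

const baseline = monthlyCost({
  requests: 100000,
  inputTokens: 50000,
  outputTokens: 2000,
  inputPricePerM: 3.0,
  outputPricePerM: 15.0,
});
const result = projectedSavings(baseline.total, 0.6);
console.log(baseline.total.toFixed(2), result.optimized.toFixed(2), result.annual.toFixed(2));
```

Substituting your own request volume, token averages, and provider prices fills in the "Your numbers" worksheet that follows.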
300
-
301
- **Your numbers:**
302
- - Monthly requests: _____
303
- - Avg input tokens: _____
304
- - Avg output tokens: _____
305
- - Provider cost: _____
306
-
307
- **Result:** $_____ saved per year
308
-
309
- ---
310
-
311
- ## Next Steps
312
-
313
- - **[Installation Guide](installation.md)** - Install Lynkr
314
- - **[Provider Configuration](providers.md)** - Configure providers
315
- - **[Production Guide](production.md)** - Deploy to production
316
- - **[FAQ](faq.md)** - Common questions
317
-
318
- ---
319
-
320
- ## Getting Help
321
-
322
- - **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Ask questions
323
- - **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report issues