lynkr 8.0.0 → 9.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (128)
  1. package/.lynkr/telemetry.db +0 -0
  2. package/.lynkr/telemetry.db-shm +0 -0
  3. package/.lynkr/telemetry.db-wal +0 -0
  4. package/README.md +196 -322
  5. package/lynkr-skill.tar.gz +0 -0
  6. package/package.json +4 -3
  7. package/src/api/openai-router.js +64 -13
  8. package/src/api/providers-handler.js +171 -3
  9. package/src/api/router.js +9 -2
  10. package/src/clients/circuit-breaker.js +10 -247
  11. package/src/clients/codex-process.js +342 -0
  12. package/src/clients/codex-utils.js +143 -0
  13. package/src/clients/databricks.js +210 -63
  14. package/src/clients/resilience.js +540 -0
  15. package/src/clients/retry.js +22 -167
  16. package/src/clients/standard-tools.js +23 -0
  17. package/src/config/index.js +77 -0
  18. package/src/context/compression.js +42 -9
  19. package/src/context/distill.js +492 -0
  20. package/src/orchestrator/index.js +48 -8
  21. package/src/routing/complexity-analyzer.js +258 -5
  22. package/src/routing/index.js +12 -2
  23. package/src/routing/latency-tracker.js +148 -0
  24. package/src/routing/model-tiers.js +2 -0
  25. package/src/routing/quality-scorer.js +113 -0
  26. package/src/routing/telemetry.js +464 -0
  27. package/src/server.js +13 -12
  28. package/src/tools/code-graph.js +538 -0
  29. package/src/tools/code-mode.js +304 -0
  30. package/src/tools/index.js +4 -0
  31. package/src/tools/lazy-loader.js +18 -0
  32. package/src/tools/mcp-remote.js +7 -0
  33. package/src/tools/smart-selection.js +11 -0
  34. package/src/tools/tinyfish.js +358 -0
  35. package/src/tools/truncate.js +1 -0
  36. package/src/utils/payload.js +206 -0
  37. package/src/utils/perf-timer.js +80 -0
  38. package/.github/FUNDING.yml +0 -15
  39. package/.github/workflows/README.md +0 -215
  40. package/.github/workflows/ci.yml +0 -69
  41. package/.github/workflows/index.yml +0 -62
  42. package/.github/workflows/web-tools-tests.yml +0 -56
  43. package/CITATIONS.bib +0 -6
  44. package/DEPLOYMENT.md +0 -1001
  45. package/LYNKR-TUI-PLAN.md +0 -984
  46. package/PERFORMANCE-REPORT.md +0 -866
  47. package/PLAN-per-client-model-routing.md +0 -252
  48. package/docs/42642f749da6234f41b6b425c3bb07c9.txt +0 -1
  49. package/docs/BingSiteAuth.xml +0 -4
  50. package/docs/docs-style.css +0 -478
  51. package/docs/docs.html +0 -198
  52. package/docs/google5be250e608e6da39.html +0 -1
  53. package/docs/index.html +0 -577
  54. package/docs/index.md +0 -584
  55. package/docs/robots.txt +0 -4
  56. package/docs/sitemap.xml +0 -44
  57. package/docs/style.css +0 -1223
  58. package/docs/toon-integration-spec.md +0 -130
  59. package/documentation/README.md +0 -101
  60. package/documentation/api.md +0 -806
  61. package/documentation/claude-code-cli.md +0 -679
  62. package/documentation/codex-cli.md +0 -397
  63. package/documentation/contributing.md +0 -571
  64. package/documentation/cursor-integration.md +0 -734
  65. package/documentation/docker.md +0 -874
  66. package/documentation/embeddings.md +0 -762
  67. package/documentation/faq.md +0 -713
  68. package/documentation/features.md +0 -403
  69. package/documentation/headroom.md +0 -519
  70. package/documentation/installation.md +0 -758
  71. package/documentation/memory-system.md +0 -476
  72. package/documentation/production.md +0 -636
  73. package/documentation/providers.md +0 -1009
  74. package/documentation/routing.md +0 -476
  75. package/documentation/testing.md +0 -629
  76. package/documentation/token-optimization.md +0 -325
  77. package/documentation/tools.md +0 -697
  78. package/documentation/troubleshooting.md +0 -969
  79. package/final-test.js +0 -33
  80. package/headroom-sidecar/config.py +0 -93
  81. package/headroom-sidecar/requirements.txt +0 -14
  82. package/headroom-sidecar/server.py +0 -451
  83. package/monitor-agents.sh +0 -31
  84. package/scripts/audit-log-reader.js +0 -399
  85. package/scripts/compact-dictionary.js +0 -204
  86. package/scripts/test-deduplication.js +0 -448
  87. package/src/db/database.sqlite +0 -0
  88. package/te +0 -11622
  89. package/test/README.md +0 -212
  90. package/test/azure-openai-config.test.js +0 -213
  91. package/test/azure-openai-error-resilience.test.js +0 -238
  92. package/test/azure-openai-format-conversion.test.js +0 -354
  93. package/test/azure-openai-integration.test.js +0 -287
  94. package/test/azure-openai-routing.test.js +0 -175
  95. package/test/azure-openai-streaming.test.js +0 -171
  96. package/test/bedrock-integration.test.js +0 -457
  97. package/test/comprehensive-test-suite.js +0 -928
  98. package/test/config-validation.test.js +0 -207
  99. package/test/cursor-integration.test.js +0 -484
  100. package/test/format-conversion.test.js +0 -578
  101. package/test/hybrid-routing-integration.test.js +0 -269
  102. package/test/hybrid-routing-performance.test.js +0 -428
  103. package/test/llamacpp-integration.test.js +0 -882
  104. package/test/lmstudio-integration.test.js +0 -347
  105. package/test/memory/extractor.test.js +0 -398
  106. package/test/memory/retriever.test.js +0 -613
  107. package/test/memory/retriever.test.js.bak +0 -585
  108. package/test/memory/search.test.js +0 -537
  109. package/test/memory/search.test.js.bak +0 -389
  110. package/test/memory/store.test.js +0 -344
  111. package/test/memory/store.test.js.bak +0 -312
  112. package/test/memory/surprise.test.js +0 -300
  113. package/test/memory-performance.test.js +0 -472
  114. package/test/openai-integration.test.js +0 -683
  115. package/test/openrouter-error-resilience.test.js +0 -418
  116. package/test/passthrough-mode.test.js +0 -385
  117. package/test/performance-benchmark.js +0 -351
  118. package/test/performance-tests.js +0 -528
  119. package/test/routing.test.js +0 -225
  120. package/test/toon-compression.test.js +0 -131
  121. package/test/web-tools.test.js +0 -329
  122. package/test-agents-simple.js +0 -43
  123. package/test-cli-connection.sh +0 -33
  124. package/test-learning-unit.js +0 -126
  125. package/test-learning.js +0 -112
  126. package/test-parallel-agents.sh +0 -124
  127. package/test-parallel-direct.js +0 -155
  128. package/test-subagents.sh +0 -117
@@ -1,762 +0,0 @@
# Embeddings Configuration Guide

Complete guide to configuring embeddings for Cursor @Codebase semantic search and code understanding.

---

## Overview

**Embeddings** enable semantic code search in Cursor IDE's @Codebase feature. Instead of keyword matching, embeddings understand the *meaning* of your code, allowing you to search for functionality, concepts, or patterns.

### What Are Embeddings?

Embeddings convert text (code, comments, documentation) into high-dimensional vectors that capture semantic meaning. Similar code gets similar vectors, enabling:

- **@Codebase Search** - Find relevant code by describing what you need
- **Automatic Context** - Cursor automatically includes relevant files in conversations
- **Find Similar Code** - Discover code patterns and examples in your codebase
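In practice, "similar vectors" means high cosine similarity. A minimal JavaScript sketch of the idea (illustrative only, with toy 3-dimensional vectors; real models use 384-3072 dimensions, and this is not Lynkr's internal implementation):

```javascript
// Cosine similarity between two embedding vectors:
// ~1.0 = same direction (similar meaning), ~0 = unrelated.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy "embeddings" for three snippets:
const sortFn = [0.9, 0.1, 0.2];      // "function to sort an array"
const sortDocs = [0.85, 0.15, 0.25]; // "array sorting helper"
const authFn = [0.1, 0.9, 0.3];      // "JWT authentication middleware"

console.log(cosineSimilarity(sortFn, sortDocs).toFixed(3)); // high (near 1)
console.log(cosineSimilarity(sortFn, authFn).toFixed(3));   // much lower
```

@Codebase search embeds your query the same way and returns the files whose chunk vectors score highest.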

### Why Use Embeddings?

**Without embeddings:**
- ❌ Keyword-only search (`grep`, exact string matching)
- ❌ No semantic understanding
- ❌ Can't find code by describing its purpose

**With embeddings:**
- ✅ Semantic search ("find authentication logic")
- ✅ Concept-based discovery ("show me error handling patterns")
- ✅ Similar code detection ("code like this function")

---

## Supported Embedding Providers

Lynkr supports 4 embedding providers with different tradeoffs:

| Provider | Cost | Privacy | Setup | Quality | Best For |
|----------|------|---------|-------|---------|----------|
| **Ollama** | **FREE** | 🔒 100% Local | Easy | Good | Privacy, offline, no costs |
| **llama.cpp** | **FREE** | 🔒 100% Local | Medium | Good | Performance, GPU, GGUF models |
| **OpenRouter** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Simplicity, quality, one key |
| **OpenAI** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Best quality, direct access |

---

## Option 1: Ollama (Recommended for Privacy)

### Overview

- **Cost:** 100% FREE 🔒
- **Privacy:** All data stays on your machine
- **Setup:** Easy (5 minutes)
- **Quality:** Good (768-1024 dimensions)
- **Best for:** Privacy-focused teams, offline work, zero cloud dependencies

### Installation & Setup

```bash
# 1. Install Ollama (if not already installed)
brew install ollama  # macOS
# Or download from: https://ollama.ai/download

# 2. Start Ollama service
ollama serve

# 3. Pull embedding model (in separate terminal)
ollama pull nomic-embed-text

# 4. Verify model is available
ollama list
# Should show: nomic-embed-text ...
```

### Configuration

Add to `.env`:

```env
# Ollama embeddings configuration
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddings
```

### Available Models

**nomic-embed-text** (Recommended) ⭐
```bash
ollama pull nomic-embed-text
```
- **Dimensions:** 768
- **Parameters:** 137M
- **Quality:** Excellent for code search
- **Speed:** Fast (~50ms per query)
- **Best for:** General purpose, best all-around choice

**mxbai-embed-large** (Higher Quality)
```bash
ollama pull mxbai-embed-large
```
- **Dimensions:** 1024
- **Parameters:** 335M
- **Quality:** Higher quality than nomic-embed-text
- **Speed:** Slower (~100ms per query)
- **Best for:** Large codebases where quality matters most

**all-minilm** (Fastest)
```bash
ollama pull all-minilm
```
- **Dimensions:** 384
- **Parameters:** 23M
- **Quality:** Good for simple searches
- **Speed:** Very fast (~20ms per query)
- **Best for:** Small codebases, speed-critical applications

### Testing

```bash
# Test embedding generation
curl http://localhost:11434/api/embeddings \
  -d '{"model":"nomic-embed-text","prompt":"function to sort array"}'

# Should return JSON with embedding vector
```
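The same check can be scripted from Node (18+, with global `fetch`). This hits the endpoint from the Configuration section above; `embedWithOllama` and `buildOllamaRequest` are illustrative helper names, not part of Lynkr's API:

```javascript
// Request an embedding from a local Ollama server.
// Ollama's /api/embeddings expects {model, prompt} and returns {embedding: [...]}.
const ENDPOINT = process.env.OLLAMA_EMBEDDINGS_ENDPOINT
  || 'http://localhost:11434/api/embeddings';

function buildOllamaRequest(prompt, model = 'nomic-embed-text') {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt }),
  };
}

async function embedWithOllama(prompt) {
  const res = await fetch(ENDPOINT, buildOllamaRequest(prompt));
  if (!res.ok) throw new Error(`Ollama returned ${res.status}`);
  const { embedding } = await res.json();
  return embedding; // e.g. 768 numbers for nomic-embed-text
}

// embedWithOllama('function to sort array').then(v => console.log(v.length));
```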

### Benefits

- ✅ **100% FREE** - No API costs ever
- ✅ **100% Private** - All data stays on your machine
- ✅ **Offline** - Works without internet
- ✅ **Easy Setup** - Install → Pull model → Configure
- ✅ **Good Quality** - Excellent for code search
- ✅ **Multiple Models** - Choose speed vs quality tradeoff

---

## Option 2: llama.cpp (Maximum Performance)

### Overview

- **Cost:** 100% FREE 🔒
- **Privacy:** All data stays on your machine
- **Setup:** Medium (15 minutes, requires compilation)
- **Quality:** Good (same as Ollama models, GGUF format)
- **Best for:** Performance optimization, GPU acceleration, GGUF models

### Installation & Setup

```bash
# 1. Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Build with GPU support (optional):
# For CUDA (NVIDIA): make LLAMA_CUDA=1
# For Metal (Apple Silicon): make LLAMA_METAL=1
# For CPU only: make
make

# 2. Download embedding model (GGUF format)
# Example: nomic-embed-text GGUF
wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf

# 3. Start llama-server with embedding model
./llama-server \
  -m nomic-embed-text-v1.5.Q4_K_M.gguf \
  --port 8080 \
  --embedding

# 4. Verify server is running
curl http://localhost:8080/health
# Should return: {"status":"ok"}
```

### Configuration

Add to `.env`:

```env
# llama.cpp embeddings configuration
LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
```

### Available Models (GGUF)

**nomic-embed-text-v1.5** (Recommended) ⭐
- **File:** `nomic-embed-text-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
- **Dimensions:** 768
- **Size:** ~80MB
- **Quality:** Excellent for code
- **Best for:** Best all-around choice

**all-MiniLM-L6-v2** (Fastest)
- **File:** `all-MiniLM-L6-v2.Q4_K_M.gguf`
- **Download:** https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2-GGUF
- **Dimensions:** 384
- **Size:** ~25MB
- **Quality:** Good for simple searches
- **Best for:** Speed-critical applications

**bge-large-en-v1.5** (Highest Quality)
- **File:** `bge-large-en-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/BAAI/bge-large-en-v1.5-GGUF
- **Dimensions:** 1024
- **Size:** ~350MB
- **Quality:** Best quality for embeddings
- **Best for:** Large codebases, quality-critical applications

### GPU Support

llama.cpp supports multiple GPU backends for faster embedding generation:

**NVIDIA CUDA:**
```bash
make LLAMA_CUDA=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Apple Silicon Metal:**
```bash
make LLAMA_METAL=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**AMD ROCm:**
```bash
make LLAMA_HIPBLAS=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Vulkan (Universal):**
```bash
make LLAMA_VULKAN=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

### Testing

```bash
# Test embedding generation
curl http://localhost:8080/embeddings \
  -H "Content-Type: application/json" \
  -d '{"content":"function to sort array"}'

# Should return JSON with embedding vector
```
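Note the payload differences across backends: Ollama's `/api/embeddings` takes `{model, prompt}`, llama.cpp's `/embeddings` takes `{content}`, and OpenAI-style `/v1/embeddings` takes `{model, input}`. A hypothetical helper (not Lynkr code) that builds the right body per provider, based on the curl examples in this guide:

```javascript
// Build the embeddings request body expected by each backend.
function embeddingsBody(provider, text, model) {
  switch (provider) {
    case 'ollama':    // POST http://localhost:11434/api/embeddings
      return { model: model || 'nomic-embed-text', prompt: text };
    case 'llamacpp':  // POST http://localhost:8080/embeddings
      return { content: text };
    case 'openai':    // POST .../v1/embeddings (OpenAI and OpenRouter)
      return { model: model || 'text-embedding-3-small', input: text };
    default:
      throw new Error(`Unknown embeddings provider: ${provider}`);
  }
}

console.log(JSON.stringify(embeddingsBody('llamacpp', 'function to sort array')));
// {"content":"function to sort array"}
```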

### Benefits

- ✅ **100% FREE** - No API costs
- ✅ **100% Private** - All data stays local
- ✅ **Faster than Ollama** - Optimized C++ implementation
- ✅ **GPU Acceleration** - CUDA, Metal, ROCm, Vulkan
- ✅ **Lower Memory** - Quantization options (Q4, Q5, Q8)
- ✅ **Any GGUF Model** - Use any embedding model from HuggingFace

### llama.cpp vs Ollama

| Feature | Ollama | llama.cpp |
|---------|--------|-----------|
| **Setup** | Easy (app) | Manual (compile) |
| **Model Format** | Ollama-specific | Any GGUF model |
| **Performance** | Good | **Better** (optimized C++) |
| **GPU Support** | Yes | Yes (more options) |
| **Memory Usage** | Higher | **Lower** (more quantization options) |
| **Flexibility** | Limited models | **Any GGUF** from HuggingFace |

---

## Option 3: OpenRouter (Simplest Cloud)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Very easy (2 minutes)
- **Quality:** Excellent (best-in-class models)
- **Best for:** Simplicity, quality, one key for chat + embeddings

### Configuration

Add to `.env`:

```env
# OpenRouter configuration (if not already set)
OPENROUTER_API_KEY=sk-or-v1-your-key-here

# Embeddings model (optional, defaults to text-embedding-ada-002)
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Note:** If you're already using `MODEL_PROVIDER=openrouter`, embeddings work automatically with the same key! No additional configuration needed.

### Getting an OpenRouter API Key

1. Visit [openrouter.ai](https://openrouter.ai)
2. Sign in with GitHub, Google, or email
3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
4. Create a new API key
5. Add credits (pay-as-you-go, no subscription)

### Available Models

**openai/text-embedding-3-small** (Recommended) ⭐
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper than ada-002!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**openai/text-embedding-ada-002** (Standard)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (widely supported standard)
- **Best for:** Compatibility

**openai/text-embedding-3-large** (Best Quality)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Large codebases where quality matters most

**voyage/voyage-code-2** (Code-Specialized)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Optimized specifically for code
- **Best for:** Code search (better than general models)

**voyage/voyage-2** (General Purpose)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Best for general text
- **Best for:** Mixed code + documentation

### Benefits

- ✅ **ONE Key** - Same key for chat + embeddings
- ✅ **No Setup** - Works immediately after adding key
- ✅ **Best Quality** - State-of-the-art embedding models
- ✅ **Automatic Fallbacks** - Switches providers if one is down
- ✅ **Competitive Pricing** - Often cheaper than direct providers

---

## Option 4: OpenAI (Direct)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Easy (5 minutes)
- **Quality:** Excellent (best-in-class, direct from OpenAI)
- **Best for:** Best quality, direct OpenAI access

### Configuration

Add to `.env`:

```env
# OpenAI configuration (if not already set)
OPENAI_API_KEY=sk-your-openai-api-key

# Embeddings model (optional, defaults to text-embedding-ada-002)
# Recommended: Use text-embedding-3-small for 80% cost savings
# OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```

### Getting an OpenAI API Key

1. Visit [platform.openai.com](https://platform.openai.com)
2. Sign up or log in
3. Go to [API Keys](https://platform.openai.com/api-keys)
4. Create a new API key
5. Add credits to your account (pay-as-you-go)

### Available Models

**text-embedding-3-small** (Recommended) ⭐
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper than ada-002!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**text-embedding-ada-002** (Standard)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (standard, widely used)
- **Best for:** Compatibility

**text-embedding-3-large** (Best Quality)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Maximum quality for large codebases

### Benefits

- ✅ **Best Quality** - Direct from OpenAI, best-in-class
- ✅ **Lowest Latency** - No intermediaries
- ✅ **Simple Setup** - Just one API key
- ✅ **Organization Support** - Use org-level API keys for teams

---

## Provider Comparison

### Feature Comparison

| Feature | Ollama | llama.cpp | OpenRouter | OpenAI |
|---------|--------|-----------|------------|--------|
| **Cost** | **FREE** | **FREE** | $0.01-0.10/mo | $0.01-0.10/mo |
| **Privacy** | 🔒 Local | 🔒 Local | ☁️ Cloud | ☁️ Cloud |
| **Setup** | Easy | Medium | Easy | Easy |
| **Quality** | Good | Good | **Excellent** | **Excellent** |
| **Speed** | Fast | **Faster** | Fast | Fast |
| **Offline** | ✅ Yes | ✅ Yes | ❌ No | ❌ No |
| **GPU Support** | Yes | **Yes (more options)** | N/A | N/A |
| **Model Choice** | Limited | **Any GGUF** | Many | Few |
| **Dimensions** | 384-1024 | 384-1024 | 1024-3072 | 1536-3072 |

### Cost Comparison (100K embeddings/month)

| Provider | Model | Monthly Cost |
|----------|-------|--------------|
| **Ollama** | Any | **$0** (100% FREE) 🔒 |
| **llama.cpp** | Any | **$0** (100% FREE) 🔒 |
| **OpenRouter** | text-embedding-3-small | **$0.02** |
| **OpenRouter** | text-embedding-ada-002 | $0.10 |
| **OpenRouter** | voyage-code-2 | $0.12 |
| **OpenAI** | text-embedding-3-small | **$0.02** |
| **OpenAI** | text-embedding-ada-002 | $0.10 |
| **OpenAI** | text-embedding-3-large | $0.13 |
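The monthly figures above assume roughly 1M tokens of embedded text per month (100K embeddings at ~10 tokens each); your bill scales linearly with token volume. A quick sketch of the arithmetic, with per-1M-token prices copied from this guide (treat them as of this doc's writing):

```javascript
// Estimated monthly embeddings cost: (tokens / 1M) * price-per-1M-tokens.
const PRICE_PER_MILLION_TOKENS = {
  'text-embedding-3-small': 0.02,
  'text-embedding-ada-002': 0.10,
  'text-embedding-3-large': 0.13,
  'voyage-code-2': 0.12,
};

function monthlyCost(model, tokensPerMonth) {
  const price = PRICE_PER_MILLION_TOKENS[model];
  if (price === undefined) throw new Error(`No price on file for ${model}`);
  return (tokensPerMonth / 1_000_000) * price;
}

console.log(monthlyCost('text-embedding-3-small', 1_000_000)); // 0.02
console.log(monthlyCost('text-embedding-3-large', 1_000_000)); // 0.13
```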

---

## Embeddings Provider Override

By default, Lynkr uses the same provider as `MODEL_PROVIDER` for embeddings (if supported). To use a different provider for embeddings:

```env
# Use Databricks for chat, but Ollama for embeddings (privacy + cost savings)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Override embeddings provider
EMBEDDINGS_PROVIDER=ollama
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
```

**Smart provider detection:**
- Uses same provider as chat (if embeddings supported)
- Or automatically selects first available embeddings provider
- Or use `EMBEDDINGS_PROVIDER` to force a specific provider
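The detection order above amounts to a small fallback chain. An illustrative sketch of that logic (hypothetical; Lynkr's actual `src/config` code may differ in names and order):

```javascript
// Resolve which provider handles embeddings:
// 1) explicit EMBEDDINGS_PROVIDER override,
// 2) the chat provider, if it supports embeddings,
// 3) the first configured embeddings-capable provider.
const EMBEDDING_CAPABLE = new Set(['ollama', 'llamacpp', 'openrouter', 'openai']);

function resolveEmbeddingsProvider(env) {
  if (env.EMBEDDINGS_PROVIDER) return env.EMBEDDINGS_PROVIDER;
  if (EMBEDDING_CAPABLE.has(env.MODEL_PROVIDER)) return env.MODEL_PROVIDER;
  if (env.OLLAMA_EMBEDDINGS_MODEL) return 'ollama';
  if (env.LLAMACPP_EMBEDDINGS_ENDPOINT) return 'llamacpp';
  if (env.OPENROUTER_API_KEY) return 'openrouter';
  if (env.OPENAI_API_KEY) return 'openai';
  return null; // embeddings disabled; /v1/embeddings would return 501
}

console.log(resolveEmbeddingsProvider({
  MODEL_PROVIDER: 'databricks',
  OLLAMA_EMBEDDINGS_MODEL: 'nomic-embed-text',
})); // "ollama"
```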

---

## Recommended Configurations

### 1. Privacy-First (100% Local, FREE)

**Best for:** Sensitive codebases, offline work, zero cloud dependencies

```env
# Chat: Ollama (local)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b

# Embeddings: Ollama (local)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Everything 100% local, 100% private, 100% FREE!
```

**Benefits:**
- ✅ Zero cloud dependencies
- ✅ All data stays on your machine
- ✅ Works offline
- ✅ 100% FREE

---

### 2. Simplest (One Key for Everything)

**Best for:** Easy setup, flexibility, quality

```env
# Chat + Embeddings: OpenRouter with ONE key
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

# Embeddings work automatically with same key!
# Optional: Specify model for cost savings
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Benefits:**
- ✅ ONE key for everything
- ✅ Best quality embeddings
- ✅ 100+ chat models available
- ✅ ~$5-10/month total cost

---

### 3. Hybrid (Best of Both Worlds)

**Best for:** Privacy + Quality + Cost Optimization

```env
# Chat: Tier-based routing (set all 4 to enable)
TIER_SIMPLE=ollama:llama3.2
TIER_MEDIUM=openrouter:openai/gpt-4o-mini
TIER_COMPLEX=databricks:databricks-claude-sonnet-4-5
TIER_REASONING=databricks:databricks-claude-sonnet-4-5
FALLBACK_ENABLED=true
FALLBACK_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: Ollama (local, private)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Result: Free + private embeddings, mostly free chat, cloud for complex tasks
```

**Benefits:**
- ✅ 70-80% of chat requests FREE (Ollama via TIER_SIMPLE)
- ✅ 100% private embeddings (local)
- ✅ Cloud quality for complex tasks
- ✅ Intelligent automatic tier-based routing

---

### 4. Enterprise (Best Quality)

**Best for:** Large teams, quality-critical applications

```env
# Chat: Databricks (enterprise SLA)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: OpenRouter (best quality)
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2  # Code-specialized
```

**Benefits:**
- ✅ Enterprise chat (Claude Sonnet 4.5)
- ✅ Best embedding quality (code-specialized)
- ✅ Separate billing/limits for chat vs embeddings
- ✅ Production-ready reliability

---

## Testing & Verification

### Test Embeddings Endpoint

```bash
# Test embedding generation
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "function to sort an array",
    "model": "text-embedding-ada-002"
  }'

# Should return JSON with embedding vector
# Example response:
# {
#   "object": "list",
#   "data": [{
#     "object": "embedding",
#     "embedding": [0.123, -0.456, 0.789, ...],  # 768-3072 dimensions
#     "index": 0
#   }],
#   "model": "text-embedding-ada-002",
#   "usage": {"prompt_tokens": 7, "total_tokens": 7}
# }
```
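From Node, the same verification can assert on the OpenAI-compatible response shape shown above. `checkEmbeddingsResponse` and `smokeTest` are hypothetical helpers for your own smoke tests, not part of Lynkr:

```javascript
// Validate an OpenAI-style /v1/embeddings response body.
// Returns the first embedding vector, or throws with a reason.
function checkEmbeddingsResponse(body) {
  if (body.object !== 'list' || !Array.isArray(body.data)) {
    throw new Error('Not an embeddings list response');
  }
  const [first] = body.data;
  if (!first || !Array.isArray(first.embedding) || first.embedding.length === 0) {
    throw new Error('Missing embedding vector');
  }
  return first.embedding;
}

// Usage against a running Lynkr instance (Node 18+ global fetch):
async function smokeTest() {
  const res = await fetch('http://localhost:8081/v1/embeddings', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input: 'function to sort an array',
                           model: 'text-embedding-ada-002' }),
  });
  if (res.status === 501) throw new Error('No embeddings provider configured');
  const vector = checkEmbeddingsResponse(await res.json());
  console.log(`OK: got ${vector.length}-dimensional embedding`);
}
```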

### Test in Cursor

1. **Open Cursor IDE**
2. **Open a project**
3. **Press Cmd+L** (or Ctrl+L)
4. **Type:** `@Codebase find authentication logic`
5. **Expected:** Cursor returns relevant files

If @Codebase doesn't work:
- Check the embeddings endpoint: `curl http://localhost:8081/v1/embeddings` (should not return 501)
- Restart Lynkr after adding embeddings config
- Restart Cursor to re-index the codebase

---

## Troubleshooting

### @Codebase Doesn't Work

**Symptoms:** @Codebase doesn't return results or shows an error

**Solutions:**

1. **Verify embeddings are configured:**
   ```bash
   curl http://localhost:8081/v1/embeddings \
     -H "Content-Type: application/json" \
     -d '{"input":"test","model":"text-embedding-ada-002"}'

   # Should return embeddings, not a 501 error
   ```

2. **Check the embeddings provider in `.env`:**
   ```env
   # Verify ONE of these is set:
   OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
   # OR
   LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
   # OR
   OPENROUTER_API_KEY=sk-or-v1-your-key
   # OR
   OPENAI_API_KEY=sk-your-key
   ```

3. **Restart Lynkr** after adding embeddings config

4. **Restart Cursor** to re-index the codebase

---

### Poor Search Results

**Symptoms:** @Codebase returns irrelevant files

**Solutions:**

1. **Upgrade to a better embedding model:**
   ```bash
   # Ollama: Use a larger model
   ollama pull mxbai-embed-large
   # Then set in .env: OLLAMA_EMBEDDINGS_MODEL=mxbai-embed-large

   # OpenRouter: Use a code-specialized model
   # Set in .env: OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
   ```

2. **Switch to cloud embeddings:**
   - Local models (Ollama/llama.cpp): Good quality
   - Cloud models (OpenRouter/OpenAI): Excellent quality

3. **This may be a Cursor indexing issue:**
   - Close and reopen the workspace in Cursor
   - Wait for Cursor to re-index

---

### Ollama Model Not Found

**Symptoms:** `Error: model "nomic-embed-text" not found`

**Solutions:**

```bash
# List available models
ollama list

# Pull the model
ollama pull nomic-embed-text

# Verify it's available
ollama list
# Should show: nomic-embed-text ...
```

---

### llama.cpp Connection Refused

**Symptoms:** `ECONNREFUSED` when accessing the llama.cpp endpoint

**Solutions:**

1. **Verify llama-server is running:**
   ```bash
   lsof -i :8080
   # Should show a llama-server process
   ```

2. **Start llama-server with the embedding model:**
   ```bash
   ./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080 --embedding
   ```

3. **Test the endpoint:**
   ```bash
   curl http://localhost:8080/health
   # Should return: {"status":"ok"}
   ```

---

### Rate Limiting (Cloud Providers)

**Symptoms:** Too many requests error (HTTP 429)

**Solutions:**

1. **Switch to local embeddings:**
   ```env
   # Ollama (no rate limits, 100% FREE)
   OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
   ```

2. **Use OpenRouter** (pooled rate limits):
   ```env
   OPENROUTER_API_KEY=sk-or-v1-your-key
   ```

---

## Next Steps

- **[Cursor Integration](cursor-integration.md)** - Full Cursor IDE setup guide
- **[Provider Configuration](providers.md)** - Configure all providers
- **[Installation Guide](installation.md)** - Install Lynkr
- **[Troubleshooting](troubleshooting.md)** - More troubleshooting tips

---

## Getting Help

- **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
- **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report bugs
- **[FAQ](faq.md)** - Frequently asked questions