lynkr 8.0.0 → 8.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (102)
  1. package/README.md +1 -1
  2. package/package.json +1 -1
  3. package/src/api/openai-router.js +34 -2
  4. package/src/clients/standard-tools.js +23 -0
  5. package/src/config/index.js +20 -0
  6. package/src/orchestrator/index.js +2 -2
  7. package/src/server.js +2 -12
  8. package/src/tools/index.js +4 -0
  9. package/src/tools/lazy-loader.js +7 -0
  10. package/src/tools/tinyfish.js +358 -0
  11. package/src/tools/truncate.js +1 -0
  12. package/.github/FUNDING.yml +0 -15
  13. package/.github/workflows/README.md +0 -215
  14. package/.github/workflows/ci.yml +0 -69
  15. package/.github/workflows/index.yml +0 -62
  16. package/.github/workflows/web-tools-tests.yml +0 -56
  17. package/CITATIONS.bib +0 -6
  18. package/DEPLOYMENT.md +0 -1001
  19. package/LYNKR-TUI-PLAN.md +0 -984
  20. package/PERFORMANCE-REPORT.md +0 -866
  21. package/PLAN-per-client-model-routing.md +0 -252
  22. package/docs/42642f749da6234f41b6b425c3bb07c9.txt +0 -1
  23. package/docs/BingSiteAuth.xml +0 -4
  24. package/docs/docs-style.css +0 -478
  25. package/docs/docs.html +0 -198
  26. package/docs/google5be250e608e6da39.html +0 -1
  27. package/docs/index.html +0 -577
  28. package/docs/index.md +0 -584
  29. package/docs/robots.txt +0 -4
  30. package/docs/sitemap.xml +0 -44
  31. package/docs/style.css +0 -1223
  32. package/docs/toon-integration-spec.md +0 -130
  33. package/documentation/README.md +0 -101
  34. package/documentation/api.md +0 -806
  35. package/documentation/claude-code-cli.md +0 -679
  36. package/documentation/codex-cli.md +0 -397
  37. package/documentation/contributing.md +0 -571
  38. package/documentation/cursor-integration.md +0 -734
  39. package/documentation/docker.md +0 -874
  40. package/documentation/embeddings.md +0 -762
  41. package/documentation/faq.md +0 -713
  42. package/documentation/features.md +0 -403
  43. package/documentation/headroom.md +0 -519
  44. package/documentation/installation.md +0 -758
  45. package/documentation/memory-system.md +0 -476
  46. package/documentation/production.md +0 -636
  47. package/documentation/providers.md +0 -1009
  48. package/documentation/routing.md +0 -476
  49. package/documentation/testing.md +0 -629
  50. package/documentation/token-optimization.md +0 -325
  51. package/documentation/tools.md +0 -697
  52. package/documentation/troubleshooting.md +0 -969
  53. package/final-test.js +0 -33
  54. package/headroom-sidecar/config.py +0 -93
  55. package/headroom-sidecar/requirements.txt +0 -14
  56. package/headroom-sidecar/server.py +0 -451
  57. package/monitor-agents.sh +0 -31
  58. package/scripts/audit-log-reader.js +0 -399
  59. package/scripts/compact-dictionary.js +0 -204
  60. package/scripts/test-deduplication.js +0 -448
  61. package/src/db/database.sqlite +0 -0
  62. package/te +0 -11622
  63. package/test/README.md +0 -212
  64. package/test/azure-openai-config.test.js +0 -213
  65. package/test/azure-openai-error-resilience.test.js +0 -238
  66. package/test/azure-openai-format-conversion.test.js +0 -354
  67. package/test/azure-openai-integration.test.js +0 -287
  68. package/test/azure-openai-routing.test.js +0 -175
  69. package/test/azure-openai-streaming.test.js +0 -171
  70. package/test/bedrock-integration.test.js +0 -457
  71. package/test/comprehensive-test-suite.js +0 -928
  72. package/test/config-validation.test.js +0 -207
  73. package/test/cursor-integration.test.js +0 -484
  74. package/test/format-conversion.test.js +0 -578
  75. package/test/hybrid-routing-integration.test.js +0 -269
  76. package/test/hybrid-routing-performance.test.js +0 -428
  77. package/test/llamacpp-integration.test.js +0 -882
  78. package/test/lmstudio-integration.test.js +0 -347
  79. package/test/memory/extractor.test.js +0 -398
  80. package/test/memory/retriever.test.js +0 -613
  81. package/test/memory/retriever.test.js.bak +0 -585
  82. package/test/memory/search.test.js +0 -537
  83. package/test/memory/search.test.js.bak +0 -389
  84. package/test/memory/store.test.js +0 -344
  85. package/test/memory/store.test.js.bak +0 -312
  86. package/test/memory/surprise.test.js +0 -300
  87. package/test/memory-performance.test.js +0 -472
  88. package/test/openai-integration.test.js +0 -683
  89. package/test/openrouter-error-resilience.test.js +0 -418
  90. package/test/passthrough-mode.test.js +0 -385
  91. package/test/performance-benchmark.js +0 -351
  92. package/test/performance-tests.js +0 -528
  93. package/test/routing.test.js +0 -225
  94. package/test/toon-compression.test.js +0 -131
  95. package/test/web-tools.test.js +0 -329
  96. package/test-agents-simple.js +0 -43
  97. package/test-cli-connection.sh +0 -33
  98. package/test-learning-unit.js +0 -126
  99. package/test-learning.js +0 -112
  100. package/test-parallel-agents.sh +0 -124
  101. package/test-parallel-direct.js +0 -155
  102. package/test-subagents.sh +0 -117
package/documentation/embeddings.md
@@ -1,762 +0,0 @@
1
- # Embeddings Configuration Guide
2
-
3
- Complete guide to configuring embeddings for Cursor @Codebase semantic search and code understanding.
4
-
5
- ---
6
-
7
- ## Overview
8
-
9
- **Embeddings** enable semantic code search in Cursor IDE's @Codebase feature. Instead of keyword matching, embeddings understand the *meaning* of your code, allowing you to search for functionality, concepts, or patterns.
10
-
11
- ### What Are Embeddings?
12
-
13
- Embeddings convert text (code, comments, documentation) into high-dimensional vectors that capture semantic meaning. Similar code gets similar vectors, enabling:
14
-
15
- - **@Codebase Search** - Find relevant code by describing what you need
16
- - **Automatic Context** - Cursor automatically includes relevant files in conversations
17
- - **Find Similar Code** - Discover code patterns and examples in your codebase
18
-
19
- ### Why Use Embeddings?
20
-
21
- **Without embeddings:**
22
- - ❌ Keyword-only search (`grep`, exact string matching)
23
- - ❌ No semantic understanding
24
- - ❌ Can't find code by describing its purpose
25
-
26
- **With embeddings:**
27
- - ✅ Semantic search ("find authentication logic")
28
- - ✅ Concept-based discovery ("show me error handling patterns")
29
- - ✅ Similar code detection ("code like this function")
30
-
31
- ---
32
-
33
- ## Supported Embedding Providers
34
-
35
- Lynkr supports 4 embedding providers with different tradeoffs:
36
-
37
- | Provider | Cost | Privacy | Setup | Quality | Best For |
38
- |----------|------|---------|-------|---------|----------|
39
- | **Ollama** | **FREE** | 🔒 100% Local | Easy | Good | Privacy, offline, no costs |
40
- | **llama.cpp** | **FREE** | 🔒 100% Local | Medium | Good | Performance, GPU, GGUF models |
41
- | **OpenRouter** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Simplicity, quality, one key |
42
- | **OpenAI** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Best quality, direct access |
43
-
44
- ---
45
-
46
- ## Option 1: Ollama (Recommended for Privacy)
47
-
48
- ### Overview
49
-
50
- - **Cost:** 100% FREE 🔒
51
- - **Privacy:** All data stays on your machine
52
- - **Setup:** Easy (5 minutes)
53
- - **Quality:** Good (768-1024 dimensions)
54
- - **Best for:** Privacy-focused teams, offline work, zero cloud dependencies
55
-
56
- ### Installation & Setup
57
-
58
- ```bash
59
- # 1. Install Ollama (if not already installed)
60
- brew install ollama # macOS
61
- # Or download from: https://ollama.ai/download
62
-
63
- # 2. Start Ollama service
64
- ollama serve
65
-
66
- # 3. Pull embedding model (in separate terminal)
67
- ollama pull nomic-embed-text
68
-
69
- # 4. Verify model is available
70
- ollama list
71
- # Should show: nomic-embed-text ...
72
- ```
73
-
74
- ### Configuration
75
-
76
- Add to `.env`:
77
-
78
- ```env
79
- # Ollama embeddings configuration
80
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
81
- OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddings
82
- ```
83
-
84
- ### Available Models
85
-
86
- **nomic-embed-text** (Recommended) ⭐
87
- ```bash
88
- ollama pull nomic-embed-text
89
- ```
90
- - **Dimensions:** 768
91
- - **Parameters:** 137M
92
- - **Quality:** Excellent for code search
93
- - **Speed:** Fast (~50ms per query)
94
- - **Best for:** General purpose, best all-around choice
95
-
96
- **mxbai-embed-large** (Higher Quality)
97
- ```bash
98
- ollama pull mxbai-embed-large
99
- ```
100
- - **Dimensions:** 1024
101
- - **Parameters:** 335M
102
- - **Quality:** Higher quality than nomic-embed-text
103
- - **Speed:** Slower (~100ms per query)
104
- - **Best for:** Large codebases where quality matters most
105
-
106
- **all-minilm** (Fastest)
107
- ```bash
108
- ollama pull all-minilm
109
- ```
110
- - **Dimensions:** 384
111
- - **Parameters:** 23M
112
- - **Quality:** Good for simple searches
113
- - **Speed:** Very fast (~20ms per query)
114
- - **Best for:** Small codebases, speed-critical applications
115
-
116
- ### Testing
117
-
118
- ```bash
119
- # Test embedding generation
120
- curl http://localhost:11434/api/embeddings \
121
- -d '{"model":"nomic-embed-text","prompt":"function to sort array"}'
122
-
123
- # Should return JSON with embedding vector
124
- ```
125
-
126
- ### Benefits
127
-
128
- - ✅ **100% FREE** - No API costs ever
129
- - ✅ **100% Private** - All data stays on your machine
130
- - ✅ **Offline** - Works without internet
131
- - ✅ **Easy Setup** - Install → Pull model → Configure
132
- - ✅ **Good Quality** - Excellent for code search
133
- - ✅ **Multiple Models** - Choose speed vs quality tradeoff
134
-
135
- ---
136
-
137
- ## Option 2: llama.cpp (Maximum Performance)
138
-
139
- ### Overview
140
-
141
- - **Cost:** 100% FREE 🔒
142
- - **Privacy:** All data stays on your machine
143
- - **Setup:** Medium (15 minutes, requires compilation)
144
- - **Quality:** Good (same as Ollama models, GGUF format)
145
- - **Best for:** Performance optimization, GPU acceleration, GGUF models
146
-
147
- ### Installation & Setup
148
-
149
- ```bash
150
- # 1. Clone and build llama.cpp
151
- git clone https://github.com/ggerganov/llama.cpp
152
- cd llama.cpp
153
-
154
- # Build with GPU support (optional):
155
- # For CUDA (NVIDIA): make LLAMA_CUDA=1
156
- # For Metal (Apple Silicon): make LLAMA_METAL=1
157
- # For CPU only: make
158
- make
159
-
160
- # 2. Download embedding model (GGUF format)
161
- # Example: nomic-embed-text GGUF
162
- wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf
163
-
164
- # 3. Start llama-server with embedding model
165
- ./llama-server \
166
- -m nomic-embed-text-v1.5.Q4_K_M.gguf \
167
- --port 8080 \
168
- --embedding
169
-
170
- # 4. Verify server is running
171
- curl http://localhost:8080/health
172
- # Should return: {"status":"ok"}
173
- ```
174
-
175
- ### Configuration
176
-
177
- Add to `.env`:
178
-
179
- ```env
180
- # llama.cpp embeddings configuration
181
- LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
182
- ```
183
-
184
- ### Available Models (GGUF)
185
-
186
- **nomic-embed-text-v1.5** (Recommended) ⭐
187
- - **File:** `nomic-embed-text-v1.5.Q4_K_M.gguf`
188
- - **Download:** https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
189
- - **Dimensions:** 768
190
- - **Size:** ~80MB
191
- - **Quality:** Excellent for code
192
- - **Best for:** Best all-around choice
193
-
194
- **all-MiniLM-L6-v2** (Fastest)
195
- - **File:** `all-MiniLM-L6-v2.Q4_K_M.gguf`
196
- - **Download:** https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2-GGUF
197
- - **Dimensions:** 384
198
- - **Size:** ~25MB
199
- - **Quality:** Good for simple searches
200
- - **Best for:** Speed-critical applications
201
-
202
- **bge-large-en-v1.5** (Highest Quality)
203
- - **File:** `bge-large-en-v1.5.Q4_K_M.gguf`
204
- - **Download:** https://huggingface.co/BAAI/bge-large-en-v1.5-GGUF
205
- - **Dimensions:** 1024
206
- - **Size:** ~350MB
207
- - **Quality:** Best quality for embeddings
208
- - **Best for:** Large codebases, quality-critical applications
209
-
210
- ### GPU Support
211
-
212
- llama.cpp supports multiple GPU backends for faster embedding generation:
213
-
214
- **NVIDIA CUDA:**
215
- ```bash
216
- make LLAMA_CUDA=1
217
- ./llama-server -m model.gguf --embedding --n-gpu-layers 32
218
- ```
219
-
220
- **Apple Silicon Metal:**
221
- ```bash
222
- make LLAMA_METAL=1
223
- ./llama-server -m model.gguf --embedding --n-gpu-layers 32
224
- ```
225
-
226
- **AMD ROCm:**
227
- ```bash
228
- make LLAMA_ROCM=1
229
- ./llama-server -m model.gguf --embedding --n-gpu-layers 32
230
- ```
231
-
232
- **Vulkan (Universal):**
233
- ```bash
234
- make LLAMA_VULKAN=1
235
- ./llama-server -m model.gguf --embedding --n-gpu-layers 32
236
- ```
237
-
238
- ### Testing
239
-
240
- ```bash
241
- # Test embedding generation
242
- curl http://localhost:8080/embeddings \
243
- -H "Content-Type: application/json" \
244
- -d '{"content":"function to sort array"}'
245
-
246
- # Should return JSON with embedding vector
247
- ```
248
-
249
- ### Benefits
250
-
251
- - ✅ **100% FREE** - No API costs
252
- - ✅ **100% Private** - All data stays local
253
- - ✅ **Faster than Ollama** - Optimized C++ implementation
254
- - ✅ **GPU Acceleration** - CUDA, Metal, ROCm, Vulkan
255
- - ✅ **Lower Memory** - Quantization options (Q4, Q5, Q8)
256
- - ✅ **Any GGUF Model** - Use any embedding model from HuggingFace
257
-
258
- ### llama.cpp vs Ollama
259
-
260
- | Feature | Ollama | llama.cpp |
261
- |---------|--------|-----------|
262
- | **Setup** | Easy (app) | Manual (compile) |
263
- | **Model Format** | Ollama-specific | Any GGUF model |
264
- | **Performance** | Good | **Better** (optimized C++) |
265
- | **GPU Support** | Yes | Yes (more options) |
266
- | **Memory Usage** | Higher | **Lower** (more quantization options) |
267
- | **Flexibility** | Limited models | **Any GGUF** from HuggingFace |
268
-
269
- ---
270
-
271
- ## Option 3: OpenRouter (Simplest Cloud)
272
-
273
- ### Overview
274
-
275
- - **Cost:** ~$0.01-0.10/month (typical usage)
276
- - **Privacy:** Cloud-based
277
- - **Setup:** Very easy (2 minutes)
278
- - **Quality:** Excellent (best-in-class models)
279
- - **Best for:** Simplicity, quality, one key for chat + embeddings
280
-
281
- ### Configuration
282
-
283
- Add to `.env`:
284
-
285
- ```env
286
- # OpenRouter configuration (if not already set)
287
- OPENROUTER_API_KEY=sk-or-v1-your-key-here
288
-
289
- # Embeddings model (optional, defaults to text-embedding-ada-002)
290
- OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
291
- ```
292
-
293
- **Note:** If you're already using `MODEL_PROVIDER=openrouter`, embeddings work automatically with the same key! No additional configuration needed.
294
-
295
- ### Getting OpenRouter API Key
296
-
297
- 1. Visit [openrouter.ai](https://openrouter.ai)
298
- 2. Sign in with GitHub, Google, or email
299
- 3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
300
- 4. Create a new API key
301
- 5. Add credits (pay-as-you-go, no subscription)
302
-
303
- ### Available Models
304
-
305
- **openai/text-embedding-3-small** (Recommended) ⭐
306
- ```env
307
- OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
308
- ```
309
- - **Dimensions:** 1536
310
- - **Cost:** $0.02 per 1M tokens (80% cheaper than ada-002!)
311
- - **Quality:** Excellent
312
- - **Best for:** Best balance of quality and cost
313
-
314
- **openai/text-embedding-ada-002** (Standard)
315
- ```env
316
- OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-ada-002
317
- ```
318
- - **Dimensions:** 1536
319
- - **Cost:** $0.10 per 1M tokens
320
- - **Quality:** Excellent (widely supported standard)
321
- - **Best for:** Compatibility
322
-
323
- **openai/text-embedding-3-large** (Best Quality)
324
- ```env
325
- OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-large
326
- ```
327
- - **Dimensions:** 3072
328
- - **Cost:** $0.13 per 1M tokens
329
- - **Quality:** Best quality available
330
- - **Best for:** Large codebases where quality matters most
331
-
332
- **voyage/voyage-code-2** (Code-Specialized)
333
- ```env
334
- OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
335
- ```
336
- - **Dimensions:** 1024
337
- - **Cost:** $0.12 per 1M tokens
338
- - **Quality:** Optimized specifically for code
339
- - **Best for:** Code search (better than general models)
340
-
341
- **voyage/voyage-2** (General Purpose)
342
- ```env
343
- OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-2
344
- ```
345
- - **Dimensions:** 1024
346
- - **Cost:** $0.12 per 1M tokens
347
- - **Quality:** Best for general text
348
- - **Best for:** Mixed code + documentation
349
-
350
- ### Benefits
351
-
352
- - ✅ **ONE Key** - Same key for chat + embeddings
353
- - ✅ **No Setup** - Works immediately after adding key
354
- - ✅ **Best Quality** - State-of-the-art embedding models
355
- - ✅ **Automatic Fallbacks** - Switches providers if one is down
356
- - ✅ **Competitive Pricing** - Often cheaper than direct providers
357
-
358
- ---
359
-
360
- ## Option 4: OpenAI (Direct)
361
-
362
- ### Overview
363
-
364
- - **Cost:** ~$0.01-0.10/month (typical usage)
365
- - **Privacy:** Cloud-based
366
- - **Setup:** Easy (5 minutes)
367
- - **Quality:** Excellent (best-in-class, direct from OpenAI)
368
- - **Best for:** Best quality, direct OpenAI access
369
-
370
- ### Configuration
371
-
372
- Add to `.env`:
373
-
374
- ```env
375
- # OpenAI configuration (if not already set)
376
- OPENAI_API_KEY=sk-your-openai-api-key
377
-
378
- # Embeddings model (optional, defaults to text-embedding-ada-002)
379
- # Recommended: Use text-embedding-3-small for 80% cost savings
380
- # OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
381
- ```
382
-
383
- ### Getting OpenAI API Key
384
-
385
- 1. Visit [platform.openai.com](https://platform.openai.com)
386
- 2. Sign up or log in
387
- 3. Go to [API Keys](https://platform.openai.com/api-keys)
388
- 4. Create a new API key
389
- 5. Add credits to your account (pay-as-you-go)
390
-
391
- ### Available Models
392
-
393
- **text-embedding-3-small** (Recommended) ⭐
394
- ```env
395
- OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
396
- ```
397
- - **Dimensions:** 1536
398
- - **Cost:** $0.02 per 1M tokens (80% cheaper!)
399
- - **Quality:** Excellent
400
- - **Best for:** Best balance of quality and cost
401
-
402
- **text-embedding-ada-002** (Standard)
403
- ```env
404
- OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
405
- ```
406
- - **Dimensions:** 1536
407
- - **Cost:** $0.10 per 1M tokens
408
- - **Quality:** Excellent (standard, widely used)
409
- - **Best for:** Compatibility
410
-
411
- **text-embedding-3-large** (Best Quality)
412
- ```env
413
- OPENAI_EMBEDDINGS_MODEL=text-embedding-3-large
414
- ```
415
- - **Dimensions:** 3072
416
- - **Cost:** $0.13 per 1M tokens
417
- - **Quality:** Best quality available
418
- - **Best for:** Maximum quality for large codebases
419
-
420
- ### Benefits
421
-
422
- - ✅ **Best Quality** - Direct from OpenAI, best-in-class
423
- - ✅ **Lowest Latency** - No intermediaries
424
- - ✅ **Simple Setup** - Just one API key
425
- - ✅ **Organization Support** - Use org-level API keys for teams
426
-
427
- ---
428
-
429
- ## Provider Comparison
430
-
431
- ### Feature Comparison
432
-
433
- | Feature | Ollama | llama.cpp | OpenRouter | OpenAI |
434
- |---------|--------|-----------|------------|--------|
435
- | **Cost** | **FREE** | **FREE** | $0.01-0.10/mo | $0.01-0.10/mo |
436
- | **Privacy** | 🔒 Local | 🔒 Local | ☁️ Cloud | ☁️ Cloud |
437
- | **Setup** | Easy | Medium | Easy | Easy |
438
- | **Quality** | Good | Good | **Excellent** | **Excellent** |
439
- | **Speed** | Fast | **Faster** | Fast | Fast |
440
- | **Offline** | ✅ Yes | ✅ Yes | ❌ No | ❌ No |
441
- | **GPU Support** | Yes | **Yes (more options)** | N/A | N/A |
442
- | **Model Choice** | Limited | **Any GGUF** | Many | Few |
443
- | **Dimensions** | 384-1024 | 384-1024 | 1024-3072 | 1536-3072 |
444
-
445
- ### Cost Comparison (100K embeddings/month)
446
-
447
- | Provider | Model | Monthly Cost |
448
- |----------|-------|--------------|
449
- | **Ollama** | Any | **$0** (100% FREE) 🔒 |
450
- | **llama.cpp** | Any | **$0** (100% FREE) 🔒 |
451
- | **OpenRouter** | text-embedding-3-small | **$0.02** |
452
- | **OpenRouter** | text-embedding-ada-002 | $0.10 |
453
- | **OpenRouter** | voyage-code-2 | $0.12 |
454
- | **OpenAI** | text-embedding-3-small | **$0.02** |
455
- | **OpenAI** | text-embedding-ada-002 | $0.10 |
456
- | **OpenAI** | text-embedding-3-large | $0.13 |
457
-
458
- ---
459
-
460
- ## Embeddings Provider Override
461
-
462
- By default, Lynkr uses the same provider as `MODEL_PROVIDER` for embeddings (if supported). To use a different provider for embeddings:
463
-
464
- ```env
465
- # Use Databricks for chat, but Ollama for embeddings (privacy + cost savings)
466
- MODEL_PROVIDER=databricks
467
- DATABRICKS_API_BASE=https://your-workspace.databricks.com
468
- DATABRICKS_API_KEY=your-key
469
-
470
- # Override embeddings provider
471
- EMBEDDINGS_PROVIDER=ollama
472
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
473
- ```
474
-
475
- **Smart provider detection:**
476
- - Uses same provider as chat (if embeddings supported)
477
- - Or automatically selects first available embeddings provider
478
- - Or use `EMBEDDINGS_PROVIDER` to force a specific provider
479
-
480
- ---
481
-
482
- ## Recommended Configurations
483
-
484
- ### 1. Privacy-First (100% Local, FREE)
485
-
486
- **Best for:** Sensitive codebases, offline work, zero cloud dependencies
487
-
488
- ```env
489
- # Chat: Ollama (local)
490
- MODEL_PROVIDER=ollama
491
- OLLAMA_MODEL=llama3.1:8b
492
-
493
- # Embeddings: Ollama (local)
494
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
495
-
496
- # Everything 100% local, 100% private, 100% FREE!
497
- ```
498
-
499
- **Benefits:**
500
- - ✅ Zero cloud dependencies
501
- - ✅ All data stays on your machine
502
- - ✅ Works offline
503
- - ✅ 100% FREE
504
-
505
- ---
506
-
507
- ### 2. Simplest (One Key for Everything)
508
-
509
- **Best for:** Easy setup, flexibility, quality
510
-
511
- ```env
512
- # Chat + Embeddings: OpenRouter with ONE key
513
- MODEL_PROVIDER=openrouter
514
- OPENROUTER_API_KEY=sk-or-v1-your-key
515
- OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
516
-
517
- # Embeddings work automatically with same key!
518
- # Optional: Specify model for cost savings
519
- OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
520
- ```
521
-
522
- **Benefits:**
523
- - ✅ ONE key for everything
524
- - ✅ Best quality embeddings
525
- - ✅ 100+ chat models available
526
- - ✅ ~$5-10/month total cost
527
-
528
- ---
529
-
530
- ### 3. Hybrid (Best of Both Worlds)
531
-
532
- **Best for:** Privacy + Quality + Cost Optimization
533
-
534
- ```env
535
- # Chat: Tier-based routing (set all 4 to enable)
536
- TIER_SIMPLE=ollama:llama3.2
537
- TIER_MEDIUM=openrouter:openai/gpt-4o-mini
538
- TIER_COMPLEX=databricks:databricks-claude-sonnet-4-5
539
- TIER_REASONING=databricks:databricks-claude-sonnet-4-5
540
- FALLBACK_ENABLED=true
541
- FALLBACK_PROVIDER=databricks
542
- DATABRICKS_API_BASE=https://your-workspace.databricks.com
543
- DATABRICKS_API_KEY=your-key
544
-
545
- # Embeddings: Ollama (local, private)
546
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
547
-
548
- # Result: Free + private embeddings, mostly free chat, cloud for complex tasks
549
- ```
550
-
551
- **Benefits:**
552
- - ✅ 70-80% of chat requests FREE (Ollama via TIER_SIMPLE)
553
- - ✅ 100% private embeddings (local)
554
- - ✅ Cloud quality for complex tasks
555
- - ✅ Intelligent automatic tier-based routing
556
-
557
- ---
558
-
559
- ### 4. Enterprise (Best Quality)
560
-
561
- **Best for:** Large teams, quality-critical applications
562
-
563
- ```env
564
- # Chat: Databricks (enterprise SLA)
565
- MODEL_PROVIDER=databricks
566
- DATABRICKS_API_BASE=https://your-workspace.databricks.com
567
- DATABRICKS_API_KEY=your-key
568
-
569
- # Embeddings: OpenRouter (best quality)
570
- OPENROUTER_API_KEY=sk-or-v1-your-key
571
- OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2 # Code-specialized
572
- ```
573
-
574
- **Benefits:**
575
- - ✅ Enterprise chat (Claude 4.5)
576
- - ✅ Best embedding quality (code-specialized)
577
- - ✅ Separate billing/limits for chat vs embeddings
578
- - ✅ Production-ready reliability
579
-
580
- ---
581
-
582
- ## Testing & Verification
583
-
584
- ### Test Embeddings Endpoint
585
-
586
- ```bash
587
- # Test embedding generation
588
- curl http://localhost:8081/v1/embeddings \
589
- -H "Content-Type: application/json" \
590
- -d '{
591
- "input": "function to sort an array",
592
- "model": "text-embedding-ada-002"
593
- }'
594
-
595
- # Should return JSON with embedding vector
596
- # Example response:
597
- # {
598
- # "object": "list",
599
- # "data": [{
600
- # "object": "embedding",
601
- # "embedding": [0.123, -0.456, 0.789, ...], # 768-3072 dimensions
602
- # "index": 0
603
- # }],
604
- # "model": "text-embedding-ada-002",
605
- # "usage": {"prompt_tokens": 7, "total_tokens": 7}
606
- # }
607
- ```
608
-
609
- ### Test in Cursor
610
-
611
- 1. **Open Cursor IDE**
612
- 2. **Open a project**
613
- 3. **Press Cmd+L** (or Ctrl+L)
614
- 4. **Type:** `@Codebase find authentication logic`
615
- 5. **Expected:** Cursor returns relevant files
616
-
617
- If @Codebase doesn't work:
618
- - Check embeddings endpoint: `curl http://localhost:8081/v1/embeddings` (should not return 501)
619
- - Restart Lynkr after adding embeddings config
620
- - Restart Cursor to re-index codebase
621
-
622
- ---
623
-
624
- ## Troubleshooting
625
-
626
- ### @Codebase Doesn't Work
627
-
628
- **Symptoms:** @Codebase doesn't return results or shows error
629
-
630
- **Solutions:**
631
-
632
- 1. **Verify embeddings are configured:**
633
- ```bash
634
- curl http://localhost:8081/v1/embeddings \
635
- -H "Content-Type: application/json" \
636
- -d '{"input":"test","model":"text-embedding-ada-002"}'
637
-
638
- # Should return embeddings, not 501 error
639
- ```
640
-
641
- 2. **Check embeddings provider in .env:**
642
- ```bash
643
- # Verify ONE of these is set:
644
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
645
- # OR
646
- LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
647
- # OR
648
- OPENROUTER_API_KEY=sk-or-v1-your-key
649
- # OR
650
- OPENAI_API_KEY=sk-your-key
651
- ```
652
-
653
- 3. **Restart Lynkr** after adding embeddings config
654
-
655
- 4. **Restart Cursor** to re-index codebase
656
-
657
- ---
658
-
659
- ### Poor Search Results
660
-
661
- **Symptoms:** @Codebase returns irrelevant files
662
-
663
- **Solutions:**
664
-
665
- 1. **Upgrade to better embedding model:**
666
- ```bash
667
- # Ollama: Use larger model
668
- ollama pull mxbai-embed-large
669
- OLLAMA_EMBEDDINGS_MODEL=mxbai-embed-large
670
-
671
- # OpenRouter: Use code-specialized model
672
- OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
673
- ```
674
-
675
- 2. **Switch to cloud embeddings:**
676
- - Local models (Ollama/llama.cpp): Good quality
677
- - Cloud models (OpenRouter/OpenAI): Excellent quality
678
-
679
- 3. **This may be a Cursor indexing issue:**
680
- - Close and reopen workspace in Cursor
681
- - Wait for Cursor to re-index
682
-
683
- ---
684
-
685
- ### Ollama Model Not Found
686
-
687
- **Symptoms:** `Error: model "nomic-embed-text" not found`
688
-
689
- **Solutions:**
690
-
691
- ```bash
692
- # List available models
693
- ollama list
694
-
695
- # Pull the model
696
- ollama pull nomic-embed-text
697
-
698
- # Verify it's available
699
- ollama list
700
- # Should show: nomic-embed-text ...
701
- ```
702
-
703
- ---
704
-
705
- ### llama.cpp Connection Refused
706
-
707
- **Symptoms:** `ECONNREFUSED` when accessing llama.cpp endpoint
708
-
709
- **Solutions:**
710
-
711
- 1. **Verify llama-server is running:**
712
- ```bash
713
- lsof -i :8080
714
- # Should show llama-server process
715
- ```
716
-
717
- 2. **Start llama-server with embedding model:**
718
- ```bash
719
- ./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080 --embedding
720
- ```
721
-
722
- 3. **Test endpoint:**
723
- ```bash
724
- curl http://localhost:8080/health
725
- # Should return: {"status":"ok"}
726
- ```
727
-
728
- ---
729
-
730
- ### Rate Limiting (Cloud Providers)
731
-
732
- **Symptoms:** Too many requests error (429)
733
-
734
- **Solutions:**
735
-
736
- 1. **Switch to local embeddings:**
737
- ```env
738
- # Ollama (no rate limits, 100% FREE)
739
- OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
740
- ```
741
-
742
- 2. **Use OpenRouter** (pooled rate limits):
743
- ```env
744
- OPENROUTER_API_KEY=sk-or-v1-your-key
745
- ```
746
-
747
- ---
748
-
749
- ## Next Steps
750
-
751
- - **[Cursor Integration](cursor-integration.md)** - Full Cursor IDE setup guide
752
- - **[Provider Configuration](providers.md)** - Configure all providers
753
- - **[Installation Guide](installation.md)** - Install Lynkr
754
- - **[Troubleshooting](troubleshooting.md)** - More troubleshooting tips
755
-
756
- ---
757
-
758
- ## Getting Help
759
-
760
- - **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
761
- - **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report bugs
762
- - **[FAQ](faq.md)** - Frequently asked questions
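
The removed guide says that embeddings map similar code to similar vectors and shows an OpenAI-style `/v1/embeddings` request on `localhost:8081`. The sketch below illustrates that idea under stated assumptions: the URL, request body, and response shape (`data[0].embedding`) are copied from the guide's curl example, while every function name is hypothetical and not part of Lynkr. Cosine similarity is used only because it is a common way to compare embedding vectors; the guide does not say which metric Cursor applies. The global `fetch` call assumes Node.js 18 or later.

```javascript
// Illustrative sketch, not Lynkr code. URL and default model come from the
// curl example in the removed guide; all function names are hypothetical.
const EMBEDDINGS_URL = "http://localhost:8081/v1/embeddings";

// Build the JSON body for an OpenAI-style embeddings request.
function buildEmbeddingRequest(input, model = "text-embedding-ada-002") {
  return { input, model };
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each candidate against the query vector and sort best match first.
function rankBySimilarity(queryVector, candidates) {
  return candidates
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(queryVector, vector) }))
    .sort((x, y) => y.score - x.score);
}

// Request one embedding from the endpoint (assumes the server is running).
async function fetchEmbedding(text, url = EMBEDDINGS_URL) {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildEmbeddingRequest(text)),
  });
  if (!res.ok) {
    throw new Error(`embeddings request failed: HTTP ${res.status}`);
  }
  const json = await res.json();
  return json.data[0].embedding;
}
```

With a server running, one could embed a query and a handful of snippets with `fetchEmbedding`, then pass the vectors to `rankBySimilarity` to see which snippet scores closest. The ranking functions are pure, so they can be tried with hand-written vectors without any provider configured.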