lynkr 3.3.1 → 4.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +276 -2177
- package/README.md.backup +2996 -0
- package/docs/GSD_LEARNINGS.md +1116 -0
- package/docs/LOCAL_EMBEDDINGS_PLAN.md +1024 -0
- package/documentation/README.md +98 -0
- package/documentation/api.md +806 -0
- package/documentation/claude-code-cli.md +672 -0
- package/documentation/contributing.md +571 -0
- package/documentation/cursor-integration.md +731 -0
- package/documentation/docker.md +867 -0
- package/documentation/embeddings.md +760 -0
- package/documentation/faq.md +659 -0
- package/documentation/features.md +396 -0
- package/documentation/installation.md +706 -0
- package/documentation/memory-system.md +476 -0
- package/documentation/production.md +601 -0
- package/documentation/providers.md +735 -0
- package/documentation/testing.md +629 -0
- package/documentation/token-optimization.md +323 -0
- package/documentation/tools.md +697 -0
- package/documentation/troubleshooting.md +864 -0
- package/package.json +2 -2
- package/src/api/openai-router.js +919 -0
- package/src/api/router.js +4 -0
- package/src/clients/openai-format.js +427 -0
- package/src/config/index.js +8 -0
- package/test/cursor-integration.test.js +484 -0
@@ -0,0 +1,760 @@
# Embeddings Configuration Guide

Complete guide to configuring embeddings for Cursor @Codebase semantic search and code understanding.

---

## Overview

**Embeddings** enable semantic code search in Cursor IDE's @Codebase feature. Instead of keyword matching, embeddings understand the *meaning* of your code, allowing you to search for functionality, concepts, or patterns.

### What Are Embeddings?

Embeddings convert text (code, comments, documentation) into high-dimensional vectors that capture semantic meaning. Similar code gets similar vectors, enabling:

- **@Codebase Search** - Find relevant code by describing what you need
- **Automatic Context** - Cursor automatically includes relevant files in conversations
- **Find Similar Code** - Discover code patterns and examples in your codebase
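
To make "similar code gets similar vectors" concrete, here is a small sketch that embeds two phrasings of the same task and compares them with cosine similarity (a score near 1.0 means "same meaning"). It assumes the local Ollama endpoint set up in Option 1 below, plus `jq`:

```bash
# Embed two descriptions of the same task and compare them with cosine similarity
emb() {
  curl -s http://localhost:11434/api/embeddings \
    -d "{\"model\":\"nomic-embed-text\",\"prompt\":\"$1\"}" | jq -r '.embedding | @csv'
}

a=$(emb "function that sorts an array")
b=$(emb "routine that orders a list of numbers")

# Dot product over the product of vector norms = cosine similarity
paste -d',' <(echo "$a" | tr ',' '\n') <(echo "$b" | tr ',' '\n') \
  | awk -F',' '{dot+=$1*$2; na+=$1*$1; nb+=$2*$2} END {printf "cosine similarity: %.3f\n", dot/(sqrt(na)*sqrt(nb))}'
```

Running the same comparison against an unrelated prompt (for example "parse a YAML config file") should produce a noticeably lower score.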

### Why Use Embeddings?

**Without embeddings:**
- ❌ Keyword-only search (`grep`, exact string matching)
- ❌ No semantic understanding
- ❌ Can't find code by describing its purpose

**With embeddings:**
- ✅ Semantic search ("find authentication logic")
- ✅ Concept-based discovery ("show me error handling patterns")
- ✅ Similar code detection ("code like this function")

---

## Supported Embedding Providers

Lynkr supports 4 embedding providers with different tradeoffs:

| Provider | Cost | Privacy | Setup | Quality | Best For |
|----------|------|---------|-------|---------|----------|
| **Ollama** | **FREE** | 🔒 100% Local | Easy | Good | Privacy, offline, no costs |
| **llama.cpp** | **FREE** | 🔒 100% Local | Medium | Good | Performance, GPU, GGUF models |
| **OpenRouter** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Simplicity, quality, one key |
| **OpenAI** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Best quality, direct access |

---

## Option 1: Ollama (Recommended for Privacy)

### Overview

- **Cost:** 100% FREE 🔒
- **Privacy:** All data stays on your machine
- **Setup:** Easy (5 minutes)
- **Quality:** Good (768-1024 dimensions)
- **Best for:** Privacy-focused teams, offline work, zero cloud dependencies

### Installation & Setup

```bash
# 1. Install Ollama (if not already installed)
brew install ollama # macOS
# Or download from: https://ollama.ai/download

# 2. Start Ollama service
ollama serve

# 3. Pull embedding model (in separate terminal)
ollama pull nomic-embed-text

# 4. Verify model is available
ollama list
# Should show: nomic-embed-text ...
```

### Configuration

Add to `.env`:

```env
# Ollama embeddings configuration
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddings
```
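
Once these variables are in place, restart Lynkr and confirm the wiring end to end by hitting its OpenAI-compatible embeddings endpoint (the same check used in Testing & Verification below; Lynkr's port 8081 is assumed from that section):

```bash
# Quick end-to-end check through Lynkr (assumes Lynkr listening on port 8081)
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input":"sanity check","model":"text-embedding-ada-002"}'
# A working setup returns a JSON body with data[0].embedding; a 501 means no embeddings provider is configured
```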

### Available Models

**nomic-embed-text** (Recommended) ⭐
```bash
ollama pull nomic-embed-text
```
- **Dimensions:** 768
- **Parameters:** 137M
- **Quality:** Excellent for code search
- **Speed:** Fast (~50ms per query)
- **Best for:** General purpose, best all-around choice

**mxbai-embed-large** (Higher Quality)
```bash
ollama pull mxbai-embed-large
```
- **Dimensions:** 1024
- **Parameters:** 335M
- **Quality:** Higher quality than nomic-embed-text
- **Speed:** Slower (~100ms per query)
- **Best for:** Large codebases where quality matters most

**all-minilm** (Fastest)
```bash
ollama pull all-minilm
```
- **Dimensions:** 384
- **Parameters:** 23M
- **Quality:** Good for simple searches
- **Speed:** Very fast (~20ms per query)
- **Best for:** Small codebases, speed-critical applications
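
The per-query speeds listed above are ballpark figures; a rough way to measure them on your own hardware is to time one request per pulled model (a small sketch, assuming `ollama serve` is running and the models are already pulled; the first request to each model also includes model load time):

```bash
# Time one embedding request against each locally pulled model
for model in nomic-embed-text mxbai-embed-large all-minilm; do
  printf '%-20s ' "$model"
  curl -s -o /dev/null -w '%{time_total}s\n' http://localhost:11434/api/embeddings \
    -d "{\"model\":\"$model\",\"prompt\":\"function to sort array\"}"
done
```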

### Testing

```bash
# Test embedding generation
curl http://localhost:11434/api/embeddings \
  -d '{"model":"nomic-embed-text","prompt":"function to sort array"}'

# Should return JSON with embedding vector
```

### Benefits

- ✅ **100% FREE** - No API costs ever
- ✅ **100% Private** - All data stays on your machine
- ✅ **Offline** - Works without internet
- ✅ **Easy Setup** - Install → Pull model → Configure
- ✅ **Good Quality** - Excellent for code search
- ✅ **Multiple Models** - Choose speed vs quality tradeoff

---

## Option 2: llama.cpp (Maximum Performance)

### Overview

- **Cost:** 100% FREE 🔒
- **Privacy:** All data stays on your machine
- **Setup:** Medium (15 minutes, requires compilation)
- **Quality:** Good (same as Ollama models, GGUF format)
- **Best for:** Performance optimization, GPU acceleration, GGUF models

### Installation & Setup

```bash
# 1. Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Build with GPU support (optional):
# For CUDA (NVIDIA): make LLAMA_CUDA=1
# For Metal (Apple Silicon): make LLAMA_METAL=1
# For CPU only: make
make

# 2. Download embedding model (GGUF format)
# Example: nomic-embed-text GGUF
wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf

# 3. Start llama-server with embedding model
./llama-server \
  -m nomic-embed-text-v1.5.Q4_K_M.gguf \
  --port 8080 \
  --embedding

# 4. Verify server is running
curl http://localhost:8080/health
# Should return: {"status":"ok"}
```
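
If you script this startup (for example in a dev bootstrap), it helps to wait for the health endpoint before sending traffic. A minimal sketch using only the flags and endpoint shown above:

```bash
#!/usr/bin/env bash
# Start llama-server in the background and wait until /health reports it is up.
set -euo pipefail

MODEL=nomic-embed-text-v1.5.Q4_K_M.gguf
PORT=8080

./llama-server -m "$MODEL" --port "$PORT" --embedding &
SERVER_PID=$!

# Poll the health endpoint (documented above) for up to ~30 seconds
for _ in $(seq 1 30); do
  if curl -sf "http://localhost:${PORT}/health" > /dev/null; then
    echo "llama-server is ready on port ${PORT} (pid ${SERVER_PID})"
    exit 0
  fi
  sleep 1
done

echo "llama-server did not become healthy in time" >&2
kill "$SERVER_PID" 2>/dev/null || true
exit 1
```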

### Configuration

Add to `.env`:

```env
# llama.cpp embeddings configuration
LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
```

### Available Models (GGUF)

**nomic-embed-text-v1.5** (Recommended) ⭐
- **File:** `nomic-embed-text-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
- **Dimensions:** 768
- **Size:** ~80MB
- **Quality:** Excellent for code
- **Best for:** Best all-around choice

**all-MiniLM-L6-v2** (Fastest)
- **File:** `all-MiniLM-L6-v2.Q4_K_M.gguf`
- **Download:** https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2-GGUF
- **Dimensions:** 384
- **Size:** ~25MB
- **Quality:** Good for simple searches
- **Best for:** Speed-critical applications

**bge-large-en-v1.5** (Highest Quality)
- **File:** `bge-large-en-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/BAAI/bge-large-en-v1.5-GGUF
- **Dimensions:** 1024
- **Size:** ~350MB
- **Quality:** Best quality for embeddings
- **Best for:** Large codebases, quality-critical applications

### GPU Support

llama.cpp supports multiple GPU backends for faster embedding generation:

**NVIDIA CUDA:**
```bash
make LLAMA_CUDA=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Apple Silicon Metal:**
```bash
make LLAMA_METAL=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**AMD ROCm:**
```bash
make LLAMA_ROCM=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Vulkan (Universal):**
```bash
make LLAMA_VULKAN=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

### Testing

```bash
# Test embedding generation
curl http://localhost:8080/embeddings \
  -H "Content-Type: application/json" \
  -d '{"content":"function to sort array"}'

# Should return JSON with embedding vector
```

### Benefits

- ✅ **100% FREE** - No API costs
- ✅ **100% Private** - All data stays local
- ✅ **Faster than Ollama** - Optimized C++ implementation
- ✅ **GPU Acceleration** - CUDA, Metal, ROCm, Vulkan
- ✅ **Lower Memory** - Quantization options (Q4, Q5, Q8)
- ✅ **Any GGUF Model** - Use any embedding model from HuggingFace

### llama.cpp vs Ollama

| Feature | Ollama | llama.cpp |
|---------|--------|-----------|
| **Setup** | Easy (app) | Manual (compile) |
| **Model Format** | Ollama-specific | Any GGUF model |
| **Performance** | Good | **Better** (optimized C++) |
| **GPU Support** | Yes | Yes (more options) |
| **Memory Usage** | Higher | **Lower** (more quantization options) |
| **Flexibility** | Limited models | **Any GGUF** from HuggingFace |

---

## Option 3: OpenRouter (Simplest Cloud)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Very easy (2 minutes)
- **Quality:** Excellent (best-in-class models)
- **Best for:** Simplicity, quality, one key for chat + embeddings

### Configuration

Add to `.env`:

```env
# OpenRouter configuration (if not already set)
OPENROUTER_API_KEY=sk-or-v1-your-key-here

# Embeddings model (optional, defaults to text-embedding-ada-002)
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Note:** If you're already using `MODEL_PROVIDER=openrouter`, embeddings work automatically with the same key! No additional configuration needed.

### Getting OpenRouter API Key

1. Visit [openrouter.ai](https://openrouter.ai)
2. Sign in with GitHub, Google, or email
3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
4. Create a new API key
5. Add credits (pay-as-you-go, no subscription)

### Available Models

**openai/text-embedding-3-small** (Recommended) ⭐
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper than ada-002!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**openai/text-embedding-ada-002** (Standard)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (widely supported standard)
- **Best for:** Compatibility

**openai/text-embedding-3-large** (Best Quality)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Large codebases where quality matters most

**voyage/voyage-code-2** (Code-Specialized)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Optimized specifically for code
- **Best for:** Code search (better than general models)

**voyage/voyage-2** (General Purpose)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Best for general text
- **Best for:** Mixed code + documentation

### Benefits

- ✅ **ONE Key** - Same key for chat + embeddings
- ✅ **No Setup** - Works immediately after adding key
- ✅ **Best Quality** - State-of-the-art embedding models
- ✅ **Automatic Fallbacks** - Switches providers if one is down
- ✅ **Competitive Pricing** - Often cheaper than direct providers

---

## Option 4: OpenAI (Direct)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Easy (5 minutes)
- **Quality:** Excellent (best-in-class, direct from OpenAI)
- **Best for:** Best quality, direct OpenAI access

### Configuration

Add to `.env`:

```env
# OpenAI configuration (if not already set)
OPENAI_API_KEY=sk-your-openai-api-key

# Embeddings model (optional, defaults to text-embedding-ada-002)
# Recommended: Use text-embedding-3-small for 80% cost savings
# OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```

### Getting OpenAI API Key

1. Visit [platform.openai.com](https://platform.openai.com)
2. Sign up or log in
3. Go to [API Keys](https://platform.openai.com/api-keys)
4. Create a new API key
5. Add credits to your account (pay-as-you-go)
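
Before wiring the key into Lynkr, you can sanity-check it directly against OpenAI's embeddings API (a standard OpenAI REST call; export your key as `OPENAI_API_KEY` first):

```bash
# Direct request to OpenAI's embeddings API to confirm the key works
curl https://api.openai.com/v1/embeddings \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"input":"function to sort array","model":"text-embedding-3-small"}'
# A valid key returns data[0].embedding; an invalid key returns a 401 error
```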

### Available Models

**text-embedding-3-small** (Recommended) ⭐
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**text-embedding-ada-002** (Standard)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (standard, widely used)
- **Best for:** Compatibility

**text-embedding-3-large** (Best Quality)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Maximum quality for large codebases

### Benefits

- ✅ **Best Quality** - Direct from OpenAI, best-in-class
- ✅ **Lowest Latency** - No intermediaries
- ✅ **Simple Setup** - Just one API key
- ✅ **Organization Support** - Use org-level API keys for teams

---

## Provider Comparison

### Feature Comparison

| Feature | Ollama | llama.cpp | OpenRouter | OpenAI |
|---------|--------|-----------|------------|--------|
| **Cost** | **FREE** | **FREE** | $0.01-0.10/mo | $0.01-0.10/mo |
| **Privacy** | 🔒 Local | 🔒 Local | ☁️ Cloud | ☁️ Cloud |
| **Setup** | Easy | Medium | Easy | Easy |
| **Quality** | Good | Good | **Excellent** | **Excellent** |
| **Speed** | Fast | **Faster** | Fast | Fast |
| **Offline** | ✅ Yes | ✅ Yes | ❌ No | ❌ No |
| **GPU Support** | Yes | **Yes (more options)** | N/A | N/A |
| **Model Choice** | Limited | **Any GGUF** | Many | Few |
| **Dimensions** | 384-1024 | 384-1024 | 1024-3072 | 1536-3072 |

### Cost Comparison (100K embeddings/month)

| Provider | Model | Monthly Cost |
|----------|-------|--------------|
| **Ollama** | Any | **$0** (100% FREE) 🔒 |
| **llama.cpp** | Any | **$0** (100% FREE) 🔒 |
| **OpenRouter** | text-embedding-3-small | **$0.02** |
| **OpenRouter** | text-embedding-ada-002 | $0.10 |
| **OpenRouter** | voyage-code-2 | $0.12 |
| **OpenAI** | text-embedding-3-small | **$0.02** |
| **OpenAI** | text-embedding-ada-002 | $0.10 |
| **OpenAI** | text-embedding-3-large | $0.13 |

---

## Embeddings Provider Override

By default, Lynkr uses the same provider as `MODEL_PROVIDER` for embeddings (if supported). To use a different provider for embeddings:

```env
# Use Databricks for chat, but Ollama for embeddings (privacy + cost savings)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Override embeddings provider
EMBEDDINGS_PROVIDER=ollama
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
```

**Smart provider detection:**
- Uses same provider as chat (if embeddings supported)
- Or automatically selects first available embeddings provider
- Or use `EMBEDDINGS_PROVIDER` to force a specific provider
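
The override also works in the other direction, for example keeping chat local while sending embeddings to OpenAI. A sketch of that combination (the `openai` value for `EMBEDDINGS_PROVIDER` is assumed here by analogy with the `ollama` example above):

```env
# Chat: Ollama (local)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b

# Embeddings: OpenAI (provider name assumed; see note above)
EMBEDDINGS_PROVIDER=openai
OPENAI_API_KEY=sk-your-openai-api-key
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```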

---

## Recommended Configurations

### 1. Privacy-First (100% Local, FREE)

**Best for:** Sensitive codebases, offline work, zero cloud dependencies

```env
# Chat: Ollama (local)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b

# Embeddings: Ollama (local)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Everything 100% local, 100% private, 100% FREE!
```

**Benefits:**
- ✅ Zero cloud dependencies
- ✅ All data stays on your machine
- ✅ Works offline
- ✅ 100% FREE

---

### 2. Simplest (One Key for Everything)

**Best for:** Easy setup, flexibility, quality

```env
# Chat + Embeddings: OpenRouter with ONE key
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

# Embeddings work automatically with same key!
# Optional: Specify model for cost savings
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Benefits:**
- ✅ ONE key for everything
- ✅ Best quality embeddings
- ✅ 100+ chat models available
- ✅ ~$5-10/month total cost

---

### 3. Hybrid (Best of Both Worlds)

**Best for:** Privacy + Quality + Cost Optimization

```env
# Chat: Ollama + Cloud fallback
PREFER_OLLAMA=true
FALLBACK_ENABLED=true
OLLAMA_MODEL=llama3.1:8b
FALLBACK_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: Ollama (local, private)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Result: Free + private embeddings, mostly free chat, cloud for complex tasks
```

**Benefits:**
- ✅ 70-80% of chat requests FREE (Ollama)
- ✅ 100% private embeddings (local)
- ✅ Cloud quality for complex tasks
- ✅ Intelligent automatic routing

---

### 4. Enterprise (Best Quality)

**Best for:** Large teams, quality-critical applications

```env
# Chat: Databricks (enterprise SLA)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: OpenRouter (best quality)
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2 # Code-specialized
```

**Benefits:**
- ✅ Enterprise chat (Claude 4.5)
- ✅ Best embedding quality (code-specialized)
- ✅ Separate billing/limits for chat vs embeddings
- ✅ Production-ready reliability

---

## Testing & Verification

### Test Embeddings Endpoint

```bash
# Test embedding generation
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "function to sort an array",
    "model": "text-embedding-ada-002"
  }'

# Should return JSON with embedding vector
# Example response:
# {
#   "object": "list",
#   "data": [{
#     "object": "embedding",
#     "embedding": [0.123, -0.456, 0.789, ...],  # 768-3072 dimensions
#     "index": 0
#   }],
#   "model": "text-embedding-ada-002",
#   "usage": {"prompt_tokens": 7, "total_tokens": 7}
# }
```
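
The length of the returned vector tells you which model actually served the request (768 for nomic-embed-text, 1536 for text-embedding-3-small/ada-002, 3072 for text-embedding-3-large), so counting the dimensions is a handy sanity check (requires `jq`):

```bash
# Count the dimensions of the returned embedding to confirm which model is serving
curl -s http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input":"function to sort an array","model":"text-embedding-ada-002"}' \
  | jq '.data[0].embedding | length'
# e.g. 768 → nomic-embed-text, 1536 → text-embedding-3-small / ada-002, 3072 → text-embedding-3-large
```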

### Test in Cursor

1. **Open Cursor IDE**
2. **Open a project**
3. **Press Cmd+L** (or Ctrl+L)
4. **Type:** `@Codebase find authentication logic`
5. **Expected:** Cursor returns relevant files

If @Codebase doesn't work:
- Check embeddings endpoint: `curl http://localhost:8081/v1/embeddings` (should not return 501)
- Restart Lynkr after adding embeddings config
- Restart Cursor to re-index codebase

---

## Troubleshooting

### @Codebase Doesn't Work

**Symptoms:** @Codebase doesn't return results or shows an error

**Solutions:**

1. **Verify embeddings are configured:**
```bash
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input":"test","model":"text-embedding-ada-002"}'

# Should return embeddings, not a 501 error
```

2. **Check embeddings provider in `.env`:**
```env
# Verify ONE of these is set:
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
# OR
LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
# OR
OPENROUTER_API_KEY=sk-or-v1-your-key
# OR
OPENAI_API_KEY=sk-your-key
```

3. **Restart Lynkr** after adding embeddings config

4. **Restart Cursor** to re-index codebase

---

### Poor Search Results

**Symptoms:** @Codebase returns irrelevant files

**Solutions:**

1. **Upgrade to a better embedding model:**
```bash
# Ollama: Use larger model
ollama pull mxbai-embed-large
# Then in .env:
OLLAMA_EMBEDDINGS_MODEL=mxbai-embed-large

# OpenRouter: Use code-specialized model (in .env)
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
```

2. **Switch to cloud embeddings** for higher quality:
- Local models (Ollama/llama.cpp): Good quality
- Cloud models (OpenRouter/OpenAI): Excellent quality

3. **This may be a Cursor indexing issue:**
- Close and reopen the workspace in Cursor
- Wait for Cursor to re-index

---

### Ollama Model Not Found

**Symptoms:** `Error: model "nomic-embed-text" not found`

**Solutions:**

```bash
# List available models
ollama list

# Pull the model
ollama pull nomic-embed-text

# Verify it's available
ollama list
# Should show: nomic-embed-text ...
```

---

### llama.cpp Connection Refused

**Symptoms:** `ECONNREFUSED` when accessing the llama.cpp endpoint

**Solutions:**

1. **Verify llama-server is running:**
```bash
lsof -i :8080
# Should show llama-server process
```

2. **Start llama-server with the embedding model:**
```bash
./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080 --embedding
```

3. **Test the endpoint:**
```bash
curl http://localhost:8080/health
# Should return: {"status":"ok"}
```

---

### Rate Limiting (Cloud Providers)

**Symptoms:** Too many requests errors (HTTP 429)

**Solutions:**

1. **Switch to local embeddings:**
```env
# Ollama (no rate limits, 100% FREE)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
```

2. **Use OpenRouter** (pooled rate limits):
```env
OPENROUTER_API_KEY=sk-or-v1-your-key
```

---

## Next Steps

- **[Cursor Integration](cursor-integration.md)** - Full Cursor IDE setup guide
- **[Provider Configuration](providers.md)** - Configure all providers
- **[Installation Guide](installation.md)** - Install Lynkr
- **[Troubleshooting](troubleshooting.md)** - More troubleshooting tips

---

## Getting Help

- **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
- **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report bugs
- **[FAQ](faq.md)** - Frequently asked questions