lynkr 3.3.1 → 4.1.0

# Embeddings Configuration Guide

Complete guide to configuring embeddings for Cursor @Codebase semantic search and code understanding.

---

## Overview

**Embeddings** enable semantic code search in Cursor IDE's @Codebase feature. Instead of keyword matching, embeddings understand the *meaning* of your code, allowing you to search for functionality, concepts, or patterns.

### What Are Embeddings?

Embeddings convert text (code, comments, documentation) into high-dimensional vectors that capture semantic meaning. Similar code gets similar vectors, enabling:

- **@Codebase Search** - Find relevant code by describing what you need
- **Automatic Context** - Cursor automatically includes relevant files in conversations
- **Find Similar Code** - Discover code patterns and examples in your codebase
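
The "similar code gets similar vectors" idea can be made concrete with cosine similarity, the metric embedding-based search typically uses to rank matches. A minimal sketch; the three-dimensional vectors are toy values standing in for real 768+ dimension embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
sort_fn  = [0.9, 0.1, 0.0]   # "function to sort an array"
order_fn = [0.8, 0.2, 0.1]   # "order a list ascending"
auth_fn  = [0.0, 0.1, 0.9]   # "verify a login token"

print(cosine_similarity(sort_fn, order_fn))  # high: related meaning
print(cosine_similarity(sort_fn, auth_fn))   # low: unrelated meaning
```

Search works by embedding your query the same way and returning the code chunks whose vectors score highest against it.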

### Why Use Embeddings?

**Without embeddings:**
- ❌ Keyword-only search (`grep`, exact string matching)
- ❌ No semantic understanding
- ❌ Can't find code by describing its purpose

**With embeddings:**
- ✅ Semantic search ("find authentication logic")
- ✅ Concept-based discovery ("show me error handling patterns")
- ✅ Similar code detection ("code like this function")

---

## Supported Embedding Providers

Lynkr supports 4 embedding providers with different tradeoffs:

| Provider | Cost | Privacy | Setup | Quality | Best For |
|----------|------|---------|-------|---------|----------|
| **Ollama** | **FREE** | 🔒 100% Local | Easy | Good | Privacy, offline, no costs |
| **llama.cpp** | **FREE** | 🔒 100% Local | Medium | Good | Performance, GPU, GGUF models |
| **OpenRouter** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Simplicity, quality, one key |
| **OpenAI** | $0.01-0.10/mo | ☁️ Cloud | Easy | Excellent | Best quality, direct access |

---

## Option 1: Ollama (Recommended for Privacy)
47
+
48
+ ### Overview
49
+
50
+ - **Cost:** 100% FREE 🔒
51
+ - **Privacy:** All data stays on your machine
52
+ - **Setup:** Easy (5 minutes)
53
+ - **Quality:** Good (768-1024 dimensions)
54
+ - **Best for:** Privacy-focused teams, offline work, zero cloud dependencies
55
+
56
+ ### Installation & Setup
57
+
58
+ ```bash
59
+ # 1. Install Ollama (if not already installed)
60
+ brew install ollama # macOS
61
+ # Or download from: https://ollama.ai/download
62
+
63
+ # 2. Start Ollama service
64
+ ollama serve
65
+
66
+ # 3. Pull embedding model (in separate terminal)
67
+ ollama pull nomic-embed-text
68
+
69
+ # 4. Verify model is available
70
+ ollama list
71
+ # Should show: nomic-embed-text ...
72
+ ```
73
+
74
+ ### Configuration
75
+
76
+ Add to `.env`:
77
+
78
+ ```env
79
+ # Ollama embeddings configuration
80
+ OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
81
+ OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddings
82
+ ```
83
+
84
+ ### Available Models
85
+
86
+ **nomic-embed-text** (Recommended) ⭐
87
+ ```bash
88
+ ollama pull nomic-embed-text
89
+ ```
90
+ - **Dimensions:** 768
91
+ - **Parameters:** 137M
92
+ - **Quality:** Excellent for code search
93
+ - **Speed:** Fast (~50ms per query)
94
+ - **Best for:** General purpose, best all-around choice
95
+
96
+ **mxbai-embed-large** (Higher Quality)
97
+ ```bash
98
+ ollama pull mxbai-embed-large
99
+ ```
100
+ - **Dimensions:** 1024
101
+ - **Parameters:** 335M
102
+ - **Quality:** Higher quality than nomic-embed-text
103
+ - **Speed:** Slower (~100ms per query)
104
+ - **Best for:** Large codebases where quality matters most
105
+
106
+ **all-minilm** (Fastest)
107
+ ```bash
108
+ ollama pull all-minilm
109
+ ```
110
+ - **Dimensions:** 384
111
+ - **Parameters:** 23M
112
+ - **Quality:** Good for simple searches
113
+ - **Speed:** Very fast (~20ms per query)
114
+ - **Best for:** Small codebases, speed-critical applications
115
+
116
+ ### Testing
117
+
118
+ ```bash
119
+ # Test embedding generation
120
+ curl http://localhost:11434/api/embeddings \
121
+ -d '{"model":"nomic-embed-text","prompt":"function to sort array"}'
122
+
123
+ # Should return JSON with embedding vector
124
+ ```

### Benefits

- ✅ **100% FREE** - No API costs ever
- ✅ **100% Private** - All data stays on your machine
- ✅ **Offline** - Works without internet
- ✅ **Easy Setup** - Install → Pull model → Configure
- ✅ **Good Quality** - Excellent for code search
- ✅ **Multiple Models** - Choose speed vs quality tradeoff

---

## Option 2: llama.cpp (Maximum Performance)

### Overview

- **Cost:** 100% FREE 🔒
- **Privacy:** All data stays on your machine
- **Setup:** Medium (15 minutes, requires compilation)
- **Quality:** Good (same as Ollama models, GGUF format)
- **Best for:** Performance optimization, GPU acceleration, GGUF models

### Installation & Setup

```bash
# 1. Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Build with GPU support (optional):
# For CUDA (NVIDIA): make LLAMA_CUDA=1
# For Metal (Apple Silicon): make LLAMA_METAL=1
# For CPU only: make
make

# 2. Download embedding model (GGUF format)
# Example: nomic-embed-text GGUF
wget https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q4_K_M.gguf

# 3. Start llama-server with embedding model
./llama-server \
  -m nomic-embed-text-v1.5.Q4_K_M.gguf \
  --port 8080 \
  --embedding

# 4. Verify server is running
curl http://localhost:8080/health
# Should return: {"status":"ok"}
```

### Configuration

Add to `.env`:

```env
# llama.cpp embeddings configuration
LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
```

### Available Models (GGUF)

**nomic-embed-text-v1.5** (Recommended) ⭐
- **File:** `nomic-embed-text-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
- **Dimensions:** 768
- **Size:** ~80MB
- **Quality:** Excellent for code
- **Best for:** Best all-around choice

**all-MiniLM-L6-v2** (Fastest)
- **File:** `all-MiniLM-L6-v2.Q4_K_M.gguf`
- **Download:** https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2-GGUF
- **Dimensions:** 384
- **Size:** ~25MB
- **Quality:** Good for simple searches
- **Best for:** Speed-critical applications

**bge-large-en-v1.5** (Highest Quality)
- **File:** `bge-large-en-v1.5.Q4_K_M.gguf`
- **Download:** https://huggingface.co/BAAI/bge-large-en-v1.5-GGUF
- **Dimensions:** 1024
- **Size:** ~350MB
- **Quality:** Best quality for embeddings
- **Best for:** Large codebases, quality-critical applications

### GPU Support

llama.cpp supports multiple GPU backends for faster embedding generation:

**NVIDIA CUDA:**
```bash
make LLAMA_CUDA=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Apple Silicon Metal:**
```bash
make LLAMA_METAL=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**AMD ROCm:**
```bash
make LLAMA_ROCM=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

**Vulkan (Universal):**
```bash
make LLAMA_VULKAN=1
./llama-server -m model.gguf --embedding --n-gpu-layers 32
```

### Testing

```bash
# Test embedding generation
curl http://localhost:8080/embeddings \
  -H "Content-Type: application/json" \
  -d '{"content":"function to sort array"}'

# Should return JSON with embedding vector
```

### Benefits

- ✅ **100% FREE** - No API costs
- ✅ **100% Private** - All data stays local
- ✅ **Faster than Ollama** - Optimized C++ implementation
- ✅ **GPU Acceleration** - CUDA, Metal, ROCm, Vulkan
- ✅ **Lower Memory** - Quantization options (Q4, Q5, Q8)
- ✅ **Any GGUF Model** - Use any embedding model from HuggingFace
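
The memory savings from quantization can be estimated from bits per weight. A back-of-the-envelope sketch; the 16/8.5/5.5/4.5 bits-per-weight figures are rough approximations for F16 and the Q8/Q5/Q4 "K" schemes, not exact values:

```python
def approx_model_size_mb(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk/in-memory size of a quantized model in megabytes."""
    return n_params * bits_per_weight / 8 / 1e6

params = 137e6  # nomic-embed-text parameter count

for name, bpw in [("F16", 16.0), ("Q8_0", 8.5), ("Q5_K", 5.5), ("Q4_K", 4.5)]:
    print(f"{name}: ~{approx_model_size_mb(params, bpw):.0f} MB")
```

The Q4 estimate (~77 MB) lines up with the ~80MB file size listed above for `nomic-embed-text-v1.5.Q4_K_M.gguf`.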

### llama.cpp vs Ollama

| Feature | Ollama | llama.cpp |
|---------|--------|-----------|
| **Setup** | Easy (app) | Manual (compile) |
| **Model Format** | Ollama-specific | Any GGUF model |
| **Performance** | Good | **Better** (optimized C++) |
| **GPU Support** | Yes | Yes (more options) |
| **Memory Usage** | Higher | **Lower** (more quantization options) |
| **Flexibility** | Limited models | **Any GGUF** from HuggingFace |

---

## Option 3: OpenRouter (Simplest Cloud)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Very easy (2 minutes)
- **Quality:** Excellent (best-in-class models)
- **Best for:** Simplicity, quality, one key for chat + embeddings

### Configuration

Add to `.env`:

```env
# OpenRouter configuration (if not already set)
OPENROUTER_API_KEY=sk-or-v1-your-key-here

# Embeddings model (optional, defaults to text-embedding-ada-002)
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Note:** If you're already using `MODEL_PROVIDER=openrouter`, embeddings work automatically with the same key! No additional configuration needed.

### Getting OpenRouter API Key

1. Visit [openrouter.ai](https://openrouter.ai)
2. Sign in with GitHub, Google, or email
3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
4. Create a new API key
5. Add credits (pay-as-you-go, no subscription)

### Available Models

**openai/text-embedding-3-small** (Recommended) ⭐
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper than ada-002!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**openai/text-embedding-ada-002** (Standard)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (widely supported standard)
- **Best for:** Compatibility

**openai/text-embedding-3-large** (Best Quality)
```env
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Large codebases where quality matters most

**voyage/voyage-code-2** (Code-Specialized)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Optimized specifically for code
- **Best for:** Code search (better than general models)

**voyage/voyage-2** (General Purpose)
```env
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-2
```
- **Dimensions:** 1024
- **Cost:** $0.12 per 1M tokens
- **Quality:** Best for general text
- **Best for:** Mixed code + documentation

### Benefits

- ✅ **ONE Key** - Same key for chat + embeddings
- ✅ **No Setup** - Works immediately after adding key
- ✅ **Best Quality** - State-of-the-art embedding models
- ✅ **Automatic Fallbacks** - Switches providers if one is down
- ✅ **Competitive Pricing** - Often cheaper than direct providers

---

## Option 4: OpenAI (Direct)

### Overview

- **Cost:** ~$0.01-0.10/month (typical usage)
- **Privacy:** Cloud-based
- **Setup:** Easy (5 minutes)
- **Quality:** Excellent (best-in-class, direct from OpenAI)
- **Best for:** Best quality, direct OpenAI access

### Configuration

Add to `.env`:

```env
# OpenAI configuration (if not already set)
OPENAI_API_KEY=sk-your-openai-api-key

# Embeddings model (optional, defaults to text-embedding-ada-002)
# Recommended: Use text-embedding-3-small for 80% cost savings
# OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```

### Getting OpenAI API Key

1. Visit [platform.openai.com](https://platform.openai.com)
2. Sign up or log in
3. Go to [API Keys](https://platform.openai.com/api-keys)
4. Create a new API key
5. Add credits to your account (pay-as-you-go)

### Available Models

**text-embedding-3-small** (Recommended) ⭐
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-small
```
- **Dimensions:** 1536
- **Cost:** $0.02 per 1M tokens (80% cheaper!)
- **Quality:** Excellent
- **Best for:** Best balance of quality and cost

**text-embedding-ada-002** (Standard)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002
```
- **Dimensions:** 1536
- **Cost:** $0.10 per 1M tokens
- **Quality:** Excellent (standard, widely used)
- **Best for:** Compatibility

**text-embedding-3-large** (Best Quality)
```env
OPENAI_EMBEDDINGS_MODEL=text-embedding-3-large
```
- **Dimensions:** 3072
- **Cost:** $0.13 per 1M tokens
- **Quality:** Best quality available
- **Best for:** Maximum quality for large codebases

### Benefits

- ✅ **Best Quality** - Direct from OpenAI, best-in-class
- ✅ **Lowest Latency** - No intermediaries
- ✅ **Simple Setup** - Just one API key
- ✅ **Organization Support** - Use org-level API keys for teams

---

## Provider Comparison

### Feature Comparison

| Feature | Ollama | llama.cpp | OpenRouter | OpenAI |
|---------|--------|-----------|------------|--------|
| **Cost** | **FREE** | **FREE** | $0.01-0.10/mo | $0.01-0.10/mo |
| **Privacy** | 🔒 Local | 🔒 Local | ☁️ Cloud | ☁️ Cloud |
| **Setup** | Easy | Medium | Easy | Easy |
| **Quality** | Good | Good | **Excellent** | **Excellent** |
| **Speed** | Fast | **Faster** | Fast | Fast |
| **Offline** | ✅ Yes | ✅ Yes | ❌ No | ❌ No |
| **GPU Support** | Yes | **Yes (more options)** | N/A | N/A |
| **Model Choice** | Limited | **Any GGUF** | Many | Few |
| **Dimensions** | 384-1024 | 384-1024 | 1024-3072 | 1536-3072 |
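
Dimension count directly affects index size, since each indexed chunk stores one float per dimension. A rough sizing sketch; the 10,000-chunk codebase and float32 storage are assumptions for illustration:

```python
def index_size_mb(n_chunks: int, dimensions: int, bytes_per_float: int = 4) -> float:
    """Approximate vector-index size for n_chunks embedded code chunks."""
    return n_chunks * dimensions * bytes_per_float / 1e6

chunks = 10_000  # hypothetical mid-sized codebase
for model, dims in [("all-minilm", 384), ("nomic-embed-text", 768),
                    ("text-embedding-3-large", 3072)]:
    print(f"{model} ({dims}d): ~{index_size_mb(chunks, dims):.0f} MB")
```

Higher-dimensional models cost proportionally more storage and similarity-computation time, which is part of the quality/speed tradeoff in the table above.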

### Cost Comparison (100K embeddings/month)

| Provider | Model | Monthly Cost |
|----------|-------|--------------|
| **Ollama** | Any | **$0** (100% FREE) 🔒 |
| **llama.cpp** | Any | **$0** (100% FREE) 🔒 |
| **OpenRouter** | text-embedding-3-small | **$0.02** |
| **OpenRouter** | text-embedding-ada-002 | $0.10 |
| **OpenRouter** | voyage-code-2 | $0.12 |
| **OpenAI** | text-embedding-3-small | **$0.02** |
| **OpenAI** | text-embedding-ada-002 | $0.10 |
| **OpenAI** | text-embedding-3-large | $0.13 |
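
These figures follow directly from the per-token prices: 100K embedded snippets at roughly 10 tokens each is about 1M tokens. A quick check of the arithmetic; the 10-tokens-per-snippet average is an assumption, so scale accordingly for longer chunks:

```python
def monthly_cost(embeddings_per_month: int, avg_tokens: int,
                 price_per_1m_tokens: float) -> float:
    """Estimated monthly embedding spend in dollars."""
    total_tokens = embeddings_per_month * avg_tokens
    return total_tokens / 1_000_000 * price_per_1m_tokens

# 100K embeddings/month, ~10 tokens each, at each model's listed price
for model, price in [("text-embedding-3-small", 0.02),
                     ("text-embedding-ada-002", 0.10),
                     ("text-embedding-3-large", 0.13)]:
    print(f"{model}: ${monthly_cost(100_000, 10, price):.2f}/month")
```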

---

## Embeddings Provider Override

By default, Lynkr uses the same provider as `MODEL_PROVIDER` for embeddings (if supported). To use a different provider for embeddings:

```env
# Use Databricks for chat, but Ollama for embeddings (privacy + cost savings)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Override embeddings provider
EMBEDDINGS_PROVIDER=ollama
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
```

**Smart provider detection:**
- Uses the same provider as chat (if it supports embeddings)
- Or automatically selects the first available embeddings provider
- Or uses `EMBEDDINGS_PROVIDER` to force a specific provider

---

## Recommended Configurations

### 1. Privacy-First (100% Local, FREE)

**Best for:** Sensitive codebases, offline work, zero cloud dependencies

```env
# Chat: Ollama (local)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b

# Embeddings: Ollama (local)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Everything 100% local, 100% private, 100% FREE!
```

**Benefits:**
- ✅ Zero cloud dependencies
- ✅ All data stays on your machine
- ✅ Works offline
- ✅ 100% FREE

---

### 2. Simplest (One Key for Everything)

**Best for:** Easy setup, flexibility, quality

```env
# Chat + Embeddings: OpenRouter with ONE key
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

# Embeddings work automatically with the same key!
# Optional: Specify model for cost savings
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
```

**Benefits:**
- ✅ ONE key for everything
- ✅ Best quality embeddings
- ✅ 100+ chat models available
- ✅ ~$5-10/month total cost

---

### 3. Hybrid (Best of Both Worlds)

**Best for:** Privacy + quality + cost optimization

```env
# Chat: Ollama + cloud fallback
PREFER_OLLAMA=true
FALLBACK_ENABLED=true
OLLAMA_MODEL=llama3.1:8b
FALLBACK_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: Ollama (local, private)
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text

# Result: Free + private embeddings, mostly free chat, cloud for complex tasks
```

**Benefits:**
- ✅ 70-80% of chat requests FREE (Ollama)
- ✅ 100% private embeddings (local)
- ✅ Cloud quality for complex tasks
- ✅ Intelligent automatic routing

---

### 4. Enterprise (Best Quality)

**Best for:** Large teams, quality-critical applications

```env
# Chat: Databricks (enterprise SLA)
MODEL_PROVIDER=databricks
DATABRICKS_API_BASE=https://your-workspace.databricks.com
DATABRICKS_API_KEY=your-key

# Embeddings: OpenRouter (code-specialized model)
OPENROUTER_API_KEY=sk-or-v1-your-key
OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
```

**Benefits:**
- ✅ Enterprise chat (Claude 4.5)
- ✅ Best embedding quality (code-specialized)
- ✅ Separate billing/limits for chat vs embeddings
- ✅ Production-ready reliability

---

## Testing & Verification

### Test Embeddings Endpoint

```bash
# Test embedding generation
curl http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "function to sort an array",
    "model": "text-embedding-ada-002"
  }'

# Should return JSON with embedding vector
# Example response:
# {
#   "object": "list",
#   "data": [{
#     "object": "embedding",
#     "embedding": [0.123, -0.456, 0.789, ...],  # 768-3072 dimensions
#     "index": 0
#   }],
#   "model": "text-embedding-ada-002",
#   "usage": {"prompt_tokens": 7, "total_tokens": 7}
# }
```
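
A quick way to sanity-check a response like the one above is to parse it and inspect the vector. A minimal sketch using a trimmed sample payload; the three-element vector stands in for a real 768-3072 dimension embedding:

```python
import json

# Trimmed sample of the OpenAI-style response shown above
raw = """{
  "object": "list",
  "data": [{"object": "embedding", "embedding": [0.123, -0.456, 0.789], "index": 0}],
  "model": "text-embedding-ada-002",
  "usage": {"prompt_tokens": 7, "total_tokens": 7}
}"""

response = json.loads(raw)
vector = response["data"][0]["embedding"]

# A healthy response has one vector whose length matches the model's dimensions
print(len(vector), response["model"])  # → 3 text-embedding-ada-002
```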

### Test in Cursor

1. **Open Cursor IDE**
2. **Open a project**
3. **Press Cmd+L** (or Ctrl+L)
4. **Type:** `@Codebase find authentication logic`
5. **Expected:** Cursor returns relevant files

If @Codebase doesn't work:
- Check the embeddings endpoint: `curl http://localhost:8081/v1/embeddings` (should not return 501)
- Restart Lynkr after adding embeddings config
- Restart Cursor to re-index the codebase

---

## Troubleshooting

### @Codebase Doesn't Work

**Symptoms:** @Codebase doesn't return results or shows an error

**Solutions:**

1. **Verify embeddings are configured:**
   ```bash
   curl http://localhost:8081/v1/embeddings \
     -H "Content-Type: application/json" \
     -d '{"input":"test","model":"text-embedding-ada-002"}'

   # Should return embeddings, not a 501 error
   ```

2. **Check the embeddings provider in `.env`:**
   ```env
   # Verify ONE of these is set:
   OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
   # OR
   LLAMACPP_EMBEDDINGS_ENDPOINT=http://localhost:8080/embeddings
   # OR
   OPENROUTER_API_KEY=sk-or-v1-your-key
   # OR
   OPENAI_API_KEY=sk-your-key
   ```

3. **Restart Lynkr** after adding embeddings config

4. **Restart Cursor** to re-index the codebase

---

### Poor Search Results

**Symptoms:** @Codebase returns irrelevant files

**Solutions:**

1. **Upgrade to a better embedding model:**
   ```bash
   # Ollama: Use larger model
   ollama pull mxbai-embed-large
   OLLAMA_EMBEDDINGS_MODEL=mxbai-embed-large

   # OpenRouter: Use code-specialized model
   OPENROUTER_EMBEDDINGS_MODEL=voyage/voyage-code-2
   ```

2. **Switch to cloud embeddings:**
   - Local models (Ollama/llama.cpp): Good quality
   - Cloud models (OpenRouter/OpenAI): Excellent quality

3. **Rule out a Cursor indexing issue:**
   - Close and reopen the workspace in Cursor
   - Wait for Cursor to re-index

---

### Ollama Model Not Found

**Symptoms:** `Error: model "nomic-embed-text" not found`

**Solutions:**

```bash
# List available models
ollama list

# Pull the model
ollama pull nomic-embed-text

# Verify it's available
ollama list
# Should show: nomic-embed-text ...
```

---

### llama.cpp Connection Refused

**Symptoms:** `ECONNREFUSED` when accessing the llama.cpp endpoint

**Solutions:**

1. **Verify llama-server is running:**
   ```bash
   lsof -i :8080
   # Should show llama-server process
   ```

2. **Start llama-server with the embedding model:**
   ```bash
   ./llama-server -m nomic-embed-text-v1.5.Q4_K_M.gguf --port 8080 --embedding
   ```

3. **Test the endpoint:**
   ```bash
   curl http://localhost:8080/health
   # Should return: {"status":"ok"}
   ```

---

### Rate Limiting (Cloud Providers)

**Symptoms:** Too many requests error (429)

**Solutions:**

1. **Switch to local embeddings:**
   ```env
   # Ollama (no rate limits, 100% FREE)
   OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
   ```

2. **Use OpenRouter** (pooled rate limits):
   ```env
   OPENROUTER_API_KEY=sk-or-v1-your-key
   ```
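
If you stay on a cloud provider, 429s are usually handled with retry plus exponential backoff. A generic sketch of that pattern, not Lynkr's built-in behavior; the `RateLimitError` type, delays, and `fake_embed` stub are illustrative:

```python
import time

class RateLimitError(Exception):
    """Stand-in for a provider's HTTP 429 error."""

def with_backoff(request_fn, max_retries: int = 5, base_delay: float = 1.0):
    """Retry request_fn on rate limits, doubling the wait each attempt."""
    for attempt in range(max_retries):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Simulated provider that rejects the first two calls, then succeeds
calls = {"n": 0}
def fake_embed():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return [0.1, 0.2, 0.3]

print(with_backoff(fake_embed, base_delay=0.01))  # → [0.1, 0.2, 0.3]
```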

---

## Next Steps

- **[Cursor Integration](cursor-integration.md)** - Full Cursor IDE setup guide
- **[Provider Configuration](providers.md)** - Configure all providers
- **[Installation Guide](installation.md)** - Install Lynkr
- **[Troubleshooting](troubleshooting.md)** - More troubleshooting tips

---

## Getting Help

- **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
- **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report bugs
- **[FAQ](faq.md)** - Frequently asked questions