lynkr 3.3.1 → 4.1.0

@@ -0,0 +1,735 @@
1
+ # Provider Configuration Guide
2
+
3
+ Complete configuration reference for all nine supported LLM providers. Each provider section includes setup instructions, model options, pricing, and example configurations.
4
+
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ Lynkr supports multiple AI model providers, giving you flexibility in choosing the right model for your needs:
10
+
11
+ | Provider | Type | Models | Cost | Privacy | Setup Complexity |
12
+ |----------|------|--------|------|---------|------------------|
13
+ | **AWS Bedrock** | Cloud | 100+ (Claude, DeepSeek, Qwen, Nova, Titan, Llama, Mistral) | $-$$$ | Cloud | Easy |
14
+ | **Databricks** | Cloud | Claude Sonnet 4.5, Opus 4.5 | $$$ | Cloud | Medium |
15
+ | **OpenRouter** | Cloud | 100+ (GPT, Claude, Gemini, Llama, Mistral, etc.) | $-$$ | Cloud | Easy |
16
+ | **Ollama** | Local | Any model in the Ollama library | **FREE** | 🔒 100% Local | Easy |
17
+ | **llama.cpp** | Local | Any GGUF model | **FREE** | 🔒 100% Local | Medium |
18
+ | **Azure OpenAI** | Cloud | GPT-4o, GPT-5, o1, o3 | $$$ | Cloud | Medium |
19
+ | **Azure Anthropic** | Cloud | Claude models | $$$ | Cloud | Medium |
20
+ | **OpenAI** | Cloud | GPT-4o, o1, o3 | $$$ | Cloud | Easy |
21
+ | **LM Studio** | Local | Local models with GUI | **FREE** | 🔒 100% Local | Easy |
22
+
23
+ ---
24
+
25
+ ## Configuration Methods
26
+
27
+ ### Environment Variables (Quick Start)
28
+
29
+ ```bash
30
+ export MODEL_PROVIDER=databricks
31
+ export DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
32
+ export DATABRICKS_API_KEY=your-key
33
+ lynkr start
34
+ ```
35
+
36
+ ### .env File (Recommended for Production)
37
+
38
+ ```bash
39
+ # Copy example file
40
+ cp .env.example .env
41
+
42
+ # Edit with your credentials
43
+ nano .env
44
+ ```
45
+
46
+ Example `.env`:
47
+ ```env
48
+ MODEL_PROVIDER=databricks
49
+ DATABRICKS_API_BASE=https://your-workspace.databricks.com
50
+ DATABRICKS_API_KEY=dapi1234567890abcdef
51
+ PORT=8081
52
+ LOG_LEVEL=info
53
+ ```
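+
+ Once the proxy is running (`lynkr start`), a one-off request confirms the configuration end to end. This is a minimal sketch that assumes Lynkr exposes the Anthropic-compatible `/v1/messages` route that Claude Code CLI speaks, on the `PORT` set above:
+
+ ```bash
+ # Smoke test against a locally running Lynkr proxy (port 8081 from the .env above)
+ # The model name is illustrative; the proxy routes to the configured provider
+ curl -s http://localhost:8081/v1/messages \
+   -H "Content-Type: application/json" \
+   -d '{"model": "claude-3-5-sonnet", "max_tokens": 32, "messages": [{"role": "user", "content": "ping"}]}'
+ ```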
54
+
55
+ ---
56
+
57
+ ## Provider-Specific Configuration
58
+
59
+ ### 1. AWS Bedrock (100+ Models)
60
+
61
+ **Best for:** AWS ecosystem, multi-model flexibility, Claude + alternatives
62
+
63
+ #### Configuration
64
+
65
+ ```env
66
+ MODEL_PROVIDER=bedrock
67
+ AWS_BEDROCK_API_KEY=your-bearer-token
68
+ AWS_BEDROCK_REGION=us-east-1
69
+ AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
70
+ ```
71
+
72
+ #### Getting AWS Bedrock API Key
73
+
74
+ 1. Log in to [AWS Console](https://console.aws.amazon.com/)
75
+ 2. Navigate to **Bedrock** → **API Keys**
76
+ 3. Click **Generate API Key**
77
+ 4. Copy the bearer token (this is your `AWS_BEDROCK_API_KEY`)
78
+ 5. Enable model access in Bedrock console
79
+ 6. See: [AWS Bedrock API Keys Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/api-keys-generate.html)
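+
+ To sanity-check the key outside of Lynkr, you can call the Bedrock runtime Converse endpoint directly. A minimal sketch, assuming `us-east-1` and a model you have enabled access for:
+
+ ```bash
+ # Verify the bearer token against the Bedrock Converse API
+ curl -s "https://bedrock-runtime.us-east-1.amazonaws.com/model/anthropic.claude-3-5-sonnet-20241022-v2:0/converse" \
+   -H "Authorization: Bearer $AWS_BEDROCK_API_KEY" \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": [{"text": "ping"}]}]}'
+ ```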
80
+
81
+ #### Available Regions
82
+
83
+ - `us-east-1` (N. Virginia) - Most models available
84
+ - `us-west-2` (Oregon)
85
+ - `us-east-2` (Ohio)
86
+ - `ap-southeast-1` (Singapore)
87
+ - `ap-northeast-1` (Tokyo)
88
+ - `eu-central-1` (Frankfurt)
89
+
90
+ #### Model Catalog
91
+
92
+ **Claude Models (Best for Tool Calling)** ✅
93
+
94
+ Claude 4.5 (latest - requires inference profiles):
95
+ ```env
96
+ AWS_BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0 # Regional US
97
+ AWS_BEDROCK_MODEL_ID=us.anthropic.claude-haiku-4-5-20251001-v1:0 # Fast, efficient
98
+ AWS_BEDROCK_MODEL_ID=global.anthropic.claude-sonnet-4-5-20250929-v1:0 # Cross-region
99
+ ```
100
+
101
+ Claude 3.x models:
102
+ ```env
103
+ AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0 # Excellent tool calling
104
+ AWS_BEDROCK_MODEL_ID=anthropic.claude-3-opus-20240229-v1:0 # Most capable
105
+ AWS_BEDROCK_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0 # Fast, cheap
106
+ ```
107
+
108
+ **DeepSeek Models (NEW - 2025)**
109
+ ```env
110
+ AWS_BEDROCK_MODEL_ID=us.deepseek.r1-v1:0 # DeepSeek R1 - reasoning model (o1-style)
111
+ ```
112
+
113
+ **Qwen Models (Alibaba - NEW 2025)**
114
+ ```env
115
+ AWS_BEDROCK_MODEL_ID=qwen.qwen3-235b-a22b-2507-v1:0 # Largest, 235B parameters
116
+ AWS_BEDROCK_MODEL_ID=qwen.qwen3-32b-v1:0 # Balanced, 32B
117
+ AWS_BEDROCK_MODEL_ID=qwen.qwen3-coder-480b-a35b-v1:0 # Coding specialist, 480B
118
+ AWS_BEDROCK_MODEL_ID=qwen.qwen3-coder-30b-a3b-v1:0 # Coding, smaller
119
+ ```
120
+
121
+ **OpenAI Open-Weight Models (NEW - 2025)**
122
+ ```env
123
+ AWS_BEDROCK_MODEL_ID=openai.gpt-oss-120b-1:0 # 120B parameters, open-weight
124
+ AWS_BEDROCK_MODEL_ID=openai.gpt-oss-20b-1:0 # 20B parameters, efficient
125
+ ```
126
+
127
+ **Google Gemma Models (Open-Weight)**
128
+ ```env
129
+ AWS_BEDROCK_MODEL_ID=google.gemma-3-27b # 27B parameters
130
+ AWS_BEDROCK_MODEL_ID=google.gemma-3-12b # 12B parameters
131
+ AWS_BEDROCK_MODEL_ID=google.gemma-3-4b # 4B parameters, efficient
132
+ ```
133
+
134
+ **Amazon Models**
135
+
136
+ Nova (multimodal):
137
+ ```env
138
+ AWS_BEDROCK_MODEL_ID=us.amazon.nova-pro-v1:0 # Best quality, multimodal, 300K context
139
+ AWS_BEDROCK_MODEL_ID=us.amazon.nova-lite-v1:0 # Fast, cost-effective
140
+ AWS_BEDROCK_MODEL_ID=us.amazon.nova-micro-v1:0 # Ultra-fast, text-only
141
+ ```
142
+
143
+ Titan:
144
+ ```env
145
+ AWS_BEDROCK_MODEL_ID=amazon.titan-text-premier-v1:0 # Largest
146
+ AWS_BEDROCK_MODEL_ID=amazon.titan-text-express-v1 # Fast
147
+ AWS_BEDROCK_MODEL_ID=amazon.titan-text-lite-v1 # Cheapest
148
+ ```
149
+
150
+ **Meta Llama Models**
151
+ ```env
152
+ AWS_BEDROCK_MODEL_ID=meta.llama3-1-70b-instruct-v1:0 # Most capable
153
+ AWS_BEDROCK_MODEL_ID=meta.llama3-1-8b-instruct-v1:0 # Fast, efficient
154
+ ```
155
+
156
+ **Mistral Models**
157
+ ```env
158
+ AWS_BEDROCK_MODEL_ID=mistral.mistral-large-2407-v1:0 # Largest, coding, multilingual
159
+ AWS_BEDROCK_MODEL_ID=mistral.mistral-small-2402-v1:0 # Efficient
160
+ AWS_BEDROCK_MODEL_ID=mistral.mixtral-8x7b-instruct-v0:1 # Mixture of experts
161
+ ```
162
+
163
+ **Cohere Command Models**
164
+ ```env
165
+ AWS_BEDROCK_MODEL_ID=cohere.command-r-plus-v1:0 # Best for RAG, search
166
+ AWS_BEDROCK_MODEL_ID=cohere.command-r-v1:0 # Balanced
167
+ ```
168
+
169
+ **AI21 Jamba Models**
170
+ ```env
171
+ AWS_BEDROCK_MODEL_ID=ai21.jamba-1-5-large-v1:0 # Hybrid architecture, 256K context
172
+ AWS_BEDROCK_MODEL_ID=ai21.jamba-1-5-mini-v1:0 # Fast
173
+ ```
174
+
175
+ #### Pricing (per 1M tokens)
176
+
177
+ | Model | Input | Output |
178
+ |-------|-------|--------|
179
+ | Claude 3.5 Sonnet | $3.00 | $15.00 |
180
+ | Claude 3 Opus | $15.00 | $75.00 |
181
+ | Claude 3 Haiku | $0.25 | $1.25 |
182
+ | Titan Text Express | $0.20 | $0.60 |
183
+ | Llama 3 70B | $0.99 | $0.99 |
184
+ | Nova Pro | $0.80 | $3.20 |
185
+
186
+ #### Important Notes
187
+
188
+ ⚠️ **Tool Calling:** Only **Claude models** support tool calling on Bedrock. Other models work through the Converse API but won't use the Read/Write/Bash tools.
189
+
190
+ 📖 **Full Documentation:** See [BEDROCK_MODELS.md](../BEDROCK_MODELS.md) for complete model catalog with capabilities and use cases.
191
+
192
+ ---
193
+
194
+ ### 2. Databricks (Claude Sonnet 4.5, Opus 4.5)
195
+
196
+ **Best for:** Enterprise production use, managed Claude endpoints
197
+
198
+ #### Configuration
199
+
200
+ ```env
201
+ MODEL_PROVIDER=databricks
202
+ DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
203
+ DATABRICKS_API_KEY=dapi1234567890abcdef
204
+ ```
205
+
206
+ Optional endpoint path override:
207
+ ```env
208
+ DATABRICKS_ENDPOINT_PATH=/serving-endpoints/databricks-claude-sonnet-4-5/invocations
209
+ ```
210
+
211
+ #### Getting Databricks Credentials
212
+
213
+ 1. Log in to your Databricks workspace
214
+ 2. Navigate to **Settings** → **User Settings**
215
+ 3. Click **Generate New Token**
216
+ 4. Copy the token (this is your `DATABRICKS_API_KEY`)
217
+ 5. Your workspace URL is the base URL (e.g., `https://your-workspace.cloud.databricks.com`)
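+
+ Before wiring the credentials into Lynkr, you can verify them with a direct call to a serving endpoint. A sketch, assuming the `databricks-claude-sonnet-4-5` endpoint name from the override example above:
+
+ ```bash
+ # Direct invocation of a Databricks serving endpoint using a personal access token
+ curl -s "$DATABRICKS_API_BASE/serving-endpoints/databricks-claude-sonnet-4-5/invocations" \
+   -H "Authorization: Bearer $DATABRICKS_API_KEY" \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "ping"}], "max_tokens": 32}'
+ ```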
218
+
219
+ #### Available Models
220
+
221
+ - **Claude Sonnet 4.5** - Excellent for tool calling, balanced performance
222
+ - **Claude Opus 4.5** - Most capable model for complex reasoning
223
+
224
+ #### Pricing
225
+
226
+ Contact Databricks for enterprise pricing.
227
+
228
+ ---
229
+
230
+ ### 3. OpenRouter (100+ Models)
231
+
232
+ **Best for:** Quick setup, model flexibility, cost optimization
233
+
234
+ #### Configuration
235
+
236
+ ```env
237
+ MODEL_PROVIDER=openrouter
238
+ OPENROUTER_API_KEY=sk-or-v1-your-key
239
+ OPENROUTER_MODEL=anthropic/claude-3.5-sonnet
240
+ OPENROUTER_ENDPOINT=https://openrouter.ai/api/v1/chat/completions
241
+ ```
242
+
243
+ Optional for hybrid routing:
244
+ ```env
245
+ OPENROUTER_MAX_TOOLS_FOR_ROUTING=15 # Max tools to route to OpenRouter
246
+ ```
247
+
248
+ #### Getting OpenRouter API Key
249
+
250
+ 1. Visit [openrouter.ai](https://openrouter.ai)
251
+ 2. Sign in with GitHub, Google, or email
252
+ 3. Go to [openrouter.ai/keys](https://openrouter.ai/keys)
253
+ 4. Create a new API key
254
+ 5. Add credits (pay-as-you-go, no subscription required)
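+
+ A single completion request confirms both the key and that your account has credits:
+
+ ```bash
+ # Minimal OpenRouter chat completion using the key from step 4
+ curl -s https://openrouter.ai/api/v1/chat/completions \
+   -H "Authorization: Bearer $OPENROUTER_API_KEY" \
+   -H "Content-Type: application/json" \
+   -d '{"model": "anthropic/claude-3.5-sonnet", "messages": [{"role": "user", "content": "ping"}]}'
+ ```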
255
+
256
+ #### Popular Models
257
+
258
+ **Claude Models (Best for Coding)**
259
+ ```env
260
+ OPENROUTER_MODEL=anthropic/claude-3.5-sonnet # $3/$15 per 1M tokens
261
+ OPENROUTER_MODEL=anthropic/claude-opus-4.5 # $15/$75 per 1M tokens
262
+ OPENROUTER_MODEL=anthropic/claude-3-haiku # $0.25/$1.25 per 1M tokens
263
+ ```
264
+
265
+ **OpenAI Models**
266
+ ```env
267
+ OPENROUTER_MODEL=openai/gpt-4o # $2.50/$10 per 1M tokens
268
+ OPENROUTER_MODEL=openai/gpt-4o-mini # $0.15/$0.60 per 1M tokens (default)
269
+ OPENROUTER_MODEL=openai/o1-preview # $15/$60 per 1M tokens
270
+ OPENROUTER_MODEL=openai/o1-mini # $3/$12 per 1M tokens
271
+ ```
272
+
273
+ **Google Models**
274
+ ```env
275
+ OPENROUTER_MODEL=google/gemini-pro-1.5 # $1.25/$5 per 1M tokens
276
+ OPENROUTER_MODEL=google/gemini-flash-1.5 # $0.075/$0.30 per 1M tokens
277
+ ```
278
+
279
+ **Meta Llama Models**
280
+ ```env
281
+ OPENROUTER_MODEL=meta-llama/llama-3.1-405b # $2.70/$2.70 per 1M tokens
282
+ OPENROUTER_MODEL=meta-llama/llama-3.1-70b # $0.52/$0.75 per 1M tokens
283
+ OPENROUTER_MODEL=meta-llama/llama-3.1-8b # $0.06/$0.06 per 1M tokens
284
+ ```
285
+
286
+ **Mistral Models**
287
+ ```env
288
+ OPENROUTER_MODEL=mistralai/mistral-large # $2/$6 per 1M tokens
289
+ OPENROUTER_MODEL=mistralai/codestral-latest # $0.30/$0.90 per 1M tokens
290
+ ```
291
+
292
+ **DeepSeek Models**
293
+ ```env
294
+ OPENROUTER_MODEL=deepseek/deepseek-chat # $0.14/$0.28 per 1M tokens
295
+ OPENROUTER_MODEL=deepseek/deepseek-coder # $0.14/$0.28 per 1M tokens
296
+ ```
297
+
298
+ #### Benefits
299
+
300
+ - ✅ **100+ models** through one API
301
+ - ✅ **Automatic fallbacks** if primary model unavailable
302
+ - ✅ **Competitive pricing** with volume discounts
303
+ - ✅ **Full tool calling support**
304
+ - ✅ **No monthly fees** - pay only for usage
305
+ - ✅ **Rate limit pooling** across models
306
+
307
+ See [openrouter.ai/models](https://openrouter.ai/models) for complete list with pricing.
308
+
309
+ ---
310
+
311
+ ### 4. Ollama (Local Models)
312
+
313
+ **Best for:** Local development, privacy, offline use, no API costs
314
+
315
+ #### Configuration
316
+
317
+ ```env
318
+ MODEL_PROVIDER=ollama
319
+ OLLAMA_ENDPOINT=http://localhost:11434
320
+ OLLAMA_MODEL=llama3.1:8b
321
+ OLLAMA_TIMEOUT_MS=120000
322
+ ```
323
+
324
+ #### Installation & Setup
325
+
326
+ ```bash
327
+ # Install Ollama
328
+ brew install ollama # macOS
329
+ # Or download from: https://ollama.ai/download
330
+
331
+ # Start Ollama service
332
+ ollama serve
333
+
334
+ # Pull a model
335
+ ollama pull llama3.1:8b
336
+
337
+ # Verify model is available
338
+ ollama list
339
+ ```
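+
+ Before pointing Lynkr at Ollama, confirm the API answers:
+
+ ```bash
+ # Non-streaming chat request against the local Ollama API
+ curl -s http://localhost:11434/api/chat -d '{
+   "model": "llama3.1:8b",
+   "messages": [{"role": "user", "content": "ping"}],
+   "stream": false
+ }'
+ ```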
340
+
341
+ #### Recommended Models
342
+
343
+ **For Tool Calling** ✅ (Required for Claude Code CLI)
344
+ ```bash
345
+ ollama pull llama3.1:8b # Good balance (4.7GB)
346
+ ollama pull llama3.2 # Latest Llama (3B, ~2GB)
347
+ ollama pull qwen2.5:14b # Strong reasoning (8GB; the 7b variant struggles with tools)
348
+ ollama pull mistral:7b-instruct # Fast and capable (4.1GB)
349
+ ```
350
+
351
+ **NOT Recommended for Tools** ❌
352
+ ```bash
353
+ qwen2.5-coder # Code-only, slow with tool calling
354
+ codellama # Code-only, poor tool support
355
+ ```
356
+
357
+ #### Tool Calling Support
358
+
359
+ Lynkr supports **native tool calling** for compatible Ollama models:
360
+
361
+ - ✅ **Supported models**: llama3.1, llama3.2, qwen2.5, mistral, mistral-nemo
362
+ - ✅ **Automatic detection**: Lynkr detects tool-capable models
363
+ - ✅ **Format conversion**: Transparent Anthropic ↔ Ollama conversion
364
+ - ❌ **Unsupported models**: llama3, older models (tools filtered automatically)
365
+
366
+ #### Pricing
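+ To make "native tool calling" concrete, here is what a raw Ollama request with a `tools` array looks like; the `get_weather` function is a made-up example, not something Lynkr defines. Tool-capable models reply with a `tool_calls` block instead of prose when a tool fits:
+
+ ```bash
+ # Raw tool-calling request; Lynkr performs this Anthropic -> Ollama conversion for you
+ curl -s http://localhost:11434/api/chat -d '{
+   "model": "llama3.1:8b",
+   "stream": false,
+   "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
+   "tools": [{
+     "type": "function",
+     "function": {
+       "name": "get_weather",
+       "description": "Get the current weather for a city",
+       "parameters": {
+         "type": "object",
+         "properties": {"city": {"type": "string"}},
+         "required": ["city"]
+       }
+     }
+   }]
+ }'
+ ```
+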
367
+
368
+ **100% FREE** - Models run on your hardware with no API costs.
369
+
370
+ #### Model Sizes
371
+
372
+ - **7B models**: ~4-5GB download, 8GB RAM required
373
+ - **8B models**: ~4.7GB download, 8GB RAM required
374
+ - **14B models**: ~8GB download, 16GB RAM required
375
+ - **32B models**: ~18GB download, 32GB RAM required
376
+
377
+ ---
378
+
379
+ ### 5. llama.cpp (GGUF Models)
380
+
381
+ **Best for:** Maximum performance, custom quantization, any GGUF model
382
+
383
+ #### Configuration
384
+
385
+ ```env
386
+ MODEL_PROVIDER=llamacpp
387
+ LLAMACPP_ENDPOINT=http://localhost:8080
388
+ LLAMACPP_MODEL=qwen2.5-coder-7b
389
+ LLAMACPP_TIMEOUT_MS=120000
390
+ ```
391
+
392
+ Optional API key (for secured servers):
393
+ ```env
394
+ LLAMACPP_API_KEY=your-optional-api-key
395
+ ```
396
+
397
+ #### Installation & Setup
398
+
399
+ ```bash
400
+ # Clone and build llama.cpp
401
+ git clone https://github.com/ggerganov/llama.cpp
402
+ cd llama.cpp && make
403
+
404
+ # Download a GGUF model (example: Qwen2.5-Coder-7B)
405
+ wget https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/resolve/main/qwen2.5-coder-7b-instruct-q4_k_m.gguf
406
+
407
+ # Start llama-server
408
+ ./llama-server -m qwen2.5-coder-7b-instruct-q4_k_m.gguf --port 8080
409
+
410
+ # Verify server is running
411
+ curl http://localhost:8080/health
412
+ ```
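+
+ Because `llama-server` exposes an OpenAI-compatible API, the same request shape used for cloud OpenAI-style providers works locally:
+
+ ```bash
+ # OpenAI-compatible chat completion served by llama-server
+ curl -s http://localhost:8080/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "ping"}], "max_tokens": 32}'
+ ```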
413
+
414
+ #### GPU Support
415
+
416
+ llama.cpp supports multiple GPU backends:
417
+
418
+ - **CUDA** (NVIDIA): `make LLAMA_CUDA=1`
419
+ - **Metal** (Apple Silicon): `make LLAMA_METAL=1`
420
+ - **ROCm** (AMD): `make LLAMA_HIPBLAS=1`
421
+ - **Vulkan** (Universal): `make LLAMA_VULKAN=1`
422
+
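+ For example, a CUDA build can offload the whole model to the GPU at launch. A sketch: `-ngl` sets the number of layers to offload and `-c` the context size:
+
+ ```bash
+ # Build with CUDA support, then serve with full GPU offload and an 8K context
+ make LLAMA_CUDA=1
+ ./llama-server -m qwen2.5-coder-7b-instruct-q4_k_m.gguf --port 8080 -ngl 99 -c 8192
+ ```
+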
423
+ #### llama.cpp vs Ollama
424
+
425
+ | Feature | Ollama | llama.cpp |
426
+ |---------|--------|-----------|
427
+ | Setup | Easy (app) | Manual (compile/download) |
428
+ | Model Format | Ollama-specific | Any GGUF model |
429
+ | Performance | Good | **Excellent** (optimized C++) |
430
+ | GPU Support | Yes | Yes (CUDA, Metal, ROCm, Vulkan) |
431
+ | Memory Usage | Higher | **Lower** (quantization options) |
432
+ | API | Custom `/api/chat` | OpenAI-compatible `/v1/chat/completions` |
433
+ | Flexibility | Limited models | **Any GGUF** from HuggingFace |
434
+ | Tool Calling | Limited models | Grammar-based, more reliable |
435
+
436
+ **Choose llama.cpp when you need:**
437
+ - Maximum performance
438
+ - Specific quantization options (Q4, Q5, Q8)
439
+ - GGUF models not available in Ollama
440
+ - Fine-grained control over inference parameters
441
+
442
+ ---
443
+
444
+ ### 6. Azure OpenAI
445
+
446
+ **Best for:** Azure integration, Microsoft ecosystem, GPT-4o, o1, o3
447
+
448
+ #### Configuration
449
+
450
+ ```env
451
+ MODEL_PROVIDER=azure-openai
452
+ AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2025-01-01-preview
453
+ AZURE_OPENAI_API_KEY=your-azure-api-key
454
+ AZURE_OPENAI_DEPLOYMENT=gpt-4o
455
+ ```
456
+
457
+ Optional:
458
+ ```env
459
+ AZURE_OPENAI_API_VERSION=2024-08-01-preview # Optional api-version override
460
+ ```
461
+
462
+ #### Getting Azure OpenAI Credentials
463
+
464
+ 1. Log in to [Azure Portal](https://portal.azure.com)
465
+ 2. Navigate to **Azure OpenAI** service
466
+ 3. Go to **Keys and Endpoint**
467
+ 4. Copy **KEY 1** (this is your API key)
468
+ 5. Copy **Endpoint** URL
469
+ 6. Create a deployment (gpt-4o, gpt-4o-mini, etc.)
470
+
471
+ #### Important: Full Endpoint URL Required
472
+
473
+ The `AZURE_OPENAI_ENDPOINT` must include:
474
+ - Resource name
475
+ - Deployment path
476
+ - API version query parameter
477
+
478
+ **Example:**
479
+ ```
480
+ https://your-resource.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2025-01-01-preview
481
+ ```
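+
+ With the full URL in place, a direct request is a useful sanity check. Note that Azure authenticates with an `api-key` header rather than a bearer token:
+
+ ```bash
+ # Direct call to the deployment URL configured above
+ curl -s "$AZURE_OPENAI_ENDPOINT" \
+   -H "api-key: $AZURE_OPENAI_API_KEY" \
+   -H "Content-Type: application/json" \
+   -d '{"messages": [{"role": "user", "content": "ping"}], "max_tokens": 32}'
+ ```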
482
+
483
+ #### Available Deployments
484
+
485
+ You can deploy any of these models in Azure AI Foundry:
486
+
487
+ ```env
488
+ AZURE_OPENAI_DEPLOYMENT=gpt-4o # Latest GPT-4o
489
+ AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini # Smaller, faster, cheaper
490
+ AZURE_OPENAI_DEPLOYMENT=gpt-5-chat # GPT-5 (if available)
491
+ AZURE_OPENAI_DEPLOYMENT=o1-preview # Reasoning model
492
+ AZURE_OPENAI_DEPLOYMENT=o3-mini # Latest reasoning model
493
+ AZURE_OPENAI_DEPLOYMENT=kimi-k2 # Kimi K2 (if available)
494
+ ```
495
+
496
+ ---
497
+
498
+ ### 7. Azure Anthropic
499
+
500
+ **Best for:** Azure-hosted Claude models with enterprise integration
501
+
502
+ #### Configuration
503
+
504
+ ```env
505
+ MODEL_PROVIDER=azure-anthropic
506
+ AZURE_ANTHROPIC_ENDPOINT=https://your-resource.services.ai.azure.com/anthropic/v1/messages
507
+ AZURE_ANTHROPIC_API_KEY=your-azure-api-key
508
+ AZURE_ANTHROPIC_VERSION=2023-06-01
509
+ ```
510
+
511
+ #### Getting Azure Anthropic Credentials
512
+
513
+ 1. Log in to [Azure Portal](https://portal.azure.com)
514
+ 2. Navigate to your Azure Anthropic resource
515
+ 3. Go to **Keys and Endpoint**
516
+ 4. Copy the API key
517
+ 5. Copy the endpoint URL (includes `/anthropic/v1/messages`)
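+
+ A hedged smoke test, assuming the endpoint speaks the standard Anthropic Messages API (the exact header convention can vary by Azure resource, so check your resource's docs; the model name here is illustrative):
+
+ ```bash
+ # Anthropic-style request; x-api-key and anthropic-version follow the Messages API convention
+ curl -s "$AZURE_ANTHROPIC_ENDPOINT" \
+   -H "x-api-key: $AZURE_ANTHROPIC_API_KEY" \
+   -H "anthropic-version: 2023-06-01" \
+   -H "Content-Type: application/json" \
+   -d '{"model": "claude-sonnet-4-5", "max_tokens": 32, "messages": [{"role": "user", "content": "ping"}]}'
+ ```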
518
+
519
+ #### Available Models
520
+
521
+ - **Claude Sonnet 4.5** - Best for tool calling, balanced
522
+ - **Claude Opus 4.5** - Most capable for complex reasoning
523
+
524
+ ---
525
+
526
+ ### 8. OpenAI (Direct)
527
+
528
+ **Best for:** Direct OpenAI API access, lowest latency
529
+
530
+ #### Configuration
531
+
532
+ ```env
533
+ MODEL_PROVIDER=openai
534
+ OPENAI_API_KEY=sk-your-openai-api-key
535
+ OPENAI_MODEL=gpt-4o
536
+ OPENAI_ENDPOINT=https://api.openai.com/v1/chat/completions
537
+ ```
538
+
539
+ Optional for organization-level keys:
540
+ ```env
541
+ OPENAI_ORGANIZATION=org-your-org-id
542
+ ```
543
+
544
+ #### Getting OpenAI API Key
545
+
546
+ 1. Visit [platform.openai.com](https://platform.openai.com)
547
+ 2. Sign up or log in
548
+ 3. Go to [API Keys](https://platform.openai.com/api-keys)
549
+ 4. Create a new API key
550
+ 5. Add credits to your account (pay-as-you-go)
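+
+ One request verifies the key against the same endpoint Lynkr will use:
+
+ ```bash
+ # Standard OpenAI chat completion with a bearer token
+ curl -s https://api.openai.com/v1/chat/completions \
+   -H "Authorization: Bearer $OPENAI_API_KEY" \
+   -H "Content-Type: application/json" \
+   -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "ping"}]}'
+ ```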
551
+
552
+ #### Available Models
553
+
554
+ ```env
555
+ OPENAI_MODEL=gpt-4o # Latest GPT-4o ($2.50/$10 per 1M)
556
+ OPENAI_MODEL=gpt-4o-mini # Smaller, faster ($0.15/$0.60 per 1M)
557
+ OPENAI_MODEL=gpt-4-turbo # GPT-4 Turbo
558
+ OPENAI_MODEL=o1-preview # Reasoning model
559
+ OPENAI_MODEL=o1-mini # Smaller reasoning model
560
+ ```
561
+
562
+ #### Benefits
563
+
564
+ - ✅ **Direct API access** - No intermediaries, lowest latency
565
+ - ✅ **Full tool calling support** - Excellent function calling
566
+ - ✅ **Parallel tool calls** - Execute multiple tools simultaneously
567
+ - ✅ **Organization support** - Use org-level API keys
568
+ - ✅ **Simple setup** - Just one API key needed
569
+
570
+ ---
571
+
572
+ ### 9. LM Studio (Local with GUI)
573
+
574
+ **Best for:** Local models with graphical interface
575
+
576
+ #### Configuration
577
+
578
+ ```env
579
+ MODEL_PROVIDER=lmstudio
580
+ LMSTUDIO_ENDPOINT=http://localhost:1234
581
+ LMSTUDIO_MODEL=default
582
+ LMSTUDIO_TIMEOUT_MS=120000
583
+ ```
584
+
585
+ Optional API key (for secured servers):
586
+ ```env
587
+ LMSTUDIO_API_KEY=your-optional-api-key
588
+ ```
589
+
590
+ #### Setup
591
+
592
+ 1. Download and install [LM Studio](https://lmstudio.ai)
593
+ 2. Launch LM Studio
594
+ 3. Download a model (e.g., Qwen2.5-Coder-7B, Llama 3.1)
595
+ 4. Click **Start Server** (default port: 1234)
596
+ 5. Configure Lynkr to use LM Studio
597
+
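+ Once the server is running, listing the loaded models through its OpenAI-compatible API is also a quick way to find the identifier to put in `LMSTUDIO_MODEL`:
+
+ ```bash
+ # List models served by LM Studio's local server
+ curl -s http://localhost:1234/v1/models
+ ```
+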
598
+ #### Benefits
599
+
600
+ - ✅ **Graphical interface** for model management
601
+ - ✅ **Easy model downloads** from HuggingFace
602
+ - ✅ **Built-in server** with OpenAI-compatible API
603
+ - ✅ **GPU acceleration** support
604
+ - ✅ **Model presets** and configurations
605
+
606
+ ---
607
+
608
+ ## Hybrid Routing & Fallback
609
+
610
+ ### Intelligent 3-Tier Routing
611
+
612
+ Optimize costs by routing requests based on complexity:
613
+
614
+ ```env
615
+ # Enable hybrid routing
616
+ PREFER_OLLAMA=true
617
+ FALLBACK_ENABLED=true
618
+
619
+ # Configure providers for each tier
620
+ MODEL_PROVIDER=ollama
621
+ OLLAMA_MODEL=llama3.1:8b
622
+ OLLAMA_MAX_TOOLS_FOR_ROUTING=3
623
+
624
+ # Mid-tier (moderate complexity)
625
+ OPENROUTER_API_KEY=your-key
626
+ OPENROUTER_MODEL=openai/gpt-4o-mini
627
+ OPENROUTER_MAX_TOOLS_FOR_ROUTING=15
628
+
629
+ # Heavy workload (complex requests)
630
+ FALLBACK_PROVIDER=databricks
631
+ DATABRICKS_API_BASE=your-base
632
+ DATABRICKS_API_KEY=your-key
633
+ ```
634
+
635
+ ### How It Works
636
+
637
+ **Routing Logic:**
638
+ 1. **0-2 tools**: Try Ollama first (free, local, fast)
639
+ 2. **3-15 tools**: Route to OpenRouter (affordable cloud)
640
+ 3. **16+ tools**: Route directly to Databricks/Azure (most capable)
641
+
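+ The thresholds above come directly from the two `*_MAX_TOOLS_FOR_ROUTING` variables. As a rough illustration only (not Lynkr's actual implementation), the selection behaves like this:
+
+ ```bash
+ # Hypothetical sketch of tier selection mirroring the documented thresholds
+ select_tier() {
+   local tool_count=$1
+   if [ "$tool_count" -lt "${OLLAMA_MAX_TOOLS_FOR_ROUTING:-3}" ]; then
+     echo "ollama"                             # 0-2 tools: free, local, fast
+   elif [ "$tool_count" -le "${OPENROUTER_MAX_TOOLS_FOR_ROUTING:-15}" ]; then
+     echo "openrouter"                         # 3-15 tools: affordable cloud
+   else
+     echo "${FALLBACK_PROVIDER:-databricks}"   # 16+ tools: most capable
+   fi
+ }
+ select_tier 5   # prints: openrouter
+ ```
+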
642
+ **Automatic Fallback:**
643
+ - ❌ If Ollama fails → Fallback to OpenRouter or Databricks
644
+ - ❌ If OpenRouter fails → Fallback to Databricks
645
+ - ✅ Transparent to the user
646
+
647
+ ### Cost Savings
648
+
649
+ - **65-100% cost savings** on requests served locally by Ollama
650
+ - **40-87%** faster for simple requests
651
+ - **Privacy**: Simple queries never leave your machine
652
+
653
+ ### Configuration Options
654
+
655
+ | Variable | Description | Default |
656
+ |----------|-------------|---------|
657
+ | `PREFER_OLLAMA` | Enable Ollama preference for simple requests | `false` |
658
+ | `FALLBACK_ENABLED` | Enable automatic fallback | `true` |
659
+ | `FALLBACK_PROVIDER` | Provider to use when primary fails | `databricks` |
660
+ | `OLLAMA_MAX_TOOLS_FOR_ROUTING` | Max tools to route to Ollama | `3` |
661
+ | `OPENROUTER_MAX_TOOLS_FOR_ROUTING` | Max tools to route to OpenRouter | `15` |
662
+
663
+ **Note:** Local providers (ollama, llamacpp, lmstudio) cannot be used as `FALLBACK_PROVIDER`.
664
+
665
+ ---
666
+
667
+ ## Complete Configuration Reference
668
+
669
+ ### Core Variables
670
+
671
+ | Variable | Description | Default |
672
+ |----------|-------------|---------|
673
+ | `MODEL_PROVIDER` | Primary provider (`databricks`, `bedrock`, `openrouter`, `ollama`, `llamacpp`, `azure-openai`, `azure-anthropic`, `openai`, `lmstudio`) | `databricks` |
674
+ | `PORT` | HTTP port for proxy server | `8081` |
675
+ | `WORKSPACE_ROOT` | Workspace directory path | `process.cwd()` |
676
+ | `LOG_LEVEL` | Logging level (`error`, `warn`, `info`, `debug`) | `info` |
677
+ | `TOOL_EXECUTION_MODE` | Where tools execute (`server`, `client`) | `server` |
678
+ | `MODEL_DEFAULT` | Override default model/deployment name | Provider-specific |
679
+
680
+ ### Provider-Specific Variables
681
+
682
+ See individual provider sections above for complete variable lists.
683
+
684
+ ---
685
+
686
+ ## Provider Comparison
687
+
688
+ ### Feature Comparison
689
+
690
+ | Feature | Databricks | Bedrock | OpenAI | Azure OpenAI | Azure Anthropic | OpenRouter | Ollama | llama.cpp | LM Studio |
691
+ |---------|-----------|---------|--------|--------------|-----------------|------------|--------|-----------|-----------|
692
+ | **Setup Complexity** | Medium | Easy | Easy | Medium | Medium | Easy | Easy | Medium | Easy |
693
+ | **Cost** | $$$ | $-$$$ | $$ | $$ | $$$ | $-$$ | **Free** | **Free** | **Free** |
694
+ | **Latency** | Low | Low | Low | Low | Low | Medium | **Very Low** | **Very Low** | **Very Low** |
695
+ | **Model Variety** | 2 | **100+** | 10+ | 10+ | 2 | **100+** | 50+ | Unlimited | 50+ |
696
+ | **Tool Calling** | Excellent | Excellent* | Excellent | Excellent | Excellent | Good | Fair | Good | Fair |
697
+ | **Context Length** | 200K | Up to 300K | 128K | 128K | 200K | Varies | 32K-128K | Model-dependent | 32K-128K |
698
+ | **Streaming** | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
699
+ | **Privacy** | Enterprise | Enterprise | Third-party | Enterprise | Enterprise | Third-party | **Local** | **Local** | **Local** |
700
+ | **Offline** | No | No | No | No | No | No | **Yes** | **Yes** | **Yes** |
701
+
702
+ _* Tool calling only supported by Claude models on Bedrock_
703
+
704
+ ### Cost Comparison (per 1M tokens)
705
+
706
+ | Provider | Model | Input | Output |
707
+ |----------|-------|-------|--------|
708
+ | **Bedrock** | Claude 3.5 Sonnet | $3.00 | $15.00 |
709
+ | **Databricks** | Contact for pricing | - | - |
710
+ | **OpenRouter** | Claude 3.5 Sonnet | $3.00 | $15.00 |
711
+ | **OpenRouter** | GPT-4o mini | $0.15 | $0.60 |
712
+ | **OpenAI** | GPT-4o | $2.50 | $10.00 |
713
+ | **Azure OpenAI** | GPT-4o | $2.50 | $10.00 |
714
+ | **Ollama** | Any model | **FREE** | **FREE** |
715
+ | **llama.cpp** | Any model | **FREE** | **FREE** |
716
+ | **LM Studio** | Any model | **FREE** | **FREE** |
717
+
718
+ ---
719
+
720
+ ## Next Steps
721
+
722
+ - **[Installation Guide](installation.md)** - Install Lynkr with your chosen provider
723
+ - **[Claude Code CLI Setup](claude-code-cli.md)** - Connect Claude Code CLI
724
+ - **[Cursor Integration](cursor-integration.md)** - Connect Cursor IDE
725
+ - **[Embeddings Configuration](embeddings.md)** - Enable @Codebase semantic search
726
+ - **[Troubleshooting](troubleshooting.md)** - Common issues and solutions
727
+
728
+ ---
729
+
730
+ ## Getting Help
731
+
732
+ - **[FAQ](faq.md)** - Frequently asked questions
733
+ - **[Troubleshooting Guide](troubleshooting.md)** - Common issues
734
+ - **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
735
+ - **[GitHub Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Report bugs