lynkr 9.1.9 → 9.2.1

Files changed (2)
  1. package/README.md +428 -208
  2. package/package.json +1 -1
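
The comparison summarized above can likely be reproduced locally. This is a minimal sketch, assuming an npm release that ships the `npm diff` subcommand and network access to the public registry. The helper name `diff_cmd` is hypothetical, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the npm command that diffs two published
# versions of a package.
# Usage: diff_cmd <package> <old-version> <new-version>
diff_cmd() {
  printf 'npm diff --diff=%s@%s --diff=%s@%s\n' "$1" "$2" "$1" "$3"
}

# Package name and versions are taken from the header of this diff.
diff_cmd lynkr 9.1.9 9.2.1
```

Running the printed command should emit a unified diff of the two published tarballs. Its layout will differ from the rendered view shown here.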
package/README.md CHANGED
@@ -6,7 +6,6 @@
6
6
  [![Tests](https://img.shields.io/badge/tests-699%20passing-brightgreen)](https://github.com/Fast-Editor/Lynkr)
7
7
  [![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
8
8
  [![Node.js](https://img.shields.io/badge/node-20%2B-green)](https://nodejs.org)
9
- [![Homebrew Tap](https://img.shields.io/badge/homebrew-lynkr-brightgreen.svg)](https://github.com/vishalveerareddy123/homebrew-lynkr)
10
9
  [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/vishalveerareddy123/Lynkr)
11
10
 
12
11
  <table>
@@ -20,281 +19,481 @@
20
19
 
21
20
  ---
22
21
 
23
- ## The Problem
22
+ ## Quick Start (2 Minutes)
24
23
 
25
- AI coding tools lock you into one provider. Claude Code requires Anthropic. Codex requires OpenAI. You can't use your company's Databricks endpoint, your local Ollama models, or your AWS Bedrock account — at least, not without Lynkr.
24
+ ### 1. Install Lynkr
26
25
 
27
- **The real costs:**
28
- - Anthropic API at $15/MTok output adds up fast for daily coding
29
- - No way to use free local models (Ollama, llama.cpp) with Claude Code
30
- - Enterprise teams can't route through their own cloud infrastructure
31
- - Provider outages take your entire workflow down
26
+ ```bash
27
+ npm install -g lynkr
28
+ ```
32
29
 
33
- ## The Solution
30
+ ### 2. Configure Lynkr
34
31
 
35
- Lynkr is a self-hosted proxy that sits between your AI coding tools and any LLM provider. One environment variable change, and your tools work with any model.
32
+ First run creates a `.env` file. Edit it with your provider settings.
36
33
 
37
- ```
38
- Claude Code / Cursor / Codex / Cline / Continue / Vercel AI SDK
39
- |
40
- Lynkr
41
- |
42
- Ollama | Bedrock | Databricks | OpenRouter | Azure | OpenAI | llama.cpp
43
- ```
34
+ **Option A: Free & Local (Ollama) - Recommended for Testing**
44
35
 
45
36
  ```bash
46
- # That's it. Three lines.
47
- npm install -g lynkr
48
- export ANTHROPIC_BASE_URL=http://localhost:8081
49
- lynkr start
37
+ # Install Ollama first: https://ollama.com
38
+ ollama pull qwen2.5-coder:latest
50
39
  ```
51
40
 
52
- ---
41
+ Create/edit `.env` in your project directory:
42
+ ```bash
43
+ # Provider
44
+ MODEL_PROVIDER=ollama
45
+ FALLBACK_ENABLED=false
53
46
 
54
- ## Quick Start
47
+ # Ollama Configuration
48
+ OLLAMA_ENDPOINT=http://localhost:11434
49
+ OLLAMA_MODEL=qwen2.5-coder:latest
55
50
 
56
- ### Install
51
+ # Server
52
+ PORT=8081
53
+
54
+ # Increase limits for Claude Code/Cursor
55
+ POLICY_MAX_STEPS=50
56
+ POLICY_MAX_TOOL_CALLS=100
57
+
58
+ # Disable overly strict command filtering
59
+ POLICY_SAFE_COMMANDS_ENABLED=false
60
+ ```
61
+
62
+ **Option B: Cloud (OpenRouter) - Recommended for Production**
57
63
 
58
- **One-line install (recommended):**
59
64
  ```bash
60
- curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
65
+ # Get API key from https://openrouter.ai
61
66
  ```
62
67
 
63
- **Or via npm:**
68
+ Create/edit `.env`:
64
69
  ```bash
65
- npm install -g pino-pretty && npm install -g lynkr
70
+ # Provider
71
+ MODEL_PROVIDER=openrouter
72
+ OPENROUTER_API_KEY=sk-or-v1-your-key-here
73
+ FALLBACK_ENABLED=false
74
+
75
+ # Server
76
+ PORT=8081
77
+
78
+ # Increase limits
79
+ POLICY_MAX_STEPS=50
80
+ POLICY_MAX_TOOL_CALLS=100
81
+
82
+ # Optional: Enable caching
83
+ PROMPT_CACHE_ENABLED=true
84
+ SEMANTIC_CACHE_ENABLED=true
66
85
  ```
67
86
 
68
- ### Pick a Provider
87
+ **Option C: Enterprise (AWS Bedrock)**
69
88
 
70
- **Free & Local (Ollama)**
89
+ Create/edit `.env`:
71
90
  ```bash
72
- export MODEL_PROVIDER=ollama
73
- export OLLAMA_MODEL=qwen2.5-coder:latest
74
- lynkr start
91
+ # Provider
92
+ MODEL_PROVIDER=bedrock
93
+ AWS_BEDROCK_API_KEY=your-aws-key
94
+ AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
95
+ FALLBACK_ENABLED=false
96
+
97
+ # Server
98
+ PORT=8081
99
+
100
+ # Increase limits
101
+ POLICY_MAX_STEPS=50
102
+ POLICY_MAX_TOOL_CALLS=100
75
103
  ```
76
104
 
77
- **AWS Bedrock (100+ models)**
105
+ **Option D: Enterprise (Databricks)**
106
+
107
+ Create/edit `.env`:
78
108
  ```bash
79
- export MODEL_PROVIDER=bedrock
80
- export AWS_BEDROCK_API_KEY=your-key
81
- export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
82
- lynkr start
109
+ # Provider
110
+ MODEL_PROVIDER=databricks
111
+ DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
112
+ DATABRICKS_API_KEY=your-token
113
+ FALLBACK_ENABLED=false
114
+
115
+ # Server
116
+ PORT=8081
117
+
118
+ # Increase limits
119
+ POLICY_MAX_STEPS=50
120
+ POLICY_MAX_TOOL_CALLS=100
83
121
  ```
84
122
 
85
- **OpenRouter (cheapest cloud)**
123
+ Then start Lynkr:
86
124
  ```bash
87
- export MODEL_PROVIDER=openrouter
88
- export OPENROUTER_API_KEY=sk-or-v1-your-key
89
125
  lynkr start
90
126
  ```
91
127
 
92
- ### Connect Your Tool
128
+ ### 3. Connect Your Tool
93
129
 
94
130
  **Claude Code**
95
131
  ```bash
96
132
  export ANTHROPIC_BASE_URL=http://localhost:8081
97
133
  export ANTHROPIC_API_KEY=dummy
98
- claude "Your prompt here"
134
+ claude "write a hello world in python"
99
135
  ```
100
136
 
101
- **Codex CLI** — edit `~/.codex/config.toml`:
137
+ **Cursor IDE**
138
+ - Settings → Models → Override Base URL
139
+ - Set to: `http://localhost:8081/v1`
140
+ - API Key: `any-value`
141
+
142
+ **Codex CLI**
143
+
144
+ Edit `~/.codex/config.toml`:
102
145
  ```toml
103
146
  model_provider = "lynkr"
104
- model = "gpt-4o"
105
147
 
106
148
  [model_providers.lynkr]
107
- name = "Lynkr Proxy"
108
149
  base_url = "http://localhost:8081/v1"
109
150
  wire_api = "responses"
110
151
  ```
111
152
 
112
- **Cursor IDE**
113
- - Settings > Features > Models
114
- - Base URL: `http://localhost:8081/v1`
115
- - API Key: `sk-lynkr`
116
-
117
- **Vercel AI SDK**
118
- ```ts
119
- import { generateText } from "ai";
120
- import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
121
-
122
- const lynkr = createOpenAICompatible({
123
- baseURL: "http://localhost:8081/v1",
124
- name: "lynkr",
125
- apiKey: "sk-lynkr",
126
- });
127
-
128
- const { text } = await generateText({
129
- model: lynkr.chatModel("auto"),
130
- prompt: "Hello!",
131
- });
132
- ```
153
+ **Done!** Your AI tool now uses your chosen provider.
154
+
155
+ ---
156
+
157
+ ## Why Lynkr?
158
+
159
+ AI coding tools lock you into one provider. Lynkr breaks that lock.
133
160
 
134
- **OpenClaw**
135
- ```json
136
- // openclaw.json
137
- {
138
- "models": {
139
- "providers": [{
140
- "name": "lynkr",
141
- "type": "openai-compatible",
142
- "base_url": "http://localhost:8081/v1",
143
- "api_key": "any-value",
144
- "models": ["auto"]
145
- }]
146
- }
147
- }
148
161
  ```
149
- Set `OPENCLAW_MODE=true` in Lynkr's `.env` to show actual provider/model in responses.
162
+ Claude Code / Cursor / Codex / Cline / Continue
163
+
164
+ Lynkr
165
+
166
+ Ollama | Bedrock | Azure | OpenRouter | OpenAI
167
+ ```
150
168
 
151
- > Works with any OpenAI-compatible client: Cline, Continue.dev, OpenClaw, KiloCode, and more.
169
+ **What you get:**
170
+ - ✅ Use **free local models** (Ollama, llama.cpp) with Claude Code
171
+ - ✅ Route through **your company's infrastructure** (Databricks, Azure, Bedrock)
172
+ - ✅ Cut costs **60-80%** with smart token optimization
173
+ - ✅ **Zero code changes** - just change one environment variable
152
174
 
153
175
  ---
154
176
 
155
177
  ## Supported Providers
156
178
 
157
- | Provider | Type | Models | Cost |
158
- |----------|------|--------|------|
159
- | **Ollama** | Local | Unlimited (free, offline) | **Free** |
179
+ | Provider | Type | Example Models | Cost |
180
+ |----------|------|---------------|------|
181
+ | **Ollama** | Local | qwen2.5-coder, deepseek-coder, llama3 | **Free** |
160
182
  | **llama.cpp** | Local | Any GGUF model | **Free** |
161
183
  | **LM Studio** | Local | Local models with GUI | **Free** |
162
- | **MLX Server** | Local | Apple Silicon optimized | **Free** |
163
- | **AWS Bedrock** | Cloud | 100+ (Claude, Llama, Mistral, Titan) | $$ |
164
- | **OpenRouter** | Cloud | 100+ (GPT, Claude, Llama, Gemini) | $-$$ |
184
+ | **OpenRouter** | Cloud | GPT-4o, Claude 3.5, Llama 3, Gemini | $ |
185
+ | **AWS Bedrock** | Cloud | Claude, Llama, Mistral, Titan | $$ |
165
186
  | **Databricks** | Cloud | Claude Sonnet 4.5, Opus 4.6 | $$$ |
166
187
  | **Azure OpenAI** | Cloud | GPT-4o, o1, o3 | $$$ |
167
- | **Azure Anthropic** | Cloud | Claude models | $$$ |
168
- | **OpenAI** | Cloud | GPT-4o, o3, o4-mini | $$$ |
169
- | **Google Vertex** | Cloud | Gemini 2.5 Pro/Flash | $$$ |
170
- | **Moonshot AI** | Cloud | Kimi K2 Thinking/Turbo | $$ |
171
- | **Z.AI** | Cloud | GLM-4.7 | $$ |
172
- | **DeepSeek** | Cloud | DeepSeek Reasoner, R1 | $ |
188
+ | **Azure Anthropic** | Cloud | Claude Sonnet, Opus | $$$ |
189
+ | **OpenAI** | Cloud | GPT-4o, o3-mini | $$$ |
190
+ | **DeepSeek** | Cloud | DeepSeek R1, Reasoner | $ |
173
191
 
174
- 4 local providers for **100% offline, free** usage. 10+ cloud providers for scale.
192
+ **4 local providers** for 100% offline, free usage. **10+ cloud providers** for scale.
175
193
 
176
194
  ---
177
195
 
178
- ## Why Lynkr Over Alternatives
179
-
180
- | Feature | Lynkr | LiteLLM (42K stars) | OpenRouter | PortKey |
181
- |---------|-------|---------------------|------------|---------|
182
- | **Setup** | `npm install -g lynkr` | Python + Docker + Postgres | Account signup | Docker + config |
183
- | **Claude Code support** | Drop-in, native | Requires config | No CLI support | Requires config |
184
- | **Cursor support** | Drop-in, native | Partial | Via API key | Partial |
185
- | **Codex CLI support** | Drop-in, native | No | No | No |
186
- | **Built for coding tools** | Yes (purpose-built) | No (general gateway) | No (general API) | No (general gateway) |
187
- | **Local models** | Ollama, llama.cpp, LM Studio, MLX | Ollama only | No | No |
188
- | **Token optimization** | Built-in (60-80% savings) | No | No | Caching only |
189
- | **Complexity routing** | Auto-routes by task difficulty | Manual | Cost/latency only | Manual |
190
- | **Memory system** | Titans-inspired long-term memory | No | No | No |
191
- | **Self-hosted** | Yes (Node.js) | Yes (Python stack) | No (SaaS) | Yes (Docker) |
192
- | **Offline capable** | Yes | Yes | No | No |
193
- | **Transaction fees** | None | None (OSS) / Paid enterprise | 5.5% on credits | Free tier / Paid |
194
- | **Dependencies** | Node.js only | Python, Prisma, PostgreSQL | N/A | Docker, Python |
195
- | **Format conversion** | Anthropic <-> OpenAI (automatic) | Automatic | N/A | Automatic |
196
- | **Code intelligence** | Graphify (19-lang AST graph) | No | No | No |
197
- | **Routing telemetry** | Built-in (SQLite + REST API) | No | Dashboard | Dashboard |
198
- | **Admin hot-reload** | Yes (no restart) | Requires restart | N/A | Requires restart |
199
- | **License** | Apache 2.0 | MIT | Proprietary | MIT (gateway) |
200
-
201
- **Lynkr's edge:** Purpose-built for AI coding tools. Not a general LLM gateway — a proxy that understands Claude Code, Cursor, and Codex natively, with built-in token optimization, complexity-based routing, and a memory system designed for coding workflows. Installs in one command, runs on Node.js, zero infrastructure required.
196
+ ## Advanced: Tier Routing (Save Even More)
202
197
 
203
- ---
198
+ Route different request types to different models automatically:
204
199
 
205
- ## Cost Comparison
200
+ ```bash
201
+ # .env file
202
+ MODEL_PROVIDER=ollama
203
+ FALLBACK_ENABLED=false
204
+
205
+ # Use small/fast models for simple tasks
206
+ TIER_SIMPLE=ollama:qwen2.5:3b
207
+
208
+ # Use medium models for normal coding
209
+ TIER_MEDIUM=ollama:qwen2.5:7b
206
210
 
207
- | Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter | Lynkr + Bedrock |
208
- |----------|-----------------|----------------|--------------------| --------------- |
209
- | Daily Claude Code usage | ~$10-30/day | **$0 (free)** | ~$2-8/day | ~$5-15/day |
210
- | Token optimization savings | — | — | 60-80% further | 60-80% further |
211
- | Monthly (heavy use) | $300-900 | **$0** | $60-240 | $150-450 |
211
+ # Use powerful models for complex architecture
212
+ TIER_COMPLEX=ollama:deepseek-r1:14b
213
+ TIER_REASONING=ollama:deepseek-r1:14b
212
214
 
213
- > With token optimization enabled, Lynkr's smart tool selection, prompt caching, and memory deduplication reduce token usage by 60-80% on top of provider savings.
215
+ # Increase limits for long conversations
216
+ POLICY_MAX_STEPS=50
217
+ POLICY_MAX_TOOL_CALLS=100
218
+ ```
219
+
220
+ Lynkr analyzes each request and routes it to the appropriate tier. Simple questions use fast models. Complex refactoring uses powerful models.
221
+
222
+ **Result:** 70-90% of requests use cheaper/faster models. Only hard problems hit expensive models.
214
223
 
215
224
  ---
216
225
 
217
- ## What's Under the Hood
226
+ ## Complete .env Examples
218
227
 
219
- Lynkr isn't just a passthrough proxy. It's an optimization layer.
228
+ ### MVP: Minimal Working Setup (Ollama)
220
229
 
221
- ### Smart Routing (5-Phase)
222
- Routes requests to the right model based on 5-phase complexity analysis. Simple questions go to fast/cheap models. Complex architectural tasks go to powerful models. Includes Graphify structural analysis for code-aware routing.
230
+ Copy-paste ready configuration for immediate use:
223
231
 
224
- - **Complexity scoring** — 15-dimension weighted scoring with agentic workflow detection
225
- - **Graphify integration** — AST-based knowledge graph detects god nodes, community cohesion, blast radius across 19 languages
226
- - **Routing telemetry** — every decision recorded with quality scoring (0-100) and latency tracking (P50/P95/P99)
232
+ ```bash
233
+ # .env - Minimal Ollama Setup
234
+
235
+ # ============================================
236
+ # REQUIRED: Provider Configuration
237
+ # ============================================
238
+ MODEL_PROVIDER=ollama
239
+ FALLBACK_ENABLED=false
240
+
241
+ # ============================================
242
+ # REQUIRED: Ollama Settings
243
+ # ============================================
244
+ OLLAMA_ENDPOINT=http://localhost:11434
245
+ OLLAMA_MODEL=qwen2.5-coder:latest
246
+
247
+ # ============================================
248
+ # REQUIRED: Server Configuration
249
+ # ============================================
250
+ PORT=8081
251
+ HOST=0.0.0.0
252
+
253
+ # ============================================
254
+ # REQUIRED: Claude Code/Cursor Compatibility
255
+ # ============================================
256
+ POLICY_MAX_STEPS=50
257
+ POLICY_MAX_TOOL_CALLS=100
258
+ POLICY_SAFE_COMMANDS_ENABLED=false
259
+
260
+ # ============================================
261
+ # OPTIONAL: Performance (Recommended)
262
+ # ============================================
263
+ LOG_LEVEL=warn
264
+ LOAD_SHEDDING_ENABLED=true
265
+ LOAD_SHEDDING_HEAP_THRESHOLD=0.85
266
+ ```
227
267
 
228
- ### Token Optimization (8 Phases)
229
- - **MCP Code Mode** — replaces 100+ MCP tool schemas with 4 meta-tools (~96% reduction, lazy tool discovery)
230
- - **Smart tool selection** - only sends tools relevant to the current task (50-70% reduction)
231
- - **Prompt caching** - SHA-256 keyed LRU cache (30-45% reduction on repeated prompts)
232
- - **Memory deduplication** — eliminates repeated information across turns (20-30% reduction)
233
- - **Tool response truncation** — intelligent truncation of long outputs (15-25% reduction)
234
- - **Dynamic system prompts** — adapt complexity to request type (10-20% reduction)
235
- - **Distill compression** — structural similarity, delta rendering, smart dedup of repetitive tool outputs (20-40% reduction)
236
- - **Headroom sidecar** — optional ML-based compression: Smart Crusher, CCR, LLMLingua (47-92% reduction)
268
+ **Steps:**
269
+ 1. Install Ollama: `curl -fsSL https://ollama.com/install.sh | sh`
270
+ 2. Pull model: `ollama pull qwen2.5-coder:latest`
271
+ 3. Copy above to `.env` in your project directory
272
+ 4. Run: `lynkr start`
237
273
 
238
- ### Enterprise Resilience
239
- - **Circuit breakers** — automatic failover with half-open probe recovery
240
- - **Admin hot-reload** — `POST /v1/admin/reload` reloads config + resets circuit breakers without restart
241
- - **Load shedding** — graceful degradation under high load
242
- - **Prometheus metrics** — full observability at `/metrics`
243
- - **Health checks** — K8s-ready endpoints at `/health`
244
- - **Performance timer** — per-request timing breakdown with `PERF_TIMER=true`
274
+ ---
245
275
 
246
- ### Memory System
247
- Titans-inspired long-term memory with surprise-based filtering. The system remembers important context across sessions and forgets noise — reducing token waste from repeated context.
276
+ ### Production: Cloud with Tier Routing (OpenRouter)
248
277
 
249
- ### Semantic Cache
250
- Cache responses for semantically similar prompts. Hit rate depends on your workflow, but repeat questions (common in coding) get instant responses.
278
+ Optimized for cost savings with smart routing:
251
279
 
252
280
  ```bash
281
+ # .env - Production OpenRouter Setup
282
+
283
+ # ============================================
284
+ # REQUIRED: Provider Configuration
285
+ # ============================================
286
+ MODEL_PROVIDER=openrouter
287
+ OPENROUTER_API_KEY=sk-or-v1-your-key-here
288
+ FALLBACK_ENABLED=false
289
+
290
+ # ============================================
291
+ # REQUIRED: Server Configuration
292
+ # ============================================
293
+ PORT=8081
294
+ HOST=0.0.0.0
295
+
296
+ # ============================================
297
+ # TIER ROUTING: Smart Cost Optimization
298
+ # ============================================
299
+ # Simple queries → Cheap/fast model
300
+ TIER_SIMPLE=openrouter:google/gemini-flash-1.5
301
+
302
+ # Normal coding → Balanced model
303
+ TIER_MEDIUM=openrouter:anthropic/claude-3.5-sonnet
304
+
305
+ # Complex refactoring → Powerful model
306
+ TIER_COMPLEX=openrouter:anthropic/claude-opus-4
307
+
308
+ # Deep reasoning → Most capable model
309
+ TIER_REASONING=openrouter:anthropic/claude-opus-4
310
+
311
+ # ============================================
312
+ # REQUIRED: Claude Code/Cursor Compatibility
313
+ # ============================================
314
+ POLICY_MAX_STEPS=50
315
+ POLICY_MAX_TOOL_CALLS=100
316
+ POLICY_SAFE_COMMANDS_ENABLED=false
317
+
318
+ # ============================================
319
+ # OPTIONAL: Token Optimization (60-80% savings)
320
+ # ============================================
321
+ PROMPT_CACHE_ENABLED=true
253
322
  SEMANTIC_CACHE_ENABLED=true
254
323
  SEMANTIC_CACHE_THRESHOLD=0.95
324
+ TOOL_INJECTION_ENABLED=false
325
+
326
+ # ============================================
327
+ # OPTIONAL: Performance Tuning
328
+ # ============================================
329
+ LOG_LEVEL=warn
330
+ LOAD_SHEDDING_ENABLED=true
331
+ LOAD_SHEDDING_HEAP_THRESHOLD=0.85
332
+ ```
333
+
334
+ **Expected savings:** 70-90% of requests use Gemini Flash ($). Only 10-30% use Claude Opus ($$$).
335
+
336
+ ---
337
+
338
+ ### Enterprise: Databricks Foundation Models
339
+
340
+ For teams using Databricks Model Serving:
341
+
342
+ ```bash
343
+ # .env - Enterprise Databricks Setup
344
+
345
+ # ============================================
346
+ # REQUIRED: Provider Configuration
347
+ # ============================================
348
+ MODEL_PROVIDER=databricks
349
+ DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
350
+ DATABRICKS_API_KEY=dapi1234567890abcdef
351
+ FALLBACK_ENABLED=false
352
+
353
+ # ============================================
354
+ # REQUIRED: Model Configuration
355
+ # ============================================
356
+ # Option 1: Single model (no tier routing)
357
+ DATABRICKS_MODEL=databricks-meta-llama-3-1-405b-instruct
358
+
359
+ # Option 2: Tier routing (comment out above, uncomment below)
360
+ # TIER_SIMPLE=databricks:databricks-meta-llama-3-1-70b-instruct
361
+ # TIER_MEDIUM=databricks:databricks-claude-sonnet-4-5
362
+ # TIER_COMPLEX=databricks:databricks-claude-opus-4-6
363
+ # TIER_REASONING=databricks:databricks-claude-opus-4-6
364
+
365
+ # ============================================
366
+ # REQUIRED: Server Configuration
367
+ # ============================================
368
+ PORT=8081
369
+ HOST=0.0.0.0
370
+
371
+ # ============================================
372
+ # REQUIRED: Claude Code/Cursor Compatibility
373
+ # ============================================
374
+ POLICY_MAX_STEPS=50
375
+ POLICY_MAX_TOOL_CALLS=100
376
+ POLICY_SAFE_COMMANDS_ENABLED=false
377
+
378
+ # ============================================
379
+ # OPTIONAL: Enterprise Features
380
+ # ============================================
381
+ LOG_LEVEL=info
382
+ LOAD_SHEDDING_ENABLED=true
383
+ LOAD_SHEDDING_HEAP_THRESHOLD=0.85
384
+
385
+ # Optional: Metrics for monitoring
386
+ # PROMETHEUS_METRICS_ENABLED=true
255
387
  ```
256
388
 
257
- ### MCP Integration + Code Mode
258
- Automatic Model Context Protocol server discovery and orchestration. Your MCP tools work through Lynkr without configuration.
389
+ ---
390
+
391
+ ### Hybrid: Local + Cloud Fallback
259
392
 
260
- **MCP Code Mode** - Token optimization for heavy MCP setups:
261
- - Replaces 100+ individual MCP tool schemas with 4 meta-tools
262
- - Reduces tool catalog from ~17,500 tokens to ~700 tokens (**96% reduction**)
263
- - Enables lazy tool discovery: model queries `mcp_list_tools`, then `mcp_tool_info`, then `mcp_execute`
264
- - Best for: 50+ MCP tools, long conversations, context-constrained setups
265
- - Trade-off: 3 sequential calls instead of 1 (adds ~2-3s latency)
393
+ Use free Ollama, fallback to cloud when needed:
266
394
 
267
395
  ```bash
268
- CODE_MODE_ENABLED=true # Enable Code Mode
269
- CODE_MODE_CACHE_TTL=60000 # Tool list cache TTL (ms)
396
+ # .env - Hybrid Setup (Advanced)
397
+
398
+ # ============================================
399
+ # PRIMARY: Local Ollama
400
+ # ============================================
401
+ MODEL_PROVIDER=ollama
402
+ OLLAMA_ENDPOINT=http://localhost:11434
403
+ OLLAMA_MODEL=qwen2.5-coder:latest
404
+
405
+ # ============================================
406
+ # FALLBACK: Cloud Provider
407
+ # ============================================
408
+ FALLBACK_ENABLED=true
409
+ FALLBACK_PROVIDER=openrouter
410
+ OPENROUTER_API_KEY=sk-or-v1-your-key-here
411
+
412
+ # ============================================
413
+ # TIER ROUTING: Mix Local + Cloud
414
+ # ============================================
415
+ TIER_SIMPLE=ollama:qwen2.5:3b
416
+ TIER_MEDIUM=ollama:qwen2.5:7b
417
+ TIER_COMPLEX=openrouter:anthropic/claude-3.5-sonnet
418
+ TIER_REASONING=openrouter:anthropic/claude-opus-4
419
+
420
+ # ============================================
421
+ # REQUIRED: Server Configuration
422
+ # ============================================
423
+ PORT=8081
424
+ HOST=0.0.0.0
425
+
426
+ # ============================================
427
+ # REQUIRED: Claude Code/Cursor Compatibility
428
+ # ============================================
429
+ POLICY_MAX_STEPS=50
430
+ POLICY_MAX_TOOL_CALLS=100
431
+ POLICY_SAFE_COMMANDS_ENABLED=false
432
+
433
+ # ============================================
434
+ # OPTIONAL: Performance
435
+ # ============================================
436
+ LOG_LEVEL=warn
437
+ LOAD_SHEDDING_ENABLED=true
270
438
  ```
271
439
 
272
- See [Token Optimization Guide](documentation/token-optimization.md#phase-0-mcp-code-mode-96-reduction-for-mcp-tools) and [Tools Documentation](documentation/tools.md#mcp-code-mode-token-optimization) for details.
440
+ **Best of both worlds:** 80% of requests stay local (free). Complex tasks use cloud (paid).
273
441
 
274
442
  ---
275
443
 
276
- ## Deployment Options
444
+ ## Common Issues & Fixes
445
+
446
+ | Issue | Solution |
447
+ |-------|----------|
448
+ | **"Service temporarily overloaded"** | Ollama model too large for RAM. Use smaller model or increase `--max-old-space-size` |
449
+ | **"Route not found: HEAD /"** | Ignore - harmless health check from Claude Code |
450
+ | **"Hallucinated tool calls"** | Normal - Lynkr automatically filters invalid tools |
451
+ | **"Safe Command DSL blocked"** | Add `POLICY_SAFE_COMMANDS_ENABLED=false` to `.env` |
452
+ | **Slow first request (20+ sec)** | Ollama loading model into memory. Add `OLLAMA_KEEP_ALIVE=30m` in Ollama config |
453
+ | **No response after N turns** | Increase limits: `POLICY_MAX_STEPS=50` and `POLICY_MAX_TOOL_CALLS=100` |
277
454
 
278
- **One-line install (recommended)**
455
+ ---
456
+
457
+ ## Advanced Features
458
+
459
+ ### Token Optimization (60-80% savings)
279
460
  ```bash
280
- curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
461
+ # Enable all optimizations
462
+ PROMPT_CACHE_ENABLED=true
463
+ SEMANTIC_CACHE_ENABLED=true
464
+ TOOL_INJECTION_ENABLED=false
465
+ CODE_MODE_ENABLED=true
281
466
  ```
282
467
 
283
- **NPM**
468
+ ### Memory System (Titans-inspired)
284
469
  ```bash
285
- npm install -g lynkr && lynkr start
470
+ MEMORY_ENABLED=true
471
+ MEMORY_TTL=3600000 # 1 hour
286
472
  ```
287
473
 
288
- **Docker**
474
+ ### Load Shedding & Resilience
289
475
  ```bash
290
- docker-compose up -d
476
+ LOAD_SHEDDING_ENABLED=true
477
+ LOAD_SHEDDING_HEAP_THRESHOLD=0.85
291
478
  ```
292
479
 
293
- **Git Clone**
480
+ ### Admin Hot-Reload (no restart needed)
294
481
  ```bash
295
- git clone https://github.com/Fast-Editor/Lynkr.git
296
- cd Lynkr && npm install && cp .env.example .env
297
- npm start
482
+ curl -X POST http://localhost:8081/v1/admin/reload
483
+ ```
484
+
485
+ ---
486
+
487
+ ## Installation Methods
488
+
489
+ **NPM (recommended)**
490
+ ```bash
491
+ npm install -g lynkr
492
+ ```
493
+
494
+ **One-line installer**
495
+ ```bash
496
+ curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
298
497
  ```
299
498
 
300
499
  **Homebrew**
@@ -303,6 +502,22 @@ brew tap vishalveerareddy123/lynkr
303
502
  brew install lynkr
304
503
  ```
305
504
 
505
+ **Docker**
506
+ ```bash
507
+ git clone https://github.com/Fast-Editor/Lynkr.git
508
+ cd Lynkr
509
+ docker-compose up -d
510
+ ```
511
+
512
+ **From source**
513
+ ```bash
514
+ git clone https://github.com/Fast-Editor/Lynkr.git
515
+ cd Lynkr
516
+ npm install
517
+ cp .env.example .env
518
+ npm start
519
+ ```
520
+
306
521
  ---
307
522
 
308
523
  ## Documentation
@@ -310,54 +525,59 @@ brew install lynkr
310
525
  | Guide | Description |
311
526
  |-------|-------------|
312
527
  | [Installation](documentation/installation.md) | All installation methods |
313
- | [Provider Config](documentation/providers.md) | Setup for all 12+ providers |
314
- | [Claude Code CLI](documentation/claude-code-cli.md) | Detailed Claude Code integration |
315
- | [Codex CLI](documentation/codex-cli.md) | Codex config.toml setup |
316
- | [OpenClaw](documentation/openclaw-integration.md) | OpenClaw integration with tier routing |
317
- | [Cursor IDE](documentation/cursor-integration.md) | Cursor integration + troubleshooting |
318
- | [Embeddings](documentation/embeddings.md) | @Codebase semantic search (4 options) |
319
- | [Token Optimization](documentation/token-optimization.md) | 60-80% cost reduction strategies |
320
- | [Memory System](documentation/memory-system.md) | Titans-inspired long-term memory |
321
- | [Tools & Execution](documentation/tools.md) | Tool calling and execution modes |
322
- | [Smart Routing](documentation/routing.md) | Complexity-based model routing |
323
- | [Docker Deployment](documentation/docker.md) | docker-compose with GPU support |
324
- | [Production Hardening](documentation/production.md) | Circuit breakers, metrics, load shedding |
325
- | [API Reference](documentation/api.md) | All endpoints and formats |
528
+ | [Provider Setup](documentation/providers.md) | Configuration for all 12+ providers |
529
+ | [Claude Code](documentation/claude-code-cli.md) | Claude Code CLI integration |
530
+ | [Cursor IDE](documentation/cursor-integration.md) | Cursor setup + troubleshooting |
531
+ | [Codex CLI](documentation/codex-cli.md) | Codex configuration |
532
+ | [Tier Routing](documentation/routing.md) | Smart model routing by complexity |
533
+ | [Token Optimization](documentation/token-optimization.md) | 60-80% cost reduction |
326
534
  | [Troubleshooting](documentation/troubleshooting.md) | Common issues and solutions |
327
- | [FAQ](documentation/faq.md) | Frequently asked questions |
535
+ | [API Reference](documentation/api.md) | REST API endpoints |
536
+ | [Production](documentation/production.md) | Enterprise deployment |
328
537
 
329
538
  ---
330
539
 
331
- ## Troubleshooting
540
+ ## Cost Comparison
332
541
 
333
- | Issue | Solution |
334
- |-------|----------|
335
- | Same response for all queries | Disable semantic cache: `SEMANTIC_CACHE_ENABLED=false` |
336
- | Tool calls not executing | Increase threshold: `POLICY_TOOL_LOOP_THRESHOLD=15` |
337
- | Slow first request | Keep Ollama loaded: `OLLAMA_KEEP_ALIVE=24h` |
338
- | Connection refused | Ensure Lynkr is running: `lynkr start` |
542
+ | Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter |
543
+ |----------|-----------------|----------------|-------------------|
544
+ | Daily coding (8h) | $10-30/day | **$0 (free)** | $2-8/day |
545
+ | Monthly (heavy use) | $300-900 | **$0** | $60-240 |
546
+
547
+ With tier routing + token optimization: **additional 60-80% savings** on cloud providers.
339
548
 
340
549
  ---
341
550
 
342
- ## Contributing
551
+ ## Why Lynkr vs Alternatives
552
+
553
+ | Feature | Lynkr | LiteLLM | OpenRouter | PortKey |
554
+ |---------|-------|---------|-----------|---------|
555
+ | **Setup** | `npm install -g lynkr` | Python + Docker + Postgres | Account signup | Docker stack |
556
+ | **Claude Code native** | ✅ Drop-in | ⚠️ Requires config | ❌ | ⚠️ Partial |
557
+ | **Cursor native** | ✅ Drop-in | ⚠️ Partial | ❌ | ⚠️ Partial |
558
+ | **Local models** | Ollama, llama.cpp, LM Studio, MLX | Ollama only | ❌ | ❌ |
559
+ | **Tier routing** | Auto complexity-based | ❌ Manual | Cost-based only | ❌ Manual |
560
+ | **Token optimization** | 60-80% built-in | ❌ | ❌ | Cache only |
561
+ | **Self-hosted** | ✅ Node.js only | ✅ Python stack | ❌ SaaS | ✅ Docker |
562
+ | **Dependencies** | Node.js 20+ | Python, Prisma, PostgreSQL | None | Docker, Python |
343
563
 
344
- We welcome contributions. See the [Contributing Guide](documentation/contributing.md) and [Testing Guide](documentation/testing.md).
564
+ **Lynkr's edge:** Purpose-built for AI coding tools. Zero-config for Claude Code, Cursor, and Codex. Installs in one command, runs anywhere Node.js runs.
345
565
 
346
566
  ---
347
567
 
348
- ## License
568
+ ## Community
349
569
 
350
- Apache 2.0 - See [LICENSE](LICENSE).
570
+ - [GitHub Discussions](https://github.com/Fast-Editor/Lynkr/discussions) - Ask questions
571
+ - [Report Issues](https://github.com/Fast-Editor/Lynkr/issues) — Bug reports
572
+ - [NPM Package](https://www.npmjs.com/package/lynkr) — Official releases
573
+ - [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr) — AI-powered docs
351
574
 
352
575
  ---
353
576
 
354
- ## Community
577
+ ## License
355
578
 
356
- - [GitHub Discussions](https://github.com/Fast-Editor/Lynkr/discussions) - Questions and tips
357
- - [Report Issues](https://github.com/Fast-Editor/Lynkr/issues) — Bug reports and feature requests
358
- - [NPM Package](https://www.npmjs.com/package/lynkr) — Official package
359
- - [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr) — AI-powered docs search
579
+ Apache 2.0 - See [LICENSE](LICENSE).
360
580
 
361
581
  ---
362
582
 
363
- **Built by [Vishal Veera Reddy](https://github.com/vishalveerareddy123) for developers who want control over their AI tools.**
583
+ **Built by [Vishal Veera Reddy](https://github.com/vishalveerareddy123) for developers who want control over their AI tools.**
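
As a consistency check, the four hunk headers for package/README.md can be tallied against the `+428 -208` summary at the top. This is a small sketch: the old and new line counts are copied by hand from the `@@ -a,b +c,d @@` headers (`b` and `d` respectively), so the input pairs are transcribed rather than parsed.

```shell
#!/bin/sh
# Each pair is "<old line count> <new line count>" from one README hunk header:
#   @@ -6,7 +6,6 @@   @@ -20,281 +19,481 @@
#   @@ -303,6 +502,22 @@   @@ -310,54 +525,59 @@
net=0
for h in "7 6" "281 481" "6 22" "54 59"; do
  set -- $h                  # split the pair into $1 (old) and $2 (new)
  net=$((net + $2 - $1))     # net lines added by this hunk
done
echo "$net"                  # prints 220
```

The net change of 220 lines agrees with the summary, since 428 - 208 = 220.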
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "lynkr",
3
- "version": "9.1.9",
3
+ "version": "9.2.1",
4
4
  "description": "Self-hosted Claude Code & Cursor proxy with Databricks,AWS BedRock,Azure adapters, openrouter, Ollama,llamacpp,LM Studio, workspace tooling, and MCP integration.",
5
5
  "main": "index.js",
6
6
  "bin": {