lynkr 8.0.1 → 9.0.2
- package/README.md +238 -315
- package/bin/cli.js +16 -3
- package/index.js +7 -3
- package/install.sh +3 -3
- package/lynkr-skill.tar.gz +0 -0
- package/native/Cargo.toml +26 -0
- package/native/index.js +29 -0
- package/native/lynkr-native.node +0 -0
- package/native/src/lib.rs +321 -0
- package/package.json +8 -6
- package/src/api/files-multipart.js +30 -0
- package/src/api/files-router.js +81 -0
- package/src/api/openai-router.js +379 -308
- package/src/api/providers-handler.js +171 -3
- package/src/api/router.js +109 -5
- package/src/cache/prompt.js +13 -0
- package/src/clients/circuit-breaker.js +10 -247
- package/src/clients/codex-process.js +342 -0
- package/src/clients/codex-utils.js +143 -0
- package/src/clients/databricks.js +243 -76
- package/src/clients/ollama-utils.js +21 -17
- package/src/clients/openai-format.js +20 -6
- package/src/clients/openrouter-utils.js +42 -37
- package/src/clients/prompt-cache-injection.js +140 -0
- package/src/clients/provider-capabilities.js +41 -0
- package/src/clients/resilience.js +540 -0
- package/src/clients/responses-format.js +8 -7
- package/src/clients/retry.js +22 -167
- package/src/clients/standard-tools.js +1 -1
- package/src/clients/xml-tool-extractor.js +307 -0
- package/src/cluster.js +82 -0
- package/src/config/index.js +66 -0
- package/src/context/compression.js +42 -9
- package/src/context/distill.js +507 -0
- package/src/context/tool-result-compressor.js +563 -0
- package/src/memory/extractor.js +22 -0
- package/src/orchestrator/index.js +147 -205
- package/src/routing/complexity-analyzer.js +258 -5
- package/src/routing/index.js +15 -34
- package/src/routing/latency-tracker.js +148 -0
- package/src/routing/model-tiers.js +2 -0
- package/src/routing/quality-scorer.js +113 -0
- package/src/routing/telemetry.js +502 -0
- package/src/server.js +23 -0
- package/src/stores/file-store.js +69 -0
- package/src/stores/response-store.js +25 -0
- package/src/tools/code-graph.js +538 -0
- package/src/tools/code-mode.js +304 -0
- package/src/tools/index.js +1 -1
- package/src/tools/lazy-loader.js +11 -0
- package/src/tools/mcp-remote.js +7 -0
- package/src/tools/smart-selection.js +11 -0
- package/src/tools/web.js +1 -1
- package/src/utils/payload.js +206 -0
- package/src/utils/perf-timer.js +80 -0
package/README.md
CHANGED
|
@@ -1,429 +1,352 @@
|
|
|
1
|
-
# Lynkr
|
|
2
|
-
|
|
1
|
+
# Lynkr
|
|
2
|
+
|
|
3
|
+
### Run Claude Code, Cursor, and Codex on any model. One proxy, every provider.
|
|
3
4
|
|
|
4
5
|
[](https://www.npmjs.com/package/lynkr)
|
|
5
|
-
[](https://github.com/Fast-Editor/Lynkr)
|
|
6
7
|
[](LICENSE)
|
|
8
|
+
[](https://nodejs.org)
|
|
9
|
+
[](https://github.com/vishalveerareddy123/homebrew-lynkr)
|
|
7
10
|
[](https://deepwiki.com/vishalveerareddy123/Lynkr)
|
|
8
|
-
[](https://www.databricks.com/)
|
|
9
|
-
[](https://aws.amazon.com/bedrock/)
|
|
10
|
-
[](https://openai.com/)
|
|
11
|
-
[](https://ollama.ai/)
|
|
12
|
-
[](https://github.com/ggerganov/llama.cpp)
|
|
13
11
|
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
---
|
|
12
|
+
<table>
|
|
13
|
+
<tr>
|
|
14
|
+
<td align="center"><strong>12+</strong><br/>LLM Providers</td>
|
|
15
|
+
<td align="center"><strong>60-80%</strong><br/>Cost Reduction</td>
|
|
16
|
+
<td align="center"><strong>699</strong><br/>Tests Passing</td>
|
|
17
|
+
<td align="center"><strong>0</strong><br/>Code Changes Required</td>
|
|
18
|
+
</tr>
|
|
19
|
+
</table>
|
|
23
20
|
|
|
24
|
-
|
|
21
|
+
---
|
|
25
22
|
|
|
26
|
-
|
|
23
|
+
## The Problem
|
|
27
24
|
|
|
28
|
-
|
|
29
|
-
- 💰 **60-80% Cost Reduction** - Built-in token optimization with smart tool selection, prompt caching, and memory deduplication
|
|
30
|
-
- 🔒 **100% Local/Private** - Run completely offline with Ollama or llama.cpp
|
|
31
|
-
- 🌐 **Remote or Local** - Connect to providers on any IP/hostname (not limited to localhost)
|
|
32
|
-
- 🎯 **Zero Code Changes** - Drop-in replacement for Anthropic's backend
|
|
33
|
-
- 🏢 **Enterprise-Ready** - Circuit breakers, load shedding, Prometheus metrics, health checks
|
|
25
|
+
AI coding tools lock you into one provider. Claude Code requires Anthropic. Codex requires OpenAI. You can't use your company's Databricks endpoint, your local Ollama models, or your AWS Bedrock account — at least, not without Lynkr.
|
|
34
26
|
|
|
35
|
-
**
|
|
36
|
-
-
|
|
37
|
-
-
|
|
38
|
-
-
|
|
39
|
-
-
|
|
27
|
+
**The real costs:**
|
|
28
|
+
- Anthropic API at $15/MTok output adds up fast for daily coding
|
|
29
|
+
- No way to use free local models (Ollama, llama.cpp) with Claude Code
|
|
30
|
+
- Enterprise teams can't route through their own cloud infrastructure
|
|
31
|
+
- Provider outages take your entire workflow down
|
|
40
32
|
|
|
41
|
-
|
|
33
|
+
## The Solution
|
|
42
34
|
|
|
43
|
-
|
|
35
|
+
Lynkr is a self-hosted proxy that sits between your AI coding tools and any LLM provider. One environment variable change, and your tools work with any model.
|
|
44
36
|
|
|
45
|
-
|
|
37
|
+
```
|
|
38
|
+
Claude Code / Cursor / Codex / Cline / Continue / Vercel AI SDK
|
|
39
|
+
|
|
|
40
|
+
Lynkr
|
|
41
|
+
|
|
|
42
|
+
Ollama | Bedrock | Databricks | OpenRouter | Azure | OpenAI | llama.cpp
|
|
43
|
+
```
|
|
46
44
|
|
|
47
|
-
**Option 1: NPM Package (Recommended)**
|
|
48
45
|
```bash
|
|
49
|
-
#
|
|
50
|
-
npm install -g pino-pretty
|
|
46
|
+
# That's it. Three lines.
|
|
51
47
|
npm install -g lynkr
|
|
52
|
-
|
|
48
|
+
export ANTHROPIC_BASE_URL=http://localhost:8081
|
|
53
49
|
lynkr start
|
|
54
50
|
```
|
|
55
51
|
|
|
56
|
-
|
|
57
|
-
```bash
|
|
58
|
-
# Clone repository
|
|
59
|
-
git clone https://github.com/vishalveerareddy123/Lynkr.git
|
|
60
|
-
cd Lynkr
|
|
61
|
-
|
|
62
|
-
# Install dependencies
|
|
63
|
-
npm install
|
|
52
|
+
---
|
|
64
53
|
|
|
65
|
-
|
|
66
|
-
cp .env.example .env
|
|
54
|
+
## Quick Start
|
|
67
55
|
|
|
68
|
-
|
|
69
|
-
nano .env
|
|
56
|
+
### Install
|
|
70
57
|
|
|
71
|
-
|
|
72
|
-
|
|
58
|
+
**One-line install (recommended):**
|
|
59
|
+
```bash
|
|
60
|
+
curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
|
|
73
61
|
```
|
|
74
62
|
|
|
75
|
-
**
|
|
76
|
-
- **Node 20-24**: Full support with all features
|
|
77
|
-
- **Node 25+**: Full support (native modules auto-rebuild, babel fallback for code parsing)
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
**Option 3: Docker**
|
|
63
|
+
**Or via npm:**
|
|
82
64
|
```bash
|
|
83
|
-
|
|
65
|
+
npm install -g pino-pretty && npm install -g lynkr
|
|
84
66
|
```
|
|
85
67
|
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
## Supported Providers
|
|
89
|
-
|
|
90
|
-
Lynkr supports **10+ LLM providers**:
|
|
91
|
-
|
|
92
|
-
| Provider | Type | Models | Cost | Privacy |
|
|
93
|
-
|----------|------|--------|------|---------|
|
|
94
|
-
| **AWS Bedrock** | Cloud | 100+ (Claude, Titan, Llama, Mistral, etc.) | $$-$$$ | Cloud |
|
|
95
|
-
| **Databricks** | Cloud | Claude Sonnet 4.5, Opus 4.5 | $$$ | Cloud |
|
|
96
|
-
| **OpenRouter** | Cloud | 100+ (GPT, Claude, Llama, Gemini, etc.) | $-$$ | Cloud |
|
|
97
|
-
| **Ollama** | Local | Unlimited (free, offline) | **FREE** | 🔒 100% Local |
|
|
98
|
-
| **llama.cpp** | Local | GGUF models | **FREE** | 🔒 100% Local |
|
|
99
|
-
| **Azure OpenAI** | Cloud | GPT-4o, GPT-5, o1, o3 | $$$ | Cloud |
|
|
100
|
-
| **Azure Anthropic** | Cloud | Claude models | $$$ | Cloud |
|
|
101
|
-
| **OpenAI** | Cloud | GPT-4o, o1, o3 | $$$ | Cloud |
|
|
102
|
-
| **LM Studio** | Local | Local models with GUI | **FREE** | 🔒 100% Local |
|
|
103
|
-
| **MLX OpenAI Server** | Local | Apple Silicon (M1/M2/M3/M4) | **FREE** | 🔒 100% Local |
|
|
68
|
+
### Pick a Provider
|
|
104
69
|
|
|
105
|
-
|
|
70
|
+
**Free & Local (Ollama)**
|
|
71
|
+
```bash
|
|
72
|
+
export MODEL_PROVIDER=ollama
|
|
73
|
+
export OLLAMA_MODEL=qwen2.5-coder:latest
|
|
74
|
+
lynkr start
|
|
75
|
+
```
|
|
106
76
|
|
|
107
|
-
|
|
77
|
+
**AWS Bedrock (100+ models)**
|
|
78
|
+
```bash
|
|
79
|
+
export MODEL_PROVIDER=bedrock
|
|
80
|
+
export AWS_BEDROCK_API_KEY=your-key
|
|
81
|
+
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
|
|
82
|
+
lynkr start
|
|
83
|
+
```
|
|
108
84
|
|
|
109
|
-
|
|
85
|
+
**OpenRouter (cheapest cloud)**
|
|
86
|
+
```bash
|
|
87
|
+
export MODEL_PROVIDER=openrouter
|
|
88
|
+
export OPENROUTER_API_KEY=sk-or-v1-your-key
|
|
89
|
+
lynkr start
|
|
90
|
+
```
|
|
110
91
|
|
|
111
|
-
|
|
92
|
+
### Connect Your Tool
|
|
112
93
|
|
|
94
|
+
**Claude Code**
|
|
113
95
|
```bash
|
|
114
|
-
# Set Lynkr as backend
|
|
115
96
|
export ANTHROPIC_BASE_URL=http://localhost:8081
|
|
116
97
|
export ANTHROPIC_API_KEY=dummy
|
|
117
|
-
|
|
118
|
-
# Run Claude Code
|
|
119
98
|
claude "Your prompt here"
|
|
120
99
|
```
|
|
121
100
|
|
|
122
|
-
|
|
123
|
-
|
|
124
|
-
📖 **[Detailed Claude Code Setup](documentation/claude-code-cli.md)**
|
|
125
|
-
|
|
126
|
-
---
|
|
127
|
-
|
|
128
|
-
## Cursor Integration
|
|
129
|
-
|
|
130
|
-
Configure Cursor IDE to use Lynkr:
|
|
131
|
-
|
|
132
|
-
1. **Open Cursor Settings**
|
|
133
|
-
- Mac: `Cmd+,` | Windows/Linux: `Ctrl+,`
|
|
134
|
-
- Navigate to: **Features** → **Models**
|
|
135
|
-
|
|
136
|
-
2. **Configure OpenAI API Settings**
|
|
137
|
-
- **API Key**: `sk-lynkr` (any non-empty value)
|
|
138
|
-
- **Base URL**: `http://localhost:8081/v1`
|
|
139
|
-
- **Model**: `claude-3.5-sonnet` (or your provider's model)
|
|
140
|
-
|
|
141
|
-
3. **Test It**
|
|
142
|
-
- Chat: `Cmd+L` / `Ctrl+L`
|
|
143
|
-
- Inline edits: `Cmd+K` / `Ctrl+K`
|
|
144
|
-
- @Codebase search: Requires [embeddings setup](documentation/embeddings.md)
|
|
145
|
-
|
|
146
|
-
📖 **[Full Cursor Setup Guide](documentation/cursor-integration.md)** | **[Embeddings Configuration](documentation/embeddings.md)**
|
|
147
|
-
---
|
|
148
|
-
## Codex CLI Integration
|
|
149
|
-
|
|
150
|
-
Configure [OpenAI Codex CLI](https://github.com/openai/codex) to use Lynkr as its backend.
|
|
151
|
-
|
|
152
|
-
### Option 1: Environment Variables (Quick Start)
|
|
153
|
-
|
|
154
|
-
```bash
|
|
155
|
-
export OPENAI_BASE_URL=http://localhost:8081/v1
|
|
156
|
-
export OPENAI_API_KEY=dummy
|
|
157
|
-
|
|
158
|
-
codex
|
|
159
|
-
```
|
|
160
|
-
|
|
161
|
-
### Option 2: Config File (Recommended)
|
|
162
|
-
|
|
163
|
-
Edit `~/.codex/config.toml`:
|
|
164
|
-
|
|
101
|
+
**Codex CLI** — edit `~/.codex/config.toml`:
|
|
165
102
|
```toml
|
|
166
|
-
# Set Lynkr as the default provider
|
|
167
103
|
model_provider = "lynkr"
|
|
168
104
|
model = "gpt-4o"
|
|
169
105
|
|
|
170
|
-
# Define the Lynkr provider
|
|
171
106
|
[model_providers.lynkr]
|
|
172
107
|
name = "Lynkr Proxy"
|
|
173
108
|
base_url = "http://localhost:8081/v1"
|
|
174
109
|
wire_api = "responses"
|
|
175
|
-
|
|
176
|
-
# Optional: Trust your project directories
|
|
177
|
-
[projects."/path/to/your/project"]
|
|
178
|
-
trust_level = "trusted"
|
|
179
110
|
```
|
|
180
111
|
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
193
|
-
|
|
194
|
-
|
|
195
|
-
|
|
196
|
-
|
|
197
|
-
|
|
198
|
-
|
|
199
|
-
|
|
112
|
+
**Cursor IDE**
|
|
113
|
+
- Settings > Features > Models
|
|
114
|
+
- Base URL: `http://localhost:8081/v1`
|
|
115
|
+
- API Key: `sk-lynkr`
|
|
116
|
+
|
|
117
|
+
**Vercel AI SDK**
|
|
118
|
+
```ts
|
|
119
|
+
import { generateText } from "ai";
|
|
120
|
+
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
|
|
121
|
+
|
|
122
|
+
const lynkr = createOpenAICompatible({
|
|
123
|
+
baseURL: "http://localhost:8081/v1",
|
|
124
|
+
name: "lynkr",
|
|
125
|
+
apiKey: "sk-lynkr",
|
|
126
|
+
});
|
|
127
|
+
|
|
128
|
+
const { text } = await generateText({
|
|
129
|
+
model: lynkr.chatModel("auto"),
|
|
130
|
+
prompt: "Hello!",
|
|
131
|
+
});
|
|
200
132
|
```
|
|
201
133
|
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
|
|
208
|
-
|
|
209
|
-
|
|
134
|
+
**OpenClaw**
|
|
135
|
+
```json
|
|
136
|
+
// openclaw.json
|
|
137
|
+
{
|
|
138
|
+
"models": {
|
|
139
|
+
"providers": [{
|
|
140
|
+
"name": "lynkr",
|
|
141
|
+
"type": "openai-compatible",
|
|
142
|
+
"base_url": "http://localhost:8081/v1",
|
|
143
|
+
"api_key": "any-value",
|
|
144
|
+
"models": ["auto"]
|
|
145
|
+
}]
|
|
146
|
+
}
|
|
147
|
+
}
|
|
148
|
+
```
|
|
149
|
+
Set `OPENCLAW_MODE=true` in Lynkr's `.env` to show actual provider/model in responses.
|
|
210
150
|
|
|
211
|
-
>
|
|
151
|
+
> Works with any OpenAI-compatible client: Cline, Continue.dev, OpenClaw, KiloCode, and more.
|
|
212
152
|
|
|
213
153
|
---
|
|
214
154
|
|
|
215
|
-
##
|
|
216
|
-
|
|
217
|
-
Lynkr supports [ClawdBot](https://github.com/openclaw/openclaw) via its OpenAI-compatible API. ClawdBot users can route requests through Lynkr to access any supported provider.
|
|
155
|
+
## Supported Providers
|
|
218
156
|
|
|
219
|
-
|
|
220
|
-
|
|
221
|
-
|
|
222
|
-
|
|
|
223
|
-
|
|
|
224
|
-
|
|
|
225
|
-
|
|
|
157
|
+
| Provider | Type | Models | Cost |
|
|
158
|
+
|----------|------|--------|------|
|
|
159
|
+
| **Ollama** | Local | Unlimited (free, offline) | **Free** |
|
|
160
|
+
| **llama.cpp** | Local | Any GGUF model | **Free** |
|
|
161
|
+
| **LM Studio** | Local | Local models with GUI | **Free** |
|
|
162
|
+
| **MLX Server** | Local | Apple Silicon optimized | **Free** |
|
|
163
|
+
| **AWS Bedrock** | Cloud | 100+ (Claude, Llama, Mistral, Titan) | $$ |
|
|
164
|
+
| **OpenRouter** | Cloud | 100+ (GPT, Claude, Llama, Gemini) | $-$$ |
|
|
165
|
+
| **Databricks** | Cloud | Claude Sonnet 4.5, Opus 4.6 | $$$ |
|
|
166
|
+
| **Azure OpenAI** | Cloud | GPT-4o, o1, o3 | $$$ |
|
|
167
|
+
| **Azure Anthropic** | Cloud | Claude models | $$$ |
|
|
168
|
+
| **OpenAI** | Cloud | GPT-4o, o3, o4-mini | $$$ |
|
|
169
|
+
| **Google Vertex** | Cloud | Gemini 2.5 Pro/Flash | $$$ |
|
|
170
|
+
| **Moonshot AI** | Cloud | Kimi K2 Thinking/Turbo | $$ |
|
|
171
|
+
| **Z.AI** | Cloud | GLM-4.7 | $$ |
|
|
172
|
+
| **DeepSeek** | Cloud | DeepSeek Reasoner, R1 | $ |
|
|
173
|
+
|
|
174
|
+
4 local providers for **100% offline, free** usage. 10+ cloud providers for scale.
|
|
226
175
|
|
|
227
|
-
|
|
228
|
-
`gpt-5.2`, `gpt-5.1-codex`, `claude-opus-4.5`, `claude-sonnet-4.5`, `claude-haiku-4.5`, `gemini-3-pro`, `gemini-3-flash`, and more.
|
|
176
|
+
---
|
|
229
177
|
|
|
230
|
-
|
|
178
|
+
## Why Lynkr Over Alternatives
|
|
179
|
+
|
|
180
|
+
| Feature | Lynkr | LiteLLM (42K stars) | OpenRouter | PortKey |
|
|
181
|
+
|---------|-------|---------------------|------------|---------|
|
|
182
|
+
| **Setup** | `npm install -g lynkr` | Python + Docker + Postgres | Account signup | Docker + config |
|
|
183
|
+
| **Claude Code support** | Drop-in, native | Requires config | No CLI support | Requires config |
|
|
184
|
+
| **Cursor support** | Drop-in, native | Partial | Via API key | Partial |
|
|
185
|
+
| **Codex CLI support** | Drop-in, native | No | No | No |
|
|
186
|
+
| **Built for coding tools** | Yes (purpose-built) | No (general gateway) | No (general API) | No (general gateway) |
|
|
187
|
+
| **Local models** | Ollama, llama.cpp, LM Studio, MLX | Ollama only | No | No |
|
|
188
|
+
| **Token optimization** | Built-in (60-80% savings) | No | No | Caching only |
|
|
189
|
+
| **Complexity routing** | Auto-routes by task difficulty | Manual | Cost/latency only | Manual |
|
|
190
|
+
| **Memory system** | Titans-inspired long-term memory | No | No | No |
|
|
191
|
+
| **Self-hosted** | Yes (Node.js) | Yes (Python stack) | No (SaaS) | Yes (Docker) |
|
|
192
|
+
| **Offline capable** | Yes | Yes | No | No |
|
|
193
|
+
| **Transaction fees** | None | None (OSS) / Paid enterprise | 5.5% on credits | Free tier / Paid |
|
|
194
|
+
| **Dependencies** | Node.js only | Python, Prisma, PostgreSQL | N/A | Docker, Python |
|
|
195
|
+
| **Format conversion** | Anthropic <-> OpenAI (automatic) | Automatic | N/A | Automatic |
|
|
196
|
+
| **Code intelligence** | Graphify (19-lang AST graph) | No | No | No |
|
|
197
|
+
| **Routing telemetry** | Built-in (SQLite + REST API) | No | Dashboard | Dashboard |
|
|
198
|
+
| **Admin hot-reload** | Yes (no restart) | Requires restart | N/A | Requires restart |
|
|
199
|
+
| **License** | Apache 2.0 | MIT | Proprietary | MIT (gateway) |
|
|
200
|
+
|
|
201
|
+
**Lynkr's edge:** Purpose-built for AI coding tools. Not a general LLM gateway — a proxy that understands Claude Code, Cursor, and Codex natively, with built-in token optimization, complexity-based routing, and a memory system designed for coding workflows. Installs in one command, runs on Node.js, zero infrastructure required.
|
|
231
202
|
|
|
232
203
|
---
|
|
233
204
|
|
|
234
|
-
##
|
|
235
|
-
---
|
|
205
|
+
## Cost Comparison
|
|
236
206
|
|
|
237
|
-
|
|
207
|
+
| Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter | Lynkr + Bedrock |
|
|
208
|
+
|----------|-----------------|----------------|--------------------| --------------- |
|
|
209
|
+
| Daily Claude Code usage | ~$10-30/day | **$0 (free)** | ~$2-8/day | ~$5-15/day |
|
|
210
|
+
| Token optimization savings | — | — | 60-80% further | 60-80% further |
|
|
211
|
+
| Monthly (heavy use) | $300-900 | **$0** | $60-240 | $150-450 |
|
|
238
212
|
|
|
239
|
-
|
|
240
|
-
- 📦 **[Installation Guide](documentation/installation.md)** - Detailed installation for all methods
|
|
241
|
-
- ⚙️ **[Provider Configuration](documentation/providers.md)** - Complete setup for all 12+ providers
|
|
242
|
-
- 🎯 **[Quick Start Examples](documentation/installation.md#quick-start-examples)** - Copy-paste configs
|
|
243
|
-
|
|
244
|
-
### IDE & CLI Integration
|
|
245
|
-
- 🖥️ **[Claude Code CLI Setup](documentation/claude-code-cli.md)** - Connect Claude Code CLI
|
|
246
|
-
- 🤖 **[Codex CLI Setup](documentation/codex-cli.md)** - Configure OpenAI Codex CLI with config.toml
|
|
247
|
-
- 🎨 **[Cursor IDE Setup](documentation/cursor-integration.md)** - Full Cursor integration with troubleshooting
|
|
248
|
-
- 🔍 **[Embeddings Guide](documentation/embeddings.md)** - Enable @Codebase semantic search (4 options: Ollama, llama.cpp, OpenRouter, OpenAI)
|
|
249
|
-
|
|
250
|
-
### Features & Capabilities
|
|
251
|
-
- ✨ **[Core Features](documentation/features.md)** - Architecture, request flow, format conversion
|
|
252
|
-
- 🧠 **[Memory System](documentation/memory-system.md)** - Titans-inspired long-term memory
|
|
253
|
-
- 🗃️ **[Semantic Cache](#semantic-cache)** - Cache responses for similar prompts
|
|
254
|
-
- 💰 **[Token Optimization](documentation/token-optimization.md)** - 60-80% cost reduction strategies
|
|
255
|
-
- 🔧 **[Tools & Execution](documentation/tools.md)** - Tool calling, execution modes, custom tools
|
|
256
|
-
|
|
257
|
-
### Deployment & Operations
|
|
258
|
-
- 🐳 **[Docker Deployment](documentation/docker.md)** - docker-compose setup with GPU support
|
|
259
|
-
- 🏭 **[Production Hardening](documentation/production.md)** - Circuit breakers, load shedding, metrics
|
|
260
|
-
- 📊 **[API Reference](documentation/api.md)** - All endpoints and formats
|
|
261
|
-
|
|
262
|
-
### Support
|
|
263
|
-
- 🔧 **[Troubleshooting](documentation/troubleshooting.md)** - Common issues and solutions
|
|
264
|
-
- ❓ **[FAQ](documentation/faq.md)** - Frequently asked questions
|
|
265
|
-
- 🧪 **[Testing Guide](documentation/testing.md)** - Running tests and validation
|
|
213
|
+
> With token optimization enabled, Lynkr's smart tool selection, prompt caching, and memory deduplication reduce token usage by 60-80% on top of provider savings.
|
|
266
214
|
|
|
267
215
|
---
|
|
268
216
|
|
|
269
|
-
##
|
|
217
|
+
## What's Under the Hood
|
|
270
218
|
|
|
271
|
-
|
|
272
|
-
- 💬 **[GitHub Discussions](https://github.com/vishalveerareddy123/Lynkr/discussions)** - Community Q&A
|
|
273
|
-
- 🐛 **[Report Issues](https://github.com/vishalveerareddy123/Lynkr/issues)** - Bug reports and feature requests
|
|
274
|
-
- 📦 **[NPM Package](https://www.npmjs.com/package/lynkr)** - Official npm package
|
|
219
|
+
Lynkr isn't just a passthrough proxy. It's an optimization layer.
|
|
275
220
|
|
|
276
|
-
|
|
221
|
+
### Smart Routing (5-Phase)
|
|
222
|
+
Routes requests to the right model based on 5-phase complexity analysis. Simple questions go to fast/cheap models. Complex architectural tasks go to powerful models. Includes Graphify structural analysis for code-aware routing.
|
|
277
223
|
|
|
278
|
-
|
|
279
|
-
|
|
280
|
-
-
|
|
281
|
-
- ✅ **60-80% Cost Reduction** - Token optimization with smart tool selection, prompt caching, memory deduplication
|
|
282
|
-
- ✅ **100% Local Option** - Run completely offline with Ollama/llama.cpp (zero cloud dependencies)
|
|
283
|
-
- ✅ **OpenAI Compatible** - Works with Cursor IDE, Continue.dev, and any OpenAI-compatible client
|
|
284
|
-
- ✅ **Embeddings Support** - 4 options for @Codebase search: Ollama (local), llama.cpp (local), OpenRouter, OpenAI
|
|
285
|
-
- ✅ **MCP Integration** - Automatic Model Context Protocol server discovery and orchestration
|
|
286
|
-
- ✅ **Enterprise Features** - Circuit breakers, load shedding, Prometheus metrics, K8s health checks
|
|
287
|
-
- ✅ **Streaming Support** - Real-time token streaming for all providers
|
|
288
|
-
- ✅ **Memory System** - Titans-inspired long-term memory with surprise-based filtering
|
|
289
|
-
- ✅ **Tool Calling** - Full tool support with server and passthrough execution modes
|
|
290
|
-
- ✅ **Production Ready** - Battle-tested with 400+ tests, observability, and error resilience
|
|
291
|
-
- ✅ **Node 20-25 Support** - Works with latest Node.js versions including v25
|
|
292
|
-
- ✅ **Semantic Caching** - Cache responses for similar prompts (requires embeddings)
|
|
224
|
+
- **Complexity scoring** — 15-dimension weighted scoring with agentic workflow detection
|
|
225
|
+
- **Graphify integration** — AST-based knowledge graph detects god nodes, community cohesion, blast radius across 19 languages
|
|
226
|
+
- **Routing telemetry** — every decision recorded with quality scoring (0-100) and latency tracking (P50/P95/P99)
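The P50/P95/P99 latency figures mentioned above are percentiles over recorded request durations. As a rough illustration only (this is not taken from Lynkr's `latency-tracker.js`; the function name and the nearest-rank method are assumptions), a percentile can be computed like this:

```typescript
// Hypothetical helper: nearest-rank percentile over latency samples (ms).
// Lynkr's actual tracker may use a different method or a sliding window.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("no samples recorded");
  }
  // Clamp p to [0, 100] and sort a copy so the caller's array is untouched.
  const clamped = Math.min(Math.max(p, 0), 100);
  const sorted = [...samples].sort((x, y) => x - y);
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((clamped / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Example: ten request durations in milliseconds, in arrival order.
const durations = [40, 10, 100, 30, 70, 20, 90, 50, 80, 60];
const summary = {
  p50: percentile(durations, 50),
  p95: percentile(durations, 95),
  p99: percentile(durations, 99),
};
```

With only ten samples, P95 and P99 both resolve to the slowest request, which is why tail percentiles are usually read over larger windows.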
|
|
293
227
|
|
|
294
|
-
|
|
228
|
+
### Token Optimization (7 Phases)
|
|
229
|
+
- **Smart tool selection** — only sends tools relevant to the current task
|
|
230
|
+
- **Code Mode** — replaces 100+ MCP tools with 4 meta-tools (~96% token reduction)
|
|
231
|
+
- **Distill compression** — structural similarity, delta rendering, smart dedup of repetitive tool outputs
|
|
232
|
+
- **Prompt caching** — SHA-256 keyed LRU cache
|
|
233
|
+
- **Memory deduplication** — eliminates repeated information across turns
|
|
234
|
+
- **History compression** — sliding window with Distill-powered structural dedup
|
|
235
|
+
- **Headroom sidecar** — optional 47-92% ML-based compression (Smart Crusher, CCR, LLMLingua)
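To make the "SHA-256 keyed LRU cache" item above concrete, here is a minimal sketch under stated assumptions: the class name, the JSON serialization of the payload, and the eviction details are hypothetical and are not taken from Lynkr's `src/cache/prompt.js`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical LRU cache keyed by the SHA-256 digest of a request payload.
// A Map iterates in insertion order, so its first key is the least recently used.
class PromptCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  // Hash the serialized payload so large prompts map to fixed-size keys.
  // Assumption: callers build payloads with a stable property order.
  static keyFor(payload: unknown): string {
    return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
  }

  get(payload: unknown): V | undefined {
    const key = PromptCache.keyFor(payload);
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(payload: unknown, value: V): void {
    const key = PromptCache.keyFor(payload);
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in the Map).
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Keying on a digest rather than the raw prompt keeps the key size constant, at the cost of requiring byte-identical payloads for a hit.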
|
|
295
236
|
|
|
296
|
-
|
|
237
|
+
### Enterprise Resilience
|
|
238
|
+
- **Circuit breakers** — automatic failover with half-open probe recovery
|
|
239
|
+
- **Admin hot-reload** — `POST /v1/admin/reload` reloads config + resets circuit breakers without restart
|
|
240
|
+
- **Load shedding** — graceful degradation under high load
|
|
241
|
+
- **Prometheus metrics** — full observability at `/metrics`
|
|
242
|
+
- **Health checks** — K8s-ready endpoints at `/health`
|
|
243
|
+
- **Performance timer** — per-request timing breakdown with `PERF_TIMER=true`
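The "half-open probe recovery" behaviour listed above follows the usual circuit-breaker pattern. The sketch below shows that general pattern only; the class, thresholds, and method names are hypothetical and do not describe Lynkr's `resilience.js`.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical circuit breaker: opens after N consecutive failures, then lets
// a single probe request through once the cooldown has elapsed.
class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // Returns true if a request may be sent to the provider right now.
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "half-open"; // cooldown elapsed: allow exactly one probe
      return true;
    }
    return this.state === "closed";
  }

  // A successful call (including the probe) closes the circuit again.
  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  // A failed probe, or too many consecutive failures, (re)opens the circuit.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  get current(): BreakerState {
    return this.state;
  }
}
```

A caller would check `canRequest()` before each provider call and fall back to another provider while it returns `false`.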
|
|
297
244
|
|
|
298
|
-
|
|
245
|
+
### Memory System
|
|
246
|
+
Titans-inspired long-term memory with surprise-based filtering. The system remembers important context across sessions and forgets noise — reducing token waste from repeated context.
|
|
299
247
|
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
# Requires an embeddings provider (Ollama recommended)
|
|
303
|
-
ollama pull nomic-embed-text
|
|
248
|
+
### Semantic Cache
|
|
249
|
+
Cache responses for semantically similar prompts. Hit rate depends on your workflow, but repeat questions (common in coding) get instant responses.
|
|
304
250
|
|
|
305
|
-
|
|
251
|
+
```bash
|
|
306
252
|
SEMANTIC_CACHE_ENABLED=true
|
|
307
253
|
SEMANTIC_CACHE_THRESHOLD=0.95
|
|
308
|
-
OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
|
|
309
|
-
OLLAMA_EMBEDDINGS_ENDPOINT=http://localhost:11434/api/embeddings
|
|
310
254
|
```
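To illustrate what a similarity threshold such as `SEMANTIC_CACHE_THRESHOLD=0.95` means in practice, the sketch below compares prompt embeddings and returns a cached response only when the match is close enough. It assumes cosine similarity and uses hypothetical names (`CachedAnswer`, `lookup`); the README does not state which metric Lynkr uses.

```typescript
// Hypothetical cache entry: an embedding of the prompt plus the stored response.
interface CachedAnswer {
  embedding: number[];
  response: string;
}

// Cosine similarity between two equal-length vectors (assumed metric).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the best cached response whose similarity meets the threshold, if any.
function lookup(entries: CachedAnswer[], query: number[], threshold = 0.95): string | undefined {
  let best: CachedAnswer | undefined;
  let bestScore = threshold;
  for (const entry of entries) {
    const score = cosineSimilarity(entry.embedding, query);
    if (score >= bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best?.response;
}
```

A lower threshold increases hit rate but also the chance of serving an answer to a different question, which is the failure mode behind "same response for all queries".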
|
|
311
255
|
|
|
312
|
-
|
|
313
|
-
|
|
314
|
-
| `SEMANTIC_CACHE_ENABLED` | `false` | Enable/disable semantic caching |
|
|
315
|
-
| `SEMANTIC_CACHE_THRESHOLD` | `0.95` | Similarity threshold (0.0-1.0) |
|
|
256
|
+
### MCP Integration + Code Mode
|
|
257
|
+
Automatic Model Context Protocol server discovery and orchestration. Your MCP tools work through Lynkr without configuration. Enable Code Mode to replace 100+ MCP tool definitions with 4 lightweight meta-tools:
|
|
316
258
|
|
|
317
|
-
|
|
318
|
-
|
|
319
|
-
---
|
|
320
|
-
|
|
321
|
-
## Architecture
|
|
322
|
-
|
|
323
|
-
```
|
|
324
|
-
┌─────────────────┐
|
|
325
|
-
│ AI Tools │
|
|
326
|
-
└────────┬────────┘
|
|
327
|
-
│ Anthropic/OpenAI Format
|
|
328
|
-
↓
|
|
329
|
-
┌─────────────────┐
|
|
330
|
-
│ Lynkr Proxy │
|
|
331
|
-
│ Port: 8081 │
|
|
332
|
-
│ │
|
|
333
|
-
│ • Format Conv. │
|
|
334
|
-
│ • Token Optim. │
|
|
335
|
-
│ • Provider Route│
|
|
336
|
-
│ • Tool Calling │
|
|
337
|
-
│ • Caching │
|
|
338
|
-
└────────┬────────┘
|
|
339
|
-
│
|
|
340
|
-
├──→ Databricks (Claude 4.5)
|
|
341
|
-
├──→ AWS Bedrock (100+ models)
|
|
342
|
-
├──→ OpenRouter (100+ models)
|
|
343
|
-
├──→ Ollama (local, free)
|
|
344
|
-
├──→ llama.cpp (local, free)
|
|
345
|
-
├──→ Azure OpenAI (GPT-4o, o1)
|
|
346
|
-
├──→ OpenAI (GPT-4o, o3)
|
|
347
|
-
└──→ Azure Anthropic (Claude)
|
|
259
|
+
```bash
|
|
260
|
+
CODE_MODE_ENABLED=true # ~96% reduction in tool-catalog tokens
|
|
348
261
|
```
|
|
349
262
|
|
|
350
|
-
📖 **[Detailed Architecture](documentation/features.md#architecture)**
|
|
351
|
-
|
|
352
263
|
---
|
|
353
264
|
|
|
354
|
-
##
|
|
265
|
+
## Deployment Options
|
|
355
266
|
|
|
356
|
-
**
|
|
267
|
+
**One-line install (recommended)**
|
|
357
268
|
```bash
|
|
358
|
-
|
|
359
|
-
export OLLAMA_MODEL=qwen2.5-coder:latest
|
|
360
|
-
export OLLAMA_EMBEDDINGS_MODEL=nomic-embed-text
|
|
361
|
-
npm start
|
|
269
|
+
curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
|
|
362
270
|
```
|
|
363
|
-
> 💡 **Tip:** Prevent slow cold starts by keeping Ollama models loaded: `launchctl setenv OLLAMA_KEEP_ALIVE "24h"` (macOS) or set `OLLAMA_KEEP_ALIVE=24h` env var. See [troubleshooting](documentation/troubleshooting.md#slow-first-request--cold-start-warning).
|
|
364
271
|
|
|
365
|
-
**
|
|
272
|
+
**NPM**
|
|
366
273
|
```bash
|
|
367
|
-
|
|
368
|
-
export OLLAMA_ENDPOINT=http://192.168.1.100:11434 # Any IP or hostname
|
|
369
|
-
export OLLAMA_MODEL=llama3.1:70b
|
|
370
|
-
npm start
|
|
274
|
+
npm install -g lynkr && lynkr start
|
|
371
275
|
```
|
|
372
|
-
> 🌐 **Note:** All provider endpoints support remote addresses - not limited to localhost. Use any IP, hostname, or domain.
|
|
373
276
|
|
|
374
|
-
**
|
|
277
|
+
**Docker**
|
|
375
278
|
```bash
|
|
376
|
-
|
|
377
|
-
mlx-openai-server launch --model-path mlx-community/Qwen2.5-Coder-7B-Instruct-4bit --model-type lm
|
|
378
|
-
|
|
379
|
-
# Terminal 2: Start Lynkr
|
|
380
|
-
export MODEL_PROVIDER=openai
|
|
381
|
-
export OPENAI_ENDPOINT=http://localhost:8000/v1/chat/completions
|
|
382
|
-
export OPENAI_API_KEY=not-needed
|
|
383
|
-
npm start
|
|
279
|
+
docker-compose up -d
|
|
384
280
|
```
|
|
385
|
-
> 🍎 **Apple Silicon optimized** - Native MLX performance on M1/M2/M3/M4 Macs. See [MLX setup guide](documentation/providers.md#10-mlx-openai-server-apple-silicon).
|
|
386
281
|
|
|
387
|
-
**
|
|
282
|
+
**Git Clone**
|
|
388
283
|
```bash
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
|
|
284
|
+
git clone https://github.com/Fast-Editor/Lynkr.git
|
|
285
|
+
cd Lynkr && npm install && cp .env.example .env
|
|
392
286
|
npm start
|
|
393
287
|
```
|
|
394
288
|
|
|
395
|
-
**
|
|
289
|
+
**Homebrew**
|
|
396
290
|
```bash
|
|
397
|
-
|
|
398
|
-
|
|
399
|
-
npm start
|
|
291
|
+
brew tap vishalveerareddy123/lynkr
|
|
292
|
+
brew install lynkr
|
|
400
293
|
```
|
|
401
|
-
|
|
402
|
-
|
|
294
|
+
|
|
295
|
+
---
|
|
296
|
+
|
|
297
|
+
## Documentation
|
|
298
|
+
|
|
299
|
+
| Guide | Description |
|
|
300
|
+
|-------|-------------|
|
|
301
|
+
| [Installation](documentation/installation.md) | All installation methods |
|
|
302
|
+
| [Provider Config](documentation/providers.md) | Setup for all 12+ providers |
|
|
303
|
+
| [Claude Code CLI](documentation/claude-code-cli.md) | Detailed Claude Code integration |
|
|
304
|
+
| [Codex CLI](documentation/codex-cli.md) | Codex config.toml setup |
|
|
305
|
+
| [OpenClaw](documentation/openclaw-integration.md) | OpenClaw integration with tier routing |
|
|
306
|
+
| [Cursor IDE](documentation/cursor-integration.md) | Cursor integration + troubleshooting |
|
|
307
|
+
| [Embeddings](documentation/embeddings.md) | @Codebase semantic search (4 options) |
|
|
308
|
+
| [Token Optimization](documentation/token-optimization.md) | 60-80% cost reduction strategies |
|
|
309
|
+
| [Memory System](documentation/memory-system.md) | Titans-inspired long-term memory |
|
|
310
|
+
| [Tools & Execution](documentation/tools.md) | Tool calling and execution modes |
|
|
311
|
+
| [Smart Routing](documentation/routing.md) | Complexity-based model routing |
|
|
312
|
+
| [Docker Deployment](documentation/docker.md) | docker-compose with GPU support |
|
|
313
|
+
| [Production Hardening](documentation/production.md) | Circuit breakers, metrics, load shedding |
|
|
314
|
+
| [API Reference](documentation/api.md) | All endpoints and formats |
|
|
315
|
+
| [Troubleshooting](documentation/troubleshooting.md) | Common issues and solutions |
|
|
316
|
+
| [FAQ](documentation/faq.md) | Frequently asked questions |
|
|
317
|
+
|
|
318
|
+
---
|
|
319
|
+
|
|
320
|
+
## Troubleshooting
|
|
321
|
+
|
|
322
|
+
| Issue | Solution |
|
|
323
|
+
|-------|----------|
|
|
324
|
+
| Same response for all queries | Disable semantic cache: `SEMANTIC_CACHE_ENABLED=false` |
|
|
325
|
+
| Tool calls not executing | Increase threshold: `POLICY_TOOL_LOOP_THRESHOLD=15` |
|
|
326
|
+
| Slow first request | Keep Ollama loaded: `OLLAMA_KEEP_ALIVE=24h` |
|
|
327
|
+
| Connection refused | Ensure Lynkr is running: `lynkr start` |
|
|
403
328
|
|
|
404
329
|
---
|
|
405
330
|
|
|
406
331
|
## Contributing
|
|
407
332
|
|
|
408
|
-
We welcome contributions
|
|
409
|
-
- **[Contributing Guide](documentation/contributing.md)** - How to contribute
|
|
410
|
-
- **[Testing Guide](documentation/testing.md)** - Running tests
|
|
333
|
+
We welcome contributions. See the [Contributing Guide](documentation/contributing.md) and [Testing Guide](documentation/testing.md).
|
|
411
334
|
|
|
412
335
|
---
|
|
413
336
|
|
|
414
337
|
## License
|
|
415
338
|
|
|
416
|
-
Apache 2.0
|
|
339
|
+
Apache 2.0 — See [LICENSE](LICENSE).
|
|
417
340
|
|
|
418
341
|
---
|
|
419
342
|
|
|
420
|
-
## Community
|
|
343
|
+
## Community
|
|
421
344
|
|
|
422
|
-
-
|
|
423
|
-
-
|
|
424
|
-
-
|
|
425
|
-
-
|
|
345
|
+
- [GitHub Discussions](https://github.com/Fast-Editor/Lynkr/discussions) — Questions and tips
|
|
346
|
+
- [Report Issues](https://github.com/Fast-Editor/Lynkr/issues) — Bug reports and feature requests
|
|
347
|
+
- [NPM Package](https://www.npmjs.com/package/lynkr) — Official package
|
|
348
|
+
- [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr) — AI-powered docs search
|
|
426
349
|
|
|
427
350
|
---
|
|
428
351
|
|
|
429
|
-
**
|
|
352
|
+
**Built by [Vishal Veera Reddy](https://github.com/vishalveerareddy123) — for developers who want control over their AI tools.**
|