lynkr 1.0.0 → 2.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CITATIONS.bib +6 -0
- package/DEPLOYMENT.md +1001 -0
- package/README.md +215 -71
- package/docs/index.md +55 -2
- package/monitor-agents.sh +31 -0
- package/package.json +7 -3
- package/src/agents/context-manager.js +220 -0
- package/src/agents/definitions/loader.js +563 -0
- package/src/agents/executor.js +412 -0
- package/src/agents/index.js +157 -0
- package/src/agents/parallel-coordinator.js +68 -0
- package/src/agents/reflector.js +321 -0
- package/src/agents/skillbook.js +331 -0
- package/src/agents/store.js +244 -0
- package/src/api/router.js +55 -0
- package/src/clients/databricks.js +214 -17
- package/src/clients/routing.js +15 -7
- package/src/clients/standard-tools.js +341 -0
- package/src/config/index.js +41 -5
- package/src/orchestrator/index.js +254 -37
- package/src/server.js +2 -0
- package/src/tools/agent-task.js +96 -0
- package/test/azure-openai-config.test.js +203 -0
- package/test/azure-openai-error-resilience.test.js +238 -0
- package/test/azure-openai-format-conversion.test.js +354 -0
- package/test/azure-openai-integration.test.js +281 -0
- package/test/azure-openai-routing.test.js +148 -0
- package/test/azure-openai-streaming.test.js +171 -0
- package/test/format-conversion.test.js +578 -0
- package/test/hybrid-routing-integration.test.js +18 -11
- package/test/openrouter-error-resilience.test.js +418 -0
- package/test/passthrough-mode.test.js +385 -0
- package/test/routing.test.js +9 -3
- package/test/web-tools.test.js +3 -0
- package/test-agents-simple.js +43 -0
- package/test-cli-connection.sh +33 -0
- package/test-learning-unit.js +126 -0
- package/test-learning.js +112 -0
- package/test-parallel-agents.sh +124 -0
- package/test-parallel-direct.js +155 -0
- package/test-subagents.sh +117 -0
package/README.md
CHANGED
@@ -15,22 +15,22 @@
 ## Table of Contents

 1. [Overview](#overview)
-2. [
+2. [Supported Models & Providers](#supported-models--providers)
+3. [Core Capabilities](#core-capabilities)
    - [Repo Intelligence & Navigation](#repo-intelligence--navigation)
    - [Git Workflow Enhancements](#git-workflow-enhancements)
    - [Diff & Change Management](#diff--change-management)
    - [Execution & Tooling](#execution--tooling)
    - [Workflow & Collaboration](#workflow--collaboration)
    - [UX, Monitoring, and Logs](#ux-monitoring-and-logs)
-
+4. [Production Hardening Features](#production-hardening-features)
    - [Reliability & Resilience](#reliability--resilience)
    - [Observability & Monitoring](#observability--monitoring)
    - [Security & Governance](#security--governance)
-
-
-
-
-7. [Runtime Operations](#runtime-operations)
+5. [Architecture](#architecture)
+6. [Getting Started](#getting-started)
+7. [Configuration Reference](#configuration-reference)
+8. [Runtime Operations](#runtime-operations)
    - [Launching the Proxy](#launching-the-proxy)
    - [Connecting Claude Code CLI](#connecting-claude-code-cli)
    - [Using Ollama Models](#using-ollama-models)
@@ -40,11 +40,12 @@
    - [Integrating MCP Servers](#integrating-mcp-servers)
    - [Health Checks & Monitoring](#health-checks--monitoring)
    - [Metrics & Observability](#metrics--observability)
-
-
-
-
-
+9. [Manual Test Matrix](#manual-test-matrix)
+10. [Troubleshooting](#troubleshooting)
+11. [Roadmap & Known Gaps](#roadmap--known-gaps)
+12. [FAQ](#faq)
+13. [References](#references)
+14. [License](#license)

 ---

@@ -71,6 +72,105 @@ Further documentation and usage notes are available on [DeepWiki](https://deepwi

 ---

+## Supported Models & Providers
+
+Lynkr supports multiple AI model providers, giving you flexibility in choosing the right model for your needs:
+
+### **Provider Options**
+
+| Provider | Configuration | Models Available | Best For |
+|----------|--------------|------------------|----------|
+| **Databricks** (Default) | `MODEL_PROVIDER=databricks` | Claude Sonnet 4.5, Claude Opus 4.5 | Production use, enterprise deployment |
+| **Azure OpenAI** | `MODEL_PROVIDER=azure-openai` | GPT-4o, GPT-4o-mini, GPT-5, o1, o3 | Azure integration, Microsoft ecosystem |
+| **Azure Anthropic** | `MODEL_PROVIDER=azure-anthropic` | Claude Sonnet 4.5, Claude Opus 4.5 | Azure-hosted Claude models |
+| **OpenRouter** | `MODEL_PROVIDER=openrouter` | 100+ models (GPT-4o, Claude, Gemini, Llama, etc.) | Model flexibility, cost optimization |
+| **Ollama** (Local) | `MODEL_PROVIDER=ollama` | Llama 3.1, Qwen2.5, Mistral, CodeLlama | Local/offline use, privacy, no API costs |
+
+### **Recommended Models by Use Case**
+
+#### **For Production Code Assistance**
+- **Best**: Claude Sonnet 4.5 (via Databricks or Azure Anthropic)
+- **Alternative**: GPT-4o (via Azure OpenAI or OpenRouter)
+- **Budget**: GPT-4o-mini (via Azure OpenAI) or Claude Haiku (via OpenRouter)
+
+#### **For Code Generation**
+- **Best**: Claude Opus 4.5 (via Databricks or Azure Anthropic)
+- **Alternative**: GPT-4o (via Azure OpenAI)
+- **Local**: Qwen2.5-Coder 32B (via Ollama)
+
+#### **For Fast Exploration**
+- **Best**: Claude Haiku (via OpenRouter or Azure Anthropic)
+- **Alternative**: GPT-4o-mini (via Azure OpenAI)
+- **Local**: Llama 3.1 8B (via Ollama)
+
+#### **For Cost Optimization**
+- **Cheapest Cloud**: Amazon Nova models (via OpenRouter) - free tier available
+- **Cheapest Local**: Ollama (any model) - completely free, runs on your hardware
+
+### **Azure OpenAI Specific Models**
+
+When using `MODEL_PROVIDER=azure-openai`, you can deploy any of these models:
+
+| Model | Deployment Name | Capabilities | Best For |
+|-------|----------------|--------------|----------|
+| **GPT-4o** | `gpt-4o` | Text, vision, function calling | General-purpose, multimodal tasks |
+| **GPT-4o-mini** | `gpt-4o-mini` | Text, function calling | Fast responses, cost-effective |
+| **GPT-5** | `gpt-5-chat` or custom | Advanced reasoning, longer context | Complex problem-solving |
+| **o1-preview** | `o1-preview` | Deep reasoning, chain of thought | Mathematical, logic problems |
+| **o3-mini** | `o3-mini` | Efficient reasoning | Fast reasoning tasks |
+
+**Note**: Azure OpenAI deployment names are configurable via the `AZURE_OPENAI_DEPLOYMENT` environment variable.
+
+### **Ollama Model Recommendations**
+
+For tool calling support (required for Claude Code CLI functionality):
+
+✅ **Recommended**:
+- `llama3.1:8b` - Good balance of speed and capability
+- `llama3.2` - Latest Llama model
+- `qwen2.5:14b` - Strong reasoning (larger model needed; 7b struggles with tools)
+- `mistral:7b-instruct` - Fast and capable
+
+❌ **Not Recommended for Tools**:
+- `qwen2.5-coder` - Code-only, slow with tool calling
+- `codellama` - Code-only, poor tool support
+
+### **Hybrid Routing (Ollama + Cloud Fallback)**
+
+Lynkr supports intelligent hybrid routing for cost optimization:
+
+```bash
+# Use Ollama for simple tasks, fall back to cloud for complex ones
+PREFER_OLLAMA=true
+FALLBACK_ENABLED=true
+FALLBACK_PROVIDER=databricks  # or azure-openai, openrouter, azure-anthropic
+```
+
+**How it works**:
+- Requests with few/no tools → Ollama (free, local)
+- Requests with many tools → Cloud provider (more capable)
+- Ollama failures → Automatic fallback to cloud
+
+**Routing Logic**:
+- 0-2 tools: Ollama
+- 3-15 tools: OpenRouter or Azure OpenAI (if configured)
+- 16+ tools: Databricks or Azure Anthropic (most capable)
+
+### **Provider Comparison**
+
+| Feature | Databricks | Azure OpenAI | Azure Anthropic | OpenRouter | Ollama |
+|---------|-----------|--------------|-----------------|------------|--------|
+| **Setup Complexity** | Medium | Medium | Medium | Easy | Easy |
+| **Cost** | $$$ | $$ | $$$ | $ | Free |
+| **Latency** | Low | Low | Low | Medium | Very Low |
+| **Tool Calling** | Excellent | Excellent | Excellent | Good | Fair |
+| **Context Length** | 200K | 128K | 200K | Varies | 32K-128K |
+| **Streaming** | Yes | Yes | Yes | Yes | Yes |
+| **Privacy** | Enterprise | Enterprise | Enterprise | Third-party | Local |
+| **Offline** | No | No | No | No | Yes |
+
+---
+
 ## Core Capabilities

 ### Repo Intelligence & Navigation
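The tool-count routing heuristic described above can be sketched in plain JavaScript. This is an illustrative sketch only, not Lynkr's actual routing code: the function name, environment-variable handling, and the exact tie-breaking between mid-tier providers are assumptions; only the documented thresholds (0-2, 3-15, 16+) come from the README.

```javascript
// Hypothetical sketch of the documented tool-count routing heuristic.
// Thresholds mirror the README's "Routing Logic"; names are illustrative.
function pickProvider(request, env) {
  const toolCount = (request.tools || []).length;

  // 0-2 tools: prefer cheap local inference when Ollama is enabled.
  if (toolCount <= 2 && env.PREFER_OLLAMA === 'true') {
    return 'ollama';
  }

  // 3-15 tools: mid-tier cloud providers (assumed preference order).
  if (toolCount <= 15) {
    return env.OPENROUTER_API_KEY ? 'openrouter' : 'azure-openai';
  }

  // 16+ tools: most capable providers.
  return 'databricks';
}
```

Note that the real implementation also handles Ollama failures by falling back to `FALLBACK_PROVIDER`, which this sketch omits for brevity.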
@@ -96,7 +196,15 @@ Further documentation and usage notes are available on [DeepWiki](https://deepwi

 ### Execution & Tooling

--
+- **Flexible tool execution modes**: Configure where tools execute via `TOOL_EXECUTION_MODE`:
+  - `server` (default) – Tools run on the proxy server where Lynkr is hosted
+  - `client`/`passthrough` – Tools execute on the Claude Code CLI side, enabling local file operations and commands on the client machine
+- **Client-side tool execution** – When in passthrough mode, the proxy returns Anthropic-formatted `tool_use` blocks to the CLI, which executes them locally and sends back `tool_result` blocks. This enables:
+  - File operations on the CLI user's local filesystem
+  - Local command execution in the user's environment
+  - Access to local credentials and SSH keys
+  - Integration with local development tools
+- Tool execution pipeline sandboxes or runs tools in the host workspace based on policy (server mode).
 - MCP sandbox orchestration (Docker runtime by default) optionally isolates external tools with mount and permission controls.
 - Automated testing harness exposes `workspace_test_run`, `workspace_test_history`, and `workspace_test_summary`.
 - Prompt caching reduces repeated token usage for iterative conversations.
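The server/client split above can be illustrated with a minimal proxy-side dispatch sketch. This is a hypothetical illustration, not Lynkr's internal API: `handleToolCall` and its return shape are invented names; only the mode semantics (`server` executes locally on the proxy, `client`/`passthrough` hands a `tool_use` block back to the CLI) come from the documentation.

```javascript
// Hypothetical sketch of the proxy-side decision for a single tool call.
// serverTools maps tool names to handler functions registered on the proxy.
function handleToolCall(mode, toolCall, serverTools) {
  if (mode === 'client' || mode === 'passthrough') {
    // Passthrough: return an Anthropic tool_use block for the CLI to run locally.
    return {
      action: 'return_to_client',
      block: {
        type: 'tool_use',
        id: toolCall.id,
        name: toolCall.name,
        input: toolCall.input,
      },
    };
  }

  // Server mode: execute with a handler registered on the proxy host.
  const handler = serverTools[toolCall.name];
  if (!handler) {
    throw new Error(`No registered handler for tool: ${toolCall.name}`);
  }
  return { action: 'executed', result: handler(toolCall.input) };
}
```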
@@ -213,43 +321,7 @@ Lynkr includes comprehensive production-ready features designed for reliability,
 - Cost tracking and budget exhaustion handling
 - Request-level cost attribution

-### Performance Characteristics
-
-#### **Benchmark Results**
-Based on comprehensive performance testing with 100,000+ operations:
-
-| Component | Throughput | Latency | Overhead |
-|-----------|------------|---------|----------|
-| Baseline (no-op) | 21.3M ops/sec | 0.00005ms | - |
-| Metrics Collection | 4.7M ops/sec | 0.0002ms | 0.15ms |
-| Load Shedding Check | 7.6M ops/sec | 0.0001ms | 0.08ms |
-| Circuit Breaker | 4.3M ops/sec | 0.0002ms | 0.18ms |
-| Input Validation (simple) | 5.8M ops/sec | 0.0002ms | 0.12ms |
-| Input Validation (complex) | 890K ops/sec | 0.0011ms | 0.96ms |
-| Combined Middleware Stack | 140K ops/sec | 0.0071ms | 7.1μs |
-
-**Overall Performance Rating:** ⭐ **EXCELLENT**
-- Total middleware overhead: **7.1 microseconds** per request
-- Throughput: **140,000 requests/second**
-- Memory overhead: **~4MB** for typical workload
-
-#### **Production Deployment Metrics**
-- **Test Coverage:** 80 comprehensive tests with 100% pass rate
-- **Feature Completeness:** 14/14 production features implemented
-- **Zero-downtime Deployments:** Supported via graceful shutdown
-- **Horizontal Scaling:** Stateless design enables unlimited horizontal scaling
-- **Vertical Scaling:** Efficient resource usage supports high request volumes
-
-#### **Scalability Profile**
-- Single instance handles 140K req/sec under test conditions
-- Linear scaling with additional instances (no shared state)
-- Memory usage: ~100MB baseline + ~4MB per 10K active requests
-- CPU usage: <5% per core at moderate load
-- Network: Limited by backend API latency, not proxy overhead
-
-For detailed performance analysis, benchmarks, and deployment guidance, see [PERFORMANCE-REPORT.md](PERFORMANCE-REPORT.md).

----

 ## Architecture

@@ -593,6 +665,7 @@ See https://openrouter.ai/models for the complete list with pricing.
 | `PROMPT_CACHE_ENABLED` | Toggle the prompt cache system. | `true` |
 | `PROMPT_CACHE_TTL_MS` | Milliseconds before cached prompts expire. | `300000` (5 minutes) |
 | `PROMPT_CACHE_MAX_ENTRIES` | Maximum number of cached prompts retained. | `64` |
+| `TOOL_EXECUTION_MODE` | Controls where tools execute: `server` (default, tools run on proxy server), `client`/`passthrough` (tools execute on Claude Code CLI side). | `server` |
 | `POLICY_MAX_STEPS` | Max agent loop iterations before timeout. | `8` |
 | `POLICY_GIT_ALLOW_PUSH` | Allow/disallow `workspace_git_push`. | `false` |
 | `POLICY_GIT_REQUIRE_TESTS` | Enforce passing tests before `workspace_git_commit`. | `false` |
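Since `passthrough` is documented as an alias for `client`, configuration code can normalize the variable once at startup. The sketch below is an assumption about how such normalization might look, not Lynkr's actual config module:

```javascript
// Hypothetical normalization of TOOL_EXECUTION_MODE.
// 'passthrough' is a documented alias for 'client'; anything else is rejected.
function resolveToolExecutionMode(env = process.env) {
  const raw = (env.TOOL_EXECUTION_MODE || 'server').toLowerCase();
  if (raw === 'passthrough') return 'client';
  if (raw === 'client' || raw === 'server') return raw;
  throw new Error(`Unsupported TOOL_EXECUTION_MODE: ${raw}`);
}
```

Normalizing up front means the rest of the codebase only ever has to branch on two values, `server` and `client`.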
@@ -690,12 +763,6 @@ Lynkr works with any Ollama model. Popular choices:
 - **mistral:latest** – Fast, efficient model (7B parameters, 4.1GB)
 - **codellama:latest** – Meta's code-focused model (7B-34B variants)

-**Performance Characteristics:**
-
-- **Latency**: ~100-500ms first token (depending on model size and hardware)
-- **Throughput**: ~20-50 tokens/sec on M1/M2 Macs, ~10-30 tokens/sec on typical CPUs
-- **Memory**: 8GB RAM minimum recommended for 7B models, 16GB for 13B models
-- **Disk**: 4-10GB per model (quantized)

 **Ollama Health Check:**

@@ -716,7 +783,6 @@ Lynkr now supports **native tool calling** for compatible Ollama models:
 - ✅ **Format conversion**: Transparent conversion between Anthropic and Ollama tool formats
 - ❌ **Unsupported models**: llama3, older models (tools are filtered out automatically)

-See [OLLAMA-TOOL-CALLING.md](OLLAMA-TOOL-CALLING.md) for implementation details.

 **Limitations:**

@@ -885,8 +951,6 @@ npm start
 | **Cost per simple request** | $0.002-0.005 | $0.00 | 100% savings 💰 |
 | **Fallback latency** | N/A | <100ms | Transparent to user |

-See [HYBRID-ROUTING-ANALYSIS.md](HYBRID-ROUTING-ANALYSIS.md) for detailed performance analysis.
-
 ### Using Built-in Workspace Tools

 You can call tools programmatically via HTTP:
@@ -913,6 +977,69 @@ curl http://localhost:8080/v1/messages \

 Tool responses appear in the assistant content block with structured JSON.

+### Client-Side Tool Execution (Passthrough Mode)
+
+Lynkr supports **client-side tool execution**, where tools execute on the Claude Code CLI machine instead of the proxy server. This enables local file operations, commands, and access to local resources.
+
+**Enable client-side execution:**
+
+```bash
+# Set in .env or export before starting
+export TOOL_EXECUTION_MODE=client
+npm start
+```
+
+**How it works:**
+
+1. **Model generates tool calls** – Databricks/OpenRouter/Ollama model returns tool calls
+2. **Proxy converts to Anthropic format** – Tool calls converted to `tool_use` blocks
+3. **CLI executes tools locally** – Claude Code CLI receives `tool_use` blocks and runs them on the user's machine
+4. **CLI sends results back** – Tool results sent back to proxy in next request as `tool_result` blocks
+5. **Conversation continues** – Proxy forwards the complete conversation (including tool results) back to the model
+
+**Example response in passthrough mode:**
+
+```json
+{
+  "id": "msg_123",
+  "type": "message",
+  "role": "assistant",
+  "content": [
+    {
+      "type": "text",
+      "text": "I'll create that file for you."
+    },
+    {
+      "type": "tool_use",
+      "id": "toolu_abc",
+      "name": "Write",
+      "input": {
+        "file_path": "/tmp/test.txt",
+        "content": "Hello World"
+      }
+    }
+  ],
+  "stop_reason": "tool_use"
+}
+```
+
+**Benefits:**
+- ✅ Tools execute on CLI user's local filesystem
+- ✅ Access to local credentials, SSH keys, environment variables
+- ✅ Integration with local development tools (git, npm, docker, etc.)
+- ✅ Reduced network latency for file operations
+- ✅ Server doesn't need filesystem access or permissions
+
+**Use cases:**
+- Remote proxy server, local CLI execution
+- Multi-user environments where each user needs their own workspace
+- Security-sensitive environments where server shouldn't access user files
+
+**Supported modes:**
+- `TOOL_EXECUTION_MODE=server` – Tools run on proxy server (default)
+- `TOOL_EXECUTION_MODE=client` – Tools run on CLI side
+- `TOOL_EXECUTION_MODE=passthrough` – Alias for `client`
+
 ### Working with Prompt Caching

 - Set `PROMPT_CACHE_ENABLED=true` (default) to activate the cache.
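The five-step round trip above can be sketched from the client's point of view. This is a simplified illustration under stated assumptions: `callProxy` stands in for a POST to `/v1/messages`, `executeLocally` for the CLI's local tool runner, and the real Claude Code CLI handles this loop internally — only the `tool_use`/`tool_result` block shapes and `stop_reason` come from the documented protocol.

```javascript
// Hypothetical sketch of the passthrough round trip, seen from the client side.
// callProxy(messages) posts the conversation and resolves with the proxy reply;
// executeLocally(name, input) runs one tool on the user's machine.
async function runWithLocalTools(messages, callProxy, executeLocally) {
  for (;;) {
    const reply = await callProxy(messages);
    messages.push({ role: 'assistant', content: reply.content });

    if (reply.stop_reason !== 'tool_use') return reply; // conversation finished

    // Execute every tool_use block locally, then send tool_result blocks back.
    const results = reply.content
      .filter((block) => block.type === 'tool_use')
      .map((block) => ({
        type: 'tool_result',
        tool_use_id: block.id,
        content: executeLocally(block.name, block.input),
      }));
    messages.push({ role: 'user', content: results });
  }
}
```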
@@ -1205,11 +1332,21 @@ Replace `<workspace>` and `<endpoint-name>` with your Databricks workspace host
 - **Claude CLI prompts for missing tools** – Verify `tools` array in the client request lists the functions you expect. The proxy only exposes registered handlers.
 - **Dynamic finance pages return stale data** – `web_fetch` fetches static HTML only. Use an API endpoint (e.g. Yahoo Finance chart JSON) or the Databricks-hosted tooling if you need rendered values from heavily scripted pages.

+### OpenRouter Issues
+
+- **"No choices in OpenRouter response" errors** – OpenRouter sometimes returns error responses (rate limits, model unavailable) with JSON but no `choices` array. As of the latest update, Lynkr gracefully handles these errors and returns proper error responses instead of crashing. Check logs for "OpenRouter response missing choices array" warnings to see the full error details.
+- **Multi-prompt behavior with certain models** – Some OpenRouter models (particularly open-source models like `openai/gpt-oss-120b`) may be overly cautious and ask for confirmation multiple times before executing tools. This is model-specific behavior. Consider switching to:
+  - `anthropic/claude-3.5-sonnet` – More decisive tool execution
+  - `openai/gpt-4o` or `openai/gpt-4o-mini` – Better tool calling behavior
+  - Use Databricks provider with Claude models for optimal tool execution
+- **Rate limit errors** – OpenRouter applies per-model rate limits. If you hit limits frequently, check your OpenRouter dashboard for current usage and consider upgrading your plan or spreading requests across multiple models.
+
 ### Production Hardening Issues

 - **503 Service Unavailable errors during normal load** – Check load shedding thresholds (`LOAD_SHEDDING_*`). Lower values may trigger too aggressively. Check `/metrics/observability` for memory usage patterns.
 - **Circuit breaker stuck in OPEN state** – Check `/metrics/circuit-breakers` to see failure counts. Verify backend service (Databricks/Azure) is accessible. Circuit will automatically attempt recovery after `CIRCUIT_BREAKER_TIMEOUT` (default: 60s).
 - **"Circuit breaker is OPEN" errors** – The circuit breaker detected too many failures and is protecting against cascading failures. Wait for timeout or fix the underlying issue. Check logs for root cause of failures.
+  - **Azure OpenAI specific**: If using Azure OpenAI and seeing circuit breaker errors, verify your `AZURE_OPENAI_ENDPOINT` includes the full path (including `/openai/deployments/YOUR-DEPLOYMENT/chat/completions`). Missing endpoint variable or undefined returns can trigger circuit breaker protection.
 - **High latency after adding production features** – This is unexpected; middleware adds only ~7μs overhead. Check `/metrics/prometheus` for actual latency distribution. Verify network latency to backend services.
 - **Health check endpoint returns 503 but service seems healthy** – Check individual health check components in the response JSON. Database connectivity or memory issues may trigger this. Review logs for specific health check failures.
 - **Metrics endpoint shows incorrect data** – Metrics are in-memory and reset on restart. For persistent metrics, configure Prometheus scraping. Check that `METRICS_ENABLED=true`.
@@ -1245,9 +1382,9 @@ If performance is degraded:

 ## Roadmap & Known Gaps

-### ✅ Recently Completed
+### ✅ Recently Completed

-All 14
+**Production Hardening (All 14 features implemented with 100% pass rate):**
 - ✅ Exponential backoff with jitter retry logic
 - ✅ Circuit breaker pattern for external services
 - ✅ Load shedding with resource monitoring
@@ -1263,7 +1400,11 @@ All 14 production hardening features have been implemented and tested with 100%
 - ✅ Rate limiting capabilities
 - ✅ Safe command DSL

-
+
+**Latest Features (December 2025):**
+- ✅ **Client-side tool execution** (`TOOL_EXECUTION_MODE=client/passthrough`) – Tools can now execute on the Claude Code CLI side instead of the server, enabling local file operations, local commands, and access to local credentials
+- ✅ **OpenRouter error resilience** – Graceful handling of malformed OpenRouter responses (missing `choices` array), preventing crashes during rate limits or service errors
+- ✅ **Enhanced format conversion** – Improved Anthropic ↔ OpenRouter format conversion for tool calls, ensuring proper `tool_use` block generation and session consistency across providers

 ### 🔮 Future Enhancements

@@ -1380,32 +1521,35 @@ A: Lynkr collects request counts, error rates, latency percentiles (p50, p95, p9
 - `/metrics/circuit-breakers` - Circuit breaker state

 **Q: Is Lynkr production-ready?**
-A: Yes. Excellent performance
+A: Yes. With excellent performance and comprehensive observability, Lynkr is designed for production deployments. It supports:
 - Zero-downtime deployments (graceful shutdown)
 - Kubernetes integration (health checks, metrics)
 - Horizontal scaling (stateless design)
 - Enterprise monitoring (Prometheus, Grafana)

-**Q: What's the performance impact of production features?**
-A: Minimal. Comprehensive benchmarking shows:
-- Total middleware overhead: 7.1 microseconds per request
-- Throughput: 140,000 requests/second
-- Memory overhead: ~4MB for typical workload

-This is considered "EXCELLENT" performance - the overhead is negligible compared to network and API latency.

 **Q: How do I deploy Lynkr to Kubernetes?**
 A: Use the included Kubernetes configurations and Docker support. Key steps:
 1. Build Docker image: `docker build -t lynkr .`
 2. Configure environment variables in Kubernetes secrets
-3.
-4.
-5. Set up Grafana dashboards for visualization
+3. Configure Prometheus scraping for metrics
+4. Set up Grafana dashboards for visualization

 The graceful shutdown and health check endpoints ensure zero-downtime deployments.

 ---

+## References
+
+Lynkr's design also incorporates the ACE Framework, informed by research in agentic AI systems and context engineering:
+
+- **Zhang et al. (2024)**. *Agentic Context Engineering*. arXiv:2510.04618. [arXiv](https://arxiv.org/abs/2510.04618)
+
+For BibTeX citations, see [CITATIONS.bib](CITATIONS.bib).
+
+---
+
 ## License

 MIT License. See [LICENSE](LICENSE) for details.
package/docs/index.md
CHANGED
@@ -73,10 +73,13 @@ Commit, push, diff, stage, generate release notes, etc.
 ### ✔ Prompt Caching (LRU + TTL)
 Reuses identical prompts to reduce cost + latency.

-### ✔ Workspace Tools
+### ✔ Workspace Tools
 Task tracker, file I/O, test runner, index rebuild, etc.

-### ✔
+### ✔ Client-Side Tool Execution (Passthrough Mode)
+Tools can execute on the Claude Code CLI side instead of the server, enabling local file operations and commands.
+
+### ✔ Fully extensible Node.js architecture
 Add custom tools, policies, or backend adapters.

 ---
@@ -92,6 +95,7 @@ Add custom tools, policies, or backend adapters.
 - [Prompt Caching](#-prompt-caching)
 - [MCP (Model Context Protocol) Integration](#-model-context-protocol-mcp)
 - [Git Tools](#-git-tools)
+- [Client-Side Tool Execution (Passthrough Mode)](#-client-side-tool-execution-passthrough-mode)
 - [API Examples](#-api-examples)
 - [Roadmap](#-roadmap)
 - [Links](#-links)
@@ -363,6 +367,47 @@ Example:

 ---

+# 🔄 Client-Side Tool Execution (Passthrough Mode)
+
+Lynkr supports **client-side tool execution**, enabling tools to execute on the Claude Code CLI machine instead of the proxy server.
+
+**Enable passthrough mode:**
+
+```bash
+export TOOL_EXECUTION_MODE=client
+npm start
+```
+
+**How it works:**
+
+1. Model generates tool calls (from Databricks/OpenRouter/Ollama)
+2. Proxy converts to Anthropic format with `tool_use` blocks
+3. Claude Code CLI receives `tool_use` blocks and executes locally
+4. CLI sends `tool_result` blocks back in the next request
+5. Proxy forwards complete conversation back to the model
+
+**Benefits:**
+
+* ✅ Local filesystem access on CLI user's machine
+* ✅ Local credentials, SSH keys, environment variables
+* ✅ Integration with local dev tools (git, npm, docker)
+* ✅ Reduced network latency for file operations
+* ✅ Server doesn't need filesystem permissions
+
+**Use cases:**
+
+* Remote proxy server with local CLI execution
+* Multi-user environments where each needs their own workspace
+* Security-sensitive setups where server shouldn't access user files
+
+**Configuration:**
+
+* `TOOL_EXECUTION_MODE=server` – Tools run on proxy (default)
+* `TOOL_EXECUTION_MODE=client` – Tools run on CLI side
+* `TOOL_EXECUTION_MODE=passthrough` – Alias for client mode
+
+---
+
 # 🧪 API Example (Index Rebuild)

 ```bash
@@ -382,6 +427,14 @@ curl http://localhost:8080/v1/messages \

 # 🛣 Roadmap

+## ✅ Recently Completed (December 2025)
+
+* **Client-side tool execution** (`TOOL_EXECUTION_MODE=client/passthrough`) – Tools can execute on the Claude Code CLI side, enabling local file operations, commands, and access to local credentials
+* **OpenRouter error resilience** – Graceful handling of malformed OpenRouter responses, preventing crashes during rate limits or service errors
+* **Enhanced format conversion** – Improved Anthropic ↔ OpenRouter format conversion for tool calls with proper `tool_use` block generation
+
+## 🔮 Future Features
+
 * LSP integration (TypeScript, Python, more languages)
 * Per-file diff comments
 * Risk scoring for Git diffs
package/monitor-agents.sh
ADDED
@@ -0,0 +1,31 @@
+#!/bin/bash
+
+# Monitor agent activity in real-time
+
+echo "🔍 Monitoring Agent Activity"
+echo "=============================="
+echo ""
+
+while true; do
+  clear
+  echo "🔍 Agent Statistics (refreshing every 3s)"
+  echo "=========================================="
+  echo ""
+
+  # Get stats
+  curl -s http://localhost:8080/v1/agents/stats | jq -r '.stats[] |
+    "Agent: \(.agent_type)
+  Executions: \(.total_executions) (\(.completed) completed, \(.failed) failed)
+  Avg Duration: \(.avg_duration_ms)ms
+  Tokens: \(.total_input_tokens) in / \(.total_output_tokens) out
+"' || echo "Proxy not responding..."
+
+  echo ""
+  echo "Latest transcripts:"
+  ls -lt data/agent-transcripts/*.jsonl 2>/dev/null | head -3 || echo "No transcripts yet"
+
+  echo ""
+  echo "Press Ctrl+C to stop monitoring"
+
+  sleep 3
+done
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "lynkr",
|
|
3
|
-
"version": "
|
|
4
|
-
"description": "Self-hosted Claude Code proxy with Databricks
|
|
3
|
+
"version": "2.0.0",
|
|
4
|
+
"description": "Self-hosted Claude Code proxy with Databricks,Azure adapters, workspace tooling, and MCP integration.",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"bin": {
|
|
7
7
|
"lynkr": "./bin/cli.js",
|
|
@@ -12,7 +12,8 @@
|
|
|
12
12
|
"dev": "nodemon index.js",
|
|
13
13
|
"lint": "eslint src index.js",
|
|
14
14
|
"test": "npm run test:unit && npm run test:performance",
|
|
15
|
-
"test:unit": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node --test test/routing.test.js test/hybrid-routing-integration.test.js test/web-tools.test.js",
|
|
15
|
+
"test:unit": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node --test test/routing.test.js test/hybrid-routing-integration.test.js test/web-tools.test.js test/passthrough-mode.test.js test/openrouter-error-resilience.test.js test/format-conversion.test.js test/azure-openai-config.test.js test/azure-openai-format-conversion.test.js test/azure-openai-routing.test.js test/azure-openai-streaming.test.js test/azure-openai-error-resilience.test.js test/azure-openai-integration.test.js",
|
|
16
|
+
"test:new-features": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node --test test/passthrough-mode.test.js test/openrouter-error-resilience.test.js test/format-conversion.test.js",
|
|
16
17
|
"test:performance": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node test/hybrid-routing-performance.test.js && DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node test/performance-tests.js",
|
|
17
18
|
"test:benchmark": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node test/performance-benchmark.js",
|
|
18
19
|
"test:quick": "DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com node --test test/routing.test.js",
|
|
@@ -40,6 +41,7 @@
|
|
|
40
41
|
"node": ">=20.0.0"
|
|
41
42
|
},
|
|
42
43
|
"dependencies": {
|
|
44
|
+
"@azure/openai": "^2.0.0",
|
|
43
45
|
"better-sqlite3": "^9.4.0",
|
|
44
46
|
"compression": "^1.7.4",
|
|
45
47
|
"diff": "^5.2.0",
|
|
@@ -47,6 +49,8 @@
|
|
|
47
49
|
"express": "^5.1.0",
|
|
48
50
|
"express-rate-limit": "^8.2.1",
|
|
49
51
|
"fast-glob": "^3.3.2",
|
|
52
|
+
"js-yaml": "^4.1.1",
|
|
53
|
+
"openai": "^6.14.0",
|
|
50
54
|
"pino": "^8.17.2",
|
|
51
55
|
"pino-http": "^8.6.0",
|
|
52
56
|
"tree-sitter": "^0.20.1",
|