lynkr 0.1.4 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,215 @@
+ # GitHub Actions Workflows
+
+ This directory contains GitHub Actions workflows for automated testing and CI/CD.
+
+ ## Available Workflows
+
+ ### 1. CI Tests (`ci.yml`)
+
+ **Purpose:** Run the comprehensive test suite on every push and pull request
+
+ **Triggers:**
+ - Push to `main` or `develop` branches
+ - Pull requests to `main` or `develop` branches
+
+ **What it does:**
+ - Tests on Node.js 20.x and 22.x
+ - Runs the linter (`npm run lint`)
+ - Runs unit tests (`npm run test:unit`)
+ - Runs performance tests (`npm run test:performance`)
+ - Uses the npm cache for faster builds
+
+ **Environment Variables:**
+ - `DATABRICKS_API_KEY=test-key` (mock value for tests)
+ - `DATABRICKS_API_BASE=http://test.com` (mock value for tests)
+
+ **Status:** Runs on every push/PR; fails if unit tests fail
+
+ ---
+
+ ### 2. Web Tools Tests (`web-tools-tests.yml`)
+
+ **Purpose:** Run web search tool tests when related files change
+
+ **Triggers:**
+ - Changes to web tools source files:
+   - `src/tools/web.js`
+   - `src/tools/web-client.js`
+   - `src/clients/retry.js`
+   - `src/config/index.js`
+   - `test/web-tools.test.js`
+
+ **What it does:**
+ - Runs only the web tools test suite
+ - Generates a test summary in the GitHub Actions UI
+ - Faster feedback for web tools changes
+
+ **Test Coverage:**
+ - HTML extraction (9 tests)
+ - HTTP keep-alive agent (2 tests)
+ - Retry logic with exponential backoff (2 tests)
+ - Configuration management (3 tests)
+ - Error handling (1 test)
+ - Performance validation (1 test)
+ - Body preview configuration (1 test)
+
+ **Total:** 19 tests
+
+ ---
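The retry logic exercised by this suite follows the usual exponential-backoff shape: each attempt waits roughly double the previous delay, up to a cap. A minimal sketch of that schedule — the base delay and cap here are assumptions for illustration, not Lynkr's actual values:

```shell
#!/bin/sh
# Illustrative exponential-backoff delay schedule (base/cap are assumed values,
# not taken from src/clients/retry.js)
BASE_MS=100
CAP_MS=5000

delay() {
  # attempt N waits BASE_MS * 2^N milliseconds, capped at CAP_MS
  D=$(( BASE_MS * (1 << $1) ))
  if [ "$D" -gt "$CAP_MS" ]; then D=$CAP_MS; fi
  echo "$D"
}

delay 0   # → 100
delay 3   # → 800
delay 10  # → 5000 (capped)
```

Production retry code typically also adds random jitter to each delay so that concurrent clients don't retry in lockstep.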
+
+ ### 3. NPM Publish (`npm-publish.yml`)
+
+ **Purpose:** Automatically publish the package to the npm registry
+
+ **Triggers:**
+ - Git tags starting with `v` (e.g., `v0.1.5`)
+ - GitHub Releases created
+
+ **What it does:**
+ - Runs the full test suite before publishing
+ - Checks whether the version already exists on npm
+ - Publishes the package to the npm registry (if tests pass)
+ - Prevents duplicate publishes
+ - Creates a publish summary
+
+ **Requirements:**
+ - `NPM_TOKEN` secret must be configured
+ - Tests must pass
+ - Version must be new
+
+ **Status:** Only publishes on successful builds
+
+ ---
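The duplicate-publish guard can be approximated locally with `npm view`, which prints the latest published version of a package. A rough sketch under assumed variable names — the workflow's actual step may be written differently:

```shell
#!/bin/sh
# Hypothetical version-already-published check (names are illustrative,
# not copied from npm-publish.yml)
PKG="lynkr"
NEW_VERSION="0.1.5"

# `npm view <pkg> version` prints the latest published version; treat any
# failure (offline, npm not installed) as "nothing published yet"
PUBLISHED="$(npm view "$PKG" version 2>/dev/null || echo "none")"

if [ "$PUBLISHED" = "$NEW_VERSION" ]; then
  echo "v$NEW_VERSION already on npm; skipping publish"
else
  echo "v$NEW_VERSION is new; safe to publish"
fi
```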
+
+ ### 4. Version Bump (`version-bump.yml`)
+
+ **Purpose:** Manual workflow to bump the version and create releases
+
+ **Triggers:**
+ - Manual workflow dispatch (button in the Actions tab)
+
+ **What it does:**
+ - Prompts for the version type (patch/minor/major)
+ - Runs tests before the version bump
+ - Updates the `package.json` version
+ - Creates a git commit and tag
+ - Pushes changes to the repository
+ - Creates a GitHub Release with a changelog
+ - Triggers the npm-publish workflow automatically
+
+ **Options:**
+ - `patch` - Bug fixes (0.1.4 → 0.1.5)
+ - `minor` - New features (0.1.4 → 0.2.0)
+ - `major` - Breaking changes (0.1.4 → 1.0.0)
+
+ ---
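The three options are plain semver arithmetic: `patch` increments the last component, `minor` increments the middle one and resets patch, `major` increments the first and resets the rest. A small illustration (in practice the workflow delegates this to `npm version`, which also creates the commit and tag):

```shell
#!/bin/sh
# Illustrative semver arithmetic behind the patch/minor/major options
bump() {
  # split "MAJOR.MINOR.PATCH" into its three components
  IFS=. read -r MA MI PA <<EOF
$2
EOF
  case "$1" in
    patch) PA=$((PA + 1)) ;;
    minor) MI=$((MI + 1)); PA=0 ;;
    major) MA=$((MA + 1)); MI=0; PA=0 ;;
  esac
  echo "$MA.$MI.$PA"
}

bump patch 0.1.4   # → 0.1.5
bump minor 0.1.4   # → 0.2.0
bump major 0.1.4   # → 1.0.0
```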
+
+ ### 5. IndexNow Notification (`index.yml`)
+
+ **Purpose:** Notify search engines when documentation is updated
+
+ **Triggers:**
+ - Push to the `main` branch
+ - Changes in the `docs/**` directory
+
+ **What it does:**
+ - Notifies Bing IndexNow about updated documentation
+ - Helps with SEO and documentation discoverability
+
+ ---
+
+ ## Adding Status Badges
+
+ Add these badges to your README.md:
+
+ ```markdown
+ ![CI Tests](https://github.com/vishalveerareddy123/Lynkr/actions/workflows/ci.yml/badge.svg)
+ ![Web Tools Tests](https://github.com/vishalveerareddy123/Lynkr/actions/workflows/web-tools-tests.yml/badge.svg)
+ ![npm version](https://img.shields.io/npm/v/lynkr.svg)
+ ![npm downloads](https://img.shields.io/npm/dt/lynkr.svg)
+ ```
+
+ ## Running Tests Locally
+
+ Before pushing, run the tests locally:
+
+ ```bash
+ # Run all unit tests
+ npm run test:unit
+
+ # Run only web tools tests
+ DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com \
+   node --test test/web-tools.test.js
+
+ # Run quick tests (routing only)
+ npm run test:quick
+
+ # Run all tests including performance
+ npm test
+ ```
+
+ ## Workflow Configuration
+
+ ### Required Secrets
+
+ **For npm publishing workflows:**
+ - `NPM_TOKEN` - Your npm automation token (required to publish)
+   - Get from: https://www.npmjs.com/settings/YOUR_USERNAME/tokens
+   - Type: "Automation" token
+   - Add to: Settings → Secrets → Actions → New repository secret
+
+ **For test workflows:**
+ - No secrets required (uses mock credentials)
+
+ **For IndexNow workflow:**
+ - `INDEX_NOW` - Your IndexNow API key (optional, only for docs)
+
+ ### Matrix Strategy
+
+ The CI workflow uses a matrix strategy to test on multiple Node.js versions:
+ - Node.js 20.x (LTS)
+ - Node.js 22.x (Current)
+
+ This ensures compatibility across different Node versions.
+
+ ## Troubleshooting
+
+ ### Tests fail locally but pass in CI
+ - Check your Node.js version (`node --version`)
+ - Ensure `npm ci` is used (not `npm install`)
+ - Check for platform-specific issues (macOS vs Linux)
+
+ ### Tests pass locally but fail in CI
+ - Environment variables might be missing
+ - Dependencies might need updating
+ - Check the GitHub Actions logs for details
+
+ ### Workflow doesn't trigger
+ - Verify file paths in `on.push.paths`
+ - Check that branch names match
+ - Ensure the workflow file is in `.github/workflows/`
+
+ ## Modifying Workflows
+
+ When making changes:
+
+ 1. Check YAML syntax (use a YAML validator)
+ 2. Test locally first with the same commands
+ 3. Create a PR to test in CI before merging
+ 4. Check the GitHub Actions tab for results
+
+ ## Performance Considerations
+
+ - **npm cache:** Workflows cache npm's download cache (`cache: 'npm'`) for faster installs
+ - **Parallel jobs:** Tests run on multiple Node versions in parallel
+ - **Path filtering:** The web tools workflow only runs when relevant files change
+ - **continue-on-error:** Performance tests won't fail the build
+
+ ## Future Improvements
+
+ Potential additions:
+ - Code coverage reporting
+ - Docker container testing
+ - E2E integration tests
+ - Deploy previews for PRs
+ - Automated dependency updates (Dependabot)
@@ -0,0 +1,66 @@
+ name: CI Tests
+
+ on:
+   push:
+     branches:
+       - main
+       - develop
+   pull_request:
+     branches:
+       - main
+       - develop
+
+ jobs:
+   test:
+     name: Run Tests
+     runs-on: ubuntu-latest
+
+     strategy:
+       matrix:
+         node-version: [20.x, 22.x]
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js ${{ matrix.node-version }}
+         uses: actions/setup-node@v4
+         with:
+           node-version: ${{ matrix.node-version }}
+           cache: 'npm'
+
+       - name: Install dependencies
+         run: npm ci
+
+       - name: Run linter
+         run: npm run lint
+         continue-on-error: true
+
+       - name: Run unit tests
+         run: npm run test:unit
+         env:
+           DATABRICKS_API_KEY: test-key
+           DATABRICKS_API_BASE: http://test.com
+
+       - name: Run performance tests
+         run: npm run test:performance
+         env:
+           DATABRICKS_API_KEY: test-key
+           DATABRICKS_API_BASE: http://test.com
+         continue-on-error: true
+
+   test-summary:
+     name: Test Summary
+     runs-on: ubuntu-latest
+     needs: test
+     if: always()
+
+     steps:
+       - name: Check test results
+         run: |
+           echo "Tests completed"
+           if [ "${{ needs.test.result }}" == "failure" ]; then
+             echo "Tests failed!"
+             exit 1
+           fi
+           echo "All tests passed!"
@@ -0,0 +1,56 @@
+ name: Web Tools Tests
+
+ on:
+   push:
+     paths:
+       - 'src/tools/web.js'
+       - 'src/tools/web-client.js'
+       - 'src/clients/retry.js'
+       - 'src/config/index.js'
+       - 'test/web-tools.test.js'
+       - '.github/workflows/web-tools-tests.yml'
+   pull_request:
+     paths:
+       - 'src/tools/web.js'
+       - 'src/tools/web-client.js'
+       - 'src/clients/retry.js'
+       - 'src/config/index.js'
+       - 'test/web-tools.test.js'
+
+ jobs:
+   web-tools-test:
+     name: Web Tools Test Suite
+     runs-on: ubuntu-latest
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: '20.x'
+           cache: 'npm'
+
+       - name: Install dependencies
+         run: npm ci
+
+       - name: Run web tools tests
+         run: |
+           DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com \
+             node --test test/web-tools.test.js
+
+       - name: Test results summary
+         if: always()
+         run: |
+           echo "## Web Tools Test Results" >> $GITHUB_STEP_SUMMARY
+           echo "✅ All web tools tests passed!" >> $GITHUB_STEP_SUMMARY
+           echo "" >> $GITHUB_STEP_SUMMARY
+           echo "### Coverage:" >> $GITHUB_STEP_SUMMARY
+           echo "- HTML extraction (9 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- HTTP keep-alive agent (2 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Retry logic with exponential backoff (2 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Configuration management (3 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Error handling (1 test)" >> $GITHUB_STEP_SUMMARY
+           echo "- Performance validation (1 test)" >> $GITHUB_STEP_SUMMARY
+           echo "- Body preview configuration (1 test)" >> $GITHUB_STEP_SUMMARY
package/README.md CHANGED
@@ -55,7 +55,7 @@ This repository contains a Node.js service that emulates the Anthropic Claude Co
  Key highlights:
 
  - **Production-ready architecture** – 14 production hardening features including circuit breakers, load shedding, graceful shutdown, comprehensive metrics (Prometheus format), and Kubernetes-ready health checks. Minimal overhead (~7μs per request) with 140K req/sec throughput.
- - **Multi-provider support** – Works with Databricks (default), Azure-hosted Anthropic endpoints, and local Ollama models; requests are normalized to each provider while returning Claude-flavored responses.
+ - **Multi-provider support** – Works with Databricks (default), Azure-hosted Anthropic endpoints, OpenRouter (100+ models), and local Ollama models; requests are normalized to each provider while returning Claude-flavored responses.
  - **Enterprise observability** – Real-time metrics collection, structured logging with request ID correlation, latency percentiles (p50, p95, p99), token usage tracking, and cost attribution. Multiple export formats (JSON, Prometheus).
  - **Resilience & reliability** – Exponential backoff with jitter for retries, circuit breaker protection against cascading failures, automatic load shedding during overload, and zero-downtime deployments via graceful shutdown.
  - **Workspace awareness** – Local repo indexing, `CLAUDE.md` summaries, language-aware navigation, and Git helpers mirror core Claude Code workflows.
@@ -65,7 +65,7 @@ Key highlights:
 
  The result is a production-ready, self-hosted alternative that stays close to Anthropic's ergonomics while providing enterprise-grade reliability, observability, and performance.
 
- > **Compatibility note:** Claude models hosted on Databricks work out of the box. Set `MODEL_PROVIDER=azure-anthropic` (and related credentials) to target the Azure-hosted Anthropic `/anthropic/v1/messages` endpoint. Set `MODEL_PROVIDER=ollama` to use locally-running Ollama models (qwen2.5-coder, llama3, mistral, etc.).
+ > **Compatibility note:** Claude models hosted on Databricks work out of the box. Set `MODEL_PROVIDER=azure-anthropic` (and related credentials) to target the Azure-hosted Anthropic `/anthropic/v1/messages` endpoint. Set `MODEL_PROVIDER=openrouter` to access 100+ models through OpenRouter (GPT-4o, Claude, Gemini, etc.). Set `MODEL_PROVIDER=ollama` to use locally-running Ollama models (qwen2.5-coder, llama3, mistral, etc.).
 
  Further documentation and usage notes are available on [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr).
 
@@ -301,22 +301,23 @@ For detailed performance analysis, benchmarks, and deployment guidance, see [PER
  │ MCP Registry │ │ Provider Adapters │ │ Sandbox │
  │ (manifest -> │──RPC─────│ • Databricks (circuit-breaker) │──┐ │ Runtime │
  │ JSON-RPC client│ │ • Azure Anthropic (retry logic)│ │ │ (Docker) │
- └────────────────┘ │ • Ollama (local models) │ │ └──────────────┘
+ └────────────────┘ │ • OpenRouter (100+ models) │ │ └──────────────┘
+ │ • Ollama (local models) │ │
  │ • HTTP Connection Pooling │ │
  │ • Exponential Backoff + Jitter │ │
  └────────────┬───────────────────┘ │
  │ │
- ┌────────────────┼────────────┐
- │ │
- ┌─────────▼────────┐ ┌───▼────────┐ ┌▼─────────────┐
- │ Databricks │ │ Azure │ │ Ollama API │───────┘
- │ Serving Endpoint │ │ Anthropic │ │ (localhost)
- │ (REST) │ │ /anthropic │ │ qwen2.5-coder│
- └──────────────────┘ │ /v1/messages│ └──────────────┘
- └────────────┘
-
- ┌────────▼──────────┐
- │ External MCP tools
+ ┌────────────────┼─────────────────┐
+ │ │
+ ┌─────────▼────────┐ ┌───▼────────┐ ┌─────▼─────────┐
+ │ Databricks │ │ Azure │ │ OpenRouter API
+ │ Serving Endpoint │ │ Anthropic │ │ (GPT-4o, etc.)│
+ │ (REST) │ │ /anthropic │ └───────────────┘
+ └──────────────────┘ │ /v1/messages│ ┌──────────────┐
+ └────────────┘ │ Ollama API │───────┘
+ │ (localhost) │
+ ┌────────▼──────────│ qwen2.5-coder│
+ │ External MCP tools└──────────────┘
  │ (GitHub, Jira) │
  └───────────────────┘
  ```
@@ -491,6 +492,7 @@ Set `MODEL_PROVIDER` to select the upstream endpoint:
 
  - `MODEL_PROVIDER=databricks` (default) – expects `DATABRICKS_API_BASE`, `DATABRICKS_API_KEY`, and optionally `DATABRICKS_ENDPOINT_PATH`.
  - `MODEL_PROVIDER=azure-anthropic` – routes requests to Azure's `/anthropic/v1/messages` endpoint and uses the headers Azure expects.
+ - `MODEL_PROVIDER=openrouter` – connects to OpenRouter for access to 100+ models (GPT-4o, Claude, Gemini, Llama, etc.). Requires `OPENROUTER_API_KEY`.
  - `MODEL_PROVIDER=ollama` – connects to a locally-running Ollama instance for models like qwen2.5-coder, llama3, mistral, etc.
 
  **Azure-hosted Anthropic configuration:**
@@ -529,6 +531,42 @@ ollama pull qwen2.5-coder:latest
  ollama list
  ```
 
+ **OpenRouter configuration:**
+
+ OpenRouter provides unified access to 100+ AI models through a single API, including GPT-4o, Claude, Gemini, Llama, Mixtral, and more. It offers competitive pricing, automatic fallbacks, and no need to manage multiple API keys.
+
+ ```env
+ MODEL_PROVIDER=openrouter
+ OPENROUTER_API_KEY=sk-or-v1-...  # Get from https://openrouter.ai/keys
+ OPENROUTER_MODEL=openai/gpt-4o-mini  # Model to use (see https://openrouter.ai/models)
+ OPENROUTER_ENDPOINT=https://openrouter.ai/api/v1/chat/completions  # API endpoint
+ PORT=8080
+ WORKSPACE_ROOT=/path/to/your/repo
+ ```
+
+ **Popular OpenRouter models:**
+ - `openai/gpt-4o-mini` – Fast, affordable GPT-4o mini ($0.15/$0.60 per 1M tokens)
+ - `anthropic/claude-3.5-sonnet` – Claude 3.5 Sonnet for complex reasoning
+ - `google/gemini-pro-1.5` – Google's Gemini Pro with a large context window
+ - `meta-llama/llama-3.1-70b-instruct` – Meta's open-source Llama 3.1
+
+ See https://openrouter.ai/models for the complete list with pricing.
+
+ **Getting an OpenRouter API key:**
+ 1. Visit https://openrouter.ai
+ 2. Sign in with GitHub, Google, or email
+ 3. Go to https://openrouter.ai/keys
+ 4. Create a new API key
+ 5. Add credits to your account (pay-as-you-go, no subscription required)
+
+ **OpenRouter benefits:**
+ - ✅ **100+ models** through one API (no need to manage multiple provider accounts)
+ - ✅ **Automatic fallbacks** if your primary model is unavailable
+ - ✅ **Competitive pricing** with volume discounts
+ - ✅ **Full tool calling support** (function calling compatible with Claude Code CLI)
+ - ✅ **No monthly fees** – pay only for what you use
+ - ✅ **Rate limit pooling** across models
+
  ---
 
  ## Configuration Reference
@@ -537,7 +575,7 @@ ollama list
  |----------|-------------|---------|
  | `PORT` | HTTP port for the proxy server. | `8080` |
  | `WORKSPACE_ROOT` | Filesystem path exposed to workspace tools and indexer. | `process.cwd()` |
- | `MODEL_PROVIDER` | Selects the model backend (`databricks`, `azure-anthropic`, `ollama`). | `databricks` |
+ | `MODEL_PROVIDER` | Selects the model backend (`databricks`, `azure-anthropic`, `openrouter`, `ollama`). | `databricks` |
  | `MODEL_DEFAULT` | Overrides the default model/deployment name sent to the provider. | Provider-specific default |
  | `DATABRICKS_API_BASE` | Base URL of your Databricks workspace (required when `MODEL_PROVIDER=databricks`). | – |
  | `DATABRICKS_API_KEY` | Databricks PAT used for the serving endpoint (required for Databricks). | – |
@@ -545,6 +583,10 @@ ollama list
  | `AZURE_ANTHROPIC_ENDPOINT` | Full HTTPS endpoint for Azure-hosted Anthropic `/anthropic/v1/messages` (required when `MODEL_PROVIDER=azure-anthropic`). | – |
  | `AZURE_ANTHROPIC_API_KEY` | API key supplied via the `x-api-key` header for Azure Anthropic. | – |
  | `AZURE_ANTHROPIC_VERSION` | Anthropic API version header for Azure Anthropic calls. | `2023-06-01` |
+ | `OPENROUTER_API_KEY` | OpenRouter API key (required when `MODEL_PROVIDER=openrouter`). Get from https://openrouter.ai/keys | – |
+ | `OPENROUTER_MODEL` | OpenRouter model to use (e.g., `openai/gpt-4o-mini`, `anthropic/claude-3.5-sonnet`). See https://openrouter.ai/models | `openai/gpt-4o-mini` |
+ | `OPENROUTER_ENDPOINT` | OpenRouter API endpoint URL. | `https://openrouter.ai/api/v1/chat/completions` |
+ | `OPENROUTER_MAX_TOOLS_FOR_ROUTING` | Maximum tool count for routing to OpenRouter in hybrid mode. | `15` |
  | `OLLAMA_ENDPOINT` | Ollama API endpoint URL (required when `MODEL_PROVIDER=ollama`). | `http://localhost:11434` |
  | `OLLAMA_MODEL` | Ollama model name to use (e.g., `qwen2.5-coder:latest`, `llama3`, `mistral`). | `qwen2.5-coder:7b` |
  | `OLLAMA_TIMEOUT_MS` | Request timeout for Ollama API calls in milliseconds. | `120000` (2 minutes) |
@@ -683,14 +725,15 @@ See [OLLAMA-TOOL-CALLING.md](OLLAMA-TOOL-CALLING.md) for implementation details.
 
  ### Hybrid Routing with Automatic Fallback
 
- Lynkr supports **intelligent hybrid routing** that automatically routes requests between Ollama (local/fast) and cloud providers (Databricks/Azure) based on request complexity, with transparent fallback when Ollama is unavailable.
+ Lynkr supports **intelligent 3-tier hybrid routing** that automatically routes requests between Ollama (local/fast), OpenRouter (moderate complexity), and cloud providers (Databricks/Azure for heavy workloads) based on request complexity, with transparent fallback when any provider is unavailable.
 
  **Why Hybrid Routing?**
 
  - 🚀 **40-87% faster** for simple requests (local Ollama)
  - 💰 **65-100% cost savings** for requests that stay on Ollama
- - 🛡️ **Automatic fallback** ensures reliability when Ollama fails
- - 🔒 **Privacy-preserving** for simple queries (never leave your machine)
+ - 🎯 **Smart cost optimization** – use affordable OpenRouter models for moderate complexity
+ - 🛡️ **Automatic fallback** ensures reliability when any provider fails
+ - 🔒 **Privacy-preserving** for simple queries (with Ollama, they never leave your machine)
 
  **Quick Start:**
 
@@ -699,12 +742,14 @@ Lynkr supports **intelligent hybrid routing** that automatically routes requests
  ollama serve
  ollama pull qwen2.5-coder:latest
 
- # Terminal 2: Start Lynkr with hybrid routing
+ # Terminal 2: Start Lynkr with 3-tier routing
  export PREFER_OLLAMA=true
  export OLLAMA_ENDPOINT=http://localhost:11434
  export OLLAMA_MODEL=qwen2.5-coder:latest
- export DATABRICKS_API_KEY=your_key  # Fallback provider
- export DATABRICKS_API_BASE=your_base_url  # Fallback provider
+ export OPENROUTER_API_KEY=your_openrouter_key  # Mid-tier provider
+ export OPENROUTER_MODEL=openai/gpt-4o-mini  # Mid-tier model
+ export DATABRICKS_API_KEY=your_key  # Heavy workload provider
+ export DATABRICKS_API_BASE=your_base_url  # Heavy workload provider
  npm start
 
  # Terminal 3: Connect Claude CLI (works transparently)
@@ -715,16 +760,22 @@ claude
 
  **How It Works:**
 
- Lynkr intelligently routes each request:
+ Lynkr intelligently routes each request based on complexity:
 
  1. **Simple requests (0-2 tools)** → Try Ollama first
-    - ✅ If Ollama succeeds: Fast, local response (100-500ms)
-    - ❌ If Ollama fails: Automatic transparent fallback to cloud
+    - ✅ If Ollama succeeds: Fast, local, free response (100-500ms)
+    - ❌ If Ollama fails: Automatic transparent fallback to OpenRouter or Databricks
 
- 2. **Complex requests (3+ tools)** → Route directly to cloud
-    - Ollama isn't attempted (saves time on requests better suited for cloud)
+ 2. **Moderate requests (3-14 tools)** → Route to OpenRouter
+    - Uses affordable models like GPT-4o-mini ($0.15/1M input tokens)
+    - Full tool calling support
+    - ❌ If OpenRouter fails or is not configured: Fall back to Databricks
 
- 3. **Tool-incompatible models** → Route directly to cloud
+ 3. **Complex requests (15+ tools)** → Route directly to Databricks
+    - Heavy workloads get the most capable models
+    - Enterprise features and reliability
+
+ 4. **Tool-incompatible models** → Route directly to cloud
     - Requests requiring tools with non-tool-capable Ollama models skip Ollama
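The tier selection above reduces to a threshold check on the request's tool count. A minimal sketch — the thresholds follow the tier ranges described here, but the function itself is an illustration, not Lynkr's source:

```shell
#!/bin/sh
# Hypothetical 3-tier routing rule (thresholds follow the documented tiers;
# not Lynkr's actual code)
route() {
  TOOLS=$1
  if [ "$TOOLS" -le 2 ]; then
    echo "ollama"        # simple: 0-2 tools, local and free
  elif [ "$TOOLS" -lt 15 ]; then
    echo "openrouter"    # moderate: 3-14 tools, affordable cloud
  else
    echo "databricks"    # heavy: 15+ tools, most capable models
  fi
}

route 1    # → ollama
route 8    # → openrouter
route 20   # → databricks
```

In the real proxy the chosen provider is only the first attempt; the fallback chain below still applies if it fails.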
 
  **Configuration:**
@@ -734,9 +785,12 @@ Lynkr intelligently routes each request:
  PREFER_OLLAMA=true  # Enable hybrid routing mode
 
  # Optional (with defaults)
- OLLAMA_FALLBACK_ENABLED=true  # Enable automatic fallback (default: true)
- OLLAMA_MAX_TOOLS_FOR_ROUTING=3  # Max tools to route to Ollama (default: 3)
- OLLAMA_FALLBACK_PROVIDER=databricks  # Cloud provider for fallback (default: databricks)
+ FALLBACK_ENABLED=true  # Enable automatic fallback (default: true)
+ OLLAMA_MAX_TOOLS_FOR_ROUTING=3  # Max tools to route to Ollama (default: 3)
+ OPENROUTER_MAX_TOOLS_FOR_ROUTING=15  # Max tools to route to OpenRouter (default: 15)
+ FALLBACK_PROVIDER=databricks  # Final fallback provider (default: databricks)
+ OPENROUTER_API_KEY=your_key  # Required for the OpenRouter tier
+ OPENROUTER_MODEL=openai/gpt-4o-mini  # OpenRouter model (default: gpt-4o-mini)
  ```
  **Example Scenarios:**
 
@@ -747,16 +801,23 @@ User: "Write a hello world function in Python"
  → Routes to Ollama (fast, local, free)
  → Response in ~300ms
 
- # Scenario 2: Complex workflow with multiple tools
+ # Scenario 2: Moderate workflow (3-14 tools)
  User: "Search the codebase, read 5 files, and refactor them"
- → Routes directly to cloud (5+ tools, better suited for Databricks)
- → Response in ~2000ms
+ → Routes to OpenRouter (moderate complexity)
+ → Uses affordable GPT-4o-mini
+ → Response in ~1500ms
+
+ # Scenario 3: Heavy workflow (15+ tools)
+ User: "Analyze 20 files, run tests, update documentation, commit changes"
+ → Routes directly to Databricks (a complex task needs the most capable model)
+ → Response in ~2500ms
 
- # Scenario 3: Ollama unavailable
+ # Scenario 4: Automatic fallback chain
  User: "What is 2+2?"
  → Tries Ollama (connection refused)
- → Automatic fallback to Databricks
- → Response in ~1500ms (user sees no error)
+ → Falls back to OpenRouter (if configured)
+ → Falls back to Databricks (if OpenRouter unavailable)
+ → User sees no error, just gets a response
  ```
 
  **Circuit Breaker Protection:**
@@ -811,7 +872,7 @@ npm start
 
  # Option 2: Ollama-only mode (no fallback)
  export PREFER_OLLAMA=true
- export OLLAMA_FALLBACK_ENABLED=false
+ export FALLBACK_ENABLED=false
  npm start
  ```
 
@@ -1095,7 +1156,8 @@ Replace `<workspace>` and `<endpoint-name>` with your Databricks workspace host
 
  - **Databricks** – Mirrors Anthropic's hosted behaviour. Automatic policy web fallbacks (`needsWebFallback`) can trigger an extra `web_fetch`, and the upstream service executes dynamic pages on your behalf.
  - **Azure Anthropic** – Requests are normalised to Azure's payload shape. The proxy disables automatic `web_fetch` fallbacks to avoid duplicate tool executions; instead, the assistant surfaces a diagnostic message and you can trigger the tool manually if required.
- - **Ollama** – Connects to locally-running Ollama models. Tool definitions are filtered out since Ollama doesn't support native tool calling. System prompts are merged into the first user message. Response format is converted from Ollama's format to Anthropic-compatible content blocks. Best used for simple text generation tasks or as a cost-effective development environment.
+ - **OpenRouter** – Connects to OpenRouter's unified API for access to 100+ models. Full tool calling support with automatic format conversion between the Anthropic and OpenAI formats: messages are converted to OpenAI's format, tool calls are translated, and responses are converted back to an Anthropic-compatible format. Best used for cost optimization, model flexibility, or when you want to experiment with different models without changing your codebase.
+ - **Ollama** – Connects to locally-running Ollama models. Tool support varies by model (llama3.1, qwen2.5, and mistral support tools; llama3 and older models don't). System prompts are merged into the first user message. Response format is converted from Ollama's format to Anthropic-compatible content blocks. Best used for simple text generation tasks, offline development, or as a cost-effective development environment.
  - In all cases, `web_search` and `web_fetch` run locally. They do not execute JavaScript, so pages that render data client-side (Google Finance, etc.) will return scaffolding only. Prefer JSON/CSV quote APIs (e.g. the Yahoo chart API) when you need live financial data.
 
  ---
@@ -1224,7 +1286,7 @@ A: Functionally they overlap on core workflows (chat, tool calls, repo ops), but
 
  | Capability | Anthropic Hosted Backend | Claude Code Proxy |
  |------------|-------------------------|-------------------|
- | Claude models | Anthropic-operated Sonnet/Opus | Adapters for Databricks (default), Azure Anthropic, and Ollama (local models) |
+ | Claude models | Anthropic-operated Sonnet/Opus | Adapters for Databricks (default), Azure Anthropic, OpenRouter (100+ models), and Ollama (local models) |
  | Prompt cache | Managed, opaque | Local LRU cache with configurable TTL/size |
  | Git & workspace tools | Anthropic-managed hooks | Local Node handlers (`src/tools/`) with policy gate |
  | Web search/fetch | Hosted browsing agent, JS-capable | Local HTTP fetch (no JS) plus optional policy fallback |
@@ -1252,7 +1314,50 @@ A: Yes! Set `MODEL_PROVIDER=ollama` and ensure Ollama is running locally (`ollam
 
  A: For code generation, use `qwen2.5-coder:latest` (7B, optimized for code). For general conversations, `llama3:latest` (8B) or `mistral:latest` (7B) work well. Larger models (13B+) provide better quality but require more RAM and are slower.
 
  **Q: What are the performance differences between providers?**
- A: **Databricks/Azure Anthropic**: ~500ms-2s latency, cloud-hosted, pay-per-token, supports tools. **Ollama**: ~100-500ms first token, runs locally, free, no tool support. Choose Databricks/Azure for production workflows with tools; choose Ollama for fast iteration, offline development, or cost optimization.
+ A:
+ - **Databricks/Azure Anthropic**: ~500ms-2s latency, cloud-hosted, pay-per-token, full tool support, enterprise features
+ - **OpenRouter**: ~300ms-1.5s latency, cloud-hosted, competitive pricing ($0.15/1M for GPT-4o-mini), 100+ models, full tool support
+ - **Ollama**: ~100-500ms first token, runs locally, free, limited tool support (model-dependent)
+
+ Choose Databricks/Azure for enterprise production with guaranteed SLAs. Choose OpenRouter for flexibility, cost optimization, and access to multiple models. Choose Ollama for fast iteration, offline development, or maximum cost savings.
+
+ **Q: What is OpenRouter and why should I use it?**
+ A: OpenRouter is a unified API gateway that provides access to 100+ AI models from multiple providers (OpenAI, Anthropic, Google, Meta, Mistral, etc.) through a single API key. Benefits include:
+ - **No vendor lock-in**: Switch models without changing your code
+ - **Competitive pricing**: Often cheaper than going directly to providers (e.g., GPT-4o-mini at $0.15/$0.60 per 1M tokens)
+ - **Automatic fallbacks**: If your primary model is unavailable, OpenRouter can automatically try alternatives
+ - **No monthly fees**: Pay-as-you-go with no subscription required
+ - **Full tool calling support**: Compatible with Claude Code CLI workflows
+
+ **Q: How do I get started with OpenRouter?**
+ A:
+ 1. Visit https://openrouter.ai and sign in (GitHub, Google, or email)
+ 2. Go to https://openrouter.ai/keys and create an API key
+ 3. Add credits to your account (minimum $5, pay-as-you-go)
+ 4. Configure Lynkr:
+    ```env
+    MODEL_PROVIDER=openrouter
+    OPENROUTER_API_KEY=sk-or-v1-...
+    OPENROUTER_MODEL=openai/gpt-4o-mini
+    ```
+ 5. Start Lynkr and connect the Claude CLI
+
+ **Q: Which OpenRouter model should I use?**
+ A: Popular choices:
+ - **Budget-conscious**: `openai/gpt-4o-mini` ($0.15/$0.60 per 1M) – Best value for code tasks
+ - **Best quality**: `anthropic/claude-3.5-sonnet` – Claude's most capable model
+ - **Free tier**: `meta-llama/llama-3.1-8b-instruct:free` – Completely free (rate-limited)
+ - **Balanced**: `google/gemini-pro-1.5` – Large context window, good performance
+
+ See https://openrouter.ai/models for the complete list with pricing and features.
+
+ **Q: Can I use OpenRouter with the 3-tier hybrid routing?**
+ A: Yes! The recommended configuration uses:
+ - **Tier 1 (0-2 tools)**: Ollama (free, local, fast)
+ - **Tier 2 (3-14 tools)**: OpenRouter (affordable, full tool support)
+ - **Tier 3 (15+ tools)**: Databricks (most capable, enterprise features)
+
+ This gives you the best of all worlds: free for simple tasks, affordable for moderate complexity, and enterprise-grade for heavy workloads.
 
  **Q: Where are session transcripts stored?**
  A: In SQLite at `data/sessions.db` (configurable via `SESSION_DB_PATH`).