lynkr 0.1.4 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,215 @@
+ # GitHub Actions Workflows
+
+ This directory contains GitHub Actions workflows for automated testing and CI/CD.
+
+ ## Available Workflows
+
+ ### 1. CI Tests (`ci.yml`)
+
+ **Purpose:** Run the comprehensive test suite on every push and pull request
+
+ **Triggers:**
+ - Push to `main` or `develop` branches
+ - Pull requests to `main` or `develop` branches
+
+ **What it does:**
+ - Tests on Node.js 20.x and 22.x
+ - Runs the linter (`npm run lint`)
+ - Runs unit tests (`npm run test:unit`)
+ - Runs performance tests (`npm run test:performance`)
+ - Uses the npm cache for faster builds
+
+ **Environment Variables:**
+ - `DATABRICKS_API_KEY=test-key` (mock value for tests)
+ - `DATABRICKS_API_BASE=http://test.com` (mock value for tests)
+
+ **Status:** Runs on every push/PR; fails if unit tests fail
+
+ ---
+
+ ### 2. Web Tools Tests (`web-tools-tests.yml`)
+
+ **Purpose:** Run web search tool tests when related files change
+
+ **Triggers:**
+ - Changes to web tools source files:
+   - `src/tools/web.js`
+   - `src/tools/web-client.js`
+   - `src/clients/retry.js`
+   - `src/config/index.js`
+   - `test/web-tools.test.js`
+
+ **What it does:**
+ - Runs only the web tools test suite
+ - Generates a test summary in the GitHub Actions UI
+ - Faster feedback for web tools changes
+
+ **Test Coverage:**
+ - HTML extraction (9 tests)
+ - HTTP keep-alive agent (2 tests)
+ - Retry logic with exponential backoff (2 tests)
+ - Configuration management (3 tests)
+ - Error handling (1 test)
+ - Performance validation (1 test)
+ - Body preview configuration (1 test)
+
+ **Total:** 19 tests
+
+ ---
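The retry logic exercised by this suite follows the usual exponential-backoff shape: each attempt waits roughly double the previous delay, up to a cap. A minimal sketch of that schedule — the base delay and cap here are assumptions for illustration, not Lynkr's actual values:

```shell
#!/bin/sh
# Illustrative exponential-backoff delay schedule (base/cap are assumed values,
# not taken from src/clients/retry.js)
BASE_MS=100
CAP_MS=5000

delay() {
  # attempt N waits BASE_MS * 2^N milliseconds, capped at CAP_MS
  D=$(( BASE_MS * (1 << $1) ))
  if [ "$D" -gt "$CAP_MS" ]; then D=$CAP_MS; fi
  echo "$D"
}

delay 0   # → 100
delay 3   # → 800
delay 10  # → 5000 (capped)
```

Production retry code typically also adds random jitter to each delay so that concurrent clients don't retry in lockstep.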
+
+ ### 3. NPM Publish (`npm-publish.yml`)
+
+ **Purpose:** Automatically publish the package to the npm registry
+
+ **Triggers:**
+ - Git tags starting with `v` (e.g., `v0.1.5`)
+ - GitHub Releases created
+
+ **What it does:**
+ - Runs the full test suite before publishing
+ - Checks whether the version already exists on npm
+ - Publishes the package to the npm registry (if tests pass)
+ - Prevents duplicate publishes
+ - Creates a publish summary
+
+ **Requirements:**
+ - `NPM_TOKEN` secret must be configured
+ - Tests must pass
+ - Version must be new
+
+ **Status:** Only publishes on successful builds
+
+ ---
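The duplicate-publish guard can be approximated locally with `npm view`, which prints the latest published version of a package. A rough sketch under assumed variable names — the workflow's actual step may be written differently:

```shell
#!/bin/sh
# Hypothetical version-already-published check (names are illustrative,
# not copied from npm-publish.yml)
PKG="lynkr"
NEW_VERSION="0.1.5"

# `npm view <pkg> version` prints the latest published version; treat any
# failure (offline, npm not installed) as "nothing published yet"
PUBLISHED="$(npm view "$PKG" version 2>/dev/null || echo "none")"

if [ "$PUBLISHED" = "$NEW_VERSION" ]; then
  echo "v$NEW_VERSION already on npm; skipping publish"
else
  echo "v$NEW_VERSION is new; safe to publish"
fi
```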
+
+ ### 4. Version Bump (`version-bump.yml`)
+
+ **Purpose:** Manual workflow to bump the version and create releases
+
+ **Triggers:**
+ - Manual workflow dispatch (button in the Actions tab)
+
+ **What it does:**
+ - Prompts for the version type (patch/minor/major)
+ - Runs tests before the version bump
+ - Updates the `package.json` version
+ - Creates a git commit and tag
+ - Pushes changes to the repository
+ - Creates a GitHub Release with a changelog
+ - Triggers the npm-publish workflow automatically
+
+ **Options:**
+ - `patch` - Bug fixes (0.1.4 → 0.1.5)
+ - `minor` - New features (0.1.4 → 0.2.0)
+ - `major` - Breaking changes (0.1.4 → 1.0.0)
+
+ ---
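The three options are plain semver arithmetic: `patch` increments the last component, `minor` increments the middle one and resets patch, `major` increments the first and resets the rest. A small illustration (in practice the workflow delegates this to `npm version`, which also creates the commit and tag):

```shell
#!/bin/sh
# Illustrative semver arithmetic behind the patch/minor/major options
bump() {
  # split "MAJOR.MINOR.PATCH" into its three components
  IFS=. read -r MA MI PA <<EOF
$2
EOF
  case "$1" in
    patch) PA=$((PA + 1)) ;;
    minor) MI=$((MI + 1)); PA=0 ;;
    major) MA=$((MA + 1)); MI=0; PA=0 ;;
  esac
  echo "$MA.$MI.$PA"
}

bump patch 0.1.4   # → 0.1.5
bump minor 0.1.4   # → 0.2.0
bump major 0.1.4   # → 1.0.0
```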
+
+ ### 5. IndexNow Notification (`index.yml`)
+
+ **Purpose:** Notify search engines when documentation is updated
+
+ **Triggers:**
+ - Push to the `main` branch
+ - Changes in the `docs/**` directory
+
+ **What it does:**
+ - Notifies Bing IndexNow about updated documentation
+ - Helps with SEO and documentation discoverability
+
+ ---
+
+ ## Adding Status Badges
+
+ Add these badges to your README.md:
+
+ ```markdown
+ ![CI Tests](https://github.com/vishalveerareddy123/Lynkr/actions/workflows/ci.yml/badge.svg)
+ ![Web Tools Tests](https://github.com/vishalveerareddy123/Lynkr/actions/workflows/web-tools-tests.yml/badge.svg)
+ ![npm version](https://img.shields.io/npm/v/lynkr.svg)
+ ![npm downloads](https://img.shields.io/npm/dt/lynkr.svg)
+ ```
+
+ ## Running Tests Locally
+
+ Before pushing, run the tests locally:
+
+ ```bash
+ # Run all unit tests
+ npm run test:unit
+
+ # Run only web tools tests
+ DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com \
+   node --test test/web-tools.test.js
+
+ # Run quick tests (routing only)
+ npm run test:quick
+
+ # Run all tests including performance
+ npm test
+ ```
+
+ ## Workflow Configuration
+
+ ### Required Secrets
+
+ **For npm publishing workflows:**
+ - `NPM_TOKEN` - Your npm automation token (required to publish)
+   - Get from: https://www.npmjs.com/settings/YOUR_USERNAME/tokens
+   - Type: "Automation" token
+   - Add to: Settings → Secrets → Actions → New repository secret
+
+ **For test workflows:**
+ - No secrets required (uses mock credentials)
+
+ **For IndexNow workflow:**
+ - `INDEX_NOW` - Your IndexNow API key (optional, only for docs)
+
+ ### Matrix Strategy
+
+ The CI workflow uses a matrix strategy to test on multiple Node.js versions:
+ - Node.js 20.x (LTS)
+ - Node.js 22.x (Current)
+
+ This ensures compatibility across different Node versions.
+
+ ## Troubleshooting
+
+ ### Tests fail locally but pass in CI
+ - Check your Node.js version (`node --version`)
+ - Ensure `npm ci` is used (not `npm install`)
+ - Check for platform-specific issues (macOS vs Linux)
+
+ ### Tests pass locally but fail in CI
+ - Environment variables might be missing
+ - Dependencies might need updating
+ - Check the GitHub Actions logs for details
+
+ ### Workflow doesn't trigger
+ - Verify file paths in `on.push.paths`
+ - Check that branch names match
+ - Ensure the workflow file is in `.github/workflows/`
+
+ ## Modifying Workflows
+
+ When making changes:
+
+ 1. Check YAML syntax (use a YAML validator)
+ 2. Test locally first with the same commands
+ 3. Create a PR to test in CI before merging
+ 4. Check the GitHub Actions tab for results
+
+ ## Performance Considerations
+
+ - **npm cache:** Workflows cache npm's download cache (`cache: 'npm'`) for faster installs
+ - **Parallel jobs:** Tests run on multiple Node versions in parallel
+ - **Path filtering:** The web tools workflow only runs when relevant files change
+ - **continue-on-error:** Performance tests won't fail the build
+
+ ## Future Improvements
+
+ Potential additions:
+ - Code coverage reporting
+ - Docker container testing
+ - E2E integration tests
+ - Deploy previews for PRs
+ - Automated dependency updates (Dependabot)
@@ -0,0 +1,66 @@
+ name: CI Tests
+
+ on:
+   push:
+     branches:
+       - main
+       - develop
+   pull_request:
+     branches:
+       - main
+       - develop
+
+ jobs:
+   test:
+     name: Run Tests
+     runs-on: ubuntu-latest
+
+     strategy:
+       matrix:
+         node-version: [20.x, 22.x]
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js ${{ matrix.node-version }}
+         uses: actions/setup-node@v4
+         with:
+           node-version: ${{ matrix.node-version }}
+           cache: 'npm'
+
+       - name: Install dependencies
+         run: npm ci
+
+       - name: Run linter
+         run: npm run lint
+         continue-on-error: true
+
+       - name: Run unit tests
+         run: npm run test:unit
+         env:
+           DATABRICKS_API_KEY: test-key
+           DATABRICKS_API_BASE: http://test.com
+
+       - name: Run performance tests
+         run: npm run test:performance
+         env:
+           DATABRICKS_API_KEY: test-key
+           DATABRICKS_API_BASE: http://test.com
+         continue-on-error: true
+
+   test-summary:
+     name: Test Summary
+     runs-on: ubuntu-latest
+     needs: test
+     if: always()
+
+     steps:
+       - name: Check test results
+         run: |
+           echo "Tests completed"
+           if [ "${{ needs.test.result }}" == "failure" ]; then
+             echo "Tests failed!"
+             exit 1
+           fi
+           echo "All tests passed!"
@@ -0,0 +1,56 @@
+ name: Web Tools Tests
+
+ on:
+   push:
+     paths:
+       - 'src/tools/web.js'
+       - 'src/tools/web-client.js'
+       - 'src/clients/retry.js'
+       - 'src/config/index.js'
+       - 'test/web-tools.test.js'
+       - '.github/workflows/web-tools-tests.yml'
+   pull_request:
+     paths:
+       - 'src/tools/web.js'
+       - 'src/tools/web-client.js'
+       - 'src/clients/retry.js'
+       - 'src/config/index.js'
+       - 'test/web-tools.test.js'
+
+ jobs:
+   web-tools-test:
+     name: Web Tools Test Suite
+     runs-on: ubuntu-latest
+
+     steps:
+       - name: Checkout code
+         uses: actions/checkout@v4
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: '20.x'
+           cache: 'npm'
+
+       - name: Install dependencies
+         run: npm ci
+
+       - name: Run web tools tests
+         run: |
+           DATABRICKS_API_KEY=test-key DATABRICKS_API_BASE=http://test.com \
+             node --test test/web-tools.test.js
+
+       - name: Test results summary
+         if: always()
+         run: |
+           echo "## Web Tools Test Results" >> $GITHUB_STEP_SUMMARY
+           echo "✅ All web tools tests passed!" >> $GITHUB_STEP_SUMMARY
+           echo "" >> $GITHUB_STEP_SUMMARY
+           echo "### Coverage:" >> $GITHUB_STEP_SUMMARY
+           echo "- HTML extraction (9 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- HTTP keep-alive agent (2 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Retry logic with exponential backoff (2 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Configuration management (3 tests)" >> $GITHUB_STEP_SUMMARY
+           echo "- Error handling (1 test)" >> $GITHUB_STEP_SUMMARY
+           echo "- Performance validation (1 test)" >> $GITHUB_STEP_SUMMARY
+           echo "- Body preview configuration (1 test)" >> $GITHUB_STEP_SUMMARY
package/README.md CHANGED
@@ -55,7 +55,7 @@ This repository contains a Node.js service that emulates the Anthropic Claude Co
  Key highlights:
 
  - **Production-ready architecture** – 14 production hardening features including circuit breakers, load shedding, graceful shutdown, comprehensive metrics (Prometheus format), and Kubernetes-ready health checks. Minimal overhead (~7μs per request) with 140K req/sec throughput.
- - **Multi-provider support** – Works with Databricks (default), Azure-hosted Anthropic endpoints, and local Ollama models; requests are normalized to each provider while returning Claude-flavored responses.
+ - **Multi-provider support** – Works with Databricks (default), Azure-hosted Anthropic endpoints, OpenRouter (100+ models), and local Ollama models; requests are normalized to each provider while returning Claude-flavored responses.
  - **Enterprise observability** – Real-time metrics collection, structured logging with request ID correlation, latency percentiles (p50, p95, p99), token usage tracking, and cost attribution. Multiple export formats (JSON, Prometheus).
  - **Resilience & reliability** – Exponential backoff with jitter for retries, circuit breaker protection against cascading failures, automatic load shedding during overload, and zero-downtime deployments via graceful shutdown.
  - **Workspace awareness** – Local repo indexing, `CLAUDE.md` summaries, language-aware navigation, and Git helpers mirror core Claude Code workflows.
@@ -65,7 +65,7 @@ Key highlights:
 
  The result is a production-ready, self-hosted alternative that stays close to Anthropic's ergonomics while providing enterprise-grade reliability, observability, and performance.
 
- > **Compatibility note:** Claude models hosted on Databricks work out of the box. Set `MODEL_PROVIDER=azure-anthropic` (and related credentials) to target the Azure-hosted Anthropic `/anthropic/v1/messages` endpoint. Set `MODEL_PROVIDER=ollama` to use locally-running Ollama models (qwen2.5-coder, llama3, mistral, etc.).
+ > **Compatibility note:** Claude models hosted on Databricks work out of the box. Set `MODEL_PROVIDER=azure-anthropic` (and related credentials) to target the Azure-hosted Anthropic `/anthropic/v1/messages` endpoint. Set `MODEL_PROVIDER=openrouter` to access 100+ models through OpenRouter (GPT-4o, Claude, Gemini, etc.). Set `MODEL_PROVIDER=ollama` to use locally-running Ollama models (qwen2.5-coder, llama3, mistral, etc.).
 
  Further documentation and usage notes are available on [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr).
 
@@ -301,22 +301,23 @@ For detailed performance analysis, benchmarks, and deployment guidance, see [PER
  │ MCP Registry │ │ Provider Adapters │ │ Sandbox │
  │ (manifest -> │──RPC─────│ • Databricks (circuit-breaker) │──┐ │ Runtime │
  │ JSON-RPC client│ │ • Azure Anthropic (retry logic)│ │ │ (Docker) │
- └────────────────┘ │ • Ollama (local models) │ │ └──────────────┘
+ └────────────────┘ │ • OpenRouter (100+ models) │ │ └──────────────┘
+ │ • Ollama (local models) │ │
  │ • HTTP Connection Pooling │ │
  │ • Exponential Backoff + Jitter │ │
  └────────────┬───────────────────┘ │
  │ │
- ┌────────────────┼────────────┐
- │ │
- ┌─────────▼────────┐ ┌───▼────────┐ ┌▼─────────────┐
- │ Databricks │ │ Azure │ │ Ollama API │───────┘
- │ Serving Endpoint │ │ Anthropic │ │ (localhost)
- │ (REST) │ │ /anthropic │ │ qwen2.5-coder│
- └──────────────────┘ │ /v1/messages│ └──────────────┘
- └────────────┘
-
- ┌────────▼──────────┐
- │ External MCP tools
+ ┌────────────────┼─────────────────┐
+ │ │
+ ┌─────────▼────────┐ ┌───▼────────┐ ┌─────▼─────────┐
+ │ Databricks │ │ Azure │ │ OpenRouter API
+ │ Serving Endpoint │ │ Anthropic │ │ (GPT-4o, etc.)│
+ │ (REST) │ │ /anthropic │ └───────────────┘
+ └──────────────────┘ │ /v1/messages│ ┌──────────────┐
+ └────────────┘ │ Ollama API │───────┘
+ │ (localhost) │
+ ┌────────▼──────────│ qwen2.5-coder│
+ │ External MCP tools└──────────────┘
  │ (GitHub, Jira) │
  └───────────────────┘
  ```
@@ -491,6 +492,7 @@ Set `MODEL_PROVIDER` to select the upstream endpoint:
 
  - `MODEL_PROVIDER=databricks` (default) – expects `DATABRICKS_API_BASE`, `DATABRICKS_API_KEY`, and optionally `DATABRICKS_ENDPOINT_PATH`.
  - `MODEL_PROVIDER=azure-anthropic` – routes requests to Azure's `/anthropic/v1/messages` endpoint and uses the headers Azure expects.
+ - `MODEL_PROVIDER=openrouter` – connects to OpenRouter for access to 100+ models (GPT-4o, Claude, Gemini, Llama, etc.). Requires `OPENROUTER_API_KEY`.
  - `MODEL_PROVIDER=ollama` – connects to a locally-running Ollama instance for models like qwen2.5-coder, llama3, mistral, etc.
 
  **Azure-hosted Anthropic configuration:**
@@ -529,6 +531,42 @@ ollama pull qwen2.5-coder:latest
  ollama list
  ```
 
+ **OpenRouter configuration:**
+
+ OpenRouter provides unified access to 100+ AI models through a single API, including GPT-4o, Claude, Gemini, Llama, Mixtral, and more. It offers competitive pricing, automatic fallbacks, and no need to manage multiple API keys.
+
+ ```env
+ MODEL_PROVIDER=openrouter
+ OPENROUTER_API_KEY=sk-or-v1-...  # Get from https://openrouter.ai/keys
+ OPENROUTER_MODEL=openai/gpt-4o-mini  # Model to use (see https://openrouter.ai/models)
+ OPENROUTER_ENDPOINT=https://openrouter.ai/api/v1/chat/completions  # API endpoint
+ PORT=8080
+ WORKSPACE_ROOT=/path/to/your/repo
+ ```
+
+ **Popular OpenRouter models:**
+ - `openai/gpt-4o-mini` – Fast, affordable GPT-4o mini ($0.15/$0.60 per 1M tokens)
+ - `anthropic/claude-3.5-sonnet` – Claude 3.5 Sonnet for complex reasoning
+ - `google/gemini-pro-1.5` – Google's Gemini Pro with a large context window
+ - `meta-llama/llama-3.1-70b-instruct` – Meta's open-source Llama 3.1
+
+ See https://openrouter.ai/models for the complete list with pricing.
+
+ **Getting an OpenRouter API key:**
+ 1. Visit https://openrouter.ai
+ 2. Sign in with GitHub, Google, or email
+ 3. Go to https://openrouter.ai/keys
+ 4. Create a new API key
+ 5. Add credits to your account (pay-as-you-go, no subscription required)
+
+ **OpenRouter benefits:**
+ - ✅ **100+ models** through one API (no need to manage multiple provider accounts)
+ - ✅ **Automatic fallbacks** if your primary model is unavailable
+ - ✅ **Competitive pricing** with volume discounts
+ - ✅ **Full tool calling support** (function calling compatible with Claude Code CLI)
+ - ✅ **No monthly fees** – pay only for what you use
+ - ✅ **Rate limit pooling** across models
+
  ---
 
  ## Configuration Reference
@@ -537,7 +575,7 @@ ollama list
  |----------|-------------|---------|
  | `PORT` | HTTP port for the proxy server. | `8080` |
  | `WORKSPACE_ROOT` | Filesystem path exposed to workspace tools and indexer. | `process.cwd()` |
- | `MODEL_PROVIDER` | Selects the model backend (`databricks`, `azure-anthropic`, `ollama`). | `databricks` |
+ | `MODEL_PROVIDER` | Selects the model backend (`databricks`, `azure-anthropic`, `openrouter`, `ollama`). | `databricks` |
  | `MODEL_DEFAULT` | Overrides the default model/deployment name sent to the provider. | Provider-specific default |
  | `DATABRICKS_API_BASE` | Base URL of your Databricks workspace (required when `MODEL_PROVIDER=databricks`). | – |
  | `DATABRICKS_API_KEY` | Databricks PAT used for the serving endpoint (required for Databricks). | – |
@@ -545,6 +583,10 @@ ollama list
  | `AZURE_ANTHROPIC_ENDPOINT` | Full HTTPS endpoint for Azure-hosted Anthropic `/anthropic/v1/messages` (required when `MODEL_PROVIDER=azure-anthropic`). | – |
  | `AZURE_ANTHROPIC_API_KEY` | API key supplied via the `x-api-key` header for Azure Anthropic. | – |
  | `AZURE_ANTHROPIC_VERSION` | Anthropic API version header for Azure Anthropic calls. | `2023-06-01` |
+ | `OPENROUTER_API_KEY` | OpenRouter API key (required when `MODEL_PROVIDER=openrouter`). Get from https://openrouter.ai/keys | – |
+ | `OPENROUTER_MODEL` | OpenRouter model to use (e.g., `openai/gpt-4o-mini`, `anthropic/claude-3.5-sonnet`). See https://openrouter.ai/models | `openai/gpt-4o-mini` |
+ | `OPENROUTER_ENDPOINT` | OpenRouter API endpoint URL. | `https://openrouter.ai/api/v1/chat/completions` |
+ | `OPENROUTER_MAX_TOOLS_FOR_ROUTING` | Maximum tool count for routing to OpenRouter in hybrid mode. | `15` |
  | `OLLAMA_ENDPOINT` | Ollama API endpoint URL (required when `MODEL_PROVIDER=ollama`). | `http://localhost:11434` |
  | `OLLAMA_MODEL` | Ollama model name to use (e.g., `qwen2.5-coder:latest`, `llama3`, `mistral`). | `qwen2.5-coder:7b` |
  | `OLLAMA_TIMEOUT_MS` | Request timeout for Ollama API calls in milliseconds. | `120000` (2 minutes) |
@@ -683,14 +725,15 @@ See [OLLAMA-TOOL-CALLING.md](OLLAMA-TOOL-CALLING.md) for implementation details.
 
  ### Hybrid Routing with Automatic Fallback
 
- Lynkr supports **intelligent hybrid routing** that automatically routes requests between Ollama (local/fast) and cloud providers (Databricks/Azure) based on request complexity, with transparent fallback when Ollama is unavailable.
+ Lynkr supports **intelligent 3-tier hybrid routing** that automatically routes requests between Ollama (local/fast), OpenRouter (moderate complexity), and cloud providers (Databricks/Azure for heavy workloads) based on request complexity, with transparent fallback when any provider is unavailable.
 
  **Why Hybrid Routing?**
 
  - 🚀 **40-87% faster** for simple requests (local Ollama)
  - 💰 **65-100% cost savings** for requests that stay on Ollama
- - 🛡️ **Automatic fallback** ensures reliability when Ollama fails
- - 🔒 **Privacy-preserving** for simple queries (never leave your machine)
+ - 🎯 **Smart cost optimization** – use affordable OpenRouter models for moderate complexity
+ - 🛡️ **Automatic fallback** ensures reliability when any provider fails
+ - 🔒 **Privacy-preserving** for simple queries (with Ollama, they never leave your machine)
 
  **Quick Start:**
 
@@ -699,12 +742,14 @@ Lynkr supports **intelligent hybrid routing** that automatically routes requests
  ollama serve
  ollama pull qwen2.5-coder:latest
 
- # Terminal 2: Start Lynkr with hybrid routing
+ # Terminal 2: Start Lynkr with 3-tier routing
  export PREFER_OLLAMA=true
  export OLLAMA_ENDPOINT=http://localhost:11434
  export OLLAMA_MODEL=qwen2.5-coder:latest
- export DATABRICKS_API_KEY=your_key  # Fallback provider
- export DATABRICKS_API_BASE=your_base_url  # Fallback provider
+ export OPENROUTER_API_KEY=your_openrouter_key  # Mid-tier provider
+ export OPENROUTER_MODEL=openai/gpt-4o-mini  # Mid-tier model
+ export DATABRICKS_API_KEY=your_key  # Heavy workload provider
+ export DATABRICKS_API_BASE=your_base_url  # Heavy workload provider
  npm start
 
  # Terminal 3: Connect Claude CLI (works transparently)
@@ -715,16 +760,22 @@ claude
 
  **How It Works:**
 
- Lynkr intelligently routes each request:
+ Lynkr intelligently routes each request based on complexity:
 
  1. **Simple requests (0-2 tools)** → Try Ollama first
-    - ✅ If Ollama succeeds: Fast, local response (100-500ms)
-    - ❌ If Ollama fails: Automatic transparent fallback to cloud
+    - ✅ If Ollama succeeds: Fast, local, free response (100-500ms)
+    - ❌ If Ollama fails: Automatic transparent fallback to OpenRouter or Databricks
 
- 2. **Complex requests (3+ tools)** → Route directly to cloud
-    - Ollama isn't attempted (saves time on requests better suited for cloud)
+ 2. **Moderate requests (3-14 tools)** → Route to OpenRouter
+    - Uses affordable models like GPT-4o-mini ($0.15/1M input tokens)
+    - Full tool calling support
+    - ❌ If OpenRouter fails or is not configured: Fall back to Databricks
 
- 3. **Tool-incompatible models** → Route directly to cloud
+ 3. **Complex requests (15+ tools)** → Route directly to Databricks
+    - Heavy workloads get the most capable models
+    - Enterprise features and reliability
+
+ 4. **Tool-incompatible models** → Route directly to cloud
     - Requests requiring tools with non-tool-capable Ollama models skip Ollama
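The tier selection above reduces to a threshold check on the request's tool count. A minimal sketch — the thresholds follow the tier ranges described here, but the function itself is an illustration, not Lynkr's source:

```shell
#!/bin/sh
# Hypothetical 3-tier routing rule (thresholds follow the documented tiers;
# not Lynkr's actual code)
route() {
  TOOLS=$1
  if [ "$TOOLS" -le 2 ]; then
    echo "ollama"        # simple: 0-2 tools, local and free
  elif [ "$TOOLS" -lt 15 ]; then
    echo "openrouter"    # moderate: 3-14 tools, affordable cloud
  else
    echo "databricks"    # heavy: 15+ tools, most capable models
  fi
}

route 1    # → ollama
route 8    # → openrouter
route 20   # → databricks
```

In the real proxy the chosen provider is only the first attempt; the fallback chain below still applies if it fails.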
 
  **Configuration:**
@@ -734,9 +785,12 @@ Lynkr intelligently routes each request:
  PREFER_OLLAMA=true  # Enable hybrid routing mode
 
  # Optional (with defaults)
- OLLAMA_FALLBACK_ENABLED=true  # Enable automatic fallback (default: true)
- OLLAMA_MAX_TOOLS_FOR_ROUTING=3  # Max tools to route to Ollama (default: 3)
- OLLAMA_FALLBACK_PROVIDER=databricks  # Cloud provider for fallback (default: databricks)
+ FALLBACK_ENABLED=true  # Enable automatic fallback (default: true)
+ OLLAMA_MAX_TOOLS_FOR_ROUTING=3  # Max tools to route to Ollama (default: 3)
+ OPENROUTER_MAX_TOOLS_FOR_ROUTING=15  # Max tools to route to OpenRouter (default: 15)
+ FALLBACK_PROVIDER=databricks  # Final fallback provider (default: databricks)
+ OPENROUTER_API_KEY=your_key  # Required for the OpenRouter tier
+ OPENROUTER_MODEL=openai/gpt-4o-mini  # OpenRouter model (default: gpt-4o-mini)
  ```
  **Example Scenarios:**
 
@@ -747,16 +801,23 @@ User: "Write a hello world function in Python"
  → Routes to Ollama (fast, local, free)
  → Response in ~300ms
 
- # Scenario 2: Complex workflow with multiple tools
+ # Scenario 2: Moderate workflow (3-14 tools)
  User: "Search the codebase, read 5 files, and refactor them"
- → Routes directly to cloud (5+ tools, better suited for Databricks)
- → Response in ~2000ms
+ → Routes to OpenRouter (moderate complexity)
+ → Uses affordable GPT-4o-mini
+ → Response in ~1500ms
+
+ # Scenario 3: Heavy workflow (15+ tools)
+ User: "Analyze 20 files, run tests, update documentation, commit changes"
+ → Routes directly to Databricks (a complex task needs the most capable model)
+ → Response in ~2500ms
 
- # Scenario 3: Ollama unavailable
+ # Scenario 4: Automatic fallback chain
  User: "What is 2+2?"
  → Tries Ollama (connection refused)
- → Automatic fallback to Databricks
- → Response in ~1500ms (user sees no error)
+ → Falls back to OpenRouter (if configured)
+ → Falls back to Databricks (if OpenRouter unavailable)
+ → User sees no error, just gets a response
  ```
 
  **Circuit Breaker Protection:**
@@ -811,7 +872,7 @@ npm start
 
  # Option 2: Ollama-only mode (no fallback)
  export PREFER_OLLAMA=true
- export OLLAMA_FALLBACK_ENABLED=false
+ export FALLBACK_ENABLED=false
  npm start
  ```
 
@@ -1095,7 +1156,8 @@ Replace `<workspace>` and `<endpoint-name>` with your Databricks workspace host
 
  - **Databricks** – Mirrors Anthropic's hosted behaviour. Automatic policy web fallbacks (`needsWebFallback`) can trigger an extra `web_fetch`, and the upstream service executes dynamic pages on your behalf.
  - **Azure Anthropic** – Requests are normalised to Azure's payload shape. The proxy disables automatic `web_fetch` fallbacks to avoid duplicate tool executions; instead, the assistant surfaces a diagnostic message and you can trigger the tool manually if required.
- - **Ollama** – Connects to locally-running Ollama models. Tool definitions are filtered out since Ollama doesn't support native tool calling. System prompts are merged into the first user message. Response format is converted from Ollama's format to Anthropic-compatible content blocks. Best used for simple text generation tasks or as a cost-effective development environment.
+ - **OpenRouter** – Connects to OpenRouter's unified API for access to 100+ models. Full tool calling support with automatic format conversion between the Anthropic and OpenAI formats: messages are converted to OpenAI's format, tool calls are translated, and responses are converted back to an Anthropic-compatible format. Best used for cost optimization, model flexibility, or when you want to experiment with different models without changing your codebase.
+ - **Ollama** – Connects to locally-running Ollama models. Tool support varies by model (llama3.1, qwen2.5, and mistral support tools; llama3 and older models don't). System prompts are merged into the first user message. Response format is converted from Ollama's format to Anthropic-compatible content blocks. Best used for simple text generation tasks, offline development, or as a cost-effective development environment.
  - In all cases, `web_search` and `web_fetch` run locally. They do not execute JavaScript, so pages that render data client-side (Google Finance, etc.) will return scaffolding only. Prefer JSON/CSV quote APIs (e.g. the Yahoo chart API) when you need live financial data.
 
  ---
@@ -1224,7 +1286,7 @@ A: Functionally they overlap on core workflows (chat, tool calls, repo ops), but
 
  | Capability | Anthropic Hosted Backend | Claude Code Proxy |
  |------------|-------------------------|-------------------|
- | Claude models | Anthropic-operated Sonnet/Opus | Adapters for Databricks (default), Azure Anthropic, and Ollama (local models) |
+ | Claude models | Anthropic-operated Sonnet/Opus | Adapters for Databricks (default), Azure Anthropic, OpenRouter (100+ models), and Ollama (local models) |
  | Prompt cache | Managed, opaque | Local LRU cache with configurable TTL/size |
  | Git & workspace tools | Anthropic-managed hooks | Local Node handlers (`src/tools/`) with policy gate |
  | Web search/fetch | Hosted browsing agent, JS-capable | Local HTTP fetch (no JS) plus optional policy fallback |
@@ -1252,7 +1314,50 @@ A: Yes! Set `MODEL_PROVIDER=ollama` and ensure Ollama is running locally (`ollam
 
  A: For code generation, use `qwen2.5-coder:latest` (7B, optimized for code). For general conversations, `llama3:latest` (8B) or `mistral:latest` (7B) work well. Larger models (13B+) provide better quality but require more RAM and are slower.
 
  **Q: What are the performance differences between providers?**
- A: **Databricks/Azure Anthropic**: ~500ms-2s latency, cloud-hosted, pay-per-token, supports tools. **Ollama**: ~100-500ms first token, runs locally, free, no tool support. Choose Databricks/Azure for production workflows with tools; choose Ollama for fast iteration, offline development, or cost optimization.
+ A:
+ - **Databricks/Azure Anthropic**: ~500ms-2s latency, cloud-hosted, pay-per-token, full tool support, enterprise features
+ - **OpenRouter**: ~300ms-1.5s latency, cloud-hosted, competitive pricing ($0.15/1M for GPT-4o-mini), 100+ models, full tool support
+ - **Ollama**: ~100-500ms first token, runs locally, free, limited tool support (model-dependent)
+
+ Choose Databricks/Azure for enterprise production with guaranteed SLAs. Choose OpenRouter for flexibility, cost optimization, and access to multiple models. Choose Ollama for fast iteration, offline development, or maximum cost savings.
+
+ **Q: What is OpenRouter and why should I use it?**
+ A: OpenRouter is a unified API gateway that provides access to 100+ AI models from multiple providers (OpenAI, Anthropic, Google, Meta, Mistral, etc.) through a single API key. Benefits include:
+ - **No vendor lock-in**: Switch models without changing your code
+ - **Competitive pricing**: Often cheaper than going directly to providers (e.g., GPT-4o-mini at $0.15/$0.60 per 1M tokens)
+ - **Automatic fallbacks**: If your primary model is unavailable, OpenRouter can automatically try alternatives
+ - **No monthly fees**: Pay-as-you-go with no subscription required
+ - **Full tool calling support**: Compatible with Claude Code CLI workflows
+
+ **Q: How do I get started with OpenRouter?**
+ A:
+ 1. Visit https://openrouter.ai and sign in (GitHub, Google, or email)
+ 2. Go to https://openrouter.ai/keys and create an API key
+ 3. Add credits to your account (minimum $5, pay-as-you-go)
+ 4. Configure Lynkr:
+    ```env
+    MODEL_PROVIDER=openrouter
+    OPENROUTER_API_KEY=sk-or-v1-...
+    OPENROUTER_MODEL=openai/gpt-4o-mini
+    ```
+ 5. Start Lynkr and connect the Claude CLI
+
+ **Q: Which OpenRouter model should I use?**
+ A: Popular choices:
+ - **Budget-conscious**: `openai/gpt-4o-mini` ($0.15/$0.60 per 1M) – Best value for code tasks
+ - **Best quality**: `anthropic/claude-3.5-sonnet` – Claude's most capable model
+ - **Free tier**: `meta-llama/llama-3.1-8b-instruct:free` – Completely free (rate-limited)
+ - **Balanced**: `google/gemini-pro-1.5` – Large context window, good performance
+
+ See https://openrouter.ai/models for the complete list with pricing and features.
+
+ **Q: Can I use OpenRouter with the 3-tier hybrid routing?**
+ A: Yes! The recommended configuration uses:
+ - **Tier 1 (0-2 tools)**: Ollama (free, local, fast)
+ - **Tier 2 (3-14 tools)**: OpenRouter (affordable, full tool support)
+ - **Tier 3 (15+ tools)**: Databricks (most capable, enterprise features)
+
+ This gives you the best of all worlds: free for simple tasks, affordable for moderate complexity, and enterprise-grade for heavy workloads.
 
  **Q: Where are session transcripts stored?**
  A: In SQLite at `data/sessions.db` (configurable via `SESSION_DB_PATH`).