lynkr 3.2.0 → 3.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,21 +1,22 @@
- # Lynkr - Production-Ready Claude Code Proxy with Multi-Provider Support, MCP Integration & Token Optimization
+ # Lynkr - Claude Code Proxy with Multi-Provider Support, MCP Integration & Token Optimization
 
  [![npm version](https://img.shields.io/npm/v/lynkr.svg)](https://www.npmjs.com/package/lynkr "Lynkr NPM Package - Claude Code Proxy Server")
  [![Homebrew Tap](https://img.shields.io/badge/homebrew-lynkr-brightgreen.svg)](https://github.com/vishalveerareddy123/homebrew-lynkr "Install Lynkr via Homebrew")
  [![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE "Apache 2.0 License - Open Source Claude Code Alternative")
  [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/vishalveerareddy123/Lynkr "Lynkr Documentation on DeepWiki")
  [![Databricks Supported](https://img.shields.io/badge/Databricks-Supported-orange)](https://www.databricks.com/ "Databricks Claude Integration")
+ [![AWS Bedrock](https://img.shields.io/badge/AWS%20Bedrock-100%2B%20Models-FF9900)](https://aws.amazon.com/bedrock/ "AWS Bedrock - 100+ Models")
  [![OpenAI Compatible](https://img.shields.io/badge/OpenAI-Compatible-412991)](https://openai.com/ "OpenAI GPT Integration")
  [![Ollama Compatible](https://img.shields.io/badge/Ollama-Compatible-brightgreen)](https://ollama.ai/ "Local Ollama Model Support")
  [![llama.cpp Compatible](https://img.shields.io/badge/llama.cpp-Compatible-blue)](https://github.com/ggerganov/llama.cpp "llama.cpp GGUF Model Support")
  [![IndexNow Enabled](https://img.shields.io/badge/IndexNow-Enabled-success?style=flat-square)](https://www.indexnow.org/ "SEO Optimized with IndexNow")
  [![DevHunt](https://img.shields.io/badge/DevHunt-Lynkr-orange)](https://devhunt.org/tool/lynkr "Lynkr on DevHunt")
 
- > **Production-ready Claude Code proxy server supporting Databricks, OpenRouter, Ollama & Azure. Features MCP integration, prompt caching & 60-80% token optimization savings.**
+ > **Claude Code proxy server supporting Databricks, AWS Bedrock (100+ models), OpenRouter, Ollama & Azure. Features MCP integration, prompt caching & 60-80% token optimization savings.**
 
  ## 🔖 Keywords
 
- `claude-code` `claude-proxy` `anthropic-api` `databricks-llm` `openrouter-integration` `ollama-local` `llama-cpp` `azure-openai` `azure-anthropic` `mcp-server` `prompt-caching` `token-optimization` `ai-coding-assistant` `llm-proxy` `self-hosted-ai` `git-automation` `code-generation` `developer-tools` `ci-cd-automation` `llm-gateway` `cost-reduction` `multi-provider-llm`
+ `claude-code` `claude-proxy` `anthropic-api` `databricks-llm` `aws-bedrock` `bedrock-models` `deepseek-r1` `qwen3-coder` `openrouter-integration` `ollama-local` `llama-cpp` `azure-openai` `azure-anthropic` `mcp-server` `prompt-caching` `token-optimization` `ai-coding-assistant` `llm-proxy` `self-hosted-ai` `git-automation` `code-generation` `developer-tools` `ci-cd-automation` `llm-gateway` `cost-reduction` `multi-provider-llm`
 
  ---
 
@@ -68,7 +69,7 @@ Claude Code CLI is locked to Anthropic's API, limiting your choice of LLM provid
  ### The Solution
  Lynkr is a **production-ready proxy server** that unlocks Claude Code CLI's full potential:
 
- - ✅ **Any LLM Provider** - [Databricks, OpenRouter (100+ models), Ollama (local), Azure, OpenAI, llama.cpp](#supported-ai-model-providers-databricks-openrouter-ollama-azure-llamacpp)
+ - ✅ **Any LLM Provider** - [Databricks, AWS Bedrock (100+ models), OpenRouter (100+ models), Ollama (local), Azure, OpenAI, llama.cpp](#supported-ai-model-providers-databricks-aws-bedrock-openrouter-ollama-azure-llamacpp)
  - ✅ **60-80% Cost Reduction** - Built-in [token optimization](#token-optimization-implementation) (5 optimization phases implemented)
  - ✅ **Zero Code Changes** - [Drop-in replacement](#connecting-claude-code-cli) for Anthropic backend
  - ✅ **Local & Offline** - Run Claude Code with [Ollama](#using-ollama-models) or [llama.cpp](#using-llamacpp-with-lynkr) (no internet required)
@@ -94,11 +95,17 @@ npm install -g lynkr
  ### 2️⃣ Configure Your Provider
  ```bash
- # Option A: Use local Ollama (free, offline)
+ # Option A: Use AWS Bedrock (100+ models) 🆕
+ export MODEL_PROVIDER=bedrock
+ export AWS_BEDROCK_API_KEY=your-bearer-token
+ export AWS_BEDROCK_REGION=us-east-2
+ export AWS_BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0
+
+ # Option B: Use local Ollama (free, offline)
  export MODEL_PROVIDER=ollama
  export OLLAMA_MODEL=llama3.1:8b
 
- # Option B: Use Databricks (production)
+ # Option C: Use Databricks (production)
  export MODEL_PROVIDER=databricks
  export DATABRICKS_API_BASE=https://your-workspace.databricks.net
  export DATABRICKS_API_KEY=your-api-key
@@ -160,7 +167,7 @@ Further documentation and usage notes are available on [DeepWiki](https://deepwi
  ---
 
- ## Supported AI Model Providers (Databricks, OpenRouter, Ollama, Azure, llama.cpp)
+ ## Supported AI Model Providers (Databricks, AWS Bedrock, OpenRouter, Ollama, Azure, llama.cpp)
 
  Lynkr supports multiple AI model providers, giving you flexibility in choosing the right model for your needs:
 
@@ -169,6 +176,7 @@ Lynkr supports multiple AI model providers, giving you flexibility in choosing t
  | Provider | Configuration | Models Available | Best For |
  |----------|--------------|------------------|----------|
  | **Databricks** (Default) | `MODEL_PROVIDER=databricks` | Claude Sonnet 4.5, Claude Opus 4.5 | Production use, enterprise deployment |
+ | **AWS Bedrock** 🆕 | `MODEL_PROVIDER=bedrock` | 100+ models (Claude, DeepSeek R1, Qwen3, Nova, Titan, Llama, Mistral, etc.) | AWS ecosystem, multi-model flexibility, Claude + alternatives |
  | **OpenAI** | `MODEL_PROVIDER=openai` | GPT-5, GPT-5.2, GPT-4o, GPT-4o-mini, GPT-4-turbo, o1, o1-mini | Direct OpenAI API access |
  | **Azure OpenAI** | `MODEL_PROVIDER=azure-openai` | GPT-5, GPT-5.2, GPT-4o, GPT-4o-mini, o1, o3, Kimi-K2 | Azure integration, Microsoft ecosystem |
  | **Azure Anthropic** | `MODEL_PROVIDER=azure-anthropic` | Claude Sonnet 4.5, Claude Opus 4.5 | Azure-hosted Claude models |
@@ -204,6 +212,44 @@ When using `MODEL_PROVIDER=azure-openai`, you can deploy any of the models in az
  **Note**: Azure OpenAI deployment names are configurable via `AZURE_OPENAI_DEPLOYMENT` environment variable.
 
+ ### **AWS Bedrock Model Catalog (100+ Models)**
+
+ When using `MODEL_PROVIDER=bedrock`, you have access to **100+ models** via AWS Bedrock's unified Converse API:
+
+ #### **🆕 NEW Models (2025-2026)**
+ - **DeepSeek R1** - `us.deepseek.r1-v1:0` - Reasoning model (o1-style)
+ - **Qwen3** - `qwen.qwen3-235b-*`, `qwen.qwen3-coder-480b-*` - Up to 480B parameters!
+ - **OpenAI GPT-OSS** - `openai.gpt-oss-120b-1:0` - Open-weight GPT models
+ - **Google Gemma 3** - `google.gemma-3-27b` - Open-weight model from Google
+ - **MiniMax M2** - `minimax.m2-v1:0` - From Chinese AI company MiniMax
+
+ #### **Claude Models (Best for Tool Calling)**
+ - **Claude 4.5** - `us.anthropic.claude-sonnet-4-5-*` - Best for coding with tools
+ - **Claude 3.5** - `anthropic.claude-3-5-sonnet-*` - Excellent tool calling
+ - **Claude 3 Haiku** - `anthropic.claude-3-haiku-*` - Fast and cost-effective
+
+ #### **Amazon Models**
+ - **Nova** - `us.amazon.nova-pro-v1:0` - Multimodal, 300K context
+ - **Titan** - `amazon.titan-text-express-v1` - General purpose
+
+ #### **Other Major Models**
+ - **Meta Llama** - `meta.llama3-1-70b-*` - Open-source Llama 3.1
+ - **Mistral** - `mistral.mistral-large-*` - Coding, multilingual
+ - **Cohere** - `cohere.command-r-plus-v1:0` - RAG, search
+ - **AI21 Jamba** - `ai21.jamba-1-5-large-v1:0` - 256K context
+
+ #### **Quick Setup**
+ ```bash
+ export MODEL_PROVIDER=bedrock
+ export AWS_BEDROCK_API_KEY=your-bearer-token  # Get from AWS Console → Bedrock → API Keys
+ export AWS_BEDROCK_REGION=us-east-2
+ export AWS_BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-5-20250929-v1:0
+ ```
+
+ 📖 **Full Documentation**: See [BEDROCK_MODELS.md](BEDROCK_MODELS.md) for the complete model catalog, pricing, capabilities, and use cases.
+
+ ⚠️ **Tool Calling Note**: Only **Claude models** support tool calling on Bedrock. Other models work via the Converse API but won't use Read/Write/Bash tools.
+
  ### **Ollama Model Recommendations**
 
  For tool calling support (required for Claude Code CLI functionality):
@@ -241,16 +287,19 @@ FALLBACK_PROVIDER=databricks # or azure-openai, openrouter, azure-anthropic
  ### **Provider Comparison**
 
- | Feature | Databricks | OpenAI | Azure OpenAI | Azure Anthropic | OpenRouter | Ollama | llama.cpp |
- |---------|-----------|--------|--------------|-----------------|------------|--------|-----------|
- | **Setup Complexity** | Medium | Easy | Medium | Medium | Easy | Easy | Medium |
- | **Cost** | $$$ | $$ | $$ | $$$ | $ | Free | Free |
- | **Latency** | Low | Low | Low | Low | Medium | Very Low | Very Low |
- | **Tool Calling** | Excellent | Excellent | Excellent | Excellent | Good | Fair | Good |
- | **Context Length** | 200K | 128K | 128K | 200K | Varies | 32K-128K | Model-dependent |
- | **Streaming** | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
- | **Privacy** | Enterprise | Third-party | Enterprise | Enterprise | Third-party | Local | Local |
- | **Offline** | No | No | No | No | No | Yes | Yes |
+ | Feature | Databricks | AWS Bedrock | OpenAI | Azure OpenAI | Azure Anthropic | OpenRouter | Ollama | llama.cpp |
+ |---------|-----------|-------------|--------|--------------|-----------------|------------|--------|-----------|
+ | **Setup Complexity** | Medium | Easy | Easy | Medium | Medium | Easy | Easy | Medium |
+ | **Cost** | $$$ | $$ | $$ | $$ | $$$ | $ | Free | Free |
+ | **Latency** | Low | Low | Low | Low | Low | Medium | Very Low | Very Low |
+ | **Model Variety** | 2 | 100+ | 10+ | 10+ | 2 | 100+ | 50+ | Unlimited |
+ | **Tool Calling** | Excellent | Excellent* | Excellent | Excellent | Excellent | Good | Fair | Good |
+ | **Context Length** | 200K | Up to 300K | 128K | 128K | 200K | Varies | 32K-128K | Model-dependent |
+ | **Streaming** | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
+ | **Privacy** | Enterprise | Enterprise | Third-party | Enterprise | Enterprise | Third-party | Local | Local |
+ | **Offline** | No | No | No | No | No | No | Yes | Yes |
+
+ _* Tool calling only supported by Claude models on Bedrock_
 
  ---
 
@@ -0,0 +1,173 @@
+ # Comparison: claude-code-router vs Lynkr Proxy
+
+ ## Architecture Differences
+
+ **claude-code-router:**
+ - **CLI-first design** - `ccr` commands for interactive model switching
+ - **Request interceptor** - Sits between Claude Code CLI and LLM providers
+ - **Transformer pipeline** - Middleware system for request/response modification
+ - **Built with Fastify** (web framework)
+ - **TypeScript + esbuild** compilation
+ - **Web UI** for configuration
+
+ **Lynkr:**
+ - **HTTP proxy server** - Express-based API endpoint
+ - **Provider abstraction** - Unified interface for 7+ providers
+ - **Long-term memory system** (Titans-inspired)
+ - **Built with Express** (web framework)
+ - **Pure JavaScript** (no compilation)
+ - **Token optimization focus** (6 optimization phases)
+
+ ---
+
+ ## Key Feature Comparison
+
+ | Feature | claude-code-router | Lynkr | Winner |
+ |---------|-------------------|-------|--------|
+ | **Dynamic Model Switching** | ✅ Runtime `/model` command | ❌ Static .env config | 🏆 Router |
+ | **Routing Logic** | ✅ Context-aware (think/background/long-context) | ❌ Simple provider fallback only | 🏆 Router |
+ | **Custom Router Scripts** | ✅ JavaScript-based routing rules | ❌ No custom routing | 🏆 Router |
+ | **Web UI** | ✅ `ccr ui` browser interface | ❌ No UI | 🏆 Router |
+ | **Long-Term Memory** | ❌ None | ✅ Vector search + surprise scoring | 🏆 Lynkr |
+ | **Token Optimization** | ⚠️ Basic (long-context detection) | ✅ 6 phases (smart tools, compression, etc.) | 🏆 Lynkr |
+ | **Smart Tool Selection** | ❌ None | ✅ Heuristic-based (just implemented) | 🏆 Lynkr |
+ | **History Compression** | ❌ None | ✅ Automatic + token budget enforcement | 🏆 Lynkr |
+ | **Prompt Caching** | ✅ Via transformer | ✅ Built-in | 🟰 Tie |
+ | **Provider Count** | 6 (OpenRouter, DeepSeek, Ollama, Gemini, etc.) | 7 (Databricks, Azure, OpenAI, OpenRouter, Ollama, llama.cpp) | 🟰 Tie |
+ | **Tool Enhancement** | ✅ `enhancetool` transformer | ❌ Basic passthrough | 🏆 Router |
+ | **GitHub Actions** | ✅ CI/CD integration | ❌ None | 🏆 Router |
+ | **Logging** | ✅ Rotating file logs | ✅ Pino logger | 🟰 Tie |
+ | **TypeScript** | ✅ Full TypeScript | ❌ JavaScript only | 🏆 Router |
+
+ ---
+
+ ## Improvements for Lynkr (Ranked by Impact)
+
+ ### 🔴 **Critical - High Impact, High Value**
+
+ #### 1. Dynamic Model Switching via `/model` Command
+ - **What**: Allow users to switch models mid-conversation without restarting the server
+ - **Why**: Router's killer feature - flexibility without configuration edits
+ - **Implementation**: Add a chat command parser and session-level model overrides
+ - **Effort**: Medium (2-3 days)
+
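As a rough illustration of the command-parser idea, the snippet below detects a `/model provider,model` message and turns it into a session-level override. The function name, command syntax, and override shape are assumptions for this sketch, not Lynkr's actual API.

```javascript
// Hypothetical sketch: detect a "/model <provider>,<model>" chat command and
// return a session override, or null to forward the message unchanged.
function parseModelCommand(text) {
  const match = /^\/model\s+(\S+)$/.exec(text.trim());
  if (!match) return null; // not a command
  const [provider, model] = match[1].split(',');
  return { provider, model: model || null };
}

console.log(parseModelCommand('/model openrouter,deepseek/deepseek-chat'));
// → { provider: 'openrouter', model: 'deepseek/deepseek-chat' }
```

A real implementation would also persist the override per session and strip the command from the history before it reaches the provider.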
+ #### 2. Context-Aware Routing (Background/Think/Long-Context)
+ - **What**: Automatically route requests based on context type
+ - **Why**: Cost optimization + performance (cheap models for background, reasoning models for planning)
+ - **Example**:
+   - Background tasks → `gpt-4o-mini` ($0.15/1M)
+   - Planning/thinking → `o1-preview` (reasoning model)
+   - Long context (>60k tokens) → `claude-sonnet-4` (200k context)
+ - **Effort**: Medium (3-4 days)
+
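The routing rules above could be sketched roughly as follows. The request fields (`tokenCount`, `isBackground`, `isThinking`) and the model choices are illustrative assumptions, not Lynkr's real interface:

```javascript
// Illustrative sketch of context-aware routing: long context first (it is
// the hardest constraint), then task type, then fall through to the default.
function routeByContext({ tokenCount, isBackground, isThinking }) {
  if (tokenCount > 60000) return 'claude-sonnet-4'; // long context → 200k window
  if (isThinking) return 'o1-preview';              // planning → reasoning model
  if (isBackground) return 'gpt-4o-mini';           // background → cheap model
  return 'default';                                 // configured provider
}

console.log(routeByContext({ tokenCount: 120000, isBackground: false, isThinking: false }));
// → claude-sonnet-4
```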
+ #### 3. Custom Router Scripts (JavaScript-based)
+ - **What**: Let users define routing logic in JavaScript
+ - **Why**: Ultimate flexibility - enterprise users need custom rules
+ - **Example**:
+ ```javascript
+ // router.js
+ module.exports = function (request) {
+   if ((request.tools || []).length > 5) return 'gpt-4o'; // Complex task
+   if ((request.content || '').includes('urgent')) return 'databricks'; // Fast provider
+   return 'openrouter/nova-lite'; // Default cheap option
+ };
+ ```
+ - **Effort**: High (5-7 days)
+
+ #### 4. Web UI for Configuration
+ - **What**: Browser-based interface at `http://localhost:8081/ui`
+ - **Why**: Non-technical users can't edit .env files
+ - **Features**: Model selection, provider config, logs viewer, cost tracking
+ - **Effort**: High (7-10 days)
+
+ ### 🟡 **High Impact, Medium Complexity**
+
+ #### 5. Tool Enhancement Transformer
+ - **What**: Add error tolerance and response buffering to tool calls
+ - **Why**: Prevents cascade failures when tools return malformed JSON
+ - **Example**: Retry tool calls with exponential backoff, validate tool outputs
+ - **Effort**: Low (1-2 days)
+
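The retry-with-backoff idea in #5 can be sketched as a small helper. The helper name, delay schedule, and the serializability check are assumptions for illustration, not Lynkr's implementation:

```javascript
// Illustrative sketch: retry a tool call with exponential backoff
// (100ms, 200ms, 400ms, ...) and reject outputs that aren't valid JSON.
async function callToolWithRetry(toolFn, args, maxAttempts = 3) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const result = await toolFn(args);
      JSON.parse(JSON.stringify(result)); // throws on non-serializable output
      return result;
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // out of retries
      const delayMs = 100 * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

In a real transformer this would wrap the provider's tool-execution step, so one malformed response doesn't abort the whole agent turn.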
+ #### 6. Request/Response Transformer Pipeline
+ - **What**: Middleware system to modify requests/responses per provider
+ - **Why**: Provider-specific quirks (Azure needs a different format, Ollama strips thinking blocks)
+ - **Current**: Hardcoded in client adapters
+ - **Improved**: Pluggable transformer chain
+ - **Effort**: Medium (3-4 days)
+
+ #### 7. Token-Based Auto-Routing
+ - **What**: Switch to high-context models when input exceeds a threshold
+ - **Why**: Prevent truncation errors, automatic upgrade
+ - **Example**: Request >100k tokens → auto-switch from `gpt-4o` (128k) to `claude-sonnet-4` (200k)
+ - **Effort**: Low (1-2 days) - you already have token counting
+
+ ### 🟢 **Nice to Have - Lower Priority**
+
+ #### 8. GitHub Actions Integration
+ - **What**: Trigger Claude Code workflows in CI/CD
+ - **Why**: Automated code reviews, documentation generation
+ - **Use Case**: PR opens → Claude reviews code → posts comments
+ - **Effort**: Medium (3-4 days)
+
+ #### 9. CLI Commands (`lynkr model`, `lynkr ui`)
+ - **What**: Interactive terminal commands for management
+ - **Why**: Better DX than editing .env and restarting
+ - **Effort**: Medium (2-3 days)
+
+ #### 10. Rotating File Logs
+ - **What**: Auto-rotate logs by size/date (keep the last 7 days)
+ - **Why**: Prevent disk bloat in production
+ - **Current**: Pino logs to stdout only
+ - **Effort**: Low (1 day) - use `pino-rotating-file-stream`
+
+ #### 11. LRU Caching for Responses
+ - **What**: Cache identical requests for X minutes
+ - **Why**: Save money on repeated queries
+ - **Example**: User asks "what is 2+2?" 3 times → only 1 LLM call
+ - **Effort**: Low (1-2 days)
+
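A TTL-bounded LRU cache like the one #11 describes fits in a few lines of plain JavaScript, since `Map` iterates in insertion order (so the first key is always the least recently used). The class name, capacity, and TTL below are illustrative assumptions:

```javascript
// Illustrative sketch of an LRU response cache with a per-entry TTL.
class ResponseCache {
  constructor(maxEntries = 100, ttlMs = 5 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.map = new Map(); // key → { value, at }
  }
  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() - entry.at > this.ttlMs) { // expired
      this.map.delete(key);
      return undefined;
    }
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, entry);
    return entry.value;
  }
  set(key, value) {
    this.map.delete(key); // refresh position if key already present
    if (this.map.size >= this.maxEntries) {
      this.map.delete(this.map.keys().next().value); // evict least recently used
    }
    this.map.set(key, { value, at: Date.now() });
  }
}
```

Keyed on a hash of the normalized request body, this is enough to collapse the repeated "what is 2+2?" example into a single LLM call.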
+ #### 12. TypeScript Migration
+ - **What**: Convert the codebase to TypeScript
+ - **Why**: Type safety, better IDE support, fewer runtime errors
+ - **Effort**: Very High (15-20 days) - 87 files to convert
+
+ ---
+
+ ## Unique Strengths of Lynkr (Don't Lose These!)
+
+ 1. **Long-term memory system** - Router doesn't have this
+ 2. **Smart tool selection** - Just implemented, very valuable
+ 3. **6-phase token optimization** - Industry-leading
+ 4. **History compression** - Automatic context management
+ 5. **7 providers** - Broader support than Router
+ 6. **Hybrid routing** - Ollama + cloud fallback
+
+ ---
+
+ ## Recommended Implementation Order
+
+ ### Phase 1: Quick Wins (1-2 weeks)
+ 1. Token-based auto-routing
+ 2. Tool enhancement transformer
+ 3. Rotating file logs
+ 4. LRU caching
+
+ ### Phase 2: Game Changers (3-4 weeks)
+ 5. Dynamic model switching via `/model` command
+ 6. Context-aware routing (background/think/long-context)
+ 7. Request/response transformer pipeline
+
+ ### Phase 3: Enterprise Features (4-6 weeks)
+ 8. Custom router scripts
+ 9. Web UI
+ 10. CLI commands
+ 11. GitHub Actions integration
+
+ ### Phase 4: Long-Term (Optional)
+ 12. TypeScript migration
+
+ ---
+
+ ## Bottom Line
+
+ Router excels at **flexibility and user experience** (dynamic switching, routing logic, Web UI). Lynkr excels at **optimization and intelligence** (memory, token optimization, smart tools). Merging the best of both would create the ultimate Claude Code proxy.