npm - lynkr - Versions diffs - 9.1.9 → 9.2.1 - Mend

lynkr 9.1.9 → 9.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/README.md +428 -208
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -6,7 +6,6 @@
 [![Tests](https://img.shields.io/badge/tests-699%20passing-brightgreen)](https://github.com/Fast-Editor/Lynkr)
 [![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
 [![Node.js](https://img.shields.io/badge/node-20%2B-green)](https://nodejs.org)
-[![Homebrew Tap](https://img.shields.io/badge/homebrew-lynkr-brightgreen.svg)](https://github.com/vishalveerareddy123/homebrew-lynkr)
 [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/vishalveerareddy123/Lynkr)
 <table>
@@ -20,281 +19,481 @@
 ---
-## The Problem
+## Quick Start (2 Minutes)
-AI coding tools lock you into one provider. Claude Code requires Anthropic. Codex requires OpenAI. You can't use your company's Databricks endpoint, your local Ollama models, or your AWS Bedrock account — at least, not without Lynkr.
+### 1. Install Lynkr
-**The real costs:**
-- Anthropic API at $15/MTok output adds up fast for daily coding
-- No way to use free local models (Ollama, llama.cpp) with Claude Code
-- Enterprise teams can't route through their own cloud infrastructure
-- Provider outages take your entire workflow down
+```bash
+npm install -g lynkr
+```
-## The Solution
+### 2. Configure Lynkr
-Lynkr is a self-hosted proxy that sits between your AI coding tools and any LLM provider. One environment variable change, and your tools work with any model.
+First run creates a `.env` file. Edit it with your provider settings.
-```
-Claude Code / Cursor / Codex / Cline / Continue / Vercel AI SDK
-                        |
-                      Lynkr
-                        |
-    Ollama | Bedrock | Databricks | OpenRouter | Azure | OpenAI | llama.cpp
-```
+**Option A: Free & Local (Ollama) - Recommended for Testing**
 ```bash
-# That's it. Three lines.
-npm install -g lynkr
-export ANTHROPIC_BASE_URL=http://localhost:8081
-lynkr start
+# Install Ollama first: https://ollama.com
+ollama pull qwen2.5-coder:latest
 ```
----
+Create/edit `.env` in your project directory:
+```bash
+# Provider
+MODEL_PROVIDER=ollama
+FALLBACK_ENABLED=false
-## Quick Start
+# Ollama Configuration
+OLLAMA_ENDPOINT=http://localhost:11434
+OLLAMA_MODEL=qwen2.5-coder:latest
-### Install
+# Server
+PORT=8081
+# Increase limits for Claude Code/Cursor
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+# Disable overly strict command filtering
+POLICY_SAFE_COMMANDS_ENABLED=false
+```
+**Option B: Cloud (OpenRouter) - Recommended for Production**
-**One-line install (recommended):**
 ```bash
-curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
+# Get API key from https://openrouter.ai
 ```
-**Or via npm:**
+Create/edit `.env`:
 ```bash
-npm install -g pino-pretty && npm install -g lynkr
+# Provider
+MODEL_PROVIDER=openrouter
+OPENROUTER_API_KEY=sk-or-v1-your-key-here
+FALLBACK_ENABLED=false
+# Server
+PORT=8081
+# Increase limits
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+# Optional: Enable caching
+PROMPT_CACHE_ENABLED=true
+SEMANTIC_CACHE_ENABLED=true
 ```
-### Pick a Provider
+**Option C: Enterprise (AWS Bedrock)**
-**Free & Local (Ollama)**
+Create/edit `.env`:
 ```bash
-export MODEL_PROVIDER=ollama
-export OLLAMA_MODEL=qwen2.5-coder:latest
-lynkr start
+# Provider
+MODEL_PROVIDER=bedrock
+AWS_BEDROCK_API_KEY=your-aws-key
+AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
+FALLBACK_ENABLED=false
+# Server
+PORT=8081
+# Increase limits
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
 ```
-**AWS Bedrock (100+ models)**
+**Option D: Enterprise (Databricks)**
+Create/edit `.env`:
 ```bash
-export MODEL_PROVIDER=bedrock
-export AWS_BEDROCK_API_KEY=your-key
-export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0
-lynkr start
+# Provider
+MODEL_PROVIDER=databricks
+DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
+DATABRICKS_API_KEY=your-token
+FALLBACK_ENABLED=false
+# Server
+PORT=8081
+# Increase limits
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
 ```
-**OpenRouter (cheapest cloud)**
+Then start Lynkr:
 ```bash
-export MODEL_PROVIDER=openrouter
-export OPENROUTER_API_KEY=sk-or-v1-your-key
 lynkr start
 ```
-### Connect Your Tool
+### 3. Connect Your Tool
 **Claude Code**
 ```bash
 export ANTHROPIC_BASE_URL=http://localhost:8081
 export ANTHROPIC_API_KEY=dummy
-claude "Your prompt here"
+claude "write a hello world in python"
 ```
-**Codex CLI** — edit `~/.codex/config.toml`:
+**Cursor IDE**
+- Settings → Models → Override Base URL
+- Set to: `http://localhost:8081/v1`
+- API Key: `any-value`
+**Codex CLI**
+Edit `~/.codex/config.toml`:
 ```toml
 model_provider = "lynkr"
-model = "gpt-4o"
 [model_providers.lynkr]
-name = "Lynkr Proxy"
 base_url = "http://localhost:8081/v1"
 wire_api = "responses"
 ```
-**Cursor IDE**
-- Settings > Features > Models
-- Base URL: `http://localhost:8081/v1`
-- API Key: `sk-lynkr`
-**Vercel AI SDK**
-```ts
-import { generateText } from "ai";
-import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
-const lynkr = createOpenAICompatible({
-  baseURL: "http://localhost:8081/v1",
-  name: "lynkr",
-  apiKey: "sk-lynkr",
-});
-const { text } = await generateText({
-  model: lynkr.chatModel("auto"),
-  prompt: "Hello!",
-});
-```
+✅ **Done!** Your AI tool now uses your chosen provider.
+---
+## Why Lynkr?
+AI coding tools lock you into one provider. Lynkr breaks that lock.
-**OpenClaw**
-```json
-// openclaw.json
-{
-  "models": {
-    "providers": [{
-      "name": "lynkr",
-      "type": "openai-compatible",
-      "base_url": "http://localhost:8081/v1",
-      "api_key": "any-value",
-      "models": ["auto"]
-    }]
-  }
-}
 ```
-Set `OPENCLAW_MODE=true` in Lynkr's `.env` to show actual provider/model in responses.
+Claude Code / Cursor / Codex / Cline / Continue
+                    ↓
+                  Lynkr
+                    ↓
+    Ollama | Bedrock | Azure | OpenRouter | OpenAI
+```
-> Works with any OpenAI-compatible client: Cline, Continue.dev, OpenClaw, KiloCode, and more.
+**What you get:**
+- ✅ Use **free local models** (Ollama, llama.cpp) with Claude Code
+- ✅ Route through **your company's infrastructure** (Databricks, Azure, Bedrock)
+- ✅ Cut costs **60-80%** with smart token optimization
+- ✅ **Zero code changes** - just change one environment variable
 ---
 ## Supported Providers
-| Provider | Type | Models | Cost |
-|----------|------|--------|------|
-| **Ollama** | Local | Unlimited (free, offline) | **Free** |
+| Provider | Type | Example Models | Cost |
+|----------|------|---------------|------|
+| **Ollama** | Local | qwen2.5-coder, deepseek-coder, llama3 | **Free** |
 | **llama.cpp** | Local | Any GGUF model | **Free** |
 | **LM Studio** | Local | Local models with GUI | **Free** |
-| **MLX Server** | Local | Apple Silicon optimized | **Free** |
-| **AWS Bedrock** | Cloud | 100+ (Claude, Llama, Mistral, Titan) | $$ |
-| **OpenRouter** | Cloud | 100+ (GPT, Claude, Llama, Gemini) | $-$$ |
+| **OpenRouter** | Cloud | GPT-4o, Claude 3.5, Llama 3, Gemini | $ |
+| **AWS Bedrock** | Cloud | Claude, Llama, Mistral, Titan | $$ |
 | **Databricks** | Cloud | Claude Sonnet 4.5, Opus 4.6 | $$$ |
 | **Azure OpenAI** | Cloud | GPT-4o, o1, o3 | $$$ |
-| **Azure Anthropic** | Cloud | Claude models | $$$ |
-| **OpenAI** | Cloud | GPT-4o, o3, o4-mini | $$$ |
-| **Google Vertex** | Cloud | Gemini 2.5 Pro/Flash | $$$ |
-| **Moonshot AI** | Cloud | Kimi K2 Thinking/Turbo | $$ |
-| **Z.AI** | Cloud | GLM-4.7 | $$ |
-| **DeepSeek** | Cloud | DeepSeek Reasoner, R1 | $ |
+| **Azure Anthropic** | Cloud | Claude Sonnet, Opus | $$$ |
+| **OpenAI** | Cloud | GPT-4o, o3-mini | $$$ |
+| **DeepSeek** | Cloud | DeepSeek R1, Reasoner | $ |
-4 local providers for **100% offline, free** usage. 10+ cloud providers for scale.
+**4 local providers** for 100% offline, free usage. **10+ cloud providers** for scale.
 ---
-## Why Lynkr Over Alternatives
-| Feature | Lynkr | LiteLLM (42K stars) | OpenRouter | PortKey |
-|---------|-------|---------------------|------------|---------|
-| **Setup** | `npm install -g lynkr` | Python + Docker + Postgres | Account signup | Docker + config |
-| **Claude Code support** | Drop-in, native | Requires config | No CLI support | Requires config |
-| **Cursor support** | Drop-in, native | Partial | Via API key | Partial |
-| **Codex CLI support** | Drop-in, native | No | No | No |
-| **Built for coding tools** | Yes (purpose-built) | No (general gateway) | No (general API) | No (general gateway) |
-| **Local models** | Ollama, llama.cpp, LM Studio, MLX | Ollama only | No | No |
-| **Token optimization** | Built-in (60-80% savings) | No | No | Caching only |
-| **Complexity routing** | Auto-routes by task difficulty | Manual | Cost/latency only | Manual |
-| **Memory system** | Titans-inspired long-term memory | No | No | No |
-| **Self-hosted** | Yes (Node.js) | Yes (Python stack) | No (SaaS) | Yes (Docker) |
-| **Offline capable** | Yes | Yes | No | No |
-| **Transaction fees** | None | None (OSS) / Paid enterprise | 5.5% on credits | Free tier / Paid |
-| **Dependencies** | Node.js only | Python, Prisma, PostgreSQL | N/A | Docker, Python |
-| **Format conversion** | Anthropic <-> OpenAI (automatic) | Automatic | N/A | Automatic |
-| **Code intelligence** | Graphify (19-lang AST graph) | No | No | No |
-| **Routing telemetry** | Built-in (SQLite + REST API) | No | Dashboard | Dashboard |
-| **Admin hot-reload** | Yes (no restart) | Requires restart | N/A | Requires restart |
-| **License** | Apache 2.0 | MIT | Proprietary | MIT (gateway) |
-**Lynkr's edge:** Purpose-built for AI coding tools. Not a general LLM gateway — a proxy that understands Claude Code, Cursor, and Codex natively, with built-in token optimization, complexity-based routing, and a memory system designed for coding workflows. Installs in one command, runs on Node.js, zero infrastructure required.
+## Advanced: Tier Routing (Save Even More)
----
+Route different request types to different models automatically:
-## Cost Comparison
+```bash
+# .env file
+MODEL_PROVIDER=ollama
+FALLBACK_ENABLED=false
+# Use small/fast models for simple tasks
+TIER_SIMPLE=ollama:qwen2.5:3b
+# Use medium models for normal coding
+TIER_MEDIUM=ollama:qwen2.5:7b
-| Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter | Lynkr + Bedrock |
-|----------|-----------------|----------------|--------------------| --------------- |
-| Daily Claude Code usage | ~$10-30/day | **$0 (free)** | ~$2-8/day | ~$5-15/day |
-| Token optimization savings | — | — | 60-80% further | 60-80% further |
-| Monthly (heavy use) | $300-900 | **$0** | $60-240 | $150-450 |
+# Use powerful models for complex architecture
+TIER_COMPLEX=ollama:deepseek-r1:14b
+TIER_REASONING=ollama:deepseek-r1:14b
-> With token optimization enabled, Lynkr's smart tool selection, prompt caching, and memory deduplication reduce token usage by 60-80% on top of provider savings.
+# Increase limits for long conversations
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+```
+Lynkr analyzes each request and routes it to the appropriate tier. Simple questions use fast models. Complex refactoring uses powerful models.
+**Result:** 70-90% of requests use cheaper/faster models. Only hard problems hit expensive models.
 ---
-## What's Under the Hood
+## Complete .env Examples
-Lynkr isn't just a passthrough proxy. It's an optimization layer.
+### MVP: Minimal Working Setup (Ollama)
-### Smart Routing (5-Phase)
-Routes requests to the right model based on 5-phase complexity analysis. Simple questions go to fast/cheap models. Complex architectural tasks go to powerful models. Includes Graphify structural analysis for code-aware routing.
+Copy-paste ready configuration for immediate use:
-- **Complexity scoring** — 15-dimension weighted scoring with agentic workflow detection
-- **Graphify integration** — AST-based knowledge graph detects god nodes, community cohesion, blast radius across 19 languages
-- **Routing telemetry** — every decision recorded with quality scoring (0-100) and latency tracking (P50/P95/P99)
+```bash
+# .env - Minimal Ollama Setup
+# ============================================
+# REQUIRED: Provider Configuration
+# ============================================
+MODEL_PROVIDER=ollama
+FALLBACK_ENABLED=false
+# ============================================
+# REQUIRED: Ollama Settings
+# ============================================
+OLLAMA_ENDPOINT=http://localhost:11434
+OLLAMA_MODEL=qwen2.5-coder:latest
+# ============================================
+# REQUIRED: Server Configuration
+# ============================================
+PORT=8081
+HOST=0.0.0.0
+# ============================================
+# REQUIRED: Claude Code/Cursor Compatibility
+# ============================================
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+POLICY_SAFE_COMMANDS_ENABLED=false
+# ============================================
+# OPTIONAL: Performance (Recommended)
+# ============================================
+LOG_LEVEL=warn
+LOAD_SHEDDING_ENABLED=true
+LOAD_SHEDDING_HEAP_THRESHOLD=0.85
+```
-### Token Optimization (8 Phases)
-- **MCP Code Mode** — replaces 100+ MCP tool schemas with 4 meta-tools (~96% reduction, lazy tool discovery)
-- **Smart tool selection** — only sends tools relevant to the current task (50-70% reduction)
-- **Prompt caching** — SHA-256 keyed LRU cache (30-45% reduction on repeated prompts)
-- **Memory deduplication** — eliminates repeated information across turns (20-30% reduction)
-- **Tool response truncation** — intelligent truncation of long outputs (15-25% reduction)
-- **Dynamic system prompts** — adapt complexity to request type (10-20% reduction)
-- **Distill compression** — structural similarity, delta rendering, smart dedup of repetitive tool outputs (20-40% reduction)
-- **Headroom sidecar** — optional ML-based compression: Smart Crusher, CCR, LLMLingua (47-92% reduction)
+**Steps:**
+1. Install Ollama: `curl -fsSL https://ollama.com/install.sh | sh`
+2. Pull model: `ollama pull qwen2.5-coder:latest`
+3. Copy above to `.env` in your project directory
+4. Run: `lynkr start`
-### Enterprise Resilience
-- **Circuit breakers** — automatic failover with half-open probe recovery
-- **Admin hot-reload** — `POST /v1/admin/reload` reloads config + resets circuit breakers without restart
-- **Load shedding** — graceful degradation under high load
-- **Prometheus metrics** — full observability at `/metrics`
-- **Health checks** — K8s-ready endpoints at `/health`
-- **Performance timer** — per-request timing breakdown with `PERF_TIMER=true`
+---
-### Memory System
-Titans-inspired long-term memory with surprise-based filtering. The system remembers important context across sessions and forgets noise — reducing token waste from repeated context.
+### Production: Cloud with Tier Routing (OpenRouter)
-### Semantic Cache
-Cache responses for semantically similar prompts. Hit rate depends on your workflow, but repeat questions (common in coding) get instant responses.
+Optimized for cost savings with smart routing:
 ```bash
+# .env - Production OpenRouter Setup
+# ============================================
+# REQUIRED: Provider Configuration
+# ============================================
+MODEL_PROVIDER=openrouter
+OPENROUTER_API_KEY=sk-or-v1-your-key-here
+FALLBACK_ENABLED=false
+# ============================================
+# REQUIRED: Server Configuration
+# ============================================
+PORT=8081
+HOST=0.0.0.0
+# ============================================
+# TIER ROUTING: Smart Cost Optimization
+# ============================================
+# Simple queries → Cheap/fast model
+TIER_SIMPLE=openrouter:google/gemini-flash-1.5
+# Normal coding → Balanced model
+TIER_MEDIUM=openrouter:anthropic/claude-3.5-sonnet
+# Complex refactoring → Powerful model
+TIER_COMPLEX=openrouter:anthropic/claude-opus-4
+# Deep reasoning → Most capable model
+TIER_REASONING=openrouter:anthropic/claude-opus-4
+# ============================================
+# REQUIRED: Claude Code/Cursor Compatibility
+# ============================================
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+POLICY_SAFE_COMMANDS_ENABLED=false
+# ============================================
+# OPTIONAL: Token Optimization (60-80% savings)
+# ============================================
+PROMPT_CACHE_ENABLED=true
 SEMANTIC_CACHE_ENABLED=true
 SEMANTIC_CACHE_THRESHOLD=0.95
+TOOL_INJECTION_ENABLED=false
+# ============================================
+# OPTIONAL: Performance Tuning
+# ============================================
+LOG_LEVEL=warn
+LOAD_SHEDDING_ENABLED=true
+LOAD_SHEDDING_HEAP_THRESHOLD=0.85
+```
+**Expected savings:** 70-90% of requests use Gemini Flash ($). Only 10-30% use Claude Opus ($$$).
+---
+### Enterprise: Databricks Foundation Models
+For teams using Databricks Model Serving:
+```bash
+# .env - Enterprise Databricks Setup
+# ============================================
+# REQUIRED: Provider Configuration
+# ============================================
+MODEL_PROVIDER=databricks
+DATABRICKS_API_BASE=https://your-workspace.cloud.databricks.com
+DATABRICKS_API_KEY=dapi1234567890abcdef
+FALLBACK_ENABLED=false
+# ============================================
+# REQUIRED: Model Configuration
+# ============================================
+# Option 1: Single model (no tier routing)
+DATABRICKS_MODEL=databricks-meta-llama-3-1-405b-instruct
+# Option 2: Tier routing (comment out above, uncomment below)
+# TIER_SIMPLE=databricks:databricks-meta-llama-3-1-70b-instruct
+# TIER_MEDIUM=databricks:databricks-claude-sonnet-4-5
+# TIER_COMPLEX=databricks:databricks-claude-opus-4-6
+# TIER_REASONING=databricks:databricks-claude-opus-4-6
+# ============================================
+# REQUIRED: Server Configuration
+# ============================================
+PORT=8081
+HOST=0.0.0.0
+# ============================================
+# REQUIRED: Claude Code/Cursor Compatibility
+# ============================================
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+POLICY_SAFE_COMMANDS_ENABLED=false
+# ============================================
+# OPTIONAL: Enterprise Features
+# ============================================
+LOG_LEVEL=info
+LOAD_SHEDDING_ENABLED=true
+LOAD_SHEDDING_HEAP_THRESHOLD=0.85
+# Optional: Metrics for monitoring
+# PROMETHEUS_METRICS_ENABLED=true
 ```
-### MCP Integration + Code Mode
-Automatic Model Context Protocol server discovery and orchestration. Your MCP tools work through Lynkr without configuration.
+---
+### Hybrid: Local + Cloud Fallback
-**MCP Code Mode** — Token optimization for heavy MCP setups:
-- Replaces 100+ individual MCP tool schemas with 4 meta-tools
-- Reduces tool catalog from ~17,500 tokens to ~700 tokens (**96% reduction**)
-- Enables lazy tool discovery: model queries `mcp_list_tools`, then `mcp_tool_info`, then `mcp_execute`
-- Best for: 50+ MCP tools, long conversations, context-constrained setups
-- Trade-off: 3 sequential calls instead of 1 (adds ~2-3s latency)
+Use free Ollama, fallback to cloud when needed:
 ```bash
-CODE_MODE_ENABLED=true          # Enable Code Mode
-CODE_MODE_CACHE_TTL=60000       # Tool list cache TTL (ms)
+# .env - Hybrid Setup (Advanced)
+# ============================================
+# PRIMARY: Local Ollama
+# ============================================
+MODEL_PROVIDER=ollama
+OLLAMA_ENDPOINT=http://localhost:11434
+OLLAMA_MODEL=qwen2.5-coder:latest
+# ============================================
+# FALLBACK: Cloud Provider
+# ============================================
+FALLBACK_ENABLED=true
+FALLBACK_PROVIDER=openrouter
+OPENROUTER_API_KEY=sk-or-v1-your-key-here
+# ============================================
+# TIER ROUTING: Mix Local + Cloud
+# ============================================
+TIER_SIMPLE=ollama:qwen2.5:3b
+TIER_MEDIUM=ollama:qwen2.5:7b
+TIER_COMPLEX=openrouter:anthropic/claude-3.5-sonnet
+TIER_REASONING=openrouter:anthropic/claude-opus-4
+# ============================================
+# REQUIRED: Server Configuration
+# ============================================
+PORT=8081
+HOST=0.0.0.0
+# ============================================
+# REQUIRED: Claude Code/Cursor Compatibility
+# ============================================
+POLICY_MAX_STEPS=50
+POLICY_MAX_TOOL_CALLS=100
+POLICY_SAFE_COMMANDS_ENABLED=false
+# ============================================
+# OPTIONAL: Performance
+# ============================================
+LOG_LEVEL=warn
+LOAD_SHEDDING_ENABLED=true
 ```
-See [Token Optimization Guide](documentation/token-optimization.md#phase-0-mcp-code-mode-96-reduction-for-mcp-tools) and [Tools Documentation](documentation/tools.md#mcp-code-mode-token-optimization) for details.
+**Best of both worlds:** 80% of requests stay local (free). Complex tasks use cloud (paid).
 ---
-## Deployment Options
+## Common Issues & Fixes
+| Issue | Solution |
+|-------|----------|
+| **"Service temporarily overloaded"** | Ollama model too large for RAM. Use smaller model or increase `--max-old-space-size` |
+| **"Route not found: HEAD /"** | Ignore - harmless health check from Claude Code |
+| **"Hallucinated tool calls"** | Normal - Lynkr automatically filters invalid tools |
+| **"Safe Command DSL blocked"** | Add `POLICY_SAFE_COMMANDS_ENABLED=false` to `.env` |
+| **Slow first request (20+ sec)** | Ollama loading model into memory. Add `OLLAMA_KEEP_ALIVE=30m` in Ollama config |
+| **No response after N turns** | Increase limits: `POLICY_MAX_STEPS=50` and `POLICY_MAX_TOOL_CALLS=100` |
-**One-line install (recommended)**
+---
+## Advanced Features
+### Token Optimization (60-80% savings)
 ```bash
-curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
+# Enable all optimizations
+PROMPT_CACHE_ENABLED=true
+SEMANTIC_CACHE_ENABLED=true
+TOOL_INJECTION_ENABLED=false
+CODE_MODE_ENABLED=true
 ```
-**NPM**
+### Memory System (Titans-inspired)
 ```bash
-npm install -g lynkr && lynkr start
+MEMORY_ENABLED=true
+MEMORY_TTL=3600000  # 1 hour
 ```
-**Docker**
+### Load Shedding & Resilience
 ```bash
-docker-compose up -d
+LOAD_SHEDDING_ENABLED=true
+LOAD_SHEDDING_HEAP_THRESHOLD=0.85
 ```
-**Git Clone**
+### Admin Hot-Reload (no restart needed)
 ```bash
-git clone https://github.com/Fast-Editor/Lynkr.git
-cd Lynkr && npm install && cp .env.example .env
-npm start
+curl -X POST http://localhost:8081/v1/admin/reload
+```
+---
+## Installation Methods
+**NPM (recommended)**
+```bash
+npm install -g lynkr
+```
+**One-line installer**
+```bash
+curl -fsSL https://raw.githubusercontent.com/Fast-Editor/Lynkr/main/install.sh | bash
 ```
 **Homebrew**
@@ -303,6 +502,22 @@ brew tap vishalveerareddy123/lynkr
 brew install lynkr
 ```
+**Docker**
+```bash
+git clone https://github.com/Fast-Editor/Lynkr.git
+cd Lynkr
+docker-compose up -d
+```
+**From source**
+```bash
+git clone https://github.com/Fast-Editor/Lynkr.git
+cd Lynkr
+npm install
+cp .env.example .env
+npm start
+```
 ---
 ## Documentation
@@ -310,54 +525,59 @@ brew install lynkr
 | Guide | Description |
 |-------|-------------|
 | [Installation](documentation/installation.md) | All installation methods |
-| [Provider Config](documentation/providers.md) | Setup for all 12+ providers |
-| [Claude Code CLI](documentation/claude-code-cli.md) | Detailed Claude Code integration |
-| [Codex CLI](documentation/codex-cli.md) | Codex config.toml setup |
-| [OpenClaw](documentation/openclaw-integration.md) | OpenClaw integration with tier routing |
-| [Cursor IDE](documentation/cursor-integration.md) | Cursor integration + troubleshooting |
-| [Embeddings](documentation/embeddings.md) | @Codebase semantic search (4 options) |
-| [Token Optimization](documentation/token-optimization.md) | 60-80% cost reduction strategies |
-| [Memory System](documentation/memory-system.md) | Titans-inspired long-term memory |
-| [Tools & Execution](documentation/tools.md) | Tool calling and execution modes |
-| [Smart Routing](documentation/routing.md) | Complexity-based model routing |
-| [Docker Deployment](documentation/docker.md) | docker-compose with GPU support |
-| [Production Hardening](documentation/production.md) | Circuit breakers, metrics, load shedding |
-| [API Reference](documentation/api.md) | All endpoints and formats |
+| [Provider Setup](documentation/providers.md) | Configuration for all 12+ providers |
+| [Claude Code](documentation/claude-code-cli.md) | Claude Code CLI integration |
+| [Cursor IDE](documentation/cursor-integration.md) | Cursor setup + troubleshooting |
+| [Codex CLI](documentation/codex-cli.md) | Codex configuration |
+| [Tier Routing](documentation/routing.md) | Smart model routing by complexity |
+| [Token Optimization](documentation/token-optimization.md) | 60-80% cost reduction |
 | [Troubleshooting](documentation/troubleshooting.md) | Common issues and solutions |
-| [FAQ](documentation/faq.md) | Frequently asked questions |
+| [API Reference](documentation/api.md) | REST API endpoints |
+| [Production](documentation/production.md) | Enterprise deployment |
 ---
-## Troubleshooting
+## Cost Comparison
-| Issue | Solution |
-|-------|----------|
-| Same response for all queries | Disable semantic cache: `SEMANTIC_CACHE_ENABLED=false` |
-| Tool calls not executing | Increase threshold: `POLICY_TOOL_LOOP_THRESHOLD=15` |
-| Slow first request | Keep Ollama loaded: `OLLAMA_KEEP_ALIVE=24h` |
-| Connection refused | Ensure Lynkr is running: `lynkr start` |
+| Scenario | Direct Anthropic | Lynkr + Ollama | Lynkr + OpenRouter |
+|----------|-----------------|----------------|-------------------|
+| Daily coding (8h) | $10-30/day | **$0 (free)** | $2-8/day |
+| Monthly (heavy use) | $300-900 | **$0** | $60-240 |
+With tier routing + token optimization: **additional 60-80% savings** on cloud providers.
 ---
-## Contributing
+## Why Lynkr vs Alternatives
+| Feature | Lynkr | LiteLLM | OpenRouter | PortKey |
+|---------|-------|---------|-----------|---------|
+| **Setup** | `npm install -g lynkr` | Python + Docker + Postgres | Account signup | Docker stack |
+| **Claude Code native** | ✅ Drop-in | ⚠️ Requires config | ❌ | ⚠️ Partial |
+| **Cursor native** | ✅ Drop-in | ⚠️ Partial | ❌ | ⚠️ Partial |
+| **Local models** | Ollama, llama.cpp, LM Studio, MLX | Ollama only | ❌ | ❌ |
+| **Tier routing** | Auto complexity-based | ❌ Manual | Cost-based only | ❌ Manual |
+| **Token optimization** | 60-80% built-in | ❌ | ❌ | Cache only |
+| **Self-hosted** | ✅ Node.js only | ✅ Python stack | ❌ SaaS | ✅ Docker |
+| **Dependencies** | Node.js 20+ | Python, Prisma, PostgreSQL | None | Docker, Python |
-We welcome contributions. See the [Contributing Guide](documentation/contributing.md) and [Testing Guide](documentation/testing.md).
+**Lynkr's edge:** Purpose-built for AI coding tools. Zero-config for Claude Code, Cursor, and Codex. Installs in one command, runs anywhere Node.js runs.
 ---
-## License
+## Community
-Apache 2.0 — See [LICENSE](LICENSE).
+- [GitHub Discussions](https://github.com/Fast-Editor/Lynkr/discussions) — Ask questions
+- [Report Issues](https://github.com/Fast-Editor/Lynkr/issues) — Bug reports
+- [NPM Package](https://www.npmjs.com/package/lynkr) — Official releases
+- [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr) — AI-powered docs
 ---
-## Community
+## License
-- [GitHub Discussions](https://github.com/Fast-Editor/Lynkr/discussions) — Questions and tips
-- [Report Issues](https://github.com/Fast-Editor/Lynkr/issues) — Bug reports and feature requests
-- [NPM Package](https://www.npmjs.com/package/lynkr) — Official package
-- [DeepWiki](https://deepwiki.com/vishalveerareddy123/Lynkr) — AI-powered docs search
+Apache 2.0 — See [LICENSE](LICENSE).
 ---
-**Built by [Vishal Veera Reddy](https://github.com/vishalveerareddy123) — for developers who want control over their AI tools.**
+**Built by [Vishal Veera Reddy](https://github.com/vishalveerareddy123) for developers who want control over their AI tools.**

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "lynkr",
-  "version": "9.1.9",
+  "version": "9.2.1",
   "description": "Self-hosted Claude Code & Cursor proxy with Databricks,AWS BedRock,Azure  adapters, openrouter, Ollama,llamacpp,LM Studio, workspace tooling, and MCP integration.",
   "main": "index.js",
   "bin": {