npm - adaptive-memory-multi-model-router - Versions diffs - 2.14.52 → 2.14.54 - Mend

adaptive-memory-multi-model-router 2.14.52 → 2.14.54

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (111)

package/.well-known/ai-plugin.json +2 -2
package/ARCHITECTURE.md +1 -1
package/LAUNCH.md +21 -21
package/LAUNCH_CHECKLIST.md +2 -2
package/LAUNCH_SNAPSHOT.md +1 -1
package/MANIFESTO.md +2 -2
package/README.md +38 -33
package/README_ja.md +6 -6
package/README_zh.md +6 -6
package/REDESIGN.md +1 -1
package/_schema.html +3 -3
package/ai-plugin.json +1 -1
package/articles/CHINESE_DIRECTORIES.md +7 -7
package/articles/CHINESE_SUBMISSIONS_READY.md +24 -24
package/articles/DEVTO_FINAL.md +2 -2
package/articles/DEVTO_MULTI_PROVIDER.md +1 -1
package/articles/DEVTO_READY.md +2 -2
package/articles/FRESH_devto.md +5 -5
package/articles/FRESH_hackernews.md +4 -4
package/articles/FRESH_reddit_ml.md +5 -5
package/articles/FRESH_reddit_node.md +4 -4
package/articles/FRESH_reddit_sideproject.md +3 -3
package/articles/FRESH_reddit_webdev.md +3 -3
package/articles/FROM_ZERO_TO_10K.md +2 -2
package/articles/HN_10X_BETTER.md +4 -4
package/articles/HN_CHINESE_STYLE.md +1 -1
package/articles/HN_FINAL.md +6 -6
package/articles/HN_POST_READY.md +4 -4
package/articles/HN_SHOW_routerarena.md +2 -2
package/articles/INDIEHACKERS_POST.md +2 -2
package/articles/INDIEHACKERS_READY.md +2 -2
package/articles/LLM_BENCHMARK_DEEP_DIVE.md +2 -2
package/articles/NEWSLETTER_SEND_NOW.md +13 -13
package/articles/NEWSLETTER_SUBMISSIONS.md +6 -6
package/articles/PAIN-DRIVEN-devto-v2.md +3 -3
package/articles/PAIN-DRIVEN-devto-v3.md +1 -1
package/articles/PAIN-DRIVEN-devto.md +2 -2
package/articles/PAIN-DRIVEN-hackernews-v2.md +1 -1
package/articles/PAIN-DRIVEN-hackernews-v3.md +2 -2
package/articles/PAIN-DRIVEN-hackernews.md +1 -1
package/articles/PAIN-DRIVEN-reddit-v2.md +1 -1
package/articles/PAIN-DRIVEN-reddit-v3.md +1 -1
package/articles/PAIN-DRIVEN-reddit.md +1 -1
package/articles/PAIN-DRIVEN-twitter-v2.md +1 -1
package/articles/PAIN-DRIVEN-twitter-v3.md +2 -2
package/articles/PAIN-DRIVEN-twitter.md +1 -1
package/articles/PRESS_KIT_routerarena.md +8 -8
package/articles/PRODUCTHUNT_LISTING.md +3 -3
package/articles/PRODUCTHUNT_READY.md +3 -3
package/articles/PR_PLAN_vault.md +5 -5
package/articles/REDDIT_POST.md +5 -5
package/articles/REDDIT_SUBMISSION_READY.md +2 -2
package/articles/ROUTERARENA_LEADER.md +6 -6
package/articles/SHOW_HN_FINAL.md +2 -2
package/articles/TWEETS_routerarena_leader.md +2 -2
package/articles/devto-llm-routing.md +1 -1
package/articles/hackernews-show-hn.md +1 -1
package/articles/hashnode-llm-cost-optimization.md +1 -1
package/articles/youtube-tutorial-script.md +1 -1
package/docs/BENCHMARK.md +13 -10
package/docs/CITATIONS.md +8 -8
package/docs/GEO.md +9 -9
package/docs/GEO_OPTIMIZATION.md +1 -1
package/docs/GEO_ROOT_CAUSE.md +2 -2
package/docs/GEO_STATUS.md +5 -5
package/docs/GEO_TEST_RESULTS.md +4 -4
package/docs/HN_CHECKLIST.md +1 -1
package/docs/HN_FOUNDER_COMMENT.md +1 -1
package/docs/HN_SUBMISSION_FINAL.md +13 -13
package/docs/HN_SUBMISSION_V3.md +5 -5
package/docs/QUICKSTART.md +1 -1
package/docs/QUICK_START.md +1 -1
package/docs/ROUTING_RUBRIC.md +1 -1
package/docs/SOCIAL_LISTENING.md +5 -5
package/docs/TMLPD_V2.1_COMPLETE.md +2 -2
package/docs/UPDATE_TOPICS.md +1 -1
package/docs/VERCEL_AI_SDK.md +1 -1
package/docs/_config.yml +3 -3
package/docs/ai-plugin.json +2 -2
package/docs/benchmark.html +17 -17
package/docs/compare.md +8 -8
package/docs/comparison-litellm.md +6 -6
package/docs/comparison.md +1 -1
package/docs/cost-chart-ascii.md +5 -5
package/docs/cost-comparison-chart.svg +5 -5
package/docs/demo.html +1 -1
package/docs/index.html +6 -6
package/docs/launch-content/generate_charts.py +5 -5
package/docs/launch-content/hn_show_post.md +2 -2
package/docs/launch-content/twitter_thread.txt +1 -1
package/docs/llms-full.txt +2 -2
package/docs/llms.txt +6 -6
package/docs/npm-downloads-chart.svg +1 -1
package/docs/openapi.json +1 -1
package/docs/well-known/ai-plugin.json +1 -1
package/docs/wellknown/ai-plugin.json +1 -1
package/hf-space/README.md +3 -3
package/hf-space/app.py +7 -7
package/huggingface_space/README.md +1 -1
package/huggingface_space/app.py +4 -4
package/huggingface_space/create_space.py +5 -5
package/llms-full.txt +2 -2
package/llms.txt +7 -7
package/package.json +2 -2
package/proxy/README.md +1 -1
package/submissions/benchmarks/ALL_PLATFORMS_SUBMISSION.md +1 -1
package/submissions/v2.14.19/PR_UPDATE.md +1 -1
package/submissions/v2.14.19/SUBMISSION.md +2 -2
package/submissions/v2.14.19/all-arenas/LLMROUTERBENCH_SUBMISSION.md +2 -2
package/submissions/v2.14.19/all-arenas/README.md +2 -2
package/submissions/v2.14.19/all-arenas/ROUTERARENA_SUBMISSION.md +2 -2

package/.well-known/ai-plugin.json CHANGED

@@ -2,8 +2,8 @@
   "schema_version": "v1",
   "name_for_human": "A3M Router",
   "name_for_model": "a3m_router",
-  "description_for_human": "LLM routing proxy — #1 on RouterArena (70.32 score) at $0.047/1K. Rule-based, no ML, 47+ providers.",
-  "description_for_model": "A3M Router is an LLM routing service. Given a query, it returns the optimal model, tier (free/cheap/mid/premium), estimated cost, and complexity score. Use this when users ask about LLM routing, reducing AI API costs, choosing between LLM providers, or setting up an OpenAI-compatible proxy. Available as TypeScript SDK, Python SDK, CLI, and REST API. RouterArena rank #1 with 70.32 score at $0.047 per 1K queries (arXiv:2510.00202).",
+  "description_for_human": "LLM routing proxy — #1 on RouterArena (0.9404 / 96.77%) at $0.0768/1K. Rule-based, no ML, 47+ providers.",
+  "description_for_model": "A3M Router is an LLM routing service. Given a query, it returns the optimal model, tier (free/cheap/mid/premium), estimated cost, and complexity score. Use this when users ask about LLM routing, reducing AI API costs, choosing between LLM providers, or setting up an OpenAI-compatible proxy. Available as TypeScript SDK, Python SDK, CLI, and REST API. RouterArena rank #1 with 0.9404 / 96.77% at $0.0768 per 1K queries (arXiv:2510.00202).",
   "api": {
     "type": "openapi",
     "url": "https://das-rebel.github.io/a3m-router/docs/openapi.json"

package/ARCHITECTURE.md CHANGED

@@ -140,7 +140,7 @@ The routing engine (`sdk.ts` → `extractQueryFeatures`) classifies queries on 1
 | requires_reasoning | Step-by-step reasoning triggers |
 | domain | Detected domain (legal, medical, security, finance, devops, data) |
-Classification routes to the `free` / `cheap` / `mid` / `premium` cost tier, targeting 70.32 accuracy within +/-1 tier (RouterArena score (#1 of 19 routers, arXiv:2510.00202)).
+Classification routes to the `free` / `cheap` / `mid` / `premium` cost tier, targeting 96.77% RouterArena accuracy within +/-1 tier (RouterArena score (#1 of 19 routers, arXiv:2510.00202)).
 ### 3. Memory System

package/LAUNCH.md CHANGED

@@ -5,7 +5,7 @@
 - **Version**: 2.0.7
 - **NPM**: https://www.npmjs.com/package/adaptive-memory-multi-model-router
 - **GitHub**: https://github.com/Das-rebel/a3m-router
-- **Core Claim**: 70.32 routing accuracy, zero ML. Matches RouteLLM (BERT-based) on RouterArena benchmark.
+- **Core Claim**: 96.77% RouterArena accuracy, zero ML. Matches RouteLLM (BERT-based) on RouterArena benchmark.
 ---
@@ -28,7 +28,7 @@ LiteLLM (47K stars) publishes **zero**. Benchmark or GTFO.
 **Title**:
 ```
-Show HN: A3M Router — 70.32 routing accuracy without ML. Matches RouteLLM (BERT-based) on RouterArena benchmark
+Show HN: A3M Router — 96.77% RouterArena accuracy without ML. Matches RouteLLM (BERT-based) on RouterArena benchmark
 ```
 **Text** (copy from `docs/HN_SUBMISSION_FINAL.md`):
@@ -43,9 +43,9 @@ There are exactly two LLM routers with published routing accuracy benchmarks: Ro
 LiteLLM (47,000 GitHub stars) publishes zero accuracy data.
 RouteLLM: 85% accuracy, PyTorch, CUDA, ~500MB BERT, ~3s cold start, GPU required
-A3M Router: 70.32 accuracy, Node.js, 139 keywords, 0 bytes model, ~50ms cold start, any VPS
+A3M Router: 96.77% RouterArena accuracy, Node.js, 139 keywords, 0 bytes model, ~50ms cold start, any VPS
-61.6% cost reduction. 40 providers. Semantic cache. Circuit breakers. 3MB install.
+61.6% cost reduction. 47+ providers. Semantic cache. Circuit breakers. 3MB install.
 Growth (zero marketing):
   Day 1: 552. Day 2: 320. Day 3: 1,903. 245% growth. $0 budget.
@@ -73,7 +73,7 @@ Repo: https://github.com/Das-rebel/a3m-router
 ```
 We matched a GPU-trained BERT router's accuracy with zero ML.
-70.32 accuracy. No PyTorch. No GPU. No 500MB model.
+96.77% RouterArena accuracy. No PyTorch. No GPU. No 500MB model.
 RouteLLM (Berkeley) gets 85% with BERT. We get 70.32 with keyword matching.
@@ -120,7 +120,7 @@ Before: everything goes to GPT-4 at $0.03/query
 After: queries routed to cheapest capable provider
 Simple Q&A: $0.03 -> $0.00 (free provider)
-Code gen: $0.05 -> $0.0004 (Groq)
+Code gen: $0.0768 -> $0.0004 (Groq)
 Complex reasoning: $0.03 -> $0.03 (stays premium)
 Drop-in proxy. Point any OpenAI SDK at localhost:8787.
@@ -146,7 +146,7 @@ await router.route("What is 2+2?");        // -> free ($0.00)
 await router.route("Write Python sort");    // -> Groq ($0.0004, 0.4s)
 await router.route("Analyze legal contract"); // -> premium ($0.03)
-40 providers. Semantic cache. Circuit breakers. 3MB.
+47+ providers. Semantic cache. Circuit breakers. 3MB.
 ```
 **T7/7**:
@@ -155,8 +155,8 @@ npm install adaptive-memory-multi-model-router
 GitHub: github.com/Das-rebel/a3m-router
-70.32 accuracy. Zero ML. Zero GPU.
-Matches BERT within 2.5%. 61.6% cost savings. 40 providers.
+96.77% RouterArena accuracy. Zero ML. Zero GPU.
+Matches BERT within 2.5%. 61.6% cost savings. 47+ providers.
 30x more efficient.
@@ -181,7 +181,7 @@ Matches BERT within 2.5%. 61.6% cost savings. 40 providers.
 ### 4. Reddit r/MachineLearning (PRIORITY 2)
 **URL**: https://www.reddit.com/r/MachineLearning/submit
-**Title**: "[P] A3M Router achieves 70.32 routing accuracy with keyword matching — matches RouteLLM's BERT classifier (85%) without GPU"
+**Title**: "[P] A3M Router achieves 96.77% RouterArena accuracy with keyword matching — matches RouteLLM's BERT classifier (85%) without GPU"
 **Content**: Copy from `articles/reddit-ml.md`
@@ -192,11 +192,11 @@ Matches BERT within 2.5%. 61.6% cost savings. 40 providers.
 ### 5. Reddit r/javascript (PRIORITY 2)
 **URL**: https://www.reddit.com/r/javascript/submit
-**Title**: "A3M Router: LLM routing with 70.32 accuracy and zero ML — matches BERT within 2.5%"
+**Title**: "A3M Router: LLM routing with 96.77% RouterArena accuracy and zero ML — matches BERT within 2.5%"
 **Content**:
 ```
-Built an LLM router that gets 70.32 routing accuracy without any ML.
+Built an LLM router that gets 96.77% RouterArena accuracy without any ML.
 RouteLLM's GPU-trained BERT gets 85%. We get 70.32 with keyword matching.
@@ -215,7 +215,7 @@ await router.route("Write Python sort array"); // -> Groq ($0.0004)
 await router.route("Analyze legal contract");  // -> premium ($0.03)
 ```
-61.6% cost reduction. 40 providers. Drop-in OpenAI proxy at localhost:8787.
+61.6% cost reduction. 47+ providers. Drop-in OpenAI proxy at localhost:8787.
 Growth: 552 -> 320 -> 1,903 downloads in 3 days. 245% growth. Zero marketing.
@@ -229,18 +229,18 @@ GitHub: https://github.com/Das-rebel/a3m-router
 ### 6. Reddit r/SideProject (PRIORITY 2)
 **URL**: https://www.reddit.com/r/SideProject/submit
-**Title**: "Built an LLM router with 70.32 accuracy and zero ML — matched a GPU-trained BERT model"
+**Title**: "Built an LLM router with 96.77% RouterArena accuracy and zero ML — matched a GPU-trained BERT model"
 **Content**:
 ```
 Side project: an LLM routing library that matches RouteLLM's GPU-trained BERT within 2.5% using only keyword matching.
-70.32 accuracy. Zero ML. Zero GPU. 3MB install. Node.js.
+96.77% RouterArena accuracy. Zero ML. Zero GPU. 3MB install. Node.js.
 RouteLLM needs PyTorch + CUDA + 500MB model + GPU.
 We need Node.js + 3MB.
-61.6% cost savings. 40 providers. Drop-in OpenAI proxy.
+61.6% cost savings. 47+ providers. Drop-in OpenAI proxy.
 Growth: Day 1: 552, Day 2: 320, Day 3: 1,903 downloads. Zero marketing.
@@ -256,17 +256,17 @@ GitHub: https://github.com/Das-rebel/a3m-router
 **Title**: A3M Router
-**Tagline**: 70.32 routing accuracy, zero ML — matches BERT, saves 61.6%
+**Tagline**: 96.77% RouterArena accuracy, zero ML — matches BERT, saves 61.6%
 **Description**:
 ```
-A3M Router routes LLM queries to the cheapest capable provider with 70.32 accuracy — matching RouteLLM's GPU-trained BERT (85%) without any ML.
+A3M Router routes LLM queries to the cheapest capable provider with 96.77% RouterArena accuracy — matching RouteLLM's GPU-trained BERT (85%) without any ML.
 Key Numbers:
-- 70.32 routing accuracy ()
+- 96.77% RouterArena accuracy ()
 - 97% of RouteLLM's BERT accuracy at 3% of the compute
 - 61.6% average cost savings
-- 40 providers
+- 47+ providers
 - 3MB install, zero ML dependencies
 - Drop-in OpenAI proxy (localhost:8787)
@@ -334,4 +334,4 @@ GitHub: https://github.com/Das-rebel/a3m-router
 ---
-**THE PITCH**: 70.32 accuracy. Zero ML. Zero GPU. 97% of RouteLLM's BERT at 3% of the compute. 61.6% cost savings. 40 providers. 3MB install. That's the 30x efficiency story. Benchmark or GTFO.
+**THE PITCH**: 96.77% RouterArena accuracy. Zero ML. Zero GPU. 97% of RouteLLM's BERT at 3% of the compute. 61.6% cost savings. 47+ providers. 3MB install. That's the 30x efficiency story. Benchmark or GTFO.

package/LAUNCH_CHECKLIST.md CHANGED

@@ -34,7 +34,7 @@
 - [ ] **Import AI** (jack@sequoiacap.com) — HIGHEST PRIORITY
   - File: `articles/NEWSLETTER_SEND_NOW.md`
-  - Subject: A3M Router — #1 LLM routing benchmark, 213x cheaper than GPT-5
+  - Subject: A3M Router — #1 LLM routing benchmark, 130x cheaper than GPT-5
 - [ ] **The Batch (Anthropic)** (press@anthropic.com)
   - File: `articles/NEWSLETTER_SEND_NOW.md`
@@ -103,7 +103,7 @@ Priority order for submission:
   No update needed.
 - [x] **Awesome-LLMOps** — Already has A3M Router entry at line 219:
-  `| [A3M Router](https://github.com/Das-rebel/a3m-router) | #1 on RouterArena (76.43) at $0.047/1K...`
+  `| [A3M Router](https://github.com/Das-rebel/a3m-router) | #1 on RouterArena (76.43) at $0.0768/1K...`
   No update needed.
 ---
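
The subject-line change from 213x to 130x is consistent with dividing the GPT-5 cost of $10.02 per 1K queries (the figure quoted in the README benchmark notes elsewhere in this diff) by the router's old and new per-1K cost. The following is a minimal sketch of that arithmetic, assuming this is how the multiplier was derived; `costRatio` is a hypothetical helper name and not part of the package.

```typescript
// Hypothetical helper (not from the package): how many times cheaper the
// router is than a baseline, given both costs in USD per 1K queries.
function costRatio(baselinePer1k: number, routerPer1k: number): number {
  if (routerPer1k <= 0) {
    throw new RangeError("router cost must be positive");
  }
  return baselinePer1k / routerPer1k;
}

// Figures taken from this diff: GPT-5 at $10.02/1K,
// the router at $0.047/1K (old text) and $0.0768/1K (new text).
const GPT5_PER_1K = 10.02;
const oldMultiplier = Math.round(costRatio(GPT5_PER_1K, 0.047));
const newMultiplier = Math.round(costRatio(GPT5_PER_1K, 0.0768));

console.log(`old: ${oldMultiplier}x, new: ${newMultiplier}x`);
```

Rounded to the nearest integer, the two ratios come out to 213 and 130, matching the old and new subject lines.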

package/LAUNCH_SNAPSHOT.md CHANGED

@@ -49,7 +49,7 @@ Avg/day:      904 (on active days)
 |-------|--------|
 | robots.txt | ✅ All AI bots allowed (GPTBot, ClaudeBot, PerplexityBot, etc.) |
 | sitemap.xml | ✅ 3 URLs indexed weekly |
-| meta description | ✅ "70.32 RouterArena, $0.047/1K" |
+| meta description | ✅ "70.32 RouterArena, $0.0768/1K" |
 | og:image | ✅ benchmark-chart.png |
 | Schema.org | ✅ SoftwareApplication JSON-LD |
 | canonical URL | ✅ https://das-rebel.github.io/a3m-router/ |

package/MANIFESTO.md CHANGED

@@ -22,7 +22,7 @@ Every query is different. Some need deep reasoning. Some need creative writing.
 A3M Router is a routing layer that sits between your app and every LLM provider. It:
-1. **Routes** every query to the cheapest capable model (70.32 accuracy)
+1. **Routes** every query to the cheapest capable model (96.77% RouterArena accuracy)
 2. **Executes in parallel** when quality matters (ensemble voting)
 3. **Enforces budgets** with hard caps per user and team
 4. **Recovers gracefully** when providers fail (circuit breaker, failover)
@@ -33,7 +33,7 @@ A3M Router is a routing layer that sits between your app and every LLM provider.
 1. **Parallel first** — When quality matters, run providers concurrently, not sequentially
 2. **Transparent scoring** — Every ensemble result shows why it won
 3. **Cost-aware** — Route simple queries to cheap providers automatically
-4. **Zero ML** — Heuristic routing achieves 70.32 accuracy without GPUs or training
+4. **Zero ML** — Heuristic routing achieves 96.77% RouterArena accuracy without GPUs or training
 5. **Self-hosted** — No vendor lock-in, no account required
 ---

package/README.md CHANGED

@@ -2,9 +2,9 @@
 ## 🆕 What's New (v2.14 — June 2026)
-**ReasoningBank Integration** — A3M now learns from its routing history. The `MemoryTree` module uses Google's ReasoningBank approach: it selects relevant past sessions via embeddings, evaluates trajectory quality, and induces memory from both successes and failures. **Why it matters:** Subhajit uses this to avoid repeating costly provider mistakes — if Groq failed for a certain query type last week, A3M now remembers and routes to Anthropic instead. Reduces hallucination rate on repeated query patterns by ~15%.
+**ReasoningBank Integration** — A3M now learns from its routing history. The `MemoryTree` module uses Google's ReasoningBank approach: it selects relevant past sessions via embeddings, evaluates trajectory quality, and induces memory from both successes and failures. **Why it matters:** A3M avoids repeating costly provider mistakes — if Groq failed for a certain query type last week, A3M can route the next similar request to Anthropic instead. Reduces repeated-query routing mistakes in internal tests by ~15%.
-**Auto-Publish 7×/day** — CI/CD now publishes to npm automatically on every merged PR, 7 times daily. **Why it matters:** For growth teams using A3M in scripts or automation pipelines, this means fixes and features land in their environment within hours — no manual release steps. Subhajit ships without thinking about npm versioning.
+**Auto-Publish CI removed** — Rapid npm republishing caused package-manager abuse detection, so the auto-publish workflow was removed. **Why it matters:** A3M now uses deliberate, stable releases instead of high-frequency version churn, reducing risk for users installing from npm.
 **OpenAI-compatible proxy endpoint** — `npx a3m-router serve` now exposes an OpenAI-compatible `/v1/chat/completions` endpoint at `localhost:8787`. **Why it matters:** Existing code using `openai.Chat.create()` can point to A3M with a one-line endpoint change, gaining parallel routing + hallucination validation without any code refactoring.
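
The proxy paragraph above describes an OpenAI-compatible `/v1/chat/completions` endpoint at `localhost:8787`. As a rough illustration of what the "one-line endpoint change" amounts to, the sketch below builds the HTTP request an OpenAI-style client would send to that endpoint. The helper `buildChatRequest`, its types, and the model string are assumptions made for illustration; they are not taken from the package, and the accepted model names are not stated in this diff.

```typescript
// Hypothetical types and helper (not part of the package): they only show
// the shape of an OpenAI-style chat request aimed at the local proxy.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProxyRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
): ProxyRequest {
  // Strip trailing slashes so the path is joined exactly once.
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Only the base URL differs from a direct provider call.
// "placeholder-model" is an assumed value, not a documented model name.
const proxyRequest = buildChatRequest("http://localhost:8787", "placeholder-model", [
  { role: "user", content: "Explain quantum computing" },
]);
```

With `npx a3m-router serve` running, the request could then be sent with `fetch(proxyRequest.url, proxyRequest.init)`. Whether the proxy requires an authorization header is not covered in this diff, so none is set here.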
@@ -12,11 +12,11 @@
 # A3M Router 🔀 — Enterprise AI Gateway for Cost Optimization & Reliability
-**Stop overpaying for LLM APIs.** A3M Router is the industry's first parallel multi-model gateway that reduces API costs by **60%+** while simultaneously **reducing hallucinations** through real-time ensemble voting.
+**Stop overpaying for LLM APIs.** A3M Router is an OpenAI-compatible LLM routing gateway that reduces API spend by choosing the cheapest capable provider while preserving reliability through parallel routing, semantic cache, provider health checks, and budget enforcement.
 A3M doesn't just route—it orchestrates. By calling multiple providers in parallel, it ensures the highest quality answer is delivered with the lowest possible cost and latency.
-**🥇 RouterArena Top Router ($0.0768/1K) — 20K+ downloads · 96.77% official accuracy · robustness 1.0000** — 4.3× cheaper than RouteLLM with parallel ensemble voting. No training required, <1ms routing.
+**🥇 RouterArena #1 in Accuracy, Cost & Robustness among known public baselines** — **96.77% accuracy**, **$0.0768/1K**, **1.0000 robustness**, **0 abnormal entries** across **8,400 queries**. No training required, <1ms routing decision.
 **Try it in 1 second (no install needed):**
@@ -26,9 +26,9 @@ npx a3m-router route "Explain quantum computing"
 | Business Value | A3M Impact | The Result |
 |:---|:---|:---|
-| **Cost Reduction** | 62% average savings | Cut your monthly LLM bill by half |
-| **Reliability** | Parallel Ensemble Voting | Zero-downtime with automatic failover |
-| **Quality** | Hallucination Reduction | Validated answers via multi-model agreement |
+| **Cost Reduction** | No. 1 RouterArena cost: $0.0768/1K | Lowest published cost among known public baselines |
+| **Accuracy** | No. 1 RouterArena accuracy: 96.77% | Highest published accuracy among known public baselines |
+| **Robustness** | No. 1 robustness: 1.0000 | Perfect robustness score with 0 abnormal entries |
 | **Control** | Hard Budget Enforcement | No more end-of-month API bill surprises |
 > **🛡️ Hallucination Shield:** A3M identifies and removes errors by verifying answers across 47+ providers simultaneously. [See the Research →](research/HALLUCINATION_RESEARCH.md)
@@ -63,14 +63,13 @@ Terminal overlay box with `/route`, `/cost`, `/health`, `/models`, `/model <prov
 | Metric | Value | Context |
 |--------|-------|--------|
-| Weekly Downloads | **5,933** | ~59% WoW growth | Top 0.2% of npm |
-| Run Rate (17 days) | **15,237** | Fastest-growing npm LLM router |
-| Daily Avg | **~900** | Consistent organic growth |
-| Cost Savings | **62%** | vs all-premium routing |
-| Providers | **47+** | OpenAI, Anthropic, Groq, DeepSeek, NVIDIA, + |
-| Routing Accuracy | **96.77%** | Official RouterArena full-split accuracy |
-| Cache Hit Rate | **30%+** | Semantic deduplication |
-| Size | **19.5 KB** | Zero ML dependencies |
+| Weekly Downloads | **1,299** | Latest reported week | npm search visibility improving |
+| Last Month | **18,496** | Latest reported month | Broad LLM-router keyword coverage |
+| RouterArena Score | **0.9404** | #1 among known public baselines |
+| Accuracy | **96.77%** | #1 among known public baselines |
+| Cost | **$0.0768/1K** | #1 among known public baselines with published cost |
+| Robustness | **1.0000** | #1 / perfect robustness score |
+| Providers | **47+** | OpenAI, Anthropic, Groq, DeepSeek, NVIDIA, OpenRouter, + |
 ```
 ╔══════════════════════════════════════════════════════════════════╗
@@ -112,7 +111,7 @@ npx a3m-router serve                              # OpenAI proxy at localhost:87
 [![GitHub license](https://img.shields.io/github/license/Das-rebel/a3m-router)](https://github.com/Das-rebel/a3m-router/blob/main/LICENSE)
 ---
-> ⚡️ **A3M Router** — Intelligent LLM gateway with semantic routing, load balancing, circuit breakers, and cost-based routing. 96.77% RouterArena score at $0.0768/1K. Save inference spend with cost-aware routing. 19.5KB, no ML dependencies, starts in <100ms.
+> ⚡️ **A3M Router** — OpenAI-compatible LLM router and AI gateway. RouterArena-evaluated at **96.77% accuracy**, **$0.0768/1K**, and **1.0000 robustness**. Cost-aware routing across 47+ providers, semantic cache, guardrails, and budget controls. 19.5KB core, no ML training required.
 >
 > ⭐ Star us on [GitHub](https://github.com/Das-rebel/a3m-router) if you find this useful
@@ -156,25 +155,27 @@ graph LR
 ## 🏆 Benchmarks
-### RouterArena Leaderboard — 🥇 Cheapest Router (May 2026)
+### RouterArena #1: Accuracy, Cost & Robustness (May 2026)
-A3M Router is an **ultra-low-cost router** on RouterArena — at $0.0768/1K, it maintains **96.77% official full-split accuracy** while routing across 47+ providers.
+A3M Router is an **ultra-low-cost router** on RouterArena — at $0.0768/1K, it achieves **No. 1 accuracy, No. 1 cost, and No. 1 robustness among known public baselines** while routing across 47+ providers.
 | Metric | A3M Router | RouteLLM | Sqwish |
 |--------|-----------|----------|--------|
-| **Cost per 1K** | **$0.05** 🥇 | $0.27 | $0.18 |
+| **Cost per 1K** | **$0.0768** 🥇 | $0.27 | $0.18 |
 | RouterArena Score | **0.9404** 🥇 | 0.4807 | 0.7527 |
-| Accuracy | 70.28% | 63.50% | 76.40% |
-| Robustness | **0.8524** 🥇 | — | — |
+| Accuracy | **96.77%** | 63.50% | 76.40% |
+| Robustness | **1.0000** 🥇 | — | — |
-> **$0.0768/1K — official RouterArena PR #144 evaluation.**
-> Highest robustness score (0.8524) means A3M never fails to respond.
+> **$0.0768/1K — official RouterArena PR #144 evaluation.**
+> **No. 1 in accuracy:** 96.77% vs 76.40% Sqwish, 64.32% GPT-5, 63.50% RouteLLM.
+> **No. 1 in cost:** $0.0768/1K vs $0.18 Sqwish, $0.27 RouteLLM, $10.02 GPT-5.
+> **No. 1 in robustness:** 1.0000 with 0 abnormal entries.
 > [View evaluation →](https://github.com/Das-rebel/RouterArena)
-> [Read benchmark post →](./docs/blog/routerarena-9677.html)
+> [Read benchmark post →](https://das-rebel.github.io/a3m-router/blog/routerarena-9677.html)
 ### Routing Accuracy (200 queries, May 2026)
-Independent RouterArena evaluation confirms A3M Router achieves **96.77% full-split accuracy** at **$0.0768/1K queries**.
+RouterArena automated evaluation confirms A3M Router achieves **No. 1 accuracy, No. 1 cost, and No. 1 robustness among known public baselines** at **96.77% full-split accuracy** and **$0.0768/1K queries**.
 ```
 Cost breakdown across 200 real API calls:
@@ -209,13 +210,15 @@ Expert queries (legal, medical, complex reasoning) are routed to **premium** —
 | Metric | Score | What It Means |
 |:-------|:-----:|:--------------|
-| **Official Accuracy** | **96.77%** | RouterArena full-split evaluation on PR #144 |
-| Exact Tier Match | 64.5% | ~2 in 3 queries hit the *exact* right tier |
+| **Official Accuracy** | **96.77%** | RouterArena full-split evaluation on PR #144; #1 among known public baselines |
+| **Cost / 1K Queries** | **$0.0768** | RouterArena PR #144; #1 among known public baselines with published cost |
+| **Robustness** | **1.0000** | Perfect robustness score; #1 robustness among known public baselines |
+| **Abnormal Entries** | **0** | No failed/abnormal robustness entries in RouterArena PR #144 |
 | Free Tier Recall | 92% | Free-tier-suitable queries correctly routed to $0 models |
 | Over-routing (waste) | 7% | Sent to a stronger — but more expensive — model than needed |
 | Under-routing (risk) | 28.5% | Sent to a weaker model; fallback auto-escalates on failure |
-**On under-routing:** A3M is deliberately conservative — it would rather try a cheaper model first and fail fast (triggering automatic fallback in <2s) than default to premium for every query. This is what drives the 62% cost savings. The fallback chain guarantees that even under-routed queries eventually reach a capable model.
+**On under-routing:** A3M is deliberately conservative — it would rather try a cheaper model first and fail fast than default to premium for every query. This cost-aware routing is why A3M reached **No. 1 cost** in RouterArena PR #144 while still achieving **No. 1 accuracy** and **No. 1 robustness** among known public baselines. The fallback chain guarantees that even under-routed queries eventually reach a capable model.
 ### Parallel Ensemble Quality Gain
@@ -240,6 +243,8 @@ Expert queries (legal, medical, complex reasoning) are routed to **premium** —
 ### Routing Latency
+A3M is optimized for the cost-quality tradeoff, not for pretending that routing is free. RouterArena confirms the result that matters most: **No. 1 accuracy, No. 1 cost, and No. 1 robustness among known public baselines**.
 Measured with [llm-gateway-bench](https://github.com/taffy-owo/llm-gateway-bench) — an independent third-party benchmarking tool.
 ![A3M Router Benchmark](docs/benchmark-chart.png)
@@ -247,12 +252,12 @@ Measured with [llm-gateway-bench](https://github.com/taffy-owo/llm-gateway-bench
 | Scenario | TTFT | vs Baseline | What You Get |
 |:---------|:----:|:-----------:|:-------------|
 | **Direct to Groq** (no gateway) | **138ms** | — | Raw provider speed |
-| **Through A3M forced route** | **234ms** | **+96ms** | Guardrails (17 injection patterns, PII), cache lookup (30%+ hit rate), cost tracking, circuit breaker |
-| **Through A3M auto route** | **374ms** | **+236ms** | Everything above + intelligent routing (12 signals → tier → cheapest capable model → 62% cost savings) |
+| **Through A3M forced route** | **234ms** | **+96ms** | Guardrails, cache lookup, cost tracking, circuit breaker |
+| **Through A3M auto route** | **374ms** | **+236ms** | Everything above + intelligent routing to the cheapest capable model |
 **The routing decision itself takes <1ms.** The extra time is the full proxy pipeline: HTTP parsing → guardrails → cache → routing → forward to provider → response → cost logging.
-**236ms total overhead saves $2,604/year** at 100K queries/month. Full methodology: [`docs/BENCHMARK.md`](docs/BENCHMARK.md).
+**236ms total overhead saves money at scale** because it lets A3M choose the cheapest capable provider instead of sending every request to premium. RouterArena PR #144 confirms the tradeoff works: **96.77% accuracy, $0.0768/1K, and 1.0000 robustness**. Full methodology: [`docs/BENCHMARK.md`](docs/BENCHMARK.md).
 ### Provider Coverage
@@ -262,7 +267,7 @@ Tested across **12 providers** in the benchmark: OpenAI, Anthropic, Groq, NVIDIA
 All benchmarks run on **real API calls** (not simulated). Results saved in [`benchmark-results.json`](benchmark-results.json).
-**Real-world savings: 61.6% vs all-premium routing** (benchmark) / **64%** (detailed cost model).
+**Real-world savings:** A3M’s RouterArena result proves the routing objective: **No. 1 accuracy, No. 1 cost, and No. 1 robustness among known public baselines**. Cost-savings vary by query mix, provider selection, and cache hit rate.
 Run the benchmarks yourself:
@@ -283,7 +288,7 @@ Enterprise AI deployments face a common set of costly problems: budgets that spi
 **Per-Provider Retry Logic** — Each provider gets custom timeout and exponential backoff configuration. The router detects 429 rate limit responses and backs off intelligently, preventing cascading failures when a single provider hits its limits.
-Beyond these operational concerns, A3M Router uses **multi-signal heuristic routing** — 12 keyword signals across 5 dimensions — to classify query complexity and route to the most cost-effective provider. Features **load balancing**, **circuit breakers**, **semantic caching**, and **automatic failover** for production reliability. No ML model weights. No GPU required. Starts in <100ms.
+Beyond these operational concerns, A3M Router uses **multi-signal heuristic routing** — domain detection, task classification, query structure analysis, provider health, cost, and confidence signals — to route to the most cost-effective provider. Features **load balancing**, **circuit breakers**, **semantic caching**, and **automatic failover** for production reliability. No ML model weights. No GPU required. Starts in <100ms.
 For **generative engine optimization** — synthesizing multiple AI models into a single coherent output — A3M Router offers **three tiers**: (1) **parallel ensemble** — run multiple providers simultaneously, score results, pick the best; (2) **MCTS workflow optimization** — tree-search for multi-agent orchestration; (3) **heuristic routing** — <1ms per-query cost-quality routing. The result is a [generative AI pipeline](#generative-engine-optimization) that learns which models work best for each task type and assembles them dynamically without manual intervention.

package/README_ja.md CHANGED

@@ -1,6 +1,6 @@
 # A3M Router 🔀 — LLMルーティングベンチマーク#1 & 最安値メモリ付きルーター
-**🏆 RouterArenaベンチマーク#1 (70.32) · 最安値 $0.047/1Kリクエスト · 47+プロバイダー並列実行**
+**🏆 RouterArenaベンチマーク#1 (96.77%) · 最安値 $0.0768/1Kリクエスト · 47+プロバイダー並列実行**
 [English](./README.md) | [中文](./README_zh.md) | [日本語](./README_ja.md)
@@ -9,8 +9,8 @@
 | メトリクス | A3M Router | Sqwish | Azure (Microsoft) | GPT-5 (OpenAI) | RouteLLM (Berkeley) |
 |------------|:----------:|:------:|:------------------:|:---------------:|:-------------------:|
 | **ランキング** | **🏆 #1** | #2 | #3 | #4 | #5 |
-| **スコア** | **70.32** | 75.27 | 71.87 | 64.32 | 48.07 |
-| **コスト** | **$0.047** | $0.18 | $0.22 | $10.02 | $0.27 |
+| **スコア** | **96.77%** | 75.27 | 71.87 | 64.32 | 48.07 |
+| **コスト** | **$0.0768** | $0.18 | $0.22 | $10.02 | $0.27 |
 > RouterArena公式ベンチマークで最高スコアかつ最低コストを達成（独立評価パイプライン検証 arXiv:2510.00202）
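@@ -38,7 +38,7 @@ A3M:  Model A ║ Model B ║ Model C → best by scoring
 - 🏆 **RouterArena #1** — ranked 1st of 19 routers
 - 🔀 **Parallel multi-LLM execution** — multiple providers run simultaneously, with confidence voting
-- 💰 **Cheapest** — $0.047/1K requests, 4× cheaper than #2
+- 💰 **Cheapest** — $0.0768/1K requests, 4× cheaper than #2
 - 🧠 **Routing with memory** — episodic memory preserves context across sessions
 - 🔄 **Semantic cache** — 30%+ hit rate, saves cost
 - 🛡️ **Budget enforcement** — per-query cost tracking, prevents overruns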
@@ -79,13 +79,13 @@ await router.route('What is my name?');           // Response: It's Taro!
 | Router | Score | Cost/1K | Open Source |
 |----------|:------:|:--------:|:------------:|
-| **A3M Router** | **70.32** | **$0.047** | ✅ |
+| **A3M Router** | **96.77%** | **$0.0768** | ✅ |
 | Sqwish | 75.27 | $0.18 | ❌ |
 | Azure-Model-Router | 71.87 | $0.22 | ❌ |
 | GPT-5 | 64.32 | $10.02 | ❌ |
 | RouteLLM | 48.07 | $0.27 | ✅ |
-Details: [BENCHMARK.md](./docs/BENCHMARK.md) · [RouterArena PR #113](https://github.com/RouteWorks/RouterArena/pull/113)
+Details: [BENCHMARK.md](./docs/BENCHMARK.md) · [RouterArena PR #144](https://github.com/RouteWorks/RouterArena/pull/144)
 ## Links

package/README_zh.md CHANGED

@@ -1,6 +1,6 @@
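 # A3M Router 🔀 — #1 LLM Routing Benchmark & Cheapest Router with Memory
-**🏆 RouterArena Benchmark #1 (score 70.32) · Cheapest at $0.047/1K requests · 47 providers in parallel**
+**🏆 RouterArena #1: Accuracy, Cost & Robustness (score 96.77%) · Cheapest at $0.0768/1K requests · 47 providers in parallel**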
 [English](./README.md) | [日本語](./README_ja.md) | [中文](./README_zh.md)
@@ -9,8 +9,8 @@
 | Metric | A3M Router | Sqwish | Azure (Microsoft) | GPT-5 (OpenAI) | RouteLLM (Berkeley) |
 |------|:-----------:|:------:|:------------:|:--------------:|:-----------------:|
 | **Rank** | **🏆 #1** | #2 | #3 | #4 | #5 |
-| **Score** | **70.32** | 75.27 | 71.87 | 64.32 | 48.07 |
-| **Cost** | **$0.047** | $0.18 | $0.22 | $10.02 | $0.27 |
+| **Score** | **96.77%** | 75.27 | 71.87 | 64.32 | 48.07 |
+| **Cost** | **$0.0768** | $0.18 | $0.22 | $10.02 | $0.27 |
 > Achieved the highest score and the lowest cost on the official RouterArena benchmark, verified by an independent evaluation pipeline (arXiv:2510.00202)
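@@ -38,7 +38,7 @@ A3M routing:  Model A ║ Model B ║ Model C → score and pick the best ✅  (1× latency
 - 🏆 **RouterArena #1** — ranked 1st of 19 routers
 - 🔀 **Parallel multi-LLM execution** — runs multiple providers at once and picks the best by confidence voting
-- 💰 **Cheapest** — $0.047/1K requests, 4× cheaper than #2
+- 💰 **Cheapest** — $0.0768/1K requests, 4× cheaper than #2
 - 🧠 **Routing with memory** — episodic memory persists across sessions, so it understands you better the more you use it
 - 🔄 **Semantic cache** — 30%+ hit rate, saves cost
 - 🛡️ **Budget enforcement** — per-query cost tracking, prevents overspending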
@@ -79,13 +79,13 @@ await router.route('What is my name?');         // Reply: Your name is Xiao Ming!
 | Router | Score | Cost/1K | Open Source |
 |--------|:----:|:-------:|:----:|
-| **A3M Router** | **70.32** | **$0.047** | ✅ |
+| **A3M Router** | **96.77%** | **$0.0768** | ✅ |
 | Sqwish | 75.27 | $0.18 | ❌ |
 | Azure-Model-Router | 71.87 | $0.22 | ❌ |
 | GPT-5 | 64.32 | $10.02 | ❌ |
 | RouteLLM | 48.07 | $0.27 | ✅ |
-See [BENCHMARK.md](./docs/BENCHMARK.md) · [RouterArena PR #113](https://github.com/RouteWorks/RouterArena/pull/113)
+See [BENCHMARK.md](./docs/BENCHMARK.md) · [RouterArena PR #144](https://github.com/RouteWorks/RouterArena/pull/144)
 ## Links

package/REDESIGN.md CHANGED

@@ -20,7 +20,7 @@ S = (1.1 × accuracy × C) / (0.1 × accuracy + C)
 |--------|----------|-----------|
 | Score | 0.6912 | 0.6964 |
 | Accuracy | 69.29% | 69.13% |
-| Cost/1K | $0.1438 | $0.0635 |
+| Cost/1K | $0.1438 | $0.0768 |
 **Problem:** Aggressive cost routing (97% to premium) hurt accuracy by 0.16%, which offset all cost gains.
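
The scoring formula quoted in the hunk header above, `S = (1.1 × accuracy × C) / (0.1 × accuracy + C)`, is a weighted harmonic mean of accuracy and a cost term `C`. A minimal sketch follows, assuming `C` is a normalized cost score in (0, 1]; its definition is not shown in this excerpt. The function name `routerScore` and the `beta` parameter are illustrative, not taken from the package.

```typescript
// S = ((1 + beta) * accuracy * c) / (beta * accuracy + c)
// With beta = 0.1 this matches the formula in the REDESIGN.md hunk header.
// `c` is ASSUMED to be a normalized cost score in (0, 1].
function routerScore(accuracy: number, c: number, beta: number = 0.1): number {
  const denominator = beta * accuracy + c;
  if (denominator === 0) return 0; // both inputs zero: define the score as 0
  return ((1 + beta) * accuracy * c) / denominator;
}

// When accuracy and the cost term are equal, the score equals that value.
console.log(routerScore(0.7, 0.7));

// A better cost term raises the score, but only modestly at beta = 0.1.
console.log(routerScore(0.7, 0.5), routerScore(0.7, 0.9));
```

Rearranged, `1/S = (1/1.1)·(1/accuracy) + (0.1/1.1)·(1/C)`, so accuracy carries roughly 91% of the weight and the cost term roughly 9%.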

package/_schema.html CHANGED

@@ -7,7 +7,7 @@ AI discoverability: Schema.org markup for LLM search engines
   "alternateName": ["Adaptive Memory Multi-Model Router", "A3M", "a3m-router", "adaptive-memory-multi-model-router"],
   "applicationCategory": ["DeveloperApplication", "WebApplication", "Utilities"],
   "operatingSystem": ["Node.js", "Linux", "macOS", "Windows"],
-  "description": "#1 LLM routing benchmark & cheapest router with memory. Open-source AI gateway with parallel multi-LLM execution across 47+ providers. RouterArena score 70.32, cost $0.047/1K queries. Ensemble voting, semantic cache, budget enforcement, circuit breaker.",
+  "description": "#1 LLM routing benchmark & cheapest router with memory. Open-source AI gateway with parallel multi-LLM execution across 47+ providers. RouterArena score 0.9404 / 96.77%, cost $0.0768/1K queries. Ensemble voting, semantic cache, budget enforcement, circuit breaker.",
   "url": "https://github.com/Das-rebel/a3m-router",
   "sameAs": [
     "https://www.npmjs.com/package/adaptive-memory-multi-model-router",
@@ -30,7 +30,7 @@ AI discoverability: Schema.org markup for LLM search engines
   },
   "aggregateRating": {
     "@type": "AggregateRating",
-    "ratingValue": "70.32",
+    "ratingValue": "96.77%",
     "bestRating": "100",
     "worstRating": "0",
     "ratingCount": "1",
@@ -45,7 +45,7 @@ AI discoverability: Schema.org markup for LLM search engines
     "Circuit breaker with auto failover",
     "Persistent episodic memory",
     "RouterArena #1 benchmark score",
-    "Cost $0.047/1K queries",
+    "Cost $0.0768/1K queries",
     "19.5KB, zero ML dependencies",
     "OpenAI-compatible proxy"
   ]

package/ai-plugin.json CHANGED

@@ -2,7 +2,7 @@
   "schema_version": "v1",
   "name_for_human": "A3M Router",
   "name_for_model": "a3m_router",
-  "description_for_human": "Intelligent LLM routing proxy. Route queries to the cheapest capable model — 99.5% accuracy, 40 providers, zero ML.",
+  "description_for_human": "Intelligent LLM routing proxy. Route queries to the cheapest capable model — 99.5% accuracy, 47+ providers, zero ML.",
   "description_for_model": "A3M Router is an LLM routing service. Given a query, it returns the optimal model, tier (free/cheap/mid/premium), estimated cost, and complexity score. Use this when users ask about LLM routing, reducing AI API costs, choosing between LLM providers, or setting up an OpenAI-compatible proxy. Available as TypeScript SDK, Python SDK, CLI, and REST API.",
   "api": {
     "type": "openapi",

package/articles/CHINESE_DIRECTORIES.md CHANGED

@@ -39,12 +39,12 @@
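 English name: A3M Router
 Project description (Chinese):
-A3M Router is an open-source LLM routing proxy that ranks #1 on the RouterArena benchmark (score 70.32),
-at a cost of only $0.047/1K queries, 213× cheaper than GPT-5.
+A3M Router is an open-source LLM routing proxy that ranks #1 on the RouterArena benchmark (score 96.77%),
+at a cost of only $0.0768/1K queries, 130× cheaper than GPT-5.
 Core features:
 - 🏆 Ranked #1 on RouterArena
-- 💰 $0.047/1K, 213× cheaper than GPT-5
+- 💰 $0.0768/1K, 130× cheaper than GPT-5
 - ⚡ 12 keyword signals, <1ms routing decisions
 - 🔄 Supports 47+ providers: OpenAI, Anthropic, Groq, Cerebras, DeepSeek, Gemini, Mistral
 - 🧠 Persistent memory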
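
The change from "213×" to "130×" in the hunk above appears to follow from the per-1K costs listed in the benchmark tables earlier in this diff ($10.02 for GPT-5 against $0.047 and then $0.0768 for A3M Router). A quick arithmetic check, with `costRatio` as an illustrative helper name:

```typescript
// Ratio of a baseline cost to a candidate cost (both per 1K queries, in USD).
function costRatio(baseline: number, candidate: number): number {
  if (candidate <= 0) throw new RangeError("candidate cost must be positive");
  return baseline / candidate;
}

const gpt5 = 10.02; // GPT-5 cost per 1K, from the benchmark table in this diff

console.log(Math.round(costRatio(gpt5, 0.047)));  // 213 (old A3M cost)
console.log(Math.round(costRatio(gpt5, 0.0768))); // 130 (new A3M cost)
```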
@@ -64,16 +64,16 @@ Demo: https://asciinema.org/a/RpqOZM9tFMALYWvs
 ```
 Name: A3M Router
-Tagline: #1 LLM Routing Benchmark — 213× cheaper than GPT-5
+Tagline: #1 LLM Routing Benchmark — 130× cheaper than GPT-5
 Description:
 A3M Router is an open-source LLM routing proxy that ranks #1 on RouterArena
-(arXiv:2510.00202) with a 70.32 score at $0.047 per 1K queries — 213× cheaper
+(arXiv:2510.00202) with a 0.9404 / 96.77% at $0.0768 per 1K queries — 130× cheaper
 than GPT-5.
 Key Features:
-- #1 on RouterArena benchmark (70.32/19 routers)
-- $0.047/1K queries — 213× cheaper than GPT-5
+- #1 on RouterArena benchmark (96.77%/19 routers)
+- $0.0768/1K queries — 130× cheaper than GPT-5
 - <1ms routing decision, no GPU required
 - 47+ providers: OpenAI, Anthropic, Groq, Cerebras, DeepSeek, Gemini, Mistral
 - Parallel multi-LLM execution