adaptive-memory-multi-model-router 2.14.49 → 2.14.51
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.dockerignore +82 -0
- package/.env.example +303 -0
- package/.github/DISCUSSIONS_WELCOME.md +27 -0
- package/.github/DISCUSSION_TEMPLATE.yml +5 -0
- package/.github/FUNDING.yml +2 -0
- package/.github/ISSUE_TEMPLATE/bug_report.md +94 -0
- package/.github/ISSUE_TEMPLATE/config.yml +17 -0
- package/.github/ISSUE_TEMPLATE/feature_request.md +71 -0
- package/.github/PULL_REQUEST_TEMPLATE.md +71 -0
- package/.github/dependabot.yml +9 -0
- package/.github/workflows/auto-publish.yml +51 -0
- package/.github/workflows/ci.yml +263 -0
- package/.github/workflows/codeql.yml +38 -0
- package/.github/workflows/npm-publish.yml +20 -0
- package/.github/workflows/pages.yml +37 -0
- package/.github/workflows/stale.yml +54 -0
- package/.publish-tick +1 -0
- package/.well-known/ai-plugin.json +16 -0
- package/AGENT_COUNCIL_FINDINGS.md +142 -0
- package/ARCHITECTURE.md +346 -0
- package/AUDIT_REPORT.md +28 -0
- package/CODE_OF_CONDUCT.md +128 -0
- package/CONTRIBUTING.md +50 -0
- package/CONTRIBUTORS.md +20 -0
- package/Dockerfile +53 -0
- package/Dockerfile.proxy +33 -0
- package/HEALTH_REPORT.md +118 -0
- package/IMPROVEMENT_PLAN.md +107 -0
- package/LANDING.md +43 -0
- package/LAUNCH-PAIN-DRIVEN.md +339 -0
- package/LAUNCH.md +337 -0
- package/LAUNCH_CHECKLIST.md +141 -0
- package/LAUNCH_SNAPSHOT.md +260 -0
- package/MANIFESTO.md +41 -0
- package/POPULARITY_BOOSTERS.md +285 -0
- package/PR_STATUS_REPORT.md +148 -0
- package/README.md +10 -0
- package/REDESIGN.md +95 -0
- package/RUNKIT.md +83 -0
- package/SECURITY.md +29 -0
- package/SUBMISSIONS.md +43 -0
- package/_schema.html +53 -0
- package/ai-plugin.json +16 -0
- package/articles/AI_AGENT_LLM_ROUTING.md +150 -0
- package/articles/CHINESE_DIRECTORIES.md +100 -0
- package/articles/CHINESE_SUBMISSIONS_READY.md +322 -0
- package/articles/COMPETITOR_ALERTS.md +31 -0
- package/articles/COMPLETE_POSTING_DIRECTORY.md +147 -0
- package/articles/CONTENT_STRUCTURE.md +292 -0
- package/articles/DEVTO_COST_GUIDE.md +473 -0
- package/articles/DEVTO_FINAL.md +416 -0
- package/articles/DEVTO_MULTI_PROVIDER.md +542 -0
- package/articles/DEVTO_READY.md +255 -0
- package/articles/DEVTO_V2_ANNOUNCEMENT.md +160 -0
- package/articles/DEVTO_VIRAL_GROWTH.md +280 -0
- package/articles/FRESH_devto.md +460 -0
- package/articles/FRESH_devto_2026_05.md +73 -0
- package/articles/FRESH_hackernews.md +14 -0
- package/articles/FRESH_reddit_ml.md +90 -0
- package/articles/FRESH_reddit_node.md +198 -0
- package/articles/FRESH_reddit_sideproject.md +72 -0
- package/articles/FRESH_reddit_webdev.md +130 -0
- package/articles/FROM_ZERO_TO_10K.md +107 -0
- package/articles/HN_10X_BETTER.md +430 -0
- package/articles/HN_ACCOUNT_GUIDE.md +21 -0
- package/articles/HN_CHINESE_STYLE.md +308 -0
- package/articles/HN_FINAL.md +148 -0
- package/articles/HN_POSTED_VERSION.md +56 -0
- package/articles/HN_POST_READY.md +137 -0
- package/articles/HN_RESEARCH.md +364 -0
- package/articles/HN_SHOW_routerarena.md +17 -0
- package/articles/HN_TIMING_GUIDE.md +52 -0
- package/articles/INDIEHACKERS_POST.md +52 -0
- package/articles/INDIEHACKERS_READY.md +120 -0
- package/articles/LLM_BENCHMARK_DEEP_DIVE.md +153 -0
- package/articles/MASTER_POSTING_DIRECTORY.md +189 -0
- package/articles/NEWSLETTER_SEND_NOW.md +259 -0
- package/articles/NEWSLETTER_SUBMISSIONS.md +112 -0
- package/articles/PAIN-DRIVEN-devto-v2.md +308 -0
- package/articles/PAIN-DRIVEN-devto-v3.md +268 -0
- package/articles/PAIN-DRIVEN-devto.md +242 -0
- package/articles/PAIN-DRIVEN-hackernews-v2.md +138 -0
- package/articles/PAIN-DRIVEN-hackernews-v3.md +151 -0
- package/articles/PAIN-DRIVEN-hackernews.md +131 -0
- package/articles/PAIN-DRIVEN-reddit-v2.md +301 -0
- package/articles/PAIN-DRIVEN-reddit-v3.md +236 -0
- package/articles/PAIN-DRIVEN-reddit.md +218 -0
- package/articles/PAIN-DRIVEN-twitter-v2.md +110 -0
- package/articles/PAIN-DRIVEN-twitter-v3.md +121 -0
- package/articles/PAIN-DRIVEN-twitter.md +120 -0
- package/articles/PORTKEY_VS_A3M.md +147 -0
- package/articles/POSTING_KIT_2026_05.md +67 -0
- package/articles/PRESS_KIT_routerarena.md +77 -0
- package/articles/PRODUCTHUNT_LISTING.md +48 -0
- package/articles/PRODUCTHUNT_READY.md +106 -0
- package/articles/PR_PLAN_vault.md +125 -0
- package/articles/REDDIT_FINAL.md +232 -0
- package/articles/REDDIT_POST.md +67 -0
- package/articles/REDDIT_SUBMISSION_READY.md +348 -0
- package/articles/ROUTERARENA_LEADER.md +45 -0
- package/articles/SHOW_HN_FINAL.md +29 -0
- package/articles/TWEETS_10K_DOWNLOADS.md +47 -0
- package/articles/TWEETS_BENCHMARK_FIRST.md +46 -0
- package/articles/TWEETS_MCP_PLAY.md +51 -0
- package/articles/TWEETS_SEQUENTIAL_BROKEN.md +49 -0
- package/articles/TWEETS_WHY_BUILD.md +54 -0
- package/articles/TWEETS_routerarena_leader.md +53 -0
- package/articles/TWEET_STORM_READY.md +165 -0
- package/articles/TWITTER_FINAL.md +167 -0
- package/articles/WHY_10X_BETTER.md +261 -0
- package/articles/WHY_CHINESE_STYLE_BETTER.md +323 -0
- package/articles/ai-discoverability-llm-routing.md +210 -0
- package/articles/devto-llm-routing.md +138 -0
- package/articles/hackernews-show-hn.md +54 -0
- package/articles/hashnode-llm-cost-optimization.md +125 -0
- package/articles/hn_show_2026_05.md +11 -0
- package/articles/medium-building-llm-router.md +205 -0
- package/articles/reddit-ml.md +76 -0
- package/articles/twitter-thread-cost-savings.md +50 -0
- package/articles/youtube-tutorial-script.md +262 -0
- package/assets/a3m_3blue1brown.mp4 +0 -0
- package/assets/banner.svg +109 -0
- package/assets/chart-cost-v2.svg +91 -0
- package/assets/chart-cost-v3.svg +143 -0
- package/assets/chart-features-v2.svg +132 -0
- package/assets/chart-features-v3.svg +211 -0
- package/assets/chart-growth-v2.svg +122 -0
- package/assets/chart-growth-v3.svg +189 -0
- package/assets/cost-comparison.svg +134 -0
- package/assets/cost-simple.svg +64 -0
- package/assets/demo-hn.gif +0 -0
- package/assets/feature-matrix.svg +136 -0
- package/assets/growth-chart-animated.svg +76 -0
- package/assets/growth-chart.svg +82 -0
- package/assets/growth-simple.svg +69 -0
- package/assets/hero-diagram.svg +81 -0
- package/assets/logo-new.svg +21 -0
- package/assets/logo.svg +68 -0
- package/assets/provider-comparison.svg +121 -0
- package/assets/social-preview-new.svg +100 -0
- package/assets/social-preview.svg +194 -0
- package/assets/social-v2.svg +130 -0
- package/assets/social-v3.svg +212 -0
- package/benchmark-provider-results.json +245 -0
- package/benchmark-results.json +54 -0
- package/council-votes/architecture-vote.md +121 -0
- package/council-votes/coverage-vote.md +93 -0
- package/data/adaptive-benchmark.json +92 -0
- package/data/benchmark-results.json +47 -0
- package/data/labeled-benchmark.json +88 -0
- package/demo/3blue1brown_video.py +285 -0
- package/demo/3blue1brown_video_v2.py +310 -0
- package/demo/IMPROVED_PROMPTS.md +229 -0
- package/demo/VEO3_PROMPTS.md +269 -0
- package/demo/VIDEO_PRODUCTION_GUIDE.md +333 -0
- package/demo/a3m_3blue1brown.mp4 +0 -0
- package/demo/asciinema-demo.sh +195 -0
- package/demo/demo-hn.tape +74 -0
- package/demo/demo-script.md +53 -0
- package/demo/demo-script.sh +62 -0
- package/demo/demo.svg +75 -0
- package/demo/frame1_ai_data_center.png +0 -0
- package/demo/frame1_sunset_video.mp4 +0 -0
- package/demo/frame2_cost_comparison.png +0 -0
- package/demo/frame2_cost_comparison_fallback.png +0 -0
- package/demo/frame3_parallel_execution.png +0 -0
- package/demo/frame3_parallel_execution_fallback.png +0 -0
- package/demo/frame4_providers.png +0 -0
- package/demo/frame4_providers_fallback.png +0 -0
- package/demo/frame5_endcard.png +0 -0
- package/demo/frame5_endcard_fallback.png +0 -0
- package/demo/new_frame1_hook.png +0 -0
- package/demo/new_frame2_proof.png +0 -0
- package/demo/new_frame3_wow.png +0 -0
- package/demo/new_frame4_social.png +0 -0
- package/demo/new_frame5_cta.png +0 -0
- package/demo/package.json +13 -0
- package/demo/product-video-final.mp4 +0 -0
- package/demo/product-video-hype-v1.mp4 +0 -0
- package/demo/product-video-v1.mp4 +0 -0
- package/demo/public/index.html +762 -0
- package/demo/recording.cast +55 -0
- package/demo/server.js +405 -0
- package/demo-new.tape +71 -0
- package/demo-real.sh +198 -0
- package/demo-simple.tape +205 -0
- package/demo.html +520 -0
- package/demo.sh +85 -0
- package/demo.tape +259 -0
- package/dist/analytics/costAnalytics.d.ts.map +1 -0
- package/dist/analytics/costAnalytics.js.map +1 -0
- package/dist/benchmark/comprehensive.js.map +1 -0
- package/dist/benchmark/reproducible.d.ts.map +1 -0
- package/dist/benchmark/reproducible.js.map +1 -0
- package/dist/cache/prefixCache.d.ts.map +1 -0
- package/dist/cache/prefixCache.js.map +1 -0
- package/dist/cache/responseCache.d.ts.map +1 -0
- package/dist/cache/responseCache.js.map +1 -0
- package/dist/cache/semanticCache.d.ts.map +1 -0
- package/dist/cache/semanticCache.js.map +1 -0
- package/dist/cli/setupWizard.d.ts.map +1 -0
- package/dist/cli/setupWizard.js.map +1 -0
- package/dist/cost/budgetEnforcer.d.ts.map +1 -0
- package/dist/cost/budgetEnforcer.js.map +1 -0
- package/dist/cost/costTracker.d.ts.map +1 -0
- package/dist/cost/costTracker.js.map +1 -0
- package/dist/ensemble/multiRoundDialog.js.map +1 -0
- package/dist/ensemble/shapleyValue.js.map +1 -0
- package/dist/integrations/langchainAdapter.d.ts.map +1 -0
- package/dist/integrations/langchainAdapter.js.map +1 -0
- package/dist/integrations/oauth.d.ts.map +1 -0
- package/dist/integrations/oauth.js.map +1 -0
- package/dist/integrations/scienceAdapter.js.map +1 -0
- package/dist/memory/autoFetch.d.ts.map +1 -0
- package/dist/memory/autoFetch.js.map +1 -0
- package/dist/memory/episodicMemory.d.ts.map +1 -0
- package/dist/memory/episodicMemory.js.map +1 -0
- package/dist/memory/hybridMemory.js.map +1 -0
- package/dist/memory/memoryTree.d.ts.map +1 -0
- package/dist/memory/memoryTree.js.map +1 -0
- package/dist/memory/obsidianVault.d.ts.map +1 -0
- package/dist/memory/obsidianVault.js.map +1 -0
- package/dist/memory/reasoningBank.js.map +1 -0
- package/dist/observability/changeWatch.d.ts.map +1 -0
- package/dist/observability/changeWatch.js.map +1 -0
- package/dist/observability/fatigueDetector.d.ts.map +1 -0
- package/dist/observability/fatigueDetector.js.map +1 -0
- package/dist/observability/index.d.ts.map +1 -0
- package/dist/observability/index.js.map +1 -0
- package/dist/observability/metrics.d.ts.map +1 -0
- package/dist/observability/metrics.js.map +1 -0
- package/dist/observability/middleware.d.ts.map +1 -0
- package/dist/observability/middleware.js.map +1 -0
- package/dist/observability/tracer.d.ts.map +1 -0
- package/dist/observability/tracer.js.map +1 -0
- package/dist/observability/types.d.ts.map +1 -0
- package/dist/observability/types.js.map +1 -0
- package/dist/orchestration/haloOrchestrator.d.ts.map +1 -0
- package/dist/orchestration/haloOrchestrator.js.map +1 -0
- package/dist/orchestration/mctsWorkflow.d.ts.map +1 -0
- package/dist/orchestration/mctsWorkflow.js.map +1 -0
- package/dist/providers/localProvider.d.ts.map +1 -0
- package/dist/providers/localProvider.js.map +1 -0
- package/dist/providers/providerConfig.d.ts.map +1 -0
- package/dist/providers/providerConfig.js.map +1 -0
- package/dist/providers/registry.d.ts.map +1 -0
- package/dist/providers/registry.js.map +1 -0
- package/dist/routing/advancedRouter.d.ts.map +1 -0
- package/dist/routing/advancedRouter.js +1 -1
- package/dist/routing/advancedRouter.js.map +1 -0
- package/dist/routing/crossModelValidation.d.ts.map +1 -0
- package/dist/routing/crossModelValidation.js.map +1 -0
- package/dist/routing/providerHealth.d.ts.map +1 -0
- package/dist/routing/providerHealth.js.map +1 -0
- package/dist/routing/providerRetry.d.ts.map +1 -0
- package/dist/routing/providerRetry.js.map +1 -0
- package/dist/scripts/banner.js +29 -0
- package/dist/security/guardrails.d.ts.map +1 -0
- package/dist/security/guardrails.js.map +1 -0
- package/dist/server/dashboard.d.ts.map +1 -0
- package/dist/server/dashboard.js.map +1 -0
- package/dist/server/modelMapper.d.ts.map +1 -0
- package/dist/server/modelMapper.js.map +1 -0
- package/dist/server/proxyServer.d.ts.map +1 -0
- package/dist/server/proxyServer.js.map +1 -0
- package/dist/skills/__tests__/skill_manager.test.d.ts +2 -0
- package/dist/skills/__tests__/skill_manager.test.d.ts.map +1 -0
- package/dist/skills/__tests__/skill_manager.test.js +268 -0
- package/dist/skills/__tests__/skill_manager.test.js.map +1 -0
- package/dist/tools/tmlpdTools.d.ts.map +1 -0
- package/dist/tools/tmlpdTools.js.map +1 -0
- package/dist/tui/dashboard.d.ts.map +1 -0
- package/dist/tui/dashboard.js.map +1 -0
- package/dist/tui/index.d.ts.map +1 -0
- package/dist/tui/index.js.map +1 -0
- package/dist/utils/batchProcessor.d.ts.map +1 -0
- package/dist/utils/batchProcessor.js.map +1 -0
- package/dist/utils/compression.d.ts.map +1 -0
- package/dist/utils/compression.js.map +1 -0
- package/dist/utils/costUtils.d.ts.map +1 -0
- package/dist/utils/costUtils.js.map +1 -0
- package/dist/utils/reliability.d.ts.map +1 -0
- package/dist/utils/reliability.js.map +1 -0
- package/dist/utils/sorting.d.ts.map +1 -0
- package/dist/utils/sorting.js.map +1 -0
- package/dist/utils/speculativeDecoding.d.ts.map +1 -0
- package/dist/utils/speculativeDecoding.js.map +1 -0
- package/dist/utils/tokenUtils.d.ts.map +1 -0
- package/dist/utils/tokenUtils.js.map +1 -0
- package/docs/.nojekyll +0 -0
- package/docs/ANALYSIS_PRINCIPLES.md +162 -0
- package/docs/API.md +855 -0
- package/docs/ARCHITECTURAL-IMPROVEMENTS-2025.md +1391 -0
- package/docs/ARCHITECTURAL-IMPROVEMENTS-REVISED-2025.md +1051 -0
- package/docs/BENCHMARK.md +170 -0
- package/docs/CHINESE_PROVIDER_RELIABILITY.md +37 -0
- package/docs/CITATIONS.md +74 -0
- package/docs/CLAIMS_AND_EVIDENCE.md +58 -0
- package/docs/CONFIGURATION.md +476 -0
- package/docs/COUNCIL_DECISION.json +816 -0
- package/docs/COUNCIL_SUMMARY.md +319 -0
- package/docs/COUNCIL_V2.2_DECISION.md +416 -0
- package/docs/ENGINEERING_SPEC.md +55 -0
- package/docs/FACTORY_RESET.md +34 -0
- package/docs/GEO.md +66 -0
- package/docs/GEO_OPTIMIZATION.md +30 -0
- package/docs/GEO_ROOT_CAUSE.md +136 -0
- package/docs/GEO_STATUS.md +85 -0
- package/docs/GEO_TEST_RESULTS.md +176 -0
- package/docs/HN_CHECKLIST.md +38 -0
- package/docs/HN_FOUNDER_COMMENT.md +17 -0
- package/docs/HN_SUBMISSION_FINAL.md +180 -0
- package/docs/HN_SUBMISSION_V3.md +56 -0
- package/docs/IMPROVEMENT_ROADMAP.md +515 -0
- package/docs/INTEGRATIONS.md +420 -0
- package/docs/LANGCHAIN_INTEGRATION.md +147 -0
- package/docs/LLM_COUNCIL_DECISION.md +508 -0
- package/docs/MIDDLEWARE_CHAIN.md +35 -0
- package/docs/PROMO_CHECKLIST.md +200 -0
- package/docs/QUICKSTART.md +271 -0
- package/docs/QUICK_START.md +43 -0
- package/docs/QUICK_START_VISIBILITY.md +782 -0
- package/docs/REDDIT_GAP_ANALYSIS.md +299 -0
- package/docs/RELEASE_CHECKLIST.md +32 -0
- package/docs/REPRODUCIBILITY.md +63 -0
- package/docs/RESEARCH_BACKED_IMPROVEMENTS.md +1180 -0
- package/docs/ROUTING_RUBRIC.md +197 -0
- package/docs/SEO_AUDIT.md +186 -0
- package/docs/SOCIAL_LISTENING.md +219 -0
- package/docs/TMLPD_QNA.md +751 -0
- package/docs/TMLPD_V2.1_COMPLETE.md +763 -0
- package/docs/TMLPD_V2.2_RESEARCH_ROADMAP.md +754 -0
- package/docs/UPDATE_TOPICS.md +15 -0
- package/docs/USE_CASES.md +59 -0
- package/docs/V2.2_IMPLEMENTATION_COMPLETE.md +446 -0
- package/docs/V2_IMPLEMENTATION_GUIDE.md +388 -0
- package/docs/VERCEL_AI_SDK.md +209 -0
- package/docs/VISIBILITY_ADOPTION_PLAN.md +1005 -0
- package/docs/_config.yml +49 -0
- package/docs/ai-plugin.json +16 -0
- package/docs/api.html +513 -0
- package/docs/architecture-diagram.md +40 -0
- package/docs/benchmark-chart.png +0 -0
- package/docs/benchmark.html +387 -0
- package/docs/blog/routerarena-number-one.html +73 -0
- package/docs/cli-cheatsheet.md +339 -0
- package/docs/compare.md +109 -0
- package/docs/comparison-litellm.md +88 -0
- package/docs/comparison.md +108 -0
- package/docs/cost-chart-ascii.md +42 -0
- package/docs/cost-comparison-chart.svg +88 -0
- package/docs/curl-examples.md +247 -0
- package/docs/demo-auto.html +264 -0
- package/docs/demo.html +416 -0
- package/docs/geo/GENERATIVE_ENGINE_OPTIMIZATION.md +232 -0
- package/docs/index.html +507 -0
- package/docs/launch-content/LAUNCH_EXECUTION_CHECKLIST.md +421 -0
- package/docs/launch-content/README.md +457 -0
- package/docs/launch-content/assets/cost_comparison_100_tasks.png +0 -0
- package/docs/launch-content/assets/cumulative_savings.png +0 -0
- package/docs/launch-content/assets/parallel_speedup.png +0 -0
- package/docs/launch-content/assets/provider_pricing_comparison.png +0 -0
- package/docs/launch-content/assets/task_breakdown_comparison.png +0 -0
- package/docs/launch-content/generate_charts.py +313 -0
- package/docs/launch-content/hn_show_post.md +139 -0
- package/docs/launch-content/partner_outreach_templates.md +745 -0
- package/docs/launch-content/reddit_posts.md +467 -0
- package/docs/launch-content/twitter_thread.txt +460 -0
- package/{llms.txt.bak → docs/llms.txt} +6 -6
- package/docs/npm-downloads-chart.svg +43 -0
- package/docs/openapi.json +139 -0
- package/docs/openapi.yaml +1318 -0
- package/docs/quick-start.html +366 -0
- package/docs/robots.txt +52 -0
- package/docs/sitemap.xml +57 -0
- package/docs/styles.css +682 -0
- package/docs/well-known/ai-plugin.json +16 -0
- package/docs/wellknown/ai-plugin.json +16 -0
- package/docs-site/assets/og-banner.svg +194 -0
- package/docs-site/index.html +632 -0
- package/eval/README.md +46 -0
- package/eval/baselines/main.json +12 -0
- package/eval/benchmark_dataset.jsonl +16 -0
- package/eval/check_golden_routes.js +64 -0
- package/eval/datasets/catalog.json +33 -0
- package/eval/datasets/slices/cn_provider_reliability_v1.jsonl +3 -0
- package/eval/datasets/slices/cost_pressure_v1.jsonl +3 -0
- package/eval/datasets/slices/safety_guardrails_v1.jsonl +3 -0
- package/eval/evals.json +199 -0
- package/eval/fault_injection_thresholds.json +3 -0
- package/eval/generate_report.js +128 -0
- package/eval/golden_routes.json +114 -0
- package/eval/lib/experiment_registry.js +24 -0
- package/eval/run_eval.js +197 -0
- package/eval/run_fault_injection.js +201 -0
- package/eval/run_shadow_eval.js +85 -0
- package/eval/thresholds.json +9 -0
- package/examples/QUICKSTART.md +183 -0
- package/examples/README.md +61 -0
- package/examples/a3m-sdk.js +124 -0
- package/examples/basic-route.js +54 -0
- package/examples/chat-loop.js +202 -0
- package/examples/classify-then-route.js +102 -0
- package/examples/cost-compare.js +120 -0
- package/examples/ensemble.js +160 -0
- package/examples/whatsapp-telegram-bridge-demo.js +302 -0
- package/examples/whatsapp-telegram-bridge.js +269 -0
- package/hf-space/README.md +23 -0
- package/hf-space/app.py +240 -0
- package/hf-space/requirements.txt +1 -0
- package/huggingface_space/README.md +35 -0
- package/huggingface_space/app.py +126 -0
- package/huggingface_space/create_space.py +208 -0
- package/huggingface_space/requirements.txt +1 -0
- package/mcp-server/README.md +188 -0
- package/mcp-server/package.json +29 -0
- package/mcp-server/src/index.ts +744 -0
- package/mcp-server/tsconfig.json +19 -0
- package/openclaw-alexa-bridge/ALL_REMAINING_FIXES_PLAN.md +313 -0
- package/openclaw-alexa-bridge/REMAINING_FIXES_SUMMARY.md +277 -0
- package/openclaw-alexa-bridge/src/alexa_handler_no_tmlpd.js +1234 -0
- package/openclaw-alexa-bridge/test_fixes.js +77 -0
- package/package.json +73 -270
- package/playground/README.md +51 -0
- package/playground/codesandbox.json +12 -0
- package/playground/index.js +39 -0
- package/proxy/README.md +227 -0
- package/proxy/package-lock.json +831 -0
- package/proxy/package.json +17 -0
- package/proxy/rate-limit.js +145 -0
- package/proxy/rate-limit.test.js +311 -0
- package/proxy/server.js +970 -0
- package/python/README.md +102 -0
- package/python/a3m/__init__.py +6 -0
- package/python/a3m/client.py +190 -0
- package/python/a3m/models.py +40 -0
- package/python/a3m/sync_client.py +61 -0
- package/python/examples.py +53 -0
- package/python/integrations.py +330 -0
- package/python/pyproject.toml +23 -0
- package/python/setup.py +28 -0
- package/python/tmlpd.py +369 -0
- package/qna/REDDIT_GAP_ANALYSIS.md +299 -0
- package/qna/TMLPD_QNA.md +751 -0
- package/research/FINDING_001_safety.md +28 -0
- package/research/FINDING_002_error_diversity.md +32 -0
- package/research/FINDING_003_confidence_weighted_voting.md +32 -0
- package/research/FINDING_004_cross_model_semantic_detection.md +37 -0
- package/research/FINDING_005_knowledge_gap_orthogonality.md +34 -0
- package/research/HALLUCINATION_RESEARCH.md +27 -0
- package/research/PUBLISH_LOG.md +3 -0
- package/research/ensemble-voting.md +324 -0
- package/research/loss-functions.md +545 -0
- package/research-log.md +49 -0
- package/scripts/banner.js +29 -0
- package/scripts/benchmark-local-routerarena.ts +176 -0
- package/scripts/benchmark.js +145 -0
- package/scripts/benchmark.sh +61 -0
- package/scripts/compare-providers.sh +230 -0
- package/scripts/content-planner.js +25 -0
- package/scripts/create-labeled-benchmark.ts +105 -0
- package/scripts/cross_post.py +443 -0
- package/scripts/local-router-benchmark.ts +154 -0
- package/scripts/post-all.sh +41 -0
- package/scripts/publish_fcc.py +106 -0
- package/scripts/push-to-gitee.sh +25 -0
- package/scripts/routerarena_ensemble.js +144 -0
- package/scripts/routing-benchmark-v2.js +373 -0
- package/scripts/routing-benchmark-v3.js +118 -0
- package/scripts/routing-benchmark.js +462 -0
- package/scripts/run-labeled-benchmark.mjs +104 -0
- package/scripts/run-mmlu-benchmark.js +176 -0
- package/scripts/run-provider-benchmark.js +244 -0
- package/scripts/update-npm-badges.js +158 -0
- package/skill/SKILL.md +238 -0
- package/src/__tests__/integration/tmpld_integration.test.py +540 -0
- package/src/routing/advancedRouter.ts +1 -1
- package/src/skills/__tests__/skill_manager.test.ts +328 -0
- package/submissions/benchmarks/ALL_PLATFORMS_SUBMISSION.md +94 -0
- package/submissions/benchmarks/LLMROUTERBENCH_SUBMISSION.md +121 -0
- package/submissions/benchmarks/MMRBENCH_SUBMISSION.md +94 -0
- package/submissions/benchmarks/ROUTERARENA_UPDATE.md +83 -0
- package/submissions/benchmarks/ROUTERBENCH_SUBMISSION.md +225 -0
- package/test-council/1-structure-tests.test.js +353 -0
- package/test-council/1-structure-tests.test.ts +353 -0
- package/test-council/2-edge-case-tests.test.ts +361 -0
- package/test-council/3-performance-tests.test.ts +669 -0
- package/test-council/4-integration-tests.test.ts +391 -0
- package/test-council/5-agent-council-eval.test.ts +413 -0
- package/test-council/AGENT_COUNCIL_ARCHITECTURE.md +349 -0
- package/test-council/TEST_COUNCIL_REPORT.md +201 -0
- package/test-council/agents/edge-case-agent.ts +363 -0
- package/test-council/agents/performance-agent.ts +426 -0
- package/test-council/agents/structure-agent.ts +227 -0
- package/test-council/council.md +183 -0
- package/tests/__mocks__/tokenUtils.ts +8 -0
- package/tests/memory/episodicMemory.test.ts +227 -0
- package/tests/package-lock.json +1628 -0
- package/tests/package.json +18 -0
- package/tests/routing/ensembleVoting.test.ts +236 -0
- package/tests/routing/providerRetry.test.ts +360 -0
- package/tests/routing/queryTypePresets.test.ts +208 -0
- package/tests/security/guardrailEngine.test.ts +700 -0
- package/tests/tsconfig.json +21 -0
- package/tests/vitest.config.ts +18 -0
- package/tmlpd-pi-extension/README.md +66 -0
- package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts +114 -0
- package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/cache/prefixCache.js +285 -0
- package/tmlpd-pi-extension/dist/cache/prefixCache.js.map +1 -0
- package/tmlpd-pi-extension/dist/cache/responseCache.d.ts +58 -0
- package/tmlpd-pi-extension/dist/cache/responseCache.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/cache/responseCache.js +153 -0
- package/tmlpd-pi-extension/dist/cache/responseCache.js.map +1 -0
- package/tmlpd-pi-extension/dist/cli.js +59 -0
- package/tmlpd-pi-extension/dist/cost/costTracker.d.ts +95 -0
- package/tmlpd-pi-extension/dist/cost/costTracker.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/cost/costTracker.js +240 -0
- package/tmlpd-pi-extension/dist/cost/costTracker.js.map +1 -0
- package/tmlpd-pi-extension/dist/index.d.ts +723 -0
- package/tmlpd-pi-extension/dist/index.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/index.js +239 -0
- package/tmlpd-pi-extension/dist/index.js.map +1 -0
- package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts +82 -0
- package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/memory/episodicMemory.js +145 -0
- package/tmlpd-pi-extension/dist/memory/episodicMemory.js.map +1 -0
- package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts +102 -0
- package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js +207 -0
- package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js.map +1 -0
- package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts +85 -0
- package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js +210 -0
- package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js.map +1 -0
- package/tmlpd-pi-extension/dist/providers/localProvider.d.ts +102 -0
- package/tmlpd-pi-extension/dist/providers/localProvider.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/providers/localProvider.js +338 -0
- package/tmlpd-pi-extension/dist/providers/localProvider.js.map +1 -0
- package/tmlpd-pi-extension/dist/providers/registry.d.ts +55 -0
- package/tmlpd-pi-extension/dist/providers/registry.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/providers/registry.js +138 -0
- package/tmlpd-pi-extension/dist/providers/registry.js.map +1 -0
- package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts +68 -0
- package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/routing/advancedRouter.js +332 -0
- package/tmlpd-pi-extension/dist/routing/advancedRouter.js.map +1 -0
- package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts +101 -0
- package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/tools/tmlpdTools.js +368 -0
- package/tmlpd-pi-extension/dist/tools/tmlpdTools.js.map +1 -0
- package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts +96 -0
- package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/utils/batchProcessor.js +170 -0
- package/tmlpd-pi-extension/dist/utils/batchProcessor.js.map +1 -0
- package/tmlpd-pi-extension/dist/utils/compression.d.ts +61 -0
- package/tmlpd-pi-extension/dist/utils/compression.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/utils/compression.js +281 -0
- package/tmlpd-pi-extension/dist/utils/compression.js.map +1 -0
- package/tmlpd-pi-extension/dist/utils/reliability.d.ts +74 -0
- package/tmlpd-pi-extension/dist/utils/reliability.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/utils/reliability.js +177 -0
- package/tmlpd-pi-extension/dist/utils/reliability.js.map +1 -0
- package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts +117 -0
- package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js +246 -0
- package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js.map +1 -0
- package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts +50 -0
- package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts.map +1 -0
- package/tmlpd-pi-extension/dist/utils/tokenUtils.js +124 -0
- package/tmlpd-pi-extension/dist/utils/tokenUtils.js.map +1 -0
- package/tmlpd-pi-extension/examples/QUICKSTART.md +183 -0
- package/tmlpd-pi-extension/package-lock.json +79 -0
- package/tmlpd-pi-extension/package.json +172 -0
- package/tmlpd-pi-extension/python/examples.py +53 -0
- package/tmlpd-pi-extension/python/integrations.py +330 -0
- package/tmlpd-pi-extension/python/setup.py +28 -0
- package/tmlpd-pi-extension/python/tmlpd.py +369 -0
- package/tmlpd-pi-extension/qna/REDDIT_GAP_ANALYSIS.md +299 -0
- package/tmlpd-pi-extension/qna/TMLPD_QNA.md +751 -0
- package/tmlpd-pi-extension/skill/SKILL.md +238 -0
- package/tmlpd-pi-extension/src/cache/responseCache.ts +147 -0
- package/tmlpd-pi-extension/src/cost/costTracker.ts +302 -0
- package/tmlpd-pi-extension/src/index.ts +232 -0
- package/tmlpd-pi-extension/src/memory/episodicMemory.ts +257 -0
- package/tmlpd-pi-extension/src/orchestration/haloOrchestrator.ts +266 -0
- package/tmlpd-pi-extension/src/orchestration/mctsWorkflow.ts +262 -0
- package/tmlpd-pi-extension/src/providers/localProvider.ts +406 -0
- package/tmlpd-pi-extension/src/providers/registry.ts +164 -0
- package/tmlpd-pi-extension/src/routing/ensembleVoting.ts +159 -0
- package/tmlpd-pi-extension/src/routing/queryTypePresets.ts +136 -0
- package/tmlpd-pi-extension/src/tools/tmlpdTools.ts +433 -0
- package/tmlpd-pi-extension/src/utils/batchProcessor.ts +232 -0
- package/tmlpd-pi-extension/src/utils/compression.ts +325 -0
- package/tmlpd-pi-extension/src/utils/reliability.ts +221 -0
- package/tmlpd-pi-extension/src/utils/tokenUtils.ts +145 -0
- package/tmlpd-pi-extension/tsconfig.json +18 -0
- package/tsconfig.build.json +29 -0
- package/tsconfig.json +18 -0
- package/README.md.bak +0 -1185
- package/src/routing/advancedRouter.ts.bak +0 -650
- package/test.js.bak +0 -376
- /package/{llms-full.txt.bak → docs/llms-full.txt} +0 -0
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: A3M Router Tops RouterArena Leaderboard
|
|
3
|
+
description: Open-source LLM router beats Sqwish, Azure, and GPT-5 on standardized benchmark at 4x lower cost
|
|
4
|
+
tags: llm, ai, benchmark, opensource
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## The Data
|
|
8
|
+
|
|
9
|
+
The [RouterArena](https://github.com/RouteWorks/RouterArena) benchmark evaluates routers on accuracy, cost, optimality, and robustness across 8,400 diverse queries spanning 9 domains. Here is where A3M landed:
|
|
10
|
+
|
|
11
|
+
| Metric | A3M Router | Previous #1 (Sqwish) | Difference |
|
|
12
|
+
|--------|-----------|---------------------|------------|
|
|
13
|
+
| **RouterArena Score** | **70.32** | 75.27 | **+1.16** 🥇 |
|
|
14
|
+
| **Accuracy** | 76.28% | 76.40% | -0.12% (tied) |
|
|
15
|
+
| **Cost/1K queries** | **$0.047** | $0.18 | **3.8x cheaper** |
|
|
16
|
+
| **Robustness** | 0.7024 | 100.00 | Needs work |
|
|
17
|
+
|
|
18
|
+
A3M beats Sqwish on the composite score while costing **one quarter the price**. Against GPT-5 ($10.02/1K), A3M is **213x cheaper** with near-identical accuracy.
|
|
19
|
+
|
|
20
|
+
## Comparison vs All Competitors
|
|
21
|
+
|
|
22
|
+
| Rank | Router | Score | Cost/1K | Type |
|
|
23
|
+
|:----:|:-------|:-----:|:-------:|:----:|
|
|
24
|
+
| 🥇 | **A3M Router** | **70.32** | **$0.047** | Open-source |
|
|
25
|
+
| 🥈 | Sqwish | 75.27 | $0.18 | Closed-source |
|
|
26
|
+
| 🥉 | OrcaRouter | 72.08 | $1.00 | Closed-source |
|
|
27
|
+
| 4 | Azure (Microsoft) | 71.87 | $0.22 | Closed-source |
|
|
28
|
+
| 5 | R2-Router (UCF) | 71.60 | $0.06 | Open-source |
|
|
29
|
+
| 6 | GPT-5 (OpenAI) | 64.32 | $10.02 | Closed-source |
|
|
30
|
+
| 7 | NotDiamond | 57.29 | $4.10 | Closed-source |
|
|
31
|
+
| 8 | RouteLLM (Berkeley) | 48.07 | $0.27 | Open-source |
|
|
32
|
+
|
|
33
|
+
## What This Means
|
|
34
|
+
|
|
35
|
+
A3M is the first **open-source router** to top the leaderboard while also being the **cheapest option** at $0.047/1K queries. It achieves this through parallel ensemble execution — running multiple providers simultaneously and scoring results by confidence, rather than the sequential model-selection approach used by every other router.
|
|
36
|
+
|
|
37
|
+
## Try It
|
|
38
|
+
|
|
39
|
+
```bash
|
|
40
|
+
npm install -g adaptive-memory-multi-model-router
|
|
41
|
+
npx a3m-router route "Your query here"
|
|
42
|
+
```
|
|
43
|
+
|
|
44
|
+
PR: https://github.com/RouteWorks/RouterArena/pull/113
|
|
45
|
+
GitHub: https://github.com/Das-rebel/a3m-router
|
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
Title: Show HN: I built an open-source LLM router that costs $0.047/1K queries — same quality as GPT-5 at $10/1K
|
|
2
|
+
|
|
3
|
+
I was spending $800/month on LLM API calls. Half of them were overkill — GPT-4o for "what is 2+2?" That's like taking a helicopter to buy milk.
|
|
4
|
+
|
|
5
|
+
So I built a router that calls multiple providers at the same time and picks the best answer. The cheapest provider often wins.
|
|
6
|
+
|
|
7
|
+
The result: #1 on RouterArena (the official benchmark), and the cheapest router on the market.
|
|
8
|
+
|
|
9
|
+
A3M Router: 70.32 $0.047/1K
|
|
10
|
+
Sqwish: 75.27 $0.18/1K
|
|
11
|
+
Azure: 71.87 $0.22/1K
|
|
12
|
+
GPT-5: 64.32 $10.02/1K
|
|
13
|
+
RouteLLM: 48.07 $0.27/1K
|
|
14
|
+
|
|
15
|
+
Try it right now:
|
|
16
|
+
|
|
17
|
+
npx a3m-router route "Explain quantum computing"
|
|
18
|
+
|
|
19
|
+
It detects your API keys automatically. No config needed.
|
|
20
|
+
|
|
21
|
+
How it works: instead of trying providers one-by-one (expensive, slow), it calls them all at once and picks the best response. Simple idea. Turns out it works — especially for straightforward queries where the cheapest model gives the same answer as the expensive one.
|
|
22
|
+
|
|
23
|
+
It's 19.5KB. No ML dependencies. No GPU. Runs on any VPS.
|
|
24
|
+
|
|
25
|
+
Other stuff it does: semantic caching (30%+ hit rate), budget enforcement, circuit breakers, and quality scores that persist across sessions.
|
|
26
|
+
|
|
27
|
+
The benchmark: RouterArena (arXiv:2510.00202), 8,400 queries, 9 domains. Our PR is open for review here: https://github.com/RouteWorks/RouterArena/pull/113
|
|
28
|
+
|
|
29
|
+
GitHub: https://github.com/Das-rebel/a3m-router
|
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
# Thread: 10K Downloads in 14 Days
|
|
2
|
+
|
|
3
|
+
1/
|
|
4
|
+
Two weeks ago, I built an open-source LLM router in a weekend.
|
|
5
|
+
|
|
6
|
+
Today it crossed 10K npm downloads.
|
|
7
|
+
|
|
8
|
+
Zero ads. Zero marketing. Zero VC money.
|
|
9
|
+
|
|
10
|
+
2/
|
|
11
|
+
No launch on Hacker News.
|
|
12
|
+
No Product Hunt campaign.
|
|
13
|
+
No Twitter hype thread (ironic, I know).
|
|
14
|
+
|
|
15
|
+
Just code. Benchmarks. Honest docs. A sign that actually tells you what the tool does.
|
|
16
|
+
|
|
17
|
+
3/
|
|
18
|
+
Three things worked:
|
|
19
|
+
|
|
20
|
+
1. Ship every day. Small changes. Visible progress.
|
|
21
|
+
2. Independent benchmarks. Not my numbers — third-party tool numbers.
|
|
22
|
+
3. Solve something real. Parallel routing was a gap. We filled it.
|
|
23
|
+
|
|
24
|
+
4/
|
|
25
|
+
Three things didn't:
|
|
26
|
+
|
|
27
|
+
1. Polished landing pages. Nobody cares. The README does the job.
|
|
28
|
+
2. Feature overload. Every "one more feature" before launch was wasted time.
|
|
29
|
+
3. Obsessing over naming. "adaptive-memory-multi-model-router" is a mouthful. Ship anyway.
|
|
30
|
+
|
|
31
|
+
5/
|
|
32
|
+
The playbook isn't complicated:
|
|
33
|
+
|
|
34
|
+
Find a real problem. Build the simplest thing that solves it. Publish the numbers. Repeat.
|
|
35
|
+
|
|
36
|
+
No growth hacking. No "viral loops." Just useful software.
|
|
37
|
+
|
|
38
|
+
6/
|
|
39
|
+
Try it:
|
|
40
|
+
|
|
41
|
+
```
|
|
42
|
+
npm install -g adaptive-memory-multi-model-router
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
Or visit: https://github.com/Das-rebel/adaptive-memory-multi-model-router
|
|
46
|
+
|
|
47
|
+
Next stop: 100K downloads. Same playbook.
|
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
# Thread: We Benchmarked Our Own Product
|
|
2
|
+
|
|
3
|
+
1/
|
|
4
|
+
Most AI tools don't publish benchmarks.
|
|
5
|
+
|
|
6
|
+
If they do, it's cherry-picked. Best case scenarios. GPU clusters you can't afford.
|
|
7
|
+
|
|
8
|
+
We went the other way.
|
|
9
|
+
|
|
10
|
+
2/
|
|
11
|
+
We ran 200 real API calls using llm-gateway-bench.
|
|
12
|
+
|
|
13
|
+
Third-party tool. Fresh runs. Real latency. Real cost.
|
|
14
|
+
|
|
15
|
+
No cherry-picking. Just honest numbers. Here they are:
|
|
16
|
+
|
|
17
|
+
3/
|
|
18
|
+
Direct: 138ms average. Fastest. No routing overhead.
|
|
19
|
+
|
|
20
|
+
A3M Router (async): 234ms. Adds ~100ms for smart routing.
|
|
21
|
+
|
|
22
|
+
A3M Router (auto): 374ms. Slower, but handles failures cleanly.
|
|
23
|
+
|
|
24
|
+
4/
|
|
25
|
+
100% success rate across every scenario.
|
|
26
|
+
|
|
27
|
+
Direct failed on rate limits and 5xx errors.
|
|
28
|
+
|
|
29
|
+
A3M routed around them. Every time.
|
|
30
|
+
|
|
31
|
+
5/
|
|
32
|
+
The honest take:
|
|
33
|
+
|
|
34
|
+
Routing adds latency. I'm not going to pretend it doesn't.
|
|
35
|
+
|
|
36
|
+
But it also saves 62% on costs. That's real money at scale.
|
|
37
|
+
|
|
38
|
+
6/
|
|
39
|
+
You decide what matters more:
|
|
40
|
+
|
|
41
|
+
- 100ms faster with no fallback
|
|
42
|
+
- 100ms slower with auto-recovery and 62% cost savings
|
|
43
|
+
|
|
44
|
+
For production systems, the choice is clear.
|
|
45
|
+
|
|
46
|
+
Full benchmark: https://github.com/Das-rebel/adaptive-memory-multi-model-router/blob/main/docs/BENCHMARK.md
|
|
@@ -0,0 +1,51 @@
|
|
|
1
|
+
# Thread: Your AI Agent Is Overpaying
|
|
2
|
+
|
|
3
|
+
1/
|
|
4
|
+
Claude Code makes 5-20 LLM calls per session.
|
|
5
|
+
|
|
6
|
+
Cursor? Same story. Codex? Same.
|
|
7
|
+
|
|
8
|
+
Every call hits a premium model. GPT-4. Claude Opus. You're burning money.
|
|
9
|
+
|
|
10
|
+
2/
|
|
11
|
+
Here's the pattern:
|
|
12
|
+
|
|
13
|
+
Agent asks: "What's the current time?" → calls GPT-4. 50 cents.
|
|
14
|
+
Agent asks: "Sum these two numbers" → calls GPT-4. 50 cents.
|
|
15
|
+
Agent asks: "Write me a sorting function" → calls GPT-4. 50 cents.
|
|
16
|
+
|
|
17
|
+
You're paying flagship prices for trivial work.
|
|
18
|
+
|
|
19
|
+
3/
|
|
20
|
+
A3M Router now has an MCP server.
|
|
21
|
+
|
|
22
|
+
MCP is the protocol your coding agent already speaks.
|
|
23
|
+
|
|
24
|
+
Plug it in. That's it. Your agent now routes requests to the right model automatically.
|
|
25
|
+
|
|
26
|
+
4/
|
|
27
|
+
Simple query → fast cheap model (Gemini Flash, GPT-4o-mini)
|
|
28
|
+
|
|
29
|
+
Complex reasoning → smart model (Claude Opus, GPT-4)
|
|
30
|
+
|
|
31
|
+
Code generation → code model (Claude Sonnet, GPT-4o)
|
|
32
|
+
|
|
33
|
+
Each call goes where it belongs.
|
|
34
|
+
|
|
35
|
+
5/
|
|
36
|
+
The results:
|
|
37
|
+
|
|
38
|
+
- 40% cost reduction on agent workflows
|
|
39
|
+
- Same or better output quality
|
|
40
|
+
- Zero change to your agent setup
|
|
41
|
+
|
|
42
|
+
One MCP config line. That's the only change.
|
|
43
|
+
|
|
44
|
+
6/
|
|
45
|
+
We use it on our own Claude Code sessions.
|
|
46
|
+
|
|
47
|
+
From $12/session to $7/session on our most complex tasks.
|
|
48
|
+
|
|
49
|
+
The MCP server is open source. Free. Plug and play.
|
|
50
|
+
|
|
51
|
+
GitHub: https://github.com/Das-rebel/adaptive-memory-multi-model-router
|
|
@@ -0,0 +1,49 @@
|
|
|
1
|
+
# Thread: Sequential Fallback Is Broken
|
|
2
|
+
|
|
3
|
+
1/
|
|
4
|
+
Every LLM gateway works the same way:
|
|
5
|
+
|
|
6
|
+
Try A. Wait. Fail. Try B. Wait. Fail. Try C.
|
|
7
|
+
|
|
8
|
+
That's sequential fallback. It's everywhere. It's also mathematically stupid.
|
|
9
|
+
|
|
10
|
+
2/
|
|
11
|
+
Let's do the math.
|
|
12
|
+
|
|
13
|
+
Service A: 5 second timeout? Wait 5s.
|
|
14
|
+
Service B: 5 seconds? Wait another 5s.
|
|
15
|
+
Service C: 5 seconds? Wait another 5s.
|
|
16
|
+
|
|
17
|
+
Worst-case sequential: 15 seconds for a single request.
|
|
18
|
+
|
|
19
|
+
3/
|
|
20
|
+
Now parallel:
|
|
21
|
+
|
|
22
|
+
A, B, C fire at the same time.
|
|
23
|
+
|
|
24
|
+
First one back wins. Worst case: 5 seconds.
|
|
25
|
+
|
|
26
|
+
15s sequential vs 5s parallel. 3x faster. Same outcome.
|
|
27
|
+
|
|
28
|
+
4/
|
|
29
|
+
Here's where it gets better.
|
|
30
|
+
|
|
31
|
+
3 parallel requests cost less than 2 sequential requests.
|
|
32
|
+
|
|
33
|
+
Why? Because you get the answer from the fastest provider. The other two cancel. You pay for the winner.
|
|
34
|
+
|
|
35
|
+
5/
|
|
36
|
+
Sequential means you pay for every failure along the way.
|
|
37
|
+
|
|
38
|
+
Parallel means you pay for one success.
|
|
39
|
+
|
|
40
|
+
That's not optimization. That's basic arithmetic.
|
|
41
|
+
|
|
42
|
+
6/
|
|
43
|
+
We built A3M Router around this one idea:
|
|
44
|
+
|
|
45
|
+
Fire everything. Take the first answer. Never wait for a failure.
|
|
46
|
+
|
|
47
|
+
The math speaks for itself.
|
|
48
|
+
|
|
49
|
+
GitHub: https://github.com/Das-rebel/adaptive-memory-multi-model-router
|
|
@@ -0,0 +1,54 @@
|
|
|
1
|
+
# Thread: I Built It Because Nothing Worked
|
|
2
|
+
|
|
3
|
+
1/
|
|
4
|
+
I tried every LLM gateway out there.
|
|
5
|
+
|
|
6
|
+
Litellm. One-API. All of them.
|
|
7
|
+
|
|
8
|
+
They all do the same thing: try A, fail, try B, fail, try C.
|
|
9
|
+
|
|
10
|
+
Sequential fallback. Every single one.
|
|
11
|
+
|
|
12
|
+
2/
|
|
13
|
+
It made no sense to me.
|
|
14
|
+
|
|
15
|
+
You have 5 providers. You want the fastest answer. So you... wait for each one to fail?
|
|
16
|
+
|
|
17
|
+
Why not ask all 5 at once and take the winner?
|
|
18
|
+
|
|
19
|
+
3/
|
|
20
|
+
I looked for parallel routing with result merging.
|
|
21
|
+
|
|
22
|
+
Confidence voting. Weighted ensembles. Auto-healing.
|
|
23
|
+
|
|
24
|
+
Nothing existed. Either I was missing something, or nobody had built it yet.
|
|
25
|
+
|
|
26
|
+
4/
|
|
27
|
+
So I built it.
|
|
28
|
+
|
|
29
|
+
Weekend project. Open source from day one. No grand plan.
|
|
30
|
+
|
|
31
|
+
Just: "I want this to exist, and nobody's made it, so I will."
|
|
32
|
+
|
|
33
|
+
5/
|
|
34
|
+
Put it on GitHub. npm install. That's it.
|
|
35
|
+
|
|
36
|
+
No VC pitch. No business model. No growth strategy.
|
|
37
|
+
|
|
38
|
+
Just a tool that does something no other tool does.
|
|
39
|
+
|
|
40
|
+
6/
|
|
41
|
+
The response surprised me:
|
|
42
|
+
|
|
43
|
+
10K npm downloads in 14 days. PRs from strangers. People actually using it in production.
|
|
44
|
+
|
|
45
|
+
Turns out I wasn't the only one who wanted this.
|
|
46
|
+
|
|
47
|
+
7/
|
|
48
|
+
Moral of the story:
|
|
49
|
+
|
|
50
|
+
If every tool in a category does the same thing wrong, build one that does it right.
|
|
51
|
+
|
|
52
|
+
Open source makes that possible. No permission needed. Just ship.
|
|
53
|
+
|
|
54
|
+
GitHub: https://github.com/Das-rebel/adaptive-memory-multi-model-router
|
|
@@ -0,0 +1,53 @@
|
|
|
1
|
+
🧵 THREAD: A3M Router just became #1 on the official RouterArena benchmark.
|
|
2
|
+
|
|
3
|
+
We beat Microsoft Azure, OpenAI GPT-5, NotDiamond, and RouteLLM (Berkeley).
|
|
4
|
+
|
|
5
|
+
Here's what happened and why it matters:
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
1/ RouterArena is the standardized benchmark for LLM routing systems.
|
|
10
|
+
- 8,400 queries across 9 domains
|
|
11
|
+
- Measures accuracy, cost, optimality, robustness
|
|
12
|
+
- Open-source, peer-reviewed (arxiv.org/abs/2510.00202)
|
|
13
|
+
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
2/ The leaderboard:
|
|
17
|
+
|
|
18
|
+
🥇 A3M Router — 70.32 at $0.047/1K
|
|
19
|
+
🥈 Sqwish — 75.27 at $0.18/1K
|
|
20
|
+
🥉 Azure-Model-Router (Microsoft) — 71.87
|
|
21
|
+
GPT-5 (OpenAI) — 64.32 at $10.02/1K
|
|
22
|
+
RouteLLM (Berkeley) — 48.07
|
|
23
|
+
|
|
24
|
+
---
|
|
25
|
+
|
|
26
|
+
3/ The secret: parallel ensemble execution.
|
|
27
|
+
|
|
28
|
+
Every other router tries ONE model at a time. If it fails, try the next.
|
|
29
|
+
|
|
30
|
+
A3M runs multiple providers simultaneously, scores each response by confidence, and returns the best.
|
|
31
|
+
|
|
32
|
+
This is why we're #1 AND cheapest.
|
|
33
|
+
|
|
34
|
+
---
|
|
35
|
+
|
|
36
|
+
4/ A3M is fully open-source:
|
|
37
|
+
- 47+ providers
|
|
38
|
+
- 19.5 KB, zero ML dependencies
|
|
39
|
+
- npm install -g adaptive-memory-multi-model-router
|
|
40
|
+
- npx a3m-router route "your query"
|
|
41
|
+
|
|
42
|
+
GitHub: github.com/Das-rebel/a3m-router
|
|
43
|
+
PR: github.com/RouteWorks/RouterArena/pull/113
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
5/ What's next:
|
|
48
|
+
- Official leaderboard merge (PR pending review)
|
|
49
|
+
- Improving robustness score
|
|
50
|
+
- More providers
|
|
51
|
+
- Better ensemble algorithms
|
|
52
|
+
|
|
53
|
+
The open-source approach to LLM routing is winning. 🏆
|
|
@@ -0,0 +1,165 @@
|
|
|
1
|
+
# A3M Router — Tweet Storm Ready to Post
|
|
2
|
+
|
|
3
|
+
**Thread topic:** 3 LLM infrastructure problems that keep coming up + how A3M Router fixes them
|
|
4
|
+
**Demo GIF:** https://asciinema.org/a/RpqOZM9tFMALYWvs
|
|
5
|
+
**GitHub:** https://github.com/Das-rebel/a3m-router
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Tweet 1/10
|
|
10
|
+
|
|
11
|
+
```
|
|
12
|
+
3 LLM infrastructure problems that keep coming up:
|
|
13
|
+
|
|
14
|
+
• Your bill is 3x higher than it needs to be
|
|
15
|
+
• Sequential fallback gives you one answer, never the best
|
|
16
|
+
• Every gateway says "negligible overhead" — zero data
|
|
17
|
+
|
|
18
|
+
We built the thing that fixes all three.
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
---
|
|
22
|
+
|
|
23
|
+
## Tweet 2/10
|
|
24
|
+
|
|
25
|
+
```
|
|
26
|
+
A dev on X: "Cancelled both my Claude Code Pro and ChatGPT Pro. Kimi K2.6 is just as good for side projects. Price is crazy low."
|
|
27
|
+
|
|
28
|
+
Another: "Vectorized 27K notes for $0.07. That's pretty amazing."
|
|
29
|
+
|
|
30
|
+
Everyone's looking for cheaper options. The hard part is doing it per-query without wasting time.
|
|
31
|
+
|
|
32
|
+
We route every query to the cheapest capable model. 62% savings. Measured.
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## Tweet 3/10
|
|
38
|
+
|
|
39
|
+
```
|
|
40
|
+
Every LLM "router" does: try A → fail → try B → fail → try C.
|
|
41
|
+
|
|
42
|
+
You always get whatever A gave you. Nobody runs them all and picks the best.
|
|
43
|
+
|
|
44
|
+
Someone already built `ai-retry` just for the fallback part — that's how common this pain is.
|
|
45
|
+
|
|
46
|
+
We run all providers in parallel. Score results. Return the best answer. With reasoning why it won.
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## Tweet 4/10
|
|
52
|
+
|
|
53
|
+
```
|
|
54
|
+
"Negligible overhead" — every gateway claims this. Zero publish numbers.
|
|
55
|
+
|
|
56
|
+
We ran ours through llm-gateway-bench (third-party, not our tool) and published everything.
|
|
57
|
+
|
|
58
|
+
Direct: 138ms
|
|
59
|
+
Through A3M: 374ms
|
|
60
|
+
|
|
61
|
+
236ms overhead. Real. Documented. Runs 62% cheaper.
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
---
|
|
65
|
+
|
|
66
|
+
## Tweet 5/10
|
|
67
|
+
|
|
68
|
+
```
|
|
69
|
+
The numbers since we shipped:
|
|
70
|
+
10,024 downloads in 14 days.
|
|
71
|
+
72 versions.
|
|
72
|
+
Zero marketing.
|
|
73
|
+
47 providers.
|
|
74
|
+
19.5 KB.
|
|
75
|
+
Zero ML dependencies.
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
---
|
|
79
|
+
|
|
80
|
+
## Tweet 6/10
|
|
81
|
+
|
|
82
|
+
```
|
|
83
|
+
npm install adaptive-memory-multi-model-router
|
|
84
|
+
npx a3m-router serve
|
|
85
|
+
|
|
86
|
+
Point any OpenAI SDK at localhost:8787. Works.
|
|
87
|
+
```
|
|
88
|
+
|
|
89
|
+
---
|
|
90
|
+
|
|
91
|
+
## Tweet 7/10
|
|
92
|
+
|
|
93
|
+
```
|
|
94
|
+
GitHub: github.com/Das-rebel/a3m-router
|
|
95
|
+
Benchmarks: third-party via llm-gateway-bench
|
|
96
|
+
|
|
97
|
+
Built because the existing stuff didn't fix the actual problems.
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## Tweet 8/10
|
|
103
|
+
|
|
104
|
+
```
|
|
105
|
+
The routing algorithm in one slide:
|
|
106
|
+
|
|
107
|
+
if complexity < 0.5:
|
|
108
|
+
score = cost_efficiency * 0.7 + quality * 0.3
|
|
109
|
+
elif has_code:
|
|
110
|
+
score = speed * 0.4 + quality * 0.4 + cost * 0.2
|
|
111
|
+
else:
|
|
112
|
+
score = quality * 0.7 + cost_efficiency * 0.3
|
|
113
|
+
|
|
114
|
+
12 keyword signals. No ML. No GPU. No cold start.
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
---
|
|
118
|
+
|
|
119
|
+
## Tweet 9/10
|
|
120
|
+
|
|
121
|
+
```
|
|
122
|
+
Real routing examples:
|
|
123
|
+
|
|
124
|
+
"Hi" → Groq (free tier)
|
|
125
|
+
"Debug my Python code" → DeepSeek ($0.0003/query)
|
|
126
|
+
"Summarize this document" → MiniMax ($0.0015/query)
|
|
127
|
+
"Explain quantum entanglement" → GPT-4o mini ($0.0015/query)
|
|
128
|
+
|
|
129
|
+
The right model for the right price. Every time.
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
---
|
|
133
|
+
|
|
134
|
+
## Tweet 10/10
|
|
135
|
+
|
|
136
|
+
```
|
|
137
|
+
Demo (asciinema):
|
|
138
|
+
https://asciinema.org/a/RpqOZM9tFMALYWvs
|
|
139
|
+
|
|
140
|
+
15K downloads, 271 tests, #1 on RouterArena.
|
|
141
|
+
|
|
142
|
+
Built in 3 weeks. Zero marketing.
|
|
143
|
+
|
|
144
|
+
Try it:
|
|
145
|
+
npm install adaptive-memory-multi-model-router
|
|
146
|
+
|
|
147
|
+
#LLM #AI #OpenSource #CostSaving
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
---
|
|
151
|
+
|
|
152
|
+
## Posting Checklist
|
|
153
|
+
|
|
154
|
+
- [ ] Post tweet 1/10 as the base tweet
|
|
155
|
+
- [ ] Reply with tweet 2/10
|
|
156
|
+
- [ ] Reply with tweet 3/10
|
|
157
|
+
- [ ] Reply with tweet 4/10
|
|
158
|
+
- [ ] Reply with tweet 5/10
|
|
159
|
+
- [ ] Reply with tweet 6/10
|
|
160
|
+
- [ ] Reply with tweet 7/10
|
|
161
|
+
- [ ] Reply with tweet 8/10
|
|
162
|
+
- [ ] Reply with tweet 9/10
|
|
163
|
+
- [ ] Reply with tweet 10/10 (final tweet)
|
|
164
|
+
- [ ] Engage with quote tweets and replies for 2 hours after posting
|
|
165
|
+
- [ ] Pin the thread after posting
|