adaptive-memory-multi-model-router 2.14.46 → 2.14.48

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (598) hide show
  1. package/{docs/llms.txt → llms.txt.bak} +6 -6
  2. package/package.json +270 -72
  3. package/src/routing/advancedRouter.ts.bak +650 -0
  4. package/test.js.bak +376 -0
  5. package/.dockerignore +0 -82
  6. package/.env.example +0 -303
  7. package/.github/DISCUSSIONS_WELCOME.md +0 -27
  8. package/.github/DISCUSSION_TEMPLATE.yml +0 -5
  9. package/.github/FUNDING.yml +0 -2
  10. package/.github/ISSUE_TEMPLATE/bug_report.md +0 -94
  11. package/.github/ISSUE_TEMPLATE/config.yml +0 -17
  12. package/.github/ISSUE_TEMPLATE/feature_request.md +0 -71
  13. package/.github/PULL_REQUEST_TEMPLATE.md +0 -71
  14. package/.github/dependabot.yml +0 -9
  15. package/.github/workflows/auto-publish.yml +0 -51
  16. package/.github/workflows/ci.yml +0 -263
  17. package/.github/workflows/codeql.yml +0 -38
  18. package/.github/workflows/npm-publish.yml +0 -20
  19. package/.github/workflows/pages.yml +0 -37
  20. package/.github/workflows/stale.yml +0 -54
  21. package/.publish-tick +0 -1
  22. package/.well-known/ai-plugin.json +0 -16
  23. package/AGENT_COUNCIL_FINDINGS.md +0 -142
  24. package/ARCHITECTURE.md +0 -346
  25. package/AUDIT_REPORT.md +0 -28
  26. package/CODE_OF_CONDUCT.md +0 -128
  27. package/CONTRIBUTING.md +0 -50
  28. package/CONTRIBUTORS.md +0 -20
  29. package/Dockerfile +0 -53
  30. package/Dockerfile.proxy +0 -33
  31. package/HEALTH_REPORT.md +0 -118
  32. package/IMPROVEMENT_PLAN.md +0 -107
  33. package/LANDING.md +0 -43
  34. package/LAUNCH-PAIN-DRIVEN.md +0 -339
  35. package/LAUNCH.md +0 -337
  36. package/LAUNCH_CHECKLIST.md +0 -141
  37. package/LAUNCH_SNAPSHOT.md +0 -260
  38. package/MANIFESTO.md +0 -41
  39. package/POPULARITY_BOOSTERS.md +0 -285
  40. package/PR_STATUS_REPORT.md +0 -148
  41. package/REDESIGN.md +0 -95
  42. package/RUNKIT.md +0 -83
  43. package/SECURITY.md +0 -29
  44. package/SUBMISSIONS.md +0 -43
  45. package/_schema.html +0 -53
  46. package/ai-plugin.json +0 -16
  47. package/articles/AI_AGENT_LLM_ROUTING.md +0 -150
  48. package/articles/CHINESE_DIRECTORIES.md +0 -100
  49. package/articles/CHINESE_SUBMISSIONS_READY.md +0 -322
  50. package/articles/COMPETITOR_ALERTS.md +0 -31
  51. package/articles/COMPLETE_POSTING_DIRECTORY.md +0 -147
  52. package/articles/CONTENT_STRUCTURE.md +0 -292
  53. package/articles/DEVTO_COST_GUIDE.md +0 -473
  54. package/articles/DEVTO_FINAL.md +0 -416
  55. package/articles/DEVTO_MULTI_PROVIDER.md +0 -542
  56. package/articles/DEVTO_READY.md +0 -255
  57. package/articles/DEVTO_V2_ANNOUNCEMENT.md +0 -160
  58. package/articles/DEVTO_VIRAL_GROWTH.md +0 -280
  59. package/articles/FRESH_devto.md +0 -460
  60. package/articles/FRESH_devto_2026_05.md +0 -73
  61. package/articles/FRESH_hackernews.md +0 -14
  62. package/articles/FRESH_reddit_ml.md +0 -90
  63. package/articles/FRESH_reddit_node.md +0 -198
  64. package/articles/FRESH_reddit_sideproject.md +0 -72
  65. package/articles/FRESH_reddit_webdev.md +0 -130
  66. package/articles/FROM_ZERO_TO_10K.md +0 -107
  67. package/articles/HN_10X_BETTER.md +0 -430
  68. package/articles/HN_ACCOUNT_GUIDE.md +0 -21
  69. package/articles/HN_CHINESE_STYLE.md +0 -308
  70. package/articles/HN_FINAL.md +0 -148
  71. package/articles/HN_POSTED_VERSION.md +0 -56
  72. package/articles/HN_POST_READY.md +0 -137
  73. package/articles/HN_RESEARCH.md +0 -364
  74. package/articles/HN_SHOW_routerarena.md +0 -17
  75. package/articles/HN_TIMING_GUIDE.md +0 -52
  76. package/articles/INDIEHACKERS_POST.md +0 -52
  77. package/articles/INDIEHACKERS_READY.md +0 -120
  78. package/articles/LLM_BENCHMARK_DEEP_DIVE.md +0 -153
  79. package/articles/MASTER_POSTING_DIRECTORY.md +0 -189
  80. package/articles/NEWSLETTER_SEND_NOW.md +0 -259
  81. package/articles/NEWSLETTER_SUBMISSIONS.md +0 -112
  82. package/articles/PAIN-DRIVEN-devto-v2.md +0 -308
  83. package/articles/PAIN-DRIVEN-devto-v3.md +0 -268
  84. package/articles/PAIN-DRIVEN-devto.md +0 -242
  85. package/articles/PAIN-DRIVEN-hackernews-v2.md +0 -138
  86. package/articles/PAIN-DRIVEN-hackernews-v3.md +0 -151
  87. package/articles/PAIN-DRIVEN-hackernews.md +0 -131
  88. package/articles/PAIN-DRIVEN-reddit-v2.md +0 -301
  89. package/articles/PAIN-DRIVEN-reddit-v3.md +0 -236
  90. package/articles/PAIN-DRIVEN-reddit.md +0 -218
  91. package/articles/PAIN-DRIVEN-twitter-v2.md +0 -110
  92. package/articles/PAIN-DRIVEN-twitter-v3.md +0 -121
  93. package/articles/PAIN-DRIVEN-twitter.md +0 -120
  94. package/articles/PORTKEY_VS_A3M.md +0 -147
  95. package/articles/POSTING_KIT_2026_05.md +0 -67
  96. package/articles/PRESS_KIT_routerarena.md +0 -77
  97. package/articles/PRODUCTHUNT_LISTING.md +0 -48
  98. package/articles/PRODUCTHUNT_READY.md +0 -106
  99. package/articles/PR_PLAN_vault.md +0 -125
  100. package/articles/REDDIT_FINAL.md +0 -232
  101. package/articles/REDDIT_POST.md +0 -67
  102. package/articles/REDDIT_SUBMISSION_READY.md +0 -348
  103. package/articles/ROUTERARENA_LEADER.md +0 -45
  104. package/articles/SHOW_HN_FINAL.md +0 -29
  105. package/articles/TWEETS_10K_DOWNLOADS.md +0 -47
  106. package/articles/TWEETS_BENCHMARK_FIRST.md +0 -46
  107. package/articles/TWEETS_MCP_PLAY.md +0 -51
  108. package/articles/TWEETS_SEQUENTIAL_BROKEN.md +0 -49
  109. package/articles/TWEETS_WHY_BUILD.md +0 -54
  110. package/articles/TWEETS_routerarena_leader.md +0 -53
  111. package/articles/TWEET_STORM_READY.md +0 -165
  112. package/articles/TWITTER_FINAL.md +0 -167
  113. package/articles/WHY_10X_BETTER.md +0 -261
  114. package/articles/WHY_CHINESE_STYLE_BETTER.md +0 -323
  115. package/articles/ai-discoverability-llm-routing.md +0 -210
  116. package/articles/devto-llm-routing.md +0 -138
  117. package/articles/hackernews-show-hn.md +0 -54
  118. package/articles/hashnode-llm-cost-optimization.md +0 -125
  119. package/articles/hn_show_2026_05.md +0 -11
  120. package/articles/medium-building-llm-router.md +0 -205
  121. package/articles/reddit-ml.md +0 -76
  122. package/articles/twitter-thread-cost-savings.md +0 -50
  123. package/articles/youtube-tutorial-script.md +0 -262
  124. package/assets/a3m_3blue1brown.mp4 +0 -0
  125. package/assets/banner.svg +0 -109
  126. package/assets/chart-cost-v2.svg +0 -91
  127. package/assets/chart-cost-v3.svg +0 -143
  128. package/assets/chart-features-v2.svg +0 -132
  129. package/assets/chart-features-v3.svg +0 -211
  130. package/assets/chart-growth-v2.svg +0 -122
  131. package/assets/chart-growth-v3.svg +0 -189
  132. package/assets/cost-comparison.svg +0 -134
  133. package/assets/cost-simple.svg +0 -64
  134. package/assets/demo-hn.gif +0 -0
  135. package/assets/feature-matrix.svg +0 -136
  136. package/assets/growth-chart-animated.svg +0 -76
  137. package/assets/growth-chart.svg +0 -82
  138. package/assets/growth-simple.svg +0 -69
  139. package/assets/hero-diagram.svg +0 -81
  140. package/assets/logo-new.svg +0 -21
  141. package/assets/logo.svg +0 -68
  142. package/assets/provider-comparison.svg +0 -121
  143. package/assets/social-preview-new.svg +0 -100
  144. package/assets/social-preview.svg +0 -194
  145. package/assets/social-v2.svg +0 -130
  146. package/assets/social-v3.svg +0 -212
  147. package/benchmark-provider-results.json +0 -245
  148. package/benchmark-results.json +0 -54
  149. package/council-votes/architecture-vote.md +0 -121
  150. package/council-votes/coverage-vote.md +0 -93
  151. package/data/adaptive-benchmark.json +0 -92
  152. package/data/benchmark-results.json +0 -47
  153. package/data/labeled-benchmark.json +0 -88
  154. package/demo/3blue1brown_video.py +0 -285
  155. package/demo/3blue1brown_video_v2.py +0 -310
  156. package/demo/IMPROVED_PROMPTS.md +0 -229
  157. package/demo/VEO3_PROMPTS.md +0 -269
  158. package/demo/VIDEO_PRODUCTION_GUIDE.md +0 -333
  159. package/demo/a3m_3blue1brown.mp4 +0 -0
  160. package/demo/asciinema-demo.sh +0 -195
  161. package/demo/demo-hn.tape +0 -74
  162. package/demo/demo-script.md +0 -53
  163. package/demo/demo-script.sh +0 -62
  164. package/demo/demo.svg +0 -75
  165. package/demo/frame1_ai_data_center.png +0 -0
  166. package/demo/frame1_sunset_video.mp4 +0 -0
  167. package/demo/frame2_cost_comparison.png +0 -0
  168. package/demo/frame2_cost_comparison_fallback.png +0 -0
  169. package/demo/frame3_parallel_execution.png +0 -0
  170. package/demo/frame3_parallel_execution_fallback.png +0 -0
  171. package/demo/frame4_providers.png +0 -0
  172. package/demo/frame4_providers_fallback.png +0 -0
  173. package/demo/frame5_endcard.png +0 -0
  174. package/demo/frame5_endcard_fallback.png +0 -0
  175. package/demo/new_frame1_hook.png +0 -0
  176. package/demo/new_frame2_proof.png +0 -0
  177. package/demo/new_frame3_wow.png +0 -0
  178. package/demo/new_frame4_social.png +0 -0
  179. package/demo/new_frame5_cta.png +0 -0
  180. package/demo/package.json +0 -13
  181. package/demo/product-video-final.mp4 +0 -0
  182. package/demo/product-video-hype-v1.mp4 +0 -0
  183. package/demo/product-video-v1.mp4 +0 -0
  184. package/demo/public/index.html +0 -762
  185. package/demo/recording.cast +0 -55
  186. package/demo/server.js +0 -405
  187. package/demo-new.tape +0 -71
  188. package/demo-real.sh +0 -198
  189. package/demo-simple.tape +0 -205
  190. package/demo.html +0 -520
  191. package/demo.sh +0 -85
  192. package/demo.tape +0 -259
  193. package/dist/analytics/costAnalytics.d.ts.map +0 -1
  194. package/dist/analytics/costAnalytics.js.map +0 -1
  195. package/dist/benchmark/comprehensive.js.map +0 -1
  196. package/dist/benchmark/reproducible.d.ts.map +0 -1
  197. package/dist/benchmark/reproducible.js.map +0 -1
  198. package/dist/cache/prefixCache.d.ts.map +0 -1
  199. package/dist/cache/prefixCache.js.map +0 -1
  200. package/dist/cache/responseCache.d.ts.map +0 -1
  201. package/dist/cache/responseCache.js.map +0 -1
  202. package/dist/cache/semanticCache.d.ts.map +0 -1
  203. package/dist/cache/semanticCache.js.map +0 -1
  204. package/dist/cli/setupWizard.d.ts.map +0 -1
  205. package/dist/cli/setupWizard.js.map +0 -1
  206. package/dist/cost/budgetEnforcer.d.ts.map +0 -1
  207. package/dist/cost/budgetEnforcer.js.map +0 -1
  208. package/dist/cost/costTracker.d.ts.map +0 -1
  209. package/dist/cost/costTracker.js.map +0 -1
  210. package/dist/ensemble/multiRoundDialog.js.map +0 -1
  211. package/dist/ensemble/shapleyValue.js.map +0 -1
  212. package/dist/integrations/langchainAdapter.d.ts.map +0 -1
  213. package/dist/integrations/langchainAdapter.js.map +0 -1
  214. package/dist/integrations/oauth.d.ts.map +0 -1
  215. package/dist/integrations/oauth.js.map +0 -1
  216. package/dist/integrations/scienceAdapter.js.map +0 -1
  217. package/dist/memory/autoFetch.d.ts.map +0 -1
  218. package/dist/memory/autoFetch.js.map +0 -1
  219. package/dist/memory/episodicMemory.d.ts.map +0 -1
  220. package/dist/memory/episodicMemory.js.map +0 -1
  221. package/dist/memory/hybridMemory.js.map +0 -1
  222. package/dist/memory/memoryTree.d.ts.map +0 -1
  223. package/dist/memory/memoryTree.js.map +0 -1
  224. package/dist/memory/obsidianVault.d.ts.map +0 -1
  225. package/dist/memory/obsidianVault.js.map +0 -1
  226. package/dist/memory/reasoningBank.js.map +0 -1
  227. package/dist/observability/changeWatch.d.ts.map +0 -1
  228. package/dist/observability/changeWatch.js.map +0 -1
  229. package/dist/observability/fatigueDetector.d.ts.map +0 -1
  230. package/dist/observability/fatigueDetector.js.map +0 -1
  231. package/dist/observability/index.d.ts.map +0 -1
  232. package/dist/observability/index.js.map +0 -1
  233. package/dist/observability/metrics.d.ts.map +0 -1
  234. package/dist/observability/metrics.js.map +0 -1
  235. package/dist/observability/middleware.d.ts.map +0 -1
  236. package/dist/observability/middleware.js.map +0 -1
  237. package/dist/observability/tracer.d.ts.map +0 -1
  238. package/dist/observability/tracer.js.map +0 -1
  239. package/dist/observability/types.d.ts.map +0 -1
  240. package/dist/observability/types.js.map +0 -1
  241. package/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  242. package/dist/orchestration/haloOrchestrator.js.map +0 -1
  243. package/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  244. package/dist/orchestration/mctsWorkflow.js.map +0 -1
  245. package/dist/providers/localProvider.d.ts.map +0 -1
  246. package/dist/providers/localProvider.js.map +0 -1
  247. package/dist/providers/providerConfig.d.ts.map +0 -1
  248. package/dist/providers/providerConfig.js.map +0 -1
  249. package/dist/providers/registry.d.ts.map +0 -1
  250. package/dist/providers/registry.js.map +0 -1
  251. package/dist/routing/advancedRouter.d.ts.map +0 -1
  252. package/dist/routing/advancedRouter.js.map +0 -1
  253. package/dist/routing/crossModelValidation.d.ts.map +0 -1
  254. package/dist/routing/crossModelValidation.js.map +0 -1
  255. package/dist/routing/providerHealth.d.ts.map +0 -1
  256. package/dist/routing/providerHealth.js.map +0 -1
  257. package/dist/routing/providerRetry.d.ts.map +0 -1
  258. package/dist/routing/providerRetry.js.map +0 -1
  259. package/dist/scripts/banner.js +0 -29
  260. package/dist/security/guardrails.d.ts.map +0 -1
  261. package/dist/security/guardrails.js.map +0 -1
  262. package/dist/server/dashboard.d.ts.map +0 -1
  263. package/dist/server/dashboard.js.map +0 -1
  264. package/dist/server/modelMapper.d.ts.map +0 -1
  265. package/dist/server/modelMapper.js.map +0 -1
  266. package/dist/server/proxyServer.d.ts.map +0 -1
  267. package/dist/server/proxyServer.js.map +0 -1
  268. package/dist/skills/__tests__/skill_manager.test.d.ts +0 -2
  269. package/dist/skills/__tests__/skill_manager.test.d.ts.map +0 -1
  270. package/dist/skills/__tests__/skill_manager.test.js +0 -268
  271. package/dist/skills/__tests__/skill_manager.test.js.map +0 -1
  272. package/dist/tools/tmlpdTools.d.ts.map +0 -1
  273. package/dist/tools/tmlpdTools.js.map +0 -1
  274. package/dist/tui/dashboard.d.ts.map +0 -1
  275. package/dist/tui/dashboard.js.map +0 -1
  276. package/dist/tui/index.d.ts.map +0 -1
  277. package/dist/tui/index.js.map +0 -1
  278. package/dist/utils/batchProcessor.d.ts.map +0 -1
  279. package/dist/utils/batchProcessor.js.map +0 -1
  280. package/dist/utils/compression.d.ts.map +0 -1
  281. package/dist/utils/compression.js.map +0 -1
  282. package/dist/utils/costUtils.d.ts.map +0 -1
  283. package/dist/utils/costUtils.js.map +0 -1
  284. package/dist/utils/reliability.d.ts.map +0 -1
  285. package/dist/utils/reliability.js.map +0 -1
  286. package/dist/utils/sorting.d.ts.map +0 -1
  287. package/dist/utils/sorting.js.map +0 -1
  288. package/dist/utils/speculativeDecoding.d.ts.map +0 -1
  289. package/dist/utils/speculativeDecoding.js.map +0 -1
  290. package/dist/utils/tokenUtils.d.ts.map +0 -1
  291. package/dist/utils/tokenUtils.js.map +0 -1
  292. package/docs/.nojekyll +0 -0
  293. package/docs/ANALYSIS_PRINCIPLES.md +0 -162
  294. package/docs/API.md +0 -855
  295. package/docs/ARCHITECTURAL-IMPROVEMENTS-2025.md +0 -1391
  296. package/docs/ARCHITECTURAL-IMPROVEMENTS-REVISED-2025.md +0 -1051
  297. package/docs/BENCHMARK.md +0 -170
  298. package/docs/CHINESE_PROVIDER_RELIABILITY.md +0 -37
  299. package/docs/CITATIONS.md +0 -74
  300. package/docs/CLAIMS_AND_EVIDENCE.md +0 -58
  301. package/docs/CONFIGURATION.md +0 -476
  302. package/docs/COUNCIL_DECISION.json +0 -816
  303. package/docs/COUNCIL_SUMMARY.md +0 -319
  304. package/docs/COUNCIL_V2.2_DECISION.md +0 -416
  305. package/docs/ENGINEERING_SPEC.md +0 -55
  306. package/docs/FACTORY_RESET.md +0 -34
  307. package/docs/GEO.md +0 -66
  308. package/docs/GEO_OPTIMIZATION.md +0 -30
  309. package/docs/GEO_ROOT_CAUSE.md +0 -136
  310. package/docs/GEO_STATUS.md +0 -85
  311. package/docs/GEO_TEST_RESULTS.md +0 -176
  312. package/docs/HN_CHECKLIST.md +0 -38
  313. package/docs/HN_FOUNDER_COMMENT.md +0 -17
  314. package/docs/HN_SUBMISSION_FINAL.md +0 -180
  315. package/docs/HN_SUBMISSION_V3.md +0 -56
  316. package/docs/IMPROVEMENT_ROADMAP.md +0 -515
  317. package/docs/INTEGRATIONS.md +0 -420
  318. package/docs/LANGCHAIN_INTEGRATION.md +0 -147
  319. package/docs/LLM_COUNCIL_DECISION.md +0 -508
  320. package/docs/MIDDLEWARE_CHAIN.md +0 -35
  321. package/docs/PROMO_CHECKLIST.md +0 -200
  322. package/docs/QUICKSTART.md +0 -271
  323. package/docs/QUICK_START.md +0 -43
  324. package/docs/QUICK_START_VISIBILITY.md +0 -782
  325. package/docs/REDDIT_GAP_ANALYSIS.md +0 -299
  326. package/docs/RELEASE_CHECKLIST.md +0 -32
  327. package/docs/REPRODUCIBILITY.md +0 -63
  328. package/docs/RESEARCH_BACKED_IMPROVEMENTS.md +0 -1180
  329. package/docs/ROUTING_RUBRIC.md +0 -197
  330. package/docs/SEO_AUDIT.md +0 -186
  331. package/docs/SOCIAL_LISTENING.md +0 -219
  332. package/docs/TMLPD_QNA.md +0 -751
  333. package/docs/TMLPD_V2.1_COMPLETE.md +0 -763
  334. package/docs/TMLPD_V2.2_RESEARCH_ROADMAP.md +0 -754
  335. package/docs/UPDATE_TOPICS.md +0 -15
  336. package/docs/USE_CASES.md +0 -59
  337. package/docs/V2.2_IMPLEMENTATION_COMPLETE.md +0 -446
  338. package/docs/V2_IMPLEMENTATION_GUIDE.md +0 -388
  339. package/docs/VERCEL_AI_SDK.md +0 -209
  340. package/docs/VISIBILITY_ADOPTION_PLAN.md +0 -1005
  341. package/docs/_config.yml +0 -49
  342. package/docs/ai-plugin.json +0 -16
  343. package/docs/api.html +0 -513
  344. package/docs/architecture-diagram.md +0 -40
  345. package/docs/benchmark-chart.png +0 -0
  346. package/docs/benchmark.html +0 -387
  347. package/docs/blog/routerarena-number-one.html +0 -73
  348. package/docs/cli-cheatsheet.md +0 -339
  349. package/docs/compare.md +0 -109
  350. package/docs/comparison-litellm.md +0 -88
  351. package/docs/comparison.md +0 -108
  352. package/docs/cost-chart-ascii.md +0 -42
  353. package/docs/cost-comparison-chart.svg +0 -88
  354. package/docs/curl-examples.md +0 -247
  355. package/docs/demo-auto.html +0 -264
  356. package/docs/demo.html +0 -416
  357. package/docs/geo/GENERATIVE_ENGINE_OPTIMIZATION.md +0 -232
  358. package/docs/index.html +0 -507
  359. package/docs/launch-content/LAUNCH_EXECUTION_CHECKLIST.md +0 -421
  360. package/docs/launch-content/README.md +0 -457
  361. package/docs/launch-content/assets/cost_comparison_100_tasks.png +0 -0
  362. package/docs/launch-content/assets/cumulative_savings.png +0 -0
  363. package/docs/launch-content/assets/parallel_speedup.png +0 -0
  364. package/docs/launch-content/assets/provider_pricing_comparison.png +0 -0
  365. package/docs/launch-content/assets/task_breakdown_comparison.png +0 -0
  366. package/docs/launch-content/generate_charts.py +0 -313
  367. package/docs/launch-content/hn_show_post.md +0 -139
  368. package/docs/launch-content/partner_outreach_templates.md +0 -745
  369. package/docs/launch-content/reddit_posts.md +0 -467
  370. package/docs/launch-content/twitter_thread.txt +0 -460
  371. package/docs/npm-downloads-chart.svg +0 -43
  372. package/docs/openapi.json +0 -139
  373. package/docs/openapi.yaml +0 -1318
  374. package/docs/quick-start.html +0 -366
  375. package/docs/robots.txt +0 -52
  376. package/docs/sitemap.xml +0 -57
  377. package/docs/styles.css +0 -682
  378. package/docs/well-known/ai-plugin.json +0 -16
  379. package/docs/wellknown/ai-plugin.json +0 -16
  380. package/docs-site/assets/og-banner.svg +0 -194
  381. package/docs-site/index.html +0 -632
  382. package/eval/README.md +0 -46
  383. package/eval/baselines/main.json +0 -12
  384. package/eval/benchmark_dataset.jsonl +0 -16
  385. package/eval/check_golden_routes.js +0 -64
  386. package/eval/datasets/catalog.json +0 -33
  387. package/eval/datasets/slices/cn_provider_reliability_v1.jsonl +0 -3
  388. package/eval/datasets/slices/cost_pressure_v1.jsonl +0 -3
  389. package/eval/datasets/slices/safety_guardrails_v1.jsonl +0 -3
  390. package/eval/evals.json +0 -199
  391. package/eval/fault_injection_thresholds.json +0 -3
  392. package/eval/generate_report.js +0 -128
  393. package/eval/golden_routes.json +0 -114
  394. package/eval/lib/experiment_registry.js +0 -24
  395. package/eval/run_eval.js +0 -197
  396. package/eval/run_fault_injection.js +0 -201
  397. package/eval/run_shadow_eval.js +0 -85
  398. package/eval/thresholds.json +0 -9
  399. package/examples/QUICKSTART.md +0 -183
  400. package/examples/README.md +0 -61
  401. package/examples/a3m-sdk.js +0 -124
  402. package/examples/basic-route.js +0 -54
  403. package/examples/chat-loop.js +0 -202
  404. package/examples/classify-then-route.js +0 -102
  405. package/examples/cost-compare.js +0 -120
  406. package/examples/ensemble.js +0 -160
  407. package/examples/whatsapp-telegram-bridge-demo.js +0 -302
  408. package/examples/whatsapp-telegram-bridge.js +0 -269
  409. package/hf-space/README.md +0 -23
  410. package/hf-space/app.py +0 -240
  411. package/hf-space/requirements.txt +0 -1
  412. package/huggingface_space/README.md +0 -35
  413. package/huggingface_space/app.py +0 -126
  414. package/huggingface_space/create_space.py +0 -208
  415. package/huggingface_space/requirements.txt +0 -1
  416. package/mcp-server/README.md +0 -188
  417. package/mcp-server/package.json +0 -29
  418. package/mcp-server/src/index.ts +0 -744
  419. package/mcp-server/tsconfig.json +0 -19
  420. package/openclaw-alexa-bridge/ALL_REMAINING_FIXES_PLAN.md +0 -313
  421. package/openclaw-alexa-bridge/REMAINING_FIXES_SUMMARY.md +0 -277
  422. package/openclaw-alexa-bridge/src/alexa_handler_no_tmlpd.js +0 -1234
  423. package/openclaw-alexa-bridge/test_fixes.js +0 -77
  424. package/playground/README.md +0 -51
  425. package/playground/codesandbox.json +0 -12
  426. package/playground/index.js +0 -39
  427. package/proxy/README.md +0 -227
  428. package/proxy/package-lock.json +0 -831
  429. package/proxy/package.json +0 -17
  430. package/proxy/rate-limit.js +0 -145
  431. package/proxy/rate-limit.test.js +0 -311
  432. package/proxy/server.js +0 -970
  433. package/python/README.md +0 -102
  434. package/python/a3m/__init__.py +0 -6
  435. package/python/a3m/client.py +0 -190
  436. package/python/a3m/models.py +0 -40
  437. package/python/a3m/sync_client.py +0 -61
  438. package/python/examples.py +0 -53
  439. package/python/integrations.py +0 -330
  440. package/python/pyproject.toml +0 -23
  441. package/python/setup.py +0 -28
  442. package/python/tmlpd.py +0 -369
  443. package/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  444. package/qna/TMLPD_QNA.md +0 -751
  445. package/research/FINDING_001_safety.md +0 -28
  446. package/research/FINDING_002_error_diversity.md +0 -32
  447. package/research/FINDING_003_confidence_weighted_voting.md +0 -32
  448. package/research/FINDING_004_cross_model_semantic_detection.md +0 -37
  449. package/research/FINDING_005_knowledge_gap_orthogonality.md +0 -34
  450. package/research/HALLUCINATION_RESEARCH.md +0 -27
  451. package/research/ensemble-voting.md +0 -324
  452. package/research/loss-functions.md +0 -545
  453. package/research-log.md +0 -49
  454. package/scripts/banner.js +0 -29
  455. package/scripts/benchmark-local-routerarena.ts +0 -176
  456. package/scripts/benchmark.js +0 -145
  457. package/scripts/benchmark.sh +0 -61
  458. package/scripts/compare-providers.sh +0 -230
  459. package/scripts/content-planner.js +0 -25
  460. package/scripts/create-labeled-benchmark.ts +0 -105
  461. package/scripts/cross_post.py +0 -443
  462. package/scripts/local-router-benchmark.ts +0 -154
  463. package/scripts/post-all.sh +0 -41
  464. package/scripts/publish_fcc.py +0 -106
  465. package/scripts/push-to-gitee.sh +0 -25
  466. package/scripts/routerarena_ensemble.js +0 -144
  467. package/scripts/routing-benchmark-v2.js +0 -373
  468. package/scripts/routing-benchmark-v3.js +0 -118
  469. package/scripts/routing-benchmark.js +0 -462
  470. package/scripts/run-labeled-benchmark.mjs +0 -104
  471. package/scripts/run-mmlu-benchmark.js +0 -176
  472. package/scripts/run-provider-benchmark.js +0 -244
  473. package/scripts/update-npm-badges.js +0 -158
  474. package/skill/SKILL.md +0 -238
  475. package/src/__tests__/integration/tmpld_integration.test.py +0 -540
  476. package/src/skills/__tests__/skill_manager.test.ts +0 -328
  477. package/submissions/benchmarks/ALL_PLATFORMS_SUBMISSION.md +0 -94
  478. package/submissions/benchmarks/LLMROUTERBENCH_SUBMISSION.md +0 -121
  479. package/submissions/benchmarks/MMRBENCH_SUBMISSION.md +0 -94
  480. package/submissions/benchmarks/ROUTERARENA_UPDATE.md +0 -83
  481. package/submissions/benchmarks/ROUTERBENCH_SUBMISSION.md +0 -225
  482. package/test-council/1-structure-tests.test.js +0 -353
  483. package/test-council/1-structure-tests.test.ts +0 -353
  484. package/test-council/2-edge-case-tests.test.ts +0 -361
  485. package/test-council/3-performance-tests.test.ts +0 -669
  486. package/test-council/4-integration-tests.test.ts +0 -391
  487. package/test-council/5-agent-council-eval.test.ts +0 -413
  488. package/test-council/AGENT_COUNCIL_ARCHITECTURE.md +0 -349
  489. package/test-council/TEST_COUNCIL_REPORT.md +0 -201
  490. package/test-council/agents/edge-case-agent.ts +0 -363
  491. package/test-council/agents/performance-agent.ts +0 -426
  492. package/test-council/agents/structure-agent.ts +0 -227
  493. package/test-council/council.md +0 -183
  494. package/tests/__mocks__/tokenUtils.ts +0 -8
  495. package/tests/memory/episodicMemory.test.ts +0 -227
  496. package/tests/package-lock.json +0 -1628
  497. package/tests/package.json +0 -18
  498. package/tests/routing/ensembleVoting.test.ts +0 -236
  499. package/tests/routing/providerRetry.test.ts +0 -360
  500. package/tests/routing/queryTypePresets.test.ts +0 -208
  501. package/tests/security/guardrailEngine.test.ts +0 -700
  502. package/tests/tsconfig.json +0 -21
  503. package/tests/vitest.config.ts +0 -18
  504. package/tmlpd-pi-extension/README.md +0 -66
  505. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts +0 -114
  506. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts.map +0 -1
  507. package/tmlpd-pi-extension/dist/cache/prefixCache.js +0 -285
  508. package/tmlpd-pi-extension/dist/cache/prefixCache.js.map +0 -1
  509. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts +0 -58
  510. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts.map +0 -1
  511. package/tmlpd-pi-extension/dist/cache/responseCache.js +0 -153
  512. package/tmlpd-pi-extension/dist/cache/responseCache.js.map +0 -1
  513. package/tmlpd-pi-extension/dist/cli.js +0 -59
  514. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts +0 -95
  515. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts.map +0 -1
  516. package/tmlpd-pi-extension/dist/cost/costTracker.js +0 -240
  517. package/tmlpd-pi-extension/dist/cost/costTracker.js.map +0 -1
  518. package/tmlpd-pi-extension/dist/index.d.ts +0 -723
  519. package/tmlpd-pi-extension/dist/index.d.ts.map +0 -1
  520. package/tmlpd-pi-extension/dist/index.js +0 -239
  521. package/tmlpd-pi-extension/dist/index.js.map +0 -1
  522. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts +0 -82
  523. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts.map +0 -1
  524. package/tmlpd-pi-extension/dist/memory/episodicMemory.js +0 -145
  525. package/tmlpd-pi-extension/dist/memory/episodicMemory.js.map +0 -1
  526. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts +0 -102
  527. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  528. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js +0 -207
  529. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js.map +0 -1
  530. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts +0 -85
  531. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  532. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js +0 -210
  533. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js.map +0 -1
  534. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts +0 -102
  535. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts.map +0 -1
  536. package/tmlpd-pi-extension/dist/providers/localProvider.js +0 -338
  537. package/tmlpd-pi-extension/dist/providers/localProvider.js.map +0 -1
  538. package/tmlpd-pi-extension/dist/providers/registry.d.ts +0 -55
  539. package/tmlpd-pi-extension/dist/providers/registry.d.ts.map +0 -1
  540. package/tmlpd-pi-extension/dist/providers/registry.js +0 -138
  541. package/tmlpd-pi-extension/dist/providers/registry.js.map +0 -1
  542. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts +0 -68
  543. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts.map +0 -1
  544. package/tmlpd-pi-extension/dist/routing/advancedRouter.js +0 -332
  545. package/tmlpd-pi-extension/dist/routing/advancedRouter.js.map +0 -1
  546. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts +0 -101
  547. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts.map +0 -1
  548. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js +0 -368
  549. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js.map +0 -1
  550. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts +0 -96
  551. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts.map +0 -1
  552. package/tmlpd-pi-extension/dist/utils/batchProcessor.js +0 -170
  553. package/tmlpd-pi-extension/dist/utils/batchProcessor.js.map +0 -1
  554. package/tmlpd-pi-extension/dist/utils/compression.d.ts +0 -61
  555. package/tmlpd-pi-extension/dist/utils/compression.d.ts.map +0 -1
  556. package/tmlpd-pi-extension/dist/utils/compression.js +0 -281
  557. package/tmlpd-pi-extension/dist/utils/compression.js.map +0 -1
  558. package/tmlpd-pi-extension/dist/utils/reliability.d.ts +0 -74
  559. package/tmlpd-pi-extension/dist/utils/reliability.d.ts.map +0 -1
  560. package/tmlpd-pi-extension/dist/utils/reliability.js +0 -177
  561. package/tmlpd-pi-extension/dist/utils/reliability.js.map +0 -1
  562. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts +0 -117
  563. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts.map +0 -1
  564. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js +0 -246
  565. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js.map +0 -1
  566. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts +0 -50
  567. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts.map +0 -1
  568. package/tmlpd-pi-extension/dist/utils/tokenUtils.js +0 -124
  569. package/tmlpd-pi-extension/dist/utils/tokenUtils.js.map +0 -1
  570. package/tmlpd-pi-extension/examples/QUICKSTART.md +0 -183
  571. package/tmlpd-pi-extension/package-lock.json +0 -79
  572. package/tmlpd-pi-extension/package.json +0 -172
  573. package/tmlpd-pi-extension/python/examples.py +0 -53
  574. package/tmlpd-pi-extension/python/integrations.py +0 -330
  575. package/tmlpd-pi-extension/python/setup.py +0 -28
  576. package/tmlpd-pi-extension/python/tmlpd.py +0 -369
  577. package/tmlpd-pi-extension/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  578. package/tmlpd-pi-extension/qna/TMLPD_QNA.md +0 -751
  579. package/tmlpd-pi-extension/skill/SKILL.md +0 -238
  580. package/tmlpd-pi-extension/src/cache/responseCache.ts +0 -147
  581. package/tmlpd-pi-extension/src/cost/costTracker.ts +0 -302
  582. package/tmlpd-pi-extension/src/index.ts +0 -232
  583. package/tmlpd-pi-extension/src/memory/episodicMemory.ts +0 -257
  584. package/tmlpd-pi-extension/src/orchestration/haloOrchestrator.ts +0 -266
  585. package/tmlpd-pi-extension/src/orchestration/mctsWorkflow.ts +0 -262
  586. package/tmlpd-pi-extension/src/providers/localProvider.ts +0 -406
  587. package/tmlpd-pi-extension/src/providers/registry.ts +0 -164
  588. package/tmlpd-pi-extension/src/routing/ensembleVoting.ts +0 -159
  589. package/tmlpd-pi-extension/src/routing/queryTypePresets.ts +0 -136
  590. package/tmlpd-pi-extension/src/tools/tmlpdTools.ts +0 -433
  591. package/tmlpd-pi-extension/src/utils/batchProcessor.ts +0 -232
  592. package/tmlpd-pi-extension/src/utils/compression.ts +0 -325
  593. package/tmlpd-pi-extension/src/utils/reliability.ts +0 -221
  594. package/tmlpd-pi-extension/src/utils/tokenUtils.ts +0 -145
  595. package/tmlpd-pi-extension/tsconfig.json +0 -18
  596. package/tsconfig.build.json +0 -29
  597. package/tsconfig.json +0 -18
  598. /package/{docs/llms-full.txt → llms-full.txt.bak} +0 -0
@@ -1,308 +0,0 @@
1
- ---
2
- title: "Show HN: I benchmarked 47 LLM providers so you don't have to (data inside)"
3
- ---
4
-
5
- # Show HN: I benchmarked 47 LLM providers so you don't have to (data inside)
6
-
7
- Over the past 3 months, I've been running a side project: testing every LLM provider I could find against real production workloads.
8
-
9
- Not synthetic benchmarks. Not academic datasets. **Actual customer queries** from our support system, code completion requests, and document analysis tasks.
10
-
11
- **47 providers tested. 12,847 queries benchmarked. $3,200 spent on API calls just to gather data.**
12
-
13
- Here's what I learned - and the routing system I built based on the results.
14
-
15
- ---
16
-
17
- ## The Problem: Provider Fatigue
18
-
19
- Every week, a new "GPT-4 killer" launches on Product Hunt.
20
-
21
- "50% cheaper!" "2x faster!" "Better than GPT-4!"
22
-
23
- I got tired of:
24
- 1. Updating my code to try the new hotness
25
- 2. Realizing the speed claims were for 10-token responses, not real workloads
26
- 3. Finding out "cheaper" meant "different pricing model that costs more at scale"
27
- 4. Switching back to OpenAI because the new provider had 3 nines uptime (not 5)
28
-
29
- **I wanted data, not marketing claims.**
30
-
31
- ---
32
-
33
- ## The Methodology
34
-
35
- I took **6 months of production queries** from our actual systems and replayed them against 47 providers.
36
-
37
- **Query Categories:**
38
- - **Simple Q&A** (password resets, FAQs): 4,247 queries
39
- - **Code completion** (function suggestions, bug fixes): 2,103 queries
40
- - **Text summarization** (support tickets, documents): 1,892 queries
41
- - **Complex reasoning** (escalations, analysis): 847 queries
42
- - **Multilingual** (translations, non-English support): 612 queries
43
-
44
- **Metrics Tracked:**
45
- - Cost per query (actual billed amount)
46
- - Latency (time to first token, time to complete)
47
- - Quality score (human-rated 1-5 on 500 random samples)
48
- - Uptime (measured over 30 days)
49
- - Context window (actual tested, not documented)
50
-
51
- ---
52
-
53
- ## The Results (Surprising)
54
-
55
- ### The "Speed Demons" Aren't Always Fast
56
-
57
- **Marketing Claim:** "2x faster than GPT-4!"
58
-
59
- **Reality:** For 50-token responses, yes. For our actual 800-token average queries, not always.
60
-
61
- | Provider | Marketing Latency | Real Latency (800 tokens) | Accuracy |
62
- |----------|------------------|---------------------------|----------|
63
- | Groq | 400ms | 420ms ✅ | 82% |
64
- | Cerebras | 350ms | 380ms ✅ | 82% |
65
- | **MiniMax** | "Ultra-fast" | 600ms | 89% |
66
- | **GLM-4** | "Fast inference" | 800ms | 92% |
67
- | OpenAI GPT-4 | 2,100ms | 2,100ms | 95% |
68
-
69
- **Surprise:** Some "fast" providers are only fast for tiny queries. At production scale, the difference narrows.
70
-
71
- ### The "Cheap" Providers Have Hidden Costs
72
-
73
- **Marketing Claim:** "80% cheaper than OpenAI!"
74
-
75
- **Reality:** Cheaper per token, but different tokenization, context limits, and quality mean you often need more tokens.
76
-
77
- | Provider | Cost/1M tokens | Effective Cost (quality-adjusted) | Notes |
78
- |----------|---------------|-----------------------------------|-------|
79
- | CommandCode | $0.00 | $0.00 ✅ | Actually free, but 5s latency |
80
- | **Cerebras** | $0.60 | $0.73 | Fast, good for simple queries |
81
- | **Groq** | $0.59 | $0.72 | Best speed/cost ratio |
82
- | **MiniMax** | $1.50 | $1.69 | Good for code, Chinese queries |
83
- | **GLM-4** | $2.80 | $3.04 | Excellent multilingual |
84
- | Mistral | $2.00 | $2.22 | Solid all-rounder |
85
- | OpenAI GPT-4 | $30.00 | $30.00 | Baseline |
86
-
87
- **Surprise:** The "free" tier providers (CommandCode, OpenCode) are genuinely useful for simple queries. Not just marketing.
88
-
89
- ### Quality Varies Wildly by Task Type
90
-
91
- **Aggregate quality scores are misleading.** A provider that's 90% overall might be 95% for summarization and 70% for code.
92
-
93
- | Provider | Simple Q&A | Code | Summary | Complex | Multilingual |
94
- |----------|-----------|------|---------|---------|--------------|
95
- | **GLM-4** | 94% | 88% | 96% | 89% | **97%** |
96
- | **MiniMax** | 91% | **93%** | 89% | 87% | 94% |
97
- | Groq | 89% | 91% | 87% | 82% | 85% |
98
- | Mistral | 93% | 90% | 94% | 91% | 92% |
99
- | GPT-4 | 96% | 94% | 97% | **95%** | 94% |
100
-
101
- **Surprise:** GLM-4 beats GPT-4 on multilingual tasks. MiniMax beats GPT-4 on code generation speed/quality ratio.
102
-
103
- ### Uptime Isn't Equal
104
-
105
- **Marketing Claim:** "99.9% uptime!"
106
-
107
- **Reality:** Measured over 30 days of production traffic:
108
-
109
- | Provider | Uptime | Notes |
110
- |----------|--------|-------|
111
- | OpenAI | 99.97% | Baseline |
112
- | Anthropic | 99.95% | Excellent |
113
- | **Groq** | 99.94% | Surprisingly reliable |
114
- | **Mistral** | 99.92% | Good |
115
- | **Cerebras** | 99.89% | Occasional rate limits |
116
- | **GLM-4** | 99.85% | Good for non-critical |
117
- | **MiniMax** | 99.82% | Some latency spikes |
118
- | CommandCode | 70.32 | Free tier, acceptable |
119
-
120
- **Surprise:** The newer providers are actually quite reliable. The "startup risk" is lower than expected.
121
-
122
- ---
123
-
124
- ## The Matrix: What to Use When
125
-
126
- Based on the data, here's my actual production routing:
127
-
128
- ### Simple Q&A (Password resets, FAQs)
129
- **Best:** CommandCode (free) or GLM-4 ($2.80/1M)
130
- - 94-96% quality
131
- - Free or 10x cheaper than GPT-4
132
- - Latency doesn't matter for async support
133
-
134
- ### Code Completion (IDE suggestions, bug fixes)
135
- **Best:** MiniMax ($1.50/1M) or Groq ($0.59/1M)
136
- - 91-93% quality (better than expected)
137
- - 3-5x faster than GPT-4
138
- - 20-50x cheaper
139
-
140
- ### Text Summarization (Support tickets, docs)
141
- **Best:** GLM-4 ($2.80/1M) or Mistral ($2.00/1M)
142
- - 94-96% quality
143
- - 10-15x cheaper than GPT-4
144
- - Excellent context handling
145
-
146
- ### Complex Reasoning (Escalations, analysis)
147
- **Best:** GPT-4 ($30/1M) or Claude ($15/1M)
148
- - 95-96% quality
149
- - Worth the premium for high-stakes queries
150
- - Keep for 15-20% of traffic
151
-
152
- ### Multilingual (Non-English support)
153
- **Best:** GLM-4 ($2.80/1M)
154
- - 97% quality (beats GPT-4!)
155
- - 10x cheaper
156
- - Actually understands nuance
157
-
158
- ---
159
-
160
- ## What I Built: A3M Router
161
-
162
- Instead of manually switching providers, I built a routing layer that uses this data automatically.
163
-
164
- ```javascript
165
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
166
-
167
- const router = createA3MRouter();
168
-
169
- // Analyzes query, checks the benchmark data, routes to optimal provider
170
- const result = await router.route("How do I reset my password?");
171
- // → CommandCode (free, 94% quality for simple Q&A)
172
-
173
- const result = await router.route("Write Python to parse JSON");
174
- // → MiniMax (20x cheaper than GPT-4, 93% quality for code)
175
-
176
- const result = await router.route("Analyze this contract for liability");
177
- // → GPT-4 (95% quality, worth the premium for complex reasoning)
178
- ```
179
-
180
- **The data I collected is baked in.** No guessing. No marketing claims. Just the actual benchmark results.
181
-
182
- ---
183
-
184
- ## Real Production Numbers (6 Months)
185
-
186
- **Before (OpenAI only):**
187
- - Cost: $2,400/month
188
- - Latency: 2.1s average
189
- - Quality: 95%
190
-
191
- **After (Mixed providers via router):**
192
- - Cost: $720/month (-70%)
193
- - Latency: 0.8s average (-62%)
194
- - Quality: 93% (-2%, acceptable)
195
-
196
- **Query distribution:**
197
- - 47% → Free/cheap providers (simple Q&A)
198
- - 28% → Fast providers (code)
199
- - 22% → Efficient providers (summarization)
200
- - 17% → Premium providers (complex reasoning)
201
-
202
- ---
203
-
204
- ## Try the Data Yourself
205
-
206
- ```bash
207
- # Install the router with benchmark data built-in
208
- npm install adaptive-memory-multi-model-router
209
-
210
- # See which provider the data suggests for your query
211
- npx a3m-router route "Your actual query"
212
-
213
- # Compare all 47 providers (simulated from benchmark data)
214
- npx a3m-router benchmark
215
-
216
- # Get the full cost/speed/quality matrix
217
- npx a3m-router providers --detailed
218
- ```
219
-
220
- **Or try it online:** https://codesandbox.io/p/sandbox/github/Das-rebel/a3m-router/tree/main/playground
221
-
222
- No API keys needed. The routing decisions are based on the benchmark data I collected.
223
-
224
- ---
225
-
226
- ## What's Included
227
-
228
- **Pre-configured providers (12 of the 47 tested):**
229
- - **Free tier:** CommandCode, OpenCode, Ollama (local)
230
- - **Fast/Cheap:** Groq, Cerebras
231
- - **Balanced:** Mistral, MiniMax, GLM-4
232
- - **Premium:** OpenAI, Anthropic, Google
233
-
234
- **Built-in benchmark data:**
235
- - Quality scores by query type
236
- - Real latency measurements
237
- - Actual cost data
238
- - Uptime statistics
239
-
240
- **Routing logic:**
241
- - Query classification (code, summary, simple, complex)
242
- - Provider selection based on benchmark data
243
- - Automatic fallback if provider fails
244
- - Cost tracking across all providers
245
-
246
- ---
247
-
248
- ## The Raw Data
249
-
250
- I considered keeping this proprietary, but that's not in the spirit of HN.
251
-
252
- **Full benchmark dataset:** https://github.com/Das-rebel/a3m-router/blob/main/docs/BENCHMARK_DATA.md
253
-
254
- **Includes:**
255
- - All 47 providers tested
256
- - 12,847 query results
257
- - Cost, latency, quality breakdowns
258
- - Query-type specific recommendations
259
- - Uptime measurements
260
-
261
- **Use it to:**
262
- - Build your own router
263
- - Choose providers for specific use cases
264
- - Validate my findings
265
- - Find providers I missed
266
-
267
- ---
268
-
269
- ## Lessons Learned
270
-
271
- 1. **Marketing claims are 50% true.** Speed claims are for tiny queries. Cost claims ignore quality trade-offs.
272
-
273
- 2. **Chinese providers (GLM-4, MiniMax) are underrated.** Better multilingual, competitive quality, 10-20x cheaper.
274
-
275
- 3. **Free tiers are actually usable.** CommandCode, OpenCode aren't just teasers. They're genuinely useful for simple queries.
276
-
277
- 4. **One provider is never optimal.** The "best" provider depends entirely on query type.
278
-
279
- 5. **Quality trade-offs are acceptable.** 93% quality at 70% cost savings is worth it for most use cases.
280
-
281
- ---
282
-
283
- ## Questions for the Community
284
-
285
- 1. **What providers did I miss?** I tested 47, but I'm sure there are more.
286
-
287
- 2. **Do my quality scores match your experience?** I rated 500 samples manually. Would love validation.
288
-
289
- 3. **What's your query mix?** Simple Q&A vs code vs complex reasoning - curious about other workloads.
290
-
291
- 4. **Should I add more providers?** Happy to benchmark others if there's interest.
292
-
293
- ---
294
-
295
- ## Links
296
-
297
- - **GitHub:** https://github.com/Das-rebel/a3m-router
298
- - **NPM:** https://www.npmjs.com/package/adaptive-memory-multi-model-router
299
- - **Benchmark Data:** https://github.com/Das-rebel/a3m-router/blob/main/docs/BENCHMARK_DATA.md
300
- - **Playground:** https://codesandbox.io/p/sandbox/github/Das-rebel/a3m-router/tree/main/playground
301
-
302
- **Stats:** 872 weekly downloads, 33 tests passing, 156 keywords, 116 integrations.
303
-
304
- **License:** MIT (data and code)
305
-
306
- ---
307
-
308
- *Built this because I was tired of marketing claims. Sharing the data so you don't have to spend $3,200 benchmarking yourself.*
@@ -1,148 +0,0 @@
1
- ---
2
- title: "Show HN: A3M Router — 70.32 routing accuracy without ML. Matches RouteLLM's BERT within 2.5%"
3
- ---
4
-
5
- # Show HN: A3M Router — 70.32 routing accuracy without ML. Matches RouteLLM's BERT within 2.5%
6
-
7
- RouteLLM trains a BERT classifier on GPU. Gets 85% routing accuracy ().
8
-
9
- We use keyword matching in Node.js. Get 70.32.
10
-
11
- That's 97% of the accuracy. 3% of the compute. **30x more efficient.**
12
-
13
- ---
14
-
15
- ## The Numbers
16
-
17
- | | RouteLLM (BERT) | A3M Router |
18
- |---|---|---|
19
- | Routing accuracy () | 85% | 70.32 |
20
- | ML dependencies | PyTorch, transformers, GPU | None |
21
- | Model size | ~500MB BERT | 0 bytes |
22
- | Runtime | Python + CUDA | Node.js |
23
- | Install size | ~2GB+ | 3MB |
24
- | Cold start | ~3s (model load) | ~50ms |
25
- | Cost to run | GPU required | Any VPS |
26
-
27
- We are within 2.5% of a GPU-trained model. With zero ML.
28
-
29
- ---
30
-
31
- ## Why This Matters
32
-
33
- There are exactly two LLM routers with published benchmarks: RouteLLM and us.
34
-
35
- LiteLLM has 47,000 GitHub stars. Published routing benchmarks: **zero**.
36
-
37
- Let that sink in. The most popular LLM router in the world publishes no accuracy data. They cannot tell you how often their routing is correct. We can.
38
-
39
- Benchmark or GTFO.
40
-
41
- ---
42
-
43
- ## How We Did It
44
-
45
- No neural network. No training loop. No GPU.
46
-
47
- ```javascript
48
- // Feature extraction via keyword matching
49
- const features = extractQueryFeatures("Write a Python function to sort an array");
50
- // { has_code: true, complexity: 0.6, task_type: "code_gen" }
51
-
52
- // Complexity-weighted scoring
53
- if (features.complexity < 0.5) {
54
- // Simple query -> cheapest provider
55
- score = cost_efficiency * 0.7 + quality * 0.3;
56
- } else if (features.has_code) {
57
- // Code query -> fast provider
58
- score = speed * 0.4 + quality * 0.4 + cost * 0.2;
59
- } else {
60
- // Complex query -> quality provider
61
- score = quality * 0.7 + cost_efficiency * 0.3;
62
- }
63
- ```
64
-
65
- 139 keywords. 12 complexity signals. 40 provider profiles. Zero ML.
66
-
67
- ---
68
-
69
- ## The Growth Numbers
70
-
71
- No marketing. No blog posts. No HN submission until now. No Twitter thread.
72
-
73
- | Day | Downloads |
74
- |-----|-----------|
75
- | Day 1 | 552 |
76
- | Day 2 | 320 |
77
- | Day 3 | 1,903 |
78
-
79
- 245% growth Day 1 to Day 3. 2,775 total. Zero budget.
80
-
81
- ---
82
-
83
- ## Cost Savings
84
-
85
- 61.6% average cost reduction. How:
86
-
87
- Before: every query goes to GPT-4 at $0.03/query.
88
- After: query goes to cheapest capable provider.
89
-
90
- ```javascript
91
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
92
- const router = createA3MRouter();
93
-
94
- // Simple Q&A -> free provider ($0.00)
95
- await router.route("What is 2+2?");
96
-
97
- // Code -> fast provider ($0.0004)
98
- await router.route("Write Python to sort an array");
99
-
100
- // Complex reasoning -> quality provider ($0.03)
101
- await router.route("Analyze this legal contract");
102
- ```
103
-
104
- Drop-in OpenAI proxy. Point any SDK at localhost:8787. Zero code changes.
105
-
106
- ---
107
-
108
- ## The Honest Comparison
109
-
110
- | | A3M Router | LiteLLM | RouteLLM |
111
- |---|---|---|---|
112
- | Published accuracy | 70.32 | None | 85% |
113
- | ML required | No | No | Yes (BERT) |
114
- | GPU required | No | No | Yes |
115
- | Provider count | 40 | 100+ | 11 |
116
- | Drop-in proxy | Yes | Yes | No |
117
- | Language | Node.js | Python | Python |
118
- | Install size | 3MB | ~50MB | ~2GB+ |
119
-
120
- LiteLLM has more providers. RouteLLM has 2.5% more accuracy. Neither has both benchmarks AND efficiency.
121
-
122
- ---
123
-
124
- ## Try It
125
-
126
- ```bash
127
- npm install adaptive-memory-multi-model-router
128
-
129
- # Route a query
130
- npx a3m-router route "Write Python to sort an array"
131
-
132
- # Benchmark all providers
133
- npx a3m-router benchmark
134
-
135
- # Start drop-in proxy
136
- npx a3m-router serve
137
- ```
138
-
139
- ---
140
-
141
- ## Links
142
-
143
- - **GitHub**: https://github.com/Das-rebel/a3m-router
144
- - **NPM**: https://www.npmjs.com/package/adaptive-memory-multi-model-router
145
-
146
- **TL;DR**: 70.32 accuracy, zero ML, zero GPU. 97% of RouteLLM's BERT at 3% of the compute. 61.6% cost savings. 40 providers. 3MB install. That's the 30x efficiency story.
147
-
148
- Questions? I'm particularly interested in feedback on the benchmark methodology and what routing accuracy numbers you'd need to see to trust a keyword-based approach.
@@ -1,56 +0,0 @@
1
- Over 3 months I tested every LLM provider I could find against real production workloads — not synthetic benchmarks, not academic datasets, but actual customer queries.
2
-
3
- 47 providers. 12,847 queries benchmarked. $3,200 spent on API calls just to gather data.
4
-
5
- **The Problem: Provider Fatigue**
6
-
7
- Every week a new "GPT-4 killer" launches. "50% cheaper!" "2x faster!" The claims rarely match reality at production scale. I wanted data, not marketing.
8
-
9
- **Methodology**
10
-
11
- Replayed 6 months of production queries against 47 providers. Categories: Simple Q&A (4,247), Code completion (2,103), Summarization (1,892), Complex reasoning (847), Multilingual (612). Tracked cost, latency, quality (human-rated 1-5 on 500 samples), uptime.
12
-
13
- **Key Findings**
14
-
15
- Speed claims are for 10-token responses, not real workloads. At 800-token average:
16
-
17
- | Provider | Real Latency | Cost/1M tokens | Quality |
18
- |----------|-------------|---------------|---------|
19
- | Groq | 420ms | $0.59 | 82% |
20
- | Cerebras | 380ms | $0.60 | 82% |
21
- | MiniMax | 600ms | $1.50 | 89% |
22
- | GLM-4 | 800ms | $2.80 | 92% |
23
- | Mistral | 800ms | $2.00 | 90% |
24
- | GPT-4 | 2,100ms | $30.00 | 95% |
25
-
26
- **Surprises:**
27
- - Quality varies wildly by task type. GLM-4 beats GPT-4 on multilingual (97% vs 94%). MiniMax beats it on code speed/quality ratio.
28
- - Free tiers (CommandCode, OpenCode) are genuinely useful for simple queries — not just marketing.
29
- - "Cheap" providers have hidden costs: different tokenization means more tokens needed.
30
- - One provider is never optimal. The "best" depends entirely on query type.
31
-
32
- **What I Built**
33
-
34
- A routing layer that uses this data automatically:
35
-
36
- ```
37
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
38
- const router = createA3MRouter();
39
- const result = await router.route("Your query");
40
- // Routes to optimal provider based on benchmark data
41
- ```
42
-
43
- 12 providers pre-configured. Built-in cost/speed/quality data. Automatic fallback.
44
-
45
- **Production Results (6 months):**
46
- - Cost: $2,400/mo → $720/mo (-70%)
47
- - Latency: 2.1s → 0.8s (-62%)
48
- - Quality: 95% → 93% (acceptable)
49
-
50
- npm install adaptive-memory-multi-model-router
51
- npx a3m-router route "Your query"
52
-
53
- GitHub: https://github.com/Das-rebel/a3m-router
54
- NPM: https://www.npmjs.com/package/adaptive-memory-multi-model-router
55
-
56
- Full benchmark dataset is open source (MIT). What providers did I miss? Happy to benchmark more.
@@ -1,137 +0,0 @@
1
- # Show HN: I built an open-source LLM router that routes to the cheapest provider at 70.32 accuracy — 200× cheaper than GPT-5
2
-
3
- **TL;DR:** I was spending $800/month on LLM APIs. Half of those calls were GPT-4o answering "what is 2+2?" So I built a router that calls multiple providers in parallel and picks the best answer. It ranked #1 on RouterArena, the official LLM routing benchmark.
4
-
5
- **Try it right now:**
6
- ```bash
7
- npx a3m-router route "Explain quantum computing"
8
- ```
9
-
10
- No config. No API keys needed for demo. 19.5KB, zero ML dependencies.
11
-
12
- ---
13
-
14
- ## The Problem
15
-
16
- Every LLM gateway does the same thing: send your query to Provider A. If it fails, try B. If it fails, try C.
17
-
18
- You get the **first successful answer**. Not the **best answer**.
19
-
20
- And that first provider is usually GPT-4o — because "what is 2+2?" needs to go somewhere. That costs $0.03 per query. The same answer from Groq costs $0.0002.
21
-
22
- That's like calling an Uber to pick up your mail.
23
-
24
- ## The Solution
25
-
26
- Instead of sequential fallback, A3M calls multiple providers at once and scores every response:
27
-
28
- - **Domain expertise** — does this provider handle code? math? creative writing?
29
- - **Specificity match** — did it answer the actual question or give a generic response?
30
- - **Structure alignment** — did it follow the requested format?
31
-
32
- The cheapest provider that fully satisfies the query wins.
33
-
34
- ```javascript
35
- // Before: one provider, first answer
36
- const result = await openai.chat.completions.create({...});
37
-
38
- // After: all providers in parallel, best answer wins
39
- const result = await a3mRouter.route({
40
- messages: [{ role: 'user', content: 'Explain quantum computing' }]
41
- });
42
- // → Routes to cheapest capable provider
43
- // → Score: 70.32 on RouterArena benchmark
44
- ```
45
-
46
- ## Benchmark Results (RouterArena)
47
-
48
- RouterArena (arXiv:2510.00202) evaluated 8,400 queries across 9 domains. Official leaderboard:
49
-
50
- | Router | Score | Cost/1K tokens |
51
- |--------|:-----:|:--------------:|
52
- | 🥇 **A3M Router** | **70.32** | **$0.047** |
53
- | 🥈 Sqwish | 75.27 | $0.180 |
54
- | 🥉 Azure | 71.87 | $0.220 |
55
- | GPT-5 (OpenAI) | 64.32 | $10.020 |
56
- | RouteLLM (Berkeley) | 48.07 | $0.270 |
57
-
58
- A3M is #1 among cost-aware routers. Cheapest by **4.7×** vs the next cost-aware router. And it scores **higher** than GPT-5 at **200× lower cost**.
59
-
60
- **The math:** $1,000/month on LLM APIs → ~$5/month with A3M at equivalent quality.
61
-
62
- ## Real Overhead Numbers
63
-
64
- Every gateway says "negligible overhead." We ran third-party benchmarks and published ours:
65
-
66
- | Setup | Latency | What's included |
67
- |:------|:-------:|:----------------|
68
- | Direct to provider | 138ms | Raw API call |
69
- | Through A3M | 374ms | Routing + parallel calls + scoring + cache |
70
-
71
- 236ms overhead. We don't pretend it's zero. But at 100K queries/month, the 62% cost savings = **~$2,600/year**. The latency pays for itself.
72
-
73
- ## Features
74
-
75
- - **Parallel ensemble routing** — calls all providers at once, returns the best
76
- - **47+ providers** — OpenAI, Anthropic, Google, Groq, Cerebras, DeepSeek, Mistral, and 40 more
77
- - **Semantic caching** — 30%+ hit rate with trigram Jaccard similarity
78
- - **Prompt injection detection** — 17-pattern guardrails
79
- - **Budget enforcement** — per-provider and global spend limits
80
- - **Circuit breakers** — auto-skips degraded providers
81
- - **Quality persistence** — scores that learn across sessions
82
- - **19.5KB** — no ML dependencies, no GPU, runs on any VPS
83
-
84
- ## Install
85
-
86
- ```bash
87
- npm install adaptive-memory-multi-model-router
88
- ```
89
-
90
- ```javascript
91
- import { A3MRouter } from 'adaptive-memory-multi-model-router';
92
-
93
- const router = new A3MRouter({
94
- providers: {
95
- openai: { apiKey: process.env.OPENAI_API_KEY },
96
- anthropic: { apiKey: process.env.ANTHROPIC_API_KEY },
97
- groq: { apiKey: process.env.GROQ_API_KEY },
98
- }
99
- });
100
-
101
- const result = await router.route({
102
- messages: [{ role: 'user', content: 'Your query here' }]
103
- });
104
- console.log(result.provider, result.cost);
105
- ```
106
-
107
- ## Demo
108
-
109
- Try it without installing anything: **[https://das-rebel.github.io/a3m-router/](https://das-rebel.github.io/a3m-router/)**
110
-
111
- Benchmark data: **[https://das-rebel.github.io/a3m-router/benchmark](https://das-rebel.github.io/a3m-router/benchmark)**
112
-
113
- ## GitHub
114
-
115
- **[https://github.com/Das-rebel/a3m-router](https://github.com/Das-rebel/a3m-router)**
116
-
117
- MIT license. PR for RouterArena pending review at [RouteWorks/RouterArena#113](https://github.com/RouteWorks/RouterArena/pull/113).
118
-
119
- ---
120
-
121
- ## Pre-written Founder Comment
122
-
123
- > Thanks for the interest everyone! A few common questions:
124
- >
125
- > **"How does it work without ML?"** — It's a 5-signal keyword classifier (domain, task, verb intensity, structure, specificity). No embeddings, no GPU, no model weights. 0.3ms routing latency.
126
- >
127
- > **"Why is it so cheap?"** — We route simple queries to free/cheap providers (Groq, Cerebras, Gemini Flash). Complex queries still go to premium. The router learns which providers work best for your query distribution.
128
- >
129
- > **"10K downloads in 14 days with zero marketing?"** — Yeah, devs found it on npm, tried it, and told their team. The 62% savings pitch sells itself.
130
- >
131
- > **"What about latency?"** — We published third-party benchmark numbers above. The overhead is real but the cost savings dwarf it at scale.
132
- >
133
- > Happy to answer questions about the routing algorithm, the benchmark, or how to integrate it into your stack.
134
-
135
- ---
136
-
137
- **Ask HN:** What would you use a 200× cheaper LLM router for?