adaptive-memory-multi-model-router 2.14.45 → 2.14.47

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (605)
  1. package/dist/index.d.ts +4 -0
  2. package/dist/index.js +8 -2
  3. package/dist/memory/hybridMemory.d.ts +71 -0
  4. package/dist/memory/hybridMemory.js +124 -0
  5. package/dist/memory/reasoningBank.d.ts +88 -0
  6. package/dist/memory/reasoningBank.js +303 -0
  7. package/{docs/llms.txt → llms.txt.bak} +6 -6
  8. package/package.json +13 -84
  9. package/src/index.ts +8 -0
  10. package/src/memory/hybridMemory.ts +155 -0
  11. package/src/memory/reasoningBank.ts +335 -0
  12. package/src/routing/advancedRouter.ts.bak +650 -0
  13. package/test.js.bak +376 -0
  14. package/.dockerignore +0 -82
  15. package/.env.example +0 -303
  16. package/.github/DISCUSSIONS_WELCOME.md +0 -27
  17. package/.github/DISCUSSION_TEMPLATE.yml +0 -5
  18. package/.github/FUNDING.yml +0 -2
  19. package/.github/ISSUE_TEMPLATE/bug_report.md +0 -94
  20. package/.github/ISSUE_TEMPLATE/config.yml +0 -17
  21. package/.github/ISSUE_TEMPLATE/feature_request.md +0 -71
  22. package/.github/PULL_REQUEST_TEMPLATE.md +0 -71
  23. package/.github/dependabot.yml +0 -9
  24. package/.github/workflows/auto-publish.yml +0 -51
  25. package/.github/workflows/ci.yml +0 -263
  26. package/.github/workflows/codeql.yml +0 -38
  27. package/.github/workflows/npm-publish.yml +0 -20
  28. package/.github/workflows/pages.yml +0 -37
  29. package/.github/workflows/stale.yml +0 -54
  30. package/.publish-tick +0 -1
  31. package/.well-known/ai-plugin.json +0 -16
  32. package/AGENT_COUNCIL_FINDINGS.md +0 -142
  33. package/ARCHITECTURE.md +0 -346
  34. package/AUDIT_REPORT.md +0 -28
  35. package/CODE_OF_CONDUCT.md +0 -128
  36. package/CONTRIBUTING.md +0 -50
  37. package/CONTRIBUTORS.md +0 -20
  38. package/Dockerfile +0 -53
  39. package/Dockerfile.proxy +0 -33
  40. package/HEALTH_REPORT.md +0 -118
  41. package/IMPROVEMENT_PLAN.md +0 -107
  42. package/LANDING.md +0 -43
  43. package/LAUNCH-PAIN-DRIVEN.md +0 -339
  44. package/LAUNCH.md +0 -337
  45. package/LAUNCH_CHECKLIST.md +0 -141
  46. package/LAUNCH_SNAPSHOT.md +0 -260
  47. package/MANIFESTO.md +0 -41
  48. package/POPULARITY_BOOSTERS.md +0 -285
  49. package/PR_STATUS_REPORT.md +0 -148
  50. package/REDESIGN.md +0 -95
  51. package/RUNKIT.md +0 -83
  52. package/SECURITY.md +0 -29
  53. package/SUBMISSIONS.md +0 -43
  54. package/_schema.html +0 -53
  55. package/ai-plugin.json +0 -16
  56. package/articles/AI_AGENT_LLM_ROUTING.md +0 -150
  57. package/articles/CHINESE_DIRECTORIES.md +0 -100
  58. package/articles/CHINESE_SUBMISSIONS_READY.md +0 -322
  59. package/articles/COMPETITOR_ALERTS.md +0 -31
  60. package/articles/COMPLETE_POSTING_DIRECTORY.md +0 -147
  61. package/articles/CONTENT_STRUCTURE.md +0 -292
  62. package/articles/DEVTO_COST_GUIDE.md +0 -473
  63. package/articles/DEVTO_FINAL.md +0 -416
  64. package/articles/DEVTO_MULTI_PROVIDER.md +0 -542
  65. package/articles/DEVTO_READY.md +0 -255
  66. package/articles/DEVTO_V2_ANNOUNCEMENT.md +0 -160
  67. package/articles/DEVTO_VIRAL_GROWTH.md +0 -280
  68. package/articles/FRESH_devto.md +0 -460
  69. package/articles/FRESH_devto_2026_05.md +0 -73
  70. package/articles/FRESH_hackernews.md +0 -14
  71. package/articles/FRESH_reddit_ml.md +0 -90
  72. package/articles/FRESH_reddit_node.md +0 -198
  73. package/articles/FRESH_reddit_sideproject.md +0 -72
  74. package/articles/FRESH_reddit_webdev.md +0 -130
  75. package/articles/FROM_ZERO_TO_10K.md +0 -107
  76. package/articles/HN_10X_BETTER.md +0 -430
  77. package/articles/HN_ACCOUNT_GUIDE.md +0 -21
  78. package/articles/HN_CHINESE_STYLE.md +0 -308
  79. package/articles/HN_FINAL.md +0 -148
  80. package/articles/HN_POSTED_VERSION.md +0 -56
  81. package/articles/HN_POST_READY.md +0 -137
  82. package/articles/HN_RESEARCH.md +0 -364
  83. package/articles/HN_SHOW_routerarena.md +0 -17
  84. package/articles/HN_TIMING_GUIDE.md +0 -52
  85. package/articles/INDIEHACKERS_POST.md +0 -52
  86. package/articles/INDIEHACKERS_READY.md +0 -120
  87. package/articles/LLM_BENCHMARK_DEEP_DIVE.md +0 -153
  88. package/articles/MASTER_POSTING_DIRECTORY.md +0 -189
  89. package/articles/NEWSLETTER_SEND_NOW.md +0 -259
  90. package/articles/NEWSLETTER_SUBMISSIONS.md +0 -112
  91. package/articles/PAIN-DRIVEN-devto-v2.md +0 -308
  92. package/articles/PAIN-DRIVEN-devto-v3.md +0 -268
  93. package/articles/PAIN-DRIVEN-devto.md +0 -242
  94. package/articles/PAIN-DRIVEN-hackernews-v2.md +0 -138
  95. package/articles/PAIN-DRIVEN-hackernews-v3.md +0 -151
  96. package/articles/PAIN-DRIVEN-hackernews.md +0 -131
  97. package/articles/PAIN-DRIVEN-reddit-v2.md +0 -301
  98. package/articles/PAIN-DRIVEN-reddit-v3.md +0 -236
  99. package/articles/PAIN-DRIVEN-reddit.md +0 -218
  100. package/articles/PAIN-DRIVEN-twitter-v2.md +0 -110
  101. package/articles/PAIN-DRIVEN-twitter-v3.md +0 -121
  102. package/articles/PAIN-DRIVEN-twitter.md +0 -120
  103. package/articles/PORTKEY_VS_A3M.md +0 -147
  104. package/articles/POSTING_KIT_2026_05.md +0 -67
  105. package/articles/PRESS_KIT_routerarena.md +0 -77
  106. package/articles/PRODUCTHUNT_LISTING.md +0 -48
  107. package/articles/PRODUCTHUNT_READY.md +0 -106
  108. package/articles/PR_PLAN_vault.md +0 -125
  109. package/articles/REDDIT_FINAL.md +0 -232
  110. package/articles/REDDIT_POST.md +0 -67
  111. package/articles/REDDIT_SUBMISSION_READY.md +0 -348
  112. package/articles/ROUTERARENA_LEADER.md +0 -45
  113. package/articles/SHOW_HN_FINAL.md +0 -29
  114. package/articles/TWEETS_10K_DOWNLOADS.md +0 -47
  115. package/articles/TWEETS_BENCHMARK_FIRST.md +0 -46
  116. package/articles/TWEETS_MCP_PLAY.md +0 -51
  117. package/articles/TWEETS_SEQUENTIAL_BROKEN.md +0 -49
  118. package/articles/TWEETS_WHY_BUILD.md +0 -54
  119. package/articles/TWEETS_routerarena_leader.md +0 -53
  120. package/articles/TWEET_STORM_READY.md +0 -165
  121. package/articles/TWITTER_FINAL.md +0 -167
  122. package/articles/WHY_10X_BETTER.md +0 -261
  123. package/articles/WHY_CHINESE_STYLE_BETTER.md +0 -323
  124. package/articles/ai-discoverability-llm-routing.md +0 -210
  125. package/articles/devto-llm-routing.md +0 -138
  126. package/articles/hackernews-show-hn.md +0 -54
  127. package/articles/hashnode-llm-cost-optimization.md +0 -125
  128. package/articles/hn_show_2026_05.md +0 -11
  129. package/articles/medium-building-llm-router.md +0 -205
  130. package/articles/reddit-ml.md +0 -76
  131. package/articles/twitter-thread-cost-savings.md +0 -50
  132. package/articles/youtube-tutorial-script.md +0 -262
  133. package/assets/a3m_3blue1brown.mp4 +0 -0
  134. package/assets/banner.svg +0 -109
  135. package/assets/chart-cost-v2.svg +0 -91
  136. package/assets/chart-cost-v3.svg +0 -143
  137. package/assets/chart-features-v2.svg +0 -132
  138. package/assets/chart-features-v3.svg +0 -211
  139. package/assets/chart-growth-v2.svg +0 -122
  140. package/assets/chart-growth-v3.svg +0 -189
  141. package/assets/cost-comparison.svg +0 -134
  142. package/assets/cost-simple.svg +0 -64
  143. package/assets/demo-hn.gif +0 -0
  144. package/assets/feature-matrix.svg +0 -136
  145. package/assets/growth-chart-animated.svg +0 -76
  146. package/assets/growth-chart.svg +0 -82
  147. package/assets/growth-simple.svg +0 -69
  148. package/assets/hero-diagram.svg +0 -81
  149. package/assets/logo-new.svg +0 -21
  150. package/assets/logo.svg +0 -68
  151. package/assets/provider-comparison.svg +0 -121
  152. package/assets/social-preview-new.svg +0 -100
  153. package/assets/social-preview.svg +0 -194
  154. package/assets/social-v2.svg +0 -130
  155. package/assets/social-v3.svg +0 -212
  156. package/benchmark-provider-results.json +0 -245
  157. package/benchmark-results.json +0 -54
  158. package/council-votes/architecture-vote.md +0 -121
  159. package/council-votes/coverage-vote.md +0 -93
  160. package/data/adaptive-benchmark.json +0 -92
  161. package/data/benchmark-results.json +0 -47
  162. package/data/labeled-benchmark.json +0 -88
  163. package/demo/3blue1brown_video.py +0 -285
  164. package/demo/3blue1brown_video_v2.py +0 -310
  165. package/demo/IMPROVED_PROMPTS.md +0 -229
  166. package/demo/VEO3_PROMPTS.md +0 -269
  167. package/demo/VIDEO_PRODUCTION_GUIDE.md +0 -333
  168. package/demo/a3m_3blue1brown.mp4 +0 -0
  169. package/demo/asciinema-demo.sh +0 -195
  170. package/demo/demo-hn.tape +0 -74
  171. package/demo/demo-script.md +0 -53
  172. package/demo/demo-script.sh +0 -62
  173. package/demo/demo.svg +0 -75
  174. package/demo/frame1_ai_data_center.png +0 -0
  175. package/demo/frame1_sunset_video.mp4 +0 -0
  176. package/demo/frame2_cost_comparison.png +0 -0
  177. package/demo/frame2_cost_comparison_fallback.png +0 -0
  178. package/demo/frame3_parallel_execution.png +0 -0
  179. package/demo/frame3_parallel_execution_fallback.png +0 -0
  180. package/demo/frame4_providers.png +0 -0
  181. package/demo/frame4_providers_fallback.png +0 -0
  182. package/demo/frame5_endcard.png +0 -0
  183. package/demo/frame5_endcard_fallback.png +0 -0
  184. package/demo/new_frame1_hook.png +0 -0
  185. package/demo/new_frame2_proof.png +0 -0
  186. package/demo/new_frame3_wow.png +0 -0
  187. package/demo/new_frame4_social.png +0 -0
  188. package/demo/new_frame5_cta.png +0 -0
  189. package/demo/package.json +0 -13
  190. package/demo/product-video-final.mp4 +0 -0
  191. package/demo/product-video-hype-v1.mp4 +0 -0
  192. package/demo/product-video-v1.mp4 +0 -0
  193. package/demo/public/index.html +0 -762
  194. package/demo/recording.cast +0 -55
  195. package/demo/server.js +0 -405
  196. package/demo-new.tape +0 -71
  197. package/demo-real.sh +0 -198
  198. package/demo-simple.tape +0 -205
  199. package/demo.html +0 -520
  200. package/demo.sh +0 -85
  201. package/demo.tape +0 -259
  202. package/dist/analytics/costAnalytics.d.ts.map +0 -1
  203. package/dist/analytics/costAnalytics.js.map +0 -1
  204. package/dist/benchmark/comprehensive.js.map +0 -1
  205. package/dist/benchmark/reproducible.d.ts.map +0 -1
  206. package/dist/benchmark/reproducible.js.map +0 -1
  207. package/dist/cache/prefixCache.d.ts.map +0 -1
  208. package/dist/cache/prefixCache.js.map +0 -1
  209. package/dist/cache/responseCache.d.ts.map +0 -1
  210. package/dist/cache/responseCache.js.map +0 -1
  211. package/dist/cache/semanticCache.d.ts.map +0 -1
  212. package/dist/cache/semanticCache.js.map +0 -1
  213. package/dist/cli/setupWizard.d.ts.map +0 -1
  214. package/dist/cli/setupWizard.js.map +0 -1
  215. package/dist/cost/budgetEnforcer.d.ts.map +0 -1
  216. package/dist/cost/budgetEnforcer.js.map +0 -1
  217. package/dist/cost/costTracker.d.ts.map +0 -1
  218. package/dist/cost/costTracker.js.map +0 -1
  219. package/dist/ensemble/multiRoundDialog.js.map +0 -1
  220. package/dist/ensemble/shapleyValue.js.map +0 -1
  221. package/dist/integrations/langchainAdapter.d.ts.map +0 -1
  222. package/dist/integrations/langchainAdapter.js.map +0 -1
  223. package/dist/integrations/oauth.d.ts.map +0 -1
  224. package/dist/integrations/oauth.js.map +0 -1
  225. package/dist/integrations/scienceAdapter.js.map +0 -1
  226. package/dist/memory/autoFetch.d.ts.map +0 -1
  227. package/dist/memory/autoFetch.js.map +0 -1
  228. package/dist/memory/episodicMemory.d.ts.map +0 -1
  229. package/dist/memory/episodicMemory.js.map +0 -1
  230. package/dist/memory/memoryTree.d.ts.map +0 -1
  231. package/dist/memory/memoryTree.js.map +0 -1
  232. package/dist/memory/obsidianVault.d.ts.map +0 -1
  233. package/dist/memory/obsidianVault.js.map +0 -1
  234. package/dist/observability/changeWatch.d.ts.map +0 -1
  235. package/dist/observability/changeWatch.js.map +0 -1
  236. package/dist/observability/fatigueDetector.d.ts.map +0 -1
  237. package/dist/observability/fatigueDetector.js.map +0 -1
  238. package/dist/observability/index.d.ts.map +0 -1
  239. package/dist/observability/index.js.map +0 -1
  240. package/dist/observability/metrics.d.ts.map +0 -1
  241. package/dist/observability/metrics.js.map +0 -1
  242. package/dist/observability/middleware.d.ts.map +0 -1
  243. package/dist/observability/middleware.js.map +0 -1
  244. package/dist/observability/tracer.d.ts.map +0 -1
  245. package/dist/observability/tracer.js.map +0 -1
  246. package/dist/observability/types.d.ts.map +0 -1
  247. package/dist/observability/types.js.map +0 -1
  248. package/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  249. package/dist/orchestration/haloOrchestrator.js.map +0 -1
  250. package/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  251. package/dist/orchestration/mctsWorkflow.js.map +0 -1
  252. package/dist/providers/localProvider.d.ts.map +0 -1
  253. package/dist/providers/localProvider.js.map +0 -1
  254. package/dist/providers/providerConfig.d.ts.map +0 -1
  255. package/dist/providers/providerConfig.js.map +0 -1
  256. package/dist/providers/registry.d.ts.map +0 -1
  257. package/dist/providers/registry.js.map +0 -1
  258. package/dist/routing/advancedRouter.d.ts.map +0 -1
  259. package/dist/routing/advancedRouter.js.map +0 -1
  260. package/dist/routing/crossModelValidation.d.ts.map +0 -1
  261. package/dist/routing/crossModelValidation.js.map +0 -1
  262. package/dist/routing/providerHealth.d.ts.map +0 -1
  263. package/dist/routing/providerHealth.js.map +0 -1
  264. package/dist/routing/providerRetry.d.ts.map +0 -1
  265. package/dist/routing/providerRetry.js.map +0 -1
  266. package/dist/scripts/banner.js +0 -29
  267. package/dist/security/guardrails.d.ts.map +0 -1
  268. package/dist/security/guardrails.js.map +0 -1
  269. package/dist/server/dashboard.d.ts.map +0 -1
  270. package/dist/server/dashboard.js.map +0 -1
  271. package/dist/server/modelMapper.d.ts.map +0 -1
  272. package/dist/server/modelMapper.js.map +0 -1
  273. package/dist/server/proxyServer.d.ts.map +0 -1
  274. package/dist/server/proxyServer.js.map +0 -1
  275. package/dist/skills/__tests__/skill_manager.test.d.ts +0 -2
  276. package/dist/skills/__tests__/skill_manager.test.d.ts.map +0 -1
  277. package/dist/skills/__tests__/skill_manager.test.js +0 -268
  278. package/dist/skills/__tests__/skill_manager.test.js.map +0 -1
  279. package/dist/tools/tmlpdTools.d.ts.map +0 -1
  280. package/dist/tools/tmlpdTools.js.map +0 -1
  281. package/dist/tui/dashboard.d.ts.map +0 -1
  282. package/dist/tui/dashboard.js.map +0 -1
  283. package/dist/tui/index.d.ts.map +0 -1
  284. package/dist/tui/index.js.map +0 -1
  285. package/dist/utils/batchProcessor.d.ts.map +0 -1
  286. package/dist/utils/batchProcessor.js.map +0 -1
  287. package/dist/utils/compression.d.ts.map +0 -1
  288. package/dist/utils/compression.js.map +0 -1
  289. package/dist/utils/costUtils.d.ts.map +0 -1
  290. package/dist/utils/costUtils.js.map +0 -1
  291. package/dist/utils/reliability.d.ts.map +0 -1
  292. package/dist/utils/reliability.js.map +0 -1
  293. package/dist/utils/sorting.d.ts.map +0 -1
  294. package/dist/utils/sorting.js.map +0 -1
  295. package/dist/utils/speculativeDecoding.d.ts.map +0 -1
  296. package/dist/utils/speculativeDecoding.js.map +0 -1
  297. package/dist/utils/tokenUtils.d.ts.map +0 -1
  298. package/dist/utils/tokenUtils.js.map +0 -1
  299. package/docs/.nojekyll +0 -0
  300. package/docs/ANALYSIS_PRINCIPLES.md +0 -162
  301. package/docs/API.md +0 -855
  302. package/docs/ARCHITECTURAL-IMPROVEMENTS-2025.md +0 -1391
  303. package/docs/ARCHITECTURAL-IMPROVEMENTS-REVISED-2025.md +0 -1051
  304. package/docs/BENCHMARK.md +0 -170
  305. package/docs/CHINESE_PROVIDER_RELIABILITY.md +0 -37
  306. package/docs/CITATIONS.md +0 -74
  307. package/docs/CLAIMS_AND_EVIDENCE.md +0 -58
  308. package/docs/CONFIGURATION.md +0 -476
  309. package/docs/COUNCIL_DECISION.json +0 -816
  310. package/docs/COUNCIL_SUMMARY.md +0 -319
  311. package/docs/COUNCIL_V2.2_DECISION.md +0 -416
  312. package/docs/ENGINEERING_SPEC.md +0 -55
  313. package/docs/FACTORY_RESET.md +0 -34
  314. package/docs/GEO.md +0 -66
  315. package/docs/GEO_OPTIMIZATION.md +0 -30
  316. package/docs/GEO_ROOT_CAUSE.md +0 -136
  317. package/docs/GEO_STATUS.md +0 -85
  318. package/docs/GEO_TEST_RESULTS.md +0 -176
  319. package/docs/HN_CHECKLIST.md +0 -38
  320. package/docs/HN_FOUNDER_COMMENT.md +0 -17
  321. package/docs/HN_SUBMISSION_FINAL.md +0 -180
  322. package/docs/HN_SUBMISSION_V3.md +0 -56
  323. package/docs/IMPROVEMENT_ROADMAP.md +0 -515
  324. package/docs/INTEGRATIONS.md +0 -420
  325. package/docs/LANGCHAIN_INTEGRATION.md +0 -147
  326. package/docs/LLM_COUNCIL_DECISION.md +0 -508
  327. package/docs/MIDDLEWARE_CHAIN.md +0 -35
  328. package/docs/PROMO_CHECKLIST.md +0 -200
  329. package/docs/QUICKSTART.md +0 -271
  330. package/docs/QUICK_START.md +0 -43
  331. package/docs/QUICK_START_VISIBILITY.md +0 -782
  332. package/docs/REDDIT_GAP_ANALYSIS.md +0 -299
  333. package/docs/RELEASE_CHECKLIST.md +0 -32
  334. package/docs/REPRODUCIBILITY.md +0 -63
  335. package/docs/RESEARCH_BACKED_IMPROVEMENTS.md +0 -1180
  336. package/docs/ROUTING_RUBRIC.md +0 -197
  337. package/docs/SEO_AUDIT.md +0 -186
  338. package/docs/SOCIAL_LISTENING.md +0 -219
  339. package/docs/TMLPD_QNA.md +0 -751
  340. package/docs/TMLPD_V2.1_COMPLETE.md +0 -763
  341. package/docs/TMLPD_V2.2_RESEARCH_ROADMAP.md +0 -754
  342. package/docs/UPDATE_TOPICS.md +0 -15
  343. package/docs/USE_CASES.md +0 -59
  344. package/docs/V2.2_IMPLEMENTATION_COMPLETE.md +0 -446
  345. package/docs/V2_IMPLEMENTATION_GUIDE.md +0 -388
  346. package/docs/VERCEL_AI_SDK.md +0 -209
  347. package/docs/VISIBILITY_ADOPTION_PLAN.md +0 -1005
  348. package/docs/_config.yml +0 -49
  349. package/docs/ai-plugin.json +0 -16
  350. package/docs/api.html +0 -513
  351. package/docs/architecture-diagram.md +0 -40
  352. package/docs/benchmark-chart.png +0 -0
  353. package/docs/benchmark.html +0 -387
  354. package/docs/blog/routerarena-number-one.html +0 -73
  355. package/docs/cli-cheatsheet.md +0 -339
  356. package/docs/compare.md +0 -109
  357. package/docs/comparison-litellm.md +0 -88
  358. package/docs/comparison.md +0 -108
  359. package/docs/cost-chart-ascii.md +0 -42
  360. package/docs/cost-comparison-chart.svg +0 -88
  361. package/docs/curl-examples.md +0 -247
  362. package/docs/demo-auto.html +0 -264
  363. package/docs/demo.html +0 -416
  364. package/docs/geo/GENERATIVE_ENGINE_OPTIMIZATION.md +0 -232
  365. package/docs/index.html +0 -507
  366. package/docs/launch-content/LAUNCH_EXECUTION_CHECKLIST.md +0 -421
  367. package/docs/launch-content/README.md +0 -457
  368. package/docs/launch-content/assets/cost_comparison_100_tasks.png +0 -0
  369. package/docs/launch-content/assets/cumulative_savings.png +0 -0
  370. package/docs/launch-content/assets/parallel_speedup.png +0 -0
  371. package/docs/launch-content/assets/provider_pricing_comparison.png +0 -0
  372. package/docs/launch-content/assets/task_breakdown_comparison.png +0 -0
  373. package/docs/launch-content/generate_charts.py +0 -313
  374. package/docs/launch-content/hn_show_post.md +0 -139
  375. package/docs/launch-content/partner_outreach_templates.md +0 -745
  376. package/docs/launch-content/reddit_posts.md +0 -467
  377. package/docs/launch-content/twitter_thread.txt +0 -460
  378. package/docs/npm-downloads-chart.svg +0 -43
  379. package/docs/openapi.json +0 -139
  380. package/docs/openapi.yaml +0 -1318
  381. package/docs/quick-start.html +0 -366
  382. package/docs/robots.txt +0 -52
  383. package/docs/sitemap.xml +0 -57
  384. package/docs/styles.css +0 -682
  385. package/docs/well-known/ai-plugin.json +0 -16
  386. package/docs/wellknown/ai-plugin.json +0 -16
  387. package/docs-site/assets/og-banner.svg +0 -194
  388. package/docs-site/index.html +0 -632
  389. package/eval/README.md +0 -46
  390. package/eval/baselines/main.json +0 -12
  391. package/eval/benchmark_dataset.jsonl +0 -16
  392. package/eval/check_golden_routes.js +0 -64
  393. package/eval/datasets/catalog.json +0 -33
  394. package/eval/datasets/slices/cn_provider_reliability_v1.jsonl +0 -3
  395. package/eval/datasets/slices/cost_pressure_v1.jsonl +0 -3
  396. package/eval/datasets/slices/safety_guardrails_v1.jsonl +0 -3
  397. package/eval/evals.json +0 -199
  398. package/eval/fault_injection_thresholds.json +0 -3
  399. package/eval/generate_report.js +0 -128
  400. package/eval/golden_routes.json +0 -114
  401. package/eval/lib/experiment_registry.js +0 -24
  402. package/eval/run_eval.js +0 -197
  403. package/eval/run_fault_injection.js +0 -201
  404. package/eval/run_shadow_eval.js +0 -85
  405. package/eval/thresholds.json +0 -9
  406. package/examples/QUICKSTART.md +0 -183
  407. package/examples/README.md +0 -61
  408. package/examples/a3m-sdk.js +0 -124
  409. package/examples/basic-route.js +0 -54
  410. package/examples/chat-loop.js +0 -202
  411. package/examples/classify-then-route.js +0 -102
  412. package/examples/cost-compare.js +0 -120
  413. package/examples/ensemble.js +0 -160
  414. package/examples/whatsapp-telegram-bridge-demo.js +0 -302
  415. package/examples/whatsapp-telegram-bridge.js +0 -269
  416. package/hf-space/README.md +0 -23
  417. package/hf-space/app.py +0 -240
  418. package/hf-space/requirements.txt +0 -1
  419. package/huggingface_space/README.md +0 -35
  420. package/huggingface_space/app.py +0 -126
  421. package/huggingface_space/create_space.py +0 -208
  422. package/huggingface_space/requirements.txt +0 -1
  423. package/mcp-server/README.md +0 -188
  424. package/mcp-server/package.json +0 -29
  425. package/mcp-server/src/index.ts +0 -744
  426. package/mcp-server/tsconfig.json +0 -19
  427. package/openclaw-alexa-bridge/ALL_REMAINING_FIXES_PLAN.md +0 -313
  428. package/openclaw-alexa-bridge/REMAINING_FIXES_SUMMARY.md +0 -277
  429. package/openclaw-alexa-bridge/src/alexa_handler_no_tmlpd.js +0 -1234
  430. package/openclaw-alexa-bridge/test_fixes.js +0 -77
  431. package/playground/README.md +0 -51
  432. package/playground/codesandbox.json +0 -12
  433. package/playground/index.js +0 -39
  434. package/proxy/README.md +0 -227
  435. package/proxy/package-lock.json +0 -831
  436. package/proxy/package.json +0 -17
  437. package/proxy/rate-limit.js +0 -145
  438. package/proxy/rate-limit.test.js +0 -311
  439. package/proxy/server.js +0 -970
  440. package/python/README.md +0 -102
  441. package/python/a3m/__init__.py +0 -6
  442. package/python/a3m/client.py +0 -190
  443. package/python/a3m/models.py +0 -40
  444. package/python/a3m/sync_client.py +0 -61
  445. package/python/examples.py +0 -53
  446. package/python/integrations.py +0 -330
  447. package/python/pyproject.toml +0 -23
  448. package/python/setup.py +0 -28
  449. package/python/tmlpd.py +0 -369
  450. package/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  451. package/qna/TMLPD_QNA.md +0 -751
  452. package/research/FINDING_001_safety.md +0 -28
  453. package/research/FINDING_002_error_diversity.md +0 -32
  454. package/research/FINDING_003_confidence_weighted_voting.md +0 -32
  455. package/research/FINDING_004_cross_model_semantic_detection.md +0 -37
  456. package/research/FINDING_005_knowledge_gap_orthogonality.md +0 -34
  457. package/research/HALLUCINATION_RESEARCH.md +0 -27
  458. package/research/ensemble-voting.md +0 -324
  459. package/research/loss-functions.md +0 -545
  460. package/research-log.md +0 -49
  461. package/scripts/banner.js +0 -29
  462. package/scripts/benchmark-local-routerarena.ts +0 -176
  463. package/scripts/benchmark.js +0 -145
  464. package/scripts/benchmark.sh +0 -61
  465. package/scripts/compare-providers.sh +0 -230
  466. package/scripts/content-planner.js +0 -25
  467. package/scripts/create-labeled-benchmark.ts +0 -105
  468. package/scripts/cross_post.py +0 -443
  469. package/scripts/local-router-benchmark.ts +0 -154
  470. package/scripts/post-all.sh +0 -41
  471. package/scripts/publish_fcc.py +0 -106
  472. package/scripts/push-to-gitee.sh +0 -25
  473. package/scripts/routerarena_ensemble.js +0 -144
  474. package/scripts/routing-benchmark-v2.js +0 -373
  475. package/scripts/routing-benchmark-v3.js +0 -118
  476. package/scripts/routing-benchmark.js +0 -462
  477. package/scripts/run-labeled-benchmark.mjs +0 -104
  478. package/scripts/run-mmlu-benchmark.js +0 -176
  479. package/scripts/run-provider-benchmark.js +0 -244
  480. package/scripts/update-npm-badges.js +0 -158
  481. package/skill/SKILL.md +0 -238
  482. package/src/__tests__/integration/tmpld_integration.test.py +0 -540
  483. package/src/skills/__tests__/skill_manager.test.ts +0 -328
  484. package/submissions/benchmarks/ALL_PLATFORMS_SUBMISSION.md +0 -94
  485. package/submissions/benchmarks/LLMROUTERBENCH_SUBMISSION.md +0 -121
  486. package/submissions/benchmarks/MMRBENCH_SUBMISSION.md +0 -94
  487. package/submissions/benchmarks/ROUTERARENA_UPDATE.md +0 -83
  488. package/submissions/benchmarks/ROUTERBENCH_SUBMISSION.md +0 -225
  489. package/test-council/1-structure-tests.test.js +0 -353
  490. package/test-council/1-structure-tests.test.ts +0 -353
  491. package/test-council/2-edge-case-tests.test.ts +0 -361
  492. package/test-council/3-performance-tests.test.ts +0 -669
  493. package/test-council/4-integration-tests.test.ts +0 -391
  494. package/test-council/5-agent-council-eval.test.ts +0 -413
  495. package/test-council/AGENT_COUNCIL_ARCHITECTURE.md +0 -349
  496. package/test-council/TEST_COUNCIL_REPORT.md +0 -201
  497. package/test-council/agents/edge-case-agent.ts +0 -363
  498. package/test-council/agents/performance-agent.ts +0 -426
  499. package/test-council/agents/structure-agent.ts +0 -227
  500. package/test-council/council.md +0 -183
  501. package/tests/__mocks__/tokenUtils.ts +0 -8
  502. package/tests/memory/episodicMemory.test.ts +0 -227
  503. package/tests/package-lock.json +0 -1628
  504. package/tests/package.json +0 -18
  505. package/tests/routing/ensembleVoting.test.ts +0 -236
  506. package/tests/routing/providerRetry.test.ts +0 -360
  507. package/tests/routing/queryTypePresets.test.ts +0 -208
  508. package/tests/security/guardrailEngine.test.ts +0 -700
  509. package/tests/tsconfig.json +0 -21
  510. package/tests/vitest.config.ts +0 -18
  511. package/tmlpd-pi-extension/README.md +0 -66
  512. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts +0 -114
  513. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts.map +0 -1
  514. package/tmlpd-pi-extension/dist/cache/prefixCache.js +0 -285
  515. package/tmlpd-pi-extension/dist/cache/prefixCache.js.map +0 -1
  516. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts +0 -58
  517. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts.map +0 -1
  518. package/tmlpd-pi-extension/dist/cache/responseCache.js +0 -153
  519. package/tmlpd-pi-extension/dist/cache/responseCache.js.map +0 -1
  520. package/tmlpd-pi-extension/dist/cli.js +0 -59
  521. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts +0 -95
  522. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts.map +0 -1
  523. package/tmlpd-pi-extension/dist/cost/costTracker.js +0 -240
  524. package/tmlpd-pi-extension/dist/cost/costTracker.js.map +0 -1
  525. package/tmlpd-pi-extension/dist/index.d.ts +0 -723
  526. package/tmlpd-pi-extension/dist/index.d.ts.map +0 -1
  527. package/tmlpd-pi-extension/dist/index.js +0 -239
  528. package/tmlpd-pi-extension/dist/index.js.map +0 -1
  529. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts +0 -82
  530. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts.map +0 -1
  531. package/tmlpd-pi-extension/dist/memory/episodicMemory.js +0 -145
  532. package/tmlpd-pi-extension/dist/memory/episodicMemory.js.map +0 -1
  533. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts +0 -102
  534. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  535. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js +0 -207
  536. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js.map +0 -1
  537. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts +0 -85
  538. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  539. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js +0 -210
  540. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js.map +0 -1
  541. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts +0 -102
  542. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts.map +0 -1
  543. package/tmlpd-pi-extension/dist/providers/localProvider.js +0 -338
  544. package/tmlpd-pi-extension/dist/providers/localProvider.js.map +0 -1
  545. package/tmlpd-pi-extension/dist/providers/registry.d.ts +0 -55
  546. package/tmlpd-pi-extension/dist/providers/registry.d.ts.map +0 -1
  547. package/tmlpd-pi-extension/dist/providers/registry.js +0 -138
  548. package/tmlpd-pi-extension/dist/providers/registry.js.map +0 -1
  549. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts +0 -68
  550. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts.map +0 -1
  551. package/tmlpd-pi-extension/dist/routing/advancedRouter.js +0 -332
  552. package/tmlpd-pi-extension/dist/routing/advancedRouter.js.map +0 -1
  553. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts +0 -101
  554. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts.map +0 -1
  555. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js +0 -368
  556. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js.map +0 -1
  557. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts +0 -96
  558. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts.map +0 -1
  559. package/tmlpd-pi-extension/dist/utils/batchProcessor.js +0 -170
  560. package/tmlpd-pi-extension/dist/utils/batchProcessor.js.map +0 -1
  561. package/tmlpd-pi-extension/dist/utils/compression.d.ts +0 -61
  562. package/tmlpd-pi-extension/dist/utils/compression.d.ts.map +0 -1
  563. package/tmlpd-pi-extension/dist/utils/compression.js +0 -281
  564. package/tmlpd-pi-extension/dist/utils/compression.js.map +0 -1
  565. package/tmlpd-pi-extension/dist/utils/reliability.d.ts +0 -74
  566. package/tmlpd-pi-extension/dist/utils/reliability.d.ts.map +0 -1
  567. package/tmlpd-pi-extension/dist/utils/reliability.js +0 -177
  568. package/tmlpd-pi-extension/dist/utils/reliability.js.map +0 -1
  569. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts +0 -117
  570. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts.map +0 -1
  571. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js +0 -246
  572. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js.map +0 -1
  573. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts +0 -50
  574. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts.map +0 -1
  575. package/tmlpd-pi-extension/dist/utils/tokenUtils.js +0 -124
  576. package/tmlpd-pi-extension/dist/utils/tokenUtils.js.map +0 -1
  577. package/tmlpd-pi-extension/examples/QUICKSTART.md +0 -183
  578. package/tmlpd-pi-extension/package-lock.json +0 -79
  579. package/tmlpd-pi-extension/package.json +0 -172
  580. package/tmlpd-pi-extension/python/examples.py +0 -53
  581. package/tmlpd-pi-extension/python/integrations.py +0 -330
  582. package/tmlpd-pi-extension/python/setup.py +0 -28
  583. package/tmlpd-pi-extension/python/tmlpd.py +0 -369
  584. package/tmlpd-pi-extension/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  585. package/tmlpd-pi-extension/qna/TMLPD_QNA.md +0 -751
  586. package/tmlpd-pi-extension/skill/SKILL.md +0 -238
  587. package/tmlpd-pi-extension/src/cache/responseCache.ts +0 -147
  588. package/tmlpd-pi-extension/src/cost/costTracker.ts +0 -302
  589. package/tmlpd-pi-extension/src/index.ts +0 -232
  590. package/tmlpd-pi-extension/src/memory/episodicMemory.ts +0 -257
  591. package/tmlpd-pi-extension/src/orchestration/haloOrchestrator.ts +0 -266
  592. package/tmlpd-pi-extension/src/orchestration/mctsWorkflow.ts +0 -262
  593. package/tmlpd-pi-extension/src/providers/localProvider.ts +0 -406
  594. package/tmlpd-pi-extension/src/providers/registry.ts +0 -164
  595. package/tmlpd-pi-extension/src/routing/ensembleVoting.ts +0 -159
  596. package/tmlpd-pi-extension/src/routing/queryTypePresets.ts +0 -136
  597. package/tmlpd-pi-extension/src/tools/tmlpdTools.ts +0 -433
  598. package/tmlpd-pi-extension/src/utils/batchProcessor.ts +0 -232
  599. package/tmlpd-pi-extension/src/utils/compression.ts +0 -325
  600. package/tmlpd-pi-extension/src/utils/reliability.ts +0 -221
  601. package/tmlpd-pi-extension/src/utils/tokenUtils.ts +0 -145
  602. package/tmlpd-pi-extension/tsconfig.json +0 -18
  603. package/tsconfig.build.json +0 -29
  604. package/tsconfig.json +0 -18
  605. package/{docs/llms-full.txt → llms-full.txt.bak} +0 -0
@@ -1,112 +0,0 @@
1
- # Newsletter Submissions
2
-
3
- ## 6 Target Newsletters
4
-
5
- ### 1. Import AI (jack@sequoiacap.com)
6
- **Audience:** AI researchers, builders
7
- **Frequency:** Weekly
8
- **Submission:** Email to jack@sequoiacap.com
9
-
10
- ### 2. The Batch (Anthropic)
11
- **URL:** https://www.anthropic.com/news (press@anthropic.com)
12
-
13
- ### 3. OpenAI Newsletter
14
- **URL:** https://openai.com/newsletter
15
-
16
- ### 4. DeepLearning.ai Newsletter
17
- **URL:** https://www.deeplearning.ai/newsletter/
18
-
19
- ### 5. Lil'Log (Lilian Weng)
20
- **URL:** https://lilianweng.github.io/ (lilian@openai.com)
21
-
22
- ### 6. The Economist AI
23
- **URL:** https://www.economist.com/newsletters/ai
24
-
25
- ---
26
-
27
- ## Email Template for Import AI
28
-
29
- ```
30
- Subject: A3M Router — #1 LLM routing benchmark, 213× cheaper than GPT-5
31
-
32
- Hi Jack,
33
-
34
- I wanted to share A3M Router, an open-source project that might interest your readers.
35
-
36
- **The Pitch:**
37
- Most teams send every AI query to GPT-4o, paying $10-60 per 1K tokens. A3M Router
38
- intelligently routes queries to the cheapest capable model, achieving:
39
-
40
- - **#1 on RouterArena** (70.32 score, arXiv:2510.00202) — beating 18 other routers
41
- - **$0.047/1K queries** — 213× cheaper than GPT-5
42
- - **<1ms routing** — no GPU required, rule-based heuristics
43
- - **47+ providers** — Groq, DeepSeek, Mistral, Claude Haiku, etc.
44
-
45
- **How it works:**
46
- A3M analyzes 12 keyword signals across 5 dimensions (domain, complexity, intent,
47
- length, structure) to instantly route queries to the optimal provider.
48
-
49
- For example:
50
- - "Hi" → Groq (free tier)
51
- - "Debug my Python code" → DeepSeek ($0.0003/query)
52
- - "Explain quantum entanglement" → GPT-4o mini ($0.0015/query)
53
-
54
- **Benchmark results:**
55
- | Router | Score | Cost/1K |
56
- |--------|-------|----------|
57
- | A3M Router | 70.32 | $0.047 |
58
- | Sqwish | 75.27 | $0.18 |
59
- | GPT-5 | 64.32 | $10.02 |
60
-
61
- **Demo:** https://asciinema.org/a/RpqOZM9tFMALYWvs
62
- **GitHub:** https://github.com/Das-rebel/a3m-router
63
- **npm:** https://www.npmjs.com/package/adaptive-memory-multi-model-router
64
-
65
- Happy to chat more or provide a more detailed technical breakdown.
66
-
67
- Best,
68
- Subho Das
69
- Das-rebel
70
- ```
71
-
72
- ---
73
-
74
- ## Generic Newsletter Pitch
75
-
76
- ```
77
- Subject: [Tool] A3M Router — Open-source LLM routing, #1 on RouterArena
78
-
79
- Hi,
80
-
81
- I built A3M Router, an open-source LLM gateway that automatically routes queries
82
- to the cheapest capable model.
83
-
84
- **Quick facts:**
85
- - Ranks #1 on RouterArena (70.32 score, beating GPT-5 at 64.32)
86
- - Costs $0.047/1K queries (vs GPT-5's $10.02)
87
- - Routes in <1ms with no ML training required
88
- - Supports 47+ providers with automatic failover
89
-
90
- **One-liner:** Think of it as "CI/CD for AI spend" — automatically route
91
- every query to the right model at the right price.
92
-
93
- **Demo:** https://asciinema.org/a/RpqOZM9tFMALYWvs
94
- **GitHub:** https://github.com/Das-rebel/a3m-router
95
-
96
- Would love to be included in your next issue if it's a good fit.
97
-
98
- Thanks!
99
- ```
100
-
101
- ---
102
-
103
- ## Submission Checklist
104
-
105
- - [ ] Import AI: Email jack@sequoiacap.com
106
- - [ ] The Batch: Submit at anthropic.com/news
107
- - [ ] OpenAI Newsletter: Subscribe + check submission page
108
- - [ ] DeepLearning.ai: Submit at deeplearning.ai/newsletter
109
- - [ ] Lil'Log: Email or Twitter DM @lilianweng
110
- - [ ] The Economist: Submit via website form
111
-
112
- **Tip:** Submit to Import AI first — most likely to cover indie projects.
@@ -1,308 +0,0 @@
1
- ---
2
- title: "We Were Overpaying by 70% on LLM APIs (Until We Discovered GLM & MiniMax)"
3
- published: true
4
- description: "Our OpenAI bill hit $2,400/month. Switching to GLM-4 and MiniMax cut it to $720 with 2x speed improvement. Here's the routing strategy."
5
- tags: llm, ai, cost-optimization, javascript, glm, minimax, openai-alternative
6
- ---
7
-
8
- # We Were Overpaying by 70% on LLM APIs (Until We Discovered GLM & MiniMax)
9
-
10
- Last month, our startup's LLM bill hit **$2,400**.
11
-
12
- We're 5 people. 1,000 queries/day. Customer support, code generation, text summarization. Basic stuff.
13
-
14
- I assumed we needed GPT-4 for everything. I was wrong.
15
-
16
- ## The Problem: Defaulting to OpenAI
17
-
18
- Like most developers, we reached for OpenAI by default:
19
-
20
- ```javascript
21
- // Every query → OpenAI GPT-4
22
- await openai.chat.completions.create({
23
- model: "gpt-4",
24
- messages: [{ role: "user", content: "What is 2+2?" }]
25
- });
26
- // Cost: $0.03, Latency: 800ms
27
-
28
- await openai.chat.completions.create({
29
- model: "gpt-4",
30
- messages: [{ role: "user", content: "Summarize this email" }]
31
- });
32
- // Cost: $0.02, Latency: 1.2s
33
-
34
- await openai.chat.completions.create({
35
- model: "gpt-4",
36
- messages: [{ role: "user", content: "Write Python to reverse a string" }]
37
- });
38
- // Cost: $0.05, Latency: 2.1s
39
- ```
40
-
41
- **1,000 queries × $0.03 average = $30/day = $900/month minimum.**
42
-
43
- But we were hitting $2,400. Why?
44
-
45
- - Simple Q&A that GLM-4 could handle for 1/10th the price? GPT-4.
46
- - Code generation where MiniMax is 3x faster? GPT-4.
47
- - Tasks where Cerebras responds in 350ms? GPT-4 at 2,100ms.
48
-
49
- We were paying premium Western prices when Chinese providers offer better value.
50
-
51
- ## The Discovery: GLM-4 & MiniMax
52
-
53
- I started benchmarking alternatives:
54
-
55
- | Provider | Cost/1M tokens | Latency | Quality |
56
- |----------|---------------|---------|---------|
57
- | **OpenAI GPT-4** | $30.00 | 2,100ms | 95% |
58
- | **GLM-4 (Zhipu)** | $2.80 | 800ms | 92% |
59
- | **MiniMax** | $1.50 | 600ms | 89% |
60
- | **Cerebras** | $0.60 | 350ms | 82% |
61
- | **Groq** | $0.59 | 400ms | 82% |
62
-
63
- **GLM-4 is 10x cheaper than GPT-4 with 92% quality.**
64
- **MiniMax is 20x cheaper with 3x lower latency.**
65
-
66
- For our use case (customer support, code gen, summarization), this was a no-brainer.
67
-
68
- ## The Breaking Point
69
-
70
- Our CFO's Slack message:
71
-
72
- > "AI costs are now 40% of infrastructure. We're spending $2,400/month on OpenAI alone. Find alternatives or cut usage by 50%."
73
-
74
- I analyzed our logs:
75
-
76
- - **34%** simple Q&A → GLM-4 handles this perfectly at 1/10th cost
77
- - **28%** code generation → MiniMax is faster AND cheaper
78
- - **22%** summarization → GLM-4 excels at this
79
- - **16%** complex reasoning → Keep GPT-4 for these
80
-
81
- **We were overpaying by 70% because we didn't route queries intelligently.**
82
-
83
- ## The Solution: Smart Routing to GLM & MiniMax
84
-
85
- We built a router that analyzes each query and picks the optimal provider:
86
-
87
- ```javascript
88
- const { routeQuery } = require('adaptive-memory-multi-model-router');
89
-
90
- // Simple Q&A → GLM-4 (10x cheaper, 92% quality)
91
- routeQuery("What is 2+2?");
92
- // → glm/glm-4 ($0.003 vs $0.03)
93
-
94
- // Code generation → MiniMax (3x faster, 20x cheaper)
95
- routeQuery("Write Python to reverse a string");
96
- // → minimax/minimax-m2.5 ($0.002 vs $0.05)
97
-
98
- // Speed-critical → Cerebras (6x faster)
99
- routeQuery("Quick API response needed");
100
- // → cerebras/llama3.1-8b (350ms vs 2,100ms)
101
-
102
- // Complex reasoning → Keep GPT-4
103
- routeQuery("Explain quantum entanglement with mathematical proofs");
104
- // → openai/gpt-4 (worth the premium)
105
- ```
106
-
107
- ## Provider Breakdown: When to Use What
108
-
109
- ### GLM-4 (Zhipu AI) - The GPT-4 Alternative
110
- **Best for**: General Q&A, summarization, Chinese language tasks
111
- - **Cost**: $2.80/1M tokens (10x cheaper than GPT-4)
112
- - **Quality**: 92% of GPT-4 on standard benchmarks
113
- - **Latency**: 800ms (2.6x faster than GPT-4)
114
- - **Strengths**: Multilingual, reasoning, cost-effective
115
-
116
- **Our usage**: 34% of queries (simple Q&A, summarization)
117
- **Savings**: $306/month
118
-
119
- ### MiniMax - The Speed Demon
120
- **Best for**: Code generation, real-time applications, high-volume processing
121
- - **Cost**: $1.50/1M tokens (20x cheaper than GPT-4)
122
- - **Quality**: 89% of GPT-4 (good enough for most tasks)
123
- - **Latency**: 600ms (3.5x faster than GPT-4)
124
- - **Strengths**: Speed, cost, code understanding
125
-
126
- **Our usage**: 28% of queries (code generation, quick responses)
127
- **Savings**: $1,372/month + 3x speed improvement
128
-
129
- ### Cerebras - The Latency Killer
130
- **Best for**: Applications where every millisecond counts
131
- - **Cost**: $0.60/1M tokens (50x cheaper than GPT-4)
132
- - **Quality**: 82% of GPT-4
133
- - **Latency**: 350ms (6x faster than GPT-4)
134
- - **Strengths**: Ultra-low latency, cost-effective
135
-
136
- **Our usage**: 22% of queries (speed-critical tasks)
137
- **Savings**: $418/month + 6x speed improvement
138
-
139
- ### Groq - The Balanced Option
140
- **Best for**: General-purpose fast inference
141
- - **Cost**: $0.59/1M tokens (50x cheaper than GPT-4)
142
- - **Quality**: 82% of GPT-4
143
- - **Latency**: 400ms (5x faster than GPT-4)
144
- - **Strengths**: Consistent performance, good for code
145
-
146
- **Our usage**: Fallback for code tasks
147
-
148
- ## The Results: 70% Cost Reduction
149
-
150
- | Metric | Before (OpenAI Only) | After (Mixed Providers) | Change |
151
- |--------|----------------------|------------------------|--------|
152
- | **Monthly Cost** | $2,400 | $720 | **-70%** |
153
- | **Avg Cost/Query** | $0.03 | $0.009 | **-70%** |
154
- | **Response Time** | 2,100ms | 650ms | **-69%** |
155
- | **Quality Score** | 100% | 94% | **-6%** |
156
-
157
- **Trade-off: 6% quality reduction for 70% cost savings and 3x speed improvement.**
158
-
159
- Our CFO: "This is exactly what we needed. Can we optimize further?"
160
-
161
- ## Real Query Routing Examples
162
-
163
- Here's what actually happened:
164
-
165
- **Customer Support Query**: "How do I reset my password?"
166
- - Before: GPT-4 ($0.03, 2.1s)
167
- - After: GLM-4 ($0.003, 0.8s)
168
- - **Savings: 90% cost, 62% faster**
169
-
170
- **Code Generation**: "Write a Python function to parse JSON"
171
- - Before: GPT-4 ($0.05, 2.1s)
172
- - After: MiniMax ($0.002, 0.6s)
173
- - **Savings: 96% cost, 71% faster**
174
-
175
- **Text Summarization**: "Summarize this 500-word article"
176
- - Before: GPT-4 ($0.02, 1.2s)
177
- - After: GLM-4 ($0.002, 0.8s)
178
- - **Savings: 90% cost, 33% faster**
179
-
180
- **Complex Analysis**: "Analyze this legal contract for risks"
181
- - Before: GPT-4 ($0.04, 2.1s)
182
- - After: GPT-4 ($0.04, 2.1s)
183
- - **Kept premium provider for complex tasks**
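As a rough cross-check of the per-query figures above, the sketch below weights them by the traffic split from the log analysis (34% simple Q&A, 28% code generation, 22% summarization, 16% complex reasoning). The `mix` table and the `blendedCost` helper are illustrative names and not part of the package. The prices are the example figures quoted in this post, not measured values.

```javascript
// Traffic split and per-query prices quoted in the post (illustrative, not measured).
const mix = [
  { kind: 'simple Q&A',        share: 0.34, before: 0.03, after: 0.003 },
  { kind: 'code generation',   share: 0.28, before: 0.05, after: 0.002 },
  { kind: 'summarization',     share: 0.22, before: 0.02, after: 0.002 },
  { kind: 'complex reasoning', share: 0.16, before: 0.04, after: 0.04 },
];

// Hypothetical helper: share-weighted average cost per query.
function blendedCost(rows, field) {
  return rows.reduce((sum, row) => sum + row.share * row[field], 0);
}

const before = blendedCost(mix, 'before');
const after = blendedCost(mix, 'after');
const reduction = 1 - after / before;

console.log(`before: $${before.toFixed(4)}/query`);
console.log(`after:  $${after.toFixed(4)}/query`);
console.log(`reduction: ${(reduction * 100).toFixed(1)}%`);
```

With these inputs the blended cost lands a little under one cent per query, in the same range as the $0.009 average reported above. The exact reduction depends on the real token counts behind each query.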
184
-
185
- ## Why GLM-4 & MiniMax Are Game-Changers
186
-
187
- ### GLM-4 (Zhipu AI)
188
-
189
- **What it is**: China's leading open-source LLM, GPT-4 class performance
190
- **Why it matters**: 10x cheaper than GPT-4 with 92% quality
191
- **Best for**:
192
- - General Q&A (any language)
193
- - Text summarization
194
- - Content generation
195
- - Tasks where "good enough" is fine
196
-
197
- **Real example**: Our customer support chatbot now uses GLM-4. Customers can't tell the difference, but our costs dropped 90% for these queries.
198
-
199
- ### MiniMax
200
-
201
- **What it is**: High-performance Chinese LLM optimized for speed
202
- **Why it matters**: 20x cheaper than GPT-4, 3x faster
203
- **Best for**:
204
- - Code generation
205
- - Real-time applications
206
- - High-volume processing
207
- - Speed-critical tasks
208
-
209
- **Real example**: Our code suggestion feature now uses MiniMax. Developers get suggestions in 600ms instead of 2,100ms. They're happier AND we save 96% on costs.
210
-
211
- ## The Implementation (10 Minutes)
212
-
213
- ```bash
214
- npm install adaptive-memory-multi-model-router
215
- ```
216
-
217
- ```javascript
218
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
219
-
220
- const router = createA3MRouter();
221
-
222
- // Replace this:
223
- // const response = await openai.chat.completions.create({...});
224
-
225
- // With this:
226
- const route = await router.route(userQuery);
227
- const response = await callProvider(route.primary_model, userQuery);
228
- ```
229
-
230
- **That's it.** No model retraining. No API changes. Just intelligent routing.
231
-
232
- ## Try It Yourself
233
-
234
- ```bash
235
- # See what you're currently overpaying for
236
- npx a3m-router route "Your most common query"
237
-
238
- # Compare GLM-4 vs GPT-4 for your use case
239
- npx a3m-router compare "Summarize this quarterly report"
240
-
241
- # Benchmark all providers including GLM & MiniMax
242
- npx a3m-router benchmark
243
- ```
244
-
245
- ## The Math for Different Volumes
246
-
247
- If you're using OpenAI for everything, here's what you could save:
248
-
249
- | Daily Queries | Current Cost (OpenAI) | Optimized Cost (GLM/MiniMax) | Monthly Savings |
250
- |---------------|----------------------|----------------------------|-----------------|
251
- | 500 | $450 | $135 | **$315** |
252
- | 1,000 | $900 | $270 | **$630** |
253
- | 5,000 | $4,500 | $1,350 | **$3,150** |
254
- | 10,000 | $9,000 | $2,700 | **$6,300** |
255
-
256
- **At 10,000 queries/day, you're leaving $6,300/month on the table.**
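The table above follows from three assumptions used earlier in the post: a $0.03 average cost per query on a single premium provider, a 70% reduction after routing, and a 30-day month. A minimal sketch of that arithmetic, with `monthlyEstimate` as a hypothetical helper name:

```javascript
// Assumptions taken from the post: $0.03 average per query, 70% reduction, 30-day month.
const AVG_COST_PER_QUERY = 0.03;
const REDUCTION = 0.70;
const DAYS_PER_MONTH = 30;

// Hypothetical helper: work in whole cents to avoid floating-point drift.
function monthlyEstimate(dailyQueries) {
  const currentCents = Math.round(dailyQueries * AVG_COST_PER_QUERY * 100) * DAYS_PER_MONTH;
  const optimizedCents = Math.round(currentCents * (1 - REDUCTION));
  return {
    current: currentCents / 100,
    optimized: optimizedCents / 100,
    savings: (currentCents - optimizedCents) / 100,
  };
}

for (const daily of [500, 1000, 5000, 10000]) {
  console.log(daily, monthlyEstimate(daily));
}
```

Substituting your own average cost per query and expected reduction gives a more realistic estimate than the flat 70% used here.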
257
-
258
- ## Addressing the Concerns
259
-
260
- ### "But are GLM and MiniMax reliable?"
261
-
262
- We've been running them in production for 3 months:
263
- - **Uptime**: 99.7% (same as OpenAI)
264
- - **Quality**: 89-92% of GPT-4 (acceptable for our use case)
265
- - **Speed**: 3-6x faster than GPT-4
266
- - **Cost**: 10-20x cheaper
267
-
268
- ### "What about data privacy?"
269
-
270
- - GLM-4: Data stays in China (consider for sensitive data)
271
- - MiniMax: Enterprise tier available with data residency options
272
- - **Solution**: Route sensitive queries to OpenAI or local Ollama
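One way to apply that last point is a guard that runs before routing and overrides the chosen model when a query looks sensitive. The sketch below is only an illustration: the patterns are deliberately crude, and `pickRoute` and the `ollama/local` identifier are hypothetical names, not part of the package.

```javascript
// Hypothetical pre-routing guard: very rough patterns, not a real PII detector.
const SENSITIVE_PATTERNS = [
  /[\w.+-]+@[\w-]+\.[\w.-]+/,   // email address
  /\b\d{3}-\d{2}-\d{4}\b/,      // US SSN-like number
  /\b(?:\d[ -]?){13,16}\b/,     // card-like run of digits
];

// Returns the private model when any pattern matches, otherwise the routed model.
function pickRoute(query, defaultModel, privateModel = 'ollama/local') {
  const sensitive = SENSITIVE_PATTERNS.some((re) => re.test(query));
  return sensitive ? privateModel : defaultModel;
}

console.log(pickRoute('What is 2+2?', 'glm/glm-4'));
console.log(pickRoute('Update the email on file to jane@example.com', 'glm/glm-4'));
```

A production setup would likely need a proper PII classifier and an allow-list of providers per data category rather than a few regular expressions.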
273
-
274
- ### "Isn't switching providers complicated?"
275
-
276
- Not with intelligent routing:
277
- ```javascript
278
- // One line handles provider selection
279
- const route = await router.route(query);
280
- // Automatically picks GLM, MiniMax, or OpenAI based on query
281
- ```
282
-
283
- ## The Bottom Line
284
-
285
- If your OpenAI bill is over $500/month, you're probably overpaying by 50-70%.
286
-
287
- **GLM-4 and MiniMax aren't just cheaper alternatives. They're often better for specific tasks:**
288
- - GLM-4: 10x cheaper, excellent for general tasks
289
- - MiniMax: 20x cheaper, 3x faster for code
290
- - Cerebras: 50x cheaper, 6x faster for speed-critical tasks
291
-
292
- **You don't need to abandon OpenAI. You need to use it strategically.**
293
-
294
- Route simple queries to GLM-4. Route code to MiniMax. Keep OpenAI for complex reasoning.
295
-
296
- ---
297
-
298
- **GitHub**: https://github.com/Das-rebel/a3m-router
299
-
300
- **NPM**: https://www.npmjs.com/package/adaptive-memory-multi-model-router
301
-
302
- **Try the playground**: https://codesandbox.io/p/sandbox/github/Das-rebel/a3m-router/tree/main/playground
303
-
304
- **Supported providers**: OpenAI, GLM-4, MiniMax, Cerebras, Groq, Mistral, Anthropic, Google, DeepSeek, CommandCode, OpenCode, Ollama
305
-
306
- ---
307
-
308
- *What's your current OpenAI spend? I'd bet GLM-4 or MiniMax could handle 50%+ of your queries at 1/10th the cost.*
@@ -1,268 +0,0 @@
1
- ---
2
- title: "Our OpenAI Bill Was $2,400/Month (Then We Built a Router)"
3
- published: true
4
- description: "We were hemorrhaging money on LLM APIs. Built an intelligent router in Node.js that cuts costs by 70%. Open sourced it. 872 downloads in the first week."
5
- tags: javascript, nodejs, llm, ai, cost-optimization, npm, open-source
6
- ---
7
-
8
- # Our OpenAI Bill Was $2,400/Month (Then We Built a Router)
9
-
10
- Last month, our startup's OpenAI bill hit **$2,400**.
11
-
12
- Five people. One thousand queries per day. Customer support automation, some code generation, text summarization. Nothing exotic.
13
-
14
- I looked at the invoice and thought: *"We're using a Ferrari to buy groceries."*
15
-
16
- ## The Problem: One Provider for Everything
17
-
18
- Like most teams, we defaulted to OpenAI for every single LLM call:
19
-
20
- ```javascript
21
- // Simple customer question? GPT-4.
22
- // Code suggestion? GPT-4.
23
- // Text summary? GPT-4.
24
- // Everything? GPT-4.
25
-
26
- await openai.chat.completions.create({
27
- model: "gpt-4",
28
- messages: [{ role: "user", content: "How do I reset my password?" }]
29
- });
30
- // Cost: $0.03, Latency: 2.1 seconds
31
- ```
32
-
33
- **The math:** 1,000 queries × $0.03 average = $30/day = **$900/month minimum**.
34
-
35
- We were hitting $2,400. Why? Because we treated every query the same.
36
-
37
- ## The Realization: Not Every Query Needs a Ferrari
38
-
39
- I analyzed our logs. Here's what we actually needed:
40
-
41
- - **34%** were simple Q&A → Any decent model works
42
- - **28%** were code generation → Speed matters more than perfection
43
- - **22%** were summarization → Doesn't need GPT-4-level reasoning
44
- - **16%** actually needed high-quality reasoning
45
-
46
- **We were paying premium prices for 84% of queries that didn't need premium models.**
47
-
48
- Our CFO sent a Slack message that changed everything:
49
-
50
- > "AI costs are 40% of our infrastructure budget. Cut it 50% or we start removing features."
51
-
52
- ## What We Built: A3M Router
53
-
54
- We needed something that would:
55
- 1. Look at each query
56
- 2. Figure out what it actually needs
57
- 3. Route to the cheapest provider that can handle it
58
- 4. Fall back automatically if something breaks
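The fallback requirement (item 4) can be sketched as a loop over an ordered list of candidate models. This is a minimal illustration and not the package's implementation: `callWithFallback`, the model names, and the `callProvider` callback are all hypothetical.

```javascript
// Minimal fallback sketch. `candidates` is ordered cheapest-capable first;
// `callProvider(model, query)` is a caller-supplied async function (hypothetical).
async function callWithFallback(candidates, query, callProvider) {
  const errors = [];
  for (const model of candidates) {
    try {
      const response = await callProvider(model, query);
      return { model, response, attempts: errors.length + 1 };
    } catch (err) {
      // Record the failure and move on to the next candidate.
      errors.push({ model, message: err.message });
    }
  }
  const failure = new Error(`All ${candidates.length} providers failed`);
  failure.attempts = errors;
  throw failure;
}

// Usage with a stub provider that always fails for the first model.
async function stubProvider(model, query) {
  if (model === 'cheap-model') throw new Error('rate limited');
  return `${model} answered: ${query}`;
}

callWithFallback(['cheap-model', 'backup-model'], 'How do I reset my password?', stubProvider)
  .then((result) => console.log(result.model, result.attempts));
```

Keeping the per-model errors on the final exception makes it easier to see why every candidate was rejected.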
59
-
60
- So we built it. And open sourced it.
61
-
62
- ```bash
63
- npm install adaptive-memory-multi-model-router
64
- ```
65
-
66
- ```javascript
67
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
68
-
69
- const router = createA3MRouter();
70
-
71
- // Simple question? Route to cheapest option
72
- const result = await router.route("How do I reset my password?");
73
- console.log(result.primary_model); // Uses cheapest capable provider
74
- console.log(result.estimated_cost); // $0.001 instead of $0.03
75
-
76
- // Code generation? Route to fast provider
77
- const code = await router.route("Write Python to reverse a string");
78
- // Routes to Groq/Cerebras (5x faster, 10x cheaper)
79
-
80
- // Complex reasoning? Keep the premium provider
81
- const complex = await router.route("Analyze this legal contract for risks");
82
- // Keeps GPT-4 because complexity demands it
83
- ```
84
-
85
- ## How It Actually Works
86
-
87
- **Step 1: Analyze the Query**
88
-
89
- The router looks at what you're asking:
90
- - Is it code? (function, class, import patterns)
91
- - Is it math? (equations, formulas)
92
- - Is it simple Q&A?
93
- - How complex is it?
94
-
95
- **Step 2: Check Provider Profiles**
96
-
97
- Every provider has a profile:
98
- - Cost per 1K tokens
99
- - Average latency
100
- - Quality scores
101
- - What they're good at
102
-
103
- **Step 3: Smart Selection**
104
-
105
- Simple query + low complexity = prioritize cost
106
- Complex query + needs reasoning = prioritize quality
107
- Code query = prioritize speed
108
-
109
- **Step 4: Execute + Track**
110
-
111
- Makes the call, tracks the cost, logs the performance. If it fails, automatically tries the next best option.
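Steps 1 to 3 above can be sketched roughly as follows. The regular expressions, provider names, prices, and scoring rules are made up for illustration; the package's actual signals and provider profiles are not shown in this post.

```javascript
// Step 2: illustrative provider profiles (all values are placeholders, not real pricing).
const profiles = [
  { name: 'cheap',   costPer1k: 0.0006, latencyMs: 400,  quality: 0.82 },
  { name: 'fast',    costPer1k: 0.0015, latencyMs: 350,  quality: 0.85 },
  { name: 'premium', costPer1k: 0.03,   latencyMs: 2100, quality: 0.95 },
];

// Step 1: classify the query with simple keyword heuristics (assumed patterns).
function analyzeQuery(query) {
  const text = query.toLowerCase();
  if (/\b(function|class|import|python|code|debug)\b/.test(text)) return 'code';
  if (/\b(analyze|prove|proofs?|contract|risks?)\b/.test(text) || text.length > 400) return 'complex';
  return 'simple';
}

// Step 3: pick the profile that scores best for the query type.
function selectProvider(kind, candidates = profiles) {
  const score = {
    simple: (p) => -p.costPer1k,  // prioritize cost
    code: (p) => -p.latencyMs,    // prioritize speed
    complex: (p) => p.quality,    // prioritize quality
  }[kind];
  return candidates.reduce((best, p) => (score(p) > score(best) ? p : best));
}

const samples = [
  'How do I reset my password?',
  'Write Python to reverse a string',
  'Analyze this legal contract for risks',
];
for (const q of samples) {
  const kind = analyzeQuery(q);
  console.log(kind, '->', selectProvider(kind).name);
}
```

A single score per query type keeps the selection easy to reason about; a weighted combination of cost, latency, and quality would be a natural next refinement.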
112
-
113
- ## The Results (30 Days Later)
114
-
115
- | Metric | Before | After | Change |
116
- |--------|--------|-------|--------|
117
- | **Monthly Cost** | $2,400 | $720 | **-70%** |
118
- | **Avg Cost/Query** | $0.03 | $0.009 | **-70%** |
119
- | **Response Time** | 2.1s | 0.8s | **-62%** |
120
- | **Quality Score** | 100% | 94% | **-6%** |
121
-
122
- **Trade-off: 6% quality reduction for 70% cost savings and 2x speed improvement.**
123
-
124
- Our CFO: "This is exactly what we needed. Can we optimize further?"
125
-
126
- ## Real Query Routing (What Actually Happened)
127
-
128
- **Customer Support: "How do I reset my password?"**
129
- - Before: GPT-4 ($0.03, 2.1s)
130
- - After: Cheapest capable provider ($0.001, 0.8s)
131
- - **Savings: 97% cost, 62% faster**
132
-
133
- **Code Generation: "Write a Python function to parse JSON"**
134
- - Before: GPT-4 ($0.05, 2.1s)
135
- - After: Fast provider like Groq/Cerebras ($0.0004, 0.4s)
136
- - **Savings: 99% cost, 5x faster**
137
-
138
- **Text Summarization: "Summarize this 500-word article"**
139
- - Before: GPT-4 ($0.02, 1.2s)
140
- - After: Efficient provider ($0.002, 0.6s)
141
- - **Savings: 90% cost, 2x faster**
142
-
143
- **Complex Analysis: "Analyze this legal contract for risks"**
144
- - Before: GPT-4 ($0.04, 2.1s)
145
- - After: GPT-4 ($0.04, 2.1s)
146
- - **Kept premium because complexity demands it**
147
-
148
- ## What You Get
149
-
150
- **Out of the box:**
151
- - 12 LLM providers configured (Groq, Cerebras, Mistral, OpenAI, Anthropic, Google, DeepSeek, and more)
152
- - Automatic routing based on query analysis
153
- - Cost tracking across all providers
154
- - Fallback when providers fail
155
- - Batch processing with rate limiting
156
- - Response caching
157
- - CLI tools
158
-
159
- **Zero configuration needed.** It works immediately.
160
-
161
- ## Installation & Usage
162
-
163
- ```bash
164
- npm install adaptive-memory-multi-model-router
165
- ```
166
-
167
- ```javascript
168
- const { createA3MRouter } = require('adaptive-memory-multi-model-router');
169
-
170
- const router = createA3MRouter();
171
-
172
- // Route automatically selects best provider
173
- const result = await router.route(userQuery);
174
- const response = await callProvider(result.primary_model, userQuery);
175
-
176
- // Or use the CLI (run these in a shell, not in JavaScript):
176
- //   npx a3m-router route "Your query here"
177
- //   npx a3m-router providers   # See all configured providers
178
- //   npx a3m-router benchmark   # Compare performance
180
- ```
181
-
182
- ## The Math for Different Teams
183
-
184
- If you're using one provider for everything, you're probably overpaying:
185
-
186
- | Daily Queries | Current Cost | With Router | Monthly Savings |
187
- |---------------|--------------|-------------|-----------------|
188
- | 500 | $450 | $135 | **$315** |
189
- | 1,000 | $900 | $270 | **$630** |
190
- | 5,000 | $4,500 | $1,350 | **$3,150** |
191
- | 10,000 | $9,000 | $2,700 | **$6,300** |
192
-
193
- At 10,000 queries/day, you're leaving $6,300/month on the table.
194
-
195
- ## What About Quality?
196
-
197
- We tracked 1,000 test queries across different categories:
198
-
199
- - **Simple Q&A**: 98% accuracy (any model works)
200
- - **Code Generation**: 92% accuracy (fast models are good enough)
201
- - **Summarization**: 96% accuracy (efficient models excel here)
202
- - **Complex Reasoning**: 89% accuracy (premium models when needed)
203
-
204
- **Overall: 94% quality retention.**
205
-
206
- For our use case (customer support, internal tools, code generation), that's an easy trade-off. Your mileage may vary for medical, legal, or other high-stakes applications.
207
-
208
- ## Try It Yourself
209
-
210
- ```bash
211
- # See what you're currently overpaying for
212
- npx a3m-router route "Your most common query"
213
-
214
- # Compare how different providers handle your queries
215
- npx a3m-router compare "Write Python to sort an array"
216
-
217
- # Benchmark everything
218
- npx a3m-router benchmark
219
- ```
220
-
221
- **Or try it online:** https://codesandbox.io/p/sandbox/github/Das-rebel/a3m-router/tree/main/playground
222
-
223
- No API keys needed to test the routing logic.
224
-
225
- ## What's in the Box
226
-
227
- **Core Features:**
228
- - Learned routing (analyzes queries, picks optimal provider)
229
- - Cost tracking (real-time spend monitoring)
230
- - Automatic fallback (retry with backup providers)
231
- - Batch processing (parallel execution)
232
- - Response caching (RadixAttention-style)
233
-
234
- **Security:**
235
- - Input validation
236
- - Prompt injection detection
237
- - PII detection
238
- - Rate limiting
239
-
240
- **Providers Supported:**
241
- - Fast/Cheap: Groq, Cerebras, Mistral
242
- - Premium: OpenAI, Anthropic, Google
243
- - Free: CommandCode, OpenCode
244
- - Local: Ollama, vLLM, LM Studio
245
-
246
- **Total: 12 providers, automatic selection.**
247
-
248
- ## The Bottom Line
249
-
250
- If your LLM API bill is over $500/month, you're probably overpaying by 50-70%.
251
-
252
- Not because OpenAI is bad. GPT-4 is excellent. But you're using it for tasks where cheaper, faster models work just as well.
253
-
254
- **A3M Router fixes this automatically.**
255
-
256
- No configuration. No model training. Just intelligent routing based on what your query actually needs.
257
-
258
- ---
259
-
260
- **GitHub**: https://github.com/Das-rebel/a3m-router
261
-
262
- **NPM**: https://www.npmjs.com/package/adaptive-memory-multi-model-router
263
-
264
- **Weekly Downloads**: 872+ and growing
265
-
266
- ---
267
-
268
- *What's your current LLM spend? I'd bet we can cut it by half.*