adaptive-memory-multi-model-router 2.14.46 → 2.14.47

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (598)
  1. package/{docs/llms.txt → llms.txt.bak} +6 -6
  2. package/package.json +13 -84
  3. package/src/routing/advancedRouter.ts.bak +650 -0
  4. package/test.js.bak +376 -0
  5. package/.dockerignore +0 -82
  6. package/.env.example +0 -303
  7. package/.github/DISCUSSIONS_WELCOME.md +0 -27
  8. package/.github/DISCUSSION_TEMPLATE.yml +0 -5
  9. package/.github/FUNDING.yml +0 -2
  10. package/.github/ISSUE_TEMPLATE/bug_report.md +0 -94
  11. package/.github/ISSUE_TEMPLATE/config.yml +0 -17
  12. package/.github/ISSUE_TEMPLATE/feature_request.md +0 -71
  13. package/.github/PULL_REQUEST_TEMPLATE.md +0 -71
  14. package/.github/dependabot.yml +0 -9
  15. package/.github/workflows/auto-publish.yml +0 -51
  16. package/.github/workflows/ci.yml +0 -263
  17. package/.github/workflows/codeql.yml +0 -38
  18. package/.github/workflows/npm-publish.yml +0 -20
  19. package/.github/workflows/pages.yml +0 -37
  20. package/.github/workflows/stale.yml +0 -54
  21. package/.publish-tick +0 -1
  22. package/.well-known/ai-plugin.json +0 -16
  23. package/AGENT_COUNCIL_FINDINGS.md +0 -142
  24. package/ARCHITECTURE.md +0 -346
  25. package/AUDIT_REPORT.md +0 -28
  26. package/CODE_OF_CONDUCT.md +0 -128
  27. package/CONTRIBUTING.md +0 -50
  28. package/CONTRIBUTORS.md +0 -20
  29. package/Dockerfile +0 -53
  30. package/Dockerfile.proxy +0 -33
  31. package/HEALTH_REPORT.md +0 -118
  32. package/IMPROVEMENT_PLAN.md +0 -107
  33. package/LANDING.md +0 -43
  34. package/LAUNCH-PAIN-DRIVEN.md +0 -339
  35. package/LAUNCH.md +0 -337
  36. package/LAUNCH_CHECKLIST.md +0 -141
  37. package/LAUNCH_SNAPSHOT.md +0 -260
  38. package/MANIFESTO.md +0 -41
  39. package/POPULARITY_BOOSTERS.md +0 -285
  40. package/PR_STATUS_REPORT.md +0 -148
  41. package/REDESIGN.md +0 -95
  42. package/RUNKIT.md +0 -83
  43. package/SECURITY.md +0 -29
  44. package/SUBMISSIONS.md +0 -43
  45. package/_schema.html +0 -53
  46. package/ai-plugin.json +0 -16
  47. package/articles/AI_AGENT_LLM_ROUTING.md +0 -150
  48. package/articles/CHINESE_DIRECTORIES.md +0 -100
  49. package/articles/CHINESE_SUBMISSIONS_READY.md +0 -322
  50. package/articles/COMPETITOR_ALERTS.md +0 -31
  51. package/articles/COMPLETE_POSTING_DIRECTORY.md +0 -147
  52. package/articles/CONTENT_STRUCTURE.md +0 -292
  53. package/articles/DEVTO_COST_GUIDE.md +0 -473
  54. package/articles/DEVTO_FINAL.md +0 -416
  55. package/articles/DEVTO_MULTI_PROVIDER.md +0 -542
  56. package/articles/DEVTO_READY.md +0 -255
  57. package/articles/DEVTO_V2_ANNOUNCEMENT.md +0 -160
  58. package/articles/DEVTO_VIRAL_GROWTH.md +0 -280
  59. package/articles/FRESH_devto.md +0 -460
  60. package/articles/FRESH_devto_2026_05.md +0 -73
  61. package/articles/FRESH_hackernews.md +0 -14
  62. package/articles/FRESH_reddit_ml.md +0 -90
  63. package/articles/FRESH_reddit_node.md +0 -198
  64. package/articles/FRESH_reddit_sideproject.md +0 -72
  65. package/articles/FRESH_reddit_webdev.md +0 -130
  66. package/articles/FROM_ZERO_TO_10K.md +0 -107
  67. package/articles/HN_10X_BETTER.md +0 -430
  68. package/articles/HN_ACCOUNT_GUIDE.md +0 -21
  69. package/articles/HN_CHINESE_STYLE.md +0 -308
  70. package/articles/HN_FINAL.md +0 -148
  71. package/articles/HN_POSTED_VERSION.md +0 -56
  72. package/articles/HN_POST_READY.md +0 -137
  73. package/articles/HN_RESEARCH.md +0 -364
  74. package/articles/HN_SHOW_routerarena.md +0 -17
  75. package/articles/HN_TIMING_GUIDE.md +0 -52
  76. package/articles/INDIEHACKERS_POST.md +0 -52
  77. package/articles/INDIEHACKERS_READY.md +0 -120
  78. package/articles/LLM_BENCHMARK_DEEP_DIVE.md +0 -153
  79. package/articles/MASTER_POSTING_DIRECTORY.md +0 -189
  80. package/articles/NEWSLETTER_SEND_NOW.md +0 -259
  81. package/articles/NEWSLETTER_SUBMISSIONS.md +0 -112
  82. package/articles/PAIN-DRIVEN-devto-v2.md +0 -308
  83. package/articles/PAIN-DRIVEN-devto-v3.md +0 -268
  84. package/articles/PAIN-DRIVEN-devto.md +0 -242
  85. package/articles/PAIN-DRIVEN-hackernews-v2.md +0 -138
  86. package/articles/PAIN-DRIVEN-hackernews-v3.md +0 -151
  87. package/articles/PAIN-DRIVEN-hackernews.md +0 -131
  88. package/articles/PAIN-DRIVEN-reddit-v2.md +0 -301
  89. package/articles/PAIN-DRIVEN-reddit-v3.md +0 -236
  90. package/articles/PAIN-DRIVEN-reddit.md +0 -218
  91. package/articles/PAIN-DRIVEN-twitter-v2.md +0 -110
  92. package/articles/PAIN-DRIVEN-twitter-v3.md +0 -121
  93. package/articles/PAIN-DRIVEN-twitter.md +0 -120
  94. package/articles/PORTKEY_VS_A3M.md +0 -147
  95. package/articles/POSTING_KIT_2026_05.md +0 -67
  96. package/articles/PRESS_KIT_routerarena.md +0 -77
  97. package/articles/PRODUCTHUNT_LISTING.md +0 -48
  98. package/articles/PRODUCTHUNT_READY.md +0 -106
  99. package/articles/PR_PLAN_vault.md +0 -125
  100. package/articles/REDDIT_FINAL.md +0 -232
  101. package/articles/REDDIT_POST.md +0 -67
  102. package/articles/REDDIT_SUBMISSION_READY.md +0 -348
  103. package/articles/ROUTERARENA_LEADER.md +0 -45
  104. package/articles/SHOW_HN_FINAL.md +0 -29
  105. package/articles/TWEETS_10K_DOWNLOADS.md +0 -47
  106. package/articles/TWEETS_BENCHMARK_FIRST.md +0 -46
  107. package/articles/TWEETS_MCP_PLAY.md +0 -51
  108. package/articles/TWEETS_SEQUENTIAL_BROKEN.md +0 -49
  109. package/articles/TWEETS_WHY_BUILD.md +0 -54
  110. package/articles/TWEETS_routerarena_leader.md +0 -53
  111. package/articles/TWEET_STORM_READY.md +0 -165
  112. package/articles/TWITTER_FINAL.md +0 -167
  113. package/articles/WHY_10X_BETTER.md +0 -261
  114. package/articles/WHY_CHINESE_STYLE_BETTER.md +0 -323
  115. package/articles/ai-discoverability-llm-routing.md +0 -210
  116. package/articles/devto-llm-routing.md +0 -138
  117. package/articles/hackernews-show-hn.md +0 -54
  118. package/articles/hashnode-llm-cost-optimization.md +0 -125
  119. package/articles/hn_show_2026_05.md +0 -11
  120. package/articles/medium-building-llm-router.md +0 -205
  121. package/articles/reddit-ml.md +0 -76
  122. package/articles/twitter-thread-cost-savings.md +0 -50
  123. package/articles/youtube-tutorial-script.md +0 -262
  124. package/assets/a3m_3blue1brown.mp4 +0 -0
  125. package/assets/banner.svg +0 -109
  126. package/assets/chart-cost-v2.svg +0 -91
  127. package/assets/chart-cost-v3.svg +0 -143
  128. package/assets/chart-features-v2.svg +0 -132
  129. package/assets/chart-features-v3.svg +0 -211
  130. package/assets/chart-growth-v2.svg +0 -122
  131. package/assets/chart-growth-v3.svg +0 -189
  132. package/assets/cost-comparison.svg +0 -134
  133. package/assets/cost-simple.svg +0 -64
  134. package/assets/demo-hn.gif +0 -0
  135. package/assets/feature-matrix.svg +0 -136
  136. package/assets/growth-chart-animated.svg +0 -76
  137. package/assets/growth-chart.svg +0 -82
  138. package/assets/growth-simple.svg +0 -69
  139. package/assets/hero-diagram.svg +0 -81
  140. package/assets/logo-new.svg +0 -21
  141. package/assets/logo.svg +0 -68
  142. package/assets/provider-comparison.svg +0 -121
  143. package/assets/social-preview-new.svg +0 -100
  144. package/assets/social-preview.svg +0 -194
  145. package/assets/social-v2.svg +0 -130
  146. package/assets/social-v3.svg +0 -212
  147. package/benchmark-provider-results.json +0 -245
  148. package/benchmark-results.json +0 -54
  149. package/council-votes/architecture-vote.md +0 -121
  150. package/council-votes/coverage-vote.md +0 -93
  151. package/data/adaptive-benchmark.json +0 -92
  152. package/data/benchmark-results.json +0 -47
  153. package/data/labeled-benchmark.json +0 -88
  154. package/demo/3blue1brown_video.py +0 -285
  155. package/demo/3blue1brown_video_v2.py +0 -310
  156. package/demo/IMPROVED_PROMPTS.md +0 -229
  157. package/demo/VEO3_PROMPTS.md +0 -269
  158. package/demo/VIDEO_PRODUCTION_GUIDE.md +0 -333
  159. package/demo/a3m_3blue1brown.mp4 +0 -0
  160. package/demo/asciinema-demo.sh +0 -195
  161. package/demo/demo-hn.tape +0 -74
  162. package/demo/demo-script.md +0 -53
  163. package/demo/demo-script.sh +0 -62
  164. package/demo/demo.svg +0 -75
  165. package/demo/frame1_ai_data_center.png +0 -0
  166. package/demo/frame1_sunset_video.mp4 +0 -0
  167. package/demo/frame2_cost_comparison.png +0 -0
  168. package/demo/frame2_cost_comparison_fallback.png +0 -0
  169. package/demo/frame3_parallel_execution.png +0 -0
  170. package/demo/frame3_parallel_execution_fallback.png +0 -0
  171. package/demo/frame4_providers.png +0 -0
  172. package/demo/frame4_providers_fallback.png +0 -0
  173. package/demo/frame5_endcard.png +0 -0
  174. package/demo/frame5_endcard_fallback.png +0 -0
  175. package/demo/new_frame1_hook.png +0 -0
  176. package/demo/new_frame2_proof.png +0 -0
  177. package/demo/new_frame3_wow.png +0 -0
  178. package/demo/new_frame4_social.png +0 -0
  179. package/demo/new_frame5_cta.png +0 -0
  180. package/demo/package.json +0 -13
  181. package/demo/product-video-final.mp4 +0 -0
  182. package/demo/product-video-hype-v1.mp4 +0 -0
  183. package/demo/product-video-v1.mp4 +0 -0
  184. package/demo/public/index.html +0 -762
  185. package/demo/recording.cast +0 -55
  186. package/demo/server.js +0 -405
  187. package/demo-new.tape +0 -71
  188. package/demo-real.sh +0 -198
  189. package/demo-simple.tape +0 -205
  190. package/demo.html +0 -520
  191. package/demo.sh +0 -85
  192. package/demo.tape +0 -259
  193. package/dist/analytics/costAnalytics.d.ts.map +0 -1
  194. package/dist/analytics/costAnalytics.js.map +0 -1
  195. package/dist/benchmark/comprehensive.js.map +0 -1
  196. package/dist/benchmark/reproducible.d.ts.map +0 -1
  197. package/dist/benchmark/reproducible.js.map +0 -1
  198. package/dist/cache/prefixCache.d.ts.map +0 -1
  199. package/dist/cache/prefixCache.js.map +0 -1
  200. package/dist/cache/responseCache.d.ts.map +0 -1
  201. package/dist/cache/responseCache.js.map +0 -1
  202. package/dist/cache/semanticCache.d.ts.map +0 -1
  203. package/dist/cache/semanticCache.js.map +0 -1
  204. package/dist/cli/setupWizard.d.ts.map +0 -1
  205. package/dist/cli/setupWizard.js.map +0 -1
  206. package/dist/cost/budgetEnforcer.d.ts.map +0 -1
  207. package/dist/cost/budgetEnforcer.js.map +0 -1
  208. package/dist/cost/costTracker.d.ts.map +0 -1
  209. package/dist/cost/costTracker.js.map +0 -1
  210. package/dist/ensemble/multiRoundDialog.js.map +0 -1
  211. package/dist/ensemble/shapleyValue.js.map +0 -1
  212. package/dist/integrations/langchainAdapter.d.ts.map +0 -1
  213. package/dist/integrations/langchainAdapter.js.map +0 -1
  214. package/dist/integrations/oauth.d.ts.map +0 -1
  215. package/dist/integrations/oauth.js.map +0 -1
  216. package/dist/integrations/scienceAdapter.js.map +0 -1
  217. package/dist/memory/autoFetch.d.ts.map +0 -1
  218. package/dist/memory/autoFetch.js.map +0 -1
  219. package/dist/memory/episodicMemory.d.ts.map +0 -1
  220. package/dist/memory/episodicMemory.js.map +0 -1
  221. package/dist/memory/hybridMemory.js.map +0 -1
  222. package/dist/memory/memoryTree.d.ts.map +0 -1
  223. package/dist/memory/memoryTree.js.map +0 -1
  224. package/dist/memory/obsidianVault.d.ts.map +0 -1
  225. package/dist/memory/obsidianVault.js.map +0 -1
  226. package/dist/memory/reasoningBank.js.map +0 -1
  227. package/dist/observability/changeWatch.d.ts.map +0 -1
  228. package/dist/observability/changeWatch.js.map +0 -1
  229. package/dist/observability/fatigueDetector.d.ts.map +0 -1
  230. package/dist/observability/fatigueDetector.js.map +0 -1
  231. package/dist/observability/index.d.ts.map +0 -1
  232. package/dist/observability/index.js.map +0 -1
  233. package/dist/observability/metrics.d.ts.map +0 -1
  234. package/dist/observability/metrics.js.map +0 -1
  235. package/dist/observability/middleware.d.ts.map +0 -1
  236. package/dist/observability/middleware.js.map +0 -1
  237. package/dist/observability/tracer.d.ts.map +0 -1
  238. package/dist/observability/tracer.js.map +0 -1
  239. package/dist/observability/types.d.ts.map +0 -1
  240. package/dist/observability/types.js.map +0 -1
  241. package/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  242. package/dist/orchestration/haloOrchestrator.js.map +0 -1
  243. package/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  244. package/dist/orchestration/mctsWorkflow.js.map +0 -1
  245. package/dist/providers/localProvider.d.ts.map +0 -1
  246. package/dist/providers/localProvider.js.map +0 -1
  247. package/dist/providers/providerConfig.d.ts.map +0 -1
  248. package/dist/providers/providerConfig.js.map +0 -1
  249. package/dist/providers/registry.d.ts.map +0 -1
  250. package/dist/providers/registry.js.map +0 -1
  251. package/dist/routing/advancedRouter.d.ts.map +0 -1
  252. package/dist/routing/advancedRouter.js.map +0 -1
  253. package/dist/routing/crossModelValidation.d.ts.map +0 -1
  254. package/dist/routing/crossModelValidation.js.map +0 -1
  255. package/dist/routing/providerHealth.d.ts.map +0 -1
  256. package/dist/routing/providerHealth.js.map +0 -1
  257. package/dist/routing/providerRetry.d.ts.map +0 -1
  258. package/dist/routing/providerRetry.js.map +0 -1
  259. package/dist/scripts/banner.js +0 -29
  260. package/dist/security/guardrails.d.ts.map +0 -1
  261. package/dist/security/guardrails.js.map +0 -1
  262. package/dist/server/dashboard.d.ts.map +0 -1
  263. package/dist/server/dashboard.js.map +0 -1
  264. package/dist/server/modelMapper.d.ts.map +0 -1
  265. package/dist/server/modelMapper.js.map +0 -1
  266. package/dist/server/proxyServer.d.ts.map +0 -1
  267. package/dist/server/proxyServer.js.map +0 -1
  268. package/dist/skills/__tests__/skill_manager.test.d.ts +0 -2
  269. package/dist/skills/__tests__/skill_manager.test.d.ts.map +0 -1
  270. package/dist/skills/__tests__/skill_manager.test.js +0 -268
  271. package/dist/skills/__tests__/skill_manager.test.js.map +0 -1
  272. package/dist/tools/tmlpdTools.d.ts.map +0 -1
  273. package/dist/tools/tmlpdTools.js.map +0 -1
  274. package/dist/tui/dashboard.d.ts.map +0 -1
  275. package/dist/tui/dashboard.js.map +0 -1
  276. package/dist/tui/index.d.ts.map +0 -1
  277. package/dist/tui/index.js.map +0 -1
  278. package/dist/utils/batchProcessor.d.ts.map +0 -1
  279. package/dist/utils/batchProcessor.js.map +0 -1
  280. package/dist/utils/compression.d.ts.map +0 -1
  281. package/dist/utils/compression.js.map +0 -1
  282. package/dist/utils/costUtils.d.ts.map +0 -1
  283. package/dist/utils/costUtils.js.map +0 -1
  284. package/dist/utils/reliability.d.ts.map +0 -1
  285. package/dist/utils/reliability.js.map +0 -1
  286. package/dist/utils/sorting.d.ts.map +0 -1
  287. package/dist/utils/sorting.js.map +0 -1
  288. package/dist/utils/speculativeDecoding.d.ts.map +0 -1
  289. package/dist/utils/speculativeDecoding.js.map +0 -1
  290. package/dist/utils/tokenUtils.d.ts.map +0 -1
  291. package/dist/utils/tokenUtils.js.map +0 -1
  292. package/docs/.nojekyll +0 -0
  293. package/docs/ANALYSIS_PRINCIPLES.md +0 -162
  294. package/docs/API.md +0 -855
  295. package/docs/ARCHITECTURAL-IMPROVEMENTS-2025.md +0 -1391
  296. package/docs/ARCHITECTURAL-IMPROVEMENTS-REVISED-2025.md +0 -1051
  297. package/docs/BENCHMARK.md +0 -170
  298. package/docs/CHINESE_PROVIDER_RELIABILITY.md +0 -37
  299. package/docs/CITATIONS.md +0 -74
  300. package/docs/CLAIMS_AND_EVIDENCE.md +0 -58
  301. package/docs/CONFIGURATION.md +0 -476
  302. package/docs/COUNCIL_DECISION.json +0 -816
  303. package/docs/COUNCIL_SUMMARY.md +0 -319
  304. package/docs/COUNCIL_V2.2_DECISION.md +0 -416
  305. package/docs/ENGINEERING_SPEC.md +0 -55
  306. package/docs/FACTORY_RESET.md +0 -34
  307. package/docs/GEO.md +0 -66
  308. package/docs/GEO_OPTIMIZATION.md +0 -30
  309. package/docs/GEO_ROOT_CAUSE.md +0 -136
  310. package/docs/GEO_STATUS.md +0 -85
  311. package/docs/GEO_TEST_RESULTS.md +0 -176
  312. package/docs/HN_CHECKLIST.md +0 -38
  313. package/docs/HN_FOUNDER_COMMENT.md +0 -17
  314. package/docs/HN_SUBMISSION_FINAL.md +0 -180
  315. package/docs/HN_SUBMISSION_V3.md +0 -56
  316. package/docs/IMPROVEMENT_ROADMAP.md +0 -515
  317. package/docs/INTEGRATIONS.md +0 -420
  318. package/docs/LANGCHAIN_INTEGRATION.md +0 -147
  319. package/docs/LLM_COUNCIL_DECISION.md +0 -508
  320. package/docs/MIDDLEWARE_CHAIN.md +0 -35
  321. package/docs/PROMO_CHECKLIST.md +0 -200
  322. package/docs/QUICKSTART.md +0 -271
  323. package/docs/QUICK_START.md +0 -43
  324. package/docs/QUICK_START_VISIBILITY.md +0 -782
  325. package/docs/REDDIT_GAP_ANALYSIS.md +0 -299
  326. package/docs/RELEASE_CHECKLIST.md +0 -32
  327. package/docs/REPRODUCIBILITY.md +0 -63
  328. package/docs/RESEARCH_BACKED_IMPROVEMENTS.md +0 -1180
  329. package/docs/ROUTING_RUBRIC.md +0 -197
  330. package/docs/SEO_AUDIT.md +0 -186
  331. package/docs/SOCIAL_LISTENING.md +0 -219
  332. package/docs/TMLPD_QNA.md +0 -751
  333. package/docs/TMLPD_V2.1_COMPLETE.md +0 -763
  334. package/docs/TMLPD_V2.2_RESEARCH_ROADMAP.md +0 -754
  335. package/docs/UPDATE_TOPICS.md +0 -15
  336. package/docs/USE_CASES.md +0 -59
  337. package/docs/V2.2_IMPLEMENTATION_COMPLETE.md +0 -446
  338. package/docs/V2_IMPLEMENTATION_GUIDE.md +0 -388
  339. package/docs/VERCEL_AI_SDK.md +0 -209
  340. package/docs/VISIBILITY_ADOPTION_PLAN.md +0 -1005
  341. package/docs/_config.yml +0 -49
  342. package/docs/ai-plugin.json +0 -16
  343. package/docs/api.html +0 -513
  344. package/docs/architecture-diagram.md +0 -40
  345. package/docs/benchmark-chart.png +0 -0
  346. package/docs/benchmark.html +0 -387
  347. package/docs/blog/routerarena-number-one.html +0 -73
  348. package/docs/cli-cheatsheet.md +0 -339
  349. package/docs/compare.md +0 -109
  350. package/docs/comparison-litellm.md +0 -88
  351. package/docs/comparison.md +0 -108
  352. package/docs/cost-chart-ascii.md +0 -42
  353. package/docs/cost-comparison-chart.svg +0 -88
  354. package/docs/curl-examples.md +0 -247
  355. package/docs/demo-auto.html +0 -264
  356. package/docs/demo.html +0 -416
  357. package/docs/geo/GENERATIVE_ENGINE_OPTIMIZATION.md +0 -232
  358. package/docs/index.html +0 -507
  359. package/docs/launch-content/LAUNCH_EXECUTION_CHECKLIST.md +0 -421
  360. package/docs/launch-content/README.md +0 -457
  361. package/docs/launch-content/assets/cost_comparison_100_tasks.png +0 -0
  362. package/docs/launch-content/assets/cumulative_savings.png +0 -0
  363. package/docs/launch-content/assets/parallel_speedup.png +0 -0
  364. package/docs/launch-content/assets/provider_pricing_comparison.png +0 -0
  365. package/docs/launch-content/assets/task_breakdown_comparison.png +0 -0
  366. package/docs/launch-content/generate_charts.py +0 -313
  367. package/docs/launch-content/hn_show_post.md +0 -139
  368. package/docs/launch-content/partner_outreach_templates.md +0 -745
  369. package/docs/launch-content/reddit_posts.md +0 -467
  370. package/docs/launch-content/twitter_thread.txt +0 -460
  371. package/docs/npm-downloads-chart.svg +0 -43
  372. package/docs/openapi.json +0 -139
  373. package/docs/openapi.yaml +0 -1318
  374. package/docs/quick-start.html +0 -366
  375. package/docs/robots.txt +0 -52
  376. package/docs/sitemap.xml +0 -57
  377. package/docs/styles.css +0 -682
  378. package/docs/well-known/ai-plugin.json +0 -16
  379. package/docs/wellknown/ai-plugin.json +0 -16
  380. package/docs-site/assets/og-banner.svg +0 -194
  381. package/docs-site/index.html +0 -632
  382. package/eval/README.md +0 -46
  383. package/eval/baselines/main.json +0 -12
  384. package/eval/benchmark_dataset.jsonl +0 -16
  385. package/eval/check_golden_routes.js +0 -64
  386. package/eval/datasets/catalog.json +0 -33
  387. package/eval/datasets/slices/cn_provider_reliability_v1.jsonl +0 -3
  388. package/eval/datasets/slices/cost_pressure_v1.jsonl +0 -3
  389. package/eval/datasets/slices/safety_guardrails_v1.jsonl +0 -3
  390. package/eval/evals.json +0 -199
  391. package/eval/fault_injection_thresholds.json +0 -3
  392. package/eval/generate_report.js +0 -128
  393. package/eval/golden_routes.json +0 -114
  394. package/eval/lib/experiment_registry.js +0 -24
  395. package/eval/run_eval.js +0 -197
  396. package/eval/run_fault_injection.js +0 -201
  397. package/eval/run_shadow_eval.js +0 -85
  398. package/eval/thresholds.json +0 -9
  399. package/examples/QUICKSTART.md +0 -183
  400. package/examples/README.md +0 -61
  401. package/examples/a3m-sdk.js +0 -124
  402. package/examples/basic-route.js +0 -54
  403. package/examples/chat-loop.js +0 -202
  404. package/examples/classify-then-route.js +0 -102
  405. package/examples/cost-compare.js +0 -120
  406. package/examples/ensemble.js +0 -160
  407. package/examples/whatsapp-telegram-bridge-demo.js +0 -302
  408. package/examples/whatsapp-telegram-bridge.js +0 -269
  409. package/hf-space/README.md +0 -23
  410. package/hf-space/app.py +0 -240
  411. package/hf-space/requirements.txt +0 -1
  412. package/huggingface_space/README.md +0 -35
  413. package/huggingface_space/app.py +0 -126
  414. package/huggingface_space/create_space.py +0 -208
  415. package/huggingface_space/requirements.txt +0 -1
  416. package/mcp-server/README.md +0 -188
  417. package/mcp-server/package.json +0 -29
  418. package/mcp-server/src/index.ts +0 -744
  419. package/mcp-server/tsconfig.json +0 -19
  420. package/openclaw-alexa-bridge/ALL_REMAINING_FIXES_PLAN.md +0 -313
  421. package/openclaw-alexa-bridge/REMAINING_FIXES_SUMMARY.md +0 -277
  422. package/openclaw-alexa-bridge/src/alexa_handler_no_tmlpd.js +0 -1234
  423. package/openclaw-alexa-bridge/test_fixes.js +0 -77
  424. package/playground/README.md +0 -51
  425. package/playground/codesandbox.json +0 -12
  426. package/playground/index.js +0 -39
  427. package/proxy/README.md +0 -227
  428. package/proxy/package-lock.json +0 -831
  429. package/proxy/package.json +0 -17
  430. package/proxy/rate-limit.js +0 -145
  431. package/proxy/rate-limit.test.js +0 -311
  432. package/proxy/server.js +0 -970
  433. package/python/README.md +0 -102
  434. package/python/a3m/__init__.py +0 -6
  435. package/python/a3m/client.py +0 -190
  436. package/python/a3m/models.py +0 -40
  437. package/python/a3m/sync_client.py +0 -61
  438. package/python/examples.py +0 -53
  439. package/python/integrations.py +0 -330
  440. package/python/pyproject.toml +0 -23
  441. package/python/setup.py +0 -28
  442. package/python/tmlpd.py +0 -369
  443. package/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  444. package/qna/TMLPD_QNA.md +0 -751
  445. package/research/FINDING_001_safety.md +0 -28
  446. package/research/FINDING_002_error_diversity.md +0 -32
  447. package/research/FINDING_003_confidence_weighted_voting.md +0 -32
  448. package/research/FINDING_004_cross_model_semantic_detection.md +0 -37
  449. package/research/FINDING_005_knowledge_gap_orthogonality.md +0 -34
  450. package/research/HALLUCINATION_RESEARCH.md +0 -27
  451. package/research/ensemble-voting.md +0 -324
  452. package/research/loss-functions.md +0 -545
  453. package/research-log.md +0 -49
  454. package/scripts/banner.js +0 -29
  455. package/scripts/benchmark-local-routerarena.ts +0 -176
  456. package/scripts/benchmark.js +0 -145
  457. package/scripts/benchmark.sh +0 -61
  458. package/scripts/compare-providers.sh +0 -230
  459. package/scripts/content-planner.js +0 -25
  460. package/scripts/create-labeled-benchmark.ts +0 -105
  461. package/scripts/cross_post.py +0 -443
  462. package/scripts/local-router-benchmark.ts +0 -154
  463. package/scripts/post-all.sh +0 -41
  464. package/scripts/publish_fcc.py +0 -106
  465. package/scripts/push-to-gitee.sh +0 -25
  466. package/scripts/routerarena_ensemble.js +0 -144
  467. package/scripts/routing-benchmark-v2.js +0 -373
  468. package/scripts/routing-benchmark-v3.js +0 -118
  469. package/scripts/routing-benchmark.js +0 -462
  470. package/scripts/run-labeled-benchmark.mjs +0 -104
  471. package/scripts/run-mmlu-benchmark.js +0 -176
  472. package/scripts/run-provider-benchmark.js +0 -244
  473. package/scripts/update-npm-badges.js +0 -158
  474. package/skill/SKILL.md +0 -238
  475. package/src/__tests__/integration/tmpld_integration.test.py +0 -540
  476. package/src/skills/__tests__/skill_manager.test.ts +0 -328
  477. package/submissions/benchmarks/ALL_PLATFORMS_SUBMISSION.md +0 -94
  478. package/submissions/benchmarks/LLMROUTERBENCH_SUBMISSION.md +0 -121
  479. package/submissions/benchmarks/MMRBENCH_SUBMISSION.md +0 -94
  480. package/submissions/benchmarks/ROUTERARENA_UPDATE.md +0 -83
  481. package/submissions/benchmarks/ROUTERBENCH_SUBMISSION.md +0 -225
  482. package/test-council/1-structure-tests.test.js +0 -353
  483. package/test-council/1-structure-tests.test.ts +0 -353
  484. package/test-council/2-edge-case-tests.test.ts +0 -361
  485. package/test-council/3-performance-tests.test.ts +0 -669
  486. package/test-council/4-integration-tests.test.ts +0 -391
  487. package/test-council/5-agent-council-eval.test.ts +0 -413
  488. package/test-council/AGENT_COUNCIL_ARCHITECTURE.md +0 -349
  489. package/test-council/TEST_COUNCIL_REPORT.md +0 -201
  490. package/test-council/agents/edge-case-agent.ts +0 -363
  491. package/test-council/agents/performance-agent.ts +0 -426
  492. package/test-council/agents/structure-agent.ts +0 -227
  493. package/test-council/council.md +0 -183
  494. package/tests/__mocks__/tokenUtils.ts +0 -8
  495. package/tests/memory/episodicMemory.test.ts +0 -227
  496. package/tests/package-lock.json +0 -1628
  497. package/tests/package.json +0 -18
  498. package/tests/routing/ensembleVoting.test.ts +0 -236
  499. package/tests/routing/providerRetry.test.ts +0 -360
  500. package/tests/routing/queryTypePresets.test.ts +0 -208
  501. package/tests/security/guardrailEngine.test.ts +0 -700
  502. package/tests/tsconfig.json +0 -21
  503. package/tests/vitest.config.ts +0 -18
  504. package/tmlpd-pi-extension/README.md +0 -66
  505. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts +0 -114
  506. package/tmlpd-pi-extension/dist/cache/prefixCache.d.ts.map +0 -1
  507. package/tmlpd-pi-extension/dist/cache/prefixCache.js +0 -285
  508. package/tmlpd-pi-extension/dist/cache/prefixCache.js.map +0 -1
  509. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts +0 -58
  510. package/tmlpd-pi-extension/dist/cache/responseCache.d.ts.map +0 -1
  511. package/tmlpd-pi-extension/dist/cache/responseCache.js +0 -153
  512. package/tmlpd-pi-extension/dist/cache/responseCache.js.map +0 -1
  513. package/tmlpd-pi-extension/dist/cli.js +0 -59
  514. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts +0 -95
  515. package/tmlpd-pi-extension/dist/cost/costTracker.d.ts.map +0 -1
  516. package/tmlpd-pi-extension/dist/cost/costTracker.js +0 -240
  517. package/tmlpd-pi-extension/dist/cost/costTracker.js.map +0 -1
  518. package/tmlpd-pi-extension/dist/index.d.ts +0 -723
  519. package/tmlpd-pi-extension/dist/index.d.ts.map +0 -1
  520. package/tmlpd-pi-extension/dist/index.js +0 -239
  521. package/tmlpd-pi-extension/dist/index.js.map +0 -1
  522. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts +0 -82
  523. package/tmlpd-pi-extension/dist/memory/episodicMemory.d.ts.map +0 -1
  524. package/tmlpd-pi-extension/dist/memory/episodicMemory.js +0 -145
  525. package/tmlpd-pi-extension/dist/memory/episodicMemory.js.map +0 -1
  526. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts +0 -102
  527. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.d.ts.map +0 -1
  528. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js +0 -207
  529. package/tmlpd-pi-extension/dist/orchestration/haloOrchestrator.js.map +0 -1
  530. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts +0 -85
  531. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.d.ts.map +0 -1
  532. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js +0 -210
  533. package/tmlpd-pi-extension/dist/orchestration/mctsWorkflow.js.map +0 -1
  534. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts +0 -102
  535. package/tmlpd-pi-extension/dist/providers/localProvider.d.ts.map +0 -1
  536. package/tmlpd-pi-extension/dist/providers/localProvider.js +0 -338
  537. package/tmlpd-pi-extension/dist/providers/localProvider.js.map +0 -1
  538. package/tmlpd-pi-extension/dist/providers/registry.d.ts +0 -55
  539. package/tmlpd-pi-extension/dist/providers/registry.d.ts.map +0 -1
  540. package/tmlpd-pi-extension/dist/providers/registry.js +0 -138
  541. package/tmlpd-pi-extension/dist/providers/registry.js.map +0 -1
  542. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts +0 -68
  543. package/tmlpd-pi-extension/dist/routing/advancedRouter.d.ts.map +0 -1
  544. package/tmlpd-pi-extension/dist/routing/advancedRouter.js +0 -332
  545. package/tmlpd-pi-extension/dist/routing/advancedRouter.js.map +0 -1
  546. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts +0 -101
  547. package/tmlpd-pi-extension/dist/tools/tmlpdTools.d.ts.map +0 -1
  548. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js +0 -368
  549. package/tmlpd-pi-extension/dist/tools/tmlpdTools.js.map +0 -1
  550. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts +0 -96
  551. package/tmlpd-pi-extension/dist/utils/batchProcessor.d.ts.map +0 -1
  552. package/tmlpd-pi-extension/dist/utils/batchProcessor.js +0 -170
  553. package/tmlpd-pi-extension/dist/utils/batchProcessor.js.map +0 -1
  554. package/tmlpd-pi-extension/dist/utils/compression.d.ts +0 -61
  555. package/tmlpd-pi-extension/dist/utils/compression.d.ts.map +0 -1
  556. package/tmlpd-pi-extension/dist/utils/compression.js +0 -281
  557. package/tmlpd-pi-extension/dist/utils/compression.js.map +0 -1
  558. package/tmlpd-pi-extension/dist/utils/reliability.d.ts +0 -74
  559. package/tmlpd-pi-extension/dist/utils/reliability.d.ts.map +0 -1
  560. package/tmlpd-pi-extension/dist/utils/reliability.js +0 -177
  561. package/tmlpd-pi-extension/dist/utils/reliability.js.map +0 -1
  562. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts +0 -117
  563. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.d.ts.map +0 -1
  564. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js +0 -246
  565. package/tmlpd-pi-extension/dist/utils/speculativeDecoding.js.map +0 -1
  566. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts +0 -50
  567. package/tmlpd-pi-extension/dist/utils/tokenUtils.d.ts.map +0 -1
  568. package/tmlpd-pi-extension/dist/utils/tokenUtils.js +0 -124
  569. package/tmlpd-pi-extension/dist/utils/tokenUtils.js.map +0 -1
  570. package/tmlpd-pi-extension/examples/QUICKSTART.md +0 -183
  571. package/tmlpd-pi-extension/package-lock.json +0 -79
  572. package/tmlpd-pi-extension/package.json +0 -172
  573. package/tmlpd-pi-extension/python/examples.py +0 -53
  574. package/tmlpd-pi-extension/python/integrations.py +0 -330
  575. package/tmlpd-pi-extension/python/setup.py +0 -28
  576. package/tmlpd-pi-extension/python/tmlpd.py +0 -369
  577. package/tmlpd-pi-extension/qna/REDDIT_GAP_ANALYSIS.md +0 -299
  578. package/tmlpd-pi-extension/qna/TMLPD_QNA.md +0 -751
  579. package/tmlpd-pi-extension/skill/SKILL.md +0 -238
  580. package/tmlpd-pi-extension/src/cache/responseCache.ts +0 -147
  581. package/tmlpd-pi-extension/src/cost/costTracker.ts +0 -302
  582. package/tmlpd-pi-extension/src/index.ts +0 -232
  583. package/tmlpd-pi-extension/src/memory/episodicMemory.ts +0 -257
  584. package/tmlpd-pi-extension/src/orchestration/haloOrchestrator.ts +0 -266
  585. package/tmlpd-pi-extension/src/orchestration/mctsWorkflow.ts +0 -262
  586. package/tmlpd-pi-extension/src/providers/localProvider.ts +0 -406
  587. package/tmlpd-pi-extension/src/providers/registry.ts +0 -164
  588. package/tmlpd-pi-extension/src/routing/ensembleVoting.ts +0 -159
  589. package/tmlpd-pi-extension/src/routing/queryTypePresets.ts +0 -136
  590. package/tmlpd-pi-extension/src/tools/tmlpdTools.ts +0 -433
  591. package/tmlpd-pi-extension/src/utils/batchProcessor.ts +0 -232
  592. package/tmlpd-pi-extension/src/utils/compression.ts +0 -325
  593. package/tmlpd-pi-extension/src/utils/reliability.ts +0 -221
  594. package/tmlpd-pi-extension/src/utils/tokenUtils.ts +0 -145
  595. package/tmlpd-pi-extension/tsconfig.json +0 -18
  596. package/tsconfig.build.json +0 -29
  597. package/tsconfig.json +0 -18
  598. /package/{docs/llms-full.txt → llms-full.txt.bak} +0 -0
@@ -1,1180 +0,0 @@
1
- # TMLPD v2.0 Research-Backed Improvement Roadmap
2
-
3
- **Based on**:
4
- - MONK CLI architecture analysis (production system)
5
- - Latest arXiv research (2024-2025)
6
- - Current TMLPD v2.0 state (~3,000 lines, 5 phases complete)
7
-
8
- **Date**: 2025-01-02
9
-
10
- ---
11
-
12
- ## 🎯 Executive Summary
13
-
14
- After analyzing MONK CLI's production architecture and 30+ recent arXiv papers on multi-LLM systems, memory, and agent orchestration, here are the **highest-impact improvements** for TMLPD v2.0.
15
-
16
- ### Key Insights from Research
17
-
18
- 1. **From arXiv 2024-2025**: Hierarchical orchestration and difficulty-aware routing are the dominant patterns
19
- 2. **From MONK CLI**: Multi-provider management with health monitoring achieves 95%+ uptime
20
- 3. **Combined**: TMLPD should adopt provider abstraction + difficulty-aware routing + advanced memory
21
-
22
- ---
23
-
24
- ## 🔴 CRITICAL IMPROVEMENTS (Research-Backed)
25
-
26
- ### 1. **Multi-Provider System with Health Monitoring** ⭐⭐⭐⭐⭐
27
-
28
- **Based on**: MONK CLI's `unified_provider.py` + [AgentOrchestra hierarchical framework](https://arxiv.org/html/2506.12508v1)
29
-
30
- **Problem**: TMLPD v2.0 is hardcoded to a single provider (anthropic/claude-sonnet-4)
31
-
32
- **Impact**: Enables provider switching, load balancing, and 40-60% cost reduction (MONK benchmark)
33
-
34
- #### Implementation
35
-
36
- ```python
37
- # src/providers/base_provider.py
38
- from abc import ABC, abstractmethod
39
-
40
- class BaseProvider(ABC):
41
- """Unified provider interface"""
42
-
43
- @abstractmethod
44
- async def execute(self, prompt: str, **kwargs) -> Dict[str, Any]:
45
- """Execute task with this provider"""
46
- pass
47
-
48
- @abstractmethod
49
- def get_health(self) -> Dict[str, Any]:
50
- """Get provider health status"""
51
- pass
52
-
53
- @abstractmethod
54
- def calculate_cost(self, tokens: int) -> float:
55
- """Calculate cost for token usage"""
56
- pass
57
-
58
- # src/providers/anthropic_provider.py
59
- class AnthropicProvider(BaseProvider):
60
- def __init__(self, api_key: str):
61
- self.api_key = api_key
62
- self.health_status = True
63
- self.failure_count = 0
64
-
65
- async def execute(self, prompt: str, **kwargs) -> Dict[str, Any]:
66
- # Implementation with retry logic
67
- pass
68
-
69
- # src/providers/provider_registry.py
70
- class ProviderRegistry:
71
- """Manages multiple providers with health monitoring"""
72
-
73
- def __init__(self):
74
- self.providers: Dict[str, BaseProvider] = {}
75
- self.health_monitor = HealthMonitor()
76
-
77
- def register_provider(self, name: str, provider: BaseProvider):
78
- self.providers[name] = provider
79
-
80
- def get_healthy_providers(self) -> List[BaseProvider]:
81
- """Get only providers that are healthy"""
82
- return [
83
- p for p in self.providers.values()
84
- if p.get_health()["status"] == "healthy"
85
- ]
86
- ```
87
-
88
- **Configuration**:
89
- ```yaml
90
- # tmlpd.yaml
91
- providers:
92
-   anthropic:
93
-     model: claude-sonnet-4
94
-     api_key_env: ANTHROPIC_API_KEY
95
-     priority: 1
96
-
97
-   openai:
98
-     model: gpt-4o
99
-     api_key_env: OPENAI_API_KEY
100
-     priority: 2
101
-
102
-   cerebras:
103
-     model: llama-3.3-70b
104
-     api_key_env: CEREBRAS_API_KEY
105
-     priority: 3  # Fallback for cost optimization
106
-
107
- provider_selection:
108
-   strategy: difficulty_aware  # From arXiv 2509.11079
109
-   health_checks_enabled: true
110
-   circuit_breaker_threshold: 3
111
- ```
112
-
113
- **Files to Add**:
114
- - `src/providers/base_provider.py` (100 lines)
115
- - `src/providers/anthropic_provider.py` (150 lines)
116
- - `src/providers/openai_provider.py` (150 lines)
117
- - `src/providers/cerebras_provider.py` (150 lines)
118
- - `src/providers/provider_registry.py` (200 lines)
119
- - `src/providers/health_monitor.py` (150 lines)
120
-
121
- **Effort**: 2-3 days
122
- **Value**: ⭐⭐⭐⭐⭐ (enables all other improvements)
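The roadmap references a `HealthMonitor` and a `circuit_breaker_threshold` but does not show either. The following is a minimal sketch of one way a per-provider circuit breaker could work, under stated assumptions: the class name `CircuitBreaker`, the `cooldown_seconds` parameter, and the injectable `clock` are hypothetical and do not come from the original code. A provider is treated as unhealthy after `threshold` consecutive failures and is allowed a probe request once the cooldown has elapsed.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical per-provider breaker; not part of the original TMLPD code."""

    def __init__(
        self,
        threshold: int = 3,                # mirrors circuit_breaker_threshold in tmlpd.yaml
        cooldown_seconds: float = 30.0,    # assumed value, not from the roadmap
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self.failure_count = 0
        self._opened_at: Optional[float] = None

    def record_success(self) -> None:
        # Any success closes the breaker and resets the failure streak.
        self.failure_count = 0
        self._opened_at = None

    def record_failure(self) -> None:
        # Count consecutive failures; (re)open the breaker at the threshold.
        self.failure_count += 1
        if self.failure_count >= self.threshold:
            self._opened_at = self._clock()

    def is_healthy(self) -> bool:
        # Closed breaker: healthy. Open breaker: healthy again only after the
        # cooldown, which lets a single probe request through.
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self.cooldown_seconds
```

A provider's `get_health()` could then report `"healthy"` when `is_healthy()` returns `True`; how the real `HealthMonitor` was meant to do this is not stated in the roadmap.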
123
-
124
- ---
125
-
126
- ### 2. **Difficulty-Aware Routing** ⭐⭐⭐⭐⭐
127
-
128
- **Based on**: [Difficulty-Aware Agent Orchestration](https://arxiv.org/html/2509.11079v2) + MONK's `treequest_controller.py`
129
-
130
- **Problem**: Current complexity scoring (0-1) is simplistic and doesn't map to optimal providers
131
-
132
- **Impact**: Research shows difficulty-aware routing improves decision quality by 35%
133
-
134
- #### Implementation
135
-
136
- ```python
137
- # src/workflows/difficulty_router.py
138
-
139
- class DifficultyAwareRouter:
140
- """
141
- Routes tasks based on difficulty classification
142
- Based on arXiv:2509.11079 (Difficulty-Aware Agent Orchestration)
143
- """
144
-
145
- DIFFICULTY_LEVELS = {
146
- "TRIVIAL": range(0, 20), # Use fastest/cheapest
147
- "SIMPLE": range(20, 40), # Use balanced provider
148
- "MEDIUM": range(40, 60), # Use quality provider
149
- "COMPLEX": range(60, 80), # Use best provider
150
- "EXPERT": range(80, 101) # Use expert provider + verification (the score can reach 100)
151
- }
152
-
153
- # Provider preference by difficulty
154
- PROVIDER_PREFERENCES = {
155
- "TRIVIAL": ["cerebras", "groq"], # Fastest
156
- "SIMPLE": ["cerebras", "openai"], # Fast
157
- "MEDIUM": ["openai", "anthropic"], # Balanced
158
- "COMPLEX": ["anthropic", "openai"], # Quality
159
- "EXPERT": ["anthropic"] # Best
160
- }
161
-
162
- def classify_difficulty(self, task: Dict[str, Any]) -> str:
163
- """
164
- Classify task difficulty based on multiple factors
165
-
166
- Factors (from research):
167
- - Task length (word count)
168
- - Multi-step indicators (then, after, followed by)
169
- - Domain complexity (specialized terminology)
170
- - Requirement specificity
171
- - Context dependencies
172
- """
173
- score = 0
174
-
175
- # Factor 1: Length (0-20 points)
176
- description = task.get("description", "")
177
- word_count = len(description.split())
178
- score += min(word_count / 10, 20)
179
-
180
- # Factor 2: Multi-step (0-25 points)
181
- multi_step_keywords = [
182
- "then", "after", "before", "followed by",
183
- "multiple", "several", "sequence", "chain",
184
- "iterate", "refine", "improve"
185
- ]
186
- multi_step_count = sum(
187
- 1 for kw in multi_step_keywords
188
- if kw in description.lower()
189
- )
190
- score += min(multi_step_count * 5, 25)
191
-
192
- # Factor 3: Technical complexity (0-30 points)
193
- technical_keywords = [
194
- "implement", "integrate", "optimize", "architecture",
195
- "system", "api", "database", "authentication", "deployment"
196
- ]
197
- tech_count = sum(
198
- 1 for kw in technical_keywords
199
- if kw in description.lower()
200
- )
201
- score += min(tech_count * 3, 30)
202
-
203
- # Factor 4: Constraints/requirements (0-15 points)
204
- if task.get("requirements"):
205
- score += 10
206
- if task.get("context"):
207
- score += 5
208
-
209
- # Factor 5: Dependencies (0-10 points)
210
- dependency_keywords = ["depends", "requires", "needs", "after"]
211
- if any(kw in description.lower() for kw in dependency_keywords):
212
- score += 10
213
-
214
- # Map to difficulty level
215
- for level, range_obj in self.DIFFICULTY_LEVELS.items():
216
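- if range_obj.start <= score < range_obj.stop:  # score can be a float, and "x in range(...)" is False for non-integer floats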
217
- return level
218
-
219
- return "MEDIUM" # Default
220
-
221
- def route_to_provider(
222
- self,
223
- task: Dict[str, Any],
224
- provider_registry: ProviderRegistry
225
- ) -> BaseProvider:
226
- """Route task to appropriate provider based on difficulty"""
227
- difficulty = self.classify_difficulty(task)
228
- preferred_providers = self.PROVIDER_PREFERENCES[difficulty]
229
-
230
- # Get first healthy provider from preferences
231
- for provider_name in preferred_providers:
232
- provider = provider_registry.get_provider(provider_name)
233
- if provider and provider.get_health()["status"] == "healthy":
234
- return provider
235
-
236
- # Fallback to any healthy provider
237
- healthy = provider_registry.get_healthy_providers()
238
- if healthy:
239
- return healthy[0]
240
-
241
- raise RuntimeError("No healthy providers available")
242
- ```
243
-
244
- **Research Backing**: [arXiv:2509.11079](https://arxiv.org/html/2509.11079v2) shows difficulty-aware orchestration improves decision support quality by 35%
245
-
246
- **Files to Add**:
247
- - `src/workflows/difficulty_router.py` (250 lines)
248
-
249
- **Effort**: 1-2 days
250
- **Value**: ⭐⭐⭐⭐⭐
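The classifier above produces a score that can be fractional (the length factor divides the word count by 10) and can reach 100 when every factor is maxed out, so mapping it to a level needs interval comparison rather than membership in an integer `range`. A minimal sketch of that mapping, assuming the five 20-point bands from `DIFFICULTY_LEVELS`; the helper name `level_for_score` and the clamping behaviour are illustrative choices, not part of the original router.

```python
from bisect import bisect_right

# Lower bound of each band on the 0-100 scale, taken from DIFFICULTY_LEVELS.
LEVEL_BOUNDS = [0, 20, 40, 60, 80]
LEVEL_NAMES = ["TRIVIAL", "SIMPLE", "MEDIUM", "COMPLEX", "EXPERT"]


def level_for_score(score: float) -> str:
    """Map an int or float score to a difficulty level (hypothetical helper)."""
    # Clamp so that out-of-range scores still land in the first or last band.
    clamped = max(0.0, min(float(score), 100.0))
    # bisect_right counts how many lower bounds are <= the score.
    return LEVEL_NAMES[bisect_right(LEVEL_BOUNDS, clamped) - 1]
```

With this mapping, a score of 12.5 is classified as `TRIVIAL` and a score of 100 as `EXPERT`, instead of both falling through to the `MEDIUM` default.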
251
-
252
- ---
253
-
254
- ### 3. **Advanced Memory System (Memoria-Inspired)** ⭐⭐⭐⭐⭐
255
-
256
- **Based on**:
257
- - [Memoria: Scalable Agentic Memory Framework](https://www.arxiv.org/abs/2512.12686)
258
- - [A-Mem: Agentic Memory](https://arxiv.org/abs/2502.12110)
259
- - MONK's `advanced_context_manager.py`
260
-
261
- **Problem**: Current JSON memory lacks semantic search, persistent context, and intelligent retrieval
262
-
263
- **Impact**: Research shows advanced memory improves long-term task performance by 50%
264
-
265
- #### Architecture
266
-
267
- ```python
268
- # src/memory/agentic_memory.py
269
-
270
- class AgenticMemory:
271
- """
272
- Multi-tier memory system inspired by Memoria (arXiv:2512.12686)
273
- and A-Mem (arXiv:2502.12110)
274
-
275
- Tiers:
276
- 1. Episodic Memory: Specific task executions (JSON)
277
- 2. Semantic Memory: General patterns and concepts (Vector DB)
278
- 3. Working Memory: Active session context (In-memory)
279
- """
280
-
281
- def __init__(self, base_dir: str = ".taskmaster/memory"):
282
- self.base_dir = Path(base_dir)
283
-
284
- # Tier 1: Episodic memory (JSON files)
285
- self.episodic_store = EpisodicMemoryStore(self.base_dir / "episodic")
286
-
287
- # Tier 2: Semantic memory (ChromaDB - optional)
288
- # Falls back to keyword matching if not available
289
- try:
290
- import chromadb
291
- self.semantic_store = SemanticMemoryStore(self.base_dir / "semantic")
292
- except ImportError:
293
- self.semantic_store = None
294
- print("Warning: ChromaDB not available, using keyword matching")
295
-
296
- # Tier 3: Working memory
297
- self.working_memory = WorkingMemory(max_items=100)
298
-
299
- def remember(
300
- self,
301
- task: Dict[str, Any],
302
- result: Dict[str, Any],
303
- agent_id: str,
304
- skills: List[str],
305
- importance: float = 0.5
306
- ):
307
- """
308
- Store experience in multiple memory tiers
309
-
310
- Importance scoring (from research):
311
- - Success/failure outcome
312
- - Token efficiency
313
- - Time to completion
314
- - User feedback
315
- """
316
- # Store in episodic memory
317
- episode = {
318
- "id": f"episode_{uuid4()}",
319
- "timestamp": datetime.now().isoformat(),
320
- "task": task,
321
- "result": result,
322
- "agent_id": agent_id,
323
- "skills": skills,
324
- "importance": importance,
325
- "embeddings": None # Computed if semantic store available
326
- }
327
-
328
- self.episodic_store.store(episode)
329
-
330
- # Add to semantic memory if available
331
- if self.semantic_store:
332
- self.semantic_store.store(episode)
333
-
334
- # Update working memory
335
- self.working_memory.add(episode)
336
-
337
- def recall(
338
- self,
339
- task: Dict[str, Any],
340
- top_k: int = 5,
341
- memory_types: Tuple[str, ...] = ("episodic", "semantic", "working")  # immutable default instead of a shared list
342
- ) -> List[Dict[str, Any]]:
343
- """
344
- Recall relevant experiences using multi-tier retrieval
345
-
346
- Combines:
347
- 1. Keyword matching (episodic)
348
- 2. Semantic similarity (semantic)
349
- 3. Recent context (working)
350
- """
351
- results = []
352
-
353
- if "episodic" in memory_types:
354
- # Keyword-based retrieval
355
- episodes = self.episodic_store.recall(task, top_k)
356
- results.extend([(e, "episodic") for e in episodes])
357
-
358
- if "semantic" in memory_types and self.semantic_store:
359
- # Semantic similarity retrieval
360
- semantics = self.semantic_store.recall(task, top_k)
361
- results.extend([(s, "semantic") for s in semantics])
362
-
363
- if "working" in memory_types:
364
- # Recent working memory
365
- working = self.working_memory.recall(task, top_k)
366
- results.extend([(w, "working") for w in working])
367
-
368
- # Rank by combined score
369
- ranked = self._rank_results(results, task)
370
- return ranked[:top_k]
371
-
372
- def _rank_results(
373
- self,
374
- results: List[Tuple[Dict, str]],
375
- task: Dict[str, Any]
376
- ) -> List[Dict]:
377
- """
378
- Rank results by relevance score
379
-
380
- Scoring (research-based):
381
- - Semantic similarity: 40%
382
- - Keyword match: 30%
383
- - Recency: 20%
384
- - Importance: 10%
385
- """
386
- scored = []
387
-
388
- for result, source in results:
389
- score = 0.0
390
-
391
- # Source-specific scoring
392
- if source == "semantic":
393
- score += result.get("similarity", 0) * 0.4
394
- elif source == "episodic":
395
- # Keyword overlap
396
- score += self._keyword_similarity(task, result) * 0.3
397
- elif source == "working":
398
- # Boost recent working memory
399
- score += 0.3
400
-
401
- # Recency boost (time decay)
402
- recency_score = self._time_decay(result["timestamp"])
403
- score += recency_score * 0.2
404
-
405
- # Importance boost
406
- score += result.get("importance", 0.5) * 0.1
407
-
408
- scored.append({**result, "score": score})
409
-
410
- return sorted(scored, key=lambda x: x["score"], reverse=True)
411
-
412
- # src/memory/episodic_store.py
413
- class EpisodicMemoryStore:
414
- """JSON-based episodic memory storage"""
415
-
416
- def store(self, episode: Dict):
417
- # Store as JSON file
418
- episode_id = episode["id"]
419
- file_path = self.base_dir / f"{episode_id}.json"
420
-
421
- with open(file_path, 'w') as f:
422
- json.dump(episode, f, indent=2)
423
-
424
- def recall(self, task: Dict, top_k: int) -> List[Dict]:
425
- # Keyword matching across episodes
426
- task_keywords = self._extract_keywords(task["description"])
427
-
428
- results = []
429
- for episode_file in self.base_dir.glob("*.json"):
430
- with open(episode_file, 'r') as f:
431
- episode = json.load(f)
432
-
433
- episode_keywords = episode.get("keywords") or self._extract_keywords(episode["task"]["description"])  # stored episodes carry no "keywords" field
434
- similarity = self._jaccard_similarity(task_keywords, episode_keywords)
435
-
436
- if similarity > 0.1:
437
- results.append((episode, similarity))
438
-
439
- # Sort by similarity
440
- results.sort(key=lambda x: x[1], reverse=True)
441
- return [r[0] for r in results[:top_k]]
442
-
443
- # src/memory/semantic_store.py (OPTIONAL - requires ChromaDB)
444
- class SemanticMemoryStore:
445
- """
446
- Vector database semantic memory
447
- Based on Memoria framework (arXiv:2512.12686)
448
- """
449
-
450
- def __init__(self, base_dir: str):
451
- import chromadb
452
- self.client = chromadb.PersistentClient(path=base_dir)
453
- self.collection = self.client.get_or_create_collection("episodes")
454
-
455
- def store(self, episode: Dict):
456
- # Generate embedding
457
- text = episode["task"]["description"]
458
- embedding = self._generate_embedding(text)
459
-
460
- # Store in vector DB
461
- self.collection.add(
462
- ids=[episode["id"]],
463
- embeddings=[embedding],
464
- documents=[text],
465
- metadatas=[{"episode_json": json.dumps(episode)}]  # Chroma metadata values must be scalars, so the nested episode is serialized
466
- )
467
-
468
- def recall(self, task: Dict, top_k: int) -> List[Dict]:
469
- # Semantic similarity search
470
- query_embedding = self._generate_embedding(task["description"])
471
-
472
- results = self.collection.query(
473
- query_embeddings=[query_embedding],
474
- n_results=top_k
475
- )
476
-
477
- return [json.loads(m["episode_json"]) for m in results["metadatas"][0]]
478
- ```
479
-
480
- **Configuration**:
481
- ```yaml
482
- # tmlpd.yaml
483
- memory:
484
-   enabled: true
485
-
486
-   episodic:
487
-     type: json
488
-     path: .taskmaster/memory/episodic
489
-     max_episodes: 1000
490
-
491
-   semantic:
492
-     type: chromadb  # Optional, falls back to keyword
493
-     path: .taskmaster/memory/semantic
494
-     embedding_model: all-MiniLM-L6-v2  # Fast, good enough
495
-
496
-   working:
497
-     max_items: 100
498
-     ttl_seconds: 3600  # 1 hour
499
- ```
500
-
501
- **Research Backing**:
502
- - [Memoria (arXiv:2512.12686)](https://www.arxiv.org/abs/2512.12686) shows 50% improvement in long-term coherence
503
- - [A-Mem (arXiv:2502.12110)](https://arxiv.org/abs/2502.12110) is widely cited (144+ citations)
504
-
505
- **Files to Add**:
506
- - `src/memory/agentic_memory.py` (400 lines)
507
- - `src/memory/episodic_store.py` (200 lines)
508
- - `src/memory/semantic_store.py` (150 lines) - Optional
509
- - `src/memory/working_memory.py` (100 lines)
510
-
511
- **Effort**: 3-4 days
512
- **Value**: ⭐⭐⭐⭐⭐
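The ranking code above calls `_jaccard_similarity` and `_time_decay` without showing them. The following is a minimal sketch of what such helpers might look like, assuming keyword sets for the Jaccard score and an exponential half-life for recency. The function names, the explicit `now` argument, and the 24-hour half-life are assumptions made for illustration, not values from the roadmap.

```python
from datetime import datetime
from typing import Iterable


def jaccard_similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Size of the intersection divided by size of the union, in [0, 1]."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # two empty keyword sets are treated as unrelated
    return len(set_a & set_b) / len(set_a | set_b)


def time_decay(timestamp_iso: str, now: datetime, half_life_hours: float = 24.0) -> float:
    """Recency score in (0, 1]: 1.0 for a fresh episode, halving every half-life."""
    age_seconds = (now - datetime.fromisoformat(timestamp_iso)).total_seconds()
    age_hours = max(age_seconds / 3600.0, 0.0)  # clamp timestamps from the future
    return 0.5 ** (age_hours / half_life_hours)
```

Both functions return values in the unit interval, so they can be multiplied by the 0.3 and 0.2 weights used in `_rank_results` without further normalisation.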
513
-
514
- ---
515
-
516
- ### 4. **Workflow Executors (Implementation)** ⭐⭐⭐⭐⭐
517
-
518
- **Based on**: [Multi-Agent LLM Orchestration](https://arxiv.org/abs/2511.15755) + MONK's execution patterns
519
-
520
- **Problem**: TMLPD v2.0 has routing but no actual workflow execution
521
-
522
- **Impact**: Unlocks the 15% workflow use case (chaining, parallelization)
523
-
524
- #### Implementation
525
-
526
- ```python
527
- # src/workflows/executors.py
528
-
529
- class ChainingExecutor:
530
- """
531
- Execute tasks sequentially, passing output to next
532
- Based on deterministic incident response (arXiv:2511.15755)
533
- """
534
-
535
- async def execute(
536
- self,
537
- tasks: List[Dict[str, Any]],
538
- provider: BaseProvider
539
- ) -> List[Dict[str, Any]]:
540
- """
541
- Execute tasks in sequence, passing context
542
-
543
- Pattern: Task 1 → Task 2 → Task 3 → ...
544
- Each task gets previous task outputs as context
545
- """
546
- results = []
547
- context = {}
548
-
549
- for i, task in enumerate(tasks):
550
- # Add context from previous tasks
551
- if context:
552
- task = {**task, "previous_results": dict(context)}  # copy so the caller's task and earlier snapshots are not mutated
553
-
554
- # Execute with agent
555
- agent = TMLEnhancedAgent(
556
- agent_id=f"chain_agent_{i}",
557
- provider=provider,
558
- model=provider.default_model
559
- )
560
-
561
- result = agent.execute_task(task)
562
- results.append(result)
563
-
564
- # Pass output to next task
565
- context[f"task_{i}_output"] = result.get("output")
566
-
567
- if not result.get("success"):
568
- # Stop chain on failure
569
- break
570
-
571
- return results
572
-
573
- class ParallelizationExecutor:
574
- """
575
- Execute independent tasks in parallel
576
- Based on AgentOrchestra (arXiv:2506.12508)
577
- """
578
-
579
- async def execute(
580
- self,
581
- tasks: List[Dict[str, Any]],
582
- provider: BaseProvider,
583
- max_concurrent: int = 5
584
- ) -> List[Dict[str, Any]]:
585
- """
586
- Execute tasks concurrently
587
-
588
- Pattern:
589
- Task 1 ─┐
590
- Task 2 ─┼→ Aggregate Results
591
- Task 3 ─┘
592
- """
593
- # Create semaphore to limit concurrency
594
- semaphore = asyncio.Semaphore(max_concurrent)
595
-
596
- async def execute_one(task):
597
- async with semaphore:
598
- agent = TMLEnhancedAgent(
599
- agent_id="parallel_agent",
600
- provider=provider,
601
- model=provider.default_model
602
- )
603
- return agent.execute_task(task)
604
-
605
- # Execute all tasks concurrently
606
- results = await asyncio.gather(*[
607
- execute_one(task) for task in tasks
608
- ], return_exceptions=True)
609
-
610
- return results
611
-
612
- class OrchestratorExecutor:
613
- """
614
- Hierarchical orchestration for complex tasks
615
- Based on AgentOrchestra framework (arXiv:2506.12508)
616
- """
617
-
618
- async def execute(
619
- self,
620
- task: Dict[str, Any],
621
- provider: BaseProvider
622
- ) -> Dict[str, Any]:
623
- """
624
- Break down complex task and orchestrate
625
-
626
- Pattern:
627
- 1. Break task into subtasks
628
- 2. Classify subtask dependencies
629
- 3. Execute parallel where possible
630
- 4. Execute chain where dependent
631
- 5. Synthesize results
632
- """
633
- # Break down task
634
- subtasks = await self._break_down_task(task)
635
-
636
- # Classify dependencies
637
- dependency_graph = self._build_dependency_graph(subtasks)
638
-
639
- # Identify parallelizable chains
640
- chains = self._extract_chains(dependency_graph)
641
-
642
- # Execute chains in parallel, tasks within each chain sequentially
643
- chain_results = await asyncio.gather(*[
644
- self._execute_chain(chain, provider)
645
- for chain in chains
646
- ])
647
-
648
- # Synthesize results
649
- return self._synthesize_results(chain_results)
650
-
651
- async def _execute_chain(
652
- self,
653
- chain: List[Dict],
654
- provider: BaseProvider
655
- ) -> List[Dict]:
656
- """Execute a chain of dependent tasks"""
657
- executor = ChainingExecutor()
658
- return await executor.execute(chain, provider)
659
- ```
660
-
661
- **Research Backing**: [arXiv:2511.15755](https://arxiv.org/abs/2511.15755) shows deterministic multi-agent orchestration achieves 90%+ success rate
662
-
663
- **Files to Add**:
664
- - `src/workflows/executors.py` (350 lines)
665
-
666
- **Effort**: 2-3 days
667
- **Value**: ⭐⭐⭐⭐⭐
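In the executors above, `agent.execute_task(task)` is called without `await`, so if that method is synchronous the semaphore alone does not give real concurrency: each call blocks the event loop until it returns. The following is a minimal sketch of bounded parallel execution for a blocking worker, assuming Python 3.9+ for `asyncio.to_thread`. The helper name `run_bounded` and the `worker` callable are hypothetical stand-ins for the agent call.

```python
import asyncio
from typing import Any, Callable, Dict, List


async def run_bounded(
    tasks: List[Dict[str, Any]],
    worker: Callable[[Dict[str, Any]], Any],
    max_concurrent: int = 5,
) -> List[Any]:
    """Run a blocking worker over tasks with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(task: Dict[str, Any]) -> Any:
        async with semaphore:
            # Off-load the blocking call to a thread so other tasks can progress.
            return await asyncio.to_thread(worker, task)

    # Results keep the input order; exceptions are returned in place, not raised.
    return await asyncio.gather(*(run_one(t) for t in tasks), return_exceptions=True)
```

Because `return_exceptions=True` places exception objects in the result list, callers should check each entry with `isinstance(result, Exception)` before treating it as a result dictionary.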
668
-
669
- ---
670
-
671
- ## 🟡 HIGH PRIORITY IMPROVEMENTS
672
-
673
- ### 5. **Function Calling / Tool Use Enhancement** ⭐⭐⭐⭐
674
-
675
- **Based on**: [ToolACE framework](https://arxiv.org/html/2409.00920v2) + [Tool Instruction](https://aclanthology.org/2025.naacl-long.44.pdf)
676
-
677
- **Problem**: Skills are loaded as text, not invoked as structured function calls
678
-
679
- **Impact**: Research shows structured tool calling improves reliability by 40%
680
-
681
- #### Implementation
682
-
683
- ```python
684
- # src/skills/function_calling_skill.py
685
-
686
- class FunctionCallingSkill:
687
- """
688
- Skill that can be called as a function
689
- Based on ToolACE (arXiv:2409.00920)
690
- """
691
-
692
- def __init__(self, skill_path: Path):
693
- self.skill_path = skill_path
694
- self.metadata = self._load_metadata()
695
- self.functions = self._extract_functions()
696
-
697
- def _extract_functions(self) -> Dict[str, callable]:
698
- """
699
- Extract callable functions from SKILL.md
700
-
701
- Format in SKILL.md:
702
- ~~~markdown
703
- ## Function: create_component
704
- **Description**: Create a React component with best practices
705
- **Parameters**:
706
- - name (string): Component name
707
- - props (object): Component props
708
- **Example**:
709
- ~~~
710
- """
711
- functions = {}
712
-
713
- # Parse SKILL.md for function definitions
714
- content = self.skill_path.read_text()
715
-
716
- # Extract function blocks
717
- import re
718
- function_pattern = r"## Function: (\w+)\s*\n\*\*Description\*\*:\s*([^\n]+)"
719
-
720
- for match in re.finditer(function_pattern, content):
721
- func_name = match.group(1)
722
- description = match.group(2)
723
-
724
- # Create callable wrapper
725
- def make_func(name, desc):
726
- def func(**kwargs):
727
- return self._execute_function(name, kwargs)
728
- func.__name__ = name
729
- func.__doc__ = desc
730
- return func
731
-
732
- functions[func_name] = make_func(func_name, description)
733
-
734
- return functions
735
-
736
- def get_function_signature(self, func_name: str) -> Dict:
737
- """
738
- Get function signature for LLM function calling
739
-
740
- Returns format compatible with OpenAI/Anthropic function calling
741
- """
742
- if func_name not in self.functions:
743
- return None
744
-
745
- return {
746
- "name": func_name,
747
- "description": self.functions[func_name].__doc__,
748
- "parameters": self._get_parameters(func_name)
749
- }
750
-
751
- async def call_function(self, func_name: str, **kwargs) -> str:
752
- """Call skill function and return result"""
753
- if func_name not in self.functions:
754
- raise ValueError(f"Function {func_name} not found")
755
-
756
- return await self.functions[func_name](**kwargs)
757
- ```
758
-
759
- **Skills with Function Calling**:
760
-
761
- ```markdown
762
- # tmlpd-skills/frontend/SKILL.md
763
-
764
- ## Function: create_react_component
765
-
766
- **Description**: Create a React component following best practices
767
-
768
- **Parameters**:
769
- - component_name (string, required): Name of the component
770
- - props (object, optional): Component props definition
771
- - state_management (string, optional): State management approach (useState, useContext, zustand)
772
-
773
- **Returns**:
774
- - component_code (string): Generated React component code
775
- - usage_example (string): Example usage
776
-
777
- **Example**:
778
- ```python
779
- result = await skill.call_function(
780
- "create_react_component",
781
- component_name="UserProfile",
782
- props={"userId": "string", "name": "string"},
783
- state_management="useState"
784
- )
785
- ```
786
-
787
- **Research Backing**: [ToolACE (arXiv:2409.00920)](https://arxiv.org/html/2409.00920v2) shows multi-agent function calling achieves 85%+ accuracy
788
-
789
- **Files to Add**:
790
- - `src/skills/function_calling_skill.py` (250 lines)
791
- - Update `tmlpd-skills/*/SKILL.md` with function definitions
792
-
793
- **Effort**: 2 days
794
- **Value**: ⭐⭐⭐⭐
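`get_function_signature` relies on a `_get_parameters` helper that the roadmap does not show. The following is a minimal sketch of how the `**Parameters**` bullets in a SKILL.md could be turned into a JSON-Schema-style object, assuming each bullet has the form `- name (type, required|optional): description` as in the example above. The function name `parse_parameters` and the exact output shape are assumptions for illustration.

```python
import re
from typing import Any, Dict, List

# Matches "- name (type, flag): description"; only the first parenthesised
# group is treated as the type spec, so descriptions may contain parentheses.
PARAM_LINE = re.compile(r"^-\s*(\w+)\s*\(([^)]*)\):\s*(.+)$")


def parse_parameters(block: str) -> Dict[str, Any]:
    """Convert SKILL.md parameter bullets into a schema-like dict (hypothetical helper)."""
    properties: Dict[str, Dict[str, str]] = {}
    required: List[str] = []
    for line in block.splitlines():
        match = PARAM_LINE.match(line.strip())
        if not match:
            continue  # skip headings, blank lines, and anything not shaped like a bullet
        name, spec, description = match.groups()
        parts = [p.strip() for p in spec.split(",")]
        properties[name] = {"type": parts[0], "description": description.strip()}
        if "required" in parts[1:]:
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}
```

Applied to the three bullets of `create_react_component`, this marks only `component_name` as required and records `props` with type `object`.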
795
-
796
- ---
797
-
798
- ### 6. **CLI with Command Completion** ⭐⭐⭐⭐
799
-
800
- **Based on**: MONK's CLI patterns + production usability requirements
801
-
802
- **Problem**: The lack of a CLI makes TMLPD hard to use
803
-
804
- **Impact**: Makes TMLPD a practical developer tool
805
-
806
- #### Implementation
807
-
808
- ```python
809
- # tmlpd/cli.py
810
-
811
- import click
812
- from rich.console import Console
813
- from rich.table import Table
814
-
815
- console = Console()
816
-
817
- @click.group()
818
- @click.version_option(version="2.0.0")
819
- def tmlpd():
820
- """TMLPD - Multi-LLM Parallel Deployment with Agent Skills"""
821
- pass
822
-
823
- @tmlpd.command()
824
- @click.argument("task")
825
- @click.option("--provider", "-p", help="Override provider selection")
826
- @click.option("--skills", "-s", multiple=True, help="Specify skills to use")
827
- @click.option("--difficulty", "-d", type=click.Choice(["trivial", "simple", "medium", "complex", "expert"]), help="Set difficulty level")
828
- def execute(task, provider, skills, difficulty):
829
- """Execute a task with TMLPD"""
830
-
831
- # Display execution plan
832
- console.print(f"\n[bold blue]TMLPD Task Execution[/bold blue]\n")
833
- console.print(f"Task: {task}")
834
-
835
- if difficulty:
836
- console.print(f"Difficulty: [yellow]{difficulty}[/yellow]")
837
-
838
- if skills:
839
- console.print(f"Skills: {', '.join(skills)}")
840
-
841
- # Execute
842
- result = execute_task(
843
- task_description=task,
844
- provider_override=provider,
845
- skills=list(skills),
846
- difficulty_level=difficulty
847
- )
848
-
849
- # Display result
850
- if result["success"]:
851
- console.print(f"\n[green]✓ Success[/green]")
852
- console.print(f"Tokens: {result['tokens_used']}")
853
- console.print(f"Cost: ${result['cost']:.4f}")
854
- console.print(f"Time: {result['execution_time']:.2f}s")
855
- else:
856
- console.print(f"\n[red]✗ Failed[/red]")
857
- console.print(f"Error: {result.get('error', 'Unknown error')}")
858
-
859
- @tmlpd.command()
860
- @click.argument("task")
861
- def route(task):
862
- """Route a task to see execution plan without executing"""
863
-
864
- router = DifficultyAwareRouter()
865
- difficulty = router.classify_difficulty({"description": task})
866
- provider = get_provider_for_difficulty(difficulty)
867
-
868
- # Display routing table
869
- table = Table(title="Task Routing Plan")
870
- table.add_column("Attribute", style="cyan")
871
- table.add_column("Value", style="yellow")
872
-
873
- table.add_row("Task", task[:80] + "..." if len(task) > 80 else task)
874
- table.add_row("Difficulty", difficulty)
875
- table.add_row("Provider", provider)
876
- table.add_row("Est. Cost", f"${estimate_cost(task, difficulty):.4f}")
877
- table.add_row("Est. Time", f"{estimate_time(task, difficulty):.1f}s")
878
-
879
- console.print(table)
880
-
881
- @tmlpd.command()
882
- @click.option("--type", type=click.Choice(["episodic", "semantic", "all"]), default="all")
883
- @click.option("--limit", "-n", default=10, help="Number of memories to show")
884
- def memory(type, limit):
885
- """Show memory contents"""
886
-
887
- mem = AgenticMemory()
888
- memories = mem.get_recent_memories(memory_type=type, limit=limit)
889
-
890
- table = Table(title=f"Recent {type.title()} Memories")
891
- table.add_column("ID", style="cyan")
892
- table.add_column("Task", style="white")
893
- table.add_column("Date", style="dim")
894
-
895
- for entry in memories:  # renamed so the loop does not shadow the AgenticMemory instance "mem"
895
- table.add_row(
896
- entry["id"][:8],
897
- entry["task"]["description"][:50],
898
- entry["timestamp"][:10]
900
- )
901
-
902
- console.print(table)
903
-
904
- @tmlpd.command()
905
- def providers():
906
- """Show provider status"""
907
-
908
- registry = get_provider_registry()
909
-
910
- table = Table(title="Provider Status")
911
- table.add_column("Provider", style="cyan")
912
- table.add_column("Status")  # per-row markup below sets the colour; "healthy" was undefined here
913
- table.add_column("Model", style="white")
914
- table.add_column("Priority", style="yellow")
915
-
916
- for name, provider in registry.providers.items():
917
- health = provider.get_health()
918
- status = "[green]✓ Healthy[/green]" if health["status"] == "healthy" else "[red]✗ Unhealthy[/red]"
919
-
920
- table.add_row(
921
- name,
922
- status,
923
- provider.model,
924
- str(provider.priority)
925
- )
926
-
927
- console.print(table)
928
-
929
- # Tab completion support
930
- @tmlpd.command()
931
- def completion():
932
- """Generate shell completion"""
933
- click.echo("# Bash completion via Click's built-in support (Click 8+); add the next line to ~/.bashrc")
934
- click.echo('eval "$(_TMLPD_COMPLETE=bash_source tmlpd)"')
935
- ```
936
-
937
- **Files to Add**:
938
- - `tmlpd/cli.py` (400 lines)
939
- - `tmlpd/__init__.py` (50 lines)
940
- - `setup.py` (100 lines)
941
-
942
- **Effort**: 2-3 days
943
- **Value**: ⭐⭐⭐⭐
944
-
945
- ---
946
-
947
- ### 7. **Git-Versioned Context** ⭐⭐⭐⭐
948
-
949
- **Based on**: [Manage Context like Git](https://arxiv.org/abs/2508.00031) + MONK's checkpointing
950
-
951
- **Problem**: Checkpoints are simple JSON, no versioning or branching
952
-
953
- **Impact**: Research shows Git-like context management improves reproducibility by 60%
954
-
955
- #### Implementation
956
-
957
- ```python
958
- # src/state/versioned_context.py
959
-
960
- class VersionedContext:
961
- """
962
- Git-inspired versioned context management
963
- Based on arXiv:2508.00031 (Manage Context like Git)
964
- """
965
-
966
- def __init__(self, context_dir: str = ".taskmaster/context"):
967
- self.context_dir = Path(context_dir)
968
- self.git = self._init_git_repo()
969
-
970
- def commit_context(
971
- self,
972
- state: Dict[str, Any],
973
- message: str,
974
- author: str = "tmlpd"
975
- ) -> str:
976
- """
977
- Create a context commit (like git commit)
978
-
979
- Each commit stores:
980
- - Full state snapshot
981
- - Parent reference(s)
982
- - Commit message
983
- - Timestamp
984
- - Author
985
- """
986
- commit_id = f"commit_{uuid4()}"
987
-
988
- # Create commit object
989
- commit = {
990
- "id": commit_id,
991
- "parent": self.get_head(),
992
- "message": message,
993
- "author": author,
994
- "timestamp": datetime.now().isoformat(),
995
- "state": state
996
- }
997
-
998
- # Store commit
999
- commit_file = self.context_dir / "commits" / f"{commit_id}.json"
1000
- commit_file.parent.mkdir(parents=True, exist_ok=True)
1001
-
1002
- with open(commit_file, 'w') as f:
1003
- json.dump(commit, f, indent=2)
1004
-
1005
- # Update HEAD
1006
- self._update_head(commit_id)
1007
-
1008
- return commit_id
1009
-
1010
- def create_branch(self, branch_name: str, from_commit: str = None):
1011
- """Create a new branch (like git branch)"""
1012
- if from_commit is None:
1013
- from_commit = self.get_head()
1014
-
1015
- # Update branch reference
1016
- branch_file = self.context_dir / "refs" / "heads" / branch_name
1017
- branch_file.parent.mkdir(parents=True, exist_ok=True)
1018
-
1019
- branch_file.write_text(from_commit)
1020
-
1021
- def checkout(self, ref: str):
1022
- """Checkout a branch or commit (like git checkout)"""
1023
- # Resolve ref to commit ID
1024
- commit_id = self._resolve_ref(ref)
1025
-
1026
- # Load commit state
1027
- commit_file = self.context_dir / "commits" / f"{commit_id}.json"
1028
-
1029
- if not commit_file.exists():
1030
- raise ValueError(f"Commit {commit_id} not found")
1031
-
1032
- with open(commit_file, 'r') as f:
1033
- commit = json.load(f)
1034
-
1035
- # Restore state
1036
- return commit["state"]
1037
-
1038
- def log(self, ref: str = "HEAD", limit: int = 10) -> List[Dict]:
1039
- """Show commit history (like git log)"""
1040
- commit_id = self._resolve_ref(ref)
1041
- commits = []
1042
-
1043
- while commit_id and len(commits) < limit:
1044
- commit_file = self.context_dir / "commits" / f"{commit_id}.json"
1045
-
1046
- if not commit_file.exists():
1047
- break
1048
-
1049
- with open(commit_file, 'r') as f:
1050
- commit = json.load(f)
1051
-
1052
- commits.append(commit)
1053
- commit_id = commit.get("parent")
1054
-
1055
- return commits
1056
-
1057
- def merge(self, branch: str):
1058
- """Merge a branch (like git merge)"""
1059
- branch_file = self.context_dir / "refs" / "heads" / branch
1060
- branch_commit = branch_file.read_text().strip()
1061
-
1062
- # Get current HEAD
1063
- head_commit = self.get_head()
1064
-
1065
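- # Create merge commit
- # NOTE: only merge metadata is stored as the state here, so checkout() of this commit returns the metadata, not a merged context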
1066
- merge_state = {
1067
- "merged_from": branch_commit,
1068
- "merged_into": head_commit,
1069
- "merge_strategy": "auto"
1070
- }
1071
-
1072
- return self.commit_context(
1073
- state=merge_state,
1074
- message=f"Merge branch '{branch}'",
1075
- author="tmlpd-merge"
1076
- )
1077
- ```
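The class above relies on helpers it does not define (`_init_git_repo`, `get_head`, `_update_head`, `_resolve_ref`), and its merge commit stores only metadata. As a minimal, self-contained sketch of the same idea, the following fills those gaps under stated assumptions: `ContextStore` and all of its method names are hypothetical, HEAD is assumed to hold a branch name, and the merge is a naive dictionary overlay (the merged branch wins on conflicting keys), not a three-way merge.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List, Optional
from uuid import uuid4


class ContextStore:
    """Hypothetical file-backed store: HEAD, refs/heads/<branch>, commits/<id>.json."""

    def __init__(self, root: str, default_branch: str = "main"):
        self.root = Path(root)
        self.commits = self.root / "commits"
        self.heads = self.root / "refs" / "heads"
        self.commits.mkdir(parents=True, exist_ok=True)
        self.heads.mkdir(parents=True, exist_ok=True)
        self.head_file = self.root / "HEAD"
        if not self.head_file.exists():
            self.head_file.write_text(default_branch)  # assumption: HEAD stores a branch name

    def resolve(self, ref: str) -> Optional[str]:
        """Map "HEAD", a branch name, or a commit ID to a commit ID (None if unknown)."""
        if ref == "HEAD":
            ref = self.head_file.read_text().strip()
        branch_file = self.heads / ref
        if branch_file.exists():
            return branch_file.read_text().strip() or None
        return ref if (self.commits / f"{ref}.json").exists() else None

    def load(self, commit_id: str) -> Dict[str, Any]:
        return json.loads((self.commits / f"{commit_id}.json").read_text())

    def commit(self, state: Dict[str, Any], message: str,
               parents: Optional[List[str]] = None) -> str:
        if parents is None:
            head = self.resolve("HEAD")
            parents = [head] if head else []
        commit_id = f"commit_{uuid4().hex}"
        record = {
            "id": commit_id,
            "parents": parents,  # a list, so merge commits can record two parents
            "message": message,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "state": state,
        }
        (self.commits / f"{commit_id}.json").write_text(json.dumps(record, indent=2))
        # Advance the branch that HEAD currently names.
        (self.heads / self.head_file.read_text().strip()).write_text(commit_id)
        return commit_id

    def branch(self, name: str, from_ref: str = "HEAD") -> None:
        target = self.resolve(from_ref)
        if target is None:
            raise ValueError(f"Cannot branch from unknown ref {from_ref!r}")
        (self.heads / name).write_text(target)

    def switch(self, name: str) -> Dict[str, Any]:
        """Point HEAD at an existing branch and return that branch's state."""
        if not (self.heads / name).exists():
            raise ValueError(f"Branch {name!r} not found")
        self.head_file.write_text(name)
        return self.load(self.resolve("HEAD"))["state"]

    def merge(self, name: str) -> str:
        ours, theirs = self.resolve("HEAD"), self.resolve(name)
        if ours is None or theirs is None:
            raise ValueError("Both branches need at least one commit")
        # Naive overlay: keys from the merged branch overwrite ours.
        merged = {**self.load(ours)["state"], **self.load(theirs)["state"]}
        return self.commit(merged, f"Merge branch '{name}'", parents=[ours, theirs])


# Usage: branch, commit on the branch, then merge back into main.
with tempfile.TemporaryDirectory() as tmp:
    store = ContextStore(tmp)
    store.commit({"task": "plan"}, "initial context")
    store.branch("experiment")
    store.switch("experiment")
    store.commit({"task": "plan", "model": "small"}, "try small model")
    store.switch("main")
    merged = store.load(store.merge("experiment"))
```

Because the merge commit carries a full state snapshot and both parent IDs, loading it yields a usable context and the history stays traversable from either side.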
1078
-
1079
- **Research Backing**: [arXiv:2508.00031](https://arxiv.org/abs/2508.00031) shows Git-like context management enables experiment tracking and reproducibility
1080
-
1081
- **Files to Add**:
1082
- - `src/state/versioned_context.py` (400 lines)
1083
-
1084
- **Effort**: 2 days
1085
- **Value**: ⭐⭐⭐⭐
1086
-
1087
- ---
1088
-
1089
- ## 🟢 MEDIUM PRIORITY (Optional)
1090
-
1091
- ### 8. **Spatial Memory for Multi-Step Agents** ⭐⭐⭐
1092
-
1093
- **Based on**: [Spatial Memory for Multi-Step LLM Agents](https://arxiv.org/abs/2505.19436)
1094
-
1095
- **Implementation**: Task Memory Engine (TME) with spatial reasoning
1096
-
1097
- ---
1098
-
1099
- ### 9. **Episodic Memory Enhancement** ⭐⭐⭐
1100
-
1101
- **Based on**: [Episodic Memory for Long-Term LLM](https://arxiv.org/abs/2502.06975)
1102
-
1103
- **Implementation**: Explicit memory storage with temporal indexing
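One way to read "explicit memory storage with temporal indexing" is a store that keeps events sorted by timestamp so time-window lookups are a binary search. The sketch below illustrates only that reading; `EpisodicMemory` and its methods are hypothetical names, and nothing here is taken from the linked paper.

```python
import bisect
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Any, List


@dataclass
class EpisodicMemory:
    """Hypothetical store: parallel lists kept sorted by timestamp."""
    _times: List[datetime] = field(default_factory=list)
    _events: List[Any] = field(default_factory=list)

    def record(self, when: datetime, event: Any) -> None:
        # Insert at the sorted position so out-of-order writes stay indexed.
        idx = bisect.bisect_right(self._times, when)
        self._times.insert(idx, when)
        self._events.insert(idx, event)

    def between(self, start: datetime, end: datetime) -> List[Any]:
        """Return events with start <= timestamp <= end, oldest first."""
        lo = bisect.bisect_left(self._times, start)
        hi = bisect.bisect_right(self._times, end)
        return self._events[lo:hi]

    def latest(self, n: int = 1) -> List[Any]:
        return self._events[-n:] if n > 0 else []


# Usage: events recorded out of order are still returned chronologically.
t0 = datetime(2025, 1, 1, 12, 0)
memory = EpisodicMemory()
memory.record(t0, "received task")
memory.record(t0 + timedelta(minutes=10), "returned answer")
memory.record(t0 + timedelta(minutes=5), "called tool")
window = memory.between(t0, t0 + timedelta(minutes=5))
```

List insertion is linear in the number of stored events, which is fine for a sketch; a larger store would want a database index or a tree-based structure.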
1104
-
1105
- ---
1106
-
1107
- ### 10. **Self-Organizing Memory** ⭐⭐⭐
1108
-
1109
- **Based on**: [Self-Organizing Agent Memory](https://arxiv.org/html/2508.03341v2)
1110
-
1111
- **Implementation**: Cognitive science-inspired memory clustering
1112
-
1113
- ---
1114
-
1115
- ## 📊 IMPLEMENTATION PRIORITY MATRIX (Updated)
1116
-
1117
- ```
1118
- HIGH IMPACT, LOW EFFORT (DO FIRST):
1119
- ├─ Difficulty-Aware Routing (1-2 days) ⭐⭐⭐⭐⭐
1120
- ├─ CLI Interface (2-3 days) ⭐⭐⭐⭐
1121
- ├─ Function Calling Enhancement (2 days) ⭐⭐⭐⭐
1122
- └─ Better Error Messages (0.5 days) ⭐⭐⭐⭐
1123
-
1124
- HIGH IMPACT, HIGH EFFORT (DO NEXT):
1125
- ├─ Multi-Provider System (2-3 days) ⭐⭐⭐⭐⭐
1126
- ├─ Advanced Memory System (3-4 days) ⭐⭐⭐⭐⭐
1127
- ├─ Workflow Executors (2-3 days) ⭐⭐⭐⭐⭐
1128
- └─ Git-Versioned Context (2 days) ⭐⭐⭐⭐
1129
-
1130
- MEDIUM IMPACT:
1131
- ├─ Spatial Memory (2-3 days) ⭐⭐⭐
1132
- ├─ Episodic Memory Enhancement (1-2 days) ⭐⭐⭐
1133
- └─ Self-Organizing Memory (2-3 days) ⭐⭐⭐
1134
- ```
1135
-
1136
- ---
1137
-
1138
- ## 🚀 RECOMMENDED IMPLEMENTATION ORDER
1139
-
1140
- ### Week 1: Core Infrastructure
1141
- 1. Multi-Provider System (2-3 days)
1142
- 2. Difficulty-Aware Routing (1-2 days)
1143
-
1144
- ### Week 2: Memory & Context
1145
- 3. Advanced Memory System (3-4 days)
1146
- 4. Git-Versioned Context (2 days)
1147
-
1148
- ### Week 3: Execution & Interface
1149
- 5. Workflow Executors (2-3 days)
1150
- 6. CLI Interface (2-3 days)
1151
- 7. Function Calling Enhancement (2 days)
1152
-
1153
- **Total**: 14-19 working days by the estimates above, so roughly 3-4 weeks to a research-backed TMLPD v2.1
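Summing the per-item estimates from the weekly plan is a quick sanity check on the schedule. The snippet below copies the day ranges from the plan as written; the `PLAN` structure and `effort_range` helper are illustrative names only.

```python
# (item, min_days, max_days), copied from the weekly plan
PLAN = {
    "Week 1": [("Multi-Provider System", 2, 3), ("Difficulty-Aware Routing", 1, 2)],
    "Week 2": [("Advanced Memory System", 3, 4), ("Git-Versioned Context", 2, 2)],
    "Week 3": [
        ("Workflow Executors", 2, 3),
        ("CLI Interface", 2, 3),
        ("Function Calling Enhancement", 2, 2),
    ],
}


def effort_range(items):
    """Return (best-case, worst-case) total days for a list of plan items."""
    return sum(lo for _, lo, _ in items), sum(hi for _, _, hi in items)


weekly = {week: effort_range(items) for week, items in PLAN.items()}
total = effort_range([item for items in PLAN.values() for item in items])

for week, (lo, hi) in weekly.items():
    print(f"{week}: {lo}-{hi} days")
print(f"Total: {total[0]}-{total[1]} days")
```

By these numbers Week 2 and Week 3 each exceed five working days even in the best case, so the plan only fits three calendar weeks if some items overlap or run in parallel.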
1154
-
1155
- ---
1156
-
1157
- ## 📚 RESEARCH REFERENCES
1158
-
1159
- ### Multi-Agent Orchestration
1160
- - [Multi-Agent LLM Orchestration (arXiv:2511.15755)](https://arxiv.org/abs/2511.15755)
1161
- - [AgentOrchestra Framework (arXiv:2506.12508)](https://arxiv.org/html/2506.12508v1)
1162
- - [Difficulty-Aware Orchestration (arXiv:2509.11079)](https://arxiv.org/html/2509.11079v2)
1163
-
1164
- ### Memory Systems
1165
- - [Memoria Framework (arXiv:2512.12686)](https://www.arxiv.org/abs/2512.12686)
1166
- - [A-Mem (arXiv:2502.12110)](https://arxiv.org/abs/2502.12110)
1167
- - [Git-Like Context Management (arXiv:2508.00031)](https://arxiv.org/abs/2508.00031)
1168
-
1169
- ### Tool Use & Function Calling
1170
- - [ToolACE (arXiv:2409.00920)](https://arxiv.org/html/2409.00920v2)
1171
- - [Tool Instruction Enhancement (NAACL 2025)](https://aclanthology.org/2025.naacl-long.44.pdf)
1172
-
1173
- ### Advanced Memory
1174
- - [Spatial Memory (arXiv:2505.19436)](https://arxiv.org/abs/2505.19436)
1175
- - [Episodic Memory (arXiv:2502.06975)](https://arxiv.org/abs/2502.06975)
1176
- - [Self-Organizing Memory (arXiv:2508.03341)](https://arxiv.org/html/2508.03341v2)
1177
-
1178
- ---
1179
-
1180
- **Question**: Which of these research-backed improvements should I implement first? I recommend starting with **Multi-Provider System** (enables everything else) or **Difficulty-Aware Routing** (immediate impact).