mdcontext 0.0.1 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (337)
  1. package/.changeset/README.md +28 -0
  2. package/.changeset/config.json +11 -0
  3. package/.claude/settings.local.json +25 -0
  4. package/.github/workflows/ci.yml +83 -0
  5. package/.github/workflows/claude-code-review.yml +44 -0
  6. package/.github/workflows/claude.yml +85 -0
  7. package/.github/workflows/release.yml +113 -0
  8. package/.tldrignore +112 -0
  9. package/BACKLOG.md +338 -0
  10. package/CONTRIBUTING.md +186 -0
  11. package/NOTES/NOTES +44 -0
  12. package/README.md +434 -11
  13. package/biome.json +36 -0
  14. package/cspell.config.yaml +14 -0
  15. package/dist/chunk-23UPXDNL.js +3044 -0
  16. package/dist/chunk-2W7MO2DL.js +1366 -0
  17. package/dist/chunk-3NUAZGMA.js +1689 -0
  18. package/dist/chunk-7TOWB2XB.js +366 -0
  19. package/dist/chunk-7XOTOADQ.js +3065 -0
  20. package/dist/chunk-AH2PDM2K.js +3042 -0
  21. package/dist/chunk-BNXWSZ63.js +3742 -0
  22. package/dist/chunk-BTL5DJVU.js +3222 -0
  23. package/dist/chunk-HDHYG7E4.js +104 -0
  24. package/dist/chunk-HLR4KZBP.js +3234 -0
  25. package/dist/chunk-IP3FRFEB.js +1045 -0
  26. package/dist/chunk-KHU56VDO.js +3042 -0
  27. package/dist/chunk-KRYIFLQR.js +88 -0
  28. package/dist/chunk-LBSDNLEM.js +287 -0
  29. package/dist/chunk-MNTQ7HCP.js +2643 -0
  30. package/dist/chunk-MUJELQQ6.js +1387 -0
  31. package/dist/chunk-MXJGMSLV.js +2199 -0
  32. package/dist/chunk-N6QJGC3Z.js +2636 -0
  33. package/dist/chunk-OBELGBPM.js +1713 -0
  34. package/dist/chunk-OT7R5XTA.js +3192 -0
  35. package/dist/chunk-P7X4RA2T.js +106 -0
  36. package/dist/chunk-PIDUQNC2.js +3185 -0
  37. package/dist/chunk-POGCDIH4.js +3187 -0
  38. package/dist/chunk-PSIEOQGZ.js +3043 -0
  39. package/dist/chunk-PVRT3IHA.js +3238 -0
  40. package/dist/chunk-QNN4TT23.js +1430 -0
  41. package/dist/chunk-RE3R45RJ.js +3042 -0
  42. package/dist/chunk-S7E6TFX6.js +803 -0
  43. package/dist/chunk-SG6GLU4U.js +1378 -0
  44. package/dist/chunk-SJCDV2ST.js +274 -0
  45. package/dist/chunk-SYE5XLF3.js +104 -0
  46. package/dist/chunk-T5VLYBZD.js +103 -0
  47. package/dist/chunk-TOQB7VWU.js +3238 -0
  48. package/dist/chunk-VFNMZ4ZQ.js +3228 -0
  49. package/dist/chunk-VVTGZNBT.js +1629 -0
  50. package/dist/chunk-W7Q4RFEV.js +104 -0
  51. package/dist/chunk-XTYYVRLO.js +3190 -0
  52. package/dist/chunk-Y6MDYVJD.js +3063 -0
  53. package/dist/cli/main.d.ts +1 -0
  54. package/dist/cli/main.js +5458 -0
  55. package/dist/index.d.ts +653 -0
  56. package/dist/index.js +79 -0
  57. package/dist/mcp/server.d.ts +1 -0
  58. package/dist/mcp/server.js +472 -0
  59. package/dist/schema-BAWSG7KY.js +22 -0
  60. package/dist/schema-E3QUPL26.js +20 -0
  61. package/dist/schema-EHL7WUT6.js +20 -0
  62. package/docs/019-USAGE.md +625 -0
  63. package/docs/020-current-implementation.md +364 -0
  64. package/docs/021-DOGFOODING-FINDINGS.md +175 -0
  65. package/docs/BACKLOG.md +80 -0
  66. package/docs/CONFIG.md +1123 -0
  67. package/docs/DESIGN.md +439 -0
  68. package/docs/ERRORS.md +383 -0
  69. package/docs/PROJECT.md +88 -0
  70. package/docs/ROADMAP.md +407 -0
  71. package/docs/summarization.md +320 -0
  72. package/docs/test-links.md +9 -0
  73. package/justfile +40 -0
  74. package/package.json +74 -9
  75. package/pnpm-workspace.yaml +5 -0
  76. package/research/INDEX.md +315 -0
  77. package/research/code-review/README.md +90 -0
  78. package/research/code-review/cli-error-handling-review.md +979 -0
  79. package/research/code-review/code-review-validation-report.md +464 -0
  80. package/research/code-review/main-ts-review.md +1128 -0
  81. package/research/config-analysis/01-current-implementation.md +470 -0
  82. package/research/config-analysis/02-strategy-recommendation.md +428 -0
  83. package/research/config-analysis/03-task-candidates.md +715 -0
  84. package/research/config-analysis/033-research-configuration-management.md +828 -0
  85. package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
  86. package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
  87. package/research/config-docs/SUMMARY.md +357 -0
  88. package/research/config-docs/TEST-RESULTS.md +776 -0
  89. package/research/config-docs/TODO.md +542 -0
  90. package/research/config-docs/analysis.md +744 -0
  91. package/research/config-docs/fix-validation.md +502 -0
  92. package/research/config-docs/help-audit.md +264 -0
  93. package/research/config-docs/help-system-analysis.md +890 -0
  94. package/research/dogfood/consolidated-tool-evaluation.md +373 -0
  95. package/research/dogfood/strategy-a/a-synthesis.md +184 -0
  96. package/research/dogfood/strategy-a/a1-docs.md +226 -0
  97. package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
  98. package/research/dogfood/strategy-a/a3-llm.md +164 -0
  99. package/research/dogfood/strategy-b/b-synthesis.md +228 -0
  100. package/research/dogfood/strategy-b/b1-architecture.md +207 -0
  101. package/research/dogfood/strategy-b/b2-gaps.md +258 -0
  102. package/research/dogfood/strategy-b/b3-workflows.md +250 -0
  103. package/research/dogfood/strategy-c/c-synthesis.md +451 -0
  104. package/research/dogfood/strategy-c/c1-explorer.md +192 -0
  105. package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
  106. package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
  107. package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
  108. package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
  109. package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
  110. package/research/effect-cli-error-handling.md +845 -0
  111. package/research/effect-errors-as-values.md +943 -0
  112. package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
  113. package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
  114. package/research/errors-task-analysis/embeddings-analysis.md +709 -0
  115. package/research/errors-task-analysis/index-search-analysis.md +812 -0
  116. package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
  117. package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
  118. package/research/issue-review.md +603 -0
  119. package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
  120. package/research/llm-summarization/alternative-providers-2026.md +1428 -0
  121. package/research/llm-summarization/anthropic-2026.md +367 -0
  122. package/research/llm-summarization/claude-cli-integration.md +1706 -0
  123. package/research/llm-summarization/cli-integration-patterns.md +3155 -0
  124. package/research/llm-summarization/openai-2026.md +473 -0
  125. package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
  126. package/research/llm-summarization/opencode-cli-integration.md +1552 -0
  127. package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
  128. package/research/llm-summarization/prototype-results.md +56 -0
  129. package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
  130. package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
  131. package/research/mdcontext-error-analysis.md +521 -0
  132. package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
  133. package/research/mdcontext-pudding/01-index-embed.md +956 -0
  134. package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
  135. package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
  136. package/research/mdcontext-pudding/02-search.md +970 -0
  137. package/research/mdcontext-pudding/03-context.md +779 -0
  138. package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
  139. package/research/mdcontext-pudding/04-tree.md +704 -0
  140. package/research/mdcontext-pudding/05-config.md +1038 -0
  141. package/research/mdcontext-pudding/06-links-summary.txt +87 -0
  142. package/research/mdcontext-pudding/06-links.md +679 -0
  143. package/research/mdcontext-pudding/07-stats.md +693 -0
  144. package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
  145. package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
  146. package/research/mdcontext-pudding/README.md +168 -0
  147. package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
  148. package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
  149. package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
  150. package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
  151. package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
  152. package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
  153. package/research/research-quality-review.md +834 -0
  154. package/research/semantic-search/002-research-embedding-models.md +490 -0
  155. package/research/semantic-search/003-research-rag-alternatives.md +523 -0
  156. package/research/semantic-search/004-research-vector-search.md +841 -0
  157. package/research/semantic-search/032-research-semantic-search.md +427 -0
  158. package/research/semantic-search/embedding-text-analysis.md +156 -0
  159. package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
  160. package/research/semantic-search/query-processing-analysis.md +207 -0
  161. package/research/semantic-search/root-cause-and-solution.md +114 -0
  162. package/research/semantic-search/threshold-validation-report.md +69 -0
  163. package/research/semantic-search/vector-search-analysis.md +63 -0
  164. package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
  165. package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
  166. package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
  167. package/research/task-management-2026/03-lightweight-file-based.md +567 -0
  168. package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
  169. package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
  170. package/research/task-management-2026/linear/02-api-integrations.md +930 -0
  171. package/research/task-management-2026/linear/03-ai-features.md +368 -0
  172. package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
  173. package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
  174. package/research/test-path-issues.md +276 -0
  175. package/review/ALP-76/1-error-type-design.md +962 -0
  176. package/review/ALP-76/2-error-handling-patterns.md +906 -0
  177. package/review/ALP-76/3-error-presentation.md +624 -0
  178. package/review/ALP-76/4-test-coverage.md +625 -0
  179. package/review/ALP-76/5-migration-completeness.md +440 -0
  180. package/review/ALP-76/6-effect-best-practices.md +755 -0
  181. package/scripts/apply-branch-protection.sh +47 -0
  182. package/scripts/branch-protection-templates.json +79 -0
  183. package/scripts/prototype-summarization.ts +346 -0
  184. package/scripts/rebuild-hnswlib.js +58 -0
  185. package/scripts/setup-branch-protection.sh +64 -0
  186. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
  187. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
  188. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
  189. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
  190. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  191. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  192. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
  193. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
  194. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
  195. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
  196. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
  197. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
  198. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
  199. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
  200. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
  201. package/src/cli/argv-preprocessor.test.ts +210 -0
  202. package/src/cli/argv-preprocessor.ts +202 -0
  203. package/src/cli/cli.test.ts +627 -0
  204. package/src/cli/commands/backlinks.ts +54 -0
  205. package/src/cli/commands/config-cmd.ts +642 -0
  206. package/src/cli/commands/context.ts +285 -0
  207. package/src/cli/commands/duplicates.ts +122 -0
  208. package/src/cli/commands/embeddings.ts +529 -0
  209. package/src/cli/commands/index-cmd.ts +480 -0
  210. package/src/cli/commands/index.ts +16 -0
  211. package/src/cli/commands/links.ts +52 -0
  212. package/src/cli/commands/search.ts +1281 -0
  213. package/src/cli/commands/stats.ts +149 -0
  214. package/src/cli/commands/tree.ts +128 -0
  215. package/src/cli/config-layer.ts +176 -0
  216. package/src/cli/error-handler.test.ts +235 -0
  217. package/src/cli/error-handler.ts +655 -0
  218. package/src/cli/flag-schemas.ts +341 -0
  219. package/src/cli/help.ts +588 -0
  220. package/src/cli/index.ts +9 -0
  221. package/src/cli/main.ts +435 -0
  222. package/src/cli/options.ts +41 -0
  223. package/src/cli/shared-error-handling.ts +199 -0
  224. package/src/cli/typo-suggester.test.ts +105 -0
  225. package/src/cli/typo-suggester.ts +130 -0
  226. package/src/cli/utils.ts +259 -0
  227. package/src/config/file-provider.test.ts +320 -0
  228. package/src/config/file-provider.ts +273 -0
  229. package/src/config/index.ts +72 -0
  230. package/src/config/integration.test.ts +667 -0
  231. package/src/config/precedence.test.ts +277 -0
  232. package/src/config/precedence.ts +451 -0
  233. package/src/config/schema.test.ts +414 -0
  234. package/src/config/schema.ts +603 -0
  235. package/src/config/service.test.ts +320 -0
  236. package/src/config/service.ts +243 -0
  237. package/src/config/testing.test.ts +264 -0
  238. package/src/config/testing.ts +110 -0
  239. package/src/core/index.ts +1 -0
  240. package/src/core/types.ts +113 -0
  241. package/src/duplicates/detector.test.ts +183 -0
  242. package/src/duplicates/detector.ts +414 -0
  243. package/src/duplicates/index.ts +18 -0
  244. package/src/embeddings/embedding-namespace.test.ts +300 -0
  245. package/src/embeddings/embedding-namespace.ts +947 -0
  246. package/src/embeddings/heading-boost.test.ts +222 -0
  247. package/src/embeddings/hnsw-build-options.test.ts +198 -0
  248. package/src/embeddings/hyde.test.ts +272 -0
  249. package/src/embeddings/hyde.ts +264 -0
  250. package/src/embeddings/index.ts +10 -0
  251. package/src/embeddings/openai-provider.ts +414 -0
  252. package/src/embeddings/pricing.json +22 -0
  253. package/src/embeddings/provider-constants.ts +204 -0
  254. package/src/embeddings/provider-errors.test.ts +967 -0
  255. package/src/embeddings/provider-errors.ts +565 -0
  256. package/src/embeddings/provider-factory.test.ts +240 -0
  257. package/src/embeddings/provider-factory.ts +225 -0
  258. package/src/embeddings/provider-integration.test.ts +788 -0
  259. package/src/embeddings/query-preprocessing.test.ts +187 -0
  260. package/src/embeddings/semantic-search-threshold.test.ts +508 -0
  261. package/src/embeddings/semantic-search.ts +1270 -0
  262. package/src/embeddings/types.ts +359 -0
  263. package/src/embeddings/vector-store.ts +708 -0
  264. package/src/embeddings/voyage-provider.ts +313 -0
  265. package/src/errors/errors.test.ts +845 -0
  266. package/src/errors/index.ts +533 -0
  267. package/src/index/ignore-patterns.test.ts +354 -0
  268. package/src/index/ignore-patterns.ts +305 -0
  269. package/src/index/index.ts +4 -0
  270. package/src/index/indexer.ts +684 -0
  271. package/src/index/storage.ts +260 -0
  272. package/src/index/types.ts +147 -0
  273. package/src/index/watcher.ts +189 -0
  274. package/src/index.ts +30 -0
  275. package/src/integration/search-keyword.test.ts +678 -0
  276. package/src/mcp/server.ts +612 -0
  277. package/src/parser/index.ts +1 -0
  278. package/src/parser/parser.test.ts +291 -0
  279. package/src/parser/parser.ts +394 -0
  280. package/src/parser/section-filter.test.ts +277 -0
  281. package/src/parser/section-filter.ts +392 -0
  282. package/src/search/__tests__/hybrid-search.test.ts +650 -0
  283. package/src/search/bm25-store.ts +366 -0
  284. package/src/search/cross-encoder.test.ts +253 -0
  285. package/src/search/cross-encoder.ts +406 -0
  286. package/src/search/fuzzy-search.test.ts +419 -0
  287. package/src/search/fuzzy-search.ts +273 -0
  288. package/src/search/hybrid-search.ts +448 -0
  289. package/src/search/path-matcher.test.ts +276 -0
  290. package/src/search/path-matcher.ts +33 -0
  291. package/src/search/query-parser.test.ts +260 -0
  292. package/src/search/query-parser.ts +319 -0
  293. package/src/search/searcher.test.ts +280 -0
  294. package/src/search/searcher.ts +724 -0
  295. package/src/search/wink-bm25.d.ts +30 -0
  296. package/src/summarization/cli-providers/claude.ts +202 -0
  297. package/src/summarization/cli-providers/detection.test.ts +273 -0
  298. package/src/summarization/cli-providers/detection.ts +118 -0
  299. package/src/summarization/cli-providers/index.ts +8 -0
  300. package/src/summarization/cost.test.ts +139 -0
  301. package/src/summarization/cost.ts +102 -0
  302. package/src/summarization/error-handler.test.ts +127 -0
  303. package/src/summarization/error-handler.ts +111 -0
  304. package/src/summarization/index.ts +102 -0
  305. package/src/summarization/pipeline.test.ts +498 -0
  306. package/src/summarization/pipeline.ts +231 -0
  307. package/src/summarization/prompts.test.ts +269 -0
  308. package/src/summarization/prompts.ts +133 -0
  309. package/src/summarization/provider-factory.test.ts +396 -0
  310. package/src/summarization/provider-factory.ts +178 -0
  311. package/src/summarization/types.ts +184 -0
  312. package/src/summarize/budget-bugs.test.ts +620 -0
  313. package/src/summarize/formatters.ts +419 -0
  314. package/src/summarize/index.ts +20 -0
  315. package/src/summarize/summarizer.test.ts +275 -0
  316. package/src/summarize/summarizer.ts +597 -0
  317. package/src/summarize/verify-bugs.test.ts +238 -0
  318. package/src/types/huggingface-transformers.d.ts +66 -0
  319. package/src/utils/index.ts +1 -0
  320. package/src/utils/tokens.test.ts +142 -0
  321. package/src/utils/tokens.ts +186 -0
  322. package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
  323. package/tests/fixtures/cli/.mdcontext/config.json +8 -0
  324. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  325. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  326. package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
  327. package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
  328. package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
  329. package/tests/fixtures/cli/README.md +9 -0
  330. package/tests/fixtures/cli/api-reference.md +11 -0
  331. package/tests/fixtures/cli/getting-started.md +11 -0
  332. package/tests/integration/embed-index.test.ts +712 -0
  333. package/tests/integration/search-context.test.ts +469 -0
  334. package/tests/integration/search-semantic.test.ts +522 -0
  335. package/tsconfig.json +26 -0
  336. package/vitest.config.ts +16 -0
  337. package/vitest.setup.ts +12 -0
@@ -0,0 +1,1022 @@
# OpenAI-Compatible LLM Providers in 2026

## Executive Summary

As of 2026, the OpenAI SDK compatibility pattern has become the de facto standard for LLM API providers. This allows developers to use the official OpenAI SDK (`openai` package) with multiple providers by simply changing the `baseURL` and API key. This pattern significantly reduces integration complexity and enables easy provider switching.

**Recommendation: STRONGLY RECOMMENDED**

The OpenAI-compatible pattern should be the primary approach for multi-provider LLM support. It offers:
- Minimal code duplication
- Easy provider switching
- Familiar developer experience
- Broad ecosystem support
- Future-proof architecture

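In practice, provider switching reduces to a small lookup table fed into the same client constructor. A minimal sketch, assuming a hand-rolled registry (the provider list, env-var names, and `endpointFor` helper are illustrative, not part of any SDK):

```typescript
// Hypothetical endpoint registry: any OpenAI-compatible service fits this shape.
const PROVIDERS: Record<string, { baseURL: string; apiKeyEnv: string }> = {
  deepseek: { baseURL: 'https://api.deepseek.com/v1', apiKeyEnv: 'DEEPSEEK_API_KEY' },
  groq: { baseURL: 'https://api.groq.com/openai/v1', apiKeyEnv: 'GROQ_API_KEY' },
  ollama: { baseURL: 'http://localhost:11434/v1', apiKeyEnv: 'OLLAMA_API_KEY' },
}

// Resolve constructor options for `new OpenAI(...)` from a provider name.
function endpointFor(name: string): { baseURL: string; apiKey: string } {
  const p = PROVIDERS[name]
  if (!p) throw new Error(`Unknown provider: ${name}`)
  // Some providers (e.g. Ollama) ignore the key, but the SDK requires a string.
  return { baseURL: p.baseURL, apiKey: process.env[p.apiKeyEnv] ?? 'unused' }
}

// Usage: const client = new OpenAI(endpointFor('groq'))
```

Everything downstream of the constructor (`client.chat.completions.create(...)`) stays identical across providers, which is the entire appeal of the pattern.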
---

## Provider Comparison Table

| Provider | Status | Base URL | Auth Method | Key Features |
|----------|--------|----------|-------------|--------------|
| **DeepSeek** | ✅ Full | `https://api.deepseek.com/v1` | Bearer Token | Fast reasoning models, 128K context |
| **Together AI** | ✅ Full | `https://api.together.xyz/v1` | Bearer Token | 200+ open-source models |
| **Groq** | ✅ Full | `https://api.groq.com/openai/v1` | Bearer Token | Ultra-fast inference, function calling |
| **Ollama** | ✅ Full | `http://localhost:11434/v1` | None (local) | Local deployment, no API key needed |
| **Anthropic Claude** | ⚠️ Limited | `https://api.anthropic.com/v1/` | Bearer Token | Testing only, use native API for production |
| **Mistral AI** | ✅ Full | `https://api.mistral.ai/v1` | Bearer Token | Magistral reasoning models |
| **Cohere** | ✅ Full | Via Compatibility API | Bearer Token | Function calling, structured outputs |
| **Fireworks AI** | ✅ Full | `https://api.fireworks.ai/inference/v1` | Bearer Token | Fast inference, MCP support |
| **Perplexity AI** | ✅ Full | `https://api.perplexity.ai` | Bearer Token | Real-time search, citations |
| **OpenRouter** | ✅ Full | `https://openrouter.ai/api/v1` | Bearer Token | 500+ models, unified gateway |
| **Cloudflare Workers AI** | ✅ Full | Via Workers AI | CF Token | Edge deployment, 50+ models |
| **vLLM** | ✅ Full | `http://localhost:8000/v1` | None (self-hosted) | Self-hosted, multi-GPU support |
| **LiteLLM Proxy** | ✅ Gateway | `http://localhost:4000/v1` | Bearer Token | 100+ providers, cost tracking |
| **Anyscale** | ⚠️ Limited | `https://api.endpoints.anyscale.com/v1` | Bearer Token | Hosted platform only (as of Aug 2024) |

---

## Detailed Provider Information

### 1. DeepSeek API

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.deepseek.com/v1`

**Authentication:** Bearer token via `Authorization` header or API key parameter

**Models:**
- `deepseek-chat` - Fast general-purpose model (128K context)
- `deepseek-reasoner` - Reasoning mode with chain-of-thought (64K output)
- Both powered by V3.2-Exp

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.deepseek.com/v1',
  apiKey: process.env.DEEPSEEK_API_KEY
})

const response = await client.chat.completions.create({
  model: 'deepseek-chat',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [DeepSeek API Docs](https://api-docs.deepseek.com/)
- [How to Integrate DeepSeek with Node.js Using the OpenAI SDK](https://medium.com/@akbhuker/how-to-integrate-deepseek-with-node-js-using-the-openai-sdk-a0b7ef8ae1e4)

---

### 2. Together AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.together.xyz/v1`

**Authentication:** Bearer token

**Models:** 200+ open-source models including Llama, Mixtral, and more

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: process.env.TOGETHER_API_KEY
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [OpenAI Compatibility - Together.ai Docs](https://docs.together.ai/docs/openai-api-compatibility)

---

### 3. Groq

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.groq.com/openai/v1`

**Authentication:** Bearer token

**Models:** Fast inference models including:
- `qwen-qwq-32b` - Reasoning model
- `deepseek-r1-distill-llama-70b` - Reasoning model
- GPT-OSS 120B - OpenAI's open-weight model

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY
})

const response = await client.chat.completions.create({
  model: 'deepseek-r1-distill-llama-70b',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [Groq Docs Overview](https://console.groq.com/docs/overview)

---

### 4. Ollama

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `http://localhost:11434/v1` (local deployment)

**Authentication:** None required (local API). API key parameter is ignored.

**Models:** Any model supported by Ollama (Llama, Mistral, etc.)

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama' // required but ignored
})

const response = await client.chat.completions.create({
  model: 'llama3.3',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- Local deployment only (not a cloud service)
- API key is required by OpenAI SDK but ignored by Ollama

**Added Features:**
- Tool/function calling support (added in v0.13.3)

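With a tools-capable model, the standard OpenAI-SDK `tools` array works against Ollama unchanged. A minimal sketch of the tool-schema shape only (the `get_weather` function and its parameters are hypothetical, and no request is made here):

```typescript
// Hypothetical tool definition in the OpenAI function-calling schema;
// it would be passed as-is to client.chat.completions.create({ ..., tools }).
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Look up current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  },
]

// Against a local tools-capable model, e.g.:
// const res = await client.chat.completions.create({ model: 'llama3.3', messages, tools })
// res.choices[0].message.tool_calls then carries any requested invocation.
```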
**Sources:**
- [OpenAI compatibility - Ollama](https://docs.ollama.com/api/openai-compatibility)

---

### 5. Anthropic Claude

**Status:** ⚠️ Limited Compatibility (Testing Only)

**Base URL:** `https://api.anthropic.com/v1/`

**Authentication:** The native API expects the key in an `x-api-key` header; the OpenAI compatibility layer instead accepts it through the SDK's standard `apiKey` option

**Models:**
- `claude-sonnet-4-5`
- `claude-opus-4-5`
- All Claude models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.anthropic.com/v1/',
  apiKey: process.env.ANTHROPIC_API_KEY
})

const response = await client.chat.completions.create({
  model: 'claude-sonnet-4-5',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Important Limitations:**
- **Not for production use** - Anthropic recommends the native Claude API
- Audio input not supported (silently stripped)
- Prompt caching not supported (available in native SDK)
- Strict parameter for function calling ignored
- PDF processing, citations, extended thinking require native API

**Recommendation:**
Use the native Anthropic SDK for production. The OpenAI compatibility layer is for quick testing/comparison only.

**Sources:**
- [OpenAI SDK compatibility - Claude API Docs](https://platform.claude.com/docs/en/api/openai-sdk)

---

### 6. Mistral AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.mistral.ai/v1`

**Authentication:** Bearer token

**Models:**
- Magistral reasoning models (specialized reasoning, June 2025+)
- Mistral Large, Medium, Small variants

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.mistral.ai/v1',
  apiKey: process.env.MISTRAL_API_KEY
})

const response = await client.chat.completions.create({
  model: 'mistral-large-latest',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [API Specs - Mistral Docs](https://docs.mistral.ai/api)

---

### 7. Cohere

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.cohere.ai/compatibility/v1` (Compatibility API)

**Authentication:** Bearer token

**Models:** Cohere Command models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.cohere.ai/compatibility/v1', // Compatibility API
  apiKey: process.env.COHERE_API_KEY
})

const response = await client.chat.completions.create({
  model: 'command-r-plus',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Features:**
- Function calling
- Structured outputs
- Text embeddings

**Limitations:**
- `reasoning_effort` only supports `none` and `high` (maps to thinking mode on/off)
- Trial keys are rate-limited (1,000 API calls/month)

**Sources:**
- [Using Cohere models via the OpenAI SDK](https://docs.cohere.com/docs/compatibility-api)

---

### 8. Fireworks AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.fireworks.ai/inference/v1`

**Authentication:** Bearer token

**Models:** Wide selection of open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.fireworks.ai/inference/v1',
  apiKey: process.env.FIREWORKS_API_KEY
})

const response = await client.chat.completions.create({
  model: 'accounts/fireworks/models/llama-v3p3-70b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**New Features (2026):**
- OpenAI-compatible Responses API with MCP (Model Context Protocol) support
- Server-side agentic loop handling

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [OpenAI compatibility - Fireworks AI Docs](https://docs.fireworks.ai/tools-sdks/openai-compatibility)

---

### 9. Perplexity AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.perplexity.ai`

**Authentication:** Bearer token

**Models:**
- `sonar-pro` - Real-time search with citations
- Other Sonar variants

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.perplexity.ai',
  apiKey: process.env.PERPLEXITY_API_KEY
})

const response = await client.chat.completions.create({
  model: 'sonar-pro',
  messages: [{ role: 'user', content: 'What happened today in tech?' }]
})
```

**Unique Features:**
- Real-time web search
- Automatic citation of sources
- Up-to-date information

**Important Consideration:**
- High token costs: Perplexity includes full text of cited sources in input token count
- A simple question can result in high token usage if multiple long articles are cited

374
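Because billing is driven largely by input tokens, it is worth estimating cost from the `usage` object each response returns. A minimal sketch follows; the per-million-token rates are illustrative placeholders, not Perplexity's actual prices.

```typescript
// Rough cost estimator for token-based billing.
// Rates are illustrative placeholders (USD per million tokens), not real prices.
interface Usage {
  prompt_tokens: number
  completion_tokens: number
}

function estimateCostUSD(
  usage: Usage,
  inputPerM: number,
  outputPerM: number
): number {
  return (
    (usage.prompt_tokens / 1_000_000) * inputPerM +
    (usage.completion_tokens / 1_000_000) * outputPerM
  )
}

// A "simple" question that pulled in three long cited articles:
const exampleUsage = { prompt_tokens: 45_000, completion_tokens: 500 }
const cost = estimateCostUSD(exampleUsage, 3, 15) // placeholder rates
```

Note how the hypothetical 45,000 prompt tokens (from cited article text) dominate the total, which is exactly the consideration flagged above.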
**Sources:**
- [OpenAI Compatibility - Perplexity](https://docs.perplexity.ai/guides/chat-completions-guide)

---

### 10. OpenRouter

**Status:** ✅ Fully OpenAI-Compatible (API Gateway)

**Base URL:** `https://openrouter.ai/api/v1`

**Authentication:** Bearer token

**Models:** 500+ models from multiple providers

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY
})

const response = await client.chat.completions.create({
  model: 'anthropic/claude-sonnet-4-5',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Features:**
- Unified access to 500+ models from providers like OpenAI, Anthropic, Google, etc.
- Automatic failovers
- Prompt caching
- Intelligent routing for cost/latency optimization
- 13+ free models with daily limits

**Pricing:**
- Pass-through pricing at exact provider rates
- ~5% platform fee (5.5% on credit purchases)

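The pass-through-plus-fee model above is easy to budget for. A one-line sketch, using the ~5.5% credit-purchase figure from the pricing bullets (the function name is mine, and the exact fee structure should be checked against OpenRouter's current terms):

```typescript
// Illustrative only: platform fee on top of pass-through provider cost,
// using the ~5.5% credit-purchase figure quoted above.
function totalWithFee(providerCostUSD: number, feeRate = 0.055): number {
  return +(providerCostUSD * (1 + feeRate)).toFixed(6)
}

const budget = totalWithFee(10) // $10 of provider usage → $10.55 in credits
```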
**Limitations:**
- Schema normalization means slight differences from native provider APIs
- Additional latency from routing layer

**Sources:**
- [OpenRouter Quickstart Guide](https://openrouter.ai/docs/quickstart)

---

### 11. Cloudflare Workers AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** Via Workers AI endpoints

**Authentication:** Cloudflare API token

**Models:** 50+ models including:
- `@cf/openai/gpt-oss-120b` - OpenAI's open-weight model
- `@cf/openai/gpt-oss-20b`
- Other open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1',
  apiKey: process.env.CLOUDFLARE_API_TOKEN
})

const response = await client.chat.completions.create({
  model: '@cf/openai/gpt-oss-120b',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported Endpoints:**
- `/v1/chat/completions` - Text generation
- `/v1/embeddings` - Text embeddings

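Since the base URL embeds your account ID, a small helper keeps the template in one place. The helper name is mine; the path format comes from Cloudflare's docs, and the 32-hex-character check reflects the usual shape of Cloudflare account IDs.

```typescript
// Build the account-scoped Workers AI base URL from the template shown above.
// (Hypothetical helper; the path format comes from Cloudflare's docs.)
function workersAIBaseURL(accountId: string): string {
  if (!/^[0-9a-f]{32}$/i.test(accountId)) {
    // Cloudflare account IDs are 32 hex characters; fail fast on typos.
    throw new Error(`invalid Cloudflare account id: ${accountId}`)
  }
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/v1`
}
```

Failing fast here turns a confusing 404 from the API into an immediate, descriptive error at client construction time.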
**Features:**
- Edge deployment (200+ cities worldwide)
- Serverless pricing
- Day 0 support for OpenAI's open-weight models
- OpenAI Responses API format support

**Limitations:**
- Requires a Cloudflare account and setup

**Sources:**
- [OpenAI compatible API endpoints · Cloudflare Workers AI docs](https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/)

---

### 12. vLLM

**Status:** ✅ Fully OpenAI-Compatible (Self-Hosted)

**Base URL:** `http://localhost:8000/v1` (default, configurable)

**Authentication:** None by default (an API key can be enforced with the `--api-key` server flag)

**Models:** Any model supported by vLLM (hundreds of open-source models)

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'none' // required by the SDK but ignored by the server
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported APIs:**
- Chat Completions API
- Completions API
- Embeddings API

**Features:**
- Self-hosted deployment
- Multi-GPU support
- Scales from a single GPU to a multi-node cluster
- Multimodal support (vision and audio)
- Auto-downloads models from HuggingFace

**Recent Updates (Jan 2026):**
- Support for latest models, including DeepSeek R1

**Limitations:**
- Requires infrastructure setup
- Self-managed deployment

**Sources:**
- [OpenAI-Compatible Server - vLLM](https://docs.vllm.ai/en/stable/serving/openai_compatible_server/)

---

### 13. LiteLLM Proxy

**Status:** ✅ Fully OpenAI-Compatible (Gateway/Proxy)

**Base URL:** `http://localhost:4000` (default)

**Authentication:** Bearer token (managed by the proxy)

**Models:** 100+ providers unified through a single interface

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:4000',
  apiKey: process.env.LITELLM_API_KEY
})

const response = await client.chat.completions.create({
  model: 'gpt-4', // LiteLLM routes to the configured provider
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported Providers:**
- OpenAI, Azure, Anthropic, Cohere, Bedrock, VertexAI, HuggingFace, NVIDIA NIM, and 100+ more

**Supported Endpoints:**
- `/chat/completions`
- `/responses`
- `/embeddings`
- `/images`
- `/audio`
- `/batches`
- `/rerank`
- `/a2a` (Agent-to-Agent)
- `/messages`

**Features (2026):**
- Cost tracking and management
- Guardrails
- Load balancing
- Logging
- JWT authentication
- Batch API routing
- Prompt management with versioning
- Agent (A2A) Gateway support

**Use Cases:**
- Unified gateway for multiple providers
- Cost tracking across providers
- Development/testing with multiple models
- Production routing and fallbacks

**Limitations:**
- Requires running a proxy server
- Additional latency from the proxy layer

**Sources:**
- [OpenAI-Compatible Endpoints | liteLLM](https://docs.litellm.ai/docs/providers/openai_compatible)
- [GitHub - BerriAI/litellm](https://github.com/BerriAI/litellm)

---

### 14. Anyscale Endpoints

**Status:** ⚠️ Limited Availability

**Base URL:** `https://api.endpoints.anyscale.com/v1`

**Authentication:** Bearer token

**Models:** Various open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.endpoints.anyscale.com/v1',
  apiKey: process.env.ANYSCALE_API_KEY
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Important Note:**
- As of August 1, 2024, the Anyscale Endpoints API is only available through the fully hosted Anyscale Platform
- Multi-tenant access to LLM models was removed

**Features:**
- JSON mode
- Function calling
- Fine-tuning API (OpenAI-compatible)

**Sources:**
- [Migrate from OpenAI | Anyscale Docs](https://docs.anyscale.com/endpoints/text-generation/migrate-from-openai/)

---

## Implementation Pattern

### TypeScript Example: Universal LLM Client

```typescript
import OpenAI from 'openai'

type Provider =
  | 'openai'
  | 'deepseek'
  | 'together'
  | 'groq'
  | 'ollama'
  | 'mistral'
  | 'fireworks'
  | 'perplexity'
  | 'openrouter'

interface ProviderConfig {
  baseURL: string
  apiKey: string
  defaultModel?: string
}

const PROVIDER_CONFIGS: Record<Provider, (apiKey: string) => ProviderConfig> = {
  openai: (apiKey) => ({
    baseURL: 'https://api.openai.com/v1',
    apiKey,
    defaultModel: 'gpt-4o'
  }),
  deepseek: (apiKey) => ({
    baseURL: 'https://api.deepseek.com/v1',
    apiKey,
    defaultModel: 'deepseek-chat'
  }),
  together: (apiKey) => ({
    baseURL: 'https://api.together.xyz/v1',
    apiKey,
    defaultModel: 'meta-llama/Llama-3.3-70B-Instruct-Turbo'
  }),
  groq: (apiKey) => ({
    baseURL: 'https://api.groq.com/openai/v1',
    apiKey,
    defaultModel: 'deepseek-r1-distill-llama-70b'
  }),
  ollama: () => ({
    baseURL: 'http://localhost:11434/v1',
    apiKey: 'ollama', // ignored by Ollama but required by the SDK
    defaultModel: 'llama3.3'
  }),
  mistral: (apiKey) => ({
    baseURL: 'https://api.mistral.ai/v1',
    apiKey,
    defaultModel: 'mistral-large-latest'
  }),
  fireworks: (apiKey) => ({
    baseURL: 'https://api.fireworks.ai/inference/v1',
    apiKey,
    defaultModel: 'accounts/fireworks/models/llama-v3p3-70b-instruct'
  }),
  perplexity: (apiKey) => ({
    baseURL: 'https://api.perplexity.ai',
    apiKey,
    defaultModel: 'sonar-pro'
  }),
  openrouter: (apiKey) => ({
    baseURL: 'https://openrouter.ai/api/v1',
    apiKey,
    defaultModel: 'anthropic/claude-sonnet-4-5'
  })
}

export class UniversalLLMClient {
  private client: OpenAI
  private defaultModel: string

  constructor(provider: Provider, apiKey: string) {
    const config = PROVIDER_CONFIGS[provider](apiKey)

    this.client = new OpenAI({
      baseURL: config.baseURL,
      apiKey: config.apiKey
    })

    this.defaultModel = config.defaultModel || ''
  }

  async chat(
    messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[],
    options?: {
      model?: string
      temperature?: number
      maxTokens?: number
    }
  ) {
    return this.client.chat.completions.create({
      model: options?.model || this.defaultModel,
      messages,
      temperature: options?.temperature,
      max_tokens: options?.maxTokens
    })
  }

  async embed(input: string | string[], model?: string) {
    return this.client.embeddings.create({
      // 'text-embedding-3-small' is OpenAI's model name; pass the
      // provider's own embedding model explicitly for other providers
      model: model || 'text-embedding-3-small',
      input
    })
  }
}

// Usage
const client = new UniversalLLMClient('deepseek', process.env.DEEPSEEK_API_KEY!)

const response = await client.chat([
  { role: 'user', content: 'Hello!' }
])

console.log(response.choices[0].message.content)
```

---

## Architecture Recommendations

### For Production Use

1. **Primary Pattern: OpenAI SDK with Provider Switching**
   ```typescript
   // Recommended approach
   const provider = process.env.LLM_PROVIDER || 'openai'
   const client = createLLMClient(provider)
   ```

2. **Use LiteLLM Proxy for:**
   - Cost tracking across providers
   - Load balancing and failovers
   - Unified logging and monitoring
   - Development/staging environments

3. **Use OpenRouter for:**
   - Quick access to many models
   - Model experimentation
   - Fallback/redundancy strategy

4. **Use Native SDKs When:**
   - Provider-specific features are required (e.g., Claude's prompt caching, extended thinking)
   - Maximum performance is needed
   - Advanced features are not in the OpenAI spec

### Environment Configuration

```bash
# .env
LLM_PROVIDER=deepseek
DEEPSEEK_API_KEY=sk-xxx
OPENAI_API_KEY=sk-xxx
GROQ_API_KEY=gsk-xxx
TOGETHER_API_KEY=xxx
```


### Error Handling

```typescript
class LLMError extends Error {
  constructor(
    message: string,
    public provider: string,
    public originalError: unknown
  ) {
    super(message)
  }
}

async function chatWithFallback(
  messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[],
  providers: Provider[] = ['deepseek', 'groq', 'openai']
) {
  const errors: LLMError[] = []

  for (const provider of providers) {
    try {
      const client = new UniversalLLMClient(
        provider,
        process.env[`${provider.toUpperCase()}_API_KEY`]!
      )
      return await client.chat(messages)
    } catch (error) {
      errors.push(new LLMError(`${provider} failed`, provider, error))
    }
  }

  throw new Error(
    `All providers failed: ${errors.map(e => e.message).join(', ')}`
  )
}
```

---


## Provider Selection Guide

### For Cost Optimization
1. **DeepSeek** - Very competitive pricing
2. **Groq** - Fast inference at good rates
3. **Together AI** - Competitive open-source model pricing
4. **OpenRouter** - Automatic cost optimization

### For Speed
1. **Groq** - Ultra-fast inference (LPU-based)
2. **Fireworks AI** - Optimized for speed
3. **Together AI** - Fast open-source models

### For Model Variety
1. **OpenRouter** - 500+ models
2. **Together AI** - 200+ open-source models
3. **LiteLLM Proxy** - 100+ providers

### For Reasoning Tasks
1. **DeepSeek** - deepseek-reasoner with chain-of-thought
2. **Groq** - qwen-qwq-32b, deepseek-r1-distill-llama-70b
3. **Mistral** - Magistral reasoning models

### For Real-Time Information
1. **Perplexity AI** - Built-in web search with citations
2. **OpenRouter** - Access to various search-enabled models

### For Local/Private Deployment
1. **Ollama** - Easy local deployment
2. **vLLM** - High-performance self-hosted
3. **LiteLLM Proxy** - Self-hosted gateway

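The rankings above can be codified as data so application code can pick a provider by criterion. A sketch; the criterion names and `pickProvider` helper are mine, and the table simply transcribes the lists above.

```typescript
// The selection guide above as a lookup table: criterion → ranked providers.
type Criterion = 'cost' | 'speed' | 'variety' | 'reasoning' | 'realtime' | 'local'

const RANKINGS: Record<Criterion, string[]> = {
  cost: ['deepseek', 'groq', 'together', 'openrouter'],
  speed: ['groq', 'fireworks', 'together'],
  variety: ['openrouter', 'together', 'litellm'],
  reasoning: ['deepseek', 'groq', 'mistral'],
  realtime: ['perplexity', 'openrouter'],
  local: ['ollama', 'vllm', 'litellm']
}

// First-choice provider for a criterion, skipping any the caller excludes
// (e.g. providers currently failing health checks).
function pickProvider(criterion: Criterion, exclude: string[] = []): string | undefined {
  return RANKINGS[criterion].find((p) => !exclude.includes(p))
}
```

Keeping the ranking as data rather than branching logic means the guide can be updated without touching selection code.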
---

## Migration Strategy

### Phase 1: Abstraction Layer
Create a unified interface that uses the OpenAI SDK internally:

```typescript
// Message, ChatResponse, and Embedding are application-defined types
interface LLMProvider {
  chat(messages: Message[]): Promise<ChatResponse>
  embed(text: string): Promise<Embedding>
}

class OpenAICompatibleProvider implements LLMProvider {
  constructor(
    private client: OpenAI,
    private defaultModel: string
  ) {}

  async chat(messages: Message[]) {
    const response = await this.client.chat.completions.create({
      model: this.defaultModel,
      messages
    })
    return response
  }

  async embed(text: string) {
    const response = await this.client.embeddings.create({
      model: 'text-embedding-3-small', // swap for the provider's embedding model
      input: text
    })
    return response
  }
}
```


### Phase 2: Configuration
Externalize provider configuration:

```typescript
// config/llm-providers.ts
export const LLM_PROVIDERS = {
  deepseek: {
    baseURL: 'https://api.deepseek.com/v1',
    models: {
      chat: 'deepseek-chat',
      reasoning: 'deepseek-reasoner'
    }
  },
  groq: {
    baseURL: 'https://api.groq.com/openai/v1',
    models: {
      fast: 'deepseek-r1-distill-llama-70b'
    }
  }
  // ... more providers
}
```

### Phase 3: Runtime Switching
Enable dynamic provider selection:

```typescript
const provider = selectProvider({
  task: 'reasoning', // or 'chat', 'search', etc.
  priority: 'cost', // or 'speed', 'quality'
  fallbacks: true
})
```

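`selectProvider` is left abstract above. One possible shape, assuming a small hand-maintained capability table — all names and rank values here are illustrative, not measured data:

```typescript
// One possible selectProvider: filter a capability table by task,
// then order by the requested priority. Table values are illustrative.
interface ProviderEntry {
  name: string
  tasks: string[]
  costRank: number // 1 = cheapest
  speedRank: number // 1 = fastest
}

const TABLE: ProviderEntry[] = [
  { name: 'deepseek', tasks: ['chat', 'reasoning'], costRank: 1, speedRank: 3 },
  { name: 'groq', tasks: ['chat', 'reasoning'], costRank: 2, speedRank: 1 },
  { name: 'perplexity', tasks: ['search'], costRank: 3, speedRank: 2 }
]

// Returns providers ranked best-first; index 0 is the primary choice,
// the rest serve as fallbacks.
function selectProviders(opts: { task: string; priority: 'cost' | 'speed' }): string[] {
  const rank = opts.priority === 'cost' ? 'costRank' : 'speedRank'
  return TABLE
    .filter((p) => p.tasks.includes(opts.task))
    .sort((a, b) => a[rank] - b[rank])
    .map((p) => p.name)
}
```

Returning the full ranked list rather than a single name gives the fallback behavior (`fallbacks: true`) for free: pass the list straight to a `chatWithFallback`-style helper.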

---

## Testing Recommendations

### Provider Compatibility Tests

```typescript
import { describe, it, expect } from 'vitest'
// Provider and UniversalLLMClient come from the client module above

const PROVIDERS_TO_TEST: Provider[] = [
  'deepseek',
  'groq',
  'together',
  'mistral'
]

describe.each(PROVIDERS_TO_TEST)('Provider: %s', (provider) => {
  it('should complete chat', async () => {
    const client = new UniversalLLMClient(
      provider,
      process.env[`${provider.toUpperCase()}_API_KEY`]!
    )

    const response = await client.chat([
      { role: 'user', content: 'Say "hello"' }
    ])

    expect(response.choices[0].message.content).toBeTruthy()
  })

  it('should handle streaming', async () => {
    // ... streaming test
  })

  it('should support function calling', async () => {
    // ... function calling test
  })
})
```


---

## Conclusion

The OpenAI-compatible API pattern is the **clear winner** for multi-provider LLM integration in 2026. Key benefits:

1. **Minimal Code**: One SDK, multiple providers
2. **Easy Migration**: Change two lines of code (base URL and model name) to switch providers
3. **Future-Proof**: New providers adopt this standard regularly
4. **Developer Experience**: A familiar interface reduces the learning curve
5. **Ecosystem**: Works with existing tools built for the OpenAI SDK

### Exceptions to Use Native SDKs:

- **Anthropic Claude**: Use the native SDK for production (OpenAI compatibility is intended for testing only)
- **Provider-Specific Features**: When you need features not in the OpenAI spec
- **Maximum Performance**: When latency is critical and provider optimizations matter

### Recommended Stack:

```
Application Code
    ↓
Universal LLM Client (OpenAI SDK-based)
    ↓
[Optional] LiteLLM Proxy (for cost tracking, routing)
    ↓
Multiple Providers (DeepSeek, Groq, Together, etc.)
```

This architecture provides flexibility, maintainability, and future-proofing while minimizing complexity.

---

## References

### Official Documentation
- [DeepSeek API Docs](https://api-docs.deepseek.com/)
- [Together AI OpenAI Compatibility](https://docs.together.ai/docs/openai-api-compatibility)
- [Groq Docs Overview](https://console.groq.com/docs/overview)
- [Ollama OpenAI compatibility](https://docs.ollama.com/api/openai-compatibility)
- [Anthropic OpenAI SDK compatibility](https://platform.claude.com/docs/en/api/openai-sdk)
- [Mistral AI API Specs](https://docs.mistral.ai/api)
- [Cohere Compatibility API](https://docs.cohere.com/docs/compatibility-api)
- [Fireworks AI OpenAI compatibility](https://docs.fireworks.ai/tools-sdks/openai-compatibility)
- [Perplexity OpenAI Compatibility](https://docs.perplexity.ai/guides/chat-completions-guide)
- [OpenRouter Quickstart Guide](https://openrouter.ai/docs/quickstart)
- [Cloudflare Workers AI OpenAI endpoints](https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/)
- [vLLM OpenAI-Compatible Server](https://docs.vllm.ai/en/stable/serving/openai_compatible_server/)
- [LiteLLM Documentation](https://docs.litellm.ai/docs/providers/openai_compatible)

### Additional Resources
- [AI SDK Providers](https://ai-sdk.dev/providers/ai-sdk-providers/)
- [OpenAI SDK (npm)](https://www.npmjs.com/package/openai)

---

**Document Version:** 1.0
**Last Updated:** January 26, 2026
**Researched by:** Claude Sonnet 4.5