mdcontext 0.0.1 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/BACKLOG.md +338 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +434 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +88 -0
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +803 -0
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1629 -0
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +5458 -0
- package/dist/index.d.ts +653 -0
- package/dist/index.js +79 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +472 -0
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +625 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/CONFIG.md +1123 -0
- package/docs/DESIGN.md +439 -0
- package/docs/ERRORS.md +383 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/summarization.md +320 -0
- package/docs/test-links.md +9 -0
- package/justfile +40 -0
- package/package.json +74 -9
- package/pnpm-workspace.yaml +5 -0
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +58 -0
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +627 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +285 -0
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +480 -0
- package/src/cli/commands/index.ts +16 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +1281 -0
- package/src/cli/commands/stats.ts +149 -0
- package/src/cli/commands/tree.ts +128 -0
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +341 -0
- package/src/cli/help.ts +588 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +435 -0
- package/src/cli/options.ts +41 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +259 -0
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +113 -0
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +10 -0
- package/src/embeddings/openai-provider.ts +414 -0
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +1270 -0
- package/src/embeddings/types.ts +359 -0
- package/src/embeddings/vector-store.ts +708 -0
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +684 -0
- package/src/index/storage.ts +260 -0
- package/src/index/types.ts +147 -0
- package/src/index/watcher.ts +189 -0
- package/src/index.ts +30 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +612 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +394 -0
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +392 -0
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +280 -0
- package/src/search/searcher.ts +724 -0
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +597 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +16 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,1022 @@

# OpenAI-Compatible LLM Providers in 2026

## Executive Summary

As of 2026, the OpenAI SDK compatibility pattern has become the de facto standard for LLM API providers. This allows developers to use the official OpenAI SDK (`openai` package) with multiple providers by simply changing the `baseURL` and API key. This pattern significantly reduces integration complexity and enables easy provider switching.

**Recommendation: STRONGLY RECOMMENDED**

The OpenAI-compatible pattern should be the primary approach for multi-provider LLM support. It offers:

- Minimal code duplication
- Easy provider switching
- Familiar developer experience
- Broad ecosystem support
- Future-proof architecture

---
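The provider-switching story above can be sketched as a small registry keyed by provider name. This is an illustrative sketch, not part of any SDK: the `PROVIDERS` map and `providerConfig` helper are assumptions, with base URLs and env-var names taken from the provider docs surveyed in this document.

```typescript
// Hypothetical provider registry: base URL and API-key env var per provider.
// Not part of any SDK; values come from the provider docs surveyed below.
type ProviderConfig = { baseURL: string; apiKeyEnv: string }

const PROVIDERS: Record<string, ProviderConfig> = {
  deepseek: { baseURL: 'https://api.deepseek.com/v1', apiKeyEnv: 'DEEPSEEK_API_KEY' },
  together: { baseURL: 'https://api.together.xyz/v1', apiKeyEnv: 'TOGETHER_API_KEY' },
  groq: { baseURL: 'https://api.groq.com/openai/v1', apiKeyEnv: 'GROQ_API_KEY' },
}

function providerConfig(name: string): ProviderConfig {
  const cfg = PROVIDERS[name]
  if (cfg === undefined) throw new Error(`Unknown provider: ${name}`)
  return cfg
}

// Switching providers is then a one-line change:
//   const { baseURL, apiKeyEnv } = providerConfig('groq')
//   const client = new OpenAI({ baseURL, apiKey: process.env[apiKeyEnv]! })
console.log(providerConfig('groq').baseURL) // https://api.groq.com/openai/v1
```

The rest of the client code stays identical across providers, which is the core benefit the recommendation rests on.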

## Provider Comparison Table

| Provider | Status | Base URL | Auth Method | Key Features |
|----------|--------|----------|-------------|--------------|
| **DeepSeek** | ✅ Full | `https://api.deepseek.com/v1` | Bearer Token | Fast reasoning models, 128K context |
| **Together AI** | ✅ Full | `https://api.together.xyz/v1` | Bearer Token | 200+ open-source models |
| **Groq** | ✅ Full | `https://api.groq.com/openai/v1` | Bearer Token | Ultra-fast inference, function calling |
| **Ollama** | ✅ Full | `http://localhost:11434/v1` | None (local) | Local deployment, no API key needed |
| **Anthropic Claude** | ⚠️ Limited | `https://api.anthropic.com/v1/` | Bearer Token | Testing only, use native API for production |
| **Mistral AI** | ✅ Full | `https://api.mistral.ai/v1` | Bearer Token | Magistral reasoning models |
| **Cohere** | ✅ Full | Via Compatibility API | Bearer Token | Function calling, structured outputs |
| **Fireworks AI** | ✅ Full | `https://api.fireworks.ai/inference/v1` | Bearer Token | Fast inference, MCP support |
| **Perplexity AI** | ✅ Full | `https://api.perplexity.ai` | Bearer Token | Real-time search, citations |
| **OpenRouter** | ✅ Full | `https://openrouter.ai/api/v1` | Bearer Token | 500+ models, unified gateway |
| **Cloudflare Workers AI** | ✅ Full | Via Workers AI | CF Token | Edge deployment, 50+ models |
| **vLLM** | ✅ Full | `http://localhost:8000/v1` | None (self-hosted) | Self-hosted, multi-GPU support |
| **LiteLLM Proxy** | ✅ Gateway | `http://localhost:4000/v1` | Bearer Token | 100+ providers, cost tracking |
| **Anyscale** | ⚠️ Limited | `https://api.endpoints.anyscale.com/v1` | Bearer Token | Hosted platform only (as of Aug 2024) |

---
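One quirk visible in the auth column: the OpenAI SDK refuses an empty `apiKey` even for local servers (Ollama, vLLM) that ignore it. A small helper can paper over this. The sketch below is hypothetical — the helper name and the `<PROVIDER>_API_KEY` env-var convention are assumptions, not part of any SDK.

```typescript
// Sketch: return a placeholder key for keyless local providers, otherwise
// read <PROVIDER>_API_KEY from the environment. Hypothetical helper.
const LOCAL_PROVIDERS = new Set(['ollama', 'vllm'])

function apiKeyFor(
  provider: string,
  env: Record<string, string | undefined> = process.env
): string {
  // The SDK requires a non-empty string; local servers ignore its value.
  if (LOCAL_PROVIDERS.has(provider)) return 'local'
  const name = `${provider.toUpperCase()}_API_KEY`
  const key = env[name]
  if (!key) throw new Error(`Missing environment variable: ${name}`)
  return key
}

// e.g. new OpenAI({ baseURL: 'http://localhost:11434/v1', apiKey: apiKeyFor('ollama') })
```

Failing fast on a missing env var beats the opaque 401 the remote providers would otherwise return.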

## Detailed Provider Information

### 1. DeepSeek API

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.deepseek.com/v1`

**Authentication:** Bearer token via `Authorization` header or API key parameter

**Models:**
- `deepseek-chat` - Fast general-purpose model (128K context)
- `deepseek-reasoner` - Reasoning mode with chain-of-thought (64K output)
- Both powered by V3.2-Exp

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.deepseek.com/v1',
  apiKey: process.env.DEEPSEEK_API_KEY
})

const response = await client.chat.completions.create({
  model: 'deepseek-chat',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [DeepSeek API Docs](https://api-docs.deepseek.com/)
- [How to Integrate DeepSeek with Node.js Using the OpenAI SDK](https://medium.com/@akbhuker/how-to-integrate-deepseek-with-node-js-using-the-openai-sdk-a0b7ef8ae1e4)

---

### 2. Together AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.together.xyz/v1`

**Authentication:** Bearer token

**Models:** 200+ open-source models including Llama, Mixtral, and more

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.together.xyz/v1',
  apiKey: process.env.TOGETHER_API_KEY
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [OpenAI Compatibility - Together.ai Docs](https://docs.together.ai/docs/openai-api-compatibility)

---

### 3. Groq

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.groq.com/openai/v1`

**Authentication:** Bearer token

**Models:** Fast inference models including:
- `qwen-qwq-32b` - Reasoning model
- `deepseek-r1-distill-llama-70b` - Reasoning model
- GPT-OSS 120B - OpenAI's open-weight model

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.groq.com/openai/v1',
  apiKey: process.env.GROQ_API_KEY
})

const response = await client.chat.completions.create({
  model: 'deepseek-r1-distill-llama-70b',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [Groq Docs Overview](https://console.groq.com/docs/overview)

---

### 4. Ollama

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `http://localhost:11434/v1` (local deployment)

**Authentication:** None required (local API). API key parameter is ignored.

**Models:** Any model supported by Ollama (Llama, Mistral, etc.)

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama' // required but ignored
})

const response = await client.chat.completions.create({
  model: 'llama3.3',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- Local deployment only (not a cloud service)
- API key is required by OpenAI SDK but ignored by Ollama

**Added Features:**
- Tool/function calling support (added in v0.13.3)

**Sources:**
- [OpenAI compatibility - Ollama](https://docs.ollama.com/api/openai-compatibility)

---

### 5. Anthropic Claude

**Status:** ⚠️ Limited Compatibility (Testing Only)

**Base URL:** `https://api.anthropic.com/v1/`

**Authentication:** Standard OpenAI-style `Authorization: Bearer` header in compatibility mode (the native Anthropic API uses the `x-api-key` header instead)

**Models:**
- `claude-sonnet-4-5`
- `claude-opus-4-5`
- All Claude models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.anthropic.com/v1/',
  apiKey: process.env.ANTHROPIC_API_KEY
})

const response = await client.chat.completions.create({
  model: 'claude-sonnet-4-5',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Important Limitations:**
- **Not for production use** - Anthropic recommends using the native Claude API
- Audio input not supported (silently stripped)
- Prompt caching not supported (available in native SDK)
- The `strict` parameter for function calling is ignored
- PDF processing, citations, and extended thinking require the native API

**Recommendation:**
Use the native Anthropic SDK for production. The OpenAI compatibility layer is for quick testing/comparison only.

**Sources:**
- [OpenAI SDK compatibility - Claude API Docs](https://platform.claude.com/docs/en/api/openai-sdk)

---

### 6. Mistral AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.mistral.ai/v1`

**Authentication:** Bearer token

**Models:**
- Magistral reasoning models (specialized reasoning, June 2025+)
- Mistral Large, Medium, Small variants

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.mistral.ai/v1',
  apiKey: process.env.MISTRAL_API_KEY
})

const response = await client.chat.completions.create({
  model: 'mistral-large-latest',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [API Specs - Mistral Docs](https://docs.mistral.ai/api)

---

### 7. Cohere

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** Via Compatibility API endpoint (`https://api.cohere.ai/compatibility/v1`)

**Authentication:** Bearer token

**Models:** Cohere Command models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.cohere.ai/compatibility/v1', // Compatibility API
  apiKey: process.env.COHERE_API_KEY
})

const response = await client.chat.completions.create({
  model: 'command-r-plus',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Features:**
- Function calling
- Structured outputs
- Text embeddings

**Limitations:**
- `reasoning_effort` only supports `none` and `high` (maps to thinking mode on/off)
- Trial keys are rate-limited (1,000 API calls/month)

**Sources:**
- [Using Cohere models via the OpenAI SDK](https://docs.cohere.com/docs/compatibility-api)

---

### 8. Fireworks AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.fireworks.ai/inference/v1`

**Authentication:** Bearer token

**Models:** Wide selection of open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.fireworks.ai/inference/v1',
  apiKey: process.env.FIREWORKS_API_KEY
})

const response = await client.chat.completions.create({
  model: 'accounts/fireworks/models/llama-v3p3-70b-instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**New Features (2026):**
- OpenAI-compatible Responses API with MCP (Model Context Protocol) support
- Server-side agentic loop handling

**Limitations:**
- None reported for OpenAI compatibility

**Sources:**
- [OpenAI compatibility - Fireworks AI Docs](https://docs.fireworks.ai/tools-sdks/openai-compatibility)

---

### 9. Perplexity AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** `https://api.perplexity.ai`

**Authentication:** Bearer token

**Models:**
- `sonar-pro` - Real-time search with citations
- Other Sonar variants

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.perplexity.ai',
  apiKey: process.env.PERPLEXITY_API_KEY
})

const response = await client.chat.completions.create({
  model: 'sonar-pro',
  messages: [{ role: 'user', content: 'What happened today in tech?' }]
})
```

**Unique Features:**
- Real-time web search
- Automatic citation of sources
- Up-to-date information

**Important Consideration:**
- High token costs: Perplexity includes the full text of cited sources in the input token count
- A simple question can result in high token usage if multiple long articles are cited

**Sources:**
- [OpenAI Compatibility - Perplexity](https://docs.perplexity.ai/guides/chat-completions-guide)

---
|
|
378
|
+
|
|
379
|
+
### 10. OpenRouter

**Status:** ✅ Fully OpenAI-Compatible (API Gateway)

**Base URL:** `https://openrouter.ai/api/v1`

**Authentication:** Bearer token

**Models:** 500+ models from multiple providers

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY
})

const response = await client.chat.completions.create({
  model: 'anthropic/claude-sonnet-4-5',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Features:**
- Unified access to 500+ models from providers like OpenAI, Anthropic, Google, etc.
- Automatic failovers
- Prompt caching
- Intelligent routing for cost/latency optimization
- 13+ free models with daily limits

**Pricing:**
- Pass-through pricing at exact provider rates
- 5% platform fee (5.5% on credits)

**Limitations:**
- Schema normalization means slight differences from native provider APIs
- Additional latency from the routing layer

**Sources:**
- [OpenRouter Quickstart Guide](https://openrouter.ai/docs/quickstart)

---

### 11. Cloudflare Workers AI

**Status:** ✅ Fully OpenAI-Compatible

**Base URL:** Via Workers AI endpoints

**Authentication:** Cloudflare token

**Models:** 50+ models including:
- `@cf/openai/gpt-oss-120b` - OpenAI's open-weight model
- `@cf/openai/gpt-oss-20b`
- Other open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.cloudflare.com/client/v4/accounts/{account_id}/ai/v1',
  apiKey: process.env.CLOUDFLARE_API_TOKEN
})

const response = await client.chat.completions.create({
  model: '@cf/openai/gpt-oss-120b',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported Endpoints:**
- `/v1/chat/completions` - Text generation
- `/v1/embeddings` - Text embeddings

**Features:**
- Edge deployment (200+ cities worldwide)
- Serverless pricing
- Day 0 support for OpenAI's open-weight models
- OpenAI Responses API format support

**Limitations:**
- Requires Cloudflare account and setup

**Sources:**
- [OpenAI compatible API endpoints · Cloudflare Workers AI docs](https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/)

---

### 12. vLLM

**Status:** ✅ Fully OpenAI-Compatible (Self-Hosted)

**Base URL:** `http://localhost:8000/v1` (default, configurable)

**Authentication:** None (self-hosted)

**Models:** Any model supported by vLLM (hundreds of open-source models)

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:8000/v1',
  apiKey: 'none' // required by SDK but ignored
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported APIs:**
- Chat Completions API
- Completions API
- Embeddings API

**Features:**
- Self-hosted deployment
- Multi-GPU support
- Scales from a single GPU to a multi-node cluster
- Multimodal support (vision and audio)
- Auto-downloads models from HuggingFace

**Recent Updates (Jan 2026):**
- Support for latest models including DeepSeek R1

**Limitations:**
- Requires infrastructure setup
- Self-managed deployment

**Sources:**
- [OpenAI-Compatible Server - vLLM](https://docs.vllm.ai/en/stable/serving/openai_compatible_server/)

---

### 13. LiteLLM Proxy

**Status:** ✅ Fully OpenAI-Compatible (Gateway/Proxy)

**Base URL:** `http://localhost:4000` (default)

**Authentication:** Bearer token (managed by proxy)

**Models:** 100+ providers unified through a single interface

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'http://localhost:4000',
  apiKey: process.env.LITELLM_API_KEY
})

const response = await client.chat.completions.create({
  model: 'gpt-4', // LiteLLM routes to the configured provider
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Supported Providers:**
- OpenAI, Azure, Anthropic, Cohere, Bedrock, VertexAI, HuggingFace, NVIDIA NIM, and 100+ more

**Supported Endpoints:**
- `/chat/completions`
- `/responses`
- `/embeddings`
- `/images`
- `/audio`
- `/batches`
- `/rerank`
- `/a2a` (Agent-to-Agent)
- `/messages`

**Features (2026):**
- Cost tracking and management
- Guardrails
- Load balancing
- Logging
- JWT Authentication
- Batch API routing
- Prompt management with versioning
- Agent (A2A) Gateway support

**Use Cases:**
- Unified gateway for multiple providers
- Cost tracking across providers
- Development/testing with multiple models
- Production routing and fallbacks

**Limitations:**
- Requires running a proxy server
- Additional latency from the proxy layer

**Sources:**
- [OpenAI-Compatible Endpoints | liteLLM](https://docs.litellm.ai/docs/providers/openai_compatible)
- [GitHub - BerriAI/litellm](https://github.com/BerriAI/litellm)

---

### 14. Anyscale Endpoints

**Status:** ⚠️ Limited Availability

**Base URL:** `https://api.endpoints.anyscale.com/v1`

**Authentication:** Bearer token

**Models:** Various open-source models

**Example:**
```typescript
import OpenAI from 'openai'

const client = new OpenAI({
  baseURL: 'https://api.endpoints.anyscale.com/v1',
  apiKey: process.env.ANYSCALE_API_KEY
})

const response = await client.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct',
  messages: [{ role: 'user', content: 'Hello!' }]
})
```

**Important Note:**
- As of August 1, 2024, the Anyscale Endpoints API is only available through the fully hosted Anyscale Platform
- Multi-tenant access to LLM models was removed

**Features:**
- JSON Mode
- Function calling
- Fine-tuning API (OpenAI-compatible)

**Sources:**
- [Migrate from OpenAI | Anyscale Docs](https://docs.anyscale.com/endpoints/text-generation/migrate-from-openai/)

---

## Implementation Pattern

### TypeScript Example: Universal LLM Client

```typescript
import OpenAI from 'openai'

type Provider =
  | 'openai'
  | 'deepseek'
  | 'together'
  | 'groq'
  | 'ollama'
  | 'mistral'
  | 'fireworks'
  | 'perplexity'
  | 'openrouter'

interface ProviderConfig {
  baseURL: string
  apiKey: string
  defaultModel?: string
}

const PROVIDER_CONFIGS: Record<Provider, (apiKey: string) => ProviderConfig> = {
  openai: (apiKey) => ({
    baseURL: 'https://api.openai.com/v1',
    apiKey,
    defaultModel: 'gpt-4o'
  }),
  deepseek: (apiKey) => ({
    baseURL: 'https://api.deepseek.com/v1',
    apiKey,
    defaultModel: 'deepseek-chat'
  }),
  together: (apiKey) => ({
    baseURL: 'https://api.together.xyz/v1',
    apiKey,
    defaultModel: 'meta-llama/Llama-3.3-70B-Instruct-Turbo'
  }),
  groq: (apiKey) => ({
    baseURL: 'https://api.groq.com/openai/v1',
    apiKey,
    defaultModel: 'deepseek-r1-distill-llama-70b'
  }),
  ollama: () => ({
    baseURL: 'http://localhost:11434/v1',
    apiKey: 'ollama', // ignored by Ollama but required by the SDK
    defaultModel: 'llama3.3'
  }),
  mistral: (apiKey) => ({
    baseURL: 'https://api.mistral.ai/v1',
    apiKey,
    defaultModel: 'mistral-large-latest'
  }),
  fireworks: (apiKey) => ({
    baseURL: 'https://api.fireworks.ai/inference/v1',
    apiKey,
    defaultModel: 'accounts/fireworks/models/llama-v3p3-70b-instruct'
  }),
  perplexity: (apiKey) => ({
    baseURL: 'https://api.perplexity.ai',
    apiKey,
    defaultModel: 'sonar-pro'
  }),
  openrouter: (apiKey) => ({
    baseURL: 'https://openrouter.ai/api/v1',
    apiKey,
    defaultModel: 'anthropic/claude-sonnet-4-5'
  })
}

export class UniversalLLMClient {
  private client: OpenAI
  private defaultModel: string

  constructor(provider: Provider, apiKey: string) {
    const config = PROVIDER_CONFIGS[provider](apiKey)

    this.client = new OpenAI({
      baseURL: config.baseURL,
      apiKey: config.apiKey
    })

    this.defaultModel = config.defaultModel || ''
  }

  async chat(
    messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[],
    options?: {
      model?: string
      temperature?: number
      maxTokens?: number
    }
  ) {
    return this.client.chat.completions.create({
      model: options?.model || this.defaultModel,
      messages,
      temperature: options?.temperature,
      max_tokens: options?.maxTokens
    })
  }

  async embed(input: string | string[], model?: string) {
    return this.client.embeddings.create({
      model: model || 'text-embedding-3-small', // OpenAI's default; pass a provider-specific model for other providers
      input
    })
  }
}

// Usage
const client = new UniversalLLMClient('deepseek', process.env.DEEPSEEK_API_KEY!)

const response = await client.chat([
  { role: 'user', content: 'Hello!' }
])

console.log(response.choices[0].message.content)
```

---

## Architecture Recommendations

### For Production Use

1. **Primary Pattern: OpenAI SDK with Provider Switching**
   ```typescript
   // Recommended approach
   const provider = process.env.LLM_PROVIDER || 'openai'
   const client = createLLMClient(provider)
   ```

2. **Use LiteLLM Proxy for:**
   - Cost tracking across providers
   - Load balancing and failovers
   - Unified logging and monitoring
   - Development/staging environments

3. **Use OpenRouter for:**
   - Quick access to many models
   - Model experimentation
   - Fallback/redundancy strategy

4. **Use Native SDKs When:**
   - Provider-specific features required (e.g., Claude's prompt caching, extended thinking)
   - Maximum performance needed
   - Advanced features not in OpenAI spec

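A `createLLMClient` factory like the one in the primary pattern mostly boils down to resolving a base URL and an API key, then calling `new OpenAI(options)`. A minimal sketch of that resolution step (the provider table is abbreviated, and the `*_API_KEY` environment-variable convention is an assumption of this sketch):

```typescript
// Hypothetical helper behind a createLLMClient factory: resolve the
// constructor options for a provider. Pass the result to `new OpenAI(...)`.
// Base URLs follow the provider sections above (abbreviated here).
const BASE_URLS: Record<string, string> = {
  openai: 'https://api.openai.com/v1',
  deepseek: 'https://api.deepseek.com/v1',
  groq: 'https://api.groq.com/openai/v1',
  together: 'https://api.together.xyz/v1'
}

function resolveClientOptions(provider: string): { baseURL: string; apiKey: string } {
  const baseURL = BASE_URLS[provider]
  if (!baseURL) throw new Error(`Unknown provider: ${provider}`)

  // Assumed convention: DEEPSEEK_API_KEY, GROQ_API_KEY, etc.
  const apiKey = process.env[`${provider.toUpperCase()}_API_KEY`]
  if (!apiKey) throw new Error(`Missing API key for ${provider}`)

  return { baseURL, apiKey }
}
```

Failing fast on an unknown provider or missing key keeps misconfiguration errors at startup rather than at the first request.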
### Environment Configuration

```bash
# .env
LLM_PROVIDER=deepseek
DEEPSEEK_API_KEY=sk-xxx
OPENAI_API_KEY=sk-xxx
GROQ_API_KEY=gsk-xxx
TOGETHER_API_KEY=xxx
```

### Error Handling

```typescript
class LLMError extends Error {
  constructor(
    message: string,
    public provider: string,
    public originalError: unknown
  ) {
    super(message)
  }
}

async function chatWithFallback(
  messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[],
  providers: Provider[] = ['deepseek', 'groq', 'openai']
) {
  const errors: LLMError[] = []

  for (const provider of providers) {
    try {
      const client = new UniversalLLMClient(
        provider,
        process.env[`${provider.toUpperCase()}_API_KEY`]!
      )
      return await client.chat(messages)
    } catch (error) {
      errors.push(new LLMError(
        `${provider} failed`,
        provider,
        error
      ))
      continue
    }
  }

  throw new Error(
    `All providers failed: ${errors.map(e => e.message).join(', ')}`
  )
}
```

---

## Provider Selection Guide

### For Cost Optimization
1. **DeepSeek** - Very competitive pricing
2. **Groq** - Fast inference at good rates
3. **Together AI** - Competitive open-source model pricing
4. **OpenRouter** - Automatic cost optimization

### For Speed
1. **Groq** - Ultra-fast inference (LPU-based)
2. **Fireworks AI** - Optimized for speed
3. **Together AI** - Fast open-source models

### For Model Variety
1. **OpenRouter** - 500+ models
2. **Together AI** - 200+ open-source models
3. **LiteLLM Proxy** - 100+ providers

### For Reasoning Tasks
1. **DeepSeek** - deepseek-reasoner with chain-of-thought
2. **Groq** - qwen-qwq-32b, deepseek-r1-distill-llama-70b
3. **Mistral** - Magistral reasoning models

### For Real-Time Information
1. **Perplexity AI** - Built-in web search with citations
2. **OpenRouter** - Access to various search-enabled models

### For Local/Private Deployment
1. **Ollama** - Easy local deployment
2. **vLLM** - High-performance self-hosted
3. **LiteLLM Proxy** - Self-hosted gateway

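The rankings above can also be encoded as plain data, which makes the guide directly usable in code. The orderings below simply mirror the lists (the priority names are illustrative, and the rankings are this document's judgment, not a benchmark):

```typescript
// Sketch: the selection guide as a lookup table.
// Orderings mirror the lists above; pick the first provider that is available.
const SELECTION_GUIDE: Record<string, string[]> = {
  cost: ['deepseek', 'groq', 'together', 'openrouter'],
  speed: ['groq', 'fireworks', 'together'],
  variety: ['openrouter', 'together', 'litellm'],
  reasoning: ['deepseek', 'groq', 'mistral'],
  realtime: ['perplexity', 'openrouter'],
  local: ['ollama', 'vllm', 'litellm']
}

// Returns the highest-ranked provider for which `available` is true,
// or undefined if none qualifies.
function pickProvider(
  priority: string,
  available: (provider: string) => boolean
): string | undefined {
  return (SELECTION_GUIDE[priority] ?? []).find(available)
}
```

For example, `pickProvider('cost', p => configuredKeys.has(p))` walks the cost-optimized list in order and falls back down the ranking when a provider has no configured key.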
---

## Migration Strategy

### Phase 1: Abstraction Layer
Create a unified interface that uses the OpenAI SDK internally:

```typescript
// Message, ChatResponse, Embedding: application-level types defined elsewhere
interface LLMProvider {
  chat(messages: Message[]): Promise<ChatResponse>
  embed(text: string): Promise<Embedding>
}

class OpenAICompatibleProvider implements LLMProvider {
  constructor(
    private client: OpenAI,
    private defaultModel: string
  ) {}

  async chat(messages: Message[]) {
    const response = await this.client.chat.completions.create({
      model: this.defaultModel,
      messages
    })
    return response
  }
}
```

### Phase 2: Configuration
Externalize provider configuration:

```typescript
// config/llm-providers.ts
export const LLM_PROVIDERS = {
  deepseek: {
    baseURL: 'https://api.deepseek.com/v1',
    models: {
      chat: 'deepseek-chat',
      reasoning: 'deepseek-reasoner'
    }
  },
  groq: {
    baseURL: 'https://api.groq.com/openai/v1',
    models: {
      fast: 'deepseek-r1-distill-llama-70b'
    }
  }
  // ... more providers
}
```

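Once configuration is externalized this way, resolving a model for a named role becomes a small pure function. A sketch against that shape (the config is repeated inline so the snippet is self-contained; the role names come from the Phase 2 example):

```typescript
// Sketch: look up a role-specific model from the externalized config.
// Shape matches the LLM_PROVIDERS example in Phase 2.
const LLM_PROVIDERS: Record<string, { baseURL: string; models: Record<string, string> }> = {
  deepseek: {
    baseURL: 'https://api.deepseek.com/v1',
    models: { chat: 'deepseek-chat', reasoning: 'deepseek-reasoner' }
  },
  groq: {
    baseURL: 'https://api.groq.com/openai/v1',
    models: { fast: 'deepseek-r1-distill-llama-70b' }
  }
}

function getModel(provider: string, role: string): string {
  const entry = LLM_PROVIDERS[provider]
  if (!entry) throw new Error(`Unknown provider: ${provider}`)
  const model = entry.models[role]
  if (!model) throw new Error(`No "${role}" model configured for ${provider}`)
  return model
}
```

Application code then asks for `getModel('deepseek', 'reasoning')` rather than hard-coding model IDs, so swapping models is a config change.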
### Phase 3: Runtime Switching
Enable dynamic provider selection:

```typescript
const provider = selectProvider({
  task: 'reasoning', // or 'chat', 'search', etc.
  priority: 'cost', // or 'speed', 'quality'
  fallbacks: true
})
```

---

## Testing Recommendations

### Provider Compatibility Tests

```typescript
import { describe, it, expect } from 'vitest'

const PROVIDERS_TO_TEST: Provider[] = [
  'deepseek',
  'groq',
  'together',
  'mistral'
]

describe.each(PROVIDERS_TO_TEST)('Provider: %s', (provider) => {
  it('should complete chat', async () => {
    const client = new UniversalLLMClient(
      provider,
      process.env[`${provider.toUpperCase()}_API_KEY`]!
    )

    const response = await client.chat([
      { role: 'user', content: 'Say "hello"' }
    ])

    expect(response.choices[0].message.content).toBeTruthy()
  })

  it('should handle streaming', async () => {
    // ... streaming test
  })

  it('should support function calling', async () => {
    // ... function calling test
  })
})
```

---

## Conclusion

The OpenAI-compatible API pattern is the **clear winner** for multi-provider LLM integration in 2026. Key benefits:

1. **Minimal Code**: One SDK, multiple providers
2. **Easy Migration**: Switching providers means changing two lines of code (the base URL and the API key)
3. **Future-Proof**: New providers adopt this standard regularly
4. **Developer Experience**: Familiar interface reduces the learning curve
5. **Ecosystem**: Works with existing tools built for the OpenAI SDK

### When to Use Native SDKs Instead:

- **Anthropic Claude**: Use the native SDK for production (OpenAI compatibility is testing-only)
- **Provider-Specific Features**: When you need features not in the OpenAI spec
- **Maximum Performance**: When latency is critical and provider optimizations matter

### Recommended Stack:

```
Application Code
        ↓
Universal LLM Client (OpenAI SDK-based)
        ↓
[Optional] LiteLLM Proxy (for cost tracking, routing)
        ↓
Multiple Providers (DeepSeek, Groq, Together, etc.)
```

This architecture provides flexibility, maintainability, and future-proofing while minimizing complexity.

---

## References

### Official Documentation
- [DeepSeek API Docs](https://api-docs.deepseek.com/)
- [Together AI OpenAI Compatibility](https://docs.together.ai/docs/openai-api-compatibility)
- [Groq Docs Overview](https://console.groq.com/docs/overview)
- [Ollama OpenAI compatibility](https://docs.ollama.com/api/openai-compatibility)
- [Anthropic OpenAI SDK compatibility](https://platform.claude.com/docs/en/api/openai-sdk)
- [Mistral AI API Specs](https://docs.mistral.ai/api)
- [Cohere Compatibility API](https://docs.cohere.com/docs/compatibility-api)
- [Fireworks AI OpenAI compatibility](https://docs.fireworks.ai/tools-sdks/openai-compatibility)
- [Perplexity OpenAI Compatibility](https://docs.perplexity.ai/guides/chat-completions-guide)
- [OpenRouter Quickstart Guide](https://openrouter.ai/docs/quickstart)
- [Cloudflare Workers AI OpenAI endpoints](https://developers.cloudflare.com/workers-ai/configuration/open-ai-compatibility/)
- [vLLM OpenAI-Compatible Server](https://docs.vllm.ai/en/stable/serving/openai_compatible_server/)
- [LiteLLM Documentation](https://docs.litellm.ai/docs/providers/openai_compatible)

### Additional Resources
- [AI SDK Providers](https://ai-sdk.dev/providers/ai-sdk-providers/)
- [OpenAI SDK (npm)](https://www.npmjs.com/package/openai)

---

**Document Version:** 1.0
**Last Updated:** January 26, 2026
**Researched by:** Claude Sonnet 4.5