mdcontext 0.0.1 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/BACKLOG.md +338 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +434 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +88 -0
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +803 -0
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1629 -0
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +5458 -0
- package/dist/index.d.ts +653 -0
- package/dist/index.js +79 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +472 -0
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +625 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/CONFIG.md +1123 -0
- package/docs/DESIGN.md +439 -0
- package/docs/ERRORS.md +383 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/summarization.md +320 -0
- package/docs/test-links.md +9 -0
- package/justfile +40 -0
- package/package.json +74 -9
- package/pnpm-workspace.yaml +5 -0
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +58 -0
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +627 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +285 -0
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +480 -0
- package/src/cli/commands/index.ts +16 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +1281 -0
- package/src/cli/commands/stats.ts +149 -0
- package/src/cli/commands/tree.ts +128 -0
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +341 -0
- package/src/cli/help.ts +588 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +435 -0
- package/src/cli/options.ts +41 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +259 -0
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +113 -0
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +10 -0
- package/src/embeddings/openai-provider.ts +414 -0
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +1270 -0
- package/src/embeddings/types.ts +359 -0
- package/src/embeddings/vector-store.ts +708 -0
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +684 -0
- package/src/index/storage.ts +260 -0
- package/src/index/types.ts +147 -0
- package/src/index/watcher.ts +189 -0
- package/src/index.ts +30 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +612 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +394 -0
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +392 -0
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +280 -0
- package/src/search/searcher.ts +724 -0
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +597 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +16 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,490 @@

# Embedding Models Research for mdcontext

_Research conducted: January 2026_

This document provides comprehensive research on embedding models for improving mdcontext's semantic search capabilities. The current implementation uses OpenAI's `text-embedding-3-small` (1536 dimensions, $0.02/1M tokens).

## Table of Contents

1. [Model Comparison Table](#model-comparison-table)
2. [OpenAI Models Analysis](#openai-models-analysis)
3. [Local/Offline Models Analysis](#localoffline-models-analysis)
4. [Alternative API Providers](#alternative-api-providers)
5. [Dimension Reduction Analysis](#dimension-reduction-analysis)
6. [Hybrid Search & Reranking](#hybrid-search--reranking)
7. [Top 3 Recommendations](#top-3-recommendations)
8. [Effort/Impact Analysis](#effortimpact-analysis)
9. [Quick Wins](#quick-wins)

---
## Model Comparison Table

### API-Based Models

| Provider  | Model                  | Dimensions          | Cost/1M tokens           | MTEB Score | Context Length | Notes                      |
| --------- | ---------------------- | ------------------- | ------------------------ | ---------- | -------------- | -------------------------- |
| OpenAI    | text-embedding-3-small | 1536 (configurable) | $0.02                    | 62.3       | 8,192          | Current mdcontext model    |
| OpenAI    | text-embedding-3-large | 3072 (configurable) | $0.13                    | 64.6       | 8,192          | Best OpenAI option         |
| Voyage AI | voyage-3.5             | 1024                | $0.06                    | ~66+       | 32,000         | Excellent retrieval        |
| Voyage AI | voyage-3.5-lite        | 512                 | $0.02                    | ~64+       | 32,000         | Same price as OpenAI small |
| Voyage AI | voyage-3-large         | 2048/1024/512/256   | $0.22                    | ~68+       | 32,000         | SOTA general purpose       |
| Cohere    | embed-v4               | 1536                | $0.12                    | 65.2       | 512            | Multimodal support         |
| Cohere    | embed-v3-english       | 1024                | ~$0.10                   | ~64        | 512            | Text-only                  |
| Google    | gemini-embedding-001   | 3072/1536/768       | $0.15 (paid) / Free tier | 71.5       | 2,048          | Free tier available        |
| Jina AI   | jina-embeddings-v3     | 1024 (configurable) | Usage-based              | 65.5       | 8,192          | Task-specific adapters     |

### Local/Open-Source Models

| Model                  | Dimensions         | Memory | Speed     | MTEB Score | Context  | License    |
| ---------------------- | ------------------ | ------ | --------- | ---------- | -------- | ---------- |
| nomic-embed-text-v1.5  | 768 (configurable) | ~0.5GB | Very Fast | 62.4       | 8,192    | Apache 2.0 |
| mxbai-embed-large      | 1024               | ~1.2GB | Fast      | 64.7       | Standard | Apache 2.0 |
| BGE-M3                 | 1024               | ~2GB   | Medium    | 63.0       | 8,192    | MIT        |
| all-MiniLM-L6-v2       | 384                | ~100MB | Very Fast | 56.3       | 256      | Apache 2.0 |
| all-mpnet-base-v2      | 768                | ~400MB | Fast      | 57.8       | 384      | Apache 2.0 |
| jina-embeddings-v3     | 1024               | ~2GB   | Medium    | 65.5       | 8,192    | Apache 2.0 |
| E5-Mistral-7B-Instruct | 4096               | ~14GB  | Slow      | 61.8       | 4,096    | MIT        |

---
## OpenAI Models Analysis

### Current: text-embedding-3-small

**Specs:**

- Dimensions: 1536 (can be reduced via API)
- Cost: $0.02 per 1M tokens
- MTEB Score: 62.3
- Context: 8,192 tokens

**Strengths:**

- Cost-effective for API usage
- Good multilingual support (improved over ada-002)
- Native dimension reduction support (Matryoshka)
- Well-documented, stable API

**Weaknesses:**

- Requires API access (no offline mode)
- Lower quality than text-embedding-3-large
- Latency dependent on network

### Upgrade Option: text-embedding-3-large

**Specs:**

- Dimensions: 3072 (configurable anywhere from 256 to 3072)
- Cost: $0.13 per 1M tokens (6.5x more expensive)
- MTEB Score: 64.6
- MIRACL Score: 54.9% (vs 44.0% for small)

**When to Consider:**

- Multilingual documentation
- Complex technical content
- When quality matters more than cost

**Key Insight:** You can use text-embedding-3-large at 256-512 dimensions and still outperform text-embedding-3-small at its full 1536 dimensions, so the upgrade buys quality and storage savings at once.

### Dimension Reduction (Matryoshka)

OpenAI's text-embedding-3 models use Matryoshka Representation Learning, which allows dimensions to be truncated after the fact:

| Original Model | Reduced Dims | MTEB Impact | Storage Savings |
| -------------- | ------------ | ----------- | --------------- |
| 3-large (3072) | 1536         | ~1-2% drop  | 50%             |
| 3-large (3072) | 1024         | ~2-3% drop  | 67%             |
| 3-large (3072) | 512          | ~4-5% drop  | 83%             |
| 3-large (3072) | 256          | ~6-8% drop  | 92%             |
| 3-small (1536) | 512          | ~3-4% drop  | 67%             |
| 3-small (1536) | 256          | ~5-7% drop  | 83%             |

**Practical finding:** Reducing from 1536 to 512 dimensions typically cuts query latency in half and reduces vector storage by 67%, with minimal accuracy impact for most RAG use cases.
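Applying Matryoshka truncation client-side is cheap: slice the vector, then renormalize, because a sliced embedding is no longer unit length and cosine similarity assumes unit vectors. A minimal sketch (function names are illustrative, not mdcontext's actual API):

```typescript
// Sketch: truncate a Matryoshka embedding and re-normalize to unit length.
// Slicing breaks the unit norm, so the truncated vector must be rescaled
// before cosine similarity is meaningful.
function truncateEmbedding(vec: number[], dims: number): number[] {
  const sliced = vec.slice(0, dims);
  const norm = Math.sqrt(sliced.reduce((s, x) => s + x * x, 0));
  if (norm === 0) throw new Error("zero vector after truncation");
  return sliced.map((x) => x / norm);
}

// Cosine similarity for unit-norm inputs reduces to a dot product.
function cosine(a: number[], b: number[]): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}
```

The same helper works whether the full-size vector came from the API or from a local Matryoshka-trained model such as nomic-embed-text-v1.5.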
---

## Local/Offline Models Analysis

### Tier 1: High Quality (Recommended for mdcontext)

#### nomic-embed-text-v1.5

**Why it stands out:**

- Outperforms OpenAI text-embedding-3-small on both short- and long-context benchmarks
- 8,192 token context (matches OpenAI)
- Matryoshka support for dimension flexibility
- Binary quantization support (100x storage reduction possible)
- Apache 2.0 license with fully open weights, code, and training data
- ~100 QPS on an M2 MacBook (excellent local performance)
- Most downloaded open-source embedder on Hugging Face (35M+ downloads)

**Availability:**

- Hugging Face: `nomic-ai/nomic-embed-text-v1.5`
- Ollama: `nomic-embed-text`
- sentence-transformers compatible

**Best for:** General documentation search, mdcontext's primary use case
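The binary-quantization bullet above is worth unpacking: each float dimension collapses to a sign bit, and candidates are compared with Hamming distance (the headline ~100x figure assumes combining quantization with dimension reduction; bit-packing alone gives ~32x over float32). A rough sketch, illustrative only and not mdcontext code:

```typescript
// Sketch: binary quantization of an embedding. Each dimension becomes one
// bit (positive → 1), packed eight per byte. Top Hamming-distance hits are
// usually rescored with the full-precision vectors afterward.
function binarize(vec: number[]): Uint8Array {
  const bits = new Uint8Array(Math.ceil(vec.length / 8));
  vec.forEach((x, i) => {
    if (x > 0) bits[i >> 3] |= 1 << (i & 7);
  });
  return bits;
}

// Hamming distance: count of differing bits between two packed vectors.
function hamming(a: Uint8Array, b: Uint8Array): number {
  let d = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      d += x & 1;
      x >>= 1;
    }
  }
  return d;
}
```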
#### mxbai-embed-large

**Why it stands out:**

- MTEB retrieval score of 64.68 (matches OpenAI text-embedding-3-large at 64.59)
- Excellent for context-heavy, complex queries
- 1024 dimensions (efficient storage)

**Availability:**

- Ollama: `mxbai-embed-large`
- Hugging Face: `mixedbread-ai/mxbai-embed-large-v1`

**Best for:** When accuracy is paramount; complex technical documentation

#### BGE-M3

**Why it stands out:**

- Supports dense, sparse, AND multi-vector retrieval simultaneously
- 100+ languages
- 8,192 token context
- SOTA on multilingual benchmarks (MIRACL, MKQA)
- MIT license

**Unique capability:** Enables hybrid retrieval without a separate BM25 index - the model produces both dense embeddings and sparse lexical representations.

**Availability:**

- Hugging Face: `BAAI/bge-m3`
- Ollama: `bge-m3`

**Best for:** Multilingual documentation, hybrid search without BM25

### Tier 2: Fast & Lightweight

#### all-MiniLM-L6-v2

**Specs:**

- 384 dimensions, ~22M parameters, ~100MB
- 5x faster than larger models
- 12,450 tokens/sec on an RTX 4090

**Trade-off:** Lower accuracy (MTEB 56.3), but extremely fast and lightweight

**Best for:** Edge deployment, high-throughput scenarios, prototyping

#### all-mpnet-base-v2

**Specs:**

- 768 dimensions, ~110M parameters, ~400MB
- STS-B score: 87-88% (vs 84-85% for MiniLM)

**Trade-off:** Better accuracy than MiniLM, but 4-5x slower

**Best for:** When you need better accuracy than MiniLM but can't run larger models

### Local Model Comparison for mdcontext

| Factor        | nomic-embed-text-v1.5 | mxbai-embed-large | BGE-M3    |
| ------------- | --------------------- | ----------------- | --------- |
| Quality       | High                  | Highest           | High      |
| Speed         | Very Fast             | Fast              | Medium    |
| Memory        | 0.5GB                 | 1.2GB             | 2GB       |
| Context       | 8,192                 | Standard          | 8,192     |
| Matryoshka    | Yes                   | No                | No        |
| Multilingual  | Moderate              | Moderate          | Excellent |
| mdcontext fit | Excellent             | Good              | Good      |

**Recommendation:** nomic-embed-text-v1.5 is the best fit for mdcontext due to its balance of quality, speed, long context, and Matryoshka support.
---

## Alternative API Providers

### Voyage AI

**Standout features:**

- voyage-3.5 outperforms OpenAI text-embedding-3-large by 8.26%
- 32K token context (4x OpenAI)
- Excellent domain-specific models (code, law, finance)
- Matryoshka + quantization support

**Pricing:**

- voyage-3.5-lite: $0.02/1M (same price as OpenAI small, but better quality)
- voyage-3.5: $0.06/1M
- voyage-3-large: $0.22/1M

**Free tier:** 200M tokens free for new models

**Best for:** When you need better quality than OpenAI at similar cost

### Cohere

**Standout features:**

- embed-v4 is multimodal (text + images)
- 100+ languages
- Fast inference (50-60% faster than OpenAI)
- Works well with Cohere's reranker

**Pricing:**

- embed-v4: $0.12/1M tokens

**Best for:** Multimodal needs, or when using Cohere's full stack

### Google (Gemini Embedding)

**Standout features:**

- gemini-embedding-001: 71.5% accuracy on benchmarks
- Free tier available
- Matryoshka support (3072/1536/768)

**Pricing:**

- Free tier: Generous limits
- Paid: $0.15/1M tokens

**Consideration:** Higher latency, and less established for embeddings

### Jina AI

**Standout features:**

- jina-embeddings-v3: Task-specific LoRA adapters
- 89 languages, 8,192 context
- Matryoshka support (32-1024 dims)
- Can be self-hosted (Apache 2.0)

**Best for:** Multilingual use, task-specific optimization, hybrid API/local deployment

---
## Hybrid Search & Reranking

### Why Hybrid Search Matters

Current mdcontext limitation: semantic and keyword search are mutually exclusive.

**Hybrid approach benefits:**

- 48% improvement in retrieval quality (Pinecone benchmarks)
- Captures both exact keyword matches AND semantic similarity
- Reduces LLM hallucinations by 35% when combined with reranking

### Recommended Architecture

```
Query → BM25 (lexical) ──┐
                         ├─→ Merge & Dedupe → Reranker → Top K results
Query → Dense Embed ─────┘
```
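The "Merge & Dedupe" stage is commonly implemented with reciprocal rank fusion (RRF), which combines ranked lists without needing the BM25 and cosine scores to be comparable; the technique is a common choice here, not something this document prescribes. A minimal sketch (names illustrative):

```typescript
// Sketch: reciprocal rank fusion. Each document's fused score is the sum of
// 1 / (k + rank) across the result lists it appears in; k ≈ 60 is the
// conventional damping constant. Duplicates merge automatically via the map.
function rrfMerge(resultLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of resultLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Documents ranked highly by both retrievers float to the top; a document found by only one list still survives into the reranking stage.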
### Reranking Impact

Cross-encoder rerankers examine query-document pairs together, achieving +28% NDCG@10 improvements over raw embedding retrieval.

**Top reranker options:**

1. **Cohere Rerank 3**: 100+ languages, production-ready
2. **BGE Reranker v2-m3**: Open source, ~600M params, Apache 2.0
3. **Voyage rerank-2.5**: Instruction-following, high quality

**Optimal configuration:**

- Rerank the top 50-75 documents for the best quality/speed balance
- Latency: ~1.5 seconds for 50 documents
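Since the three rerankers above expose different APIs, the pipeline step itself can stay model-agnostic by injecting the scoring function. A sketch of that shape (all names hypothetical; the scorer would wrap a cross-encoder model or a rerank API call in practice):

```typescript
// Sketch: rerank fused candidates with a pluggable (query, doc) scorer.
// Injecting `score` keeps the pipeline testable without a model and lets
// Cohere / BGE / Voyage backends be swapped behind one signature.
async function rerank(
  query: string,
  docs: string[],
  score: (query: string, doc: string) => Promise<number>,
  topK: number,
): Promise<string[]> {
  const scored = await Promise.all(
    docs.map(async (doc) => ({ doc, s: await score(query, doc) })),
  );
  return scored
    .sort((a, b) => b.s - a.s)
    .slice(0, topK)
    .map((x) => x.doc);
}
```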
### BGE-M3 Special Capability
|
|
308
|
+
|
|
309
|
+
BGE-M3 uniquely supports all three retrieval methods in one model:
|
|
310
|
+
|
|
311
|
+
- Dense retrieval (semantic)
|
|
312
|
+
- Sparse retrieval (lexical, like BM25)
|
|
313
|
+
- Multi-vector retrieval (ColBERT-style)
|
|
314
|
+
|
|
315
|
+
This could eliminate the need for a separate BM25 index in mdcontext.
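
When one model emits both signals, fusion is typically a weighted sum of the dense (cosine) and sparse (lexical-overlap) scores. A sketch under two assumptions: dense vectors are unit-normalized, and the sparse side is a token→weight map; the 0.6/0.4 split is a starting point, not a tuned value:

```typescript
type SparseVec = Record<string, number>; // token → weight

// Dot product of unit-normalized vectors equals cosine similarity.
function denseScore(a: number[], b: number[]): number {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

// Sum of weight products over shared tokens (BM25-like lexical match).
function sparseScore(q: SparseVec, d: SparseVec): number {
  return Object.keys(q).reduce((sum, tok) => sum + q[tok] * (d[tok] ?? 0), 0);
}

function hybridScore(
  qDense: number[], dDense: number[],
  qSparse: SparseVec, dSparse: SparseVec,
  wDense = 0.6, wSparse = 0.4,
): number {
  return wDense * denseScore(qDense, dDense) + wSparse * sparseScore(qSparse, dSparse);
}
```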

---

## Top 3 Recommendations

### Recommendation 1: Add Local Embedding Support with nomic-embed-text-v1.5

**Rationale:**

- Enables offline semantic search (major feature gap)
- Quality matches or exceeds current OpenAI text-embedding-3-small
- Zero ongoing API costs
- 8,192 token context matches current implementation
- Matryoshka support enables storage optimization
- Excellent performance on Apple Silicon (mdcontext's likely dev environment)

**Implementation approach:**

1. Add `nomic-embed-text` as an Ollama provider option
2. Create `OllamaEmbeddingProvider` implementing existing interface
3. Allow provider selection via config or CLI flag
4. Keep OpenAI as default for backward compatibility
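
Step 2 could look roughly like this. `EmbeddingProvider` is a stand-in for mdcontext's existing interface (its real shape may differ); the endpoint is Ollama's documented `/api/embeddings` route, which takes one prompt per request:

```typescript
// Hypothetical shape of the existing provider interface.
interface EmbeddingProvider {
  embed(texts: string[]): Promise<number[][]>;
}

class OllamaEmbeddingProvider implements EmbeddingProvider {
  constructor(
    private model = "nomic-embed-text",
    private baseUrl = "http://localhost:11434",
  ) {}

  async embed(texts: string[]): Promise<number[][]> {
    // One request per text: Ollama's embeddings endpoint is single-prompt.
    return Promise.all(
      texts.map(async (text) => {
        const res = await fetch(`${this.baseUrl}/api/embeddings`, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ model: this.model, prompt: text }),
        });
        if (!res.ok) throw new Error(`Ollama error: ${res.status}`);
        const { embedding } = (await res.json()) as { embedding: number[] };
        return embedding;
      }),
    );
  }
}
```

Provider selection (step 3) then becomes a config switch between this class and the existing OpenAI provider.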

**Impact:** High (offline capability, cost elimination)
**Effort:** Medium (new provider implementation, testing)

### Recommendation 2: Implement Dimension Reduction for OpenAI

**Rationale:**

- Near-zero-code quick win using the existing API
- Reduces storage by 67% (1536 → 512) with minimal quality loss
- Improves query latency by ~50%
- text-embedding-3-large at 512 dims outperforms 3-small at 1536

**Implementation approach:**

1. Add `dimensions` parameter to OpenAI API calls
2. Update vector store to handle variable dimensions
3. Default to 512 dimensions for new indexes
4. Add migration path for existing indexes (or require rebuild)
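
For step 4, Matryoshka-style embeddings offer a migration path that avoids re-embedding: truncate stored 1536-dim vectors to the leading dimensions and re-normalize, which per OpenAI's documentation is equivalent to requesting the smaller `dimensions` value server-side. A minimal sketch (the function name is ours):

```typescript
// Truncate a Matryoshka embedding to `dims` and L2-normalize, matching what
// the API does when the `dimensions` parameter is set.
function truncateEmbedding(vec: number[], dims: number): number[] {
  const head = vec.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) return head;
  return head.map((v) => v / norm);
}
```

Existing indexes could be rewritten in place this way instead of requiring a full rebuild.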

**Impact:** Medium-High (storage/performance improvement)
**Effort:** Low (API parameter change, minor refactoring)

### Recommendation 3: Add Hybrid Search with BGE-M3 (Future)

**Rationale:**

- Addresses limitation #4 (no hybrid search) from current implementation
- Single model provides dense + sparse retrieval
- No separate BM25 index needed
- 48% retrieval quality improvement potential

**Implementation approach:**

1. Add BGE-M3 as a local provider option
2. Store both dense and sparse vectors
3. Implement hybrid retrieval merging
4. Optional: Add cross-encoder reranking

**Impact:** High (major quality improvement)
**Effort:** High (significant architecture changes)

---

## Effort/Impact Analysis

| Improvement                    | Impact      | Effort | Priority         |
| ------------------------------ | ----------- | ------ | ---------------- |
| Dimension reduction (512)      | Medium-High | Low    | 1 - Quick Win    |
| nomic-embed-text local         | High        | Medium | 2 - High Value   |
| Voyage AI as alternative       | Medium      | Low    | 3 - Easy Upgrade |
| BGE-M3 hybrid search           | High        | High   | 4 - Future       |
| Cross-encoder reranking        | Medium-High | Medium | 5 - Future       |
| text-embedding-3-large upgrade | Medium      | Low    | 6 - Optional     |

### Implementation Priority Order

1. **Week 1:** Dimension reduction (1536 → 512)
   - Modify OpenAI provider to pass `dimensions: 512`
   - Update vector store metadata
   - Test retrieval quality

2. **Week 2-3:** Local embedding support
   - Implement Ollama provider
   - Add nomic-embed-text integration
   - Create provider selection mechanism

3. **Week 4+:** Provider ecosystem
   - Add Voyage AI option
   - Consider BGE-M3 for hybrid search
   - Evaluate reranking integration

---

## Quick Wins

### 1. Dimension Reduction (Immediate)

**Change required:**

```typescript
// In openai-provider.ts
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: texts,
  dimensions: 512, // Add this parameter
});
```

**Benefits:**

- 67% storage reduction
- ~50% faster queries
- Minimal quality impact (~3-4%)

### 2. Switch to voyage-3.5-lite (Same Cost, Better Quality)

**If considering API alternatives:**

- Same price as OpenAI small ($0.02/1M)
- 6-8% better retrieval quality
- 32K context (4x more)
- Free 200M tokens to test

### 3. Use text-embedding-3-large at Reduced Dimensions

**For quality boost:**

```typescript
// Better quality at same storage cost
const response = await openai.embeddings.create({
  model: "text-embedding-3-large",
  input: texts,
  dimensions: 512, // Truncate large model
});
```

**Trade-off:** 6.5x cost increase, but significantly better retrieval

---

## Sources

- [MTEB Leaderboard - Hugging Face](https://huggingface.co/spaces/mteb/leaderboard)
- [OpenAI Embeddings Documentation](https://platform.openai.com/docs/guides/embeddings)
- [Voyage AI Documentation](https://docs.voyageai.com/docs/embeddings)
- [Cohere Embed Documentation](https://cohere.com/pricing)
- [nomic-embed-text-v1.5 - Hugging Face](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5)
- [BGE-M3 - Hugging Face](https://huggingface.co/BAAI/bge-m3)
- [Jina Embeddings v3](https://jina.ai/models/jina-embeddings-v3/)
- [OpenAI Matryoshka Embeddings - Pinecone](https://www.pinecone.io/learn/openai-embeddings-v3/)
- [Ollama Embedding Models](https://ollama.com/blog/embedding-models)
- [Best Embedding Models 2025 - Ailog](https://app.ailog.fr/en/blog/guides/choosing-embedding-models)
- [Rerankers for RAG - Analytics Vidhya](https://www.analyticsvidhya.com/blog/2025/06/top-rerankers-for-rag/)
- [Hybrid Search & Reranking - Superlinked](https://superlinked.com/vectorhub/articles/optimizing-rag-with-hybrid-search-reranking)

---

## Appendix: Model Selection Decision Tree

```
Need offline/local capability?
├─ Yes → nomic-embed-text-v1.5 (Ollama)
│   ├─ Need multilingual? → BGE-M3
│   └─ Need max accuracy? → mxbai-embed-large
└─ No (API is fine)
    ├─ Cost-sensitive?
    │   ├─ Yes → text-embedding-3-small @ 512 dims
    │   └─ Same budget, better quality? → voyage-3.5-lite
    └─ Quality-focused?
        ├─ Yes → voyage-3-large or text-embedding-3-large
        └─ Free tier preferred? → gemini-embedding-001
```