mdcontext 0.0.1 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (337)
  1. package/.changeset/README.md +28 -0
  2. package/.changeset/config.json +11 -0
  3. package/.claude/settings.local.json +25 -0
  4. package/.github/workflows/ci.yml +83 -0
  5. package/.github/workflows/claude-code-review.yml +44 -0
  6. package/.github/workflows/claude.yml +85 -0
  7. package/.github/workflows/release.yml +113 -0
  8. package/.tldrignore +112 -0
  9. package/BACKLOG.md +338 -0
  10. package/CONTRIBUTING.md +186 -0
  11. package/NOTES/NOTES +44 -0
  12. package/README.md +434 -11
  13. package/biome.json +36 -0
  14. package/cspell.config.yaml +14 -0
  15. package/dist/chunk-23UPXDNL.js +3044 -0
  16. package/dist/chunk-2W7MO2DL.js +1366 -0
  17. package/dist/chunk-3NUAZGMA.js +1689 -0
  18. package/dist/chunk-7TOWB2XB.js +366 -0
  19. package/dist/chunk-7XOTOADQ.js +3065 -0
  20. package/dist/chunk-AH2PDM2K.js +3042 -0
  21. package/dist/chunk-BNXWSZ63.js +3742 -0
  22. package/dist/chunk-BTL5DJVU.js +3222 -0
  23. package/dist/chunk-HDHYG7E4.js +104 -0
  24. package/dist/chunk-HLR4KZBP.js +3234 -0
  25. package/dist/chunk-IP3FRFEB.js +1045 -0
  26. package/dist/chunk-KHU56VDO.js +3042 -0
  27. package/dist/chunk-KRYIFLQR.js +88 -0
  28. package/dist/chunk-LBSDNLEM.js +287 -0
  29. package/dist/chunk-MNTQ7HCP.js +2643 -0
  30. package/dist/chunk-MUJELQQ6.js +1387 -0
  31. package/dist/chunk-MXJGMSLV.js +2199 -0
  32. package/dist/chunk-N6QJGC3Z.js +2636 -0
  33. package/dist/chunk-OBELGBPM.js +1713 -0
  34. package/dist/chunk-OT7R5XTA.js +3192 -0
  35. package/dist/chunk-P7X4RA2T.js +106 -0
  36. package/dist/chunk-PIDUQNC2.js +3185 -0
  37. package/dist/chunk-POGCDIH4.js +3187 -0
  38. package/dist/chunk-PSIEOQGZ.js +3043 -0
  39. package/dist/chunk-PVRT3IHA.js +3238 -0
  40. package/dist/chunk-QNN4TT23.js +1430 -0
  41. package/dist/chunk-RE3R45RJ.js +3042 -0
  42. package/dist/chunk-S7E6TFX6.js +803 -0
  43. package/dist/chunk-SG6GLU4U.js +1378 -0
  44. package/dist/chunk-SJCDV2ST.js +274 -0
  45. package/dist/chunk-SYE5XLF3.js +104 -0
  46. package/dist/chunk-T5VLYBZD.js +103 -0
  47. package/dist/chunk-TOQB7VWU.js +3238 -0
  48. package/dist/chunk-VFNMZ4ZQ.js +3228 -0
  49. package/dist/chunk-VVTGZNBT.js +1629 -0
  50. package/dist/chunk-W7Q4RFEV.js +104 -0
  51. package/dist/chunk-XTYYVRLO.js +3190 -0
  52. package/dist/chunk-Y6MDYVJD.js +3063 -0
  53. package/dist/cli/main.d.ts +1 -0
  54. package/dist/cli/main.js +5458 -0
  55. package/dist/index.d.ts +653 -0
  56. package/dist/index.js +79 -0
  57. package/dist/mcp/server.d.ts +1 -0
  58. package/dist/mcp/server.js +472 -0
  59. package/dist/schema-BAWSG7KY.js +22 -0
  60. package/dist/schema-E3QUPL26.js +20 -0
  61. package/dist/schema-EHL7WUT6.js +20 -0
  62. package/docs/019-USAGE.md +625 -0
  63. package/docs/020-current-implementation.md +364 -0
  64. package/docs/021-DOGFOODING-FINDINGS.md +175 -0
  65. package/docs/BACKLOG.md +80 -0
  66. package/docs/CONFIG.md +1123 -0
  67. package/docs/DESIGN.md +439 -0
  68. package/docs/ERRORS.md +383 -0
  69. package/docs/PROJECT.md +88 -0
  70. package/docs/ROADMAP.md +407 -0
  71. package/docs/summarization.md +320 -0
  72. package/docs/test-links.md +9 -0
  73. package/justfile +40 -0
  74. package/package.json +74 -9
  75. package/pnpm-workspace.yaml +5 -0
  76. package/research/INDEX.md +315 -0
  77. package/research/code-review/README.md +90 -0
  78. package/research/code-review/cli-error-handling-review.md +979 -0
  79. package/research/code-review/code-review-validation-report.md +464 -0
  80. package/research/code-review/main-ts-review.md +1128 -0
  81. package/research/config-analysis/01-current-implementation.md +470 -0
  82. package/research/config-analysis/02-strategy-recommendation.md +428 -0
  83. package/research/config-analysis/03-task-candidates.md +715 -0
  84. package/research/config-analysis/033-research-configuration-management.md +828 -0
  85. package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
  86. package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
  87. package/research/config-docs/SUMMARY.md +357 -0
  88. package/research/config-docs/TEST-RESULTS.md +776 -0
  89. package/research/config-docs/TODO.md +542 -0
  90. package/research/config-docs/analysis.md +744 -0
  91. package/research/config-docs/fix-validation.md +502 -0
  92. package/research/config-docs/help-audit.md +264 -0
  93. package/research/config-docs/help-system-analysis.md +890 -0
  94. package/research/dogfood/consolidated-tool-evaluation.md +373 -0
  95. package/research/dogfood/strategy-a/a-synthesis.md +184 -0
  96. package/research/dogfood/strategy-a/a1-docs.md +226 -0
  97. package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
  98. package/research/dogfood/strategy-a/a3-llm.md +164 -0
  99. package/research/dogfood/strategy-b/b-synthesis.md +228 -0
  100. package/research/dogfood/strategy-b/b1-architecture.md +207 -0
  101. package/research/dogfood/strategy-b/b2-gaps.md +258 -0
  102. package/research/dogfood/strategy-b/b3-workflows.md +250 -0
  103. package/research/dogfood/strategy-c/c-synthesis.md +451 -0
  104. package/research/dogfood/strategy-c/c1-explorer.md +192 -0
  105. package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
  106. package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
  107. package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
  108. package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
  109. package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
  110. package/research/effect-cli-error-handling.md +845 -0
  111. package/research/effect-errors-as-values.md +943 -0
  112. package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
  113. package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
  114. package/research/errors-task-analysis/embeddings-analysis.md +709 -0
  115. package/research/errors-task-analysis/index-search-analysis.md +812 -0
  116. package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
  117. package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
  118. package/research/issue-review.md +603 -0
  119. package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
  120. package/research/llm-summarization/alternative-providers-2026.md +1428 -0
  121. package/research/llm-summarization/anthropic-2026.md +367 -0
  122. package/research/llm-summarization/claude-cli-integration.md +1706 -0
  123. package/research/llm-summarization/cli-integration-patterns.md +3155 -0
  124. package/research/llm-summarization/openai-2026.md +473 -0
  125. package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
  126. package/research/llm-summarization/opencode-cli-integration.md +1552 -0
  127. package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
  128. package/research/llm-summarization/prototype-results.md +56 -0
  129. package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
  130. package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
  131. package/research/mdcontext-error-analysis.md +521 -0
  132. package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
  133. package/research/mdcontext-pudding/01-index-embed.md +956 -0
  134. package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
  135. package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
  136. package/research/mdcontext-pudding/02-search.md +970 -0
  137. package/research/mdcontext-pudding/03-context.md +779 -0
  138. package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
  139. package/research/mdcontext-pudding/04-tree.md +704 -0
  140. package/research/mdcontext-pudding/05-config.md +1038 -0
  141. package/research/mdcontext-pudding/06-links-summary.txt +87 -0
  142. package/research/mdcontext-pudding/06-links.md +679 -0
  143. package/research/mdcontext-pudding/07-stats.md +693 -0
  144. package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
  145. package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
  146. package/research/mdcontext-pudding/README.md +168 -0
  147. package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
  148. package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
  149. package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
  150. package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
  151. package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
  152. package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
  153. package/research/research-quality-review.md +834 -0
  154. package/research/semantic-search/002-research-embedding-models.md +490 -0
  155. package/research/semantic-search/003-research-rag-alternatives.md +523 -0
  156. package/research/semantic-search/004-research-vector-search.md +841 -0
  157. package/research/semantic-search/032-research-semantic-search.md +427 -0
  158. package/research/semantic-search/embedding-text-analysis.md +156 -0
  159. package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
  160. package/research/semantic-search/query-processing-analysis.md +207 -0
  161. package/research/semantic-search/root-cause-and-solution.md +114 -0
  162. package/research/semantic-search/threshold-validation-report.md +69 -0
  163. package/research/semantic-search/vector-search-analysis.md +63 -0
  164. package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
  165. package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
  166. package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
  167. package/research/task-management-2026/03-lightweight-file-based.md +567 -0
  168. package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
  169. package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
  170. package/research/task-management-2026/linear/02-api-integrations.md +930 -0
  171. package/research/task-management-2026/linear/03-ai-features.md +368 -0
  172. package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
  173. package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
  174. package/research/test-path-issues.md +276 -0
  175. package/review/ALP-76/1-error-type-design.md +962 -0
  176. package/review/ALP-76/2-error-handling-patterns.md +906 -0
  177. package/review/ALP-76/3-error-presentation.md +624 -0
  178. package/review/ALP-76/4-test-coverage.md +625 -0
  179. package/review/ALP-76/5-migration-completeness.md +440 -0
  180. package/review/ALP-76/6-effect-best-practices.md +755 -0
  181. package/scripts/apply-branch-protection.sh +47 -0
  182. package/scripts/branch-protection-templates.json +79 -0
  183. package/scripts/prototype-summarization.ts +346 -0
  184. package/scripts/rebuild-hnswlib.js +58 -0
  185. package/scripts/setup-branch-protection.sh +64 -0
  186. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
  187. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
  188. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
  189. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
  190. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  191. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  192. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
  193. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
  194. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
  195. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
  196. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
  197. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
  198. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
  199. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
  200. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
  201. package/src/cli/argv-preprocessor.test.ts +210 -0
  202. package/src/cli/argv-preprocessor.ts +202 -0
  203. package/src/cli/cli.test.ts +627 -0
  204. package/src/cli/commands/backlinks.ts +54 -0
  205. package/src/cli/commands/config-cmd.ts +642 -0
  206. package/src/cli/commands/context.ts +285 -0
  207. package/src/cli/commands/duplicates.ts +122 -0
  208. package/src/cli/commands/embeddings.ts +529 -0
  209. package/src/cli/commands/index-cmd.ts +480 -0
  210. package/src/cli/commands/index.ts +16 -0
  211. package/src/cli/commands/links.ts +52 -0
  212. package/src/cli/commands/search.ts +1281 -0
  213. package/src/cli/commands/stats.ts +149 -0
  214. package/src/cli/commands/tree.ts +128 -0
  215. package/src/cli/config-layer.ts +176 -0
  216. package/src/cli/error-handler.test.ts +235 -0
  217. package/src/cli/error-handler.ts +655 -0
  218. package/src/cli/flag-schemas.ts +341 -0
  219. package/src/cli/help.ts +588 -0
  220. package/src/cli/index.ts +9 -0
  221. package/src/cli/main.ts +435 -0
  222. package/src/cli/options.ts +41 -0
  223. package/src/cli/shared-error-handling.ts +199 -0
  224. package/src/cli/typo-suggester.test.ts +105 -0
  225. package/src/cli/typo-suggester.ts +130 -0
  226. package/src/cli/utils.ts +259 -0
  227. package/src/config/file-provider.test.ts +320 -0
  228. package/src/config/file-provider.ts +273 -0
  229. package/src/config/index.ts +72 -0
  230. package/src/config/integration.test.ts +667 -0
  231. package/src/config/precedence.test.ts +277 -0
  232. package/src/config/precedence.ts +451 -0
  233. package/src/config/schema.test.ts +414 -0
  234. package/src/config/schema.ts +603 -0
  235. package/src/config/service.test.ts +320 -0
  236. package/src/config/service.ts +243 -0
  237. package/src/config/testing.test.ts +264 -0
  238. package/src/config/testing.ts +110 -0
  239. package/src/core/index.ts +1 -0
  240. package/src/core/types.ts +113 -0
  241. package/src/duplicates/detector.test.ts +183 -0
  242. package/src/duplicates/detector.ts +414 -0
  243. package/src/duplicates/index.ts +18 -0
  244. package/src/embeddings/embedding-namespace.test.ts +300 -0
  245. package/src/embeddings/embedding-namespace.ts +947 -0
  246. package/src/embeddings/heading-boost.test.ts +222 -0
  247. package/src/embeddings/hnsw-build-options.test.ts +198 -0
  248. package/src/embeddings/hyde.test.ts +272 -0
  249. package/src/embeddings/hyde.ts +264 -0
  250. package/src/embeddings/index.ts +10 -0
  251. package/src/embeddings/openai-provider.ts +414 -0
  252. package/src/embeddings/pricing.json +22 -0
  253. package/src/embeddings/provider-constants.ts +204 -0
  254. package/src/embeddings/provider-errors.test.ts +967 -0
  255. package/src/embeddings/provider-errors.ts +565 -0
  256. package/src/embeddings/provider-factory.test.ts +240 -0
  257. package/src/embeddings/provider-factory.ts +225 -0
  258. package/src/embeddings/provider-integration.test.ts +788 -0
  259. package/src/embeddings/query-preprocessing.test.ts +187 -0
  260. package/src/embeddings/semantic-search-threshold.test.ts +508 -0
  261. package/src/embeddings/semantic-search.ts +1270 -0
  262. package/src/embeddings/types.ts +359 -0
  263. package/src/embeddings/vector-store.ts +708 -0
  264. package/src/embeddings/voyage-provider.ts +313 -0
  265. package/src/errors/errors.test.ts +845 -0
  266. package/src/errors/index.ts +533 -0
  267. package/src/index/ignore-patterns.test.ts +354 -0
  268. package/src/index/ignore-patterns.ts +305 -0
  269. package/src/index/index.ts +4 -0
  270. package/src/index/indexer.ts +684 -0
  271. package/src/index/storage.ts +260 -0
  272. package/src/index/types.ts +147 -0
  273. package/src/index/watcher.ts +189 -0
  274. package/src/index.ts +30 -0
  275. package/src/integration/search-keyword.test.ts +678 -0
  276. package/src/mcp/server.ts +612 -0
  277. package/src/parser/index.ts +1 -0
  278. package/src/parser/parser.test.ts +291 -0
  279. package/src/parser/parser.ts +394 -0
  280. package/src/parser/section-filter.test.ts +277 -0
  281. package/src/parser/section-filter.ts +392 -0
  282. package/src/search/__tests__/hybrid-search.test.ts +650 -0
  283. package/src/search/bm25-store.ts +366 -0
  284. package/src/search/cross-encoder.test.ts +253 -0
  285. package/src/search/cross-encoder.ts +406 -0
  286. package/src/search/fuzzy-search.test.ts +419 -0
  287. package/src/search/fuzzy-search.ts +273 -0
  288. package/src/search/hybrid-search.ts +448 -0
  289. package/src/search/path-matcher.test.ts +276 -0
  290. package/src/search/path-matcher.ts +33 -0
  291. package/src/search/query-parser.test.ts +260 -0
  292. package/src/search/query-parser.ts +319 -0
  293. package/src/search/searcher.test.ts +280 -0
  294. package/src/search/searcher.ts +724 -0
  295. package/src/search/wink-bm25.d.ts +30 -0
  296. package/src/summarization/cli-providers/claude.ts +202 -0
  297. package/src/summarization/cli-providers/detection.test.ts +273 -0
  298. package/src/summarization/cli-providers/detection.ts +118 -0
  299. package/src/summarization/cli-providers/index.ts +8 -0
  300. package/src/summarization/cost.test.ts +139 -0
  301. package/src/summarization/cost.ts +102 -0
  302. package/src/summarization/error-handler.test.ts +127 -0
  303. package/src/summarization/error-handler.ts +111 -0
  304. package/src/summarization/index.ts +102 -0
  305. package/src/summarization/pipeline.test.ts +498 -0
  306. package/src/summarization/pipeline.ts +231 -0
  307. package/src/summarization/prompts.test.ts +269 -0
  308. package/src/summarization/prompts.ts +133 -0
  309. package/src/summarization/provider-factory.test.ts +396 -0
  310. package/src/summarization/provider-factory.ts +178 -0
  311. package/src/summarization/types.ts +184 -0
  312. package/src/summarize/budget-bugs.test.ts +620 -0
  313. package/src/summarize/formatters.ts +419 -0
  314. package/src/summarize/index.ts +20 -0
  315. package/src/summarize/summarizer.test.ts +275 -0
  316. package/src/summarize/summarizer.ts +597 -0
  317. package/src/summarize/verify-bugs.test.ts +238 -0
  318. package/src/types/huggingface-transformers.d.ts +66 -0
  319. package/src/utils/index.ts +1 -0
  320. package/src/utils/tokens.test.ts +142 -0
  321. package/src/utils/tokens.ts +186 -0
  322. package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
  323. package/tests/fixtures/cli/.mdcontext/config.json +8 -0
  324. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  325. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  326. package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
  327. package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
  328. package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
  329. package/tests/fixtures/cli/README.md +9 -0
  330. package/tests/fixtures/cli/api-reference.md +11 -0
  331. package/tests/fixtures/cli/getting-started.md +11 -0
  332. package/tests/integration/embed-index.test.ts +712 -0
  333. package/tests/integration/search-context.test.ts +469 -0
  334. package/tests/integration/search-semantic.test.ts +522 -0
  335. package/tsconfig.json +26 -0
  336. package/vitest.config.ts +16 -0
  337. package/vitest.setup.ts +12 -0
@@ -0,0 +1,490 @@
1
+ # Embedding Models Research for mdcontext
2
+
3
+ _Research conducted: January 2026_
4
+
5
+ This document provides comprehensive research on embedding models for improving mdcontext's semantic search capabilities. The current implementation uses OpenAI's `text-embedding-3-small` (1536 dimensions, $0.02/1M tokens).
6
+
7
+ ## Table of Contents
8
+
9
+ 1. [Model Comparison Table](#model-comparison-table)
10
+ 2. [OpenAI Models Analysis](#openai-models-analysis)
11
+ 3. [Local/Offline Models Analysis](#localoffline-models-analysis)
12
+ 4. [Alternative API Providers](#alternative-api-providers)
13
+ 5. [Dimension Reduction Analysis](#dimension-reduction-analysis)
14
+ 6. [Hybrid Search & Reranking](#hybrid-search--reranking)
15
+ 7. [Top 3 Recommendations](#top-3-recommendations)
16
+ 8. [Effort/Impact Analysis](#effortimpact-analysis)
17
+ 9. [Quick Wins](#quick-wins)
18
+
19
+ ---
20
+
21
+ ## Model Comparison Table
22
+
23
+ ### API-Based Models
24
+
25
+ | Provider | Model | Dimensions | Cost/1M tokens | MTEB Score | Context Length | Notes |
26
+ | --------- | ---------------------- | ------------------- | ------------------------ | ---------- | -------------- | -------------------------- |
27
+ | OpenAI | text-embedding-3-small | 1536 (configurable) | $0.02 | 62.3 | 8,192 | Current mdcontext model |
28
+ | OpenAI | text-embedding-3-large | 3072 (configurable) | $0.13 | 64.6 | 8,192 | Best OpenAI option |
29
+ | Voyage AI | voyage-3.5 | 1024 | $0.06 | ~66+ | 32,000 | Excellent retrieval |
30
+ | Voyage AI | voyage-3.5-lite | 512 | $0.02 | ~64+ | 32,000 | Same price as OpenAI small |
31
+ | Voyage AI | voyage-3-large | 2048/1024/512/256 | $0.22 | ~68+ | 32,000 | SOTA general purpose |
32
+ | Cohere | embed-v4 | 1536 | $0.12 | 65.2 | 512 | Multimodal support |
33
+ | Cohere | embed-v3-english | 1024 | ~$0.10 | ~64 | 512 | Text-only |
34
+ | Google | gemini-embedding-001 | 3072/1536/768 | $0.15 (paid) / Free tier | 71.5 | 2,048 | Free tier available |
35
+ | Jina AI | jina-embeddings-v3 | 1024 (configurable) | Usage-based | 65.5 | 8,192 | Task-specific adapters |
36
+
37
+ ### Local/Open-Source Models
38
+
39
+ | Model | Dimensions | Memory | Speed | MTEB Score | Context | License |
40
+ | ---------------------- | ------------------ | ------ | --------- | ---------- | -------- | ---------- |
41
+ | nomic-embed-text-v1.5 | 768 (configurable) | ~0.5GB | Very Fast | 62.4 | 8,192 | Apache 2.0 |
42
+ | mxbai-embed-large | 1024 | ~1.2GB | Fast | 64.7 | Standard | Apache 2.0 |
43
+ | BGE-M3 | 1024 | ~2GB | Medium | 63.0 | 8,192 | MIT |
44
+ | all-MiniLM-L6-v2 | 384 | ~100MB | Very Fast | 56.3 | 256 | Apache 2.0 |
45
+ | all-mpnet-base-v2 | 768 | ~400MB | Fast | 57.8 | 384 | Apache 2.0 |
46
+ | jina-embeddings-v3 | 1024 | ~2GB | Medium | 65.5 | 8,192 | Apache 2.0 |
47
+ | E5-Mistral-7B-Instruct | 4096 | ~14GB | Slow | 61.8 | 4,096 | MIT |
48
+
49
+ ---
50
+
51
+ ## OpenAI Models Analysis
52
+
53
+ ### Current: text-embedding-3-small
54
+
55
+ **Specs:**
56
+
57
+ - Dimensions: 1536 (can be reduced via API)
58
+ - Cost: $0.02 per 1M tokens
59
+ - MTEB Score: 62.3
60
+ - Context: 8,192 tokens
61
+
62
+ **Strengths:**
63
+
64
+ - Cost-effective for API usage
65
+ - Good multilingual support (improved over ada-002)
66
+ - Native dimension reduction support (Matryoshka)
67
+ - Well-documented, stable API
68
+
69
+ **Weaknesses:**
70
+
71
+ - Requires API access (no offline mode)
72
+ - Lower quality than text-embedding-3-large
73
+ - Latency dependent on network
74
+
75
+ ### Upgrade Option: text-embedding-3-large
76
+
77
+ **Specs:**
78
+
79
+ - Dimensions: 3072 (can be reduced to 256-3072)
80
+ - Cost: $0.13 per 1M tokens (6.5x the price of text-embedding-3-small)
81
+ - MTEB Score: 64.6
82
+ - MIRACL Score: 54.9% (vs 44.0% for small)
83
+
84
+ **When to Consider:**
85
+
86
+ - Multilingual documentation
87
+ - Complex technical content
88
+ - When quality matters more than cost
89
+
90
+ **Key Insight:** You can use text-embedding-3-large at 256-512 dimensions and still outperform text-embedding-3-small at full 1536 dimensions. This provides a quality upgrade with storage savings.
91
+
92
+ ### Dimension Reduction (Matryoshka)
93
+
94
+ OpenAI's text-embedding-3 models use Matryoshka Representation Learning, allowing dimension truncation:
95
+
96
+ | Original Model | Reduced Dims | MTEB Impact | Storage Savings |
97
+ | -------------- | ------------ | ----------- | --------------- |
98
+ | 3-large (3072) | 1536 | ~1-2% drop | 50% |
99
+ | 3-large (3072) | 1024 | ~2-3% drop | 67% |
100
+ | 3-large (3072) | 512 | ~4-5% drop | 83% |
101
+ | 3-large (3072) | 256 | ~6-8% drop | 92% |
102
+ | 3-small (1536) | 512 | ~3-4% drop | 67% |
103
+ | 3-small (1536) | 256 | ~5-7% drop | 83% |
104
+
105
+ **Practical finding:** Reducing from 1536 to 512 dimensions typically cuts query latency in half and reduces vector storage by 67% with minimal accuracy impact for most RAG use cases.
106
+
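As a rough illustration of the truncation described above, the sketch below keeps the first N components of an embedding, re-normalizes the result to unit length, and reproduces the storage-savings arithmetic from the table. The function names are hypothetical and not part of mdcontext. It assumes the embedding comes from a Matryoshka-trained model; truncating other embeddings this way generally loses far more quality.

```typescript
/** Fraction of vector storage saved by keeping `kept` of `original` dimensions. */
function storageSavings(original: number, kept: number): number {
  return 1 - kept / original;
}

/**
 * Truncate a Matryoshka-style embedding to its first `dims` components and
 * re-normalize to unit length so cosine / dot-product search still behaves.
 */
function truncateEmbedding(embedding: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims < 1 || dims > embedding.length) {
    throw new RangeError(`dims must be an integer in 1..${embedding.length}`);
  }
  const head = embedding.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // Guard against an all-zero prefix to avoid dividing by zero.
  return norm === 0 ? head : head.map((x) => x / norm);
}

// 1536 -> 512 dimensions saves about two thirds of the storage.
console.log((storageSavings(1536, 512) * 100).toFixed(0) + "%"); // 67%

// Toy 3-d vector truncated to 2-d: [3, 4] has length 5, so it becomes [0.6, 0.8].
console.log(truncateEmbedding([3, 4, 12], 2));
```

Requesting reduced dimensions from the API avoids this client-side step, but a helper like this can shrink vectors that were already stored at full size.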
107
+ ---
108
+
109
+ ## Local/Offline Models Analysis
110
+
111
+ ### Tier 1: High Quality (Recommended for mdcontext)
112
+
113
+ #### nomic-embed-text-v1.5
114
+
115
+ **Why it stands out:**
116
+
117
+ - Outperforms OpenAI text-embedding-3-small on both short and long context benchmarks
118
+ - 8,192 token context (matches OpenAI)
119
+ - Matryoshka support for dimension flexibility
120
+ - Binary quantization support (100x storage reduction possible)
121
+ - Apache 2.0 license with fully open weights, code, and training data
122
+ - ~100 QPS on M2 MacBook (excellent local performance)
123
+ - Most downloaded open-source embedder on Hugging Face (35M+ downloads)
124
+
125
+ **Availability:**
126
+
127
+ - Hugging Face: `nomic-ai/nomic-embed-text-v1.5`
128
+ - Ollama: `nomic-embed-text`
129
+ - sentence-transformers compatible
130
+
131
+ **Best for:** General documentation search, mdcontext's primary use case
132
+
133
+ #### mxbai-embed-large
134
+
135
+ **Why it stands out:**
136
+
137
+ - MTEB retrieval score of 64.68 (matches OpenAI text-embedding-3-large at 64.59)
138
+ - Excellent for context-heavy, complex queries
139
+ - 1024 dimensions (efficient storage)
140
+
141
+ **Availability:**
142
+
143
+ - Ollama: `mxbai-embed-large`
144
+ - Hugging Face: `mixedbread-ai/mxbai-embed-large-v1`
145
+
146
+ **Best for:** When accuracy is paramount, complex technical documentation
147
+
148
+ #### BGE-M3
149
+
150
+ **Why it stands out:**
151
+
152
+ - Supports dense, sparse, AND multi-vector retrieval simultaneously
153
+ - 100+ languages
154
+ - 8,192 token context
155
+ - SOTA on multilingual benchmarks (MIRACL, MKQA)
156
+ - MIT license
157
+
158
+ **Unique capability:** Enables hybrid retrieval without a separate BM25 index, because the model produces both dense embeddings and sparse lexical representations.
159
+
160
+ **Availability:**
161
+
162
+ - Hugging Face: `BAAI/bge-m3`
163
+ - Ollama: `bge-m3`
164
+
165
+ **Best for:** Multilingual documentation, hybrid search without BM25
166
+
167
+ ### Tier 2: Fast & Lightweight
168
+
169
+ #### all-MiniLM-L6-v2
170
+
171
+ **Specs:**
172
+
173
+ - 384 dimensions, ~22M parameters, ~100MB
174
+ - 5x faster than larger models
175
+ - 12,450 tokens/sec on RTX 4090
176
+
177
+ **Trade-off:** Lower accuracy (MTEB 56.3) but extremely fast and lightweight
178
+
179
+ **Best for:** Edge deployment, high-throughput scenarios, prototyping
180
+
181
+ #### all-mpnet-base-v2
182
+
183
+ **Specs:**
184
+
185
+ - 768 dimensions, ~110M parameters, ~400MB
186
+ - STS-B score: 87-88% (vs 84-85% for MiniLM)
187
+
188
+ **Trade-off:** Better accuracy than MiniLM, but 4-5x slower
189
+
190
+ **Best for:** When you need better accuracy than MiniLM but can't run larger models
191
+
192
+ ### Local Model Comparison for mdcontext
193
+
194
+ | Factor | nomic-embed-text-v1.5 | mxbai-embed-large | BGE-M3 |
195
+ | ------------- | --------------------- | ----------------- | --------- |
196
+ | Quality | High | Highest | High |
197
+ | Speed | Very Fast | Fast | Medium |
198
+ | Memory | 0.5GB | 1.2GB | 2GB |
199
+ | Context | 8,192 | Standard | 8,192 |
200
+ | Matryoshka | Yes | No | No |
201
+ | Multilingual | Moderate | Moderate | Excellent |
202
+ | mdcontext fit | Excellent | Good | Good |
203
+
204
+ **Recommendation:** nomic-embed-text-v1.5 is the best fit for mdcontext due to its balance of quality, speed, long context, and Matryoshka support.
205
+
206
+ ---
207
+
208
+ ## Alternative API Providers
209
+
210
+ ### Voyage AI
211
+
212
+ **Standout features:**
213
+
214
+ - voyage-3.5 outperforms OpenAI text-embedding-3-large by 8.26%
215
+ - 32K token context (4x OpenAI)
216
+ - Excellent domain-specific models (code, law, finance)
217
+ - Matryoshka + quantization support
218
+
219
+ **Pricing:**
220
+
221
+ - voyage-3.5-lite: $0.02/1M (same as OpenAI small, but better quality)
222
+ - voyage-3.5: $0.06/1M
223
+ - voyage-3-large: $0.22/1M
224
+
225
+ **Free tier:** 200M tokens free for new models
226
+
227
+ **Best for:** When you need better quality than OpenAI at similar cost
228
+
229
+ ### Cohere
230
+
231
+ **Standout features:**
232
+
233
+ - embed-v4 is multimodal (text + images)
234
+ - 100+ languages
235
+ - Fast inference (50-60% faster than OpenAI)
236
+ - Works well with Cohere's reranker
237
+
238
+ **Pricing:**
239
+
240
+ - embed-v4: $0.12/1M tokens
241
+
242
+ **Best for:** Multimodal needs, when using Cohere's full stack
243
+
244
+ ### Google (Gemini Embedding)
245
+
246
+ **Standout features:**
247
+
248
+ - gemini-embedding-001: MTEB score of 71.5 (per the comparison table above)
249
+ - Free tier available
250
+ - Matryoshka support (3072/1536/768)
251
+
252
+ **Pricing:**
253
+
254
+ - Free tier: Generous limits
255
+ - Paid: $0.15/1M tokens
256
+
257
+ **Consideration:** Higher latency, less established for embeddings
258
+
259
+ ### Jina AI
260
+
261
+ **Standout features:**
262
+
263
+ - jina-embeddings-v3: Task-specific LoRA adapters
264
+ - 89 languages, 8,192 context
265
+ - Matryoshka support (32-1024 dims)
266
+ - Can be self-hosted (Apache 2.0)
267
+
268
+ **Best for:** Multilingual, task-specific optimization, hybrid API/local deployment
269
+
270
+ ---
271
+
272
+ ## Hybrid Search & Reranking
273
+
274
+ ### Why Hybrid Search Matters
275
+
276
+ Current mdcontext limitation: semantic and keyword search are mutually exclusive.
277
+
278
+ **Hybrid approach benefits:**
279
+
280
+ - 48% improvement in retrieval quality (Pinecone benchmarks)
281
+ - Captures both exact keyword matches AND semantic similarity
282
+ - Reduces LLM hallucinations by 35% when used with reranking
283
+
284
+ ### Recommended Architecture
285
+
286
+ ```
287
+ Query → BM25 (lexical) ──┐
288
+ ├─→ Merge & Dedupe → Reranker → Top K results
289
+ Query → Dense Embed ─────┘
290
+ ```
291
+
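The diagram above does not say how the lexical and dense result lists are merged. One common choice is Reciprocal Rank Fusion (RRF), sketched here as an assumption rather than as mdcontext's actual method. The `RankedHit` type, the example ids, and `k = 60` (a commonly used default) are illustrative only.

```typescript
interface RankedHit {
  id: string; // hypothetical section/document identifier
}

/**
 * Merge several ranked result lists with Reciprocal Rank Fusion.
 * Each list contributes 1 / (k + rank) per hit (rank starts at 1).
 * Hits that appear in several lists are deduplicated by summing their scores.
 */
function reciprocalRankFusion(
  lists: RankedHit[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((hit, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(hit.id, (scores.get(hit.id) ?? 0) + contribution);
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Toy inputs: "errors.md#retry" is found by both retrievers, so it ranks first.
const bm25Hits: RankedHit[] = [{ id: "config.md#env" }, { id: "errors.md#retry" }];
const denseHits: RankedHit[] = [{ id: "errors.md#retry" }, { id: "jobs.md#context" }];
const fused = reciprocalRankFusion([bm25Hits, denseHits]);
console.log(fused.map((hit) => hit.id));
```

RRF only needs ranks, not comparable scores, which is convenient because BM25 scores and cosine similarities live on different scales. The fused list would then be passed to the reranker stage.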
292
+ ### Reranking Impact
293
+
294
+ Cross-encoder rerankers examine query-document pairs together, achieving +28% NDCG@10 improvements over raw embedding retrieval.
295
+
296
+ **Top reranker options:**
297
+
298
+ 1. **Cohere Rerank 3**: 100+ languages, production-ready
299
+ 2. **BGE Reranker v2-m3**: Open source, ~600M params, Apache 2.0
300
+ 3. **Voyage rerank-2.5**: Instruction-following, high quality
301
+
302
+ **Optimal configuration:**
303
+
304
+ - Rerank top 50-75 documents for best quality/speed balance
305
+ - Latency: ~1.5 seconds for 50 documents
306
+
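The "rerank only the top candidates" configuration can be sketched as follows. This is a minimal outline under stated assumptions: `Candidate`, `PairScorer`, and `rerankTopCandidates` are hypothetical names, and the term-overlap scorer is a toy stand-in for a real cross-encoder call (Cohere, BGE, or Voyage), which is not shown.

```typescript
interface Candidate {
  id: string;
  text: string;
}

// Hypothetical scorer signature: a higher score means more relevant.
type PairScorer = (query: string, documentText: string) => number;

/**
 * Score only the first `rerankDepth` candidates from first-stage retrieval,
 * then return the `topK` highest-scoring ones.
 */
function rerankTopCandidates(
  query: string,
  candidates: Candidate[],
  scorePair: PairScorer,
  rerankDepth = 50,
  topK = 10,
): Candidate[] {
  return candidates
    .slice(0, rerankDepth)
    .map((candidate) => ({ candidate, score: scorePair(query, candidate.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.candidate);
}

// Toy stand-in for a cross-encoder: counts query terms present in the text.
const termOverlap: PairScorer = (query, text) => {
  const haystack = text.toLowerCase();
  return query
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => haystack.includes(term)).length;
};

const firstStage: Candidate[] = [
  { id: "a", text: "vector index storage" },
  { id: "b", text: "retry failed jobs with backoff" },
  { id: "c", text: "job retry policy" },
];
const reranked = rerankTopCandidates("retry jobs", firstStage, termOverlap, 50, 2);
console.log(reranked.map((candidate) => candidate.id));
```

Capping the depth bounds the number of scorer calls per query, which is where the latency figure above comes from.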
307
+ ### BGE-M3 Special Capability
308
+
309
+ BGE-M3 uniquely supports all three retrieval methods in one model:
310
+
311
+ - Dense retrieval (semantic)
312
+ - Sparse retrieval (lexical, like BM25)
313
+ - Multi-vector retrieval (ColBERT-style)
314
+
315
+ This could eliminate the need for a separate BM25 index in mdcontext.
316
+
317
+ ---
318
+
319
+ ## Top 3 Recommendations
320
+
321
+ ### Recommendation 1: Add Local Embedding Support with nomic-embed-text-v1.5
322
+
323
+ **Rationale:**
324
+
325
+ - Enables offline semantic search (major feature gap)
326
+ - Quality matches or exceeds current OpenAI text-embedding-3-small
327
+ - Zero ongoing API costs
328
+ - 8,192 token context matches current implementation
329
+ - Matryoshka support enables storage optimization
330
+ - Excellent performance on Apple Silicon (mdcontext's likely dev environment)
331
+
332
+ **Implementation approach:**
333
+
334
+ 1. Add `nomic-embed-text` as an Ollama provider option
335
+ 2. Create `OllamaEmbeddingProvider` implementing existing interface
336
+ 3. Allow provider selection via config or CLI flag
337
+ 4. Keep OpenAI as default for backward compatibility
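The steps above might look roughly like the sketch below. The `EmbeddingProvider` interface is hypothetical, since mdcontext's actual provider interface is not shown here. The sketch assumes a local Ollama server on its default port and Ollama's `/api/embed` endpoint, which accepts `{ model, input }` and returns `{ embeddings }`; verify both against the installed Ollama version.

```typescript
// Hypothetical interface; mdcontext's real provider interface may differ.
interface EmbeddingProvider {
  embed(texts: string[]): Promise<number[][]>;
}

// Build the HTTP request for Ollama's batch embedding endpoint.
function buildOllamaEmbedRequest(baseUrl: string, model: string, texts: string[]) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/embed`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, input: texts }),
    },
  };
}

// Validate the response shape: one embedding per input text.
function parseOllamaEmbedResponse(json: unknown, expected: number): number[][] {
  const embeddings = (json as { embeddings?: unknown } | null)?.embeddings;
  if (!Array.isArray(embeddings) || embeddings.length !== expected) {
    throw new Error("Unexpected Ollama embed response shape");
  }
  return embeddings as number[][];
}

class OllamaEmbeddingProvider implements EmbeddingProvider {
  private readonly model: string;
  private readonly baseUrl: string;

  constructor(model = "nomic-embed-text", baseUrl = "http://localhost:11434") {
    this.model = model;
    this.baseUrl = baseUrl;
  }

  async embed(texts: string[]): Promise<number[][]> {
    const { url, init } = buildOllamaEmbedRequest(this.baseUrl, this.model, texts);
    const res = await fetch(url, init); // global fetch, available in Node 18+
    if (!res.ok) throw new Error(`Ollama request failed: ${res.status}`);
    return parseOllamaEmbedResponse(await res.json(), texts.length);
  }
}
```

Keeping request building and response parsing as pure functions lets them be unit tested without a running Ollama instance.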
338
+
339
+ **Impact:** High (offline capability, cost elimination)
340
+ **Effort:** Medium (new provider implementation, testing)
341
+
342
+ ### Recommendation 2: Implement Dimension Reduction for OpenAI
343
+
344
+ **Rationale:**
345
+
346
+ - Minimal-code quick win using the existing API (one added parameter)
347
+ - Reduce storage by 67% (1536 → 512) with minimal quality loss
348
+ - Improve query latency by ~50%
349
+ - text-embedding-3-large at 512 dims outperforms 3-small at 1536
350
+
351
+ **Implementation approach:**
352
+
353
+ 1. Add `dimensions` parameter to OpenAI API calls
354
+ 2. Update vector store to handle variable dimensions
355
+ 3. Default to 512 dimensions for new indexes
356
+ 4. Add migration path for existing indexes (or require rebuild)
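Steps 2 and 4 depend on the index knowing which model and dimension count produced its vectors. A minimal sketch, assuming the vector store keeps a small metadata record next to the index (the `IndexMetadata` shape and both function names are hypothetical):

```typescript
// Hypothetical metadata stored alongside an index; field names are assumptions.
interface IndexMetadata {
  model: string;
  dimensions: number;
}

// A stored index is only compatible with vectors from the same model and size.
function needsRebuild(stored: IndexMetadata, configured: IndexMetadata): boolean {
  return stored.model !== configured.model || stored.dimensions !== configured.dimensions;
}

// Guard against mixing vector sizes inside one index.
function assertDimensions(vector: number[], meta: IndexMetadata): void {
  if (vector.length !== meta.dimensions) {
    throw new Error(`Expected ${meta.dimensions} dimensions, got ${vector.length}`);
  }
}

const existingIndex: IndexMetadata = { model: "text-embedding-3-small", dimensions: 1536 };
const newConfig: IndexMetadata = { model: "text-embedding-3-small", dimensions: 512 };
const rebuildRequired = needsRebuild(existingIndex, newConfig); // true: sizes differ
```

Checking on load makes the "require rebuild" path explicit instead of failing later with mismatched vector lengths at query time.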
357
+
358
+ **Impact:** Medium-High (storage/performance improvement)
359
+ **Effort:** Low (API parameter change, minor refactoring)
360
+
361
+ ### Recommendation 3: Add Hybrid Search with BGE-M3 (Future)
362
+
363
+ **Rationale:**
364
+
365
+ - Addresses limitation #4 (no hybrid search) from current implementation
366
+ - Single model provides dense + sparse retrieval
367
+ - No separate BM25 index needed
368
+ - 48% retrieval quality improvement potential
369
+
370
+ **Implementation approach:**
371
+
372
+ 1. Add BGE-M3 as a local provider option
373
+ 2. Store both dense and sparse vectors
374
+ 3. Implement hybrid retrieval merging
375
+ 4. Optional: Add cross-encoder reranking
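Steps 2 and 3 could be sketched as below. The `HybridVector` shape, the weighted-sum merge, and the default `denseWeight` of 0.5 are all assumptions for illustration; the source does not specify how dense and sparse scores should be combined. Dense vectors are assumed to be L2-normalized so that a dot product equals cosine similarity.

```typescript
// Hypothetical storage shape: one dense vector plus sparse token weights.
interface HybridVector {
  dense: number[];
  sparse: Map<number, number>; // token id -> lexical weight
}

function denseDot(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Dense vectors differ in length");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Only tokens present in both vectors contribute to the sparse score.
function sparseDot(a: Map<number, number>, b: Map<number, number>): number {
  const [small, large] = a.size <= b.size ? [a, b] : [b, a];
  let sum = 0;
  for (const [token, weight] of small) {
    sum += weight * (large.get(token) ?? 0);
  }
  return sum;
}

// Weighted sum of both signals; the weighting scheme is an assumption.
function hybridScore(query: HybridVector, doc: HybridVector, denseWeight = 0.5): number {
  return (
    denseWeight * denseDot(query.dense, doc.dense) +
    (1 - denseWeight) * sparseDot(query.sparse, doc.sparse)
  );
}
```

In practice the two score ranges differ, so the weight would need tuning against real queries, or a rank-based merge could be used instead.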
376
+
377
+ **Impact:** High (major quality improvement)
378
+ **Effort:** High (significant architecture changes)
379
+
380
+ ---
381
+
382
+ ## Effort/Impact Analysis
383
+
384
+ | Improvement | Impact | Effort | Priority |
385
+ | ------------------------------ | ----------- | ------ | ---------------- |
386
+ | Dimension reduction (512) | Medium-High | Low | 1 - Quick Win |
387
+ | nomic-embed-text local | High | Medium | 2 - High Value |
388
+ | Voyage AI as alternative | Medium | Low | 3 - Easy Upgrade |
389
+ | BGE-M3 hybrid search | High | High | 4 - Future |
390
+ | Cross-encoder reranking | Medium-High | Medium | 5 - Future |
391
+ | text-embedding-3-large upgrade | Medium | Low | 6 - Optional |
392
+
393
+ ### Implementation Priority Order
394
+
395
+ 1. **Week 1:** Dimension reduction (1536 → 512)
396
+    - Modify OpenAI provider to pass `dimensions: 512`
397
+    - Update vector store metadata
398
+    - Test retrieval quality
399
+
400
+ 2. **Week 2-3:** Local embedding support
401
+    - Implement Ollama provider
402
+    - Add nomic-embed-text integration
403
+    - Create provider selection mechanism
404
+
405
+ 3. **Week 4+:** Provider ecosystem
406
+    - Add Voyage AI option
407
+    - Consider BGE-M3 for hybrid search
408
+    - Evaluate reranking integration
409
+
410
+ ---
411
+
412
+ ## Quick Wins
413
+
414
+ ### 1. Dimension Reduction (Immediate)
415
+
416
+ **Change required:**
417
+
418
+ ```typescript
419
+ // In openai-provider.ts
420
+ const response = await openai.embeddings.create({
421
+   model: "text-embedding-3-small",
422
+   input: texts,
423
+   dimensions: 512, // Add this parameter
424
+ });
425
+ ```
426
+
427
+ **Benefits:**
428
+
429
+ - 67% storage reduction
430
+ - ~50% faster queries
431
+ - Minimal quality impact (~3-4%)
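The 67% figure follows directly from the dimension counts. The arithmetic below assumes vectors are stored as 32-bit floats (4 bytes per value) and uses an illustrative corpus of 10,000 chunks; both are assumptions, not measurements from mdcontext.

```typescript
// Raw vector storage: count × dimensions × bytes per value.
// bytesPerValue = 4 assumes float32 storage (an assumption).
function indexSizeBytes(vectorCount: number, dimensions: number, bytesPerValue = 4): number {
  return vectorCount * dimensions * bytesPerValue;
}

const before = indexSizeBytes(10_000, 1536); // 61,440,000 bytes
const after = indexSizeBytes(10_000, 512); // 20,480,000 bytes
const reduction = 1 - after / before; // 2/3, i.e. about 67%
```

The ratio depends only on the dimension counts (512 / 1536 = 1/3), so the same reduction holds for any corpus size.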
432
+
433
+ ### 2. Switch to voyage-3.5-lite (Same Cost, Better Quality)
434
+
435
+ **If considering API alternatives:**
436
+
437
+ - Same price as OpenAI small ($0.02/1M)
438
+ - 6-8% better retrieval quality
439
+ - 32K context (4x more)
440
+ - Free 200M tokens to test
441
+
442
+ ### 3. Use text-embedding-3-large at Reduced Dimensions
443
+
444
+ **For quality boost:**
445
+
446
+ ```typescript
447
+ // Better quality at same storage cost
448
+ const response = await openai.embeddings.create({
449
+   model: "text-embedding-3-large",
450
+   input: texts,
451
+   dimensions: 512, // Truncate large model
452
+ });
453
+ ```
454
+
455
+ **Trade-off:** 6.5x cost increase, but significantly better retrieval
456
+
457
+ ---
458
+
459
+ ## Sources
460
+
461
+ - [MTEB Leaderboard - Hugging Face](https://huggingface.co/spaces/mteb/leaderboard)
462
+ - [OpenAI Embeddings Documentation](https://platform.openai.com/docs/guides/embeddings)
463
+ - [Voyage AI Documentation](https://docs.voyageai.com/docs/embeddings)
464
+ - [Cohere Pricing](https://cohere.com/pricing)
465
+ - [nomic-embed-text-v1.5 - Hugging Face](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5)
466
+ - [BGE-M3 - Hugging Face](https://huggingface.co/BAAI/bge-m3)
467
+ - [Jina Embeddings v3](https://jina.ai/models/jina-embeddings-v3/)
468
+ - [OpenAI Matryoshka Embeddings - Pinecone](https://www.pinecone.io/learn/openai-embeddings-v3/)
469
+ - [Ollama Embedding Models](https://ollama.com/blog/embedding-models)
470
+ - [Best Embedding Models 2025 - Ailog](https://app.ailog.fr/en/blog/guides/choosing-embedding-models)
471
+ - [Rerankers for RAG - Analytics Vidhya](https://www.analyticsvidhya.com/blog/2025/06/top-rerankers-for-rag/)
472
+ - [Hybrid Search & Reranking - Superlinked](https://superlinked.com/vectorhub/articles/optimizing-rag-with-hybrid-search-reranking)
473
+
474
+ ---
475
+
476
+ ## Appendix: Model Selection Decision Tree
477
+
478
+ ```
479
+ Need offline/local capability?
480
+ ├─ Yes → nomic-embed-text-v1.5 (Ollama)
481
+ │   ├─ Need multilingual? → BGE-M3
482
+ │   └─ Need max accuracy? → mxbai-embed-large
483
+ └─ No (API is fine)
484
+     ├─ Cost-sensitive?
485
+     │   ├─ Yes → text-embedding-3-small @ 512 dims
486
+     │   └─ Same budget, better quality? → voyage-3.5-lite
487
+     └─ Quality-focused?
488
+         ├─ Yes → voyage-3-large or text-embedding-3-large
489
+         └─ Free tier preferred? → gemini-embedding-001
490
+ ```
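For use in a config helper, the tree can be encoded as a small function. This is one possible reading of the tree: the `Requirements` field names are hypothetical, and the order in which the API-side branches are checked is an interpretation, since the tree does not rank them.

```typescript
// Hypothetical requirement flags; names are illustrative only.
interface Requirements {
  offline: boolean;
  multilingual?: boolean;
  maxAccuracy?: boolean;
  preferFreeTier?: boolean;
  qualityFocused?: boolean;
  betterQualitySameBudget?: boolean;
}

function recommendModel(r: Requirements): string {
  if (r.offline) {
    // Local branch: nomic is the default unless a special need applies.
    if (r.multilingual) return "BGE-M3";
    if (r.maxAccuracy) return "mxbai-embed-large";
    return "nomic-embed-text-v1.5";
  }
  // API branch: check order is an interpretation of the tree.
  if (r.preferFreeTier) return "gemini-embedding-001";
  if (r.qualityFocused) return "voyage-3-large or text-embedding-3-large";
  if (r.betterQualitySameBudget) return "voyage-3.5-lite";
  return "text-embedding-3-small @ 512 dims";
}
```

For example, `recommendModel({ offline: true, multilingual: true })` follows the local branch and selects BGE-M3.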