mdcontext 0.0.1 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (337)
  1. package/.changeset/README.md +28 -0
  2. package/.changeset/config.json +11 -0
  3. package/.claude/settings.local.json +25 -0
  4. package/.github/workflows/ci.yml +83 -0
  5. package/.github/workflows/claude-code-review.yml +44 -0
  6. package/.github/workflows/claude.yml +85 -0
  7. package/.github/workflows/release.yml +113 -0
  8. package/.tldrignore +112 -0
  9. package/BACKLOG.md +338 -0
  10. package/CONTRIBUTING.md +186 -0
  11. package/NOTES/NOTES +44 -0
  12. package/README.md +434 -11
  13. package/biome.json +36 -0
  14. package/cspell.config.yaml +14 -0
  15. package/dist/chunk-23UPXDNL.js +3044 -0
  16. package/dist/chunk-2W7MO2DL.js +1366 -0
  17. package/dist/chunk-3NUAZGMA.js +1689 -0
  18. package/dist/chunk-7TOWB2XB.js +366 -0
  19. package/dist/chunk-7XOTOADQ.js +3065 -0
  20. package/dist/chunk-AH2PDM2K.js +3042 -0
  21. package/dist/chunk-BNXWSZ63.js +3742 -0
  22. package/dist/chunk-BTL5DJVU.js +3222 -0
  23. package/dist/chunk-HDHYG7E4.js +104 -0
  24. package/dist/chunk-HLR4KZBP.js +3234 -0
  25. package/dist/chunk-IP3FRFEB.js +1045 -0
  26. package/dist/chunk-KHU56VDO.js +3042 -0
  27. package/dist/chunk-KRYIFLQR.js +88 -0
  28. package/dist/chunk-LBSDNLEM.js +287 -0
  29. package/dist/chunk-MNTQ7HCP.js +2643 -0
  30. package/dist/chunk-MUJELQQ6.js +1387 -0
  31. package/dist/chunk-MXJGMSLV.js +2199 -0
  32. package/dist/chunk-N6QJGC3Z.js +2636 -0
  33. package/dist/chunk-OBELGBPM.js +1713 -0
  34. package/dist/chunk-OT7R5XTA.js +3192 -0
  35. package/dist/chunk-P7X4RA2T.js +106 -0
  36. package/dist/chunk-PIDUQNC2.js +3185 -0
  37. package/dist/chunk-POGCDIH4.js +3187 -0
  38. package/dist/chunk-PSIEOQGZ.js +3043 -0
  39. package/dist/chunk-PVRT3IHA.js +3238 -0
  40. package/dist/chunk-QNN4TT23.js +1430 -0
  41. package/dist/chunk-RE3R45RJ.js +3042 -0
  42. package/dist/chunk-S7E6TFX6.js +803 -0
  43. package/dist/chunk-SG6GLU4U.js +1378 -0
  44. package/dist/chunk-SJCDV2ST.js +274 -0
  45. package/dist/chunk-SYE5XLF3.js +104 -0
  46. package/dist/chunk-T5VLYBZD.js +103 -0
  47. package/dist/chunk-TOQB7VWU.js +3238 -0
  48. package/dist/chunk-VFNMZ4ZQ.js +3228 -0
  49. package/dist/chunk-VVTGZNBT.js +1629 -0
  50. package/dist/chunk-W7Q4RFEV.js +104 -0
  51. package/dist/chunk-XTYYVRLO.js +3190 -0
  52. package/dist/chunk-Y6MDYVJD.js +3063 -0
  53. package/dist/cli/main.d.ts +1 -0
  54. package/dist/cli/main.js +5458 -0
  55. package/dist/index.d.ts +653 -0
  56. package/dist/index.js +79 -0
  57. package/dist/mcp/server.d.ts +1 -0
  58. package/dist/mcp/server.js +472 -0
  59. package/dist/schema-BAWSG7KY.js +22 -0
  60. package/dist/schema-E3QUPL26.js +20 -0
  61. package/dist/schema-EHL7WUT6.js +20 -0
  62. package/docs/019-USAGE.md +625 -0
  63. package/docs/020-current-implementation.md +364 -0
  64. package/docs/021-DOGFOODING-FINDINGS.md +175 -0
  65. package/docs/BACKLOG.md +80 -0
  66. package/docs/CONFIG.md +1123 -0
  67. package/docs/DESIGN.md +439 -0
  68. package/docs/ERRORS.md +383 -0
  69. package/docs/PROJECT.md +88 -0
  70. package/docs/ROADMAP.md +407 -0
  71. package/docs/summarization.md +320 -0
  72. package/docs/test-links.md +9 -0
  73. package/justfile +40 -0
  74. package/package.json +74 -9
  75. package/pnpm-workspace.yaml +5 -0
  76. package/research/INDEX.md +315 -0
  77. package/research/code-review/README.md +90 -0
  78. package/research/code-review/cli-error-handling-review.md +979 -0
  79. package/research/code-review/code-review-validation-report.md +464 -0
  80. package/research/code-review/main-ts-review.md +1128 -0
  81. package/research/config-analysis/01-current-implementation.md +470 -0
  82. package/research/config-analysis/02-strategy-recommendation.md +428 -0
  83. package/research/config-analysis/03-task-candidates.md +715 -0
  84. package/research/config-analysis/033-research-configuration-management.md +828 -0
  85. package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
  86. package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
  87. package/research/config-docs/SUMMARY.md +357 -0
  88. package/research/config-docs/TEST-RESULTS.md +776 -0
  89. package/research/config-docs/TODO.md +542 -0
  90. package/research/config-docs/analysis.md +744 -0
  91. package/research/config-docs/fix-validation.md +502 -0
  92. package/research/config-docs/help-audit.md +264 -0
  93. package/research/config-docs/help-system-analysis.md +890 -0
  94. package/research/dogfood/consolidated-tool-evaluation.md +373 -0
  95. package/research/dogfood/strategy-a/a-synthesis.md +184 -0
  96. package/research/dogfood/strategy-a/a1-docs.md +226 -0
  97. package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
  98. package/research/dogfood/strategy-a/a3-llm.md +164 -0
  99. package/research/dogfood/strategy-b/b-synthesis.md +228 -0
  100. package/research/dogfood/strategy-b/b1-architecture.md +207 -0
  101. package/research/dogfood/strategy-b/b2-gaps.md +258 -0
  102. package/research/dogfood/strategy-b/b3-workflows.md +250 -0
  103. package/research/dogfood/strategy-c/c-synthesis.md +451 -0
  104. package/research/dogfood/strategy-c/c1-explorer.md +192 -0
  105. package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
  106. package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
  107. package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
  108. package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
  109. package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
  110. package/research/effect-cli-error-handling.md +845 -0
  111. package/research/effect-errors-as-values.md +943 -0
  112. package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
  113. package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
  114. package/research/errors-task-analysis/embeddings-analysis.md +709 -0
  115. package/research/errors-task-analysis/index-search-analysis.md +812 -0
  116. package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
  117. package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
  118. package/research/issue-review.md +603 -0
  119. package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
  120. package/research/llm-summarization/alternative-providers-2026.md +1428 -0
  121. package/research/llm-summarization/anthropic-2026.md +367 -0
  122. package/research/llm-summarization/claude-cli-integration.md +1706 -0
  123. package/research/llm-summarization/cli-integration-patterns.md +3155 -0
  124. package/research/llm-summarization/openai-2026.md +473 -0
  125. package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
  126. package/research/llm-summarization/opencode-cli-integration.md +1552 -0
  127. package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
  128. package/research/llm-summarization/prototype-results.md +56 -0
  129. package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
  130. package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
  131. package/research/mdcontext-error-analysis.md +521 -0
  132. package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
  133. package/research/mdcontext-pudding/01-index-embed.md +956 -0
  134. package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
  135. package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
  136. package/research/mdcontext-pudding/02-search.md +970 -0
  137. package/research/mdcontext-pudding/03-context.md +779 -0
  138. package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
  139. package/research/mdcontext-pudding/04-tree.md +704 -0
  140. package/research/mdcontext-pudding/05-config.md +1038 -0
  141. package/research/mdcontext-pudding/06-links-summary.txt +87 -0
  142. package/research/mdcontext-pudding/06-links.md +679 -0
  143. package/research/mdcontext-pudding/07-stats.md +693 -0
  144. package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
  145. package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
  146. package/research/mdcontext-pudding/README.md +168 -0
  147. package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
  148. package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
  149. package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
  150. package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
  151. package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
  152. package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
  153. package/research/research-quality-review.md +834 -0
  154. package/research/semantic-search/002-research-embedding-models.md +490 -0
  155. package/research/semantic-search/003-research-rag-alternatives.md +523 -0
  156. package/research/semantic-search/004-research-vector-search.md +841 -0
  157. package/research/semantic-search/032-research-semantic-search.md +427 -0
  158. package/research/semantic-search/embedding-text-analysis.md +156 -0
  159. package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
  160. package/research/semantic-search/query-processing-analysis.md +207 -0
  161. package/research/semantic-search/root-cause-and-solution.md +114 -0
  162. package/research/semantic-search/threshold-validation-report.md +69 -0
  163. package/research/semantic-search/vector-search-analysis.md +63 -0
  164. package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
  165. package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
  166. package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
  167. package/research/task-management-2026/03-lightweight-file-based.md +567 -0
  168. package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
  169. package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
  170. package/research/task-management-2026/linear/02-api-integrations.md +930 -0
  171. package/research/task-management-2026/linear/03-ai-features.md +368 -0
  172. package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
  173. package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
  174. package/research/test-path-issues.md +276 -0
  175. package/review/ALP-76/1-error-type-design.md +962 -0
  176. package/review/ALP-76/2-error-handling-patterns.md +906 -0
  177. package/review/ALP-76/3-error-presentation.md +624 -0
  178. package/review/ALP-76/4-test-coverage.md +625 -0
  179. package/review/ALP-76/5-migration-completeness.md +440 -0
  180. package/review/ALP-76/6-effect-best-practices.md +755 -0
  181. package/scripts/apply-branch-protection.sh +47 -0
  182. package/scripts/branch-protection-templates.json +79 -0
  183. package/scripts/prototype-summarization.ts +346 -0
  184. package/scripts/rebuild-hnswlib.js +58 -0
  185. package/scripts/setup-branch-protection.sh +64 -0
  186. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
  187. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
  188. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
  189. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
  190. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  191. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  192. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
  193. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
  194. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
  195. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
  196. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
  197. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
  198. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
  199. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
  200. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
  201. package/src/cli/argv-preprocessor.test.ts +210 -0
  202. package/src/cli/argv-preprocessor.ts +202 -0
  203. package/src/cli/cli.test.ts +627 -0
  204. package/src/cli/commands/backlinks.ts +54 -0
  205. package/src/cli/commands/config-cmd.ts +642 -0
  206. package/src/cli/commands/context.ts +285 -0
  207. package/src/cli/commands/duplicates.ts +122 -0
  208. package/src/cli/commands/embeddings.ts +529 -0
  209. package/src/cli/commands/index-cmd.ts +480 -0
  210. package/src/cli/commands/index.ts +16 -0
  211. package/src/cli/commands/links.ts +52 -0
  212. package/src/cli/commands/search.ts +1281 -0
  213. package/src/cli/commands/stats.ts +149 -0
  214. package/src/cli/commands/tree.ts +128 -0
  215. package/src/cli/config-layer.ts +176 -0
  216. package/src/cli/error-handler.test.ts +235 -0
  217. package/src/cli/error-handler.ts +655 -0
  218. package/src/cli/flag-schemas.ts +341 -0
  219. package/src/cli/help.ts +588 -0
  220. package/src/cli/index.ts +9 -0
  221. package/src/cli/main.ts +435 -0
  222. package/src/cli/options.ts +41 -0
  223. package/src/cli/shared-error-handling.ts +199 -0
  224. package/src/cli/typo-suggester.test.ts +105 -0
  225. package/src/cli/typo-suggester.ts +130 -0
  226. package/src/cli/utils.ts +259 -0
  227. package/src/config/file-provider.test.ts +320 -0
  228. package/src/config/file-provider.ts +273 -0
  229. package/src/config/index.ts +72 -0
  230. package/src/config/integration.test.ts +667 -0
  231. package/src/config/precedence.test.ts +277 -0
  232. package/src/config/precedence.ts +451 -0
  233. package/src/config/schema.test.ts +414 -0
  234. package/src/config/schema.ts +603 -0
  235. package/src/config/service.test.ts +320 -0
  236. package/src/config/service.ts +243 -0
  237. package/src/config/testing.test.ts +264 -0
  238. package/src/config/testing.ts +110 -0
  239. package/src/core/index.ts +1 -0
  240. package/src/core/types.ts +113 -0
  241. package/src/duplicates/detector.test.ts +183 -0
  242. package/src/duplicates/detector.ts +414 -0
  243. package/src/duplicates/index.ts +18 -0
  244. package/src/embeddings/embedding-namespace.test.ts +300 -0
  245. package/src/embeddings/embedding-namespace.ts +947 -0
  246. package/src/embeddings/heading-boost.test.ts +222 -0
  247. package/src/embeddings/hnsw-build-options.test.ts +198 -0
  248. package/src/embeddings/hyde.test.ts +272 -0
  249. package/src/embeddings/hyde.ts +264 -0
  250. package/src/embeddings/index.ts +10 -0
  251. package/src/embeddings/openai-provider.ts +414 -0
  252. package/src/embeddings/pricing.json +22 -0
  253. package/src/embeddings/provider-constants.ts +204 -0
  254. package/src/embeddings/provider-errors.test.ts +967 -0
  255. package/src/embeddings/provider-errors.ts +565 -0
  256. package/src/embeddings/provider-factory.test.ts +240 -0
  257. package/src/embeddings/provider-factory.ts +225 -0
  258. package/src/embeddings/provider-integration.test.ts +788 -0
  259. package/src/embeddings/query-preprocessing.test.ts +187 -0
  260. package/src/embeddings/semantic-search-threshold.test.ts +508 -0
  261. package/src/embeddings/semantic-search.ts +1270 -0
  262. package/src/embeddings/types.ts +359 -0
  263. package/src/embeddings/vector-store.ts +708 -0
  264. package/src/embeddings/voyage-provider.ts +313 -0
  265. package/src/errors/errors.test.ts +845 -0
  266. package/src/errors/index.ts +533 -0
  267. package/src/index/ignore-patterns.test.ts +354 -0
  268. package/src/index/ignore-patterns.ts +305 -0
  269. package/src/index/index.ts +4 -0
  270. package/src/index/indexer.ts +684 -0
  271. package/src/index/storage.ts +260 -0
  272. package/src/index/types.ts +147 -0
  273. package/src/index/watcher.ts +189 -0
  274. package/src/index.ts +30 -0
  275. package/src/integration/search-keyword.test.ts +678 -0
  276. package/src/mcp/server.ts +612 -0
  277. package/src/parser/index.ts +1 -0
  278. package/src/parser/parser.test.ts +291 -0
  279. package/src/parser/parser.ts +394 -0
  280. package/src/parser/section-filter.test.ts +277 -0
  281. package/src/parser/section-filter.ts +392 -0
  282. package/src/search/__tests__/hybrid-search.test.ts +650 -0
  283. package/src/search/bm25-store.ts +366 -0
  284. package/src/search/cross-encoder.test.ts +253 -0
  285. package/src/search/cross-encoder.ts +406 -0
  286. package/src/search/fuzzy-search.test.ts +419 -0
  287. package/src/search/fuzzy-search.ts +273 -0
  288. package/src/search/hybrid-search.ts +448 -0
  289. package/src/search/path-matcher.test.ts +276 -0
  290. package/src/search/path-matcher.ts +33 -0
  291. package/src/search/query-parser.test.ts +260 -0
  292. package/src/search/query-parser.ts +319 -0
  293. package/src/search/searcher.test.ts +280 -0
  294. package/src/search/searcher.ts +724 -0
  295. package/src/search/wink-bm25.d.ts +30 -0
  296. package/src/summarization/cli-providers/claude.ts +202 -0
  297. package/src/summarization/cli-providers/detection.test.ts +273 -0
  298. package/src/summarization/cli-providers/detection.ts +118 -0
  299. package/src/summarization/cli-providers/index.ts +8 -0
  300. package/src/summarization/cost.test.ts +139 -0
  301. package/src/summarization/cost.ts +102 -0
  302. package/src/summarization/error-handler.test.ts +127 -0
  303. package/src/summarization/error-handler.ts +111 -0
  304. package/src/summarization/index.ts +102 -0
  305. package/src/summarization/pipeline.test.ts +498 -0
  306. package/src/summarization/pipeline.ts +231 -0
  307. package/src/summarization/prompts.test.ts +269 -0
  308. package/src/summarization/prompts.ts +133 -0
  309. package/src/summarization/provider-factory.test.ts +396 -0
  310. package/src/summarization/provider-factory.ts +178 -0
  311. package/src/summarization/types.ts +184 -0
  312. package/src/summarize/budget-bugs.test.ts +620 -0
  313. package/src/summarize/formatters.ts +419 -0
  314. package/src/summarize/index.ts +20 -0
  315. package/src/summarize/summarizer.test.ts +275 -0
  316. package/src/summarize/summarizer.ts +597 -0
  317. package/src/summarize/verify-bugs.test.ts +238 -0
  318. package/src/types/huggingface-transformers.d.ts +66 -0
  319. package/src/utils/index.ts +1 -0
  320. package/src/utils/tokens.test.ts +142 -0
  321. package/src/utils/tokens.ts +186 -0
  322. package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
  323. package/tests/fixtures/cli/.mdcontext/config.json +8 -0
  324. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  325. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  326. package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
  327. package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
  328. package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
  329. package/tests/fixtures/cli/README.md +9 -0
  330. package/tests/fixtures/cli/api-reference.md +11 -0
  331. package/tests/fixtures/cli/getting-started.md +11 -0
  332. package/tests/integration/embed-index.test.ts +712 -0
  333. package/tests/integration/search-context.test.ts +469 -0
  334. package/tests/integration/search-semantic.test.ts +522 -0
  335. package/tsconfig.json +26 -0
  336. package/vitest.config.ts +16 -0
  337. package/vitest.setup.ts +12 -0
@@ -0,0 +1,523 @@
# RAG Alternatives Research: Improving Semantic Search Quality

This document explores alternatives to traditional RAG patterns for improving semantic search quality in mdcontext. Since mdcontext is a pure retrieval system (no LLM generation), we focus on techniques that enhance retrieval precision and recall without adding generation complexity.

## Table of Contents

1. [The RAG Problem](#the-rag-problem)
2. [Alternative Approaches](#alternative-approaches)
3. [Top 3 Recommendations](#top-3-recommendations)
4. [Effort/Impact Analysis](#effortimpact-analysis)
5. [Quick Wins](#quick-wins)

---

## The RAG Problem

### Why Standard RAG Hurts Retrieval Quality

Traditional RAG (Retrieval-Augmented Generation) is designed to enhance LLM generation with retrieved context. However, this paradigm introduces several problems when applied to pure semantic search:

#### 1. Optimization Mismatch

RAG systems optimize for **generation quality**, not **retrieval precision**. This creates a fundamental mismatch:

- RAG tolerates noisy retrieval because LLMs can filter irrelevant context
- Pure search requires every result to be relevant since users see raw results
- RAG metrics (BLEU, ROUGE) don't align with search metrics (nDCG, MRR)

#### 2. The Confidence Problem

Research shows RAG paradoxically reduces model accuracy in some cases:

> "While RAG generally improves overall performance, it paradoxically reduces the model's ability to abstain from answering when appropriate. The introduction of additional context seems to increase the model's confidence, leading to a higher propensity for hallucination rather than abstention."

Google research found that Gemma's incorrect answer rate increased from 10.2% to 66.1% when using insufficient context, demonstrating how retrieved content can actively harm results.

#### 3. The Vocabulary Mismatch Problem

Dense embeddings theoretically solve vocabulary mismatch (e.g., "coronavirus" should match "COVID"), but real embeddings fall short:

> "While semantic embedding models are supposed to eliminate the need for query expansion... real embeddings made by real models often fall short."

Example: A query for "skin rash" might retrieve documents about "behaving rashly" while missing medical articles about "dermatitis."

#### 4. When Retrieval Beats Generation

For documentation search specifically:

| Use Case              | RAG Appropriate | Pure Retrieval Better |
| --------------------- | --------------- | --------------------- |
| Answer synthesis      | Yes             | No                    |
| Finding specific docs | No              | Yes                   |
| Exploratory search    | No              | Yes                   |
| Code examples         | Depends         | Usually               |
| API reference         | No              | Yes                   |

mdcontext's use case (finding relevant documentation sections) is best served by optimizing retrieval directly.

---

## Alternative Approaches

### 1. Hybrid Search (BM25 + Dense)

**What it is**: Combine traditional keyword search (BM25) with dense vector search, fusing results using techniques like Reciprocal Rank Fusion (RRF).

**Why it works**:

- Dense vectors excel at semantic understanding
- BM25 excels at exact matches (error codes, SKUs, technical terms)
- Hybrid captures both without tradeoffs

**Performance gains**:

> "Hybrid search improves recall 15-30% over single methods with minimal added complexity."
>
> "In open-domain QA benchmarks... BM25 passage recall is 22.1%; dense retrievers (DPR) reach 48.7%, but hybrid pipelines achieve up to 53.4%."

**Fusion methods**:

1. **Reciprocal Rank Fusion (RRF)**: Simplest, requires no tuning

   ```
   score = sum(1 / (k + rank)) for each retriever
   k = 60 (standard constant)
   ```

2. **Linear Combination**: More control, requires tuning

   ```
   score = alpha * bm25_score + (1 - alpha) * dense_score
   ```

**JavaScript/TypeScript options**:

- [wink-bm25-text-search](https://www.npmjs.com/package/wink-bm25-text-search): Full-featured BM25 with NLP integration
- [OkapiBM25](https://www.npmjs.com/package/okapi-bm25): Simple, typed implementation
- [@langchain/community BM25Retriever](https://js.langchain.com/docs/integrations/retrievers/bm25/)
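The RRF formula above is simple enough to implement directly. A minimal TypeScript sketch (the result-list shape is illustrative, not mdcontext's actual types):

```typescript
// Reciprocal Rank Fusion: merge ranked result lists from BM25 and dense retrieval.
type Ranked = { id: string }[];

function rrfFuse(lists: Ranked[], k = 60, topK = 10): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((doc, i) => {
      // forEach index is 0-based; RRF ranks are 1-based, hence i + 1
      scores.set(doc.id, (scores.get(doc.id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// A doc ranked well by both retrievers outranks one ranked well by only one:
const bm25 = [{ id: "errors.md" }, { id: "config.md" }, { id: "search.md" }];
const dense = [{ id: "errors.md" }, { id: "usage.md" }, { id: "config.md" }];
const fused = rrfFuse([bm25, dense]);
```

Because RRF uses only ranks, no score normalization between BM25 scores and cosine similarities is needed, which is why it requires no tuning.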

### 2. Cross-Encoder Re-ranking

**What it is**: Use a secondary model to re-score top-k results from initial retrieval.

**How it works**:

1. First stage: Fast bi-encoder retrieval (current approach)
2. Second stage: Cross-encoder scores (query, document) pairs for top-k results
3. Re-order based on cross-encoder scores

**Why it's better**:

> "Cross-encoders are more accurate than bi-encoders but they don't scale well, so using them to re-order a shortened list returned by semantic search is the ideal use case."

Cross-encoders process query and document together, enabling deeper semantic matching that bi-encoders (separate embedding) cannot achieve.

**Trade-offs**:

| Aspect      | Bi-Encoder             | Cross-Encoder            |
| ----------- | ---------------------- | ------------------------ |
| Speed       | Fast (precompute docs) | Slow (compute per query) |
| Accuracy    | Good                   | Best                     |
| Scalability | O(1) for docs          | O(n) per query           |
| Use case    | Initial retrieval      | Re-ranking top-k         |

**Implementation options**:

- [Transformers.js](https://huggingface.co/docs/transformers.js): Run ONNX models in Node.js
- [Cohere Rerank API](https://cohere.com/rerank): Managed service
- Python sidecar with sentence-transformers
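The two-stage flow can be sketched independently of the model choice. Below, `overlapScorer` is a toy stand-in for the cross-encoder; a real scorer (e.g. a Transformers.js model) would replace it and would typically be async:

```typescript
// Two-stage retrieval sketch: a pluggable pair scorer re-orders a shortlist.
type Doc = { id: string; text: string };
type PairScorer = (query: string, docText: string) => number;

function rerank(query: string, shortlist: Doc[], score: PairScorer, topK = 10): Doc[] {
  // One scorer call per (query, document) pair, so this is O(n) per query;
  // that is why it runs only on a small shortlist (e.g. top-20), not the corpus.
  return shortlist
    .map((doc) => ({ doc, s: score(query, doc.text) }))
    .sort((a, b) => b.s - a.s)
    .slice(0, topK)
    .map((x) => x.doc);
}

// Toy scorer: count query terms appearing in the document text.
const overlapScorer: PairScorer = (q, d) => {
  const terms = new Set(q.toLowerCase().split(/\s+/));
  return d.toLowerCase().split(/\s+/).filter((t) => terms.has(t)).length;
};
```

Keeping the scorer behind a function type also makes selective re-ranking easy: skip the call entirely for simple keyword queries.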

### 3. SPLADE (Learned Sparse Retrieval)

**What it is**: Neural model that produces sparse vectors compatible with inverted indexes, combining benefits of neural understanding with lexical precision.

**How it works**:

- Uses BERT to weight term importance
- Enables term expansion (adds relevant related terms)
- Produces sparse vectors (mostly zeros) for efficient indexing

**Key advantages**:

> "Sparse representations benefit from several advantages compared to dense approaches: efficient use of inverted index, explicit lexical match, interpretability. They also seem to be better at generalizing on out-of-domain data."

**When SPLADE beats dense**:

- Out-of-domain generalization
- Interpretability requirements
- Exact term matching important
- Limited training data

**Trade-offs**:

- Requires specialized model serving
- Less mature JavaScript ecosystem
- May need fine-tuning for domain

### 4. ColBERT Late Interaction

**What it is**: Multi-vector approach where documents and queries are represented by multiple token-level vectors, matched via "late interaction."

**How it works**:

1. Encode query tokens → multiple query vectors
2. Encode document tokens → multiple document vectors
3. Compute MaxSim: for each query token, find max similarity to any doc token
4. Sum MaxSim scores across query tokens
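MaxSim itself is only a few lines. A toy TypeScript sketch with 2-dimensional, roughly L2-normalized token vectors (real ColBERT vectors are 128-dimensional):

```typescript
// ColBERT-style late interaction: sum of per-query-token max similarities.
type Vec = number[];

// Dot product; with normalized vectors this equals cosine similarity.
const dot = (a: Vec, b: Vec) => a.reduce((s, x, i) => s + x * b[i], 0);

// For each query token vector, take its best match over all document token
// vectors, then sum those maxima across the query tokens.
function maxSim(queryTokens: Vec[], docTokens: Vec[]): number {
  return queryTokens.reduce(
    (sum, q) => sum + Math.max(...docTokens.map((d) => dot(q, d))),
    0,
  );
}

// Both query tokens find exact matches in docA, but only partial ones in docB.
const query = [[1, 0], [0, 1]];
const docA = [[1, 0], [0, 1], [0.7, 0.7]];
const docB = [[0.7, 0.7]];
maxSim(query, docA); // → 2
```

Token-level matching is what lets a query term align with its best counterpart anywhere in the document, instead of competing inside a single pooled vector.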

**Performance characteristics**:

> "PLAID reduces late interaction search latency by up to 7x on a GPU and 45x on a CPU against vanilla ColBERTv2."

**Production viability**:

- PLAID engine enables production-scale deployment
- Memory-mapped storage reduces RAM by 90%
- Sub-millisecond query latency achievable

**Limitations for mdcontext**:

- No mature JavaScript implementation
- Would require Python service
- More complex infrastructure
- Overkill for typical documentation corpus sizes

### 5. Query Expansion Techniques

#### a) HyDE (Hypothetical Document Embeddings)

**What it is**: Use an LLM to generate a hypothetical answer, then search using the answer's embedding instead of the query's.

**How it works**:

1. Query: "How do I configure authentication?"
2. LLM generates hypothetical answer (may be wrong, but captures patterns)
3. Embed the hypothetical answer
4. Search with that embedding

**Why it works**:

> "The semantic gap between your short question and the detailed answer creates mismatches. HyDE bridges this gap by first expanding your question into a hypothetical detailed answer."

**When to use**:

- Complex questions
- Domain-specific jargon
- When query is much shorter than target documents

**When NOT to use**:

- Simple keyword queries
- When LLM lacks domain knowledge
- Latency-sensitive applications (adds LLM call)
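The flow is a thin wrapper around existing pieces. A sketch with the generation and embedding steps abstracted behind parameters (`generate` and `embed` are hypothetical stand-ins for the LLM and embedding-provider calls, not mdcontext APIs):

```typescript
// HyDE: search with the embedding of hypothetical answers, not the raw query.
type Embedder = (text: string) => number[];
type Generator = (query: string) => string[];

function hydeEmbedding(query: string, generate: Generator, embed: Embedder): number[] {
  // 1. Generate 1-3 hypothetical answers (details may be wrong; the
  //    document-like phrasing is what closes the semantic gap).
  const hypotheticals = generate(query);
  // 2. Embed each answer and average them into a single search vector.
  const vectors = hypotheticals.map(embed);
  const dim = vectors[0].length;
  const avg = new Array(dim).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < dim; i++) avg[i] += v[i] / vectors.length;
  }
  return avg; // used in place of the query embedding for vector search
}
```

The averaged vector then flows through the normal dense-search path; nothing downstream needs to know HyDE was applied, which makes it easy to enable selectively.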
214
+
215
+ #### b) LLM Query Expansion
216
+
217
+ **What it is**: Use LLM to expand query with synonyms, related terms, and reformulations.
218
+
219
+ **Approaches**:
220
+
221
+ 1. **Explicit expansion**: Generate expansion terms to append
222
+ 2. **Multi-query**: Generate multiple query variations, search all, merge results
223
+
224
+ **Risk**:
225
+
226
+ > "While query expansion is helpful, using LLMs risks adding unhelpful query terms that reduce performance."
227
+
228
+ **Best practices**:
229
+
230
+ - Use for ambiguous queries only
231
+ - Limit expansion scope
232
- Consider query type detection before expanding

### 6. Domain-Adapted Embeddings

**What it is**: Fine-tune embedding models on your specific corpus or domain.

**Why it matters**:

> "Off-the-shelf embedding models are often limited to general knowledge and not company- or domain-specific knowledge."

**Results**:

> "Fine-tuning can boost performance by ~7% with only 6.3k samples. The training took 3 minutes on a consumer size GPU."

**Approaches**:

| Approach              | Effort | Improvement | When to Use               |
| --------------------- | ------ | ----------- | ------------------------- |
| LoRA adapters         | Low    | 5-10%       | Specialized terminology   |
| Full fine-tune        | Medium | 10-15%      | Domain-specific semantics |
| Contrastive on corpus | High   | 15-20%      | Mission-critical search   |

**Requirements**:

- Training data (query-document pairs)
- GPU for training (consumer-grade sufficient)
- Evaluation dataset

### 7. Matryoshka Representation Learning (MRL)

**What it is**: Embeddings that work at multiple dimensions, enabling adaptive precision/speed tradeoffs.

**How it works**:

- Full embedding: 1536 dimensions
- Can truncate to 768, 384, 256, 128, etc.
- Early dimensions contain most information
- Enables two-stage retrieval with progressive precision

**Benefits**:

> "Up to 14x smaller embedding size for ImageNet-1K classification at the same level of accuracy... up to 14x real-world speed-ups for large-scale retrieval."

**Supported models**:

- OpenAI text-embedding-3-large (supports dimension reduction)
- Nomic nomic-embed-text-v1
- Alibaba gte-multilingual-base

**Application for mdcontext**:

- Already using text-embedding-3-small (supports dimensions parameter)
- Could use lower dimensions for initial shortlist
- Full dimensions for final ranking
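The shortlist-then-refine idea above can be sketched directly. This is a minimal illustration rather than mdcontext's actual code: the `Indexed` shape, the default 256-dimension shortlist width, and the 4x over-fetch factor are all assumptions.

```typescript
type Indexed = { id: string; embedding: number[] };

// Cosine similarity over only the first `dims` components.
// MRL-trained embeddings keep most information in early dimensions,
// but truncated prefixes are no longer unit-length, so re-normalize here.
function truncatedCosine(a: number[], b: number[], dims: number): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < dims; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

// Stage 1: cheap shortlist on a truncated prefix.
// Stage 2: exact ranking of the shortlist at full width.
function twoStageSearch(
  query: number[],
  index: Indexed[],
  k: number,
  shortDims = 256,
): Indexed[] {
  const shortlist = [...index]
    .sort(
      (x, y) =>
        truncatedCosine(query, y.embedding, shortDims) -
        truncatedCosine(query, x.embedding, shortDims),
    )
    .slice(0, k * 4); // over-fetch so stage 2 has room to reorder
  return shortlist
    .sort(
      (x, y) =>
        truncatedCosine(query, y.embedding, query.length) -
        truncatedCosine(query, x.embedding, query.length),
    )
    .slice(0, k);
}
```

In practice the stage-1 scan would run against the HNSW index rather than a linear sort; the sketch only shows the precision/speed split.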

---

## Top 3 Recommendations

### Recommendation 1: Hybrid Search (BM25 + Dense)

**Why #1**: Maximum impact with minimal complexity.

**Rationale**:

- Addresses the vocabulary mismatch problem directly
- 15-30% recall improvement documented
- Well-supported in the JavaScript ecosystem
- No external dependencies (LLM, GPU, Python)
- Complements existing dense search perfectly

**Implementation path**:

1. Add BM25 index alongside HNSW
2. Run parallel queries
3. Fuse with RRF (k=60)
4. Return fused top-k

**Expected improvement**: 15-25% better recall for technical queries.
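Step 3, RRF fusion, is small enough to show in full. A minimal sketch, assuming each retriever returns an ordered list of chunk ids; `k = 60` is the standard RRF damping constant:

```typescript
// Reciprocal Rank Fusion: score(d) = sum over retrievers of 1 / (k + rank(d)).
// Documents ranked well by either retriever float to the top; k = 60
// dampens the influence of any single first-place rank.
function rrfFuse(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      // i is 0-based, ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}
```

Called with the BM25 and dense result lists, e.g. `rrfFuse([bm25Ids, denseIds]).slice(0, 10)`.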

### Recommendation 2: Cross-Encoder Re-ranking

**Why #2**: Best precision gains for reasonable cost.

**Rationale**:

- Dramatically improves top-10 relevance
- Can be applied selectively (complex queries only)
- Transformers.js enables a pure JavaScript implementation
- Small models (MiniLM) run fast enough for interactive use

**Implementation path**:

1. Use Transformers.js with a cross-encoder model
2. Re-rank top-20 candidates to top-10
3. Consider caching for repeated queries

**Expected improvement**: 10-20% better precision@10.
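A sketch of the re-ranking flow. The `Candidate` shape is illustrative, and the token-overlap scorer is a self-contained stand-in: a real implementation would replace `scorePair` with a cross-encoder forward pass over the (query, passage) pair, e.g. an MS MARCO MiniLM model loaded through Transformers.js.

```typescript
type Candidate = { id: string; text: string; denseScore: number };

// Stand-in pair scorer: fraction of passage tokens shared with the query.
// A real cross-encoder scores (query, passage) jointly in one forward pass,
// which is what makes it more precise than comparing separate embeddings.
function scorePair(query: string, passage: string): number {
  const queryTokens = new Set(query.toLowerCase().split(/\s+/));
  const passageTokens = passage.toLowerCase().split(/\s+/);
  const overlap = passageTokens.filter((t) => queryTokens.has(t)).length;
  return overlap / Math.max(passageTokens.length, 1);
}

// Re-rank the top-20 dense candidates, return the top-10.
function rerankTopK(query: string, candidates: Candidate[]): Candidate[] {
  return candidates
    .slice(0, 20)
    .map((c) => ({ c, score: scorePair(query, c.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 10)
    .map((x) => x.c);
}
```

Restricting the scorer to 20 candidates keeps the added latency bounded regardless of corpus size.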

### Recommendation 3: Query Expansion (Selective HyDE)

**Why #3**: Addresses the semantic gap for complex queries.

**Rationale**:

- Transforms short queries into document-like representations
- Works well for "how to" and conceptual queries
- Can be optional (detect when helpful)
- Uses existing OpenAI integration

**Implementation path**:

1. Detect query type (simple keyword vs. complex question)
2. For complex queries, generate 1-3 hypothetical answers
3. Embed answers, average embeddings
4. Search with expanded representation

**Expected improvement**: 15-30% for complex queries, 0% for simple keywords (but no regression).
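Steps 1 and 3 can be sketched without any API calls. The question-word list and the 5-word threshold are assumed heuristics, not measured cutoffs; generating and embedding the hypothetical answers would go through the existing OpenAI integration.

```typescript
// Step 1: cheap query-type heuristic. Question-word openers and longer
// natural-language queries get HyDE; short keyword queries skip it.
function isComplexQuery(query: string): boolean {
  const startsLikeQuestion =
    /^(how|what|why|when|where|which|who|can|does|should)\b/i.test(query.trim());
  return startsLikeQuestion || query.trim().split(/\s+/).length >= 5;
}

// Step 3: average the hypothetical-answer embeddings into one query vector.
function averageEmbeddings(vectors: number[][]): number[] {
  const sum = new Array(vectors[0].length).fill(0);
  for (const v of vectors) {
    for (let i = 0; i < v.length; i++) sum[i] += v[i];
  }
  return sum.map((x) => x / vectors.length);
}
```

For simple queries, `isComplexQuery` returns false and the raw query is embedded as today, which is what guarantees the "no regression" property.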

---

## Effort/Impact Analysis

| Technique                 | Implementation Effort | Accuracy Impact       | Latency Impact      | Dependencies                       |
| ------------------------- | --------------------- | --------------------- | ------------------- | ---------------------------------- |
| **Hybrid Search**         | Medium (2-3 days)     | High (+15-30%)        | Low (+5-10ms)       | npm package only                   |
| **Cross-Encoder Re-rank** | Medium (2-3 days)     | High (+10-20%)        | Medium (+50-200ms)  | Transformers.js + ONNX model       |
| **HyDE Query Expansion**  | Low (1 day)           | Medium (+15%)         | High (+500-1000ms)  | OpenAI API                         |
| **SPLADE**                | High (1-2 weeks)      | Medium (+10%)         | Low                 | Python service                     |
| **ColBERT**               | Very High (2-4 weeks) | Very High (+20%)      | Medium              | Python service + specialized index |
| **Fine-tuned Embeddings** | High (1 week)         | Medium-High (+10-15%) | None                | Training infrastructure            |
| **Matryoshka Dimensions** | Low (0.5 days)        | Low (+5%)             | Improvement (-20ms) | Already supported                  |

### Prioritized Roadmap

```
Phase 1 (Quick Wins - 1 week):
├── Matryoshka dimension optimization
└── Query preprocessing improvements

Phase 2 (Core Improvements - 2 weeks):
├── Hybrid search (BM25 + dense)
└── RRF fusion implementation

Phase 3 (Advanced Features - 2 weeks):
├── Cross-encoder re-ranking (Transformers.js)
└── Selective HyDE for complex queries

Phase 4 (Future Optimization):
├── Domain-adapted embeddings (if corpus-specific issues)
└── SPLADE evaluation (if hybrid insufficient)
```

---

## Quick Wins

These improvements can be implemented quickly with immediate benefits:

### 1. Query Preprocessing (1-2 hours)

```typescript
function preprocessQuery(query: string): string {
  return query
    .toLowerCase()
    .replace(/[^\w\s]/g, " ") // Remove punctuation
    .replace(/\s+/g, " ") // Normalize whitespace
    .trim();
}
```

**Impact**: Reduces embedding noise, 2-5% precision improvement.

### 2. Matryoshka Dimension Reduction (2-4 hours)

OpenAI's text-embedding-3-small supports dimension reduction:

```typescript
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: texts,
  dimensions: 512, // Instead of 1536
});
```

**Benefits**:

- 3x smaller index
- Faster search
- Minimal accuracy loss (< 2% for most cases)

**Best for**: Larger corpora, faster iteration.

### 3. Result Deduplication (1-2 hours)

Remove near-duplicate results based on:

- Same document + similar headings
- High cosine similarity between result embeddings

**Impact**: Better result diversity, improved user experience.
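A greedy version of this filter, as a sketch; the 0.95 similarity threshold and the `DedupResult` shape are illustrative assumptions.

```typescript
type DedupResult = {
  doc: string;
  heading: string;
  embedding: number[];
  similarity: number;
};

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

// Walk results best-first; drop any result from the same document whose
// embedding is nearly identical to one already kept.
function dedupe(results: DedupResult[], threshold = 0.95): DedupResult[] {
  const kept: DedupResult[] = [];
  for (const r of results) {
    const isDuplicate = kept.some(
      (prev) =>
        prev.doc === r.doc && cosine(prev.embedding, r.embedding) > threshold,
    );
    if (!isDuplicate) kept.push(r);
  }
  return kept;
}
```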

### 4. Boost Heading Matches (2-4 hours)

Add a bonus score when query terms appear in section headings:

```typescript
function adjustScore(result: SearchResult, query: string): number {
  const queryTerms = query.toLowerCase().split(/\s+/);
  const headingLower = result.heading.toLowerCase();
  const headingMatches = queryTerms.filter((t) =>
    headingLower.includes(t),
  ).length;

  return result.similarity + headingMatches * 0.05; // +5% per match
}
```

**Impact**: Significant for navigation queries ("installation guide", "API reference").

### 5. Document Title Context (2-4 hours)

Ensure document titles are prominent in embeddings:

```typescript
function getEmbeddingText(section: Section, doc: Document): string {
  return `
Document: ${doc.title}
Section: ${section.heading}
Parent: ${section.parent?.heading || "None"}

${section.content}
`.trim();
}
```

**Impact**: Better matching for document-level queries.

### 6. Negative Result Caching (4-8 hours)

Cache queries that return poor results:

- Track low-similarity searches
- Use for query expansion hints
- Inform users when no good matches exist

**Impact**: Better UX, data for future improvements.
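One possible shape for such a cache, as a sketch: the 0.3 similarity floor is an arbitrary illustration, and persistence across runs is left out.

```typescript
// Track queries whose best match fell below a similarity floor, so the
// search path can warn users and collect data for future query-expansion work.
class NegativeResultCache {
  private misses = new Map<string, { count: number; bestScore: number }>();

  constructor(private floor = 0.3) {}

  // Call after each search with the top result's similarity.
  record(query: string, bestScore: number): void {
    if (bestScore >= this.floor) return; // only cache poor results
    const key = query.trim().toLowerCase();
    const prev = this.misses.get(key);
    this.misses.set(key, {
      count: (prev?.count ?? 0) + 1,
      bestScore: Math.max(prev?.bestScore ?? 0, bestScore),
    });
  }

  isKnownMiss(query: string): boolean {
    return this.misses.has(query.trim().toLowerCase());
  }
}
```

The recorded misses double as a dataset: queries that repeatedly score poorly are the best candidates for synonym hints or HyDE expansion.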

---

## References

### Research Papers

- [Precise Zero-Shot Dense Retrieval without Relevance Labels (HyDE)](https://arxiv.org/abs/2212.10496)
- [Matryoshka Representation Learning](https://arxiv.org/abs/2205.13147)
- [SPLADE: Sparse Lexical and Expansion Model for Information Retrieval](https://arxiv.org/abs/2109.10086)
- [ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT](https://arxiv.org/abs/2004.12832)
- [Conventional Contrastive Learning Often Falls Short](https://arxiv.org/abs/2505.19274)

### Implementation Resources

- [Transformers.js Documentation](https://huggingface.co/docs/transformers.js)
- [wink-bm25-text-search](https://www.npmjs.com/package/wink-bm25-text-search)
- [Sentence Transformers - Retrieve & Re-Rank](https://sbert.net/examples/sentence_transformer/applications/retrieve_rerank/README.html)
- [OpenAI Cookbook - Search Reranking with Cross-Encoders](https://cookbook.openai.com/examples/search_reranking_with_cross-encoders)

### Industry Best Practices

- [Weaviate: Hybrid Search Explained](https://weaviate.io/blog/hybrid-search-explained)
- [Qdrant: Modern Sparse Neural Retrieval](https://qdrant.tech/articles/modern-sparse-neural-retrieval/)
- [Pinecone: SPLADE for Sparse Vector Search](https://www.pinecone.io/learn/splade/)
- [Google Research: The Role of Sufficient Context in RAG](https://research.google/blog/deeper-insights-into-retrieval-augmented-generation-the-role-of-sufficient-context/)

---

## Summary

For mdcontext's semantic search use case, the recommended approach is:

1. **Hybrid search** for the best baseline improvement
2. **Cross-encoder re-ranking** for precision when needed
3. **Selective query expansion** for complex queries

These three techniques, combined with quick wins like query preprocessing and heading boosting, can significantly improve search quality without introducing the complexity and failure modes of full RAG systems.

The key insight is that **pure retrieval optimization beats RAG** for documentation search because:

- Users want to find documents, not generated answers
- Every result must be relevant (no LLM to filter noise)
- Latency matters for interactive search
- Simpler systems are more reliable and maintainable