mdcontext 0.0.1 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (337)
  1. package/.changeset/README.md +28 -0
  2. package/.changeset/config.json +11 -0
  3. package/.claude/settings.local.json +25 -0
  4. package/.github/workflows/ci.yml +83 -0
  5. package/.github/workflows/claude-code-review.yml +44 -0
  6. package/.github/workflows/claude.yml +85 -0
  7. package/.github/workflows/release.yml +113 -0
  8. package/.tldrignore +112 -0
  9. package/BACKLOG.md +338 -0
  10. package/CONTRIBUTING.md +186 -0
  11. package/NOTES/NOTES +44 -0
  12. package/README.md +434 -11
  13. package/biome.json +36 -0
  14. package/cspell.config.yaml +14 -0
  15. package/dist/chunk-23UPXDNL.js +3044 -0
  16. package/dist/chunk-2W7MO2DL.js +1366 -0
  17. package/dist/chunk-3NUAZGMA.js +1689 -0
  18. package/dist/chunk-7TOWB2XB.js +366 -0
  19. package/dist/chunk-7XOTOADQ.js +3065 -0
  20. package/dist/chunk-AH2PDM2K.js +3042 -0
  21. package/dist/chunk-BNXWSZ63.js +3742 -0
  22. package/dist/chunk-BTL5DJVU.js +3222 -0
  23. package/dist/chunk-HDHYG7E4.js +104 -0
  24. package/dist/chunk-HLR4KZBP.js +3234 -0
  25. package/dist/chunk-IP3FRFEB.js +1045 -0
  26. package/dist/chunk-KHU56VDO.js +3042 -0
  27. package/dist/chunk-KRYIFLQR.js +88 -0
  28. package/dist/chunk-LBSDNLEM.js +287 -0
  29. package/dist/chunk-MNTQ7HCP.js +2643 -0
  30. package/dist/chunk-MUJELQQ6.js +1387 -0
  31. package/dist/chunk-MXJGMSLV.js +2199 -0
  32. package/dist/chunk-N6QJGC3Z.js +2636 -0
  33. package/dist/chunk-OBELGBPM.js +1713 -0
  34. package/dist/chunk-OT7R5XTA.js +3192 -0
  35. package/dist/chunk-P7X4RA2T.js +106 -0
  36. package/dist/chunk-PIDUQNC2.js +3185 -0
  37. package/dist/chunk-POGCDIH4.js +3187 -0
  38. package/dist/chunk-PSIEOQGZ.js +3043 -0
  39. package/dist/chunk-PVRT3IHA.js +3238 -0
  40. package/dist/chunk-QNN4TT23.js +1430 -0
  41. package/dist/chunk-RE3R45RJ.js +3042 -0
  42. package/dist/chunk-S7E6TFX6.js +803 -0
  43. package/dist/chunk-SG6GLU4U.js +1378 -0
  44. package/dist/chunk-SJCDV2ST.js +274 -0
  45. package/dist/chunk-SYE5XLF3.js +104 -0
  46. package/dist/chunk-T5VLYBZD.js +103 -0
  47. package/dist/chunk-TOQB7VWU.js +3238 -0
  48. package/dist/chunk-VFNMZ4ZQ.js +3228 -0
  49. package/dist/chunk-VVTGZNBT.js +1629 -0
  50. package/dist/chunk-W7Q4RFEV.js +104 -0
  51. package/dist/chunk-XTYYVRLO.js +3190 -0
  52. package/dist/chunk-Y6MDYVJD.js +3063 -0
  53. package/dist/cli/main.d.ts +1 -0
  54. package/dist/cli/main.js +5458 -0
  55. package/dist/index.d.ts +653 -0
  56. package/dist/index.js +79 -0
  57. package/dist/mcp/server.d.ts +1 -0
  58. package/dist/mcp/server.js +472 -0
  59. package/dist/schema-BAWSG7KY.js +22 -0
  60. package/dist/schema-E3QUPL26.js +20 -0
  61. package/dist/schema-EHL7WUT6.js +20 -0
  62. package/docs/019-USAGE.md +625 -0
  63. package/docs/020-current-implementation.md +364 -0
  64. package/docs/021-DOGFOODING-FINDINGS.md +175 -0
  65. package/docs/BACKLOG.md +80 -0
  66. package/docs/CONFIG.md +1123 -0
  67. package/docs/DESIGN.md +439 -0
  68. package/docs/ERRORS.md +383 -0
  69. package/docs/PROJECT.md +88 -0
  70. package/docs/ROADMAP.md +407 -0
  71. package/docs/summarization.md +320 -0
  72. package/docs/test-links.md +9 -0
  73. package/justfile +40 -0
  74. package/package.json +74 -9
  75. package/pnpm-workspace.yaml +5 -0
  76. package/research/INDEX.md +315 -0
  77. package/research/code-review/README.md +90 -0
  78. package/research/code-review/cli-error-handling-review.md +979 -0
  79. package/research/code-review/code-review-validation-report.md +464 -0
  80. package/research/code-review/main-ts-review.md +1128 -0
  81. package/research/config-analysis/01-current-implementation.md +470 -0
  82. package/research/config-analysis/02-strategy-recommendation.md +428 -0
  83. package/research/config-analysis/03-task-candidates.md +715 -0
  84. package/research/config-analysis/033-research-configuration-management.md +828 -0
  85. package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
  86. package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
  87. package/research/config-docs/SUMMARY.md +357 -0
  88. package/research/config-docs/TEST-RESULTS.md +776 -0
  89. package/research/config-docs/TODO.md +542 -0
  90. package/research/config-docs/analysis.md +744 -0
  91. package/research/config-docs/fix-validation.md +502 -0
  92. package/research/config-docs/help-audit.md +264 -0
  93. package/research/config-docs/help-system-analysis.md +890 -0
  94. package/research/dogfood/consolidated-tool-evaluation.md +373 -0
  95. package/research/dogfood/strategy-a/a-synthesis.md +184 -0
  96. package/research/dogfood/strategy-a/a1-docs.md +226 -0
  97. package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
  98. package/research/dogfood/strategy-a/a3-llm.md +164 -0
  99. package/research/dogfood/strategy-b/b-synthesis.md +228 -0
  100. package/research/dogfood/strategy-b/b1-architecture.md +207 -0
  101. package/research/dogfood/strategy-b/b2-gaps.md +258 -0
  102. package/research/dogfood/strategy-b/b3-workflows.md +250 -0
  103. package/research/dogfood/strategy-c/c-synthesis.md +451 -0
  104. package/research/dogfood/strategy-c/c1-explorer.md +192 -0
  105. package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
  106. package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
  107. package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
  108. package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
  109. package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
  110. package/research/effect-cli-error-handling.md +845 -0
  111. package/research/effect-errors-as-values.md +943 -0
  112. package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
  113. package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
  114. package/research/errors-task-analysis/embeddings-analysis.md +709 -0
  115. package/research/errors-task-analysis/index-search-analysis.md +812 -0
  116. package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
  117. package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
  118. package/research/issue-review.md +603 -0
  119. package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
  120. package/research/llm-summarization/alternative-providers-2026.md +1428 -0
  121. package/research/llm-summarization/anthropic-2026.md +367 -0
  122. package/research/llm-summarization/claude-cli-integration.md +1706 -0
  123. package/research/llm-summarization/cli-integration-patterns.md +3155 -0
  124. package/research/llm-summarization/openai-2026.md +473 -0
  125. package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
  126. package/research/llm-summarization/opencode-cli-integration.md +1552 -0
  127. package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
  128. package/research/llm-summarization/prototype-results.md +56 -0
  129. package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
  130. package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
  131. package/research/mdcontext-error-analysis.md +521 -0
  132. package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
  133. package/research/mdcontext-pudding/01-index-embed.md +956 -0
  134. package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
  135. package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
  136. package/research/mdcontext-pudding/02-search.md +970 -0
  137. package/research/mdcontext-pudding/03-context.md +779 -0
  138. package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
  139. package/research/mdcontext-pudding/04-tree.md +704 -0
  140. package/research/mdcontext-pudding/05-config.md +1038 -0
  141. package/research/mdcontext-pudding/06-links-summary.txt +87 -0
  142. package/research/mdcontext-pudding/06-links.md +679 -0
  143. package/research/mdcontext-pudding/07-stats.md +693 -0
  144. package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
  145. package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
  146. package/research/mdcontext-pudding/README.md +168 -0
  147. package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
  148. package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
  149. package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
  150. package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
  151. package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
  152. package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
  153. package/research/research-quality-review.md +834 -0
  154. package/research/semantic-search/002-research-embedding-models.md +490 -0
  155. package/research/semantic-search/003-research-rag-alternatives.md +523 -0
  156. package/research/semantic-search/004-research-vector-search.md +841 -0
  157. package/research/semantic-search/032-research-semantic-search.md +427 -0
  158. package/research/semantic-search/embedding-text-analysis.md +156 -0
  159. package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
  160. package/research/semantic-search/query-processing-analysis.md +207 -0
  161. package/research/semantic-search/root-cause-and-solution.md +114 -0
  162. package/research/semantic-search/threshold-validation-report.md +69 -0
  163. package/research/semantic-search/vector-search-analysis.md +63 -0
  164. package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
  165. package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
  166. package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
  167. package/research/task-management-2026/03-lightweight-file-based.md +567 -0
  168. package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
  169. package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
  170. package/research/task-management-2026/linear/02-api-integrations.md +930 -0
  171. package/research/task-management-2026/linear/03-ai-features.md +368 -0
  172. package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
  173. package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
  174. package/research/test-path-issues.md +276 -0
  175. package/review/ALP-76/1-error-type-design.md +962 -0
  176. package/review/ALP-76/2-error-handling-patterns.md +906 -0
  177. package/review/ALP-76/3-error-presentation.md +624 -0
  178. package/review/ALP-76/4-test-coverage.md +625 -0
  179. package/review/ALP-76/5-migration-completeness.md +440 -0
  180. package/review/ALP-76/6-effect-best-practices.md +755 -0
  181. package/scripts/apply-branch-protection.sh +47 -0
  182. package/scripts/branch-protection-templates.json +79 -0
  183. package/scripts/prototype-summarization.ts +346 -0
  184. package/scripts/rebuild-hnswlib.js +58 -0
  185. package/scripts/setup-branch-protection.sh +64 -0
  186. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
  187. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
  188. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
  189. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
  190. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  191. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  192. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
  193. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
  194. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
  195. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
  196. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
  197. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
  198. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
  199. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
  200. package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
  201. package/src/cli/argv-preprocessor.test.ts +210 -0
  202. package/src/cli/argv-preprocessor.ts +202 -0
  203. package/src/cli/cli.test.ts +627 -0
  204. package/src/cli/commands/backlinks.ts +54 -0
  205. package/src/cli/commands/config-cmd.ts +642 -0
  206. package/src/cli/commands/context.ts +285 -0
  207. package/src/cli/commands/duplicates.ts +122 -0
  208. package/src/cli/commands/embeddings.ts +529 -0
  209. package/src/cli/commands/index-cmd.ts +480 -0
  210. package/src/cli/commands/index.ts +16 -0
  211. package/src/cli/commands/links.ts +52 -0
  212. package/src/cli/commands/search.ts +1281 -0
  213. package/src/cli/commands/stats.ts +149 -0
  214. package/src/cli/commands/tree.ts +128 -0
  215. package/src/cli/config-layer.ts +176 -0
  216. package/src/cli/error-handler.test.ts +235 -0
  217. package/src/cli/error-handler.ts +655 -0
  218. package/src/cli/flag-schemas.ts +341 -0
  219. package/src/cli/help.ts +588 -0
  220. package/src/cli/index.ts +9 -0
  221. package/src/cli/main.ts +435 -0
  222. package/src/cli/options.ts +41 -0
  223. package/src/cli/shared-error-handling.ts +199 -0
  224. package/src/cli/typo-suggester.test.ts +105 -0
  225. package/src/cli/typo-suggester.ts +130 -0
  226. package/src/cli/utils.ts +259 -0
  227. package/src/config/file-provider.test.ts +320 -0
  228. package/src/config/file-provider.ts +273 -0
  229. package/src/config/index.ts +72 -0
  230. package/src/config/integration.test.ts +667 -0
  231. package/src/config/precedence.test.ts +277 -0
  232. package/src/config/precedence.ts +451 -0
  233. package/src/config/schema.test.ts +414 -0
  234. package/src/config/schema.ts +603 -0
  235. package/src/config/service.test.ts +320 -0
  236. package/src/config/service.ts +243 -0
  237. package/src/config/testing.test.ts +264 -0
  238. package/src/config/testing.ts +110 -0
  239. package/src/core/index.ts +1 -0
  240. package/src/core/types.ts +113 -0
  241. package/src/duplicates/detector.test.ts +183 -0
  242. package/src/duplicates/detector.ts +414 -0
  243. package/src/duplicates/index.ts +18 -0
  244. package/src/embeddings/embedding-namespace.test.ts +300 -0
  245. package/src/embeddings/embedding-namespace.ts +947 -0
  246. package/src/embeddings/heading-boost.test.ts +222 -0
  247. package/src/embeddings/hnsw-build-options.test.ts +198 -0
  248. package/src/embeddings/hyde.test.ts +272 -0
  249. package/src/embeddings/hyde.ts +264 -0
  250. package/src/embeddings/index.ts +10 -0
  251. package/src/embeddings/openai-provider.ts +414 -0
  252. package/src/embeddings/pricing.json +22 -0
  253. package/src/embeddings/provider-constants.ts +204 -0
  254. package/src/embeddings/provider-errors.test.ts +967 -0
  255. package/src/embeddings/provider-errors.ts +565 -0
  256. package/src/embeddings/provider-factory.test.ts +240 -0
  257. package/src/embeddings/provider-factory.ts +225 -0
  258. package/src/embeddings/provider-integration.test.ts +788 -0
  259. package/src/embeddings/query-preprocessing.test.ts +187 -0
  260. package/src/embeddings/semantic-search-threshold.test.ts +508 -0
  261. package/src/embeddings/semantic-search.ts +1270 -0
  262. package/src/embeddings/types.ts +359 -0
  263. package/src/embeddings/vector-store.ts +708 -0
  264. package/src/embeddings/voyage-provider.ts +313 -0
  265. package/src/errors/errors.test.ts +845 -0
  266. package/src/errors/index.ts +533 -0
  267. package/src/index/ignore-patterns.test.ts +354 -0
  268. package/src/index/ignore-patterns.ts +305 -0
  269. package/src/index/index.ts +4 -0
  270. package/src/index/indexer.ts +684 -0
  271. package/src/index/storage.ts +260 -0
  272. package/src/index/types.ts +147 -0
  273. package/src/index/watcher.ts +189 -0
  274. package/src/index.ts +30 -0
  275. package/src/integration/search-keyword.test.ts +678 -0
  276. package/src/mcp/server.ts +612 -0
  277. package/src/parser/index.ts +1 -0
  278. package/src/parser/parser.test.ts +291 -0
  279. package/src/parser/parser.ts +394 -0
  280. package/src/parser/section-filter.test.ts +277 -0
  281. package/src/parser/section-filter.ts +392 -0
  282. package/src/search/__tests__/hybrid-search.test.ts +650 -0
  283. package/src/search/bm25-store.ts +366 -0
  284. package/src/search/cross-encoder.test.ts +253 -0
  285. package/src/search/cross-encoder.ts +406 -0
  286. package/src/search/fuzzy-search.test.ts +419 -0
  287. package/src/search/fuzzy-search.ts +273 -0
  288. package/src/search/hybrid-search.ts +448 -0
  289. package/src/search/path-matcher.test.ts +276 -0
  290. package/src/search/path-matcher.ts +33 -0
  291. package/src/search/query-parser.test.ts +260 -0
  292. package/src/search/query-parser.ts +319 -0
  293. package/src/search/searcher.test.ts +280 -0
  294. package/src/search/searcher.ts +724 -0
  295. package/src/search/wink-bm25.d.ts +30 -0
  296. package/src/summarization/cli-providers/claude.ts +202 -0
  297. package/src/summarization/cli-providers/detection.test.ts +273 -0
  298. package/src/summarization/cli-providers/detection.ts +118 -0
  299. package/src/summarization/cli-providers/index.ts +8 -0
  300. package/src/summarization/cost.test.ts +139 -0
  301. package/src/summarization/cost.ts +102 -0
  302. package/src/summarization/error-handler.test.ts +127 -0
  303. package/src/summarization/error-handler.ts +111 -0
  304. package/src/summarization/index.ts +102 -0
  305. package/src/summarization/pipeline.test.ts +498 -0
  306. package/src/summarization/pipeline.ts +231 -0
  307. package/src/summarization/prompts.test.ts +269 -0
  308. package/src/summarization/prompts.ts +133 -0
  309. package/src/summarization/provider-factory.test.ts +396 -0
  310. package/src/summarization/provider-factory.ts +178 -0
  311. package/src/summarization/types.ts +184 -0
  312. package/src/summarize/budget-bugs.test.ts +620 -0
  313. package/src/summarize/formatters.ts +419 -0
  314. package/src/summarize/index.ts +20 -0
  315. package/src/summarize/summarizer.test.ts +275 -0
  316. package/src/summarize/summarizer.ts +597 -0
  317. package/src/summarize/verify-bugs.test.ts +238 -0
  318. package/src/types/huggingface-transformers.d.ts +66 -0
  319. package/src/utils/index.ts +1 -0
  320. package/src/utils/tokens.test.ts +142 -0
  321. package/src/utils/tokens.ts +186 -0
  322. package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
  323. package/tests/fixtures/cli/.mdcontext/config.json +8 -0
  324. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
  325. package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
  326. package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
  327. package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
  328. package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
  329. package/tests/fixtures/cli/README.md +9 -0
  330. package/tests/fixtures/cli/api-reference.md +11 -0
  331. package/tests/fixtures/cli/getting-started.md +11 -0
  332. package/tests/integration/embed-index.test.ts +712 -0
  333. package/tests/integration/search-context.test.ts +469 -0
  334. package/tests/integration/search-semantic.test.ts +522 -0
  335. package/tsconfig.json +26 -0
  336. package/vitest.config.ts +16 -0
  337. package/vitest.setup.ts +12 -0
@@ -0,0 +1,149 @@
1
+ # Comments Are Skipped: The Format Problem
2
+
3
+ **Date:** 2026-01-28
4
+ **Status:** Critical insight - updates the thesis
5
+
6
+ ---
7
+
8
+ ## The Original Thesis
9
+
10
+ Frontmatter in file headers → LLMs read first 20 lines → 94% token reduction.
11
+
12
+ ## The Problem
13
+
14
+ LLMs already read the first 20 lines.
15
+
16
+ **They skip the comments.**
17
+
18
+ ```typescript
19
+ // ---
20
+ // file: ./auth.ts
21
+ // exports: [validateUser]
22
+ // ---
23
+ ```
24
+
25
+ LLM cognition: "Comment block → noise → skip → find code"
26
+
27
+ **Frontmatter as comments is invisible.**
28
+
29
+ ---
30
+
31
+ ## The Evidence
32
+
33
+ When I (Claude) read a file with frontmatter without being told to use it, I:
34
+
35
+ 1. Registered lines 1-8 as "comment header decoration"
36
+ 2. Skipped to line 10+ looking for actual code
37
+ 3. Never used the exports/imports metadata
38
+ 4. Read the full file to understand what it does
39
+
40
+ **The data was there. I ignored it.**
41
+
42
+ ---
43
+
44
+ ## The Fix: Format, Not Behavior
45
+
46
+ The problem isn't LLM read behavior. The problem is the format.
47
+
48
+ ### Comments = Skipped
49
+
50
+ ```typescript
51
+ // ---
52
+ // exports: [validateUser]
53
+ // ---
54
+ ```
55
+
56
+ Invisible. Dead on arrival.
57
+
58
+ ### Self-Announcing Header = Visible
59
+
60
+ ```typescript
61
+ // --- FMM ---
62
+ // exports: [validateUser]
63
+ // ---
64
+ ```
65
+
66
+ LLM sees `FMM` → pattern match → "this is metadata"
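
The pattern match described here can be sketched in a few lines. This is a minimal illustration, assuming the `// --- FMM ---` opener and `// ---` closer shown in the example; the name `parseFmmHeader`, the 20-line scan window, and the handling of `key: [a, b]` values are hypothetical and not taken from any existing tool.

```typescript
// Hypothetical helper: extracts key/value pairs from a `// --- FMM ---` block.
// Assumes the block opens with `// --- FMM ---` and closes with `// ---`.
type FmmHeader = Record<string, string[]>;

function parseFmmHeader(source: string, maxLines = 20): FmmHeader | null {
  const lines = source.split("\n").slice(0, maxLines);
  const start = lines.findIndex((l) => l.trim() === "// --- FMM ---");
  if (start === -1) return null; // no self-announcing marker: plain comments

  const header: FmmHeader = {};
  for (const line of lines.slice(start + 1)) {
    const text = line.trim();
    if (text === "// ---") return header; // closing marker reached
    const match = /^\/\/\s*([\w-]+):\s*(.*)$/.exec(text);
    if (!match) continue; // skip lines that are not `key: value`
    const [, key, raw] = match;
    // `[a, b]` becomes ["a", "b"]; a bare value becomes a one-element list.
    const inner = raw.startsWith("[") && raw.endsWith("]") ? raw.slice(1, -1) : raw;
    header[key] = inner
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0);
  }
  return null; // marker opened but never closed within maxLines
}
```

A plain `// ---` block returns `null` under this sketch, which mirrors the point of the section: without the explicit marker there is nothing to distinguish metadata from an ordinary comment header.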
67
+
68
+ ### Code = Parsed
69
+
70
+ ```typescript
71
+ export const __meta = {
72
+ exports: ["validateUser"],
73
+ imports: ["crypto"],
74
+ loc: 234
75
+ };
76
+ ```
77
+
78
+ LLM reads this as code. It's visible.
79
+
80
+ ### JSON = Queryable
81
+
82
+ ```jsonc
83
+ // .fmm/index.json
84
+ {
85
+ "src/auth.ts": {
86
+ "exports": ["validateUser"],
87
+ "imports": ["crypto"],
88
+ "loc": 234
89
+ }
90
+ }
91
+ ```
92
+
93
+ LLM queries this before reading files. No comments to skip.
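
As a rough sketch of what "query before reading" could look like, assuming the manifest has exactly the shape of the `.fmm/index.json` example above. The type names and the `filesExporting` function are hypothetical.

```typescript
// Shape of one manifest entry, mirroring the `.fmm/index.json` example.
interface ManifestEntry {
  exports: string[];
  imports: string[];
  loc: number;
}
type Manifest = Record<string, ManifestEntry>;

// Hypothetical lookup: which files export the given symbol?
function filesExporting(manifest: Manifest, symbol: string): string[] {
  return Object.entries(manifest)
    .filter(([, entry]) => entry.exports.includes(symbol))
    .map(([path]) => path);
}

// In practice the string would be the contents of the manifest file.
const manifest: Manifest = JSON.parse(
  '{"src/auth.ts": {"exports": ["validateUser"], "imports": ["crypto"], "loc": 234}}'
);

const hits = filesExporting(manifest, "validateUser");
```

Here `hits` contains only `src/auth.ts`, so a single lookup replaces opening every grep match.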
94
+
95
+ ---
96
+
97
+ ## The Updated Model
98
+
99
+ | Format | Human Readable | LLM Visible | Recommendation |
100
+ |--------|----------------|-------------|----------------|
101
+ | Comment frontmatter | Yes | **No** | Keep for humans |
102
+ | Code export | Yes | Yes | Caution: bundler issues |
103
+ | Manifest JSON | No | **Yes** | Add for LLMs |
104
+
105
+ **Generate both:**
106
+ - Inline comments → human readability
107
+ - Manifest JSON → LLM queryability
108
+
109
+ ---
110
+
111
+ ## Implications for mdcontext
112
+
113
+ mdcontext is about giving LLMs exactly what they need.
114
+
115
+ **Lesson:** Format matters as much as content.
116
+
117
+ - Markdown headers → LLMs parse these (structured)
118
+ - Markdown prose → LLMs read this (content)
119
+ - Code comments → LLMs skip this (noise)
120
+
121
+ When designing LLM-readable formats:
122
+ 1. **Avoid comment syntax** - it signals "ignore me"
123
+ 2. **Use structured data** - JSON, YAML, code
124
+ 3. **Put it where LLMs look** - separate files, explicit markers
125
+
126
+ ---
127
+
128
+ ## The Meta Point
129
+
130
+ We're generating markdown research docs.
131
+
132
+ mdcontext exists to make markdown LLM-readable.
133
+
134
+ The insight applies recursively:
135
+ - **Structure** your markdown (headers, lists) → LLMs parse it
136
+ - **Avoid** wall-of-text prose → LLMs skim it
137
+ - **Use** explicit markers for key info → LLMs find it
138
+
139
+ This document uses:
140
+ - Headers for navigation
141
+ - Tables for comparison
142
+ - Code blocks for examples
143
+ - Short paragraphs for scannability
144
+
145
+ **Format is interface.**
146
+
147
+ ---
148
+
149
+ *Captured: 2026-01-28*
@@ -0,0 +1,276 @@
1
+ # LLM Code Navigation: The Case for Frontmatter
2
+
3
+ **The 94% Solution**
4
+
5
+ ---
6
+
7
+ ## Abstract
8
+
9
+ LLMs waste tokens reading entire files to understand what they do. By adding structured metadata (frontmatter) to the first 10 lines of source files, LLMs can triage and navigate codebases with **88-97% fewer tokens** while maintaining equivalent accuracy.
10
+
11
+ This is not a developer tool. It's infrastructure for LLM cost reduction.
12
+
13
+ ---
14
+
15
+ ## The Problem
16
+
17
+ When an LLM explores code, it follows a predictable pattern:
18
+
19
+ ```
20
+ grep "thing" → find files → read files → understand code
21
+ ```
22
+
23
+ The bottleneck is step 3. For every grep match, the LLM reads the entire file to understand:
24
+ - What does this file export?
25
+ - What does it depend on?
26
+ - Is this the file I'm looking for?
27
+
28
+ **Example:**
29
+ ```
30
+ grep "validateUser" → 10 matches
31
+
32
+ Read file 1 (400 lines) → wrong file
33
+ Read file 2 (600 lines) → wrong file
34
+ Read file 3 (200 lines) → this is it
35
+
36
+ Total: 1,200 lines read to find the right context
37
+ ```
38
+
39
+ Multiply this across every task, every file, every codebase. Tokens add up. Costs add up.
40
+
41
+ ---
42
+
43
+ ## The Solution
44
+
45
+ **Frontmatter:** structured metadata in the first 10 lines of every source file.
46
+
47
+ ```typescript
48
+ // ---
49
+ // file: ./src/auth/session.ts
50
+ // exports: [validateUser, createSession, destroySession]
51
+ // imports: [crypto, ./database]
52
+ // dependencies: [./types, ./config]
53
+ // loc: 234
54
+ // modified: 2026-01-27
55
+ // ---
56
+
57
+ import { createHash } from 'crypto';
58
+ // ... rest of file
59
+ ```
60
+
61
+ **The new workflow:**
62
+ ```
63
+ grep "validateUser" → 10 matches
64
+
65
+ Read first 15 lines of file 1 → exports: [UserService] → skip
66
+ Read first 15 lines of file 2 → exports: [AuthMiddleware] → skip
67
+ Read first 15 lines of file 3 → exports: [validateUser] → match!
68
+ Read full file 3 (200 lines)
69
+
70
+ Total: 245 lines read
71
+ Savings: 80%
72
+ ```
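
The line counts in the two workflows can be reproduced with a short calculation. This is a sketch under the same assumptions as the example (a 15-line peek per file, then a full read of any matching file); the `Candidate` type and function names are illustrative only.

```typescript
// Hypothetical accounting of the peek-first workflow.
interface Candidate {
  loc: number; // total lines in the file
  exports: string[]; // exports listed in its frontmatter
}

const PEEK_LINES = 15;

// Lines read when every file is peeked and only matching files are read in full.
function linesReadWithPeek(files: Candidate[], symbol: string): number {
  return files.reduce((total, file) => {
    const full = file.exports.includes(symbol) ? file.loc : 0;
    return total + PEEK_LINES + full;
  }, 0);
}

// Lines read when every grep match is read in full.
function linesReadWithoutPeek(files: Candidate[]): number {
  return files.reduce((total, file) => total + file.loc, 0);
}

const grepMatches: Candidate[] = [
  { loc: 400, exports: ["UserService"] },
  { loc: 600, exports: ["AuthMiddleware"] },
  { loc: 200, exports: ["validateUser"] },
];

const withPeek = linesReadWithPeek(grepMatches, "validateUser"); // 3 × 15 + 200 = 245
const withoutPeek = linesReadWithoutPeek(grepMatches); // 400 + 600 + 200 = 1200
const savings = 1 - withPeek / withoutPeek; // about 0.80
```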
73
+
74
+ ---
75
+
76
+ ## The Evidence
77
+
78
+ Controlled experiments comparing LLM code navigation with and without frontmatter:
79
+
80
+ ### Experiment Results
81
+
82
+ | Task | Control (no FMM) | FMM | Reduction |
83
+ |------|------------------|-----|-----------|
84
+ | Review recent changes | 1,824 lines | 65 lines | **96%** |
85
+ | Refactor impact analysis | 2,800 lines | 345 lines | **88%** |
86
+ | Architecture exploration | 7,135 lines | 180 lines | **97.5%** |
87
+
88
+ **Test environment:** 244-file TypeScript codebase (81,732 total lines)
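
The reduction column follows directly from the line counts. The sketch below recomputes it and also takes the unweighted mean of the three tasks, which comes out near 94%; that this mean is the source of the headline figure is an inference, not something the experiment notes state.

```typescript
// Lines read per task, copied from the table: control vs. with FMM.
const experiments = [
  { task: "Review recent changes", control: 1824, fmm: 65 },
  { task: "Refactor impact analysis", control: 2800, fmm: 345 },
  { task: "Architecture exploration", control: 7135, fmm: 180 },
];

// Fractional reduction in lines read for one task.
function reduction(control: number, fmm: number): number {
  return 1 - fmm / control;
}

const reductions = experiments.map((e) => reduction(e.control, e.fmm));

// Unweighted mean across the three tasks.
const meanReduction = reductions.reduce((a, b) => a + b, 0) / reductions.length;
```

Note that the table counts lines, so reading the result as a token reduction assumes tokens scale roughly with lines read.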
89
+
90
+ ### Quality Comparison
91
+
92
+ | Metric | Control | FMM |
93
+ |--------|---------|-----|
94
+ | Files correctly identified | ✓ | ✓ |
95
+ | Architecture diagrams produced | ✓ | ✓ |
96
+ | Dependencies mapped | ✓ | ✓ |
97
+ | Accuracy | Equivalent | Equivalent |
98
+
99
+ **Same output. 94% fewer tokens.**
100
+
101
+ ---
102
+
103
+ ## Why It Works
104
+
105
+ Frontmatter answers the three questions LLMs ask about every file:
106
+
107
+ 1. **What does this file do?** → `exports: [...]`
108
+ 2. **What does it depend on?** → `imports: [...]`, `dependencies: [...]`
109
+ 3. **How big is it?** → `loc: 234`
110
+
111
+ With these answers in the first 15 lines, the LLM can triage without reading the full file.
112
+
113
+ ### The Triage Decision Tree
114
+
115
+ ```
116
+ Read frontmatter (15 lines)
117
+
118
+ ├── Exports match what I'm looking for?
119
+ │ ├── Yes → Read full file
120
+ │ └── No → Skip (saved 200+ lines)
121
+
122
+ └── Dependencies relevant to my task?
123
+ ├── Yes → Read full file
124
+ └── No → Skip (saved 200+ lines)
125
+ ```
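
One way to read the tree is "read the full file if either check passes, otherwise skip". The sketch below encodes that interpretation; the `Task` shape and the `triage` function are hypothetical, and the frontmatter fields follow the format used in this document.

```typescript
interface Frontmatter {
  exports: string[];
  imports: string[];
  dependencies: string[];
  loc: number;
}

// Hypothetical description of what the current task is looking for.
interface Task {
  wantedExports: string[];
  relevantDependencies: string[];
}

type Decision = "read-full-file" | "skip";

function triage(fm: Frontmatter, task: Task): Decision {
  // Branch 1: do the exports match what the task is looking for?
  const exportsMatch = fm.exports.some((name) => task.wantedExports.includes(name));
  if (exportsMatch) return "read-full-file";

  // Branch 2: are the file's imports or dependencies relevant to the task?
  const deps = [...fm.imports, ...fm.dependencies];
  const depsRelevant = deps.some((d) => task.relevantDependencies.includes(d));
  return depsRelevant ? "read-full-file" : "skip";
}
```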
126
+
127
+ ---
128
+
129
+ ## The Economics
130
+
131
+ ### Per-Request Savings
132
+
133
+ | Scenario | Without FMM | With FMM | Savings |
134
+ |----------|-------------|----------|---------|
135
+ | Simple lookup | 500 lines | 65 lines | 87% |
136
+ | Refactoring task | 3,000 lines | 400 lines | 87% |
137
+ | Architecture review | 7,000 lines | 200 lines | 97% |
138
+
139
+ ### At Scale
140
+
141
+ Assuming:
142
+ - 1,000 LLM coding requests/day
143
+ - Average 2,000 lines read per request
144
+ - $0.01 per 1K tokens (input)
145
+ - ~4 chars per token
146
+
147
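+ **Without FMM:** 2,000 lines × 1,000 requests = 2M lines/day. The ~$5,000/day estimate corresponds to 2B characters/day (500M tokens at ~4 chars per token), so it scales with the average line length assumed.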
148
+ **With FMM (90% reduction):** ~$500/day
149
+
150
+ **Annual savings: ~$1.6M** (per organization at this scale)
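
These estimates can be checked with a small calculator. Converting lines to tokens needs an average line length, which the assumptions list does not give, so `charsPerLine` below is a hypothetical parameter and the 40-character value is only an illustrative guess.

```typescript
interface CostAssumptions {
  requestsPerDay: number;
  linesPerRequest: number;
  charsPerLine: number; // NOT stated in the document; must be supplied
  charsPerToken: number;
  usdPer1kTokens: number;
}

// Daily input-token cost in USD under the given assumptions.
function dailyCostUsd(a: CostAssumptions): number {
  const chars = a.requestsPerDay * a.linesPerRequest * a.charsPerLine;
  const tokens = chars / a.charsPerToken;
  return (tokens / 1000) * a.usdPer1kTokens;
}

// Annual savings for a given fractional reduction in lines read.
function annualSavingsUsd(dailyBefore: number, reduction: number): number {
  return dailyBefore * reduction * 365;
}

const base = {
  requestsPerDay: 1000,
  linesPerRequest: 2000,
  charsPerToken: 4,
  usdPer1kTokens: 0.01,
};

const costAt40 = dailyCostUsd({ ...base, charsPerLine: 40 }); // about $200/day
const costAt1000 = dailyCostUsd({ ...base, charsPerLine: 1000 }); // about $5,000/day
const annualAt1000 = annualSavingsUsd(costAt1000, 0.9); // about $1.64M/year
```

Under these inputs the ~$5,000/day and ~$1.6M/year figures are recovered only at roughly 1,000 characters per line; shorter lines scale the dollar amounts down proportionally, while the 90% relative saving is unchanged.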
151
+
152
+ ---
153
+
154
+ ## The Crossover Point
155
+
156
+ Frontmatter has overhead: ~8-10 lines per file for the metadata block.
157
+
158
+ **FMM wins when:** `files_skipped × avg_file_size > frontmatter_overhead`
159
+
160
+ | Codebase | Files | Avg LOC | Break-Even | FMM Value |
161
+ |----------|-------|---------|------------|-----------|
162
+ | Tiny | 4 | 30 | Skip 3+ files | Marginal |
163
+ | Small | 50 | 100 | Skip 5+ files | Positive |
164
+ | Medium | 200 | 200 | Skip 10+ files | Strong |
165
+ | Large | 500+ | 300+ | Skip 15+ files | Massive |
166
+
167
+ **Real codebases are medium-to-large. FMM wins by default.**
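
The break-even condition can be made concrete with a small sketch. It assumes one particular reading of `frontmatter_overhead`: about 10 extra lines for every file that is peeked. The function names are hypothetical, and the sketch is not intended to reproduce the thresholds in the table.

```typescript
// Assumed overhead per peeked file, from the "~8-10 lines per file" estimate.
const OVERHEAD_LINES = 10;

// Net lines saved: lines avoided by skipping files minus lines spent peeking.
function netLinesSaved(filesPeeked: number, filesSkipped: number, avgLoc: number): number {
  return filesSkipped * avgLoc - filesPeeked * OVERHEAD_LINES;
}

// FMM wins when skipping saves more lines than peeking costs.
function fmmWins(filesPeeked: number, filesSkipped: number, avgLoc: number): boolean {
  return netLinesSaved(filesPeeked, filesSkipped, avgLoc) > 0;
}

const mediumCase = netLinesSaved(10, 7, 200); // 7 × 200 - 10 × 10 = 1300
const tinyCase = netLinesSaved(4, 1, 30); // 1 × 30 - 4 × 10 = -10
```

Under this reading, larger average file sizes make each skipped file worth more, which is why the value grows with codebase size.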
168
+
169
+ ---
170
+
171
+ ## The Adoption Path
172
+
173
+ ### What Doesn't Work
174
+
175
+ - ❌ Manifest files (`.fmm/index.json`) - adds complexity
176
+ - ❌ Discovery mechanisms - overengineered
177
+ - ❌ CLAUDE.md hints - project-specific
178
+ - ❌ New developer tooling - adoption friction
179
+
180
+ ### What Works
181
+
182
+ The LLM workflow is:
183
+ ```
184
+ grep → find files → READ files → understand
185
+ ```
186
+
187
+ Frontmatter changes only the READ step:
188
+ ```
189
+ grep → find files → READ FIRST 15 LINES → decide → maybe read rest
190
+ ```
191
+
192
+ **The adoption path:**
193
+ 1. Codebases add frontmatter (`fmm generate src/`)
194
+ 2. LLM tools adopt "peek first" as default behavior
195
+
196
+ No new tools for developers. No discovery layers. Just a behavior change in how LLMs read files.
197
+
198
+ ---
199
+
200
+ ## The Thesis
201
+
202
+ **Frontmatter is infrastructure for LLM cost reduction.**
203
+
204
+ Every codebase with frontmatter = cheaper to work with.
205
+ Every LLM tool that peeks first = cheaper to run.
206
+
207
+ The more codebases have frontmatter, the more pressure on LLM tools to optimize for it. The more tools optimize, the more value codebases get from adding it.
208
+
209
+ **This is a coordination game with positive-sum economics.**
210
+
211
+ ---
212
+
213
+ ## Implementation
214
+
215
+ ### fmm (Frontmatter Matters)
216
+
217
+ CLI tool to generate and maintain frontmatter:
218
+
219
+ ```bash
220
+ # Add frontmatter to all TypeScript files
221
+ fmm generate src/
222
+
223
+ # Update existing frontmatter
224
+ fmm update src/
225
+
226
+ # Validate frontmatter is current (CI integration)
227
+ fmm validate src/
228
+ ```
229
+
230
+ **Supported languages:** TypeScript, JavaScript, Python, Rust, Go
231
+
232
+ **Performance:** ~1,000 files/second on M1 Mac
233
+
234
+ ### Frontmatter Format
235
+
236
+ ```typescript
237
+ // ---
238
+ // file: ./relative/path.ts
239
+ // exports: [namedExport1, namedExport2, DefaultExport]
240
+ // imports: [external-package, ./local-dep]
241
+ // dependencies: [./types, ./utils]
242
+ // loc: 234
243
+ // modified: 2026-01-27
244
+ // ---
245
+ ```
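
For illustration, a block in this format could be rendered from already-extracted metadata as follows. This is a sketch only: `FileMeta` and `renderFrontmatter` are hypothetical names, and nothing here describes how the fmm CLI is actually implemented.

```typescript
// Metadata for one source file, matching the fields in the format above.
interface FileMeta {
  file: string;
  exports: string[];
  imports: string[];
  dependencies: string[];
  loc: number;
  modified: string; // ISO date, e.g. "2026-01-27"
}

// Hypothetical renderer: turns metadata into the comment block.
function renderFrontmatter(meta: FileMeta): string {
  const list = (items: string[]): string => `[${items.join(", ")}]`;
  return [
    "// ---",
    `// file: ${meta.file}`,
    `// exports: ${list(meta.exports)}`,
    `// imports: ${list(meta.imports)}`,
    `// dependencies: ${list(meta.dependencies)}`,
    `// loc: ${meta.loc}`,
    `// modified: ${meta.modified}`,
    "// ---",
  ].join("\n");
}

const sessionBlock = renderFrontmatter({
  file: "./src/auth/session.ts",
  exports: ["validateUser", "createSession", "destroySession"],
  imports: ["crypto", "./database"],
  dependencies: ["./types", "./config"],
  loc: 234,
  modified: "2026-01-27",
});
```

The rendered block is 8 lines long, which fits within the first-15-lines window the peek workflow relies on.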
246
+
247
+ ### Integration Points
248
+
249
+ - **Pre-commit hooks:** Ensure frontmatter stays in sync
250
+ - **CI validation:** `fmm validate` fails if frontmatter is stale
251
+ - **Editor plugins:** Auto-update on save
252
+
253
+ ---
254
+
255
+ ## Conclusion
256
+
257
+ LLMs reading code is expensive. Frontmatter makes it cheap.
258
+
259
+ The evidence is clear: **88-97% token reduction** on real tasks, with equivalent accuracy.
260
+
261
+ The path is simple: add frontmatter to codebases and change LLM read behavior to peek first.
262
+
263
+ The economics do the rest.
264
+
265
+ ---
266
+
267
+ ## References
268
+
269
+ - Experiment data: `fmm/research/exp13/`
270
+ - fmm CLI: `github.com/mdcontext/fmm`
271
+ - mdcontext: `github.com/mdcontext/mdcontext`
272
+
273
+ ---
274
+
275
+ *Research conducted January 2026*
276
+ *Stuart Robinson & Claude Opus 4.5*