mdcontext 0.0.1 → 0.2.0
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/BACKLOG.md +338 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +434 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +88 -0
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +803 -0
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1629 -0
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +5458 -0
- package/dist/index.d.ts +653 -0
- package/dist/index.js +79 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +472 -0
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +625 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/CONFIG.md +1123 -0
- package/docs/DESIGN.md +439 -0
- package/docs/ERRORS.md +383 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/summarization.md +320 -0
- package/docs/test-links.md +9 -0
- package/justfile +40 -0
- package/package.json +74 -9
- package/pnpm-workspace.yaml +5 -0
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +58 -0
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +627 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +285 -0
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +480 -0
- package/src/cli/commands/index.ts +16 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +1281 -0
- package/src/cli/commands/stats.ts +149 -0
- package/src/cli/commands/tree.ts +128 -0
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +341 -0
- package/src/cli/help.ts +588 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +435 -0
- package/src/cli/options.ts +41 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +259 -0
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +113 -0
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +10 -0
- package/src/embeddings/openai-provider.ts +414 -0
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +1270 -0
- package/src/embeddings/types.ts +359 -0
- package/src/embeddings/vector-store.ts +708 -0
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +684 -0
- package/src/index/storage.ts +260 -0
- package/src/index/types.ts +147 -0
- package/src/index/watcher.ts +189 -0
- package/src/index.ts +30 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +612 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +394 -0
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +392 -0
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +280 -0
- package/src/search/searcher.ts +724 -0
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +597 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +16 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,149 @@
# Comments Are Skipped: The Format Problem

**Date:** 2026-01-28
**Status:** Critical insight - updates the thesis

---

## The Original Thesis

Frontmatter in file headers → LLMs read first 20 lines → 94% token reduction.

## The Problem

LLMs already read the first 20 lines.

**They skip the comments.**

```typescript
// ---
// file: ./auth.ts
// exports: [validateUser]
// ---
```

LLM cognition: "Comment block → noise → skip → find code"

**Frontmatter as comments is invisible.**

---

## The Evidence

When I (Claude) read a file with frontmatter without being told to use it:

1. Registered lines 1-8 as "comment header decoration"
2. Skipped to line 10+ looking for actual code
3. Never used the exports/imports metadata
4. Read the full file to understand what it does

**The data was there. I ignored it.**

---

## The Fix: Format, Not Behavior

The problem isn't LLM read behavior. The problem is the format.

### Comments = Skipped

```typescript
// ---
// exports: [validateUser]
// ---
```

Invisible. Dead on arrival.

### Self-Announcing Header = Visible

```typescript
// --- FMM ---
// exports: [validateUser]
// ---
```

LLM sees `FMM` → pattern match → "this is metadata"

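
A minimal sketch of how a tool might recognize the self-announcing header. Only the `// --- FMM ---` marker and the `exports: [...]` line come from the example above; the function name, the return shape, and the value syntax are assumptions made for illustration.

```typescript
// Hypothetical parser: returns the metadata only when the first line carries
// the FMM marker, otherwise null (treated as an ordinary comment block).
type FmmMeta = Record<string, string[]>;

function parseFmmHeader(source: string): FmmMeta | null {
  const lines = source.split("\n");
  if (lines[0]?.trim() !== "// --- FMM ---") return null;

  const meta: FmmMeta = {};
  for (const line of lines.slice(1)) {
    const text = line.trim();
    if (text === "// ---") return meta; // closing delimiter reached
    // Accepts `// key: [a, b]` or `// key: value`.
    const match = /^\/\/\s*(\w+):\s*(.*)$/.exec(text);
    if (!match) return null; // malformed header: report "no metadata"
    const key = match[1] ?? "";
    const raw = (match[2] ?? "").replace(/^\[|\]$/g, "");
    meta[key] = raw.split(",").map((item) => item.trim()).filter(Boolean);
  }
  return null; // header was never closed
}
```

A plain `// ---` block returns `null` here, which mirrors the point above: without the marker there is nothing to tell metadata apart from decoration.
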

### Code = Parsed

```typescript
export const __meta = {
  exports: ["validateUser"],
  imports: ["crypto"],
  loc: 234
};
```

LLM reads this as code. It's visible.

### JSON = Queryable

`.fmm/index.json`:

```json
{
  "src/auth.ts": {
    "exports": ["validateUser"],
    "imports": ["crypto"],
    "loc": 234
  }
}
```

LLM queries this before reading files. No comments to skip.

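
One way to picture "querying before reading" is a lookup over the manifest. The sketch below assumes the manifest has already been loaded into memory and has the shape shown above; `findExporters` and the second manifest entry are hypothetical.

```typescript
// Shape of one manifest entry, mirroring the `.fmm/index.json` example.
interface ManifestEntry {
  exports: string[];
  imports: string[];
  loc: number;
}
type Manifest = Record<string, ManifestEntry>;

// Hypothetical helper: list the files that export a given symbol.
function findExporters(manifest: Manifest, symbol: string): string[] {
  return Object.entries(manifest)
    .filter(([, entry]) => entry.exports.includes(symbol))
    .map(([path]) => path);
}

const exampleManifest: Manifest = {
  "src/auth.ts": { exports: ["validateUser"], imports: ["crypto"], loc: 234 },
  // Invented entry, for illustration only.
  "src/users.ts": { exports: ["UserService"], imports: ["./database"], loc: 120 },
};

console.log(findExporters(exampleManifest, "validateUser"));
```

Only the paths returned by the lookup need to be opened, so no source file is read just to find out what it exports.
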

---

## The Updated Model

| Format | Human Readable | LLM Visible | Recommendation |
|--------|----------------|-------------|----------------|
| Comment frontmatter | Yes | **No** | Keep for humans |
| Code export | Yes | Yes | Bundler issues |
| Manifest JSON | No | **Yes** | Add for LLMs |

**Generate both:**
- Inline comments → human readability
- Manifest JSON → LLM queryability

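
"Generate both" could mean rendering the two formats from a single metadata record, so they cannot drift apart. This is a sketch under that assumption; the field names follow the examples in this document, while `FileMeta`, `renderCommentHeader`, and `renderManifest` are hypothetical names.

```typescript
// Single source of truth for one file's metadata.
interface FileMeta {
  file: string;
  exports: string[];
  imports: string[];
  loc: number;
}

// Inline comment block, kept for human readers.
function renderCommentHeader(meta: FileMeta): string {
  return [
    "// ---",
    `// file: ${meta.file}`,
    `// exports: [${meta.exports.join(", ")}]`,
    `// imports: [${meta.imports.join(", ")}]`,
    `// loc: ${meta.loc}`,
    "// ---",
  ].join("\n");
}

// Manifest JSON keyed by file path, for LLM queries.
function renderManifest(metas: FileMeta[]): string {
  const manifest: Record<string, Omit<FileMeta, "file">> = {};
  for (const { file, ...rest } of metas) {
    manifest[file] = rest;
  }
  return JSON.stringify(manifest, null, 2);
}

const authMeta: FileMeta = {
  file: "src/auth.ts",
  exports: ["validateUser"],
  imports: ["crypto"],
  loc: 234,
};

console.log(renderCommentHeader(authMeta));
console.log(renderManifest([authMeta]));
```
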

---

## Implications for mdcontext

mdcontext is about giving LLMs exactly what they need.

**Lesson:** Format matters as much as content.

- Markdown headers → LLMs parse these (structured)
- Markdown prose → LLMs read this (content)
- Code comments → LLMs skip this (noise)

When designing LLM-readable formats:
1. **Avoid comment syntax** - it signals "ignore me"
2. **Use structured data** - JSON, YAML, code
3. **Put it where LLMs look** - separate files, explicit markers

---

## The Meta Point

We're generating markdown research docs.

mdcontext exists to make markdown LLM-readable.

The insight applies recursively:
- **Structure** your markdown (headers, lists) → LLMs parse it
- **Avoid** wall-of-text prose → LLMs skim it
- **Use** explicit markers for key info → LLMs find it

This document uses:
- Headers for navigation
- Tables for comparison
- Code blocks for examples
- Short paragraphs for scannability

**Format is interface.**

---

*Captured: 2026-01-28*

@@ -0,0 +1,276 @@
# LLM Code Navigation: The Case for Frontmatter

**The 94% Solution**

---

## Abstract

LLMs waste tokens reading entire files to understand what they do. With structured metadata (frontmatter) in the first 10 lines of each source file, LLMs can triage and navigate codebases with **88-97% fewer tokens** while maintaining equivalent accuracy.

This is not a developer tool. It's infrastructure for LLM cost reduction.

---

## The Problem

When an LLM explores code, it follows a predictable pattern:

```
grep "thing" → find files → read files → understand code
```

The bottleneck is step 3. For every grep match, the LLM reads the entire file to understand:
- What does this file export?
- What does it depend on?
- Is this the file I'm looking for?

**Example:**
```
grep "validateUser" → 10 matches

Read file 1 (400 lines) → wrong file
Read file 2 (600 lines) → wrong file
Read file 3 (200 lines) → this is it

Total: 1,200 lines read to find the right context
```

Multiply this across every task, every file, every codebase. Tokens add up. Costs add up.

---

## The Solution

**Frontmatter:** structured metadata in the first 10 lines of every source file.

```typescript
// ---
// file: ./src/auth/session.ts
// exports: [validateUser, createSession, destroySession]
// imports: [crypto, ./database]
// dependencies: [./types, ./config]
// loc: 234
// modified: 2026-01-27
// ---

import { createHash } from 'crypto';
// ... rest of file
```

**The new workflow:**
```
grep "validateUser" → 10 matches

Read first 15 lines of file 1 → exports: [UserService] → skip
Read first 15 lines of file 2 → exports: [AuthMiddleware] → skip
Read first 15 lines of file 3 → exports: [validateUser] → match!
Read full file 3 (200 lines)

Total: 245 lines read
Savings: 80%
```

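
The totals in the two workflows can be checked with a few lines of arithmetic. The sketch below uses the line counts from the examples above (400, 600, and 200 lines, 15-line peeks); the variable names are illustrative only.

```typescript
// Three candidate files from the example; the third one is the match.
const fileLengths = [400, 600, 200];
const PEEK_LINES = 15;
const matchIndex = 2;

// Baseline: every candidate is read in full until the match is found.
const baselineLines = fileLengths.reduce((sum, n) => sum + n, 0);

// Peek-first: read the header of every candidate, then the match in full.
const peekFirstLines =
  fileLengths.length * PEEK_LINES + (fileLengths[matchIndex] ?? 0);

const savings = 1 - peekFirstLines / baselineLines;
console.log(baselineLines, peekFirstLines, `${Math.round(savings * 100)}%`);
// prints: 1200 245 80%
```

The 245 figure counts the 15-line peek of the matching file in addition to its full read, which is how the example above arrives at the same total.
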

---

## The Evidence

Controlled experiments comparing LLM code navigation with and without frontmatter:

### Experiment Results

| Task | Control (no FMM) | FMM | Reduction |
|------|------------------|-----|-----------|
| Review recent changes | 1,824 lines | 65 lines | **96%** |
| Refactor impact analysis | 2,800 lines | 345 lines | **88%** |
| Architecture exploration | 7,135 lines | 180 lines | **97.5%** |

**Test environment:** 244-file TypeScript codebase (81,732 total lines)

### Quality Comparison

| Metric | Control | FMM |
|--------|---------|-----|
| Files correctly identified | ✓ | ✓ |
| Architecture diagrams produced | ✓ | ✓ |
| Dependencies mapped | ✓ | ✓ |
| Accuracy | Equivalent | Equivalent |

**Same output. 94% fewer tokens.**

---

## Why It Works

Frontmatter answers the three questions LLMs ask about every file:

1. **What does this file do?** → `exports: [...]`
2. **What does it depend on?** → `imports: [...]`, `dependencies: [...]`
3. **How big is it?** → `loc: 234`

With these answers in the first 15 lines, the LLM can triage without reading the full file.

### The Triage Decision Tree

```
Read frontmatter (15 lines)
│
├── Exports match what I'm looking for?
│   ├── Yes → Read full file
│   └── No → Skip (saved 200+ lines)
│
└── Dependencies relevant to my task?
    ├── Yes → Read full file
    └── No → Skip (saved 200+ lines)
```

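
The decision tree can be read as a small predicate over parsed frontmatter. The sketch below takes one interpretation: read the file if either branch answers "yes", skip otherwise. The types and the `triageFile` name are hypothetical; only the `exports` and `dependencies` fields come from the frontmatter format.

```typescript
interface Frontmatter {
  exports: string[];
  dependencies: string[];
}

interface Task {
  wantedExports: string[];        // symbols the task is looking for
  relevantDependencies: string[]; // modules the task cares about
}

type Decision = "read-full-file" | "skip";

function triageFile(fm: Frontmatter, task: Task): Decision {
  // Branch 1: do the exports match what the task is looking for?
  const exportsMatch = fm.exports.some((name) =>
    task.wantedExports.includes(name),
  );
  if (exportsMatch) return "read-full-file";

  // Branch 2: are the dependencies relevant to the task?
  const depsMatch = fm.dependencies.some((dep) =>
    task.relevantDependencies.includes(dep),
  );
  return depsMatch ? "read-full-file" : "skip";
}
```

Every `"skip"` is a file whose body is never loaded, which is where the "saved 200+ lines" in the tree comes from.
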

---

## The Economics

### Per-Request Savings

| Scenario | Without FMM | With FMM | Savings |
|----------|-------------|----------|---------|
| Simple lookup | 500 lines | 65 lines | 87% |
| Refactoring task | 3,000 lines | 400 lines | 87% |
| Architecture review | 7,000 lines | 200 lines | 97% |

### At Scale

Assuming:
- 1,000 LLM coding requests/day
- Average 2,000 lines read per request
- $0.01 per 1K tokens (input)
- ~4 chars per token

**Without FMM:** 2,000 lines × 1,000 requests = 2M lines/day, estimated at ~$5,000/day
**With FMM (90% reduction):** ~$500/day

**Annual savings: ~$1.6M** (per organization at this scale)

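
The cost estimate can be written out as a formula so each assumption is visible. The sketch below uses the four assumptions listed above plus one that the text does not state, the average line length (`charsPerLine`); the value of 40 is an illustrative guess, not a measured number.

```typescript
interface CostAssumptions {
  requestsPerDay: number;
  linesPerRequest: number;
  charsPerLine: number; // not given in the text; hypothetical parameter
  charsPerToken: number;
  dollarsPer1kTokens: number;
}

function dailyCost(a: CostAssumptions): number {
  const linesPerDay = a.requestsPerDay * a.linesPerRequest;
  const tokensPerDay = (linesPerDay * a.charsPerLine) / a.charsPerToken;
  return (tokensPerDay / 1000) * a.dollarsPer1kTokens;
}

const baseline: CostAssumptions = {
  requestsPerDay: 1000,
  linesPerRequest: 2000,
  charsPerLine: 40, // illustrative guess only
  charsPerToken: 4,
  dollarsPer1kTokens: 0.01,
};

const costWithout = dailyCost(baseline);
const costWith = costWithout * (1 - 0.9); // 90% reduction, as assumed above
console.log({
  costWithout,
  costWith,
  annualSavings: (costWithout - costWith) * 365,
});
```

The result scales linearly with `charsPerLine`, so the dollar figures are only as good as that input. At the stated price, ~$5,000/day over 2M lines corresponds to roughly 250 tokens per line.
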

---

## The Crossover Point

Frontmatter has overhead: ~8-10 lines per file for the metadata block.

**FMM wins when:** `files_skipped × avg_file_size > frontmatter_overhead`

| Codebase | Files | Avg LOC | Break-Even | FMM Value |
|----------|-------|---------|------------|-----------|
| Tiny | 4 | 30 | Skip 3+ files | Marginal |
| Small | 50 | 100 | Skip 5+ files | Positive |
| Medium | 200 | 200 | Skip 10+ files | Strong |
| Large | 500+ | 300+ | Skip 15+ files | Massive |

**Real codebases are medium-to-large. FMM wins by default.**

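
The break-even inequality can be sketched as a predicate. The text does not define `frontmatter_overhead` precisely, so the model below assumes it is the number of files examined times the lines peeked per file (15 by default); `fmmWins` and its parameters are hypothetical.

```typescript
// Hypothetical model of: files_skipped × avg_file_size > frontmatter_overhead
// Assumption: overhead = filesExamined × peekLines.
function fmmWins(
  filesExamined: number,
  filesSkipped: number,
  avgFileLoc: number,
  peekLines = 15,
): boolean {
  const linesSaved = filesSkipped * avgFileLoc;
  const overhead = filesExamined * peekLines;
  return linesSaved > overhead;
}

console.log(fmmWins(4, 2, 30));   // false: 60 lines saved vs 60 overhead
console.log(fmmWins(4, 3, 30));   // true: 90 lines saved vs 60 overhead
console.log(fmmWins(10, 9, 200)); // true: 1,800 lines saved vs 150 overhead
```

Under these assumptions a four-file, 30-LOC codebase only comes out ahead once three files are skipped, while at 200 LOC per file a single skipped file already covers the cost of peeking at ten.
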

---

## The Adoption Path

### What Doesn't Work

- ❌ Manifest files (`.fmm/index.json`) - adds complexity
- ❌ Discovery mechanisms - overengineered
- ❌ CLAUDE.md hints - project-specific
- ❌ New developer tooling - adoption friction

### What Works

The LLM workflow is:
```
grep → find files → READ files → understand
```

Frontmatter changes only the READ step:
```
grep → find files → READ FIRST 15 LINES → decide → maybe read rest
```

**The adoption path:**
1. Codebases add frontmatter (`fmm generate src/`)
2. LLM tools adopt "peek first" as default behavior

No new tools for developers. No discovery layers. Just a behavior change in how LLMs read files.

---

## The Thesis

**Frontmatter is infrastructure for LLM cost reduction.**

Every codebase with frontmatter = cheaper to work with.
Every LLM tool that peeks first = cheaper to run.

The more codebases have frontmatter, the more pressure on LLM tools to optimize for it. The more tools optimize, the more value codebases get from adding it.

**This is a coordination game with positive-sum economics.**

---

## Implementation

### fmm (Frontmatter Matters)

CLI tool to generate and maintain frontmatter:

```bash
# Add frontmatter to all TypeScript files
fmm generate src/

# Update existing frontmatter
fmm update src/

# Validate frontmatter is current (CI integration)
fmm validate src/
```

**Supported languages:** TypeScript, JavaScript, Python, Rust, Go

**Performance:** ~1,000 files/second on M1 Mac

### Frontmatter Format

```typescript
// ---
// file: ./relative/path.ts
// exports: [namedExport1, namedExport2, DefaultExport]
// imports: [external-package, ./local-dep]
// dependencies: [./types, ./utils]
// loc: 234
// modified: 2026-01-27
// ---
```

### Integration Points

- **Pre-commit hooks:** Ensure frontmatter stays in sync
- **CI validation:** `fmm validate` fails if frontmatter is stale
- **Editor plugins:** Auto-update on save

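
To make "stale" concrete, here is one possible check: compare the declared `loc` with the number of lines that follow the closing `// ---`. This is a hypothetical sketch, not how `fmm validate` works; the text does not say how `loc` is counted or which fields the tool verifies.

```typescript
// Hypothetical staleness check for the `loc` field only.
// Assumption: `loc` counts the lines after the closing `// ---` delimiter.
function isLocStale(source: string): boolean {
  const lines = source.split("\n");
  if (lines[0]?.trim() !== "// ---") return true; // no header at all

  const close = lines.findIndex(
    (line, i) => i > 0 && line.trim() === "// ---",
  );
  if (close === -1) return true; // header never closed

  const declared = lines
    .slice(1, close)
    .map((line) => /^\/\/\s*loc:\s*(\d+)\s*$/.exec(line.trim()))
    .find((m) => m !== null);
  if (!declared) return true; // no loc field

  const actual = lines.slice(close + 1).length;
  return Number(declared[1]) !== actual;
}
```

A pre-commit hook or CI job could run a check of this kind over changed files and fail when it returns `true`.
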

---

## Conclusion

LLMs reading code is expensive. Frontmatter makes it cheap.

The evidence is clear: **88-97% token reduction** on real tasks, with equivalent accuracy.

The path is simple: add frontmatter to codebases, change LLM read behavior to peek first.

The economics do the rest.

---

## References

- Experiment data: `fmm/research/exp13/`
- fmm CLI: `github.com/mdcontext/fmm`
- mdcontext: `github.com/mdcontext/mdcontext`

---

*Research conducted January 2026*
*Stuart Robinson & Claude Opus 4.5*