mdcontext 0.1.0 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/config.json +9 -9
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +206 -3
- package/biome.json +1 -1
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +85 -89
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +718 -657
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1533 -1423
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.js +4072 -629
- package/dist/index.d.ts +420 -33
- package/dist/index.js +8 -15
- package/dist/mcp/server.js +103 -7
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +44 -5
- package/docs/020-current-implementation.md +8 -8
- package/docs/021-DOGFOODING-FINDINGS.md +1 -1
- package/docs/CONFIG.md +1123 -0
- package/docs/ERRORS.md +383 -0
- package/docs/summarization.md +320 -0
- package/justfile +40 -0
- package/package.json +39 -33
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +32 -37
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +2 -2
- package/src/cli/cli.test.ts +230 -33
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +97 -9
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +210 -30
- package/src/cli/commands/index.ts +3 -0
- package/src/cli/commands/search.ts +894 -64
- package/src/cli/commands/stats.ts +3 -0
- package/src/cli/commands/tree.ts +26 -5
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +66 -0
- package/src/cli/help.ts +209 -7
- package/src/cli/main.ts +348 -58
- package/src/cli/options.ts +10 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/utils.ts +150 -17
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/types.ts +6 -33
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +2 -0
- package/src/embeddings/openai-provider.ts +332 -83
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +780 -93
- package/src/embeddings/types.ts +293 -16
- package/src/embeddings/vector-store.ts +486 -77
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/indexer.ts +286 -48
- package/src/index/storage.ts +94 -30
- package/src/index/types.ts +40 -2
- package/src/index/watcher.ts +67 -9
- package/src/index.ts +22 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +135 -6
- package/src/parser/parser.ts +18 -19
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +125 -3
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/searcher.test.ts +99 -1
- package/src/search/searcher.ts +189 -67
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/summarizer.ts +104 -35
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +4 -4
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +14 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/vitest.config.ts +1 -6
- package/AGENTS.md +0 -46
- package/tests/fixtures/cli/.mdcontext/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/vectors.meta.json +0 -1264
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
# Threshold Validation Report
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Validation confirms that lowering the default similarity threshold from 0.5 to 0.35 (ALP-208) **fixes single-word query failures** without regressing multi-word query performance.
|
|
6
|
+
|
|
7
|
+
## Test Environment
|
|
8
|
+
|
|
9
|
+
- **Test Corpus**: `src/__tests__/fixtures/semantic-search/multi-word-corpus/`
|
|
10
|
+
- **Documents**: 6 markdown files (failure-automation, job-context, error-handling, configuration-management, distributed-systems, process-orchestration)
|
|
11
|
+
- **Sections**: 52 embedded vectors
|
|
12
|
+
- **Date**: 2026-01-26
|
|
13
|
+
|
|
14
|
+
## Before/After Comparison
|
|
15
|
+
|
|
16
|
+
### Single-Word Queries
|
|
17
|
+
|
|
18
|
+
| Query | Before (0.5) | After (0.35) | Top Match | Top Score |
|-------|-------------|--------------|-----------|-----------|
| "failure" | 0 results | **6 results** | failure-automation.md: Failure Isolation | 39.0% |
| "error" | 0 results | **7 results** | error-handling.md: Programming Errors | 49.1% |
| "automation" | 0 results | **10 results** | failure-automation.md: Overview | 44.9% |
| "context" | 0 results | **10 results** | job-context.md: What is Job Context? | 48.1% |
|
|
24
|
+
|
|
25
|
+
**Improvement**: All 4 single-word test queries (100%) now return results at the default threshold, where previously none did.
|
|
26
|
+
|
|
27
|
+
### Multi-Word Queries (Regression Check)
|
|
28
|
+
|
|
29
|
+
| Query | Before (0.5) | After (0.35) | Top Match | Top Score |
|-------|-------------|--------------|-----------|-----------|
| "failure automation" | 7 results | 10 results | failure-automation.md: Best Practices | 61.5% |
| "job context" | 4 results | 7 results | job-context.md: What is Job Context? | 60.4% |
| "error handling" | 7 results | 10 results | error-handling.md: Introduction | 63.6% |
| "configuration management" | 8 results | 10 results | configuration-management.md: Overview | 69.5% |
| "distributed systems" | 4 results | 10 results | distributed-systems.md: What Are... | 60.9% |
| "process orchestration" | 8 results | 10 results | process-orchestration.md: Introduction | 67.9% |
|
|
37
|
+
|
|
38
|
+
**Finding**: No regression. Multi-word queries return more results than before (expected, since the threshold is lower), and their top matches and top scores are unchanged.
|
|
39
|
+
|
|
40
|
+
## Success Criteria Validation
|
|
41
|
+
|
|
42
|
+
- [x] **Single-word queries return results at default threshold** - All 4 test queries now return 6-10 results
|
|
43
|
+
- [x] **Multi-word queries work as before (no regression)** - All 6 queries return results with same top matches
|
|
44
|
+
- [x] **Quantitative improvement documented** - See tables above
|
|
45
|
+
|
|
46
|
+
## Below-Threshold Feedback (ALP-209)
|
|
47
|
+
|
|
48
|
+
The new feedback feature correctly reports results below threshold:
|
|
49
|
+
|
|
50
|
+
```json
{
  "results": [...6 results...],
  "belowThresholdCount": 14,
  "belowThresholdHighest": 0.349
}
```
|
|
57
|
+
|
|
58
|
+
This helps users understand that more content exists if they lower the threshold.
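
The feedback fields above can be illustrated with a minimal sketch. It assumes candidates arrive as scored sections and are filtered after the vector search; the `ScoredSection` type and the `filterWithFeedback` helper are hypothetical names introduced for illustration, and only the `belowThresholdCount` and `belowThresholdHighest` field names come from the report.

```typescript
// Hypothetical shapes for illustration; not the package's actual types.
interface ScoredSection {
  id: string;
  score: number; // similarity in [0, 1]
}

interface SearchResponse {
  results: ScoredSection[];
  belowThresholdCount?: number;
  belowThresholdHighest?: number;
}

// Keep candidates at or above the threshold and summarize what was cut.
function filterWithFeedback(
  candidates: ScoredSection[],
  threshold: number,
): SearchResponse {
  // Sort a copy by descending score so the best excluded hit comes first.
  const sorted = [...candidates].sort((a, b) => b.score - a.score);
  const results = sorted.filter((c) => c.score >= threshold);
  const below = sorted.filter((c) => c.score < threshold);
  const bestExcluded = below[0];
  if (bestExcluded === undefined) {
    return { results };
  }
  return {
    results,
    belowThresholdCount: below.length,
    belowThresholdHighest: bestExcluded.score,
  };
}

// Made-up scores, chosen so that one candidate sits just under 0.35.
const candidates: ScoredSection[] = [
  { id: "a", score: 0.39 },
  { id: "b", score: 0.36 },
  { id: "c", score: 0.349 },
  { id: "d", score: 0.31 },
];

const atDefault = filterWithFeedback(candidates, 0.35);
const atOldDefault = filterWithFeedback(candidates, 0.5);
console.log(atDefault, atOldDefault);
```

With these made-up scores, the 0.35 threshold keeps two results and reports two excluded ones, while the old 0.5 threshold keeps none and reports all four as below threshold.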
|
|
59
|
+
|
|
60
|
+
## Conclusion
|
|
61
|
+
|
|
62
|
+
The threshold change from 0.5 to 0.35 is validated as the correct fix:
|
|
63
|
+
|
|
64
|
+
1. **Single-word queries now work** - Users can search for concepts like "failure", "error", "context"
|
|
65
|
+
2. **Multi-word queries unaffected** - High-quality results with same top matches
|
|
66
|
+
3. **User guidance in place** - Documentation (ALP-210) explains threshold behavior
|
|
67
|
+
4. **Below-threshold feedback** - Users see when lowering threshold would help
|
|
68
|
+
|
|
69
|
+
The root cause identified in ALP-207 (threshold too high for short queries scoring 30-40%) is confirmed fixed.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
# Vector Search Parameters and Scoring Analysis
|
|
2
|
+
|
|
3
|
+
## Executive Summary
|
|
4
|
+
|
|
5
|
+
The HNSW vector search configuration is **appropriate and well-tuned**. The root cause of "0 results" is **NOT the vector search algorithm**, but the **similarity threshold filtering** applied after search.
|
|
6
|
+
|
|
7
|
+
Key finding: Single-word queries have inherently lower similarity scores (31-49% in the tests below) than multi-word queries (50-70%). The default 0.5 threshold filters out all single-word results.
|
|
8
|
+
|
|
9
|
+
## HNSW Configuration
|
|
10
|
+
|
|
11
|
+
### Current Parameters
|
|
12
|
+
|
|
13
|
+
From `src/embeddings/vector-store.ts:98`:
|
|
14
|
+
|
|
15
|
+
```typescript
this.index.initIndex(10000, 16, 200, 100)
// maxElements, M, efConstruction, efSearch
```
|
|
19
|
+
|
|
20
|
+
| Parameter | Value | Description | Assessment |
|-----------|-------|-------------|------------|
| maxElements | 10,000 | Initial capacity (auto-resizes) | Adequate |
| M | 16 | Max connections per node | Good balance |
| efConstruction | 200 | Construction-time search width | High quality |
| efSearch | 100 | Query-time search width | Good recall |
|
|
26
|
+
|
|
27
|
+
All parameters are well-tuned. No changes needed.
|
|
28
|
+
|
|
29
|
+
## Similarity Score Analysis
|
|
30
|
+
|
|
31
|
+
### Threshold Experiment
|
|
32
|
+
|
|
33
|
+
Testing "failure" at different thresholds:
|
|
34
|
+
|
|
35
|
+
| Threshold | Results | Top Score |
|-----------|---------|-----------|
| 0.0 | 10 | 39.1% |
| 0.3 | 10 | 39.1% |
| 0.4 | 0 | - |
| 0.5 | 0 | - |
|
|
41
|
+
|
|
42
|
+
### Score Distribution by Query Type
|
|
43
|
+
|
|
44
|
+
| Query Type | Score Range | Results at 0.5 |
|------------|-------------|----------------|
| Single word | 31-49% | 0 |
| Two-word domain | 54-70% | 7+ |
| Natural language | 50-66% | 9 |
|
|
49
|
+
|
|
50
|
+
## Root Cause
|
|
51
|
+
|
|
52
|
+
The 0.5 default threshold filters out single-word results (max ~49%). This is threshold calibration, not a search algorithm issue.
|
|
53
|
+
|
|
54
|
+
## Recommendations for ALP-207
|
|
55
|
+
|
|
56
|
+
1. Lower default threshold to 0.3-0.4
|
|
57
|
+
2. Consider adaptive threshold by query length
|
|
58
|
+
3. Show "N results below threshold" message
|
|
59
|
+
4. Make threshold more visible in docs
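
Recommendation 2 could look roughly like the sketch below. It is only one possible reading of "adaptive threshold by query length": the `adaptiveThreshold` helper and both cutoff values are assumptions chosen to sit under the score ranges measured above, not values taken from the codebase.

```typescript
// Hypothetical helper: choose a similarity threshold from the query's word count.
// The cutoffs are illustrative assumptions, not the package's defaults.
function adaptiveThreshold(
  query: string,
  multiWordThreshold = 0.5,
  singleWordThreshold = 0.3,
): number {
  // Split on whitespace and drop empty tokens from leading/trailing spaces.
  const words = query
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return words.length <= 1 ? singleWordThreshold : multiWordThreshold;
}

const singleWord = adaptiveThreshold("failure");
const twoWords = adaptiveThreshold("failure automation");
console.log({ singleWord, twoWords });
```

A fixed lower default (recommendation 1) is simpler and has no per-query branching; the adaptive variant only pays off if the lower threshold turns out to admit too much noise for longer queries.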
|
|
60
|
+
|
|
61
|
+
## Conclusion
|
|
62
|
+
|
|
63
|
+
Vector search works correctly. Focus ALP-207 on threshold tuning, not algorithmic changes.
|
|
@@ -0,0 +1,276 @@
|
|
|
1
|
+
# Test Path Issues Analysis
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
**RESEARCH METADATA**
|
|
5
|
+
|
|
6
|
+
- Analysis Date: 2026-01-24
|
|
7
|
+
- Git Commit: 07c9e72ba01cda840046b96a1be4743a85e3d4c5
|
|
8
|
+
- Status: ✅ Valid
|
|
9
|
+
- Last Validated: 2026-01-24
|
|
10
|
+
- Worktree: nancy-ALP-139
|
|
11
|
+
- Index: [/research/INDEX.md](INDEX.md)
|
|
12
|
+
|
|
13
|
+
**ACCURACY NOTE**
|
|
14
|
+
|
|
15
|
+
Test path fix validation. Findings are accurate and current.
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
**Date:** 2026-01-24
|
|
19
|
+
**Issue:** Cross-platform path handling in test suite
|
|
20
|
+
**Related File:** `src/cli/cli.test.ts`
|
|
21
|
+
|
|
22
|
+
## Summary
|
|
23
|
+
|
|
24
|
+
This document analyzes the fix made to line 458 of `src/cli/cli.test.ts` and identifies other potential path-related issues across the test suite.
|
|
25
|
+
|
|
26
|
+
## The Fix at Line 458
|
|
27
|
+
|
|
28
|
+
### What Changed
|
|
29
|
+
|
|
30
|
+
```typescript
// Before:
expect(output).toContain('/nonexistent/path.json')

// After:
expect(output).toContain('path.json')
```
|
|
37
|
+
|
|
38
|
+
### Test Context
|
|
39
|
+
|
|
40
|
+
The test (lines 453-459) verifies that when a non-existent config file is specified, the error message includes information about the missing file:
|
|
41
|
+
|
|
42
|
+
```typescript
it('shows error for non-existent config file', async () => {
  const output = await run('--config /nonexistent/path.json --help', {
    expectError: true,
  })
  expect(output).toContain('Error: Config file not found')
  expect(output).toContain('path.json')
})
```
|
|
51
|
+
|
|
52
|
+
### Analysis of the Fix
|
|
53
|
+
|
|
54
|
+
**Is this fix correct?**
|
|
55
|
+
YES - The fix is correct for the following reasons:
|
|
56
|
+
|
|
57
|
+
1. **Cross-platform compatibility**: The original test used a Unix-style absolute path (`/nonexistent/path.json`) which would fail on Windows where absolute paths look like `C:\nonexistent\path.json`.
|
|
58
|
+
|
|
59
|
+
2. **Appropriate validation scope**: The test's purpose is to verify that:
   - An error is shown for non-existent config files
   - The error message mentions the filename

   It does NOT need to validate that the full absolute path appears in the error message.
|
|
64
|
+
|
|
65
|
+
3. **More resilient**: By checking only for `path.json` (the basename), the test becomes agnostic to:
|
|
66
|
+
- Path separators (forward slash vs. backslash)
|
|
67
|
+
- Path normalization
|
|
68
|
+
- Absolute vs relative path representation in error messages
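
The three points above can be checked with a small sketch using Node's `path.posix` and `path.win32` variants, which apply each platform's rules regardless of the host OS. The two error messages are hypothetical: the text after `Error: Config file not found` is an assumed format, not the CLI's confirmed output.

```typescript
import * as path from "node:path";

// The same missing config file as each platform might spell it.
const unixStyle = "/nonexistent/path.json";
const windowsStyle = "C:\\nonexistent\\path.json";

// basename() drops the directory part using the given platform's separators.
const posixName = path.posix.basename(unixStyle);
const win32Name = path.win32.basename(windowsStyle);

// Hypothetical error messages; the suffix format is an assumption.
const messages = [
  `Error: Config file not found: ${unixStyle}`,
  `Error: Config file not found: ${windowsStyle}`,
];

// The basename assertion holds for both spellings...
const allMentionBasename = messages.every((m) => m.includes("path.json"));
// ...while the original full-path assertion holds only for the Unix one.
const allMentionUnixPath = messages.every((m) => m.includes(unixStyle));

console.log({ posixName, win32Name, allMentionBasename, allMentionUnixPath });
```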
|
|
69
|
+
|
|
70
|
+
**Is this sufficient?**
|
|
71
|
+
YES - The fix is sufficient because:
|
|
72
|
+
|
|
73
|
+
1. The test has two assertions:
   - First checks for the error type: `'Error: Config file not found'`
   - Second checks that the filename appears: `'path.json'`
|
|
76
|
+
|
|
77
|
+
2. Together, these assertions validate the core requirement: that users can identify which config file was not found.
|
|
78
|
+
|
|
79
|
+
3. The test doesn't need to verify exact path formatting, which may vary by OS and implementation.
|
|
80
|
+
|
|
81
|
+
**Does the test still validate what it should?**
|
|
82
|
+
YES - The test's purpose is to ensure:
|
|
83
|
+
- The CLI detects non-existent config files
|
|
84
|
+
- The error message is informative enough for users to identify the problem
|
|
85
|
+
- Both requirements are still met after the fix
|
|
86
|
+
|
|
87
|
+
## Other Path Issues Found
|
|
88
|
+
|
|
89
|
+
### 1. Mock Path Variables in `src/cli/argv-preprocessor.test.ts`
|
|
90
|
+
|
|
91
|
+
**Lines 9-10:**
|
|
92
|
+
```typescript
const node = '/usr/bin/node'
const script = '/path/to/mdcontext'
```
|
|
96
|
+
|
|
97
|
+
**Issue:** Uses hardcoded Unix-style paths for test fixtures.
|
|
98
|
+
|
|
99
|
+
**Impact:** LOW - These are mock values used only to construct argv arrays for testing flag parsing. They don't interact with the filesystem or get validated against actual paths.
|
|
100
|
+
|
|
101
|
+
**Recommendation:** No fix needed. These are appropriate test fixtures because:
|
|
102
|
+
- They represent typical argv[0] and argv[1] values
|
|
103
|
+
- They're used purely as string identifiers in the test
|
|
104
|
+
- The actual values don't matter for the flag parsing logic being tested
|
|
105
|
+
|
|
106
|
+
### 2. Hardcoded Paths in Error Test Files
|
|
107
|
+
|
|
108
|
+
Multiple test files use hardcoded Unix-style paths as test data:
|
|
109
|
+
|
|
110
|
+
#### `src/summarize/budget-bugs.test.ts`
|
|
111
|
+
- Lines 11, 69, 126, 182, 220, 246, 335, 367, 426, 483, 499, 528, 556, 589
|
|
112
|
+
- Example: `path: '/test/file.md'`
|
|
113
|
+
|
|
114
|
+
#### `src/summarize/summarizer.test.ts`
|
|
115
|
+
- Lines 16, 55, 92, 132, 150, 192, 224, 252
|
|
116
|
+
- Example: `path: '/test/file.md'`
|
|
117
|
+
|
|
118
|
+
#### `src/summarize/verify-bugs.test.ts`
|
|
119
|
+
- Lines 21, 71, 92, 108, 148, 180, 203
|
|
120
|
+
- Example: `path: '/very/long/path/to/deeply/nested/directory/structure/file.md'`
|
|
121
|
+
|
|
122
|
+
#### `src/errors/errors.test.ts`
|
|
123
|
+
- Lines 45, 53, 62, 72, 81, 95, 119, 131, 155, 346, 367, 382, 389, 395, 413, 473, 518, 547, 559, 571, 622, 634, 742, 792, 814
|
|
124
|
+
- Examples:
  - `path: '/test/file.md'`
  - `sourceFile: '/path/to/mdcontext.config.json'`
|
|
127
|
+
|
|
128
|
+
**Issue:** Hardcoded Unix-style paths in test data.
|
|
129
|
+
|
|
130
|
+
**Impact:** VERY LOW - These are test fixtures representing document paths in mock data structures. They are:
|
|
131
|
+
- Not used for filesystem operations
|
|
132
|
+
- Not validated against actual paths
|
|
133
|
+
- Used purely as string identifiers in domain objects
|
|
134
|
+
- Never checked with path-specific assertions
|
|
135
|
+
|
|
136
|
+
**Recommendation:** No fix needed. These paths are appropriate because:
|
|
137
|
+
1. They represent the `path` field in domain objects (DocumentSummary, Error objects, etc.)
|
|
138
|
+
2. The actual format doesn't affect test validity
|
|
139
|
+
3. In production, these paths would come from actual filesystem operations
|
|
140
|
+
4. Using Unix-style paths as test data is a common convention
|
|
141
|
+
|
|
142
|
+
### 3. Config Test Paths in `src/config/file-provider.test.ts`
|
|
143
|
+
|
|
144
|
+
**Lines 233, 248:**
|
|
145
|
+
```typescript
paths: {
  cacheDir: '/custom/cache',
}
// ...
expect(result.paths.cacheDir).toBe('/custom/cache')
```
|
|
152
|
+
|
|
153
|
+
**Issue:** Hardcoded Unix-style absolute path.
|
|
154
|
+
|
|
155
|
+
**Impact:** VERY LOW - This is testing configuration value passthrough, not filesystem operations.
|
|
156
|
+
|
|
157
|
+
**Recommendation:** No fix needed. The test validates that:
|
|
158
|
+
- Config values are correctly parsed and stored
|
|
159
|
+
- The exact string value is preserved
|
|
160
|
+
- The path format is irrelevant since it's testing data flow, not path validation
|
|
161
|
+
|
|
162
|
+
### 4. Config Schema Test Paths in `src/config/schema.test.ts`
|
|
163
|
+
|
|
164
|
+
**Lines 231-245:**
|
|
165
|
+
```typescript
const provider = ConfigProvider.fromMap(
  new Map([
    ['root', '/home/user/docs'],
    ['configFile', './custom.config.json'],
    ['cacheDir', '.cache/mdcontext'],
  ]),
)
// ...
expect(Option.getOrThrow(result.root)).toBe('/home/user/docs')
expect(Option.getOrThrow(result.configFile)).toBe('./custom.config.json')
expect(result.cacheDir).toBe('.cache/mdcontext')
```
|
|
178
|
+
|
|
179
|
+
**Issue:** Mix of Unix absolute and relative paths.
|
|
180
|
+
|
|
181
|
+
**Impact:** VERY LOW - Tests configuration schema parsing, not path operations.
|
|
182
|
+
|
|
183
|
+
**Recommendation:** No fix needed. These test the configuration system's ability to:
|
|
184
|
+
- Accept various path formats
|
|
185
|
+
- Preserve path values as provided
|
|
186
|
+
- Map between config sources and domain objects
|
|
187
|
+
|
|
188
|
+
## Patterns Using `path.join` (Correct Usage)
|
|
189
|
+
|
|
190
|
+
The test suite correctly uses `path.join()` for constructing paths that interact with the filesystem:
|
|
191
|
+
|
|
192
|
+
### `src/cli/cli.test.ts`
|
|
193
|
+
```typescript
const TEST_FIXTURE_DIR = path.join(process.cwd(), 'tests', 'fixtures', 'cli')
const CLI = `node ${path.join(process.cwd(), 'dist', 'cli', 'main.js')}`
```
|
|
197
|
+
|
|
198
|
+
### `src/search/searcher.test.ts`
|
|
199
|
+
```typescript
const TEST_DIR = path.join(process.cwd(), 'tests', 'fixtures', 'search')
const doc1Path = path.join(TEST_DIR, 'doc1.md')
```
|
|
203
|
+
|
|
204
|
+
These are correct because they:
|
|
205
|
+
- Build actual filesystem paths
|
|
206
|
+
- Use OS-appropriate separators
|
|
207
|
+
- Work across Windows, macOS, and Linux
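
As a minimal illustration of the separator point, the sketch below joins the same segments under both platform rule sets via `path.posix` and `path.win32`. The segment names mirror the fixture directory used above; the `hardcoded` string is a hypothetical example of what a test should avoid comparing against.

```typescript
import * as path from "node:path";

const segments = ["tests", "fixtures", "cli"];

// path.join() uses the host platform's separator; the posix/win32 variants
// make both outcomes visible on any machine.
const posixPath = path.posix.join(...segments);
const win32Path = path.win32.join(...segments);

// A hardcoded forward-slash string equals only the POSIX result.
const hardcoded = "tests/fixtures/cli";
const matchesPosix = posixPath === hardcoded;
const matchesWin32 = win32Path === hardcoded;

console.log({ posixPath, win32Path, matchesPosix, matchesWin32 });
```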
|
|
208
|
+
|
|
209
|
+
## Recommendations
|
|
210
|
+
|
|
211
|
+
### 1. No Additional Fixes Required
|
|
212
|
+
|
|
213
|
+
The test suite does not have significant cross-platform path issues. The fix at line 458 was appropriate, and no similar issues exist elsewhere because:
|
|
214
|
+
|
|
215
|
+
- Most hardcoded paths are test data, not filesystem paths
|
|
216
|
+
- Actual filesystem operations use `path.join()` correctly
|
|
217
|
+
- Test assertions check for meaningful content, not path formatting
|
|
218
|
+
|
|
219
|
+
### 2. Best Practices for Future Tests
|
|
220
|
+
|
|
221
|
+
When writing new tests:
|
|
222
|
+
|
|
223
|
+
#### DO:
|
|
224
|
+
- Use `path.join()` for constructing filesystem paths
|
|
225
|
+
- Use `path.basename()` when asserting on filenames in error messages
|
|
226
|
+
- Check for file/directory names rather than full paths in assertions
|
|
227
|
+
- Use relative paths in test fixtures when possible
|
|
228
|
+
|
|
229
|
+
#### DON'T:
|
|
230
|
+
- Hardcode absolute paths with `/` or `\` separators for filesystem operations
|
|
231
|
+
- Make assertions about exact path formatting in error messages
|
|
232
|
+
- Assume path separator type in test expectations
|
|
233
|
+
- Use platform-specific path conventions in assertions
|
|
234
|
+
|
|
235
|
+
#### Example: Testing Error Messages
|
|
236
|
+
```typescript
// GOOD: Check for filename, not full path
expect(output).toContain('config.json')

// BAD: Assumes Unix path format
expect(output).toContain('/path/to/config.json')

// GOOD: For actual file operations
const configPath = path.join(testDir, 'config.json')
```
|
|
246
|
+
|
|
247
|
+
#### Example: Test Data
|
|
248
|
+
```typescript
// OK: Mock data representing a document path
const mockDoc: DocumentSummary = {
  path: '/test/file.md', // Not a real filesystem path
  // ...
}

// GOOD: Actual filesystem test fixture
const testFile = path.join(fixtureDir, 'file.md')
```
|
|
258
|
+
|
|
259
|
+
### 3. Testing Strategy
|
|
260
|
+
|
|
261
|
+
The current approach is sound:
|
|
262
|
+
- Mock data paths can remain Unix-style for consistency
|
|
263
|
+
- Filesystem operations use `path.join()`
|
|
264
|
+
- Error message assertions check for meaningful content, not formatting
|
|
265
|
+
- Integration tests use real filesystem operations with proper path handling
|
|
266
|
+
|
|
267
|
+
## Conclusion
|
|
268
|
+
|
|
269
|
+
The fix to line 458 is **correct and sufficient**. No other cross-platform path issues exist in the test suite. The codebase follows best practices by:
|
|
270
|
+
|
|
271
|
+
1. Using `path.join()` for actual filesystem paths
|
|
272
|
+
2. Using hardcoded paths only as mock data
|
|
273
|
+
3. Making assertions on meaningful content rather than path formatting
|
|
274
|
+
4. Keeping tests platform-agnostic
|
|
275
|
+
|
|
276
|
+
The test suite is well-designed for cross-platform compatibility.
|