mdcontext 0.0.1 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.claude/settings.local.json +25 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/claude-code-review.yml +44 -0
- package/.github/workflows/claude.yml +85 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/BACKLOG.md +338 -0
- package/CONTRIBUTING.md +186 -0
- package/NOTES/NOTES +44 -0
- package/README.md +434 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-23UPXDNL.js +3044 -0
- package/dist/chunk-2W7MO2DL.js +1366 -0
- package/dist/chunk-3NUAZGMA.js +1689 -0
- package/dist/chunk-7TOWB2XB.js +366 -0
- package/dist/chunk-7XOTOADQ.js +3065 -0
- package/dist/chunk-AH2PDM2K.js +3042 -0
- package/dist/chunk-BNXWSZ63.js +3742 -0
- package/dist/chunk-BTL5DJVU.js +3222 -0
- package/dist/chunk-HDHYG7E4.js +104 -0
- package/dist/chunk-HLR4KZBP.js +3234 -0
- package/dist/chunk-IP3FRFEB.js +1045 -0
- package/dist/chunk-KHU56VDO.js +3042 -0
- package/dist/chunk-KRYIFLQR.js +88 -0
- package/dist/chunk-LBSDNLEM.js +287 -0
- package/dist/chunk-MNTQ7HCP.js +2643 -0
- package/dist/chunk-MUJELQQ6.js +1387 -0
- package/dist/chunk-MXJGMSLV.js +2199 -0
- package/dist/chunk-N6QJGC3Z.js +2636 -0
- package/dist/chunk-OBELGBPM.js +1713 -0
- package/dist/chunk-OT7R5XTA.js +3192 -0
- package/dist/chunk-P7X4RA2T.js +106 -0
- package/dist/chunk-PIDUQNC2.js +3185 -0
- package/dist/chunk-POGCDIH4.js +3187 -0
- package/dist/chunk-PSIEOQGZ.js +3043 -0
- package/dist/chunk-PVRT3IHA.js +3238 -0
- package/dist/chunk-QNN4TT23.js +1430 -0
- package/dist/chunk-RE3R45RJ.js +3042 -0
- package/dist/chunk-S7E6TFX6.js +803 -0
- package/dist/chunk-SG6GLU4U.js +1378 -0
- package/dist/chunk-SJCDV2ST.js +274 -0
- package/dist/chunk-SYE5XLF3.js +104 -0
- package/dist/chunk-T5VLYBZD.js +103 -0
- package/dist/chunk-TOQB7VWU.js +3238 -0
- package/dist/chunk-VFNMZ4ZQ.js +3228 -0
- package/dist/chunk-VVTGZNBT.js +1629 -0
- package/dist/chunk-W7Q4RFEV.js +104 -0
- package/dist/chunk-XTYYVRLO.js +3190 -0
- package/dist/chunk-Y6MDYVJD.js +3063 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +5458 -0
- package/dist/index.d.ts +653 -0
- package/dist/index.js +79 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +472 -0
- package/dist/schema-BAWSG7KY.js +22 -0
- package/dist/schema-E3QUPL26.js +20 -0
- package/dist/schema-EHL7WUT6.js +20 -0
- package/docs/019-USAGE.md +625 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/CONFIG.md +1123 -0
- package/docs/DESIGN.md +439 -0
- package/docs/ERRORS.md +383 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/summarization.md +320 -0
- package/docs/test-links.md +9 -0
- package/justfile +40 -0
- package/package.json +74 -9
- package/pnpm-workspace.yaml +5 -0
- package/research/INDEX.md +315 -0
- package/research/code-review/README.md +90 -0
- package/research/code-review/cli-error-handling-review.md +979 -0
- package/research/code-review/code-review-validation-report.md +464 -0
- package/research/code-review/main-ts-review.md +1128 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/config-docs/SUMMARY.md +357 -0
- package/research/config-docs/TEST-RESULTS.md +776 -0
- package/research/config-docs/TODO.md +542 -0
- package/research/config-docs/analysis.md +744 -0
- package/research/config-docs/fix-validation.md +502 -0
- package/research/config-docs/help-audit.md +264 -0
- package/research/config-docs/help-system-analysis.md +890 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/frontmatter/COMMENTS-ARE-SKIPPED.md +149 -0
- package/research/frontmatter/LLM-CODE-NAVIGATION.md +276 -0
- package/research/issue-review.md +603 -0
- package/research/llm-summarization/agent-cli-tools-2026.md +1082 -0
- package/research/llm-summarization/alternative-providers-2026.md +1428 -0
- package/research/llm-summarization/anthropic-2026.md +367 -0
- package/research/llm-summarization/claude-cli-integration.md +1706 -0
- package/research/llm-summarization/cli-integration-patterns.md +3155 -0
- package/research/llm-summarization/openai-2026.md +473 -0
- package/research/llm-summarization/openai-compatible-providers-2026.md +1022 -0
- package/research/llm-summarization/opencode-cli-integration.md +1552 -0
- package/research/llm-summarization/prompt-engineering-2026.md +1426 -0
- package/research/llm-summarization/prototype-results.md +56 -0
- package/research/llm-summarization/provider-switching-patterns-2026.md +2153 -0
- package/research/llm-summarization/typescript-llm-libraries-2026.md +2436 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/mdcontext-pudding/00-EXECUTIVE-SUMMARY.md +282 -0
- package/research/mdcontext-pudding/01-index-embed.md +956 -0
- package/research/mdcontext-pudding/02-search-COMMANDS.md +142 -0
- package/research/mdcontext-pudding/02-search-SUMMARY.md +146 -0
- package/research/mdcontext-pudding/02-search.md +970 -0
- package/research/mdcontext-pudding/03-context.md +779 -0
- package/research/mdcontext-pudding/04-navigation-and-analytics.md +803 -0
- package/research/mdcontext-pudding/04-tree.md +704 -0
- package/research/mdcontext-pudding/05-config.md +1038 -0
- package/research/mdcontext-pudding/06-links-summary.txt +87 -0
- package/research/mdcontext-pudding/06-links.md +679 -0
- package/research/mdcontext-pudding/07-stats.md +693 -0
- package/research/mdcontext-pudding/BUG-FIX-PLAN.md +388 -0
- package/research/mdcontext-pudding/P0-BUG-VALIDATION.md +167 -0
- package/research/mdcontext-pudding/README.md +168 -0
- package/research/mdcontext-pudding/TESTING-SUMMARY.md +128 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/research-quality-review.md +834 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/semantic-search/embedding-text-analysis.md +156 -0
- package/research/semantic-search/multi-word-failure-reproduction.md +171 -0
- package/research/semantic-search/query-processing-analysis.md +207 -0
- package/research/semantic-search/root-cause-and-solution.md +114 -0
- package/research/semantic-search/threshold-validation-report.md +69 -0
- package/research/semantic-search/vector-search-analysis.md +63 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/research/test-path-issues.md +276 -0
- package/review/ALP-76/1-error-type-design.md +962 -0
- package/review/ALP-76/2-error-handling-patterns.md +906 -0
- package/review/ALP-76/3-error-presentation.md +624 -0
- package/review/ALP-76/4-test-coverage.md +625 -0
- package/review/ALP-76/5-migration-completeness.md +440 -0
- package/review/ALP-76/6-effect-best-practices.md +755 -0
- package/scripts/apply-branch-protection.sh +47 -0
- package/scripts/branch-protection-templates.json +79 -0
- package/scripts/prototype-summarization.ts +346 -0
- package/scripts/rebuild-hnswlib.js +58 -0
- package/scripts/setup-branch-protection.sh +64 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/active-provider.json +7 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.json +541 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/bm25.meta.json +5 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/config.json +8 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/documents.json +60 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/links.json +13 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/.mdcontext/indexes/sections.json +1197 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/configuration-management.md +99 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/distributed-systems.md +92 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/error-handling.md +78 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/failure-automation.md +55 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/job-context.md +69 -0
- package/src/__tests__/fixtures/semantic-search/multi-word-corpus/process-orchestration.md +99 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +627 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/config-cmd.ts +642 -0
- package/src/cli/commands/context.ts +285 -0
- package/src/cli/commands/duplicates.ts +122 -0
- package/src/cli/commands/embeddings.ts +529 -0
- package/src/cli/commands/index-cmd.ts +480 -0
- package/src/cli/commands/index.ts +16 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +1281 -0
- package/src/cli/commands/stats.ts +149 -0
- package/src/cli/commands/tree.ts +128 -0
- package/src/cli/config-layer.ts +176 -0
- package/src/cli/error-handler.test.ts +235 -0
- package/src/cli/error-handler.ts +655 -0
- package/src/cli/flag-schemas.ts +341 -0
- package/src/cli/help.ts +588 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +435 -0
- package/src/cli/options.ts +41 -0
- package/src/cli/shared-error-handling.ts +199 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +259 -0
- package/src/config/file-provider.test.ts +320 -0
- package/src/config/file-provider.ts +273 -0
- package/src/config/index.ts +72 -0
- package/src/config/integration.test.ts +667 -0
- package/src/config/precedence.test.ts +277 -0
- package/src/config/precedence.ts +451 -0
- package/src/config/schema.test.ts +414 -0
- package/src/config/schema.ts +603 -0
- package/src/config/service.test.ts +320 -0
- package/src/config/service.ts +243 -0
- package/src/config/testing.test.ts +264 -0
- package/src/config/testing.ts +110 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +113 -0
- package/src/duplicates/detector.test.ts +183 -0
- package/src/duplicates/detector.ts +414 -0
- package/src/duplicates/index.ts +18 -0
- package/src/embeddings/embedding-namespace.test.ts +300 -0
- package/src/embeddings/embedding-namespace.ts +947 -0
- package/src/embeddings/heading-boost.test.ts +222 -0
- package/src/embeddings/hnsw-build-options.test.ts +198 -0
- package/src/embeddings/hyde.test.ts +272 -0
- package/src/embeddings/hyde.ts +264 -0
- package/src/embeddings/index.ts +10 -0
- package/src/embeddings/openai-provider.ts +414 -0
- package/src/embeddings/pricing.json +22 -0
- package/src/embeddings/provider-constants.ts +204 -0
- package/src/embeddings/provider-errors.test.ts +967 -0
- package/src/embeddings/provider-errors.ts +565 -0
- package/src/embeddings/provider-factory.test.ts +240 -0
- package/src/embeddings/provider-factory.ts +225 -0
- package/src/embeddings/provider-integration.test.ts +788 -0
- package/src/embeddings/query-preprocessing.test.ts +187 -0
- package/src/embeddings/semantic-search-threshold.test.ts +508 -0
- package/src/embeddings/semantic-search.ts +1270 -0
- package/src/embeddings/types.ts +359 -0
- package/src/embeddings/vector-store.ts +708 -0
- package/src/embeddings/voyage-provider.ts +313 -0
- package/src/errors/errors.test.ts +845 -0
- package/src/errors/index.ts +533 -0
- package/src/index/ignore-patterns.test.ts +354 -0
- package/src/index/ignore-patterns.ts +305 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +684 -0
- package/src/index/storage.ts +260 -0
- package/src/index/types.ts +147 -0
- package/src/index/watcher.ts +189 -0
- package/src/index.ts +30 -0
- package/src/integration/search-keyword.test.ts +678 -0
- package/src/mcp/server.ts +612 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +394 -0
- package/src/parser/section-filter.test.ts +277 -0
- package/src/parser/section-filter.ts +392 -0
- package/src/search/__tests__/hybrid-search.test.ts +650 -0
- package/src/search/bm25-store.ts +366 -0
- package/src/search/cross-encoder.test.ts +253 -0
- package/src/search/cross-encoder.ts +406 -0
- package/src/search/fuzzy-search.test.ts +419 -0
- package/src/search/fuzzy-search.ts +273 -0
- package/src/search/hybrid-search.ts +448 -0
- package/src/search/path-matcher.test.ts +276 -0
- package/src/search/path-matcher.ts +33 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +280 -0
- package/src/search/searcher.ts +724 -0
- package/src/search/wink-bm25.d.ts +30 -0
- package/src/summarization/cli-providers/claude.ts +202 -0
- package/src/summarization/cli-providers/detection.test.ts +273 -0
- package/src/summarization/cli-providers/detection.ts +118 -0
- package/src/summarization/cli-providers/index.ts +8 -0
- package/src/summarization/cost.test.ts +139 -0
- package/src/summarization/cost.ts +102 -0
- package/src/summarization/error-handler.test.ts +127 -0
- package/src/summarization/error-handler.ts +111 -0
- package/src/summarization/index.ts +102 -0
- package/src/summarization/pipeline.test.ts +498 -0
- package/src/summarization/pipeline.ts +231 -0
- package/src/summarization/prompts.test.ts +269 -0
- package/src/summarization/prompts.ts +133 -0
- package/src/summarization/provider-factory.test.ts +396 -0
- package/src/summarization/provider-factory.ts +178 -0
- package/src/summarization/types.ts +184 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +597 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/types/huggingface-transformers.d.ts +66 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/active-provider.json +7 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/embeddings/openai_text-embedding-3-small_512/vectors.meta.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +247 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tests/integration/embed-index.test.ts +712 -0
- package/tests/integration/search-context.test.ts +469 -0
- package/tests/integration/search-semantic.test.ts +522 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +16 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,171 @@
# Multi-Word Semantic Search Failure Reproduction

## Executive Summary

After systematic testing, the reported "multi-word semantic search failure" is **NOT a failure of semantic search itself**, but rather a **threshold calibration issue**. The root causes are:

1. **Single-word queries have low similarity scores** (30-49%) while multi-word queries have higher scores (50-70%)
2. **Default threshold of 0.5** filters out both single-word AND semantically-distant multi-word queries
3. **Queries with abstract/non-domain-specific terms** (e.g., "gaps missing omissions", "issue challenge gap") score below threshold
4. **Domain-specific multi-word queries work well** (e.g., "failure automation" = 61.6%, "process orchestration" = 68.0%)

## Test Methodology

### Test Corpus

Created a controlled test corpus in `src/__tests__/fixtures/semantic-search/multi-word-corpus/` with 6 markdown files covering:
- `failure-automation.md` - Failure detection and automated recovery
- `job-context.md` - Job execution context and metadata
- `error-handling.md` - Error handling patterns
- `configuration-management.md` - Config management practices
- `distributed-systems.md` - Distributed systems architecture
- `process-orchestration.md` - Workflow orchestration patterns

**Corpus Statistics:**
- 6 documents
- 67 sections
- 52 embedded vectors
- ~4,725 tokens

### Index Command

```bash
node dist/cli/main.js index src/__tests__/fixtures/semantic-search/multi-word-corpus --embed --force
```

## Test Results

### Multi-Word Domain-Specific Queries (DEFAULT THRESHOLD 0.5)

| Query | Results | Top Match | Top Score |
|-------|---------|-----------|-----------|
| "failure automation" | 7 | failure-automation.md: Best Practices | 61.6% |
| "job context" | 4 | job-context.md: What is Job Context? | 60.4% |
| "error handling" | 7 | error-handling.md: Introduction | 63.7% |
| "configuration management" | 8 | configuration-management.md: Overview | 69.5% |
| "distributed systems" | 4 | distributed-systems.md: What Are... | 61.0% |
| "process orchestration" | 8 | process-orchestration.md: Introduction | 68.0% |

**Finding:** Multi-word queries with domain-specific terms **WORK WELL** with default threshold.

### Single-Word Queries (DEFAULT THRESHOLD 0.5)

| Query | Results | Notes |
|-------|---------|-------|
| "failure" | 0 | Below 0.5 threshold |
| "automation" | 0 | Below 0.5 threshold |
| "context" | 0 | Below 0.5 threshold |
| "error" | 0 | Below 0.5 threshold |

### Single-Word Queries (THRESHOLD 0.3)

| Query | Results | Top Match | Top Score |
|-------|---------|-----------|-----------|
| "failure" | 10 | failure-automation.md: Failure Isolation | 39.1% |
| "automation" | 10 | (similar) | ~35% |
| "error" | 10 | error-handling.md: Programming Errors | 49.1% |

**Finding:** Single-word queries have inherently **LOW similarity scores** (30-49%) due to:
1. Short query embeddings lack semantic context
2. Embedding model produces less distinctive vectors for single words
3. Cosine similarity between the embedding of a short query and the embeddings of much longer section texts is compressed

### Abstract/Generic Multi-Word Queries (DEFAULT THRESHOLD 0.5)

| Query | Results | Notes |
|-------|---------|-------|
| "issue challenge gap" | 0 | Abstract terms, no domain match |
| "gaps missing omissions" | 0 | Meta-language about content, not content itself |

### Abstract Queries (THRESHOLD 0.3)

| Query | Results | Top Match | Top Score |
|-------|---------|-----------|-----------|
| "issue challenge gap" | 10 | distributed-systems.md: Consistency vs Availability | 40.8% |
| "gaps missing omissions" | 3 | error-handling.md: Programming Errors | 35.0% |

**Finding:** Abstract/meta-language queries score **roughly 35-41%** - below the default threshold but findable with a lower threshold.

### Hybrid Search Results

| Query | Hybrid Results | Primary Source |
|-------|---------------|----------------|
| "failure automation" | 7 | Semantic (RRF ~1.6) |
| "job context" | 4 | Semantic (RRF ~1.6) |

**Finding:** Hybrid search successfully combines semantic and keyword results, but the semantic component still uses the threshold filter.

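
The "RRF" figures above presumably refer to reciprocal rank fusion. The following is a minimal sketch of the standard technique, assuming two ranked lists of section IDs. The function name, the types, and the constant `k = 60` are illustrative assumptions, not values confirmed for mdcontext, and the ~1.6 scores in the table suggest a different scaling that this sketch does not try to reproduce.

```typescript
// Hypothetical names throughout; this is not mdcontext's implementation.
type RankedList = readonly string[]; // section IDs, best match first

interface FusedResult {
  id: string;
  score: number;
}

/**
 * Standard reciprocal rank fusion: every list that contains an item
 * contributes 1 / (k + rank) to its score, with rank starting at 1.
 */
function reciprocalRankFusion(
  lists: readonly RankedList[],
  k = 60,
): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// "b" appears in both rankings, so it outranks "a" even though
// "a" is first in the semantic list.
const fusedExample = reciprocalRankFusion([
  ["a", "b", "c"], // semantic ranking (already threshold-filtered)
  ["b", "d"], // keyword ranking
]);
```

If the semantic list is empty because every candidate fell below the threshold, only the keyword ranking contributes, which matches the finding that the threshold still governs the semantic side of hybrid search.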
## Pattern Analysis

### What Works (>50% similarity)
- Multi-word queries with **domain-specific terms** directly present in content
- Queries that form **coherent concepts** (e.g., "process orchestration")
- Queries that match **document titles or major headings**

### What Fails at Default Threshold
- **Single words** - all score 30-49%
- **Abstract meta-language** - "gaps", "issues", "challenges" without domain context
- **Non-domain queries** searching indexed domain content
- **Very short queries** (1-2 generic words)

### Similarity Score Distribution

```
70%+  : Document title/heading exact concept matches
60-70%: Multi-word domain queries matching content topics
50-60%: Multi-word queries with partial concept overlap
40-50%: Single words or abstract queries with some relevance
30-40%: Tangentially related content
<30%  : Unrelated content (correctly filtered)
```

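
The bands above can be expressed as a small lookup, which may be handy when annotating results in tests or debug output. This is only a sketch: `SCORE_BANDS` and `describeScore` are hypothetical names, and scores are assumed to be cosine similarities expressed as fractions between 0 and 1.

```typescript
interface ScoreBand {
  min: number; // inclusive lower bound of the band
  label: string;
}

// Bands copied from the distribution above, ordered from highest to lowest.
const SCORE_BANDS: readonly ScoreBand[] = [
  { min: 0.7, label: "title/heading exact concept match" },
  { min: 0.6, label: "multi-word domain query matching content topics" },
  { min: 0.5, label: "multi-word query with partial concept overlap" },
  { min: 0.4, label: "single word or abstract query with some relevance" },
  { min: 0.3, label: "tangentially related content" },
];

/** Returns the label of the first band whose lower bound the score reaches. */
function describeScore(score: number): string {
  const band = SCORE_BANDS.find((b) => score >= b.min);
  return band ? band.label : "unrelated content";
}
```

For example, the 61.6% top score for "failure automation" falls in the 60-70% band, while the 39.1% score for "failure" falls in the 30-40% band.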
## Dogfooding Context

The dogfooding agents reported semantic search as "unreliable for multi-word conceptual queries". Re-analysis shows:

1. **No embeddings were built** during dogfooding (only keyword index existed)
2. Semantic search was **unavailable** - falling back to keyword search
3. Multi-word **keyword** searches like "failure automation" worked
4. Multi-word keyword searches as **quoted phrases** returned 0 (expecting exact text)
5. Abstract queries like "gaps missing omissions" correctly returned 0 (phrase not in content)

The actual issue was:
- **Semantic search unavailable** (no embeddings)
- **Keyword phrase search** misunderstood (quoted = exact match)
- **Abstract conceptual queries** don't match concrete content via keyword

## Recommendations

### For ALP-204 (Embedding Text Analysis)
- Analyze how `generateEmbeddingText()` combines section context
- Check if heading + parent + content provides enough semantic signal for short queries

### For ALP-205 (Query Processing)
- Query text is passed directly to the embedding provider with no preprocessing
- Consider query expansion for short queries

### For ALP-206 (Vector Search Parameters)
- Default threshold of 0.5 is **too high** for single-word queries
- Consider adaptive thresholds based on query length
- Consider returning top-K results regardless of threshold, then filtering

### For ALP-207 (Solution Design)
Key solutions to consider:
1. **Adaptive threshold** - lower for short queries
2. **Query expansion** - augment short queries with context
3. **Better user feedback** - show "X results below threshold" message
4. **Threshold documentation** - educate users on the `--threshold` flag

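
The adaptive-threshold idea can be sketched as follows. This is a minimal illustration, not a proposed implementation: `countWords` and `adaptiveThreshold` are hypothetical names, and the values 0.5 and 0.3 are taken from the test runs above (0.3 was the threshold at which single-word queries started returning results).

```typescript
/** Counts whitespace-separated words; an empty or blank query has zero. */
function countWords(query: string): number {
  const trimmed = query.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

/**
 * Picks a similarity threshold from the query length.
 * Single-word (or empty) queries get the lower threshold; longer queries
 * keep the base value. The lower value never exceeds the base.
 */
function adaptiveThreshold(
  query: string,
  baseThreshold = 0.5,
  singleWordThreshold = 0.3,
): number {
  return countWords(query) <= 1
    ? Math.min(baseThreshold, singleWordThreshold)
    : baseThreshold;
}
```

With these defaults, "failure" would be searched at 0.3 and "failure automation" at 0.5. Abstract multi-word queries such as "issue challenge gap" would still be filtered, so this addresses only the single-word case.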
## Conclusion

Multi-word semantic search **is working correctly** for domain-specific queries. The perceived "failure" is a combination of:
1. No embeddings in dogfooding environment
2. Threshold too high for short/abstract queries
3. Confusion between keyword phrase search and semantic search
4. Users expecting semantic search to understand meta-language about content

The fix is NOT to change the semantic search algorithm, but to:
1. Calibrate default threshold appropriately
2. Add query-length-aware threshold adjustment
3. Improve error messages when no results found
4. Consider hybrid search as default mode
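
The improved "no results" feedback could look roughly like the sketch below. It assumes the search layer can see candidates before threshold filtering; `ScoredResult`, `FilteredResults`, and `filterWithFeedback` are hypothetical names, and the hint wording is only an example.

```typescript
interface ScoredResult {
  id: string;
  similarity: number; // cosine similarity as a fraction between 0 and 1
}

interface FilteredResults {
  shown: ScoredResult[];
  hiddenCount: number;
  hint?: string; // present only when something was filtered out
}

/** Splits candidates at the threshold and describes what was hidden. */
function filterWithFeedback(
  results: readonly ScoredResult[],
  threshold: number,
): FilteredResults {
  const sorted = results.slice().sort((a, b) => b.similarity - a.similarity);
  const shown = sorted.filter((r) => r.similarity >= threshold);
  const hidden = sorted.filter((r) => r.similarity < threshold);
  const bestHidden = hidden[0];
  if (bestHidden === undefined) {
    return { shown, hiddenCount: 0 };
  }
  const percent = (bestHidden.similarity * 100).toFixed(1);
  return {
    shown,
    hiddenCount: hidden.length,
    hint:
      `${hidden.length} result(s) below threshold ${threshold} ` +
      `(best: ${percent}%). Try a lower --threshold.`,
  };
}
```

A single-word query at the default threshold would then return an empty `shown` list together with a hint, instead of a bare "0 results".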
@@ -0,0 +1,207 @@
# Query Processing Analysis

## Executive Summary

Query processing is **minimal and appropriate**. The query text is passed directly to the embedding API without modification. This is correct behavior for OpenAI's `text-embedding-3-small` model, which handles text normalization internally.

The asymmetry between query format (plain text) and document format (text with metadata) does NOT cause issues - embedding models are designed for this asymmetric retrieval pattern.

## Query Flow

```
User Input
    │
    ▼
CLI Parser (search.ts)
    │ query string unchanged
    ▼
semanticSearch(rootPath, query, options)
    │ query string unchanged
    ▼
provider.embed([query])
    │ passed directly to API
    ▼
OpenAI Embeddings API
    │ returns 512-dimensional vector
    ▼
Vector Store search()
    │ cosine similarity comparison
    ▼
Results filtered by threshold
```

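
The last two steps of the flow (cosine similarity comparison, then threshold filtering) can be illustrated with a brute-force sketch. This is an assumption-laden simplification: the package's actual vector store is not reproduced here and may use an approximate index, and `StoredVector`, `Match`, `cosineSimilarity`, and `searchWithThreshold` are hypothetical names. The default limit of 10 is likewise an assumption.

```typescript
interface StoredVector {
  id: string;
  vector: readonly number[];
}

interface Match {
  id: string;
  similarity: number;
}

/** Cosine similarity of two equal-length vectors; 0 if either is all zeros. */
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction, so treat it as matching nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

/** Scores every stored vector, drops those below the threshold, keeps the best `limit`. */
function searchWithThreshold(
  queryVector: readonly number[],
  store: readonly StoredVector[],
  threshold: number,
  limit = 10,
): Match[] {
  return store
    .map((entry) => ({
      id: entry.id,
      similarity: cosineSimilarity(queryVector, entry.vector),
    }))
    .filter((match) => match.similarity >= threshold)
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, limit);
}
```

Because filtering happens after scoring, a query whose best match scores 0.39 returns nothing at a threshold of 0.5, even though ranked candidates exist.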
## Code Trace

### Entry Point: CLI

```typescript
// src/cli/commands/search.ts:53-55
query: Args.text({ name: 'query' }).pipe(
  Args.withDescription('Search query (natural language or regex pattern)'),
),
```

The query enters as a raw text string with no preprocessing.

### Search Mode Detection

```typescript
// src/cli/commands/search.ts:201-206
} else if (isAdvancedQuery(query)) {
  effectiveMode = 'keyword'
  modeReason = 'boolean/phrase pattern detected'
} else if (isRegexPattern(query)) {
  effectiveMode = 'keyword'
  modeReason = 'regex pattern detected'
}
```

Queries with boolean operators (AND, OR, NOT) or quoted phrases are routed to keyword search. Plain multi-word queries go to semantic search.

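
The routing rule described above can be sketched end to end. The real `isAdvancedQuery` and `isRegexPattern` implementations are not shown in this analysis, so the heuristics below (`looksLikeBooleanOrPhrase`, `looksLikeRegex`) are hypothetical stand-ins, and the embeddings-availability check is an assumption based on the keyword fallback observed during dogfooding.

```typescript
type SearchMode = "semantic" | "keyword";

interface ModeDecision {
  mode: SearchMode;
  reason: string;
}

/** Uppercase boolean operators or a double-quoted phrase. */
function looksLikeBooleanOrPhrase(query: string): boolean {
  return /\b(AND|OR|NOT)\b/.test(query) || /"[^"]+"/.test(query);
}

/** Rough heuristic: a few characters that rarely appear in plain-language queries. */
function looksLikeRegex(query: string): boolean {
  return /[\\^$*[\]{}|]/.test(query);
}

function detectSearchMode(
  query: string,
  embeddingsAvailable: boolean,
): ModeDecision {
  if (!embeddingsAvailable) {
    return { mode: "keyword", reason: "no embeddings available" };
  }
  if (looksLikeBooleanOrPhrase(query)) {
    return { mode: "keyword", reason: "boolean/phrase pattern detected" };
  }
  if (looksLikeRegex(query)) {
    return { mode: "keyword", reason: "regex pattern detected" };
  }
  return { mode: "semantic", reason: "plain natural-language query" };
}
```

Under these heuristics, `failure automation` is routed to semantic search, while `"failure automation"` (quoted) and `failure AND automation` go to keyword search, which is the distinction that caused confusion during dogfooding.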
### Semantic Search Function

```typescript
// src/embeddings/semantic-search.ts:558-559
// Embed the query
const queryResult = yield* wrapEmbedding(provider.embed([query]))
```

**No preprocessing** - query is embedded exactly as received.

### Embedding API Call

```typescript
// src/embeddings/openai-provider.ts:175-179
const response = await this.client.embeddings.create({
  model: this.model,
  input: batch, // query text passed directly
  dimensions: 512,
})
```

Query text goes directly to the OpenAI API without modification.

## Query vs Document Format Asymmetry

### Document Embedding Format (from ALP-204)

```
# {heading}
Parent section: {parentHeading}
Document: {documentTitle}

{content}
```

### Query Format

```
{raw query text}
```

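
To make the asymmetry concrete, the document template above can be written as a small formatter. This is a sketch under stated assumptions: `SectionForEmbedding` and `buildEmbeddingText` are hypothetical names (the package's own function is `generateEmbeddingText()`, whose code is not shown here), and omitting the parent line for top-level sections is a guess rather than confirmed behavior.

```typescript
interface SectionForEmbedding {
  heading: string;
  parentHeading?: string; // assumed to be absent for top-level sections
  documentTitle: string;
  content: string;
}

/** Renders a section in the document embedding format shown above. */
function buildEmbeddingText(section: SectionForEmbedding): string {
  const lines = [`# ${section.heading}`];
  if (section.parentHeading) {
    lines.push(`Parent section: ${section.parentHeading}`);
  }
  lines.push(`Document: ${section.documentTitle}`, "", section.content);
  return lines.join("\n");
}

/** The query side has no template: the raw text is embedded unchanged. */
function buildQueryText(query: string): string {
  return query;
}
```

The two functions show the difference in one place: document text carries heading and title metadata, while the query is embedded exactly as typed.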
### Analysis

This asymmetry is **intentional and correct** for semantic search:

1. **Embedding models handle asymmetry**: OpenAI's text-embedding models are trained on diverse text formats. They produce semantically meaningful vectors regardless of format.

2. **Query expansion is not needed**: The embedding model understands "failure automation" conceptually - it doesn't need to see `# Failure Automation` format.

3. **Document context helps disambiguation**: The heading/document metadata in indexed content helps distinguish between sections with similar content but different contexts.

4. **Industry standard practice**: Most RAG systems use plain queries against enriched documents.

|
|
114
|
+
## Query Variation Tests
|
|
115
|
+
|
|
116
|
+
All variations produce semantically similar results:
|
|
117
|
+
|
|
118
|
+
| Query | Top Result | Similarity |
|-------|------------|------------|
| "failure automation" | Best Practices | 61.6% |
| "failure-automation" | Overview | 68.8% |
| "Failure Automation" | Best Practices | 65.6% |
| "automation for failures" | Overview | 70.3% |
| "how to automate failure handling" | Best Practices | 66.4% |
|
|
125
|
+
|
|
126
|
+
**Findings:**
|
|
127
|
+
- Casing doesn't significantly affect results
|
|
128
|
+
- Hyphenation produces slightly different top result
|
|
129
|
+
- Word order matters but doesn't break search
|
|
130
|
+
- Natural language queries work well
|
|
131
|
+
|
|
132
|
+
## Threshold Analysis
|
|
133
|
+
|
|
134
|
+
### Default Threshold Flow
|
|
135
|
+
|
|
136
|
+
```
CLI default: 0.45
  │
  ▼ (if CLI uses default)
Config default: 0.5
  │
  ▼
Effective threshold: 0.5
```
|
|
145
|
+
|
|
146
|
+
When the user doesn't specify `--threshold`, the CLI default (0.45) is superseded by the config default, so the effective value is 0.5.
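
One way to make this precedence explicit is sketched below. It assumes a simple order (explicit CLI flag, then config value, then a built-in fallback); `resolveThreshold` and `FALLBACK_THRESHOLD` are hypothetical names and do not reflect the package's actual code.

```typescript
// Hypothetical fallback used only when neither the CLI nor the config supplies a value.
const FALLBACK_THRESHOLD = 0.5

// Assumed precedence: explicit --threshold flag > config value > fallback.
// `cliFlag` is undefined when the user did not pass --threshold.
function resolveThreshold(
  cliFlag: number | undefined,
  configValue: number | undefined,
): number {
  return cliFlag ?? configValue ?? FALLBACK_THRESHOLD
}
```

Using `??` rather than `||` keeps an explicit threshold of `0` from being treated as "not set".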
|
|
147
|
+
|
|
148
|
+
### Threshold Impact
|
|
149
|
+
|
|
150
|
+
| Threshold | Single-word "failure" | Multi-word "failure automation" |
|-----------|----------------------|--------------------------------|
| 0.5 | 0 results | 7 results |
| 0.3 | 10 results | 7+ results |
| 0.1 | 10 results | 7+ results |
|
|
155
|
+
|
|
156
|
+
The 0.5 threshold filters out low-similarity single-word matches while allowing relevant multi-word matches through.
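
The effect of the threshold can be reproduced with a small sweep over a list of similarity scores. This is a minimal sketch: `countAtThresholds` is a hypothetical helper, the scores in the usage note are illustrative rather than measured, and treating a score equal to the threshold as a match is an assumption.

```typescript
// Counts how many scores pass each candidate threshold.
// Assumption: a score equal to the threshold counts as a match (>=).
function countAtThresholds(
  scores: number[],
  thresholds: number[],
): Map<number, number> {
  const counts = new Map<number, number>()
  for (const threshold of thresholds) {
    counts.set(threshold, scores.filter((score) => score >= threshold).length)
  }
  return counts
}
```

For example, with illustrative single-word scores such as `[0.39, 0.36, 0.33, 0.31]`, a sweep over `[0.5, 0.35, 0.3]` shows the step change between "nothing passes" and "everything passes" that the table above describes.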
|
|
157
|
+
|
|
158
|
+
## Potential Query Enhancements (for ALP-207)
|
|
159
|
+
|
|
160
|
+
While current processing is correct, potential improvements could include:
|
|
161
|
+
|
|
162
|
+
### 1. Query Expansion for Short Queries
|
|
163
|
+
|
|
164
|
+
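```typescript
// Hypothetical enhancement: add a short prefix to queries of one or two words.
// Splitting on runs of whitespace avoids miscounting queries with extra spaces.
const wordCount = query.trim().split(/\s+/).length
const enhancedQuery = wordCount <= 2
  ? `Find content about: ${query}`
  : query
```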
|
|
170
|
+
|
|
171
|
+
### 2. Adaptive Threshold
|
|
172
|
+
|
|
173
|
+
```typescript
// Lower the default threshold for single-word queries.
// An explicit options.threshold always takes precedence.
const wordCount = query.trim().split(/\s+/).length
const adaptiveThreshold = options.threshold ?? (wordCount <= 1 ? 0.3 : 0.5)
```
|
|
179
|
+
|
|
180
|
+
### 3. Hybrid by Default
|
|
181
|
+
|
|
182
|
+
Short queries might benefit from hybrid mode being the default, leveraging both keyword and semantic signals.
|
|
183
|
+
|
|
184
|
+
## Recommendations
|
|
185
|
+
|
|
186
|
+
### No Changes Needed to Query Processing
|
|
187
|
+
|
|
188
|
+
The current implementation is correct. The query flow is:
|
|
189
|
+
- Clean (no unnecessary transformations)
|
|
190
|
+
- Transparent (what you type is what gets embedded)
|
|
191
|
+
- Flexible (users can adjust with --threshold)
|
|
192
|
+
|
|
193
|
+
### Focus Areas for ALP-207
|
|
194
|
+
|
|
195
|
+
1. **Threshold tuning** - Consider lowering default to 0.4 or making it adaptive
|
|
196
|
+
2. **Better feedback** - Show "X results below threshold" when 0 results
|
|
197
|
+
3. **Documentation** - Explain threshold behavior in help text
|
|
198
|
+
4. **Hybrid default** - Consider hybrid mode as default for better coverage
|
|
199
|
+
|
|
200
|
+
## Conclusion
|
|
201
|
+
|
|
202
|
+
Query processing is implemented correctly. The perceived "multi-word query failures" are actually threshold calibration issues, not query processing bugs. The search correctly:
|
|
203
|
+
|
|
204
|
+
1. Passes queries unchanged to embedding API (correct)
|
|
205
|
+
2. Uses asymmetric retrieval (query vs enriched documents) (correct)
|
|
206
|
+
3. Handles query variations semantically (working)
|
|
207
|
+
4. Applies configurable threshold (working, but may need tuning)
|
|
@@ -0,0 +1,114 @@
|
|
|
1
|
+
# Root Cause Analysis and Solution Design
|
|
2
|
+
|
|
3
|
+
## Executive Summary
|
|
4
|
+
|
|
5
|
+
**Root Cause**: The "multi-word semantic search failure" is a **threshold calibration issue**, not a search algorithm bug.
|
|
6
|
+
|
|
7
|
+
**Key Findings**:
|
|
8
|
+
1. Multi-word domain queries WORK correctly (60-70% similarity)
|
|
9
|
+
2. Single-word queries score lower (30-40%) due to embedding model properties
|
|
10
|
+
3. Default 0.5 threshold filters out short/abstract queries
|
|
11
|
+
4. The dogfooding had no embeddings built - agents fell back to keyword search
|
|
12
|
+
5. Embedding text format, query processing, and HNSW config are all correct
|
|
13
|
+
|
|
14
|
+
**Solution**: Lower default threshold + improve user feedback for edge cases.
|
|
15
|
+
|
|
16
|
+
## Synthesis of Diagnostic Findings
|
|
17
|
+
|
|
18
|
+
### ALP-203: Reproduction Results
|
|
19
|
+
|
|
20
|
+
| Query Type | Works at 0.5? | Score Range |
|------------|---------------|-------------|
| "failure automation" | YES | 54-62% |
| "error handling" | YES | 53-64% |
| "failure" (single) | NO | 31-39% |
| "error" (single) | NO | 32-49% |
| "gaps missing omissions" | NO | 30-35% |
|
|
27
|
+
|
|
28
|
+
**Conclusion**: Multi-word domain queries work. Short/abstract queries fail threshold.
|
|
29
|
+
|
|
30
|
+
### ALP-204: Embedding Text Analysis
|
|
31
|
+
|
|
32
|
+
- Format is correct: `# heading\nParent: X\nDocument: Y\n\ncontent`
|
|
33
|
+
- Follows industry best practices
|
|
34
|
+
- No issues identified
|
|
35
|
+
|
|
36
|
+
### ALP-205: Query Processing Analysis
|
|
37
|
+
|
|
38
|
+
- Query passed unchanged to embedding API (correct)
|
|
39
|
+
- Asymmetric retrieval (plain query vs enriched docs) is normal
|
|
40
|
+
- Query variations all work correctly
|
|
41
|
+
|
|
42
|
+
### ALP-206: Vector Search Analysis
|
|
43
|
+
|
|
44
|
+
- HNSW parameters (M=16, efConstruction=200, efSearch=100) are appropriate and well-tuned
|
|
45
|
+
- Cosine distance correct for text embeddings
|
|
46
|
+
- Threshold filtering is the only issue
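
For reference, the similarity percentages quoted throughout these notes are cosine similarities between the query vector and each section vector. The sketch below computes that value directly; `cosineSimilarity` is a hypothetical helper, not the package's implementation, and in an HNSW index configured for cosine space the same number is typically obtained as `1 - distance`.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns 0 for a zero vector, where the angle is undefined.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

A result is kept when this value meets the threshold, which is why the threshold alone decides whether a query scoring in the 30-49% range returns anything.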
|
|
47
|
+
|
|
48
|
+
## Root Cause
|
|
49
|
+
|
|
50
|
+
**Primary Cause**: The default similarity threshold (0.5) is too high for:
|
|
51
|
+
1. Single-word queries (max ~49% similarity due to embedding model properties)
|
|
52
|
+
2. Abstract/meta-language queries
|
|
53
|
+
3. Non-domain-specific queries
|
|
54
|
+
|
|
55
|
+
**NOT the cause**:
|
|
56
|
+
- Embedding text format (correct)
|
|
57
|
+
- Query processing (correct)
|
|
58
|
+
- HNSW parameters (optimal)
|
|
59
|
+
- Embedding model (working as expected)
|
|
60
|
+
|
|
61
|
+
**Contributing Factor**: Dogfooding lacked embeddings, causing confusion about what was failing.
|
|
62
|
+
|
|
63
|
+
## Solution Design
|
|
64
|
+
|
|
65
|
+
### Recommended Approach: Threshold Tuning + UX Improvements
|
|
66
|
+
|
|
67
|
+
#### 1. Lower Default Threshold to 0.35
|
|
68
|
+
|
|
69
|
+
```typescript
// src/config/schema.ts
minSimilarity: Config.number('minSimilarity').pipe(Config.withDefault(0.35))
```
|
|
73
|
+
|
|
74
|
+
**Rationale**:
|
|
75
|
+
- Captures single-word results scoring 35% or higher (single-word scores were observed between 31% and 49%)
|
|
76
|
+
- Still filters irrelevant content (<30%)
|
|
77
|
+
- Low risk - users can adjust with --threshold
|
|
78
|
+
|
|
79
|
+
#### 2. Add "Below Threshold" Feedback
|
|
80
|
+
|
|
81
|
+
When a search returns 0 results, show a hint about lower-scored results:
|
|
82
|
+
|
|
83
|
+
```
Results: 0

Note: 10 results found below 0.35 threshold (highest: 0.34)
Tip: Use --threshold 0.3 to see more results
```
|
|
89
|
+
|
|
90
|
+
#### 3. Consider Hybrid Search as Default
|
|
91
|
+
|
|
92
|
+
For queries without boolean operators, hybrid mode provides better coverage by combining semantic and keyword signals.
|
|
93
|
+
|
|
94
|
+
## Implementation Plan for Phase 2
|
|
95
|
+
|
|
96
|
+
1. **Lower default threshold** - Change config default from 0.5 to 0.35
|
|
97
|
+
2. **Add below-threshold feedback** - Show hint when 0 results
|
|
98
|
+
3. **Document threshold behavior** - Update README/help
|
|
99
|
+
4. **Validate changes** - Re-run test corpus
|
|
100
|
+
|
|
101
|
+
## Expected Outcomes
|
|
102
|
+
|
|
103
|
+
| Metric | Before | After |
|--------|--------|-------|
| Single-word results at default | 0 | 10+ |
| Multi-word results | 7+ | 7+ (unchanged) |
|
|
107
|
+
|
|
108
|
+
## Conclusion
|
|
109
|
+
|
|
110
|
+
The "multi-word semantic search failure" was misidentified. Multi-word queries work correctly. The issue is threshold calibration affecting single-word and abstract queries.
|
|
111
|
+
|
|
112
|
+
**Recommended Solution**: Lower threshold to 0.35, add user feedback, improve documentation.
|
|
113
|
+
|
|
114
|
+
**No algorithmic changes needed** to embedding generation, query processing, or vector search.
|
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
# Threshold Validation Report
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Validation confirms that lowering the default similarity threshold from 0.5 to 0.35 (ALP-208) **fixes single-word query failures** without regressing multi-word query performance.
|
|
6
|
+
|
|
7
|
+
## Test Environment
|
|
8
|
+
|
|
9
|
+
- **Test Corpus**: `src/__tests__/fixtures/semantic-search/multi-word-corpus/`
|
|
10
|
+
- **Documents**: 6 markdown files (failure-automation, job-context, error-handling, configuration-management, distributed-systems, process-orchestration)
|
|
11
|
+
- **Sections**: 52 embedded vectors
|
|
12
|
+
- **Date**: 2026-01-26
|
|
13
|
+
|
|
14
|
+
## Before/After Comparison
|
|
15
|
+
|
|
16
|
+
### Single-Word Queries
|
|
17
|
+
|
|
18
|
+
| Query | Before (0.5) | After (0.35) | Top Match | Top Score |
|-------|-------------|--------------|-----------|-----------|
| "failure" | 0 results | **6 results** | failure-automation.md: Failure Isolation | 39.0% |
| "error" | 0 results | **7 results** | error-handling.md: Programming Errors | 49.1% |
| "automation" | 0 results | **10 results** | failure-automation.md: Overview | 44.9% |
| "context" | 0 results | **10 results** | job-context.md: What is Job Context? | 48.1% |
|
|
24
|
+
|
|
25
|
+
**Improvement**: All 4 single-word test queries (100%) now return results at the default threshold, each with a top match from a topically related document.
|
|
26
|
+
|
|
27
|
+
### Multi-Word Queries (Regression Check)
|
|
28
|
+
|
|
29
|
+
| Query | Before (0.5) | After (0.35) | Top Match | Top Score |
|-------|-------------|--------------|-----------|-----------|
| "failure automation" | 7 results | 10 results | failure-automation.md: Best Practices | 61.5% |
| "job context" | 4 results | 7 results | job-context.md: What is Job Context? | 60.4% |
| "error handling" | 7 results | 10 results | error-handling.md: Introduction | 63.6% |
| "configuration management" | 8 results | 10 results | configuration-management.md: Overview | 69.5% |
| "distributed systems" | 4 results | 10 results | distributed-systems.md: What Are... | 60.9% |
| "process orchestration" | 8 results | 10 results | process-orchestration.md: Introduction | 67.9% |
|
|
37
|
+
|
|
38
|
+
**Finding**: No regression. Multi-word queries actually return MORE results (expected, since threshold is lower), with the same top matches and scores.
|
|
39
|
+
|
|
40
|
+
## Success Criteria Validation
|
|
41
|
+
|
|
42
|
+
- [x] **Single-word queries return results at default threshold** - All 4 test queries now return 6-10 results
|
|
43
|
+
- [x] **Multi-word queries work as before (no regression)** - All 6 queries return results with same top matches
|
|
44
|
+
- [x] **Quantitative improvement documented** - See tables above
|
|
45
|
+
|
|
46
|
+
## Below-Threshold Feedback (ALP-209)
|
|
47
|
+
|
|
48
|
+
The new feedback feature correctly reports results below threshold:
|
|
49
|
+
|
|
50
|
+
```json
{
  "results": [...6 results...],
  "belowThresholdCount": 14,
  "belowThresholdHighest": 0.349
}
```
|
|
57
|
+
|
|
58
|
+
This helps users understand that more content exists if they lower the threshold.
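
A minimal sketch of how such feedback could be derived from the scored candidates follows. The field names `belowThresholdCount` and `belowThresholdHighest` come from the output above; the `Scored` and `ThresholdSummary` types, the `summarize` function, and the `>=` comparison are assumptions for illustration only.

```typescript
// Hypothetical shape of a scored candidate returned by the vector search.
interface Scored {
  id: string
  similarity: number
}

interface ThresholdSummary {
  results: Scored[]
  belowThresholdCount: number
  // Undefined when no candidate fell below the threshold.
  belowThresholdHighest: number | undefined
}

// Splits candidates at the threshold and reports what was filtered out.
function summarize(candidates: Scored[], threshold: number): ThresholdSummary {
  const results = candidates.filter((c) => c.similarity >= threshold)
  const below = candidates.filter((c) => c.similarity < threshold)
  const belowThresholdHighest =
    below.length > 0 ? Math.max(...below.map((c) => c.similarity)) : undefined
  return { results, belowThresholdCount: below.length, belowThresholdHighest }
}
```

Reporting the highest filtered score lets a CLI suggest a concrete `--threshold` value instead of a generic hint.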
|
|
59
|
+
|
|
60
|
+
## Conclusion
|
|
61
|
+
|
|
62
|
+
The threshold change from 0.5 to 0.35 is validated as the correct fix:
|
|
63
|
+
|
|
64
|
+
1. **Single-word queries now work** - Users can search for concepts like "failure", "error", "context"
|
|
65
|
+
2. **Multi-word queries unaffected** - High-quality results with same top matches
|
|
66
|
+
3. **User guidance in place** - Documentation (ALP-210) explains threshold behavior
|
|
67
|
+
4. **Below-threshold feedback** - Users see when lowering threshold would help
|
|
68
|
+
|
|
69
|
+
The root cause identified in ALP-207 (threshold too high for short queries scoring 30-40%) is confirmed fixed.
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
# Vector Search Parameters and Scoring Analysis
|
|
2
|
+
|
|
3
|
+
## Executive Summary
|
|
4
|
+
|
|
5
|
+
The HNSW vector search configuration is **appropriate and well-tuned**. The root cause of "0 results" is **NOT the vector search algorithm**, but the **similarity threshold filtering** applied after search.
|
|
6
|
+
|
|
7
|
+
Key finding: Single-word queries have inherently lower similarity scores (31-49%) than multi-word queries (50-70%). The default 0.5 threshold filters out all single-word results.
|
|
8
|
+
|
|
9
|
+
## HNSW Configuration
|
|
10
|
+
|
|
11
|
+
### Current Parameters
|
|
12
|
+
|
|
13
|
+
From `src/embeddings/vector-store.ts:98`:
|
|
14
|
+
|
|
15
|
+
```typescript
this.index.initIndex(10000, 16, 200, 100)
// maxElements, M, efConstruction, efSearch
```
|
|
19
|
+
|
|
20
|
+
| Parameter | Value | Description | Assessment |
|-----------|-------|-------------|------------|
| maxElements | 10,000 | Initial capacity (auto-resizes) | Adequate |
| M | 16 | Max connections per node | Good balance |
| efConstruction | 200 | Construction-time search width | High quality |
| efSearch | 100 | Query-time search width | Good recall |
|
|
26
|
+
|
|
27
|
+
All parameters are well-tuned. No changes needed.
|
|
28
|
+
|
|
29
|
+
## Similarity Score Analysis
|
|
30
|
+
|
|
31
|
+
### Threshold Experiment
|
|
32
|
+
|
|
33
|
+
Testing "failure" at different thresholds:
|
|
34
|
+
|
|
35
|
+
| Threshold | Results | Top Score |
|-----------|---------|-----------|
| 0.0 | 10 | 39.1% |
| 0.3 | 10 | 39.1% |
| 0.4 | 0 | - |
| 0.5 | 0 | - |
|
|
41
|
+
|
|
42
|
+
### Score Distribution by Query Type
|
|
43
|
+
|
|
44
|
+
| Query Type | Score Range | Results at 0.5 |
|------------|-------------|----------------|
| Single word | 31-49% | 0 |
| Two-word domain | 54-70% | 7+ |
| Natural language | 50-66% | 9 |
|
|
49
|
+
|
|
50
|
+
## Root Cause
|
|
51
|
+
|
|
52
|
+
The 0.5 default threshold filters out single-word results (max ~49%). This is threshold calibration, not a search algorithm issue.
|
|
53
|
+
|
|
54
|
+
## Recommendations for ALP-207
|
|
55
|
+
|
|
56
|
+
1. Lower default threshold to 0.3-0.4
|
|
57
|
+
2. Consider adaptive threshold by query length
|
|
58
|
+
3. Show "N results below threshold" message
|
|
59
|
+
4. Make threshold more visible in docs
|
|
60
|
+
|
|
61
|
+
## Conclusion
|
|
62
|
+
|
|
63
|
+
Vector search works correctly. Focus ALP-207 on threshold tuning, not algorithmic changes.
|