mdcontext 0.0.1 → 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.changeset/README.md +28 -0
- package/.changeset/config.json +11 -0
- package/.github/workflows/ci.yml +83 -0
- package/.github/workflows/release.yml +113 -0
- package/.tldrignore +112 -0
- package/AGENTS.md +46 -0
- package/BACKLOG.md +338 -0
- package/README.md +231 -11
- package/biome.json +36 -0
- package/cspell.config.yaml +14 -0
- package/dist/chunk-KRYIFLQR.js +92 -0
- package/dist/chunk-S7E6TFX6.js +742 -0
- package/dist/chunk-VVTGZNBT.js +1519 -0
- package/dist/cli/main.d.ts +1 -0
- package/dist/cli/main.js +2015 -0
- package/dist/index.d.ts +266 -0
- package/dist/index.js +86 -0
- package/dist/mcp/server.d.ts +1 -0
- package/dist/mcp/server.js +376 -0
- package/docs/019-USAGE.md +586 -0
- package/docs/020-current-implementation.md +364 -0
- package/docs/021-DOGFOODING-FINDINGS.md +175 -0
- package/docs/BACKLOG.md +80 -0
- package/docs/DESIGN.md +439 -0
- package/docs/PROJECT.md +88 -0
- package/docs/ROADMAP.md +407 -0
- package/docs/test-links.md +9 -0
- package/package.json +69 -10
- package/pnpm-workspace.yaml +5 -0
- package/research/config-analysis/01-current-implementation.md +470 -0
- package/research/config-analysis/02-strategy-recommendation.md +428 -0
- package/research/config-analysis/03-task-candidates.md +715 -0
- package/research/config-analysis/033-research-configuration-management.md +828 -0
- package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
- package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
- package/research/dogfood/consolidated-tool-evaluation.md +373 -0
- package/research/dogfood/strategy-a/a-synthesis.md +184 -0
- package/research/dogfood/strategy-a/a1-docs.md +226 -0
- package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
- package/research/dogfood/strategy-a/a3-llm.md +164 -0
- package/research/dogfood/strategy-b/b-synthesis.md +228 -0
- package/research/dogfood/strategy-b/b1-architecture.md +207 -0
- package/research/dogfood/strategy-b/b2-gaps.md +258 -0
- package/research/dogfood/strategy-b/b3-workflows.md +250 -0
- package/research/dogfood/strategy-c/c-synthesis.md +451 -0
- package/research/dogfood/strategy-c/c1-explorer.md +192 -0
- package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
- package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
- package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
- package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
- package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
- package/research/effect-cli-error-handling.md +845 -0
- package/research/effect-errors-as-values.md +943 -0
- package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
- package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
- package/research/errors-task-analysis/embeddings-analysis.md +709 -0
- package/research/errors-task-analysis/index-search-analysis.md +812 -0
- package/research/mdcontext-error-analysis.md +521 -0
- package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
- package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
- package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
- package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
- package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
- package/research/semantic-search/002-research-embedding-models.md +490 -0
- package/research/semantic-search/003-research-rag-alternatives.md +523 -0
- package/research/semantic-search/004-research-vector-search.md +841 -0
- package/research/semantic-search/032-research-semantic-search.md +427 -0
- package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
- package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
- package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
- package/research/task-management-2026/03-lightweight-file-based.md +567 -0
- package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
- package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
- package/research/task-management-2026/linear/02-api-integrations.md +930 -0
- package/research/task-management-2026/linear/03-ai-features.md +368 -0
- package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
- package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
- package/scripts/rebuild-hnswlib.js +63 -0
- package/src/cli/argv-preprocessor.test.ts +210 -0
- package/src/cli/argv-preprocessor.ts +202 -0
- package/src/cli/cli.test.ts +430 -0
- package/src/cli/commands/backlinks.ts +54 -0
- package/src/cli/commands/context.ts +197 -0
- package/src/cli/commands/index-cmd.ts +300 -0
- package/src/cli/commands/index.ts +13 -0
- package/src/cli/commands/links.ts +52 -0
- package/src/cli/commands/search.ts +451 -0
- package/src/cli/commands/stats.ts +146 -0
- package/src/cli/commands/tree.ts +107 -0
- package/src/cli/flag-schemas.ts +275 -0
- package/src/cli/help.ts +386 -0
- package/src/cli/index.ts +9 -0
- package/src/cli/main.ts +145 -0
- package/src/cli/options.ts +31 -0
- package/src/cli/typo-suggester.test.ts +105 -0
- package/src/cli/typo-suggester.ts +130 -0
- package/src/cli/utils.ts +126 -0
- package/src/core/index.ts +1 -0
- package/src/core/types.ts +140 -0
- package/src/embeddings/index.ts +8 -0
- package/src/embeddings/openai-provider.ts +165 -0
- package/src/embeddings/semantic-search.ts +583 -0
- package/src/embeddings/types.ts +82 -0
- package/src/embeddings/vector-store.ts +299 -0
- package/src/index/index.ts +4 -0
- package/src/index/indexer.ts +446 -0
- package/src/index/storage.ts +196 -0
- package/src/index/types.ts +109 -0
- package/src/index/watcher.ts +131 -0
- package/src/index.ts +8 -0
- package/src/mcp/server.ts +483 -0
- package/src/parser/index.ts +1 -0
- package/src/parser/parser.test.ts +291 -0
- package/src/parser/parser.ts +395 -0
- package/src/parser/section-filter.ts +270 -0
- package/src/search/query-parser.test.ts +260 -0
- package/src/search/query-parser.ts +319 -0
- package/src/search/searcher.test.ts +182 -0
- package/src/search/searcher.ts +602 -0
- package/src/summarize/budget-bugs.test.ts +620 -0
- package/src/summarize/formatters.ts +419 -0
- package/src/summarize/index.ts +20 -0
- package/src/summarize/summarizer.test.ts +275 -0
- package/src/summarize/summarizer.ts +528 -0
- package/src/summarize/verify-bugs.test.ts +238 -0
- package/src/utils/index.ts +1 -0
- package/src/utils/tokens.test.ts +142 -0
- package/src/utils/tokens.ts +186 -0
- package/tests/fixtures/cli/.mdcontext/config.json +8 -0
- package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
- package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
- package/tests/fixtures/cli/.mdcontext/indexes/sections.json +233 -0
- package/tests/fixtures/cli/.mdcontext/vectors.bin +0 -0
- package/tests/fixtures/cli/.mdcontext/vectors.meta.json +1264 -0
- package/tests/fixtures/cli/README.md +9 -0
- package/tests/fixtures/cli/api-reference.md +11 -0
- package/tests/fixtures/cli/getting-started.md +11 -0
- package/tsconfig.json +26 -0
- package/vitest.config.ts +21 -0
- package/vitest.setup.ts +12 -0
@@ -0,0 +1,427 @@

# Research Task Analysis: Embedding Models, RAG Alternatives, and Vector Search

Analysis of research documents against the current mdcontext implementation.

Date: January 2026

---

## Documents Analyzed

1. `002-research-embedding-models.md` - Embedding model comparison and recommendations
2. `003-research-rag-alternatives.md` - RAG alternatives for improving semantic search
3. `004-research-vector-search.md` - Vector search patterns and techniques

---

## Implemented (No Action Needed)

### 1. Dimension Reduction (512 dimensions)

**Research Recommendation:** Reduce OpenAI embeddings from 1536 to 512 dimensions for a 67% storage reduction with minimal quality loss.

**Current Implementation:** Already implemented in `src/embeddings/openai-provider.ts`:

```typescript
const response = await this.client.embeddings.create({
  model: this.model,
  input: batch,
  dimensions: 512, // Already using reduced dimensions
});
```

**Status:** Implemented

---

### 2. HNSW Vector Index

**Research Recommendation:** Stay with HNSW for documentation corpora (<100K sections).

**Current Implementation:** Using `hnswlib-node` with cosine similarity in `src/embeddings/vector-store.ts`:

```typescript
this.index = new HierarchicalNSW.HierarchicalNSW("cosine", this.dimensions);
this.index.initIndex(10000, 16, 200, 100); // M=16, efConstruction=200
```

**Status:** Implemented

---

### 3. EmbeddingProvider Interface

**Research Recommendation:** Create a provider abstraction for embedding models.

**Current Implementation:** `src/embeddings/types.ts` defines a clean provider interface:

```typescript
export interface EmbeddingProvider {
  readonly name: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<EmbeddingResult>;
}
```

**Status:** Implemented (foundation ready for additional providers)

---

### 4. Path Pattern Filtering (Post-filter)

**Research Recommendation:** Implement metadata filtering for search results.

**Current Implementation:** The `pathPattern` option is implemented in `semanticSearch()` as post-filtering.

**Status:** Implemented (basic)

---

### 5. Document Context in Embeddings

**Research Recommendation:** Include the document title and parent section in the embedding text.

**Current Implementation:** Already in `src/embeddings/semantic-search.ts`:

```typescript
const generateEmbeddingText = (
  section,
  content,
  documentTitle,
  parentHeading,
) => {
  parts.push(`# ${section.heading}`);
  if (parentHeading) parts.push(`Parent section: ${parentHeading}`);
  parts.push(`Document: ${documentTitle}`);
  parts.push(content);
  // ...
};
```

**Status:** Implemented

---

## Task Candidates

### 1. Add Hybrid Search (BM25 + Semantic)

**Priority:** High

**Description:**
Implement hybrid search combining BM25 keyword search with semantic search, using Reciprocal Rank Fusion (RRF) to merge results.

**Why It Matters:**

- Research shows a 15-30% recall improvement over single-method retrieval
- Handles exact term matching (API names, error codes, identifiers) that pure semantic search misses
- Keyword search exists today but isn't integrated with semantic search

**Implementation Notes:**

- Add the `wink-bm25-text-search` dependency
- Build the BM25 index alongside the vector index during `mdcontext embed`
- Add a `--mode hybrid` option to the search command
- Implement RRF fusion (~50 lines of code)

**Current Gap:**

- Keyword search (`src/search/searcher.ts`) and semantic search (`src/embeddings/semantic-search.ts`) are separate codepaths
- No fusion mechanism exists

**Estimated Effort:** 2-3 days
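The fusion step itself is small. A minimal sketch of RRF (the function name, result shape, and the conventional k=60 constant are illustrative, not from the codebase):

```typescript
// Reciprocal Rank Fusion: each ranked list contributes 1/(k + rank) per id,
// and the summed scores produce the merged ordering. k damps the dominance
// of top-ranked items; 60 is the value commonly used in the literature.
function rrfFuse(
  rankings: string[][],
  k = 60,
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, rank) => {
      // rank is 0-based here, so offset by 1 to match the usual formula.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Feeding it the section-id rankings from the keyword and semantic searchers would give the `--mode hybrid` result order directly.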
---

### 2. Add Local Embedding Provider (Ollama)

**Priority:** High

**Description:**
Implement an Ollama-based embedding provider using `nomic-embed-text-v1.5` for offline semantic search.

**Why It Matters:**

- Enables offline semantic search (a major feature gap)
- Zero ongoing API costs
- Quality matches or exceeds OpenAI text-embedding-3-small
- Supports privacy-sensitive use cases

**Implementation Notes:**

- Create `src/embeddings/ollama-provider.ts` implementing `EmbeddingProvider`
- Add provider selection via config or a `--provider` CLI flag
- Default to OpenAI for backward compatibility
- nomic-embed-text supports Matryoshka embeddings (dimension flexibility)

**Models to Support:**

1. `nomic-embed-text` - Best overall fit (fast, 8K context, Matryoshka)
2. `mxbai-embed-large` - Higher-quality option
3. `bge-m3` - Multilingual option

**Estimated Effort:** 2-3 days
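A sketch of what the provider could look like. This assumes Ollama's `/api/embeddings` endpoint (one prompt per request) and a local `EmbeddingResult` shape; the project's actual `EmbeddingResult` type and any batching strategy would need to be checked before wiring this in:

```typescript
// Assumed local shape; the real type lives in src/embeddings/types.ts.
interface EmbeddingResult {
  embeddings: number[][];
}

// Sketch of an Ollama-backed EmbeddingProvider. Endpoint, default model,
// and the 768-dim default are assumptions to illustrate the interface fit.
class OllamaProvider {
  readonly name = "ollama";
  readonly dimensions: number;

  constructor(
    private model = "nomic-embed-text",
    dimensions = 768,
    private baseUrl = "http://localhost:11434",
  ) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<EmbeddingResult> {
    const embeddings: number[][] = [];
    for (const text of texts) {
      // One request per text; a real implementation may batch or parallelize.
      const res = await fetch(`${this.baseUrl}/api/embeddings`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: this.model, prompt: text }),
      });
      const data = (await res.json()) as { embedding: number[] };
      embeddings.push(data.embedding);
    }
    return { embeddings };
  }
}
```

Because the class only touches the network inside `embed()`, provider selection and dimension reporting can be exercised without a running Ollama instance.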
---

### 3. Add Cross-Encoder Re-ranking

**Priority:** Medium

**Description:**
Add optional re-ranking of the top-N semantic search results using a cross-encoder model.

**Why It Matters:**

- 20-35% accuracy improvement in retrieval precision
- Cross-encoders capture fine-grained relevance that bi-encoders miss
- Can be opt-in to avoid latency when not needed

**Implementation Notes:**

- Add the `@xenova/transformers` dependency for Transformers.js
- Use the `ms-marco-MiniLM-L-6-v2` model (22.7M params, 2-5ms/pair)
- Re-rank the top-20 candidates down to the top 10
- Add a `--rerank` flag to the search command

**Alternative:** The Cohere Rerank API for simpler integration (adds cost)

**Estimated Effort:** 2-3 days
---

### 4. Add Dynamic efSearch (Quality Modes)

**Priority:** Medium

**Description:**
Allow users to control the search quality/speed tradeoff via the HNSW efSearch parameter at query time.

**Why It Matters:**

- Zero dependency changes
- Immediate quality/speed improvements
- Low risk

**Implementation Notes:**

- Add a `--quality` flag: `fast` (64), `balanced` (100), `thorough` (256)
- efSearch is already configurable at query time in hnswlib-node
- Update search functions to accept a quality parameter

**Current State:**

```typescript
this.index.initIndex(10000, 16, 200, 100); // efSearch=100 (implicit)
```

**Estimated Effort:** 0.5 days
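A sketch of the flag-to-parameter mapping (the `Quality` type name and `efForQuality` helper are illustrative; only the three mode values come from the notes above):

```typescript
// Map the proposed --quality flag values to HNSW efSearch widths.
type Quality = "fast" | "balanced" | "thorough";

const EF_SEARCH: Record<Quality, number> = {
  fast: 64, // narrower beam, fastest queries
  balanced: 100, // current implicit default
  thorough: 256, // widest beam, best recall
};

function efForQuality(quality: Quality): number {
  return EF_SEARCH[quality];
}

// At query time (sketch, assuming hnswlib-node's setEf):
//   this.index.setEf(efForQuality(options.quality ?? "balanced"));
```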
---

### 5. Add Configurable HNSW Parameters

**Priority:** Low

**Description:**
Expose HNSW build parameters (M, efConstruction) via configuration for users with specific needs.

**Why It Matters:**

- Users with large corpora may want to tune for speed
- Users needing maximum recall can increase the parameters
- Enables benchmarking different configurations

**Current Hardcoded Values:**

```typescript
M: 16; // Max connections per node
efConstruction: 200; // Construction-time search width
```

**Recommended Configurations:**

- Quality-focused: M=24, efConstruction=256
- Speed-focused: M=12, efConstruction=128

**Estimated Effort:** 1 day
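The config plumbing could be as small as merging user overrides onto the current hardcoded defaults (key names here are illustrative, not the project's actual config schema):

```typescript
// HNSW build parameters; defaults mirror the current hardcoded values.
interface HnswConfig {
  M: number; // max connections per node
  efConstruction: number; // construction-time search width
}

const HNSW_DEFAULTS: HnswConfig = { M: 16, efConstruction: 200 };

// Partial user config wins over defaults, so existing setups are unchanged.
function resolveHnswConfig(user: Partial<HnswConfig> = {}): HnswConfig {
  return { ...HNSW_DEFAULTS, ...user };
}
```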
---

### 6. Add Query Preprocessing

**Priority:** Low

**Description:**
Add basic query preprocessing before embedding to reduce noise.

**Why It Matters:**

- 2-5% precision improvement
- Simple implementation

**Implementation:**

```typescript
function preprocessQuery(query: string): string {
  return query
    .toLowerCase()
    .replace(/[^\w\s]/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}
```

**Estimated Effort:** 1-2 hours
---

### 7. Add Heading Match Boost

**Priority:** Low

**Description:**
Boost search results when query terms appear in section headings.

**Why It Matters:**

- Significant for navigation queries ("installation guide", "API reference")
- Simple scoring adjustment

**Implementation:**

```typescript
function adjustScore(
  result: { heading: string; similarity: number },
  query: string,
): number {
  const queryTerms = query.toLowerCase().split(/\s+/);
  const headingLower = result.heading.toLowerCase();
  const headingMatches = queryTerms.filter((t) =>
    headingLower.includes(t),
  ).length;
  return result.similarity + headingMatches * 0.05;
}
```

**Estimated Effort:** 2-4 hours
---

### 8. Add HyDE Query Expansion (Optional)

**Priority:** Low

**Description:**
Implement Hypothetical Document Embeddings for complex queries: generate a hypothetical answer with an LLM, then search using that answer's embedding.

**Why It Matters:**

- 10-30% retrieval improvement on ambiguous queries
- Bridges the semantic gap between short questions and detailed documents

**Considerations:**

- Adds an LLM call (cost, latency)
- Should be opt-in for complex queries only
- Works poorly if the LLM lacks domain knowledge

**Estimated Effort:** 1-2 days
---

### 9. Fix Dimension Mismatch in Provider

**Priority:** Medium

**Description:**
The OpenAI provider reports incorrect dimensions (1536/3072) while actually using 512.

**Current Issue:**

```typescript
// In openai-provider.ts
this.dimensions = this.model === "text-embedding-3-large" ? 3072 : 1536;
// But the actual API call uses:
dimensions: 512;
```

This mismatch could cause issues if other code relies on `provider.dimensions`.

**Fix:** Update the dimension reporting to match the actual API parameter.

**Estimated Effort:** 0.5 hours
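One way to make the mismatch impossible is a single constant feeding both the reported value and the request (identifiers below are illustrative, not the provider's actual names):

```typescript
// Single source of truth for the requested embedding dimensionality.
const REQUESTED_DIMENSIONS = 512;

// The same constant flows into the API request parameters...
function embeddingRequestParams(model: string, input: string[]) {
  return { model, input, dimensions: REQUESTED_DIMENSIONS };
}

// ...and into what the provider reports, so they can never diverge.
const providerDimensions = REQUESTED_DIMENSIONS;
```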
---

### 10. Add Alternative API Provider (Voyage AI)

**Priority:** Low

**Description:**
Add Voyage AI as an alternative embedding provider for users who want better quality at similar cost.

**Why It Matters:**

- voyage-3.5-lite: same price as OpenAI ($0.02/1M tokens), but better quality
- 32K token context (4x OpenAI)
- Free 200M tokens for testing

**Estimated Effort:** 1 day

---

## Skip (Not Applicable)

| Recommendation           | Reason to Skip                                                        |
| ------------------------ | --------------------------------------------------------------------- |
| ColBERT Late Interaction | Overkill for documentation corpus sizes; requires a Python service    |
| SPLADE Sparse Retrieval  | BM25 + semantic hybrid likely sufficient; adds complexity             |
| GraphRAG                 | Overkill for documentation search                                     |
| Fine-tuned Embeddings    | Requires training infrastructure; general models work well            |
| IVF/DiskANN Indexes      | HNSW sufficient for typical documentation sizes (<100K sections)      |
| LLM-based Re-ranking     | Cross-encoders provide similar quality without LLM cost/latency       |
| Self-RAG                 | Beyond current scope; more relevant for RAG pipelines with generation |

---

## Summary

| Category        | Count             |
| --------------- | ----------------- |
| Implemented     | 5 items           |
| Task Candidates | 10 items          |
| Skipped         | 7 recommendations |

### Priority Matrix

| Priority   | Tasks                                                                     |
| ---------- | ------------------------------------------------------------------------- |
| **High**   | Hybrid Search (BM25+RRF), Local Embedding Provider (Ollama)               |
| **Medium** | Cross-Encoder Re-ranking, Dynamic efSearch, Fix Dimension Mismatch        |
| **Low**    | HNSW Config, Query Preprocessing, Heading Boost, HyDE, Voyage AI Provider |

### Recommended Implementation Order

**Phase 1: Quick Wins (1 week)**

1. Fix the dimension mismatch in the provider
2. Add dynamic efSearch (quality modes)
3. Add query preprocessing
4. Add heading match boost

**Phase 2: Hybrid Search (1-2 weeks)**

1. Integrate a BM25 library
2. Build the BM25 index during embed
3. Implement RRF fusion
4. Add `--mode hybrid` to the CLI

**Phase 3: Local/Offline (1-2 weeks)**

1. Implement the Ollama provider
2. Add provider selection to the CLI
3. Test with nomic-embed-text

**Phase 4: Advanced (2 weeks)**

1. Cross-encoder re-ranking (Transformers.js)
2. HyDE query expansion (optional)
3. Alternative API providers (Voyage AI)
@@ -0,0 +1,295 @@

# Task Management for Ralph: Synthesis & Recommendations

**Date:** January 21, 2026
**Purpose:** Recommend a task management approach for capturing intent before AI agent (ralph) processing

---

## Executive Summary

**For capturing task intent in an AI agent workflow, use a file-based SPEC.md pattern stored in Git.** This approach offers the best balance of human usability, LLM compatibility, and zero friction. External tools like Linear or GitHub Issues add unnecessary complexity for the core problem of "capture intent quickly so ralph can act on it." The SPEC.md pattern is already proven in the codebase, requires no new tooling, and aligns with the emerging industry standard of spec-driven development. Reserve external tools (Linear, GitHub Issues) for team coordination and visibility, not for the AI agent interface.

---

## Context

### The Problem

We need a system for capturing task intent before processing through "ralph" (an AI agent orchestration system). This is specifically about the **input interface** to the agent, not project management at large.

### Requirements

| Requirement                 | Priority | Notes                                   |
| --------------------------- | -------- | --------------------------------------- |
| Capture task intent quickly | Critical | Friction kills adoption                 |
| LLM-friendly format         | Critical | Ralph must parse and understand it      |
| Git-native                  | High     | Version control, audit trail, offline   |
| Low friction for humans     | High     | Must be faster than "just do it myself" |
| Scales to many tasks        | Medium   | Backlog accumulation, parallel work     |

### Constraint

Ralph operates on files in a repository. Whatever system we choose must either:

1. **Live in the repo** (file-based), or
2. **Be accessible via MCP/API** (external tool)

---

## Options Matrix

### Top 4 Approaches Evaluated

| Approach            | Quick Capture | LLM-Friendly | Git-Native | Low Friction | Scales | Overall |
| ------------------- | ------------- | ------------ | ---------- | ------------ | ------ | ------- |
| **SPEC.md Pattern** | A             | A            | A          | A            | B      | **A**   |
| **Backlog.md**      | B             | A            | A          | B            | A      | **A-**  |
| **Linear**          | B             | B            | C          | A            | A      | **B+**  |
| **GitHub Issues**   | C             | B            | B          | B            | A      | **B**   |

### Detailed Analysis

#### 1. SPEC.md Pattern (File-Based Specifications)

**What it is:** Markdown files in the repo (e.g., `SPEC.md`, `TODO.md`, `PLAN.md`) that define task intent. Already emerging as the industry standard for AI-assisted development in 2026.

**Pros:**

- Zero dependencies: works with any editor, any AI tool
- Direct context injection into LLM prompts (no API calls)
- Git versioning provides an audit trail and rollback for free
- Human and AI edit the same artifact
- Token-efficient format (markdown is LLM-native)
- Already using something similar in the codebase

**Cons:**

- No built-in visualization (Kanban, timelines)
- Requires discipline to maintain structure
- Doesn't scale well for team-wide coordination across multiple projects

**Best for:** Individual or small-team AI coding workflows where the developer is the primary user.

#### 2. Backlog.md

**What it is:** A purpose-built tool for AI-human collaboration. Each task is a separate `.md` file in a `.backlog/` directory. CLI and web interfaces are available.

**Pros:**

- Designed specifically for AI agent workflows
- Git-native, with task IDs referencing commits/branches
- Terminal Kanban for visualization
- Works with Claude Code, Cursor, and MCP-compatible tools
- Open source, no vendor lock-in

**Cons:**

- Adds a tool/CLI dependency
- Slightly more structure than bare SPEC.md
- Newer tool with a smaller community

**Best for:** Teams wanting more structure than plain markdown while staying git-native.

#### 3. Linear

**What it is:** The leading AI-native project management tool. Treats AI agents as "full teammates" with native integrations for Cursor, Copilot, and Claude.

**Pros:**

- Best-in-class AI agent integration (agents as assignable teammates)
- Excellent API and MCP servers
- Team visibility and coordination features
- Fast, keyboard-first UI

**Cons:**

- External service (not in the repo)
- Requires MCP/API integration for ralph to access
- Adds latency to task capture (open the tool, create an issue)
- Paid at scale ($8/user/month)
- Another tool to maintain

**Best for:** Team coordination and roadmap visibility, when multiple people need to see task status.

#### 4. GitHub Issues

**What it is:** GitHub's built-in issue tracker, now with a Copilot coding agent that can be assigned issues directly.

**Pros:**

- Already using GitHub
- Copilot can autonomously work on issues
- No new tool adoption
- Free

**Cons:**

- Slow to create issues (web UI friction)
- Not optimized for quick intent capture
- Less LLM-friendly than markdown files
- Requires API calls for ralph to read

**Best for:** The issue-to-PR workflow with Copilot, and team-visible bug tracking.
|
|
133
|
+
|
|
134
|
+
---
|
|
135
|
+
|
|
136
|
+
## Recommendation
|
|
137
|
+
|
|
138
|
+
### Primary: SPEC.md Pattern
|
|
139
|
+
|
|
140
|
+
**Use file-based specifications as the primary interface between humans and ralph.**
|
|
141
|
+
|
|
142
|
+
#### Rationale
|
|
143
|
+
|
|
144
|
+
1. **Lowest friction for capture:** Open file, write intent, save. Done. No context switching to another tool.
|
|
145
|
+
|
|
146
|
+
2. **Best LLM compatibility:** Markdown is the native language of LLMs. No translation layer, no API calls, no token overhead from structured formats.
|
|
147
|
+
|
|
148
|
+
3. **Already proven:** The research shows spec-driven development is the emerging standard. GitHub Spec-Kit, JetBrains Junie, Amazon Kiro - all use this pattern.
|
|
149
|
+
|
|
150
|
+
4. **Git-native by default:** Every change is versioned. Branching allows parallel task exploration. History shows what was attempted and why.
|
|
151
|
+
|
|
152
|
+
5. **Zero new tooling:** Works today with the existing setup. No adoption curve, no dependencies to maintain.
|
|
153
|
+
|
|
154
|
+
6. **Context window optimization:** Files can be selectively loaded. Large backlogs don't need to be sent to the LLM - only the relevant spec.
|
|
155
|
+
|
|
156
|
+
#### Proposed File Structure
|
|
157
|
+
|
|
158
|
+
```
|
|
159
|
+
.ralph/
|
|
160
|
+
BACKLOG.md # Quick task capture (one-liner ideas)
|
|
161
|
+
active/
|
|
162
|
+
feature-x.spec.md # Detailed spec for in-progress work
|
|
163
|
+
bug-fix-y.spec.md
|
|
164
|
+
completed/ # Archived specs (reference/learning)
|
|
165
|
+
templates/
|
|
166
|
+
feature.spec.md # Template for new features
|
|
167
|
+
bugfix.spec.md # Template for bug fixes
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
#### Spec File Format
|
|
171
|
+
|
|
172
|
+
```markdown
|
|
173
|
+
# [Task Title]
|
|
174
|
+
|
|
175
|
+
## Intent
|
|
176
|
+
|
|
177
|
+
[One paragraph: What do we want to accomplish and why?]
|
|
178
|
+
|
|
179
|
+
## Success Criteria
|
|
180
|
+
|
|
181
|
+
- [ ] Criterion 1
|
|
182
|
+
- [ ] Criterion 2
|
|
183
|
+
- [ ] Criterion 3
|
|
184
|
+
|
|
185
|
+
## Context
|
|
186
|
+
|
|
187
|
+
[Any relevant background, links, constraints]
|
|
188
|
+
|
|
189
|
+
## Notes
|
|
190
|
+
|
|
191
|
+
[Optional: implementation hints, considerations]
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
### Secondary: Linear for Team Visibility

**If you need team coordination or external visibility, sync completed specs to Linear.**

- Use Linear for roadmap and sprint planning
- Use Linear for stakeholder visibility
- Keep ralph's input interface file-based
- Consider automation: completed specs could auto-create Linear issues for tracking
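The automation bullet could be sketched against Linear's GraphQL API. The `issueCreate` mutation below follows Linear's published API shape as best I recall it, so verify the field names against the current docs before relying on this; `linearIssuePayload` is a hypothetical helper:

```typescript
// Sketch: turn a completed spec into a Linear GraphQL request body.
// Kept as a pure function so the payload is testable without a network call.
function linearIssuePayload(teamId: string, specTitle: string, specBody: string) {
  return {
    query: `mutation IssueCreate($input: IssueCreateInput!) {
      issueCreate(input: $input) { success issue { id url } }
    }`,
    variables: { input: { teamId, title: specTitle, description: specBody } },
  };
}

// Usage (assumes a LINEAR_API_KEY; the network call itself is omitted):
// await fetch("https://api.linear.app/graphql", {
//   method: "POST",
//   headers: { "Content-Type": "application/json", Authorization: apiKey },
//   body: JSON.stringify(linearIssuePayload(teamId, title, specMarkdown)),
// });
```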
### Why Not Linear as Primary?

Linear is excellent, but it adds friction to the core workflow:

1. **Context switch:** You're in your editor with an idea, and now you have to open Linear
2. **API dependency:** Ralph needs MCP/API calls instead of file reads
3. **Not the source of truth:** The code is in git; the spec should be too
4. **Overkill for capture:** Linear is optimized for team coordination, not quick intent capture

Linear makes sense for _visibility and coordination_, not for _AI agent input_.
### Why Not Backlog.md?

Backlog.md is well-designed and could work. However:

1. **Adds a dependency** for marginal benefit over plain markdown
2. **Adds a learning curve** for its CLI and conventions
3. **SPEC.md is simpler** and already aligns with industry patterns

If the team grows or the backlog becomes complex, Backlog.md is a good upgrade path.

---

## Implementation Path
### Phase 1: Establish the Pattern (Day 1)

1. Create `.ralph/` directory structure
2. Create `BACKLOG.md` for quick capture
3. Create spec templates
4. Document the pattern in CLAUDE.md
### Phase 2: Integrate with Ralph (Week 1)

1. Update ralph to look for specs in `.ralph/active/`
2. Implement spec parsing (extract intent, success criteria)
3. Add success criteria tracking (ralph marks criteria complete)
4. Move completed specs to `.ralph/completed/`
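Step 3 amounts to rewriting checkbox lines in place. A sketch, with `markCriterionDone` as an illustrative name (matching is by exact criterion text):

```typescript
// Sketch: flip "- [ ] <criterion>" to "- [x] <criterion>" in a spec file's
// markdown when ralph completes that criterion. Lines that don't match
// the given criterion text are left untouched.
function markCriterionDone(markdown: string, criterion: string): string {
  return markdown
    .split("\n")
    .map((line) => (line === `- [ ] ${criterion}` ? `- [x] ${criterion}` : line))
    .join("\n");
}
```

Because the state lives in the file, the updated checkboxes are versioned in git along with everything else.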
### Phase 3: Optimize (Ongoing)

1. Refine templates based on what works
2. Consider auto-sync to Linear for visibility (if needed)
3. Add tooling for quick spec creation (CLI or editor shortcuts)
### Example Workflow

```bash
# Human captures intent quickly
echo "# Add dark mode support

## Intent
Users want dark mode to reduce eye strain during night coding sessions.

## Success Criteria
- [ ] Toggle in settings
- [ ] Persists across sessions
- [ ] Respects system preference by default
" > .ralph/active/dark-mode.spec.md

# Ralph picks it up and executes
ralph process .ralph/active/dark-mode.spec.md
```
---

## Decision Summary

| Decision                | Choice                              | Confidence |
| ----------------------- | ----------------------------------- | ---------- |
| Primary task capture    | SPEC.md files in `.ralph/`          | High       |
| Format                  | Markdown with structured sections   | High       |
| Location                | Git repository                      | High       |
| Team coordination       | Linear (optional, secondary)        | Medium     |
| External issue tracking | GitHub Issues (for bugs, community) | Medium     |
---

## Sources

This synthesis is based on four research documents:

1. **01-ai-workflow-tools.md** - Survey of AI-native task management tools (Linear, Taskade, Backlog.md, Plane.so, etc.)
2. **02-agent-framework-patterns.md** - How AI agent frameworks (LangGraph, CrewAI, Claude Code) handle task persistence
3. **03-lightweight-file-based.md** - File-based approaches (SPEC.md, Beads, todo.txt, etc.)
4. **04-established-tools-ai-features.md** - AI features in GitHub, Linear, Jira, Notion

Key external sources informing this recommendation:

- [GitHub Spec-Kit](https://github.com/github/spec-kit/blob/main/spec-driven.md)
- [Addy Osmani on specs for AI agents](https://addyosmani.com/blog/good-spec/)
- [Steve Yegge on coding agent memory](https://steve-yegge.medium.com/introducing-beads-a-coding-agent-memory-system-637d7d92514a)
- [Thoughtworks on Spec-Driven Development](https://www.thoughtworks.com/en-us/insights/blog/agile-engineering-practices/spec-driven-development-unpacking-2025-new-engineering-practices)