mdcontext 0.0.1 → 0.1.0

Files changed (140)
  1. package/.changeset/README.md +28 -0
  2. package/.changeset/config.json +11 -0
  3. package/.github/workflows/ci.yml +83 -0
  4. package/.github/workflows/release.yml +113 -0
  5. package/.tldrignore +112 -0
  6. package/AGENTS.md +46 -0
  7. package/BACKLOG.md +338 -0
  8. package/README.md +231 -11
  9. package/biome.json +36 -0
  10. package/cspell.config.yaml +14 -0
  11. package/dist/chunk-KRYIFLQR.js +92 -0
  12. package/dist/chunk-S7E6TFX6.js +742 -0
  13. package/dist/chunk-VVTGZNBT.js +1519 -0
  14. package/dist/cli/main.d.ts +1 -0
  15. package/dist/cli/main.js +2015 -0
  16. package/dist/index.d.ts +266 -0
  17. package/dist/index.js +86 -0
  18. package/dist/mcp/server.d.ts +1 -0
  19. package/dist/mcp/server.js +376 -0
  20. package/docs/019-USAGE.md +586 -0
  21. package/docs/020-current-implementation.md +364 -0
  22. package/docs/021-DOGFOODING-FINDINGS.md +175 -0
  23. package/docs/BACKLOG.md +80 -0
  24. package/docs/DESIGN.md +439 -0
  25. package/docs/PROJECT.md +88 -0
  26. package/docs/ROADMAP.md +407 -0
  27. package/docs/test-links.md +9 -0
  28. package/package.json +69 -10
  29. package/pnpm-workspace.yaml +5 -0
  30. package/research/config-analysis/01-current-implementation.md +470 -0
  31. package/research/config-analysis/02-strategy-recommendation.md +428 -0
  32. package/research/config-analysis/03-task-candidates.md +715 -0
  33. package/research/config-analysis/033-research-configuration-management.md +828 -0
  34. package/research/config-analysis/034-research-effect-cli-config.md +1504 -0
  35. package/research/config-analysis/04-consolidated-task-candidates.md +277 -0
  36. package/research/dogfood/consolidated-tool-evaluation.md +373 -0
  37. package/research/dogfood/strategy-a/a-synthesis.md +184 -0
  38. package/research/dogfood/strategy-a/a1-docs.md +226 -0
  39. package/research/dogfood/strategy-a/a2-amorphic.md +156 -0
  40. package/research/dogfood/strategy-a/a3-llm.md +164 -0
  41. package/research/dogfood/strategy-b/b-synthesis.md +228 -0
  42. package/research/dogfood/strategy-b/b1-architecture.md +207 -0
  43. package/research/dogfood/strategy-b/b2-gaps.md +258 -0
  44. package/research/dogfood/strategy-b/b3-workflows.md +250 -0
  45. package/research/dogfood/strategy-c/c-synthesis.md +451 -0
  46. package/research/dogfood/strategy-c/c1-explorer.md +192 -0
  47. package/research/dogfood/strategy-c/c2-diver-memory.md +145 -0
  48. package/research/dogfood/strategy-c/c3-diver-control.md +148 -0
  49. package/research/dogfood/strategy-c/c4-diver-failure.md +151 -0
  50. package/research/dogfood/strategy-c/c5-diver-execution.md +221 -0
  51. package/research/dogfood/strategy-c/c6-diver-org.md +221 -0
  52. package/research/effect-cli-error-handling.md +845 -0
  53. package/research/effect-errors-as-values.md +943 -0
  54. package/research/errors-task-analysis/00-consolidated-tasks.md +207 -0
  55. package/research/errors-task-analysis/cli-commands-analysis.md +909 -0
  56. package/research/errors-task-analysis/embeddings-analysis.md +709 -0
  57. package/research/errors-task-analysis/index-search-analysis.md +812 -0
  58. package/research/mdcontext-error-analysis.md +521 -0
  59. package/research/npm_publish/011-npm-workflow-research-agent2.md +792 -0
  60. package/research/npm_publish/012-npm-workflow-research-agent1.md +530 -0
  61. package/research/npm_publish/013-npm-workflow-research-agent3.md +722 -0
  62. package/research/npm_publish/014-npm-workflow-synthesis.md +556 -0
  63. package/research/npm_publish/031-npm-workflow-task-analysis.md +134 -0
  64. package/research/semantic-search/002-research-embedding-models.md +490 -0
  65. package/research/semantic-search/003-research-rag-alternatives.md +523 -0
  66. package/research/semantic-search/004-research-vector-search.md +841 -0
  67. package/research/semantic-search/032-research-semantic-search.md +427 -0
  68. package/research/task-management-2026/00-synthesis-recommendations.md +295 -0
  69. package/research/task-management-2026/01-ai-workflow-tools.md +416 -0
  70. package/research/task-management-2026/02-agent-framework-patterns.md +476 -0
  71. package/research/task-management-2026/03-lightweight-file-based.md +567 -0
  72. package/research/task-management-2026/04-established-tools-ai-features.md +541 -0
  73. package/research/task-management-2026/linear/01-core-features-workflow.md +771 -0
  74. package/research/task-management-2026/linear/02-api-integrations.md +930 -0
  75. package/research/task-management-2026/linear/03-ai-features.md +368 -0
  76. package/research/task-management-2026/linear/04-pricing-setup.md +205 -0
  77. package/research/task-management-2026/linear/05-usage-patterns-best-practices.md +605 -0
  78. package/scripts/rebuild-hnswlib.js +63 -0
  79. package/src/cli/argv-preprocessor.test.ts +210 -0
  80. package/src/cli/argv-preprocessor.ts +202 -0
  81. package/src/cli/cli.test.ts +430 -0
  82. package/src/cli/commands/backlinks.ts +54 -0
  83. package/src/cli/commands/context.ts +197 -0
  84. package/src/cli/commands/index-cmd.ts +300 -0
  85. package/src/cli/commands/index.ts +13 -0
  86. package/src/cli/commands/links.ts +52 -0
  87. package/src/cli/commands/search.ts +451 -0
  88. package/src/cli/commands/stats.ts +146 -0
  89. package/src/cli/commands/tree.ts +107 -0
  90. package/src/cli/flag-schemas.ts +275 -0
  91. package/src/cli/help.ts +386 -0
  92. package/src/cli/index.ts +9 -0
  93. package/src/cli/main.ts +145 -0
  94. package/src/cli/options.ts +31 -0
  95. package/src/cli/typo-suggester.test.ts +105 -0
  96. package/src/cli/typo-suggester.ts +130 -0
  97. package/src/cli/utils.ts +126 -0
  98. package/src/core/index.ts +1 -0
  99. package/src/core/types.ts +140 -0
  100. package/src/embeddings/index.ts +8 -0
  101. package/src/embeddings/openai-provider.ts +165 -0
  102. package/src/embeddings/semantic-search.ts +583 -0
  103. package/src/embeddings/types.ts +82 -0
  104. package/src/embeddings/vector-store.ts +299 -0
  105. package/src/index/index.ts +4 -0
  106. package/src/index/indexer.ts +446 -0
  107. package/src/index/storage.ts +196 -0
  108. package/src/index/types.ts +109 -0
  109. package/src/index/watcher.ts +131 -0
  110. package/src/index.ts +8 -0
  111. package/src/mcp/server.ts +483 -0
  112. package/src/parser/index.ts +1 -0
  113. package/src/parser/parser.test.ts +291 -0
  114. package/src/parser/parser.ts +395 -0
  115. package/src/parser/section-filter.ts +270 -0
  116. package/src/search/query-parser.test.ts +260 -0
  117. package/src/search/query-parser.ts +319 -0
  118. package/src/search/searcher.test.ts +182 -0
  119. package/src/search/searcher.ts +602 -0
  120. package/src/summarize/budget-bugs.test.ts +620 -0
  121. package/src/summarize/formatters.ts +419 -0
  122. package/src/summarize/index.ts +20 -0
  123. package/src/summarize/summarizer.test.ts +275 -0
  124. package/src/summarize/summarizer.ts +528 -0
  125. package/src/summarize/verify-bugs.test.ts +238 -0
  126. package/src/utils/index.ts +1 -0
  127. package/src/utils/tokens.test.ts +142 -0
  128. package/src/utils/tokens.ts +186 -0
  129. package/tests/fixtures/cli/.mdcontext/config.json +8 -0
  130. package/tests/fixtures/cli/.mdcontext/indexes/documents.json +33 -0
  131. package/tests/fixtures/cli/.mdcontext/indexes/links.json +12 -0
  132. package/tests/fixtures/cli/.mdcontext/indexes/sections.json +233 -0
  133. package/tests/fixtures/cli/.mdcontext/vectors.bin +0 -0
  134. package/tests/fixtures/cli/.mdcontext/vectors.meta.json +1264 -0
  135. package/tests/fixtures/cli/README.md +9 -0
  136. package/tests/fixtures/cli/api-reference.md +11 -0
  137. package/tests/fixtures/cli/getting-started.md +11 -0
  138. package/tsconfig.json +26 -0
  139. package/vitest.config.ts +21 -0
  140. package/vitest.setup.ts +12 -0
@@ -0,0 +1,427 @@
# Research Task Analysis: Embedding Models, RAG Alternatives, and Vector Search

Analysis of research documents against the current mdcontext implementation.

Date: January 2026

---

## Documents Analyzed

1. `002-research-embedding-models.md` - Embedding model comparison and recommendations
2. `003-research-rag-alternatives.md` - RAG alternatives for improving semantic search
3. `004-research-vector-search.md` - Vector search patterns and techniques

---

## Implemented (No Action Needed)

### 1. Dimension Reduction (512 dimensions)

**Research Recommendation:** Reduce OpenAI embeddings from 1536 to 512 dimensions for a 67% storage reduction with minimal quality loss.

**Current Implementation:** Already implemented in `src/embeddings/openai-provider.ts`:

```typescript
const response = await this.client.embeddings.create({
  model: this.model,
  input: batch,
  dimensions: 512, // Already using reduced dimensions
});
```

**Status:** Implemented

---

### 2. HNSW Vector Index

**Research Recommendation:** Stay with HNSW for documentation corpora (<100K sections).

**Current Implementation:** Using `hnswlib-node` with cosine similarity in `src/embeddings/vector-store.ts`:

```typescript
this.index = new HierarchicalNSW.HierarchicalNSW("cosine", this.dimensions);
this.index.initIndex(10000, 16, 200, 100); // maxElements=10000, M=16, efConstruction=200, randomSeed=100
```

**Status:** Implemented

---

### 3. EmbeddingProvider Interface

**Research Recommendation:** Create a provider abstraction for embedding models.

**Current Implementation:** `src/embeddings/types.ts` defines a clean provider interface:

```typescript
export interface EmbeddingProvider {
  readonly name: string;
  readonly dimensions: number;
  embed(texts: string[]): Promise<EmbeddingResult>;
}
```

**Status:** Implemented (foundation ready for additional providers)

---

### 4. Path Pattern Filtering (Post-filter)

**Research Recommendation:** Implement metadata filtering for search results.

**Current Implementation:** The `pathPattern` option is implemented in `semanticSearch()` as post-filtering.

**Status:** Implemented (basic)

---

### 5. Document Context in Embeddings

**Research Recommendation:** Include the document title and parent section in the embedding text.

**Current Implementation:** Already in `src/embeddings/semantic-search.ts`:

```typescript
const generateEmbeddingText = (
  section,
  content,
  documentTitle,
  parentHeading,
) => {
  const parts: string[] = [];
  parts.push(`# ${section.heading}`);
  if (parentHeading) parts.push(`Parent section: ${parentHeading}`);
  parts.push(`Document: ${documentTitle}`);
  parts.push(content);
  // ...
};
```

**Status:** Implemented

---

## Task Candidates

### 1. Add Hybrid Search (BM25 + Semantic)

**Priority:** High

**Description:**
Implement hybrid search combining BM25 keyword search with semantic search, using Reciprocal Rank Fusion (RRF) to merge results.

**Why It Matters:**

- Research shows a 15-30% recall improvement over single-method retrieval
- Handles exact term matching (API names, error codes, identifiers) that pure semantic search misses
- Current keyword search exists separately but isn't integrated with semantic search

**Implementation Notes:**

- Add the `wink-bm25-text-search` dependency
- Build the BM25 index alongside the vector index during `mdcontext embed`
- Add a `--mode hybrid` option to the search command
- Implement RRF fusion (~50 lines of code)

**Current Gap:**

- Keyword search (`src/search/searcher.ts`) and semantic search (`src/embeddings/semantic-search.ts`) are separate codepaths
- No fusion mechanism exists

**Estimated Effort:** 2-3 days

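The fusion step above is small enough to sketch. The following is an illustrative RRF implementation using the conventional k=60 constant; the `Ranked` shape and the `rrfFuse` name are hypothetical, not mdcontext's actual types:

```typescript
// Reciprocal Rank Fusion (RRF): merge two ranked result lists by summing
// 1 / (k + rank) per document across lists. k=60 is the conventional constant.
interface Ranked {
  id: string;
}

function rrfFuse(keyword: Ranked[], semantic: Ranked[], k = 60): Ranked[] {
  const scores = new Map<string, { item: Ranked; score: number }>();
  for (const list of [keyword, semantic]) {
    list.forEach((item, rank) => {
      const entry = scores.get(item.id) ?? { item, score: 0 };
      entry.score += 1 / (k + rank + 1); // rank is 0-based
      scores.set(item.id, entry);
    });
  }
  // Highest fused score first
  return [...scores.values()]
    .sort((a, b) => b.score - a.score)
    .map((e) => e.item);
}
```

A document that appears in both lists accumulates score from each, which is why items matched by both keyword and semantic search float to the top.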
---

### 2. Add Local Embedding Provider (Ollama)

**Priority:** High

**Description:**
Implement an Ollama-based embedding provider using `nomic-embed-text-v1.5` for offline semantic search.

**Why It Matters:**

- Enables offline semantic search (a major feature gap)
- Zero ongoing API costs
- Quality matches or exceeds OpenAI `text-embedding-3-small`
- Supports privacy-sensitive use cases

**Implementation Notes:**

- Create `src/embeddings/ollama-provider.ts` implementing `EmbeddingProvider`
- Add provider selection via config or a `--provider` CLI flag
- Default to OpenAI for backward compatibility
- `nomic-embed-text` supports Matryoshka embeddings (dimension flexibility)

**Models to Support:**

1. `nomic-embed-text` - Best overall fit (fast, 8K context, Matryoshka)
2. `mxbai-embed-large` - Higher-quality option
3. `bge-m3` - Multilingual option

**Estimated Effort:** 2-3 days

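A minimal sketch of what such a provider could look like, assuming Ollama's batch embedding endpoint (`POST /api/embed`) and the `EmbeddingProvider` interface shown earlier; the class name, the locally declared `EmbeddingResult` shape, and the truncation strategy are illustrative assumptions, not the actual implementation:

```typescript
// Hypothetical Ollama-backed embedding provider sketch.
interface EmbeddingResult {
  embeddings: number[][];
}

class OllamaProvider {
  readonly name = "ollama";
  readonly dimensions: number;

  constructor(
    private model = "nomic-embed-text",
    dimensions = 512, // Matryoshka models tolerate truncation to the first N dims
    private baseUrl = "http://localhost:11434",
  ) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<EmbeddingResult> {
    // Ollama's batch embedding endpoint: { model, input } -> { embeddings }
    const res = await fetch(`${this.baseUrl}/api/embed`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: this.model, input: texts }),
    });
    if (!res.ok) throw new Error(`Ollama embed failed: ${res.status}`);
    const data = (await res.json()) as { embeddings: number[][] };
    // Truncate to the configured dimension count (Matryoshka property)
    return {
      embeddings: data.embeddings.map((e) => e.slice(0, this.dimensions)),
    };
  }
}
```

With cosine distance in the vector index, truncated vectors are effectively renormalized at comparison time, which is what makes the Matryoshka truncation safe here.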
---

### 3. Add Cross-Encoder Re-ranking

**Priority:** Medium

**Description:**
Add optional re-ranking of the top-N semantic search results using a cross-encoder model.

**Why It Matters:**

- 20-35% accuracy improvement in retrieval precision
- Cross-encoders capture fine-grained relevance that bi-encoders miss
- Can be opt-in to avoid latency when not needed

**Implementation Notes:**

- Add the `@xenova/transformers` dependency for Transformers.js
- Use the `ms-marco-MiniLM-L-6-v2` model (22.7M params, 2-5 ms/pair)
- Re-rank the top-20 candidates down to the top-10
- Add a `--rerank` flag to the search command

**Alternative:** Cohere Rerank API for simpler integration (adds cost)

**Estimated Effort:** 2-3 days

---

### 4. Add Dynamic efSearch (Quality Modes)

**Priority:** Medium

**Description:**
Allow users to control the search quality/speed tradeoff via the HNSW efSearch parameter at query time.

**Why It Matters:**

- Zero dependency changes
- Immediate quality/speed improvements
- Low risk

**Implementation Notes:**

- Add a `--quality` flag: `fast` (64), `balanced` (100), `thorough` (256)
- efSearch is already configurable at query time in hnswlib-node (via `setEf()`)
- Update search functions to accept a quality parameter

**Current State:**

```typescript
this.index.initIndex(10000, 16, 200, 100); // fourth argument is the random seed; efSearch is left at its default
```

**Estimated Effort:** 0.5 days

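The quality flag could reduce to a small lookup applied before each query via hnswlib-node's `setEf()`; the `Quality` type and function name are illustrative:

```typescript
// Map a user-facing quality mode to an HNSW efSearch value.
type Quality = "fast" | "balanced" | "thorough";

const EF_SEARCH: Record<Quality, number> = {
  fast: 64,      // lower latency, slightly lower recall
  balanced: 100, // default tradeoff
  thorough: 256, // higher recall, slower queries
};

function efSearchFor(quality: Quality): number {
  return EF_SEARCH[quality];
}

// Usage (assuming `index` is a hnswlib-node HierarchicalNSW instance):
// index.setEf(efSearchFor("thorough"));
// const { neighbors, distances } = index.searchKnn(queryVector, topK);
```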
---

### 5. Add Configurable HNSW Parameters

**Priority:** Low

**Description:**
Expose HNSW build parameters (M, efConstruction) via configuration for users with specific needs.

**Why It Matters:**

- Users with large corpora may want to tune for speed
- Users needing maximum recall can increase parameters
- Enables benchmarking different configurations

**Current Hardcoded Values:**

```typescript
M: 16; // Max connections per node
efConstruction: 200; // Construction-time search width
```

**Recommended Configurations:**

- Quality-focused: M=24, efConstruction=256
- Speed-focused: M=12, efConstruction=128

**Estimated Effort:** 1 day

---

### 6. Add Query Preprocessing

**Priority:** Low

**Description:**
Add basic query preprocessing before embedding to reduce noise.

**Why It Matters:**

- 2-5% precision improvement
- Simple implementation

**Implementation:**

```typescript
function preprocessQuery(query: string): string {
  return query
    .toLowerCase()
    .replace(/[^\w\s]/g, " ") // strip punctuation
    .replace(/\s+/g, " ") // collapse whitespace
    .trim();
}
```

**Estimated Effort:** 1-2 hours

---

### 7. Add Heading Match Boost

**Priority:** Low

**Description:**
Boost search results when query terms appear in section headings.

**Why It Matters:**

- Significant for navigation queries ("installation guide", "API reference")
- Simple scoring adjustment

**Implementation:**

```typescript
function adjustScore(
  result: { heading: string; similarity: number },
  query: string,
): number {
  const queryTerms = query.toLowerCase().split(/\s+/);
  const headingLower = result.heading.toLowerCase();
  const headingMatches = queryTerms.filter((t) =>
    headingLower.includes(t),
  ).length;
  return result.similarity + headingMatches * 0.05;
}
```

**Estimated Effort:** 2-4 hours

---

### 8. Add HyDE Query Expansion (Optional)

**Priority:** Low

**Description:**
Implement Hypothetical Document Embeddings (HyDE) for complex queries: generate a hypothetical answer with an LLM, then search using that answer's embedding.

**Why It Matters:**

- 10-30% retrieval improvement on ambiguous queries
- Bridges the semantic gap between short questions and detailed documents

**Considerations:**

- Adds an LLM call (cost, latency)
- Should be opt-in for complex queries only
- Works poorly if the LLM lacks domain knowledge

**Estimated Effort:** 1-2 days

---

### 9. Fix Dimension Mismatch in Provider

**Priority:** Medium

**Description:**
The OpenAI provider reports incorrect dimensions (1536/3072) while the API calls actually use 512.

**Current Issue:**

```typescript
// In openai-provider.ts
this.dimensions = this.model === "text-embedding-3-large" ? 3072 : 1536;
// But the actual API call uses:
dimensions: 512;
```

This mismatch could cause issues if other code relies on `provider.dimensions`.

**Fix:** Update dimension reporting to match the actual API parameter.

**Estimated Effort:** 0.5 hours

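One way to make this drift impossible is to derive both the reported value and the API request parameter from a single configured number. The class and method names below are hypothetical, a sketch of the shape of the fix rather than the actual patch:

```typescript
// Single source of truth for embedding dimensions: the value we request from
// the API is the same value we report via provider.dimensions.
class OpenAIEmbeddingConfig {
  readonly dimensions: number;

  constructor(opts: { dimensions?: number } = {}) {
    // Default matches the current hardcoded API parameter
    this.dimensions = opts.dimensions ?? 512;
  }

  requestParams(model: string, input: string[]) {
    // Both the reported value and the request body use this.dimensions,
    // so provider.dimensions can never disagree with returned vectors.
    return { model, input, dimensions: this.dimensions };
  }
}
```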
---

### 10. Add Alternative API Provider (Voyage AI)

**Priority:** Low

**Description:**
Add Voyage AI as an alternative embedding provider for users wanting better quality at similar cost.

**Why It Matters:**

- `voyage-3.5-lite`: same price as OpenAI ($0.02/1M tokens) with better quality
- 32K token context (4x OpenAI's)
- Free 200M tokens for testing

**Estimated Effort:** 1 day

---

## Skip (Not Applicable)

| Recommendation           | Reason to Skip                                                        |
| ------------------------ | --------------------------------------------------------------------- |
| ColBERT Late Interaction | Overkill for documentation corpus sizes; requires a Python service    |
| SPLADE Sparse Retrieval  | BM25 + semantic hybrid likely sufficient; adds complexity             |
| GraphRAG                 | Overkill for documentation search                                     |
| Fine-tuned Embeddings    | Requires training infrastructure; general models work well            |
| IVF/DiskANN Indexes      | HNSW sufficient for typical documentation sizes (<100K sections)      |
| LLM-based Re-ranking     | Cross-encoders provide similar quality without LLM cost/latency       |
| Self-RAG                 | Beyond current scope; more relevant for RAG pipelines with generation |

---

## Summary

| Category        | Count             |
| --------------- | ----------------- |
| Implemented     | 5 items           |
| Task Candidates | 10 items          |
| Skipped         | 7 recommendations |

### Priority Matrix

| Priority   | Tasks                                                                     |
| ---------- | ------------------------------------------------------------------------- |
| **High**   | Hybrid Search (BM25+RRF), Local Embedding Provider (Ollama)               |
| **Medium** | Cross-Encoder Re-ranking, Dynamic efSearch, Fix Dimension Mismatch        |
| **Low**    | HNSW Config, Query Preprocessing, Heading Boost, HyDE, Voyage AI Provider |

### Recommended Implementation Order

**Phase 1: Quick Wins (1 week)**

1. Fix dimension mismatch in provider
2. Add dynamic efSearch (quality modes)
3. Add query preprocessing
4. Add heading match boost

**Phase 2: Hybrid Search (1-2 weeks)**

1. Integrate BM25 library
2. Build BM25 index during embed
3. Implement RRF fusion
4. Add `--mode hybrid` to CLI

**Phase 3: Local/Offline (1-2 weeks)**

1. Implement Ollama provider
2. Add provider selection CLI
3. Test with nomic-embed-text

**Phase 4: Advanced (2 weeks)**

1. Cross-encoder re-ranking (Transformers.js)
2. HyDE query expansion (optional)
3. Alternative API providers (Voyage AI)
@@ -0,0 +1,295 @@
# Task Management for Ralph: Synthesis & Recommendations

**Date:** January 21, 2026
**Purpose:** Recommend a task management approach for capturing intent before AI agent (ralph) processing

---

## Executive Summary

**For capturing task intent in an AI agent workflow, use a file-based SPEC.md pattern stored in Git.** This approach offers the best balance of human usability, LLM compatibility, and zero friction. External tools like Linear or GitHub Issues add unnecessary complexity for the core problem of "capture intent quickly so ralph can act on it." The SPEC.md pattern is already proven in the codebase, requires no new tooling, and aligns with the emerging industry standard of spec-driven development. Reserve external tools (Linear, GitHub Issues) for team coordination and visibility, not for the AI agent interface.

---

## Context

### The Problem

We need a system for capturing task intent before processing through "ralph" (an AI agent orchestration system). This is specifically about the **input interface** to the agent, not project management at large.

### Requirements

| Requirement                 | Priority | Notes                                   |
| --------------------------- | -------- | --------------------------------------- |
| Capture task intent quickly | Critical | Friction kills adoption                 |
| LLM-friendly format         | Critical | Ralph must parse and understand it      |
| Git-native                  | High     | Version control, audit trail, offline   |
| Low friction for humans     | High     | Must be faster than "just do it myself" |
| Scales to many tasks        | Medium   | Backlog accumulation, parallel work     |

### Constraint

Ralph operates on files in a repository. Whatever system we choose must either:

1. **Live in the repo** (file-based), or
2. **Be accessible via MCP/API** (external tool)

---

## Options Matrix

### Top 4 Approaches Evaluated

| Approach            | Quick Capture | LLM-Friendly | Git-Native | Low Friction | Scales | Overall |
| ------------------- | ------------- | ------------ | ---------- | ------------ | ------ | ------- |
| **SPEC.md Pattern** | A             | A            | A          | A            | B      | **A**   |
| **Backlog.md**      | B             | A            | A          | B            | A      | **A-**  |
| **Linear**          | B             | B            | C          | A            | A      | **B+**  |
| **GitHub Issues**   | C             | B            | B          | B            | A      | **B**   |

### Detailed Analysis

#### 1. SPEC.md Pattern (File-Based Specifications)

**What it is:** Markdown files in the repo (e.g., `SPEC.md`, `TODO.md`, `PLAN.md`) that define task intent. Already emerging as the industry standard for AI-assisted development in 2026.

**Pros:**

- Zero dependencies - works with any editor, any AI tool
- Direct context injection into LLM prompts (no API calls)
- Git versioning provides an audit trail and rollback for free
- Humans and the AI edit the same artifact
- Token-efficient format (markdown is LLM-native)
- Already using something similar in the codebase

**Cons:**

- No built-in visualization (Kanban, timelines)
- Requires discipline to maintain structure
- Doesn't scale well for team-wide coordination across multiple projects

**Best for:** Individual or small-team AI coding workflows where the developer is the primary user.

#### 2. Backlog.md

**What it is:** A purpose-built tool for AI-human collaboration. Each task is a separate `.md` file in a `.backlog/` directory. CLI and web interfaces are available.

**Pros:**

- Designed specifically for AI agent workflows
- Git-native, with task IDs referencing commits/branches
- Terminal Kanban for visualization
- Works with Claude Code, Cursor, and MCP-compatible tools
- Open source, no vendor lock-in

**Cons:**

- Adds a tool/CLI dependency
- Slightly more structure than bare SPEC.md
- Newer tool with a smaller community

**Best for:** Teams wanting more structure than plain markdown while staying git-native.

#### 3. Linear

**What it is:** The leading AI-native project management tool. Treats AI agents as "full teammates," with native integrations for Cursor, Copilot, and Claude.

**Pros:**

- Best-in-class AI agent integration (agents as assignable teammates)
- Excellent API and MCP servers
- Team visibility and coordination features
- Fast, keyboard-first UI

**Cons:**

- External service (not in the repo)
- Requires MCP/API integration for ralph to access
- Adds latency to task capture (open tool, create issue)
- Paid at scale ($8/user/month)
- Another tool to maintain

**Best for:** Team coordination and roadmap visibility, when multiple people need to see task status.

#### 4. GitHub Issues

**What it is:** GitHub's built-in issue tracker, now with a Copilot coding agent that can be assigned issues directly.

**Pros:**

- Already using GitHub
- Copilot can autonomously work on issues
- No new tool adoption
- Free

**Cons:**

- Slow to create issues (web UI friction)
- Not optimized for quick intent capture
- Less LLM-friendly than markdown files
- Requires API calls for ralph to read

**Best for:** When you need the issue-to-PR workflow with Copilot, or team-visible bug tracking.

---

## Recommendation

### Primary: SPEC.md Pattern

**Use file-based specifications as the primary interface between humans and ralph.**

#### Rationale

1. **Lowest friction for capture:** Open a file, write the intent, save. Done. No context switching to another tool.

2. **Best LLM compatibility:** Markdown is the native language of LLMs. No translation layer, no API calls, no token overhead from structured formats.

3. **Already proven:** The research shows spec-driven development is the emerging standard. GitHub Spec-Kit, JetBrains Junie, and Amazon Kiro all use this pattern.

4. **Git-native by default:** Every change is versioned. Branching allows parallel task exploration. History shows what was attempted and why.

5. **Zero new tooling:** Works today with the existing setup. No adoption curve, no dependencies to maintain.

6. **Context window optimization:** Files can be selectively loaded. Large backlogs don't need to be sent to the LLM - only the relevant spec.

#### Proposed File Structure

```
.ralph/
  BACKLOG.md            # Quick task capture (one-liner ideas)
  active/
    feature-x.spec.md   # Detailed spec for in-progress work
    bug-fix-y.spec.md
  completed/            # Archived specs (reference/learning)
  templates/
    feature.spec.md     # Template for new features
    bugfix.spec.md      # Template for bug fixes
```

#### Spec File Format

```markdown
# [Task Title]

## Intent

[One paragraph: What do we want to accomplish and why?]

## Success Criteria

- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3

## Context

[Any relevant background, links, constraints]

## Notes

[Optional: implementation hints, considerations]
```

### Secondary: Linear for Team Visibility

**If you need team coordination or external visibility, sync completed specs to Linear.**

- Use Linear for roadmap and sprint planning
- Use Linear for stakeholder visibility
- Keep ralph's input interface file-based
- Consider automation: completed specs could auto-create Linear issues for tracking

### Why Not Linear as Primary?

Linear is excellent, but it adds friction to the core workflow:

1. **Context switch:** You're in your editor, you have an idea, and you have to open Linear
2. **API dependency:** Ralph needs MCP/API calls instead of file reads
3. **Not the source of truth:** The code is in git; the spec should be too
4. **Overkill for capture:** Linear is optimized for team coordination, not quick intent capture

Linear makes sense for _visibility and coordination_, not for _AI agent input_.

### Why Not Backlog.md?

Backlog.md is well-designed and could work. However:

1. **It adds a dependency** for marginal benefit over plain markdown
2. **There is a learning curve** for the CLI and its conventions
3. **SPEC.md is simpler** and already aligns with industry patterns

If the team grows or the backlog becomes complex, Backlog.md is a good upgrade path.

---

## Implementation Path

### Phase 1: Establish the Pattern (Day 1)

1. Create the `.ralph/` directory structure
2. Create `BACKLOG.md` for quick capture
3. Create spec templates
4. Document the pattern in CLAUDE.md

### Phase 2: Integrate with Ralph (Week 1)

1. Update ralph to look for specs in `.ralph/active/`
2. Implement spec parsing (extract intent, success criteria)
3. Add success criteria tracking (ralph marks criteria complete)
4. Move completed specs to `.ralph/completed/`

### Phase 3: Optimize (Ongoing)

1. Refine templates based on what works
2. Consider auto-sync to Linear for visibility (if needed)
3. Add tooling for quick spec creation (CLI or editor shortcuts)

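The spec parsing step in Phase 2 could be sketched as a small function that pulls the title, the Intent paragraph, and the success-criteria checkboxes out of a spec file. The section names follow the template above; the `ParsedSpec` shape and function name are illustrative, not an existing ralph API:

```typescript
// Parse a SPEC.md-style file: title, Intent section, and checkbox criteria.
interface ParsedSpec {
  title: string;
  intent: string;
  criteria: { text: string; done: boolean }[];
}

function parseSpec(markdown: string): ParsedSpec {
  // First level-1 heading is the task title
  const title = markdown.match(/^# (.+)$/m)?.[1]?.trim() ?? "";
  // Everything between "## Intent" and the next level-2 heading (or EOF)
  const intent =
    markdown.match(/## Intent\s+([\s\S]*?)(?=\n## |$)/)?.[1]?.trim() ?? "";
  // GitHub-style task-list items anywhere in the file
  const criteria = [...markdown.matchAll(/^- \[([ xX])\] (.+)$/gm)].map(
    (m) => ({ text: m[2].trim(), done: m[1] !== " " }),
  );
  return { title, intent, criteria };
}
```

Ralph could then mark a criterion complete by rewriting `- [ ]` to `- [x]` in place, keeping the file itself as the single source of truth for task state.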
### Example Workflow

```bash
# Human captures intent quickly
cat > .ralph/active/dark-mode.spec.md <<'EOF'
# Add dark mode support

## Intent
Users want dark mode to reduce eye strain during night coding sessions.

## Success Criteria
- [ ] Toggle in settings
- [ ] Persists across sessions
- [ ] Respects system preference by default
EOF

# Ralph picks it up and executes
ralph process .ralph/active/dark-mode.spec.md
```

---

## Decision Summary

| Decision                | Choice                              | Confidence |
| ----------------------- | ----------------------------------- | ---------- |
| Primary task capture    | SPEC.md files in `.ralph/`          | High       |
| Format                  | Markdown with structured sections   | High       |
| Location                | Git repository                      | High       |
| Team coordination       | Linear (optional, secondary)        | Medium     |
| External issue tracking | GitHub Issues (for bugs, community) | Medium     |

---

## Sources

This synthesis is based on four research documents:

1. **01-ai-workflow-tools.md** - Survey of AI-native task management tools (Linear, Taskade, Backlog.md, Plane.so, etc.)
2. **02-agent-framework-patterns.md** - How AI agent frameworks (LangGraph, CrewAI, Claude Code) handle task persistence
3. **03-lightweight-file-based.md** - File-based approaches (SPEC.md, Beads, todo.txt, etc.)
4. **04-established-tools-ai-features.md** - AI features in GitHub, Linear, Jira, Notion

Key external sources informing this recommendation:

- [GitHub Spec-Kit](https://github.com/github/spec-kit/blob/main/spec-driven.md)
- [Addy Osmani on specs for AI agents](https://addyosmani.com/blog/good-spec/)
- [Steve Yegge on coding agent memory](https://steve-yegge.medium.com/introducing-beads-a-coding-agent-memory-system-637d7d92514a)
- [Thoughtworks on Spec-Driven Development](https://www.thoughtworks.com/en-us/insights/blog/agile-engineering-practices/spec-driven-development-unpacking-2025-new-engineering-practices)