mcp-local-rag 0.1.3 → 0.1.4
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +79 -53
- package/package.json +2 -2
package/README.md
CHANGED
@@ -40,8 +40,8 @@ claude mcp add local-rag --scope user --env BASE_DIR=/path/to/your/documents --
 
 Restart your tool, then start using:
 ```
-"
-"
+"Ingest api-spec.pdf"
+"What does this document say about authentication?"
 ```
 
 That's it. No installation, no Docker, no complex setup.
@@ -80,23 +80,24 @@ All of this uses:
 
 The result: query responses typically under 3 seconds on a standard laptop, even with thousands of document chunks indexed.
 
-##
+## First Run
 
-
+On first launch, the embedding model downloads automatically from HuggingFace:
+- **Download size**: ~90MB (model files)
+- **Disk usage after caching**: ~120MB (includes ONNX runtime cache)
+- **Time**: 1-2 minutes on a decent connection
 
-
-node --version
-```
+You'll see progress in the console. The model caches in `CACHE_DIR` (default: `./models/`) for offline use.
 
-
+**Offline Mode**: After first run, works completely offline—no internet required.
 
-
+## Security
 
-
+**Path Restriction**: This server only accesses files within your `BASE_DIR`. Any attempt to access files outside this directory (e.g., via `../` path traversal) will be rejected.
 
-
+**Local Only**: All processing happens on your machine. No network requests are made after the initial model download.
 
-The
+**Model Verification**: The embedding model downloads from HuggingFace's official repository (`Xenova/all-MiniLM-L6-v2`). Verify integrity by checking the [official model card](https://huggingface.co/Xenova/all-MiniLM-L6-v2).
 
 ## Configuration
 
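The path-restriction behavior described in the new Security section can be sketched in a few lines. This is an illustrative snippet only — `isInsideBaseDir` is a hypothetical helper name, not the package's actual code:

```typescript
import * as path from "node:path";

// Illustrative sketch of the BASE_DIR path restriction described above.
// `isInsideBaseDir` is a hypothetical helper, not the package's real code.
function isInsideBaseDir(baseDir: string, requested: string): boolean {
  const root = path.resolve(baseDir);
  // Resolve the request relative to BASE_DIR, collapsing any `..` segments.
  const target = path.resolve(root, requested);
  // Allow only BASE_DIR itself or paths strictly nested under it.
  return target === root || target.startsWith(root + path.sep);
}

console.log(isInsideBaseDir("/docs", "api-spec.pdf")); // allowed
console.log(isInsideBaseDir("/docs", "../etc/passwd")); // rejected (traversal)
```

Resolving before comparing is what defeats `../` traversal: the comparison happens on the normalized absolute path, not on the raw input string.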
@@ -168,30 +169,31 @@ claude mcp add local-rag --scope user \
 
 ### Environment Variables
 
-
-
-
-
-
-
-
-
-
-
-**CHUNK_SIZE** - How many characters per chunk. Defaults to 512. Larger chunks give more context but slower processing.
-
-**CHUNK_OVERLAP** - How many characters overlap between chunks. Defaults to 100. This helps preserve context across chunk boundaries.
+| Variable | Default | Description | Valid Range |
+|----------|---------|-------------|-------------|
+| `BASE_DIR` | Current directory | Document root directory. Server only accesses files within this path (prevents accidental system file access). | Any valid path |
+| `DB_PATH` | `./lancedb/` | Vector database storage location. Can grow large with many documents. | Any valid path |
+| `CACHE_DIR` | `./models/` | Model cache directory. After first download, model stays here for offline use. | Any valid path |
+| `MODEL_NAME` | `Xenova/all-MiniLM-L6-v2` | HuggingFace model identifier. Must be Transformers.js compatible. | HF model ID |
+| `MAX_FILE_SIZE` | `104857600` (100MB) | Maximum file size in bytes. Larger files rejected to prevent memory issues. | 1MB - 500MB |
+| `CHUNK_SIZE` | `512` | Characters per chunk. Larger = more context but slower processing. | 128 - 2048 |
+| `CHUNK_OVERLAP` | `100` | Overlap between chunks. Preserves context across boundaries. | 0 - (CHUNK_SIZE/2) |
 
 ## Usage
 
-
+**After configuration**, restart your MCP client:
+- **Cursor**: Fully quit and relaunch (Cmd+Q on Mac, not just closing windows)
+- **Codex**: Restart the IDE/extension
+- **Claude Code**: No restart needed—changes apply immediately
+
+The server will appear as available tools that your AI assistant can use.
 
 ### Ingesting Documents
 
-**In Cursor**,
+**In Cursor**, the Composer Agent automatically uses MCP tools when needed:
 
 ```
-"
+"Ingest the document at /Users/me/docs/api-spec.pdf"
 ```
 
 **In Codex CLI**, the assistant automatically uses configured MCP tools when needed:
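The `CHUNK_SIZE`/`CHUNK_OVERLAP` interaction from the new table can be illustrated with a small sketch. This assumes simple fixed-width character chunking; the package's real chunker may differ:

```typescript
// Hypothetical fixed-width chunker illustrating CHUNK_SIZE and CHUNK_OVERLAP.
// Each chunk starts (size - overlap) characters after the previous one, so
// consecutive chunks share `overlap` characters of context.
function chunkText(text: string, size = 512, overlap = 100): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // overlap must stay below size (table: 0 - CHUNK_SIZE/2)
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached end of text
  }
  return chunks;
}

// A 1000-character document with the defaults yields 3 chunks,
// covering [0,512), [412,924), and [824,1000).
console.log(chunkText("x".repeat(1000)).map((c) => c.length)); // [ 512, 512, 176 ]
```

The overlap is why a sentence straddling a chunk boundary still appears whole in at least one chunk.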
@@ -206,11 +208,7 @@ codex "Ingest the document at /Users/me/docs/api-spec.pdf into the RAG system"
 "Ingest the document at /Users/me/docs/api-spec.pdf"
 ```
 
-The
-
-```
-"@mcp Ingest api-spec.pdf"
-```
+**Path Requirements**: The server requires **absolute paths** to files. Your AI assistant will typically convert natural language requests into absolute paths automatically. The `BASE_DIR` setting restricts access to only files within that directory tree for security, but you must still provide the full path.
 
 The server:
 1. Validates the file exists and is under 100MB
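The pre-ingestion checks described here (absolute path, existence, size limit) might look roughly like the sketch below. `validateFile` and the error messages are hypothetical; only the 100MB default comes from the README:

```typescript
import { statSync } from "node:fs";
import * as path from "node:path";

const MAX_FILE_SIZE = 104_857_600; // 100MB default, per the configuration table

// Hypothetical sketch of the ingestion pre-checks: absolute path required,
// file must exist, and its size must not exceed MAX_FILE_SIZE.
function validateFile(filePath: string): void {
  if (!path.isAbsolute(filePath)) {
    throw new Error(`Absolute path required: ${filePath}`);
  }
  const { size } = statSync(filePath); // throws if the file does not exist
  if (size > MAX_FILE_SIZE) {
    throw new Error(`File exceeds ${MAX_FILE_SIZE} bytes: ${filePath}`);
  }
}
```

Failing fast here is what keeps an oversized file from ever reaching the memory-hungry embedding step.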
@@ -226,9 +224,9 @@ This takes roughly 5-10 seconds per MB on a standard laptop. You'll see a confir
 Ask questions in natural language:
 
 ```
-"
-"
-"
+"What does the API documentation say about authentication?"
+"Find information about rate limiting"
+"Search for error handling best practices"
 ```
 
 The server:
@@ -241,7 +239,7 @@ Results include the text content, which file it came from, and a relevance score
 You can request more results:
 
 ```
-"
+"Search for database optimization tips, return 10 results"
 ```
 
 The limit parameter accepts 1-20 results.
@@ -251,7 +249,7 @@ The limit parameter accepts 1-20 results.
 See what's indexed:
 
 ```
-"
+"List all ingested files"
 ```
 
 This shows each file's path, how many chunks it produced, and when it was ingested.
@@ -259,7 +257,7 @@ This shows each file's path, how many chunks it produced, and when it was ingest
 Check system status:
 
 ```
-"
+"Show the RAG server status"
 ```
 
 This reports total documents, total chunks, current memory usage, and uptime.
@@ -269,7 +267,7 @@ This reports total documents, total chunks, current memory usage, and uptime.
 If you update a document, ingest it again:
 
 ```
-"
+"Re-ingest api-spec.pdf with the latest changes"
 ```
 
 The server automatically deletes old chunks for that file before adding new ones. No duplicates, no stale data.
@@ -341,24 +339,41 @@ Each module has clear boundaries:
 
 ## Performance
 
-
+**Test Environment**: MacBook Pro M1 (16GB RAM), tested with v0.1.3 on Node.js 22 (January 2025)
 
-**Query
+**Query Performance**:
+- Average: 1.2 seconds for 10,000 indexed chunks (5 results)
+- Target: p90 < 3 seconds ✓
 
-**Ingestion
--
--
--
--
+**Ingestion Speed** (10MB PDF):
+- Total: ~45 seconds
+- PDF parsing: ~8 seconds (17%)
+- Text chunking: ~2 seconds (4%)
+- Embedding generation: ~30 seconds (67%)
+- Database insertion: ~5 seconds (11%)
 
-**Memory
+**Memory Usage**:
+- Baseline: ~200MB idle
+- Peak: ~800MB when ingesting 50MB file
+- Target: < 1GB ✓
 
-**Concurrent
+**Concurrent Queries**: Handles 5 parallel queries without degradation. LanceDB's async API allows non-blocking operations.
 
-Your results will vary based on hardware, especially CPU speed (
+**Note**: Your results will vary based on hardware, especially CPU speed (embeddings run on CPU, not GPU).
 
 ## Troubleshooting
 
+### "No results found" when searching
+
+**Cause**: Documents must be ingested before searching.
+
+**Solution**:
+1. First ingest documents: `"Ingest /path/to/document.pdf"`
+2. Verify ingestion: `"List all ingested files"`
+3. Then search: `"Search for [your query]"`
+
+**Common mistake**: Trying to search immediately after configuration without ingesting any documents.
+
 ### "Model download failed"
 
 The embedding model downloads from HuggingFace on first run. If you're behind a proxy or firewall, you might need to configure network settings.
@@ -445,7 +460,20 @@ Cloud services (OpenAI, Pinecone, etc.) typically offer better accuracy and scal
 
 **What file formats are supported?**
 
-
+Currently supported:
+- **PDF**: `.pdf` (uses pdf-parse)
+- **Microsoft Word**: `.docx` (uses mammoth, not `.doc`)
+- **Plain Text**: `.txt`
+- **Markdown**: `.md`, `.markdown`
+
+**Not yet supported**:
+- Excel/CSV (`.xlsx`, `.csv`)
+- PowerPoint (`.pptx`)
+- Images with OCR (`.jpg`, `.png`)
+- HTML (`.html`)
+- Old Word documents (`.doc`)
+
+Want support for another format? [Open an issue](https://github.com/shinpr/mcp-local-rag/issues/new) with your use case.
 
 **Can I customize the embedding model?**
 
@@ -476,8 +504,6 @@ Contributions are welcome. Before submitting a PR:
 3. Add tests for new features
 4. Update documentation if you change behavior
 
-This project follows the [Conventional Commits](https://www.conventionalcommits.org/) standard for commit messages.
-
 ## License
 
 MIT License - see LICENSE file for details.
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "mcp-local-rag",
-  "version": "0.1.3",
+  "version": "0.1.4",
   "description": "Local RAG MCP Server - Easy-to-setup document search with minimal configuration",
   "main": "dist/index.js",
   "bin": {
@@ -81,7 +81,7 @@
     "vitest": "^3.2.4"
   },
   "engines": {
-    "node": "20"
+    "node": ">=20"
   },
   "lint-staged": {
     "src/**/*.{ts,tsx}": [