@robthepcguy/rag-vault 1.3.0 → 1.5.0

Files changed (158)
  1. package/LICENSE +0 -0
  2. package/README.md +478 -441
  3. package/dist/bin/install-skills.d.ts +0 -0
  4. package/dist/bin/install-skills.js +0 -0
  5. package/dist/chunker/index.d.ts +0 -0
  6. package/dist/chunker/index.js +0 -0
  7. package/dist/chunker/semantic-chunker.d.ts +0 -0
  8. package/dist/chunker/semantic-chunker.js +0 -0
  9. package/dist/chunker/sentence-splitter.d.ts +0 -0
  10. package/dist/chunker/sentence-splitter.js +0 -0
  11. package/dist/embedder/index.d.ts +8 -0
  12. package/dist/embedder/index.js +38 -0
  13. package/dist/errors/index.d.ts +0 -0
  14. package/dist/errors/index.js +0 -0
  15. package/dist/explainability/index.d.ts +0 -0
  16. package/dist/explainability/index.js +0 -0
  17. package/dist/explainability/keywords.d.ts +0 -0
  18. package/dist/explainability/keywords.js +0 -0
  19. package/dist/flywheel/feedback.d.ts +0 -0
  20. package/dist/flywheel/feedback.js +0 -0
  21. package/dist/flywheel/index.d.ts +0 -0
  22. package/dist/flywheel/index.js +0 -0
  23. package/dist/index.d.ts +0 -0
  24. package/dist/parser/html-parser.d.ts +0 -0
  25. package/dist/parser/html-parser.js +0 -0
  26. package/dist/parser/index.d.ts +0 -0
  27. package/dist/parser/index.js +21 -4
  28. package/dist/parser/pdf-filter.d.ts +0 -0
  29. package/dist/parser/pdf-filter.js +0 -0
  30. package/dist/query/index.d.ts +0 -0
  31. package/dist/query/index.js +0 -0
  32. package/dist/query/parser.d.ts +0 -0
  33. package/dist/query/parser.js +0 -0
  34. package/dist/server/index.d.ts +0 -0
  35. package/dist/server/index.js +33 -12
  36. package/dist/server/raw-data-utils.d.ts +9 -0
  37. package/dist/server/raw-data-utils.js +15 -0
  38. package/dist/server/schemas.d.ts +0 -0
  39. package/dist/server/schemas.js +0 -0
  40. package/dist/utils/config-parsers.d.ts +0 -0
  41. package/dist/utils/config-parsers.js +0 -0
  42. package/dist/utils/config.d.ts +0 -0
  43. package/dist/utils/config.js +0 -0
  44. package/dist/utils/file-utils.d.ts +0 -0
  45. package/dist/utils/file-utils.js +0 -0
  46. package/dist/utils/math.d.ts +0 -0
  47. package/dist/utils/math.js +0 -0
  48. package/dist/utils/process-handlers.d.ts +0 -0
  49. package/dist/utils/process-handlers.js +0 -0
  50. package/dist/vectordb/index.d.ts +0 -0
  51. package/dist/vectordb/index.js +2 -1
  52. package/dist/web/api-routes.d.ts +0 -0
  53. package/dist/web/api-routes.js +0 -0
  54. package/dist/web/config-routes.d.ts +0 -0
  55. package/dist/web/config-routes.js +0 -0
  56. package/dist/web/database-manager.d.ts +4 -0
  57. package/dist/web/database-manager.js +15 -0
  58. package/dist/web/http-server.d.ts +0 -0
  59. package/dist/web/http-server.js +13 -1
  60. package/dist/web/index.d.ts +0 -0
  61. package/dist/web/index.js +0 -0
  62. package/dist/web/middleware/async-handler.d.ts +0 -0
  63. package/dist/web/middleware/async-handler.js +0 -0
  64. package/dist/web/middleware/auth.d.ts +0 -0
  65. package/dist/web/middleware/auth.js +0 -0
  66. package/dist/web/middleware/error-handler.d.ts +0 -0
  67. package/dist/web/middleware/error-handler.js +0 -0
  68. package/dist/web/middleware/index.d.ts +0 -0
  69. package/dist/web/middleware/index.js +0 -0
  70. package/dist/web/middleware/rate-limit.d.ts +0 -0
  71. package/dist/web/middleware/rate-limit.js +0 -0
  72. package/dist/web/middleware/request-logger.d.ts +0 -0
  73. package/dist/web/middleware/request-logger.js +0 -0
  74. package/dist/web/types.d.ts +0 -0
  75. package/dist/web/types.js +0 -0
  76. package/package.json +54 -36
  77. package/skills/rag-vault/SKILL.md +0 -0
  78. package/skills/rag-vault/references/html-ingestion.md +0 -0
  79. package/skills/rag-vault/references/query-optimization.md +0 -0
  80. package/skills/rag-vault/references/result-refinement.md +0 -0
  81. package/web-ui/dist/assets/{index-BcRp9-z9.js → index-SBHxoAwi.js} +2 -2
  82. package/web-ui/dist/assets/index-ej8i4PGl.css +0 -0
  83. package/web-ui/dist/index.html +1 -1
  84. package/web-ui/dist/vite.svg +0 -0
  85. package/dist/bin/install-skills.d.ts.map +0 -1
  86. package/dist/bin/install-skills.js.map +0 -1
  87. package/dist/chunker/index.d.ts.map +0 -1
  88. package/dist/chunker/index.js.map +0 -1
  89. package/dist/chunker/semantic-chunker.d.ts.map +0 -1
  90. package/dist/chunker/semantic-chunker.js.map +0 -1
  91. package/dist/chunker/sentence-splitter.d.ts.map +0 -1
  92. package/dist/chunker/sentence-splitter.js.map +0 -1
  93. package/dist/embedder/index.d.ts.map +0 -1
  94. package/dist/embedder/index.js.map +0 -1
  95. package/dist/errors/index.d.ts.map +0 -1
  96. package/dist/errors/index.js.map +0 -1
  97. package/dist/explainability/index.d.ts.map +0 -1
  98. package/dist/explainability/index.js.map +0 -1
  99. package/dist/explainability/keywords.d.ts.map +0 -1
  100. package/dist/explainability/keywords.js.map +0 -1
  101. package/dist/flywheel/feedback.d.ts.map +0 -1
  102. package/dist/flywheel/feedback.js.map +0 -1
  103. package/dist/flywheel/index.d.ts.map +0 -1
  104. package/dist/flywheel/index.js.map +0 -1
  105. package/dist/index.d.ts.map +0 -1
  106. package/dist/index.js.map +0 -1
  107. package/dist/parser/html-parser.d.ts.map +0 -1
  108. package/dist/parser/html-parser.js.map +0 -1
  109. package/dist/parser/index.d.ts.map +0 -1
  110. package/dist/parser/index.js.map +0 -1
  111. package/dist/parser/pdf-filter.d.ts.map +0 -1
  112. package/dist/parser/pdf-filter.js.map +0 -1
  113. package/dist/query/index.d.ts.map +0 -1
  114. package/dist/query/index.js.map +0 -1
  115. package/dist/query/parser.d.ts.map +0 -1
  116. package/dist/query/parser.js.map +0 -1
  117. package/dist/server/index.d.ts.map +0 -1
  118. package/dist/server/index.js.map +0 -1
  119. package/dist/server/raw-data-utils.d.ts.map +0 -1
  120. package/dist/server/raw-data-utils.js.map +0 -1
  121. package/dist/server/schemas.d.ts.map +0 -1
  122. package/dist/server/schemas.js.map +0 -1
  123. package/dist/utils/config-parsers.d.ts.map +0 -1
  124. package/dist/utils/config-parsers.js.map +0 -1
  125. package/dist/utils/config.d.ts.map +0 -1
  126. package/dist/utils/config.js.map +0 -1
  127. package/dist/utils/file-utils.d.ts.map +0 -1
  128. package/dist/utils/file-utils.js.map +0 -1
  129. package/dist/utils/math.d.ts.map +0 -1
  130. package/dist/utils/math.js.map +0 -1
  131. package/dist/utils/process-handlers.d.ts.map +0 -1
  132. package/dist/utils/process-handlers.js.map +0 -1
  133. package/dist/vectordb/index.d.ts.map +0 -1
  134. package/dist/vectordb/index.js.map +0 -1
  135. package/dist/web/api-routes.d.ts.map +0 -1
  136. package/dist/web/api-routes.js.map +0 -1
  137. package/dist/web/config-routes.d.ts.map +0 -1
  138. package/dist/web/config-routes.js.map +0 -1
  139. package/dist/web/database-manager.d.ts.map +0 -1
  140. package/dist/web/database-manager.js.map +0 -1
  141. package/dist/web/http-server.d.ts.map +0 -1
  142. package/dist/web/http-server.js.map +0 -1
  143. package/dist/web/index.d.ts.map +0 -1
  144. package/dist/web/index.js.map +0 -1
  145. package/dist/web/middleware/async-handler.d.ts.map +0 -1
  146. package/dist/web/middleware/async-handler.js.map +0 -1
  147. package/dist/web/middleware/auth.d.ts.map +0 -1
  148. package/dist/web/middleware/auth.js.map +0 -1
  149. package/dist/web/middleware/error-handler.d.ts.map +0 -1
  150. package/dist/web/middleware/error-handler.js.map +0 -1
  151. package/dist/web/middleware/index.d.ts.map +0 -1
  152. package/dist/web/middleware/index.js.map +0 -1
  153. package/dist/web/middleware/rate-limit.d.ts.map +0 -1
  154. package/dist/web/middleware/rate-limit.js.map +0 -1
  155. package/dist/web/middleware/request-logger.d.ts.map +0 -1
  156. package/dist/web/middleware/request-logger.js.map +0 -1
  157. package/dist/web/types.d.ts.map +0 -1
  158. package/dist/web/types.js.map +0 -1
package/README.md CHANGED
@@ -1,441 +1,478 @@
1
- # RAG Vault
2
-
3
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4
- [![TypeScript](https://img.shields.io/badge/TypeScript-5.0-blue.svg?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
5
- [![MCP Registry](https://img.shields.io/badge/MCP-Registry-green.svg)](https://registry.modelcontextprotocol.io/servers/io.github.RobThePCGuy/rag-vault)
6
-
7
- **Your documents. Your machine. Your control.**
8
-
9
- RAG Vault gives AI coding assistants instant access to your private documents—API specs, research papers, internal docs—without ever sending data to the cloud. One command, zero configuration, complete privacy.
10
-
11
- ## Why RAG Vault?
12
-
13
- | Pain Point | RAG Vault Solution |
14
- |------------|-------------------|
15
- | "I don't want my docs on someone else's server" | Everything stays local. No API calls after setup. |
16
- | "Semantic search misses exact code terms" | Hybrid search: meaning + exact matches like `useEffect` |
17
- | "Setup requires Docker, Python, databases..." | One `npx` command. Done. |
18
- | "Cloud APIs charge per query" | Free forever. No subscriptions. |
19
-
20
- ## Security
21
-
22
- RAG Vault includes security features for production deployment:
23
- - **API Authentication** — Optional API key via `RAG_API_KEY`
24
- - **Rate Limiting** — Configurable request throttling
25
- - **CORS Control** — Restrict allowed origins
26
- - **Security Headers** — Helmet.js protection
27
-
28
- See [SECURITY.md](SECURITY.md) for complete documentation.
29
-
30
- ## Get Started in 30 Seconds
31
-
32
- ### For Cursor
33
-
34
- Add to `~/.cursor/mcp.json`:
35
-
36
- ```json
37
- {
38
- "mcpServers": {
39
- "local-rag": {
40
- "type": "stdio",
41
- "command": "npx",
42
- "args": ["-y", "github:RobThePCGuy/rag-vault"],
43
- "env": {
44
- "BASE_DIR": "/path/to/your/documents"
45
- }
46
- }
47
- }
48
- }
49
- ```
50
-
51
- ### For Claude Code
52
-
53
- Add to `.mcp.json` in your project directory:
54
-
55
- ```json
56
- {
57
- "mcpServers": {
58
- "local-rag": {
59
- "type": "stdio",
60
- "command": "npx",
61
- "args": ["-y", "github:RobThePCGuy/rag-vault"],
62
- "env": {
63
- "BASE_DIR": "./documents",
64
- "DB_PATH": "./documents/.rag-db",
65
- "CACHE_DIR": "./.cache",
66
- "RAG_HYBRID_WEIGHT": "0.6",
67
- "RAG_GROUPING": "related"
68
- }
69
- }
70
- }
71
- }
72
- ```
73
-
74
- Or add inline via CLI:
75
-
76
- ```bash
77
- claude mcp add local-rag --scope user --env BASE_DIR=/path/to/your/documents -- npx -y github:RobThePCGuy/rag-vault
78
- ```
79
-
80
- ### For Codex
81
-
82
- Add to `~/.codex/config.toml`:
83
-
84
- ```toml
85
- [mcp_servers.local-rag]
86
- command = "npx"
87
- args = ["-y", "github:RobThePCGuy/rag-vault"]
88
-
89
- [mcp_servers.local-rag.env]
90
- BASE_DIR = "/path/to/your/documents"
91
- ```
92
-
93
- ### Install Skills (Optional)
94
-
95
- For enhanced AI guidance on query formulation and result interpretation, install the RAG Vault skills:
96
-
97
- ```bash
98
- # Claude Code (project-level - recommended for team projects)
99
- npx github:RobThePCGuy/rag-vault skills install --claude-code
100
-
101
- # Claude Code (user-level - available in all projects)
102
- npx github:RobThePCGuy/rag-vault skills install --claude-code --global
103
-
104
- # Codex (user-level)
105
- npx github:RobThePCGuy/rag-vault skills install --codex
106
-
107
- # Custom location
108
- npx github:RobThePCGuy/rag-vault skills install --path /your/custom/path
109
- ```
110
-
111
- Skills teach Claude best practices for:
112
- - Query formulation and expansion strategies
113
- - Score interpretation (< 0.3 = good match, > 0.5 = skip)
114
- - When to use `ingest_file` vs `ingest_data`
115
- - HTML ingestion and URL handling
116
-
117
- Restart your AI tool, and start talking:
118
-
119
- ```
120
- You: "Ingest api-spec.pdf"
121
- AI: Successfully ingested api-spec.pdf (47 chunks)
122
-
123
- You: "How does authentication work?"
124
- AI: Based on section 3.2, authentication uses OAuth 2.0 with JWT tokens...
125
- ```
126
-
127
- That's it. No Docker. No Python. No servers.
128
-
129
- ## Web Interface
130
-
131
- RAG Vault includes a full-featured web UI for managing your documents without the command line.
132
-
133
- ### Launch the Web UI
134
-
135
- ```bash
136
- npx github:RobThePCGuy/rag-vault web
137
- ```
138
-
139
- Open [http://localhost:3000](http://localhost:3000) in your browser.
140
-
141
- ### What You Can Do
142
-
143
- - **Upload documents** — Drag and drop PDFs, Word docs, Markdown, text files
144
- - **Search instantly** — Type queries and see results with relevance scores
145
- - **Preview content** — Click any result to see the full chunk in context
146
- - **Manage files** — View all indexed documents, delete what you don't need
147
- - **Switch databases** — Create and switch between multiple knowledge bases
148
- - **Monitor status** — See document counts, memory usage, and search mode
149
- - **Export/Import settings** — Back up and restore your vault configuration
150
- - **Theme preferences** — Switch between light, dark, or system theme
151
- - **Folder browser** — Navigate directories to select documents
152
-
153
- ### REST API
154
-
155
- The web server exposes a REST API for programmatic access. Set `RAG_API_KEY` to require authentication:
156
-
157
- ```bash
158
- # With authentication (when RAG_API_KEY is set)
159
- curl -X POST "http://localhost:3000/api/v1/search" \
160
- -H "Authorization: Bearer your-api-key" \
161
- -H "Content-Type: application/json" \
162
- -d '{"query": "authentication", "limit": 5}'
163
-
164
- # Search documents (no auth required if RAG_API_KEY is not set)
165
- curl -X POST "http://localhost:3000/api/v1/search" \
166
- -H "Content-Type: application/json" \
167
- -d '{"query": "authentication", "limit": 5}'
168
-
169
- # List all files
170
- curl "http://localhost:3000/api/v1/files"
171
-
172
- # Upload a document
173
- curl -X POST "http://localhost:3000/api/v1/files/upload" \
174
- -F "file=@spec.pdf"
175
-
176
- # Delete a file
177
- curl -X DELETE "http://localhost:3000/api/v1/files" \
178
- -H "Content-Type: application/json" \
179
- -d '{"filePath": "/path/to/spec.pdf"}'
180
-
181
- # Get system status
182
- curl "http://localhost:3000/api/v1/status"
183
-
184
- # Health check (for load balancers)
185
- curl "http://localhost:3000/api/v1/health"
186
- ```
187
-
188
- ### Reader API Endpoints
189
-
190
- For programmatic document reading and cross-document discovery:
191
-
192
- ```bash
193
- # Get all chunks for a document (ordered by index)
194
- curl "http://localhost:3000/api/v1/documents/chunks?filePath=/path/to/doc.pdf"
195
-
196
- # Find related chunks for cross-document discovery
197
- curl "http://localhost:3000/api/v1/chunks/related?filePath=/path/to/doc.pdf&chunkIndex=0&limit=5"
198
-
199
- # Batch request for multiple chunks (efficient for UIs)
200
- curl -X POST "http://localhost:3000/api/v1/chunks/batch-related" \
201
- -H "Content-Type: application/json" \
202
- -d '{"chunks": [{"filePath": "/path/to/doc.pdf", "chunkIndex": 0}], "limit": 3}'
203
- ```
204
-
205
- ## Real-World Examples
206
-
207
- ### Search Your Codebase Documentation
208
-
209
- ```
210
- You: "Ingest all the markdown files in /docs"
211
- AI: Ingested 23 files (847 chunks total)
212
-
213
- You: "What's the retry policy for failed API calls?"
214
- AI: According to error-handling.md, failed requests retry 3 times
215
- with exponential backoff: 1s, 2s, 4s...
216
- ```
217
-
218
- ### Index Web Documentation
219
-
220
- ```
221
- You: "Fetch https://docs.example.com/api and ingest the HTML"
222
- AI: Ingested "docs.example.com/api" (156 chunks)
223
-
224
- You: "What rate limits apply to the /users endpoint?"
225
- AI: The API limits /users to 100 requests per minute per API key...
226
- ```
227
-
228
- ### Build a Personal Knowledge Base
229
-
230
- ```
231
- You: "Ingest my research papers folder"
232
- AI: Ingested 12 PDFs (2,341 chunks)
233
-
234
- You: "What do recent studies say about transformer attention mechanisms?"
235
- AI: Based on attention-mechanisms-2024.pdf, the key finding is...
236
- ```
237
-
238
- ### Search Exact Technical Terms
239
-
240
- RAG Vault's hybrid search catches both meaning and exact matches:
241
-
242
- ```
243
- You: "Search for ERR_CONNECTION_REFUSED"
244
- AI: Found 3 results mentioning ERR_CONNECTION_REFUSED:
245
- 1. troubleshooting.md - "When you see ERR_CONNECTION_REFUSED..."
246
- 2. network-errors.pdf - "Common causes include..."
247
- ```
248
-
249
- Pure semantic search would miss this. RAG Vault finds it.
250
-
251
- ## How It Works
252
-
253
- ```
254
- Document → Parse → Chunk by meaning → Embed locally → Store in LanceDB
255
-
256
- Query → Embed → Vector search → Keyword boost → Quality filter → Results
257
- ```
258
-
259
- **Smart chunking**: Splits by meaning, not character count. Keeps code blocks intact.
260
-
261
- **Hybrid search**: Vector similarity finds related content. Keyword boost ranks exact matches higher.
262
-
263
- **Quality filtering**: Groups results by relevance gaps instead of arbitrary top-K cutoffs.
264
-
265
- **Local everything**: Embeddings via Transformers.js. Storage via LanceDB. No network after model download.
266
-
267
- ## Supported Formats
268
-
269
- | Format | Extension | Notes |
270
- |--------|-----------|-------|
271
- | PDF | `.pdf` | Full text extraction, header/footer filtering |
272
- | Word | `.docx` | Tables, lists, formatting preserved |
273
- | Markdown | `.md` | Code blocks kept intact |
274
- | Text | `.txt` | Plain text |
275
- | JSON | `.json` | Converted to searchable key-value text |
276
- | HTML | via `ingest_data` | Auto-cleaned with Readability |
277
-
278
- ## Configuration
279
-
280
- ### Environment Variables
281
-
282
- | Variable | Default | What it does |
283
- |----------|---------|--------------|
284
- | `BASE_DIR` | Current directory | Only files under this path can be accessed |
285
- | `DB_PATH` | `./lancedb/` | Where vectors are stored |
286
- | `MODEL_NAME` | `Xenova/all-MiniLM-L6-v2` | HuggingFace embedding model |
287
- | `WEB_PORT` | `3000` | Port for web interface |
288
-
289
- ### Search Tuning
290
-
291
- | Variable | Default | What it does |
292
- |----------|---------|--------------|
293
- | `RAG_HYBRID_WEIGHT` | `0.6` | Keyword boost strength. 0 = semantic-only, higher = stronger boost for exact keyword matches |
294
- | `RAG_GROUPING` | — | `similar` = top group only, `related` = top 2 groups |
295
- | `RAG_MAX_DISTANCE` | — | Filter out results below this relevance threshold |
296
-
297
- ### Security (optional)
298
-
299
- | Variable | Default | What it does |
300
- |----------|---------|--------------|
301
- | `RAG_API_KEY` | | API key for authentication |
302
- | `CORS_ORIGINS` | localhost | Allowed origins (comma-separated, or `*`) |
303
- | `RATE_LIMIT_WINDOW_MS` | `60000` | Rate limit time window (ms) |
304
- | `RATE_LIMIT_MAX_REQUESTS` | `100` | Max requests per window |
305
-
306
- ### Advanced
307
-
308
- | Variable | Default | What it does |
309
- |----------|---------|--------------|
310
- | `ALLOWED_SCAN_ROOTS` | Home directory | Directories allowed for database scanning |
311
- | `JSON_BODY_LIMIT` | `5mb` | Max request body size |
312
- | `REQUEST_TIMEOUT_MS` | `30000` | API request timeout |
313
- | `REQUEST_LOGGING` | `false` | Enable request audit logging |
314
-
315
- > Copy [`.env.example`](.env.example) for a complete configuration template.
316
-
317
- **For code-heavy content**, try:
318
-
319
- ```json
320
- "env": {
321
- "RAG_HYBRID_WEIGHT": "0.8",
322
- "RAG_GROUPING": "similar"
323
- }
324
- ```
325
-
326
- ## Frequently Asked Questions
327
-
328
- <details>
329
- <summary><strong>Is my data really private?</strong></summary>
330
-
331
- Yes. After the embedding model downloads (~90MB), RAG Vault makes zero network requests. Everything runs on your machine. Verify with network monitoring.
332
-
333
- </details>
334
-
335
- <details>
336
- <summary><strong>Does it work offline?</strong></summary>
337
-
338
- Yes, after the first run. The model caches locally.
339
-
340
- </details>
341
-
342
- <details>
343
- <summary><strong>What about GPU acceleration?</strong></summary>
344
-
345
- Transformers.js runs on CPU. GPU support is experimental but unnecessary for most use cases—queries return in ~1 second even with 10,000 chunks.
346
-
347
- </details>
348
-
349
- <details>
350
- <summary><strong>Can I change the embedding model?</strong></summary>
351
-
352
- Yes. Set `MODEL_NAME` to any compatible HuggingFace model. But you must delete `DB_PATH` and re-ingest—different models produce incompatible vectors.
353
-
354
- **Recommended upgrade:** For better quality and multilingual support, use [EmbeddingGemma](https://huggingface.co/onnx-community/embeddinggemma-300m-ONNX):
355
-
356
- ```json
357
- "MODEL_NAME": "onnx-community/embeddinggemma-300m-ONNX"
358
- ```
359
-
360
- This 300M parameter model scores 68.36 on MTEB benchmarks and supports 100+ languages, making it ideal for mixed-language or high-quality retrieval needs.
361
-
362
- **Other specialized models:**
363
- - Scientific: `sentence-transformers/allenai-specter`
364
- - Code: `jinaai/jina-embeddings-v2-base-code`
365
-
366
- </details>
367
-
368
- <details>
369
- <summary><strong>How do I back up my data?</strong></summary>
370
-
371
- Copy the `DB_PATH` directory (default: `./lancedb/`).
372
-
373
- </details>
374
-
375
- ## Troubleshooting
376
-
377
- | Problem | Solution |
378
- |---------|----------|
379
- | No results found | Documents must be ingested first. Run "List all ingested files" to check. |
380
- | Model download failed | Check internet connection. Model is ~90MB from HuggingFace. |
381
- | File too large | Default limit is 100MB. Set `MAX_FILE_SIZE` higher or split the file. |
382
- | Path outside BASE_DIR | All file paths must be under `BASE_DIR`. Use absolute paths. |
383
- | MCP tools not showing | Verify config syntax, restart your AI tool completely (Cmd+Q on Mac). |
384
- | 401 Unauthorized | API key required. Set `RAG_API_KEY` or use correct header format. |
385
- | 429 Too Many Requests | Rate limited. Wait for reset or increase `RATE_LIMIT_MAX_REQUESTS`. |
386
- | CORS errors | Add your origin to `CORS_ORIGINS` environment variable. |
387
-
388
- ## Development
389
-
390
- ```bash
391
- git clone https://github.com/RobThePCGuy/rag-vault.git
392
- cd rag-vault
393
- pnpm install
394
-
395
- # Run tests
396
- pnpm test
397
-
398
- # Type check + lint + format
399
- pnpm check:all
400
-
401
- # Build
402
- pnpm build
403
-
404
- # Run MCP server locally
405
- pnpm dev
406
-
407
- # Run web server locally
408
- pnpm web:dev
409
- ```
410
-
411
- ### Project Structure
412
-
413
- ```
414
- src/
415
- ├── server/ # MCP tool handlers
416
- ├── vectordb/ # LanceDB + hybrid search
417
- ├── chunker/ # Semantic text splitting
418
- ├── embedder/ # Transformers.js wrapper
419
- ├── parser/ # PDF, DOCX, HTML parsing
420
- ├── web/ # Express server + REST API
421
- └── __tests__/ # Test suites
422
-
423
- web-ui/ # React frontend
424
- ```
425
-
426
- ## Documentation
427
-
428
- - [SECURITY.md](SECURITY.md) — Security configuration and best practices
429
- - [.env.example](.env.example) — Complete environment variable template
430
-
431
- ## License
432
-
433
- MIT — free for personal and commercial use.
434
-
435
- ## Acknowledgments
436
-
437
- Built with [Model Context Protocol](https://modelcontextprotocol.io/), [LanceDB](https://lancedb.com/), and [Transformers.js](https://huggingface.co/docs/transformers.js).
438
-
439
- > Started as a fork of [mcp-local-rag](https://github.com/shinpr/mcp-local-rag) by [Shinsuke Kagawa](https://github.com/shinpr). Now it’s its own thing.
440
- > Huge credit to upstream contributors for the foundation, I’ve been iterating hard from there.
441
- > Local-first dev tools, all the way.
1
+ # RAG Vault
2
+
3
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
4
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.0-blue.svg?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
5
+ [![MCP Registry](https://img.shields.io/badge/MCP-Registry-green.svg)](https://registry.modelcontextprotocol.io/servers/io.github.RobThePCGuy/rag-vault)
6
+
7
+ **Your documents. Your machine. Your control.**
8
+
9
+ RAG Vault gives AI coding assistants fast access to your private documents such as API specs, research papers, and internal docs. Indexing and search run locally, and your data stays on your machine unless you explicitly ingest content from a remote URL.
10
+
11
+ One command to run, minimal setup, privacy by default.
12
+
13
+ ## Why RAG Vault?
14
+
15
+ | Pain Point | RAG Vault Solution |
16
+ |------------|-------------------|
17
+ | "I don't want my docs on someone else's server" | Everything stays local by default. No background cloud calls for indexing or search. |
18
+ | "Semantic search misses exact code terms" | Hybrid search: meaning + exact matches like `useEffect` |
19
+ | "Setup requires Docker, Python, databases..." | One `npx` command plus a small MCP config block. |
20
+ | "Cloud APIs charge per query" | Free forever. No subscriptions. |
21
+
22
+ ## Security
23
+
24
+ RAG Vault includes security features for production deployment:
25
+ - **API Authentication**: Optional API key via `RAG_API_KEY`
26
+ - **Rate Limiting**: Configurable request throttling
27
+ - **CORS Control**: Restrict allowed origins
28
+ - **Security Headers**: Helmet.js protection
29
+
30
+ See [SECURITY.md](SECURITY.md) for complete documentation.
31
+
32
+ ## First-Time Setup Checklist
33
+
34
+ Before adding MCP config:
35
+
36
+ 1. Install Node.js 20 or newer.
37
+ 2. Pick a documents directory and set `BASE_DIR` to that path.
38
+ 3. Make sure your AI tool process can read `BASE_DIR`.
39
+ 4. Restart your AI tool after editing config.
40
+
41
+ ## Get Started Quickly
42
+
43
+ ### For Cursor
44
+
45
+ Add to `~/.cursor/mcp.json`:
46
+
47
+ ```json
48
+ {
49
+ "mcpServers": {
50
+ "local-rag": {
51
+ "type": "stdio",
52
+ "command": "npx",
53
+ "args": ["-y", "github:RobThePCGuy/rag-vault"],
54
+ "env": {
55
+ "BASE_DIR": "/path/to/your/documents"
56
+ }
57
+ }
58
+ }
59
+ }
60
+ ```
61
+
62
+ Replace `/path/to/your/documents` with your real absolute path.
63
+
64
+ ### For Claude Code
65
+
66
+ Add to `.mcp.json` in your project directory:
67
+
68
+ ```json
69
+ {
70
+ "mcpServers": {
71
+ "local-rag": {
72
+ "type": "stdio",
73
+ "command": "npx",
74
+ "args": ["-y", "github:RobThePCGuy/rag-vault"],
75
+ "env": {
76
+ "BASE_DIR": "./documents",
77
+ "DB_PATH": "./documents/.rag-db",
78
+ "CACHE_DIR": "./.cache",
79
+ "RAG_HYBRID_WEIGHT": "0.6",
80
+ "RAG_GROUPING": "related"
81
+ }
82
+ }
83
+ }
84
+ }
85
+ ```
86
+
87
+ Or add inline via CLI:
88
+
89
+ ```bash
90
+ claude mcp add local-rag --scope user --env BASE_DIR=/path/to/your/documents -- npx -y github:RobThePCGuy/rag-vault
91
+ ```
92
+
93
+ ### For Codex
94
+
95
+ Add to `~/.codex/config.toml`:
96
+
97
+ ```toml
98
+ [mcp_servers.local-rag]
99
+ command = "npx"
100
+ args = ["-y", "github:RobThePCGuy/rag-vault"]
101
+
102
+ [mcp_servers.local-rag.env]
103
+ BASE_DIR = "/path/to/your/documents"
104
+ ```
105
+
106
+ ### Install Skills (Optional)
107
+
108
+ For enhanced AI guidance on query formulation and result interpretation, install the RAG Vault skills:
109
+
110
+ ```bash
111
+ # Claude Code (project-level - recommended for team projects)
112
+ npx github:RobThePCGuy/rag-vault skills install --claude-code
113
+
114
+ # Claude Code (user-level - available in all projects)
115
+ npx github:RobThePCGuy/rag-vault skills install --claude-code --global
116
+
117
+ # Codex (user-level)
118
+ npx github:RobThePCGuy/rag-vault skills install --codex
119
+
120
+ # Custom location
121
+ npx github:RobThePCGuy/rag-vault skills install --path /your/custom/path
122
+ ```
123
+
124
+ Skills teach Claude best practices for:
125
+ - Query formulation and expansion strategies
126
+ - Score interpretation (< 0.3 = good match, > 0.5 = skip)
127
+ - When to use `ingest_file` vs `ingest_data`
128
+ - HTML ingestion and URL handling
129
+
130
+ Restart your AI tool, and start talking:
131
+
132
+ ```
133
+ You: "Ingest api-spec.pdf"
134
+ AI: Successfully ingested api-spec.pdf (47 chunks)
135
+
136
+ You: "How does authentication work?"
137
+ AI: Based on section 3.2, authentication uses OAuth 2.0 with JWT tokens...
138
+ ```
139
+
140
+ That's it. No Docker. No Python. No server infrastructure to manage.
141
+
142
+ ## Web Interface
143
+
144
+ RAG Vault includes a full-featured web UI for managing your documents without the command line.
145
+
146
+ ### Launch the Web UI
147
+
148
+ ```bash
149
+ npx github:RobThePCGuy/rag-vault web
150
+ ```
151
+
152
+ Open [http://localhost:3000](http://localhost:3000) in your browser.
153
+
154
+ ### What You Can Do
155
+
156
+ - **Upload documents**: Drag and drop PDF, DOCX, Markdown, TXT, JSON, JSONL, and NDJSON files
157
+ - **Search instantly**: Type queries and see results with relevance scores
158
+ - **Preview content**: Click any result to see the full chunk in context
159
+ - **Manage files**: View all indexed documents and delete what you do not need
160
+ - **Switch databases**: Create and switch between multiple knowledge bases
161
+ - **Monitor status**: See document counts, memory usage, and search mode
162
+ - **Export/Import settings**: Back up and restore your vault configuration
163
+ - **Theme preferences**: Switch between light, dark, or system theme
164
+ - **Folder browser**: Navigate directories to select documents
165
+
166
+ ### REST API
167
+
168
+ The web server exposes a REST API for programmatic access. Set `RAG_API_KEY` to require authentication:
169
+
170
+ ```bash
171
+ # With authentication (when RAG_API_KEY is set)
172
+ curl -X POST "http://localhost:3000/api/v1/search" \
173
+ -H "Authorization: Bearer your-api-key" \
174
+ -H "Content-Type: application/json" \
175
+ -d '{"query": "authentication", "limit": 5}'
176
+
177
+ # Search documents (no auth required if RAG_API_KEY is not set)
178
+ curl -X POST "http://localhost:3000/api/v1/search" \
179
+ -H "Content-Type: application/json" \
180
+ -d '{"query": "authentication", "limit": 5}'
181
+
182
+ # List all files
183
+ curl "http://localhost:3000/api/v1/files"
184
+
185
+ # Upload a document
186
+ curl -X POST "http://localhost:3000/api/v1/files/upload" \
187
+ -F "file=@spec.pdf"
188
+
189
+ # Delete a file
190
+ curl -X DELETE "http://localhost:3000/api/v1/files" \
191
+ -H "Content-Type: application/json" \
192
+ -d '{"filePath": "/path/to/spec.pdf"}'
193
+
194
+ # Get system status
195
+ curl "http://localhost:3000/api/v1/status"
196
+
197
+ # Health check (for load balancers)
198
+ curl "http://localhost:3000/api/v1/health"
199
+ ```
200
+
201
+ ### Reader API Endpoints
202
+
203
+ For programmatic document reading and cross-document discovery:
204
+
205
+ ```bash
206
+ # Get all chunks for a document (ordered by index)
207
+ curl "http://localhost:3000/api/v1/documents/chunks?filePath=/path/to/doc.pdf"
208
+
209
+ # Find related chunks for cross-document discovery
210
+ curl "http://localhost:3000/api/v1/chunks/related?filePath=/path/to/doc.pdf&chunkIndex=0&limit=5"
211
+
212
+ # Batch request for multiple chunks (efficient for UIs)
213
+ curl -X POST "http://localhost:3000/api/v1/chunks/batch-related" \
214
+ -H "Content-Type: application/json" \
215
+ -d '{"chunks": [{"filePath": "/path/to/doc.pdf", "chunkIndex": 0}], "limit": 3}'
216
+ ```
217
+
218
+ ## Real-World Examples
219
+
220
+ ### Search Your Codebase Documentation
221
+
222
+ ```
223
+ You: "Ingest all the markdown files in /docs"
224
+ AI: Ingested 23 files (847 chunks total)
225
+
226
+ You: "What's the retry policy for failed API calls?"
227
+ AI: According to error-handling.md, failed requests retry 3 times
228
+ with exponential backoff: 1s, 2s, 4s...
229
+ ```
230
+
231
+ ### Index Web Documentation
232
+
233
+ ```
234
+ You: "Fetch https://docs.example.com/api and ingest the HTML"
235
+ AI: Ingested "docs.example.com/api" (156 chunks)
236
+
237
+ You: "What rate limits apply to the /users endpoint?"
238
+ AI: The API limits /users to 100 requests per minute per API key...
239
+ ```
240
+
241
+ ### Build a Personal Knowledge Base
242
+
243
+ ```
244
+ You: "Ingest my research papers folder"
245
+ AI: Ingested 12 PDFs (2,341 chunks)
246
+
247
+ You: "What do recent studies say about transformer attention mechanisms?"
248
+ AI: Based on attention-mechanisms-2024.pdf, the key finding is...
249
+ ```
250
+
251
+ ### Search Exact Technical Terms
252
+
253
+ RAG Vault's hybrid search catches both meaning and exact matches:
254
+
255
+ ```
256
+ You: "Search for ERR_CONNECTION_REFUSED"
257
+ AI: Found 3 results mentioning ERR_CONNECTION_REFUSED:
258
+ 1. troubleshooting.md - "When you see ERR_CONNECTION_REFUSED..."
259
+ 2. network-errors.pdf - "Common causes include..."
260
+ ```
261
+
262
+ Pure semantic search would miss this. RAG Vault finds it.
263
+
264
+ ## How It Works
265
+
266
+ ```
267
+ Document → Parse → Chunk by meaning → Embed locally → Store in LanceDB
268
+
269
+ Query → Embed → Vector search → Keyword boost → Quality filter → Results
270
+ ```
271
+
272
+ **Smart chunking**: Splits by meaning, not character count. Keeps code blocks intact.
273
+
274
+ **Hybrid search**: Vector similarity finds related content. Keyword boost ranks exact matches higher.
275
+
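To make the keyword boost concrete, here is one way such a re-ranking step could look. This is an illustrative sketch, not RAG Vault's actual scoring: the `Hit` type, the `keywordBoost` function, and the formula `distance / (1 + weight * matchRatio)` are all assumptions. It shares the documented property that a weight of `0` leaves the semantic ranking unchanged.

```typescript
// Illustrative only: the real scoring inside rag-vault may differ.
interface Hit {
  text: string;
  distance: number; // vector distance, lower = more similar
}

function keywordBoost(hits: Hit[], query: string, weight: number): Hit[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return hits
    .map((hit) => {
      const text = hit.text.toLowerCase();
      const matched = terms.filter((term) => text.includes(term)).length;
      // Fraction of query terms that appear verbatim in the chunk.
      const matchRatio = terms.length > 0 ? matched / terms.length : 0;
      // Shrink the distance of chunks that contain exact matches.
      return { ...hit, distance: hit.distance / (1 + weight * matchRatio) };
    })
    .sort((a, b) => a.distance - b.distance);
}

const hits: Hit[] = [
  { text: "Network timeouts explained", distance: 0.3 },
  { text: "When you see ERR_CONNECTION_REFUSED...", distance: 0.4 },
];
console.log(keywordBoost(hits, "ERR_CONNECTION_REFUSED", 0.6));
```

In this sketch the second chunk starts with the worse distance, but the exact match divides it by `1.6`, so it moves ahead of the purely semantic hit.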
276
+ **Quality filtering**: Groups results by relevance gaps instead of arbitrary top-K cutoffs.
277
+
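Gap-based grouping can be sketched as follows. The code is a hypothetical illustration, not the package's implementation: it sorts distances, starts a new group wherever the gap to the previous result is unusually large (here, more than the mean gap plus `k` standard deviations, an assumed rule), and then keeps one or two groups in the spirit of the `similar` and `related` modes of `RAG_GROUPING`.

```typescript
// Hypothetical gap-based grouping; the threshold rule and names are assumptions.
function groupByGaps(distances: number[], k = 1.5): number[][] {
  const sorted = [...distances].sort((a, b) => a - b);
  if (sorted.length === 0) return [];
  const groups: number[][] = [[sorted[0]]];
  // gaps[i] is the jump from sorted[i] to sorted[i + 1].
  const gaps = sorted.slice(1).map((d, i) => d - sorted[i]);
  if (gaps.length === 0) return groups;
  const mean = gaps.reduce((sum, g) => sum + g, 0) / gaps.length;
  const variance = gaps.reduce((sum, g) => sum + (g - mean) ** 2, 0) / gaps.length;
  const threshold = mean + k * Math.sqrt(variance);
  gaps.forEach((gap, i) => {
    // An unusually large jump in distance starts a new relevance group.
    if (gap > threshold) groups.push([]);
    groups[groups.length - 1].push(sorted[i + 1]);
  });
  return groups;
}

function selectGroups(groups: number[][], mode: "similar" | "related"): number[] {
  const keep = mode === "similar" ? 1 : 2;
  return groups.slice(0, keep).reduce<number[]>((all, g) => all.concat(g), []);
}

const groups = groupByGaps([0.52, 0.1, 0.13, 0.5, 0.12]);
console.log(groups, selectGroups(groups, "similar"));
```

Unlike a fixed top-K cutoff, the number of results returned here depends on where the distances naturally separate.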
278
+ **Local by default**: Embeddings via Transformers.js. Storage via LanceDB. A network connection is needed only for the initial model download or when you explicitly ingest remote URLs.
279
+
280
+ **MCP tools included**: `ingest_file`, `ingest_data`, `query_documents`, `list_files`, and `delete_file`.
281
+
282
+ ## Supported Formats
283
+
284
+ | Format | Extension | Notes |
285
+ |--------|-----------|-------|
286
+ | PDF | `.pdf` | Full text extraction, header/footer filtering |
287
+ | Word | `.docx` | Tables, lists, formatting preserved |
288
+ | Markdown | `.md` | Code blocks kept intact |
289
+ | Text | `.txt` | Plain text |
290
+ | JSON | `.json` | Converted to searchable key-value text |
291
+ | JSONL / NDJSON | `.jsonl`, `.ndjson` | Parsed line-by-line for logs and structured records |
292
+ | HTML | via `ingest_data` | Auto-cleaned with Readability |
293
+
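As an illustration of what "searchable key-value text" can mean for JSON input, the sketch below flattens nested values into `path: value` lines. The exact text RAG Vault produces is not shown in this README, so the `flattenJson` function and its dotted-path output format are assumptions.

```typescript
// Hypothetical flattening; rag-vault's real JSON-to-text format may differ.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function flattenJson(value: Json, prefix = ""): string[] {
  if (value !== null && typeof value === "object") {
    // Arrays use their indexes as keys; objects use their property names.
    const entries: [string, Json][] = Array.isArray(value)
      ? value.map((item, i): [string, Json] => [String(i), item])
      : Object.entries(value);
    return entries.reduce<string[]>(
      (lines, [key, child]) =>
        lines.concat(flattenJson(child, prefix ? `${prefix}.${key}` : key)),
      [],
    );
  }
  // Leaf value: emit one "path: value" line.
  return [`${prefix}: ${String(value)}`];
}

const lines = flattenJson({ user: { name: "Ada", roles: ["admin", "dev"] }, active: true });
console.log(lines.join("\n"));
```

For JSONL or NDJSON input, the same function could be applied to each parsed line in turn.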
294
+ ## Configuration
295
+
296
+ ### Environment Variables
297
+
298
+ | Variable | Default | What it does |
299
+ |----------|---------|--------------|
300
+ | `BASE_DIR` | Current directory | Only files under this path can be accessed |
301
+ | `DB_PATH` | `./lancedb/` | Where vectors are stored |
302
+ | `MODEL_NAME` | `Xenova/all-MiniLM-L6-v2` | HuggingFace embedding model |
303
+ | `WEB_PORT` | `3000` | Port for web interface |
304
+
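The `BASE_DIR` restriction amounts to a containment check on resolved paths. A minimal sketch of such a check follows. `isInsideBaseDir` is a hypothetical name, the package's own validation may be stricter, and this version does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical check: true when `target` resolves to a location inside `baseDir`.
// Symlinks are not resolved here; a production check would likely handle them.
function isInsideBaseDir(baseDir: string, target: string): boolean {
  const base = path.resolve(baseDir);
  // Relative targets are interpreted against the base directory.
  const resolved = path.resolve(base, target);
  const relative = path.relative(base, resolved);
  const escapes =
    relative === ".." ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);
  return !escapes;
}

console.log(isInsideBaseDir("/data/docs", "/data/docs/spec.pdf"));
console.log(isInsideBaseDir("/data/docs", "/data/docs/../secret.txt"));
```

Comparing via `path.relative` rather than a string prefix avoids treating a sibling such as `/data/docs-old` as if it were inside `/data/docs`.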
305
+ ### Search Tuning
306
+
307
+ | Variable | Default | What it does |
308
+ |----------|---------|--------------|
309
+ | `RAG_HYBRID_WEIGHT` | `0.6` | Keyword boost strength. 0 = semantic-only, higher = stronger boost for exact keyword matches |
310
+ | `RAG_GROUPING` | unset | `similar` = top group only, `related` = top 2 groups |
311
+ | `RAG_MAX_DISTANCE` | unset | Filter out results whose distance exceeds this threshold (lower distance = more relevant) |
312
+
313
+ ### Security (optional)
314
+
315
+ | Variable | Default | What it does |
316
+ |----------|---------|--------------|
317
+ | `RAG_API_KEY` | unset | API key for authentication |
318
+ | `CORS_ORIGINS` | localhost | Allowed origins (comma-separated, or `*`) |
319
+ | `RATE_LIMIT_WINDOW_MS` | `60000` | Rate limit time window (ms) |
320
+ | `RATE_LIMIT_MAX_REQUESTS` | `100` | Max requests per window |
321
+
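To show how `RATE_LIMIT_WINDOW_MS` and `RATE_LIMIT_MAX_REQUESTS` interact, here is a sketch of a fixed-window counter using the default values. This is an assumption about the general technique, not the server's actual limiter, and it tracks a single counter rather than one per client.

```typescript
// Hypothetical fixed-window limiter; the server's real limiter may differ.
class FixedWindowLimiter {
  private readonly windowMs: number;
  private readonly maxRequests: number;
  private windowStart = Number.NEGATIVE_INFINITY;
  private count = 0;

  constructor(windowMs: number, maxRequests: number) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  // Returns true when a request arriving at time `now` (ms) is allowed.
  allow(now: number): boolean {
    if (now - this.windowStart >= this.windowMs) {
      // The previous window has expired: start a new one at this request.
      this.windowStart = now;
      this.count = 0;
    }
    this.count += 1;
    return this.count <= this.maxRequests;
  }
}

const limiter = new FixedWindowLimiter(60000, 100);
console.log(limiter.allow(Date.now()));
```

With these defaults, request 101 inside the same 60-second window is rejected (the case reported as `429 Too Many Requests`), and the counter resets once the window has elapsed.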
322
+ ### Advanced
323
+
324
+ | Variable | Default | What it does |
325
+ |----------|---------|--------------|
326
+ | `ALLOWED_SCAN_ROOTS` | Home directory | Directories allowed for database scanning |
327
+ | `JSON_BODY_LIMIT` | `5mb` | Max request body size |
328
+ | `REQUEST_TIMEOUT_MS` | `30000` | API request timeout |
329
+ | `REQUEST_LOGGING` | `false` | Enable request audit logging |
330
+
331
+ > Copy [`.env.example`](.env.example) for a complete configuration template.
332
+
333
+ **For code-heavy content**, try:
334
+
335
+ ```json
336
+ "env": {
337
+ "RAG_HYBRID_WEIGHT": "0.8",
338
+ "RAG_GROUPING": "similar"
339
+ }
340
+ ```
341
+
342
+ ## Frequently Asked Questions
343
+
344
+ <details>
345
+ <summary><strong>Is my data really private?</strong></summary>
346
+
347
+ For local files, yes. Indexing and search run on your machine after the embedding model downloads (~90MB). RAG Vault uses the network only if you choose remote URL ingestion or need to download a model.
348
+
349
+ </details>
350
+
351
+ <details>
352
+ <summary><strong>Does it work offline?</strong></summary>
353
+
354
+ Yes, after the first run. The model caches locally.
355
+
356
+ </details>
357
+
358
+ <details>
359
+ <summary><strong>What about GPU acceleration?</strong></summary>
360
+
361
+ Transformers.js runs on CPU. GPU support is experimental, and CPU performance is solid for typical local vault sizes.
362
+
363
+ </details>
364
+
365
+ <details>
366
+ <summary><strong>Can I change the embedding model?</strong></summary>
367
+
368
+ Yes. Set `MODEL_NAME` to any compatible HuggingFace model. You must delete the `DB_PATH` directory and re-ingest your documents, because different models produce incompatible vectors.
369
+
370
+ **Recommended upgrade:** For better quality and multilingual support, use [EmbeddingGemma](https://huggingface.co/onnx-community/embeddinggemma-300m-ONNX):
371
+
372
+ ```json
373
+ "MODEL_NAME": "onnx-community/embeddinggemma-300m-ONNX"
374
+ ```
375
+
376
+ This model is a strong option for multilingual and higher-quality retrieval use cases.
377
+
378
+ **Other specialized models:**
379
+ - Scientific: `sentence-transformers/allenai-specter`
380
+ - Code: `jinaai/jina-embeddings-v2-base-code`
381
+
382
+ </details>
383
+
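The re-ingest requirement follows from how similarity is computed. The sketch below is an illustration, not package code. The default `Xenova/all-MiniLM-L6-v2` model produces 384-dimensional vectors, and the 768 used for the second vector is just an example of a different output size. Even when two models share a dimension, their vector spaces are unrelated, so old stored vectors and new query vectors are still not comparable.

```typescript
// Cosine similarity is only defined for vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const storedVector: number[] = new Array(384).fill(0.1); // written by the old model
const queryVector: number[] = new Array(768).fill(0.1); // produced by a new model
try {
  cosineSimilarity(storedVector, queryVector);
} catch (error) {
  console.log((error as Error).message);
}
```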
384
+ <details>
385
+ <summary><strong>How do I back up my data?</strong></summary>
386
+
387
+ Copy the `DB_PATH` directory (default: `./lancedb/`).
388
+
389
+ </details>
390
+
391
+ ## Troubleshooting
392
+
393
+ | Problem | Solution |
394
+ |---------|----------|
395
+ | No results found | Documents must be ingested first. Ask your AI tool to "List all ingested files" to check. |
396
+ | Model download failed | Check internet connection. Model is ~90MB from HuggingFace. |
397
+ | File too large | Default limit is 100MB. Set `MAX_FILE_SIZE` higher or split the file. |
398
+ | Path outside BASE_DIR | All file paths must be under `BASE_DIR`. Use absolute paths. |
399
+ | MCP tools not showing | Verify config syntax, then restart your AI tool completely (Cmd+Q on Mac). |
400
+ | 401 Unauthorized | The server has `RAG_API_KEY` set. Send the same key in an `Authorization: Bearer <key>` header. |
401
+ | 429 Too Many Requests | Rate limited. Wait for reset or increase `RATE_LIMIT_MAX_REQUESTS`. |
402
+ | CORS errors | Add your origin to `CORS_ORIGINS` environment variable. |
403
+
404
+ ## Development
405
+
406
+ ```bash
407
+ git clone https://github.com/RobThePCGuy/rag-vault.git
408
+ cd rag-vault
409
+ pnpm install
410
+ pnpm --prefix web-ui install
411
+
412
+ # Install local git hooks (recommended, even for solo dev)
413
+ pnpm hooks:install
414
+
415
+ # Fast local quality gate (backend + web-ui type/lint/format, deps, unused, build, unit tests)
416
+ pnpm check:all
417
+
418
+ # Unit tests only (no model download required)
419
+ pnpm test:unit
420
+
421
+ # Integration/E2E tests (requires model download/network)
422
+ pnpm test:integration
423
+
424
+ # Build
425
+ pnpm build
426
+
427
+ # Run MCP server locally
428
+ pnpm dev
429
+
430
+ # Run web server locally
431
+ pnpm web:dev
432
+ ```
433
+
434
+
435
+ ### Test Tiers
436
+
437
+ - `pnpm test:unit`: deterministic tests for local/CI quality checks, excluding model-download integration paths.
438
+ - `pnpm test:integration`: full integration and E2E workflows, including embedding model initialization.
439
+
440
+ Use `RUN_EMBEDDING_INTEGRATION=1` to explicitly opt into network/model-dependent suites.
441
+
442
+ ### CI Strategy
443
+
444
+ - `quality.yml` runs on PRs and pushes and enforces the root quality gate (`pnpm check:all`), which includes backend checks and web-ui type/lint/format checks plus unit tests.
445
+ - A nightly scheduled job runs the integration/E2E suite so model-dependent workflows stay healthy without blocking every PR.
446
+ - `publish-npm.yml` publishes to npm on GitHub Releases, validates tag/version alignment, blocks duplicate npm versions, and supports a manual dry run. A real publish requires `NPM_TOKEN`.
447
+
448
+ ### Project Structure
449
+
450
+ ```
451
+ src/
452
+ ├── server/ # MCP tool handlers
453
+ ├── vectordb/ # LanceDB + hybrid search
454
+ ├── chunker/ # Semantic text splitting
455
+ ├── embedder/ # Transformers.js wrapper
456
+ ├── parser/ # PDF, DOCX, HTML parsing
457
+ ├── web/ # Express server + REST API
458
+ └── __tests__/ # Test suites
459
+
460
+ web-ui/ # React frontend
461
+ ```
462
+
463
+ ## Documentation
464
+
465
+ - [SECURITY.md](SECURITY.md): Security configuration and best practices
466
+ - [.env.example](.env.example): Complete environment variable template
467
+
468
+ ## License
469
+
470
+ MIT: free for personal and commercial use.
471
+
472
+ ## Acknowledgments
473
+
474
+ Built with [Model Context Protocol](https://modelcontextprotocol.io/), [LanceDB](https://lancedb.com/), and [Transformers.js](https://huggingface.co/docs/transformers.js).
475
+
476
+ > Started as a fork of [mcp-local-rag](https://github.com/shinpr/mcp-local-rag) by [Shinsuke Kagawa](https://github.com/shinpr). Now it’s its own thing.
477
+ > Huge credit to upstream contributors for the foundation; I’ve been iterating hard from there.
478
+ > Local-first dev tools, all the way.