mcp-local-rag 0.4.2 → 0.5.1

package/README.md CHANGED
@@ -2,6 +2,8 @@
2
2
 
3
3
  [![npm version](https://img.shields.io/npm/v/mcp-local-rag.svg)](https://www.npmjs.com/package/mcp-local-rag)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
+ [![TypeScript](https://img.shields.io/badge/TypeScript-5.0-blue.svg?logo=typescript&logoColor=white)](https://www.typescriptlang.org/)
6
+ [![MCP Registry](https://img.shields.io/badge/MCP-Registry-green.svg)](https://registry.modelcontextprotocol.io/)
5
7
 
6
8
  Local RAG for developers using MCP.
7
9
  Semantic search with keyword boost for exact technical terms — fully private, zero setup.
@@ -86,8 +88,8 @@ You want AI to search your documents—technical specs, research papers, interna
86
88
 
87
89
  ## Usage
88
90
 
89
- The server provides 5 MCP tools: ingest, search, list, delete, status
90
- (`ingest_file`, `query_documents`, `list_files`, `delete_file`, `status`).
91
+ The server provides 6 MCP tools: ingest file, ingest data, search, list, delete, status
92
+ (`ingest_file`, `ingest_data`, `query_documents`, `list_files`, `delete_file`, `status`).
91
93
 
92
94
  ### Ingesting Documents
93
95
 
@@ -99,6 +101,23 @@ Supports PDF, DOCX, TXT, and Markdown. The server extracts text, splits it into
99
101
 
100
102
  Re-ingesting the same file replaces the old version automatically.
101
103
 
104
+ ### Ingesting HTML Content
105
+
106
+ Use `ingest_data` to ingest HTML content retrieved by your AI assistant (via web fetch, curl, browser tools, etc.):
107
+
108
+ ```
109
+ "Fetch https://example.com/docs and ingest the HTML"
110
+ ```
111
+
112
+ The server extracts main content using Readability (removes navigation, ads, etc.), converts to Markdown, and indexes it. Perfect for:
113
+ - Web documentation
114
+ - HTML retrieved by the AI assistant
115
+ - Clipboard content
116
+
117
+ HTML is automatically cleaned—you get the article content, not the boilerplate.
118
+
119
+ > **Note:** The RAG server itself doesn't fetch web content—your AI assistant retrieves it and passes the HTML to `ingest_data`. This keeps the server fully local while letting you index any content your assistant can access. Please respect website terms of service and copyright when ingesting external content.
120
+
102
121
  ### Searching Documents
103
122
 
104
123
  ```
@@ -169,6 +188,42 @@ When you search:
169
188
 
170
189
  The keyword boost ensures exact terms like `useEffect` or error codes rank higher when they match.
171
190
 
191
+ ## Agent Skills
192
+
193
+ [Agent Skills](https://agentskills.io/) provide optimized prompts that help AI assistants use RAG tools more effectively. Install skills for better query formulation, result interpretation, and ingestion workflows:
194
+
195
+ ```bash
196
+ # Claude Code (project-level)
197
+ npx mcp-local-rag-skills --claude-code
198
+
199
+ # Claude Code (user-level)
200
+ npx mcp-local-rag-skills --claude-code --global
201
+
202
+ # Codex
203
+ npx mcp-local-rag-skills --codex
204
+ ```
205
+
206
+ Skills include:
207
+ - **Query optimization**: Better search query formulation
208
+ - **Result interpretation**: Score thresholds and filtering guidelines
209
+ - **HTML ingestion**: Format selection and source naming
210
+
211
+ ### Ensuring Skill Activation
212
+
213
+ Skills are loaded automatically in most cases—AI assistants scan skill metadata and load relevant instructions when needed. For consistent behavior:
214
+
215
+ **Option 1: Explicit request (natural language)**
216
+ Before RAG operations, request in natural language:
217
+ - "Use the mcp-local-rag skill for this search"
218
+ - "Apply RAG best practices from skills"
219
+
220
+ **Option 2: Add to agent instruction file**
221
+ Add to your `AGENTS.md`, `CLAUDE.md`, or other agent instruction file:
222
+ ```
223
+ When using query_documents, ingest_file, or ingest_data tools,
224
+ apply the mcp-local-rag skill for optimal query formulation and result interpretation.
225
+ ```
226
+
172
227
  <details>
173
228
  <summary><strong>Configuration</strong></summary>
174
229
 
@@ -301,7 +356,7 @@ Yes, after the first model download (~90MB).
301
356
  Cloud services offer better accuracy at scale but require sending data externally. This trades some accuracy for complete privacy and zero runtime cost.
302
357
 
303
358
  **What file formats are supported?**
304
- PDF, DOCX, TXT, Markdown. Not yet: Excel, PowerPoint, images, HTML.
359
+ PDF, DOCX, TXT, Markdown, and HTML (via `ingest_data`). Not yet: Excel, PowerPoint, images.
305
360
 
306
361
  **Can I change the embedding model?**
307
362
  Yes, but you must delete your database and re-ingest all documents. Different models produce incompatible vector dimensions.
@@ -0,0 +1,17 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * MCP Local RAG Skills Installer
4
+ *
5
+ * Installs skills to various AI coding assistants:
6
+ * - Claude Code (project or global)
7
+ * - OpenAI Codex
8
+ * - Custom path
9
+ *
10
+ * Usage:
11
+ * npx mcp-local-rag-skills --claude-code # Project-level
12
+ * npx mcp-local-rag-skills --claude-code --global # User-level
13
+ * npx mcp-local-rag-skills --codex # Codex
14
+ * npx mcp-local-rag-skills --path /custom/path # Custom
15
+ */
16
+ export {};
17
+ //# sourceMappingURL=install-skills.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"install-skills.d.ts","sourceRoot":"","sources":["../../src/bin/install-skills.ts"],"names":[],"mappings":";AAEA;;;;;;;;;;;;;GAaG"}
@@ -0,0 +1,194 @@
1
+ #!/usr/bin/env node
2
+ "use strict";
3
+ /**
4
+ * MCP Local RAG Skills Installer
5
+ *
6
+ * Installs skills to various AI coding assistants:
7
+ * - Claude Code (project or global)
8
+ * - OpenAI Codex
9
+ * - Custom path
10
+ *
11
+ * Usage:
12
+ * npx mcp-local-rag-skills --claude-code # Project-level
13
+ * npx mcp-local-rag-skills --claude-code --global # User-level
14
+ * npx mcp-local-rag-skills --codex # Codex
15
+ * npx mcp-local-rag-skills --path /custom/path # Custom
16
+ */
17
+ Object.defineProperty(exports, "__esModule", { value: true });
18
+ const node_fs_1 = require("node:fs");
19
+ const node_os_1 = require("node:os");
20
+ const node_path_1 = require("node:path");
21
+ // ============================================
22
+ // Constants
23
+ // ============================================
24
+ // Skills source directory (relative to dist/bin when compiled)
25
+ // dist/bin/install-skills.js -> dist/skills/mcp-local-rag
26
+ // But skills are actually in package root: skills/mcp-local-rag
27
+ // So from dist/bin, go up twice: ../.. then skills/mcp-local-rag
28
+ const SKILLS_SOURCE = (0, node_path_1.resolve)(__dirname, '..', '..', 'skills', 'mcp-local-rag');
29
+ // Codex home directory (supports CODEX_HOME environment variable)
30
+ // https://developers.openai.com/codex/local-config/
31
+ const CODEX_HOME = process.env['CODEX_HOME'] || (0, node_path_1.join)((0, node_os_1.homedir)(), '.codex');
32
+ // Installation targets
33
+ const TARGETS = {
34
+ 'claude-code-project': './.claude/skills/mcp-local-rag',
35
+ 'claude-code-global': (0, node_path_1.join)((0, node_os_1.homedir)(), '.claude', 'skills', 'mcp-local-rag'),
36
+ 'codex-project': './.codex/skills/mcp-local-rag',
37
+ 'codex-global': (0, node_path_1.join)(CODEX_HOME, 'skills', 'mcp-local-rag'),
38
+ };
39
+ function parseArgs(args) {
40
+ const options = {
41
+ target: 'claude-code-project',
42
+ help: false,
43
+ };
44
+ for (let i = 0; i < args.length; i++) {
45
+ const arg = args[i];
46
+ switch (arg) {
47
+ case '--help':
48
+ case '-h':
49
+ options.help = true;
50
+ break;
51
+ case '--claude-code':
52
+ // Check for --global flag
53
+ if (args[i + 1] === '--global') {
54
+ options.target = 'claude-code-global';
55
+ i++; // Skip next arg
56
+ }
57
+ else {
58
+ options.target = 'claude-code-project';
59
+ }
60
+ break;
61
+ case '--codex':
62
+ // Check for --project or --global flag
63
+ if (args[i + 1] === '--project') {
64
+ options.target = 'codex-project';
65
+ i++; // Skip next arg
66
+ }
67
+ else if (args[i + 1] === '--global') {
68
+ options.target = 'codex-global';
69
+ i++; // Skip next arg
70
+ }
71
+ else {
72
+ // Default to global (matches previous behavior)
73
+ options.target = 'codex-global';
74
+ }
75
+ break;
76
+ case '--path': {
77
+ const pathArg = args[i + 1];
78
+ if (!pathArg) {
79
+ console.error('Error: --path requires a path argument');
80
+ process.exit(1);
81
+ }
82
+ options.target = 'custom';
83
+ options.customPath = pathArg;
84
+ i++; // Skip next arg
85
+ break;
86
+ }
87
+ default:
88
+ if (arg?.startsWith('-')) {
89
+ console.error(`Unknown option: ${arg}`);
90
+ process.exit(1);
91
+ }
92
+ }
93
+ }
94
+ return options;
95
+ }
96
+ // ============================================
97
+ // Help Message
98
+ // ============================================
99
+ function printHelp() {
100
+ console.log(`
101
+ MCP Local RAG Skills Installer
102
+
103
+ Usage:
104
+ npx mcp-local-rag-skills [options]
105
+
106
+ Options:
107
+ --claude-code Install to project-level Claude Code skills
108
+ (./.claude/skills/)
109
+
110
+ --claude-code --global Install to user-level Claude Code skills
111
+ (~/.claude/skills/)
112
+
113
+ --codex Install to user-level Codex skills (default)
114
+ ($CODEX_HOME/skills/ or ~/.codex/skills/)
115
+
116
+ --codex --project Install to project-level Codex skills
117
+ (./.codex/skills/)
118
+
119
+ --codex --global Install to user-level Codex skills
120
+ ($CODEX_HOME/skills/ or ~/.codex/skills/)
121
+
122
+ --path <path> Install to custom path
123
+
124
+ --help, -h Show this help message
125
+
126
+ Examples:
127
+ npx mcp-local-rag-skills --claude-code
128
+ npx mcp-local-rag-skills --claude-code --global
129
+ npx mcp-local-rag-skills --codex
130
+ npx mcp-local-rag-skills --codex --project
131
+ npx mcp-local-rag-skills --path ./my-skills/
132
+ `);
133
+ }
134
+ // ============================================
135
+ // Installation
136
+ // ============================================
137
+ function getTargetPath(options) {
138
+ if (options.target === 'custom') {
139
+ if (!options.customPath) {
140
+ console.error('Error: Custom path not specified');
141
+ process.exit(1);
142
+ }
143
+ return (0, node_path_1.resolve)(options.customPath, 'mcp-local-rag');
144
+ }
145
+ return TARGETS[options.target];
146
+ }
147
+ function install(targetPath) {
148
+ // Check source exists
149
+ if (!(0, node_fs_1.existsSync)(SKILLS_SOURCE)) {
150
+ console.error(`Error: Skills source not found at ${SKILLS_SOURCE}`);
151
+ process.exit(1);
152
+ }
153
+ // Create target directory
154
+ const targetDir = (0, node_path_1.dirname)(targetPath);
155
+ if (!(0, node_fs_1.existsSync)(targetDir)) {
156
+ (0, node_fs_1.mkdirSync)(targetDir, { recursive: true });
157
+ console.log(`Created directory: ${targetDir}`);
158
+ }
159
+ // Copy skills
160
+ (0, node_fs_1.cpSync)(SKILLS_SOURCE, targetPath, { recursive: true });
161
+ console.log(`Installed skills to: ${targetPath}`);
162
+ }
163
+ // ============================================
164
+ // Main
165
+ // ============================================
166
+ function main() {
167
+ const args = process.argv.slice(2);
168
+ // Default to help if no args
169
+ if (args.length === 0) {
170
+ printHelp();
171
+ process.exit(0);
172
+ }
173
+ const options = parseArgs(args);
174
+ if (options.help) {
175
+ printHelp();
176
+ process.exit(0);
177
+ }
178
+ const targetPath = getTargetPath(options);
179
+ console.log('Installing MCP Local RAG skills...');
180
+ console.log(`Target: ${options.target}`);
181
+ console.log(`Path: ${targetPath}`);
182
+ console.log();
183
+ install(targetPath);
184
+ console.log();
185
+ console.log('Installation complete!');
186
+ console.log();
187
+ console.log('The following skills are now available:');
188
+ console.log(' - mcp-local-rag (SKILL.md)');
189
+ console.log(' - references/html-ingestion.md');
190
+ console.log(' - references/query-optimization.md');
191
+ console.log(' - references/result-refinement.md');
192
+ }
193
+ main();
194
+ //# sourceMappingURL=install-skills.js.map
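Reviewer note: the flag handling in `parseArgs` and the `TARGETS` table above can be condensed into two pure functions, which makes the defaults easier to check. This is a minimal sketch, not the package's code. `resolveTarget`, `targetPath`, and the `home` and `codexHome` parameters are hypothetical names, and paths are joined with `/` for brevity where the installer uses `node:path` and `os.homedir()`. Handling of `--path`, `--help`, and unknown options is omitted.

```typescript
type Target =
  | 'claude-code-project'
  | 'claude-code-global'
  | 'codex-project'
  | 'codex-global';

// Hypothetical helper: maps CLI flags to a target key the way parseArgs does.
// As in the installer, a scope flag only counts when it directly follows
// --claude-code or --codex.
function resolveTarget(args: string[]): Target {
  let target: Target = 'claude-code-project'; // same default as parseArgs
  for (let i = 0; i < args.length; i++) {
    const next = args[i + 1];
    if (args[i] === '--claude-code') {
      target = next === '--global' ? 'claude-code-global' : 'claude-code-project';
    } else if (args[i] === '--codex') {
      // Codex falls back to the user-level location unless --project follows.
      target = next === '--project' ? 'codex-project' : 'codex-global';
    }
  }
  return target;
}

// Hypothetical helper: resolves a target key to an install path.
// `codexHome` stands in for the CODEX_HOME environment variable.
function targetPath(target: Target, home: string, codexHome?: string): string {
  const codexRoot = codexHome || `${home}/.codex`; // CODEX_HOME wins when set
  const paths: Record<Target, string> = {
    'claude-code-project': './.claude/skills/mcp-local-rag',
    'claude-code-global': `${home}/.claude/skills/mcp-local-rag`,
    'codex-project': './.codex/skills/mcp-local-rag',
    'codex-global': `${codexRoot}/skills/mcp-local-rag`,
  };
  return paths[target];
}
```

Under these assumptions, `resolveTarget(['--codex'])` yields `'codex-global'`, matching the help text that lists user-level as the Codex default, while a bare `--claude-code` stays project-level.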
@@ -0,0 +1 @@
1
+ {"version":3,"file":"install-skills.js","sourceRoot":"","sources":["../../src/bin/install-skills.ts"],"names":[],"mappings":";;AAEA;;;;;;;;;;;;;GAaG;;AAEH,qCAAuD;AACvD,qCAAiC;AACjC,yCAAkD;AAElD,+CAA+C;AAC/C,YAAY;AACZ,+CAA+C;AAE/C,+DAA+D;AAC/D,0DAA0D;AAC1D,gEAAgE;AAChE,iEAAiE;AACjE,MAAM,aAAa,GAAG,IAAA,mBAAO,EAAC,SAAS,EAAE,IAAI,EAAE,IAAI,EAAE,QAAQ,EAAE,eAAe,CAAC,CAAA;AAE/E,kEAAkE;AAClE,oDAAoD;AACpD,MAAM,UAAU,GAAG,OAAO,CAAC,GAAG,CAAC,YAAY,CAAC,IAAI,IAAA,gBAAI,EAAC,IAAA,iBAAO,GAAE,EAAE,QAAQ,CAAC,CAAA;AAEzE,uBAAuB;AACvB,MAAM,OAAO,GAAG;IACd,qBAAqB,EAAE,gCAAgC;IACvD,oBAAoB,EAAE,IAAA,gBAAI,EAAC,IAAA,iBAAO,GAAE,EAAE,SAAS,EAAE,QAAQ,EAAE,eAAe,CAAC;IAC3E,eAAe,EAAE,+BAA+B;IAChD,cAAc,EAAE,IAAA,gBAAI,EAAC,UAAU,EAAE,QAAQ,EAAE,eAAe,CAAC;CACnD,CAAA;AAYV,SAAS,SAAS,CAAC,IAAc;IAC/B,MAAM,OAAO,GAAY;QACvB,MAAM,EAAE,qBAAqB;QAC7B,IAAI,EAAE,KAAK;KACZ,CAAA;IAED,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QACrC,MAAM,GAAG,GAAG,IAAI,CAAC,CAAC,CAAC,CAAA;QAEnB,QAAQ,GAAG,EAAE,CAAC;YACZ,KAAK,QAAQ,CAAC;YACd,KAAK,IAAI;gBACP,OAAO,CAAC,IAAI,GAAG,IAAI,CAAA;gBACnB,MAAK;YAEP,KAAK,eAAe;gBAClB,0BAA0B;gBAC1B,IAAI,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,KAAK,UAAU,EAAE,CAAC;oBAC/B,OAAO,CAAC,MAAM,GAAG,oBAAoB,CAAA;oBACrC,CAAC,EAAE,CAAA,CAAC,gBAAgB;gBACtB,CAAC;qBAAM,CAAC;oBACN,OAAO,CAAC,MAAM,GAAG,qBAAqB,CAAA;gBACxC,CAAC;gBACD,MAAK;YAEP,KAAK,SAAS;gBACZ,uCAAuC;gBACvC,IAAI,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,KAAK,WAAW,EAAE,CAAC;oBAChC,OAAO,CAAC,MAAM,GAAG,eAAe,CAAA;oBAChC,CAAC,EAAE,CAAA,CAAC,gBAAgB;gBACtB,CAAC;qBAAM,IAAI,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,KAAK,UAAU,EAAE,CAAC;oBACtC,OAAO,CAAC,MAAM,GAAG,cAAc,CAAA;oBAC/B,CAAC,EAAE,CAAA,CAAC,gBAAgB;gBACtB,CAAC;qBAAM,CAAC;oBACN,gDAAgD;oBAChD,OAAO,CAAC,MAAM,GAAG,cAAc,CAAA;gBACjC,CAAC;gBACD,MAAK;YAEP,KAAK,QAAQ,CAAC,CAAC,CAAC;gBACd,MAAM,OAAO,GAAG,IAAI,CAAC,CAAC,GAAG,CAAC,CAAC,CAAA;gBAC3B,IAAI,CAAC,OAAO,EAAE,CAAC;oBACb,OAAO,CAAC,KAAK,CAAC,wCAAwC,CAAC,CAAA;oBACvD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;gBACjB,CAAC;gBACD,OAAO,CAAC,MAAM,GAAG,QAAQ,CAAA;gBACzB,OAAO,CAAC,
UAAU,GAAG,OAAO,CAAA;gBAC5B,CAAC,EAAE,CAAA,CAAC,gBAAgB;gBACpB,MAAK;YACP,CAAC;YAED;gBACE,IAAI,GAAG,EAAE,UAAU,CAAC,GAAG,CAAC,EAAE,CAAC;oBACzB,OAAO,CAAC,KAAK,CAAC,mBAAmB,GAAG,EAAE,CAAC,CAAA;oBACvC,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;gBACjB,CAAC;QACL,CAAC;IACH,CAAC;IAED,OAAO,OAAO,CAAA;AAChB,CAAC;AAED,+CAA+C;AAC/C,eAAe;AACf,+CAA+C;AAE/C,SAAS,SAAS;IAChB,OAAO,CAAC,GAAG,CAAC;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;CAgCb,CAAC,CAAA;AACF,CAAC;AAED,+CAA+C;AAC/C,eAAe;AACf,+CAA+C;AAE/C,SAAS,aAAa,CAAC,OAAgB;IACrC,IAAI,OAAO,CAAC,MAAM,KAAK,QAAQ,EAAE,CAAC;QAChC,IAAI,CAAC,OAAO,CAAC,UAAU,EAAE,CAAC;YACxB,OAAO,CAAC,KAAK,CAAC,kCAAkC,CAAC,CAAA;YACjD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;QACjB,CAAC;QACD,OAAO,IAAA,mBAAO,EAAC,OAAO,CAAC,UAAU,EAAE,eAAe,CAAC,CAAA;IACrD,CAAC;IAED,OAAO,OAAO,CAAC,OAAO,CAAC,MAAM,CAAC,CAAA;AAChC,CAAC;AAED,SAAS,OAAO,CAAC,UAAkB;IACjC,sBAAsB;IACtB,IAAI,CAAC,IAAA,oBAAU,EAAC,aAAa,CAAC,EAAE,CAAC;QAC/B,OAAO,CAAC,KAAK,CAAC,qCAAqC,aAAa,EAAE,CAAC,CAAA;QACnE,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;IACjB,CAAC;IAED,0BAA0B;IAC1B,MAAM,SAAS,GAAG,IAAA,mBAAO,EAAC,UAAU,CAAC,CAAA;IACrC,IAAI,CAAC,IAAA,oBAAU,EAAC,SAAS,CAAC,EAAE,CAAC;QAC3B,IAAA,mBAAS,EAAC,SAAS,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAA;QACzC,OAAO,CAAC,GAAG,CAAC,sBAAsB,SAAS,EAAE,CAAC,CAAA;IAChD,CAAC;IAED,cAAc;IACd,IAAA,gBAAM,EAAC,aAAa,EAAE,UAAU,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAA;IACtD,OAAO,CAAC,GAAG,CAAC,wBAAwB,UAAU,EAAE,CAAC,CAAA;AACnD,CAAC;AAED,+CAA+C;AAC/C,OAAO;AACP,+CAA+C;AAE/C,SAAS,IAAI;IACX,MAAM,IAAI,GAAG,OAAO,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAA;IAElC,6BAA6B;IAC7B,IAAI,IAAI,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QACtB,SAAS,EAAE,CAAA;QACX,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;IACjB,CAAC;IAED,MAAM,OAAO,GAAG,SAAS,CAAC,IAAI,CAAC,CAAA;IAE/B,IAAI,OAAO,CAAC,IAAI,EAAE,CAAC;QACjB,SAAS,EAAE,CAAA;QACX,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;IACjB,CAAC;IAED,MAAM,UAAU,GAAG,aAAa,CAAC,OAAO,CAAC,CAAA;IAEzC,OAAO,CAAC,GAAG,CAAC,oCAAoC,CAAC,CAAA;IACjD,OAAO,CAAC,GAAG,CAAC,WAAW,OAAO,CAAC,MAAM,EAAE,CAAC,CAAA;IACxC,OAAO,CAAC,GAAG,CAAC,SAAS,UAAU,EAAE,CAAC,CAAA;IAClC,OAAO
,CAAC,GAAG,EAAE,CAAA;IAEb,OAAO,CAAC,UAAU,CAAC,CAAA;IAEnB,OAAO,CAAC,GAAG,EAAE,CAAA;IACb,OAAO,CAAC,GAAG,CAAC,wBAAwB,CAAC,CAAA;IACrC,OAAO,CAAC,GAAG,EAAE,CAAA;IACb,OAAO,CAAC,GAAG,CAAC,yCAAyC,CAAC,CAAA;IACtD,OAAO,CAAC,GAAG,CAAC,8BAA8B,CAAC,CAAA;IAC3C,OAAO,CAAC,GAAG,CAAC,kCAAkC,CAAC,CAAA;IAC/C,OAAO,CAAC,GAAG,CAAC,sCAAsC,CAAC,CAAA;IACnD,OAAO,CAAC,GAAG,CAAC,qCAAqC,CAAC,CAAA;AACpD,CAAC;AAED,IAAI,EAAE,CAAA"}
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/embedder/index.ts"],"names":[],"mappings":"AAQA;;GAEG;AACH,MAAM,WAAW,cAAc;IAC7B,6BAA6B;IAC7B,SAAS,EAAE,MAAM,CAAA;IACjB,iBAAiB;IACjB,SAAS,EAAE,MAAM,CAAA;IACjB,4BAA4B;IAC5B,QAAQ,EAAE,MAAM,CAAA;CACjB;AAMD;;GAEG;AACH,qBAAa,cAAe,SAAQ,KAAK;aAGZ,KAAK,CAAC,EAAE,KAAK;gBADtC,OAAO,EAAE,MAAM,EACU,KAAK,CAAC,EAAE,KAAK,YAAA;CAKzC;AAMD;;;;;;;GAOG;AACH,qBAAa,QAAQ;IACnB,OAAO,CAAC,KAAK,CAAoD;IACjE,OAAO,CAAC,WAAW,CAA6B;IAChD,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAgB;gBAE3B,MAAM,EAAE,cAAc;IAIlC;;OAEG;IACG,UAAU,IAAI,OAAO,CAAC,IAAI,CAAC;IAsBjC;;;OAGG;YACW,iBAAiB;IA+B/B;;;;;OAKG;IACG,KAAK,CAAC,IAAI,EAAE,MAAM,GAAG,OAAO,CAAC,MAAM,EAAE,CAAC;IAiC5C;;;;;OAKG;IACG,UAAU,CAAC,KAAK,EAAE,MAAM,EAAE,GAAG,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC;CA0BvD"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/embedder/index.ts"],"names":[],"mappings":"AAQA;;GAEG;AACH,MAAM,WAAW,cAAc;IAC7B,6BAA6B;IAC7B,SAAS,EAAE,MAAM,CAAA;IACjB,iBAAiB;IACjB,SAAS,EAAE,MAAM,CAAA;IACjB,4BAA4B;IAC5B,QAAQ,EAAE,MAAM,CAAA;CACjB;AAMD;;GAEG;AACH,qBAAa,cAAe,SAAQ,KAAK;aAGZ,KAAK,CAAC,EAAE,KAAK;gBADtC,OAAO,EAAE,MAAM,EACU,KAAK,CAAC,EAAE,KAAK,YAAA;CAKzC;AAMD;;;;;;;GAOG;AACH,qBAAa,QAAQ;IAEnB,OAAO,CAAC,KAAK,CAAgB;IAC7B,OAAO,CAAC,WAAW,CAA6B;IAChD,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAgB;gBAE3B,MAAM,EAAE,cAAc;IAIlC;;OAEG;IACG,UAAU,IAAI,OAAO,CAAC,IAAI,CAAC;IAuBjC;;;OAGG;YACW,iBAAiB;IA+B/B;;;;;OAKG;IACG,KAAK,CAAC,IAAI,EAAE,MAAM,GAAG,OAAO,CAAC,MAAM,EAAE,CAAC;IAiC5C;;;;;OAKG;IACG,UAAU,CAAC,KAAK,EAAE,MAAM,EAAE,GAAG,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC;CA0BvD"}
@@ -30,6 +30,7 @@ exports.EmbeddingError = EmbeddingError;
30
30
  */
31
31
  class Embedder {
32
32
  constructor(config) {
33
+ // Using unknown to avoid TS2590 (union type too complex with @types/jsdom)
33
34
  this.model = null;
34
35
  this.initPromise = null;
35
36
  this.config = config;
@@ -47,6 +48,7 @@ class Embedder {
47
48
  transformers_1.env.cacheDir = this.config.cacheDir;
48
49
  console.error(`Embedder: Setting cache directory to "${this.config.cacheDir}"`);
49
50
  console.error(`Embedder: Loading model "${this.config.modelPath}"...`);
51
+ // Use type assertion to avoid TS2590 (union type too complex with @types/jsdom)
50
52
  this.model = await (0, transformers_1.pipeline)('feature-extraction', this.config.modelPath);
51
53
  console.error('Embedder: Model loaded successfully');
52
54
  }
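Reviewer note: the two added comments describe a common workaround for TS2590: keep the field typed as `unknown` and assert a narrow callable type only where the model is invoked. The sketch below illustrates that pattern in isolation. `LazyModel`, `Extractor`, the injected `load` function, and the `pooling`/`normalize` option names are assumptions for illustration, not taken from the package source.

```typescript
// Hypothetical shape of the callable returned by a feature-extraction pipeline.
type Extractor = (
  text: string,
  options: { pooling: 'mean'; normalize: boolean },
) => Promise<{ data: ArrayLike<number> }>;

class LazyModel {
  // Stored as unknown so the field never carries the library's large union type.
  private model: unknown = null;
  private readonly load: () => Promise<unknown>;

  constructor(load: () => Promise<unknown>) {
    this.load = load;
  }

  async embed(text: string): Promise<number[]> {
    // Load the model on first use only.
    if (!this.model) {
      this.model = await this.load();
    }
    // Narrow at the call site with a type assertion instead of on the field.
    const run = this.model as Extractor;
    const output = await run(text, { pooling: 'mean', normalize: true });
    return Array.from(output.data);
  }
}
```

The trade-off is that the assertion is unchecked: if the loaded object does not match `Extractor`, the mismatch surfaces only at runtime.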
@@ -1 +1 @@
1
- {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/embedder/index.ts"],"names":[],"mappings":";AAAA,+CAA+C;;;AAE/C,4DAAyD;AAkBzD,+CAA+C;AAC/C,gBAAgB;AAChB,+CAA+C;AAE/C;;GAEG;AACH,MAAa,cAAe,SAAQ,KAAK;IACvC,YACE,OAAe,EACU,KAAa;QAEtC,KAAK,CAAC,OAAO,CAAC,CAAA;QAFW,UAAK,GAAL,KAAK,CAAQ;QAGtC,IAAI,CAAC,IAAI,GAAG,gBAAgB,CAAA;IAC9B,CAAC;CACF;AARD,wCAQC;AAED,+CAA+C;AAC/C,iBAAiB;AACjB,+CAA+C;AAE/C;;;;;;;GAOG;AACH,MAAa,QAAQ;IAKnB,YAAY,MAAsB;QAJ1B,UAAK,GAAgD,IAAI,CAAA;QACzD,gBAAW,GAAyB,IAAI,CAAA;QAI9C,IAAI,CAAC,MAAM,GAAG,MAAM,CAAA;IACtB,CAAC;IAED;;OAEG;IACH,KAAK,CAAC,UAAU;QACd,8BAA8B;QAC9B,IAAI,IAAI,CAAC,KAAK,EAAE,CAAC;YACf,OAAM;QACR,CAAC;QAED,IAAI,CAAC;YACH,+CAA+C;YAC/C,kBAAG,CAAC,QAAQ,GAAG,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAA;YAEnC,OAAO,CAAC,KAAK,CAAC,yCAAyC,IAAI,CAAC,MAAM,CAAC,QAAQ,GAAG,CAAC,CAAA;YAC/E,OAAO,CAAC,KAAK,CAAC,4BAA4B,IAAI,CAAC,MAAM,CAAC,SAAS,MAAM,CAAC,CAAA;YACtE,IAAI,CAAC,KAAK,GAAG,MAAM,IAAA,uBAAQ,EAAC,oBAAoB,EAAE,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,CAAA;YACxE,OAAO,CAAC,KAAK,CAAC,qCAAqC,CAAC,CAAA;QACtD,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,IAAI,cAAc,CACtB,kCAAmC,KAAe,CAAC,OAAO,EAAE,EAC5D,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;IAED;;;OAGG;IACK,KAAK,CAAC,iBAAiB;QAC7B,sBAAsB;QACtB,IAAI,IAAI,CAAC,KAAK,EAAE,CAAC;YACf,OAAM;QACR,CAAC;QAED,kDAAkD;QAClD,IAAI,IAAI,CAAC,WAAW,EAAE,CAAC;YACrB,MAAM,IAAI,CAAC,WAAW,CAAA;YACtB,OAAM;QACR,CAAC;QAED,uBAAuB;QACvB,OAAO,CAAC,KAAK,CACX,+FAA+F,CAChG,CAAA;QAED,IAAI,CAAC,WAAW,GAAG,IAAI,CAAC,UAAU,EAAE,CAAC,KAAK,CAAC,CAAC,KAAK,EAAE,EAAE;YACnD,8CAA8C;YAC9C,IAAI,CAAC,WAAW,GAAG,IAAI,CAAA;YAEvB,+CAA+C;YAC/C,MAAM,IAAI,cAAc,CACtB,+CAAgD,KAAe,CAAC,OAAO,wTAAwT,IAAI,CAAC,MAAM,CAAC,QAAQ,gCAAgC,EACnb,KAAc,CACf,CAAA;QACH,CAAC,CAAC,CAAA;QAEF,MAAM,IAAI,CAAC,WAAW,CAAA;IACxB,CAAC;IAED;;;;;OAKG;IACH,KAAK,CAAC,KAAK,CAAC,IAAY;QACtB,0EAA0E;QAC1E,MAAM,IAAI,CAAC,iBAAiB,EAAE,CAAA;QAE9B,IAAI,CAAC;YACH,mEAAmE;YACnE,IAAI,IAAI,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBACtB,MAAM,IAAI,cAAc,CAAC,0CAA0C,CAAC,CAAA;YACtE,CAAC;YAED,uEAAuE;YACvE,8FAA8F;YAC9F,MAAM,OAAO,GAAG,EAAE,OAA
O,EAAE,MAAM,EAAE,SAAS,EAAE,IAAI,EAAE,CAAA;YACpD,MAAM,SAAS,GAAG,IAAI,CAAC,KAGa,CAAA;YACpC,MAAM,MAAM,GAAG,MAAM,SAAS,CAAC,IAAI,EAAE,OAAO,CAAC,CAAA;YAE7C,qCAAqC;YACrC,MAAM,SAAS,GAAG,KAAK,CAAC,IAAI,CAAC,MAAM,CAAC,IAAI,CAAC,CAAA;YACzC,OAAO,SAAS,CAAA;QAClB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,IAAI,KAAK,YAAY,cAAc,EAAE,CAAC;gBACpC,MAAM,KAAK,CAAA;YACb,CAAC;YACD,MAAM,IAAI,cAAc,CACtB,iCAAkC,KAAe,CAAC,OAAO,EAAE,EAC3D,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;IAED;;;;;OAKG;IACH,KAAK,CAAC,UAAU,CAAC,KAAe;QAC9B,0EAA0E;QAC1E,MAAM,IAAI,CAAC,iBAAiB,EAAE,CAAA;QAE9B,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;YACvB,OAAO,EAAE,CAAA;QACX,CAAC;QAED,IAAI,CAAC;YACH,MAAM,UAAU,GAAe,EAAE,CAAA;YAEjC,6CAA6C;YAC7C,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC,IAAI,IAAI,CAAC,MAAM,CAAC,SAAS,EAAE,CAAC;gBAC7D,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,CAAA;gBACvD,MAAM,eAAe,GAAG,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;gBAChF,UAAU,CAAC,IAAI,CAAC,GAAG,eAAe,CAAC,CAAA;YACrC,CAAC;YAED,OAAO,UAAU,CAAA;QACnB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,IAAI,cAAc,CACtB,wCAAyC,KAAe,CAAC,OAAO,EAAE,EAClE,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;CACF;AA5ID,4BA4IC"}
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/embedder/index.ts"],"names":[],"mappings":";AAAA,+CAA+C;;;AAE/C,4DAAyD;AAkBzD,+CAA+C;AAC/C,gBAAgB;AAChB,+CAA+C;AAE/C;;GAEG;AACH,MAAa,cAAe,SAAQ,KAAK;IACvC,YACE,OAAe,EACU,KAAa;QAEtC,KAAK,CAAC,OAAO,CAAC,CAAA;QAFW,UAAK,GAAL,KAAK,CAAQ;QAGtC,IAAI,CAAC,IAAI,GAAG,gBAAgB,CAAA;IAC9B,CAAC;CACF;AARD,wCAQC;AAED,+CAA+C;AAC/C,iBAAiB;AACjB,+CAA+C;AAE/C;;;;;;;GAOG;AACH,MAAa,QAAQ;IAMnB,YAAY,MAAsB;QALlC,2EAA2E;QACnE,UAAK,GAAY,IAAI,CAAA;QACrB,gBAAW,GAAyB,IAAI,CAAA;QAI9C,IAAI,CAAC,MAAM,GAAG,MAAM,CAAA;IACtB,CAAC;IAED;;OAEG;IACH,KAAK,CAAC,UAAU;QACd,8BAA8B;QAC9B,IAAI,IAAI,CAAC,KAAK,EAAE,CAAC;YACf,OAAM;QACR,CAAC;QAED,IAAI,CAAC;YACH,+CAA+C;YAC/C,kBAAG,CAAC,QAAQ,GAAG,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAA;YAEnC,OAAO,CAAC,KAAK,CAAC,yCAAyC,IAAI,CAAC,MAAM,CAAC,QAAQ,GAAG,CAAC,CAAA;YAC/E,OAAO,CAAC,KAAK,CAAC,4BAA4B,IAAI,CAAC,MAAM,CAAC,SAAS,MAAM,CAAC,CAAA;YACtE,gFAAgF;YAChF,IAAI,CAAC,KAAK,GAAG,MAAM,IAAA,uBAAQ,EAAC,oBAAoB,EAAE,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,CAAA;YACxE,OAAO,CAAC,KAAK,CAAC,qCAAqC,CAAC,CAAA;QACtD,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,IAAI,cAAc,CACtB,kCAAmC,KAAe,CAAC,OAAO,EAAE,EAC5D,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;IAED;;;OAGG;IACK,KAAK,CAAC,iBAAiB;QAC7B,sBAAsB;QACtB,IAAI,IAAI,CAAC,KAAK,EAAE,CAAC;YACf,OAAM;QACR,CAAC;QAED,kDAAkD;QAClD,IAAI,IAAI,CAAC,WAAW,EAAE,CAAC;YACrB,MAAM,IAAI,CAAC,WAAW,CAAA;YACtB,OAAM;QACR,CAAC;QAED,uBAAuB;QACvB,OAAO,CAAC,KAAK,CACX,+FAA+F,CAChG,CAAA;QAED,IAAI,CAAC,WAAW,GAAG,IAAI,CAAC,UAAU,EAAE,CAAC,KAAK,CAAC,CAAC,KAAK,EAAE,EAAE;YACnD,8CAA8C;YAC9C,IAAI,CAAC,WAAW,GAAG,IAAI,CAAA;YAEvB,+CAA+C;YAC/C,MAAM,IAAI,cAAc,CACtB,+CAAgD,KAAe,CAAC,OAAO,wTAAwT,IAAI,CAAC,MAAM,CAAC,QAAQ,gCAAgC,EACnb,KAAc,CACf,CAAA;QACH,CAAC,CAAC,CAAA;QAEF,MAAM,IAAI,CAAC,WAAW,CAAA;IACxB,CAAC;IAED;;;;;OAKG;IACH,KAAK,CAAC,KAAK,CAAC,IAAY;QACtB,0EAA0E;QAC1E,MAAM,IAAI,CAAC,iBAAiB,EAAE,CAAA;QAE9B,IAAI,CAAC;YACH,mEAAmE;YACnE,IAAI,IAAI,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;gBACtB,MAAM,IAAI,cAAc,CAAC,0CAA0C,CAAC,CAAA;YACtE,CAAC;YAED,uEAAuE;YACvE,8FAA8F;YAC9
F,MAAM,OAAO,GAAG,EAAE,OAAO,EAAE,MAAM,EAAE,SAAS,EAAE,IAAI,EAAE,CAAA;YACpD,MAAM,SAAS,GAAG,IAAI,CAAC,KAGa,CAAA;YACpC,MAAM,MAAM,GAAG,MAAM,SAAS,CAAC,IAAI,EAAE,OAAO,CAAC,CAAA;YAE7C,qCAAqC;YACrC,MAAM,SAAS,GAAG,KAAK,CAAC,IAAI,CAAC,MAAM,CAAC,IAAI,CAAC,CAAA;YACzC,OAAO,SAAS,CAAA;QAClB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,IAAI,KAAK,YAAY,cAAc,EAAE,CAAC;gBACpC,MAAM,KAAK,CAAA;YACb,CAAC;YACD,MAAM,IAAI,cAAc,CACtB,iCAAkC,KAAe,CAAC,OAAO,EAAE,EAC3D,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;IAED;;;;;OAKG;IACH,KAAK,CAAC,UAAU,CAAC,KAAe;QAC9B,0EAA0E;QAC1E,MAAM,IAAI,CAAC,iBAAiB,EAAE,CAAA;QAE9B,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;YACvB,OAAO,EAAE,CAAA;QACX,CAAC;QAED,IAAI,CAAC;YACH,MAAM,UAAU,GAAe,EAAE,CAAA;YAEjC,6CAA6C;YAC7C,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC,IAAI,IAAI,CAAC,MAAM,CAAC,SAAS,EAAE,CAAC;gBAC7D,MAAM,KAAK,GAAG,KAAK,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,GAAG,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,CAAA;gBACvD,MAAM,eAAe,GAAG,MAAM,OAAO,CAAC,GAAG,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC,CAAC,CAAA;gBAChF,UAAU,CAAC,IAAI,CAAC,GAAG,eAAe,CAAC,CAAA;YACrC,CAAC;YAED,OAAO,UAAU,CAAA;QACnB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,MAAM,IAAI,cAAc,CACtB,wCAAyC,KAAe,CAAC,OAAO,EAAE,EAClE,KAAc,CACf,CAAA;QACH,CAAC;IACH,CAAC;CACF;AA9ID,4BA8IC"}
@@ -0,0 +1,14 @@
1
+ /**
2
+ * Parse HTML content and extract main content as Markdown
3
+ *
4
+ * Flow:
5
+ * 1. HTML string → JSDOM (DOM creation)
6
+ * 2. JSDOM → Readability (main content extraction, noise removal)
7
+ * 3. Readability result → Turndown (Markdown conversion)
8
+ *
9
+ * @param html - Raw HTML string
10
+ * @param url - Source URL (used for resolving relative links)
11
+ * @returns Markdown string of extracted content
12
+ */
13
+ export declare function parseHtml(html: string, url: string): Promise<string>;
14
+ //# sourceMappingURL=html-parser.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"html-parser.d.ts","sourceRoot":"","sources":["../../src/parser/html-parser.ts"],"names":[],"mappings":"AAsDA;;;;;;;;;;;GAWG;AACH,wBAAsB,SAAS,CAAC,IAAI,EAAE,MAAM,EAAE,GAAG,EAAE,MAAM,GAAG,OAAO,CAAC,MAAM,CAAC,CAoD1E"}
@@ -0,0 +1,99 @@
1
+ "use strict";
2
+ // HTML Parser using Readability and Turndown
3
+ // Extracts main content from HTML and converts to Markdown
4
+ var __importDefault = (this && this.__importDefault) || function (mod) {
5
+ return (mod && mod.__esModule) ? mod : { "default": mod };
6
+ };
7
+ Object.defineProperty(exports, "__esModule", { value: true });
8
+ exports.parseHtml = parseHtml;
9
+ const readability_1 = require("@mozilla/readability");
10
+ const jsdom_1 = require("jsdom");
11
+ const turndown_1 = __importDefault(require("turndown"));
12
+ // ============================================
13
+ // Turndown Service Configuration
14
+ // ============================================
15
+ /**
16
+ * Create and configure Turndown service for HTML to Markdown conversion
17
+ */
18
+ function createTurndownService() {
19
+ const turndownService = new turndown_1.default({
20
+ headingStyle: 'atx', // Use # style headings
21
+ codeBlockStyle: 'fenced', // Use ``` for code blocks
22
+ bulletListMarker: '-', // Use - for bullet lists
23
+ emDelimiter: '_', // Use _ for emphasis
24
+ strongDelimiter: '**', // Use ** for bold
25
+ });
26
+ // Keep code blocks intact
27
+ turndownService.addRule('codeBlocks', {
28
+ filter: ['pre'],
29
+ replacement: (_content, node) => {
30
+ const element = node;
31
+ const codeElement = element.querySelector('code');
32
+ const code = codeElement ? codeElement.textContent : element.textContent;
33
+ const language = codeElement?.className?.replace('language-', '') || '';
34
+ return `\n\`\`\`${language}\n${code?.trim() || ''}\n\`\`\`\n`;
35
+ },
36
+ });
37
+ return turndownService;
38
+ }
39
+ // ============================================
40
+ // HTML Parser
41
+ // ============================================
42
+ /**
43
+ * Parse HTML content and extract main content as Markdown
44
+ *
45
+ * Flow:
46
+ * 1. HTML string → JSDOM (DOM creation)
47
+ * 2. JSDOM → Readability (main content extraction, noise removal)
48
+ * 3. Readability result → Turndown (Markdown conversion)
49
+ *
50
+ * @param html - Raw HTML string
51
+ * @param url - Source URL (used for resolving relative links)
52
+ * @returns Markdown string of extracted content
53
+ */
54
+ async function parseHtml(html, url) {
55
+ // Handle empty or whitespace-only HTML
56
+ if (!html || html.trim().length === 0) {
57
+ return '';
58
+ }
59
+ try {
60
+ // Create DOM from HTML string
61
+ const dom = new jsdom_1.JSDOM(html, {
62
+ url,
63
+ // Enable features needed for Readability
64
+ runScripts: 'outside-only',
65
+ });
66
+ const document = dom.window.document;
67
+ // Use Readability to extract main content
68
+ const reader = new readability_1.Readability(document, {
69
+ keepClasses: false,
70
+ debug: false,
71
+ });
72
+ const article = reader.parse();
73
+ // If Readability couldn't extract content, fall back to body text
74
+ if (!article || !article.content) {
75
+ // Try to get body content directly
76
+ const bodyContent = document.body?.innerHTML || '';
77
+ if (!bodyContent.trim()) {
78
+ return '';
79
+ }
80
+ // Convert raw body HTML to Markdown
81
+ const turndownService = createTurndownService();
82
+ return turndownService.turndown(bodyContent).trim();
83
+ }
84
+ // Convert extracted HTML content to Markdown
85
+ const turndownService = createTurndownService();
86
+ const markdown = turndownService.turndown(article.content);
87
+ // Add title if available
88
+ if (article.title) {
89
+ return `# ${article.title}\n\n${markdown}`.trim();
90
+ }
91
+ return markdown.trim();
92
+ }
93
+ catch (error) {
94
+ // Log error but don't throw - return empty string for graceful degradation
95
+ console.error('Failed to parse HTML:', error);
96
+ return '';
97
+ }
98
+ }
99
+ //# sourceMappingURL=html-parser.js.map
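Reviewer note: the custom `codeBlocks` rule in `createTurndownService` is the part of this file most likely to surprise, so here is its string logic pulled out as a pure function. This is a sketch for illustration only. `toFencedBlock` is a hypothetical name, and it takes the class attribute and text directly instead of a DOM node.

```typescript
// Hypothetical pure version of the 'codeBlocks' replacement: takes the class
// attribute of the inner <code> element (if any) and the raw code text.
function toFencedBlock(className: string | undefined, code: string | null): string {
  // Same expression as the rule above: remove the first 'language-' occurrence,
  // and fall back to an empty info string when there is no class.
  const language = className?.replace('language-', '') || '';
  // Build the fence programmatically so this example can itself sit in a fence.
  const fence = '`'.repeat(3);
  return `\n${fence}${language}\n${code?.trim() || ''}\n${fence}\n`;
}
```

Because only the `language-` prefix is removed, a class attribute such as `language-ts hljs` produces the info string `ts hljs`. Splitting on whitespace first would be one way to keep only the language token, if that matters for downstream chunking.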
@@ -0,0 +1 @@
1
+ {"version":3,"file":"html-parser.js","sourceRoot":"","sources":["../../src/parser/html-parser.ts"],"names":[],"mappings":";AAAA,6CAA6C;AAC7C,2DAA2D;;;;;AAiE3D,8BAoDC;AAnHD,sDAAkD;AAClD,iCAA6B;AAC7B,wDAAsC;AActC,+CAA+C;AAC/C,iCAAiC;AACjC,+CAA+C;AAE/C;;GAEG;AACH,SAAS,qBAAqB;IAC5B,MAAM,eAAe,GAAG,IAAI,kBAAe,CAAC;QAC1C,YAAY,EAAE,KAAK,EAAE,uBAAuB;QAC5C,cAAc,EAAE,QAAQ,EAAE,0BAA0B;QACpD,gBAAgB,EAAE,GAAG,EAAE,yBAAyB;QAChD,WAAW,EAAE,GAAG,EAAE,qBAAqB;QACvC,eAAe,EAAE,IAAI,EAAE,kBAAkB;KAC1C,CAAC,CAAA;IAEF,0BAA0B;IAC1B,eAAe,CAAC,OAAO,CAAC,YAAY,EAAE;QACpC,MAAM,EAAE,CAAC,KAAK,CAAC;QACf,WAAW,EAAE,CAAC,QAAQ,EAAE,IAAI,EAAE,EAAE;YAC9B,MAAM,OAAO,GAAG,IAAe,CAAA;YAC/B,MAAM,WAAW,GAAG,OAAO,CAAC,aAAa,CAAC,MAAM,CAAC,CAAA;YACjD,MAAM,IAAI,GAAG,WAAW,CAAC,CAAC,CAAC,WAAW,CAAC,WAAW,CAAC,CAAC,CAAC,OAAO,CAAC,WAAW,CAAA;YACxE,MAAM,QAAQ,GAAG,WAAW,EAAE,SAAS,EAAE,OAAO,CAAC,WAAW,EAAE,EAAE,CAAC,IAAI,EAAE,CAAA;YACvE,OAAO,WAAW,QAAQ,KAAK,IAAI,EAAE,IAAI,EAAE,IAAI,EAAE,YAAY,CAAA;QAC/D,CAAC;KACF,CAAC,CAAA;IAEF,OAAO,eAAe,CAAA;AACxB,CAAC;AAED,+CAA+C;AAC/C,cAAc;AACd,+CAA+C;AAE/C;;;;;;;;;;;GAWG;AACI,KAAK,UAAU,SAAS,CAAC,IAAY,EAAE,GAAW;IACvD,uCAAuC;IACvC,IAAI,CAAC,IAAI,IAAI,IAAI,CAAC,IAAI,EAAE,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QACtC,OAAO,EAAE,CAAA;IACX,CAAC;IAED,IAAI,CAAC;QACH,8BAA8B;QAC9B,MAAM,GAAG,GAAG,IAAI,aAAK,CAAC,IAAI,EAAE;YAC1B,GAAG;YACH,yCAAyC;YACzC,UAAU,EAAE,cAAc;SAC3B,CAAC,CAAA;QAEF,MAAM,QAAQ,GAAG,GAAG,CAAC,MAAM,CAAC,QAAQ,CAAA;QAEpC,0CAA0C;QAC1C,MAAM,MAAM,GAAG,IAAI,yBAAW,CAAC,QAAQ,EAAE;YACvC,WAAW,EAAE,KAAK;YAClB,KAAK,EAAE,KAAK;SACb,CAAC,CAAA;QAEF,MAAM,OAAO,GAAG,MAAM,CAAC,KAAK,EAA8B,CAAA;QAE1D,kEAAkE;QAClE,IAAI,CAAC,OAAO,IAAI,CAAC,OAAO,CAAC,OAAO,EAAE,CAAC;YACjC,mCAAmC;YACnC,MAAM,WAAW,GAAG,QAAQ,CAAC,IAAI,EAAE,SAAS,IAAI,EAAE,CAAA;YAClD,IAAI,CAAC,WAAW,CAAC,IAAI,EAAE,EAAE,CAAC;gBACxB,OAAO,EAAE,CAAA;YACX,CAAC;YAED,oCAAoC;YACpC,MAAM,eAAe,GAAG,qBAAqB,EAAE,CAAA;YAC/C,OAAO,eAAe,CAAC,QAAQ,CAAC,WAAW,CAAC,CAAC,IAAI,EAAE,CAAA;QACrD,CAAC;QAED,6CAA6C;QAC7C,MAAM,eAAe,GAAG,qBAAqB,EAAE,CAAA;QAC/C,MAAM,QAAQ,GAAG,eAAe,CAAC,QAAQ,CAA
C,OAAO,CAAC,OAAO,CAAC,CAAA;QAE1D,yBAAyB;QACzB,IAAI,OAAO,CAAC,KAAK,EAAE,CAAC;YAClB,OAAO,KAAK,OAAO,CAAC,KAAK,OAAO,QAAQ,EAAE,CAAC,IAAI,EAAE,CAAA;QACnD,CAAC;QAED,OAAO,QAAQ,CAAC,IAAI,EAAE,CAAA;IACxB,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QACf,2EAA2E;QAC3E,OAAO,CAAC,KAAK,CAAC,uBAAuB,EAAE,KAAK,CAAC,CAAA;QAC7C,OAAO,EAAE,CAAA;IACX,CAAC;AACH,CAAC"}
@@ -1,4 +1,5 @@
1
1
  import { type GroupingMode } from '../vectordb/index.js';
2
+ import { type ContentFormat } from './raw-data-utils.js';
2
3
  /**
3
4
  * RAGServer configuration
4
5
  */
@@ -36,12 +37,33 @@ export interface IngestFileInput {
36
37
  /** File path */
37
38
  filePath: string;
38
39
  }
40
+ /**
41
+ * ingest_data tool input metadata
42
+ */
43
+ export interface IngestDataMetadata {
44
+ /** Source identifier: URL ("https://...") or custom ID ("clipboard://2024-12-30") */
45
+ source: string;
46
+ /** Content format */
47
+ format: ContentFormat;
48
+ }
49
+ /**
50
+ * ingest_data tool input
51
+ */
52
+ export interface IngestDataInput {
53
+ /** Content to ingest (text, HTML, or Markdown) */
54
+ content: string;
55
+ /** Content metadata */
56
+ metadata: IngestDataMetadata;
57
+ }
39
58
  /**
40
59
  * delete_file tool input
60
+ * Either filePath or source must be provided
41
61
  */
42
62
  export interface DeleteFileInput {
43
- /** File path */
44
- filePath: string;
63
+ /** File path (for files ingested via ingest_file) */
64
+ filePath?: string;
65
+ /** Source identifier (for data ingested via ingest_data) */
66
+ source?: string;
45
67
  }
46
68
  /**
47
69
  * ingest_file tool output
@@ -66,6 +88,8 @@ export interface QueryResult {
66
88
  text: string;
67
89
  /** Similarity score */
68
90
  score: number;
91
+ /** Original source (only for raw-data files, e.g., URLs ingested via ingest_data) */
92
+ source?: string;
69
93
  }
70
94
  /**
71
95
  * RAG server compliant with MCP Protocol
@@ -82,6 +106,7 @@ export declare class RAGServer {
82
106
  private readonly embedder;
83
107
  private readonly chunker;
84
108
  private readonly parser;
109
+ private readonly dbPath;
85
110
  constructor(config: RAGServerConfig);
86
111
  /**
87
112
  * Set up MCP handlers
@@ -110,7 +135,23 @@ export declare class RAGServer {
110
135
  }];
111
136
  }>;
112
137
  /**
113
- * list_files tool handler (Phase 1: basic implementation)
138
+ * ingest_data tool handler
139
+ * Saves raw content to raw-data directory and calls handleIngestFile internally
140
+ *
141
+ * For HTML content:
142
+ * - Parses HTML and extracts main content using Readability
143
+ * - Converts to Markdown for better chunking
144
+ * - Saves as .md file
145
+ */
146
+ handleIngestData(args: IngestDataInput): Promise<{
147
+ content: [{
148
+ type: 'text';
149
+ text: string;
150
+ }];
151
+ }>;
152
+ /**
153
+ * list_files tool handler
154
+ * Enriches raw-data files with original source information
114
155
  */
115
156
  handleListFiles(): Promise<{
116
157
  content: [{
@@ -129,6 +170,8 @@ export declare class RAGServer {
129
170
  }>;
130
171
  /**
131
172
  * delete_file tool handler
173
+ * Deletes chunks from VectorDB and physical raw-data files
174
+ * Supports both filePath (for ingest_file) and source (for ingest_data)
132
175
  */
133
176
  handleDeleteFile(args: DeleteFileInput): Promise<{
134
177
  content: [{
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/server/index.ts"],"names":[],"mappings":"AASA,OAAO,EAAE,KAAK,YAAY,EAAiC,MAAM,sBAAsB,CAAA;AAMvF;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,4BAA4B;IAC5B,MAAM,EAAE,MAAM,CAAA;IACd,iCAAiC;IACjC,SAAS,EAAE,MAAM,CAAA;IACjB,4BAA4B;IAC5B,QAAQ,EAAE,MAAM,CAAA;IAChB,8BAA8B;IAC9B,OAAO,EAAE,MAAM,CAAA;IACf,gCAAgC;IAChC,WAAW,EAAE,MAAM,CAAA;IACnB,kEAAkE;IAClE,WAAW,CAAC,EAAE,MAAM,CAAA;IACpB,qDAAqD;IACrD,QAAQ,CAAC,EAAE,YAAY,CAAA;IACvB,sFAAsF;IACtF,YAAY,CAAC,EAAE,MAAM,CAAA;CACtB;AAED;;GAEG;AACH,MAAM,WAAW,mBAAmB;IAClC,6BAA6B;IAC7B,KAAK,EAAE,MAAM,CAAA;IACb,iDAAiD;IACjD,KAAK,CAAC,EAAE,MAAM,CAAA;CACf;AAED;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;CACjB;AAED;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;CACjB;AAED;;GAEG;AACH,MAAM,WAAW,YAAY;IAC3B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;IAChB,kBAAkB;IAClB,UAAU,EAAE,MAAM,CAAA;IAClB,gBAAgB;IAChB,SAAS,EAAE,MAAM,CAAA;CAClB;AAED;;GAEG;AACH,MAAM,WAAW,WAAW;IAC1B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;IAChB,kBAAkB;IAClB,UAAU,EAAE,MAAM,CAAA;IAClB,WAAW;IACX,IAAI,EAAE,MAAM,CAAA;IACZ,uBAAuB;IACvB,KAAK,EAAE,MAAM,CAAA;CACd;AAMD;;;;;;;;GAQG;AACH,qBAAa,SAAS;IACpB,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAQ;IAC/B,OAAO,CAAC,QAAQ,CAAC,WAAW,CAAa;IACzC,OAAO,CAAC,QAAQ,CAAC,QAAQ,CAAU;IACnC,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAiB;IACzC,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAgB;gBAE3B,MAAM,EAAE,eAAe;IAoCnC;;OAEG;IACH,OAAO,CAAC,aAAa;IAmGrB;;OAEG;IACG,UAAU,IAAI,OAAO,CAAC,IAAI,CAAC;IAKjC;;OAEG;IACG,oBAAoB,CACxB,IAAI,EAAE,mBAAmB,GACxB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IA8BzD;;OAEG;IACG,gBAAgB,CACpB,IAAI,EAAE,eAAe,GACpB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IA0HzD;;OAEG;IACG,eAAe,IAAI,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAiB/E;;OAEG;IACG,YAAY,IAAI,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EA
AE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAiB5E;;OAEG;IACG,gBAAgB,CACpB,IAAI,EAAE,eAAe,GACpB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAoCzD;;OAEG;IACG,GAAG,IAAI,OAAO,CAAC,IAAI,CAAC;CAK3B"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../../src/server/index.ts"],"names":[],"mappings":"AAWA,OAAO,EAAE,KAAK,YAAY,EAAiC,MAAM,sBAAsB,CAAA;AACvF,OAAO,EACL,KAAK,aAAa,EAKnB,MAAM,qBAAqB,CAAA;AAM5B;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,4BAA4B;IAC5B,MAAM,EAAE,MAAM,CAAA;IACd,iCAAiC;IACjC,SAAS,EAAE,MAAM,CAAA;IACjB,4BAA4B;IAC5B,QAAQ,EAAE,MAAM,CAAA;IAChB,8BAA8B;IAC9B,OAAO,EAAE,MAAM,CAAA;IACf,gCAAgC;IAChC,WAAW,EAAE,MAAM,CAAA;IACnB,kEAAkE;IAClE,WAAW,CAAC,EAAE,MAAM,CAAA;IACpB,qDAAqD;IACrD,QAAQ,CAAC,EAAE,YAAY,CAAA;IACvB,sFAAsF;IACtF,YAAY,CAAC,EAAE,MAAM,CAAA;CACtB;AAED;;GAEG;AACH,MAAM,WAAW,mBAAmB;IAClC,6BAA6B;IAC7B,KAAK,EAAE,MAAM,CAAA;IACb,iDAAiD;IACjD,KAAK,CAAC,EAAE,MAAM,CAAA;CACf;AAED;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;CACjB;AAED;;GAEG;AACH,MAAM,WAAW,kBAAkB;IACjC,qFAAqF;IACrF,MAAM,EAAE,MAAM,CAAA;IACd,qBAAqB;IACrB,MAAM,EAAE,aAAa,CAAA;CACtB;AAED;;GAEG;AACH,MAAM,WAAW,eAAe;IAC9B,kDAAkD;IAClD,OAAO,EAAE,MAAM,CAAA;IACf,uBAAuB;IACvB,QAAQ,EAAE,kBAAkB,CAAA;CAC7B;AAED;;;GAGG;AACH,MAAM,WAAW,eAAe;IAC9B,qDAAqD;IACrD,QAAQ,CAAC,EAAE,MAAM,CAAA;IACjB,4DAA4D;IAC5D,MAAM,CAAC,EAAE,MAAM,CAAA;CAChB;AAED;;GAEG;AACH,MAAM,WAAW,YAAY;IAC3B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;IAChB,kBAAkB;IAClB,UAAU,EAAE,MAAM,CAAA;IAClB,gBAAgB;IAChB,SAAS,EAAE,MAAM,CAAA;CAClB;AAED;;GAEG;AACH,MAAM,WAAW,WAAW;IAC1B,gBAAgB;IAChB,QAAQ,EAAE,MAAM,CAAA;IAChB,kBAAkB;IAClB,UAAU,EAAE,MAAM,CAAA;IAClB,WAAW;IACX,IAAI,EAAE,MAAM,CAAA;IACZ,uBAAuB;IACvB,KAAK,EAAE,MAAM,CAAA;IACb,qFAAqF;IACrF,MAAM,CAAC,EAAE,MAAM,CAAA;CAChB;AAMD;;;;;;;;GAQG;AACH,qBAAa,SAAS;IACpB,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAQ;IAC/B,OAAO,CAAC,QAAQ,CAAC,WAAW,CAAa;IACzC,OAAO,CAAC,QAAQ,CAAC,QAAQ,CAAU;IACnC,OAAO,CAAC,QAAQ,CAAC,OAAO,CAAiB;IACzC,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAgB;IACvC,OAAO,CAAC,QAAQ,CAAC,MAAM,CAAQ;gBAEnB,MAAM,EAAE,eAAe;IAqCnC;;OAEG;IACH,OAAO,CAAC,aAAa;IA0IrB;;OAEG;IACG,UAAU,IAAI,OAAO,CAAC,IAAI,CAAC;IAKjC;;OAEG;IACG,oBAAoB,CACxB,IAAI,EAAE,mBAAmB,GACxB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA
;SAAE,CAAC,CAAA;KAAE,CAAC;IA0CzD;;OAEG;IACG,gBAAgB,CACpB,IAAI,EAAE,eAAe,GACpB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAmIzD;;;;;;;;OAQG;IACG,gBAAgB,CACpB,IAAI,EAAE,eAAe,GACpB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAyDzD;;;OAGG;IACG,eAAe,IAAI,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IA6B/E;;OAEG;IACG,YAAY,IAAI,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IAiB5E;;;;OAIG;IACG,gBAAgB,CACpB,IAAI,EAAE,eAAe,GACpB,OAAO,CAAC;QAAE,OAAO,EAAE,CAAC;YAAE,IAAI,EAAE,MAAM,CAAC;YAAC,IAAI,EAAE,MAAM,CAAA;SAAE,CAAC,CAAA;KAAE,CAAC;IA+DzD;;OAEG;IACG,GAAG,IAAI,OAAO,CAAC,IAAI,CAAC;CAK3B"}
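Note on the `index.d.ts` changes above: `DeleteFileInput` now makes both `filePath` and `source` optional, and its doc comment says one of the two must be provided, so that rule has to be checked at runtime. The following is a minimal sketch of such a check, not the package's actual implementation. The `resolveDeleteTarget` helper and the `DeleteTarget` type are hypothetical, the `ContentFormat` union is an assumption (its real definition lives in `raw-data-utils.js` and is not part of this diff), and preferring `filePath` when both fields are set is also an assumption.

```typescript
// Assumption: the real ContentFormat is defined in ./raw-data-utils.js and is not shown in this diff.
type ContentFormat = 'text' | 'html' | 'markdown'

// Shapes copied from the declarations added in this diff.
interface IngestDataMetadata {
  source: string
  format: ContentFormat
}

interface IngestDataInput {
  content: string
  metadata: IngestDataMetadata
}

interface DeleteFileInput {
  filePath?: string
  source?: string
}

// Hypothetical result type: records which of the two identifiers was supplied.
type DeleteTarget =
  | { kind: 'filePath'; value: string }
  | { kind: 'source'; value: string }

// Hypothetical helper: enforces "either filePath or source must be provided".
// Assumption: filePath wins when both are present.
function resolveDeleteTarget(args: DeleteFileInput): DeleteTarget {
  if (args.filePath) {
    return { kind: 'filePath', value: args.filePath }
  }
  if (args.source) {
    return { kind: 'source', value: args.source }
  }
  throw new Error('Either filePath or source must be provided')
}

// Example ingest_data payload for an HTML page fetched by the assistant.
const htmlPage: IngestDataInput = {
  content: '<html><body><article><h1>Docs</h1><p>Hello</p></article></body></html>',
  metadata: { source: 'https://example.com/docs', format: 'html' },
}

// Deleting that content later would identify it by the same source string.
const target = resolveDeleteTarget({ source: htmlPage.metadata.source })
console.log(target.kind, target.value)
```

Returning a tagged union rather than a bare string lets a caller branch on `kind` to decide whether physical raw-data files also need to be removed, which the updated `delete_file` doc comment describes for data ingested via `ingest_data`.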