@itkoren/sqmd 0.1.0

Files changed (128)
  1. package/CHANGELOG.md +46 -0
  2. package/LICENSE +21 -0
  3. package/README.md +1052 -0
  4. package/dist/api/app.d.ts +14 -0
  5. package/dist/api/app.d.ts.map +1 -0
  6. package/dist/api/app.js +32 -0
  7. package/dist/api/app.js.map +1 -0
  8. package/dist/api/middleware.d.ts +5 -0
  9. package/dist/api/middleware.d.ts.map +1 -0
  10. package/dist/api/middleware.js +37 -0
  11. package/dist/api/middleware.js.map +1 -0
  12. package/dist/api/models.d.ts +178 -0
  13. package/dist/api/models.d.ts.map +1 -0
  14. package/dist/api/models.js +39 -0
  15. package/dist/api/models.js.map +1 -0
  16. package/dist/api/routes/documents.d.ts +4 -0
  17. package/dist/api/routes/documents.d.ts.map +1 -0
  18. package/dist/api/routes/documents.js +92 -0
  19. package/dist/api/routes/documents.js.map +1 -0
  20. package/dist/api/routes/health.d.ts +6 -0
  21. package/dist/api/routes/health.d.ts.map +1 -0
  22. package/dist/api/routes/health.js +38 -0
  23. package/dist/api/routes/health.js.map +1 -0
  24. package/dist/api/routes/index.d.ts +5 -0
  25. package/dist/api/routes/index.d.ts.map +1 -0
  26. package/dist/api/routes/index.js +83 -0
  27. package/dist/api/routes/index.js.map +1 -0
  28. package/dist/api/routes/search.d.ts +6 -0
  29. package/dist/api/routes/search.d.ts.map +1 -0
  30. package/dist/api/routes/search.js +104 -0
  31. package/dist/api/routes/search.js.map +1 -0
  32. package/dist/config/loader.d.ts +4 -0
  33. package/dist/config/loader.d.ts.map +1 -0
  34. package/dist/config/loader.js +144 -0
  35. package/dist/config/loader.js.map +1 -0
  36. package/dist/config/schema.d.ts +298 -0
  37. package/dist/config/schema.d.ts.map +1 -0
  38. package/dist/config/schema.js +50 -0
  39. package/dist/config/schema.js.map +1 -0
  40. package/dist/embeddings/ollama.d.ts +14 -0
  41. package/dist/embeddings/ollama.d.ts.map +1 -0
  42. package/dist/embeddings/ollama.js +46 -0
  43. package/dist/embeddings/ollama.js.map +1 -0
  44. package/dist/embeddings/transformers.d.ts +14 -0
  45. package/dist/embeddings/transformers.d.ts.map +1 -0
  46. package/dist/embeddings/transformers.js +64 -0
  47. package/dist/embeddings/transformers.js.map +1 -0
  48. package/dist/embeddings/types.d.ts +6 -0
  49. package/dist/embeddings/types.d.ts.map +1 -0
  50. package/dist/embeddings/types.js +2 -0
  51. package/dist/embeddings/types.js.map +1 -0
  52. package/dist/index.d.ts +3 -0
  53. package/dist/index.d.ts.map +1 -0
  54. package/dist/index.js +233 -0
  55. package/dist/index.js.map +1 -0
  56. package/dist/ingestion/chunker.d.ts +21 -0
  57. package/dist/ingestion/chunker.d.ts.map +1 -0
  58. package/dist/ingestion/chunker.js +117 -0
  59. package/dist/ingestion/chunker.js.map +1 -0
  60. package/dist/ingestion/fingerprint.d.ts +6 -0
  61. package/dist/ingestion/fingerprint.d.ts.map +1 -0
  62. package/dist/ingestion/fingerprint.js +17 -0
  63. package/dist/ingestion/fingerprint.js.map +1 -0
  64. package/dist/ingestion/parser.d.ts +16 -0
  65. package/dist/ingestion/parser.d.ts.map +1 -0
  66. package/dist/ingestion/parser.js +98 -0
  67. package/dist/ingestion/parser.js.map +1 -0
  68. package/dist/ingestion/pipeline.d.ts +32 -0
  69. package/dist/ingestion/pipeline.d.ts.map +1 -0
  70. package/dist/ingestion/pipeline.js +191 -0
  71. package/dist/ingestion/pipeline.js.map +1 -0
  72. package/dist/ingestion/scanner.d.ts +2 -0
  73. package/dist/ingestion/scanner.d.ts.map +1 -0
  74. package/dist/ingestion/scanner.js +54 -0
  75. package/dist/ingestion/scanner.js.map +1 -0
  76. package/dist/mcp/server.d.ts +8 -0
  77. package/dist/mcp/server.d.ts.map +1 -0
  78. package/dist/mcp/server.js +73 -0
  79. package/dist/mcp/server.js.map +1 -0
  80. package/dist/mcp/tools.d.ts +6 -0
  81. package/dist/mcp/tools.d.ts.map +1 -0
  82. package/dist/mcp/tools.js +276 -0
  83. package/dist/mcp/tools.js.map +1 -0
  84. package/dist/rag/context-builder.d.ts +3 -0
  85. package/dist/rag/context-builder.d.ts.map +1 -0
  86. package/dist/rag/context-builder.js +27 -0
  87. package/dist/rag/context-builder.js.map +1 -0
  88. package/dist/rag/prompt-templates.d.ts +5 -0
  89. package/dist/rag/prompt-templates.d.ts.map +1 -0
  90. package/dist/rag/prompt-templates.js +41 -0
  91. package/dist/rag/prompt-templates.js.map +1 -0
  92. package/dist/search/hybrid.d.ts +14 -0
  93. package/dist/search/hybrid.d.ts.map +1 -0
  94. package/dist/search/hybrid.js +58 -0
  95. package/dist/search/hybrid.js.map +1 -0
  96. package/dist/search/query.d.ts +4 -0
  97. package/dist/search/query.d.ts.map +1 -0
  98. package/dist/search/query.js +23 -0
  99. package/dist/search/query.js.map +1 -0
  100. package/dist/search/reranker.d.ts +11 -0
  101. package/dist/search/reranker.d.ts.map +1 -0
  102. package/dist/search/reranker.js +44 -0
  103. package/dist/search/reranker.js.map +1 -0
  104. package/dist/store/db.d.ts +11 -0
  105. package/dist/store/db.d.ts.map +1 -0
  106. package/dist/store/db.js +75 -0
  107. package/dist/store/db.js.map +1 -0
  108. package/dist/store/reader.d.ts +8 -0
  109. package/dist/store/reader.d.ts.map +1 -0
  110. package/dist/store/reader.js +122 -0
  111. package/dist/store/reader.js.map +1 -0
  112. package/dist/store/schema.d.ts +39 -0
  113. package/dist/store/schema.d.ts.map +1 -0
  114. package/dist/store/schema.js +33 -0
  115. package/dist/store/schema.js.map +1 -0
  116. package/dist/store/writer.d.ts +6 -0
  117. package/dist/store/writer.d.ts.map +1 -0
  118. package/dist/store/writer.js +43 -0
  119. package/dist/store/writer.js.map +1 -0
  120. package/dist/watcher/daemon.d.ts +5 -0
  121. package/dist/watcher/daemon.d.ts.map +1 -0
  122. package/dist/watcher/daemon.js +43 -0
  123. package/dist/watcher/daemon.js.map +1 -0
  124. package/dist/watcher/handler.d.ts +14 -0
  125. package/dist/watcher/handler.d.ts.map +1 -0
  126. package/dist/watcher/handler.js +82 -0
  127. package/dist/watcher/handler.js.map +1 -0
  128. package/package.json +56 -0
package/README.md ADDED
@@ -0,0 +1,1052 @@
1
+ # sqmd
2
+
3
+ A fully local, high-performance semantic search engine for Markdown files. Index your notes, documentation, or any collection of `.md` / `.mdx` files and query them with natural language — no external API keys, no cloud services, no data leaving your machine.
4
+
5
+ Designed to serve both humans (CLI + REST API) and AI agents (MCP server), with a RAG-ready output layer for use as an agent memory backend.
6
+
7
+ ---
8
+
9
+ ## Table of Contents
10
+
11
+ - [Features](#features)
12
+ - [Architecture Overview](#architecture-overview)
13
+ - [Technology Stack](#technology-stack)
14
+ - [Project Structure](#project-structure)
15
+ - [Getting Started](#getting-started)
16
+ - [Prerequisites](#prerequisites)
17
+ - [Installation](#installation)
18
+ - [Initial Configuration](#initial-configuration)
19
+ - [First Index](#first-index)
20
+ - [CLI Reference](#cli-reference)
21
+ - [index](#index)
22
+ - [search](#search)
23
+ - [serve](#serve)
24
+ - [mcp](#mcp)
25
+ - [status](#status)
26
+ - [config](#config)
27
+ - [REST API](#rest-api)
28
+ - [Search](#search-endpoints)
29
+ - [Documents](#document-endpoints)
30
+ - [Index Management](#index-management-endpoints)
31
+ - [Health & Metrics](#health--metrics-endpoints)
32
+ - [Authentication](#authentication)
33
+ - [MCP Server](#mcp-server)
34
+ - [Tools](#mcp-tools)
35
+ - [Resources](#mcp-resources)
36
+ - [Claude Desktop Integration](#claude-desktop-integration)
37
+ - [RAG Layer](#rag-layer)
38
+ - [Configuration Reference](#configuration-reference)
39
+ - [Architecture Deep Dive](#architecture-deep-dive)
40
+ - [Chunking Algorithm](#chunking-algorithm)
41
+ - [Embedding Pipeline](#embedding-pipeline)
42
+ - [Hybrid Search & RRF](#hybrid-search--rrf)
43
+ - [Incremental Indexing](#incremental-indexing)
44
+ - [LanceDB Schema](#lancedb-schema)
45
+ - [Embedding Backends](#embedding-backends)
46
+ - [Transformers.js (Default)](#transformersjs-default)
47
+ - [Ollama](#ollama)
48
+ - [Performance](#performance)
49
+ - [Development](#development)
50
+ - [Running Tests](#running-tests)
51
+ - [Project Conventions](#project-conventions)
52
+ - [Troubleshooting](#troubleshooting)
53
+
54
+ ---
55
+
56
+ ## Features
57
+
58
+ - **Fully local** — all embeddings, vector storage, and search run on-device
59
+ - **Hierarchical chunking** — sections are split following the document's heading structure, preserving semantic context
60
+ - **Hybrid search** — combines dense vector search (cosine ANN) and sparse full-text search (BM25/Tantivy) fused via Reciprocal Rank Fusion
61
+ - **Incremental indexing** — SHA-256 fingerprinting skips unchanged files; filesystem watcher triggers re-indexing automatically
62
+ - **Multiple interfaces** — CLI, REST API (Hono), and MCP server for AI agents
63
+ - **RAG output** — context builder assembles ranked chunks into token-budgeted context windows with source attribution
64
+ - **Optional reranking** — cross-encoder reranking (ONNX) for higher-precision results
65
+ - **Two embedding backends** — Transformers.js ONNX (default, bundled) or Ollama HTTP
66
+ - **Type-safe configuration** — Zod-validated YAML config with environment variable overrides
67
+
68
+ ---
69
+
70
+ ## Architecture Overview
71
+
72
+ ```
73
+ ┌─────────────────────────────────────────────────────────────────┐
74
+ │ Interfaces │
75
+ │ CLI (Commander) REST API (Hono) MCP Server (stdio/SSE) │
76
+ └────────────┬──────────────┬──────────────────┬─────────────────┘
77
+ │ │ │
78
+ ▼ ▼ ▼
79
+ ┌─────────────────────────────────────────────────────────────────┐
80
+ │ Search Layer │
81
+ │ Query preprocessing → Hybrid RRF → Reranker │
82
+ └────────────────────────────┬────────────────────────────────────┘
83
+
84
+ ┌───────────────┼───────────────┐
85
+ ▼ ▼ ▼
86
+ Vector ANN BM25 FTS RAG Context
87
+ (LanceDB) (Tantivy) Builder
88
+ │ │
89
+ └───────┬───────┘
90
+
91
+ ┌─────────────────────────────────────────────────────────────────┐
92
+ │ Storage Layer │
93
+ │ LanceDB (chunks + files tables) │
94
+ └────────────────────────────┬────────────────────────────────────┘
95
+
96
+ ┌─────────────────────────────────────────────────────────────────┐
97
+ │ Ingestion Pipeline │
98
+ │ Scanner → Parser (remark AST) → Chunker → Embedder → Writer │
99
+ └──────────────┬───────────────────────────────────────┬──────────┘
100
+ │ │
101
+ File System Transformers.js
102
+ (chokidar watch) ONNX / Ollama
103
+ ```
104
+
105
+ ---
106
+
107
+ ## Technology Stack
108
+
109
+ | Component | Library | Rationale |
110
+ |-----------|---------|-----------|
111
+ | Language | TypeScript (Node.js ≥22) | Type safety; ESM native; no GIL |
112
+ | Vector DB | `@lancedb/lancedb` | Embedded hybrid vector + BM25; no separate process |
113
+ | Embeddings | `@huggingface/transformers` v3 | ONNX runtime, 2–3× faster than PyTorch on CPU |
114
+ | MD Parsing | `remark` / `remark-parse` | Full mdast AST with line positions |
115
+ | REST Server | `hono` + `@hono/node-server` | ~3× faster than Express; excellent TypeScript DX |
116
+ | MCP Server | `@modelcontextprotocol/sdk` | Official Anthropic reference SDK |
117
+ | File Watch | `chokidar` v3 | Native FSEvents on macOS; debounce built-in |
118
+ | CLI | `commander` | Lightweight, typed |
119
+ | Config | `zod` + `js-yaml` | Runtime-validated config |
120
+ | Concurrency | `p-limit` | Bounded parallelism for the indexing pipeline |
121
+
122
+ ---
123
+
124
+ ## Project Structure
125
+
126
+ ```
127
+ sqmd/
128
+ ├── src/
129
+ │ ├── index.ts # CLI entrypoint
130
+ │ ├── config/
131
+ │ │ ├── schema.ts # Zod config schemas + TypeScript types
132
+ │ │ └── loader.ts # YAML loading, ~ expansion, env overrides
133
+ │ ├── ingestion/
134
+ │ │ ├── scanner.ts # Recursive async file discovery
135
+ │ │ ├── parser.ts # remark AST → Section[] with line numbers
136
+ │ │ ├── chunker.ts # Hierarchical token-aware chunking
137
+ │ │ ├── fingerprint.ts # SHA-256 content + path hashing
138
+ │ │ └── pipeline.ts # Full index orchestration (scan→chunk→embed→store)
139
+ │ ├── embeddings/
140
+ │ │ ├── types.ts # Embedder interface
141
+ │ │ ├── transformers.ts # Transformers.js ONNX backend
142
+ │ │ └── ollama.ts # Ollama HTTP backend
143
+ │ ├── store/
144
+ │ │ ├── schema.ts # Apache Arrow schemas + TypeScript record types
145
+ │ │ ├── db.ts # LanceDB connection management
146
+ │ │ ├── writer.ts # Upsert / delete operations
147
+ │ │ └── reader.ts # Vector search, FTS search, file/chunk queries
148
+ │ ├── search/
149
+ │ │ ├── query.ts # Query preprocessing and prefix injection
150
+ │ │ ├── hybrid.ts # RRF fusion (vector + BM25)
151
+ │ │ └── reranker.ts # Optional cross-encoder reranking
152
+ │ ├── watcher/
153
+ │ │ ├── handler.ts # chokidar event handler + debounce
154
+ │ │ └── daemon.ts # Long-running watcher lifecycle
155
+ │ ├── api/
156
+ │ │ ├── app.ts # Hono application factory
157
+ │ │ ├── middleware.ts # Auth, CORS, request logging
158
+ │ │ ├── models.ts # Zod request/response schemas
159
+ │ │ └── routes/
160
+ │ │ ├── health.ts # GET /api/v1/health
161
+ │ │ ├── search.ts # POST|GET /api/v1/search
162
+ │ │ ├── documents.ts # GET /api/v1/documents[/:id]
163
+ │ │ └── index.ts # POST /api/v1/index/trigger, GET /api/v1/index/status
164
+ │ ├── mcp/
165
+ │ │ ├── tools.ts # MCP tool + resource implementations
166
+ │ │ └── server.ts # MCP server (stdio + SSE transports)
167
+ │ └── rag/
168
+ │ ├── context-builder.ts # Token-budgeted context window assembly
169
+ │ └── prompt-templates.ts # System prompts for RAG use
170
+ ├── tests/
171
+ │ ├── unit/
172
+ │ │ ├── chunker.test.ts
173
+ │ │ ├── hybrid.test.ts
174
+ │ │ └── config.test.ts
175
+ │ └── integration/
176
+ │ ├── pipeline.test.ts
177
+ │ └── api.test.ts
178
+ ├── config.yaml # Default configuration (copy to ~/.sqmd/)
179
+ ├── package.json
180
+ └── tsconfig.json
181
+ ```
182
+
183
+ ---
184
+
185
+ ## Getting Started
186
+
187
+ ### Prerequisites
188
+
189
+ - **Node.js 22+** — required for native ESM and modern `node:` builtins
190
+ - **pnpm** (recommended) or npm
191
+
192
+ ```bash
193
+ node --version # must be ≥ 22.0.0
194
+ ```
195
+
196
+ ### Installation
197
+
198
+ **From source:**
199
+
200
+ ```bash
201
+ git clone <repo>
202
+ cd sqmd
203
+ npm install # or: pnpm install
204
+ npm run build # compiles TypeScript → dist/
205
+ ```
206
+
207
+ **Global install (after build):**
208
+
209
+ ```bash
210
+ npm install -g .
211
+ sqmd --version
212
+ ```
213
+
214
+ ### Initial Configuration
215
+
216
+ Write the default configuration file:
217
+
218
+ ```bash
219
+ node dist/index.js config --init ~/.sqmd/config.yaml
220
+ ```
221
+
222
+ Then edit `~/.sqmd/config.yaml` to set the directories you want to index:
223
+
224
+ ```yaml
225
+ paths:
226
+ watch_dirs:
227
+ - "~/notes"
228
+ - "~/work/docs"
229
+ db_path: "~/.sqmd/lancedb"
230
+ ```
231
+
232
+ The tool resolves config in this order:
233
+ 1. Path from `--config` flag
234
+ 2. `$SQMD_CONFIG` environment variable
235
+ 3. `~/.sqmd/config.yaml`
236
+ 4. `./config.yaml` (project-local)
237
+ 5. Built-in defaults
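
The lookup order above can be sketched as a small pure function. This is an illustration under assumptions, not sqmd's actual loader: `resolveConfigPath`, its input names, and the injected `exists` predicate are hypothetical, and returning `null` stands in for "use the built-in defaults". It also assumes that an explicit flag or environment value is taken as given, without an existence check.

```typescript
// Hypothetical sketch of the documented lookup order; not sqmd's real loader.
interface LookupInputs {
  flag?: string; // value passed via --config, if any
  env?: string;  // value of $SQMD_CONFIG, if set
  home: string;  // user's home directory, already expanded
  cwd: string;   // current working directory
}

function resolveConfigPath(
  inputs: LookupInputs,
  exists: (path: string) => boolean,
): string | null {
  // Steps 1 and 2: explicit choices win (assumed to be taken as given).
  if (inputs.flag) return inputs.flag;
  if (inputs.env) return inputs.env;

  // Steps 3 and 4: well-known locations, checked in order.
  const candidates = [
    `${inputs.home}/.sqmd/config.yaml`,
    `${inputs.cwd}/config.yaml`,
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }

  // Step 5: nothing found, so the caller falls back to built-in defaults.
  return null;
}
```

Passing `exists` in as a parameter keeps the sketch free of filesystem access, which makes the precedence rules easy to check in isolation.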
238
+
239
+ ### First Index
240
+
241
+ ```bash
242
+ node dist/index.js index
243
+ ```
244
+
245
+ On first run, the embedding model (`nomic-ai/nomic-embed-text-v1.5`, ~270 MB) is downloaded and cached to `~/.sqmd/models`. Subsequent runs use the cached model.
246
+
247
+ ---
248
+
249
+ ## CLI Reference
250
+
251
+ All commands accept `--config <path>` to specify a non-default config file.
252
+
253
+ ### `index`
254
+
255
+ Scan and index Markdown files.
256
+
257
+ ```
258
+ sqmd index [options]
259
+
260
+ Options:
261
+ --path <path> Directory or single file to index (default: watch_dirs from config)
262
+ --force Re-index all files, even if content is unchanged
263
+ --watch Keep running and re-index files as they change
264
+ --config <path> Config file path
265
+ ```
266
+
267
+ **Examples:**
268
+
269
+ ```bash
270
+ # Index default watch_dirs
271
+ node dist/index.js index
272
+
273
+ # Index a specific directory
274
+ node dist/index.js index --path ~/work/docs
275
+
276
+ # Force full re-index (ignores change detection)
277
+ node dist/index.js index --force
278
+
279
+ # Index then keep watching
280
+ node dist/index.js index --watch
281
+ ```
282
+
283
+ Progress is printed per-file. A summary reports indexed, skipped (unchanged), and errored files.
284
+
285
+ ---
286
+
287
+ ### `search`
288
+
289
+ Query the index from the terminal.
290
+
291
+ ```
292
+ sqmd search <query> [options]
293
+
294
+ Arguments:
295
+ query Natural language search query (quote multi-word queries)
296
+
297
+ Options:
298
+ --top-k <n> Number of results to return (default: 10)
299
+ --mode <mode> hybrid | vector | fts (default: hybrid)
300
+ --filter <path> Restrict results to files whose path contains this substring
301
+ --config <path> Config file path
302
+ ```
303
+
304
+ **Examples:**
305
+
306
+ ```bash
307
+ # Semantic search
308
+ node dist/index.js search "how to configure authentication"
309
+
310
+ # Full-text only
311
+ node dist/index.js search "OAuth token refresh" --mode fts
312
+
313
+ # Top 5 results scoped to a directory
314
+ node dist/index.js search "deployment strategy" --top-k 5 --filter /work/
315
+ ```
316
+
317
+ Output includes file path, heading breadcrumb, score, line range, and a 200-character snippet.
318
+
319
+ ---
320
+
321
+ ### `serve`
322
+
323
+ Start the HTTP REST API server.
324
+
325
+ ```
326
+ sqmd serve [options]
327
+
328
+ Options:
329
+ --host <host> Bind address (default: 127.0.0.1)
330
+ --port <port> Port (default: 7832)
331
+ --config <path> Config file path
332
+ ```
333
+
334
+ ```bash
335
+ node dist/index.js serve
336
+ # → Listening on http://127.0.0.1:7832
337
+ ```
338
+
339
+ If `watcher.enabled` is `true` in config, the file watcher starts automatically alongside the API server.
340
+
341
+ ---
342
+
343
+ ### `mcp`
344
+
345
+ Start the Model Context Protocol server.
346
+
347
+ ```
348
+ sqmd mcp [options]
349
+
350
+ Options:
351
+ --transport <transport> stdio | sse (default: stdio)
352
+ --port <port> Port for SSE transport (default: 7833)
353
+ --config <path> Config file path
354
+ ```
355
+
356
+ ```bash
357
+ # For Claude Desktop / Claude Code (stdio)
358
+ node dist/index.js mcp
359
+
360
+ # For HTTP-based agents (SSE)
361
+ node dist/index.js mcp --transport sse --port 7833
362
+ ```
363
+
364
+ ---
365
+
366
+ ### `status`
367
+
368
+ Display index statistics.
369
+
370
+ ```bash
371
+ node dist/index.js status
372
+ ```
373
+
374
+ Output:
375
+
376
+ ```
377
+ sqmd Status
378
+ ────────────────────────────────────────
379
+ DB path: ~/.sqmd/lancedb
380
+ Files indexed: 142
381
+ Chunks stored: 3847
382
+ Last indexed: 3/17/2026, 09:14:32 AM
383
+ Watch dirs: ~/notes
384
+ Embedder: transformers / nomic-ai/nomic-embed-text-v1.5
385
+ ```
386
+
387
+ ---
388
+
389
+ ### `config`
390
+
391
+ Manage configuration.
392
+
393
+ ```bash
394
+ # Write default config to a path
395
+ node dist/index.js config --init ~/.sqmd/config.yaml
396
+ ```
397
+
398
+ ---
399
+
400
+ ## REST API
401
+
402
+ Base URL: `http://localhost:7832/api/v1`
403
+
404
+ All responses are JSON. Errors use `{ "error": "...", "message": "..." }`.
405
+
406
+ ### Search Endpoints
407
+
408
+ #### `POST /api/v1/search`
409
+
410
+ ```json
411
+ // Request body
412
+ {
413
+ "query": "how to set up two-factor authentication",
414
+ "top_k": 10,
415
+ "mode": "hybrid",
416
+ "filter_path": "/notes/security",
417
+ "include_context": false,
418
+ "rerank": false
419
+ }
420
+ ```
421
+
422
+ | Field | Type | Default | Description |
423
+ |-------|------|---------|-------------|
424
+ | `query` | string | **required** | Natural language query |
425
+ | `top_k` | number | 10 | Number of results |
426
+ | `mode` | `"hybrid"` \| `"vector"` \| `"fts"` | `"hybrid"` | Search algorithm |
427
+ | `filter_path` | string | — | Path substring filter |
428
+ | `include_context` | boolean | false | Include breadcrumb-prefixed `text` field |
429
+ | `rerank` | boolean | config default | Apply cross-encoder reranking |
430
+
431
+ ```json
432
+ // Response
433
+ {
434
+ "results": [
435
+ {
436
+ "chunk_id": "abc123:2:0",
437
+ "file_id": "sha256-of-path",
438
+ "file_path": "/notes/security/2fa.md",
439
+ "heading_path": "Setup > Two-Factor Authentication",
440
+ "heading_text": "Two-Factor Authentication",
441
+ "heading_level": 2,
442
+ "section_index": 2,
443
+ "chunk_index": 0,
444
+ "text_raw": "Enable 2FA by navigating to Settings...",
445
+ "token_count": 87,
446
+ "score": 0.0312,
447
+ "line_start": 45,
448
+ "line_end": 72
449
+ }
450
+ ],
451
+ "query": "how to set up two-factor authentication",
452
+ "total": 10,
453
+ "duration_ms": 43
454
+ }
455
+ ```
456
+
457
+ #### `GET /api/v1/search?q=...`
458
+
459
+ ```bash
460
+ curl "http://localhost:7832/api/v1/search?q=configure+auth&top_k=5&mode=vector"
461
+ ```
462
+
463
+ Accepts the same parameters as POST via the query string, with `q` carrying the query text (as in the example above). Useful for quick browser or curl queries.
464
+
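
For a typed client, the two search calls above might be wrapped as follows. This is a minimal sketch: the field names come from the request table, while `SearchRequest`, `buildPostSearch`, and `buildGetSearchUrl` are hypothetical names introduced for illustration. The helpers only build the request and do not send it.

```typescript
// Field names follow the request table; the helper names are hypothetical.
type SearchMode = "hybrid" | "vector" | "fts";

interface SearchRequest {
  query: string;
  top_k?: number;
  mode?: SearchMode;
  filter_path?: string;
  include_context?: boolean;
  rerank?: boolean;
}

const BASE_URL = "http://localhost:7832/api/v1";

// Build the pieces of a POST /search call without sending it.
function buildPostSearch(req: SearchRequest, apiKey = "") {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only needed when api.api_key is set on the server.
  if (apiKey !== "") headers["Authorization"] = `Bearer ${apiKey}`;
  return { url: `${BASE_URL}/search`, method: "POST", headers, body: JSON.stringify(req) };
}

// Build the equivalent GET URL; the curl example uses `q` for the query text.
function buildGetSearchUrl(req: SearchRequest): string {
  const parts = [`q=${encodeURIComponent(req.query)}`];
  if (req.top_k !== undefined) parts.push(`top_k=${req.top_k}`);
  if (req.mode !== undefined) parts.push(`mode=${req.mode}`);
  if (req.filter_path !== undefined) {
    parts.push(`filter_path=${encodeURIComponent(req.filter_path)}`);
  }
  return `${BASE_URL}/search?${parts.join("&")}`;
}
```

On Node.js 22 the result can be handed to the global `fetch`, for example `const { url, ...init } = buildPostSearch({ query: "2fa" }); const res = await fetch(url, init);`.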
465
+ ---
466
+
467
+ ### Document Endpoints
468
+
469
+ #### `GET /api/v1/documents`
470
+
471
+ Returns a paginated list of indexed files.
472
+
473
+ ```bash
474
+ curl "http://localhost:7832/api/v1/documents?limit=20&offset=0"
475
+ ```
476
+
477
+ ```json
478
+ {
479
+ "documents": [
480
+ {
481
+ "file_id": "...",
482
+ "file_path": "/notes/setup.md",
483
+ "file_hash": "...",
484
+ "chunk_count": 12,
485
+ "indexed_at": 1742215200000,
486
+ "status": "indexed"
487
+ }
488
+ ],
489
+ "total": 142,
490
+ "limit": 20,
491
+ "offset": 0
492
+ }
493
+ ```
494
+
495
+ #### `GET /api/v1/documents/:fileId`
496
+
497
+ Returns metadata and all stored chunks for a specific file.
498
+
499
+ ```bash
500
+ curl "http://localhost:7832/api/v1/documents/<file_id>"
501
+ ```
502
+
503
+ #### `GET /api/v1/documents/:fileId/raw`
504
+
505
+ Returns the raw Markdown content of the file (read from disk).
506
+
507
+ ---
508
+
509
+ ### Index Management Endpoints
510
+
511
+ #### `POST /api/v1/index/trigger`
512
+
513
+ Trigger a re-index operation asynchronously.
514
+
515
+ ```json
516
+ // Request
517
+ {
518
+ "paths": ["/notes/work"], // optional; defaults to watch_dirs
519
+ "force": false // optional
520
+ }
521
+ ```
522
+
523
+ ```json
524
+ // Response 202
525
+ {
526
+ "job_id": "job-1710000000000",
527
+ "status": "queued"
528
+ }
529
+ ```
530
+
531
+ #### `GET /api/v1/index/status`
532
+
533
+ Returns current index statistics and watcher state.
534
+
535
+ ```json
536
+ {
537
+ "fileCount": 142,
538
+ "chunkCount": 3847,
539
+ "watcherRunning": true,
540
+ "dbPath": "~/.sqmd/lancedb"
541
+ }
542
+ ```
543
+
544
+ #### `GET /api/v1/index/jobs/:jobId`
545
+
546
+ Returns the progress of a triggered index job.
547
+
548
+ ```json
549
+ {
550
+ "job_id": "job-1710000000000",
551
+ "status": "completed",
552
+ "indexed": 5,
553
+ "skipped": 137,
554
+ "errors": 0,
555
+ "started_at": 1710000000000,
556
+ "completed_at": 1710000003500
557
+ }
558
+ ```
559
+
560
+ ---
561
+
562
+ ### Health & Metrics Endpoints
563
+
564
+ #### `GET /api/v1/health`
565
+
566
+ ```json
567
+ {
568
+ "status": "ok",
569
+ "db": "connected",
570
+ "embedder": "transformers / nomic-ai/nomic-embed-text-v1.5",
571
+ "watcher": "running",
572
+ "uptime_seconds": 3600
573
+ }
574
+ ```
575
+
576
+ #### `GET /api/v1/metrics`
577
+
578
+ Returns search latency percentiles and throughput counters.
579
+
580
+ ---
581
+
582
+ ### Authentication
583
+
584
+ Set `api.api_key` in the config to a non-empty string to enable bearer token auth. Once it is set, every `/api/*` request must include this header:
585
+
586
+ ```
587
+ Authorization: Bearer <your-api-key>
588
+ ```
589
+
590
+ If `api_key` is empty (the default), authentication is disabled — suitable for local use.
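
The rule above (empty key disables auth, otherwise the header must carry the key as a bearer token) can be expressed as a small predicate. This is a sketch of the documented behavior only; `isAuthorized` is a hypothetical name and does not reflect how sqmd's middleware is actually written.

```typescript
// Hypothetical helper illustrating the documented rule; not sqmd's middleware.
function isAuthorized(authorizationHeader: string | undefined, apiKey: string): boolean {
  // An empty key means authentication is disabled.
  if (apiKey === "") return true;
  if (authorizationHeader === undefined) return false;

  const prefix = "Bearer ";
  if (!authorizationHeader.startsWith(prefix)) return false;

  // Plain comparison for clarity; see the note below about timing.
  return authorizationHeader.slice(prefix.length) === apiKey;
}
```

A server exposed beyond localhost would typically compare the token in constant time rather than with `===`.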
591
+
592
+ ---
593
+
594
+ ## MCP Server
595
+
596
+ sqmd exposes a full Model Context Protocol server, allowing AI agents like Claude to search your notes directly from conversations.
597
+
598
+ ### MCP Tools
599
+
600
+ | Tool | Required Args | Optional Args | Description |
601
+ |------|--------------|---------------|-------------|
602
+ | `search_documents` | `query` | `top_k`, `mode`, `filter_path`, `include_context` | Primary semantic/hybrid search |
603
+ | `get_document` | `file_path` | `section` | Fetch a file's metadata and chunks, optionally filtered to a heading |
604
+ | `list_documents` | — | `path_prefix`, `limit` | Browse the indexed file tree |
605
+ | `trigger_index` | — | `paths`, `force` | Request re-indexing |
606
+ | `get_index_status` | — | — | Index health and stats |
607
+
608
+ **`search_documents` example (Claude tool call):**
609
+
610
+ ```json
611
+ {
612
+ "query": "database migration strategy",
613
+ "top_k": 5,
614
+ "mode": "hybrid",
615
+ "include_context": true
616
+ }
617
+ ```
618
+
619
+ When `include_context` is `true`, the response includes a pre-assembled `context` string ready to inject into a prompt.
620
+
621
+ ### MCP Resources
622
+
623
+ Every indexed file is exposed as a resource with URI scheme `md://<absolute-path>`:
624
+
625
+ ```
626
+ md:///Users/alice/notes/architecture.md
627
+ ```
628
+
629
+ Agents can read raw Markdown content directly via the resource protocol without going through the search tool.
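
Converting between absolute paths and `md://` resource URIs could look like the sketch below. The helper names are hypothetical, and the sketch assumes POSIX-style absolute paths with no percent-encoding, which matches the example URI but may not match sqmd's handling of unusual file names.

```typescript
// Hypothetical helpers for the md:// scheme; assumes POSIX absolute paths.
const MD_SCHEME = "md://";

function toResourceUri(absolutePath: string): string {
  if (!absolutePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absolutePath}`);
  }
  // "/Users/alice/x.md" becomes "md:///Users/alice/x.md" (three slashes).
  return MD_SCHEME + absolutePath;
}

function fromResourceUri(uri: string): string {
  if (!uri.startsWith(MD_SCHEME + "/")) {
    throw new Error(`not an md:// resource URI: ${uri}`);
  }
  return uri.slice(MD_SCHEME.length);
}
```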
630
+
631
+ ### Claude Desktop Integration
632
+
633
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
634
+
635
+ ```json
636
+ {
637
+ "mcpServers": {
638
+ "sqmd": {
639
+ "command": "node",
640
+ "args": ["/path/to/sqmd/dist/index.js", "mcp"],
641
+ "env": {
642
+ "SQMD_CONFIG": "/Users/alice/.sqmd/config.yaml"
643
+ }
644
+ }
645
+ }
646
+ }
647
+ ```
648
+
649
+ Or, if installed globally:
650
+
651
+ ```json
652
+ {
653
+ "mcpServers": {
654
+ "sqmd": {
655
+ "command": "sqmd",
656
+ "args": ["mcp"],
657
+ "env": {
658
+ "SQMD_CONFIG": "~/.sqmd/config.yaml"
659
+ }
660
+ }
661
+ }
662
+ }
663
+ ```
664
+
665
+ ### Claude Code Integration
666
+
667
+ Add to your `.mcp.json` or use `claude mcp add`:
668
+
669
+ ```bash
670
+ claude mcp add sqmd -- node /path/to/dist/index.js mcp
671
+ ```
672
+
673
+ ---
674
+
675
+ ## RAG Layer
676
+
677
+ The `src/rag/` module provides utilities for turning ranked search results into prompt-ready context, so sqmd can act as a memory backend for AI agents.
678
+
679
+ **`buildContext(results, maxTokens)`** assembles search results into a single context string that fits within a token budget. Each chunk is preceded by attribution metadata:
680
+
681
+ ```
682
+ Source: /notes/architecture/decisions.md
683
+ Section: Architecture > Database > Schema Design
684
+ Lines: 45-72
685
+
686
+ We chose PostgreSQL because it provides...
687
+
688
+ ---
689
+
690
+ Source: /notes/architecture/decisions.md
691
+ Section: Architecture > Database > Migrations
692
+ Lines: 100-134
693
+
694
+ All schema changes are managed via...
695
+ ```
696
+
697
+ The `search_documents` MCP tool returns this context when `include_context: true`. Inject it directly into the system prompt or user message of your agent.
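
One way to assemble a context string in the format shown above is sketched here. The `buildContext(results, maxTokens)` signature comes from this README, but the body is an assumption: the `RankedChunk` shape is a hypothetical subset of the search result fields, the word-based token estimate is borrowed from the chunker's description, and stopping at the first chunk that would overflow the budget is a guess at the policy.

```typescript
// A sketch only: the real buildContext in src/rag/ may budget tokens differently.
interface RankedChunk {
  file_path: string;
  heading_path: string;
  line_start: number;
  line_end: number;
  text_raw: string;
}

// Word-based estimate, the same approximation described for the chunker.
function approxTokens(text: string): number {
  const words = text.split(/\s+/).filter((w) => w.length > 0).length;
  return Math.ceil(words * 1.3);
}

function buildContext(results: RankedChunk[], maxTokens: number): string {
  const blocks: string[] = [];
  let used = 0;
  for (const r of results) {
    const block =
      `Source: ${r.file_path}\n` +
      `Section: ${r.heading_path}\n` +
      `Lines: ${r.line_start}-${r.line_end}\n\n` +
      r.text_raw;
    const cost = approxTokens(block);
    // Results arrive ranked, so stop at the first block that would overflow.
    if (used + cost > maxTokens) break;
    blocks.push(block);
    used += cost;
  }
  return blocks.join("\n\n---\n\n");
}
```

Counting the attribution header against the budget keeps the final string within `maxTokens` under the same estimate, at the cost of fitting slightly fewer chunks.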
698
+
699
+ **`ragSystemPrompt()`** returns a baseline system prompt for RAG-style agents instructing the model on how to interpret sourced context.
700
+
701
+ ---
702
+
703
+ ## Configuration Reference
704
+
705
+ All settings live in `config.yaml` (or the file pointed to by `--config` / `$SQMD_CONFIG`).
706
+
707
+ ### `paths`
708
+
709
+ | Key | Default | Description |
710
+ |-----|---------|-------------|
711
+ | `watch_dirs` | `["~/notes"]` | Directories to index and watch |
712
+ | `db_path` | `~/.sqmd/lancedb` | LanceDB database location |
713
+ | `model_cache_dir` | `~/.sqmd/models` | Directory for cached embedding models |
714
+
715
+ ### `embeddings`
716
+
717
+ | Key | Default | Description |
718
+ |-----|---------|-------------|
719
+ | `backend` | `"transformers"` | `"transformers"` (ONNX) or `"ollama"` |
720
+ | `model` | `"nomic-ai/nomic-embed-text-v1.5"` | HuggingFace model ID or Ollama model name |
721
+ | `batch_size` | `64` | Texts per embedding batch |
722
+ | `ollama_base_url` | `"http://localhost:11434"` | Ollama server URL (used only when backend is `"ollama"`) |
723
+
724
+ ### `chunking`
725
+
726
+ | Key | Default | Description |
727
+ |-----|---------|-------------|
728
+ | `max_tokens` | `512` | Maximum tokens per chunk before splitting |
729
+ | `min_chars` | `50` | Minimum characters; shorter chunks are discarded |
730
+ | `include_breadcrumb` | `true` | Prepend `"Section: H1 > H2 > H3\n\n"` to chunk text for richer embeddings |
731
+ | `overlap_tokens` | `64` | Carry-over tokens between adjacent sub-chunks when a section is split |
732
+
733
+ ### `search`
734
+
735
+ | Key | Default | Description |
736
+ |-----|---------|-------------|
737
+ | `default_top_k` | `10` | Default number of results |
738
+ | `rrf_k` | `60` | RRF constant (`k` in `1/(k + rank)`) — higher values reduce outlier impact |
739
+ | `rerank` | `false` | Enable cross-encoder reranking globally |
740
+ | `rerank_model` | `"cross-encoder/ms-marco-MiniLM-L-6-v2"` | ONNX cross-encoder model |
741
+ | `rerank_top_n` | `20` | Fetch this many candidates before reranking to `top_k` |
742
+
743
+ ### `watcher`
744
+
745
+ | Key | Default | Description |
746
+ |-----|---------|-------------|
747
+ | `enabled` | `true` | Auto-start file watcher when `serve` runs |
748
+ | `debounce_ms` | `3000` | Wait this long after the last change before re-indexing |
749
+ | `extensions` | `[".md", ".mdx"]` | File extensions to watch |
750
+ | `ignore_patterns` | `["**/.git/**", "**/node_modules/**"]` | Glob patterns to ignore |
751
+
752
+ ### `api`
753
+
754
+ | Key | Default | Description |
755
+ |-----|---------|-------------|
756
+ | `host` | `"127.0.0.1"` | Bind address |
757
+ | `port` | `7832` | HTTP port |
758
+ | `api_key` | `""` | API key for bearer auth; empty disables auth |
759
+
760
+ ### `mcp`
761
+
762
+ | Key | Default | Description |
763
+ |-----|---------|-------------|
764
+ | `transport` | `"stdio"` | `"stdio"` or `"sse"` |
765
+ | `sse_port` | `7833` | Port for SSE transport |
766
+
767
+ ### Environment Variable Overrides
768
+
769
+ | Variable | Config Key |
770
+ |----------|------------|
771
+ | `SQMD_CONFIG` | Config file path |
772
+ | `SQMD_DB_PATH` | `paths.db_path` |
773
+ | `SQMD_API_PORT` | `api.port` |
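
Applying the overrides in the table above to an already loaded config might look like this. The sketch is hypothetical: `PartialConfig` covers only the two keys that have overrides, `applyEnvOverrides` is an illustrative name, and rejecting an invalid port is an assumed behavior rather than something this README states.

```typescript
// Hypothetical shape covering only the keys that have environment overrides.
interface PartialConfig {
  paths: { db_path: string };
  api: { port: number };
}

function applyEnvOverrides(
  config: PartialConfig,
  env: Record<string, string | undefined>,
): PartialConfig {
  // Copy so the loaded config object is left untouched.
  const next: PartialConfig = {
    paths: { ...config.paths },
    api: { ...config.api },
  };

  const dbPath = env["SQMD_DB_PATH"];
  if (dbPath) next.paths.db_path = dbPath;

  const rawPort = env["SQMD_API_PORT"];
  if (rawPort) {
    const port = Number(rawPort);
    // Assumed behavior: reject values that are not a valid TCP port.
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`SQMD_API_PORT is not a valid port: ${rawPort}`);
    }
    next.api.port = port;
  }
  return next;
}
```

In a Node.js process this would be called as `applyEnvOverrides(config, process.env)`.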
774
+
775
+ ---
776
+
777
+ ## Architecture Deep Dive
778
+
779
+ ### Chunking Algorithm
780
+
781
+ The chunker (`src/ingestion/chunker.ts`) implements a hierarchical, token-aware strategy inspired by PageIndex's TOC-based approach:
782
+
783
+ 1. **Parse** — `remark-parse` converts Markdown to an mdast AST with precise line number tracking.
784
+
785
+ 2. **Build section tree** — The AST walker maintains a heading stack. Every content block (paragraphs, lists, code blocks) is assigned to its nearest ancestor heading.
786
+
787
+ 3. **Inject breadcrumb** — When `include_breadcrumb` is enabled, each chunk's `text` field is prefixed with `"Section: H1 > H2 > H3\n\n"`. This prefix is embedded alongside the content, giving the vector model full hierarchical context. The `text_raw` field always contains the unprefixed content for display.
788
+
789
+ 4. **Token-aware splitting** — Sections exceeding `max_tokens` (default 512) are split at paragraph boundaries. The last paragraph of each chunk is carried over into the next when it fits within `overlap_tokens` (default 64), maintaining cross-chunk coherence.
790
+
791
+ 5. **Stub filtering** — Chunks with `text_raw.length < min_chars` (default 50) are discarded.
792
+
793
+ 6. **Preamble handling** — Content before the first heading becomes `heading_level = 0` with the filename stem as the breadcrumb.
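
Steps 2, 3, and 6 can be illustrated with a heading stack over a flat list of blocks. This is a simplified sketch: the real parser walks a remark AST, while the `Block` and `Section` types and the `buildSections` and `withBreadcrumb` functions here are hypothetical stand-ins.

```typescript
// Sketch of steps 2, 3 and 6 over a hypothetical flat block list.
type Block =
  | { kind: "heading"; level: number; text: string }
  | { kind: "content"; text: string };

interface Section {
  heading_level: number;
  heading_path: string;
  text_raw: string;
}

function buildSections(blocks: Block[], fileStem: string): Section[] {
  const stack: { level: number; text: string }[] = [];
  const sections: Section[] = [];
  // Step 6: content before the first heading uses level 0 and the file stem.
  let current: Section = { heading_level: 0, heading_path: fileStem, text_raw: "" };

  const flush = (): void => {
    if (current.text_raw.length > 0) sections.push(current);
  };

  for (const block of blocks) {
    if (block.kind === "heading") {
      flush();
      // Pop headings at the same or a deeper level, then push the new one.
      while (stack.length > 0) {
        const top = stack[stack.length - 1];
        if (top === undefined || top.level < block.level) break;
        stack.pop();
      }
      stack.push({ level: block.level, text: block.text });
      current = {
        heading_level: block.level,
        heading_path: stack.map((h) => h.text).join(" > "),
        text_raw: "",
      };
    } else {
      current.text_raw += (current.text_raw ? "\n\n" : "") + block.text;
    }
  }
  flush();
  return sections;
}

// Step 3: the embedded text carries the breadcrumb; text_raw does not.
function withBreadcrumb(section: Section): string {
  return `Section: ${section.heading_path}\n\n${section.text_raw}`;
}
```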
794
+
795
+ Token estimation uses `Math.ceil(words * 1.3)` — a fast approximation that overestimates slightly to avoid over-long chunks.
796
+
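The steps above can be sketched roughly as follows. This is not the code in `src/ingestion/chunker.ts`: `estimateTokens` follows the formula quoted above, while `splitSection` and `withBreadcrumb` are hypothetical helpers that assume greedy paragraph packing and a single carried-over paragraph.

```typescript
// Token estimate as described above: ceil(words * 1.3).
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words * 1.3);
}

// Hypothetical helper: pack paragraphs greedily up to maxTokens, carrying the
// last paragraph of a finished chunk into the next one when it fits the
// overlap budget.
function splitSection(
  paragraphs: string[],
  maxTokens = 512,
  overlapTokens = 64,
): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let tokens = 0;

  for (const para of paragraphs) {
    const t = estimateTokens(para);
    if (current.length > 0 && tokens + t > maxTokens) {
      chunks.push(current.join("\n\n"));
      const last = current[current.length - 1] as string;
      const lastTokens = estimateTokens(last);
      // Carry the overlap only if it fits the budget and still leaves room.
      const carry = lastTokens <= overlapTokens && lastTokens + t <= maxTokens;
      current = carry ? [last] : [];
      tokens = carry ? lastTokens : 0;
    }
    current.push(para);
    tokens += t;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}

// Breadcrumb prefix in the format quoted in step 3.
function withBreadcrumb(headings: string[], textRaw: string): string {
  return `Section: ${headings.join(" > ")}\n\n${textRaw}`;
}
```

In this sketch a single paragraph larger than `maxTokens` is emitted as its own oversized chunk; how the real chunker handles that case is not stated here.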
797
+ ---
798
+
799
+ ### Embedding Pipeline
800
+
801
+ The pipeline (`src/ingestion/pipeline.ts`) orchestrates indexing with bounded parallelism:
802
+
803
+ ```
804
+ scanDirectory()
805
+
806
+ ├── hashFile() → compare with stored hash
807
+ │ └── skip if unchanged (unless --force)
808
+
809
+ ├── parseMarkdown() → ParsedDocument
810
+ ├── chunkDocument() → ChunkRecord[] (vectors empty)
811
+
812
+ └── [collected into batches of batch_size * 4]
813
+
814
+     ├── embedder.embed(texts) → number[][]
815
+     └── upsertChunks() + upsertFile() → LanceDB
816
+ ```
817
+
818
+ Files are processed with `p-limit(4)` concurrency. Embedding batches are flushed when the pending buffer exceeds `batch_size * 4` (default 256 chunks), balancing memory usage and throughput.
819
+
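The flush rule described above could look something like the following. `EmbeddingBuffer`, `EmbedFn`, and `StoreFn` are hypothetical names for illustration; the real pipeline stores `ChunkRecord` objects and may flush differently.

```typescript
// Hypothetical function types standing in for the embedder and the store.
type EmbedFn = (texts: string[]) => Promise<number[][]>;
type StoreFn = (texts: string[], vectors: number[][]) => Promise<void>;

class EmbeddingBuffer {
  private pending: string[] = [];
  private readonly embed: EmbedFn;
  private readonly store: StoreFn;
  private readonly batchSize: number;

  constructor(embed: EmbedFn, store: StoreFn, batchSize = 64) {
    this.embed = embed;
    this.store = store;
    this.batchSize = batchSize;
  }

  // Queue texts; flush once the buffer exceeds batchSize * 4.
  async add(texts: string[]): Promise<void> {
    this.pending.push(...texts);
    if (this.pending.length > this.batchSize * 4) {
      await this.flush();
    }
  }

  // Drain the buffer in embedder-sized batches.
  async flush(): Promise<void> {
    while (this.pending.length > 0) {
      const batch = this.pending.splice(0, this.batchSize);
      const vectors = await this.embed(batch);
      await this.store(batch, vectors);
    }
  }
}
```

A caller would invoke `flush()` once more after the last file so that a partially filled buffer is not lost.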
820
+ After the first bulk index, `createIndexes()` builds:
821
+ - **IVF-PQ vector index** — `num_partitions: 256`, `num_sub_vectors: 96` (cosine metric)
822
+ - **Tantivy FTS index** — on the `text` field
823
+
824
+ ---
825
+
826
+ ### Hybrid Search & RRF
827
+
828
+ `src/search/hybrid.ts` fuses vector and full-text results using **Reciprocal Rank Fusion**:
829
+
830
+ ```
831
+ query
832
+
833
+ ├── prepareQueryForEmbedding() → "search_query: <query>" (nomic prefix)
834
+
835
+ ├── vectorSearch(vector, k*3) → ranked list A
836
+ └── ftsSearch(query, k*3) → ranked list B
837
+
838
+
839
+ RRF score(d) = Σ 1 / (60 + rank_i)
840
+
841
+
842
+ top-k by RRF score → SearchResult[]
843
+ ```
844
+
845
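+ The RRF constant `k=60` (configurable via `search.rrf_k`) controls how steeply rank differences penalise lower-ranked results: smaller values weight top-ranked hits more heavily, while larger values flatten the gap between adjacent ranks. Duplicate chunk IDs across lists are merged, summing their RRF scores.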
846
+
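For illustration, the fusion formula above can be written as a small standalone function. This is a sketch rather than the code in `src/search/hybrid.ts`: the name `rrfFuse` is hypothetical, inputs are reduced to ordered lists of chunk IDs, and ranks are assumed to be 1-based.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion over any number of ranked ID lists.
// Assumption: the first item in each list has rank 1.
function rrfFuse(rankedLists: string[][], topK: number, rrfK = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      // IDs present in several lists accumulate one term per list.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (rrfK + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// "b" appears in both lists, so it outranks "a" despite "a" being first in one.
const fused = rrfFuse([["a", "b", "c"], ["b", "d"]], 3);
console.log(fused.map((r) => r.id));
```

Because only ranks enter the formula, the cosine and BM25 scores never need to be normalised against each other.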
847
+ **Search modes:**
848
+ - `hybrid` — RRF fusion of both lists (recommended)
849
+ - `vector` — pure cosine ANN search only
850
+ - `fts` — pure BM25 full-text search only
851
+
852
+ **Optional reranking:** When enabled, the candidate set is widened from `top_k` to `rerank_top_n` (default 20) results, which are then rescored by a cross-encoder (`cross-encoder/ms-marco-MiniLM-L-6-v2`, ONNX). The cross-encoder processes each query and passage jointly for higher-precision ranking.
853
+
854
+ ---
855
+
856
+ ### Incremental Indexing
857
+
858
+ Change detection uses two layers:
859
+
860
+ 1. **Content hash** (`src/ingestion/fingerprint.ts`) — SHA-256 of file contents stored in the `files` table. On re-scan, the current hash is compared against the stored one; identical hashes skip the file entirely.
861
+
862
+ 2. **File watcher** (`src/watcher/`) — chokidar monitors `watch_dirs` for `add`, `change`, and `unlink` events. Events are debounced (default 3 s) to coalesce rapid saves. On `unlink`, the file's chunks are deleted from both tables.
863
+
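The content-hash layer can be approximated with Node's built-in `node:crypto`. The helper names below (`sha256Hex`, `needsReindex`) are hypothetical and only illustrate the comparison; `src/ingestion/fingerprint.ts` may read and hash files differently.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the given contents as a lowercase hex string.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Decide whether a file must be re-indexed.
// `storedHash` is undefined for files that were never indexed;
// `force` mirrors the --force flag mentioned in the pipeline diagram.
function needsReindex(
  contents: string | Uint8Array,
  storedHash: string | undefined,
  force = false,
): boolean {
  if (force) return true;
  return sha256Hex(contents) !== storedHash;
}
```

Hashing contents rather than comparing modification times means a file that was touched but not changed is still skipped.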
864
+ ---
865
+
866
+ ### LanceDB Schema
867
+
868
+ Two tables are maintained:
869
+
870
+ **`chunks`** — one row per chunk (core search table):
871
+
872
+ | Column | Type | Description |
873
+ |--------|------|-------------|
874
+ | `chunk_id` | Utf8 | `"{file_hash}:{section_idx}:{chunk_idx}"` |
875
+ | `file_id` | Utf8 | SHA-256 of the absolute file path |
876
+ | `file_path` | Utf8 | Absolute path |
877
+ | `file_hash` | Utf8 | Content hash (change detection) |
878
+ | `file_mtime` | Float64 | Epoch timestamp |
879
+ | `heading_path` | Utf8 | `"H1 > H2 > H3"` |
880
+ | `heading_level` | Int8 | 0 = preamble, 1–6 = heading depth |
881
+ | `heading_text` | Utf8 | Verbatim heading text |
882
+ | `section_index` | Int32 | Index of section within the file |
883
+ | `chunk_index` | Int32 | Index of chunk within the section |
884
+ | `text` | Utf8 | Breadcrumb-prefixed text (embedded) |
885
+ | `text_raw` | Utf8 | Display text (no breadcrumb) |
886
+ | `token_count` | Int32 | Approximate token count |
887
+ | `parent_headings` | List\<Utf8\> | Ancestor heading texts |
888
+ | `depth` | Int8 | Heading depth |
889
+ | `vector` | FixedSizeList(768, Float32) | Embedding vector |
890
+ | `line_start` | Int32 | First line in the source file |
891
+ | `line_end` | Int32 | Last line in the source file |
892
+
893
+ **`files`** — one row per indexed file:
894
+
895
+ | Column | Type | Description |
896
+ |--------|------|-------------|
897
+ | `file_id` | Utf8 | SHA-256 of path |
898
+ | `file_path` | Utf8 | Absolute path |
899
+ | `file_hash` | Utf8 | Content hash |
900
+ | `file_mtime` | Float64 | Last modification time |
901
+ | `chunk_count` | Int32 | Number of chunks |
902
+ | `indexed_at` | Float64 | Indexing timestamp |
903
+ | `status` | Utf8 | `"indexed"` \| `"error"` \| `"skipped"` |
904
+ | `error_msg` | Utf8 | Error details if status is `"error"` |
905
+
906
+ The vector dimension is `768` for `nomic-embed-text-v1.5`. For `bge-m3`, change `VECTOR_DIM` in `src/store/schema.ts` to `1024` before the first index.
907
+
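To make the identifier columns concrete, here is a small sketch of how the IDs in the tables above could be derived and how a dimension check might look. All three helpers are hypothetical, and the local `VECTOR_DIM` constant only mirrors the value described above rather than importing it from `src/store/schema.ts`.

```typescript
import { createHash } from "node:crypto";

// Mirrors the documented default; not imported from the real schema module.
const VECTOR_DIM = 768;

// file_id: SHA-256 of the absolute file path.
function fileIdFor(absolutePath: string): string {
  return createHash("sha256").update(absolutePath).digest("hex");
}

// chunk_id: "{file_hash}:{section_idx}:{chunk_idx}".
function chunkIdFor(fileHash: string, sectionIdx: number, chunkIdx: number): string {
  return `${fileHash}:${sectionIdx}:${chunkIdx}`;
}

// Fail early if an embedding does not match the fixed-size vector column.
function assertVectorDim(vector: number[], dim = VECTOR_DIM): void {
  if (vector.length !== dim) {
    throw new Error(`Expected ${dim}-dim vector, got ${vector.length}`);
  }
}
```

Because `chunk_id` embeds the content hash, any edit to a file yields new chunk IDs for every chunk in it.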
908
+ ---
909
+
910
+ ## Embedding Backends
911
+
912
+ ### Transformers.js (Default)
913
+
914
+ Uses `@huggingface/transformers` v3 with the ONNX runtime. No Python, no separate process. Models are downloaded once and cached locally.
915
+
916
+ **Model:** `nomic-ai/nomic-embed-text-v1.5` (768-dim, ~270 MB)
917
+
918
+ The nomic model uses asymmetric prefixes for higher accuracy:
919
+ - Documents are embedded as `"search_document: <text>"`
920
+ - Queries are embedded as `"search_query: <text>"`
921
+
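A minimal sketch of the prefixing, assuming it is applied as plain string concatenation just before embedding. `prepareQueryForEmbedding` is the name used in the hybrid search diagram; `prepareDocumentForEmbedding` and `usesNomicPrefixes` are hypothetical, and the model-name check is an assumption about how other models would be exempted.

```typescript
// Assumption: only nomic embedding models expect task prefixes.
function usesNomicPrefixes(model: string): boolean {
  return model.includes("nomic-embed-text");
}

// Prefix applied to chunk text at index time.
function prepareDocumentForEmbedding(text: string, model: string): string {
  return usesNomicPrefixes(model) ? `search_document: ${text}` : text;
}

// Prefix applied to the user's query at search time.
function prepareQueryForEmbedding(query: string, model: string): string {
  return usesNomicPrefixes(model) ? `search_query: ${query}` : query;
}
```

Mixing the two prefixes, for example embedding queries with the document prefix, tends to degrade retrieval quality, so both sides need to agree on the model in use.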
922
+ To use `bge-m3` (1024-dim, multilingual):
923
+
924
+ ```yaml
925
+ embeddings:
926
+ model: "BAAI/bge-m3"
927
+ ```
928
+
929
+ Also update `VECTOR_DIM = 1024` in `src/store/schema.ts` and rebuild.
930
+
931
+ ### Ollama
932
+
933
+ Point sqmd at a running [Ollama](https://ollama.ai) instance:
934
+
935
+ ```yaml
936
+ embeddings:
937
+ backend: "ollama"
938
+ model: "nomic-embed-text"
939
+ ollama_base_url: "http://localhost:11434"
940
+ ```
941
+
942
+ Ollama must be running with the model already pulled (`ollama pull nomic-embed-text`).
943
+
944
+ ---
945
+
946
+ ## Performance
947
+
948
+ | Operation | Typical Time | Notes |
949
+ |-----------|-------------|-------|
950
+ | Initial index (50k chunks) | 2–4 min | CPU; ONNX SIMD; batch size 64 |
951
+ | Single file re-index | < 1 s | Hash skip + targeted upsert |
952
+ | Search (hybrid, no rerank) | < 100 ms | IVF-PQ ANN + Tantivy BM25 + RRF |
953
+ | Search (with reranking) | 200–500 ms | Cross-encoder inference per candidate |
954
+ | Memory (idle) | ~400 MB | ONNX model ~200 MB + mmap'd LanceDB |
955
+
956
+ Embedding throughput scales with CPU core count: the ONNX runtime uses SIMD instructions and spreads work across the available threads automatically.
957
+
958
+ ---
959
+
960
+ ## Development
961
+
962
+ ### Running Tests
963
+
964
+ ```bash
965
+ npm test # run all tests once (vitest)
966
+ npm run test:watch # watch mode
967
+ ```
968
+
969
+ Test coverage:
970
+ - `tests/unit/chunker.test.ts` — hierarchical chunking, breadcrumbs, overlap, stub filtering
971
+ - `tests/unit/hybrid.test.ts` — RRF fusion logic (mocked DB)
972
+ - `tests/unit/config.test.ts` — config loading, validation, env var overrides
973
+ - `tests/integration/pipeline.test.ts` — full pipeline with temp LanceDB instance
974
+ - `tests/integration/api.test.ts` — Hono app endpoints (health, search, documents, index)
975
+
976
+ ### Building
977
+
978
+ ```bash
979
+ npm run build # tsc → dist/
980
+ npm run dev # tsx src/index.ts (no build step, for development)
981
+ ```
982
+
983
+ ### Project Conventions
984
+
985
+ - All source imports use `.js` extension for ESM compatibility (TypeScript resolves to `.ts` at compile time)
986
+ - Node built-ins use the `node:` prefix (`node:fs`, `node:path`, `node:crypto`)
987
+ - `src/config/schema.ts` is the single source of truth for all config types — do not duplicate config fields elsewhere
988
+ - Embedder is lazy-loaded on first use to avoid model download cost at startup for non-indexing commands
989
+ - The default `p-limit` concurrency is 4 files; adjust `concurrency` in `pipeline.run()` to suit I/O-bound or CPU-bound workloads
990
+
991
+ ---
992
+
993
+ ## Troubleshooting
994
+
995
+ **`Database may not be initialized. Run sqmd index first.`**
996
+
997
+ The LanceDB database doesn't exist yet. Run `node dist/index.js index` to create it.
998
+
999
+ ---
1000
+
1001
+ **`Path not found: ~/notes`**
1002
+
1003
+ The tilde in `watch_dirs` is expanded at runtime. Ensure the directory exists. Use an absolute path to be explicit:
1004
+
1005
+ ```yaml
1006
+ paths:
1007
+ watch_dirs:
1008
+ - "/Users/alice/notes"
1009
+ ```
1010
+
1011
+ ---
1012
+
1013
+ **First index takes a long time**
1014
+
1015
+ The embedding model (~270 MB) is downloaded on first use. Subsequent runs use the cache at `~/.sqmd/models`. Check disk space and network connectivity if the download stalls.
1016
+
1017
+ ---
1018
+
1019
+ **Search returns no results**
1020
+
1021
+ 1. Run `node dist/index.js status` to verify files were indexed
1022
+ 2. Check `errors` count — some files may have failed to parse
1023
+ 3. Try `--mode fts` first to verify full-text search works independently
1024
+ 4. Ensure you're using the same model for both indexing and search (config `embeddings.model`)
1025
+
1026
+ ---
1027
+
1028
+ **Vector dimension mismatch error**
1029
+
1030
+ If you change the embedding model after an initial index, the stored vectors will no longer match the new model's output dimension. Delete the database and re-index:
1031
+
1032
+ ```bash
1033
+ rm -rf ~/.sqmd/lancedb
1034
+ node dist/index.js index --force
1035
+ ```
1036
+
1037
+ ---
1038
+
1039
+ **Ollama connection refused**
1040
+
1041
+ Ensure Ollama is running (`ollama serve`) and the model is pulled (`ollama pull nomic-embed-text`). Verify `ollama_base_url` in config.
1042
+
1043
+ ---
1044
+
1045
+ **Port 7832 already in use**
1046
+
1047
+ Override with `--port` or in config:
1048
+
1049
+ ```yaml
1050
+ api:
1051
+ port: 8832
1052
+ ```