@itkoren/sqmd 0.1.0
- package/CHANGELOG.md +46 -0
- package/LICENSE +21 -0
- package/README.md +1052 -0
- package/dist/api/app.d.ts +14 -0
- package/dist/api/app.d.ts.map +1 -0
- package/dist/api/app.js +32 -0
- package/dist/api/app.js.map +1 -0
- package/dist/api/middleware.d.ts +5 -0
- package/dist/api/middleware.d.ts.map +1 -0
- package/dist/api/middleware.js +37 -0
- package/dist/api/middleware.js.map +1 -0
- package/dist/api/models.d.ts +178 -0
- package/dist/api/models.d.ts.map +1 -0
- package/dist/api/models.js +39 -0
- package/dist/api/models.js.map +1 -0
- package/dist/api/routes/documents.d.ts +4 -0
- package/dist/api/routes/documents.d.ts.map +1 -0
- package/dist/api/routes/documents.js +92 -0
- package/dist/api/routes/documents.js.map +1 -0
- package/dist/api/routes/health.d.ts +6 -0
- package/dist/api/routes/health.d.ts.map +1 -0
- package/dist/api/routes/health.js +38 -0
- package/dist/api/routes/health.js.map +1 -0
- package/dist/api/routes/index.d.ts +5 -0
- package/dist/api/routes/index.d.ts.map +1 -0
- package/dist/api/routes/index.js +83 -0
- package/dist/api/routes/index.js.map +1 -0
- package/dist/api/routes/search.d.ts +6 -0
- package/dist/api/routes/search.d.ts.map +1 -0
- package/dist/api/routes/search.js +104 -0
- package/dist/api/routes/search.js.map +1 -0
- package/dist/config/loader.d.ts +4 -0
- package/dist/config/loader.d.ts.map +1 -0
- package/dist/config/loader.js +144 -0
- package/dist/config/loader.js.map +1 -0
- package/dist/config/schema.d.ts +298 -0
- package/dist/config/schema.d.ts.map +1 -0
- package/dist/config/schema.js +50 -0
- package/dist/config/schema.js.map +1 -0
- package/dist/embeddings/ollama.d.ts +14 -0
- package/dist/embeddings/ollama.d.ts.map +1 -0
- package/dist/embeddings/ollama.js +46 -0
- package/dist/embeddings/ollama.js.map +1 -0
- package/dist/embeddings/transformers.d.ts +14 -0
- package/dist/embeddings/transformers.d.ts.map +1 -0
- package/dist/embeddings/transformers.js +64 -0
- package/dist/embeddings/transformers.js.map +1 -0
- package/dist/embeddings/types.d.ts +6 -0
- package/dist/embeddings/types.d.ts.map +1 -0
- package/dist/embeddings/types.js +2 -0
- package/dist/embeddings/types.js.map +1 -0
- package/dist/index.d.ts +3 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +233 -0
- package/dist/index.js.map +1 -0
- package/dist/ingestion/chunker.d.ts +21 -0
- package/dist/ingestion/chunker.d.ts.map +1 -0
- package/dist/ingestion/chunker.js +117 -0
- package/dist/ingestion/chunker.js.map +1 -0
- package/dist/ingestion/fingerprint.d.ts +6 -0
- package/dist/ingestion/fingerprint.d.ts.map +1 -0
- package/dist/ingestion/fingerprint.js +17 -0
- package/dist/ingestion/fingerprint.js.map +1 -0
- package/dist/ingestion/parser.d.ts +16 -0
- package/dist/ingestion/parser.d.ts.map +1 -0
- package/dist/ingestion/parser.js +98 -0
- package/dist/ingestion/parser.js.map +1 -0
- package/dist/ingestion/pipeline.d.ts +32 -0
- package/dist/ingestion/pipeline.d.ts.map +1 -0
- package/dist/ingestion/pipeline.js +191 -0
- package/dist/ingestion/pipeline.js.map +1 -0
- package/dist/ingestion/scanner.d.ts +2 -0
- package/dist/ingestion/scanner.d.ts.map +1 -0
- package/dist/ingestion/scanner.js +54 -0
- package/dist/ingestion/scanner.js.map +1 -0
- package/dist/mcp/server.d.ts +8 -0
- package/dist/mcp/server.d.ts.map +1 -0
- package/dist/mcp/server.js +73 -0
- package/dist/mcp/server.js.map +1 -0
- package/dist/mcp/tools.d.ts +6 -0
- package/dist/mcp/tools.d.ts.map +1 -0
- package/dist/mcp/tools.js +276 -0
- package/dist/mcp/tools.js.map +1 -0
- package/dist/rag/context-builder.d.ts +3 -0
- package/dist/rag/context-builder.d.ts.map +1 -0
- package/dist/rag/context-builder.js +27 -0
- package/dist/rag/context-builder.js.map +1 -0
- package/dist/rag/prompt-templates.d.ts +5 -0
- package/dist/rag/prompt-templates.d.ts.map +1 -0
- package/dist/rag/prompt-templates.js +41 -0
- package/dist/rag/prompt-templates.js.map +1 -0
- package/dist/search/hybrid.d.ts +14 -0
- package/dist/search/hybrid.d.ts.map +1 -0
- package/dist/search/hybrid.js +58 -0
- package/dist/search/hybrid.js.map +1 -0
- package/dist/search/query.d.ts +4 -0
- package/dist/search/query.d.ts.map +1 -0
- package/dist/search/query.js +23 -0
- package/dist/search/query.js.map +1 -0
- package/dist/search/reranker.d.ts +11 -0
- package/dist/search/reranker.d.ts.map +1 -0
- package/dist/search/reranker.js +44 -0
- package/dist/search/reranker.js.map +1 -0
- package/dist/store/db.d.ts +11 -0
- package/dist/store/db.d.ts.map +1 -0
- package/dist/store/db.js +75 -0
- package/dist/store/db.js.map +1 -0
- package/dist/store/reader.d.ts +8 -0
- package/dist/store/reader.d.ts.map +1 -0
- package/dist/store/reader.js +122 -0
- package/dist/store/reader.js.map +1 -0
- package/dist/store/schema.d.ts +39 -0
- package/dist/store/schema.d.ts.map +1 -0
- package/dist/store/schema.js +33 -0
- package/dist/store/schema.js.map +1 -0
- package/dist/store/writer.d.ts +6 -0
- package/dist/store/writer.d.ts.map +1 -0
- package/dist/store/writer.js +43 -0
- package/dist/store/writer.js.map +1 -0
- package/dist/watcher/daemon.d.ts +5 -0
- package/dist/watcher/daemon.d.ts.map +1 -0
- package/dist/watcher/daemon.js +43 -0
- package/dist/watcher/daemon.js.map +1 -0
- package/dist/watcher/handler.d.ts +14 -0
- package/dist/watcher/handler.d.ts.map +1 -0
- package/dist/watcher/handler.js +82 -0
- package/dist/watcher/handler.js.map +1 -0
- package/package.json +56 -0
package/README.md
ADDED
|
@@ -0,0 +1,1052 @@
|
|
|
1
|
+
# sqmd
|
|
2
|
+
|
|
3
|
+
A fully local, high-performance semantic search engine for Markdown files. Index your notes, documentation, or any collection of `.md` / `.mdx` files and query them with natural language — no external API keys, no cloud services, no data leaving your machine.
|
|
4
|
+
|
|
5
|
+
Designed to serve both humans (CLI + REST API) and AI agents (MCP server), with a RAG-ready output layer for use as an agent memory backend.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## Table of Contents
|
|
10
|
+
|
|
11
|
+
- [Features](#features)
- [Architecture Overview](#architecture-overview)
- [Technology Stack](#technology-stack)
- [Project Structure](#project-structure)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
  - [Initial Configuration](#initial-configuration)
  - [First Index](#first-index)
- [CLI Reference](#cli-reference)
  - [index](#index)
  - [search](#search)
  - [serve](#serve)
  - [mcp](#mcp)
  - [status](#status)
  - [config](#config)
- [REST API](#rest-api)
  - [Search](#search-endpoints)
  - [Documents](#document-endpoints)
  - [Index Management](#index-management-endpoints)
  - [Health & Metrics](#health--metrics-endpoints)
  - [Authentication](#authentication)
- [MCP Server](#mcp-server)
  - [Tools](#mcp-tools)
  - [Resources](#mcp-resources)
  - [Claude Desktop Integration](#claude-desktop-integration)
- [RAG Layer](#rag-layer)
- [Configuration Reference](#configuration-reference)
- [Architecture Deep Dive](#architecture-deep-dive)
  - [Chunking Algorithm](#chunking-algorithm)
  - [Embedding Pipeline](#embedding-pipeline)
  - [Hybrid Search & RRF](#hybrid-search--rrf)
  - [Incremental Indexing](#incremental-indexing)
  - [LanceDB Schema](#lancedb-schema)
- [Embedding Backends](#embedding-backends)
  - [Transformers.js (Default)](#transformersjs-default)
  - [Ollama](#ollama)
- [Performance](#performance)
- [Development](#development)
  - [Running Tests](#running-tests)
  - [Project Conventions](#project-conventions)
- [Troubleshooting](#troubleshooting)
|
|
53
|
+
|
|
54
|
+
---
|
|
55
|
+
|
|
56
|
+
## Features
|
|
57
|
+
|
|
58
|
+
- **Fully local** — all embeddings, vector storage, and search run on-device
|
|
59
|
+
- **Hierarchical chunking** — sections are split following the document's heading structure, preserving semantic context
|
|
60
|
+
- **Hybrid search** — combines dense vector search (cosine ANN) and sparse full-text search (BM25/Tantivy) fused via Reciprocal Rank Fusion
|
|
61
|
+
- **Incremental indexing** — SHA-256 fingerprinting skips unchanged files; filesystem watcher triggers re-indexing automatically
|
|
62
|
+
- **Multiple interfaces** — CLI, REST API (Hono), and MCP server for AI agents
|
|
63
|
+
- **RAG output** — context builder assembles ranked chunks into token-budgeted context windows with source attribution
|
|
64
|
+
- **Optional reranking** — cross-encoder reranking (ONNX) for higher-precision results
|
|
65
|
+
- **Two embedding backends** — Transformers.js ONNX (default, bundled) or Ollama HTTP
|
|
66
|
+
- **Type-safe configuration** — Zod-validated YAML config with environment variable overrides
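
The incremental indexing bullet above (SHA-256 fingerprinting that skips unchanged files) can be pictured with a minimal sketch. It assumes the index remembers one content hash per file path; the names `sha256Hex`, `StoredFingerprint`, and `needsReindex` are hypothetical and are not taken from the package.

```typescript
import { createHash } from "node:crypto";

/** Hex-encoded SHA-256 digest of a string or byte buffer. */
export function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

/** Hypothetical record of what the index remembers about a file. */
interface StoredFingerprint {
  filePath: string;
  contentHash: string;
}

/**
 * Returns true when a file should be (re-)indexed: it was never seen,
 * its content hash changed, or the caller forces a full re-index.
 */
export function needsReindex(
  filePath: string,
  content: string,
  stored: Map<string, StoredFingerprint>,
  force = false,
): boolean {
  if (force) return true;
  const previous = stored.get(filePath);
  return previous === undefined || previous.contentHash !== sha256Hex(content);
}
```

Hashing the content rather than comparing modification times means that touching a file without changing it does not trigger re-embedding.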
|
|
67
|
+
|
|
68
|
+
---
|
|
69
|
+
|
|
70
|
+
## Architecture Overview
|
|
71
|
+
|
|
72
|
+
```
┌─────────────────────────────────────────────────────────────────┐
│                           Interfaces                            │
│  CLI (Commander)    REST API (Hono)    MCP Server (stdio/SSE)   │
└────────────┬──────────────┬──────────────────┬─────────────────┘
             │              │                  │
             ▼              ▼                  ▼
┌─────────────────────────────────────────────────────────────────┐
│                          Search Layer                           │
│           Query preprocessing → Hybrid RRF → Reranker           │
└────────────────────────────┬────────────────────────────────────┘
                             │
             ┌───────────────┼───────────────┐
             ▼               ▼               ▼
        Vector ANN       BM25 FTS       RAG Context
         (LanceDB)       (Tantivy)        Builder
             │               │
             └───────┬───────┘
                     ▼
┌─────────────────────────────────────────────────────────────────┐
│                          Storage Layer                          │
│                 LanceDB (chunks + files tables)                 │
└────────────────────────────┬────────────────────────────────────┘
                             │
┌─────────────────────────────────────────────────────────────────┐
│                       Ingestion Pipeline                        │
│   Scanner → Parser (remark AST) → Chunker → Embedder → Writer   │
└──────────────┬───────────────────────────────────────┬──────────┘
               │                                       │
          File System                           Transformers.js
       (chokidar watch)                          ONNX / Ollama
```
|
|
104
|
+
|
|
105
|
+
---
|
|
106
|
+
|
|
107
|
+
## Technology Stack
|
|
108
|
+
|
|
109
|
+
| Component | Library | Rationale |
|
|
110
|
+
|-----------|---------|-----------|
|
|
111
|
+
| Language | TypeScript (Node.js ≥22) | Type safety; ESM native; no GIL |
|
|
112
|
+
| Vector DB | `@lancedb/lancedb` | Embedded hybrid vector + BM25; no separate process |
|
|
113
|
+
| Embeddings | `@huggingface/transformers` v3 | ONNX runtime, 2–3× faster than PyTorch on CPU |
|
|
114
|
+
| MD Parsing | `remark` / `remark-parse` | Full mdast AST with line positions |
|
|
115
|
+
| REST Server | `hono` + `@hono/node-server` | ~3× faster than Express; excellent TypeScript DX |
|
|
116
|
+
| MCP Server | `@modelcontextprotocol/sdk` | Official Anthropic reference SDK |
|
|
117
|
+
| File Watch | `chokidar` v3 | Native FSEvents on macOS; debounce built-in |
|
|
118
|
+
| CLI | `commander` | Lightweight, typed |
|
|
119
|
+
| Config | `zod` + `js-yaml` | Runtime-validated config |
|
|
120
|
+
| Concurrency | `p-limit` | Bounded parallelism for the indexing pipeline |
|
|
121
|
+
|
|
122
|
+
---
|
|
123
|
+
|
|
124
|
+
## Project Structure
|
|
125
|
+
|
|
126
|
+
```
sqmd/
├── src/
│   ├── index.ts                    # CLI entrypoint
│   ├── config/
│   │   ├── schema.ts               # Zod config schemas + TypeScript types
│   │   └── loader.ts               # YAML loading, ~ expansion, env overrides
│   ├── ingestion/
│   │   ├── scanner.ts              # Recursive async file discovery
│   │   ├── parser.ts               # remark AST → Section[] with line numbers
│   │   ├── chunker.ts              # Hierarchical token-aware chunking
│   │   ├── fingerprint.ts          # SHA-256 content + path hashing
│   │   └── pipeline.ts             # Full index orchestration (scan→chunk→embed→store)
│   ├── embeddings/
│   │   ├── types.ts                # Embedder interface
│   │   ├── transformers.ts         # Transformers.js ONNX backend
│   │   └── ollama.ts               # Ollama HTTP backend
│   ├── store/
│   │   ├── schema.ts               # Apache Arrow schemas + TypeScript record types
│   │   ├── db.ts                   # LanceDB connection management
│   │   ├── writer.ts               # Upsert / delete operations
│   │   └── reader.ts               # Vector search, FTS search, file/chunk queries
│   ├── search/
│   │   ├── query.ts                # Query preprocessing and prefix injection
│   │   ├── hybrid.ts               # RRF fusion (vector + BM25)
│   │   └── reranker.ts             # Optional cross-encoder reranking
│   ├── watcher/
│   │   ├── handler.ts              # chokidar event handler + debounce
│   │   └── daemon.ts               # Long-running watcher lifecycle
│   ├── api/
│   │   ├── app.ts                  # Hono application factory
│   │   ├── middleware.ts           # Auth, CORS, request logging
│   │   ├── models.ts               # Zod request/response schemas
│   │   └── routes/
│   │       ├── health.ts           # GET /api/v1/health
│   │       ├── search.ts           # POST|GET /api/v1/search
│   │       ├── documents.ts        # GET /api/v1/documents[/:id]
│   │       └── index.ts            # POST /api/v1/index/trigger, GET /api/v1/index/status
│   ├── mcp/
│   │   ├── tools.ts                # MCP tool + resource implementations
│   │   └── server.ts               # MCP server (stdio + SSE transports)
│   └── rag/
│       ├── context-builder.ts      # Token-budgeted context window assembly
│       └── prompt-templates.ts     # System prompts for RAG use
├── tests/
│   ├── unit/
│   │   ├── chunker.test.ts
│   │   ├── hybrid.test.ts
│   │   └── config.test.ts
│   └── integration/
│       ├── pipeline.test.ts
│       └── api.test.ts
├── config.yaml                     # Default configuration (copy to ~/.sqmd/)
├── package.json
└── tsconfig.json
```
|
|
182
|
+
|
|
183
|
+
---
|
|
184
|
+
|
|
185
|
+
## Getting Started
|
|
186
|
+
|
|
187
|
+
### Prerequisites
|
|
188
|
+
|
|
189
|
+
- **Node.js 22+** — required for native ESM and modern `node:` builtins
|
|
190
|
+
- **pnpm** (recommended) or npm
|
|
191
|
+
|
|
192
|
+
```bash
|
|
193
|
+
node --version # must be ≥ 22.0.0
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
### Installation
|
|
197
|
+
|
|
198
|
+
**From source:**
|
|
199
|
+
|
|
200
|
+
```bash
|
|
201
|
+
git clone <repo>
|
|
202
|
+
cd sqmd
|
|
203
|
+
npm install # or: pnpm install
|
|
204
|
+
npm run build # compiles TypeScript → dist/
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
**Global install (after build):**
|
|
208
|
+
|
|
209
|
+
```bash
|
|
210
|
+
npm install -g .
|
|
211
|
+
sqmd --version
|
|
212
|
+
```
|
|
213
|
+
|
|
214
|
+
### Initial Configuration
|
|
215
|
+
|
|
216
|
+
Write the default configuration file:
|
|
217
|
+
|
|
218
|
+
```bash
|
|
219
|
+
node dist/index.js config --init ~/.sqmd/config.yaml
|
|
220
|
+
```
|
|
221
|
+
|
|
222
|
+
Then edit `~/.sqmd/config.yaml` to set the directories you want to index:
|
|
223
|
+
|
|
224
|
+
```yaml
paths:
  watch_dirs:
    - "~/notes"
    - "~/work/docs"
  db_path: "~/.sqmd/lancedb"
```
|
|
231
|
+
|
|
232
|
+
The tool resolves config in this order:
|
|
233
|
+
1. Path from `--config` flag
|
|
234
|
+
2. `$SQMD_CONFIG` environment variable
|
|
235
|
+
3. `~/.sqmd/config.yaml`
|
|
236
|
+
4. `./config.yaml` (project-local)
|
|
237
|
+
5. Built-in defaults
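
The resolution order listed above can be sketched as a small pure function. This is an illustration under stated assumptions, not the package's loader: the names `ConfigLookup` and `resolveConfigPath` are hypothetical, an explicit `--config` or `$SQMD_CONFIG` value is assumed to win even if the file is missing (so a typo surfaces as an error later), and a `null` result stands for "use built-in defaults".

```typescript
/** Hypothetical inputs for the lookup; file access is injected so the logic stays testable. */
interface ConfigLookup {
  cliConfigPath?: string;   // value of --config, if given
  envConfigPath?: string;   // value of $SQMD_CONFIG, if set
  homeDir: string;          // the user's home directory
  cwd: string;              // current working directory
  fileExists: (path: string) => boolean;
}

/** Returns the config file to load, or null when built-in defaults apply. */
function resolveConfigPath(lookup: ConfigLookup): string | null {
  // 1. and 2.: explicit choices take precedence over discovered files.
  if (lookup.cliConfigPath) return lookup.cliConfigPath;
  if (lookup.envConfigPath) return lookup.envConfigPath;

  // 3. and 4.: well-known locations, used only when the file is present.
  const candidates = [
    `${lookup.homeDir}/.sqmd/config.yaml`,
    `${lookup.cwd}/config.yaml`,
  ];
  for (const candidate of candidates) {
    if (lookup.fileExists(candidate)) return candidate;
  }

  // 5.: nothing found.
  return null;
}
```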
|
|
238
|
+
|
|
239
|
+
### First Index
|
|
240
|
+
|
|
241
|
+
```bash
|
|
242
|
+
node dist/index.js index
|
|
243
|
+
```
|
|
244
|
+
|
|
245
|
+
On first run, the embedding model (`nomic-ai/nomic-embed-text-v1.5`, ~270 MB) is downloaded and cached to `~/.sqmd/models`. Subsequent runs use the cached model.
|
|
246
|
+
|
|
247
|
+
---
|
|
248
|
+
|
|
249
|
+
## CLI Reference
|
|
250
|
+
|
|
251
|
+
All commands accept `--config <path>` to specify a non-default config file.
|
|
252
|
+
|
|
253
|
+
### `index`
|
|
254
|
+
|
|
255
|
+
Scan and index Markdown files.
|
|
256
|
+
|
|
257
|
+
```
sqmd index [options]

Options:
  --path <path>     Directory or single file to index (default: watch_dirs from config)
  --force           Re-index all files, even if content is unchanged
  --watch           Keep running and re-index files as they change
  --config <path>   Config file path
```
|
|
266
|
+
|
|
267
|
+
**Examples:**
|
|
268
|
+
|
|
269
|
+
```bash
|
|
270
|
+
# Index default watch_dirs
|
|
271
|
+
node dist/index.js index
|
|
272
|
+
|
|
273
|
+
# Index a specific directory
|
|
274
|
+
node dist/index.js index --path ~/work/docs
|
|
275
|
+
|
|
276
|
+
# Force full re-index (ignores change detection)
|
|
277
|
+
node dist/index.js index --force
|
|
278
|
+
|
|
279
|
+
# Index then keep watching
|
|
280
|
+
node dist/index.js index --watch
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
Progress is printed per-file. A summary reports indexed, skipped (unchanged), and errored files.
|
|
284
|
+
|
|
285
|
+
---
|
|
286
|
+
|
|
287
|
+
### `search`
|
|
288
|
+
|
|
289
|
+
Query the index from the terminal.
|
|
290
|
+
|
|
291
|
+
```
sqmd search <query> [options]

Arguments:
  query             Natural language search query (quote multi-word queries)

Options:
  --top-k <n>       Number of results to return (default: 10)
  --mode <mode>     hybrid | vector | fts (default: hybrid)
  --filter <path>   Restrict results to files whose path contains this substring
  --config <path>   Config file path
```
|
|
303
|
+
|
|
304
|
+
**Examples:**
|
|
305
|
+
|
|
306
|
+
```bash
|
|
307
|
+
# Semantic search
|
|
308
|
+
node dist/index.js search "how to configure authentication"
|
|
309
|
+
|
|
310
|
+
# Full-text only
|
|
311
|
+
node dist/index.js search "OAuth token refresh" --mode fts
|
|
312
|
+
|
|
313
|
+
# Top 5 results scoped to a directory
|
|
314
|
+
node dist/index.js search "deployment strategy" --top-k 5 --filter /work/
|
|
315
|
+
```
|
|
316
|
+
|
|
317
|
+
Output includes file path, heading breadcrumb, score, line range, and a 200-character snippet.
|
|
318
|
+
|
|
319
|
+
---
|
|
320
|
+
|
|
321
|
+
### `serve`
|
|
322
|
+
|
|
323
|
+
Start the HTTP REST API server.
|
|
324
|
+
|
|
325
|
+
```
sqmd serve [options]

Options:
  --host <host>     Bind address (default: 127.0.0.1)
  --port <port>     Port (default: 7832)
  --config <path>   Config file path
```
|
|
333
|
+
|
|
334
|
+
```bash
|
|
335
|
+
node dist/index.js serve
|
|
336
|
+
# → Listening on http://127.0.0.1:7832
|
|
337
|
+
```
|
|
338
|
+
|
|
339
|
+
If `watcher.enabled` is `true` in config, the file watcher starts automatically alongside the API server.
|
|
340
|
+
|
|
341
|
+
---
|
|
342
|
+
|
|
343
|
+
### `mcp`
|
|
344
|
+
|
|
345
|
+
Start the Model Context Protocol server.
|
|
346
|
+
|
|
347
|
+
```
sqmd mcp [options]

Options:
  --transport <transport>   stdio | sse (default: stdio)
  --port <port>             Port for SSE transport (default: 7833)
  --config <path>           Config file path
```
|
|
355
|
+
|
|
356
|
+
```bash
|
|
357
|
+
# For Claude Desktop / Claude Code (stdio)
|
|
358
|
+
node dist/index.js mcp
|
|
359
|
+
|
|
360
|
+
# For HTTP-based agents (SSE)
|
|
361
|
+
node dist/index.js mcp --transport sse --port 7833
|
|
362
|
+
```
|
|
363
|
+
|
|
364
|
+
---
|
|
365
|
+
|
|
366
|
+
### `status`
|
|
367
|
+
|
|
368
|
+
Display index statistics.
|
|
369
|
+
|
|
370
|
+
```bash
|
|
371
|
+
node dist/index.js status
|
|
372
|
+
```
|
|
373
|
+
|
|
374
|
+
Output:
|
|
375
|
+
|
|
376
|
+
```
|
|
377
|
+
sqmd Status
|
|
378
|
+
────────────────────────────────────────
|
|
379
|
+
DB path: ~/.sqmd/lancedb
|
|
380
|
+
Files indexed: 142
|
|
381
|
+
Chunks stored: 3847
|
|
382
|
+
Last indexed: 3/17/2026, 09:14:32 AM
|
|
383
|
+
Watch dirs: ~/notes
|
|
384
|
+
Embedder: transformers / nomic-ai/nomic-embed-text-v1.5
|
|
385
|
+
```
|
|
386
|
+
|
|
387
|
+
---
|
|
388
|
+
|
|
389
|
+
### `config`
|
|
390
|
+
|
|
391
|
+
Manage configuration.
|
|
392
|
+
|
|
393
|
+
```bash
|
|
394
|
+
# Write default config to a path
|
|
395
|
+
node dist/index.js config --init ~/.sqmd/config.yaml
|
|
396
|
+
```
|
|
397
|
+
|
|
398
|
+
---
|
|
399
|
+
|
|
400
|
+
## REST API
|
|
401
|
+
|
|
402
|
+
Base URL: `http://localhost:7832/api/v1`
|
|
403
|
+
|
|
404
|
+
All responses are JSON. Errors use `{ "error": "...", "message": "..." }`.
|
|
405
|
+
|
|
406
|
+
### Search Endpoints
|
|
407
|
+
|
|
408
|
+
#### `POST /api/v1/search`
|
|
409
|
+
|
|
410
|
+
```json
|
|
411
|
+
// Request body
|
|
412
|
+
{
|
|
413
|
+
"query": "how to set up two-factor authentication",
|
|
414
|
+
"top_k": 10,
|
|
415
|
+
"mode": "hybrid",
|
|
416
|
+
"filter_path": "/notes/security",
|
|
417
|
+
"include_context": false,
|
|
418
|
+
"rerank": false
|
|
419
|
+
}
|
|
420
|
+
```
|
|
421
|
+
|
|
422
|
+
| Field | Type | Default | Description |
|
|
423
|
+
|-------|------|---------|-------------|
|
|
424
|
+
| `query` | string | **required** | Natural language query |
|
|
425
|
+
| `top_k` | number | 10 | Number of results |
|
|
426
|
+
| `mode` | `"hybrid"` \| `"vector"` \| `"fts"` | `"hybrid"` | Search algorithm |
|
|
427
|
+
| `filter_path` | string | — | Path substring filter |
|
|
428
|
+
| `include_context` | boolean | false | Include breadcrumb-prefixed `text` field |
|
|
429
|
+
| `rerank` | boolean | config default | Apply cross-encoder reranking |
|
|
430
|
+
|
|
431
|
+
```json
|
|
432
|
+
// Response
|
|
433
|
+
{
|
|
434
|
+
"results": [
|
|
435
|
+
{
|
|
436
|
+
"chunk_id": "abc123:2:0",
|
|
437
|
+
"file_id": "sha256-of-path",
|
|
438
|
+
"file_path": "/notes/security/2fa.md",
|
|
439
|
+
"heading_path": "Setup > Two-Factor Authentication",
|
|
440
|
+
"heading_text": "Two-Factor Authentication",
|
|
441
|
+
"heading_level": 2,
|
|
442
|
+
"section_index": 2,
|
|
443
|
+
"chunk_index": 0,
|
|
444
|
+
"text_raw": "Enable 2FA by navigating to Settings...",
|
|
445
|
+
"token_count": 87,
|
|
446
|
+
"score": 0.0312,
|
|
447
|
+
"line_start": 45,
|
|
448
|
+
"line_end": 72
|
|
449
|
+
}
|
|
450
|
+
],
|
|
451
|
+
"query": "how to set up two-factor authentication",
|
|
452
|
+
"total": 10,
|
|
453
|
+
"duration_ms": 43
|
|
454
|
+
}
|
|
455
|
+
```
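
For callers that prefer typed code over hand-written JSON, the request body documented above could be prepared as in the following sketch. The field names come from the table; the helper `buildSearchRequest` and the `PreparedRequest` shape are hypothetical conveniences, not part of the package.

```typescript
type SearchMode = "hybrid" | "vector" | "fts";

/** Mirrors the documented POST /api/v1/search body. */
interface SearchRequest {
  query: string;
  top_k?: number;
  mode?: SearchMode;
  filter_path?: string;
  include_context?: boolean;
  rerank?: boolean;
}

/** A URL plus an options object that can be handed to fetch(url, init). */
interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildSearchRequest(baseUrl: string, request: SearchRequest): PreparedRequest {
  if (request.query.trim() === "") {
    throw new Error("query is required");
  }
  return {
    // Strip trailing slashes so both ".../api/v1" and ".../api/v1/" work.
    url: `${baseUrl.replace(/\/+$/, "")}/search`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(request),
    },
  };
}
```

With a server running, `const { url, init } = buildSearchRequest("http://localhost:7832/api/v1", { query: "configure auth", top_k: 5 })` followed by `await fetch(url, init)` would send the request; omitted optional fields fall back to the defaults in the table.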
|
|
456
|
+
|
|
457
|
+
#### `GET /api/v1/search?q=...`
|
|
458
|
+
|
|
459
|
+
```bash
|
|
460
|
+
curl "http://localhost:7832/api/v1/search?q=configure+auth&top_k=5&mode=vector"
|
|
461
|
+
```
|
|
462
|
+
|
|
463
|
+
Accepts the same parameters as POST, via query string. Useful for quick browser/curl queries.
|
|
464
|
+
|
|
465
|
+
---
|
|
466
|
+
|
|
467
|
+
### Document Endpoints
|
|
468
|
+
|
|
469
|
+
#### `GET /api/v1/documents`
|
|
470
|
+
|
|
471
|
+
Returns a paginated list of indexed files.
|
|
472
|
+
|
|
473
|
+
```bash
|
|
474
|
+
curl "http://localhost:7832/api/v1/documents?limit=20&offset=0"
|
|
475
|
+
```
|
|
476
|
+
|
|
477
|
+
```json
|
|
478
|
+
{
|
|
479
|
+
"documents": [
|
|
480
|
+
{
|
|
481
|
+
"file_id": "...",
|
|
482
|
+
"file_path": "/notes/setup.md",
|
|
483
|
+
"file_hash": "...",
|
|
484
|
+
"chunk_count": 12,
|
|
485
|
+
"indexed_at": 1742215200000,
|
|
486
|
+
"status": "indexed"
|
|
487
|
+
}
|
|
488
|
+
],
|
|
489
|
+
"total": 142,
|
|
490
|
+
"limit": 20,
|
|
491
|
+
"offset": 0
|
|
492
|
+
}
|
|
493
|
+
```
|
|
494
|
+
|
|
495
|
+
#### `GET /api/v1/documents/:fileId`
|
|
496
|
+
|
|
497
|
+
Returns metadata and all stored chunks for a specific file.
|
|
498
|
+
|
|
499
|
+
```bash
|
|
500
|
+
curl "http://localhost:7832/api/v1/documents/<file_id>"
|
|
501
|
+
```
|
|
502
|
+
|
|
503
|
+
#### `GET /api/v1/documents/:fileId/raw`
|
|
504
|
+
|
|
505
|
+
Returns the raw Markdown content of the file (read from disk).
|
|
506
|
+
|
|
507
|
+
---
|
|
508
|
+
|
|
509
|
+
### Index Management Endpoints
|
|
510
|
+
|
|
511
|
+
#### `POST /api/v1/index/trigger`
|
|
512
|
+
|
|
513
|
+
Trigger a re-index operation asynchronously.
|
|
514
|
+
|
|
515
|
+
```jsonc
|
|
516
|
+
// Request
|
|
517
|
+
{
|
|
518
|
+
"paths": ["/notes/work"], // optional; defaults to watch_dirs
|
|
519
|
+
"force": false // optional
|
|
520
|
+
}
|
|
521
|
+
```
|
|
522
|
+
|
|
523
|
+
```json
|
|
524
|
+
// Response 202
|
|
525
|
+
{
|
|
526
|
+
"job_id": "job-1710000000000",
|
|
527
|
+
"status": "queued"
|
|
528
|
+
}
|
|
529
|
+
```
|
|
530
|
+
|
|
531
|
+
#### `GET /api/v1/index/status`
|
|
532
|
+
|
|
533
|
+
Returns current index statistics and watcher state.
|
|
534
|
+
|
|
535
|
+
```json
|
|
536
|
+
{
|
|
537
|
+
"fileCount": 142,
|
|
538
|
+
"chunkCount": 3847,
|
|
539
|
+
"watcherRunning": true,
|
|
540
|
+
"dbPath": "~/.sqmd/lancedb"
|
|
541
|
+
}
|
|
542
|
+
```
|
|
543
|
+
|
|
544
|
+
#### `GET /api/v1/index/jobs/:jobId`
|
|
545
|
+
|
|
546
|
+
Returns the progress of a triggered index job.
|
|
547
|
+
|
|
548
|
+
```json
|
|
549
|
+
{
|
|
550
|
+
"job_id": "job-1710000000000",
|
|
551
|
+
"status": "completed",
|
|
552
|
+
"indexed": 5,
|
|
553
|
+
"skipped": 137,
|
|
554
|
+
"errors": 0,
|
|
555
|
+
"started_at": 1710000000000,
|
|
556
|
+
"completed_at": 1710000003500
|
|
557
|
+
}
|
|
558
|
+
```
|
|
559
|
+
|
|
560
|
+
---
|
|
561
|
+
|
|
562
|
+
### Health & Metrics Endpoints
|
|
563
|
+
|
|
564
|
+
#### `GET /api/v1/health`
|
|
565
|
+
|
|
566
|
+
```json
|
|
567
|
+
{
|
|
568
|
+
"status": "ok",
|
|
569
|
+
"db": "connected",
|
|
570
|
+
"embedder": "transformers / nomic-ai/nomic-embed-text-v1.5",
|
|
571
|
+
"watcher": "running",
|
|
572
|
+
"uptime_seconds": 3600
|
|
573
|
+
}
|
|
574
|
+
```
|
|
575
|
+
|
|
576
|
+
#### `GET /api/v1/metrics`
|
|
577
|
+
|
|
578
|
+
Search latency percentiles and throughput counters.
|
|
579
|
+
|
|
580
|
+
---
|
|
581
|
+
|
|
582
|
+
### Authentication
|
|
583
|
+
|
|
584
|
+
Set `api.api_key` in config to a non-empty string to enable bearer token auth. Every `/api/*` request must then include this header:
|
|
585
|
+
|
|
586
|
+
```
|
|
587
|
+
Authorization: Bearer <your-api-key>
|
|
588
|
+
```
|
|
589
|
+
|
|
590
|
+
If `api_key` is empty (the default), authentication is disabled — suitable for local use.
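
The rule described above (empty key disables auth, otherwise a matching bearer token is required) can be expressed in a few lines. This is a sketch of the rule only; `bearerHeader` and `isAuthorized` are hypothetical names, and the real middleware may differ, for example by using a constant-time comparison.

```typescript
/** Client side: the header to attach when an API key is configured. */
function bearerHeader(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}` };
}

/**
 * Server side: decide whether a request passes.
 * `authorization` is the raw Authorization header value, if any.
 */
function isAuthorized(authorization: string | undefined, configuredKey: string): boolean {
  // An empty configured key means authentication is disabled.
  if (configuredKey === "") return true;
  if (authorization === undefined) return false;
  const match = /^Bearer (.+)$/.exec(authorization);
  // Plain string comparison keeps the sketch short.
  return match !== null && match[1] === configuredKey;
}
```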
|
|
591
|
+
|
|
592
|
+
---
|
|
593
|
+
|
|
594
|
+
## MCP Server
|
|
595
|
+
|
|
596
|
+
sqmd exposes a full Model Context Protocol server, allowing AI agents like Claude to search your notes directly from conversations.
|
|
597
|
+
|
|
598
|
+
### MCP Tools
|
|
599
|
+
|
|
600
|
+
| Tool | Required Args | Optional Args | Description |
|
|
601
|
+
|------|--------------|---------------|-------------|
|
|
602
|
+
| `search_documents` | `query` | `top_k`, `mode`, `filter_path`, `include_context` | Primary semantic/hybrid search |
|
|
603
|
+
| `get_document` | `file_path` | `section` | Fetch a file's metadata and chunks, optionally filtered to a heading |
|
|
604
|
+
| `list_documents` | — | `path_prefix`, `limit` | Browse the indexed file tree |
|
|
605
|
+
| `trigger_index` | — | `paths`, `force` | Request re-indexing |
|
|
606
|
+
| `get_index_status` | — | — | Index health and stats |
|
|
607
|
+
|
|
608
|
+
**`search_documents` example (Claude tool call):**
|
|
609
|
+
|
|
610
|
+
```json
|
|
611
|
+
{
|
|
612
|
+
"query": "database migration strategy",
|
|
613
|
+
"top_k": 5,
|
|
614
|
+
"mode": "hybrid",
|
|
615
|
+
"include_context": true
|
|
616
|
+
}
|
|
617
|
+
```
|
|
618
|
+
|
|
619
|
+
When `include_context` is `true`, the response includes a pre-assembled `context` string ready to inject into a prompt.
|
|
620
|
+
|
|
621
|
+
### MCP Resources
|
|
622
|
+
|
|
623
|
+
Every indexed file is exposed as a resource with URI scheme `md://<absolute-path>`:
|
|
624
|
+
|
|
625
|
+
```
|
|
626
|
+
md:///Users/alice/notes/architecture.md
|
|
627
|
+
```
|
|
628
|
+
|
|
629
|
+
Agents can read raw Markdown content directly via the resource protocol without going through the search tool.
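
A client that wants to move between file paths and resource URIs could use helpers like the ones below. They assume POSIX-style absolute paths and no percent-encoding, which matches the example URI above but is otherwise an assumption; `toResourceUri` and `fromResourceUri` are hypothetical names.

```typescript
const RESOURCE_SCHEME = "md://";

/** "/Users/alice/notes/a.md" becomes "md:///Users/alice/notes/a.md". */
function toResourceUri(absolutePath: string): string {
  if (!absolutePath.startsWith("/")) {
    throw new Error(`expected an absolute path, got: ${absolutePath}`);
  }
  return RESOURCE_SCHEME + absolutePath;
}

/** Inverse of toResourceUri; rejects anything that is not md://<absolute-path>. */
function fromResourceUri(uri: string): string {
  if (!uri.startsWith(RESOURCE_SCHEME + "/")) {
    throw new Error(`not an md:// resource URI: ${uri}`);
  }
  return uri.slice(RESOURCE_SCHEME.length);
}
```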
|
|
630
|
+
|
|
631
|
+
### Claude Desktop Integration
|
|
632
|
+
|
|
633
|
+
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
|
|
634
|
+
|
|
635
|
+
```json
|
|
636
|
+
{
|
|
637
|
+
"mcpServers": {
|
|
638
|
+
"sqmd": {
|
|
639
|
+
"command": "node",
|
|
640
|
+
"args": ["/path/to/sqmd/dist/index.js", "mcp"],
|
|
641
|
+
"env": {
|
|
642
|
+
"SQMD_CONFIG": "/Users/alice/.sqmd/config.yaml"
|
|
643
|
+
}
|
|
644
|
+
}
|
|
645
|
+
}
|
|
646
|
+
}
|
|
647
|
+
```
|
|
648
|
+
|
|
649
|
+
Or, if installed globally:
|
|
650
|
+
|
|
651
|
+
```json
|
|
652
|
+
{
|
|
653
|
+
"mcpServers": {
|
|
654
|
+
"sqmd": {
|
|
655
|
+
"command": "sqmd",
|
|
656
|
+
"args": ["mcp"],
|
|
657
|
+
"env": {
|
|
658
|
+
"SQMD_CONFIG": "~/.sqmd/config.yaml"
|
|
659
|
+
}
|
|
660
|
+
}
|
|
661
|
+
}
|
|
662
|
+
}
|
|
663
|
+
```
|
|
664
|
+
|
|
665
|
+
### Claude Code Integration
|
|
666
|
+
|
|
667
|
+
Add to your `.mcp.json` or use `claude mcp add`:
|
|
668
|
+
|
|
669
|
+
```bash
|
|
670
|
+
claude mcp add sqmd -- node /path/to/dist/index.js mcp
|
|
671
|
+
```
|
|
672
|
+
|
|
673
|
+
---
|
|
674
|
+
|
|
675
|
+
## RAG Layer
|
|
676
|
+
|
|
677
|
+
The `src/rag/` module provides utilities for AI agent memory management.
|
|
678
|
+
|
|
679
|
+
**`buildContext(results, maxTokens)`** assembles search results into a single context string that fits within a token budget. Each chunk is preceded by attribution metadata:
|
|
680
|
+
|
|
681
|
+
```
|
|
682
|
+
Source: /notes/architecture/decisions.md
|
|
683
|
+
Section: Architecture > Database > Schema Design
|
|
684
|
+
Lines: 45-72
|
|
685
|
+
|
|
686
|
+
We chose PostgreSQL because it provides...
|
|
687
|
+
|
|
688
|
+
---
|
|
689
|
+
|
|
690
|
+
Source: /notes/architecture/decisions.md
|
|
691
|
+
Section: Architecture > Database > Migrations
|
|
692
|
+
Lines: 100-134
|
|
693
|
+
|
|
694
|
+
All schema changes are managed via...
|
|
695
|
+
```
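
To make the token budgeting concrete, here is a minimal sketch that produces the layout shown above. It is not the package's `buildContext`: the name `buildContextSketch`, the `ContextChunk` shape, the rough "four characters per token" estimate, and the choice to stop at the first chunk that does not fit are all assumptions made for illustration.

```typescript
/** Hypothetical subset of a search result needed for attribution. */
interface ContextChunk {
  filePath: string;
  headingPath: string;
  lineStart: number;
  lineEnd: number;
  text: string;
}

/** Crude token estimate (assumption): roughly four characters per token. */
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

/** Joins ranked chunks, best first, until the token budget would be exceeded. */
function buildContextSketch(results: ContextChunk[], maxTokens: number): string {
  const blocks: string[] = [];
  let used = 0;
  for (const r of results) {
    const block =
      `Source: ${r.filePath}\n` +
      `Section: ${r.headingPath}\n` +
      `Lines: ${r.lineStart}-${r.lineEnd}\n\n` +
      r.text;
    const cost = estimateTokens(block);
    if (used + cost > maxTokens) break; // keep rank order; never skip ahead
    blocks.push(block);
    used += cost;
  }
  return blocks.join("\n\n---\n\n");
}
```

Stopping at the first chunk that does not fit keeps the output strictly in rank order; a variant that skips oversized chunks and continues would pack the budget more tightly at the cost of that ordering guarantee.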
|
|
696
|
+
|
|
697
|
+
The `search_documents` MCP tool returns this context when `include_context: true`. Inject it directly into the system prompt or user message of your agent.
|
|
698
|
+
|
|
699
|
+
**`ragSystemPrompt()`** returns a baseline system prompt for RAG-style agents instructing the model on how to interpret sourced context.
|
|
700
|
+
|
|
701
|
+
---
|
|
702
|
+
|
|
703
|
+
## Configuration Reference
|
|
704
|
+
|
|
705
|
+
All settings live in `config.yaml` (or the file pointed to by `--config` / `$SQMD_CONFIG`).
|
|
706
|
+
|
|
707
|
+
### `paths`
|
|
708
|
+
|
|
709
|
+
| Key | Default | Description |
|
|
710
|
+
|-----|---------|-------------|
|
|
711
|
+
| `watch_dirs` | `["~/notes"]` | Directories to index and watch |
|
|
712
|
+
| `db_path` | `~/.sqmd/lancedb` | LanceDB database location |
|
|
713
|
+
| `model_cache_dir` | `~/.sqmd/models` | Directory for cached embedding models |

### `embeddings`

| Key | Default | Description |
|-----|---------|-------------|
| `backend` | `"transformers"` | `"transformers"` (ONNX) or `"ollama"` |
| `model` | `"nomic-ai/nomic-embed-text-v1.5"` | HuggingFace model ID or Ollama model name |
| `batch_size` | `64` | Texts per embedding batch |
| `ollama_base_url` | `"http://localhost:11434"` | Ollama server URL (used only when backend is `"ollama"`) |

### `chunking`

| Key | Default | Description |
|-----|---------|-------------|
| `max_tokens` | `512` | Maximum tokens per chunk before splitting |
| `min_chars` | `50` | Minimum characters; shorter chunks are discarded |
| `include_breadcrumb` | `true` | Prepend `"Section: H1 > H2 > H3\n\n"` to chunk text for richer embeddings |
| `overlap_tokens` | `64` | Carry-over tokens between adjacent sub-chunks when a section is split |

### `search`

| Key | Default | Description |
|-----|---------|-------------|
| `default_top_k` | `10` | Default number of results |
| `rrf_k` | `60` | RRF constant (`k` in `1/(k + rank)`); higher values reduce outlier impact |
| `rerank` | `false` | Enable cross-encoder reranking globally |
| `rerank_model` | `"cross-encoder/ms-marco-MiniLM-L-6-v2"` | ONNX cross-encoder model |
| `rerank_top_n` | `20` | Fetch this many candidates before reranking to `top_k` |

### `watcher`

| Key | Default | Description |
|-----|---------|-------------|
| `enabled` | `true` | Auto-start file watcher when `serve` runs |
| `debounce_ms` | `3000` | Wait this long after the last change before re-indexing |
| `extensions` | `[".md", ".mdx"]` | File extensions to watch |
| `ignore_patterns` | `["**/.git/**", "**/node_modules/**"]` | Glob patterns to ignore |

### `api`

| Key | Default | Description |
|-----|---------|-------------|
| `host` | `"127.0.0.1"` | Bind address |
| `port` | `7832` | HTTP port |
| `api_key` | `""` | API key for bearer auth; empty disables auth |

### `mcp`

| Key | Default | Description |
|-----|---------|-------------|
| `transport` | `"stdio"` | `"stdio"` or `"sse"` |
| `sse_port` | `7833` | Port for SSE transport |

### Environment Variable Overrides

| Variable | Config Key |
|----------|------------|
| `SQMD_CONFIG` | Config file path |
| `SQMD_DB_PATH` | `paths.db_path` |
| `SQMD_API_PORT` | `api.port` |
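
To make the override behaviour concrete, the sketch below layers the two documented variables on top of an already-loaded config. It is illustrative only: the `PartialConfig` type and `applyEnvOverrides` function are hypothetical names, and silently ignoring an invalid port value is an assumption rather than documented behaviour.

```typescript
// Minimal config slice covering only the keys with documented overrides.
interface PartialConfig {
  paths: { db_path: string };
  api: { port: number };
}

type EnvVars = Record<string, string | undefined>;

// Returns a copy of `config` with environment values applied on top.
function applyEnvOverrides(config: PartialConfig, env: EnvVars): PartialConfig {
  const result: PartialConfig = {
    paths: { ...config.paths },
    api: { ...config.api },
  };

  const dbPath = env["SQMD_DB_PATH"];
  if (dbPath !== undefined && dbPath !== "") {
    result.paths.db_path = dbPath;
  }

  const rawPort = env["SQMD_API_PORT"];
  if (rawPort !== undefined && rawPort !== "") {
    const port = Number.parseInt(rawPort, 10);
    // Assumption: values that are not a valid TCP port are ignored.
    if (Number.isInteger(port) && port > 0 && port < 65536) {
      result.api.port = port;
    }
  }
  return result;
}
```

In a Node process this would typically be called as `applyEnvOverrides(loadedConfig, process.env)`.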

---

## Architecture Deep Dive

### Chunking Algorithm

The chunker (`src/ingestion/chunker.ts`) implements a hierarchical, token-aware strategy inspired by PageIndex's TOC-based approach:

1. **Parse**: `remark-parse` converts Markdown to an mdast AST with precise line number tracking.

2. **Build section tree**: The AST walker maintains a heading stack. Every content block (paragraphs, lists, code blocks) is assigned to its nearest ancestor heading.

3. **Inject breadcrumb**: When `include_breadcrumb` is enabled, each chunk's `text` field is prefixed with `"Section: H1 > H2 > H3\n\n"`. This prefix is embedded alongside the content, giving the vector model full hierarchical context. The `text_raw` field always contains the unprefixed content for display.

4. **Token-aware splitting**: Sections exceeding `max_tokens` (default 512) are split at paragraph boundaries. The last paragraph of each chunk is carried over into the next when it fits within `overlap_tokens` (default 64), maintaining cross-chunk coherence.

5. **Stub filtering**: Chunks with `text_raw.length < min_chars` (default 50) are discarded.

6. **Preamble handling**: Content before the first heading becomes a section with `heading_level = 0`, using the filename stem as the breadcrumb.

Token estimation uses `Math.ceil(words * 1.3)`, a fast approximation that overestimates slightly to avoid over-long chunks.
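
Steps 3 to 5 and the token estimate can be sketched roughly as follows. This is a simplified illustration, not the code in `src/ingestion/chunker.ts`: the function names are hypothetical, paragraphs are assumed to arrive as plain strings, and a paragraph that is larger than `maxTokens` on its own is kept whole rather than split further.

```typescript
// Word-based estimate: ceil(words * 1.3).
function estimateTokens(text: string): number {
  const words = text.split(/\s+/).filter((w) => w.length > 0).length;
  return Math.ceil(words * 1.3);
}

// Step 3: prefix the raw text with its heading path.
function withBreadcrumb(headingPath: string[], textRaw: string): string {
  return `Section: ${headingPath.join(" > ")}\n\n${textRaw}`;
}

// Step 4: split at paragraph boundaries, carrying the last paragraph of a
// chunk into the next one when it fits within the overlap budget.
function splitByParagraphs(
  paragraphs: string[],
  maxTokens: number,
  overlapTokens: number,
): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;
  let hasNewContent = false; // false while `current` holds only carried-over text

  for (const para of paragraphs) {
    const cost = estimateTokens(para);
    if (hasNewContent && currentTokens + cost > maxTokens) {
      chunks.push(current.join("\n\n"));
      const last = current[current.length - 1];
      const lastCost = last === undefined ? 0 : estimateTokens(last);
      if (last !== undefined && lastCost <= overlapTokens) {
        current = [last];
        currentTokens = lastCost;
      } else {
        current = [];
        currentTokens = 0;
      }
      hasNewContent = false;
    }
    current.push(para);
    currentTokens += cost;
    hasNewContent = true;
  }
  if (hasNewContent) chunks.push(current.join("\n\n"));
  return chunks;
}

// Step 5: drop stubs shorter than `minChars`.
function dropStubs(chunks: string[], minChars: number): string[] {
  return chunks.filter((c) => c.length >= minChars);
}
```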

---

### Embedding Pipeline

The pipeline (`src/ingestion/pipeline.ts`) orchestrates indexing with bounded parallelism:

```
scanDirectory()
    │
    ├── hashFile() → compare with stored hash
    │       └── skip if unchanged (unless --force)
    │
    ├── parseMarkdown() → ParsedDocument
    ├── chunkDocument() → ChunkRecord[]  (vectors empty)
    │
    └── [collected into batches of batch_size * 4]
            │
            ├── embedder.embed(texts) → number[][]
            └── upsertChunks() + upsertFile() → LanceDB
```

Files are processed with `p-limit(4)` concurrency. Embedding batches are flushed when the pending buffer exceeds `batch_size * 4` (default 256 chunks), balancing memory usage and throughput.
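
The flush rule can be pictured with a small buffer that hands back a batch once it grows past the threshold. This is a sketch under assumptions: `PendingBuffer` is a hypothetical name, and the real pipeline embeds and upserts asynchronously, whereas this version only shows when a batch would be released.

```typescript
// Collects items and releases them as one batch once the threshold is exceeded.
class PendingBuffer<T> {
  private items: T[] = [];
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  // Returns a batch to embed when the buffer exceeds the threshold, else null.
  push(newItems: T[]): T[] | null {
    this.items.push(...newItems);
    return this.items.length > this.threshold ? this.drain() : null;
  }

  // Returns whatever is pending (used for the final flush) and resets the buffer.
  drain(): T[] {
    const batch = this.items;
    this.items = [];
    return batch;
  }
}
```

With the default `batch_size` of 64, the threshold would be `64 * 4 = 256`; a final `drain()` after the last file covers the leftover chunks.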

After the first bulk index, `createIndexes()` builds:
- **IVF-PQ vector index**: `num_partitions: 256`, `num_sub_vectors: 96` (cosine metric)
- **Tantivy FTS index**: on the `text` field

---

### Hybrid Search & RRF

`src/search/hybrid.ts` fuses vector and full-text results using **Reciprocal Rank Fusion**:

```
query
  │
  ├── prepareQueryForEmbedding() → "search_query: <query>"  (nomic prefix)
  │
  ├── vectorSearch(vector, k*3) → ranked list A
  └── ftsSearch(query, k*3)     → ranked list B
          │
          ▼
    RRF score(d) = Σ 1 / (60 + rank_i)
          │
          ▼
    top-k by RRF score → SearchResult[]
```

The RRF constant `k=60` (configurable via `search.rrf_k`) controls how steeply rank differences penalise lower-ranked results. Duplicate chunk IDs across lists are merged, summing their RRF scores.
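
A minimal sketch of the fusion step follows. It works on plain lists of chunk IDs and assumes ranks are 1-based; the function name and return shape are chosen for the example and are not taken from `src/search/hybrid.ts`.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item,
// and contributions for the same ID are summed across lists.
function rrfFuse(lists: string[][], k: number = 60, topK: number = 10): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // assumption: ranks start at 1
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

For example, with vector results `["a", "b", "c"]` and full-text results `["b", "d"]`, chunk `b` appears in both lists and scores `1/62 + 1/61`, which places it ahead of `a` (`1/61`) even though `a` ranked first in the vector list.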

**Search modes:**
- `hybrid`: RRF fusion of both lists (recommended)
- `vector`: pure cosine ANN search only
- `fts`: pure BM25 full-text search only

**Optional reranking:** When enabled, the candidate set is expanded from `top_k` to `rerank_top_n` (default 20). A cross-encoder (`cross-encoder/ms-marco-MiniLM-L-6-v2`, ONNX) scores each candidate by jointly processing query and passage, and the highest-scoring `top_k` results are returned for higher-precision ranking.
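
The control flow of the reranking stage can be sketched as below. The scorer here is a toy word-overlap function standing in for the cross-encoder, which in practice runs model inference (and would be asynchronous); all names are hypothetical.

```typescript
interface Candidate {
  id: string;
  text: string;
}

// Stand-in for a cross-encoder: scores a (query, passage) pair.
type PairScorer = (query: string, passage: string) => number;

// Score the first `rerankTopN` candidates and keep the best `topK`.
function rerankSketch(
  query: string,
  candidates: Candidate[],
  scorer: PairScorer,
  rerankTopN: number,
  topK: number,
): Candidate[] {
  return candidates
    .slice(0, rerankTopN)
    .map((candidate) => ({ candidate, score: scorer(query, candidate.text) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((scored) => scored.candidate);
}

// Toy scorer for illustration only: counts query words present in the passage.
function overlapScorer(query: string, passage: string): number {
  const words = new Set(passage.toLowerCase().split(/\s+/));
  return query.toLowerCase().split(/\s+/).filter((w) => words.has(w)).length;
}
```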

---

### Incremental Indexing

Change detection uses two layers:

1. **Content hash** (`src/ingestion/fingerprint.ts`): SHA-256 of file contents stored in the `files` table. On re-scan, the current hash is compared against the stored one; identical hashes skip the file entirely.

2. **File watcher** (`src/watcher/`): chokidar monitors `watch_dirs` for `add`, `change`, and `unlink` events. Events are debounced (default 3 s) to coalesce rapid saves. On `unlink`, the file's rows are deleted from both tables (`chunks` and `files`).
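
The hash comparison in the first layer amounts to something like the following sketch. The helper names are hypothetical, and the `force` parameter models the `--force` flag shown in the pipeline diagram; only `createHash` from `node:crypto` is a real API here.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the file contents as a lowercase hex string.
function hashContent(content: string | Uint8Array): string {
  return createHash("sha256").update(content).digest("hex");
}

// A file needs re-indexing when forced, never indexed, or its content changed.
function needsReindex(
  currentHash: string,
  storedHash: string | undefined,
  force: boolean = false,
): boolean {
  return force || storedHash === undefined || storedHash !== currentHash;
}
```

Hashing contents rather than comparing modification times means a file that is touched or re-saved without changes is still skipped.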

---

### LanceDB Schema

Two tables are maintained:

**`chunks`**: one row per chunk (core search table):

| Column | Type | Description |
|--------|------|-------------|
| `chunk_id` | Utf8 | `"{file_hash}:{section_idx}:{chunk_idx}"` |
| `file_id` | Utf8 | SHA-256 of the absolute file path |
| `file_path` | Utf8 | Absolute path |
| `file_hash` | Utf8 | Content hash (change detection) |
| `file_mtime` | Float64 | Epoch timestamp |
| `heading_path` | Utf8 | `"H1 > H2 > H3"` |
| `heading_level` | Int8 | 0 = preamble, 1–6 = heading depth |
| `heading_text` | Utf8 | Verbatim heading text |
| `section_index` | Int32 | Index of section within the file |
| `chunk_index` | Int32 | Index of chunk within the section |
| `text` | Utf8 | Breadcrumb-prefixed text (embedded) |
| `text_raw` | Utf8 | Display text (no breadcrumb) |
| `token_count` | Int32 | Approximate token count |
| `parent_headings` | List\<Utf8\> | Ancestor heading texts |
| `depth` | Int8 | Heading depth |
| `vector` | FixedSizeList(768, Float32) | Embedding vector |
| `line_start` | Int32 | First line in the source file |
| `line_end` | Int32 | Last line in the source file |

**`files`**: one row per indexed file:

| Column | Type | Description |
|--------|------|-------------|
| `file_id` | Utf8 | SHA-256 of path |
| `file_path` | Utf8 | Absolute path |
| `file_hash` | Utf8 | Content hash |
| `file_mtime` | Float64 | Last modification time |
| `chunk_count` | Int32 | Number of chunks |
| `indexed_at` | Float64 | Indexing timestamp |
| `status` | Utf8 | `"indexed"` \| `"error"` \| `"skipped"` |
| `error_msg` | Utf8 | Error details if status is `"error"` |

Vector dimension is `768` for `nomic-embed-text-v1.5`. For `bge-m3`, change `VECTOR_DIM` in `src/store/schema.ts` to `1024` before first index.
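
For orientation, a subset of the `chunks` columns could be mirrored in TypeScript as shown below, together with the documented `chunk_id` format and a dimension guard. The interface and helper names are illustrative assumptions; the actual definitions live in `src/store/schema.ts` and may differ.

```typescript
// Subset of the documented `chunks` columns, expressed as a TypeScript type.
interface ChunkRowSketch {
  chunk_id: string;
  file_id: string;
  file_path: string;
  file_hash: string;
  heading_path: string;
  heading_level: number; // 0 = preamble, 1-6 = heading depth
  section_index: number;
  chunk_index: number;
  text: string; // breadcrumb-prefixed, embedded
  text_raw: string; // display text
  vector: number[];
  line_start: number;
  line_end: number;
}

// Documented ID format: "{file_hash}:{section_idx}:{chunk_idx}".
function makeChunkId(fileHash: string, sectionIndex: number, chunkIndex: number): string {
  return `${fileHash}:${sectionIndex}:${chunkIndex}`;
}

// Guard against writing vectors whose length differs from the table's fixed size.
function assertVectorDim(vector: number[], expected: number = 768): void {
  if (vector.length !== expected) {
    throw new Error(`Vector dimension mismatch: expected ${expected}, got ${vector.length}`);
  }
}
```

A check like `assertVectorDim` before upserting would surface a model/schema mismatch early, for instance when switching to a 1024-dimensional model without updating `VECTOR_DIM`.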

---

## Embedding Backends

### Transformers.js (Default)

Uses `@huggingface/transformers` v3 with the ONNX runtime. No Python, no separate process. Models are downloaded once and cached locally.

**Model:** `nomic-ai/nomic-embed-text-v1.5` (768-dim, ~270 MB)

The nomic model uses asymmetric prefixes for higher accuracy:
- Documents are embedded as `"search_document: <text>"`
- Queries are embedded as `"search_query: <text>"`
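
The prefixing itself is a plain string operation, as in this sketch. The function names are hypothetical, and applying the prefix only when the model name contains `nomic` is an assumption made for the example, not confirmed behaviour of the package.

```typescript
// Assumption: prefixes apply only to nomic models, detected by name.
function usesNomicPrefixes(model: string): boolean {
  return model.toLowerCase().includes("nomic");
}

// Text as it would be passed to the embedder at index time.
function prepareDocumentSketch(text: string, model: string): string {
  return usesNomicPrefixes(model) ? `search_document: ${text}` : text;
}

// Text as it would be passed to the embedder at query time.
function prepareQuerySketch(text: string, model: string): string {
  return usesNomicPrefixes(model) ? `search_query: ${text}` : text;
}
```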

To use `bge-m3` (1024-dim, multilingual):

```yaml
embeddings:
  model: "BAAI/bge-m3"
```

Also update `VECTOR_DIM = 1024` in `src/store/schema.ts` and rebuild.

### Ollama

Point sqmd at a running [Ollama](https://ollama.ai) instance:

```yaml
embeddings:
  backend: "ollama"
  model: "nomic-embed-text"
  ollama_base_url: "http://localhost:11434"
```

Ollama must be running with the model already pulled (`ollama pull nomic-embed-text`).

---

## Performance

| Operation | Typical Time | Notes |
|-----------|-------------|-------|
| Initial index (50k chunks) | 2–4 min | CPU; ONNX SIMD; batch size 64 |
| Single file re-index | < 1 s | Hash skip + targeted upsert |
| Search (hybrid, no rerank) | < 100 ms | IVF-PQ ANN + Tantivy BM25 + RRF |
| Search (with reranking) | 200–500 ms | Cross-encoder inference per candidate |
| Memory (idle) | ~400 MB | ONNX model ~200 MB + mmap'd LanceDB |

Embedding throughput scales with CPU core count: the ONNX runtime uses SIMD and picks up available threads automatically.

---

## Development

### Running Tests

```bash
npm test              # run all tests once (vitest)
npm run test:watch    # watch mode
```

Test coverage:
- `tests/unit/chunker.test.ts`: hierarchical chunking, breadcrumbs, overlap, stub filtering
- `tests/unit/hybrid.test.ts`: RRF fusion logic (mocked DB)
- `tests/unit/config.test.ts`: config loading, validation, env var overrides
- `tests/integration/pipeline.test.ts`: full pipeline with temp LanceDB instance
- `tests/integration/api.test.ts`: Hono app endpoints (health, search, documents, index)

### Building

```bash
npm run build    # tsc → dist/
npm run dev      # tsx src/index.ts (no build step, for development)
```

### Project Conventions

- All source imports use `.js` extension for ESM compatibility (TypeScript resolves to `.ts` at compile time)
- Node built-ins use the `node:` prefix (`node:fs`, `node:path`, `node:crypto`)
- `src/config/schema.ts` is the single source of truth for all config types; do not duplicate config fields elsewhere
- Embedder is lazy-loaded on first use to avoid model download cost at startup for non-indexing commands
- `p-limit` concurrency default is 4 files; adjust `concurrency` in `pipeline.run()` for I/O-bound vs CPU-bound workloads

---

## Troubleshooting

**`Database may not be initialized. Run sqmd index first.`**

The LanceDB database doesn't exist yet. Run `node dist/index.js index` to create it.

---

**`Path not found: ~/notes`**

The tilde in `watch_dirs` is expanded at runtime. Ensure the directory exists. Use an absolute path to be explicit:

```yaml
paths:
  watch_dirs:
    - "/Users/alice/notes"
```

---

**First index takes a long time**

The embedding model (~270 MB) is being downloaded on first use. Subsequent runs use the cache at `~/.sqmd/models`. Check disk space and network connectivity if the download stalls.

---

**Search returns no results**

1. Run `node dist/index.js status` to verify files were indexed
2. Check the `errors` count; some files may have failed to parse
3. Try `--mode fts` first to verify full-text search works independently
4. Ensure you're using the same model for both indexing and search (config `embeddings.model`)

---

**Vector dimension mismatch error**

If you change the embedding model after an initial index, the stored vector dimension may no longer match the new model's output. Delete the database and re-index:

```bash
rm -rf ~/.sqmd/lancedb
node dist/index.js index --force
```

---

**Ollama connection refused**

Ensure Ollama is running (`ollama serve`) and the model is pulled (`ollama pull nomic-embed-text`). Verify `ollama_base_url` in config.

---

**Port 7832 already in use**

Override with `--port` or in config:

```yaml
api:
  port: 8832
```