@softerist/heuristic-mcp 3.2.3 → 3.2.4

Files changed (46)
  1. package/README.md +387 -376
  2. package/config.jsonc +800 -800
  3. package/features/ann-config.js +102 -110
  4. package/features/clear-cache.js +81 -84
  5. package/features/find-similar-code.js +265 -286
  6. package/features/hybrid-search.js +487 -536
  7. package/features/index-codebase.js +3139 -3270
  8. package/features/lifecycle.js +1011 -1063
  9. package/features/package-version.js +277 -291
  10. package/features/register.js +351 -370
  11. package/features/resources.js +115 -130
  12. package/features/set-workspace.js +214 -240
  13. package/index.js +693 -758
  14. package/lib/cache-ops.js +22 -22
  15. package/lib/cache-utils.js +465 -519
  16. package/lib/cache.js +1749 -1849
  17. package/lib/call-graph.js +396 -396
  18. package/lib/cli.js +232 -226
  19. package/lib/config.js +1483 -1495
  20. package/lib/constants.js +511 -493
  21. package/lib/embed-query-process.js +206 -212
  22. package/lib/embedding-process.js +434 -451
  23. package/lib/embedding-worker.js +862 -934
  24. package/lib/ignore-patterns.js +276 -316
  25. package/lib/json-worker.js +14 -14
  26. package/lib/json-writer.js +302 -310
  27. package/lib/logging.js +116 -127
  28. package/lib/memory-logger.js +13 -13
  29. package/lib/onnx-backend.js +188 -193
  30. package/lib/path-utils.js +18 -23
  31. package/lib/project-detector.js +82 -84
  32. package/lib/server-lifecycle.js +133 -145
  33. package/lib/settings-editor.js +738 -739
  34. package/lib/slice-normalize.js +25 -31
  35. package/lib/tokenizer.js +168 -203
  36. package/lib/utils.js +364 -409
  37. package/lib/vector-store-binary.js +973 -991
  38. package/lib/vector-store-sqlite.js +377 -414
  39. package/lib/workspace-env.js +32 -34
  40. package/mcp_config.json +9 -9
  41. package/package.json +86 -86
  42. package/scripts/clear-cache.js +20 -20
  43. package/scripts/download-model.js +43 -43
  44. package/scripts/mcp-launcher.js +49 -49
  45. package/scripts/postinstall.js +12 -12
  46. package/search-configs.js +36 -36
package/README.md CHANGED
@@ -1,376 +1,387 @@
1
- # Heuristic MCP Server
2
-
3
- An enhanced MCP server for your codebase. It provides intelligent semantic search, find-similar-code, recency-aware ranking, call-graph proximity boosts, and smart chunking. Optimized for Antigravity, Cursor, Claude Desktop, and VS Code.
4
-
5
- ---
6
-
7
- ## Key Features
8
-
9
- - Zero-touch setup: postinstall auto-registers the MCP server with supported IDEs when possible.
10
- - Smart indexing: detects project type and applies smart ignore patterns on top of your excludes.
11
- - Semantic search: find code by meaning, not just keywords.
12
- - Find similar code: locate near-duplicate or related patterns from a snippet.
13
- - Package version lookup: check latest versions from npm, PyPI, crates.io, Maven, and more.
14
- - Workspace switching: change workspace at runtime without restarting the server.
15
- - Recency ranking and call-graph boosting: surfaces fresh and related code.
16
- - Optional ANN index: faster candidate retrieval for large codebases.
17
- - Optional binary vector store: mmap-friendly cache format for large repos.
18
- - Flexible embedding dimensions: MRL-compatible dimension reduction (64-768d) for speed/quality tradeoffs.
19
-
20
- ---
21
-
22
- ## Installation
23
-
24
- Install globally (recommended):
25
-
26
- ```bash
27
- npm install -g @softerist/heuristic-mcp
28
- ```
29
-
30
- What happens during install:
31
-
32
- - Registration runs automatically (`scripts/postinstall.js`).
33
- - Model pre-download is attempted (`scripts/download-model.js`). If offline, it will be skipped and downloaded on first run.
34
-
35
- If auto-registration did not update your IDE config, run:
36
-
37
- ```bash
38
- heuristic-mcp --start
39
- ```
40
-
41
- ---
42
-
43
- ## CLI Commands
44
-
45
- The `heuristic-mcp` binary manages the server lifecycle.
46
-
47
- ### Status
48
-
49
- ```bash
50
- heuristic-mcp --status
51
- ```
52
-
53
- Shows server PID(s) and cache stats.
54
-
55
- ### Logs
56
-
57
- ```bash
58
- heuristic-mcp --logs
59
- ```
60
-
61
- Tails the server log for the current workspace (defaults to last 200 lines and follows).
62
-
63
- Optional flags:
64
-
65
- ```bash
66
- heuristic-mcp --logs --tail 100
67
- heuristic-mcp --logs --no-follow
68
- ```
69
-
70
- ### Version
71
-
72
- ```bash
73
- heuristic-mcp --version
74
- ```
75
-
76
- ### Start/Stop
77
-
78
- ```bash
79
- heuristic-mcp --start
80
- heuristic-mcp --start antigravity
81
- heuristic-mcp --start codex
82
- heuristic-mcp --start cursor
83
- heuristic-mcp --start vscode
84
- heuristic-mcp --start windsurf
85
- heuristic-mcp --start warp
86
- heuristic-mcp --start "Claude Desktop"
87
- heuristic-mcp --stop
88
- ```
89
-
90
- `--start` registers (if needed) and enables the MCP server entry. `--stop` disables it so the IDE won't immediately respawn it. Restart/reload the IDE after `--start` to launch.
91
-
92
- Warp note: this package now targets `~/.warp/mcp_settings.json` (and `%APPDATA%\\Warp\\mcp_settings.json` on Windows when present). If no local Warp MCP config is writable yet, use Warp MCP settings/UI once to initialize it, then re-run `--start warp`.
93
-
94
- ### Clear Cache
95
-
96
- ```bash
97
- heuristic-mcp --clear-cache
98
- ```
99
-
100
- Clears the cache for the current working directory (or `--workspace` if provided) and removes stale cache directories without metadata.
101
-
102
- ---
103
-
104
- ## Configuration (`config.jsonc`)
105
-
106
- Configuration is loaded from your workspace root when the server runs with `--workspace`. If not provided by the IDE, the server auto-detects workspace via environment variables and current working directory. In server mode, it falls back to the package `config.jsonc` (or `config.json`) and then your current working directory.
107
-
108
- Example `config.jsonc`:
109
-
110
- ```json
111
- {
112
- "excludePatterns": ["**/legacy-code/**", "**/*.test.ts"],
113
- "fileNames": ["Dockerfile", ".env.example", "Makefile"],
114
- "indexing": {
115
- "smartIndexing": true
116
- },
117
- "worker": {
118
- "workerThreads": 0
119
- },
120
- "embedding": {
121
- "embeddingModel": "jinaai/jina-embeddings-v2-base-code",
122
- "embeddingBatchSize": null,
123
- "embeddingProcessNumThreads": 8
124
- },
125
- "search": {
126
- "recencyBoost": 0.1,
127
- "recencyDecayDays": 30
128
- },
129
- "callGraph": {
130
- "callGraphEnabled": true,
131
- "callGraphBoost": 0.15
132
- },
133
- "ann": {
134
- "annEnabled": true
135
- },
136
- "vectorStore": {
137
- "vectorStoreFormat": "binary",
138
- "vectorStoreContentMode": "external",
139
- "vectorStoreLoadMode": "disk",
140
- "contentCacheEntries": 256,
141
- "vectorCacheEntries": 64
142
- },
143
- "memoryCleanup": {
144
- "clearCacheAfterIndex": true
145
- }
146
- }
147
- ```
148
-
149
- Preferred style is namespaced keys (shown above). Legacy top-level keys are still supported for backward compatibility.
150
-
151
- ### Embedding Model & Dimension Options
152
-
153
- **Default model:** `jinaai/jina-embeddings-v2-base-code` (768 dimensions)
154
-
155
- > **Important:** The default Jina model was **not** trained with Matryoshka Representation Learning (MRL). Dimension reduction (`embeddingDimension`) will significantly degrade search quality with this model. Only use dimension reduction with MRL-trained models.
156
-
157
- For faster search with smaller embeddings, switch to an MRL-compatible model:
158
-
159
- ```json
160
- {
161
- "embedding": {
162
- "embeddingModel": "nomic-ai/nomic-embed-text-v1.5",
163
- "embeddingDimension": 128
164
- }
165
- }
166
- ```
167
-
168
- **MRL-compatible models:**
169
- - `nomic-ai/nomic-embed-text-v1.5` — recommended for 128d/256d
170
- - Other models explicitly trained with Matryoshka loss
171
-
172
- **embeddingDimension values:** `64 | 128 | 256 | 512 | 768 | null` (null = full dimensions)
173
-
174
- Cache location:
175
-
176
- - By default, the cache is stored in a global OS cache directory under `heuristic-mcp/<hash>`.
177
- - You can override with `cacheDirectory` in your config file.
178
-
179
- ### Environment Variables
180
-
181
- Selected overrides (prefix `SMART_CODING_`):
182
-
183
- Environment overrides target runtime keys and are synced back into namespaces by `lib/config.js`.
184
-
185
- - `SMART_CODING_VERBOSE=true|false` — enable detailed logging.
186
- - `SMART_CODING_WORKER_THREADS=auto|N` — worker thread count.
187
- - `SMART_CODING_BATCH_SIZE=100` — files per indexing batch.
188
- - `SMART_CODING_CHUNK_SIZE=25` — lines per chunk.
189
- - `SMART_CODING_MAX_RESULTS=5` — max search results.
190
- - `SMART_CODING_EMBEDDING_BATCH_SIZE=64` — embedding batch size (1–256, overrides auto).
191
- - `SMART_CODING_EMBEDDING_THREADS=8` — ONNX threads for the embedding child process.
192
- - `SMART_CODING_RECENCY_BOOST=0.1` — boost for recently edited files.
193
- - `SMART_CODING_RECENCY_DECAY_DAYS=30` — days until recency boost decays to 0.
194
- - `SMART_CODING_ANN_ENABLED=true|false` — enable ANN index.
195
- - `SMART_CODING_ANN_EF_SEARCH=64` — ANN search quality/speed tradeoff.
196
- - `SMART_CODING_VECTOR_STORE_FORMAT=json|binary|sqlite` — on-disk vector store format.
197
- - `SMART_CODING_VECTOR_STORE_CONTENT_MODE=external|inline` — where content is stored for binary format.
198
- - `SMART_CODING_VECTOR_STORE_LOAD_MODE=memory|disk` — vector loading strategy.
199
- - `SMART_CODING_CONTENT_CACHE_ENTRIES=256` — LRU entries for decoded content.
200
- - `SMART_CODING_VECTOR_CACHE_ENTRIES=64` — LRU entries for vectors (disk mode).
201
- - `SMART_CODING_CLEAR_CACHE_AFTER_INDEX=true|false` — drop in-memory vectors after indexing.
202
- - `SMART_CODING_UNLOAD_MODEL_AFTER_INDEX=true|false` — unload embedding model after indexing to free RAM (~500MB-1GB).
203
- - `SMART_CODING_EXPLICIT_GC=true|false` — opt-in to explicit GC (requires `--expose-gc`).
204
- - `SMART_CODING_INCREMENTAL_GC_THRESHOLD_MB=2048` — RSS threshold for running incremental GC after watcher updates (requires explicit GC).
205
- - `SMART_CODING_EMBEDDING_DIMENSION=64|128|256|512|768` — MRL dimension reduction (only for MRL-trained models).
206
-
207
- See `lib/config.js` for the full list.
208
-
209
- ### Binary Vector Store
210
-
211
- Set `vectorStore.vectorStoreFormat` to `binary` to use the on-disk binary cache. This keeps vectors and content out of JS heap
212
- and reads on demand. Recommended for large repos.
213
-
214
- - `vectorStore.vectorStoreContentMode=external` keeps content in the binary file and only loads for top-N results.
215
- - `vectorStore.contentCacheEntries` controls the small in-memory LRU for decoded content strings.
216
- - `vectorStore.vectorStoreLoadMode=disk` streams vectors from disk to reduce memory usage.
217
- - `vectorStore.vectorCacheEntries` controls the small in-memory LRU for vectors when using disk mode.
218
- - `memoryCleanup.clearCacheAfterIndex=true` drops in-memory vectors after indexing and reloads lazily on next query.
219
- - `memoryCleanup.unloadModelAfterIndex=true` (default) unloads the embedding model after indexing to free ~500MB-1GB of RAM; the model will reload on the next search query.
220
- - Note: `ann.annEnabled=true` with `vectorStore.vectorStoreLoadMode=disk` can increase disk reads during ANN rebuilds on large indexes.
221
-
222
- ### SQLite Vector Store
223
-
224
- Set `vectorStore.vectorStoreFormat` to `sqlite` to use SQLite for persistence. This provides:
225
-
226
- - ACID transactions for reliable writes
227
- - Simpler concurrent access
228
- - Standard database format for inspection
229
-
230
- ```json
231
- {
232
- "vectorStore": {
233
- "vectorStoreFormat": "sqlite"
234
- }
235
- }
236
- ```
237
-
238
- The vectors and content are stored in `vectors.sqlite` in your cache directory. You can inspect it with any SQLite browser.
239
- `vectorStore.vectorStoreContentMode` and `vectorStore.vectorStoreLoadMode` are respected for SQLite (use `vectorStore.vectorStoreLoadMode=disk` to avoid loading vectors into memory).
240
-
241
- **Tradeoffs vs Binary:**
242
- - Slightly higher read overhead (SQL queries vs direct memory access)
243
- - Better write reliability (transactions)
244
- - Easier debugging (standard SQLite file)
245
-
246
- ### Benchmarking Search
247
-
248
- Use the built-in script to compare memory vs latency tradeoffs:
249
-
250
- ```bash
251
- node tools/scripts/benchmark-search.js --query "database connection" --runs 10
252
- ```
253
-
254
- Compare modes quickly:
255
-
256
- ```bash
257
- SMART_CODING_VECTOR_STORE_LOAD_MODE=memory node tools/scripts/benchmark-search.js --runs 10
258
- SMART_CODING_VECTOR_STORE_LOAD_MODE=disk node tools/scripts/benchmark-search.js --runs 10
259
- SMART_CODING_VECTOR_STORE_FORMAT=binary SMART_CODING_VECTOR_STORE_LOAD_MODE=disk node tools/scripts/benchmark-search.js --runs 10
260
- ```
261
-
262
- Note: On small repos, disk mode may be slightly slower and show noisy RSS deltas; benefits are clearer on large indexes with a small `vectorStore.vectorCacheEntries`.
263
-
264
- ---
265
-
266
- ## MCP Tools Reference
267
-
268
- ### `a_semantic_search`
269
- Find code by meaning. Ideal for natural language queries like "authentication logic" or "database queries".
270
-
271
- ### `b_index_codebase`
272
- Manually trigger a full reindex. Useful after large code changes.
273
-
274
- ### `c_clear_cache`
275
- Clear the embeddings cache and force reindex.
276
-
277
- ### `d_ann_config`
278
- Configure the ANN (Approximate Nearest Neighbor) index. Actions: `stats`, `set_ef_search`, `rebuild`.
279
-
280
- ### `d_find_similar_code`
281
- Find similar code patterns given a snippet. Useful for finding duplicates or refactoring opportunities.
282
-
283
- ### `e_check_package_version`
284
- Fetch the latest version of a package from its official registry.
285
-
286
- **Supported registries:**
287
- - **npm** (default): `lodash`, `@types/node`
288
- - **PyPI**: `pip:requests`, `pypi:django`
289
- - **crates.io**: `cargo:serde`, `rust:tokio`
290
- - **Maven**: `maven:org.springframework:spring-core`
291
- - **Go**: `go:github.com/gin-gonic/gin`
292
- - **RubyGems**: `gem:rails`
293
- - **NuGet**: `nuget:Newtonsoft.Json`
294
- - **Packagist**: `composer:laravel/framework`
295
- - **Hex**: `hex:phoenix`
296
- - **pub.dev**: `pub:flutter`
297
- - **Homebrew**: `brew:node`
298
- - **Conda**: `conda:numpy`
299
-
300
- ### `f_set_workspace`
301
- Change the workspace directory at runtime. Updates search directory, cache location, and optionally triggers reindex.
302
-
303
- The server also attempts this automatically before each tool call when it detects a new workspace path from environment variables (for example `CODEX_WORKSPACE`, `CODEX_PROJECT_ROOT`, `WORKSPACE_FOLDER`).
304
-
305
- **Parameters:**
306
- - `workspacePath` (required): Absolute path to the new workspace
307
- - `reindex` (optional, default: `true`): Whether to trigger a full reindex
308
-
309
- ---
310
-
311
- ## Troubleshooting
312
-
313
- **Server isn't starting**
314
-
315
- 1. Run `heuristic-mcp --status` to check config and cache status.
316
- 2. Run `heuristic-mcp --logs` to see startup errors.
317
-
318
- **Native ONNX backend unavailable (falls back to WASM)**
319
-
320
- If you see log lines like:
321
-
322
- ```
323
- Native ONNX backend unavailable: The operating system cannot run %1.
324
- ...onnxruntime_binding.node. Falling back to WASM.
325
- ```
326
-
327
- The server will automatically disable workers and force `embedding.embeddingProcessPerBatch` to reduce memory spikes, but you
328
- should fix the native binding to restore stable memory usage:
329
-
330
- - Ensure you are running **64-bit Node.js** (`node -p "process.arch"` should be `x64`).
331
- - Install **Microsoft Visual C++ 2015–2022 Redistributable (x64)**.
332
- - Reinstall dependencies (clears locked native binaries):
333
-
334
- ```bash
335
- Remove-Item -Recurse -Force node_modules\\onnxruntime-node, node_modules\\.onnxruntime-node-* -ErrorAction SilentlyContinue
336
- npm install
337
- ```
338
-
339
- If you see a warning about **version mismatch** (e.g. "onnxruntime-node 1.23.x incompatible with transformers.js
340
- expectation 1.14.x"), install the matching version:
341
-
342
- ```bash
343
- npm install onnxruntime-node@1.14.0
344
- ```
345
-
346
- **Search returns no results**
347
-
348
- - Check `heuristic-mcp --status` for indexing progress.
349
- - If indexing shows zero files, review `excludePatterns` and `fileExtensions`.
350
-
351
- **Model download fails**
352
-
353
- - The install step tries to pre-download the model, but it can be skipped offline.
354
- - The server will download on first run; ensure network access at least once.
355
-
356
- **Clear cache**
357
-
358
- - Use the MCP tool `c_clear_cache`, run `heuristic-mcp --clear-cache`, or delete the cache directory. For local dev, run `npm run clean`.
359
-
360
- **Inspect cache**
361
-
362
- ```bash
363
- node tools/scripts/cache-stats.js --workspace <path>
364
- ```
365
-
366
- **Stop doesn't stick**
367
-
368
- - The IDE will auto-restart the server if it's still enabled in its config. `--stop` now disables the server entry for Antigravity, Cursor (including `~/.cursor/mcp.json`), Windsurf (`~/.codeium/windsurf/mcp_config.json`), Warp (`~/.warp/mcp_settings.json` and `%APPDATA%\\Warp\\mcp_settings.json` when present), Claude Desktop, and VS Code (when using common MCP settings keys). Restart the IDE after `--start` to re-enable.
369
-
370
- ---
371
-
372
- ## Contributing
373
-
374
- See `CONTRIBUTING.md` for guidelines.
375
-
376
- License: MIT
1
+ # Heuristic MCP Server
2
+
3
+ An enhanced MCP server for your codebase. It provides intelligent semantic search, find-similar-code, recency-aware ranking, call-graph proximity boosts, and smart chunking. Optimized for Antigravity, Cursor, Claude Desktop, and VS Code.
4
+
5
+ ---
6
+
7
+ ## Key Features
8
+
9
+ - Zero-touch setup: postinstall auto-registers the MCP server with supported IDEs when possible.
10
+ - Smart indexing: detects project type and applies smart ignore patterns on top of your excludes.
11
+ - Semantic search: find code by meaning, not just keywords.
12
+ - Find similar code: locate near-duplicate or related patterns from a snippet.
13
+ - Package version lookup: check latest versions from npm, PyPI, crates.io, Maven, and more.
14
+ - Workspace switching: change workspace at runtime without restarting the server.
15
+ - Recency ranking and call-graph boosting: surfaces fresh and related code.
16
+ - Optional ANN index: faster candidate retrieval for large codebases.
17
+ - Optional binary vector store: mmap-friendly cache format for large repos.
18
+ - Flexible embedding dimensions: MRL-compatible dimension reduction (64-768d) for speed/quality tradeoffs.
19
+
20
+ ---
21
+
22
+ ## Installation
23
+
24
+ Install globally (recommended):
25
+
26
+ ```bash
27
+ npm install -g @softerist/heuristic-mcp
28
+ ```
29
+
30
+ What happens during install:
31
+
32
+ - Registration runs automatically (`scripts/postinstall.js`).
33
+ - Model pre-download is attempted (`scripts/download-model.js`). If offline, it will be skipped and downloaded on first run.
34
+
35
+ If auto-registration did not update your IDE config, run:
36
+
37
+ ```bash
38
+ heuristic-mcp --start
39
+ ```
40
+
41
+ ---
42
+
43
+ ## CLI Commands
44
+
45
+ The `heuristic-mcp` binary manages the server lifecycle.
46
+
47
+ ### Status
48
+
49
+ ```bash
50
+ heuristic-mcp --status
51
+ ```
52
+
53
+ Shows server PID(s) and cache stats.
54
+
55
+ ### Logs
56
+
57
+ ```bash
58
+ heuristic-mcp --logs
59
+ ```
60
+
61
+ Tails the server log for the current workspace (defaults to last 200 lines and follows).
62
+
63
+ Optional flags:
64
+
65
+ ```bash
66
+ heuristic-mcp --logs --tail 100
67
+ heuristic-mcp --logs --no-follow
68
+ ```
69
+
70
+ ### Version
71
+
72
+ ```bash
73
+ heuristic-mcp --version
74
+ ```
75
+
76
+ ### Start/Stop
77
+
78
+ ```bash
79
+ heuristic-mcp --start
80
+ heuristic-mcp --start antigravity
81
+ heuristic-mcp --start codex
82
+ heuristic-mcp --start cursor
83
+ heuristic-mcp --start vscode
84
+ heuristic-mcp --start windsurf
85
+ heuristic-mcp --start warp
86
+ heuristic-mcp --start "Claude Desktop"
87
+ heuristic-mcp --stop
88
+ ```
89
+
90
+ `--start` registers (if needed) and enables the MCP server entry. `--stop` disables it so the IDE won't immediately respawn it. Restart/reload the IDE after `--start` to launch.
91
+
92
+ Warp note: this package now targets `~/.warp/mcp_settings.json` (and `%APPDATA%\\Warp\\mcp_settings.json` on Windows when present). If no local Warp MCP config is writable yet, use Warp MCP settings/UI once to initialize it, then re-run `--start warp`.
93
+
94
+ ### Clear Cache
95
+
96
+ ```bash
97
+ heuristic-mcp --clear-cache
98
+ ```
99
+
100
+ Clears the cache for the current working directory (or `--workspace` if provided) and removes stale cache directories without metadata.
101
+
102
+ ---
103
+
104
+ ## Configuration (`config.jsonc`)
105
+
106
+ Configuration is loaded from your workspace root when the server runs with `--workspace`. If not provided by the IDE, the server auto-detects workspace via environment variables and current working directory. In server mode, it falls back to the package `config.jsonc` (or `config.json`) and then your current working directory.
107
+
108
+ Example `config.jsonc`:
109
+
110
+ ```json
111
+ {
112
+ "excludePatterns": ["**/legacy-code/**", "**/*.test.ts"],
113
+ "fileNames": ["Dockerfile", ".env.example", "Makefile"],
114
+ "indexing": {
115
+ "smartIndexing": true
116
+ },
117
+ "worker": {
118
+ "workerThreads": 0
119
+ },
120
+ "embedding": {
121
+ "embeddingModel": "jinaai/jina-embeddings-v2-base-code",
122
+ "embeddingBatchSize": null,
123
+ "embeddingProcessNumThreads": 8
124
+ },
125
+ "search": {
126
+ "recencyBoost": 0.1,
127
+ "recencyDecayDays": 30
128
+ },
129
+ "callGraph": {
130
+ "callGraphEnabled": true,
131
+ "callGraphBoost": 0.15
132
+ },
133
+ "ann": {
134
+ "annEnabled": true
135
+ },
136
+ "vectorStore": {
137
+ "vectorStoreFormat": "binary",
138
+ "vectorStoreContentMode": "external",
139
+ "vectorStoreLoadMode": "disk",
140
+ "contentCacheEntries": 256,
141
+ "vectorCacheEntries": 64
142
+ },
143
+ "memoryCleanup": {
144
+ "clearCacheAfterIndex": true
145
+ }
146
+ }
147
+ ```
148
+
149
+ Preferred style is namespaced keys (shown above). Legacy top-level keys are still supported for backward compatibility.
150
+
151
+ ### Embedding Model & Dimension Options
152
+
153
+ **Default model:** `jinaai/jina-embeddings-v2-base-code` (768 dimensions)
154
+
155
+ > **Important:** The default Jina model was **not** trained with Matryoshka Representation Learning (MRL). Dimension reduction (`embeddingDimension`) will significantly degrade search quality with this model. Only use dimension reduction with MRL-trained models.
156
+
157
+ For faster search with smaller embeddings, switch to an MRL-compatible model:
158
+
159
+ ```json
160
+ {
161
+ "embedding": {
162
+ "embeddingModel": "nomic-ai/nomic-embed-text-v1.5",
163
+ "embeddingDimension": 128
164
+ }
165
+ }
166
+ ```
167
+
168
+ **MRL-compatible models:**
169
+
170
+ - `nomic-ai/nomic-embed-text-v1.5` recommended for 128d/256d
171
+ - Other models explicitly trained with Matryoshka loss
172
+
173
+ **embeddingDimension values:** `64 | 128 | 256 | 512 | 768 | null` (null = full dimensions)
174
+
175
+ Cache location:
176
+
177
+ - By default, the cache is stored in a global OS cache directory under `heuristic-mcp/<hash>`.
178
+ - You can override with `cacheDirectory` in your config file.
179
+
180
+ ### Environment Variables
181
+
182
+ Selected overrides (prefix `SMART_CODING_`):
183
+
184
+ Environment overrides target runtime keys and are synced back into namespaces by `lib/config.js`.
185
+
186
+ - `SMART_CODING_VERBOSE=true|false` — enable detailed logging.
187
+ - `SMART_CODING_WORKER_THREADS=auto|N` — worker thread count.
188
+ - `SMART_CODING_BATCH_SIZE=100` — files per indexing batch.
189
+ - `SMART_CODING_CHUNK_SIZE=25` — lines per chunk.
190
+ - `SMART_CODING_MAX_RESULTS=5` — max search results.
191
+ - `SMART_CODING_EMBEDDING_BATCH_SIZE=64` — embedding batch size (1–256, overrides auto).
192
+ - `SMART_CODING_EMBEDDING_THREADS=8` — ONNX threads for the embedding child process.
193
+ - `SMART_CODING_RECENCY_BOOST=0.1` — boost for recently edited files.
194
+ - `SMART_CODING_RECENCY_DECAY_DAYS=30` — days until recency boost decays to 0.
195
+ - `SMART_CODING_ANN_ENABLED=true|false` — enable ANN index.
196
+ - `SMART_CODING_ANN_EF_SEARCH=64` — ANN search quality/speed tradeoff.
197
+ - `SMART_CODING_VECTOR_STORE_FORMAT=json|binary|sqlite` — on-disk vector store format.
198
+ - `SMART_CODING_VECTOR_STORE_CONTENT_MODE=external|inline` — where content is stored for binary format.
199
+ - `SMART_CODING_VECTOR_STORE_LOAD_MODE=memory|disk` — vector loading strategy.
200
+ - `SMART_CODING_CONTENT_CACHE_ENTRIES=256` — LRU entries for decoded content.
201
+ - `SMART_CODING_VECTOR_CACHE_ENTRIES=64` — LRU entries for vectors (disk mode).
202
+ - `SMART_CODING_CLEAR_CACHE_AFTER_INDEX=true|false` — drop in-memory vectors after indexing.
203
+ - `SMART_CODING_UNLOAD_MODEL_AFTER_INDEX=true|false` — unload embedding model after indexing to free RAM (~500MB-1GB).
204
+ - `SMART_CODING_EXPLICIT_GC=true|false` — opt-in to explicit GC (requires `--expose-gc`).
205
+ - `SMART_CODING_INCREMENTAL_GC_THRESHOLD_MB=2048` — RSS threshold for running incremental GC after watcher updates (requires explicit GC).
206
+ - `SMART_CODING_EMBEDDING_DIMENSION=64|128|256|512|768` — MRL dimension reduction (only for MRL-trained models).
207
+
208
+ See `lib/config.js` for the full list.
209
+
210
+ ### Binary Vector Store
211
+
212
+ Set `vectorStore.vectorStoreFormat` to `binary` to use the on-disk binary cache. This keeps vectors and content out of JS heap
213
+ and reads on demand. Recommended for large repos.
214
+
215
+ - `vectorStore.vectorStoreContentMode=external` keeps content in the binary file and only loads for top-N results.
216
+ - `vectorStore.contentCacheEntries` controls the small in-memory LRU for decoded content strings.
217
+ - `vectorStore.vectorStoreLoadMode=disk` streams vectors from disk to reduce memory usage.
218
+ - `vectorStore.vectorCacheEntries` controls the small in-memory LRU for vectors when using disk mode.
219
+ - `memoryCleanup.clearCacheAfterIndex=true` drops in-memory vectors after indexing and reloads lazily on next query.
220
+ - `memoryCleanup.unloadModelAfterIndex=true` (default) unloads the embedding model after indexing to free ~500MB-1GB of RAM; the model will reload on the next search query.
221
+ - Note: `ann.annEnabled=true` with `vectorStore.vectorStoreLoadMode=disk` can increase disk reads during ANN rebuilds on large indexes.
222
+
223
+ ### SQLite Vector Store
224
+
225
+ Set `vectorStore.vectorStoreFormat` to `sqlite` to use SQLite for persistence. This provides:
226
+
227
+ - ACID transactions for reliable writes
228
+ - Simpler concurrent access
229
+ - Standard database format for inspection
230
+
231
+ ```json
232
+ {
233
+ "vectorStore": {
234
+ "vectorStoreFormat": "sqlite"
235
+ }
236
+ }
237
+ ```
238
+
239
+ The vectors and content are stored in `vectors.sqlite` in your cache directory. You can inspect it with any SQLite browser.
240
+ `vectorStore.vectorStoreContentMode` and `vectorStore.vectorStoreLoadMode` are respected for SQLite (use `vectorStore.vectorStoreLoadMode=disk` to avoid loading vectors into memory).
241
+
242
+ **Tradeoffs vs Binary:**
243
+
244
+ - Slightly higher read overhead (SQL queries vs direct memory access)
245
+ - Better write reliability (transactions)
246
+ - Easier debugging (standard SQLite file)
247
+
248
+ ### Benchmarking Search
249
+
250
+ Use the built-in script to compare memory vs latency tradeoffs:
251
+
252
+ ```bash
253
+ node tools/scripts/benchmark-search.js --query "database connection" --runs 10
254
+ ```
255
+
256
+ Compare modes quickly:
257
+
258
+ ```bash
259
+ SMART_CODING_VECTOR_STORE_LOAD_MODE=memory node tools/scripts/benchmark-search.js --runs 10
260
+ SMART_CODING_VECTOR_STORE_LOAD_MODE=disk node tools/scripts/benchmark-search.js --runs 10
261
+ SMART_CODING_VECTOR_STORE_FORMAT=binary SMART_CODING_VECTOR_STORE_LOAD_MODE=disk node tools/scripts/benchmark-search.js --runs 10
262
+ ```
263
+
264
+ Note: On small repos, disk mode may be slightly slower and show noisy RSS deltas; benefits are clearer on large indexes with a small `vectorStore.vectorCacheEntries`.
265
+
266
+ ---
267
+
268
+ ## MCP Tools Reference
269
+
270
+ ### `a_semantic_search`
271
+
272
+ Find code by meaning. Ideal for natural language queries like "authentication logic" or "database queries".
273
+
274
+ ### `b_index_codebase`
275
+
276
+ Manually trigger a full reindex. Useful after large code changes.
277
+
278
+ ### `c_clear_cache`
279
+
280
+ Clear the embeddings cache and force reindex.
281
+
282
+ ### `d_ann_config`
283
+
284
+ Configure the ANN (Approximate Nearest Neighbor) index. Actions: `stats`, `set_ef_search`, `rebuild`.
285
+
286
+ ### `d_find_similar_code`
287
+
288
+ Find similar code patterns given a snippet. Useful for finding duplicates or refactoring opportunities.
289
+
290
+ ### `e_check_package_version`
291
+
292
+ Fetch the latest version of a package from its official registry.
293
+
294
+ **Supported registries:**
295
+
296
+ - **npm** (default): `lodash`, `@types/node`
297
+ - **PyPI**: `pip:requests`, `pypi:django`
298
+ - **crates.io**: `cargo:serde`, `rust:tokio`
299
+ - **Maven**: `maven:org.springframework:spring-core`
300
+ - **Go**: `go:github.com/gin-gonic/gin`
301
+ - **RubyGems**: `gem:rails`
302
+ - **NuGet**: `nuget:Newtonsoft.Json`
303
+ - **Packagist**: `composer:laravel/framework`
304
+ - **Hex**: `hex:phoenix`
305
+ - **pub.dev**: `pub:flutter`
306
+ - **Homebrew**: `brew:node`
307
+ - **Conda**: `conda:numpy`
308
+
309
+ ### `f_set_workspace`
310
+
311
+ Change the workspace directory at runtime. Updates search directory, cache location, and optionally triggers reindex.
312
+
313
+ The server also attempts this automatically before each tool call when it detects a new workspace path from environment variables (for example `CODEX_WORKSPACE`, `CODEX_PROJECT_ROOT`, `WORKSPACE_FOLDER`).
314
+
315
+ **Parameters:**
316
+
317
+ - `workspacePath` (required): Absolute path to the new workspace
318
+ - `reindex` (optional, default: `true`): Whether to trigger a full reindex
319
+
320
+ ---
321
+
322
+ ## Troubleshooting
323
+
324
+ **Server isn't starting**
325
+
326
+ 1. Run `heuristic-mcp --status` to check config and cache status.
327
+ 2. Run `heuristic-mcp --logs` to see startup errors.
328
+
329
+ **Native ONNX backend unavailable (falls back to WASM)**
330
+
331
+ If you see log lines like:
332
+
333
+ ```
334
+ Native ONNX backend unavailable: The operating system cannot run %1.
335
+ ...onnxruntime_binding.node. Falling back to WASM.
336
+ ```
337
+
338
+ The server will automatically disable workers and force `embedding.embeddingProcessPerBatch` to reduce memory spikes, but you
339
+ should fix the native binding to restore stable memory usage:
340
+
341
+ - Ensure you are running **64-bit Node.js** (`node -p "process.arch"` should be `x64`).
342
+ - Install **Microsoft Visual C++ 2015–2022 Redistributable (x64)**.
343
+ - Reinstall dependencies (clears locked native binaries):
344
+
345
+ ```bash
346
+ Remove-Item -Recurse -Force node_modules\\onnxruntime-node, node_modules\\.onnxruntime-node-* -ErrorAction SilentlyContinue
347
+ npm install
348
+ ```
349
+
350
+ If you see a warning about **version mismatch** (e.g. "onnxruntime-node 1.23.x incompatible with transformers.js
351
+ expectation 1.14.x"), install the matching version:
352
+
353
+ ```bash
354
+ npm install onnxruntime-node@1.14.0
355
+ ```
356
+
357
+ **Search returns no results**
358
+
359
+ - Check `heuristic-mcp --status` for indexing progress.
360
+ - If indexing shows zero files, review `excludePatterns` and `fileExtensions`.
361
+
362
+ **Model download fails**
363
+
364
+ - The install step tries to pre-download the model, but it can be skipped offline.
365
+ - The server will download on first run; ensure network access at least once.
366
+
367
+ **Clear cache**
368
+
369
+ - Use the MCP tool `c_clear_cache`, run `heuristic-mcp --clear-cache`, or delete the cache directory. For local dev, run `npm run clean`.
370
+
371
+ **Inspect cache**
372
+
373
+ ```bash
374
+ node tools/scripts/cache-stats.js --workspace <path>
375
+ ```
376
+
377
+ **Stop doesn't stick**
378
+
379
+ - The IDE will auto-restart the server if it's still enabled in its config. `--stop` now disables the server entry for Antigravity, Cursor (including `~/.cursor/mcp.json`), Windsurf (`~/.codeium/windsurf/mcp_config.json`), Warp (`~/.warp/mcp_settings.json` and `%APPDATA%\\Warp\\mcp_settings.json` when present), Claude Desktop, and VS Code (when using common MCP settings keys). Restart the IDE after `--start` to re-enable.
380
+
381
+ ---
382
+
383
+ ## Contributing
384
+
385
+ See `CONTRIBUTING.md` for guidelines.
386
+
387
+ License: MIT
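
The README's "Embedding Model & Dimension Options" section says that `embeddingDimension` only makes sense for MRL-trained models. A minimal sketch of why, assuming the common Matryoshka convention of keeping the first N components and re-normalizing: the helper names `truncateEmbedding` and `dot` are hypothetical and are not taken from this package, whose actual implementation may differ.

```javascript
// Hypothetical helper, not part of heuristic-mcp: keep the first `dim`
// components of an embedding and L2-normalize the result.
// MRL-trained models pack the most useful signal into the leading components,
// which is why truncation degrades gracefully for them and badly for others.
function truncateEmbedding(vector, dim) {
  // null mirrors the README's "null = full dimensions"
  if (dim == null) return Float32Array.from(vector);
  if (!Number.isInteger(dim) || dim <= 0 || dim > vector.length) {
    throw new RangeError(`dim must be an integer in 1..${vector.length}`);
  }
  const out = Float32Array.from(vector.slice(0, dim));
  let sumSquares = 0;
  for (const x of out) sumSquares += x * x;
  const norm = Math.sqrt(sumSquares);
  if (norm > 0) {
    for (let i = 0; i < out.length; i++) out[i] /= norm;
  }
  return out;
}

// For unit-length vectors, cosine similarity reduces to a dot product.
function dot(a, b) {
  if (a.length !== b.length) throw new RangeError("length mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Usage: compare two toy 4-d vectors after reducing them to 2 dimensions.
const query = truncateEmbedding([3, 4, 12, 0], 2);
const chunk = truncateEmbedding([6, 8, 0, 5], 2);
console.log(query.length, dot(query, chunk).toFixed(3));
```

Because stored vectors and query vectors must share one dimension, changing `embeddingDimension` presumably requires a full reindex; that is an inference from the technique, not something the README states.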