magector 1.2.15 → 1.3.0

package/README.md CHANGED
@@ -57,6 +57,7 @@ Without Magector, asking Claude Code or Cursor *"how are checkout totals calcula
57
57
  - **Hybrid search** -- combines semantic vector similarity with keyword re-ranking for best-of-both-worlds results
58
58
  - **Structured JSON output** -- results include file path, class name, methods list, role badges, and content snippets for minimal round-trips
59
59
  - **Persistent serve mode** -- keeps ONNX model and HNSW index resident in memory, eliminating cold-start latency
60
+ - **Incremental re-indexing** -- background file watcher detects changes and updates the index without restart (tombstone + compact strategy)
60
61
  - **ONNX embeddings** -- native 384-dim transformer embeddings via ONNX Runtime
61
62
  - **36K+ vectors** -- indexes the complete Magento 2 / Adobe Commerce codebase including framework internals
62
63
  - **Magento-aware** -- understands controllers, plugins, observers, blocks, resolvers, repositories, and 20+ Magento patterns
@@ -227,10 +228,14 @@ magector-core serve [OPTIONS]
227
228
  Options:
228
229
  -d, --database <PATH> Index database path [default: ./magector.db]
229
230
  -c, --model-cache <PATH> Model cache directory [default: ./models]
231
+ -m, --magento-root <PATH> Magento root (enables file watcher)
232
+ --watch-interval <SECS> File watcher poll interval [default: 60]
230
233
  ```
231
234
 
232
235
  Starts a persistent process that reads JSON queries from stdin and writes JSON responses to stdout. Keeps the ONNX model and HNSW index resident in memory for fast repeated queries.
233
236
 
237
+ When `--magento-root` is provided, a background file watcher polls for changed files every `--watch-interval` seconds and incrementally re-indexes them without restart. Modified and deleted files are soft-deleted (tombstoned) in the HNSW index; new vectors are appended. When tombstoned entries exceed 20% of total vectors, the index is automatically compacted by rebuilding the HNSW graph.
238
+
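As a rough illustration of the polling step described above, the sketch below compares two scans of the source tree and classifies paths as added, modified, or deleted. The README does not say how `watcher.rs` detects changes, so comparing modification times is an assumption, and `diffSnapshots` is a hypothetical name that is not part of Magector.

```javascript
// Hypothetical helper, not part of Magector: compare two scans of the tree.
// Each snapshot is a Map from file path to last-modified time in milliseconds.
// Assumption: a change is detected by a differing mtime; the real watcher
// may use another signal (size, content hash, ...).
function diffSnapshots(previous, current) {
  const added = [];
  const modified = [];
  const deleted = [];

  for (const [path, mtime] of current) {
    if (!previous.has(path)) {
      added.push(path);            // present now, absent in the last scan
    } else if (previous.get(path) !== mtime) {
      modified.push(path);         // present in both scans, mtime differs
    }
  }
  for (const path of previous.keys()) {
    if (!current.has(path)) {
      deleted.push(path);          // present in the last scan, gone now
    }
  }
  return { added, modified, deleted };
}

// Example: one file edited, one removed, one created between scans.
const before = new Map([['a.php', 100], ['b.php', 100], ['c.php', 100]]);
const after = new Map([['a.php', 100], ['b.php', 250], ['d.php', 300]]);
const changes = diffSnapshots(before, after);
```

In this model, `modified` and `deleted` paths would have their old vectors tombstoned, while `added` and `modified` paths would be parsed and embedded again.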
234
239
  **Protocol (one JSON object per line):**
235
240
 
236
241
  ```json
@@ -242,6 +247,11 @@ Starts a persistent process that reads JSON queries from stdin and writes JSON r
242
247
 
243
248
  // Stats request:
244
249
  {"command":"stats"}
250
+
251
+ // Watcher status:
252
+ {"command":"watcher_status"}
253
+ // Response:
254
+ {"ok":true,"data":{"running":true,"tracked_files":18234,"last_scan_changes":3,"interval_secs":60}}
245
255
  ```
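A client talking to the serve process might wrap the `watcher_status` exchange roughly as follows. This is a minimal sketch: `encodeRequest` and `decodeWatcherStatus` are hypothetical helper names, and treating any response without `"ok":true` as a failure is an assumption, since the error shape is not shown above.

```javascript
// Hypothetical helpers, not part of Magector.

// The protocol is one JSON object per line, so terminate with a newline.
function encodeRequest(command) {
  return JSON.stringify({ command }) + '\n';
}

// Parse one response line for the watcher_status command.
// Assumption: a response whose "ok" field is not true signals a failure.
function decodeWatcherStatus(line) {
  const msg = JSON.parse(line);
  if (msg.ok !== true) {
    throw new Error('serve process did not return ok:true');
  }
  const d = msg.data;
  return {
    running: d.running,
    trackedFiles: d.tracked_files,
    lastScanChanges: d.last_scan_changes,
    intervalSecs: d.interval_secs,
  };
}

// Example using the response shown in the protocol block above.
const request = encodeRequest('watcher_status');
const status = decodeWatcherStatus(
  '{"ok":true,"data":{"running":true,"tracked_files":18234,"last_scan_changes":3,"interval_secs":60}}'
);
```

The `request` string would be written to the serve process's stdin, and each line read from its stdout would be passed to `decodeWatcherStatus`.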
246
256
 
247
257
  ### Node.js CLI
@@ -523,7 +533,8 @@ magector/
523
533
  │ │ ├── lib.rs # Library exports
524
534
  │ │ ├── indexer.rs # Core indexing with progress output
525
535
  │ │ ├── embedder.rs # ONNX embedding (MiniLM-L6-v2)
526
- │ │ ├── vectordb.rs # HNSW vector database + hybrid search
536
+ │ │ ├── vectordb.rs # HNSW vector database + hybrid search + tombstones
537
+ │ │ ├── watcher.rs # File watcher for incremental re-indexing
527
538
  │ │ ├── ast.rs # Tree-sitter AST (PHP + JS)
528
539
  │ │ ├── magento.rs # Magento pattern detection (Rust)
529
540
  │ │ └── validation.rs # 557 test cases, validation framework
@@ -591,7 +602,32 @@ flowchart TD
591
602
  style fallback fill:#f4e8e8,color:#000
592
603
  ```
593
604
 
594
- ### 4. MCP Integration
605
+ ### 4. File Watcher (Incremental Re-indexing)
606
+
607
+ When the serve process is started with `--magento-root`, a background thread polls the filesystem for changes every 60 seconds (configurable via `--watch-interval`). Changed files are incrementally re-indexed without restarting the server.
608
+
609
+ Since `hnsw_rs` does not support point deletion, Magector uses a **tombstone** strategy: old vectors for modified/deleted files are marked as tombstoned and filtered out of search results. New vectors are appended. When tombstoned entries exceed 20% of total vectors, the HNSW graph is automatically rebuilt (compacted) to reclaim memory and restore search performance.
610
+
611
+ ```mermaid
612
+ flowchart TD
613
+ W1[Sleep 60s] --> W2[Scan Filesystem]
614
+ W2 --> W3{Changes?}
615
+ W3 -->|No| W1
616
+ W3 -->|Yes| W4[Tombstone Old Vectors]
617
+ W4 --> W5[Parse + Embed New Files]
618
+ W5 --> W6[Append to HNSW]
619
+ W6 --> W7{Tombstone > 20%?}
620
+ W7 -->|Yes| W8[Compact / Rebuild HNSW]
621
+ W7 -->|No| W9[Save to Disk]
622
+ W8 --> W9
623
+ W9 --> W1
624
+
625
+ style W4 fill:#f4e8e8,color:#000
626
+ style W5 fill:#e8f4e8,color:#000
627
+ style W8 fill:#e8e8f4,color:#000
628
+ ```
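The tombstone and compact cycle in the diagram can be modelled with a small in-memory structure. The sketch below is illustrative only: `TombstoneIndex` is a hypothetical class, a flat array stands in for the HNSW graph, and "rebuild" simply means dropping tombstoned entries. The actual implementation lives in the Rust `vectordb.rs` and is not reproduced here.

```javascript
// Hypothetical model of the tombstone + compact strategy, not Magector code.
class TombstoneIndex {
  constructor(compactThreshold = 0.2) {
    this.entries = [];            // { path, vector }, append-only until compaction
    this.tombstoned = new Set();  // positions in `entries` that are soft-deleted
    this.compactThreshold = compactThreshold;
  }

  // Re-index one file: tombstone its old vectors, append the new ones.
  // Passing an empty array models a deleted file.
  update(path, newVectors) {
    this.entries.forEach((entry, i) => {
      if (entry.path === path) this.tombstoned.add(i);
    });
    for (const vector of newVectors) {
      this.entries.push({ path, vector });
    }
    // Compact once tombstones exceed the threshold share of all vectors.
    if (this.tombstoned.size / this.entries.length > this.compactThreshold) {
      this.compact();
    }
  }

  // Entries visible to search: everything that is not tombstoned.
  live() {
    return this.entries.filter((_, i) => !this.tombstoned.has(i));
  }

  // Stand-in for rebuilding the HNSW graph from live vectors only.
  compact() {
    this.entries = this.live();
    this.tombstoned.clear();
  }
}

// Example: ten files with one vector each.
const index = new TombstoneIndex();
for (let i = 0; i < 10; i++) index.update(`f${i}.php`, [[i]]);
index.update('f0.php', []);       // deleted: 1 of 10 tombstoned
index.update('f1.php', [[42]]);   // modified: 2 of 11 tombstoned
const totalBeforeCompact = index.entries.length;
index.update('f2.php', []);       // deleted: 3 of 11 exceeds 20%, so compact
```

Deferring the rebuild until the threshold is crossed keeps individual updates cheap, at the cost of carrying dead vectors in memory between compactions.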
629
+
630
+ ### 5. MCP Integration
595
631
 
596
632
  The MCP server delegates all search/index operations to the Rust core binary. Analysis tools (diff, complexity) use ruvector JS modules directly.
597
633
 
@@ -788,7 +824,8 @@ npm run test:accuracy
788
824
  - **Parameters:** M=32, max_layers=16, ef_construction=200
789
825
  - **Distance metric:** Cosine similarity
790
826
  - **Hybrid search:** Semantic nearest-neighbor + keyword reranking in path and search text
791
- - **Persistence:** Bincode binary serialization
827
+ - **Incremental updates:** Tombstone soft-delete + periodic HNSW rebuild (compact)
828
+ - **Persistence:** Bincode V2 binary serialization (backward-compatible with V1)
792
829
 
793
830
  ### Index Structure
794
831
 
@@ -859,7 +896,7 @@ gantt
859
896
  Method chunking :active, 2025-04, 30d
860
897
  Intent detection :2025-05, 30d
861
898
  Type filtering :2025-06, 30d
862
- Incremental index :2025-07, 30d
899
+ Incremental index :done, 2025-04, 30d
863
900
  section Future
864
901
  VSCode extension :2025-08, 60d
865
902
  Web UI :2025-10, 60d
@@ -874,7 +911,7 @@ gantt
874
911
  - [ ] Method-level chunking (per-method vectors for direct method search)
875
912
  - [ ] Query intent classification (auto-detect "give me XML" vs "give me PHP")
876
913
  - [ ] Filtered search by file type at the vector level
877
- - [ ] Incremental indexing (only re-index changed files)
914
+ - [x] Incremental indexing (background file watcher with tombstone + compact strategy)
878
915
  - [ ] VSCode extension
879
916
  - [ ] Web UI for browsing results
880
917
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "magector",
3
- "version": "1.2.15",
3
+ "version": "1.3.0",
4
4
  "description": "Semantic code search for Magento 2 — index, search, MCP server",
5
5
  "type": "module",
6
6
  "main": "src/mcp-server.js",
@@ -33,10 +33,10 @@
33
33
  "ruvector": "^0.1.96"
34
34
  },
35
35
  "optionalDependencies": {
36
- "@magector/cli-darwin-arm64": "1.2.15",
37
- "@magector/cli-linux-x64": "1.2.15",
38
- "@magector/cli-linux-arm64": "1.2.15",
39
- "@magector/cli-win32-x64": "1.2.15"
36
+ "@magector/cli-darwin-arm64": "1.3.0",
37
+ "@magector/cli-linux-x64": "1.3.0",
38
+ "@magector/cli-linux-arm64": "1.3.0",
39
+ "@magector/cli-win32-x64": "1.3.0"
40
40
  },
41
41
  "keywords": [
42
42
  "magento",
package/src/mcp-server.js CHANGED
@@ -75,11 +75,17 @@ let serveReadline = null;
75
75
 
76
76
  function startServeProcess() {
77
77
  try {
78
- const proc = spawn(config.rustBinary, [
78
+ const args = [
79
79
  'serve',
80
80
  '-d', config.dbPath,
81
81
  '-c', config.modelCache
82
- ], { stdio: ['pipe', 'pipe', 'pipe'], env: rustEnv });
82
+ ];
83
+ // Enable file watcher if magento root is configured
84
+ if (config.magentoRoot && existsSync(config.magentoRoot)) {
85
+ args.push('-m', config.magentoRoot);
86
+ }
87
+ const proc = spawn(config.rustBinary, args,
88
+ { stdio: ['pipe', 'pipe', 'pipe'], env: rustEnv });
83
89
 
84
90
  proc.on('error', () => { serveProcess = null; serveReady = false; });
85
91
  proc.on('exit', () => { serveProcess = null; serveReady = false; });