magector 1.2.12 → 1.2.14
- package/README.md +389 -154
- package/package.json +5 -5
package/README.md
CHANGED
|
@@ -7,7 +7,7 @@ Magector indexes an entire Magento 2 codebase and lets you search it with natura
|
|
|
7
7
|
[](https://www.rust-lang.org)
|
|
8
8
|
[](https://nodejs.org)
|
|
9
9
|
[](https://magento.com)
|
|
10
|
-
[](#validation)
|
|
11
11
|
[](LICENSE)
|
|
12
12
|
|
|
13
13
|
---
|
|
@@ -21,7 +21,7 @@ Magento 2 has **18,000+ source files** across hundreds of modules. Finding the r
|
|
|
21
21
|
| `grep` / `ripgrep` | No | No | 100-500ms |
|
|
22
22
|
| IDE search | No | No | 200-1000ms |
|
|
23
23
|
| GitHub search | Partial | No | 500-2000ms |
|
|
24
|
-
| **Magector** | **Yes** | **Yes** | **
|
|
24
|
+
| **Magector** | **Yes** | **Yes** | **10-45ms** |
|
|
25
25
|
|
|
26
26
|
Magector understands that a query about *"payment capture"* should return `Sales/Model/Order/Payment/Operations/CaptureOperation.php`, not just files containing the word "capture".
|
|
27
27
|
|
|
@@ -29,37 +29,41 @@ Magector understands that a query about *"payment capture"* should return `Sales
|
|
|
29
29
|
|
|
30
30
|
## Magector vs Built-in AI Search
|
|
31
31
|
|
|
32
|
-
Claude Code and Cursor both have built-in code search
|
|
32
|
+
Claude Code and Cursor both have built-in code search -- but they rely on keyword matching (`grep`/`ripgrep`) and file-tree heuristics. On a Magento 2 codebase with 18,000+ files, that approach breaks down fast.
|
|
33
33
|
|
|
34
34
|
| Capability | Claude Code / Cursor (built-in) | Magector |
|
|
35
35
|
|---|---|---|
|
|
36
36
|
| **Search method** | Keyword grep / ripgrep | Semantic vector search (ONNX embeddings) |
|
|
37
|
-
| **Understands intent** | No
|
|
38
|
-
| **Magento pattern awareness** | None
|
|
39
|
-
| **Query speed (
|
|
40
|
-
| **Context window cost** | Reads many wrong files
|
|
41
|
-
| **Works offline** | Yes | Yes
|
|
37
|
+
| **Understands intent** | No -- literal string matching only | Yes -- "payment capture" finds `CaptureOperation.php` |
|
|
38
|
+
| **Magento pattern awareness** | None -- treats all PHP the same | Detects controllers, plugins, observers, blocks, resolvers, cron, and 20+ patterns |
|
|
39
|
+
| **Query speed (36K vectors)** | 200-1000ms per grep pass; multiple rounds needed | 10-45ms single pass |
|
|
40
|
+
| **Context window cost** | Reads many wrong files, burns tokens | Returns structured JSON with ranked results, methods, and snippets |
|
|
41
|
+
| **Works offline** | Yes | Yes -- local ONNX model, no API calls |
|
|
42
42
|
| **Setup** | Built-in | `npx magector init` (one command) |
|
|
43
43
|
|
|
44
44
|
### What this means in practice
|
|
45
45
|
|
|
46
|
-
Without Magector, asking Claude Code or Cursor *"how are checkout totals calculated?"* triggers multiple grep searches, reads dozens of files, and still may miss the right ones. With Magector, the AI calls `magento_search("checkout totals calculation")` and gets the exact files ranked by relevance in one step
|
|
46
|
+
Without Magector, asking Claude Code or Cursor *"how are checkout totals calculated?"* triggers multiple grep searches, reads dozens of files, and still may miss the right ones. With Magector, the AI calls `magento_search("checkout totals calculation")` and gets the exact files ranked by relevance in one step -- saving tokens and time.
|
|
47
47
|
|
|
48
|
-
**Magector doesn't replace your AI tool
|
|
48
|
+
**Magector doesn't replace your AI tool -- it gives it a better search engine.**
|
|
49
49
|
|
|
50
50
|
---
|
|
51
51
|
|
|
52
52
|
## Features
|
|
53
53
|
|
|
54
54
|
- **Semantic search** -- find code by meaning, not exact keywords
|
|
55
|
-
- **
|
|
56
|
-
- **
|
|
57
|
-
- **
|
|
55
|
+
- **94.9% accuracy** -- validated with 101 E2E test queries across 16 tool categories, plus 557 Rust-level test cases
|
|
56
|
+
- **Hybrid search** -- combines semantic vector similarity with keyword re-ranking for best-of-both-worlds results
|
|
57
|
+
- **Structured JSON output** -- results include file path, class name, methods list, role badges, and content snippets for minimal round-trips
|
|
58
|
+
- **Persistent serve mode** -- keeps ONNX model and HNSW index resident in memory, eliminating cold-start latency
|
|
59
|
+
- **ONNX embeddings** -- native 384-dim transformer embeddings via ONNX Runtime
|
|
60
|
+
- **36K+ vectors** -- indexes the complete Magento 2 codebase including framework internals
|
|
58
61
|
- **Magento-aware** -- understands controllers, plugins, observers, blocks, resolvers, repositories, and 20+ Magento patterns
|
|
59
62
|
- **AST-powered** -- tree-sitter parsing for PHP and JavaScript extracts classes, methods, namespaces, and inheritance
|
|
63
|
+
- **Cross-tool discovery** -- tool descriptions include keywords and "See also" references so AI clients find the right tool on the first try
|
|
60
64
|
- **Diff analysis** -- risk scoring and change classification for git commits and staged changes
|
|
61
65
|
- **Complexity analysis** -- cyclomatic complexity, function count, and hotspot detection across modules
|
|
62
|
-
- **Fast** --
|
|
66
|
+
- **Fast** -- 10-45ms queries via persistent serve process, batched ONNX embedding with adaptive thread scaling
|
|
63
67
|
- **MCP server** -- 19 tools integrating with Claude Code, Cursor, and any MCP-compatible AI tool
|
|
64
68
|
- **Clean architecture** -- Rust core handles all indexing/search, Node.js MCP server delegates to it
|
|
65
69
|
|
|
@@ -67,52 +71,50 @@ Without Magector, asking Claude Code or Cursor *"how are checkout totals calcula
|
|
|
67
71
|
|
|
68
72
|
## Architecture
|
|
69
73
|
|
|
74
|
+
```mermaid
|
|
75
|
+
flowchart TD
|
|
76
|
+
subgraph rust ["Rust Core"]
|
|
77
|
+
A["AST Parser · PHP + JS"]
|
|
78
|
+
B["Pattern Detection · 20+"]
|
|
79
|
+
C["ONNX Embedder · 384d"]
|
|
80
|
+
D["HNSW + Reranking"]
|
|
81
|
+
A --> B --> C --> D
|
|
82
|
+
end
|
|
83
|
+
subgraph node ["Node.js Layer"]
|
|
84
|
+
E["MCP Server · 19 tools"]
|
|
85
|
+
F["Persistent Serve"]
|
|
86
|
+
G["CLI · init/index/search"]
|
|
87
|
+
E --> F
|
|
88
|
+
G --> F
|
|
89
|
+
end
|
|
90
|
+
node -->|stdin/stdout JSON| rust
|
|
91
|
+
|
|
92
|
+
style rust fill:#f4a460,color:#000
|
|
93
|
+
style node fill:#68b684,color:#000
|
|
70
94
|
```
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
│ ┌─────┴──────┐ │ │
|
|
95
|
-
│ │ HNSW │ │ │
|
|
96
|
-
│ │ Vector DB │ │ │
|
|
97
|
-
│ └────────────┘ │ │
|
|
98
|
-
└──────────────────┴───────────────────────┘
|
|
99
|
-
```
|
|
100
|
-
|
|
101
|
-
### Embedding Pipeline
|
|
102
|
-
|
|
103
|
-
```
|
|
104
|
-
Source File ──▶ Tree-sitter AST ──▶ Magento Pattern Detection ──▶ Search Text Enrichment
|
|
105
|
-
│ │
|
|
106
|
-
│ ▼
|
|
107
|
-
│ ONNX Runtime
|
|
108
|
-
│ (MiniLM-L6-v2)
|
|
109
|
-
│ │
|
|
110
|
-
│ ▼
|
|
111
|
-
│ 384-dim embedding
|
|
112
|
-
│ │
|
|
113
|
-
▼ ▼
|
|
114
|
-
Metadata ─────────────────────────────────────────────────────▶ HNSW Index
|
|
115
|
-
(path, class, namespace, type, methods, patterns) (17,891 vectors)
|
|
95
|
+
|
|
96
|
+
### Indexing Pipeline
|
|
97
|
+
|
|
98
|
+
```mermaid
|
|
99
|
+
flowchart TD
|
|
100
|
+
A[Source File] --> B[AST Parser]
|
|
101
|
+
B --> C[Pattern Detection]
|
|
102
|
+
C --> D[Text Enrichment]
|
|
103
|
+
D --> E[ONNX Embedding]
|
|
104
|
+
E --> F[(HNSW Index)]
|
|
105
|
+
A --> G[Metadata]
|
|
106
|
+
G --> F
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
### Search Pipeline
|
|
110
|
+
|
|
111
|
+
```mermaid
|
|
112
|
+
flowchart TD
|
|
113
|
+
Q[Query] --> E1[Synonym Enrichment]
|
|
114
|
+
E1 --> E2[ONNX Embedding]
|
|
115
|
+
E2 --> H[HNSW Search]
|
|
116
|
+
H --> R[Hybrid Reranking]
|
|
117
|
+
R --> J[Structured JSON]
|
|
116
118
|
```
|
|
117
119
|
|
|
118
120
|
### Components
|
|
@@ -120,12 +122,12 @@ Source File ──▶ Tree-sitter AST ──▶ Magento Pattern Detection ──
|
|
|
120
122
|
| Component | Technology | Purpose |
|
|
121
123
|
|-----------|-----------|---------|
|
|
122
124
|
| Embeddings | `ort` (ONNX Runtime) | all-MiniLM-L6-v2, 384 dimensions |
|
|
123
|
-
| Vector search | `hnsw_rs` | Approximate nearest neighbor |
|
|
125
|
+
| Vector search | `hnsw_rs` + hybrid reranking | Approximate nearest neighbor + keyword boosting |
|
|
124
126
|
| PHP parsing | `tree-sitter-php` | Class, method, namespace extraction |
|
|
125
127
|
| JS parsing | `tree-sitter-javascript` | AMD/ES6 module detection |
|
|
126
128
|
| Pattern detection | Custom Rust | 20+ Magento-specific patterns |
|
|
127
|
-
| CLI | `clap` | Command-line interface |
|
|
128
|
-
| MCP server | `@modelcontextprotocol/sdk` | AI tool integration |
|
|
129
|
+
| CLI | `clap` | Command-line interface (index, search, serve, validate) |
|
|
130
|
+
| MCP server | `@modelcontextprotocol/sdk` | AI tool integration with structured JSON output |
|
|
129
131
|
|
|
130
132
|
---
|
|
131
133
|
|
|
@@ -142,14 +144,17 @@ cd /path/to/your/magento2
|
|
|
142
144
|
npx magector init
|
|
143
145
|
```
|
|
144
146
|
|
|
145
|
-
This single command:
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
147
|
+
This single command handles the entire setup:
|
|
148
|
+
|
|
149
|
+
```mermaid
|
|
150
|
+
flowchart TD
|
|
151
|
+
A["npx magector init"] --> B[Verify Project]
|
|
152
|
+
B --> C[Download Model]
|
|
153
|
+
C --> D[Index Codebase]
|
|
154
|
+
D --> E[Detect IDE]
|
|
155
|
+
E --> F[Write Config]
|
|
156
|
+
F --> G[Update .gitignore]
|
|
157
|
+
```
|
|
153
158
|
|
|
154
159
|
### 2. Search
|
|
155
160
|
|
|
@@ -182,6 +187,7 @@ magector-core <COMMAND>
|
|
|
182
187
|
Commands:
|
|
183
188
|
index Index a Magento codebase
|
|
184
189
|
search Search the index semantically
|
|
190
|
+
serve Start persistent server mode (stdin/stdout JSON protocol)
|
|
185
191
|
validate Run validation suite (downloads Magento if needed)
|
|
186
192
|
download Download Magento 2 Open Source
|
|
187
193
|
stats Show index statistics
|
|
@@ -211,6 +217,31 @@ Options:
|
|
|
211
217
|
-f, --format <FORMAT> Output format: text, json [default: text]
|
|
212
218
|
```
|
|
213
219
|
|
|
220
|
+
#### `serve`
|
|
221
|
+
|
|
222
|
+
```bash
|
|
223
|
+
magector-core serve [OPTIONS]
|
|
224
|
+
|
|
225
|
+
Options:
|
|
226
|
+
-d, --database <PATH> Index database path [default: ./magector.db]
|
|
227
|
+
-c, --model-cache <PATH> Model cache directory [default: ./models]
|
|
228
|
+
```
|
|
229
|
+
|
|
230
|
+
Starts a persistent process that reads JSON queries from stdin and writes JSON responses to stdout. Keeps the ONNX model and HNSW index resident in memory for fast repeated queries.
|
|
231
|
+
|
|
232
|
+
**Protocol (one JSON object per line):**
|
|
233
|
+
|
|
234
|
+
```json
|
|
235
|
+
// Request:
|
|
236
|
+
{"command":"search","query":"product price","limit":10}
|
|
237
|
+
|
|
238
|
+
// Response:
|
|
239
|
+
{"ok":true,"data":[{"id":123,"score":0.85,"metadata":{...}}]}
|
|
240
|
+
|
|
241
|
+
// Stats request:
|
|
242
|
+
{"command":"stats"}
|
|
243
|
+
```
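A minimal sketch of how a client might frame messages for this protocol, assuming one JSON object per line as stated above. The helper names `encodeRequest` and `createLineDecoder` are hypothetical and not part of Magector; only the `command`, `query`, and `limit` fields come from the example.

```javascript
// Hypothetical helpers for a newline-delimited JSON protocol.
// Not Magector's actual client code; field names follow the example above.

// Serialize one request as a single line terminated by "\n".
function encodeRequest(command, params = {}) {
  return JSON.stringify({ command, ...params }) + "\n";
}

// Returns a function that accepts arbitrary stdout chunks and returns the
// complete JSON messages received so far. Partial lines stay buffered.
function createLineDecoder() {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // last element is an incomplete line (or "")
    return lines
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line));
  };
}

const decode = createLineDecoder();
const request = encodeRequest("search", { query: "product price", limit: 10 });
const first = decode('{"ok":true,"da');        // incomplete line: no messages yet
const second = decode('ta":[{"id":123}]}\n');  // newline completes the message
```

Buffering matters because a pipe can deliver a response in several chunks; parsing only on newline boundaries avoids calling `JSON.parse` on a partial object.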
|
|
244
|
+
|
|
214
245
|
### Node.js CLI
|
|
215
246
|
|
|
216
247
|
```bash
|
|
@@ -236,45 +267,112 @@ npx magector help # Show help
|
|
|
236
267
|
|
|
237
268
|
## MCP Server Tools
|
|
238
269
|
|
|
239
|
-
The MCP server exposes 19 tools for AI-assisted Magento development
|
|
270
|
+
The MCP server exposes 19 tools for AI-assisted Magento development. All search tools return **structured JSON** with file paths, class names, methods, role badges, and content snippets -- enabling AI clients to parse results programmatically and minimize file-read round-trips.
|
|
271
|
+
|
|
272
|
+
### Output Format
|
|
273
|
+
|
|
274
|
+
All search tools return structured JSON:
|
|
275
|
+
|
|
276
|
+
```json
|
|
277
|
+
{
|
|
278
|
+
"results": [
|
|
279
|
+
{
|
|
280
|
+
"rank": 1,
|
|
281
|
+
"score": 0.892,
|
|
282
|
+
"path": "vendor/magento/module-catalog/Model/ProductRepository.php",
|
|
283
|
+
"module": "Magento_Catalog",
|
|
284
|
+
"className": "ProductRepository",
|
|
285
|
+
"namespace": "Magento\\Catalog\\Model",
|
|
286
|
+
"methods": ["save", "getById", "getList", "delete", "deleteById"],
|
|
287
|
+
"magentoType": "repository",
|
|
288
|
+
"fileType": "php",
|
|
289
|
+
"badges": ["repository"],
|
|
290
|
+
"snippet": "class ProductRepository implements ProductRepositoryInterface..."
|
|
291
|
+
}
|
|
292
|
+
],
|
|
293
|
+
"count": 1
|
|
294
|
+
}
|
|
295
|
+
```
|
|
296
|
+
|
|
297
|
+
**Key fields:**
|
|
298
|
+
- `methods` -- list of method names in the class (avoids needing to read the file)
|
|
299
|
+
- `badges` -- role indicators: `plugin`, `controller`, `observer`, `repository`, `graphql-resolver`, `model`, `block`
|
|
300
|
+
- `snippet` -- first 300 characters of indexed content for quick assessment
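As an illustration of why the structured fields reduce file reads, the sketch below filters a response by badge and method name without opening any file. The response object mirrors the example above; the helper `findByBadgeAndMethod` is hypothetical.

```javascript
// Shape follows the example response above; the helper is an illustration only.
const response = {
  results: [
    {
      rank: 1,
      score: 0.892,
      path: "vendor/magento/module-catalog/Model/ProductRepository.php",
      className: "ProductRepository",
      methods: ["save", "getById", "getList", "delete", "deleteById"],
      badges: ["repository"],
      snippet: "class ProductRepository implements ProductRepositoryInterface...",
    },
  ],
  count: 1,
};

// Keep results that carry a given badge and expose a given method.
function findByBadgeAndMethod(resp, badge, method) {
  return resp.results
    .filter((r) => (r.badges || []).includes(badge) && (r.methods || []).includes(method))
    .map((r) => ({ path: r.path, className: r.className, score: r.score }));
}

const hits = findByBadgeAndMethod(response, "repository", "getById");
```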
|
|
240
301
|
|
|
241
302
|
### Search Tools
|
|
242
303
|
|
|
243
304
|
| Tool | Description |
|
|
244
305
|
|------|-------------|
|
|
245
|
-
| `magento_search` | Semantic
|
|
246
|
-
| `magento_find_class` | Find PHP class, interface, or trait by name |
|
|
306
|
+
| `magento_search` | Semantic search -- find any PHP class, method, XML config, template, or GraphQL schema by natural language |
|
|
307
|
+
| `magento_find_class` | Find PHP class, interface, abstract class, or trait by name |
|
|
247
308
|
| `magento_find_method` | Find method implementations across the codebase |
|
|
248
309
|
|
|
249
310
|
### Magento-Specific Finders
|
|
250
311
|
|
|
251
312
|
| Tool | Description |
|
|
252
313
|
|------|-------------|
|
|
253
|
-
| `magento_find_config` | Find XML configuration
|
|
254
|
-
| `magento_find_template` | Find PHTML template files |
|
|
255
|
-
| `magento_find_plugin` | Find interceptor plugins and
|
|
256
|
-
| `magento_find_observer` | Find event observers |
|
|
257
|
-
| `
|
|
258
|
-
| `
|
|
259
|
-
| `
|
|
260
|
-
| `
|
|
261
|
-
| `
|
|
262
|
-
| `
|
|
314
|
+
| `magento_find_config` | Find XML configuration (di.xml, events.xml, routes.xml, system.xml, webapi.xml, module.xml, layout) |
|
|
315
|
+
| `magento_find_template` | Find PHTML template files for frontend or admin rendering |
|
|
316
|
+
| `magento_find_plugin` | Find interceptor plugins (before/after/around methods) and di.xml declarations |
|
|
317
|
+
| `magento_find_observer` | Find event observers and events.xml declarations |
|
|
318
|
+
| `magento_find_preference` | Find DI preference overrides -- which class implements an interface |
|
|
319
|
+
| `magento_find_controller` | Find MVC controllers by frontend or admin route path |
|
|
320
|
+
| `magento_find_block` | Find Block classes for view rendering |
|
|
321
|
+
| `magento_find_graphql` | Find GraphQL schema definitions, resolvers, types, queries, and mutations |
|
|
322
|
+
| `magento_find_api` | Find REST/SOAP API endpoints in webapi.xml |
|
|
323
|
+
| `magento_find_cron` | Find cron job definitions in crontab.xml |
|
|
324
|
+
| `magento_find_db_schema` | Find database table definitions in db_schema.xml (declarative schema) |
|
|
263
325
|
|
|
264
326
|
### Analysis Tools
|
|
265
327
|
|
|
266
328
|
| Tool | Description |
|
|
267
329
|
|------|-------------|
|
|
268
330
|
| `magento_analyze_diff` | Analyze git diffs for risk scoring and change classification |
|
|
269
|
-
| `magento_complexity` | Analyze
|
|
331
|
+
| `magento_complexity` | Analyze cyclomatic complexity, function count, and line count |
|
|
270
332
|
|
|
271
333
|
### Utility Tools
|
|
272
334
|
|
|
273
335
|
| Tool | Description |
|
|
274
336
|
|------|-------------|
|
|
275
|
-
| `magento_module_structure` | Show module
|
|
337
|
+
| `magento_module_structure` | Show complete module structure -- controllers, models, blocks, plugins, observers, configs |
|
|
276
338
|
| `magento_index` | Trigger re-indexing of the codebase |
|
|
277
|
-
| `magento_stats` | View index statistics
|
|
339
|
+
| `magento_stats` | View index statistics |
|
|
340
|
+
|
|
341
|
+
### Tool Cross-References
|
|
342
|
+
|
|
343
|
+
Each tool description includes "See also" hints to help AI clients chain tools effectively:
|
|
344
|
+
|
|
345
|
+
```mermaid
|
|
346
|
+
graph TD
|
|
347
|
+
cls["find_class"] --> plg["find_plugin"]
|
|
348
|
+
cls --> prf["find_preference"]
|
|
349
|
+
cls --> mtd["find_method"]
|
|
350
|
+
cfg["find_config"] --> obs["find_observer"]
|
|
351
|
+
cfg --> prf
|
|
352
|
+
cfg --> api["find_api"]
|
|
353
|
+
plg --> cls
|
|
354
|
+
plg --> mtd
|
|
355
|
+
tpl["find_template"] --> blk["find_block"]
|
|
356
|
+
blk --> tpl
|
|
357
|
+
blk --> cfg
|
|
358
|
+
dbs["find_db_schema"] --> cls
|
|
359
|
+
gql["find_graphql"] --> cls
|
|
360
|
+
gql --> mtd
|
|
361
|
+
ctl["find_controller"] --> cfg
|
|
362
|
+
|
|
363
|
+
style cls fill:#4a90d9,color:#fff
|
|
364
|
+
style mtd fill:#4a90d9,color:#fff
|
|
365
|
+
style cfg fill:#e8a838,color:#000
|
|
366
|
+
style plg fill:#d94a4a,color:#fff
|
|
367
|
+
style obs fill:#d94a4a,color:#fff
|
|
368
|
+
style prf fill:#e8a838,color:#000
|
|
369
|
+
style api fill:#e8a838,color:#000
|
|
370
|
+
style tpl fill:#68b684,color:#000
|
|
371
|
+
style blk fill:#68b684,color:#000
|
|
372
|
+
style dbs fill:#9b59b6,color:#fff
|
|
373
|
+
style gql fill:#9b59b6,color:#fff
|
|
374
|
+
style ctl fill:#4a90d9,color:#fff
|
|
375
|
+
```
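The diagram can also be read as an adjacency list. The sketch below encodes the edges shown above and walks them breadth-first; the `SEE_ALSO` constant and `reachableTools` function are hypothetical names for illustration, and how the MCP server stores these hints internally is not stated in the README.

```javascript
// Edges copied from the cross-reference diagram above.
const SEE_ALSO = {
  find_class: ["find_plugin", "find_preference", "find_method"],
  find_config: ["find_observer", "find_preference", "find_api"],
  find_plugin: ["find_class", "find_method"],
  find_template: ["find_block"],
  find_block: ["find_template", "find_config"],
  find_db_schema: ["find_class"],
  find_graphql: ["find_class", "find_method"],
  find_controller: ["find_config"],
};

// Breadth-first walk: every tool reachable from `start` within `maxHops` hints.
function reachableTools(start, maxHops) {
  const seen = new Set([start]);
  let frontier = [start];
  for (let hop = 0; hop < maxHops; hop++) {
    const next = [];
    for (const tool of frontier) {
      for (const neighbor of SEE_ALSO[tool] || []) {
        if (!seen.has(neighbor)) {
          seen.add(neighbor);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  seen.delete(start); // report only the other tools
  return [...seen].sort();
}

const fromController = reachableTools("find_controller", 2);
```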
|
|
278
376
|
|
|
279
377
|
### Query Examples
|
|
280
378
|
|
|
@@ -282,11 +380,18 @@ The MCP server exposes 19 tools for AI-assisted Magento development:
|
|
|
282
380
|
magento_search("how are checkout totals calculated")
|
|
283
381
|
magento_search("product price with tier pricing and catalog rules")
|
|
284
382
|
magento_find_class("ProductRepositoryInterface")
|
|
383
|
+
magento_find_method("getById")
|
|
285
384
|
magento_find_config("di.xml plugin for ProductRepository")
|
|
286
|
-
magento_find_plugin("
|
|
385
|
+
magento_find_plugin({ targetClass: "Topmenu" })
|
|
287
386
|
magento_find_observer("sales_order_place_after")
|
|
288
|
-
|
|
289
|
-
|
|
387
|
+
magento_find_preference("StoreManagerInterface")
|
|
388
|
+
magento_find_api("/V1/orders")
|
|
389
|
+
magento_find_controller("catalog/product/view")
|
|
390
|
+
magento_find_graphql("placeOrder")
|
|
391
|
+
magento_find_db_schema("sales_order")
|
|
392
|
+
magento_find_cron("indexer")
|
|
393
|
+
magento_find_block("cart totals")
|
|
394
|
+
magento_find_template("minicart")
|
|
290
395
|
magento_analyze_diff({ commitHash: "abc123" })
|
|
291
396
|
magento_complexity({ module: "Magento_Catalog", threshold: 10 })
|
|
292
397
|
```
|
|
@@ -310,43 +415,70 @@ Pre-built binaries are provided for the following platforms:
|
|
|
310
415
|
|
|
311
416
|
## Validation
|
|
312
417
|
|
|
313
|
-
Magector is validated
|
|
314
|
-
|
|
315
|
-
### Overall Results
|
|
316
|
-
|
|
317
|
-
| Metric | Value |
|
|
318
|
-
|--------|-------|
|
|
319
|
-
| **Accuracy** | **96.1%** |
|
|
320
|
-
| Tests passed | 535 / 557 |
|
|
321
|
-
| Index size | 17,891 vectors |
|
|
322
|
-
| Query time | 15-45ms |
|
|
323
|
-
| Indexing time | ~3 minutes |
|
|
324
|
-
|
|
325
|
-
### Category Performance
|
|
418
|
+
Magector is validated at two levels:
|
|
326
419
|
|
|
327
|
-
**
|
|
328
|
-
|
|
420
|
+
1. **E2E MCP accuracy tests** -- 101 queries across 16 tool categories via stdio JSON-RPC
|
|
421
|
+
2. **Rust-level validation** -- 557 test cases across 50+ categories against Magento 2.4.7
|
|
329
422
|
|
|
330
|
-
|
|
331
|
-
Catalog Product (96%), Customer Advanced (95%), Checkout Flow (95%), Shipping Advanced (93.3%), Category (93.3%), Frontend JS (90%), Search (90%)
|
|
423
|
+
### E2E Accuracy (MCP Tools)
|
|
332
424
|
|
|
333
|
-
|
|
334
|
-
|
|
335
|
-
|
|
425
|
+
```mermaid
|
|
426
|
+
---
|
|
427
|
+
config:
|
|
428
|
+
themeVariables:
|
|
429
|
+
pie1: "#4caf50"
|
|
430
|
+
pie2: "#f44336"
|
|
431
|
+
---
|
|
432
|
+
pie title Test Pass Rate (101 queries)
|
|
433
|
+
"Passed (101)" : 101
|
|
434
|
+
"Failed (0)" : 0
|
|
435
|
+
```
|
|
336
436
|
|
|
337
|
-
|
|
437
|
+
| Metric | Value |
|
|
438
|
+
|--------|-------|
|
|
439
|
+
| **Grade** | **A (94.9/100)** |
|
|
440
|
+
| **Pass rate** | 101/101 (100%) |
|
|
441
|
+
| **Precision** | 93.2% |
|
|
442
|
+
| **MRR** | 99.2% |
|
|
443
|
+
| **NDCG@10** | 85.5% |
|
|
444
|
+
| **Index size** | 35,795 vectors |
|
|
445
|
+
| **Query time** | 10-45ms |
|
|
446
|
+
|
|
447
|
+
#### Per-Tool Performance
|
|
448
|
+
|
|
449
|
+
| Tool | Pass | Precision | MRR | NDCG |
|
|
450
|
+
|------|------|-----------|-----|------|
|
|
451
|
+
| find_class | 100% | 100% | 100% | 100% |
|
|
452
|
+
| find_method | 100% | 89% | 100% | 87% |
|
|
453
|
+
| find_controller | 100% | 100% | 100% | -- |
|
|
454
|
+
| find_observer | 100% | 100% | 100% | 100% |
|
|
455
|
+
| find_plugin | 100% | 96% | 100% | 100% |
|
|
456
|
+
| find_preference | 100% | 100% | 100% | 100% |
|
|
457
|
+
| find_api | 100% | 100% | 100% | 100% |
|
|
458
|
+
| find_cron | 100% | 100% | 100% | 100% |
|
|
459
|
+
| find_db_schema | 100% | 100% | 100% | 100% |
|
|
460
|
+
| find_graphql | 100% | 100% | 100% | 100% |
|
|
461
|
+
| find_block | 100% | 100% | 100% | 100% |
|
|
462
|
+
| find_config | 100% | 89% | 89% | 93% |
|
|
463
|
+
| find_template | 100% | 84% | 100% | 100% |
|
|
464
|
+
| search | 100% | 99% | 100% | 100% |
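For readers unfamiliar with the metric columns, here is a sketch of the standard definitions of precision, MRR, and NDCG with binary relevance. The README does not say how Magector's accuracy calculator computes them, so treat this as the textbook form under two assumptions: each result is either relevant or not, and the ideal ranking is the same result list sorted with relevant items first.

```javascript
// Each query is an array of booleans: relevant[i] is true when the result
// at rank i + 1 is judged relevant.

// Reciprocal rank of the first relevant result (0 if there is none).
function reciprocalRank(relevant) {
  const idx = relevant.indexOf(true);
  return idx === -1 ? 0 : 1 / (idx + 1);
}

// Fraction of returned results that are relevant.
function precision(relevant) {
  if (relevant.length === 0) return 0;
  return relevant.filter(Boolean).length / relevant.length;
}

// Discounted cumulative gain over the top k results.
function dcg(relevant, k) {
  return relevant
    .slice(0, k)
    .reduce((sum, rel, i) => sum + (rel ? 1 : 0) / Math.log2(i + 2), 0);
}

// DCG normalized by the DCG of the ideal ordering of the same list.
function ndcg(relevant, k = 10) {
  const ideal = [...relevant].sort((a, b) => Number(b) - Number(a));
  const best = dcg(ideal, k);
  return best === 0 ? 0 : dcg(relevant, k) / best;
}

function mean(values) {
  return values.reduce((a, b) => a + b, 0) / values.length;
}

const queries = [
  [true, true, false],  // first hit at rank 1
  [false, true, false], // first hit at rank 2
];
const mrr = mean(queries.map(reciprocalRank)); // (1 + 0.5) / 2
```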
|
|
465
|
+
|
|
466
|
+
### Integration Tests
|
|
467
|
+
|
|
468
|
+
62 integration tests covering MCP protocol compliance, tool schemas, tool calls, analysis tools, and stdout JSON integrity.
|
|
469
|
+
|
|
470
|
+
### Running Tests
|
|
338
471
|
|
|
339
472
|
```bash
|
|
340
|
-
#
|
|
341
|
-
|
|
342
|
-
|
|
473
|
+
# E2E accuracy tests (101 queries, requires indexed codebase)
|
|
474
|
+
npm run test:accuracy
|
|
475
|
+
npm run test:accuracy:verbose
|
|
343
476
|
|
|
344
|
-
#
|
|
345
|
-
|
|
477
|
+
# Integration tests (62 tests)
|
|
478
|
+
npm test
|
|
346
479
|
|
|
347
|
-
#
|
|
348
|
-
|
|
349
|
-
npm run validate:verbose
|
|
480
|
+
# Rust validation (557 test cases)
|
|
481
|
+
cd rust-core && cargo run --release -- validate -m ./magento2 --skip-index
|
|
350
482
|
```
|
|
351
483
|
|
|
352
484
|
---
|
|
@@ -357,7 +489,7 @@ npm run validate:verbose
|
|
|
357
489
|
magector/
|
|
358
490
|
├── src/ # Node.js source
|
|
359
491
|
│ ├── cli.js # CLI entry point (npx magector <command>)
|
|
360
|
-
│ ├── mcp-server.js # MCP server (19 tools,
|
|
492
|
+
│ ├── mcp-server.js # MCP server (19 tools, structured JSON output)
|
|
361
493
|
│ ├── binary.js # Platform binary resolver
|
|
362
494
|
│ ├── model.js # ONNX model resolver/downloader
|
|
363
495
|
│ ├── init.js # Full init command (index + IDE config)
|
|
@@ -372,7 +504,10 @@ magector/
|
|
|
372
504
|
│ ├── test-data-generator.js
|
|
373
505
|
│ └── accuracy-calculator.js
|
|
374
506
|
├── tests/ # Automated tests
|
|
375
|
-
│
|
|
507
|
+
│ ├── mcp-server.test.js # Integration tests (62 tests)
|
|
508
|
+
│ ├── mcp-accuracy.test.js # E2E accuracy tests (101 queries)
|
|
509
|
+
│ └── results/ # Test result artifacts
|
|
510
|
+
│ └── accuracy-report.json
|
|
376
511
|
├── platforms/ # Platform-specific binary packages
|
|
377
512
|
│ ├── darwin-arm64/ # macOS ARM (Apple Silicon)
|
|
378
513
|
│ ├── linux-x64/ # Linux x64
|
|
@@ -381,11 +516,11 @@ magector/
|
|
|
381
516
|
├── rust-core/ # Rust high-performance core
|
|
382
517
|
│ ├── Cargo.toml
|
|
383
518
|
│ ├── src/
|
|
384
|
-
│ │ ├── main.rs # Rust CLI (index, search, validate)
|
|
519
|
+
│ │ ├── main.rs # Rust CLI (index, search, serve, validate)
|
|
385
520
|
│ │ ├── lib.rs # Library exports
|
|
386
521
|
│ │ ├── indexer.rs # Core indexing with progress output
|
|
387
522
|
│ │ ├── embedder.rs # ONNX embedding (MiniLM-L6-v2)
|
|
388
|
-
│ │ ├── vectordb.rs # HNSW vector database
|
|
523
|
+
│ │ ├── vectordb.rs # HNSW vector database + hybrid search
|
|
389
524
|
│ │ ├── ast.rs # Tree-sitter AST (PHP + JS)
|
|
390
525
|
│ │ ├── magento.rs # Magento pattern detection (Rust)
|
|
391
526
|
│ │ └── validation.rs # 557 test cases, validation framework
|
|
@@ -424,32 +559,88 @@ Magector scans every `.php`, `.js`, `.xml`, `.phtml`, and `.graphqls` file in a
|
|
|
424
559
|
1. Query text is enriched with pattern synonyms (e.g., "controller" adds "action execute http request dispatch")
|
|
425
560
|
2. The enriched query is embedded into the same 384-dimensional vector space
|
|
426
561
|
3. HNSW finds the nearest neighbors by cosine similarity
|
|
427
|
-
4.
|
|
562
|
+
4. **Hybrid reranking** boosts results with keyword matches in path and search text
|
|
563
|
+
5. Results are returned as structured JSON with file path, class name, methods, role badges, and content snippet
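Steps 3 and 4 can be sketched as cosine similarity plus a keyword boost. This is an illustration only: the real implementation lives in the Rust core, and the boost weight (`perHit = 0.05`), the field names, and the toy 3-dimensional vectors are assumptions made for the example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Add a fixed bonus for each query term found in the path or search text.
function keywordBoost(query, doc, perHit = 0.05) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const haystack = (doc.path + " " + doc.searchText).toLowerCase();
  return terms.filter((t) => haystack.includes(t)).length * perHit;
}

// Combine both signals and sort by descending score.
function rerank(query, queryVec, docs) {
  return docs
    .map((d) => ({ path: d.path, score: cosine(queryVec, d.vector) + keywordBoost(query, d) }))
    .sort((x, y) => y.score - x.score);
}

// Toy 3-dimensional vectors stand in for the 384-dimensional embeddings.
const docs = [
  { path: "Model/Order/Invoice.php", searchText: "invoice model", vector: [0.9, 0.1, 0] },
  {
    path: "Model/Order/Payment/Operations/CaptureOperation.php",
    searchText: "payment capture operation",
    vector: [0.8, 0.2, 0],
  },
];
const ranked = rerank("payment capture", [1, 0, 0], docs);
```

In this toy data the invoice file is slightly closer in vector space, but the two keyword hits lift `CaptureOperation.php` to the top, which is the effect hybrid reranking is meant to have.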
|
|
564
|
+
|
|
565
|
+
### 3. Persistent Serve Mode
|
|
566
|
+
|
|
567
|
+
The MCP server spawns a persistent Rust process (`magector-core serve`) that keeps the ONNX model and HNSW index loaded in memory. Queries are sent as JSON over stdin and responses are returned via stdout -- eliminating the ~2.6s cold-start overhead of loading the model per query. It falls back to single-shot `execFileSync` if the serve process is unavailable.
|
|
568
|
+
|
|
569
|
+
```mermaid
|
|
570
|
+
flowchart TD
|
|
571
|
+
subgraph startup ["Startup (once)"]
|
|
572
|
+
S1[Load Model] --> S2[Load Index]
|
|
573
|
+
S2 --> S3[Ready Signal]
|
|
574
|
+
end
|
|
575
|
+
subgraph query ["Per Query (10-45ms)"]
|
|
576
|
+
Q1[stdin JSON] --> Q2[Embed]
|
|
577
|
+
Q2 --> Q3[HNSW Search]
|
|
578
|
+
Q3 --> Q4[Rerank]
|
|
579
|
+
Q4 --> Q5[stdout JSON]
|
|
580
|
+
end
|
|
581
|
+
startup --> query
|
|
582
|
+
subgraph fallback ["Fallback"]
|
|
583
|
+
F1[execFileSync ~2.6s]
|
|
584
|
+
end
|
|
585
|
+
|
|
586
|
+
style startup fill:#e8f4e8,color:#000
|
|
587
|
+
style query fill:#e8e8f4,color:#000
|
|
588
|
+
style fallback fill:#f4e8e8,color:#000
|
|
589
|
+
```
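The fallback path in the diagram can be sketched as a wrapper that prefers the persistent process and switches to single-shot execution once the process fails. Both transports are injected as plain functions here so the logic stays testable; `createSearcher` and its parameters are hypothetical, and a real client would talk to the serve process asynchronously.

```javascript
// Hypothetical wrapper: prefer the persistent serve process, fall back to a
// single-shot invocation when the process is missing or throws.
function createSearcher(serveQuery, singleShotQuery) {
  let serveAvailable = typeof serveQuery === "function";
  return function search(request) {
    if (serveAvailable) {
      try {
        return { via: "serve", data: serveQuery(request) };
      } catch (err) {
        serveAvailable = false; // stop retrying a dead process
      }
    }
    return { via: "single-shot", data: singleShotQuery(request) };
  };
}

// Simulate a serve process that has exited.
const calls = [];
const search = createSearcher(
  () => {
    throw new Error("serve process exited");
  },
  (req) => {
    calls.push(req.query);
    return [];
  }
);
const result = search({ command: "search", query: "checkout totals", limit: 5 });
```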
|
|
428
590
|
|
|
429
|
-
###
|
|
591
|
+
### 4. MCP Integration
|
|
430
592
|
|
|
431
593
|
The MCP server delegates all search/index operations to the Rust core binary. Analysis tools (diff, complexity) use ruvector JS modules directly.
|
|
432
594
|
|
|
433
|
-
```
|
|
434
|
-
|
|
435
|
-
|
|
436
|
-
|
|
437
|
-
|
|
438
|
-
|
|
439
|
-
|
|
440
|
-
|
|
441
|
-
|
|
442
|
-
|
|
443
|
-
|
|
444
|
-
|
|
445
|
-
|
|
446
|
-
|
|
595
|
+
```mermaid
|
|
596
|
+
sequenceDiagram
|
|
597
|
+
participant Dev
|
|
598
|
+
participant AI
|
|
599
|
+
participant MCP
|
|
600
|
+
participant Rust
|
|
601
|
+
participant HNSW
|
|
602
|
+
|
|
603
|
+
Dev->>AI: "checkout totals?"
|
|
604
|
+
AI->>MCP: magento_search(...)
|
|
605
|
+
MCP->>Rust: JSON query
|
|
606
|
+
Rust->>HNSW: embed + search
|
|
607
|
+
HNSW-->>Rust: candidates
|
|
608
|
+
Rust-->>MCP: JSON results
|
|
609
|
+
MCP-->>AI: paths, methods, badges
|
|
610
|
+
AI-->>Dev: TotalsCollector.php
|
|
447
611
|
```
|
|
448
612
|
|
|
449
613
|
---
|
|
450
614
|
|
|
451
615
|
## Magento Patterns Detected
|
|
452
616
|
|
|
617
|
+
```mermaid
|
|
618
|
+
mindmap
|
|
619
|
+
root((Patterns))
|
|
620
|
+
PHP
|
|
621
|
+
Controller
|
|
622
|
+
Model
|
|
623
|
+
Repository
|
|
624
|
+
Block
|
|
625
|
+
Helper
|
|
626
|
+
ViewModel
|
|
627
|
+
Interception
|
|
628
|
+
Plugin
|
|
629
|
+
Observer
|
|
630
|
+
Preference
|
|
631
|
+
XML
|
|
632
|
+
di.xml
|
|
633
|
+
events.xml
|
|
634
|
+
webapi.xml
|
|
635
|
+
routes.xml
|
|
636
|
+
crontab.xml
|
|
637
|
+
db_schema.xml
|
|
638
|
+
Frontend
|
|
639
|
+
Template
|
|
640
|
+
JavaScript
|
|
641
|
+
GraphQL
|
|
642
|
+
```
|
|
643
|
+
|
|
453
644
|
Magector understands these Magento 2 architectural patterns:
|
|
454
645
|
|
|
455
646
|
| Pattern | Detection Method | Example |
|
|
@@ -490,7 +681,7 @@ Copy `.cursorrules` to your Magento project root for optimized AI-assisted devel
|
|
|
490
681
|
|
|
491
682
|
### Model Configuration
|
|
492
683
|
|
|
493
|
-
The ONNX model (`all-MiniLM-L6-v2`) is automatically downloaded on first run to
|
|
684
|
+
The ONNX model (`all-MiniLM-L6-v2`) is automatically downloaded on first run to `~/.magector/models/`. To use a different location:
|
|
494
685
|
|
|
495
686
|
```bash
|
|
496
687
|
magector-core index -m /path/to/magento -c /custom/model/path
|
|
@@ -535,16 +726,20 @@ cargo run --release -- validate
|
|
|
535
726
|
### Testing
|
|
536
727
|
|
|
537
728
|
```bash
|
|
538
|
-
#
|
|
729
|
+
# Integration tests (62 tests, requires indexed codebase)
|
|
539
730
|
npm test
|
|
540
731
|
|
|
732
|
+
# E2E accuracy tests (101 queries)
|
|
733
|
+
npm run test:accuracy
|
|
734
|
+
npm run test:accuracy:verbose
|
|
735
|
+
|
|
541
736
|
# Run without index (unit + schema tests only)
|
|
542
737
|
npm run test:no-index
|
|
543
738
|
|
|
544
|
-
#
|
|
739
|
+
# Rust unit tests
|
|
545
740
|
cd rust-core && cargo test
|
|
546
741
|
|
|
547
|
-
#
|
|
742
|
+
# Rust validation (557 test cases)
|
|
548
743
|
cd rust-core && cargo run --release -- validate -m ./magento2 --skip-index
|
|
549
744
|
```
|
|
550
745
|
|
|
@@ -553,18 +748,23 @@ cd rust-core && cargo run --release -- validate -m ./magento2 --skip-index
|
|
|
553
748
|
1. Add pattern detection in `rust-core/src/magento.rs`
|
|
554
749
|
2. Add search text enrichment in `rust-core/src/indexer.rs`
|
|
555
750
|
3. Add validation test cases in `rust-core/src/validation.rs`
|
|
556
|
-
4.
|
|
751
|
+
4. Add E2E accuracy test cases in `tests/mcp-accuracy.test.js`
|
|
752
|
+
5. Rebuild and run validation to verify:
|
|
557
753
|
|
|
558
754
|
```bash
|
|
559
755
|
cargo build --release
|
|
560
756
|
./target/release/magector-core validate -m ./magento2 --skip-index
|
|
757
|
+
npm run test:accuracy
|
|
561
758
|
```
|
|
562
759
|
|
|
563
760
|
### Adding MCP Tools
|
|
564
761
|
|
|
565
762
|
1. Define the tool schema in `src/mcp-server.js` (ListToolsRequestSchema handler)
|
|
566
|
-
2.
|
|
567
|
-
3.
|
|
763
|
+
2. Include keyword-rich descriptions and cross-tool "See also" references
|
|
764
|
+
3. Implement the handler in the CallToolRequestSchema handler
|
|
765
|
+
4. Return structured JSON via `formatSearchResults()`
|
|
766
|
+
5. Add E2E test cases in `tests/mcp-accuracy.test.js`
|
|
767
|
+
6. Test with Claude Code or the MCP inspector
|
|
568
768
|
|
|
569
769
|
---
|
|
570
770
|
|
|
@@ -582,8 +782,10 @@ cargo build --release
|
|
|
582
782
|
|
|
583
783
|
- **Algorithm:** HNSW (Hierarchical Navigable Small World)
|
|
584
784
|
- **Library:** `hnsw_rs`
|
|
785
|
+
- **Parameters:** M=32, max_layers=16, ef_construction=200
|
|
585
786
|
- **Distance metric:** Cosine similarity
|
|
586
|
-
- **
|
|
787
|
+
- **Hybrid search:** Semantic nearest-neighbor + keyword reranking in path and search text
|
|
788
|
+
- **Persistence:** Bincode binary serialization
|
|
587
789
|
|
|
588
790
|
### Index Structure
|
|
589
791
|
|
|
@@ -596,13 +798,15 @@ struct IndexMetadata {
|
|
|
596
798
|
magento_type: String, // controller, model, block, plugin, ...
|
|
597
799
|
class_name: Option<String>,
|
|
598
800
|
namespace: Option<String>,
|
|
599
|
-
methods: Vec<String>,
|
|
600
|
-
search_text: String, //
|
|
801
|
+
methods: Vec<String>, // extracted method names
|
|
802
|
+
search_text: String, // enriched searchable text
|
|
601
803
|
is_controller: bool,
|
|
602
804
|
is_plugin: bool,
|
|
603
805
|
is_observer: bool,
|
|
604
806
|
is_model: bool,
|
|
605
807
|
is_block: bool,
|
|
808
|
+
is_repository: bool,
|
|
809
|
+
is_resolver: bool,
|
|
606
810
|
// ... 20+ pattern flags
|
|
607
811
|
}
|
|
608
812
|
```
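The boolean pattern flags map naturally onto the `badges` array returned by the search tools. The sketch below shows one plausible mapping; the pairing of flag names to badge strings (for example `is_resolver` to `graphql-resolver`) is an assumption, since the README lists both sets but not how they are connected.

```javascript
// Assumed mapping from IndexMetadata flag names to badge strings.
const FLAG_TO_BADGE = [
  ["is_plugin", "plugin"],
  ["is_controller", "controller"],
  ["is_observer", "observer"],
  ["is_repository", "repository"],
  ["is_resolver", "graphql-resolver"],
  ["is_model", "model"],
  ["is_block", "block"],
];

// Collect the badge for every flag that is set to true, in table order.
function badgesFor(metadata) {
  return FLAG_TO_BADGE
    .filter(([flag]) => metadata[flag] === true)
    .map(([, badge]) => badge);
}

const badges = badgesFor({ is_model: true, is_repository: true, is_plugin: false });
```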
|
|
@@ -611,8 +815,9 @@ struct IndexMetadata {
|
|
|
611
815
|
|
|
612
816
|
| Operation | Time | Notes |
|
|
613
817
|
|-----------|------|-------|
|
|
614
|
-
| Full index (
|
|
615
|
-
| Single query |
|
|
818
|
+
| Full index (36K vectors) | ~1 min | Parallel parsing + batched ONNX embedding |
|
|
819
|
+
| Single query (warm) | 10-45ms | Persistent serve process, HNSW + rerank |
|
|
820
|
+
| Single query (cold) | ~2.6s | Includes ONNX model + index load |
|
|
616
821
|
| Embedding generation | ~2ms | ONNX Runtime with CoreML/CUDA |
|
|
617
822
|
| Batch embedding (32) | ~30ms | Batched ONNX inference |
|
|
618
823
|
| Model load | ~500ms | One-time at startup |
|
|
@@ -620,19 +825,49 @@ struct IndexMetadata {
|
|
|
620
825
|
|
|
621
826
|
### Performance Optimizations
|
|
622
827
|
|
|
828
|
+
- **Persistent serve mode** -- Rust process keeps ONNX model + HNSW index in memory via stdin/stdout JSON protocol
|
|
829
|
+
- **Query cache** -- LRU cache (200 entries) avoids re-embedding identical queries
|
|
830
|
+
- **Hybrid reranking** -- combines semantic similarity with keyword matching for better precision
|
|
623
831
|
- **Batched ONNX embedding** -- 32 texts per inference call (vs. 1-at-a-time), 3-5x faster embedding
|
|
624
|
-
- **Dynamic thread scaling** -- ONNX intra-op threads scale to CPU core count
|
|
832
|
+
- **Dynamic thread scaling** -- ONNX intra-op threads scale to CPU core count
|
|
625
833
|
- **Thread-local AST parsers** -- each rayon thread gets its own tree-sitter parser (no mutex contention)
|
|
626
834
|
- **Bincode persistence** -- binary serialization replaces JSON (3-5x faster save/load, ~5x smaller files)
|
|
627
|
-
- **Adaptive HNSW capacity** -- pre-sized to actual vector count
|
|
835
|
+
- **Adaptive HNSW capacity** -- pre-sized to actual vector count
|
|
628
836
|
- **Parallel HNSW insert** -- batch insert uses hnsw_rs parallel insertion on load and index
|
|
629
|
-
- **
|
|
837
|
+
- **Tuned ef_search** -- optimized search parameters for 36K vector index (ef_search=50 for search, 64 for hybrid)
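The query cache bullet above describes an LRU cache that skips re-embedding for repeated queries. A minimal std-only sketch of that idea follows. The `QueryCache` type, its methods, and `embed_cached` are hypothetical names, and the linear-time recency update is a simplification that is acceptable at a capacity of around 200 entries.

```rust
// Illustrative sketch only: a small LRU cache keyed by query text.
// Type and method names are assumptions, not Magector's implementation.
use std::collections::{HashMap, VecDeque};

struct QueryCache {
    capacity: usize,
    entries: HashMap<String, Vec<f32>>,
    order: VecDeque<String>, // front = least recently used
}

impl QueryCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Move an existing key to the most-recently-used position.
    fn touch(&mut self, query: &str) {
        if let Some(pos) = self.order.iter().position(|q| q.as_str() == query) {
            if let Some(key) = self.order.remove(pos) {
                self.order.push_back(key);
            }
        }
    }

    /// Look up a cached embedding, refreshing its recency on a hit.
    fn get(&mut self, query: &str) -> Option<&[f32]> {
        if self.entries.contains_key(query) {
            self.touch(query);
        }
        self.entries.get(query).map(|v| v.as_slice())
    }

    /// Insert or update an embedding, evicting the least recently used entry when full.
    fn put(&mut self, query: String, embedding: Vec<f32>) {
        if self.entries.contains_key(&query) {
            self.touch(&query);
        } else {
            if self.entries.len() == self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.order.push_back(query.clone());
        }
        self.entries.insert(query, embedding);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

/// Return the cached embedding, or compute it with `embed` and cache the result.
fn embed_cached(cache: &mut QueryCache, query: &str, embed: impl Fn(&str) -> Vec<f32>) -> Vec<f32> {
    if let Some(hit) = cache.get(query) {
        return hit.to_vec();
    }
    let embedding = embed(query);
    cache.put(query.to_string(), embedding.clone());
    embedding
}

fn main() {
    let mut cache = QueryCache::new(200);
    // Stand-in for the ONNX embedding call: a single length-based feature.
    let fake_embed = |q: &str| vec![q.len() as f32];
    let first = embed_cached(&mut cache, "payment capture", fake_embed);
    let second = embed_cached(&mut cache, "payment capture", fake_embed);
    println!("same embedding: {}, cached entries: {}", first == second, cache.len());
}
```

A production cache would likely use a linked structure or an existing LRU crate for constant-time updates; the `VecDeque` scan here keeps the example dependency-free.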
|
|
630
838
|
|
|
631
839
|
---
|
|
632
840
|
|
|
633
841
|
## Roadmap
|
|
634
842
|
|
|
843
|
+
```mermaid
|
|
844
|
+
gantt
|
|
845
|
+
title Roadmap
|
|
846
|
+
dateFormat YYYY-MM
|
|
847
|
+
axisFormat %b
|
|
848
|
+
section Done
|
|
849
|
+
Hybrid search :done, 2025-01, 30d
|
|
850
|
+
Serve mode :done, 2025-02, 30d
|
|
851
|
+
JSON output :done, 2025-03, 15d
|
|
852
|
+
Cross-tool hints :done, 2025-03, 15d
|
|
853
|
+
E2E tests :done, 2025-03, 15d
|
|
854
|
+
section Next
|
|
855
|
+
Method chunking :active, 2025-04, 30d
|
|
856
|
+
Intent detection :2025-05, 30d
|
|
857
|
+
Type filtering :2025-06, 30d
|
|
858
|
+
Incremental index :2025-07, 30d
|
|
859
|
+
section Future
|
|
860
|
+
VS Code extension :2025-08, 60d
|
|
861
|
+
Web UI :2025-10, 60d
|
|
862
|
+
Commerce support :2026-01, 60d
|
|
863
|
+
```
|
|
864
|
+
|
|
635
865
|
- [x] Hybrid search (semantic + keyword re-ranking)
|
|
866
|
+
- [x] Persistent serve mode (eliminates cold-start latency)
|
|
867
|
+
- [x] Structured JSON output (methods, badges, snippets)
|
|
868
|
+
- [x] Cross-tool discovery hints for AI clients
|
|
869
|
+
- [x] E2E accuracy test suite (101 queries)
|
|
870
|
+
- [ ] Method-level chunking (per-method vectors for direct method search)
|
|
636
871
|
- [ ] Query intent classification (auto-detect "give me XML" vs "give me PHP")
|
|
637
872
|
- [ ] Filtered search by file type at the vector level
|
|
638
873
|
- [ ] Incremental indexing (only re-index changed files)
|
|
@@ -655,7 +890,7 @@ Contributions are welcome. Please:
|
|
|
655
890
|
1. Fork the repository
|
|
656
891
|
2. Create a feature branch (`git checkout -b feature/improvement`)
|
|
657
892
|
3. Add tests for new functionality
|
|
658
|
-
4. Run validation to ensure accuracy doesn't regress
|
|
893
|
+
4. Run validation to ensure accuracy doesn't regress: `npm run test:accuracy`
|
|
659
894
|
5. Submit a pull request
|
|
660
895
|
|
|
661
896
|
---
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "magector",
|
|
3
|
-
"version": "1.2.
|
|
3
|
+
"version": "1.2.14",
|
|
4
4
|
"description": "Semantic code search for Magento 2 — index, search, MCP server",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "src/mcp-server.js",
|
|
@@ -33,10 +33,10 @@
|
|
|
33
33
|
"ruvector": "^0.1.96"
|
|
34
34
|
},
|
|
35
35
|
"optionalDependencies": {
|
|
36
|
-
"@magector/cli-darwin-arm64": "1.2.
|
|
37
|
-
"@magector/cli-linux-x64": "1.2.
|
|
38
|
-
"@magector/cli-linux-arm64": "1.2.
|
|
39
|
-
"@magector/cli-win32-x64": "1.2.
|
|
36
|
+
"@magector/cli-darwin-arm64": "1.2.14",
|
|
37
|
+
"@magector/cli-linux-x64": "1.2.14",
|
|
38
|
+
"@magector/cli-linux-arm64": "1.2.14",
|
|
39
|
+
"@magector/cli-win32-x64": "1.2.14"
|
|
40
40
|
},
|
|
41
41
|
"keywords": [
|
|
42
42
|
"magento",
|