magector 1.0.0
- package/LICENSE +21 -0
- package/README.md +627 -0
- package/config/mcp-config.json +13 -0
- package/package.json +53 -0
- package/src/binary.js +66 -0
- package/src/cli.js +203 -0
- package/src/init.js +293 -0
- package/src/magento-patterns.js +563 -0
- package/src/mcp-server.js +915 -0
- package/src/model.js +127 -0
- package/src/templates/claude-md.js +47 -0
- package/src/templates/cursorrules.js +45 -0
- package/src/validation/accuracy-calculator.js +397 -0
- package/src/validation/benchmark.js +355 -0
- package/src/validation/test-data-generator.js +672 -0
- package/src/validation/test-queries.js +326 -0
- package/src/validation/validator.js +302 -0
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2025 Magector Contributors
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,627 @@
|
|
|
1
|
+
# Magector
|
|
2
|
+
|
|
3
|
+
**Semantic code search engine for Magento 2, powered by ONNX embeddings and HNSW vector search.**
|
|
4
|
+
|
|
5
|
+
Magector indexes an entire Magento 2 codebase and lets you search it with natural language. Instead of grepping for keywords, ask questions like *"how are checkout totals calculated?"* or *"where is the product price determined?"* and get ranked, relevant results in under 50ms.
|
|
6
|
+
|
|
7
|
+
|
|
8
|
+
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
|
|
12
|
+
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
## Why Magector
|
|
16
|
+
|
|
17
|
+
Magento 2 has **18,000+ source files** across hundreds of modules. Finding the right code is slow:
|
|
18
|
+
|
|
19
|
+
| Approach | Finds semantic matches | Understands Magento patterns | Speed (18K files) |
|
|
20
|
+
|----------|:---------------------:|:---------------------------:|:-----------------:|
|
|
21
|
+
| `grep` / `ripgrep` | No | No | 100-500ms |
|
|
22
|
+
| IDE search | No | No | 200-1000ms |
|
|
23
|
+
| GitHub search | Partial | No | 500-2000ms |
|
|
24
|
+
| **Magector** | **Yes** | **Yes** | **15-45ms** |
|
|
25
|
+
|
|
26
|
+
Magector understands that a query about *"payment capture"* should return `Sales/Model/Order/Payment/Operations/CaptureOperation.php`, not just files containing the word "capture".
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Features
|
|
31
|
+
|
|
32
|
+
- **Semantic search** -- find code by meaning, not exact keywords
|
|
33
|
+
- **94.4% accuracy** -- validated with 557 test cases across 50+ categories
|
|
34
|
+
- **ONNX embeddings** -- native 384-dim transformer embeddings via ONNX Runtime for higher quality search
|
|
35
|
+
- **Parallel processing** -- batched embedding and parallel file parsing for faster indexing
|
|
36
|
+
- **Magento-aware** -- understands controllers, plugins, observers, blocks, resolvers, repositories, and 20+ Magento patterns
|
|
37
|
+
- **AST-powered** -- tree-sitter parsing for PHP and JavaScript extracts classes, methods, namespaces, and inheritance
|
|
38
|
+
- **Diff analysis** -- risk scoring and change classification for git commits and staged changes
|
|
39
|
+
- **Complexity analysis** -- cyclomatic complexity, function count, and hotspot detection across modules
|
|
40
|
+
- **Fast** -- 15-45ms queries, batched ONNX embedding with adaptive thread scaling
|
|
41
|
+
- **MCP server** -- 19 tools integrating with Claude Code, Cursor, and any MCP-compatible AI tool
|
|
42
|
+
- **Clean architecture** -- Rust core handles all indexing/search, Node.js MCP server delegates to it
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
## Architecture
|
|
47
|
+
|
|
48
|
+
```
|
|
49
|
+
┌──────────────────────────────────────────┐
|
|
50
|
+
│ Magector │
|
|
51
|
+
├──────────────────┬───────────────────────┤
|
|
52
|
+
│ Rust Core │ Node.js Layer │
|
|
53
|
+
│ │ │
|
|
54
|
+
│ ┌────────────┐ │ ┌─────────────────┐ │
|
|
55
|
+
│ │ Tree-sitter│ │ │ MCP Server │ │
|
|
56
|
+
│ │ AST Parser │ │ │ (19 tools) │ │
|
|
57
|
+
│ │ PHP + JS │ │ └────────┬────────┘ │
|
|
58
|
+
│ └─────┬──────┘ │ │ │
|
|
59
|
+
│ │ │ ┌────────┴────────┐ │
|
|
60
|
+
│ ┌─────┴──────┐ │ │ CLI Interface │ │
|
|
61
|
+
│ │ Magento │ │ │ index/search/ │ │
|
|
62
|
+
│ │ Pattern │ │ │ validate │ │
|
|
63
|
+
│ │ Detection │ │ └─────────────────┘ │
|
|
64
|
+
│ └─────┬──────┘ │ │
|
|
65
|
+
│ │ │ │
|
|
66
|
+
│ ┌─────┴──────┐ │ │
|
|
67
|
+
│ │ ONNX │ │ │
|
|
68
|
+
│ │ Embedder │ │ │
|
|
69
|
+
│ │ MiniLM-L6 │ │ │
|
|
70
|
+
│ └─────┬──────┘ │ │
|
|
71
|
+
│ │ │ │
|
|
72
|
+
│ ┌─────┴──────┐ │ │
|
|
73
|
+
│ │ HNSW │ │ │
|
|
74
|
+
│ │ Vector DB │ │ │
|
|
75
|
+
│ └────────────┘ │ │
|
|
76
|
+
└──────────────────┴───────────────────────┘
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
### Embedding Pipeline
|
|
80
|
+
|
|
81
|
+
```
|
|
82
|
+
Source File ──▶ Tree-sitter AST ──▶ Magento Pattern Detection ──▶ Search Text Enrichment
|
|
83
|
+
│ │
|
|
84
|
+
│ ▼
|
|
85
|
+
│ ONNX Runtime
|
|
86
|
+
│ (MiniLM-L6-v2)
|
|
87
|
+
│ │
|
|
88
|
+
│ ▼
|
|
89
|
+
│ 384-dim embedding
|
|
90
|
+
│ │
|
|
91
|
+
▼ ▼
|
|
92
|
+
Metadata ─────────────────────────────────────────────────────▶ HNSW Index
|
|
93
|
+
(path, class, namespace, type, methods, patterns) (17,891 vectors)
|
|
94
|
+
```
|
|
95
|
+
|
|
96
|
+
### Components
|
|
97
|
+
|
|
98
|
+
| Component | Technology | Purpose |
|
|
99
|
+
|-----------|-----------|---------|
|
|
100
|
+
| Embeddings | `ort` (ONNX Runtime) | all-MiniLM-L6-v2, 384 dimensions |
|
|
101
|
+
| Vector search | `hnsw_rs` | Approximate nearest neighbor |
|
|
102
|
+
| PHP parsing | `tree-sitter-php` | Class, method, namespace extraction |
|
|
103
|
+
| JS parsing | `tree-sitter-javascript` | AMD/ES6 module detection |
|
|
104
|
+
| Pattern detection | Custom Rust | 20+ Magento-specific patterns |
|
|
105
|
+
| CLI | `clap` | Command-line interface |
|
|
106
|
+
| MCP server | `@modelcontextprotocol/sdk` | AI tool integration |
|
|
107
|
+
|
|
108
|
+
---
|
|
109
|
+
|
|
110
|
+
## Quick Start
|
|
111
|
+
|
|
112
|
+
### Prerequisites
|
|
113
|
+
|
|
114
|
+
- [Node.js 18+](https://nodejs.org)
|
|
115
|
+
|
|
116
|
+
### 1. Initialize in Your Magento Project
|
|
117
|
+
|
|
118
|
+
```bash
|
|
119
|
+
cd /path/to/your/magento2
|
|
120
|
+
npx magector init
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
This single command:
|
|
124
|
+
- Verifies the Magento project
|
|
125
|
+
- Downloads the ONNX model (~86MB, cached globally in `~/.magector/models/`)
|
|
126
|
+
- Indexes the entire codebase
|
|
127
|
+
- Detects your IDE (Cursor / Claude Code)
|
|
128
|
+
- Writes MCP server configuration
|
|
129
|
+
- Writes IDE rules (`.cursorrules` / `CLAUDE.md`)
|
|
130
|
+
- Adds `magector.db` to `.gitignore`
|
|
131
|
+
|
|
132
|
+
### 2. Search
|
|
133
|
+
|
|
134
|
+
```bash
|
|
135
|
+
npx magector search "product price calculation"
|
|
136
|
+
npx magector search "checkout totals collector" -l 20
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
### 3. Re-index After Changes
|
|
140
|
+
|
|
141
|
+
```bash
|
|
142
|
+
npx magector index
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
### 4. IDE Setup Only (Skip Indexing)
|
|
146
|
+
|
|
147
|
+
```bash
|
|
148
|
+
npx magector setup
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
---
|
|
152
|
+
|
|
153
|
+
## CLI Reference
|
|
154
|
+
|
|
155
|
+
### Rust Core CLI
|
|
156
|
+
|
|
157
|
+
```
|
|
158
|
+
magector-core <COMMAND>
|
|
159
|
+
|
|
160
|
+
Commands:
|
|
161
|
+
index Index a Magento codebase
|
|
162
|
+
search Search the index semantically
|
|
163
|
+
validate Run validation suite (downloads Magento if needed)
|
|
164
|
+
download Download Magento 2 Open Source
|
|
165
|
+
stats Show index statistics
|
|
166
|
+
embed Generate embedding for text
|
|
167
|
+
```
|
|
168
|
+
|
|
169
|
+
#### `index`
|
|
170
|
+
|
|
171
|
+
```bash
|
|
172
|
+
magector-core index [OPTIONS]
|
|
173
|
+
|
|
174
|
+
Options:
|
|
175
|
+
-m, --magento-root <PATH> Path to Magento root directory
|
|
176
|
+
-d, --database <PATH> Index database path [default: ./magector.db]
|
|
177
|
+
-c, --model-cache <PATH> Model cache directory [default: ./models]
|
|
178
|
+
-v, --verbose Enable verbose output
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
#### `search`
|
|
182
|
+
|
|
183
|
+
```bash
|
|
184
|
+
magector-core search <QUERY> [OPTIONS]
|
|
185
|
+
|
|
186
|
+
Options:
|
|
187
|
+
-d, --database <PATH> Index database path [default: ./magector.db]
|
|
188
|
+
-l, --limit <N> Number of results [default: 10]
|
|
189
|
+
-f, --format <FORMAT> Output format: text, json [default: text]
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
### Node.js CLI
|
|
193
|
+
|
|
194
|
+
```bash
|
|
195
|
+
npx magector init [path] # Full setup: index + IDE config
|
|
196
|
+
npx magector index [path] # Index (or re-index) Magento codebase
|
|
197
|
+
npx magector search <query> # Search indexed code
|
|
198
|
+
npx magector stats # Show indexer statistics
|
|
199
|
+
npx magector setup [path] # IDE setup only (no indexing)
|
|
200
|
+
npx magector mcp # Start MCP server
|
|
201
|
+
npx magector help # Show help
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
### Environment Variables
|
|
205
|
+
|
|
206
|
+
| Variable | Description | Default |
|
|
207
|
+
|----------|-------------|---------|
|
|
208
|
+
| `MAGENTO_ROOT` | Path to Magento installation | Current directory |
|
|
209
|
+
| `MAGECTOR_DB` | Path to index database | `./magector.db` |
|
|
210
|
+
| `MAGECTOR_BIN` | Path to magector-core binary | Auto-detected |
|
|
211
|
+
| `MAGECTOR_MODELS` | Path to ONNX model directory | `~/.magector/models/` |
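As a rough illustration of the table above, the variables could be resolved with plain fallbacks. This is a minimal sketch, not Magector's actual code: `resolveConfig` is a hypothetical helper, and `MAGECTOR_BIN` auto-detection is left out because its logic is not described here.

```javascript
// Hypothetical helper: resolve Magector settings from an env-like object,
// falling back to the defaults listed in the Environment Variables table.
function resolveConfig(env, cwd, homeDir) {
  return {
    magentoRoot: env.MAGENTO_ROOT || cwd,
    database: env.MAGECTOR_DB || "./magector.db",
    // null stands in for "auto-detected"; the detection itself is not sketched.
    binary: env.MAGECTOR_BIN || null,
    models: env.MAGECTOR_MODELS || `${homeDir}/.magector/models/`,
  };
}

// Example values are placeholders, not real paths.
const exampleConfig = resolveConfig(
  { MAGENTO_ROOT: "/var/www/magento2" },
  "/tmp",
  "/home/dev"
);
console.log(exampleConfig);
```

Passing the environment in as an argument (instead of reading `process.env` inside the function) keeps the sketch easy to test.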
|
|
212
|
+
|
|
213
|
+
---
|
|
214
|
+
|
|
215
|
+
## MCP Server Tools
|
|
216
|
+
|
|
217
|
+
The MCP server exposes 19 tools for AI-assisted Magento development:
|
|
218
|
+
|
|
219
|
+
### Search Tools
|
|
220
|
+
|
|
221
|
+
| Tool | Description |
|
|
222
|
+
|------|-------------|
|
|
223
|
+
| `magento_search` | Semantic code search with natural language queries |
|
|
224
|
+
| `magento_find_class` | Find PHP class, interface, or trait by name |
|
|
225
|
+
| `magento_find_method` | Find method implementations across the codebase |
|
|
226
|
+
|
|
227
|
+
### Magento-Specific Finders
|
|
228
|
+
|
|
229
|
+
| Tool | Description |
|
|
230
|
+
|------|-------------|
|
|
231
|
+
| `magento_find_config` | Find XML configuration files (di.xml, events.xml, etc.) |
|
|
232
|
+
| `magento_find_template` | Find PHTML template files |
|
|
233
|
+
| `magento_find_plugin` | Find interceptor plugins and their targets |
|
|
234
|
+
| `magento_find_observer` | Find event observers |
|
|
235
|
+
| `magento_find_controller` | Find controllers by route or action |
|
|
236
|
+
| `magento_find_block` | Find Block classes |
|
|
237
|
+
| `magento_find_graphql` | Find GraphQL resolvers and schema |
|
|
238
|
+
| `magento_find_api` | Find REST API endpoints and webapi.xml routes |
|
|
239
|
+
| `magento_find_cron` | Find cron job definitions |
|
|
240
|
+
| `magento_find_db_schema` | Find database table definitions |
|
|
241
|
+
|
|
242
|
+
### Analysis Tools
|
|
243
|
+
|
|
244
|
+
| Tool | Description |
|
|
245
|
+
|------|-------------|
|
|
246
|
+
| `magento_analyze_diff` | Analyze git diffs for risk scoring and change classification |
|
|
247
|
+
| `magento_complexity` | Analyze code complexity (cyclomatic, function count, lines) |
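The README does not say how `magento_complexity` computes its numbers. As a rough illustration of the metric only, cyclomatic complexity can be approximated as one plus the number of decision points. The keyword list and the `estimateComplexity` name below are assumptions for this sketch, not the tool's implementation.

```javascript
// Rough approximation: 1 + number of branching keywords and boolean operators.
// This is a naive text scan, so keywords inside strings or comments are counted too.
const DECISION_POINTS = /\b(?:if|elseif|for|foreach|while|case|catch)\b|&&|\|\|/g;

function estimateComplexity(phpSource) {
  const matches = phpSource.match(DECISION_POINTS) || [];
  return matches.length + 1;
}

const samplePhp = [
  "function total($items) {",
  "  $sum = 0;",
  "  foreach ($items as $item) {",
  "    if ($item->qty > 0 && $item->price > 0) {",
  "      $sum += $item->qty * $item->price;",
  "    }",
  "  }",
  "  return $sum;",
  "}",
].join("\n");

// foreach + if + && = 3 decision points, plus 1 for the function itself.
console.log(estimateComplexity(samplePhp)); // 4
```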
|
|
248
|
+
|
|
249
|
+
### Utility Tools
|
|
250
|
+
|
|
251
|
+
| Tool | Description |
|
|
252
|
+
|------|-------------|
|
|
253
|
+
| `magento_module_structure` | Show module directory structure |
|
|
254
|
+
| `magento_index` | Trigger re-indexing of the codebase |
|
|
255
|
+
| `magento_stats` | View index statistics (ONNX, parallel mode) |
|
|
256
|
+
|
|
257
|
+
### Query Examples
|
|
258
|
+
|
|
259
|
+
```
|
|
260
|
+
magento_search("how are checkout totals calculated")
|
|
261
|
+
magento_search("product price with tier pricing and catalog rules")
|
|
262
|
+
magento_find_class("ProductRepositoryInterface")
|
|
263
|
+
magento_find_config("di.xml plugin for ProductRepository")
|
|
264
|
+
magento_find_plugin("save method")
|
|
265
|
+
magento_find_observer("sales_order_place_after")
|
|
266
|
+
magento_find_api("products REST endpoint")
|
|
267
|
+
magento_find_graphql("cart mutation resolver")
|
|
268
|
+
magento_analyze_diff({ commitHash: "abc123" })
|
|
269
|
+
magento_complexity({ module: "Magento_Catalog", threshold: 10 })
|
|
270
|
+
```
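On the wire, an MCP client sends each of these as a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request. The envelope follows the Model Context Protocol, but the `query` argument name is an assumption; the tool's input schema in `src/mcp-server.js` is authoritative.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request as used by the Model Context Protocol.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "query" is an assumed argument name, used here for illustration only.
const searchRequest = buildToolCall(1, "magento_search", {
  query: "how are checkout totals calculated",
});
console.log(JSON.stringify(searchRequest, null, 2));
```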
|
|
271
|
+
|
|
272
|
+
---
|
|
273
|
+
|
|
274
|
+
## Validation
|
|
275
|
+
|
|
276
|
+
Magector is validated against the complete Magento 2.4.7 codebase with **557 test cases** across **50+ categories**.
|
|
277
|
+
|
|
278
|
+
### Overall Results
|
|
279
|
+
|
|
280
|
+
| Metric | Value |
|
|
281
|
+
|--------|-------|
|
|
282
|
+
| **Accuracy** | **94.4%** |
|
|
283
|
+
| Tests passed | 526 / 557 |
|
|
284
|
+
| Index size | 17,891 vectors |
|
|
285
|
+
| Query time | 15-45ms |
|
|
286
|
+
| Indexing time | ~3 minutes |
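The accuracy figure is simply the share of passing test cases. A small check of the arithmetic (the function name is illustrative):

```javascript
// Accuracy as the percentage of passed test cases, rounded to one decimal place.
function accuracyPercent(passed, total) {
  if (total <= 0) throw new RangeError("total must be positive");
  return Math.round((passed / total) * 1000) / 10;
}

console.log(accuracyPercent(526, 557)); // 94.4
```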
|
|
287
|
+
|
|
288
|
+
### Category Performance
|
|
289
|
+
|
|
290
|
+
**100% accuracy (34 categories):**
|
|
291
|
+
Controllers, Blocks, Observers, GraphQL, API, Shipping, Tax, Payment, EAV, Indexers, Cron, Email, Import, Export, Cache, Queue, Admin, CMS, Promotions, Debugging, Architecture, Order Management, Plugin Advanced, GraphQL Advanced, API Advanced, Admin Advanced, Email Advanced, Cron Advanced, Queue Advanced, Import Advanced, Payment Advanced, URL Rewrite, SEO, Marketing
|
|
292
|
+
|
|
293
|
+
**90-99% accuracy:**
|
|
294
|
+
Catalog Product (96%), Customer Advanced (95%), Checkout Flow (95%), Shipping Advanced (93.3%), Category (93.3%), Frontend JS (90%), Search (90%)
|
|
295
|
+
|
|
296
|
+
**Known limitations:**
|
|
297
|
+
- XML configuration file search (di.xml, plugin configs) -- semantic search favors PHP files with richer content
|
|
298
|
+
- Very generic single-word queries -- include more context for better results
|
|
299
|
+
|
|
300
|
+
### Running Validation
|
|
301
|
+
|
|
302
|
+
```bash
|
|
303
|
+
# Full validation (downloads Magento, indexes, validates)
|
|
304
|
+
cd rust-core
|
|
305
|
+
cargo run --release -- validate
|
|
306
|
+
|
|
307
|
+
# Skip indexing (use existing index)
|
|
308
|
+
cargo run --release -- validate -m ./magento2 --skip-index
|
|
309
|
+
|
|
310
|
+
# Node.js validation suite
|
|
311
|
+
npm run validate
|
|
312
|
+
npm run validate:verbose
|
|
313
|
+
```
|
|
314
|
+
|
|
315
|
+
---
|
|
316
|
+
|
|
317
|
+
## Project Structure
|
|
318
|
+
|
|
319
|
+
```
|
|
320
|
+
magector/
|
|
321
|
+
├── src/ # Node.js source
|
|
322
|
+
│ ├── cli.js # CLI entry point (npx magector <command>)
|
|
323
|
+
│ ├── mcp-server.js # MCP server (19 tools, delegates to Rust core)
|
|
324
|
+
│ ├── binary.js # Platform binary resolver
|
|
325
|
+
│ ├── model.js # ONNX model resolver/downloader
|
|
326
|
+
│ ├── init.js # Full init command (index + IDE config)
|
|
327
|
+
│ ├── magento-patterns.js # Magento pattern detection (JS)
|
|
328
|
+
│ ├── templates/ # IDE rules templates
|
|
329
|
+
│ │ ├── cursorrules.js # .cursorrules content
|
|
330
|
+
│ │ └── claude-md.js # CLAUDE.md content
|
|
331
|
+
│ └── validation/ # JS validation suite
|
|
332
|
+
│ ├── validator.js
|
|
333
|
+
│ ├── benchmark.js
|
|
334
|
+
│ ├── test-queries.js
|
|
335
|
+
│ ├── test-data-generator.js
|
|
336
|
+
│ └── accuracy-calculator.js
|
|
337
|
+
├── tests/ # Automated tests
|
|
338
|
+
│ └── mcp-server.test.js # MCP server tests (Rust core + analysis tools)
|
|
339
|
+
├── platforms/ # Platform-specific binary packages
|
|
340
|
+
│ ├── darwin-arm64/ # macOS ARM (Apple Silicon)
|
|
341
|
+
│ ├── darwin-x64/ # macOS Intel
|
|
342
|
+
│ ├── linux-x64/ # Linux x64
|
|
343
|
+
│ ├── linux-arm64/ # Linux ARM64
|
|
344
|
+
│ └── win32-x64/ # Windows x64
|
|
345
|
+
├── rust-core/ # Rust high-performance core
|
|
346
|
+
│ ├── Cargo.toml
|
|
347
|
+
│ ├── src/
|
|
348
|
+
│ │ ├── main.rs # Rust CLI (index, search, validate)
|
|
349
|
+
│ │ ├── lib.rs # Library exports
|
|
350
|
+
│ │ ├── indexer.rs # Core indexing with progress output
|
|
351
|
+
│ │ ├── embedder.rs # ONNX embedding (MiniLM-L6-v2)
|
|
352
|
+
│ │ ├── vectordb.rs # HNSW vector database
|
|
353
|
+
│ │ ├── ast.rs # Tree-sitter AST (PHP + JS)
|
|
354
|
+
│ │ ├── magento.rs # Magento pattern detection (Rust)
|
|
355
|
+
│ │ └── validation.rs # 557 test cases, validation framework
|
|
356
|
+
│ └── models/ # ONNX model files (auto-downloaded)
|
|
357
|
+
│ ├── all-MiniLM-L6-v2.onnx
|
|
358
|
+
│ └── tokenizer.json
|
|
359
|
+
├── .github/
|
|
360
|
+
│ └── workflows/
|
|
361
|
+
│ └── release.yml # Cross-compile + publish CI
|
|
362
|
+
├── scripts/
|
|
363
|
+
│ └── setup.sh # Claude Code MCP setup script
|
|
364
|
+
├── config/
|
|
365
|
+
│ └── mcp-config.json # MCP server configuration template
|
|
366
|
+
├── package.json
|
|
367
|
+
├── .gitignore
|
|
368
|
+
├── LICENSE
|
|
369
|
+
└── README.md
|
|
370
|
+
```
|
|
371
|
+
|
|
372
|
+
---
|
|
373
|
+
|
|
374
|
+
## How It Works
|
|
375
|
+
|
|
376
|
+
### 1. Indexing
|
|
377
|
+
|
|
378
|
+
Magector scans every `.php`, `.js`, `.xml`, `.phtml`, and `.graphqls` file in a Magento codebase:
|
|
379
|
+
|
|
380
|
+
1. **AST parsing** -- Tree-sitter extracts class names, namespaces, methods, inheritance, and interface implementations from PHP and JavaScript files
|
|
381
|
+
2. **Pattern detection** -- Identifies Magento-specific patterns: controllers, models, repositories, plugins, observers, blocks, GraphQL resolvers, admin grids, cron jobs, and more
|
|
382
|
+
3. **Search text enrichment** -- Combines AST metadata with Magento pattern keywords to create semantically rich text representations
|
|
383
|
+
4. **Embedding** -- ONNX Runtime generates 384-dimensional vectors using all-MiniLM-L6-v2
|
|
384
|
+
5. **Indexing** -- Vectors are stored in an HNSW index for sub-millisecond approximate nearest neighbor search
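Step 3 can be pictured as concatenating the extracted metadata with a few pattern keywords before embedding. The sketch below is illustrative only: the real enrichment lives in the Rust core, and `buildSearchText`, the field names, and the keyword table are assumptions (the `controller` keywords are borrowed from the query-side example in this README).

```javascript
// Assumed keyword table; the actual lists used by the Rust core are not shown in this README.
const PATTERN_KEYWORDS = {
  controller: "action execute http request dispatch",
};

// Hypothetical helper: join AST metadata and pattern keywords into one text to embed.
function buildSearchText(meta) {
  const parts = [
    meta.path,
    meta.namespace,
    meta.className,
    ...(meta.methods || []),
    PATTERN_KEYWORDS[meta.magentoType],
  ];
  // Drop missing fields so the text contains no "undefined" tokens.
  return parts.filter(Boolean).join(" ");
}

console.log(
  buildSearchText({
    path: "Controller/Adminhtml/Order/View.php",
    className: "View",
    methods: ["execute"],
    magentoType: "controller",
  })
);
```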
|
|
385
|
+
|
|
386
|
+
### 2. Searching
|
|
387
|
+
|
|
388
|
+
1. Query text is enriched with pattern synonyms (e.g., "controller" adds "action execute http request dispatch")
|
|
389
|
+
2. The enriched query is embedded into the same 384-dimensional vector space
|
|
390
|
+
3. HNSW finds the nearest neighbors by cosine similarity
|
|
391
|
+
4. Results are ranked and returned with file path, class name, Magento type, and relevance score
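Step 1 might look like the following sketch. Only the `controller` entry comes from this README; the `enrichQuery` name and the word-by-word matching are assumptions, and the Rust core may match patterns differently.

```javascript
// Synonym table: the "controller" entry is the README's example; more entries would follow the same shape.
const QUERY_SYNONYMS = new Map([
  ["controller", "action execute http request dispatch"],
]);

// Append synonyms for every known pattern word found in the query.
function enrichQuery(query) {
  const extras = [];
  for (const word of query.toLowerCase().split(/\s+/)) {
    const synonyms = QUERY_SYNONYMS.get(word);
    if (synonyms) extras.push(synonyms);
  }
  return extras.length > 0 ? `${query} ${extras.join(" ")}` : query;
}

console.log(enrichQuery("admin order controller"));
// admin order controller action execute http request dispatch
```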
|
|
392
|
+
|
|
393
|
+
### 3. MCP Integration
|
|
394
|
+
|
|
395
|
+
The MCP server delegates all search/index operations to the Rust core binary. Analysis tools (diff, complexity) use ruvector JS modules directly.
|
|
396
|
+
|
|
397
|
+
```
|
|
398
|
+
Developer: "How does checkout totals calculation work?"
|
|
399
|
+
│
|
|
400
|
+
▼
|
|
401
|
+
AI Assistant ──▶ magento_search("checkout totals collector calculate")
|
|
402
|
+
│
|
|
403
|
+
▼
|
|
404
|
+
MCP Server ──▶ magector-core search (Rust) ──▶ HNSW lookup ──▶ Ranked results
|
|
405
|
+
│
|
|
406
|
+
▼
|
|
407
|
+
Results:
|
|
408
|
+
1. Quote/Model/Quote/TotalsCollector.php (0.554)
|
|
409
|
+
2. Quote/Model/Quote/Address/Total/Collector.php (0.524)
|
|
410
|
+
3. Quote/Model/Quote/Address/Total/Subtotal.php (0.517)
|
|
411
|
+
```
|
|
412
|
+
|
|
413
|
+
---
|
|
414
|
+
|
|
415
|
+
## Magento Patterns Detected
|
|
416
|
+
|
|
417
|
+
Magector understands these Magento 2 architectural patterns:
|
|
418
|
+
|
|
419
|
+
| Pattern | Detection Method | Example |
|
|
420
|
+
|---------|-----------------|---------|
|
|
421
|
+
| Controller | Path + `execute()` method | `Controller/Adminhtml/Order/View.php` |
|
|
422
|
+
| Model | Path + extends `AbstractModel` | `Model/Product.php` |
|
|
423
|
+
| Repository | Path + implements `RepositoryInterface` | `Model/ProductRepository.php` |
|
|
424
|
+
| Block | Path + extends `AbstractBlock` | `Block/Product/View.php` |
|
|
425
|
+
| Plugin | Path + before/after/around methods | `Plugin/Product/SavePlugin.php` |
|
|
426
|
+
| Observer | Path + implements `ObserverInterface` | `Observer/ProductSaveObserver.php` |
|
|
427
|
+
| GraphQL Resolver | Path + implements `ResolverInterface` | `Model/Resolver/Products.php` |
|
|
428
|
+
| Helper | Path under `Helper/` | `Helper/Data.php` |
|
|
429
|
+
| Cron | Path under `Cron/` | `Cron/CleanExpiredQuotes.php` |
|
|
430
|
+
| Console Command | Path + extends `Command` | `Console/Command/IndexerReindex.php` |
|
|
431
|
+
| Data Provider | Path + `DataProvider` | `Ui/DataProvider/Product/Listing.php` |
|
|
432
|
+
| ViewModel | Path + implements `ArgumentInterface` | `ViewModel/Product/Breadcrumbs.php` |
|
|
433
|
+
| Setup Patch | Path + `Patch/Data` or `Patch/Schema` | `Setup/Patch/Data/AddAttribute.php` |
|
|
434
|
+
| di.xml | Path matching | `etc/di.xml`, `etc/frontend/di.xml` |
|
|
435
|
+
| events.xml | Path matching | `etc/events.xml` |
|
|
436
|
+
| webapi.xml | Path matching | `etc/webapi.xml` |
|
|
437
|
+
| layout XML | Path under `layout/` | `view/frontend/layout/catalog_product_view.xml` |
|
|
438
|
+
| Template | `.phtml` extension | `view/frontend/templates/product/view.phtml` |
|
|
439
|
+
| JavaScript | `.js` with AMD/ES6 detection | `view/frontend/web/js/view/minicart.js` |
|
|
440
|
+
| GraphQL Schema | `.graphqls` extension | `etc/schema.graphqls` |
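The path-only rows of this table are simple enough to sketch. The example below is a simplified illustration, not the detection code in `magento.rs` or `magento-patterns.js`: it covers only rules that need no AST information, and the regular expressions are assumptions derived from the example paths.

```javascript
// Path-only rules from the table; rules that also need AST data
// (for example "extends AbstractModel") are deliberately left out.
const PATH_RULES = [
  [/(^|\/)etc\/(.+\/)?di\.xml$/, "di.xml"],
  [/(^|\/)etc\/(.+\/)?events\.xml$/, "events.xml"],
  [/(^|\/)etc\/webapi\.xml$/, "webapi.xml"],
  [/(^|\/)layout\/[^/]+\.xml$/, "layout XML"],
  [/\.phtml$/, "Template"],
  [/\.graphqls$/, "GraphQL Schema"],
  [/(^|\/)Helper\//, "Helper"],
  [/(^|\/)Cron\//, "Cron"],
];

// Return the first matching pattern label, or null when no path rule applies.
function detectByPath(path) {
  for (const [regex, label] of PATH_RULES) {
    if (regex.test(path)) return label;
  }
  return null;
}

console.log(detectByPath("etc/frontend/di.xml")); // di.xml
console.log(detectByPath("view/frontend/templates/product/view.phtml")); // Template
console.log(detectByPath("Model/Product.php")); // null
```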
|
|
441
|
+
|
|
442
|
+
---
|
|
443
|
+
|
|
444
|
+
## Configuration
|
|
445
|
+
|
|
446
|
+
### Cursor IDE Rules
|
|
447
|
+
|
|
448
|
+
Copy `.cursorrules` to your Magento project root for optimized AI-assisted development. The rules instruct the AI to:
|
|
449
|
+
|
|
450
|
+
1. Use Magector MCP tools before reading files manually
|
|
451
|
+
2. Write effective semantic queries
|
|
452
|
+
3. Follow Magento development patterns
|
|
453
|
+
4. Interpret search results correctly
|
|
454
|
+
|
|
455
|
+
### Model Configuration
|
|
456
|
+
|
|
457
|
+
The ONNX model (`all-MiniLM-L6-v2`) is automatically downloaded on first run to `rust-core/models/`. To use a different location:
|
|
458
|
+
|
|
459
|
+
```bash
|
|
460
|
+
magector-core index -m /path/to/magento -c /custom/model/path
|
|
461
|
+
```
|
|
462
|
+
|
|
463
|
+
---
|
|
464
|
+
|
|
465
|
+
## Development
|
|
466
|
+
|
|
467
|
+
### Building from Source
|
|
468
|
+
|
|
469
|
+
```bash
|
|
470
|
+
git clone https://github.com/krejcif/magector.git
|
|
471
|
+
cd magector
|
|
472
|
+
|
|
473
|
+
# Install Node.js dependencies
|
|
474
|
+
npm install
|
|
475
|
+
|
|
476
|
+
# Build the Rust core
|
|
477
|
+
cd rust-core
|
|
478
|
+
cargo build --release
|
|
479
|
+
cd ..
|
|
480
|
+
|
|
481
|
+
# The CLI will automatically find the dev binary at rust-core/target/release/magector-core
|
|
482
|
+
node src/cli.js help
|
|
483
|
+
```
|
|
484
|
+
|
|
485
|
+
### Building
|
|
486
|
+
|
|
487
|
+
```bash
|
|
488
|
+
# Rust core
|
|
489
|
+
cd rust-core
|
|
490
|
+
cargo build --release
|
|
491
|
+
|
|
492
|
+
# Run unit tests
|
|
493
|
+
cargo test
|
|
494
|
+
|
|
495
|
+
# Run validation
|
|
496
|
+
cargo run --release -- validate
|
|
497
|
+
```
|
|
498
|
+
|
|
499
|
+
### Testing
|
|
500
|
+
|
|
501
|
+
```bash
|
|
502
|
+
# Run MCP server auto tests (129 tests, requires indexed codebase)
|
|
503
|
+
npm test
|
|
504
|
+
|
|
505
|
+
# Run without index (unit + schema tests only)
|
|
506
|
+
npm run test:no-index
|
|
507
|
+
|
|
508
|
+
# Run Rust unit tests
|
|
509
|
+
cd rust-core && cargo test
|
|
510
|
+
|
|
511
|
+
# Run Rust validation (557 test cases)
|
|
512
|
+
cd rust-core && cargo run --release -- validate -m ./magento2 --skip-index
|
|
513
|
+
```
|
|
514
|
+
|
|
515
|
+
### Adding New Magento Patterns
|
|
516
|
+
|
|
517
|
+
1. Add pattern detection in `rust-core/src/magento.rs`
|
|
518
|
+
2. Add search text enrichment in `rust-core/src/indexer.rs`
|
|
519
|
+
3. Add validation test cases in `rust-core/src/validation.rs`
|
|
520
|
+
4. Rebuild and run validation to verify:
|
|
521
|
+
|
|
522
|
+
```bash
|
|
523
|
+
cargo build --release
|
|
524
|
+
./target/release/magector-core validate -m ./magento2 --skip-index
|
|
525
|
+
```
|
|
526
|
+
|
|
527
|
+
### Adding MCP Tools
|
|
528
|
+
|
|
529
|
+
1. Define the tool schema in `src/mcp-server.js` (ListToolsRequestSchema handler)
|
|
530
|
+
2. Implement the handler in the CallToolRequestSchema handler
|
|
531
|
+
3. Test with Claude Code or the MCP inspector
|
|
532
|
+
|
|
533
|
+
---
|
|
534
|
+
|
|
535
|
+
## Technical Details
|
|
536
|
+
|
|
537
|
+
### Embedding Model
|
|
538
|
+
|
|
539
|
+
- **Model:** all-MiniLM-L6-v2
|
|
540
|
+
- **Dimensions:** 384
|
|
541
|
+
- **Pooling:** Mean pooling with attention mask
|
|
542
|
+
- **Normalization:** L2 normalized
|
|
543
|
+
- **Runtime:** ONNX Runtime (via `ort` crate)
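To make the pooling and normalization steps concrete, here is a small sketch of mean pooling with an attention mask followed by L2 normalization. It uses plain arrays and a toy 2-dimensional example; the Rust core works on ONNX Runtime tensors, so this only illustrates the math.

```javascript
// Mean-pool token embeddings using the attention mask, then L2-normalize.
// tokenEmbeddings: one array of numbers per token; attentionMask: 1 for real tokens, 0 for padding.
function meanPoolAndNormalize(tokenEmbeddings, attentionMask) {
  const dim = tokenEmbeddings[0].length;
  const sum = new Array(dim).fill(0);
  let count = 0;
  tokenEmbeddings.forEach((vec, t) => {
    if (attentionMask[t] === 0) return; // skip padding tokens
    count += 1;
    for (let d = 0; d < dim; d++) sum[d] += vec[d];
  });
  const mean = sum.map((v) => v / Math.max(count, 1));
  // Fall back to 1 to avoid dividing by zero for an all-zero vector.
  const norm = Math.sqrt(mean.reduce((acc, v) => acc + v * v, 0)) || 1;
  return mean.map((v) => v / norm);
}

// Toy example: the third token is padding and is ignored.
const pooled = meanPoolAndNormalize([[3, 0], [0, 4], [9, 9]], [1, 1, 0]);
console.log(pooled); // [ 0.6, 0.8 ]
```

Because the output has unit length, the dot product of two embeddings equals their cosine similarity.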
|
|
544
|
+
|
|
545
|
+
### Vector Index
|
|
546
|
+
|
|
547
|
+
- **Algorithm:** HNSW (Hierarchical Navigable Small World)
|
|
548
|
+
- **Library:** `hnsw_rs`
|
|
549
|
+
- **Distance metric:** Cosine similarity
|
|
550
|
+
- **Persistence:** Bincode binary serialization (HNSW + metadata)
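For intuition about what the index returns, the sketch below does the exact, brute-force version of the same ranking. It is not HNSW: HNSW approximates this result without scanning every vector. The sketch assumes all vectors are L2-normalized, so the dot product equals cosine similarity; the entry shape (`path`, `vector`) is illustrative.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Exact top-k by cosine similarity (inputs assumed L2-normalized).
function topK(queryVector, entries, k) {
  return entries
    .map((entry) => ({ path: entry.path, score: dot(queryVector, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Toy 2-dimensional index with unit-length vectors and placeholder paths.
const toyIndex = [
  { path: "A.php", vector: [1, 0] },
  { path: "B.php", vector: [0, 1] },
  { path: "C.php", vector: [0.6, 0.8] },
];
console.log(topK([1, 0], toyIndex, 2));
// A.php (score 1) first, then C.php (score 0.6)
```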
|
|
551
|
+
|
|
552
|
+
### Index Structure
|
|
553
|
+
|
|
554
|
+
Each indexed file produces a vector entry with metadata:
|
|
555
|
+
|
|
556
|
+
```rust
|
|
557
|
+
struct IndexMetadata {
|
|
558
|
+
path: String,
|
|
559
|
+
file_type: String, // php, xml, js, template, graphql
|
|
560
|
+
magento_type: String, // controller, model, block, plugin, ...
|
|
561
|
+
class_name: Option<String>,
|
|
562
|
+
namespace: Option<String>,
|
|
563
|
+
methods: Vec<String>,
|
|
564
|
+
search_text: String, // Enriched searchable text
|
|
565
|
+
is_controller: bool,
|
|
566
|
+
is_plugin: bool,
|
|
567
|
+
is_observer: bool,
|
|
568
|
+
is_model: bool,
|
|
569
|
+
is_block: bool,
|
|
570
|
+
// ... 20+ pattern flags
|
|
571
|
+
}
|
|
572
|
+
```
|
|
573
|
+
|
|
574
|
+
### Performance Characteristics
|
|
575
|
+
|
|
576
|
+
| Operation | Time | Notes |
|
|
577
|
+
|-----------|------|-------|
|
|
578
|
+
| Full index (18K files) | ~1 min | Parallel parsing + batched ONNX embedding |
|
|
579
|
+
| Single query | 15-45ms | HNSW approximate nearest neighbor |
|
|
580
|
+
| Embedding generation | ~2ms | ONNX Runtime with CoreML/CUDA |
|
|
581
|
+
| Batch embedding (32) | ~30ms | Batched ONNX inference |
|
|
582
|
+
| Model load | ~500ms | One-time at startup |
|
|
583
|
+
| Index save/load | <1s | Bincode binary serialization |
|
|
584
|
+
|
|
585
|
+
### Performance Optimizations
|
|
586
|
+
|
|
587
|
+
- **Batched ONNX embedding** -- 32 texts per inference call (vs. 1-at-a-time), 3-5x faster embedding
|
|
588
|
+
- **Dynamic thread scaling** -- ONNX intra-op threads scale to CPU core count (vs. hardcoded 4)
|
|
589
|
+
- **Thread-local AST parsers** -- each rayon thread gets its own tree-sitter parser (no mutex contention)
|
|
590
|
+
- **Bincode persistence** -- binary serialization replaces JSON (3-5x faster save/load, ~5x smaller files)
|
|
591
|
+
- **Adaptive HNSW capacity** -- pre-sized to actual vector count (no wasted memory)
|
|
592
|
+
- **Parallel HNSW insert** -- batch insert uses hnsw_rs parallel insertion on load and index
|
|
593
|
+
- **Optimized file discovery** -- no symlink following, uses cached DirEntry metadata
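The batching idea in the first bullet is easy to picture: split the texts into groups of 32 and run one inference call per group. A minimal sketch of the grouping step only (the inference call itself is not shown, and `toBatches` is an illustrative name):

```javascript
// Split a list into consecutive batches of at most batchSize items.
function toBatches(items, batchSize = 32) {
  if (batchSize < 1) throw new RangeError("batchSize must be at least 1");
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 70 texts become three inference calls instead of 70.
const sampleTexts = Array.from({ length: 70 }, (_, i) => `text ${i}`);
console.log(toBatches(sampleTexts).map((batch) => batch.length)); // [ 32, 32, 6 ]
```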
|
|
594
|
+
|
|
595
|
+
---
|
|
596
|
+
|
|
597
|
+
## Roadmap
|
|
598
|
+
|
|
599
|
+
- [ ] Hybrid search (semantic + BM25 keyword matching)
|
|
600
|
+
- [ ] Query intent classification (auto-detect "give me XML" vs "give me PHP")
|
|
601
|
+
- [ ] Filtered search by file type at the vector level
|
|
602
|
+
- [ ] Incremental indexing (only re-index changed files)
|
|
603
|
+
- [ ] VSCode extension
|
|
604
|
+
- [ ] Web UI for browsing results
|
|
605
|
+
- [ ] Support for Magento 2 Commerce (B2B, Staging modules)
|
|
606
|
+
|
|
607
|
+
---
|
|
608
|
+
|
|
609
|
+
## License
|
|
610
|
+
|
|
611
|
+
MIT License. See [LICENSE](LICENSE) for details.
|
|
612
|
+
|
|
613
|
+
---
|
|
614
|
+
|
|
615
|
+
## Contributing
|
|
616
|
+
|
|
617
|
+
Contributions are welcome. Please:
|
|
618
|
+
|
|
619
|
+
1. Fork the repository
|
|
620
|
+
2. Create a feature branch (`git checkout -b feature/improvement`)
|
|
621
|
+
3. Add tests for new functionality
|
|
622
|
+
4. Run validation to ensure accuracy doesn't regress
|
|
623
|
+
5. Submit a pull request
|
|
624
|
+
|
|
625
|
+
---
|
|
626
|
+
|
|
627
|
+
Built with Rust and Node.js for the Magento community.
|
package/config/mcp-config.json
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
{
|
|
2
|
+
"mcpServers": {
|
|
3
|
+
"magector": {
|
|
4
|
+
"command": "node",
|
|
5
|
+
"args": ["src/mcp-server.js"],
|
|
6
|
+
"cwd": "/Users/file/Code/magector",
|
|
7
|
+
"env": {
|
|
8
|
+
"MAGENTO_ROOT": "/path/to/your/magento",
|
|
9
|
+
"MAGECTOR_DB": "/Users/file/Code/magector/magector.db"
|
|
10
|
+
}
|
|
11
|
+
}
|
|
12
|
+
}
|
|
13
|
+
}
|