magector 2.13.2 → 2.14.0

package/README.md CHANGED
@@ -2,10 +2,10 @@
 
  **Technology-aware MCP server for Magento 2 and Adobe Commerce with intelligent indexing and search.**
 
- Magector is a Model Context Protocol (MCP) server that deeply understands Magento 2 and Adobe Commerce. It builds a semantic vector index of your entire codebase — 18,000+ files across hundreds of modules — and exposes 34 tools that let AI assistants search, navigate, and understand the code with domain-specific intelligence. Instead of grepping for keywords, your AI asks *"how are checkout totals calculated?"* and gets ranked, relevant results in under 50ms, enriched with Magento pattern detection (plugins, observers, controllers, DI preferences, layout XML, and 20+ more).
+ Magector is a Model Context Protocol (MCP) server that deeply understands Magento 2 and Adobe Commerce. It builds a semantic vector index of your entire codebase — 18,000+ files across hundreds of modules — and exposes 45 tools that let AI assistants search, navigate, and understand the code with domain-specific intelligence. Instead of grepping for keywords, your AI asks *"how are checkout totals calculated?"* and gets ranked, relevant results in under 50ms, enriched with Magento pattern detection (plugins, observers, controllers, DI preferences, layout XML, and 20+ more).
 
  [![Rust](https://img.shields.io/badge/rust-1.75+-orange.svg)](https://www.rust-lang.org)
- [![Node.js](https://img.shields.io/badge/node-18+-green.svg)](https://nodejs.org)
+ [![Node.js](https://img.shields.io/badge/node-22.5+-green.svg)](https://nodejs.org)
  [![Magento](https://img.shields.io/badge/magento-2.4.x-blue.svg)](https://magento.com)
  [![Adobe Commerce](https://img.shields.io/badge/adobe%20commerce-supported-blue.svg)](https://business.adobe.com/products/magento/magento-commerce.html)
  [![Accuracy](https://img.shields.io/badge/accuracy-99.2%25-brightgreen.svg)](#validation)
@@ -58,7 +58,7 @@ The result: your AI assistant calls one MCP tool and gets ranked, pattern-enrich
  - **Complexity analysis** -- cyclomatic complexity, function count, and hotspot detection across modules
  - **Fast** -- 10-45ms queries via persistent serve process, batched ONNX embedding with adaptive thread scaling
  - **LLM description enrichment** -- generate natural-language descriptions of di.xml files using Claude, stored in SQLite, and prepend them to embedding text so descriptions influence vector search ranking (not just post-retrieval display)
- - **MCP server** -- 34 tools integrating with Claude Code, Cursor, and any MCP-compatible AI tool
+ - **MCP server** -- 45 tools integrating with Claude Code, Cursor, and any MCP-compatible AI tool
  - **Clean architecture** -- Rust core handles all indexing/search, Node.js MCP server delegates to it
 
  ---
@@ -70,7 +70,7 @@ flowchart LR
  subgraph node ["Node.js Layer"]
  direction TB
  G["CLI<br/>init · index · search · describe"]
- E["MCP Server<br/>34 tools · LRU cache"]
+ E["MCP Server<br/>45 tools · LRU cache"]
  F["Persistent Serve Process"]
  G --> F
  E --> F
@@ -129,7 +129,8 @@ flowchart LR
  | JS parsing | `tree-sitter-javascript` | AMD/ES6 module detection |
  | Pattern detection | Custom Rust | 20+ Magento-specific patterns |
  | CLI | `clap` | Command-line interface (index, search, serve, validate) |
- | Descriptions | `rusqlite` (bundled SQLite) | LLM-generated di.xml descriptions stored in SQLite, prepended to embeddings |
+ | Descriptions | `rusqlite` (bundled SQLite) | LLM-generated di.xml descriptions stored in `.magector/sqlite.db`, prepended to embeddings |
+ | Null-safety index | `node:sqlite` (Node.js 22.5+ built-in) | Method-chain enrichment index in `.magector/enrichment.db` — O(1) null-risk queries |
  | SONA | Custom Rust | Feedback learning with MicroLoRA + EWC++ |
  | MCP server | `@modelcontextprotocol/sdk` | AI tool integration with structured JSON output |
 
@@ -139,7 +140,8 @@ flowchart LR
 
  ### Prerequisites
 
- - [Node.js 18+](https://nodejs.org)
+ - [Node.js 22.5+](https://nodejs.org) — required for built-in `node:sqlite` (used by `magento_enrich` / `magento_find_null_risks`)
+ - [semgrep](https://semgrep.dev) (optional) — required for `magento_ast_search`: `pip install semgrep`
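Reviewer note: the Node.js prerequisite bump above can be checked before the server starts. A minimal sketch, assuming a hypothetical helper `meetsMinimumNode` that is not part of Magector; the 22.5 minimum is taken from the README text in this hunk:

```javascript
// Hypothetical helper (not in the package): compares a Node.js version string
// against the 22.5 minimum that the README states for node:sqlite.
function meetsMinimumNode(version, minMajor = 22, minMinor = 5) {
  const [major, minor] = version.split('.').map(Number);
  return major > minMajor || (major === minMajor && minor >= minMinor);
}

// process.versions.node is the running interpreter's version, e.g. "22.11.0".
const current = process.versions.node;
if (!meetsMinimumNode(current)) {
  console.warn(
    `Node.js ${current} detected; magento_enrich and magento_find_null_risks need 22.5+`
  );
}
```

A version check like this only reports the stated minimum; the diff itself guards the feature by attempting `import('node:sqlite')` and logging on failure.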
 
  ### 1. Initialize in Your Project
 
@@ -369,7 +371,7 @@ npx magector index --force
 
  ## MCP Server Tools
 
- The MCP server exposes 34 tools for AI-assisted Magento 2 and Adobe Commerce development. All search tools return **structured JSON** with file paths, class names, methods, role badges, and content snippets -- enabling AI clients to parse results programmatically and minimize file-read round-trips.
+ The MCP server exposes 45 tools for AI-assisted Magento 2 and Adobe Commerce development. All search tools return **structured JSON** with file paths, class names, methods, role badges, and content snippets -- enabling AI clients to parse results programmatically and minimize file-read round-trips.
 
  ### Output Format
 
@@ -470,9 +472,23 @@ Auto-detects entry type from pattern (`/V1/...` → API, `snake_case` → event,
  | Tool | Description |
  |------|-------------|
  | `magento_module_structure` | Show complete module structure -- controllers, models, blocks, plugins, observers, configs |
- | `magento_index` | Trigger re-indexing of the codebase |
- | `magento_describe` | Generate LLM descriptions for di.xml files (requires `ANTHROPIC_API_KEY`), stored in SQLite, auto-reindexes affected files |
+ | `magento_index` | Trigger re-indexing of the codebase (also kicks off background enrichment) |
+ | `magento_describe` | Generate LLM descriptions for di.xml files (requires `ANTHROPIC_API_KEY`), stored in `.magector/sqlite.db`, auto-reindexes affected files |
  | `magento_stats` | View index statistics |
+ | `magento_batch` | Execute multiple tool queries in parallel in one MCP roundtrip. Supports all search, find, grep, read, and null-risk tools. Use to avoid N×3-5s roundtrip overhead. |
+ | `magento_grep` | Exact text/regex search across PHP/XML/PHTML files (`grep -rn -E` internally). Supports `filesOnly` mode (like `grep -l`), `context` lines, `ignoreCase`, `include` patterns. **(v2.9)** |
+ | `magento_read` | Read a specific file with optional `methodName` extraction (~10× fewer tokens than reading the whole file) and `startLine`/`endLine` range. **(v2.10)** |
+ | `magento_trace_api` | Trace REST/GraphQL API endpoint from URL to implementation: webapi.xml → service interface → DI preference → method body. One call replaces 4-5 grep+read steps. **(v2.11)** |
+ | `magento_find_trigger` | Find database triggers across the codebase |
+ | `magento_find_table_usage` | Find all PHP code referencing a specific database table |
+
+ ### Null-Safety Analysis (v2.12–v2.13)
+
+ | Tool | Description |
+ |------|-------------|
+ | `magento_ast_search` | Structural PHP code search using [semgrep](https://semgrep.dev). Understands PHP AST — matches by structure regardless of variable names, ignores comments/strings. Pattern syntax: `$X` = any expression, `$Y` = any identifier, `...` = any args. Example: `$X->getPayment()->$Y(...)`. Requires `semgrep`. **(v2.12)** |
+ | `magento_enrich` | Build the method-chain enrichment index. Scans all `vendor/` PHP files for `->firstMethod()->secondMethod()` chains and detects null guards in surrounding code. Stores results in `.magector/enrichment.db` (SQLite, `node:sqlite`). Runs automatically after `magento_index`. **(v2.13)** |
+ | `magento_find_null_risks` | Query the enrichment index for method chains without null guards. O(1) SQLite query instead of file scanning. Pass `firstMethod` to filter (e.g., `"getPayment"` → all `->getPayment()->anything()` without null guard). Requires `magento_enrich`. **(v2.13)** |
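Reviewer note: the README rows above describe `magento_enrich` as scanning for `->firstMethod()->secondMethod()` chains and checking the surrounding code for null guards. The sketch below illustrates that idea only. The regexes, the `findUnguardedChains` name, and the guard heuristics are assumptions made for illustration and are not taken from `mcp-server.js`; the 6-line default mirrors the `guardRadius = 6` visible in the `hasNullGuard` hunk header further down.

```javascript
// Illustrative sketch only; NOT Magector's implementation.
// Matches "->first()->second(" where the first call takes no arguments.
const CHAIN_RE = /->(\w+)\(\)\s*->(\w+)\(/g;

function findUnguardedChains(source, guardRadius = 6) {
  const lines = source.split('\n');
  const risks = [];
  lines.forEach((text, idx) => {
    for (const match of text.matchAll(CHAIN_RE)) {
      const [, firstMethod, secondMethod] = match;
      // Inspect the matching line plus up to guardRadius lines before it.
      const nearby = lines.slice(Math.max(0, idx - guardRadius), idx + 1).join('\n');
      // Assumed guard shapes: an explicit comparison with null, or a bare
      // truthiness test such as `if ($x->first())`.
      const comparesToNull = new RegExp(
        `->${firstMethod}\\(\\)\\s*(!==?|===?)\\s*null`
      ).test(nearby);
      const truthinessTest = new RegExp(
        `if\\s*\\(\\s*!?\\s*\\$\\w+->${firstMethod}\\(\\)\\s*\\)`
      ).test(nearby);
      if (!comparesToNull && !truthinessTest) {
        risks.push({ line: idx + 1, firstMethod, secondMethod });
      }
    }
  });
  return risks;
}

// Usage: one unguarded chain in a single-line PHP snippet.
const sample = '$order->getPayment()->getMethod();';
console.log(findUnguardedChains(sample));
```

A text-window heuristic like this can produce both false positives and false negatives (for example, guards placed further away than the radius), which is presumably why the README pairs the index with the semgrep-based `magento_ast_search` for structural matching.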
 
  ### Search Enhancements (v2.1)
 
@@ -656,7 +672,7 @@ cd rust-core && cargo run --release -- validate -m ./magento2 --skip-index
  magector/
  ├── src/ # Node.js source
  │ ├── cli.js # CLI entry point (npx magector <command>)
- │ ├── mcp-server.js # MCP server (34 tools, structured JSON output)
+ │ ├── mcp-server.js # MCP server (45 tools, structured JSON output)
  │ ├── binary.js # Platform binary resolver
  │ ├── model.js # ONNX model resolver/downloader
  │ ├── init.js # Full init command (index + IDE config)
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "magector",
- "version": "2.13.2",
+ "version": "2.14.0",
  "description": "Semantic code search for Magento 2 — index, search, MCP server",
  "type": "module",
  "main": "src/mcp-server.js",
@@ -33,10 +33,10 @@
  "ruvector": "^0.1.96"
  },
  "optionalDependencies": {
- "@magector/cli-darwin-arm64": "2.13.2",
- "@magector/cli-linux-x64": "2.13.2",
- "@magector/cli-linux-arm64": "2.13.2",
- "@magector/cli-win32-x64": "2.13.2"
+ "@magector/cli-darwin-arm64": "2.14.0",
+ "@magector/cli-linux-x64": "2.14.0",
+ "@magector/cli-linux-arm64": "2.14.0",
+ "@magector/cli-win32-x64": "2.14.0"
  },
  "keywords": [
  "magento",
package/src/mcp-server.js CHANGED
@@ -3459,12 +3459,15 @@ function hasNullGuard(lines, matchLineIdx, receiverExpr, guardRadius = 6) {
  */
  async function enrichMethodChains(root) {
  const dbPath = ENRICHMENT_DB_PATH(root);
+ logToFile('INFO', `enrich: starting method-chain scan, db=${dbPath}`);
+ const enrichStart = Date.now();
 
  // Use node:sqlite (built-in, no deps)
  let DatabaseSync;
  try {
  ({ DatabaseSync } = await import('node:sqlite'));
  } catch {
+ logToFile('ERR', 'enrich: node:sqlite not available — requires Node.js 22.5+');
  throw new Error('node:sqlite not available — requires Node.js 22.5+');
  }
 
@@ -3491,8 +3494,10 @@ async function enrichMethodChains(root) {
  const now = Date.now();
 
  const phpFiles = await glob('vendor/**/*.php', { cwd: root, absolute: true, nodir: true });
+ logToFile('INFO', `enrich: found ${phpFiles.length} PHP files in vendor/`);
  let scanned = 0;
  let chains = 0;
+ let readErrors = 0;
 
  const insertStmt = db.prepare(
  'INSERT INTO method_chains (file, line, chain, first_method, second_method, has_null_guard, updated_at) VALUES (?,?,?,?,?,?,?)'
@@ -3519,11 +3524,18 @@ async function enrichMethodChains(root) {
  return lo + 1; // 1-based
  }
 
+ // Progress logging every 10k files
+ const progressInterval = 10000;
+
  db.exec('BEGIN');
  try {
  for (const phpFile of phpFiles) {
  let content;
- try { content = readFileSync(phpFile, 'utf-8'); } catch { continue; }
+ try { content = readFileSync(phpFile, 'utf-8'); } catch (err) {
+ readErrors++;
+ if (readErrors <= 5) logToFile('WARN', `enrich: cannot read ${phpFile}: ${err.code || err.message}`);
+ continue;
+ }
  if (!content.includes('->')) continue;
 
  const relPath = phpFile.replace(root + '/', '');
@@ -3551,14 +3563,20 @@ async function enrichMethodChains(root) {
  }
  }
  scanned++;
+ if (scanned % progressInterval === 0) {
+ logToFile('INFO', `enrich: progress ${scanned}/${phpFiles.length} files, ${chains} chains so far (${Date.now() - enrichStart}ms)`);
+ }
  }
  db.exec('COMMIT');
  } catch (err) {
+ logToFile('ERR', `enrich: transaction failed at file ${scanned}/${phpFiles.length}: ${err.message}`);
  db.exec('ROLLBACK');
  throw err;
  }
 
  db.close();
+ const enrichElapsed = Date.now() - enrichStart;
+ logToFile('INFO', `enrich: complete — ${scanned} files scanned, ${chains} chains indexed, ${readErrors} read errors, ${enrichElapsed}ms`);
  return { scanned, chains };
  }
 
@@ -3567,15 +3585,21 @@ async function enrichMethodChains(root) {
  */
  async function queryNullRisks(root, firstMethod, limit = 100) {
  const dbPath = ENRICHMENT_DB_PATH(root);
- if (!existsSync(dbPath)) return null;
+ if (!existsSync(dbPath)) {
+ logToFile('WARN', `null_risks: enrichment.db not found at ${dbPath} — run magento_enrich first`);
+ return null;
+ }
 
  let DatabaseSync;
  try {
  ({ DatabaseSync } = await import('node:sqlite'));
- } catch {
+ } catch (err) {
+ logToFile('ERR', `null_risks: node:sqlite not available: ${err.message}`);
  return null;
  }
 
+ const queryStart = Date.now();
+ logToFile('INFO', `null_risks: querying firstMethod=${firstMethod || '(all)'} limit=${limit}`);
  const db = new DatabaseSync(dbPath, { open: true });
  let rows;
  try {
@@ -3591,6 +3615,7 @@ async function queryNullRisks(root, firstMethod, limit = 100) {
  } finally {
  db.close();
  }
+ logToFile('INFO', `null_risks: ${rows.length} unsafe chain(s) found in ${Date.now() - queryStart}ms`);
  return rows;
  }
 
@@ -3604,13 +3629,22 @@ async function astSearch(pattern, searchPath, lang, maxResults) {
  const semgrepLang = lang || 'php';
  const limit = Math.min(maxResults || 50, 200);
 
+ logToFile('INFO', `ast_search: pattern="${pattern}" path="${searchPath || '.'}" lang=${semgrepLang} limit=${limit}`);
+ const astStart = Date.now();
+
  // Create a temporary empty .semgrepignore in the target directory if none exists.
  // Semgrep's default ignore list includes "vendor/" which is exactly what we need to scan.
  // An empty .semgrepignore overrides the defaults: https://semgrep.dev/docs/ignoring-files-folders-code/
  const semgrepIgnorePath = path.join(targetPath, '.semgrepignore');
  let createdSemgrepIgnore = false;
  if (!existsSync(semgrepIgnorePath)) {
- try { writeFileSync(semgrepIgnorePath, '# Magector: scan vendor/ and all project files\n'); createdSemgrepIgnore = true; } catch { /* best effort */ }
+ try {
+ writeFileSync(semgrepIgnorePath, '# Magector: scan vendor/ and all project files\n');
+ createdSemgrepIgnore = true;
+ logToFile('INFO', `ast_search: created temporary .semgrepignore at ${targetPath}`);
+ } catch (err) {
+ logToFile('WARN', `ast_search: failed to create .semgrepignore: ${err.message}`);
+ }
  }
 
  const semgrepArgs = [
@@ -3633,7 +3667,11 @@ async function astSearch(pattern, searchPath, lang, maxResults) {
  } catch (err) {
  // semgrep exits non-zero when it has findings — stdout still contains valid JSON
  rawOutput = err.stdout || '';
- if (!rawOutput) throw new Error(`semgrep failed: ${(err.stderr || err.message || '').slice(0, 500)}`);
+ if (!rawOutput) {
+ const errMsg = (err.stderr || err.message || '').slice(0, 500);
+ logToFile('ERR', `ast_search: semgrep failed after ${Date.now() - astStart}ms: ${errMsg}`);
+ throw new Error(`semgrep failed: ${errMsg}`);
+ }
  } finally {
  if (createdSemgrepIgnore) { try { unlinkSync(semgrepIgnorePath); } catch { /* best effort */ } }
  }
@@ -3642,10 +3680,16 @@ async function astSearch(pattern, searchPath, lang, maxResults) {
  try {
  parsed = JSON.parse(rawOutput);
  } catch {
+ logToFile('ERR', `ast_search: failed to parse semgrep JSON output (${rawOutput.length} bytes)`);
  throw new Error(`Failed to parse semgrep output. First 300 chars: ${rawOutput.slice(0, 300)}`);
  }
 
  const findings = (parsed.results || []).slice(0, limit);
+ const astElapsed = Date.now() - astStart;
+ logToFile('INFO', `ast_search: ${findings.length} match(es) in ${astElapsed}ms (semgrep returned ${(parsed.results || []).length} total)`);
+ if (parsed.errors && parsed.errors.length > 0) {
+ logToFile('WARN', `ast_search: semgrep reported ${parsed.errors.length} error(s): ${parsed.errors.slice(0, 3).map(e => e.message || e.type || JSON.stringify(e)).join('; ')}`);
+ }
  return findings.map(r => ({
  file: r.path.replace(root + '/', ''),
  line: r.start.line,
@@ -4765,6 +4809,7 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const root = args.path || config.magentoRoot;
  const output = rustIndex(root);
  // Auto-enrich after indexing: runs in background, doesn't block response
+ logToFile('INFO', 'Auto-enrich: starting in background after index');
  enrichMethodChains(root).then(({ scanned, chains }) => {
  logToFile('INFO', `Auto-enrich complete: ${scanned} files, ${chains} chains`);
  }).catch(err => {
@@ -6112,8 +6157,10 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (queries.length > 10) {
  return { content: [{ type: 'text', text: 'Maximum 10 queries per batch.' }], isError: true };
  }
+ logToFile('INFO', `batch: ${queries.length} queries: ${queries.map(q => q.tool).join(', ')}`);
  // Run batch queries in parallel using existing standalone functions
  const batchResults = await Promise.all(queries.map(async (q, idx) => {
+ const batchItemStart = Date.now();
  try {
  const a = q.args || {};
  let text = '';
@@ -6416,8 +6463,10 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  default:
  text = `Unsupported batch tool: ${q.tool}`;
  }
+ logToFile('INFO', `batch[${idx}]: ${q.tool} completed (${Date.now() - batchItemStart}ms)`);
  return { idx, tool: q.tool, text };
  } catch (err) {
+ logToFile('ERR', `batch[${idx}]: ${q.tool} failed (${Date.now() - batchItemStart}ms): ${err.message}`);
  return { idx, tool: q.tool, text: `Error: ${err.message}` };
  }
  }));
@@ -6446,6 +6495,7 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  }
  grepArgs.push('--', args.pattern, searchPath);
  let output;
+ const grepStart = Date.now();
  try {
  output = execFileSync('grep', grepArgs, {
  cwd: root, encoding: 'utf-8', timeout: 30000,
@@ -6455,9 +6505,12 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  } catch (err) {
  // grep returns exit code 1 when no matches found
  output = err.stdout || '';
+ if (err.killed) logToFile('WARN', `grep: timed out after 30s for pattern "${args.pattern}"`);
  }
+ const grepElapsed = Date.now() - grepStart;
  const lines = output.trim().split('\n').filter(Boolean);
  const total = lines.length;
+ if (grepElapsed > 5000) logToFile('WARN', `grep: slow query "${args.pattern}" — ${total} matches in ${grepElapsed}ms`);
  const truncated = lines.slice(0, maxResults);
  let text = filesOnly
  ? `## grep (files only): \`${args.pattern}\`\nFound **${total}** file(s)${total > maxResults ? ` (showing first ${maxResults})` : ''}. Use magento_read with methodName to read specific methods.\n\n`
@@ -6619,6 +6672,7 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const filePath = path.join(root, args.path);
  let content;
  try { content = readFileSync(filePath, 'utf-8'); } catch (err) {
+ logToFile('WARN', `read: file not found: ${args.path} (${err.code || err.message})`);
  return { content: [{ type: 'text', text: `File not found: ${args.path}` }], isError: true };
  }
 
@@ -6626,6 +6680,7 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (args.methodName) {
  const body = readFullMethodBody(filePath, args.methodName);
  if (!body) {
+ logToFile('WARN', `read: method "${args.methodName}" not found in ${args.path}`);
  return { content: [{ type: 'text', text: `## ${args.path}\n\nMethod \`${args.methodName}\` not found in file.` }] };
  }
  // Find line number of the method