code-graph-llm 1.3.0 → 1.4.1

package/README.md CHANGED
@@ -3,14 +3,14 @@
3
3
  A language-agnostic, ultra-compact codebase mapper designed specifically for LLM agents to optimize context and token usage. It doesn't just list files; it provides a high-signal "map" of your project's architecture, including descriptions and signatures.
4
4
 
5
5
  ## Features
6
- - **Smart Context Extraction:** Captures JSDoc, Python docstrings, and preceding comments for files and symbols.
7
- - **Signature Fallback:** Automatically extracts function signatures (parameters/types) if documentation is missing.
8
- - **Recursive .gitignore Support:** Deeply respects both root and nested `.gitignore` files across the entire project structure.
9
- - **Smart Flutter/Dart Support:** Optimized to reduce noise by filtering out common widget instantiations while capturing real functional declarations.
10
- - **Compact & Dense:** Optimized for LLM token efficiency, replacing expensive recursive file scans.
11
- - **Language-Agnostic:** Optimized regex support for JS/TS, Python, Go, Rust, Java, C#, C/C++, Swift, PHP, Ruby, Dart, and more.
12
- - **Recursive Ignore Logic:** Deeply respects `.gitignore` and standard excludes (`node_modules`, `.git`).
13
- - **Live Sync:** Continuous background updates or Git pre-commit hooks.
6
+ - **Structural Knowledge Graph:** Captures relationships between files and classes:
7
+ - **Dependencies:** Tracks `imports`, `requires`, and `includes` across files.
8
+ - **Inheritance:** Maps `extends`, `implements`, and class hierarchies.
9
+ - **Smart Context Extraction:** Captures JSDoc, Python docstrings, and preceding comments.
10
+ - **Signature Fallback:** Extracts function signatures (parameters/types) if documentation is missing.
11
+ - **Recursive .gitignore Support:** Deeply respects both root and nested `.gitignore` files.
12
+ - **Compact & Dense:** Optimized for LLM token efficiency with a dedicated `## GRAPH EDGES` section.
13
+ - **Language-Agnostic:** Support for JS/TS, Python, Go, Rust, Java, C#, C/C++, Swift, PHP, Ruby, Dart, and more.
14
14
 
15
15
  ## Installation
16
16
 
@@ -38,12 +38,16 @@ code-graph install-hook
38
38
  ## LLM Usage & Token Efficiency
39
39
 
40
40
  ### The "Read First" Strategy
41
- Instruct your LLM agent to read `llm-code-graph.md` as its first step. The file uses a dense format that provides immediate architectural context:
41
+ Instruct your LLM agent to read `llm-code-graph.md` as its first step. The file provides a high-level map and a structural graph for relational reasoning:
42
42
 
43
43
  **Example Map Entry:**
44
44
  ```markdown
45
- - src/auth.js | desc: Handles user authentication and JWT validation.
45
+ - src/auth.js | desc: Handles user authentication.
46
46
  - syms: [login [ (username, password) ], validateToken [ (token: string) ]]
47
+
48
+ ## GRAPH EDGES
49
+ [src/auth.js] -> [imports] -> [jwt-library]
50
+ [AdminUser] -> [inherits] -> [BaseUser]
47
51
  ```
48
52
 
49
53
  **Example System Prompt:**
@@ -78,13 +82,9 @@ fn main() {
78
82
  ```
79
83
 
80
84
  ## How it works
81
- 1. **File Scanning:** Recursively walks the directory, ignoring patterns in `.gitignore`.
82
- 2. **Context Extraction:** Scans for classes, functions, and variables.
83
- 3. **Docstring Capture:** If a symbol has a preceding comment (`//`, `/**`, `#`, `"""`), it's captured as a description.
84
- 4. **Signature Capture:** If no comment is found, it captures the declaration signature (parameters) as a fallback.
85
- 5. **Compilation:** Writes a single, minified `llm-code-graph.md` file designed for machine consumption.
86
-
87
- ## Publishing as a Package
88
- To share your own version:
89
- 1. `npm login`
90
- 2. `npm publish --access public`
85
+ 1. **File Scanning:** Recursively walks the directory, ignoring patterns in `.gitignore` (recursive).
86
+ 2. **Context Extraction:** Scans for classes, functions, and variables while ignoring matches in comments.
87
+ 3. **Graph Extraction:** Identifies `imports`, `requires`, `extends`, and `implements` to build a structural skeleton.
88
+ 4. **Docstring Capture:** Captures preceding comments as descriptions.
89
+ 5. **Signature Capture:** Fallback to declaration signatures (parameters) if docs are missing.
90
+ 6. **Compilation:** Writes a single, minified `llm-code-graph.md` file with a dedicated `## GRAPH EDGES` section.
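
Note on consuming the new section: the README example shows edges written as `[source] -> [relation] -> [target]` under a `## GRAPH EDGES` heading. The following is a minimal sketch of how a consumer might read that section back into objects. `parseGraphEdges` is a hypothetical helper and is not part of the package; the line format is assumed from the README example only.

```javascript
// Hypothetical helper (not part of code-graph-llm): parses the
// "## GRAPH EDGES" section of llm-code-graph.md into edge objects.
function parseGraphEdges(mapText) {
  const lines = mapText.split('\n');
  const start = lines.findIndex((line) => line.trim() === '## GRAPH EDGES');
  if (start === -1) return []; // the section is omitted when there are no edges

  // Assumed line shape: [from] -> [relation] -> [to]
  const edgePattern = /^\[(.+?)\] -> \[(.+?)\] -> \[(.+?)\]$/;
  const edges = [];
  for (const line of lines.slice(start + 1)) {
    const match = edgePattern.exec(line.trim());
    if (!match) continue; // skip blank or unrelated lines
    edges.push({ from: match[1], relation: match[2], to: match[3] });
  }
  return edges;
}

// Sample text taken from the README's "Example Map Entry".
const sample = [
  '- src/auth.js | desc: Handles user authentication.',
  '',
  '## GRAPH EDGES',
  '[src/auth.js] -> [imports] -> [jwt-library]',
  '[AdminUser] -> [inherits] -> [BaseUser]',
].join('\n');

console.log(parseGraphEdges(sample));
```

For the sample above this yields two edge objects, one with relation `imports` and one with relation `inherits`.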
package/index.js CHANGED
@@ -13,20 +13,27 @@ const IGNORE_FILE = '.gitignore';
13
13
  const DEFAULT_MAP_FILE = 'llm-code-graph.md';
14
14
 
15
15
  const SYMBOL_REGEXES = [
16
- // Types, Classes, Interfaces (Universal)
17
- /\b(?:class|interface|type|struct|enum|protocol|extension|trait|module|namespace|object)\s+([a-zA-Z_]\w*)/g,
16
+ // Types, Classes, Interfaces (Universal) with Inheritance support
17
+ // Captures: class Name extends Parent, interface Name implements Base
18
+ /\b(?:class|interface|type|struct|enum|protocol|extension|trait|module|namespace|object)\s+([a-zA-Z_]\w*)(?:\s+(?:extends|implements|:)\s+([a-zA-Z_]\w*(?:\s*,\s*[a-zA-Z_]\w*)*))?/g,
18
19
 
19
20
  // Explicit Function Keywords
20
21
  /\b(?:function|def|fn|func|fun|method|procedure|sub|routine)\s+([a-zA-Z_]\w*)/g,
21
22
 
22
23
  // Method/Var Declarations (C-style, Java, C#, TS, Dart)
23
- // Refined to require a variable/function name followed by a declaration signal
24
24
  /\b(?:void|async|public|private|protected|static|virtual|override|readonly|int|float|double|char|bool|string|val|var|let|final)\s+([a-zA-Z_]\w*)(?=\s*(?:\([^)]*\)|[a-zA-Z_]\w*)\s*(?:\{|=>|;|=))/g,
25
25
 
26
26
  // Exported symbols
27
27
  /\bexport\s+(?:default\s+)?(?:const|let|var|function|class|type|interface|enum|async|val)\s+([a-zA-Z_]\w*)/g
28
28
  ];
29
29
 
30
+ const EDGE_REGEXES = [
31
+ // Imports/Includes (JS, TS, Python, Go, Rust, C++, Java, Dart)
32
+ /\b(?:import|from|include|require|using)\s*(?:[\(\s])\s*['"]?([@\w\.\/\-]+)['"]?/g,
33
+ // C-style includes
34
+ /#include\s+[<"]([\w\.\/\-]+)[>"]/g
35
+ ];
36
+
30
37
  export const SUPPORTED_EXTENSIONS = [
31
38
  '.js', '.ts', '.jsx', '.tsx', '.py', '.go', '.rs', '.java',
32
39
  '.cpp', '.c', '.h', '.hpp', '.cc', '.rb', '.php', '.swift',
@@ -48,31 +55,51 @@ export function getIgnores(cwd, additionalLines = []) {
48
55
  return ig;
49
56
  }
50
57
 
51
- export function extractSymbols(content) {
58
+ export function extractSymbolsAndInheritance(content) {
52
59
  const symbols = [];
60
+ const inheritance = [];
61
+
62
+ // Create a version of content without comments to find symbols accurately
63
+ const noComments = content.replace(/\/\*[\s\S]*?\*\/|\/\/.*/g, '');
64
+
53
65
  for (const regex of SYMBOL_REGEXES) {
54
66
  let match;
55
67
  regex.lastIndex = 0;
56
- while ((match = regex.exec(content)) !== null) {
68
+ while ((match = regex.exec(noComments)) !== null) {
57
69
  if (match[1]) {
58
70
  const symbolName = match[1];
59
- if (['if', 'for', 'while', 'switch', 'return', 'await', 'yield', 'const', 'new'].includes(symbolName)) continue;
71
+ if (['if', 'for', 'while', 'switch', 'return', 'await', 'yield', 'const', 'new', 'let', 'var'].includes(symbolName)) continue;
72
+
73
+ // Capture inheritance if present (match[2])
74
+ if (match[2]) {
75
+ const parents = match[2].split(',').map(p => p.trim());
76
+ parents.forEach(parent => {
77
+ inheritance.push({ child: symbolName, parent });
78
+ });
79
+ }
80
+
81
+ // To find the comment, we need to find the position in the ORIGINAL content
82
+ const escapedName = symbolName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
83
+ const posRegex = new RegExp(`\\b${escapedName}\\b`, 'g');
84
+ let posMatch = posRegex.exec(content);
85
+ if (!posMatch) continue;
60
86
 
61
- const linesBefore = content.substring(0, match.index).split('\n');
87
+ const linesBefore = content.substring(0, posMatch.index).split('\n');
62
88
  let comment = '';
63
- for (let i = linesBefore.length - 1; i >= 0; i--) {
89
+ // Skip the current line where the symbol is defined
90
+ for (let i = linesBefore.length - 2; i >= 0; i--) {
64
91
  const line = linesBefore[i].trim();
65
- if (line.startsWith('//') || line.startsWith('*') || line.startsWith('"""') || line.startsWith('#')) {
92
+ if (line.startsWith('//') || line.startsWith('*') || line.startsWith('"""') || line.startsWith('#') || line.startsWith('/*')) {
66
93
  const clean = line.replace(/[\/*#"]/g, '').trim();
67
94
  if (clean) comment = clean + (comment ? ' ' + comment : '');
68
- if (comment.length > 80) break;
95
+ if (comment.length > 100) break;
69
96
  } else if (line === '' && comment === '') continue;
70
97
  else break;
71
98
  }
72
99
 
73
100
  let context = comment;
74
101
  if (!context) {
75
- const remainingLine = content.substring(match.index + match[0].length);
102
+ const remainingLine = content.substring(posMatch.index + symbolName.length);
76
103
  const sigMatch = remainingLine.match(/^\s*(\([^)]*\)|[^\n{;]*)/);
77
104
  if (sigMatch && sigMatch[1].trim()) {
78
105
  context = sigMatch[1].trim();
@@ -83,14 +110,35 @@ export function extractSymbols(content) {
83
110
  }
84
111
  }
85
112
  }
86
- return Array.from(new Set(symbols)).sort();
113
+ return {
114
+ symbols: Array.from(new Set(symbols)).sort(),
115
+ inheritance: Array.from(new Set(inheritance.map(JSON.stringify))).map(JSON.parse)
116
+ };
117
+ }
118
+
119
+ export function extractEdges(content) {
120
+ const dependencies = new Set();
121
+ const noComments = content.replace(/\/\*[\s\S]*?\*\/|\/\/.*/g, '');
122
+ for (const regex of EDGE_REGEXES) {
123
+ let match;
124
+ regex.lastIndex = 0;
125
+ while ((match = regex.exec(noComments)) !== null) {
126
+ if (match[1]) {
127
+ const dep = match[1];
128
+ if (dep.length > 1 && !['style', 'react', 'vue', 'flutter'].includes(dep.toLowerCase())) {
129
+ dependencies.add(dep);
130
+ }
131
+ }
132
+ }
133
+ }
134
+ return Array.from(dependencies).sort();
87
135
  }
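
Note on the import pattern: the first `EDGE_REGEXES` entry captures the first token after the keyword, quoted or not, and its character class does not include `:`. The sketch below copies that regex from the diff and runs it on one ES module import to show what it captures. `QUOTED_SPECIFIER` and `captures` are hypothetical names introduced here for comparison; they are an assumption about one possible tightening, not the package's code.

```javascript
// Copied from the first EDGE_REGEXES entry in the diff above.
const IMPORT_REGEX = /\b(?:import|from|include|require|using)\s*(?:[\(\s])\s*['"]?([@\w\.\/\-]+)['"]?/g;

// Hypothetical alternative (an assumption, not the package's code):
// accept only quoted specifiers and allow any character inside the quotes.
const QUOTED_SPECIFIER = /\b(?:import|from|require)\s*\(?\s*['"]([^'"\n]+)['"]/g;

function captures(regex, source) {
  // matchAll works on an internal copy of the regex, so the shared
  // regex object's lastIndex is not advanced between calls.
  return Array.from(source.matchAll(regex), (m) => m[1]);
}

const line = "import fs from 'node:fs';";
console.log(captures(IMPORT_REGEX, line));     // binding name plus truncated specifier
console.log(captures(QUOTED_SPECIFIER, line)); // full quoted specifier only
```

With the original pattern the line yields `fs` (the local binding) and `node` (the specifier cut at the colon), which matches the `[fs]` and `[node]` edges in the regenerated `llm-code-graph.md`. The quoted-only variant yields `node:fs`, at the cost of no longer matching unquoted forms such as Python's `import os`.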
88
136
 
89
137
  async function generate(cwd = process.cwd()) {
90
138
  const files = [];
139
+ const allEdges = [];
91
140
 
92
141
  function walk(dir, ig) {
93
- // Check for local .gitignore and create a new scoped ignore object if found
94
142
  let localIg = ig;
95
143
  const localIgnorePath = path.join(dir, IGNORE_FILE);
96
144
  if (fs.existsSync(localIgnorePath) && dir !== cwd) {
@@ -98,7 +146,6 @@ async function generate(cwd = process.cwd()) {
98
146
  const lines = content.split('\n').map(line => {
99
147
  line = line.trim();
100
148
  if (!line || line.startsWith('#')) return null;
101
- // Rules in sub-gitignore are relative to that directory
102
149
  const relDir = path.relative(cwd, dir).replace(/\\/g, '/');
103
150
  return relDir ? `${relDir}/${line}` : line;
104
151
  }).filter(Boolean);
@@ -121,17 +168,34 @@ async function generate(cwd = process.cwd()) {
121
168
  const ext = path.extname(entry.name);
122
169
  if (SUPPORTED_EXTENSIONS.includes(ext)) {
123
170
  const content = fs.readFileSync(fullPath, 'utf8');
124
- const firstLines = content.split('\n').slice(0, 5);
171
+
172
+ // Extract file-level description
173
+ const lines = content.split('\n');
125
174
  let fileDesc = '';
126
- for (const line of firstLines) {
127
- const trimmed = line.trim();
128
- if (trimmed.startsWith('//') || trimmed.startsWith('#') || trimmed.startsWith('/*')) {
129
- fileDesc += trimmed.replace(/[\/*#]/g, '').trim() + ' ';
175
+ for (let i = 0; i < Math.min(10, lines.length); i++) {
176
+ const line = lines[i].trim();
177
+ if (line.startsWith('#!') || line === '') continue;
178
+ if (line.startsWith('//') || line.startsWith('#') || line.startsWith('/*')) {
179
+ fileDesc += line.replace(/[\/*#]/g, '').trim() + ' ';
180
+ } else {
181
+ break;
130
182
  }
131
183
  }
132
- const symbols = extractSymbols(content);
184
+
185
+ const { symbols, inheritance } = extractSymbolsAndInheritance(content);
186
+ const dependencies = extractEdges(content);
187
+
133
188
  if (!fileDesc.trim() && symbols.length > 0) fileDesc = `Contains ${symbols.length} symbols.`;
189
+
134
190
  files.push({ path: normalizedPath, desc: fileDesc.trim(), symbols });
191
+
192
+ // Collect Edges
193
+ dependencies.forEach(dep => {
194
+ allEdges.push(`[${normalizedPath}] -> [imports] -> [${dep}]`);
195
+ });
196
+ inheritance.forEach(inh => {
197
+ allEdges.push(`[${inh.child}] -> [inherits] -> [${inh.parent}]`);
198
+ });
135
199
  }
136
200
  }
137
201
  }
@@ -139,14 +203,18 @@ async function generate(cwd = process.cwd()) {
139
203
 
140
204
  walk(cwd, getIgnores(cwd));
141
205
 
142
- const output = files.map(f => {
206
+ const nodesOutput = files.map(f => {
143
207
  const descStr = f.desc ? ` | desc: ${f.desc.substring(0, 100)}` : '';
144
208
  const symStr = f.symbols.length > 0 ? `\n - syms: [${f.symbols.join(', ')}]` : '';
145
209
  return `- ${f.path}${descStr}${symStr}`;
146
210
  }).join('\n');
147
211
 
212
+ const edgesOutput = allEdges.length > 0
213
+ ? `\n\n## GRAPH EDGES\n${Array.from(new Set(allEdges)).sort().join('\n')}`
214
+ : '';
215
+
148
216
  const header = `# CODE_GRAPH_MAP\n> LLM_ONLY: DO NOT EDIT. COMPACT PROJECT MAP.\n\n`;
149
- fs.writeFileSync(path.join(cwd, DEFAULT_MAP_FILE), header + output);
217
+ fs.writeFileSync(path.join(cwd, DEFAULT_MAP_FILE), header + nodesOutput + edgesOutput);
150
218
  console.log(`[Code-Graph] Updated ${DEFAULT_MAP_FILE}`);
151
219
  }
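
Note on the inheritance capture: the class pattern added to `SYMBOL_REGEXES` requires whitespace before `extends`, `implements`, or `:`, and captures a single comma-separated parent list. The sketch below copies that regex from the diff and applies the same comma split as `extractSymbolsAndInheritance`. `inheritancePairs` is a hypothetical helper name used only for this illustration.

```javascript
// Copied from the first SYMBOL_REGEXES entry in the diff above.
const CLASS_REGEX = /\b(?:class|interface|type|struct|enum|protocol|extension|trait|module|namespace|object)\s+([a-zA-Z_]\w*)(?:\s+(?:extends|implements|:)\s+([a-zA-Z_]\w*(?:\s*,\s*[a-zA-Z_]\w*)*))?/g;

// Illustrative helper (hypothetical name): builds { child, parent } pairs
// the same way the diff splits match[2] on commas.
function inheritancePairs(source) {
  const pairs = [];
  for (const match of source.matchAll(CLASS_REGEX)) {
    if (!match[2]) continue; // no parent list captured
    for (const parent of match[2].split(',')) {
      pairs.push({ child: match[1], parent: parent.trim() });
    }
  }
  return pairs;
}

// Comma-separated parents produce one pair each.
console.log(inheritancePairs('class Repo implements Readable, Writable {}'));
// No whitespace before ":" means the optional group is skipped.
console.log(inheritancePairs('class Point: Shape {}'));
// Only the first clause is captured when both extends and implements appear.
console.log(inheritancePairs('class Admin extends User implements Auditable {}'));
```

The three calls return two pairs, no pairs, and one pair (`Admin` to `User`) respectively, so declarations written as `class Point: Shape` or with both `extends` and `implements` are only partially reflected in the edge list.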
152
220
 
package/llm-code-graph.md CHANGED
@@ -1,10 +1,29 @@
1
1
  # CODE_GRAPH_MAP
2
2
  > LLM_ONLY: DO NOT EDIT. COMPACT PROJECT MAP.
3
3
 
4
- - .gitignore
5
- - index.js | desc: !usrbinenv node
6
- - syms: [Name [Dart:], Name [PHP: class Name, interface Name, trait Name,], Name [PHP: class Name, interface Name,], Name [PHP: class Name,], Name [PHP:], Name [Ruby: def name, class Name,], Name [Ruby: def name,], Name [Swift: func name, class Name, struct Name, protocol Name,], Name [Swift: func name, class Name, struct Name,], Name [Swift: func name, class Name,], Name [Swift: func name,], SUPPORTED_EXTENSIONS [= [], extractSymbols [(content)], function [generate(cwd = process.cwd())], generate [(cwd = process.cwd()], getIgnores [(cwd)], installHook [(cwd = process.cwd()], is [We must check if p is a directory to append the trailing slash Since chokidar's ignore], name [Dart: class Name, void name,], name [Ruby:], name [Swift:], walk [(dir)], watch [(cwd = process.cwd()]]
7
- - package.json
8
- - README.md
9
- - test/index.test.js | desc: Contains 5 symbols.
10
- - syms: [noDocFunc [(arg1: string, arg2: number)], py_func [(x)], py_func [Note: Current regex captures '], py_func_2 [This is a python comment], testFunc [This is a test function]]
4
+ - index.js | desc: Contains 7 symbols.
5
+ - syms: [SUPPORTED_EXTENSIONS [= [], extractSymbolsAndInheritance [(content)], generate [(cwd = process.cwd()], getIgnores [(cwd, additionalLines = [])], installHook [(cwd = process.cwd()], walk [(dir, ig)], watch [(cwd = process.cwd()]]
6
+ - test/index.test.js | desc: Contains 8 symbols.
7
+ - syms: [AdminUser [extends BaseUser], IRepository [implements IBase], MyWidget [: StatelessWidget], ignored [by subdir/.gitignore], included [.js'), 'function included()], noDocFunc [(arg1: string, arg2: number)], realFunction [()], testFunc [This is a test function]]
8
+
9
+ ## GRAPH EDGES
10
+ [AdminUser] -> [inherits] -> [BaseUser]
11
+ [IRepository] -> [inherits] -> [IBase]
12
+ [MyWidget] -> [inherits] -> [StatelessWidget]
13
+ [index.js] -> [imports] -> [chokidar]
14
+ [index.js] -> [imports] -> [dependencies]
15
+ [index.js] -> [imports] -> [fs]
16
+ [index.js] -> [imports] -> [ignore]
17
+ [index.js] -> [imports] -> [new]
18
+ [index.js] -> [imports] -> [path]
19
+ [index.js] -> [imports] -> [url]
20
+ [test/index.test.js] -> [imports] -> [../index.js]
21
+ [test/index.test.js] -> [imports] -> [./local-file]
22
+ [test/index.test.js] -> [imports] -> [assert]
23
+ [test/index.test.js] -> [imports] -> [fs]
24
+ [test/index.test.js] -> [imports] -> [header.h]
25
+ [test/index.test.js] -> [imports] -> [node]
26
+ [test/index.test.js] -> [imports] -> [other-module]
27
+ [test/index.test.js] -> [imports] -> [path]
28
+ [test/index.test.js] -> [imports] -> [test]
29
+ [test/index.test.js] -> [imports] -> [url]
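
Note on symbol context: entries such as `ignored [by subdir/.gitignore]` and `included [.js'), 'function included()]` in the regenerated map suggest that the context lookup landed on an earlier textual occurrence of the name rather than on its declaration. In the index.js diff, symbols are matched in the comment-stripped text and then relocated in the original text by searching for the name from the start. One possible way to avoid the second search, sketched here under that assumption, is to blank comments out with spaces so offsets stay aligned. `blankComments` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper (not in the package): replaces comment characters with
// spaces instead of deleting them, so string offsets stay identical.
function blankComments(source) {
  return source.replace(/\/\*[\s\S]*?\*\/|\/\/.*/g, (comment) =>
    comment.replace(/[^\n]/g, ' ') // keep newlines so line counts match too
  );
}

const original = [
  '// helper is mentioned here first',
  'function helper(a, b) {}',
].join('\n');

const blanked = blankComments(original);
const match = /\bfunction\s+([a-zA-Z_]\w*)/.exec(blanked);

// The name's offset in `blanked` is also a valid offset in `original`.
const nameIndex = match.index + match[0].length - match[1].length;
console.log(original.slice(nameIndex, nameIndex + match[1].length)); // helper
console.log(original.indexOf('helper') === nameIndex); // false
```

A search from the start of the original text finds `helper` inside the comment, while the aligned offset points at the declaration. The comment regex is the one used in the diff, so it still treats `//` inside string literals as a comment; that limitation is unchanged by this sketch.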
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "code-graph-llm",
3
- "version": "1.3.0",
3
+ "version": "1.4.1",
4
4
  "description": "Compact, language-agnostic codebase mapper for LLM token efficiency.",
5
5
  "main": "index.js",
6
6
  "bin": {
package/test/index.test.js CHANGED
@@ -4,7 +4,8 @@ import fs from 'node:fs';
4
4
  import path from 'node:path';
5
5
  import { fileURLToPath } from 'url';
6
6
  import {
7
- extractSymbols,
7
+ extractSymbolsAndInheritance,
8
+ extractEdges,
8
9
  getIgnores,
9
10
  SUPPORTED_EXTENSIONS,
10
11
  generate
@@ -17,7 +18,7 @@ test('extractSymbols - JS/TS Docstrings', () => {
17
18
  */
18
19
  function testFunc(a, b) {}
19
20
  `;
20
- const symbols = extractSymbols(code);
21
+ const { symbols } = extractSymbolsAndInheritance(code);
21
22
  assert.ok(symbols.some(s => s.includes('testFunc') && s.includes('This is a test function')));
22
23
  });
23
24
 
@@ -27,8 +28,8 @@ test('extractSymbols - Signature Fallback', () => {
27
28
  return true;
28
29
  }
29
30
  `;
30
- const symbols = extractSymbols(code);
31
- // Matches "noDocFunc [(arg1: string, arg2: number)]"
31
+ const { symbols } = extractSymbolsAndInheritance(code);
32
+ // Matches "noDocFunc [ (arg1: string, arg2: number)]"
32
33
  assert.ok(symbols.some(s => s.includes('noDocFunc') && s.includes('arg1: string, arg2: number')));
33
34
  });
34
35
 
@@ -37,11 +38,35 @@ test('extractSymbols - Flutter/Dart Noise Reduction', () => {
37
38
  const SizedBox(height: 10);
38
39
  void realFunction() {}
39
40
  `;
40
- const symbols = extractSymbols(code);
41
+ const { symbols } = extractSymbolsAndInheritance(code);
41
42
  assert.ok(symbols.some(s => s.includes('realFunction')));
42
43
  assert.ok(!symbols.some(s => s.includes('SizedBox')));
43
44
  });
44
45
 
46
+ test('extractInheritance - Class relationships', () => {
47
+ const code = `
48
+ class AdminUser extends BaseUser {}
49
+ interface IRepository implements IBase {}
50
+ class MyWidget : StatelessWidget {}
51
+ `;
52
+ const { inheritance } = extractSymbolsAndInheritance(code);
53
+ assert.ok(inheritance.some(i => i.child === 'AdminUser' && i.parent === 'BaseUser'));
54
+ assert.ok(inheritance.some(i => i.child === 'IRepository' && i.parent === 'IBase'));
55
+ assert.ok(inheritance.some(i => i.child === 'MyWidget' && i.parent === 'StatelessWidget'));
56
+ });
57
+
58
+ test('extractEdges - Imports and includes', () => {
59
+ const code = `
60
+ import { something } from './local-file';
61
+ const other = require('other-module');
62
+ #include "header.h"
63
+ `;
64
+ const edges = extractEdges(code);
65
+ assert.ok(edges.includes('./local-file'));
66
+ assert.ok(edges.includes('other-module'));
67
+ assert.ok(edges.includes('header.h'));
68
+ });
69
+
45
70
  test('getIgnores - Default Patterns', () => {
46
71
  const ig = getIgnores(process.cwd());
47
72
  assert.strictEqual(ig.ignores('.git/'), true);