code-graph-llm 1.3.0 → 1.4.3

package/README.md CHANGED
@@ -3,14 +3,14 @@
  A language-agnostic, ultra-compact codebase mapper designed specifically for LLM agents to optimize context and token usage. It doesn't just list files; it provides a high-signal "map" of your project's architecture, including descriptions and signatures.
 
  ## Features
- - **Smart Context Extraction:** Captures JSDoc, Python docstrings, and preceding comments for files and symbols.
- - **Signature Fallback:** Automatically extracts function signatures (parameters/types) if documentation is missing.
- - **Recursive .gitignore Support:** Deeply respects both root and nested `.gitignore` files across the entire project structure.
- - **Smart Flutter/Dart Support:** Optimized to reduce noise by filtering out common widget instantiations while capturing real functional declarations.
- - **Compact & Dense:** Optimized for LLM token efficiency, replacing expensive recursive file scans.
- - **Language-Agnostic:** Optimized regex support for JS/TS, Python, Go, Rust, Java, C#, C/C++, Swift, PHP, Ruby, Dart, and more.
- - **Recursive Ignore Logic:** Deeply respects `.gitignore` and standard excludes (`node_modules`, `.git`).
- - **Live Sync:** Continuous background updates or Git pre-commit hooks.
+ - **Structural Knowledge Graph:** Captures relationships between files and classes:
+ - **Dependencies:** Tracks `imports`, `requires`, and `includes` across files.
+ - **Inheritance:** Maps `extends`, `implements`, and class hierarchies.
+ - **Smart Context Extraction:** Captures JSDoc, Python docstrings, and preceding comments.
+ - **Signature Fallback:** Extracts function signatures (parameters/types) if documentation is missing.
+ - **Recursive .gitignore Support:** Deeply respects both root and nested `.gitignore` files.
+ - **Compact & Dense:** Optimized for LLM token efficiency with a dedicated `## GRAPH EDGES` section.
+ - **Language-Agnostic:** Support for JS/TS, Python, Go, Rust, Java, C#, C/C++, Swift, PHP, Ruby, Dart, and more.
 
  ## Installation
 
@@ -38,12 +38,16 @@ code-graph install-hook
  ## LLM Usage & Token Efficiency
 
  ### The "Read First" Strategy
- Instruct your LLM agent to read `llm-code-graph.md` as its first step. The file uses a dense format that provides immediate architectural context:
+ Instruct your LLM agent to read `llm-code-graph.md` as its first step. The file provides a high-level map and a structural graph for relational reasoning:
 
  **Example Map Entry:**
  ```markdown
- - src/auth.js | desc: Handles user authentication and JWT validation.
+ - src/auth.js | desc: Handles user authentication.
  - syms: [login [ (username, password) ], validateToken [ (token: string) ]]
+
+ ## GRAPH EDGES
+ [src/auth.js] -> [imports] -> [jwt-library]
+ [AdminUser] -> [inherits] -> [BaseUser]
  ```
 
  **Example System Prompt:**
@@ -78,13 +82,9 @@ fn main() {
  ```
 
  ## How it works
- 1. **File Scanning:** Recursively walks the directory, ignoring patterns in `.gitignore`.
- 2. **Context Extraction:** Scans for classes, functions, and variables.
- 3. **Docstring Capture:** If a symbol has a preceding comment (`//`, `/**`, `#`, `"""`), it's captured as a description.
- 4. **Signature Capture:** If no comment is found, it captures the declaration signature (parameters) as a fallback.
- 5. **Compilation:** Writes a single, minified `llm-code-graph.md` file designed for machine consumption.
-
- ## Publishing as a Package
- To share your own version:
- 1. `npm login`
- 2. `npm publish --access public`
+ 1. **File Scanning:** Recursively walks the directory, ignoring patterns in `.gitignore` (recursive).
+ 2. **Context Extraction:** Scans for classes, functions, and variables while ignoring matches in comments.
+ 3. **Graph Extraction:** Identifies `imports`, `requires`, `extends`, and `implements` to build a structural skeleton.
+ 4. **Docstring Capture:** Captures preceding comments as descriptions.
+ 5. **Signature Capture:** Fallback to declaration signatures (parameters) if docs are missing.
+ 6. **Compilation:** Writes a single, minified `llm-code-graph.md` file with a dedicated `## GRAPH EDGES` section.
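
Note on consuming the new section: the README changes above describe a `## GRAPH EDGES` block made of `[source] -> [relation] -> [target]` lines. The following is a minimal sketch of how a consumer might read that block back into a lookup structure. `parseGraphEdges` and `toAdjacency` are hypothetical helpers written for illustration; they are not part of code-graph-llm, and the line format is assumed from the README example only.

```javascript
// Hypothetical helper (not part of code-graph-llm): parse the "## GRAPH EDGES"
// section of a generated map into { from, relation, to } records.
function parseGraphEdges(markdown) {
  const edges = [];
  let inEdgesSection = false;
  // Assumed line shape, taken from the README example: [a] -> [rel] -> [b]
  const edgeLine = /^\[(.+?)\] -> \[(.+?)\] -> \[(.+?)\]$/;

  for (const rawLine of markdown.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('## ')) {
      // Only lines under the GRAPH EDGES heading are treated as edges.
      inEdgesSection = line === '## GRAPH EDGES';
      continue;
    }
    if (!inEdgesSection) continue;
    const match = edgeLine.exec(line);
    if (match) edges.push({ from: match[1], relation: match[2], to: match[3] });
  }
  return edges;
}

// Group edges by source node for quick "what does X depend on?" lookups.
function toAdjacency(edges) {
  const adjacency = new Map();
  for (const { from, relation, to } of edges) {
    if (!adjacency.has(from)) adjacency.set(from, []);
    adjacency.get(from).push({ relation, to });
  }
  return adjacency;
}

// Sample input mirrors the README's example map entry.
const sampleMap = [
  '- src/auth.js | desc: Handles user authentication.',
  '',
  '## GRAPH EDGES',
  '[src/auth.js] -> [imports] -> [jwt-library]',
  '[AdminUser] -> [inherits] -> [BaseUser]',
].join('\n');

const sampleGraph = toAdjacency(parseGraphEdges(sampleMap));
console.log(sampleGraph.get('src/auth.js'));
```

Keeping the parser line-oriented matches the one-edge-per-line output, so a partially written or truncated map still yields every complete edge.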
package/index.js CHANGED
@@ -11,22 +11,32 @@ const __dirname = path.dirname(__filename);
 
  const IGNORE_FILE = '.gitignore';
  const DEFAULT_MAP_FILE = 'llm-code-graph.md';
-
  const SYMBOL_REGEXES = [
- // Types, Classes, Interfaces (Universal)
- /\b(?:class|interface|type|struct|enum|protocol|extension|trait|module|namespace|object)\s+([a-zA-Z_]\w*)/g,
-
+ // Types, Classes, Interfaces (Universal) with Inheritance support
+ /\b(?:class|interface|type|struct|enum|protocol|extension|trait|module|namespace|object)\s+([a-zA-Z_]\w*)(?:[^\n\S]*(?:extends|implements|:|(?:\())[^\n\S]*([a-zA-Z_]\w*(?:[^\n\S]*,\s*[a-zA-Z_]\w*)*)\)?)?/g,
+
  // Explicit Function Keywords
  /\b(?:function|def|fn|func|fun|method|procedure|sub|routine)\s+([a-zA-Z_]\w*)/g,
-
- // Method/Var Declarations (C-style, Java, C#, TS, Dart)
- // Refined to require a variable/function name followed by a declaration signal
- /\b(?:void|async|public|private|protected|static|virtual|override|readonly|int|float|double|char|bool|string|val|var|let|final)\s+([a-zA-Z_]\w*)(?=\s*(?:\([^)]*\)|[a-zA-Z_]\w*)\s*(?:\{|=>|;|=))/g,
-
+
+ // Method/Var Declarations (Java, Spring Boot, C-style)
+ // Captures: public String askAi(String question), void realFunction()
+ /\b(?:void|async|public|private|protected|static|final|native|synchronized|abstract|transient|volatile)\s+(?:[\w<>[\]]+\s+)?([a-zA-Z_]\w*)(?=\s*\([^)]*\)\s*(?:\{|=>|;|=))/g,
+
+ // Spring/Java/Dart Annotations (Captures the annotation name as a prefix)
+ /(@[a-zA-Z_]\w*(?:\([^)]*\))?)\s*(?:(?:public|private|protected|static|final|abstract|class|interface|enum|void|[\w<>[\]]+)\s+)+([a-zA-Z_]\w*)/g,
+
  // Exported symbols
  /\bexport\s+(?:default\s+)?(?:const|let|var|function|class|type|interface|enum|async|val)\s+([a-zA-Z_]\w*)/g
  ];
 
+
+ const EDGE_REGEXES = [
+ // Imports/Includes (JS, TS, Python, Go, Rust, C++, Java, Dart)
+ /\b(?:import|from|include|require|using)\s*(?:[\(\s])\s*['"]?([@\w\.\/\-]+)['"]?/g,
+ // C-style includes
+ /#include\s+[<"]([\w\.\/\-]+)[>"]/g
+ ];
+
  export const SUPPORTED_EXTENSIONS = [
  '.js', '.ts', '.jsx', '.tsx', '.py', '.go', '.rs', '.java',
  '.cpp', '.c', '.h', '.hpp', '.cc', '.rb', '.php', '.swift',
@@ -48,49 +58,104 @@ export function getIgnores(cwd, additionalLines = []) {
  return ig;
  }
 
- export function extractSymbols(content) {
+ export function extractSymbolsAndInheritance(content) {
  const symbols = [];
+ const inheritance = [];
+
+ // Create a version of content without comments AND strings to find symbols accurately
+ const cleanContent = content
+ .replace(/\/\*[\s\S]*?\*\/|\/\/.*/g, '')
+ .replace(/['"`](?:\\.|[^'"`])*['"`]/g, '');
+
  for (const regex of SYMBOL_REGEXES) {
  let match;
  regex.lastIndex = 0;
- while ((match = regex.exec(content)) !== null) {
+ while ((match = regex.exec(cleanContent)) !== null) {
  if (match[1]) {
- const symbolName = match[1];
- if (['if', 'for', 'while', 'switch', 'return', 'await', 'yield', 'const', 'new'].includes(symbolName)) continue;
+ // Handle the Annotation regex separately (it has 2 groups)
+ let symbolName = match[1];
+ let annotation = '';
+
+ if (symbolName.startsWith('@')) {
+ annotation = symbolName;
+ symbolName = match[2] || '';
+ if (!symbolName) continue;
+ }
+
+ if (['if', 'for', 'while', 'switch', 'return', 'await', 'yield', 'const', 'new', 'let', 'var', 'class', 'void', 'public', 'private', 'protected'].includes(symbolName)) continue;
 
- const linesBefore = content.substring(0, match.index).split('\n');
+ // Capture inheritance if present (match[2] only for non-annotation regex)
+ if (!annotation && match[2]) {
+ const parents = match[2].split(',').map(p => p.trim());
+ parents.forEach(parent => {
+ inheritance.push({ child: symbolName, parent });
+ });
+ }
+
+ // To find the comment, we need to find the position in the ORIGINAL content
+ const escapedName = symbolName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+ const posRegex = new RegExp(`\\b${escapedName}\\b`, 'g');
+ let posMatch = posRegex.exec(content);
+ if (!posMatch) continue;
+
+ const linesBefore = content.substring(0, posMatch.index).split('\n');
  let comment = '';
- for (let i = linesBefore.length - 1; i >= 0; i--) {
+ // Skip the current line where the symbol is defined
+ for (let i = linesBefore.length - 2; i >= 0; i--) {
  const line = linesBefore[i].trim();
- if (line.startsWith('//') || line.startsWith('*') || line.startsWith('"""') || line.startsWith('#')) {
+ if (line.startsWith('//') || line.startsWith('*') || line.startsWith('"""') || line.startsWith('#') || line.startsWith('/*')) {
  const clean = line.replace(/[\/*#"]/g, '').trim();
  if (clean) comment = clean + (comment ? ' ' + comment : '');
- if (comment.length > 80) break;
+ if (comment.length > 100) break;
  } else if (line === '' && comment === '') continue;
+ else if (line.startsWith('@')) continue; // Skip annotations in comment search
  else break;
  }
 
  let context = comment;
  if (!context) {
- const remainingLine = content.substring(match.index + match[0].length);
+ const remainingLine = content.substring(posMatch.index + symbolName.length);
  const sigMatch = remainingLine.match(/^\s*(\([^)]*\)|[^\n{;]*)/);
  if (sigMatch && sigMatch[1].trim()) {
  context = sigMatch[1].trim();
  }
  }
 
- symbols.push(context ? `${symbolName} [${context}]` : symbolName);
+ const displaySymbol = annotation ? `${annotation} ${symbolName}` : symbolName;
+ symbols.push(context ? `${displaySymbol} [${context}]` : displaySymbol);
  }
  }
  }
- return Array.from(new Set(symbols)).sort();
+ return {
+ symbols: Array.from(new Set(symbols)).sort(),
+ inheritance: Array.from(new Set(inheritance.map(JSON.stringify))).map(JSON.parse)
+ };
+ }
+
+ export function extractEdges(content) {
+ const dependencies = new Set();
+ const noComments = content.replace(/\/\*[\s\S]*?\*\/|\/\/.*/g, '');
+ for (const regex of EDGE_REGEXES) {
+ let match;
+ regex.lastIndex = 0;
+ while ((match = regex.exec(noComments)) !== null) {
+ if (match[1]) {
+ const dep = match[1];
+ // Filter out obvious noise, common library names, or keywords
+ if (dep.length > 1 && !['style', 'react', 'vue', 'flutter', 'new', 'const', 'let', 'var', 'dependencies', 'from', 'import'].includes(dep.toLowerCase())) {
+ dependencies.add(dep);
+ }
+ }
+ }
+ }
+ return Array.from(dependencies).sort();
  }
 
  async function generate(cwd = process.cwd()) {
  const files = [];
+ const allEdges = [];
 
  function walk(dir, ig) {
- // Check for local .gitignore and create a new scoped ignore object if found
  let localIg = ig;
  const localIgnorePath = path.join(dir, IGNORE_FILE);
  if (fs.existsSync(localIgnorePath) && dir !== cwd) {
@@ -98,7 +163,6 @@ async function generate(cwd = process.cwd()) {
  const lines = content.split('\n').map(line => {
  line = line.trim();
  if (!line || line.startsWith('#')) return null;
- // Rules in sub-gitignore are relative to that directory
  const relDir = path.relative(cwd, dir).replace(/\\/g, '/');
  return relDir ? `${relDir}/${line}` : line;
  }).filter(Boolean);
@@ -121,17 +185,34 @@ async function generate(cwd = process.cwd()) {
  const ext = path.extname(entry.name);
  if (SUPPORTED_EXTENSIONS.includes(ext)) {
  const content = fs.readFileSync(fullPath, 'utf8');
- const firstLines = content.split('\n').slice(0, 5);
+
+ // Extract file-level description
+ const lines = content.split('\n');
  let fileDesc = '';
- for (const line of firstLines) {
- const trimmed = line.trim();
- if (trimmed.startsWith('//') || trimmed.startsWith('#') || trimmed.startsWith('/*')) {
- fileDesc += trimmed.replace(/[\/*#]/g, '').trim() + ' ';
+ for (let i = 0; i < Math.min(10, lines.length); i++) {
+ const line = lines[i].trim();
+ if (line.startsWith('#!') || line === '') continue;
+ if (line.startsWith('//') || line.startsWith('#') || line.startsWith('/*')) {
+ fileDesc += line.replace(/[\/*#]/g, '').trim() + ' ';
+ } else {
+ break;
  }
  }
- const symbols = extractSymbols(content);
+
+ const { symbols, inheritance } = extractSymbolsAndInheritance(content);
+ const dependencies = extractEdges(content);
+
  if (!fileDesc.trim() && symbols.length > 0) fileDesc = `Contains ${symbols.length} symbols.`;
+
  files.push({ path: normalizedPath, desc: fileDesc.trim(), symbols });
+
+ // Collect Edges
+ dependencies.forEach(dep => {
+ allEdges.push(`[${normalizedPath}] -> [imports] -> [${dep}]`);
+ });
+ inheritance.forEach(inh => {
+ allEdges.push(`[${inh.child}] -> [inherits] -> [${inh.parent}]`);
+ });
  }
  }
  }
@@ -139,14 +220,18 @@ async function generate(cwd = process.cwd()) {
 
  walk(cwd, getIgnores(cwd));
 
- const output = files.map(f => {
+ const nodesOutput = files.map(f => {
  const descStr = f.desc ? ` | desc: ${f.desc.substring(0, 100)}` : '';
  const symStr = f.symbols.length > 0 ? `\n - syms: [${f.symbols.join(', ')}]` : '';
  return `- ${f.path}${descStr}${symStr}`;
  }).join('\n');
 
+ const edgesOutput = allEdges.length > 0
+ ? `\n\n## GRAPH EDGES\n${Array.from(new Set(allEdges)).sort().join('\n')}`
+ : '';
+
  const header = `# CODE_GRAPH_MAP\n> LLM_ONLY: DO NOT EDIT. COMPACT PROJECT MAP.\n\n`;
- fs.writeFileSync(path.join(cwd, DEFAULT_MAP_FILE), header + output);
+ fs.writeFileSync(path.join(cwd, DEFAULT_MAP_FILE), header + nodesOutput + edgesOutput);
  console.log(`[Code-Graph] Updated ${DEFAULT_MAP_FILE}`);
  }
 
package/llm-code-graph.md CHANGED
@@ -1,10 +1,23 @@
  # CODE_GRAPH_MAP
  > LLM_ONLY: DO NOT EDIT. COMPACT PROJECT MAP.
 
- - .gitignore
- - index.js | desc: !usrbinenv node
- - syms: [Name [Dart:], Name [PHP: class Name, interface Name, trait Name,], Name [PHP: class Name, interface Name,], Name [PHP: class Name,], Name [PHP:], Name [Ruby: def name, class Name,], Name [Ruby: def name,], Name [Swift: func name, class Name, struct Name, protocol Name,], Name [Swift: func name, class Name, struct Name,], Name [Swift: func name, class Name,], Name [Swift: func name,], SUPPORTED_EXTENSIONS [= [], extractSymbols [(content)], function [generate(cwd = process.cwd())], generate [(cwd = process.cwd()], getIgnores [(cwd)], installHook [(cwd = process.cwd()], is [We must check if p is a directory to append the trailing slash Since chokidar's ignore], name [Dart: class Name, void name,], name [Ruby:], name [Swift:], walk [(dir)], watch [(cwd = process.cwd()]]
- - package.json
- - README.md
- - test/index.test.js | desc: Contains 5 symbols.
- - syms: [noDocFunc [(arg1: string, arg2: number)], py_func [(x)], py_func [Note: Current regex captures '], py_func_2 [This is a python comment], testFunc [This is a test function]]
+ - index.js | desc: Contains 5 symbols.
+ - syms: [SUPPORTED_EXTENSIONS [= [], extractSymbolsAndInheritance [(content)], generate [(cwd = process.cwd()], getIgnores [(cwd, additionalLines = [])], walk [(dir, ig)]]
+ - test/index.test.js
+
+ ## GRAPH EDGES
+ [index.js] -> [imports] -> [chokidar]
+ [index.js] -> [imports] -> [fs]
+ [index.js] -> [imports] -> [ignore]
+ [index.js] -> [imports] -> [path]
+ [index.js] -> [imports] -> [url]
+ [test/index.test.js] -> [imports] -> [../index.js]
+ [test/index.test.js] -> [imports] -> [./local-file]
+ [test/index.test.js] -> [imports] -> [assert]
+ [test/index.test.js] -> [imports] -> [fs]
+ [test/index.test.js] -> [imports] -> [header.h]
+ [test/index.test.js] -> [imports] -> [node]
+ [test/index.test.js] -> [imports] -> [other-module]
+ [test/index.test.js] -> [imports] -> [path]
+ [test/index.test.js] -> [imports] -> [test]
+ [test/index.test.js] -> [imports] -> [url]
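
Note on the generated edges: entries such as `node`, `test`, `./local-file`, and `header.h` under `test/index.test.js` look like artifacts of the import pattern rather than real dependencies. The sketch below is a minimal reproduction that copies the first `EDGE_REGEXES` pattern verbatim from the `package/index.js` hunk and shows what it captures for a few import styles. `captureDeps` is a hypothetical harness written for illustration and is not the package's `extractEdges` (it skips comment stripping and the noise filter).

```javascript
// First pattern from EDGE_REGEXES, copied verbatim from the index.js diff.
const IMPORT_REGEX = /\b(?:import|from|include|require|using)\s*(?:[\(\s])\s*['"]?([@\w\.\/\-]+)['"]?/g;

// Hypothetical harness: collect every capture the pattern yields for a snippet.
function captureDeps(source) {
  const deps = [];
  IMPORT_REGEX.lastIndex = 0; // global regexes keep state between calls
  let match;
  while ((match = IMPORT_REGEX.exec(source)) !== null) {
    deps.push(match[1]);
  }
  return deps;
}

// The default-import binding is captured, and "node:fs" is cut at the colon
// because ':' is not in the capture's character class.
console.log(captureDeps("import fs from 'node:fs';")); // [ 'fs', 'node' ]

// A named import has no identifier right after "import", so only the
// module specifier after "from" is captured.
console.log(captureDeps("import { fileURLToPath } from 'url';")); // [ 'url' ]

// require() calls capture the module specifier as intended.
console.log(captureDeps("const other = require('other-module');")); // [ 'other-module' ]
```

This would be consistent with the `node` and `test` edges coming from `node:`-prefixed specifiers and default-import bindings in the test file. The `./local-file`, `other-module`, and `header.h` edges match the fixture strings in the `extractEdges` test, which is plausible because `extractEdges` strips comments but not string or template-literal contents before matching.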
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "code-graph-llm",
- "version": "1.3.0",
+ "version": "1.4.3",
  "description": "Compact, language-agnostic codebase mapper for LLM token efficiency.",
  "main": "index.js",
  "bin": {
package/test/index.test.js CHANGED
@@ -4,7 +4,8 @@ import fs from 'node:fs';
  import path from 'node:path';
  import { fileURLToPath } from 'url';
  import {
- extractSymbols,
+ extractSymbolsAndInheritance,
+ extractEdges,
  getIgnores,
  SUPPORTED_EXTENSIONS,
  generate
@@ -17,7 +18,7 @@ test('extractSymbols - JS/TS Docstrings', () => {
  */
  function testFunc(a, b) {}
  `;
- const symbols = extractSymbols(code);
+ const { symbols } = extractSymbolsAndInheritance(code);
  assert.ok(symbols.some(s => s.includes('testFunc') && s.includes('This is a test function')));
  });
 
@@ -27,8 +28,8 @@ test('extractSymbols - Signature Fallback', () => {
  return true;
  }
  `;
- const symbols = extractSymbols(code);
- // Matches "noDocFunc [(arg1: string, arg2: number)]"
+ const { symbols } = extractSymbolsAndInheritance(code);
+ // Matches "noDocFunc [ (arg1: string, arg2: number)]"
  assert.ok(symbols.some(s => s.includes('noDocFunc') && s.includes('arg1: string, arg2: number')));
  });
 
@@ -37,11 +38,49 @@ test('extractSymbols - Flutter/Dart Noise Reduction', () => {
  const SizedBox(height: 10);
  void realFunction() {}
  `;
- const symbols = extractSymbols(code);
+ const { symbols } = extractSymbolsAndInheritance(code);
  assert.ok(symbols.some(s => s.includes('realFunction')));
  assert.ok(!symbols.some(s => s.includes('SizedBox')));
  });
 
+ test('extractInheritance - Class relationships', () => {
+ const code = `
+ class AdminUser extends BaseUser {}
+ interface IRepository implements IBase {}
+ class MyWidget : StatelessWidget {}
+ `;
+ const { inheritance } = extractSymbolsAndInheritance(code);
+ assert.ok(inheritance.some(i => i.child === 'AdminUser' && i.parent === 'BaseUser'));
+ assert.ok(inheritance.some(i => i.child === 'IRepository' && i.parent === 'IBase'));
+ assert.ok(inheritance.some(i => i.child === 'MyWidget' && i.parent === 'StatelessWidget'));
+ });
+
+ test('extractEdges - Imports and includes', () => {
+ const code = `
+ import { something } from './local-file';
+ const other = require('other-module');
+ #include "header.h"
+ `;
+ const edges = extractEdges(code);
+ assert.ok(edges.includes('./local-file'));
+ assert.ok(edges.includes('other-module'));
+ assert.ok(edges.includes('header.h'));
+ });
+
+ test('extractSymbols - Java/Spring Annotations', () => {
+ const code = `
+ @RestController
+ public class MyController {
+ @GetMapping("/test")
+ public String hello() { return "hi"; }
+ }
+ `;
+ const { symbols } = extractSymbolsAndInheritance(code);
+ assert.ok(symbols.some(s => s.includes('@RestController MyController')));
+ // Note: Strings are stripped during extraction to avoid false positives
+ assert.ok(symbols.some(s => s.includes('@GetMapping() hello')));
+ });
+
  test('getIgnores - Default Patterns', () => {
  const ig = getIgnores(process.cwd());
  assert.strictEqual(ig.ignores('.git/'), true);