@kentwynn/kgraph 0.2.22 → 0.2.23
- package/README.md +35 -11
- package/dist/cli/help.js +3 -2
- package/dist/cli/init-summary.d.ts +1 -1
- package/dist/cli/init-summary.js +28 -8
- package/dist/config/config.js +41 -0
- package/dist/scanner/broad-symbol-extractor.d.ts +3 -0
- package/dist/scanner/broad-symbol-extractor.js +292 -0
- package/dist/scanner/extraction-context.d.ts +23 -0
- package/dist/scanner/extraction-context.js +77 -0
- package/dist/scanner/file-classifier.js +15 -2
- package/dist/scanner/php-symbol-extractor.d.ts +2 -0
- package/dist/scanner/php-symbol-extractor.js +79 -0
- package/dist/scanner/repo-scanner.js +35 -1
- package/dist/scanner/ruby-symbol-extractor.d.ts +2 -0
- package/dist/scanner/ruby-symbol-extractor.js +75 -0
- package/dist/scanner/shell-symbol-extractor.d.ts +2 -0
- package/dist/scanner/shell-symbol-extractor.js +78 -0
- package/dist/scanner/sql-symbol-extractor.d.ts +2 -0
- package/dist/scanner/sql-symbol-extractor.js +166 -0
- package/dist/scanner/tree-sitter-parser.d.ts +1 -2
- package/dist/scanner/tree-sitter-parser.js +14 -0
- package/package.json +12 -1
package/README.md
CHANGED
|
@@ -41,22 +41,33 @@ The CLI presents this as **Atom Core**: lightweight local atoms plus determinist
|
|
|
41
41
|
|
|
42
42
|
## The Workflow
|
|
43
43
|
|
|
44
|
-
Use KGraph
|
|
44
|
+
Use KGraph with one setup command and one normal daily command:
|
|
45
45
|
|
|
46
46
|
```bash
|
|
47
|
-
# Required once per repository
|
|
48
|
-
kgraph
|
|
47
|
+
# Required once per repository.
|
|
48
|
+
# Creates .kgraph/, runs the first scan, and detects likely AI tools.
|
|
49
|
+
kgraph init
|
|
49
50
|
|
|
50
|
-
# Normal daily command
|
|
51
|
+
# Normal daily command.
|
|
52
|
+
# Refreshes maps, processes pending capture notes, and returns focused context.
|
|
51
53
|
kgraph "auth token refresh"
|
|
52
54
|
```
|
|
53
55
|
|
|
54
|
-
|
|
56
|
+
AI tool integrations are optional. During `kgraph init`, KGraph detects likely local tools such as Codex, Copilot, Claude Code, and Gemini, then prompts or prints a suggested `kgraph integrate add ...` command. You can accept the recommendation, choose custom integrations, skip the step, or configure them later.
|
|
57
|
+
|
|
58
|
+
If you already know exactly which integrations you want, you can pass them during init:
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
kgraph init --integrations codex,copilot,cursor,claude-code,gemini,windsurf,cline
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
The daily command runs the full practical workflow:
|
|
55
65
|
|
|
56
66
|
1. Refreshes the repository scan.
|
|
57
67
|
2. Updates file, symbol, import, and relationship maps.
|
|
58
68
|
3. Processes any Markdown capture notes waiting in `.kgraph/inbox/` into knowledge atoms.
|
|
59
|
-
4.
|
|
69
|
+
4. Reports memory health and actionable next steps.
|
|
70
|
+
5. Returns compact context for the topic you asked about.
|
|
60
71
|
|
|
61
72
|
You can also run just:
|
|
62
73
|
|
|
@@ -134,10 +145,11 @@ KGraph's core functionality is free and local-first. It does not require account
|
|
|
134
145
|
From the root of a repository:
|
|
135
146
|
|
|
136
147
|
```bash
|
|
137
|
-
# 1. Create the local KGraph workspace
|
|
148
|
+
# 1. Create the local KGraph workspace and run the first scan
|
|
138
149
|
kgraph init
|
|
139
150
|
|
|
140
|
-
# 2. Optional:
|
|
151
|
+
# 2. Optional: accept detected AI tool recommendations during init,
|
|
152
|
+
# or add integrations later when you want KGraph-managed instructions
|
|
141
153
|
kgraph integrate add codex copilot cursor claude-code gemini windsurf cline
|
|
142
154
|
|
|
143
155
|
# 3. Run the normal workflow for a topic
|
|
@@ -147,7 +159,7 @@ kgraph "auth token refresh"
|
|
|
147
159
|
kgraph doctor
|
|
148
160
|
```
|
|
149
161
|
|
|
150
|
-
`kgraph init`
|
|
162
|
+
`kgraph init` scans once, prints repo language coverage, detects likely local AI tools, and recommends matching integrations. Integrations are still optional: they only write local instruction files so tools know when to run KGraph. They do not start background agents or call AI providers.
|
|
151
163
|
|
|
152
164
|
After useful AI work, assistants save durable runtime-capture notes into `.kgraph/inbox/`. These notes are not project documentation; they are KGraph input files that the next `kgraph` run processes automatically. You can also process them directly with `kgraph update`.
|
|
153
165
|
|
|
@@ -183,13 +195,13 @@ This is optional. Claude Code can use generated hook scripts for automatic captu
|
|
|
183
195
|
kgraph init
|
|
184
196
|
```
|
|
185
197
|
|
|
186
|
-
Required once per repo. Creates `.kgraph/`, writes the local config, runs the first scan, and prints suggested next actions based on
|
|
198
|
+
Required once per repo. Creates `.kgraph/`, writes the local config, initializes the knowledge store, runs the first scan, detects likely local AI tools, and prints suggested next actions based on repo languages and detected tools.
|
|
187
199
|
|
|
188
200
|
```bash
|
|
189
201
|
kgraph init --integrations codex,copilot,cursor,claude-code,gemini,windsurf,cline
|
|
190
202
|
```
|
|
191
203
|
|
|
192
|
-
Initializes KGraph and writes local instruction files for
|
|
204
|
+
Initializes KGraph and immediately writes local instruction files for the named AI tools. This is optional; plain `kgraph init` can detect likely tools and recommend or prompt for integrations instead.
|
|
193
205
|
|
|
194
206
|
```bash
|
|
195
207
|
kgraph "some topic"
|
|
@@ -356,6 +368,10 @@ Show processed capture history. Add a query to find historical work by title, su
|
|
|
356
368
|
|
|
357
369
|
KGraph integrations are local files. They do not start background agents, call AI providers, or send data anywhere.
|
|
358
370
|
|
|
371
|
+
You do not need integrations to use KGraph manually. They are useful when you want Codex, Copilot, Cursor, Claude Code, Gemini, Windsurf, or Cline to see repo-local KGraph workflow instructions automatically.
|
|
372
|
+
|
|
373
|
+
`kgraph init` detects likely local tools and recommends integrations when possible. You can also manage them explicitly:
|
|
374
|
+
|
|
359
375
|
```bash
|
|
360
376
|
kgraph integrate add codex copilot cursor claude-code gemini windsurf cline
|
|
361
377
|
kgraph integrate add copilot --mode smart
|
|
@@ -424,9 +440,17 @@ KGraph deeply scans:
|
|
|
424
440
|
- Java and Kotlin
|
|
425
441
|
- C and C++
|
|
426
442
|
- C#
|
|
443
|
+
- PHP
|
|
444
|
+
- Ruby
|
|
445
|
+
- Shell
|
|
446
|
+
- SQL
|
|
427
447
|
|
|
428
448
|
Other languages keep practical file, import, and symbol depth without full call graph analysis. Common file types still appear in the file map with generic metadata, so context queries can still point to docs, config, SQL, CSS, HTML, YAML, and similar files.
|
|
429
449
|
|
|
450
|
+
KGraph also extracts basic symbols/imports for Swift, Terraform/HCL,
|
|
451
|
+
GraphQL, Protocol Buffers, Lua, Dart, Elixir, Scala, and R. Structured file extraction covers
|
|
452
|
+
YAML, JSON, TOML, Dockerfile/Containerfile, Markdown/MDX, HTML, CSS, SCSS, Sass, Less, and XML.
|
|
453
|
+
|
|
430
454
|
## Visualization
|
|
431
455
|
|
|
432
456
|
```bash
|
package/dist/cli/help.js
CHANGED
|
@@ -20,7 +20,7 @@ export function renderRootHelp(useColor = supportsColor()) {
|
|
|
20
20
|
'',
|
|
21
21
|
sectionTitle(theme, `${accent} Start`),
|
|
22
22
|
command('init', 'Required once: create .kgraph/ workspace'),
|
|
23
|
-
command('init --integrations codex,gemini', '
|
|
23
|
+
command('init --integrations codex,gemini', 'Optional: initialize and connect named AI tools'),
|
|
24
24
|
'',
|
|
25
25
|
sectionTitle(theme, `${accent} Daily workflow`),
|
|
26
26
|
command('kgraph', 'Refresh scan maps and process pending capture notes'),
|
|
@@ -68,7 +68,8 @@ export function renderRootHelp(useColor = supportsColor()) {
|
|
|
68
68
|
command('--capture-symbol <name>', 'Attach symbol evidence to root capture'),
|
|
69
69
|
'',
|
|
70
70
|
sectionTitle(theme, `${accent} Examples`),
|
|
71
|
-
' kgraph init
|
|
71
|
+
' kgraph init',
|
|
72
|
+
' kgraph integrate add codex copilot cursor claude-code gemini windsurf cline',
|
|
72
73
|
' kgraph "blog admin token usage"',
|
|
73
74
|
' kgraph "blog admin token usage" --final',
|
|
74
75
|
' kgraph "blog admin token usage" --capture "Author filter now uses display names" --capture-file www/app/blog/page.tsx',
|
package/dist/cli/init-summary.d.ts
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
import type { IntegrationConfig } from '../types/config.js';
|
|
2
2
|
import type { RepositoryFile } from '../types/maps.js';
|
|
3
3
|
import { type InitIntegrationRecommendation } from './init-recommendations.js';
|
|
4
|
-
type CoverageLevel = 'deep' | 'basic' | 'generic';
|
|
4
|
+
type CoverageLevel = 'deep' | 'basic' | 'structured' | 'generic';
|
|
5
5
|
export interface InitLanguageSummary {
|
|
6
6
|
language: string;
|
|
7
7
|
label: string;
|
package/dist/cli/init-summary.js
CHANGED
|
@@ -12,15 +12,32 @@ const LANGUAGE_PRESENTATION = {
|
|
|
12
12
|
c: { label: 'C', coverage: 'deep' },
|
|
13
13
|
cpp: { label: 'C++', coverage: 'deep' },
|
|
14
14
|
csharp: { label: 'C#', coverage: 'deep' },
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
15
|
+
php: { label: 'PHP', coverage: 'deep' },
|
|
16
|
+
swift: { label: 'Swift', coverage: 'basic' },
|
|
17
|
+
ruby: { label: 'Ruby', coverage: 'deep' },
|
|
18
|
+
shell: { label: 'Shell', coverage: 'deep' },
|
|
19
|
+
lua: { label: 'Lua', coverage: 'basic' },
|
|
20
|
+
dart: { label: 'Dart', coverage: 'basic' },
|
|
21
|
+
elixir: { label: 'Elixir', coverage: 'basic' },
|
|
22
|
+
scala: { label: 'Scala', coverage: 'basic' },
|
|
23
|
+
r: { label: 'R', coverage: 'basic' },
|
|
24
|
+
sql: { label: 'SQL', coverage: 'deep' },
|
|
25
|
+
terraform: { label: 'Terraform/HCL', coverage: 'basic' },
|
|
26
|
+
graphql: { label: 'GraphQL', coverage: 'basic' },
|
|
27
|
+
protobuf: { label: 'Protocol Buffers', coverage: 'basic' },
|
|
28
|
+
yaml: { label: 'YAML', coverage: 'structured' },
|
|
29
|
+
json: { label: 'JSON', coverage: 'structured' },
|
|
30
|
+
toml: { label: 'TOML', coverage: 'structured' },
|
|
31
|
+
dockerfile: { label: 'Dockerfile', coverage: 'structured' },
|
|
32
|
+
markdown: { label: 'Markdown', coverage: 'structured' },
|
|
33
|
+
html: { label: 'HTML', coverage: 'structured' },
|
|
34
|
+
css: { label: 'CSS', coverage: 'structured' },
|
|
35
|
+
scss: { label: 'SCSS', coverage: 'structured' },
|
|
36
|
+
sass: { label: 'Sass', coverage: 'structured' },
|
|
37
|
+
less: { label: 'Less', coverage: 'structured' },
|
|
38
|
+
xml: { label: 'XML', coverage: 'structured' },
|
|
22
39
|
};
|
|
23
|
-
const EXCLUDED_LANGUAGES = new Set(['unknown', '
|
|
40
|
+
const EXCLUDED_LANGUAGES = new Set(['unknown', 'restructuredtext']);
|
|
24
41
|
export function summarizeInitLanguages(files) {
|
|
25
42
|
const byLabel = new Map();
|
|
26
43
|
for (const file of files) {
|
|
@@ -103,6 +120,7 @@ function moreDetailedCoverage(left, right) {
|
|
|
103
120
|
const rank = {
|
|
104
121
|
deep: 3,
|
|
105
122
|
basic: 2,
|
|
123
|
+
structured: 2,
|
|
106
124
|
generic: 1,
|
|
107
125
|
};
|
|
108
126
|
return rank[left] >= rank[right] ? left : right;
|
|
@@ -113,6 +131,8 @@ function coverageDescription(coverage) {
|
|
|
113
131
|
return 'deep built-in extraction';
|
|
114
132
|
case 'basic':
|
|
115
133
|
return 'basic built-in extraction';
|
|
134
|
+
case 'structured':
|
|
135
|
+
return 'structured file extraction';
|
|
116
136
|
default:
|
|
117
137
|
return 'generic file coverage';
|
|
118
138
|
}
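A note on the rank table in the `moreDetailedCoverage` hunk above: `basic` and `structured` share rank 2, so with the `>=` comparison shown, the left argument wins a tie. The sketch below restates only the lines visible in the diff; the constant name `COVERAGE_RANK` is a stand-in introduced here, not a name from the package.

```javascript
// Rank values copied from the hunk above; COVERAGE_RANK is a hypothetical name.
const COVERAGE_RANK = { deep: 3, basic: 2, structured: 2, generic: 1 };

// Same comparison as in the diff: on equal rank the left argument is returned.
function moreDetailedCoverage(left, right) {
  return COVERAGE_RANK[left] >= COVERAGE_RANK[right] ? left : right;
}

console.log(moreDetailedCoverage('generic', 'deep'));      // deep
console.log(moreDetailedCoverage('basic', 'structured'));  // basic
console.log(moreDetailedCoverage('structured', 'basic'));  // structured
```

If a label ever mixes `basic` and `structured` files, the reported coverage therefore depends on argument order rather than on a fixed preference.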
|
package/dist/config/config.js
CHANGED
|
@@ -67,6 +67,47 @@ export const DEFAULT_CONFIG = {
|
|
|
67
67
|
'.hxx',
|
|
68
68
|
// C#
|
|
69
69
|
'.cs',
|
|
70
|
+
// PHP / Swift / Ruby
|
|
71
|
+
'.php',
|
|
72
|
+
'.swift',
|
|
73
|
+
'.rb',
|
|
74
|
+
'.rake',
|
|
75
|
+
// Shell
|
|
76
|
+
'.sh',
|
|
77
|
+
'.bash',
|
|
78
|
+
'.zsh',
|
|
79
|
+
'.fish',
|
|
80
|
+
// Web / styles
|
|
81
|
+
'.html',
|
|
82
|
+
'.htm',
|
|
83
|
+
'.css',
|
|
84
|
+
'.scss',
|
|
85
|
+
'.sass',
|
|
86
|
+
'.less',
|
|
87
|
+
// Data / config / schema
|
|
88
|
+
'.json',
|
|
89
|
+
'.jsonc',
|
|
90
|
+
'.yaml',
|
|
91
|
+
'.yml',
|
|
92
|
+
'.toml',
|
|
93
|
+
'.dockerfile',
|
|
94
|
+
'.xml',
|
|
95
|
+
'.graphql',
|
|
96
|
+
'.gql',
|
|
97
|
+
'.sql',
|
|
98
|
+
'.tf',
|
|
99
|
+
'.proto',
|
|
100
|
+
// Docs
|
|
101
|
+
'.md',
|
|
102
|
+
'.mdx',
|
|
103
|
+
// Additional app languages
|
|
104
|
+
'.lua',
|
|
105
|
+
'.r',
|
|
106
|
+
'.R',
|
|
107
|
+
'.dart',
|
|
108
|
+
'.ex',
|
|
109
|
+
'.exs',
|
|
110
|
+
'.scala',
|
|
70
111
|
],
|
|
71
112
|
},
|
|
72
113
|
maxContextItems: 8,
|
package/dist/scanner/broad-symbol-extractor.d.ts
|
@@ -0,0 +1,3 @@
|
|
|
1
|
+
import type { SymbolExtractionResult } from './ts-symbol-extractor.js';
|
|
2
|
+
export declare function extractBroadSymbols(sourceText: string, filePath: string, language: string): Promise<SymbolExtractionResult>;
|
|
3
|
+
export declare function supportsBroadExtraction(language: string): boolean;
|
package/dist/scanner/broad-symbol-extractor.js
|
@@ -0,0 +1,292 @@
|
|
|
1
|
+
import { parseSource } from './tree-sitter-parser.js';
|
|
2
|
+
const TREE_SITTER_GRAMMAR_BY_LANGUAGE = {
|
|
3
|
+
yaml: 'yaml',
|
|
4
|
+
json: 'json',
|
|
5
|
+
html: 'html',
|
|
6
|
+
css: 'css',
|
|
7
|
+
lua: 'lua',
|
|
8
|
+
dart: 'dart',
|
|
9
|
+
elixir: 'elixir',
|
|
10
|
+
scala: 'scala',
|
|
11
|
+
};
|
|
12
|
+
export async function extractBroadSymbols(sourceText, filePath, language) {
|
|
13
|
+
const result = createExtractionResult();
|
|
14
|
+
if (!sourceText.trim()) {
|
|
15
|
+
return result;
|
|
16
|
+
}
|
|
17
|
+
const broadLanguage = language;
|
|
18
|
+
const grammar = TREE_SITTER_GRAMMAR_BY_LANGUAGE[broadLanguage];
|
|
19
|
+
if (grammar) {
|
|
20
|
+
try {
|
|
21
|
+
const tree = await parseSource(sourceText, grammar);
|
|
22
|
+
tree.delete();
|
|
23
|
+
}
|
|
24
|
+
catch (error) {
|
|
25
|
+
result.warnings.push(`tree-sitter ${grammar} parse failed: ${error instanceof Error ? error.message : String(error)}`);
|
|
26
|
+
}
|
|
27
|
+
}
|
|
28
|
+
const lines = sourceText.split(/\r?\n/);
|
|
29
|
+
switch (broadLanguage) {
|
|
30
|
+
case 'swift':
|
|
31
|
+
collectSwift(lines, filePath, result);
|
|
32
|
+
break;
|
|
33
|
+
case 'terraform':
|
|
34
|
+
collectTerraform(lines, filePath, result);
|
|
35
|
+
break;
|
|
36
|
+
case 'graphql':
|
|
37
|
+
collectGraphql(lines, filePath, result);
|
|
38
|
+
break;
|
|
39
|
+
case 'protobuf':
|
|
40
|
+
collectProtobuf(lines, filePath, result);
|
|
41
|
+
break;
|
|
42
|
+
case 'lua':
|
|
43
|
+
collectLua(lines, filePath, result);
|
|
44
|
+
break;
|
|
45
|
+
case 'dart':
|
|
46
|
+
collectDart(lines, filePath, result);
|
|
47
|
+
break;
|
|
48
|
+
case 'elixir':
|
|
49
|
+
collectElixir(lines, filePath, result);
|
|
50
|
+
break;
|
|
51
|
+
case 'scala':
|
|
52
|
+
collectScala(lines, filePath, result);
|
|
53
|
+
break;
|
|
54
|
+
case 'r':
|
|
55
|
+
collectR(lines, filePath, result);
|
|
56
|
+
break;
|
|
57
|
+
case 'yaml':
|
|
58
|
+
case 'json':
|
|
59
|
+
case 'toml':
|
|
60
|
+
case 'dockerfile':
|
|
61
|
+
collectConfig(lines, filePath, result, broadLanguage);
|
|
62
|
+
break;
|
|
63
|
+
case 'markdown':
|
|
64
|
+
collectMarkdown(lines, filePath, result);
|
|
65
|
+
break;
|
|
66
|
+
case 'html':
|
|
67
|
+
case 'xml':
|
|
68
|
+
collectMarkup(lines, filePath, result);
|
|
69
|
+
break;
|
|
70
|
+
case 'css':
|
|
71
|
+
case 'scss':
|
|
72
|
+
case 'sass':
|
|
73
|
+
case 'less':
|
|
74
|
+
collectStylesheet(lines, filePath, result);
|
|
75
|
+
break;
|
|
76
|
+
}
|
|
77
|
+
return result;
|
|
78
|
+
}
|
|
79
|
+
export function supportsBroadExtraction(language) {
|
|
80
|
+
return [
|
|
81
|
+
'swift',
|
|
82
|
+
'terraform',
|
|
83
|
+
'graphql',
|
|
84
|
+
'protobuf',
|
|
85
|
+
'lua',
|
|
86
|
+
'dart',
|
|
87
|
+
'elixir',
|
|
88
|
+
'scala',
|
|
89
|
+
'r',
|
|
90
|
+
'yaml',
|
|
91
|
+
'json',
|
|
92
|
+
'toml',
|
|
93
|
+
'dockerfile',
|
|
94
|
+
'markdown',
|
|
95
|
+
'html',
|
|
96
|
+
'css',
|
|
97
|
+
'scss',
|
|
98
|
+
'sass',
|
|
99
|
+
'less',
|
|
100
|
+
'xml',
|
|
101
|
+
].includes(language);
|
|
102
|
+
}
|
|
103
|
+
function createExtractionResult() {
|
|
104
|
+
return { symbols: [], dependencies: [], relationships: [], warnings: [] };
|
|
105
|
+
}
|
|
106
|
+
function addSymbol(result, filePath, name, kind, line, exported = false, parentName) {
|
|
107
|
+
const id = [filePath, kind, parentName, name, line].filter(Boolean).join('#');
|
|
108
|
+
const symbol = {
|
|
109
|
+
id,
|
|
110
|
+
name,
|
|
111
|
+
kind,
|
|
112
|
+
filePath,
|
|
113
|
+
startLine: line,
|
|
114
|
+
endLine: line,
|
|
115
|
+
exported,
|
|
116
|
+
parentName,
|
|
117
|
+
};
|
|
118
|
+
result.symbols.push(symbol);
|
|
119
|
+
result.relationships.push({
|
|
120
|
+
sourceType: 'file',
|
|
121
|
+
sourceId: filePath,
|
|
122
|
+
targetType: 'symbol',
|
|
123
|
+
targetId: id,
|
|
124
|
+
relationshipType: 'contains',
|
|
125
|
+
confidence: 'high',
|
|
126
|
+
});
|
|
127
|
+
return symbol;
|
|
128
|
+
}
|
|
129
|
+
function addDependency(result, filePath, specifier, kind = specifier.startsWith('.') ? 'local' : 'package') {
|
|
130
|
+
result.dependencies.push({ fromFile: filePath, specifier, kind });
|
|
131
|
+
}
|
|
132
|
+
function collectSwift(lines, filePath, result) {
|
|
133
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
134
|
+
const importMatch = line.match(/^\s*import\s+([A-Za-z_][\w.]*)/);
|
|
135
|
+
if (importMatch?.[1])
|
|
136
|
+
addDependency(result, filePath, importMatch[1]);
|
|
137
|
+
const typeMatch = line.match(/^\s*(?:public|private|internal|open|final|\s)*(class|struct|enum|protocol)\s+([A-Za-z_][\w]*)/);
|
|
138
|
+
if (typeMatch?.[2]) {
|
|
139
|
+
addSymbol(result, filePath, typeMatch[2], typeMatch[1] === 'protocol' ? 'interface' : 'class', lineNumber, true);
|
|
140
|
+
}
|
|
141
|
+
const functionMatch = line.match(/^\s*(?:public|private|internal|open|static|\s)*func\s+([A-Za-z_][\w]*)/);
|
|
142
|
+
if (functionMatch?.[1]) {
|
|
143
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
144
|
+
}
|
|
145
|
+
});
|
|
146
|
+
}
|
|
147
|
+
function collectTerraform(lines, filePath, result) {
|
|
148
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
149
|
+
const blockMatch = line.match(/^\s*(resource|data|module|variable|output|provider)\s+"([^"]+)"(?:\s+"([^"]+)")?/);
|
|
150
|
+
if (blockMatch?.[1] && blockMatch[2]) {
|
|
151
|
+
addSymbol(result, filePath, [blockMatch[1], blockMatch[2], blockMatch[3]].filter(Boolean).join('.'), 'type', lineNumber, true);
|
|
152
|
+
}
|
|
153
|
+
});
|
|
154
|
+
}
|
|
155
|
+
function collectGraphql(lines, filePath, result) {
|
|
156
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
157
|
+
const typeMatch = line.match(/^\s*(type|interface|enum|input|union|scalar)\s+([A-Za-z_][\w]*)/);
|
|
158
|
+
if (typeMatch?.[2]) {
|
|
159
|
+
addSymbol(result, filePath, typeMatch[2], typeMatch[1] === 'interface' ? 'interface' : 'type', lineNumber, true);
|
|
160
|
+
}
|
|
161
|
+
});
|
|
162
|
+
}
|
|
163
|
+
function collectProtobuf(lines, filePath, result) {
|
|
164
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
165
|
+
const importMatch = line.match(/^\s*import\s+"([^"]+)"/);
|
|
166
|
+
if (importMatch?.[1])
|
|
167
|
+
addDependency(result, filePath, importMatch[1], 'local');
|
|
168
|
+
const typeMatch = line.match(/^\s*(message|service|enum)\s+([A-Za-z_][\w]*)/);
|
|
169
|
+
if (typeMatch?.[2]) {
|
|
170
|
+
addSymbol(result, filePath, typeMatch[2], 'type', lineNumber, true);
|
|
171
|
+
}
|
|
172
|
+
});
|
|
173
|
+
}
|
|
174
|
+
function collectLua(lines, filePath, result) {
|
|
175
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
176
|
+
const requireMatch = line.match(/require\s*\(?\s*['"]([^'"]+)['"]/);
|
|
177
|
+
if (requireMatch?.[1])
|
|
178
|
+
addDependency(result, filePath, requireMatch[1]);
|
|
179
|
+
const functionMatch = line.match(/^\s*(?:local\s+)?function\s+([A-Za-z_][\w.:]*)/);
|
|
180
|
+
if (functionMatch?.[1]) {
|
|
181
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
182
|
+
}
|
|
183
|
+
});
|
|
184
|
+
}
|
|
185
|
+
function collectDart(lines, filePath, result) {
|
|
186
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
187
|
+
const importMatch = line.match(/^\s*import\s+['"]([^'"]+)['"]/);
|
|
188
|
+
if (importMatch?.[1])
|
|
189
|
+
addDependency(result, filePath, importMatch[1]);
|
|
190
|
+
const classMatch = line.match(/^\s*(?:abstract\s+)?class\s+([A-Za-z_][\w]*)/);
|
|
191
|
+
if (classMatch?.[1])
|
|
192
|
+
addSymbol(result, filePath, classMatch[1], 'class', lineNumber, true);
|
|
193
|
+
const functionMatch = line.match(/^\s*(?:[A-Za-z_<>,?]+\s+)+([A-Za-z_][\w]*)\s*\([^;]*\)\s*(?:async\s*)?\{/);
|
|
194
|
+
if (functionMatch?.[1] && !['if', 'for', 'while', 'switch'].includes(functionMatch[1])) {
|
|
195
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
196
|
+
}
|
|
197
|
+
});
|
|
198
|
+
}
|
|
199
|
+
function collectElixir(lines, filePath, result) {
|
|
200
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
201
|
+
const moduleMatch = line.match(/^\s*defmodule\s+([A-Z][\w.]+)/);
|
|
202
|
+
if (moduleMatch?.[1])
|
|
203
|
+
addSymbol(result, filePath, moduleMatch[1], 'class', lineNumber, true);
|
|
204
|
+
const functionMatch = line.match(/^\s*defp?\s+([a-z_][\w!?]*)/);
|
|
205
|
+
if (functionMatch?.[1])
|
|
206
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
207
|
+
});
|
|
208
|
+
}
|
|
209
|
+
function collectScala(lines, filePath, result) {
|
|
210
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
211
|
+
const importMatch = line.match(/^\s*import\s+(.+)/);
|
|
212
|
+
if (importMatch?.[1])
|
|
213
|
+
addDependency(result, filePath, importMatch[1].trim());
|
|
214
|
+
const typeMatch = line.match(/^\s*(?:case\s+)?(class|object|trait|enum)\s+([A-Za-z_][\w]*)/);
|
|
215
|
+
if (typeMatch?.[2]) {
|
|
216
|
+
addSymbol(result, filePath, typeMatch[2], typeMatch[1] === 'trait' ? 'interface' : 'class', lineNumber, true);
|
|
217
|
+
}
|
|
218
|
+
const functionMatch = line.match(/^\s*def\s+([A-Za-z_][\w]*)/);
|
|
219
|
+
if (functionMatch?.[1])
|
|
220
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
221
|
+
});
|
|
222
|
+
}
|
|
223
|
+
function collectR(lines, filePath, result) {
|
|
224
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
225
|
+
const libraryMatch = line.match(/^\s*(?:library|require)\s*\(\s*([A-Za-z.][\w.]*)/);
|
|
226
|
+
if (libraryMatch?.[1])
|
|
227
|
+
addDependency(result, filePath, libraryMatch[1]);
|
|
228
|
+
const functionMatch = line.match(/^\s*([A-Za-z.][\w.]*)\s*(?:<-|=)\s*function\s*\(/);
|
|
229
|
+
if (functionMatch?.[1])
|
|
230
|
+
addSymbol(result, filePath, functionMatch[1], 'function', lineNumber, true);
|
|
231
|
+
});
|
|
232
|
+
}
|
|
233
|
+
function collectConfig(lines, filePath, result, language) {
|
|
234
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
235
|
+
if (language === 'dockerfile') {
|
|
236
|
+
const stageMatch = line.match(/^\s*FROM\s+\S+(?:\s+AS\s+([A-Za-z_][\w-]*))?/i);
|
|
237
|
+
if (stageMatch?.[1])
|
|
238
|
+
addSymbol(result, filePath, stageMatch[1], 'type', lineNumber, true);
|
|
239
|
+
return;
|
|
240
|
+
}
|
|
241
|
+
const keyMatch = language === 'json'
|
|
242
|
+
? line.match(/^\s*"([^"]+)"\s*:/)
|
|
243
|
+
: line.match(/^\s*([A-Za-z_][\w.-]*)\s*[:=]/);
|
|
244
|
+
if (keyMatch?.[1]) {
|
|
245
|
+
addSymbol(result, filePath, keyMatch[1], 'type', lineNumber);
|
|
246
|
+
}
|
|
247
|
+
});
|
|
248
|
+
}
|
|
249
|
+
function collectMarkdown(lines, filePath, result) {
|
|
250
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
251
|
+
const headingMatch = line.match(/^(#{1,6})\s+(.+)/);
|
|
252
|
+
if (headingMatch?.[2]) {
|
|
253
|
+
addSymbol(result, filePath, headingMatch[2].trim(), 'type', lineNumber);
|
|
254
|
+
}
|
|
255
|
+
});
|
|
256
|
+
}
|
|
257
|
+
function collectMarkup(lines, filePath, result) {
|
|
258
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
259
|
+
for (const idMatch of line.matchAll(/\bid=["']([^"']+)["']/g)) {
|
|
260
|
+
if (idMatch[1])
|
|
261
|
+
addSymbol(result, filePath, `#${idMatch[1]}`, 'type', lineNumber);
|
|
262
|
+
}
|
|
263
|
+
for (const classMatch of line.matchAll(/\bclass=["']([^"']+)["']/g)) {
|
|
264
|
+
for (const className of classMatch[1]?.split(/\s+/) ?? []) {
|
|
265
|
+
if (className)
|
|
266
|
+
addSymbol(result, filePath, `.${className}`, 'type', lineNumber);
|
|
267
|
+
}
|
|
268
|
+
}
|
|
269
|
+
});
|
|
270
|
+
}
|
|
271
|
+
function collectStylesheet(lines, filePath, result) {
|
|
272
|
+
forEachLine(lines, (line, lineNumber) => {
|
|
273
|
+
const importMatch = line.match(/^\s*@(import|use|forward)\s+["']([^"']+)["']/);
|
|
274
|
+
if (importMatch?.[2])
|
|
275
|
+
addDependency(result, filePath, importMatch[2]);
|
|
276
|
+
for (const variableMatch of line.matchAll(/(--[A-Za-z_][\w-]*|\$[A-Za-z_][\w-]*)\s*:/g)) {
|
|
277
|
+
if (variableMatch[1])
|
|
278
|
+
addSymbol(result, filePath, variableMatch[1], 'type', lineNumber);
|
|
279
|
+
}
|
|
280
|
+
const mixinMatch = line.match(/^\s*@(mixin|function|keyframes)\s+([A-Za-z_][\w-]*)/);
|
|
281
|
+
if (mixinMatch?.[2]) {
|
|
282
|
+
addSymbol(result, filePath, mixinMatch[2], mixinMatch[1] === 'function' ? 'function' : 'type', lineNumber);
|
|
283
|
+
}
|
|
284
|
+
for (const selectorMatch of line.matchAll(/([.#][A-Za-z_][\w-]*)/g)) {
|
|
285
|
+
if (selectorMatch[1])
|
|
286
|
+
addSymbol(result, filePath, selectorMatch[1], 'type', lineNumber);
|
|
287
|
+
}
|
|
288
|
+
});
|
|
289
|
+
}
|
|
290
|
+
function forEachLine(lines, callback) {
|
|
291
|
+
lines.forEach((line, index) => callback(line, index + 1));
|
|
292
|
+
}
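The collectors in this file all follow the same line-by-line regex pattern. As a rough illustration, the sketch below runs the Terraform block regex from `collectTerraform` over a small sample. The function name `terraformSymbolNames` and the sample text are made up for this example; only the regex and the `filter(Boolean).join('.')` naming come from the diff.

```javascript
// Regex copied from collectTerraform in the diff above.
const TERRAFORM_BLOCK =
  /^\s*(resource|data|module|variable|output|provider)\s+"([^"]+)"(?:\s+"([^"]+)")?/;

// Hypothetical helper: returns the symbol names and 1-based line numbers
// that the line-by-line scan would record for a Terraform source string.
function terraformSymbolNames(sourceText) {
  const found = [];
  sourceText.split(/\r?\n/).forEach((line, index) => {
    const match = line.match(TERRAFORM_BLOCK);
    if (match) {
      // Block type plus one or two quoted labels, joined with dots.
      const name = [match[1], match[2], match[3]].filter(Boolean).join('.');
      found.push({ name, line: index + 1 });
    }
  });
  return found;
}

const sample = [
  'resource "aws_s3_bucket" "logs" {',
  '}',
  'variable "region" {',
  '}',
  'locals {',
  '}',
].join('\n');

console.log(terraformSymbolNames(sample));
```

For this sample the scan yields `resource.aws_s3_bucket.logs` on line 1 and `variable.region` on line 3. Blocks without a quoted label, such as `locals`, do not match the regex and are skipped.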
|
package/dist/scanner/extraction-context.d.ts
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
import type { CodeSymbol, Dependency, Relationship } from '../types/maps.js';
|
|
2
|
+
import type { SymbolExtractionResult } from './ts-symbol-extractor.js';
|
|
3
|
+
export declare class ExtractionContext {
|
|
4
|
+
private readonly filePath;
|
|
5
|
+
readonly symbols: CodeSymbol[];
|
|
6
|
+
readonly dependencies: Dependency[];
|
|
7
|
+
readonly relationships: Relationship[];
|
|
8
|
+
readonly warnings: string[];
|
|
9
|
+
constructor(filePath: string);
|
|
10
|
+
addSymbol(options: {
|
|
11
|
+
name: string;
|
|
12
|
+
kind: CodeSymbol['kind'];
|
|
13
|
+
startLine: number;
|
|
14
|
+
endLine?: number;
|
|
15
|
+
exported?: boolean;
|
|
16
|
+
parentName?: string;
|
|
17
|
+
}): CodeSymbol;
|
|
18
|
+
addDependency(specifier: string, kind?: Dependency['kind'], confidence?: Relationship['confidence']): void;
|
|
19
|
+
addSymbolContains(parent: CodeSymbol, child: CodeSymbol): void;
|
|
20
|
+
addWarning(message: string): void;
|
|
21
|
+
toResult(): SymbolExtractionResult;
|
|
22
|
+
}
|
|
23
|
+
export declare function emptyExtractionResult(): SymbolExtractionResult;
|
package/dist/scanner/extraction-context.js
|
@@ -0,0 +1,77 @@
|
|
|
1
|
+
export class ExtractionContext {
|
|
2
|
+
filePath;
|
|
3
|
+
symbols = [];
|
|
4
|
+
dependencies = [];
|
|
5
|
+
relationships = [];
|
|
6
|
+
warnings = [];
|
|
7
|
+
constructor(filePath) {
|
|
8
|
+
this.filePath = filePath;
|
|
9
|
+
}
|
|
10
|
+
addSymbol(options) {
|
|
11
|
+
const id = [
|
|
12
|
+
this.filePath,
|
|
13
|
+
options.kind,
|
|
14
|
+
options.parentName,
|
|
15
|
+
options.name,
|
|
16
|
+
options.startLine,
|
|
17
|
+
options.endLine ?? options.startLine,
|
|
18
|
+
]
|
|
19
|
+
.filter(Boolean)
|
|
20
|
+
.join('#');
|
|
21
|
+
const symbol = {
|
|
22
|
+
id,
|
|
23
|
+
name: options.name,
|
|
24
|
+
kind: options.kind,
|
|
25
|
+
filePath: this.filePath,
|
|
26
|
+
startLine: options.startLine,
|
|
27
|
+
endLine: options.endLine ?? options.startLine,
|
|
28
|
+
exported: options.exported ?? false,
|
|
29
|
+
parentName: options.parentName,
|
|
30
|
+
};
|
|
31
|
+
this.symbols.push(symbol);
|
|
32
|
+
this.relationships.push({
|
|
33
|
+
sourceType: 'file',
|
|
34
|
+
sourceId: this.filePath,
|
|
35
|
+
targetType: 'symbol',
|
|
36
|
+
targetId: id,
|
|
37
|
+
relationshipType: 'contains',
|
|
38
|
+
confidence: 'high',
|
|
39
|
+
});
|
|
40
|
+
return symbol;
|
|
41
|
+
}
|
|
42
|
+
addDependency(specifier, kind = specifier.startsWith('.') ? 'local' : 'package', confidence = 'high') {
|
|
43
|
+
this.dependencies.push({ fromFile: this.filePath, specifier, kind });
|
|
44
|
+
this.relationships.push({
|
|
45
|
+
sourceType: 'file',
|
|
46
|
+
sourceId: this.filePath,
|
|
47
|
+
targetType: kind === 'local' ? 'file' : 'package',
|
|
48
|
+
targetId: specifier,
|
|
49
|
+
relationshipType: 'import',
|
|
50
|
+
confidence,
|
|
51
|
+
});
|
|
52
|
+
}
|
|
53
|
+
addSymbolContains(parent, child) {
|
|
54
|
+
this.relationships.push({
|
|
55
|
+
sourceType: 'symbol',
|
|
56
|
+
sourceId: parent.id,
|
|
57
|
+
targetType: 'symbol',
|
|
58
|
+
targetId: child.id,
|
|
59
|
+
relationshipType: 'symbol-contains',
|
|
60
|
+
confidence: 'high',
|
|
61
|
+
});
|
|
62
|
+
}
|
|
63
|
+
addWarning(message) {
|
|
64
|
+
this.warnings.push(message);
|
|
65
|
+
}
|
|
66
|
+
toResult() {
|
|
67
|
+
return {
|
|
68
|
+
symbols: this.symbols,
|
|
69
|
+
dependencies: this.dependencies,
|
|
70
|
+
relationships: this.relationships,
|
|
71
|
+
warnings: this.warnings,
|
|
72
|
+
};
|
|
73
|
+
}
|
|
74
|
+
}
|
|
75
|
+
export function emptyExtractionResult() {
|
|
76
|
+
return { symbols: [], dependencies: [], relationships: [], warnings: [] };
|
|
77
|
+
}
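To make the symbol id format in `ExtractionContext.addSymbol` easier to see, here is a minimal sketch that isolates the id construction. `buildSymbolId` is a hypothetical helper written for this example; the field order, the `filter(Boolean)` step, and the `#` separator are taken from the diff above.

```javascript
// Hypothetical helper that isolates the id logic from ExtractionContext.addSymbol.
function buildSymbolId(filePath, options) {
  return [
    filePath,
    options.kind,
    options.parentName,
    options.name,
    options.startLine,
    options.endLine ?? options.startLine, // endLine defaults to startLine
  ]
    .filter(Boolean) // drops an undefined parentName (and would drop a line number of 0)
    .join('#');
}

console.log(
  buildSymbolId('src/User.php', {
    kind: 'method',
    parentName: 'User',
    name: 'save',
    startLine: 10,
    endLine: 14,
  }),
); // src/User.php#method#User#save#10#14

console.log(
  buildSymbolId('src/User.php', { kind: 'function', name: 'helper', startLine: 3 }),
); // src/User.php#function#helper#3#3
```

This differs from `addSymbol` in `broad-symbol-extractor.js`, whose id includes the line only once because it records a single line per symbol.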
|
package/dist/scanner/file-classifier.js
CHANGED
|
@@ -53,6 +53,7 @@ const LANGUAGE_BY_EXTENSION = {
|
|
|
53
53
|
'.scss': 'scss',
|
|
54
54
|
'.sass': 'sass',
|
|
55
55
|
'.less': 'less',
|
|
56
|
+
'.dockerfile': 'dockerfile',
|
|
56
57
|
'.vue': 'vue',
|
|
57
58
|
'.svelte': 'svelte',
|
|
58
59
|
// Data / Config
|
|
@@ -84,6 +85,13 @@ const LANGUAGE_BY_EXTENSION = {
|
|
|
84
85
|
'.proto': 'protobuf',
|
|
85
86
|
'.sql': 'sql',
|
|
86
87
|
};
|
|
88
|
+
const LANGUAGE_BY_BASENAME = {
|
|
89
|
+
Dockerfile: 'dockerfile',
|
|
90
|
+
Containerfile: 'dockerfile',
|
|
91
|
+
Makefile: 'shell',
|
|
92
|
+
Rakefile: 'ruby',
|
|
93
|
+
Gemfile: 'ruby',
|
|
94
|
+
};
|
|
87
95
|
export function shouldExclude(repoPath, config) {
|
|
88
96
|
const normalizedPath = normalizeRepoPath(repoPath);
|
|
89
97
|
return config.exclude.some((pattern) => matchesExcludePattern(normalizedPath, pattern));
|
|
@@ -120,10 +128,15 @@ export async function readGitignorePatterns(rootPath) {
|
|
|
120
128
|
}
|
|
121
129
|
}
|
|
122
130
|
export function detectLanguage(filePath) {
|
|
123
|
-
|
|
131
|
+
const basename = path.basename(filePath);
|
|
132
|
+
return (LANGUAGE_BY_BASENAME[basename] ??
|
|
133
|
+
LANGUAGE_BY_EXTENSION[path.extname(filePath)] ??
|
|
134
|
+
'unknown');
|
|
124
135
|
}
|
|
125
136
|
export function isPreciseLanguage(filePath, config) {
|
|
126
|
-
|
|
137
|
+
const basename = path.basename(filePath);
|
|
138
|
+
return (config.languages.precise.includes(path.extname(filePath)) ||
|
|
139
|
+
Object.hasOwn(LANGUAGE_BY_BASENAME, basename));
|
|
127
140
|
}
|
|
128
141
|
function matchesExcludePattern(repoPath, pattern) {
|
|
129
142
|
const normalized = normalizeRepoPath(pattern).replace(/\/$/, '');
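The `detectLanguage` change in this hunk checks the file's basename before its extension, which is what lets extensionless files such as `Dockerfile` or `Gemfile` be classified. The sketch below reproduces that lookup order under a few assumptions: the two tables are small subsets of the ones in the diff, and `basenameOf` / `extnameOf` are dependency-free stand-ins for Node's `path.basename` and `path.extname` that handle POSIX-style paths only.

```javascript
// Subsets of the tables shown in the diff.
const LANGUAGE_BY_EXTENSION = {
  '.sql': 'sql',
  '.proto': 'protobuf',
  '.dockerfile': 'dockerfile',
};
const LANGUAGE_BY_BASENAME = {
  Dockerfile: 'dockerfile',
  Containerfile: 'dockerfile',
  Makefile: 'shell',
  Rakefile: 'ruby',
  Gemfile: 'ruby',
};

// Stand-in for path.basename (assumes '/' separators).
function basenameOf(filePath) {
  return filePath.slice(filePath.lastIndexOf('/') + 1);
}

// Stand-in for path.extname: a leading dot does not count as an extension.
function extnameOf(filePath) {
  const base = basenameOf(filePath);
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot) : '';
}

// Same lookup order as the diff: basename, then extension, then 'unknown'.
function detectLanguage(filePath) {
  return (
    LANGUAGE_BY_BASENAME[basenameOf(filePath)] ??
    LANGUAGE_BY_EXTENSION[extnameOf(filePath)] ??
    'unknown'
  );
}

console.log(detectLanguage('docker/Dockerfile')); // dockerfile
console.log(detectLanguage('Gemfile'));           // ruby
console.log(detectLanguage('db/schema.sql'));     // sql
console.log(detectLanguage('notes.txt'));         // unknown
```

The basename lookup is case-sensitive, so a file named `dockerfile` in lowercase falls through to the extension table and ends up as `unknown` in this sketch.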
|
package/dist/scanner/php-symbol-extractor.js
|
@@ -0,0 +1,79 @@
|
|
|
1
|
+
import { emptyExtractionResult, ExtractionContext, } from './extraction-context.js';
|
|
2
|
+
import { parseSource } from './tree-sitter-parser.js';
|
|
3
|
+
export async function extractPhpSymbols(sourceText, filePath) {
|
|
4
|
+
if (!sourceText.trim()) {
|
|
5
|
+
return emptyExtractionResult();
|
|
6
|
+
}
|
|
7
|
+
const tree = await parseSource(sourceText, 'php');
|
|
8
|
+
const context = new ExtractionContext(filePath);
|
|
9
|
+
function addNamedSymbol(node, kind, parentName) {
|
|
10
|
+
const nameNode = findNameNode(node);
|
|
11
|
+
if (!nameNode)
|
|
12
|
+
return undefined;
|
|
13
|
+
return context.addSymbol({
|
|
14
|
+
name: nameNode.text,
|
|
15
|
+
kind,
|
|
16
|
+
startLine: node.startPosition.row + 1,
|
|
17
|
+
endLine: node.endPosition.row + 1,
|
|
18
|
+
exported: true,
|
|
19
|
+
parentName,
|
|
20
|
+
});
|
|
21
|
+
}
|
|
22
|
+
function walk(node, parentClassName) {
|
|
23
|
+
switch (node.type) {
|
|
24
|
+
case 'namespace_definition':
|
|
25
|
+
case 'namespace_name': {
|
|
26
|
+
if (node.type === 'namespace_name' && !parentClassName) {
|
|
27
|
+
context.addSymbol({
|
|
28
|
+
name: node.text,
|
|
29
|
+
kind: 'type',
|
|
30
|
+
startLine: node.startPosition.row + 1,
|
|
31
|
+
endLine: node.endPosition.row + 1,
|
|
32
|
+
exported: true,
|
|
33
|
+
});
|
|
34
|
+
return;
|
|
35
|
+
}
|
|
36
|
+
break;
|
|
37
|
+
}
|
|
38
|
+
case 'namespace_use_declaration': {
|
|
39
|
+
for (const nameNode of node.descendantsOfType('qualified_name')) {
|
|
40
|
+
context.addDependency(nameNode.text, 'package');
|
|
41
|
+
}
|
|
42
|
+
return;
|
|
43
|
+
}
|
|
44
|
+
case 'class_declaration':
|
|
45
|
+
case 'interface_declaration':
|
|
46
|
+
case 'trait_declaration':
|
|
47
|
+
case 'enum_declaration': {
|
|
48
|
+
const classSymbol = addNamedSymbol(node, node.type === 'interface_declaration' ? 'interface' : 'class', parentClassName);
|
|
49
|
+
const className = classSymbol?.name ?? parentClassName;
|
|
50
|
+
for (const child of node.namedChildren) {
|
|
51
|
+
walk(child, className);
|
|
52
|
+
}
|
|
53
|
+
return;
|
|
54
|
+
}
|
|
55
|
+
case 'function_definition':
|
|
56
|
+
case 'method_declaration': {
|
|
57
|
+
const symbol = addNamedSymbol(node, parentClassName ? 'method' : 'function', parentClassName);
|
|
58
|
+
if (symbol && parentClassName) {
|
|
59
|
+
const parent = context.symbols.find((candidate) => candidate.name === parentClassName && candidate.kind === 'class');
|
|
60
|
+
if (parent)
|
|
61
|
+
context.addSymbolContains(parent, symbol);
|
|
62
|
+
}
|
|
63
|
+
return;
|
|
64
|
+
}
|
|
65
|
+
}
|
|
66
|
+
for (const child of node.namedChildren) {
|
|
67
|
+
walk(child, parentClassName);
|
|
68
|
+
}
|
|
69
|
+
}
|
|
70
|
+
walk(tree.rootNode);
|
|
71
|
+
tree.delete();
|
|
72
|
+
return context.toResult();
|
|
73
|
+
}
|
|
74
|
+
function findNameNode(node) {
|
|
75
|
+
return (node.childForFieldName('name') ??
|
|
76
|
+
node.namedChildren.find((child) => child.type === 'name') ??
|
|
77
|
+
node.namedChildren.find((child) => child.type === 'variable_name') ??
|
|
78
|
+
node.namedChildren.find((child) => child.type === 'identifier'));
|
|
79
|
+
}
|
|
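The PHP extractor above walks the syntax tree recursively and passes the enclosing class name down, so a declaration is recorded as a `method` when it has a parent and as a `function` otherwise. The sketch below illustrates only that parent-threading pattern. It uses plain objects with `type`, `name`, and `children` as a hypothetical stand-in for tree-sitter nodes, and `collectSymbols` is an illustrative name, not part of the package.

```javascript
// Hypothetical node shape: { type, name?, children? }. Real tree-sitter
// nodes expose a richer API (namedChildren, childForFieldName, ...).
function collectSymbols(root) {
  const symbols = [];

  function walk(node, parentName) {
    if (node.type === 'class') {
      symbols.push({ name: node.name, kind: 'class', parentName });
      // Children of a class see the class name as their parent.
      for (const child of node.children ?? []) walk(child, node.name);
      return;
    }
    if (node.type === 'method') {
      // Same node type, different kind depending on the enclosing scope.
      symbols.push({
        name: node.name,
        kind: parentName ? 'method' : 'function',
        parentName,
      });
      return;
    }
    // Any other node: keep descending with the current parent unchanged.
    for (const child of node.children ?? []) walk(child, parentName);
  }

  walk(root, undefined);
  return symbols;
}

const tree = {
  type: 'program',
  children: [
    { type: 'class', name: 'User', children: [{ type: 'method', name: 'save' }] },
    { type: 'method', name: 'helper' },
  ],
};
const symbols = collectSymbols(tree);
console.log(symbols.map((s) => `${s.kind}:${s.name}`));
```

Returning early from the class and method cases keeps each declaration from being visited a second time by the generic descent at the bottom of `walk`.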
@@ -3,22 +3,44 @@ import crypto from 'node:crypto';
|
|
|
3
3
|
import { readFile, stat } from 'node:fs/promises';
|
|
4
4
|
import path from 'node:path';
|
|
5
5
|
import { estimateTokens } from '../session/token-estimator.js';
|
|
6
|
+
import { extractBroadSymbols, supportsBroadExtraction, } from './broad-symbol-extractor.js';
|
|
6
7
|
import { extractCSymbols } from './c-symbol-extractor.js';
|
|
7
8
|
import { extractCSharpSymbols } from './csharp-symbol-extractor.js';
|
|
8
9
|
import { buildFastGlobIgnore, detectLanguage, isPreciseLanguage, readGitignorePatterns, shouldExclude, } from './file-classifier.js';
|
|
9
10
|
import { getChangedFilesSince, isGitRepo } from './git-utils.js';
|
|
10
11
|
import { extractGoSymbols } from './go-symbol-extractor.js';
|
|
11
12
|
import { extractJvmSymbols } from './jvm-symbol-extractor.js';
|
|
13
|
+
import { extractPhpSymbols } from './php-symbol-extractor.js';
|
|
12
14
|
import { extractPythonSymbols } from './python-symbol-extractor.js';
|
|
15
|
+
import { extractRubySymbols } from './ruby-symbol-extractor.js';
|
|
13
16
|
import { extractRustSymbols } from './rust-symbol-extractor.js';
|
|
17
|
+
import { extractShellSymbols } from './shell-symbol-extractor.js';
|
|
18
|
+
import { extractSqlSymbols } from './sql-symbol-extractor.js';
|
|
14
19
|
import { extractTsSymbols } from './ts-symbol-extractor.js';
|
|
15
20
|
const C_EXTS = new Set(['.c', '.h', '.cpp', '.cc', '.cxx', '.hpp', '.hxx']);
|
|
16
21
|
const JVM_EXTS = new Set(['.java', '.kt', '.kts']);
|
|
22
|
+
const TS_EXTS = new Set(['.ts', '.tsx', '.js', '.jsx', '.mjs', '.cjs', '.mts', '.cts']);
|
|
17
23
|
async function extractSymbols(text, repoPath) {
|
|
18
24
|
const ext = path.extname(repoPath);
|
|
25
|
+
const language = detectLanguage(repoPath);
|
|
26
|
+
if (supportsBroadExtraction(language)) {
|
|
27
|
+
return extractBroadSymbols(text, repoPath, language);
|
|
28
|
+
}
|
|
19
29
|
if (ext === '.py' || ext === '.pyw' || ext === '.pyi') {
|
|
20
30
|
return extractPythonSymbols(text, repoPath);
|
|
21
31
|
}
|
|
32
|
+
if (ext === '.php') {
|
|
33
|
+
return extractPhpSymbols(text, repoPath);
|
|
34
|
+
}
|
|
35
|
+
if (ext === '.rb' || ext === '.rake' || language === 'ruby') {
|
|
36
|
+
return extractRubySymbols(text, repoPath);
|
|
37
|
+
}
|
|
38
|
+
if (['.sh', '.bash', '.zsh', '.fish'].includes(ext) || language === 'shell') {
|
|
39
|
+
return extractShellSymbols(text, repoPath);
|
|
40
|
+
}
|
|
41
|
+
if (ext === '.sql') {
|
|
42
|
+
return extractSqlSymbols(text, repoPath);
|
|
43
|
+
}
|
|
22
44
|
if (ext === '.go') {
|
|
23
45
|
return extractGoSymbols(text, repoPath);
|
|
24
46
|
}
|
|
@@ -34,7 +56,10 @@ async function extractSymbols(text, repoPath) {
|
|
|
34
56
|
if (ext === '.cs') {
|
|
35
57
|
return extractCSharpSymbols(text, repoPath);
|
|
36
58
|
}
|
|
37
|
-
|
|
59
|
+
if (TS_EXTS.has(ext)) {
|
|
60
|
+
return extractTsSymbols(text, repoPath);
|
|
61
|
+
}
|
|
62
|
+
return { symbols: [], dependencies: [], relationships: [], warnings: [] };
|
|
38
63
|
}
|
|
39
64
|
export async function scanRepository(rootPath, config, previous) {
|
|
40
65
|
const gitignorePatterns = await readGitignorePatterns(rootPath);
|
|
@@ -223,6 +248,15 @@ const SOURCE_EXTENSIONS = [
|
|
|
223
248
|
'.hpp',
|
|
224
249
|
'.hxx',
|
|
225
250
|
'.cs',
|
|
251
|
+
'.php',
|
|
252
|
+
'.swift',
|
|
253
|
+
'.rb',
|
|
254
|
+
'.rake',
|
|
255
|
+
'.lua',
|
|
256
|
+
'.dart',
|
|
257
|
+
'.ex',
|
|
258
|
+
'.exs',
|
|
259
|
+
'.scala',
|
|
226
260
|
];
|
|
227
261
|
function resolveLocalDependencies(dependencies, files) {
|
|
228
262
|
const filePaths = new Set(files.map((file) => file.path));
|
|
@@ -0,0 +1,75 @@
|
|
|
1
|
+
import { emptyExtractionResult, ExtractionContext, } from './extraction-context.js';
|
|
2
|
+
import { parseSource } from './tree-sitter-parser.js';
|
|
3
|
+
export async function extractRubySymbols(sourceText, filePath) {
|
|
4
|
+
if (!sourceText.trim()) {
|
|
5
|
+
return emptyExtractionResult();
|
|
6
|
+
}
|
|
7
|
+
const tree = await parseSource(sourceText, 'ruby');
|
|
8
|
+
const context = new ExtractionContext(filePath);
|
|
9
|
+
function addNamedSymbol(node, kind, parentName) {
|
|
10
|
+
const nameNode = findNameNode(node);
|
|
11
|
+
if (!nameNode)
|
|
12
|
+
return undefined;
|
|
13
|
+
return context.addSymbol({
|
|
14
|
+
name: nameNode.text,
|
|
15
|
+
kind,
|
|
16
|
+
startLine: node.startPosition.row + 1,
|
|
17
|
+
endLine: node.endPosition.row + 1,
|
|
18
|
+
exported: true,
|
|
19
|
+
parentName,
|
|
20
|
+
});
|
|
21
|
+
}
|
|
22
|
+
function walk(node, parentName) {
|
|
23
|
+
switch (node.type) {
|
|
24
|
+
case 'call': {
|
|
25
|
+
const methodName = findCallMethodName(node);
|
|
26
|
+
if (methodName === 'require' || methodName === 'require_relative') {
|
|
27
|
+
const argument = findFirstStringContent(node);
|
|
28
|
+
if (argument) {
|
|
29
|
+
context.addDependency(argument, methodName === 'require_relative' ? 'local' : 'package');
|
|
30
|
+
}
|
|
31
|
+
return;
|
|
32
|
+
}
|
|
33
|
+
break;
|
|
34
|
+
}
|
|
35
|
+
case 'class':
|
|
36
|
+
case 'module': {
|
|
37
|
+
const symbol = addNamedSymbol(node, node.type === 'class' ? 'class' : 'type', parentName);
|
|
38
|
+
const nextParent = symbol?.name ?? parentName;
|
|
39
|
+
for (const child of node.namedChildren) {
|
|
40
|
+
walk(child, nextParent);
|
|
41
|
+
}
|
|
42
|
+
return;
|
|
43
|
+
}
|
|
44
|
+
case 'method':
|
|
45
|
+
case 'singleton_method': {
|
|
46
|
+
const symbol = addNamedSymbol(node, parentName ? 'method' : 'function', parentName);
|
|
47
|
+
if (symbol && parentName) {
|
|
48
|
+
const parent = context.symbols.find((candidate) => candidate.name === parentName &&
|
|
49
|
+
(candidate.kind === 'class' || candidate.kind === 'type'));
|
|
50
|
+
if (parent)
|
|
51
|
+
context.addSymbolContains(parent, symbol);
|
|
52
|
+
}
|
|
53
|
+
return;
|
|
54
|
+
}
|
|
55
|
+
}
|
|
56
|
+
for (const child of node.namedChildren) {
|
|
57
|
+
walk(child, parentName);
|
|
58
|
+
}
|
|
59
|
+
}
|
|
60
|
+
walk(tree.rootNode);
|
|
61
|
+
tree.delete();
|
|
62
|
+
return context.toResult();
|
|
63
|
+
}
|
|
64
|
+
function findNameNode(node) {
|
|
65
|
+
return (node.childForFieldName('name') ??
|
|
66
|
+
node.namedChildren.find((child) => child.type === 'constant') ??
|
|
67
|
+
node.namedChildren.find((child) => child.type === 'identifier'));
|
|
68
|
+
}
|
|
69
|
+
function findCallMethodName(node) {
|
|
70
|
+
return (node.childForFieldName('method')?.text ??
|
|
71
|
+
node.namedChildren.find((child) => child.type === 'identifier')?.text);
|
|
72
|
+
}
|
|
73
|
+
function findFirstStringContent(node) {
|
|
74
|
+
return node.descendantsOfType('string_content')[0]?.text;
|
|
75
|
+
}
|
|
@@ -0,0 +1,78 @@
|
|
|
1
|
+
import { emptyExtractionResult, ExtractionContext, } from './extraction-context.js';
|
|
2
|
+
import { parseSource } from './tree-sitter-parser.js';
|
|
3
|
+
export async function extractShellSymbols(sourceText, filePath) {
|
|
4
|
+
if (!sourceText.trim()) {
|
|
5
|
+
return emptyExtractionResult();
|
|
6
|
+
}
|
|
7
|
+
const tree = await parseSource(sourceText, 'bash');
|
|
8
|
+
const context = new ExtractionContext(filePath);
|
|
9
|
+
function walk(node, currentFunctionId) {
|
|
10
|
+
switch (node.type) {
|
|
11
|
+
case 'function_definition': {
|
|
12
|
+
const name = findFunctionName(node);
|
|
13
|
+
if (name) {
|
|
14
|
+
const symbol = context.addSymbol({
|
|
15
|
+
name,
|
|
16
|
+
kind: 'function',
|
|
17
|
+
startLine: node.startPosition.row + 1,
|
|
18
|
+
endLine: node.endPosition.row + 1,
|
|
19
|
+
exported: true,
|
|
20
|
+
});
|
|
21
|
+
for (const child of node.namedChildren) {
|
|
22
|
+
walk(child, symbol.id);
|
|
23
|
+
}
|
|
24
|
+
return;
|
|
25
|
+
}
|
|
26
|
+
break;
|
|
27
|
+
}
|
|
28
|
+
case 'command': {
|
|
29
|
+
const commandName = findCommandName(node);
|
|
30
|
+
const firstArgument = findFirstCommandArgument(node);
|
|
31
|
+
if ((commandName === 'source' || commandName === '.') &&
|
|
32
|
+
firstArgument) {
|
|
33
|
+
context.addDependency(firstArgument, 'local');
|
|
34
|
+
return;
|
|
35
|
+
}
|
|
36
|
+
if (currentFunctionId && commandName && isLocalScriptReference(commandName)) {
|
|
37
|
+
context.relationships.push({
|
|
38
|
+
sourceType: 'symbol',
|
|
39
|
+
sourceId: currentFunctionId,
|
|
40
|
+
targetType: 'file',
|
|
41
|
+
targetId: commandName,
|
|
42
|
+
relationshipType: 'calls',
|
|
43
|
+
confidence: 'low',
|
|
44
|
+
});
|
|
45
|
+
}
|
|
46
|
+
break;
|
|
47
|
+
}
|
|
48
|
+
}
|
|
49
|
+
for (const child of node.namedChildren) {
|
|
50
|
+
walk(child, currentFunctionId);
|
|
51
|
+
}
|
|
52
|
+
}
|
|
53
|
+
walk(tree.rootNode);
|
|
54
|
+
tree.delete();
|
|
55
|
+
return context.toResult();
|
|
56
|
+
}
|
|
57
|
+
function findFunctionName(node) {
|
|
58
|
+
return (node.childForFieldName('name')?.text ??
|
|
59
|
+
node.namedChildren.find((child) => child.type === 'word')?.text);
|
|
60
|
+
}
|
|
61
|
+
function findCommandName(node) {
|
|
62
|
+
return (node.childForFieldName('name')?.text ??
|
|
63
|
+
node.namedChildren
|
|
64
|
+
.find((child) => child.type === 'command_name')
|
|
65
|
+
?.namedChildren.find((child) => child.type === 'word')?.text ??
|
|
66
|
+
node.namedChildren.find((child) => child.type === 'command_name')?.text ??
|
|
67
|
+
node.namedChildren.find((child) => child.type === 'word')?.text);
|
|
68
|
+
}
|
|
69
|
+
function findFirstCommandArgument(node) {
|
|
70
|
+
return node.namedChildren.find((child) => child.type === 'word')?.text;
|
|
71
|
+
}
|
|
72
|
+
function isLocalScriptReference(value) {
|
|
73
|
+
return (value.startsWith('./') ||
|
|
74
|
+
value.startsWith('../') ||
|
|
75
|
+
value.endsWith('.sh') ||
|
|
76
|
+
value.endsWith('.bash') ||
|
|
77
|
+
value.endsWith('.zsh'));
|
|
78
|
+
}
|
|
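The shell extractor above records a low-confidence `calls` relationship only when a command name looks like a local script. The snippet below copies the `isLocalScriptReference` predicate from the diff and applies it to a few made-up command names to show which ones qualify. The `commands` list is illustrative only.

```javascript
// Predicate as it appears in the diff: relative paths or shell-script suffixes.
function isLocalScriptReference(value) {
  return (
    value.startsWith('./') ||
    value.startsWith('../') ||
    value.endsWith('.sh') ||
    value.endsWith('.bash') ||
    value.endsWith('.zsh')
  );
}

// Made-up command names, as they might appear inside a shell function body.
const commands = ['./deploy.sh', 'scripts/build.sh', 'git', '../tools/run', 'setup.fish'];
const localCalls = commands.filter(isLocalScriptReference);
console.log(localCalls);
```

Plain executables such as `git` are skipped. A `.fish` suffix is not in the predicate, so `setup.fish` is skipped as well unless it is invoked through a relative path.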
@@ -0,0 +1,166 @@
|
|
|
1
|
+
import { emptyExtractionResult, ExtractionContext, } from './extraction-context.js';
|
|
2
|
+
export async function extractSqlSymbols(sourceText, filePath) {
|
|
3
|
+
if (!sourceText.trim()) {
|
|
4
|
+
return emptyExtractionResult();
|
|
5
|
+
}
|
|
6
|
+
const context = new ExtractionContext(filePath);
|
|
7
|
+
const withoutComments = stripLineComments(sourceText);
|
|
8
|
+
const statements = splitStatements(withoutComments);
|
|
9
|
+
for (const statement of statements) {
|
|
10
|
+
collectCreateStatement(statement, context);
|
|
11
|
+
collectAlterStatement(statement, context);
|
|
12
|
+
collectQueryReferences(statement, context);
|
|
13
|
+
}
|
|
14
|
+
return context.toResult();
|
|
15
|
+
}
|
|
16
|
+
function collectCreateStatement(statement, context) {
|
|
17
|
+
const normalized = compact(statement.text);
|
|
18
|
+
const createMatch = normalized.match(/^CREATE\s+(?:OR\s+REPLACE\s+)?(?:(?:UNIQUE|MATERIALIZED|TEMP|TEMPORARY)\s+)*(TABLE|VIEW|INDEX|FUNCTION|PROCEDURE|TRIGGER|TYPE)\s+(?:IF\s+NOT\s+EXISTS\s+)?("?[\w.]+"?)/i);
|
|
19
|
+
if (!createMatch?.[1] || !createMatch[2]) {
|
|
20
|
+
return;
|
|
21
|
+
}
|
|
22
|
+
const objectKind = createMatch[1].toLowerCase();
|
|
23
|
+
const name = normalizeSqlIdentifier(createMatch[2]);
|
|
24
|
+
const symbol = context.addSymbol({
|
|
25
|
+
name,
|
|
26
|
+
kind: objectKind === 'function' || objectKind === 'procedure'
|
|
27
|
+
? 'function'
|
|
28
|
+
: 'type',
|
|
29
|
+
startLine: statement.startLine,
|
|
30
|
+
endLine: statement.endLine,
|
|
31
|
+
exported: true,
|
|
32
|
+
parentName: objectKind,
|
|
33
|
+
});
|
|
34
|
+
const tableForTrigger = normalized.match(/\bON\s+("?[\w.]+"?)/i)?.[1];
|
|
35
|
+
if (objectKind === 'trigger' && tableForTrigger) {
|
|
36
|
+
context.relationships.push({
|
|
37
|
+
sourceType: 'symbol',
|
|
38
|
+
sourceId: symbol.id,
|
|
39
|
+
targetType: 'symbol',
|
|
40
|
+
targetId: normalizeSqlIdentifier(tableForTrigger),
|
|
41
|
+
relationshipType: 'mentions',
|
|
42
|
+
confidence: 'high',
|
|
43
|
+
});
|
|
44
|
+
}
|
|
45
|
+
for (const referencedTable of referencedTables(normalized)) {
|
|
46
|
+
context.relationships.push({
|
|
47
|
+
sourceType: 'symbol',
|
|
48
|
+
sourceId: symbol.id,
|
|
49
|
+
targetType: 'symbol',
|
|
50
|
+
targetId: referencedTable,
|
|
51
|
+
relationshipType: 'mentions',
|
|
52
|
+
confidence: 'medium',
|
|
53
|
+
});
|
|
54
|
+
}
|
|
55
|
+
}
|
|
56
|
+
function collectAlterStatement(statement, context) {
|
|
57
|
+
const normalized = compact(statement.text);
|
|
58
|
+
const alterMatch = normalized.match(/^ALTER\s+TABLE\s+(?:IF\s+EXISTS\s+)?("?[\w.]+"?)/i);
|
|
59
|
+
if (!alterMatch?.[1]) {
|
|
60
|
+
return;
|
|
61
|
+
}
|
|
62
|
+
const tableName = normalizeSqlIdentifier(alterMatch[1]);
|
|
63
|
+
const symbol = context.addSymbol({
|
|
64
|
+
name: `alter ${tableName}`,
|
|
65
|
+
kind: 'type',
|
|
66
|
+
startLine: statement.startLine,
|
|
67
|
+
endLine: statement.endLine,
|
|
68
|
+
exported: true,
|
|
69
|
+
parentName: 'table',
|
|
70
|
+
});
|
|
71
|
+
context.relationships.push({
|
|
72
|
+
sourceType: 'symbol',
|
|
73
|
+
sourceId: symbol.id,
|
|
74
|
+
targetType: 'symbol',
|
|
75
|
+
targetId: tableName,
|
|
76
|
+
relationshipType: 'mentions',
|
|
77
|
+
confidence: 'high',
|
|
78
|
+
});
|
|
79
|
+
for (const referencedTable of referencedTables(normalized)) {
|
|
80
|
+
context.relationships.push({
|
|
81
|
+
sourceType: 'symbol',
|
|
82
|
+
sourceId: symbol.id,
|
|
83
|
+
targetType: 'symbol',
|
|
84
|
+
targetId: referencedTable,
|
|
85
|
+
relationshipType: 'mentions',
|
|
86
|
+
confidence: 'medium',
|
|
87
|
+
});
|
|
88
|
+
}
|
|
89
|
+
}
|
|
90
|
+
function collectQueryReferences(statement, context) {
|
|
91
|
+
const normalized = compact(statement.text);
|
|
92
|
+
const queryMatch = normalized.match(/^(SELECT|INSERT|UPDATE|DELETE)\b/i);
|
|
93
|
+
if (!queryMatch?.[1]) {
|
|
94
|
+
return;
|
|
95
|
+
}
|
|
96
|
+
const symbol = context.addSymbol({
|
|
97
|
+
name: `${queryMatch[1].toLowerCase()} statement ${statement.startLine}`,
|
|
98
|
+
kind: 'type',
|
|
99
|
+
startLine: statement.startLine,
|
|
100
|
+
endLine: statement.endLine,
|
|
101
|
+
});
|
|
102
|
+
for (const table of referencedTables(normalized)) {
|
|
103
|
+
context.relationships.push({
|
|
104
|
+
sourceType: 'symbol',
|
|
105
|
+
sourceId: symbol.id,
|
|
106
|
+
targetType: 'symbol',
|
|
107
|
+
targetId: table,
|
|
108
|
+
relationshipType: 'mentions',
|
|
109
|
+
confidence: 'medium',
|
|
110
|
+
});
|
|
111
|
+
}
|
|
112
|
+
}
|
|
113
|
+
function stripLineComments(sourceText) {
|
|
114
|
+
return sourceText
|
|
115
|
+
.split(/\r?\n/)
|
|
116
|
+
.map((line) => line.replace(/--.*$/, ''))
|
|
117
|
+
.join('\n');
|
|
118
|
+
}
|
|
119
|
+
function splitStatements(sourceText) {
|
|
120
|
+
const statements = [];
|
|
121
|
+
let current = '';
|
|
122
|
+
let startLine = 1;
|
|
123
|
+
let lineNumber = 1;
|
|
124
|
+
for (const line of sourceText.split(/\r?\n/)) {
|
|
125
|
+
if (!current.trim()) {
|
|
126
|
+
startLine = lineNumber;
|
|
127
|
+
}
|
|
128
|
+
current += `${line}\n`;
|
|
129
|
+
if (line.includes(';')) {
|
|
130
|
+
statements.push({
|
|
131
|
+
text: current,
|
|
132
|
+
startLine,
|
|
133
|
+
endLine: lineNumber,
|
|
134
|
+
});
|
|
135
|
+
current = '';
|
|
136
|
+
}
|
|
137
|
+
lineNumber += 1;
|
|
138
|
+
}
|
|
139
|
+
if (current.trim()) {
|
|
140
|
+
statements.push({ text: current, startLine, endLine: lineNumber - 1 });
|
|
141
|
+
}
|
|
142
|
+
return statements;
|
|
143
|
+
}
|
|
144
|
+
function compact(value) {
|
|
145
|
+
return value.replace(/\s+/g, ' ').trim();
|
|
146
|
+
}
|
|
147
|
+
function normalizeSqlIdentifier(value) {
|
|
148
|
+
return value.replace(/^"+|"+$/g, '').replace(/[;,)]$/, '');
|
|
149
|
+
}
|
|
150
|
+
function referencedTables(statement) {
|
|
151
|
+
const names = new Set();
|
|
152
|
+
const patterns = [
|
|
153
|
+
/\bREFERENCES\s+("?[\w.]+"?)/gi,
|
|
154
|
+
/\bFROM\s+("?[\w.]+"?)/gi,
|
|
155
|
+
/\bJOIN\s+("?[\w.]+"?)/gi,
|
|
156
|
+
/\bUPDATE\s+("?[\w.]+"?)/gi,
|
|
157
|
+
/\bINTO\s+("?[\w.]+"?)/gi,
|
|
158
|
+
];
|
|
159
|
+
for (const pattern of patterns) {
|
|
160
|
+
for (const match of statement.matchAll(pattern)) {
|
|
161
|
+
if (match[1])
|
|
162
|
+
names.add(normalizeSqlIdentifier(match[1]));
|
|
163
|
+
}
|
|
164
|
+
}
|
|
165
|
+
return [...names];
|
|
166
|
+
}
|
|
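The SQL extractor above uses no grammar. It splits the text into statements at any line containing `;`, collapses whitespace, and then applies regular expressions. The example below copies `splitStatements`, `compact`, `normalizeSqlIdentifier`, and `referencedTables` from the diff and runs them on a small made-up script to show the line tracking and the table references that come out. The `sql` sample is an assumption for illustration.

```javascript
// Helper functions copied from the diff of sql-symbol-extractor.js.
function splitStatements(sourceText) {
  const statements = [];
  let current = '';
  let startLine = 1;
  let lineNumber = 1;
  for (const line of sourceText.split(/\r?\n/)) {
    if (!current.trim()) startLine = lineNumber; // first non-blank line of a statement
    current += `${line}\n`;
    if (line.includes(';')) {
      statements.push({ text: current, startLine, endLine: lineNumber });
      current = '';
    }
    lineNumber += 1;
  }
  if (current.trim()) {
    statements.push({ text: current, startLine, endLine: lineNumber - 1 });
  }
  return statements;
}

function compact(value) {
  return value.replace(/\s+/g, ' ').trim();
}

function normalizeSqlIdentifier(value) {
  return value.replace(/^"+|"+$/g, '').replace(/[;,)]$/, '');
}

function referencedTables(statement) {
  const names = new Set();
  const patterns = [
    /\bREFERENCES\s+("?[\w.]+"?)/gi,
    /\bFROM\s+("?[\w.]+"?)/gi,
    /\bJOIN\s+("?[\w.]+"?)/gi,
    /\bUPDATE\s+("?[\w.]+"?)/gi,
    /\bINTO\s+("?[\w.]+"?)/gi,
  ];
  for (const pattern of patterns) {
    for (const match of statement.matchAll(pattern)) {
      if (match[1]) names.add(normalizeSqlIdentifier(match[1]));
    }
  }
  return [...names];
}

// Made-up input: one multi-line CREATE and one single-line SELECT.
const sql = [
  'CREATE TABLE orders (',
  '  user_id INT REFERENCES users(id)',
  ');',
  'SELECT * FROM orders JOIN users ON orders.user_id = users.id;',
].join('\n');

const statements = splitStatements(sql);
const references = statements.map((s) => referencedTables(compact(s.text)));
console.log(statements.map((s) => [s.startLine, s.endLine]), references);
```

Because the split is line-based, a `;` inside a string literal or a function body also ends a statement, and two statements on one line are treated as one. This fits the `medium` confidence assigned to most of the resulting `mentions` relationships.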
@@ -1,5 +1,4 @@
|
|
|
1
1
|
import { Language, Tree } from 'web-tree-sitter';
|
|
2
|
-
type GrammarKey = 'python' | 'java' | 'kotlin' | 'go' | 'rust' | 'c' | 'cpp' | 'c_sharp';
|
|
2
|
+
export type GrammarKey = 'python' | 'java' | 'kotlin' | 'go' | 'rust' | 'c' | 'cpp' | 'c_sharp' | 'php' | 'ruby' | 'bash' | 'yaml' | 'json' | 'html' | 'css' | 'lua' | 'dart' | 'elixir' | 'scala';
|
|
3
3
|
export declare function loadLanguage(grammarKey: GrammarKey): Promise<Language>;
|
|
4
4
|
export declare function parseSource(sourceText: string, grammarKey: GrammarKey): Promise<Tree>;
|
|
5
|
-
export {};
|
|
@@ -19,6 +19,20 @@ const GRAMMAR_PACKAGES = {
|
|
|
19
19
|
pkg: 'tree-sitter-c-sharp',
|
|
20
20
|
wasm: 'tree-sitter-c_sharp.wasm',
|
|
21
21
|
},
|
|
22
|
+
php: { pkg: 'tree-sitter-php', wasm: 'tree-sitter-php.wasm' },
|
|
23
|
+
ruby: { pkg: 'tree-sitter-ruby', wasm: 'tree-sitter-ruby.wasm' },
|
|
24
|
+
bash: { pkg: 'tree-sitter-bash', wasm: 'tree-sitter-bash.wasm' },
|
|
25
|
+
yaml: {
|
|
26
|
+
pkg: '@tree-sitter-grammars/tree-sitter-yaml',
|
|
27
|
+
wasm: 'tree-sitter-yaml.wasm',
|
|
28
|
+
},
|
|
29
|
+
json: { pkg: 'tree-sitter-json', wasm: 'tree-sitter-json.wasm' },
|
|
30
|
+
html: { pkg: 'tree-sitter-html', wasm: 'tree-sitter-html.wasm' },
|
|
31
|
+
css: { pkg: 'tree-sitter-css', wasm: 'tree-sitter-css.wasm' },
|
|
32
|
+
lua: { pkg: 'tree-sitter-lua', wasm: 'tree-sitter-lua.wasm' },
|
|
33
|
+
dart: { pkg: 'tree-sitter-dart', wasm: 'tree-sitter-dart.wasm' },
|
|
34
|
+
elixir: { pkg: 'tree-sitter-elixir', wasm: 'tree-sitter-elixir.wasm' },
|
|
35
|
+
scala: { pkg: 'tree-sitter-scala', wasm: 'tree-sitter-scala.wasm' },
|
|
22
36
|
};
|
|
23
37
|
async function ensureInit() {
|
|
24
38
|
if (!initPromise) {
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@kentwynn/kgraph",
|
|
3
|
-
"version": "0.2.
|
|
3
|
+
"version": "0.2.23",
|
|
4
4
|
"description": "Persistent repo intelligence for AI coding assistants.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"bin": {
|
|
@@ -45,16 +45,27 @@
|
|
|
45
45
|
"dependencies": {
|
|
46
46
|
"@clack/prompts": "^1.3.0",
|
|
47
47
|
"@tree-sitter-grammars/tree-sitter-kotlin": "^1.1.0",
|
|
48
|
+
"@tree-sitter-grammars/tree-sitter-yaml": "^0.7.1",
|
|
48
49
|
"chalk": "^5.6.2",
|
|
49
50
|
"commander": "^12.1.0",
|
|
50
51
|
"fast-glob": "^3.3.2",
|
|
52
|
+
"tree-sitter-bash": "^0.25.1",
|
|
51
53
|
"tree-sitter-c": "^0.24.1",
|
|
52
54
|
"tree-sitter-c-sharp": "^0.23.5",
|
|
53
55
|
"tree-sitter-cpp": "^0.23.4",
|
|
56
|
+
"tree-sitter-css": "^0.25.0",
|
|
57
|
+
"tree-sitter-dart": "^1.0.0",
|
|
58
|
+
"tree-sitter-elixir": "^0.3.5",
|
|
54
59
|
"tree-sitter-go": "^0.25.0",
|
|
60
|
+
"tree-sitter-html": "^0.23.2",
|
|
55
61
|
"tree-sitter-java": "^0.23.5",
|
|
62
|
+
"tree-sitter-json": "^0.24.8",
|
|
63
|
+
"tree-sitter-lua": "^2.1.3",
|
|
64
|
+
"tree-sitter-php": "^0.24.2",
|
|
56
65
|
"tree-sitter-python": "^0.25.0",
|
|
66
|
+
"tree-sitter-ruby": "^0.23.1",
|
|
57
67
|
"tree-sitter-rust": "^0.24.0",
|
|
68
|
+
"tree-sitter-scala": "^0.24.0",
|
|
58
69
|
"typescript": "^5.9.3",
|
|
59
70
|
"web-tree-sitter": "^0.26.8",
|
|
60
71
|
"yaml": "^2.5.1"
|