carto-md 1.1.3 → 2.0.0
- package/CONTRIBUTING.md +156 -45
- package/README.md +113 -281
- package/package.json +25 -5
- package/src/agents/domains.js +31 -3
- package/src/agents/leiden.js +329 -0
- package/src/cli/check.js +27 -23
- package/src/cli/init.js +19 -27
- package/src/cli/serve.js +11 -2
- package/src/cli/sync.js +16 -2
- package/src/cli/watch.js +254 -144
- package/src/detector/framework.js +24 -12
- package/src/engine/carto.js +22 -6
- package/src/engine/worker.js +6 -0
- package/src/extractors/imports.js +409 -10
- package/src/extractors/languages/cpp.js +66 -0
- package/src/extractors/languages/csharp.js +122 -0
- package/src/extractors/languages/go.js +13 -1
- package/src/extractors/languages/java.js +116 -0
- package/src/extractors/languages/javascript.js +91 -2
- package/src/extractors/languages/python.js +15 -1
- package/src/extractors/languages/ruby.js +157 -0
- package/src/extractors/languages/rust.js +99 -0
- package/src/extractors/languages/typescript.js +55 -4
- package/src/extractors/routes.js +41 -1
- package/src/extractors/tree-sitter-parser.js +405 -0
- package/src/mcp/server-v2.js +543 -0
- package/src/store/change-detector.js +86 -0
- package/src/store/migrate.js +204 -0
- package/src/store/sqlite-store.js +747 -0
- package/src/store/sync-v2.js +586 -0
- package/src/watcher/watch.js +187 -35
package/CONTRIBUTING.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Contributing to Carto
|
|
2
2
|
|
|
3
|
-
Carto is free, open source, and community-maintained. The core team owns the
|
|
3
|
+
Carto is free, open source, and community-maintained. The core team owns the SQLite store, MCP server, graph clustering, and CLI. The community owns language and framework extractors.
|
|
4
4
|
|
|
5
5
|
---
|
|
6
6
|
|
|
@@ -10,85 +10,190 @@ Carto is free, open source, and community-maintained. The core team owns the mer
|
|
|
10
10
|
|
|
11
11
|
New language support lives in `src/extractors/languages/`. Each language is an isolated module.
|
|
12
12
|
|
|
13
|
-
Currently supported
|
|
13
|
+
**Currently supported:** JavaScript/TypeScript, Python, Go, Rust, Ruby, Java, C/C++, C#, R, Prisma, HTML
|
|
14
14
|
|
|
15
|
-
Wanted
|
|
15
|
+
**Wanted:** PHP, Swift, Kotlin, Elixir, Scala, Haskell, Zig
|
|
16
16
|
|
|
17
17
|
### Tier 2 — Framework extractors (safe to add, easy to review)
|
|
18
18
|
|
|
19
|
-
Framework-specific route and model extraction lives
|
|
19
|
+
Framework-specific route and model extraction lives inside the language plugins.
|
|
20
20
|
|
|
21
|
-
Currently supported
|
|
22
|
-
- **JS/TS**: Express, Next.js (App + Pages Router), tRPC, Drizzle, Zod
|
|
23
|
-
- **Python**: FastAPI, Pydantic, SQLAlchemy, Django (models + URLs)
|
|
24
|
-
- **Go**: Gin, Echo, Chi, net/http
|
|
21
|
+
**Currently supported:**
|
|
22
|
+
- **JS/TS**: Express, Next.js (App + Pages Router), tRPC, React Router, Drizzle, Zod, TypeScript interfaces
|
|
23
|
+
- **Python**: FastAPI, Flask, Pydantic, SQLAlchemy, Django (models + URLs)
|
|
24
|
+
- **Go**: Gin, Echo, Chi, net/http — routes, structs, import graph
|
|
25
|
+
- **Rust**: Actix-web, Axum, Rocket — routes, structs
|
|
26
|
+
- **Java**: Spring MVC/Boot, JAX-RS — routes, JPA entities, records
|
|
27
|
+
- **C#**: ASP.NET Core (attribute routing + minimal API), EF Core classes, records
|
|
28
|
+
- **Ruby**: Rails routes.rb, Sinatra, ActiveRecord models
|
|
25
29
|
- **Schema**: Prisma
|
|
26
30
|
- **Frontend**: HTML fetch()
|
|
27
31
|
- **R**: Plumber, Shiny, R6, S7
|
|
28
32
|
|
|
29
|
-
Wanted
|
|
33
|
+
**Wanted:** NestJS, Hono, Fastify, Laravel, Django REST Framework, Ktor, Vapor
|
|
30
34
|
|
|
31
35
|
### Tier 3 — Core (review carefully before merging)
|
|
32
36
|
|
|
33
|
-
- `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes
|
|
34
|
-
- `src/agents/
|
|
35
|
-
- `src/
|
|
36
|
-
- `src/mcp/server.js` — MCP server tools. Breaking changes affect Kiro/Cursor/Claude
|
|
37
|
-
- `src/
|
|
38
|
-
- `src/
|
|
39
|
-
- `src/
|
|
40
|
-
- `src/cli/` — CLI commands.
|
|
37
|
+
- `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes.
|
|
38
|
+
- `src/agents/leiden.js` — Leiden+CPM graph clustering. Wrong clusters = wrong domain context.
|
|
39
|
+
- `src/store/sqlite-store.js` — SQLite persistence layer.
|
|
40
|
+
- `src/mcp/server-v2.js` — MCP server tools. Breaking changes affect Kiro/Cursor/Claude.
|
|
41
|
+
- `src/store/sync-v2.js` — full sync pipeline.
|
|
42
|
+
- `src/cli/watch.js` — incremental update pipeline.
|
|
43
|
+
- `src/extractors/imports.js` — import resolution for all languages.
|
|
41
44
|
|
|
42
45
|
---
|
|
43
46
|
|
|
44
|
-
## How to add a language
|
|
47
|
+
## How to add a language (V2 pattern — tree-sitter based)
|
|
45
48
|
|
|
46
|
-
|
|
47
|
-
|
|
49
|
+
V2 uses tree-sitter for import and symbol extraction. Babel is only used for deep JS/TS route/model extraction on API handler files.
|
|
50
|
+
|
|
51
|
+
### Step 1: Install the grammar
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
npm install tree-sitter-yourlanguage --save-exact
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
### Step 2: Add grammar definition to `src/extractors/tree-sitter-parser.js`
|
|
58
|
+
|
|
59
|
+
Add an entry to the `GRAMMAR_DEFS` array:
|
|
60
|
+
|
|
61
|
+
```js
|
|
62
|
+
{
|
|
63
|
+
name: 'yourlanguage',
|
|
64
|
+
extensions: ['.ext'],
|
|
65
|
+
loadGrammar: () => require('tree-sitter-yourlanguage'),
|
|
66
|
+
importQuery: `
|
|
67
|
+
(import_statement source: (string) @src)
|
|
68
|
+
`,
|
|
69
|
+
symbolQuery: `
|
|
70
|
+
(function_declaration name: (identifier) @name)
|
|
71
|
+
(class_declaration name: (identifier) @name)
|
|
72
|
+
`,
|
|
73
|
+
},
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
The queries use tree-sitter S-expression syntax. To see which node types your grammar produces, print the syntax tree for a sample snippet: `node -e "const P = require('tree-sitter'); const L = require('tree-sitter-yourlanguage'); const p = new P(); p.setLanguage(L); console.log(p.parse('your code').rootNode.toString())"`.
|
|
77
|
+
|
|
78
|
+
### Step 3: Create `src/extractors/languages/yourlanguage.js`
|
|
48
79
|
|
|
49
80
|
```js
|
|
81
|
+
'use strict';
|
|
82
|
+
|
|
83
|
+
const tsParser = require('../tree-sitter-parser');
|
|
84
|
+
|
|
50
85
|
module.exports = {
|
|
51
86
|
name: 'yourlanguage',
|
|
52
87
|
extensions: ['.ext'],
|
|
53
|
-
extract(content,
|
|
88
|
+
extract(content, filename) {
|
|
89
|
+
// Fast path: tree-sitter for imports + symbols (runs on ALL files)
|
|
90
|
+
const { imports: tsImports, symbols: tsSymbols } = tsParser.isAvailable()
|
|
91
|
+
? tsParser.extractAll(content, '.ext')
|
|
92
|
+
: { imports: [], symbols: [] };
|
|
93
|
+
|
|
54
94
|
return {
|
|
55
|
-
routes:
|
|
56
|
-
models:
|
|
57
|
-
functions:
|
|
58
|
-
|
|
59
|
-
|
|
95
|
+
routes: extractRoutes(content), // framework-specific, regex
|
|
96
|
+
models: extractModels(content), // ORM/schema models, regex
|
|
97
|
+
functions: tsSymbols
|
|
98
|
+
.filter(s => s.kind === 'function')
|
|
99
|
+
.map(s => ({ name: s.name, params: '—', returnType: '—' })),
|
|
100
|
+
envVars: extractEnvVars(content), // env var references
|
|
101
|
+
dbTables: [],
|
|
60
102
|
fetches: [],
|
|
61
103
|
storageKeys: [],
|
|
62
|
-
|
|
63
|
-
|
|
104
|
+
_tsImports: tsImports, // raw import paths (for import graph)
|
|
105
|
+
_tsSymbols: tsSymbols, // all symbols (for get_file_summary)
|
|
64
106
|
};
|
|
65
107
|
}
|
|
66
108
|
};
|
|
109
|
+
|
|
110
|
+
function extractRoutes(content) { return []; }
|
|
111
|
+
function extractModels(content) { return []; }
|
|
112
|
+
function extractEnvVars(content) { return []; }
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
### Step 4: Add import resolution to `src/extractors/imports.js`
|
|
116
|
+
|
|
117
|
+
If your language has resolvable local imports (not just package names), add a case in `extractImports()`:
|
|
118
|
+
|
|
119
|
+
```js
|
|
120
|
+
} else if (ext === '.ext') {
|
|
121
|
+
return extractYourLanguageImports(content, filePath, projectRoot);
|
|
122
|
+
}
|
|
67
123
|
```
|
|
68
124
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
5
|
|
125
|
+
Then implement `extractYourLanguageImports()` at the bottom of the file. It should return an array of relative file paths (from project root) that actually exist on disk.
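
As a rough illustration of the contract described above (return project-relative paths that exist on disk), here is a minimal sketch. The import syntax (`import "./file"`), the `.ext` extension, and the regex are placeholders, not something Carto defines; a real resolver depends on the language's own rules.

```js
'use strict';

const fs = require('fs');
const path = require('path');

// Hypothetical resolver for a language whose local imports look like
//   import "./util"   or   import "../lib/helpers.ext"
// The syntax and the '.ext' extension are assumptions for this sketch.
function extractYourLanguageImports(content, filePath, projectRoot) {
  const importRe = /^\s*import\s+["'](\.{1,2}\/[^"']+)["']/gm;
  const fileDir = path.dirname(path.resolve(projectRoot, filePath));
  const resolved = new Set();

  for (const match of content.matchAll(importRe)) {
    const spec = match[1];
    // Try the path as written, then with the language extension appended.
    const candidates = [spec, spec + '.ext'].map(s => path.resolve(fileDir, s));
    const hit = candidates.find(c => fs.existsSync(c) && fs.statSync(c).isFile());
    if (!hit) continue; // unresolved import: nothing on disk

    const rel = path.relative(projectRoot, hit);
    // Ignore anything that resolves outside the project root.
    if (rel.startsWith('..') || path.isAbsolute(rel)) continue;
    resolved.add(rel.split(path.sep).join('/'));
  }
  return [...resolved];
}
```

Package-style imports (no leading `./` or `../`) never match the pattern, so only local files end up in the result.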
|
|
126
|
+
|
|
127
|
+
### Step 5: Add to `CODE_EXTS` in `src/store/sync-v2.js`
|
|
128
|
+
|
|
129
|
+
```js
|
|
130
|
+
const CODE_EXTS = new Set([
|
|
131
|
+
// ... existing ...
|
|
132
|
+
'.ext',
|
|
133
|
+
]);
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
### Step 6: Add to `detectLanguage()` in `src/store/sync-v2.js`
|
|
137
|
+
|
|
138
|
+
```js
|
|
139
|
+
'.ext': 'yourlanguage',
|
|
140
|
+
```
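
Steps 5 and 6 register the same extension in two places. The sketch below shows how an extension set and an extension-to-language map could work together; `LANGUAGE_BY_EXT` and `detectLanguageSketch` are hypothetical names, and the real `detectLanguage()` in `sync-v2.js` may be shaped differently.

```js
'use strict';

const path = require('path');

// Extensions treated as source code (stand-in for CODE_EXTS).
const CODE_EXTS = new Set(['.js', '.py', '.ext']);

// Hypothetical lookup table; the real mapping lives in sync-v2.js.
const LANGUAGE_BY_EXT = {
  '.js': 'javascript',
  '.py': 'python',
  '.ext': 'yourlanguage',
};

// Returns the language name for a file, or null if the file is not tracked.
function detectLanguageSketch(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  if (!CODE_EXTS.has(ext)) return null;
  return LANGUAGE_BY_EXT[ext] || null;
}
```

If an extension is added to only one of the two structures, the lookup returns `null`, which is why both steps are needed.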
|
|
141
|
+
|
|
142
|
+
### Step 7: Test
|
|
143
|
+
|
|
144
|
+
```bash
|
|
145
|
+
# Test extraction on a real file
|
|
146
|
+
node -e "
|
|
147
|
+
const { loadLanguagePlugins, getPluginForFile } = require('./src/extractors/loader');
|
|
148
|
+
const plugins = loadLanguagePlugins();
|
|
149
|
+
const plugin = getPluginForFile(plugins, 'test.ext');
|
|
150
|
+
const result = plugin.extract('your code here', 'test.ext');
|
|
151
|
+
console.log(JSON.stringify(result, null, 2));
|
|
152
|
+
"
|
|
153
|
+
|
|
154
|
+
# Run correctness tests
|
|
155
|
+
node test/correctness.js
|
|
156
|
+
|
|
157
|
+
# Run full test suite
|
|
158
|
+
npm test
|
|
159
|
+
```
|
|
72
160
|
|
|
73
161
|
---
|
|
74
162
|
|
|
75
163
|
## How to add a framework extractor
|
|
76
164
|
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
165
|
+
Framework-specific extraction (routes, models) lives inside the language plugin. Add regex patterns to the relevant `extractRoutes()` or `extractModels()` function.
|
|
166
|
+
|
|
167
|
+
Example — adding Hono routes to the JS plugin:
|
|
168
|
+
|
|
169
|
+
```js
|
|
170
|
+
// In src/extractors/languages/javascript.js, inside extractExpressRoutes():
|
|
171
|
+
|
|
172
|
+
// Hono: app.get('/path', handler) — same pattern as Express, already covered
|
|
173
|
+
// Hono with chaining: app.route('/api', apiRouter) — add if needed
|
|
174
|
+
```
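
To make the regex approach concrete, here is a minimal sketch of a route pattern for Express-style calls. The receiver names (`app`, `router`), the method list, and the `{ method, path }` return shape are assumptions for illustration; check the existing plugin for the shape it actually returns.

```js
'use strict';

// Matches calls such as app.get('/users', handler) or router.post("/login", fn).
// The quote character is captured and reused so the path ends at the same quote.
const ROUTE_RE = /\b(app|router)\.(get|post|put|patch|delete)\(\s*(['"`])([^'"`]+)\3/g;

function extractRoutes(content) {
  const routes = [];
  for (const m of content.matchAll(ROUTE_RE)) {
    routes.push({ method: m[2].toUpperCase(), path: m[4] });
  }
  return routes;
}
```

Calls like `app.route('/api', apiRouter)` are not matched by this pattern, so sub-router mounting would need its own rule.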
|
|
175
|
+
|
|
176
|
+
Test on at least 2 real open-source projects using the framework.
|
|
81
177
|
|
|
82
178
|
---
|
|
83
179
|
|
|
84
|
-
## How
|
|
180
|
+
## How domain clustering works (V2)
|
|
85
181
|
|
|
86
|
-
Domain
|
|
182
|
+
Domain detection uses **Leiden+CPM graph clustering** (`src/agents/leiden.js`). Files that import each other heavily cluster together. Domain names are inferred from path tokens, with keyword hints for well-known patterns.
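
For intuition about why heavily connected files group together, here is a sketch of the Constant Potts Model (CPM) quality score in its standard unweighted form: each cluster contributes its internal edge count minus `gamma` times the number of possible pairs. This is not taken from `leiden.js`; the function name, the input shapes, and the unweighted treatment of imports are assumptions.

```js
'use strict';

// edges: array of [fileA, fileB] import pairs, treated as undirected.
// membership: Map from file name to cluster id.
// gamma: resolution parameter; higher values favor smaller, denser clusters.
function cpmQuality(edges, membership, gamma) {
  const sizes = new Map();
  for (const cluster of membership.values()) {
    sizes.set(cluster, (sizes.get(cluster) || 0) + 1);
  }

  const internalEdges = new Map();
  for (const [a, b] of edges) {
    const ca = membership.get(a);
    if (ca !== undefined && ca === membership.get(b)) {
      internalEdges.set(ca, (internalEdges.get(ca) || 0) + 1);
    }
  }

  let quality = 0;
  for (const [cluster, n] of sizes) {
    const e = internalEdges.get(cluster) || 0;
    quality += e - gamma * (n * (n - 1)) / 2; // e_c - gamma * C(n_c, 2)
  }
  return quality;
}
```

A clustering algorithm such as Leiden searches for the membership that maximizes this score; the sketch only evaluates a given membership.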
|
|
87
183
|
|
|
88
|
-
|
|
89
|
-
|
|
184
|
+
For non-SaaS repos, users can define custom domains in `carto.config.json`:
|
|
185
|
+
|
|
186
|
+
```json
|
|
187
|
+
{
|
|
188
|
+
"domains": {
|
|
189
|
+
"EDITOR": ["editor", "monaco", "text"],
|
|
190
|
+
"PLATFORM": ["platform", "service", "registry"]
|
|
191
|
+
}
|
|
192
|
+
}
|
|
90
193
|
```
|
|
91
194
|
|
|
195
|
+
The keyword seeds in `src/store/sync-v2.js` (the `keywordSeeds` object) can be extended for new domain types.
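
One way to picture keyword-based naming: split each file path in a cluster into tokens and pick the domain whose keywords match most often. The sketch below reuses the `domains` shape from the config example; `nameCluster` and its exact-token matching rule are hypothetical and may differ from how Carto scores keywords.

```js
'use strict';

// Same shape as the "domains" object in carto.config.json.
const domains = {
  EDITOR: ['editor', 'monaco', 'text'],
  PLATFORM: ['platform', 'service', 'registry'],
};

// Counts keyword hits across all path tokens and returns the best domain name.
function nameCluster(filePaths, domainKeywords) {
  const tokens = filePaths.flatMap(p =>
    p.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
  );

  let best = null;
  let bestHits = 0;
  for (const [domain, keywords] of Object.entries(domainKeywords)) {
    const hits = tokens.filter(t => keywords.includes(t)).length;
    if (hits > bestHits) {
      best = domain;
      bestHits = hits;
    }
  }
  return best; // null when no keyword matches any token
}
```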
|
|
196
|
+
|
|
92
197
|
---
|
|
93
198
|
|
|
94
199
|
## Ground rules
|
|
@@ -98,6 +203,7 @@ Domain clustering lives in `src/agents/domains.js`. The `DOMAIN_MAP` array maps
|
|
|
98
203
|
- **Test on unknown repos.** Don't just test on projects you wrote. Find a real open-source repo using the framework and verify the output is correct.
|
|
99
204
|
- **No cloud, no telemetry, no tracking.** Carto is local only. Forever. Don't add any network calls except the existing npm update check.
|
|
100
205
|
- **No paid features.** Free forever. MIT. Don't propose monetization.
|
|
206
|
+
- **tree-sitter first.** For new languages, always use tree-sitter for imports and symbols. Only use regex for framework-specific patterns (routes, models) that tree-sitter queries can't easily express.
|
|
101
207
|
|
|
102
208
|
---
|
|
103
209
|
|
|
@@ -110,6 +216,8 @@ npm install
|
|
|
110
216
|
node src/cli/index.js init # test in any project
|
|
111
217
|
node src/cli/index.js serve # test MCP server
|
|
112
218
|
npm test # run test suite (30 tests)
|
|
219
|
+
node test/correctness.js # run correctness tests (31 tests)
|
|
220
|
+
node test/benchmark.js # run benchmarks against real repos
|
|
113
221
|
```
|
|
114
222
|
|
|
115
223
|
---
|
|
@@ -117,20 +225,23 @@ npm test # run test suite (30 tests)
|
|
|
117
225
|
## PR checklist
|
|
118
226
|
|
|
119
227
|
- [ ] Tested on at least 2 real open-source projects
|
|
120
|
-
- [ ] Before/after AGENTS.md included in PR description
|
|
121
|
-
- [ ] Plugin
|
|
228
|
+
- [ ] Before/after AGENTS.md or `get_architecture` output included in PR description
|
|
229
|
+
- [ ] Plugin uses tree-sitter for imports/symbols (not Babel or regex for the hot path)
|
|
230
|
+
- [ ] Plugin returns all fields including `_tsImports` and `_tsSymbols`
|
|
231
|
+
- [ ] Import resolution added to `src/extractors/imports.js` if language has local imports
|
|
232
|
+
- [ ] Extension added to `CODE_EXTS` and `detectLanguage()` in `sync-v2.js`
|
|
122
233
|
- [ ] No changes to merger logic (unless explicitly fixing a merger bug)
|
|
123
234
|
- [ ] No network calls added
|
|
124
|
-
- [ ] `
|
|
125
|
-
- [ ] `
|
|
235
|
+
- [ ] `npm test` passes (30/30)
|
|
236
|
+
- [ ] `node test/correctness.js` passes (31/31)
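
The "returns all fields" item can be checked mechanically before opening a PR. This sketch calls a plugin's `extract()` and lists any missing fields; the field list is copied from the Step 3 template, while `missingFields` and the assumption that every field is an array are illustrative only.

```js
'use strict';

// Field names taken from the plugin template in Step 3.
const REQUIRED_FIELDS = [
  'routes', 'models', 'functions', 'envVars', 'dbTables',
  'fetches', 'storageKeys', '_tsImports', '_tsSymbols',
];

// Returns the names of required fields that are absent or not arrays.
function missingFields(plugin, sampleCode, filename) {
  const result = plugin.extract(sampleCode, filename);
  return REQUIRED_FIELDS.filter(field => !Array.isArray(result[field]));
}
```

An empty array from `missingFields` means the plugin output has the expected shape for that sample.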
|
|
126
237
|
|
|
127
238
|
---
|
|
128
239
|
|
|
129
240
|
## Issues
|
|
130
241
|
|
|
131
|
-
- **Bug**: Open an issue with the project type, command run, and what
|
|
242
|
+
- **Bug**: Open an issue with the project type, command run, and what output was produced vs expected.
|
|
132
243
|
- **Language request**: Open an issue titled "Language: [name]"
|
|
133
244
|
- **Framework request**: Open an issue titled "Framework: [name]"
|
|
134
|
-
- **Domain
|
|
245
|
+
- **Domain clustering issue**: Open an issue titled "Domains: [repo name]" with the repo URL and what domains were detected vs what you expected.
|
|
135
246
|
|
|
136
247
|
All issues are acknowledged within 24 hours.
|