carto-md 1.1.4 → 2.0.1
- package/CONTRIBUTING.md +144 -48
- package/README.md +151 -280
- package/acp-strategy.md +480 -0
- package/package.json +29 -5
- package/src/acp/agent.js +221 -0
- package/src/acp/prompt.js +69 -0
- package/src/acp/providers/anthropic.js +125 -0
- package/src/acp/providers/index.js +83 -0
- package/src/acp/providers/openai.js +137 -0
- package/src/acp/session.js +71 -0
- package/src/acp/tools.js +150 -0
- package/src/agents/leiden.js +329 -0
- package/src/cli/agent.js +13 -0
- package/src/cli/check.js +27 -23
- package/src/cli/index.js +3 -0
- package/src/cli/init.js +19 -27
- package/src/cli/serve.js +11 -2
- package/src/cli/sync.js +16 -2
- package/src/cli/watch.js +254 -144
- package/src/engine/worker.js +6 -0
- package/src/extractors/imports.js +334 -10
- package/src/extractors/languages/cpp.js +66 -0
- package/src/extractors/languages/csharp.js +122 -0
- package/src/extractors/languages/go.js +13 -1
- package/src/extractors/languages/java.js +116 -0
- package/src/extractors/languages/javascript.js +91 -2
- package/src/extractors/languages/python.js +15 -1
- package/src/extractors/languages/ruby.js +157 -0
- package/src/extractors/languages/rust.js +99 -0
- package/src/extractors/languages/typescript.js +55 -4
- package/src/extractors/tree-sitter-parser.js +405 -0
- package/src/mcp/server-v2.js +543 -0
- package/src/store/change-detector.js +86 -0
- package/src/store/migrate.js +204 -0
- package/src/store/sqlite-store.js +747 -0
- package/src/store/sync-v2.js +586 -0
- package/src/watcher/watch.js +187 -35
package/CONTRIBUTING.md
CHANGED
@@ -1,6 +1,6 @@
 # Contributing to Carto
 
-Carto is free, open source, and community-maintained. The core team owns the
+Carto is free, open source, and community-maintained. The core team owns the SQLite store, MCP server, graph clustering, and CLI. The community owns language and framework extractors.
 
 ---
 
@@ -10,88 +10,178 @@ Carto is free, open source, and community-maintained. The core team owns the mer
 
 New language support lives in `src/extractors/languages/`. Each language is an isolated module.
 
-Currently supported
+**Currently supported:** JavaScript/TypeScript, Python, Go, Rust, Ruby, Java, C/C++, C#, R, Prisma, HTML
 
-Wanted
+**Wanted:** PHP, Swift, Kotlin, Elixir, Scala, Haskell, Zig
 
 ### Tier 2 — Framework extractors (safe to add, easy to review)
 
-Framework-specific route and model extraction lives
+Framework-specific route and model extraction lives inside the language plugins.
 
-Currently supported
-- **JS/TS**: Express, Next.js (App + Pages Router), tRPC, Drizzle, Zod
+**Currently supported:**
+- **JS/TS**: Express, Next.js (App + Pages Router), tRPC, React Router, Drizzle, Zod, TypeScript interfaces
 - **Python**: FastAPI, Flask, Pydantic, SQLAlchemy, Django (models + URLs)
-- **Go**: Gin, Echo, Chi,
+- **Go**: Gin, Echo, Chi, net/http — routes, structs, import graph
+- **Rust**: Actix-web, Axum, Rocket — routes, structs
+- **Java**: Spring MVC/Boot, JAX-RS — routes, JPA entities, records
+- **C#**: ASP.NET Core (attribute routing + minimal API), EF Core classes, records
+- **Ruby**: Rails routes.rb, Sinatra, ActiveRecord models
 - **Schema**: Prisma
 - **Frontend**: HTML fetch()
 - **R**: Plumber, Shiny, R6, S7
 
-Wanted
+**Wanted:** NestJS, Hono, Fastify, Laravel, Django REST Framework, Ktor, Vapor
 
 ### Tier 3 — Core (review carefully before merging)
 
-- `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes
-- `src/agents/
-- `src/
-- `src/mcp/server.js` — MCP server tools. Breaking changes affect Kiro/Cursor/Claude
-- `src/
-- `src/
-- `src/
-- `src/cli/` — CLI commands.
+- `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes.
+- `src/agents/leiden.js` — Leiden+CPM graph clustering. Wrong clusters = wrong domain context.
+- `src/store/sqlite-store.js` — SQLite persistence layer.
+- `src/mcp/server-v2.js` — MCP server tools. Breaking changes affect Kiro/Cursor/Claude.
+- `src/store/sync-v2.js` — full sync pipeline.
+- `src/cli/watch.js` — incremental update pipeline.
+- `src/extractors/imports.js` — import resolution for all languages.
 
 ---
 
-## How to add a language
+## How to add a language (V2 pattern — tree-sitter based)
 
-
-
+V2 uses tree-sitter for import and symbol extraction. Babel is only used for deep JS/TS route/model extraction on API handler files.
+
+### Step 1: Install the grammar
+
+```bash
+npm install tree-sitter-yourlanguage --save-exact
+```
+
+### Step 2: Add grammar definition to `src/extractors/tree-sitter-parser.js`
+
+Add an entry to the `GRAMMAR_DEFS` array:
 
 ```js
+{
+  name: 'yourlanguage',
+  extensions: ['.ext'],
+  loadGrammar: () => require('tree-sitter-yourlanguage'),
+  importQuery: `
+    (import_statement source: (string) @src)
+  `,
+  symbolQuery: `
+    (function_declaration name: (identifier) @name)
+    (class_declaration name: (identifier) @name)
+  `,
+},
+```
+
+The queries use tree-sitter S-expression syntax. Run `node -e "const P = require('tree-sitter'); const L = require('tree-sitter-yourlanguage'); const p = new P(); p.setLanguage(L); console.log(p.parse('your code').rootNode.toString())"` to see the node types.
+
+### Step 3: Create `src/extractors/languages/yourlanguage.js`
+
+```js
+'use strict';
+
+const tsParser = require('../tree-sitter-parser');
+
 module.exports = {
   name: 'yourlanguage',
   extensions: ['.ext'],
-  extract(content,
+  extract(content, filename) {
+    // Fast path: tree-sitter for imports + symbols (runs on ALL files)
+    const { imports: tsImports, symbols: tsSymbols } = tsParser.isAvailable()
+      ? tsParser.extractAll(content, '.ext')
+      : { imports: [], symbols: [] };
+
     return {
-      routes:
-      models:
-      functions:
-
-
+      routes: extractRoutes(content), // framework-specific, regex
+      models: extractModels(content), // ORM/schema models, regex
+      functions: tsSymbols
+        .filter(s => s.kind === 'function')
+        .map(s => ({ name: s.name, params: '—', returnType: '—' })),
+      envVars: extractEnvVars(content), // env var references
+      dbTables: [],
       fetches: [],
       storageKeys: [],
-
-
+      _tsImports: tsImports, // raw import paths (for import graph)
+      _tsSymbols: tsSymbols, // all symbols (for get_file_summary)
     };
   }
 };
+
+function extractRoutes(content) { return []; }
+function extractModels(content) { return []; }
+function extractEnvVars(content) { return []; }
 ```
 
-
-4. Test on at least 3 real open-source projects
-5. Open a PR with before/after AGENTS.md examples
+### Step 4: Add import resolution to `src/extractors/imports.js`
 
-
+If your language has resolvable local imports (not just package names), add a case in `extractImports()`:
 
-
+```js
+} else if (ext === '.ext') {
+  return extractYourLanguageImports(content, filePath, projectRoot);
+}
+```
+
+Then implement `extractYourLanguageImports()` at the bottom of the file. It should return an array of relative file paths (from project root) that actually exist on disk.
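The contract described in Step 4 could be sketched as follows. This is a minimal illustration, not Carto's actual implementation: it assumes a hypothetical language whose local imports are written as `import "./relative/path"` and resolve to files ending in `.ext`. The regex, the resolution rules, and the helper name `resolveLocalImport` are assumptions introduced for this example; only the function name and the "project-root-relative paths that exist on disk" requirement come from the text above.

```javascript
'use strict';

const fs = require('fs');
const path = require('path');

// Hypothetical import syntax: import "./relative/path"
const IMPORT_RE = /^\s*import\s+["']([^"']+)["']/gm;

// Assumed rule: only specifiers starting with "." refer to local files.
function resolveLocalImport(spec, filePath, projectRoot) {
  if (!spec.startsWith('.')) return null; // package name, not a local file
  const base = path.resolve(projectRoot, path.dirname(filePath), spec);
  // Try the specifier as written, then with the assumed '.ext' extension.
  for (const candidate of [base, base + '.ext']) {
    if (fs.existsSync(candidate) && fs.statSync(candidate).isFile()) {
      // Normalise to forward slashes so results match across platforms.
      return path.relative(projectRoot, candidate).split(path.sep).join('/');
    }
  }
  return null; // unresolved imports are dropped, per the contract
}

function extractYourLanguageImports(content, filePath, projectRoot) {
  const resolved = new Set(); // de-duplicate repeated imports
  for (const match of content.matchAll(IMPORT_RE)) {
    const rel = resolveLocalImport(match[1], filePath, projectRoot);
    if (rel) resolved.add(rel);
  }
  return [...resolved];
}
```

Checking existence on disk inside the resolver keeps package names and dangling imports out of the import graph, which is what the step asks for.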
+
+### Step 5: Add to `CODE_EXTS` in `src/store/sync-v2.js`
+
+```js
+const CODE_EXTS = new Set([
+  // ... existing ...
+  '.ext',
+]);
+```
+
+### Step 6: Add to `detectLanguage()` in `src/store/sync-v2.js`
+
+```js
+'.ext': 'yourlanguage',
+```
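Steps 5 and 6 register the same extension in two places, so it is easy to update one and forget the other. The sketch below shows one way a contributor might sanity-check that locally. Everything except the names `CODE_EXTS` and `detectLanguage` is an assumption: the set contents, the `LANGUAGE_BY_EXT` table, the body of `detectLanguage()`, and the helper `findRegistrationGaps` are hypothetical stand-ins, not the code in `sync-v2.js`.

```javascript
'use strict';

const path = require('path');

// Stand-in for the real set in src/store/sync-v2.js (contents assumed).
const CODE_EXTS = new Set(['.js', '.py', '.ext']);

// Assumed shape: a plain extension-to-language lookup table.
const LANGUAGE_BY_EXT = {
  '.js': 'javascript',
  '.py': 'python',
  '.ext': 'yourlanguage',
};

// Hypothetical body: look the file's extension up in the table.
function detectLanguage(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return LANGUAGE_BY_EXT[ext] || null;
}

// Reports extensions registered in one place but not the other.
function findRegistrationGaps() {
  const mapped = new Set(Object.keys(LANGUAGE_BY_EXT));
  return {
    missingFromCodeExts: [...mapped].filter(ext => !CODE_EXTS.has(ext)),
    missingFromDetectLanguage: [...CODE_EXTS].filter(ext => !mapped.has(ext)),
  };
}
```

If both arrays returned by `findRegistrationGaps()` are empty, the two registrations agree.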
 
-
-
-
-
+### Step 7: Test
+
+```bash
+# Test extraction on a real file
+node -e "
+const { loadLanguagePlugins, getPluginForFile } = require('./src/extractors/loader');
+const plugins = loadLanguagePlugins();
+const plugin = getPluginForFile(plugins, 'test.ext');
+const result = plugin.extract('your code here', 'test.ext');
+console.log(JSON.stringify(result, null, 2));
+"
+
+# Run correctness tests
+node test/correctness.js
+
+# Run full test suite
+npm test
+```
 
 ---
 
-## How to add a
+## How to add a framework extractor
+
+Framework-specific extraction (routes, models) lives inside the language plugin. Add regex patterns to the relevant `extractRoutes()` or `extractModels()` function.
 
-
+Example — adding Hono routes to the JS plugin:
 
 ```js
-
+// In src/extractors/languages/javascript.js, inside extractExpressRoutes():
+
+// Hono: app.get('/path', handler) — same pattern as Express, already covered
+// Hono with chaining: app.route('/api', apiRouter) — add if needed
 ```
 
-
+Test on at least 2 real open-source projects using the framework.
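For readers who have not seen the plugin internals, a regex-based route pattern of the kind described above might look like the following. This is a hedged sketch, not the contents of `extractExpressRoutes()`: the function name `extractMethodRoutes`, the regex, and the returned `{ method, path }` shape are assumptions made for illustration.

```javascript
'use strict';

// Matches calls such as app.get('/users', handler) or router.post("/items", fn).
// Requiring the path to start with "/" filters out lookups like cache.get('key').
const ROUTE_RE = /\b\w+\.(get|post|put|patch|delete)\(\s*(['"`])(\/[^'"`]*)\2/g;

function extractMethodRoutes(content) {
  const routes = [];
  for (const m of content.matchAll(ROUTE_RE)) {
    routes.push({ method: m[1].toUpperCase(), path: m[3] });
  }
  return routes;
}
```

Because this is plain pattern matching, it can also match route-like calls inside comments or string literals, which is one reason to verify output on real projects that use the framework.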
+
+---
+
+## How domain clustering works (V2)
+
+Domain detection uses **Leiden+CPM graph clustering** (`src/agents/leiden.js`). Files that import each other heavily cluster together. Domain names are inferred from path tokens, with keyword hints for well-known patterns.
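To make "CPM" concrete: the Constant Potts Model scores a partition by summing, over communities, the number of internal edges minus a resolution parameter times the number of possible node pairs in that community. The sketch below only evaluates that objective for an unweighted, undirected import graph. It does not perform the Leiden optimisation itself, and it is not taken from `src/agents/leiden.js`; the function name `cpmQuality` and its parameters are assumptions for illustration.

```javascript
'use strict';

// edges:      array of [fileA, fileB] import relationships (undirected, unweighted)
// membership: Map from file path to community id
// resolution: the CPM gamma parameter (higher values favour smaller communities)
function cpmQuality(edges, membership, resolution) {
  const sizes = new Map();         // community id -> number of files
  const internalEdges = new Map(); // community id -> edges with both ends inside

  for (const community of membership.values()) {
    sizes.set(community, (sizes.get(community) || 0) + 1);
  }
  for (const [a, b] of edges) {
    const ca = membership.get(a);
    if (ca !== undefined && ca === membership.get(b)) {
      internalEdges.set(ca, (internalEdges.get(ca) || 0) + 1);
    }
  }

  let quality = 0;
  for (const [community, n] of sizes) {
    const possiblePairs = (n * (n - 1)) / 2;
    quality += (internalEdges.get(community) || 0) - resolution * possiblePairs;
  }
  return quality;
}
```

For a triangle of files `a`, `b`, `c` joined by a single import to a pair `d`, `e`, with resolution 0.5, the two-community split scores 2 while putting all five files in one community scores 0, so the split is preferred. This matches the intuition that files importing each other heavily end up together.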
 
-For non-
+For non-SaaS repos, users can define custom domains in `carto.config.json`:
 
 ```json
 {
@@ -102,7 +192,7 @@ For non-web repos (CLIs, desktop apps, compilers), users can define their own do
 }
 ```
 
-
+The keyword seeds in `src/store/sync-v2.js` (the `keywordSeeds` object) can be extended for new domain types.
 
 ---
 
@@ -113,6 +203,7 @@ Custom config overrides the default domain map entirely for that project.
 - **Test on unknown repos.** Don't just test on projects you wrote. Find a real open-source repo using the framework and verify the output is correct.
 - **No cloud, no telemetry, no tracking.** Carto is local only. Forever. Don't add any network calls except the existing npm update check.
 - **No paid features.** Free forever. MIT. Don't propose monetization.
+- **tree-sitter first.** For new languages, always use tree-sitter for imports and symbols. Only use regex for framework-specific patterns (routes, models) that tree-sitter queries can't easily express.
 
 ---
 
@@ -125,6 +216,8 @@ npm install
 node src/cli/index.js init # test in any project
 node src/cli/index.js serve # test MCP server
 npm test # run test suite (30 tests)
+node test/correctness.js # run correctness tests (31 tests)
+node test/benchmark.js # run benchmarks against real repos
 ```
 
 ---
@@ -132,20 +225,23 @@ npm test # run test suite (30 tests)
 ## PR checklist
 
 - [ ] Tested on at least 2-3 real open-source projects
-- [ ] Before/after AGENTS.md included in PR description
-- [ ] Plugin
+- [ ] Before/after AGENTS.md or `get_architecture` output included in PR description
+- [ ] Plugin uses tree-sitter for imports/symbols (not Babel or regex for the hot path)
+- [ ] Plugin returns all fields including `_tsImports` and `_tsSymbols`
+- [ ] Import resolution added to `src/extractors/imports.js` if language has local imports
+- [ ] Extension added to `CODE_EXTS` and `detectLanguage()` in `sync-v2.js`
 - [ ] No changes to merger logic (unless explicitly fixing a merger bug)
 - [ ] No network calls added
-- [ ] `
-- [ ] `
+- [ ] `npm test` passes (30/30)
+- [ ] `node test/correctness.js` passes (31/31)
 
 ---
 
 ## Issues
 
-- **Bug**: Open an issue with the project type, command run, and what
+- **Bug**: Open an issue with the project type, command run, and what output was produced vs expected.
 - **Language request**: Open an issue titled "Language: [name]"
 - **Framework request**: Open an issue titled "Framework: [name]"
-- **Domain
+- **Domain clustering issue**: Open an issue titled "Domains: [repo name]" with the repo URL and what domains were detected vs what you expected.
 
 All issues acknowledged within 24 hours.