carto-md 1.1.0 → 1.1.2

package/CONTRIBUTING.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Contributing to Carto
2
2
 
3
- Carto is free, open source, and community-maintained. The core team owns the merger logic, AST engine, and CLI. The community owns language and framework extractors.
3
+ Carto is free, open source, and community-maintained. The core team owns the merger logic, MCP server, graph clustering, and CLI. The community owns language and framework extractors.
4
4
 
5
5
  ---
6
6
 
@@ -18,14 +18,15 @@ Wanted: Go, Rust, Ruby, Java, PHP, C#.
18
18
 
19
19
  Framework-specific route and model extraction lives in `src/extractors/`. Each framework is an isolated module.
20
20
 
21
- Currently supported: FastAPI, Express, Next.js App Router, Prisma, HTML fetch(), Plumber, Shiny.
21
+ Currently supported: FastAPI, Express, Next.js App Router, Prisma, tRPC, HTML fetch(), Plumber, Shiny.
22
22
 
23
23
  Wanted: Django, Rails, Laravel, NestJS, Hono, Gin, Spring.
24
24
 
25
25
  ### Tier 3 — Core (review carefully before merging)
26
26
 
27
- - `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes = project dies. Changes here need strong justification and full test coverage.
28
- - `src/ast/` — AST engine. Wrong extraction = wrong AGENTS.md = AI gets confident with wrong facts. Worse than no AGENTS.md.
27
+ - `src/agents/merger.js` — merger logic. One bad merge = developer loses manual notes = project dies.
28
+ - `src/agents/domains.js` — graph-based domain clustering. Wrong clusters = wrong context files.
29
+ - `src/mcp/server.js` — MCP server tools. Breaking changes affect Kiro/Cursor/Claude integration.
29
30
  - `src/detector/` — framework detection logic.
30
31
  - `src/cli/` — CLI commands.
31
32
 
@@ -34,36 +35,46 @@ Wanted: Django, Rails, Laravel, NestJS, Hono, Gin, Spring.
34
35
  ## How to add a language
35
36
 
36
37
  1. Create `src/extractors/languages/yourlanguage.js`
37
- 2. Export a single function: `extractFromFile(filePath, fileContent)`
38
- 3. Return:
38
+ 2. Export a plugin object:
39
39
  ```js
40
- {
41
- routes: [{ method, path, functionName }],
42
- models: [{ className, fields: [{ name, type }] }],
43
- functions: [{ name, params }],
44
- envVars: ['VAR_NAME']
45
- }
40
+ module.exports = {
41
+ name: 'yourlanguage',
42
+ extensions: ['.ext'],
43
+ extract(content, relPath) {
44
+ return {
45
+ routes: [{ method, path, functionName }],
46
+ models: [{ className, fields: [{ name, type }] }],
47
+ functions: [{ name, params, returnType }],
48
+ envVars: ['VAR_NAME'],
49
+ dbTables: [{ tableName, modelName }],
50
+ fetches: [],
51
+ storageKeys: []
52
+ };
53
+ }
54
+ };
46
55
  ```
47
- 4. Add it to `src/extractors/loader.js` language map
48
- 5. Test on at least 3 real open-source projects
49
- 6. Open a PR with before/after AGENTS.md examples
56
+ 3. The loader auto-discovers it — no changes to `loader.js` needed
57
+ 4. Test on at least 3 real open-source projects
58
+ 5. Open a PR with before/after AGENTS.md examples
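To make the new plugin shape concrete, here is a minimal sketch of what a language plugin following it might look like. The Ruby target, the Sinatra-style route pattern, the regexes, and the `null` values for `functionName` and `returnType` are all assumptions for illustration and are not taken from Carto's source; only the `name` / `extensions` / `extract(content, relPath)` shape and the seven returned keys come from the hunk above.

```javascript
'use strict';

// Hypothetical plugin sketch. The regexes are deliberately simple and will
// miss multi-line or metaprogrammed definitions.
const plugin = {
  name: 'ruby',
  extensions: ['.rb'],
  // relPath is accepted to match the documented signature; this sketch does not use it.
  extract(content, relPath) {
    const routes = [];
    const functions = [];
    const envVars = new Set();

    // Sinatra-style routes: get '/path' do
    const routeRe = /^\s*(get|post|put|patch|delete)\s+['"]([^'"]+)['"]\s+do\b/gm;
    for (const m of content.matchAll(routeRe)) {
      routes.push({ method: m[1].toUpperCase(), path: m[2], functionName: null });
    }

    // Method definitions: def name(a, b)
    const defRe = /^\s*def\s+([A-Za-z_]\w*[?!]?)\s*(?:\(([^)]*)\))?/gm;
    for (const m of content.matchAll(defRe)) {
      const params = m[2] ? m[2].split(',').map((p) => p.trim()).filter(Boolean) : [];
      functions.push({ name: m[1], params, returnType: null });
    }

    // Environment lookups: ENV['KEY'] and ENV.fetch('KEY')
    const envRe = /ENV(?:\.fetch\(|\[)\s*['"]([A-Z0-9_]+)['"]/g;
    for (const m of content.matchAll(envRe)) envVars.add(m[1]);

    // Always return all seven keys, using empty arrays for anything not extracted.
    return {
      routes,
      models: [],
      functions,
      envVars: [...envVars],
      dbTables: [],
      fetches: [],
      storageKeys: []
    };
  }
};

module.exports = plugin;

// Quick check against a small in-memory sample.
const sample = [
  "require 'sinatra'",
  "get '/health' do",
  "  'ok'",
  'end',
  'def charge(user, amount)',
  "  Stripe.api_key = ENV['STRIPE_KEY']",
  'end'
].join('\n');
const result = plugin.extract(sample, 'app.rb');
console.log(JSON.stringify(result, null, 2));
```

Returning every key, even when empty, keeps the result shape uniform for whatever consumes it; whether Carto's loader requires that is not confirmed by this diff.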
50
59
 
51
60
  ---
52
61
 
53
62
  ## How to add a framework extractor
54
63
 
55
- 1. Create `src/extractors/yourframework.js`
56
- 2. Export:
64
+ 1. Add detection to `src/detector/framework.js`
65
+ 2. Add route/model patterns to the relevant language plugin or create a new extractor in `src/extractors/`
66
+ 3. Test on at least 2 real projects using that framework
67
+ 4. Open a PR with before/after AGENTS.md examples
68
+
69
+ ---
70
+
71
+ ## How to add a domain keyword
72
+
73
+ Domain clustering lives in `src/agents/domains.js`. The `DOMAIN_MAP` array maps keywords to domain names. If your framework creates a new domain category, add it:
74
+
57
75
  ```js
58
- {
59
- detect(projectRoot, files) → boolean,
60
- extractRoutes(filePath, fileContent) → [{ method, path, functionName }],
61
- extractModels(filePath, fileContent) → [{ name, fields: [{ name, type }] }]
62
- }
76
+ { keywords: ['graphql', 'resolver', 'mutation'], domain: 'GRAPHQL' },
63
77
  ```
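The diff shows the shape of a `DOMAIN_MAP` entry but not how entries are consulted. The sketch below shows one plausible way a keyword map could assign a file to a domain. The `AUTH` entry, the `matchDomain` helper, the substring match on the file path, and the `CORE` fallback are all assumptions; the README states that clustering also uses the import graph, which this sketch ignores.

```javascript
'use strict';

// Entries follow the shape shown in CONTRIBUTING.md. The AUTH entry and the
// matching logic are assumptions, not the actual contents of domains.js.
const DOMAIN_MAP = [
  { keywords: ['graphql', 'resolver', 'mutation'], domain: 'GRAPHQL' },
  { keywords: ['auth', 'session', 'jwt'], domain: 'AUTH' }
];

// Hypothetical helper: the first entry with a keyword contained in the path wins.
function matchDomain(relPath, fallback = 'CORE') {
  const haystack = relPath.toLowerCase();
  for (const { keywords, domain } of DOMAIN_MAP) {
    if (keywords.some((k) => haystack.includes(k))) return domain;
  }
  return fallback;
}

const domains = [
  'src/graphql/userResolver.ts',
  'src/lib/jwt.ts',
  'src/utils/format.ts'
].map((p) => matchDomain(p));
console.log(domains);
```

Because the first matching entry wins in this sketch, the order of entries in the array would matter for paths that hit keywords from more than one domain.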
64
- 3. Add detection logic to `src/detector/framework.js`
65
- 4. Test on at least 2 real projects using that framework
66
- 5. Open a PR with before/after AGENTS.md examples
67
78
 
68
79
  ---
69
80
 
@@ -72,7 +83,7 @@ Wanted: Django, Rails, Laravel, NestJS, Hono, Gin, Spring.
72
83
  - **Never break the merger.** Manual sections in AGENTS.md are sacred. If your change could corrupt them, it needs a full merger test suite pass.
73
84
  - **Wrong output is worse than no output.** If your extractor produces incorrect routes or models, AI gets confident with wrong facts. Only ship when accurate on real projects.
74
85
  - **Test on unknown repos.** Don't just test on projects you wrote. Find a real open-source repo using the framework and verify the output is correct.
75
- - **No cloud, no telemetry, no tracking.** Carto is local only. Forever. Don't add any network calls.
86
+ - **No cloud, no telemetry, no tracking.** Carto is local only. Forever. Don't add any network calls except the existing npm update check.
76
87
  - **No paid features.** Free forever. MIT. Don't propose monetization.
77
88
 
78
89
  ---
@@ -84,6 +95,8 @@ git clone https://github.com/theanshsonkar/carto
84
95
  cd carto
85
96
  npm install
86
97
  node src/cli/index.js init # test in any project
98
+ node src/cli/index.js serve # test MCP server
99
+ npm test # run test suite
87
100
  ```
88
101
 
89
102
  ---
@@ -101,8 +114,9 @@ node src/cli/index.js init # test in any project
101
114
 
102
115
  ## Issues
103
116
 
104
- - **Bug**: Open an issue with the project type, command run, and what AGENTS.md produced vs what you expected.
105
- - **Language request**: Open an issue titled "Language: [name]" — someone from the community will pick it up.
106
- - **Framework request**: Open an issue titled "Framework: [name]".
117
+ - **Bug**: Open an issue with the project type, command run, and what AGENTS.md or domain files produced vs what you expected.
118
+ - **Language request**: Open an issue titled "Language: [name]"
119
+ - **Framework request**: Open an issue titled "Framework: [name]"
120
+ - **Domain keyword**: Open an issue titled "Domain: [name]" if your codebase doesn't cluster correctly
107
121
 
108
122
  All issues acknowledged within 24 hours.
package/README.md CHANGED
@@ -4,13 +4,13 @@
4
4
  [![MIT License](https://img.shields.io/badge/license-MIT-blue)](LICENSE)
5
5
  [![npm downloads](https://img.shields.io/npm/dm/carto-md)](https://www.npmjs.com/package/carto-md)
6
6
 
7
- **Maps your codebase so AI stops guessing. Your code changes. AGENTS.md updates. Every AI always knows.**
7
+ **The codebase intelligence layer every AI tool queries instead of guessing.**
8
8
 
9
9
  ```bash
10
10
  npm install -g carto-md
11
11
  ```
12
12
 
13
- Carto auto-generates and maintains your `AGENTS.md` — the standard file every AI coding tool reads for project context. Every time you save, your routes, models, functions, and dependencies are extracted and kept current.
13
+ Carto maps your codebase — routes, models, import graph, domain context — and exposes it as a live MCP server that Kiro, Cursor, and Claude can query mid-task. No hallucinations about your own project. No rebuilding context every session.
14
14
 
15
15
  ---
16
16
 
@@ -19,13 +19,23 @@ Carto auto-generates and maintains your `AGENTS.md` — the standard file every
19
19
  AI coding tools are blind to your actual project. Every session starts from zero.
20
20
 
21
21
  - Claude hallucinates your schema
22
- - Copilot suggests the wrong field names
22
+ - Copilot suggests wrong field names
23
23
  - Kiro asks what framework you're using
24
24
  - You rebuild context manually, every time
25
25
 
26
- `AGENTS.md` fixes this — a file in your project root that every AI tool reads. But it's static. You write it manually. It gets stale the moment your code changes.
26
+ `AGENTS.md` fixes this — a standard file every AI tool reads. But it's static. You write it manually. It gets stale the moment your code changes.
27
27
 
28
- **Carto makes it live.**
28
+ **Carto makes it live. And queryable.**
29
+
30
+ | | Without Carto | With Carto |
31
+ |---|---|---|
32
+ | Knows blast radius before editing | Never | Always, instantly |
33
+ | Knows which routes break | Never | Exact list |
34
+ | Plans multi-file changes | Guesses | Fully informed |
35
+ | Hallucinates field names | Often | Never |
36
+ | Understands codebase on session start | 10–20 min | 0 |
37
+ | Works across Kiro, Cursor, Claude, Copilot | Separately | One shared graph |
38
+ | Stays current as code changes | Goes stale | Live on every save |
29
39
 
30
40
  ---
31
41
 
@@ -33,16 +43,15 @@ AI coding tools are blind to your actual project. Every session starts from zero
33
43
 
34
44
  Same task, two Claude sessions: *"Add a `notes` field to the booking model."*
35
45
 
36
- **Without AGENTS.md:**
46
+ **Without Carto:**
37
47
  - Wrong API route: suggested `POST /api/bookings` → actual is `POST /v2/bookings`
38
48
  - Wrong handler: suggested `handleNewBooking.ts` → not the creation path
39
- - Wrong file paths: pointed to v1 API (`apps/api/v1/...`) → v1 is legacy
49
+ - Wrong file paths: pointed to v1 API → v1 is legacy
40
50
  - Wrong tRPC file: `bookings.tsx` → actual is `bookings/_router.tsx`
41
51
  - Field list: ~15 fields guessed → missing 20+ real fields
42
- - Couldn't proceed without follow-up: *"Want me to write the exact diffs once you confirm the codebase location?"*
43
52
 
44
- **With AGENTS.md (generated by Carto):**
45
- - Correct API route: `POST /v2/bookings`
53
+ **With Carto:**
54
+ - Correct API route ✅
46
55
  - Correct controller path ✅
47
56
  - Correct tRPC file ✅
48
57
  - All 35+ booking fields returned accurately ✅
@@ -52,66 +61,146 @@ Same task, two Claude sessions: *"Add a `notes` field to the booking model."*
52
61
 
53
62
  Not smarter AI. The same AI with accurate facts.
54
63
 
55
- *Stress tested on cal.com (5,018 files): 87% route coverage, 100% model field accuracy, import graph with zero phantom links.*
64
+ ---
65
+
66
+ ## How it works
67
+
68
+ ```
69
+ carto init
70
+
71
+ Carto maps your codebase
72
+ → AGENTS.md (79 lines — lean map every AI reads)
73
+ → .carto/context/AUTH.md, PAYMENTS.md, TRPC.md, DATABASE.md
74
+ → .carto/map.json (import graph, routes, blast radius)
75
+ → MCP server auto-wired into Kiro, Cursor, Claude Desktop
76
+
77
+ carto watch (keeps everything live on every file save)
78
+ carto serve (MCP server — AI tools query graph mid-task)
79
+ ```
56
80
 
57
81
  ---
58
82
 
59
- ## Know what breaks before you break it
83
+ ## MCP — AI queries your codebase live
60
84
 
61
- Most production bugs aren't logic errors. They're *"I didn't know X depended on Y."*
85
+ `carto init` auto-wires the MCP config into Kiro, Cursor, and Claude Desktop automatically. When Kiro or Cursor is mid-task, it can call Carto directly instead of guessing:
62
86
 
63
- `carto impact` makes that invisible knowledge visible — before you touch anything.
87
+ **`get_blast_radius("src/lib/payments.ts")`**
88
+ ```
89
+ Files affected:
90
+ → apps/web/app/api/checkout/route.ts
91
+ → apps/web/app/api/webhook/route.ts
92
+ → packages/trpc/routers/billing.ts
93
+
94
+ Routes at risk:
95
+ → POST /api/checkout
96
+ → POST /api/webhook
97
+ → POST /trpc/createSubscription
98
+ ```
64
99
 
65
- ```bash
66
- carto impact app/models.py
100
+ **`get_routes()`**
101
+ ```
102
+ | Method | Path | Handler |
103
+ |--------|-----------------------------|---------------------|
104
+ | POST | /api/auth/signup | POST |
105
+ | GET | /api/auth/oauth/me | GET |
106
+ | POST | /trpc/createBooking | createBooking |
107
+ | GET | /trpc/getAvailability | getAvailability |
108
+ | ... | ... | ... |
109
+ ```
67
110
 
68
- # Impact analysis: app/models.py
69
- #
70
- # Imported by:
71
- # → app/main.py
72
- # → app/rules.py
73
- # → app/scoring.py
74
- # → app/aws_collector.py
75
- # → tests/conftest.py
76
- #
77
- # Routes affected:
78
- # → POST /analyze
79
- # → GET /history
80
- # → POST /simulate
81
- # → ... 12 more
82
- #
83
- # Risk: HIGH — 5 files depend on this
111
+ **`get_domain("AUTH")`**
112
+ Returns `AUTH.md` — all auth routes, session models, JWT functions, env vars.
113
+
114
+ **`get_structure()`**
115
+ Returns import graph, entry points, high impact files, tech stack.
116
+
117
+ ### Manual MCP config (if auto-wire didn't detect your IDE)
118
+
119
+ **Kiro** — add to `~/.kiro/settings/mcp.json`:
120
+ ```json
121
+ {
122
+ "mcpServers": {
123
+ "carto": {
124
+ "command": "carto",
125
+ "args": ["serve"],
126
+ "cwd": "/path/to/your/project"
127
+ }
128
+ }
129
+ }
84
130
  ```
85
131
 
86
- No AI. No cloud. Runs in under a second. Locally, from your import graph.
132
+ **Cursor** — add to `~/.cursor/mcp.json`:
133
+ ```json
134
+ {
135
+ "mcpServers": {
136
+ "carto": {
137
+ "command": "carto",
138
+ "args": ["serve"],
139
+ "cwd": "/path/to/your/project"
140
+ }
141
+ }
142
+ }
143
+ ```
144
+
145
+ **Claude Desktop** — add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
146
+ ```json
147
+ {
148
+ "mcpServers": {
149
+ "carto": {
150
+ "command": "carto",
151
+ "args": ["serve"],
152
+ "cwd": "/path/to/your/project"
153
+ }
154
+ }
155
+ }
156
+ ```
87
157
 
88
- Make it a habit: before touching any file, run `carto impact` first. 10 seconds. Could save hours.
158
+ Then run `carto serve` in your project directory alongside `carto watch`.
89
159
 
90
160
  ---
91
161
 
92
- ## Why not just paste your code?
162
+ ## Domain context files
93
163
 
94
- Context windows are large now. But pasting code means:
164
+ Large codebases kill AI accuracy. A 2900-line AGENTS.md means AI reads 500 lines and guesses the rest.
95
165
 
96
- - You decide what's relevant — you're often wrong
97
- - AI sees a snapshot, not your live state
98
- - Bigger context ≠ better context
166
+ Carto splits context by domain automatically:
99
167
 
100
- Carto gives AI the map. You give AI the problem. Different jobs.
168
+ ```
169
+ AGENTS.md → 79 lines, always loaded
170
+ .carto/context/
171
+ AUTH.md → auth routes, session models, JWT functions
172
+ PAYMENTS.md → Stripe routes, billing models
173
+ TRPC.md → all tRPC procedures
174
+ DATABASE.md → every model, schema, table
175
+ EVENTS.md → webhooks, queues, cron jobs
176
+ CORE.md → shared utilities
177
+ ```
178
+
179
+ AI reads AGENTS.md always. Then reads only the relevant domain file for the current task. 400 lines of exact context instead of 2900 lines of everything.
180
+
181
+ Domain assignment uses your import graph — files that import each other cluster together, regardless of folder names.
101
182
 
102
183
  ---
103
184
 
104
- ## How it works
185
+ ## Know what breaks before you break it
105
186
 
187
+ ```bash
188
+ carto impact apps/web/app/api/auth/signup/route.ts
189
+
190
+ # Impact analysis: apps/web/app/api/auth/signup/route.ts
191
+ #
192
+ # Imported by:
193
+ # → apps/web/app/api/auth/signup/handlers/calcomSignupHandler.ts
194
+ # → apps/web/app/api/auth/signup/handlers/selfHostedHandler.ts
195
+ #
196
+ # Routes at risk:
197
+ # → POST /api/auth/signup
198
+ # → ALL /api/auth/signup/handlers
199
+ #
200
+ # Risk: MEDIUM
106
201
  ```
107
- You save a file
108
-
109
- Carto extracts routes, models, functions, env vars
110
-
111
- AGENTS.md updated in 300ms
112
-
113
- Cursor, Copilot, Kiro, Codex, Claude — all read current truth
114
- ```
202
+
203
+ No AI. No cloud. Runs in under a second. From your live import graph.
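A blast-radius query of this kind can be answered with a reverse traversal of the import graph. The sketch below is an illustration under stated assumptions, not Carto's implementation: the graph shape (file mapped to the files it imports), the function names, and the choice to follow importers transitively are all hypothetical. The sample output above lists only an "Imported by" set, so the real command may stop at direct importers.

```javascript
'use strict';

// Hypothetical import graph: each file maps to the files it imports.
const imports = {
  'app/main.js': ['app/routes.js'],
  'app/routes.js': ['app/models.js', 'app/auth.js'],
  'app/auth.js': ['app/models.js'],
  'app/models.js': []
};

// Invert the graph so each file maps to the set of files that import it.
function buildImporters(graph) {
  const importers = new Map();
  for (const [file, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      if (!importers.has(dep)) importers.set(dep, new Set());
      importers.get(dep).add(file);
    }
  }
  return importers;
}

// Breadth-first walk over importers: every file that depends on `target`,
// directly or through a chain of imports.
function blastRadius(graph, target) {
  const importers = buildImporters(graph);
  const seen = new Set();
  const queue = [target];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const parent of importers.get(current) || []) {
      if (!seen.has(parent)) {
        seen.add(parent);
        queue.push(parent);
      }
    }
  }
  return [...seen].sort();
}

const affected = blastRadius(imports, 'app/models.js');
console.log(affected);
```

The `seen` set keeps the walk finite even when the import graph contains cycles.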
115
204
 
116
205
  ---
117
206
 
@@ -121,7 +210,7 @@ Cursor, Copilot, Kiro, Codex, Claude — all read current truth
121
210
  npm install -g carto-md
122
211
  ```
123
212
 
124
- Or run without installing:
213
+ Or without installing:
125
214
 
126
215
  ```bash
127
216
  npx carto-md init
@@ -132,16 +221,18 @@ npx carto-md init
132
221
  ## Usage
133
222
 
134
223
  ```bash
135
- # 1. Go to your project
136
224
  cd your-project
137
-
138
- # 2. Run once — like git init
139
225
  carto init
140
226
  ```
141
227
 
142
- That's it. Carto installs a git hook. Every `git commit` syncs AGENTS.md automatically — no watching, no manual runs, nothing to remember.
228
+ That's it. Carto:
229
+ - Maps your codebase
230
+ - Generates AGENTS.md + domain context files
231
+ - Auto-wires MCP into Kiro, Cursor, Claude Desktop
232
+ - Installs a git hook — syncs on every commit
143
233
 
144
- Want live updates on every file save too? Run `carto watch` in a background terminal.
234
+ Run `carto watch` in background for live updates on every file save.
235
+ Run `carto serve` to start the MCP server manually if needed.
145
236
 
146
237
  ---
147
238
 
@@ -149,19 +240,14 @@ Want live updates on every file save too? Run `carto watch` in a background term
149
240
 
150
241
  | Command | What it does |
151
242
  |---------|-------------|
152
- | `carto init` | Detect stack, generate AGENTS.md, install git hook — auto-syncs on every commit |
153
- | `carto watch` | Live updates on every file save — optional, for between commits |
243
+ | `carto init` | Map codebase, generate context files, wire MCP into IDEs |
244
+ | `carto watch` | Live updates on every file save |
154
245
  | `carto sync` | One-time manual refresh |
246
+ | `carto serve` | Start MCP server for Kiro/Cursor/Claude queries |
155
247
  | `carto impact <file>` | Show blast radius before touching a file |
156
248
  | `carto remove` | Remove AGENTS.md and .carto/ from this project |
157
249
  | `carto --version` | Show version |
158
250
 
159
- **When to use each:**
160
- - `init` — once per project, sets everything up
161
- - `watch` — optional, if you want updates between commits
162
- - `sync` — if you skipped watch and need a fresh snapshot
163
- - `impact` — before editing anything critical
164
-
165
251
  ---
166
252
 
167
253
  ## Works with
@@ -170,7 +256,7 @@ Want live updates on every file save too? Run `carto watch` in a background term
170
256
  |----------|------------|
171
257
  | Python | FastAPI, Pydantic |
172
258
  | JavaScript | Express, Next.js |
173
- | TypeScript | Express, Next.js, Prisma |
259
+ | TypeScript | Express, Next.js, Prisma, tRPC |
174
260
  | R | Plumber, Shiny, R6, S7 |
175
261
  | HTML | fetch() calls |
176
262
 
@@ -178,22 +264,22 @@ More languages via community — open an issue or see [CONTRIBUTING.md](CONTRIBU
178
264
 
179
265
  ---
180
266
 
181
- ## What gets extracted automatically
267
+ ## What gets extracted
182
268
 
183
- - API routes — FastAPI, Express, Next.js App Router
184
- - Data models — Pydantic, Prisma
269
+ - API routes — FastAPI, Express, Next.js App Router, tRPC procedures
270
+ - Data models — Pydantic, Prisma, TypeScript interfaces
185
271
  - Function signatures — across all files
186
- - Dependencies — from `package.json` / `requirements.txt`
187
- - Environment variable names — never values
188
- - Frontend API calls — from `fetch()` patterns
189
272
  - Import graph — which files depend on which
190
- - Database tables
273
+ - Domain clusters — AUTH, PAYMENTS, TRPC, DATABASE, EVENTS
274
+ - Blast radius — what breaks if you change a file
275
+ - Environment variable names — never values
276
+ - Database tables — SQLAlchemy, Django ORM, Prisma
191
277
 
192
278
  ---
193
279
 
194
280
  ## What Carto never touches
195
281
 
196
- Your manual sections — architecture decisions, active bugs, business rules, coding conventions — stay yours forever. Carto only rewrites between its own markers:
282
+ Your manual sections stay yours forever. Carto only rewrites between its own markers:
197
283
 
198
284
  ```
199
285
  <!-- CARTO:AUTO:START -->
@@ -213,6 +299,7 @@ Carto fixes **factual hallucination about your own project**:
213
299
  - AI guessing wrong field names → fixed
214
300
  - AI assuming wrong framework → fixed
215
301
  - AI guessing wrong DB schema → fixed
302
+ - AI not knowing blast radius → fixed
216
303
 
217
304
  What Carto does not fix: AI reasoning badly, wrong implementation logic, misunderstanding what you want. Carto makes AI **accurate** about your project. Not smarter. Accurate. Different thing.
218
305
 
@@ -220,11 +307,11 @@ What Carto does not fix: AI reasoning badly, wrong implementation logic, misunde
220
307
 
221
308
  ## AI tools that read AGENTS.md
222
309
 
223
- Drop the file in your project root. Each tool picks it up via its own context config:
224
-
225
- - **Cursor** — via context rules
310
+ - **Cursor** — via context rules + MCP
226
311
  - **GitHub Copilot** — via workspace instructions
227
- - **Kiro** — natively
312
+ - **Kiro** — natively + MCP
313
+ - **Claude Desktop** — via MCP
314
+ - **Claude Code** — natively
228
315
  - **Codex** — natively
229
316
  - **VS Code** — via workspace context
230
317
  - **Gemini CLI** — natively
@@ -273,4 +360,4 @@ MIT — free forever.
273
360
 
274
361
  ---
275
362
 
276
- *Built because AGENTS.md won. Someone had to keep it alive.*
363
+ *Built because AGENTS.md won. Someone had to keep it alive — and make it queryable.*
package/index.js ADDED
@@ -0,0 +1,20 @@
+ 'use strict';
+
+ /**
+  * carto-md — public module API
+  *
+  * Usage:
+  *   const { Carto } = require('carto-md');
+  *   const carto = new Carto();
+  *   await carto.index('/path/to/project');
+  *
+  *   // Get everything Kepler needs for a file
+  *   const ctx = carto.getContextForFile('src/auth/auth.service.ts');
+  *
+  *   // Listen for live updates
+  *   carto.on('updated', ({ file, blastRadius }) => { ... });
+  */
+
+ const { Carto } = require('./src/engine/carto');
+
+ module.exports = { Carto };
package/package.json CHANGED
@@ -1,11 +1,11 @@
1
1
  {
2
2
  "name": "carto-md",
3
- "version": "1.1.0",
3
+ "version": "1.1.2",
4
4
  "description": "The context layer for AI-native development.",
5
5
  "bin": {
6
6
  "carto": "src/cli/index.js"
7
7
  },
8
- "main": "./src/sync.js",
8
+ "main": "./index.js",
9
9
  "scripts": {
10
10
  "test": "node test/test.js"
11
11
  },
@@ -0,0 +1,84 @@
+ 'use strict';
+
+ const fs = require('fs');
+ const path = require('path');
+ const crypto = require('crypto');
+
+ function getHashPath(projectRoot) {
+   return path.join(projectRoot, '.carto', 'hashes.json');
+ }
+
+ function loadHashes(projectRoot) {
+   try {
+     const raw = fs.readFileSync(getHashPath(projectRoot), 'utf-8');
+     return JSON.parse(raw);
+   } catch {
+     return {};
+   }
+ }
+
+ function saveHashes(projectRoot, hashes) {
+   const hashPath = getHashPath(projectRoot);
+   const tmp = hashPath + '.tmp';
+   try {
+     fs.writeFileSync(tmp, JSON.stringify(hashes, null, 2), 'utf-8');
+     fs.renameSync(tmp, hashPath);
+   } catch {}
+ }
+
+ function hashContent(content) {
+   return crypto.createHash('sha1').update(content).digest('hex');
+ }
+
+ /**
+  * computeChangedFiles(filePaths, storedHashes, projectRoot)
+  * Returns { changed: string[], unchanged: string[], hashes: object }
+  * changed = files whose content hash differs from stored
+  * unchanged = files whose hash matches — can skip re-parsing
+  */
+ function computeChangedFiles(filePaths, storedHashes, projectRoot) {
+   const changed = [];
+   const unchanged = [];
+   const newHashes = { ...storedHashes };
+
+   for (const filePath of filePaths) {
+     const relPath = path.relative(projectRoot, filePath);
+     let content;
+     try {
+       content = fs.readFileSync(filePath, 'utf-8');
+     } catch {
+       continue;
+     }
+     const hash = hashContent(content);
+     if (storedHashes[relPath] === hash) {
+       unchanged.push(filePath);
+     } else {
+       changed.push(filePath);
+       newHashes[relPath] = hash;
+     }
+   }
+
+   return { changed, unchanged, hashes: newHashes };
+ }
+
+ /**
+  * updateFileHash(projectRoot, relPath, content)
+  * Updates the hash for a single file after incremental re-index.
+  */
+ function updateFileHash(projectRoot, relPath, content) {
+   const hashes = loadHashes(projectRoot);
+   hashes[relPath] = hashContent(content);
+   saveHashes(projectRoot, hashes);
+ }
+
+ /**
+  * removeFileHash(projectRoot, relPath)
+  * Removes hash entry when a file is deleted.
+  */
+ function removeFileHash(projectRoot, relPath) {
+   const hashes = loadHashes(projectRoot);
+   delete hashes[relPath];
+   saveHashes(projectRoot, hashes);
+ }
+
+ module.exports = { loadHashes, saveHashes, hashContent, computeChangedFiles, updateFileHash, removeFileHash };
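To see how the added change detection behaves across two passes, the sketch below copies `hashContent` and `computeChangedFiles` from the hunk above (lightly condensed) and runs them against a scratch directory. The temp directory, file names, and file contents are hypothetical; the module's file path is not shown in this diff, so the functions are inlined rather than imported.

```javascript
'use strict';

const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

// Same hashing as the added module: SHA-1 hex digest of the file content.
function hashContent(content) {
  return crypto.createHash('sha1').update(content).digest('hex');
}

// Condensed copy of computeChangedFiles from the hunk above.
function computeChangedFiles(filePaths, storedHashes, projectRoot) {
  const changed = [];
  const unchanged = [];
  const hashes = { ...storedHashes };
  for (const filePath of filePaths) {
    const relPath = path.relative(projectRoot, filePath);
    let content;
    try {
      content = fs.readFileSync(filePath, 'utf-8');
    } catch {
      continue; // unreadable files are skipped, as in the original
    }
    const hash = hashContent(content);
    if (storedHashes[relPath] === hash) {
      unchanged.push(filePath);
    } else {
      changed.push(filePath);
      hashes[relPath] = hash;
    }
  }
  return { changed, unchanged, hashes };
}

// Hypothetical scratch project: two files in a temp directory.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'carto-hash-demo-'));
const a = path.join(root, 'a.js');
const b = path.join(root, 'b.js');
fs.writeFileSync(a, 'module.exports = 1;\n');
fs.writeFileSync(b, 'module.exports = 2;\n');

// First pass: nothing is stored yet, so both files count as changed.
const first = computeChangedFiles([a, b], {}, root);

// Edit only b.js, then run a second pass against the hashes from the first.
fs.writeFileSync(b, 'module.exports = 3;\n');
const second = computeChangedFiles([a, b], first.hashes, root);

const firstChanged = first.changed.map((f) => path.basename(f));
const secondChanged = second.changed.map((f) => path.basename(f));
const secondUnchanged = second.unchanged.map((f) => path.basename(f));
console.log({ firstChanged, secondChanged, secondUnchanged });

// Clean up the scratch directory.
fs.rmSync(root, { recursive: true, force: true });
```

One thing this makes visible: because the returned map starts as a copy of `storedHashes`, an entry for a file that no longer appears in `filePaths` stays in the map. Clearing it appears to be the job of `removeFileHash`.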