ctxloom-pro 1.5.3 → 1.5.5

package/README.md CHANGED
@@ -47,7 +47,7 @@ The full first-run flow is **one install + one trial + one init per project.** E
  npm install -g ctxloom-pro
  ```
 
- > **For local trial / dev use the unpinned command above is fine.** For unattended CI usage, pin to the exact version (`ctxloom-pro@1.5.3`) so future CLI releases don't silently desync your agent-spec coverage — see the workflow example below.
+ > **For local trial / dev use the unpinned command above is fine.** For unattended CI usage, pin to the exact version (`ctxloom-pro@1.5.5`) so future CLI releases don't silently desync your agent-spec coverage — see the workflow example below.
 
  ### 2 — Start your free trial (once per email)
 
@@ -361,7 +361,7 @@ jobs:
        # Exact pin (not `@^1`) so future CLI releases that add/remove MCP
        # tools don't silently desync your reviewer-agent specs. Bump on
        # every release; see CHANGELOG.md for the live version table.
-       - run: npm install -g ctxloom-pro@1.5.3
+       - run: npm install -g ctxloom-pro@1.5.5
        - run: ctxloom index
        - run: ctxloom rules check --json
  ```
package/README.md.bak ADDED
@@ -0,0 +1,832 @@
+ # ctxloom — The Universal Code Context Engine
+
+ A local-first MCP server that gives AI coding assistants deep structural understanding of your codebase through hybrid **Vector + AST + Graph** search, with **Skeletonization** for 92% token reduction.
+
+ No cloud indexing. No Python. Everything runs on your machine.
+
+ > **ctxloom requires a license.** Start a free 7-day trial — no credit card required.
+
+ ## Multi-Project Support (v1.1.0)
+
+ ctxloom now supports analyzing multiple projects in a single MCP session. Every tool accepts an optional `project_root` parameter (alias or absolute path).
+
+ **Register a project alias:**
+ ```bash
+ ctxloom register --alias myapp /path/to/project
+ ```
+
+ **Use the alias in tool calls:**
+ ```json
+ {
+   "project_root": "myapp"
+ }
+ ```
+
+ Or use an absolute path directly:
+ ```json
+ {
+   "project_root": "/path/to/project"
+ }
+ ```
+
+ **Project state management:** ctxloom maintains an LRU cache of active projects (cap 5 by default, override via `CTXLOOM_MAX_PROJECTS`). First-touch auto-indexing indexes the dependency graph (sync, Tier 1) and queues vector indexing (deferred, Tier 2). Responses include a `<ctxloom_indexing>` envelope on first-touch. Project-resolution errors return structured XML: `<error code="alias_not_found" .../>`, `<error code="no_default_project" .../>`, etc.
+
+ **Backward compatibility:** Set `CTXLOOM_DISABLE_MULTIPROJECT=1` to revert to single-project (v1.0.31) behavior.
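
The project-state behaviour described above (an LRU cache capped by `CTXLOOM_MAX_PROJECTS`, with extra work on first touch) can be sketched as follows. This is a minimal illustration, not ctxloom's implementation: the `ProjectCache` class, its `touch` method, and the shape of the state dictionary are all hypothetical. Only the environment variable name and the default cap of 5 come from the README.

```python
import os
from collections import OrderedDict


class ProjectCache:
    """Least-recently-used cache of per-project state (illustrative only)."""

    def __init__(self, capacity=None):
        # CTXLOOM_MAX_PROJECTS and the default of 5 are taken from the README;
        # everything else in this class is a hypothetical stand-in.
        if capacity is None:
            capacity = int(os.environ.get("CTXLOOM_MAX_PROJECTS", "5"))
        self.capacity = capacity
        self._projects = OrderedDict()

    def touch(self, root):
        """Return (state, first_touch) and evict the LRU project when full."""
        if root in self._projects:
            # Seen before: mark as most recently used, nothing to index.
            self._projects.move_to_end(root)
            return self._projects[root], False
        if len(self._projects) >= self.capacity:
            # Drop the project that was used longest ago.
            self._projects.popitem(last=False)
        # First touch: the README says the graph is indexed synchronously
        # and vector indexing is queued; the flags below only model that.
        state = {"root": root, "graph_indexed": True, "vectors_queued": True}
        self._projects[root] = state
        return state, True

    def roots(self):
        """Project roots ordered from least to most recently used."""
        return list(self._projects)
```

A caller could use the `first_touch` flag to decide whether to attach an indexing notice to the response, which is roughly the role the `<ctxloom_indexing>` envelope plays in the description above.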
+
+ ---
+
+ ## Getting Started
+
+ **Prerequisites:** Node.js 20+ and an MCP-compatible AI tool (Claude Code, Cursor, Windsurf, etc.)
+
+ The full first-run flow is **one install + one trial + one init per project.** Each step is a single command.
+
+ ### 1 — Install (once per machine)
+
+ ```bash
+ npm install -g ctxloom-pro
+ ```
+
+ > **For local trial / dev use the unpinned command above is fine.** For unattended CI usage, pin to the exact version (`ctxloom-pro@1.5.4`) so future CLI releases don't silently desync your agent-spec coverage — see the workflow example below.
+
+ ### 2 — Start your free trial (once per email)
+
+ ```bash
+ ctxloom trial
+ # Enter your email — a checkout link opens in your browser.
+ # No credit card required. After checkout, you receive a license key by email.
+ ```
+
+ Already have a key?
+
+ ```bash
+ ctxloom activate <your-key>
+ ```
+
+ ### 3 — Configure your AI tools (once per machine)
+
+ ```bash
+ ctxloom setup
+ # Detects Claude Code, Cursor, Windsurf, Claude Desktop, Codex,
+ # Kimi, Continue, Aider, Augment, Kilo, Qwen, JetBrains, VS Code —
+ # writes the global MCP entry for each one you have installed.
+ ```
+
+ ### 4 — Bootstrap each project (once per project)
+
+ ```bash
+ cd /path/to/your/project
+ ctxloom init    # writes .mcp.json + appends .ctxloom/ to .gitignore
+ ctxloom index   # builds vector + graph + git overlay
+ ```
+
+ `ctxloom init` is the piece that pins ctxloom to **this** project. Without it, MCP clients (notably Claude Code) launch the global MCP server with cwd inherited from wherever the IDE was first opened — and **do not relaunch on project switch** — so a single Claude Code session ends up serving graph queries from the wrong codebase. The `.mcp.json` produced by `init` carries an explicit `CTXLOOM_ROOT` and short-circuits that ambiguity.
+
+ Beyond `.mcp.json` + `.gitignore`, `ctxloom init` also writes the **agent-harness layer** (v1.4.0+): HMAC-signed rule blocks in `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`, Claude Code hooks under `.claude/hooks/`, and six pre-packaged Claude Code skills under `.claude/skills/ctxloom-*/`. These tell the agent to prefer the ctxloom MCP tools over `Grep`/`Read` automatically; you don't have to remember the rule.
+
+ **Cross-agent hosts (v1.5.0+):** add `--host=<id>` to install rules for additional agent hosts beyond Claude Code:
+
+ ```bash
+ ctxloom init --host=cursor         # writes .cursorrules
+ ctxloom init --host=aider          # writes CONVENTIONS.md
+ ctxloom init --host=copilot        # writes .github/copilot-instructions.md
+ ctxloom init --host=windsurf       # writes .windsurfrules
+ ctxloom init --host=cursor,aider   # comma-separated → multiple hosts
+ ctxloom init --host=all            # writes every supported host
+ ```
+
+ Unknown host ids are dropped with a warning rather than causing a hard failure. Re-running `ctxloom init` is idempotent — content matches → no-op; tampered blocks → refuse to clobber without `--force`.
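
The idempotency rule above (matching content is a no-op, tampered blocks are not overwritten without `--force`) could be modelled with a keyed signature check like the one below. This is a sketch under stated assumptions: the README does not document the signing key, the hash function, or how signatures are stored, so `sign_block`, `plan_write`, SHA-256, and the three return values are hypothetical choices made for illustration.

```python
import hashlib
import hmac


def sign_block(block: str, key: bytes) -> str:
    """HMAC-SHA256 of a rule block, hex encoded (hash choice is an assumption)."""
    return hmac.new(key, block.encode("utf-8"), hashlib.sha256).hexdigest()


def plan_write(existing_block, existing_sig, new_block, key, force=False):
    """Decide what an idempotent init should do with one rule block.

    Returns "write", "noop", or "refuse" (hypothetical labels).
    """
    if existing_block is None:
        return "write"  # nothing on disk yet
    # A block whose stored signature no longer verifies was edited by hand.
    intact = hmac.compare_digest(sign_block(existing_block, key), existing_sig or "")
    if not intact:
        return "write" if force else "refuse"
    # Untampered block: rewrite only when the generated content changed.
    return "noop" if existing_block == new_block else "write"
```

The point of signing rather than plain comparison is that it distinguishes "an older generated block" (safe to update) from "a block someone edited" (needs an explicit override).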
+
+ After `init` + `index`, reopen your AI tool in the project directory. Your assistant now has full structural context.
+
+ ### License commands
+
+ ```bash
+ ctxloom status       # tier, expiry, last validation
+ ctxloom deactivate   # release this machine's seat (to move to a new machine)
+ ```
+
+ ### CI / headless environments
+
+ ```bash
+ CTXLOOM_LICENSE_KEY=<your-key> ctxloom index
+ ```
+
+ Set `CTXLOOM_LICENSE_KEY` in your CI secrets. The key is validated on every run — no local state written to the runner.
+
+ ### Manual MCP configuration (if you skip `ctxloom setup`)
+
+ Global MCP entry — match this in your client's config file by hand:
+
+ ```jsonc
+ // Claude Code:    ~/.claude.json or .mcp.json in the project
+ // Cursor:         ~/.cursor/mcp.json
+ // Codex CLI:      ~/.codex/mcp.json
+ // Kimi:           ~/.kimi/mcp.json
+ // Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json
+ {
+   "mcpServers": {
+     "ctxloom": {
+       "command": "ctxloom",
+       "args": []
+     }
+   }
+ }
+ ```
+
+ Then run `ctxloom init` inside each project — it writes a `.mcp.json` in the project root with `env.CTXLOOM_ROOT` set, which overrides the global entry on a per-project basis (Claude Code, Cursor, and the other MCP-aware clients merge per-project config over global automatically).
+
+ If you have a single fixed project (e.g. a CI runner or a Claude Desktop session with no project concept), pin the global entry directly:
+
+ ```jsonc
+ {
+   "mcpServers": {
+     "ctxloom": {
+       "command": "ctxloom",
+       "args": [],
+       "env": { "CTXLOOM_ROOT": "/path/to/project" }
+     }
+   }
+ }
+ ```
+
+ > Pricing: **Pro** €9.90/mo or €99/yr (1 seat) · **Team** €29.90/mo or €299/yr (5 seats) · [ctxloom.com/pricing](https://ctxloom.com/pricing)
+
+ ---
+
+ ## GitHub App — ctxloom-bot
+
+ ![Beta](https://img.shields.io/badge/status-beta-orange)
+
+ Get automated risk analysis and reviewer suggestions on every pull request.
+
+ <!-- TODO: Add demo GIF showing bot posting summary + inline comment on a PR -->
+
+ - Posts a risk-scored summary comment on every PR, combining blast radius, churn, and coupling data
+ - Adds inline review comments at the specific lines that carry the highest structural risk
+ - Suggests reviewers based on ownership data mined from git history
+ - Responds to `/ctxloom` slash commands (e.g. `/ctxloom blast-radius`, `/ctxloom risk`) directly in PR threads
+
+ See [`apps/pr-bot/README.md`](apps/pr-bot/README.md) for full installation and self-hosting instructions.
+
+ ---
+
+ ## Web Dashboard
+
+ ![Beta](https://img.shields.io/badge/status-beta-orange)
+
+ A local web dashboard that visualises your codebase's graph, risk, ownership, and community data in real time.
+
+ ```bash
+ # Index first (with git history for full data)
+ ctxloom index --with-git
+
+ # Launch the dashboard
+ ctxloom dashboard
+ ```
+
+ Visit `http://localhost:7842` — no browser extension required.
+
+ ### Views
+
+ | View | What it shows |
+ |------|---------------|
+ | **Overview** | File count, edge count, communities, git status, risk breakdown donut, top architectural hubs |
+ | **Dependency Graph** | Interactive D3 force-directed graph — hover for details, click to highlight neighbours, search to pan, community legend, risk rings |
+ | **Risk** | Sortable table: composite risk score (churn × 0.3 + bug density × 0.3 + bus factor × 0.2 + coupling × 0.2), filterable by filename |
+ | **Communities** | Auto-detected Louvain modules — expandable cluster cards showing member files |
+ | **Ownership** | Per-file primary owner, share %, bus factor warnings — filterable by file or contributor |
+ | **Guide** | In-app reference explaining every metric and how to interpret it |
+
+ ### Interactivity
+
+ - **Click any filename** across Risk, Ownership, and Communities to open a file preview drawer with the full source and an **Open in IDE** button (launches VS Code, Cursor, or system default)
+ - **↻ Refresh** button in Overview re-indexes the context in-place without restarting the server
+ - The server **auto-reloads** when `.ctxloom/graph-snapshot.json` changes — run `ctxloom index` in a separate terminal and the dashboard updates automatically
+
+ ### Risk tiers
+
+ | Tier | Score | Meaning |
+ |------|-------|---------|
+ | critical | > 0.8 | Urgent — high churn, sole owner, heavily coupled |
+ | high | > 0.6 | Address soon |
+ | medium | > 0.3 | Monitor |
+ | low | ≤ 0.3 | Acceptable |
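
To make the composite score from the Risk view and the tier table concrete, here is a small worked sketch. The weights and thresholds are the ones quoted above; the function names are hypothetical, and the assumption that each input is already normalised to the 0..1 range is mine, since the README does not say how the raw metrics are scaled.

```python
def risk_score(churn, bug_density, bus_factor, coupling):
    """Composite risk score; inputs are assumed to be normalised to 0..1."""
    return 0.3 * churn + 0.3 * bug_density + 0.2 * bus_factor + 0.2 * coupling


def risk_tier(score):
    """Map a score to the tier names and thresholds from the table above."""
    if score > 0.8:
        return "critical"
    if score > 0.6:
        return "high"
    if score > 0.3:
        return "medium"
    return "low"


# Example: heavy churn, moderate bug density, a single owner, light coupling.
example = risk_score(churn=0.9, bug_density=0.5, bus_factor=1.0, coupling=0.2)
example_tier = risk_tier(example)
```

With these inputs the terms are 0.27 + 0.15 + 0.20 + 0.04 = 0.66, which lands in the high tier.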
+
+ ---
+
+ ## Reviewer Suggestions
+
+ Suggest PR reviewers based on git ownership, co-change history, and recent activity — no static CODEOWNERS to maintain:
+
+ ```bash
+ # Suggest reviewers for staged files
+ ctxloom review-suggest
+
+ # Suggest reviewers for specific files
+ ctxloom review-suggest src/auth.ts src/api/session.ts
+
+ # Show per-factor score breakdown
+ ctxloom review-suggest src/auth.ts --explain
+
+ # Generate / update .github/CODEOWNERS
+ ctxloom review-suggest --emit-codeowners --write
+
+ # Map git author emails to GitHub handles
+ GITHUB_TOKEN=<token> ctxloom authors-sync
+ ```
+
+ ### Scoring
+
+ Each candidate is scored across four factors:
+
+ | Factor | Weight | Source |
+ |--------|--------|--------|
+ | Ownership share | 50% | Blame-weighted commit history |
+ | Co-change recency | 25% | Files changed together in last 90 days |
+ | Recent activity | 15% | Commits in last 30/90 days |
+ | Bus-factor boost | 10% | Diversity nudge when bus factor ≤ 2 |
+
+ Candidates inactive for > 180 days are excluded automatically.
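
The weighting above can be read as a plain weighted sum followed by an activity filter. The sketch below shows that reading; it is not the tool's code. The `Candidate` fields, the 0..1 normalisation of each factor, and the `rank` helper are assumptions introduced for illustration, while the four weights and the 180-day cutoff come from the table and sentence above.

```python
from dataclasses import dataclass

# Weights from the scoring table; factor values are assumed to be in 0..1.
WEIGHTS = {
    "ownership": 0.50,
    "co_change": 0.25,
    "activity": 0.15,
    "bus_factor_boost": 0.10,
}
INACTIVE_CUTOFF_DAYS = 180


@dataclass
class Candidate:
    handle: str
    ownership: float
    co_change: float
    activity: float
    bus_factor_boost: float
    days_since_last_commit: int


def score(candidate):
    """Weighted sum of the four factors."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rank(candidates, max_reviewers=3):
    """Drop inactive candidates, then return the top scorers."""
    active = [c for c in candidates if c.days_since_last_commit <= INACTIVE_CUTOFF_DAYS]
    return sorted(active, key=score, reverse=True)[:max_reviewers]
```

Note how a candidate with perfect factor values is still excluded once they pass the inactivity cutoff, which is the behaviour the last sentence above describes.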
+
+ ### GitHub Action
+
+ Add to `.github/workflows/review.yml`:
+
+ ```yaml
+ name: Reviewer suggestions
+ on:
+   pull_request:
+     types: [opened, synchronize, reopened]
+ jobs:
+   suggest:
+     runs-on: ubuntu-latest
+     permissions:
+       pull-requests: write
+       contents: read
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0
+       - uses: kodiii/ctxloom-review-suggest@v1
+         with:
+           max: 3
+ ```
+
+ ### Email → GitHub handle mapping
+
+ Create `.ctxloom/authors.yml` to map or exclude authors:
+
+ ```yaml
+ mappings:
+   alice@company.com: alice-gh
+   bob@company.com: bobsmith
+ ignore:
+   - bot@dependabot.com
+ ```
+
+ ---
+
+ ## Architecture Rules Engine
+
+ Enforce architectural boundaries as a CI lint step — no runtime overhead, no flaky tests.
+
+ ```bash
+ # Check rules against the indexed dependency graph
+ ctxloom rules check
+
+ # JSON output (for CI parsers)
+ ctxloom rules check --json
+
+ # Skip re-indexing, use existing snapshot
+ ctxloom rules check --use-snapshot
+
+ # Limit text output to N violations (default: 50)
+ ctxloom rules check --limit=20
+ ```
+
+ ### Configuration
+
+ Create `.ctxloom/rules.yml` in your project root:
+
+ ```yaml
+ version: 1
+ rules:
+   - name: domain must not import infra
+     type: no-import
+     from: "src/domain/**"
+     to: "src/infra/**"
+     severity: error   # optional — defaults to "error"
+
+   - name: no circular via shared
+     type: no-import
+     from: "src/features/**"
+     to: "src/shared/legacy/**"
+     severity: warning
+ ```
+
+ ### Rule fields
+
+ | Field | Required | Description |
+ |-------|----------|-------------|
+ | `name` | ✅ | Human-readable rule label (shown in violations) |
+ | `type` | ✅ | Always `no-import` in v1 |
+ | `from` | ✅ | picomatch glob — files that must not import |
+ | `to` | ✅ | picomatch glob — files that must not be imported |
+ | `severity` | ❌ | `error` (default) or `warning` |
+
+ Globs use [picomatch](https://github.com/micromatch/picomatch) syntax with `{ dot: true }` for dotfiles.
+
+ ### Exit codes
+
+ | Code | Meaning |
+ |------|---------|
+ | 0 | Clean (or warnings only) |
+ | 1 | One or more `error`-severity violations found |
+ | 2 | Config file invalid or I/O error |
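
The `no-import` rule and the exit codes above can be sketched as a check over direct import edges. This is an approximation, not the engine itself: it uses Python's `fnmatch` as a stand-in for picomatch (the two glob dialects differ, for example `*` in `fnmatch` also crosses `/`), and the `check_rules` and `exit_code` helpers and the edge-list input are hypothetical. Exit code 2 (invalid config or I/O error) is left out of the sketch.

```python
from fnmatch import fnmatchcase


def check_rules(rules, edges):
    """Return violations for direct import edges given as (importer, imported).

    fnmatch is only an approximation of picomatch semantics.
    """
    violations = []
    for rule in rules:
        for importer, imported in edges:
            if fnmatchcase(importer, rule["from"]) and fnmatchcase(imported, rule["to"]):
                violations.append({
                    "rule": rule["name"],
                    "severity": rule.get("severity", "error"),  # default per the field table
                    "from": importer,
                    "to": imported,
                })
    return violations


def exit_code(violations):
    """0 when clean or warnings only, 1 when any error-severity violation exists."""
    return 1 if any(v["severity"] == "error" for v in violations) else 0


RULES = [
    {"name": "domain must not import infra", "type": "no-import",
     "from": "src/domain/**", "to": "src/infra/**"},
    {"name": "no circular via shared", "type": "no-import",
     "from": "src/features/**", "to": "src/shared/legacy/**", "severity": "warning"},
]
EDGES = [
    ("src/domain/user.ts", "src/infra/db.ts"),
    ("src/features/cart.ts", "src/shared/legacy/util.ts"),
    ("src/domain/user.ts", "src/domain/types.ts"),
]
found = check_rules(RULES, EDGES)
```

Because only direct edges are inspected, a path such as domain → shared → infra would pass this check, which matches the "direct imports only" limitation the README lists for v1.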
+
+ ### CI integration
+
+ ```yaml
+ # .github/workflows/rules.yml
+ name: Architecture rules
+ on: [push, pull_request]
+ jobs:
+   check:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-node@v4
+         with: { node-version: '20' }
+       # Exact pin (not `@^1`) so future CLI releases that add/remove MCP
+       # tools don't silently desync your reviewer-agent specs. Bump on
+       # every release; see CHANGELOG.md for the live version table.
+       - run: npm install -g ctxloom-pro@1.5.4
+       - run: ctxloom index
+       - run: ctxloom rules check --json
+ ```
+
+ ### MCP tool
+
+ The `ctx_rules_check` tool exposes the same engine to your AI assistant:
+
+ ```jsonc
+ // Request
+ {}
+
+ // Response (schemaVersion: 1)
+ {
+   "schemaVersion": 1,
+   "violations": [
+     {
+       "rule": "domain must not import infra",
+       "severity": "error",
+       "from": "src/domain/user.ts",
+       "to": "src/infra/db.ts"
+     }
+   ],
+   "warnings": []
+ }
+ ```
+
+ The tool reads `.ctxloom/rules.yml` and the live dependency graph on every call — no restart required when config changes.
+
+ ### Limitations (v1)
+
+ - **Direct imports only** — transitive violations are not detected
+ - **Snapshot staleness** — `--use-snapshot` skips re-indexing; stale graphs may miss recent violations
+ - Rule type `no-import` only; more rule types planned for v2
+
+ ---
+
+ ## How ctxloom Compares
+
+ | Feature | ctxloom | code-review-graph | Others |
+ |---------|---------|-------------------|--------|
+ | Zero Python dependencies | ✅ Pure JS/TS | ❌ Python required | varies |
+ | Local-first (no cloud) | ✅ | ✅ | varies |
+ | Blast radius analysis | ✅ `ctx_blast_radius` | ✅ | ❌ |
+ | Community / cluster detection | ✅ Louvain (pure JS) | ✅ Leiden (Python) | ❌ |
+ | Architecture overview | ✅ `ctx_architecture_overview` | ✅ | ❌ |
+ | Execution flow tracing | ✅ `ctx_execution_flow` | ❌ | ❌ |
+ | Refactor rename preview | ✅ `ctx_refactor_preview` | ❌ | ❌ |
+ | Wiki generation (no LLM) | ✅ `ctx_wiki_generate` | ✅ | ❌ |
+ | Graph export (Gephi/Obsidian) | ✅ `ctx_graph_export` | ✅ | ❌ |
+ | Cross-repo search | ✅ `ctx_cross_repo_search` | ✅ | ❌ |
+ | All-in-one code review packet | ✅ `ctx_git_diff_review` | ✅ | ❌ |
+ | Tree-sitter AST | ✅ TS/JS/Python/Go/Rust/Java/C#/Ruby/Kotlin/Swift/PHP/Dart/Vue — 13 languages | ✅ Multi-language | varies |
+ | Token reduction (skeletonization) | ✅ **92% measured on real repos** | ✅ | ❌ |
+ | npm install size | ✅ <5 MB (lazy grammars) | ❌ Large | varies |
+ | MCP protocol native | ✅ | ✅ | varies |
+ | PR-native review comments | ✅ ctxloom-bot posts on every PR | ❌ | ❌ |
+
+ > Token reduction is measured, not estimated. See [`benchmarks/README.md`](benchmarks/README.md).
+
+ ---
+
+ ## Tools — 33 total
+
+ ### Search & Context
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_search` | Hybrid semantic + graph search (vector similarity + import graph expansion) |
+ | `ctx_get_file` | Safe file read with path traversal protection (5 MB max) |
+ | `ctx_get_context_packet` | Smart multi-file context: primary file + dependency skeletons + reverse importers |
+ | `ctx_similar_files` | Find semantically similar files via vector embeddings |
+ | `ctx_cross_repo_search` | Federated semantic search across all registered repos |
+ | `ctx_full_text_search` | Hybrid keyword+vector search with regex support and configurable context lines |
+
+ ### Graph Intelligence
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_blast_radius` | "What breaks if I change this?" — import + call graph traversal |
+ | `ctx_hub_nodes` | Top-N files by import degree (architectural chokepoints) |
+ | `ctx_bridge_nodes` | Top-N files by betweenness centrality (graph connectors) |
+ | `ctx_community_list` | Louvain community detection — cluster files into architectural modules |
+ | `ctx_architecture_overview` | High-level summary: communities, hub files, cross-community coupling |
+ | `ctx_knowledge_gaps` | Isolated files, untested hubs, dead code candidates |
+ | `ctx_surprising_connections` | Circular deps, cross-community imports, prod→test violations |
+ | `ctx_find_large_functions` | Find functions/classes exceeding a line-count threshold, sorted by size descending |
+
+ ### Code Navigation
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_get_call_graph` | Bidirectional call graph traversal with configurable depth |
+ | `ctx_get_definition` | Symbol definition lookup via AST index |
+ | `ctx_execution_flow` | DFS call graph traversal from entry point with cycle detection |
+ | `ctx_get_affected_flows` | Which flows are affected by changed files? Traces back to root callers, then forward — auto-detects from `git diff HEAD~1` |
+ | `ctx_refactor_preview` | Read-only symbol rename diff preview — see every change before applying |
+ | `ctx_apply_refactor` | Write symbol renames to disk atomically (supports dry_run) |
+
+ ### Review & Export
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_git_diff_review` | All-in-one code review packet: git diffs + skeletons + blast radius |
+ | `ctx_wiki_generate` | Generate `.ctxloom/wiki/` — one Markdown page per community (no LLM needed) |
+ | `ctx_graph_export` | Export graph to GraphML, DOT, Obsidian, SVG, or interactive D3.js HTML |
+ | `ctx_suggested_questions` | Graph-driven code review questions without LLM |
+ | `ctx_detect_changes` | Risk-scored change analysis — critical/high/medium/low priority |
+ | `ctx_graph_snapshot` | Save a named checkpoint of the dependency graph |
+ | `ctx_graph_diff` | Diff two named snapshots — added/removed nodes and edges |
+
+ ### Utilities
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_get_rules` | Inject project rules from `.cursorrules`, `CLAUDE.md`, `CONTEXT.md`, `.ctxloomrc` |
+ | `ctx_status` | Server status: graph size, vector store count, initialization state |
+ | `ctx_get_workflow` | Return a pre-written tool sequence for review/debug/onboard/refactor/audit workflows |
+ | `ctx_rules_check` | Check `.ctxloom/rules.yml` against the live dependency graph — returns `{schemaVersion:1, violations, warnings}` |
+
+ ---
+
+ ## Risk Overlay (Git History)
+
+ ctxloom fuses your git history onto the structural graph to produce a *risk map* — showing which files are historically risky, not just structurally coupled.
+
+ ### Enable
+
+ Re-index with the `--with-git` flag (enabled by default):
+
+ ```
+ ctxloom . --with-git --git-window-days=365
+ ```
+
+ First run mines the last 365 days of commits (~30–90s on large repos). Subsequent runs are incremental.
+
+ ### New tools
+
+ | Tool | Description |
+ |------|-------------|
+ | `ctx_git_coupling` | Given a file, returns top co-changed siblings with confidence score, shared commit count, and recency data. Surfaces "historically this file changes with X" — invisible to static analysis. |
+ | `ctx_risk_overlay` | Given a list of files, returns a per-file risk score (0–1) combining churn, bug-fix density, bus-factor ownership, and coupling fan-out. |
+
+ ### Enriched tools
+
+ Existing tools gain a `risk` block when the overlay is active:
+
+ - **`ctx_detect_changes`** — each changed file now includes churn bucket, bug density, top coupled siblings, and ownership.
+ - **`ctx_blast_radius`** — adds a `historicalCoupling` section listing files that co-change with the seed set historically but are not reachable via imports ("historical surprise" surface).
+
+ ### Privacy
+
+ The overlay is **local only**. No code or commit metadata is sent anywhere. The sidecar is stored at `.ctxloom/git-overlay.json` alongside the graph snapshot.
+
+ ### Opt out
+
+ Pass `--no-git` to disable the overlay entirely. Tools degrade gracefully — the `risk` block becomes `null` and the note `"Re-index with --with-git to enable risk data."` appears in responses.
+
+ ---
+
+ ## Response Budgets (v1.2.7+)
+
+ Twelve source-returning tools accept a server-enforced **token budget**. When a response would exceed the budget, the server auto-substitutes a lighter form (Skeletonizer signature view, summary-only XML, or paths-without-snippets) instead of dumping 50KB of source into your context window.
+
+ ### Opting in
+
+ Pass any of these three optional fields to any of the 12 supported tools:
+
+ ```json
+ {
+   "max_response_tokens": 4000,
+   "on_budget_exceeded": "skeleton",
+   "response_format": "auto"
+ }
+ ```
+
+ | Field | Values | Default |
+ |---|---|---|
+ | `max_response_tokens` | positive integer | per-tool (see below) |
+ | `on_budget_exceeded` | `"skeleton"` \| `"truncate"` \| `"error"` | `"skeleton"` |
+ | `response_format` | `"full"` \| `"skeleton"` \| `"auto"` | `"auto"` |
+
+ **Back-compat:** when none of these fields are passed, the tool returns its raw response unchanged. Existing callers see zero behavior change.
+
+ ### Response envelope
+
+ When you opt in, the response is wrapped in a JSON envelope:
+
+ ```jsonc
+ {
+   "data": "<the actual tool output — XML, text, or whatever the tool returns>",
+   "meta": {
+     "format": "skeleton",                  // "full" | "skeleton" | "truncated"
+     "original_tokens_est": 8400,
+     "returned_tokens_est": 1600,
+     "fallback_reason": "budget_exceeded"   // null | "budget_exceeded" | "minified_input" | "size_cap" | "skeleton_failed"
+   }
+ }
+ ```
+
+ ### Supported tools + default budgets
+
+ Defaults activate only when you opt in (any of the 3 fields above) without specifying `max_response_tokens` explicitly.
+
+ | Tool | Default | Skeleton fallback |
+ |---|---:|---|
+ | `ctx_get_file` | 8000 | Skeletonizer view of the file (~90% reduction on TS) |
+ | `ctx_get_context_packet` | 6000 | Re-render with the primary file skeletonized |
+ | `ctx_get_definition` | 2000 | none — truncate-only (already structural) |
+ | `ctx_git_diff_review` | 8000 | Drop `<skeleton>` blocks + omit transitive importers |
+ | `ctx_search` | 4000 | Drop content snippets (paths + scores only) |
+ | `ctx_full_text_search` | 4000 | Drop match snippets (paths + match counts only) |
+ | `ctx_wiki_generate` | 12000 | Downgrade to `detail_level=minimal` |
+ | `ctx_find_large_functions` | 2000 | none — truncate-only |
+ | `ctx_apply_refactor` | 2000 | none — truncate-only |
+ | `ctx_refactor_preview` | 4000 | Drop per-change before/after, keep file summary |
+ | `ctx_cross_repo_search` | 4000 | Drop content snippets |
+ | `ctx_execution_flow` | 4000 | none — truncate-only |
+
+ Defaults are **provisional** (derived from the issue's initial table); a future release will re-derive them from real per-tool p75 telemetry once enough usage data accumulates.
+
+ ### Token estimator
+
+ Default = `chars / 4` — within ±10% of GPT/Claude tokenizers on code with zero tokenization cost. Pluggable per-tool via the `estimator` option on `BudgetOptions` for callers that need accuracy-critical estimation (e.g. tiktoken).
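
Putting the estimator and the envelope together, the budget check might look roughly like the sketch below. The field names and the `chars / 4` estimate come from the sections above; the `apply_budget` function, its arguments, the use of integer division, and the decision to truncate by characters are assumptions for illustration. The sketch also ignores the case where the skeleton itself exceeds the budget.

```python
def estimate_tokens(text):
    """The chars / 4 approximation (integer division is an assumption)."""
    return len(text) // 4


def apply_budget(full, skeleton, max_response_tokens, on_budget_exceeded="skeleton"):
    """Wrap a tool response in the documented envelope (hypothetical helper).

    `skeleton` is a pre-computed lighter rendering, or None for tools
    that are truncate-only.
    """
    original = estimate_tokens(full)
    if original <= max_response_tokens:
        data, fmt, reason = full, "full", None
    elif on_budget_exceeded == "error":
        raise ValueError("response exceeds max_response_tokens")
    elif on_budget_exceeded == "skeleton" and skeleton is not None:
        data, fmt, reason = skeleton, "skeleton", "budget_exceeded"
    else:
        # Keep roughly max_response_tokens worth of characters.
        data, fmt, reason = full[: max_response_tokens * 4], "truncated", "budget_exceeded"
    return {
        "data": data,
        "meta": {
            "format": fmt,
            "original_tokens_est": original,
            "returned_tokens_est": estimate_tokens(data),
            "fallback_reason": reason,
        },
    }
```

For a 33,600-character response and a 6,400-character skeleton under a 4,000-token budget, this yields the same numbers as the envelope example: 8,400 estimated tokens originally and 1,600 returned.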
+
+ ### Kill switch
+
+ Set `CTXLOOM_DISABLE_BUDGET=1` in the environment to silently ignore every `max_response_tokens` arg server-wide. Tools behave exactly as in pre-v1.2.7. Documented escape hatch for the soak period.
+
+ ### Telemetry
+
+ Set `CTXLOOM_TELEMETRY_LEVEL=full` to emit structured `mcp.budget.exceeded` and `mcp.fallback.used` events to stderr. Useful for tuning defaults against your own usage patterns.
+
+ > **Note:** `CTXLOOM_TELEMETRY_LEVEL` is also consumed by the license / PostHog telemetry layer (see [Telemetry](#telemetry) below) which only recognizes `all` / `error` / `off`. `full` is a separate, **additive** level — it enables budget-event emission *without narrowing* PostHog scope. To narrow PostHog telemetry, set the variable to `error` or `off`; those values disable budget events as a side effect.
+
+ ---
+
+ ## CLI Commands
+
+ ```
+ ctxloom                                   Start MCP server (Stdio transport)
+ ctxloom index                             Index current directory + build dependency graph
+ ctxloom dashboard                         Open the web dashboard (port 7842)
+ ctxloom dashboard --port=N                Start on a custom port
+ ctxloom dashboard --open                  Open browser automatically
+ ctxloom setup                             Detect and configure MCP-compatible AI tools (interactive)
+ ctxloom register <path>                   Register a repo for cross-repo search (v1.0.x)
+ ctxloom register --alias <name> <path>    Register a project with an alias for multi-project support (v1.1.0+)
+ ctxloom repos                             List all registered repos
+ ctxloom grammars                          Show grammar cache status
+ ctxloom grammars --download               Pre-download all language grammars
+ ctxloom rules check                       Check .ctxloom/rules.yml against the dependency graph
+ ctxloom rules check --json                JSON output (schemaVersion: 1)
+ ctxloom rules check --use-snapshot        Skip re-indexing, use existing graph snapshot
+ ctxloom rules check --limit=N             Limit text output to N violations (default: 50)
+ ctxloom --help                            Show help
+ ```
+
+ ---
+
+ ## Language Support
+
+ | Language | Import Graph | Symbol Index | Skeletonization |
+ |----------|-------------|--------------|-----------------|
+ | TypeScript / JavaScript | ✅ Full AST | ✅ | ✅ |
+ | Python | ✅ Relative imports | ✅ | ✅ |
+ | Rust | ✅ `mod` resolution | ✅ | ✅ |
+ | Go | ✅ Relative paths | ✅ | ✅ |
+ | Java | ✅ Dot-to-slash | ✅ | ✅ |
+ | C# | ✅ Namespace resolution | ✅ | ✅ |
+ | Ruby | ✅ Relative paths | ✅ | ✅ |
+ | Kotlin | ✅ Package imports | ✅ | ✅ |
+ | Swift | ✅ Module imports | ✅ | ✅ |
+ | PHP | ✅ PSR-4 + require_once | ✅ | ✅ |
+ | Dart | ✅ Relative imports | ✅ | ✅ |
+ | Vue SFC | ✅ Script block | ✅ | ✅ |
+ | Jupyter Notebook | ✅ Python cell imports | ✅ | ✅ |
+
+ ---
+
+ ## Architecture
+
+ ```
+ ┌─────────────────────────────────────────────────────────┐
+ │                      MCP Interface                      │
+ │                    (Stdio transport)                    │
+ ├─────────────────────────────────────────────────────────┤
+ │                 33 Tools (ToolRegistry)                 │
+ │    Search · Graph Intelligence · Navigation · Review    │
+ ├─────────────────────────────────────────────────────────┤
+ │                     Context Engine                      │
+ │  ┌────────────┐  ┌──────────────┐  ┌─────────────────┐  │
+ │  │ Dependency │  │ VectorDB     │  │ Skeletonizer    │  │
+ │  │ Graph      │  │ (LanceDB)    │  │ (tree-sitter)   │  │
+ │  └────────────┘  └──────────────┘  └─────────────────┘  │
+ │  ┌────────────┐  ┌──────────────┐  ┌─────────────────┐  │
+ │  │ CallGraph  │  │ Community    │  │ WikiGenerator   │  │
+ │  │ Index      │  │ Detector     │  │ GraphExporter   │  │
+ │  └────────────┘  └──────────────┘  └─────────────────┘  │
+ ├─────────────────────────────────────────────────────────┤
+ │         File Watcher (chokidar, 200ms debounce)         │
+ │        Incremental graph updates + re-embedding         │
+ ├─────────────────────────────────────────────────────────┤
+ │            Snapshot Manager (atomic writes)             │
+ │   .ctxloom/graph-snapshot.json + call-graph-snapshot    │
+ └─────────────────────────────────────────────────────────┘
+ ```
+
+ ### How search works
+
+ 1. **Embed** — query is embedded with `sentence-transformers/all-MiniLM-L6-v2` (local, 384-dim)
+ 2. **Vector search** — ANN query against pre-indexed file embeddings in LanceDB
+ 3. **Graph expansion** — results expanded via import graph (importers + imports get a small score boost)
+ 4. **Skeletonize** — dependency files reduced to signature-only views (functions, classes, exports) cutting token usage by ~92%
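
Step 3 of the search flow above (graph expansion) can be illustrated in isolation. The sketch assumes the vector search has already produced similarity scores per file; the `expand_with_graph` function, the input shapes, and the boost value of 0.05 are hypothetical, since the README only says that importers and imports get "a small score boost".

```python
def expand_with_graph(hits, imports, boost=0.05, limit=10):
    """Expand vector hits through the import graph.

    hits:    {path: similarity} from the vector search.
    imports: {path: set of paths that file imports}.
    The boost value is an assumption, not a documented constant.
    """
    # Build the reverse map so importers can be looked up as well.
    importers = {}
    for source, targets in imports.items():
        for target in targets:
            importers.setdefault(target, set()).add(source)

    scores = dict(hits)
    for path in hits:
        neighbours = imports.get(path, set()) | importers.get(path, set())
        for neighbour in neighbours:
            # Neighbours absent from the vector hits start from zero.
            scores[neighbour] = scores.get(neighbour, 0.0) + boost

    # Highest score first; ties broken by path for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]


HITS = {"src/auth.ts": 0.9, "src/session.ts": 0.6}
IMPORTS = {
    "src/auth.ts": {"src/crypto.ts"},
    "src/api.ts": {"src/auth.ts"},
    "src/session.ts": {"src/auth.ts"},
}
ranked = expand_with_graph(HITS, IMPORTS)
```

In this example `src/api.ts` and `src/crypto.ts` were not vector hits at all, yet they surface in the results because they sit one import edge away from a strong hit.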
679
+
680
+ ---
681
+
682
+ ## Performance
683
+
684
+ Benchmarks run on every PR. To run locally:
685
+
686
+ ```bash
687
+ npx tsx benchmarks/benchmark.ts
688
+ ```
689
+
690
+ See [`benchmarks/README.md`](benchmarks/README.md) for methodology and how to reproduce results independently.
691
+
692
+ ## Token reduction benchmarks
693
+
694
+ Full-source skeletonization on real open-source frameworks — every TS/JS file (skipping tests, `.d.ts`, build output, minified vendor bundles).
695
+
696
+ | Repository | Files | Raw tokens | Skeleton tokens | Reduction |
697
+ |---|---:|---:|---:|---:|
698
+ | vercel/next.js | 2,742 | ~12.2M | ~584k | **95%** |
699
+ | honojs/hono | 200 | ~185k | ~30k | **84%** |
700
+ | vitejs/vite | 1,032 | ~459k | ~105k | **77%** |
701
+ | withastro/astro | 875 | ~805k | ~191k | **76%** |
702
+ | nestjs/nest | 1,305 | ~409k | ~177k | **57%** |
703
+ | **Weighted average · 6,154 files** | | **~14.1M** | **~1.1M** | **92%** |
704
+
705
+ Token counts use the standard 4 chars/token approximation. The per-repo range (57–95%) reflects file-shape sensitivity: codebases made up of many tiny re-export shims compress less than ones with larger source files. Results are saved in [`benchmarks/large-repos-results.json`](benchmarks/large-repos-results.json). Run `npm run bench:repos` to reproduce them.
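
As a worked check of the table, the arithmetic can be reproduced from the rounded figures shown above. The sketch below is illustrative only: `estimateTokens` and `reductionPercent` are hypothetical helper names, and rounding up in the 4 chars/token estimate is an assumption (the README does not say how the benchmark rounds).

```typescript
// Figures copied from the table above (rounded token counts as displayed there).
type Row = { repo: string; rawTokens: number; skeletonTokens: number };

const rows: Row[] = [
  { repo: "vercel/next.js", rawTokens: 12_200_000, skeletonTokens: 584_000 },
  { repo: "honojs/hono", rawTokens: 185_000, skeletonTokens: 30_000 },
  { repo: "vitejs/vite", rawTokens: 459_000, skeletonTokens: 105_000 },
  { repo: "withastro/astro", rawTokens: 805_000, skeletonTokens: 191_000 },
  { repo: "nestjs/nest", rawTokens: 409_000, skeletonTokens: 177_000 },
];

// The 4 chars/token approximation; rounding up is an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Percentage of tokens removed by skeletonization, rounded to a whole percent.
function reductionPercent(raw: number, skeleton: number): number {
  return Math.round((1 - skeleton / raw) * 100);
}

const totalRaw = rows.reduce((sum, r) => sum + r.rawTokens, 0);
const totalSkeleton = rows.reduce((sum, r) => sum + r.skeletonTokens, 0);

// Token-weighted average: computed from the totals, not from the mean of the rows.
const weighted = reductionPercent(totalRaw, totalSkeleton);

for (const r of rows) {
  console.log(r.repo, reductionPercent(r.rawTokens, r.skeletonTokens) + "%");
}
console.log("weighted", weighted + "%");
```

Because the average is weighted by token count, the largest repository dominates it: the unweighted mean of the five per-repo percentages is about 78%, while the totals give 92%.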
706
+
707
+ ---
708
+
709
+ ## Security
710
+
711
+ - **Path traversal prevention** — all file inputs validated against project root (CWE-22), symlink-aware
712
+ - **Shell injection prevention** — `execFileSync` with argument arrays; no shell string interpolation
713
+ - **XML injection prevention** — all user-controlled strings escaped before XML output
714
+ - **File size limits** — files over 5 MB rejected by `PathValidator` and skipped by indexer
715
+ - **Input bounds** — `limit` capped at 100, `depth` capped at 20 across all tools
716
+ - **Atomic snapshot writes** — written to `.tmp` then renamed; prevents torn reads
717
+ - **Snapshot schema validation** — validated before hydration; prevents prototype pollution
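
Two of the checks listed above, root containment and input bounds, can be sketched as below. This is a simplified illustration under stated assumptions, not ctxloom's `PathValidator`: `isInsideRoot` and `clampInput` are hypothetical names, the check is purely lexical (it does not resolve symlinks, unlike the symlink-aware validation described above), and the lower bound of 1 is an assumption because the README only documents the upper caps.

```typescript
import * as path from "node:path";

// Hypothetical helper: lexical containment check using POSIX path rules.
// It does NOT follow symlinks, so it is weaker than a symlink-aware validator.
function isInsideRoot(root: string, candidate: string): boolean {
  const resolvedRoot = path.posix.resolve(root);
  const resolved = path.posix.resolve(resolvedRoot, candidate);
  const rel = path.posix.relative(resolvedRoot, resolved);
  // An empty string means the root itself; anything climbing out starts with "..".
  return rel === "" || (rel !== ".." && !rel.startsWith("../"));
}

// Hypothetical helper: clamp a numeric tool input to [min, max].
function clampInput(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, Math.trunc(value)));
}

const root = "/srv/project";
console.log(isInsideRoot(root, "src/index.ts"));     // inside the root
console.log(isInsideRoot(root, "../etc/passwd"));    // escapes via ".."
console.log(isInsideRoot(root, "/etc/passwd"));      // absolute path outside the root
console.log(clampInput(500, 1, 100));                // `limit` cap of 100
console.log(clampInput(50, 1, 20));                  // `depth` cap of 20
```

Comparing the relative path rather than using a string prefix test avoids accepting sibling directories such as `/srv/project-other` as if they were inside `/srv/project`.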
718
+
719
+ ---
720
+
721
+ ## Environment Variables
722
+
723
+ | Variable | Description | Default |
724
+ |----------|-------------|---------|
725
+ | `CTXLOOM_ROOT` | Project root directory | Current working directory |
726
+ | `LOG_LEVEL` | Logging verbosity: `debug` / `info` / `warn` / `error` | `info` |
727
+ | `CTXLOOM_GRAMMAR_CDN` | CDN base URL for grammar downloads (air-gapped environments) | Built-in |
728
+ | `CTXLOOM_MAX_PROJECTS` | LRU cache cap for multi-project state (v1.1.0+) | `5` |
729
+ | `CTXLOOM_DISABLE_MULTIPROJECT` | Set to `1` to revert to v1.0.31 single-project mode (v1.1.0+) | (unset) |
730
+ | `CTXLOOM_NO_TELEMETRY` | Set to `1` to disable anonymous telemetry entirely (v1.2.0+) | (unset) |
731
+ | `CTXLOOM_TELEMETRY_LEVEL` | `all` / `error` / `off` — granular telemetry scope (v1.2.0+) | `all` |
732
+ | `DO_NOT_TRACK` | Universal cross-tool opt-out — equivalent to `CTXLOOM_NO_TELEMETRY=1` | (unset) |
733
+
734
+ ---
735
+
736
+ ## Telemetry
737
+
738
+ ctxloom collects **anonymous, opt-out telemetry** to understand which features are used and to catch crashes. **No file contents, paths, project names, or aliases are ever transmitted.** Project identifiers are SHA-256 hashes of the absolute path. The `distinct_id` is a random UUID at `~/.ctxloom/distinct_id`.
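
A project identifier of the kind described above could be derived as in the following sketch. The `projectId` name and the hex encoding are assumptions; the README states only that identifiers are SHA-256 hashes of the absolute path.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Hypothetical helper: hash the absolute project path so the path itself
// never needs to be transmitted. Hex output is an assumed encoding.
function projectId(projectRoot: string): string {
  const absolute = path.resolve(projectRoot);
  return createHash("sha256").update(absolute).digest("hex");
}

const id = projectId("/home/dev/my-app");
console.log(id.length); // SHA-256 in hex is always 64 characters
```

The hash is deterministic, so the same project always maps to the same identifier, but the original path cannot be read back from it.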
739
+
740
+ Disable with `CTXLOOM_NO_TELEMETRY=1` or the cross-tool `DO_NOT_TRACK=1`. For a granular middle ground (crash reports yes, usage analytics no) use `CTXLOOM_TELEMETRY_LEVEL=error`.
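
How the three variables might combine is sketched below. This is not the bundled `resolveTelemetryLevel()` implementation, whose body is not shown here: the `resolveLevel` name, the precedence (either opt-out variable wins over `CTXLOOM_TELEMETRY_LEVEL`), and the fallback to `all` for unrecognised values are assumptions based on the documented defaults.

```typescript
type TelemetryLevel = "all" | "error" | "off";

// Hypothetical resolver. Assumed precedence: an explicit opt-out always wins.
function resolveLevel(env: Record<string, string | undefined>): TelemetryLevel {
  if (env["CTXLOOM_NO_TELEMETRY"] === "1" || env["DO_NOT_TRACK"] === "1") {
    return "off";
  }
  const raw = (env["CTXLOOM_TELEMETRY_LEVEL"] ?? "").toLowerCase();
  if (raw === "error" || raw === "off") {
    return raw;
  }
  return "all"; // documented default
}

console.log(resolveLevel({}));                                   // default
console.log(resolveLevel({ CTXLOOM_TELEMETRY_LEVEL: "error" })); // crash reports only
console.log(resolveLevel({ DO_NOT_TRACK: "1" }));                // cross-tool opt-out
```

Passing the environment in as a parameter, instead of reading `process.env` directly, keeps the sketch easy to test.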
741
+
742
+ The complete list of events, properties, what is *never* collected, and how project paths are anonymized is documented in **[docs/TELEMETRY.md](docs/TELEMETRY.md)**.
743
+
744
+ ---
745
+
746
+ ## Build from Source
747
+
748
+ ```bash
749
+ git clone https://github.com/kodiii/ctxloom.git
750
+ cd ctxloom
751
+ npm install
752
+ npm run build
753
+ ctxloom index
754
+ node dist/index.js
755
+ ```
756
+
757
+ ---
758
+
759
+ ## Project Structure
760
+
761
+ ```
762
+ src/
763
+ ├── index.ts # CLI entry point
764
+ ├── server.ts # MCP server (Stdio transport)
765
+ ├── tools/
766
+ │ ├── registry.ts # ToolRegistry: register/dispatch
767
+ │ ├── search.ts # ctx_search
768
+ │ ├── file.ts # ctx_get_file
769
+ │ ├── context-packet.ts # ctx_get_context_packet
770
+ │ ├── call-graph.ts # ctx_get_call_graph
771
+ │ ├── definition.ts # ctx_get_definition
772
+ │ ├── rules.ts # ctx_get_rules
773
+ │ ├── rules-check.ts # ctx_rules_check
774
+ │ ├── similar-files.ts # ctx_similar_files
775
+ │ ├── status.ts # ctx_status
776
+ │ ├── blast-radius.ts # ctx_blast_radius
777
+ │ ├── hub-nodes.ts # ctx_hub_nodes
778
+ │ ├── bridge-nodes.ts # ctx_bridge_nodes
779
+ │ ├── community-list.ts # ctx_community_list
780
+ │ ├── architecture-overview.ts # ctx_architecture_overview
781
+ │ ├── knowledge-gaps.ts # ctx_knowledge_gaps
782
+ │ ├── surprising-connections.ts # ctx_surprising_connections
783
+ │ ├── wiki-generate.ts # ctx_wiki_generate
784
+ │ ├── graph-export.ts # ctx_graph_export
785
+ │ ├── git-diff-review.ts # ctx_git_diff_review
786
+ │ ├── refactor-preview.ts # ctx_refactor_preview
787
+ │ ├── execution-flow.ts # ctx_execution_flow
788
+ │ └── cross-repo-search.ts # ctx_cross_repo_search
789
+ ├── rules/
790
+ │ ├── types.ts # Rule, RulesConfig, Violation, CheckResult, RulesConfigError
791
+ │ ├── loadConfig.ts # YAML + zod config loader
792
+ │ ├── RulesChecker.ts # picomatch glob engine — graph edges → violations
793
+ │ ├── reporter.ts # formatText (human) + formatJson (schemaVersion: 1)
794
+ │ └── index.ts # barrel export
795
+ ├── graph/
796
+ │ ├── DependencyGraph.ts # In-memory graph + snapshot + multi-language
797
+ │ ├── CallGraphIndex.ts # Symbol-level call edges (TypeScript/JS)
798
+ │ ├── CommunityDetector.ts # Louvain clustering (graphology)
799
+ │ ├── WikiGenerator.ts # Hash-cached community Markdown wiki
800
+ │ └── GraphExporter.ts # GraphML / DOT / Obsidian export
801
+ ├── ast/
802
+ │ ├── ASTParser.ts # tree-sitter multi-language parser
803
+ │ └── Skeletonizer.ts # Signature-only code views
804
+ ├── db/
805
+ │ └── VectorStore.ts # LanceDB vector storage
806
+ ├── indexer/
807
+ │ └── embedder.ts # HuggingFace embeddings + file collection
808
+ ├── grammars/
809
+ │ └── GrammarLoader.ts # Lazy grammar download + SHA-256 verify
810
+ ├── security/
811
+ │ └── PathValidator.ts # Path traversal protection (CWE-22)
812
+ ├── watcher/
813
+ │ └── FileWatcher.ts # chokidar (200ms debounce, incremental)
814
+ ├── setup/
815
+ │ ├── clients.ts # 13-client registry + detection
816
+ │ └── setup-wizard.ts # Interactive setup CLI
817
+ └── utils/
818
+ ├── logger.ts # Structured JSON-lines logger (stderr)
819
+ └── importExtractor.ts # Regex import extraction (Python/Rust/Go/Java)
820
+
821
+ benchmarks/
822
+ ├── benchmark.ts # Benchmark suite (graph build + search + compression)
823
+ └── README.md # Methodology and reproducibility guide
824
+ ```
825
+
826
+ ---
827
+
828
+ ## License
829
+
830
+ © 2026 [Codzign](https://github.com/kodiii)
831
+
832
+ ctxloom is source-available software. The source code is public for transparency and contributions. Use beyond the 7-day trial requires a valid license key — see [ctxloom.com/pricing](https://ctxloom.com/pricing).
@@ -11592,10 +11592,10 @@ function resolveTelemetryLevel() {
11592
11592
  }
11593
11593
  var TELEMETRY_LEVEL = resolveTelemetryLevel();
11594
11594
  var TELEMETRY_DISABLED = TELEMETRY_LEVEL === "off";
11595
- var CTXLOOM_VERSION = "1.5.3".length > 0 ? "1.5.3" : "dev";
11595
+ var CTXLOOM_VERSION = "1.5.5".length > 0 ? "1.5.5" : "dev";
11596
11596
  var POSTHOG_HOST = "https://eu.i.posthog.com";
11597
11597
  var POSTHOG_KEY = process.env["POSTHOG_API_KEY"] ?? (true ? "phc_CiDkmFLcZ2K6uCpcoSUQLmFrnnUvsyXGhSxopX5TVKE6" : "");
11598
- var SENTRY_DSN = process.env["SENTRY_DSN"] ?? (true ? "https://81c94a0f04a8e242dee493ac1e17f733@o4508531702497280.ingest.de.sentry.io/4511256875368528" : "");
11598
+ var SENTRY_DSN = process.env["SENTRY_DSN"] ?? (true ? "https://81c94a0f04a8e242dee493ac1e17f733@o4508531702497280.ingest.de.sentry.io/4511256875368528\u2028" : "");
11599
11599
  var cachedDistinctId = null;
11600
11600
  function resolveDistinctId() {
11601
11601
  if (cachedDistinctId) return cachedDistinctId;
@@ -1,8 +1,8 @@
1
1
  import {
2
2
  VectorStore
3
- } from "./chunk-DVI2RWJR.js";
3
+ } from "./chunk-R56D54Y7.js";
4
4
  import "./chunk-TYDMSHV7.js";
5
5
  export {
6
6
  VectorStore
7
7
  };
8
- //# sourceMappingURL=VectorStore-XYLGD37W.js.map
8
+ //# sourceMappingURL=VectorStore-4VWT2ZMW.js.map
@@ -1,10 +1,10 @@
1
1
  import {
2
2
  VectorStore
3
- } from "./chunk-DVI2RWJR.js";
3
+ } from "./chunk-R56D54Y7.js";
4
4
  import {
5
5
  collectFiles,
6
6
  generateEmbedding
7
- } from "./chunk-UVR65QBJ.js";
7
+ } from "./chunk-COH5WYZS.js";
8
8
  import {
9
9
  diskSink,
10
10
  readEvents
@@ -8017,7 +8017,7 @@ function registerFullTextSearchTool(registry, ctx) {
8017
8017
  };
8018
8018
  if (mode === "semantic") {
8019
8019
  try {
8020
- const { generateEmbedding: generateEmbedding2 } = await import("./embedder-R4KCXSGO.js");
8020
+ const { generateEmbedding: generateEmbedding2 } = await import("./embedder-7YOG4DFN.js");
8021
8021
  const store = await ctx.getStore(project_root);
8022
8022
  const embedding = await generateEmbedding2(query);
8023
8023
  const results = await store.search(embedding, limit);
@@ -8054,7 +8054,7 @@ function registerFullTextSearchTool(registry, ctx) {
8054
8054
  let merged = keywordResults.slice(0, limit);
8055
8055
  if (mode === "hybrid") {
8056
8056
  try {
8057
- const { generateEmbedding: generateEmbedding2 } = await import("./embedder-R4KCXSGO.js");
8057
+ const { generateEmbedding: generateEmbedding2 } = await import("./embedder-7YOG4DFN.js");
8058
8058
  const store = await ctx.getStore(project_root);
8059
8059
  const embedding = await generateEmbedding2(query);
8060
8060
  const vectorResults = await store.search(embedding, Math.ceil(limit / 2));
@@ -10291,10 +10291,10 @@ var TELEMETRY_DISABLED = TELEMETRY_LEVEL === "off";
10291
10291
  function getTelemetryLevel() {
10292
10292
  return TELEMETRY_LEVEL;
10293
10293
  }
10294
- var CTXLOOM_VERSION = "1.5.3".length > 0 ? "1.5.3" : "dev";
10294
+ var CTXLOOM_VERSION = "1.5.5".length > 0 ? "1.5.5" : "dev";
10295
10295
  var POSTHOG_HOST = "https://eu.i.posthog.com";
10296
10296
  var POSTHOG_KEY = process.env["POSTHOG_API_KEY"] ?? (true ? "phc_CiDkmFLcZ2K6uCpcoSUQLmFrnnUvsyXGhSxopX5TVKE6" : "");
10297
- var SENTRY_DSN = process.env["SENTRY_DSN"] ?? (true ? "https://81c94a0f04a8e242dee493ac1e17f733@o4508531702497280.ingest.de.sentry.io/4511256875368528" : "");
10297
+ var SENTRY_DSN = process.env["SENTRY_DSN"] ?? (true ? "https://81c94a0f04a8e242dee493ac1e17f733@o4508531702497280.ingest.de.sentry.io/4511256875368528\u2028" : "");
10298
10298
  var cachedDistinctId = null;
10299
10299
  function resolveDistinctId() {
10300
10300
  if (cachedDistinctId) return cachedDistinctId;
@@ -11889,4 +11889,4 @@ export {
11889
11889
  skillFilePath,
11890
11890
  installHarness
11891
11891
  };
11892
- //# sourceMappingURL=chunk-MIC7Q72C.js.map
11892
+ //# sourceMappingURL=chunk-5R4P7VEE.js.map
@@ -158,7 +158,7 @@ function collectFiles(dir, results = []) {
158
158
  return results;
159
159
  }
160
160
  async function indexDirectory(rootDir, onProgress) {
161
- const { VectorStore } = await import("./VectorStore-XYLGD37W.js");
161
+ const { VectorStore } = await import("./VectorStore-4VWT2ZMW.js");
162
162
  const store = new VectorStore(path.join(rootDir, ".ctxloom", "vectors.lancedb"));
163
163
  await store.init();
164
164
  const files = collectFiles(rootDir);
@@ -211,4 +211,4 @@ export {
211
211
  collectFiles,
212
212
  indexDirectory
213
213
  };
214
- //# sourceMappingURL=chunk-UVR65QBJ.js.map
214
+ //# sourceMappingURL=chunk-COH5WYZS.js.map
@@ -13,8 +13,19 @@ var VectorStore = class {
13
13
  db = null;
14
14
  table = null;
15
15
  initialized = false;
16
- constructor(dbPath) {
16
+ /**
17
+ * Upserts since the last compaction. LanceDB writes 2 transactions per
18
+ * upsert (delete + add); without periodic compact_files() + cleanup, a
19
+ * long-lived MCP server accumulates tens of thousands of fragment FDs
20
+ * (observed: ~60k FDs / process after 18h of watcher-driven reindex).
21
+ */
22
+ upsertsSinceCompact = 0;
23
+ compactEvery;
24
+ cleanupOlderThanMs;
25
+ constructor(dbPath, options = {}) {
17
26
  this.dbPath = dbPath ?? path.join(process.cwd(), ".ctxloom", "vectors.lancedb");
27
+ this.compactEvery = options.compactEvery ?? 200;
28
+ this.cleanupOlderThanMs = options.cleanupOlderThanMs ?? 60 * 60 * 1e3;
18
29
  }
19
30
  async init() {
20
31
  if (this.initialized) return;
@@ -84,6 +95,33 @@ var VectorStore = class {
84
95
  content: content.slice(0, 512)
85
96
  };
86
97
  await this.table.add([record]);
98
+ this.upsertsSinceCompact++;
99
+ if (this.upsertsSinceCompact >= this.compactEvery) {
100
+ this.upsertsSinceCompact = 0;
101
+ await this.compact();
102
+ }
103
+ }
104
+ /**
105
+ * Merge fragments and prune old LanceDB versions. Idempotent and safe to
106
+ * call mid-flight; the Table API serializes writes internally. Called
107
+ * automatically every `compactEvery` upserts (default 200) to bound FD
108
+ * growth in long-lived MCP server processes.
109
+ *
110
+ * Uses optional-chaining so older `@lancedb/lancedb` builds without
111
+ * `optimize()` degrade to a no-op instead of crashing.
112
+ */
113
+ async compact() {
114
+ if (!this.table) return;
115
+ try {
116
+ const optimizable = this.table;
117
+ await optimizable.optimize?.({
118
+ cleanupOlderThan: new Date(Date.now() - this.cleanupOlderThanMs)
119
+ });
120
+ } catch (err) {
121
+ logger.warn("VectorStore.compact failed (non-fatal)", {
122
+ detail: err instanceof Error ? err.message : String(err)
123
+ });
124
+ }
87
125
  }
88
126
  /**
89
127
  * Search for the top-K most similar code records using vector search.
@@ -141,4 +179,4 @@ var VectorStore = class {
141
179
  export {
142
180
  VectorStore
143
181
  };
144
- //# sourceMappingURL=chunk-DVI2RWJR.js.map
182
+ //# sourceMappingURL=chunk-R56D54Y7.js.map
@@ -3,7 +3,7 @@ import {
3
3
  collectFiles,
4
4
  generateEmbedding,
5
5
  indexDirectory
6
- } from "./chunk-UVR65QBJ.js";
6
+ } from "./chunk-COH5WYZS.js";
7
7
  import "./chunk-TYDMSHV7.js";
8
8
  export {
9
9
  EMBEDDING_DIMENSION,
@@ -11,4 +11,4 @@ export {
11
11
  generateEmbedding,
12
12
  indexDirectory
13
13
  };
14
- //# sourceMappingURL=embedder-R4KCXSGO.js.map
14
+ //# sourceMappingURL=embedder-7YOG4DFN.js.map
package/dist/index.js CHANGED
@@ -45,18 +45,18 @@ import {
45
45
  validateDefaultRoot,
46
46
  wrapWithIndexingEnvelope,
47
47
  writeCODEOWNERS
48
- } from "./chunk-MIC7Q72C.js";
48
+ } from "./chunk-5R4P7VEE.js";
49
49
  import {
50
50
  addCtxloomToConfig,
51
51
  detectInstalledClients
52
52
  } from "./chunk-II2DPYRJ.js";
53
53
  import {
54
54
  VectorStore
55
- } from "./chunk-DVI2RWJR.js";
55
+ } from "./chunk-R56D54Y7.js";
56
56
  import {
57
57
  generateEmbedding,
58
58
  indexDirectory
59
- } from "./chunk-UVR65QBJ.js";
59
+ } from "./chunk-COH5WYZS.js";
60
60
  import "./chunk-5I6CJITG.js";
61
61
  import {
62
62
  logger
@@ -1020,7 +1020,7 @@ try {
1020
1020
  } catch {
1021
1021
  }
1022
1022
  var args = process.argv.slice(2);
1023
- var ctxloomVersion = "1.5.3".length > 0 ? "1.5.3" : "dev";
1023
+ var ctxloomVersion = "1.5.5".length > 0 ? "1.5.5" : "dev";
1024
1024
  if (args.includes("--version") || args.includes("-v")) {
1025
1025
  process.stdout.write(`ctxloom ${ctxloomVersion}
1026
1026
  `);
@@ -1093,7 +1093,7 @@ async function checkLicense() {
1093
1093
  if (command !== void 0 && LICENSE_GATE_BYPASS_COMMANDS.has(command)) return;
1094
1094
  const ciKey = process.env["CTXLOOM_LICENSE_KEY"];
1095
1095
  if (ciKey) {
1096
- const { ApiClient } = await import("./src-MTMXJEKZ.js");
1096
+ const { ApiClient } = await import("./src-QMDQDATD.js");
1097
1097
  const client = new ApiClient(process.env["CTXLOOM_API_BASE"]);
1098
1098
  try {
1099
1099
  const result = await client.validate(ciKey, "ci-ephemeral");
@@ -1471,7 +1471,7 @@ async function main() {
1471
1471
  }
1472
1472
  if (!skipHarness) {
1473
1473
  process.stdout.write("\n");
1474
- const { installHarness } = await import("./src-MTMXJEKZ.js");
1474
+ const { installHarness } = await import("./src-QMDQDATD.js");
1475
1475
  const h = installHarness({ cwd: initRoot, dryRun, force, extraHosts });
1476
1476
  const harnessFiles = [
1477
1477
  h.claudeMd,
@@ -1534,7 +1534,7 @@ async function main() {
1534
1534
  process.exit(1);
1535
1535
  }
1536
1536
  if (alias !== void 0) {
1537
- const { validateAlias } = await import("./src-MTMXJEKZ.js");
1537
+ const { validateAlias } = await import("./src-QMDQDATD.js");
1538
1538
  const v = validateAlias(alias);
1539
1539
  if (!v.ok) {
1540
1540
  console.error(`[ctxloom] Invalid alias: ${v.reason}`);
@@ -1798,7 +1798,7 @@ Suggested reviewers for ${files.length} file(s):`);
1798
1798
  process.stderr.write("[ctxloom] --limit must be a non-negative integer (0 for unlimited)\n");
1799
1799
  process.exit(2);
1800
1800
  }
1801
- const { loadRulesConfig, RulesChecker, formatText, formatJson, RulesConfigError } = await import("./src-MTMXJEKZ.js");
1801
+ const { loadRulesConfig, RulesChecker, formatText, formatJson, RulesConfigError } = await import("./src-QMDQDATD.js");
1802
1802
  let config;
1803
1803
  try {
1804
1804
  config = await loadRulesConfig(root);
@@ -1822,7 +1822,7 @@ Suggested reviewers for ${files.length} file(s):`);
1822
1822
  }
1823
1823
  let graph;
1824
1824
  if (useSnapshot) {
1825
- const { DependencyGraph: DG } = await import("./src-MTMXJEKZ.js");
1825
+ const { DependencyGraph: DG } = await import("./src-QMDQDATD.js");
1826
1826
  graph = new DG();
1827
1827
  const loaded = await graph.loadSnapshotOnly(root);
1828
1828
  if (!loaded) {
@@ -1831,7 +1831,7 @@ Suggested reviewers for ${files.length} file(s):`);
1831
1831
  }
1832
1832
  } else {
1833
1833
  process.stderr.write("[ctxloom] Building dependency graph...\n");
1834
- const { ASTParser: ASTParser2, DependencyGraph: DependencyGraph2 } = await import("./src-MTMXJEKZ.js");
1834
+ const { ASTParser: ASTParser2, DependencyGraph: DependencyGraph2 } = await import("./src-QMDQDATD.js");
1835
1835
  let parser;
1836
1836
  try {
1837
1837
  parser = new ASTParser2();
@@ -129,16 +129,16 @@ import {
129
129
  wrapBlock,
130
130
  wrapWithIndexingEnvelope,
131
131
  writeCODEOWNERS
132
- } from "./chunk-MIC7Q72C.js";
132
+ } from "./chunk-5R4P7VEE.js";
133
133
  import {
134
134
  VectorStore
135
- } from "./chunk-DVI2RWJR.js";
135
+ } from "./chunk-R56D54Y7.js";
136
136
  import {
137
137
  EMBEDDING_DIMENSION,
138
138
  collectFiles,
139
139
  generateEmbedding,
140
140
  indexDirectory
141
- } from "./chunk-UVR65QBJ.js";
141
+ } from "./chunk-COH5WYZS.js";
142
142
  import {
143
143
  filenameForDate,
144
144
  readEvents,
@@ -294,4 +294,4 @@ export {
294
294
  wrapWithIndexingEnvelope,
295
295
  writeCODEOWNERS
296
296
  };
297
- //# sourceMappingURL=src-MTMXJEKZ.js.map
297
+ //# sourceMappingURL=src-QMDQDATD.js.map
@@ -1,9 +1,9 @@
1
1
  import {
2
2
  VectorStore
3
- } from "../chunk-DVI2RWJR.js";
3
+ } from "../chunk-R56D54Y7.js";
4
4
  import {
5
5
  generateEmbedding
6
- } from "../chunk-UVR65QBJ.js";
6
+ } from "../chunk-COH5WYZS.js";
7
7
  import "../chunk-TYDMSHV7.js";
8
8
 
9
9
  // packages/core/src/workers/indexerWorker.ts
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "ctxloom-pro",
3
- "version": "1.5.3",
3
+ "version": "1.5.5",
4
4
  "description": "ctxloom — The Universal Code Context Engine. A local-first MCP server providing intelligent code context via hybrid Vector + AST + Graph search with Skeletonization (92% token reduction).",
5
5
  "type": "module",
6
6
  "main": "dist/index.js",