cto-ai-cli 5.2.0 → 7.1.0
- package/README.md +169 -316
- package/dist/cli/index.js +6306 -5691
- package/dist/engine/index.d.ts +690 -730
- package/dist/engine/index.js +2415 -4721
- package/dist/mcp/index.js +3313 -15036
- package/package.json +9 -43
- package/DOCS.md +0 -902
- package/dist/action/index.js +0 -26395
- package/dist/api/dashboard.js +0 -2276
- package/dist/api/dashboard.js.map +0 -1
- package/dist/api/server.js +0 -3663
- package/dist/api/server.js.map +0 -1
- package/dist/cli/gateway.js +0 -3054
- package/dist/cli/index.d.ts +0 -2
- package/dist/cli/index.js.map +0 -1
- package/dist/cli/score.js +0 -6352
- package/dist/cli/v2/index.d.ts +0 -2
- package/dist/cli/v2/index.js +0 -3695
- package/dist/cli/v2/index.js.map +0 -1
- package/dist/engine/index.js.map +0 -1
- package/dist/fsevents-X6WP4TKM.node +0 -0
- package/dist/gateway/index.d.ts +0 -281
- package/dist/gateway/index.js +0 -2932
- package/dist/gateway/index.js.map +0 -1
- package/dist/govern/index.d.ts +0 -325
- package/dist/govern/index.js +0 -1101
- package/dist/govern/index.js.map +0 -1
- package/dist/interact/index.d.ts +0 -234
- package/dist/interact/index.js +0 -1542
- package/dist/interact/index.js.map +0 -1
- package/dist/mcp/index.d.ts +0 -2
- package/dist/mcp/index.js.map +0 -1
- package/dist/mcp/v2.d.ts +0 -2
- package/dist/mcp/v2.js +0 -18492
- package/dist/mcp/v2.js.map +0 -1
package/README.md
CHANGED
|
@@ -1,435 +1,288 @@
|
|
|
1
|
-
# CTO —
|
|
1
|
+
# CTO — AI Context Selection Engine
|
|
2
2
|
|
|
3
|
-
[](LICENSE)
|
|
4
|
-
[](#)
|
|
5
|
-
[](#)
|
|
6
3
|
[](https://www.npmjs.com/package/cto-ai-cli)
|
|
4
|
+
[](LICENSE)
|
|
5
|
+
[](.)
|
|
7
6
|
|
|
8
|
-
|
|
7
|
+
**Pick the right files for any AI task. Secrets auto-redacted. Learns from your feedback.**
|
|
9
8
|
|
|
10
9
|
```bash
|
|
11
|
-
|
|
10
|
+
cto --context "fix the auth middleware" --stdout | pbcopy # → clipboard
|
|
11
|
+
cto --context "fix auth" --prompt "Refactor to use JWT" # → AI prompt
|
|
12
|
+
cto --accept # → learns
|
|
12
13
|
```
|
|
13
14
|
|
|
14
|
-
|
|
15
|
+
76KB package · 606 tests · Zero AI dependencies.
|
|
15
16
|
|
|
16
17
|
---
|
|
17
18
|
|
|
18
19
|
## The Problem
|
|
19
20
|
|
|
20
|
-
When
|
|
21
|
+
When developers use AI coding assistants, they need to provide context — the right source files. Today, most teams either:
|
|
21
22
|
|
|
22
|
-
- **Send everything**
|
|
23
|
-
- **
|
|
24
|
-
- **Let the AI pick** — it doesn't know your dependency graph
|
|
23
|
+
- **Send everything** → expensive, slow, hits token limits
|
|
24
|
+
- **Pick files manually** → miss dependencies, forget test files, leak secrets
|
|
25
25
|
|
|
26
|
-
|
|
26
|
+
CTO solves both: it **automatically selects the most relevant files** for any task, **sanitizes secrets** before they reach any AI provider, and **learns from feedback** to get better over time.
|
|
27
27
|
|
|
28
|
-
##
|
|
28
|
+
## Quick Demo
|
|
29
29
|
|
|
30
30
|
```bash
|
|
31
|
-
|
|
32
|
-
```
|
|
33
|
-
```
|
|
34
|
-
⚡ cto-score — analyzing your project...
|
|
35
|
-
|
|
36
|
-
╔══════════════════════════════════════════════════╗
|
|
37
|
-
║ ║
|
|
38
|
-
║ 🟢 Context Score™ 88 / 100 Grade: A- ║
|
|
39
|
-
║ ║
|
|
40
|
-
║ Efficiency ████████████████░░░░ 80% ║
|
|
41
|
-
║ Coverage ████████████████████ 100% ║
|
|
42
|
-
║ Risk Control ████████████████████ 100% ║
|
|
43
|
-
║ Structure █░░░░░░░░░░░░░░░░░░ 5% ║
|
|
44
|
-
║ Governance ██████████████████░ 90% ║
|
|
45
|
-
║ ║
|
|
46
|
-
║ 💰 vs. Sending Everything: ║
|
|
47
|
-
║ Tokens saved: 392K (88%) ║
|
|
48
|
-
║ Monthly savings: ~$943 ║
|
|
49
|
-
║ ║
|
|
50
|
-
╚══════════════════════════════════════════════════╝
|
|
51
|
-
|
|
52
|
-
Scanned in 0.6s · 199 files · 443K tokens
|
|
31
|
+
cto --demo # Run a live showcase on your project
|
|
53
32
|
```
|
|
54
33
|
|
|
55
|
-
|
|
34
|
+
This runs a self-contained presentation that shows: project analysis, semantic matching proof, secret sanitization, ROI calculation, and benchmark results.
|
|
56
35
|
|
|
57
|
-
|
|
58
|
-
|--------|-----------------|----------------|
|
|
59
|
-
| **Context Score (88/100)** | Overall AI-readiness of your project | Higher = AI tools produce better output with your code |
|
|
60
|
-
| **Efficiency (80%)** | How much CTO can compress without losing value | 80% means we send 20% of tokens for the same quality |
|
|
61
|
-
| **Coverage (100%)** | % of important files included in the selection | 100% = every dependency and type file is captured |
|
|
62
|
-
| **Risk Control (100%)** | Are high-risk files (hubs, complex code) prioritized? | Ensures AI sees the files most likely to cause bugs |
|
|
63
|
-
| **Structure (5%)** | How well-organized your codebase is for AI | Low = too many large files, poor modularity |
|
|
64
|
-
| **Governance (90%)** | Audit logging, policy enforcement, secret scanning | Enterprise readiness |
|
|
65
|
-
| **Tokens saved (88%)** | Reduction vs. sending every file | Directly reduces your API costs |
|
|
66
|
-
| **Monthly savings ($943)** | Estimated cost reduction at 800 interactions/month | Based on average GPT-4o pricing |
|
|
36
|
+
## Benchmark Results
|
|
67
37
|
|
|
68
|
-
|
|
38
|
+
Tested against 8 curated tasks with ground truth (known correct files):
|
|
69
39
|
|
|
70
|
-
|
|
40
|
+
| Strategy | Precision | Must-have Recall | F1 |
|
|
41
|
+
|---|---|---|---|
|
|
42
|
+
| **CTO** | 33.6% | **100.0%** | **48.7%** |
|
|
43
|
+
| TF-IDF only | 54.6% | 87.5% | 62.0% |
|
|
44
|
+
| Risk-only | 20.8% | 18.8% | 15.0% |
|
|
45
|
+
| Alphabetical | 8.3% | 31.3% | 12.9% |
|
|
46
|
+
| Random | 7.7% | 6.3% | 2.8% |
|
|
71
47
|
|
|
72
|
-
|
|
48
|
+
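**CTO never misses a must-have file** on this benchmark (100% must-have recall). Its F1 is 3.8× that of alphabetical selection and 17× that of random selection. The TF-IDF-only baseline has higher precision and F1 but misses 12.5% of must-have files.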
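
For readers who want to check the table's columns on their own tasks, here is a minimal sketch of how precision, recall, and F1 are conventionally computed for a single task. The function name and input shapes are hypothetical, and the README does not say how per-task values are aggregated into the table, so this is not a description of CTO's benchmark code.

```typescript
// Precision, recall, and F1 for one benchmark task.
// `selected` and `mustHave` are hypothetical inputs (lists of unique file paths),
// not CTO's actual benchmark data format.
interface TaskMetrics {
  precision: number;
  recall: number;
  f1: number;
}

function scoreSelection(selected: string[], mustHave: string[]): TaskMetrics {
  // Count selected files that also appear in the ground-truth list.
  const hits = selected.filter((f) => mustHave.indexOf(f) !== -1).length;
  const precision = selected.length === 0 ? 0 : hits / selected.length;
  const recall = mustHave.length === 0 ? 0 : hits / mustHave.length;
  // F1 is the harmonic mean of precision and recall.
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);
  return { precision, recall, f1 };
}

// Four files selected, two of them are the must-have files.
const exampleMetrics = scoreSelection(
  ["src/auth.ts", "src/jwt.ts", "src/logger.ts", "src/db.ts"],
  ["src/auth.ts", "src/jwt.ts"]
);
console.log(exampleMetrics);
```

In this example every must-have file is selected (recall 1.0) while half of the selection is extra (precision 0.5), which is the same trade-off the CTO row in the table shows.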
|
|
73
49
|
|
|
74
|
-
|
|
75
|
-
npx cto-ai-cli # Analyze current directory
|
|
76
|
-
npx cto-ai-cli ./my-project # Analyze a specific project
|
|
77
|
-
npx cto-ai-cli --json # Machine-readable JSON output
|
|
78
|
-
```
|
|
50
|
+
## ROI
|
|
79
51
|
|
|
80
|
-
|
|
52
|
+
On a typical 130-file TypeScript project:
|
|
81
53
|
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
54
|
+
| Metric | Without CTO | With CTO |
|
|
55
|
+
|---|---|---|
|
|
56
|
+
| Tokens per interaction | 370K (all files) | ~28K (selected) |
|
|
57
|
+
| Cost per interaction (Sonnet) | $1.11 | $0.08 |
|
|
58
|
+
| **Monthly cost (10 devs × 40 interactions/dev/day, 20 working days)** | **$8,880** | **$640** |
|
|
59
|
+
| **Annual savings** | — | **~$99,000** |
|
|
85
60
|
|
|
86
|
-
|
|
61
|
+
Plus: fewer hallucinations (right context), zero secret leaks, and the learner gets smarter with every `--accept` / `--reject`.
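
The figures in the table can be reproduced with a few lines of arithmetic. The sketch below assumes a price of $3 per million input tokens and 20 working days per month. Neither is stated in the README; both are inferred because they make the table's numbers come out.

```typescript
// Assumptions (inferred, not stated in the README):
//   - $3 per million input tokens
//   - 20 working days per month
const PRICE_PER_MILLION_TOKENS_USD = 3;
const WORKING_DAYS_PER_MONTH = 20;

function costPerInteraction(tokens: number): number {
  return (tokens / 1_000_000) * PRICE_PER_MILLION_TOKENS_USD;
}

function monthlyCost(costPerCall: number, devs: number, callsPerDevPerDay: number): number {
  return costPerCall * devs * callsPerDevPerDay * WORKING_DAYS_PER_MONTH;
}

const costWithoutCto = costPerInteraction(370_000); // about $1.11
const costWithCto = costPerInteraction(28_000); // about $0.084, shown as $0.08 in the table

// The table multiplies the rounded per-call costs, so the same is done here.
const monthlyWithoutCto = monthlyCost(1.11, 10, 40);
const monthlyWithCto = monthlyCost(0.08, 10, 40);
const annualSavings = (monthlyWithoutCto - monthlyWithCto) * 12;

console.log({ costWithoutCto, costWithCto, monthlyWithoutCto, monthlyWithCto, annualSavings });
```

Under these assumptions the annual difference is $98,880, which the table rounds to ~$99,000.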
|
|
87
62
|
|
|
88
|
-
|
|
89
|
-
npx cto-ai-cli --context "refactor the auth middleware"
|
|
90
|
-
```
|
|
63
|
+
## How it Works
|
|
91
64
|
|
|
92
|
-
Generates **task-specific** context — only files relevant to auth, including types, dependencies, and related tests.
|
|
93
|
-
|
|
94
|
-
Example output:
|
|
95
65
|
```
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
│ src/middleware/auth.ts 2,100 tokens
|
|
102
|
-
│ src/types/auth.ts 450 tokens
|
|
103
|
-
│ src/config/jwt.ts 320 tokens
|
|
104
|
-
│
|
|
105
|
-
├─ Dependencies (5 files) ─────────────────────
|
|
106
|
-
│ src/models/user.ts 1,200 tokens
|
|
107
|
-
│ src/services/token.ts 890 tokens
|
|
108
|
-
│ ...
|
|
109
|
-
│
|
|
110
|
-
└─ Tests (2 files) ────────────────────────────
|
|
111
|
-
tests/auth.test.ts 1,800 tokens
|
|
112
|
-
tests/middleware.test.ts 940 tokens
|
|
113
|
-
|
|
114
|
-
Saved to .cto/context.md (8.2K tokens — 97% smaller than full project)
|
|
66
|
+
Task description ──→ TF-IDF/BM25 ──→ Semantic scores ──┐
|
|
67
|
+
│
|
|
68
|
+
Project files ──→ Dependency graph ──→ Risk scores ──────┤──→ Composite ──→ Greedy ──→ Selection
|
|
69
|
+
│ ranking alloc
|
|
70
|
+
Feedback history ──→ Bayesian learner ──→ Boosts ────────┘
|
|
115
71
|
```
|
|
116
72
|
|
|
117
|
-
|
|
73
|
+
1. **Dependency graph** — parses imports, builds adjacency list, identifies hubs
|
|
74
|
+
2. **Risk scoring** — complexity × centrality × recency (continuous, log-scaled)
|
|
75
|
+
3. **TF-IDF/BM25 semantic matching** — task description scored against file contents + path boosting
|
|
76
|
+
4. **Composite ranking** — `finalScore = semantic × 0.55 + risk × 0.25 + learner × 0.2`
|
|
77
|
+
5. **Noise filtering** — files with zero semantic relevance are excluded (benchmark-driven optimization)
|
|
78
|
+
6. **Greedy allocation** — fills token budget top-down, cascading prune levels (full → signatures → skeleton)
|
|
79
|
+
7. **Bayesian learning** — exponential decay, Wilson score confidence, per-task-type patterns
|
|
80
|
+
|
|
81
|
+
**No AI is used for selection.** Same input → same output. Deterministic.
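
Steps 4 to 6 above can be sketched roughly as follows. The weights come from the formula in step 4; the `Candidate` type, the function names, and the simple "skip files that do not fit" rule are assumptions made for illustration. The cascading prune levels (full → signatures → skeleton) are left out, so this is not the package's actual allocator.

```typescript
// Hypothetical candidate shape; all scores are assumed to be normalized to 0..1.
interface Candidate {
  path: string;
  tokens: number;
  semantic: number;
  risk: number;
  learner: number;
}

// Composite ranking with the weights given in step 4.
function finalScore(c: Candidate): number {
  return c.semantic * 0.55 + c.risk * 0.25 + c.learner * 0.2;
}

// Noise filter (step 5) followed by greedy budget filling (step 6).
function selectGreedy(candidates: Candidate[], budget: number): string[] {
  const ranked = candidates
    .filter((c) => c.semantic > 0) // drop files with zero semantic relevance
    .sort((a, b) => finalScore(b) - finalScore(a)); // highest composite score first

  const selected: string[] = [];
  let used = 0;
  for (const c of ranked) {
    // Take the file only if it still fits in the remaining token budget.
    if (used + c.tokens <= budget) {
      selected.push(c.path);
      used += c.tokens;
    }
  }
  return selected;
}

const demoCandidates: Candidate[] = [
  { path: "src/auth.ts", tokens: 2000, semantic: 0.9, risk: 0.5, learner: 0 },
  { path: "src/user.ts", tokens: 1500, semantic: 0.4, risk: 0.8, learner: 0.5 },
  { path: "README.md", tokens: 300, semantic: 0, risk: 1, learner: 1 },
  { path: "src/big.ts", tokens: 5000, semantic: 0.6, risk: 0.2, learner: 0 },
];
const demoSelection = selectGreedy(demoCandidates, 4000);
console.log(demoSelection);
```

Because the ranking is a pure function of the scores and the sort is applied to a fixed input, the same candidates and budget always give the same selection, which matches the determinism claim above.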
|
|
82
|
+
|
|
83
|
+
## Install
|
|
118
84
|
|
|
119
85
|
```bash
|
|
120
|
-
|
|
86
|
+
npm i -g cto-ai-cli # global
|
|
87
|
+
npx cto-ai-cli # or one-shot
|
|
121
88
|
```
|
|
122
89
|
|
|
123
|
-
|
|
90
|
+
## Context Selection
|
|
124
91
|
|
|
92
|
+
```bash
|
|
93
|
+
cto --context "refactor the auth middleware" # human-readable summary
|
|
94
|
+
cto --context "fix login bug" --stdout | pbcopy # pipe to clipboard
|
|
95
|
+
cto --context "add tests" --output context.md # save to file
|
|
96
|
+
cto --context "fix login" --prompt "Refactor to async/await" # full AI prompt
|
|
97
|
+
cto --context "debug scoring" --json # JSON for tooling
|
|
98
|
+
cto --context "fix auth" --budget 30000 # custom token budget
|
|
125
99
|
```
|
|
126
|
-
🔴 CRITICAL src/config/stripe.ts:8
|
|
127
|
-
api-key: sk_l********************yZ
|
|
128
|
-
🔴 CRITICAL src/config/database.ts:14
|
|
129
|
-
connection-string: post********************db
|
|
130
|
-
🟠 HIGH src/utils/email.ts:22
|
|
131
|
-
pii: admi**********om
|
|
132
|
-
|
|
133
|
-
🚨 3 critical findings. Rotate credentials immediately.
|
|
134
|
-
```
|
|
135
100
|
|
|
136
|
-
|
|
101
|
+
Output includes full file contents in markdown, ready for Claude, ChatGPT, or any AI. **Secrets are automatically redacted** — API keys, tokens, passwords, PII are replaced with `****` before output.
|
|
102
|
+
|
|
103
|
+
## Feedback Loop
|
|
137
104
|
|
|
138
|
-
|
|
105
|
+
CTO learns from real feedback, not from itself:
|
|
139
106
|
|
|
140
107
|
```bash
|
|
141
|
-
|
|
108
|
+
cto --accept # last selection was good
|
|
109
|
+
cto --reject # last selection was bad
|
|
110
|
+
cto --reject --missing src/auth.ts # this file was missing
|
|
111
|
+
cto --stats # see what CTO has learned
|
|
142
112
|
```
|
|
143
113
|
|
|
144
|
-
|
|
114
|
+
On `--reject`, CTO also detects files you edited after the selection that weren't in the context — those get automatically boosted for next time.
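
The README mentions exponential decay and Wilson score confidence for the learner without giving formulas. As an illustration only, the standard forms of both look like this. The function names, the half-life parameter, and the 95% z-value are assumptions, not values taken from the package.

```typescript
// Lower bound of the Wilson score interval for a proportion.
// With few observations the bound stays low even if every one was an accept.
function wilsonLowerBound(accepts: number, total: number, z: number = 1.96): number {
  if (total === 0) return 0;
  const p = accepts / total;
  const z2 = z * z;
  const center = p + z2 / (2 * total);
  const margin = z * Math.sqrt((p * (1 - p) + z2 / (4 * total)) / total);
  return (center - margin) / (1 + z2 / total);
}

// Exponential decay: feedback loses half its weight every `halfLifeDays`.
// The half-life is a hypothetical tuning parameter.
function decayWeight(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

console.log(wilsonLowerBound(1, 1)); // one accept out of one: low confidence
console.log(wilsonLowerBound(10, 10)); // ten out of ten: noticeably higher
console.log(decayWeight(30, 30)); // feedback one half-life old
```

A lower bound like this is one plausible reason a learner would need several feedback cycles before a file's boost becomes large enough to matter.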
|
|
145
115
|
|
|
146
|
-
|
|
147
|
-
📊 Review Quality: 82/100 (B+)
|
|
148
|
-
|
|
149
|
-
Breaking Changes:
|
|
150
|
-
🔴 Removed export: UserService.findById (used by 4 files)
|
|
151
|
-
🟡 Changed signature: authenticate(token) → authenticate(token, opts)
|
|
116
|
+
## Secret Audit
|
|
152
117
|
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
118
|
+
```bash
|
|
119
|
+
cto --audit # scan all files
|
|
120
|
+
cto --audit --init-hook # install pre-commit hook
|
|
121
|
+
cto --audit --full-scan # ignore cache, scan everything
|
|
122
|
+
cto --audit --json # machine-readable output
|
|
123
|
+
```
|
|
156
124
|
|
|
157
|
-
|
|
158
|
-
Direct: 4 files | Transitive: 12 files | Tests: 3 files
|
|
125
|
+
45+ patterns (AWS, Stripe, GitHub, OpenAI, Slack, Cloudflare...) plus Shannon entropy analysis. The real value: **audit protects context** — every `--stdout`, `--output`, and `--prompt` auto-sanitizes secrets before output.
|
|
159
126
|
|
|
160
|
-
|
|
127
|
+
```
|
|
128
|
+
Before: OPENAI_KEY = "sk-Rk8bN3xYz2Wq5PmL7jCvT1aBcDe"
|
|
129
|
+
After: OPENAI_KEY = "sk-R********************De"
|
|
161
130
|
```
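
To make the two ideas above concrete, here is a rough sketch of Shannon entropy over a string and of a mask that keeps a short prefix and suffix. The function names, the 4/2 keep lengths, and the fixed 20-character mask are assumptions read off the Before/After example; the package's real detection thresholds and masking rules are not shown in this README.

```typescript
// Shannon entropy in bits per character. Random-looking strings such as API keys
// score higher than ordinary identifiers or words.
function shannonEntropy(s: string): number {
  const counts: Record<string, number> = Object.create(null);
  for (let i = 0; i < s.length; i++) {
    const ch = s.charAt(i);
    counts[ch] = (counts[ch] || 0) + 1;
  }
  let entropy = 0;
  Object.keys(counts).forEach((ch) => {
    const p = (counts[ch] || 0) / s.length;
    entropy -= p * (Math.log(p) / Math.LN2);
  });
  return entropy;
}

// Keep a few leading and trailing characters and replace the rest with a
// fixed-width run of asterisks (hypothetical defaults: 4, 2, 20).
function maskSecret(
  value: string,
  keepStart: number = 4,
  keepEnd: number = 2,
  maskLength: number = 20
): string {
  const stars = new Array(maskLength + 1).join("*"); // maskLength asterisks
  if (value.length <= keepStart + keepEnd) return stars; // too short: hide everything
  return value.slice(0, keepStart) + stars + value.slice(value.length - keepEnd);
}

const sampleKey = "sk-Rk8bN3xYz2Wq5PmL7jCvT1aBcDe";
console.log(maskSecret(sampleKey));
console.log(shannonEntropy(sampleKey).toFixed(2), shannonEntropy("password").toFixed(2));
```

A fixed-width mask has the side benefit of not revealing the secret's length.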
|
|
162
131
|
|
|
163
|
-
|
|
164
|
-
|-----------------|--------|
|
|
165
|
-
| **Breaking changes** | Removed exports, changed function signatures, deleted files |
|
|
166
|
-
| **Missing files** | Tests, type files, barrel exports, importers of changed code |
|
|
167
|
-
| **Impact radius** | How many files are affected (direct + transitive via BFS) |
|
|
168
|
-
| **Review quality** | Score based on PR size, focus, breaking changes, completeness |
|
|
132
|
+
## AI Gateway (Enterprise)
|
|
169
133
|
|
|
170
|
-
|
|
134
|
+
A transparent HTTP proxy between your developers and AI providers. Automatically injects optimized context, redacts secrets, and tracks costs — without changing developer workflow.
|
|
171
135
|
|
|
172
136
|
```bash
|
|
173
|
-
|
|
174
|
-
|
|
175
|
-
|
|
137
|
+
cto --gateway # Start on port 8787
|
|
138
|
+
cto --gateway --port 9000 # Custom port
|
|
139
|
+
cto --gateway --block-secrets # Block requests with critical secrets
|
|
140
|
+
cto --gateway --budget-daily 50 # $50/day budget limit
|
|
141
|
+
cto --gateway --budget-monthly 500 # $500/month budget limit
|
|
176
142
|
```
|
|
177
143
|
|
|
178
|
-
|
|
144
|
+
```
|
|
145
|
+
Developer → CTO Gateway → [context injection + sanitization + cost tracking] → AI Provider
|
|
146
|
+
↓
|
|
147
|
+
Dashboard (http://localhost:8787/__cto)
|
|
148
|
+
```
|
|
179
149
|
|
|
180
|
-
|
|
150
|
+
**What the gateway does automatically:**
|
|
151
|
+
- **Injects CTO-selected context** into every AI request (TF-IDF + composite scoring)
|
|
152
|
+
- **Redacts secrets** before they leave the network (45+ patterns)
|
|
153
|
+
- **Tracks costs** per model, per day, per month with budget alerts
|
|
154
|
+
- **Streams responses** with zero-copy SSE passthrough
|
|
155
|
+
- **Serves a live dashboard** at `/__cto` with real-time metrics
|
|
181
156
|
|
|
182
|
-
|
|
183
|
-
npx cto-ai-cli --ci # Run quality gate (exits 1 on failure)
|
|
184
|
-
npx cto-ai-cli --ci --threshold 80 # Custom minimum score
|
|
185
|
-
npx cto-ai-cli --ci --json # JSON for pipeline parsing
|
|
186
|
-
```
|
|
157
|
+
Supports OpenAI, Anthropic, Google, and Azure OpenAI. SSRF protection built-in.
|
|
187
158
|
|
|
188
|
-
|
|
159
|
+
## Cross-Repo Context
|
|
189
160
|
|
|
190
|
-
|
|
161
|
+
When working on a task, CTO can pull relevant files from **sibling repositories** — not just the current project.
|
|
191
162
|
|
|
192
163
|
```bash
|
|
193
|
-
|
|
194
|
-
|
|
164
|
+
cto --context "fix payment webhook" --auto-repos # Auto-discover sibling repos
|
|
165
|
+
cto --context "fix payment webhook" --repos shared-types,payment-service
|
|
195
166
|
```
|
|
196
167
|
|
|
197
|
-
|
|
168
|
+
**How it works:**
|
|
169
|
+
1. Discovers sibling repos in parent directory (any dir with `package.json`, `tsconfig.json`, `Cargo.toml`, etc.)
|
|
170
|
+
2. Builds a lightweight TF-IDF index per sibling (reads source files, no full analysis)
|
|
171
|
+
3. Queries each sibling with the task description
|
|
172
|
+
4. Returns ranked matches with repo attribution and content
|
|
198
173
|
|
|
199
|
-
|
|
174
|
+
Real use case: You're fixing a webhook handler in `api-gateway` — CTO finds the `Payment` interface in `shared-types` and the consumer in `notification-service` automatically.
|
|
175
|
+
|
|
176
|
+
## Cost-Aware Model Routing
|
|
200
177
|
|
|
201
|
-
|
|
178
|
+
CTO analyzes the **actual selected context** (not just the project) to recommend the cheapest model that can handle the task.
|
|
202
179
|
|
|
203
180
|
```bash
|
|
204
|
-
#
|
|
205
|
-
|
|
206
|
-
|
|
207
|
-
npx cto-ai-cli --benchmark # CTO vs naive vs random comparison
|
|
208
|
-
npx cto-ai-cli --compare # Compare vs popular OSS projects
|
|
209
|
-
npx cto-ai-cli --report # Markdown report + badge
|
|
210
|
-
|
|
211
|
-
# Context generation
|
|
212
|
-
npx cto-ai-cli --fix # Auto-generate .cto/context.md
|
|
213
|
-
npx cto-ai-cli --context "task" # Task-specific context
|
|
214
|
-
|
|
215
|
-
# Security
|
|
216
|
-
npx cto-ai-cli --audit # Secret & PII detection
|
|
217
|
-
npx cto-ai-cli --audit --full-scan # Scan all files (ignore cache)
|
|
218
|
-
npx cto-ai-cli --audit --init-hook # Install pre-commit hook
|
|
219
|
-
|
|
220
|
-
# Code review
|
|
221
|
-
npx cto-ai-cli --review # PR review analysis
|
|
222
|
-
npx cto-ai-cli --review --json # Review data as JSON
|
|
223
|
-
|
|
224
|
-
# Learning
|
|
225
|
-
npx cto-ai-cli --learn # Feedback model dashboard
|
|
226
|
-
npx cto-ai-cli --predict # File predictions for a task
|
|
227
|
-
npx cto-ai-cli --learn --json # Export learning data
|
|
228
|
-
|
|
229
|
-
# CI/CD
|
|
230
|
-
npx cto-ai-cli --ci # Quality gate
|
|
231
|
-
npx cto-ai-cli --ci --threshold 80 # Custom threshold
|
|
232
|
-
|
|
233
|
-
# Monorepo
|
|
234
|
-
npx cto-ai-cli --monorepo # Full monorepo analysis
|
|
235
|
-
npx cto-ai-cli --monorepo --package X # Single package
|
|
236
|
-
|
|
237
|
-
# Gateway (AI proxy)
|
|
238
|
-
npx cto-gateway # Start proxy server
|
|
239
|
-
npx cto-gateway --budget-daily 10 # With budget enforcement
|
|
181
|
+
cto --context "update readme" --route # → Haiku ($0.08/call, 73% cheaper)
|
|
182
|
+
cto --context "fix auth bug" --route # → Opus ($1.33/call, critical complexity)
|
|
183
|
+
cto --context "refactor API" --route # → Sonnet ($0.30/call, balanced)
|
|
240
184
|
```
|
|
241
185
|
|
|
242
|
-
|
|
186
|
+
**Complexity is computed from real signals:**
|
|
187
|
+
- Token density (% of budget used)
|
|
188
|
+
- Risk concentration (top-5 file avg risk vs project max)
|
|
189
|
+
- Directory diversity (cross-cutting = harder)
|
|
190
|
+
- Dependency density among selected files
|
|
243
191
|
|
|
244
|
-
|
|
192
|
+
The gateway also uses this: every proxied request gets a model recommendation in the injected context.
|
|
245
193
|
|
|
246
|
-
|
|
194
|
+
## MCP Server
|
|
247
195
|
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
251
|
-
"mcpServers": {
|
|
252
|
-
"cto": { "command": "cto-mcp" }
|
|
253
|
-
}
|
|
254
|
-
}
|
|
255
|
-
```
|
|
196
|
+
Works as an MCP server for AI editors (Windsurf, Claude Desktop, Cursor).
|
|
197
|
+
|
|
198
|
+
**3 tools:** `cto_select_context`, `cto_audit_secrets`, `cto_explain`
|
|
256
199
|
|
|
257
|
-
**Claude Desktop:**
|
|
258
200
|
```json
|
|
259
|
-
|
|
260
|
-
|
|
261
|
-
"cto": { "command": "npx", "args": ["-y", "cto-ai-cli", "--mcp"] }
|
|
262
|
-
}
|
|
263
|
-
}
|
|
264
|
-
```
|
|
201
|
+
// Windsurf: ~/.codeium/windsurf/mcp_config.json
|
|
202
|
+
{ "mcpServers": { "cto": { "command": "cto-mcp" } } }
|
|
265
203
|
|
|
266
|
-
|
|
204
|
+
// Claude Desktop
|
|
205
|
+
{ "mcpServers": { "cto": { "command": "npx", "args": ["-y", "cto-ai-cli"] } } }
|
|
206
|
+
```
|
|
267
207
|
|
|
268
|
-
|
|
208
|
+
MCP output is also auto-sanitized when `includeContents: true`.
|
|
269
209
|
|
|
270
210
|
## Programmatic API
|
|
271
211
|
|
|
272
212
|
```typescript
|
|
273
|
-
import { analyzeProject,
|
|
213
|
+
import { analyzeProject, selectContext, buildIndex, query } from 'cto-ai-cli';
|
|
274
214
|
|
|
275
|
-
// Analyze a project
|
|
276
215
|
const analysis = await analyzeProject('./my-project');
|
|
216
|
+
const index = buildIndex(files);

|
|
217
|
+
const semanticScores = query(index, 'fix auth', 50)
|
|
218
|
+
.map(m => ({ filePath: m.filePath, score: m.score }));
|
|
277
219
|
|
|
278
|
-
// Get the Context Score
|
|
279
|
-
const score = await computeContextScore(analysis);
|
|
280
|
-
console.log(`Score: ${score.overall}/100 (${score.grade})`);
|
|
281
|
-
console.log(`Tokens saved: ${score.comparison.savedPercent}%`);
|
|
282
|
-
|
|
283
|
-
// Select optimal files for a task
|
|
284
220
|
const selection = await selectContext({
|
|
285
|
-
task: '
|
|
221
|
+
task: 'fix auth',
|
|
286
222
|
analysis,
|
|
287
|
-
budget: 50_000,
|
|
223
|
+
budget: 50_000,
|
|
224
|
+
semanticScores,
|
|
288
225
|
});
|
|
289
|
-
|
|
290
|
-
console.log(`Selected ${selection.files.length} files`);
|
|
291
|
-
console.log(`Coverage: ${selection.coverage.score}%`);
|
|
292
|
-
for (const file of selection.files) {
|
|
293
|
-
console.log(` ${file.relativePath} (${file.tokens} tokens, risk: ${file.riskScore})`);
|
|
294
|
-
}
|
|
295
226
|
```
|
|
296
227
|
|
|
297
|
-
|
|
298
|
-
|
|
299
|
-
## How It Works
|
|
300
|
-
|
|
301
|
-
1. **Scan** — walks your project, parses imports, builds a dependency graph
|
|
302
|
-
2. **Score** — computes risk for each file (complexity, hub score, centrality, recency)
|
|
303
|
-
3. **Select** — deterministic greedy algorithm: picks highest-risk files first within token budget
|
|
304
|
-
4. **Prove** — measures coverage (% of important files included), compares vs naive strategies
|
|
305
|
-
|
|
306
|
-
No AI is used for selection. Same input always produces the same output. Fully reproducible.
|
|
307
|
-
|
|
308
|
-
---
|
|
309
|
-
|
|
310
|
-
## Benchmark Proof
|
|
311
|
-
|
|
312
|
-
CTO includes an automated benchmark that runs **real context selection** on this repository (or any repo) and compares CTO vs naive (alphabetical) vs random strategies.
|
|
313
|
-
|
|
314
|
-
```bash
|
|
315
|
-
$ npx tsx scripts/benchmark.ts --json
|
|
316
|
-
```
|
|
317
|
-
|
|
318
|
-
**Results on this repo (124 files, 346K tokens):**
|
|
319
|
-
|
|
320
|
-
| Metric | Result |
|
|
321
|
-
|--------|--------|
|
|
322
|
-
| **CTO win rate** | 100% (20/20 runs across 5 tasks × 4 budgets) |
|
|
323
|
-
| **Coverage gain vs random** | +81% average |
|
|
324
|
-
| **Tokens saved vs naive** | 10% average |
|
|
325
|
-
| **Compilability: CTO** | 92/100 |
|
|
326
|
-
| **Compilability: Naive** | 40/100 |
|
|
327
|
-
| **CTO fewer predicted errors** | 2 fewer type/import errors per task |
|
|
328
|
-
| **Avg selection time** | 16ms |
|
|
329
|
-
|
|
330
|
-
The benchmark uses the same scoring engine as the CLI. No hardcoded numbers — run it yourself on any project.
|
|
228
|
+
## v7.0 Enterprise Features
|
|
331
229
|
|
|
332
|
-
|
|
230
|
+
### Precision Reranker (96.9% precision, was 33.6%)
|
|
333
231
|
|
|
334
|
-
|
|
232
|
+
Multi-signal reranker between BM25 retrieval and greedy allocation:
|
|
233
|
+
- **Term coverage**: fraction of unique query terms matched per file
|
|
234
|
+
- **Term specificity**: IDF-weighted — rare terms matter more
|
|
235
|
+
- **Bigram proximity**: query terms appearing close together in the file
|
|
236
|
+
- **Dependency signal**: files in the dependency cone of top matches
|
|
237
|
+
- **Quality gate**: adaptive cutoff stops filling budget with noise
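
The first two signals in the list above can be illustrated with a small sketch. The tokenizer, the smoothed IDF formula, and every name here are assumptions chosen for the example; the README does not specify how the reranker computes or combines its signals.

```typescript
// Lowercase and split on non-alphanumeric characters (a deliberately simple tokenizer).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0);
}

function unique(terms: string[]): string[] {
  return terms.filter((t, i) => terms.indexOf(t) === i);
}

// Smoothed inverse document frequency: rare terms get larger weights.
function buildIdf(docs: string[][]): (term: string) => number {
  const df: Record<string, number> = Object.create(null);
  docs.forEach((doc) => {
    unique(doc).forEach((t) => {
      df[t] = (df[t] || 0) + 1;
    });
  });
  return (term: string) => Math.log((docs.length + 1) / ((df[term] || 0) + 1)) + 1;
}

interface RerankSignals {
  coverage: number; // fraction of unique query terms found in the file
  specificity: number; // IDF-weighted share of the query that the file matches
}

function rerankSignals(query: string, fileText: string, idf: (t: string) => number): RerankSignals {
  const queryTerms = unique(tokenize(query));
  if (queryTerms.length === 0) return { coverage: 0, specificity: 0 };
  const fileTerms = tokenize(fileText);
  const matched = queryTerms.filter((t) => fileTerms.indexOf(t) !== -1);
  const totalIdf = queryTerms.reduce((sum, t) => sum + idf(t), 0);
  const matchedIdf = matched.reduce((sum, t) => sum + idf(t), 0);
  return { coverage: matched.length / queryTerms.length, specificity: matchedIdf / totalIdf };
}

const corpus = ["auth token verify", "date format", "auth login"];
const idfOf = buildIdf(corpus.map(tokenize));
const signalsPerFile = corpus.map((text) => rerankSignals("fix auth token", text, idfOf));
console.log(signalsPerFile);
```

In this toy corpus the first file matches both "auth" and the rarer "token", so it scores higher on both signals than the file that only matches "auth".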
|
|
335
238
|
|
|
336
|
-
|
|
337
|
-
npx cto-gateway # Start proxy server
|
|
338
|
-
npx cto-gateway --port 9000 # Custom port
|
|
339
|
-
npx cto-gateway --budget-daily 10 # $10/day budget enforcement
|
|
340
|
-
npx cto-gateway --block-secrets # Strip secrets from prompts
|
|
341
|
-
npx cto-gateway --api-key <key> # Require authentication
|
|
342
|
-
```
|
|
239
|
+
### Persistent Index Cache
|
|
343
240
|
|
|
344
|
-
|
|
241
|
+
TF-IDF index persisted to `.cto/index-cache.json` with per-file mtime tracking. Subsequent queries only re-tokenize changed files. 50K-file repos go from 5s → <100ms on warm cache.
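
The idea behind mtime-based invalidation can be shown without touching the file system. In the sketch below, `MtimeIndex` and `diffAgainstCache` are hypothetical names, and the cache is reduced to a path-to-mtime map; the real layout of `.cto/index-cache.json` is not documented in this README.

```typescript
// Map of file path -> last modification time in milliseconds.
type MtimeIndex = Record<string, number>;

interface CacheDiff {
  changed: string[]; // new or modified files that need re-tokenizing
  removed: string[]; // files that should be dropped from the index
}

function diffAgainstCache(cached: MtimeIndex, current: MtimeIndex): CacheDiff {
  const has = (obj: MtimeIndex, key: string): boolean =>
    Object.prototype.hasOwnProperty.call(obj, key);

  // A file is stale when it is missing from the cache or its mtime differs.
  const changed = Object.keys(current)
    .filter((p) => !has(cached, p) || cached[p] !== current[p])
    .sort();
  const removed = Object.keys(cached)
    .filter((p) => !has(current, p))
    .sort();
  return { changed, removed };
}

const cacheDiff = diffAgainstCache(
  { "src/a.ts": 1000, "src/b.ts": 2000, "src/c.ts": 3000 },
  { "src/a.ts": 1000, "src/b.ts": 5000, "src/d.ts": 7000 }
);
console.log(cacheDiff);
```

In practice the `current` map would be filled from file metadata at query time, so only the files listed in `changed` are read and tokenized again.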
|
|
345
242
|
|
|
346
|
-
-
|
|
347
|
-
- **Secret scanning** — strips API keys and PII from outbound messages
|
|
348
|
-
- **Budget enforcement** — daily/weekly spend limits with alerts
|
|
349
|
-
- **Usage tracking** — JSONL logs of all requests with token counts and costs
|
|
350
|
-
- **SSRF protection** — domain allowlist, private IP blocking, HTTPS-only
|
|
351
|
-
- **Body size limits** — 10MB default, prevents abuse
|
|
352
|
-
- **Upstream timeouts** — 120s default with socket cleanup
|
|
353
|
-
- **Connection pooling** — keep-alive agents with 50 max sockets
|
|
243
|
+
### Multi-Language Dependency Graphs
|
|
354
244
|
|
|
355
|
-
|
|
356
|
-
|
|
357
|
-
---
|
|
358
|
-
|
|
359
|
-
## Multi-Model Optimization
|
|
245
|
+
Regex-based import parsing for **Python**, **Go**, **Java**, and **Rust** alongside ts-morph for TS/JS. Enables hub detection, risk scoring, and dependency expansion for polyglot codebases.
|
|
360
246
|
|
|
361
247
|
```bash
|
|
362
|
-
|
|
248
|
+
# Works on Python, Go, Java, Rust projects — not just TypeScript
|
|
249
|
+
cto --context "fix auth handler" /path/to/go-project
|
|
363
250
|
```
|
|
364
251
|
|
|
365
|
-
|
|
366
|
-
|
|
367
|
-
| Model | Context Window | Strengths |
|
|
368
|
-
|-------|---------------|----------|
|
|
369
|
-
| GPT-4o | 128K | General coding, debugging |
|
|
370
|
-
| GPT-4o Mini | 128K | Fast, cheap, simple tasks |
|
|
371
|
-
| Claude Sonnet 4 | 200K | Complex refactoring, architecture |
|
|
372
|
-
| Claude 3.5 Haiku | 200K | Fast analysis, code review |
|
|
373
|
-
| Gemini 2.0 Flash | 1M | Huge codebases, exploration |
|
|
374
|
-
| Gemini 2.5 Pro | 1M | Deep reasoning, long context |
|
|
375
|
-
| DeepSeek V3 | 128K | Cost-effective coding |
|
|
376
|
-
| Codestral | 256K | Code completion, generation |
|
|
252
|
+
### Team Authentication & SSO
|
|
377
253
|
|
|
378
|
-
|
|
254
|
+
Per-team API keys, JWT validation (HS256/RS256), rate limiting, model allowlists. Teams stored in `.cto/gateway/teams.json`.
|
|
379
255
|
|
|
380
|
-
|
|
256
|
+
### Metrics Export
|
|
381
257
|
|
|
382
|
-
|
|
258
|
+
Prometheus exposition format at `/__cto/metrics`, Datadog JSON, and StatsD UDP. Counters, histograms, gauges for requests, tokens, cost, latency, secrets.
|
|
383
259
|
|
|
384
|
-
###
|
|
260
|
+
### Per-Team Policy Engine
|
|
385
261
|
|
|
386
|
-
|
|
262
|
+
Routing rules per team: model overrides by task type, cost caps per request, context budget limits, block rules. Preset policies: `createCostConscious()`, `createSecurityFirst()`.
|
|
387
263
|
|
|
388
|
-
-
|
|
389
|
-
- **Forbidden patterns** — blocks paths containing `.ssh`, `.gnupg`, `.aws`, `.env`, `passwd`, `shadow`
|
|
390
|
-
- **Allowlist** — set `CTO_ALLOWED_ROOTS=/home/deploy,/opt/projects` to restrict access to specific directories
|
|
264
|
+
### Closed-Loop A/B Testing
|
|
391
265
|
|
|
392
|
-
|
|
266
|
+
Real experimentation on context strategies with two-proportion z-test for statistical significance. Deterministic assignment (SHA-256 hashing), auto-conclusion when p < 0.05.
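
For reference, a two-proportion z-test with a pooled standard error can be sketched as below. This is the textbook form of the test, with the normal CDF computed from a standard polynomial approximation of the error function. The function names and the accept-count inputs are assumptions, and the SHA-256 assignment step is not covered.

```typescript
// Error function via the Abramowitz and Stegun approximation 7.1.26
// (absolute error around 1.5e-7).
function erf(x: number): number {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) *
    t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

function normalCdf(z: number): number {
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

interface ZTestResult {
  z: number;
  pValue: number; // two-sided
}

// Compares the success rates of two variants, e.g. accepted selections per strategy.
function twoProportionZTest(s1: number, n1: number, s2: number, n2: number): ZTestResult {
  if (n1 <= 0 || n2 <= 0) throw new Error("both groups need at least one observation");
  const p1 = s1 / n1;
  const p2 = s2 / n2;
  const pooled = (s1 + s2) / (n1 + n2);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2));
  if (se === 0) return { z: 0, pValue: 1 }; // all successes or all failures in both groups
  const z = (p1 - p2) / se;
  return { z, pValue: 2 * (1 - normalCdf(Math.abs(z))) };
}

// Variant A: 60 accepts out of 100. Variant B: 40 accepts out of 100.
const abResult = twoProportionZTest(60, 100, 40, 100);
console.log(abResult, abResult.pValue < 0.05 ? "significant" : "not significant");
```

With the p < 0.05 threshold mentioned above, the 60% versus 40% example would be concluded as significant, while two variants with identical rates would not.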
|
|
393
267
|
|
|
394
|
-
|
|
268
|
+
### LSP Bridge (IDE Plugin)
|
|
395
269
|
|
|
396
|
-
-
|
|
397
|
-
- **PII detection** (emails, SSNs, phone numbers) with safe domain filtering
|
|
398
|
-
- **Allowlist system** — SHA-256 fingerprinted exceptions in `.cto/audit/allowlist.json`
|
|
399
|
-
- **Incremental scanning** — file hash cache, only re-scans changed files
|
|
400
|
-
- **Pre-commit hook** — `npx cto-ai-cli --audit --init-hook` installs a git hook that blocks commits with secrets
|
|
401
|
-
|
|
402
|
-
---
|
|
270
|
+
JSON-RPC 2.0 server over stdin/stdout for any IDE: VS Code, JetBrains, Neovim, Emacs. Custom methods: `cto/selectContext`, `cto/score`, `cto/audit`, `cto/experiments`.
|
|
403
271
|
|
|
404
272
|
## Honest Limitations
|
|
405
273
|
|
|
406
|
-
- **TypeScript/JavaScript gets
|
|
407
|
-
- **
|
|
408
|
-
- **
|
|
409
|
-
- **
|
|
410
|
-
|
|
411
|
-
---
|
|
274
|
+
- **TypeScript/JavaScript gets AST analysis.** Python/Go/Java/Rust get regex-based import parsing (good for graphs, not AST-accurate).
|
|
275
|
+
- **BM25 + reranker, not embeddings.** 96.9% precision on our benchmark. No neural model needed.
|
|
276
|
+
- **Learning needs ~5 feedback cycles** to start influencing selection. First runs are pure graph + risk + semantic.
|
|
277
|
+
- **Benchmarked against naive baselines** (alphabetical, random, risk-only, TF-IDF-only). Not compared against Cursor/Copilot internal context engines.
|
|
412
278
|
|
|
413
279
|
## Contributing
|
|
414
280
|
|
|
415
281
|
```bash
|
|
416
|
-
git clone https://github.com/cto-ai/cto-ai-cli.git
|
|
417
|
-
|
|
418
|
-
npm install
|
|
419
|
-
npm run build
|
|
420
|
-
npm test # 550 tests, 91% coverage
|
|
421
|
-
npm run typecheck # strict TypeScript, zero errors
|
|
422
|
-
```
|
|
423
|
-
|
|
424
|
-
Run the automated benchmark to see CTO vs naive on this repo:
|
|
425
|
-
|
|
426
|
-
```bash
|
|
427
|
-
npx tsx scripts/benchmark.ts # Human-readable report
|
|
428
|
-
npx tsx scripts/benchmark.ts --json # Machine-readable JSON
|
|
282
|
+
git clone https://github.com/cto-ai/cto-ai-cli.git && cd cto-ai-cli
|
|
283
|
+
npm install && npm run build && npm test # 776 tests
|
|
429
284
|
```
|
|
430
285
|
|
|
431
|
-
Full API docs, MCP server reference, and architecture are in [DOCS.md](DOCS.md).
|
|
432
|
-
|
|
433
286
|
## License
|
|
434
287
|
|
|
435
288
|
[MIT](LICENSE)
|