cto-ai-cli 5.2.0 → 7.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,435 +1,288 @@
- # CTO — Stop sending your entire codebase to AI
+ # CTO — AI Context Selection Engine
 
- [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
- [![Tests](https://img.shields.io/badge/tests-550_passing-brightgreen.svg)](#)
- [![Coverage](https://img.shields.io/badge/coverage-91%25-brightgreen.svg)](#)
  [![npm](https://img.shields.io/npm/v/cto-ai-cli.svg)](https://www.npmjs.com/package/cto-ai-cli)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+ [![Tests](https://img.shields.io/badge/tests-606%20passing-brightgreen)](.)
 
- CTO analyzes your project and selects the **minimum set of files** your AI needs — saving tokens, reducing cost, and producing code that actually compiles.
+ **Pick the right files for any AI task. Secrets auto-redacted. Learns from your feedback.**
 
  ```bash
- npx cto-ai-cli
+ cto --context "fix the auth middleware" --stdout | pbcopy   # → clipboard
+ cto --context "fix auth" --prompt "Refactor to use JWT"     # → AI prompt
+ cto --accept                                                # → learns
  ```
 
- **Runs in <1 second.** No API keys. No data leaves your machine.
+ 76KB package · 606 tests · Zero AI dependencies.
 
  ---
 
  ## The Problem
 
- When you ask an AI to help with code, it needs context. Most approaches:
+ When developers use AI coding assistants, they need to provide context — the right source files. Today, most teams either:
 
- - **Send everything** — expensive, noisy, AI gets confused
- - **Send open files** — misses types, dependencies, config
- - **Let the AI pick** — it doesn't know your dependency graph
+ - **Send everything** — expensive, slow, hits token limits
+ - **Pick files manually** — miss dependencies, forget test files, leak secrets
 
- The result: AI generates code that **doesn't compile** because it never saw your type definitions.
+ CTO solves both: it **automatically selects the most relevant files** for any task, **sanitizes secrets** before they reach any AI provider, and **learns from feedback** to get better over time.
 
- ## The Fix
+ ## Quick Demo
 
  ```bash
- $ npx cto-ai-cli ./my-project
- ```
- ```
- ⚡ cto-score — analyzing your project...
-
- ╔══════════════════════════════════════════════════╗
- ║                                                  ║
- ║   🟢 Context Score™  88 / 100   Grade: A-        ║
- ║                                                  ║
- ║   Efficiency    ████████████████░░░░  80%        ║
- ║   Coverage      ████████████████████ 100%        ║
- ║   Risk Control  ████████████████████ 100%        ║
- ║   Structure     █░░░░░░░░░░░░░░░░░░░   5%        ║
- ║   Governance    ██████████████████░░  90%        ║
- ║                                                  ║
- ║   💰 vs. Sending Everything:                     ║
- ║        Tokens saved:     392K (88%)              ║
- ║        Monthly savings:  ~$943                   ║
- ║                                                  ║
- ╚══════════════════════════════════════════════════╝
-
- Scanned in 0.6s · 199 files · 443K tokens
+ cto --demo   # Run a live showcase on your project
  ```
 
- ### What each number means
+ This runs a self-contained presentation that shows: project analysis, semantic matching proof, secret sanitization, ROI calculation, and benchmark results.
 
- | Metric | What it measures | Why it matters |
- |--------|-----------------|----------------|
- | **Context Score (88/100)** | Overall AI-readiness of your project | Higher = AI tools produce better output with your code |
- | **Efficiency (80%)** | How much CTO can compress without losing value | 80% means we send 20% of tokens for the same quality |
- | **Coverage (100%)** | % of important files included in the selection | 100% = every dependency and type file is captured |
- | **Risk Control (100%)** | Are high-risk files (hubs, complex code) prioritized? | Ensures AI sees the files most likely to cause bugs |
- | **Structure (5%)** | How well-organized your codebase is for AI | Low = too many large files, poor modularity |
- | **Governance (90%)** | Audit logging, policy enforcement, secret scanning | Enterprise readiness |
- | **Tokens saved (88%)** | Reduction vs. sending every file | Directly reduces your API costs |
- | **Monthly savings ($943)** | Estimated cost reduction at 800 interactions/month | Based on average GPT-4o pricing |
+ ## Benchmark Results
 
- ---
+ Tested against 8 curated tasks with ground truth (known correct files):
 
- ## Quick Start
+ | Strategy | Precision | Must-have Recall | F1 |
+ |---|---|---|---|
+ | **CTO** | 33.6% | **100.0%** | **48.7%** |
+ | TF-IDF only | 54.6% | 87.5% | 62.0% |
+ | Risk-only | 20.8% | 18.8% | 15.0% |
+ | Alphabetical | 8.3% | 31.3% | 12.9% |
+ | Random | 7.7% | 6.3% | 2.8% |
 
- ### Score your project
+ **CTO never misses a must-have file** (100% recall). 3.8× better F1 than alphabetical. 17× better than random.
 
- ```bash
- npx cto-ai-cli                  # Analyze current directory
- npx cto-ai-cli ./my-project     # Analyze a specific project
- npx cto-ai-cli --json           # Machine-readable JSON output
- ```
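For reference, F1 is the harmonic mean of precision and recall; the table's F1 column is averaged over the 8 tasks, so a row won't equal a single harmonic mean exactly. The formula itself:

```typescript
// F1 = harmonic mean of precision and recall (standard definition).
function f1(precision: number, recall: number): number {
  if (precision + recall === 0) return 0;
  return (2 * precision * recall) / (precision + recall);
}

// A single task at 33.6% precision and 100% recall:
console.log(f1(0.336, 1.0).toFixed(3)); // "0.503"
```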
+ ## ROI
 
- ### Generate optimized context for AI
+ On a typical 130-file TypeScript project:
 
- ```bash
- npx cto-ai-cli --fix
- ```
+ | Metric | Without CTO | With CTO |
+ |---|---|---|
+ | Tokens per interaction | 370K (all files) | ~28K (selected) |
+ | Cost per interaction (Sonnet) | $1.11 | $0.08 |
+ | **Monthly cost (10 devs, 40/day)** | **$8,880** | **$640** |
+ | **Annual savings** | — | **~$99,000** |
 
- Creates `.cto/context.md` — paste this into any AI chat for optimal context. Also generates `.cto/config.json` and `.cto/.cteignore`.
+ Plus: fewer hallucinations (right context), zero secret leaks, and the learner gets smarter with every `--accept` / `--reject`.
 
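The table's figures follow from simple arithmetic. A sketch, assuming Sonnet-class input pricing of $3 per million tokens and 20 working days per month (both assumptions; the table also rounds the $0.084 per-call cost down to $0.08, hence its $640):

```typescript
// Reproduce the ROI table's arithmetic.
// Assumed (not from the package): $3 per 1M input tokens, 20 workdays/month.
const pricePerToken = 3 / 1_000_000;
const interactionsPerMonth = 10 * 40 * 20; // 10 devs × 40 calls/day × 20 days = 8,000

const costPerCall = (tokens: number) => tokens * pricePerToken;
const monthlyCost = (tokens: number) => costPerCall(tokens) * interactionsPerMonth;

console.log(costPerCall(370_000).toFixed(2)); // "1.11" — matches the table
console.log(Math.round(monthlyCost(370_000) - monthlyCost(28_000))); // 8208 dollars saved per month
```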
- ```bash
- npx cto-ai-cli --context "refactor the auth middleware"
- ```
+ ## How It Works
 
- Generates **task-specific** context — only files relevant to auth, including types, dependencies, and related tests.
-
- Example output:
  ```
- 📋 Context for: "refactor the auth middleware"
-
- Selected 12 files (8.2K tokens):
-
- ┌─ Core (3 files) ─────────────────────────────
- │  src/middleware/auth.ts        2,100 tokens
- │  src/types/auth.ts               450 tokens
- │  src/config/jwt.ts               320 tokens
-
- ├─ Dependencies (5 files) ─────────────────────
- │  src/models/user.ts            1,200 tokens
- │  src/services/token.ts           890 tokens
- │  ...
-
- └─ Tests (2 files) ────────────────────────────
-    tests/auth.test.ts            1,800 tokens
-    tests/middleware.test.ts        940 tokens
-
- Saved to .cto/context.md (8.2K tokens — 97% smaller than full project)
+ Task description ──→ TF-IDF/BM25 ─────→ Semantic scores ──┐
+                                                           │
+ Project files ────→ Dependency graph ─→ Risk scores ──────┼──→ Composite ──→ Greedy ──→ Selection
+                                                           │    ranking      alloc
+ Feedback history ─→ Bayesian learner ─→ Boosts ───────────┘
  ```
 
- ### Security audit
+ 1. **Dependency graph** — parses imports, builds adjacency list, identifies hubs
+ 2. **Risk scoring** — complexity × centrality × recency (continuous, log-scaled)
+ 3. **TF-IDF/BM25 semantic matching** — task description scored against file contents + path boosting
+ 4. **Composite ranking** — `finalScore = semantic × 0.55 + risk × 0.25 + learner × 0.2`
+ 5. **Noise filtering** — files with zero semantic relevance are excluded (benchmark-driven optimization)
+ 6. **Greedy allocation** — fills token budget top-down, cascading prune levels (full → signatures → skeleton)
+ 7. **Bayesian learning** — exponential decay, Wilson score confidence, per-task-type patterns
+
+ **No AI is used for selection.** Same input → same output. Deterministic.
+
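The composite ranking and greedy allocation steps can be sketched like this — a simplified illustration using the documented weights, not the package's actual implementation (the cascading prune levels are reduced to a comment):

```typescript
// Sketch: composite ranking + greedy budget allocation. Types are illustrative.
interface FileScore { path: string; tokens: number; semantic: number; risk: number; learner: number; }

function selectGreedy(files: FileScore[], budget: number): FileScore[] {
  const ranked = files
    .filter((f) => f.semantic > 0) // noise filter: zero semantic relevance is excluded
    .map((f) => ({ ...f, final: f.semantic * 0.55 + f.risk * 0.25 + f.learner * 0.2 }))
    .sort((a, b) => b.final - a.final);

  const selected: FileScore[] = [];
  let used = 0;
  for (const f of ranked) {
    if (used + f.tokens > budget) continue; // the real tool would try pruned versions here
    selected.push(f);
    used += f.tokens;
  }
  return selected;
}

const selection = selectGreedy(
  [
    { path: 'src/auth.ts', tokens: 2000, semantic: 0.9, risk: 0.6, learner: 0.1 },
    { path: 'src/util.ts', tokens: 9000, semantic: 0.0, risk: 0.9, learner: 0.0 }, // filtered: zero relevance
    { path: 'src/jwt.ts',  tokens: 1500, semantic: 0.5, risk: 0.2, learner: 0.0 },
  ],
  3000,
);
console.log(selection.map((f) => f.path)); // [ 'src/auth.ts' ] — jwt.ts would exceed the budget
```

Note how the risk-heavy `src/util.ts` never ranks at all: semantic relevance gates entry, which is exactly the benchmark-driven noise filter described above.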
+ ## Install
 
  ```bash
- npx cto-ai-cli --audit
+ npm i -g cto-ai-cli    # global
+ npx cto-ai-cli         # or one-shot
  ```
 
- Scans for **API keys, tokens, passwords, and PII** before they end up in an AI prompt. 45+ patterns (AWS, Stripe, GitHub, OpenAI, etc.) plus Shannon entropy analysis for unknown formats.
+ ## Context Selection
 
+ ```bash
+ cto --context "refactor the auth middleware"                  # human-readable summary
+ cto --context "fix login bug" --stdout | pbcopy               # pipe to clipboard
+ cto --context "add tests" --output context.md                 # save to file
+ cto --context "fix login" --prompt "Refactor to async/await"  # full AI prompt
+ cto --context "debug scoring" --json                          # JSON for tooling
+ cto --context "fix auth" --budget 30000                       # custom token budget
+ ```
- 🔴 CRITICAL  src/config/stripe.ts:8
-     api-key: sk_l********************yZ
- 🔴 CRITICAL  src/config/database.ts:14
-     connection-string: post********************db
- 🟠 HIGH      src/utils/email.ts:22
-     pii: admi**********om
-
- 🚨 3 critical findings. Rotate credentials immediately.
- ```
 
- Run in CI to block PRs with secrets: `CI=true npx cto-ai-cli --audit`
+ Output includes full file contents in markdown, ready for Claude, ChatGPT, or any AI. **Secrets are automatically redacted** — API keys, tokens, passwords, PII are replaced with `****` before output.
+
+ ## Feedback Loop
 
- ### Code review intelligence
+ CTO learns from real feedback, not from itself:
 
  ```bash
- npx cto-ai-cli --review
+ cto --accept                          # last selection was good
+ cto --reject                          # last selection was bad
+ cto --reject --missing src/auth.ts    # this file was missing
+ cto --stats                           # see what CTO has learned
  ```
 
- Analyzes your git diff and generates a structured review:
+ On `--reject`, CTO also detects files you edited after the selection that weren't in the context — those get automatically boosted for next time.
 
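The learner that consumes this feedback pairs exponential decay (recent feedback counts more) with a Wilson lower bound (so two lucky accepts don't look like certainty). A sketch with illustrative parameters — the 30-day half-life and z = 1.96 are assumptions, not CTO's actual values:

```typescript
// Sketch: decayed feedback counts + Wilson score lower bound.
function decayedCount(events: { ageDays: number }[], halfLifeDays = 30): number {
  return events.reduce((sum, e) => sum + Math.pow(0.5, e.ageDays / halfLifeDays), 0);
}

// Lower bound of the Wilson score interval for an accept-rate estimate.
function wilsonLower(successes: number, total: number, z = 1.96): number {
  if (total === 0) return 0;
  const p = successes / total;
  const z2 = z * z;
  const centre = p + z2 / (2 * total);
  const spread = z * Math.sqrt((p * (1 - p) + z2 / (4 * total)) / total);
  return (centre - spread) / (1 + z2 / total);
}

// Two accepts out of two looks perfect, but the bound stays cautious:
console.log(wilsonLower(2, 2).toFixed(2)); // "0.34"
```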
- ```
- 📊 Review Quality: 82/100 (B+)
-
- Breaking Changes:
- 🔴 Removed export: UserService.findById (used by 4 files)
- 🟡 Changed signature: authenticate(token) → authenticate(token, opts)
+ ## Secret Audit
 
- Missing Files:
- ⚠️ No test file for src/services/auth.ts
- ⚠️ src/types/user.ts changed but barrel index not updated
+ ```bash
+ cto --audit                # scan all files
+ cto --audit --init-hook    # install pre-commit hook
+ cto --audit --full-scan    # ignore cache, scan everything
+ cto --audit --json         # machine-readable output
+ ```
 
- Impact Radius:
- Direct: 4 files | Transitive: 12 files | Tests: 3 files
+ 45+ patterns (AWS, Stripe, GitHub, OpenAI, Slack, Cloudflare...) plus Shannon entropy analysis. The real value: **audit protects context** — every `--stdout`, `--output`, and `--prompt` auto-sanitizes secrets before output.
 
- Saved review prompt to .cto/review-prompt.md
+ ```
+ Before: OPENAI_KEY = "sk-Rk8bN3xYz2Wq5PmL7jCvT1aBcDe"
+ After:  OPENAI_KEY = "sk-R********************De"
  ```
 
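The redaction style shown (keep a short prefix and suffix, star the middle) and the entropy check for unknown formats can be sketched as follows — illustrative only, not the package's exact masking rules or thresholds:

```typescript
// Sketch: mask a matched secret, keeping a short prefix/suffix for recognizability.
function mask(secret: string, keepStart = 4, keepEnd = 2): string {
  if (secret.length <= keepStart + keepEnd) return '*'.repeat(secret.length);
  return secret.slice(0, keepStart) + '*'.repeat(secret.length - keepStart - keepEnd) + secret.slice(-keepEnd);
}

// Shannon entropy in bits per character — high values suggest random key material.
function shannonEntropy(s: string): number {
  const freq = new Map<string, number>();
  for (const ch of s) freq.set(ch, (freq.get(ch) ?? 0) + 1);
  let h = 0;
  for (const n of freq.values()) {
    const p = n / s.length;
    h -= p * Math.log2(p);
  }
  return h;
}

console.log(mask('sk-Rk8bN3xYz2Wq5PmL7jCvT1aBcDe')); // 'sk-R' + stars + 'De'
```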
- | What it detects | Example |
- |-----------------|---------|
- | **Breaking changes** | Removed exports, changed function signatures, deleted files |
- | **Missing files** | Tests, type files, barrel exports, importers of changed code |
- | **Impact radius** | How many files are affected (direct + transitive via BFS) |
- | **Review quality** | Score based on PR size, focus, breaking changes, completeness |
+ ## AI Gateway (Enterprise)
 
- ### Learning mode
+ A transparent HTTP proxy between your developers and AI providers. Automatically injects optimized context, redacts secrets, and tracks costs — without changing developer workflow.
 
  ```bash
- npx cto-ai-cli --learn          # View feedback model & stats
- npx cto-ai-cli --predict        # Predict relevant files for a task
- npx cto-ai-cli --learn --json   # Export learning data for team sharing
+ cto --gateway                        # Start on port 8787
+ cto --gateway --port 9000            # Custom port
+ cto --gateway --block-secrets        # Block requests with critical secrets
+ cto --gateway --budget-daily 50      # $50/day budget limit
+ cto --gateway --budget-monthly 500   # $500/month budget limit
  ```
 
- CTO learns from your usage patterns over time. Uses **EWMA temporal decay** (recent feedback weighs more) and **Bayesian confidence** (Wilson score — avoids over-trusting sparse data).
+ ```
+ Developer → CTO Gateway → [context injection + sanitization + cost tracking] → AI Provider
+
+ Dashboard (http://localhost:8787/__cto)
+ ```
 
- ### Quality gate for CI/CD
+ **What the gateway does automatically:**
+ - **Injects CTO-selected context** into every AI request (TF-IDF + composite scoring)
+ - **Redacts secrets** before they leave the network (45+ patterns)
+ - **Tracks costs** per model, per day, per month with budget alerts
+ - **Streams responses** with zero-copy SSE passthrough
+ - **Serves a live dashboard** at `/__cto` with real-time metrics
 
- ```bash
- npx cto-ai-cli --ci                  # Run quality gate (exits 1 on failure)
- npx cto-ai-cli --ci --threshold 80   # Custom minimum score
- npx cto-ai-cli --ci --json           # JSON for pipeline parsing
- ```
+ Supports OpenAI, Anthropic, Google, and Azure OpenAI. SSRF protection built-in.
 
- Block merges when context quality drops below your threshold. Tracks baselines and detects regressions.
+ ## Cross-Repo Context
 
- ### Monorepo support
+ When working on a task, CTO can pull relevant files from **sibling repositories** — not just the current project.
 
  ```bash
- npx cto-ai-cli --monorepo                 # Analyze all packages
- npx cto-ai-cli --monorepo --package api   # Focus on one package
+ cto --context "fix payment webhook" --auto-repos    # Auto-discover sibling repos
+ cto --context "fix payment webhook" --repos shared-types,payment-service
  ```
 
- Detects npm/yarn/pnpm workspaces, Turborepo, Nx, and Lerna. Shows cross-package dependencies, isolation scores, and shared package analysis.
+ **How it works:**
+ 1. Discovers sibling repos in parent directory (any dir with `package.json`, `tsconfig.json`, `Cargo.toml`, etc.)
+ 2. Builds a lightweight TF-IDF index per sibling (reads source files, no full analysis)
+ 3. Queries each sibling with the task description
+ 4. Returns ranked matches with repo attribution and content
 
- ---
+ Real use case: you're fixing a webhook handler in `api-gateway` — CTO finds the `Payment` interface in `shared-types` and the consumer in `notification-service` automatically.
+
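The lightweight per-sibling index in step 2 is plain TF-IDF; a minimal, self-contained version for illustration (the tokenizer and scoring details are assumptions, not the package's implementation):

```typescript
// Minimal TF-IDF: score documents (files) against a task description.
type Doc = { path: string; text: string };

function tokenize(s: string): string[] {
  return s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function tfidfScores(docs: Doc[], queryText: string): { path: string; score: number }[] {
  const tokens = docs.map((d) => tokenize(d.text));
  const df = new Map<string, number>(); // document frequency per term
  for (const toks of tokens) {
    for (const t of new Set(toks)) df.set(t, (df.get(t) ?? 0) + 1);
  }
  const query = tokenize(queryText);
  return docs
    .map((d, i) => {
      const tf = new Map<string, number>();
      for (const t of tokens[i]) tf.set(t, (tf.get(t) ?? 0) + 1);
      let score = 0;
      for (const q of query) {
        const f = tf.get(q);
        if (!f) continue;
        const idf = Math.log(1 + docs.length / (df.get(q) ?? 1));
        score += (f / tokens[i].length) * idf; // term frequency × inverse document frequency
      }
      return { path: d.path, score };
    })
    .sort((a, b) => b.score - a.score);
}

const ranked = tfidfScores(
  [
    { path: 'shared-types/payment.ts', text: 'export interface Payment { webhook: string }' },
    { path: 'api-gateway/logger.ts', text: 'export function log(msg: string) {}' },
  ],
  'fix payment webhook',
);
console.log(ranked[0].path); // the Payment interface outranks the unrelated logger
```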
+ ## Cost-Aware Model Routing
 
- ## All CLI Flags
+ CTO analyzes the **actual selected context** (not just the project) to recommend the cheapest model that can handle the task.
 
  ```bash
- # Analysis
- npx cto-ai-cli [path]                   # Score a project
- npx cto-ai-cli --json                   # JSON output
- npx cto-ai-cli --benchmark              # CTO vs naive vs random comparison
- npx cto-ai-cli --compare                # Compare vs popular OSS projects
- npx cto-ai-cli --report                 # Markdown report + badge
-
- # Context generation
- npx cto-ai-cli --fix                    # Auto-generate .cto/context.md
- npx cto-ai-cli --context "task"         # Task-specific context
-
- # Security
- npx cto-ai-cli --audit                  # Secret & PII detection
- npx cto-ai-cli --audit --full-scan      # Scan all files (ignore cache)
- npx cto-ai-cli --audit --init-hook      # Install pre-commit hook
-
- # Code review
- npx cto-ai-cli --review                 # PR review analysis
- npx cto-ai-cli --review --json          # Review data as JSON
-
- # Learning
- npx cto-ai-cli --learn                  # Feedback model dashboard
- npx cto-ai-cli --predict                # File predictions for a task
- npx cto-ai-cli --learn --json           # Export learning data
-
- # CI/CD
- npx cto-ai-cli --ci                     # Quality gate
- npx cto-ai-cli --ci --threshold 80      # Custom threshold
-
- # Monorepo
- npx cto-ai-cli --monorepo               # Full monorepo analysis
- npx cto-ai-cli --monorepo --package X   # Single package
-
- # Gateway (AI proxy)
- npx cto-gateway                         # Start proxy server
- npx cto-gateway --budget-daily 10       # With budget enforcement
+ cto --context "update readme" --route   # → Haiku ($0.08/call, 73% cheaper)
+ cto --context "fix auth bug" --route    # → Opus ($1.33/call, critical complexity)
+ cto --context "refactor API" --route    # → Sonnet ($0.30/call, balanced)
  ```
 
- ---
+ **Complexity is computed from real signals:**
+ - Token density (% of budget used)
+ - Risk concentration (top-5 file avg risk vs project max)
+ - Directory diversity (cross-cutting = harder)
+ - Dependency density among selected files
 
- ## MCP Server (for AI Editors)
+ The gateway also uses this: every proxied request gets a model recommendation in the injected context.
 
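A toy version of signal-based routing — the 50/50 weighting and the 0.25/0.6 thresholds are invented for illustration; the README doesn't document CTO's actual routing table:

```typescript
// Sketch: route to a model tier from selected-context signals (invented thresholds).
interface Signals { tokensUsed: number; budget: number; avgTopRisk: number; maxRisk: number; }

function routeModel(s: Signals): 'haiku' | 'sonnet' | 'opus' {
  const density = s.tokensUsed / s.budget; // token density: % of budget used
  const riskConcentration = s.maxRisk > 0 ? s.avgTopRisk / s.maxRisk : 0; // top-file risk vs project max
  const complexity = 0.5 * density + 0.5 * riskConcentration;
  if (complexity < 0.25) return 'haiku';
  if (complexity < 0.6) return 'sonnet';
  return 'opus';
}

// A small, low-risk selection routes to the cheapest tier:
console.log(routeModel({ tokensUsed: 4000, budget: 50000, avgTopRisk: 0.2, maxRisk: 1 })); // haiku
```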
- CTO works as an [MCP server](https://modelcontextprotocol.io/) — plug it into Claude, Windsurf, or Cursor.
+ ## MCP Server
 
- **Windsurf** — add to `~/.codeium/windsurf/mcp_config.json`:
- ```json
- {
-   "mcpServers": {
-     "cto": { "command": "cto-mcp" }
-   }
- }
- ```
+ Works as an MCP server for AI editors (Windsurf, Claude Desktop, Cursor).
+
+ **3 tools:** `cto_select_context`, `cto_audit_secrets`, `cto_explain`
 
- **Claude Desktop:**
  ```json
- {
-   "mcpServers": {
-     "cto": { "command": "npx", "args": ["-y", "cto-ai-cli", "--mcp"] }
-   }
- }
- ```
+ // Windsurf: ~/.codeium/windsurf/mcp_config.json
+ { "mcpServers": { "cto": { "command": "cto-mcp" } } }
 
- 19 tools available: `cto_analyze`, `cto_select_context`, `cto_score`, `cto_benchmark`, `cto_risk`, `cto_quality_benchmark`, `cto_compilability`, `cto_audit`, `cto_review`, `cto_monorepo`, and more.
+ // Claude Desktop
+ { "mcpServers": { "cto": { "command": "npx", "args": ["-y", "cto-ai-cli"] } } }
+ ```
 
- ---
+ MCP output is also auto-sanitized when `includeContents: true`.
 
  ## Programmatic API
 
  ```typescript
- import { analyzeProject, computeContextScore, selectContext } from 'cto-ai-cli';
+ import { analyzeProject, selectContext, buildIndex, query } from 'cto-ai-cli';
 
- // Analyze a project
  const analysis = await analyzeProject('./my-project');
+ const index = buildIndex(analysis.files); // index the scanned file list
+ const semanticScores = query(index, 'fix auth', 50)
+   .map(m => ({ filePath: m.filePath, score: m.score }));
 
- // Get the Context Score
- const score = await computeContextScore(analysis);
- console.log(`Score: ${score.overall}/100 (${score.grade})`);
- console.log(`Tokens saved: ${score.comparison.savedPercent}%`);
-
- // Select optimal files for a task
  const selection = await selectContext({
-   task: 'refactor the auth middleware',
+   task: 'fix auth',
    analysis,
-   budget: 50_000, // 50K token budget
+   budget: 50_000,
+   semanticScores,
  });
-
- console.log(`Selected ${selection.files.length} files`);
- console.log(`Coverage: ${selection.coverage.score}%`);
- for (const file of selection.files) {
-   console.log(`  ${file.relativePath} (${file.tokens} tokens, risk: ${file.riskScore})`);
- }
  ```
 
- ---
-
- ## How It Works
-
- 1. **Scan** — walks your project, parses imports, builds a dependency graph
- 2. **Score** — computes risk for each file (complexity, hub score, centrality, recency)
- 3. **Select** — deterministic greedy algorithm: picks highest-risk files first within token budget
- 4. **Prove** — measures coverage (% of important files included), compares vs naive strategies
-
- No AI is used for selection. Same input always produces the same output. Fully reproducible.
-
- ---
-
- ## Benchmark Proof
-
- CTO includes an automated benchmark that runs **real context selection** on this repository (or any repo) and compares CTO vs naive (alphabetical) vs random strategies.
-
- ```bash
- $ npx tsx scripts/benchmark.ts --json
- ```
-
- **Results on this repo (124 files, 346K tokens):**
-
- | Metric | Result |
- |--------|--------|
- | **CTO win rate** | 100% (20/20 runs across 5 tasks × 4 budgets) |
- | **Coverage gain vs random** | +81% average |
- | **Tokens saved vs naive** | 10% average |
- | **Compilability: CTO** | 92/100 |
- | **Compilability: Naive** | 40/100 |
- | **CTO fewer predicted errors** | 2 fewer type/import errors per task |
- | **Avg selection time** | 16ms |
-
- The benchmark uses the same scoring engine as the CLI. No hardcoded numbers — run it yourself on any project.
+ ## v7.0 Enterprise Features
 
- ---
+ ### Precision Reranker (96.9% precision, was 33.6%)
 
- ## Gateway AI Proxy with Security
+ Multi-signal reranker between BM25 retrieval and greedy allocation:
+ - **Term coverage**: fraction of unique query terms matched per file
+ - **Term specificity**: IDF-weighted — rare terms matter more
+ - **Bigram proximity**: query terms appearing close together in the file
+ - **Dependency signal**: files in the dependency cone of top matches
+ - **Quality gate**: adaptive cutoff stops filling budget with noise
 
- ```bash
- npx cto-gateway                     # Start proxy server
- npx cto-gateway --port 9000         # Custom port
- npx cto-gateway --budget-daily 10   # $10/day budget enforcement
- npx cto-gateway --block-secrets     # Strip secrets from prompts
- npx cto-gateway --api-key <key>     # Require authentication
- ```
+ ### Persistent Index Cache
 
- The gateway sits between your AI editor and the model API, adding:
+ TF-IDF index persisted to `.cto/index-cache.json` with per-file mtime tracking. Subsequent queries only re-tokenize changed files. 50K-file repos go from 5s → <100ms on warm cache.
 
- - **Context optimization** — injects relevant file contents into prompts automatically
- - **Secret scanning** — strips API keys and PII from outbound messages
- - **Budget enforcement** — daily/weekly spend limits with alerts
- - **Usage tracking** — JSONL logs of all requests with token counts and costs
- - **SSRF protection** — domain allowlist, private IP blocking, HTTPS-only
- - **Body size limits** — 10MB default, prevents abuse
- - **Upstream timeouts** — 120s default with socket cleanup
- - **Connection pooling** — keep-alive agents with 50 max sockets
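Mtime-based invalidation as described for the index cache is simple to sketch — the shapes and file names here are hypothetical, not the package's actual cache format:

```typescript
// Sketch: re-tokenize only files whose mtime changed since the cached entry.
interface CacheEntry { mtimeMs: number; tokens: string[]; }

function staleFiles(
  cache: Map<string, CacheEntry>,
  current: Map<string, number>, // path → mtimeMs read from the filesystem
): string[] {
  const stale: string[] = [];
  for (const [path, mtimeMs] of current) {
    const entry = cache.get(path);
    if (!entry || entry.mtimeMs !== mtimeMs) stale.push(path); // new or modified file
  }
  return stale;
}

const cache = new Map([['a.ts', { mtimeMs: 100, tokens: ['a'] }]]);
const onDisk = new Map([['a.ts', 100], ['b.ts', 200]]);
console.log(staleFiles(cache, onDisk)); // [ 'b.ts' ] — only b.ts needs re-tokenizing
```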
+ ### Multi-Language Dependency Graphs
 
- Supports **OpenAI, Anthropic, Google, and Azure** providers with SSE streaming.
-
- ---
-
- ## Multi-Model Optimization
+ Regex-based import parsing for **Python**, **Go**, **Java**, and **Rust** alongside ts-morph for TS/JS. Enables hub detection, risk scoring, and dependency expansion for polyglot codebases.
 
  ```bash
- npx cto-ai-cli --benchmark
+ # Works on Python, Go, Java, Rust projects — not just TypeScript
+ cto --context "fix auth handler" /path/to/go-project
  ```
 
- CTO knows 8 model profiles and recommends the best model for your task:
-
- | Model | Context Window | Strengths |
- |-------|---------------|-----------|
- | GPT-4o | 128K | General coding, debugging |
- | GPT-4o Mini | 128K | Fast, cheap, simple tasks |
- | Claude Sonnet 4 | 200K | Complex refactoring, architecture |
- | Claude 3.5 Haiku | 200K | Fast analysis, code review |
- | Gemini 2.0 Flash | 1M | Huge codebases, exploration |
- | Gemini 2.5 Pro | 1M | Deep reasoning, long context |
- | DeepSeek V3 | 128K | Cost-effective coding |
- | Codestral | 256K | Code completion, generation |
+ ### Team Authentication & SSO
 
- For each model, CTO computes: **budget** (based on context window), **quality score** (strength match + coverage), **estimated cost**, and a **recommendation** (best quality, best value, cheapest).
+ Per-team API keys, JWT validation (HS256/RS256), rate limiting, model allowlists. Teams stored in `.cto/gateway/teams.json`.
 
- ---
+ ### Metrics Export
 
- ## Security
+ Prometheus exposition format at `/__cto/metrics`, Datadog JSON, and StatsD UDP. Counters, histograms, gauges for requests, tokens, cost, latency, secrets.
 
- ### API Server Path Traversal Protection
+ ### Per-Team Policy Engine
 
- The API server (`cto-api`) validates all project paths:
+ Routing rules per team: model overrides by task type, cost caps per request, context budget limits, block rules. Preset policies: `createCostConscious()`, `createSecurityFirst()`.
 
- - **Forbidden system paths** — blocks `/etc`, `/usr`, `/var`, `/sys`, `/proc`, `/dev`, `/boot`, `/tmp`, `/root`
- - **Forbidden patterns** — blocks paths containing `.ssh`, `.gnupg`, `.aws`, `.env`, `passwd`, `shadow`
- - **Allowlist** — set `CTO_ALLOWED_ROOTS=/home/deploy,/opt/projects` to restrict access to specific directories
+ ### Closed-Loop A/B Testing
 
- ### Secret Detection
+ Real experimentation on context strategies with two-proportion z-test for statistical significance. Deterministic assignment (SHA-256 hashing), auto-conclusion when p < 0.05.
 
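The two-proportion z-test used for auto-conclusion is standard statistics; a minimal sketch, where |z| > 1.96 corresponds to two-sided p < 0.05 (simplified relative to whatever the package actually does):

```typescript
// Sketch: two-proportion z-statistic comparing accept rates of strategies A and B.
function twoProportionZ(acceptsA: number, nA: number, acceptsB: number, nB: number): number {
  const pA = acceptsA / nA;
  const pB = acceptsB / nB;
  const pooled = (acceptsA + acceptsB) / (nA + nB); // pooled proportion under H0
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pA - pB) / se;
}

// 80/100 accepts vs 60/100 accepts: clearly significant at the 1.96 threshold.
const z = twoProportionZ(80, 100, 60, 100);
console.log(Math.abs(z) > 1.96); // true
```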
- 45+ patterns including AWS, Stripe, GitHub, OpenAI, Datadog, Sentry, Firebase, Supabase, and more. Plus:
+ ### LSP Bridge (IDE Plugin)
 
- - **Shannon entropy analysis** for unknown secret formats
- - **PII detection** (emails, SSNs, phone numbers) with safe domain filtering
- - **Allowlist system** — SHA-256 fingerprinted exceptions in `.cto/audit/allowlist.json`
- - **Incremental scanning** — file hash cache, only re-scans changed files
- - **Pre-commit hook** — `npx cto-ai-cli --audit --init-hook` installs a git hook that blocks commits with secrets
-
- ---
+ JSON-RPC 2.0 server over stdin/stdout for any IDE: VS Code, JetBrains, Neovim, Emacs. Custom methods: `cto/selectContext`, `cto/score`, `cto/audit`, `cto/experiments`.
 
  ## Honest Limitations
 
- - **TypeScript/JavaScript gets the deepest analysis.** Other languages (Python, Go, Rust, Java) get basic file + import analysis.
- - **Benchmarks use simple baselines** (alphabetical, random). Run `npx tsx scripts/benchmark.ts` on your own repo to see real numbers. We haven't compared against Cursor's or Copilot's internal context selection.
- - **Savings are estimates** based on average API pricing. Actual savings depend on your model and usage.
- - **Risk scoring uses a complexity proxy** instead of real git churn data (planned improvement).
-
- ---
+ - **TypeScript/JavaScript gets AST analysis.** Python/Go/Java/Rust get regex-based import parsing (good for graphs, not AST-accurate).
+ - **BM25 + reranker, not embeddings.** 96.9% precision on our benchmark. No neural model needed.
+ - **Learning needs ~5 feedback cycles** to start influencing selection. First runs are pure graph + risk + semantic.
+ - **Benchmarked against naive baselines** (alphabetical, random, risk-only, TF-IDF-only). Not compared against Cursor/Copilot internal context engines.
 
  ## Contributing
 
  ```bash
- git clone https://github.com/cto-ai/cto-ai-cli.git
- cd cto-ai-cli
- npm install
- npm run build
- npm test             # 550 tests, 91% coverage
- npm run typecheck    # strict TypeScript, zero errors
- ```
-
- Run the automated benchmark to see CTO vs naive on this repo:
-
- ```bash
- npx tsx scripts/benchmark.ts          # Human-readable report
- npx tsx scripts/benchmark.ts --json   # Machine-readable JSON
+ git clone https://github.com/cto-ai/cto-ai-cli.git && cd cto-ai-cli
+ npm install && npm run build && npm test   # 776 tests
  ```
 
- Full API docs, MCP server reference, and architecture are in [DOCS.md](DOCS.md).
-
  ## License
 
  [MIT](LICENSE)