composto-ai 0.1.1 → 0.2.0

package/README.md CHANGED
@@ -1,150 +1,142 @@
1
1
  # Composto
2
2
 
3
- **Proactive AI team companion: less tokens, more insight.**
3
+ **Send meaning to your LLM, not code. 89% fewer tokens, same understanding.**
4
4
 
5
- Every AI coding tool sends raw source code to LLMs. Composto sends **meaning**: compressed code enriched with codebase health data. The result: fewer tokens carrying more information than raw source ever could.
5
+ Composto parses your code into an AST, classifies every node by importance, and drops the noise. Your LLM gets the signal (function signatures, control flow, dependencies) without the braces, semicolons, and string literals it already knows.
6
+
7
+ ```
8
+ Raw source: 3,782 tokens → Composto IR: 663 tokens (82.5% savings)
9
+
10
+ USE:[../types.js, ./structure.js, ./fingerprint.js, ./health.js]
11
+ OUT FN:generateL0(code: string, filePath: string)
12
+ RET `${filePath}\n${declarations.join("\n")}`
13
+ OUT ASYNC FN:generateL1(code: string, filePath: string, health: HealthAnnotation...)
14
+ IF:health → RET annotateIR(ir, health)
15
+ RET ir
16
+ OUT FN:generateLayer(layer: IRLayer, options: {...})
17
+ SWITCH:layer
18
+ CASE:"L0" → RET generateL0(...)
19
+ CASE:"L1" → RET generateL1(...)
20
+ CASE:"L2" → RET generateL2(...)
21
+ CASE:"L3" → RET options.code
22
+ ```
6
23
 
7
24
  ---
8
25
 
9
- ## What Makes It Different
26
+ ## Quick Start
10
27
 
11
- | | Traditional AI Tools | Composto |
12
- |---|---|---|
13
- | **Paradigm** | Reactive (you ask, it does) | Proactive (it finds, you approve) |
14
- | **What LLM sees** | Raw source code | Health-Aware IR |
15
- | **Token usage** | Full files every time | 60-75% savings |
16
- | **Health context** | None | Hotspots, decay, inconsistencies |
17
- | **Codebase monitoring** | None | Watcher Engine |
28
+ ```bash
29
+ # Install
30
+ npm install -g composto-ai
18
31
 
19
- ### Health-Aware IR
32
+ # See how much you save
33
+ composto benchmark .
20
34
 
21
- Raw source tells the LLM *what* the code says. Composto IR tells it *what the code means* and *how healthy it is*:
35
+ # Generate IR for a file
36
+ composto ir src/app.ts
22
37
 
23
- ```
24
- // Raw source: 340 tokens, zero health context
25
- import { useState, useEffect } from "react";
26
- export function UserProfile({ userId }) {
27
- const [user, setUser] = useState(null);
28
- const [loading, setLoading] = useState(true);
29
- useEffect(() => { fetchUser(userId).then(...) }, [userId]);
30
- if (loading) return <Spinner />;
31
- if (!user) return <NotFound />;
32
- return <div>{user.name}</div>;
33
- }
34
-
35
- // Composto IR: 85 tokens + health context
36
- USE:react{useState,useEffect}
37
- OUT FN:UserProfile({userId}) [HOT:12/30 FIX:67% COV:↓ INCON]
38
- VAR:user = useState(null)
39
- VAR:loading = useState(true)
40
- IF:loading -> RET <Spinner />
41
- IF:!user -> RET <NotFound />
42
- RET <div>{user.name}</div>
38
+ # Smart context within a token budget
39
+ composto context src/ --budget 2000
43
40
  ```
44
41
 
45
- The LLM sees less, knows more, decides better.
46
-
47
42
  ---
48
43
 
49
- ## Installation
44
+ ## How It Works
50
45
 
51
- ### Claude Code
46
+ Composto uses [tree-sitter](https://tree-sitter.github.io/) to parse your code into an AST, then walks every node and classifies it:
52
47
 
53
- ```
54
- /plugin install composto
55
- ```
48
+ | Tier | Action | What | % of nodes |
49
+ |------|--------|------|-----------|
50
+ | **Tier 1** | Keep | imports, functions, classes, interfaces, types, enums | 0.8% |
51
+ | **Tier 2** | Summarize | if, for, while, switch, return, throw, try/catch | 0.9% |
52
+ | **Tier 3** | Compress | variable declarations → one-liner, await → kept | 6.9% |
53
+ | **Tier 4** | Drop | string contents, operators, punctuation, comments | **86.6%** |
56
54
 
57
- ### Cursor
55
+ 86.6% of your code's AST nodes are noise. Composto drops them.
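
To make the tier table concrete, here is a minimal sketch of tier classification. It is not Composto's actual implementation: the node-type names are assumptions loosely modeled on tree-sitter grammars, and `TIER_BY_NODE_TYPE`, `classify`, and `tierCounts` are hypothetical helpers introduced only for illustration.

```typescript
type Tier = 1 | 2 | 3 | 4;

// Hypothetical mapping from node type to tier (assumed names, not Composto's real table).
const TIER_BY_NODE_TYPE = new Map<string, Tier>([
  ["import_statement", 1],       // Tier 1: keep
  ["function_declaration", 1],
  ["class_declaration", 1],
  ["interface_declaration", 1],
  ["if_statement", 2],           // Tier 2: summarize
  ["for_statement", 2],
  ["return_statement", 2],
  ["lexical_declaration", 3],    // Tier 3: compress to a one-liner
  ["await_expression", 3],
]);

// Anything not listed (strings, operators, punctuation, comments) falls into Tier 4: drop.
function classify(nodeType: string): Tier {
  return TIER_BY_NODE_TYPE.get(nodeType) ?? 4;
}

// Count how many nodes land in each tier.
function tierCounts(nodeTypes: string[]): Record<Tier, number> {
  const counts: Record<Tier, number> = { 1: 0, 2: 0, 3: 0, 4: 0 };
  for (const t of nodeTypes) counts[classify(t)] += 1;
  return counts;
}

const sample = [
  "import_statement", "function_declaration", "if_statement",
  "lexical_declaration", "string", ";", "{", "}",
];
console.log(tierCounts(sample));
```

Defaulting unknown node types to Tier 4 is a design choice of this sketch; a real classifier might treat unknown nodes more conservatively.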
58
56
 
59
- ```
60
- /add-plugin composto
61
- ```
57
+ ---
62
58
 
63
- ### Any platform (CLI)
59
+ ## Commands
64
60
 
65
61
  ```bash
66
- npm install -g composto
67
- ```
68
-
69
- ---
62
+ # Benchmark token savings across your project
63
+ composto benchmark .
70
64
 
71
- ## Usage
65
+ # Generate IR at different detail levels
66
+ composto ir <file> L0 # Structure map (~10 tokens) — just names
67
+ composto ir <file> L1 # Full IR — compressed code + health signals
68
+ composto ir <file> L2 # Delta context — only what changed
69
+ composto ir <file> L3 # Raw source — original code
72
70
 
73
- ### CLI Commands
71
+ # Smart context packing within a token budget
72
+ composto context <path> --budget <tokens>
73
+ # Fits maximum information into your budget:
74
+ # hotspot files get L1 (detailed), rest get L0 (structure)
74
75
 
75
- ```bash
76
- # Scan codebase for issues
76
+ # Scan for security issues and debug artifacts
77
77
  composto scan .
78
78
 
79
- # Analyze codebase health trends
79
+ # Analyze git history for health trends
80
80
  composto trends .
81
81
 
82
- # Generate Health-Aware IR for a file
83
- composto ir src/auth/login.ts L1
84
-
85
- # Layer options:
86
- # L0 — Structure map (~10 tokens)
87
- # L1 — Health-Aware IR (~85 tokens)
88
- # L2 — Delta context (~65 tokens)
89
- # L3 — Raw source (fallback)
82
+ # Compare LLM quality: raw code vs IR (requires ANTHROPIC_API_KEY)
83
+ composto benchmark-quality <file>
90
84
  ```
91
85
 
92
- ### As a Plugin
93
-
94
- Once installed, Composto activates automatically. Your AI agent will:
95
-
96
- 1. **Scan** the codebase for issues before starting work
97
- 2. **Check trends** for files being modified
98
- 3. **Use IR** instead of raw source when sharing code context
99
-
100
- No commands needed — it just works.
101
-
102
86
  ---
103
87
 
104
- ## What It Does
88
+ ## Quality Proof
105
89
 
106
- ### IR Engine: Send meaning, not code
90
+ We tested 4 files from simple to hard. Same question, raw code vs IR: "What does this file do?"
107
91
 
108
- Four layers of code representation, from most compact to full source:
92
+ | File | Complexity | Raw Tokens | IR Tokens | Savings | Comprehension |
93
+ |------|-----------|-----------|----------|---------|--------------|
94
+ | hotspot.ts | Simple | 299 | 77 | 74.2% | Full |
95
+ | layers.ts | Medium | 765 | 249 | 67.5% | Full |
96
+ | detector.ts | Medium | 704 | 160 | 77.3% | Full |
97
+ | ast-walker.ts | **Hard (448 lines)** | 3,782 | 663 | 82.5% | ~90% |
109
98
 
110
- | Layer | Tokens | Use |
111
- |---|---|---|
112
- | L0: Structure Map | ~10 | File outline — functions, classes, line numbers |
113
- | L1: Health-Aware IR | ~85 | Compressed code + health annotations |
114
- | L2: Delta + Context | ~65 | Only what changed, with surrounding context |
115
- | L3: Raw Source | variable | Original code, specific lines only |
99
+ Even on a 448-line recursive AST walker with nested switches, an LLM can fully explain the architecture, all 12 functions, and the data flow from the IR alone.
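
The Savings column in the table above follows from the two token counts as `1 - IR / raw`. A small sketch that reproduces the figures (`savingsPercent` is a hypothetical helper, not part of the Composto CLI):

```typescript
// Percentage of tokens saved, rounded to one decimal place.
function savingsPercent(rawTokens: number, irTokens: number): number {
  return Math.round((1 - irTokens / rawTokens) * 1000) / 10;
}

// Token counts taken from the table above.
const rows: Array<[string, number, number]> = [
  ["hotspot.ts", 299, 77],
  ["layers.ts", 765, 249],
  ["detector.ts", 704, 160],
  ["ast-walker.ts", 3782, 663],
];

for (const [file, raw, ir] of rows) {
  console.log(`${file}: ${savingsPercent(raw, ir)}%`);
}
// prints 74.2%, 67.5%, 77.3%, 82.5% for the four rows, in order
```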
116
100
 
117
- No AST parser. No language-specific dependencies. Works with TypeScript, JavaScript, Python, Go, and more.
101
+ **What IR preserves:** function signatures, parameter types, imports, control flow, return values, class/interface declarations.
118
102
 
119
- ### Watcher Engine: Proactive issue detection
103
+ **What IR drops:** string contents, regex patterns, operator details, and formatting, all things the LLM already knows.
120
104
 
121
- Detects problems without being asked:
105
+ Full benchmark: [docs/benchmark-proof.md](docs/benchmark-proof.md)
106
+
107
+ ---
122
108
 
123
- - **Security** — Hardcoded secrets, API keys, tokens
124
- - **Debug artifacts** — `console.log`, `console.debug` left in source
125
- - **Context-aware severity** — Same issue, different severity in `src/` vs `tests/`
109
+ ## IR Layers
126
110
 
127
- ### Trend Analysis: Codebase health over time
111
+ | Layer | Tokens | Use case |
112
+ |-------|--------|----------|
113
+ | **L0** | ~10 | "What's in this file?" — just function/class names |
114
+ | **L1** | ~85 | "What does this file do?" — compressed code + health signals |
115
+ | **L2** | ~65 | "What changed?" — git diff with context |
116
+ | **L3** | variable | "Show me the exact code" — raw source |
128
117
 
129
- Analyzes git history to find:
118
+ ### When to use which
130
119
 
131
- - **Hotspots** — Files that change too often with too many bug fixes
132
- - **Decay signals**: Areas where churn is accelerating
133
- - **Inconsistencies**: Files touched by many authors with conflicting patterns
120
+ ```
121
+ "Explain the architecture" → L1 for all files
122
+ "Fix this bug" → L3 for target file, L1 for context
123
+ "Review this PR" → L2 for changed files, L1 for context
124
+ "What files are in this repo?" → L0 for everything
125
+ ```
134
126
 
135
- All trend analysis is zero-token — pure local git analysis.
127
+ ---
136
128
 
137
- ### Health Annotations — The killer feature
129
+ ## Health-Aware IR
138
130
 
139
- IR Engine and Trend Analysis are not separate systems. Health data is embedded directly into code representation:
131
+ Composto analyzes git history and embeds health signals directly into IR:
140
132
 
141
133
  ```
142
134
  FN:handleAuth({credentials}) [HOT:15/30 FIX:73% COV:↓ INCON]
143
- VAR:session = createSession(credentials)
144
- IF:!session -> RET 401
135
+ IF:!session → RET 401
136
+ RET { token, expiresAt }
145
137
  ```
146
138
 
147
- - `[HOT:15/30]` — 15 changes in last 30 commits
139
+ - `[HOT:15/30]` — 15 changes in last 30 commits (hotspot)
148
140
  - `[FIX:73%]` — 73% of changes were bug fixes
149
141
  - `[COV:↓]` — Test coverage declining
150
142
  - `[INCON]` — Inconsistent patterns from multiple authors
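
For tooling that consumes the IR, the bracketed health tags can be read back into a structured form. The parser below is a sketch based only on the four tags listed above. `ParsedHealth` and `parseHealth` are hypothetical names, and the real annotation grammar may differ.

```typescript
interface ParsedHealth {
  hot?: { changes: number; window: number }; // [HOT:15/30]
  fixRatio?: number;                         // [FIX:73%] as a 0..1 fraction
  coverageDeclining: boolean;                // [COV:↓]
  inconsistent: boolean;                     // [INCON]
}

// Parse a trailing "[...]" annotation on an IR line; returns null if there is none.
// Unrecognized tokens inside the brackets are ignored.
function parseHealth(irLine: string): ParsedHealth | null {
  const m = irLine.match(/\[([^\]]+)\]\s*$/);
  if (!m) return null;
  const result: ParsedHealth = { coverageDeclining: false, inconsistent: false };
  for (const token of m[1].split(/\s+/)) {
    const hot = token.match(/^HOT:(\d+)\/(\d+)$/);
    const fix = token.match(/^FIX:(\d+)%$/);
    if (hot) {
      result.hot = { changes: Number(hot[1]), window: Number(hot[2]) };
    } else if (fix) {
      result.fixRatio = Number(fix[1]) / 100;
    } else if (token === "COV:↓") {
      result.coverageDeclining = true;
    } else if (token === "INCON") {
      result.inconsistent = true;
    }
  }
  return result;
}

console.log(parseHealth("FN:handleAuth({credentials}) [HOT:15/30 FIX:73% COV:↓ INCON]"));
```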
@@ -153,37 +145,53 @@ Only unhealthy code gets annotated. Healthy files stay clean.
153
145
 
154
146
  ---
155
147
 
156
- ## Architecture
148
+ ## Context Budget
149
+
150
+ Don't guess which files to send. Let Composto decide:
157
151
 
152
+ ```bash
153
+ composto context src/ --budget 2000
158
154
  ```
159
- +----------------------------------------------+
160
- | Platform Adapters |
161
- | Claude Code | VS Code | Cursor | CLI |
162
- +----------------------------------------------+
163
- | Watcher Engine |
164
- | Detector (0 token) -> Interpreter (~100 tok) |
165
- | + Trend Analysis (hotspots, decay, incon.) |
166
- +----------------------------------------------+
167
- | IR Engine |
168
- | Indentation Intel | Fingerprinting | Delta |
169
- | + Health Annotations (from Trend Analysis) |
170
- +----------------------------------------------+
171
- | Rule-Based Router |
172
- | Deterministic routing, zero tokens |
173
- +----------------------------------------------+
174
- | Agent Pool |
175
- | Fixer (Haiku) | Reviewer (Sonnet) |
176
- +----------------------------------------------+
177
- | Project Memory |
178
- | .composto/config.yaml | decisions/*.md |
179
- +----------------------------------------------+
155
+
156
+ Output:
157
+ ```
158
+ == L1 (detailed) ==
159
+ [hotspot] src/auth/login.ts
160
+ USE:[./types.js, ./session.js]
161
+ OUT ASYNC FN:login(credentials)
162
+ TRY
163
+ IF:!valid → THROW:AuthError
164
+ RET { token, user }
165
+
166
+ == L0 (structure) ==
167
+ src/utils/helpers.ts
168
+ FN:formatDate L5
169
+ FN:parseQuery L23
170
+ ...
171
+
172
+ Budget: 1994/2000 tokens
173
+ Files: 9 at L1, 16 at L0
174
+ ```
175
+
176
+ Hotspot files get full detail. Everything else gets structure. Budget is never exceeded.
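
One way such budget packing could work is a two-pass greedy strategy: give every file its cheap L0 outline first, then upgrade hotspot files to L1 while the budget allows. This is a sketch under that assumption, not Composto's actual algorithm. `FileCost`, `packContext`, and the token figures are hypothetical.

```typescript
interface FileCost {
  path: string;
  hotspot: boolean;
  l0Tokens: number; // cost of the structure-only view
  l1Tokens: number; // cost of the detailed IR view
}

interface PackedFile {
  path: string;
  layer: "L0" | "L1";
  tokens: number;
}

function packContext(files: FileCost[], budget: number): { packed: PackedFile[]; used: number } {
  const packed: PackedFile[] = [];
  const costs = new Map<string, FileCost>();
  let used = 0;

  // Pass 1: include each file at L0 if it still fits.
  for (const f of files) {
    if (used + f.l0Tokens > budget) continue;
    packed.push({ path: f.path, layer: "L0", tokens: f.l0Tokens });
    costs.set(f.path, f);
    used += f.l0Tokens;
  }

  // Pass 2: upgrade hotspot files to L1 when the extra cost fits.
  for (const entry of packed) {
    const f = costs.get(entry.path);
    if (!f || !f.hotspot) continue;
    const extra = f.l1Tokens - entry.tokens;
    if (used + extra <= budget) {
      entry.layer = "L1";
      entry.tokens = f.l1Tokens;
      used += extra;
    }
  }
  return { packed, used };
}

const demo = packContext(
  [
    { path: "src/auth/login.ts", hotspot: true, l0Tokens: 10, l1Tokens: 85 },
    { path: "src/utils/helpers.ts", hotspot: false, l0Tokens: 12, l1Tokens: 90 },
  ],
  100,
);
console.log(demo.used, demo.packed.map((p) => `${p.layer} ${p.path}`));
```

Because every addition is checked against the remaining budget before it is applied, `used` can never exceed `budget` in this sketch.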
177
+
178
+ ---
179
+
180
+ ## Stats
181
+
182
+ ```
183
+ Overall compression: 89.2%
184
+ L0 compression: 97.5%
185
+ AST engine: 51/51 files (0 regex fallback)
186
+ Languages: TypeScript, JavaScript, Python, Go, Rust
187
+ Tests: 145 passing
180
188
  ```
181
189
 
182
190
  ---
183
191
 
184
192
  ## Configuration
185
193
 
186
- Create `.composto/config.yaml` in your project root:
194
+ Optional `.composto/config.yaml`:
187
195
 
188
196
  ```yaml
189
197
  watchers:
@@ -194,96 +202,26 @@ watchers:
194
202
  "tests/**": info
195
203
  consoleLog:
196
204
  enabled: true
197
- severity:
198
- "src/**": warning
199
- "tests/**": info
200
-
201
- agents:
202
- fixer:
203
- enabled: true
204
- model: haiku
205
-
206
- ir:
207
- deltaContextLines: 3
208
- confidenceThreshold: 0.6
209
- genericPatterns: default
210
205
 
211
206
  trends:
212
207
  enabled: true
213
208
  hotspotThreshold: 10
214
209
  bugFixRatioThreshold: 0.5
215
- decayCheckTrigger: on-commit
216
- fullReportSchedule: weekly
217
210
  ```
218
211
 
219
212
  All settings have sensible defaults. The config file is optional.
220
213
 
221
214
  ---
222
215
 
223
- ## How It Works
224
-
225
- ```
226
- 1. Developer saves src/auth/login.ts
227
- |
228
- 2. Watcher Engine triggers (debounced)
229
- |
230
- 3. Detector: pattern match → "hardcoded secret, line 23" (0 tokens)
231
- |
232
- 4. IR Engine: generates Health-Aware IR + annotations (0 tokens)
233
- |
234
- 5. Router: severity=critical → route to Fixer (0 tokens)
235
- |
236
- 6. Fixer: generates fix via IR, not full source (~150 tokens)
237
- |
238
- 7. User: "login.ts:23 has a hardcoded secret.
239
- You added it for debugging. Move to .env?"
240
- |
241
- 8. User approves → patch applied
242
-
243
- Total cost: ~250 tokens. Traditional tools: ~3000+ tokens.
244
- ```
245
-
246
- ---
247
-
248
- ## Tech Stack
249
-
250
- - **Language:** TypeScript
251
- - **Runtime:** Node.js
252
- - **Testing:** Vitest (70 tests)
253
- - **Build:** tsup
254
- - **Zero native dependencies** — no tree-sitter, no language-specific parsers
255
-
256
- ---
257
-
258
- ## Roadmap
259
-
260
- ### v0.5 — Usable Alpha
261
- - Watcher Interpreter (batch Haiku calls for contextual explanations)
262
- - Reviewer Agent (Sonnet, code review with challenge mode)
263
- - Project Memory (decisions/ with YAML frontmatter)
264
- - Python + Go language support
265
-
266
- ### v1.0 — Public Release
267
- - Framework-specific fingerprint patterns (React, Express, etc.)
268
- - VS Code / Cursor / Claude Code deep integrations
269
- - Benchmark results: Health-Aware IR vs raw source
270
-
271
- ### v2.0 — Platform
272
- - Security / Architect agents
273
- - Custom Agent API
274
- - Team sync features
275
-
276
- ---
277
-
278
216
  ## Contributing
279
217
 
280
218
  ```bash
281
219
  git clone https://github.com/mertcanaltin/composto
282
220
  cd composto
283
221
  pnpm install
284
- pnpm test # 70 tests
222
+ pnpm test # 145 tests
285
223
  pnpm build # builds to dist/
286
- pnpm dev scan . # run locally
224
+ npx composto benchmark . # see compression stats
287
225
  ```
288
226
 
289
227
  ---