composto-ai 0.1.1 → 0.2.0 (npm)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/README.md +130 -192
package/dist/chunk-MFNGSHTZ.js +1163 -0
package/dist/index.js +31 -1164
package/dist/mcp/server.js +202 -0
package/grammars/tree-sitter-go.wasm +0 -0
package/grammars/tree-sitter-javascript.wasm +0 -0
package/grammars/tree-sitter-python.wasm +0 -0
package/grammars/tree-sitter-rust.wasm +0 -0
package/grammars/tree-sitter-typescript.wasm +0 -0
package/package.json +7 -3

package/README.md CHANGED

@@ -1,150 +1,142 @@
 # Composto
-**Proactive AI team companion — less tokens, more insight.**
+**Send meaning to your LLM, not code. 89% fewer tokens, same understanding.**
-Every AI coding tool sends raw source code to LLMs. Composto sends **meaning** — compressed code enriched with codebase health data. The result: fewer tokens carrying more information than raw source ever could.
+Composto parses your code into an AST, classifies every node by importance, and drops the noise. Your LLM gets the signal — function signatures, control flow, dependencies — without the braces, semicolons, and string literals it already knows.
+```
+Raw source:  3,782 tokens    →    Composto IR:  663 tokens (82.5% savings)
+USE:[../types.js, ./structure.js, ./fingerprint.js, ./health.js]
+OUT FN:generateL0(code: string, filePath: string)
+    RET `${filePath}\n${declarations.join("\n")}`
+OUT ASYNC FN:generateL1(code: string, filePath: string, health: HealthAnnotation...)
+    IF:health → RET annotateIR(ir, health)
+    RET ir
+OUT FN:generateLayer(layer: IRLayer, options: {...})
+    SWITCH:layer
+        CASE:"L0" → RET generateL0(...)
+        CASE:"L1" → RET generateL1(...)
+        CASE:"L2" → RET generateL2(...)
+        CASE:"L3" → RET options.code
+```
 ---
-## What Makes It Different
+## Quick Start
-| | Traditional AI Tools | Composto |
-|---|---|---|
-| **Paradigm** | Reactive (you ask, it does) | Proactive (it finds, you approve) |
-| **What LLM sees** | Raw source code | Health-Aware IR |
-| **Token usage** | Full files every time | 60-75% savings |
-| **Health context** | None | Hotspots, decay, inconsistencies |
-| **Codebase monitoring** | None | Watcher Engine |
+```bash
+# Install
+npm install -g composto-ai
-### Health-Aware IR
+# See how much you save
+composto benchmark .
-Raw source tells the LLM *what* the code says. Composto IR tells it *what the code means* and *how healthy it is*:
+# Generate IR for a file
+composto ir src/app.ts
-```
-// Raw source: 340 tokens, zero health context
-import { useState, useEffect } from "react";
-export function UserProfile({ userId }) {
-  const [user, setUser] = useState(null);
-  const [loading, setLoading] = useState(true);
-  useEffect(() => { fetchUser(userId).then(...) }, [userId]);
-  if (loading) return <Spinner />;
-  if (!user) return <NotFound />;
-  return <div>{user.name}</div>;
-}
-// Composto IR: 85 tokens + health context
-USE:react{useState,useEffect}
-OUT FN:UserProfile({userId}) [HOT:12/30 FIX:67% COV:↓ INCON]
-  VAR:user = useState(null)
-  VAR:loading = useState(true)
-  IF:loading -> RET <Spinner />
-  IF:!user -> RET <NotFound />
-  RET <div>{user.name}</div>
+# Smart context within a token budget
+composto context src/ --budget 2000
 ```
-The LLM sees less, knows more, decides better.
 ---
-## Installation
+## How It Works
-### Claude Code
+Composto uses [tree-sitter](https://tree-sitter.github.io/) to parse your code into an AST, then walks every node and classifies it:
-```
-/plugin install composto
-```
+| Tier | Action | What | % of nodes |
+|------|--------|------|-----------|
+| **Tier 1** | Keep | imports, functions, classes, interfaces, types, enums | 0.8% |
+| **Tier 2** | Summarize | if, for, while, switch, return, throw, try/catch | 0.9% |
+| **Tier 3** | Compress | variable declarations → one-liner, await → kept | 6.9% |
+| **Tier 4** | Drop | string contents, operators, punctuation, comments | **86.6%** |
-### Cursor
+86.6% of your code's AST nodes are noise. Composto drops them.
-```
-/add-plugin composto
-```
+---
-### Any platform (CLI)
+## Commands
 ```bash
-npm install -g composto
-```
----
+# Benchmark token savings across your project
+composto benchmark .
-## Usage
+# Generate IR at different detail levels
+composto ir <file> L0    # Structure map (~10 tokens) — just names
+composto ir <file> L1    # Full IR — compressed code + health signals
+composto ir <file> L2    # Delta context — only what changed
+composto ir <file> L3    # Raw source — original code
-### CLI Commands
+# Smart context packing within a token budget
+composto context <path> --budget <tokens>
+# Fits maximum information into your budget:
+# hotspot files get L1 (detailed), rest get L0 (structure)
-```bash
-# Scan codebase for issues
+# Scan for security issues and debug artifacts
 composto scan .
-# Analyze codebase health trends
+# Analyze git history for health trends
 composto trends .
-# Generate Health-Aware IR for a file
-composto ir src/auth/login.ts L1
-# Layer options:
-#   L0 — Structure map (~10 tokens)
-#   L1 — Health-Aware IR (~85 tokens)
-#   L2 — Delta context (~65 tokens)
-#   L3 — Raw source (fallback)
+# Compare LLM quality: raw code vs IR (requires ANTHROPIC_API_KEY)
+composto benchmark-quality <file>
 ```
-### As a Plugin
-Once installed, Composto activates automatically. Your AI agent will:
-1. **Scan** the codebase for issues before starting work
-2. **Check trends** for files being modified
-3. **Use IR** instead of raw source when sharing code context
-No commands needed — it just works.
 ---
-## What It Does
+## Quality Proof
-### IR Engine — Send meaning, not code
+We tested 4 files from simple to hard. Same question, raw code vs IR: "What does this file do?"
-Four layers of code representation, from most compact to full source:
+| File | Complexity | Raw Tokens | IR Tokens | Savings | Comprehension |
+|------|-----------|-----------|----------|---------|--------------|
+| hotspot.ts | Simple | 299 | 77 | 74.2% | Full |
+| layers.ts | Medium | 765 | 249 | 67.5% | Full |
+| detector.ts | Medium | 704 | 160 | 77.3% | Full |
+| ast-walker.ts | **Hard (448 lines)** | 3,782 | 663 | 82.5% | ~90% |
-| Layer | Tokens | Use |
-|---|---|---|
-| L0: Structure Map | ~10 | File outline — functions, classes, line numbers |
-| L1: Health-Aware IR | ~85 | Compressed code + health annotations |
-| L2: Delta + Context | ~65 | Only what changed, with surrounding context |
-| L3: Raw Source | variable | Original code, specific lines only |
+Even on a 448-line recursive AST walker with nested switches, an LLM can fully explain the architecture, all 12 functions, and the data flow from the IR alone.
-No AST parser. No language-specific dependencies. Works with TypeScript, JavaScript, Python, Go, and more.
+**What IR preserves:** function signatures, parameter types, imports, control flow, return values, class/interface declarations.
-### Watcher Engine — Proactive issue detection
+**What IR drops:** string contents, regex patterns, operator details, formatting — things the LLM already knows.
-Detects problems without being asked:
+Full benchmark: [docs/benchmark-proof.md](docs/benchmark-proof.md)
+---
-- **Security** — Hardcoded secrets, API keys, tokens
-- **Debug artifacts** — `console.log`, `console.debug` left in source
-- **Context-aware severity** — Same issue, different severity in `src/` vs `tests/`
+## IR Layers
-### Trend Analysis — Codebase health over time
+| Layer | Tokens | Use case |
+|-------|--------|----------|
+| **L0** | ~10 | "What's in this file?" — just function/class names |
+| **L1** | ~85 | "What does this file do?" — compressed code + health signals |
+| **L2** | ~65 | "What changed?" — git diff with context |
+| **L3** | variable | "Show me the exact code" — raw source |
-Analyzes git history to find:
+### When to use which
-- **Hotspots** — Files that change too often with too many bug fixes
-- **Decay signals** — Areas where churn is accelerating
-- **Inconsistencies** — Files touched by many authors with conflicting patterns
+```
+"Explain the architecture"     → L1 for all files
+"Fix this bug"                 → L3 for target file, L1 for context
+"Review this PR"               → L2 for changed files, L1 for context
+"What files are in this repo?" → L0 for everything
+```
-All trend analysis is zero-token — pure local git analysis.
+---
-### Health Annotations — The killer feature
+## Health-Aware IR
-IR Engine and Trend Analysis are not separate systems. Health data is embedded directly into code representation:
+Composto analyzes git history and embeds health signals directly into IR:
 ```
 FN:handleAuth({credentials}) [HOT:15/30 FIX:73% COV:↓ INCON]
-  VAR:session = createSession(credentials)
-  IF:!session -> RET 401
+  IF:!session → RET 401
+  RET { token, expiresAt }
 ```
-- `[HOT:15/30]` — 15 changes in last 30 commits
+- `[HOT:15/30]` — 15 changes in last 30 commits (hotspot)
 - `[FIX:73%]` — 73% of changes were bug fixes
 - `[COV:↓]` — Test coverage declining
 - `[INCON]` — Inconsistent patterns from multiple authors
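
Note on the annotation legend above: the bracketed tag is regular enough to be read mechanically. The following is a minimal sketch of such a reader, assuming only the tag format shown in the README (`HOT:n/m`, `FIX:p%`, `COV:↓`, `INCON`). The `HealthTag` type and the `parseHealthTag` function are hypothetical names introduced for illustration and are not taken from the composto-ai source.

```typescript
// Hypothetical type and parser, written only from the tag format in the README.
interface HealthTag {
  hotChanges?: number;        // n in HOT:n/m (changes to this code)
  hotWindow?: number;         // m in HOT:n/m (recent commits inspected)
  fixRatio?: number;          // FIX:p% expressed as a fraction between 0 and 1
  coverageDeclining: boolean; // true when COV:↓ is present
  inconsistent: boolean;      // true when INCON is present
}

function parseHealthTag(line: string): HealthTag | null {
  // Look for a bracketed group at the end of the IR line.
  const match = line.match(/\[([^\]]*)\]\s*$/);
  if (!match) return null;

  const tag: HealthTag = { coverageDeclining: false, inconsistent: false };
  let recognized = false;

  for (const part of match[1].split(/\s+/).filter(Boolean)) {
    const hot = part.match(/^HOT:(\d+)\/(\d+)$/);
    const fix = part.match(/^FIX:(\d+)%$/);
    if (hot) {
      tag.hotChanges = Number(hot[1]);
      tag.hotWindow = Number(hot[2]);
      recognized = true;
    } else if (fix) {
      tag.fixRatio = Number(fix[1]) / 100;
      recognized = true;
    } else if (part === "COV:↓") {
      tag.coverageDeclining = true;
      recognized = true;
    } else if (part === "INCON") {
      tag.inconsistent = true;
      recognized = true;
    }
  }

  // A trailing bracket with no known signal (for example an array index) is not a tag.
  return recognized ? tag : null;
}
```

Lines without a tag return `null`, which matches the README's statement that only unhealthy code gets annotated.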
@@ -153,37 +145,53 @@ Only unhealthy code gets annotated. Healthy files stay clean.
 ---
-## Architecture
+## Context Budget
+Don't guess which files to send. Let Composto decide:
+```bash
+composto context src/ --budget 2000
 ```
-+----------------------------------------------+
-|           Platform Adapters                   |
-|     Claude Code | VS Code | Cursor | CLI     |
-+----------------------------------------------+
-|              Watcher Engine                   |
-|  Detector (0 token) -> Interpreter (~100 tok) |
-|  + Trend Analysis (hotspots, decay, incon.)   |
-+----------------------------------------------+
-|              IR Engine                        |
-|  Indentation Intel | Fingerprinting | Delta   |
-|  + Health Annotations (from Trend Analysis)   |
-+----------------------------------------------+
-|          Rule-Based Router                    |
-|  Deterministic routing, zero tokens           |
-+----------------------------------------------+
-|           Agent Pool                          |
-|  Fixer (Haiku) | Reviewer (Sonnet)            |
-+----------------------------------------------+
-|          Project Memory                       |
-|  .composto/config.yaml | decisions/*.md       |
-+----------------------------------------------+
+Output:
+```
+== L1 (detailed) ==
+[hotspot] src/auth/login.ts
+  USE:[./types.js, ./session.js]
+  OUT ASYNC FN:login(credentials)
+    TRY
+      IF:!valid → THROW:AuthError
+      RET { token, user }
+== L0 (structure) ==
+src/utils/helpers.ts
+  FN:formatDate L5
+  FN:parseQuery L23
+...
+Budget: 1994/2000 tokens
+Files: 9 at L1, 16 at L0
+```
+Hotspot files get full detail. Everything else gets structure. Budget is never exceeded.
+---
+## Stats
+```
+Overall compression: 89.2%
+L0 compression:      97.5%
+AST engine:          51/51 files (0 regex fallback)
+Languages:           TypeScript, JavaScript, Python, Go, Rust
+Tests:               145 passing
 ```
 ---
 ## Configuration
-Create `.composto/config.yaml` in your project root:
+Optional `.composto/config.yaml`:
 ```yaml
 watchers:
@@ -194,96 +202,26 @@ watchers:
       "tests/**": info
   consoleLog:
     enabled: true
-    severity:
-      "src/**": warning
-      "tests/**": info
-agents:
-  fixer:
-    enabled: true
-    model: haiku
-ir:
-  deltaContextLines: 3
-  confidenceThreshold: 0.6
-  genericPatterns: default
 trends:
   enabled: true
   hotspotThreshold: 10
   bugFixRatioThreshold: 0.5
-  decayCheckTrigger: on-commit
-  fullReportSchedule: weekly
 ```
 All settings have sensible defaults. The config file is optional.
 ---
-## How It Works
-```
-1. Developer saves src/auth/login.ts
-        |
-2. Watcher Engine triggers (debounced)
-        |
-3. Detector: pattern match → "hardcoded secret, line 23" (0 tokens)
-        |
-4. IR Engine: generates Health-Aware IR + annotations (0 tokens)
-        |
-5. Router: severity=critical → route to Fixer (0 tokens)
-        |
-6. Fixer: generates fix via IR, not full source (~150 tokens)
-        |
-7. User: "login.ts:23 has a hardcoded secret.
-          You added it for debugging. Move to .env?"
-        |
-8. User approves → patch applied
-Total cost: ~250 tokens. Traditional tools: ~3000+ tokens.
-```
----
-## Tech Stack
-- **Language:** TypeScript
-- **Runtime:** Node.js
-- **Testing:** Vitest (70 tests)
-- **Build:** tsup
-- **Zero native dependencies** — no tree-sitter, no language-specific parsers
----
-## Roadmap
-### v0.5 — Usable Alpha
-- Watcher Interpreter (batch Haiku calls for contextual explanations)
-- Reviewer Agent (Sonnet, code review with challenge mode)
-- Project Memory (decisions/ with YAML frontmatter)
-- Python + Go language support
-### v1.0 — Public Release
-- Framework-specific fingerprint patterns (React, Express, etc.)
-- VS Code / Cursor / Claude Code deep integrations
-- Benchmark results: Health-Aware IR vs raw source
-### v2.0 — Platform
-- Security / Architect agents
-- Custom Agent API
-- Team sync features
----
 ## Contributing
 ```bash
 git clone https://github.com/mertcanaltin/composto
 cd composto
 pnpm install
-pnpm test        # 70 tests
+pnpm test        # 145 tests
 pnpm build       # builds to dist/
-pnpm dev scan .  # run locally
+npx composto benchmark .  # see compression stats
 ```
 ---
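
Note on the new "Context Budget" section: the README states the behavior of `composto context` (hotspot files get L1, everything else gets L0, and the budget is never exceeded) but not the algorithm behind it. One plausible way to get that behavior is a two-pass greedy packer, sketched below. Everything here is an assumption for illustration: `FileCost`, `packContext`, and the per-file token counts are hypothetical and do not reflect the package's actual implementation.

```typescript
// Hypothetical sketch of budget packing; not the composto-ai implementation.
type Layer = "L0" | "L1";

interface FileCost {
  path: string;
  hotspot: boolean; // assumed to come from git-history analysis
  l0Tokens: number; // token cost of the structure-only view
  l1Tokens: number; // token cost of the detailed IR view
}

interface Plan {
  layers: Map<string, Layer>;
  used: number;
}

function packContext(files: FileCost[], budget: number): Plan {
  const layers = new Map<string, Layer>();
  let used = 0;

  // Pass 1: give each file its cheap L0 outline while it still fits.
  for (const f of files) {
    if (used + f.l0Tokens <= budget) {
      layers.set(f.path, "L0");
      used += f.l0Tokens;
    }
  }

  // Pass 2: upgrade hotspot files to L1 when the extra cost still fits.
  for (const f of files) {
    if (!f.hotspot || layers.get(f.path) !== "L0") continue;
    const extra = f.l1Tokens - f.l0Tokens;
    if (used + extra <= budget) {
      layers.set(f.path, "L1");
      used += extra;
    }
  }

  return { layers, used };
}
```

Because every addition is checked against the remaining budget before it is committed, `used` cannot exceed `budget`, which is the guarantee the README describes. A real implementation may order or score files differently.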