@sliday/tamp 0.1.0 → 0.1.1

package/README.md ADDED

# Tamp

**Token compression proxy for Claude Code.** 33.9% fewer input tokens, zero code changes. Sits between your client and the Anthropic API.

```
npx @sliday/tamp
```

Or run the one-line installer:

```bash
curl -fsSL https://tamp.dev/setup.sh | bash
```

## How It Works

Tamp intercepts `POST /v1/messages` requests and compresses `tool_result` blocks before forwarding them upstream. Source code, error results, and non-JSON content pass through untouched.

```
Claude Code ──► Tamp (localhost:7778) ──► Anthropic API

                  ├─ JSON → minify whitespace
                  ├─ Arrays → TOON columnar encoding
                  ├─ Line-numbered → strip prefixes + minify
                  ├─ Source code → passthrough
                  └─ Errors → skip
```

### Compression Stages

| Stage | What it does | When it applies |
|-------|--------------|-----------------|
| `minify` | Strips JSON whitespace | Pretty-printed JSON objects/arrays |
| `toon` | Columnar [TOON encoding](https://github.com/nicholasgasior/toon-format) | Homogeneous arrays (file listings, routes, deps) |
| `llmlingua` | Neural text compression via [LLMLingua](https://github.com/microsoft/LLMLingua) sidecar | Natural-language text (requires sidecar) |

Only `minify` is enabled by default. Enable more with `TOONA_STAGES=minify,toon`.
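
The default `minify` stage can be sketched in a few lines: parse, then re-serialize without whitespace. This is a hypothetical reimplementation for illustration, not Tamp's actual code:

```javascript
// Sketch of a minify stage: re-serialize pretty-printed JSON without
// whitespace. Returns null when the input isn't valid JSON, so the
// caller can fall through to passthrough.
function minifyStage(text) {
  try {
    return JSON.stringify(JSON.parse(text))
  } catch {
    return null // not JSON: pass through untouched
  }
}

const pretty = JSON.stringify({ name: 'tamp', version: '0.1.1' }, null, 2)
console.log(minifyStage(pretty)) // {"name":"tamp","version":"0.1.1"}
console.log(minifyStage('const x = 1')) // null
```

Returning `null` for non-JSON (rather than throwing) keeps the pipeline simple: a stage either produces a smaller candidate or bows out.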

## Quick Start

### 1. Start the proxy

```bash
npx @sliday/tamp
```

```
┌─ Tamp ─────────────────────────────────────────────┐
│  Proxy:  http://localhost:7778                     │
│  Status: ● Ready                                   │
│                                                    │
│  In another terminal:                              │
│    export ANTHROPIC_BASE_URL=http://localhost:7778 │
│    claude                                          │
└────────────────────────────────────────────────────┘
```

### 2. Point Claude Code at the proxy

```bash
export ANTHROPIC_BASE_URL=http://localhost:7778
claude
```

That's it. Use Claude Code as normal — Tamp compresses silently in the background.

## Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `TOONA_PORT` | `7778` | Proxy listen port |
| `TOONA_UPSTREAM` | `https://api.anthropic.com` | Upstream API URL |
| `TOONA_STAGES` | `minify` | Comma-separated compression stages |
| `TOONA_MIN_SIZE` | `200` | Minimum content size (chars) to attempt compression |
| `TOONA_LOG` | `true` | Enable request logging to stderr |
| `TOONA_LOG_FILE` | _(none)_ | Write logs to a file |
| `TOONA_MAX_BODY` | `10485760` | Max request body size (bytes) before passthrough |
| `TOONA_LLMLINGUA_URL` | _(none)_ | LLMLingua sidecar URL for text compression |
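
Configuration of this shape is plain `process.env` lookups with fallbacks. The sketch below mirrors the table (variable names and defaults taken from it); the real `config.js` may structure things differently:

```javascript
// Sketch: load Tamp-style configuration from environment variables,
// falling back to the documented defaults. Field names here are
// illustrative; see config.js for the actual shape.
function loadConfig(env = process.env) {
  return {
    port: Number(env.TOONA_PORT ?? 7778),
    upstream: env.TOONA_UPSTREAM ?? 'https://api.anthropic.com',
    stages: (env.TOONA_STAGES ?? 'minify').split(',').map(s => s.trim()),
    minSize: Number(env.TOONA_MIN_SIZE ?? 200),
    log: (env.TOONA_LOG ?? 'true') !== 'false',
    maxBody: Number(env.TOONA_MAX_BODY ?? 10485760),
    llmlinguaUrl: env.TOONA_LLMLINGUA_URL ?? null,
  }
}

const cfg = loadConfig({ TOONA_STAGES: 'minify,toon' })
console.log(cfg.stages) // [ 'minify', 'toon' ]
console.log(cfg.port)   // 7778
```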

### Recommended setup

```bash
# Maximum compression
TOONA_STAGES=minify,toon npx @sliday/tamp
```

## Installation Methods

### npx (no install)

```bash
npx @sliday/tamp
```

### npm global

```bash
npm install -g @sliday/tamp
tamp
```

### Git clone

```bash
git clone https://github.com/sliday/tamp.git
cd tamp && npm install
node bin/tamp.js
```

### One-line installer

```bash
curl -fsSL https://tamp.dev/setup.sh | bash
```

The installer clones to `~/.tamp`, adds `ANTHROPIC_BASE_URL` to your shell profile, and creates a `tamp` alias.

## What Gets Compressed

Tamp only compresses the **last user message** in each request (the most recent `tool_result` blocks). Historical messages are left untouched to avoid redundant recompression.

| Content Type | Action | Example |
|--------------|--------|---------|
| Pretty-printed JSON | Minify whitespace | `package.json`, config files |
| JSON with line numbers | Strip prefixes + minify | Read tool output (` 1→{...}`) |
| Homogeneous JSON arrays | TOON encode | File listings, route tables, dependencies |
| Already-minified JSON | Skip | Single-line JSON |
| Source code (text) | Passthrough | `.ts`, `.py`, `.rs` files |
| `is_error: true` results | Skip entirely | Error tool results |
| TOON-encoded content | Skip | Already compressed |
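
The line-numbered case (Read tool output like ` 1→{...}`) works by stripping the `N→` prefix from each line so the remainder can be minified as plain JSON. A hypothetical version of what `detect.js`'s `stripLineNumbers` does:

```javascript
// Sketch: strip " N→" prefixes that Read-style tool output adds to file
// content, so the remaining text can be parsed (and minified) as JSON.
// Hypothetical reimplementation; see detect.js for the real logic.
function stripLineNumbers(text) {
  const lines = text.split('\n')
  // Only treat as line-numbered if every non-empty line carries a prefix.
  if (!lines.every(l => l === '' || /^\s*\d+→/.test(l))) return null
  return lines.map(l => l.replace(/^\s*\d+→/, '')).join('\n')
}

const numbered = '     1→{\n     2→  "a": 1\n     3→}'
console.log(stripLineNumbers(numbered))          // restores the plain JSON text
console.log(JSON.stringify(JSON.parse(stripLineNumbers(numbered)))) // {"a":1}
console.log(stripLineNumbers('plain text'))      // null: not line-numbered
```

The "every non-empty line" guard matters: source code that merely contains digits and arrows should not be mangled, so anything ambiguous falls through to passthrough.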

## Architecture

```
bin/tamp.js     CLI entry point
index.js        HTTP proxy server
compress.js     Compression pipeline (compressMessages, compressText)
detect.js       Content classification (classifyContent, tryParseJSON, stripLineNumbers)
config.js       Environment-based configuration
stats.js        Session statistics and request logging
setup.sh        One-line installer script
```

### How the proxy works

1. Non-`/v1/messages` requests are piped through unmodified
2. `POST /v1/messages` bodies are buffered and parsed as JSON
3. The last user message's `tool_result` blocks are classified and compressed
4. The modified body is forwarded upstream with an updated `Content-Length`
5. The upstream response is streamed back to the client unmodified

Bodies exceeding `TOONA_MAX_BODY` are piped through without buffering.
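
Step 4 is the subtle one: once the body is rewritten, `Content-Length` must be recomputed in bytes, not characters. A sketch (function name is illustrative, not from `index.js`):

```javascript
// Sketch of step 4: after compressing tool_result blocks, re-serialize
// the body and recompute Content-Length in bytes. Buffer.byteLength is
// needed because multibyte characters make string length !== byte length.
function prepareForwardedBody(body) {
  const json = JSON.stringify(body)
  return {
    json,
    headers: { 'content-length': String(Buffer.byteLength(json, 'utf8')) },
  }
}

const { json, headers } = prepareForwardedBody({ model: 'claude-sonnet-4', messages: [] })
console.log(headers['content-length'] === String(Buffer.byteLength(json))) // true
```

Forwarding the original `Content-Length` after shrinking the body would truncate or corrupt the upstream request, which is why the header is rewritten rather than passed through.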

## Benchmarking

The `bench/` directory contains a reproducible A/B benchmark that measures actual token savings via OpenRouter:

```bash
OPENROUTER_API_KEY=... node bench/runner.js   # 70 API calls, ~2 min
node bench/analyze.js                         # Statistical analysis
node bench/render.js                          # White paper (HTML + PDF)
```

Seven scenarios cover the full range: small/large JSON, tabular data, source code, multi-turn conversations, line-numbered output, and error results. Each runs 5 times for statistical confidence (95% CI via Student's t-distribution).
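
With 5 runs, the 95% CI uses Student's t with 4 degrees of freedom (critical value ≈ 2.776). A sketch of the computation, under the assumption that `bench/analyze.js` does something equivalent (the sample numbers below are invented):

```javascript
// Sketch: 95% confidence interval for the mean of n samples using
// Student's t. The hardcoded t-value is t(0.975, df=4), valid for n=5.
function ci95(samples) {
  const n = samples.length
  const mean = samples.reduce((a, x) => a + x, 0) / n
  const variance = samples.reduce((a, x) => a + (x - mean) ** 2, 0) / (n - 1)
  const se = Math.sqrt(variance / n) // standard error of the mean
  const t = 2.776 // t(0.975, df=4); only correct for n=5
  return { mean, low: mean - t * se, high: mean + t * se }
}

const savings = [33.1, 34.5, 33.9, 34.0, 33.8] // example: % token savings per run
const { mean, low, high } = ci95(savings)
console.log(mean.toFixed(2)) // 33.86
```

For other run counts the t-value would need a lookup table; 2.776 is baked in only because the benchmark fixes n=5.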

Results are written to `bench/results/` (gitignored).

## Development

```bash
# Run tests
npm test

# Smoke test (spins up proxy + echo server, validates compression)
node smoke.js

# Run a specific test file
node --test test/compress.test.js
```

### Test files

```
test/compress.test.js   Compression pipeline tests
test/detect.test.js     Content classification tests
test/config.test.js     Configuration loading tests
test/proxy.test.js      HTTP proxy integration tests
test/stats.test.js      Statistics and logging tests
test/fixtures/          Sample API payloads
```

## How Token Savings Work

Claude Code sends the full conversation history on every API call. As a session progresses, tool results accumulate — file contents, directory listings, command outputs — all re-sent as input tokens on each request.

At $3/million input tokens (Sonnet 4), a 200-request session consuming 3M input tokens costs $9. If 60% of tool results are compressible JSON, and compression removes 30-50% of those tokens, that's $1.62-2.70 saved per session.

For a team of 5 developers doing 2 sessions/day, that adds up to roughly $500-800/month in savings.
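
The arithmetic above, made explicit. The 60% share and 30-50% reduction are the README's stated assumptions, not measured values, and the monthly figure assumes ~30 days:

```javascript
// Sketch of the savings arithmetic. Inputs are the README's assumptions.
const inputTokens = 3_000_000         // tokens per 200-request session
const pricePerToken = 3 / 1_000_000   // $3 per million input tokens (Sonnet 4)
const compressibleShare = 0.6         // fraction of tool-result tokens that are JSON

const sessionCost = inputTokens * pricePerToken // ≈ $9
const savedPerSession = [0.3, 0.5].map(
  r => inputTokens * compressibleShare * r * pricePerToken
) // ≈ [$1.62, $2.70]

// 5 developers x 2 sessions/day x ~30 days
const monthlySavings = savedPerSession.map(s => s * 5 * 2 * 30) // ≈ [$486, $810]
console.log(sessionCost, savedPerSession, monthlySavings)
```

The exact range is $486-810/month, which the prose rounds to $500-800.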

## License

MIT

## Author

[Stas Kulesh](mailto:stas@sliday.com) — [sliday.com](https://sliday.com)
package/compress.js CHANGED

```diff
@@ -1,4 +1,5 @@
 import { encode } from '@toon-format/toon'
+import { countTokens } from '@anthropic-ai/tokenizer'
 import { tryParseJSON, classifyContent, stripLineNumbers } from './detect.js'
 
 export function compressText(text, config) {
@@ -31,7 +32,7 @@ export function compressText(text, config) {
     } catch { /* fall back to minified */ }
   }
 
-  return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length }
+  return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(best.text) }
 }
 
 async function compressWithLLMLingua(text, config) {
@@ -47,7 +48,7 @@ async function compressWithLLMLingua(text, config) {
     clearTimeout(timeout)
     if (!res.ok) return null
     const data = await res.json()
-    return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length }
+    return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(data.text) }
   } catch {
     return null
   }
```
package/package.json CHANGED

```diff
@@ -8,7 +8,7 @@
     "detect.js",
     "stats.js"
   ],
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Token compression proxy for Claude Code. 50% fewer tokens, zero behavior change.",
   "type": "module",
   "main": "index.js",
@@ -17,9 +17,18 @@
   },
   "scripts": {
     "start": "node bin/tamp.js",
-    "test": "node --test test/*.test.js"
+    "test": "node --test test/*.test.js",
+    "test:sidecar": "node test/capture-golden.js --verify",
+    "bench:semantic": "node bench/semantic-eval.js"
   },
-  "keywords": ["claude", "anthropic", "proxy", "compression", "tokens", "llm"],
+  "keywords": [
+    "claude",
+    "anthropic",
+    "proxy",
+    "compression",
+    "tokens",
+    "llm"
+  ],
   "author": "Stas Kulesh <stas@sliday.com>",
   "license": "MIT",
   "repository": {
@@ -28,6 +37,7 @@
   },
   "homepage": "https://github.com/sliday/tamp",
   "dependencies": {
+    "@anthropic-ai/tokenizer": "^0.0.4",
     "@toon-format/toon": "^2.1.0"
   }
 }
```
package/stats.js CHANGED

```diff
@@ -8,20 +8,24 @@ export function formatRequestLog(stats, session) {
       lines.push(`[toona] block[${s.index}]: skipped (${s.skipped})`)
     } else if (s.method) {
       const pct = (((s.originalLen - s.compressedLen) / s.originalLen) * 100).toFixed(1)
-      lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%) [${s.method}]`)
+      const tokInfo = s.originalTokens ? ` ${s.originalTokens}->${s.compressedTokens} tok` : ''
+      lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%)${tokInfo} [${s.method}]`)
     }
   }
 
   const totalOrig = compressed.reduce((a, s) => a + s.originalLen, 0)
   const totalComp = compressed.reduce((a, s) => a + s.compressedLen, 0)
+  const totalOrigTok = compressed.reduce((a, s) => a + (s.originalTokens || 0), 0)
+  const totalCompTok = compressed.reduce((a, s) => a + (s.compressedTokens || 0), 0)
   if (compressed.length > 0) {
     const pct = (((totalOrig - totalComp) / totalOrig) * 100).toFixed(1)
-    lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%)`)
+    const tokPct = totalOrigTok > 0 ? (((totalOrigTok - totalCompTok) / totalOrigTok) * 100).toFixed(1) : '0.0'
+    lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%), ${totalOrigTok}->${totalCompTok} tokens (-${tokPct}%)`)
   }
 
   if (session) {
     const totals = session.getTotals()
-    lines.push(`[toona] session: ${totals.totalSaved} chars saved across ${totals.compressionCount} compressions`)
+    lines.push(`[toona] session: ${totals.totalSaved} chars, ${totals.totalTokensSaved} tokens saved across ${totals.compressionCount} compressions`)
   }
 
   return lines.join('\n')
@@ -29,6 +33,7 @@ export function formatRequestLog(stats, session) {
 
 export function createSession() {
   let totalSaved = 0
+  let totalTokensSaved = 0
   let compressionCount = 0
 
   return {
@@ -36,12 +41,13 @@ export function createSession() {
       for (const s of stats) {
         if (s.method && s.originalLen && s.compressedLen) {
           totalSaved += s.originalLen - s.compressedLen
+          totalTokensSaved += (s.originalTokens || 0) - (s.compressedTokens || 0)
           compressionCount++
         }
       }
     },
     getTotals() {
-      return { totalSaved, compressionCount }
+      return { totalSaved, totalTokensSaved, compressionCount }
     },
   }
 }
```