@sliday/tamp 0.1.0 → 0.1.1

package/README.md ADDED

# Tamp

**Token compression proxy for Claude Code.** 33.9% fewer input tokens, zero code changes. Sits between your client and the Anthropic API.

```
npx @sliday/tamp
```

Or run the one-line installer:

```bash
curl -fsSL https://tamp.dev/setup.sh | bash
```

## How It Works

Tamp intercepts `POST /v1/messages` requests and compresses `tool_result` blocks before forwarding them upstream. Source code, error results, and non-JSON content pass through untouched.

```
Claude Code ──► Tamp (localhost:7778) ──► Anthropic API

                  ├─ JSON → minify whitespace
                  ├─ Arrays → TOON columnar encoding
                  ├─ Line-numbered → strip prefixes + minify
                  ├─ Source code → passthrough
                  └─ Errors → skip
```

### Compression Stages

| Stage | What it does | When it applies |
|-------|--------------|-----------------|
| `minify` | Strips JSON whitespace | Pretty-printed JSON objects/arrays |
| `toon` | Columnar [TOON encoding](https://github.com/nicholasgasior/toon-format) | Homogeneous arrays (file listings, routes, deps) |
| `llmlingua` | Neural text compression via [LLMLingua](https://github.com/microsoft/LLMLingua) sidecar | Natural-language text (requires sidecar) |

Only `minify` is enabled by default. Enable more with `TOONA_STAGES=minify,toon`.
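
The default `minify` stage can be sketched in a few lines: parse, then re-serialize without whitespace. This is a hypothetical reimplementation for illustration, not Tamp's actual code:

```javascript
// Sketch of a minify stage: re-serialize pretty-printed JSON without
// whitespace. Returns null when the input isn't valid JSON, so the
// caller can fall through to passthrough.
function minifyStage(text) {
  try {
    return JSON.stringify(JSON.parse(text))
  } catch {
    return null // not JSON: pass through untouched
  }
}

const pretty = JSON.stringify({ name: 'tamp', version: '0.1.1' }, null, 2)
console.log(minifyStage(pretty)) // {"name":"tamp","version":"0.1.1"}
console.log(minifyStage('const x = 1')) // null
```

Returning `null` for non-JSON (rather than throwing) keeps the pipeline simple: a stage either produces a smaller candidate or bows out.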

## Quick Start

### 1. Start the proxy

```bash
npx @sliday/tamp
```

```
┌─ Tamp ─────────────────────────────────────────────┐
│  Proxy:  http://localhost:7778                     │
│  Status: ● Ready                                   │
│                                                    │
│  In another terminal:                              │
│    export ANTHROPIC_BASE_URL=http://localhost:7778 │
│    claude                                          │
└────────────────────────────────────────────────────┘
```

### 2. Point Claude Code at the proxy

```bash
export ANTHROPIC_BASE_URL=http://localhost:7778
claude
```

That's it. Use Claude Code as normal — Tamp compresses silently in the background.

## Configuration

All configuration is via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `TOONA_PORT` | `7778` | Proxy listen port |
| `TOONA_UPSTREAM` | `https://api.anthropic.com` | Upstream API URL |
| `TOONA_STAGES` | `minify` | Comma-separated compression stages |
| `TOONA_MIN_SIZE` | `200` | Minimum content size (chars) to attempt compression |
| `TOONA_LOG` | `true` | Enable request logging to stderr |
| `TOONA_LOG_FILE` | _(none)_ | Write logs to a file |
| `TOONA_MAX_BODY` | `10485760` | Max request body size (bytes) before passthrough |
| `TOONA_LLMLINGUA_URL` | _(none)_ | LLMLingua sidecar URL for text compression |
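
Configuration of this shape is plain `process.env` lookups with fallbacks. The sketch below mirrors the table (variable names and defaults taken from it); the real `config.js` may structure things differently:

```javascript
// Sketch: load Tamp-style configuration from environment variables,
// falling back to the documented defaults. Field names here are
// illustrative; see config.js for the actual shape.
function loadConfig(env = process.env) {
  return {
    port: Number(env.TOONA_PORT ?? 7778),
    upstream: env.TOONA_UPSTREAM ?? 'https://api.anthropic.com',
    stages: (env.TOONA_STAGES ?? 'minify').split(',').map(s => s.trim()),
    minSize: Number(env.TOONA_MIN_SIZE ?? 200),
    log: (env.TOONA_LOG ?? 'true') !== 'false',
    maxBody: Number(env.TOONA_MAX_BODY ?? 10485760),
    llmlinguaUrl: env.TOONA_LLMLINGUA_URL ?? null,
  }
}

const cfg = loadConfig({ TOONA_STAGES: 'minify,toon' })
console.log(cfg.stages) // [ 'minify', 'toon' ]
console.log(cfg.port)   // 7778
```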

### Recommended setup

```bash
# Maximum compression
TOONA_STAGES=minify,toon npx @sliday/tamp
```

## Installation Methods

### npx (no install)

```bash
npx @sliday/tamp
```

### npm global

```bash
npm install -g @sliday/tamp
tamp
```

### Git clone

```bash
git clone https://github.com/sliday/tamp.git
cd tamp && npm install
node bin/tamp.js
```

### One-line installer

```bash
curl -fsSL https://tamp.dev/setup.sh | bash
```

The installer clones to `~/.tamp`, adds `ANTHROPIC_BASE_URL` to your shell profile, and creates a `tamp` alias.

## What Gets Compressed

Tamp only compresses the **last user message** in each request (the most recent `tool_result` blocks). Historical messages are left untouched to avoid redundant recompression.

| Content Type | Action | Example |
|--------------|--------|---------|
| Pretty-printed JSON | Minify whitespace | `package.json`, config files |
| JSON with line numbers | Strip prefixes + minify | Read tool output (` 1→{...}`) |
| Homogeneous JSON arrays | TOON encode | File listings, route tables, dependencies |
| Already-minified JSON | Skip | Single-line JSON |
| Source code (text) | Passthrough | `.ts`, `.py`, `.rs` files |
| `is_error: true` results | Skip entirely | Error tool results |
| TOON-encoded content | Skip | Already compressed |
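
The line-numbered case (Read tool output like ` 1→{...}`) works by stripping the `N→` prefix from each line so the remainder can be minified as plain JSON. A hypothetical version of what `detect.js`'s `stripLineNumbers` does:

```javascript
// Sketch: strip " N→" prefixes that Read-style tool output adds to file
// content, so the remaining text can be parsed (and minified) as JSON.
// Hypothetical reimplementation; see detect.js for the real logic.
function stripLineNumbers(text) {
  const lines = text.split('\n')
  // Only treat as line-numbered if every non-empty line carries a prefix.
  if (!lines.every(l => l === '' || /^\s*\d+→/.test(l))) return null
  return lines.map(l => l.replace(/^\s*\d+→/, '')).join('\n')
}

const numbered = '     1→{\n     2→  "a": 1\n     3→}'
console.log(stripLineNumbers(numbered))          // restores the plain JSON text
console.log(JSON.stringify(JSON.parse(stripLineNumbers(numbered)))) // {"a":1}
console.log(stripLineNumbers('plain text'))      // null: not line-numbered
```

The "every non-empty line" guard matters: source code that merely contains digits and arrows should not be mangled, so anything ambiguous falls through to passthrough.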

## Architecture

```
bin/tamp.js     CLI entry point
index.js        HTTP proxy server
compress.js     Compression pipeline (compressMessages, compressText)
detect.js       Content classification (classifyContent, tryParseJSON, stripLineNumbers)
config.js       Environment-based configuration
stats.js        Session statistics and request logging
setup.sh        One-line installer script
```

### How the proxy works

1. Non-`/v1/messages` requests are piped through unmodified
2. `POST /v1/messages` bodies are buffered and parsed as JSON
3. The last user message's `tool_result` blocks are classified and compressed
4. The modified body is forwarded upstream with an updated `Content-Length`
5. The upstream response is streamed back to the client unmodified

Bodies exceeding `TOONA_MAX_BODY` are piped through without buffering.
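
Step 4 is the subtle one: once the body is rewritten, `Content-Length` must be recomputed in bytes, not characters. A sketch (function name is illustrative, not from `index.js`):

```javascript
// Sketch of step 4: after compressing tool_result blocks, re-serialize
// the body and recompute Content-Length in bytes. Buffer.byteLength is
// needed because multibyte characters make string length !== byte length.
function prepareForwardedBody(body) {
  const json = JSON.stringify(body)
  return {
    json,
    headers: { 'content-length': String(Buffer.byteLength(json, 'utf8')) },
  }
}

const { json, headers } = prepareForwardedBody({ model: 'claude-sonnet-4', messages: [] })
console.log(headers['content-length'] === String(Buffer.byteLength(json))) // true
```

Forwarding the original `Content-Length` after shrinking the body would truncate or corrupt the upstream request, which is why the header is rewritten rather than passed through.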

## Benchmarking

The `bench/` directory contains a reproducible A/B benchmark that measures actual token savings via OpenRouter:

```bash
OPENROUTER_API_KEY=... node bench/runner.js   # 70 API calls, ~2 min
node bench/analyze.js                         # Statistical analysis
node bench/render.js                          # White paper (HTML + PDF)
```

Seven scenarios cover the full range: small/large JSON, tabular data, source code, multi-turn conversations, line-numbered output, and error results. Each runs 5 times for statistical confidence (95% CI via Student's t-distribution).
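
With 5 runs, the 95% CI uses Student's t with 4 degrees of freedom (critical value ≈ 2.776). A sketch of the computation, under the assumption that `bench/analyze.js` does something equivalent (the sample numbers below are invented):

```javascript
// Sketch: 95% confidence interval for the mean of n samples using
// Student's t. The hardcoded t-value is t(0.975, df=4), valid for n=5.
function ci95(samples) {
  const n = samples.length
  const mean = samples.reduce((a, x) => a + x, 0) / n
  const variance = samples.reduce((a, x) => a + (x - mean) ** 2, 0) / (n - 1)
  const se = Math.sqrt(variance / n) // standard error of the mean
  const t = 2.776 // t(0.975, df=4); only correct for n=5
  return { mean, low: mean - t * se, high: mean + t * se }
}

const savings = [33.1, 34.5, 33.9, 34.0, 33.8] // example: % token savings per run
const { mean, low, high } = ci95(savings)
console.log(mean.toFixed(2)) // 33.86
```

For other run counts the t-value would need a lookup table; 2.776 is baked in only because the benchmark fixes n=5.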

Results are written to `bench/results/` (gitignored).

## Development

```bash
# Run tests
npm test

# Smoke test (spins up proxy + echo server, validates compression)
node smoke.js

# Run a specific test file
node --test test/compress.test.js
```

### Test files

```
test/compress.test.js   Compression pipeline tests
test/detect.test.js     Content classification tests
test/config.test.js     Configuration loading tests
test/proxy.test.js      HTTP proxy integration tests
test/stats.test.js      Statistics and logging tests
test/fixtures/          Sample API payloads
```

## How Token Savings Work

Claude Code sends the full conversation history on every API call. As a session progresses, tool results accumulate — file contents, directory listings, command outputs — all re-sent as input tokens on each request.

At $3/million input tokens (Sonnet 4), a 200-request session consuming 3M input tokens costs $9. If 60% of tool results are compressible JSON, and compression removes 30-50% of those tokens, that's $1.62-2.70 saved per session.

For a team of 5 developers doing 2 sessions/day, that adds up to roughly $500-800/month in savings.
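
The arithmetic above, made explicit. The 60% share and 30-50% reduction are the README's stated assumptions, not measured values, and the monthly figure assumes ~30 days:

```javascript
// Sketch of the savings arithmetic. Inputs are the README's assumptions.
const inputTokens = 3_000_000         // tokens per 200-request session
const pricePerToken = 3 / 1_000_000   // $3 per million input tokens (Sonnet 4)
const compressibleShare = 0.6         // fraction of tool-result tokens that are JSON

const sessionCost = inputTokens * pricePerToken // ≈ $9
const savedPerSession = [0.3, 0.5].map(
  r => inputTokens * compressibleShare * r * pricePerToken
) // ≈ [$1.62, $2.70]

// 5 developers x 2 sessions/day x ~30 days
const monthlySavings = savedPerSession.map(s => s * 5 * 2 * 30) // ≈ [$486, $810]
console.log(sessionCost, savedPerSession, monthlySavings)
```

The exact range is $486-810/month, which the prose rounds to $500-800.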

## License

MIT

## Author

[Stas Kulesh](mailto:stas@sliday.com) — [sliday.com](https://sliday.com)
package/compress.js CHANGED

```diff
@@ -1,4 +1,5 @@
 import { encode } from '@toon-format/toon'
+import { countTokens } from '@anthropic-ai/tokenizer'
 import { tryParseJSON, classifyContent, stripLineNumbers } from './detect.js'
 
 export function compressText(text, config) {
@@ -31,7 +32,7 @@ export function compressText(text, config) {
     } catch { /* fall back to minified */ }
   }
 
-  return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length }
+  return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(best.text) }
 }
 
 async function compressWithLLMLingua(text, config) {
@@ -47,7 +48,7 @@ async function compressWithLLMLingua(text, config) {
     clearTimeout(timeout)
     if (!res.ok) return null
     const data = await res.json()
-    return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length }
+    return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(data.text) }
   } catch {
     return null
   }
```
package/package.json CHANGED

```diff
@@ -8,7 +8,7 @@
     "detect.js",
     "stats.js"
   ],
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Token compression proxy for Claude Code. 50% fewer tokens, zero behavior change.",
   "type": "module",
   "main": "index.js",
@@ -17,9 +17,18 @@
   },
   "scripts": {
     "start": "node bin/tamp.js",
-    "test": "node --test test/*.test.js"
+    "test": "node --test test/*.test.js",
+    "test:sidecar": "node test/capture-golden.js --verify",
+    "bench:semantic": "node bench/semantic-eval.js"
   },
-  "keywords": ["claude", "anthropic", "proxy", "compression", "tokens", "llm"],
+  "keywords": [
+    "claude",
+    "anthropic",
+    "proxy",
+    "compression",
+    "tokens",
+    "llm"
+  ],
   "author": "Stas Kulesh <stas@sliday.com>",
   "license": "MIT",
   "repository": {
@@ -28,6 +37,7 @@
   },
   "homepage": "https://github.com/sliday/tamp",
   "dependencies": {
+    "@anthropic-ai/tokenizer": "^0.0.4",
     "@toon-format/toon": "^2.1.0"
   }
 }
```
package/stats.js CHANGED

```diff
@@ -8,20 +8,24 @@ export function formatRequestLog(stats, session) {
       lines.push(`[toona] block[${s.index}]: skipped (${s.skipped})`)
     } else if (s.method) {
       const pct = (((s.originalLen - s.compressedLen) / s.originalLen) * 100).toFixed(1)
-      lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%) [${s.method}]`)
+      const tokInfo = s.originalTokens ? ` ${s.originalTokens}->${s.compressedTokens} tok` : ''
+      lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%)${tokInfo} [${s.method}]`)
     }
   }
 
   const totalOrig = compressed.reduce((a, s) => a + s.originalLen, 0)
   const totalComp = compressed.reduce((a, s) => a + s.compressedLen, 0)
+  const totalOrigTok = compressed.reduce((a, s) => a + (s.originalTokens || 0), 0)
+  const totalCompTok = compressed.reduce((a, s) => a + (s.compressedTokens || 0), 0)
   if (compressed.length > 0) {
     const pct = (((totalOrig - totalComp) / totalOrig) * 100).toFixed(1)
-    lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%)`)
+    const tokPct = totalOrigTok > 0 ? (((totalOrigTok - totalCompTok) / totalOrigTok) * 100).toFixed(1) : '0.0'
+    lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%), ${totalOrigTok}->${totalCompTok} tokens (-${tokPct}%)`)
   }
 
   if (session) {
     const totals = session.getTotals()
-    lines.push(`[toona] session: ${totals.totalSaved} chars saved across ${totals.compressionCount} compressions`)
+    lines.push(`[toona] session: ${totals.totalSaved} chars, ${totals.totalTokensSaved} tokens saved across ${totals.compressionCount} compressions`)
   }
 
   return lines.join('\n')
@@ -29,6 +33,7 @@ export function formatRequestLog(stats, session) {
 
 export function createSession() {
   let totalSaved = 0
+  let totalTokensSaved = 0
   let compressionCount = 0
 
   return {
@@ -36,12 +41,13 @@ export function createSession() {
       for (const s of stats) {
         if (s.method && s.originalLen && s.compressedLen) {
           totalSaved += s.originalLen - s.compressedLen
+          totalTokensSaved += (s.originalTokens || 0) - (s.compressedTokens || 0)
           compressionCount++
         }
       }
     },
     getTotals() {
-      return { totalSaved, compressionCount }
+      return { totalSaved, totalTokensSaved, compressionCount }
     },
   }
 }
```