@sliday/tamp 0.1.0 → 0.2.0

package/README.md ADDED
@@ -0,0 +1,235 @@
1
+ # Tamp
2
+
3
+ **Token compression proxy for coding agents.** 33.9% fewer input tokens, zero code changes. Works with Claude Code, Aider, Cursor, Cline, Windsurf, and any OpenAI-compatible agent.
4
+
5
+ ```
6
+ npx @sliday/tamp
7
+ ```
8
+
9
+ Or install globally:
10
+
11
+ ```bash
12
+ curl -fsSL https://tamp.dev/setup.sh | bash
13
+ ```
14
+
15
+ ## How It Works
16
+
17
+ Tamp auto-detects your agent's API format and compresses tool result blocks before forwarding upstream. Source code, error results, and non-JSON content pass through untouched.
18
+
19
+ ```
20
+ Claude Code ──► Tamp (localhost:7778) ──► Anthropic API
21
+ Aider/Cursor ──► │ ──► OpenAI API
22
+ Gemini CLI ────► │ ──► Google AI API
23
+
24
+ ├─ JSON → minify whitespace
25
+ ├─ Arrays → TOON columnar encoding
26
+ ├─ Line-numbered → strip prefixes + minify
27
+ ├─ Source code → passthrough
28
+ └─ Errors → skip
29
+ ```
30
+
31
+ ### Supported API Formats
32
+
33
+ | Format | Endpoint | Agents |
34
+ |--------|----------|--------|
35
+ | Anthropic Messages | `POST /v1/messages` | Claude Code |
36
+ | OpenAI Chat Completions | `POST /v1/chat/completions` | Aider, Cursor, Cline, Windsurf, OpenCode |
37
+ | Google Gemini | `POST .../generateContent` | Gemini CLI |
38
+
39
+ ### Compression Stages
40
+
41
+ | Stage | What it does | When it applies |
42
+ |-------|-------------|-----------------|
43
+ | `minify` | Strips JSON whitespace | Pretty-printed JSON objects/arrays |
44
+ | `toon` | Columnar [TOON encoding](https://github.com/nicholasgasior/toon-format) | Homogeneous arrays (file listings, routes, deps) |
45
+ | `llmlingua` | Neural text compression via [LLMLingua](https://github.com/microsoft/LLMLingua) sidecar | Natural language text (requires sidecar) |
46
+
47
+ Only `minify` is enabled by default. Enable more with `TOONA_STAGES=minify,toon`.
48
+
49
+ ## Quick Start
50
+
51
+ ### 1. Start the proxy
52
+
53
+ ```bash
54
+ npx @sliday/tamp
55
+ ```
56
+
57
+ ```
58
+ ┌─ Tamp ─────────────────────────────────┐
59
+ │ Proxy: http://localhost:7778 │
60
+ │ Status: ● Ready │
61
+ │ │
62
+ │ Claude Code: │
63
+ │ ANTHROPIC_BASE_URL=http://localhost:7778
64
+ │ │
65
+ │ Aider / Cursor / Cline: │
66
+ │ OPENAI_BASE_URL=http://localhost:7778
67
+ └────────────────────────────────────────┘
68
+ ```
69
+
70
+ ### 2. Point your agent at the proxy
71
+
72
+ **Claude Code:**
73
+ ```bash
74
+ export ANTHROPIC_BASE_URL=http://localhost:7778
75
+ claude
76
+ ```
77
+
78
+ **Aider:**
79
+ ```bash
80
+ export OPENAI_API_BASE=http://localhost:7778
81
+ aider
82
+ ```
83
+
84
+ **Cursor / Cline / Windsurf:**
85
+ Set the API base URL to `http://localhost:7778` in your editor's settings.
86
+
87
+ That's it. Use your agent as normal — Tamp compresses silently in the background.
88
+
89
+ ## Configuration
90
+
91
+ All configuration is done via environment variables:
92
+
93
+ | Variable | Default | Description |
94
+ |----------|---------|-------------|
95
+ | `TOONA_PORT` | `7778` | Proxy listen port |
96
+ | `TOONA_UPSTREAM` | `https://api.anthropic.com` | Default upstream API URL |
97
+ | `TOONA_UPSTREAM_OPENAI` | `https://api.openai.com` | Upstream for OpenAI-format requests |
98
+ | `TOONA_UPSTREAM_GEMINI` | `https://generativelanguage.googleapis.com` | Upstream for Gemini-format requests |
99
+ | `TOONA_STAGES` | `minify` | Comma-separated compression stages |
100
+ | `TOONA_MIN_SIZE` | `200` | Minimum content size (chars) to attempt compression |
101
+ | `TOONA_LOG` | `true` | Enable request logging to stderr |
102
+ | `TOONA_LOG_FILE` | _(none)_ | Write logs to file |
103
+ | `TOONA_MAX_BODY` | `10485760` | Max request body size (bytes) before passthrough |
104
+ | `TOONA_LLMLINGUA_URL` | _(none)_ | LLMLingua sidecar URL for text compression |
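A more fully specified launch might look like this (the upstream URLs shown are just the defaults; the log path is illustrative):

```bash
# Enable TOON on top of minify, raise the size floor, and write logs to a file.
TOONA_STAGES=minify,toon \
TOONA_MIN_SIZE=500 \
TOONA_LOG_FILE=$HOME/.tamp/requests.log \
npx @sliday/tamp
```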
105
+
106
+ ### Recommended setup
107
+
108
+ ```bash
109
+ # Maximum compression
110
+ TOONA_STAGES=minify,toon npx @sliday/tamp
111
+ ```
112
+
113
+ ## Installation Methods
114
+
115
+ ### npx (no install)
116
+
117
+ ```bash
118
+ npx @sliday/tamp
119
+ ```
120
+
121
+ ### npm global
122
+
123
+ ```bash
124
+ npm install -g @sliday/tamp
125
+ tamp
126
+ ```
127
+
128
+ ### Git clone
129
+
130
+ ```bash
131
+ git clone https://github.com/sliday/tamp.git
132
+ cd tamp && npm install
133
+ node bin/tamp.js
134
+ ```
135
+
136
+ ### One-line installer
137
+
138
+ ```bash
139
+ curl -fsSL https://tamp.dev/setup.sh | bash
140
+ ```
141
+
142
+ The installer clones to `~/.tamp`, adds `ANTHROPIC_BASE_URL` to your shell profile, and creates a `tamp` alias.
143
+
144
+ ## What Gets Compressed
145
+
146
+ Tamp only compresses the **last user message** in each request (the most recent `tool_result` blocks). Historical messages are left untouched to avoid redundant recompression.
147
+
148
+ | Content Type | Action | Example |
149
+ |-------------|--------|---------|
150
+ | Pretty-printed JSON | Minify whitespace | `package.json`, config files |
151
+ | JSON with line numbers | Strip prefixes + minify | Read tool output (` 1→{...}`) |
152
+ | Homogeneous JSON arrays | TOON encode | File listings, route tables, dependencies |
153
+ | Already-minified JSON | Skip | Single-line JSON |
154
+ | Source code (text) | Passthrough | `.ts`, `.py`, `.rs` files |
155
+ | `is_error: true` results | Skip entirely | Error tool results |
156
+ | TOON-encoded content | Skip | Already compressed |
157
+
158
+ ## Architecture
159
+
160
+ ```
161
+ bin/tamp.js CLI entry point
162
+ index.js HTTP proxy server
163
+ providers.js API format adapters (Anthropic, OpenAI, Gemini) + auto-detection
164
+ compress.js Compression pipeline (compressRequest, compressText)
165
+ detect.js Content classification (classifyContent, tryParseJSON, stripLineNumbers)
166
+ config.js Environment-based configuration
167
+ stats.js Session statistics and request logging
168
+ setup.sh One-line installer script
169
+ ```
170
+
171
+ ### How the proxy works
172
+
173
+ 1. `detectProvider()` auto-detects the API format from the request path
174
+ 2. Unrecognized requests are piped through unmodified
175
+ 3. Matched requests are buffered, parsed, and tool results are extracted via the provider adapter
176
+ 4. Extracted blocks are classified and compressed
177
+ 5. The modified body is forwarded to the correct upstream with updated `Content-Length`
178
+ 6. The upstream response is streamed back to the client unmodified
179
+
180
+ Bodies exceeding `TOONA_MAX_BODY` are piped through without buffering.
181
+
182
+ ## Benchmarking
183
+
184
+ The `bench/` directory contains a reproducible A/B benchmark that measures actual token savings via OpenRouter:
185
+
186
+ ```bash
187
+ OPENROUTER_API_KEY=... node bench/runner.js # 70 API calls, ~2 min
188
+ node bench/analyze.js # Statistical analysis
189
+ node bench/render.js # White paper (HTML + PDF)
190
+ ```
191
+
192
+ Seven scenarios cover the full range: small/large JSON, tabular data, source code, multi-turn conversations, line-numbered output, and error results. Each runs 5 times for statistical confidence (95% CI via Student's t-distribution).
193
+
194
+ Results are written to `bench/results/` (gitignored).
195
+
196
+ ## Development
197
+
198
+ ```bash
199
+ # Run tests
200
+ npm test
201
+
202
+ # Smoke test (spins up proxy + echo server, validates compression)
203
+ node smoke.js
204
+
205
+ # Run specific test file
206
+ node --test test/compress.test.js
207
+ ```
208
+
209
+ ### Test files
210
+
211
+ ```
212
+ test/compress.test.js Compression pipeline tests (Anthropic + OpenAI formats)
213
+ test/providers.test.js Provider adapter + auto-detection tests
214
+ test/detect.test.js Content classification tests
215
+ test/config.test.js Configuration loading tests
216
+ test/proxy.test.js HTTP proxy integration tests
217
+ test/stats.test.js Statistics and logging tests
218
+ test/fixtures/ Sample API payloads
219
+ ```
220
+
221
+ ## How Token Savings Work
222
+
223
+ Claude Code sends the full conversation history on every API call. As a session progresses, tool results accumulate — file contents, directory listings, command outputs — all re-sent as input tokens on each request.
224
+
225
+ At $3/million input tokens (Sonnet 4), a 200-request session consuming 3M input tokens costs $9. If 60% of tool results are compressible JSON, and compression removes 30-50% of those tokens, that's $1.60-2.70 saved per session.
226
+
227
+ For a team of five developers running two sessions a day, every day of the month, that compounds to roughly $500-800/month in savings.
228
+
229
+ ## License
230
+
231
+ MIT
232
+
233
+ ## Author
234
+
235
+ [Stas Kulesh](mailto:stas@sliday.com) — [sliday.com](https://sliday.com)
package/bin/tamp.js CHANGED
@@ -4,17 +4,23 @@ import { createProxy } from '../index.js'
4
4
  const { config, server } = createProxy()
5
5
 
6
6
  server.listen(config.port, () => {
7
+ const url = `http://localhost:${config.port}`
7
8
  console.error('')
8
- console.error(' ┌─ Tamp ─────────────────────────────┐')
9
- console.error(` │ Proxy: http://localhost:${config.port} │`)
10
- console.error(' │ Status: ● Ready │')
11
- console.error(' │ │')
12
- console.error(' │ In another terminal: │')
13
- console.error(` │ export ANTHROPIC_BASE_URL=http://localhost:${config.port}`)
14
- console.error(' │ claude │')
15
- console.error(' └────────────────────────────────────┘')
9
+ console.error(' ┌─ Tamp ─────────────────────────────────┐')
10
+ console.error(` │ Proxy: ${url} │`)
11
+ console.error(' │ Status: ● Ready │')
12
+ console.error(' │ │')
13
+ console.error(' │ Claude Code: │')
14
+ console.error(` │ ANTHROPIC_BASE_URL=${url} │`)
15
+ console.error(' │ │')
16
+ console.error(' │ Aider / Cursor / Cline: │')
17
+ console.error(` │ OPENAI_BASE_URL=${url} │`)
18
+ console.error(' └────────────────────────────────────────┘')
16
19
  console.error('')
17
- console.error(` Upstream: ${config.upstream}`)
20
+ console.error(` Upstreams:`)
21
+ console.error(` anthropic → ${config.upstreams.anthropic}`)
22
+ console.error(` openai → ${config.upstreams.openai}`)
23
+ console.error(` gemini → ${config.upstreams.gemini}`)
18
24
  console.error(` Stages: ${config.stages.join(', ')}`)
19
25
  console.error('')
20
26
  })
package/compress.js CHANGED
@@ -1,5 +1,7 @@
1
1
  import { encode } from '@toon-format/toon'
2
+ import { countTokens } from '@anthropic-ai/tokenizer'
2
3
  import { tryParseJSON, classifyContent, stripLineNumbers } from './detect.js'
4
+ import { anthropic } from './providers.js'
3
5
 
4
6
  export function compressText(text, config) {
5
7
  if (text.length < config.minSize) return null
@@ -31,7 +33,7 @@ export function compressText(text, config) {
31
33
  } catch { /* fall back to minified */ }
32
34
  }
33
35
 
34
- return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length }
36
+ return { text: best.text, method: best.method, originalLen: text.length, compressedLen: best.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(best.text) }
35
37
  }
36
38
 
37
39
  async function compressWithLLMLingua(text, config) {
@@ -47,7 +49,7 @@ async function compressWithLLMLingua(text, config) {
47
49
  clearTimeout(timeout)
48
50
  if (!res.ok) return null
49
51
  const data = await res.json()
50
- return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length }
52
+ return { text: data.text, method: 'llmlingua', originalLen: text.length, compressedLen: data.text.length, originalTokens: countTokens(text), compressedTokens: countTokens(data.text) }
51
53
  } catch {
52
54
  return null
53
55
  }
@@ -61,54 +63,22 @@ async function compressBlock(text, config) {
61
63
  return sync
62
64
  }
63
65
 
64
- export async function compressMessages(body, config) {
66
+ export async function compressRequest(body, config, provider) {
67
+ const targets = provider.extract(body)
65
68
  const stats = []
66
- if (!body?.messages?.length) return { body, stats }
67
-
68
- let lastUserIdx = -1
69
- for (let i = body.messages.length - 1; i >= 0; i--) {
70
- if (body.messages[i].role === 'user') { lastUserIdx = i; break }
71
- }
72
- if (lastUserIdx === -1) return { body, stats }
73
-
74
- const msg = body.messages[lastUserIdx]
75
- const debug = config.log
76
-
77
- if (typeof msg.content === 'string') {
78
- const result = await compressBlock(msg.content, config)
69
+ for (const target of targets) {
70
+ if (target.skip) { stats.push({ index: target.index, skipped: target.skip }); continue }
71
+ const result = await compressBlock(target.text, config)
79
72
  if (result) {
80
- msg.content = result.text
81
- stats.push({ index: lastUserIdx, ...result })
82
- }
83
- } else if (Array.isArray(msg.content)) {
84
- for (let i = 0; i < msg.content.length; i++) {
85
- const block = msg.content[i]
86
- if (block.type !== 'tool_result') continue
87
- if (block.is_error) { stats.push({ index: i, skipped: 'error' }); continue }
88
-
89
- if (typeof block.content === 'string') {
90
- if (debug) {
91
- const cls = classifyContent(block.content)
92
- const len = block.content.length
93
- console.error(`[toona] debug block[${i}]: type=${cls} len=${len} tool_use_id=${block.tool_use_id || '?'}`)
94
- }
95
- const result = await compressBlock(block.content, config)
96
- if (result) { block.content = result.text; stats.push({ index: i, ...result }) }
97
- } else if (Array.isArray(block.content)) {
98
- for (const sub of block.content) {
99
- if (sub.type === 'text') {
100
- if (debug) {
101
- const cls = classifyContent(sub.text)
102
- const len = sub.text.length
103
- console.error(`[toona] debug sub-block: type=${cls} len=${len}`)
104
- }
105
- const result = await compressBlock(sub.text, config)
106
- if (result) { sub.text = result.text; stats.push({ index: i, ...result }) }
107
- }
108
- }
109
- }
73
+ target.compressed = result.text
74
+ stats.push({ index: target.index, ...result })
110
75
  }
111
76
  }
112
-
77
+ provider.apply(body, targets)
113
78
  return { body, stats }
114
79
  }
80
+
81
+ export async function compressMessages(body, config) {
82
+ if (!body?.messages?.length) return { body, stats: [] }
83
+ return compressRequest(body, config, anthropic)
84
+ }
package/config.js CHANGED
@@ -3,6 +3,11 @@ export function loadConfig(env = process.env) {
3
3
  return Object.freeze({
4
4
  port: parseInt(env.TOONA_PORT, 10) || 7778,
5
5
  upstream: env.TOONA_UPSTREAM || 'https://api.anthropic.com',
6
+ upstreams: Object.freeze({
7
+ anthropic: env.TOONA_UPSTREAM || 'https://api.anthropic.com',
8
+ openai: env.TOONA_UPSTREAM_OPENAI || 'https://api.openai.com',
9
+ gemini: env.TOONA_UPSTREAM_GEMINI || 'https://generativelanguage.googleapis.com',
10
+ }),
6
11
  minSize: parseInt(env.TOONA_MIN_SIZE, 10) || 200,
7
12
  stages,
8
13
  log: env.TOONA_LOG !== 'false',
package/index.js CHANGED
@@ -1,11 +1,16 @@
1
1
  import http from 'node:http'
2
2
  import https from 'node:https'
3
3
  import { loadConfig } from './config.js'
4
- import { compressMessages } from './compress.js'
4
+ import { compressRequest } from './compress.js'
5
+ import { detectProvider } from './providers.js'
5
6
  import { createSession, formatRequestLog } from './stats.js'
6
7
 
7
8
  export function createProxy(overrides = {}) {
8
- const config = { ...loadConfig(), ...overrides }
9
+ const base = loadConfig()
10
+ const config = { ...base, ...overrides }
11
+ if (overrides.upstream && !overrides.upstreams) {
12
+ config.upstreams = { anthropic: overrides.upstream, openai: overrides.upstream, gemini: overrides.upstream }
13
+ }
9
14
  const session = createSession()
10
15
  return { config, session, server: _createServer(config, session) }
11
16
  }
@@ -81,13 +86,16 @@ function pipeRequest(req, res, upstreamUrl, prefixChunks) {
81
86
 
82
87
  return http.createServer(async (req, res) => {
83
88
  if (config.log) console.error(`[tamp] ${req.method} ${req.url}`)
84
- const upstreamUrl = new URL(req.url, config.upstream)
85
- const isMessages = req.method === 'POST' && req.url.startsWith('/v1/messages')
89
+ const provider = detectProvider(req.method, req.url)
86
90
 
87
- if (!isMessages) {
91
+ if (!provider) {
92
+ const upstreamUrl = new URL(req.url, config.upstream)
88
93
  return pipeRequest(req, res, upstreamUrl)
89
94
  }
90
95
 
96
+ const upstream = config.upstreams?.[provider.name] || config.upstream
97
+ const upstreamUrl = new URL(req.url, upstream)
98
+
91
99
  const chunks = []
92
100
  let size = 0
93
101
  let overflow = false
@@ -113,12 +121,12 @@ return http.createServer(async (req, res) => {
113
121
 
114
122
  try {
115
123
  const parsed = JSON.parse(rawBody.toString('utf-8'))
116
- const { body, stats } = await compressMessages(parsed, config)
124
+ const { body, stats } = await compressRequest(parsed, config, provider)
117
125
  finalBody = Buffer.from(JSON.stringify(body), 'utf-8')
118
126
 
119
127
  if (config.log && stats.length) {
120
128
  session.record(stats)
121
- console.error(formatRequestLog(stats, session))
129
+ console.error(formatRequestLog(stats, session, provider.name, req.url))
122
130
  }
123
131
  } catch (err) {
124
132
  if (config.log) console.error(`[tamp] passthrough (parse error): ${err.message}`)
package/package.json CHANGED
@@ -6,10 +6,11 @@
6
6
  "compress.js",
7
7
  "config.js",
8
8
  "detect.js",
9
+ "providers.js",
9
10
  "stats.js"
10
11
  ],
11
- "version": "0.1.0",
12
- "description": "Token compression proxy for Claude Code. 50% fewer tokens, zero behavior change.",
12
+ "version": "0.2.0",
13
+ "description": "Token compression proxy for coding agents. Works with Claude Code, Aider, Cursor, Cline, Windsurf. 33.9% fewer input tokens.",
13
14
  "type": "module",
14
15
  "main": "index.js",
15
16
  "bin": {
@@ -17,9 +18,18 @@
17
18
  },
18
19
  "scripts": {
19
20
  "start": "node bin/tamp.js",
20
- "test": "node --test test/*.test.js"
21
+ "test": "node --test test/*.test.js",
22
+ "test:sidecar": "node test/capture-golden.js --verify",
23
+ "bench:semantic": "node bench/semantic-eval.js"
21
24
  },
22
- "keywords": ["claude", "anthropic", "proxy", "compression", "tokens", "llm"],
25
+ "keywords": [
26
+ "claude",
27
+ "anthropic",
28
+ "proxy",
29
+ "compression",
30
+ "tokens",
31
+ "llm"
32
+ ],
23
33
  "author": "Stas Kulesh <stas@sliday.com>",
24
34
  "license": "MIT",
25
35
  "repository": {
@@ -28,6 +38,7 @@
28
38
  },
29
39
  "homepage": "https://github.com/sliday/tamp",
30
40
  "dependencies": {
41
+ "@anthropic-ai/tokenizer": "^0.0.4",
31
42
  "@toon-format/toon": "^2.1.0"
32
43
  }
33
44
  }
package/providers.js ADDED
@@ -0,0 +1,147 @@
1
+ const anthropic = {
2
+ name: 'anthropic',
3
+ match(method, url) {
4
+ return method === 'POST' && url.startsWith('/v1/messages')
5
+ },
6
+ extract(body) {
7
+ const targets = []
8
+ if (!body?.messages?.length) return targets
9
+
10
+ let lastUserIdx = -1
11
+ for (let i = body.messages.length - 1; i >= 0; i--) {
12
+ if (body.messages[i].role === 'user') { lastUserIdx = i; break }
13
+ }
14
+ if (lastUserIdx === -1) return targets
15
+
16
+ const msg = body.messages[lastUserIdx]
17
+
18
+ if (typeof msg.content === 'string') {
19
+ targets.push({ path: ['messages', lastUserIdx, 'content'], text: msg.content })
20
+ } else if (Array.isArray(msg.content)) {
21
+ for (let i = 0; i < msg.content.length; i++) {
22
+ const block = msg.content[i]
23
+ if (block.type !== 'tool_result') continue
24
+ if (block.is_error) { targets.push({ skip: 'error', index: i }); continue }
25
+
26
+ if (typeof block.content === 'string') {
27
+ targets.push({ path: ['messages', lastUserIdx, 'content', i, 'content'], text: block.content, index: i })
28
+ } else if (Array.isArray(block.content)) {
29
+ for (let j = 0; j < block.content.length; j++) {
30
+ const sub = block.content[j]
31
+ if (sub.type === 'text') {
32
+ targets.push({ path: ['messages', lastUserIdx, 'content', i, 'content', j, 'text'], text: sub.text, index: i })
33
+ }
34
+ }
35
+ }
36
+ }
37
+ }
38
+ return targets
39
+ },
40
+ apply(body, targets) {
41
+ for (const t of targets) {
42
+ if (t.skip || !t.compressed) continue
43
+ let obj = body
44
+ const path = t.path
45
+ for (let i = 0; i < path.length - 1; i++) obj = obj[path[i]]
46
+ obj[path[path.length - 1]] = t.compressed
47
+ }
48
+ },
49
+ }
50
+
51
+ const openai = {
52
+ name: 'openai',
53
+ match(method, url) {
54
+ return method === 'POST' && url.startsWith('/v1/chat/completions')
55
+ },
56
+ extract(body) {
57
+ const targets = []
58
+ if (!body?.messages?.length) return targets
59
+
60
+ // Find last assistant message with tool_calls
61
+ let lastAssistantIdx = -1
62
+ for (let i = body.messages.length - 1; i >= 0; i--) {
63
+ if (body.messages[i].role === 'assistant' && body.messages[i].tool_calls?.length) {
64
+ lastAssistantIdx = i
65
+ break
66
+ }
67
+ }
68
+ if (lastAssistantIdx === -1) return targets
69
+
70
+ // Collect all subsequent role:tool messages
71
+ for (let i = lastAssistantIdx + 1; i < body.messages.length; i++) {
72
+ const msg = body.messages[i]
73
+ if (msg.role !== 'tool') break
74
+ if (typeof msg.content === 'string') {
75
+ targets.push({ path: ['messages', i, 'content'], text: msg.content, index: i })
76
+ }
77
+ }
78
+ return targets
79
+ },
80
+ apply(body, targets) {
81
+ for (const t of targets) {
82
+ if (t.skip || !t.compressed) continue
83
+ let obj = body
84
+ const path = t.path
85
+ for (let i = 0; i < path.length - 1; i++) obj = obj[path[i]]
86
+ obj[path[path.length - 1]] = t.compressed
87
+ }
88
+ },
89
+ }
90
+
91
+ const gemini = {
92
+ name: 'gemini',
93
+ match(method, url) {
94
+ return method === 'POST' && url.includes('generateContent')
95
+ },
96
+ extract(body) {
97
+ const targets = []
98
+ if (!body?.contents?.length) return targets
99
+
100
+ // Find last content with functionResponse parts
101
+ for (let ci = body.contents.length - 1; ci >= 0; ci--) {
102
+ const content = body.contents[ci]
103
+ if (!content.parts?.length) continue
104
+ for (let pi = 0; pi < content.parts.length; pi++) {
105
+ const part = content.parts[pi]
106
+ if (!part.functionResponse?.response) continue
107
+ const resp = part.functionResponse.response
108
+ const text = typeof resp === 'string' ? resp : JSON.stringify(resp, null, 2)
109
+ targets.push({
110
+ path: ['contents', ci, 'parts', pi, 'functionResponse', 'response'],
111
+ text,
112
+ index: pi,
113
+ wasObject: typeof resp !== 'string',
114
+ })
115
+ }
116
+ if (targets.length) break
117
+ }
118
+ return targets
119
+ },
120
+ apply(body, targets) {
121
+ for (const t of targets) {
122
+ if (t.skip || !t.compressed) continue
123
+ let obj = body
124
+ const path = t.path
125
+ for (let i = 0; i < path.length - 1; i++) obj = obj[path[i]]
126
+ // If original was object, try to parse compressed back to object
127
+ if (t.wasObject) {
128
+ try {
129
+ obj[path[path.length - 1]] = JSON.parse(t.compressed)
130
+ continue
131
+ } catch { /* fall through to string */ }
132
+ }
133
+ obj[path[path.length - 1]] = t.compressed
134
+ }
135
+ },
136
+ }
137
+
138
+ const providers = [anthropic, openai, gemini]
139
+
140
+ export function detectProvider(method, url) {
141
+ for (const p of providers) {
142
+ if (p.match(method, url)) return p
143
+ }
144
+ return null
145
+ }
146
+
147
+ export { anthropic, openai, gemini }
package/stats.js CHANGED
@@ -1,27 +1,33 @@
1
- export function formatRequestLog(stats, session) {
1
+ export function formatRequestLog(stats, session, providerName, url) {
2
2
  const compressed = stats.filter(s => s.method)
3
3
  const skipped = stats.filter(s => s.skipped)
4
- const lines = [`[toona] POST /v1/messages — ${stats.length} blocks, ${compressed.length} compressed`]
4
+ const label = providerName || 'anthropic'
5
+ const path = url || '/v1/messages'
6
+ const lines = [`[toona] ${label} ${path} — ${stats.length} blocks, ${compressed.length} compressed`]
5
7
 
6
8
  for (const s of stats) {
7
9
  if (s.skipped) {
8
10
  lines.push(`[toona] block[${s.index}]: skipped (${s.skipped})`)
9
11
  } else if (s.method) {
10
12
  const pct = (((s.originalLen - s.compressedLen) / s.originalLen) * 100).toFixed(1)
11
- lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%) [${s.method}]`)
13
+ const tokInfo = s.originalTokens ? ` ${s.originalTokens}->${s.compressedTokens} tok` : ''
14
+ lines.push(`[toona] block[${s.index}]: ${s.originalLen}->${s.compressedLen} chars (-${pct}%)${tokInfo} [${s.method}]`)
12
15
  }
13
16
  }
14
17
 
15
18
  const totalOrig = compressed.reduce((a, s) => a + s.originalLen, 0)
16
19
  const totalComp = compressed.reduce((a, s) => a + s.compressedLen, 0)
20
+ const totalOrigTok = compressed.reduce((a, s) => a + (s.originalTokens || 0), 0)
21
+ const totalCompTok = compressed.reduce((a, s) => a + (s.compressedTokens || 0), 0)
17
22
  if (compressed.length > 0) {
18
23
  const pct = (((totalOrig - totalComp) / totalOrig) * 100).toFixed(1)
19
- lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%)`)
24
+ const tokPct = totalOrigTok > 0 ? (((totalOrigTok - totalCompTok) / totalOrigTok) * 100).toFixed(1) : '0.0'
25
+ lines.push(`[toona] total: ${totalOrig}->${totalComp} chars (-${pct}%), ${totalOrigTok}->${totalCompTok} tokens (-${tokPct}%)`)
20
26
  }
21
27
 
22
28
  if (session) {
23
29
  const totals = session.getTotals()
24
- lines.push(`[toona] session: ${totals.totalSaved} chars saved across ${totals.compressionCount} compressions`)
30
+ lines.push(`[toona] session: ${totals.totalSaved} chars, ${totals.totalTokensSaved} tokens saved across ${totals.compressionCount} compressions`)
25
31
  }
26
32
 
27
33
  return lines.join('\n')
@@ -29,6 +35,7 @@ export function formatRequestLog(stats, session) {
29
35
 
30
36
  export function createSession() {
31
37
  let totalSaved = 0
38
+ let totalTokensSaved = 0
32
39
  let compressionCount = 0
33
40
 
34
41
  return {
@@ -36,12 +43,13 @@ export function createSession() {
36
43
  for (const s of stats) {
37
44
  if (s.method && s.originalLen && s.compressedLen) {
38
45
  totalSaved += s.originalLen - s.compressedLen
46
+ totalTokensSaved += (s.originalTokens || 0) - (s.compressedTokens || 0)
39
47
  compressionCount++
40
48
  }
41
49
  }
42
50
  },
43
51
  getTotals() {
44
- return { totalSaved, compressionCount }
52
+ return { totalSaved, totalTokensSaved, compressionCount }
45
53
  },
46
54
  }
47
55
  }