squeezr-ai 1.11.0 → 1.11.2

package/README.md CHANGED
@@ -107,6 +107,67 @@ Your provider's API (Anthropic / OpenAI / Google / Ollama)
 
  Recent content is always preserved untouched — by default the last 3 tool results are never compressed. Your CLI always has full context for what it's currently working on.
 
+ ### Does compression make the AI "dumber"?
+
+ No — it's the opposite. Without Squeezr, long sessions hit the context window limit and the CLI **silently drops old messages entirely**. You lose them with no way to get them back.
+
+ With Squeezr, old messages are **summarized, not deleted**. A 3,000-token git diff from message #15 becomes a ~150-token summary like:
+
+ ```
+ [squeezr:a3f2c1] git diff: modified src/auth.ts — validateToken:
+ added expiry logging + refreshToken call. 3 files changed.
+ ```
+
+ The AI knows *what* you did, not every exact line. And if it needs the full original, it calls `squeezr_expand(a3f2c1)` and gets it back — losslessly.
+
+ | Scenario | Message #15 at turn #100 |
+ |---|---|
+ | **No compression** | Probably dropped by the CLI (doesn't fit) |
+ | **With Squeezr** | Summarized but present, expandable on demand |
+
+ The trade-off: less detail, but more memory. Without Squeezr the AI forgets entire messages. With Squeezr it has a one-line note about every decision made — and can retrieve the full context when needed.
+
+ ---
+
+ ## Deterministic compression engine
+
+ Before any AI model is involved, Squeezr runs a full deterministic compression pipeline on every tool result. This is a zero-cost, zero-latency layer that handles the most common developer outputs with specialized parsers:
+
+ | Tool output | What Squeezr does | Typical savings |
+ |---|---|---|
+ | **git status** | Parses staged/modified/untracked, drops noise lines | 70-85% |
+ | **git diff** | Extracts changed function names, strips context lines (adaptive), summarizes large diffs | 65-92% |
+ | **git log** | Compacts to `hash msg (author, date)`, caps entries by pressure | 70-90% |
+ | **cargo test/build/clippy** | Extracts only failures, errors, warnings | 80-95% |
+ | **vitest/jest/playwright** | Extracts failed tests and assertion errors | 80-95% |
+ | **tsc** | Groups errors by file, keeps only error lines | 75-90% |
+ | **eslint/biome** | Compacts to file + rule + message | 70-85% |
+ | **prettier** | Keeps only files-changed summary | 80-90% |
+ | **next build** | Extracts errors and route summary | 75-85% |
+ | **pytest** | Extracts FAILED lines and short summaries | 80-95% |
+ | **npm/pnpm install** | Strips progress bars, keeps final summary | 85-90% |
+ | **npm outdated** | Compact table format | 60-75% |
+ | **docker ps/images/logs** | Compact output, strips timestamps | 70-80% |
+ | **kubectl get/describe/logs** | Strips timestamps, compacts tables | 70-80% |
+ | **gh pr/run/issue** | Strips decorations, keeps data | 65-75% |
+ | **curl/wget** | Strips progress, keeps response body/headers | 60-80% |
+ | **terraform plan/apply** | Extracts changes summary | 70-85% |
+ | **prisma** | Compacts migration and schema output | 65-80% |
+ | **Grep results** | Groups by file, caps matches per file (adaptive) | 60-80% |
+ | **Read (large files)** | >500 lines: imports + signatures only. >200 lines: head + tail | 70-95% |
+ | **Glob** | Compacts file listings into directory summaries | 50-70% |
+
+ Additionally, on **all** outputs regardless of tool:
+ - ANSI escape codes stripped
+ - Progress bars and spinners removed
+ - Repeated lines collapsed (`"... repeated N more times"`)
+ - Duplicate stack traces deduplicated (Node.js and Python)
+ - Inline JSON minified (objects >200 chars)
+ - Timestamps stripped (ISO 8601, bracketed, bare time formats)
+ - Excessive whitespace collapsed
+
+ This engine runs in pure Node.js — microseconds per result, no API calls, no cost. It handles the bulk of the compression work. The AI layer (Haiku/GPT-4o-mini) only kicks in afterward on older messages where further summarization is needed.
+
  ---
 
  ## Supported CLIs and providers
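
The tool-independent cleanup steps listed in the README hunk above (ANSI stripping, repeated-line collapsing, whitespace collapsing) can be illustrated with a small sketch. This is not the package's implementation, which is not part of this diff; the function names `stripAnsi`, `collapseRepeatedLines`, `collapseWhitespace`, and `cleanOutput`, the regular expressions, and the order of the steps are all assumptions made for illustration.

```javascript
// Illustrative sketch only: helper names and regexes are assumptions,
// not code from squeezr-ai.

// Matches common ANSI CSI sequences such as "\x1b[31m" (colors, cursor moves).
const ANSI_PATTERN = /\x1b\[[0-9;?]*[ -\/]*[@-~]/g;

function stripAnsi(text) {
  return text.replace(ANSI_PATTERN, '');
}

// Keep the first line of each run of identical consecutive lines and
// replace the rest of the run with a single marker line.
function collapseRepeatedLines(text) {
  const lines = text.split('\n');
  const out = [];
  let i = 0;
  while (i < lines.length) {
    let j = i + 1;
    while (j < lines.length && lines[j] === lines[i]) j++;
    out.push(lines[i]);
    const repeats = j - i - 1;
    if (repeats > 0) out.push(`... repeated ${repeats} more times`);
    i = j;
  }
  return out.join('\n');
}

// Collapse runs of spaces/tabs to one space and 3+ newlines to one blank line.
function collapseWhitespace(text) {
  return text.replace(/[ \t]+/g, ' ').replace(/\n{3,}/g, '\n\n');
}

function cleanOutput(text) {
  return collapseWhitespace(collapseRepeatedLines(stripAnsi(text)));
}
```

In this sketch the ANSI codes are stripped first so that lines differing only in color codes compare as equal when repeated lines are collapsed.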
@@ -116,12 +177,12 @@ Squeezr auto-detects which provider each request targets from the auth headers.
  | CLI | Set this env var | Compresses with | Extra keys needed |
  |---|---|---|---|
  | **Claude Code** | `ANTHROPIC_BASE_URL=http://localhost:8080` | Claude Haiku | None |
- | **Codex CLI** | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
- | **Aider** (OpenAI backend) | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
+ | **Codex CLI** | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
+ | **Aider** (OpenAI backend) | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
  | **Aider** (Anthropic backend) | `ANTHROPIC_BASE_URL=http://localhost:8080` | Claude Haiku | None |
- | **OpenCode** | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
+ | **OpenCode** | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
  | **Gemini CLI** | `GEMINI_API_BASE_URL=http://localhost:8080` | Gemini Flash 8B | None |
- | **Ollama** (any CLI) | `OPENAI_BASE_URL=http://localhost:8080` | Local model (configurable) | None |
+ | **Ollama** (any CLI) | `openai_base_url=http://localhost:8080` | Local model (configurable) | None |
 
  Squeezr extracts the API key from the request itself and reuses it for compression. Zero extra setup.
 
@@ -142,13 +203,13 @@ export ANTHROPIC_BASE_URL=http://localhost:8080 # macOS / Linux
  $env:ANTHROPIC_BASE_URL="http://localhost:8080" # Windows PowerShell
 
  # Codex / Aider / OpenCode
- export OPENAI_BASE_URL=http://localhost:8080
+ export openai_base_url=http://localhost:8080
 
  # Gemini CLI
  export GEMINI_API_BASE_URL=http://localhost:8080
 
  # Ollama
- export OPENAI_BASE_URL=http://localhost:8080
+ export openai_base_url=http://localhost:8080
  ```
 
  Or use the shell installer to set up the env var permanently and register Squeezr as a login service:
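
The README hunks above rename the variable from `OPENAI_BASE_URL` to `openai_base_url`. On Linux and macOS, environment variable names are case-sensitive, so these are two distinct variables; this diff does not say which spelling each CLI reads. A minimal sketch for checking which spelling is actually set is shown below. The helper `findBaseUrl` is hypothetical and not part of squeezr-ai.

```javascript
// Hypothetical helper, not part of squeezr-ai. `env` is any plain object of
// environment variables, for example process.env.
function findBaseUrl(env, name = 'OPENAI_BASE_URL') {
  // Exact match first: this is the only form a case-sensitive lookup sees.
  if (env[name]) return { key: name, value: env[name] };
  // Otherwise scan case-insensitively so a lowercase export is still reported.
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(env)) {
    if (key.toLowerCase() === wanted && value) return { key, value };
  }
  return null;
}
```

Calling `findBaseUrl(process.env)` returns the key as it is actually spelled in the environment, which makes a case mismatch between the exported name and the name a tool expects easy to spot.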
package/bin/squeezr.js CHANGED
@@ -47,6 +47,46 @@ function runNode(script, extraArgs = []) {
    child.on('exit', code => process.exit(code ?? 0))
  }
 
+ async function startDaemon() {
+   const distIndex = path.join(ROOT, 'dist', 'index.js')
+   if (!fs.existsSync(distIndex)) {
+     console.error(`Error: ${distIndex} not found. Run 'npm run build' first.`)
+     process.exit(1)
+   }
+
+   // Check if already running
+   const port = process.env.SQUEEZR_PORT || 8080
+   const running = await new Promise(resolve => {
+     const req = http.get(`http://localhost:${port}/squeezr/health`, res => {
+       resolve(res.statusCode === 200)
+       res.destroy()
+     })
+     req.on('error', () => resolve(false))
+     req.setTimeout(2000, () => { req.destroy(); resolve(false) })
+   })
+   if (running) {
+     console.log(`Squeezr is already running on port ${port}`)
+     return
+   }
+
+   // Launch detached background process
+   const logDir = path.join(os.homedir(), '.squeezr')
+   const logFile = path.join(logDir, 'squeezr.log')
+   fs.mkdirSync(logDir, { recursive: true })
+   const logFd = fs.openSync(logFile, 'a')
+   const child = spawn(process.execPath, [distIndex], {
+     detached: true,
+     stdio: ['ignore', logFd, logFd],
+     windowsHide: true,
+     cwd: ROOT,
+     env: { ...process.env, SQUEEZR_DAEMON: '1' },
+   })
+   child.unref()
+   fs.closeSync(logFd)
+   console.log(`Squeezr started in background (pid ${child.pid})`)
+   console.log(`Logs → ${logFile}`)
+ }
+
  function showLogs() {
    const logFile = path.join(os.homedir(), '.squeezr', 'squeezr.log')
    if (!fs.existsSync(logFile)) {
@@ -141,7 +181,7 @@ function setupWindows() {
    // 1. Set env vars permanently via setx (user scope, no admin needed)
    const vars = {
      ANTHROPIC_BASE_URL: 'http://localhost:8080',
-     OPENAI_BASE_URL: 'http://localhost:8080',
+     openai_base_url: 'http://localhost:8080',
      GEMINI_API_BASE_URL: 'http://localhost:8080',
    }
    for (const [key, value] of Object.entries(vars)) {
@@ -217,7 +257,7 @@ function setupUnix() {
    const shellBlock = [
      `# squeezr env vars`,
      `export ANTHROPIC_BASE_URL=http://localhost:${port}`,
-     `export OPENAI_BASE_URL=http://localhost:${port}`,
+     `export openai_base_url=http://localhost:${port}`,
      `export GEMINI_API_BASE_URL=http://localhost:${port}`,
      `# squeezr auto-heal: start proxy if not running`,
      `if ! curl -sf http://localhost:${port}/squeezr/health >/dev/null 2>&1; then`,
@@ -344,7 +384,7 @@ function setupWSL() {
    const shellBlock = [
      `# squeezr env vars`,
      `export ANTHROPIC_BASE_URL=http://localhost:${port}`,
-     `export OPENAI_BASE_URL=http://localhost:${port}`,
+     `export openai_base_url=http://localhost:${port}`,
      `export GEMINI_API_BASE_URL=http://localhost:${port}`,
      `# squeezr auto-heal: start proxy if not running`,
      `if ! curl -sf http://localhost:${port}/squeezr/health >/dev/null 2>&1; then`,
@@ -381,7 +421,7 @@ function setupWSL() {
    const setxExe = '/mnt/c/Windows/System32/setx.exe'
    const winVars = {
      ANTHROPIC_BASE_URL: 'http://localhost:8080',
-     OPENAI_BASE_URL: 'http://localhost:8080',
+     openai_base_url: 'http://localhost:8080',
      GEMINI_API_BASE_URL: 'http://localhost:8080',
    }
    if (fs.existsSync(setxExe)) {
@@ -489,7 +529,7 @@ Done!
  switch (command) {
    case undefined:
    case 'start':
-     runNode('index.js')
+     startDaemon()
      break
 
    case 'setup':
package/dist/index.js CHANGED
@@ -11,9 +11,19 @@ serve({ fetch: app.fetch, port: PORT }, () => {
    console.log(`Backends: Anthropic → Haiku | OpenAI → GPT-4o-mini | Gemini → Flash-8B | Local → ${config.localCompressionModel}`);
    console.log(`Stats: http://localhost:${PORT}/squeezr/stats`);
  });
- process.on('SIGINT', () => {
-   const s = stats.summary();
-   console.log(`\n[squeezr] Session summary: ${s.requests} requests | -${s.total_saved_chars.toLocaleString()} chars (~${s.total_saved_tokens.toLocaleString()} tokens, ${s.savings_pct}% saved)`);
-   process.exit(0);
- });
+ const isDaemon = !!process.env.SQUEEZR_DAEMON;
+ if (isDaemon) {
+   // Daemon mode: ignore SIGINT (Ctrl+C) and SIGHUP (terminal close)
+   // Only stop via `squeezr stop` which sends SIGTERM
+   process.on('SIGINT', () => { });
+   process.on('SIGHUP', () => { });
+ }
+ else {
+   // Dev mode (npm run dev): allow Ctrl+C to stop
+   process.on('SIGINT', () => {
+     const s = stats.summary();
+     console.log(`\n[squeezr] Session summary: ${s.requests} requests | -${s.total_saved_chars.toLocaleString()} chars (~${s.total_saved_tokens.toLocaleString()} tokens, ${s.savings_pct}% saved)`);
+     process.exit(0);
+   });
+ }
  process.on('SIGTERM', () => process.exit(0));
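
The daemon check in the hunk above uses `!!process.env.SQUEEZR_DAEMON`, and environment variable values are always strings. The sketch below isolates that expression to show which values switch daemon mode on. The wrapper `isDaemonMode` is a hypothetical name introduced here, and it takes a plain object so that it can be tried without modifying `process.env`.

```javascript
// Hypothetical wrapper around the `!!process.env.SQUEEZR_DAEMON` expression
// from the diff; `env` stands in for process.env.
function isDaemonMode(env) {
  return !!env.SQUEEZR_DAEMON;
}

console.log(isDaemonMode({ SQUEEZR_DAEMON: '1' })); // true (the value startDaemon sets)
console.log(isDaemonMode({ SQUEEZR_DAEMON: '0' })); // true: any non-empty string is truthy
console.log(isDaemonMode({ SQUEEZR_DAEMON: '' }));  // false: empty string
console.log(isDaemonMode({}));                      // false: variable not set
```

With this check, exporting `SQUEEZR_DAEMON=0` selects the daemon branch as well, so the variable has to be unset or empty for the dev-mode Ctrl+C handler to be installed.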
package/dist/server.js CHANGED
@@ -236,7 +236,8 @@ app.get('/squeezr/expand/:id', (c) => {
  app.all('*', async (c) => {
    const upstream = detectUpstream(c.req.raw.headers);
    const url = new URL(c.req.url);
-   const targetUrl = `${upstream}${url.pathname}${url.search}`;
+   const targetPath = url.pathname === '/responses' ? '/v1/responses' : url.pathname;
+   const targetUrl = `${upstream}${targetPath}${url.search}`;
    const body = await c.req.arrayBuffer();
    const fwdHeaders = forwardHeaders(c.req.raw.headers);
    const resp = await fetch(targetUrl, {
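
The path rewrite added in this hunk can be tried in isolation. The sketch below wraps the two changed lines in a hypothetical function, `buildTargetUrl`; the upstream value `https://api.openai.com` is only an example, since the return value of `detectUpstream` is not shown in this diff, and the sketch assumes it is a base URL without a trailing slash.

```javascript
// Hypothetical wrapper around the two lines added in the diff.
// Assumption: `upstream` is a base URL with no trailing slash.
function buildTargetUrl(upstream, requestUrl) {
  const url = new URL(requestUrl);
  // Only the exact path "/responses" is rewritten; every other path passes through.
  const targetPath = url.pathname === '/responses' ? '/v1/responses' : url.pathname;
  return `${upstream}${targetPath}${url.search}`;
}

console.log(buildTargetUrl('https://api.openai.com', 'http://localhost:8080/responses?stream=true'));
// https://api.openai.com/v1/responses?stream=true
console.log(buildTargetUrl('https://api.openai.com', 'http://localhost:8080/v1/responses'));
// https://api.openai.com/v1/responses
```

Because the comparison is an exact match, a sub-path such as `/responses/abc` is forwarded unchanged, and the query string is preserved in every case.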
package/dist/version.d.ts CHANGED
@@ -1 +1 @@
- export declare const VERSION = "1.11.0";
+ export declare const VERSION = "1.11.1";
package/dist/version.js CHANGED
@@ -1 +1 @@
- export const VERSION = '1.11.0';
+ export const VERSION = '1.11.1';
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "squeezr-ai",
-   "version": "1.11.0",
+   "version": "1.11.2",
    "description": "AI proxy that compresses Claude Code, Codex, Aider, Gemini CLI and Ollama context windows to save thousands of tokens per session",
    "keywords": [
      "claude",