squeezr-ai 1.11.1 → 1.11.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md CHANGED

@@ -107,6 +107,67 @@ Your provider's API (Anthropic / OpenAI / Google / Ollama)
 Recent content is always preserved untouched — by default the last 3 tool results are never compressed. Your CLI always has full context for what it's currently working on.
+### Does compression make the AI "dumber"?
+No — it's the opposite. Without Squeezr, long sessions hit the context window limit and the CLI **silently drops old messages entirely**. You lose them with no way to get them back.
+With Squeezr, old messages are **summarized, not deleted**. A 3,000-token git diff from message #15 becomes a ~150-token summary like:
+```
+[squeezr:a3f2c1] git diff: modified src/auth.ts — validateToken:
+added expiry logging + refreshToken call. 3 files changed.
+```
+The AI knows *what* you did, not every exact line. And if it needs the full original, it calls `squeezr_expand(a3f2c1)` and gets it back — losslessly.
+| Scenario | Message #15 at turn #100 |
+|---|---|
+| **No compression** | Probably dropped by the CLI (doesn't fit) |
+| **With Squeezr** | Summarized but present, expandable on demand |
+The trade-off: less detail, but more memory. Without Squeezr the AI forgets entire messages. With Squeezr it has a one-line note about every decision made — and can retrieve the full context when needed.
+---
+## Deterministic compression engine
+Before any AI model is involved, Squeezr runs a full deterministic compression pipeline on every tool result. This is a zero-cost, zero-latency layer that handles the most common developer outputs with specialized parsers:
+| Tool output | What Squeezr does | Typical savings |
+|---|---|---|
+| **git status** | Parses staged/modified/untracked, drops noise lines | 70-85% |
+| **git diff** | Extracts changed function names, strips context lines (adaptive), summarizes large diffs | 65-92% |
+| **git log** | Compacts to `hash msg (author, date)`, caps entries by pressure | 70-90% |
+| **cargo test/build/clippy** | Extracts only failures, errors, warnings | 80-95% |
+| **vitest/jest/playwright** | Extracts failed tests and assertion errors | 80-95% |
+| **tsc** | Groups errors by file, keeps only error lines | 75-90% |
+| **eslint/biome** | Compacts to file + rule + message | 70-85% |
+| **prettier** | Keeps only files-changed summary | 80-90% |
+| **next build** | Extracts errors and route summary | 75-85% |
+| **pytest** | Extracts FAILED lines and short summaries | 80-95% |
+| **npm/pnpm install** | Strips progress bars, keeps final summary | 85-90% |
+| **npm outdated** | Compact table format | 60-75% |
+| **docker ps/images/logs** | Compact output, strips timestamps | 70-80% |
+| **kubectl get/describe/logs** | Strips timestamps, compacts tables | 70-80% |
+| **gh pr/run/issue** | Strips decorations, keeps data | 65-75% |
+| **curl/wget** | Strips progress, keeps response body/headers | 60-80% |
+| **terraform plan/apply** | Extracts changes summary | 70-85% |
+| **prisma** | Compacts migration and schema output | 65-80% |
+| **Grep results** | Groups by file, caps matches per file (adaptive) | 60-80% |
+| **Read (large files)** | >500 lines: imports + signatures only. >200 lines: head + tail | 70-95% |
+| **Glob** | Compacts file listings into directory summaries | 50-70% |
+Additionally, on **all** outputs regardless of tool:
+- ANSI escape codes stripped
+- Progress bars and spinners removed
+- Repeated lines collapsed (`"... repeated N more times"`)
+- Duplicate stack traces deduplicated (Node.js and Python)
+- Inline JSON minified (objects >200 chars)
+- Timestamps stripped (ISO 8601, bracketed, bare time formats)
+- Excessive whitespace collapsed
+This engine runs in pure Node.js — microseconds per result, no API calls, no cost. It handles the bulk of the compression work. The AI layer (Haiku/GPT-4o-mini) only kicks in afterward on older messages where further summarization is needed.
 ---
 ## Supported CLIs and providers
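The hunk above lists cleanup steps that the README says run on all outputs. A minimal sketch of two of them (ANSI stripping and collapsing repeated lines) might look like the following. The function names and the regular expression are assumptions for illustration, not code taken from the package, and the pattern only covers CSI-style escape sequences.

```javascript
// Hypothetical sketch; not the package's actual implementation.
// Matches CSI escape sequences such as "\x1b[31m" (colors, cursor moves).
const ANSI_CSI_PATTERN = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

function stripAnsi(text) {
  return text.replace(ANSI_CSI_PATTERN, '');
}

// Collapses runs of identical consecutive lines into the first line plus
// a marker in the README's "... repeated N more times" format.
function collapseRepeatedLines(text) {
  const lines = text.split('\n');
  const out = [];
  let i = 0;
  while (i < lines.length) {
    let j = i + 1;
    while (j < lines.length && lines[j] === lines[i]) j++;
    out.push(lines[i]);
    const extra = j - i - 1;
    if (extra > 0) out.push(`... repeated ${extra} more times`);
    i = j;
  }
  return out.join('\n');
}

function cleanToolOutput(text) {
  return collapseRepeatedLines(stripAnsi(text));
}

const raw = '\x1b[31mError\x1b[0m\nretry\nretry\nretry\ndone';
console.log(cleanToolOutput(raw));
// Prints four lines: Error, retry, ... repeated 2 more times, done
```

Both steps are plain string operations, which is consistent with the README's claim that this layer needs no API calls.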
@@ -116,12 +177,12 @@ Squeezr auto-detects which provider each request targets from the auth headers.
 | CLI | Set this env var | Compresses with | Extra keys needed |
 |---|---|---|---|
 | **Claude Code** | `ANTHROPIC_BASE_URL=http://localhost:8080` | Claude Haiku | None |
-| **Codex CLI** | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
-| **Aider** (OpenAI backend) | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
+| **Codex CLI** | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
+| **Aider** (OpenAI backend) | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
 | **Aider** (Anthropic backend) | `ANTHROPIC_BASE_URL=http://localhost:8080` | Claude Haiku | None |
-| **OpenCode** | `OPENAI_BASE_URL=http://localhost:8080` | GPT-4o-mini | None |
+| **OpenCode** | `openai_base_url=http://localhost:8080` | GPT-4o-mini | None |
 | **Gemini CLI** | `GEMINI_API_BASE_URL=http://localhost:8080` | Gemini Flash 8B | None |
-| **Ollama** (any CLI) | `OPENAI_BASE_URL=http://localhost:8080` | Local model (configurable) | None |
+| **Ollama** (any CLI) | `openai_base_url=http://localhost:8080` | Local model (configurable) | None |
 Squeezr extracts the API key from the request itself and reuses it for compression. Zero extra setup.
@@ -142,13 +203,13 @@ export ANTHROPIC_BASE_URL=http://localhost:8080        # macOS / Linux
 $env:ANTHROPIC_BASE_URL="http://localhost:8080"        # Windows PowerShell
 # Codex / Aider / OpenCode
-export OPENAI_BASE_URL=http://localhost:8080
+export openai_base_url=http://localhost:8080
 # Gemini CLI
 export GEMINI_API_BASE_URL=http://localhost:8080
 # Ollama
-export OPENAI_BASE_URL=http://localhost:8080
+export openai_base_url=http://localhost:8080
 ```
 Or use the shell installer to set up the env var permanently and register Squeezr as a login service:

package/bin/squeezr.js CHANGED

@@ -181,7 +181,7 @@ function setupWindows() {
   // 1. Set env vars permanently via setx (user scope, no admin needed)
   const vars = {
     ANTHROPIC_BASE_URL: 'http://localhost:8080',
-    OPENAI_BASE_URL: 'http://localhost:8080',
+    openai_base_url: 'http://localhost:8080',
     GEMINI_API_BASE_URL: 'http://localhost:8080',
   }
   for (const [key, value] of Object.entries(vars)) {
@@ -257,7 +257,7 @@ function setupUnix() {
   const shellBlock = [
     `# squeezr env vars`,
     `export ANTHROPIC_BASE_URL=http://localhost:${port}`,
-    `export OPENAI_BASE_URL=http://localhost:${port}`,
+    `export openai_base_url=http://localhost:${port}`,
     `export GEMINI_API_BASE_URL=http://localhost:${port}`,
     `# squeezr auto-heal: start proxy if not running`,
     `if ! curl -sf http://localhost:${port}/squeezr/health >/dev/null 2>&1; then`,
@@ -384,7 +384,7 @@ function setupWSL() {
   const shellBlock = [
     `# squeezr env vars`,
     `export ANTHROPIC_BASE_URL=http://localhost:${port}`,
-    `export OPENAI_BASE_URL=http://localhost:${port}`,
+    `export openai_base_url=http://localhost:${port}`,
     `export GEMINI_API_BASE_URL=http://localhost:${port}`,
     `# squeezr auto-heal: start proxy if not running`,
     `if ! curl -sf http://localhost:${port}/squeezr/health >/dev/null 2>&1; then`,
@@ -421,7 +421,7 @@ function setupWSL() {
   const setxExe = '/mnt/c/Windows/System32/setx.exe'
   const winVars = {
     ANTHROPIC_BASE_URL: 'http://localhost:8080',
-    OPENAI_BASE_URL: 'http://localhost:8080',
+    openai_base_url: 'http://localhost:8080',
     GEMINI_API_BASE_URL: 'http://localhost:8080',
   }
   if (fs.existsSync(setxExe)) {
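The hunks above rename `OPENAI_BASE_URL` to `openai_base_url` in every setup path. Environment variable names are case-sensitive on Linux and macOS and case-insensitive on Windows, so the two spellings are distinct variables in the `export` lines but not in the `setx` ones. The sketch below illustrates that difference with a plain object. `lookupEnv` is a hypothetical helper, and which spelling each CLI actually reads is not established by this diff.

```javascript
// Hypothetical helper: looks up a variable the way a case-sensitive (POSIX)
// or case-insensitive (Windows) environment would resolve it.
function lookupEnv(env, name, { caseInsensitive = false } = {}) {
  if (!caseInsensitive) return env[name];
  const wanted = name.toUpperCase();
  const key = Object.keys(env).find((k) => k.toUpperCase() === wanted);
  return key === undefined ? undefined : env[key];
}

// Environment as written by the 1.11.2 shell block.
const env = { openai_base_url: 'http://localhost:8080' };

console.log(lookupEnv(env, 'OPENAI_BASE_URL'));
// undefined (case-sensitive lookup)
console.log(lookupEnv(env, 'OPENAI_BASE_URL', { caseInsensitive: true }));
// http://localhost:8080 (case-insensitive lookup)
```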

package/dist/server.js CHANGED

@@ -236,7 +236,8 @@ app.get('/squeezr/expand/:id', (c) => {
 app.all('*', async (c) => {
     const upstream = detectUpstream(c.req.raw.headers);
     const url = new URL(c.req.url);
-    const targetUrl = `${upstream}${url.pathname}${url.search}`;
+    const targetPath = url.pathname === '/responses' ? '/v1/responses' : url.pathname;
+    const targetUrl = `${upstream}${targetPath}${url.search}`;
     const body = await c.req.arrayBuffer();
     const fwdHeaders = forwardHeaders(c.req.raw.headers);
     const resp = await fetch(targetUrl, {
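The path rewrite in this hunk can be isolated as a small function to make its scope clear. This is a sketch: `buildTargetUrl` is a hypothetical name, and `https://upstream.example` stands in for whatever `detectUpstream` returns, which the diff does not show.

```javascript
// Sketch of the URL-building step from the hunk above.
function buildTargetUrl(upstream, requestUrl) {
  const url = new URL(requestUrl);
  // Only the exact path '/responses' is rewritten; all other paths pass through.
  const targetPath = url.pathname === '/responses' ? '/v1/responses' : url.pathname;
  return `${upstream}${targetPath}${url.search}`;
}

const upstream = 'https://upstream.example'; // placeholder, not a real provider URL

console.log(buildTargetUrl(upstream, 'http://localhost:8080/responses?stream=true'));
// https://upstream.example/v1/responses?stream=true
console.log(buildTargetUrl(upstream, 'http://localhost:8080/v1/messages'));
// https://upstream.example/v1/messages
```

Because the comparison is an exact match, a request to `/responses/` (trailing slash) or `/v1/responses` is forwarded unchanged.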

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "squeezr-ai",
-  "version": "1.11.1",
+  "version": "1.11.2",
   "description": "AI proxy that compresses Claude Code, Codex, Aider, Gemini CLI and Ollama context windows to save thousands of tokens per session",
   "keywords": [
     "claude",