squeezr-ai 1.80.6 → 1.80.15

package/README.md CHANGED
@@ -51,7 +51,13 @@ Every request passes through Squeezr on `localhost:8080`. Compression layers, in
  4. **Deterministic preprocessing** — zero-latency regex rules on every tool result: ANSI/progress-bar/timestamp stripping, line dedup, JSON minification, plus ~30 tool-specific patterns (git, vitest/jest, tsc, eslint, cargo, pytest, docker, kubectl, gh…). Byte-stable → cache-safe.
  5. **Cross-turn dedup & diff-reads** — repeated tool outputs collapse to references; repeated file reads become diffs against the latest read. (Only past the cache barrier.)
  6. **Stale-turn summarization** — conversations >40 turns get old assistant prose collapsed to keyword summaries. (Only for clients without prompt caching.)
- 7. **AI compression** (opt-in, off by default) — blocks ≥1500 chars summarized by a small model. Measured on real data: 75–91% compression on large blocks. Backends: **Zest (local, free, deterministic)**, Haiku, GPT-4o-mini, Gemini Flash. Guarded by a rate limiter (20 calls/5 min), a persistent on/off toggle, and the cache barrier.
+ 7. **AI compression** (opt-in, off by default) — old blocks above the AI floor (~1000 chars, auto-raised by the quality governor) summarized by a small model. Backends: **Zest (local, free, deterministic)**, Haiku, GPT-4o-mini, Gemini Flash. Heavily guarded so it only ever helps:
+ - **Structured-data guard** — JSON / JSONL / record dumps / tables are *never* AI-rewritten (a model can silently blank a field value); they stay in their deterministic form. Prose/logs still get compressed.
+ - **Compressibility probe** — a one-shot `deflate` estimate skips already-dense blocks (path/error/test dumps) that wouldn't beat the min-ratio, so no wasted backend calls.
+ - **Acceptance guardrail + retry-with-correction** — every AI result is validated; if it dropped a critical token (path/URL/error code) the model is re-prompted with the exact tokens to restore, else the result is rejected and the deterministic form is kept. Nothing that loses a hard token is ever used.
+ - **Quality governor** — watches expand-rate and guard-reject-rate and auto-raises the min block size (or pauses) when quality dips.
+ - **Backend-aware limits** — local Zest is free → no rate limit, generous timeout, processed sequentially (Ollama serialises anyway). Cloud backends keep a hard cap (20 calls/5 min) and a short timeout to protect spend.
+ - Plus the persistent on/off toggle and the cache barrier.
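The compressibility probe described in item 7 can be sketched roughly as follows. This is an illustration under stated assumptions, not Squeezr's actual code: the function names `estimateSavings` and `worthAiCall` and the 30% default threshold are hypothetical (the diff does not state the real min-ratio), and the only library used is Node's built-in `zlib`.

```typescript
import { deflateSync } from 'node:zlib';

// Hypothetical helper: fraction of bytes a fast deflate pass removes (0..1).
// A low value suggests the block is already dense and not worth an AI call.
function estimateSavings(text: string): number {
  const raw = Buffer.from(text, 'utf8');
  if (raw.length === 0) return 0;
  // Level 1 keeps the probe cheap; it is an estimate, not the final compression.
  const packed = deflateSync(raw, { level: 1 });
  return Math.max(0, 1 - packed.length / raw.length);
}

// Hypothetical gate: skip the backend when the estimate cannot reach the
// minimum savings. The 0.3 default is an assumption made for this sketch.
function worthAiCall(text: string, minSavings = 0.3): boolean {
  return estimateSavings(text) >= minSavings;
}
```

A caller would run `worthAiCall(block)` before contacting any backend and keep the deterministic form when it returns `false`.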
 
  ### Recovery: nothing is ever lost
 
@@ -67,7 +73,7 @@ Compression aggressiveness scales with context usage: <50% → light (1500-char
 
  | Page | What it shows |
  |------|---------------|
- | **Overview** | All-time tokens saved (single source of truth), ratio + per-request average, cost saved, Top Tools (real per-tool block counts), Session Cache (AI layer), AI Compression card (calls / saved / spent / net), **Prompt Cache health** (read vs creation + hit %), Savings by type (per-technique breakdown), by model (incl. what compression backends spend), by client, compression mode + **Bypass / AI Compression toggles** |
+ | **Overview** | **Today-scoped** (resets at midnight): tokens saved, two honest ratios — **% of total sent today** and **% of the last request** (changes every turn), cost comparison (today), Cost/Savings-by-type breakdown (today), Top Tools, Session Cache, AI Compression card (calls / saved / spent / net), **Prompt Cache health** (read vs creation + hit %, **persisted across restarts**), by model / by client, compression mode + **Bypass / AI Compression toggles** |
  | **Savings** | Day / Week / Month / All-time filters with period navigation — per-period tokens, cost, sessions, charts, By Model / By Client / Top Tools / AI Compression / Session Cache, all persisted across restarts |
  | **Settings** | Client base-URL reference, ports, version/uptime, bypass & circuit breaker state, **AI Compression on/off**, **Restart / Stop buttons**, update check |
 
@@ -76,11 +82,13 @@ Compression aggressiveness scales with context usage: <50% → light (1500-char
  Squeezr sits in the critical path. It is designed to never break your workflow — and never burn your plan:
 
  - **Bypass mode (persisted)** — one click/command disables all compression; survives restarts. The emergency stop.
- - **AI compression master switch (persisted, default OFF)** — with a subscription OAuth token, AI compression calls bill against *your own plan*; only enable it with a separately billed API key or the free local Zest backend.
- - **AI rate limiter** — hard cap of 20 AI calls per 5-minute sliding window, process-global.
- - **AI minimum block size (1500 chars)** — measured on real data: small blocks *expand* under AI compression; Squeezr never AI-compresses them.
+ - **AI compression master switch (persisted, default OFF)** — with a subscription OAuth token, AI compression calls bill against *your own plan*; Squeezr refuses to auto-route to Haiku on an OAuth token. Use the free local Zest backend or a separately billed API key.
+ - **AI rate limiter (cloud only)** — hard cap of 20 AI calls per 5-minute sliding window for paid cloud backends (protects spend). Local Zest is free → not rate-limited.
+ - **AI minimum block size (~1000 chars, governed)** — small blocks can't be compressed without loss; Squeezr never AI-compresses below the floor, and the quality governor raises it automatically if reject/expand rates climb.
+ - **Structured-data & compressibility guards** — AI never rewrites structured data (JSON/records → no field corruption), and dense/incompressible blocks skip AI entirely.
+ - **Acceptance guardrail + retry-with-correction** — AI output that drops a critical token or doesn't save enough is rejected (after one corrective retry); the deterministic form is kept.
  - **Cache barrier** — unstable passes can't touch the cached prefix (see prompt-cache safety above).
- - **Circuit breaker** — 3 consecutive AI backend failures → AI compression disabled for 60s, deterministic continues.
+ - **Circuit breaker + backend-aware timeouts** — 3 consecutive AI backend failures → AI disabled for 60s, deterministic continues. Local calls get a generous timeout and run sequentially (Ollama serialises) so they don't false-timeout.
  - **Atomic persistence** — stats, history, caches and toggles are written atomically (tmp + rename); a crash can't corrupt them.
  - **Self-test on startup** — detects port squatting (the classic `$.speed` Claude Code error), env-var drift, and pipeline issues.
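The cloud-only rate limiter in the list above (20 calls per 5-minute sliding window, with local Zest exempt) could look something like the sketch below. The class name `SlidingWindowLimiter`, the helper `allowAiCall`, and the injectable clock are all hypothetical and exist only for illustration; the diff does not show how Squeezr implements the limiter.

```typescript
// Hypothetical sliding-window limiter: at most `maxCalls` within any `windowMs`.
class SlidingWindowLimiter {
  private readonly stamps: number[] = [];
  private readonly maxCalls: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(maxCalls = 20, windowMs = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the call if the window still has room.
  tryAcquire(): boolean {
    const t = this.now();
    // Drop timestamps that have slid out of the window.
    while (this.stamps.length > 0 && t - this.stamps[0] >= this.windowMs) {
      this.stamps.shift();
    }
    if (this.stamps.length >= this.maxCalls) return false;
    this.stamps.push(t);
    return true;
  }
}

// Assumption for this sketch: the backend id 'local' means the free Zest/Ollama
// path, which never consults the limiter; every other backend is treated as paid.
function allowAiCall(backend: string, limiter: SlidingWindowLimiter): boolean {
  return backend === 'local' ? true : limiter.tryAcquire();
}
```

Passing the clock in as a function keeps the limiter testable without waiting five real minutes.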
 
@@ -95,7 +103,7 @@ One source of truth (`~/.squeezr/stats.json`, continuous net counters — never
 
  ## Zest — Squeezr's own compression model
 
- Zest (`zest-0.8b`, fine-tuned from Qwen3.5-0.8B with LoRA) is Squeezr's local compression model: free, runs on CPU via Ollama, and **deterministic in greedy decoding** — which makes AI compression byte-stable and therefore cache-safe. Status: v3 trained (89% eval accuracy), GGUF packaging in progress. Design doc: [docs/REINVENT_AI.md](docs/REINVENT_AI.md)
+ Zest (`zest-0.8b`, fine-tuned from Qwen3.5-0.8B with LoRA) is Squeezr's local compression model: free, runs on CPU via Ollama, and **deterministic in greedy decoding** (temperature 0) — which makes AI compression byte-stable and therefore cache-safe. Status: deployed and selectable as the `local` backend (Ollama). Training data is being regenerated against Squeezr's own runtime guard (every example must keep all hard tokens — paths/URLs/error codes — and clear the min-ratio) so the model learns guard-passing compression instead of token-dropping. Design doc: [docs/REINVENT_AI.md](docs/REINVENT_AI.md)
 
  ## MCP server
 
@@ -112,8 +120,10 @@ User config lives at **`~/.squeezr/squeezr.toml`** (survives npm updates). A pro
  threshold = 800 # min chars to compress a tool result
  keep_recent = 3 # recent tool results never touched
  ai_compression = false # MASTER switch for AI calls — default OFF (see Safety)
+ backend = "local" # auto | local (Zest) | haiku | gpt-mini | gemini-flash
  compress_system_prompt = true
  compress_conversation = true
+ compress_assistant_ai = false # AI-compress long old assistant turns (prose-heavy chats)
  stale_turns = true # auto-disabled when prompt-cache markers are present
  tool_desc_compress = true # first-paragraph truncation + expand recovery
  tool_desc_expand = true
@@ -1,6 +1,7 @@
  import { describe, it, expect, vi, beforeEach } from 'vitest';
  import { clearExpandStore } from '../expand.js';
  import { clearSessionCache } from '../sessionCache.js';
+ import { runtimeOverrides } from '../config.js';
  // Mock AI SDKs before importing compressor
  vi.mock('@anthropic-ai/sdk', () => ({
  // function (not arrow) — `new Anthropic()` requires a constructable implementation
@@ -34,10 +35,17 @@ vi.mock('../aiToggle.js', () => ({
  setAiCompression: () => { },
  toggleAiCompression: () => true,
  }));
- // Mock fetch for Gemini
+ // Mock fetch for the fetch-based backends. Must satisfy BOTH shapes because the
+ // default backend is now `local` (Ollama, /api/chat → {message:{content}}) and
+ // effectiveBackend() reads the global config singleton, not the per-test config.
+ // Gemini uses {candidates}. `ok: true` keeps ollamaCompressChunk from throwing.
  const mockFetch = vi.fn().mockResolvedValue({
+ ok: true,
  json: async () => ({
  candidates: [{ content: { parts: [{ text: 'AI compressed summary' }] } }],
+ message: { content: 'AI compressed summary' },
+ prompt_eval_count: 10,
+ eval_count: 5,
  }),
  });
  vi.stubGlobal('fetch', mockFetch);
@@ -72,6 +80,10 @@ beforeEach(() => {
  clearExpandStore();
  clearSessionCache();
  vi.clearAllMocks();
+ // effectiveBackend() reads the GLOBAL config singleton (default `local`), so by
+ // default these tests exercise the Ollama path (mock fetch above returns a valid
+ // Ollama-shaped body). Tests that need a specific cloud backend set it explicitly.
+ runtimeOverrides.compressionBackend = undefined;
  });
  // ── Anthropic format ──────────────────────────────────────────────────────────
  describe('compressAnthropicMessages', () => {
@@ -118,7 +130,7 @@ describe('compressAnthropicMessages', () => {
  const msgs = makeMessages(['x'.repeat(1600), 'y'.repeat(1600)]);
  const [result] = await compressAnthropicMessages(msgs, 'key', baseConfig);
  const compressed = result[1].content[0].content;
- expect(compressed).toMatch(/\[squeezr:[a-f0-9]{6} -\d+%\]/);
+ expect(compressed).toMatch(/\[squeezr:[a-f0-9]{6} -\d+% — squeezr_expand\("[a-f0-9]{6}"\) for full exact text\]/);
  });
  it('does not compress blocks below threshold', async () => {
  const shortText = 'short'; // below threshold of 50
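The updated assertion in this hunk implies that the compression marker now carries an inline `squeezr_expand("…")` pointer. A small sketch of how a consumer might parse that marker is shown below. The `ExpandMarker` type and `parseMarker` function are hypothetical names, the pattern is copied from the test's regex (with the dash written as `\u2014`), and whether the two ids are always equal is an assumption the test itself does not check.

```typescript
// Hypothetical shape for a parsed marker; field names are illustrative only.
interface ExpandMarker {
  id: string;           // id in the "[squeezr:<id> ...]" prefix
  savedPercent: number; // the "-NN%" figure
  expandId: string;     // id inside squeezr_expand("<id>")
}

// Same pattern as the test's toMatch() regex, with capture groups added.
const MARKER =
  /\[squeezr:([a-f0-9]{6}) -(\d+)% \u2014 squeezr_expand\("([a-f0-9]{6})"\) for full exact text\]/;

function parseMarker(text: string): ExpandMarker | null {
  const m = MARKER.exec(text);
  if (!m) return null; // old-style or missing marker
  return { id: m[1], savedPercent: Number(m[2]), expandId: m[3] };
}
```

An old-style marker such as `[squeezr:a1b2c3 -42%]` does not match this pattern, which is the behaviour change the test encodes.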
@@ -234,11 +246,10 @@ describe('compressOpenAIMessages', () => {
  expect(savings.compressed).toBe(1);
  });
  it('uses Ollama backend for local keys', async () => {
- const OpenAI = (await import('openai')).default;
- const msgs = makeMessages(['z'.repeat(200), 'v'.repeat(200)]);
+ const msgs = makeMessages(['z'.repeat(1600), 'v'.repeat(1600)]);
  await compressOpenAIMessages(msgs, 'ollama-key', { ...baseConfig, isLocalKey: () => true }, true);
- // OpenAI client should be called (Ollama uses OpenAI-compatible API)
- expect(OpenAI).toHaveBeenCalled();
+ // Local compression uses Ollama's native /api/chat over fetch (not the OpenAI SDK).
+ expect(mockFetch).toHaveBeenCalledWith(expect.stringContaining('/api/chat'), expect.any(Object));
  });
  it('does not inject expand tool for local requests', async () => {
  const msgs = makeMessages(['short']);
@@ -281,6 +292,7 @@ describe('compressGeminiContents', () => {
  expect(savings.compressed).toBe(1);
  });
  it('uses fetch with Gemini API URL', async () => {
+ runtimeOverrides.compressionBackend = 'auto'; // use the per-API default (Gemini), not the global `local`
  const cts = makeContents(['g'.repeat(200), 'h'.repeat(200)]);
  await compressGeminiContents(cts, 'my-google-key', baseConfig);
  expect(mockFetch).toHaveBeenCalledWith(expect.stringContaining('generativelanguage.googleapis.com'), expect.any(Object));
@@ -1,5 +1,77 @@
  import { describe, it, expect } from 'vitest';
  import { preprocess, preprocessForTool, preprocessRatio } from '../deterministic.js';
+ // ── Read fidelity (Edit-mismatch / corruption guards) ─────────────────────────
+ describe('preprocessForTool - Read stays verbatim', () => {
+ it('does not minify embedded JSON in a read (Edit would mismatch)', () => {
+ const file = '{\n "name": "pkg",\n "version": "1.0.0",\n "scripts": {\n "build": "tsc"\n }\n}';
+ expect(preprocessForTool(file, 'Read')).toBe(file);
+ });
+ it('does not strip timestamp-like substrings from a read', () => {
+ const file = 'const RELEASE = "2026-01-02T03:04:05Z"\nconst T = "12:34:56 "';
+ expect(preprocessForTool(file, 'Read')).toBe(file);
+ });
+ it('does not collapse blank lines or dedup repeated lines in a read', () => {
+ const file = 'a\n\n\n\nb\nx\nx\nx\nx\n';
+ expect(preprocessForTool(file, 'Read')).toBe(file);
+ });
+ it('still strips ANSI/control noise from a read', () => {
+ expect(preprocessForTool('\x1B[32mcode\x1B[0m', 'Read')).toBe('code');
+ });
+ });
+ describe('minifyJson - big integer safety', () => {
+ it('leaves blocks with 16+ digit integers untouched (precision corruption)', () => {
+ const block = '{ "id": 1234567890123456789, "name": "x", "padding": "' + 'y'.repeat(200) + '" }';
+ // The big id must survive verbatim (JSON.parse would round it)
+ expect(preprocess(block)).toContain('1234567890123456789');
+ });
+ it('still minifies safe JSON', () => {
+ const block = '{\n "a": 1,\n "b": "' + 'z'.repeat(200) + '"\n}';
+ const out = preprocess(block);
+ expect(out).not.toContain('\n "a"'); // got minified
+ });
+ });
+ describe('compactGrepOutput - Windows paths', () => {
+ it('groups by full drive-letter path, not the bare drive', () => {
+ const lines = Array.from({ length: 25 }, (_, i) => `C:\\src\\app.ts:${i + 1}:match ${i}`);
+ const out = preprocessForTool(lines.join('\n'), 'Grep');
+ expect(out).toContain('C:\\src\\app.ts');
+ expect(out).not.toMatch(/^C \(/m); // not grouped under bogus file "C"
+ });
+ });
+ // ── Expand results are never re-compressed ────────────────────────────────────
+ describe('preprocessForTool - squeezr_expand result is verbatim', () => {
+ it('returns an expand-call result untouched, even when huge', () => {
+ const huge = Array.from({ length: 500 }, (_, i) => `recovered line ${i}`).join('\n');
+ // mcp-prefixed name (how Claude Code routes the MCP tool)
+ expect(preprocessForTool(huge, 'mcp__squeezr__squeezr_expand')).toBe(huge);
+ expect(preprocessForTool(huge, 'squeezr_expand')).toBe(huge);
+ });
+ });
+ // ── Reversibility: lossy deterministic compaction gets an expand pointer ──────
+ describe('preprocessForTool - lossy compaction is recoverable', () => {
+ it('appends a squeezr_expand pointer when a huge read is truncated', () => {
+ const big = Array.from({ length: 400 }, (_, i) => `line ${i}`).join('\n');
+ const out = preprocessForTool(big, 'Read');
+ expect(out).toContain('squeezr_expand("');
+ expect(out).toContain('omitted');
+ });
+ it('does NOT append a pointer to a small verbatim read', () => {
+ const small = 'const a = 1\nconst b = 2\n';
+ expect(preprocessForTool(small, 'Read')).toBe(small);
+ });
+ it('appends a pointer when a long bash output is truncated', () => {
+ const log = Array.from({ length: 200 }, (_, i) => `noise output line ${i}`).join('\n');
+ expect(preprocessForTool(log, 'Bash')).toContain('squeezr_expand("');
+ });
+ it('the pointer id round-trips through the expand store', async () => {
+ const big = Array.from({ length: 400 }, (_, i) => `unique-line-${i}`).join('\n');
+ const out = preprocessForTool(big, 'Read');
+ const id = out.match(/squeezr_expand\("([0-9a-f]{6})"\)/)?.[1];
+ expect(id).toBeTruthy();
+ const { retrieveOriginal } = await import('../expand.js');
+ expect(retrieveOriginal(id)).toBe(big);
+ });
+ });
  // ── Base pipeline ─────────────────────────────────────────────────────────────
  describe('preprocess - base pipeline', () => {
  it('strips ANSI escape codes', () => {
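The `minifyJson - big integer safety` test added in this hunk guards against a well-known JavaScript pitfall: `JSON.parse` turns every number into a double, so integers beyond 2^53 are silently rounded when the text is re-serialised. The sketch below shows one way such a guard could work. It is an assumption-laden illustration, not the package's code: `minifyJsonSafely` and `BIG_INT` are hypothetical names, and the regex is deliberately conservative (it also trips on long digit runs inside strings).

```typescript
// A run of 16 or more digits that is not part of a longer decimal number.
// Conservative on purpose: a match anywhere in the text disables minification.
const BIG_INT = /(?<![\d.])\d{16,}(?![\d.eE])/;

// Hypothetical guard: minify JSON only when a parse/stringify round trip
// cannot change any numeric literal.
function minifyJsonSafely(text: string): string {
  if (BIG_INT.test(text)) return text; // JSON.parse would round the value
  try {
    return JSON.stringify(JSON.parse(text));
  } catch {
    return text; // not valid JSON, so there is nothing to minify
  }
}
```

Returning the input unchanged trades a few bytes of whitespace for the guarantee that ids and other large integers survive verbatim.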
@@ -74,25 +146,25 @@ describe('preprocessRatio', () => {
74
146
  });
75
147
  // ── Git diff ──────────────────────────────────────────────────────────────────
76
148
  describe('preprocessForTool - git diff', () => {
77
- const sampleDiff = `diff --git a/src/foo.ts b/src/foo.ts
78
- index abc123..def456 100644
79
- --- a/src/foo.ts
80
- +++ b/src/foo.ts
81
- @@ -1,7 +1,7 @@
82
- import { foo } from './bar'
83
-
84
- -const x = 1
85
- +const x = 2
86
-
87
- function hello() {
88
- return x
89
- }
90
- @@ -10,5 +10,5 @@
91
- context before
92
- -old line
93
- +new line
94
- context after
95
- more context
149
+ const sampleDiff = `diff --git a/src/foo.ts b/src/foo.ts
150
+ index abc123..def456 100644
151
+ --- a/src/foo.ts
152
+ +++ b/src/foo.ts
153
+ @@ -1,7 +1,7 @@
154
+ import { foo } from './bar'
155
+
156
+ -const x = 1
157
+ +const x = 2
158
+
159
+ function hello() {
160
+ return x
161
+ }
162
+ @@ -10,5 +10,5 @@
163
+ context before
164
+ -old line
165
+ +new line
166
+ context after
167
+ more context
96
168
  even more context`;
97
169
  it('keeps diff headers', () => {
98
170
  const out = preprocessForTool(sampleDiff, 'Bash');
@@ -121,24 +193,24 @@ index abc123..def456 100644
121
193
  });
122
194
  // ── Cargo test ────────────────────────────────────────────────────────────────
123
195
  describe('preprocessForTool - cargo test', () => {
124
- const allPassing = `running 5 tests
125
- test foo::test_a ... ok
126
- test foo::test_b ... ok
127
- test foo::test_c ... ok
128
- test foo::test_d ... ok
129
- test foo::test_e ... ok
130
-
196
+ const allPassing = `running 5 tests
197
+ test foo::test_a ... ok
198
+ test foo::test_b ... ok
199
+ test foo::test_c ... ok
200
+ test foo::test_d ... ok
201
+ test foo::test_e ... ok
202
+
131
203
  test result: ok. 5 passed; 0 failed; 0 ignored`;
132
- const withFailures = `running 3 tests
133
- test foo::test_a ... ok
134
- test foo::test_b ... FAILED
135
- test foo::test_c ... ok
136
-
137
- failures:
138
-
139
- ---- foo::test_b stdout ----
140
- thread 'foo::test_b' panicked at 'assertion failed', src/lib.rs:10
141
-
204
+ const withFailures = `running 3 tests
205
+ test foo::test_a ... ok
206
+ test foo::test_b ... FAILED
207
+ test foo::test_c ... ok
208
+
209
+ failures:
210
+
211
+ ---- foo::test_b stdout ----
212
+ thread 'foo::test_b' panicked at 'assertion failed', src/lib.rs:10
213
+
142
214
  test result: FAILED. 2 passed; 1 failed`;
143
215
  it('returns only summary when all tests pass', () => {
144
216
  const out = preprocessForTool(allPassing, 'Bash');
@@ -161,14 +233,14 @@ test result: FAILED. 2 passed; 1 failed`;
161
233
  });
162
234
  // ── Cargo build / clippy ──────────────────────────────────────────────────────
163
235
  describe('preprocessForTool - cargo build errors', () => {
164
- const buildOutput = ` Compiling foo v0.1.0
165
- Compiling bar v1.2.3
166
- error[E0308]: mismatched types
167
- --> src/main.rs:5:10
168
- |
169
- 5 | let x: i32 = "hello";
170
- | --- ^^^^^^^ expected i32, found &str
171
- |
236
+ const buildOutput = ` Compiling foo v0.1.0
237
+ Compiling bar v1.2.3
238
+ error[E0308]: mismatched types
239
+ --> src/main.rs:5:10
240
+ |
241
+ 5 | let x: i32 = "hello";
242
+ | --- ^^^^^^^ expected i32, found &str
243
+ |
172
244
  error: aborting due to 1 previous error`;
173
245
  it('removes Compiling lines', () => {
174
246
  const out = preprocessForTool(buildOutput, 'Bash');
@@ -188,23 +260,23 @@ error: aborting due to 1 previous error`;
188
260
  });
189
261
  // ── Vitest ────────────────────────────────────────────────────────────────────
190
262
  describe('preprocessForTool - vitest', () => {
191
- const allPass = `✓ src/foo.test.ts (3)
192
- ✓ test one 5ms
193
- ✓ test two 3ms
194
- ✓ test three 2ms
195
-
196
- Test Files 1 passed (1)
197
- Tests 3 passed (3)
263
+ const allPass = `✓ src/foo.test.ts (3)
264
+ ✓ test one 5ms
265
+ ✓ test two 3ms
266
+ ✓ test three 2ms
267
+
268
+ Test Files 1 passed (1)
269
+ Tests 3 passed (3)
198
270
  Duration 120ms`;
199
- const withFail = `✓ src/foo.test.ts (2)
200
- × src/bar.test.ts (1)
201
- × failing test 10ms
202
- AssertionError: expected 1 to equal 2
203
- - Expected: 2
204
- + Received: 1
205
-
206
- Test Files 1 failed | 1 passed (2)
207
- Tests 1 failed | 2 passed (3)
271
+ const withFail = `✓ src/foo.test.ts (2)
272
+ × src/bar.test.ts (1)
273
+ × failing test 10ms
274
+ AssertionError: expected 1 to equal 2
275
+ - Expected: 2
276
+ + Received: 1
277
+
278
+ Test Files 1 failed | 1 passed (2)
279
+ Tests 1 failed | 2 passed (3)
208
280
  Duration 150ms`;
209
281
  it('returns only summary when all tests pass', () => {
210
282
  const out = preprocessForTool(allPass, 'Bash');
@@ -230,8 +302,8 @@ Duration 150ms`;
230
302
  });
231
303
  // ── TypeScript ────────────────────────────────────────────────────────────────
232
304
  describe('preprocessForTool - tsc errors', () => {
233
- const tscOutput = `src/foo.ts(10,5): error TS2345: Argument of type 'string' is not assignable to parameter of type 'number'.
234
- src/foo.ts(20,3): error TS2551: Property 'bar' does not exist on type 'Foo'.
305
+ const tscOutput = `src/foo.ts(10,5): error TS2345: Argument of type 'string' is not assignable to parameter of type 'number'.
306
+ src/foo.ts(20,3): error TS2551: Property 'bar' does not exist on type 'Foo'.
235
307
  src/bar.ts(5,10): error TS2304: Cannot find name 'baz'.`;
236
308
  it('groups errors by file', () => {
237
309
  const out = preprocessForTool(tscOutput, 'Bash');
@@ -249,13 +321,13 @@ src/bar.ts(5,10): error TS2304: Cannot find name 'baz'.`;
249
321
  });
250
322
  // ── ESLint ────────────────────────────────────────────────────────────────────
251
323
  describe('preprocessForTool - eslint', () => {
252
- const eslintOutput = `/src/foo.ts
253
- 10:5 error 'x' is defined but never used no-unused-vars
254
- 20:1 warning Unexpected console statement no-console
255
-
256
- /src/bar.ts
257
- 5:10 error Missing semicolon semi
258
-
324
+ const eslintOutput = `/src/foo.ts
325
+ 10:5 error 'x' is defined but never used no-unused-vars
326
+ 20:1 warning Unexpected console statement no-console
327
+
328
+ /src/bar.ts
329
+ 5:10 error Missing semicolon semi
330
+
259
331
  ✖ 3 problems (2 errors, 1 warning)`;
260
332
  it('keeps error/warning lines', () => {
261
333
  const out = preprocessForTool(eslintOutput, 'Bash');
@@ -288,7 +360,7 @@ describe('preprocessForTool - pnpm install', () => {
288
360
  });
289
361
  // ── Docker ────────────────────────────────────────────────────────────────────
290
362
  describe('preprocessForTool - docker ps', () => {
291
- const dockerPs = `CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
363
+ const dockerPs = `CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
292
364
  abc123def456 nginx:latest "/docker-e…" 2 hours ago Up 2 hours 80/tcp web`;
293
365
  it('keeps header and container rows', () => {
294
366
  const out = preprocessForTool(dockerPs, 'Bash');
@@ -312,15 +384,15 @@ describe('preprocessForTool - long bash output (generic truncation)', () => {
312
384
  });
313
385
  // ── gh CLI ────────────────────────────────────────────────────────────────────
314
386
  describe('preprocessForTool - gh pr', () => {
315
- const ghPr = `title: Fix the bug
316
- state: OPEN
317
- author: sergioramosv
318
- url: https://github.com/sergioramosv/squeezr/pull/5
319
- number: 5
320
- labels: bug, help wanted
321
-
322
- This is a long PR body with lots of text explaining the changes
323
- in great detail that we don't really need in a summary.
387
+ const ghPr = `title: Fix the bug
388
+ state: OPEN
389
+ author: sergioramosv
390
+ url: https://github.com/sergioramosv/squeezr/pull/5
391
+ number: 5
392
+ labels: bug, help wanted
393
+
394
+ This is a long PR body with lots of text explaining the changes
395
+ in great detail that we don't really need in a summary.
324
396
  More text here. Even more text. Lots and lots of text.`;
325
397
  it('keeps key metadata fields', () => {
326
398
  const out = preprocessForTool(ghPr, 'Bash');
@@ -379,6 +451,21 @@ describe('preprocessForTool - Read tool', () => {
  expect(out).toContain('omitted');
  expect(out.length).toBeLessThan(lockfile.length / 10);
  });
+ it('does NOT misclassify a source file that merely mentions lockfile patterns', () => {
+ // Regression: a 600-line source file that contains the lockfile signature
+ // strings ONCE (as the detector's own patterns) must not be nuked as a
+ // lockfile. Real harm: this destroys content with no expand copy.
+ const src = [
+ `function looksLikeLockfile(text) {`,
+ ` return text.includes('integrity sha') || text.includes('"resolved"') || text.includes('# yarn lockfile')`,
+ `}`,
+ ...Array.from({ length: 600 }, (_, i) => `const value${i} = compute(${i})`),
+ ].join('\n');
+ const out = preprocessForTool(src, 'Read');
+ expect(out).not.toContain('lockfile —'); // not the lockfile-omitted summary
+ // It is a >500-line TS file → semantic structure extraction keeps signatures
+ expect(out).toContain('looksLikeLockfile');
+ });
  });
  // ── Glob tool ─────────────────────────────────────────────────────────────────
  describe('preprocessForTool - Glob tool', () => {
@@ -397,18 +484,18 @@ describe('preprocessForTool - Glob tool', () => {
397
484
  });
398
485
  // ── git status ────────────────────────────────────────────────────────────────
399
486
  describe('preprocessForTool - git status', () => {
400
- const status = `On branch main
401
- Your branch is up to date with 'origin/main'.
402
-
403
- Changes not staged for commit:
404
- (use "git add <file>..." to update what will be committed)
405
- modified: src/foo.ts
406
- modified: src/bar.ts
407
-
408
- Untracked files:
409
- (use "git add <file>..." to include in what will be committed)
410
- new-file.ts
411
-
487
+ const status = `On branch main
488
+ Your branch is up to date with 'origin/main'.
489
+
490
+ Changes not staged for commit:
491
+ (use "git add <file>..." to update what will be committed)
492
+ modified: src/foo.ts
493
+ modified: src/bar.ts
494
+
495
+ Untracked files:
496
+ (use "git add <file>..." to include in what will be committed)
497
+ new-file.ts
498
+
412
499
  no changes added to commit`;
413
500
  it('shows branch name', () => {
414
501
  const out = preprocessForTool(status, 'Bash');
@@ -459,14 +546,14 @@ describe('preprocessForTool - git log --oneline', () => {
459
546
  });
460
547
  // ── pnpm list ─────────────────────────────────────────────────────────────────
461
548
  describe('preprocessForTool - pnpm/npm list', () => {
462
- const npmList = `my-app@1.0.0
463
- ├── express@4.18.2
464
- │ ├── accepts@1.3.8
465
- │ │ └── mime-types@2.1.35
466
- │ └── body-parser@1.20.2
467
- ├── react@18.2.0
468
- │ └── scheduler@0.23.0
469
- └── typescript@5.8.3
549
+ const npmList = `my-app@1.0.0
550
+ ├── express@4.18.2
551
+ │ ├── accepts@1.3.8
552
+ │ │ └── mime-types@2.1.35
553
+ │ └── body-parser@1.20.2
554
+ ├── react@18.2.0
555
+ │ └── scheduler@0.23.0
556
+ └── typescript@5.8.3
470
557
  └── typescript@5.8.3 deduped`;
471
558
  it('keeps direct dependencies', () => {
472
559
  const out = preprocessForTool(npmList, 'Bash');
@@ -508,15 +595,15 @@ describe('preprocessForTool - pnpm outdated', () => {
508
595
  });
509
596
  // ── prisma ────────────────────────────────────────────────────────────────────
510
597
  describe('preprocessForTool - prisma', () => {
511
- const prismaOutput = `Prisma schema loaded from prisma/schema.prisma
512
- Environment variables loaded from .env
513
-
514
- ✔ Generated Prisma Client (v5.10.2) to ./node_modules/@prisma/client in 127ms
515
-
516
- ┌─────────────────────────────────────────────────────────┐
517
- │ Starter Prisma Tip: │
518
- │ Understand your Prisma schema better with the │
519
- │ Prisma VS Code Extension, for free! │
598
+ const prismaOutput = `Prisma schema loaded from prisma/schema.prisma
599
+ Environment variables loaded from .env
600
+
601
+ ✔ Generated Prisma Client (v5.10.2) to ./node_modules/@prisma/client in 127ms
602
+
603
+ ┌─────────────────────────────────────────────────────────┐
604
+ │ Starter Prisma Tip: │
605
+ │ Understand your Prisma schema better with the │
606
+ │ Prisma VS Code Extension, for free! │
520
607
  └─────────────────────────────────────────────────────────┘`;
521
608
  it('keeps important output lines', () => {
522
609
  const out = preprocessForTool(prismaOutput, 'Bash');
@@ -551,18 +638,18 @@ describe('preprocessForTool - gh pr checks', () => {
551
638
  });
552
639
  // ── Playwright ────────────────────────────────────────────────────────────────
553
640
  describe('preprocessForTool - playwright', () => {
554
- const withFail = `Running 5 tests using 2 workers
555
-
556
- ✘ tests/login.spec.ts:12:5 › Login › should log in [chromium] (5.2s)
557
-
558
- Error: Timed out 5000ms waiting for expect(locator).toBeVisible()
559
- Locator: getByRole('button', { name: 'Submit' })
560
- Expected: visible
561
- Received: hidden
562
- at tests/login.spec.ts:15:22
563
-
564
- ✓ tests/home.spec.ts:5:5 › Home › loads [chromium] (1.1s)
565
-
641
+ const withFail = `Running 5 tests using 2 workers
642
+
643
+ ✘ tests/login.spec.ts:12:5 › Login › should log in [chromium] (5.2s)
644
+
645
+ Error: Timed out 5000ms waiting for expect(locator).toBeVisible()
646
+ Locator: getByRole('button', { name: 'Submit' })
647
+ Expected: visible
648
+ Received: hidden
649
+ at tests/login.spec.ts:15:22
650
+
651
+ ✓ tests/home.spec.ts:5:5 › Home › loads [chromium] (1.1s)
652
+
566
653
  1 failed, 4 passed (12s)`;
567
654
  it('keeps failure blocks', () => {
568
655
  const out = preprocessForTool(withFail, 'Bash');
@@ -580,11 +667,11 @@ describe('preprocessForTool - playwright', () => {
580
667
  });
581
668
  // ── Python / pytest ───────────────────────────────────────────────────────────
582
669
  describe('preprocessForTool - python traceback', () => {
583
- const traceback = `Traceback (most recent call last):
584
- File "app.py", line 42, in process
585
- result = calculate(x)
586
- File "app.py", line 17, in calculate
587
- return x / 0
670
+ const traceback = `Traceback (most recent call last):
671
+ File "app.py", line 42, in process
672
+ result = calculate(x)
673
+ File "app.py", line 17, in calculate
674
+ return x / 0
588
675
  ZeroDivisionError: division by zero`;
589
676
  it('keeps traceback lines', () => {
590
677
  const out = preprocessForTool(traceback, 'Bash');
@@ -600,11 +687,11 @@ ZeroDivisionError: division by zero`;
600
687
  });
601
688
  // ── Go test ───────────────────────────────────────────────────────────────────
602
689
  describe('preprocessForTool - go test', () => {
603
- const goOutput = `--- PASS: TestAdd (0.00s)
604
- --- FAIL: TestDivide (0.00s)
605
- calc_test.go:15: expected 5, got 0
606
- --- PASS: TestMultiply (0.00s)
607
- FAIL
690
+ const goOutput = `--- PASS: TestAdd (0.00s)
691
+ --- FAIL: TestDivide (0.00s)
692
+ calc_test.go:15: expected 5, got 0
693
+ --- PASS: TestMultiply (0.00s)
694
+ FAIL
608
695
  FAIL\tgithub.com/user/calc\t0.003s`;
609
696
  it('keeps failure lines', () => {
610
697
  const out = preprocessForTool(goOutput, 'Bash');
@@ -623,18 +710,18 @@ FAIL\tgithub.com/user/calc\t0.003s`;
623
710
  });
624
711
  // ── Terraform ─────────────────────────────────────────────────────────────────
625
712
  describe('preprocessForTool - terraform', () => {
626
- const planOutput = `Terraform will perform the following actions:
627
-
628
- # aws_instance.web will be created
629
- + resource "aws_instance" "web" {
630
- + ami = "ami-0c55b159cbfafe1f0"
631
- + instance_type = "t2.micro"
632
- ... (many attributes)
633
- }
634
-
635
- # aws_s3_bucket.data must be replaced
636
- -/+ resource "aws_s3_bucket" "data" {
637
-
713
+ const planOutput = `Terraform will perform the following actions:
714
+
715
+ # aws_instance.web will be created
716
+ + resource "aws_instance" "web" {
717
+ + ami = "ami-0c55b159cbfafe1f0"
718
+ + instance_type = "t2.micro"
719
+ ... (many attributes)
720
+ }
721
+
722
+ # aws_s3_bucket.data must be replaced
723
+ -/+ resource "aws_s3_bucket" "data" {
724
+
638
725
  Plan: 1 to add, 0 to change, 1 to destroy.`;
639
726
  it('keeps resource change summary lines', () => {
640
727
  const out = preprocessForTool(planOutput, 'Bash');