context-compress 2026.3.21 → 2026.3.22

@@ -6,6 +6,8 @@
6
6
  > provides a detailed before/after comparison for 12 common operations,
7
7
  > and addresses the natural question: "doesn't less tokens mean losing context?"
8
8
 
9
+ **Version**: 2026.3.21 | **Last updated**: 2026-03-22
10
+
9
11
  ---
10
12
 
11
13
  ## Table of Contents
@@ -17,6 +19,7 @@
17
19
  - [Context Window Impact](#context-window-impact)
18
20
  - [Cost Impact](#cost-impact)
19
21
  - [Deep Dive: How Playwright Snapshot Goes from 56KB to 299B](#deep-dive-how-playwright-snapshot-goes-from-56kb-to-299b)
22
+ - [Security and Reliability](#security-and-reliability)
20
23
  - [FAQ: Doesn't Less Tokens Mean Losing Context?](#faq-doesnt-less-tokens-mean-losing-context)
21
24
 
22
25
  ---
@@ -26,16 +29,18 @@
26
29
  Every byte of tool output that enters Claude Code's context window **consumes tokens permanently**. In a typical coding session:
27
30
 
28
31
  ```
29
- Read a bundled file → 776KB → 194,076 tokens
30
- Playwright browser snapshot → 56KB → 14,000 tokens
31
- npm test (42 tests) → 4KB → 935 tokens
32
- git diff (3 commits) → 8KB → 2,000 tokens
32
+ Read a bundled file → 776KB → 155K-259K tokens
33
+ Playwright browser snapshot → 56KB → 11K-19K tokens
34
+ npm test (42 tests) → 4KB → 748-1,246 tokens
35
+ git diff (3 commits) → 8KB → 1,600-2,667 tokens
33
36
  ─────────────────
34
- Total: 211,011 tokens
35
- already exceeds 200K window
37
+ Total: 169K-282K tokens
38
+ can overflow 200K window
36
39
  ```
37
40
 
38
- With just 4 operations, you've **overflowed the entire context window**. Earlier conversation messages get compressed or lost. The agent forgets what you asked. Quality degrades.
41
+ > **Token estimation**: 1 token ≈ 3-5 bytes depending on content. We use a range (bytes/5 to bytes/3) because Anthropic does not publish a local tokenizer for Claude 3+ models.
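
The range rule in the added note can be sketched as follows. This is a minimal illustration, not the package's code: `estimateTokenRange` and `TokenRange` are hypothetical names, and rounding to the nearest integer is an assumption that happens to reproduce the figures in the new table rows.

```typescript
// Hypothetical helper (not from the package): estimate a token range from a
// byte count using the bytes/5 (low) and bytes/3 (high) bounds from the note.
interface TokenRange {
  low: number;
  high: number;
}

function estimateTokenRange(bytes: number): TokenRange {
  return {
    low: Math.round(bytes / 5),  // optimistic bound: ~5 bytes per token
    high: Math.round(bytes / 3), // pessimistic bound: ~3 bytes per token
  };
}

// The npm test output from the benchmark (3,739 bytes).
const npmTestRange = estimateTokenRange(3739);
console.log(`${npmTestRange.low}-${npmTestRange.high} tokens`); // 748-1246 tokens
```

Reporting both bounds instead of a single bytes/4 figure is what turns "already exceeds 200K window" into the more cautious "can overflow 200K window" in this hunk.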
42
+
43
+ With just 4 operations, you risk **overflowing the entire context window**. Earlier conversation messages get compressed or lost. The agent forgets what you asked. Quality degrades.
39
44
 
40
45
  The worst part: **99% of that tool output is noise** — import statements, boilerplate, minified code, irrelevant test output. The agent doesn't benefit from seeing it. It just crowds out the conversation.
41
46
 
@@ -47,7 +52,7 @@ context-compress doesn't delete data — it **defers** it. All data is preserved
47
52
 
48
53
  ### Layer 1: Sandbox Execution
49
54
 
50
- The agent writes code to process data. Only `console.log()` output enters context.
55
+ The agent writes code to process data. Only `console.log()` output enters context. 11 languages supported: JavaScript, TypeScript, Python, Shell, Ruby, Go, Rust, PHP, Perl, R, Elixir.
51
56
 
52
57
  ```
53
58
  execute_file("server.bundle.mjs", code: `
@@ -61,9 +66,11 @@ Context: 420 bytes (the extracted schema)
61
66
 
62
67
  The agent isn't blindly losing context — it's **choosing** what matters via code.
63
68
 
69
+ **Safeguards**: Code input limited to 1MB. Subprocess timeout (default 30s). Output hard cap (100MB). Process group kill on timeout. Concurrent executions limited to 8 globally.
70
+
64
71
  ### Layer 2: FTS5 Knowledge Base
65
72
 
66
- Full data is stored in a searchable SQLite FTS5 database with BM25 ranking, Porter stemming, and fuzzy matching. The agent can query it at any time.
73
+ Full data is stored in a searchable SQLite FTS5 database with BM25 ranking, Porter stemming, trigram matching, and Levenshtein fuzzy correction (with early-exit optimization).
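
The diff does not show how the early-exit optimization is implemented. One common form, sketched here under that caveat, is a row-by-row Levenshtein computation that gives up once every cell in the current row exceeds the allowed distance. The function name `boundedLevenshtein` and the `null` return convention are assumptions made for illustration.

```typescript
// Sketch only: Levenshtein distance with an early exit. Row minimums never
// decrease from one row to the next, so once a whole row exceeds maxDistance
// the final distance must exceed it too.
function boundedLevenshtein(a: string, b: string, maxDistance: number): number | null {
  // The length gap is a lower bound on the distance.
  if (Math.abs(a.length - b.length) > maxDistance) return null;

  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    let rowMin = i;
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      const value = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution or match
      );
      curr.push(value);
      if (value < rowMin) rowMin = value;
    }
    if (rowMin > maxDistance) return null; // early exit: cannot recover
    prev = curr;
  }
  const distance = prev[b.length];
  return distance <= maxDistance ? distance : null;
}

console.log(boundedLevenshtein("snapsht", "snapshot", 2)); // 1
```

A small `maxDistance` keeps the fuzzy layer cheap, which fits its role as the last fallback after stemming and trigram matching.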
67
74
 
68
75
  ```
69
76
  index(path: "snapshot.md") → 56KB stored, 42 chunks created
@@ -74,6 +81,8 @@ search("order table row headers") → 180B match returned
74
81
 
75
82
  Data is **not lost**. It's **indexed and searchable on demand**.
76
83
 
84
+ **Persistence option**: Set `persistDb: true` in config to survive MCP server restarts.
85
+
77
86
  ### Layer 3: Intent-Based Auto-Filter
78
87
 
79
88
  When the agent provides an `intent` parameter, large outputs are automatically filtered:
@@ -93,103 +102,107 @@ Small outputs are **never compressed**. Large outputs are filtered by what was a
93
102
 
94
103
  The following comparison uses realistic output sizes measured from the context-compress project itself.
95
104
 
96
- > **Token calculation**: 1 token ≈ 4 bytes (English text average)
105
+ > **Token calculation**: 1 token ≈ 3-5 bytes. The "Tokens" column shows the midpoint estimate (bytes/4). See [Cost Impact](#cost-impact) for range-based calculations.
97
106
 
98
107
  ### 1. Read large source file (server.ts ~21KB)
99
108
 
100
- | | Bytes | Tokens | Method |
109
+ | | Bytes | Tokens (est.) | Method |
101
110
  |:--|--:|--:|:--|
102
- | **Before** | 21,000 | 5,250 | `Read` tool → full file dumped into context |
103
- | **After** | 350 | 88 | `execute_file` → agent prints only what it needs |
104
- | **Saved** | | **5,162** | **98.3% reduction** |
111
+ | **Before** | 21,000 | ~5,250 | `Read` tool → full file dumped into context |
112
+ | **After** | 350 | ~88 | `execute_file` → agent prints only what it needs |
113
+ | **Saved** | | **~5,162** | **98.3% reduction** |
105
114
 
106
115
  ### 2. Read bundled file (server.bundle.mjs ~776KB)
107
116
 
108
- | | Bytes | Tokens | Method |
117
+ | | Bytes | Tokens (est.) | Method |
109
118
  |:--|--:|--:|:--|
110
- | **Before** | 776,304 | 194,076 | `Read` tool → full file in context (truncated at 2000 lines) |
111
- | **After** | 420 | 105 | `execute_file` → extract specific function/pattern |
112
- | **Saved** | | **193,971** | **99.9% reduction** |
119
+ | **Before** | 776,304 | ~194,076 | `Read` tool → full file in context (truncated at 2000 lines) |
120
+ | **After** | 420 | ~105 | `execute_file` → extract specific function/pattern |
121
+ | **Saved** | | **~193,971** | **99.9% reduction** |
113
122
 
114
123
  ### 3. npm test output (42 tests, ~3.7KB)
115
124
 
116
- | | Bytes | Tokens | Method |
125
+ | | Bytes | Tokens (est.) | Method |
117
126
  |:--|--:|--:|:--|
118
- | **Before** | 3,739 | 935 | `Bash` → full stdout in context |
119
- | **After** | 180 | 45 | `execute` with `intent: "failing tests"` → summary only |
120
- | **Saved** | | **890** | **95.2% reduction** |
127
+ | **Before** | 3,739 | ~935 | `Bash` → full stdout in context |
128
+ | **After** | 180 | ~45 | `execute` with `intent: "failing tests"` → summary only |
129
+ | **Saved** | | **~890** | **95.2% reduction** |
121
130
 
122
131
  ### 4. git log (full history, ~5KB)
123
132
 
124
- | | Bytes | Tokens | Method |
133
+ | | Bytes | Tokens (est.) | Method |
125
134
  |:--|--:|--:|:--|
126
- | **Before** | 5,000 | 1,250 | `Bash git log` → all commits in context |
127
- | **After** | 250 | 63 | `execute` + `search` for specific commits |
128
- | **Saved** | | **1,187** | **95.0% reduction** |
135
+ | **Before** | 5,000 | ~1,250 | `Bash git log` → all commits in context |
136
+ | **After** | 250 | ~63 | `execute` + `search` for specific commits |
137
+ | **Saved** | | **~1,187** | **95.0% reduction** |
129
138
 
130
139
  ### 5. git diff (3 commits, ~8KB)
131
140
 
132
- | | Bytes | Tokens | Method |
141
+ | | Bytes | Tokens (est.) | Method |
133
142
  |:--|--:|--:|:--|
134
- | **Before** | 8,000 | 2,000 | `Bash git diff` → full patch in context |
135
- | **After** | 400 | 100 | `execute` + `search` for changed functions |
136
- | **Saved** | | **1,900** | **95.0% reduction** |
143
+ | **Before** | 8,000 | ~2,000 | `Bash git diff` → full patch in context |
144
+ | **After** | 400 | ~100 | `execute` + `search` for changed functions |
145
+ | **Saved** | | **~1,900** | **95.0% reduction** |
137
146
 
138
147
  ### 6. grep across codebase (~1.4KB)
139
148
 
140
- | | Bytes | Tokens | Method |
149
+ | | Bytes | Tokens (est.) | Method |
141
150
  |:--|--:|--:|:--|
142
- | **Before** | 1,442 | 361 | `Grep` → all matching lines in context |
143
- | **After** | 1,442 | 361 | Same — small output passes through as-is |
151
+ | **Before** | 1,442 | ~361 | `Grep` → all matching lines in context |
152
+ | **After** | 1,442 | ~361 | Same — small output passes through as-is |
144
153
  | **Saved** | | **0** | **0% — no overhead for small outputs** |
145
154
 
146
155
  ### 7. Playwright browser_snapshot (~56KB)
147
156
 
148
- | | Bytes | Tokens | Method |
157
+ | | Bytes | Tokens (est.) | Method |
149
158
  |:--|--:|--:|:--|
150
- | **Before** | 56,000 | 14,000 | `browser_snapshot` → full accessibility tree in context |
151
- | **After** | 299 | 75 | save → `index` → `search` for specific elements |
152
- | **Saved** | | **13,925** | **99.5% reduction** |
159
+ | **Before** | 56,000 | ~14,000 | `browser_snapshot` → full accessibility tree in context |
160
+ | **After** | 299 | ~75 | save → `index` → `search` for specific elements |
161
+ | **Saved** | | **~13,925** | **99.5% reduction** |
153
162
 
154
163
  ### 8. curl API response (JSON ~12KB)
155
164
 
156
- | | Bytes | Tokens | Method |
165
+ | | Bytes | Tokens (est.) | Method |
157
166
  |:--|--:|--:|:--|
158
- | **Before** | 12,000 | 3,000 | `Bash curl` → full JSON response in context |
159
- | **After** | 350 | 88 | `execute` → extract specific fields with code |
160
- | **Saved** | | **2,912** | **97.1% reduction** |
167
+ | **Before** | 12,000 | ~3,000 | `Bash curl` → full JSON response in context |
168
+ | **After** | 350 | ~88 | `execute` → extract specific fields with code |
169
+ | **Saved** | | **~2,912** | **97.1% reduction** |
161
170
 
162
171
  ### 9. fetch_and_index (web docs ~45KB)
163
172
 
164
- | | Bytes | Tokens | Method |
173
+ | | Bytes | Tokens (est.) | Method |
165
174
  |:--|--:|--:|:--|
166
- | **Before** | 45,000 | 11,250 | `WebFetch` → full page markdown in context |
167
- | **After** | 3,000 | 750 | `fetch_and_index` → 3KB preview + rest searchable |
168
- | **Saved** | | **10,500** | **93.3% reduction** |
175
+ | **Before** | 45,000 | ~11,250 | `WebFetch` → full page markdown in context |
176
+ | **After** | 3,000 | ~750 | `fetch_and_index` → 3KB preview + rest searchable |
177
+ | **Saved** | | **~10,500** | **93.3% reduction** |
178
+
179
+ **Security**: SSRF protection with DNS rebinding prevention, IP pinning, redirect blocking, and 10MB response size limit. Prompt injection detection on fetched content.
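
As a rough illustration of one piece of such a defense, the sketch below rejects IPv4 addresses in loopback, private, and link-local ranges. It is not the package's implementation: `isBlockedIPv4` is a hypothetical name, and the DNS rebinding, IP pinning, redirect, and IPv6 handling named above are out of scope here.

```typescript
// Sketch only: classify an IPv4 literal as unsafe to fetch from.
// Anything that is not a well-formed dotted quad is treated as unsafe.
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".");
  if (parts.length !== 4) return true;

  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return true;

  const [a, b] = octets;
  return (
    a === 0 ||                            // 0.0.0.0/8 (unspecified)
    a === 10 ||                           // 10.0.0.0/8 (private)
    a === 127 ||                          // 127.0.0.0/8 (loopback)
    (a === 169 && b === 254) ||           // 169.254.0.0/16 (link-local)
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12 (private)
    (a === 192 && b === 168)              // 192.168.0.0/16 (private)
  );
}

console.log(isBlockedIPv4("169.254.169.254")); // true
console.log(isBlockedIPv4("8.8.8.8"));         // false
```

A check like this is only useful when it runs against the address the connection actually uses, which is presumably why the changelog pairs it with IP pinning.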
169
180
 
170
181
  ### 10. batch_execute (5 commands, ~25KB total)
171
182
 
172
- | | Bytes | Tokens | Method |
183
+ | | Bytes | Tokens (est.) | Method |
173
184
  |:--|--:|--:|:--|
174
- | **Before** | 25,000 | 6,250 | 5x `Bash` → all output in context |
175
- | **After** | 1,500 | 375 | `batch_execute` + search across all in 1 call |
176
- | **Saved** | | **5,875** | **94.0% reduction** |
185
+ | **Before** | 25,000 | ~6,250 | 5x `Bash` → all output in context |
186
+ | **After** | 1,500 | ~375 | `batch_execute` + search across all in 1 call |
187
+ | **Saved** | | **~5,875** | **94.0% reduction** |
188
+
189
+ **Performance**: Commands run with bounded concurrency (max 4 parallel). Global execution limit of 8 prevents resource exhaustion.
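
Bounded concurrency of this kind is often built as a small worker pool. The sketch below shows the general technique and is not taken from the package: `runBounded` is a hypothetical name, and the tasks stand in for whatever `batch_execute` actually runs.

```typescript
// Sketch only: run async tasks with at most `limit` in flight at once.
// Results are returned in the same order as the input tasks.
async function runBounded<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array<T>(tasks.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      // Claiming an index is safe without locks: JavaScript runs this
      // synchronous step to completion before any other worker resumes.
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `limit` set to 4, five commands would start as four immediately and a fifth as soon as any slot frees up. A separate global counter would be needed to enforce the limit of 8 across concurrent calls.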
177
190
 
178
191
  ### 11. Read CSV/JSON data file (~100KB)
179
192
 
180
- | | Bytes | Tokens | Method |
193
+ | | Bytes | Tokens (est.) | Method |
181
194
  |:--|--:|--:|:--|
182
- | **Before** | 100,000 | 25,000 | `Read` → file contents in context |
183
- | **After** | 500 | 125 | `execute_file` → extract/aggregate specific data |
184
- | **Saved** | | **24,875** | **99.5% reduction** |
195
+ | **Before** | 100,000 | ~25,000 | `Read` → file contents in context |
196
+ | **After** | 500 | ~125 | `execute_file` → extract/aggregate specific data |
197
+ | **Saved** | | **~24,875** | **99.5% reduction** |
185
198
 
186
199
  ### 12. npm install log (~15KB)
187
200
 
188
- | | Bytes | Tokens | Method |
201
+ | | Bytes | Tokens (est.) | Method |
189
202
  |:--|--:|--:|:--|
190
- | **Before** | 15,000 | 3,750 | `Bash npm install` → full install log in context |
191
- | **After** | 200 | 50 | `execute` with `intent: "errors"` → only issues shown |
192
- | **Saved** | | **3,700** | **98.7% reduction** |
203
+ | **Before** | 15,000 | ~3,750 | `Bash npm install` → full install log in context |
204
+ | **After** | 200 | ~50 | `execute` with `intent: "errors"` → only issues shown |
205
+ | **Saved** | | **~3,700** | **98.7% reduction** |
193
206
 
194
207
  ---
195
208
 
@@ -198,10 +211,10 @@ The following comparison uses realistic output sizes measured from the context-c
198
211
  Combining all 12 operations from a single coding session:
199
212
 
200
213
  ```
201
- BEFORE: 1,043 KB → 267,121 tokens consumed
202
- AFTER: 9 KB → 2,223 tokens consumed
214
+ BEFORE: 1,043 KB → ~261K tokens consumed (bytes/4 midpoint)
215
+ AFTER: 9 KB → ~2.2K tokens consumed
203
216
  ────────────────────────
204
- SAVED: 1,035 KB → 264,898 tokens
217
+ SAVED: 1,035 KB → ~259K tokens
205
218
  REDUCTION: 99.2%
206
219
  ```
207
220
 
@@ -216,42 +229,43 @@ Claude Code uses a 200K token context window.
216
229
  │ 200,000 token context window │
217
230
  │ │
218
231
  │ WITHOUT context-compress: │
219
- │ ████████████████████████████████████████████████████ 133.6%
232
+ │ ████████████████████████████████████████████████████ ~131%
220
233
  │ ← 12 operations OVERFLOW the window. Conversation lost. │
221
234
  │ │
222
235
  │ WITH context-compress: │
223
- │ █ 1.1%
224
- │ ← 12 operations use 1.1%. 98.9% free for conversation.
236
+ │ █ ~1.1%
237
+ │ ← 12 operations use ~1.1%. ~98.9% free for conversation.
225
238
  └─────────────────────────────────────────────────────────────┘
226
239
  ```
227
240
 
228
241
  | Metric | Before | After |
229
242
  |:--|--:|--:|
230
- | Tokens consumed | 267,121 | 2,223 |
231
- | % of context window | 133.6% | 1.1% |
243
+ | Tokens consumed (est.) | ~261,000 | ~2,200 |
244
+ | % of context window | ~131% | ~1.1% |
232
245
  | Operations before compaction | ~9 | **~1,100** |
233
- | Conversation longevity | Short | **~121x longer** |
246
+ | Conversation longevity | Short | **~119x longer** |
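
The derived rows in this table follow from the two token totals. A short worked sketch, assuming the rounded totals shown above (~261,000 and ~2,200 tokens) and using hypothetical names throughout:

```typescript
// Sketch only: reproduce the derived metrics from the rounded token totals.
const WINDOW_TOKENS = 200_000; // context window size stated in the document
const OPERATIONS = 12;         // operations in the benchmark session

interface WindowUsage {
  percentOfWindow: number;
  operationsBeforeFull: number;
}

function windowUsage(sessionTokens: number): WindowUsage {
  const tokensPerOperation = sessionTokens / OPERATIONS;
  return {
    percentOfWindow: (sessionTokens / WINDOW_TOKENS) * 100,
    operationsBeforeFull: WINDOW_TOKENS / tokensPerOperation,
  };
}

const usageBefore = windowUsage(261_000);
const usageAfter = windowUsage(2_200);

console.log(usageBefore.percentOfWindow.toFixed(1));       // 130.5
console.log(Math.round(usageBefore.operationsBeforeFull)); // 9
console.log(Math.round(usageAfter.operationsBeforeFull));  // 1091
console.log(Math.round(261_000 / 2_200));                  // 119
```

The table rounds these to ~131%, ~9, ~1,100, and ~119x respectively.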
234
247
 
235
248
  ---
236
249
 
237
250
  ## Cost Impact
238
251
 
239
- Input token pricing (per session, 12 operations):
252
+ Input token pricing (per session, 12 operations). Using midpoint estimate (bytes/4):
240
253
 
241
254
  | Model | Before | After | Saved per Session |
242
255
  |:--|--:|--:|--:|
243
- | Sonnet 4 ($3/MTok) | $0.80 | $0.007 | **$0.79** |
244
- | Opus 4 ($15/MTok) | $4.01 | $0.033 | **$3.97** |
256
+ | Haiku 4.5 ($0.80/MTok) | $0.21 | $0.002 | **$0.21** |
257
+ | Sonnet 4.6 ($3/MTok) | $0.78 | $0.007 | **$0.78** |
258
+ | Opus 4.6 ($15/MTok) | $3.92 | $0.033 | **$3.89** |
245
259
 
246
- ### Extrapolated Savings
260
+ ### Extrapolated Monthly Savings
247
261
 
248
- | Usage | Sonnet Monthly | Opus Monthly |
249
- |:--|--:|--:|
250
- | 5 sessions/day | $118.50 | $592.50 |
251
- | 10 sessions/day | $237.00 | **$1,185.00** |
252
- | 20 sessions/day | $474.00 | **$2,370.00** |
262
+ | Usage | Haiku | Sonnet | Opus |
263
+ |:--|--:|--:|--:|
264
+ | 5 sessions/day | $31.05 | $116.44 | **$582.19** |
265
+ | 10 sessions/day | $62.10 | $232.88 | **$1,164.38** |
266
+ | 20 sessions/day | $124.20 | $465.75 | **$2,328.75** |
253
267
 
254
- > Note: These are input token savings only. Actual savings vary based on session complexity. Output tokens are unaffected.
268
+ > Note: These are input token savings only. Actual savings vary based on session complexity. Output tokens are unaffected. Token estimates use bytes/4 midpoint; actual counts may vary 20-30%.
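
The monthly figures in the new table can be reproduced with a short calculation. In this sketch the 30-day month and the 1,035,000 saved bytes (1,035 KB at 1,000 bytes per KB) are assumptions inferred from the table values, and `monthlySavingsUSD` is a hypothetical name.

```typescript
// Sketch only: reproduce the "Extrapolated Monthly Savings" table.
const BYTES_PER_TOKEN = 4;  // midpoint estimate used by the document
const DAYS_PER_MONTH = 30;  // assumption inferred from the table values

function monthlySavingsUSD(
  savedBytesPerSession: number,
  pricePerMTok: number,
  sessionsPerDay: number,
): number {
  const savedTokens = savedBytesPerSession / BYTES_PER_TOKEN;
  const savedPerSession = (savedTokens / 1_000_000) * pricePerMTok;
  return savedPerSession * sessionsPerDay * DAYS_PER_MONTH;
}

// Opus at $15/MTok, 5 sessions per day.
console.log(monthlySavingsUSD(1_035_000, 15, 5).toFixed(2)); // 582.19
```

Because the estimate scales linearly with bytes per token, the 20-30% uncertainty in the note carries straight through to the dollar amounts.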
255
269
 
256
270
  ---
257
271
 
@@ -317,7 +331,7 @@ The `browser_snapshot()` tool returns a full accessibility tree:
317
331
  ... (thousands more lines for a real application)
318
332
  ```
319
333
 
320
- **All 56,000 bytes (14,000 tokens) dumped into context. Gone.**
334
+ **All 56,000 bytes (~14,000 tokens) dumped into context. Gone.**
321
335
 
322
336
  The agent probably only needed the login form. But it paid for the entire page.
323
337
 
@@ -363,6 +377,36 @@ The other 55,701 bytes are still in FTS5 — fully searchable. Need the order ta
363
377
 
364
378
  ---
365
379
 
380
+ ## Security and Reliability
381
+
382
+ context-compress v2026.3.21 includes comprehensive security and reliability features:
383
+
384
+ ### Security
385
+
386
+ | Feature | Description |
387
+ |:--|:--|
388
+ | Environment isolation | Opt-in credential passthrough (`passthroughEnvVars` defaults to empty) |
389
+ | SSRF protection | 4-layer defense: hostname validation, DNS rebinding prevention, IP pinning, redirect blocking |
390
+ | Input limits | Code: 1MB max. Fetch response: 10MB max. Index content: 50MB max |
391
+ | Concurrency control | Global limit of 8 concurrent executions. batch_execute: max 4 parallel |
392
+ | Prompt injection detection | Regex-based advisory warnings on fetched content (7 patterns) |
393
+ | Path traversal protection | `realpathSync` with symlink resolution + project boundary enforcement |
394
+ | Process isolation | Timeout, output caps (100MB), process group kill, safe environment |
395
+
396
+ ### Reliability
397
+
398
+ | Feature | Description |
399
+ |:--|:--|
400
+ | Graceful shutdown | Active subprocess tracking, SIGTERM/SIGINT cleanup, uncaughtException handling |
401
+ | DB resilience | In-memory fallback on disk-full. WAL mode for crash recovery. Stale DB cleanup |
402
+ | Output processing | Line deduplication, error grouping, smart 60/40 head/tail truncation |
403
+ | Search fallback | 3-layer: Porter stemming → trigram (lazy) → Levenshtein fuzzy correction |
404
+ | Configuration | ENV > file > defaults with Zod validation and sanity clamping |
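
The table names a 60/40 head/tail truncation but the diff does not show how it works. One plausible reading, sketched here as an assumption, is to keep 60% of the budget from the start of the output and 40% from the end. The name `truncateHeadTail` and the marker text are hypothetical.

```typescript
// Sketch only: keep ~60% of the character budget from the head and ~40%
// from the tail, replacing the middle with a short marker. The marker is
// added on top of maxChars, so the result is slightly longer than the budget.
function truncateHeadTail(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text; // small outputs pass through

  const headLen = Math.floor((maxChars * 6) / 10);
  const tailLen = maxChars - headLen;
  const omitted = text.length - headLen - tailLen;

  const head = text.slice(0, headLen);
  const tail = text.slice(text.length - tailLen);
  return `${head}\n... [${omitted} chars omitted] ...\n${tail}`;
}
```

Favoring the head keeps the command and early output, while the tail preserves the final errors and exit summary that usually matter most in test and install logs.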
405
+
406
+ For the full security model, see [SECURITY.md](../SECURITY.md).
407
+
408
+ ---
409
+
366
410
  ## FAQ: Doesn't Less Tokens Mean Losing Context?
367
411
 
368
412
  **This is the right question to ask.** If we're feeding the agent fewer tokens, doesn't it see less?
@@ -374,7 +418,7 @@ The other 55,701 bytes are still in FTS5 — fully searchable. Need the order ta
374
418
  ```
375
419
  WITHOUT context-compress (passive exposure):
376
420
  ┌──────────────────────────────────────────────────────┐
377
- │ 194,076 tokens loaded into context │
421
+ │ ~194,000 tokens loaded into context │
378
422
  │ │
379
423
  │ 99% = imports, boilerplate, minified code, │
380
424
  │ source maps, irrelevant functions... │
@@ -390,7 +434,7 @@ WITHOUT context-compress (passive exposure):
390
434
 
391
435
  WITH context-compress (active retrieval):
392
436
  ┌──────────────────────────────────────────────────────┐
393
- │ 105 tokens loaded into context │
437
+ │ ~105 tokens loaded into context │
394
438
  │ │
395
439
  │ 100% = exactly the function you care about │
396
440
  │ │
@@ -443,11 +487,13 @@ context-compress trades **passive exposure to noise** for **active retrieval of
443
487
 
444
488
  | Tool | Mechanism | Best For |
445
489
  |:--|:--|:--|
446
- | `execute` | Runs code in sandbox. Only `console.log` enters context | CLI commands, API calls, test runners |
490
+ | `execute` | Runs code in sandbox (11 languages). Only `console.log` enters context | CLI commands, API calls, test runners |
447
491
  | `execute_file` | Reads file into sandbox. Only printed summary enters context | Large source files, CSVs, logs, data files |
448
492
  | `index` + `search` | FTS5 stores all data. BM25 returns only matching chunks | Documentation, snapshots, large datasets |
449
493
  | `fetch_and_index` | HTML → markdown → FTS5. Returns 3KB preview + searchable index | Web pages, API docs, reference material |
450
- | `batch_execute` | Runs N commands, indexes all output, searches across all in 1 call | Multi-step workflows, exploration |
494
+ | `batch_execute` | Runs N commands (max 4 parallel), indexes all output, searches across all in 1 call | Multi-step workflows, exploration |
495
+ | `discover` | Shows knowledge base inventory and optimization suggestions | Understanding available indexed data |
496
+ | `stats` | Real-time session statistics with token range estimates and cost | Monitoring compression effectiveness |
451
497
 
452
498
  The core principle:
453
499
 
@@ -455,5 +501,6 @@ The core principle:
455
501
 
456
502
  ---
457
503
 
458
- *Generated from real benchmarks on the context-compress v1.0.0 codebase.*
459
- *Token calculation: 1 token ≈ 4 bytes (English text average).*
504
+ *Generated from real benchmarks on the context-compress v2026.3.21 codebase.*
505
+ *Token estimates use bytes/4 midpoint. Actual token counts may vary by 20-30% depending on content type.*
506
+ *See SECURITY.md for the full trust model and security architecture.*
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "context-compress",
3
- "version": "2026.3.21",
3
+ "version": "2026.3.22",
4
4
  "description": "Context-aware MCP server that compresses tool output for Claude Code",
5
5
  "type": "module",
6
6
  "main": "dist/server.js",