context-mode 0.5.2 → 0.5.4

@@ -12,8 +12,8 @@
  {
  "name": "context-mode",
  "source": "./",
- "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and smart truncation.",
- "version": "0.5.2",
+ "description": "Claude Code MCP plugin that saves 98% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and intent-driven search.",
+ "version": "0.5.4",
  "author": {
  "name": "Mert Koseoğlu"
  },
@@ -1,7 +1,7 @@
  {
  "name": "context-mode",
- "version": "0.5.2",
- "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and smart truncation.",
+ "version": "0.5.4",
+ "description": "Claude Code MCP plugin that saves 98% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and intent-driven search.",
  "author": {
  "name": "Mert Koseoğlu",
  "url": "https://github.com/mksglu"
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # Context Mode

- **Claude Code MCP plugin that saves 94% of your context window.**
+ **Claude Code MCP plugin that saves 98% of your context window.**

  Every tool call in Claude Code consumes context tokens. A single Playwright snapshot burns 10K-135K tokens. A Context7 docs lookup dumps 4K-10K tokens. GitHub's `list_commits` with 30 results costs 29K-64K tokens. With 5+ MCP servers active, you lose ~55K tokens before your first message — and after 30 minutes of real debugging, responses slow to a crawl.

@@ -12,14 +12,14 @@ Claude Code has a 200K token context window. Here's how fast popular MCP servers

  | MCP Server | Tool | Without Context Mode | With Context Mode | Savings | Source |
  |---|---|---|---|---|---|
- | **Playwright** | `browser_snapshot` | 10K-135K tokens | ~20 tokens | **99%** | [playwright-mcp#1233](https://github.com/microsoft/playwright-mcp/issues/1233) |
- | **Context7** | `query-docs` | 4K-10K tokens | ~70 tokens | **98%** | [upstash/context7](https://github.com/upstash/context7) |
- | **GitHub** | `list_commits` (30) | 29K-64K tokens | ~10 tokens | **99%** | [github-mcp-server#142](https://github.com/github/github-mcp-server/issues/142) |
- | **Sentry** | issue analysis | 5K-30K tokens | ~25 tokens | **99%** | [getsentry/sentry-mcp](https://github.com/getsentry/sentry-mcp) |
- | **Supabase** | schema queries | 2K-30K tokens | ~30 tokens | **99%** | [supabase-community/supabase-mcp](https://github.com/supabase-community/supabase-mcp) |
- | **Firecrawl** | `scrape` / `crawl` | 5K-50K+ tokens | ~70 tokens | **99%** | [firecrawl](https://github.com/mendableai/firecrawl) |
- | **Chrome DevTools** | DOM / network | 5K-50K+ tokens | ~25 tokens | **99%** | Community benchmark |
- | **Fetch** | `fetch` | 5K-50K tokens | ~70 tokens | **99%** | Official reference server |
+ | **Playwright** | `browser_snapshot` | 10K-135K tokens | ~75 tokens | **99%** | [playwright-mcp#1233](https://github.com/microsoft/playwright-mcp/issues/1233) |
+ | **Context7** | `query-docs` | 4K-10K tokens | ~65 tokens | **98%** | [upstash/context7](https://github.com/upstash/context7) |
+ | **GitHub** | `list_commits` (30) | 29K-64K tokens | ~180 tokens | **99%** | [github-mcp-server#142](https://github.com/github/github-mcp-server/issues/142) |
+ | **Sentry** | issue analysis | 5K-30K tokens | ~85 tokens | **99%** | [getsentry/sentry-mcp](https://github.com/getsentry/sentry-mcp) |
+ | **Supabase** | schema queries | 2K-30K tokens | ~80 tokens | **99%** | [supabase-community/supabase-mcp](https://github.com/supabase-community/supabase-mcp) |
+ | **Firecrawl** | `scrape` / `crawl` | 5K-50K+ tokens | ~65 tokens | **99%** | [firecrawl](https://github.com/mendableai/firecrawl) |
+ | **Chrome DevTools** | DOM / network | 5K-50K+ tokens | ~75 tokens | **99%** | Community benchmark |
+ | **Fetch** | `fetch` | 5K-50K tokens | ~65 tokens | **99%** | Official reference server |

  **Real measurement** ([Scott Spence, 2025](https://scottspence.com/posts/optimising-mcp-server-context-usage-in-claude-code)): With 81+ MCP tools enabled across multiple servers, **143K of 200K tokens (72%) consumed** — 82K tokens just for MCP tool definitions. Only 28% left for actual work.

@@ -29,15 +29,15 @@ Claude Code has a 200K token context window. Here's how fast popular MCP servers

  | What you're doing | Without Context Mode | With Context Mode | Savings |
  |---|---|---|---|
- | Playwright `browser_snapshot` | 12 KB into context | 50 B summary | **99%** |
- | Context7 `query-docs` (React) | 60 KB raw docs | 285 B search result | **99%** |
- | `gh pr list` / `gh api` | 8 KB JSON response | 40 B summary | **99%** |
- | Read `access.log` (500 req) | 45 KB raw log | 71 B status breakdown | **99%** |
- | `npm test` (30 suites) | 6 KB raw output | 37 B pass/fail | **99%** |
- | Git log (153 commits) | 12 KB raw log | 18 B summary | **99%** |
- | Supabase Edge Functions docs | 4 KB raw docs | 123 B code example | **97%** |
+ | Playwright `browser_snapshot` | 56 KB into context | 299 B summary | **99%** |
+ | Context7 `query-docs` (React) | 5.9 KB raw docs | 261 B summary | **96%** |
+ | GitHub issues (20) | 59 KB JSON response | 1.1 KB summary | **98%** |
+ | Read `access.log` (500 req) | 45 KB raw log | 155 B status breakdown | **100%** |
+ | `vitest` (30 suites) | 6 KB raw output | 337 B pass/fail | **95%** |
+ | Git log (153 commits) | 12 KB raw log | 107 B summary | **99%** |
+ | Analytics CSV (500 rows) | 86 KB raw data | 222 B summary | **100%** |

- **Real aggregate across 13 scenarios: 194 KB raw → 12.6 KB context (94% savings)**
+ **Real aggregate across 14 scenarios: 315 KB raw → 5.4 KB context (98% savings)**

  ## Quick Start

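The headline change from 94% to 98% in the hunk above follows directly from the two aggregate lines. A minimal sketch of that arithmetic, assuming savings are computed as one minus the ratio of context bytes to raw bytes and rounded to a whole percent (the helper name `savingsPercent` is hypothetical, not part of the package):

```javascript
// Hypothetical helper: share of raw bytes kept out of the context window,
// rounded to a whole percent as the README tables appear to do.
function savingsPercent(rawBytes, contextBytes) {
  if (rawBytes <= 0) throw new RangeError("rawBytes must be positive");
  return Math.round((1 - contextBytes / rawBytes) * 100);
}

const KB = 1024; // assumption: the unit cancels out in the ratio anyway

console.log(savingsPercent(194 * KB, 12.6 * KB)); // 0.5.2 aggregate
console.log(savingsPercent(315 * KB, 5.4 * KB));  // 0.5.4 aggregate
console.log(savingsPercent(45 * KB, 155));        // access.log row
```

Under this rounding, the rows shown as 100% are not lossless: 155 B out of 45 KB is roughly 99.7% saved, which rounds up.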
@@ -75,7 +75,7 @@ Claude calls: execute({ language: "shell", code: "gh pr list --json title,state
  Returns: "3" ← 2 bytes instead of 8KB JSON
  ```

- **Intent-driven search** (v0.5.0): When you provide an `intent` parameter and output exceeds 5KB, Context Mode uses BM25 search to return only the relevant sections matching your intent.
+ **Intent-driven search** (v0.5.2): When you provide an `intent` parameter and output exceeds 5KB, Context Mode uses score-based BM25 search to return only the relevant sections matching your intent.

  ```
  Claude calls: execute({
@@ -83,9 +83,12 @@ Claude calls: execute({
  code: "cat /var/log/app.log",
  intent: "connection refused database error"
  })
- Returns: only the 3 matching log sections (1.5KB) ← instead of 100KB raw log
+ Returns: section titles + searchable terms (500B) ← instead of 100KB raw log
  ```

+ When intent search runs, the response includes `Searchable terms` — distinctive vocabulary
+ extracted from the output via IDF scoring. Use these terms for targeted follow-up `search()` calls.
+
  Authenticated CLIs work out of the box — `gh`, `aws`, `gcloud`, `kubectl`, `docker` credentials are passed through securely. Bun auto-detected for 3-5x faster JS/TS.

  ### `execute_file` — Process Files Without Loading
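The added README lines in the hunk above describe `Searchable terms` as vocabulary extracted via IDF scoring, but the diff does not show how the package computes it. A rough sketch of the general idea, assuming each chunk counts as one document and a term's score is `log(N / df)`; the function name, tokenizer, and cutoff are all illustrative assumptions:

```javascript
// Hypothetical sketch of IDF-based term extraction; not the package's code.
// Terms that occur in few chunks score high, terms in every chunk score 0.
function searchableTerms(chunks, limit = 5) {
  const docFreq = new Map();
  for (const chunk of chunks) {
    // Assumed tokenizer: lowercase words of at least three characters.
    const words = new Set(chunk.toLowerCase().match(/[a-z][a-z0-9_]{2,}/g) ?? []);
    for (const word of words) docFreq.set(word, (docFreq.get(word) ?? 0) + 1);
  }
  const total = chunks.length;
  return [...docFreq.entries()]
    .map(([term, df]) => ({ term, idf: Math.log(total / df) }))
    .filter((entry) => entry.idf > 0) // drop terms present in every chunk
    .sort((a, b) => b.idf - a.idf || a.term.localeCompare(b.term))
    .slice(0, limit)
    .map((entry) => entry.term);
}

const logChunks = [
  "info request ok",
  "info request ok",
  "error connection refused database",
];
console.log(searchableTerms(logChunks, 4));
```

The rare words from the single failing chunk rank ahead of the routine `info` and `request` lines, which is what makes them useful as follow-up `search()` queries.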
@@ -132,12 +135,12 @@ Use instead of WebFetch or Context7 when you need documentation — index once,
  ┌──────────────────────────────────────────────────────────────────┐
  │ Without Context Mode │
  │ │
- │ Claude Code → Playwright snapshot → 12KB into context │
- │ Claude Code → Context7 docs → 60KB into context │
- │ Claude Code → gh pr list → 8KB into context │
+ │ Claude Code → Playwright snapshot → 56KB into context │
+ │ Claude Code → Context7 docs → 6KB into context │
+ │ Claude Code → gh pr list → 6KB into context │
  │ Claude Code → cat access.log → 45KB into context │
  │ │
- │ Total: 125KB consumed = ~32,000 tokens = 16% of context gone │
+ │ Total: 113KB consumed = ~29,000 tokens = 14% of context gone │
  └──────────────────────────────────────────────────────────────────┘

  ┌──────────────────────────────────────────────────────────────────┐
@@ -145,10 +148,10 @@ Use instead of WebFetch or Context7 when you need documentation — index once,
  │ │
  │ Claude Code → fetch_and_index(url) → "Indexed 8 sections" (50B)│
  │ Claude Code → search("snapshot") → exact element (500B) │
- │ Claude Code → execute("gh pr list") → "3 open PRs" (40B)│
- │ Claude Code → execute_file(log) → "500:14, 404:89" (30B)│
+ │ Claude Code → execute("gh pr list") → "5 PRs, +59 -0" (719B)│
+ │ Claude Code → execute_file(log) → "500:13, 404:13" (155B)│
  │ │
- │ Total: 620B consumed = ~160 tokens = 0.08% of context │
+ │ Total: 1.4KB consumed = ~350 tokens = 0.18% of context │
  └──────────────────────────────────────────────────────────────────┘
  ```

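The totals in the two boxes above convert bytes to tokens and then to a share of the 200K-token window. The README does not state its conversion factor; its own figures (113KB to ~29,000 tokens, 1.4KB to ~350 tokens) are consistent with roughly 4 bytes per token, so the sketch below treats that ratio as an assumption:

```javascript
const BYTES_PER_TOKEN = 4;      // assumption inferred from the README's figures
const CONTEXT_WINDOW = 200_000; // tokens, as stated in the README

// Hypothetical helper: estimated token cost and share of the context window.
function contextCost(bytes) {
  const tokens = Math.round(bytes / BYTES_PER_TOKEN);
  const sharePercent = (tokens / CONTEXT_WINDOW) * 100;
  return { tokens, sharePercent };
}

const without = contextCost(113 * 1024);
const withMode = contextCost(1.4 * 1024);
console.log(without.tokens, without.sharePercent.toFixed(0) + "%");
console.log(withMode.tokens, withMode.sharePercent.toFixed(2) + "%");
```

Real tokenizers vary with content (JSON and logs often tokenize less efficiently than prose), so these numbers are estimates rather than measurements.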
@@ -172,6 +175,7 @@ Use instead of WebFetch or Context7 when you need documentation — index once,
  │ │ • BM25 ranking │ │
  │ │ • Porter stemming │ │
  │ │ • Heading-aware chunks │ │
+ │ │ • Vocabulary hints │ │
  │ └────────────────────────┘ │
  └──────────────────────────────┘
  ```
@@ -213,20 +217,26 @@ ORDER BY rank LIMIT 3;

  **Lazy singleton:** Database created only when `index` or `search` is first called — zero overhead for sessions that don't use it.

- ### Intent-Driven Search (v0.5.0)
+ ### Intent-Driven Search (v0.5.2)
+
+ When `execute` or `execute_file` is called with an `intent` parameter and output exceeds 5KB, Context Mode uses score-based BM25 search to return only the relevant sections:

- When `execute` or `execute_file` is called with an `intent` parameter and output exceeds 5KB, Context Mode chunks the output and uses BM25 search to return only the relevant sections:
+ - **Score-based search**: Searches ALL intent words independently, ranks chunks by match count
+ - **Searchable terms**: Distinctive vocabulary hints extracted via IDF scoring, helping you craft precise follow-up `search()` calls
+ - **Smarter chunk titles**: Uses the first content line of each chunk instead of generic "Section N" labels

  ```
  Without intent:
  stdout (100KB) → full output enters context

  With intent:
- stdout (100KB) → chunk by lines → in-memory FTS5 → search(intent) → 2-5KB relevant sections
- Result: only what you need enters context
+ stdout (100KB) → chunk by lines → in-memory FTS5 → score all intent words → top chunks + searchable terms
+ Result: only what you need enters context, plus vocabulary for targeted follow-ups
  ```

- Tested across 4 real-world scenarios:
+ **31% to 100% recall on real-world CHANGELOG test** — the score-based approach finds every relevant section, not just those matching a single query string.
+
+ Tested across 5 real-world scenarios:

  | Scenario | Without Intent | With Intent | Size Reduction |
  |---|---|---|---|
@@ -234,6 +244,7 @@ Tested across 4 real-world scenarios:
  | 3 test failures among 200 tests | only 2/3 visible | **all 3 found** | 2.4 KB vs 5.0 KB |
  | 2 build warnings among 300 lines | both lost in output | **both found** | 2.1 KB vs 5.0 KB |
  | API auth error (line 743/1000) | error lost in output | **found** | 1.2 KB vs 4.9 KB |
+ | Semantic gap (CHANGELOG search) | 31% recall | **100% recall** | Full coverage |

  Intent search finds the target every time while using 50-75% fewer bytes.

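The README text added in this release describes score-based search as searching every intent word independently and ranking chunks by match count, with each chunk titled by its first content line. The package does this on top of an in-memory FTS5 index, which the diff does not show. The sketch below only illustrates the ranking idea, swapping in plain substring matching for FTS5/BM25; every name in it is hypothetical:

```javascript
// Hypothetical sketch: rank chunks by how many distinct intent words they contain.
// Substring matching stands in for the package's FTS5/BM25 lookup.
function rankChunksByIntent(chunks, intent, limit = 3) {
  const words = [...new Set(intent.toLowerCase().split(/\s+/).filter(Boolean))];
  return chunks
    .map((text, index) => {
      const lower = text.toLowerCase();
      const score = words.filter((word) => lower.includes(word)).length;
      // Title is the first non-empty line rather than a generic "Section N".
      const title = text.split("\n").find((line) => line.trim() !== "") ?? "";
      return { index, title, score };
    })
    .filter((chunk) => chunk.score > 0)
    .sort((a, b) => b.score - a.score || a.index - b.index)
    .slice(0, limit);
}

const chunks = [
  "GET /health 200",
  "ERROR: connection refused\ndatabase at 10.0.0.5 unreachable",
  "WARN: database slow query",
];
console.log(rankChunksByIntent(chunks, "connection refused database error"));
```

A chunk that matches only one intent word still surfaces, just lower in the ranking. That is the likely reason recall improves over issuing the whole intent as a single query string, where a chunk missing any one word can be dropped entirely.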
@@ -256,17 +267,17 @@ Tested with tools from popular MCP servers and Claude Code workflows:

  | Scenario | Tool | Raw | Context | Savings |
  |---|---|---|---|---|
- | Playwright page snapshot | `execute_file` | 50+ KB | 78 B | **99%** |
- | Context7 React docs | `index + search` | 5.9 KB | 285 B | **95%** |
- | Context7 Supabase docs | `index + search` | 3.9 KB | 123 B | **97%** |
- | Context7 Next.js docs | `index + search` | 6.5 KB | 273 B | **96%** |
- | httpbin.org API docs | `fetch_and_index` | 9.4 KB | 50 B | **99%** |
- | GitHub API response | `execute` | 8+ KB | 40 B | **99%** |
- | Access log (500 req) | `execute_file` | 45.1 KB | 71 B | **100%** |
- | Analytics CSV (500 rows) | `execute_file` | 85.5 KB | 11.5 KB | **87%** |
- | MCP tools manifest (40 tools) | `execute_file` | 17.0 KB | 78 B | **100%** |
- | npm test (30 suites) | `execute_file` | 6.0 KB | 37 B | **99%** |
- | Git log (153 commits) | `execute` | 11.6 KB | 18 B | **100%** |
+ | Playwright page snapshot | `execute` | 56.2 KB | 299 B | **99%** |
+ | Context7 React docs | `execute` | 5.9 KB | 261 B | **96%** |
+ | Context7 Next.js docs | `execute` | 6.5 KB | 249 B | **96%** |
+ | Context7 Tailwind docs | `execute` | 4.0 KB | 186 B | **95%** |
+ | GitHub Issues (20) | `execute` | 58.9 KB | 1.1 KB | **98%** |
+ | GitHub PR list (5) | `execute` | 6.4 KB | 719 B | **89%** |
+ | Access log (500 req) | `execute_file` | 45.1 KB | 155 B | **100%** |
+ | Analytics CSV (500 rows) | `execute_file` | 85.5 KB | 222 B | **100%** |
+ | MCP tools manifest (40 tools) | `execute_file` | 17.0 KB | 742 B | **96%** |
+ | Test output (30 suites) | `execute` | 6.0 KB | 337 B | **95%** |
+ | Git log (153 commits) | `execute` | 11.6 KB | 107 B | **99%** |

  ### Session Impact

@@ -274,9 +285,9 @@ Typical 45-minute debugging session:

  | Metric | Without | With | Delta |
  |---|---|---|---|
- | Context consumed | 177 KB | 10 KB | **-94%** |
- | Tokens used | ~45,300 | ~2,600 | **-94%** |
- | Context remaining | 77% | 95% | **+18pp** |
+ | Context consumed | 315 KB | 5.4 KB | **-98%** |
+ | Tokens used | ~80,600 | ~1,400 | **-98%** |
+ | Context remaining | 60% | 99% | **+39pp** |
  | Time before slowdown | ~30 min | ~3 hours | **+6x** |

  ## Tool Decision Matrix
@@ -375,13 +386,13 @@ Just ask naturally — Claude automatically routes through Context Mode when it

  ## Test Suite

- 99+ tests across 4 suites:
+ 100+ tests across 4 suites:

  | Suite | Tests | Coverage |
  |---|---|---|
  | Executor | 55 | 10 languages, sandbox, output handling, concurrency, timeouts |
  | ContentStore | 40 | FTS5 schema, BM25 ranking, chunking, stemming, plain text indexing |
- | Intent Search | 4 | Intent-driven search across 4 real-world scenarios |
+ | Intent Search | 5 | Intent-driven search across 5 real-world scenarios (incl. semantic gap) |
  | MCP Integration | 24 | JSON-RPC protocol, all 5 tools, fetch_and_index, errors |

  ## Development
@@ -393,7 +404,7 @@ npm install
  npm run build
  npm test # executor (55 tests)
  npm run test:store # FTS5/BM25 (40 tests)
- npm run test:all # all suites (99+ tests)
+ npm run test:all # all suites (100+ tests)
  ```

  ## License
package/build/server.js CHANGED
@@ -5,7 +5,7 @@ import { z } from "zod";
  import { PolyglotExecutor } from "./executor.js";
  import { ContentStore } from "./store.js";
  import { detectRuntimes, getRuntimeSummary, getAvailableLanguages, hasBunRuntime, } from "./runtime.js";
- const VERSION = "0.5.2";
+ const VERSION = "0.5.4";
  const runtimes = detectRuntimes();
  const available = getAvailableLanguages(runtimes);
  const server = new McpServer({
package/package.json CHANGED
@@ -1,8 +1,8 @@
  {
  "name": "context-mode",
- "version": "0.5.2",
+ "version": "0.5.4",
  "type": "module",
- "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution, FTS5 knowledge base, and smart truncation.",
+ "description": "Claude Code MCP plugin that saves 98% of your context window. Sandboxed code execution, FTS5 knowledge base, and intent-driven search.",
  "author": "Mert Koseoğlu",
  "license": "MIT",
  "keywords": [
@@ -1,12 +1,23 @@
  ---
  name: context-mode
  description: |
- ALWAYS use context-mode tools instead of Bash/curl/cat when output MIGHT exceed 20 lines.
- This includes: API requests (curl, fetch, gh, aws), file reading (cat, head, tail),
- command output (npm test, git log, docker ps, kubectl), log analysis, data processing,
- and ANY operation where you cannot guarantee the output will be small.
- When in doubt, use context-mode: the cost of routing through it is near zero,
- but the cost of flooding context with 50KB+ of raw data is catastrophic.
+ Use context-mode tools (execute, execute_file, fetch_and_index, index, search) instead of
+ Bash/curl/cat when processing large outputs. The cost of routing through context-mode is near zero,
+ but flooding context with 50KB+ of raw data is catastrophic.
+
+ Use when: user wants to "fetch documentation", "look up the docs", "fetch the docs",
+ "check the API reference", "find examples in docs", "index documentation", "search docs",
+ "analyze logs", "check the logs", "read access.log", "debug the 500s",
+ "run tests", "run the test suite", "check if tests pass", "npm test", "vitest", "pytest",
+ "git log", "show recent commits", "list commits", "what changed", "diff between branches",
+ "analyze output", "summarize output", "process data", "parse JSON", "parse CSV",
+ "filter results", "extract errors", "check build output", "analyze dependencies",
+ "process API response", "large file analysis", "hit this endpoint", "call the API",
+ "check the response", "list containers", "check pods", "outdated packages",
+ "count lines", "find TODOs", "analyze codebase", "security audit",
+ or ANY operation where output might exceed 20 lines.
+ Also use when routing output from other MCP tools (Playwright snapshots, Context7 docs,
+ GitHub API responses, Sentry data) through index → search to avoid context bloat.
  ---

  # Context Mode: Default for All Large Output