context-mode 0.4.1 → 0.5.2

@@ -13,7 +13,7 @@
  "name": "context-mode",
  "source": "./",
  "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and smart truncation.",
- "version": "0.4.1",
+ "version": "0.5.2",
  "author": {
  "name": "Mert Koseoğlu"
  },
@@ -1,6 +1,6 @@
  {
  "name": "context-mode",
- "version": "0.4.1",
+ "version": "0.5.2",
  "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution in 10 languages, FTS5 knowledge base with BM25 ranking, and smart truncation.",
  "author": {
  "name": "Mert Koseoğlu",
package/README.md CHANGED
@@ -75,11 +75,22 @@ Claude calls: execute({ language: "shell", code: "gh pr list --json title,state
  Returns: "3" ← 2 bytes instead of 8KB JSON
  ```
 
+ **Intent-driven search** (v0.5.0): When you provide an `intent` parameter and output exceeds 5KB, Context Mode uses BM25 search to return only the relevant sections matching your intent.
+
+ ```
+ Claude calls: execute({
+ language: "shell",
+ code: "cat /var/log/app.log",
+ intent: "connection refused database error"
+ })
+ Returns: only the 3 matching log sections (1.5KB) ← instead of 100KB raw log
+ ```
+
  Authenticated CLIs work out of the box — `gh`, `aws`, `gcloud`, `kubectl`, `docker` credentials are passed through securely. Bun auto-detected for 3-5x faster JS/TS.
 
  ### `execute_file` — Process Files Without Loading
 
- File contents never enter context. The file is read into a `FILE_CONTENT` variable inside the sandbox.
+ File contents never enter context. The file is read into a `FILE_CONTENT` variable inside the sandbox. Also supports `intent` parameter for intent-driven search on large outputs.
 
  ```
  Claude calls: execute_file({ path: "access.log", language: "python", code: "..." })
@@ -152,7 +163,7 @@ Use instead of WebFetch or Context7 when you need documentation — index once,
  │ │ • 10 language runtimes │ │
  │ │ • Sandboxed subprocess │ │
  │ │ • Auth passthrough │ │
- │ │ • Smart truncation │ │
+ │ │ • Intent-driven search │ │
  │ └────────────────────────┘ │
  │ │
  │ ┌────────────────────────┐ │
@@ -202,17 +213,29 @@ ORDER BY rank LIMIT 3;
 
  **Lazy singleton:** Database created only when `index` or `search` is first called — zero overhead for sessions that don't use it.
 
- ### Smart Truncation
+ ### Intent-Driven Search (v0.5.0)
 
- When subprocess output exceeds the 100KB buffer, Context Mode preserves both head and tail:
+ When `execute` or `execute_file` is called with an `intent` parameter and output exceeds 5KB, Context Mode chunks the output and uses BM25 search to return only the relevant sections:
 
  ```
- Head (60%): Initial output with context
- ... [47 lines / 3.2KB truncated — showing first 12 + last 8 lines] ...
- Tail (40%): Final output with errors/results
+ Without intent:
+ stdout (100KB) full output enters context
+
+ With intent:
+ stdout (100KB) → chunk by lines → in-memory FTS5 → search(intent) → 2-5KB relevant sections
+ Result: only what you need enters context
  ```
 
- Line-boundary snapping never cuts mid-line. Error messages at the bottom are always preserved.
+ Tested across 4 real-world scenarios:
+
+ | Scenario | Without Intent | With Intent | Size Reduction |
+ |---|---|---|---|
+ | Server log error (line 347/500) | error lost in output | **found** | 1.5 KB vs 5.0 KB |
+ | 3 test failures among 200 tests | only 2/3 visible | **all 3 found** | 2.4 KB vs 5.0 KB |
+ | 2 build warnings among 300 lines | both lost in output | **both found** | 2.1 KB vs 5.0 KB |
+ | API auth error (line 743/1000) | error lost in output | **found** | 1.2 KB vs 4.9 KB |
+
+ Intent search finds the target every time while using 50-75% fewer bytes.
 
  ### HTML to Markdown Conversion
 
@@ -352,12 +375,13 @@ Just ask naturally — Claude automatically routes through Context Mode when it
 
  ## Test Suite
 
- 113 tests across 3 suites:
+ 99+ tests across 4 suites:
 
  | Suite | Tests | Coverage |
  |---|---|---|
- | Executor | 55 | 10 languages, sandbox, truncation, concurrency, timeouts |
- | ContentStore | 34 | FTS5 schema, BM25 ranking, chunking, stemming, fixtures |
+ | Executor | 55 | 10 languages, sandbox, output handling, concurrency, timeouts |
+ | ContentStore | 40 | FTS5 schema, BM25 ranking, chunking, stemming, plain text indexing |
+ | Intent Search | 4 | Intent-driven search across 4 real-world scenarios |
  | MCP Integration | 24 | JSON-RPC protocol, all 5 tools, fetch_and_index, errors |
 
  ## Development
@@ -368,8 +392,8 @@ cd claude-context-mode
  npm install
  npm run build
  npm test # executor (55 tests)
- npm run test:store # FTS5/BM25 (34 tests)
- npm run test:all # all suites (113 tests)
+ npm run test:store # FTS5/BM25 (40 tests)
+ npm run test:all # all suites (99+ tests)
  ```
 
  ## License
package/build/executor.js CHANGED
@@ -232,23 +232,23 @@ export class PolyglotExecutor {
  switch (language) {
  case "javascript":
  case "typescript":
- return `const FILE_CONTENT = require("fs").readFileSync(${escaped}, "utf-8");\n${code}`;
+ return `const FILE_CONTENT_PATH = ${escaped};\nconst FILE_CONTENT = require("fs").readFileSync(FILE_CONTENT_PATH, "utf-8");\n${code}`;
  case "python":
- return `with open(${escaped}, "r") as _f:\n FILE_CONTENT = _f.read()\n${code}`;
+ return `FILE_CONTENT_PATH = ${escaped}\nwith open(FILE_CONTENT_PATH, "r") as _f:\n FILE_CONTENT = _f.read()\n${code}`;
  case "shell":
- return `FILE_CONTENT=$(cat ${escaped})\n${code}`;
+ return `FILE_CONTENT_PATH=${escaped}\nFILE_CONTENT=$(cat ${escaped})\n${code}`;
  case "ruby":
- return `FILE_CONTENT = File.read(${escaped})\n${code}`;
+ return `FILE_CONTENT_PATH = ${escaped}\nFILE_CONTENT = File.read(FILE_CONTENT_PATH)\n${code}`;
  case "go":
- return `package main\n\nimport (\n\t"fmt"\n\t"os"\n)\n\nfunc main() {\n\tb, _ := os.ReadFile(${escaped})\n\tFILE_CONTENT := string(b)\n\t_ = FILE_CONTENT\n${code}\n}\n`;
+ return `package main\n\nimport (\n\t"fmt"\n\t"os"\n)\n\nvar FILE_CONTENT_PATH = ${escaped}\n\nfunc main() {\n\tb, _ := os.ReadFile(FILE_CONTENT_PATH)\n\tFILE_CONTENT := string(b)\n\t_ = FILE_CONTENT\n\t_ = fmt.Sprint()\n${code}\n}\n`;
  case "rust":
- return `use std::fs;\n\nfn main() {\n let file_content = fs::read_to_string(${escaped}).unwrap();\n${code}\n}\n`;
+ return `use std::fs;\n\nfn main() {\n let file_content_path = ${escaped};\n let file_content = fs::read_to_string(file_content_path).unwrap();\n${code}\n}\n`;
  case "php":
- return `<?php\n$FILE_CONTENT = file_get_contents(${escaped});\n${code}`;
+ return `<?php\n$FILE_CONTENT_PATH = ${escaped};\n$FILE_CONTENT = file_get_contents($FILE_CONTENT_PATH);\n${code}`;
  case "perl":
- return `open(my $fh, '<', ${escaped}) or die "Cannot open: $!";\nmy $FILE_CONTENT = do { local $/; <$fh> };\nclose($fh);\n${code}`;
+ return `my $FILE_CONTENT_PATH = ${escaped};\nopen(my $fh, '<', $FILE_CONTENT_PATH) or die "Cannot open: $!";\nmy $FILE_CONTENT = do { local $/; <$fh> };\nclose($fh);\n${code}`;
  case "r":
- return `FILE_CONTENT <- readLines(${escaped}, warn=FALSE)\nFILE_CONTENT <- paste(FILE_CONTENT, collapse="\\n")\n${code}`;
+ return `FILE_CONTENT_PATH <- ${escaped}\nFILE_CONTENT <- readLines(FILE_CONTENT_PATH, warn=FALSE)\nFILE_CONTENT <- paste(FILE_CONTENT, collapse="\\n")\n${code}`;
  }
  }
  }
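
Reviewer note: the executor.js change above makes each generated wrapper expose the file path as `FILE_CONTENT_PATH` in addition to `FILE_CONTENT`. The sketch below illustrates the pattern for two languages only. How `escaped` is produced is not visible in this diff, so `JSON.stringify` is an assumption here, and `wrapWithFileContent` is a hypothetical name, not the package's API.

```javascript
// Hypothetical helper mirroring the wrapper pattern in the diff above.
function wrapWithFileContent(language, filePath, code) {
  // Assumption: the path is embedded as a quoted string literal.
  const escaped = JSON.stringify(filePath);
  switch (language) {
    case "javascript":
      return (
        `const FILE_CONTENT_PATH = ${escaped};\n` +
        `const FILE_CONTENT = require("fs").readFileSync(FILE_CONTENT_PATH, "utf-8");\n` +
        code
      );
    case "python":
      return (
        `FILE_CONTENT_PATH = ${escaped}\n` +
        `with open(FILE_CONTENT_PATH, "r") as _f:\n` +
        `    FILE_CONTENT = _f.read()\n` +
        code
      );
    default:
      throw new Error(`Unsupported language in this sketch: ${language}`);
  }
}

// User code can now refer to the path as well as the contents.
const wrapped = wrapWithFileContent(
  "javascript",
  "access.log",
  "console.log(FILE_CONTENT_PATH, FILE_CONTENT.length);"
);
console.log(wrapped);
```

Declaring the path once and reading through the variable keeps the path and the contents in sync. In the diff, the shell case still passes `${escaped}` to `cat` directly rather than the new variable, which should behave the same.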
package/build/server.js CHANGED
@@ -5,11 +5,12 @@ import { z } from "zod";
  import { PolyglotExecutor } from "./executor.js";
  import { ContentStore } from "./store.js";
  import { detectRuntimes, getRuntimeSummary, getAvailableLanguages, hasBunRuntime, } from "./runtime.js";
+ const VERSION = "0.5.2";
  const runtimes = detectRuntimes();
  const available = getAvailableLanguages(runtimes);
  const server = new McpServer({
  name: "context-mode",
- version: "0.4.1",
+ version: VERSION,
  });
  const executor = new PolyglotExecutor({ runtimes });
  // Lazy singleton — no DB overhead unless index/search is used
@@ -53,8 +54,15 @@ server.registerTool("execute", {
  .optional()
  .default(30000)
  .describe("Max execution time in ms"),
+ intent: z
+ .string()
+ .optional()
+ .describe("What you're looking for in the output. When provided and output is large (>5KB), " +
+ "indexes output into knowledge base and returns section titles + previews — not full content. " +
+ "Use search() to retrieve specific sections. Example: 'failing tests', 'HTTP 500 errors'." +
+ "\n\nTIP: Use specific technical terms, not just concepts. Check 'Searchable terms' in the response for available vocabulary."),
  }),
- }, async ({ language, code, timeout }) => {
+ }, async ({ language, code, timeout, intent }) => {
  try {
  const result = await executor.execute({ language, code, timeout });
  if (result.timedOut) {
@@ -69,19 +77,34 @@ server.registerTool("execute", {
  };
  }
  if (result.exitCode !== 0) {
+ const output = `Exit code: ${result.exitCode}\n\nstdout:\n${result.stdout}\n\nstderr:\n${result.stderr}`;
+ if (intent && intent.trim().length > 0 && Buffer.byteLength(output) > INTENT_SEARCH_THRESHOLD) {
+ return {
+ content: [
+ { type: "text", text: intentSearch(output, intent, `execute:${language}:error`) },
+ ],
+ isError: true,
+ };
+ }
  return {
  content: [
- {
- type: "text",
- text: `Exit code: ${result.exitCode}\n\nstdout:\n${result.stdout}\n\nstderr:\n${result.stderr}`,
- },
+ { type: "text", text: output },
  ],
  isError: true,
  };
  }
+ const stdout = result.stdout || "(no output)";
+ // Intent-driven search: if intent provided and output is large enough
+ if (intent && intent.trim().length > 0 && Buffer.byteLength(stdout) > INTENT_SEARCH_THRESHOLD) {
+ return {
+ content: [
+ { type: "text", text: intentSearch(stdout, intent, `execute:${language}`) },
+ ],
+ };
+ }
  return {
  content: [
- { type: "text", text: result.stdout || "(no output)" },
+ { type: "text", text: stdout },
  ],
  };
  }
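
Reviewer note: the hunk above applies the same three-part condition on both the error path and the success path before routing output through `intentSearch`. A minimal sketch of that gate follows. The function name `shouldUseIntentSearch` is hypothetical; the threshold value is taken from the `INTENT_SEARCH_THRESHOLD` constant added elsewhere in this diff.

```javascript
// Value copied from the diff; measured in bytes, not characters.
const INTENT_SEARCH_THRESHOLD = 5_000;

// Hypothetical helper: true only when an intent was given, it is not just
// whitespace, and the output is larger than the threshold in UTF-8 bytes.
function shouldUseIntentSearch(output, intent) {
  return (
    typeof intent === "string" &&
    intent.trim().length > 0 &&
    Buffer.byteLength(output) > INTENT_SEARCH_THRESHOLD
  );
}

const asciiOutput = "x".repeat(4000); // 4000 characters, 4000 bytes
const accentedOutput = "é".repeat(3000); // 3000 characters, 6000 bytes in UTF-8

console.log(shouldUseIntentSearch(asciiOutput, "failing tests"));
console.log(shouldUseIntentSearch(accentedOutput, "failing tests"));
console.log(shouldUseIntentSearch(accentedOutput, "   "));
```

Because `Buffer.byteLength` counts UTF-8 bytes by default, output with many multi-byte characters crosses the threshold sooner than its character count suggests.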
@@ -111,6 +134,85 @@ function indexStdout(stdout, source) {
  };
  }
  // ─────────────────────────────────────────────────────────
+ // Helper: intent-driven search on execution output
+ // ─────────────────────────────────────────────────────────
+ const INTENT_SEARCH_THRESHOLD = 5_000; // bytes — ~80-100 lines
+ function intentSearch(stdout, intent, source, maxResults = 5) {
+ const totalLines = stdout.split("\n").length;
+ const totalBytes = Buffer.byteLength(stdout);
+ // Index into the PERSISTENT store so user can search() later
+ const persistent = getStore();
+ const indexed = persistent.indexPlainText(stdout, source);
+ // Search with an ephemeral store to find matching section titles
+ const ephemeral = new ContentStore(":memory:");
+ try {
+ ephemeral.indexPlainText(stdout, source);
+ let results = ephemeral.search(intent, maxResults);
+ // Score-based relaxed search: search ALL words, rank by match count
+ if (results.length === 0) {
+ const words = intent.trim().split(/\s+/).filter(w => w.length > 2).slice(0, 20);
+ if (words.length > 0) {
+ const sectionScores = new Map();
+ for (const word of words) {
+ const wordResults = ephemeral.search(word, 10);
+ for (const r of wordResults) {
+ const existing = sectionScores.get(r.title);
+ if (existing) {
+ existing.score += 1;
+ if (r.rank < existing.bestRank) {
+ existing.bestRank = r.rank;
+ existing.result = r;
+ }
+ }
+ else {
+ sectionScores.set(r.title, { result: r, score: 1, bestRank: r.rank });
+ }
+ }
+ }
+ results = Array.from(sectionScores.values())
+ .sort((a, b) => b.score - a.score || a.bestRank - b.bestRank)
+ .slice(0, maxResults)
+ .map(s => s.result);
+ }
+ }
+ // Extract distinctive terms as vocabulary hints for the LLM
+ const distinctiveTerms = persistent.getDistinctiveTerms(indexed.sourceId);
+ if (results.length === 0) {
+ const lines = [
+ `Indexed ${indexed.totalChunks} sections from "${source}" into knowledge base.`,
+ `No sections matched intent "${intent}" in ${totalLines}-line output (${(totalBytes / 1024).toFixed(1)}KB).`,
+ ];
+ if (distinctiveTerms.length > 0) {
+ lines.push("");
+ lines.push(`Searchable terms: ${distinctiveTerms.join(", ")}`);
+ }
+ lines.push("");
+ lines.push("Use search() to explore the indexed content.");
+ return lines.join("\n");
+ }
+ // Return ONLY titles + first-line previews — not full content
+ const lines = [
+ `Indexed ${indexed.totalChunks} sections from "${source}" into knowledge base.`,
+ `${results.length} sections matched "${intent}" (${totalLines} lines, ${(totalBytes / 1024).toFixed(1)}KB):`,
+ "",
+ ];
+ for (const r of results) {
+ const preview = r.content.split("\n")[0].slice(0, 120);
+ lines.push(` - ${r.title}: ${preview}`);
+ }
+ if (distinctiveTerms.length > 0) {
+ lines.push("");
+ lines.push(`Searchable terms: ${distinctiveTerms.join(", ")}`);
+ }
+ lines.push("");
+ lines.push("Use search() to retrieve full content of any section.");
+ return lines.join("\n");
+ }
+ finally {
+ ephemeral.close();
+ }
+ }
+ // ─────────────────────────────────────────────────────────
  // Tool: execute_file
  // ─────────────────────────────────────────────────────────
  server.registerTool("execute_file", {
@@ -142,8 +244,13 @@ server.registerTool("execute_file", {
  .optional()
  .default(30000)
  .describe("Max execution time in ms"),
+ intent: z
+ .string()
+ .optional()
+ .describe("What you're looking for in the output. When provided and output is large (>5KB), " +
+ "returns only matching sections via BM25 search instead of truncated output."),
  }),
- }, async ({ path, language, code, timeout }) => {
+ }, async ({ path, language, code, timeout, intent }) => {
  try {
  const result = await executor.executeFile({
  path,
@@ -163,19 +270,33 @@ server.registerTool("execute_file", {
  };
  }
  if (result.exitCode !== 0) {
+ const output = `Error processing ${path} (exit ${result.exitCode}):\n${result.stderr || result.stdout}`;
+ if (intent && intent.trim().length > 0 && Buffer.byteLength(output) > INTENT_SEARCH_THRESHOLD) {
+ return {
+ content: [
+ { type: "text", text: intentSearch(output, intent, `file:${path}:error`) },
+ ],
+ isError: true,
+ };
+ }
  return {
  content: [
- {
- type: "text",
- text: `Error processing ${path} (exit ${result.exitCode}):\n${result.stderr || result.stdout}`,
- },
+ { type: "text", text: output },
  ],
  isError: true,
  };
  }
+ const stdout = result.stdout || "(no output)";
+ if (intent && intent.trim().length > 0 && Buffer.byteLength(stdout) > INTENT_SEARCH_THRESHOLD) {
+ return {
+ content: [
+ { type: "text", text: intentSearch(stdout, intent, `file:${path}`) },
+ ],
+ };
+ }
  return {
  content: [
- { type: "text", text: result.stdout || "(no output)" },
+ { type: "text", text: stdout },
  ],
  };
  }
@@ -267,6 +388,10 @@ server.registerTool("search", {
  "- Look up API signatures ('Supabase RLS policy syntax')\n" +
  "- Get configuration details ('Tailwind responsive breakpoints')\n" +
  "- Find migration steps ('App Router data fetching')\n\n" +
+ "SEARCH TIPS:\n" +
+ "- Use specific technical terms, not concepts ('__proto__' not 'security')\n" +
+ "- Check 'Searchable terms' from execute/execute_file results for available vocabulary\n" +
+ "- Combine multiple specific terms for better results\n\n" +
  "Returns exact content — not summaries. Each result includes heading hierarchy and full section text.",
  inputSchema: z.object({
  query: z.string().describe("Natural language search query"),
@@ -444,7 +569,7 @@ server.registerTool("fetch_and_index", {
  async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
- console.error("Context Mode MCP server v0.4.0 running on stdio");
+ console.error(`Context Mode MCP server v${VERSION} running on stdio`);
  console.error(`Detected runtimes:\n${getRuntimeSummary(runtimes)}`);
  if (!hasBunRuntime()) {
  console.error("\nPerformance tip: Install Bun for 3-5x faster JS/TS execution");
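
Reviewer note: the `intentSearch` helper added in server.js falls back to a relaxed, per-word search when the full intent query matches nothing. The sketch below isolates that ranking step so it can be run without SQLite. `fakeSearch` is a hypothetical stand-in for `ContentStore.search`, and the `rank` values are made up, assuming (as the diff's `r.rank < existing.bestRank` comparison suggests) that a lower rank is a better match.

```javascript
// Hypothetical stand-in for ContentStore.search(): substring match only.
function fakeSearch(sections, word) {
  const needle = word.toLowerCase();
  return sections.filter((s) => s.content.toLowerCase().includes(needle));
}

// Query each intent word separately, count how many words hit each section,
// then sort by hit count (descending) and best rank (ascending) as tie-break.
function relaxedSearch(sections, intent, maxResults = 5) {
  const words = intent.trim().split(/\s+/).filter((w) => w.length > 2).slice(0, 20);
  const sectionScores = new Map();
  for (const word of words) {
    for (const r of fakeSearch(sections, word)) {
      const existing = sectionScores.get(r.title);
      if (existing) {
        existing.score += 1;
        if (r.rank < existing.bestRank) {
          existing.bestRank = r.rank;
          existing.result = r;
        }
      } else {
        sectionScores.set(r.title, { result: r, score: 1, bestRank: r.rank });
      }
    }
  }
  return Array.from(sectionScores.values())
    .sort((a, b) => b.score - a.score || a.bestRank - b.bestRank)
    .slice(0, maxResults)
    .map((s) => s.result);
}

// Illustrative data, not taken from the package's tests.
const sections = [
  { title: "startup", content: "listening on port 8080", rank: -1.0 },
  { title: "db", content: "connection refused by database", rank: -2.0 },
  { title: "net", content: "connection reset by peer", rank: -3.0 },
];
const hits = relaxedSearch(sections, "database connection timeout");
console.log(hits.map((h) => h.title)); // [ 'db', 'net' ]
```

Sections are keyed by title in the score map, so two chunks that share the same first line would be merged into one entry under this scheme.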
package/build/store.d.ts CHANGED
@@ -33,7 +33,14 @@ export declare class ContentStore {
  path?: string;
  source?: string;
  }): IndexResult;
+ /**
+ * Index plain-text output (logs, build output, test results) by splitting
+ * into fixed-size line groups. Unlike markdown indexing, this does not
+ * look for headings — it chunks by line count with overlap.
+ */
+ indexPlainText(content: string, source: string, linesPerChunk?: number): IndexResult;
  search(query: string, limit?: number): SearchResult[];
+ getDistinctiveTerms(sourceId: number, maxTerms?: number): string[];
  getStats(): StoreStats;
  close(): void;
  }
package/build/store.js CHANGED
@@ -12,6 +12,24 @@ import { readFileSync } from "node:fs";
  import { tmpdir } from "node:os";
  import { join } from "node:path";
  // ─────────────────────────────────────────────────────────
+ // Constants
+ // ─────────────────────────────────────────────────────────
+ const STOPWORDS = new Set([
+ "the", "and", "for", "are", "but", "not", "you", "all", "can", "had",
+ "her", "was", "one", "our", "out", "has", "his", "how", "its", "may",
+ "new", "now", "old", "see", "way", "who", "did", "get", "got", "let",
+ "say", "she", "too", "use", "will", "with", "this", "that", "from",
+ "they", "been", "have", "many", "some", "them", "than", "each", "make",
+ "like", "just", "over", "such", "take", "into", "year", "your", "good",
+ "could", "would", "about", "which", "their", "there", "other", "after",
+ "should", "through", "also", "more", "most", "only", "very", "when",
+ "what", "then", "these", "those", "being", "does", "done", "both",
+ "same", "still", "while", "where", "here", "were", "much",
+ // Common in code/changelogs
+ "update", "updates", "updated", "deps", "dev", "tests", "test",
+ "add", "added", "fix", "fixed", "run", "running", "using",
+ ]);
+ // ─────────────────────────────────────────────────────────
  // Helpers
  // ─────────────────────────────────────────────────────────
  function sanitizeQuery(query) {
@@ -94,6 +112,42 @@ export class ContentStore {
  codeChunks,
  };
  }
+ // ── Index Plain Text ──
+ /**
+ * Index plain-text output (logs, build output, test results) by splitting
+ * into fixed-size line groups. Unlike markdown indexing, this does not
+ * look for headings — it chunks by line count with overlap.
+ */
+ indexPlainText(content, source, linesPerChunk = 20) {
+ if (!content || content.trim().length === 0) {
+ const insertSource = this.#db.prepare("INSERT INTO sources (label, chunk_count, code_chunk_count) VALUES (?, 0, 0)");
+ const info = insertSource.run(source);
+ return {
+ sourceId: Number(info.lastInsertRowid),
+ label: source,
+ totalChunks: 0,
+ codeChunks: 0,
+ };
+ }
+ const chunks = this.#chunkPlainText(content, linesPerChunk);
+ const insertSource = this.#db.prepare("INSERT INTO sources (label, chunk_count, code_chunk_count) VALUES (?, ?, ?)");
+ const insertChunk = this.#db.prepare("INSERT INTO chunks (title, content, source_id, content_type) VALUES (?, ?, ?, ?)");
+ const transaction = this.#db.transaction(() => {
+ const info = insertSource.run(source, chunks.length, 0);
+ const sourceId = Number(info.lastInsertRowid);
+ for (const chunk of chunks) {
+ insertChunk.run(chunk.title, chunk.content, sourceId, "prose");
+ }
+ return sourceId;
+ });
+ const sourceId = transaction();
+ return {
+ sourceId,
+ label: source,
+ totalChunks: chunks.length,
+ codeChunks: 0,
+ };
+ }
  // ── Search ──
  search(query, limit = 3) {
  const sanitized = sanitizeQuery(query);
@@ -119,6 +173,46 @@ export class ContentStore {
  contentType: r.content_type,
  }));
  }
+ // ── Vocabulary ──
+ getDistinctiveTerms(sourceId, maxTerms = 40) {
+ const stats = this.#db
+ .prepare("SELECT chunk_count FROM sources WHERE id = ?")
+ .get(sourceId);
+ if (!stats || stats.chunk_count < 3)
+ return [];
+ const totalChunks = stats.chunk_count;
+ const minAppearances = 2;
+ const maxAppearances = Math.max(3, Math.ceil(totalChunks * 0.4));
+ const rows = this.#db
+ .prepare("SELECT content FROM chunks WHERE source_id = ?")
+ .all(sourceId);
+ // Count document frequency (how many sections contain each word)
+ const docFreq = new Map();
+ for (const row of rows) {
+ const words = new Set(row.content
+ .toLowerCase()
+ .split(/[^\p{L}\p{N}_-]+/u)
+ .filter((w) => w.length >= 3 && !STOPWORDS.has(w)));
+ for (const word of words) {
+ docFreq.set(word, (docFreq.get(word) ?? 0) + 1);
+ }
+ }
+ const filtered = Array.from(docFreq.entries())
+ .filter(([, count]) => count >= minAppearances && count <= maxAppearances);
+ // Score: IDF (rarity) + length bonus + identifier bonus (underscore/camelCase)
+ const scored = filtered.map(([word, count]) => {
+ const idf = Math.log(totalChunks / count);
+ const lenBonus = Math.min(word.length / 20, 0.5);
+ const hasSpecialChars = /[_]/.test(word);
+ const isCamelOrLong = word.length >= 12;
+ const identifierBonus = hasSpecialChars ? 1.5 : isCamelOrLong ? 0.8 : 0;
+ return { word, score: idf + lenBonus + identifierBonus };
+ });
+ return scored
+ .sort((a, b) => b.score - a.score)
+ .slice(0, maxTerms)
+ .map((s) => s.word);
+ }
  // ── Stats ──
  getStats() {
  const sources = this.#db.prepare("SELECT COUNT(*) as c FROM sources").get()?.c ?? 0;
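
Reviewer note: `getDistinctiveTerms` in the hunk above scores each candidate word as IDF plus a length bonus plus an identifier bonus, after keeping only words that appear in at least 2 chunks and at most `max(3, ceil(0.4 × totalChunks))` chunks. The sketch below reproduces that arithmetic on a hand-built frequency map. `scoreTerm`, `rankTerms`, and the sample counts are illustrative names and data, not part of the package.

```javascript
// Score one term from its document frequency, following the formula in the diff.
function scoreTerm(word, count, totalChunks) {
  const idf = Math.log(totalChunks / count); // rarer terms score higher
  const lenBonus = Math.min(word.length / 20, 0.5); // capped at 0.5
  const identifierBonus = word.includes("_") ? 1.5 : word.length >= 12 ? 0.8 : 0;
  return idf + lenBonus + identifierBonus;
}

// Filter by the frequency window, then rank by score.
function rankTerms(docFreq, totalChunks, maxTerms = 40) {
  const minAppearances = 2;
  const maxAppearances = Math.max(3, Math.ceil(totalChunks * 0.4));
  return Array.from(docFreq.entries())
    .filter(([, count]) => count >= minAppearances && count <= maxAppearances)
    .map(([word, count]) => ({ word, score: scoreTerm(word, count, totalChunks) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxTerms)
    .map((s) => s.word);
}

// Made-up frequencies over 10 chunks: "info" is too common, "rare" too rare.
const docFreq = new Map([
  ["error", 4],
  ["max_retries", 2],
  ["info", 9],
  ["rare", 1],
]);
const terms = rankTerms(docFreq, 10);
console.log(terms); // [ 'max_retries', 'error' ]
```

One observation from the diff: words are lowercased before scoring, so the "camelCase" part of the identifier bonus can only be triggered by length (12 or more characters), not by mixed case.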
@@ -203,6 +297,46 @@ export class ContentStore {
  flush();
  return chunks;
  }
+ #chunkPlainText(text, linesPerChunk) {
+ // Try blank-line splitting first for naturally-sectioned output
+ const sections = text.split(/\n\s*\n/);
+ if (sections.length >= 3 &&
+ sections.length <= 200 &&
+ sections.every((s) => Buffer.byteLength(s) < 5000)) {
+ return sections
+ .map((section, i) => {
+ const trimmed = section.trim();
+ const firstLine = trimmed.split("\n")[0].slice(0, 80);
+ return {
+ title: firstLine || `Section ${i + 1}`,
+ content: trimmed,
+ };
+ })
+ .filter((s) => s.content.length > 0);
+ }
+ const lines = text.split("\n");
+ // Small enough for a single chunk
+ if (lines.length <= linesPerChunk) {
+ return [{ title: "Output", content: text }];
+ }
+ // Fixed-size line groups with 2-line overlap
+ const chunks = [];
+ const overlap = 2;
+ const step = Math.max(linesPerChunk - overlap, 1);
+ for (let i = 0; i < lines.length; i += step) {
+ const slice = lines.slice(i, i + linesPerChunk);
+ if (slice.length === 0)
+ break;
+ const startLine = i + 1;
+ const endLine = Math.min(i + slice.length, lines.length);
+ const firstLine = slice[0]?.trim().slice(0, 80);
+ chunks.push({
+ title: firstLine || `Lines ${startLine}-${endLine}`,
+ content: slice.join("\n"),
+ });
+ }
+ return chunks;
+ }
  #buildTitle(headingStack, currentHeading) {
  if (headingStack.length === 0) {
  return currentHeading || "Untitled";
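
Reviewer note: when blank-line splitting does not apply, `#chunkPlainText` above falls back to fixed-size line groups with a 2-line overlap. The sketch below isolates that fallback and records line ranges instead of titles so the overlap is easy to see. `chunkLines` and its `start`/`end` fields are hypothetical; the defaults (20 lines per chunk, overlap of 2) are taken from the diff.

```javascript
// Split text into groups of `linesPerChunk` lines, advancing by
// (linesPerChunk - overlap) so consecutive chunks share `overlap` lines.
function chunkLines(text, linesPerChunk = 20, overlap = 2) {
  const lines = text.split("\n");
  if (lines.length <= linesPerChunk) {
    return [{ start: 1, end: lines.length, content: text }];
  }
  const step = Math.max(linesPerChunk - overlap, 1); // guard against a zero step
  const chunks = [];
  for (let i = 0; i < lines.length; i += step) {
    const slice = lines.slice(i, i + linesPerChunk);
    chunks.push({ start: i + 1, end: i + slice.length, content: slice.join("\n") });
  }
  return chunks;
}

// 50 numbered lines give chunks covering lines 1-20, 19-38, and 37-50.
const text = Array.from({ length: 50 }, (_, i) => `line ${i + 1}`).join("\n");
const chunks = chunkLines(text);
console.log(chunks.map((c) => `${c.start}-${c.end}`)); // [ '1-20', '19-38', '37-50' ]
```

The overlap means a match that straddles a chunk boundary still appears whole in at least one chunk. A side effect of this loop shape is that the final chunk can be very short, and for some input lengths it consists only of lines already present in the previous chunk.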
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "context-mode",
- "version": "0.4.1",
+ "version": "0.5.2",
  "type": "module",
  "description": "Claude Code MCP plugin that saves 94% of your context window. Sandboxed code execution, FTS5 knowledge base, and smart truncation.",
  "author": "Mert Koseoğlu",