jasper-context-compactor 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,57 @@
+ # Context Compactor
+
+ > Token-based context compaction for OpenClaw with local models (MLX, llama.cpp, Ollama)
+
+ ## Why?
+
+ Local LLM servers don't report context overflow errors like cloud APIs do. OpenClaw's built-in compaction relies on these errors to trigger. This plugin estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
+
+ ## Quick Start
+
+ ```bash
+ # One command setup (installs + configures)
+ npx jasper-context-compactor setup
+
+ # Restart gateway
+ openclaw gateway restart
+ ```
+
+ That's it! The setup command:
+ - Copies plugin files to `~/.openclaw/extensions/context-compactor/`
+ - Adds plugin config to `openclaw.json` with sensible defaults
+
+ ## Configuration
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "enabled": true,
+         "config": {
+           "maxTokens": 8000,
+           "keepRecentTokens": 2000,
+           "summaryMaxTokens": 1000,
+           "charsPerToken": 4
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ ## Commands
+
+ - `/context-stats` — Show current token usage
+ - `/compact-now` — Force fresh compaction on next message
+
+ ## How It Works
+
+ 1. Before each agent turn, estimates total context tokens
+ 2. If over `maxTokens`, splits messages into "old" and "recent"
+ 3. Summarizes old messages using the session model
+ 4. Injects summary + recent messages as context
+
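+ A rough sketch of steps 1 and 2 (condensed from the shipped `index.ts`, included later in this diff; `needsCompaction` is a made-up helper name for illustration):
+
+ ```ts
+ // Token estimation is a simple character-count heuristic.
+ const estimateTokens = (text: string, charsPerToken = 4): number =>
+   Math.ceil(text.length / charsPerToken);
+
+ // Compaction only runs when the estimated total exceeds maxTokens.
+ function needsCompaction(messages: { content: string }[], maxTokens = 8000): boolean {
+   const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
+   return total > maxTokens;
+ }
+ ```
+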
+ ## License
+
+ MIT
package/SKILL.md ADDED
@@ -0,0 +1,218 @@
+ ---
+ name: context-compactor
+ version: 0.1.0
+ description: Token-based context compaction for local models (MLX, llama.cpp, Ollama) that don't report context limits.
+ ---
+
+ # Context Compactor
+
+ Automatic context compaction for OpenClaw when using local models that don't properly report token limits or context overflow errors.
+
+ ## The Problem
+
+ Cloud APIs (Anthropic, OpenAI) report context overflow errors, allowing OpenClaw's built-in compaction to trigger. Local models (MLX, llama.cpp, Ollama) often:
+
+ - Silently truncate context
+ - Return garbage when context is exceeded
+ - Don't report accurate token counts
+
+ This leaves you with broken conversations when context gets too long.
+
+ ## The Solution
+
+ Context Compactor estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
+
+ ## How It Works
+
+ ```
+ 1. Message arrives
+ 2. before_agent_start hook fires
+ 3. Plugin estimates total context tokens
+ 4. If over maxTokens:
+    a. Split into "old" and "recent" messages
+    b. Summarize old messages (LLM or fallback)
+    c. Inject summary as compacted context
+ 5. Agent sees: summary + recent + new message
+ ```
+
+ ## Installation
+
+ ```bash
+ # One command setup (recommended)
+ npx jasper-context-compactor setup
+
+ # Restart gateway
+ openclaw gateway restart
+ ```
+
+ The setup command automatically:
+ - Copies plugin files to `~/.openclaw/extensions/context-compactor/`
+ - Adds plugin config to `openclaw.json` with sensible defaults
+
+ ## Configuration
+
+ Add to `openclaw.json`:
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "enabled": true,
+         "config": {
+           "maxTokens": 8000,
+           "keepRecentTokens": 2000,
+           "summaryMaxTokens": 1000,
+           "charsPerToken": 4
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ ### Options
+
+ | Option | Default | Description |
+ |--------|---------|-------------|
+ | `enabled` | `true` | Enable/disable the plugin |
+ | `maxTokens` | `8000` | Max context tokens before compaction |
+ | `keepRecentTokens` | `2000` | Tokens to preserve from recent messages |
+ | `summaryMaxTokens` | `1000` | Max tokens for the summary |
+ | `charsPerToken` | `4` | Token estimation ratio |
+ | `summaryModel` | (session model) | Model to use for summarization |
+
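+ The setup CLI (`cli.js`, later in this diff) derives the two secondary budgets from `maxTokens`; a minimal sketch of that ratio, useful if you size the limits by hand (values mirror the defaults above):
+
+ ```ts
+ // Ratios used by the setup CLI: keep ~25% of the budget as untouched recent
+ // context and cap the summary at ~12.5%.
+ const maxTokens = 8000;
+ const keepRecentTokens = Math.floor(maxTokens * 0.25);   // 2000
+ const summaryMaxTokens = Math.floor(maxTokens * 0.125);  // 1000
+ ```
+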
87
+ ### Tuning for Your Model
+
+ **MLX (8K context models):**
+ ```json
+ {
+   "maxTokens": 6000,
+   "keepRecentTokens": 1500,
+   "charsPerToken": 4
+ }
+ ```
+
+ **Larger context (32K models):**
+ ```json
+ {
+   "maxTokens": 28000,
+   "keepRecentTokens": 4000,
+   "charsPerToken": 4
+ }
+ ```
+
+ **Small context (4K models):**
+ ```json
+ {
+   "maxTokens": 3000,
+   "keepRecentTokens": 800,
+   "charsPerToken": 4
+ }
+ ```
+
+ ## Commands
+
+ ### `/compact-now`
+
+ Force clear the summary cache and trigger fresh compaction on next message.
+
+ ```
+ /compact-now
+ ```
+
+ ### `/context-stats`
+
+ Show current context token usage and whether compaction would trigger.
+
+ ```
+ /context-stats
+ ```
+
+ Output:
+ ```
+ 📊 Context Stats
+
+ Messages: 47 total
+ - User: 23
+ - Assistant: 24
+ - System: 0
+
+ Estimated Tokens: ~6,234
+ Limit: 8,000
+ Usage: 77.9%
+
+ ✅ Within limits
+ ```
+
+ ## How Summarization Works
+
+ When compaction triggers:
+
+ 1. **Split messages** into "old" (to summarize) and "recent" (to keep)
+ 2. **Generate summary** using the session model (or configured `summaryModel`)
+ 3. **Cache the summary** to avoid regenerating for the same content
+ 4. **Inject context** with the summary prepended
+
+ If the LLM runtime isn't available (e.g., during startup), a fallback truncation-based summary is used.
+
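+ A condensed sketch of the caching step (the real logic lives in `index.ts`, shown later in this diff; here the `summarize` callback stands in for the runtime LLM call or the truncation fallback):
+
+ ```ts
+ type Message = { role: 'user' | 'assistant' | 'system'; content: string };
+
+ // Summaries are cached in memory, keyed by the concatenated old messages
+ // (index.ts hashes this string), so an unchanged prefix is never re-summarized.
+ const summaryCache = new Map<string, string>();
+
+ async function summarizeOld(
+   old: Message[],
+   summarize: (formatted: string) => Promise<string>
+ ): Promise<string> {
+   const key = old.map(m => `${m.role}:${m.content}`).join('|');
+   const cached = summaryCache.get(key);
+   if (cached) return cached;
+   const formatted = old.map(m => `[${m.role.toUpperCase()}]: ${m.content}`).join('\n\n');
+   const summary = await summarize(formatted);
+   summaryCache.set(key, summary);
+   return summary;
+ }
+ ```
+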
161
+ ## Differences from Built-in Compaction
+
+ | Feature | Built-in | Context Compactor |
+ |---------|----------|-------------------|
+ | Trigger | Model reports overflow | Token estimate threshold |
+ | Works with local models | ❌ (needs overflow error) | ✅ |
+ | Persists to transcript | ✅ | ❌ (session-only) |
+ | Summarization | Pi runtime | Plugin LLM call |
+
+ Context Compactor is **complementary** — it catches cases before they hit the model's hard limit.
+
+ ## Troubleshooting
+
+ **Summary quality is poor:**
+ - Try a better `summaryModel`
+ - Increase `summaryMaxTokens`
+ - The truncation fallback is used if the LLM runtime isn't available
+
+ **Compaction triggers too often:**
+ - Increase `maxTokens`
+ - Decrease `keepRecentTokens` (less recent context is kept verbatim, so each compaction shrinks the context further)
+
+ **Not compacting when expected:**
+ - Check `/context-stats` to see current usage
+ - Verify `enabled: true` in config
+ - Check logs for `[context-compactor]` messages
+
+ **Characters per token wrong:**
+ - Default of 4 works for English
+ - Try 3 for CJK languages
+ - Try 5 for highly technical content
+
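+ A quick worked example of how the ratio shifts the estimate (the estimator is simply `Math.ceil(chars / charsPerToken)`):
+
+ ```ts
+ // The same 27,000-character transcript under different ratios:
+ Math.ceil(27000 / 4); // 6750 tokens, under an 8,000-token maxTokens
+ Math.ceil(27000 / 3); // 9000 tokens, over the limit, so compaction would trigger
+ ```
+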
193
+ ## Logs
+
+ Enable debug logging:
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "config": {
+           "logLevel": "debug"
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ Look for:
+ - `[context-compactor] Current context: ~XXXX tokens`
+ - `[context-compactor] Compacted X messages → summary`
+
+ ## Links
+
+ - **GitHub**: https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor
+ - **OpenClaw Docs**: https://docs.openclaw.ai/concepts/compaction
package/cli.js ADDED
@@ -0,0 +1,269 @@
+ #!/usr/bin/env node
+ /**
+  * Context Compactor CLI
+  * Setup script with interactive token limit configuration
+  */
+
+ const fs = require('fs');
+ const path = require('path');
+ const os = require('os');
+ const readline = require('readline');
+
+ const OPENCLAW_CONFIG = path.join(os.homedir(), '.openclaw', 'openclaw.json');
+ const OPENCLAW_EXTENSIONS = path.join(os.homedir(), '.openclaw', 'extensions', 'context-compactor');
+
+ function log(msg) {
+   console.log(`📦 ${msg}`);
+ }
+
+ function error(msg) {
+   console.error(`❌ ${msg}`);
+ }
+
+ function prompt(question) {
+   const rl = readline.createInterface({
+     input: process.stdin,
+     output: process.stdout
+   });
+
+   return new Promise(resolve => {
+     rl.question(question, answer => {
+       rl.close();
+       resolve(answer.trim());
+     });
+   });
+ }
+
+ async function detectModelContextWindow(config) {
+   // Try to detect from OpenClaw config
+   const model = config?.agents?.defaults?.model?.primary;
+
+   if (!model) return null;
+
+   // Common context windows (conservative estimates)
+   const knownContexts = {
+     // Anthropic
+     'anthropic/claude-opus': 200000,
+     'anthropic/claude-sonnet': 200000,
+     'anthropic/claude-haiku': 200000,
+     // OpenAI
+     'openai/gpt-4': 128000,
+     'openai/gpt-4-turbo': 128000,
+     'openai/gpt-3.5-turbo': 16000,
+     // Local models (common sizes)
+     'mlx': 8000,
+     'ollama': 8000,
+     'llama': 8000,
+     'mistral': 32000,
+     'qwen': 32000,
+   };
+
+   // Check for exact match first
+   for (const [pattern, tokens] of Object.entries(knownContexts)) {
+     if (model.toLowerCase().includes(pattern.toLowerCase())) {
+       return { model, tokens, source: 'detected' };
+     }
+   }
+
+   return { model, tokens: null, source: 'unknown' };
+ }
+
+ async function setup() {
+   log('Context Compactor — Setup');
+   console.log('='.repeat(50));
+
+   // Check if OpenClaw is installed
+   const openclawDir = path.join(os.homedir(), '.openclaw');
+   if (!fs.existsSync(openclawDir)) {
+     error('OpenClaw not detected (~/.openclaw not found)');
+     console.log('Install OpenClaw first: https://docs.openclaw.ai');
+     process.exit(1);
+   }
+
+   // Copy plugin files to extensions directory
+   console.log('');
+   log('Installing plugin files...');
+   fs.mkdirSync(OPENCLAW_EXTENSIONS, { recursive: true });
+
+   const pluginDir = path.dirname(__filename);
+   const filesToCopy = ['index.ts', 'openclaw.plugin.json'];
+
+   for (const file of filesToCopy) {
+     const src = path.join(pluginDir, file);
+     const dest = path.join(OPENCLAW_EXTENSIONS, file);
+     if (fs.existsSync(src)) {
+       fs.copyFileSync(src, dest);
+       console.log(` ✓ Copied: ${file}`);
+     }
+   }
+
+   // Load existing config
+   let config = {};
+   if (fs.existsSync(OPENCLAW_CONFIG)) {
+     try {
+       config = JSON.parse(fs.readFileSync(OPENCLAW_CONFIG, 'utf8'));
+     } catch (e) {
+       error(`Could not parse openclaw.json: ${e.message}`);
+       process.exit(1);
+     }
+   }
+
+   // Determine token limit
+   console.log('');
+   log('Configuring token limits...');
+   console.log('');
+   console.log(' To set the right limit, I can check your OpenClaw config');
+   console.log(' to see what model you\'re using.');
+   console.log('');
+   console.log(' 🔒 Privacy: This runs 100% locally. Nothing is sent externally.');
+   console.log('');
+
+   const checkConfig = await prompt(' Check your config for model info? (y/n): ');
+
+   let maxTokens = 8000; // Default
+   let detectedInfo = null;
+
+   if (checkConfig.toLowerCase() === 'y' || checkConfig.toLowerCase() === 'yes') {
+     detectedInfo = await detectModelContextWindow(config);
+
+     if (detectedInfo && detectedInfo.tokens) {
+       console.log('');
+       console.log(` ✓ Detected model: ${detectedInfo.model}`);
+       console.log(` ✓ Context window: ~${detectedInfo.tokens.toLocaleString()} tokens`);
+
+       // Suggest a safe limit (leave 20% headroom)
+       const suggested = Math.floor(detectedInfo.tokens * 0.8);
+       console.log(` → Suggested maxTokens: ${suggested.toLocaleString()} (80% of context)`);
+       console.log('');
+
+       const useDetected = await prompt(` Use ${suggested.toLocaleString()} tokens? (y/n, or enter custom): `);
+
+       if (useDetected.toLowerCase() === 'y' || useDetected.toLowerCase() === 'yes') {
+         maxTokens = suggested;
+       } else if (/^\d+$/.test(useDetected)) {
+         maxTokens = parseInt(useDetected, 10);
+       }
+     } else if (detectedInfo && detectedInfo.model) {
+       console.log('');
+       console.log(` ⚠ Found model: ${detectedInfo.model}`);
+       console.log(' ⚠ Could not determine context window automatically.');
+     }
+   }
+
+   // If we still don't have a good value, ask manually
+   if (maxTokens === 8000 && (!detectedInfo || !detectedInfo.tokens)) {
+     console.log('');
+     console.log(' Common context windows:');
+     console.log(' • MLX / llama.cpp (small): 4,000 - 8,000');
+     console.log(' • Mistral / Qwen (medium): 32,000');
+     console.log(' • Claude / GPT-4 (large): 128,000+');
+     console.log('');
+     console.log(' Check your model\'s docs or LM Studio/Ollama settings.');
+     console.log(' Config location: ~/.openclaw/openclaw.json');
+     console.log('');
+
+     const customTokens = await prompt(' Enter maxTokens (default 8000): ');
+     if (/^\d+$/.test(customTokens)) {
+       maxTokens = parseInt(customTokens, 10);
+     }
+   }
+
+   // Calculate keepRecentTokens (25% of max)
+   const keepRecentTokens = Math.floor(maxTokens * 0.25);
+   const summaryMaxTokens = Math.floor(maxTokens * 0.125);
+
+   console.log('');
+   console.log(` Configuration:`);
+   console.log(` maxTokens: ${maxTokens.toLocaleString()}`);
+   console.log(` keepRecentTokens: ${keepRecentTokens.toLocaleString()} (25%)`);
+   console.log(` summaryMaxTokens: ${summaryMaxTokens.toLocaleString()} (12.5%)`);
+
+   // Update openclaw.json
+   console.log('');
+   log('Updating OpenClaw config...');
+
+   // Initialize plugins structure if needed
+   if (!config.plugins) config.plugins = {};
+   if (!config.plugins.entries) config.plugins.entries = {};
+
+   // Add/update plugin config
+   config.plugins.entries['context-compactor'] = {
+     enabled: true,
+     config: {
+       maxTokens,
+       keepRecentTokens,
+       summaryMaxTokens,
+       charsPerToken: 4
+     }
+   };
+
+   // Write back with nice formatting
+   fs.writeFileSync(OPENCLAW_CONFIG, JSON.stringify(config, null, 2) + '\n');
+   console.log(' ✓ Saved to openclaw.json');
+
+   console.log('');
+   console.log('='.repeat(50));
+   log('Setup complete!');
+   console.log('');
+   console.log('Next steps:');
+   console.log(' 1. Restart OpenClaw: openclaw gateway restart');
+   console.log(' 2. Check status in chat: /context-stats');
+   console.log('');
+   console.log('To adjust later, edit ~/.openclaw/openclaw.json');
+   console.log('under plugins.entries["context-compactor"].config');
+ }
+
+ function showHelp() {
+   console.log(`
+ Context Compactor
+ Token-based context compaction for local models
+
+ USAGE:
+ npx jasper-context-compactor setup    Install and configure plugin
+ npx jasper-context-compactor help     Show this help
+
+ WHAT IT DOES:
+ - Copies plugin files to ~/.openclaw/extensions/context-compactor/
+ - Detects your model's context window (with permission)
+ - Configures appropriate token limits
+ - Enables automatic context compaction for local models
+
+ CONFIGURATION:
+ After setup, adjust in openclaw.json:
+
+ "context-compactor": {
+   "enabled": true,
+   "config": {
+     "maxTokens": 8000,        // Your model's context limit minus buffer
+     "keepRecentTokens": 2000  // Recent context to preserve
+   }
+ }
+
+ COMMANDS (in chat):
+ /context-stats     Show current token usage
+ /compact-now       Force fresh compaction
+ `);
+ }
+
+ // Main
+ const command = process.argv[2];
+
+ switch (command) {
+   case 'setup':
+   case 'install':
+     setup().catch(err => {
+       error(err.message);
+       process.exit(1);
+     });
+     break;
+   case 'help':
+   case '--help':
+   case '-h':
+   case undefined:
+     showHelp();
+     break;
+   default:
+     error(`Unknown command: ${command}`);
+     showHelp();
+     process.exit(1);
+ }
package/index.ts ADDED
@@ -0,0 +1,399 @@
+ /**
+  * Context Compactor - OpenClaw Plugin
+  *
+  * Token-based context compaction for local models (MLX, llama.cpp, Ollama)
+  * that don't properly report context window limits.
+  *
+  * How it works:
+  * 1. Before each agent turn, estimates total context tokens
+  * 2. If over threshold, summarizes older messages
+  * 3. Injects summary + recent messages as the new context
+  *
+  * This is a client-side solution that doesn't require model cooperation.
+  */
+
+ import * as fs from 'fs';
+ import * as path from 'path';
+ import * as os from 'os';
+
+ interface PluginConfig {
+   enabled?: boolean;
+   maxTokens?: number;
+   keepRecentTokens?: number;
+   summaryMaxTokens?: number;
+   charsPerToken?: number;
+   summaryModel?: string;
+   logLevel?: 'debug' | 'info' | 'warn' | 'error';
+ }
+
+ interface Message {
+   role: 'user' | 'assistant' | 'system';
+   content: string;
+ }
+
+ interface SessionEntry {
+   id: string;
+   type: string;
+   message?: Message;
+   content?: string;
+   role?: string;
+   parentId?: string;
+ }
+
+ interface PluginApi {
+   config: {
+     plugins?: {
+       entries?: {
+         'context-compactor'?: {
+           config?: PluginConfig;
+         };
+       };
+     };
+   };
+   logger: {
+     info: (msg: string) => void;
+     warn: (msg: string) => void;
+     error: (msg: string) => void;
+     debug: (msg: string) => void;
+   };
+   registerTool: (tool: any) => void;
+   registerCommand: (cmd: any) => void;
+   registerGatewayMethod: (name: string, handler: any) => void;
+   on: (event: string, handler: (event: any) => Promise<any>) => void;
+   runtime?: {
+     llm?: {
+       complete: (opts: { model?: string; messages: Message[]; maxTokens?: number }) => Promise<{ content: string }>;
+     };
+   };
+ }
+
+ // Simple token estimator (chars / charsPerToken)
+ function estimateTokens(text: string, charsPerToken: number): number {
+   return Math.ceil(text.length / charsPerToken);
+ }
+
+ // Read session transcript
+ function readTranscript(sessionPath: string): SessionEntry[] {
+   if (!fs.existsSync(sessionPath)) return [];
+
+   const content = fs.readFileSync(sessionPath, 'utf8');
+   const lines = content.trim().split('\n').filter(Boolean);
+
+   return lines.map(line => {
+     try {
+       return JSON.parse(line);
+     } catch {
+       return null;
+     }
+   }).filter(Boolean) as SessionEntry[];
+ }
+
+ // Extract messages from session entries
+ function extractMessages(entries: SessionEntry[]): Message[] {
+   const messages: Message[] = [];
+
+   for (const entry of entries) {
+     if (entry.type === 'message' && entry.message) {
+       messages.push(entry.message);
+     } else if (entry.role && entry.content) {
+       messages.push({ role: entry.role as Message['role'], content: entry.content });
+     }
+   }
+
+   return messages;
+ }
+
+ // Split messages into "old" (to summarize) and "recent" (to keep)
+ function splitMessages(
+   messages: Message[],
+   keepRecentTokens: number,
+   charsPerToken: number
+ ): { old: Message[]; recent: Message[] } {
+   let recentTokens = 0;
+   let splitIndex = messages.length;
+
+   // Walk backwards from end, counting tokens
+   for (let i = messages.length - 1; i >= 0; i--) {
+     const msgTokens = estimateTokens(messages[i].content, charsPerToken);
+     if (recentTokens + msgTokens > keepRecentTokens) {
+       splitIndex = i + 1;
+       break;
+     }
+     recentTokens += msgTokens;
+     if (i === 0) splitIndex = 0;
+   }
+
+   return {
+     old: messages.slice(0, splitIndex),
+     recent: messages.slice(splitIndex),
+   };
+ }
+
+ // Format messages for summarization
+ function formatForSummary(messages: Message[]): string {
+   return messages.map(m => `[${m.role.toUpperCase()}]: ${m.content}`).join('\n\n');
+ }
+
+ // In-memory cache for summaries (avoid re-summarizing the same content)
+ const summaryCache = new Map<string, string>();
+
+ function hashMessages(messages: Message[]): string {
+   const content = messages.map(m => `${m.role}:${m.content}`).join('|');
+   // Simple hash
+   let hash = 0;
+   for (let i = 0; i < content.length; i++) {
+     const char = content.charCodeAt(i);
+     hash = ((hash << 5) - hash) + char;
+     hash = hash & hash;
+   }
+   return hash.toString(16);
+ }
+
+ export default function register(api: PluginApi) {
+   const cfg = api.config.plugins?.entries?.['context-compactor']?.config ?? {};
+
+   if (cfg.enabled === false) {
+     api.logger.info('[context-compactor] Plugin disabled');
+     return;
+   }
+
+   const maxTokens = cfg.maxTokens ?? 8000;
+   const keepRecentTokens = cfg.keepRecentTokens ?? 2000;
+   const summaryMaxTokens = cfg.summaryMaxTokens ?? 1000;
+   const charsPerToken = cfg.charsPerToken ?? 4;
+   const summaryModel = cfg.summaryModel;
+
+   api.logger.info(`[context-compactor] Initialized (maxTokens=${maxTokens}, keepRecent=${keepRecentTokens})`);
+
+   // ============================================================================
+   // Core: before_agent_start hook
+   // ============================================================================
+
+   api.on('before_agent_start', async (event: {
+     prompt?: string;
+     sessionKey?: string;
+     sessionId?: string;
+     context?: {
+       sessionFile?: string;
+       messages?: Message[];
+     };
+   }) => {
+     try {
+       // Get current messages from context or session file
+       let messages: Message[] = event.context?.messages ?? [];
+
+       if (messages.length === 0 && event.context?.sessionFile) {
+         const entries = readTranscript(event.context.sessionFile);
+         messages = extractMessages(entries);
+       }
+
+       if (messages.length === 0) {
+         api.logger.debug?.('[context-compactor] No messages to compact');
+         return;
+       }
+
+       // Estimate total tokens
+       const totalTokens = messages.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+
+       api.logger.debug?.(`[context-compactor] Current context: ~${totalTokens} tokens`);
+
+       // Check if compaction needed
+       if (totalTokens <= maxTokens) {
+         return; // Under limit, no action needed
+       }
+
+       api.logger.info(`[context-compactor] Context (~${totalTokens} tokens) exceeds limit (${maxTokens}), compacting...`);
+
+       // Split into old and recent
+       const { old, recent } = splitMessages(messages, keepRecentTokens, charsPerToken);
+
+       if (old.length === 0) {
+         api.logger.warn('[context-compactor] No old messages to summarize, skipping');
+         return;
+       }
+
+       // Check cache
+       const cacheKey = hashMessages(old);
+       let summary = summaryCache.get(cacheKey);
+
+       if (!summary) {
+         // Generate summary
+         const formatted = formatForSummary(old);
+
+         const summaryPrompt = `Summarize this conversation concisely, preserving key decisions, context, and important details. Focus on information that would be needed to continue the conversation coherently.
+
+ CONVERSATION:
+ ${formatted}
+
+ SUMMARY (be concise, max ${Math.floor(summaryMaxTokens * charsPerToken)} characters):`;
+
+         if (api.runtime?.llm?.complete) {
+           // Use OpenClaw's LLM runtime
+           const result = await api.runtime.llm.complete({
+             model: summaryModel,
+             messages: [{ role: 'user', content: summaryPrompt }],
+             maxTokens: summaryMaxTokens,
+           });
+           summary = result.content;
+         } else {
+           // Fallback: simple truncation-based summary
+           api.logger.warn('[context-compactor] LLM runtime not available, using truncation fallback');
+           const maxChars = summaryMaxTokens * charsPerToken;
+           summary = `[Context Summary - ${old.length} messages compacted]\n\n`;
+
+           // Keep first and last few messages
+           const keepCount = Math.min(3, Math.floor(old.length / 2));
+           const first = old.slice(0, keepCount);
+           const last = old.slice(-keepCount);
+
+           summary += 'Earlier:\n' + first.map(m => `- ${m.role}: ${m.content.slice(0, 200)}...`).join('\n');
+           summary += '\n\nRecent:\n' + last.map(m => `- ${m.role}: ${m.content.slice(0, 200)}...`).join('\n');
+
+           if (summary.length > maxChars) {
+             summary = summary.slice(0, maxChars) + '...';
+           }
+         }
+
+         // Cache it
+         summaryCache.set(cacheKey, summary);
+
+         // Limit cache size
+         if (summaryCache.size > 100) {
+           const firstKey = summaryCache.keys().next().value;
+           if (firstKey) summaryCache.delete(firstKey);
+         }
+       }
+
+       const recentTokens = recent.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+       const summaryTokens = estimateTokens(summary, charsPerToken);
+       const newTotal = summaryTokens + recentTokens;
+
+       api.logger.info(
+         `[context-compactor] Compacted ${old.length} messages → summary (~${summaryTokens} tokens) + ${recent.length} recent (~${recentTokens} tokens) = ~${newTotal} tokens`
+       );
+
+       // Return context modification
+       return {
+         prependContext: `<compacted-context>
+ The following is a summary of earlier conversation that was compacted to fit context limits:
+
+ ${summary}
+
+ ---
+ Recent conversation continues below:
+ </compacted-context>`,
+         // Note: We can't actually replace messages in before_agent_start,
+         // we can only prepend context. For full message replacement,
+         // we'd need a different hook or session modification.
+       };
+
+     } catch (err: any) {
+       api.logger.error(`[context-compactor] Error: ${err.message}`);
+     }
+   });
+
+   // ============================================================================
+   // Command: /compact-now
+   // ============================================================================
+
+   api.registerCommand({
+     name: 'compact-now',
+     description: 'Force context compaction on next message',
+     acceptsArgs: false,
+     requireAuth: true,
+     handler: async () => {
+       // Clear cache to force fresh summary
+       summaryCache.clear();
+       return { text: '🧹 Context compaction cache cleared. Next message will trigger fresh compaction if needed.' };
+     },
+   });
+
+   // ============================================================================
+   // Command: /context-stats
+   // ============================================================================
+
+   api.registerCommand({
+     name: 'context-stats',
+     description: 'Show estimated context token usage',
+     acceptsArgs: false,
+     requireAuth: true,
+     handler: async (ctx: { sessionFile?: string }) => {
+       try {
+         if (!ctx.sessionFile) {
+           return { text: '⚠️ Session file not available' };
+         }
+
+         const entries = readTranscript(ctx.sessionFile);
+         const messages = extractMessages(entries);
+
+         const totalTokens = messages.reduce(
+           (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+           0
+         );
+
+         const userMsgs = messages.filter(m => m.role === 'user').length;
+         const assistantMsgs = messages.filter(m => m.role === 'assistant').length;
+         const systemMsgs = messages.filter(m => m.role === 'system').length;
+
+         return {
+           text: `📊 **Context Stats**
+
+ **Messages:** ${messages.length} total
+ - User: ${userMsgs}
+ - Assistant: ${assistantMsgs}
+ - System: ${systemMsgs}
+
+ **Estimated Tokens:** ~${totalTokens.toLocaleString()}
+ **Limit:** ${maxTokens.toLocaleString()}
+ **Usage:** ${((totalTokens / maxTokens) * 100).toFixed(1)}%
+
+ ${totalTokens > maxTokens ? '⚠️ **Over limit - compaction will trigger**' : '✅ Within limits'}`,
+         };
+       } catch (err: any) {
+         return { text: `❌ Error: ${err.message}` };
+       }
+     },
+   });
+
+   // ============================================================================
+   // RPC: context-compactor.stats
+   // ============================================================================
+
+   api.registerGatewayMethod('context-compactor.stats', async ({ params, respond }: any) => {
+     try {
+       const { sessionFile } = params;
+
+       if (!sessionFile || !fs.existsSync(sessionFile)) {
+         respond(true, { error: 'Session file not found', messages: 0, tokens: 0 });
+         return;
+       }
+
+       const entries = readTranscript(sessionFile);
+       const messages = extractMessages(entries);
+
+       const totalTokens = messages.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+
+       respond(true, {
+         messages: messages.length,
+         tokens: totalTokens,
+         maxTokens,
+         needsCompaction: totalTokens > maxTokens,
+         cacheSize: summaryCache.size,
+       });
+     } catch (err: any) {
+       respond(false, { error: err.message });
+     }
+   });
+ }
+
+ export const id = 'context-compactor';
+ export const name = 'Context Compactor - Local Model Support';
package/openclaw.plugin.json ADDED
@@ -0,0 +1,53 @@
+ {
+   "id": "context-compactor",
+   "name": "Context Compactor",
+   "version": "0.1.0",
+   "description": "Token-based context compaction for local models (MLX, llama.cpp) that don't report context limits",
+   "configSchema": {
+     "type": "object",
+     "additionalProperties": false,
+     "properties": {
+       "enabled": {
+         "type": "boolean",
+         "default": true
+       },
+       "maxTokens": {
+         "type": "number",
+         "default": 8000,
+         "description": "Maximum context tokens before compaction triggers"
+       },
+       "keepRecentTokens": {
+         "type": "number",
+         "default": 2000,
+         "description": "Tokens to keep from recent messages (not summarized)"
+       },
+       "summaryMaxTokens": {
+         "type": "number",
+         "default": 1000,
+         "description": "Maximum tokens for the compaction summary"
+       },
+       "charsPerToken": {
+         "type": "number",
+         "default": 4,
+         "description": "Estimated characters per token (for simple counting)"
+       },
+       "summaryModel": {
+         "type": "string",
+         "description": "Model to use for summarization (defaults to session model)"
+       },
+       "logLevel": {
+         "type": "string",
+         "enum": ["debug", "info", "warn", "error"],
+         "default": "info"
+       }
+     }
+   },
+   "uiHints": {
+     "enabled": { "label": "Enable Context Compactor" },
+     "maxTokens": { "label": "Max Context Tokens", "placeholder": "8000" },
+     "keepRecentTokens": { "label": "Keep Recent Tokens", "placeholder": "2000" },
+     "summaryMaxTokens": { "label": "Summary Max Tokens", "placeholder": "1000" },
+     "charsPerToken": { "label": "Chars per Token (estimate)", "placeholder": "4" },
+     "summaryModel": { "label": "Summary Model (optional)", "placeholder": "Leave blank for session model" }
+   }
+ }
package/package.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "name": "jasper-context-compactor",
+   "version": "0.2.1",
+   "description": "Context compaction plugin for OpenClaw - works with local models (MLX, llama.cpp) that don't report token limits",
+   "main": "index.ts",
+   "bin": {
+     "context-compactor": "./cli.js"
+   },
+   "openclaw": {
+     "extensions": ["./index.ts"]
+   },
+   "keywords": [
+     "openclaw",
+     "context",
+     "compaction",
+     "mlx",
+     "local-llm",
+     "token-limit"
+   ],
+   "author": "E.x.O. Entertainment Studios Inc.",
+   "license": "MIT",
+   "repository": {
+     "type": "git",
+     "url": "https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor.git"
+   },
+   "engines": {
+     "node": ">=18.0.0"
+   },
+   "files": [
+     "cli.js",
+     "index.ts",
+     "openclaw.plugin.json",
+     "SKILL.md",
+     "README.md"
+   ]
+ }