jasper-context-compactor 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,57 @@
+ # Context Compactor
+
+ > Token-based context compaction for OpenClaw with local models (MLX, llama.cpp, Ollama)
+
+ ## Why?
+
+ Local LLM servers don't report context overflow errors like cloud APIs do. OpenClaw's built-in compaction relies on these errors to trigger. This plugin estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
+
+ ## Quick Start
+
+ ```bash
+ # One command setup (installs + configures)
+ npx jasper-context-compactor setup
+
+ # Restart gateway
+ openclaw gateway restart
+ ```
+
+ That's it! The setup command:
+ - Copies plugin files to `~/.openclaw/extensions/context-compactor/`
+ - Adds plugin config to `openclaw.json` with sensible defaults
+
+ ## Configuration
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "enabled": true,
+         "config": {
+           "maxTokens": 8000,
+           "keepRecentTokens": 2000,
+           "summaryMaxTokens": 1000,
+           "charsPerToken": 4
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ ## Commands
+
+ - `/context-stats` — Show current token usage
+ - `/compact-now` — Force fresh compaction on next message
+
+ ## How It Works
+
+ 1. Before each agent turn, estimates total context tokens
+ 2. If over `maxTokens`, splits messages into "old" and "recent"
+ 3. Summarizes old messages using the session model
+ 4. Injects summary + recent messages as context
+
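+ A rough sketch of steps 1 and 2 (condensed from the shipped `index.ts`, included later in this diff; `needsCompaction` is a made-up helper name for illustration):
+
+ ```ts
+ // Token estimation is a simple character-count heuristic.
+ const estimateTokens = (text: string, charsPerToken = 4): number =>
+   Math.ceil(text.length / charsPerToken);
+
+ // Compaction only runs when the estimated total exceeds maxTokens.
+ function needsCompaction(messages: { content: string }[], maxTokens = 8000): boolean {
+   const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
+   return total > maxTokens;
+ }
+ ```
+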
+ ## License
+
+ MIT
package/SKILL.md ADDED
@@ -0,0 +1,218 @@
+ ---
+ name: context-compactor
+ version: 0.1.0
+ description: Token-based context compaction for local models (MLX, llama.cpp, Ollama) that don't report context limits.
+ ---
+
+ # Context Compactor
+
+ Automatic context compaction for OpenClaw when using local models that don't properly report token limits or context overflow errors.
+
+ ## The Problem
+
+ Cloud APIs (Anthropic, OpenAI) report context overflow errors, allowing OpenClaw's built-in compaction to trigger. Local models (MLX, llama.cpp, Ollama) often:
+
+ - Silently truncate context
+ - Return garbage when context is exceeded
+ - Don't report accurate token counts
+
+ This leaves you with broken conversations when context gets too long.
+
+ ## The Solution
+
+ Context Compactor estimates tokens client-side and proactively summarizes older messages before hitting the model's limit.
+
+ ## How It Works
+
+ ```
+ 1. Message arrives
+ 2. before_agent_start hook fires
+ 3. Plugin estimates total context tokens
+ 4. If over maxTokens:
+    a. Split into "old" and "recent" messages
+    b. Summarize old messages (LLM or fallback)
+    c. Inject summary as compacted context
+ 5. Agent sees: summary + recent + new message
+ ```
+
+ ## Installation
+
+ ```bash
+ # One command setup (recommended)
+ npx jasper-context-compactor setup
+
+ # Restart gateway
+ openclaw gateway restart
+ ```
+
+ The setup command automatically:
+ - Copies plugin files to `~/.openclaw/extensions/context-compactor/`
+ - Adds plugin config to `openclaw.json` with sensible defaults
+
+ ## Configuration
+
+ Add to `openclaw.json`:
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "enabled": true,
+         "config": {
+           "maxTokens": 8000,
+           "keepRecentTokens": 2000,
+           "summaryMaxTokens": 1000,
+           "charsPerToken": 4
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ ### Options
+
+ | Option | Default | Description |
+ |--------|---------|-------------|
+ | `enabled` | `true` | Enable/disable the plugin |
+ | `maxTokens` | `8000` | Max context tokens before compaction |
+ | `keepRecentTokens` | `2000` | Tokens to preserve from recent messages |
+ | `summaryMaxTokens` | `1000` | Max tokens for the summary |
+ | `charsPerToken` | `4` | Token estimation ratio |
+ | `summaryModel` | (session model) | Model to use for summarization |
+
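+ The setup CLI (`cli.js`, later in this diff) derives the two secondary budgets from `maxTokens`; a minimal sketch of that ratio, useful if you size the limits by hand (values mirror the defaults above):
+
+ ```ts
+ // Ratios used by the setup CLI: keep ~25% of the budget as untouched recent
+ // context and cap the summary at ~12.5%.
+ const maxTokens = 8000;
+ const keepRecentTokens = Math.floor(maxTokens * 0.25);   // 2000
+ const summaryMaxTokens = Math.floor(maxTokens * 0.125);  // 1000
+ ```
+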
87
+ ### Tuning for Your Model
+
+ **MLX (8K context models):**
+ ```json
+ {
+   "maxTokens": 6000,
+   "keepRecentTokens": 1500,
+   "charsPerToken": 4
+ }
+ ```
+
+ **Larger context (32K models):**
+ ```json
+ {
+   "maxTokens": 28000,
+   "keepRecentTokens": 4000,
+   "charsPerToken": 4
+ }
+ ```
+
+ **Small context (4K models):**
+ ```json
+ {
+   "maxTokens": 3000,
+   "keepRecentTokens": 800,
+   "charsPerToken": 4
+ }
+ ```
+
+ ## Commands
+
+ ### `/compact-now`
+
+ Force clear the summary cache and trigger fresh compaction on next message.
+
+ ```
+ /compact-now
+ ```
+
+ ### `/context-stats`
+
+ Show current context token usage and whether compaction would trigger.
+
+ ```
+ /context-stats
+ ```
+
+ Output:
+ ```
+ 📊 Context Stats
+
+ Messages: 47 total
+ - User: 23
+ - Assistant: 24
+ - System: 0
+
+ Estimated Tokens: ~6,234
+ Limit: 8,000
+ Usage: 77.9%
+
+ ✅ Within limits
+ ```
+
+ ## How Summarization Works
+
+ When compaction triggers:
+
+ 1. **Split messages** into "old" (to summarize) and "recent" (to keep)
+ 2. **Generate summary** using the session model (or configured `summaryModel`)
+ 3. **Cache the summary** to avoid regenerating for the same content
+ 4. **Inject context** with the summary prepended
+
+ If the LLM runtime isn't available (e.g., during startup), a fallback truncation-based summary is used.
+
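+ A condensed sketch of the caching step (the real logic lives in `index.ts`, shown later in this diff; here the `summarize` callback stands in for the runtime LLM call or the truncation fallback):
+
+ ```ts
+ type Message = { role: 'user' | 'assistant' | 'system'; content: string };
+
+ // Summaries are cached in memory, keyed by the concatenated old messages
+ // (index.ts hashes this string), so an unchanged prefix is never re-summarized.
+ const summaryCache = new Map<string, string>();
+
+ async function summarizeOld(
+   old: Message[],
+   summarize: (formatted: string) => Promise<string>
+ ): Promise<string> {
+   const key = old.map(m => `${m.role}:${m.content}`).join('|');
+   const cached = summaryCache.get(key);
+   if (cached) return cached;
+   const formatted = old.map(m => `[${m.role.toUpperCase()}]: ${m.content}`).join('\n\n');
+   const summary = await summarize(formatted);
+   summaryCache.set(key, summary);
+   return summary;
+ }
+ ```
+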
161
+ ## Differences from Built-in Compaction
+
+ | Feature | Built-in | Context Compactor |
+ |---------|----------|-------------------|
+ | Trigger | Model reports overflow | Token estimate threshold |
+ | Works with local models | ❌ (needs overflow error) | ✅ |
+ | Persists to transcript | ✅ | ❌ (session-only) |
+ | Summarization | Pi runtime | Plugin LLM call |
+
+ Context Compactor is **complementary** — it catches cases before they hit the model's hard limit.
+
+ ## Troubleshooting
+
+ **Summary quality is poor:**
+ - Try a better `summaryModel`
+ - Increase `summaryMaxTokens`
+ - The truncation fallback is used if the LLM runtime isn't available
+
+ **Compaction triggers too often:**
+ - Increase `maxTokens`
+ - Decrease `keepRecentTokens` (less recent context is kept verbatim, so each compaction shrinks the context further)
+
+ **Not compacting when expected:**
+ - Check `/context-stats` to see current usage
+ - Verify `enabled: true` in config
+ - Check logs for `[context-compactor]` messages
+
+ **Characters per token wrong:**
+ - Default of 4 works for English
+ - Try 3 for CJK languages
+ - Try 5 for highly technical content
+
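+ A quick worked example of how the ratio shifts the estimate (the estimator is simply `Math.ceil(chars / charsPerToken)`):
+
+ ```ts
+ // The same 27,000-character transcript under different ratios:
+ Math.ceil(27000 / 4); // 6750 tokens, under an 8,000-token maxTokens
+ Math.ceil(27000 / 3); // 9000 tokens, over the limit, so compaction would trigger
+ ```
+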
193
+ ## Logs
+
+ Enable debug logging:
+
+ ```json
+ {
+   "plugins": {
+     "entries": {
+       "context-compactor": {
+         "config": {
+           "logLevel": "debug"
+         }
+       }
+     }
+   }
+ }
+ ```
+
+ Look for:
+ - `[context-compactor] Current context: ~XXXX tokens`
+ - `[context-compactor] Compacted X messages → summary`
+
+ ## Links
+
+ - **GitHub**: https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor
+ - **OpenClaw Docs**: https://docs.openclaw.ai/concepts/compaction
package/cli.js ADDED
@@ -0,0 +1,269 @@
+ #!/usr/bin/env node
+ /**
+  * Context Compactor CLI
+  * Setup script with interactive token limit configuration
+  */
+
+ const fs = require('fs');
+ const path = require('path');
+ const os = require('os');
+ const readline = require('readline');
+
+ const OPENCLAW_CONFIG = path.join(os.homedir(), '.openclaw', 'openclaw.json');
+ const OPENCLAW_EXTENSIONS = path.join(os.homedir(), '.openclaw', 'extensions', 'context-compactor');
+
+ function log(msg) {
+   console.log(`📦 ${msg}`);
+ }
+
+ function error(msg) {
+   console.error(`❌ ${msg}`);
+ }
+
+ function prompt(question) {
+   const rl = readline.createInterface({
+     input: process.stdin,
+     output: process.stdout
+   });
+
+   return new Promise(resolve => {
+     rl.question(question, answer => {
+       rl.close();
+       resolve(answer.trim());
+     });
+   });
+ }
+
+ async function detectModelContextWindow(config) {
+   // Try to detect from OpenClaw config
+   const model = config?.agents?.defaults?.model?.primary;
+
+   if (!model) return null;
+
+   // Common context windows (conservative estimates)
+   const knownContexts = {
+     // Anthropic
+     'anthropic/claude-opus': 200000,
+     'anthropic/claude-sonnet': 200000,
+     'anthropic/claude-haiku': 200000,
+     // OpenAI
+     'openai/gpt-4': 128000,
+     'openai/gpt-4-turbo': 128000,
+     'openai/gpt-3.5-turbo': 16000,
+     // Local models (common sizes)
+     'mlx': 8000,
+     'ollama': 8000,
+     'llama': 8000,
+     'mistral': 32000,
+     'qwen': 32000,
+   };
+
+   // Check for exact match first
+   for (const [pattern, tokens] of Object.entries(knownContexts)) {
+     if (model.toLowerCase().includes(pattern.toLowerCase())) {
+       return { model, tokens, source: 'detected' };
+     }
+   }
+
+   return { model, tokens: null, source: 'unknown' };
+ }
+
+ async function setup() {
+   log('Context Compactor — Setup');
+   console.log('='.repeat(50));
+
+   // Check if OpenClaw is installed
+   const openclawDir = path.join(os.homedir(), '.openclaw');
+   if (!fs.existsSync(openclawDir)) {
+     error('OpenClaw not detected (~/.openclaw not found)');
+     console.log('Install OpenClaw first: https://docs.openclaw.ai');
+     process.exit(1);
+   }
+
+   // Copy plugin files to extensions directory
+   console.log('');
+   log('Installing plugin files...');
+   fs.mkdirSync(OPENCLAW_EXTENSIONS, { recursive: true });
+
+   const pluginDir = path.dirname(__filename);
+   const filesToCopy = ['index.ts', 'openclaw.plugin.json'];
+
+   for (const file of filesToCopy) {
+     const src = path.join(pluginDir, file);
+     const dest = path.join(OPENCLAW_EXTENSIONS, file);
+     if (fs.existsSync(src)) {
+       fs.copyFileSync(src, dest);
+       console.log(` ✓ Copied: ${file}`);
+     }
+   }
+
+   // Load existing config
+   let config = {};
+   if (fs.existsSync(OPENCLAW_CONFIG)) {
+     try {
+       config = JSON.parse(fs.readFileSync(OPENCLAW_CONFIG, 'utf8'));
+     } catch (e) {
+       error(`Could not parse openclaw.json: ${e.message}`);
+       process.exit(1);
+     }
+   }
+
+   // Determine token limit
+   console.log('');
+   log('Configuring token limits...');
+   console.log('');
+   console.log(' To set the right limit, I can check your OpenClaw config');
+   console.log(' to see what model you\'re using.');
+   console.log('');
+   console.log(' 🔒 Privacy: This runs 100% locally. Nothing is sent externally.');
+   console.log('');
+
+   const checkConfig = await prompt(' Check your config for model info? (y/n): ');
+
+   let maxTokens = 8000; // Default
+   let detectedInfo = null;
+
+   if (checkConfig.toLowerCase() === 'y' || checkConfig.toLowerCase() === 'yes') {
+     detectedInfo = await detectModelContextWindow(config);
+
+     if (detectedInfo && detectedInfo.tokens) {
+       console.log('');
+       console.log(` ✓ Detected model: ${detectedInfo.model}`);
+       console.log(` ✓ Context window: ~${detectedInfo.tokens.toLocaleString()} tokens`);
+
+       // Suggest a safe limit (leave 20% headroom)
+       const suggested = Math.floor(detectedInfo.tokens * 0.8);
+       console.log(` → Suggested maxTokens: ${suggested.toLocaleString()} (80% of context)`);
+       console.log('');
+
+       const useDetected = await prompt(` Use ${suggested.toLocaleString()} tokens? (y/n, or enter custom): `);
+
+       if (useDetected.toLowerCase() === 'y' || useDetected.toLowerCase() === 'yes') {
+         maxTokens = suggested;
+       } else if (/^\d+$/.test(useDetected)) {
+         maxTokens = parseInt(useDetected, 10);
+       }
+     } else if (detectedInfo && detectedInfo.model) {
+       console.log('');
+       console.log(` ⚠ Found model: ${detectedInfo.model}`);
+       console.log(' ⚠ Could not determine context window automatically.');
+     }
+   }
+
+   // If we still don't have a good value, ask manually
+   if (maxTokens === 8000 && (!detectedInfo || !detectedInfo.tokens)) {
+     console.log('');
+     console.log(' Common context windows:');
+     console.log(' • MLX / llama.cpp (small): 4,000 - 8,000');
+     console.log(' • Mistral / Qwen (medium): 32,000');
+     console.log(' • Claude / GPT-4 (large): 128,000+');
+     console.log('');
+     console.log(' Check your model\'s docs or LM Studio/Ollama settings.');
+     console.log(' Config location: ~/.openclaw/openclaw.json');
+     console.log('');
+
+     const customTokens = await prompt(' Enter maxTokens (default 8000): ');
+     if (/^\d+$/.test(customTokens)) {
+       maxTokens = parseInt(customTokens, 10);
+     }
+   }
+
+   // Calculate keepRecentTokens (25% of max)
+   const keepRecentTokens = Math.floor(maxTokens * 0.25);
+   const summaryMaxTokens = Math.floor(maxTokens * 0.125);
+
+   console.log('');
+   console.log(` Configuration:`);
+   console.log(` maxTokens: ${maxTokens.toLocaleString()}`);
+   console.log(` keepRecentTokens: ${keepRecentTokens.toLocaleString()} (25%)`);
+   console.log(` summaryMaxTokens: ${summaryMaxTokens.toLocaleString()} (12.5%)`);
+
+   // Update openclaw.json
+   console.log('');
+   log('Updating OpenClaw config...');
+
+   // Initialize plugins structure if needed
+   if (!config.plugins) config.plugins = {};
+   if (!config.plugins.entries) config.plugins.entries = {};
+
+   // Add/update plugin config
+   config.plugins.entries['context-compactor'] = {
+     enabled: true,
+     config: {
+       maxTokens,
+       keepRecentTokens,
+       summaryMaxTokens,
+       charsPerToken: 4
+     }
+   };
+
+   // Write back with nice formatting
+   fs.writeFileSync(OPENCLAW_CONFIG, JSON.stringify(config, null, 2) + '\n');
+   console.log(' ✓ Saved to openclaw.json');
+
+   console.log('');
+   console.log('='.repeat(50));
+   log('Setup complete!');
+   console.log('');
+   console.log('Next steps:');
+   console.log(' 1. Restart OpenClaw: openclaw gateway restart');
+   console.log(' 2. Check status in chat: /context-stats');
+   console.log('');
+   console.log('To adjust later, edit ~/.openclaw/openclaw.json');
+   console.log('under plugins.entries["context-compactor"].config');
+ }
+
+ function showHelp() {
+   console.log(`
+ Context Compactor
+ Token-based context compaction for local models
+
+ USAGE:
+ npx jasper-context-compactor setup    Install and configure plugin
+ npx jasper-context-compactor help     Show this help
+
+ WHAT IT DOES:
+ - Copies plugin files to ~/.openclaw/extensions/context-compactor/
+ - Detects your model's context window (with permission)
+ - Configures appropriate token limits
+ - Enables automatic context compaction for local models
+
+ CONFIGURATION:
+ After setup, adjust in openclaw.json:
+
+ "context-compactor": {
+   "enabled": true,
+   "config": {
+     "maxTokens": 8000,        // Your model's context limit minus buffer
+     "keepRecentTokens": 2000  // Recent context to preserve
+   }
+ }
+
+ COMMANDS (in chat):
+ /context-stats     Show current token usage
+ /compact-now       Force fresh compaction
+ `);
+ }
+
+ // Main
+ const command = process.argv[2];
+
+ switch (command) {
+   case 'setup':
+   case 'install':
+     setup().catch(err => {
+       error(err.message);
+       process.exit(1);
+     });
+     break;
+   case 'help':
+   case '--help':
+   case '-h':
+   case undefined:
+     showHelp();
+     break;
+   default:
+     error(`Unknown command: ${command}`);
+     showHelp();
+     process.exit(1);
+ }
package/index.ts ADDED
@@ -0,0 +1,399 @@
+ /**
+  * Context Compactor - OpenClaw Plugin
+  *
+  * Token-based context compaction for local models (MLX, llama.cpp, Ollama)
+  * that don't properly report context window limits.
+  *
+  * How it works:
+  * 1. Before each agent turn, estimates total context tokens
+  * 2. If over threshold, summarizes older messages
+  * 3. Injects summary + recent messages as the new context
+  *
+  * This is a client-side solution that doesn't require model cooperation.
+  */
+
+ import * as fs from 'fs';
+ import * as path from 'path';
+ import * as os from 'os';
+
+ interface PluginConfig {
+   enabled?: boolean;
+   maxTokens?: number;
+   keepRecentTokens?: number;
+   summaryMaxTokens?: number;
+   charsPerToken?: number;
+   summaryModel?: string;
+   logLevel?: 'debug' | 'info' | 'warn' | 'error';
+ }
+
+ interface Message {
+   role: 'user' | 'assistant' | 'system';
+   content: string;
+ }
+
+ interface SessionEntry {
+   id: string;
+   type: string;
+   message?: Message;
+   content?: string;
+   role?: string;
+   parentId?: string;
+ }
+
+ interface PluginApi {
+   config: {
+     plugins?: {
+       entries?: {
+         'context-compactor'?: {
+           config?: PluginConfig;
+         };
+       };
+     };
+   };
+   logger: {
+     info: (msg: string) => void;
+     warn: (msg: string) => void;
+     error: (msg: string) => void;
+     debug: (msg: string) => void;
+   };
+   registerTool: (tool: any) => void;
+   registerCommand: (cmd: any) => void;
+   registerGatewayMethod: (name: string, handler: any) => void;
+   on: (event: string, handler: (event: any) => Promise<any>) => void;
+   runtime?: {
+     llm?: {
+       complete: (opts: { model?: string; messages: Message[]; maxTokens?: number }) => Promise<{ content: string }>;
+     };
+   };
+ }
+
+ // Simple token estimator (chars / charsPerToken)
+ function estimateTokens(text: string, charsPerToken: number): number {
+   return Math.ceil(text.length / charsPerToken);
+ }
+
+ // Read session transcript
+ function readTranscript(sessionPath: string): SessionEntry[] {
+   if (!fs.existsSync(sessionPath)) return [];
+
+   const content = fs.readFileSync(sessionPath, 'utf8');
+   const lines = content.trim().split('\n').filter(Boolean);
+
+   return lines.map(line => {
+     try {
+       return JSON.parse(line);
+     } catch {
+       return null;
+     }
+   }).filter(Boolean) as SessionEntry[];
+ }
+
+ // Extract messages from session entries
+ function extractMessages(entries: SessionEntry[]): Message[] {
+   const messages: Message[] = [];
+
+   for (const entry of entries) {
+     if (entry.type === 'message' && entry.message) {
+       messages.push(entry.message);
+     } else if (entry.role && entry.content) {
+       messages.push({ role: entry.role as Message['role'], content: entry.content });
+     }
+   }
+
+   return messages;
+ }
+
+ // Split messages into "old" (to summarize) and "recent" (to keep)
+ function splitMessages(
+   messages: Message[],
+   keepRecentTokens: number,
+   charsPerToken: number
+ ): { old: Message[]; recent: Message[] } {
+   let recentTokens = 0;
+   let splitIndex = messages.length;
+
+   // Walk backwards from end, counting tokens
+   for (let i = messages.length - 1; i >= 0; i--) {
+     const msgTokens = estimateTokens(messages[i].content, charsPerToken);
+     if (recentTokens + msgTokens > keepRecentTokens) {
+       splitIndex = i + 1;
+       break;
+     }
+     recentTokens += msgTokens;
+     if (i === 0) splitIndex = 0;
+   }
+
+   return {
+     old: messages.slice(0, splitIndex),
+     recent: messages.slice(splitIndex),
+   };
+ }
+
+ // Format messages for summarization
+ function formatForSummary(messages: Message[]): string {
+   return messages.map(m => `[${m.role.toUpperCase()}]: ${m.content}`).join('\n\n');
+ }
+
+ // In-memory cache for summaries (avoid re-summarizing the same content)
+ const summaryCache = new Map<string, string>();
+
+ function hashMessages(messages: Message[]): string {
+   const content = messages.map(m => `${m.role}:${m.content}`).join('|');
+   // Simple hash
+   let hash = 0;
+   for (let i = 0; i < content.length; i++) {
+     const char = content.charCodeAt(i);
+     hash = ((hash << 5) - hash) + char;
+     hash = hash & hash;
+   }
+   return hash.toString(16);
+ }
+
+ export default function register(api: PluginApi) {
+   const cfg = api.config.plugins?.entries?.['context-compactor']?.config ?? {};
+
+   if (cfg.enabled === false) {
+     api.logger.info('[context-compactor] Plugin disabled');
+     return;
+   }
+
+   const maxTokens = cfg.maxTokens ?? 8000;
+   const keepRecentTokens = cfg.keepRecentTokens ?? 2000;
+   const summaryMaxTokens = cfg.summaryMaxTokens ?? 1000;
+   const charsPerToken = cfg.charsPerToken ?? 4;
+   const summaryModel = cfg.summaryModel;
+
+   api.logger.info(`[context-compactor] Initialized (maxTokens=${maxTokens}, keepRecent=${keepRecentTokens})`);
+
+   // ============================================================================
+   // Core: before_agent_start hook
+   // ============================================================================
+
+   api.on('before_agent_start', async (event: {
+     prompt?: string;
+     sessionKey?: string;
+     sessionId?: string;
+     context?: {
+       sessionFile?: string;
+       messages?: Message[];
+     };
+   }) => {
+     try {
+       // Get current messages from context or session file
+       let messages: Message[] = event.context?.messages ?? [];
+
+       if (messages.length === 0 && event.context?.sessionFile) {
+         const entries = readTranscript(event.context.sessionFile);
+         messages = extractMessages(entries);
+       }
+
+       if (messages.length === 0) {
+         api.logger.debug?.('[context-compactor] No messages to compact');
+         return;
+       }
+
+       // Estimate total tokens
+       const totalTokens = messages.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+
+       api.logger.debug?.(`[context-compactor] Current context: ~${totalTokens} tokens`);
+
+       // Check if compaction needed
+       if (totalTokens <= maxTokens) {
+         return; // Under limit, no action needed
+       }
+
+       api.logger.info(`[context-compactor] Context (~${totalTokens} tokens) exceeds limit (${maxTokens}), compacting...`);
+
+       // Split into old and recent
+       const { old, recent } = splitMessages(messages, keepRecentTokens, charsPerToken);
+
+       if (old.length === 0) {
+         api.logger.warn('[context-compactor] No old messages to summarize, skipping');
+         return;
+       }
+
+       // Check cache
+       const cacheKey = hashMessages(old);
+       let summary = summaryCache.get(cacheKey);
+
+       if (!summary) {
+         // Generate summary
+         const formatted = formatForSummary(old);
+
+         const summaryPrompt = `Summarize this conversation concisely, preserving key decisions, context, and important details. Focus on information that would be needed to continue the conversation coherently.
+
+ CONVERSATION:
+ ${formatted}
+
+ SUMMARY (be concise, max ${Math.floor(summaryMaxTokens * charsPerToken)} characters):`;
+
+         if (api.runtime?.llm?.complete) {
+           // Use OpenClaw's LLM runtime
+           const result = await api.runtime.llm.complete({
+             model: summaryModel,
+             messages: [{ role: 'user', content: summaryPrompt }],
+             maxTokens: summaryMaxTokens,
+           });
+           summary = result.content;
+         } else {
+           // Fallback: simple truncation-based summary
+           api.logger.warn('[context-compactor] LLM runtime not available, using truncation fallback');
+           const maxChars = summaryMaxTokens * charsPerToken;
+           summary = `[Context Summary - ${old.length} messages compacted]\n\n`;
+
+           // Keep first and last few messages
+           const keepCount = Math.min(3, Math.floor(old.length / 2));
+           const first = old.slice(0, keepCount);
+           const last = old.slice(-keepCount);
+
+           summary += 'Earlier:\n' + first.map(m => `- ${m.role}: ${m.content.slice(0, 200)}...`).join('\n');
+           summary += '\n\nRecent:\n' + last.map(m => `- ${m.role}: ${m.content.slice(0, 200)}...`).join('\n');
+
+           if (summary.length > maxChars) {
+             summary = summary.slice(0, maxChars) + '...';
+           }
+         }
+
+         // Cache it
+         summaryCache.set(cacheKey, summary);
+
+         // Limit cache size
+         if (summaryCache.size > 100) {
+           const firstKey = summaryCache.keys().next().value;
+           if (firstKey) summaryCache.delete(firstKey);
+         }
+       }
+
+       const recentTokens = recent.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+       const summaryTokens = estimateTokens(summary, charsPerToken);
+       const newTotal = summaryTokens + recentTokens;
+
+       api.logger.info(
+         `[context-compactor] Compacted ${old.length} messages → summary (~${summaryTokens} tokens) + ${recent.length} recent (~${recentTokens} tokens) = ~${newTotal} tokens`
+       );
+
+       // Return context modification
+       return {
+         prependContext: `<compacted-context>
+ The following is a summary of earlier conversation that was compacted to fit context limits:
+
+ ${summary}
+
+ ---
+ Recent conversation continues below:
+ </compacted-context>`,
+         // Note: We can't actually replace messages in before_agent_start,
+         // we can only prepend context. For full message replacement,
+         // we'd need a different hook or session modification.
+       };
+
+     } catch (err: any) {
+       api.logger.error(`[context-compactor] Error: ${err.message}`);
+     }
+   });
+
+   // ============================================================================
+   // Command: /compact-now
+   // ============================================================================
+
+   api.registerCommand({
+     name: 'compact-now',
+     description: 'Force context compaction on next message',
+     acceptsArgs: false,
+     requireAuth: true,
+     handler: async () => {
+       // Clear cache to force fresh summary
+       summaryCache.clear();
+       return { text: '🧹 Context compaction cache cleared. Next message will trigger fresh compaction if needed.' };
+     },
+   });
+
+   // ============================================================================
+   // Command: /context-stats
+   // ============================================================================
+
+   api.registerCommand({
+     name: 'context-stats',
+     description: 'Show estimated context token usage',
+     acceptsArgs: false,
+     requireAuth: true,
+     handler: async (ctx: { sessionFile?: string }) => {
+       try {
+         if (!ctx.sessionFile) {
+           return { text: '⚠️ Session file not available' };
+         }
+
+         const entries = readTranscript(ctx.sessionFile);
+         const messages = extractMessages(entries);
+
+         const totalTokens = messages.reduce(
+           (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+           0
+         );
+
+         const userMsgs = messages.filter(m => m.role === 'user').length;
+         const assistantMsgs = messages.filter(m => m.role === 'assistant').length;
+         const systemMsgs = messages.filter(m => m.role === 'system').length;
+
+         return {
+           text: `📊 **Context Stats**
+
+ **Messages:** ${messages.length} total
+ - User: ${userMsgs}
+ - Assistant: ${assistantMsgs}
+ - System: ${systemMsgs}
+
+ **Estimated Tokens:** ~${totalTokens.toLocaleString()}
+ **Limit:** ${maxTokens.toLocaleString()}
+ **Usage:** ${((totalTokens / maxTokens) * 100).toFixed(1)}%
+
+ ${totalTokens > maxTokens ? '⚠️ **Over limit - compaction will trigger**' : '✅ Within limits'}`,
+         };
+       } catch (err: any) {
+         return { text: `❌ Error: ${err.message}` };
+       }
+     },
+   });
+
+   // ============================================================================
+   // RPC: context-compactor.stats
+   // ============================================================================
+
+   api.registerGatewayMethod('context-compactor.stats', async ({ params, respond }: any) => {
+     try {
+       const { sessionFile } = params;
+
+       if (!sessionFile || !fs.existsSync(sessionFile)) {
+         respond(true, { error: 'Session file not found', messages: 0, tokens: 0 });
+         return;
+       }
+
+       const entries = readTranscript(sessionFile);
+       const messages = extractMessages(entries);
+
+       const totalTokens = messages.reduce(
+         (sum, m) => sum + estimateTokens(m.content, charsPerToken),
+         0
+       );
+
+       respond(true, {
+         messages: messages.length,
+         tokens: totalTokens,
+         maxTokens,
+         needsCompaction: totalTokens > maxTokens,
+         cacheSize: summaryCache.size,
+       });
+     } catch (err: any) {
+       respond(false, { error: err.message });
+     }
+   });
+ }
+
+ export const id = 'context-compactor';
+ export const name = 'Context Compactor - Local Model Support';
package/openclaw.plugin.json ADDED
@@ -0,0 +1,53 @@
+ {
+   "id": "context-compactor",
+   "name": "Context Compactor",
+   "version": "0.1.0",
+   "description": "Token-based context compaction for local models (MLX, llama.cpp) that don't report context limits",
+   "configSchema": {
+     "type": "object",
+     "additionalProperties": false,
+     "properties": {
+       "enabled": {
+         "type": "boolean",
+         "default": true
+       },
+       "maxTokens": {
+         "type": "number",
+         "default": 8000,
+         "description": "Maximum context tokens before compaction triggers"
+       },
+       "keepRecentTokens": {
+         "type": "number",
+         "default": 2000,
+         "description": "Tokens to keep from recent messages (not summarized)"
+       },
+       "summaryMaxTokens": {
+         "type": "number",
+         "default": 1000,
+         "description": "Maximum tokens for the compaction summary"
+       },
+       "charsPerToken": {
+         "type": "number",
+         "default": 4,
+         "description": "Estimated characters per token (for simple counting)"
+       },
+       "summaryModel": {
+         "type": "string",
+         "description": "Model to use for summarization (defaults to session model)"
+       },
+       "logLevel": {
+         "type": "string",
+         "enum": ["debug", "info", "warn", "error"],
+         "default": "info"
+       }
+     }
+   },
+   "uiHints": {
+     "enabled": { "label": "Enable Context Compactor" },
+     "maxTokens": { "label": "Max Context Tokens", "placeholder": "8000" },
+     "keepRecentTokens": { "label": "Keep Recent Tokens", "placeholder": "2000" },
+     "summaryMaxTokens": { "label": "Summary Max Tokens", "placeholder": "1000" },
+     "charsPerToken": { "label": "Chars per Token (estimate)", "placeholder": "4" },
+     "summaryModel": { "label": "Summary Model (optional)", "placeholder": "Leave blank for session model" }
+   }
+ }
package/package.json ADDED
@@ -0,0 +1,36 @@
+ {
+   "name": "jasper-context-compactor",
+   "version": "0.2.1",
+   "description": "Context compaction plugin for OpenClaw - works with local models (MLX, llama.cpp) that don't report token limits",
+   "main": "index.ts",
+   "bin": {
+     "context-compactor": "./cli.js"
+   },
+   "openclaw": {
+     "extensions": ["./index.ts"]
+   },
+   "keywords": [
+     "openclaw",
+     "context",
+     "compaction",
+     "mlx",
+     "local-llm",
+     "token-limit"
+   ],
+   "author": "E.x.O. Entertainment Studios Inc.",
+   "license": "MIT",
+   "repository": {
+     "type": "git",
+     "url": "https://github.com/E-x-O-Entertainment-Studios-Inc/openclaw-context-compactor.git"
+   },
+   "engines": {
+     "node": ">=18.0.0"
+   },
+   "files": [
+     "cli.js",
+     "index.ts",
+     "openclaw.plugin.json",
+     "SKILL.md",
+     "README.md"
+   ]
+ }