npm - @tuanhung303/opencode-acp - Versions diffs - 2.1.0 → 2.1.1 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2)

package/README.md +63 -245
package/package.json +1 -1

package/README.md CHANGED

@@ -2,11 +2,41 @@
 [![npm version](https://img.shields.io/npm/v/@tuanhung303/opencode-acp.svg)](https://www.npmjs.com/package/@tuanhung303/opencode-acp)
-Automatically reduces token usage in OpenCode by removing obsolete tools from conversation history.
-> **⚡ Enhanced fork** with aggressive default settings and four new pruning strategies: **Prune Thinking**, **Head-Tail Truncation**, **Read Consolidation**, and **Placeholder Compression**.
-![ACP in action](dcp-demo5.png)
+Automatically reduces token usage in OpenCode by removing redundant or obsolete content from conversation history.
+> **⚡ Performance Fork:** Optimized with aggressive defaults and four additional pruning strategies: **Prune Thinking**, **Head-Tail Truncation**, **Read Consolidation**, and **Placeholder Compression**.
+## Pruning Architecture
+```mermaid
+graph TD
+    A[New Turn] --> B{Strategy Type}
+    subgraph "Automatic (Zero Cost)"
+    B --> C[Deduplication]
+    B --> D[Supersede Writes]
+    B --> E[Purge Errors]
+    B --> F[Read Consolidation]
+    B --> G[Prune Thinking]
+    B --> H[Head-Tail Truncation]
+    C --> |Removes identical calls| I[Context Saved]
+    D --> |Prunes writes after reads| I
+    E --> |Strips failed call inputs| I
+    F --> |Stale reads point to new| I
+    G --> |Strips reasoning/thinking| I
+    H --> |Middle-out truncation| I
+    end
+    subgraph "LLM-Driven (High Precision)"
+    A --> J[AI Nudge]
+    J --> |Suggests Cleanup| K{Agent Action}
+    K --> L[Discard Tool]
+    K --> M[Extract Tool]
+    L --> |Direct removal| I
+    M --> |Summarize then remove| I
+    end
+```
 ## Installation
@@ -19,269 +49,57 @@ Add to your OpenCode config:
 }
 ```
-Using `@latest` ensures you always get the newest version automatically when OpenCode starts.
-Restart OpenCode. The plugin will automatically start optimizing your sessions.
-## How Pruning Works
-DCP uses multiple tools and strategies to reduce context size:
-### Tools
-**Discard** — Exposes a `discard` tool that the AI can call to remove completed or noisy tool content from context.
-**Extract** — Exposes an `extract` tool that the AI can call to distill valuable context into concise summaries before removing the tool content.
-### Strategies
+## Core Strategies
-**Deduplication** — Identifies repeated tool calls (e.g., reading the same file multiple times) and keeps only the most recent output. Runs automatically on every request with zero LLM cost.
+### 🚀 Automatic Strategies (Instant & Free)
-**Supersede Writes** — Prunes write tool inputs for files that have subsequently been read. When a file is written and later read, the original write content becomes redundant since the current file state is captured in the read result. Runs automatically on every request with zero LLM cost. **⚡ Enabled by default in this fork.**
+- **Head-Tail Truncation** (⚡ Default): Preserves the first 20% and last 30% of tool outputs, removing the middle section. Optimized for the "Lost in the Middle" phenomenon.
+- **Read Consolidation**: When the same file is read multiple times, older outputs are replaced with pointers to the newest version.
+- **Prune Thinking**: Strips ephemeral reasoning blocks (Claude `<thinking>` or OpenAI `reasoning`) from older messages.
+- **Supersede Writes**: Automatically prunes `write` tool inputs once a subsequent `read` of the same file confirms the state.
+- **Deduplication**: Eliminates identical tool calls while preserving the most recent result.
+- **Purge Errors**: Automatically clears inputs from failed tool calls after 2 turns.
-**Purge Errors** — Prunes tool inputs for tools that returned errors after a configurable number of turns (default: 2). Error messages are preserved for context, but the potentially large input content is removed. Runs automatically on every request with zero LLM cost. **⚡ Reduced from 4 to 2 turns in this fork.**
+### 🧠 LLM-Driven Tools
-**Prune Thinking** ⚡ NEW — Removes extended thinking tokens (`<thinking>` blocks, OpenAI reasoning fields) from older assistant messages. Thinking tokens consume significant context but provide no utility after the response is generated. Preserves recent turns for cache efficiency. Runs automatically with zero LLM cost.
+- **`discard`**: The agent manually removes tool content that is no longer relevant to the current task.
+- **`extract`**: The agent distills valuable findings into a concise summary before purging the raw data.
-**Read Consolidation** ⚡ NEW — When the same tool+parameter is called multiple times (e.g., reading the same file), older outputs are replaced with pointers to the newer output. The newest call keeps its full content since it has the current state. Different from deduplication: keeps both calls visible as breadcrumbs. Runs automatically with zero LLM cost.
+## Configuration Defaults
-Example:
-```
-Older read: [📍 See later read at message #15. This output is stale.]
-Newer read: [Full file content preserved here]
-```
-**Head-Tail Truncation** ⚡ NEW — Preserves the first 20% (head) and last 30% (tail) of tool outputs, removing the middle section. Based on the "Lost in the Middle" phenomenon where LLMs perform best with information at the beginning and end. Preserves real content unlike placeholder compression. Runs automatically with zero LLM cost.
-Example:
-```
-Before: [1000 tokens of file content]
-After:  [First 200 tokens...]
-[...📄 500 tokens truncated...]
-[...Last 300 tokens]
-```
+This fork utilizes more aggressive defaults than the upstream version to maximize context efficiency:
-Tool-specific icons: 📄 File content, 🔍 Search results, 💻 Command output, 🌐 Web content, 📊 Data/Excel, 📋 Default
+| Feature              | Upstream | ACP Fork    |
+| :------------------- | :------- | :---------- |
+| `purgeErrors.turns`  | 4        | **2**       |
+| `nudgeFrequency`     | 10       | **5**       |
+| `supersedeWrites`    | Disabled | **Enabled** |
+| `pruneThinking`      | N/A      | **Enabled** |
+| `headTailTruncation` | N/A      | **Enabled** |
-**Placeholder Compression** ⚡ (Disabled by default) — Replaces verbose tool outputs with actionable placeholder hints while preserving the tool call structure (name + input) as breadcrumbs. **Superseded by Head-Tail Truncation** which preserves real content instead of generic hints. Can be re-enabled if you prefer minimal placeholders.
+## Configuration File
-Your session history is never modified—DCP replaces pruned content with placeholders before sending requests to your LLM.
-## Changes from Upstream
-| Setting                          | Upstream | This Fork | Rationale                          |
-| -------------------------------- | -------- | --------- | ---------------------------------- |
-| `purgeErrors.turns`              | 4        | **2**     | Errors rarely useful after 2 turns |
-| `nudgeFrequency`                 | 10       | **5**     | More frequent prune reminders      |
-| `supersedeWrites.enabled`        | false    | **true**  | Safe with read-after-write pattern |
-| `pruneThinking.enabled`          | N/A      | **true**  | Strip ephemeral thinking tokens    |
-| `readConsolidation.enabled`      | N/A      | **true**  | Older reads point to newer reads   |
-| `headTailTruncation.enabled`     | N/A      | **true**  | Keep 50% of content (head+tail)    |
-| `placeholderCompression.enabled` | N/A      | **false** | Superseded by head-tail truncation |
-## Impact on Prompt Caching
-LLM providers like Anthropic and OpenAI cache prompts based on exact prefix matching. When DCP prunes a tool output, it changes the message content, which invalidates cached prefixes from that point forward.
-**Trade-off:** You lose some cache read benefits but gain larger token savings from reduced context size and performance improvements through reduced context poisoning. In most cases, the token savings outweigh the cache miss cost—especially in long sessions where context bloat becomes significant.
-> **Note:** In testing, cache hit rates were approximately 65% with DCP enabled vs 85% without.
-**Best use case:** Providers that count usage in requests, such as Github Copilot and Google Antigravity have no negative price impact.
-## Configuration
-DCP uses its own config file:
-- Global: `~/.config/opencode/dcp.jsonc` (or `dcp.json`), created automatically on first run
-- Custom config directory: `$OPENCODE_CONFIG_DIR/dcp.jsonc` (or `dcp.json`), if `OPENCODE_CONFIG_DIR` is set
-- Project: `.opencode/dcp.jsonc` (or `dcp.json`) in your project's `.opencode` directory
-<details>
-<summary><strong>Default Configuration</strong> (click to expand)</summary>
+The plugin automatically creates a config at `~/.config/opencode/dcp.jsonc`. You can also use project-level overrides in `.opencode/dcp.jsonc`.
 ```jsonc
 {
-    "$schema": "https://raw.githubusercontent.com/tuanhung303/opencode-agent-context-pruning/master/dcp.schema.json",
-    // Enable or disable the plugin
-    "enabled": true,
-    // Enable debug logging to ~/.config/opencode/logs/dcp/
-    "debug": false,
-    // Notification display: "off", "minimal", or "detailed"
-    "pruneNotification": "detailed",
-    // Slash commands configuration
-    "commands": {
-        "enabled": true,
-        // Additional tools to protect from pruning via commands (e.g., /dcp sweep)
-        "protectedTools": [],
-    },
-    // Protect from pruning for <turns> message turns
-    "turnProtection": {
-        "enabled": false,
-        "turns": 4,
-    },
-    // Protect file operations from pruning via glob patterns
-    // Patterns match tool parameters.filePath (e.g. read/write/edit)
-    "protectedFilePatterns": [],
-    // LLM-driven context pruning tools
-    "tools": {
-        // Shared settings for all prune tools
-        "settings": {
-            // Nudge the LLM to use prune tools (every <nudgeFrequency> tool results)
-            "nudgeEnabled": true,
-            "nudgeFrequency": 5, // ⚡ Changed from 10
-            // Additional tools to protect from pruning
-            "protectedTools": [],
-        },
-        // Removes tool content from context without preservation (for completed tasks or noise)
-        "discard": {
-            "enabled": true,
-        },
-        // Distills key findings into preserved knowledge before removing raw content
-        "extract": {
-            "enabled": true,
-            // Show distillation content as an ignored message notification
-            "showDistillation": false,
-        },
-    },
-    // Automatic pruning strategies
     "strategies": {
-        // Remove duplicate tool calls (same tool with same arguments)
-        "deduplication": {
-            "enabled": true,
-            // Additional tools to protect from pruning
-            "protectedTools": [],
-        },
-        // Prune write tool inputs when the file has been subsequently read
-        "supersedeWrites": {
-            "enabled": true, // ⚡ Changed from false
-        },
-        // Prune tool inputs for errored tools after X turns
-        "purgeErrors": {
-            "enabled": true,
-            // Number of turns before errored tool inputs are pruned
-            "turns": 2, // ⚡ Changed from 4
-            // Additional tools to protect from pruning
-            "protectedTools": [],
-        },
-        // ⚡ NEW: Remove extended thinking tokens from older messages
-        "pruneThinking": {
-            "enabled": true,
-            // Turns to wait before pruning (preserves cache)
-            "delayTurns": 1,
-        },
-        // ⚡ NEW: Consolidate duplicate reads - older outputs point to newer
-        "readConsolidation": {
-            "enabled": true,
-            // Tools to track for consolidation
-            "tools": ["read", "glob", "grep", "webfetch", "bash"],
-        },
-        // ⚡ NEW: Keep first 20% + last 30% of outputs, truncate middle
+        "pruneThinking": { "enabled": true, "delayTurns": 1 },
         "headTailTruncation": {
             "enabled": true,
-            // Turns to wait before truncating
-            "delayTurns": 2,
-            // Ratio of output to keep from beginning
             "headRatio": 0.2,
-            // Ratio of output to keep from end
             "tailRatio": 0.3,
-            // Additional tools to protect from truncation
-            "protectedTools": [],
-        },
-        // Replace verbose outputs with placeholder hints (disabled by default)
-        // Superseded by headTailTruncation which preserves real content
-        "placeholderCompression": {
-            "enabled": false, // ⚡ Disabled - use headTailTruncation instead
-            // Turns to wait before compressing
-            "delayTurns": 2,
-            // Only compress outputs larger than this token count
-            "minOutputTokens": 100,
-            // Additional tools to protect from compression
-            "protectedTools": [],
         },
+        "readConsolidation": { "enabled": true },
     },
 }
 ```
-</details>
-### New Strategy: Prune Thinking
-Removes extended thinking content from assistant messages after a configurable delay:
-- **Anthropic**: Removes `type: "thinking"` content blocks
-- **OpenAI**: Removes `reasoning` field from messages
-- **Fallback**: Strips `<thinking>...</thinking>` tags from text
-```jsonc
-"pruneThinking": {
-    "enabled": true,
-    "delayTurns": 1  // Keep current turn for cache, prune older
-}
-```
-### New Strategy: Placeholder Compression
-Replaces verbose tool outputs with actionable hints while preserving breadcrumbs:
-| Tool       | Placeholder Example                                               |
-| ---------- | ----------------------------------------------------------------- |
-| `read`     | `[File read previously. Read again if needed: /path/to/file.ts]`  |
-| `grep`     | `[Content search completed for: pattern. Search again if needed]` |
-| `bash`     | `[Command executed: npm test. Re-run if needed]`                  |
-| `webfetch` | `[URL fetched: https://example.com. Fetch again if needed]`       |
-**Protected tools** (never compressed): `write`, `edit`, `todowrite`, `todoread`, `discard`, `extract`, `task`, `question`, `batch`, `skill`
-```jsonc
-"placeholderCompression": {
-    "enabled": true,
-    "delayTurns": 2,        // Wait 2 turns before compressing
-    "minOutputTokens": 100, // Only compress large outputs
-    "protectedTools": []    // Add custom protected tools
-}
-```
-### Commands
-DCP provides a `/dcp` slash command:
-- `/dcp` — Shows available DCP commands
-- `/dcp context` — Shows a breakdown of your current session's token usage by category (system, user, assistant, tools, etc.) and how much has been saved through pruning.
-- `/dcp stats` — Shows cumulative pruning statistics across all sessions.
-- `/dcp sweep` — Prunes all tools since the last user message. Accepts an optional count: `/dcp sweep 10` prunes the last 10 tools. Respects `commands.protectedTools`.
-### Turn Protection
-When enabled, turn protection prevents tool outputs from being pruned for a configurable number of message turns. This gives the AI time to reference recent tool outputs before they become prunable. Applies to both `discard` and `extract` tools, as well as automatic strategies.
-### Protected Tools
-By default, these tools are always protected from pruning across all strategies:
-`task`, `todowrite`, `todoread`, `discard`, `extract`, `batch`, `write`, `edit`
-The `protectedTools` arrays in each section add to this default list.
-### Config Precedence
-Settings are merged in order:
-Defaults → Global (`~/.config/opencode/dcp.jsonc`) → Config Dir (`$OPENCODE_CONFIG_DIR/dcp.jsonc`) → Project (`.opencode/dcp.jsonc`).
-Each level overrides the previous, so project settings take priority over config-dir and global, which take priority over defaults.
-Restart OpenCode after making config changes.
-## Limitations
-**Subagents** — DCP is disabled for subagents. Subagents are not designed to be token efficient; what matters is that the final message returned to the main agent is a concise summary of findings. DCP's pruning could interfere with this summarization behavior.
-## References
+## Commands
-- [Escaping Context Amnesia in AI Agents](https://www.hadijaveed.me/2025/11/26/escaping-context-amnesia-ai-agents/) - Inspiration for placeholder compression strategy
-- [Original DCP Repository](https://github.com/Opencode-DCP/opencode-dynamic-context-pruning)
+- `/dcp context`: Visualize token usage and savings.
+- `/dcp stats`: Cumulative efficiency report.
+- `/dcp sweep`: Force a manual cleanup of the current session.
 ## License
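
Reviewer note on Head-Tail Truncation: both README versions describe keeping the first 20% and last 30% of a tool output and dropping the middle, and the config in the diff exposes this as `headRatio: 0.2` and `tailRatio: 0.3`. The sketch below illustrates that idea only. The function name `headTailTruncate`, the marker text, and the approximation of tokens as whitespace-separated words are all assumptions made for illustration; the plugin's actual implementation and tokenizer are not shown in this diff.

```typescript
// Hypothetical sketch of head-tail truncation; NOT the plugin's actual code.
// Assumption: "tokens" are approximated by whitespace-separated words.
function headTailTruncate(
    text: string,
    headRatio: number = 0.2, // share of tokens kept from the start
    tailRatio: number = 0.3, // share of tokens kept from the end
): string {
    const tokens = text.split(/\s+/).filter((t) => t.length > 0);
    const headCount = Math.round(tokens.length * headRatio);
    const tailCount = Math.round(tokens.length * tailRatio);
    const removed = tokens.length - headCount - tailCount;

    // Leave the output alone when there is no middle to drop,
    // or when it is too short to yield both a head and a tail.
    if (removed <= 0 || headCount === 0 || tailCount === 0) {
        return text;
    }

    const head = tokens.slice(0, headCount).join(" ");
    const tail = tokens.slice(tokens.length - tailCount).join(" ");
    // The marker format is illustrative; the 2.1.0 README shows a similar
    // "[...N tokens truncated...]" line with a tool-specific icon.
    return `${head}\n[...${removed} tokens truncated...]\n${tail}`;
}
```

With the default ratios, a 1000-token output keeps 200 tokens from the head and 300 from the tail and drops the middle 500, which matches the Before/After example removed from the 2.1.0 README.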

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
     "$schema": "https://json.schemastore.org/package.json",
     "name": "@tuanhung303/opencode-acp",
-    "version": "2.1.0",
+    "version": "2.1.1",
     "type": "module",
     "description": "OpenCode plugin that optimizes token usage by pruning obsolete tool outputs - Aggressive fork with Head-Tail Truncation, Read Consolidation, Prune Thinking, and Placeholder Compression strategies",
     "main": "./dist/index.js",