@tuanhung303/opencode-acp 2.1.0 → 2.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/README.md +63 -245
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -2,11 +2,41 @@
2
2
 
3
3
  [![npm version](https://img.shields.io/npm/v/@tuanhung303/opencode-acp.svg)](https://www.npmjs.com/package/@tuanhung303/opencode-acp)
4
4
 
5
- Automatically reduces token usage in OpenCode by removing obsolete tools from conversation history.
6
-
7
- > **⚡ Enhanced fork** with aggressive default settings and four new pruning strategies: **Prune Thinking**, **Head-Tail Truncation**, **Read Consolidation**, and **Placeholder Compression**.
8
-
9
- ![ACP in action](dcp-demo5.png)
5
+ Automatically reduces token usage in OpenCode by removing redundant or obsolete content from conversation history.
6
+
7
+ > **⚡ Performance Fork:** Optimized with aggressive defaults and four additional pruning strategies: **Prune Thinking**, **Head-Tail Truncation**, **Read Consolidation**, and **Placeholder Compression**.
8
+
9
+ ## Pruning Architecture
10
+
11
+ ```mermaid
12
+ graph TD
13
+ A[New Turn] --> B{Strategy Type}
14
+
15
+ subgraph "Automatic (Zero Cost)"
16
+ B --> C[Deduplication]
17
+ B --> D[Supersede Writes]
18
+ B --> E[Purge Errors]
19
+ B --> F[Read Consolidation]
20
+ B --> G[Prune Thinking]
21
+ B --> H[Head-Tail Truncation]
22
+
23
+ C --> |Removes identical calls| I[Context Saved]
24
+ D --> |Prunes writes after reads| I
25
+ E --> |Strips failed call inputs| I
26
+ F --> |Stale reads point to new| I
27
+ G --> |Strips reasoning/thinking| I
28
+ H --> |Middle-out truncation| I
29
+ end
30
+
31
+ subgraph "LLM-Driven (High Precision)"
32
+ A --> J[AI Nudge]
33
+ J --> |Suggests Cleanup| K{Agent Action}
34
+ K --> L[Discard Tool]
35
+ K --> M[Extract Tool]
36
+ L --> |Direct removal| I
37
+ M --> |Summarize then remove| I
38
+ end
39
+ ```
10
40
 
11
41
  ## Installation
12
42
 
@@ -19,269 +49,57 @@ Add to your OpenCode config:
19
49
  }
20
50
  ```
21
51
 
22
- Using `@latest` ensures you always get the newest version automatically when OpenCode starts.
23
-
24
- Restart OpenCode. The plugin will automatically start optimizing your sessions.
25
-
26
- ## How Pruning Works
27
-
28
- DCP uses multiple tools and strategies to reduce context size:
29
-
30
- ### Tools
31
-
32
- **Discard** — Exposes a `discard` tool that the AI can call to remove completed or noisy tool content from context.
33
-
34
- **Extract** — Exposes an `extract` tool that the AI can call to distill valuable context into concise summaries before removing the tool content.
35
-
36
- ### Strategies
52
+ ## Core Strategies
37
53
 
38
- **Deduplication** Identifies repeated tool calls (e.g., reading the same file multiple times) and keeps only the most recent output. Runs automatically on every request with zero LLM cost.
54
+ ### 🚀 Automatic Strategies (Instant & Free)
39
55
 
40
- **Supersede Writes** Prunes write tool inputs for files that have subsequently been read. When a file is written and later read, the original write content becomes redundant since the current file state is captured in the read result. Runs automatically on every request with zero LLM cost. **⚡ Enabled by default in this fork.**
56
+ - **Head-Tail Truncation** (⚡ Default): Preserves the first 20% and last 30% of tool outputs, removing the middle section. Optimized for the "Lost in the Middle" phenomenon.
57
+ - **Read Consolidation**: When the same file is read multiple times, older outputs are replaced with pointers to the newest version.
58
+ - **Prune Thinking**: Strips ephemeral reasoning blocks (Claude `<thinking>` or OpenAI `reasoning`) from older messages.
59
+ - **Supersede Writes**: Automatically prunes `write` tool inputs once a subsequent `read` of the same file confirms the state.
60
+ - **Deduplication**: Eliminates identical tool calls while preserving the most recent result.
61
+ - **Purge Errors**: Automatically clears inputs from failed tool calls after 2 turns.
41
62
 
42
- **Purge Errors** — Prunes tool inputs for tools that returned errors after a configurable number of turns (default: 2). Error messages are preserved for context, but the potentially large input content is removed. Runs automatically on every request with zero LLM cost. **⚡ Reduced from 4 to 2 turns in this fork.**
63
+ ### 🧠 LLM-Driven Tools
43
64
 
44
- **Prune Thinking** NEW Removes extended thinking tokens (`<thinking>` blocks, OpenAI reasoning fields) from older assistant messages. Thinking tokens consume significant context but provide no utility after the response is generated. Preserves recent turns for cache efficiency. Runs automatically with zero LLM cost.
65
+ - **`discard`**: The agent manually removes tool content that is no longer relevant to the current task.
66
+ - **`extract`**: The agent distills valuable findings into a concise summary before purging the raw data.
45
67
 
46
- **Read Consolidation** ⚡ NEW — When the same tool+parameter is called multiple times (e.g., reading the same file), older outputs are replaced with pointers to the newer output. The newest call keeps its full content since it has the current state. Different from deduplication: keeps both calls visible as breadcrumbs. Runs automatically with zero LLM cost.
68
+ ## Configuration Defaults
47
69
 
48
- Example:
49
-
50
- ```
51
- Older read: [📍 See later read at message #15. This output is stale.]
52
- Newer read: [Full file content preserved here]
53
- ```
54
-
55
- **Head-Tail Truncation** ⚡ NEW — Preserves the first 20% (head) and last 30% (tail) of tool outputs, removing the middle section. Based on the "Lost in the Middle" phenomenon where LLMs perform best with information at the beginning and end. Preserves real content unlike placeholder compression. Runs automatically with zero LLM cost.
56
-
57
- Example:
58
-
59
- ```
60
- Before: [1000 tokens of file content]
61
- After: [First 200 tokens...]
62
-
63
- [...📄 500 tokens truncated...]
64
-
65
- [...Last 300 tokens]
66
- ```
70
+ This fork utilizes more aggressive defaults than the upstream version to maximize context efficiency:
67
71
 
68
- Tool-specific icons: 📄 File content, 🔍 Search results, 💻 Command output, 🌐 Web content, 📊 Data/Excel, 📋 Default
72
+ | Feature | Upstream | ACP Fork |
73
+ | :------------------- | :------- | :---------- |
74
+ | `purgeErrors.turns` | 4 | **2** |
75
+ | `nudgeFrequency` | 10 | **5** |
76
+ | `supersedeWrites` | Disabled | **Enabled** |
77
+ | `pruneThinking` | N/A | **Enabled** |
78
+ | `headTailTruncation` | N/A | **Enabled** |
69
79
 
70
- **Placeholder Compression** ⚡ (Disabled by default) — Replaces verbose tool outputs with actionable placeholder hints while preserving the tool call structure (name + input) as breadcrumbs. **Superseded by Head-Tail Truncation** which preserves real content instead of generic hints. Can be re-enabled if you prefer minimal placeholders.
80
+ ## Configuration File
71
81
 
72
- Your session history is never modified—DCP replaces pruned content with placeholders before sending requests to your LLM.
73
-
74
- ## Changes from Upstream
75
-
76
- | Setting | Upstream | This Fork | Rationale |
77
- | -------------------------------- | -------- | --------- | ---------------------------------- |
78
- | `purgeErrors.turns` | 4 | **2** | Errors rarely useful after 2 turns |
79
- | `nudgeFrequency` | 10 | **5** | More frequent prune reminders |
80
- | `supersedeWrites.enabled` | false | **true** | Safe with read-after-write pattern |
81
- | `pruneThinking.enabled` | N/A | **true** | Strip ephemeral thinking tokens |
82
- | `readConsolidation.enabled` | N/A | **true** | Older reads point to newer reads |
83
- | `headTailTruncation.enabled` | N/A | **true** | Keep 50% of content (head+tail) |
84
- | `placeholderCompression.enabled` | N/A | **false** | Superseded by head-tail truncation |
85
-
86
- ## Impact on Prompt Caching
87
-
88
- LLM providers like Anthropic and OpenAI cache prompts based on exact prefix matching. When DCP prunes a tool output, it changes the message content, which invalidates cached prefixes from that point forward.
89
-
90
- **Trade-off:** You lose some cache read benefits but gain larger token savings from reduced context size and performance improvements through reduced context poisoning. In most cases, the token savings outweigh the cache miss cost—especially in long sessions where context bloat becomes significant.
91
-
92
- > **Note:** In testing, cache hit rates were approximately 65% with DCP enabled vs 85% without.
93
-
94
- **Best use case:** Providers that count usage in requests, such as Github Copilot and Google Antigravity have no negative price impact.
95
-
96
- ## Configuration
97
-
98
- DCP uses its own config file:
99
-
100
- - Global: `~/.config/opencode/dcp.jsonc` (or `dcp.json`), created automatically on first run
101
- - Custom config directory: `$OPENCODE_CONFIG_DIR/dcp.jsonc` (or `dcp.json`), if `OPENCODE_CONFIG_DIR` is set
102
- - Project: `.opencode/dcp.jsonc` (or `dcp.json`) in your project's `.opencode` directory
103
-
104
- <details>
105
- <summary><strong>Default Configuration</strong> (click to expand)</summary>
82
+ The plugin automatically creates a config at `~/.config/opencode/dcp.jsonc`. You can also use project-level overrides in `.opencode/dcp.jsonc`.
106
83
 
107
84
  ```jsonc
108
85
  {
109
- "$schema": "https://raw.githubusercontent.com/tuanhung303/opencode-agent-context-pruning/master/dcp.schema.json",
110
- // Enable or disable the plugin
111
- "enabled": true,
112
- // Enable debug logging to ~/.config/opencode/logs/dcp/
113
- "debug": false,
114
- // Notification display: "off", "minimal", or "detailed"
115
- "pruneNotification": "detailed",
116
- // Slash commands configuration
117
- "commands": {
118
- "enabled": true,
119
- // Additional tools to protect from pruning via commands (e.g., /dcp sweep)
120
- "protectedTools": [],
121
- },
122
- // Protect from pruning for <turns> message turns
123
- "turnProtection": {
124
- "enabled": false,
125
- "turns": 4,
126
- },
127
- // Protect file operations from pruning via glob patterns
128
- // Patterns match tool parameters.filePath (e.g. read/write/edit)
129
- "protectedFilePatterns": [],
130
- // LLM-driven context pruning tools
131
- "tools": {
132
- // Shared settings for all prune tools
133
- "settings": {
134
- // Nudge the LLM to use prune tools (every <nudgeFrequency> tool results)
135
- "nudgeEnabled": true,
136
- "nudgeFrequency": 5, // ⚡ Changed from 10
137
- // Additional tools to protect from pruning
138
- "protectedTools": [],
139
- },
140
- // Removes tool content from context without preservation (for completed tasks or noise)
141
- "discard": {
142
- "enabled": true,
143
- },
144
- // Distills key findings into preserved knowledge before removing raw content
145
- "extract": {
146
- "enabled": true,
147
- // Show distillation content as an ignored message notification
148
- "showDistillation": false,
149
- },
150
- },
151
- // Automatic pruning strategies
152
86
  "strategies": {
153
- // Remove duplicate tool calls (same tool with same arguments)
154
- "deduplication": {
155
- "enabled": true,
156
- // Additional tools to protect from pruning
157
- "protectedTools": [],
158
- },
159
- // Prune write tool inputs when the file has been subsequently read
160
- "supersedeWrites": {
161
- "enabled": true, // ⚡ Changed from false
162
- },
163
- // Prune tool inputs for errored tools after X turns
164
- "purgeErrors": {
165
- "enabled": true,
166
- // Number of turns before errored tool inputs are pruned
167
- "turns": 2, // ⚡ Changed from 4
168
- // Additional tools to protect from pruning
169
- "protectedTools": [],
170
- },
171
- // ⚡ NEW: Remove extended thinking tokens from older messages
172
- "pruneThinking": {
173
- "enabled": true,
174
- // Turns to wait before pruning (preserves cache)
175
- "delayTurns": 1,
176
- },
177
- // ⚡ NEW: Consolidate duplicate reads - older outputs point to newer
178
- "readConsolidation": {
179
- "enabled": true,
180
- // Tools to track for consolidation
181
- "tools": ["read", "glob", "grep", "webfetch", "bash"],
182
- },
183
- // ⚡ NEW: Keep first 20% + last 30% of outputs, truncate middle
87
+ "pruneThinking": { "enabled": true, "delayTurns": 1 },
184
88
  "headTailTruncation": {
185
89
  "enabled": true,
186
- // Turns to wait before truncating
187
- "delayTurns": 2,
188
- // Ratio of output to keep from beginning
189
90
  "headRatio": 0.2,
190
- // Ratio of output to keep from end
191
91
  "tailRatio": 0.3,
192
- // Additional tools to protect from truncation
193
- "protectedTools": [],
194
- },
195
- // Replace verbose outputs with placeholder hints (disabled by default)
196
- // Superseded by headTailTruncation which preserves real content
197
- "placeholderCompression": {
198
- "enabled": false, // ⚡ Disabled - use headTailTruncation instead
199
- // Turns to wait before compressing
200
- "delayTurns": 2,
201
- // Only compress outputs larger than this token count
202
- "minOutputTokens": 100,
203
- // Additional tools to protect from compression
204
- "protectedTools": [],
205
92
  },
93
+ "readConsolidation": { "enabled": true },
206
94
  },
207
95
  }
208
96
  ```
209
97
 
210
- </details>
211
-
212
- ### New Strategy: Prune Thinking
213
-
214
- Removes extended thinking content from assistant messages after a configurable delay:
215
-
216
- - **Anthropic**: Removes `type: "thinking"` content blocks
217
- - **OpenAI**: Removes `reasoning` field from messages
218
- - **Fallback**: Strips `<thinking>...</thinking>` tags from text
219
-
220
- ```jsonc
221
- "pruneThinking": {
222
- "enabled": true,
223
- "delayTurns": 1 // Keep current turn for cache, prune older
224
- }
225
- ```
226
-
227
- ### New Strategy: Placeholder Compression
228
-
229
- Replaces verbose tool outputs with actionable hints while preserving breadcrumbs:
230
-
231
- | Tool | Placeholder Example |
232
- | ---------- | ----------------------------------------------------------------- |
233
- | `read` | `[File read previously. Read again if needed: /path/to/file.ts]` |
234
- | `grep` | `[Content search completed for: pattern. Search again if needed]` |
235
- | `bash` | `[Command executed: npm test. Re-run if needed]` |
236
- | `webfetch` | `[URL fetched: https://example.com. Fetch again if needed]` |
237
-
238
- **Protected tools** (never compressed): `write`, `edit`, `todowrite`, `todoread`, `discard`, `extract`, `task`, `question`, `batch`, `skill`
239
-
240
- ```jsonc
241
- "placeholderCompression": {
242
- "enabled": true,
243
- "delayTurns": 2, // Wait 2 turns before compressing
244
- "minOutputTokens": 100, // Only compress large outputs
245
- "protectedTools": [] // Add custom protected tools
246
- }
247
- ```
248
-
249
- ### Commands
250
-
251
- DCP provides a `/dcp` slash command:
252
-
253
- - `/dcp` — Shows available DCP commands
254
- - `/dcp context` — Shows a breakdown of your current session's token usage by category (system, user, assistant, tools, etc.) and how much has been saved through pruning.
255
- - `/dcp stats` — Shows cumulative pruning statistics across all sessions.
256
- - `/dcp sweep` — Prunes all tools since the last user message. Accepts an optional count: `/dcp sweep 10` prunes the last 10 tools. Respects `commands.protectedTools`.
257
-
258
- ### Turn Protection
259
-
260
- When enabled, turn protection prevents tool outputs from being pruned for a configurable number of message turns. This gives the AI time to reference recent tool outputs before they become prunable. Applies to both `discard` and `extract` tools, as well as automatic strategies.
261
-
262
- ### Protected Tools
263
-
264
- By default, these tools are always protected from pruning across all strategies:
265
- `task`, `todowrite`, `todoread`, `discard`, `extract`, `batch`, `write`, `edit`
266
-
267
- The `protectedTools` arrays in each section add to this default list.
268
-
269
- ### Config Precedence
270
-
271
- Settings are merged in order:
272
- Defaults → Global (`~/.config/opencode/dcp.jsonc`) → Config Dir (`$OPENCODE_CONFIG_DIR/dcp.jsonc`) → Project (`.opencode/dcp.jsonc`).
273
- Each level overrides the previous, so project settings take priority over config-dir and global, which take priority over defaults.
274
-
275
- Restart OpenCode after making config changes.
276
-
277
- ## Limitations
278
-
279
- **Subagents** — DCP is disabled for subagents. Subagents are not designed to be token efficient; what matters is that the final message returned to the main agent is a concise summary of findings. DCP's pruning could interfere with this summarization behavior.
280
-
281
- ## References
98
+ ## Commands
282
99
 
283
- - [Escaping Context Amnesia in AI Agents](https://www.hadijaveed.me/2025/11/26/escaping-context-amnesia-ai-agents/) - Inspiration for placeholder compression strategy
284
- - [Original DCP Repository](https://github.com/Opencode-DCP/opencode-dynamic-context-pruning)
100
+ - `/dcp context`: Visualize token usage and savings.
101
+ - `/dcp stats`: Cumulative efficiency report.
102
+ - `/dcp sweep`: Force a manual cleanup of the current session.
285
103
 
286
104
  ## License
287
105
 
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "$schema": "https://json.schemastore.org/package.json",
3
3
  "name": "@tuanhung303/opencode-acp",
4
- "version": "2.1.0",
4
+ "version": "2.1.1",
5
5
  "type": "module",
6
6
  "description": "OpenCode plugin that optimizes token usage by pruning obsolete tool outputs - Aggressive fork with Head-Tail Truncation, Read Consolidation, Prune Thinking, and Placeholder Compression strategies",
7
7
  "main": "./dist/index.js",