@bd7pil/opencode-deep-memory 0.3.5 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -37,42 +37,95 @@ OpenCode auto-installs on startup. Memory appears at `.deep-memory/` in your pro
37
37
  │ repo map (code symbols) │
38
38
  └─────────────────────────────┘
39
39
 
40
- ┌──────────────┐ ┌──────────────┐ │ ┌──────────────┐
41
- │ chat.message │ │ chat.params │ │ │messages.tfm
42
- │ keyword→notes│ │ agent→budget │ │ │ strip old
43
- │ "记住"/"rem" │ │ main 800t │ │ │ reasoning +
44
- │ │ │ oracle 400t │ │ │ metadata +
45
- └──────────────┘ └──────────────┘ │ │ errors
46
- └──────────────┘
47
- ┌──────────────┘
48
-
49
- ┌─────────────────────┴───────────────────────┐
40
+ ┌──────────────┐ ┌──────────────┐ │ ┌───────────────────────────┐
41
+ │ chat.message │ │ chat.params │ │ │ messages.transform
42
+ │ keyword→notes│ │ agent→budget │ │ │ Layer 1: strip reason.
43
+ │ "记住"/"rem" │ │ main 800t │ │ │ Layer 2: deep compress
44
+ │ │ │ oracle 400t │ │ │ dedup / error purge /
45
+ └──────────────┘ └──────────────┘ │ │ tool compress / JSON /
46
+ │ message prune / CCR │
47
+ ┌──────────────┘ └───────────────────────────┘
48
+
49
+ ┌────────────────────┴────────────────────────┐
50
50
  │ event │
51
51
  │ session.created → resume + dream schedule │
52
- │ session.idle → enrichment
52
+ │ session.idle → enrichment + notify
53
53
  │ session.compacted → checkpoint │
54
54
  └─────────────────────────────────────────────┘
55
55
  ```
56
56
 
57
57
  ## Context compression
58
58
 
59
- Old messages (>8 turns) are compressed deterministically, without calling an LLM.
60
- The key insight: **reasoning is a disposable process** — once the model reaches a conclusion
61
- (in the text or tool output), the reasoning that got it there no longer affects future turns.
62
- Similarly, API metadata, system notifications, and inline thinking tags carry no value
63
- once the conversation moves past them. We strip these in-place, replacing removed parts
64
- with sentinels so message structure stays intact and prompt caching is preserved.
59
+ Two compression layers run automatically, no LLM calls required.
60
+
61
+ ### Layer 1: Deterministic stripping
62
+
63
+ Always active, strips disposable content from old messages:
65
64
 
66
65
  | What gets stripped | How | Why safe |
67
66
  |--------------------|-----|----------|
68
- | `reasoning_details` metadata | Delete the JSON blob from the part | API billing metadata, never reaches the model |
69
- | Old reasoning text | Set `thinking`/`text` to `"[cleared]"` | Conclusions are in the assistant's text output |
70
- | System injections | Replace entire message with sentinel | `<system-reminder>` and OMO markers are stale after one turn |
71
- | Tool errors >100 chars | Truncate to first 100 chars | An old error only needs "it failed", not the full trace |
72
- | Inline `<thinking>` tags | Regex strip from old assistant text | Same as reasoning — process, not product |
67
+ | `reasoning_details` metadata | Delete the JSON blob | Billing metadata, never reaches model |
68
+ | Old reasoning text | Replace with `[cleared]` | Conclusions are in assistant text |
69
+ | System injections | Replace with `[stripped]` | `<system-reminder>` stale after one turn |
70
+ | Tool errors >100 chars | Truncate | An old error only needs "it failed" |
71
+ | Inline `<thinking>` tags | Regex strip | Process, not product |
72
+
73
+ ### Layer 2: Deep compression (pressure-triggered)
74
+
75
+ Activates when context pressure exceeds thresholds. Inspired by
76
+ [DCP](https://github.com/Opencode-DCP/opencode-dynamic-context-pruning),
77
+ [Headroom](https://github.com/chopratejas/headroom), and
78
+ [Edgee](https://github.com/edgee-ai/edgee).
79
+
80
+ | Pressure | Threshold | Actions |
81
+ |----------|-----------|---------|
82
+ | **low** | < 50% context | Layer 1 only |
83
+ | **medium** | 50–70% | + tool dedup + error purge + tool output compression |
84
+ | **high** | 70–85% | + JSON array crush + old message truncation + nudge |
85
+ | **critical** | > 85% | + aggressive nudge (model prompted to compress) |
86
+
87
+ What gets compressed at medium+:
88
+
89
+ | Target | Strategy | Source |
90
+ |--------|----------|--------|
91
+ | Duplicate tool calls | Signature matching (`toolName::sortedParams`) | DCP |
92
+ | Old error inputs | Purge inputs after 4 turns | DCP |
93
+ | File reads | Keep first 50 + key lines + last 20 | Edgee |
94
+ | Command outputs | Keep errors + last 30 lines | Edgee |
95
+ | Search results | Keep top-20, group by file | Edgee |
96
+ | JSON arrays | Keep first 30% + last 15% + dedup middle | Headroom SmartCrusher |
97
+ | Old assistant text | Extract key info (headings, code, errors) | DCP |
98
+
99
+ All compressed content is **reversible** via CCR (Compress-Cache-Retrieve):
100
+ originals are cached with SHA-256 hash and 5-minute TTL.
101
+ Models can retrieve them via the `deep_expand` tool.
102
+
103
+ **Never touched**: user messages, recent 8 messages, protected tools
104
+ (question, edit, write, todowrite, memory_store/search/forget).
105
+
106
+ ## Toast notifications
107
+
108
+ After each LLM turn, deep-memory shows a toast notification (bottom-right corner) summarizing
109
+ what was compressed and injected. The notification level is chosen automatically:
110
+
111
+ | Scenario | Level | Content |
112
+ |----------|-------|---------|
113
+ | Injection only (no compression) | minimal | One-line summary: `-8.5K stripped` |
114
+ | Compression (short session) | detailed | Progress bar + per-category breakdown |
115
+ | Compression + rich context (repo-map, memory, checkpoint) | extended | Full panel with budget usage |
116
+
117
+ Example toast (detailed level):
73
118
 
74
- **Never touched**: user messages (anchor turn boundaries), recent 8 messages (working context),
75
- tool calls and their results (API pairing integrity).
119
+ ```
120
+ deep-memory | compressed
121
+ ─ Compression ─────────────────────────────
122
+ │████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░│
123
+ reasoning -6.2K | metadata -2.1K | tool_err -0.8K
124
+ ─ Injection ───────────────────────────────
125
+ m[0] stable 1055B ✓ m[1] volatile 574B
126
+ tier=main | mode=normal
127
+ repo-map: 12 symbols | memory: 8 entries
128
+ ```
76
129
 
77
130
  ## Cache-stable injection
78
131
 
@@ -117,6 +170,16 @@ updated incrementally on writes.
117
170
  └── sessions/<sid>/ per-session archive
118
171
  ```
119
172
 
173
+ ## Tools
174
+
175
+ | Tool | Purpose |
176
+ |------|---------|
177
+ | `memory_search` | Search persistent memory across sessions (BM25 + CJK) |
178
+ | `memory_store` | Store decisions, constraints, gotchas, facts, notes |
179
+ | `memory_forget` | Remove memory entries matching a query |
180
+ | `memory_expand` | Decompress a sentinel reference to its original content |
181
+ | `deep_expand` | Retrieve original content compressed by CCR (use `[ccr:HASH]` marker) |
182
+
120
183
  ## Commands
121
184
 
122
185
  Copy `.opencode/command/*.md` to your project:
@@ -161,6 +224,18 @@ and JetBrains plugin.
161
224
 
162
225
  **[Plandex][]** — an AI coding agent designed for large tasks and real world projects.
163
226
 
227
+ **[DCP][]** — Dynamic Context Pruning for OpenCode. Our tool deduplication, error purging,
228
+ and nudge system are inspired by DCP's architecture.
229
+
230
+ **[Headroom][]** — compress tool outputs, logs, files, RAG chunks for AI agents.
231
+ Our JSON array crush and CCR (Compress-Cache-Retrieve) are derived from Headroom's SmartCrusher.
232
+
233
+ **[Edgee][]** — agent gateway that compresses tokens before LLM providers.
234
+ Our per-tool compression strategies (read, bash, grep, glob) are inspired by Edgee's approach.
235
+
236
+ **[Contextomizer][]** — ultra-fast deterministic library for transforming bloated tool outputs.
237
+ Our content type detection pipeline is inspired by Contextomizer's approach.
238
+
164
239
  [MiMo-Code]: https://github.com/XiaomiMiMo/MiMo-Code
165
240
  [Magic Context]: https://github.com/cortexkit/magic-context
166
241
  [Aider]: https://github.com/Aider-AI/aider
@@ -168,6 +243,10 @@ and JetBrains plugin.
168
243
  [Continue]: https://github.com/continuedev/continue
169
244
  [OpenHands]: https://github.com/All-Hands-AI/OpenHands
170
245
  [Plandex]: https://github.com/plandex-ai/plandex
246
+ [DCP]: https://github.com/Opencode-DCP/opencode-dynamic-context-pruning
247
+ [Headroom]: https://github.com/chopratejas/headroom
248
+ [Edgee]: https://github.com/edgee-ai/edgee
249
+ [Contextomizer]: https://github.com/GandalFran/contextomizer
171
250
 
172
251
  ## Development
173
252
 
@@ -176,6 +255,8 @@ npm install
176
255
  npm run verify # typecheck + test (363) + build + smoke (49)
177
256
  ```
178
257
 
258
+ Stats: 54 source files, 27 test files (363 tests), 10 compress modules, 49 smoke checks.
259
+
179
260
  ## CI/CD (npm Trusted Publishing)
180
261
 
181
262
  Releases use npm OIDC Trusted Publishing — no token needed. To set up for a fork: