agent-cache-optimizer 0.5.3 → 0.6.0

package/CHANGELOG.md CHANGED
@@ -1,41 +1,104 @@
1
1
  # Changelog
2
2
 
3
- ## 0.4.0 — 2026-06-25
3
+ ## 0.6.0 — 2026-06-25
4
4
 
5
5
  ### Added
6
- - **Cache warming**: persist known-stable hashes to `warm-cache.json`; new sessions skip cold start
7
- - **Savings tracking**: cumulative estimated $ savings in `savings.json`, displayed in `aco status`
8
- - **Enhanced diag.log**: per-call stableKB + estimated $ saved + cumulative total
9
- - **Conversation log adapter**: append-only guidelines for maximizing cache across turns
6
+
7
+ - **Model-scoped tracking**: databases and warm caches are now keyed by `provider__model__agent` instead of just agent name, enabling correct per-provider/model stability tracking across multi-model setups
8
+ - **Block splitting v2**: robust brace-depth JSON parser handles arbitrary nesting, escaped strings, consecutive objects (not just arrays), and XML sibling elements. Markdown section splitting respects fenced code blocks. Long top-level lists (3+ items) are split into individual items.
9
+ - **Volatile metadata detection**: cold-start heuristics now detect and cap blocks containing dynamic meta-info patterns (`currentDate`, `session ID`, `timestamp`, `last updated`, `ISO timestamp`) even when structural heuristics would otherwise boost them to stable
10
+ - **Provider cache metrics**: real cache hit rate tracking from OpenCode provider events (`cacheReadTokens`, `cacheWriteTokens`, `cacheHitRate`) stored in `cache-metrics.json` with per-scope and total aggregation
11
+ - **Structured event logging**: all significant events written to `events.jsonl` with content-hashed IDs for privacy-preserving observability
12
+ - **Enhanced CLI**: `agent-cache-optimizer status` now displays cache hit rate, structured event counts, and properly handles scoped warm cache format
10
13
 
11
14
  ### Changed
12
- - `classify()` now accepts `warmHashes` for instant warm-state classification
13
- - `aco status --json` includes savings + warm cache data
14
- - `aco status` dashboard shows est. savings and warm cache count
15
+
16
+ - **Two-tier classification**: simplified from 3 tiers (stable/unknown/dynamic) to 2 tiers (stable/dynamic) with 0.5 threshold, effectively eliminating the "unknown" bucket
17
+ - **Warm cache v2**: upgraded format with `global` + per-scope hash sets; hashes stable across multiple scopes are promoted to global for cross-scope cache warming
18
+ - **Warm cache durability**: hashes persist across sessions unless absent from ALL scopes (not just removed on first scope change)
19
+ - **Content-addressable snapshot keys**: session and item IDs in cache metrics are content-hashed to avoid leaking sensitive identifiers
20
+ - Cold-start classification now routes 0.5-score blocks to stable instead of unknown
21
+
22
+ ### Fixed
23
+
24
+ - Cumulative savings no longer double-counted (was multiplying by observation count twice)
25
+ - Metrics deduplication: zero-delta provider events are skipped after the first recording
26
+
27
+ ## 0.5.4 — 2026-06-25
28
+
29
+ ### Added
30
+
31
+ - **Disk space management**: diag.log rotates at 50KB/1000 lines
32
+ - **Stale hash pruning**: auto-removes hashes unseen for 7 days with count≤2
33
+ - **VERSION const**: version logged in plugin startup message
34
+
35
+ ### Fixed
36
+
37
+ - Migration detects missing `contentObservations` and resets cleanly
38
+
39
+ ## 0.5.3 — 2026-06-25
40
+
41
+ ### Changed
42
+
43
+ - `VERSION` constant replaces hardcoded version strings
44
+ - Migration logic simplified: full reset when `contentObservations` missing
45
+
46
+ ## 0.5.2 — 2026-06-25
47
+
48
+ ### Fixed
49
+
50
+ - `contentObservations` field properly tracked (was missing after migration)
51
+
52
+ ## 0.5.1 — 2026-06-25
53
+
54
+ ### Fixed
55
+
56
+ - Auto-migrate pre-0.5 DBs on load (rebuild contentIndex from positions)
57
+ - `contentObservations` separate from `observations` for accurate content scoring
58
+
59
+ ## 0.5.0 — 2026-06-25
60
+
61
+ ### Added
62
+
63
+ - **Content-addressed block matching** (Irminsul-lite): track blocks by hash regardless of position
64
+ - `contentIndex` + `contentScores` in StabilityDB for position-independent tracking
65
+ - `updateContentDB()` for per-call content fingerprinting
66
+ - `lookupContentScore()` for position-independent score queries
67
+ - Classification priority: warm cache → content score → position score → cold start
68
+
69
+ ### Changed
70
+
71
+ - Stable block identification improved from ~1/25 to 25/25 in production
72
+
73
+ ## 0.4.0 — 2026-06-25
74
+
75
+ ### Added
76
+
77
+ - **Cache warming**: persist known-stable hashes to `warm-cache.json`
78
+ - **Savings tracking**: cumulative estimated $ savings in `savings.json`
79
+ - **Enhanced diag.log**: per-call stableKB + estimated $ saved + cumulative total
80
+ - **Conversation log adapter**: append-only cache optimization guidelines
15
81
 
16
82
  ## 0.2.1 — 2026-06-24
17
83
 
18
84
  ### Fixed
19
- - Binary renamed from `aco` to `agent-cache-optimizer` (aco was taken on npm)
85
+
86
+ - Binary renamed from `aco` to `agent-cache-optimizer` (`aco` was taken on npm)
20
87
 
21
88
  ## 0.2.0 — 2026-06-24
22
89
 
23
90
  ### Added
24
- - `agent-cache-optimizer` CLI binary (replaces skill-based slash command)
25
- - `aco status` / `aco status --json` commands
91
+
92
+ - `agent-cache-optimizer` CLI (replaces skill-based slash command)
93
+ - `agent-cache-optimizer status` and `--json` commands
26
94
 
27
95
  ## 0.1.0 — 2026-06-24
28
96
 
29
97
  ### Added
30
98
 
31
- - **Core engine**: content-agnostic hash-based stability tracking (`core.ts`)
32
- - **Cold-start heuristics**: universal position/size/structure signals (`heuristics.ts`)
33
- - **Block splitting**: automatic splitting of >4KB blocks at JSON/Markdown/XML boundaries (`splitting.ts`)
34
- - **OpenCode plugin**: `experimental.chat.system.transform` hook for runtime prompt reordering
35
- - **Per-agent tracking**: isolated stability databases for orchestrator/oracle/fixer/etc.
36
- - **Diagnostics**: `chat.params` fallback logging, `diag.log` audit trail
37
- - **Provider headers**: automatic Anthropic `prompt-caching-2024-07-31` header via `chat.headers`
38
- - **Status dashboard**: `/cache-status` slash command + `cache-status.sh` CLI script
39
- - **Cache audit tool**: `check-cache-friendly.sh` for scanning config files
40
- - **Claude Code adapter**: optimization guidelines document
41
- - **Documentation**: bilingual README (EN + zh-CN), cross-CLI architecture docs
99
+ - Core engine: content-agnostic hash-based stability tracking
100
+ - Cold-start heuristics: universal position/size/structure signals
101
+ - Block splitting for >4KB blocks at JSON/Markdown/XML boundaries
102
+ - OpenCode plugin: `experimental.chat.system.transform` hook
103
+ - Per-agent tracking, diagnostics, Anthropic prompt-caching header
104
+ - Bilingual README (EN + zh-CN)
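The 0.5.0 and 0.6.0 entries above describe a lookup order (warm cache, then content score, then position score, then cold start) and a two-tier split at a 0.5 threshold. The Python sketch below shows one way those two rules could combine. It is not the package's `classify()`: the name `classify_block`, its parameters, the use of `None` for a missing score, and applying `>=` to every score source are assumptions made for illustration.

```python
from typing import Optional

# From the changelog: two tiers with a 0.5 threshold, and 0.5 itself is stable.
STABLE_THRESHOLD = 0.5


def classify_block(
    block_hash: str,
    warm_hashes: set[str],
    content_score: Optional[float],
    position_score: Optional[float],
    cold_start_score: float,
) -> str:
    """Return 'stable' or 'dynamic' (hypothetical helper, not the real API)."""
    # 1. A warm-cache hit wins outright.
    if block_hash in warm_hashes:
        return "stable"
    # 2-3. Otherwise use the first observed score: content, then position.
    for score in (content_score, position_score):
        if score is not None:
            return "stable" if score >= STABLE_THRESHOLD else "dynamic"
    # 4. Nothing observed yet: fall back to the cold-start heuristic score.
    return "stable" if cold_start_score >= STABLE_THRESHOLD else "dynamic"
```

With no "unknown" tier, every block lands in one of two buckets, so a borderline cold-start score of exactly 0.5 is treated as stable, as the 0.6.0 entry states.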
package/README.md CHANGED
@@ -96,20 +96,24 @@ agent-cache-optimizer status --json # JSON for scripts
96
96
  ║ KV Cache Optimizer Status ║
97
97
  ╠══════════════════════════════════════════════════════════════╣
98
98
  ║ Status: ACTIVE ║
99
- ║ Mode: orchestrator=WARM oracle=COLD
100
- ║ Uptime: 2026-06-24T15:30 → 2026-06-24T16:45 ║
99
+ ║ Mode: WARM (12 scopes, 150 observations)
100
+ ║ Uptime: 2026-06-24T15:30 → 2026-06-25T16:45 ║
101
+ ║ Structured events: 1267 jsonl records ║
101
102
  ╠══════════════════════════════════════════════════════════════╣
102
- Agent Obs Positions Stable
103
- orchestrator 12 11 8/11
104
- oracle 3 5 3/5
103
+ Scope Obs Positions Stable
104
+ deepseek__deepseek-chat__orch 12 25 25/25
105
+ deepseek__deepseek-chat__oracle 3 5 5/5
105
106
  ╠══════════════════════════════════════════════════════════════╣
106
- Estimated cache reuse: ~73% of system prompt
107
+ Est. savings: $1.2345 over 50 calls
108
+ ║ Warm cache: 52 stable hashes pinned (18 global + 34 scoped) ║
109
+ ║ Cache hit: 96.4% (29952/31061 input tokens) ║
110
+ ║ Last reorder: S:25 U:0 D:0 T:25 obs:150 ║
107
111
  ╚══════════════════════════════════════════════════════════════╝
108
112
  ```
109
113
 
110
114
  ## 🏗 How It Works
111
115
 
112
- ### 1. Observe (content-agnostic)
116
+ ### 1. Observe (content-addressed, position-independent)
113
117
 
114
118
  The plugin **never reads the content** of your prompts. It only hashes
115
119
  each system block and tracks which hashes stay the same vs change across calls.
@@ -126,7 +130,15 @@ After 3 observations:
126
130
  ...
127
131
  ```
128
132
 
129
- ### 2. Classify & Reorder
133
+ ### 2. Split & Classify
134
+
135
+ Large blocks (>4KB) are split at structural boundaries — JSON arrays,
136
+ Markdown headings, XML elements, and long lists — using a robust
137
+ brace-depth parser that handles arbitrary nesting and fenced code blocks.
138
+
139
+ Cold-start heuristics detect volatile metadata patterns (`currentDate`,
140
+ `session ID`, `timestamp`) and cap their scores to prevent structural
141
+ boosts from misclassifying them as stable.
130
142
 
131
143
  ```
132
144
  ┌──────────┐
@@ -144,30 +156,36 @@ After 3 observations:
144
156
 
145
157
  | Phase | Trigger | Method |
146
158
  | -------------- | ----------------------- | -------------------------------------------- |
147
- | **Cold start** | First 2 calls per agent | Universal position/size/structure heuristics |
159
+ | **Cold start** | First 2 calls per scope | Universal position/size/structure heuristics |
148
160
  | **Warm** | 3+ calls | Hash-based stability scores |
149
161
 
150
162
  The cold-start heuristics use **only** structural signals (position, size,
151
163
  delimiters, line density) — no keyword matching, no config awareness.
152
164
  This means the plugin works immediately with **any** agent setup.
153
165
 
166
+ ### 4. Provider Cache Metrics
167
+
168
+ Real cache hit rates are tracked from OpenCode provider events — no
169
+ estimation needed. `cache-metrics.json` records per-scope and
170
+ total `cacheReadTokens`, `cacheWriteTokens`, and `cacheHitRate`.
171
+ All session and message IDs are content-hashed for privacy.
172
+
154
173
  ## 📊 Benchmarks
155
174
 
156
175
  Tested on a realistic OpenCode orchestrator prompt (~25KB system prompt):
157
176
 
158
- | Scenario | Cacheable prefix | Improvement |
159
- | ------------------------------ | ---------------- | ----------- |
160
- | Original (no reorder) | 0 KB (0%) | — |
161
- | Cold start (heuristics) | 21.8 KB (88%) | +88% |
162
- | Warm (hash-based, 3+ sessions) | 21.8 KB (88%) | +88% |
177
+ | Scenario | Cacheable prefix | Improvement |
178
+ | ---------------------------- | ------------------ | ----------- |
179
+ | Original (no reorder) | 0 KB (0%) | — |
180
+ | Cold start (heuristics) | 21.8 KB (88%) | +88% |
181
+ | **Content-addressed (v0.5)** | **52.9 KB (100%)** | **+100%** |
163
182
 
164
- **Per-agent results** (3 different agent configurations):
183
+ **Production results** (155 observations, deepseek-v4-pro):
165
184
 
166
- | Agent | Blocks | Stable | Dynamic | Cacheable |
167
- | ------------ | ------ | ------ | ------- | --------- |
168
- | orchestrator | 11 | 8 | 3 | 88% |
169
- | oracle | 6 | 3 | 3 | 88% |
170
- | fixer | 6 | 3 | 3 | 90% |
185
+ | Phase | S | U | D | Stable KB |
186
+ | ---------------------------- | ------ | ----- | ----- | ----------- |
187
+ | Pre-v0.5 (position-based) | 1 | 0 | 24 | ~2 KB |
188
+ | **v0.5 (content-addressed)** | **25** | **0** | **0** | **52.9 KB** |
171
189
 
172
190
  ## 🔌 Supported Platforms
173
191
 
@@ -202,25 +220,29 @@ db = updateDB(db, optimized)
202
220
  ```
203
221
  agent-cache-optimizer/
204
222
  ├── src/
205
- │ ├── index.ts # OpenCode plugin entry point
206
- │ ├── core.ts # Hash-tracking engine (CLI-agnostic)
207
- │ ├── heuristics.ts # Cold-start position/size classifiers
208
- │ ├── splitting.ts # Large block splitter (>4KB)
209
- └── types.ts # TypeScript types
223
+ │ ├── index.ts # OpenCode plugin entry
224
+ │ ├── core.ts # Content-addressed hash engine
225
+ │ ├── heuristics.ts # Cold-start + content classifiers
226
+ │ ├── splitting.ts # Large block splitter (brace-depth parser)
227
+ ├── types.ts # TypeScript types
228
+ │ └── __tests__/ # Unit tests (vitest)
229
+ │ ├── plugin.test.ts
230
+ │ └── heuristics-splitting.test.ts
210
231
  ├── adapters/
211
- └── claude-code.md # Claude Code optimization guide
232
+ ├── claude-code.md # Claude Code optimization guide
233
+ │ └── conversation-log.md # Append-only log guidelines
212
234
  ├── bin/
213
235
  │ └── aco # CLI: agent-cache-optimizer status
214
236
  ├── scripts/
215
- │ ├── cache-status.sh # Status dashboard (legacy)
237
+ │ ├── cache-status.sh # Legacy status script
216
238
  │ └── check-cache-friendly.sh # Config audit tool
217
239
  ├── docs/
218
- │ ├── cross-cli.md # Cross-CLI architecture
219
- └── upstream.md # Upstream fix recommendations
220
- ├── README.md
221
- ├── README.zh-CN.md # 中文文档
222
- ├── LICENSE # MIT
223
- └── CHANGELOG.md
240
+ │ ├── deep-research-kv-cache.md # DeepSeek KV cache research
241
+ ├── cross-cli.md # Cross-CLI architecture
242
+ │ └── upstream.md # Upstream fix recommendations
243
+ ├── README.md + README.zh-CN.md
244
+ ├── CHANGELOG.md
245
+ └── LICENSE (MIT)
224
246
  ```
225
247
 
226
248
  ## 🛠 Cache-Friendliness Audit
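The "Split & Classify" text added to the README above mentions a brace-depth parser that copes with arbitrary nesting, escaped strings, and consecutive objects. As a rough illustration of that technique only, the Python sketch below splits a string into its top-level JSON values. The function name `split_top_level_json` is hypothetical, the code is not taken from `splitting.ts`, and the Markdown, XML, and list splitting that the README also describes is not covered.

```python
def split_top_level_json(text: str) -> list[str]:
    """Split consecutive JSON objects/arrays into top-level chunks.

    Hypothetical helper for illustration; not the package's splitter.
    """
    chunks: list[str] = []
    depth = 0          # current nesting depth of {} and []
    start = None       # index where the current top-level value began
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True right after a backslash inside a string

    for i, ch in enumerate(text):
        if in_string:
            # Braces inside strings must not change the depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            if depth == 0:
                start = i
            depth += 1
        elif ch in "}]" and depth > 0:
            depth -= 1
            if depth == 0 and start is not None:
                chunks.append(text[start:i + 1])
                start = None
    return chunks
```

Tracking string state separately from depth is what keeps a `}` inside a quoted value from closing a block early. Unbalanced closing brackets are ignored rather than raising an error, which is a choice made for this sketch.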
package/bin/aco CHANGED
@@ -38,6 +38,9 @@ print(json.dumps(agents))" 2>/dev/null || echo "$agents_json")
38
38
  local diag_entries=0
39
39
  [[ -f "$CACHE_DIR/diag.log" ]] && diag_entries=$(wc -l < "$CACHE_DIR/diag.log" | tr -d ' ')
40
40
 
41
+ local event_entries=0
42
+ [[ -f "$CACHE_DIR/events.jsonl" ]] && event_entries=$(wc -l < "$CACHE_DIR/events.jsonl" | tr -d ' ')
43
+
41
44
  local total_obs=0
42
45
  for db in "$CACHE_DIR"/stability-*.json; do
43
46
  [[ -f "$db" ]] || continue
@@ -46,9 +49,6 @@ print(json.dumps(agents))" 2>/dev/null || echo "$agents_json")
46
49
  total_obs=$((total_obs + o))
47
50
  done
48
51
 
49
- local status="no_data"
50
- [[ $diag_entries -gt 0 ]] && status="active"
51
-
52
52
  local savings_json="{}"
53
53
  if [[ -f "$CACHE_DIR/savings.json" ]]; then
54
54
  savings_json=$(python3 -c "import json; print(json.dumps(json.load(open('$CACHE_DIR/savings.json'))))" 2>/dev/null || echo "{}")
@@ -56,7 +56,29 @@ print(json.dumps(agents))" 2>/dev/null || echo "$agents_json")
56
56
 
57
57
  local warm_count=0
58
58
  if [[ -f "$CACHE_DIR/warm-cache.json" ]]; then
59
- warm_count=$(python3 -c "import json; print(len(json.load(open('$CACHE_DIR/warm-cache.json'))))" 2>/dev/null || echo 0)
59
+ warm_count=$(python3 -c "
60
+ import json
61
+ d=json.load(open('$CACHE_DIR/warm-cache.json'))
62
+ hashes=set(d if isinstance(d, list) else d.get('global', []))
63
+ if isinstance(d, dict):
64
+     for values in d.get('scopes', {}).values():
65
+         hashes.update(values)
66
+ print(len(hashes))" 2>/dev/null || echo 0)
67
+ fi
68
+
69
+ local cache_metrics_json="{}"
70
+ if [[ -f "$CACHE_DIR/cache-metrics.json" ]]; then
71
+ cache_metrics_json=$(python3 -c "import json; print(json.dumps(json.load(open('$CACHE_DIR/cache-metrics.json'))))" 2>/dev/null || echo "{}")
72
+ fi
73
+
74
+ local metric_events=0
75
+ if [[ -f "$CACHE_DIR/cache-metrics.json" ]]; then
76
+ metric_events=$(python3 -c "import json; print(json.load(open('$CACHE_DIR/cache-metrics.json')).get('total', {}).get('events', 0))" 2>/dev/null || echo 0)
77
+ fi
78
+
79
+ local status="no_data"
80
+ if [[ $diag_entries -gt 0 || $total_obs -gt 0 || $metric_events -gt 0 ]]; then
81
+ status="active"
60
82
  fi
61
83
 
62
84
  python3 -c "
@@ -64,10 +86,12 @@ import json
64
86
  print(json.dumps({
65
87
  'status': '$status',
66
88
  'diag_entries': $diag_entries,
89
+ 'event_entries': $event_entries,
67
90
  'total_observations': $total_obs,
68
91
  'agents': $agents_json,
69
92
  'savings': $savings_json,
70
- 'warm_cache_hashes': $warm_count
93
+ 'warm_cache_hashes': $warm_count,
94
+ 'cache_metrics': $cache_metrics_json
71
95
  }, indent=2))"
72
96
  }
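The inline Python added to `status_json` above counts distinct warm-cache hashes for both the older list format and the v2 `global` + `scopes` format. The sketch below restates that logic as a standalone function so it can be exercised outside the shell script. The name `count_warm_hashes` is hypothetical, and the sample data is made up for the example.

```python
import json


def count_warm_hashes(data) -> int:
    """Count distinct hashes in parsed warm-cache.json (v1 list or v2 dict)."""
    if isinstance(data, list):
        # v1 format: a flat list of hashes.
        return len(set(data))
    # v2 format: a global list plus one list per scope.
    hashes = set(data.get("global", []))
    for scoped in data.get("scopes", {}).values():
        hashes.update(scoped)
    return len(hashes)


v1 = json.loads('["a", "b", "b"]')
v2 = json.loads(
    '{"global": ["a"], "scopes": {"deepseek__deepseek-chat__orch": ["a", "c"]}}'
)
print(count_warm_hashes(v1), count_warm_hashes(v2))  # prints: 2 2
```

Because the hashes go through a set, one that appears both globally and in a scope is counted once.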
73
97
 
@@ -98,6 +122,9 @@ status_text() {
98
122
  last_ts=$(tail -1 "$CACHE_DIR/diag.log" | grep -oP '^\[[^\]]+\]' | tr -d '[]' 2>/dev/null || echo "?")
99
123
  fi
100
124
 
125
+ local event_entries=0
126
+ [[ -f "$CACHE_DIR/events.jsonl" ]] && event_entries=$(wc -l < "$CACHE_DIR/events.jsonl" | tr -d ' ')
127
+
101
128
  echo ""
102
129
  echo -e "${BOLD}╔══════════════════════════════════════════════════════╗${NC}"
103
130
  echo -e "${BOLD}║ agent-cache-optimizer Status ║${NC}"
@@ -112,6 +139,10 @@ status_text() {
112
139
 
113
140
  echo -e "${BOLD}╠══════════════════════════════════════════════════════╣${NC}"
114
141
 
142
+ if [[ $event_entries -gt 0 ]]; then
143
+ printf "║ ${CYAN}Structured events: ${event_entries} jsonl records${NC} ║\n"
144
+ fi
145
+
115
146
  if [[ $agent_count -eq 0 ]]; then
116
147
  echo -e "║ ${YELLOW}No per-agent data yet${NC} ║"
117
148
  else
@@ -155,10 +186,34 @@ print(d.get('totalObservations', 0))" 2>/dev/null || echo "0")
155
186
  # Warm cache
156
187
  if [[ -f "$CACHE_DIR/warm-cache.json" ]]; then
157
188
  local warm_count
158
- warm_count=$(python3 -c "import json; print(len(json.load(open('$CACHE_DIR/warm-cache.json'))))" 2>/dev/null || echo "0")
189
+ warm_count=$(python3 -c "
190
+ import json
191
+ d=json.load(open('$CACHE_DIR/warm-cache.json'))
192
+ hashes=set(d if isinstance(d, list) else d.get('global', []))
193
+ if isinstance(d, dict):
194
+     for values in d.get('scopes', {}).values():
195
+         hashes.update(values)
196
+ print(len(hashes))" 2>/dev/null || echo "0")
159
197
  printf "║ ${CYAN}Warm cache: ${warm_count} stable hashes pinned${NC} ║\n"
160
198
  fi
161
199
 
200
+ # Real provider cache metrics from OpenCode events
201
+ if [[ -f "$CACHE_DIR/cache-metrics.json" ]]; then
202
+ local metrics_line
203
+ metrics_line=$(python3 -c "
204
+ import json
205
+ d=json.load(open('$CACHE_DIR/cache-metrics.json')).get('total', {})
206
+ read=int(d.get('cacheReadTokens', 0))
207
+ write=int(d.get('cacheWriteTokens', 0))
208
+ inp=int(d.get('inputTokens', 0))
209
+ rate=float(d.get('cacheHitRate', 0)) * 100
210
+ events=int(d.get('events', 0))
211
+ print(f'Cache hit: {rate:.1f}% ({read}/{inp} input tokens, write {write}, events {events})')" 2>/dev/null || echo "")
212
+ if [[ -n "$metrics_line" ]]; then
213
+ printf "║ ${CYAN}%-53s${NC} ║\n" "$metrics_line"
214
+ fi
215
+ fi
216
+
162
217
  if [[ $diag_entries -gt 0 ]]; then
163
218
  echo -e "║ ${CYAN}Last reorder:${NC} ║"
164
219
  tail -1 "$CACHE_DIR/diag.log" 2>/dev/null | while IFS= read -r line; do