npm - claude-code-cache-fix - Versions diffs - 2.0.6 → 3.0.0 - Mend

claude-code-cache-fix 2.0.6 → 3.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/bin/claude-via-proxy.mjs +113 -0
package/package.json +8 -3
package/proxy/config.mjs +23 -0
package/proxy/extensions/cache-control-normalize.mjs +59 -0
package/proxy/extensions/cache-telemetry.mjs +24 -0
package/proxy/extensions/fingerprint-strip.mjs +105 -0
package/proxy/extensions/fresh-session-sort.mjs +188 -0
package/proxy/extensions/identity-normalization.mjs +129 -0
package/proxy/extensions/request-log.mjs +35 -0
package/proxy/extensions/sort-stabilization.mjs +62 -0
package/proxy/extensions/ttl-management.mjs +49 -0
package/proxy/extensions.json +10 -0
package/proxy/pipeline.mjs +96 -0
package/proxy/server.mjs +168 -0
package/proxy/stream.mjs +110 -0
package/proxy/upstream.mjs +93 -0
package/proxy/watcher.mjs +42 -0
package/tools/MANUAL-COMPACT.md +41 -2

package/tools/MANUAL-COMPACT.md CHANGED

@@ -79,6 +79,32 @@ was max(dualpol_lr, hail_lr) for correlation grouping." > /tmp/context.txt
 The user context is injected into the summarization prompt, ensuring those details appear in the output.
+### Pre-Clear Agent Review (Recommended)
+Before `/clear`, let the agent review the summary while it still has full context. Paste this prompt into the session:
+```
+I'm about to /clear this session. Read /tmp/<session-id>-compact-summary.txt — that's the summary that will be used to restore context after the clear.
+Review it against your current knowledge and do the following:
+1. Write a SESSION_STATE.md in this project directory that captures anything the summary missed — especially:
+   - Active work state details the summary got wrong or understated
+   - Decisions made and their rationale that aren't in the summary
+   - Context about collaborators, dependencies, or constraints
+   - Anything you'd need to know to resume work that isn't recoverable from git
+2. Write any critical findings to memory files (if your project uses them) that should persist across sessions.
+3. Tell me what's missing from the summary so I can verify the gap is covered.
+Do NOT do a /clear yourself. I will do it after you've finished writing.
+```
+Replace `/tmp/<session-id>-compact-summary.txt` with the actual path shown in the script output.
+The agent will identify gaps while it still has the context to fill them. This typically raises summary fidelity from ~85% to ~95%+.
 ### Restoring Context After /clear
 In the CC session:
@@ -90,7 +116,7 @@ In the CC session:
 Then as your first message:
 ```
-Read /tmp/<session-id>-compact-summary.txt for context on where we left off.
+Read /tmp/<session-id>-compact-summary.txt for context on where we left off. Also read SESSION_STATE.md in this directory for additional context the summary may have missed.
 ```
 ## Limitations
@@ -114,7 +140,20 @@ Use the user context file to fill known gaps.
 ### Token cost
-The summarization call costs tokens against your Q5h quota. At ~50K extract tokens through Sonnet, expect ~1-2% Q5h per compaction. This is comparable to what `/compact` costs.
+Two costs to account for:
+1. **Summarization call** — the `claude --print` call through Sonnet. At ~50K extract tokens, expect ~1-2% Q5h.
+2. **Cold start after /clear** — the first API call rebuilds the full cache from scratch. Real-world example from a 954K-token session:
+```
+Before /clear:  cache_read=954,399  cache_creation=0      (warm)
+First call:     cache_read=0        cache_creation=954,399 (cold rebuild)
+Second call:    cache_read=957,253  cache_creation=5,569   (warm again)
+```
+The cold rebuild consumed ~15% Q5h in one call on our Max 5x account. After that single rebuild, the session is warm again and cache hits resume at 99%+.
+**Total cost of a manual compact cycle:** ~17% Q5h (2% summarization + 15% cold rebuild). Compare to hitting the 1M wall and losing the session entirely.
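The figures in the token cost hunk above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the package: the helper name `cache_hit_rate` is hypothetical, and the hit-rate definition (`cache_read / (cache_read + cache_creation)`) is an assumption that ignores any uncached input tokens, which the log excerpt does not show.

```python
# Usage figures copied from the log excerpt in the diff above.
calls = {
    "before_clear": {"cache_read": 954_399, "cache_creation": 0},
    "first_call": {"cache_read": 0, "cache_creation": 954_399},
    "second_call": {"cache_read": 957_253, "cache_creation": 5_569},
}


def cache_hit_rate(usage):
    """Share of cache-related tokens that were read rather than (re)written.

    Assumed definition: cache_read / (cache_read + cache_creation).
    """
    total = usage["cache_read"] + usage["cache_creation"]
    return usage["cache_read"] / total if total else 0.0


for name, usage in calls.items():
    print(f"{name}: {cache_hit_rate(usage):.2%}")

# Quota shares as stated in the text, in whole percent of Q5h
# (2 is the upper end of the quoted 1-2% summarization estimate).
SUMMARIZATION_Q5H_PCT = 2
COLD_REBUILD_Q5H_PCT = 15
total_q5h_pct = SUMMARIZATION_Q5H_PCT + COLD_REBUILD_Q5H_PCT
print(f"total manual compact cycle: ~{total_q5h_pct}% Q5h")
```

Under that definition the second call comes out at roughly 99.4%, which matches the "99%+" wording, and the two quota shares sum to the stated ~17%.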
 ### Requires Claude Sonnet access