@martian-engineering/lossless-claw 0.9.4 → 0.11.0
- package/README.md +10 -3
- package/dist/index.js +329 -39
- package/docs/agent-tools.md +6 -6
- package/docs/architecture.md +17 -18
- package/docs/compaction-redesign-map.md +243 -0
- package/docs/configuration.md +40 -27
- package/docs/focus-briefs-implementation-plan.md +240 -0
- package/docs/tui.md +25 -1
- package/doctor-contract-api.d.ts +34 -0
- package/doctor-contract-api.js +349 -0
- package/openclaw.plugin.json +54 -24
- package/package.json +14 -21
- package/skills/lossless-claw/references/config.md +114 -51
- package/skills/lossless-claw/references/recall-tools.md +6 -0
package/docs/agent-tools.md
CHANGED
|
@@ -32,7 +32,7 @@ Summaries are lossy by design. The "Expand for details about:" footer at the end
|
|
|
32
32
|
|
|
33
33
|
Search across messages and/or summaries using regex or full-text search.
|
|
34
34
|
|
|
35
|
-
Use `mode: "full_text"` for keyword or topical recall. Wrap exact multi-word phrases in quotes to preserve phrase matching. Keep the default `sort: "recency"` for recent events, switch to `sort: "relevance"` when looking for the best older match on a topic, and use `sort: "hybrid"` when you want relevance without giving up recency entirely.
|
|
35
|
+
Use `mode: "full_text"` for keyword or topical recall. Full-text queries are not regexes: alternation (`A|B`), regex wildcards (`.*`), character classes (`[abc]`), and anchors (`^foo`, `foo$`) require `mode: "regex"`. Wrap exact multi-word phrases in quotes to preserve phrase matching. Keep the default `sort: "recency"` for recent events, switch to `sort: "relevance"` when looking for the best older match on a topic, and use `sort: "hybrid"` when you want relevance without giving up recency entirely.
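As a rough illustration of the mode choice described above, the sketch below picks `mode` from the pattern's syntax. The `LcmGrepParams` type and the `pickMode` and `buildGrepParams` helpers are hypothetical; they only mirror the documented parameter names and are not part of the package. The detection heuristic is an assumption.

```typescript
// Hypothetical shape mirroring the documented lcm_grep parameters.
type LcmGrepParams = {
  pattern: string;
  mode?: "regex" | "full_text";
  sort?: "recency" | "relevance" | "hybrid";
};

// Assumed heuristic (not package behavior): alternation, a `.*` wildcard,
// a character class, a leading `^`, or a trailing `$` means regex mode.
function pickMode(pattern: string): "regex" | "full_text" {
  const regexSyntax = /\||\.\*|\[[^\]]+\]|^\^|\$$/;
  return regexSyntax.test(pattern) ? "regex" : "full_text";
}

// Builds a parameter object, keeping the documented default sort of "recency".
function buildGrepParams(
  pattern: string,
  sort: LcmGrepParams["sort"] = "recency",
): LcmGrepParams {
  return { pattern, mode: pickMode(pattern), sort };
}
```

With this sketch, `buildGrepParams("error|warning")` selects `mode: "regex"`, while a quoted phrase such as `"database migration"` stays in `full_text` mode.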
|
|
36
36
|
|
|
37
37
|
**Parameters:**
|
|
38
38
|
|
|
@@ -41,7 +41,7 @@ Use `mode: "full_text"` for keyword or topical recall. Wrap exact multi-word phr
|
|
|
41
41
|
| `pattern` | string | ✅ | — | Search pattern |
|
|
42
42
|
| `mode` | string | | `"regex"` | `"regex"` or `"full_text"` |
|
|
43
43
|
| `scope` | string | | `"both"` | `"messages"`, `"summaries"`, or `"both"` |
|
|
44
|
-
| `conversationId` | number | | current | Specific conversation to search |
|
|
44
|
+
| `conversationId` | number | | current session family | Specific physical conversation to search |
|
|
45
45
|
| `allConversations` | boolean | | `false` | Search all conversations |
|
|
46
46
|
| `since` | string | | — | ISO timestamp lower bound |
|
|
47
47
|
| `before` | string | | — | ISO timestamp upper bound |
|
|
@@ -81,7 +81,7 @@ Look up metadata and content for a specific summary or stored file.
|
|
|
81
81
|
| Param | Type | Required | Default | Description |
|
|
82
82
|
|-------|------|----------|---------|-------------|
|
|
83
83
|
| `id` | string | ✅ | — | `sum_xxx` for summaries, `file_xxx` for files |
|
|
84
|
-
| `conversationId` | number | | current | Scope to a specific conversation |
|
|
84
|
+
| `conversationId` | number | | current session family | Scope to a specific physical conversation |
|
|
85
85
|
| `allConversations` | boolean | | `false` | Allow cross-conversation lookups |
|
|
86
86
|
|
|
87
87
|
**Returns for summaries:**
|
|
@@ -124,7 +124,7 @@ When `allConversations: true` is set, `lcm_expand_query` can now synthesize one
|
|
|
124
124
|
| `query` | string | ✅* | — | Text query to find summaries (if no `summaryIds`) |
|
|
125
125
|
| `summaryIds` | string[] | ✅* | — | Specific summary IDs to expand (if no `query`) |
|
|
126
126
|
| `maxTokens` | number | | 2000 | Answer length cap |
|
|
127
|
-
| `conversationId` | number | | current | Scope to a specific conversation |
|
|
127
|
+
| `conversationId` | number | | current session family | Scope to a specific physical conversation |
|
|
128
128
|
| `allConversations` | boolean | | `false` | Search across all conversations |
|
|
129
129
|
|
|
130
130
|
*One of `query` or `summaryIds` is required.
|
|
@@ -177,7 +177,7 @@ Add instructions to your agent's system prompt so it knows when to use LCM tools
|
|
|
177
177
|
## Memory & Context
|
|
178
178
|
|
|
179
179
|
Use LCM tools for recall:
|
|
180
|
-
1. `lcm_grep` — Search all conversations by keyword/regex. Prefer `mode: "full_text"` for topic
|
|
180
|
+
1. `lcm_grep` — Search all conversations by keyword/regex. Prefer `mode: "full_text"` for short topic terms, use `mode: "regex"` for alternation or other regex syntax, quote exact phrases, use `sort: "relevance"` for older-topic lookups, and `sort: "hybrid"` when recency should still matter.
|
|
181
181
|
2. `lcm_describe` — Inspect a specific summary (cheap, no sub-agent)
|
|
182
182
|
3. `lcm_expand_query` — Deep recall with bounded sub-agent expansion
|
|
183
183
|
|
|
@@ -187,7 +187,7 @@ listing something you need, use `lcm_expand_query` to get the full detail.
|
|
|
187
187
|
|
|
188
188
|
### Conversation scoping
|
|
189
189
|
|
|
190
|
-
By default, tools operate on the current conversation. Use `lcm_grep(..., allConversations: true)` when you need broad global discovery. Use `lcm_expand_query(..., allConversations: true)` when you want bounded synthesis across sessions. Use `conversationId` when you already know the exact conversation to inspect or expand.
|
|
190
|
+
By default, tools operate on the current session family: the active conversation plus archived segments that share the same stable session identity. This keeps recall continuous across session rotation and `/reset` replacement rows without widening the search to unrelated sessions. Use `lcm_grep(..., allConversations: true)` when you need broad global discovery. Use `lcm_expand_query(..., allConversations: true)` when you want bounded synthesis across sessions. Use `conversationId` when you already know the exact physical conversation to inspect or expand.
|
|
191
191
|
|
|
192
192
|
### Performance considerations
|
|
193
193
|
|
package/docs/architecture.md
CHANGED
|
@@ -56,7 +56,7 @@ When OpenClaw processes a turn, it calls the context engine's lifecycle hooks:
|
|
|
56
56
|
|
|
57
57
|
1. **bootstrap** — On session start, reconciles the JSONL session file with the LCM database. Imports any messages that exist in the file but not in LCM (crash recovery).
|
|
58
58
|
2. **ingest** / **ingestBatch** — Persists new messages to the database and appends them to context_items.
|
|
59
|
-
3. **afterTurn** — After the model responds, ingests new messages, then evaluates whether
|
|
59
|
+
3. **afterTurn** — After the model responds, ingests new messages, then evaluates whether `contextThreshold` requires compaction.
|
|
60
60
|
|
|
61
61
|
### Leaf compaction
|
|
62
62
|
|
|
@@ -66,9 +66,9 @@ The **leaf pass** converts raw messages into leaf summaries:
|
|
|
66
66
|
2. Cap the chunk at `leafChunkTokens` (default 20k tokens).
|
|
67
67
|
3. Concatenate message content with timestamps.
|
|
68
68
|
4. Resolve the most recent prior summary for continuity (passed as `previous_context` so the LLM avoids repeating known information).
|
|
69
|
-
5. Send to
|
|
70
|
-
6. Normalize
|
|
71
|
-
7. If normalization is empty, log provider/model
|
|
69
|
+
5. Send to OpenClaw's host-owned `runtime.llm.complete` capability with the leaf prompt.
|
|
70
|
+
6. Normalize runtime LLM response text into plain text while preserving provider/model diagnostics from the host result.
|
|
71
|
+
7. If normalization is empty, log provider/model diagnostics and fall back to deterministic truncation.
|
|
72
72
|
8. If the summary is larger than the input (LLM failure), retry with the aggressive prompt. If still too large, fall back to deterministic truncation.
|
|
73
73
|
9. Persist the summary, link to source messages, and replace the message range in context_items.
|
|
74
74
|
|
|
@@ -84,15 +84,16 @@ The **condensed pass** merges summaries at the same depth into a higher-level su
|
|
|
84
84
|
|
|
85
85
|
### Compaction modes
|
|
86
86
|
|
|
87
|
-
**
|
|
88
|
-
- Checks if
|
|
89
|
-
-
|
|
90
|
-
-
|
|
91
|
-
-
|
|
87
|
+
**Automatic threshold sweep (after each turn):**
|
|
88
|
+
- Checks if the assembled context crosses `contextThreshold`
|
|
89
|
+
- Below threshold, does not compact and does not record leaf debt
|
|
90
|
+
- In deferred mode, records one `"threshold"` maintenance row for background, `maintain()`, or pre-assembly execution
|
|
91
|
+
- In inline mode, runs a full sweep before `afterTurn()` completes
|
|
92
92
|
|
|
93
|
-
**Full sweep (manual `/compact
|
|
93
|
+
**Full sweep (threshold, manual `/compact`, or overflow):**
|
|
94
94
|
- Phase 1: Repeatedly runs leaf passes until no more eligible chunks
|
|
95
|
-
- Phase 2:
|
|
95
|
+
- Phase 2: If the summarized prefix is above `summaryPrefixTargetTokens`, repeatedly runs condensation passes starting from the shallowest eligible depth, respecting the preferred `sweepMaxDepth` (`0` for leaf-only, `-1` for unlimited)
|
|
96
|
+
- Pressure phase: If summarized-prefix pressure remains, condensation may go beyond `sweepMaxDepth` using the hard fanout floor
|
|
96
97
|
- Each pass checks for progress; stops if no tokens were saved
|
|
97
98
|
|
|
98
99
|
**Budget-targeted (`compactUntilUnder`):**
|
|
@@ -206,6 +207,8 @@ LCM handles crash recovery through **bootstrap reconciliation**:
|
|
|
206
207
|
2. Compare against the LCM database.
|
|
207
208
|
3. Find the most recent message that exists in both (the "anchor").
|
|
208
209
|
4. Import any messages after the anchor that are in JSONL but not in LCM.
|
|
210
|
+
5. If an existing session key moves to a different transcript file and no anchor exists, treat the new file as a bounded transcript epoch and import its recoverable messages. The same flood cap used for tail reconciliation prevents large unrelated transcripts from being appended automatically.
|
|
211
|
+
6. Advance the bootstrap checkpoint only after an overlap is found or a bounded epoch import succeeds. No-anchor reads that import nothing leave the old checkpoint in place so a later turn can retry.
|
|
209
212
|
|
|
210
213
|
This handles the case where OpenClaw wrote messages to the session file but crashed before LCM could persist them.
|
|
211
214
|
|
|
@@ -213,12 +216,8 @@ This handles the case where OpenClaw wrote messages to the session file but cras
|
|
|
213
216
|
|
|
214
217
|
All mutating operations (ingest, compact) are serialized per-session using a promise queue. This prevents races between concurrent afterTurn/compact calls for the same conversation without blocking operations on different conversations.
|
|
215
218
|
|
|
216
|
-
##
|
|
219
|
+
## Runtime LLM boundary
|
|
217
220
|
|
|
218
|
-
LCM needs
|
|
221
|
+
LCM needs model inference for summarization, but it does not resolve provider credentials, base URLs, or provider transport settings directly. Summarization calls go through OpenClaw's `runtime.llm.complete` capability, which owns model preparation, credential resolution, OAuth refresh, provider dispatch, and usage attribution.
|
|
219
222
|
|
|
220
|
-
|
|
221
|
-
2. **Environment variables** — Standard provider env vars (`ANTHROPIC_API_KEY`, etc.)
|
|
222
|
-
3. **Custom provider key** — From models config (e.g., `models.json`)
|
|
223
|
-
|
|
224
|
-
For OAuth providers (e.g., Anthropic via Claude Max), LCM handles token refresh and credential persistence automatically.
|
|
223
|
+
Configured Lossless summary model overrides (`summaryModel`, `largeFileSummaryModel`, and `fallbackProviders`) are sent as runtime LLM model override requests. OpenClaw enforces those requests with `plugins.entries.lossless-claw.llm.allowModelOverride` and `plugins.entries.lossless-claw.llm.allowedModels`; denied overrides fail closed instead of silently falling back to a different model.
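The fail-closed behavior can be pictured with a minimal sketch. Only the key names `allowModelOverride` and `allowedModels` come from the text; the `LlmOverridePolicy` type, the `authorizeModelOverride` function, and exact string matching are assumptions for illustration, not OpenClaw's implementation.

```typescript
// Hypothetical policy shape; only the two key names come from the docs.
type LlmOverridePolicy = {
  allowModelOverride: boolean;
  allowedModels: string[];
};

// Fail closed: a denied override throws instead of silently falling back
// to a different model.
function authorizeModelOverride(
  requestedModel: string,
  policy: LlmOverridePolicy,
): string {
  if (!policy.allowModelOverride) {
    throw new Error(`model override denied: overrides are disabled (${requestedModel})`);
  }
  // Exact string matching is an assumption; the host may match differently.
  if (policy.allowedModels.indexOf(requestedModel) === -1) {
    throw new Error(`model override denied: ${requestedModel} is not in allowedModels`);
  }
  return requestedModel;
}
```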
|
package/docs/compaction-redesign-map.md
ADDED
|
@@ -0,0 +1,243 @@
|
|
|
1
|
+
# Compaction Redesign Map
|
|
2
|
+
|
|
3
|
+
Status: implementation pass
|
|
4
|
+
Date: 2026-05-14
|
|
5
|
+
Branch: `josh/compaction-redesign`
|
|
6
|
+
|
|
7
|
+
## Goal
|
|
8
|
+
|
|
9
|
+
Lossless should stop trying to infer whether a provider prompt cache is hot or cold before deciding whether to compact. The old cache-aware incremental strategy could not be made sound with the signals available to Lossless: by the time a low cache-read observation arrives, the provider has usually already rewritten the cache for that turn. Without a reliable `expiresAt` signal, Lossless cannot safely tell "cold before this turn" from "cold during this turn, now hot again."
|
|
10
|
+
|
|
11
|
+
The new design is intentionally simpler:
|
|
12
|
+
|
|
13
|
+
1. Do not run automatic incremental compaction from raw-history pressure.
|
|
14
|
+
2. Let context grow in the assembled transcript until the configured threshold is crossed.
|
|
15
|
+
3. When `contextThreshold` is crossed, run the existing full-sweep mechanism.
|
|
16
|
+
4. Keep the fresh tail as the protected boundary for recent verbatim context.
|
|
17
|
+
5. Reuse existing summary sizing and fanout configuration.
|
|
18
|
+
6. Use a summarized-prefix pressure target only as the escape hatch when a preferred-depth sweep does not reduce enough context.
|
|
19
|
+
|
|
20
|
+
## Implemented Decisions
|
|
21
|
+
|
|
22
|
+
| Decision | Outcome |
|
|
23
|
+
| --- | --- |
|
|
24
|
+
| Automatic compaction trigger | `contextThreshold` only. |
|
|
25
|
+
| Raw leaf trigger | Kept as a diagnostic/manual helper; removed from automatic scheduling. |
|
|
26
|
+
| Deferred debt | New automatic debt uses only reason `"threshold"`. |
|
|
27
|
+
| Cache hotness | No longer delays automatic threshold compaction. |
|
|
28
|
+
| Legacy non-threshold debt | Revalidated against threshold, then swept or marked finished as obsolete. |
|
|
29
|
+
| Full sweep trigger | No longer starts only because `evaluateLeafTrigger()` is true. |
|
|
30
|
+
| Full sweep preferred depth | `compactFullSweep()` now respects `sweepMaxDepth` during routine condensation. |
|
|
31
|
+
| Fresh tail | Kept. It remains independent of incremental compaction. |
|
|
32
|
+
| Default leaf chunk size | Kept at 20k tokens. |
|
|
33
|
+
| Deprecated depth key | `incrementalMaxDepth` remains accepted as an alias for `sweepMaxDepth`. |
|
|
34
|
+
| Pressure escape hatch | `summaryPrefixTargetTokens` lets sweeps condense beyond the preferred depth when summarized context remains too large. |
|
|
35
|
+
| `cacheAwareCompaction.*` | Still visible and accepted, but documented as deprecated compatibility config. |
|
|
36
|
+
| `dynamicLeafChunkTokens.*` | Still visible and accepted, but documented as deprecated compatibility config. |
|
|
37
|
+
| Engine `compactLeafAsync()` | Removed. Automatic and public engine compaction should go through threshold/full-sweep paths. |
|
|
38
|
+
| `CompactionEngine.compactLeaf()` | Kept as a lower-level helper and for focused tests. |
|
|
39
|
+
| Stable hot-cache orphan stripping | Removed with the cache-state-dependent assembly behavior. |
|
|
40
|
+
|
|
41
|
+
## Current Lifecycle
|
|
42
|
+
|
|
43
|
+
### Ingestion and After-Turn Scheduling
|
|
44
|
+
|
|
45
|
+
`LcmContextEngine.afterTurn()` now follows one automatic policy:
|
|
46
|
+
|
|
47
|
+
```text
|
|
48
|
+
afterTurn -> ingest messages -> update telemetry -> evaluate contextThreshold
|
|
49
|
+
|
|
50
|
+
if below threshold:
|
|
51
|
+
do not compact
|
|
52
|
+
do not record maintenance debt
|
|
53
|
+
|
|
54
|
+
if threshold is crossed and mode is inline:
|
|
55
|
+
run threshold full sweep inline
|
|
56
|
+
|
|
57
|
+
if threshold is crossed and mode is deferred:
|
|
58
|
+
record one threshold maintenance row
|
|
59
|
+
schedule the background drain
|
|
60
|
+
```
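The same lifecycle can be written as a small decision function. This is a sketch only: the type names and `decideAfterTurn` are hypothetical, the threshold is assumed to be already resolved to a token count, and "crossed" is assumed to mean strictly greater than.

```typescript
// Hypothetical types; only the branch order follows the lifecycle above.
type CompactionMode = "inline" | "deferred";
type AfterTurnAction =
  | { kind: "none" }
  | { kind: "inline-full-sweep" }
  | { kind: "record-debt"; reason: "threshold" };

function decideAfterTurn(
  assembledTokens: number,
  thresholdTokens: number, // assumed: contextThreshold resolved to a token count
  mode: CompactionMode,
): AfterTurnAction {
  // Treating "crossed" as strictly greater-than is an assumption.
  if (assembledTokens <= thresholdTokens) {
    return { kind: "none" }; // no compaction, no maintenance debt
  }
  if (mode === "inline") {
    return { kind: "inline-full-sweep" };
  }
  // Deferred: one coalesced maintenance row; the background drain runs later.
  return { kind: "record-debt", reason: "threshold" };
}
```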
|
|
61
|
+
|
|
62
|
+
The raw-history leaf trigger is no longer part of this lifecycle. `evaluateLeafTrigger()` can still answer "is there enough old raw material for a leaf pass?", but that answer does not cause automatic maintenance.
|
|
63
|
+
|
|
64
|
+
Relevant code:
|
|
65
|
+
|
|
66
|
+
- `src/engine.ts`: `afterTurn()`
|
|
67
|
+
- `src/engine.ts`: `recordDeferredCompactionDebt()`
|
|
68
|
+
- `src/store/compaction-maintenance-store.ts`: one coalesced maintenance row per conversation
|
|
69
|
+
- `src/compaction.ts`: `evaluateLeafTrigger()`
|
|
70
|
+
|
|
71
|
+
### Deferred Debt
|
|
72
|
+
|
|
73
|
+
Deferred maintenance still exists because threshold sweeps can be expensive and should often happen outside the critical response path.
|
|
74
|
+
|
|
75
|
+
New automatic debt should always use:
|
|
76
|
+
|
|
77
|
+
```text
|
|
78
|
+
reason = "threshold"
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
When the debt drains, Lossless calls threshold full sweep via `executeCompactionCore({ compactionTarget: "threshold" })`. Prompt-cache telemetry and TTLs are not consulted. Session queue idleness remains relevant because compaction should not race active session work.
|
|
82
|
+
|
|
83
|
+
Old databases may contain pending non-threshold debt from previous builds. The compatibility behavior is:
|
|
84
|
+
|
|
85
|
+
- re-evaluate `contextThreshold` at consumption time
|
|
86
|
+
- if the conversation is over threshold, run threshold full sweep
|
|
87
|
+
- if it is under threshold, mark the old debt finished with a no-op legacy reason
|
|
88
|
+
|
|
89
|
+
This clears obsolete maintenance rows without deleting persisted conversation data.
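A minimal sketch of this compatibility rule, assuming a hypothetical `DebtRow` shape and outcome labels (neither is taken from the package):

```typescript
// Hypothetical row shape and outcome labels, for illustration only.
type DebtRow = { reason: string };
type DebtOutcome = "threshold-full-sweep" | "finished-legacy-noop";

function consumeDebtRow(
  row: DebtRow,
  isOverThreshold: () => boolean,
): DebtOutcome {
  // New automatic debt always carries reason "threshold" and drains into a sweep.
  if (row.reason === "threshold") {
    return "threshold-full-sweep";
  }
  // Legacy debt is revalidated against contextThreshold at consumption time.
  return isOverThreshold() ? "threshold-full-sweep" : "finished-legacy-noop";
}
```

Either way the maintenance row is resolved, and no persisted conversation data is deleted.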
|
|
90
|
+
|
|
91
|
+
Relevant code:
|
|
92
|
+
|
|
93
|
+
- `src/engine.ts`: `drainDeferredCompactionDebtNow()`
|
|
94
|
+
- `src/engine.ts`: `consumeDeferredCompactionDebt()`
|
|
95
|
+
- `src/engine.ts`: `maintain()`
|
|
96
|
+
- `src/engine.ts`: pre-assembly maintenance drain
|
|
97
|
+
|
|
98
|
+
### Full Sweep
|
|
99
|
+
|
|
100
|
+
`CompactionEngine.compact()` delegates to `compactFullSweep()`.
|
|
101
|
+
|
|
102
|
+
The sweep has two phases:
|
|
103
|
+
|
|
104
|
+
1. Leaf phase: repeatedly summarize the oldest raw chunks outside the fresh tail.
|
|
105
|
+
2. Condensed phase: if summarized-prefix tokens exceed `summaryPrefixTargetTokens`, repeatedly summarize same-depth summary chunks, shallowest first.
|
|
106
|
+
|
|
107
|
+
Routine threshold sweeps use `contextThreshold` to decide when to start compaction. Once started, the leaf phase runs until no eligible raw-message chunk remains outside the fresh tail. Condensation is controlled by `summaryPrefixTargetTokens`, not by total context pressure. Forced sweeps still stop when no eligible chunk remains or when a pass stops making token progress.
|
|
108
|
+
|
|
109
|
+
`sweepMaxDepth` is the preferred source-depth cap for routine full-sweep condensation:
|
|
110
|
+
|
|
111
|
+
- `0`: leaf summaries only
|
|
112
|
+
- `1`: depth-0 summaries may condense into depth 1, then stop
|
|
113
|
+
- `2`: depth 0 -> 1 and depth 1 -> 2 are allowed
|
|
114
|
+
- `-1`: unlimited
|
|
115
|
+
|
|
116
|
+
The cap is intentionally aspirational. If summary tokens outside the fresh tail exceed `summaryPrefixTargetTokens` after routine condensation, Lossless runs a pressure condensation phase that may go deeper using `condensedMinFanoutHard`.
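The depth rules listed above reduce to a simple comparison. The sketch below is an interpretation of that list, not the package's `resolveSweepMaxDepth()`; both function names are hypothetical.

```typescript
// Routine condensation: may summaries at `sourceDepth` be condensed one level up?
// -1 means unlimited; 0 means leaf summaries only (no condensation at all).
function routineCondensationAllowed(
  sourceDepth: number,
  sweepMaxDepth: number,
): boolean {
  if (sweepMaxDepth === -1) return true;
  return sourceDepth < sweepMaxDepth;
}

// Pressure phase: runs only when the summarized prefix is still above target
// after routine condensation. It may then exceed sweepMaxDepth.
function pressurePhaseNeeded(
  summaryPrefixTokens: number,
  summaryPrefixTargetTokens: number,
): boolean {
  return summaryPrefixTokens > summaryPrefixTargetTokens;
}
```

With `sweepMaxDepth: 1`, depth-0 summaries may condense into depth 1, but depth-1 summaries are left alone unless the pressure phase runs.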
|
|
117
|
+
|
|
118
|
+
Relevant code:
|
|
119
|
+
|
|
120
|
+
- `src/compaction.ts`: `compactFullSweep()`
|
|
121
|
+
- `src/compaction.ts`: `selectOldestLeafChunk()`
|
|
122
|
+
- `src/compaction.ts`: `selectShallowestCondensationCandidate()`
|
|
123
|
+
- `src/compaction.ts`: `resolveSweepMaxDepth()`
|
|
124
|
+
- `src/compaction.ts`: `resolveSummaryPrefixTargetTokens()`
|
|
125
|
+
|
|
126
|
+
### Fresh Tail
|
|
127
|
+
|
|
128
|
+
The fresh tail is not incremental compaction. It stays because it protects recent verbatim context and gives both assembly and compaction a stable boundary.
|
|
129
|
+
|
|
130
|
+
The fresh tail:
|
|
131
|
+
|
|
132
|
+
- is always included during assembly
|
|
133
|
+
- is excluded from leaf summarization
|
|
134
|
+
- may be capped by `freshTailMaxTokens`
|
|
135
|
+
- still preserves the newest message even when that one message exceeds the cap
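One way to picture these rules is a newest-to-oldest walk that stops at the count or the token cap, whichever comes first. This is an illustrative sketch, not the package's `resolveFreshTailOrdinal()`; the `Msg` shape and `resolveFreshTail` are hypothetical.

```typescript
// Hypothetical message shape for illustration.
type Msg = { ordinal: number; tokens: number };

// Returns the protected fresh tail in chronological order.
function resolveFreshTail(
  messages: Msg[],
  freshTailCount: number,
  freshTailMaxTokens?: number,
): Msg[] {
  const tail: Msg[] = [];
  let used = 0;
  // Walk from newest to oldest.
  for (const msg of messages.slice().reverse()) {
    if (tail.length >= freshTailCount) break;
    const overCap =
      freshTailMaxTokens !== undefined && used + msg.tokens > freshTailMaxTokens;
    // The newest message is kept even if it alone exceeds the cap.
    if (overCap && tail.length > 0) break;
    tail.unshift(msg);
    used += msg.tokens;
  }
  return tail;
}
```

Everything older than the returned tail is eligible for leaf summarization; the tail itself is always assembled verbatim.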
|
|
136
|
+
|
|
137
|
+
Relevant code:
|
|
138
|
+
|
|
139
|
+
- `src/assembler.ts`: `resolveFreshTailOrdinal()`
|
|
140
|
+
- `src/assembler.ts`: `Assembler.assemble()`
|
|
141
|
+
- `src/compaction.ts`: `resolveFreshTailOrdinal()`
|
|
142
|
+
- `src/compaction.ts`: `countRawTokensOutsideFreshTail()`
|
|
143
|
+
|
|
144
|
+
## Removed Automatic Policy
|
|
145
|
+
|
|
146
|
+
The old `evaluateIncrementalCompaction()` path combined:
|
|
147
|
+
|
|
148
|
+
- prompt-cache telemetry
|
|
149
|
+
- hot/cold/unknown cache-state heuristics
|
|
150
|
+
- cache TTL guesses
|
|
151
|
+
- dynamic leaf chunk sizing
|
|
152
|
+
- raw-history pressure outside the fresh tail
|
|
153
|
+
- bounded cold-cache catch-up
|
|
154
|
+
- hot-cache leaf-only behavior
|
|
155
|
+
- budget-headroom gates
|
|
156
|
+
|
|
157
|
+
That policy is removed from automatic scheduling. The important reason is not that each individual heuristic was unreasonable; it is that the combined decision depended on cache state that Lossless cannot reliably observe at the time it must decide whether to mutate the prompt prefix.
|
|
158
|
+
|
|
159
|
+
## Config Semantics
|
|
160
|
+
|
|
161
|
+
### Active Settings
|
|
162
|
+
|
|
163
|
+
| Key | Role |
|
|
164
|
+
| --- | --- |
|
|
165
|
+
| `contextThreshold` | The only automatic compaction trigger. |
|
|
166
|
+
| `proactiveThresholdCompactionMode` | Chooses inline vs deferred threshold full sweep. |
|
|
167
|
+
| `freshTailCount` | Protects newest raw messages during assembly and compaction. |
|
|
168
|
+
| `freshTailMaxTokens` | Optional cap for protected fresh-tail size. |
|
|
169
|
+
| `leafChunkTokens` | Maximum raw material per leaf summary during sweep; default remains 20k. |
|
|
170
|
+
| `leafMinFanout` | Minimum raw-message or depth-0 summary fanout for useful compaction. |
|
|
171
|
+
| `condensedMinFanout` | Normal same-depth condensation grouping for depth 1+. |
|
|
172
|
+
| `condensedMinFanoutHard` | Hard-trigger/repair condensation grouping. |
|
|
173
|
+
| `sweepMaxDepth` | Preferred source-depth cap for routine threshold full sweep. |
|
|
174
|
+
| `summaryPrefixTargetTokens` | Optional target for summarized-prefix tokens; pressure condensation may go deeper if this target is missed. |
|
|
175
|
+
| `leafTargetTokens` | Leaf summary target. |
|
|
176
|
+
| `condensedTargetTokens` | Condensed summary target. |
|
|
177
|
+
|
|
178
|
+
### Deprecated Compatibility Settings
|
|
179
|
+
|
|
180
|
+
| Key | Status |
|
|
181
|
+
| --- | --- |
|
|
182
|
+
| `incrementalMaxDepth` | Accepted as a deprecated alias for `sweepMaxDepth`. New config should use `sweepMaxDepth`. |
|
|
183
|
+
| `cacheAwareCompaction.*` | Accepted and visible as deprecated config. It no longer changes automatic compaction decisions. |
|
|
184
|
+
| `dynamicLeafChunkTokens.*` | Accepted and visible as deprecated config. Automatic compaction uses `leafChunkTokens` directly. |
|
|
185
|
+
|
|
186
|
+
Keeping these settings visible avoids breaking existing OpenClaw config and gives operators an explicit deprecation signal instead of silently hiding known keys.
|
|
187
|
+
|
|
188
|
+
## Stable Orphan Stripping Tradeoff
|
|
189
|
+
|
|
190
|
+
The old cache-aware assembly path could preserve a stable hot-cache boundary by overriding tool-call orphan stripping at a previously observed ordinal. This was removed with the rest of the cache-state-dependent assembly behavior.
|
|
191
|
+
|
|
192
|
+
Benefits of removal:
|
|
193
|
+
|
|
194
|
+
- the assembled prompt no longer changes based on inferred cache hotness
|
|
195
|
+
- assembly has fewer hidden stateful branches
|
|
196
|
+
- prompt-prefix behavior is easier to reason about and test
|
|
197
|
+
- cache telemetry remains diagnostic instead of controlling prompt mutation
|
|
198
|
+
|
|
199
|
+
Cost of removal:
|
|
200
|
+
|
|
201
|
+
- Lossless gives up one cache-oriented prefix-stability optimization for tool-call boundaries
|
|
202
|
+
- in some hot-cache sessions, ordinary tool-pair repair may alter the prefix sooner than the old stable-boundary override would have
|
|
203
|
+
|
|
204
|
+
The ordinary assembler still sanitizes tool-use/tool-result pairing, so this is a cache-efficiency tradeoff rather than a transcript-correctness tradeoff.
|
|
205
|
+
|
|
206
|
+
## Test Coverage
|
|
207
|
+
|
|
208
|
+
The implementation should cover:
|
|
209
|
+
|
|
210
|
+
- below-threshold turns do not compact and do not record debt
|
|
211
|
+
- threshold crossings record only `"threshold"` debt in deferred mode
|
|
212
|
+
- inline mode runs threshold full sweep rather than leaf-trigger compaction
|
|
213
|
+
- background drain consumes threshold debt without prompt-cache telemetry or TTL
|
|
214
|
+
- `maintain()` consumes threshold debt without prompt-cache delay
|
|
215
|
+
- pre-assembly drain consumes threshold debt without prompt-cache delay
|
|
216
|
+
- legacy non-threshold debt is cleared when threshold no longer applies
|
|
217
|
+
- legacy non-threshold debt is upgraded to threshold full sweep when threshold still applies
|
|
218
|
+
- `compactFullSweep()` treats `sweepMaxDepth` as a preferred depth
|
|
219
|
+
- `compactFullSweep()` pressure-condenses past `sweepMaxDepth` when threshold or summary-prefix pressure remains
|
|
220
|
+
- the fresh tail remains verbatim and un-compacted
|
|
221
|
+
|
|
222
|
+
Removed or rewritten coverage:
|
|
223
|
+
|
|
224
|
+
- hot-cache delay gate tests
|
|
225
|
+
- cold-cache catch-up tests
|
|
226
|
+
- dynamic automatic leaf chunk tests
|
|
227
|
+
- automatic leaf debt tests
|
|
228
|
+
- engine-level `compactLeafAsync()` tests
|
|
229
|
+
- stable hot-cache orphan-stripping tests
|
|
230
|
+
|
|
231
|
+
## Non-Goals
|
|
232
|
+
|
|
233
|
+
- Do not add a total-context target floor in this pass.
|
|
234
|
+
- Do not remove persisted telemetry or maintenance tables.
|
|
235
|
+
- Do not parallelize full-sweep leaf summaries yet. The current leaf prompt uses prior summary continuity, so parallelization would require a separate semantic design.
|
|
236
|
+
- Do not depend on provider cache `expiresAt`.
|
|
237
|
+
- Do not remove accepted deprecated config keys until a separate migration decision is made.
|
|
238
|
+
|
|
239
|
+
## Follow-Up Watch Items
|
|
240
|
+
|
|
241
|
+
1. If repeated threshold re-entry happens in live use, tune `summaryPrefixTargetTokens`, `contextThreshold`, `leafChunkTokens`, and fanout before adding a total-context target floor.
|
|
242
|
+
2. If 20k leaf chunks make threshold sweeps too frequent, consider 30k before adding new mechanisms.
|
|
243
|
+
3. If stable orphan stripping removal causes measurable cache regressions in tool-heavy sessions, revisit it as an assembly feature independent of cache-hotness inference.
|
package/docs/configuration.md
CHANGED
|
@@ -24,12 +24,15 @@ Most installations only need to override a handful of keys. If you want a comple
|
|
|
24
24
|
"freshTailCount": 64,
|
|
25
25
|
"freshTailMaxTokens": 24000,
|
|
26
26
|
"promptAwareEviction": false,
|
|
27
|
+
"stubLargeToolPayloads": false,
|
|
27
28
|
"newSessionRetainDepth": 2,
|
|
28
29
|
"leafMinFanout": 8,
|
|
29
30
|
"condensedMinFanout": 4,
|
|
30
31
|
"condensedMinFanoutHard": 2,
|
|
32
|
+
"sweepMaxDepth": 1,
|
|
31
33
|
"incrementalMaxDepth": 1,
|
|
32
34
|
"leafChunkTokens": 20000,
|
|
35
|
+
"summaryPrefixTargetTokens": 20000,
|
|
33
36
|
"bootstrapMaxTokens": 6000,
|
|
34
37
|
"leafTargetTokens": 2400,
|
|
35
38
|
"condensedTargetTokens": 2000,
|
|
@@ -55,6 +58,7 @@ Most installations only need to override a handful of keys. If you want a comple
|
|
|
55
58
|
"proactiveThresholdCompactionMode": "deferred",
|
|
56
59
|
"autoRotateSessionFiles": {
|
|
57
60
|
"enabled": true,
|
|
61
|
+
"createBackups": false,
|
|
58
62
|
"sizeBytes": 2097152,
|
|
59
63
|
"startup": "rotate",
|
|
60
64
|
"runtime": "rotate"
|
|
@@ -66,7 +70,7 @@ Most installations only need to override a handful of keys. If you want a comple
|
|
|
66
70
|
"hotCachePressureFactor": 4,
|
|
67
71
|
"hotCacheBudgetHeadroomRatio": 0.2,
|
|
68
72
|
"coldCacheObservationThreshold": 3,
|
|
69
|
-
"criticalBudgetPressureRatio": 0.
|
|
73
|
+
"criticalBudgetPressureRatio": 0.90
|
|
70
74
|
},
|
|
71
75
|
"dynamicLeafChunkTokens": {
|
|
72
76
|
"enabled": true,
|
|
@@ -82,6 +86,7 @@ Notes on the example:
|
|
|
82
86
|
- `largeFilesDir` shows the expanded default path shape. Both `databasePath` and `largeFilesDir` default to paths under `OPENCLAW_STATE_DIR` (which in turn falls back to `~/.openclaw`).
|
|
83
87
|
- `timezone` has no fixed hardcoded default; at runtime it resolves from `TZ` first, then the system timezone. The example uses `America/Los_Angeles`.
|
|
84
88
|
- `maxAssemblyTokenBudget` has no default. The example uses `30000` as a realistic cap for a 32k-class model.
|
|
89
|
+
- `summaryPrefixTargetTokens` has no fixed default. The example uses `20000`, which matches the derived default for large-context models with the default `leafChunkTokens`.
|
|
85
90
|
- `databasePath` is the preferred key. `dbPath` is an accepted alias.
|
|
86
91
|
- `largeFileThresholdTokens` is the preferred key. `largeFileTokenThreshold` is an accepted alias.
|
|
87
92
|
|
|
@@ -124,13 +129,14 @@ openclaw plugins install --link /path/to/lossless-claw
|
|
|
124
129
|
| `transcriptGcEnabled` | `boolean` | `false` | `LCM_TRANSCRIPT_GC_ENABLED` | Enables transcript rewrite GC during `maintain()`; disabled by default so transcript rewrites stay opt-in. |
|
|
125
130
|
| `proactiveThresholdCompactionMode` | `"deferred" \| "inline"` | `"deferred"` | `LCM_PROACTIVE_THRESHOLD_COMPACTION_MODE` | Controls whether proactive threshold compaction is deferred into maintenance debt by default or run inline for legacy behavior. |
|
|
126
131
|
| `autoRotateSessionFiles.enabled` | `boolean` | `true` | `LCM_AUTO_ROTATE_SESSION_FILES_ENABLED` | Enables automatic rotation for oversized LCM-managed session JSONL files. |
|
|
132
|
+
| `autoRotateSessionFiles.createBackups` | `boolean` | `false` | `LCM_AUTO_ROTATE_SESSION_FILES_CREATE_BACKUPS` | Creates or replaces the rolling `rotate-latest` SQLite backup before automatic session-file rotation. Manual `/lcm rotate` backups are always created. |
|
|
127
133
|
| `autoRotateSessionFiles.sizeBytes` | `integer` | `2097152` | `LCM_AUTO_ROTATE_SESSION_FILES_SIZE_BYTES` | Byte threshold that triggers automatic session-file rotation. |
|
|
128
134
|
| `autoRotateSessionFiles.startup` | `"rotate" \| "warn" \| "off"` | `"rotate"` | `LCM_AUTO_ROTATE_SESSION_FILES_STARTUP` | Startup behavior for oversized indexed OpenClaw session transcripts that also have active LCM bootstrap state. |
|
|
129
135
|
| `autoRotateSessionFiles.runtime` | `"rotate" \| "warn" \| "off"` | `"rotate"` | `LCM_AUTO_ROTATE_SESSION_FILES_RUNTIME` | Runtime behavior after `afterTurn()` and `maintain()` check the current transcript size. |
|
|
130
136
|
|
|
131
137
|
> **Multi-profile note:** `OPENCLAW_STATE_DIR` (set by the host OpenClaw gateway) controls where state is stored. When two gateways run on the same host (e.g. separate bot personas), each gateway sets its own `OPENCLAW_STATE_DIR` and lossless-claw automatically uses that directory for the database, large-file payloads, auth-profile lookups, and legacy secrets — no per-profile plugin config is needed.
|
|
132
138
|
|
|
133
|
-
Automatic session-file rotation
|
|
139
|
+
Automatic session-file rotation rewrites only the live session transcript, keeps the active LCM conversation and durable history intact, and refreshes the bootstrap checkpoint. Startup rotation first scans OpenClaw's current indexed session stores for configured agents, then intersects those candidates with active LCM conversations and matching bootstrap file mappings. Automatic rotation does not create a SQLite backup by default; set `autoRotateSessionFiles.createBackups` to `true` to make runtime rotation replace the rolling `rotate-latest` backup and to make startup rotation create one pre-rotation LCM database backup for the batch before any transcript is rewritten. Manual `/lcm rotate` always keeps its backup-backed behavior regardless of this flag. Rotation never runs for ignored sessions, stateless sessions, or sessions without active LCM state. The preserved JSONL tail follows the existing rotate behavior, which is controlled by `freshTailCount`.
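The backup rules in this paragraph can be summarized as a small lookup. The sketch below is illustrative only: the `RotationTrigger` and `BackupKind` labels and the `backupForRotation` function are hypothetical names, not package API.

```typescript
// Hypothetical labels for the three rotation paths and their backup behavior.
type RotationTrigger = "manual" | "startup" | "runtime";
type BackupKind =
  | "manual-backup"         // manual `/lcm rotate` always creates a backup
  | "startup-batch"         // one pre-rotation backup for the whole startup batch
  | "rolling-rotate-latest" // runtime rotation replaces the rolling backup
  | "none";

function backupForRotation(
  trigger: RotationTrigger,
  createBackups: boolean = false, // documented default for autoRotateSessionFiles.createBackups
): BackupKind {
  if (trigger === "manual") return "manual-backup"; // ignores the flag
  if (!createBackups) return "none";
  return trigger === "startup" ? "startup-batch" : "rolling-rotate-latest";
}
```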
|
|
134
140
|
|
|
135
141
|
Every automatic decision emits grep-able log lines prefixed with `[lcm] auto-rotate:`. Startup emits one compact summary line with `phase=startup`, `action=summary`, `scanned`, `eligible`, `rotated`, `warned`, `skipped`, `durationMs`, `bytesRemoved`, and backup fields when a batch backup was created; quiet skips such as missing files, missing bootstrap mappings, and below-threshold files are counted there instead of producing one line per candidate. Rotation detail lines include `phase`, `action`, `sessionId`, `sessionKey`, `sessionFile`, `sizeBytes`, `thresholdBytes`, `durationMs`, `backupPath`, `bytesRemoved`, `preservedTailMessageCount`, and `checkpointSize`; real warning lines include the same available context plus `reason` or `error`.
|
|
136
142
|
|
|
@@ -142,11 +148,14 @@ Every automatic decision emits grep-able log lines prefixed with `[lcm] auto-rot
|
|
|
142
148
|
| `freshTailCount` | `integer` | `64` | `LCM_FRESH_TAIL_COUNT` | Number of newest messages always kept raw. |
|
|
143
149
|
| `freshTailMaxTokens` | `integer` | unset | `LCM_FRESH_TAIL_MAX_TOKENS` | Optional token cap for the protected fresh tail. The newest message is always preserved even if it exceeds the cap. |
|
|
144
150
|
| `promptAwareEviction` | `boolean` | `false` | `LCM_PROMPT_AWARE_EVICTION_ENABLED` | When enabled, budget-constrained assembly keeps older evictable items by prompt relevance instead of pure chronology. This improves retrieval under tight budgets, but it can reduce prompt-cache hit rates because the preserved prefix changes as prompts change. |
|
|
151
|
+
| `stubLargeToolPayloads` | `boolean` | `false` | `LCM_STUB_LARGE_TOOL_PAYLOADS` | When enabled, evictable tool-result rows backfilled with `messages.large_content` are assembled as `[LCM Tool Output: file_xxx ...]` stubs while the fresh tail stays inline. Requires `scripts/lcm-blob-migrate.mjs`, which defaults to the same large-files root as runtime LCM (`LCM_LARGE_FILES_DIR` or `${OPENCLAW_STATE_DIR}/lcm-files`). |
|
|
145
152
|
| `leafMinFanout` | `integer` | `8` | `LCM_LEAF_MIN_FANOUT` | Minimum number of raw messages required before a leaf pass runs. |
|
|
146
153
|
| `condensedMinFanout` | `integer` | `4` | `LCM_CONDENSED_MIN_FANOUT` | Number of same-depth summaries needed before condensation is attempted. |
|
|
147
154
|
| `condensedMinFanoutHard` | `integer` | `2` | `LCM_CONDENSED_MIN_FANOUT_HARD` | Hard floor for condensation grouping during maintenance and repair flows. |
|
|
148
|
-
| `
|
|
149
|
-
| `
|
|
155
|
+
| `sweepMaxDepth` | `integer` | `1` | `LCM_SWEEP_MAX_DEPTH` | Preferred maximum condensation source depth during routine threshold sweeps. Use `0` for leaf-only and `-1` for unlimited depth. Pressure sweeps may go deeper when summarized context remains above target. |
|
|
156
|
+
| `incrementalMaxDepth` | `integer` | alias of `sweepMaxDepth` | `LCM_INCREMENTAL_MAX_DEPTH` | Deprecated alias for `sweepMaxDepth`. Kept so existing configs continue to load. |
|
|
157
|
+
| `leafChunkTokens` | `integer` | `20000` | `LCM_LEAF_CHUNK_TOKENS` | Maximum source-token budget for a leaf compaction chunk. Larger chunks reduce sweep frequency at the cost of slower individual summary calls. |
|
|
158
|
+
| `summaryPrefixTargetTokens` | `integer` | derived | `LCM_SUMMARY_PREFIX_TARGET_TOKENS` | Optional target for summarized-prefix tokens after a full sweep. If unset, Lossless derives `max(condensedTargetTokens, min(leafChunkTokens, floor(contextThreshold * tokenBudget * 0.5)))`. |
|
|
150
159
|
| `bootstrapMaxTokens` | `integer` | `max(6000, floor(leafChunkTokens * 0.3))` | `LCM_BOOTSTRAP_MAX_TOKENS` | Maximum parent-history tokens imported when a new LCM conversation bootstraps. |
|
|
151
160
|
| `leafTargetTokens` | `integer` | `2400` | `LCM_LEAF_TARGET_TOKENS` | Prompt target for leaf summary size. |
|
|
152
161
|
| `condensedTargetTokens` | `integer` | `2000` | `LCM_CONDENSED_TARGET_TOKENS` | Prompt target for condensed summary size. |
|
|
@@ -170,6 +179,8 @@ Every automatic decision emits grep-able log lines prefixed with `[lcm] auto-rot
|
|
|
170
179
|
| `summaryTimeoutMs` | `integer` | `60000` | `LCM_SUMMARY_TIMEOUT_MS` | Maximum time to wait for one model-backed summarizer call. |
|
|
171
180
|
| `customInstructions` | `string` | `""` | `LCM_CUSTOM_INSTRUCTIONS` | Extra natural-language instructions injected into every summarization prompt. |
|
|
172
181
|
|
|
182
|
+
Summary calls are executed through OpenClaw's `api.runtime.llm.complete` capability. If you configure an explicit Lossless summary model (`summaryModel`, `largeFileSummaryModel`, or `fallbackProviders`), OpenClaw must allow that runtime LLM override under `plugins.entries.lossless-claw.llm.allowModelOverride` and `plugins.entries.lossless-claw.llm.allowedModels`. `openclaw doctor --fix` can add the minimal policy entries for configured Lossless summary models. Delegated expansion calls use OpenClaw's runtime sub-agent layer; explicit `expansionModel` values require `plugins.entries.lossless-claw.subagent.allowModelOverride` and a matching `subagent.allowedModels` entry, or `"*"` if you intentionally trust any expansion target. `openclaw doctor --fix` can add the minimal subagent policy, and `lcm_expand_query` retries once without the override if the host rejects it.
|
|
183
|
+
|
|
173
184
|
### Fallbacks, circuit breaking, and safety rails
|
|
174
185
|
|
|
175
186
|
| Key | Type | Default | Env override | Purpose |
|
|
@@ -184,32 +195,33 @@ Every automatic decision emits grep-able log lines prefixed with `[lcm] auto-rot
|
|
|
184
195
|
|
|
185
196
|
| Key | Type | Default | Env override | Purpose |
|
|
186
197
|
| --- | --- | --- | --- | --- |
|
|
187
|
-
| `cacheAwareCompaction.enabled` | `boolean` | `true` | `LCM_CACHE_AWARE_COMPACTION_ENABLED` |
|
|
188
|
-
| `cacheAwareCompaction.cacheTTLSeconds` | `integer` | `300` | `LCM_CACHE_TTL_SECONDS` |
|
|
189
|
-
| `cacheAwareCompaction.maxColdCacheCatchupPasses` | `integer` | `2` | `LCM_MAX_COLD_CACHE_CATCHUP_PASSES` |
|
|
190
|
-
| `cacheAwareCompaction.hotCachePressureFactor` | `number` | `4` | `LCM_HOT_CACHE_PRESSURE_FACTOR` |
|
|
191
|
-
| `cacheAwareCompaction.hotCacheBudgetHeadroomRatio` | `number` | `0.2` | `LCM_HOT_CACHE_BUDGET_HEADROOM_RATIO` |
|
|
192
|
-
| `cacheAwareCompaction.coldCacheObservationThreshold` | `integer` | `3` | `LCM_COLD_CACHE_OBSERVATION_THRESHOLD` |
|
|
193
|
-
| `cacheAwareCompaction.criticalBudgetPressureRatio` | `number` | `0.
|
|
198
|
+
| `cacheAwareCompaction.enabled` | `boolean` | `true` | `LCM_CACHE_AWARE_COMPACTION_ENABLED` | Deprecated. Accepted for config compatibility but no longer used for automatic compaction decisions. |
|
|
199
|
+
| `cacheAwareCompaction.cacheTTLSeconds` | `integer` | `300` | `LCM_CACHE_TTL_SECONDS` | Deprecated. Accepted for config compatibility; threshold debt no longer waits for cache TTL. |
|
|
200
|
+
| `cacheAwareCompaction.maxColdCacheCatchupPasses` | `integer` | `2` | `LCM_MAX_COLD_CACHE_CATCHUP_PASSES` | Deprecated. Automatic cold-cache catch-up passes were removed. |
|
|
201
|
+
| `cacheAwareCompaction.hotCachePressureFactor` | `number` | `4` | `LCM_HOT_CACHE_PRESSURE_FACTOR` | Deprecated. Hot-cache raw-history pressure no longer drives automatic compaction. |
|
|
202
|
+
| `cacheAwareCompaction.hotCacheBudgetHeadroomRatio` | `number` | `0.2` | `LCM_HOT_CACHE_BUDGET_HEADROOM_RATIO` | Deprecated. Hot-cache budget headroom no longer defers automatic threshold compaction. |
|
|
203
|
+
| `cacheAwareCompaction.coldCacheObservationThreshold` | `integer` | `3` | `LCM_COLD_CACHE_OBSERVATION_THRESHOLD` | Deprecated. Cold-cache streaks remain observable telemetry only. |
|
|
204
|
+
| `cacheAwareCompaction.criticalBudgetPressureRatio` | `number` | `0.90` | `LCM_CRITICAL_BUDGET_PRESSURE_RATIO` | Deprecated. `contextThreshold` is the only automatic compaction threshold. |
|
|
194
205
|
|
|
195
206
|
#### `dynamicLeafChunkTokens`
|
|
196
207
|
|
|
197
208
|
| Key | Type | Default | Env override | Purpose |
|
|
198
209
|
| --- | --- | --- | --- | --- |
|
|
199
|
-
| `dynamicLeafChunkTokens.enabled` | `boolean` | `true` | `LCM_DYNAMIC_LEAF_CHUNK_TOKENS_ENABLED` |
|
|
200
|
-
| `dynamicLeafChunkTokens.max` | `integer` | `max(leafChunkTokens, floor(leafChunkTokens * 2))` | `LCM_DYNAMIC_LEAF_CHUNK_TOKENS_MAX` |
|
|
210
|
+
| `dynamicLeafChunkTokens.enabled` | `boolean` | `true` | `LCM_DYNAMIC_LEAF_CHUNK_TOKENS_ENABLED` | Deprecated. Accepted for config compatibility but no longer used by automatic compaction. |
|
|
211
|
+
| `dynamicLeafChunkTokens.max` | `integer` | `max(leafChunkTokens, floor(leafChunkTokens * 2))` | `LCM_DYNAMIC_LEAF_CHUNK_TOKENS_MAX` | Deprecated. With the default `leafChunkTokens=20000`, this resolves to `40000`, but automatic compaction uses `leafChunkTokens`. |
|
|
212
|
+
|
|
213
|
+
### Threshold full-sweep compaction
|
|
201
214
|
|
|
202
|
-
|
|
215
|
+
Automatic compaction is threshold-only:
|
|
203
216
|
|
|
204
|
-
|
|
217
|
+
- `afterTurn()` evaluates `contextThreshold` against the active token budget
|
|
218
|
+
- below threshold, no automatic compaction runs and no leaf debt is recorded
|
|
219
|
+
- at or above threshold, inline mode runs a threshold full sweep immediately
|
|
220
|
+
- deferred mode records one coalesced `"threshold"` maintenance row and drains it in the background, `maintain()`, or pre-assembly
|
|
205
221
|
|
|
206
|
-
-
|
|
207
|
-
- hot cache skips incremental maintenance entirely when the assembled context is still comfortably below the real token budget
|
|
208
|
-
- hot cache also gets a short hysteresis window so one ambiguous turn does not immediately discard a recently healthy cache signal
|
|
209
|
-
- cold cache still allows bounded catch-up passes via `cacheAwareCompaction.maxColdCacheCatchupPasses`
|
|
210
|
-
- once `currentTokenCount >= criticalBudgetPressureRatio * tokenBudget`, deferred compaction bypasses hot-cache delay so prompt-mutating debt can run before emergency overflow handling
|
|
222
|
+
Lossless still records prompt-cache telemetry for status and diagnostics, but cache hotness no longer delays threshold debt. Legacy `cacheAwareCompaction.*` and `dynamicLeafChunkTokens.*` settings remain accepted so existing OpenClaw config continues to load, but they do not change automatic compaction behavior.
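
The threshold-only rules above can be sketched as a small decision function. This is an illustrative sketch, not the plugin's actual code: `decideThresholdAction`, `ThresholdAction`, and `CompactionMode` are hypothetical names, and it assumes `contextThreshold` is a fraction of the token budget (as the `contextThreshold * tokenBudget` term in the `summaryPrefixTargetTokens` default suggests). Only `contextThreshold`, the `inline`/`deferred` modes, and the `"threshold"` debt reason come from the documentation.

```typescript
// Hypothetical types: they model the documented outcomes, not real plugin APIs.
type CompactionMode = "inline" | "deferred";

type ThresholdAction =
  | { kind: "none" } // below threshold: no compaction, no debt recorded
  | { kind: "sweep-now" } // inline mode: run a threshold full sweep immediately
  | { kind: "record-debt"; reason: "threshold" }; // deferred mode: one coalesced row

function decideThresholdAction(
  currentTokens: number,
  tokenBudget: number,
  contextThreshold: number, // assumed to be a fraction of tokenBudget
  mode: CompactionMode,
): ThresholdAction {
  const limit = contextThreshold * tokenBudget;
  if (currentTokens < limit) {
    return { kind: "none" };
  }
  // At or above threshold. Cache hotness is deliberately not consulted here.
  return mode === "inline"
    ? { kind: "sweep-now" }
    : { kind: "record-debt", reason: "threshold" };
}
```

Note that no `cacheAwareCompaction.*` or `dynamicLeafChunkTokens.*` value appears as an input, which matches the statement that those settings no longer change automatic compaction behavior.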
|
|
211
223
|
|
|
212
|
-
|
|
224
|
+
Full sweeps first run leaf passes until there are no more eligible raw-message chunks outside the fresh tail. Condensation is then driven by summarized-prefix pressure: the routine condensation phase obeys `sweepMaxDepth`, and if the summarized prefix still exceeds `summaryPrefixTargetTokens`, a pressure phase may use `condensedMinFanoutHard` and condense deeper. Total context pressure starts the sweep, but does not by itself force deeper condensation once the raw prefix has been summarized.
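
As a rough illustration of how the pressure phase is gated, the sketch below computes the documented default for `summaryPrefixTargetTokens` and compares the summarized prefix against it. The function names and the `SweepConfig` shape are hypothetical; the formula is the one given in the `summaryPrefixTargetTokens` row, and the `contextThreshold` values used in the usage note are example inputs, not documented defaults.

```typescript
// Hypothetical config shape; field names mirror the documented keys.
interface SweepConfig {
  condensedTargetTokens: number;
  leafChunkTokens: number;
  contextThreshold: number; // assumed to be a fraction of tokenBudget
  summaryPrefixTargetTokens?: number; // explicit override, if set
}

// Documented default:
// max(condensedTargetTokens, min(leafChunkTokens, floor(contextThreshold * tokenBudget * 0.5)))
function resolveSummaryPrefixTarget(cfg: SweepConfig, tokenBudget: number): number {
  if (cfg.summaryPrefixTargetTokens !== undefined) {
    return cfg.summaryPrefixTargetTokens;
  }
  const halfThreshold = Math.floor(cfg.contextThreshold * tokenBudget * 0.5);
  return Math.max(cfg.condensedTargetTokens, Math.min(cfg.leafChunkTokens, halfThreshold));
}

// The pressure phase is considered only while the summarized prefix still
// exceeds the target after the routine, depth-limited condensation phase.
function needsPressurePhase(summarizedPrefixTokens: number, target: number): boolean {
  return summarizedPrefixTokens > target;
}
```

With the documented defaults `condensedTargetTokens=2000` and `leafChunkTokens=20000`, and an example `contextThreshold` of `0.75`, a 200000-token budget yields a target of `20000` (capped by `leafChunkTokens`), while a 4000-token budget yields `2000` (floored by `condensedTargetTokens`).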
|
|
213
225
|
|
|
214
226
|
### Prompt-aware eviction
|
|
215
227
|
|
|
@@ -235,12 +247,12 @@ Compaction summarization resolves candidates in this order:
|
|
|
235
247
|
1. `LCM_SUMMARY_MODEL` and `LCM_SUMMARY_PROVIDER`
|
|
236
248
|
2. `plugins.entries.lossless-claw.config.summaryModel` and `summaryProvider`
|
|
237
249
|
3. OpenClaw's default compaction model
|
|
238
|
-
4.
|
|
250
|
+
4. Runtime/session provider and model hints from OpenClaw
|
|
239
251
|
5. `fallbackProviders`
|
|
240
252
|
|
|
241
253
|
If `summaryModel` already contains a provider prefix such as `anthropic/claude-sonnet-4-20250514`, `summaryProvider` is ignored for that candidate.
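
The provider-prefix rule can be sketched as follows. This is a minimal illustration under stated assumptions: `resolveSummaryTarget` and `SummaryTarget` are hypothetical names, and splitting on the first `/` is an assumption about how the prefix is detected, not confirmed behavior.

```typescript
// Hypothetical result shape for one resolved summary candidate.
interface SummaryTarget {
  provider: string | undefined;
  model: string;
}

function resolveSummaryTarget(summaryModel: string, summaryProvider?: string): SummaryTarget {
  // Assumption: a provider prefix is everything before the first "/".
  const slash = summaryModel.indexOf("/");
  if (slash > 0) {
    // Prefixed model: the separately configured summaryProvider is ignored.
    return {
      provider: summaryModel.slice(0, slash),
      model: summaryModel.slice(slash + 1),
    };
  }
  // Bare model name: fall back to the configured provider, if any.
  return { provider: summaryProvider, model: summaryModel };
}
```

For example, `resolveSummaryTarget("anthropic/claude-sonnet-4-20250514", "openai")` resolves to provider `anthropic`, and the `openai` value is ignored for that candidate.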
|
|
242
254
|
|
|
243
|
-
|
|
255
|
+
Lossless does not resolve provider credentials directly for compaction summaries. OpenClaw's runtime LLM layer owns provider/model preparation, auth profiles, OAuth refresh, base URLs, and dispatch. Lossless only selects the requested summary target and passes it to the host runtime, where model override policy is enforced.
|
|
244
256
|
|
|
245
257
|
A practical starting point for cost-sensitive setups is:
|
|
246
258
|
|
|
@@ -285,11 +297,12 @@ This keeps long-term history available while still giving users a real clean-sla
|
|
|
285
297
|
Lossless-claw now defaults `proactiveThresholdCompactionMode` to `deferred`.
|
|
286
298
|
|
|
287
299
|
- deferred mode records a single coalesced maintenance debt row per conversation
|
|
288
|
-
- deferred
|
|
289
|
-
- `maintain()`
|
|
290
|
-
- `assemble()` consumes
|
|
300
|
+
- new deferred compaction debt is only created for `contextThreshold` pressure and uses reason `"threshold"`
|
|
301
|
+
- `maintain()` consumes threshold debt when the host explicitly opts in to deferred execution
|
|
302
|
+
- `assemble()` consumes pending threshold debt before building the next prompt
|
|
303
|
+
- old non-threshold debt from earlier builds is revalidated; if the conversation is no longer over threshold, it is cleared as a no-op
|
|
291
304
|
- `/lcm status` / `/lossless status` shows the current maintenance state, including pending/running/last-failure details
|
|
292
|
-
- status output also surfaces the latest API/cache telemetry
|
|
305
|
+
- status output also surfaces the latest API/cache telemetry as diagnostics, not as a deferral gate
|
|
293
306
|
- set `proactiveThresholdCompactionMode` to `inline` only if you need the legacy inline proactive compaction behavior for compatibility
|
|
294
307
|
|
|
295
308
|
### `/lcm rotate`
|