@xdarkicex/openclaw-memory-libravdb 1.4.3 → 1.4.4
- package/README.md +76 -16
- package/docs/README.md +3 -12
- package/docs/architecture.md +68 -153
- package/docs/contributing.md +1 -2
- package/openclaw.plugin.json +64 -1
- package/package.json +2 -2
- package/src/cli.ts +34 -0
- package/src/comparison-experiments.ts +128 -0
- package/src/context-engine.ts +276 -62
- package/src/dream-promotion.ts +492 -0
- package/src/dream-routing.ts +40 -0
- package/src/index.ts +16 -1
- package/src/markdown-hash.ts +104 -0
- package/src/markdown-ingest.ts +627 -0
- package/src/memory-runtime.ts +32 -9
- package/src/scoring.ts +6 -3
- package/src/temporal.ts +657 -80
- package/src/types.ts +48 -0
- package/docs/ast-v2.md +0 -167
- package/docs/ast.md +0 -70
- package/docs/compaction-evaluation.md +0 -182
- package/docs/continuity.md +0 -708
- package/docs/elevated-guidance.md +0 -258
- package/docs/gating.md +0 -134
- package/docs/implementation.md +0 -447
- package/docs/mathematics-v2.md +0 -1879
- package/docs/mathematics.md +0 -695
package/docs/continuity.md
DELETED
|
@@ -1,708 +0,0 @@
|
|
|
1
|
-
# Continuity Model
|
|
2
|
-
|
|
3
|
-
This document defines the continuity layer for the planned memory system.
|
|
4
|
-
Its purpose is to ensure that session continuity does not depend only on
|
|
5
|
-
semantic retrieval quality or summary fidelity.
|
|
6
|
-
|
|
7
|
-
The central design rule is:
|
|
8
|
-
|
|
9
|
-
$$
|
|
10
|
-
\text{continuity} \neq \text{semantic summary alone}
|
|
11
|
-
$$
|
|
12
|
-
|
|
13
|
-
This document also defines a proposed lossless extension to the current model.
|
|
14
|
-
That extension is inspired by the immutable-store and expandable-summary
|
|
15
|
-
architecture in the LCM paper, "Lossless Context Management"
|
|
16
|
-
([Ehrlich and Blackman, 2026](https://papers.voltropy.com/LCM)). Where this
|
|
17
|
-
document adopts that idea directly, it cites the paper explicitly. The
|
|
18
|
-
mathematical notation below is adapted to this repository's existing
|
|
19
|
-
invariant/tail/retrieval decomposition rather than copied from the paper.
|
|
20
|
-
|
|
21
|
-
Rather than relying on semantic summary alone, continuity is modeled as the composition of:
|
|
22
|
-
|
|
23
|
-
$$
|
|
24
|
-
C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)
|
|
25
|
-
$$
|
|
26
|
-
|
|
27
|
-
where:
|
|
28
|
-
|
|
29
|
-
- $\mathcal{I}_1$ is the hard authored invariant context from
|
|
30
|
-
[`ast-v2.md`](./ast-v2.md)
|
|
31
|
-
- $\mathcal{I}_2^{*}$ is the admitted soft-invariant prefix from
|
|
32
|
-
[`ast-v2.md`](./ast-v2.md)
|
|
33
|
-
- $T_{\mathrm{recent}}$ is a preserved raw recent session tail
|
|
34
|
-
- $\mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)$ is the scored retrieval
|
|
35
|
-
output over the remaining variant memory corpus
|
|
36
|
-
|
|
37
|
-
## 1. Motivation
|
|
38
|
-
|
|
39
|
-
The retrieval system in [`mathematics-v2.md`](./mathematics-v2.md) is optimized
|
|
40
|
-
for relevance under a token budget. That is necessary, but it is not sufficient
|
|
41
|
-
for continuity.
|
|
42
|
-
|
|
43
|
-
Lossy summarization and imperfect retrieval can both preserve topic relevance
|
|
44
|
-
while still losing operational details such as:
|
|
45
|
-
|
|
46
|
-
- the latest local intent shift
|
|
47
|
-
- recently introduced identifiers and file paths
|
|
48
|
-
- nearby causal ordering between turns
|
|
49
|
-
- work-in-progress context that has not yet become durable memory
|
|
50
|
-
|
|
51
|
-
Therefore the system needs a non-semantic continuity term that remains stable
|
|
52
|
-
even when summary quality is imperfect.
|
|
53
|
-
|
|
54
|
-
## 2. Continuity Decomposition
|
|
55
|
-
|
|
56
|
-
Let the full retrievable corpus be partitioned as:
|
|
57
|
-
|
|
58
|
-
$$
|
|
59
|
-
\mathbf{D}=\mathcal{I}_1\cup\mathcal{I}_2\cup\mathcal{V},
|
|
60
|
-
\qquad
|
|
61
|
-
\mathcal{I}_1\cap\mathcal{I}_2=\mathcal{I}_1\cap\mathcal{V}=\mathcal{I}_2\cap\mathcal{V}=\emptyset
|
|
62
|
-
$$
|
|
63
|
-
|
|
64
|
-
Following [`ast-v2.md`](./ast-v2.md), invariant authored directives are injected for
|
|
65
|
-
all queries:
|
|
66
|
-
|
|
67
|
-
$$
|
|
68
|
-
d\in\mathcal{I}_1 \Rightarrow G(q,d)=1
|
|
69
|
-
$$
|
|
70
|
-
|
|
71
|
-
Soft authored directives are injected by position-preserving prefix selection:
|
|
72
|
-
|
|
73
|
-
$$
|
|
74
|
-
\mathcal{I}_2^{*}=\mathrm{Pref}(\mathcal{I}_2;\,\tau_{\mathcal{I}_2}^{\mathrm{eff}})
|
|
75
|
-
$$
|
|
76
|
-
|
|
77
|
-
We further partition the session-derived variant corpus into:
|
|
78
|
-
|
|
79
|
-
$$
|
|
80
|
-
\mathcal{V} = T_{\mathrm{recent}} \cup \mathcal{V}_{\mathrm{rest}},
|
|
81
|
-
\qquad
|
|
82
|
-
T_{\mathrm{recent}} \cap \mathcal{V}_{\mathrm{rest}} = \emptyset
|
|
83
|
-
$$
|
|
84
|
-
|
|
85
|
-
where $T_{\mathrm{recent}}$ is a fixed raw suffix of the active session that is
|
|
86
|
-
preserved verbatim and excluded from destructive compaction.
|
|
87
|
-
|
|
88
|
-
The final injected context becomes:
|
|
89
|
-
|
|
90
|
-
$$
|
|
91
|
-
C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)
|
|
92
|
-
$$
|
|
93
|
-
|
|
94
|
-
This means continuity is guaranteed jointly by:
|
|
95
|
-
|
|
96
|
-
- hard authored invariants
|
|
97
|
-
- admitted soft authored invariants
|
|
98
|
-
- preserved recent raw context
|
|
99
|
-
- scored retrieval over the older compacted/searchable corpus
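
The decomposition above can be read as a plain concatenation in precedence order. The following is a minimal sketch under stated assumptions, not the package's implementation: the `ContextRecord` shape, the `assembleContext` name, and the duplicate guard are all hypothetical, and the four inputs are assumed to have been computed already by the steps this document defines.

```typescript
// Hypothetical record shape; `tokens` stands in for toks(d).
interface ContextRecord {
  id: string;
  text: string;
  tokens: number;
}

// Assemble C_total(q) from its four components, in precedence order.
function assembleContext(
  hardInvariants: ContextRecord[], // I_1
  softPrefix: ContextRecord[],     // I_2^*
  recentTail: ContextRecord[],     // T_recent
  retrieved: ContextRecord[],      // Proj(V_rest, q)
): ContextRecord[] {
  const seen = new Set<string>();
  const out: ContextRecord[] = [];
  for (const part of [hardInvariants, softPrefix, recentTail, retrieved]) {
    for (const rec of part) {
      // The components are disjoint by definition; this guard only protects
      // against a caller passing the same record in two components.
      if (!seen.has(rec.id)) {
        seen.add(rec.id);
        out.push(rec);
      }
    }
  }
  return out;
}
```

Keeping the union as an ordered list rather than a set preserves the authored order of the invariants and the chronological order of the tail.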
|
|
100
|
-
|
|
101
|
-
## 3. Recent-Tail Definition
|
|
102
|
-
|
|
103
|
-
Let the active session turns ordered by ascending timestamp be:
|
|
104
|
-
|
|
105
|
-
$$
|
|
106
|
-
\Sigma = \langle t_1, t_2, \dots, t_n \rangle
|
|
107
|
-
$$
|
|
108
|
-
|
|
109
|
-
Define a preserved recent-tail selector:
|
|
110
|
-
|
|
111
|
-
$$
|
|
112
|
-
T_{\mathrm{recent}} = \mathrm{Tail}(\Sigma; m, \tau_{\mathrm{tail}})
|
|
113
|
-
$$
|
|
114
|
-
|
|
115
|
-
subject to the constraints:
|
|
116
|
-
|
|
117
|
-
- at least the most recent $m$ raw turns are preserved
|
|
118
|
-
- the preserved tail token target is $\tau_{\mathrm{tail}}$
|
|
119
|
-
- preserved turns are never replaced by summaries while they remain in the tail
|
|
120
|
-
|
|
121
|
-
The exact selection policy may be count-based, token-based, or both. A valid
|
|
122
|
-
runtime policy is:
|
|
123
|
-
|
|
124
|
-
$$
|
|
125
|
-
T_{\mathrm{base}} = \text{shortest raw suffix of } \Sigma \text{ such that }
|
|
126
|
-
|T_{\mathrm{base}}| \ge m
|
|
127
|
-
$$
|
|
128
|
-
|
|
129
|
-
If the base suffix fits within the tail token target:
|
|
130
|
-
|
|
131
|
-
$$
|
|
132
|
-
\sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)\le \tau_{\mathrm{tail}}
|
|
133
|
-
$$
|
|
134
|
-
|
|
135
|
-
then the runtime may extend it backward to the longest raw suffix
|
|
136
|
-
$T_{\mathrm{recent}}$ satisfying:
|
|
137
|
-
|
|
138
|
-
$$
|
|
139
|
-
T_{\mathrm{base}} \subseteq T_{\mathrm{recent}}
|
|
140
|
-
\qquad\text{and}\qquad
|
|
141
|
-
\sum_{d\in T_{\mathrm{recent}}}\mathrm{toks}(d)\le \tau_{\mathrm{tail}}
|
|
142
|
-
$$
|
|
143
|
-
|
|
144
|
-
If the most recent $m$ turns already exceed the tail target:
|
|
145
|
-
|
|
146
|
-
$$
|
|
147
|
-
\sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d) > \tau_{\mathrm{tail}}
|
|
148
|
-
$$
|
|
149
|
-
|
|
150
|
-
then continuity takes precedence and:
|
|
151
|
-
|
|
152
|
-
$$
|
|
153
|
-
T_{\mathrm{recent}} = T_{\mathrm{base}}
|
|
154
|
-
$$
|
|
155
|
-
|
|
156
|
-
with the overflow absorbed by reducing the retrievable variant budget
|
|
157
|
-
accordingly. In other words, $m$ wins over $\tau_{\mathrm{tail}}$ whenever the
|
|
158
|
-
two conflict.
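
The count-then-token policy above can be sketched as follows. This is an illustrative reading only: the `Turn` shape and the `selectRecentTail` name are assumptions, per-turn token counts are taken as given, and bundle coupling is deliberately left out.

```typescript
interface Turn {
  id: string;
  tokens: number; // stands in for toks(d)
}

// Tail(Sigma; m, tau_tail) over turns ordered oldest-first.
function selectRecentTail(turns: Turn[], m: number, tauTail: number): Turn[] {
  // T_base: the shortest suffix holding at least m turns
  // (or the whole session when it has fewer than m turns).
  let start = Math.max(0, turns.length - m);
  let used = turns.slice(start).reduce((sum, t) => sum + t.tokens, 0);

  // If T_base already exceeds tau_tail, m wins and T_recent = T_base.
  if (used > tauTail) return turns.slice(start);

  // Otherwise extend backward while the next older turn still fits.
  while (start > 0) {
    const older = turns[start - 1];
    if (older === undefined || used + older.tokens > tauTail) break;
    used += older.tokens;
    start -= 1;
  }
  return turns.slice(start);
}
```

For turns of 10, 20, 30, and 40 tokens with $m = 2$, a target of 95 keeps the last three turns, while a target of 50 still keeps the last two because the base suffix is mandatory.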
|
|
159
|
-
|
|
160
|
-
This selector is intentionally structural rather than semantic.
|
|
161
|
-
|
|
162
|
-
The selector should also preserve small logically coupled turn bundles when a
|
|
163
|
-
boundary would otherwise split an inseparable local unit. In practice, this
|
|
164
|
-
means the runtime may extend $T_{\mathrm{recent}}$ slightly backward to keep a
|
|
165
|
-
recent cause/effect pair, request/response pair, or equivalent tightly coupled
|
|
166
|
-
artifact bundle intact.
|
|
167
|
-
|
|
168
|
-
**Policy note.** Bundle coupling is a heuristic policy layer, not a formal
|
|
169
|
-
theorem term. It is listed in Section 13.4 as a heuristic and is not part of
|
|
170
|
-
the core $C_{\mathrm{total}}(q)$ assembly theorem.
|
|
171
|
-
|
|
172
|
-
## 4. Budget Partition
|
|
173
|
-
|
|
174
|
-
Let the total prompt budget be $\tau$. Then the continuity-aware allocation is:
|
|
175
|
-
|
|
176
|
-
$$
|
|
177
|
-
\tau = \tau_{\mathcal{I}_1} + \tau_{\mathcal{I}_2}^{*} + \tau_{\mathrm{tail}} + \tau_{\mathcal{V}}
|
|
178
|
-
$$
|
|
179
|
-
|
|
180
|
-
equivalently:
|
|
181
|
-
|
|
182
|
-
$$
|
|
183
|
-
\tau_{\mathcal{V}} = \tau - \tau_{\mathcal{I}_1} - \tau_{\mathcal{I}_2}^{*} - \tau_{\mathrm{tail}}
|
|
184
|
-
$$
|
|
185
|
-
|
|
186
|
-
with:
|
|
187
|
-
|
|
188
|
-
- $\tau_{\mathcal{I}_1}$ consumed by hard authored context
|
|
189
|
-
- $\tau_{\mathcal{I}_2}^{*}$ consumed by the admitted soft-invariant prefix
|
|
190
|
-
- $\tau_{\mathrm{tail}}$ reserved for preserved recent raw context
|
|
191
|
-
- $\tau_{\mathcal{V}}$ reserved for scored retrieval over
|
|
192
|
-
$\mathcal{V}_{\mathrm{rest}}$
|
|
193
|
-
|
|
194
|
-
Following the unified contract in [`mathematics-v2.md`](./mathematics-v2.md),
|
|
195
|
-
let the reserve fractions satisfy:
|
|
196
|
-
|
|
197
|
-
$$
|
|
198
|
-
\alpha_1,\alpha_2,\beta\in[0,1],
|
|
199
|
-
\qquad
|
|
200
|
-
\alpha_1+\alpha_2+\beta\le 1
|
|
201
|
-
$$
|
|
202
|
-
|
|
203
|
-
with:
|
|
204
|
-
|
|
205
|
-
$$
|
|
206
|
-
\tau_{\mathcal{I}_1}\le \alpha_1\tau
|
|
207
|
-
\qquad\text{and}\qquad
|
|
208
|
-
\tau_{\mathrm{tail}}^{\mathrm{target}}=\beta\tau
|
|
209
|
-
$$
|
|
210
|
-
|
|
211
|
-
The soft authored tier is then bounded by:
|
|
212
|
-
|
|
213
|
-
$$
|
|
214
|
-
\tau_{\mathcal{I}_2}^{\mathrm{eff}}
|
|
215
|
-
=
|
|
216
|
-
\min\!\left(
|
|
217
|
-
\alpha_2\tau,\,
|
|
218
|
-
\tau-\tau_{\mathcal{I}_1}-\sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)
|
|
219
|
-
\right)
|
|
220
|
-
$$
|
|
221
|
-
|
|
222
|
-
This enforces the intended precedence:
|
|
223
|
-
|
|
224
|
-
1. hard authored invariants
|
|
225
|
-
2. the mandatory recent-tail base suffix
|
|
226
|
-
3. the soft-invariant prefix
|
|
227
|
-
4. additional tail extension up to the target tail budget
|
|
228
|
-
5. residual variant retrieval
|
|
229
|
-
|
|
230
|
-
The residual budget must satisfy:
|
|
231
|
-
|
|
232
|
-
$$
|
|
233
|
-
\tau_{\mathcal{V}} \ge 0
|
|
234
|
-
$$
|
|
235
|
-
|
|
236
|
-
Startup and runtime must preserve:
|
|
237
|
-
|
|
238
|
-
$$
|
|
239
|
-
\tau_{\mathcal{I}_1} + \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d) \le \tau
|
|
240
|
-
$$
|
|
241
|
-
|
|
242
|
-
and:
|
|
243
|
-
|
|
244
|
-
$$
|
|
245
|
-
\sum_{d\in C_{\mathrm{total}}(q)} \mathrm{toks}(d)\le \tau
|
|
246
|
-
$$
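
The arithmetic of the partition can be sketched as two small functions. The names `BudgetInputs`, `softInvariantBudget`, and `residualRetrievalBudget` are hypothetical, and throwing on a violated bound is one possible policy that this document does not prescribe.

```typescript
interface BudgetInputs {
  tau: number;            // total prompt budget
  alpha2: number;         // soft-invariant reserve fraction alpha_2
  hardTokens: number;     // tokens consumed by hard invariants (tau_{I_1})
  baseTailTokens: number; // sum of toks(d) over T_base
}

// tau_{I_2}^{eff} = min(alpha_2 * tau, tau - tau_{I_1} - sum toks(T_base))
function softInvariantBudget(b: BudgetInputs): number {
  if (b.hardTokens + b.baseTailTokens > b.tau) {
    throw new Error("hard invariants plus the base tail exceed the prompt budget");
  }
  return Math.min(b.alpha2 * b.tau, b.tau - b.hardTokens - b.baseTailTokens);
}

// tau_V = tau - tau_{I_1} - tau_{I_2}^* - tau_tail, rejected when negative.
function residualRetrievalBudget(
  tau: number,
  hardTokens: number,
  softTokens: number,
  tailTokens: number,
): number {
  const residual = tau - hardTokens - softTokens - tailTokens;
  if (residual < 0) throw new Error("fixed components exceed the prompt budget");
  return residual;
}
```

With $\tau = 1000$, $\alpha_2 = 0.25$, 100 hard tokens, and a 300-token base tail, the soft tier may use up to 250 tokens; if the base tail grows to 800 tokens, the soft tier is squeezed to 100.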
|
|
247
|
-
|
|
248
|
-
The retrieval system may truncate only $\mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)$.
|
|
249
|
-
It must not truncate $\mathcal{I}_1$, and it must not silently compact away
|
|
250
|
-
$T_{\mathrm{recent}}$. The soft authored tier may be truncated only by
|
|
251
|
-
position-preserving prefix selection.
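
One way to read position-preserving prefix selection is sketched below. It assumes that selection stops at the first directive that does not fit, so no later directive is admitted past a skipped one; the `Directive` shape and the `selectPrefix` name are hypothetical.

```typescript
interface Directive {
  id: string;
  tokens: number; // stands in for toks(d)
}

// Pref(I_2; budget): keep directives in authored order and stop at the
// first one that would overflow the budget.
function selectPrefix(directives: Directive[], budget: number): Directive[] {
  const out: Directive[] = [];
  let used = 0;
  for (const d of directives) {
    if (used + d.tokens > budget) break;
    out.push(d);
    used += d.tokens;
  }
  return out;
}
```

Stopping rather than skipping is what makes the result a prefix: a small directive that would fit is still dropped when an earlier, larger one has already been cut.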
|
|
252
|
-
|
|
253
|
-
## 5. Compaction Boundary Invariant
|
|
254
|
-
|
|
255
|
-
Compaction operates only on:
|
|
256
|
-
|
|
257
|
-
$$
|
|
258
|
-
\mathcal{V}_{\mathrm{rest}}
|
|
259
|
-
$$
|
|
260
|
-
|
|
261
|
-
not on $T_{\mathrm{recent}}$.
|
|
262
|
-
|
|
263
|
-
If a summary record $s(C_j)$ replaces a cluster $C_j$, then:
|
|
264
|
-
|
|
265
|
-
$$
|
|
266
|
-
C_j \subseteq \mathcal{V}_{\mathrm{rest}}
|
|
267
|
-
$$
|
|
268
|
-
|
|
269
|
-
and never:
|
|
270
|
-
|
|
271
|
-
$$
|
|
272
|
-
C_j \cap T_{\mathrm{recent}} \neq \emptyset
|
|
273
|
-
$$
|
|
274
|
-
|
|
275
|
-
This gives a hard continuity boundary: recent discourse remains exact, older
|
|
276
|
-
discourse becomes summary-eligible.
|
|
277
|
-
|
|
278
|
-
The boundary must also be bundle-safe. If a cluster candidate would split a
|
|
279
|
-
tightly coupled local unit across the tail boundary, the runtime should move the
|
|
280
|
-
boundary backward so that the unit stays entirely in $T_{\mathrm{recent}}$ or
|
|
281
|
-
entirely in $\mathcal{V}_{\mathrm{rest}}$.
|
|
282
|
-
*(This is a heuristic policy; see Section 13.4.)*
|
|
283
|
-
|
|
284
|
-
## 6. Compaction Progress Guarantee
|
|
285
|
-
|
|
286
|
-
Continuity is not preserved if compaction can stall indefinitely or emit
|
|
287
|
-
summaries that fail to reduce storage pressure. Therefore compaction must make
|
|
288
|
-
monotone progress when it is invoked on eligible material.
|
|
289
|
-
|
|
290
|
-
Let $C_j$ be a compactable cluster with source token mass:
|
|
291
|
-
|
|
292
|
-
$$
|
|
293
|
-
\tau(C_j)=\sum_{d\in C_j}\mathrm{toks}(d)
|
|
294
|
-
$$
|
|
295
|
-
|
|
296
|
-
and let the emitted summary be $s(C_j)$ with:
|
|
297
|
-
|
|
298
|
-
$$
|
|
299
|
-
\tau(s(C_j))=\mathrm{toks}(s(C_j))
|
|
300
|
-
$$
|
|
301
|
-
|
|
302
|
-
The preferred invariant is:
|
|
303
|
-
|
|
304
|
-
$$
|
|
305
|
-
\tau(s(C_j)) < \tau(C_j)
|
|
306
|
-
$$
|
|
307
|
-
|
|
308
|
-
If the primary summarizer fails to achieve this reduction, the compaction path
|
|
309
|
-
should escalate through increasingly conservative modes until it produces a
|
|
310
|
-
strictly smaller representation or explicitly declines compaction for that
|
|
311
|
-
cluster. A valid strategy is:
|
|
312
|
-
|
|
313
|
-
1. normal summary generation
|
|
314
|
-
2. more aggressive summary generation
|
|
315
|
-
3. deterministic bounded fallback
|
|
316
|
-
|
|
317
|
-
This preserves a stronger system property:
|
|
318
|
-
|
|
319
|
-
$$
|
|
320
|
-
\Delta_{\mathrm{compact}}(C_j)=\tau(C_j)-\tau(s(C_j)) > 0
|
|
321
|
-
$$
|
|
322
|
-
|
|
323
|
-
whenever a cluster is actually replaced.
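
The escalation strategy can be sketched as a loop over summarizer modes. The `Summarizer` type and the `compactWithEscalation` name are assumptions made for illustration; the actual summarizers are passed in, and declining compaction is modeled as returning `null`.

```typescript
interface Turn {
  id: string;
  text: string;
  tokens: number;
}

interface Summary {
  text: string;
  tokens: number;
  method: string;
}

type Summarizer = (cluster: Turn[]) => Summary;

// Try each mode in order and accept the first summary that is strictly
// smaller than its source; return null to decline compaction.
function compactWithEscalation(cluster: Turn[], ladder: Summarizer[]): Summary | null {
  const sourceTokens = cluster.reduce((sum, t) => sum + t.tokens, 0);
  for (const summarize of ladder) {
    const candidate = summarize(cluster);
    if (candidate.tokens < sourceTokens) return candidate;
  }
  return null;
}
```

Because a summary is only returned when it is strictly smaller, any cluster this function replaces satisfies $\Delta_{\mathrm{compact}}(C_j) > 0$ by construction.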
|
|
324
|
-
|
|
325
|
-
**Edge case — singleton clusters.** If a cluster contains only a single turn
|
|
326
|
-
($|C_j| = 1$), the clustering algorithm produces a `trivial`-tagged summary that
|
|
327
|
-
does not represent meaningful compaction progress. The $\Delta_{\mathrm{compact}} > 0$
|
|
328
|
-
guarantee applies only to clusters with $|C_j| \ge 2$ that are meaningfully replaced;
|
|
329
|
-
trivial singletons are boundary cases excluded from the progress invariant.
|
|
330
|
-
|
|
331
|
-
## 7. Summary Lineage And Recoverability
|
|
332
|
-
|
|
333
|
-
Continuity improves when summary nodes are not opaque replacements but
|
|
334
|
-
recoverable abstractions with stable lineage.
|
|
335
|
-
|
|
336
|
-
For each compacted cluster $C_j$, the summary metadata should include at least:
|
|
337
|
-
|
|
338
|
-
- source identifiers
|
|
339
|
-
- earliest source timestamp
|
|
340
|
-
- latest source timestamp
|
|
341
|
-
- compaction timestamp
|
|
342
|
-
- summary method
|
|
343
|
-
- confidence
|
|
344
|
-
|
|
345
|
-
If deeper summary-on-summary compaction is introduced later, the runtime should
|
|
346
|
-
extend this metadata with parent-summary references so the compacted memory
|
|
347
|
-
space remains navigable as a directed acyclic lineage graph rather than a flat
|
|
348
|
-
bag of summaries.
|
|
349
|
-
|
|
350
|
-
Formally, for each summary node $s$ we want a typed lineage record $L(s)$ and,
|
|
351
|
-
potentially, in a hierarchical future, a set of parent summaries $P(s)$ with:
|
|
352
|
-
|
|
353
|
-
$$
|
|
354
|
-
P(s)\subseteq \mathbf{S}
|
|
355
|
-
$$
|
|
356
|
-
|
|
357
|
-
where $\mathbf{S}$ is the set of summary nodes.
|
|
358
|
-
|
|
359
|
-
The object $L(s)$ is a typed tuple or record, not an unordered set. A more
|
|
360
|
-
precise notation is:
|
|
361
|
-
|
|
362
|
-
$$
|
|
363
|
-
L(s)=\big(\mathrm{SourceIDs}(s), t_{\min}(s), t_{\max}(s), \mathrm{Method}(s), \mathrm{Confidence}(s)\big)
|
|
364
|
-
$$
|
|
365
|
-
|
|
366
|
-
This does not replace retrieval scoring. It guarantees that compressed history
|
|
367
|
-
remains inspectable and attributable.
|
|
368
|
-
|
|
369
|
-
## 7.5 Lossless Recoverability Extension
|
|
370
|
-
|
|
371
|
-
The current implementation stores lineage metadata for summaries, but it does
|
|
372
|
-
not yet preserve a fully immutable raw session store after compaction. A
|
|
373
|
-
stronger continuity contract is to treat compaction summaries as derived views
|
|
374
|
-
over immutable raw history rather than destructive replacements. This is the
|
|
375
|
-
main architectural idea adopted from the LCM paper's immutable store, summary
|
|
376
|
-
DAG, and bounded expansion model
|
|
377
|
-
([Ehrlich and Blackman, 2026](https://papers.voltropy.com/LCM)).
|
|
378
|
-
|
|
379
|
-
Let the raw session history be:
|
|
380
|
-
|
|
381
|
-
$$
|
|
382
|
-
\mathcal{R}_{\mathrm{session}}=\langle r_1,r_2,\dots,r_n\rangle
|
|
383
|
-
$$
|
|
384
|
-
|
|
385
|
-
where each $r_i$ is a raw persisted turn and raw-history persistence is
|
|
386
|
-
append-only:
|
|
387
|
-
|
|
388
|
-
$$
|
|
389
|
-
\mathrm{Compact}(\mathcal{R}_{\mathrm{session}})=\mathcal{R}_{\mathrm{session}}
|
|
390
|
-
$$
|
|
391
|
-
|
|
392
|
-
Compaction instead constructs a summary-node set:
|
|
393
|
-
|
|
394
|
-
$$
|
|
395
|
-
\mathbf{S}=\{s_1,s_2,\dots\}
|
|
396
|
-
$$
|
|
397
|
-
|
|
398
|
-
and a parent relation:
|
|
399
|
-
|
|
400
|
-
$$
|
|
401
|
-
E_{\triangleleft}\subseteq (\mathbf{S}\times\mathbf{S})\cup(\mathbf{S}\times\mathcal{R}_{\mathrm{session}})
|
|
402
|
-
$$
|
|
403
|
-
|
|
404
|
-
where an edge $(s,x)\in E_{\triangleleft}$ means summary node $s$ directly
|
|
405
|
-
covers child node $x$, with $x$ either a raw turn or a lower-order summary.
|
|
406
|
-
|
|
407
|
-
The resulting continuity graph is:
|
|
408
|
-
|
|
409
|
-
$$
|
|
410
|
-
\mathcal{G}_{\mathrm{cont}}=(\mathbf{S}\cup\mathcal{R}_{\mathrm{session}}, E_{\triangleleft})
|
|
411
|
-
$$
|
|
412
|
-
|
|
413
|
-
with the intended acyclicity invariant:
|
|
414
|
-
|
|
415
|
-
$$
|
|
416
|
-
\mathcal{G}_{\mathrm{cont}} \text{ is a DAG}
|
|
417
|
-
$$
|
|
418
|
-
|
|
419
|
-
Define recursive expansion to leaf raw turns:
|
|
420
|
-
|
|
421
|
-
$$
|
|
422
|
-
\mathrm{Expand}^{*}(x)=
|
|
423
|
-
\begin{cases}
|
|
424
|
-
\{x\} & \text{if } x\in\mathcal{R}_{\mathrm{session}} \\
|
|
425
|
-
\bigcup_{y:(x,y)\in E_{\triangleleft}} \mathrm{Expand}^{*}(y) & \text{if } x\in\mathbf{S}
|
|
426
|
-
\end{cases}
|
|
427
|
-
$$
|
|
428
|
-
|
|
429
|
-
Then lossless recoverability means:
|
|
430
|
-
|
|
431
|
-
$$
|
|
432
|
-
\forall s\in\mathbf{S},\ \mathrm{Expand}^{*}(s)\neq\emptyset
|
|
433
|
-
$$
|
|
434
|
-
|
|
435
|
-
and:
|
|
436
|
-
|
|
437
|
-
$$
|
|
438
|
-
\forall r\in\mathcal{R}_{\mathrm{session}},\ \exists x\in \mathbf{S}\cup T_{\mathrm{recent}} \text{ such that } r\in \mathrm{Expand}^{*}(x)
|
|
439
|
-
$$
|
|
440
|
-
|
|
441
|
-
Operationally, this means compaction may change which nodes are injected or
|
|
442
|
-
searched first, but it must not erase the ability to navigate back to the raw
|
|
443
|
-
turns covered by a summary.
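
The recursive expansion $\mathrm{Expand}^{*}$ can be sketched as a depth-first walk over coverage edges. This illustrates the proposed extension only; the `ContinuityGraph` shape and the `expandToRaw` name are assumptions, and the cycle check is an added safeguard for the acyclicity invariant.

```typescript
type NodeId = string;

interface ContinuityGraph {
  rawTurns: Set<NodeId>;           // R_session
  children: Map<NodeId, NodeId[]>; // coverage edges: summary -> directly covered nodes
}

// Expand*(x): the raw turns reachable from x through coverage edges.
function expandToRaw(
  graph: ContinuityGraph,
  node: NodeId,
  visiting: Set<NodeId> = new Set<NodeId>(),
): Set<NodeId> {
  if (graph.rawTurns.has(node)) return new Set<NodeId>([node]);
  if (visiting.has(node)) throw new Error(`cycle detected at ${node}`);
  visiting.add(node);
  const out = new Set<NodeId>();
  for (const child of graph.children.get(node) ?? []) {
    expandToRaw(graph, child, visiting).forEach((raw) => out.add(raw));
  }
  visiting.delete(node);
  return out;
}
```

A summary whose expansion comes back empty violates the first recoverability condition, so a caller can treat an empty result as an integrity error.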
|
|
444
|
-
|
|
445
|
-
The current repository should treat this as a proposed extension, not as a
|
|
446
|
-
claim about present behavior. Today the compactor inserts summaries with
|
|
447
|
-
structured lineage metadata, then deletes the covered source turns from the
|
|
448
|
-
session collection after successful replacement. A future lossless
|
|
449
|
-
implementation should separate:
|
|
450
|
-
|
|
451
|
-
- immutable raw turn storage
|
|
452
|
-
- active/searchable summary views
|
|
453
|
-
- bounded expansion and search over compacted history
|
|
454
|
-
|
|
455
|
-
The corresponding data-model change is to add a raw immutable session layer and
|
|
456
|
-
store summary coverage edges explicitly instead of using lineage metadata alone
|
|
457
|
-
as the recoverability surface.
|
|
458
|
-
|
|
459
|
-
## 8. Continuity-Aware Summarization Input
|
|
460
|
-
|
|
461
|
-
Compaction input should be continuity-safe before it reaches the summarizer.
|
|
462
|
-
Large opaque payloads, binary blobs, and transport artifacts consume token
|
|
463
|
-
budget without increasing continuity.
|
|
464
|
-
|
|
465
|
-
Therefore the summarization view of a cluster should apply a sanitization
|
|
466
|
-
operator:
|
|
467
|
-
|
|
468
|
-
$$
|
|
469
|
-
\widetilde{C}_j=\mathrm{Sanitize}(C_j)
|
|
470
|
-
$$
|
|
471
|
-
|
|
472
|
-
where $\mathrm{Sanitize}$ removes or replaces payload forms whose contribution
|
|
473
|
-
to downstream continuity is negligible relative to their token mass.
|
|
474
|
-
|
|
475
|
-
The intended behavior is not to destroy source truth in storage. It is to
|
|
476
|
-
provide the summarizer with a continuity-preserving projection of the source
|
|
477
|
-
cluster.
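
As one illustration of a sanitization operator, the sketch below replaces long base64-like runs with a short placeholder before text reaches the summarizer. The 256-character threshold, the pattern, and the `sanitizeForSummary` name are assumptions; a real operator would likely cover more payload forms.

```typescript
// Runs of 256 or more base64-alphabet characters are treated as opaque payloads.
const OPAQUE_RUN = /[A-Za-z0-9+\/]{256,}={0,2}/g;

// Returns a summarizer-facing projection of the text. The stored source is
// not modified; only the view handed to the summarizer changes.
function sanitizeForSummary(text: string): string {
  return text.replace(
    OPAQUE_RUN,
    (run) => `[opaque payload omitted: ${run.length} chars]`,
  );
}
```

Recording the omitted length in the placeholder keeps a trace that something was present without spending token budget on it.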
|
|
478
|
-
|
|
479
|
-
## 9. Delta-Conditioned Summaries
|
|
480
|
-
|
|
481
|
-
Independent summaries tend to repeat stable background context and waste both
|
|
482
|
-
storage and retrieval budget. A stronger continuity formulation conditions new
|
|
483
|
-
summaries on nearby previously compacted state.
|
|
484
|
-
|
|
485
|
-
Let $B_j$ be bounded prior compacted context relevant to cluster $C_j$. A valid
|
|
486
|
-
selection rule is that $B_j$ is drawn from temporally adjacent or topically
|
|
487
|
-
adjacent compacted state and satisfies a fixed supporting-context cap:
|
|
488
|
-
|
|
489
|
-
$$
|
|
490
|
-
\mathrm{toks}(B_j)\le \tau_B
|
|
491
|
-
$$
|
|
492
|
-
|
|
493
|
-
for some configured constant $\tau_B$.
|
|
494
|
-
|
|
495
|
-
Then a delta-conditioned summarizer computes:
|
|
496
|
-
|
|
497
|
-
$$
|
|
498
|
-
s(C_j \mid B_j)
|
|
499
|
-
$$
|
|
500
|
-
|
|
501
|
-
instead of an unconditional $s(C_j)$.
|
|
502
|
-
|
|
503
|
-
The purpose is to preserve what changed, what remains active, and what was
|
|
504
|
-
superseded, rather than re-summarizing unchanged context repeatedly.
|
|
505
|
-
|
|
506
|
-
This should remain bounded. $B_j$ is supporting context for compaction, not an
|
|
507
|
-
unbounded recursive history expansion.
|
|
508
|
-
|
|
509
|
-
## 10. Why This Complements Retrieval
|
|
510
|
-
|
|
511
|
-
The retrieval score in [`mathematics-v2.md`](./mathematics-v2.md) answers:
|
|
512
|
-
|
|
513
|
-
$$
|
|
514
|
-
\text{which older records are most relevant to query } q\ ?
|
|
515
|
-
$$
|
|
516
|
-
|
|
517
|
-
The continuity term answers a different question:
|
|
518
|
-
|
|
519
|
-
$$
|
|
520
|
-
\text{which context must remain exact even if scoring or summarization is imperfect?}
|
|
521
|
-
$$
|
|
522
|
-
|
|
523
|
-
These objectives are complementary, not competing.
|
|
524
|
-
|
|
525
|
-
The continuity layer is therefore a hard constraint system wrapped around the
|
|
526
|
-
existing ranking model, not a replacement for it.
|
|
527
|
-
|
|
528
|
-
## 11. Runtime Invariants
|
|
529
|
-
|
|
530
|
-
The implementation must preserve the following:
|
|
531
|
-
|
|
532
|
-
1. Invariant completeness:
|
|
533
|
-
|
|
534
|
-
$$
|
|
535
|
-
\forall d\in\mathcal{I}_1,\ \forall q\in\mathbf{Q}: d\in C_{\mathrm{total}}(q)
|
|
536
|
-
$$
|
|
537
|
-
|
|
538
|
-
2. Recent-tail exactness:
|
|
539
|
-
|
|
540
|
-
$$
|
|
541
|
-
\forall d\in T_{\mathrm{recent}}:\ d \text{ is stored and injected as raw context, not as a derived summary}
|
|
542
|
-
$$
|
|
543
|
-
|
|
544
|
-
3. Partition integrity:
|
|
545
|
-
|
|
546
|
-
$$
|
|
547
|
-
(\mathcal{I}_1\cup\mathcal{I}_2)\cap T_{\mathrm{recent}}=\emptyset,\qquad
|
|
548
|
-
(\mathcal{I}_1\cup\mathcal{I}_2)\cap\mathcal{V}_{\mathrm{rest}}=\emptyset,\qquad
|
|
549
|
-
T_{\mathrm{recent}}\cap\mathcal{V}_{\mathrm{rest}}=\emptyset
|
|
550
|
-
$$
|
|
551
|
-
|
|
552
|
-
4. Compaction exclusion:
|
|
553
|
-
|
|
554
|
-
$$
|
|
555
|
-
\forall C_j,\ C_j \subseteq \mathcal{V}_{\mathrm{rest}}
|
|
556
|
-
$$
|
|
557
|
-
|
|
558
|
-
5. Budget respect:
|
|
559
|
-
|
|
560
|
-
$$
|
|
561
|
-
\sum_{d\in C_{\mathrm{total}}(q)} \mathrm{toks}(d)\le\tau
|
|
562
|
-
$$
|
|
563
|
-
|
|
564
|
-
6. Positive compaction progress on replaced clusters:
|
|
565
|
-
|
|
566
|
-
$$
|
|
567
|
-
\forall C_j \text{ actually replaced},\ \Delta_{\mathrm{compact}}(C_j) > 0
|
|
568
|
-
$$
|
|
569
|
-
|
|
570
|
-
7. Lineage completeness for summaries:
|
|
571
|
-
|
|
572
|
-
$$
|
|
573
|
-
\forall s,\ \mathrm{SourceIDs}(s)\neq\emptyset
|
|
574
|
-
$$
|
|
575
|
-
|
|
576
|
-
8. Boundary-safe coupling:
|
|
577
|
-
|
|
578
|
-
No continuity-critical local bundle may be split across the recent-tail and
|
|
579
|
-
compaction boundary.
|
|
580
|
-
|
|
581
|
-
9. Lossless recoverability when the extension is enabled:
|
|
582
|
-
|
|
583
|
-
$$
|
|
584
|
-
\forall s\in\mathbf{S},\ \mathrm{Expand}^{*}(s)\subseteq\mathcal{R}_{\mathrm{session}}
|
|
585
|
-
\qquad\text{and}\qquad
|
|
586
|
-
\mathrm{Expand}^{*}(s)\neq\emptyset
|
|
587
|
-
$$
|
|
588
|
-
|
|
589
|
-
10. Raw-history immutability when the extension is enabled:
|
|
590
|
-
|
|
591
|
-
Compaction may add summary nodes and coverage edges, but it must not delete
|
|
592
|
-
raw turns from $\mathcal{R}_{\mathrm{session}}$.
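
Several of these invariants are cheap to check at runtime. The sketch below covers partition integrity, compaction exclusion, and budget respect; the `AssemblySnapshot` shape and the `findViolations` name are hypothetical, and the remaining invariants are left out for brevity.

```typescript
interface AssemblySnapshot {
  authoredIds: string[];  // I_1 and I_2 together
  tailIds: string[];      // T_recent
  restIds: string[];      // V_rest
  clusters: string[][];   // compaction clusters C_j
  injectedTokens: number; // sum of toks(d) over C_total(q)
  tau: number;            // total prompt budget
}

// Returns a list of human-readable violations; an empty list means all
// checked invariants hold.
function findViolations(s: AssemblySnapshot): string[] {
  const violations: string[] = [];
  const tail = new Set(s.tailIds);
  const rest = new Set(s.restIds);

  // Partition integrity: the three components are pairwise disjoint.
  if (s.authoredIds.some((id) => tail.has(id) || rest.has(id))) {
    violations.push("partition: authored record also appears in the session corpus");
  }
  if (s.tailIds.some((id) => rest.has(id))) {
    violations.push("partition: recent tail overlaps V_rest");
  }

  // Compaction exclusion: every cluster lies entirely inside V_rest.
  for (const cluster of s.clusters) {
    if (cluster.some((id) => !rest.has(id))) {
      violations.push("compaction: cluster reaches outside V_rest");
    }
  }

  // Budget respect.
  if (s.injectedTokens > s.tau) {
    violations.push("budget: injected tokens exceed tau");
  }
  return violations;
}
```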
|
|
593
|
-
|
|
594
|
-
## 12. Practical Interpretation
|
|
595
|
-
|
|
596
|
-
In practical terms, continuity for this system is:
|
|
597
|
-
|
|
598
|
-
$$
|
|
599
|
-
\begin{aligned}
|
|
600
|
-
\text{continuity} ={}& \text{authored rules} \\
|
|
601
|
-
&+ \text{recent exact session state} \\
|
|
602
|
-
&+ \text{recoverable compacted history} \\
|
|
603
|
-
&+ \text{older retrieved memory}
|
|
604
|
-
\end{aligned}
|
|
605
|
-
$$
|
|
606
|
-
|
|
607
|
-
This avoids the failure mode where continuity depends entirely on a semantic
|
|
608
|
-
summary being perfect. It also means compaction is not merely a storage
|
|
609
|
-
optimization. It is a constrained transformation that must preserve exact
|
|
610
|
-
recent state, recoverable lineage, and monotone progress.
|
|
611
|
-
|
|
612
|
-
## 13. Layer Separation And Review Guidance
|
|
613
|
-
|
|
614
|
-
The main conclusion of the follow-on review of this document is that the
|
|
615
|
-
continuity theory is most robust when it keeps three layers separate:
|
|
616
|
-
|
|
617
|
-
1. storage axioms
|
|
618
|
-
2. core retrieval and assembly math
|
|
619
|
-
3. recoverability policy
|
|
620
|
-
|
|
621
|
-
The authoritative continuity contract in this document should therefore be read
|
|
622
|
-
as follows.
|
|
623
|
-
|
|
624
|
-
### 13.1 Storage Axioms
|
|
625
|
-
|
|
626
|
-
When the lossless extension is enabled, raw-history immutability is a storage
|
|
627
|
-
axiom:
|
|
628
|
-
|
|
629
|
-
$$
|
|
630
|
-
\mathrm{Compact}(\mathcal{R}_{\mathrm{session}})=\mathcal{R}_{\mathrm{session}}
|
|
631
|
-
$$
|
|
632
|
-
|
|
633
|
-
That statement is unconditional. It does not depend on query relevance,
|
|
634
|
-
summary confidence, or token budget. It is stronger than lineage metadata or
|
|
635
|
-
query-time expansion. It simply means compaction does not delete raw source
|
|
636
|
-
turns from the immutable raw layer.
|
|
637
|
-
|
|
638
|
-
### 13.2 Recoverability Theorem
|
|
639
|
-
|
|
640
|
-
The summary-coverage DAG and $\mathrm{Expand}^{*}$ belong to recoverability,
|
|
641
|
-
not to the primary retrieval theorem. Their job is to guarantee that compacted
|
|
642
|
-
history remains navigable back to raw source turns:
|
|
643
|
-
|
|
644
|
-
$$
|
|
645
|
-
\forall s\in\mathbf{S},\ \mathrm{Expand}^{*}(s)\subseteq\mathcal{R}_{\mathrm{session}}
|
|
646
|
-
\qquad\text{and}\qquad
|
|
647
|
-
\mathrm{Expand}^{*}(s)\neq\emptyset
|
|
648
|
-
$$
|
|
649
|
-
|
|
650
|
-
This is a structural property of the continuity graph. It is not by itself a
|
|
651
|
-
claim that every query should traverse that graph during normal assembly.
|
|
652
|
-
|
|
653
|
-
### 13.3 Retrieval Boundary
|
|
654
|
-
|
|
655
|
-
The core continuity theorem remains:
|
|
656
|
-
|
|
657
|
-
$$
|
|
658
|
-
C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)
|
|
659
|
-
$$
|
|
660
|
-
|
|
661
|
-
This document treats that expression as the primary assembly law. A runtime may
|
|
662
|
-
experiment with query-time summary expansion, but such expansion should be
|
|
663
|
-
treated as a bounded policy layer wrapped around the core theorem unless it is
|
|
664
|
-
formally re-derived inside the governing retrieval math.
|
|
665
|
-
|
|
666
|
-
In particular, policy knobs such as:
|
|
667
|
-
|
|
668
|
-
- summary-expansion confidence thresholds
|
|
669
|
-
- expansion token budgets
|
|
670
|
-
- depth limits
|
|
671
|
-
- expansion penalties or attenuations
|
|
672
|
-
|
|
673
|
-
are not themselves continuity axioms. They are deployment and retrieval-policy
|
|
674
|
-
choices layered on top of the structural guarantees above.
|
|
675
|
-
|
|
676
|
-
### 13.4 Heuristic vs. Theorem Boundary
|
|
677
|
-
|
|
678
|
-
The following ideas remain useful, but should be read as heuristics unless
|
|
679
|
-
their mathematics is defined explicitly elsewhere:
|
|
680
|
-
|
|
681
|
-
- **bundle-safe boundary extension** (Section 3): the runtime may extend
|
|
682
|
-
$T_{\mathrm{recent}}$ backward to avoid splitting a coupled local bundle;
|
|
683
|
-
this is a heuristic policy, not a formal tail selector term
|
|
684
|
-
- specific escalation ladders for compaction fallback
|
|
685
|
-
- **confidence-triggered automatic expansion**: query-time summary expansion is
|
|
686
|
-
explicit recovery/audit only; it was removed from the hot retrieval path and
|
|
687
|
-
is not the default behavior — see Section 13.3 and memory 283
|
|
688
|
-
- any fixed expansion penalty not derived from the governing score equations
|
|
689
|
-
|
|
690
|
-
This distinction matters because continuity should stay theorem-safe even when
|
|
691
|
-
those policies are tuned, replaced, or disabled.
|
|
692
|
-
|
|
693
|
-
### 13.5 Future Theory Direction
|
|
694
|
-
|
|
695
|
-
Several mathematically interesting review suggestions are worth preserving for
|
|
696
|
-
future refinement, but they are not part of the current authoritative theorem:
|
|
697
|
-
|
|
698
|
-
- information-theoretic or rate-distortion views of compaction quality
|
|
699
|
-
- hot-spot preservation tiers based on access concentration
|
|
700
|
-
- causal-centrality-aware compaction vetoes
|
|
701
|
-
- entropy-driven tail selection instead of fixed turn-count rules
|
|
702
|
-
- explicit recovery-state machines triggered by retrieval failure (the vNext
|
|
703
|
-
retrieval-failure signals S1/S2/S3 are defined separately in the vNext spec
|
|
704
|
-
slice; they are not part of the current $C_{\mathrm{total}}$ theorem)
|
|
705
|
-
|
|
706
|
-
These are promising research directions for later versions. The current
|
|
707
|
-
document keeps the simpler invariant-first continuity model as the normative
|
|
708
|
-
contract until one of those stronger formulations is deliberately adopted.
|