@xdarkicex/openclaw-memory-libravdb 1.3.11 → 1.3.12

@@ -0,0 +1,488 @@
1
+ # Continuity Model
2
+
3
+ This document defines the continuity layer for the planned memory system.
4
+ Its purpose is to ensure that session continuity does not depend only on
5
+ semantic retrieval quality or summary fidelity.
6
+
7
+ The central design rule is:
8
+
9
+ $$
10
+ \text{continuity} \neq \text{semantic summary alone}
11
+ $$
12
+
13
+ Instead, continuity is modeled as the composition of:
14
+
15
+ $$
16
+ C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)
17
+ $$
18
+
19
+ where:
20
+
21
+ - $\mathcal{I}_1$ is the hard authored invariant context from
22
+ [`ast-v2.md`](./ast-v2.md)
23
+ - $\mathcal{I}_2^{*}$ is the admitted soft-invariant prefix from
24
+ [`ast-v2.md`](./ast-v2.md)
25
+ - $T_{\mathrm{recent}}$ is a preserved raw recent session tail
26
+ - $\mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)$ is the scored retrieval
27
+ output over the remaining variant memory corpus
28
+
29
+ ## 1. Motivation
30
+
31
+ The retrieval system in [`mathematics-v2.md`](./mathematics-v2.md) is optimized
32
+ for relevance under a token budget. That is necessary, but it is not sufficient
33
+ for continuity.
34
+
35
+ Lossy summarization and imperfect retrieval can both preserve topic relevance
36
+ while still losing operational details such as:
37
+
38
+ - the latest local intent shift
39
+ - recently introduced identifiers and file paths
40
+ - nearby causal ordering between turns
41
+ - work-in-progress context that has not yet become durable memory
42
+
43
+ Therefore the system needs a non-semantic continuity term that remains stable
44
+ even when summary quality is imperfect.
45
+
46
+ ## 2. Continuity Decomposition
47
+
48
+ Let the full retrievable corpus be partitioned as:
49
+
50
+ $$
51
+ \mathbf{D}=\mathcal{I}_1\cup\mathcal{I}_2\cup\mathcal{V},
52
+ \qquad
53
+ \mathcal{I}_1\cap\mathcal{I}_2=\mathcal{I}_1\cap\mathcal{V}=\mathcal{I}_2\cap\mathcal{V}=\emptyset
54
+ $$
55
+
56
+ Following [`ast-v2.md`](./ast-v2.md), invariant authored directives are injected for
57
+ all queries:
58
+
59
+ $$
60
+ d\in\mathcal{I}_1 \Rightarrow G(q,d)=1
61
+ $$
62
+
63
+ Soft authored directives are injected by position-preserving prefix selection:
64
+
65
+ $$
66
+ \mathcal{I}_2^{*}=\mathrm{Pref}(\mathcal{I}_2;\,\tau_{\mathcal{I}_2}^{\mathrm{eff}})
67
+ $$
68
+
69
+ We further partition the session-derived variant corpus into:
70
+
71
+ $$
72
+ \mathcal{V} = T_{\mathrm{recent}} \cup \mathcal{V}_{\mathrm{rest}},
73
+ \qquad
74
+ T_{\mathrm{recent}} \cap \mathcal{V}_{\mathrm{rest}} = \emptyset
75
+ $$
76
+
77
+ where $T_{\mathrm{recent}}$ is a fixed raw suffix of the active session that is
78
+ preserved verbatim and excluded from destructive compaction.
79
+
80
+ The final injected context becomes:
81
+
82
+ $$
83
+ C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)
84
+ $$
85
+
86
+ This means continuity is guaranteed jointly by:
87
+
88
+ - hard authored invariants
89
+ - admitted soft authored invariants
90
+ - preserved recent raw context
91
+ - scored retrieval over the older compacted/searchable corpus
92
+
93
+ ## 3. Recent-Tail Definition
94
+
95
+ Let the active session turns ordered by ascending timestamp be:
96
+
97
+ $$
98
+ \Sigma = \langle t_1, t_2, \dots, t_n \rangle
99
+ $$
100
+
101
+ Define a preserved recent-tail selector:
102
+
103
+ $$
104
+ T_{\mathrm{recent}} = \mathrm{Tail}(\Sigma; m, \tau_{\mathrm{tail}})
105
+ $$
106
+
107
+ subject to the constraints:
108
+
109
+ - at least the most recent $m$ raw turns are preserved
110
+ - the preserved tail token target is $\tau_{\mathrm{tail}}$
111
+ - preserved turns are never replaced by summaries while they remain in the tail
112
+
113
+ The exact selection policy may be count-based, token-based, or both. A valid
114
+ runtime policy is:
115
+
116
+ $$
117
+ T_{\mathrm{base}} = \text{shortest raw suffix of } \Sigma \text{ such that }
118
+ |T_{\mathrm{base}}| \ge m
119
+ $$
120
+
121
+ If the base suffix fits within the tail token target:
122
+
123
+ $$
124
+ \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)\le \tau_{\mathrm{tail}}
125
+ $$
126
+
127
+ then the runtime may extend it backward to the longest raw suffix
128
+ $T_{\mathrm{recent}}$ satisfying:
129
+
130
+ $$
131
+ T_{\mathrm{base}} \subseteq T_{\mathrm{recent}}
132
+ \qquad\text{and}\qquad
133
+ \sum_{d\in T_{\mathrm{recent}}}\mathrm{toks}(d)\le \tau_{\mathrm{tail}}
134
+ $$
135
+
136
+ If the most recent $m$ turns already exceed the tail target:
137
+
138
+ $$
139
+ \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d) > \tau_{\mathrm{tail}}
140
+ $$
141
+
142
+ then continuity takes precedence and:
143
+
144
+ $$
145
+ T_{\mathrm{recent}} = T_{\mathrm{base}}
146
+ $$
147
+
148
+ with the overflow absorbed by reducing the retrievable variant budget
149
+ accordingly. In other words, $m$ wins over $\tau_{\mathrm{tail}}$ whenever the
150
+ two conflict.
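The runtime policy above can be sketched in a few lines of Python. This is a minimal illustration, not the package's implementation: the function name `select_recent_tail`, the list-of-token-counts input, and the handling of sessions shorter than $m$ turns (the whole session becomes the base suffix) are assumptions introduced here.

```python
from typing import Sequence


def select_recent_tail(turn_tokens: Sequence[int], m: int, tau_tail: int) -> int:
    """Return the index where T_recent starts in `turn_tokens` (oldest turn first).

    Hypothetical helper: turn_tokens[i] is toks(t_{i+1}) for the active session.
    """
    n = len(turn_tokens)
    # T_base: the shortest suffix holding at least m turns. Assumption: when the
    # session has fewer than m turns, the whole session is the base suffix.
    start = max(0, n - m)
    used = sum(turn_tokens[start:])

    # m wins over tau_tail: an oversized base suffix is kept as-is.
    if used > tau_tail:
        return start

    # Otherwise extend backward to the longest suffix that still fits the target.
    while start > 0 and used + turn_tokens[start - 1] <= tau_tail:
        start -= 1
        used += turn_tokens[start]
    return start
```

With token counts `[50, 40, 30, 20, 10]`, `m = 2`, and a target of 70, the base suffix costs 30 tokens and is extended by one turn, so the tail starts at index 2. Lowering the target to 25 leaves the two-turn base suffix in place even though it exceeds the target.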
151
+
152
+ This selector is intentionally structural rather than semantic.
153
+
154
+ The selector should also preserve small logically coupled turn bundles when a
155
+ boundary would otherwise split an inseparable local unit. In practice, this
156
+ means the runtime may extend $T_{\mathrm{recent}}$ slightly backward to keep a
157
+ recent cause/effect pair, request/response pair, or equivalent tightly coupled
158
+ artifact bundle intact.
159
+
160
+ ## 4. Budget Partition
161
+
162
+ Let the total prompt budget be $\tau$. Then the continuity-aware allocation is:
163
+
164
+ $$
165
+ \tau = \tau_{\mathcal{I}_1} + \tau_{\mathcal{I}_2}^{*} + \tau_{\mathrm{tail}} + \tau_{\mathcal{V}}
166
+ $$
167
+
168
+ equivalently:
169
+
170
+ $$
171
+ \tau_{\mathcal{V}} = \tau - \tau_{\mathcal{I}_1} - \tau_{\mathcal{I}_2}^{*} - \tau_{\mathrm{tail}}
172
+ $$
173
+
174
+ with:
175
+
176
+ - $\tau_{\mathcal{I}_1}$ consumed by hard authored context
177
+ - $\tau_{\mathcal{I}_2}^{*}$ consumed by the admitted soft-invariant prefix
178
+ - $\tau_{\mathrm{tail}}$ reserved for preserved recent raw context
179
+ - $\tau_{\mathcal{V}}$ reserved for scored retrieval over
180
+ $\mathcal{V}_{\mathrm{rest}}$
181
+
182
+ Following the unified contract in [`mathematics-v2.md`](./mathematics-v2.md),
183
+ let the reserve fractions satisfy:
184
+
185
+ $$
186
+ \alpha_1,\alpha_2,\beta\in[0,1],
187
+ \qquad
188
+ \alpha_1+\alpha_2+\beta\le 1
189
+ $$
190
+
191
+ with:
192
+
193
+ $$
194
+ \tau_{\mathcal{I}_1}\le \alpha_1\tau
195
+ \qquad\text{and}\qquad
196
+ \tau_{\mathrm{tail}}^{\mathrm{target}}=\beta\tau
197
+ $$
198
+
199
+ The soft authored tier is then bounded by:
200
+
201
+ $$
202
+ \tau_{\mathcal{I}_2}^{\mathrm{eff}}
203
+ =
204
+ \min\!\left(
205
+ \alpha_2\tau,\,
206
+ \tau-\tau_{\mathcal{I}_1}-\sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)
207
+ \right)
208
+ $$
209
+
210
+ This enforces the intended precedence:
211
+
212
+ 1. hard authored invariants
213
+ 2. the mandatory recent-tail base suffix
214
+ 3. the soft-invariant prefix
215
+ 4. additional tail extension up to the target tail budget
216
+ 5. residual variant retrieval
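The precedence order can be made concrete with a small allocator. The sketch below is one possible reading of the formulas in this section, not the shipped code: `Allocation`, `allocate`, and the integer rounding of $\alpha_2\tau$ and $\beta\tau$ are assumptions, and the $\tau_{\mathcal{I}_1}\le\alpha_1\tau$ check is left out for brevity.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Allocation:
    soft_cap: int        # tau_{I_2}^{eff}
    soft_admitted: int   # number of soft directives in the admitted prefix
    soft_used: int       # tokens consumed by that prefix
    tail_cap: int        # tokens the recent tail may occupy after extension
    residual_min: int    # tokens left for retrieval if the tail fills tail_cap


def allocate(tau: int, hard_tokens: int, base_tail_tokens: int,
             soft_tokens: Sequence[int], alpha2: float, beta: float) -> Allocation:
    """Apply the order: hard, base tail, soft prefix, tail extension, residual."""
    # Steps 1-2: hard invariants and the mandatory base suffix must fit outright.
    if hard_tokens + base_tail_tokens > tau:
        raise ValueError("hard invariants plus base tail exceed the prompt budget")

    # Step 3: effective soft cap, then position-preserving prefix admission.
    soft_cap = min(int(alpha2 * tau), tau - hard_tokens - base_tail_tokens)
    soft_used = 0
    soft_admitted = 0
    for cost in soft_tokens:
        if soft_used + cost > soft_cap:
            break  # stop at the first directive that does not fit; never skip ahead
        soft_used += cost
        soft_admitted += 1

    # Step 4: the tail may grow to the target beta * tau, but never past what is
    # left, and never below the base suffix (m wins over the target).
    remaining = tau - hard_tokens - soft_used
    tail_cap = max(base_tail_tokens, min(int(beta * tau), remaining))

    # Step 5: whatever is left goes to scored retrieval over V_rest.
    return Allocation(soft_cap, soft_admitted, soft_used, tail_cap,
                      remaining - tail_cap)
```

For example, with $\tau = 1000$, 100 hard tokens, an 800-token base suffix, $\alpha_2 = 0.25$, and $\beta = 0.5$, the soft cap shrinks to 100 tokens and the tail keeps its 800 tokens, which shows the base suffix taking precedence over both the soft tier and the tail target.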
217
+
218
+ The residual budget must satisfy:
219
+
220
+ $$
221
+ \tau_{\mathcal{V}} \ge 0
222
+ $$
223
+
224
+ Startup and runtime must preserve:
225
+
226
+ $$
227
+ \tau_{\mathcal{I}_1} + \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d) \le \tau
228
+ $$
229
+
230
+ and:
231
+
232
+ $$
233
+ \sum_{d\in C_{\mathrm{total}}(q)} \mathrm{toks}(d)\le \tau
234
+ $$
235
+
236
+ The retrieval system may truncate only $\mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)$.
237
+ It must not truncate $\mathcal{I}_1$, and it must not silently compact away
238
+ $T_{\mathrm{recent}}$. The soft authored tier may be truncated only by
239
+ position-preserving prefix selection.
240
+
241
+ ## 5. Compaction Boundary Invariant
242
+
243
+ Compaction operates only on:
244
+
245
+ $$
246
+ \mathcal{V}_{\mathrm{rest}}
247
+ $$
248
+
249
+ not on $T_{\mathrm{recent}}$.
250
+
251
+ If a summary record $s(C_j)$ replaces a cluster $C_j$, then:
252
+
253
+ $$
254
+ C_j \subseteq \mathcal{V}_{\mathrm{rest}}
255
+ $$
256
+
257
+ and never:
258
+
259
+ $$
260
+ C_j \cap T_{\mathrm{recent}} \neq \emptyset
261
+ $$
262
+
263
+ This gives a hard continuity boundary: recent discourse remains exact, older
264
+ discourse becomes summary-eligible.
265
+
266
+ The boundary must also be bundle-safe. If a cluster candidate would split a
267
+ tightly coupled local unit across the tail boundary, the runtime should move the
268
+ boundary backward so that the unit stays entirely in $T_{\mathrm{recent}}$ rather
269
+ than straddling $T_{\mathrm{recent}}$ and $\mathcal{V}_{\mathrm{rest}}$.
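A bundle-safe boundary check might look like the following sketch. The encoding is hypothetical: each turn carries an optional bundle label, turns in the same coupled unit share a label and are assumed to be contiguous, and clusters are sets of turn indices. Neither function name comes from the package.

```python
from typing import Optional, Sequence, Set


def bundle_safe_start(bundle_ids: Sequence[Optional[str]], start: int) -> int:
    """Move the tail start backward until it no longer splits a bundle.

    Hypothetical encoding: bundle_ids[i] is a shared label for turns in the same
    coupled unit (assumed contiguous), or None for a standalone turn.
    """
    while (0 < start < len(bundle_ids)
           and bundle_ids[start] is not None
           and bundle_ids[start - 1] == bundle_ids[start]):
        start -= 1
    return start


def is_compactable(cluster: Set[int], tail_start: int) -> bool:
    """A cluster of turn indices is eligible only if it lies wholly in V_rest."""
    return all(index < tail_start for index in cluster)
```

Moving the start index backward only ever grows the tail, so this adjustment cannot violate the minimum of $m$ preserved turns.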
270
+
271
+ ## 6. Compaction Progress Guarantee
272
+
273
+ Continuity is not preserved if compaction can stall indefinitely or emit
274
+ summaries that fail to reduce storage pressure. Therefore compaction must make
275
+ monotone progress when it is invoked on eligible material.
276
+
277
+ Let $C_j$ be a compactable cluster with source token mass:
278
+
279
+ $$
280
+ \tau(C_j)=\sum_{d\in C_j}\mathrm{toks}(d)
281
+ $$
282
+
283
+ and let the emitted summary be $s(C_j)$ with:
284
+
285
+ $$
286
+ \tau(s(C_j))=\mathrm{toks}(s(C_j))
287
+ $$
288
+
289
+ The preferred invariant is:
290
+
291
+ $$
292
+ \tau(s(C_j)) < \tau(C_j)
293
+ $$
294
+
295
+ If the primary summarizer fails to achieve this reduction, the compaction path
296
+ should escalate through successively stricter modes until it produces a
297
+ strictly smaller representation or explicitly declines compaction for that
298
+ cluster. A valid strategy is:
299
+
300
+ 1. normal summary generation
301
+ 2. more aggressive summary generation
302
+ 3. deterministic bounded fallback
303
+
304
+ This makes the strict reduction a hard system property:
305
+
306
+ $$
307
+ \Delta_{\mathrm{compact}}(C_j)=\tau(C_j)-\tau(s(C_j)) > 0
308
+ $$
309
+
310
+ whenever a cluster is actually replaced.
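The escalation strategy can be sketched as a ladder of summarizers tried in order. Everything named here is illustrative: `compact`, the `ladder` argument, and the pluggable `count_tokens` callable are assumptions, and the real summarizers and tokenizer are not specified in this document.

```python
from typing import Callable, Optional, Sequence


def compact(cluster_text: str,
            ladder: Sequence[Callable[[str], str]],
            count_tokens: Callable[[str], int]) -> Optional[str]:
    """Try each summarizer in order; accept only a strictly smaller result.

    Returns None when no mode achieves a reduction, which means the caller
    declines compaction and leaves the cluster unreplaced.
    """
    source_tokens = count_tokens(cluster_text)
    for summarize in ladder:
        candidate = summarize(cluster_text)
        if count_tokens(candidate) < source_tokens:
            return candidate
    return None
```

Because a summary is returned only when it is strictly smaller than its source, every cluster that is actually replaced has $\Delta_{\mathrm{compact}}(C_j) > 0$ by construction.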
311
+
312
+ ## 7. Summary Lineage And Recoverability
313
+
314
+ Continuity improves when summary nodes are not opaque replacements but
315
+ recoverable abstractions with stable lineage.
316
+
317
+ For each compacted cluster $C_j$, the summary metadata should include at least:
318
+
319
+ - source identifiers
320
+ - earliest source timestamp
321
+ - latest source timestamp
322
+ - compaction timestamp
323
+ - summary method
324
+ - confidence
325
+
326
+ If deeper summary-on-summary compaction is introduced later, the runtime should
327
+ extend this metadata with parent-summary references so the compacted memory
328
+ space remains navigable as a directed acyclic lineage graph rather than a flat
329
+ bag of summaries.
330
+
331
+ Formally, for each summary node $s$ we want a typed lineage record $L(s)$ and,
332
+ potentially, in a hierarchical future, a set of parent summaries $P(s)$ with:
333
+
334
+ $$
335
+ P(s)\subseteq \mathbf{S}
336
+ $$
337
+
338
+ where $\mathbf{S}$ is the set of summary nodes.
339
+
340
+ The object $L(s)$ is a typed tuple or record, not an unordered set. A more
341
+ precise notation is:
342
+
343
+ $$
344
+ L(s)=\big(\mathrm{SourceIDs}(s), t_{\min}(s), t_{\max}(s), t_{\mathrm{compact}}(s), \mathrm{Method}(s), \mathrm{Confidence}(s)\big)
345
+ $$
346
+
347
+ This does not replace retrieval scoring. It guarantees that compressed history
348
+ remains inspectable and attributable.
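As a sketch, the lineage record could be modeled as an immutable typed record that mirrors the metadata list above. The class name, field names, and the validation rules (non-empty sources, ordered timestamps, confidence in $[0,1]$) are assumptions made for illustration, and `parent_summaries` stands in for the optional future $P(s)$.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Lineage:
    source_ids: Tuple[str, ...]   # identifiers of the compacted source records
    t_min: float                  # earliest source timestamp
    t_max: float                  # latest source timestamp
    compacted_at: float           # compaction timestamp
    method: str                   # summary method label
    confidence: float             # assumed to lie in [0, 1]
    parent_summaries: Tuple[str, ...] = ()  # hypothetical P(s), empty for now

    def __post_init__(self) -> None:
        if not self.source_ids:
            raise ValueError("a summary must reference at least one source")
        if self.t_min > self.t_max:
            raise ValueError("t_min must not be later than t_max")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
```

A frozen dataclass keeps the record ordered and typed, which matches the requirement that the lineage object is a tuple or record rather than an unordered set.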
349
+
350
+ ## 8. Continuity-Aware Summarization Input
351
+
352
+ Compaction input should be continuity-safe before it reaches the summarizer.
353
+ Large opaque payloads, binary blobs, and transport artifacts consume token
354
+ budget without increasing continuity.
355
+
356
+ Therefore the summarization view of a cluster should apply a sanitization
357
+ operator:
358
+
359
+ $$
360
+ \widetilde{C}_j=\mathrm{Sanitize}(C_j)
361
+ $$
362
+
363
+ where $\mathrm{Sanitize}$ removes or replaces payload forms whose contribution
364
+ to downstream continuity is negligible relative to their token mass.
365
+
366
+ The intended behavior is not to destroy source truth in storage. It is to
367
+ provide the summarizer with a continuity-preserving projection of the source
368
+ cluster.
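One simple form of $\mathrm{Sanitize}$ replaces long opaque runs with a short placeholder. The sketch below is a heuristic chosen for illustration, not the operator the runtime uses: it treats any run of 64 or more base64-alphabet characters as an opaque payload, and both the threshold and the placeholder text are assumptions. It operates on the summarizer's view only and leaves stored source records untouched.

```python
import re


def sanitize(text: str, min_run: int = 64) -> str:
    """Replace long base64-like runs with a short placeholder.

    Heuristic assumption: a run of at least `min_run` characters drawn from the
    base64 alphabet carries negligible continuity value for the summarizer.
    """
    pattern = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % min_run)

    def placeholder(match):
        # Keep the payload length so the summary can still mention its size.
        return "[opaque payload omitted: %d chars]" % len(match.group(0))

    return pattern.sub(placeholder, text)
```

Ordinary prose, identifiers, and most file paths are shorter than the threshold or contain characters outside the matched alphabet, so they pass through unchanged.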
369
+
370
+ ## 9. Delta-Conditioned Summaries
371
+
372
+ Independent summaries tend to repeat stable background context and waste both
373
+ storage and retrieval budget. A stronger continuity formulation conditions new
374
+ summaries on nearby previously compacted state.
375
+
376
+ Let $B_j$ be bounded prior compacted context relevant to cluster $C_j$. A valid
377
+ selection rule is that $B_j$ is drawn from temporally adjacent or topically
378
+ adjacent compacted state and satisfies a fixed supporting-context cap:
379
+
380
+ $$
381
+ \mathrm{toks}(B_j)\le \tau_B
382
+ $$
383
+
384
+ for some configured constant $\tau_B$.
385
+
386
+ Then a delta-conditioned summarizer computes:
387
+
388
+ $$
389
+ s(C_j \mid B_j)
390
+ $$
391
+
392
+ instead of an unconditional $s(C_j)$.
393
+
394
+ The purpose is to preserve what changed, what remains active, and what was
395
+ superseded, rather than re-summarizing unchanged context repeatedly.
396
+
397
+ This should remain bounded. $B_j$ is supporting context for compaction, not an
398
+ unbounded recursive history expansion.
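The cap on $B_j$ can be enforced with a simple greedy selection. The sketch below assumes temporal adjacency only: it walks prior summaries from newest to oldest and stops at the first one that would exceed $\tau_B$. The function name and the `count_tokens` callable are hypothetical, and topical adjacency would need a different selector.

```python
from typing import Callable, List, Sequence


def select_supporting_context(prior_summaries: Sequence[str],
                              tau_b: int,
                              count_tokens: Callable[[str], int]) -> List[str]:
    """Pick the most recent prior summaries whose total cost stays within tau_b.

    `prior_summaries` is ordered oldest first; the result keeps that order.
    """
    chosen: List[str] = []
    used = 0
    for summary in reversed(prior_summaries):
        cost = count_tokens(summary)
        if used + cost > tau_b:
            break  # keep B_j a contiguous, bounded window of recent state
        chosen.append(summary)
        used += cost
    chosen.reverse()
    return chosen
```

Stopping at the first summary that does not fit keeps the window contiguous in time, which suits a delta-conditioned summarizer that needs the immediately preceding state rather than scattered older fragments.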
399
+
400
+ ## 10. Why This Complements Retrieval
401
+
402
+ The retrieval score in [`mathematics-v2.md`](./mathematics-v2.md) answers:
403
+
404
+ $$
405
+ \text{which older records are most relevant to query } q\ ?
406
+ $$
407
+
408
+ The continuity term answers a different question:
409
+
410
+ $$
411
+ \text{which context must remain exact even if scoring or summarization is imperfect?}
412
+ $$
413
+
414
+ These objectives are complementary, not competing.
415
+
416
+ The continuity layer is therefore a hard constraint system wrapped around the
417
+ existing ranking model, not a replacement for it.
418
+
419
+ ## 11. Runtime Invariants
420
+
421
+ The implementation must preserve the following:
422
+
423
+ 1. Invariant completeness:
424
+
425
+ $$
426
+ \forall d\in\mathcal{I}_1,\ \forall q\in\mathbf{Q}: d\in C_{\mathrm{total}}(q)
427
+ $$
428
+
429
+ 2. Recent-tail exactness:
430
+
431
+ $$
432
+ \forall d\in T_{\mathrm{recent}}:\ d \text{ is stored and injected as raw context, not as a derived summary}
433
+ $$
434
+
435
+ 3. Partition integrity:
436
+
437
+ $$
438
+ (\mathcal{I}_1\cup\mathcal{I}_2)\cap T_{\mathrm{recent}}=\emptyset,\qquad
438
+ (\mathcal{I}_1\cup\mathcal{I}_2)\cap\mathcal{V}_{\mathrm{rest}}=\emptyset,\qquad
440
+ T_{\mathrm{recent}}\cap\mathcal{V}_{\mathrm{rest}}=\emptyset
441
+ $$
442
+
443
+ 4. Compaction exclusion:
444
+
445
+ $$
446
+ \forall C_j,\ C_j \subseteq \mathcal{V}_{\mathrm{rest}}
447
+ $$
448
+
449
+ 5. Budget respect:
450
+
451
+ $$
452
+ \sum_{d\in C_{\mathrm{total}}(q)} \mathrm{toks}(d)\le\tau
453
+ $$
454
+
455
+ 6. Positive compaction progress on replaced clusters:
456
+
457
+ $$
458
+ \forall C_j \text{ actually replaced},\ \Delta_{\mathrm{compact}}(C_j) > 0
459
+ $$
460
+
461
+ 7. Lineage completeness for summaries:
462
+
463
+ $$
464
+ \forall s,\ \mathrm{SourceIDs}(s)\neq\emptyset
465
+ $$
466
+
467
+ 8. Boundary-safe coupling:
468
+
469
+ No continuity-critical local bundle may be split across the recent-tail and
470
+ compaction boundary.
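Several of these invariants are mechanically checkable. The following sketch covers invariant completeness (read here as applying to the hard tier, which is an assumption), partition integrity, compaction exclusion, and budget respect. Records are represented by string identifiers, and the function name and argument layout are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def check_invariants(hard: Set[str], soft: Set[str], tail: Set[str],
                     rest: Set[str], context: Sequence[str],
                     clusters: Sequence[Set[str]],
                     toks: Dict[str, int], tau: int) -> List[str]:
    """Return the names of violated invariants (empty list when all checks pass)."""
    violations: List[str] = []

    # 1. Every hard invariant must appear in the injected context.
    if not hard <= set(context):
        violations.append("invariant completeness")

    # 3. Authored tiers, recent tail, and the remaining corpus are disjoint.
    authored = hard | soft
    if authored & tail or authored & rest or tail & rest:
        violations.append("partition integrity")

    # 4. Compaction clusters may only draw from V_rest.
    if any(not cluster <= rest for cluster in clusters):
        violations.append("compaction exclusion")

    # 5. The injected context must fit the prompt budget.
    if sum(toks[d] for d in context) > tau:
        violations.append("budget respect")

    return violations
```

A check of this kind could run in tests or behind a debug flag. The remaining invariants (tail exactness, compaction progress, lineage completeness, and bundle safety) depend on storage and summarizer state that this sketch does not model.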
471
+
472
+ ## 12. Practical Interpretation
473
+
474
+ In practical terms, continuity for this system is:
475
+
476
+ $$
477
+ \begin{aligned}
478
+ \text{continuity} ={}& \text{authored rules} \\
479
+ &+ \text{recent exact session state} \\
480
+ &+ \text{recoverable compacted history} \\
481
+ &+ \text{older retrieved memory}
482
+ \end{aligned}
483
+ $$
484
+
485
+ This avoids the failure mode where continuity depends entirely on a semantic
486
+ summary being perfect. It also means compaction is not merely a storage
487
+ optimization. It is a constrained transformation that must preserve exact
488
+ recent state, recoverable lineage, and monotone progress.
@@ -72,5 +72,5 @@ Before opening a PR:
72
72
  - `pnpm check` must pass
73
73
  - `go test -race ./...` from `sidecar/` must pass
74
74
  - any new gating signal must come with calibration or invariant coverage
75
- - any retrieval math change must be reflected in [mathematics.md](./mathematics.md)
75
+ - any retrieval math change must be reflected in [mathematics-v2.md](./mathematics-v2.md)
76
76
  - any gating change must be reflected in [gating.md](./gating.md)