@xdarkicex/openclaw-memory-libravdb 1.3.11 → 1.3.13

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,1228 @@
1
+ # Mathematical Reference
2
+
3
+ This document is the formal reference for the scoring and optimization math used
4
+ by the plugin. The gating scalar is documented separately in
5
+ [gating.md](./gating.md). The continuity model and recent-tail preservation
6
+ layer are documented in [continuity.md](./continuity.md). The authored
7
+ invariant/variant partitioning rules are documented in
8
+ [ast-v2.md](./ast-v2.md). Earlier non-versioned math docs are preserved for
9
+ historical context, but the reviewed `*-v*` documents are authoritative when
10
+ both forms exist.
11
+
12
+ Every formula below points at the file that currently implements it. If the code
13
+ changes first, this document must change with it.
14
+
15
+ This revision (3.3) merges the complete section set from `mathematics.md` with
16
+ the formal corrections introduced in `mathematics-3-2.md`. All sections are now
17
+ present and carry the 3.2 corrections:
18
+
19
+ - explicit domain and startup invariants where later proofs depend on them
20
+ - removal of self-referential set definitions in the planned two-pass model
21
+ - disambiguation of decay symbols with different units and meanings
22
+ - explicit convex-combination proof obligations for bounded scores
23
+ - regularized Matryoshka normalization with $\varepsilon$-guarded denominators
24
+ and explicit early-exit threshold values
25
+ - division-by-zero guards in compaction clustering ($n = 0$ and $k = 0$ cases)
26
+ - clamped confidence formula with per-backend range proofs
27
+ - cold-start smoothing in the authority-weight frequency term $f(d)$
28
+ - separated coarse-candidate raw set from filtered set in Pass 1
29
+ - $\eta_{\mathrm{hop}}$ symbol replacing bare $\lambda$ for hop attenuation
30
+ - startup invariant $\tau_{\mathcal{I}_1} \le \alpha_1\tau$ made explicit
31
+ - edge-case safety and quality-multiplier boundedness added as runtime invariants
32
+ - Unicode code-point correction in sidecar token estimator
33
+ - $\chi$ calibration notice tied to tokenizer validation
34
+
35
+ ## 1. Hybrid Scoring
36
+
37
+ Each candidate returned by the vector store starts with a cosine similarity score
38
+ $\cos(q,d) \in [0,1]$ from embedding retrieval. The host then applies a hybrid
39
+ ranker:
40
+
41
+ $$
42
+ \mathrm{base}(d) =
43
+ \alpha \cdot \cos(q,d) +
44
+ \beta \cdot R(d) +
45
+ \gamma \cdot S(d)
46
+ $$
47
+
48
+ $$
49
+ \mathrm{score}(d) = \mathrm{base}(d) \cdot Q(d)
50
+ $$
51
+
52
+ where:
53
+
54
+ $$
55
+ R(d) = e^{-\lambda_s(d)\,\Delta t_d}
56
+ $$
57
+
58
+ $$
59
+ S(d)=
60
+ \begin{cases}
61
+ 1.0 & \text{if } d \text{ is from the active session} \\
62
+ 0.6 & \text{if } d \text{ is from durable user memory} \\
63
+ 0.3 & \text{if } d \text{ is from global memory}
64
+ \end{cases}
65
+ $$
66
+
67
+ $$
68
+ Q(d)=
69
+ \begin{cases}
70
+ 1 - \delta \cdot \mathrm{decay\_rate}(d) & \text{if } d \text{ is a summary} \\
71
+ 1 & \text{otherwise}
72
+ \end{cases}
73
+ $$
74
+
75
+ Implemented in [`src/scoring.ts`](../src/scoring.ts).
76
+
77
+ The current implementation defaults are:
78
+
79
+ - $\alpha = 0.7$
80
+ - $\beta = 0.2$
81
+ - $\gamma = 0.1$
82
+ - $\delta = 0.5$
83
+
84
+ The runtime enforces this convex-mixture contract by clamping weights into
85
+ $[0,1]$ and re-normalizing them onto a unit sum before scoring. This keeps the
86
+ base score on a stable scale and makes tuning interpretable: increasing one
87
+ weight means explicitly decreasing another.
88
+
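+ The following TypeScript sketch illustrates this contract. The names
+ (`HybridWeights`, `normalizeWeights`, `hybridScore`) are illustrative and do
+ not correspond to the actual exports of [`src/scoring.ts`](../src/scoring.ts).
+
+ ```ts
+ // Illustrative sketch of the Section 1 hybrid ranker; not the src/scoring.ts API.
+ interface HybridWeights { alpha: number; beta: number; gamma: number; delta: number }
+
+ const clamp01 = (x: number) => Math.min(1, Math.max(0, x));
+
+ // Clamp weights into [0,1] and re-normalize alpha + beta + gamma onto a unit sum.
+ function normalizeWeights(w: HybridWeights): HybridWeights {
+   const a = clamp01(w.alpha), b = clamp01(w.beta), g = clamp01(w.gamma);
+   const sum = a + b + g || 1; // guard the degenerate all-zero configuration
+   return { alpha: a / sum, beta: b / sum, gamma: g / sum, delta: clamp01(w.delta) };
+ }
+
+ function hybridScore(
+   cos: number,        // cosine similarity, clamped to [0,1] at the host boundary
+   recency: number,    // R(d) in (0,1]
+   scope: number,      // S(d) in {0.3, 0.6, 1.0}
+   decayRate: number,  // decay_rate(d) in [0,1]; 0 for non-summaries
+   isSummary: boolean,
+   w: HybridWeights,
+ ): number {
+   const { alpha, beta, gamma, delta } = normalizeWeights(w);
+   const base = alpha * clamp01(cos) + beta * recency + gamma * scope; // base(d)
+   const quality = isSummary ? 1 - delta * decayRate : 1;              // Q(d)
+   return base * quality;                                              // score(d) in [0,1]
+ }
+ ```
+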
89
+ **Note on retrieval similarity.** The term $\cos(q,d) \in [0,1]$ represents the
90
+ similarity score as bounded at the host ranking boundary. If the retrieval layer
91
+ surfaces a negative cosine-style score, the host clamps it to $0$ before applying
92
+ the section-1 hybrid ranker. The planned two-pass system in Section 7 uses raw
93
+ cosine similarity spanning $[-1,1]$ with negatives clipped explicitly. These are
94
+ described separately to avoid conflating current implementation with planned
95
+ architecture.
96
+
97
+ ### 1.1 Domain Constraints
98
+
99
+ The following parameter domains are required for all formulas in this section:
100
+
101
+ $$
102
+ \alpha, \beta, \gamma \in [0,1], \qquad \alpha + \beta + \gamma = 1
103
+ $$
104
+
105
+ $$
106
+ \delta \in [0,1]
107
+ $$
108
+
109
+ $$
110
+ \cos(q,d) \in [0,1], \qquad R(d) \in (0,1], \qquad S(d) \in \{0.3, 0.6, 1.0\}
111
+ $$
112
+
113
+ $$
114
+ \mathrm{decay\_rate}(d) \in [0,1]
115
+ $$
116
+
117
+ Under these assumptions, $\mathrm{base}(d)$ is a convex combination of
118
+ quantities in $[0,1]$, so:
119
+
120
+ $$
121
+ \mathrm{base}(d) \in [0,1]
122
+ $$
123
+
124
+ And since $\delta \in [0,1]$ and the decay rate is in $[0,1]$:
125
+
126
+ $$
127
+ Q(d) \in [1-\delta,\, 1] \subseteq [0,1]
128
+ $$
129
+
130
+ Therefore:
131
+
132
+ $$
133
+ \mathrm{score}(d) \in [0,1]
134
+ $$
135
+
136
+ ### 1.2 Boundary Cases
137
+
138
+ - $\alpha = 1$ collapses to semantic retrieval only.
139
+ - $\beta = 1$ collapses to pure recency preference.
140
+ - $\gamma = 1$ collapses to scope-only ranking and is almost always wrong
141
+ because it ignores content.
142
+ - $\delta = 0$ ignores summary quality completely.
143
+ - $\delta = 1$ applies the maximum configured penalty to low-confidence
144
+ summaries while preserving nonnegativity, because
145
+ the decay rate is in $[0,1]$, which guarantees $Q(d) \ge 0$.
146
+
147
+ ### 1.3 Note on $S(d)$ Values
148
+
149
+ The scope weights $\{1.0, 0.6, 0.3\}$ are empirically tuned constants, not
150
+ values derived from a normalized probability model. They are intentionally
151
+ stable across query types. At the default $\gamma = 0.1$, the maximum
152
+ contribution of $S(d)$ to $\mathrm{base}(d)$ is $0.1$, so miscalibration of
153
+ these values has bounded impact on the final score. Future work may replace
154
+ this step function with access-frequency priors derived from retrieval
155
+ telemetry.
156
+
157
+ ## 2. Recency Decay
158
+
159
+ Recency uses exponential decay:
160
+
161
+ $$
162
+ R(d) = e^{-\lambda_s \Delta t_d}
163
+ $$
164
+
165
+ where $\Delta t_d$ is the age of the record in seconds and $\lambda_s$ is the
166
+ scope-specific decay constant.
167
+
168
+ Implemented in [`src/scoring.ts`](../src/scoring.ts).
169
+
170
+ In the current implementation, $\Delta t_d$ is measured in **seconds**, not
171
+ milliseconds:
172
+
173
+ $$
174
+ \Delta t_d = \frac{\mathrm{Date.now()} - ts_d}{1000}
175
+ $$
176
+
177
+ and the $\lambda_s$ values are therefore **per-second** decay constants. The
178
+ product $\lambda_s \Delta t_d$ is dimensionless, as required by the exponential.
179
+
180
+ The current implementation uses different constants by scope:
181
+
182
+ - active session: $\lambda_s = 0.0001$
183
+ - durable user memory: $\lambda_s = 0.00001$
184
+ - global memory: $\lambda_s = 0.000002$
185
+
186
+ The implied half-lives make the decay constants auditable at a glance:
187
+
188
+ | Scope | $\lambda_s$ | Half-life |
189
+ |---|---|---|
190
+ | Session | $0.0001$ | $\approx 1.9\ \text{hours}$ |
191
+ | User | $0.00001$ | $\approx 19\ \text{hours}$ |
192
+ | Global | $0.000002$ | $\approx 4\ \text{days}$ |
193
+
194
+ $$
195
+ t_{1/2} = \frac{\ln 2}{\lambda_s}
196
+ $$
197
+
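+ As a minimal sketch, the per-scope recency term and its implied half-life
+ follow directly from the constants above; the scope keys used here are
+ illustrative, not the plugin's configuration schema.
+
+ ```ts
+ // Illustrative only; the shipped constants are configured alongside src/scoring.ts.
+ const LAMBDA_PER_SECOND = { session: 0.0001, user: 0.00001, global: 0.000002 } as const;
+
+ type Scope = keyof typeof LAMBDA_PER_SECOND;
+
+ // R(d) = exp(-lambda_s * ageSeconds), with age measured in seconds.
+ function recency(scope: Scope, tsMillis: number, nowMillis = Date.now()): number {
+   const ageSeconds = Math.max(0, nowMillis - tsMillis) / 1000;
+   return Math.exp(-LAMBDA_PER_SECOND[scope] * ageSeconds);
+ }
+
+ // t_half = ln(2) / lambda_s, e.g. about 6931 s (roughly 1.9 h) for the session scope.
+ const halfLifeSeconds = (scope: Scope) => Math.LN2 / LAMBDA_PER_SECOND[scope];
+ ```
+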
198
+ If those half-lives feel wrong for a given deployment, adjust $\lambda_s$ via
199
+ config — do not change the decay formula itself.
200
+
201
+ This makes session context fade fastest, user memory fade more slowly, and
202
+ global memory remain the most stable.
203
+
204
+ **Note on symbol disambiguation.** The symbol $\lambda_s$ here denotes the
205
+ scope-specific recency decay constant with units $\mathrm{s}^{-1}$. Section 7.3
206
+ uses $\lambda_r$ for a separate recency constant in the planned authority weight.
207
+ Section 7.7 uses $\eta_{\mathrm{hop}}$ for a dimensionless hop attenuation
208
+ factor. These three parameters are distinct and must not be substituted for each
209
+ other.
210
+
211
+ Why exponential instead of linear:
212
+
213
+ - exponential decay preserves ordering smoothly across many time scales
214
+ - it never goes negative
215
+ - it gives a natural "fast drop then long tail" shape for conversational relevance
216
+
217
+ Linear decay either imposes a hard cutoff or requires arbitrary clipping. Exponential decay
218
+ discounts old memories continuously without introducing a discontinuity.
219
+
220
+ ## 3. Token Budget Fitting
221
+
222
+ After ranking, the system performs greedy prompt packing.
223
+
224
+ Implemented in [`src/tokens.ts`](../src/tokens.ts).
225
+
226
+ Let candidates be sorted by final hybrid score:
227
+
228
+ $$
229
+ \mathrm{score}(d_1) \ge \mathrm{score}(d_2) \ge \dots \ge \mathrm{score}(d_n)
230
+ $$
231
+
232
+ and let $c_i$ be the estimated token cost of candidate $d_i$. The current host
233
+ token estimator is:
234
+
235
+ $$
236
+ \mathrm{estimateTokens}(t)=\left\lceil\frac{|t|}{\chi(t)}\right\rceil
237
+ $$
238
+
239
+ where:
240
+
241
+ $$
242
+ \chi(t)=
243
+ \begin{cases}
244
+ 1.6 & \text{for CJK scripts} \\
245
+ 2.5 & \text{for Cyrillic, Arabic, or Hebrew scripts} \\
246
+ 4.0 & \text{otherwise}
247
+ \end{cases}
248
+ $$
249
+
250
+ Given prompt budget $B$, the system selects the longest ranked prefix whose
251
+ cumulative cost fits:
252
+
253
+ $$
254
+ S = [d_1, d_2, \dots, d_m]
255
+ $$
256
+
257
+ such that:
258
+
259
+ $$
260
+ \sum_{i=1}^{m} c_i \le B
261
+ $$
262
+
263
+ and either $m=n$ or $\sum_{i=1}^{m+1} c_i > B$.
264
+
265
+ Greedy is optimal for this implementation because the ranking is already fixed.
266
+ The problem is not "find the best weighted subset under a knapsack objective";
267
+ it is "preserve rank order while honoring a hard prompt cap." Once rank order
268
+ is fixed, prefix acceptance is the correct policy.
269
+
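+ A compact sketch of the script-aware estimator and the greedy prefix packer
+ follows. The script detection here is deliberately simplified, and the names
+ (`charsPerToken`, `packPrefix`) are illustrative rather than the actual
+ [`src/tokens.ts`](../src/tokens.ts) API.
+
+ ```ts
+ // Simplified stand-in for the host estimator; chi(t) as a per-script ratio.
+ function charsPerToken(text: string): number {
+   if (/[\p{Script=Han}\p{Script=Hiragana}\p{Script=Katakana}\p{Script=Hangul}]/u.test(text)) return 1.6;
+   if (/[\p{Script=Cyrillic}\p{Script=Arabic}\p{Script=Hebrew}]/u.test(text)) return 2.5;
+   return 4.0;
+ }
+
+ function estimateTokens(text: string): number {
+   return Math.ceil(text.length / charsPerToken(text));
+ }
+
+ // Greedy prefix packing: candidates must already be sorted by descending score.
+ function packPrefix<T extends { text: string }>(ranked: T[], budget: number): T[] {
+   const selected: T[] = [];
+   let used = 0;
+   for (const candidate of ranked) {
+     const cost = estimateTokens(candidate.text);
+     if (used + cost > budget) break; // first overflow ends the accepted prefix
+     selected.push(candidate);
+     used += cost;
+   }
+   return selected;
+ }
+ ```
+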
270
+ **Note on estimator divergence.** The host estimator
271
+ ([`src/tokens.ts`](../src/tokens.ts)) is script-aware and is used for prompt
272
+ budget fitting. The sidecar estimator
273
+ ([`sidecar/compact/tokens.go`](../sidecar/compact/tokens.go)) uses a fixed
274
+ normalization rule:
275
+
276
+ $$
277
+ \widehat{T}_{sidecar}(t)=\max\!\left(\left\lfloor\frac{C(t)}{4}\right\rfloor,\, 1\right)
278
+ $$
279
+
280
+ where $C(t)$ is the Unicode code-point count of the string. The sidecar uses
281
+ `utf8.RuneCountInString()` rather than `len()`, because Go's `len()` returns
282
+ the UTF-8 byte length, not the code-point count; a CJK character occupies 3
283
+ bytes, so `len()` would produce a systematic over-count relative to the host
284
+ estimator's character-based ratios. The remaining divergence is bounded in
285
+ impact because the sidecar value appears only as a normalization denominator
286
+ in $P(t)$, never in prompt-budget arithmetic.
287
+
288
+ The two estimators are intentionally different. The host estimator optimizes
289
+ prompt-budget accuracy. The sidecar estimator is used only as a stable
290
+ normalization denominator in the technical specificity signal $P(t)$ of the
291
+ gating scalar. They must not be substituted for each other.
292
+
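+ The code-point distinction can be illustrated in TypeScript as well, where
+ `string.length` counts UTF-16 code units rather than code points; this is only
+ an analogue of the Go sidecar's `utf8.RuneCountInString`, not its source.
+
+ ```ts
+ // Code-point count, analogous to Go's utf8.RuneCountInString.
+ const codePointCount = (t: string) => [...t].length;
+
+ // Sidecar-style estimator: floor(codePoints / 4), never below 1.
+ const sidecarTokens = (t: string) => Math.max(Math.floor(codePointCount(t) / 4), 1);
+
+ // "你好世界" has 4 code points but 12 UTF-8 bytes; a byte-based count would
+ // systematically over-count relative to the host estimator's ratios.
+ ```
+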
293
+ **Note on $\chi$ calibration.** The ratios $\{1.6, 2.5, 4.0\}$ are validated
294
+ against GPT-4 family tokenizers. They should be re-validated against the
295
+ deployment tokenizer on a representative corpus sample whenever the tokenizer
296
+ changes; the validation script and its results should be committed alongside
297
+ this document.
298
+
299
+ ## 4. Matryoshka Cascade
300
+
301
+ For Nomic embeddings, one full vector $\vec{v} \in \mathbb{R}^{768}$ produces
302
+ three tiers via regularized normalization:
303
+
304
+ $$
305
+ \vec{u}_{64} = \frac{\vec{v}_{1:64}}{\sqrt{\lVert \vec{v}_{1:64} \rVert_2^2 + \varepsilon^2}}, \quad
306
+ \vec{u}_{256} = \frac{\vec{v}_{1:256}}{\sqrt{\lVert \vec{v}_{1:256} \rVert_2^2 + \varepsilon^2}}, \quad
307
+ \vec{u}_{768} = \frac{\vec{v}_{1:768}}{\sqrt{\lVert \vec{v}_{1:768} \rVert_2^2 + \varepsilon^2}}
308
+ $$
309
+
310
+ where $\varepsilon = 10^{-8}$.
311
+
312
+ Re-normalization is required after truncation because a prefix of a unit vector
313
+ is not itself a unit vector in general. The regularized denominator
314
+ $\sqrt{\lVert \vec{v}_{1:k} \rVert_2^2 + \varepsilon^2}$ is numerically
315
+ identical to the plain $L_2$ norm when the norm is large, and smoothly forces
316
+ $\vec{u}_k \to \vec{0}$ when the norm approaches zero rather than producing NaN
317
+ or amplifying floating-point noise. A near-zero-norm tier vector yields a cosine
318
+ score near zero, which falls below both early-exit thresholds and produces
319
+ automatic fall-through to the next tier.
320
+
321
+ **Note on approximate unit normalization.** For any nonzero prefix with
322
+ $\varepsilon > 0$:
323
+
324
+ $$
325
+ \lVert \vec{u}_k \rVert_2
326
+ = \frac{\lVert \vec{v}_{1:k} \rVert_2}{\sqrt{\lVert \vec{v}_{1:k} \rVert_2^2 + \varepsilon^2}}
327
+ < 1
328
+ $$
329
+
330
+ So regularized prefix vectors are **approximately** unit-normalized. The
331
+ approximation becomes negligible when the prefix norm is large relative to
332
+ $\varepsilon$; with $\varepsilon = 10^{-8}$ and ordinary float32 prefix norms
333
+ this difference is not operationally significant, but the distinction matters
334
+ for formal correctness.
335
+
336
+ Implemented in [`sidecar/embed/matryoshka.go`](../sidecar/embed/matryoshka.go)
337
+ and [`sidecar/store/libravdb.go`](../sidecar/store/libravdb.go).
338
+
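+ A sketch of the regularized prefix normalization on plain `number[]` vectors
+ (the actual implementation is Go; this TypeScript form only mirrors the math):
+
+ ```ts
+ const EPSILON = 1e-8;
+
+ // Truncate to the first k dimensions and re-normalize with an epsilon-guarded denominator.
+ function matryoshkaPrefix(v: number[], k: number): number[] {
+   const prefix = v.slice(0, k);
+   const normSq = prefix.reduce((acc, x) => acc + x * x, 0);
+   const denom = Math.sqrt(normSq + EPSILON * EPSILON);
+   // A near-zero-norm prefix maps smoothly toward the zero vector instead of producing NaN.
+   return prefix.map((x) => x / denom);
+ }
+ ```
+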
339
+ Cascade search uses:
340
+
341
+ - L1: `64d` with early-exit threshold $\theta_{L1} = 0.65$
342
+ - L2: `256d` with early-exit threshold $\theta_{L2} = 0.75$
343
+ - L3: `768d`
344
+
345
+ These thresholds are calibrated on held-out cosine rank correlation with the
346
+ 768d ground truth for the chosen embedding model. They control the
347
+ precision/recall tradeoff of the cascade and are not required to preserve exact
348
+ ranking — rank preservation at reduced dimension is approximate by design of
349
+ Matryoshka prefix embeddings, not a mathematical guarantee. The L1 and L2 tiers
350
+ function as recall-oriented coarse filters; the false-positive rate at each tier
351
+ is an explicit design parameter controlled by $\theta_{L1}$ and $\theta_{L2}$.
352
+ If the embedding model changes, both thresholds must be re-derived from the new
353
+ model's ROC curve against 768d ground truth.
354
+
355
+ The search exits early when a tier's best score exceeds the configured threshold;
356
+ otherwise it falls through to the next tier. Empty lower-tier collections
357
+ degrade gracefully because the best score over an empty tier is defined as zero:
358
+
359
+ $$
360
+ \max(\emptyset) := 0
361
+ $$
362
+
363
+ and $0$ is below both early-exit thresholds by design.
364
+
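+ A sketch of the tier fall-through logic, assuming a hypothetical per-tier
+ lookup `searchTier`; the real cascade lives in the sidecar store.
+
+ ```ts
+ interface Hit { id: string; score: number }
+
+ const THETA_L1 = 0.65;
+ const THETA_L2 = 0.75;
+
+ // searchTier stands in for the per-tier ANN lookup in sidecar/store/libravdb.go.
+ async function cascadeSearch(
+   searchTier: (dims: 64 | 256 | 768) => Promise<Hit[]>,
+ ): Promise<Hit[]> {
+   const l1 = await searchTier(64);
+   const bestL1 = l1.reduce((m, h) => Math.max(m, h.score), 0); // empty tier -> 0
+   if (bestL1 >= THETA_L1) return l1;
+
+   const l2 = await searchTier(256);
+   const bestL2 = l2.reduce((m, h) => Math.max(m, h.score), 0);
+   if (bestL2 >= THETA_L2) return l2;
+
+   return searchTier(768); // L3 is the source of truth
+ }
+ ```
+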
365
+ Backfill condition:
366
+
367
+ - L3 is the source of truth
368
+ - L1 and L2 are derived caches
369
+ - if an L1 or L2 insert fails, a dirty-tier marker is recorded
370
+ - startup backfill reconstructs the missing tier vector from L3
371
+
372
+ **Note on $\varepsilon$ calibration.** The value $\varepsilon = 10^{-8}$ is
373
+ appropriate for float32 embeddings where pathological near-zero norms are
374
+ numerical artifacts. If the embedding model changes, verify that near-zero norms
375
+ in the new model are indeed artifacts and not meaningful signal before retaining
376
+ this value.
377
+
378
+ ## 5. Compaction Clustering
379
+
380
+ Compaction groups raw session turns into deterministic chronological clusters
381
+ and replaces each cluster with one summary record. The intent is to turn many
382
+ highly local turns into fewer retrieval-worthy summaries.
383
+
384
+ Implemented in [`sidecar/compact/summarize.go`](../sidecar/compact/summarize.go).
385
+
386
+ The current algorithm is not semantic k-means. It is deterministic chronological
387
+ partitioning:
388
+
389
+ 1. collect eligible non-summary turns
390
+ 2. sort them by `(ts, id)`
391
+ 3. choose target cluster size $k$
392
+ 4. normalize the requested target cluster size:
393
+
394
+ Non-positive runtime inputs are normalized to the shipped default
395
+ $k = 20$ before clustering. After normalization, the effective target size must
396
+ satisfy $k \ge 1$.
397
+
398
+ 5. derive cluster count:
399
+
400
+ Let $n$ be the number of eligible turns. The cluster count is:
401
+
402
+ $$
403
+ c = \left\lceil \frac{\max(n,\,1)}{k} \right\rceil
404
+ $$
405
+
406
+ 6. assign turn $i$ to cluster:
407
+
408
+ $$
409
+ \mathrm{clusterIndex}(i) = \left\lfloor \frac{i \cdot c}{\max(n,\,1)} \right\rfloor
410
+ $$
411
+
412
+ The $\max(n, 1)$ guards prevent division by zero when $n = 0$. When $n \ge 1$,
413
+ these are identical to the unguarded forms $\lceil n/k \rceil$ and
414
+ $\lfloor (i \cdot c)/n \rfloor$.
415
+
416
+ When $n < k$, the formula produces $c = 1$ and all turns map to cluster 0: a
417
+ single cluster containing fewer turns than the target size. Single-member
418
+ clusters should be tagged with method `trivial` so that downstream consumers can
419
+ apply a different quality interpretation if needed.
420
+
421
+ This yields contiguous chronological buckets of roughly equal size while
422
+ avoiding nondeterministic clustering behavior.
423
+
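+ A sketch of the guarded cluster-count and assignment arithmetic (the real
+ implementation is Go in `sidecar/compact/summarize.go`; this only mirrors the math):
+
+ ```ts
+ const DEFAULT_CLUSTER_SIZE = 20;
+
+ // Returns the cluster index for each of n chronologically sorted turns.
+ function assignClusters(n: number, requestedK: number): number[] {
+   const k = requestedK > 0 ? requestedK : DEFAULT_CLUSTER_SIZE; // normalize non-positive k
+   const safeN = Math.max(n, 1);                                  // guard n = 0
+   const c = Math.ceil(safeN / k);                                // cluster count
+   const assignment: number[] = [];
+   for (let i = 0; i < n; i++) {
+     assignment.push(Math.floor((i * c) / safeN)); // contiguous chronological buckets
+   }
+   return assignment;
+ }
+ ```
+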
424
+ The summarizer input for cluster $C_j$ is the ordered turn sequence:
425
+
426
+ $$
427
+ C_j = [t_1, t_2, \dots, t_m]
428
+ $$
429
+
430
+ with each element carrying turn id and text.
431
+
432
+ The output is a summary record $s(C_j)$ with:
433
+
434
+ - summary text
435
+ - source ids
436
+ - confidence
437
+ - method
438
+ - `decay_rate = 1 - confidence`
439
+
440
+ Implemented across [`sidecar/compact/summarize.go`](../sidecar/compact/summarize.go),
441
+ [`sidecar/summarize/engine.go`](../sidecar/summarize/engine.go), and
442
+ [`sidecar/summarize/onnx_local.go`](../sidecar/summarize/onnx_local.go).
443
+
444
+ For the first real-model benchmark pass comparing raw T5 confidence against
445
+ Nomic-space preservation metrics and the hard preservation gate, see
446
+ [`compaction-evaluation.md`](./compaction-evaluation.md).
447
+
448
+ ### 5.1 Semiotic Mismatch
449
+
450
+ The system uses:
451
+
452
+ - T5-small as an optional local abstractive decoder
453
+ - Nomic `nomic-embed-text-v1.5` as the canonical retrieval embedding space
454
+
455
+ Those models do not measure the same thing.
456
+
457
+ The raw T5 confidence term is:
458
+
459
+ $$
460
+ \mathrm{conf}_{\mathrm{t5}}(s, C_j) =
461
+ \exp\!\left(\frac{1}{m}\sum_{i=1}^{m}\log p(x_i \mid x_{<i}, C_j)\right)
462
+ $$
463
+
464
+ where $x_i$ are generated summary tokens. This measures decoder
465
+ self-consistency, not geometric preservation in the vector space used later for
466
+ retrieval.
467
+
468
+ So a T5 summary can be locally confident while still drifting away from the
469
+ source cluster in Nomic space.
470
+
471
+ ### 5.2 Nomic-Space Preservation
472
+
473
+ Let the embedding function be:
474
+
475
+ $$
476
+ E : \text{text} \to \mathbb{R}^d
477
+ $$
478
+
479
+ For a source cluster $C_j = \langle t_1, \dots, t_n \rangle$, define:
480
+
481
+ $$
482
+ v_i = E(t_i)
483
+ $$
484
+
485
+ $$
486
+ \mu_C = \frac{1}{n}\sum_{i=1}^{n} v_i
487
+ $$
488
+
489
+ $$
490
+ v_s = E(s)
491
+ $$
492
+
493
+ Cosine similarity renormalizes vectors at comparison time, so $\mu_C$
494
+ does not need separate unit normalization in the definitions below.
495
+
496
+ The primary preservation term is centroid alignment:
497
+
498
+ $$
499
+ Q_{\mathrm{align}}(s, C_j) = \cos(v_s, \mu_C)
500
+ $$
501
+
502
+ The secondary preservation term is average positive source coverage:
503
+
504
+ $$
505
+ Q_{\mathrm{cover}}(s, C_j) =
506
+ \frac{1}{n}\sum_{i=1}^{n}\max(0, \cos(v_s, v_i))
507
+ $$
508
+
509
+ The Nomic-space confidence term is then:
510
+
511
+ $$
512
+ \mathrm{conf}_{\mathrm{nomic}}(s, C_j) =
513
+ \max\!\left(0,\;\min\!\left(1,\;\frac{Q_{\mathrm{align}} + Q_{\mathrm{cover}}}{2}\right)\right)
514
+ $$
515
+
516
+ This is the canonical compaction quality signal because it is defined in the
517
+ same geometric space the vector store uses at retrieval time.
518
+
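+ A sketch of the preservation terms over plain embedding vectors; names are
+ illustrative and the source cluster is assumed non-empty.
+
+ ```ts
+ const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0);
+ const norm = (a: number[]) => Math.sqrt(dot(a, a));
+ const cosine = (a: number[], b: number[]) => {
+   const d = norm(a) * norm(b);
+   return d === 0 ? 0 : dot(a, b) / d;
+ };
+
+ // Unnormalized centroid of the source turn embeddings; cosine renormalizes at comparison time.
+ function centroid(vectors: number[][]): number[] {
+   const mu = new Array(vectors[0].length).fill(0);
+   for (const v of vectors) v.forEach((x, i) => (mu[i] += x / vectors.length));
+   return mu;
+ }
+
+ // conf_nomic = clamp((Q_align + Q_cover) / 2, 0, 1) for a non-empty source cluster.
+ function confNomic(summaryVec: number[], sourceVecs: number[][]): number {
+   const qAlign = cosine(summaryVec, centroid(sourceVecs));
+   const qCover =
+     sourceVecs.reduce((s, v) => s + Math.max(0, cosine(summaryVec, v)), 0) / sourceVecs.length;
+   return Math.max(0, Math.min(1, (qAlign + qCover) / 2));
+ }
+ ```
+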
519
+ ### 5.3 Preservation Gate
520
+
521
+ Before an abstractive T5 summary is accepted, it must pass a hard preservation
522
+ gate:
523
+
524
+ $$
525
+ Q_{\mathrm{align}}(s, C_j) \ge \tau_{\mathrm{preserve}}
526
+ $$
527
+
528
+ with the shipped default:
529
+
530
+ $$
531
+ \tau_{\mathrm{preserve}} = 0.65
532
+ $$
533
+
534
+ If the abstractive summary fails this test, the system rejects it and falls back
535
+ to deterministic extractive compaction.
536
+
537
+ This means the decoder may propose a summary, but Nomic-space preservation
538
+ decides whether it is faithful enough to become memory.
539
+
540
+ ### 5.4 Final Confidence
541
+
542
+ For extractive summaries, the final stored confidence is:
543
+
544
+ $$
545
+ \mathrm{confidence}(s) = \mathrm{conf}_{\mathrm{nomic}}(s, C_j)
546
+ $$
547
+
548
+ For accepted abstractive T5 summaries, the final stored confidence is a
549
+ Nomic-heavy hybrid:
550
+
551
+ $$
552
+ \mathrm{confidence}(s) =
553
+ \lambda \cdot \mathrm{conf}_{\mathrm{nomic}}(s, C_j)
554
+ + (1-\lambda)\cdot \mathrm{conf}_{\mathrm{t5}}(s, C_j)
555
+ $$
556
+
557
+ with the shipped default:
558
+
559
+ $$
560
+ \lambda = 0.8
561
+ $$
562
+
563
+ So Nomic-space preservation remains the dominant term, while T5 decoder
564
+ confidence contributes only auxiliary stability information.
565
+
566
+ Therefore:
567
+
568
+ $$
569
+ \mathrm{confidence}(s) \in [0,1]
570
+ $$
571
+
572
+ for all valid inputs, because both $\mathrm{conf}_{\mathrm{nomic}}$ and
573
+ $\mathrm{conf}_{\mathrm{t5}}$ are bounded in $[0,1]$ and the hybrid is a convex
574
+ combination.
575
+
576
+ ### 5.5 Retrieval Decay Multiplier
577
+
578
+ The retrieval decay metadata is then:
579
+
580
+ $$
581
+ \mathrm{decay\_rate}(s) = 1 - \mathrm{confidence}(s)
582
+ $$
583
+
584
+ and the retrieval quality multiplier from Section 1 becomes:
585
+
586
+ $$
587
+ Q(s) = 1 - \delta \cdot \mathrm{decay\_rate}(s)
588
+ $$
589
+
590
+ Given $\delta \in [0,1]$ and $\mathrm{confidence}(s) \in [0,1]$, the decay rate
591
+ is in $[0,1]$ and therefore:
592
+
593
+ $$
594
+ Q(s) \in [1-\delta,\, 1] \subseteq [0,1]
595
+ $$
596
+
597
+ At the shipped default $\delta = 0.5$, this constrains summary quality weights
598
+ to:
599
+
600
+ $$
601
+ Q(s) \in [0.5,\, 1.0]
602
+ $$
603
+
604
+ This makes compaction load-bearing in retrieval rather than archival only.
605
+
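+ Pulling Sections 5.3 through 5.5 together, the following sketch shows how an
+ accepted summary's stored confidence becomes a retrieval multiplier. The
+ constant and field names are illustrative.
+
+ ```ts
+ const TAU_PRESERVE = 0.65; // hard preservation gate on Q_align
+ const LAMBDA_MIX = 0.8;    // Nomic-heavy hybrid weight
+ const DELTA = 0.5;         // retrieval quality penalty weight
+
+ interface SummaryCandidate {
+   qAlign: number;    // Q_align(s, C_j)
+   confNomic: number; // conf_nomic(s, C_j)
+   confT5?: number;   // conf_t5(s, C_j); present only for abstractive summaries
+ }
+
+ // Returns the stored confidence, or null when the abstractive gate rejects the summary
+ // and the system must fall back to deterministic extractive compaction.
+ function finalConfidence(s: SummaryCandidate): number | null {
+   const abstractive = s.confT5 !== undefined;
+   if (abstractive && s.qAlign < TAU_PRESERVE) return null;
+   return abstractive
+     ? LAMBDA_MIX * s.confNomic + (1 - LAMBDA_MIX) * (s.confT5 as number)
+     : s.confNomic;
+ }
+
+ // decay_rate = 1 - confidence; Q(s) = 1 - delta * decay_rate, so Q(s) lies in [1 - delta, 1].
+ const qualityMultiplier = (confidence: number) => 1 - DELTA * (1 - confidence);
+ ```
+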
606
+ ## 6. Why These Pieces Compose
607
+
608
+ The full quality loop is:
609
+
610
+ $$
611
+ \text{high-value turns}
612
+ \rightarrow \text{better clusters}
613
+ \rightarrow \text{higher summary confidence}
614
+ \rightarrow \text{lower decay rate}
615
+ \rightarrow \text{higher retrieval score}
616
+ $$
617
+
618
+ That is the system-level reason the math is distributed across ingestion,
619
+ compaction, and retrieval instead of existing only in one scoring function.
620
+
621
+ For rigor, this section should be read in two parts:
622
+
623
+ - The upstream step
624
+ `high-value turns -> better clusters -> higher summary confidence`
625
+ is an engineering hypothesis supported by preservation metrics and empirical
626
+ calibration evidence. It is not a pure algebraic proof obligation because it
627
+ depends on learned-model behavior.
628
+ - The downstream step
629
+ `higher summary confidence -> lower decay rate -> higher retrieval score`
630
+ is a formal and implementation-correspondence obligation. It follows from:
631
+
632
+ $$
633
+ \mathrm{decay\_rate}(s) = 1 - \mathrm{confidence}(s)
634
+ $$
635
+
636
+ and
637
+
638
+ $$
639
+ Q(s) = 1 - \delta \cdot \mathrm{decay\_rate}(s),
640
+ \qquad
641
+ S_{\mathrm{final}}(s) = S_{\mathrm{base}}(s) \cdot Q(s)
642
+ $$
643
+
644
+ Under equal base score $S_{\mathrm{base}}$ and fixed $\delta \in [0,1]$,
645
+ higher confidence implies lower decay, larger $Q(s)$, and therefore a larger
646
+ final retrieval score. This downstream monotonic composition is the part that
647
+ must be locked by exact code-level tests before later retrieval architecture
648
+ work proceeds.
649
+
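+ For example, with equal base score $S_{\mathrm{base}} = 0.8$ and the default
+ $\delta = 0.5$: a summary with confidence $0.9$ has decay rate $0.1$,
+ $Q = 0.95$, and final score $0.76$, while a summary with confidence $0.5$ has
+ decay rate $0.5$, $Q = 0.75$, and final score $0.60$.
+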
650
+ ## 7. Two-Pass Discovery Scoring
651
+
652
+ This section documents the reviewed scoring and assembly model for the
653
+ two-pass retrieval system. Parts of this section are now implemented in
654
+ [`src/scoring.ts`](../src/scoring.ts),
655
+ [`src/context-engine.ts`](../src/context-engine.ts),
656
+ [`src/continuity.ts`](../src/continuity.ts), and the sidecar store/RPC
657
+ adapter. Remaining unimplemented or approximate pieces should be treated as
658
+ explicit follow-on work, not as permission to relax the mathematical contract.
659
+
660
+ The design goal is to separate:
661
+
662
+ 1. invariant documents that must always be present
663
+ 2. cheap discovery over variant documents
664
+ 3. selective second-pass expansion under a hard prompt budget
665
+
666
+ ### 7.1 Foundational Definitions
667
+
668
+ Let the retrievable document corpus be:
669
+
670
+ $$
671
+ \mathbf{D}=\{d_1, d_2, \ldots, d_n\}
672
+ $$
673
+
674
+ and let the query space be $\mathbf{Q}$.
675
+
676
+ Let the embedding function:
677
+
678
+ $$
679
+ \varphi : \mathbf{D}\cup\mathbf{Q}\rightarrow \mathbb{R}^m
680
+ $$
681
+
682
+ map documents and queries to unit vectors:
683
+
684
+ $$
685
+ \lVert \varphi(x) \rVert_2 = 1 \qquad \forall x \in \mathbf{D}\cup\mathbf{Q}
686
+ $$
687
+
688
+ The gating function is:
689
+
690
+ $$
691
+ G : \mathbf{Q}\times\mathbf{D}\rightarrow \{0,1\}
692
+ $$
693
+
694
+ and determines whether a document is injected for a query.
695
+
696
+ ### 7.2 Corpus Decomposition
697
+
698
+ The reviewed AST partitioning model in [`ast-v2.md`](./ast-v2.md) refines the
699
+ older binary invariant-or-variant split into three authored tiers plus a
700
+ continuity carve-out inside the retrievable variant corpus.
701
+
702
+ The authored corpus is partitioned into hard invariants, soft invariants, and
703
+ variant memory:
704
+
705
+ $$
706
+ \mathbf{D} = \mathcal{I}_1\cup\mathcal{I}_2\cup\mathcal{V},
707
+ \qquad
708
+ \mathcal{I}_1\cap\mathcal{I}_2=\mathcal{I}_1\cap\mathcal{V}=\mathcal{I}_2\cap\mathcal{V}=\emptyset
709
+ $$
710
+
711
+ The tier membership predicate is:
712
+
713
+ $$
714
+ \iota : \mathbf{D}\rightarrow \{0,1,2\}
715
+ $$
716
+
717
+ with:
718
+
719
+ $$
720
+ \mathcal{I}_1 = \{d\in\mathbf{D}\mid \iota(d)=1\}
721
+ $$
722
+
723
+ and:
724
+
725
+ $$
726
+ \mathcal{I}_2 = \{d\in\mathbf{D}\mid \iota(d)=2\}
727
+ \qquad
728
+ \mathcal{V} = \{d\in\mathbf{D}\mid \iota(d)=0\}
729
+ $$
730
+
731
+ Here:
732
+
733
+ - $\mathcal{I}_1$ is the hard invariant set, injected exactly and never
734
+ truncated
735
+ - $\mathcal{I}_2$ is the soft invariant sequence, injected by longest-prefix
736
+ truncation in authored order
737
+ - $\mathcal{V}$ is the retrievable variant corpus
738
+
739
+ For OpenClaw, the intended implementation is that authored documents such as
740
+ `AGENTS.md` and `souls.md` are compiled into $\mathcal{I}_1$, $\mathcal{I}_2$,
741
+ and $\mathcal{V}$ at load time rather than discovered monolithically at query
742
+ time.
743
+
744
+ The hard authored guarantee is:
745
+
746
+ $$
747
+ \iota(d)=1 \Rightarrow G(q,d)=1 \qquad \forall q\in\mathbf{Q}
748
+ $$
749
+
750
+ Soft invariants are also authored constants, but unlike $\mathcal{I}_1$ they
751
+ are budget-elastic. Let the authored order on $\mathcal{I}_2$ be:
752
+
753
+ $$
754
+ \mathcal{I}_2=\langle d^{(2)}_1,d^{(2)}_2,\dots,d^{(2)}_m\rangle
755
+ $$
756
+
757
+ and define the longest-prefix operator:
758
+
759
+ $$
760
+ \mathrm{Pref}(\mathcal{I}_2;\,b)=\langle d^{(2)}_1,\dots,d^{(2)}_j\rangle
761
+ $$
762
+
763
+ where:
764
+
765
+ $$
766
+ j=\max\left\{r\in\{0,\dots,m\}\ \middle|\ \sum_{i=1}^{r}\mathrm{toks}(d^{(2)}_i)\le b\right\}
767
+ $$
768
+
769
+ When continuity is enabled, the runtime further refines the variant corpus into
770
+ an exact recent raw suffix and the remaining retrievable variant set:
771
+
772
+ $$
773
+ \mathcal{V}=T_{\mathrm{recent}}\cup\mathcal{V}_{\mathrm{rest}},
774
+ \qquad
775
+ T_{\mathrm{recent}}\cap\mathcal{V}_{\mathrm{rest}}=\emptyset
776
+ $$
777
+
778
+ Only $\mathcal{V}_{\mathrm{rest}}$ participates in semantic retrieval. The
779
+ recent tail is preserved exactly and budgeted separately.
780
+
781
+ ### 7.3 Document Authority Weight
782
+
783
+ Each retrievable variant document carries a precomputed authority weight:
784
+
785
+ $$
786
+ \omega(d)=\alpha_r\cdot r(d)+\alpha_f\cdot f(d)+\alpha_a\cdot a(d)
787
+ $$
788
+
789
+ with:
790
+
791
+ $$
792
+ \alpha_r+\alpha_f+\alpha_a=1, \qquad \alpha_r,\alpha_f,\alpha_a \in [0,1]
793
+ $$
794
+
795
+ where:
796
+
797
+ $$
798
+ r(d)=\exp\!\left(-\lambda_r\cdot \Delta t(d)\right)
799
+ $$
800
+
801
+ $$
802
+ f(d)=\frac{\log(1+\mathrm{acc}(d))}{\log\!\left(1+\max_{d'\in\mathcal{V}_{\mathrm{rest}}}\mathrm{acc}(d')+1\right)}
803
+ $$
804
+
805
+ $$
806
+ a(d)\in[0,1]
807
+ $$
808
+
809
+ Here $\lambda_r > 0$ is the recency decay constant with units $\mathrm{s}^{-1}$,
810
+ and $\Delta t(d) \ge 0$ is document age in seconds.
811
+
812
+ The extra $+1$ in the denominator of $f(d)$, absent from the numerator, implements minimal
813
+ additive smoothing that guarantees a defined value at cold start. The asymmetry
814
+ is deliberate: a document with zero accesses should score $f(d) = 0$ exactly,
815
+ which the unsmoothed numerator preserves. When
816
+ $\max_{d'\in\mathcal{V}_{\mathrm{rest}}}\mathrm{acc}(d') = 0$, the denominator equals $\log 2$
817
+ and:
818
+
819
+ $$
820
+ f(d) = 0 \qquad \forall d\in\mathcal{V}_{\mathrm{rest}}
821
+ $$
822
+
823
+ cleanly deferring frequency weight to $r(d)$ and $a(d)$ until access history
824
+ accumulates.
825
+
826
+ Because $r(d)\in(0,1]$, $f(d)\in[0,1]$, and $a(d)\in[0,1]$, and $\omega(d)$
827
+ is a convex combination of these terms:
828
+
829
+ $$
830
+ \omega(d)\in[0,1]
831
+ $$
832
+
833
+ For variant nodes extracted from core authored identity documents,
834
+ [`ast-v2.md`](./ast-v2.md) sets $a(d)=1.0$. This lets the planned discovery
835
+ score incorporate recency, access frequency, and authored authority without
836
+ baking those concerns into the raw cosine term.
837
+
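+ A sketch of the authority weight with the cold-start-smoothed frequency term.
+ The mixing weights and $\lambda_r$ value below are assumptions for
+ illustration, not shipped defaults.
+
+ ```ts
+ interface VariantDoc { ageSeconds: number; accessCount: number; authoredAuthority: number }
+
+ const ALPHA_R = 0.5, ALPHA_F = 0.3, ALPHA_A = 0.2; // assumed convex weights summing to 1
+ const LAMBDA_R = 1e-5;                              // assumed per-second recency constant
+
+ function authorityWeight(d: VariantDoc, maxAccess: number): number {
+   const r = Math.exp(-LAMBDA_R * d.ageSeconds);
+   // The extra +1 in the denominator keeps f defined when maxAccess = 0 (cold start).
+   const f = Math.log(1 + d.accessCount) / Math.log(1 + maxAccess + 1);
+   const a = Math.min(1, Math.max(0, d.authoredAuthority));
+   return ALPHA_R * r + ALPHA_F * f + ALPHA_A * a; // omega(d) in [0,1]
+ }
+ ```
+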
838
+ ### 7.4 Pass 1: Coarse Semantic Filtering
839
+
840
+ Pass 1 computes cosine similarity:
841
+
842
+ $$
843
+ \mathrm{sim}(q,d)=\varphi(q)^\top \varphi(d) \in [-1,1]
844
+ $$
845
+
846
+ The raw top-$k_1$ candidate set is:
847
+
848
+ $$
849
+ \mathcal{C}_1^{\mathrm{raw}}(q)=\mathrm{TopK}_{d\in\mathcal{V}_{\mathrm{rest}}}\!\left(k_1,\,\mathrm{sim}(q,d)\right)
850
+ $$
851
+
852
+ with filtered coarse set:
853
+
854
+ $$
855
+ \mathcal{C}_1(q)=\left\{d\in\mathcal{C}_1^{\mathrm{raw}}(q)\mid \mathrm{sim}(q,d)\ge \theta_1\right\}
856
+ $$
857
+
858
+ where $\theta_1\in[-1,1]$.
859
+
860
+ The purpose of this pass is breadth with cheap semantic recall. Documents below
861
+ $\theta_1$ are rejected even if they land in the top-$k_1$ set, because the
862
+ first pass must not admit semantically orthogonal noise into second-pass work.
863
+
864
+ ### 7.5 Pass 2: Normalized Hybrid Scoring
865
+
866
+ Let the query keyword extractor return:
867
+
868
+ $$
869
+ K = \mathrm{KeyExt}(q)
870
+ $$
871
+
872
+ and define normalized keyword coverage:
873
+
874
+ $$
875
+ M_{norm}(K,d)=\frac{|K\cap \mathrm{terms}(d)|}{\max(|K|,\,1)}\in[0,1]
876
+ $$
877
+
878
+ When $|K| > 0$ this is identical to $|K\cap \mathrm{terms}(d)| / |K|$. When
879
+ $|K| = 0$ (the query yields no extractable keywords), the numerator is zero and
880
+ $M_{norm} = 0$ exactly, collapsing the second-pass score to pure semantic
881
+ retrieval — the correct degenerate behavior.
882
+
883
+ The proposed normalized second-pass score is:
884
+
885
+ $$
886
+ S_{final}(d)=
887
+ \frac{
888
+ \omega(d)\cdot\max(\mathrm{sim}(q,d),\,0)\cdot\left(1+\kappa\cdot M_{norm}(K,d)\right)
889
+ }{
890
+ 1+\kappa
891
+ }
892
+ $$
893
+
894
+ where $\kappa\in[0,\infty)$.
895
+
896
+ The normalized second-pass score form above was suggested during design review
897
+ by GitHub contributor [@JuanHuaXu](https://github.com/JuanHuaXu). The broader
898
+ two-pass architecture in this section remains project-authored.
899
+
900
+ This form is preferred over a hard clamp such as $\min(\mathrm{term},1)$
901
+ because clamping discards ranking information at the high end of the score
902
+ distribution. The denominator $(1+\kappa)$ gives an analytic bound instead of
903
+ truncating the result.
904
+
905
+ The second-pass candidate set is:
906
+
907
+ $$
908
+ \mathcal{C}_2(q)=\mathrm{TopK}_{d\in\mathcal{C}_1(q)}\!\left(k_2,\,S_{final}(d)\right)
909
+ $$
910
+
911
+ with $k_2 \le k_1$ and $k_1, k_2 \in \mathbb{Z}_{>0}$.
912
+
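+ A sketch of the normalized second-pass score. The keyword-coverage helper here
+ is a simple stand-in for `KeyExt` plus term matching; all names are
+ illustrative.
+
+ ```ts
+ const KAPPA = 0.3;
+
+ // M_norm(K, d) with the |K| = 0 guard.
+ function keywordCoverage(keywords: Set<string>, docTerms: Set<string>): number {
+   let hits = 0;
+   for (const k of keywords) if (docTerms.has(k)) hits++;
+   return hits / Math.max(keywords.size, 1);
+ }
+
+ // S_final(d) = omega * max(sim, 0) * (1 + kappa * M_norm) / (1 + kappa), bounded by omega.
+ function secondPassScore(omega: number, sim: number, mNorm: number, kappa = KAPPA): number {
+   const s = Math.max(sim, 0); // clip negative cosine similarity
+   return (omega * s * (1 + kappa * mNorm)) / (1 + kappa);
+ }
+ ```
+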
913
+ ### 7.6 Bounded Range and Interpretation of $\kappa$
914
+
915
+ Let:
916
+
917
+ $$
918
+ s=\max(\mathrm{sim}(q,d),\,0)\in[0,1]
919
+ $$
920
+
921
+ Then:
922
+
923
+ $$
924
+ S_{final}(d)=\frac{\omega(d)\cdot s\cdot(1+\kappa M_{norm}(K,d))}{1+\kappa}
925
+ $$
926
+
927
+ Because $M_{norm}(K,d)\in[0,1]$ and $\kappa\ge 0$:
928
+
929
+ $$
930
+ 1 \le 1+\kappa M_{norm}(K,d) \le 1+\kappa
931
+ $$
932
+
933
+ so:
934
+
935
+ $$
936
+ 0 \le \frac{1+\kappa M_{norm}(K,d)}{1+\kappa} \le 1
937
+ $$
938
+
939
+ Combining with $s\in[0,1]$ and $\omega(d)\in[0,1]$:
940
+
941
+ $$
942
+ 0 \le S_{final}(d)\le \omega(d)\le 1
943
+ $$
944
+
945
+ This yields a clean interpretation of $\kappa$:
946
+
947
+ - $\kappa = 0$ gives pure semantic retrieval
948
+ - $\kappa = 0.5$ allows keyword coverage to provide up to a one-third relative
949
+ boost before normalization
950
+ - $\kappa = 1.0$ makes full lexical support restore the pure semantic ceiling
951
+ while penalizing semantic-only matches with no keyword support
952
+
953
+ A reasonable initial experiment value is:
954
+
955
+ $$
956
+ \kappa = 0.3
957
+ $$
958
+
959
+ ### 7.7 Multi-Hop Expansion
960
+
961
+ Let the authored hop graph be:
962
+
963
+ $$
964
+ \mathcal{G}=(\mathbf{D},\, E)
965
+ $$
966
+
967
+ where edges are registered in document metadata at authorship time.
968
+
969
+ For a document $d$, define its hop neighborhood:
970
+
971
+ $$
972
+ H(d)=\{d'\in\mathbf{D}\mid (d,d')\in E\}
973
+ $$
974
+
975
+ The hop expansion set is:
976
+
977
+ $$
978
+ \mathcal{C}_{hop}(q)=\bigcup_{d\in\mathcal{C}_2(q)} H(d)\setminus\mathcal{C}_2(q)
979
+ $$
980
+
981
+ Each hop candidate inherits a decayed score from its best parent:
982
+
983
+ $$
984
+ S_{hop}(d')=
985
+ \eta_{\mathrm{hop}}\cdot
986
+ \max_{d\in\mathcal{C}_2(q),\; d'\in H(d)} S_{final}(d)
987
+ $$
988
+
989
+ with hop decay factor $\eta_{\mathrm{hop}}\in(0,1)$.
990
+
991
+ **Note on symbol disambiguation.** The symbol $\eta_{\mathrm{hop}}$ is used
992
+ here deliberately to avoid collision with $\lambda_s$ (scope recency, Section 2)
993
+ and $\lambda_r$ (authority-weight recency, Section 7.3). The parameters have
994
+ different semantics and units: $\lambda_r$ has units $\mathrm{s}^{-1}$, while
995
+ $\eta_{\mathrm{hop}}$ is a dimensionless attenuation factor in $(0,1)$.
996
+
997
+ The filtered hop set is:
998
+
999
+ $$
1000
+ \mathcal{C}_{hop}^{*}(q)=\{d'\in\mathcal{C}_{hop}(q)\mid S_{hop}(d')\ge\theta_{hop}\}
1001
+ $$
1002
+
1003
+ with $\theta_{hop}\in[0,1]$.
1004
+
1005
+ Since $S_{final}(d)\in[0,1]$ and $\eta_{\mathrm{hop}}\in(0,1)$:
1006
+
1007
+ $$
1008
+ S_{hop}(d')\in[0,\,1)
1009
+ $$
1010
+
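+ A sketch of hop-candidate scoring over an authored adjacency map; the
+ attenuation and threshold values below are assumed, not shipped defaults.
+
+ ```ts
+ const ETA_HOP = 0.7;   // dimensionless attenuation in (0,1), assumed value
+ const THETA_HOP = 0.3; // hop admission threshold, assumed value
+
+ // edges: authored hop graph as an adjacency map; scores: S_final over C_2(q).
+ function hopExpansion(
+   edges: Map<string, string[]>,
+   scores: Map<string, number>,
+ ): Map<string, number> {
+   const hop = new Map<string, number>();
+   for (const [parent, sFinal] of scores) {
+     for (const neighbor of edges.get(parent) ?? []) { // empty neighborhoods are safe
+       if (scores.has(neighbor)) continue;             // exclude C_2(q) itself
+       const inherited = ETA_HOP * sFinal;             // decayed score from this parent
+       if (inherited >= THETA_HOP) {
+         hop.set(neighbor, Math.max(hop.get(neighbor) ?? 0, inherited)); // keep best parent
+       }
+     }
+   }
+   return hop;
+ }
+ ```
+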
1011
+ ### 7.8 Final Assembly Under a Token Budget
1012
+
1013
+ Variant projection is:
1014
+
1015
+ $$
1016
+ \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}},\, q)=\mathcal{C}_2(q)\cup\mathcal{C}_{hop}^{*}(q)
1017
+ $$
1018
+
1019
+ The final injected context is:
1020
+
1021
+ $$
1022
+ C_{\mathrm{total}}(q)=\mathcal{I}_1\cup \mathcal{I}_2^{*}\cup T_{\mathrm{recent}}\cup \mathrm{Proj}(\mathcal{V}_{\mathrm{rest}},\, q)
1023
+ $$
1024
+
1025
+ Let the total prompt budget be $\tau$, and let the reserve fractions satisfy:
1026
+
1027
+ $$
1028
+ \alpha_1,\alpha_2,\beta\in[0,1],
1029
+ \qquad
1030
+ \alpha_1+\alpha_2+\beta\le 1
1031
+ $$
1032
+
1033
+ where:
1034
+
1035
+ - $\alpha_1$ reserves hard authored budget
1036
+ - $\alpha_2$ reserves soft authored budget
1037
+ - $\beta$ is the target recent-tail budget fraction
1038
+
1039
+ Define the hard authored token mass:
1040
+
1041
+ $$
1042
+ \tau_{\mathcal{I}_1}=\sum_{d\in\mathcal{I}_1}\mathrm{toks}(d)
1043
+ $$
1044
+
1045
+ **Required startup hard authored invariant:**
1046
+
1047
+ $$
1048
+ \tau_{\mathcal{I}_1}\le \alpha_1\tau
1049
+ $$
1050
+
1051
+ This must be enforced at startup or configuration validation time. If violated,
1052
+ the system cannot simultaneously satisfy "the hard invariant set is never
1053
+ truncated" and "total injected tokens do not exceed the total budget."
1054
+ Initialization must fail or the deployment must be reconfigured.
1055
+
1056
+ Let $T_{\mathrm{base}}$ be the mandatory recent-tail base suffix defined in
1057
+ [`continuity.md`](./continuity.md): the shortest raw suffix of the active
1058
+ session containing at least the most recent $m$ turns. The mandatory continuity
1059
+ fit requirement is:
1060
+
1061
+ $$
1062
+ \tau_{\mathcal{I}_1} + \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)\le \tau
1063
+ $$
1064
+
1065
+ Otherwise no legal assembly exists that preserves both hard invariants and the
1066
+ minimum continuity tail. The runtime must surface degraded mode explicitly; it
1067
+ must not silently truncate $\mathcal{I}_1$ or split the mandatory recent tail.
1068
+
1069
+ The effective soft authored budget is:
1070
+
1071
+ $$
1072
+ \tau_{\mathcal{I}_2}^{\mathrm{eff}}
1073
+ =
1074
+ \min\!\left(
1075
+ \alpha_2\tau,\,
1076
+ \tau-\tau_{\mathcal{I}_1}-\sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)
1077
+ \right)
1078
+ $$
1079
+
1080
+ and the injected soft invariant prefix is:
1081
+
1082
+ $$
1083
+ \mathcal{I}_2^{*}=\mathrm{Pref}(\mathcal{I}_2;\,\tau_{\mathcal{I}_2}^{\mathrm{eff}})
1084
+ $$
1085
+
1086
+ Define the recent-tail target:
1087
+
1088
+ $$
1089
+ \tau_{\mathrm{tail}}^{\mathrm{target}}=\beta\tau
1090
+ $$
1091
+
1092
+ The exact recent-tail selector is the longest bundle-safe raw suffix containing
1093
+ $T_{\mathrm{base}}$ and satisfying:
1094
+
1095
+ $$
1096
+ \sum_{d\in T_{\mathrm{recent}}}\mathrm{toks}(d)
1097
+ \le
1098
+ \min\!\left(
1099
+ \max\!\left(\tau_{\mathrm{tail}}^{\mathrm{target}},\,
1100
+ \sum_{d\in T_{\mathrm{base}}}\mathrm{toks}(d)\right),\,
1101
+ \tau-\tau_{\mathcal{I}_1}-\sum_{d\in\mathcal{I}_2^{*}}\mathrm{toks}(d)
1102
+ \right)
1103
+ $$
1104
+
1105
+ This preserves the continuity rule that the mandatory recent suffix wins over
1106
+ the nominal tail target when they conflict, while still respecting the total
1107
+ prompt budget.
1108
+
1109
+ The residual retrievable variant budget is:
1110
+
1111
+ $$
1112
+ \tau_{\mathcal{V}}(q)
1113
+ =
1114
+ \tau-\tau_{\mathcal{I}_1}
1115
+ -\sum_{d\in\mathcal{I}_2^{*}}\mathrm{toks}(d)
1116
+ -\sum_{d\in T_{\mathrm{recent}}}\mathrm{toks}(d)
1117
+ $$
1118
+
1119
+ which must satisfy:
1120
+
1121
+ $$
1122
+ \tau_{\mathcal{V}}(q)\ge 0
1123
+ $$
1124
+
1125
+ Documents in $\mathrm{Proj}(\mathcal{V}_{\mathrm{rest}}, q)$ are injected in descending
1126
+ score order until:
1127
+
1128
+ $$
1129
+ \sum_{d\in \text{injected}} \mathrm{toks}(d)\le\tau_{\mathcal{V}}(q)
1130
+ $$
1131
+
1132
+ The merged score sequence is:
1133
+
1134
+ $$
1135
+ \sigma(d)=
1136
+ \begin{cases}
1137
+ S_{final}(d) & d\in\mathcal{C}_2(q) \\
1138
+ S_{hop}(d) & d\in\mathcal{C}_{hop}^{*}(q)
1139
+ \end{cases}
1140
+ $$
1141
+
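+ A sketch of the budget arithmetic in this subsection. The function and field
+ names are illustrative, and token counts are assumed to come from the host
+ estimator.
+
+ ```ts
+ interface BudgetConfig { tau: number; alpha1: number; alpha2: number; beta: number }
+
+ // Effective soft authored budget, after validating the startup invariants.
+ function softBudget(cfg: BudgetConfig, hardTokens: number, baseTailTokens: number): number {
+   if (hardTokens > cfg.alpha1 * cfg.tau) throw new Error("tau_I1 exceeds alpha1 * tau");
+   if (hardTokens + baseTailTokens > cfg.tau) throw new Error("no legal assembly: degraded mode");
+   return Math.min(cfg.alpha2 * cfg.tau, cfg.tau - hardTokens - baseTailTokens);
+ }
+
+ // Recent-tail cap: the mandatory base suffix wins over the nominal beta * tau target,
+ // bounded by whatever remains after hard and soft invariants.
+ function tailCap(cfg: BudgetConfig, hardTokens: number, softUsedTokens: number, baseTailTokens: number): number {
+   return Math.min(Math.max(cfg.beta * cfg.tau, baseTailTokens), cfg.tau - hardTokens - softUsedTokens);
+ }
+
+ // Residual retrievable variant budget tau_V(q); must remain non-negative.
+ function variantBudget(cfg: BudgetConfig, hardTokens: number, softUsedTokens: number, tailUsedTokens: number): number {
+   return Math.max(0, cfg.tau - hardTokens - softUsedTokens - tailUsedTokens);
+ }
+ ```
+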
1142
+ ### 7.9 Complete Gating Definition
1143
+
1144
+ $$
1145
+ G(q,d)=
1146
+ \begin{cases}
1147
+ 1 & \text{if } d\in\mathcal{I}_1\cup\mathcal{I}_2^{*}\cup T_{\mathrm{recent}} \\
1148
+ \mathbf{1}[d\in\mathcal{C}_2(q)\cup\mathcal{C}_{hop}^{*}(q)] & \text{if } d\in\mathcal{V}_{\mathrm{rest}} \\
1149
+ 0 & \text{otherwise}
+ \end{cases}
1150
+ $$
1151
+
1152
+ ### 7.10 Required Runtime Invariants
1153
+
1154
+ The implementation must preserve these properties:
1155
+
1156
+ 1. Invariant completeness:
1157
+
1158
+ $$
1159
+ \forall d\in\mathcal{I}_1,\; \forall q\in\mathbf{Q}: d\in C_{\mathrm{total}}(q)
1160
+ $$
1161
+
1162
+ 2. Soft invariant order preservation:
1163
+
1164
+ $$
1165
+ \mathcal{I}_2^{*}\text{ is a prefix of }\mathcal{I}_2
1166
+ $$
1167
+
1168
+ 3. Partition integrity:
1169
+
1170
+ $$
1171
+ \mathcal{I}_1\cap\mathcal{I}_2=\mathcal{I}_1\cap\mathcal{V}=\mathcal{I}_2\cap\mathcal{V}=\emptyset,
1172
+ \qquad
1173
+ T_{\mathrm{recent}}\cap\mathcal{V}_{\mathrm{rest}}=\emptyset
1174
+ $$
1175
+
1176
+ 4. Mandatory recent-tail completeness:
1177
+
1178
+ $$
1179
+ T_{\mathrm{base}}\subseteq T_{\mathrm{recent}}
1180
+ $$
1181
+
1182
+ 5. Score boundedness:
1183
+
1184
+ $$
1185
+ S_{final}(d)\in[0,1]
1186
+ $$
1187
+
1188
+ 6. Token budget respect:
1189
+
1190
+ $$
1191
+ \sum_{d\in C_{\mathrm{total}}(q)} \mathrm{toks}(d)\le\tau
1192
+ $$
1193
+
1194
+ with $\mathcal{I}_1$ never truncated, $\mathcal{I}_2$ truncated only by
1195
+ longest-prefix selection, and the recent-tail base never silently dropped.
1196
+
1197
+ 7. Compaction boundary safety:
1198
+
1199
+ Compaction may operate only on $\mathcal{V}_{\mathrm{rest}}$, never on
1200
+ $T_{\mathrm{recent}}$.
1201
+
1202
+ 8. Hop termination:
1203
+
1204
+ The authored hop graph should be acyclic, or the runtime must cap hop depth at
1205
+ one to guarantee termination.
1206
+
1207
+ 9. Edge-case safety:
1208
+
1209
+ No valid input in the declared domain may produce a NaN, a negative score, or a
1210
+ division-by-zero. This includes at minimum:
1211
+
1212
+ - cold-start corpus with $\max \mathrm{acc}=0$
1213
+ - empty extracted keyword set with $|K|=0$
1214
+ - zero eligible clustering turns with $n=0$
1215
+ - near-zero-norm Matryoshka prefix vectors
1216
+ - empty hop neighborhoods
1217
+ - empty or zero-residual $\tau_{\mathcal{V}}(q)$ after invariant and
1218
+ continuity reservation
1219
+
1220
+ 10. Quality multiplier boundedness:
1221
+
1222
+ $$
1223
+ \mathrm{confidence}(s)\in[0,1],
1224
+ \qquad
1225
+ Q(d)\in[1-\delta,\,1]\subseteq[0,1]
1226
+ $$
1227
+
1228
+ for all valid inputs with $\delta\in[0,1]$.