@xdarkicex/openclaw-memory-libravdb 1.3.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (80)
  1. package/README.md +46 -0
  2. package/docs/README.md +14 -0
  3. package/docs/architecture-decisions/README.md +6 -0
  4. package/docs/architecture-decisions/adr-001-onnx-over-ollama.md +21 -0
  5. package/docs/architecture-decisions/adr-002-libravdb-over-lancedb.md +19 -0
  6. package/docs/architecture-decisions/adr-003-convex-gating-over-threshold.md +27 -0
  7. package/docs/architecture-decisions/adr-004-sidecar-over-native-ts.md +21 -0
  8. package/docs/architecture.md +188 -0
  9. package/docs/contributing.md +76 -0
  10. package/docs/dependencies.md +38 -0
  11. package/docs/embedding-profiles.md +42 -0
  12. package/docs/gating.md +329 -0
  13. package/docs/implementation.md +381 -0
  14. package/docs/installation.md +272 -0
  15. package/docs/mathematics.md +695 -0
  16. package/docs/models.md +63 -0
  17. package/docs/problem.md +64 -0
  18. package/docs/security.md +86 -0
  19. package/openclaw.plugin.json +84 -0
  20. package/package.json +41 -0
  21. package/scripts/build-sidecar.sh +30 -0
  22. package/scripts/postinstall.js +169 -0
  23. package/scripts/setup.sh +20 -0
  24. package/scripts/setup.ts +505 -0
  25. package/scripts/sidecar-release.d.ts +4 -0
  26. package/scripts/sidecar-release.js +17 -0
  27. package/sidecar/cmd/inspect_onnx/main.go +105 -0
  28. package/sidecar/compact/gate.go +273 -0
  29. package/sidecar/compact/gate_test.go +85 -0
  30. package/sidecar/compact/summarize.go +345 -0
  31. package/sidecar/compact/summarize_test.go +319 -0
  32. package/sidecar/compact/tokens.go +11 -0
  33. package/sidecar/config/config.go +119 -0
  34. package/sidecar/config/config_test.go +75 -0
  35. package/sidecar/embed/engine.go +696 -0
  36. package/sidecar/embed/engine_test.go +349 -0
  37. package/sidecar/embed/matryoshka.go +93 -0
  38. package/sidecar/embed/matryoshka_test.go +150 -0
  39. package/sidecar/embed/onnx_local.go +319 -0
  40. package/sidecar/embed/onnx_local_test.go +159 -0
  41. package/sidecar/embed/profile_contract_test.go +71 -0
  42. package/sidecar/embed/profile_eval_test.go +923 -0
  43. package/sidecar/embed/profiles.go +39 -0
  44. package/sidecar/go.mod +21 -0
  45. package/sidecar/go.sum +30 -0
  46. package/sidecar/health/check.go +33 -0
  47. package/sidecar/health/check_test.go +55 -0
  48. package/sidecar/main.go +151 -0
  49. package/sidecar/model/encoder.go +222 -0
  50. package/sidecar/model/registry.go +262 -0
  51. package/sidecar/model/registry_test.go +102 -0
  52. package/sidecar/model/seq2seq.go +133 -0
  53. package/sidecar/server/rpc.go +343 -0
  54. package/sidecar/server/rpc_test.go +350 -0
  55. package/sidecar/server/transport.go +160 -0
  56. package/sidecar/store/libravdb.go +676 -0
  57. package/sidecar/store/libravdb_test.go +472 -0
  58. package/sidecar/summarize/engine.go +360 -0
  59. package/sidecar/summarize/engine_test.go +148 -0
  60. package/sidecar/summarize/onnx_local.go +494 -0
  61. package/sidecar/summarize/onnx_local_test.go +48 -0
  62. package/sidecar/summarize/profiles.go +52 -0
  63. package/sidecar/summarize/tokenizer.go +13 -0
  64. package/sidecar/summarize/tokenizer_hf.go +76 -0
  65. package/sidecar/summarize/util.go +13 -0
  66. package/src/cli.ts +205 -0
  67. package/src/context-engine.ts +195 -0
  68. package/src/index.ts +27 -0
  69. package/src/memory-provider.ts +24 -0
  70. package/src/openclaw-plugin-sdk.d.ts +53 -0
  71. package/src/plugin-runtime.ts +67 -0
  72. package/src/recall-cache.ts +34 -0
  73. package/src/recall-utils.ts +22 -0
  74. package/src/rpc.ts +84 -0
  75. package/src/scoring.ts +58 -0
  76. package/src/sidecar.ts +506 -0
  77. package/src/tokens.ts +36 -0
  78. package/src/types.ts +146 -0
  79. package/tsconfig.json +20 -0
  80. package/tsconfig.tests.json +12 -0
@@ -0,0 +1,695 @@
# Mathematical Reference

This document is the formal reference for the scoring and optimization math used
by the plugin. The gating scalar is documented separately in
[gating.md](./gating.md).

Every formula below points at the file that currently implements it. If the code
changes first, this document must change with it.

## 1. Hybrid Scoring

Each candidate returned by the vector store starts with a cosine similarity score
$\cos(q,d) \in [0,1]$ from embedding retrieval. The host then applies a hybrid
ranker:

$$
\mathrm{base}(d) =
\alpha \cdot \cos(q,d) +
\beta \cdot R(d) +
\gamma \cdot S(d)
$$

$$
\mathrm{score}(d) = \mathrm{base}(d) \cdot Q(d)
$$

where:

$$
R(d) = e^{-\lambda(d)\Delta t_d}
$$

$$
S(d)=
\begin{cases}
1.0 & \text{if } d \text{ is from the active session} \\
0.6 & \text{if } d \text{ is from durable user memory} \\
0.3 & \text{if } d \text{ is from global memory}
\end{cases}
$$

$$
Q(d)=
\begin{cases}
1 - \delta \cdot \mathrm{decay\_rate}(d) & \text{if } d \text{ is a summary} \\
1 & \text{otherwise}
\end{cases}
$$

Implemented in [`src/scoring.ts`](../src/scoring.ts).

The current implementation defaults are:

- $\alpha = 0.7$
- $\beta = 0.2$
- $\gamma = 0.1$
- $\delta = 0.5$

The design convention is that $\alpha + \beta + \gamma = 1$. This keeps the
base score on a stable scale and makes tuning interpretable: increasing one
weight means explicitly decreasing another.

Boundary cases:

- $\alpha = 1$ collapses to semantic retrieval only.
- $\beta = 1$ collapses to pure recency preference.
- $\gamma = 1$ collapses to scope-only ranking and is almost always wrong
  because it ignores content.
- $\delta = 0$ ignores summary quality completely.
- $\delta = 1$ applies the maximum configured penalty to low-confidence
  summaries.

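The full scoring pipeline above can be sketched in TypeScript. This is an illustrative model, not the actual code in `src/scoring.ts`; the `Candidate` shape and constant names are assumptions, while the numeric values are the shipped defaults documented here.

```typescript
// Illustrative sketch of the hybrid ranker; field names are assumptions,
// not the actual types used in src/scoring.ts.
type Scope = "session" | "user" | "global";

interface Candidate {
  cosine: number;     // cos(q, d) from embedding retrieval
  ageSeconds: number; // Δt_d
  scope: Scope;
  isSummary: boolean;
  decayRate: number;  // 1 - confidence for summaries, 0 for raw turns
}

const ALPHA = 0.7, BETA = 0.2, GAMMA = 0.1, DELTA = 0.5;

// Per-second decay constants λ(d), by scope (Section 2).
const LAMBDA: Record<Scope, number> = {
  session: 0.0001,
  user: 0.00001,
  global: 0.000002,
};

// Scope weights S(d).
const SCOPE_WEIGHT: Record<Scope, number> = {
  session: 1.0,
  user: 0.6,
  global: 0.3,
};

function hybridScore(d: Candidate): number {
  const recency = Math.exp(-LAMBDA[d.scope] * d.ageSeconds);              // R(d)
  const base =
    ALPHA * d.cosine + BETA * recency + GAMMA * SCOPE_WEIGHT[d.scope];    // base(d)
  const quality = d.isSummary ? 1 - DELTA * d.decayRate : 1;              // Q(d)
  return base * quality;
}
```

A fresh session turn with perfect similarity scores exactly $1.0$; a summary with `decayRate = 1` is halved at the default $\delta = 0.5$.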
## 2. Recency Decay

Recency uses exponential decay:

$$
R(d) = e^{-\lambda \Delta t_d}
$$

where $\Delta t_d$ is the age of the record in seconds and $\lambda$ is the
scope-specific decay constant.

Implemented in [`src/scoring.ts`](../src/scoring.ts).

In the current implementation, $\Delta t_d$ is measured in **seconds**, not
milliseconds:

$$
\Delta t_d = \frac{\mathrm{Date.now()} - ts_d}{1000}
$$

and the $\lambda$ values are therefore **per-second** decay constants.

The current implementation uses different constants by scope:

- active session: $\lambda = 0.0001$
- durable user memory: $\lambda = 0.00001$
- global memory: $\lambda = 0.000002$

The implied half-lives make the decay constants auditable at a glance:

| Scope | $\lambda$ | Half-life |
|---|---|---|
| Session | $0.0001$ | $\approx 1.9\ \text{hours}$ |
| User | $0.00001$ | $\approx 19\ \text{hours}$ |
| Global | $0.000002$ | $\approx 4\ \text{days}$ |

$$
t_{1/2} = \frac{\ln 2}{\lambda}
$$

If those half-lives feel wrong for a given deployment, adjust $\lambda$ via
config; do not change the decay formula itself.

This makes session context fade fastest, user memory fade more slowly, and
global memory remain the most stable.

Why exponential instead of linear:

- exponential decay preserves ordering smoothly across many time scales
- it never goes negative
- it gives a natural "fast drop then long tail" shape for conversational relevance

Linear decay has a hard cutoff or requires arbitrary clipping. Exponential decay
attenuates old memories continuously without introducing a discontinuity.

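The half-life column can be checked directly from $t_{1/2} = \ln 2 / \lambda$. This is a worked sketch, not plugin code:

```typescript
// t_half = ln(2) / lambda, with lambda in per-second units.
function halfLifeSeconds(lambda: number): number {
  return Math.LN2 / lambda;
}

const hours = (s: number) => s / 3600;
const days = (s: number) => s / 86400;

// Shipped per-second decay constants:
hours(halfLifeSeconds(0.0001));   // session: ≈ 1.9 hours
hours(halfLifeSeconds(0.00001));  // user:    ≈ 19 hours
days(halfLifeSeconds(0.000002));  // global:  ≈ 4 days
```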
## 3. Token Budget Fitting

After ranking, the system performs greedy prompt packing.

Implemented in [`src/tokens.ts`](../src/tokens.ts).

Let candidates be sorted by final hybrid score:

$$
\mathrm{score}(d_1) \ge \mathrm{score}(d_2) \ge \dots \ge \mathrm{score}(d_n)
$$

and let $c_i$ be the estimated token cost of candidate $d_i$. The current host
token estimator is:

$$
\mathrm{estimateTokens}(t)=\left\lceil\frac{|t|}{\chi(t)}\right\rceil
$$

where:

$$
\chi(t)=
\begin{cases}
1.6 & \text{for CJK scripts} \\
2.5 & \text{for Cyrillic, Arabic, or Hebrew scripts} \\
4.0 & \text{otherwise}
\end{cases}
$$

Given prompt budget $B$, the system selects the longest ranked prefix whose
cumulative cost fits:

$$
S = [d_1, d_2, \dots, d_m]
$$

such that:

$$
\sum_{i=1}^{m} c_i \le B
$$

and either $m=n$ or $\sum_{i=1}^{m+1} c_i > B$.

Greedy is optimal for this implementation because the ranking is already fixed.
The problem is not "find the best weighted subset under a knapsack objective";
it is "preserve rank order while honoring a hard prompt cap." Once rank order
is fixed, prefix acceptance is the correct policy.

**Note on estimator divergence.** The host estimator
([`src/tokens.ts`](../src/tokens.ts)) is script-aware and is used for prompt
budget fitting. The sidecar estimator
([`sidecar/compact/tokens.go`](../sidecar/compact/tokens.go)) uses a fixed
bytes-per-token rule:

$$
\widehat{T}_{sidecar}(t)=\max\left(\left\lfloor\frac{\mathrm{len}(t)}{4}\right\rfloor, 1\right)
$$

The two estimators are intentionally different. The host estimator optimizes
prompt-budget accuracy. The sidecar estimator is used only as a stable
normalization denominator in the technical specificity signal $P(t)$ of the
gating scalar. They must not be substituted for each other.

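A minimal sketch of the script-aware estimator and prefix packing. The Unicode ranges used for script detection here are illustrative assumptions; the exact detection logic in `src/tokens.ts` may differ.

```typescript
// Illustrative sketch; the script-detection ranges are assumptions,
// not the exact logic in src/tokens.ts.
function charsPerToken(text: string): number {
  if (/[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]/.test(text)) return 1.6; // CJK
  if (/[\u0400-\u04ff\u0600-\u06ff\u0590-\u05ff]/.test(text)) return 2.5; // Cyrillic/Arabic/Hebrew
  return 4.0;
}

// estimateTokens(t) = ceil(|t| / chi(t))
function estimateTokens(text: string): number {
  return Math.ceil(text.length / charsPerToken(text));
}

// Greedy prefix packing: accept ranked candidates until the budget is hit.
function packPrefix(ranked: string[], budget: number): string[] {
  const out: string[] = [];
  let used = 0;
  for (const text of ranked) {
    const cost = estimateTokens(text);
    if (used + cost > budget) break; // prefix acceptance: stop at first overflow
    out.push(text);
    used += cost;
  }
  return out;
}
```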
## 4. Matryoshka Cascade

For Nomic embeddings, one full vector $\vec{v} \in \mathbb{R}^{768}$ produces three tiers:

$$
\vec{u}_{64} = \frac{\vec{v}_{1:64}}{\lVert \vec{v}_{1:64} \rVert_2}, \quad
\vec{u}_{256} = \frac{\vec{v}_{1:256}}{\lVert \vec{v}_{1:256} \rVert_2}, \quad
\vec{u}_{768} = \frac{\vec{v}_{1:768}}{\lVert \vec{v}_{1:768} \rVert_2}
$$

Re-normalization is required after truncation because a prefix of a unit vector
is not, in general, itself a unit vector.

Implemented in [`sidecar/embed/matryoshka.go`](../sidecar/embed/matryoshka.go)
and [`sidecar/store/libravdb.go`](../sidecar/store/libravdb.go).

Cascade search uses:

- L1: `64d`
- L2: `256d`
- L3: `768d`

The search exits early when a tier's best score exceeds the configured threshold;
otherwise it falls through to the next tier. Empty lower-tier collections
degrade gracefully because the implementation treats the maximum over an empty
candidate set as zero:

$$
\max(\emptyset) = 0
$$

and `0` is below both early-exit thresholds by design.

Backfill condition:

- L3 is the source of truth
- L1 and L2 are derived caches
- if an L1 or L2 insert fails, a dirty-tier marker is recorded
- startup backfill reconstructs the missing tier vector from L3

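Truncate-and-renormalize can be sketched as follows. This is a TypeScript illustration of the tiering math, not the Go source in `sidecar/embed/matryoshka.go`:

```typescript
// Truncate a full vector to its first k dims and re-normalize to unit L2 norm.
function truncateNormalize(v: number[], k: number): number[] {
  const prefix = v.slice(0, k);
  const norm = Math.hypot(...prefix);
  if (norm === 0) return prefix; // degenerate input; real code may reject this
  return prefix.map((x) => x / norm);
}

// Tier derivation for a 768-dim source vector:
// const u64 = truncateNormalize(v, 64);
// const u256 = truncateNormalize(v, 256);
// const u768 = truncateNormalize(v, 768);
```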
## 5. Compaction Clustering

Compaction groups raw session turns into deterministic chronological clusters
and replaces each cluster with one summary record. The intent is to turn many
highly local turns into fewer retrieval-worthy summaries.

Implemented in [`sidecar/compact/summarize.go`](../sidecar/compact/summarize.go).

The current algorithm is not semantic k-means. It is deterministic chronological
partitioning:

1. collect eligible non-summary turns
2. sort them by `(ts, id)`
3. choose target cluster size $k$
4. derive cluster count:

   $$
   c = \left\lceil \frac{n}{k} \right\rceil
   $$

   where $n$ is the number of eligible turns
5. assign turn $i$ to cluster:

   $$
   \mathrm{clusterIndex}(i) = \left\lfloor \frac{i \cdot c}{n} \right\rfloor
   $$

This yields contiguous chronological buckets of roughly equal size while
avoiding nondeterministic clustering behavior.

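The partitioning steps above can be sketched as follows (a TypeScript illustration, not the Go source):

```typescript
// Deterministic chronological partitioning: n sorted turns into
// c = ceil(n / k) contiguous buckets.
function clusterIndex(i: number, n: number, c: number): number {
  return Math.floor((i * c) / n);
}

// Partition indices 0..n-1 into c clusters; each bucket is contiguous.
function partition(n: number, k: number): number[][] {
  const c = Math.ceil(n / k);
  const clusters: number[][] = Array.from({ length: c }, () => []);
  for (let i = 0; i < n; i++) clusters[clusterIndex(i, n, c)].push(i);
  return clusters;
}
```

For example, 10 turns with target size $k = 4$ yield $c = 3$ contiguous buckets of sizes 4, 3, and 3.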
The summarizer input for cluster $C_j$ is the ordered turn sequence:

$$
C_j = [t_1, t_2, \dots, t_m]
$$

with each element carrying turn id and text.

The output is a summary record $s(C_j)$ with:

- summary text
- source ids
- confidence
- method
- `decay_rate = 1 - confidence`

Implemented across [`sidecar/compact/summarize.go`](../sidecar/compact/summarize.go),
[`sidecar/summarize/engine.go`](../sidecar/summarize/engine.go), and
[`sidecar/summarize/onnx_local.go`](../sidecar/summarize/onnx_local.go).

The confidence term is implemented as a bounded quality signal:

$$
\mathrm{confidence}(s) \in [0,1]
$$

with backend-specific definitions. For the extractive backend, confidence is the
mean cosine similarity of the selected turns to the cluster centroid. For the
ONNX backend, confidence is the geometric mean of the generated token
probabilities:

$$
\mathrm{confidence}_{onnx}(s) =
\exp\left(\frac{\sum_{i=1}^{n}\log p(t_i \mid t_{<i}, C_j)}{n}\right)
$$

where $t_i$ are generated summary tokens and $C_j$ is the source cluster.

The retrieval decay metadata is then:

$$
\mathrm{decay\_rate}(s)=1-\mathrm{confidence}(s)
$$

and the retrieval quality multiplier from Section 1 becomes:

$$
Q(s)=1-\delta\cdot\mathrm{decay\_rate}(s)
$$

At the shipped default $\delta = 0.5$, this constrains summary quality
multipliers to:

$$
Q(s)\in[0.5,1.0]
$$

This makes compaction load-bearing in retrieval rather than archival only.

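The confidence-to-decay pipeline can be sketched as follows. This is a TypeScript illustration of the math; the sidecar implements it in Go:

```typescript
// Geometric-mean token confidence for a generated summary:
// exp(mean of log-probabilities) == geometric mean of p(t_i | t_<i, C_j).
function onnxConfidence(tokenLogProbs: number[]): number {
  const meanLogP =
    tokenLogProbs.reduce((a, b) => a + b, 0) / tokenLogProbs.length;
  return Math.exp(meanLogP);
}

// decay_rate = 1 - confidence; Q(s) = 1 - delta * decay_rate ∈ [1 - delta, 1].
function qualityMultiplier(confidence: number, delta = 0.5): number {
  const decayRate = 1 - confidence; // stored as summary metadata
  return 1 - delta * decayRate;
}
```

Two tokens each generated at probability $0.5$ give confidence $0.5$, hence $Q(s) = 0.75$ at the default $\delta$.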
## 6. Why These Pieces Compose

The full quality loop is:

$$
\text{high-value turns}
\rightarrow \text{better clusters}
\rightarrow \text{higher summary confidence}
\rightarrow \text{lower decay rate}
\rightarrow \text{higher retrieval score}
$$

That is the system-level reason the math is distributed across ingestion,
compaction, and retrieval instead of existing only in one scoring function.

## 7. Planned Two-Pass Discovery Scoring

This section documents the planned scoring and assembly model for a future
two-pass retrieval system. It is a design target for optimization work after
the OpenClaw `2026.3.28+` memory prompt contract change. It is **not** the
current implementation in [`src/scoring.ts`](../src/scoring.ts) or
[`src/context-engine.ts`](../src/context-engine.ts).

The design goal is to separate:

1. invariant documents that must always be present
2. cheap discovery over variant documents
3. selective second-pass expansion under a hard prompt budget

### 7.1 Foundational Definitions

Let the retrievable document corpus be:

$$
\mathbf{D}=\{d_1, d_2, \ldots, d_n\}
$$

and let $\mathbf{Q}$ be the query space.

Let the embedding function:

$$
\varphi : \mathbf{D}\cup\mathbf{Q}\rightarrow \mathbb{R}^m
$$

map documents and queries to unit vectors:

$$
\|\varphi(x)\| = 1 \qquad \forall x \in \mathbf{D}\cup\mathbf{Q}
$$

The planned gating function is:

$$
G : \mathbf{Q}\times\mathbf{D}\rightarrow \{0,1\}
$$

and determines whether a document is injected for a query.

### 7.2 Corpus Decomposition

The corpus is partitioned into invariant and variant sets:

$$
\mathbf{D} = \mathcal{I}\cup\mathcal{V},
\qquad
\mathcal{I}\cap\mathcal{V}=\emptyset
$$

The invariant membership predicate is:

$$
\iota : \mathbf{D}\rightarrow \{0,1\}
$$

with:

$$
\mathcal{I} = \{d\in\mathbf{D}\mid \iota(d)=1\}
\qquad
\mathcal{V} = \mathbf{D}\setminus\mathcal{I}
$$

For OpenClaw, the intended implementation is that invariant documents are
registered as authored constants at load time rather than discovered at query
time. In practice, this means documents such as `AGENTS.md` and `souls.md`
should be compiled into the invariant set when they are explicitly marked as
always-inject rules.

The required invariant is:

$$
\iota(d)=1 \Rightarrow G(q,d)=1 \qquad \forall q\in\mathbf{Q}
$$

This is a compile-time guarantee, not a runtime heuristic.

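A minimal sketch of load-time partitioning, with a hypothetical `alwaysInject` flag standing in for however the plugin ultimately marks always-inject rules:

```typescript
// Hypothetical document shape; `alwaysInject` models ι(d) set at authorship time.
interface Doc {
  id: string;
  alwaysInject: boolean;
}

// Partition D into I and V; I ∩ V = ∅ and I ∪ V = D hold by construction.
function partitionCorpus(docs: Doc[]): { invariant: Doc[]; variant: Doc[] } {
  const invariant = docs.filter((d) => d.alwaysInject);
  const variant = docs.filter((d) => !d.alwaysInject);
  return { invariant, variant };
}
```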
### 7.3 Document Authority Weight

Each variant document carries a precomputed authority weight:

$$
\omega(d)=\alpha_r\cdot r(d)+\alpha_f\cdot f(d)+\alpha_a\cdot a(d)
$$

with:

$$
\alpha_r+\alpha_f+\alpha_a=1
$$

where $r(d)$ is recency, $f(d)$ is normalized access frequency, and $a(d)$ is
authored authority:

$$
r(d)=\exp\left(-\lambda_r\cdot \Delta t(d)\right)
$$

$$
f(d)=\frac{\log(1+\operatorname{acc}(d))}{\log\left(1+\max_{d'\in\mathcal{V}}\operatorname{acc}(d')\right)}
$$

$$
a(d)\in[0,1]
$$

This lets the planned discovery score incorporate recency, access frequency,
and authored authority without baking those concerns into the raw cosine term.

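A sketch of $\omega(d)$. The weight split and $\lambda_r$ value here are illustrative assumptions, not shipped defaults:

```typescript
// Authority weight ω(d) = α_r·r(d) + α_f·f(d) + α_a·a(d), with weights summing to 1.
// The default weights and lambdaR below are illustrative assumptions.
function authorityWeight(
  ageSeconds: number,
  accessCount: number,
  maxAccessCount: number,
  authored: number, // a(d) ∈ [0,1]
  lambdaR = 0.00001,
  alphaR = 0.4,
  alphaF = 0.3,
  alphaA = 0.3,
): number {
  const r = Math.exp(-lambdaR * ageSeconds); // recency term
  const f =
    maxAccessCount > 0
      ? Math.log(1 + accessCount) / Math.log(1 + maxAccessCount) // log-scaled frequency
      : 0;
  return alphaR * r + alphaF * f + alphaA * authored;
}
```

A fresh, maximally accessed, fully authored document reaches $\omega(d) = 1$; with the illustrative weights, recency alone contributes at most $0.4$.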
### 7.4 Pass 1: Coarse Semantic Filtering

Pass 1 computes cosine similarity:

$$
\operatorname{sim}(q,d)=\varphi(q)^\top \varphi(d)
$$

takes the top-$k_1$ variant documents by similarity:

$$
\tilde{\mathcal{C}}_1(q)=\operatorname{top\text{-}k_1}_{d\in\mathcal{V}}\ \operatorname{sim}(q,d)
$$

and applies a hard similarity floor to produce the coarse candidate set:

$$
\mathcal{C}_1(q)=\{d\in\tilde{\mathcal{C}}_1(q)\mid \operatorname{sim}(q,d)\ge \theta_1\}
$$

The purpose of this pass is breadth with cheap semantic recall. Documents below
$\theta_1$ are rejected even if they land in the top-$k_1$ set, because the
first pass must not admit semantically orthogonal noise into second-pass work.

### 7.5 Pass 2: Normalized Hybrid Scoring

Let the query keyword extractor return:

$$
K = \operatorname{KeyExt}(q)
$$

and define normalized keyword coverage:

$$
M_{norm}(K,d)=\frac{|K\cap \operatorname{terms}(d)|}{|K|}\in[0,1]
$$

The proposed normalized second-pass score is:

$$
S_{final}(d)=
\frac{
\omega(d)\cdot\max(\operatorname{sim}(q,d), 0)\cdot\left(1+\kappa\cdot M_{norm}(K,d)\right)
}{
1+\kappa
}
$$

The normalized second-pass score form above was suggested during design review
by GitHub contributor [@JuanHuaXu](https://github.com/JuanHuaXu). The broader
two-pass architecture in this section remains project-authored.

This form is preferred over a hard clamp such as $\min(\mathrm{term},1)$
because clamping discards ranking information at the high end of the score
distribution. The denominator $(1+\kappa)$ gives an analytic bound instead of
truncating the result.

The second-pass candidate set is:

$$
\mathcal{C}_2(q)=\operatorname{top\text{-}k_2}_{d\in\mathcal{C}_1(q)}\ S_{final}(d)
$$

with $k_2 \le k_1$.

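The proposed score can be sketched directly. The default $\kappa = 0.3$ is the experiment value suggested later in this section; everything else follows the formula as written:

```typescript
// Normalized second-pass score; bounded by ω(d) ≤ 1 for any κ ≥ 0.
// Sketch of the planned formula, not shipped code.
function finalScore(
  omega: number,    // ω(d) ∈ [0,1]
  sim: number,      // cosine similarity; may be negative, clamped below
  coverage: number, // M_norm(K, d) ∈ [0,1]
  kappa = 0.3,
): number {
  const s = Math.max(sim, 0);
  return (omega * s * (1 + kappa * coverage)) / (1 + kappa);
}
```

Full semantic and lexical support reaches the ceiling $\omega(d)$; a semantic-only match at $\kappa = 0.5$ lands at two-thirds of that ceiling; a negative similarity scores zero.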
### 7.6 Bounded Range and Interpretation of $\kappa$

Let:

$$
s=\max(\operatorname{sim}(q,d),0)\in[0,1]
$$

Then:

$$
S_{final}(d)=\frac{\omega(d)\cdot s\cdot(1+\kappa M_{norm}(K,d))}{1+\kappa}
$$

The numerator is maximized when $s=1$ and $M_{norm}(K,d)=1$:

$$
\max(\text{numerator})=\omega(d)\cdot(1+\kappa)
$$

Therefore:

$$
0 \le S_{final}(d)\le \omega(d)\le 1
$$

This yields a clean interpretation of $\kappa$:

- $\kappa = 0$ gives pure semantic retrieval
- $\kappa = 0.5$ lets full keyword coverage contribute up to one-third of the
  final score's ceiling; equivalently, a semantic-only match retains two-thirds
  of the score it would have with full lexical support
- $\kappa = 1.0$ makes full lexical support restore the pure semantic ceiling
  while penalizing semantic-only matches with no keyword support

A reasonable initial experiment value is $\kappa = 0.3$.

### 7.7 Multi-Hop Expansion

Let the authored hop graph be:

$$
\mathcal{G}=(\mathbf{D}, E)
$$

where edges are registered in document metadata at authorship time.

For a document $d$, define its hop neighborhood:

$$
H(d)=\{d'\in\mathbf{D}\mid (d,d')\in E\}
$$

The hop expansion set is:

$$
\mathcal{C}_{hop}(q)=\left(\bigcup_{d\in\mathcal{C}_2(q)} H(d)\right)\setminus\mathcal{C}_2(q)
$$

Each hop candidate inherits a decayed score from its best parent:

$$
S_{hop}(d')=
\lambda_{hop}\cdot
\max_{d\in\mathcal{C}_2(q),\ d'\in H(d)} S_{final}(d)
$$

with hop decay $\lambda_{hop}\in(0,1)$; note that $\lambda_{hop}$ is unrelated
to the recency decay constants of Section 2.

The filtered hop set is:

$$
\mathcal{C}_{hop}^{*}(q)=\{d'\in\mathcal{C}_{hop}(q)\mid S_{hop}(d')\ge\theta_{hop}\}
$$

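One-hop expansion can be sketched as follows; the decay and threshold values are illustrative, and parent scores and hop edges are assumed precomputed:

```typescript
// One-hop expansion with decayed inherited scores.
function hopScores(
  parents: Map<string, number>, // d -> S_final(d) for d in C_2(q)
  edges: Map<string, string[]>, // authored hop graph: d -> H(d)
  lambdaHop = 0.5,              // hop decay in (0, 1); illustrative value
  thetaHop = 0.2,               // hop score floor; illustrative value
): Map<string, number> {
  const out = new Map<string, number>();
  for (const [parent, score] of parents) {
    for (const child of edges.get(parent) ?? []) {
      if (parents.has(child)) continue; // exclude documents already in C_2(q)
      const s = lambdaHop * score;
      out.set(child, Math.max(out.get(child) ?? 0, s)); // best parent wins
    }
  }
  for (const [id, s] of out) if (s < thetaHop) out.delete(id); // apply θ_hop
  return out;
}
```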
### 7.8 Final Assembly Under a Token Budget

The variant projection is:

$$
\operatorname{Proj}(\mathcal{V}, q)=\mathcal{C}_2(q)\cup\mathcal{C}_{hop}^{*}(q)
$$

and the total injected soul context is:

$$
C_{soul}(q)=\mathcal{I}\cup \operatorname{Proj}(\mathcal{V}, q)
$$

Let the total prompt budget be $\tau$. The invariant set consumes:

$$
\tau_{\mathcal{I}}=\sum_{d\in\mathcal{I}} \operatorname{toks}(d)
$$

leaving the variant budget:

$$
\tau_{\mathcal{V}}=\tau-\tau_{\mathcal{I}}
$$

Documents in $\operatorname{Proj}(\mathcal{V}, q)$ are injected in descending
score order for as long as the running total satisfies:

$$
\sum_{d\in \text{injected}} \operatorname{toks}(d)\le\tau_{\mathcal{V}}
$$

The merged score used for ordering is:

$$
\sigma(d)=
\begin{cases}
S_{final}(d) & d\in\mathcal{C}_2(q) \\
S_{hop}(d) & d\in\mathcal{C}_{hop}^{*}(q)
\end{cases}
$$

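Assembly can be sketched as follows. The prefix-acceptance stopping rule mirrors Section 3 and is an assumption, since this section does not pin down skip-versus-stop behavior; the `Scored` shape is likewise illustrative:

```typescript
// Final assembly: invariants first (never truncated), then variants in
// descending σ(d) order under the remaining budget τ_V.
interface Scored {
  id: string;
  tokens: number; // toks(d)
  score: number;  // σ(d)
}

function assemble(invariant: Scored[], variant: Scored[], tau: number): string[] {
  const tauI = invariant.reduce((a, d) => a + d.tokens, 0);
  let remaining = tau - tauI; // τ_V; assumed non-negative for this sketch
  const out = invariant.map((d) => d.id);
  for (const d of [...variant].sort((x, y) => y.score - x.score)) {
    if (d.tokens > remaining) break; // prefix acceptance, mirroring the host packer
    out.push(d.id);
    remaining -= d.tokens;
  }
  return out;
}
```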
### 7.9 Complete Gating Definition

$$
G(q,d)=
\begin{cases}
1 & \text{if } \iota(d)=1 \\
\mathbf{1}[d\in\mathcal{C}_2(q)\cup\mathcal{C}_{hop}^{*}(q)] & \text{if } \iota(d)=0
\end{cases}
$$

662
+ ### 7.10 Required Runtime Invariants
663
+
664
+ The implementation must preserve these properties:
665
+
666
+ 1. Invariant completeness:
667
+
668
+ $$
669
+ \forall d\in\mathcal{I},\ \forall q\in\mathbf{Q}: d\in C_{soul}(q)
670
+ $$
671
+
672
+ 2. Partition integrity:
673
+
674
+ $$
675
+ \mathcal{I}\cap\mathcal{V}=\emptyset
676
+ $$
677
+
678
+ 3. Score boundedness:
679
+
680
+ $$
681
+ S_{final}(d)\in[0,1]
682
+ $$
683
+
684
+ 4. Token budget respect:
685
+
686
+ $$
687
+ \sum_{d\in C_{soul}(q)} \operatorname{toks}(d)\le\tau
688
+ $$
689
+
690
+ with the invariant set never truncated
691
+
692
+ 5. Hop termination:
693
+
694
+ The authored hop graph should be acyclic, or the runtime must cap hop depth at
695
+ one to guarantee termination.