@xdarkicex/openclaw-memory-libravdb 1.4.16 → 1.4.17
- package/README.md +105 -363
- package/dist/cli.d.ts +1 -1
- package/dist/cli.js +11 -0
- package/dist/index.d.ts +2 -0
- package/dist/index.js +49 -25
- package/docs/README.md +31 -16
- package/docs/assets/libravdb-logo.svg +14 -0
- package/docs/contributing.md +16 -69
- package/docs/development.md +98 -0
- package/docs/features.md +125 -0
- package/docs/install.md +4 -0
- package/docs/installation.md +79 -272
- package/docs/models.md +37 -46
- package/docs/performance-and-tuning.md +145 -0
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
@@ -0,0 +1,145 @@
# Performance And Tuning

This document keeps resource sizing, tuning knobs, and benchmark workflows out
of the root README.

## Resource Expectations

The numbers below are local measurements from this repository as of
`2026-03-29`, unless labeled as estimates.

### Disk

Measured local asset sizes:

- daemon binary: `7.7M`
- bundled Nomic model directory: `523M`
- bundled MiniLM fallback model directory: `87M`
- optional T5 summarizer directory: `371M`
- unpacked ONNX Runtime directory on macOS arm64: `44M`
- ONNX Runtime archive download on macOS arm64: `9.5M`

Vector payload lower bounds:

- MiniLM `384d`: `384 * 4 = 1536 bytes` per vector
- Nomic `768d`: `768 * 4 = 3072 bytes` per vector

Estimated lower-bound vector payload for `10,000` stored turns:

- MiniLM: about `15.4 MB`
- Nomic: about `30.7 MB`

Actual on-disk LibraVDB usage is higher because text, metadata, collection
structure, and index state are stored as well.
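
The lower-bound arithmetic above can be reproduced with a short sketch. It assumes the factor of `4` is the byte width of a float32 component, that each stored turn produces exactly one vector, and that "MB" means decimal megabytes. The function names are illustrative and not part of the plugin.

```typescript
// Lower-bound vector payload: dimensions * bytes per component * vector count.
// Assumption: float32 components (4 bytes each), one vector per stored turn.
const BYTES_PER_FLOAT32 = 4;

function vectorPayloadBytes(dimensions: number, vectorCount: number): number {
  return dimensions * BYTES_PER_FLOAT32 * vectorCount;
}

// Decimal megabytes (1 MB = 1,000,000 bytes).
function toDecimalMB(bytes: number): number {
  return bytes / 1_000_000;
}

const miniLmBytes = vectorPayloadBytes(384, 10_000); // 15,360,000 bytes
const nomicBytes = vectorPayloadBytes(768, 10_000); // 30,720,000 bytes

console.log(toDecimalMB(miniLmBytes).toFixed(2)); // "15.36"
console.log(toDecimalMB(nomicBytes).toFixed(2)); // "30.72"
```

The same function scales linearly, so doubling the number of stored turns doubles the payload floor.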

### Memory

Measured on Apple M2 by starting the daemon and reading RSS after startup:

- Nomic embedding path loaded without optional T5 summarizer: about `266 MB`
- Nomic plus local ONNX T5 summarizer loaded: about `503 MB`

Not yet bench-measured in this repo:

- RSS during active inference
- peak RSS during compaction of large clusters

### CPU

Measured from the current Go benchmark harness on Apple M2:

- MiniLM bundled query embedding: about `22.6 ms/op`
- MiniLM onnx-local query embedding: about `16.3 ms/op`
- Nomic onnx-local query embedding: about `43.7 ms/op`

Measured from a one-off 40-query timing sample on Apple M2:

- Nomic query embedding `p50`: about `18.61 ms`
- Nomic query embedding `p95`: about `24.19 ms`
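
For readers who want to compute `p50` and `p95` from their own timing sample, here is a minimal sketch. It uses the nearest-rank method, which is an assumption: the source does not say which percentile method produced the figures above. The sample values are hypothetical and are not measurements from this repository.

```typescript
// Nearest-rank percentile over a latency sample in milliseconds.
// Assumption: the method behind the published p50/p95 figures is not stated.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("percentile of empty sample");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Hypothetical timings for illustration only.
const latenciesMs = [17.9, 18.2, 18.6, 19.1, 24.0, 18.4, 18.8, 20.3, 18.5, 23.1];

console.log(percentile(latenciesMs, 50)); // 18.6
console.log(percentile(latenciesMs, 95)); // 24
```

With a sample as small as 40 queries, `p95` is determined by the two or three slowest calls, so it will vary noticeably between runs.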

Measured from a one-off synthetic 50-turn compaction run with the current
extractive summarizer and Nomic embeddings:

- `50`-turn extractive compaction wall time: about `3175 ms`

Not yet bench-measured:

- equivalent Linux x64 embedding latency on a reference machine
- `50`-turn compaction wall time through the optional ONNX T5 abstractive path

## Runtime Tuning Fields

Prefer the defaults unless you are measuring a specific problem. These fields
are advanced controls, not required install settings.

| Field | Effect |
|---|---|
| `topK` | Search result budget before prompt fitting. |
| `alpha`, `beta`, `gamma` | Hybrid scoring weights for similarity, scope, and recency-style signals. |
| `ingestionGateThreshold` | Durable-memory promotion threshold, default `0.35`. |
| `gatingWeights` | Domain-adaptive admission weights for conversational and technical memory. |
| `gatingTechNorm` | Normalization control for the technical-content gate. |
| `gatingCentroidK` | Number of centroid candidates used by the gate. |
| `compactionQualityWeight` | How much summary confidence affects retrieval score, default `0.5`. |
| `recencyLambdaSession` | Session-memory recency decay. |
| `recencyLambdaUser` | Durable user-memory recency decay. |
| `recencyLambdaGlobal` | Global-memory recency decay. |
| `tokenBudgetFraction` | Fraction of host context budget available to memory assembly. |
| `compactThreshold` | Explicit compaction trigger threshold. |
| `compactionThresholdFraction` | Dynamic trigger ratio when `compactThreshold` is unset, default `0.8`. |
| `compactSessionTokenBudget` | Auto-compaction budget since the last compaction, default `2000`; set `0` to disable. |
| `rpcTimeoutMs` | Sidecar RPC timeout, default `30000`. |
| `maxRetries` | Retry budget for sidecar RPC calls. |
| `logLevel` | Plugin log level. |
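
To build intuition for how `alpha`, `beta`, `gamma`, and the `recencyLambda*` fields might interact, the sketch below blends three signals linearly and decays recency exponentially. Both the linear blend and the exponential form are assumptions made for illustration; the table only states what each weight applies to, not the formula. The `Signals` shape and function names are hypothetical.

```typescript
// Hypothetical signal bundle; the plugin's real scoring inputs may differ.
interface Signals {
  similarity: number; // e.g. cosine similarity in [0, 1]
  scopeMatch: number; // 1 when the memory's scope matches the query, lower otherwise
  ageSeconds: number; // time since the memory was written
}

interface Weights {
  alpha: number; // weight on similarity
  beta: number; // weight on scope
  gamma: number; // weight on recency
}

// Assumed exponential recency decay: exp(-lambda * age).
function recency(ageSeconds: number, lambda: number): number {
  return Math.exp(-lambda * ageSeconds);
}

// Assumed linear blend of the three signals.
function hybridScore(s: Signals, w: Weights, lambda: number): number {
  return (
    w.alpha * s.similarity +
    w.beta * s.scopeMatch +
    w.gamma * recency(s.ageSeconds, lambda)
  );
}

// Illustrative weights, not the plugin's defaults.
const weights: Weights = { alpha: 0.7, beta: 0.2, gamma: 0.1 };
const fresh = hybridScore({ similarity: 0.8, scopeMatch: 1, ageSeconds: 0 }, weights, 0.001);
const stale = hybridScore({ similarity: 0.8, scopeMatch: 1, ageSeconds: 3600 }, weights, 0.001);
console.log(fresh > stale); // true: only the recency term differs
```

Under this model a larger lambda makes old memories lose their recency contribution faster, which is one plausible reason to keep separate session, user, and global decay rates.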

Model-related fields live in [Embedding profiles](./embedding-profiles.md) and
[Models](./models.md).
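
The defaults stated in the tuning table can be collected into a typed object so that overrides stay explicit. This is a minimal sketch: the field names and default values come from the table, but the numeric types, the `TuningOverrides` interface, and the `withDocumentedDefaults` helper are assumptions and not the plugin's actual configuration API.

```typescript
// Subset of tuning fields from the table; numeric types are assumed.
interface TuningOverrides {
  topK?: number;
  ingestionGateThreshold?: number;
  compactionQualityWeight?: number;
  compactThreshold?: number;
  compactionThresholdFraction?: number;
  compactSessionTokenBudget?: number;
  rpcTimeoutMs?: number;
  maxRetries?: number;
}

// Only the defaults the table states explicitly.
const documentedDefaults = {
  ingestionGateThreshold: 0.35,
  compactionQualityWeight: 0.5,
  compactionThresholdFraction: 0.8,
  compactSessionTokenBudget: 2000, // 0 disables auto-compaction
  rpcTimeoutMs: 30000,
} as const;

// Hypothetical helper: explicit overrides win over documented defaults.
function withDocumentedDefaults(overrides: TuningOverrides) {
  return { ...documentedDefaults, ...overrides };
}

const tuned = withDocumentedDefaults({ rpcTimeoutMs: 60000 });
console.log(tuned.rpcTimeoutMs); // 60000
console.log(tuned.ingestionGateThreshold); // 0.35
```

Keeping overrides in one small object makes it easier to follow the advice above: change one field, measure, and revert if the measurement does not improve.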

## LongMemEval Harness

The repository includes a local LongMemEval harness that runs the dataset
through the plugin layer and checks whether the assembled prompt still contains
the evidence turns.

The benchmark runner is committed, but the dataset and generated reports are
not. Keep downloaded data and local outputs under `benchmarks/longmemeval/`,
which is ignored by default.

Run it with:

```bash
LONGMEMEVAL_DATA_FILE=/path/to/longmemeval_oracle.json pnpm run benchmark:longmemeval
```

If you already have a daemon running and do not want the benchmark to spawn
another one, set:

```bash
LONGMEMEVAL_USE_EXISTING_DAEMON=1 \
LONGMEMEVAL_SIDECAR_PATH=unix:/path/to/libravdb.sock \
pnpm run benchmark:longmemeval
```

Optional controls:

- `LONGMEMEVAL_LIMIT` caps the number of questions
- `LONGMEMEVAL_TOPK` changes the search budget
- `LONGMEMEVAL_OUT_FILE` writes JSONL records for analysis
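
Records written to `LONGMEMEVAL_OUT_FILE` can be loaded one line at a time for analysis. The sketch below is a minimal parser that assumes a run may have been interrupted mid-write, so the final line can be incomplete. It makes no assumption about the record schema; the `id` field in the usage example is hypothetical.

```typescript
// Parse JSONL text, tolerating malformed lines such as a truncated final record.
function parseJsonl(text: string): { records: unknown[]; skipped: number } {
  const records: unknown[] = [];
  let skipped = 0;
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // ignore blank lines and the trailing newline
    try {
      records.push(JSON.parse(trimmed));
    } catch {
      skipped += 1; // malformed or partially written record
    }
  }
  return { records, skipped };
}

// Hypothetical file content: two complete records and one cut off mid-write.
const sample = '{"id":1}\n{"id":2}\n{"id":';
const parsed = parseJsonl(sample);
console.log(parsed.records.length); // 2
console.log(parsed.skipped); // 1
```

In Node, the input text could come from `fs.readFileSync(path, "utf8")`. Counting skipped lines instead of throwing keeps a partial file usable while still showing that something was dropped.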

The harness writes JSONL incrementally, so partial results survive if a
transient daemon failure interrupts a long run. If the local test daemon drops
mid-run, the benchmark restarts it and retries the current instance once before
recording an error result.
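
The restart-and-retry-once behavior described above can be sketched as a small wrapper. This is an illustration of the pattern, not the harness's code: `task` and `restartDaemon` are hypothetical callbacks, and the sketch simplifies by treating any failure as a dropped daemon.

```typescript
// Result of one benchmark instance: a value, or an error to record.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: string };

// Run a task; on failure, restart the daemon and retry exactly once.
// Both callbacks are hypothetical stand-ins for the harness's own functions.
async function runWithOneRetry<T>(
  task: () => Promise<T>,
  restartDaemon: () => Promise<void>,
): Promise<Outcome<T>> {
  try {
    return { ok: true, value: await task() };
  } catch {
    // First failure: assume the daemon dropped, restart it, and try again.
    try {
      await restartDaemon();
      return { ok: true, value: await task() };
    } catch (err) {
      // Second failure: record an error result instead of aborting the run.
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}
```

Returning an error outcome instead of throwing lets the caller append an error record to the JSONL output and move on to the next instance.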

To score a hypothesis JSONL file with the official LongMemEval evaluator:

```bash
LONGMEMEVAL_EVAL_REPO=/path/to/LongMemEval \
LONGMEMEVAL_HYPOTHESIS_FILE=/path/to/hypotheses.jsonl \
LONGMEMEVAL_DATA_FILE=/path/to/longmemeval_oracle.json \
OPENAI_API_KEY=... \
pnpm run benchmark:longmemeval:score
```

The scorer wrapper shells out to the official Python evaluation script and then
prints aggregate metrics from the generated log when available.
package/openclaw.plugin.json
CHANGED