@xdarkicex/openclaw-memory-libravdb 1.4.16 → 1.4.17
- package/README.md +105 -363
- package/dist/cli.d.ts +1 -1
- package/dist/cli.js +11 -0
- package/dist/index.d.ts +2 -0
- package/dist/index.js +49 -25
- package/docs/README.md +31 -16
- package/docs/assets/libravdb-logo.svg +14 -0
- package/docs/contributing.md +16 -69
- package/docs/development.md +98 -0
- package/docs/features.md +125 -0
- package/docs/install.md +4 -0
- package/docs/installation.md +79 -272
- package/docs/models.md +37 -46
- package/docs/performance-and-tuning.md +145 -0
- package/openclaw.plugin.json +1 -1
- package/package.json +1 -1
package/docs/installation.md CHANGED
@@ -1,123 +1,34 @@
 # Installation Reference
 
-This
+This is the full installation reference for
+`@xdarkicex/openclaw-memory-libravdb`. For the shortest path, use
+[install.md](./install.md).
 
 ## System Requirements
 
-| Requirement | Minimum |
-
-| Node.js | `22.0.0` |
-| OpenClaw | `2026.3.22` |
-|
-|
-|
-|
-| Architecture | `arm64`, `x64` | Match published daemon release assets | Current release matrix builds five daemon targets |
+| Requirement | Minimum | Notes |
+|---|---:|---|
+| Node.js | `22.0.0` | Enforced by `package.json` `engines.node`. |
+| OpenClaw | `2026.3.22` | Earliest supported host version for this plugin API shape. |
+| `libravdbd` | published daemon asset | Required for normal runtime. |
+| Go | `1.22` | Required only for local daemon development. |
+| OS | macOS, Linux, Windows | Unix uses a local socket; Windows uses TCP loopback. |
+| Architecture | `arm64`, `x64` | Must match the daemon release asset. |
 
-
+Resource sizing and benchmark data live in
+[Performance and tuning](./performance-and-tuning.md).
 
-
+OpenClaw compatibility note:
 
-
-build on `2026-03-29` or explicitly labeled as estimates.
+- the plugin is currently verified against OpenClaw `2026.4.23`
 
-
+## Install Flow
 
-
+The published plugin package is connect-only. It installs TypeScript plugin code
+and docs; it does not compile Go code, download model assets, or supervise the
+daemon.
 
-
-- bundled Nomic model directory: `523M`
-- bundled MiniLM fallback model directory: `87M`
-- optional T5 summarizer directory: `371M`
-- unpacked ONNX Runtime directory on macOS arm64: `44M`
-- ONNX Runtime archive download on macOS arm64: `9.5M`
-
-Practical footprints:
-
-- default quality-first install without optional T5:
-about `575 MB` (`7.7M + 523M + 44M`)
-- install with optional T5 summarizer:
-about `946 MB`
-
-Vector payload lower bounds for stored turns, derived from embedding dimension:
-
-- MiniLM `384d`: `384 * 4 = 1536 bytes` per vector
-- Nomic `768d`: `768 * 4 = 3072 bytes` per vector
-
-Estimated lower-bound vector payload for `10,000` stored turns:
-
-- MiniLM: about `15.4 MB`
-- Nomic: about `30.7 MB`
-
-These are lower bounds for vector payload only. Actual on-disk LibraVDB usage is
-higher because text, metadata, collection structure, and index state are stored
-as well.
-
-### Memory
-
-Measured locally on Apple M2, `2026-03-29`, by starting the daemon and reading
-RSS after startup:
-
-- idle RSS with Nomic embedding path loaded and no optional T5 summarizer:
-about `271,872 KB` (`~266 MB`)
-- idle RSS with Nomic plus local ONNX T5 summarizer loaded:
-about `515,312 KB` (`~503 MB`)
-
-Not yet bench-measured in the repo:
-
-- RSS during active inference
-- peak RSS during compaction of large clusters
-
-Current operational estimate:
-
-- embedding inference should remain close to the embed-only idle baseline plus
-transient ONNX workspace allocation
-- optional T5 provisioning roughly doubles steady-state RSS
-
-### CPU
-
-Measured locally from the existing Go benchmark harness on Apple M2,
-`2026-03-29`:
-
-- MiniLM bundled query embedding: about `22.6 ms/op`
-- MiniLM onnx-local query embedding: about `16.3 ms/op`
-- Nomic onnx-local query embedding: about `43.7 ms/op`
-
-Measured locally from a one-off 40-query timing sample on Apple M2,
-`2026-03-29`:
-
-- Nomic query embedding `p50`: about `18.61 ms`
-- Nomic query embedding `p95`: about `24.19 ms`
-
-Measured locally from a one-off synthetic 50-turn compaction run using the
-current extractive summarizer and Nomic embeddings:
-
-- `50`-turn extractive compaction wall time: about `3175 ms`
-
-Not yet bench-measured in the repo:
-
-- equivalent Linux x64 embedding latency on a reference machine
-- `50`-turn compaction wall time through the optional ONNX T5 abstractive path
-
-### Network
-
-Setup downloads are front-loaded. After installation, the plugin is local-first.
-
-Current setup assets:
-
-- Nomic model: about `522 MB`
-- T5-small encoder: about `135 MB`
-- T5-small decoder: about `222 MB`
-- ONNX Runtime macOS arm64 archive: about `9.5 MB`
-
-After install, the plugin makes no required network calls for embedding or
-extractive compaction. The only optional runtime network path is:
-
-- `summarizerBackend = "ollama-local"` or another custom summarizer endpoint
-
-## Standard Install
-
-### Fastest Path on macOS
+Recommended macOS path:
 
 ```bash
 brew tap xDarkicex/homebrew-openclaw-libravdb-memory
@@ -126,85 +37,30 @@ brew services start libravdbd
 openclaw plugins install @xdarkicex/openclaw-memory-libravdb
 ```
 
-
-
-### Plugin Package
-
-```bash
-openclaw plugins install @xdarkicex/openclaw-memory-libravdb
-```
-
-The plugin package installs as compiled OpenClaw runtime code without daemon bootstrap hooks.
-
-## Daemon Install
-
-Install and start `libravdbd` separately for the same user account that runs OpenClaw. The daemon owns the local DB engine and listens on a local endpoint.
-
-Default endpoints:
-
-- Homebrew on macOS: `unix:/opt/homebrew/var/clawdb/run/libravdb.sock`
-- macOS/Linux user-local installs: `unix:$HOME/.clawdb/run/libravdb.sock`
-- Windows: `tcp:127.0.0.1:37421`
-
-If you run the daemon on a different endpoint, set `plugins.configs.libravdb-memory.sidecarPath` in `~/.openclaw/openclaw.json`.
-
-### Linux
-
-Recommended layout:
+Manual Linux sketch:
 
 ```bash
 mkdir -p ~/.local/bin ~/.config/systemd/user
 curl -L -o ~/.local/bin/libravdbd <published-libravdbd-binary-url>
 chmod +x ~/.local/bin/libravdbd
-
+curl -L -o ~/.config/systemd/user/libravdbd.service <published-libravdbd-service-template-url>
 systemctl --user enable --now libravdbd.service
+openclaw plugins install @xdarkicex/openclaw-memory-libravdb
 ```
 
-
-
-```bash
-systemctl --user status libravdbd.service
-openclaw memory status
-```
-
-### Homebrew / macOS
-
-Homebrew users should normally install from the published tap:
-
-```bash
-brew tap xDarkicex/homebrew-openclaw-libravdb-memory
-brew install libravdbd
-brew services start libravdbd
-```
-
-With `sidecarPath: "auto"`, Homebrew installs on Apple Silicon should resolve to:
-
-```text
-unix:/opt/homebrew/var/clawdb/run/libravdb.sock
-```
-
-User-local installs still default to:
+Windows uses a loopback TCP endpoint by default:
 
 ```text
-
+tcp:127.0.0.1:37421
 ```
 
-
-
-
-- `libravdbd-darwin-amd64`
-- `libravdbd-linux-amd64`
-- `libravdbd-linux-arm64`
+This repository does not yet include a full Windows service-install walkthrough.
+Use the published Windows daemon asset under your preferred process supervisor
+or run `libravdbd serve` in a terminal for validation.
 
-
-
-If your GitHub Actions configuration includes:
-
-- public tap repository `xDarkicex/homebrew-openclaw-libravdb-memory`
-
-then tagged releases also push the generated formula into `Formula/libravdbd.rb` in that tap repository automatically.
+## Activation
 
-
+Assign `libravdb-memory` to both OpenClaw slots:
 
 ```json
 {
@@ -212,29 +68,15 @@ Example plugin config:
     "slots": {
       "memory": "libravdb-memory",
       "contextEngine": "libravdb-memory"
-    },
-    "configs": {
-      "libravdb-memory": {
-        "sidecarPath": "unix:/Users/<you>/.clawdb/run/libravdb.sock"
-      }
     }
   }
 }
 ```
 
-
+Treat partial assignment as a misconfiguration. This plugin is designed to own
+memory prompt injection and the context-engine lifecycle together.
 
-
-
-```text
-Installed plugin: libravdb-memory
-```
-
-## Activation
-
-The manifest declares `kind: "context-engine"` and the runtime registers the memory prompt and memory runtime surfaces in code. It is still intended to own both the `memory` and `contextEngine` slots together. Treat partial slot assignment as a misconfiguration.
-
-Add this to `~/.openclaw/openclaw.json`:
+If the daemon uses a non-default endpoint, add `sidecarPath`:
 
 ```json
 {
@@ -242,22 +84,37 @@ Add this to `~/.openclaw/openclaw.json`:
     "slots": {
       "memory": "libravdb-memory",
       "contextEngine": "libravdb-memory"
+    },
+    "configs": {
+      "libravdb-memory": {
+        "sidecarPath": "unix:/Users/<you>/.clawdb/run/libravdb.sock"
+      }
     }
   }
 }
 ```
 
-
+When `sidecarPath` is `"auto"`, macOS/Linux endpoint resolution checks:
+
+1. `LIBRAVDB_RPC_ENDPOINT`
+2. `$HOME/.clawdb/run/libravdb.sock`
+3. `/opt/homebrew/var/clawdb/run/libravdb.sock`
+4. `/usr/local/var/clawdb/run/libravdb.sock`
+5. fallback to `$HOME/.clawdb/run/libravdb.sock`
 
-
-- The plugin id is `libravdb-memory`. The npm package name used at install time is `@xdarkicex/openclaw-memory-libravdb`.
-- On newer OpenClaw versions, the plugin also registers a memory runtime bridge so the built-in `memory_search` tool can query libraVDB through the same sidecar-backed retrieval path.
-- On newer OpenClaw versions, the plugin also listens for `before_reset` and `session_end` so it can send best-effort lifecycle hints into the sidecar.
-- Those hints are journaled internally by the sidecar and can be inspected with `openclaw memory journal` without exposing them to normal memory export or recall.
-- The journal keeps only a bounded number of newest entries. Override that cap with `plugins.configs.libravdb-memory.lifecycleJournalMaxEntries` if you need a different retention window.
-- The plugin does not currently register `registerMemoryFlushPlan`; transcript ingest and compaction remain owned by the context-engine lifecycle and the sidecar.
+## Default Paths
 
-
+| Platform | Default endpoint |
+|---|---|
+| macOS/Linux user-local | `unix:$HOME/.clawdb/run/libravdb.sock` |
+| macOS Homebrew Apple Silicon | `unix:/opt/homebrew/var/clawdb/run/libravdb.sock` |
+| Windows | `tcp:127.0.0.1:37421` |
+
+Default data path:
+
+```text
+$HOME/.clawdb/data.libravdb
+```
 
 ## Verification
 
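The `"auto"` resolution order added in this hunk can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the plugin's TypeScript implementation: the function name `resolve_auto_endpoint`, the injectable `exists` callback, and the rule that a candidate socket is chosen when it exists on disk are hypothetical. Only the environment variable name and the candidate paths come from the documentation above.

```python
import os
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_auto_endpoint(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[str] = None,
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Return an endpoint string following the documented check order.

    Assumption: a socket path "matches" when it exists on disk, and the
    environment variable value is used verbatim when it is set.
    """
    env = os.environ if env is None else env
    home = str(Path.home()) if home is None else home

    # 1. An explicit override wins.
    explicit = env.get("LIBRAVDB_RPC_ENDPOINT")
    if explicit:
        return explicit

    # 2-4. Probe the user-local path, then the two Homebrew prefixes.
    user_local = f"{home}/.clawdb/run/libravdb.sock"
    candidates = [
        user_local,
        "/opt/homebrew/var/clawdb/run/libravdb.sock",
        "/usr/local/var/clawdb/run/libravdb.sock",
    ]
    for path in candidates:
        if exists(path):
            return f"unix:{path}"

    # 5. Nothing found: fall back to the user-local default.
    return f"unix:{user_local}"
```

Passing `env`, `home`, and `exists` explicitly keeps the sketch testable without touching the real filesystem; calling it with no arguments probes the current machine.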
@@ -274,6 +131,7 @@ Expected output shape:
 │ Sidecar │ running │
 │ Turns stored │ 0 │
 │ Memories stored │ 0 │
+│ Lifecycle hints │ 0 │
 │ Gate threshold │ 0.35 │
 │ Abstractive model │ ready | not provisioned │
 │ Embedding profile │ all-minilm-l6-v2 │
@@ -283,51 +141,10 @@ Expected output shape:
 
 Interpretation:
 
-- `Sidecar=running` means the
-- `Gate threshold=0.35` confirms the default
-- `Abstractive model=not provisioned` is acceptable
-
-## Contributor Install
-
-For contributors working from a clone:
-
-```bash
-pnpm check
-bash scripts/build-daemon.sh
-```
-
-This prepares a local daemon binary in `.daemon-bin/libravdbd` (or `.exe` on Windows) and copies any locally available model/runtime assets there for testing.
-
-Contributor default:
-
-- install `libravdbd` separately with Homebrew or release assets, then run `bash scripts/build-daemon.sh`
-
-Private local daemon development:
-
-- set `LIBRAVDBD_SOURCE_DIR=/path/to/libravdbd` to build from your local daemon repo
-- or set `LIBRAVDBD_BINARY_PATH=/path/to/libravdbd` to use a prebuilt local daemon binary
-
-## User-Service Templates
-
-Published daemon installs include matching user-service templates:
-
-- Linux user service: `libravdbd.service`
-- macOS LaunchAgent: `com.xdarkicex.libravdbd.plist`
-
-Linux example:
-
-```bash
-mkdir -p ~/.config/systemd/user
-cp <published-libravdbd-service-template> ~/.config/systemd/user/libravdbd.service
-systemctl --user enable --now libravdbd.service
-```
-
-macOS example:
-
-1. Copy the published `com.xdarkicex.libravdbd.plist`
-2. Replace `__LIBRAVDBD_PATH__` and `__HOME__`
-3. Save it to `~/Library/LaunchAgents/com.xdarkicex.libravdbd.plist`
-4. Load it with `launchctl load ~/Library/LaunchAgents/com.xdarkicex.libravdbd.plist`
+- `Sidecar=running` means the daemon answered the health check.
+- `Gate threshold=0.35` confirms the default durable-memory gate.
+- `Abstractive model=not provisioned` is acceptable; compaction falls back to
+the extractive path.
 
 ## Troubleshooting
 
@@ -335,47 +152,37 @@ macOS example:
 
 Common causes:
 
--
--
--
--
+- `libravdbd` is not running for the same user account as OpenClaw
+- `sidecarPath` points at the wrong endpoint
+- ONNX Runtime assets are missing or unpacked in the wrong place
+- a model asset failed checksum validation
 
-Check:
+Check the daemon first:
 
 ```bash
 openclaw memory status
+brew services restart libravdbd
 ```
 
-
-
-```bash
-brew services start libravdbd
-```
-
-Or, without Homebrew:
+For foreground debugging:
 
 ```bash
 libravdbd serve
 ```
 
-On macOS/Linux, the default endpoint is `unix:$HOME/.clawdb/run/libravdb.sock`. On Windows, the default endpoint is `tcp:127.0.0.1:37421`.
-
 ### Hash mismatch
 
-
-
-- the daemon asset is corrupt
-- the local cache is stale
-- the expected checksum is wrong
-
-Do not bypass this. Delete the asset and rerun setup, or republish the release with corrected checksums.
+Do not bypass a checksum mismatch. Delete the corrupt or stale asset and rerun
+setup, or republish the release with corrected checksums.
 
-###
+### Default memory still appears active
 
-
+Confirm that `libravdb-memory` is assigned to both `memory` and
+`contextEngine`. Without both slot entries, OpenClaw's default memory path can
+continue to run in parallel.
 
-###
+### Lifecycle journal looks empty
 
-The
-
-
+The sidecar journal only records advisory lifecycle hints such as `before_reset`
+and `session_end`. It is bounded by `lifecycleJournalMaxEntries`, default `500`,
+and is not part of normal memory recall.
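The last two troubleshooting entries both come down to reading `~/.openclaw/openclaw.json`. The following Python sketch shows one way to check a parsed config for partial slot assignment and to read the effective journal cap. It is illustrative only: the helper names `slot_problems` and `journal_cap` are hypothetical, and the top-level `plugins` key is inferred from the `plugins.configs.libravdb-memory.*` paths quoted in the removed lines, since that line of the JSON example falls between hunks.

```python
from typing import Any, List, Mapping

PLUGIN_ID = "libravdb-memory"
REQUIRED_SLOTS = ("memory", "contextEngine")
DEFAULT_JOURNAL_CAP = 500  # documented default for lifecycleJournalMaxEntries


def slot_problems(config: Mapping[str, Any]) -> List[str]:
    """List every required slot that is not assigned to the plugin."""
    slots = config.get("plugins", {}).get("slots", {})
    return [
        f"slot '{name}' is {slots.get(name)!r}, expected '{PLUGIN_ID}'"
        for name in REQUIRED_SLOTS
        if slots.get(name) != PLUGIN_ID
    ]


def journal_cap(config: Mapping[str, Any]) -> int:
    """Return the configured journal cap, or the documented default."""
    plugin_cfg = config.get("plugins", {}).get("configs", {}).get(PLUGIN_ID, {})
    return plugin_cfg.get("lifecycleJournalMaxEntries", DEFAULT_JOURNAL_CAP)
```

An empty list from `slot_problems` means both slots point at `libravdb-memory`; any entry indicates the partial assignment that the documentation treats as a misconfiguration.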
package/docs/models.md CHANGED
@@ -1,63 +1,54 @@
 # Model Strategy
 
-
+The plugin uses local ONNX-first inference for embeddings and optional
+abstractive summarization. That keeps prompt assembly local, predictable, and
+available offline after assets are installed.
 
-
+## Why ONNX Over Ollama For The Critical Path
 
-
+`assemble` runs before each response build. An embedding request that crosses a
+process and HTTP server boundary adds avoidable tail latency. Local ONNX
+inference inside the sidecar keeps retrieval close to the database and avoids a
+runtime dependency on a separate model server.
 
-
-
-
-Local ONNX inference inside the sidecar keeps the retrieval path local and
-predictable. On the current Apple M2 development machine, the repository's own
-benchmark harness measures roughly `16-23 ms/op` for MiniLM query embeddings and
-about `44 ms/op` for Nomic in the steady-state Go benchmark path.
+ONNX assets can be provisioned once and reused without network access. Given
+fixed weights and input, embeddings are deterministic enough for stable
+similarity ordering and reproducible retrieval behavior.
 
-
+The trade-off is artifact size. This project accepts that cost because local
+latency and offline operation are part of the product contract.
 
-
+## Default And Optional Embedding Profiles
 
-
+The current safe default profile is `all-minilm-l6-v2`.
 
-
+MiniLM is the default because it keeps local retrieval within the target memory
+envelope on macOS and is less fragile with ONNX Runtime execution than larger
+profiles.
 
-
+`nomic-embed-text-v1.5` remains available as an explicit opt-in profile for
+long-context retrieval experiments. Nomic's Matryoshka training makes
+`64d -> 256d -> 768d` tiering principled rather than arbitrary truncation, but
+its larger footprint makes it a less conservative default.
 
-
+For exact profile metadata, read [Embedding profiles](./embedding-profiles.md).
 
-##
+## Summarization
 
-
+Compaction can run without an abstractive summarizer. When the optional T5-small
+assets are not provisioned, the daemon degrades to the extractive path.
 
--
--
+T5-small is the optional local abstractive summarizer because it is small enough
+for CPU-local operation while still useful for session-cluster summaries. Larger
+generative models would increase latency and operational complexity.
 
-
+## Model Roles
 
-
+| Model/profile | Role |
+|---|---|
+| `all-minilm-l6-v2` | Default lightweight embedding profile. |
+| `nomic-embed-text-v1.5` | Opt-in long-context embedding profile. |
+| T5-small | Optional local abstractive compaction summarizer. |
 
-
-
-- the full Nomic profile is unavailable
-- a smaller bundled footprint matters more than long-context or Matryoshka behavior
-
-It is no longer the quality-first default.
-
-## Why T5-small for Summarization
-
-The abstractive summarization path is optional and must remain CPU-feasible on local machines. T5-small fits that constraint better than larger generative models:
-
-- small enough to run locally
-- expressive enough for session-cluster summarization
-- does not require a remote server
-
-The plugin still degrades gracefully to extractive compaction when the T5 assets are not provisioned.
-
-## Model Roles in the System
-
-- Nomic embedder: quality-first retrieval path, Matryoshka tiers
-- MiniLM: fallback embedder
-- T5-small: optional higher-quality compaction summarizer
-
-The model strategy is therefore not “use ONNX everywhere because ONNX is fashionable.” It is “use ONNX where local deterministic inference is part of the product contract.”
+External summarizer endpoints, such as Ollama, are optional. They are not part
+of the required retrieval path.
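The `64d -> 256d -> 768d` tiering mentioned for `nomic-embed-text-v1.5` relies on the general Matryoshka property that a prefix of the full vector is itself a usable embedding. A minimal sketch of that idea in plain Python follows. The function name `truncate_embedding` and the `TIERS` constant are hypothetical, and whether the daemon renormalizes after truncation or applies any model-specific preprocessing is an assumption that this diff does not confirm.

```python
import math
from typing import List, Sequence

TIERS = (64, 256, 768)  # tier sizes quoted in the documentation


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components and rescale to unit L2 norm.

    Assumption: renormalizing after truncation, so that dot products
    between truncated vectors remain cosine similarities.
    """
    if dim <= 0 or dim > len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))
```

A coarse-to-fine search could compare `truncate_embedding(v, 64)` vectors first and rescore the survivors at 256 or 768 dimensions; that usage pattern is a common application of Matryoshka embeddings rather than something this diff documents.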