@xdarkicex/openclaw-memory-libravdb 1.4.16 → 1.4.17

@@ -1,123 +1,34 @@
1
1
  # Installation Reference
2
2
 
3
- This document is the full installation reference for `@xdarkicex/openclaw-memory-libravdb`. For the short path, use the root [README.md](../README.md).
3
+ This is the full installation reference for
4
+ `@xdarkicex/openclaw-memory-libravdb`. For the shortest path, use
5
+ [install.md](./install.md).
4
6
 
5
7
  ## System Requirements
6
8
 
7
- | Requirement | Minimum | Recommended | Notes |
8
- |---|---|---|---|
9
- | Node.js | `22.0.0` | Latest LTS | Enforced in [`package.json`](../package.json) `engines.node` |
10
- | OpenClaw | `2026.3.22` | Current stable | Pinned by [`package.json`](../package.json) `peerDependencies.openclaw`; this is the earliest local tag confirmed to expose `definePluginEntry`, `registerContextEngine`, `registerMemoryPromptSection`, and the base plugin API shape this repo uses. Newer hosts may also expose the optional `registerMemoryRuntime` seam, which this plugin now adopts when available |
11
- | Go | `1.22` | Latest stable | Required only for local daemon development, not for normal plugin install |
12
- | Disk | about `1 GB` free for default Nomic install | `2 GB+` if provisioning optional T5 and leaving room for DB growth | See Resource Requirements below |
13
- | RAM | about `512 MB` for embed-only runtime | `1 GB+` if optional T5 summarizer is provisioned | Based on local RSS measurements below |
14
- | OS | macOS, Linux, Windows | Current stable releases | Unix uses a local socket; Windows uses TCP loopback |
15
- | Architecture | `arm64`, `x64` | Match published daemon release assets | Current release matrix builds five daemon targets |
9
+ | Requirement | Minimum | Notes |
10
+ |---|---:|---|
11
+ | Node.js | `22.0.0` | Enforced by `package.json` `engines.node`. |
12
+ | OpenClaw | `2026.3.22` | Earliest supported host version for this plugin API shape. |
13
+ | `libravdbd` | published daemon asset | Required for normal runtime. |
14
+ | Go | `1.22` | Required only for local daemon development. |
15
+ | OS | macOS, Linux, Windows | Unix uses a local socket; Windows uses TCP loopback. |
16
+ | Architecture | `arm64`, `x64` | Must match the daemon release asset. |
16
17
 
17
- The published plugin install path is scanner-clean and connect-only. End users should not need Go to install the OpenClaw plugin itself.
18
+ Resource sizing and benchmark data live in
19
+ [Performance and tuning](./performance-and-tuning.md).
18
20
 
19
- ## Resource Requirements
21
+ OpenClaw compatibility note:
20
22
 
21
- The numbers in this section are either directly measured from the current local
22
- build on `2026-03-29` or explicitly labeled as estimates.
23
+ - the plugin is currently verified against OpenClaw `2026.4.23`
23
24
 
24
- ### Disk
25
+ ## Install Flow
25
26
 
26
- Measured locally from this checkout:
27
+ The published plugin package is connect-only. It installs TypeScript plugin code
28
+ and docs; it does not compile Go code, download model assets, or supervise the
29
+ daemon.
27
30
 
28
- - daemon binary: `7.7M`
29
- - bundled Nomic model directory: `523M`
30
- - bundled MiniLM fallback model directory: `87M`
31
- - optional T5 summarizer directory: `371M`
32
- - unpacked ONNX Runtime directory on macOS arm64: `44M`
33
- - ONNX Runtime archive download on macOS arm64: `9.5M`
34
-
35
- Practical footprints:
36
-
37
- - default quality-first install without optional T5:
38
- about `575 MB` (`7.7M + 523M + 44M`)
39
- - install with optional T5 summarizer:
40
- about `946 MB`
41
-
42
- Vector payload lower bounds for stored turns, derived from embedding dimension:
43
-
44
- - MiniLM `384d`: `384 * 4 = 1536 bytes` per vector
45
- - Nomic `768d`: `768 * 4 = 3072 bytes` per vector
46
-
47
- Estimated lower-bound vector payload for `10,000` stored turns:
48
-
49
- - MiniLM: about `15.4 MB`
50
- - Nomic: about `30.7 MB`
51
-
52
- These are lower bounds for vector payload only. Actual on-disk LibraVDB usage is
53
- higher because text, metadata, collection structure, and index state are stored
54
- as well.
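The lower-bound arithmetic in the removed text above can be reproduced with a short sketch. It assumes the factor of `4` in `384 * 4` is the byte width of one float32 component and that megabytes are decimal (`1 MB = 1,000,000 bytes`); the function names are illustrative and do not come from the package.

```typescript
// Assumption: each embedding component is stored as a 4-byte float32.
const BYTES_PER_COMPONENT = 4;

// Lower bound for raw vector payload only; text, metadata, and index
// state are deliberately not counted.
function vectorPayloadBytes(dims: number, turns: number): number {
  return dims * BYTES_PER_COMPONENT * turns;
}

// Decimal megabytes, matching the rounded figures quoted in the text.
function toDecimalMB(bytes: number): number {
  return bytes / 1e6;
}

const miniLmMB = toDecimalMB(vectorPayloadBytes(384, 10_000)); // 15.36
const nomicMB = toDecimalMB(vectorPayloadBytes(768, 10_000)); // 30.72
```

Rounded to one decimal place, these are the `15.4 MB` and `30.7 MB` figures quoted for `10,000` stored turns.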
55
-
56
- ### Memory
57
-
58
- Measured locally on Apple M2, `2026-03-29`, by starting the daemon and reading
59
- RSS after startup:
60
-
61
- - idle RSS with Nomic embedding path loaded and no optional T5 summarizer:
62
- about `271,872 KB` (`~266 MB`)
63
- - idle RSS with Nomic plus local ONNX T5 summarizer loaded:
64
- about `515,312 KB` (`~503 MB`)
65
-
66
- Not yet bench-measured in the repo:
67
-
68
- - RSS during active inference
69
- - peak RSS during compaction of large clusters
70
-
71
- Current operational estimate:
72
-
73
- - embedding inference should remain close to the embed-only idle baseline plus
74
- transient ONNX workspace allocation
75
- - optional T5 provisioning roughly doubles steady-state RSS
76
-
77
- ### CPU
78
-
79
- Measured locally from the existing Go benchmark harness on Apple M2,
80
- `2026-03-29`:
81
-
82
- - MiniLM bundled query embedding: about `22.6 ms/op`
83
- - MiniLM onnx-local query embedding: about `16.3 ms/op`
84
- - Nomic onnx-local query embedding: about `43.7 ms/op`
85
-
86
- Measured locally from a one-off 40-query timing sample on Apple M2,
87
- `2026-03-29`:
88
-
89
- - Nomic query embedding `p50`: about `18.61 ms`
90
- - Nomic query embedding `p95`: about `24.19 ms`
91
-
92
- Measured locally from a one-off synthetic 50-turn compaction run using the
93
- current extractive summarizer and Nomic embeddings:
94
-
95
- - `50`-turn extractive compaction wall time: about `3175 ms`
96
-
97
- Not yet bench-measured in the repo:
98
-
99
- - equivalent Linux x64 embedding latency on a reference machine
100
- - `50`-turn compaction wall time through the optional ONNX T5 abstractive path
101
-
102
- ### Network
103
-
104
- Setup downloads are front-loaded. After installation, the plugin is local-first.
105
-
106
- Current setup assets:
107
-
108
- - Nomic model: about `522 MB`
109
- - T5-small encoder: about `135 MB`
110
- - T5-small decoder: about `222 MB`
111
- - ONNX Runtime macOS arm64 archive: about `9.5 MB`
112
-
113
- After install, the plugin makes no required network calls for embedding or
114
- extractive compaction. The only optional runtime network path is:
115
-
116
- - `summarizerBackend = "ollama-local"` or another custom summarizer endpoint
117
-
118
- ## Standard Install
119
-
120
- ### Fastest Path on macOS
31
+ Recommended macOS path:
121
32
 
122
33
  ```bash
123
34
  brew tap xDarkicex/homebrew-openclaw-libravdb-memory
@@ -126,85 +37,30 @@ brew services start libravdbd
126
37
  openclaw plugins install @xdarkicex/openclaw-memory-libravdb
127
38
  ```
128
39
 
129
- This is the preferred install flow for macOS users. It gives you a managed `libravdbd` service and a scanner-clean OpenClaw plugin package.
130
-
131
- ### Plugin Package
132
-
133
- ```bash
134
- openclaw plugins install @xdarkicex/openclaw-memory-libravdb
135
- ```
136
-
137
- The plugin package installs as compiled OpenClaw runtime code without daemon bootstrap hooks.
138
-
139
- ## Daemon Install
140
-
141
- Install and start `libravdbd` separately for the same user account that runs OpenClaw. The daemon owns the local DB engine and listens on a local endpoint.
142
-
143
- Default endpoints:
144
-
145
- - Homebrew on macOS: `unix:/opt/homebrew/var/clawdb/run/libravdb.sock`
146
- - macOS/Linux user-local installs: `unix:$HOME/.clawdb/run/libravdb.sock`
147
- - Windows: `tcp:127.0.0.1:37421`
148
-
149
- If you run the daemon on a different endpoint, set `plugins.configs.libravdb-memory.sidecarPath` in `~/.openclaw/openclaw.json`.
150
-
151
- ### Linux
152
-
153
- Recommended layout:
40
+ Manual Linux sketch:
154
41
 
155
42
  ```bash
156
43
  mkdir -p ~/.local/bin ~/.config/systemd/user
157
44
  curl -L -o ~/.local/bin/libravdbd <published-libravdbd-binary-url>
158
45
  chmod +x ~/.local/bin/libravdbd
159
- cp <published-libravdbd-service-template> ~/.config/systemd/user/libravdbd.service
46
+ curl -L -o ~/.config/systemd/user/libravdbd.service <published-libravdbd-service-template-url>
160
47
  systemctl --user enable --now libravdbd.service
48
+ openclaw plugins install @xdarkicex/openclaw-memory-libravdb
161
49
  ```
162
50
 
163
- Then verify:
164
-
165
- ```bash
166
- systemctl --user status libravdbd.service
167
- openclaw memory status
168
- ```
169
-
170
- ### Homebrew / macOS
171
-
172
- Homebrew users should normally install from the published tap:
173
-
174
- ```bash
175
- brew tap xDarkicex/homebrew-openclaw-libravdb-memory
176
- brew install libravdbd
177
- brew services start libravdbd
178
- ```
179
-
180
- With `sidecarPath: "auto"`, Homebrew installs on Apple Silicon should resolve to:
181
-
182
- ```text
183
- unix:/opt/homebrew/var/clawdb/run/libravdb.sock
184
- ```
185
-
186
- User-local installs still default to:
51
+ Windows uses a loopback TCP endpoint by default:
187
52
 
188
53
  ```text
189
- unix:$HOME/.clawdb/run/libravdb.sock
54
+ tcp:127.0.0.1:37421
190
55
  ```
191
56
 
192
- The daemon release pipeline generates a publish-ready `libravdbd.rb` formula asset for release assets named:
193
-
194
- - `libravdbd-darwin-arm64`
195
- - `libravdbd-darwin-amd64`
196
- - `libravdbd-linux-amd64`
197
- - `libravdbd-linux-arm64`
57
+ This repository does not yet include a full Windows service-install walkthrough.
58
+ Use the published Windows daemon asset under your preferred process supervisor
59
+ or run `libravdbd serve` in a terminal for validation.
198
60
 
199
- The generated Homebrew formula also stages the bundled ONNX Runtime archive, the shipped embedding profile assets, and the T5 summarizer bundle into the install prefix so the daemon can start without a separate manual asset unpack step.
200
-
201
- If your GitHub Actions configuration includes:
202
-
203
- - public tap repository `xDarkicex/homebrew-openclaw-libravdb-memory`
204
-
205
- then tagged releases also push the generated formula into `Formula/libravdbd.rb` in that tap repository automatically.
61
+ ## Activation
206
62
 
207
- Example plugin config:
63
+ Assign `libravdb-memory` to both OpenClaw slots:
208
64
 
209
65
  ```json
210
66
  {
@@ -212,29 +68,15 @@ Example plugin config:
212
68
  "slots": {
213
69
  "memory": "libravdb-memory",
214
70
  "contextEngine": "libravdb-memory"
215
- },
216
- "configs": {
217
- "libravdb-memory": {
218
- "sidecarPath": "unix:/Users/<you>/.clawdb/run/libravdb.sock"
219
- }
220
71
  }
221
72
  }
222
73
  }
223
74
  ```
224
75
 
225
- ## Expected Install Shape
76
+ Treat partial assignment as a misconfiguration. This plugin is designed to own
77
+ memory prompt injection and the context-engine lifecycle together.
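A partial assignment is easy to miss by eye, so a small check can help. The sketch below is a hypothetical helper, not part of the plugin: it assumes `slots` lives under a top-level `plugins` key (inferred from the `plugins.configs.libravdb-memory.sidecarPath` path) and models only the fields shown in the JSON above.

```typescript
// Hypothetical, minimal view of ~/.openclaw/openclaw.json.
// Assumption: "slots" sits under a top-level "plugins" key.
interface OpenClawConfig {
  plugins?: {
    slots?: { memory?: string; contextEngine?: string };
  };
}

const PLUGIN_ID = "libravdb-memory";

type SlotCheck = "ok" | "partial" | "unassigned";

// Counts how many of the two slots point at this plugin.
function checkSlots(config: OpenClawConfig): SlotCheck {
  const slots = config.plugins?.slots ?? {};
  const owned = [slots.memory, slots.contextEngine].filter(
    (id) => id === PLUGIN_ID,
  ).length;
  if (owned === 2) return "ok";
  return owned === 1 ? "partial" : "unassigned";
}
```

A `"partial"` result corresponds to the misconfiguration described above; `"unassigned"` means the plugin owns neither slot.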
226
78
 
227
- Expected successful plugin install shape:
228
-
229
- ```text
230
- Installed plugin: libravdb-memory
231
- ```
232
-
233
- ## Activation
234
-
235
- The manifest declares `kind: "context-engine"` and the runtime registers the memory prompt and memory runtime surfaces in code. It is still intended to own both the `memory` and `contextEngine` slots together. Treat partial slot assignment as a misconfiguration.
236
-
237
- Add this to `~/.openclaw/openclaw.json`:
79
+ If the daemon uses a non-default endpoint, add `sidecarPath`:
238
80
 
239
81
  ```json
240
82
  {
@@ -242,22 +84,37 @@ Add this to `~/.openclaw/openclaw.json`:
242
84
  "slots": {
243
85
  "memory": "libravdb-memory",
244
86
  "contextEngine": "libravdb-memory"
87
+ },
88
+ "configs": {
89
+ "libravdb-memory": {
90
+ "sidecarPath": "unix:/Users/<you>/.clawdb/run/libravdb.sock"
91
+ }
245
92
  }
246
93
  }
247
94
  }
248
95
  ```
249
96
 
250
- Notes:
97
+ When `sidecarPath` is `"auto"`, macOS/Linux endpoint resolution checks:
98
+
99
+ 1. `LIBRAVDB_RPC_ENDPOINT`
100
+ 2. `$HOME/.clawdb/run/libravdb.sock`
101
+ 3. `/opt/homebrew/var/clawdb/run/libravdb.sock`
102
+ 4. `/usr/local/var/clawdb/run/libravdb.sock`
103
+ 5. fallback to `$HOME/.clawdb/run/libravdb.sock`
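The resolution order above can be sketched as a pure function. This is an illustration under stated assumptions, not the plugin's implementation: it assumes "checks" means testing whether the socket path exists, that `LIBRAVDB_RPC_ENDPOINT` already holds a complete endpoint string, and that resolved paths take the `unix:` prefix used by the documented default endpoints. The environment, home directory, and existence test are passed in as parameters so the sketch stays self-contained.

```typescript
// Hypothetical helper; the parameter names are illustrative.
function resolveAutoEndpoint(
  env: Record<string, string | undefined>,
  home: string,
  socketExists: (path: string) => boolean,
): string {
  // 1. An explicit environment override wins (assumed to be a full endpoint).
  const fromEnv = env["LIBRAVDB_RPC_ENDPOINT"];
  if (fromEnv) return fromEnv;

  // 2-4. Probe the known socket locations in the documented order.
  const userLocal = `${home}/.clawdb/run/libravdb.sock`;
  const candidates = [
    userLocal,
    "/opt/homebrew/var/clawdb/run/libravdb.sock",
    "/usr/local/var/clawdb/run/libravdb.sock",
  ];
  for (const path of candidates) {
    if (socketExists(path)) return `unix:${path}`;
  }

  // 5. Nothing found: fall back to the user-local default.
  return `unix:${userLocal}`;
}
```

In a real process the last argument might wrap a filesystem existence check; injecting it here keeps the ordering logic testable without touching disk.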
251
104
 
252
- - This plugin should own both `memory` and `contextEngine`. Do not assign only one of them.
253
- - The plugin id is `libravdb-memory`. The npm package name used at install time is `@xdarkicex/openclaw-memory-libravdb`.
254
- - On newer OpenClaw versions, the plugin also registers a memory runtime bridge so the built-in `memory_search` tool can query libraVDB through the same sidecar-backed retrieval path.
255
- - On newer OpenClaw versions, the plugin also listens for `before_reset` and `session_end` so it can send best-effort lifecycle hints into the sidecar.
256
- - Those hints are journaled internally by the sidecar and can be inspected with `openclaw memory journal` without exposing them to normal memory export or recall.
257
- - The journal keeps only a bounded number of newest entries. Override that cap with `plugins.configs.libravdb-memory.lifecycleJournalMaxEntries` if you need a different retention window.
258
- - The plugin does not currently register `registerMemoryFlushPlan`; transcript ingest and compaction remain owned by the context-engine lifecycle and the sidecar.
105
+ ## Default Paths
259
106
 
260
- Without a slot entry, OpenClaw's default memory can continue to run in parallel.
107
+ | Platform | Default endpoint |
108
+ |---|---|
109
+ | macOS/Linux user-local | `unix:$HOME/.clawdb/run/libravdb.sock` |
110
+ | macOS Homebrew Apple Silicon | `unix:/opt/homebrew/var/clawdb/run/libravdb.sock` |
111
+ | Windows | `tcp:127.0.0.1:37421` |
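The endpoint strings in this table follow two visible shapes, `unix:<path>` and `tcp:<host>:<port>`. The parser below is a hedged sketch inferred only from those examples; the `Endpoint` type and `parseEndpoint` name are hypothetical, and the plugin's real transport layer may accept other forms.

```typescript
// Hypothetical discriminated union for the two endpoint shapes shown above.
type Endpoint =
  | { kind: "unix"; path: string }
  | { kind: "tcp"; host: string; port: number };

function parseEndpoint(raw: string): Endpoint {
  if (raw.startsWith("unix:")) {
    const path = raw.slice("unix:".length);
    if (path.length === 0) throw new Error(`empty socket path: ${raw}`);
    return { kind: "unix", path };
  }
  if (raw.startsWith("tcp:")) {
    const rest = raw.slice("tcp:".length);
    // Split on the last colon so the host part stays intact.
    const sep = rest.lastIndexOf(":");
    const host = rest.slice(0, sep);
    const port = Number(rest.slice(sep + 1));
    if (sep <= 0 || !Number.isInteger(port) || port < 1 || port > 65535) {
      throw new Error(`invalid tcp endpoint: ${raw}`);
    }
    return { kind: "tcp", host, port };
  }
  throw new Error(`unsupported endpoint scheme: ${raw}`);
}
```

Rejecting unknown schemes early turns a mistyped `sidecarPath` into a clear error instead of a connection timeout.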
112
+
113
+ Default data path:
114
+
115
+ ```text
116
+ $HOME/.clawdb/data.libravdb
117
+ ```
261
118
 
262
119
  ## Verification
263
120
 
@@ -274,6 +131,7 @@ Expected output shape:
274
131
  │ Sidecar │ running │
275
132
  │ Turns stored │ 0 │
276
133
  │ Memories stored │ 0 │
134
+ │ Lifecycle hints │ 0 │
277
135
  │ Gate threshold │ 0.35 │
278
136
  │ Abstractive model │ ready | not provisioned │
279
137
  │ Embedding profile │ all-minilm-l6-v2 │
@@ -283,51 +141,10 @@ Expected output shape:
283
141
 
284
142
  Interpretation:
285
143
 
286
- - `Sidecar=running` means the local `libravdbd` daemon answered JSON-RPC `health`.
287
- - `Gate threshold=0.35` confirms the default gating scalar boundary is active.
288
- - `Abstractive model=not provisioned` is acceptable. The system degrades to extractive compaction.
289
-
290
- ## Contributor Install
291
-
292
- For contributors working from a clone:
293
-
294
- ```bash
295
- pnpm check
296
- bash scripts/build-daemon.sh
297
- ```
298
-
299
- This prepares a local daemon binary in `.daemon-bin/libravdbd` (or `.exe` on Windows) and copies any locally available model/runtime assets there for testing.
300
-
301
- Contributor default:
302
-
303
- - install `libravdbd` separately with Homebrew or release assets, then run `bash scripts/build-daemon.sh`
304
-
305
- Private local daemon development:
306
-
307
- - set `LIBRAVDBD_SOURCE_DIR=/path/to/libravdbd` to build from your local daemon repo
308
- - or set `LIBRAVDBD_BINARY_PATH=/path/to/libravdbd` to use a prebuilt local daemon binary
309
-
310
- ## User-Service Templates
311
-
312
- Published daemon installs include matching user-service templates:
313
-
314
- - Linux user service: `libravdbd.service`
315
- - macOS LaunchAgent: `com.xdarkicex.libravdbd.plist`
316
-
317
- Linux example:
318
-
319
- ```bash
320
- mkdir -p ~/.config/systemd/user
321
- cp <published-libravdbd-service-template> ~/.config/systemd/user/libravdbd.service
322
- systemctl --user enable --now libravdbd.service
323
- ```
324
-
325
- macOS example:
326
-
327
- 1. Copy the published `com.xdarkicex.libravdbd.plist`
328
- 2. Replace `__LIBRAVDBD_PATH__` and `__HOME__`
329
- 3. Save it to `~/Library/LaunchAgents/com.xdarkicex.libravdbd.plist`
330
- 4. Load it with `launchctl load ~/Library/LaunchAgents/com.xdarkicex.libravdbd.plist`
144
+ - `Sidecar=running` means the daemon answered the health check.
145
+ - `Gate threshold=0.35` confirms the default durable-memory gate.
146
+ - `Abstractive model=not provisioned` is acceptable; compaction falls back to
147
+ the extractive path.
331
148
 
332
149
  ## Troubleshooting
333
150
 
@@ -335,47 +152,37 @@ macOS example:
335
152
 
336
153
  Common causes:
337
154
 
338
- - ONNX Runtime library missing or unpacked in the wrong place
339
- - downloaded model file hash mismatch
340
- - `libravdbd` not started for the current user
341
- - plugin pointed at the wrong endpoint
155
+ - `libravdbd` is not running for the same user account as OpenClaw
156
+ - `sidecarPath` points at the wrong endpoint
157
+ - ONNX Runtime assets are missing or unpacked in the wrong place
158
+ - a model asset failed checksum validation
342
159
 
343
- Check:
160
+ Check the daemon first:
344
161
 
345
162
  ```bash
346
163
  openclaw memory status
164
+ brew services restart libravdbd
347
165
  ```
348
166
 
349
- If the daemon is down, start it and verify the configured endpoint:
350
-
351
- ```bash
352
- brew services start libravdbd
353
- ```
354
-
355
- Or, without Homebrew:
167
+ For foreground debugging:
356
168
 
357
169
  ```bash
358
170
  libravdbd serve
359
171
  ```
360
172
 
361
- On macOS/Linux, the default endpoint is `unix:$HOME/.clawdb/run/libravdb.sock`. On Windows, the default endpoint is `tcp:127.0.0.1:37421`.
362
-
363
173
  ### Hash mismatch
364
174
 
365
- Hash mismatch means one of:
366
-
367
- - the daemon asset is corrupt
368
- - the local cache is stale
369
- - the expected checksum is wrong
370
-
371
- Do not bypass this. Delete the asset and rerun setup, or republish the release with corrected checksums.
175
+ Do not bypass a checksum mismatch. Delete the corrupt or stale asset and rerun
176
+ setup, or republish the release with corrected checksums.
372
177
 
373
- ### Windows behavior
178
+ ### Default memory still appears active
374
179
 
375
- On Windows the daemon uses a loopback TCP endpoint instead of a Unix socket. This is expected. The plugin’s transport layer already handles the fallback.
180
+ Confirm that `libravdb-memory` is assigned to both `memory` and
181
+ `contextEngine`. Without both slot entries, OpenClaw's default memory path can
182
+ continue to run in parallel.
376
183
 
377
- ### Published daemon requirement
184
+ ### Lifecycle journal looks empty
378
185
 
379
- The daemon must come from a published `libravdbd` binary for the current platform.
380
- If that download or checksum verification fails, setup stops instead of falling
381
- back to a local `go build`.
186
+ The sidecar journal only records advisory lifecycle hints such as `before_reset`
187
+ and `session_end`. It is bounded by `lifecycleJournalMaxEntries`, default `500`,
188
+ and is not part of normal memory recall.
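A bounded journal of this kind can be pictured as a list that drops its oldest entries once the cap is exceeded. The class below is a conceptual sketch only: the sidecar's actual storage is not shown in this document, and `BoundedJournal` is a hypothetical name. The default of `500` mirrors the documented `lifecycleJournalMaxEntries` default.

```typescript
// Conceptual sketch: keeps only the newest `maxEntries` items.
class BoundedJournal<T> {
  private entries: T[] = [];
  private readonly maxEntries: number;

  constructor(maxEntries: number = 500) {
    // Guard against a zero or negative cap.
    this.maxEntries = Math.max(1, Math.floor(maxEntries));
  }

  append(entry: T): void {
    this.entries.push(entry);
    const overflow = this.entries.length - this.maxEntries;
    // Drop the oldest entries first once the cap is exceeded.
    if (overflow > 0) this.entries.splice(0, overflow);
  }

  list(): readonly T[] {
    return this.entries;
  }
}
```

Under this model a journal that "looks empty" simply means no `before_reset` or `session_end` hint has been recorded yet, or that older hints have aged out past the cap.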
package/docs/models.md CHANGED
@@ -1,63 +1,54 @@
1
1
  # Model Strategy
2
2
 
3
- ## Why ONNX Over Ollama
3
+ The plugin uses local ONNX-first inference for embeddings and optional
4
+ abstractive summarization. That keeps prompt assembly local, predictable, and
5
+ available offline after assets are installed.
4
6
 
5
- The plugin uses ONNX-first local inference for embedding and optional abstractive summarization.
7
+ ## Why ONNX Over Ollama For The Critical Path
6
8
 
7
- ### Latency
9
+ `assemble` runs before each response build. An embedding request that crosses a
10
+ process and HTTP server boundary adds avoidable tail latency. Local ONNX
11
+ inference inside the sidecar keeps retrieval close to the database and avoids a
12
+ runtime dependency on a separate model server.
8
13
 
9
- `assemble` is on the critical path before every response build. An embedding request that crosses process and HTTP boundaries adds avoidable tail latency. Local ONNX inference inside the sidecar keeps the retrieval path in the low-millisecond range on the target hardware profile.
10
- `assemble` is on the critical path before every response build. An embedding
11
- request that crosses process and HTTP boundaries adds avoidable tail latency.
12
- Local ONNX inference inside the sidecar keeps the retrieval path local and
13
- predictable. On the current Apple M2 development machine, the repository's own
14
- benchmark harness measures roughly `16-23 ms/op` for MiniLM query embeddings and
15
- about `44 ms/op` for Nomic in the steady-state Go benchmark path.
14
+ ONNX assets can be provisioned once and reused without network access. Given
15
+ fixed weights and input, embeddings are deterministic enough for stable
16
+ similarity ordering and reproducible retrieval behavior.
16
17
 
17
- ### Offline Operation
18
+ The trade-off is artifact size. This project accepts that cost because local
19
+ latency and offline operation are part of the product contract.
18
20
 
19
- The plugin is designed to be local-first. Requiring a running Ollama server would break that guarantee. ONNX assets can be provisioned once and reused without network or daemon availability.
21
+ ## Default And Optional Embedding Profiles
20
22
 
21
- ### Determinism
23
+ The current safe default profile is `all-minilm-l6-v2`.
22
24
 
23
- ONNX inference is deterministic given fixed weights and input. Deterministic embeddings give stable similarity ordering and reproducible retrieval behavior.
25
+ MiniLM is the default because it keeps local retrieval within the target memory
26
+ envelope on macOS and is less fragile with ONNX Runtime execution than larger
27
+ profiles.
24
28
 
25
- ### Binary Size Trade-Off
29
+ `nomic-embed-text-v1.5` remains available as an explicit opt-in profile for
30
+ long-context retrieval experiments. Nomic's Matryoshka training makes
31
+ `64d -> 256d -> 768d` tiering principled rather than arbitrary truncation, but
32
+ its larger footprint makes it a less conservative default.
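To make the `64d -> 256d -> 768d` tiering concrete, the sketch below shows the general Matryoshka idea: score candidates on a short prefix of each embedding first, then rescore the survivors on longer prefixes. It is an illustration, not the sidecar's retrieval code. The function names, the per-tier `keep` counts, and the use of cosine similarity over re-normalized prefixes are all assumptions.

```typescript
const TIERS = [64, 256, 768];

// Keep the first `dims` components and rescale to unit length.
function truncateAndNormalize(vec: number[], dims: number): number[] {
  const head = vec.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
}

// Returns document indices, best first, after narrowing tier by tier.
// `keep` holds hypothetical survivor counts for each tier.
function tieredSearch(
  query: number[],
  docs: number[][],
  keep: number[] = [50, 10, 3],
): number[] {
  let candidates = docs.map((vec, index) => ({ index, vec }));
  TIERS.forEach((dims, tier) => {
    const q = truncateAndNormalize(query, dims);
    candidates = candidates
      .map((c) => ({ c, score: dot(q, truncateAndNormalize(c.vec, dims)) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, keep[tier])
      .map((scored) => scored.c);
  });
  return candidates.map((c) => c.index);
}
```

The cheap 64-dimension pass does most of the pruning, so the full 768-dimension comparison only runs on a handful of candidates. This is a sound shortcut only because Matryoshka-trained models concentrate useful signal in the leading dimensions; truncating an ordinary embedding the same way would be arbitrary.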
26
33
 
27
- Local models increase the artifact footprint. That is an explicit trade-off accepted by the architecture because predictable latency and offline operation are more important for this plugin than minimal package size.
34
+ For exact profile metadata, read [Embedding profiles](./embedding-profiles.md).
28
35
 
29
- ## Why `nomic-embed-text-v1.5`
36
+ ## Summarization
30
37
 
31
- This is the default embedding profile because it earned the role on two axes:
38
+ Compaction can run without an abstractive summarizer. When the optional T5-small
39
+ assets are not provisioned, the daemon degrades to the extractive path.
32
40
 
33
- - long-context document support
34
- - Matryoshka structure for tiered retrieval
41
+ T5-small is the optional local abstractive summarizer because it is small enough
42
+ for CPU-local operation while still useful for session-cluster summaries. Larger
43
+ generative models would increase latency and operational complexity.
35
44
 
36
- The model’s Matryoshka training is what makes the `64d -> 256d -> 768d` cascade principled rather than arbitrary truncation.
45
+ ## Model Roles
37
46
 
38
- ## Why `all-minilm-l6-v2` Still Exists
47
+ | Model/profile | Role |
48
+ |---|---|
49
+ | `all-minilm-l6-v2` | Default lightweight embedding profile. |
50
+ | `nomic-embed-text-v1.5` | Opt-in long-context embedding profile. |
51
+ | T5-small | Optional local abstractive compaction summarizer. |
39
52
 
40
- MiniLM remains the lightweight fallback profile. It is useful when:
41
-
42
- - the full Nomic profile is unavailable
43
- - a smaller bundled footprint matters more than long-context or Matryoshka behavior
44
-
45
- It is no longer the quality-first default.
46
-
47
- ## Why T5-small for Summarization
48
-
49
- The abstractive summarization path is optional and must remain CPU-feasible on local machines. T5-small fits that constraint better than larger generative models:
50
-
51
- - small enough to run locally
52
- - expressive enough for session-cluster summarization
53
- - does not require a remote server
54
-
55
- The plugin still degrades gracefully to extractive compaction when the T5 assets are not provisioned.
56
-
57
- ## Model Roles in the System
58
-
59
- - Nomic embedder: quality-first retrieval path, Matryoshka tiers
60
- - MiniLM: fallback embedder
61
- - T5-small: optional higher-quality compaction summarizer
62
-
63
- The model strategy is therefore not “use ONNX everywhere because ONNX is fashionable.” It is “use ONNX where local deterministic inference is part of the product contract.”
53
+ External summarizer endpoints, such as Ollama, are optional. They are not part
54
+ of the required retrieval path.