@aionis/substrate 0.1.3 → 0.1.5

This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.
Files changed (52)
  1. package/CHANGELOG.md +21 -0
  2. package/README.md +120 -13
  3. package/dist/candidate-index.d.ts +25 -0
  4. package/dist/candidate-index.d.ts.map +1 -0
  5. package/dist/candidate-index.js +217 -0
  6. package/dist/candidate-index.js.map +1 -0
  7. package/dist/cli.d.ts.map +1 -1
  8. package/dist/cli.js +160 -2
  9. package/dist/cli.js.map +1 -1
  10. package/dist/embedding-projection.d.ts +16 -0
  11. package/dist/embedding-projection.d.ts.map +1 -0
  12. package/dist/embedding-projection.js +101 -0
  13. package/dist/embedding-projection.js.map +1 -0
  14. package/dist/file-substrate.d.ts +3 -0
  15. package/dist/file-substrate.d.ts.map +1 -1
  16. package/dist/file-substrate.js +47 -1
  17. package/dist/file-substrate.js.map +1 -1
  18. package/dist/index.d.ts +4 -0
  19. package/dist/index.d.ts.map +1 -1
  20. package/dist/index.js +4 -0
  21. package/dist/index.js.map +1 -1
  22. package/dist/runtime-live-sidecar.d.ts +79 -0
  23. package/dist/runtime-live-sidecar.d.ts.map +1 -0
  24. package/dist/runtime-live-sidecar.js +382 -0
  25. package/dist/runtime-live-sidecar.js.map +1 -0
  26. package/dist/search.d.ts +11 -2
  27. package/dist/search.d.ts.map +1 -1
  28. package/dist/search.js +78 -7
  29. package/dist/search.js.map +1 -1
  30. package/dist/sqlite-substrate.d.ts +3 -0
  31. package/dist/sqlite-substrate.d.ts.map +1 -1
  32. package/dist/sqlite-substrate.js +61 -6
  33. package/dist/sqlite-substrate.js.map +1 -1
  34. package/dist/types.d.ts +3 -0
  35. package/dist/types.d.ts.map +1 -1
  36. package/dist/zvec-candidate-index.d.ts +42 -0
  37. package/dist/zvec-candidate-index.d.ts.map +1 -0
  38. package/dist/zvec-candidate-index.js +463 -0
  39. package/dist/zvec-candidate-index.js.map +1 -0
  40. package/docs/ADAPTER_CONTRACT.md +12 -2
  41. package/docs/API_USAGE.md +135 -0
  42. package/docs/CLI.md +76 -1
  43. package/docs/RUNTIME_DUAL_WRITE_EXPERIMENT.md +14 -0
  44. package/docs/RUNTIME_LIVE_SIDECAR.md +142 -0
  45. package/docs/RUNTIME_SNAPSHOT_IMPORT.md +3 -0
  46. package/docs/RUNTIME_ZVEC_CANDIDATE_INDEX.md +80 -0
  47. package/docs/STORE_CONTRACT.md +15 -0
  48. package/docs/V0_2_ROADMAP.md +15 -3
  49. package/docs/ZVEC_PROVIDER_EMBEDDING_EVAL.md +216 -0
  50. package/docs/ZVEC_SCALE_MAINTENANCE.md +89 -0
  51. package/examples/live-sidecar/index.mjs +189 -0
  52. package/package.json +17 -2
package/docs/CLI.md CHANGED
@@ -6,9 +6,11 @@ It is intentionally narrow:
6
6
 
7
7
  - it inspects, previews, backs up, restores, and compacts Substrate stores;
8
8
  - it imports Runtime Lite SQLite snapshots into separate Substrate stores;
9
+ - it incrementally mirrors Runtime Lite evidence into separate Substrate stores with an explicit checkpoint;
9
10
  - it runs read-only checks over existing Runtime evidence;
10
11
  - it writes reports to local files;
11
- - it does not start Aionis Runtime unless you explicitly use the separate repository script `check:runtime-dual-write`;
12
+ - it does not start Aionis Runtime unless you explicitly use a repository validation script such as
13
+ `check:runtime-dual-write` or `check:runtime-product-bridge`;
12
14
  - it does not mutate Runtime source code or replace Runtime storage.
13
15
 
14
16
  ## Install
@@ -123,6 +125,56 @@ The JSON output includes imported/skipped counts plus structured `diagnostics.so
123
125
  `diagnostics.skipReasons`, and `diagnostics.jsonIssues` so bridge failures can be classified
124
126
  without scraping warning strings.
125
127
 
128
+ ## Runtime Live Sidecar
129
+
130
+ Use `live-sidecar` to keep a separate Substrate store in sync with Runtime Lite evidence without replaying unchanged rows:
131
+
132
+ ```bash
133
+ npx aionis-substrate live-sidecar \
134
+ --source /path/to/aionis-runtime-lite.sqlite \
135
+ --target ./substrate.sqlite \
136
+ --adapter sqlite \
137
+ --checkpoint ./runtime-live-checkpoint.json \
138
+ --scope repo-a
139
+ ```
140
+
141
+ The Runtime source is opened read-only. The target is a Substrate store owned by this command.
142
+ The checkpoint file records stable fingerprints for mapped Runtime nodes, relations, feedback,
143
+ and decisions. Re-running the command applies only new or changed evidence.
144
+
145
+ Use `--dry-run` to inspect the apply plan without writing the target or checkpoint:
146
+
147
+ ```bash
148
+ npx aionis-substrate live-sidecar \
149
+ --source /path/to/aionis-runtime-lite.sqlite \
150
+ --target ./substrate.sqlite \
151
+ --adapter sqlite \
152
+ --checkpoint ./runtime-live-checkpoint.json \
153
+ --scope repo-a \
154
+ --dry-run
155
+ ```
156
+
157
+ Use `--watch` for a bounded polling loop with a single-instance lock:
158
+
159
+ ```bash
160
+ npx aionis-substrate live-sidecar \
161
+ --source /path/to/aionis-runtime-lite.sqlite \
162
+ --target ./substrate.sqlite \
163
+ --adapter sqlite \
164
+ --checkpoint ./runtime-live-checkpoint.json \
165
+ --scope repo-a \
166
+ --watch \
167
+ --iterations 20 \
168
+ --interval-ms 5000
169
+ ```
170
+
171
+ The default lock path is `<checkpoint>.lock`. Override it with `--lock <path>`.
172
+ Use `--no-lock` only for controlled tests. The watch report contract is
173
+ `aionis_runtime_live_sidecar_watch_report_v1`.
174
+
175
+ The report contract is `aionis_runtime_live_sidecar_report_v1`. Read `import_summary` as source coverage
176
+ and `apply_summary` as the checkpointed sidecar result.
177
+
126
178
  ## Sidecar Check
127
179
 
128
180
  Use `sidecar` when you already have Runtime Lite SQLite evidence and want to check whether Substrate can mirror the governed context surface from outside the Runtime boundary.
@@ -163,6 +215,25 @@ npx aionis-substrate sidecar \
163
215
 
164
216
  The report contract is `aionis_runtime_sidecar_check_report_v1`.
165
217
 
218
+ ## Runtime Product Bridge Gate
219
+
220
+ The package CLI is for store operations and sidecar sync. In this repository,
221
+ use `check:runtime-product-bridge` when you need the full product bridge gate
222
+ against a focused Runtime checkout:
223
+
224
+ ```bash
225
+ npm run check:runtime-product-bridge -- \
226
+ --runtime-root /path/to/AionisRuntime-focused
227
+ ```
228
+
229
+ The gate starts focused Runtime with isolated Lite SQLite paths, runs real
230
+ `observe -> guide -> feedback -> measure`, writes the same observed evidence
231
+ into an external Substrate store, verifies reopen parity, runs chain probes,
232
+ mirrors Runtime Lite SQLite through read-only `live-sidecar`, verifies
233
+ checkpoint idempotency, and compares mirrored Substrate `previewContext` buckets
234
+ against Runtime guide surfaces. The top-level report is
235
+ `product-bridge-gate-summary.json`.
236
+
166
237
  ## What Passing Means
167
238
 
168
239
  Passing snapshot parity means one Runtime SQLite scope can be imported into an isolated Substrate store and compiled into matching governed buckets.
@@ -181,6 +252,10 @@ The reference files were scanned, but none of their memory ids overlap the disco
181
252
 
182
253
  The imported Substrate buckets do not match the supplied Runtime guide/measure surface. Inspect the generated `summary.json` before changing code; the mismatch may be a scope, reference, or fixture problem.
183
254
 
255
+ `checkpoint ignored because target store is empty`
256
+
257
+ The live sidecar found an existing checkpoint, but the target store had no events. It replayed the Runtime snapshot into the target so the checkpoint cannot hide a missing or newly created target.
258
+
184
259
  `ExperimentalWarning: SQLite is an experimental feature`
185
260
 
186
261
  Node 24 currently marks `node:sqlite` as experimental. This is expected for the current embedded SQLite adapter.
package/docs/RUNTIME_DUAL_WRITE_EXPERIMENT.md CHANGED
@@ -39,6 +39,20 @@ The compared surfaces are:
39
39
 
40
40
  ## Command
41
41
 
42
+ Product bridge gate:
43
+
44
+ ```bash
45
+ npm run check:runtime-product-bridge -- \
46
+ --runtime-root /Volumes/ziel/AionisRuntime-focused
47
+ ```
48
+
49
+ This is the full validation path for a product bridge: real focused Runtime
50
+ calls, external Substrate dual-write parity, reopen parity, chain probes,
51
+ read-only `live-sidecar` mirroring, repeated sidecar idempotency, and mirrored
52
+ `previewContext` parity.
53
+
54
+ Lower-level dual-write experiment:
55
+
42
56
  ```bash
43
57
  npm run check:runtime-dual-write -- \
44
58
  --runtime-root /Volumes/ziel/AionisRuntime-focused \
package/docs/RUNTIME_LIVE_SIDECAR.md ADDED
@@ -0,0 +1,142 @@
1
+ # Runtime Live Sidecar
2
+
3
+ The Runtime live sidecar is an external, one-way bridge from an Aionis Runtime Lite SQLite database into an independent Aionis Substrate store.
4
+
5
+ It exists for a narrow productization step:
6
+
7
+ - Runtime remains the source of execution memory writes.
8
+ - Substrate mirrors Runtime evidence into a durable substrate store.
9
+ - A checkpoint prevents replaying unchanged Runtime rows on every poll.
10
+ - No Runtime table, source file, policy, or guide path is mutated.
11
+
12
+ This is not a replacement for Runtime policy. It is a sidecar substrate sync primitive.
13
+
14
+ ## Command
15
+
16
+ ```bash
17
+ npx aionis-substrate live-sidecar \
18
+ --source /path/to/aionis-runtime-lite.sqlite \
19
+ --target ./substrate.sqlite \
20
+ --adapter sqlite \
21
+ --checkpoint ./runtime-live-checkpoint.json \
22
+ --scope repo-a
23
+ ```
24
+
25
+ Run the command repeatedly from a scheduler or host process. Each run:
26
+
27
+ 1. opens the Runtime SQLite source read-only;
28
+ 2. maps Runtime memory nodes, relations, feedback, and decisions through the snapshot importer;
29
+ 3. fingerprints each mapped Substrate write;
30
+ 4. writes only new or changed evidence into the target store;
31
+ 5. atomically updates the checkpoint file.
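The fingerprint-and-skip steps above can be sketched as follows. This is a minimal illustration, not the package's implementation: `MappedWrite`, `FingerprintMap`, `stableStringify`, `fingerprint`, and `planApply` are hypothetical names, and the FNV-1a hash stands in for whatever stable fingerprint the sidecar actually uses.

```typescript
// Hypothetical shapes for illustration only; the package's real types may differ.
type MappedWrite = { id: string; payload: unknown };
type FingerprintMap = Map<string, string>; // object id -> fingerprint

// Serialize with sorted object keys so equal content always yields equal text.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return "[" + value.map((item) => stableStringify(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const parts = Object.keys(record)
      .sort()
      .map((key) => JSON.stringify(key) + ":" + stableStringify(record[key]));
    return "{" + parts.join(",") + "}";
  }
  return JSON.stringify(value);
}

// 32-bit FNV-1a over the stable text (assumption: a real sidecar would
// more likely use a cryptographic hash).
function fingerprint(write: MappedWrite): string {
  const text = stableStringify(write.payload);
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Compare each mapped write against the checkpoint and keep only
// new or changed objects; return the checkpoint to persist afterwards.
function planApply(
  writes: MappedWrite[],
  checkpoint: FingerprintMap,
): { toApply: MappedWrite[]; unchanged: number; next: FingerprintMap } {
  const toApply: MappedWrite[] = [];
  const next: FingerprintMap = new Map(checkpoint);
  let unchanged = 0;
  for (const write of writes) {
    const current = fingerprint(write);
    if (checkpoint.get(write.id) === current) {
      unchanged += 1;
      continue;
    }
    toApply.push(write);
    next.set(write.id, current);
  }
  return { toApply, unchanged, next };
}
```

In this sketch, a second run over identical rows yields an empty `toApply` list, which mirrors the documented behavior that re-running the command applies only new or changed evidence.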
32
+
33
+ Use `--adapter file` when the target is a file-backed Substrate directory.
34
+
35
+ Use `--dry-run` to report what would be applied without writing the target or checkpoint:
36
+
37
+ ```bash
38
+ npx aionis-substrate live-sidecar \
39
+ --source /path/to/aionis-runtime-lite.sqlite \
40
+ --target ./substrate.sqlite \
41
+ --adapter sqlite \
42
+ --checkpoint ./runtime-live-checkpoint.json \
43
+ --scope repo-a \
44
+ --dry-run
45
+ ```
46
+
47
+ ## Bounded Watch Loop
48
+
49
+ Use `--watch` when a host wants Substrate to poll Runtime Lite repeatedly in one process:
50
+
51
+ ```bash
52
+ npx aionis-substrate live-sidecar \
53
+ --source /path/to/aionis-runtime-lite.sqlite \
54
+ --target ./substrate.sqlite \
55
+ --adapter sqlite \
56
+ --checkpoint ./runtime-live-checkpoint.json \
57
+ --scope repo-a \
58
+ --watch \
59
+ --iterations 20 \
60
+ --interval-ms 5000
61
+ ```
62
+
63
+ `--watch` is intentionally bounded in this package version. The host chooses the number
64
+ of iterations and can restart the command through cron, launchd, systemd, a supervisor,
65
+ or its own Agent host process.
66
+
67
+ The watch command creates a single-instance lock by default at:
68
+
69
+ ```text
70
+ <checkpoint>.lock
71
+ ```
72
+
73
+ Override it with `--lock <path>`. Use `--no-lock` only for controlled tests.
74
+
75
+ The watch report contract is `aionis_runtime_live_sidecar_watch_report_v1`.
76
+ It contains:
77
+
78
+ - `reports`: every per-iteration `aionis_runtime_live_sidecar_report_v1`;
79
+ - `apply_summary`: aggregate attempted/applied/unchanged counts across all iterations;
80
+ - `lock_path`: the lock used for this run;
81
+ - `iterations_requested` and `iterations_completed`.
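A rough sketch of how a bounded loop could assemble the watch report fields listed above. The top-level field names come from the list; the flat `attempted`/`applied`/`unchanged` counters, the `runOnce` callback, and `runBoundedWatch` are assumptions for illustration, and locking and the polling interval are deliberately left out.

```typescript
// Assumed flat counters; the real per-iteration summary may be nested per object kind.
type ApplyCounts = { attempted: number; applied: number; unchanged: number };
type IterationReport = { apply_summary: ApplyCounts };
type WatchReport = {
  reports: IterationReport[];
  apply_summary: ApplyCounts;
  lock_path: string;
  iterations_requested: number;
  iterations_completed: number;
};

// Run a fixed number of sidecar passes and aggregate their counters.
// `runOnce` is a hypothetical stand-in for one live-sidecar pass.
function runBoundedWatch(
  runOnce: () => IterationReport,
  iterationsRequested: number,
  lockPath: string,
): WatchReport {
  const reports: IterationReport[] = [];
  const total: ApplyCounts = { attempted: 0, applied: 0, unchanged: 0 };
  for (let i = 0; i < iterationsRequested; i++) {
    const report = runOnce();
    reports.push(report);
    total.attempted += report.apply_summary.attempted;
    total.applied += report.apply_summary.applied;
    total.unchanged += report.apply_summary.unchanged;
  }
  return {
    reports,
    apply_summary: total,
    lock_path: lockPath,
    iterations_requested: iterationsRequested,
    iterations_completed: reports.length,
  };
}
```

Because the loop is bounded, `iterations_completed` equals `iterations_requested` unless a pass throws, in which case the host's supervisor is expected to restart the command.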
82
+
83
+ ## Soak Check
84
+
85
+ Use the soak check before release or before embedding the sidecar in a long-running host:
86
+
87
+ ```bash
88
+ npm run check:runtime-live-sidecar-soak
89
+ ```
90
+
91
+ The check creates a real Runtime Lite SQLite fixture, appends evidence in batches,
92
+ and runs bounded watch loops against a separate real Substrate SQLite target. It
93
+ verifies:
94
+
95
+ - every newly appended Runtime row is applied exactly once;
96
+ - unchanged rows remain checkpoint-skipped on later polls;
97
+ - the checkpoint fingerprint count matches the mirrored target;
98
+ - the lock file is released after every watch loop;
99
+ - the target store can be reopened with the same node and event counts.
100
+
101
+ The report contract is `aionis_runtime_live_sidecar_soak_report_v1`.
102
+
103
+ ## Report
104
+
105
+ The report contract is `aionis_runtime_live_sidecar_report_v1`.
106
+
107
+ Important fields:
108
+
109
+ - `import_summary`: the Runtime snapshot importer coverage for this scan.
110
+ - `apply_summary`: what the live sidecar actually applied or skipped through the checkpoint.
111
+ - `checkpoint_before`: whether a checkpoint existed and how many fingerprints it contained.
112
+ - `checkpoint_after`: fingerprint counts after the run.
113
+ - `store_before` / `store_after`: target store event counters.
114
+ - `warnings`: importer warnings plus sidecar consistency warnings.
115
+
116
+ `import_summary.nodesImported` can be larger than `apply_summary.nodes.applied`. That is expected: the importer reports what it mapped from Runtime; the live sidecar reports what was new or changed after checkpoint comparison.
117
+
118
+ ## Checkpoint Behavior
119
+
120
+ The checkpoint is scoped to:
121
+
122
+ - Runtime source path;
123
+ - optional Runtime scope;
124
+ - mapped Substrate object fingerprints.
125
+
126
+ If the checkpoint path points to a different Runtime source or scope, the command fails. This prevents accidental cross-source reuse.
127
+
128
+ If the checkpoint contains fingerprints but the target store is empty, the sidecar ignores the checkpoint and replays the Runtime snapshot into the target. This prevents a stale checkpoint from hiding a lost or newly created target store.
129
+
130
+ If an individual node, relation, feedback row, or decision is missing from the target even though the checkpoint says it is unchanged, the sidecar re-applies that object.
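The three checkpoint rules above can be read as a small decision procedure. The sketch below is an interpretation under stated assumptions: `CheckpointFile`, `resolveCheckpoint`, and `needsApply` are hypothetical names, and the checkpoint layout is invented for illustration. Only the warning text is taken from the CLI documentation.

```typescript
// Hypothetical checkpoint layout; the real file format is not shown in this diff.
type CheckpointFile = {
  source: string;
  scope: string | null;
  fingerprints: Record<string, string>;
};
type CheckpointDecision = { useCheckpoint: boolean; warning: string | null };

// Decide whether an existing checkpoint may be trusted for this run.
function resolveCheckpoint(
  checkpoint: CheckpointFile | null,
  source: string,
  scope: string | null,
  targetEventCount: number,
): CheckpointDecision {
  if (checkpoint === null) {
    return { useCheckpoint: false, warning: null };
  }
  // Rule 1: a checkpoint for another source or scope is a hard failure.
  if (checkpoint.source !== source || checkpoint.scope !== scope) {
    throw new Error("checkpoint belongs to a different Runtime source or scope");
  }
  // Rule 2: fingerprints without any target events mean the target was lost or recreated.
  if (Object.keys(checkpoint.fingerprints).length > 0 && targetEventCount === 0) {
    return { useCheckpoint: false, warning: "checkpoint ignored because target store is empty" };
  }
  return { useCheckpoint: true, warning: null };
}

// Rule 3: re-apply an object when its fingerprint changed or the target no longer has it.
function needsApply(
  id: string,
  currentFingerprint: string,
  checkpoint: CheckpointFile,
  targetHas: (objectId: string) => boolean,
): boolean {
  return checkpoint.fingerprints[id] !== currentFingerprint || !targetHas(id);
}
```

The design point is that the checkpoint is only an optimization: whenever it disagrees with the target store, the target store wins and the evidence is replayed.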
131
+
132
+ ## Product Boundary
133
+
134
+ The live sidecar is external infrastructure:
135
+
136
+ - it does not replace Runtime Lite;
137
+ - it does not install dual-write inside Runtime;
138
+ - it does not change Runtime guide/admission policy;
139
+ - it does not encode benchmark-specific rules;
140
+ - it does not make Substrate the full Runtime policy engine.
141
+
142
+ Use it when you want a live Substrate mirror for inspection, backup, product experiments, or external host integration without touching Runtime core.
package/docs/RUNTIME_SNAPSHOT_IMPORT.md CHANGED
@@ -14,6 +14,9 @@ The importer exists to answer one engineering question:
14
14
 
15
15
  > Can the Substrate contract represent Runtime memory evidence, relations, feedback, and decision traces without polluting the focused Runtime?
16
16
 
17
+ For repeated checkpointed mirroring, use the separate Runtime live sidecar documented in
18
+ [RUNTIME_LIVE_SIDECAR.md](RUNTIME_LIVE_SIDECAR.md). Snapshot import is a one-shot copy path.
19
+
17
20
  ## Source Tables
18
21
 
19
22
  The importer currently understands the focused Runtime Lite write-store tables:
package/docs/RUNTIME_ZVEC_CANDIDATE_INDEX.md ADDED
@@ -0,0 +1,80 @@
1
+ # Runtime Zvec Candidate Index Check
2
+
3
+ This check validates the optional Zvec candidate index against real Aionis Runtime Lite SQLite snapshots.
4
+
5
+ It proves a storage/index contract:
6
+
7
+ - Runtime Lite SQLite is opened read-only.
8
+ - selected Runtime scopes are imported into an isolated Substrate SQLite store.
9
+ - a Zvec candidate index is rebuilt from the imported Substrate nodes.
10
+ - `verify()` must report no missing, orphan, or stale index entries.
11
+ - wide-window Zvec candidate search must preserve canonical Substrate search ids.
12
+ - narrow-window Zvec candidate search must still recover the seeded real Runtime memory node for exact-node probes.
13
+
14
+ The check uses a deterministic local text projection to provide vectors for real imported Runtime nodes. It validates Zvec integration, write/rebuild/verify behavior, and candidate-window safety. It does not evaluate embedding-provider semantic quality.
15
+
16
+ ## Run
17
+
18
+ ```bash
19
+ npm run check:runtime-zvec-index -- \
20
+ --root /Volumes/ziel/AionisRuntime-focused/.tmp \
21
+ --max-files 10 \
22
+ --max-scopes 12 \
23
+ --min-nodes 3 \
24
+ --probes-per-scope 5
25
+ ```
26
+
27
+ The command writes a report under `reports/runtime-zvec-candidate-index-*` unless `--output` is supplied.
28
+
29
+ ## Options
30
+
31
+ - `--root <path>`: Runtime directory or SQLite file to scan. Repeatable.
32
+ - `--max-files <n|all>`: maximum Runtime SQLite files to inspect.
33
+ - `--max-scopes <n|all>`: maximum selected scopes to validate.
34
+ - `--max-scopes-per-file <n>`: maximum candidate scopes from each Runtime SQLite file.
35
+ - `--min-nodes <n>`: minimum `lite_memory_nodes` rows required for a scope.
36
+ - `--probes-per-scope <n>`: exact-node search probes to run per selected scope.
37
+ - `--narrow-candidate-limit <n>`: Zvec candidate window for narrow recovery probes.
38
+ - `--result-limit <n>`: final Substrate search result limit.
39
+ - `--keep-store`: keep temporary Substrate SQLite and Zvec files for inspection.
40
+
41
+ ## Report Interpretation
42
+
43
+ Important fields:
44
+
45
+ - `passed_scopes` / `failed_scopes`: scope-level storage/index gate.
46
+ - `total_nodes_imported`: real Runtime nodes imported into isolated Substrate stores.
47
+ - `total_vector_indexable_nodes`: imported nodes with a deterministic vector projection.
48
+ - `total_wide_parity_hits / total_probes_attempted`: share of probes where the wide candidate window preserved canonical Substrate search output.
49
+ - `total_narrow_seed_hits / total_probes_attempted`: share of probes where the narrow candidate window recovered the seeded real Runtime node.
50
+ - `zvec_health`: per-scope missing/orphan/stale index diagnostics.
51
+
52
+ Failures mean the Zvec sidecar needs index-contract work before it should be used for Runtime-scale candidate preselection.
53
+
54
+ ## Local Validation Snapshot
55
+
56
+ On 2026-06-26, the check was run against local `AionisRuntime-focused/.tmp` Runtime Lite SQLite files:
57
+
58
+ ```bash
59
+ npm run check:runtime-zvec-index -- \
60
+ --root /Volumes/ziel/AionisRuntime-focused/.tmp \
61
+ --max-files all \
62
+ --max-scopes 20 \
63
+ --min-nodes 3 \
64
+ --probes-per-scope 8 \
65
+ --narrow-candidate-limit 20
66
+ ```
67
+
68
+ Result:
69
+
70
+ - discovered SQLite files: 30
71
+ - Runtime SQLite files with candidate scopes: 14
72
+ - attempted scopes: 20
73
+ - passed scopes: 20
74
+ - imported Runtime nodes: 3,080
75
+ - vector-indexable nodes: 3,080
76
+ - probes attempted: 160
77
+ - wide candidate parity: 100%
78
+ - narrow candidate seed hit rate: 100%
79
+
80
+ This validates the Zvec storage/index contract on real Runtime snapshots. Semantic embedding quality should be measured separately with a provider-backed embedding eval because this check intentionally uses a deterministic local text projection.
package/docs/STORE_CONTRACT.md CHANGED
@@ -101,6 +101,21 @@ It must:
101
101
 
102
102
  Search is not admission. It may find candidate evidence, but it must not decide whether memory can influence the next Agent turn. Governed prompt surfaces are produced by `compileContext`.
103
103
 
104
+ ### Candidate Index Contract
105
+
106
+ Stores may be opened with an optional candidate index.
107
+
108
+ The index contract is:
109
+
110
+ - source of truth stays in the file or SQLite substrate store;
111
+ - open rebuilds the index from durable nodes unless explicitly disabled;
112
+ - node upserts and lifecycle transitions write through to the index after the truth store mutation succeeds;
113
+ - `verify(nodes)` reports missing, orphan, and stale index entries;
114
+ - indexed search may narrow candidate ids, but returned results are still canonical substrate nodes with substrate scoring and reason codes;
115
+ - index lookup must not append events, mutate lifecycle state, or produce admission decisions.
116
+
117
+ This boundary allows `createZvecCandidateIndex` and future ANN-backed adapters without turning the index into the memory database or the admission policy.
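To make the `verify(nodes)` part of the contract concrete, here is a minimal in-memory sketch. It is not `createMemoryCandidateIndex` or `createZvecCandidateIndex`; the class name, the `version` field, and the exact meaning of missing, orphan, and stale (durable node absent from the index, index entry with no durable node, index entry with an outdated version) are assumptions made for illustration.

```typescript
// Hypothetical node shape; `version` stands in for whatever marks a node as updated.
type DurableNode = { id: string; version: number };
type VerifyReport = { missing: string[]; orphan: string[]; stale: string[] };

class CandidateIndexSketch {
  private entries = new Map<string, number>();

  // Write-through hook: called after the truth store mutation succeeds.
  upsert(node: DurableNode): void {
    this.entries.set(node.id, node.version);
  }

  remove(id: string): void {
    this.entries.delete(id);
  }

  // Rebuild from durable nodes, as the contract describes for open.
  rebuild(nodes: DurableNode[]): void {
    this.entries.clear();
    nodes.forEach((node) => this.upsert(node));
  }

  // Read-only comparison of the index against the durable nodes.
  verify(nodes: DurableNode[]): VerifyReport {
    const report: VerifyReport = { missing: [], orphan: [], stale: [] };
    const durableIds = new Set<string>();
    nodes.forEach((node) => {
      durableIds.add(node.id);
      const indexedVersion = this.entries.get(node.id);
      if (indexedVersion === undefined) {
        report.missing.push(node.id);
      } else if (indexedVersion !== node.version) {
        report.stale.push(node.id);
      }
    });
    this.entries.forEach((_version, id) => {
      if (!durableIds.has(id)) {
        report.orphan.push(id);
      }
    });
    return report;
  }
}
```

Note that `verify` only reads: it appends no events and changes no lifecycle state, which matches the rule that index lookup must stay free of side effects on the truth store.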
118
+
104
119
  ### Relation Graph
105
120
 
106
121
  Relations connect memory evidence:
package/docs/V0_2_ROADMAP.md CHANGED
@@ -54,7 +54,7 @@ Keep Runtime experiments isolated:
54
54
  - dual-write sidecar continues to write into a separate Substrate store;
55
55
  - no replacement of Aionis Runtime storage in v0.2.
56
56
 
57
- Initial implementation status: `check:runtime-sidecar` now combines read-only Runtime snapshot parity and same-source reference corpus parity into a single report contract. Real Runtime dual-write remains an explicit separate gate through `check:runtime-dual-write` because it starts focused Runtime.
57
+ Initial implementation status: `check:runtime-sidecar` now combines read-only Runtime snapshot parity and same-source reference corpus parity into a single report contract. `live-sidecar` adds a checkpointed external mirror from Runtime Lite SQLite into a separate Substrate target for repeated host-managed sync, including bounded watch polling and a checkpoint lock. Real Runtime dual-write remains an explicit separate gate through `check:runtime-dual-write` because it starts focused Runtime.
58
58
 
59
59
  ### 5. Product CLI and Docs
60
60
 
@@ -64,11 +64,22 @@ Make the substrate boundary easier to consume without widening policy scope:
64
64
  - document install, minimal API usage, and sidecar reports separately;
65
65
  - keep repository-only Runtime process experiments explicit and separate.
66
66
 
67
- Initial implementation status: the package exposes `aionis-substrate sidecar` for read-only snapshot/reference checks and store commands for inspect, preview-context, backup, restore, compact, and Runtime snapshot import. These commands do not start Runtime, mutate Runtime storage, or implement Runtime admission policy.
67
+ Initial implementation status: the package exposes `aionis-substrate sidecar` for read-only snapshot/reference checks, `aionis-substrate live-sidecar` for checkpointed external mirroring, and store commands for inspect, preview-context, backup, restore, compact, and Runtime snapshot import. These commands do not start Runtime, mutate Runtime storage, or implement Runtime admission policy.
68
+
69
+ ### 6. Candidate Index Boundary
70
+
71
+ Add an optional index boundary for database-style candidate lookup:
72
+
73
+ - index adapters receive write-through node updates;
74
+ - open can rebuild the index from durable nodes;
75
+ - `verify(nodes)` reports missing, orphan, and stale entries;
76
+ - final search results still come from canonical Substrate nodes.
77
+
78
+ Initial implementation status: `createMemoryCandidateIndex` provides a deterministic in-process candidate index with rebuild, verify, upsert, delete, and search. `createZvecCandidateIndex` provides an optional Zvec-backed candidate index for local vector preselection. File and SQLite adapters can use either one to narrow candidates before applying the existing Substrate search contract.
68
79
 
69
80
  ## Excluded
70
81
 
71
- - Vector search, ANN, embeddings, or semantic recall.
82
+ - Built-in embedding generation, hosted ANN services, or semantic recall policy.
72
83
  - Full Aionis Runtime admission policy.
73
84
  - LLM-as-judge or model-generated lifecycle policy.
74
85
  - Agent orchestration or external Agent harnesses.
@@ -82,6 +93,7 @@ Before v0.2 can be tagged:
82
93
  - `npm run typecheck`
83
94
  - `npm test`
84
95
  - `npm run bench:contract`
96
+ - `npm run check:runtime-live-sidecar-soak`
85
97
  - `npm run check:release`
86
98
  - `npm run check:scale -- --nodes 10000 --scopes 10 --relations 2000 --feedback 1000`
87
99
  - adapter parity tests for every new public API;
package/docs/ZVEC_PROVIDER_EMBEDDING_EVAL.md ADDED
@@ -0,0 +1,216 @@
1
+ # Zvec Provider Embedding Eval
2
+
3
+ This eval checks Zvec candidate preselection with real provider embeddings.
4
+
5
+ It measures two different surfaces:
6
+
7
+ - raw Zvec candidate hit rate: whether provider embeddings place the expected memory id in the Zvec candidate window.
8
+ - final Substrate search hit rate: whether the current Substrate canonical search contract returns that memory after candidate narrowing.
9
+
10
+ This distinction matters. Zvec is a candidate sidecar, not the truth store and not the final admission policy. SQLite remains the truth store, and Substrate reloads canonical nodes before returning search results.
11
+
12
+ ## Provider Contract
13
+
14
+ The eval supports three provider contracts:
15
+
16
+ - `openai`: OpenAI-compatible embeddings endpoints.
17
+ - `minimax`: MiniMax native embeddings endpoint.
18
+ - `dashscope`: Alibaba Cloud DashScope native embeddings endpoint.
19
+
20
+ ### OpenAI-Compatible
21
+
22
+ ```text
23
+ POST <base-url><endpoint>
24
+ Authorization: Bearer <api-key>
25
+ Content-Type: application/json
26
+
27
+ {
28
+ "model": "...",
29
+ "input": ["text one", "text two"]
30
+ }
31
+ ```
32
+
33
+ The response must contain `data[].embedding` arrays, either in request order or with a `data[].index` field that gives the order.
34
+
35
+ ### MiniMax Native
36
+
37
+ MiniMax embeddings use a native request shape:
38
+
39
+ ```text
40
+ POST <base-url><endpoint>
41
+ Authorization: Bearer <api-key>
42
+ Content-Type: application/json
43
+
44
+ {
45
+ "model": "embo-01",
46
+ "type": "db",
47
+ "texts": ["document text one", "document text two"]
48
+ }
49
+ ```
50
+
51
+ The eval calls MiniMax with `type: "db"` for memory-node vectors and
52
+ `type: "query"` for query vectors. The response must contain `vectors[]`.
53
+
54
+ ### DashScope Native
55
+
56
+ DashScope native embeddings use separate `text_type` values for document and
57
+ query vectors:
58
+
59
+ ```text
60
+ POST <base-url><endpoint>
61
+ Authorization: Bearer <api-key>
62
+ Content-Type: application/json
63
+
64
+ {
65
+ "model": "text-embedding-v4",
66
+ "input": {
67
+ "texts": ["document text one", "document text two"]
68
+ },
69
+ "parameters": {
70
+ "text_type": "document",
71
+ "dimension": 1024
72
+ }
73
+ }
74
+ ```
75
+
76
+ The eval calls DashScope with `text_type: "document"` for memory-node vectors
77
+ and `text_type: "query"` for query vectors. The response must contain
78
+ `output.embeddings[].embedding` arrays, ordered by `text_index`.
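The three response shapes described above can be normalized into one ordered list of vectors. The sketch below uses only the field names stated in this document (`data[].embedding`, `data[].index`, `vectors[]`, `output.embeddings[].embedding`, `text_index`); the type and function names are hypothetical, and treating `vectors[]` as an array of number arrays in request order is an assumption.

```typescript
// Response shapes reduced to the fields this document names.
type OpenAiLikeResponse = { data: { embedding: number[]; index?: number }[] };
type MiniMaxResponse = { vectors: number[][] };
type DashScopeResponse = {
  output: { embeddings: { embedding: number[]; text_index: number }[] };
};

// OpenAI-compatible: honor `index` when present, else keep response order.
function vectorsFromOpenAiLike(response: OpenAiLikeResponse): number[][] {
  return response.data
    .map((item, position) => ({ order: item.index ?? position, embedding: item.embedding }))
    .sort((a, b) => a.order - b.order)
    .map((item) => item.embedding);
}

// MiniMax native: assumed to already be in request order.
function vectorsFromMiniMax(response: MiniMaxResponse): number[][] {
  return response.vectors.slice();
}

// DashScope native: order explicitly by `text_index`.
function vectorsFromDashScope(response: DashScopeResponse): number[][] {
  return response.output.embeddings
    .slice()
    .sort((a, b) => a.text_index - b.text_index)
    .map((item) => item.embedding);
}
```

Restoring request order matters because each vector has to be paired with the memory node or query text that produced it before it is written to the candidate index.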
79
+
80
+ ## Run
81
+
82
+ ```bash
83
+ AIONIS_EMBEDDING_API_KEY=... \
84
+ AIONIS_EMBEDDING_MODEL=text-embedding-3-small \
85
+ npm run check:zvec-provider-embedding -- \
86
+ --base-url https://api.openai.com/v1 \
87
+ --nodes 240 \
88
+ --scopes 4 \
89
+ --queries 20 \
90
+ --candidate-limit 20
91
+ ```
92
+
93
+ For another OpenAI-compatible provider:
94
+
95
+ ```bash
96
+ AIONIS_EMBEDDING_API_KEY=... \
97
+ npm run check:zvec-provider-embedding -- \
98
+ --base-url https://provider.example/v1 \
99
+ --endpoint /embeddings \
100
+ --model provider-embedding-model
101
+ ```
102
+
103
+ For Alibaba Cloud DashScope `text-embedding-v4` through the OpenAI-compatible
104
+ endpoint:
105
+
106
+ ```bash
107
+ AIONIS_EMBEDDING_PROVIDER=openai \
108
+ AIONIS_EMBEDDING_API_KEY=... \
109
+ AIONIS_EMBEDDING_MODEL=text-embedding-v4 \
110
+ npm run check:zvec-provider-embedding -- \
111
+ --base-url https://dashscope.aliyuncs.com/compatible-mode/v1 \
112
+ --endpoint /embeddings \
113
+ --dimensions 1024 \
114
+ --batch-size 10 \
115
+ --nodes 240 \
116
+ --scopes 4 \
117
+ --queries 20 \
118
+ --candidate-limit 40
119
+ ```
120
+
121
+ DashScope `text-embedding-v4` accepts only small batches on this endpoint, so the
122
+ example uses `--batch-size 10`. In the current provider eval, `--candidate-limit
123
+ 40` is a better first setting than `20` because it lets Zvec act as a semantic
124
+ candidate preselector without prematurely excluding lexical matches.
125
+
126
+ For Alibaba Cloud DashScope `text-embedding-v4` through the native endpoint:
127
+
128
+ ```bash
129
+ AIONIS_EMBEDDING_PROVIDER=dashscope \
130
+ AIONIS_EMBEDDING_API_KEY=... \
131
+ AIONIS_EMBEDDING_MODEL=text-embedding-v4 \
132
+ npm run check:zvec-provider-embedding -- \
133
+ --dimensions 1024 \
134
+ --projection structured \
135
+ --query-instruct "Retrieve the Aionis Substrate memory document that best answers the implementation question." \
136
+ --batch-size 10 \
137
+ --nodes 240 \
138
+ --scopes 4 \
139
+ --queries 20 \
140
+ --candidate-limit 40
141
+ ```
142
+
143
+ The native provider path is useful because it preserves the provider's
144
+ query/document embedding contract instead of embedding every text through one
145
+ generic endpoint shape.
146
+
147
+ For MiniMax:
148
+
149
+ ```bash
150
+ AIONIS_EMBEDDING_PROVIDER=minimax \
151
+ AIONIS_EMBEDDING_API_KEY=... \
152
+ AIONIS_EMBEDDING_MODEL=embo-01 \
153
+ npm run check:zvec-provider-embedding -- \
154
+ --base-url https://api.minimaxi.com/v1 \
155
+ --nodes 240 \
156
+ --scopes 4 \
157
+ --queries 20 \
158
+ --candidate-limit 20
159
+ ```
160
+
161
+ The command writes a report under `reports/zvec-provider-embedding-*` unless `--output` is supplied.
162
+
163
+ ## Options
164
+
165
+ - `--provider <openai|minimax|dashscope>`: embedding provider contract. Defaults to `AIONIS_EMBEDDING_PROVIDER` or `openai`.
166
+ - `--base-url <url>`: provider base URL. Defaults to `AIONIS_EMBEDDING_BASE_URL` or `https://api.openai.com/v1`.
167
+ - `--endpoint <path>`: embeddings endpoint. Defaults to `AIONIS_EMBEDDING_ENDPOINT` or `/embeddings`.
168
+ - `--model <name>`: embedding model. Defaults to `AIONIS_EMBEDDING_MODEL`.
169
+ - `--api-key-var <name>`: environment variable containing the API key. Defaults to `AIONIS_EMBEDDING_API_KEY`.
170
+ - `--dimensions <n>`: optional embedding dimensions request parameter.
171
+ - `--projection <plain|structured>`: embedding text projection. `plain` embeds compact search text; `structured` uses the SDK `buildAionisEmbeddingDocument` and `buildAionisEmbeddingQuery` contract. Defaults to `structured`.
172
+ - `--query-instruct <text>`: optional query instruction for providers that support query-side instructions.
173
+ - `--nodes <n>`: generated Substrate nodes.
174
+ - `--scopes <n>`: generated scopes.
175
+ - `--queries <n>`: semantic query probes. The current built-in fixture supports up to 24.
176
+ - `--batch-size <n>`: provider embedding batch size.
177
+ - `--candidate-limit <n>`: Zvec candidate window.
178
+ - `--result-limit <n>`: final Substrate search result limit.
179
+ - `--keep-store`: keep temporary SQLite and Zvec files for inspection.
180
+
181
+ ## Report Interpretation
182
+
183
+ Important fields:
184
+
185
+ - `raw_zvec_candidate_top1_rate`: share of probes where the expected memory id ranked first among raw Zvec candidates; a proxy for provider embedding quality at the candidate layer.
186
+ - `raw_zvec_candidate_topk_rate`: whether the expected memory id entered the candidate window.
187
+ - `final_substrate_topk_rate`: current end-to-end `searchNodes()` output after canonical Substrate scoring.
188
+ - `lexical_substrate_topk_rate`: canonical deterministic search without Zvec.
189
+ - `probe_results`: per-query raw candidate rank, final rank, lexical rank, and returned ids for miss analysis.
190
+ - `embedding_usage`: provider requests, embedded text count, input character count, provider token count when exposed, and failed request count.
191
+ - `zvec_health`: missing, orphan, and stale sidecar diagnostics.
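One plausible way to derive the rate fields above from per-probe ranks is sketched here. The report field names are taken from the list; `ProbeRanks`, `hitRate`, and `summarizeProbes` are hypothetical, ranks are assumed to be 1-based with `null` for a miss, and using the candidate limit for the raw top-k rate and the result limit for the final and lexical rates is an assumption about how the eval defines k.

```typescript
// Hypothetical per-probe record: 1-based rank of the expected id, or null if absent.
type ProbeRanks = {
  rawRank: number | null;
  finalRank: number | null;
  lexicalRank: number | null;
};

// Fraction of probes whose expected id appears at rank k or better.
function hitRate(ranks: (number | null)[], k: number): number {
  if (ranks.length === 0) {
    return 0;
  }
  const hits = ranks.filter((rank) => rank !== null && rank <= k).length;
  return hits / ranks.length;
}

function summarizeProbes(
  probes: ProbeRanks[],
  candidateLimit: number,
  resultLimit: number,
): {
  raw_zvec_candidate_top1_rate: number;
  raw_zvec_candidate_topk_rate: number;
  final_substrate_topk_rate: number;
  lexical_substrate_topk_rate: number;
} {
  const raw = probes.map((probe) => probe.rawRank);
  const final = probes.map((probe) => probe.finalRank);
  const lexical = probes.map((probe) => probe.lexicalRank);
  return {
    raw_zvec_candidate_top1_rate: hitRate(raw, 1),
    raw_zvec_candidate_topk_rate: hitRate(raw, candidateLimit),
    final_substrate_topk_rate: hitRate(final, resultLimit),
    lexical_substrate_topk_rate: hitRate(lexical, resultLimit),
  };
}
```

Keeping the raw and final rates separate is what lets a reader tell a candidate-window miss (low raw top-k) apart from a canonical-scorer miss (high raw top-k, lower final top-k).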
192
+
193
+ Current DashScope native `text-embedding-v4` smoke results with the SDK
194
+ projection on this fixture:
195
+
196
+ | Projection | Dimension | Raw Top-1 | Raw Top-K | Final Top-K | Lexical Top-K |
197
+ | --- | ---: | ---: | ---: | ---: | ---: |
198
+ | `plain` | 1024 | 40% | 100% | 75% | 70% |
199
+ | `structured` + query/document | 1024 | 45% | 100% | 85% | 70% |
200
+
201
+ The useful gain comes from the SDK query/document projection contract instead
202
+ of changing Substrate's final admission or search filters.
203
+
204
+ If raw Zvec hit rate is strong but final Substrate hit rate is weaker, the provider embeddings are finding useful candidates but the final canonical scorer is still acting as a lexical/structured gate. That is a search-contract boundary, not a provider failure.
205
+
206
+ The provider eval is intentionally strict about this distinction. A low final
207
+ Substrate hit rate can happen even when provider vectors are valid if the Zvec
208
+ candidate window excludes the expected id or if the final canonical scorer
209
+ prefers lexical/structured evidence over semantic candidates.
210
+
211
+ Substrate fuses candidate-index evidence into final `searchNodes()` ranking by
212
+ adding auditable `semantic_candidate_fusion` reasons and preserving a small
213
+ semantic recall floor for top-ranked candidates. This only changes ranking after
214
+ normal scope, lifecycle, authority, confidence, team, agent, and target-file
215
+ filters pass. Zvec still remains a sidecar candidate preselector; file/SQLite
216
+ stores remain the truth store.