@aionis/substrate 0.1.4 → 0.1.5

Files changed (43)
  1. package/CHANGELOG.md +12 -0
  2. package/README.md +94 -10
  3. package/dist/candidate-index.d.ts +25 -0
  4. package/dist/candidate-index.d.ts.map +1 -0
  5. package/dist/candidate-index.js +217 -0
  6. package/dist/candidate-index.js.map +1 -0
  7. package/dist/embedding-projection.d.ts +16 -0
  8. package/dist/embedding-projection.d.ts.map +1 -0
  9. package/dist/embedding-projection.js +101 -0
  10. package/dist/embedding-projection.js.map +1 -0
  11. package/dist/file-substrate.d.ts +3 -0
  12. package/dist/file-substrate.d.ts.map +1 -1
  13. package/dist/file-substrate.js +47 -1
  14. package/dist/file-substrate.js.map +1 -1
  15. package/dist/index.d.ts +3 -0
  16. package/dist/index.d.ts.map +1 -1
  17. package/dist/index.js +3 -0
  18. package/dist/index.js.map +1 -1
  19. package/dist/search.d.ts +11 -2
  20. package/dist/search.d.ts.map +1 -1
  21. package/dist/search.js +78 -7
  22. package/dist/search.js.map +1 -1
  23. package/dist/sqlite-substrate.d.ts +3 -0
  24. package/dist/sqlite-substrate.d.ts.map +1 -1
  25. package/dist/sqlite-substrate.js +61 -6
  26. package/dist/sqlite-substrate.js.map +1 -1
  27. package/dist/types.d.ts +3 -0
  28. package/dist/types.d.ts.map +1 -1
  29. package/dist/zvec-candidate-index.d.ts +42 -0
  30. package/dist/zvec-candidate-index.d.ts.map +1 -0
  31. package/dist/zvec-candidate-index.js +463 -0
  32. package/dist/zvec-candidate-index.js.map +1 -0
  33. package/docs/ADAPTER_CONTRACT.md +12 -2
  34. package/docs/API_USAGE.md +135 -0
  35. package/docs/CLI.md +21 -1
  36. package/docs/RUNTIME_DUAL_WRITE_EXPERIMENT.md +14 -0
  37. package/docs/RUNTIME_ZVEC_CANDIDATE_INDEX.md +80 -0
  38. package/docs/STORE_CONTRACT.md +15 -0
  39. package/docs/V0_2_ROADMAP.md +12 -1
  40. package/docs/ZVEC_PROVIDER_EMBEDDING_EVAL.md +216 -0
  41. package/docs/ZVEC_SCALE_MAINTENANCE.md +89 -0
  42. package/examples/live-sidecar/index.mjs +189 -0
  43. package/package.json +15 -1
package/docs/CLI.md CHANGED
@@ -9,7 +9,8 @@ It is intentionally narrow:
9
9
  - it incrementally mirrors Runtime Lite evidence into separate Substrate stores with an explicit checkpoint;
10
10
  - it runs read-only checks over existing Runtime evidence;
11
11
  - it writes reports to local files;
12
- - it does not start Aionis Runtime unless you explicitly use the separate repository script `check:runtime-dual-write`;
12
+ - it does not start Aionis Runtime unless you explicitly use a repository validation script such as
13
+ `check:runtime-dual-write` or `check:runtime-product-bridge`;
13
14
  - it does not mutate Runtime source code or replace Runtime storage.
14
15
 
15
16
  ## Install
@@ -214,6 +215,25 @@ npx aionis-substrate sidecar \
214
215
 
215
216
  The report contract is `aionis_runtime_sidecar_check_report_v1`.
216
217
 
218
+ ## Runtime Product Bridge Gate
219
+
220
+ The package CLI is for store operations and sidecar sync. In this repository,
221
+ use `check:runtime-product-bridge` when you need the full product bridge gate
222
+ against a focused Runtime checkout:
223
+
224
+ ```bash
225
+ npm run check:runtime-product-bridge -- \
226
+ --runtime-root /path/to/AionisRuntime-focused
227
+ ```
228
+
229
+ The gate starts focused Runtime with isolated Lite SQLite paths, runs real
230
+ `observe -> guide -> feedback -> measure`, writes the same observed evidence
231
+ into an external Substrate store, verifies reopen parity, runs chain probes,
232
+ mirrors Runtime Lite SQLite through read-only `live-sidecar`, verifies
233
+ checkpoint idempotency, and compares mirrored Substrate `previewContext` buckets
234
+ against Runtime guide surfaces. The top-level report is
235
+ `product-bridge-gate-summary.json`.
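
The checkpoint idempotency step above can be pictured with a small model. This is only a sketch under stated assumptions: `mirrorOnce`, the `appliedIds` checkpoint shape, and the in-memory `Map` target are hypothetical and are not the package's `live-sidecar` implementation. It shows why a second run over unchanged source rows should apply nothing.

```javascript
// Hypothetical model of a checkpointed mirror; not the package's implementation.
// `checkpoint.appliedIds` records which source row ids were already mirrored.
function mirrorOnce(sourceRows, target, checkpoint) {
  const seen = new Set(checkpoint.appliedIds);
  let applied = 0;
  let skipped = 0;
  for (const row of sourceRows) {
    if (seen.has(row.id)) {
      skipped += 1; // already mirrored by an earlier run
      continue;
    }
    target.set(row.id, row);
    seen.add(row.id);
    applied += 1;
  }
  return { applied, skipped, checkpoint: { appliedIds: [...seen] } };
}

const sourceRows = [{ id: "current-route" }, { id: "failed-branch" }, { id: "raw-trace" }];
const mirrorTarget = new Map();
const firstRun = mirrorOnce(sourceRows, mirrorTarget, { appliedIds: [] });
const secondRun = mirrorOnce(sourceRows, mirrorTarget, firstRun.checkpoint);
console.log(firstRun.applied, secondRun.applied); // 3 0
```

A gate built on this idea would fail if the second run reported any applied rows.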
236
+
217
237
  ## What Passing Means
218
238
 
219
239
  Passing snapshot parity means one Runtime SQLite scope can be imported into an isolated Substrate store and compiled into matching governed buckets.
package/docs/RUNTIME_DUAL_WRITE_EXPERIMENT.md CHANGED
@@ -39,6 +39,20 @@ The compared surfaces are:
39
39
 
40
40
  ## Command
41
41
 
42
+ Product bridge gate:
43
+
44
+ ```bash
45
+ npm run check:runtime-product-bridge -- \
46
+ --runtime-root /Volumes/ziel/AionisRuntime-focused
47
+ ```
48
+
49
+ This is the full validation path for a product bridge: real focused Runtime
50
+ calls, external Substrate dual-write parity, reopen parity, chain probes,
51
+ read-only `live-sidecar` mirroring, repeated sidecar idempotency, and mirrored
52
+ `previewContext` parity.
53
+
54
+ Lower-level dual-write experiment:
55
+
42
56
  ```bash
43
57
  npm run check:runtime-dual-write -- \
44
58
  --runtime-root /Volumes/ziel/AionisRuntime-focused \
package/docs/RUNTIME_ZVEC_CANDIDATE_INDEX.md ADDED
@@ -0,0 +1,80 @@
1
+ # Runtime Zvec Candidate Index Check
2
+
3
+ This check validates the optional Zvec candidate index against real Aionis Runtime Lite SQLite snapshots.
4
+
5
+ It proves a storage/index contract:
6
+
7
+ - Runtime Lite SQLite is opened read-only.
8
+ - selected Runtime scopes are imported into an isolated Substrate SQLite store.
9
+ - a Zvec candidate index is rebuilt from the imported Substrate nodes.
10
+ - `verify()` must report no missing, orphan, or stale index entries.
11
+ - wide-window Zvec candidate search must preserve canonical Substrate search ids.
12
+ - narrow-window Zvec candidate search must still recover the seeded real Runtime memory node for exact-node probes.
13
+
14
+ The check uses a deterministic local text projection to provide vectors for real imported Runtime nodes. It validates Zvec integration, write/rebuild/verify behavior, and candidate-window safety. It does not evaluate embedding-provider semantic quality.
15
+
16
+ ## Run
17
+
18
+ ```bash
19
+ npm run check:runtime-zvec-index -- \
20
+ --root /Volumes/ziel/AionisRuntime-focused/.tmp \
21
+ --max-files 10 \
22
+ --max-scopes 12 \
23
+ --min-nodes 3 \
24
+ --probes-per-scope 5
25
+ ```
26
+
27
+ The command writes a report under `reports/runtime-zvec-candidate-index-*` unless `--output` is supplied.
28
+
29
+ ## Options
30
+
31
+ - `--root <path>`: Runtime directory or SQLite file to scan. Repeatable.
32
+ - `--max-files <n|all>`: maximum Runtime SQLite files to inspect.
33
+ - `--max-scopes <n|all>`: maximum selected scopes to validate.
34
+ - `--max-scopes-per-file <n>`: maximum candidate scopes from each Runtime SQLite file.
35
+ - `--min-nodes <n>`: minimum `lite_memory_nodes` rows required for a scope.
36
+ - `--probes-per-scope <n>`: exact-node search probes to run per selected scope.
37
+ - `--narrow-candidate-limit <n>`: Zvec candidate window for narrow recovery probes.
38
+ - `--result-limit <n>`: final Substrate search result limit.
39
+ - `--keep-store`: keep temporary Substrate SQLite and Zvec files for inspection.
40
+
41
+ ## Report Interpretation
42
+
43
+ Important fields:
44
+
45
+ - `passed_scopes` / `failed_scopes`: scope-level storage/index gate.
46
+ - `total_nodes_imported`: real Runtime nodes imported into isolated Substrate stores.
47
+ - `total_vector_indexable_nodes`: imported nodes with a deterministic vector projection.
48
+ - `total_wide_parity_hits / total_probes_attempted`: wide candidate window preserved canonical Substrate search output.
49
+ - `total_narrow_seed_hits / total_probes_attempted`: narrow candidate window recovered the seeded real Runtime node.
50
+ - `zvec_health`: per-scope missing/orphan/stale index diagnostics.
51
+
52
+ Failures mean the Zvec sidecar needs index-contract work before it should be used for Runtime-scale candidate preselection.
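
As a rough illustration of how the two probe counters above could be derived, the sketch below tallies wide parity and narrow seed hits. The probe shape (`seedId`, `canonicalIds`, `wideIds`, `narrowIds`) is hypothetical, and treating parity as an ordered id comparison is an assumption; only the counter names come from the report fields listed above.

```javascript
// Hypothetical probe shape: { seedId, canonicalIds, wideIds, narrowIds }.
// Assumption: "parity" means the same ids in the same order.
function sameIds(a, b) {
  return a.length === b.length && a.every((id, i) => id === b[i]);
}

function summarizeProbes(probes) {
  let wideParityHits = 0;
  let narrowSeedHits = 0;
  for (const probe of probes) {
    // Wide window: indexed search must return what canonical search returns.
    if (sameIds(probe.canonicalIds, probe.wideIds)) wideParityHits += 1;
    // Narrow window: the seeded node must still be recovered.
    if (probe.narrowIds.includes(probe.seedId)) narrowSeedHits += 1;
  }
  return {
    total_probes_attempted: probes.length,
    total_wide_parity_hits: wideParityHits,
    total_narrow_seed_hits: narrowSeedHits,
  };
}

const probeSummary = summarizeProbes([
  { seedId: "n1", canonicalIds: ["n1", "n2"], wideIds: ["n1", "n2"], narrowIds: ["n1"] },
  { seedId: "n3", canonicalIds: ["n3", "n4"], wideIds: ["n4", "n3"], narrowIds: ["n5"] },
]);
console.log(probeSummary.total_wide_parity_hits, probeSummary.total_narrow_seed_hits); // 1 1
```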
53
+
54
+ ## Local Validation Snapshot
55
+
56
+ On 2026-06-26, the check was run against local `AionisRuntime-focused/.tmp` Runtime Lite SQLite files:
57
+
58
+ ```bash
59
+ npm run check:runtime-zvec-index -- \
60
+ --root /Volumes/ziel/AionisRuntime-focused/.tmp \
61
+ --max-files all \
62
+ --max-scopes 20 \
63
+ --min-nodes 3 \
64
+ --probes-per-scope 8 \
65
+ --narrow-candidate-limit 20
66
+ ```
67
+
68
+ Result:
69
+
70
+ - discovered SQLite files: 30
71
+ - Runtime SQLite files with candidate scopes: 14
72
+ - attempted scopes: 20
73
+ - passed scopes: 20
74
+ - imported Runtime nodes: 3,080
75
+ - vector-indexable nodes: 3,080
76
+ - probes attempted: 160
77
+ - wide candidate parity: 100%
78
+ - narrow candidate seed hit rate: 100%
79
+
80
+ This validates the Zvec storage/index contract on real Runtime snapshots. Semantic embedding quality should be measured separately with a provider-backed embedding eval because this check intentionally uses a deterministic local text projection.
package/docs/STORE_CONTRACT.md CHANGED
@@ -101,6 +101,21 @@ It must:
101
101
 
102
102
  Search is not admission. It may find candidate evidence, but it must not decide whether memory can influence the next Agent turn. Governed prompt surfaces are produced by `compileContext`.
103
103
 
104
+ ### Candidate Index Contract
105
+
106
+ Stores may be opened with an optional candidate index.
107
+
108
+ The index contract is:
109
+
110
+ - source of truth stays in the file or SQLite substrate store;
111
+ - open rebuilds the index from durable nodes unless explicitly disabled;
112
+ - node upserts and lifecycle transitions write through to the index after the truth store mutation succeeds;
113
+ - `verify(nodes)` reports missing, orphan, and stale index entries;
114
+ - indexed search may narrow candidate ids, but returned results are still canonical substrate nodes with substrate scoring and reason codes;
115
+ - index lookup must not append events, mutate lifecycle state, or produce admission decisions.
116
+
117
+ This boundary allows `createZvecCandidateIndex` and future ANN-backed adapters without turning the index into the memory database or the admission policy.
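
The three `verify(nodes)` categories can be made concrete with a minimal sketch. It assumes the index keeps one fingerprint per node id; `verifyIndex`, `fingerprint`, and the node fields are hypothetical names for illustration and do not describe the package's actual index format.

```javascript
// Hypothetical fingerprint: any stable digest of the indexed fields would do.
function fingerprint(node) {
  return `${node.lifecycle}|${node.text}`;
}

// indexEntries: Map<nodeId, fingerprint>; nodes: durable nodes from the truth store.
function verifyIndex(indexEntries, nodes) {
  const report = { missing: [], orphan: [], stale: [] };
  const truthIds = new Set();
  for (const node of nodes) {
    truthIds.add(node.id);
    if (!indexEntries.has(node.id)) {
      report.missing.push(node.id); // in the truth store, absent from the index
    } else if (indexEntries.get(node.id) !== fingerprint(node)) {
      report.stale.push(node.id); // indexed, but from an older node state
    }
  }
  for (const id of indexEntries.keys()) {
    if (!truthIds.has(id)) report.orphan.push(id); // indexed, but no durable node
  }
  return report;
}

const durableNodes = [
  { id: "a", lifecycle: "active", text: "x" },
  { id: "b", lifecycle: "active", text: "y" },
  { id: "c", lifecycle: "active", text: "z" },
];
const indexEntries = new Map([
  ["a", "active|x"],
  ["b", "active|old"],
  ["ghost", "active|q"],
]);
const verifyReport = verifyIndex(indexEntries, durableNodes);
console.log(JSON.stringify(verifyReport));
// {"missing":["c"],"orphan":["ghost"],"stale":["b"]}
```

The function only reads both sides, which matches the rule that index lookup must not mutate lifecycle state or append events.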
118
+
104
119
  ### Relation Graph
105
120
 
106
121
  Relations connect memory evidence:
package/docs/V0_2_ROADMAP.md CHANGED
@@ -66,9 +66,20 @@ Make the substrate boundary easier to consume without widening policy scope:
66
66
 
67
67
  Initial implementation status: the package exposes `aionis-substrate sidecar` for read-only snapshot/reference checks, `aionis-substrate live-sidecar` for checkpointed external mirroring, and store commands for inspect, preview-context, backup, restore, compact, and Runtime snapshot import. These commands do not start Runtime, mutate Runtime storage, or implement Runtime admission policy.
68
68
 
69
+ ### 6. Candidate Index Boundary
70
+
71
+ Add an optional index boundary for database-style candidate lookup:
72
+
73
+ - index adapters receive write-through node updates;
74
+ - open can rebuild the index from durable nodes;
75
+ - `verify(nodes)` reports missing, orphan, and stale entries;
76
+ - final search results still come from canonical Substrate nodes.
77
+
78
+ Initial implementation status: `createMemoryCandidateIndex` provides a deterministic in-process candidate index with rebuild, verify, upsert, delete, and search. `createZvecCandidateIndex` provides an optional Zvec-backed candidate index for local vector preselection. File and SQLite adapters can use either one to narrow candidates before applying the existing Substrate search contract.
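
A minimal sketch of "narrow candidates, then apply the canonical search contract" might look like the following. `searchWithCandidates` and the term-count scorer are hypothetical stand-ins, not the package's `searchNodes()`; the point is only that the index contributes ids while results and scores come from truth-store nodes.

```javascript
// Hypothetical canonical scorer: counts query terms present in the node text.
function lexicalScore(node, queryTerms) {
  const words = new Set(node.text.split(" "));
  return queryTerms.filter((term) => words.has(term)).length;
}

// candidateIds === null means "no index": every truth node is scored.
function searchWithCandidates(truthNodes, candidateIds, queryTerms, limit) {
  const allowed = candidateIds === null ? null : new Set(candidateIds);
  return truthNodes
    .filter((node) => allowed === null || allowed.has(node.id)) // the index only narrows
    .map((node) => ({ id: node.id, score: lexicalScore(node, queryTerms) }))
    .filter((result) => result.score > 0)
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .slice(0, limit)
    .map((result) => result.id);
}

const truthNodes = [
  { id: "a", text: "runtime route verifier" },
  { id: "b", text: "legacy route" },
  { id: "c", text: "terminal trace" },
];
const queryTerms = ["runtime", "route"];
const canonicalIds = searchWithCandidates(truthNodes, null, queryTerms, 5);
const wideIds = searchWithCandidates(truthNodes, ["a", "b", "c", "ghost"], queryTerms, 5);
const narrowIds = searchWithCandidates(truthNodes, ["b", "c"], queryTerms, 5);
console.log(canonicalIds, wideIds, narrowIds); // [ 'a', 'b' ] [ 'a', 'b' ] [ 'b' ]
```

An id the index returns without a matching durable node (`ghost` here) never reaches the output, because results are built from truth-store nodes.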
79
+
69
80
  ## Excluded
70
81
 
71
- - Vector search, ANN, embeddings, or semantic recall.
82
+ - Built-in embedding generation, hosted ANN services, or semantic recall policy.
72
83
  - Full Aionis Runtime admission policy.
73
84
  - LLM-as-judge or model-generated lifecycle policy.
74
85
  - Agent orchestration or external Agent harnesses.
package/docs/ZVEC_PROVIDER_EMBEDDING_EVAL.md ADDED
@@ -0,0 +1,216 @@
1
+ # Zvec Provider Embedding Eval
2
+
3
+ This eval checks Zvec candidate preselection with real provider embeddings.
4
+
5
+ It measures two different surfaces:
6
+
7
+ - raw Zvec candidate hit rate: whether provider embeddings place the expected memory id in the Zvec candidate window.
8
+ - final Substrate search hit rate: whether the current Substrate canonical search contract returns that memory after candidate narrowing.
9
+
10
+ This distinction matters. Zvec is a candidate sidecar, not the truth store and not the final admission policy. SQLite remains the truth store, and Substrate reloads canonical nodes before returning search results.
11
+
12
+ ## Provider Contract
13
+
14
+ The eval supports three provider contracts:
15
+
16
+ - `openai`: OpenAI-compatible embeddings endpoints.
17
+ - `minimax`: MiniMax native embeddings endpoint.
18
+ - `dashscope`: Alibaba Cloud DashScope native embeddings endpoint.
19
+
20
+ ### OpenAI-Compatible
21
+
22
+ ```text
23
+ POST <base-url><endpoint>
24
+ Authorization: Bearer <api-key>
25
+ Content-Type: application/json
26
+
27
+ {
28
+ "model": "...",
29
+ "input": ["text one", "text two"]
30
+ }
31
+ ```
32
+
33
+ The response must contain `data[].embedding` arrays in request order or with `data[].index`.
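
One way a client could honor this ordering rule is sketched below. `embeddingsInRequestOrder` is a hypothetical helper, not part of the package: it places each item by `data[].index` when present and falls back to array position otherwise.

```javascript
// Hypothetical helper: returns embeddings aligned with the request's `input` order.
function embeddingsInRequestOrder(response, expectedCount) {
  const data = response.data;
  if (!Array.isArray(data) || data.length !== expectedCount) {
    throw new Error(`expected ${expectedCount} embeddings`);
  }
  const ordered = new Array(expectedCount);
  data.forEach((item, position) => {
    // Prefer the explicit index; otherwise trust request order.
    const slot = Number.isInteger(item.index) ? item.index : position;
    if (slot < 0 || slot >= expectedCount || ordered[slot] !== undefined) {
      throw new Error(`invalid or duplicate embedding index: ${slot}`);
    }
    ordered[slot] = item.embedding;
  });
  return ordered;
}

const shuffledResponse = {
  data: [
    { index: 1, embedding: [0.3, 0.4] },
    { index: 0, embedding: [0.1, 0.2] },
  ],
};
const orderedEmbeddings = embeddingsInRequestOrder(shuffledResponse, 2);
console.log(JSON.stringify(orderedEmbeddings)); // [[0.1,0.2],[0.3,0.4]]
```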
34
+
35
+ ### MiniMax Native
36
+
37
+ MiniMax embeddings use a native request shape:
38
+
39
+ ```text
40
+ POST <base-url><endpoint>
41
+ Authorization: Bearer <api-key>
42
+ Content-Type: application/json
43
+
44
+ {
45
+ "model": "embo-01",
46
+ "type": "db",
47
+ "texts": ["document text one", "document text two"]
48
+ }
49
+ ```
50
+
51
+ The eval calls MiniMax with `type: "db"` for memory-node vectors and
52
+ `type: "query"` for query vectors. The response must contain `vectors[]`.
53
+
54
+ ### DashScope Native
55
+
56
+ DashScope native embeddings use separate `text_type` values for document and
57
+ query vectors:
58
+
59
+ ```text
60
+ POST <base-url><endpoint>
61
+ Authorization: Bearer <api-key>
62
+ Content-Type: application/json
63
+
64
+ {
65
+ "model": "text-embedding-v4",
66
+ "input": {
67
+ "texts": ["document text one", "document text two"]
68
+ },
69
+ "parameters": {
70
+ "text_type": "document",
71
+ "dimension": 1024
72
+ }
73
+ }
74
+ ```
75
+
76
+ The eval calls DashScope with `text_type: "document"` for memory-node vectors
77
+ and `text_type: "query"` for query vectors. The response must contain
78
+ `output.embeddings[].embedding` arrays, ordered by `text_index`.
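
The request and response shapes above can be sketched as two small helpers. `buildDashScopeBody` and `parseDashScopeEmbeddings` are hypothetical names; the field names follow the request body and response description in this section, and no network call is made.

```javascript
// Hypothetical helper: builds the native request body shown above.
// textType is "document" for memory-node vectors and "query" for query vectors.
function buildDashScopeBody(model, texts, textType, dimension) {
  return {
    model,
    input: { texts },
    parameters: { text_type: textType, dimension },
  };
}

// Hypothetical helper: returns embeddings sorted by `text_index`.
function parseDashScopeEmbeddings(response) {
  return [...response.output.embeddings]
    .sort((a, b) => a.text_index - b.text_index)
    .map((entry) => entry.embedding);
}

const queryBody = buildDashScopeBody("text-embedding-v4", ["where is the route?"], "query", 1024);
const parsedEmbeddings = parseDashScopeEmbeddings({
  output: {
    embeddings: [
      { text_index: 1, embedding: [0.5] },
      { text_index: 0, embedding: [0.25] },
    ],
  },
});
console.log(queryBody.parameters.text_type, JSON.stringify(parsedEmbeddings)); // query [[0.25],[0.5]]
```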
79
+
80
+ ## Run
81
+
82
+ ```bash
83
+ AIONIS_EMBEDDING_API_KEY=... \
84
+ AIONIS_EMBEDDING_MODEL=text-embedding-3-small \
85
+ npm run check:zvec-provider-embedding -- \
86
+ --base-url https://api.openai.com/v1 \
87
+ --nodes 240 \
88
+ --scopes 4 \
89
+ --queries 20 \
90
+ --candidate-limit 20
91
+ ```
92
+
93
+ For another OpenAI-compatible provider:
94
+
95
+ ```bash
96
+ AIONIS_EMBEDDING_API_KEY=... \
97
+ npm run check:zvec-provider-embedding -- \
98
+ --base-url https://provider.example/v1 \
99
+ --endpoint /embeddings \
100
+ --model provider-embedding-model
101
+ ```
102
+
103
+ For Alibaba Cloud DashScope `text-embedding-v4` through the OpenAI-compatible
104
+ endpoint:
105
+
106
+ ```bash
107
+ AIONIS_EMBEDDING_PROVIDER=openai \
108
+ AIONIS_EMBEDDING_API_KEY=... \
109
+ AIONIS_EMBEDDING_MODEL=text-embedding-v4 \
110
+ npm run check:zvec-provider-embedding -- \
111
+ --base-url https://dashscope.aliyuncs.com/compatible-mode/v1 \
112
+ --endpoint /embeddings \
113
+ --dimensions 1024 \
114
+ --batch-size 10 \
115
+ --nodes 240 \
116
+ --scopes 4 \
117
+ --queries 20 \
118
+ --candidate-limit 40
119
+ ```
120
+
121
+ DashScope `text-embedding-v4` accepts only small batches on this endpoint, so the
122
+ example uses `--batch-size 10`. In the current provider eval,
123
+ `--candidate-limit 40` is a better first setting than `20` because it lets Zvec
124
+ act as a semantic candidate preselector without prematurely excluding lexical matches.
125
+
126
+ For Alibaba Cloud DashScope `text-embedding-v4` through the native endpoint:
127
+
128
+ ```bash
129
+ AIONIS_EMBEDDING_PROVIDER=dashscope \
130
+ AIONIS_EMBEDDING_API_KEY=... \
131
+ AIONIS_EMBEDDING_MODEL=text-embedding-v4 \
132
+ npm run check:zvec-provider-embedding -- \
133
+ --dimensions 1024 \
134
+ --projection structured \
135
+ --query-instruct "Retrieve the Aionis Substrate memory document that best answers the implementation question." \
136
+ --batch-size 10 \
137
+ --nodes 240 \
138
+ --scopes 4 \
139
+ --queries 20 \
140
+ --candidate-limit 40
141
+ ```
142
+
143
+ The native provider path is useful because it preserves the provider's
144
+ query/document embedding contract instead of embedding every text through one
145
+ generic endpoint shape.
146
+
147
+ For MiniMax:
148
+
149
+ ```bash
150
+ AIONIS_EMBEDDING_PROVIDER=minimax \
151
+ AIONIS_EMBEDDING_API_KEY=... \
152
+ AIONIS_EMBEDDING_MODEL=embo-01 \
153
+ npm run check:zvec-provider-embedding -- \
154
+ --base-url https://api.minimaxi.com/v1 \
155
+ --nodes 240 \
156
+ --scopes 4 \
157
+ --queries 20 \
158
+ --candidate-limit 20
159
+ ```
160
+
161
+ The command writes a report under `reports/zvec-provider-embedding-*` unless `--output` is supplied.
162
+
163
+ ## Options
164
+
165
+ - `--provider <openai|minimax|dashscope>`: embedding provider contract. Defaults to `AIONIS_EMBEDDING_PROVIDER` or `openai`.
166
+ - `--base-url <url>`: provider base URL. Defaults to `AIONIS_EMBEDDING_BASE_URL` or `https://api.openai.com/v1`.
167
+ - `--endpoint <path>`: embeddings endpoint. Defaults to `AIONIS_EMBEDDING_ENDPOINT` or `/embeddings`.
168
+ - `--model <name>`: embedding model. Defaults to `AIONIS_EMBEDDING_MODEL`.
169
+ - `--api-key-var <name>`: environment variable containing the API key. Defaults to `AIONIS_EMBEDDING_API_KEY`.
170
+ - `--dimensions <n>`: optional embedding dimensions request parameter.
171
+ - `--projection <plain|structured>`: embedding text projection. `plain` embeds compact search text; `structured` uses the SDK `buildAionisEmbeddingDocument` and `buildAionisEmbeddingQuery` contract. Defaults to `structured`.
172
+ - `--query-instruct <text>`: optional query instruction for providers that support query-side instructions.
173
+ - `--nodes <n>`: generated Substrate nodes.
174
+ - `--scopes <n>`: generated scopes.
175
+ - `--queries <n>`: semantic query probes. Current built-in fixture supports up to 24.
176
+ - `--batch-size <n>`: provider embedding batch size.
177
+ - `--candidate-limit <n>`: Zvec candidate window.
178
+ - `--result-limit <n>`: final Substrate search result limit.
179
+ - `--keep-store`: keep temporary SQLite and Zvec files for inspection.
180
+
181
+ ## Report Interpretation
182
+
183
+ Important fields:
184
+
185
+ - `raw_zvec_candidate_top1_rate`: provider embedding quality at the Zvec candidate layer.
186
+ - `raw_zvec_candidate_topk_rate`: whether the expected memory id entered the candidate window.
187
+ - `final_substrate_topk_rate`: current end-to-end `searchNodes()` output after canonical Substrate scoring.
188
+ - `lexical_substrate_topk_rate`: canonical deterministic search without Zvec.
189
+ - `probe_results`: per-query raw candidate rank, final rank, lexical rank, and returned ids for miss analysis.
190
+ - `embedding_usage`: provider requests, embedded text count, input character count, provider token count when exposed, and failed request count.
191
+ - `zvec_health`: missing, orphan, and stale sidecar diagnostics.
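
To make the rate fields above concrete, here is a sketch that derives them from per-query ranks. The per-probe field names (`raw_rank`, `final_rank`, `lexical_rank`, with `null` for a miss) are assumptions about `probe_results`, not its documented shape; only the output key names come from the list above.

```javascript
// Hypothetical probe shape: 1-based ranks, null when the expected id was not returned.
function summarizeRates(probeResults, candidateLimit, resultLimit) {
  const total = probeResults.length;
  const within = (rank, limit) => rank !== null && rank <= limit;
  const rate = (hits) => (total === 0 ? 0 : hits / total);
  const count = (field, limit) =>
    probeResults.filter((probe) => within(probe[field], limit)).length;
  return {
    raw_zvec_candidate_top1_rate: rate(count("raw_rank", 1)),
    raw_zvec_candidate_topk_rate: rate(count("raw_rank", candidateLimit)),
    final_substrate_topk_rate: rate(count("final_rank", resultLimit)),
    lexical_substrate_topk_rate: rate(count("lexical_rank", resultLimit)),
  };
}

const rates = summarizeRates(
  [
    { raw_rank: 1, final_rank: 1, lexical_rank: 1 },
    { raw_rank: 3, final_rank: 2, lexical_rank: null },
    { raw_rank: 15, final_rank: null, lexical_rank: null },
    { raw_rank: null, final_rank: 4, lexical_rank: 4 },
  ],
  20, // candidate window, as in --candidate-limit
  5,  // final result window, as in --result-limit
);
console.log(JSON.stringify(rates));
// {"raw_zvec_candidate_top1_rate":0.25,"raw_zvec_candidate_topk_rate":0.75,"final_substrate_topk_rate":0.75,"lexical_substrate_topk_rate":0.5}
```

The third probe is the case worth inspecting in a real report: the expected id entered the candidate window but did not survive final canonical scoring.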
192
+
193
+ Current DashScope native `text-embedding-v4` smoke results with the SDK
194
+ projection on this fixture:
195
+
196
+ | Projection | Dimension | Raw Top-1 | Raw Top-K | Final Top-K | Lexical Top-K |
197
+ | --- | ---: | ---: | ---: | ---: | ---: |
198
+ | `plain` | 1024 | 40% | 100% | 75% | 70% |
199
+ | `structured` + query/document | 1024 | 45% | 100% | 85% | 70% |
200
+
201
+ The gain comes from the SDK query/document projection contract, not from
202
+ changing Substrate's final admission or search filters.
203
+
204
+ If raw Zvec hit rate is strong but final Substrate hit rate is weaker, the provider embeddings are finding useful candidates but the final canonical scorer is still acting as a lexical/structured gate. That is a search-contract boundary, not a provider failure.
205
+
206
+ The provider eval is intentionally strict about this distinction. A low final
207
+ Substrate hit rate can happen even when provider vectors are valid if the Zvec
208
+ candidate window excludes the expected id or if the final canonical scorer
209
+ prefers lexical/structured evidence over semantic candidates.
210
+
211
+ Substrate fuses candidate-index evidence into final `searchNodes()` ranking by
212
+ adding auditable `semantic_candidate_fusion` reasons and preserving a small
213
+ semantic recall floor for top-ranked candidates. This only changes ranking after
214
+ normal scope, lifecycle, authority, confidence, team, agent, and target-file
215
+ filters pass. Zvec still remains a sidecar candidate preselector; file/SQLite
216
+ stores remain the truth store.
package/docs/ZVEC_SCALE_MAINTENANCE.md ADDED
@@ -0,0 +1,89 @@
1
+ # Zvec Scale Maintenance Check
2
+
3
+ This check validates the optional Zvec candidate sidecar under Substrate-scale write and maintenance operations.
4
+
5
+ It proves a storage/index contract:
6
+
7
+ - SQLite remains the truth store.
8
+ - Zvec receives write-through node upserts while nodes are written.
9
+ - `verify()` reports no missing, orphan, or stale entries after writes.
10
+ - reopening the store rebuilds the sidecar from SQLite truth.
11
+ - wide-window Zvec candidate search preserves canonical Substrate search ids.
12
+ - narrow-window Zvec candidate search recovers seeded exact-node probes.
13
+ - lifecycle transitions update the sidecar fingerprint.
14
+ - checkpoint compaction and post-compaction reopen keep the sidecar verifiable.
15
+
16
+ The check uses the same deterministic local text projection as the Runtime Zvec validation. It validates storage/index maintenance and candidate-window safety. It does not evaluate embedding-provider semantic quality.
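
The write-through, lifecycle, and rebuild-on-reopen rules in the contract above can be modelled in a few lines. This is a sketch under stated assumptions: `WriteThroughStore`, `fingerprintOf`, and the lifecycle values `active` and `retired` are hypothetical, and two `Map`s stand in for the SQLite truth store and the Zvec sidecar.

```javascript
// Hypothetical fingerprint over the fields the sidecar would index.
const fingerprintOf = (node) => `${node.lifecycle}|${node.text}`;

class WriteThroughStore {
  constructor(truth = new Map()) {
    this.truth = truth;       // stands in for the SQLite truth store
    this.sidecar = new Map(); // stands in for the Zvec sidecar
    this.rebuild();           // opening the store rebuilds the sidecar from truth
  }

  rebuild() {
    this.sidecar = new Map(
      [...this.truth.values()].map((node) => [node.id, fingerprintOf(node)]),
    );
  }

  upsert(node) {
    this.truth.set(node.id, { ...node });           // truth store mutation first
    this.sidecar.set(node.id, fingerprintOf(node)); // then write through to the sidecar
  }

  transition(id, lifecycle) {
    const node = this.truth.get(id);
    if (!node) throw new Error(`unknown node: ${id}`);
    this.upsert({ ...node, lifecycle }); // a lifecycle change refreshes the fingerprint
  }
}

const scaleStore = new WriteThroughStore();
scaleStore.upsert({ id: "n1", lifecycle: "active", text: "use src/runtime.ts" });
const fingerprintBefore = scaleStore.sidecar.get("n1");
scaleStore.transition("n1", "retired");
const fingerprintAfter = scaleStore.sidecar.get("n1");
const reopenedStore = new WriteThroughStore(scaleStore.truth);
console.log(fingerprintBefore !== fingerprintAfter, reopenedStore.sidecar.get("n1") === fingerprintAfter);
// true true
```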
17
+
18
+ ## Run
19
+
20
+ ```bash
21
+ npm run check:zvec-scale -- \
22
+ --nodes 10000 \
23
+ --scopes 10 \
24
+ --relations 2000 \
25
+ --feedback 1000 \
26
+ --probes 100 \
27
+ --narrow-candidate-limit 20
28
+ ```
29
+
30
+ The command writes a report under `reports/zvec-scale-*` unless `--output` is supplied.
31
+
32
+ ## Options
33
+
34
+ - `--nodes <n>`: generated memory nodes.
35
+ - `--scopes <n>`: generated scopes.
36
+ - `--relations <n>`: generated relation rows.
37
+ - `--feedback <n>`: generated feedback rows.
38
+ - `--probes <n>`: exact-node search probes.
39
+ - `--narrow-candidate-limit <n>`: Zvec candidate window for narrow seeded recovery probes.
40
+ - `--transitions <n>`: lifecycle transitions to apply after reopen.
41
+ - `--output <dir>`: report directory.
42
+ - `--keep-store`: keep the temporary SQLite and Zvec files for inspection.
43
+
44
+ ## Report Interpretation
45
+
46
+ Important fields:
47
+
48
+ - `zvec_health.after_write`: write-through index health.
49
+ - `zvec_health.after_reopen`: rebuild-on-open health.
50
+ - `zvec_health.after_transitions`: lifecycle transition sync health.
51
+ - `zvec_health.after_compact` and `after_compact_reopen`: compaction and reopen health.
52
+ - `wide_parity_rate`: wide-window Zvec search matched canonical Substrate search.
53
+ - `narrow_seed_hit_rate`: narrow-window Zvec search recovered seeded nodes.
54
+ - `sqlite_bytes` and `zvec_bytes`: local storage footprint for the generated run.
55
+
56
+ Any missing, orphan, stale, parity, or seeded-recovery failure means the Zvec sidecar needs maintenance work before it should be used for larger Substrate candidate preselection.
57
+
58
+ ## Local Validation Snapshot
59
+
60
+ On 2026-06-26, the check was run locally with a 10k-node SQLite truth store and Zvec sidecar:
61
+
62
+ ```bash
63
+ npm run check:zvec-scale -- \
64
+ --nodes 10000 \
65
+ --scopes 10 \
66
+ --relations 2000 \
67
+ --feedback 1000 \
68
+ --probes 100 \
69
+ --narrow-candidate-limit 20
70
+ ```
71
+
72
+ Result:
73
+
74
+ - nodes: 10,000
75
+ - relations: 2,000
76
+ - feedback records: 1,000
77
+ - lifecycle transitions: 100
78
+ - probes: 100
79
+ - Zvec health after write/reopen/transitions/compact/compact-reopen: pass
80
+ - wide candidate parity: 100%
81
+ - narrow seeded recovery: 100%
82
+ - SQLite bytes: 17,657,856
83
+ - Zvec sidecar bytes: 19,528,279
84
+ - write nodes with Zvec: 2,335 ms
85
+ - close after write, including manifest flush: 171 ms
86
+ - reopen and rebuild: 1,855 ms
87
+ - post-compaction reopen and rebuild: 1,783 ms
88
+
89
+ This validates the sidecar maintenance path at package scale: write-through indexing, rebuild, lifecycle-transition synchronization, compaction, and candidate-window safety all remain consistent with the SQLite truth store.
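
For a rough sense of scale, the snapshot figures above can be turned into per-node numbers. The arithmetic below uses only values listed in the result; the derived key names are hypothetical, and the rebuild rate is approximate because the 1,855 ms figure covers reopen as well as rebuild.

```javascript
// Figures copied from the local validation snapshot above.
const snapshot = {
  nodes: 10000,
  sqliteBytes: 17657856,
  zvecBytes: 19528279,
  writeMs: 2335,
  reopenRebuildMs: 1855,
};

const derived = {
  sqlite_bytes_per_node: Math.round(snapshot.sqliteBytes / snapshot.nodes),
  zvec_bytes_per_node: Math.round(snapshot.zvecBytes / snapshot.nodes),
  write_nodes_per_second: Math.round((snapshot.nodes * 1000) / snapshot.writeMs),
  rebuild_nodes_per_second: Math.round((snapshot.nodes * 1000) / snapshot.reopenRebuildMs),
};
console.log(JSON.stringify(derived));
// {"sqlite_bytes_per_node":1766,"zvec_bytes_per_node":1953,"write_nodes_per_second":4283,"rebuild_nodes_per_second":5391}
```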
package/examples/live-sidecar/index.mjs ADDED
@@ -0,0 +1,189 @@
1
+ import assert from "node:assert/strict";
2
+ import { mkdtemp, rm } from "node:fs/promises";
3
+ import { tmpdir } from "node:os";
4
+ import { join } from "node:path";
5
+ import { DatabaseSync } from "node:sqlite";
6
+ import {
7
+ openSqliteAionisSubstrate,
8
+ runRuntimeLiveSidecarOnce,
9
+ } from "../../dist/index.js";
10
+
11
+ const scope = "repo-a";
12
+ const workspace = await mkdtemp(join(tmpdir(), "aionis-substrate-live-sidecar-"));
13
+
14
+ function createRuntimeLiteSource(path) {
15
+ const db = new DatabaseSync(path);
16
+ try {
17
+ db.exec(`
18
+ CREATE TABLE lite_memory_nodes (
19
+ id TEXT PRIMARY KEY,
20
+ scope TEXT NOT NULL,
21
+ client_id TEXT,
22
+ type TEXT NOT NULL,
23
+ tier TEXT NOT NULL,
24
+ title TEXT,
25
+ text_summary TEXT,
26
+ slots_json TEXT NOT NULL,
27
+ raw_ref TEXT,
28
+ evidence_ref TEXT,
29
+ embedding_vector_json TEXT,
30
+ embedding_model TEXT,
31
+ memory_lane TEXT NOT NULL,
32
+ producer_agent_id TEXT,
33
+ owner_agent_id TEXT,
34
+ owner_team_id TEXT,
35
+ embedding_status TEXT NOT NULL,
36
+ embedding_last_error TEXT,
37
+ salience REAL NOT NULL,
38
+ importance REAL NOT NULL,
39
+ confidence REAL NOT NULL,
40
+ redaction_version INTEGER NOT NULL,
41
+ commit_id TEXT NOT NULL,
42
+ created_at TEXT NOT NULL
43
+ );
44
+ `);
45
+ } finally {
46
+ db.close();
47
+ }
48
+ }
49
+
50
+ function insertRuntimeNode(path, row) {
51
+ const db = new DatabaseSync(path);
52
+ try {
53
+ db.prepare(`
54
+ INSERT INTO lite_memory_nodes (
55
+ id, scope, client_id, type, tier, title, text_summary, slots_json, raw_ref, evidence_ref,
56
+ embedding_vector_json, embedding_model, memory_lane, producer_agent_id, owner_agent_id,
57
+ owner_team_id, embedding_status, embedding_last_error, salience, importance, confidence,
58
+ redaction_version, commit_id, created_at
59
+ ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
60
+ `).run(
61
+ row.id,
62
+ scope,
63
+ `client-${row.id}`,
64
+ row.type,
65
+ row.tier,
66
+ row.title,
67
+ row.summary,
68
+ JSON.stringify(row.slots),
69
+ row.rawRef ?? null,
70
+ row.evidenceRef ?? null,
71
+ null,
72
+ "demo-embedding",
73
+ "execution",
74
+ "agent-a",
75
+ "agent-a",
76
+ "team-a",
77
+ "ready",
78
+ null,
79
+ 0.8,
80
+ 0.85,
81
+ row.confidence,
82
+ 1,
83
+ "demo-commit",
84
+ row.createdAt,
85
+ );
86
+ } finally {
87
+ db.close();
88
+ }
89
+ }
90
+
91
+ try {
92
+ const runtimeSource = join(workspace, "runtime-lite.sqlite");
93
+ const substrateTarget = join(workspace, "substrate.sqlite");
94
+ const checkpoint = join(workspace, "runtime-live-checkpoint.json");
95
+ createRuntimeLiteSource(runtimeSource);
96
+
97
+ insertRuntimeNode(runtimeSource, {
98
+ id: "current-route",
99
+ type: "procedure",
100
+ tier: "hot",
101
+ title: "Current route",
102
+ summary: "Use src/runtime.ts after verifier passed.",
103
+ confidence: 0.95,
104
+ createdAt: "2026-06-26T00:00:00.000Z",
105
+ slots: {
106
+ summary_kind: "workflow_anchor",
107
+ contract_trust: "trusted",
108
+ target_files: ["src/runtime.ts", "tests/runtime.test.ts"],
109
+ execution_result_summary: { status: "passed" },
110
+ },
111
+ });
112
+
113
+ insertRuntimeNode(runtimeSource, {
114
+ id: "failed-branch",
115
+ type: "procedure",
116
+ tier: "hot",
117
+ title: "Failed branch",
118
+ summary: "The legacy src/legacy.ts path failed the verifier and should not steer the next turn.",
119
+ confidence: 0.9,
120
+ createdAt: "2026-06-26T00:01:00.000Z",
121
+ slots: {
122
+ summary_kind: "workflow_anchor",
123
+ contract_trust: "rejected",
124
+ target_files: ["src/legacy.ts"],
125
+ execution_result_summary: { status: "failed" },
126
+ },
127
+ });
128
+
129
+ insertRuntimeNode(runtimeSource, {
130
+ id: "raw-trace",
131
+ type: "trace_pointer",
132
+ tier: "cold",
133
+ title: "Raw terminal trace",
134
+ summary: "Full terminal trace is retained as payload evidence but should only be rehydrated on demand.",
135
+ rawRef: "file://trace.log",
136
+ confidence: 0.88,
137
+ createdAt: "2026-06-26T00:02:00.000Z",
138
+ slots: {
139
+ summary_kind: "raw_trace_pointer",
140
+ target_files: ["src/runtime.ts"],
141
+ },
142
+ });
143
+
144
+ const store = await openSqliteAionisSubstrate({ path: substrateTarget });
145
+ try {
146
+ const first = await runRuntimeLiveSidecarOnce({
147
+ sourcePath: runtimeSource,
148
+ target: store,
149
+ checkpointPath: checkpoint,
150
+ scope,
151
+ });
152
+ const second = await runRuntimeLiveSidecarOnce({
153
+ sourcePath: runtimeSource,
154
+ target: store,
155
+ checkpointPath: checkpoint,
156
+ scope,
157
+ });
158
+
159
+ const context = await store.previewContext({
160
+ scope,
161
+ query: "continue the current runtime implementation route",
162
+ maxPerBucket: 8,
163
+ });
164
+
165
+ assert.equal(first.apply_summary.nodes.applied, 3);
166
+ assert.equal(second.apply_summary.nodes.applied, 0);
167
+ assert.deepEqual(context.use_now.map((node) => node.id), ["current-route"]);
168
+ assert.deepEqual(context.do_not_use.map((node) => node.id), ["failed-branch"]);
169
+ assert.deepEqual(context.rehydrate.map((node) => node.id), ["raw-trace"]);
170
+
171
+ console.log(JSON.stringify({
172
+ ok: true,
173
+ runtime_source: runtimeSource,
174
+ substrate_target: substrateTarget,
175
+ first_sidecar_run: first.apply_summary.nodes,
176
+ second_sidecar_run: second.apply_summary.nodes,
177
+ governed_context: {
178
+ use_now: context.use_now.map((node) => node.id),
179
+ inspect_before_use: context.inspect_before_use.map((node) => node.id),
180
+ do_not_use: context.do_not_use.map((node) => node.id),
181
+ rehydrate: context.rehydrate.map((node) => node.id),
182
+ },
183
+ }, null, 2));
184
+ } finally {
185
+ await store.close();
186
+ }
187
+ } finally {
188
+ await rm(workspace, { recursive: true, force: true });
189
+ }