@semiont/make-meaning 0.5.5 → 0.5.7

package/README.md CHANGED
@@ -8,18 +8,26 @@
8
8
 
9
9
  **Making meaning from resources through actors, context assembly, and relationship reasoning.**
10
10
 
11
- This package implements the actor model from [ACTOR-MODEL.md](../../docs/system/ACTOR-MODEL.md). It owns the **Knowledge Base** and the actors that interface with it:
11
+ This package implements the actor model from [ACTOR-MODEL.md](../../docs/system/ACTOR-MODEL.md). It owns the **Knowledge Base** and the seven actors that serve it.
12
+
13
+ Five **access actors** mediate every read and write — the bus-facing interface of the Knowledge Base:
12
14
 
13
15
  - **Stower** (write) — the single write gateway to the Knowledge Base; handles all resource and annotation mutations and job lifecycle events
14
- - **Browser** (read) — handles all KB read queries: resources, annotations, events, annotation history, referenced-by lookups, entity type listing, and directory browse (merging filesystem listings with KB metadata)
16
+ - **Browser** (read) — handles all KB read queries: resources, annotations, events, annotation history, referenced-by lookups, entity type and tag-schema listing, and directory browse (merging filesystem listings with KB metadata)
15
17
  - **Gatherer** (context assembly) — assembles gathered context for annotations (`gather:requested`) and resources (`gather:resource-requested`); searches vectors for semantically similar passages (adds `semanticContext` to `GatheredContext`)
16
18
  - **Matcher** (search/link) — context-driven candidate search with multi-source retrieval, composite structural scoring, and optional LLM semantic scoring
17
- - **Smelter** (embed) — subscribes to resource/annotation events, chunks text, embeds via `@semiont/vectors`, emits `embedding:compute` commands (persisted by Stower as `embedding:computed` events), and indexes into vector store (Qdrant)
18
19
  - **CloneTokenManager** (yield) — manages clone token lifecycle for resource cloning
19
20
 
20
- All actors subscribe to the EventBus via RxJS pipelines. They expose only `initialize()` and `stop()`, with no public business methods. Callers communicate with actors by putting events on the bus.
21
+ Two **projection pipelines** follow the event log to keep the eventually-consistent read models in sync, addressed by no one, replying to nothing:
22
+
23
+ - **Graph Consumer** (project) — subscribes to graph-relevant domain events and projects them into the graph database; carried on the KB record (`kb.graphConsumer`) and rebuilt from the event log at startup (`rebuildAll()`)
24
+ - **Smelter** (embed) — standalone embedding pipeline run via `@semiont/make-meaning/smelter-main` (not started by `startMakeMeaning`); subscribes to domain events, reads content from the KB working tree via `WorkerContentTransport`, chunks text, embeds via `@semiont/vectors`, and indexes into the vector store (Qdrant). On startup it reconciles Qdrant against the KS catalog — re-embedding what's missing or stale (every upsert is stamped with the embedded bytes' checksum, so changed content is detected) and deleting orphans — so a wiped Qdrant volume, or events missed while the worker was down, recover by restarting the smelter
25
+
26
+ (The third derived read model — the materialized views — is not pipeline-maintained: the EventStore's `ViewManager` materializes views synchronously inside `appendEvent()` for a read-your-writes guarantee.)
27
+
28
+ All seven actors subscribe to the EventBus via RxJS pipelines and expose no public business methods — only `initialize()` and `stop()`, plus a startup recovery entry point on the pipelines (`rebuildAll()` / `reconcile()`). Callers communicate with the access actors by putting events on the bus.
21
29
 
22
- The EventBus is a **complete interface** for all knowledge-domain operations. HTTP routes in the backend are thin wrappers that delegate to EventBus actors. The `@semiont/api-client` exposes the same operations via verb-oriented namespaces (`semiont.browse`, `semiont.mark`, `semiont.gather`, etc.).
30
+ The EventBus is a **complete interface** for all knowledge-domain operations. HTTP routes in the backend are thin wrappers that delegate to EventBus actors. The `@semiont/http-transport` exposes the same operations via verb-oriented namespaces (`semiont.browse`, `semiont.mark`, `semiont.gather`, etc.).
23
31
 
24
32
  ## Quick Start
25
33
 
@@ -44,7 +52,7 @@ const makeMeaning = await startMakeMeaning(project, config, eventBus, logger);
44
52
 
45
53
  // Access components
46
54
  const { knowledgeSystem, jobQueue } = makeMeaning;
47
- const { kb, stower, browser, gatherer, matcher, smelter, cloneTokenManager } = knowledgeSystem;
55
+ const { kb, stower, browser, gatherer, matcher, cloneTokenManager } = knowledgeSystem;
48
56
 
49
57
  // Graceful shutdown
50
58
  await makeMeaning.stop();
@@ -52,33 +60,37 @@ await makeMeaning.stop();
52
60
 
53
61
  This single call initializes:
54
62
  - **KnowledgeSystem** — groups the Knowledge Base and its actors
55
- - **KnowledgeBase** — groups EventStore, ViewStorage, WorkingTreeStore, GraphDatabase, GraphDBConsumer, and optionally VectorStore and Smelter
63
+ - **KnowledgeBase** — groups EventStore, ViewStorage, WorkingTreeStore, GraphDatabase, GraphDBConsumer, and optionally VectorStore
56
64
  - **Stower** — subscribes to write commands on EventBus
57
65
  - **Browser** — subscribes to all KB read queries and directory browse requests on EventBus
58
66
  - **Gatherer** — subscribes to annotation and resource gather requests on EventBus; searches vectors for semantically similar passages
59
67
  - **Matcher** — subscribes to candidate search requests on EventBus
60
- - **Smelter** — subscribes to resource/annotation events, chunks text, embeds, indexes into Qdrant
61
68
  - **CloneTokenManager** — subscribes to clone token operations on EventBus
62
69
  - **JobQueue** — background job processing queue + job status subscription
63
- - **6 annotation workers** — poll job queue for async AI tasks
70
+ - **Bus command handlers** — request-channel translators registered via `registerBusHandlers`
71
+
72
+ It does **not** start the Smelter (a standalone process — `@semiont/make-meaning/smelter-main`) or the job workers (the worker process in [@semiont/jobs](../jobs/) — see [Job Workers](./docs/job-workers.md)).
64
73
 
65
74
  ### Gather Context (via EventBus)
66
75
 
67
76
  ```typescript
68
77
  import { firstValueFrom, race, filter, timeout } from 'rxjs';
69
78
 
79
+ const correlationId = crypto.randomUUID();
80
+
70
81
  // Emit gather request for an annotation
71
82
  eventBus.get('gather:requested').next({
72
- annotationUri,
83
+ correlationId,
84
+ annotationId,
73
85
  resourceId,
74
- options: { contextLines: 5 },
86
+ options: { contextWindow: 1000 },
75
87
  });
76
88
 
77
89
  // Await result
78
90
  const result = await firstValueFrom(
79
91
  race(
80
- eventBus.get('gather:complete').pipe(filter(e => e.annotationUri === annotationUri)),
81
- eventBus.get('gather:failed').pipe(filter(e => e.annotationUri === annotationUri)),
92
+ eventBus.get('gather:complete').pipe(filter(e => e.correlationId === correlationId)),
93
+ eventBus.get('gather:failed').pipe(filter(e => e.correlationId === correlationId)),
82
94
  ).pipe(timeout(30_000)),
83
95
  );
84
96
  ```
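Note on the change above: replies are now matched on `correlationId` rather than `annotationUri`, so two in-flight requests for the same annotation can no longer pick up each other's replies. The sketch below illustrates only that matching idea. `TinyChannel`, `gatherOnce`, and the reply shape are hypothetical stand-ins, not the package's RxJS-based EventBus or its real payload types.

```typescript
// Hypothetical in-memory channel; the real EventBus exposes RxJS subjects.
type Listener<T> = (event: T) => void;

class TinyChannel<T> {
  private listeners: Listener<T>[] = [];

  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }

  next(event: T): void {
    // Iterate over a copy so listeners may unsubscribe while being notified.
    for (const listener of this.listeners.slice()) listener(event);
  }
}

// Assumed payload shapes, reduced to the fields the sketch needs.
interface GatherRequest { correlationId: string; annotationId: string; resourceId: string; }
interface GatherReply { correlationId: string; detail: string; }

const requested = new TinyChannel<GatherRequest>();
const completed = new TinyChannel<GatherReply>();

// Hypothetical responder: copies the correlationId onto its reply.
requested.subscribe(req => {
  completed.next({ correlationId: req.correlationId, detail: `context for ${req.annotationId}` });
});

let counter = 0;

function gatherOnce(annotationId: string, resourceId: string, onReply: (reply: GatherReply) => void): void {
  const correlationId = `req-${++counter}`;
  // Subscribe before emitting so a synchronous reply is not missed.
  const unsubscribe = completed.subscribe(reply => {
    if (reply.correlationId !== correlationId) return; // reply to another request
    unsubscribe();
    onReply(reply);
  });
  requested.next({ correlationId, annotationId, resourceId });
}
```

Each caller generates its own id and ignores every reply that does not carry it, which is the role `filter(e => e.correlationId === correlationId)` plays in the README example.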
@@ -93,14 +105,15 @@ All meaningful actions flow through the EventBus. The KB actors are reactive —
93
105
  graph TB
94
106
  Routes["Backend Routes"] -->|commands| BUS["Event Bus"]
95
107
  Workers["Job Workers"] -->|commands| BUS
96
- EBC["SemiontApiClient"] -->|commands| BUS
108
+ EBC["SemiontClient"] -->|commands| BUS
97
109
 
98
110
  subgraph ks ["Knowledge System"]
99
111
  STOWER["Stower<br/>(write)"]
100
112
  BROWSER["Browser<br/>(read)"]
101
113
  GATHERER["Gatherer<br/>(context assembly)"]
102
114
  MATCHER["Matcher<br/>(search/link)"]
103
- SMELTER["Smelter<br/>(embed)"]
115
+ SMELTER["Smelter<br/>(embed pipeline, standalone process)"]
116
+ GC["Graph Consumer<br/>(graph pipeline)"]
104
117
  CTM["CloneTokenManager<br/>(clone)"]
105
118
  KB["Knowledge Base"]
106
119
  VECTORS["Vector Store<br/>(Qdrant)"]
@@ -112,21 +125,22 @@ graph TB
112
125
  MATCHER -->|search| VECTORS
113
126
  SMELTER -->|embed & index| VECTORS
114
127
  SMELTER -->|read| KB
128
+ GC -->|project| KB
115
129
  CTM -->|query| KB
116
130
  end
117
131
 
118
- BUS -->|"yield:create, yield:update, yield:mv<br/>mark:create, mark:delete, mark:update-body<br/>frame:add-entity-type, mark:archive, mark:unarchive<br/>mark:update-entity-types, job:start, job:*"| STOWER
119
- BUS -->|"browse:resource-requested, browse:resources-requested<br/>browse:annotations-requested, browse:annotation-requested<br/>browse:events-requested, browse:annotation-history-requested<br/>browse:referenced-by-requested, browse:entity-types-requested<br/>browse:directory-requested"| BROWSER
132
+ BUS -->|"yield:create, yield:update, yield:mv<br/>mark:create, mark:delete, mark:update-body<br/>frame:add-entity-type, frame:add-tag-schema<br/>mark:archive, mark:unarchive, mark:update-entity-types<br/>job:start, job:complete, job:fail"| STOWER
133
+ BUS -->|"browse:resource-requested, browse:resources-requested<br/>browse:annotations-requested, browse:annotation-requested<br/>browse:events-requested, browse:annotation-history-requested<br/>browse:referenced-by-requested, browse:entity-types-requested<br/>browse:tag-schemas-requested, browse:directory-requested"| BROWSER
120
134
  BUS -->|"gather:requested<br/>gather:resource-requested"| GATHERER
121
135
  BUS -->|"match:search-requested"| MATCHER
122
- BUS -->|"yield:created, mark:created,<br/>mark:body-updated"| SMELTER
136
+ BUS -->|"domain events:<br/>yield:created, yield:updated<br/>yield:representation-added<br/>mark:added, mark:removed, mark:archived"| SMELTER
137
+ BUS -->|"graph-relevant<br/>domain events"| GC
123
138
  BUS -->|"yield:clone-token-requested<br/>yield:clone-resource-requested<br/>yield:clone-create"| CTM
124
139
 
125
- STOWER -->|"yield:created, yield:updated, yield:moved<br/>mark:created, mark:deleted, mark:body-updated<br/>frame:entity-type-added, ..."| BUS
126
- BROWSER -->|"browse:resource-result, browse:resources-result<br/>browse:annotations-result, browse:annotation-result<br/>browse:events-result, browse:annotation-history-result<br/>browse:referenced-by-result, browse:entity-types-result<br/>browse:directory-result"| BUS
140
+ STOWER -->|"yield:create-ok, yield:update-ok, yield:move-ok<br/>mark:delete-ok, *-failed replies<br/>(domain events are republished onto the bus<br/>by the EventStore: yield:created, mark:added, ...)"| BUS
141
+ BROWSER -->|"browse:resource-result, browse:resources-result<br/>browse:annotations-result, browse:annotation-result<br/>browse:events-result, browse:annotation-history-result<br/>browse:referenced-by-result, browse:entity-types-result<br/>browse:tag-schemas-result, browse:directory-result"| BUS
127
142
  GATHERER -->|"gather:complete, gather:failed<br/>gather:resource-complete, gather:resource-failed"| BUS
128
143
  MATCHER -->|"match:search-results, match:search-failed"| BUS
129
- SMELTER -->|"embedding:compute,<br/>embedding:delete"| BUS
130
144
  CTM -->|"yield:clone-token-generated<br/>yield:clone-resource-result<br/>yield:clone-created"| BUS
131
145
 
132
146
  classDef bus fill:#e8a838,stroke:#b07818,stroke-width:3px,color:#000,font-weight:bold
@@ -136,7 +150,7 @@ graph TB
136
150
 
137
151
  class BUS bus
138
152
  classDef vectorstore fill:#6b8e9d,stroke:#4a6a7a,stroke-width:2px,color:#fff
139
- class STOWER,BROWSER,GATHERER,MATCHER,SMELTER,CTM actor
153
+ class STOWER,BROWSER,GATHERER,MATCHER,SMELTER,GC,CTM actor
140
154
  class KB kb
141
155
  class VECTORS vectorstore
142
156
  class Routes,Workers,EBC caller
@@ -146,24 +160,25 @@ graph TB
146
160
 
147
161
  The **Knowledge System** binds the Knowledge Base to its actors. Nothing outside the Knowledge System reads or writes the Knowledge Base directly.
148
162
 
149
- The **Knowledge Base** is an inert store — it has no intelligence, no goals, no decisions. It groups five core subsystems and two optional ones:
163
+ The **Knowledge Base** is an inert store — it has no intelligence, no goals, no decisions. It groups five core subsystems and one optional one:
150
164
 
151
165
  | Store | Implementation | Purpose |
152
166
  |-------|---------------|---------|
153
167
  | **Event Log** | `EventStore` | Immutable append-only log of all domain events |
154
- | **Materialized Views** | `ViewStorage` | Denormalized projections for fast reads |
168
+ | **Materialized Views** | `ViewStorage` | Denormalized projections for fast reads (materialized synchronously on append) |
155
169
  | **Content Store** | `WorkingTreeStore` | Working-tree files addressed by URI |
156
170
  | **Graph** | `GraphDatabase` | Eventually consistent relationship projection |
157
- | **Graph Consumer** | `GraphDBConsumer` | Event-to-graph synchronization pipeline |
171
+ | **Graph Consumer** | `GraphDBConsumer` | Event-to-graph projection pipeline (one of the two pipeline actors; carried on the KB record because `createKnowledgeBase()` constructs and starts it) |
158
172
  | **Vectors** *(optional)* | `VectorStore` | Semantic vector index (Qdrant + memory) via `@semiont/vectors` |
159
- | **Smelter** *(optional)* | `Smelter` | Embedding pipeline actor (chunk, embed, index) |
173
+
174
+ Its sibling pipeline, the Smelter (event-to-vector projection), is **not** a KB member — it runs as a standalone process via `@semiont/make-meaning/smelter-main`.
160
175
 
161
176
  ```typescript
162
177
  import { createKnowledgeBase } from '@semiont/make-meaning';
163
178
 
164
- const kb = await createKnowledgeBase(eventStore, project, graphDb, logger);
179
+ const kb = await createKnowledgeBase(eventStore, project, graphDb, eventBus, logger, options);
165
180
  // kb.eventStore, kb.views, kb.content, kb.graph, kb.graphConsumer
166
- // kb.vectors (optional), kb.smelter (optional)
181
+ // kb.vectors (optional), kb.projectionsDir
167
182
  ```
168
183
 
169
184
  ### EventBus Ownership
@@ -196,7 +211,7 @@ This pattern (functional core, imperative shell) is shared with `@semiont/event-
196
211
  ### Service (Primary)
197
212
 
198
213
  - `startMakeMeaning(project, config, eventBus, logger)` — Initialize all infrastructure
199
- - `MakeMeaningService` — Type for service return value (`knowledgeSystem`, `jobQueue`, `workers`, `stop`)
214
+ - `MakeMeaningService` — Type for service return value (`knowledgeSystem`, `jobQueue`, `stop`)
200
215
 
201
216
  ### Knowledge System
202
217
 
@@ -205,8 +220,8 @@ This pattern (functional core, imperative shell) is shared with `@semiont/event-
205
220
 
206
221
  ### Knowledge Base
207
222
 
208
- - `createKnowledgeBase(eventStore, project, graphDb, logger)` — Async factory function
209
- - `KnowledgeBase` — Interface grouping the five KB stores (including `graphConsumer`)
223
+ - `createKnowledgeBase(eventStore, project, graphDb, eventBus, logger, options?)` — Async factory function
224
+ - `KnowledgeBase` — Interface grouping the KB stores (`eventStore`, `views`, `content`, `graph`, optional `vectors`) plus the `graphConsumer` pipeline
210
225
 
211
226
  ### Actors
212
227
 
@@ -214,8 +229,10 @@ This pattern (functional core, imperative shell) is shared with `@semiont/event-
214
229
  - `Browser` — Read actor (all KB queries, directory listings merged with KB metadata)
215
230
  - `Gatherer` — Context assembly actor (annotation and resource gather flows; vector semantic search)
216
231
  - `Matcher` — Search/link actor (context-driven candidate search with structural + semantic scoring)
217
- - `Smelter` — Embedding pipeline actor (chunk, embed, persist, index into vector store)
218
232
  - `CloneTokenManager` — Clone token lifecycle actor (yield domain)
233
+ - `Smelter` / `createSmelterActorStateUnit` / `WorkerContentTransport` — the embedding pipeline, its domain-event fan-in, and the worker-side content transport; wired together by the standalone `@semiont/make-meaning/smelter-main` entry point, and exported for callers that run the pipeline on their own `WorkerBus`
234
+
235
+ The Graph Consumer (`GraphDBConsumer`) is not exported — `createKnowledgeBase()` constructs it internally and exposes it as `kb.graphConsumer`.
219
236
 
220
237
  ### Operations
221
238
 
@@ -237,7 +254,7 @@ This pattern (functional core, imperative shell) is shared with `@semiont/event-
237
254
  ## Dependencies
238
255
 
239
256
  - **[@semiont/core](../core/)** — Core types, EventBus, utilities
240
- - **[@semiont/api-client](../api-client/)** — OpenAPI-generated types
257
+ - **[@semiont/http-transport](../http-transport/)** — OpenAPI-generated types
241
258
  - **[@semiont/event-sourcing](../event-sourcing/)** — Event store and view storage
242
259
  - **[@semiont/content](../content/)** — Content-addressed storage
243
260
  - **[@semiont/graph](../graph/)** — Graph database abstraction
@@ -245,6 +262,8 @@ This pattern (functional core, imperative shell) is shared with `@semiont/event-
245
262
  - **[@semiont/inference](../inference/)** — AI primitives (generateText)
246
263
  - **[@semiont/vectors](../vectors/)** — Vector store abstraction (Qdrant + memory) and embedding providers (Voyage, Ollama)
247
264
  - **[@semiont/jobs](../jobs/)** — Job queue and annotation workers
265
+ - **[@semiont/observability](../observability/)** — Actor spans and metrics providers
266
+ - **[@semiont/sdk](../sdk/)** — `StateUnit` / `WorkerBus` types (used by the Smelter actor state unit)
248
267
 
249
268
  ## Testing
250
269
 
package/dist/index.d.ts CHANGED
@@ -1,14 +1,13 @@
1
1
  import { JobQueue } from '@semiont/jobs';
2
2
  import { SemiontProject } from '@semiont/core/node';
3
- import { GraphServiceConfig, VectorsServiceConfig, EmbeddingServiceConfig, EventBus, Logger, StoredEvent, ResourceId, ResourceDescriptor, AnnotationId, components, ITransport, BaseUrl, ConnectionState, SemiontError, UserDID, EventMap, IContentTransport, PutBinaryRequest, PutBinaryOptions, ContentFormat as ContentFormat$1, AccessToken, Annotation, UserId, ResourceAnnotations, AnnotationCategory, GraphPath, GraphConnection } from '@semiont/core';
4
- export { AssembledAnnotation, applyBodyOperations, assembleAnnotation } from '@semiont/core';
3
+ import { GraphServiceConfig, VectorsServiceConfig, EmbeddingServiceConfig, EventBus, Logger, StoredEvent, ResourceId, ResourceDescriptor, AnnotationId, components, ITransport, BaseUrl, ConnectionState, SemiontError, UserDID, EventMap, IContentTransport, PutBinaryRequest, PutBinaryOptions, AccessToken, Annotation, UserId, ResourceAnnotations, AnnotationCategory, GraphPath, GraphConnection } from '@semiont/core';
5
4
  import { EventStore, ViewStorage } from '@semiont/event-sourcing';
6
5
  import { WorkingTreeStore } from '@semiont/content';
7
6
  import { GraphDatabase } from '@semiont/graph';
8
- import { VectorStore, EmbeddingProvider } from '@semiont/vectors';
7
+ import { VectorStore, EmbeddingProvider, ChunkingConfig } from '@semiont/vectors';
9
8
  import { InferenceClient } from '@semiont/inference';
10
9
  import { BehaviorSubject, Observable } from 'rxjs';
11
- import { StateUnit, WorkerBus } from '@semiont/sdk';
10
+ import { StateUnit, WorkerBus, BusRequestPrimitive } from '@semiont/sdk';
12
11
  import { Writable, Readable } from 'node:stream';
13
12
 
14
13
  /**
@@ -195,7 +194,7 @@ declare function createKnowledgeBase(eventStore: EventStore, project: SemiontPro
195
194
  *
196
195
  * The single write gateway to the Knowledge Base. Subscribes to command
197
196
  * events on the EventBus and translates them into domain events on the
198
- * EventStore + content writes to the RepresentationStore.
197
+ * EventStore + content operations on the WorkingTreeStore.
199
198
  *
200
199
  * From ARCHITECTURE.md:
201
200
  * The Knowledge Base has exactly three actor interfaces:
@@ -203,7 +202,8 @@ declare function createKnowledgeBase(eventStore: EventStore, project: SemiontPro
203
202
  * - Gatherer (read context)
204
203
  * - Matcher (read search)
205
204
  *
206
- * No other code should call eventStore.appendEvent() or repStore.store().
205
+ * No other code should call eventStore.appendEvent() or mutate the working tree
206
+ * through kb.content.
207
207
  *
208
208
  * Subscriptions:
209
209
  * - yield:create → resource.created (+ content store) → yield:created / yield:create-failed
@@ -348,6 +348,11 @@ declare class Matcher {
348
348
  /**
349
349
  * Search vectors for semantically similar resources.
350
350
  * Returns empty array if vectors or embedding provider are not configured.
351
+ *
352
+ * No entity-type filter: the annotation's entity types are a ranking signal
353
+ * (Jaccard + IDF in `contextDrivenSearch`), not an inclusion gate. Gating
354
+ * recall here would hide vector-similar but differently-tagged candidates
355
+ * from the scorer.
351
356
  */
352
357
  private searchVectors;
353
358
  stop(): Promise<void>;
@@ -430,12 +435,16 @@ declare class CloneTokenManager {
430
435
  * Nothing outside the KnowledgeSystem reads or writes the KnowledgeBase directly.
431
436
  *
432
437
  * - kb: the durable store (event log, views, content, graph)
433
- * - stower: write actor — weaves new knowledge in
434
- * - gatherer: read actor — traces threads to build context
435
- * - matcher: search actor — finds related threads
436
- * - browser: filesystem actor — lists project directories
438
+ * - stower: write actor — the single write gateway
439
+ * - browser: read actor — all KB queries plus directory listings
440
+ * - gatherer: context-assembly actor — builds GatheredContext from passage, graph, and vectors
441
+ * - matcher: search actor — context-driven candidate search and scoring
437
442
  * - cloneTokenManager: token actor — manages resource clone tokens
438
443
  *
444
+ * These are the five access actors. Two projection-pipeline actors complete
445
+ * the seven: the Graph Consumer (kb.graphConsumer, started by
446
+ * createKnowledgeBase) and the Smelter (standalone process via smelter-main).
447
+ *
439
448
  * EventBus, JobQueue, and workers are peers to KnowledgeSystem, not members.
440
449
  */
441
450
 
@@ -556,27 +565,34 @@ declare class LocalTransport implements ITransport {
556
565
  * resource-creation pipeline the HTTP `/resources` handler uses.
557
566
  */
558
567
 
568
+ type GetResourceResponse = components['schemas']['GetResourceResponse'];
559
569
  declare class LocalContentTransport implements IContentTransport {
560
570
  private readonly ks;
561
571
  constructor(ks: KnowledgeSystem);
562
572
  putBinary(_request: PutBinaryRequest, _options?: PutBinaryOptions): Promise<{
563
573
  resourceId: ResourceId;
564
574
  }>;
565
- getBinary(resourceId: ResourceId, options?: {
566
- accept?: ContentFormat$1 | string;
575
+ getBinary(resourceId: ResourceId, _options?: {
567
576
  auth?: AccessToken;
568
577
  }): Promise<{
569
578
  data: ArrayBuffer;
570
579
  contentType: string;
571
580
  }>;
572
- getBinaryStream(resourceId: ResourceId, options?: {
573
- accept?: ContentFormat$1 | string;
581
+ getBinaryStream(resourceId: ResourceId, _options?: {
574
582
  auth?: AccessToken;
575
583
  }): Promise<{
576
584
  stream: ReadableStream<Uint8Array>;
577
585
  contentType: string;
578
586
  }>;
579
587
  private loadBinary;
588
+ /**
589
+ * Assemble the resource's JSON-LD graph in-process from the KB — the local
590
+ * realization of `IContentTransport.getResourceGraph` (symmetric with
591
+ * getBinary; SIMPLER-JSON-LD.md decision 7). Local mode has no auth.
592
+ */
593
+ getResourceGraph(resourceId: ResourceId, _options?: {
594
+ auth?: AccessToken;
595
+ }): Promise<GetResourceResponse>;
580
596
  dispose(): void;
581
597
  }
582
598
 
@@ -704,11 +720,178 @@ interface SmelterActorStateUnitOptions {
704
720
  }
705
721
  interface SmelterActorStateUnit extends StateUnit {
706
722
  events$: Observable<SmelterEvent>;
707
- emit(channel: string, payload: Record<string, unknown>): Promise<void>;
708
723
  start(): void;
709
724
  }
710
725
  declare function createSmelterActorStateUnit(options: SmelterActorStateUnitOptions): SmelterActorStateUnit;
711
726
 
727
+ /**
728
+ * Smelter — event-to-vector pipeline for the standalone smelter worker.
729
+ *
730
+ * Consumes the smelter-relevant domain events surfaced by
731
+ * `SmelterActorStateUnit.events$`, reads resource content via the injected
732
+ * `IContentTransport` (HTTP verbatim mode in worker deployments — the
733
+ * stored bytes, untouched), chunks and embeds it via the configured
734
+ * EmbeddingProvider, and indexes vectors into the VectorStore (Qdrant).
735
+ * `smelter-main` is the container entry point that wires this up.
736
+ *
737
+ * ## Per-resource serialization
738
+ *
739
+ * Smelter processes events strictly in order per resourceId via
740
+ * `groupBy(resourceId) + concatMap(...)`. This is the stream-consumer
741
+ * flavor of per-resource serialization — the same invariant enforced by
742
+ * `GraphDBConsumer`, `Gatherer`, and (in a different shape) `ViewManager`.
743
+ * See `packages/core/src/serialize-per-key.ts` for the shared primitive
744
+ * used by RPC-style services.
745
+ *
746
+ * ## Batching
747
+ *
748
+ * `burstBuffer` collects event bursts per resource; consecutive same-type
749
+ * runs within a burst share a single `embedBatch()` call.
750
+ *
751
+ * ## Reconciliation
752
+ *
753
+ * Qdrant is an ephemeral projection of the event log. `reconcile()` brings
754
+ * it back in sync at startup — after a wiped volume, or after events missed
755
+ * while the worker was down. It is a planner: it diffs the store against the
756
+ * catalog (over the `browse:*` RPC channels) — both membership AND content
757
+ * freshness, via the checksum stamped onto every resource upsert — and
758
+ * enqueues `smelt:*` work items through the same mailbox as live events, so
759
+ * per-resource ordering holds across the two paths (axioms S1/S2/S11/S12 in
760
+ * `.plans/SMELTER-AXIOMS.md`).
761
+ */
762
+
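Note on the "Batching" paragraph above: one plausible reading of "consecutive same-type runs within a burst share a single `embedBatch()` call" is that a burst is first split into maximal runs of adjacent same-type events. The sketch below shows only that splitting step, under that assumption. `BurstItem` and `splitIntoRuns` are hypothetical names, not part of the package.

```typescript
// Hypothetical minimal event shape; the real SmelterEvent carries more fields.
interface BurstItem {
  type: string;
  resourceId: string;
}

// Split one burst into maximal runs of consecutive same-type items,
// preserving the original order within and across runs.
function splitIntoRuns<T extends { type: string }>(burst: T[]): T[][] {
  const runs: T[][] = [];
  for (const item of burst) {
    const last = runs.length > 0 ? runs[runs.length - 1] : undefined;
    if (last !== undefined && last[0].type === item.type) {
      last.push(item); // extends the current run
    } else {
      runs.push([item]); // type changed: start a new run
    }
  }
  return runs;
}

const exampleBurst: BurstItem[] = [
  { type: 'yield:created', resourceId: 'r1' },
  { type: 'yield:created', resourceId: 'r2' },
  { type: 'mark:added', resourceId: 'r1' },
  { type: 'yield:created', resourceId: 'r3' },
];

const exampleRuns = splitIntoRuns(exampleBurst);
```

Under this reading, the first two `yield:created` events would share one batch call, while the trailing `yield:created` forms its own run because the `mark:added` event separates it from the others.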
763
+ interface ReconcileSummary {
764
+ resourcesEmbedded: number;
765
+ resourceVectorsDeleted: number;
766
+ annotationsEmbedded: number;
767
+ annotationVectorsDeleted: number;
768
+ }
769
+ type ReconcileState = {
770
+ phase: 'pending';
771
+ } | {
772
+ phase: 'running';
773
+ } | {
774
+ phase: 'done';
775
+ summary: ReconcileSummary;
776
+ } | {
777
+ phase: 'failed';
778
+ error: string;
779
+ };
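Note on `ReconcileState`: because the type is a union discriminated on `phase`, a consumer can narrow it with a `switch` and have the compiler check exhaustiveness. The sketch below copies the two declarations above so it stands alone. `describeReconcile` and its message wording are hypothetical, for example for a health endpoint that reads `smelter.reconcileState`.

```typescript
// Copied from the declarations in this diff so the sketch is self-contained.
interface ReconcileSummary {
  resourcesEmbedded: number;
  resourceVectorsDeleted: number;
  annotationsEmbedded: number;
  annotationVectorsDeleted: number;
}

type ReconcileState =
  | { phase: 'pending' }
  | { phase: 'running' }
  | { phase: 'done'; summary: ReconcileSummary }
  | { phase: 'failed'; error: string };

// Hypothetical helper: turn the state into a one-line status string.
function describeReconcile(state: ReconcileState): string {
  switch (state.phase) {
    case 'pending':
      return 'reconcile not started';
    case 'running':
      return 'reconcile in progress';
    case 'done': {
      const s = state.summary; // only available on the 'done' variant
      const embedded = s.resourcesEmbedded + s.annotationsEmbedded;
      const deleted = s.resourceVectorsDeleted + s.annotationVectorsDeleted;
      return `reconcile done: ${embedded} embedded, ${deleted} deleted`;
    }
    case 'failed':
      return `reconcile failed: ${state.error}`;
    default: {
      // Compile-time exhaustiveness check: a new phase makes this assignment fail.
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```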
780
+ /**
781
+ * Burst-buffer timings for the event pipeline. Required — `smelter-main`
782
+ * passes production values (50/100/200); test harnesses pass ~1ms values so
783
+ * property suites run at generator speed. See `.plans/SMELTER-AXIOMS.md` (D4).
784
+ */
785
+ interface SmelterTiming {
786
+ burstWindowMs: number;
787
+ maxBatchSize: number;
788
+ idleTimeoutMs: number;
789
+ }
790
+ /**
791
+ * Reconcile-planner work items — enqueued through the same mailbox as wire
792
+ * events. Distinct `smelt:*` types make forged domain events unrepresentable
793
+ * (`.plans/SMELTER-AXIOMS.md`, D1); the shared shape lets the per-resource
794
+ * lanes and batch paths serve both kinds of input.
795
+ */
796
+ interface SmelterWorkItem {
797
+ type: 'smelt:embed' | 'smelt:purge' | 'smelt:embed-annotation' | 'smelt:purge-annotation';
798
+ resourceId: string;
799
+ payload: Record<string, unknown>;
800
+ }
801
+ type SmelterInput = SmelterEvent | SmelterWorkItem;
802
+ declare class Smelter {
803
+ private events$;
804
+ private vectorStore;
805
+ private embeddingProvider;
806
+ private content;
807
+ private bus;
808
+ private chunkingConfig;
809
+ private timing;
810
+ private logger;
811
+ private static readonly RECONCILE_PAGE_SIZE;
812
+ /** Bound on concurrently in-flight reconcile work — a cold rebuild must not fan out unbounded embedding calls. */
813
+ private static readonly RECONCILE_WAVE;
814
+ private eventSubject;
815
+ private sourceSubscription;
816
+ private pipelineSubscription;
817
+ private _eventsProcessed;
818
+ private _reconcileState;
819
+ private workDone;
820
+ private workWaiter;
821
+ constructor(events$: Observable<SmelterEvent>, vectorStore: VectorStore, embeddingProvider: EmbeddingProvider, content: IContentTransport, bus: BusRequestPrimitive, chunkingConfig: ChunkingConfig, timing: SmelterTiming, logger: Logger);
822
+ get eventsProcessed(): number;
823
+ get reconcileState(): ReconcileState;
824
+ initialize(): void;
825
+ stop(): void;
826
+ private noteWorkDone;
827
+ /**
828
+ * Returns the number of WIRE events processed without error (the S9b
829
+ * oracle) — `smelt:*` work-item runs tick the drain counter instead.
830
+ */
831
+ private processBatch;
832
+ /**
833
+ * Batch-optimized processing for consecutive events of the same type.
834
+ * Returns the number of events processed without error.
835
+ */
836
+ private applyBatchByType;
837
+ /** Returns true if the input was processed without error. */
838
+ private safeProcessEvent;
839
+ private processEvent;
840
+ private handleResourcePurge;
841
+ /**
842
+ * Resolve a resource's embeddable text: bytes via the content transport,
843
+ * gated to media types that decode as text, decoded charset-aware. The
844
+ * checksum is over the raw bytes actually read — stamped onto the vectors
845
+ * so reconciliation can compare against the catalog's claim (S12). Returns
846
+ * null (logged) when the resource doesn't decode as text, is unavailable,
847
+ * or is empty — callers skip it.
848
+ */
849
+ private fetchEmbeddableText;
850
+ private embedResource;
851
+ private handleResourceArchived;
852
+ private handleAnnotationAdded;
853
+ private handleAnnotationRemoved;
854
+ /**
855
+ * Batch-embed chunks from multiple yield:created events in a single
856
+ * embedBatch() call, then index per resource.
857
+ */
858
+ private batchResourceCreated;
859
+ /**
860
+ * Batch-embed exact texts from multiple mark:added events in a single
861
+ * embedBatch() call, then index per annotation.
862
+ */
863
+ private batchAnnotationAdded;
864
+ /**
865
+ * Reconcile the vector store against the KS catalog.
866
+ *
867
+ * Lists what IS indexed (via the store's id enumeration) and what SHOULD
868
+ * be (non-archived resources with embeddable media types, plus their
869
+ * exact-text annotations, via the `browse:*` RPC channels), then plans the
870
+ * diff as `smelt:*` work items — embeds for what's missing, purges for
871
+ * what shouldn't be there — and drains them through the pipeline mailbox.
872
+ * Work items share the per-resource lanes with live events, so a reconcile
873
+ * re-embed can never interleave with (or stale-overwrite) live processing
874
+ * of the same resource (axioms S1/S2). Waves of RECONCILE_WAVE bound how
875
+ * many embedding calls a cold rebuild has in flight.
876
+ *
877
+ * Call after the live subscription is attached so nothing falls in the
878
+ * gap. The index snapshot is taken BEFORE the catalog listing so a
879
+ * resource indexed by a live event mid-reconcile is never mistaken for an
880
+ * orphan; convergence holds because every upsert replaces a resource's
881
+ * full vector set from current content.
882
+ */
883
+ reconcile(): Promise<ReconcileSummary>;
884
+ /**
885
+ * Enqueue planner work through the mailbox in bounded waves and await
886
+ * completion. The pipeline ticks `noteWorkDone` for every consumed work
887
+ * item (success or failure — failures are logged like any live event), so
+ * each wave's waiter resolves exactly when its items have been processed.
+ */
+ private drain;
+ /** Page through `browse:resources-requested` until the catalog is exhausted. */
+ private listAllResources;
+ }
+
  /**
  * Exchange Format Manifest Types
  *
@@ -755,9 +938,9 @@ declare function validateManifestVersion(version: number): void;
  *
  * Produces a lossless tar.gz archive of the system of record:
  * - Event log (all streams, JSONL format)
- * - Content store (content-addressed blobs)
+ * - Working-tree content (archived as checksum-named blobs)
  *
- * Reads events via EventStore and content via RepresentationStore.
+ * Reads events via EventStore and content via WorkingTreeStore.
  * The archive can restore a complete knowledge base.
  */
 
@@ -935,7 +1118,7 @@ declare function importLinkedData(archive: Readable, options: LinkedDataImporter
  * Business logic for resource operations. All writes go through the EventBus
  * — the Stower actor subscribes and handles persistence.
  *
- * For create: emits yield:create, awaits yield:created / yield:create-failed.
+ * For create: emits yield:create, awaits yield:create-ok / yield:create-failed.
  */
 
  type ContentFormat = components['schemas']['ContentFormat'];
@@ -1237,8 +1420,5 @@ declare function generateResourceSummary(resourceName: string, content: string,
  */
  declare function generateReferenceSuggestions(referenceTitle: string, client: InferenceClient, entityType?: string, currentContent?: string): Promise<string[] | null>;
 
- declare const PACKAGE_NAME = "@semiont/make-meaning";
- declare const VERSION = "0.1.0";
-
- export { AnnotationContext, AnnotationOperations, BACKUP_FORMAT, Browser, CloneTokenManager, FORMAT_VERSION, Gatherer$1 as Gatherer, GraphContext, LLMContext, LocalContentTransport, LocalTransport, Matcher, PACKAGE_NAME, ResourceContext, ResourceOperations, Stower, VERSION, bootstrapEntityTypes, createKnowledgeBase, createSmelterActorStateUnit, exportBackup, exportLinkedData, generateReferenceSuggestions, generateResourceSummary, importBackup, importLinkedData, isBackupManifest, readEntityTypesProjection, registerAnnotationAssemblyHandler, registerAnnotationLookupHandlers, registerBindUpdateBodyHandler, registerBusHandlers, registerJobCommandHandlers, startMakeMeaning, stopKnowledgeSystem, validateManifestVersion };
- export type { BackupContentReader, BackupEventStoreReader, BackupExporterOptions, BackupImportResult, BackupImporterOptions, BackupManifestHeader, BackupStreamSummary, BuildContextOptions, ContentBlobResolver, CreateAnnotationResult, CreateResourceInput, CreateResourceResult, GraphEdge, GraphNode, GraphRepresentation, KnowledgeBase, KnowledgeSystem, LLMContextOptions, LinkedDataContentReader, LinkedDataExporterOptions, LinkedDataImportResult, LinkedDataImporterOptions, LinkedDataViewReader, ListResourcesFilters, LocalTransportConfig, MakeMeaningConfig, MakeMeaningService, ReplayStats, SmelterActorStateUnit, SmelterActorStateUnitOptions, SmelterEvent, UpdateAnnotationBodyResult };
+ export { AnnotationContext, AnnotationOperations, BACKUP_FORMAT, Browser, CloneTokenManager, FORMAT_VERSION, Gatherer$1 as Gatherer, GraphContext, LLMContext, LocalContentTransport, LocalTransport, Matcher, ResourceContext, ResourceOperations, Smelter, Stower, bootstrapEntityTypes, createKnowledgeBase, createSmelterActorStateUnit, exportBackup, exportLinkedData, generateReferenceSuggestions, generateResourceSummary, importBackup, importLinkedData, isBackupManifest, readEntityTypesProjection, registerAnnotationAssemblyHandler, registerAnnotationLookupHandlers, registerBindUpdateBodyHandler, registerBusHandlers, registerJobCommandHandlers, startMakeMeaning, stopKnowledgeSystem, validateManifestVersion };
+ export type { BackupContentReader, BackupEventStoreReader, BackupExporterOptions, BackupImportResult, BackupImporterOptions, BackupManifestHeader, BackupStreamSummary, BuildContextOptions, ContentBlobResolver, CreateAnnotationResult, CreateResourceInput, CreateResourceResult, GraphEdge, GraphNode, GraphRepresentation, KnowledgeBase, KnowledgeSystem, LLMContextOptions, LinkedDataContentReader, LinkedDataExporterOptions, LinkedDataImportResult, LinkedDataImporterOptions, LinkedDataViewReader, ListResourcesFilters, LocalTransportConfig, MakeMeaningConfig, MakeMeaningService, ReconcileState, ReconcileSummary, ReplayStats, SmelterActorStateUnit, SmelterActorStateUnitOptions, SmelterEvent, SmelterInput, SmelterTiming, SmelterWorkItem, UpdateAnnotationBodyResult };
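
The `reconcile()` doc comment in the `Smelter` declaration above describes a plan-then-drain flow: compare what is indexed with what should be indexed, emit embed and purge work items for the difference, and process them in bounded waves. The sketch below illustrates that idea only. It is not the package's implementation: `WorkItem`, `planReconcile`, `toWaves`, and `drainInWaves` are hypothetical names, and the real `SmelterWorkItem` shape, the mailbox, and the per-resource lanes are not visible in this diff.

```typescript
// Hypothetical work-item shape; the real SmelterWorkItem type is not shown in this diff.
type WorkItem =
  | { kind: "embed"; resourceId: string }
  | { kind: "purge"; resourceId: string };

// Plan the diff: embed what should be indexed but is not, purge what is
// indexed but should not be. Following the doc comment, the caller is
// assumed to have taken the `indexed` snapshot BEFORE listing `expected`.
function planReconcile(indexed: string[], expected: string[]): WorkItem[] {
  const indexedSet = new Set(indexed);
  const expectedSet = new Set(expected);
  const items: WorkItem[] = [];
  for (const id of expected) {
    if (!indexedSet.has(id)) items.push({ kind: "embed", resourceId: id });
  }
  for (const id of indexed) {
    if (!expectedSet.has(id)) items.push({ kind: "purge", resourceId: id });
  }
  return items;
}

// Split work into waves of at most `waveSize` items.
function toWaves<T>(items: T[], waveSize: number): T[][] {
  if (waveSize < 1) throw new RangeError("waveSize must be at least 1");
  const waves: T[][] = [];
  for (let i = 0; i < items.length; i += waveSize) {
    waves.push(items.slice(i, i + waveSize));
  }
  return waves;
}

// Process one wave at a time so at most `waveSize` handlers are in flight.
// A failed item is counted, not rethrown, so the wave still completes.
async function drainInWaves<T>(
  items: T[],
  waveSize: number,
  handle: (item: T) => Promise<void>,
): Promise<{ done: number; failed: number }> {
  let done = 0;
  let failed = 0;
  for (const wave of toWaves(items, waveSize)) {
    await Promise.all(
      wave.map(async (item) => {
        try {
          await handle(item);
          done += 1;
        } catch {
          failed += 1;
        }
      }),
    );
  }
  return { done, failed };
}
```

Counting failures instead of rethrowing mirrors the comment on `drain`, which says a wave resolves once every item has been consumed, whether it succeeded or failed.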