@semiont/core 0.5.6 → 0.5.7

package/README.md CHANGED
@@ -6,16 +6,15 @@
  [![npm downloads](https://img.shields.io/npm/dm/@semiont/core.svg)](https://www.npmjs.com/package/@semiont/core)
  [![License](https://img.shields.io/npm/l/@semiont/core.svg)](https://github.com/The-AI-Alliance/semiont/blob/main/LICENSE)
 
- Core types and domain logic for the Semiont semantic knowledge platform. This package is the **source of truth for OpenAPI types** and provides backend utilities for event sourcing, URIs, DID generation, and the EventBus.
+ Core types and domain logic for the Semiont semantic knowledge platform. This package is the **source of truth for OpenAPI types** and provides the shared domain layer: event-sourcing types, the EventBus, the transport contract, W3C Web Annotation utilities, anchoring, DIDs, and configuration loading.
 
- > **Architecture Note**: This package generates TypeScript types from the OpenAPI specification. `@semiont/api-client` re-exports these types and provides HTTP client functionality.
+ > **Architecture Note**: This package generates TypeScript types from the OpenAPI specification. Every other package in the monorepo imports them from here.
 
  ## Who Should Use This
 
  - ✅ **Backend** (`apps/backend`) - Server implementation, imports types from core
- - ✅ **Packages** - Other monorepo packages that need OpenAPI types or EventBus
- - ✅ **Internal Utilities** - Type generation, validation, domain logic
- - ✅ **Frontend / Browser** - Types and pure utilities (main barrel is browser-safe)
+ - ✅ **Packages** - Other monorepo packages that need OpenAPI types, the EventBus, or the transport contract
+ - ✅ **Frontend / Browser** - Types and pure utilities (the main barrel is browser-safe)
 
  ## Who Should Use `@semiont/core/node` Instead
 
@@ -30,14 +29,11 @@ import { SemiontProject, loadEnvironmentConfig } from '@semiont/core/node';
 
  **Rule**: If your code runs in a browser or edge runtime, use `@semiont/core`. If it runs in Node.js and needs filesystem access, use `@semiont/core/node`.
 
- ## Who Should Use `@semiont/api-client` Instead
+ ## Who Should Use `@semiont/sdk` Instead
 
- - **External Applications** - For HTTP client + utilities
- - **Frontend** (`apps/frontend`, `packages/react-ui`) - For API communication and W3C utilities
- - **Demo Scripts** - For higher-level API access
- - **MCP Servers** - For client-side annotation utilities
+ Application code talking to a Semiont backend should use [`@semiont/sdk`](../sdk/), which provides `SemiontClient`, the verb-oriented namespaces, and the session layer. The SDK consumes the `ITransport` / `IContentTransport` contracts defined here; the HTTP implementations of those contracts live in [`@semiont/http-transport`](../http-transport/) and are re-exported by the SDK for convenience.
 
- **Rule of thumb**: If you need to make HTTP requests or work with W3C selectors, use `@semiont/api-client`. If you only need types and domain logic, use `@semiont/core`.
+ **Rule of thumb**: If you are making API calls, use `@semiont/sdk`. If you only need types and domain logic, use `@semiont/core`. Import from `@semiont/http-transport` directly only when constructing a transport stack by hand.
 
  ## Installation
 
@@ -84,114 +80,78 @@ const token = accessToken('eyJhbGc...');
  const eType = entityType('Person');
  ```
 
+ Branded ID types (`ResourceId`, `AnnotationId`, `UserId`) with factories and guards (`resourceId`, `annotationId`, `userId`, `isResourceId`, `isAnnotationId`) live alongside the URI brands.
+
  ### Event Sourcing Types
 
- Event types for the event-sourced architecture:
+ The persisted event catalog — every event type written to the JSONL event log, discriminated on `type` and namespaced by concern (`yield:*` resource lifecycle, `mark:*` annotations and tags, `frame:*` schema registration, `job:*` job lifecycle):
 
  ```typescript
  import type {
- ResourceEvent,
- ResourceCreatedEvent,
- ResourceArchivedEvent,
- DocumentUnarchivedEvent,
- AnnotationAddedEvent,
- AnnotationRemovedEvent,
- AnnotationBodyUpdatedEvent,
- EntityTagAddedEvent,
- EntityTagRemovedEvent,
+ PersistedEvent,
+ PersistedEventType,
+ EventOfType,
+ EventInput,
  StoredEvent,
  EventMetadata,
- DocumentAnnotations,
  BodyOperation,
+ ResourceAnnotations,
  } from '@semiont/core';
- ```
-
- ### DID Utilities
-
- Generate W3C Decentralized Identifiers for annotations:
+ import { PERSISTED_EVENT_TYPES } from '@semiont/core';
 
- ```typescript
- import { userToDid, userToAgent, didToAgent } from '@semiont/core';
-
- // Convert user to DID:WEB
- const did = userToDid(user);
- // => 'did:web:localhost%3A4000:users:user-id'
-
- // Convert user to W3C Agent
- const agent = userToAgent(user);
- // => { id: 'did:web:...', type: 'Person', name: 'User Name' }
+ function handle(event: PersistedEvent) {
+ if (event.type === 'mark:added') {
+ // payload is narrowed to the AnnotationAdded payload
+ }
+ }
  ```
 
- ### Cryptographic Utilities
+ `PERSISTED_EVENT_TYPES` is the runtime list of every persisted event type, with a compile-time exhaustiveness check against the catalog.
 
- Content-addressing and checksums:
+ ### EventBus
 
- ```typescript
- import {
- generateId,
- generateToken,
- generateUuid,
- calculateChecksum,
- verifyChecksum,
- } from '@semiont/core';
-
- // Generate unique IDs
- const id = generateId();
- const token = generateToken();
- const uuid = generateUuid();
+ The RxJS-based event bus shared by backend and clients, with a typed channel protocol:
 
- // Content checksums for verification
- const checksum = calculateChecksum(content);
- const isValid = verifyChecksum(content, checksum);
+ ```typescript
+ import { EventBus, ScopedEventBus, burstBuffer, serializePerKey } from '@semiont/core';
+ import type { EventMap, EventName } from '@semiont/core';
  ```
 
- ### Type Guards
+ - **`EventBus` / `ScopedEventBus`** — framework-agnostic pub/sub over the unified `EventMap`
+ - **`CHANNEL_SCHEMAS`** — maps each channel to its OpenAPI payload schema
+ - **`burstBuffer`** — RxJS operator for coalescing event bursts
+ - **`serializePerKey`** — per-key serialization for RPC-style callers
+ - **`busLog` / `setBusLogTraceIdProvider`** — cross-wire bus observability
 
- Runtime type checking:
+ ### Transport Contract
 
- ```typescript
- import {
- isString,
- isNumber,
- isBoolean,
- isObject,
- isArray,
- isNonEmptyArray,
- isDefined,
- } from '@semiont/core';
+ The interfaces every concrete transport must satisfy, plus the channel set transports bridge into a client's bus:
 
- if (isNonEmptyArray(value)) {
- // TypeScript knows value is T[] with length > 0
- }
+ ```typescript
+ import type { ITransport, IContentTransport, IBackendOperations, ConnectionState } from '@semiont/core';
+ import { BRIDGED_CHANNELS } from '@semiont/core';
  ```
 
- ### Error Classes
+ `@semiont/http-transport` implements these over HTTP + SSE; `LocalTransport` in `@semiont/make-meaning` implements them in-process.
+
+ ### W3C Web Annotation Utilities
 
- Backend error types:
+ Pure functions for building and reading W3C Annotations:
 
  ```typescript
  import {
- SemiontError,
- APIError,
- NotFoundError,
- ConflictError,
- ValidationError,
- UnauthorizedError,
+ assembleAnnotation,
+ applyBodyOperations,
+ getBodySource,
+ getTargetSelector,
+ getExactText,
+ isHighlight,
+ isReference,
+ isComment,
  } from '@semiont/core';
-
- throw new NotFoundError('Document not found');
- throw new ValidationError('Invalid annotation format');
  ```
 
- ### HTTP Client Utilities
-
- Backend HTTP utilities (internal use):
-
- ```typescript
- import { fetchAPI, createFetchAPI } from '@semiont/core';
-
- const response = await fetchAPI(url, { method: 'POST', body: data });
- ```
+ Selector helpers cover text position, text quote, SVG, and PDF-viewrect fragment selectors (`getTextPositionSelector`, `getSvgSelector`, `createFragmentSelector`, `parseSvgSelector`, …).
 
  ### Annotation body matcher
 
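Reviewer note on the hunk above: the new README describes a catalog that is discriminated on `type`, plus a runtime list with a compile-time exhaustiveness check. The diff does not show how the package implements this, so the following is only a minimal sketch of the general TypeScript pattern. Every name here (`SketchEvent`, `SKETCH_EVENT_TYPES`, `summarize`) and every payload shape is hypothetical, and apart from `mark:added` the event names are invented for illustration.

```typescript
// Hypothetical stand-ins for the catalog; the real payload shapes are not shown in this diff.
type MarkAdded = { type: 'mark:added'; payload: { annotationId: string } };
type MarkRemoved = { type: 'mark:removed'; payload: { annotationId: string } };
type YieldCreated = { type: 'yield:created'; payload: { resourceId: string } };

type SketchEvent = MarkAdded | MarkRemoved | YieldCreated;
type SketchEventType = SketchEvent['type'];

// Select the union member whose `type` matches K.
type SketchEventOfType<K extends SketchEventType> = Extract<SketchEvent, { type: K }>;

// Runtime list of event types, kept as a readonly tuple of literals.
const SKETCH_EVENT_TYPES = ['mark:added', 'mark:removed', 'yield:created'] as const;

// Compile-time check: `Missing` is `never` only when the list names every union member.
// If a member is left out, the annotation below becomes `false` and the assignment fails to compile.
type Missing = Exclude<SketchEventType, (typeof SKETCH_EVENT_TYPES)[number]>;
const allCovered: [Missing] extends [never] ? true : false = true;

function summarize(event: SketchEvent): string {
  switch (event.type) {
    case 'mark:added':
      // event is narrowed to MarkAdded here
      return `annotation ${event.payload.annotationId} added`;
    case 'mark:removed':
      return `annotation ${event.payload.annotationId} removed`;
    case 'yield:created':
      return `resource ${event.payload.resourceId} created`;
    default: {
      // Fails to compile if a new union member is not handled above.
      const unreachable: never = event;
      return unreachable;
    }
  }
}

// Extract-based lookup, in the spirit of the `EventOfType` export.
const added: SketchEventOfType<'mark:added'> = {
  type: 'mark:added',
  payload: { annotationId: 'a1' },
};
```

Calling `summarize(added)` takes the first branch. The point of the pattern is that adding a fourth event type forces an update to both the runtime list and the `switch` before the code compiles again.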
@@ -223,75 +183,99 @@ const linkingIdx = findBodyItem(annotation.body, {
  `purpose` is optional in the identity. Omit it to match on identity alone;
  provide it when the caller knows which purpose to target.
 
- ### Backend Internal Types
+ ### Anchoring
 
- Types not in the OpenAPI spec:
+ Re-anchor annotations after content edits — fuzzy text matching plus a render-time strategy that combines position and quote selectors with confidence scoring:
 
  ```typescript
- import type {
- UpdateDocumentInput,
- ResourceFilter,
- CreateAnnotationInternal,
- AnnotationCategory,
- GoogleAuthRequest,
- GraphConnection,
- GraphPath,
- EntityTypeStats,
+ import {
+ anchorAnnotation,
+ normalizeText,
+ buildContentCache,
+ findBestTextMatch,
  } from '@semiont/core';
  ```
 
- ### Constants
+ ### DID Utilities
 
- Backend-specific constants:
+ Generate and parse W3C Decentralized Identifiers for humans and software peers:
 
  ```typescript
- import { CREATION_METHODS } from '@semiont/core';
+ import { userToDid, userToAgent, agentToDid, softwareToAgent, didToAgent } from '@semiont/core';
 
- CREATION_METHODS.API // 'api'
- CREATION_METHODS.PASTE // 'paste'
- CREATION_METHODS.FILE_UPLOAD // 'file-upload'
- CREATION_METHODS.REFERENCE // 'reference'
- CREATION_METHODS.IMPORT // 'import'
- ```
+ userToDid({ email: 'alice@example.com', domain: 'example.com' });
+ // => 'did:web:example.com:users:alice%40example.com'
 
- ## What's NOT Included
+ userToAgent({ id: 'u1', domain: 'example.com', name: 'Alice', email: 'alice@example.com' });
+ // => { '@type': 'Person', '@id': 'did:web:example.com:users:alice%40example.com', name: 'Alice' }
 
- The following utilities have been **moved to @semiont/api-client** (as of 2025-10-24):
+ didToAgent('did:web:example.com:agents:ollama:gemma2%3A27b');
+ // => { '@type': 'Software', '@id': ..., name: 'ollama gemma2:27b', provider: 'ollama', model: 'gemma2:27b' }
+ ```
 
- ### Selector Utilities
+ ### Error Classes
 
- **Use `@semiont/api-client` instead:**
+ In-process error types, sharing the `TransportErrorCode` vocabulary with the transport-specific classes (`APIError` lives in `@semiont/http-transport`):
 
  ```typescript
- // OLD (removed from @semiont/core):
- import { getExactText, getTextPositionSelector } from '@semiont/core';
+ import {
+ SemiontError,
+ ValidationError,
+ ScriptError,
+ NotFoundError,
+ UnauthorizedError,
+ ConflictError,
+ } from '@semiont/core';
 
- // NEW (use @semiont/api-client):
- import { getExactText, getTextPositionSelector } from '@semiont/api-client';
+ throw new NotFoundError('Resource not found');
  ```
 
- ### Locale Utilities
-
- **Use `@semiont/api-client` instead:**
+ ### Type Guards & Validation
 
  ```typescript
- // OLD (removed from @semiont/core):
- import { LOCALES, formatLocaleDisplay, getLocaleInfo } from '@semiont/core';
+ import { isString, isObject, isArray, isDefined, validateData, isValidEmail } from '@semiont/core';
 
- // NEW (use @semiont/api-client):
- import { LOCALES, formatLocaleDisplay, getLocaleInfo } from '@semiont/api-client';
+ if (isDefined(value)) {
+ // TypeScript knows value is T, not T | null | undefined
+ }
  ```
 
- ### Annotation Utilities (Public API)
+ ### Resource & Misc Utilities
+
+ - **ResourceDescriptor accessors** — `getResourceId`, `getPrimaryRepresentation`, `getChecksum`, `isArchived`, `decodeRepresentation`, …
+ - **Locales** — `LOCALES`, `getLocaleInfo`, `formatLocaleDisplay`, …
+ - **Media types** — `MEDIA_TYPES` capability registry keyed by the spec's `SupportedMediaType` enum (render / anchoring / text-extraction / authorable per type), with `capabilitiesOf`, `textExtractionOf`, `mediaTypeForExtension`, `baseMediaType`, …
+ - **Text encoding** — `extractCharset`, `decodeWithCharset`
+ - **Text context** — `extractContext`, `reconcileSelector`
+ - **SVG** — `createRectangleSvg`, `parseSvgSelector`, `scaleSvgToNative`, …
+ - **IDs** — `generateUuid`
 
- **Use `@semiont/api-client` instead:**
+ ### Configuration
+
+ Schema-generated configuration types plus loaders:
 
  ```typescript
- // OLD (removed from @semiont/core):
- import { compareAnnotationIds, getEntityTypes, getBodySource } from '@semiont/core';
+ import { loadTomlConfig, parseEnvironment, ConfigurationError } from '@semiont/core';
+ import type { SemiontConfig, EnvironmentConfig, ServicesConfig } from '@semiont/core';
+ ```
+
+ Filesystem-backed loading (`SemiontProject`, `loadEnvironmentConfig`) is in `@semiont/core/node` — see above.
+
+ ### Backend Internal Types
 
- // NEW (use @semiont/api-client):
- import { compareAnnotationIds, getEntityTypes, getBodySource } from '@semiont/api-client';
+ Types not in the OpenAPI spec:
+
+ ```typescript
+ import type {
+ UpdateResourceInput,
+ ResourceFilter,
+ CreateAnnotationInternal,
+ AnnotationCategory,
+ GoogleAuthRequest,
+ GraphConnection,
+ GraphPath,
+ EntityTypeStats,
+ } from '@semiont/core';
  ```
 
  ## Architecture: Spec-First
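
Reviewer note on the DID examples in the hunk above: the sample outputs (`alice%40example.com`, `gemma2%3A27b`, and `localhost%3A4000` in the removed 0.5.6 text) are consistent with percent-encoding each segment, since `did:web` uses `:` as its separator. The package source for `userToDid` and `didToAgent` is not part of this diff, so the sketch below is an assumption about the encoding only. `sketchUserDid` and `sketchParseAgentDid` are hypothetical names, not exports of `@semiont/core`.

```typescript
// Hypothetical helper: builds a did:web identifier in the shape the README examples show.
function sketchUserDid(user: { email: string; domain: string }): string {
  // ':' separates did:web segments, so each segment is percent-encoded on its own.
  const domain = encodeURIComponent(user.domain);
  const email = encodeURIComponent(user.email);
  return `did:web:${domain}:users:${email}`;
}

// Hypothetical helper: reads provider and model back out of an agent DID of the form
// did:web:<domain>:agents:<provider>:<model>. Returns null for any other shape.
function sketchParseAgentDid(did: string): { provider: string; model: string } | null {
  const [scheme, method, domain, kind, provider, model, ...rest] = did.split(':');
  if (scheme !== 'did' || method !== 'web' || kind !== 'agents') return null;
  if (domain === undefined || provider === undefined || model === undefined) return null;
  if (rest.length > 0) return null;
  return {
    provider: decodeURIComponent(provider),
    model: decodeURIComponent(model),
  };
}

const userDid = sketchUserDid({ email: 'alice@example.com', domain: 'example.com' });
const agent = sketchParseAgentDid('did:web:example.com:agents:ollama:gemma2%3A27b');
```

Under these assumptions `userDid` is `'did:web:example.com:users:alice%40example.com'` and `agent` carries `provider: 'ollama'` and `model: 'gemma2:27b'`, matching the values in the README examples. A domain with a port, such as `localhost:4000`, would encode to `localhost%3A4000`.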
@@ -299,15 +283,10 @@ import { compareAnnotationIds, getEntityTypes, getBodySource } from '@semiont/ap
  Semiont follows a **spec-first architecture**:
 
  1. **OpenAPI Specification** ([specs/src/](../../specs/src/)) is the source of truth
- 2. **@semiont/core** generates types from OpenAPI and provides utilities
- 3. **@semiont/api-client** re-exports types from core and provides HTTP client
-
- **Principle**:
- - OpenAPI types & domain utilities → `@semiont/core` (source of truth)
- - HTTP client & convenience re-exports → `@semiont/api-client`
- - Backend internal implementation → imports from `@semiont/core`
+ 2. **@semiont/core** generates types from OpenAPI and provides domain utilities
+ 3. Every other package imports types from `@semiont/core`; application code talks to the backend through `@semiont/sdk`, whose transports implement core's `ITransport` contract
 
- **Type Yield Flow**: OpenAPI spec → `@semiont/core/src/types.ts` (via `openapi-typescript`) → re-exported by `@semiont/api-client` for convenience. This ensures no circular dependencies and clear build order.
+ **Type Yield Flow**: OpenAPI spec → `@semiont/core/src/types.ts` (via `openapi-typescript`) → imported across the monorepo. This ensures no circular dependencies and clear build order.
 
  ## Development
 
@@ -328,7 +307,8 @@ Apache-2.0
 
  ## Related Packages
 
- - [`@semiont/api-client`](../api-client/) - Primary TypeScript SDK (use this for most cases)
+ - [`@semiont/sdk`](../sdk/) - The Semiont SDK (`SemiontClient`) - use this for application development
+ - [`@semiont/http-transport`](../http-transport/) - HTTP implementations of core's transport contract
  - [`@semiont/backend`](../../apps/backend/) - Backend API server
  - [`@semiont/frontend`](../../apps/frontend/) - Web application
 
@@ -548,14 +548,12 @@ interface EnvironmentConfig {
  * XDG environment variables are read here and nowhere else.
  *
  * Durable paths (inside the project root, committed or repo-local):
- * eventsDir — .semiont/events/ (system of record, committed)
- * representationsDir — representations/ (content store, committed)
+ * eventsDir — .semiont/events/ (system of record, committed)
  *
  * Ephemeral paths (outside the project root, never committed):
  * configDir — $XDG_CONFIG_HOME/semiont/{name}/ (generated config for managed processes)
  * dataHome — $XDG_DATA_HOME/semiont/{name}/ (persistent user data, e.g. database files)
  * stateDir — $XDG_STATE_HOME/semiont/{name}/
- * embeddingsDir — stateDir/embeddings/
  * projectionsDir — stateDir/projections/
  * jobsDir — stateDir/jobs/
  * backendLogsDir — stateDir/backend/
@@ -575,11 +573,9 @@ declare class SemiontProject {
  * working-tree and event-log changes in the git index automatically. */
  readonly gitSync: boolean;
  readonly eventsDir: string;
- readonly representationsDir: string;
  readonly configDir: string;
  readonly dataHome: string;
  readonly stateDir: string;
- readonly embeddingsDir: string;
  readonly projectionsDir: string;
  readonly jobsDir: string;
  readonly backendLogsDir: string;
@@ -289,14 +289,12 @@ var SemiontProject = class _SemiontProject {
  gitSync;
  // Durable
  eventsDir;
- representationsDir;
  // Ephemeral — config (generated config files for managed processes)
  configDir;
  // Ephemeral — data (persistent user data managed by semiont)
  dataHome;
  // Ephemeral — state
  stateDir;
- embeddingsDir;
  projectionsDir;
  jobsDir;
  backendLogsDir;
@@ -319,14 +317,12 @@ name = "${name}"
  this.name = _SemiontProject.readName(projectRoot);
  this.gitSync = _SemiontProject.readGitSync(projectRoot);
  this.eventsDir = path.join(projectRoot, ".semiont", "events");
- this.representationsDir = path.join(projectRoot, "representations");
  const xdgConfig = process.env.XDG_CONFIG_HOME || path.join(os.homedir(), ".config");
  this.configDir = path.join(xdgConfig, "semiont", this.name);
  const xdgData = process.env.XDG_DATA_HOME || path.join(os.homedir(), ".local", "share");
  this.dataHome = path.join(xdgData, "semiont", this.name);
  const xdgState = process.env.XDG_STATE_HOME || path.join(os.homedir(), ".local", "state");
  this.stateDir = path.join(xdgState, "semiont", this.name);
- this.embeddingsDir = path.join(this.stateDir, "embeddings");
  this.projectionsDir = path.join(this.stateDir, "projections");
  this.jobsDir = path.join(this.stateDir, "jobs");
  this.backendLogsDir = path.join(this.stateDir, "backend");
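
Reviewer note on the constructor hunk above: after 0.5.7 the ephemeral directories that remain all derive from three XDG base directories, each with a home-relative fallback. The sketch below restates that resolution as a standalone function so the fallback behaviour can be checked in isolation. It mirrors only the lines visible in this diff. `resolveSketchPaths` and `SketchPaths` are hypothetical names, and the injected `env` and `home` parameters are an assumption added for testability (the packaged code reads `process.env` and `os.homedir()` directly).

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical shape covering only the ephemeral paths visible in the 0.5.7 hunk.
interface SketchPaths {
  configDir: string;
  dataHome: string;
  stateDir: string;
  projectionsDir: string;
  jobsDir: string;
  backendLogsDir: string;
}

// Sketch of the XDG resolution: use the environment variable when it is set and
// non-empty, otherwise fall back to the conventional location under the home directory.
function resolveSketchPaths(
  name: string,
  env: Record<string, string | undefined> = process.env,
  home: string = os.homedir(),
): SketchPaths {
  const xdgConfig = env['XDG_CONFIG_HOME'] || path.join(home, '.config');
  const xdgData = env['XDG_DATA_HOME'] || path.join(home, '.local', 'share');
  const xdgState = env['XDG_STATE_HOME'] || path.join(home, '.local', 'state');

  const stateDir = path.join(xdgState, 'semiont', name);
  return {
    configDir: path.join(xdgConfig, 'semiont', name),
    dataHome: path.join(xdgData, 'semiont', name),
    stateDir,
    projectionsDir: path.join(stateDir, 'projections'),
    jobsDir: path.join(stateDir, 'jobs'),
    backendLogsDir: path.join(stateDir, 'backend'),
  };
}

// Only XDG_STATE_HOME is overridden here; config and data fall back to the home directory.
const sketchPaths = resolveSketchPaths('demo', { XDG_STATE_HOME: '/tmp/state' }, '/home/u');
```

Because `embeddingsDir` and `representationsDir` are removed in this release, any caller that still reads those properties from a `SemiontProject` instance will get `undefined` at runtime and a type error against the updated declaration.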