npm - @semiont/make-meaning - Versions diffs - 0.2.43 → 0.2.46 - Mend

@semiont/make-meaning 0.2.43 → 0.2.46

Files changed (5)

package/README.md CHANGED

@@ -6,14 +6,18 @@
 [![npm downloads](https://img.shields.io/npm/dm/@semiont/make-meaning.svg)](https://www.npmjs.com/package/@semiont/make-meaning)
 [![License](https://img.shields.io/npm/l/@semiont/make-meaning.svg)](https://github.com/The-AI-Alliance/semiont/blob/main/LICENSE)
-**Making meaning from resources through context assembly, pattern detection, and relationship reasoning.**
+**Making meaning from resources through actors, context assembly, and relationship reasoning.**
-This package transforms raw resources into meaningful, interconnected knowledge through:
+This package implements the actor model from [ARCHITECTURE.md](../../docs/ARCHITECTURE.md). It owns the **Knowledge Base** and the actors that interface with it:
-- **Context Assembly**: Gathering resource metadata, content, and annotations from distributed storage
-- **Pattern Detection**: AI-powered discovery of semantic patterns (comments, highlights, assessments, tags)
-- **Relationship Reasoning**: Navigating connections between resources through graph traversal
-- **Job Workers**: Asynchronous processing of detection tasks with progress tracking
+- **Stower** (write) — the single write gateway to the Knowledge Base
+- **Gatherer** (read) — handles all browse reads, context assembly, and entity type listing
+- **Binder** (search/link) — searches KB stores for entity resolution and graph queries
+- **CloneTokenManager** (yield) — manages clone token lifecycle for resource cloning
+All actors subscribe to the EventBus via RxJS pipelines. They expose only `initialize()` and `stop()` — no public business methods. Callers communicate with actors by putting events on the bus.
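The actor shape described above (subscribe in `initialize()`, tear down in `stop()`, no public business methods) can be sketched without any dependencies. This is a minimal illustration, not the package's implementation: `MiniBus`, `Channel`, `SketchStower`, and both payload shapes are hypothetical, and a synchronous in-memory channel stands in for the RxJS pipelines the README mentions. Only the event names `yield:create` and `yield:created` come from the README.

```typescript
// Hypothetical, dependency-free stand-in for the EventBus; NOT the @semiont/core API.
type Handler<T> = (value: T) => void;

class Channel<T> {
  private readonly handlers = new Set<Handler<T>>();

  // Deliver synchronously to a snapshot of the current subscribers.
  next(value: T): void {
    for (const handler of Array.from(this.handlers)) handler(value);
  }

  // Returns a function that removes the subscription.
  subscribe(handler: Handler<T>): () => void {
    this.handlers.add(handler);
    return () => { this.handlers.delete(handler); };
  }
}

class MiniBus {
  private readonly channels = new Map<string, Channel<unknown>>();

  // Channels are created lazily, one per event name.
  get<T>(name: string): Channel<T> {
    let channel = this.channels.get(name);
    if (!channel) {
      channel = new Channel<unknown>();
      this.channels.set(name, channel);
    }
    return channel as unknown as Channel<T>;
  }
}

// Assumed payload shapes, for illustration only.
interface CreateCommand { name: string }
interface CreatedEvent { name: string; id: number }

// The actor exposes only initialize() and stop(); all work happens in response to bus events.
class SketchStower {
  private unsubscribe: (() => void) | null = null;
  private nextId = 1;

  constructor(private readonly bus: MiniBus) {}

  initialize(): void {
    this.unsubscribe = this.bus.get<CreateCommand>('yield:create').subscribe(cmd => {
      // A real write gateway would persist here before announcing the result.
      this.bus.get<CreatedEvent>('yield:created').next({ name: cmd.name, id: this.nextId++ });
    });
  }

  stop(): void {
    this.unsubscribe?.();
    this.unsubscribe = null;
  }
}

const bus = new MiniBus();
const created: CreatedEvent[] = [];
bus.get<CreatedEvent>('yield:created').subscribe(e => created.push(e));

const stower = new SketchStower(bus);
stower.initialize();
bus.get<CreateCommand>('yield:create').next({ name: 'My Document' });
stower.stop();
bus.get<CreateCommand>('yield:create').next({ name: 'Ignored after stop' });
```

The caller never holds a reference to a business method on the actor. It only emits a command and observes the result channel, which is why the second command, sent after `stop()`, produces no event.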
+The EventBus is a **complete interface** for all knowledge-domain operations. HTTP routes in the backend are thin wrappers that delegate to EventBus actors. The system can operate entirely without HTTP — see `EventBusClient` in `@semiont/api-client`.
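A thin wrapper over the bus, as described above, mostly has to do two things: emit a request and pick out the matching reply. The sketch below shows one way to do that with a typed event map and a correlation key. It is callback-based and synchronous rather than the RxJS `race` + `firstValueFrom` pipeline the README uses, and it omits the timeout. `TinyBus`, `requestGather`, `GatherOutcome`, and every payload field other than `annotationUri` and `resourceId` are assumptions made for illustration. The event names `gather:requested`, `gather:complete`, and `gather:failed` come from the README.

```typescript
// Hypothetical payload shapes; only the event names come from the README.
interface GatherEvents {
  'gather:requested': { annotationUri: string; resourceId: string };
  'gather:complete': { annotationUri: string; context: string };
  'gather:failed': { annotationUri: string; error: string };
}

type Listener<T> = (payload: T) => void;

// Hypothetical synchronous bus keyed by event name; a stand-in for the real EventBus.
class TinyBus<E> {
  private readonly listeners = new Map<keyof E, Set<Listener<any>>>();

  on<K extends keyof E>(name: K, listener: Listener<E[K]>): () => void {
    const set = this.listeners.get(name) ?? new Set<Listener<any>>();
    this.listeners.set(name, set);
    set.add(listener);
    return () => { set.delete(listener); };
  }

  emit<K extends keyof E>(name: K, payload: E[K]): void {
    for (const listener of Array.from(this.listeners.get(name) ?? [])) listener(payload);
  }
}

// Success and failure are both ordinary replies, so the caller gets a tagged union.
type GatherOutcome =
  | { ok: true; context: string }
  | { ok: false; error: string };

// Emits a request and reports the first reply whose annotationUri matches.
function requestGather(
  bus: TinyBus<GatherEvents>,
  annotationUri: string,
  resourceId: string,
  onOutcome: (outcome: GatherOutcome) => void,
): void {
  let settled = false;
  const cleanups: Array<() => void> = [];
  const settle = (outcome: GatherOutcome): void => {
    if (settled) return;
    settled = true;
    cleanups.forEach(off => off());
    onOutcome(outcome);
  };

  // Subscribe before emitting so a synchronous reply is not missed.
  cleanups.push(
    bus.on('gather:complete', e => {
      if (e.annotationUri === annotationUri) settle({ ok: true, context: e.context });
    }),
    bus.on('gather:failed', e => {
      if (e.annotationUri === annotationUri) settle({ ok: false, error: e.error });
    }),
  );
  bus.emit('gather:requested', { annotationUri, resourceId });
}

const gatherBus = new TinyBus<GatherEvents>();

// Hypothetical responder standing in for the Gatherer actor.
gatherBus.on('gather:requested', req => {
  if (req.resourceId === 'missing') {
    gatherBus.emit('gather:failed', { annotationUri: req.annotationUri, error: 'resource not found' });
  } else {
    gatherBus.emit('gather:complete', {
      annotationUri: req.annotationUri,
      context: `context for ${req.resourceId}`,
    });
  }
});

const outcomes: GatherOutcome[] = [];
requestGather(gatherBus, 'anno-1', 'doc-1', o => outcomes.push(o));
requestGather(gatherBus, 'anno-2', 'missing', o => outcomes.push(o));
```

The tagged `GatherOutcome` is a deliberate choice. When a success channel and a failure channel are raced, as in the README's RxJS snippet, whichever event arrives first resolves the awaited value, so a `gather:failed` event is delivered as a result rather than thrown. The caller therefore has to check which kind of reply it received.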
 ## Quick Start
@@ -23,319 +27,194 @@ npm install @semiont/make-meaning
 ### Start Make-Meaning Service
-The simplest way to use make-meaning infrastructure is through the service module:
 ```typescript
 import { startMakeMeaning } from '@semiont/make-meaning';
-import type { EnvironmentConfig } from '@semiont/core';
+import { EventBus } from '@semiont/core';
+import type { EnvironmentConfig, Logger } from '@semiont/core';
-// Start all infrastructure (job queue, workers, graph consumer)
-const makeMeaning = await startMakeMeaning(config);
+// EventBus is created outside make-meaning — it is not encapsulated by this package
+const eventBus = new EventBus();
-// Access job queue for route handlers
-const jobQueue = makeMeaning.jobQueue;
+// Start all infrastructure
+const makeMeaning = await startMakeMeaning(config, eventBus, logger);
+// Access components
+const { kb, jobQueue, stower, gatherer, binder, cloneTokenManager } = makeMeaning;
 // Graceful shutdown
 await makeMeaning.stop();
 ```
 This single call initializes:
-- Job queue
-- All 6 detection/generation workers
-- Graph consumer (event-to-graph synchronization)
-- Shared event store connection
-### Assemble Resource Context
-```typescript
-import { ResourceContext } from '@semiont/make-meaning';
-const resource = await ResourceContext.getResourceMetadata(resourceId, config);
-const resources = await ResourceContext.listResources({ createdAfter: '2024-01-01' }, config);
-const withContent = await ResourceContext.addContentPreviews(resources, config);
-```
+- **KnowledgeBase** — groups EventStore, ViewStorage, RepresentationStore, GraphDatabase
+- **Stower** — subscribes to write commands on EventBus
+- **Gatherer** — subscribes to browse reads, gather context, and entity type listing on EventBus
+- **Binder** — subscribes to search and referenced-by queries on EventBus
+- **CloneTokenManager** — subscribes to clone token operations on EventBus
+- **GraphDBConsumer** — event-to-graph synchronization (RxJS burst-buffered pipeline)
+- **JobQueue** — background job processing queue + job status subscription
+- **6 annotation workers** — poll job queue for async AI tasks
-### Work with Annotations
+### Create a Resource (via EventBus)
 ```typescript
-import { AnnotationContext } from '@semiont/make-meaning';
-// Get all annotations for a resource
-const annotations = await AnnotationContext.getResourceAnnotations(resourceId, config);
-// Build LLM context for an annotation (includes surrounding text)
-const context = await AnnotationContext.buildLLMContext(
-  annotationUri,
-  resourceId,
-  config,
-  { contextLines: 5 }
+import { ResourceOperations } from '@semiont/make-meaning';
+import { userId } from '@semiont/core';
+const result = await ResourceOperations.createResource(
+  {
+    name: 'My Document',
+    content: Buffer.from('Document content here'),
+    format: 'text/plain',
+    language: 'en',
+  },
+  userId('user-123'),
+  eventBus,
+  config.services.backend.publicURL,
 );
 ```
-### Detect Semantic Patterns
+`ResourceOperations.createResource` emits `yield:create` on the EventBus. The Stower subscribes to this event, persists the resource to the EventStore and ContentStore, and emits `yield:created` back on the bus.
-```typescript
-import { AnnotationDetection } from '@semiont/make-meaning';
-// AI-powered detection of passages that merit commentary
-const comments = await AnnotationDetection.detectComments(
-  resourceId,
-  config,
-  'Focus on technical explanations',
-  'educational',
-  0.7
-);
+### Gather Context (via EventBus)
-// Detect passages that should be highlighted
-const highlights = await AnnotationDetection.detectHighlights(
-  resourceId,
-  config,
-  'Find key definitions and important concepts',
-  0.5
-);
+```typescript
+import { firstValueFrom, race, filter, timeout } from 'rxjs';
-// Detect and extract structured tags from text using ontology schemas
-const tags = await AnnotationDetection.detectTags(
+// Emit gather request
+eventBus.get('gather:requested').next({
+  annotationUri,
   resourceId,
-  config,
-  'irac',  // Schema ID from @semiont/ontology
-  'issue'  // Category within the schema
+  options: { contextLines: 5 },
+});
+// Await result
+const result = await firstValueFrom(
+  race(
+    eventBus.get('gather:complete').pipe(filter(e => e.annotationUri === annotationUri)),
+    eventBus.get('gather:failed').pipe(filter(e => e.annotationUri === annotationUri)),
+  ).pipe(timeout(30_000)),
 );
 ```
-### Navigate Resource Relationships
-```typescript
-import { GraphContext } from '@semiont/make-meaning';
-// Find resources that link to this resource (backlinks)
-const backlinks = await GraphContext.getBacklinks(resourceId, config);
-// Find shortest path between two resources
-const paths = await GraphContext.findPath(fromResourceId, toResourceId, config, 3);
-// Full-text search across all resources
-const results = await GraphContext.searchResources('neural networks', config, 10);
-```
-### Use Individual Workers (Advanced)
-For fine-grained control, workers can be instantiated directly:
-```typescript
-import {
-  ReferenceDetectionWorker,
-  HighlightDetectionWorker,
-  GenerationWorker,
-} from '@semiont/make-meaning';
-import { JobQueue } from '@semiont/jobs';
-import { createEventStore } from '@semiont/event-sourcing';
-// Create shared dependencies
-const jobQueue = new JobQueue({ dataDir: './data' });
-await jobQueue.initialize();
-const eventStore = createEventStore('./data', 'http://localhost:3000');
-// Create workers with explicit dependencies
-const referenceWorker = new ReferenceDetectionWorker(jobQueue, config, eventStore);
-const highlightWorker = new HighlightDetectionWorker(jobQueue, config, eventStore);
-const generationWorker = new GenerationWorker(jobQueue, config, eventStore);
-// Start workers
-await Promise.all([
-  referenceWorker.start(),
-  highlightWorker.start(),
-  generationWorker.start(),
-]);
-```
-**Note**: In most cases, use `startMakeMeaning()` instead, which handles all initialization automatically.
-## Documentation
-- **[API Reference](./docs/api-reference.md)** - Complete API documentation for all classes and methods
-- **[Job Workers](./docs/job-workers.md)** - Asynchronous task processing with progress tracking
-- **[Architecture](./docs/architecture.md)** - System design and data flow
-- **[Examples](./docs/examples.md)** - Common use cases and patterns
-## Philosophy
-Resources don't exist in isolation. A document becomes meaningful when we understand its annotations, its relationships to other resources, and the patterns within its content. `@semiont/make-meaning` provides the infrastructure to:
-1. **Assemble context** from event-sourced storage
-2. **Detect patterns** using AI inference
-3. **Reason about relationships** through graph traversal
-This is the "applied meaning-making" layer - it sits between low-level AI primitives ([@semiont/inference](../inference/)) and high-level application orchestration ([apps/backend](../../apps/backend/)).
-## Infrastructure Ownership
-**MakeMeaningService is the single source of truth for all infrastructure:**
+## Architecture
-```typescript
-import { startMakeMeaning } from '@semiont/make-meaning';
+### Actor Model
-// Create ALL infrastructure once at startup
-const makeMeaning = await startMakeMeaning(config);
+All meaningful actions flow through the EventBus. The three KB actors are reactive — they subscribe via RxJS pipelines in `initialize()` and communicate results by emitting on the bus.
-// Access infrastructure components
-const { eventStore, graphDb, repStore, inferenceClient, jobQueue } = makeMeaning;
+```mermaid
+graph TB
+    Routes["Backend Routes"] -->|commands| BUS["Event Bus"]
+    Workers["Job Workers"] -->|commands| BUS
+    EBC["EventBusClient"] -->|commands| BUS
+    BUS -->|"yield:create, mark:create,<br/>mark:delete, job:*"| STOWER["Stower<br/>(write)"]
+    BUS -->|"browse:*, gather:*,<br/>mark:entity-types-*"| GATHERER["Gatherer<br/>(read)"]
+    BUS -->|"bind:search-*,<br/>bind:referenced-by-*"| BINDER["Binder<br/>(search/link)"]
+    BUS -->|"yield:clone-*"| CTM["CloneTokenManager<br/>(clone)"]
+    STOWER -->|persist| KB["Knowledge Base"]
+    GATHERER -->|query| KB
+    BINDER -->|query| KB
+    CTM -->|query| KB
+    STOWER -->|"yield:created, mark:created"| BUS
+    GATHERER -->|"browse:*-result,<br/>gather:complete"| BUS
+    BINDER -->|"bind:search-results,<br/>bind:referenced-by-result"| BUS
+    CTM -->|"yield:clone-token-generated,<br/>yield:clone-resource-result"| BUS
+    classDef bus fill:#e8a838,stroke:#b07818,stroke-width:3px,color:#000,font-weight:bold
+    classDef actor fill:#5a9a6a,stroke:#3d6644,stroke-width:2px,color:#fff
+    classDef kb fill:#8b6b9d,stroke:#6b4a7a,stroke-width:2px,color:#fff
+    classDef caller fill:#4a90a4,stroke:#2c5f7a,stroke-width:2px,color:#fff
+    class BUS bus
+    class STOWER,GATHERER,BINDER,CTM actor
+    class KB kb
+    class Routes,Workers,EBC caller
 ```
-**What MakeMeaningService Owns:**
+### Knowledge Base
-1. **EventStore** - Event log and materialized views (single source of truth)
-2. **GraphDatabase** - Graph database connection for relationships and traversal
-3. **RepresentationStore** - Content-addressed document storage
-4. **InferenceClient** - LLM client for AI operations
-5. **JobQueue** - Background job processing queue
-6. **Workers** - All 6 detection/generation workers
-7. **GraphDBConsumer** - Event-to-graph synchronization
+The Knowledge Base is an inert store — it has no intelligence, no goals, no decisions. It groups four subsystems:
-**Critical Design Rule:**
+| Store | Implementation | Purpose |
+|-------|---------------|---------|
+| **Event Log** | `EventStore` | Immutable append-only log of all domain events |
+| **Materialized Views** | `ViewStorage` | Denormalized projections for fast reads |
+| **Content Store** | `RepresentationStore` | Content-addressed binary storage (SHA-256) |
+| **Graph** | `GraphDatabase` | Eventually consistent relationship projection |
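The relationship between the first two rows of the table (an append-only event log and views derived from it) can be illustrated with a small fold. This is a sketch under assumptions, not the `EventStore` or `ViewStorage` API: the event types, the `EventLog` class, `ResourceView`, and `project` are all hypothetical names chosen for the example.

```typescript
// Hypothetical event shapes for illustration; not the @semiont/event-sourcing schema.
type DomainEvent =
  | { type: 'resource.created'; resourceId: string; name: string }
  | { type: 'annotation.added'; resourceId: string; annotationId: string }
  | { type: 'annotation.removed'; resourceId: string; annotationId: string };

interface ResourceView {
  name: string;
  annotationIds: string[];
}

// Append-only log: events are copied and frozen on append, never edited or removed.
class EventLog {
  private readonly events: DomainEvent[] = [];

  append(event: DomainEvent): void {
    this.events.push(Object.freeze({ ...event }));
  }

  all(): readonly DomainEvent[] {
    return this.events;
  }
}

// Rebuilds the read model by folding over the log from the start.
function project(events: readonly DomainEvent[]): Map<string, ResourceView> {
  const views = new Map<string, ResourceView>();
  for (const event of events) {
    if (event.type === 'resource.created') {
      views.set(event.resourceId, { name: event.name, annotationIds: [] });
      continue;
    }
    const view = views.get(event.resourceId);
    if (!view) continue; // Ignore events for resources the log never created.
    if (event.type === 'annotation.added') {
      view.annotationIds.push(event.annotationId);
    } else {
      view.annotationIds = view.annotationIds.filter(id => id !== event.annotationId);
    }
  }
  return views;
}

const log = new EventLog();
log.append({ type: 'resource.created', resourceId: 'doc-1', name: 'My Document' });
log.append({ type: 'annotation.added', resourceId: 'doc-1', annotationId: 'a1' });
log.append({ type: 'annotation.added', resourceId: 'doc-1', annotationId: 'a2' });
log.append({ type: 'annotation.removed', resourceId: 'doc-1', annotationId: 'a1' });

const docView = project(log.all()).get('doc-1');
```

Removing annotation `a1` adds a fourth event instead of deleting the second one, so the log keeps the full history while the projected view reflects only the current state. Because the view is derived, it can be discarded and rebuilt from the log at any time.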
 ```typescript
-// ✅ CORRECT: Access infrastructure from MakeMeaningService
-const { graphDb } = makeMeaning;
+import { createKnowledgeBase } from '@semiont/make-meaning';
-// ❌ WRONG: NEVER create infrastructure outside of startMakeMeaning()
-const graphDb = await getGraphDatabase(config);  // NEVER DO THIS
-const repStore = new FilesystemRepresentationStore(...);  // NEVER DO THIS
-const eventStore = createEventStore(...);  // NEVER DO THIS
+const kb = createKnowledgeBase(eventStore, basePath, projectRoot, graphDb, logger);
+// kb.eventStore, kb.views, kb.content, kb.graph
 ```
-**Why This Matters:**
-- **Single initialization** - All infrastructure created once, shared everywhere
-- **No resource leaks** - Single connection per resource type (database, storage, etc.)
-- **Consistent configuration** - Same config across all components
-- **Testability** - Single injection point for mocking
-- **Lifecycle management** - Centralized shutdown via `makeMeaning.stop()`
-**Implementation Pattern:**
-- Backend creates MakeMeaningService in [apps/backend/src/index.ts:56](../../apps/backend/src/index.ts#L56)
-- Routes access via Hono context: `c.get('makeMeaning')`
-- Services receive infrastructure as parameters (dependency injection)
-- Workers receive EventStore and InferenceClient via constructor
-This architectural pattern prevents duplicate connections, ensures consistent state, and provides clear ownership boundaries across the entire system.
-## Architecture
-Three-layer design separating concerns:
+### EventBus Ownership
-```mermaid
-graph TB
-    Backend["<b>apps/backend</b><br/>Job orchestration, HTTP APIs, streaming"]
-    MakeMeaning["<b>@semiont/make-meaning</b><br/>Context assembly, detection/generation,<br/>prompt engineering, response parsing,<br/>job workers"]
-    Inference["<b>@semiont/inference</b><br/>AI primitives only:<br/>generateText, client management"]
+The EventBus is created by the backend (or script) and passed into `startMakeMeaning()` as a dependency. Make-meaning does not own or encapsulate the EventBus — it is shared across the entire system.
-    Backend --> MakeMeaning
-    MakeMeaning --> Inference
-    style Backend fill:#e1f5ff
-    style MakeMeaning fill:#fff4e6
-    style Inference fill:#f3e5f5
-```
-**Key principles:**
-- **Centralized infrastructure**: All infrastructure owned by MakeMeaningService (single initialization point)
-- **Event-sourced context**: Resources and annotations assembled from event streams
-- **Content-addressed storage**: Content retrieved using checksums (deduplication, caching)
-- **Graph-backed relationships**: @semiont/graph provides traversal for backlinks, paths, connections
-- **Explicit dependencies**: Workers receive infrastructure via constructor (dependency injection, no singletons)
-- **No ad-hoc creation**: Routes and services NEVER create their own infrastructure instances
+## Documentation
-See [Architecture](./docs/architecture.md) for complete details.
+- **[Architecture](./docs/architecture.md)** — Actor model, data flow, storage architecture
+- **[API Reference](./docs/api-reference.md)** — Context modules and operations
+- **[Examples](./docs/examples.md)** — Common use cases and patterns
+- **[Job Workers](./docs/job-workers.md)** — Async annotation workers (in @semiont/jobs)
+- **[Scripting](./docs/SCRIPTING.md)** — Direct scripting without HTTP backend
 ## Exports
-### Service Module (Primary)
-- `startMakeMeaning(config)` - Initialize all make-meaning infrastructure
-- `MakeMeaningService` - Type for service return value
-- `GraphDBConsumer` - Graph consumer class (for advanced use)
+### Service (Primary)
-### Context Assembly
+- `startMakeMeaning(config, eventBus, logger)` — Initialize all infrastructure
+- `MakeMeaningService` — Type for service return value
-- `ResourceContext` - Resource metadata and content
-- `AnnotationContext` - Annotation queries and context building
-- `GraphContext` - Graph traversal and search
+### Knowledge Base
-### Detection & Generation
+- `createKnowledgeBase(...)` — Factory function
+- `KnowledgeBase` — Interface grouping the four KB stores
-- `AnnotationDetection` - AI-powered semantic pattern detection (orchestrates detection pipeline)
-- `MotivationPrompts` - Prompt builders for comment/highlight/assessment/tag detection
-- `MotivationParsers` - Response parsers with offset validation
-- `extractEntities` - Entity extraction with context-based disambiguation
-- `generateResourceFromTopic` - Markdown resource generation with language support
-- `generateResourceSummary` - Resource summarization
-- `generateReferenceSuggestions` - Smart suggestion generation
+### Actors
-### Job Workers (Advanced)
+- `Stower` — Write gateway actor
+- `Gatherer` — Read actor (browse reads, context assembly, entity type listing)
+- `Binder` — Search/link actor (entity resolution, referenced-by queries)
+- `CloneTokenManager` — Clone token lifecycle actor (yield domain)
-- `ReferenceDetectionWorker` - Entity reference detection
-- `GenerationWorker` - AI content generation
-- `HighlightDetectionWorker` - Highlight detection
-- `CommentDetectionWorker` - Comment detection
-- `AssessmentDetectionWorker` - Assessment detection
-- `TagDetectionWorker` - Structured tag detection
+### Operations
-**Note**: Workers are typically managed by `startMakeMeaning()`, not instantiated directly.
+- `ResourceOperations` — Resource CRUD (emits commands to EventBus)
+- `AnnotationOperations` — Annotation CRUD (emits commands to EventBus)
-See [Job Workers](./docs/job-workers.md) for implementation details.
+### Context Assembly
-### Types
+- `ResourceContext` — Resource metadata queries from ViewStorage
+- `AnnotationContext` — Annotation queries and LLM context building
+- `GraphContext` — Graph traversal and search
+- `LLMContext` — Resource-level LLM context assembly
-```typescript
-export type {
-  CommentMatch,
-  HighlightMatch,
-  AssessmentMatch,
-  TagMatch,
-} from './detection/motivation-parsers';
-export type { ExtractedEntity } from './detection/entity-extractor';
-```
+### Generation
-## Configuration
+- `generateResourceSummary` — Resource summarization
+- `generateReferenceSuggestions` — Smart suggestion generation
-All methods require an `EnvironmentConfig` object:
+### Graph
-```typescript
-import type { EnvironmentConfig } from '@semiont/core';
-const config: EnvironmentConfig = {
-  services: {
-    backend: {
-      publicURL: 'http://localhost:3000',
-    },
-    openai: {
-      apiKey: process.env.OPENAI_API_KEY!,
-      model: 'gpt-4o-mini',
-      temperature: 0.7,
-    },
-  },
-  storage: {
-    base: '/path/to/storage',
-  },
-};
-```
+- `GraphDBConsumer` — Event-to-graph synchronization
 ## Dependencies
-`@semiont/make-meaning` builds on several core packages:
-- **[@semiont/core](../core/)**: Core types and utilities
-- **[@semiont/api-client](../api-client/)**: OpenAPI-generated types
-- **[@semiont/event-sourcing](../event-sourcing/)**: Event store and view storage
-- **[@semiont/content](../content/)**: Content-addressed storage
-- **[@semiont/graph](../graph/)**: Neo4j graph database client
-- **[@semiont/ontology](../ontology/)**: Schema definitions for tags
-- **[@semiont/inference](../inference/)**: AI primitives (prompts, parsers, generateText)
-- **[@semiont/jobs](../jobs/)**: Job queue and worker base class
+- **[@semiont/core](../core/)** — Core types, EventBus, utilities
+- **[@semiont/api-client](../api-client/)** — OpenAPI-generated types
+- **[@semiont/event-sourcing](../event-sourcing/)** — Event store and view storage
+- **[@semiont/content](../content/)** — Content-addressed storage
+- **[@semiont/graph](../graph/)** — Graph database abstraction
+- **[@semiont/ontology](../ontology/)** — Schema definitions for tags
+- **[@semiont/inference](../inference/)** — AI primitives (generateText)
+- **[@semiont/jobs](../jobs/)** — Job queue and annotation workers
 ## Testing