@asaidimu/utils-workspace 6.0.0 → 6.2.0

package/README.md CHANGED
@@ -1,415 +1,464 @@
1
- # AI Workspace SDK
1
+ # @asaidimu/utils-workspace
2
2
 
3
- **Content‑addressed workspace and conversation management for AI applications.**
3
+ > **The Architectural Foundation for Production-Grade AI Workspaces.**
4
4
 
5
- [![npm version](https://img.shields.io/npm/v/@asaidimu/utils-workspace)](https://www.npmjs.com/package/@asaidimu/utils-workspace)
6
- [![license](https://img.shields.io/npm/l/@asaidimu/utils-workspace)](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/LICENSE)
7
- [![build status](https://img.shields.io/github/actions/workflow/status/asaidimu/erp-utils/ci.yml?branch=main)](https://github.com/asaidimu/erp-utils/actions)
5
+ ---
8
6
 
9
7
  ## 📖 Table of Contents
10
8
 
11
- - [Overview & Features](#overview--features)
12
- - [Installation & Setup](#installation--setup)
13
- - [Usage Documentation](#usage-documentation)
14
- - [Basic Conversation Flow](#basic-conversation-flow)
15
- - [Turn Versioning & Editing](#turn-versioning--editing)
16
- - [Working with Blobs (Images & Documents)](#working-with-blobs-images--documents)
17
- - [Managing Preferences & Roles](#managing-preferences--roles)
18
- - [Branching a Conversation](#branching-a-conversation)
19
- - [Custom Prompt Assembly](#custom-prompt-assembly)
20
- - [Project Architecture](#project-architecture)
21
- - [Development & Contributing](#development--contributing)
22
- - [Additional Information](#additional-information)
9
+ 1. [**1. Introduction: The Evolution of Conversation State**](#1-introduction)
10
+ 2. [**2. Mental Model: Conversation as a Directed Acyclic Graph (DAG)**](#2-mental-model)
11
+ 3. [**3. Core Design Philosophy: The Four Pillars of AI State**](#3-philosophy)
12
+ 4. [**4. System Architecture: The "Engine Under the Hood"**](#4-architecture)
13
+ - [4.1. CQRS: Decoupling Write and Read Paths](#41-cqrs)
14
+ - [4.2. The Index: Reactive Read Projections](#42-the-index)
15
+ - [4.3. The Sequential Serializer: Race Condition Prevention](#43-serializer)
16
+ - [4.4. Event Bus and Reactivity](#44-event-bus)
17
+ 5. [**5. Domain Entities: Deep Dive into the State**](#5-entities)
18
+ - [5.1. Sessions: The Contextual Roots](#51-sessions)
19
+ - [5.2. Turns: The Atoms of Hyper-History](#52-turns)
20
+ - [5.3. Blobs: Deduplicated, Content-Addressed Asset Management](#53-blobs)
21
+ - [5.4. Roles: Persistent Personas and AI Identity](#54-roles)
22
+ - [5.5. Preferences & Context: The Multi-Tier Memory Layer](#55-memory)
23
+ 6. [**6. The Prompt Pipeline: Anatomy of a Request**](#6-pipeline)
24
+ - [6.1. Stage 1: DAG Snapshotting and Ancestry Walking](#61-snapshotting)
25
+ - [6.2. Stage 2: Semantic Retrieval and Topic Alignment](#62-retrieval)
26
+ - [6.3. Stage 3: Conflict Resolution in Instructions](#63-conflict-resolution)
27
+ - [6.4. Stage 4: Multi-Modal Blob Resolution Protocols](#64-resolution)
28
+ - [6.5. Stage 5: Provider-Specific Adapter Mapping](#65-mapping)
29
+ 7. [**7. The Content Block Specification: Standardized Interoperability**](#7-blocks)
30
+ - [7.1. Basic Blocks: Text and Thinking](#71-basic-blocks)
31
+ - [7.2. Asset Blocks: Image and Document](#72-asset-blocks)
32
+ - [7.3. Functional Blocks: ToolUse and ToolResult](#73-functional-blocks)
33
+ - [7.4. Structural Blocks: Summary and RoleTransition](#74-structural-blocks)
34
+ 8. [**8. Command Reference: The Mutation API Manual**](#8-commands)
35
+ - [8.1. Workspace Domain Commands](#81-workspace-cmds)
36
+ - [8.2. Session Domain Commands](#82-session-cmds)
37
+ - [8.3. Turn & DAG Domain Commands](#83-turn-cmds)
38
+ - [8.4. Role & Persona Domain Commands](#84-role-cmds)
39
+ - [8.5. Blob & Asset Domain Commands](#85-blob-cmds)
40
+ - [8.6. Resource (Context/Preference) Domain Commands](#86-resource-cmds)
41
+ 9. [**9. Extension Framework: Building the Plugin Ecosystem**](#9-extension)
42
+ - [9.1. WorkspaceExtensions: Packaging Domain Logic](#91-extensions)
43
+ - [9.2. Middleware: The Request Interceptor Pipeline](#92-middleware)
44
+ - [9.3. TurnProcessors: Giving the AI Agency](#93-turnprocessors)
45
+ - [9.4. Custom Indexers: Expanding the Read Model](#94-indexers)
46
+ 10. [**10. Persistence Layer: Schema and Collections**](#10-persistence)
47
+ 11. [**11. Implementation Guide: Building Your Workspace**](#11-implementation)
48
+ - [11.1. Implementing a Database Boundary](#111-db-impl)
49
+ - [11.2. Implementing a BlobStorage Boundary](#112-blob-impl)
50
+ - [11.3. Bootstrapping the Manager](#113-bootstrap)
51
+ 12. [**12. Design Decisions: The "Why" Behind the "What"**](#12-decisions)
52
+ 13. [**13. Comparison with Industry Standards**](#13-comparison)
53
+ 14. [**14. Advanced Recipes & Design Patterns**](#14-recipes)
54
+ 15. [**15. Technical Specifications: Protocols & Formats**](#15-spec)
55
+ 16. [**16. Troubleshooting & Common Failure Modes**](#16-troubleshooting)
56
+ 17. [**17. Testing Strategy: Ensuring Workspace Integrity**](#17-testing)
57
+ 18. [**18. Glossary of Terms**](#18-glossary)
58
+ 19. [**19. API Reference Appendix**](#19-api-ref)
59
+ 20. [**20. License**](#20-license)
23
60
 
24
61
  ---
25
62
 
26
- ## Overview & Features
63
+ ## 1. Introduction: The Evolution of Conversation State
64
+
65
+ In the first wave of LLM integration, "conversation" was modeled as a disposable buffer—a linear array of messages passed to an endpoint. While sufficient for simple chatbots, this model is fundamentally incompatible with **Professional AI Workspaces**.
27
66
 
28
- The AI Workspace SDK provides a robust, offline‑first foundation for building AI‑powered applications. It models conversation sessions as directed acyclic graphs (DAGs) of turns, supports content‑addressed binary blobs (images, documents), and includes a flexible prompt assembly pipeline that respects token budgets, relevance scoring, and user preferences.
67
+ A workspace isn't just a chat; it's a collaborative environment where AI agents act on artifacts, remember user preferences across months, navigate complex branching decision trees, and handle multi-gigabyte document libraries without breaking a sweat.
29
68
 
30
- Unlike simple chat history arrays, this library treats conversations as versioned, branchable graphs – enabling features like turn editing, branching conversations, and time‑travel navigation. All state changes go through a command/reducer pattern, making it easy to implement undo/redo, sync with backends, or build collaborative editors.
69
+ `@asaidimu/utils-workspace` is the state engine designed for this complexity. It provides a high-integrity, content-addressed foundation that treats:
31
70
 
32
- ### Key Features
71
+ - **Conversation as Graph Architecture**: Enabling non-destructive editing and infinite branching.
72
+ - **Data as Content-Addressed Assets**: Eliminating redundancy and ensuring asset integrity.
73
+ - **State as Observable Commands**: Providing a perfect audit trail and reactive UI projections.
33
74
 
34
- - **Session‑based conversation DAG** – Turns are stored as versioned nodes with parent pointers. Branch, edit, or delete turns without losing history.
35
- - **Turn versioning** – Each edit creates a new version of a turn; users can switch between versions or view the version history.
36
- - **Content‑addressed blobs** – Binary data (images, PDFs, etc.) stored by SHA‑256, deduplicated automatically, with reference counting and remote ID mapping.
37
- - **Token‑aware prompt assembly** – Plan prompts under token budgets, rank context by relevance (Jaccard + freshness), truncate gracefully, and inject summaries.
38
- - **Role & preference system** – Each session has a role (persona + system prompt) and can override preference defaults per topic.
39
- - **Task management** – First‑class task entities with steps, status, and topic linking – integrate with `task:proposal` blocks.
40
- - **Pluggable storage** – Use IndexedDB (browser), Memory (testing/server), or implement your own backend via the `Database` and `BlobStorage` interfaces.
41
- - **Command/reducer architecture** – All mutations are commands that produce patches; perfect for reactive UIs and event sourcing.
75
+ If you are building an IDE, a research lab, or an autonomous agent platform, this is the engine you've been looking for.
42
76
 
43
77
  ---
44
78
 
45
- ## Installation & Setup
79
+ ## 2. Mental Model: Conversation as a Directed Acyclic Graph (DAG)
46
80
 
47
- ### Prerequisites
81
+ Most developers think of chat history as an `Array<Message>`. In this framework, we discard that model in favor of a **Directed Acyclic Graph (DAG)**.
48
82
 
49
- - Node.js 18+ or modern browser
50
- - npm or bun
83
+ ### 2.1. The Hyper-History Concept
51
84
 
52
- ### Installation
85
+ In a standard chat, each message points to the one before it. In our `TurnTree` model, we introduce a layer of abstraction between the "Sequence" and the "Content."
53
86
 
54
- Install the workspace package and its required peer dependency `@asaidimu/utils-database`:
87
+ - **TurnNode**: Represents a logical "moment" or "slot" in the conversation timeline. It is a stable identifier that does not change.
88
+ - **Turn**: Represents the actual data (text, images, tool results) at that moment.
55
89
 
56
- ```bash
57
- npm install @asaidimu/utils-workspace @asaidimu/utils-database
58
- # or
59
- bun add @asaidimu/utils-workspace @asaidimu/utils-database
60
- ```
90
+ By separating these, we enable **Non-Destructive Editing**. When a user "edits" a prompt at TurnNode X, we don't overwrite the existing record. We create `Turn (Version 2)` under `TurnNode X`, so the graph now has two possible paths forward from X's parent.
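
A minimal sketch of this idea, assuming hypothetical in-memory shapes (`TurnVersion`, `TurnNodeSketch`, `editTurn`) that are not the package's real types. Version numbering starting at 1 is also an assumption, chosen so the first edit produces "Version 2" as in the text. An edit appends a new version under the same node and leaves earlier versions in place.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
interface TurnVersion {
  version: number;
  text: string;
}

interface TurnNodeSketch {
  id: string;              // stable slot identifier, never changes
  versions: TurnVersion[]; // append-only list of contents for this slot
  active: number;          // index of the version on the active timeline
}

// "Editing" appends a new version; older versions are left untouched.
function editTurn(node: TurnNodeSketch, text: string): TurnNodeSketch {
  const next: TurnVersion = { version: node.versions.length + 1, text };
  return {
    ...node,
    versions: [...node.versions, next],
    active: node.versions.length, // index of the version just appended
  };
}

const nodeX: TurnNodeSketch = {
  id: "node-x",
  versions: [{ version: 1, text: "What is the weather like today?" }],
  active: 0,
};
const editedX = editTurn(nodeX, "What is the weather like in Tokyo?");
```

Because `editTurn` returns a new object instead of mutating its input, both the original and the edited node remain available, which mirrors the "two possible paths" described above.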
61
91
 
62
- The library is written in TypeScript and ships with its own type definitions – no additional `@types/` packages required.
92
+ ### 2.2. Forking and Divergent Timelines
63
93
 
64
- ### Basic Configuration
94
+ Sessions are merely pointers to a "Head" in the global graph of TurnNodes. This means "Forking" a session is an $O(1)$ metadata operation. You can create 1,000 different sessions that all share the same first 50 turns. This is critical for:
65
95
 
66
- ```typescript
67
- import { createEventBus } from '@asaidimu/events';
68
- import { DatabaseConnection, createEphemeralStore } from '@asaidimu/utils-database';
69
- import {
70
- ContentStore,
71
- createWorkspaceDatabase,
72
- MemoryBlobStorage,
73
- WorkspaceManager,
74
- SessionManager,
75
- createSimpleWorkspace,
76
- } from '@asaidimu/utils-workspace';
77
-
78
- // 1. Setup database and blob storage
79
- const db = createWorkspaceDatabase(
80
- await DatabaseConnection(
81
- { database: 'my-app', validate: true, predicates: {} },
82
- createEphemeralStore()
83
- )
84
- );
85
- const eventBus = createEventBus();
86
- const blobStorage = new MemoryBlobStorage(); // or IndexedDBBlobStorage for browsers
87
-
88
- // 2. Create core components
89
- const contentStore = await ContentStore.create(db, blobStorage, eventBus);
90
- const workspaceManager = new WorkspaceManager({ contentStore, eventBus });
91
- const sessionManager = new SessionManager(workspaceManager, contentStore);
92
-
93
- // 3. Initial workspace state
94
- let currentWorkspace = createSimpleWorkspace({ name: 'My Workspace', language: 'en', actor: 'user' });
95
- ```
96
+ - **A/B Testing**: Testing how different model instructions respond to the same conversation.
97
+ - **Collaborative Branching**: Allowing multiple team members to explore different paths from a shared starting point.
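
To illustrate why a fork can be a constant-time metadata operation, here is a rough sketch under stated assumptions: `SessionSketch`, `TurnRefSketch`, and `forkSession` are hypothetical names, and the fork is taken at the source session's head rather than at an arbitrary turn. Only the head pointer is copied; no turn data is duplicated.

```typescript
// Hypothetical shapes for illustration only; not the package's real types.
interface TurnRefSketch {
  id: string;
  version: number;
}

interface SessionSketch {
  id: string;
  label: string;
  head: TurnRefSketch | null; // pointer into the shared graph of turns
}

// Forking copies the head pointer only, so the cost does not depend on history length.
function forkSession(source: SessionSketch, newId: string, label: string): SessionSketch {
  return { id: newId, label, head: source.head ? { ...source.head } : null };
}

const mainSession: SessionSketch = {
  id: "s-1",
  label: "Main",
  head: { id: "turn-50", version: 1 },
};
const variantA = forkSession(mainSession, "s-2", "Variant A");
const variantB = forkSession(mainSession, "s-3", "Variant B");
```

Both variants start from the same 50-turn history, and each can advance its own head independently afterwards.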
96
98
 
97
- ### Verification
99
+ ---
98
100
 
99
- Run a quick smoke test:
101
+ ## 3. Core Design Philosophy: The Four Pillars of AI State
100
102
 
101
- ```typescript
102
- const createResult = await sessionManager.create(currentWorkspace, { label: 'Test Session' });
103
- if (createResult.ok) {
104
- console.log('Session created:', createResult.value.session.id());
105
- }
106
- ```
103
+ ### 3.1. Immutability and Versioning
104
+
105
+ We believe that AI history should be a permanent record. Every AI response and user prompt is immutable. To "change" the past, you must create a new branch. This ensures that the "reasoning path" of an agent is always recoverable.
106
+
107
+ ### 3.2. Content-Addressability (SHA-256)
108
+
109
+ Assets (images, PDFs, source code) are identified by the cryptographic hash of their contents. This solves the "State Bloat" problem. If 100 sessions all reference the same 5MB system manual, the workspace only stores one copy.
110
+
111
+ ### 3.3. Command-Driven Mutations (CQRS)
112
+
113
+ We strictly follow the **Command Query Responsibility Segregation (CQRS)** pattern.
114
+
115
+ - **Commands**: Describe intent (e.g., `session:fork`).
116
+ - **Reducers**: Pure-ish functions that execute the intent against the persistent stores.
117
+ - **Projections**: In-memory read models (The Index) that allow for sub-millisecond retrieval of workspace state.
118
+
119
+ ### 3.4. Portability and LLM Agnosticism
120
+
121
+ The workspace core does not know what "OpenAI" or "Gemini" is. It operates on an abstract `Prompt` model. This allows you to build your entire application logic once and swap between different LLM providers just by changing the `LLMAdapter`.
107
122
 
108
123
  ---
109
124
 
110
- ## Usage Documentation
125
+ ## 4. System Architecture: The "Engine Under the Hood"
111
126
 
112
- ### Basic Conversation Flow
127
+ ### 4.1. Write Model: Atomic Entity Stores
113
128
 
114
- ```typescript
115
- import { PromptBuilder, TurnBuilder, merge } from '@asaidimu/utils-workspace';
116
-
117
- // Create a session
118
- const { session, patch } = (await sessionManager.create(currentWorkspace, { label: 'Chat' })).value;
119
- currentWorkspace = merge(currentWorkspace, patch);
120
-
121
- // Add a user turn
122
- const userTurn = new TurnBuilder('user')
123
- .addText('What is the weather like today?')
124
- .build();
125
- const addResult = await session.addTurn(currentWorkspace, userTurn);
126
- currentWorkspace = merge(currentWorkspace, addResult.value);
127
-
128
- // Resolve the effective session (includes preferences, context, transcript)
129
- const effective = (await workspaceManager.resolveSession(currentWorkspace, session.id())).value;
130
-
131
- // Build a prompt for an LLM
132
- const promptBuilder = new PromptBuilder({ blobResolver: contentStore.getBlobResolver() });
133
- const prompt = await promptBuilder.build(effective, {
134
- tokenBudget: { total: 8000 },
135
- relevanceConfig: { recentMessageWindow: 5, minScore: 0.3 }
136
- });
129
+ The library organizes data into atomic "Stores." Each store is responsible for a single domain entity (e.g., `RoleStore`, `SessionStore`). These stores are designed to be backed by any persistent database via a standardized `Database` boundary.
137
130
 
138
- console.log(prompt.system.persona);
139
- console.log(prompt.transcript.turns);
140
- ```
131
+ ### 4.2. Read Model: Reactive Index Projections
141
132
 
133
+ Querying a database for every prompt is too slow. To solve this, the `WorkspaceManager` maintains a **Workspace Index**. This index is a reactive, in-memory projection of the underlying stores.
142
134
 
143
- ### Turn Versioning & Editing
135
+ - When a command is executed, the reducer returns a "Patch."
136
+ - The manager applies the patch to the Index.
137
+ - Subscribers are notified via the Event Bus.
144
138
 
145
- Every turn in a session is versioned. When you edit a turn, a new version is created while preserving the old one. The session’s “head” points to the latest version along the active chain, but users can switch between versions or view version history.
139
+ ### 4.3. The Sequential Serializer (Race Condition Prevention)
146
140
 
147
- **Editing a turn** – create a new version with modified content:
141
+ Concurrency in graph-based history leads to "phantom branches" and corrupt heads. The `WorkspaceManager` uses a **Sequential Serializer** (a promise-chaining queue) to ensure that only one mutation can happen at a time.
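
A promise-chaining queue of the kind described here can be sketched as follows. The class name `SequentialSerializer` and its `enqueue` method are assumptions for illustration, not the package's confirmed API. Each task starts only after the previous one has settled, so two mutations never interleave.

```typescript
// Illustrative promise-chaining queue; names are hypothetical.
class SequentialSerializer {
  // The tail never rejects, so one failed task cannot block later ones.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => task());
    // Swallow the failure on the chain only; the caller still sees it via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const serializer = new SequentialSerializer();
const order: string[] = [];

// The first task yields several times before finishing. Without the queue,
// the second task would record its entry first.
const first = serializer.enqueue(async () => {
  for (let i = 0; i < 5; i++) await Promise.resolve();
  order.push("turn:add");
});
const second = serializer.enqueue(async () => {
  order.push("turn:edit");
});
const settled: Promise<string[]> = Promise.all([first, second]).then(() => order);
```

The trade-off is throughput: mutations are strictly ordered, which is acceptable when writes are small and correctness of the head pointer matters more than parallelism.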
148
142
 
149
- ```typescript
150
- // Edit the user turn we just added
151
- const newBlocks: ContentBlock[] = [
152
- { id: uuid(), type: 'text', text: 'What is the weather like in Tokyo?' }
153
- ];
154
- const editResult = await session.editTurn(
155
- currentWorkspace,
156
- userTurn.id, // turn ID
157
- newBlocks, // new content blocks
158
- 'user' // optional role snapshot
159
- );
160
- if (editResult.ok) {
161
- currentWorkspace = merge(currentWorkspace, editResult.value);
162
- }
163
- ```
143
+ ### 4.4. Event Bus and Reactivity
164
144
 
165
- **Navigating versions** – switch between versions of a turn (e.g., undo/redo):
145
+ The workspace is "Alive." Every mutation emits events that can be listened to by the UI, external logging services, or other parts of the application.
166
146
 
167
- ```typescript
168
- // Switch to the previous version of this turn
169
- const leftResult = await session.switchVersionLeft(currentWorkspace, userTurn.id);
170
- if (leftResult.ok) {
171
- currentWorkspace = merge(currentWorkspace, leftResult.value);
172
- // The session head now points to the previous version
173
- }
147
+ ---
174
148
 
175
- // Switch to the next version (if available)
176
- const rightResult = await session.switchVersionRight(currentWorkspace, userTurn.id);
177
- ```
149
+ ## 5. Domain Entities: Deep Dive into the State
178
150
 
179
- **Inspecting version info** – get available versions and navigation state:
151
+ ### 5.1. Sessions: The Contextual Roots
180
152
 
181
- ```typescript
182
- const branchInfo = await session.branchInfo(userTurn.id);
183
- console.log(branchInfo);
184
- // {
185
- // versions: [0, 1, 2], // all version numbers for this turn
186
- // currentIndex: 1, // index of the active version
187
- // total: 3,
188
- // hasPrev: true,
189
- // hasNext: true
190
- // }
191
- ```
153
+ A `Session` is the user's primary unit of interaction. It manages the "active timeline."
192
154
 
193
- **How it works under the hood** – When you call `editTurn`, the library:
194
- 1. Loads the current version of the turn.
195
- 2. Increments the version number.
196
- 3. Stores a new turn document with the updated blocks.
197
- 4. Updates the session head if the edited turn was the head.
198
- 5. Preserves all previous versions, which remain accessible via `switchVersionLeft/Right` and appear in `branchInfo`.
155
+ - **Head Pointer**: A `TurnRef` (UUID + Version) identifying the current leaf of the DAG.
156
+ - **Topic Affinity**: Semantic tags that act as "magnets" for RAG context and user preferences.
199
157
 
200
- This makes the conversation graph fully auditable and supports collaborative editing scenarios.
158
+ ### 5.2. Turns: The Atoms of Hyper-History
201
159
 
202
- ### Working with Blobs (Images & Documents)
160
+ A `Turn` is the smallest unit of conversation. It is a multi-modal container that can hold anything from a text string to a sequence of tool executions and internal "thinking" blocks.
203
161
 
204
- ```typescript
205
- // Register an image blob
206
- const imageData = await fetch('/photo.jpg').then(async r => new Uint8Array(await r.arrayBuffer()));
207
- const registerCmd: RegisterBlob = {
208
- type: 'blob:register',
209
- timestamp: new Date().toISOString(),
210
- payload: { data: imageData, mediaType: 'image/jpeg', filename: 'photo.jpg' }
211
- };
212
- const blobResult = await workspaceManager.dispatch(currentWorkspace, registerCmd);
213
- if (blobResult.ok) {
214
- const blobRef = blobResult.value.index?.blobs?.['sha256...']; // reference
215
- // Use blobRef in an ImageBlock
216
- const turnWithImage = new TurnBuilder('user')
217
- .addImage(blobRef, 'A beautiful landscape')
218
- .build();
219
- }
220
- ```
162
+ ### 5.3. Blobs: Deduplicated Asset Management
163
+
164
+ Blobs are the library's mechanism for large binary data. They are garbage-collected by **Reference Counting**.
165
+
166
+ - **Registration**: Calculate hash -> Check for existence -> Create or increment ref-count.
167
+ - **Automatic Cleanup**: When a turn is deleted, the system decrements the blob's ref-count. Only when it hits zero is the physical file deleted from storage.
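
The register and release steps above can be sketched as a small in-memory registry. `BlobRegistrySketch` and its methods are hypothetical, and the digest is assumed to be computed elsewhere; a placeholder string stands in for a real SHA-256 hex digest.

```typescript
// Hypothetical in-memory registry keyed by a precomputed SHA-256 hex digest.
class BlobRegistrySketch {
  private refCounts = new Map<string, number>();
  private storage = new Map<string, Uint8Array>();

  // Register: store the bytes on first sight, otherwise only bump the count.
  register(sha256: string, data: Uint8Array): number {
    const current = this.refCounts.get(sha256) ?? 0;
    if (current === 0) this.storage.set(sha256, data);
    this.refCounts.set(sha256, current + 1);
    return current + 1;
  }

  // Release: decrement, and delete the bytes only when the count reaches zero.
  release(sha256: string): number {
    const current = this.refCounts.get(sha256) ?? 0;
    if (current <= 1) {
      this.refCounts.delete(sha256);
      this.storage.delete(sha256);
      return 0;
    }
    this.refCounts.set(sha256, current - 1);
    return current - 1;
  }

  has(sha256: string): boolean {
    return this.storage.has(sha256);
  }
}

const registry = new BlobRegistrySketch();
const digest = "placeholder-digest"; // stands in for a real SHA-256 hex digest
const bytes = new Uint8Array([1, 2, 3]);

const afterFirst = registry.register(digest, bytes);  // first reference
const afterSecond = registry.register(digest, bytes); // deduplicated, count bumped
const afterRelease = registry.release(digest);        // one reference remains
const stillStored = registry.has(digest);
const afterFinal = registry.release(digest);          // last reference dropped
const storedAfterFinal = registry.has(digest);
```

Registering the same digest twice stores the bytes once, and the data is removed only after the second release.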
168
+
169
+ ### 5.4. Roles: Persistent Personas
170
+
171
+ A `Role` encapsulates the AI's identity. It includes the "System Prompt" (Persona), model-specific constraints (e.g., `temperature`, `window size`), and a set of topics that the role is "interested" in.
172
+
173
+ ---
174
+
175
+ ## 6. The Prompt Pipeline: Anatomy of a Request
176
+
177
+ The `PromptBuilder` follows a 5-step lifecycle:
178
+
179
+ ### 6.1. Stage 1: DAG Snapshotting and Ancestry Walking
180
+
181
+ Starting from the session's `Head`, the builder performs a recursive walk up the parent pointers. This flattens the graph into a linear `Transcript`.
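
A rough sketch of the ancestry walk, assuming a hypothetical `StoredTurn` shape with a single `parentId` pointer. The text describes a recursive walk; this version uses an equivalent loop. It collects turns from the head back to the root, then reverses them so the transcript reads oldest first.

```typescript
// Hypothetical turn shape for illustration only.
interface StoredTurn {
  id: string;
  parentId: string | null;
  text: string;
}

// Walk parent pointers from the head to the root and return the path oldest-first.
function buildTranscript(turns: Map<string, StoredTurn>, headId: string): StoredTurn[] {
  const transcript: StoredTurn[] = [];
  const seen = new Set<string>();
  let cursor: string | null = headId;
  while (cursor !== null) {
    if (seen.has(cursor)) throw new Error(`cycle detected at ${cursor}`);
    seen.add(cursor);
    const turn = turns.get(cursor);
    if (!turn) throw new Error(`missing turn ${cursor}`);
    transcript.push(turn);
    cursor = turn.parentId;
  }
  return transcript.reverse();
}

// Two branches ("b" and "b2") diverge after turn "a".
const turnStore = new Map<string, StoredTurn>([
  ["r", { id: "r", parentId: null, text: "Hello" }],
  ["a", { id: "a", parentId: "r", text: "Hi, how can I help?" }],
  ["b", { id: "b", parentId: "a", text: "Plan a trip to Tokyo" }],
  ["b2", { id: "b2", parentId: "a", text: "Plan a trip to Osaka" }],
]);

const transcriptB = buildTranscript(turnStore, "b").map((t) => t.id);
const transcriptB2 = buildTranscript(turnStore, "b2").map((t) => t.id);
```

Only the turns on the path from the chosen head to the root appear in the result, so sibling branches never leak into the prompt.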
182
+
183
+ ### 6.2. Stage 2: Semantic Retrieval and Topic Alignment
184
+
185
+ The builder identifies the session's active `Topics`, queries the `ContextIndex`, and uses the `ContextRetriever` to rank the matching entries by relevance.
186
+
187
+ ### 6.3. Stage 3: Conflict Resolution in Instructions
188
+
189
+ The builder assembles the final "System Instruction" by merging the `Role` instructions with any active `Preferences`.
190
+
191
+ ### 6.4. Stage 4: Multi-Modal Blob Resolution Protocols
192
+
193
+ Every `BlobRef` in the transcript is resolved. The builder contacts the `BlobStore` to determine if the data should be sent as `inlineData` (Base64) or if it has a pre-existing `fileId`.
194
+
195
+ ### 6.5. Stage 5: Provider-Specific Adapter Mapping
196
+
197
+ The finalized, agnostic `Prompt` is passed to the `LLMAdapter`. This is where the mapping to the provider's specific wire format occurs.
198
+
199
+ ---
200
+
201
+ ## 7. The Content Block Specification
202
+
203
+ ### 7.1. Basic Blocks
204
+
205
+ - **`text`**: Standard text communication.
206
+ - **`thinking`**: Internal chain-of-thought (CoT) reasoning, kept separate so the UI can hide or show reasoning paths.
207
+
208
+ ### 7.2. Asset Blocks
209
+
210
+ - **`image`**: References a `BlobRef`. Supports `mediaType` and `filename`.
211
+ - **`document`**: Used for PDFs, code snippets, or long-form text files.
212
+
213
+ ### 7.3. Functional Blocks
214
+
215
+ - **`tool_use`**: A structured request for tool execution. Contains `callId`, `name`, and `args`.
216
+ - **`tool_result`**: The output of a tool, linked via `callId`.
217
+
218
+ ### 7.4. Structural Blocks
219
+
220
+ - **`summary`**: A "compression" block. Replaces older history turns.
221
+ - **`role:transition`**: Records that the session persona changed.
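
One way to model the block types listed in this section is a discriminated union on `type`. This is a sketch under assumptions: only the `type` tags and the `callId`, `name`, `args`, `mediaType`, and `filename` fields come from the text, while every other field name (`ref`, `output`, `from`, `to`, `sha256`) and the helper `unresolvedToolCalls` are hypothetical.

```typescript
// Hypothetical field layouts; see the lead-in for which names come from the text.
interface BlobRefSketch {
  sha256: string;
  mediaType: string;
  filename?: string;
}

type ContentBlockSketch =
  | { type: "text"; text: string }
  | { type: "thinking"; text: string }
  | { type: "image"; ref: BlobRefSketch }
  | { type: "document"; ref: BlobRefSketch }
  | { type: "tool_use"; callId: string; name: string; args: Record<string, unknown> }
  | { type: "tool_result"; callId: string; output: string }
  | { type: "summary"; text: string }
  | { type: "role:transition"; from: string; to: string };

// Pair each tool_use with its tool_result through the shared callId
// and report the calls that have no result yet.
function unresolvedToolCalls(blocks: ContentBlockSketch[]): string[] {
  const answered = new Set<string>();
  for (const block of blocks) {
    if (block.type === "tool_result") answered.add(block.callId);
  }
  const pending: string[] = [];
  for (const block of blocks) {
    if (block.type === "tool_use" && !answered.has(block.callId)) {
      pending.push(block.callId);
    }
  }
  return pending;
}

const sampleBlocks: ContentBlockSketch[] = [
  { type: "thinking", text: "I should look up the forecast." },
  { type: "tool_use", callId: "call-1", name: "weather", args: { city: "Tokyo" } },
  { type: "tool_result", callId: "call-1", output: "Sunny" },
  { type: "tool_use", callId: "call-2", name: "weather", args: { city: "Osaka" } },
];
const pendingCalls = unresolvedToolCalls(sampleBlocks);
```

Checking `block.type` narrows the union, so `callId` is only accessible on the two functional block types.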
222
+
223
+ ---
221
224
 
222
- ### Managing Preferences & Roles
225
+ ## 8. Command Reference: The Mutation API Manual
226
+
227
+ ### 8.1. Workspace Domain Commands
228
+
229
+ - **`workspace:create`**: Initializes root metadata.
230
+ - **`workspace:sync`**: Forces a full re-indexing of all stores.
231
+
232
+ ### 8.2. Session Domain Commands
233
+
234
+ | Command | Payload | Side Effects |
235
+ | :------------------------ | :------------------------ | :------------------------------------- |
236
+ | **`session:create`** | `id, role, topics, label` | Creates database record + Index entry. |
237
+ | **`session:fork`** | `sessionId, turnId` | Metadata-only clone of history. |
238
+ | **`session:update`** | `sessionId, updates` | Patch Index + Update store. |
239
+ | **`session:delete`** | `sessionId` | Recursive release of all blob refs. |
240
+ | **`session:role:switch`** | `sessionId, newRole` | Insert transition block + Update role. |
241
+
242
+ ### 8.3. Turn & DAG Domain Commands
243
+
244
+ | Command | Payload | Description |
245
+ | :---------------- | :----------------------------- | :-------------------------------- |
246
+ | **`turn:add`** | `sessionId, turn` | Appends turn. Retains blobs. |
247
+ | **`turn:edit`** | `sessionId, turnId, newBlocks` | New Version in DAG. Moves Head. |
248
+ | **`turn:branch`** | `sessionId, turn` | Diverge from non-head node. |
249
+ | **`turn:delete`** | `sessionId, turnId, version` | Removes Version + Releases blobs. |
250
+
251
+ ### 8.4. Role & Persona Domain Commands
252
+
253
+ - **`role:add`**: Register new persona.
254
+ - **`role:update`**: Modify instructions/constraints.
255
+ - **`role:delete`**: Remove role.
256
+
257
+ ### 8.5. Blob & Asset Domain Commands
258
+
259
+ - **`blob:register`**: Calculate hash + Create storage record + Increment ref-count.
260
+ - **`blob:retain`**: Increment ref-count.
261
+ - **`blob:release`**: Decrement ref-count + Trigger purge if 0.
262
+ - **`blob:record_remote_id`**: Map local hash to remote API ID.
263
+
264
+ ---
265
+
266
+ ## 9. Extension Framework
267
+
268
+ ### 9.1. WorkspaceExtensions
269
+
270
+ A `WorkspaceExtension` packages custom logic (schemas, reducers, and middleware) into a single plugin.
223
271
 
224
272
  ```typescript
225
- // Add a preference
226
- const prefCmd: AddPreference = {
227
- type: 'preference:add',
228
- timestamp: new Date().toISOString(),
229
- payload: {
230
- id: crypto.randomUUID(),
231
- content: 'Always use metric units for measurements.',
232
- topics: ['weather', 'science'],
233
- timestamp: new Date().toISOString()
234
- }
273
+ const MyPlugin: WorkspaceExtension = {
274
+   schemas: [
275
+     /* schemas */
276
+   ],
277
+   reducers: { "custom:cmd": myReducer },
278
+   middleware: [myMiddleware],
235
279
  };
236
- await workspaceManager.dispatch(currentWorkspace, prefCmd);
237
-
238
- // Create a role that uses that preference by default
239
- const roleCmd: AddRole = {
240
- type: 'role:add',
241
- timestamp: new Date().toISOString(),
242
- payload: {
243
- name: 'scientist',
244
- label: 'Science Assistant',
245
- persona: 'You are a helpful science expert.',
246
- preferences: [prefCmd.payload.id]
247
- }
248
- };
249
- await workspaceManager.dispatch(currentWorkspace, roleCmd);
250
280
  ```
251
281
 
282
+ ### 9.2. Middleware: Intercepting the State Stream
283
+
284
+ Middleware acts as the guardian of the workspace: it can abort, augment, or observe commands.
285
+
286
+ ### 9.3. TurnProcessors: Giving the AI Agency
287
+
288
+ A `TurnProcessor` scans AI response blocks for side effects (e.g., auto-saving memories or tasks).
289
+
290
+ ### 9.4. Custom Indexers
291
+
292
+ Custom indexers expand the Read Model with additional read projections.
293
+
294
+ ---
295
+
296
+ ## 10. Persistence Layer: Schema and Collections
297
+
298
+ | Collection | Key | Description |
299
+ | :------------- | :-------------- | :---------------------------- |
300
+ | `workspace` | `id` | Global metadata and settings. |
301
+ | `role` | `name` | Persona instructions. |
302
+ | `preference` | `id` | User instructions. |
303
+ | `context` | `key` | RAG snippets. |
304
+ | `session` | `id` | Session metadata + Head. |
305
+ | `turn` | `(id, version)` | Immutable history records. |
306
+ | `blob_records` | `sha256` | Metadata + RefCount. |
307
+
308
+ ---
309
+
310
+ ## 11. Implementation Guide
252
311
 
253
- ### Branching a Conversation
312
+ ### 11.1. Implementing a Database Boundary
254
313
 
255
314
  ```typescript
256
- // Get the current head turn
257
- const head = session.head(currentWorkspace);
258
- if (head) {
259
- // Create a branch from that turn
260
- const branchTurn = new TurnBuilder('assistant')
261
- .withParent(head) // explicitly set parent
262
- .addText('Let me think differently...')
263
- .build();
264
- const branchResult = await session.branch(currentWorkspace, branchTurn);
265
- currentWorkspace = merge(currentWorkspace, branchResult.value);
315
+ class MyDb implements WorkspaceDatabase {
316
+   async collection(name: string) {
317
+     /* ... */
318
+   }
319
+   async open(schemas: SchemaDefinition[]) {
320
+     /* ... */
321
+   }
266
322
  }
267
323
  ```
268
324
 
269
- ### Custom Prompt Assembly
270
-
271
- The SDK includes a default token planner and Jaccard context retriever, but you can replace them:
325
+ ### 11.2. Implementing a BlobStorage Boundary
272
326
 
273
327
  ```typescript
274
- class MyCustomRetriever implements ContextRetriever {
275
- rank(input: ContextRankingInput): Context[] {
276
- // your ranking logic
328
+ class MyStorage implements BlobStorage {
329
+   async put(data: Uint8Array): Promise<string> {
330
+     /* return sha256 */
331
+   }
332
+   async get(sha256: string): Promise<Uint8Array | null> {
333
+     /* ... */
334
+   }
335
+   async delete(sha256: string): Promise<void> {
336
+     /* ... */
277
337
  }
278
338
  }
339
+ ```
279
340
 
280
- const promptBuilder = new PromptBuilder({
281
- blobResolver: contentStore.getBlobResolver(),
282
- retriever: new MyCustomRetriever(),
283
- planner: new MyTokenPlanner()
341
+ ### 11.3. Bootstrapping the Manager
342
+
343
+ ```typescript
344
+ const { manager, sessions, ctx } = await createWorkspace({
345
+   db: new MyDb(),
346
+   blobStorage: new MyStorage(),
347
+   getWorkspace: () => state,
348
+   setWorkspace: (patch) => {
349
+     state = merge(state, patch);
350
+   },
351
+   processor: new MyProcessor(),
284
352
  });
285
353
  ```
286
354
 
287
355
  ---
288
356
 
289
- ## Project Architecture
290
-
291
- ### Core Components
292
-
293
- | Component | Responsibility |
294
- |-----------|----------------|
295
- | `WorkspaceManager` | Dispatches commands, applies reducer, coordinates side effects (persistence, blob ops). |
296
- | `ContentStore` | Session‑aware read/write layer with LRU caches for roles, preferences, context. Owns `TurnTree` and `BlobStore`. |
297
- | `TurnTree` | Manages the turn DAG – persistence, head pointer, branching, version switching, subtree deletion. |
298
- | `BlobStore` | Content‑addressed blob registry (SHA‑256) with reference counting, remote ID mapping, and eviction. |
299
- | `Session` | Thin coordinator for a single session – validates existence, delegates to `TurnTree` for reads, forwards write commands to `WorkspaceManager`. |
300
- | `SessionManager` | Creates, opens, and lists sessions. Returns `Session` objects (stateless, lazy loading). |
301
- | `PromptBuilder` | Assembles prompts from an `EffectiveSession` using a `ContextRetriever`, `TokenPlanner`, and optional `Summarizer`. |
302
- | `Reducer` | Pure function that validates commands and produces `DeepPartial<Workspace>` patches for the in‑memory index. |
303
-
304
- ### Data Flow
305
-
306
- ```mermaid
307
- graph LR
308
- A[UI / Caller] -->|Command| B[WorkspaceManager]
309
- B -->|Validate & Patch| C[Reducer]
310
- C -->|Patch| D[In‑Memory Workspace]
311
- B -->|Side Effects| E[ContentStore]
312
- E --> F[(Database)]
313
- E --> G[(Blob Storage)]
314
- B -->|Emit Event| H[EventBus]
315
- H -->|Notify| A
316
- ```
357
+ ## 12. Design Decisions
358
+
359
+ ### Why a DAG?
317
360
 
318
- ### Extension Points
361
+ A tree allows branching, but a DAG also allows for **Merging** and **Non-Destructive Versioning**. It accurately represents how a conversation evolves into multiple "what-if" scenarios.
319
362
 
320
- - **`Summarizer`** – Compress transcripts when token budget is tight.
321
- - **`ContextRetriever`** – Custom ranking of context entries (e.g., vector similarity).
322
- - **`TokenPlanner`** – Decide which turns/preferences/context fit into a budget.
323
- - **`ToolRegistry`** – Register callable tools; `tool:call` commands will execute them.
324
- - **`PermissionGuard`** – Intercept commands/tool calls for authentication/authorization.
325
- - **`BlobStorage`** – Implement your own backend (S3, OPFS, etc.) by conforming to the interface.
363
+ ### Why SHA-256?
364
+
365
+ UUIDs lead to "Asset Proliferation." SHA-256 enables **Global Deduplication**, critical for reducing storage in RAG-heavy apps.
366
+
367
+ ### Why CQRS?
368
+
369
+ Read patterns (building prompts) are vastly different from Write patterns (saving turns). CQRS optimizes both independently.
326
370
 
327
371
  ---
328
372
 
329
- ## Development & Contributing
373
+ ## 13. Comparison with Industry Standards
330
374
 
331
- ### Development Setup
375
+ - **vs. OpenAI Threads**: OpenAI threads are black boxes. Our Workspace provides full local control and DAG navigation.
376
+ - **vs. LangChain Memory**: LangChain memory is often ephemeral. Our Workspace is a persistent, repo-level database.
332
377
 
333
- ```bash
334
- git clone https://github.com/asaidimu/erp-utils.git
335
- cd erp-utils/src/workspace
336
- npm install
337
- npm run build
338
- ```
378
+ ---
379
+
380
+ ## 14. Advanced Recipes & Design Patterns
381
+
382
+ ### The "Shadow Turn" Pattern
383
+
384
+ Intercept `turn:add` and inject a hidden system turn before the user message for real-time situational awareness.
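A minimal sketch of the pattern, not the package's implementation. It assumes a middleware is a function that receives a command and a `next` callback; the `Command`, `Middleware`, and `MiniDispatcher` names are hypothetical stand-ins, and the real `use(middleware)` signature may differ (it is likely asynchronous; this sketch is synchronous for brevity).

```typescript
// Hypothetical command shape; the package's real command types are not shown in this README.
type Block = { type: "text"; text: string };
type Command = {
  type: string;
  payload: { role: "system" | "user"; blocks: Block[]; hidden?: boolean };
};
type Next = (command: Command) => void;
type Middleware = (command: Command, next: Next) => void;

// Stand-in dispatcher: runs middleware in registration order, then records the command.
class MiniDispatcher {
  readonly log: Command[] = [];
  private readonly chain: Middleware[] = [];

  use(middleware: Middleware): void {
    this.chain.push(middleware);
  }

  dispatch(command: Command): void {
    this.run(0, command);
  }

  private run(index: number, command: Command): void {
    const middleware = this.chain[index];
    if (!middleware) {
      // End of the chain: this is where a real manager would apply the command.
      this.log.push(command);
      return;
    }
    middleware(command, (c) => this.run(index + 1, c));
  }
}

// Shadow turn middleware: emits a hidden system turn before every user turn.
function shadowTurn(describeSituation: () => string): Middleware {
  return (command, next) => {
    if (command.type === "turn:add" && command.payload.role === "user") {
      next({
        type: "turn:add",
        payload: {
          role: "system",
          hidden: true,
          blocks: [{ type: "text", text: describeSituation() }],
        },
      });
    }
    // The original command always continues down the chain unchanged.
    next(command);
  };
}
```

Because the injected turn has the `system` role, it does not trigger the middleware again, so the interception cannot recurse.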
339
385
 
340
- ### Available Scripts
386
+ ### Automatic Topic Discovery
341
387
 
342
- | Script | Description |
343
- |--------|-------------|
344
- | `npm test` | Run unit tests once (Vitest) |
345
- | `npm run test:watch` | Run tests in watch mode |
346
- | `npm run test:browser` | Run tests in a browser environment |
347
- | `npm run build` | Compile TypeScript to `dist/` |
388
+ Use a cheap model to scan every user turn and suggest new topics to refine RAG retrieval dynamically.
348
389
 
349
- ### Testing
390
+ ---
391
+
392
+ ## 15. Technical Specifications
393
+
394
+ ### Blob Hash Protocol
395
+
396
+ SHA-256 hex digest -> stored under `blobs/{sha256}`. RefCount = 0 triggers deletion.
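The protocol above can be sketched as a small in-memory store. This is an illustration under stated assumptions: `RefCountedBlobStore`, `retain`, and `release` are hypothetical names, and the package's actual `BlobStorage` interface may look different. Only `createHash` from Node's `crypto` module is a real API here.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory store illustrating content addressing plus reference counting.
class RefCountedBlobStore {
  private readonly blobs = new Map<string, { bytes: Uint8Array; refCount: number }>();

  // The key is derived from the content: SHA-256 hex digest under the `blobs/` prefix.
  static keyFor(bytes: Uint8Array): string {
    return `blobs/${createHash("sha256").update(bytes).digest("hex")}`;
  }

  // Storing identical bytes again reuses the entry and increments its reference count.
  retain(bytes: Uint8Array): string {
    const key = RefCountedBlobStore.keyFor(bytes);
    const entry = this.blobs.get(key);
    if (entry) {
      entry.refCount += 1;
    } else {
      this.blobs.set(key, { bytes, refCount: 1 });
    }
    return key;
  }

  // Dropping the last reference (refCount reaches 0) deletes the bytes.
  release(key: string): void {
    const entry = this.blobs.get(key);
    if (!entry) return;
    entry.refCount -= 1;
    if (entry.refCount === 0) this.blobs.delete(key);
  }

  has(key: string): boolean {
    return this.blobs.has(key);
  }

  get size(): number {
    return this.blobs.size;
  }
}
```

Two turns that attach the same file end up sharing one entry, and the bytes disappear only after both references are released.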
397
+
398
+ ### Index Sync Protocol
399
+
400
+ Full re-build of Index via parallel execution of all registered `Indexers`.
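A rough sketch of what a parallel rebuild could look like. The `Indexer` contract and `rebuildIndex` function are assumptions made for illustration; the README does not show the real indexer interface.

```typescript
// Hypothetical indexer contract: each indexer rebuilds one named slice of the index from scratch.
type Index = Record<string, unknown>;

interface Indexer {
  name: string;
  build(): Promise<unknown>;
}

// Rebuilds the whole index by starting every indexer at once and waiting for all of them.
async function rebuildIndex(indexers: Indexer[]): Promise<Index> {
  const slices = await Promise.all(
    indexers.map(async (indexer) => ({ name: indexer.name, slice: await indexer.build() })),
  );

  // Assemble a fresh index; nothing from the previous index is carried over.
  const index: Index = {};
  for (const { name, slice } of slices) {
    index[name] = slice;
  }
  return index;
}
```

`Promise.all` rejects as soon as one indexer fails, so in this sketch a failed rebuild yields no partial index. `Promise.allSettled` would be the alternative if partial results were acceptable.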
401
+
402
+ ---
350
403
 
351
- Tests are written with [Vitest](https://vitest.dev/) and cover reducers, turn tree operations, blob reference counting, and prompt assembly. Run `npm test` to execute the suite. We aim for >85% coverage on core modules.
404
+ ## 16. Troubleshooting
352
405
 
353
- ### Contributing Guidelines
406
+ ### Stale Index
354
407
 
355
- 1. **Fork the repository** and create a feature branch.
356
- 2. **Follow the existing code style** (Prettier + ESLint configured).
357
- 3. **Write tests** for any new functionality or bug fixes.
358
- 4. **Commit messages** should follow [Conventional Commits](https://www.conventionalcommits.org/) (e.g., `feat: add branch info API`).
359
- 5. **Open a pull request** against the `main` branch. Include a clear description and link to any related issue.
408
+ Ensure reducers return the correct patch. Call `workspace:sync` to force a rebuild.
360
409
 
361
- ### Issue Reporting
410
+ ### Blob Reference Leak
362
411
 
363
- Report bugs or request features via [GitHub Issues](https://github.com/asaidimu/erp-utils/issues). Please include:
364
- - Library version
365
- - Minimal code reproduction
366
- - Expected vs actual behavior
412
+ Use `session:delete` to ensure all blobs in the session DAG are released.
367
413
 
368
414
  ---
369
415
 
370
- ## Additional Information
416
+ ## 17. Testing Strategy
371
417
 
372
- ### Troubleshooting
418
+ ### Integration Testing the DAG
373
419
 
374
- | Issue | Likely Solution |
375
- |-------|----------------|
376
- | `Blob bytes not found locally` | The blob was evicted because `refCount` reached zero and `eagerEviction` is true. Either re‑register the blob or disable eager eviction. |
377
- | `Cannot delete role still referenced by sessions` | Change all sessions using that role to another role first, or delete the sessions. |
378
- | `Turn not found when editing` | Ensure the turn ID and version exist. Use `session.turns()` to list current nodes. |
379
- | `Prompt assembly drops all context` | Check your `tokenBudget.total` and increase it, or provide a custom `estimator` that returns smaller token counts. |
420
+ ```typescript
421
+ test('can branch history', async () => {
422
+ const session = await sessions.create({ ... });
423
+ await session.addTurn({ blocks: [{ type: 'text', text: 'Prompt 1' }] });
424
+ const turn1Id = session.head()!.id;
425
+ await session.editTurn(turn1Id, [{ type: 'text', text: 'Edited Prompt 1' }]);
426
+ // Verify both versions exist and head points to version 2
427
+ });
428
+ ```
429
+
430
+ ### Unit Testing Reducers
380
431
 
381
- ### FAQ
432
+ Reducers are pure-ish and can be tested in isolation by providing a mock `WorkspaceContext`.
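As an illustration of that isolation, the sketch below tests a made-up reducer against a hand-built context. Everything here is an assumption: the `role:add` command, `addRoleReducer`, and the shapes of `Workspace` and `WorkspaceContext` are hypothetical and only mirror the idea that a reducer validates a command and returns a `DeepPartial<Workspace>` patch.

```typescript
// Hypothetical shapes; the package's real Workspace, WorkspaceContext, and commands may differ.
type DeepPartial<T> = { [K in keyof T]?: T[K] extends object ? DeepPartial<T[K]> : T[K] };

interface Workspace {
  roles: Record<string, { label: string }>;
}

interface WorkspaceContext {
  workspace: Workspace;
}

type AddRole = { type: "role:add"; payload: { id: string; label: string } };

// Validates the command, then returns a patch instead of mutating the workspace.
function addRoleReducer(ctx: WorkspaceContext, command: AddRole): DeepPartial<Workspace> {
  const { id, label } = command.payload;
  if (ctx.workspace.roles[id]) {
    throw new Error(`Role already exists: ${id}`);
  }
  return { roles: { [id]: { label } } };
}

// A mock context is just a plain object; no database or event bus is required.
const mockContext: WorkspaceContext = {
  workspace: { roles: { writer: { label: "Writer" } } },
};

const patch = addRoleReducer(mockContext, {
  type: "role:add",
  payload: { id: "editor", label: "Editor" },
});
```

Because the reducer only reads from the context and returns a patch, the assertions reduce to comparing plain objects.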
382
433
 
383
- **Q: How do I handle large files (e.g., 100MB videos)?**
384
- A: The blob storage is content‑addressed, so each unique file is stored once. Use `registerBlob` to add it, then reference it via `BlobRef`. For very large files, consider implementing a streaming backend or using remote IDs (e.g., upload to S3 and store the fileId via `recordRemoteId`).
434
+ ---
435
+
436
+ ## 18. Glossary of Terms
385
437
 
386
- **Q: Can I use this library in a browser with IndexedDB?**
387
- A: Yes. Use `IndexedDBBlobStorage` and `createWorkspaceDatabase` with an IndexedDB adapter. The turn DAG and all entities are persisted locally.
438
+ - **`Head`**: The current active turn.
439
+ - **`Transcript`**: Flattened history for an LLM request.
440
+ - **`Persona`**: System instructions for a Role.
441
+ - **`Topic`**: Semantic tag for RAG connectivity.
442
+ - **`Reducer`**: Command handler for state mutation.
388
443
 
389
- **Q: How do I migrate from a simple message array?**
390
- A: Create a session, then add turns sequentially using `session.addTurn`. The `parent` field will be set automatically if you use `addTurn` (it links to the current head). For branching, you can also set `parent` manually.
444
+ ---
391
445
 
392
- **Q: Does the SDK support real‑time collaboration?**
393
- A: The command/reducer pattern is well‑suited for CRDTs or operational transforms. The workspace events (`workspace:changed`) can be broadcast to other clients, and commands can be merged. We plan to add built‑in sync in a future version.
446
+ ## 19. API Reference Appendix
394
447
 
395
- ### Changelog & Roadmap
448
+ ### WorkspaceManager
396
449
 
397
- See [CHANGELOG.md](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/CHANGELOG.md) for version history.
398
- Planned features:
399
- - Built‑in summarizer using LLM calls
400
- - Vector store integration for semantic context retrieval
401
- - Real‑time collaboration via WebSocket transport
450
+ - `dispatch(command)`: Atomic mutation.
451
+ - `workspace()`: Sync Index access.
452
+ - `use(middleware)`: Register interceptor.
402
453
 
403
- ### License
454
+ ### Session
404
455
 
405
- MIT © [Saidimu](https://github.com/asaidimu). See [LICENSE](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/LICENSE) for full text.
456
+ - `snapshot()`: Prepare prompt data.
457
+ - `addTurn(blocks)`: Append to head.
458
+ - `branchInfo(turnId)`: Retrieve all versions.
406
459
 
407
- ### Acknowledgments
460
+ ---
408
461
 
409
- Built on:
410
- - [`@asaidimu/anansi`](https://github.com/asaidimu/anansi) – schema‑based document store
411
- - [`@asaidimu/events`](https://github.com/asaidimu/events) – typed event bus
412
- - [`@asaidimu/utils-database`](https://github.com/asaidimu/erp-utils/tree/main/src/database) – database abstraction layer
413
- - [`uuid`](https://github.com/uuidjs/uuid) – UUID generation
462
+ ## 20. License
414
463
 
415
- Inspired by modern AI orchestration frameworks and offline‑first application patterns.
464
+ MIT © [Saidimu](https://github.com/asaidimu)