@asaidimu/utils-workspace 6.0.0 → 6.1.0

package/README.md CHANGED
@@ -1,415 +1,395 @@
1
- # AI Workspace SDK
1
+ # @asaidimu/utils-workspace
2
2
 
3
- **Content‑addressed workspace and conversation management for AI applications.**
3
+ > **The Architectural Foundation for Production-Grade AI Workspaces.**
4
4
 
5
- [![npm version](https://img.shields.io/npm/v/@asaidimu/utils-workspace)](https://www.npmjs.com/package/@asaidimu/utils-workspace)
6
- [![license](https://img.shields.io/npm/l/@asaidimu/utils-workspace)](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/LICENSE)
7
- [![build status](https://img.shields.io/github/actions/workflow/status/asaidimu/erp-utils/ci.yml?branch=main)](https://github.com/asaidimu/erp-utils/actions)
5
+ ---
8
6
 
9
7
  ## 📖 Table of Contents
10
8
 
11
- - [Overview & Features](#overview--features)
12
- - [Installation & Setup](#installation--setup)
13
- - [Usage Documentation](#usage-documentation)
14
- - [Basic Conversation Flow](#basic-conversation-flow)
15
- - [Turn Versioning & Editing](#turn-versioning--editing)
16
- - [Working with Blobs (Images & Documents)](#working-with-blobs-images--documents)
17
- - [Managing Preferences & Roles](#managing-preferences--roles)
18
- - [Branching a Conversation](#branching-a-conversation)
19
- - [Custom Prompt Assembly](#custom-prompt-assembly)
20
- - [Project Architecture](#project-architecture)
21
- - [Development & Contributing](#development--contributing)
22
- - [Additional Information](#additional-information)
9
+ 1. [**1. Introduction: The Evolution of Conversation State**](#1-introduction)
10
+ 2. [**2. Mental Model: Conversation as a Directed Acyclic Graph (DAG)**](#2-mental-model)
11
+ 3. [**3. Core Design Philosophy: The Four Pillars of AI State**](#3-philosophy)
12
+ 4. [**4. System Architecture: The "Engine Under the Hood"**](#4-architecture)
13
+ * [4.1. CQRS: Decoupling Write and Read Paths](#41-cqrs)
14
+ * [4.2. The Index: Reactive Read Projections](#42-the-index)
15
+ * [4.3. The Sequential Serializer: Race Condition Prevention](#43-serializer)
16
+ * [4.4. Event Bus and Reactivity](#44-event-bus)
17
+ 5. [**5. Domain Entities: Deep Dive into the State**](#5-entities)
18
+ * [5.1. Sessions: The Contextual Roots](#51-sessions)
19
+ * [5.2. Turns: The Atoms of Hyper-History](#52-turns)
20
+ * [5.3. Blobs: Deduplicated, Content-Addressed Asset Management](#53-blobs)
21
+ * [5.4. Roles: Persistent Personas and AI Identity](#54-roles)
22
+ * [5.5. Preferences & Context: The Multi-Tier Memory Layer](#55-memory)
23
+ 6. [**6. The Prompt Pipeline: Anatomy of a Request**](#6-pipeline)
24
+ * [6.1. Stage 1: DAG Snapshotting and Ancestry Walking](#61-snapshotting)
25
+ * [6.2. Stage 2: Semantic Retrieval and Topic Alignment](#62-retrieval)
26
+ * [6.3. Stage 3: Conflict Resolution in Instructions](#63-conflict-resolution)
27
+ * [6.4. Stage 4: Multi-Modal Blob Resolution Protocols](#64-resolution)
28
+ * [6.5. Stage 5: Provider-Specific Adapter Mapping](#65-mapping)
29
+ 7. [**7. The Content Block Specification: Standardized Interoperability**](#7-blocks)
30
+ * [7.1. Basic Blocks: Text and Thinking](#71-basic-blocks)
31
+ * [7.2. Asset Blocks: Image and Document](#72-asset-blocks)
32
+ * [7.3. Functional Blocks: ToolUse and ToolResult](#73-functional-blocks)
33
+ * [7.4. Structural Blocks: Summary and RoleTransition](#74-structural-blocks)
34
+ 8. [**8. Command Reference: The Mutation API Manual**](#8-commands)
35
+ * [8.1. Workspace Domain Commands](#81-workspace-cmds)
36
+ * [8.2. Session Domain Commands](#82-session-cmds)
37
+ * [8.3. Turn & DAG Domain Commands](#83-turn-cmds)
38
+ * [8.4. Role & Persona Domain Commands](#84-role-cmds)
39
+ * [8.5. Blob & Asset Domain Commands](#85-blob-cmds)
40
+ * [8.6. Resource (Context/Preference) Domain Commands](#86-resource-cmds)
41
+ 9. [**9. Extension Framework: Building the Plugin Ecosystem**](#9-extension)
42
+ * [9.1. WorkspaceExtensions: Packaging Domain Logic](#91-extensions)
43
+ * [9.2. Middleware: The Request Interceptor Pipeline](#92-middleware)
44
+ * [9.3. TurnProcessors: Giving the AI Agency](#93-turnprocessors)
45
+ * [9.4. Custom Indexers: Expanding the Read Model](#94-indexers)
46
+ 10. [**10. Persistence Layer: Schema and Collections**](#10-persistence)
47
+ 11. [**11. Implementation Guide: Building Your Workspace**](#11-implementation)
48
+ * [11.1. Implementing a Database Boundary](#111-db-impl)
49
+ * [11.2. Implementing a BlobStorage Boundary](#112-blob-impl)
50
+ * [11.3. Bootstrapping the Manager](#113-bootstrap)
51
+ 12. [**12. Design Decisions: The "Why" Behind the "What"**](#12-decisions)
52
+ 13. [**13. Comparison with Industry Standards**](#13-comparison)
53
+ 14. [**14. Advanced Recipes & Design Patterns**](#14-recipes)
54
+ 15. [**15. Technical Specifications: Protocols & Formats**](#15-spec)
55
+ 16. [**16. Troubleshooting & Common Failure Modes**](#16-troubleshooting)
56
+ 17. [**17. Testing Strategy: Ensuring Workspace Integrity**](#17-testing)
57
+ 18. [**18. Glossary of Terms**](#18-glossary)
58
+ 19. [**19. API Reference Appendix**](#19-api-ref)
59
+ 20. [**20. License**](#20-license)
23
60
 
24
61
  ---
25
62
 
26
- ## Overview & Features
63
+ ## 1. Introduction: The Evolution of Conversation State
27
64
 
28
- The AI Workspace SDK provides a robust, offline‑first foundation for building AI‑powered applications. It models conversation sessions as directed acyclic graphs (DAGs) of turns, supports content‑addressed binary blobs (images, documents), and includes a flexible prompt assembly pipeline that respects token budgets, relevance scoring, and user preferences.
65
+ In the first wave of LLM integration, "conversation" was modeled as a disposable buffer—a linear array of messages passed to an endpoint. While sufficient for simple chatbots, this model is fundamentally incompatible with **Professional AI Workspaces**.
29
66
 
30
- Unlike simple chat history arrays, this library treats conversations as versioned, branchable graphs enabling features like turn editing, branching conversations, and time‑travel navigation. All state changes go through a command/reducer pattern, making it easy to implement undo/redo, sync with backends, or build collaborative editors.
67
+ A workspace isn't just a chat; it's a collaborative environment where AI agents act on artifacts, remember user preferences across months, navigate complex branching decision trees, and handle multi-gigabyte document libraries without breaking a sweat.
31
68
 
32
- ### Key Features
69
+ `@asaidimu/utils-workspace` is the state engine designed for this complexity. It provides a high-integrity, content-addressed foundation that treats:
70
+ - **Conversation as Graph Architecture**: Enabling non-destructive editing and infinite branching.
71
+ - **Data as Content-Addressed Assets**: Eliminating redundancy and ensuring asset integrity.
72
+ - **State as Observable Commands**: Providing a perfect audit trail and reactive UI projections.
33
73
 
34
- - **Session‑based conversation DAG** – Turns are stored as versioned nodes with parent pointers. Branch, edit, or delete turns without losing history.
35
- - **Turn versioning** – Each edit creates a new version of a turn; users can switch between versions or view the version history.
36
- - **Content‑addressed blobs** – Binary data (images, PDFs, etc.) stored by SHA‑256, deduplicated automatically, with reference counting and remote ID mapping.
37
- - **Token‑aware prompt assembly** – Plan prompts under token budgets, rank context by relevance (Jaccard + freshness), truncate gracefully, and inject summaries.
38
- - **Role & preference system** – Each session has a role (persona + system prompt) and can override preference defaults per topic.
39
- - **Task management** – First‑class task entities with steps, status, and topic linking – integrate with `task:proposal` blocks.
40
- - **Pluggable storage** – Use IndexedDB (browser), Memory (testing/server), or implement your own backend via the `Database` and `BlobStorage` interfaces.
41
- - **Command/reducer architecture** – All mutations are commands that produce patches; perfect for reactive UIs and event sourcing.
74
+ If you are building an IDE, a research lab, or an autonomous agent platform, this is the engine you've been looking for.
42
75
 
43
76
  ---
44
77
 
45
- ## Installation & Setup
78
+ ## 2. Mental Model: Conversation as a Directed Acyclic Graph (DAG)
46
79
 
47
- ### Prerequisites
80
+ Most developers think of chat history as an `Array<Message>`. In this framework, we discard that model in favor of a **Directed Acyclic Graph (DAG)**.
48
81
 
49
- - Node.js 18+ or modern browser
50
- - npm or bun
82
+ ### 2.1. The Hyper-History Concept
83
+ In a standard chat, each message points to the one before it. In our `TurnTree` model, we introduce a layer of abstraction between the "Sequence" and the "Content."
51
84
 
52
- ### Installation
85
+ - **TurnNode**: Represents a logical "moment" or "slot" in the conversation timeline. It is a stable identifier that does not change.
86
+ - **Turn**: Represents the actual data (text, images, tool results) at that moment.
53
87
 
54
- Install the workspace package and its required peer dependency `@asaidimu/utils-database`:
88
+ By separating these, we enable **Non-Destructive Editing**. When a user "edits" a prompt at TurnNode X, we don't overwrite the existing record. We create `Turn (Version 2)` under `TurnNode X`, so the graph now has two possible paths forward from X's parent.
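The split between a stable slot and its versioned content can be sketched in a few lines. This is a minimal illustration, not the package's implementation: the `TurnNode` and `Turn` shapes, the `active` index, the 1-based version numbering, and the `editTurn` helper are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the package's real types may differ.
interface Turn {
  version: number; // assumed to start at 1
  text: string;
}

interface TurnNode {
  id: string;              // stable slot identifier, never changes
  parentId: string | null; // previous slot in the timeline
  versions: Turn[];        // append-only list of content versions
  active: number;          // index of the version currently selected
}

// "Editing" appends a new version and returns a new node value.
// Existing versions are never modified or removed.
function editTurn(node: TurnNode, text: string): TurnNode {
  const next: Turn = { version: node.versions.length + 1, text };
  const versions = [...node.versions, next];
  return { ...node, versions, active: versions.length - 1 };
}

const original: TurnNode = {
  id: "node-x",
  parentId: "node-w",
  versions: [{ version: 1, text: "What is the weather like today?" }],
  active: 0,
};

const edited = editTurn(original, "What is the weather like in Tokyo?");
```

Both versions remain reachable under the same node id, which is what lets a UI offer version switching without rewriting history.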
55
89
 
56
- ```bash
57
- npm install @asaidimu/utils-workspace @asaidimu/utils-database
58
- # or
59
- bun add @asaidimu/utils-workspace @asaidimu/utils-database
60
- ```
90
+ ### 2.2. Forking and Divergent Timelines
91
+ Sessions are merely pointers to a "Head" in the global graph of TurnNodes. This means "Forking" a session is an $O(1)$ metadata operation. You can create 1,000 different sessions that all share the same first 50 turns. This is critical for:
92
+ - **A/B Testing**: Testing how different model instructions affect responses to the same conversation.
93
+ - **Collaborative Branching**: Allowing multiple team members to explore different paths from a shared starting point.
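As a rough sketch of why forking can be a constant-time operation: if a session only holds a pointer into the shared graph, a fork copies that pointer and nothing else. The `Session` and `TurnRef` shapes and the `forkSession` helper below are hypothetical and only illustrate the idea.

```typescript
// Hypothetical shapes; only the "session = pointer to a head" idea comes from the text.
interface TurnRef {
  id: string;
  version: number;
}

interface Session {
  id: string;
  label: string;
  head: TurnRef | null; // pointer into the shared graph of turns
}

// Forking copies metadata and one pointer. No turns are copied,
// so the cost does not depend on the length of the history.
function forkSession(source: Session, newId: string, at: TurnRef | null = source.head): Session {
  return {
    id: newId,
    label: `${source.label} (fork)`,
    head: at ? { ...at } : null,
  };
}

const main: Session = { id: "s-1", label: "Research", head: { id: "turn-50", version: 1 } };
const fork = forkSession(main, "s-2");

// Advancing the fork leaves the source session untouched.
fork.head = { id: "turn-51", version: 1 };
```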
61
94
 
62
- The library is written in TypeScript and ships with its own type definitions – no additional `@types/` packages required.
95
+ ---
63
96
 
64
- ### Basic Configuration
97
+ ## 3. Core Design Philosophy: The Four Pillars of AI State
65
98
 
66
- ```typescript
67
- import { createEventBus } from '@asaidimu/events';
68
- import { DatabaseConnection, createEphemeralStore } from '@asaidimu/utils-database';
69
- import {
70
- ContentStore,
71
- createWorkspaceDatabase,
72
- MemoryBlobStorage,
73
- WorkspaceManager,
74
- SessionManager,
75
- createSimpleWorkspace,
76
- } from '@asaidimu/utils-workspace';
77
-
78
- // 1. Setup database and blob storage
79
- const db = createWorkspaceDatabase(
80
- await DatabaseConnection(
81
- { database: 'my-app', validate: true, predicates: {} },
82
- createEphemeralStore()
83
- )
84
- );
85
- const eventBus = createEventBus();
86
- const blobStorage = new MemoryBlobStorage(); // or IndexedDBBlobStorage for browsers
87
-
88
- // 2. Create core components
89
- const contentStore = await ContentStore.create(db, blobStorage, eventBus);
90
- const workspaceManager = new WorkspaceManager({ contentStore, eventBus });
91
- const sessionManager = new SessionManager(workspaceManager, contentStore);
92
-
93
- // 3. Initial workspace state
94
- let currentWorkspace = createSimpleWorkspace({ name: 'My Workspace', language: 'en', actor: 'user' });
95
- ```
99
+ ### 3.1. Immutability and Versioning
100
+ We believe that AI history should be a permanent record. Every AI response and user prompt is immutable. To "change" the past, you must create a new branch. This ensures that the "reasoning path" of an agent is always recoverable.
96
101
 
97
- ### Verification
102
+ ### 3.2. Content-Addressability (SHA-256)
103
+ Assets (images, PDFs, source code) are identified by the cryptographic hash of their contents. This solves the "State Bloat" problem. If 100 sessions all reference the same 5MB system manual, the workspace only stores one copy.
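A minimal sketch of content addressing, assuming a Node.js runtime and its built-in `node:crypto` module. The `ContentAddressedMap` class is hypothetical and keeps blobs in memory; it only shows how keying by SHA-256 makes identical content collapse into one entry.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory store keyed by the SHA-256 hex digest of the content.
class ContentAddressedMap {
  private readonly blobs = new Map<string, Uint8Array>();

  // Returns the content address. Identical bytes always map to the same key,
  // so a second put of the same content stores nothing new.
  put(data: Uint8Array): string {
    const key = createHash("sha256").update(data).digest("hex");
    if (!this.blobs.has(key)) {
      this.blobs.set(key, data);
    }
    return key;
  }

  get(key: string): Uint8Array | undefined {
    return this.blobs.get(key);
  }

  get size(): number {
    return this.blobs.size;
  }
}

const store = new ContentAddressedMap();
const manual = new TextEncoder().encode("system manual contents");

const firstKey = store.put(manual);
const secondKey = store.put(new TextEncoder().encode("system manual contents"));
```

Here `firstKey` and `secondKey` are equal and the store holds a single entry, regardless of how many sessions reference the asset.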
98
104
 
99
- Run a quick smoke test:
105
+ ### 3.3. Command-Driven Mutations (CQRS)
106
+ We strictly follow the **Command Query Responsibility Segregation (CQRS)** pattern.
107
+ - **Commands**: Describe intent (e.g., `session:fork`).
108
+ - **Reducers**: Pure-ish functions that execute the intent against the persistent stores.
109
+ - **Projections**: In-memory read models (The Index) that allow for sub-millisecond retrieval of workspace state.
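The command, reducer, and projection roles listed above can be sketched as follows. The command names come from this README, but the payloads are trimmed and the `WorkspaceIndex`, `Patch`, `reduce`, and `applyPatch` names are assumptions for illustration. Persistence is left out, so the reducer here is fully pure.

```typescript
// Hypothetical read model: a projection of sessions held in memory.
interface SessionEntry {
  label: string;
  topics: string[];
}
interface WorkspaceIndex {
  sessions: Record<string, SessionEntry>;
}
type Patch = { sessions: Record<string, SessionEntry> };

// Commands describe intent. Payloads are simplified for the example.
type Command =
  | { type: "session:create"; payload: { id: string; label: string; topics: string[] } }
  | { type: "session:update"; payload: { sessionId: string; updates: Partial<SessionEntry> } };

// The reducer turns a command into a patch without touching the index.
function reduce(index: WorkspaceIndex, cmd: Command): Patch {
  switch (cmd.type) {
    case "session:create": {
      const { id, label, topics } = cmd.payload;
      return { sessions: { [id]: { label, topics } } };
    }
    case "session:update": {
      const { sessionId, updates } = cmd.payload;
      const existing = index.sessions[sessionId];
      if (!existing) throw new Error(`Unknown session: ${sessionId}`);
      return { sessions: { [sessionId]: { ...existing, ...updates } } };
    }
  }
}

// Applying a patch produces the next version of the projection.
function applyPatch(index: WorkspaceIndex, patch: Patch): WorkspaceIndex {
  return { sessions: { ...index.sessions, ...patch.sessions } };
}

const empty: WorkspaceIndex = { sessions: {} };
const created = applyPatch(
  empty,
  reduce(empty, { type: "session:create", payload: { id: "s-1", label: "Chat", topics: ["weather"] } }),
);
const renamed = applyPatch(
  created,
  reduce(created, { type: "session:update", payload: { sessionId: "s-1", updates: { label: "Weather chat" } } }),
);
```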
100
110
 
101
- ```typescript
102
- const createResult = await sessionManager.create(currentWorkspace, { label: 'Test Session' });
103
- if (createResult.ok) {
104
- console.log('Session created:', createResult.value.session.id());
105
- }
106
- ```
111
+ ### 3.4. Portability and LLM Agnosticism
112
+ The workspace core does not know what "OpenAI" or "Gemini" is. It operates on an abstract `Prompt` model. This allows you to build your entire application logic once and swap between different LLM providers just by changing the `LLMAdapter`.
107
113
 
108
114
  ---
109
115
 
110
- ## Usage Documentation
116
+ ## 4. System Architecture: The "Engine Under the Hood"
111
117
 
112
- ### Basic Conversation Flow
118
+ ### 4.1. Write Model: Atomic Entity Stores
119
+ The library organizes data into atomic "Stores." Each store is responsible for a single domain entity (e.g., `RoleStore`, `SessionStore`). These stores are designed to be backed by any persistent database via a standardized `Database` boundary.
113
120
 
114
- ```typescript
115
- import { TurnBuilder, merge } from '@asaidimu/utils-workspace';
116
-
117
- // Create a session
118
- const { session, patch } = (await sessionManager.create(currentWorkspace, { label: 'Chat' })).value;
119
- currentWorkspace = merge(currentWorkspace, patch);
120
-
121
- // Add a user turn
122
- const userTurn = new TurnBuilder('user')
123
- .addText('What is the weather like today?')
124
- .build();
125
- const addResult = await session.addTurn(currentWorkspace, userTurn);
126
- currentWorkspace = merge(currentWorkspace, addResult.value);
127
-
128
- // Resolve the effective session (includes preferences, context, transcript)
129
- const effective = (await workspaceManager.resolveSession(currentWorkspace, session.id())).value;
130
-
131
- // Build a prompt for an LLM
132
- const promptBuilder = new PromptBuilder({ blobResolver: contentStore.getBlobResolver() });
133
- const prompt = await promptBuilder.build(effective, {
134
- tokenBudget: { total: 8000 },
135
- relevanceConfig: { recentMessageWindow: 5, minScore: 0.3 }
136
- });
121
+ ### 4.2. Read Model: Reactive Index Projections
122
+ Querying the database every time a prompt is built adds avoidable latency. To solve this, the `WorkspaceManager` maintains a **Workspace Index**. This index is a reactive, in-memory projection of the underlying stores.
123
+ - When a command is executed, the reducer returns a "Patch."
124
+ - The manager applies the patch to the Index.
125
+ - Subscribers are notified via the Event Bus.
137
126
 
138
- console.log(prompt.system.persona);
139
- console.log(prompt.transcript.turns);
140
- ```
127
+ ### 4.3. The Sequential Serializer (Race Condition Prevention)
128
+ Unserialized concurrent writes to a graph-based history can produce "phantom branches" and corrupt heads. The `WorkspaceManager` uses a **Sequential Serializer** (a promise-chaining queue) to ensure that only one mutation runs at a time.
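A promise-chaining queue of this kind can be written in a few lines. The sketch below is a generic version of the technique, not the package's code; the class name, the `enqueue` method, and the `sleep` helper are assumptions.

```typescript
// Generic promise-chaining queue: each task starts only after the previous one settles.
class SequentialSerializer {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Chain the task onto the current tail so it waits for all earlier tasks.
    const run = this.tail.then(() => task());
    // Swallow rejections on the internal chain so one failed task
    // does not block the tasks queued after it. Callers still see the error via `run`.
    this.tail = run.catch(() => undefined);
    return run;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const order: string[] = [];
const serializer = new SequentialSerializer();

// The slow task is queued first, so it finishes first even though the
// second task would win the race if both ran concurrently.
const done = Promise.all([
  serializer.enqueue(async () => {
    await sleep(20);
    order.push("slow");
  }),
  serializer.enqueue(async () => {
    order.push("fast");
  }),
]);
```

The trade-off is throughput: writes never overlap, which is usually acceptable when each mutation is small and correctness of the head pointer matters more than parallelism.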
141
129
 
130
+ ### 4.4. Event Bus and Reactivity
131
+ The workspace is "Alive." Every mutation emits events that can be listened to by the UI, external logging services, or other parts of the application.
142
132
 
143
- ### Turn Versioning & Editing
133
+ ---
144
134
 
145
- Every turn in a session is versioned. When you edit a turn, a new version is created while preserving the old one. The session’s “head” points to the latest version along the active chain, but users can switch between versions or view version history.
135
+ ## 5. Domain Entities: Deep Dive into the State
146
136
 
147
- **Editing a turn** – create a new version with modified content:
137
+ ### 5.1. Sessions: The Contextual Roots
138
+ A `Session` is the user's primary unit of interaction. It manages the "active timeline."
139
+ - **Head Pointer**: A `TurnRef` (UUID + Version) identifying the current leaf of the DAG.
140
+ - **Topic Affinity**: Semantic tags that act as "magnets" for RAG context and user preferences.
148
141
 
149
- ```typescript
150
- // Edit the user turn we just added
151
- const newBlocks: ContentBlock[] = [
152
- { id: uuid(), type: 'text', text: 'What is the weather like in Tokyo?' }
153
- ];
154
- const editResult = await session.editTurn(
155
- currentWorkspace,
156
- userTurn.id, // turn ID
157
- newBlocks, // new content blocks
158
- 'user' // optional role snapshot
159
- );
160
- if (editResult.ok) {
161
- currentWorkspace = merge(currentWorkspace, editResult.value);
162
- }
163
- ```
142
+ ### 5.2. Turns: The Atoms of Hyper-History
143
+ A `Turn` is the smallest unit of conversation. It is a multi-modal container that can hold anything from a text string to a sequence of tool executions and internal "thinking" blocks.
164
144
 
165
- **Navigating versions** – switch between versions of a turn (e.g., undo/redo):
145
+ ### 5.3. Blobs: Deduplicated Asset Management
146
+ Blobs represent the library's solution for large binary data. They utilize a **Reference Counting** garbage collector.
147
+ - **Registration**: Calculate hash -> Check for existence -> Create or increment ref-count.
148
+ - **Automatic Cleanup**: When a turn is deleted, the system decrements the blob's ref-count. Only when it hits zero is the physical file deleted from storage.
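The retain and release cycle described above can be sketched as a small reference-counting registry. The `RefCountedBlobs` class and its `purge` callback are hypothetical; in a real setup the callback would delete the physical file from blob storage.

```typescript
// Hypothetical reference-counting registry keyed by SHA-256 digest.
class RefCountedBlobs {
  private readonly counts = new Map<string, number>();
  private readonly purge: (sha256: string) => void;

  constructor(purge: (sha256: string) => void) {
    this.purge = purge;
  }

  // Called when a turn starts referencing the blob.
  retain(sha256: string): number {
    const next = (this.counts.get(sha256) ?? 0) + 1;
    this.counts.set(sha256, next);
    return next;
  }

  // Called when a referencing turn is deleted. The blob is purged
  // only when the last reference goes away.
  release(sha256: string): number {
    const current = this.counts.get(sha256);
    if (current === undefined) throw new Error(`Unknown blob: ${sha256}`);
    const next = current - 1;
    if (next === 0) {
      this.counts.delete(sha256);
      this.purge(sha256);
    } else {
      this.counts.set(sha256, next);
    }
    return next;
  }
}

const purged: string[] = [];
const registry = new RefCountedBlobs((sha256) => purged.push(sha256));

registry.retain("abc123"); // first turn references the blob
registry.retain("abc123"); // second turn references the same blob
registry.release("abc123"); // one turn deleted, blob is kept
const purgedAfterFirstRelease = purged.length;
registry.release("abc123"); // last reference gone, blob is purged
```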
166
149
 
167
- ```typescript
168
- // Switch to the previous version of this turn
169
- const leftResult = await session.switchVersionLeft(currentWorkspace, userTurn.id);
170
- if (leftResult.ok) {
171
- currentWorkspace = merge(currentWorkspace, leftResult.value);
172
- // The session head now points to the previous version
173
- }
150
+ ### 5.4. Roles: Persistent Personas
151
+ A `Role` encapsulates the AI's identity. It includes the "System Prompt" (Persona), model-specific constraints (e.g., `temperature`, `window size`), and a set of topics that the role is "interested" in.
174
152
 
175
- // Switch to the next version (if available)
176
- const rightResult = await session.switchVersionRight(currentWorkspace, userTurn.id);
177
- ```
153
+ ---
178
154
 
179
- **Inspecting version info** – get available versions and navigation state:
155
+ ## 6. The Prompt Pipeline: Anatomy of a Request
180
156
 
181
- ```typescript
182
- const branchInfo = await session.branchInfo(userTurn.id);
183
- console.log(branchInfo);
184
- // {
185
- // versions: [0, 1, 2], // all version numbers for this turn
186
- // currentIndex: 1, // index of the active version
187
- // total: 3,
188
- // hasPrev: true,
189
- // hasNext: true
190
- // }
191
- ```
157
+ The `PromptBuilder` follows a five-stage lifecycle:
192
158
 
193
- **How it works under the hood** – When you call `editTurn`, the library:
194
- 1. Loads the current version of the turn.
195
- 2. Increments the version number.
196
- 3. Stores a new turn document with the updated blocks.
197
- 4. Updates the session head if the edited turn was the head.
198
- 5. Preserves all previous versions, which remain accessible via `switchVersionLeft/Right` and appear in `branchInfo`.
159
+ ### 6.1. Stage 1: DAG Snapshotting and Ancestry Walking
160
+ Starting from the session's `Head`, the builder performs a recursive walk up the parent pointers. This flattens the graph into a linear `Transcript`.
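The ancestry walk can be illustrated with a small function that follows parent pointers from the head back to the root and then reverses the result. This sketch uses an iterative loop rather than recursion, and the `StoredTurn` shape and `buildTranscript` name are assumptions.

```typescript
// Hypothetical turn record with a single parent pointer.
interface StoredTurn {
  id: string;
  parentId: string | null;
  text: string;
}

// Walks from the head up to the root, then reverses into chronological order.
function buildTranscript(turns: Map<string, StoredTurn>, headId: string): StoredTurn[] {
  const transcript: StoredTurn[] = [];
  const seen = new Set<string>();
  let cursor: string | null = headId;
  while (cursor !== null) {
    if (seen.has(cursor)) throw new Error(`Cycle detected at ${cursor}`);
    seen.add(cursor);
    const turn: StoredTurn | undefined = turns.get(cursor);
    if (!turn) throw new Error(`Missing turn: ${cursor}`);
    transcript.push(turn);
    cursor = turn.parentId;
  }
  return transcript.reverse();
}

// Two branches share the same first two turns.
const turns = new Map<string, StoredTurn>([
  ["a", { id: "a", parentId: null, text: "Hello" }],
  ["b", { id: "b", parentId: "a", text: "Hi, how can I help?" }],
  ["c", { id: "c", parentId: "b", text: "Summarize this file." }],
  ["d", { id: "d", parentId: "b", text: "Translate this file." }],
]);

const mainline = buildTranscript(turns, "c").map((t) => t.id);
const branch = buildTranscript(turns, "d").map((t) => t.id);
```

Only the head differs between the two calls; the shared prefix is read from the same stored turns.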
199
161
 
200
- This makes the conversation graph fully auditable and supports collaborative editing scenarios.
162
+ ### 6.2. Stage 2: Semantic Retrieval and Topic Alignment
163
+ The builder identifies the session's active `Topics`, queries the `ContextIndex` for matching entries, and uses the `ContextRetriever` to rank them by relevance.
201
164
 
202
- ### Working with Blobs (Images & Documents)
165
+ ### 6.3. Stage 3: Conflict Resolution in Instructions
166
+ The builder assembles the final "System Instruction" by merging the `Role` instructions with any active `Preferences`.
203
167
 
204
- ```typescript
205
- // Register an image blob
206
- const imageData = await fetch('/photo.jpg').then(async (r) => new Uint8Array(await r.arrayBuffer()));
207
- const registerCmd: RegisterBlob = {
208
- type: 'blob:register',
209
- timestamp: new Date().toISOString(),
210
- payload: { data: imageData, mediaType: 'image/jpeg', filename: 'photo.jpg' }
211
- };
212
- const blobResult = await workspaceManager.dispatch(currentWorkspace, registerCmd);
213
- if (blobResult.ok) {
214
- const blobRef = blobResult.value.index?.blobs?.['sha256...']; // reference
215
- // Use blobRef in an ImageBlock
216
- const turnWithImage = new TurnBuilder('user')
217
- .addImage(blobRef, 'A beautiful landscape')
218
- .build();
219
- }
220
- ```
168
+ ### 6.4. Stage 4: Multi-Modal Blob Resolution Protocols
169
+ Every `BlobRef` in the transcript is resolved. The builder contacts the `BlobStore` to determine if the data should be sent as `inlineData` (Base64) or if it has a pre-existing `fileId`.
170
+
171
+ ### 6.5. Stage 5: Provider-Specific Adapter Mapping
172
+ The finalized, agnostic `Prompt` is passed to the `LLMAdapter`. This is where the mapping to the provider's specific wire format occurs.
173
+
174
+ ---
175
+
176
+ ## 7. The Content Block Specification
177
+
178
+ ### 7.1. Basic Blocks
179
+ - **`text`**: Standard text communication.
180
+ - **`thinking`**: Internal chain-of-thought (CoT) reasoning, kept in a separate block so the UI can hide or show it.
181
+
182
+ ### 7.2. Asset Blocks
183
+ - **`image`**: References a `BlobRef`. Supports `mediaType` and `filename`.
184
+ - **`document`**: Used for PDFs, code snippets, or long-form text files.
185
+
186
+ ### 7.3. Functional Blocks
187
+ - **`tool_use`**: A structured request for tool execution. Contains `callId`, `name`, and `args`.
188
+ - **`tool_result`**: The output of a tool, linked via `callId`.
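One way to picture the `callId` link is as a discriminated union plus a check for tool calls that have no result yet. The field names `callId`, `name`, and `args` come from the list above; the `output` field, the trimmed `ContentBlock` union, and the `unresolvedCalls` helper are assumptions for this sketch.

```typescript
// Simplified block union; the real specification has more block types and fields.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; callId: string; name: string; args: Record<string, unknown> }
  | { type: "tool_result"; callId: string; output: unknown };

// Returns the callIds of tool_use blocks that have no matching tool_result.
function unresolvedCalls(blocks: ContentBlock[]): string[] {
  const resolved = new Set<string>();
  for (const block of blocks) {
    if (block.type === "tool_result") resolved.add(block.callId);
  }
  const pending: string[] = [];
  for (const block of blocks) {
    if (block.type === "tool_use" && !resolved.has(block.callId)) {
      pending.push(block.callId);
    }
  }
  return pending;
}

const blocks: ContentBlock[] = [
  { type: "text", text: "Let me check the forecast." },
  { type: "tool_use", callId: "call-1", name: "get_weather", args: { city: "Tokyo" } },
  { type: "tool_use", callId: "call-2", name: "get_time", args: { city: "Tokyo" } },
  { type: "tool_result", callId: "call-1", output: { tempC: 21 } },
];

const pendingCalls = unresolvedCalls(blocks);
```

In this example only `call-2` is still waiting for a result.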
189
+
190
+ ### 7.4. Structural Blocks
191
+ - **`summary`**: A "compression" block. Replaces older history turns.
192
+ - **`role:transition`**: Notates that the session persona changed.
221
193
 
222
- ### Managing Preferences & Roles
194
+ ---
195
+
196
+ ## 8. Command Reference: The Mutation API Manual
197
+
198
+ ### 8.1. Workspace Domain Commands
199
+ - **`workspace:create`**: Initializes root metadata.
200
+ - **`workspace:sync`**: Forces a full re-indexing of all stores.
201
+
202
+ ### 8.2. Session Domain Commands
203
+ | Command | Payload | Side Effects |
204
+ | :--- | :--- | :--- |
205
+ | **`session:create`** | `id, role, topics, label` | Creates database record + Index entry. |
206
+ | **`session:fork`** | `sessionId, turnId` | Metadata-only clone of history. |
207
+ | **`session:update`** | `sessionId, updates` | Patch Index + Update store. |
208
+ | **`session:delete`** | `sessionId` | Recursive release of all blob refs. |
209
+ | **`session:role:switch`** | `sessionId, newRole` | Insert transition block + Update role. |
210
+
211
+ ### 8.3. Turn & DAG Domain Commands
212
+ | Command | Payload | Description |
213
+ | :--- | :--- | :--- |
214
+ | **`turn:add`** | `sessionId, turn` | Appends turn. Retains blobs. |
215
+ | **`turn:edit`** | `sessionId, turnId, newBlocks`| New Version in DAG. Moves Head. |
216
+ | **`turn:branch`** | `sessionId, turn` | Diverge from non-head node. |
217
+ | **`turn:delete`** | `sessionId, turnId, version`| Removes Version + Releases blobs. |
218
+
219
+ ### 8.4. Role & Persona Domain Commands
220
+ - **`role:add`**: Register new persona.
221
+ - **`role:update`**: Modify instructions/constraints.
222
+ - **`role:delete`**: Remove role.
223
+
224
+ ### 8.5. Blob & Asset Domain Commands
225
+ - **`blob:register`**: Calculate hash + Create storage record + Increment ref-count.
226
+ - **`blob:retain`**: Increment ref-count.
227
+ - **`blob:release`**: Decrement ref-count + Trigger purge if 0.
228
+ - **`blob:record_remote_id`**: Map local hash to remote API ID.
229
+
230
+ ---
231
+
232
+ ## 9. Extension Framework
223
233
 
234
+ ### 9.1. WorkspaceExtensions
235
+ Packaging custom logic into a single plugin.
224
236
  ```typescript
225
- // Add a preference
226
- const prefCmd: AddPreference = {
227
- type: 'preference:add',
228
- timestamp: new Date().toISOString(),
229
- payload: {
230
- id: crypto.randomUUID(),
231
- content: 'Always use metric units for measurements.',
232
- topics: ['weather', 'science'],
233
- timestamp: new Date().toISOString()
234
- }
237
+ const MyPlugin: WorkspaceExtension = {
238
+ schemas: [ /* schemas */ ],
239
+ reducers: { 'custom:cmd': myReducer },
240
+ middleware: [ myMiddleware ],
235
241
  };
236
- await workspaceManager.dispatch(currentWorkspace, prefCmd);
237
-
238
- // Create a role that uses that preference by default
239
- const roleCmd: AddRole = {
240
- type: 'role:add',
241
- timestamp: new Date().toISOString(),
242
- payload: {
243
- name: 'scientist',
244
- label: 'Science Assistant',
245
- persona: 'You are a helpful science expert.',
246
- preferences: [prefId]
247
- }
248
- };
249
- await workspaceManager.dispatch(currentWorkspace, roleCmd);
250
242
  ```
251
243
 
244
+ ### 9.2. Middleware: Intercepting the State Stream
245
+ Guardians of the workspace. Can abort, augment, or observe commands.
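The three behaviours (abort, augment, observe) can be sketched with a generic middleware chain. The `Command`, `Middleware`, and `compose` definitions below are assumptions and do not reflect the package's actual middleware signature; returning `null` is used here as a stand-in for aborting a command.

```typescript
// Hypothetical command and middleware shapes for illustration.
interface Command {
  type: string;
  payload: Record<string, unknown>;
}
type Next = (cmd: Command) => Command | null;
type Middleware = (cmd: Command, next: Next) => Command | null;

// Builds a single handler in which each middleware wraps the ones after it.
function compose(middleware: Middleware[]): Next {
  return middleware.reduceRight<Next>(
    (next, mw) => (cmd) => mw(cmd, next),
    (cmd) => cmd,
  );
}

const seen: string[] = [];

// Observe: record the command type and pass it through unchanged.
const audit: Middleware = (cmd, next) => {
  seen.push(cmd.type);
  return next(cmd);
};

// Abort: refuse role deletion by never calling next.
const guard: Middleware = (cmd, next) => (cmd.type === "role:delete" ? null : next(cmd));

// Augment: add a field to the payload before passing the command on.
const stamp: Middleware = (cmd, next) =>
  next({ ...cmd, payload: { ...cmd.payload, stampedBy: "middleware" } });

const pipeline = compose([audit, guard, stamp]);

const blocked = pipeline({ type: "role:delete", payload: { name: "scientist" } });
const passed = pipeline({ type: "turn:add", payload: { sessionId: "s-1" } });
```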
246
+
247
+ ### 9.3. TurnProcessors: Giving the AI Agency
248
+ Scans AI response blocks for side-effects (e.g., auto-saving memories or tasks).
252
249
 
253
- ### Branching a Conversation
250
+ ### 9.4. Custom Indexers
251
+ Expanding the Read Model with custom read-projections.
252
+
253
+ ---
254
254
 
255
+ ## 10. Persistence Layer: Schema and Collections
256
+
257
+ | Collection | Key | Description |
258
+ | :--- | :--- | :--- |
259
+ | `workspace` | `id` | Global metadata and settings. |
260
+ | `role` | `name` | Persona instructions. |
261
+ | `preference` | `id` | User instructions. |
262
+ | `context` | `key` | RAG snippets. |
263
+ | `session` | `id` | Session metadata + Head. |
264
+ | `turn` | `(id, version)`| Immutable history records. |
265
+ | `blob_records` | `sha256` | Metadata + RefCount. |
266
+
267
+ ---
268
+
269
+ ## 11. Implementation Guide
270
+
271
+ ### 11.1. Implementing a Database Boundary
255
272
  ```typescript
256
- // Get the current head turn
257
- const head = session.head(currentWorkspace);
258
- if (head) {
259
- // Create a branch from that turn
260
- const branchTurn = new TurnBuilder('assistant')
261
- .withParent(head) // explicitly set parent
262
- .addText('Let me think differently...')
263
- .build();
264
- const branchResult = await session.branch(currentWorkspace, branchTurn);
265
- currentWorkspace = merge(currentWorkspace, branchResult.value);
273
+ class MyDb implements WorkspaceDatabase {
274
+ async collection(name: string) { /* ... */ }
275
+ async open(schemas: SchemaDefinition[]) { /* ... */ }
266
276
  }
267
277
  ```
268
278
 
269
- ### Custom Prompt Assembly
270
-
271
- The SDK includes a default token planner and Jaccard context retriever, but you can replace them:
272
-
279
+ ### 11.2. Implementing a BlobStorage Boundary
273
280
  ```typescript
274
- class MyCustomRetriever implements ContextRetriever {
275
- rank(input: ContextRankingInput): Context[] {
276
- // your ranking logic
277
- }
281
+ class MyStorage implements BlobStorage {
282
+ async put(data: Uint8Array): Promise<string> { /* return sha256 */ }
283
+ async get(sha256: string): Promise<Uint8Array | null> { /* ... */ }
284
+ async delete(sha256: string): Promise<void> { /* ... */ }
278
285
  }
286
+ ```
279
287
 
280
- const promptBuilder = new PromptBuilder({
281
- blobResolver: contentStore.getBlobResolver(),
282
- retriever: new MyCustomRetriever(),
283
- planner: new MyTokenPlanner()
288
+ ### 11.3. Bootstrapping the Manager
289
+ ```typescript
290
+ const { manager, sessions, ctx } = await createWorkspace({
291
+ db: new MyDb(),
292
+ blobStorage: new MyStorage(),
293
+ getWorkspace: () => state,
294
+ setWorkspace: (patch) => { state = merge(state, patch); },
295
+ processor: new MyProcessor()
284
296
  });
285
297
  ```
286
298
 
287
299
  ---
288
300
 
289
- ## Project Architecture
290
-
291
- ### Core Components
292
-
293
- | Component | Responsibility |
294
- |-----------|----------------|
295
- | `WorkspaceManager` | Dispatches commands, applies reducer, coordinates side effects (persistence, blob ops). |
296
- | `ContentStore` | Session‑aware read/write layer with LRU caches for roles, preferences, context. Owns `TurnTree` and `BlobStore`. |
297
- | `TurnTree` | Manages the turn DAG – persistence, head pointer, branching, version switching, subtree deletion. |
298
- | `BlobStore` | Content‑addressed blob registry (SHA‑256) with reference counting, remote ID mapping, and eviction. |
299
- | `Session` | Thin coordinator for a single session – validates existence, delegates to `TurnTree` for reads, forwards write commands to `WorkspaceManager`. |
300
- | `SessionManager` | Creates, opens, and lists sessions. Returns `Session` objects (stateless, lazy loading). |
301
- | `PromptBuilder` | Assembles prompts from an `EffectiveSession` using a `ContextRetriever`, `TokenPlanner`, and optional `Summarizer`. |
302
- | `Reducer` | Pure function that validates commands and produces `DeepPartial<Workspace>` patches for the in‑memory index. |
303
-
304
- ### Data Flow
305
-
306
- ```mermaid
307
- graph LR
308
- A[UI / Caller] -->|Command| B[WorkspaceManager]
309
- B -->|Validate & Patch| C[Reducer]
310
- C -->|Patch| D[In‑Memory Workspace]
311
- B -->|Side Effects| E[ContentStore]
312
- E --> F[(Database)]
313
- E --> G[(Blob Storage)]
314
- B -->|Emit Event| H[EventBus]
315
- H -->|Notify| A
316
- ```
301
+ ## 12. Design Decisions
317
302
 
318
- ### Extension Points
303
+ ### Why a DAG?
304
+ A tree allows branching, but a DAG also allows for **Merging** and **Non-Destructive Versioning**. It accurately represents how a conversation evolves into multiple "what-if" scenarios.
319
305
 
320
- - **`Summarizer`** – Compress transcripts when token budget is tight.
321
- - **`ContextRetriever`** – Custom ranking of context entries (e.g., vector similarity).
322
- - **`TokenPlanner`** – Decide which turns/preferences/context fit into a budget.
323
- - **`ToolRegistry`** – Register callable tools; `tool:call` commands will execute them.
324
- - **`PermissionGuard`** – Intercept commands/tool calls for authentication/authorization.
325
- - **`BlobStorage`** – Implement your own backend (S3, OPFS, etc.) by conforming to the interface.
306
+ ### Why SHA-256?
307
+ UUIDs lead to "Asset Proliferation." SHA-256 enables **Global Deduplication**, critical for reducing storage in RAG-heavy apps.
326
308
 
327
- ---
309
+ ### Why CQRS?
310
+ Read patterns (building prompts) are vastly different from Write patterns (saving turns). CQRS optimizes both independently.
328
311
 
329
- ## Development & Contributing
312
+ ---
330
313
 
331
- ### Development Setup
314
+ ## 13. Comparison with Industry Standards
332
315
 
333
- ```bash
334
- git clone https://github.com/asaidimu/erp-utils.git
335
- cd erp-utils/src/workspace
336
- npm install
337
- npm run build
338
- ```
316
+ - **vs. OpenAI Threads**: OpenAI threads are black boxes. Our Workspace provides full local control and DAG navigation.
317
+ - **vs. LangChain Memory**: LangChain memory is often ephemeral. Our Workspace is a persistent, repo-level database.
339
318
 
340
- ### Available Scripts
319
+ ---
341
320
 
342
- | Script | Description |
343
- |--------|-------------|
344
- | `npm test` | Run unit tests once (Vitest) |
345
- | `npm run test:watch` | Run tests in watch mode |
346
- | `npm run test:browser` | Run tests in a browser environment |
347
- | `npm run build` | Compile TypeScript to `dist/` |
321
+ ## 14. Advanced Recipes & Design Patterns
348
322
 
349
- ### Testing
323
+ ### The "Shadow Turn" Pattern
324
+ Intercept `turn:add` and inject a hidden system turn before the user message for real-time situational awareness.
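One way to picture the pattern is a middleware that expands a user `turn:add` into two commands. The `Turn`, `Command`, and `Middleware` shapes below are assumptions made for illustration; the package's real command and middleware types may look different.

```typescript
// Hypothetical shapes: the package's real Command and middleware types may differ.
type Turn = { role: "system" | "user"; text: string; hidden?: boolean };
type Command = { type: string; turn?: Turn };
type Middleware = (command: Command) => Command[];

/** Expands each `turn:add` for a user turn into [hidden system turn, original turn]. */
function shadowTurn(describeSituation: () => string): Middleware {
  return (command) => {
    // Pass every other command through untouched.
    if (command.type !== "turn:add" || command.turn?.role !== "user") return [command];
    const shadow: Command = {
      type: "turn:add",
      turn: { role: "system", text: describeSituation(), hidden: true },
    };
    return [shadow, command];
  };
}
```

The `describeSituation` callback is evaluated per user turn, so the injected context (time, location, open document, and so on) is always current.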
350
325
 
351
- Tests are written with [Vitest](https://vitest.dev/) and cover reducers, turn tree operations, blob reference counting, and prompt assembly. Run `npm test` to execute the suite. We aim for >85% coverage on core modules.
326
+ ### Automatic Topic Discovery
327
+ Use a cheap model to scan every user turn and suggest new topics to refine RAG retrieval dynamically.
352
328
 
353
- ### Contributing Guidelines
329
+ ---
354
330
 
355
- 1. **Fork the repository** and create a feature branch.
356
- 2. **Follow the existing code style** (Prettier + ESLint configured).
357
- 3. **Write tests** for any new functionality or bug fixes.
358
- 4. **Commit messages** should follow [Conventional Commits](https://www.conventionalcommits.org/) (e.g., `feat: add branch info API`).
359
- 5. **Open a pull request** against the `main` branch. Include a clear description and link to any related issue.
331
+ ## 15. Technical Specifications
360
332
 
361
- ### Issue Reporting
333
+ ### Blob Hash Protocol
334
+ Each blob is keyed by the SHA-256 hex digest of its bytes and stored under `blobs/{sha256}`. When its RefCount reaches 0, the blob is deleted.
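A minimal sketch of this protocol, assuming an in-memory store: `BlobStore`, `retain`, and `release` are hypothetical names chosen for the example, and only `createHash` from Node's `node:crypto` module is a real API. It shows how content addressing yields deduplication and how the reference count drives deletion.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory store; the real package's storage interface may differ.
class BlobStore {
  private blobs = new Map<string, { bytes: Uint8Array; refCount: number }>();

  /** Content-address the bytes: identical content always maps to the same key. */
  static keyFor(bytes: Uint8Array): string {
    const sha256 = createHash("sha256").update(bytes).digest("hex");
    return `blobs/${sha256}`;
  }

  /** Store the bytes once; every further registration only bumps the count. */
  retain(bytes: Uint8Array): string {
    const key = BlobStore.keyFor(bytes);
    const entry = this.blobs.get(key);
    if (entry) {
      entry.refCount += 1;
    } else {
      this.blobs.set(key, { bytes, refCount: 1 });
    }
    return key;
  }

  /** Drop one reference; delete the bytes when the count reaches zero. */
  release(key: string): void {
    const entry = this.blobs.get(key);
    if (!entry) return;
    entry.refCount -= 1;
    if (entry.refCount === 0) this.blobs.delete(key);
  }

  has(key: string): boolean {
    return this.blobs.has(key);
  }

  size(): number {
    return this.blobs.size;
  }
}
```

Registering the same file from two turns stores it once; it disappears only after both references are released.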
362
335
 
363
- Report bugs or request features via [GitHub Issues](https://github.com/asaidimu/erp-utils/issues). Please include:
364
- - Library version
365
- - Minimal code reproduction
366
- - Expected vs actual behavior
336
+ ### Index Sync Protocol
337
+ The Index is rebuilt in full by running all registered `Indexers` in parallel.
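As a rough sketch of what a parallel full rebuild could look like, assuming each indexer exposes a name and an async `build()` method (both are illustrative; the package's `Indexer` interface is not shown here):

```typescript
// Hypothetical shapes: `Indexer` here is illustrative, not the package's interface.
type Index = Record<string, unknown>;

interface Indexer {
  name: string;
  build(): Promise<unknown>;
}

/** Rebuild every slice of the index concurrently, then assemble a fresh Index. */
async function rebuildIndex(indexers: Indexer[]): Promise<Index> {
  // Promise.all starts all builds at once and waits for every one of them.
  const entries = await Promise.all(
    indexers.map(async (ix) => [ix.name, await ix.build()] as const),
  );
  const index: Index = {};
  for (const [name, slice] of entries) index[name] = slice;
  return index;
}
```

Note that `Promise.all` rejects as soon as any indexer fails, so in this sketch a single failing indexer aborts the whole rebuild; `Promise.allSettled` would be the alternative if partial results are acceptable.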
367
338
 
368
339
  ---
369
340
 
370
- ## Additional Information
341
+ ## 16. Troubleshooting
342
+
343
+ ### Stale Index
344
+ Ensure reducers return the correct patch. Call `workspace:sync` to force a rebuild.
371
345
 
372
- ### Troubleshooting
346
+ ### Blob Reference Leak
347
+ Use `session:delete` to ensure all blobs in the session DAG are released.
373
348
 
374
- | Issue | Likely Solution |
375
- |-------|----------------|
376
- | `Blob bytes not found locally` | The blob was evicted because `refCount` reached zero and `eagerEviction` is true. Either re‑register the blob or disable eager eviction. |
377
- | `Cannot delete role – still referenced by sessions` | Change all sessions using that role to another role first, or delete the sessions. |
378
- | `Turn not found when editing` | Ensure the turn ID and version exist. Use `session.turns()` to list current nodes. |
379
- | `Prompt assembly drops all context` | Check your `tokenBudget.total` – increase it, or provide a custom `estimator` that returns smaller token counts. |
349
+ ---
350
+
351
+ ## 17. Testing Strategy
380
352
 
381
- ### FAQ
353
+ ### Integration Testing the DAG
354
+ ```typescript
355
+ test('can branch history', async () => {
356
+ const session = await sessions.create({ ... });
357
+ await session.addTurn({ blocks: [{ type: 'text', text: 'Prompt 1' }] });
358
+ const turn1Id = session.head()!.id;
359
+ await session.editTurn(turn1Id, [{ type: 'text', text: 'Edited Prompt 1' }]);
360
+ // Verify both versions exist and head points to version 2
361
+ });
362
+ ```
382
363
 
383
- **Q: How do I handle large files (e.g., 100MB videos)?**
384
- A: The blob storage is content‑addressed, so each unique file is stored once. Use `registerBlob` to add it, then reference it via `BlobRef`. For very large files, consider implementing a streaming backend or using remote IDs (e.g., upload to S3 and store the fileId via `recordRemoteId`).
364
+ ### Unit Testing Reducers
365
+ Reducers are pure-ish and can be tested in isolation by providing a mock `WorkspaceContext`.
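To make the idea concrete, here is a sketch of a reducer tested against a mock context. Everything in it is assumed for illustration: the `session:rename` command, the `State` and `Patch` shapes, and a `WorkspaceContext` reduced to an injected clock. The real types in the package may differ.

```typescript
// Hypothetical shapes: the package's real reducer and WorkspaceContext types may differ.
interface WorkspaceContext {
  now(): number; // injected clock, so tests can make it deterministic
}

type State = { titles: Record<string, string>; updatedAt: number };
type RenameCommand = { type: "session:rename"; id: string; title: string };
type Patch = Partial<State>;

/** Returns a patch instead of mutating state; its only impurity is the injected clock. */
function renameReducer(state: State, cmd: RenameCommand, ctx: WorkspaceContext): Patch {
  return {
    titles: { ...state.titles, [cmd.id]: cmd.title },
    updatedAt: ctx.now(),
  };
}

// In a test, the mock context pins the clock so the patch is fully predictable.
const mockCtx: WorkspaceContext = { now: () => 1234 };
const before: State = { titles: { s1: "Old" }, updatedAt: 0 };
const patch = renameReducer(before, { type: "session:rename", id: "s1", title: "New" }, mockCtx);
```

Since the reducer only reads its inputs and the context, no database or blob storage is needed to exercise it.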
385
366
 
386
- **Q: Can I use this library in a browser with IndexedDB?**
387
- A: Yes – use `IndexedDBBlobStorage` and `createWorkspaceDatabase` with an IndexedDB adapter. The turn DAG and all entities are persisted locally.
367
+ ---
388
368
 
389
- **Q: How do I migrate from a simple message array?**
390
- A: Create a session, then add turns sequentially using `session.addTurn`. The `parent` field will be set automatically if you use `addTurn` (it links to the current head). For branching, you can also set `parent` manually.
369
+ ## 18. Glossary of Terms
391
370
 
392
- **Q: Does the SDK support real‑time collaboration?**
393
- A: The command/reducer pattern is well‑suited for CRDTs or operational transforms. The workspace events (`workspace:changed`) can be broadcast to other clients, and commands can be merged. We plan to add built‑in sync in a future version.
371
+ - **`Head`**: The current active turn.
372
+ - **`Transcript`**: Flattened history for an LLM request.
373
+ - **`Persona`**: System instructions for a Role.
374
+ - **`Topic`**: Semantic tag for RAG connectivity.
375
+ - **`Reducer`**: Command handler for state mutation.
394
376
 
395
- ### Changelog & Roadmap
377
+ ---
396
378
 
397
- See [CHANGELOG.md](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/CHANGELOG.md) for version history.
398
- Planned features:
399
- - Built‑in summarizer using LLM calls
400
- - Vector store integration for semantic context retrieval
401
- - Real‑time collaboration via WebSocket transport
379
+ ## 19. API Reference Appendix
402
380
 
403
- ### License
381
+ ### WorkspaceManager
382
+ - `dispatch(command)`: Atomic mutation.
383
+ - `workspace()`: Sync Index access.
384
+ - `use(middleware)`: Register interceptor.
404
385
 
405
- MIT © [Saidimu](https://github.com/asaidimu). See [LICENSE](https://github.com/asaidimu/erp-utils/blob/main/src/workspace/LICENSE) for full text.
386
+ ### Session
387
+ - `snapshot()`: Prepare prompt data.
388
+ - `addTurn(blocks)`: Append to head.
389
+ - `branchInfo(turnId)`: Retrieve all versions.
406
390
 
407
- ### Acknowledgments
391
+ ---
408
392
 
409
- Built on:
410
- - [`@asaidimu/anansi`](https://github.com/asaidimu/anansi) – schema‑based document store
411
- - [`@asaidimu/events`](https://github.com/asaidimu/events) – typed event bus
412
- - [`@asaidimu/utils-database`](https://github.com/asaidimu/erp-utils/tree/main/src/database) – database abstraction layer
413
- - [`uuid`](https://github.com/uuidjs/uuid) – UUID generation
393
+ ## 20. License
414
394
 
415
- Inspired by modern AI orchestration frameworks and offline‑first application patterns.
395
+ MIT © [Saidimu](https://github.com/asaidimu)