@durable-streams/state 0.1.0

@@ -0,0 +1,502 @@
1
+ # The Durable Streams State Protocol
2
+
3
+ **Document:** Durable Streams State Protocol
4
+ **Version:** 1.0
5
+ **Date:** 2025-01-XX
6
+ **Author:** ElectricSQL
7
+ **Status:** Extension of Durable Streams Protocol
8
+
9
+ ---
10
+
11
+ ## Abstract
12
+
13
+ This document specifies the Durable Streams State Protocol, an extension of the Durable Streams Protocol [PROTOCOL] that defines a composable schema for state change events (insert/update/delete) and control messages. The protocol provides a shared vocabulary for state synchronization that works across different transport layers, storage backends, and application patterns, enabling database-style sync semantics over durable streams.
14
+
15
+ ## Copyright Notice
16
+
17
+ Copyright (c) 2025 ElectricSQL
18
+
19
+ ## Table of Contents
20
+
21
+ 1. [Introduction](#1-introduction)
22
+ 2. [Terminology](#2-terminology)
23
+ 3. [Protocol Overview](#3-protocol-overview)
24
+ 4. [Message Types](#4-message-types)
25
+ - 4.1. [Change Messages](#41-change-messages)
26
+ - 4.2. [Control Messages](#42-control-messages)
27
+ 5. [Message Format](#5-message-format)
28
+ - 5.1. [Change Message Structure](#51-change-message-structure)
29
+ - 5.2. [Control Message Structure](#52-control-message-structure)
30
+ 6. [State Materialization](#6-state-materialization)
31
+ 7. [Schema Validation](#7-schema-validation)
32
+ 8. [Use Cases](#8-use-cases)
33
+ 9. [Security Considerations](#9-security-considerations)
34
+ 10. [IANA Considerations](#10-iana-considerations)
35
+ 11. [References](#11-references)
36
+
37
+ ---
38
+
39
+ ## 1. Introduction
40
+
41
+ The Durable Streams State Protocol extends the Durable Streams Protocol [PROTOCOL] by defining a standard message format for state synchronization. While the base protocol provides byte-level stream operations, this extension adds semantic meaning to messages, enabling clients to materialize and query state from change events.
42
+
43
+ The protocol is designed to be:
44
+
45
+ - **Composable**: A building block that works with any transport layer (durable streams, WebSockets, Server-Sent Events)
46
+ - **Type-safe**: Supports multi-type streams with discriminated unions
47
+ - **Decoupled**: Separates event processing from persistence, allowing flexible storage backends
48
+ - **Schema-agnostic**: Uses Standard Schema [STANDARD-SCHEMA] for validation, supporting multiple schema libraries
49
+
50
+ This protocol enables applications to build real-time state synchronization systems including presence tracking, chat rooms, feature flags, collaborative editing, and more, all using a common change event vocabulary.
51
+
52
+ ## 2. Terminology
53
+
54
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
55
+
56
+ **Change Message**: A message representing a state mutation (insert, update, or delete operation) on an entity identified by type and key.
57
+
58
+ **Control Message**: A message for stream management (snapshot boundaries, resets) separate from data changes.
59
+
60
+ **Entity Type**: A discriminator field in change messages that routes events to the correct collection or handler. Enables multi-type streams where different entity types coexist.
61
+
62
+ **Entity Key**: A unique identifier for an entity within a given type. Together with type, forms a composite key.
63
+
64
+ **Materialized State**: An in-memory or persistent view of state constructed by applying change events sequentially.
65
+
66
+ **Operation**: The type of change being applied: `insert` (create), `update` (modify), or `delete` (remove).
67
+
68
+ **Standard Schema**: A vendor-neutral schema format [STANDARD-SCHEMA] that enables validation with multiple schema libraries (Zod, Valibot, ArkType).
69
+
70
+ ## 3. Protocol Overview
71
+
72
+ The State Protocol operates on streams created with `Content-Type: application/json` as specified in the base Durable Streams Protocol [PROTOCOL]. Messages are JSON objects that conform to one of two message types:
73
+
74
+ 1. **Change Messages**: Represent state mutations (insert/update/delete)
75
+ 2. **Control Messages**: Provide stream management signals
76
+
77
+ Clients append change messages to streams using the base protocol's append operation. When reading from streams, clients receive JSON arrays of messages (per Section 7.1 of [PROTOCOL]) and apply them sequentially to materialize state.
78
+
79
+ The protocol does not prescribe:
80
+
81
+ - How state is persisted (in-memory, IndexedDB, SQLite, etc.)
82
+ - How queries are executed (direct map lookups, database queries, etc.)
83
+ - How conflicts are resolved (last-write-wins, CRDTs, etc.)
84
+
85
+ These decisions are left to implementations, enabling flexibility while providing a common event format.
86
+
87
+ ## 4. Message Types
88
+
89
+ ### 4.1. Change Messages
90
+
91
+ Change messages represent state mutations. They **MUST** contain:
92
+
93
+ - `type` (string): Entity type discriminator
94
+ - `key` (string): Entity identifier
95
+ - `headers` (object): Operation metadata
96
+
97
+ For `insert` and `update` operations, change messages **MUST** also contain:
98
+
99
+ - `value` (any JSON value): The new value for the entity
100
+
101
+ For `delete` operations, change messages **MAY** contain:
102
+
103
+ - `value` (any JSON value): Typically `null` or omitted
104
+
105
+ Change messages **MAY** include:
106
+
107
+ - `old_value` (any JSON value): Previous value, useful for conflict detection or audit logging
108
+
109
+ The `headers` object **MUST** contain:
110
+
111
+ - `operation` (string): One of `"insert"`, `"update"`, or `"delete"`
112
+
113
+ The `headers` object **MAY** contain:
114
+
115
+ - `txid` (string): Transaction identifier for grouping related changes
116
+ - `timestamp` (string): RFC 3339 [RFC3339] timestamp indicating when the change occurred
117
+
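The field rules above can be sketched as TypeScript types. This is a minimal illustration, not the package's API: the names `JsonValue`, `ChangeHeaders`, `InsertMessage`, `UpdateMessage`, `DeleteMessage`, `insertOf`, `updateOf`, and `deleteOf` are hypothetical, and only the field names come from this section.

```typescript
// Any JSON value, as allowed for `value` and `old_value`.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

interface ChangeHeaders<Op extends string> {
  operation: Op;
  txid?: string;
  timestamp?: string; // RFC 3339 timestamp
}

// `value` is required for insert and update, optional for delete.
interface InsertMessage {
  type: string;
  key: string;
  value: JsonValue;
  headers: ChangeHeaders<"insert">;
}

interface UpdateMessage {
  type: string;
  key: string;
  value: JsonValue;
  old_value?: JsonValue;
  headers: ChangeHeaders<"update">;
}

interface DeleteMessage {
  type: string;
  key: string;
  value?: JsonValue;
  old_value?: JsonValue;
  headers: ChangeHeaders<"delete">;
}

type ChangeMessage = InsertMessage | UpdateMessage | DeleteMessage;

// Hypothetical builders that produce messages with only the required fields set.
function insertOf(type: string, key: string, value: JsonValue): InsertMessage {
  return { type, key, value, headers: { operation: "insert" } };
}

function updateOf(type: string, key: string, value: JsonValue, oldValue?: JsonValue): UpdateMessage {
  const msg: UpdateMessage = { type, key, value, headers: { operation: "update" } };
  if (oldValue !== undefined) msg.old_value = oldValue;
  return msg;
}

function deleteOf(type: string, key: string, oldValue?: JsonValue): DeleteMessage {
  const msg: DeleteMessage = { type, key, headers: { operation: "delete" } };
  if (oldValue !== undefined) msg.old_value = oldValue;
  return msg;
}
```

Modelling each operation as its own interface lets the compiler enforce that `value` is present on inserts and updates while leaving it optional on deletes.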
118
+ #### 4.1.1. Insert Operation
119
+
120
+ Insert operations create new entities. The `value` field **MUST** be present and contain the entity data. The `old_value` field **SHOULD NOT** be present for insert operations.
121
+
122
+ **Example:**
123
+
124
+ ```json
125
+ {
126
+ "type": "user",
127
+ "key": "user:123",
128
+ "value": {
129
+ "name": "Alice",
130
+ "email": "alice@example.com"
131
+ },
132
+ "headers": {
133
+ "operation": "insert",
134
+ "timestamp": "2025-01-15T10:30:00Z"
135
+ }
136
+ }
137
+ ```
138
+
139
+ #### 4.1.2. Update Operation
140
+
141
+ Update operations modify existing entities. The `value` field **MUST** be present and contain the new entity data. The `old_value` field **MAY** be present to enable conflict detection.
142
+
143
+ **Example:**
144
+
145
+ ```json
146
+ {
147
+ "type": "user",
148
+ "key": "user:123",
149
+ "value": {
150
+ "name": "Alice",
151
+ "email": "alice.new@example.com"
152
+ },
153
+ "old_value": {
154
+ "name": "Alice",
155
+ "email": "alice@example.com"
156
+ },
157
+ "headers": {
158
+ "operation": "update",
159
+ "timestamp": "2025-01-15T10:35:00Z"
160
+ }
161
+ }
162
+ ```
163
+
164
+ #### 4.1.3. Delete Operation
165
+
166
+ Delete operations remove entities. The `value` field **MAY** be omitted entirely or present (typically as `null`). The `old_value` field **MAY** be present to preserve the deleted entity's data.
167
+
168
+ **Example (value omitted):**
169
+
170
+ ```json
171
+ {
172
+ "type": "user",
173
+ "key": "user:123",
174
+ "old_value": {
175
+ "name": "Alice",
176
+ "email": "alice.new@example.com"
177
+ },
178
+ "headers": {
179
+ "operation": "delete",
180
+ "timestamp": "2025-01-15T10:40:00Z"
181
+ }
182
+ }
183
+ ```
184
+
185
+ **Example (value null):**
186
+
187
+ ```json
188
+ {
189
+ "type": "user",
190
+ "key": "user:123",
191
+ "value": null,
192
+ "old_value": {
193
+ "name": "Alice",
194
+ "email": "alice.new@example.com"
195
+ },
196
+ "headers": {
197
+ "operation": "delete",
198
+ "timestamp": "2025-01-15T10:40:00Z"
199
+ }
200
+ }
201
+ ```
202
+
203
+ ### 4.2. Control Messages
204
+
205
+ Control messages provide stream management signals separate from data changes. They **MUST** contain:
206
+
207
+ - `headers` (object): Control metadata
208
+
209
+ The `headers` object **MUST** contain:
210
+
211
+ - `control` (string): One of `"snapshot-start"`, `"snapshot-end"`, or `"reset"`
212
+
213
+ The `headers` object **MAY** contain:
214
+
215
+ - `offset` (string): Stream offset associated with the control event
216
+
217
+ #### 4.2.1. Snapshot Boundaries
218
+
219
+ The `snapshot-start` and `snapshot-end` control messages delimit a snapshot. Servers **MAY** emit them to indicate that the change messages between the two markers represent a complete snapshot of state at a point in time.
220
+
221
+ **Example:**
222
+
223
+ ```json
224
+ {
225
+ "headers": {
226
+ "control": "snapshot-start",
227
+ "offset": "123456_000"
228
+ }
229
+ }
230
+ ```
231
+
232
+ ```json
233
+ {
234
+ "headers": {
235
+ "control": "snapshot-end",
236
+ "offset": "123456_789"
237
+ }
238
+ }
239
+ ```
240
+
241
+ #### 4.2.2. Reset Control
242
+
243
+ The `reset` control message signals that clients **SHOULD** clear their materialized state and restart from the offset given in `headers.offset`, when one is present. This enables servers to signal state resets or schema migrations.
244
+
245
+ **Example:**
246
+
247
+ ```json
248
+ {
249
+ "headers": {
250
+ "control": "reset",
251
+ "offset": "123456_000"
252
+ }
253
+ }
254
+ ```
255
+
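One way a client might react to the control messages in this section is sketched below. The `ClientState` shape, the flat `entities` map, and the function names are assumptions made for illustration; the protocol leaves control handling to application logic.

```typescript
type ControlKind = "snapshot-start" | "snapshot-end" | "reset";

interface ControlMessage {
  headers: { control: ControlKind; offset?: string };
}

// Hypothetical client-side state; the protocol does not prescribe this shape.
interface ClientState {
  entities: Map<string, unknown>;
  resumeFrom: string | undefined; // offset to re-read from after a reset
  inSnapshot: boolean;
}

// Type guard: true when the input carries a recognised headers.control value.
function isControlMessage(input: unknown): input is ControlMessage {
  if (typeof input !== "object" || input === null) return false;
  const headers = (input as Record<string, unknown>)["headers"];
  if (typeof headers !== "object" || headers === null) return false;
  const h = headers as Record<string, unknown>;
  const control = h["control"];
  const offset = h["offset"];
  const known =
    control === "snapshot-start" || control === "snapshot-end" || control === "reset";
  return known && (offset === undefined || typeof offset === "string");
}

function applyControl(state: ClientState, msg: ControlMessage): void {
  switch (msg.headers.control) {
    case "reset":
      // Drop everything materialized so far and remember where to restart.
      state.entities.clear();
      state.resumeFrom = msg.headers.offset;
      state.inSnapshot = false;
      break;
    case "snapshot-start":
      state.inSnapshot = true;
      break;
    case "snapshot-end":
      state.inSnapshot = false;
      break;
  }
}
```

Because `offset` is optional, `resumeFrom` may remain `undefined` after a reset; what a client does in that case (for example, re-reading from the start of the stream) is an application decision.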
256
+ ## 5. Message Format
257
+
258
+ ### 5.1. Change Message Structure
259
+
260
+ Change messages **MUST** be valid JSON objects with the following structure:
261
+
262
+ ```json
263
+ {
264
+ "type": "<entity-type>",
265
+ "key": "<entity-key>",
266
+ "value": <any-json-value>,
267
+ "old_value": <any-json-value>, // optional
268
+ "headers": {
269
+ "operation": "insert" | "update" | "delete",
270
+ "txid": "<transaction-id>", // optional
271
+ "timestamp": "<rfc3339-timestamp>" // optional
272
+ }
273
+ }
274
+ ```
275
+
276
+ **Field Requirements:**
277
+
278
+ - `type`: **MUST** be a non-empty string
279
+ - `key`: **MUST** be a non-empty string
280
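+ - `value`: **MUST** be present for `insert` and `update` operations and **MAY** be omitted for `delete` operations; if present, **MUST** be a valid JSON value (string, number, boolean, null, array, or object)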
281
+ - `old_value`: **MAY** be present; if present, **MUST** be a valid JSON value
282
+ - `headers.operation`: **MUST** be one of `"insert"`, `"update"`, or `"delete"`
283
+ - `headers.txid`: **MAY** be present; if present, **MUST** be a non-empty string
284
+ - `headers.timestamp`: **MAY** be present; if present, **MUST** be a valid RFC 3339 timestamp
285
+
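The field requirements above could be checked at runtime roughly as follows. This is a sketch under stated assumptions: `checkChangeMessage`, `isPlainObject`, and `RFC3339_SHAPE` are hypothetical names, the input is assumed to come from `JSON.parse` (so every value is already a valid JSON value), and the timestamp check only tests the textual shape of an RFC 3339 timestamp rather than full calendar validity.

```typescript
// Shape-only check for RFC 3339 timestamps (does not reject e.g. month 13).
const RFC3339_SHAPE =
  /^\d{4}-\d{2}-\d{2}[Tt]\d{2}:\d{2}:\d{2}(\.\d+)?([Zz]|[+-]\d{2}:\d{2})$/;

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns a list of problems; an empty list means the message passed these checks.
function checkChangeMessage(input: unknown): string[] {
  if (!isPlainObject(input)) return ["message must be a JSON object"];
  const errors: string[] = [];

  const type = input["type"];
  const key = input["key"];
  if (typeof type !== "string" || type.length === 0) {
    errors.push("type must be a non-empty string");
  }
  if (typeof key !== "string" || key.length === 0) {
    errors.push("key must be a non-empty string");
  }

  const headers = input["headers"];
  if (!isPlainObject(headers)) {
    errors.push("headers must be an object");
    return errors;
  }

  const op = headers["operation"];
  if (op !== "insert" && op !== "update" && op !== "delete") {
    errors.push('headers.operation must be "insert", "update", or "delete"');
  } else if (op !== "delete" && !("value" in input)) {
    // value may be omitted only for delete operations
    errors.push("value is required for insert and update");
  }

  const txid = headers["txid"];
  if (txid !== undefined && (typeof txid !== "string" || txid.length === 0)) {
    errors.push("headers.txid must be a non-empty string when present");
  }

  const timestamp = headers["timestamp"];
  if (
    timestamp !== undefined &&
    (typeof timestamp !== "string" || !RFC3339_SHAPE.test(timestamp))
  ) {
    errors.push("headers.timestamp must look like an RFC 3339 timestamp");
  }
  return errors;
}
```

Returning a list of problems rather than throwing lets the caller decide whether to reject or merely log a malformed message.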
286
+ ### 5.2. Control Message Structure
287
+
288
+ Control messages **MUST** be valid JSON objects with the following structure:
289
+
290
+ ```json
291
+ {
292
+ "headers": {
293
+ "control": "snapshot-start" | "snapshot-end" | "reset",
294
+ "offset": "<stream-offset>" // optional
295
+ }
296
+ }
297
+ ```
298
+
299
+ **Field Requirements:**
300
+
301
+ - `headers.control`: **MUST** be one of `"snapshot-start"`, `"snapshot-end"`, or `"reset"`
302
+ - `headers.offset`: **MAY** be present; if present, **MUST** be a valid stream offset string
303
+
304
+ ## 6. State Materialization
305
+
306
+ Clients materialize state by applying change messages sequentially in stream order. The materialization process **MUST**:
307
+
308
+ 1. Process messages in the order they appear in the stream
309
+ 2. For change messages:
310
+ - Apply `insert` operations by storing the entity at `type`/`key`
311
+ - Apply `update` operations by replacing the entity at `type`/`key`
312
+ - Apply `delete` operations by removing the entity at `type`/`key`
313
+ 3. For control messages:
314
+ - Handle control signals according to application logic (e.g., clear state on `reset`)
315
+
316
+ The protocol does not prescribe how state is stored. Implementations **MAY** use:
317
+
318
+ - In-memory maps (for simple cases)
319
+ - IndexedDB (for browser persistence)
320
+ - SQLite (for local databases)
321
+ - TanStack DB collections (for query interfaces)
322
+ - Custom storage backends
323
+
324
+ **Example Materialization:**
325
+
326
+ Given the following change messages:
327
+
328
+ ```json
329
+ [
330
+ {
331
+ "type": "user",
332
+ "key": "1",
333
+ "value": { "name": "Alice" },
334
+ "headers": { "operation": "insert" }
335
+ },
336
+ {
337
+ "type": "user",
338
+ "key": "2",
339
+ "value": { "name": "Bob" },
340
+ "headers": { "operation": "insert" }
341
+ },
342
+ {
343
+ "type": "user",
344
+ "key": "1",
345
+ "value": { "name": "Alice Smith" },
346
+ "headers": { "operation": "update" }
347
+ }
348
+ ]
349
+ ```
350
+
351
+ The materialized state after processing would be:
352
+
353
+ ```
354
+ type: "user"
355
+ key: "1" -> { "name": "Alice Smith" }
356
+ key: "2" -> { "name": "Bob" }
357
+ ```
358
+
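The materialization steps above can be sketched with nested in-memory maps. The names `Change`, `State`, `applyChange`, and `materialize` are hypothetical. Two behaviours here are assumptions the protocol does not mandate: an `update` for a key that does not yet exist is stored as if it were an insert, and an `insert` for an existing key overwrites it.

```typescript
type Operation = "insert" | "update" | "delete";

interface Change {
  type: string;
  key: string;
  value?: unknown;
  headers: { operation: Operation };
}

// Outer map: entity type -> collection; inner map: entity key -> current value.
type State = Map<string, Map<string, unknown>>;

function applyChange(state: State, msg: Change): void {
  let collection = state.get(msg.type);
  if (msg.headers.operation === "delete") {
    // Deleting an unknown entity is treated as a no-op.
    if (collection !== undefined) collection.delete(msg.key);
    return;
  }
  if (collection === undefined) {
    collection = new Map<string, unknown>();
    state.set(msg.type, collection);
  }
  // insert and update both store the full new value at type/key.
  collection.set(msg.key, msg.value);
}

// Applies messages strictly in the order given, mirroring stream order.
function materialize(messages: Change[]): State {
  const state: State = new Map<string, Map<string, unknown>>();
  for (const msg of messages) applyChange(state, msg);
  return state;
}
```

Applied to the three example messages above, `materialize` yields a `user` collection in which key `"1"` holds `{ "name": "Alice Smith" }` and key `"2"` holds `{ "name": "Bob" }`.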
359
+ ## 7. Schema Validation
360
+
361
+ Implementations **MAY** validate change message values using Standard Schema [STANDARD-SCHEMA]. Standard Schema provides a vendor-neutral format that works with multiple schema libraries (Zod, Valibot, ArkType).
362
+
363
+ When schema validation is enabled:
364
+
365
+ - Change messages **SHOULD** be validated against the schema for their entity type before materialization
366
+ - Invalid messages **MAY** be rejected or logged according to implementation policy
367
+ - Schema validation **SHOULD NOT** block stream processing for other entity types
368
+
369
+ The protocol does not require schema validation, but implementations **SHOULD** provide validation capabilities for production use.
370
+
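Per-type validation might look like the sketch below. It declares a reduced local copy of the Standard Schema v1 interface (a `~standard` property with `version`, `vendor`, and `validate`) instead of importing a schema library, and `userSchema` is a hand-written stand-in for a schema that Zod, Valibot, or ArkType would normally supply. The function name `validateChangeValue` and the choice to skip asynchronous validators are assumptions made for brevity.

```typescript
// Reduced local copy of the Standard Schema v1 shape, limited to what this sketch uses.
interface Issue {
  message: string;
}
type Result<T> = { value: T; issues?: undefined } | { issues: ReadonlyArray<Issue> };
interface StandardSchema<T> {
  "~standard": {
    version: 1;
    vendor: string;
    validate(value: unknown): Result<T> | Promise<Result<T>>;
  };
}

// Hand-written example schema: an object with a string `name`.
const userSchema: StandardSchema<{ name: string }> = {
  "~standard": {
    version: 1,
    vendor: "example",
    validate(value: unknown): Result<{ name: string }> {
      if (typeof value === "object" && value !== null) {
        const name = (value as Record<string, unknown>)["name"];
        if (typeof name === "string") return { value: { name } };
      }
      return { issues: [{ message: "expected an object with a string name" }] };
    },
  },
};

// Returns issue messages for `value`; an empty list means it passed or no schema applies.
function validateChangeValue(
  schemas: Map<string, StandardSchema<unknown>>,
  type: string,
  value: unknown
): string[] {
  const schema = schemas.get(type);
  if (schema === undefined) return []; // no schema registered for this entity type
  const result = schema["~standard"].validate(value);
  if (result instanceof Promise) {
    return ["asynchronous validation is not handled in this sketch"];
  }
  if (result.issues !== undefined) return result.issues.map((issue) => issue.message);
  return [];
}
```

Because schemas are looked up by entity type, a failing `user` value does not affect messages of other types, which is in line with the guidance that validation should not block processing for other entity types.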
371
+ ## 8. Use Cases
372
+
373
+ The State Protocol enables several common patterns:
374
+
375
+ ### 8.1. Key/Value Store
376
+
377
+ Simple synced configuration with optimistic updates:
378
+
379
+ ```json
380
+ {
381
+ "type": "config",
382
+ "key": "theme",
383
+ "value": "dark",
384
+ "headers": { "operation": "insert" }
385
+ }
386
+ ```
387
+
388
+ ### 8.2. Presence Tracking
389
+
390
+ Real-time online status with heartbeat semantics:
391
+
392
+ ```json
393
+ {
394
+ "type": "presence",
395
+ "key": "user:123",
396
+ "value": { "status": "online", "lastSeen": 1705312200000 },
397
+ "headers": { "operation": "update" }
398
+ }
399
+ ```
400
+
401
+ ### 8.3. Multi-Type Streams
402
+
403
+ Chat rooms with users, messages, reactions, and receipts:
404
+
405
+ ```json
406
+ [
407
+ {
408
+ "type": "user",
409
+ "key": "user:123",
410
+ "value": { "name": "Alice" },
411
+ "headers": { "operation": "insert" }
412
+ },
413
+ {
414
+ "type": "message",
415
+ "key": "msg:456",
416
+ "value": { "userId": "user:123", "text": "Hello!" },
417
+ "headers": { "operation": "insert" }
418
+ },
419
+ {
420
+ "type": "reaction",
421
+ "key": "reaction:789",
422
+ "value": { "messageId": "msg:456", "emoji": "👍" },
423
+ "headers": { "operation": "insert" }
424
+ }
425
+ ]
426
+ ```
427
+
428
+ ### 8.4. Feature Flags
429
+
430
+ Real-time configuration propagation:
431
+
432
+ ```json
433
+ {
434
+ "type": "flag",
435
+ "key": "new-editor",
436
+ "value": {
437
+ "enabled": true,
438
+ "rollout": { "type": "percentage", "value": 50 }
439
+ },
440
+ "headers": { "operation": "update" }
441
+ }
442
+ ```
443
+
444
+ ## 9. Security Considerations
445
+
446
+ ### 9.1. Message Validation
447
+
448
+ Clients **MUST** validate that received messages conform to the message format specified in this document. Malformed messages **SHOULD** be rejected to prevent injection attacks.
449
+
450
+ ### 9.2. Schema Validation
451
+
452
+ When schema validation is enabled, implementations **MUST** validate change message values before materialization. Invalid values **SHOULD** be rejected to prevent type confusion attacks.
453
+
454
+ ### 9.3. Untrusted Content
455
+
456
+ As specified in the base protocol [PROTOCOL], clients **MUST** treat stream contents as untrusted input. This applies to both the message structure and the values within change messages.
457
+
458
+ ### 9.4. Type and Key Validation
459
+
460
+ Implementations **SHOULD** validate that `type` and `key` fields contain only expected values to prevent injection of unauthorized entity types or keys.
461
+
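A simple allow-list check for `type` and `key` might look like this. The list of permitted types, the key pattern, and the 128-character limit are illustrative assumptions; a real deployment would derive them from its own data model.

```typescript
// Hypothetical allow-list of entity types this client is prepared to materialize.
const allowedTypes: ReadonlyArray<string> = ["user", "message", "reaction"];

// Hypothetical key policy: 1 to 128 characters from a conservative character set.
const keyPattern = /^[A-Za-z0-9:_-]{1,128}$/;

function isExpectedTypeAndKey(type: string, key: string): boolean {
  return allowedTypes.indexOf(type) !== -1 && keyPattern.test(key);
}
```

Messages that fail this check can be dropped or logged before they reach materialization.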
462
+ ### 9.5. Transaction Identifiers
463
+
464
+ The `txid` field is opaque to clients. Servers **MAY** use transaction identifiers for grouping related changes, but clients **MUST NOT** rely on transaction semantics unless explicitly documented by the server.
465
+
466
+ ## 10. IANA Considerations
467
+
468
+ This document does not require any IANA registrations. The protocol uses JSON message formats and operates within the context of the Durable Streams Protocol [PROTOCOL], which defines the necessary HTTP headers and content types.
469
+
470
+ ## 11. References
471
+
472
+ ### 11.1. Normative References
473
+
474
+ **[PROTOCOL]**
475
+ Durable Streams Protocol. ElectricSQL, 2025.
476
+ <https://github.com/electric-sql/durable-streams/blob/main/PROTOCOL.md>
477
+
478
+ **[RFC2119]**
479
+ Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997, <https://www.rfc-editor.org/info/rfc2119>.
480
+
481
+ **[RFC3339]**
482
+ Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, <https://www.rfc-editor.org/info/rfc3339>.
483
+
484
+ **[RFC8174]**
485
+ Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May 2017, <https://www.rfc-editor.org/info/rfc8174>.
486
+
487
+ **[STANDARD-SCHEMA]**
488
+ Standard Schema Specification.
489
+ <https://github.com/standard-schema/spec>
490
+
491
+ ### 11.2. Informative References
492
+
493
+ **[JSON-SCHEMA]**
494
+ Wright, A., Andrews, H., and B. Hutton, "JSON Schema: A Media Type for Describing JSON Documents", draft-wright-json-schema-00 (work in progress).
495
+
496
+ ---
497
+
498
+ **Full Copyright Statement**
499
+
500
+ Copyright (c) 2025 ElectricSQL
501
+
502
+ This document and the information contained herein are provided on an "AS IS" basis. ElectricSQL disclaims all warranties, express or implied, including but not limited to any warranty that the use of the information herein will not infringe any rights or any implied warranties of merchantability or fitness for a particular purpose.