@bluelibs/runner 6.3.0 → 6.3.1

@@ -0,0 +1,337 @@
1
+ # Serializer Protocol (Internal)
2
+
3
+ ← [Back to main README](../README.md)
4
+
5
+ This document describes the JSON wire format produced and accepted by `Serializer`.
6
+
7
+ Scope:
8
+
9
+ - This is an internal protocol reference for Runner contributors and maintainers.
10
+ - It is not part of the public, end-user documentation.
11
+ - Backwards compatibility is best-effort; treat `version` as the contract boundary.
12
+
13
+ ---
14
+
15
+ ## Configuration options
16
+
17
+ The serializer accepts the following options:
18
+
19
+ | Option | Type | Default | Description |
20
+ | ------------------------ | ---------- | ---------- | ---------------------------------------------------------------------- |
21
+ | `maxDepth` | `number` | `1000` | Maximum recursion depth for serialization/deserialization |
22
+ | `maxRegExpPatternLength` | `number` | `1024` | Maximum allowed RegExp pattern length |
23
+ | `allowUnsafeRegExp` | `boolean` | `false` | Allow patterns that fail the RegExp safety heuristic |
24
+ | `allowedTypes` | `string[]` or `null` | `null` | Whitelist of type IDs allowed during deserialization (`null` = all) |
25
+ | `symbolPolicy` | `string` | `allow-all` | Symbol deserialization policy: `allow-all`, `well-known-only`, `disabled` |
26
+ | `pretty` | `boolean` | `false` | Enable indented JSON output |
27
+
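As a rough illustration of the table above, the options can be written out as a typed object. The `SerializerOptionsSketch` interface and both constants below are hypothetical local names that only mirror the table; they are not imported from the library, and how the real `Serializer` receives its options is not shown here.

```typescript
// Hypothetical local types mirroring the options table (not the library's own types).
type SymbolPolicy = "allow-all" | "well-known-only" | "disabled";

interface SerializerOptionsSketch {
  maxDepth: number;
  maxRegExpPatternLength: number;
  allowUnsafeRegExp: boolean;
  allowedTypes: readonly string[] | null;
  symbolPolicy: SymbolPolicy;
  pretty: boolean;
}

// Defaults exactly as listed in the table.
const defaultOptionsSketch: SerializerOptionsSketch = {
  maxDepth: 1000,
  maxRegExpPatternLength: 1024,
  allowUnsafeRegExp: false,
  allowedTypes: null,
  symbolPolicy: "allow-all",
  pretty: false,
};

// An illustrative stricter profile for untrusted input:
// shallow depth, a short type whitelist, and no symbols.
const untrustedInputOptionsSketch: SerializerOptionsSketch = {
  ...defaultOptionsSketch,
  maxDepth: 64,
  allowedTypes: ["Date", "Map"],
  symbolPolicy: "disabled",
};
```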
28
+ ---
29
+
30
+ ## Two formats
31
+
32
+ The serializer understands two payload shapes:
33
+
34
+ 1. **Tree format** (plain JSON, non-graph)
35
+ 2. **Graph format** (identity-preserving, supports cycles)
36
+
37
+ The implementation chooses graph format when it needs to preserve identity or represent cycles; otherwise it may emit a plain JSON value.
38
+
39
+ ---
40
+
41
+ ## Tree format
42
+
43
+ Any JSON value is a valid tree payload:
44
+
45
+ - primitives: `string | number | boolean | null`
46
+ - arrays
47
+ - objects
48
+
49
+ ### Typed values (tree format)
50
+
51
+ Typed values are encoded as:
52
+
53
+ ```json
54
+ { "__type": "Date", "value": "2024-01-01T00:00:00.000Z" }
55
+ ```
56
+
57
+ The type id is resolved via the internal type registry.
58
+
59
+ To avoid collisions with plain user objects in tree mode (`stringify`/`parse`),
60
+ object keys named `__type` and `__graph` are escaped on serialization and
61
+ unescaped during tree-format deserialization.
62
+
63
+ - Escape prefix: `$runner.escape::`
64
+
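The key escaping described above could look roughly like the sketch below. Only the prefix and the two reserved key names come from this document; the helper names are hypothetical, and the rule that keys already starting with the prefix are escaped again (so that unescaping stays unambiguous) is an assumption.

```typescript
const ESCAPE_PREFIX = "$runner.escape::";
const RESERVED_TREE_KEYS = ["__type", "__graph"];

// Assumption: a key is escaped when it is reserved, or when it already
// starts with the prefix (otherwise unescaping could not round-trip it).
function escapeTreeKey(key: string): string {
  const needsEscape =
    RESERVED_TREE_KEYS.indexOf(key) !== -1 || key.indexOf(ESCAPE_PREFIX) === 0;
  return needsEscape ? ESCAPE_PREFIX + key : key;
}

// Removes exactly one layer of escaping, if present.
function unescapeTreeKey(key: string): string {
  return key.indexOf(ESCAPE_PREFIX) === 0
    ? key.slice(ESCAPE_PREFIX.length)
    : key;
}
```

With this scheme a user object such as `{ "__type": "x" }` is written with the key `$runner.escape::__type`, so it cannot be mistaken for a typed record on the way back in.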
65
+ ### Safety rules (all formats)
66
+
67
+ When deserializing payload objects (tree or graph), keys listed in the unsafe-key set are filtered out to prevent prototype pollution:
68
+
69
+ - `__proto__`
70
+ - `constructor`
71
+ - `prototype`
72
+
73
+ Filtered keys do not appear in the resulting object.
74
+
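A minimal sketch of the filtering rule above, assuming the unsafe-key set is exactly the three keys listed. `filterUnsafeKeys` is a hypothetical helper name, not the library's internal function.

```typescript
const UNSAFE_KEYS = ["__proto__", "constructor", "prototype"];

// Copies own enumerable keys into a fresh object, skipping unsafe ones.
// Because "__proto__" is never assigned, the prototype of `output` is untouched.
function filterUnsafeKeys(
  input: Record<string, unknown>,
): Record<string, unknown> {
  const output: Record<string, unknown> = {};
  for (const key of Object.keys(input)) {
    if (UNSAFE_KEYS.indexOf(key) !== -1) continue;
    output[key] = input[key];
  }
  return output;
}
```

Note that `JSON.parse` creates `__proto__` as an ordinary own property, which is why it shows up in `Object.keys` and has to be dropped explicitly.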
75
+ ---
76
+
77
+ ## Graph format
78
+
79
+ Graph payloads are objects with the following shape:
80
+
81
+ ```ts
82
+ type GraphPayload = {
83
+ __graph: true;
84
+ version: 1;
85
+ root: SerializedValue;
86
+ nodes: Record<string, SerializedNode>;
87
+ };
88
+ ```
89
+
90
+ Notes:
91
+
92
+ - `nodes` is treated as a key/value table of node ids to node records.
93
+ - `nodes` is normalized into a null-prototype record during deserialization.
94
+
95
+ ### References
96
+
97
+ References are objects of the shape:
98
+
99
+ ```json
100
+ { "__ref": "obj_1" }
101
+ ```
102
+
103
+ During graph deserialization, references are resolved against `nodes`.
104
+
105
+ Safety rules:
106
+
107
+ - reference objects must be canonical (`{ "__ref": "..." }` with no extra fields)
108
+ - unsafe reference ids (`__proto__`, `constructor`, `prototype`) are rejected
109
+
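The two reference safety rules can be sketched as a single validator. `readRefId` and the error messages are hypothetical; only the canonical shape and the rejected ids are taken from the rules above.

```typescript
const UNSAFE_REF_IDS = ["__proto__", "constructor", "prototype"];

// Returns the reference id, or throws if the value is not a canonical,
// safe reference object.
function readRefId(value: unknown): string {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    throw new Error("Reference must be an object");
  }
  const rec = value as Record<string, unknown>;
  const keys = Object.keys(rec);
  const id = rec.__ref;
  // Canonical means: exactly one key, "__ref", holding a string.
  if (keys.length !== 1 || keys[0] !== "__ref" || typeof id !== "string") {
    throw new Error("Reference must be exactly { __ref: string }");
  }
  if (UNSAFE_REF_IDS.indexOf(id) !== -1) {
    throw new Error("Unsafe reference id: " + id);
  }
  return id;
}
```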
110
+ ### Node kinds
111
+
112
+ Each `nodes[id]` value is a node record with a `kind` discriminator:
113
+
114
+ #### `object`
115
+
116
+ ```json
117
+ { "kind": "object", "value": { "a": 1, "b": { "__ref": "obj_2" } } }
118
+ ```
119
+
120
+ - `value` is an object whose values are `SerializedValue`.
121
+ - Unsafe keys are filtered during deserialization.
122
+
123
+ #### `array`
124
+
125
+ ```json
126
+ { "kind": "array", "value": [1, { "__ref": "obj_1" }, 3] }
127
+ ```
128
+
129
+ - `value` is an array of `SerializedValue`.
130
+ - malformed array-node payloads (non-array `value`) fail fast.
131
+
132
+ #### `type`
133
+
134
+ ```json
135
+ { "kind": "type", "type": "Map", "value": [["k", "v"]] }
136
+ ```
137
+
138
+ - `type` is the type id in the registry.
139
+ - `value` is the serialized payload for that type, which is recursively deserialized.
140
+
141
+ Typed values can also appear inline (outside `nodes`) using the tree type-record shape:
142
+
143
+ ```json
144
+ { "__type": "RegExp", "value": { "pattern": "test", "flags": "gi" } }
145
+ ```
146
+
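The three node kinds above map naturally onto a discriminated union. The sketch below is illustrative: the type and function names are hypothetical, `SerializedValue` is simplified to `unknown`, and the error messages are invented. It shows the fail-fast check on array nodes mentioned earlier.

```typescript
type SerializedValueSketch = unknown; // simplified placeholder

type SerializedNodeSketch =
  | { kind: "object"; value: Record<string, SerializedValueSketch> }
  | { kind: "array"; value: SerializedValueSketch[] }
  | { kind: "type"; type: string; value: SerializedValueSketch };

// Validates a raw node record and narrows it to one of the three kinds.
function parseNodeSketch(raw: unknown): SerializedNodeSketch {
  if (raw === null || typeof raw !== "object") {
    throw new Error("Node must be an object");
  }
  const rec = raw as Record<string, unknown>;
  const value = rec.value;
  const typeId = rec.type;
  switch (rec.kind) {
    case "object":
      if (value === null || typeof value !== "object" || Array.isArray(value)) {
        throw new Error("Object node requires an object value");
      }
      return { kind: "object", value: value as Record<string, unknown> };
    case "array":
      // Malformed array nodes fail fast instead of being coerced.
      if (!Array.isArray(value)) {
        throw new Error("Array node requires an array value");
      }
      return { kind: "array", value };
    case "type":
      if (typeof typeId !== "string") {
        throw new Error("Type node requires a string type id");
      }
      return { kind: "type", type: typeId, value };
    default:
      throw new Error("Unknown node kind");
  }
}
```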
147
+ ### Type strategies
148
+
149
+ Custom types can use one of two serialization strategies:
150
+
151
+ - **identity** (default): The type is stored as a graph node, preserving object identity across multiple references.
152
+ - **value**: The type is serialized inline without identity tracking. Used for immutable/value-like types (e.g., `Date`, `RegExp`).
153
+
154
+ ### Type registration (`addType()`)
155
+
156
+ Type registration controls how values are mapped to and from the wire shapes:
157
+
158
+ - inline typed record: `{ "__type": "MyType", "value": ... }` (value strategy)
159
+ - typed node in `nodes`: `{ kind: "type", type: "MyType", value: ... }` (identity strategy)
160
+
161
+ Custom type registration is explicit: provide `id`, `is`, `serialize`, and
162
+ `deserialize` via `addType({ ... })` (see `src/serializer/types.ts`).
163
+
164
+ #### Recommended: explicit type definition
165
+
166
+ Use an explicit `TypeDefinition` when you want the contract to be obvious and
167
+ fully controlled (recommended for docs and library code).
168
+
169
+ ```ts
170
+ import { Serializer } from "@bluelibs/runner";
171
+
172
+ type DistanceUnit = "m" | "km";
173
+
174
+ class Distance {
175
+ constructor(
176
+ public value: number,
177
+ public unit: DistanceUnit,
178
+ ) {}
179
+ }
180
+
181
+ const serializer = new Serializer();
182
+
183
+ serializer.addType<Distance, { value: number; unit: DistanceUnit }>({
184
+ id: "Distance",
185
+ is: (value): value is Distance => value instanceof Distance,
186
+ serialize: (d) => ({ value: d.value, unit: d.unit }),
187
+ deserialize: (payload) => {
188
+ if (
189
+ typeof payload.value !== "number" ||
190
+ (payload.unit !== "m" && payload.unit !== "km")
191
+ ) {
192
+ throw new Error("Invalid Distance payload");
193
+ }
194
+ return new Distance(payload.value, payload.unit);
195
+ },
196
+ strategy: "value",
197
+ });
198
+ ```
199
+
200
+ Notes:
201
+
202
+ - `is(...)` is a runtime predicate; it can use `instanceof`, duck-typing, or any other guard.
203
+ - Prefer validating input payloads in `deserialize(...)` and failing fast on unexpected shapes.
204
+ - Omit `strategy: "value"` to fall back to the default identity strategy (optionally with a `create()` factory) when you need identity preservation across references/cycles.
205
+
206
+ #### About `addType<...>` generics
207
+
208
+ You often do not need to write generics explicitly, because TypeScript can infer
209
+ them from your `serialize` and `deserialize` functions in the object form.
210
+
211
+ ```ts
212
+ serializer.addType({
213
+ id: "Distance",
214
+ is: (value): value is Distance => value instanceof Distance,
215
+ serialize: (d) => ({ value: d.value, unit: d.unit }),
216
+ deserialize: (payload) => new Distance(payload.value, payload.unit),
217
+ strategy: "value",
218
+ });
219
+ ```
220
+
221
+ Explicit generics are still useful when you want stricter intent in docs or when
222
+ inference becomes too broad in complex definitions.
223
+
224
+ #### Non-class objects are supported
225
+
226
+ `is(...)` is just a runtime predicate. It does not require classes or
227
+ `instanceof`. You can use duck typing for plain objects.
228
+
229
+ ```ts
230
+ type Money = {
231
+ kind: "money";
232
+ amount: number;
233
+ currency: "USD" | "EUR";
234
+ };
235
+
236
+ serializer.addType({
237
+ id: "Money",
238
+ is: (value): value is Money => {
239
+ if (!value || typeof value !== "object") return false;
240
+ const rec = value as Record<string, unknown>;
241
+ return (
242
+ rec.kind === "money" &&
243
+ typeof rec.amount === "number" &&
244
+ (rec.currency === "USD" || rec.currency === "EUR")
245
+ );
246
+ },
247
+ serialize: (m) => m,
248
+ deserialize: (payload) => {
249
+ if (
250
+ !payload ||
251
+ typeof payload !== "object" ||
252
+ (payload as Record<string, unknown>).kind !== "money"
253
+ ) {
254
+ throw new Error("Invalid Money payload");
255
+ }
256
+ return payload as Money;
257
+ },
258
+ strategy: "value",
259
+ });
260
+ ```
261
+
262
+ ### Circular reference handling
263
+
264
+ The deserializer uses a `resolved` cache to handle circular references safely:
265
+
266
+ 1. When a node is first encountered, a placeholder is stored in `resolved` immediately.
267
+ 2. Child values are recursively deserialized.
268
+ 3. If a `__ref` points to an already-resolving node, the cached placeholder is returned.
269
+ 4. This breaks infinite loops and preserves identity.
270
+
271
+ For typed nodes with a `create()` factory, the placeholder is the result of `create()`. After full deserialization, properties are merged into the placeholder to maintain identity across circular references.
272
+
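The placeholder-first resolution steps above can be sketched as follows. This is a simplified illustration, not the library's deserializer: it handles only `object` and `array` nodes, skips unsafe-key filtering and depth limits, and all names (`resolveGraphSketch`, `GraphNodeSketch`, and so on) are hypothetical.

```typescript
type RefSketch = { __ref: string };
type GraphNodeSketch =
  | { kind: "object"; value: Record<string, unknown> }
  | { kind: "array"; value: unknown[] };

function isRefSketch(value: unknown): value is RefSketch {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof (value as Record<string, unknown>).__ref === "string"
  );
}

function resolveGraphSketch(
  root: unknown,
  nodes: Record<string, GraphNodeSketch>,
): unknown {
  // Cache of node id -> (possibly still incomplete) resolved value.
  const resolved: Record<string, unknown> = Object.create(null);

  function resolveValue(value: unknown): unknown {
    if (!isRefSketch(value)) return value; // non-reference values pass through
    const id = value.__ref;
    if (id in resolved) return resolved[id]; // step 3: reuse the placeholder
    if (!Object.prototype.hasOwnProperty.call(nodes, id)) {
      throw new Error("Unknown reference: " + id);
    }
    const node = nodes[id];
    if (node.kind === "array") {
      const arr: unknown[] = [];
      resolved[id] = arr; // step 1: store the placeholder before recursing
      for (const item of node.value) arr.push(resolveValue(item)); // step 2
      return arr;
    }
    const obj: Record<string, unknown> = {};
    resolved[id] = obj; // step 1
    for (const key of Object.keys(node.value)) {
      obj[key] = resolveValue(node.value[key]); // step 2
    }
    return obj;
  }

  return resolveValue(root);
}
```

Because the placeholder is registered before its children are visited, a node that refers back to itself receives the same object instance rather than recursing forever.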
273
+ ---
274
+
275
+ ## Serialization rules (important edge cases)
276
+
277
+ These rules are applied by both formats:
278
+
279
+ - `undefined` is preserved via `{ "__type": "Undefined", "value": null }`.
280
+ - non-finite numbers (`NaN`, `Infinity`, `-Infinity`) are preserved via `{ "__type": "NonFiniteNumber", "value": "NaN" | "Infinity" | "-Infinity" }`.
281
+ - `bigint` is preserved via `{ "__type": "BigInt", "value": "123" }`.
282
+ - deserialization validates payload format as an integer string before calling `BigInt(...)`
283
+ - `symbol` is supported for:
284
+ - **Global symbols**: `Symbol.for(key)` is preserved via `{ "__type": "Symbol", "value": { "kind": "For", "key": "..." } }`.
285
+ - **Well-known symbols**: Standard symbols (e.g. `Symbol.iterator`) are preserved via `{ "__type": "Symbol", "value": { "kind": "WellKnown", "key": "..." } }`.
286
+ - Unique symbols (`Symbol('...')`) and functions are rejected with a thrown error.
287
+
288
+ Previous versions of the protocol serialized these as `null` (matching `JSON.stringify`). Modern versions prefer data integrity for supported types and strict failure for prohibited types.
289
+
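A small sketch of how the `undefined`, non-finite number, and `bigint` rules above could be encoded. The wire shapes come from the list; the helper names and the exact integer-string pattern (optional leading minus, decimal digits only) are assumptions. Symbols are left out to keep the example short.

```typescript
type TypedRecordSketch = { __type: string; value: unknown };

// Returns a typed record for special values, or null when the value
// needs no special handling in this sketch.
function encodeSpecialValue(value: unknown): TypedRecordSketch | null {
  if (value === undefined) {
    return { __type: "Undefined", value: null };
  }
  if (typeof value === "number" && !isFinite(value)) {
    // String(NaN) is "NaN"; String(-Infinity) is "-Infinity".
    return { __type: "NonFiniteNumber", value: String(value) };
  }
  if (typeof value === "bigint") {
    return { __type: "BigInt", value: String(value) };
  }
  if (typeof value === "function") {
    throw new Error("Functions cannot be serialized");
  }
  return null;
}

// Assumed validation applied to a BigInt payload before calling BigInt(...).
function isIntegerString(payload: unknown): boolean {
  return typeof payload === "string" && /^-?\d+$/.test(payload);
}
```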
290
+ ---
291
+
292
+ ## Depth and resource limits
293
+
294
+ Deserialization is guarded by a depth counter:
295
+
296
+ - `maxDepth` defaults to **1000**.
297
+ - Valid values: non-negative finite integers, or `Infinity` for no limit.
298
+ - Invalid values (`-5`, `NaN`, `undefined`) fall back to the default.
299
+ - When `depth > maxDepth`, deserialization throws `Maximum depth exceeded (N)`.
300
+
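The `maxDepth` rules above can be sketched as a normalization step plus a guard. Function names are hypothetical, and the sketch assumes that the `N` in the error message is the configured limit.

```typescript
const DEFAULT_MAX_DEPTH = 1000;

// Accepts non-negative finite integers or Infinity; anything else
// (negative numbers, NaN, fractions, non-numbers) falls back to the default.
function normalizeMaxDepth(input: unknown): number {
  if (input === Infinity) return Infinity;
  if (
    typeof input === "number" &&
    isFinite(input) &&
    input >= 0 &&
    Math.floor(input) === input
  ) {
    return input;
  }
  return DEFAULT_MAX_DEPTH;
}

// Assumption: the reported number is the configured limit.
function assertDepth(depth: number, maxDepth: number): void {
  if (depth > maxDepth) {
    throw new Error("Maximum depth exceeded (" + maxDepth + ")");
  }
}
```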
301
+ RegExp payloads are validated:
302
+
303
+ - `maxRegExpPatternLength` defaults to **1024**; `Infinity` disables length checking.
304
+ - A safety heuristic rejects patterns with nested quantifiers (e.g., `(a+)+`) and dangerous quantified overlapping alternations (e.g., `^(a|aa)+$`) unless `allowUnsafeRegExp` is `true`.
305
+ - `flags` may only contain characters from `dgimsuvy`, each at most once.
306
+
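The length and flag checks above can be sketched as two small validators. The names and error messages are hypothetical, and the nested-quantifier safety heuristic is deliberately not reproduced here.

```typescript
const ALLOWED_REGEXP_FLAGS = "dgimsuvy";

// Rejects unknown flags and repeated flags.
function validateRegExpFlags(flags: string): void {
  const seen: Record<string, boolean> = Object.create(null);
  for (let i = 0; i < flags.length; i++) {
    const flag = flags.charAt(i);
    if (ALLOWED_REGEXP_FLAGS.indexOf(flag) === -1) {
      throw new Error("Unsupported RegExp flag: " + flag);
    }
    if (seen[flag]) {
      throw new Error("Duplicate RegExp flag: " + flag);
    }
    seen[flag] = true;
  }
}

// With maxLength = Infinity the comparison is never true,
// which matches "Infinity disables length checking".
function validateRegExpPatternLength(pattern: string, maxLength: number): void {
  if (pattern.length > maxLength) {
    throw new Error("RegExp pattern exceeds " + maxLength + " characters");
  }
}
```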
307
+ ### Security protections
308
+
309
+ - **Prototype pollution**: Keys `__proto__`, `constructor`, `prototype` are filtered from all objects.
310
+ - **Unknown types**: Type IDs must match registered types exactly; no dynamic resolution.
311
+ - **Type whitelist**: When `allowedTypes` is set, only listed types are allowed during deserialization.
312
+ - **Reference safety**: Reference objects must be canonical (`{ "__ref": "..." }`) and unsafe reference IDs are rejected.
313
+ - **RegExp hardening**: Pattern length limits, safety heuristic checks, strict flag validation (`dgimsuvy` unique set), and fail-fast invalid payload errors.
314
+ - **BigInt hardening**: Payload must be a valid integer string before `BigInt(...)` is evaluated.
315
+ - **Error custom field hardening**: Reserved/prototype-pollution/method-shadowing keys are filtered from Error custom fields.
316
+ - **Type registration hardening**: `addType()` validates `id`, `is`, `serialize`, and `deserialize` at runtime.
317
+
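The "unknown types" and "type whitelist" protections combine into one check before a typed payload is handed to its `deserialize` function. A minimal sketch, with a hypothetical helper name and a plain array standing in for the internal registry:

```typescript
// Throws unless the type id is registered (exact match) and, when a
// whitelist is configured, also listed in it.
function assertTypeAllowed(
  typeId: string,
  registeredIds: readonly string[],
  allowedTypes: readonly string[] | null,
): void {
  if (registeredIds.indexOf(typeId) === -1) {
    throw new Error("Unknown type: " + typeId);
  }
  // null means "no whitelist": every registered type is allowed.
  if (allowedTypes !== null && allowedTypes.indexOf(typeId) === -1) {
    throw new Error("Type not allowed: " + typeId);
  }
}
```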
318
+ ### Security model (current behavior)
319
+
320
+ 1. `deserialize()` interprets objects matching the graph envelope shape (`__graph: true`, `root`, `nodes`) as protocol payloads.
321
+ 2. If payloads do not match protocol guards, deserialization falls back to tree-format handling.
322
+ 3. Tree-format object-key escaping for `__type` and `__graph` protects `stringify`/`parse` round-trips for plain user objects.
323
+ 4. Malformed protocol payloads in guarded paths fail fast with explicit errors (for example invalid refs, invalid array node payloads, invalid type payloads).
324
+
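Steps 1 and 2 of the security model hinge on a shape guard for the graph envelope. The sketch below is one plausible reading of that guard (the flag is `true`, `root` is present, `nodes` is a non-array object); the function name is hypothetical and the real checks may differ in detail. `version` is intentionally not inspected, matching its informational status.

```typescript
// Returns true when the value looks like a graph envelope; callers would
// otherwise fall back to tree-format handling.
function looksLikeGraphPayload(value: unknown): boolean {
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return false;
  }
  const rec = value as Record<string, unknown>;
  const nodes = rec.nodes;
  return (
    rec.__graph === true &&
    "root" in rec &&
    nodes !== null &&
    typeof nodes === "object" &&
    !Array.isArray(nodes)
  );
}
```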
325
+ ---
326
+
327
+ ## Versioning
328
+
329
+ `version` is reserved for protocol changes. The serializer currently emits the following graph version:
330
+
331
+ - `version: 1`
332
+
333
+ Deserializer behavior (current):
334
+
335
+ - Graph payload detection uses `__graph: true` plus basic shape checks (`root` and `nodes` present/object).
336
+ - `version` is currently informational and not strictly enforced during deserialization.
337
+ - If/when protocol evolution is introduced, `version` will become the compatibility switch for forward/backward handling.