wscodec 0.3.0 → 0.3.1

Files changed (2)
  1. package/README.md +171 -33
  2. package/package.json +3 -2
package/README.md CHANGED
@@ -90,11 +90,15 @@ returns an `UnrealBlob` with:
90
90
  | `error` | `string \| null` | populated when structural decode failed |
91
91
  | `_raw` | `Uint8Array` | the input bytes, retained for pass-through serialize |
92
92
  | `_dirty` | `boolean` | set by mutating callers to force re-encode |
93
+ | `_recomputeSizes` | `boolean` | when true, every `tag.size` is rewritten from the actual value byte count on serialize. Set automatically by `jsonToBlob`; see [Editing](#editing) |
93
94
 
94
- `blob.serialize()` returns a `Uint8Array`. When `_dirty` is false it
95
- returns `_raw` verbatim (byte-identical pass-through). When `_dirty`
96
- is true it re-emits the property stream from `properties` via
97
- `writePropertyStream`.
95
+ `blob.serialize(options?)` returns a `Uint8Array`. When `_dirty` is
96
+ false it returns `_raw` verbatim (byte-identical pass-through). When
97
+ `_dirty` is true it re-emits the property stream from `properties` via
98
+ `writePropertyStream`. Pass `{ recomputeSizes: true }` (or set
99
+ `blob._recomputeSizes`) to recompute every `tag.size` from the actual
100
+ encoded value bytes — required after any edit that changes a
101
+ variable-length field.
98
102
 
99
103
  `blob.findProperty(name)` returns the first top-level property whose
100
104
  tag name matches, or `null`. It does NOT traverse into embedded
@@ -114,10 +118,10 @@ JavaScript shape depends on the tag's type:
114
118
  | `StrProperty`, `NameProperty` | string / `FName` |
115
119
  | `StructProperty` | `StructValue`. `.value` is either a plain object for known binary structs (`Vector`, `Quat`, `Transform`, ...), an `FGuid` instance for the `Guid` struct, or a nested property array for unknown structs |
116
120
  | `ArrayProperty`, `SetProperty` | `ArrayValue` / `SetValue` with `.elements` |
117
- | `MapProperty` | `MapValue` with `.entries: [[key, value], ...]` |
121
+ | `MapProperty` | `MapValue` with `.entries: [{ key, value }, ...]` and `.removed: [...]` |
118
122
  | `ObjectProperty`, `ClassProperty`, `Weak*`, `Lazy*`, `WSObjectProperty` | `ObjectRef` (kind + optional path/classPath/embedded stream) |
119
123
  | `SoftObjectProperty`, `SoftClassProperty` | `SoftObjectRef` (`assetPath`, `subPath`) |
120
- | `TextProperty` | `FTextValue` (handles UE4 FText history types -1, 0, 2, 4) |
124
+ | `TextProperty` | `FTextValue` (handles UE4 FText history types -1, 0, 1, 2, 4) |
121
125
  | anything wscodec couldn't structurally decode | `OpaqueValue`. Bytes retained verbatim |
122
126
 
123
127
  Submodule re-exports make the value classes importable directly:
@@ -126,6 +130,8 @@ Submodule re-exports make the value classes importable directly:
126
130
  import { ObjectRef, SoftObjectRef, FTextValue, OpaqueValue, StructValue } from 'wscodec';
127
131
  import { PropertyTag, ArrayValue, SetValue, MapValue } from 'wscodec';
128
132
  import { FName, FGuid } from 'wscodec';
133
+ import { blobToJSON, jsonToBlob, blobToJSONString, jsonStringToBlob,
134
+ jsonReplacer, jsonReviver } from 'wscodec';
129
135
  ```
130
136
 
131
137
  Lower-level helpers (`Cursor`, `Writer`, `readPropertyStream`,
@@ -155,11 +161,77 @@ The helper validates that both `read(cursor)` and `write(writer, value)`
155
161
  are functions. Register before calling `UnrealBlob.decode` on any blob
156
162
  that uses the type.
157
163
 
164
+ ### JSON conversion
165
+
166
+ The object tree round-trips through JSON. This is the recommended path
167
+ for editing: the tree becomes plain JSON, edits are plain JS object
168
+ mutations, and the JSON-to-blob pipeline handles size recomputation,
169
+ sentinel substitution for `-0` / `Infinity` / `NaN`, and base64 for the
170
+ small fraction of bytes that the codec doesn't structurally decode.
171
+
172
+ ```js
173
+ import {
174
+ UnrealBlob,
175
+ blobToJSON, jsonToBlob,
176
+ blobToJSONString, jsonStringToBlob,
177
+ } from 'wscodec';
178
+
179
+ const blob = UnrealBlob.decode(uncompressedBytes);
180
+
181
+ // Object-tree round trip (preserves -0 in memory via Object.is, but a
182
+ // naive JSON.stringify on the result would lose it; see below).
183
+ const obj = blobToJSON(blob);
184
+ const blob2 = jsonToBlob(obj);
185
+
186
+ // String round trip — use this whenever the JSON crosses a stringify
187
+ // boundary (file I/O, sockets, etc.). The sentinels guard non-finite
188
+ // numbers and -0 across the conversion.
189
+ const json = blobToJSONString(blob, 2 /* optional indent */);
190
+ const blob3 = jsonStringToBlob(json);
191
+
192
+ // blob3.serialize() reproduces the input bytes (modulo the wire's
193
+ // optional "inflated tag.size" lies, which jsonToBlob normalizes).
194
+ ```
195
+
196
+ `blobToJSON` produces a plain-object tree with:
197
+
198
+ - `FName` values flattened to bare strings (with metadata-object fallback only when `isUnicode`/`isNull`/`number` aren't defaults)
199
+ - `FGuid` flattened to its canonical 8-4-4-4-12 hex string
200
+ - `Int64Property` / `UInt64Property` / `DateTime` / `Timespan` as decimal strings
201
+ - `StructValue` discriminated by `form: "binary" | "propStream" | "decodeError"`
202
+ - `OpaqueValue` as `{ _opaque: true, bytes: <base64>, reason }`
203
+ - `bodyTrailing` as base64 when present
204
+ - `ArrayValue._perElementTrailings` (the `JianZhuInstYuanXings` per-piece placement cache) as `{ transforms: [[16 floats], …], ids: [u32, …], aux: [[16 floats], …] }` — see [Round-trip guarantees](#round-trip-guarantees) for the NaN-bit-preservation note
205
+
206
+ `jsonToBlob` returns an `UnrealBlob` with `_dirty = true` and
207
+ `_recomputeSizes = true`, so `blob.serialize()` will rewrite every
208
+ `tag.size` from the actual encoded value bytes. That makes the JSON
209
+ pipeline safe for arbitrary edits — including ones that change FString
210
+ lengths, add/remove array elements, or grow nested structs.
211
+
212
+ If you need to build a larger JSON envelope around an `UnrealBlob`
213
+ (e.g., a full db export), use `jsonReplacer` / `jsonReviver` with your
214
+ own `JSON.stringify` / `JSON.parse` calls so the same `-0`/`NaN`/`Infinity`
215
+ sentinels are applied uniformly:
216
+
217
+ ```js
218
+ import { blobToJSON, jsonToBlob, jsonReplacer, jsonReviver } from 'wscodec';
219
+
220
+ const envelope = { actor_serial: 17, blob: blobToJSON(blob), other: '...' };
221
+ const text = JSON.stringify(envelope, jsonReplacer);
222
+ const parsed = JSON.parse(text, jsonReviver);
223
+ const blob2 = jsonToBlob(parsed.blob);
224
+ ```
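To see why a replacer/reviver pair is needed at a stringify boundary, here is a minimal, self-contained sketch of the sentinel idea. The `{ $num: "..." }` wrapper and the `sketchReplacer` / `sketchReviver` names are hypothetical and chosen for illustration only; the sentinel shape that wscodec's `jsonReplacer` / `jsonReviver` actually emit may differ.

```js
// Hypothetical sentinel shape; NOT necessarily what wscodec emits.
const SENTINEL_KEY = '$num';

// Wrap values that plain JSON cannot represent: -0, NaN, +/-Infinity.
function sketchReplacer(key, value) {
  if (typeof value === 'number' && (!Number.isFinite(value) || Object.is(value, -0))) {
    // String(-0) is "0", so -0 needs an explicit spelling.
    return { [SENTINEL_KEY]: Object.is(value, -0) ? '-0' : String(value) };
  }
  return value;
}

// Unwrap single-key sentinel objects back into numbers.
function sketchReviver(key, value) {
  if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
    const keys = Object.keys(value);
    if (keys.length === 1 && keys[0] === SENTINEL_KEY && typeof value[SENTINEL_KEY] === 'string') {
      return Number(value[SENTINEL_KEY]);
    }
  }
  return value;
}

const input = { values: [-0, NaN, Infinity, -Infinity, 1.5] };

// Without the pair, -0 becomes 0 and the non-finite numbers become null.
const naive = JSON.parse(JSON.stringify(input));

// With the pair, every value survives the string boundary.
const guarded = JSON.parse(JSON.stringify(input, sketchReplacer), sketchReviver);
```

The replacer sees each number before `JSON.stringify` coerces it, which is what makes the substitution possible; the reviver runs bottom-up, so the wrapper object is already fully parsed when it is unwrapped.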
225
+
226
+ The codec is consumable as a submodule: `import { blobToJSON } from 'wscodec/json';`.
227
+
158
228
  ### Editing
159
229
 
160
- The library does not provide typed mutators. Callers manipulate the
161
- `properties` tree directly, then set `_dirty` on the ROOT blob to
162
- force a re-encode.
230
+ Two paths are supported. For most edits, **go through JSON** ([§
231
+ JSON conversion](#json-conversion)); it handles size recomputation
232
+ and numeric edge cases automatically. For low-level edits that
233
+ change fixed-size fields (numbers, bools, single bytes), you can also
234
+ mutate the object tree directly.
163
235
 
164
236
  ```js
165
237
  import { UnrealBlob, FName } from 'wscodec';
@@ -186,31 +258,38 @@ inventory.value.elements.push(new FName('Item_Wood'));
186
258
  // (5) Remove an element. Just splice it out; don't set null.
187
259
  inventory.value.elements.splice(0, 1);
188
260
 
189
- // Always set _dirty on the ROOT blob (not on nested properties). The
190
- // flag is read by blob.serialize() to decide pass-through vs re-encode.
261
+ // Tell the encoder to (a) re-emit from properties at all, and (b)
262
+ // recompute every tag.size from the actual encoded value bytes. The
263
+ // recompute is REQUIRED whenever any edit could change a value's
264
+ // encoded byte count (FStrings, FText, array length, nested structs).
265
+ // It's free when nothing changed in size, so just turning it on by
266
+ // default for any direct edit is the safest path.
191
267
  blob._dirty = true;
268
+ blob._recomputeSizes = true;
192
269
 
193
270
  const updatedBytes = blob.serialize(); // re-emits from properties
194
271
  ```
195
272
 
196
273
  Gotchas:
197
274
 
198
- - `_dirty` lives on the root `UnrealBlob`, not on nested `Property` /
199
- `ArrayValue` / `StructValue` objects. Mutating a deep value without
200
- setting `blob._dirty = true` returns the original `_raw` bytes
201
- unchanged.
275
+ - `_dirty` and `_recomputeSizes` live on the root `UnrealBlob`, not on
276
+ nested `Property` / `ArrayValue` / `StructValue` objects. Mutating a
277
+ deep value without setting `blob._dirty = true` returns the original
278
+ `_raw` bytes unchanged.
202
279
  - `BoolProperty` values live in the `tag` (`tag.boolVal`), not in
203
280
  `property.value`. To flip a bool, edit `prop.tag.boolVal`.
204
281
  - Removing a property means splicing it out of `blob.properties`, not
205
282
  setting `property.value = null`.
206
- - If you change a value's encoded SIZE (e.g. extending an FString),
207
- the property's `tag.size` is recomputed on write, but any property
208
- that previously carried a `_sizeMismatch` annotation refuses to
209
- re-emit. Such properties are extremely rare in healthy world.db
210
- files and are reported by `npm test`.
283
+ - **Anything that changes encoded byte count requires `_recomputeSizes = true`.**
284
+ Lengthening an FString, adding an array element, swapping a known-
285
+ binary struct for a propStream: any of these without recompute leaves
286
+ every dependent `tag.size` stale, and Soulmask's reader will walk off
287
+ the end of the value into the next property's bytes. Symptom: edited
288
+ blob loads but with reset/missing fields downstream of the edit.
289
+ - The JSON pipeline (`jsonToBlob`, `jsonStringToBlob`) sets this for you.
211
290
  - `serialize()` throws if `_dirty` is true AND `error` is set:
212
291
  re-emitting from an empty properties array would produce a malformed
213
- stream. Leave `_dirty=false` to pass through `_raw` verbatim, or
292
+ stream. Leave `_dirty = false` to pass through `_raw` verbatim, or
214
293
  clear `.error` first if you've replaced `.properties` manually.
215
294
  - 64-bit integer values (`Int64Property`, `UInt64Property`,
216
295
  `DateTime`, `Timespan`) round-trip as decimal strings. If you
@@ -218,9 +297,12 @@ Gotchas:
218
297
  (`|v| <= Number.MAX_SAFE_INTEGER`); otherwise the writer throws
219
298
  rather than silently lose precision.
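The stale-size gotcha above can be reproduced with a toy format. The sketch below assumes a deliberately simplified, hypothetical framing of `[u32 size][payload]`; real `FPropertyTag` headers carry more fields, and none of these helper names come from wscodec. It only illustrates how a reader that trusts a stale size lands in the middle of the edited value.

```js
// Simplified, hypothetical framing: [u32 little-endian size][payload bytes].
function encodeRecord(text, declaredSize) {
  const payload = new TextEncoder().encode(text);
  const out = new Uint8Array(4 + payload.length);
  // Omitting declaredSize models "recompute the size from the actual bytes".
  new DataView(out.buffer).setUint32(0, declaredSize ?? payload.length, true);
  out.set(payload, 4);
  return out;
}

function concat(...chunks) {
  const out = new Uint8Array(chunks.reduce((n, c) => n + c.length, 0));
  let offset = 0;
  for (const c of chunks) {
    out.set(c, offset);
    offset += c.length;
  }
  return out;
}

// A reader that trusts every declared size, as a game loader would.
function readRecords(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const texts = [];
  let offset = 0;
  while (offset + 4 <= bytes.length) {
    const size = view.getUint32(offset, true);
    offset += 4;
    texts.push(new TextDecoder().decode(bytes.subarray(offset, offset + size)));
    offset += size;
  }
  return texts;
}

// "Wood" (4 bytes) was edited to "Ironwood" (8 bytes).
const stale = concat(encodeRecord('Ironwood', 4), encodeRecord('next')); // size left at 4
const fresh = concat(encodeRecord('Ironwood'), encodeRecord('next'));    // size recomputed

const staleResult = readRecords(stale); // first value truncated, second is garbage
const freshResult = readRecords(fresh); // both values intact
```

With the stale size the first value reads back as `Iron`, and the reader then interprets the leftover `wood` bytes as the next record's size, which mirrors the "reset/missing fields downstream of the edit" symptom.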
220
299
 
221
- `serialize()` for a dirty blob is byte-identical to a fresh
222
- `decode + serialize` cycle on its output, verified on every row of
223
- the tested `world.db`.
300
+ `serialize()` is byte-identical to a fresh `decode + serialize` cycle
301
+ on its output, verified on every row of the tested `world.db`. With
302
+ recompute enabled the encoder may produce shorter bytes than the
303
+ original when the wire's `tag.size` over-stated the actual value byte
304
+ count (some Soulmask Maps do this); the bytes still decode to the same
305
+ object tree, and tested in-game loads accept both forms.
224
306
 
225
307
  ## LZ4 integration
226
308
 
@@ -273,9 +355,11 @@ bytes if you need that).
273
355
 
274
356
  For every row in the tested `world.db`:
275
357
 
276
- - `UnrealBlob.decode(inner)` succeeds without `error` set.
277
- - `blob.serialize()` with `_dirty = false` returns the input bytes byte-identical.
278
- - `blob.serialize()` with `_dirty = true` re-emits from `properties` and is byte-identical to the input.
358
+ - `UnrealBlob.decode(inner)` succeeds without `error` set and produces zero `OpaqueValue` entries (every property type decodes structurally).
359
+ - `blob.serialize()` with `_dirty = false` returns the input bytes byte-identical (pass-through).
360
+ - `blob.serialize()` with `_dirty = true` and `_recomputeSizes = false` re-emits from `properties` and is byte-identical to the input.
361
+ - `blob.serialize()` with `_dirty = true` and `_recomputeSizes = true` (the JSON-pipeline default) re-emits with `tag.size` rewritten from the actual value byte count. Decoding the result yields the same property tree as the input; the wire bytes may differ where the input's stored sizes over-stated the actual value byte count (some Soulmask Maps do this).
362
+ - The same `UnrealBlob` going through `blobToJSON` → `jsonToBlob` → `serialize` yields bytes that decode to the same tree as the input.
279
363
 
280
364
  Coverage includes every known Soulmask wire-format quirk:
281
365
 
@@ -293,11 +377,22 @@ Coverage includes every known Soulmask wire-format quirk:
293
377
  `JianZhuInstYuanXings` arrays (`YuanXing` = "prototype", so
294
378
  "building-zone yuan-xing" is the list of building-piece prototypes
295
379
  inside a building zone) interleave a fixed-shape binary block after
296
- each ObjectProperty element: an 8-byte header + three stride/count
297
- sections (per-piece world transforms, ids, and aux data).
380
+ each ObjectProperty element. The codec decodes the block structurally
381
+ into `{ transforms: [[16 floats], …], ids: [u32, …], aux: [[16 floats], …] }`.
382
+ The transforms are row-major UE `FMatrix`-style 4×4 matrices; the
383
+ translation lives at indices 12, 13, 14. Non-canonical NaN bit
384
+ patterns (observed `0xFFFFFFFF` as a sentinel in aux data) are
385
+ preserved via `{ $nanBits: u32 }` wrappers, because JS `Number`
386
+ collapses all NaNs to `0x7FC00000`. In-game testing confirms Soulmask
387
+ renders building pieces from their `RelativeTransform` property (a
388
+ `StructProperty<Transform>` in `MapInstJianZhuDataList`); the
389
+ per-element trailings carry the same data as a render-side cache, so
390
+ edits that move pieces must update both.
298
391
  - **ArrayProperty<TextProperty> with mixed FText history types.**
299
392
  Elements use history types -1 (culture-invariant), 0 (localized),
300
- 2 (ordered format), and 4 (`FTextHistory_AsNumber`). History type 4
393
+ 1 (`FTextHistory_NamedFormat`: a format pattern plus a
394
+ `TMap<FString, FFormatArgumentValue>` of named arguments), 2
395
+ (ordered format), and 4 (`FTextHistory_AsNumber`). History type 4
301
396
  embeds a legacy UE3-style `FNumberFormattingOptions` whose boolean
302
397
  fields are 4 bytes wide rather than the modern 1 byte; the codec
303
398
  emits this correctly.
@@ -311,23 +406,66 @@ Coverage includes every known Soulmask wire-format quirk:
311
406
  actual 636422); pair shapes are detected by peeking at the next
312
407
  bytes rather than trusting the declared size.
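The NaN-bit preservation described for the per-element trailings can be sketched with `DataView`. This is an illustration under stated assumptions, not the codec's implementation: the `{ $nanBits }` wrapper shape is taken from the text above, while the `readF32` / `writeF32` helper names and the little-endian byte order are assumptions made for this example.

```js
const CANONICAL_NAN = 0x7fc00000;

// Read a little-endian float32 (assumed byte order), keeping the raw bit
// pattern whenever the value is a NaN other than the canonical one.
function readF32(view, offset) {
  const bits = view.getUint32(offset, true);
  const value = view.getFloat32(offset, true);
  if (Number.isNaN(value) && bits !== CANONICAL_NAN) {
    return { $nanBits: bits };
  }
  return value;
}

// Write a float32, restoring wrapped NaN bits exactly.
function writeF32(view, offset, value) {
  if (typeof value === 'object' && value !== null) {
    view.setUint32(offset, value.$nanBits, true);
  } else if (Number.isNaN(value)) {
    // Write the canonical pattern explicitly instead of relying on the engine.
    view.setUint32(offset, CANONICAL_NAN, true);
  } else {
    view.setFloat32(offset, value, true);
  }
}

// Two floats: the 0xFFFFFFFF NaN sentinel, then 1.0.
const input = new Uint8Array([0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x80, 0x3f]);
const inView = new DataView(input.buffer);
const decoded = [readF32(inView, 0), readF32(inView, 4)];

const output = new Uint8Array(input.length);
const outView = new DataView(output.buffer);
decoded.forEach((v, i) => writeF32(outView, i * 4, v));
```

Because the wrapper is a plain object, it also survives `JSON.stringify` unchanged, whereas a bare `NaN` would not.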
313
408
 
314
- ## Running the test
409
+ ## Running the tests
315
410
 
316
411
  ```sh
317
412
  git clone https://github.com/auroris/SoulmaskCodec.git
318
413
  cd SoulmaskCodec
319
414
  npm install
415
+
416
+ # Byte-identical roundtrip across every row of a world.db.
320
417
  npm test # looks for world.db two dirs up by default
321
- # or
322
418
  node test/test-roundtrip.mjs /path/to/world.db
419
+
420
+ # JSON-pipeline roundtrip. Encodes both sides with recomputeSizes=true
421
+ # and compares; verifies blobToJSON ↔ jsonToBlob is lossless.
422
+ npm run test:json -- /path/to/world.db
423
+ npm run test:json-spot -- /path/to/world.db # spot-check on rows that exercise each code path
323
424
  ```
324
425
 
325
426
  Test deps: `lz4-wasm-nodejs` (LZ4 inside the test) and
326
427
  `better-sqlite3` (reads the `world.db` SQLite file). Both are picked
327
428
  up via npm module resolution; if `better-sqlite3` isn't installed at
328
- the package root the test will surface that with a clear error. See
429
+ the package root the tests will surface that with a clear error. See
329
430
  the Setup section above for the build-tools prerequisite on Windows.
330
431
 
432
+ ## Bundled scripts
433
+
434
+ The repo also ships full db ↔ JSON utilities under [scripts/](scripts/).
435
+ These are NOT shipped in the npm package (the codec stays zero-dep); they
436
+ live in the repo as reference workflows.
437
+
438
+ ```sh
439
+ # Dump every row of a world.db (LZ4-decompressing actor_data and decoding
440
+ # through wscodec where possible) to a single JSON file.
441
+ npm run export-db -- /path/to/world.db world.json
442
+
443
+ # Inverse: rebuild a SQLite db from the JSON export. The npm script
444
+ # already runs node with --max-old-space-size=4096 (necessary for
445
+ # multi-hundred-MB exports).
446
+ npm run import-db -- world.json /path/to/new.db
447
+
448
+ # Diff two world.db files at the uncompressed level (tolerates LZ4 re-compression).
449
+ node scripts/diff-dbs.mjs a.db b.db
450
+
451
+ # Search every decoded blob for a substring (custom names, UIDs, asset paths).
452
+ node scripts/find-string.mjs /path/to/world.db "Claude's Chest"
453
+
454
+ # Pretty-print one actor's full property tree.
455
+ node scripts/dump-actor.mjs /path/to/world.db <actor_serial> [out.json]
456
+
457
+ # Merge every workbench access log, NPC work log, and clan log into a single
458
+ # timestamp-sorted .log file. .NET ticks → ISO-8601 UTC; FText placeholders
459
+ # substituted into their NamedFormat / OrderedFormat templates.
460
+ npm run dump-logs -- /path/to/world.db world.log
461
+ ```
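The ticks-to-ISO-8601 step mentioned for `dump-logs` is easy to get wrong with plain `Number` arithmetic, since tick counts exceed `Number.MAX_SAFE_INTEGER`. The sketch below is not taken from `scripts/dump-logs.mjs`; `ticksToIso` is a hypothetical helper that assumes the value arrives as a decimal string (as `DateTime` values do in the JSON tree) and counts 100 ns ticks since 0001-01-01 UTC.

```js
// 1970-01-01T00:00:00Z expressed in 100 ns ticks since 0001-01-01.
const TICKS_AT_UNIX_EPOCH = 621355968000000000n;
const TICKS_PER_MS = 10000n;

// Hypothetical helper: decimal-string ticks -> ISO-8601 UTC string.
function ticksToIso(ticksDecimal) {
  const ticks = BigInt(ticksDecimal);
  // BigInt division truncates, so sub-millisecond ticks are dropped.
  const ms = (ticks - TICKS_AT_UNIX_EPOCH) / TICKS_PER_MS;
  return new Date(Number(ms)).toISOString();
}

const epoch = ticksToIso('621355968000000000'); // "1970-01-01T00:00:00.000Z"
const sample = ticksToIso('638000000000000000'); // "2022-09-28T22:13:20.000Z"
```

Doing the subtraction in `BigInt` keeps the full tick precision until the value is small enough to convert to a millisecond `Number` safely.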
462
+
463
+ The export/import pair has been validated end-to-end against Soulmask
464
+ itself: a full db → JSON → db round-trip produces a save that the game
465
+ loads cleanly. See [docs/helpers-handoff.md](docs/helpers-handoff.md)
466
+ for notes on building a higher-level save-edit helper library on top of
467
+ the codec.
468
+
331
469
  ## License
332
470
 
333
471
  MIT.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "wscodec",
3
- "version": "0.3.0",
3
+ "version": "0.3.1",
4
4
  "description": "Pure-JS codec for Soulmask actor_data property streams (UE 4.27 FPropertyTag wire format). Zero runtime dependencies. Accepts uncompressed bytes, returns JS objects, and vice versa. Round-trip byte-identical against every actor.",
5
5
  "type": "module",
6
6
  "main": "./wscodec.mjs",
@@ -28,7 +28,8 @@
28
28
  "test:json": "node test/test-json-full.mjs",
29
29
  "test:json-spot": "node test/test-json-roundtrip.mjs",
30
30
  "export-db": "node scripts/db-to-json.mjs",
31
- "import-db": "node scripts/json-to-db.mjs"
31
+ "import-db": "node --max-old-space-size=4096 scripts/json-to-db.mjs",
32
+ "dump-logs": "node scripts/dump-logs.mjs"
32
33
  },
33
34
  "keywords": [
34
35
  "soulmask",