cborg 4.0.8 → 4.1.0

@@ -12,7 +12,7 @@ jobs:
  - name: Checkout Repository
  uses: actions/checkout@v4
  - name: Use Node.js ${{ matrix.node }}
- uses: actions/setup-node@v4.0.1
+ uses: actions/setup-node@v4.0.2
  with:
  node-version: ${{ matrix.node }}
  - name: Install Dependencies
@@ -35,7 +35,7 @@ jobs:
  with:
  fetch-depth: 0
  - name: Setup Node.js
- uses: actions/setup-node@v4.0.1
+ uses: actions/setup-node@v4.0.2
  with:
  node-version: lts/*
  - name: Install dependencies
package/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ## [4.1.0](https://github.com/rvagg/cborg/compare/v4.0.9...v4.1.0) (2024-02-29)
+
+
+ ### Features
+
+ * export `Tokenizer`, document how it can be used ([fbba395](https://github.com/rvagg/cborg/commit/fbba395b6848bd8dcb66a1d18c36e1033581b5ef))
+
+
+ ### Bug Fixes
+
+ * update tyupes for new exports ([986035d](https://github.com/rvagg/cborg/commit/986035dbffbc682b06d7bfecb8d2c3cc02048429))
+
+ ## [4.0.9](https://github.com/rvagg/cborg/compare/v4.0.8...v4.0.9) (2024-02-07)
+
+
+ ### Trivial Changes
+
+ * **deps:** bump actions/setup-node from 4.0.1 to 4.0.2 ([5abab22](https://github.com/rvagg/cborg/commit/5abab22ae3cc8645562083be816620005725f5c9))
+
  ## [4.0.8](https://github.com/rvagg/cborg/compare/v4.0.7...v4.0.8) (2024-01-15)
 
 
package/README.md CHANGED
@@ -31,6 +31,7 @@
  * [`encodedLength(data[, options])`](#encodedlengthdata-options)
  * [Type encoders](#type-encoders)
  * [Tag decoders](#tag-decoders)
+ * [Decoding with a custom tokeniser](#decoding-with-a-custom-tokeniser)
  * [Deterministic encoding recommendations](#deterministic-encoding-recommendations)
  * [Round-trip consistency](#round-trip-consistency)
  * [JSON mode](#json-mode)
@@ -243,7 +244,7 @@ Decode valid CBOR bytes from a `Uint8Array` (or `Buffer`) and return a JavaScrip
  * `rejectDuplicateMapKeys` (boolean, default `false`): when the decoder encounters duplicate keys for the same map, an error will be thrown when this option is set. This is an additional _strictness_ option, disallowing data-hiding and reducing the number of same-data different-bytes possibilities where it matters.
  * `retainStringBytes` (boolean, default `false`): when decoding strings, retain the original bytes on the `Token` object as `byteValue`. Since it is possible to encode non-UTF-8 characters in strings in CBOR, and JavaScript doesn't properly handle non-UTF-8 in its conversion from bytes (`TextEncoder` or `Buffer`), this can result in a loss of data (and an inability to round-trip). Where this is important, a token stream should be consumed instead of a plain `decode()` and the `byteValue` property on string tokens can be inspected (see [lib/diagnostic.js](lib/diagnostic.js) for an example of its use.)
  * `tags` (array): a mapping of tag number to tag decoder function. By default no tags are supported. See [Tag decoders](#tag-decoders).
- * `tokenizer` (object): an object with two methods, `next()` which returns a `Token` and `done()` which returns a `boolean`. Can be used to implement custom input decoding. See the source code for examples.
+ * `tokenizer` (object): an object with two methods, `next()` which returns a `Token`, `done()` which returns a `boolean` and `pos()` which returns the current byte position being decoded. Can be used to implement custom input decoding. See the source code for examples. (Note en-US spelling "tokenizer" used throughout exported methods and types, which may be confused with "tokeniser" used in these docs).
 
  ### `decodeFirst(data[, options])`
 
@@ -383,6 +384,45 @@ function bigNegIntDecoder (bytes) {
  }
  ```
 
+ ## Decoding with a custom tokeniser
+
+ `decode()` allows overriding the `tokenizer` option to provide a custom tokeniser. This object can be described with the following interface:
+
+ ```typescript
+ export interface DecodeTokenizer {
+ next(): Token,
+ done(): boolean,
+ pos(): number,
+ }
+ ```
+
+ `next()` should return the next token in the stream, `done()` should return `true` when the stream is finished, and `pos()` should return the current byte position in the stream.
+
+ Overriding the default tokeniser can be useful for changing the rules of decode. For example, it is used to turn cborg into a JSON decoder by changing parsing rules on how to turn bytes into tokens. See the source code for how this works.
+
+ The default `Tokenizer` class is available from the default export. Providing `options.tokenizer = new Tokenizer(bytes, options)` would result in the same decode path using this tokeniser. However, this can also be used to override or modify default decode paths by intercepting the token stream. For example, to perform a decode that disallows bytes, the following code would work:
+
+ ```js
+ import { decode, Tokenizer, Type } from 'cborg'
+
+ class CustomTokeniser extends Tokenizer {
+ next () {
+ const nextToken = super.next()
+ if (nextToken.type === Type.bytes) {
+ throw new Error('Unsupported type: bytes')
+ }
+ return nextToken
+ }
+ }
+
+ function customDecode (data, options) {
+ options = Object.assign({}, options, {
+ tokenizer: new CustomTokeniser(data, options)
+ })
+ return decode(data, options)
+ }
+ ```
+
  ## Deterministic encoding recommendations
 
  cborg is designed with deterministic encoding forms as a primary feature. It is suitable for use with content addressed systems or other systems where convergence of binary forms is important. The ideal is to have strictly _one way_ of mapping a set of data into a binary form. Unfortunately CBOR has many opportunities for flexibility, including:
@@ -428,7 +468,7 @@ There are a number of forms where an object will not round-trip precisely, if th
 
  Use `import { encode, decode, decodeFirst } from 'cborg/json'` to access the JSON handling encoder and decoder.
 
- Many of the same encode and decode options available for CBOR can be used to manage JSON handling. These include strictness requirements for decode and custom tag encoders for encode. Tag encoders can't create new tags as there are no tags in JSON, but they can replace JavaScript object forms with custom JSON forms (e.g. convert a `Uint8Array` to a valid JSON form rather than having the encoder throw an error). The inverse is also possible, turning specific JSON forms into JavaScript forms, by using a custom tokenizer on decode.
+ Many of the same encode and decode options available for CBOR can be used to manage JSON handling. These include strictness requirements for decode and custom tag encoders for encode. Tag encoders can't create new tags as there are no tags in JSON, but they can replace JavaScript object forms with custom JSON forms (e.g. convert a `Uint8Array` to a valid JSON form rather than having the encoder throw an error). The inverse is also possible, turning specific JSON forms into JavaScript forms, by using a custom tokeniser on decode.
 
  Special notes on options specific to the JSON:
 
package/cborg.js CHANGED
@@ -1,5 +1,5 @@
  import { encode } from './lib/encode.js'
- import { decode, decodeFirst } from './lib/decode.js'
+ import { decode, decodeFirst, Tokeniser, tokensToObject } from './lib/decode.js'
  import { Token, Type } from './lib/token.js'
 
  /**
@@ -14,6 +14,8 @@ import { Token, Type } from './lib/token.js'
  export {
  decode,
  decodeFirst,
+ Tokeniser as Tokenizer,
+ tokensToObject,
  encode,
  Token,
  Type
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "cborg",
- "version": "4.0.8",
+ "version": "4.1.0",
  "description": "Fast CBOR with a focus on strictness",
  "main": "cborg.js",
  "type": "module",
package/types/cborg.d.ts CHANGED
@@ -16,8 +16,10 @@ export type DecodeOptions = import('./interface').DecodeOptions;
  export type EncodeOptions = import('./interface').EncodeOptions;
  import { decode } from './lib/decode.js';
  import { decodeFirst } from './lib/decode.js';
+ import { Tokeniser } from './lib/decode.js';
+ import { tokensToObject } from './lib/decode.js';
  import { encode } from './lib/encode.js';
  import { Token } from './lib/token.js';
  import { Type } from './lib/token.js';
- export { decode, decodeFirst, encode, Token, Type };
+ export { decode, decodeFirst, Tokeniser as Tokenizer, tokensToObject, encode, Token, Type };
  //# sourceMappingURL=cborg.d.ts.map
@@ -1 +1 @@
- {"version":3,"file":"cborg.d.ts","sourceRoot":"","sources":["../cborg.js"],"names":[],"mappings":";;;yBAMa,OAAO,aAAa,EAAE,UAAU;;;;0BAEhC,OAAO,aAAa,EAAE,mBAAmB;;;;4BACzC,OAAO,aAAa,EAAE,aAAa;;;;4BACnC,OAAO,aAAa,EAAE,aAAa;uBATZ,iBAAiB;4BAAjB,iBAAiB;uBAD9B,iBAAiB;sBAEZ,gBAAgB;qBAAhB,gBAAgB"}
+ {"version":3,"file":"cborg.d.ts","sourceRoot":"","sources":["../cborg.js"],"names":[],"mappings":";;;yBAMa,OAAO,aAAa,EAAE,UAAU;;;;0BAEhC,OAAO,aAAa,EAAE,mBAAmB;;;;4BACzC,OAAO,aAAa,EAAE,aAAa;;;;4BACnC,OAAO,aAAa,EAAE,aAAa;uBATe,iBAAiB;4BAAjB,iBAAiB;0BAAjB,iBAAiB;+BAAjB,iBAAiB;uBADzD,iBAAiB;sBAEZ,gBAAgB;qBAAhB,gBAAgB"}
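
Note on the new `Tokenizer` export: the README hunk in this diff intercepts the token stream by subclassing `Tokenizer`. The same idea can probably be expressed by wrapping any object that exposes `next()`, `done()` and `pos()`. The sketch below is an illustration under assumptions, not code from the package: `FakeType`, `ArrayTokeniser`, `RejectingTokeniser` and `drain` are hypothetical names, and the stand-in tokens only mimic a `type`/`value` shape so the block runs without cborg installed. In real use the inner tokeniser would be `new Tokenizer(data, options)` and the rejected type `Type.bytes`, as in the README hunk.

```js
// Hypothetical stand-ins for cborg's `Type` constants; in real use,
// import `Type` from 'cborg' instead.
const FakeType = {
  bytes: { name: 'bytes' },
  string: { name: 'string' },
  uint: { name: 'uint' }
}

// Hypothetical stand-in tokeniser over a pre-built token array.
// In real use this would be `new Tokenizer(data, options)`.
class ArrayTokeniser {
  constructor (tokens) {
    this._tokens = tokens
    this._index = 0
  }

  next () {
    if (this.done()) {
      throw new Error('No more tokens')
    }
    return this._tokens[this._index++]
  }

  done () {
    return this._index >= this._tokens.length
  }

  // Reports a token index here; a real tokeniser reports a byte position.
  pos () {
    return this._index
  }
}

// Wraps any object with next()/done()/pos() and rejects one token type.
class RejectingTokeniser {
  constructor (inner, rejectedType) {
    this._inner = inner
    this._rejectedType = rejectedType
  }

  next () {
    const token = this._inner.next()
    if (token.type === this._rejectedType) {
      throw new Error(`Unsupported type: ${token.type.name}`)
    }
    return token
  }

  done () {
    return this._inner.done()
  }

  pos () {
    return this._inner.pos()
  }
}

// Consumes a tokeniser to the end and collects the token values.
function drain (tokeniser) {
  const values = []
  while (!tokeniser.done()) {
    values.push(tokeniser.next().value)
  }
  return values
}
```

Composition keeps the filtering rule independent of how tokens are produced, so the same wrapper could sit in front of the default tokeniser or a JSON-mode one; whether that is preferable to subclassing depends on how much of the default behaviour needs to change.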