npm - @exodus/bytes - Versions diffs - 1.12.0 → 1.13.0 - Mend

@exodus/bytes 1.12.0 → 1.13.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/README.md +39 -16
package/base58.js +3 -3
package/base64.js +7 -6
package/bech32.js +3 -3
package/encoding-browser.browser.js +43 -17
package/fallback/_utils.js +7 -122
package/fallback/base32.js +3 -3
package/fallback/base58check.js +3 -3
package/fallback/base64.js +2 -3
package/fallback/encoding.api.js +0 -43
package/fallback/encoding.js +41 -2
package/fallback/encoding.labels.js +20 -16
package/fallback/hex.js +3 -4
package/fallback/latin1.js +5 -6
package/fallback/percent.js +1 -1
package/fallback/platform.browser.js +31 -0
package/fallback/platform.js +2 -0
package/fallback/platform.native.js +97 -0
package/fallback/single-byte.encodings.js +40 -49
package/fallback/single-byte.js +4 -4
package/fallback/utf16.js +69 -2
package/fallback/utf8.auto.browser.js +2 -0
package/fallback/utf8.auto.js +1 -0
package/fallback/utf8.auto.native.js +1 -0
package/fallback/utf8.js +25 -3
package/hex.js +6 -8
package/hex.node.js +2 -3
package/multi-byte.js +2 -2
package/multi-byte.node.js +3 -3
package/package.json +22 -4
package/single-byte.js +5 -5
package/single-byte.node.js +4 -4
package/utf16.browser.js +8 -0
package/utf16.js +1 -90
package/utf16.native.js +22 -0
package/utf16.node.js +5 -20
package/utf8.js +9 -28
package/utf8.node.js +3 -4
package/whatwg.js +6 -2

package/README.md CHANGED

@@ -30,7 +30,7 @@ Tested in CI with [@exodus/test](https://github.com/ExodusMovement/test#exoduste
 [![Hermes](https://img.shields.io/badge/Hermes-282C34?style=for-the-badge&logo=React)](https://hermesengine.dev)
 [![V8](https://img.shields.io/badge/V8-4285F4?style=for-the-badge&logo=V8&logoColor=white)](https://v8.dev/docs/d8)
 [![JavaScriptCore](https://img.shields.io/badge/JavaScriptCore-006CFF?style=for-the-badge)](https://docs.webkit.org/Deep%20Dive/JSC/JavaScriptCore.html)
-[![SpiderMonkey](https://img.shields.io/badge/SpiderMonkey-FFD681?style=for-the-badge)](https://spidermonkey.dev/)
+[![SpiderMonkey](https://img.shields.io/badge/SpiderMonkey-FFD681?style=for-the-badge)](https://spidermonkey.dev/)\
 [![QuickJS](https://img.shields.io/badge/QuickJS-E58200?style=for-the-badge)](https://github.com/quickjs-ng/quickjs)
 [![XS](https://img.shields.io/badge/XS-0B307A?style=for-the-badge)](https://github.com/Moddable-OpenSource/moddable)
 [![GraalJS](https://img.shields.io/badge/GraalJS-C74634?style=for-the-badge)](https://github.com/oracle/graaljs)
@@ -100,24 +100,47 @@ _These are only provided as a compatibility layer, prefer hardened APIs instead
 ### Lite version
-If you don't need support for legacy multi-byte encodings, you can use the lite import:
-```js
-import { TextDecoder, TextEncoder } from '@exodus/bytes/encoding-lite.js'
-import { TextDecoderStream, TextEncoderStream } from '@exodus/bytes/encoding-lite.js' // Requires Streams
-```
+Alternate exports exist that can help reduce bundle size, see comparison:
+| import | size |
+| - | - |
+| [@exodus/bytes/encoding-browser.js](#exodusbytesencoding-browserjs-) | <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding-browser.js?style=flat-square)</sub> |
+| [@exodus/bytes/encoding-lite.js](#exodusbytesencoding-litejs-) | <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding-lite.js?style=flat-square)</sub> |
+| [@exodus/bytes/encoding.js](#exodusbytesencodingjs-) | <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding.js?style=flat-square)</sub> |
+| `text-encoding` | <sub>![](https://img.shields.io/bundlejs/size/text-encoding?style=flat-square)</sub> |
+| `iconv-lite` | <sub>![](https://img.shields.io/bundlejs/size/iconv-lite/lib/index.js?style=flat-square)</sub> |
+| `whatwg-encoding` | <sub>![](https://img.shields.io/bundlejs/size/whatwg-encoding?style=flat-square)</sub> |
+Libraries are advised to use single-purpose hardened `@exodus/bytes/utf8.js` / `@exodus/bytes/utf16.js` APIs for Unicode.
+Applications (including React Native apps) are advised to load either `@exodus/bytes/encoding-lite.js` or `@exodus/bytes/encoding.js`
+(depending on whether legacy multi-byte support is needed) and use that as a global polyfill.
-This reduces the bundle size 9x:\
-from 90 KiB gzipped for `@exodus/bytes/encoding.js` to 10 KiB gzipped for `@exodus/bytes/encoding-lite.js`.\
-(For comparison, `text-encoding` module is 190 KiB gzipped, and `iconv-lite` is 194 KiB gzipped):
+#### `@exodus/bytes/encoding-lite.js`
-It still supports `utf-8`, `utf-16le`, `utf-16be` and all single-byte encodings specified by the spec,
-the only difference is support for legacy multi-byte encodings.
+If you don't need support for legacy multi-byte encodings.
+Reduces the bundle size 11x, while still keeping `utf-8`, `utf-16le`, `utf-16be` and all single-byte encodings specified by the spec.
+The only difference is support for legacy multi-byte encodings.
 See [the list of encodings](https://encoding.spec.whatwg.org/#names-and-labels).
+This can be useful for example in React Native global TextDecoder polyfill,
+if you are sure that you don't need legacy multi-byte encodings support.
+#### `@exodus/bytes/encoding-browser.js`
+Resolves to a tiny import in browser bundles, preferring native `TextDecoder` / `TextEncoder`.
+For non-browsers (Node.js, React Native), loads a full implementation.
+> [!NOTE]
+> This is not the default behavior for `@exodus/bytes/encoding.js` because all major browser implementations have bugs,
+> which `@exodus/bytes/encoding.js` fixes. Only use if you are ok with that.
 ## API
-### @exodus/bytes/utf8.js
+### @exodus/bytes/utf8.js <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/utf8.js?style=flat-square)<sub>
 UTF-8 encoding/decoding
@@ -183,7 +206,7 @@ Prefer using strict throwing methods for cryptography applications._
 This is similar to `new TextDecoder('utf-8', { ignoreBOM: true }).decode(arr)`,
 but works on all engines.
-### @exodus/bytes/utf16.js
+### @exodus/bytes/utf16.js <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/utf16.js?style=flat-square)<sub>
 UTF-16 encoding/decoding
@@ -667,7 +690,7 @@ Create a view of a TypedArray in the specified format (`'uint8'` or `'buffer'`)
 > [!IMPORTANT]
 > Does not copy data, returns a view on the same underlying buffer
-### @exodus/bytes/encoding.js
+### @exodus/bytes/encoding.js <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding.js?style=flat-square)</sub>
 Implements the [Encoding standard](https://encoding.spec.whatwg.org/):
 [TextDecoder](https://encoding.spec.whatwg.org/#interface-textdecoder),
@@ -778,7 +801,7 @@ only expects lowercased encoding name:
 new TextDecoder(getBOMEncoding(input) ?? fallbackEncoding).decode(input)
 ```
-### @exodus/bytes/encoding-lite.js
+### @exodus/bytes/encoding-lite.js <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding-lite.js?style=flat-square)</sub>
 The exact same exports as `@exodus/bytes/encoding.js` are also exported as
 `@exodus/bytes/encoding-lite.js`, with the difference that the lite version does not load
@@ -837,7 +860,7 @@ true
 '%'
 ```
-### @exodus/bytes/encoding-browser.js
+### @exodus/bytes/encoding-browser.js <sub>![](https://img.shields.io/bundlejs/size/@exodus/bytes/encoding-browser.js?style=flat-square)<sub>
 Same as `@exodus/bytes/encoding.js`, but in browsers instead of polyfilling just uses whatever the
 browser provides, drastically reducing the bundle size (to less than 2 KiB gzipped).

package/base58.js CHANGED

@@ -1,6 +1,6 @@
 import { typedView } from './array.js'
-import { assertUint8 } from './assert.js'
-import { nativeDecoder, nativeEncoder, isHermes, E_STRING } from './fallback/_utils.js'
+import { assertU8, E_STRING } from './fallback/_utils.js'
+import { nativeDecoder, nativeEncoder, isHermes } from './fallback/platform.js'
 import { encodeAscii, decodeAscii } from './fallback/latin1.js'
 const alphabet58 = [...'123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz']
@@ -23,7 +23,7 @@ const E_CHAR = 'Invalid character in base58 input'
 const shouldUseBigIntFrom = isHermes // faster only on Hermes, numbers path beats it on normal engines
 function toBase58core(arr, alphabet, codes) {
-  assertUint8(arr)
+  assertU8(arr)
   const length = arr.length
   if (length === 0) return ''

package/base64.js CHANGED

@@ -1,6 +1,7 @@
-import { assertUint8, assertEmptyRest } from './assert.js'
+import { assertEmptyRest } from './assert.js'
 import { typedView } from './array.js'
-import { isHermes, skipWeb, E_STRING } from './fallback/_utils.js'
+import { assertU8, E_STRING } from './fallback/_utils.js'
+import { isHermes } from './fallback/platform.js'
 import { decodeLatin1, encodeLatin1 } from './fallback/latin1.js'
 import * as js from './fallback/base64.js'
@@ -34,10 +35,10 @@ function maybePad(res, padding) {
 }
 const toUrl = (x) => x.replaceAll('+', '-').replaceAll('/', '_')
-const haveWeb = (x) => !skipWeb && web64 && x.toBase64 === web64
+const haveWeb = (x) => web64 && x.toBase64 === web64
 export function toBase64(x, { padding = true } = {}) {
-  assertUint8(x)
+  assertU8(x)
   if (haveWeb(x)) return padding ? x.toBase64() : x.toBase64({ omitPadding: !padding }) // Modern, optionless is slightly faster
   if (haveNativeBuffer) return maybeUnpad(toBuffer(x).base64Slice(0, x.byteLength), padding) // Older Node.js
   if (shouldUseBtoa) return maybeUnpad(btoa(decodeLatin1(x)), padding)
@@ -46,7 +47,7 @@ export function toBase64(x, { padding = true } = {}) {
 // NOTE: base64url omits padding by default
 export function toBase64url(x, { padding = false } = {}) {
-  assertUint8(x)
+  assertU8(x)
   if (haveWeb(x)) return x.toBase64({ alphabet: 'base64url', omitPadding: !padding }) // Modern
   if (haveNativeBuffer) return maybePad(toBuffer(x).base64urlSlice(0, x.byteLength), padding) // Older Node.js
   if (shouldUseBtoa) return maybeUnpad(toUrl(btoa(decodeLatin1(x))), padding)
@@ -111,7 +112,7 @@ function noWhitespaceSeen(str, arr) {
 }
 let fromBase64impl
-if (!skipWeb && Uint8Array.fromBase64) {
+if (Uint8Array.fromBase64) {
   // NOTICE: this is actually slower than our JS impl in older JavaScriptCore and (slightly) in SpiderMonkey, but faster on V8 and new JavaScriptCore
   fromBase64impl = (str, isBase64url, padding) => {
     const alphabet = isBase64url ? 'base64url' : 'base64'
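With `skipWeb` removed, the hunks above gate the native path only on whether `Uint8Array` base64 support is present. A minimal sketch of that kind of feature detection follows. `toBase64Sketch` is a hypothetical name, and the `btoa` fallback is an assumption for illustration, not the package's actual fallback chain:

```js
// Hypothetical helper, not part of @exodus/bytes: prefer the native
// Uint8Array.prototype.toBase64 when the engine has it, else fall back to btoa.
function toBase64Sketch(u8, padding = true) {
  if (!(u8 instanceof Uint8Array)) throw new TypeError('Expected an Uint8Array')
  if (typeof u8.toBase64 === 'function') return u8.toBase64({ omitPadding: !padding })
  // Fallback: build a binary string byte by byte, then encode it with btoa
  let bin = ''
  for (const b of u8) bin += String.fromCharCode(b)
  const res = btoa(bin)
  return padding ? res : res.replace(/=+$/, '')
}

console.log(toBase64Sketch(Uint8Array.of(1, 2))) // AQI=
console.log(toBase64Sketch(Uint8Array.of(1, 2), false)) // AQI
```

Both branches produce the same output, so the choice between them only affects speed.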

package/bech32.js CHANGED

@@ -1,5 +1,5 @@
-import { assertUint8 } from './assert.js'
-import { nativeEncoder, E_STRING } from './fallback/_utils.js'
+import { assertU8, E_STRING } from './fallback/_utils.js'
+import { nativeEncoder } from './fallback/platform.js'
 import { decodeAscii, encodeAscii, encodeLatin1 } from './fallback/latin1.js'
 const alphabet = [...'qpzry9x8gf2tvdw0s3jn54khce6mua7l']
@@ -109,7 +109,7 @@ function pPrefix(prefix) {
 function toBech32enc(prefix, bytes, limit, encoding) {
   if (typeof prefix !== 'string' || !prefix) throw new TypeError(E_PREFIX)
   if (typeof limit !== 'number') throw new TypeError(E_SIZE)
-  assertUint8(bytes)
+  assertU8(bytes)
   const bytesLength = bytes.length
   const wordsLength = Math.ceil((bytesLength * 8) / 5)
   if (!(prefix.length + 7 + wordsLength <= limit)) throw new TypeError(E_SIZE)

package/encoding-browser.browser.js CHANGED

@@ -1,10 +1,4 @@
-import {
-  fromSource,
-  getBOMEncoding,
-  normalizeEncoding,
-  E_ENCODING,
-} from './fallback/encoding.api.js'
-import labels from './fallback/encoding.labels.js'
+import { getBOMEncoding } from './fallback/encoding.api.js'
 // Lite-weight version which re-exports existing implementations on browsers,
 // while still being aliased to the full impl in RN and Node.js
@@ -13,17 +7,49 @@ import labels from './fallback/encoding.labels.js'
 const { TextDecoder, TextEncoder, TextDecoderStream, TextEncoderStream } = globalThis
-export { normalizeEncoding, getBOMEncoding, labelToName } from './fallback/encoding.api.js'
+export { getBOMEncoding } from './fallback/encoding.api.js'
 export { TextDecoder, TextEncoder, TextDecoderStream, TextEncoderStream }
-// https://encoding.spec.whatwg.org/#decode
+export function normalizeEncoding(label) {
+  if (label === 'utf-8' || label === 'utf8' || label === 'UTF-8' || label === 'UTF8') return 'utf-8'
+  if (label === 'windows-1252' || label === 'ascii' || label === 'latin1') return 'windows-1252'
+  if (/[^\w\t\n\f\r .:-]/i.test(label)) return null
+  const l = `${label}`.trim().toLowerCase()
+  try {
+    return new TextDecoder(l).encoding
+  } catch {}
+  if (l === 'x-user-defined') return l
+  if (
+    l === 'replacement' ||
+    l === 'csiso2022kr' ||
+    l === 'hz-gb-2312' ||
+    l === 'iso-2022-cn' ||
+    l === 'iso-2022-cn-ext' ||
+    l === 'iso-2022-kr'
+  ) {
+    return 'replacement'
+  }
+  return null
+}
 export function legacyHookDecode(input, fallbackEncoding = 'utf-8') {
-  let u8 = fromSource(input)
-  const bomEncoding = getBOMEncoding(u8)
-  if (bomEncoding) u8 = u8.subarray(bomEncoding === 'utf-8' ? 3 : 2)
-  const enc = bomEncoding ?? normalizeEncoding(fallbackEncoding) // "the byte order mark is more authoritative than anything else"
-  if (enc === 'utf-8') return new TextDecoder('utf-8', { ignoreBOM: true }).decode(u8) // fast path
-  if (enc === 'replacement') return u8.byteLength > 0 ? '\uFFFD' : ''
-  if (!Object.hasOwn(labels, enc)) throw new RangeError(E_ENCODING)
-  return new TextDecoder(enc, { ignoreBOM: true }).decode(u8)
+  const enc = getBOMEncoding(input) ?? normalizeEncoding(fallbackEncoding)
+  if (enc === 'replacement') return input.byteLength > 0 ? '\uFFFD' : ''
+  return new TextDecoder(enc).decode(input)
+}
+export function labelToName(label) {
+  const enc = normalizeEncoding(label)
+  if (enc === 'utf-8') return 'UTF-8'
+  if (!enc) return enc
+  const p = enc.slice(0, 3)
+  if (p === 'utf' || p === 'iso' || p === 'koi' || p === 'euc' || p === 'ibm' || p === 'gbk') {
+    return enc.toUpperCase()
+  }
+  if (enc === 'big5') return 'Big5'
+  if (enc === 'shift_jis') return 'Shift_JIS'
+  return enc
 }
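The new browser-side `labelToName` inlines the prefix check that used a `Set` in `fallback/encoding.api.js`. A standalone sketch of just the casing step is shown below. `nameFromNormalized` is a hypothetical helper that takes an already normalized lowercase encoding, so it skips the `normalizeEncoding` call and the `utf-8` fast path:

```js
// Hypothetical helper mirroring the casing rules in the hunk above.
// Input is assumed to be a normalized (lowercase) encoding, or null.
function nameFromNormalized(enc) {
  if (!enc) return enc
  const p = enc.slice(0, 3)
  if (p === 'utf' || p === 'iso' || p === 'koi' || p === 'euc' || p === 'ibm' || p === 'gbk') {
    return enc.toUpperCase()
  }
  if (enc === 'big5') return 'Big5'
  if (enc === 'shift_jis') return 'Shift_JIS'
  return enc // e.g. windows-1252, gb18030, x-user-defined stay lowercase
}

console.log(nameFromNormalized('koi8-r')) // KOI8-R
console.log(nameFromNormalized('shift_jis')) // Shift_JIS
console.log(nameFromNormalized('gb18030')) // gb18030
```

Note that `gb18030` does not match the `gbk` prefix, because the check compares exactly the first three characters.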

package/fallback/_utils.js CHANGED

@@ -1,131 +1,15 @@
-const { Buffer, TextEncoder, TextDecoder } = globalThis
-const haveNativeBuffer = Buffer && !Buffer.TYPED_ARRAY_SUPPORT
-export const nativeBuffer = haveNativeBuffer ? Buffer : null
-export const isHermes = !!globalThis.HermesInternal
-export const isDeno = !!globalThis.Deno
-export const isLE = /* @__PURE__ */ (() => new Uint8Array(Uint16Array.of(258).buffer)[0] === 2)()
+export * from './platform.js'
-// We consider Node.js TextDecoder/TextEncoder native
-let isNative = (x) => x && (haveNativeBuffer || `${x}`.includes('[native code]'))
-if (!haveNativeBuffer && isNative(() => {})) isNative = () => false // e.g. XS, we don't want false positives
-export const nativeEncoder = isNative(TextEncoder) ? new TextEncoder() : null
-export const nativeDecoder = isNative(TextDecoder)
-  ? new TextDecoder('utf-8', { ignoreBOM: true })
-  : null
-// Actually windows-1252, compatible with ascii and latin1 decoding
-// Beware that on non-latin1, i.e. on windows-1252, this is broken in ~all Node.js versions released
-// in 2025 due to a regression, so we call it Latin1 as it's usable only for that
-const getNativeLatin1 = () => {
-  // Not all barebone engines with TextDecoder support something except utf-8, detect
-  if (nativeDecoder) {
-    try {
-      return new TextDecoder('latin1', { ignoreBOM: true })
-    } catch {}
-  }
-  return null
-}
-export const nativeDecoderLatin1 = /* @__PURE__ */ getNativeLatin1()
-// Block Firefox < 146 specifically from using native hex/base64, as it's very slow there
-// Refs: https://bugzilla.mozilla.org/show_bug.cgi?id=1994067 (and linked issues), fixed in 146
-// Before that, all versions of Firefox >= 133 are slow
-// TODO: this could be removed when < 146 usage diminishes (note ESR)
-// We do not worry about false-negatives here but worry about false-positives!
-function shouldSkipBuiltins() {
-  const g = globalThis
-  // First, attempt to exclude as many things as we can using trivial checks, just in case, and to not hit ua
-  if (haveNativeBuffer || isHermes || !g.window || g.chrome || !g.navigator) return false
-  try {
-    // This was fixed specifically in Firefox 146. Other engines except Hermes (already returned) get this right
-    new WeakSet().add(Symbol()) // eslint-disable-line symbol-description
-    return false
-  } catch {
-    // In catch and not after in case if something too smart optimizes out code in try. False-negative is acceptable in that case
-    if (!('onmozfullscreenerror' in g)) return false // Firefox has it (might remove in the future, but we don't care)
-    return /firefox/i.test(g.navigator.userAgent || '') // as simple as we can
-  }
-  /* c8 ignore next */
-  return false // eslint-disable-line no-unreachable
-}
-export const skipWeb = /* @__PURE__ */ shouldSkipBuiltins()
-function decodePartAddition(a, start, end, m) {
-  let o = ''
-  let i = start
-  for (const last3 = end - 3; i < last3; i += 4) {
-    const x0 = a[i]
-    const x1 = a[i + 1]
-    const x2 = a[i + 2]
-    const x3 = a[i + 3]
-    o += m[x0]
-    o += m[x1]
-    o += m[x2]
-    o += m[x3]
-  }
-  while (i < end) o += m[a[i++]]
-  return o
-}
-// Decoding with templates is faster on Hermes
-function decodePartTemplates(a, start, end, m) {
-  let o = ''
-  let i = start
-  for (const last15 = end - 15; i < last15; i += 16) {
-    const x0 = a[i]
-    const x1 = a[i + 1]
-    const x2 = a[i + 2]
-    const x3 = a[i + 3]
-    const x4 = a[i + 4]
-    const x5 = a[i + 5]
-    const x6 = a[i + 6]
-    const x7 = a[i + 7]
-    const x8 = a[i + 8]
-    const x9 = a[i + 9]
-    const x10 = a[i + 10]
-    const x11 = a[i + 11]
-    const x12 = a[i + 12]
-    const x13 = a[i + 13]
-    const x14 = a[i + 14]
-    const x15 = a[i + 15]
-    o += `${m[x0]}${m[x1]}${m[x2]}${m[x3]}${m[x4]}${m[x5]}${m[x6]}${m[x7]}${m[x8]}${m[x9]}${m[x10]}${m[x11]}${m[x12]}${m[x13]}${m[x14]}${m[x15]}`
-  }
-  while (i < end) o += m[a[i++]]
-  return o
-}
-const decodePart = isHermes ? decodePartTemplates : decodePartAddition
-export function decode2string(arr, start, end, m) {
-  if (end - start > 30_000) {
-    // Limit concatenation to avoid excessive GC
-    // Thresholds checked on Hermes for toHex
-    const concat = []
-    for (let i = start; i < end; ) {
-      const step = i + 500
-      const iNext = step > end ? end : step
-      concat.push(decodePart(arr, i, iNext, m))
-      i = iNext
-    }
-    const res = concat.join('')
-    concat.length = 0
-    return res
-  }
-  return decodePart(arr, start, end, m)
-}
+const { Buffer } = globalThis
 export function assert(condition, msg) {
   if (!condition) throw new Error(msg)
 }
+export function assertU8(arg) {
+  if (!(arg instanceof Uint8Array)) throw new TypeError('Expected an Uint8Array')
+}
 // On arrays in heap (<= 64) it's cheaper to copy into a pooled buffer than lazy-create the ArrayBuffer storage
 export const toBuf = (x) =>
   x.byteLength <= 64 && x.BYTES_PER_ELEMENT === 1
@@ -133,3 +17,4 @@ export const toBuf = (x) =>
     : Buffer.from(x.buffer, x.byteOffset, x.byteLength)
 export const E_STRING = 'Input is not a string'
+export const E_STRICT_UNICODE = 'Input is not well-formed Unicode'
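The `assertU8` helper added above replaces the `assertUint8` calls throughout this diff. A short sketch of what the `instanceof` check accepts and rejects, with the function body copied from the hunk and the sample inputs chosen here for illustration:

```js
// Copied from the hunk above
function assertU8(arg) {
  if (!(arg instanceof Uint8Array)) throw new TypeError('Expected an Uint8Array')
}

// Hypothetical wrapper for the demo: reports whether the assertion passes
function accepts(value) {
  try {
    assertU8(value)
    return true
  } catch {
    return false
  }
}

console.log(accepts(new Uint8Array(4))) // true
console.log(accepts(new Uint8Array(8).subarray(2, 4))) // true, views are still Uint8Array
console.log(accepts(new Uint16Array(4))) // false, other typed arrays are rejected
console.log(accepts([1, 2, 3])) // false, plain arrays are rejected
console.log(accepts(new ArrayBuffer(4))) // false, raw buffers are rejected
```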

package/fallback/base32.js CHANGED

@@ -1,5 +1,5 @@
-import { assertUint8 } from '../assert.js'
-import { nativeEncoder, nativeDecoder, isHermes } from './_utils.js'
+import { assertU8 } from './_utils.js'
+import { nativeEncoder, nativeDecoder, isHermes } from './platform.js'
 import { encodeAscii, decodeAscii } from './latin1.js'
 // See https://datatracker.ietf.org/doc/html/rfc4648
@@ -18,7 +18,7 @@ const useTemplates = isHermes // Faster on Hermes and JSC, but we use it only on
 // We construct output by concatenating chars, this seems to be fine enough on modern JS engines
 export function toBase32(arr, isBase32Hex, padding) {
-  assertUint8(arr)
+  assertU8(arr)
   const fullChunks = Math.floor(arr.length / 5)
   const fullChunksBytes = fullChunks * 5
   let o = ''

package/fallback/base58check.js CHANGED

@@ -1,6 +1,6 @@
 import { typedView } from '@exodus/bytes/array.js'
 import { toBase58, fromBase58 } from '@exodus/bytes/base58.js'
-import { assertUint8 } from '../assert.js'
+import { assertU8 } from './_utils.js'
 const E_CHECKSUM = 'Invalid checksum'
@@ -28,7 +28,7 @@ function assertChecksum(c, r) {
 export const makeBase58check = (hashAlgo, hashAlgoSync) => {
   const apis = {
     async encode(arr) {
-      assertUint8(arr)
+      assertU8(arr)
       return encodeWithChecksum(arr, await hashAlgo(arr))
     },
     async decode(str, format = 'uint8') {
@@ -41,7 +41,7 @@ export const makeBase58check = (hashAlgo, hashAlgoSync) => {
   return {
     ...apis,
     encodeSync(arr) {
-      assertUint8(arr)
+      assertU8(arr)
       return encodeWithChecksum(arr, hashAlgoSync(arr))
     },
     decodeSync(str, format = 'uint8') {

package/fallback/base64.js CHANGED

@@ -1,5 +1,4 @@
-import { assertUint8 } from '../assert.js'
-import { nativeEncoder, nativeDecoder } from './_utils.js'
+import { nativeEncoder, nativeDecoder } from './platform.js'
 import { encodeAscii, decodeAscii } from './latin1.js'
 // See https://datatracker.ietf.org/doc/html/rfc4648
@@ -15,8 +14,8 @@ export const E_LENGTH = 'Invalid base64 length'
 export const E_LAST = 'Invalid last chunk'
 // We construct output by concatenating chars, this seems to be fine enough on modern JS engines
+// Expects a checked Uint8Array
 export function toBase64(arr, isURL, padding) {
-  assertUint8(arr)
   const fullChunks = (arr.length / 3) | 0
   const fullChunksBytes = fullChunks * 3
   let o = ''

package/fallback/encoding.api.js CHANGED

@@ -1,32 +1,3 @@
-import labels from './encoding.labels.js'
-let labelsMap
-export const E_ENCODING = 'Unknown encoding'
-// Warning: unlike whatwg-encoding, returns lowercased labels
-// Those are case-insensitive and that's how TextDecoder encoding getter normalizes them
-// https://encoding.spec.whatwg.org/#names-and-labels
-export function normalizeEncoding(label) {
-  // fast path
-  if (label === 'utf-8' || label === 'utf8' || label === 'UTF-8' || label === 'UTF8') return 'utf-8'
-  if (label === 'windows-1252' || label === 'ascii' || label === 'latin1') return 'windows-1252'
-  // full map
-  if (/[^\w\t\n\f\r .:-]/i.test(label)) return null // must be ASCII (with ASCII whitespace)
-  const low = `${label}`.trim().toLowerCase()
-  if (Object.hasOwn(labels, low)) return low
-  if (!labelsMap) {
-    labelsMap = new Map()
-    for (const [label, aliases] of Object.entries(labels)) {
-      for (const alias of aliases) labelsMap.set(alias, label)
-    }
-  }
-  const mapped = labelsMap.get(low)
-  if (mapped) return mapped
-  return null
-}
 // TODO: make this more strict against Symbol.toStringTag
 // Is not very significant though, anything faking Symbol.toStringTag could as well override
 // prototypes, which is not something we protect against
@@ -65,17 +36,3 @@ export function getBOMEncoding(input) {
   if (u8[0] === 0xfe && u8[1] === 0xff) return 'utf-16be'
   return null
 }
-const uppercasePrefixes = new Set(['utf', 'iso', 'koi', 'euc', 'ibm', 'gbk'])
-// Unlike normalizeEncoding, case-sensitive
-// https://encoding.spec.whatwg.org/#names-and-labels
-export function labelToName(label) {
-  const enc = normalizeEncoding(label)
-  if (enc === 'utf-8') return 'UTF-8' // fast path
-  if (!enc) return enc
-  if (uppercasePrefixes.has(enc.slice(0, 3))) return enc.toUpperCase()
-  if (enc === 'big5') return 'Big5'
-  if (enc === 'shift_jis') return 'Shift_JIS'
-  return enc
-}

package/fallback/encoding.js CHANGED

@@ -5,17 +5,56 @@ import { utf16toString, utf16toStringLoose } from '@exodus/bytes/utf16.js'
 import { utf8fromStringLoose, utf8toString, utf8toStringLoose } from '@exodus/bytes/utf8.js'
 import { createSinglebyteDecoder } from '@exodus/bytes/single-byte.js'
 import labels from './encoding.labels.js'
-import { fromSource, getBOMEncoding, normalizeEncoding, E_ENCODING } from './encoding.api.js'
+import { fromSource, getBOMEncoding } from './encoding.api.js'
 import { unfinishedBytes, mergePrefix } from './encoding.util.js'
-export { labelToName, getBOMEncoding, normalizeEncoding } from './encoding.api.js'
+export { getBOMEncoding } from './encoding.api.js'
+export const E_ENCODING = 'Unknown encoding'
 const E_MULTI = "import '@exodus/bytes/encoding.js' for legacy multi-byte encodings support"
 const E_OPTIONS = 'The "options" argument must be of type object'
 const replacementChar = '\uFFFD'
 const multibyteSet = new Set(['big5', 'euc-kr', 'euc-jp', 'iso-2022-jp', 'shift_jis', 'gbk', 'gb18030']) // prettier-ignore
 let createMultibyteDecoder, multibyteEncoder
+let labelsMap
+// Warning: unlike whatwg-encoding, returns lowercased labels
+// Those are case-insensitive and that's how TextDecoder encoding getter normalizes them
+// https://encoding.spec.whatwg.org/#names-and-labels
+export function normalizeEncoding(label) {
+  // fast path
+  if (label === 'utf-8' || label === 'utf8' || label === 'UTF-8' || label === 'UTF8') return 'utf-8'
+  if (label === 'windows-1252' || label === 'ascii' || label === 'latin1') return 'windows-1252'
+  // full map
+  if (/[^\w\t\n\f\r .:-]/i.test(label)) return null // must be ASCII (with ASCII whitespace)
+  const low = `${label}`.trim().toLowerCase()
+  if (Object.hasOwn(labels, low)) return low
+  if (!labelsMap) {
+    labelsMap = new Map()
+    for (const [name, aliases] of Object.entries(labels)) {
+      for (const alias of aliases) labelsMap.set(alias, name)
+    }
+  }
+  const mapped = labelsMap.get(low)
+  if (mapped) return mapped
+  return null
+}
+const uppercasePrefixes = new Set(['utf', 'iso', 'koi', 'euc', 'ibm', 'gbk'])
+// Unlike normalizeEncoding, case-sensitive
+// https://encoding.spec.whatwg.org/#names-and-labels
+export function labelToName(label) {
+  const enc = normalizeEncoding(label)
+  if (enc === 'utf-8') return 'UTF-8' // fast path
+  if (!enc) return enc
+  if (uppercasePrefixes.has(enc.slice(0, 3))) return enc.toUpperCase()
+  if (enc === 'big5') return 'Big5'
+  if (enc === 'shift_jis') return 'Shift_JIS'
+  return enc
+}
 export const isMultibyte = (enc) => multibyteSet.has(enc)
 export function setMultibyte(createDecoder, createEncoder) {
   createMultibyteDecoder = createDecoder

package/fallback/encoding.labels.js CHANGED

@@ -4,43 +4,47 @@
 // prettier-ignore
 const labels = {
   'utf-8': ['unicode-1-1-utf-8', 'unicode11utf8', 'unicode20utf8', 'utf8', 'x-unicode20utf8'],
-  ibm866: ['866', 'cp866', 'csibm866'],
-  'iso-8859-2': ['csisolatin2', 'iso-ir-101', 'iso8859-2', 'iso88592', 'iso_8859-2', 'iso_8859-2:1987', 'l2', 'latin2'],
-  'iso-8859-3': ['csisolatin3', 'iso-ir-109', 'iso8859-3', 'iso88593', 'iso_8859-3', 'iso_8859-3:1988', 'l3', 'latin3'],
-  'iso-8859-4': ['csisolatin4', 'iso-ir-110', 'iso8859-4', 'iso88594', 'iso_8859-4', 'iso_8859-4:1988', 'l4', 'latin4'],
-  'iso-8859-5': ['csisolatincyrillic', 'cyrillic', 'iso-ir-144', 'iso8859-5', 'iso88595', 'iso_8859-5', 'iso_8859-5:1988'],
-  'iso-8859-6': ['arabic', 'asmo-708', 'csiso88596e', 'csiso88596i', 'csisolatinarabic', 'ecma-114', 'iso-8859-6-e', 'iso-8859-6-i', 'iso-ir-127', 'iso8859-6', 'iso88596', 'iso_8859-6', 'iso_8859-6:1987'],
-  'iso-8859-7': ['csisolatingreek', 'ecma-118', 'elot_928', 'greek', 'greek8', 'iso-ir-126', 'iso8859-7', 'iso88597', 'iso_8859-7', 'iso_8859-7:1987', 'sun_eu_greek'],
-  'iso-8859-8': ['csiso88598e', 'csisolatinhebrew', 'hebrew', 'iso-8859-8-e', 'iso-ir-138', 'iso8859-8', 'iso88598', 'iso_8859-8', 'iso_8859-8:1988', 'visual'],
+  'utf-16be': ['unicodefffe'],
+  'utf-16le': ['csunicode', 'iso-10646-ucs-2', 'ucs-2', 'unicode', 'unicodefeff', 'utf-16'],
+  'iso-8859-2': ['iso-ir-101'],
+  'iso-8859-3': ['iso-ir-109'],
+  'iso-8859-4': ['iso-ir-110'],
+  'iso-8859-5': ['csisolatincyrillic', 'cyrillic', 'iso-ir-144'],
+  'iso-8859-6': ['arabic', 'asmo-708', 'csiso88596e', 'csiso88596i', 'csisolatinarabic', 'ecma-114', 'iso-8859-6-e', 'iso-8859-6-i', 'iso-ir-127'],
+  'iso-8859-7': ['csisolatingreek', 'ecma-118', 'elot_928', 'greek', 'greek8', 'iso-ir-126', 'sun_eu_greek'],
+  'iso-8859-8': ['csiso88598e', 'csisolatinhebrew', 'hebrew', 'iso-8859-8-e', 'iso-ir-138', 'visual'],
   'iso-8859-8-i': ['csiso88598i', 'logical'],
-  'iso-8859-10': ['csisolatin6', 'iso-ir-157', 'iso8859-10', 'iso885910', 'l6', 'latin6'],
-  'iso-8859-13': ['iso8859-13', 'iso885913'],
-  'iso-8859-14': ['iso8859-14', 'iso885914'],
-  'iso-8859-15': ['csisolatin9', 'iso8859-15', 'iso885915', 'iso_8859-15', 'l9'],
   'iso-8859-16': [],
   'koi8-r': ['cskoi8r', 'koi', 'koi8', 'koi8_r'],
   'koi8-u': ['koi8-ru'],
-  macintosh: ['csmacintosh', 'mac', 'x-mac-roman'],
   'windows-874': ['dos-874', 'iso-8859-11', 'iso8859-11', 'iso885911', 'tis-620'],
+  ibm866: ['866', 'cp866', 'csibm866'],
   'x-mac-cyrillic': ['x-mac-ukrainian'],
+  macintosh: ['csmacintosh', 'mac', 'x-mac-roman'],
   gbk: ['chinese', 'csgb2312', 'csiso58gb231280', 'gb2312', 'gb_2312', 'gb_2312-80', 'iso-ir-58', 'x-gbk'],
   gb18030: [],
   big5: ['big5-hkscs', 'cn-big5', 'csbig5', 'x-x-big5'],
   'euc-jp': ['cseucpkdfmtjapanese', 'x-euc-jp'],
-  'iso-2022-jp': ['csiso2022jp'],
   shift_jis: ['csshiftjis', 'ms932', 'ms_kanji', 'shift-jis', 'sjis', 'windows-31j', 'x-sjis'],
   'euc-kr': ['cseuckr', 'csksc56011987', 'iso-ir-149', 'korean', 'ks_c_5601-1987', 'ks_c_5601-1989', 'ksc5601', 'ksc_5601', 'windows-949'],
+  'iso-2022-jp': ['csiso2022jp'],
   replacement: ['csiso2022kr', 'hz-gb-2312', 'iso-2022-cn', 'iso-2022-cn-ext', 'iso-2022-kr'],
-  'utf-16be': ['unicodefffe'],
-  'utf-16le': ['csunicode', 'iso-10646-ucs-2', 'ucs-2', 'unicode', 'unicodefeff', 'utf-16'],
   'x-user-defined': [],
 }
+for (const i of [10, 13, 14, 15]) labels[`iso-8859-${i}`] = [`iso8859-${i}`, `iso8859${i}`]
+for (const i of [2, 6, 7]) labels[`iso-8859-${i}`].push(`iso_8859-${i}:1987`)
+for (const i of [3, 4, 5, 8]) labels[`iso-8859-${i}`].push(`iso_8859-${i}:1988`)
+// prettier-ignore
+for (let i = 2; i < 9; i++) labels[`iso-8859-${i}`].push(`iso8859-${i}`, `iso8859${i}`, `iso_8859-${i}`)
+for (let i = 2; i < 5; i++) labels[`iso-8859-${i}`].push(`csisolatin${i}`, `l${i}`, `latin${i}`)
 for (let i = 0; i < 9; i++) labels[`windows-125${i}`] = [`cp125${i}`, `x-cp125${i}`]
 // prettier-ignore
 labels['windows-1252'].push('ansi_x3.4-1968', 'ascii', 'cp819', 'csisolatin1', 'ibm819', 'iso-8859-1', 'iso-ir-100', 'iso8859-1', 'iso88591', 'iso_8859-1', 'iso_8859-1:1987', 'l1', 'latin1', 'us-ascii')
 // prettier-ignore
 labels['windows-1254'].push('csisolatin5', 'iso-8859-9', 'iso-ir-148', 'iso8859-9', 'iso88599', 'iso_8859-9', 'iso_8859-9:1989', 'l5', 'latin5')
+labels['iso-8859-10'].push('csisolatin6', 'iso-ir-157', 'l6', 'latin6')
+labels['iso-8859-15'].push('csisolatin9', 'iso_8859-15', 'l9')
 export default labels
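The hunk above replaces several literal alias lists with loops that generate them. One way to check that such a refactor keeps the label set intact is to rebuild a single key and compare it with the removed literal. The sketch below does this for `iso-8859-2` only, and `sameSet` is a hypothetical helper:

```js
// Alias list for iso-8859-2 as it appeared on the removed line
const before = ['csisolatin2', 'iso-ir-101', 'iso8859-2', 'iso88592', 'iso_8859-2', 'iso_8859-2:1987', 'l2', 'latin2']

// New construction, restricted to this one key and applied in the same order as the loops above
const labels = { 'iso-8859-2': ['iso-ir-101'] }
const i = 2
labels[`iso-8859-${i}`].push(`iso_8859-${i}:1987`) // from the [2, 6, 7] loop
labels[`iso-8859-${i}`].push(`iso8859-${i}`, `iso8859${i}`, `iso_8859-${i}`) // from the 2..8 loop
labels[`iso-8859-${i}`].push(`csisolatin${i}`, `l${i}`, `latin${i}`) // from the 2..4 loop

// Order-insensitive comparison of two alias lists
const sameSet = (a, b) => a.length === b.length && [...a].sort().join() === [...b].sort().join()

console.log(sameSet(before, labels['iso-8859-2'])) // true
```

Only the order of aliases differs between the two versions, which does not matter for an alias lookup.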