@bigdreamsweb3/wordbin 1.0.6 → 1.0.8

package/README.md CHANGED
@@ -1,149 +1,364 @@
1
- # WordBin: Encode words & short phrases into tiny, reversible binary
2
-
3
- ![npm](https://img.shields.io/npm/v/@bigdreamsweb3/wordbin) ![license](https://img.shields.io/npm/l/@bigdreamsweb3/wordbin) ![node](https://img.shields.io/node/v/@bigdreamsweb3/wordbin)
4
-
5
- **WordBin** is a deterministic, reversible word-to-binary encoder/decoder that compresses short human-readable phrases into **tiny, predictable binary payloads**.
6
-
7
- Optimized for **Web3, blockchain, QR codes, IoT, URLs, and other space-sensitive applications**, WordBin ensures short sequences of words can be stored efficiently and recovered reliably.
8
-
9
- ---
10
-
11
- ## Why WordBin
12
-
13
- - Compress short phrases (crypto seeds, tags, keywords) into **2–4 byte IDs per word**.
14
- - Fully **deterministic & collision-safe** — same dictionary = same encoding.
15
- - Minimal payload → perfect for **blockchain metadata, NFC/QR codes, and low-bandwidth IoT**.
16
- - Works in **Node.js & browser**, no runtime dependencies.
17
- - Bundled **small dictionary (BIP-39)** with optional **large/custom dictionaries**.
18
-
19
- ---
20
-
21
- ## Quick Install
22
-
23
- ```bash
24
- npm install @bigdreamsweb3/wordbin
25
- ```
26
-
27
- > The small **v1 (BIP-39)** dictionary is included, so you can encode/decode immediately.
28
-
29
- ---
30
-
31
- ## Quick Start (Node.js / Browser)
32
-
33
- ```js
34
- import { WordBin } from "@bigdreamsweb3/wordbin";
35
-
36
- // Use latest dictionary (v1 bundled, v2 if built)
37
- const wb = await WordBin.create();
38
-
39
- const phrase =
40
- "pet either learn purse candy leader craft undo spoil forum slot spirit";
41
- const encoded = await wb.encode(phrase);
42
-
43
- console.log("Base64 payload:", encoded.payload);
44
- // → Base64 payload: Aru6lPEGao2lRWajTRYITVtmNVVD8BWvHGhOtBplWKaJmQ==
45
-
46
- console.log("Dictionary Version used (header byte):", encoded.dictVersion);
47
- // → Dictionary Version used (header byte): 2
48
-
49
- console.log("Encoded bytes:", encoded.encodedBytes);
50
- // Encoded bytes: 34
51
-
52
- console.log("Original bytes:", encoded.originalBytes);
53
- // → Original bytes: 70
54
-
55
- console.log("Bytes saved:", encoded.bytesSaved);
56
- // → Bytes saved: 36
57
-
58
- const decoded = await wb.decode(encoded.encoded);
59
- console.log("Decoded:", decoded); // → "pet either learn purse candy leader craft undo spoil forum slot spirit"
60
- ```
61
-
62
- ### Browser Example
63
-
64
- ```html
65
- <script type="module">
66
- import { WordBin } from "https://esm.sh/@bigdreamsweb3/wordbin";
67
- const wb = await WordBin.create();
68
- </script>
69
- ```
70
-
71
- ---
72
-
73
- ## CLI – Build Larger or Custom Dictionaries
74
-
75
- ```bash
76
- # Interactive mode
77
- npx wordbin build
78
-
79
- # Direct commands
80
- npx wordbin build --version 1 # BIP-39 (small, bundled)
81
- npx wordbin build --version 2 # dwyl/english-words (~466k)
82
- npx wordbin build --all # Both v1 + v2
83
- npx wordbin build --custom https://example.com/mywords.txt
84
- npx wordbin build --custom ./my-local-words.txt
85
- ```
86
-
87
- > All generated files go to `./data/` in your current directory.
88
-
89
- ---
90
-
91
- ## Dictionary Versions
92
-
93
- | Version | Filename | Words | Source | ID bytes | Best for |
94
- | ------- | --------------------- | -------- | ------------------ | -------- | --------------------------------------- |
95
- | v1 | wordbin-v1-bip39.json | 2,048 | BIP-39 English | ~2 | Crypto recovery seeds, high reliability |
96
- | v2 | wordbin-v2-dwyl.json | ~479,000 | dwyl/english-words | 2–4 | General English, tags, keywords |
97
-
98
- ---
99
-
100
- ## How It Works
101
-
102
- 1. Each word is hashed; the first **2–4 bytes** of the hash are its fixed ID.
103
- 2. If the ID is in the dictionary → use the short ID.
104
- Otherwise store the word as a compact literal block.
105
- 3. Payload begins with one byte: **dictionary version**.
106
- 4. Decoding uses the same dictionary to reverse IDs back to the original words.
107
- 5. Backtracking resolves any ID-length ambiguity, which guarantees correct output.
108
-
109
- ---
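Reviewer note on the "How It Works" steps above: a minimal sketch of the ID derivation, based on `getIdByteLength` and `generateWordId` as they appear in the source map bundled with this release. The helper name `wordIdHex` and the synchronous `node:crypto` call are assumptions made for illustration; the package itself is async and prefers Web Crypto when it is available.

```typescript
import { createHash } from "node:crypto";

// Tiered ID length by word length, mirroring getIdByteLength in the source map.
function getIdByteLength(wordLength: number): number {
  if (wordLength <= 4) return 2;
  if (wordLength <= 9) return 3;
  return 4;
}

// Hypothetical helper: SHA-256 of the normalized word, truncated to the tier length.
function wordIdHex(word: string): string {
  const normalized = word.trim().toLowerCase();
  if (!normalized) throw new Error("Cannot generate ID for empty string");
  const fullHex = createHash("sha256").update(normalized, "utf8").digest("hex");
  // Two hex characters per byte.
  return fullHex.slice(0, 2 * getIdByteLength(normalized.length));
}

console.log(wordIdHex("pet").length / 2);           // ID length in bytes for a 3-letter word
console.log(wordIdHex("either").length / 2);        // ID length in bytes for a 6-letter word
console.log(wordIdHex("international").length / 2); // ID length in bytes for a 13-letter word
```

Because the ID is only a hash prefix, two words can share an ID; the 1.0.8 builder change later in this diff deals with that case.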
110
-
111
- ## Real-World Use Cases
112
-
113
- | Use Case | Typical Savings | Dictionary |
114
- | ------------------------------------- | --------------- | ---------- |
115
- | Crypto recovery phrases (12–24 words) | 50–65% | v1 |
116
- | Short tags / keywords | 40–70% | v2 |
117
- | Tiny QR codes / NFC labels | 50–80% | v1 or v2 |
118
- | Short links / URL-friendly tags | 45–65% | v2 |
119
- | On-chain metadata (Ethereum, Solana) | 40–70% | v2 |
120
- | LoRa / low-bandwidth IoT labels | 50–75% | v2 |
121
- | Offline phrase storage / search | 35–60% | v2 |
122
-
123
- > Not suitable for long texts, arbitrary binary files, or when dictionaries cannot be shared reliably.
124
-
125
- ---
126
-
127
- ## Loading Specific Dictionary Versions
128
-
129
- ```js
130
- import { loadDictionaryByVersion } from "@bigdreamsweb3/wordbin";
131
-
132
- const dictV1 = await loadDictionaryByVersion(1); // always available
133
- const wbV1 = new WordBin(dictV1);
134
-
135
- const dictV2 = await loadDictionaryByVersion(2); // throws if not built
136
- ```
137
-
138
- ---
139
-
140
- ## License
141
-
142
- MIT
143
-
144
- ---
145
-
146
- ## Contributing
147
-
148
- Want to help improve WordBin?
149
- Read our [CONTRIBUTING.md](./CONTRIBUTING.md) guide.
1
+ # WordBin: Compact, Reversible Word-Phrase Encoder
2
+
3
+ [![npm version](https://img.shields.io/npm/v/@bigdreamsweb3/wordbin?style=flat-square)](https://www.npmjs.com/package/@bigdreamsweb3/wordbin)
4
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
5
+ [![Node.js](https://img.shields.io/node/v/@bigdreamsweb3/wordbin?style=flat-square)](https://nodejs.org)
6
+ [![Tests](https://img.shields.io/badge/tests-passing-brightgreen?style=flat-square)](https://github.com/bigdreamsweb3/wordbin/actions)
7
+
8
+ **WordBin** encodes short human-readable phrases — crypto seeds, metadata tags, labels, keywords — into **compact, deterministic binary payloads** that can be perfectly reversed back to the original words.
9
+
10
+ Designed for **Web3 metadata**, **QR codes**, **blockchain events**, **IoT**, **NFC tags**, **short URLs**, and any **low-bandwidth or storage-constrained** environment where every byte matters.
11
+
12
+ ---
13
+
14
+ ### Real Output
15
+
16
+ ```
17
+ Input : "stock ridge avoid school honey trap wait wheel worry face differ wedding"
18
+ Words : 12
19
+ Output : 34 bytes (original: 72 bytes)
20
+ Saved : 38 bytes — 47% of original size
21
+
22
+ Hex : 0108c424409e363270f7d64deba55e2e11ba716eba59926de2f50282599fc5afd1a8
23
+ Base58 : 2MepGpLHGPPmnrdzjmpqet2XFQ2YGMSpQoDXDex7toUBdZ
24
+ Base64 : AQjEJECeNjJw99ZN66VeLhG6cW66WZJt4vUCglmfxa/RqA==
25
+ Bin21 : ☺◄Ä$@ž6rp÷ÖMë¥^.►ºqnºY™mâõ☻™Å¯Ñ¨
26
+
27
+ Decoded: "stock ridge avoid school honey trap wait wheel worry face differ wedding" ✓
28
+ ```
29
+
30
+ ---
31
+
32
+ ## Why WordBin?
33
+
34
+ - **40–70% size reduction** on typical short phrases
35
+ - **Deterministic** — same input + same dictionary = same output, every time
36
+ - **Lossless**: decode is always a perfect round-trip
37
+ - **Universal decoder** accepts hex, Base58, Base64, Bin21, or raw bytes; format is auto-detected
38
+ - **Resilient** — non-WordBin payloads are never rejected; partial word extraction is attempted before falling back gracefully
39
+ - **No runtime dependencies** — works in Node.js and the browser
40
+ - **Flexible dictionaries**: BIP-39 (v1, bundled), large English (v2), or custom wordlists
41
+
42
+ ---
43
+
44
+ ## Install
45
+
46
+ ```bash
47
+ npm install @bigdreamsweb3/wordbin
48
+ ```
49
+
50
+ > Ships with **v1 (BIP-39, 2048 words)** pre-bundled — works out of the box.
51
+
52
+ ---
53
+
54
+ ## Quick Start
55
+
56
+ ```ts
57
+ import { WordBin } from "@bigdreamsweb3/wordbin";
58
+
59
+ const wb = await WordBin.create();
60
+
61
+ const phrase =
62
+ "stock ridge avoid school honey trap wait wheel worry face differ wedding";
63
+
64
+ // ── Encode ────────────────────────────────────────────────────────────────────
65
+ const encoded = await wb.encode(phrase);
66
+
67
+ console.log(encoded.hexPayload); // standard hex string
68
+ console.log(encoded.base58Payload); // Base58 string
69
+ console.log(encoded.base64Payload); // Base64 string
70
+ console.log(encoded.payload); // Bin21 (1 char per byte, most compact printable form)
71
+ console.log(encoded.encodedBytes); // 34
72
+ console.log(encoded.originalBytes); // 72
73
+ console.log(encoded.ratioPercent); // 47.22
74
+
75
+ // ── Decode — pass any format, it's auto-detected ──────────────────────────────
76
+ const r1 = await wb.decode(encoded.hexPayload); // DetectedFormat: "hex"
77
+ const r2 = await wb.decode(encoded.base58Payload); // DetectedFormat: "base58"
78
+ const r3 = await wb.decode(encoded.base64Payload); // DetectedFormat: "base64"
79
+ const r4 = await wb.decode(encoded.payload); // DetectedFormat: "bin21"
80
+ const r5 = await wb.decode(encoded.encoded); // DetectedFormat: "bytes" (Uint8Array)
81
+
82
+ console.log(r1.text); // "stock ridge avoid school honey trap..."
83
+ console.log(r1.isWordBin); // true
84
+ ```
85
+
86
+ ### Browser (ESM)
87
+
88
+ ```html
89
+ <script type="module">
90
+ import { WordBin } from "https://esm.sh/@bigdreamsweb3/wordbin";
91
+ const wb = await WordBin.create();
92
+ const { hexPayload } = await wb.encode("abandon ability able");
93
+ const { text } = await wb.decode(hexPayload);
94
+ console.log(text); // "abandon ability able"
95
+ </script>
96
+ ```
97
+
98
+ ---
99
+
100
+ ## Payload Formats
101
+
102
+ WordBin exposes the same encoded bytes in four string representations, plus the raw `Uint8Array`. Pass any of them to `decode()`; the format is detected automatically.
103
+
104
+ | Format | Field | Description | Size |
105
+ | ---------- | --------------- | -------------------------------------- | -------------------------------- |
106
+ | **Hex**    | `hexPayload`    | Lowercase hex, 2 chars per byte        | 2× raw                           |
107
+ | **Base58** | `base58Payload` | URL-safe, no ambiguous chars (0/O/I/l) | ~1.4× raw |
108
+ | **Base64** | `base64Payload` | Standard Base64 with `=` padding | ~1.33× raw |
109
+ | **Bin21** | `payload` | Latin-1 string, 1 char per byte | 1× raw — smallest printable form |
110
+ | **Bytes** | `encoded` | Raw `Uint8Array` | 1× raw |
111
+
112
+ > **Bin21** is WordBin's signature format: each encoded byte maps to exactly one character. No expansion. A 34-byte payload is a 34-character string.
113
+
114
+ ---
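Reviewer note on the format table above: a small sketch that checks the size claims against the hex payload from the README's "Real Output" block. `fromHex` and `toHex` follow the helpers visible in the bundled source map; `toLatin1` and `fromLatin1` are hypothetical stand-ins for Bin21 that assume a plain byte-to-code-point mapping, which may not match the package's actual handling of control bytes.

```typescript
// Hex payload copied from the "Real Output" block of the new README.
const hexPayload =
  "0108c424409e363270f7d64deba55e2e11ba716eba59926de2f50282599fc5afd1a8";

function fromHex(hex: string): Uint8Array {
  if (hex.length % 2 !== 0) throw new Error("Invalid hex string length");
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < hex.length; i += 2) {
    bytes[i / 2] = parseInt(hex.slice(i, i + 2), 16);
  }
  return bytes;
}

function toHex(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return out;
}

// Assumed Bin21-like mapping: one character (code point 0..255) per byte.
function toLatin1(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

function fromLatin1(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i) & 0xff;
  return bytes;
}

const raw = fromHex(hexPayload);
console.log(raw.length);           // 34 raw bytes
console.log(hexPayload.length);    // 68 hex characters, i.e. 2× raw
console.log(toLatin1(raw).length); // 34 characters, i.e. 1× raw
console.log(raw[0]);               // 1, the dictionary version header
```

The one-character-per-byte figure counts characters, not stored bytes: if such a string is later serialized as UTF-8, every code point above 0x7F takes two bytes.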
115
+
116
+ ## Decode API
117
+
118
+ `wb.decode(payload)` always returns a `DecodeResult`; it never throws.
119
+
120
+ ```ts
121
+ interface DecodeResult {
122
+ text: string; // decoded words, or best-effort extraction
123
+ isWordBin: boolean; // true = valid WordBin payload, perfectly decoded
124
+ detectedFormat: PayloadFormat; // "hex" | "base58" | "base64" | "bin21" | "bytes"
125
+ notice?: string; // present when payload is not a valid WordBin stream
126
+ rawSegments?: string[]; // unmatched bytes shown as [0xXX], non-WordBin only
127
+ }
128
+ ```
129
+
130
+ ### Decode behaviour
131
+
132
+ ```
133
+ Payload received
134
+        │
135
+        ▼
136
+ Format detection ──── hex / base58 / base64 / bin21 / bytes
137
+        │
138
+        ▼
139
+ Strict WordBin parse (all installed dictionary versions)
140
+        │                     │
141
+     Success               Failure
142
+        │                     │
143
+ isWordBin: true       Partial scan ── extract words where bytes match
144
+ text: original               │        preserve unmatched bytes as [0xXX]
145
+                       isWordBin: false
146
+                       notice: explains what happened
147
+ ```
148
+
149
+ **Any payload is accepted.** If bytes don't match any dictionary, a partial word extraction is attempted across all installed dictionaries. Remaining bytes are preserved as `[0xXX]` markers so nothing is silently discarded.
150
+
151
+ ```ts
152
+ // Non-WordBin payload — still handled gracefully
153
+ const result = await wb.decode("48656c6c6f20576f726c64"); // "Hello World" as hex
154
+
155
+ console.log(result.isWordBin); // false
156
+ console.log(result.notice); // "This does not appear to be a valid WordBin payload..."
157
+ console.log(result.rawSegments); // ["[0x48]", "[0x65]", ...] — unmatched bytes
158
+ ```
159
+
160
+ ---
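Reviewer note on format auto-detection: the diff does not show `detectAndConvert()`, so the following is only a guess at how such a heuristic could be ordered, not the package's implementation. The function name `guessFormat` and the regexes are assumptions; the `PayloadFormat` union is taken from the README.

```typescript
type PayloadFormat = "hex" | "base58" | "base64" | "bin21" | "bytes";

const HEX = /^[0-9a-fA-F]+$/;
// Bitcoin-style Base58 alphabet: no 0, O, I, or l.
const BASE58 = /^[1-9A-HJ-NP-Za-km-z]+$/;
const BASE64 = /^[A-Za-z0-9+/]+={0,2}$/;

// Hypothetical heuristic: try the most restrictive alphabet first,
// and treat anything left over as a one-char-per-byte string.
function guessFormat(payload: string | Uint8Array): PayloadFormat {
  if (typeof payload !== "string") return "bytes";
  if (payload.length > 0 && payload.length % 2 === 0 && HEX.test(payload)) return "hex";
  if (BASE58.test(payload)) return "base58";
  if (payload.length % 4 === 0 && BASE64.test(payload)) return "base64";
  return "bin21";
}

console.log(guessFormat("48656c6c6f20576f726c64"));
console.log(guessFormat("2MepGpLHGPPmnrdzjmpqet2XFQ2YGMSpQoDXDex7toUBdZ"));
console.log(guessFormat("AQjEJECeNjJw99ZN66VeLhG6cW66WZJt4vUCglmfxa/RqA=="));
console.log(guessFormat(new Uint8Array([1, 8, 196])));
```

The alphabets overlap (every hex string is also made of valid Base64 characters, and many Base58 strings are too), so any detector of this kind is order-dependent. That is presumably why the README's flow falls back to a strict parse against the dictionaries rather than trusting the detected format alone.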
161
+
162
+ ## How Encoding Works
163
+
164
+ 1. **Version header** — first byte identifies the dictionary version (`0x01` for v1)
165
+ 2. **Dictionary lookup** — each word in the phrase is replaced by its compact binary ID (1–4 bytes)
166
+ 3. **Literal fallback** — words not in the dictionary are stored as `varint length + UTF-8 bytes`
167
+ 4. **Payload representations** — the raw bytes are encoded into hex, Base58, Base64, and Bin21
168
+
169
+ Payloads are **self-describing** (the version byte is embedded) and **fully lossless**.
170
+
171
+ ---
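Reviewer note on steps 1 and 3 above: a sketch of the version header followed by one literal block. `encodeVarint` follows the LEB128 helper visible in the bundled source map; `literalBlock` is a hypothetical name, handles ASCII only for brevity, and leaves out whatever marker the real stream uses to tell literals apart from dictionary IDs (the project layout mentions a `LITERAL` token, which this diff does not show).

```typescript
// LEB128-style varint: 7 payload bits per byte, high bit set on all but the last byte.
function encodeVarint(n: number): number[] {
  if (n < 0) throw new Error("Varint cannot encode negative numbers");
  const out: number[] = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    out.push(byte);
  } while (n !== 0);
  return out;
}

// Hypothetical literal block: varint length, then the word's bytes (ASCII only here).
function literalBlock(word: string): number[] {
  const utf8: number[] = [];
  for (let i = 0; i < word.length; i++) {
    const code = word.charCodeAt(i);
    if (code > 0x7f) throw new Error("This sketch handles ASCII words only");
    utf8.push(code);
  }
  return encodeVarint(utf8.length).concat(utf8);
}

// Version header 0x01 (dictionary v1) followed by the literal "gm".
const sketchPayload = [0x01].concat(literalBlock("gm"));
console.log(sketchPayload);     // [ 1, 2, 103, 109 ]
console.log(encodeVarint(300)); // [ 172, 2 ]
```

A length prefix keeps literals self-delimiting, so a decoder can skip or copy them without scanning for a terminator.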
172
+
173
+ ## Compression by Use Case
174
+
175
+ | Use case | Words | Original | Encoded | Saved |
176
+ | -------------------------- | ----- | ----------- | ----------- | ----- |
177
+ | 12-word BIP-39 seed phrase | 12 | ~72 bytes | ~34 bytes | ~53% |
178
+ | Crypto metadata / labels | 5–8 | 30–50 bytes | 12–24 bytes | ~55% |
179
+ | Short tag / keyword list | 3–6 | 20–40 bytes | 8–18 bytes | ~60% |
180
+ | English sentence | 8–15 | 50–90 bytes | 25–45 bytes | ~50% |
181
+
182
+ ---
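Reviewer note on the percentages: the README quotes both "47%" (encoded size relative to the original) and "~53%" (bytes saved) for the same 12-word phrase. The arithmetic below shows how the two figures relate; `compressionStats` is a hypothetical helper, and `phrase.length` is used as the byte count on the assumption that the phrase is pure ASCII.

```typescript
// Hypothetical helper relating the README's size figures to each other.
function compressionStats(originalBytes: number, encodedBytes: number) {
  const ratio = (encodedBytes / originalBytes) * 100;
  return {
    bytesSaved: originalBytes - encodedBytes,
    ratioPercent: Number(ratio.toFixed(2)),         // encoded size as % of original
    savedPercent: Number((100 - ratio).toFixed(2)), // reduction as % of original
  };
}

const phrase =
  "stock ridge avoid school honey trap wait wheel worry face differ wedding";

// ASCII only, so string length equals UTF-8 byte length (72).
const stats = compressionStats(phrase.length, 34);
console.log(stats); // { bytesSaved: 38, ratioPercent: 47.22, savedPercent: 52.78 }
```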
183
+
184
+ ## CLI — Build Dictionaries
185
+
186
+ ```bash
187
+ # Interactive
188
+ npx wordbin build
189
+
190
+ # Specific version
191
+ npx wordbin build --version 1 # BIP-39 (2048 words, already bundled)
192
+ npx wordbin build --version 2 # dwyl/english-words (~466k words)
193
+ npx wordbin build --all # Build v1 and v2
194
+
195
+ # Custom wordlist
196
+ npx wordbin build --custom ./mywords.txt
197
+ npx wordbin build --custom https://example.com/words.txt
198
+ ```
199
+
200
+ Output goes to `./data/`. Load with `loadDictionaryByVersion()` or pass directly to `new WordBin(dict)`.
201
+
202
+ ---
203
+
204
+ ## Dictionary Versions
205
+
206
+ | Version | Words | Source | Best for | Bundled |
207
+ | ---------- | -------- | ------------------ | --------------------------------------- | -------------- |
208
+ | **v1** | 2,048 | BIP-39 English | Crypto seeds, maximum reliability | ✅ Yes |
209
+ | **v2** | ~466,550 | dwyl/english-words | General English, tags, large vocabulary | Build required |
210
+ | **Custom** | Any | Your wordlist | Domain-specific vocabulary | Build required |
211
+
212
+ ---
213
+
214
+ ## Advanced Usage
215
+
216
+ ### Load a specific dictionary version
217
+
218
+ ```ts
219
+ import { loadDictionaryByVersion, WordBin } from "@bigdreamsweb3/wordbin";
220
+
221
+ const dict = await loadDictionaryByVersion(1);
222
+ const wb = new WordBin(dict);
223
+ ```
224
+
225
+ ### Encode with a specific dictionary version
226
+
227
+ ```ts
228
+ const encoded = await wb.encode("abandon ability able", { dictVersion: 1 });
229
+ ```
230
+
231
+ ### Encode from an existing EncodeResult or raw bytes
232
+
233
+ ```ts
234
+ // Re-encode a previous result (e.g. to switch dictionary version)
235
+ const reEncoded = await wb.encode(encoded);
236
+
237
+ // Encode raw bytes
238
+ const fromBytes = await wb.encode(new Uint8Array([1, 2, 3]));
239
+ ```
240
+
241
+ ### Build a WordBin instance from a custom wordlist
242
+
243
+ ```ts
244
+ const wb = await WordBin.createFromWords(["apple", "banana", "cherry" /* , ... */]);
245
+ ```
246
+
247
+ ---
248
+
249
+ ---
250
+
251
+ ## For Contributors — Clone & Run Locally
252
+
253
+ > This section is for people developing WordBin itself.
254
+ > If you're just using the package, `npm install @bigdreamsweb3/wordbin` is all you need.
255
+
256
+ ### Prerequisites
257
+
258
+ - Node.js ≥ 18
259
+ - npm ≥ 9
260
+
261
+ ### Clone and set up
262
+
263
+ ```bash
264
+ git clone https://github.com/bigdreamsweb3/wordbin.git
265
+ cd wordbin
266
+ npm install
267
+ ```
268
+
269
+ ### Project layout
270
+
271
+ ```
272
+ wordbin/
273
+ ├── src/
274
+ │ ├── core/ # WordBin class — encode, decode, format detection
275
+ │ ├── dict/ # Dictionary loader, builder, and bundled data
276
+ │ ├── utils/ # Buffer helpers (toHex, toBase64, varint, utf8)
277
+ │ └── constants.ts # LITERAL token and other shared constants
278
+ ├── test/
279
+ │ └── test.spec.ts # Vitest suite — encode, decode, round-trip, non-WordBin
280
+ ├── data/ # Built dictionary files (generated, not committed)
281
+ └── dist/ # Compiled output (generated on build)
282
+ ```
283
+
284
+ ### Build
285
+
286
+ ```bash
287
+ npm run build # compile TypeScript → dist/
288
+ ```
289
+
290
+ ### Run the tests
291
+
292
+ ```bash
293
+ npm test # run all test suites once
294
+ npm run test:watch # watch mode — re-runs on file save
295
+ ```
296
+
297
+ Each suite can be run independently — either flip a flag in `test/test.spec.ts`:
298
+
299
+ ```ts
300
+ const RUN = {
301
+ ENCODE_ONLY: true, // set false to skip
302
+ DECODE_ONLY: true,
303
+ ENCODE_THEN_DECODE: true,
304
+ NON_WORDBIN_DECODE: true,
305
+ };
306
+ ```
307
+
308
+ Or target a suite by name from the CLI:
309
+
310
+ ```bash
311
+ npx vitest -t "Encode only"
312
+ npx vitest -t "Decode only"
313
+ npx vitest -t "Encode then decode"
314
+ npx vitest -t "Non-WordBin decode"
315
+ ```
316
+
317
+ ### Build dictionaries locally
318
+
319
+ The v1 BIP-39 dictionary is bundled with the package. To build additional versions:
320
+
321
+ ```bash
322
+ npx wordbin build --version 2 # large English (~466k words) → ./data/
323
+ npx wordbin build --all # v1 + v2
324
+ ```
325
+
326
+ The `./data/` directory is gitignored. Built dictionaries are loaded automatically by `loadLatestDictionary()` at runtime.
327
+
328
+ ### Making changes
329
+
330
+ The most common contribution points:
331
+
332
+ | What you want to change | Where to look |
333
+ | ------------------------------------------ | -------------------------------------------------------- |
334
+ | Encode / decode logic | `src/core/wordbin.ts` |
335
+ | Format detection (hex/base58/base64/bin21) | `detectAndConvert()` in `wordbin.ts` |
336
+ | Dictionary loading / versioning | `src/dict/dictionary-loader.ts` |
337
+ | Dictionary building from a wordlist | `src/dict/builder.ts` |
338
+ | Buffer utilities (varint, hex, utf8) | `src/utils/buffer.ts` |
339
+ | Add a new payload format | `PayloadFormat` type + `detectAndConvert()` + `encode()` |
340
+
341
+ ### Submitting a pull request
342
+
343
+ 1. Fork the repo and create a branch: `git checkout -b my-feature`
344
+ 2. Make your changes and ensure all tests pass: `npm test`
345
+ 3. Add or update tests for any new behaviour in `test/test.spec.ts`
346
+ 4. Open a pull request with a clear description of what changed and why
347
+
348
+ For larger changes — new dictionary versions, new payload formats, architectural changes — please open an issue first to discuss the approach.
349
+
350
+ ---
351
+
352
+ ## License
353
+
354
+ MIT — see [LICENSE](./LICENSE) for details.
355
+
356
+ ---
357
+
358
+ ## Contributing
359
+
360
+ See [CONTRIBUTING.md](./CONTRIBUTING.md) to add dictionary versions, improve compression, or fix bugs.
361
+
362
+ ---
363
+
364
+ Enjoy the tiny payloads. We up 🚀
@@ -35,16 +35,6 @@ function toBase64(bytes) {
35
35
  }
36
36
  return Buffer.from(bytes).toString("base64");
37
37
  }
38
- function fromBase64(base64) {
39
- const at = globalThis.atob;
40
- if (typeof at === "function") {
41
- const binary = at(base64);
42
- const out = new Uint8Array(binary.length);
43
- for (let i = 0; i < binary.length; i++) out[i] = binary.charCodeAt(i);
44
- return out;
45
- }
46
- return new Uint8Array(Buffer.from(base64, "base64"));
47
- }
48
38
  function utf8Encode(str) {
49
39
  if (typeof TextEncoder !== "undefined") return new TextEncoder().encode(str);
50
40
  return new Uint8Array(Buffer.from(str, "utf8"));
@@ -85,10 +75,19 @@ async function buildDictionary(words, options = {}) {
85
75
  const normalizedWords = words.map((w) => w.trim().toLowerCase()).filter((w) => w);
86
76
  await Promise.all(
87
77
  normalizedWords.map(async (word) => {
88
- const id = await generateWordId(word);
89
- const key = toHex(id);
90
- if (!map[key]) map[key] = [];
91
- map[key].push(word);
78
+ let attempt = 0;
79
+ let key;
80
+ while (true) {
81
+ const id = await generateWordId(
82
+ attempt === 0 ? word : `${word}:${attempt}`
83
+ );
84
+ key = toHex(id);
85
+ if (!map[key]) {
86
+ map[key] = [word];
87
+ break;
88
+ }
89
+ attempt++;
90
+ }
92
91
  })
93
92
  );
94
93
  Object.values(map).forEach((collisions) => {
@@ -101,14 +100,12 @@ async function buildDictionary(words, options = {}) {
101
100
  };
102
101
  }
103
102
  export {
104
- toHex as a,
103
+ utf8Decode as a,
105
104
  buildDictionary as b,
106
- utf8Decode as c,
105
+ toHex as c,
107
106
  decodeVarint as d,
108
107
  encodeVarint as e,
109
- fromBase64 as f,
110
- generateWordId as g,
111
108
  toBase64 as t,
112
109
  utf8Encode as u
113
110
  };
114
- //# sourceMappingURL=dictionary-D3gr2Ala.js.map
111
+ //# sourceMappingURL=builder-vFphFQMU.js.map
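Reviewer note on the `buildDictionary` hunk above: 1.0.8 replaces "push colliding words into one bucket" with "re-hash a salted input (`word:attempt`) until a free ID is found". The sketch below reproduces that retry loop in a sequential form. `shortId` and `assignIds` are hypothetical names, and the 1-byte ID is deliberately tiny so collisions actually occur; the package uses 2 to 4 bytes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 prefix as a hex ID (1 byte here to force collisions).
function shortId(input: string, bytes = 1): string {
  return createHash("sha256").update(input, "utf8").digest("hex").slice(0, bytes * 2);
}

// Sequential version of the retry loop: the first word to claim an ID keeps it,
// later colliding words are re-hashed as "word:1", "word:2", ...
function assignIds(words: string[]): Map<string, string> {
  const idToWord = new Map<string, string>();
  for (const word of words) {
    let attempt = 0;
    while (true) {
      const key = shortId(attempt === 0 ? word : `${word}:${attempt}`);
      if (!idToWord.has(key)) {
        idToWord.set(key, word);
        break;
      }
      attempt++;
    }
  }
  return idToWord;
}

// 200 distinct words into a 256-slot ID space: collisions are certain to be retried.
const sampleWords: string[] = [];
for (let i = 0; i < 200; i++) sampleWords.push(`w${i}`);

const ids = assignIds(sampleWords);
console.log(ids.size); // 200: every word ends up with its own ID
```

Two points may be worth checking in review. The bundled version runs this loop inside `Promise.all`, so which word keeps the unsalted ID could depend on the order in which the hash promises resolve; a sequential loop like the one above fixes the assignment order to the input order. Also, once an ID may come from a salted hash, an encoder can no longer recompute a word's ID from the word alone and presumably has to look it up in the built dictionary.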
@@ -0,0 +1 @@
1
+ {"version":3,"file":"builder-vFphFQMU.js","sources":["../src/core/tiers.ts","../src/core/id.ts","../src/utils/buffer.ts","../src/dict/builder.ts"],"sourcesContent":["export function getIdByteLength(wordLength: number): number {\r\n if (wordLength <= 4) return 2;\r\n if (wordLength <= 9) return 3;\r\n return 4;\r\n}\r\n\r\nexport function getWrapByteLength(wordLength: number): number {\r\n if (wordLength <= 4) return 2;\r\n if (wordLength <= 9) return 3;\r\n return 4;\r\n}\r\n\r\nexport async function getTextEncoder(): Promise<TextEncoder> {\r\n if (typeof TextEncoder !== \"undefined\") return new TextEncoder();\r\n const { TextEncoder: NodeTextEncoder } = await import(\"node:util\");\r\n // @ts-ignore Node typings\r\n return new NodeTextEncoder();\r\n}\r\n\r\nexport async function wrapBase64(data: string): Promise<Uint8Array> {\r\n const normalized = data.trim().toLowerCase();\r\n if (!normalized) throw new Error(\"Cannot generate ID for empty string\");\r\n\r\n const encoder = await getTextEncoder();\r\n const result = encoder.encode(normalized);\r\n\r\n // Browser + Node compatible SHA-256\r\n let hash: ArrayBuffer;\r\n const anyCrypto: any = (globalThis as any).crypto;\r\n if (anyCrypto && anyCrypto.subtle) {\r\n hash = await anyCrypto.subtle.digest(\"SHA-256\", result);\r\n } else {\r\n const { createHash } = await import(\"node:crypto\");\r\n hash = createHash(\"sha256\").update(Buffer.from(result)).digest().buffer;\r\n }\r\n\r\n const hashBytes = new Uint8Array(hash);\r\n const size = getWrapByteLength(normalized.length);\r\n return hashBytes.slice(0, size);\r\n}\r\n","import { getIdByteLength } from './tiers.js'\r\n\r\n/**\r\n * Deterministic word \t ID generator\r\n * Same output on browser and node (when using compatible input)\r\n */\r\nexport async function generateWordId(word: string): Promise<Uint8Array> {\r\n const normalized = word.trim().toLowerCase()\r\n if (!normalized) throw new Error('Cannot generate ID for empty string')\r\n\r\n const encoder 
= await getTextEncoder()\r\n const data = encoder.encode(normalized)\r\n\r\n // Browser + Node compatible SHA-256\r\n let hash: ArrayBuffer\r\n const anyCrypto: any = (globalThis as any).crypto\r\n if (anyCrypto && anyCrypto.subtle) {\r\n hash = await anyCrypto.subtle.digest('SHA-256', data)\r\n } else {\r\n const { createHash } = await import('node:crypto')\r\n hash = createHash('sha256').update(Buffer.from(data)).digest().buffer\r\n }\r\n\r\n const hashBytes = new Uint8Array(hash)\r\n const size = getIdByteLength(normalized.length)\r\n return hashBytes.slice(0, size)\r\n}\r\n\r\nasync function getTextEncoder(): Promise<TextEncoder> {\r\n if (typeof TextEncoder !== 'undefined') return new TextEncoder()\r\n const { TextEncoder: NodeTextEncoder } = await import('node:util')\r\n // @ts-ignore Node typings\r\n return new NodeTextEncoder()\r\n}\r\n","export function toHex(bytes: Uint8Array): string {\r\n return Array.from(bytes)\r\n .map((b) => b.toString(16).padStart(2, '0'))\r\n .join('')\r\n}\r\n\r\nexport function fromHex(hex: string): Uint8Array {\r\n if (hex.length % 2 !== 0) throw new Error('Invalid hex string length')\r\n const bytes = new Uint8Array(hex.length / 2)\r\n for (let i = 0; i < hex.length; i += 2) {\r\n bytes[i / 2] = parseInt(hex.slice(i, i + 2), 16)\r\n }\r\n return bytes\r\n}\r\n\r\nexport function toBase64(bytes: Uint8Array): string {\r\n const b64 = (globalThis as any).btoa\r\n if (typeof b64 === 'function') {\r\n return b64(String.fromCharCode(...bytes))\r\n }\r\n // Node fallback\r\n return Buffer.from(bytes).toString('base64')\r\n}\r\n\r\nexport function fromBase64(base64: string): Uint8Array {\r\n const at = (globalThis as any).atob\r\n if (typeof at === 'function') {\r\n const binary = at(base64)\r\n const out = new Uint8Array(binary.length)\r\n for (let i = 0; i < binary.length; i++) out[i] = binary.charCodeAt(i)\r\n return out\r\n }\r\n // Node fallback\r\n return new Uint8Array(Buffer.from(base64, 'base64'))\r\n}\r\n\r\n// UTF-8 
helpers\r\nexport function utf8Encode(str: string): Uint8Array {\r\n if (typeof TextEncoder !== 'undefined') return new TextEncoder().encode(str)\r\n // Node fallback\r\n return new Uint8Array(Buffer.from(str, 'utf8'))\r\n}\r\n\r\nexport function utf8Decode(bytes: Uint8Array): string {\r\n if (typeof TextDecoder !== 'undefined') return new TextDecoder().decode(bytes)\r\n // Node fallback\r\n return Buffer.from(bytes).toString('utf8')\r\n}\r\n\r\n// Varint (LEB128 7-bit groups) helpers\r\nexport function encodeVarint(n: number): Uint8Array {\r\n if (n < 0) throw new Error('Varint cannot encode negative numbers')\r\n const out: number[] = []\r\n do {\r\n let byte = n & 0x7f\r\n n >>>= 7\r\n if (n !== 0) byte |= 0x80\r\n out.push(byte)\r\n } while (n !== 0)\r\n return new Uint8Array(out)\r\n}\r\n\r\nexport function decodeVarint(bytes: Uint8Array, offset: number): { value: number; bytesRead: number } {\r\n let result = 0\r\n let shift = 0\r\n let pos = offset\r\n while (pos < bytes.length) {\r\n const byte = bytes[pos++]\r\n result |= (byte & 0x7f) << shift\r\n if ((byte & 0x80) === 0) {\r\n return { value: result, bytesRead: pos - offset }\r\n }\r\n shift += 7\r\n if (shift > 35) throw new Error('Varint too large')\r\n }\r\n throw new Error('Truncated varint')\r\n}\r\n","// File: src\\dict\\builder.ts\r\n\r\nimport type { WordBinDictionary } from \"../types\";\r\nimport { generateWordId } from \"../core/id.js\";\r\nimport { toHex } from \"../utils/buffer.js\";\r\n\r\nexport interface BuildDictionaryOptions {\r\n /**\r\n * Dictionary version number (used in header and for format compatibility)\r\n * @default 1\r\n */\r\n version?: number;\r\n\r\n /**\r\n * Human-readable description of this dictionary\r\n * @default \"WordBin dictionary v${version}\"\r\n */\r\n description?: string;\r\n\r\n /**\r\n * Optional: custom prefix or identifier for this dictionary build\r\n * (can be used in logs, filenames, etc.)\r\n */\r\n name?: string;\r\n}\r\n\r\nexport async function 
buildDictionary(\r\n words: string[],\r\n options: BuildDictionaryOptions = {},\r\n): Promise<WordBinDictionary> {\r\n const { version = 1, description = `WordBin dictionary v${version}` } =\r\n options;\r\n\r\n const map: Record<string, string[]> = {};\r\n\r\n const normalizedWords = words\r\n .map((w) => w.trim().toLowerCase())\r\n .filter((w) => w);\r\n\r\n await Promise.all(\r\n normalizedWords.map(async (word) => {\r\n let attempt = 0;\r\n let key: string;\r\n\r\n while (true) {\r\n const id = await generateWordId(\r\n attempt === 0 ? word : `${word}:${attempt}`,\r\n );\r\n\r\n key = toHex(id);\r\n\r\n // If no collision, break\r\n if (!map[key]) {\r\n map[key] = [word];\r\n break;\r\n }\r\n\r\n // Collision detected → try again\r\n attempt++;\r\n }\r\n }),\r\n );\r\n\r\n Object.values(map).forEach((collisions) => {\r\n collisions.sort((a, b) => a.localeCompare(b));\r\n });\r\n\r\n return {\r\n version,\r\n description,\r\n words: map,\r\n };\r\n}\r\n"],"names":[],"mappings":"AAAO,SAAS,gBAAgB,YAA4B;AAC1D,MAAI,cAAc,EAAG,QAAO;AAC5B,MAAI,cAAc,EAAG,QAAO;AAC5B,SAAO;AACT;ACEA,eAAsB,eAAe,MAAmC;AACtE,QAAM,aAAa,KAAK,KAAA,EAAO,YAAA;AAC/B,MAAI,CAAC,WAAY,OAAM,IAAI,MAAM,qCAAqC;AAEtE,QAAM,UAAU,MAAM,eAAA;AACtB,QAAM,OAAO,QAAQ,OAAO,UAAU;AAGtC,MAAI;AACJ,QAAM,YAAkB,WAAmB;AAC3C,MAAI,aAAa,UAAU,QAAQ;AACjC,WAAO,MAAM,UAAU,OAAO,OAAO,WAAW,IAAI;AAAA,EACtD,OAAO;AACL,UAAM,EAAE,WAAA,IAAe,MAAM,OAAO,aAAa;AACjD,WAAO,WAAW,QAAQ,EAAE,OAAO,OAAO,KAAK,IAAI,CAAC,EAAE,OAAA,EAAS;AAAA,EACjE;AAEA,QAAM,YAAY,IAAI,WAAW,IAAI;AACrC,QAAM,OAAO,gBAAgB,WAAW,MAAM;AAC9C,SAAO,UAAU,MAAM,GAAG,IAAI;AAChC;AAEA,eAAe,iBAAuC;AACpD,MAAI,OAAO,gBAAgB,YAAa,QAAO,IAAI,YAAA;AACnD,QAAM,EAAE,aAAa,oBAAoB,MAAM,OAAO,WAAW;AAEjE,SAAO,IAAI,gBAAA;AACb;ACjCO,SAAS,MAAM,OAA2B;AAC/C,SAAO,MAAM,KAAK,KAAK,EACpB,IAAI,CAAC,MAAM,EAAE,SAAS,EAAE,EAAE,SAAS,GAAG,GAAG,CAAC,EAC1C,KAAK,EAAE;AACZ;AAWO,SAAS,SAAS,OAA2B;AAClD,QAAM,MAAO,WAAmB;AAChC,MAAI,OAAO,QAAQ,YAAY;AAC7B,WAAO,IAAI,OAAO,aAAa,GAAG,KAAK,CAAC;AAAA,EAC1C;AAEA,SAAO,OAAO,KAAK,KAAK,EAAE,SAAS,QAAQ;A
AC7C;AAeO,SAAS,WAAW,KAAyB;AAClD,MAAI,OAAO,gBAAgB,YAAa,QAAO,IAAI,YAAA,EAAc,OAAO,GAAG;AAE3E,SAAO,IAAI,WAAW,OAAO,KAAK,KAAK,MAAM,CAAC;AAChD;AAEO,SAAS,WAAW,OAA2B;AACpD,MAAI,OAAO,gBAAgB,YAAa,QAAO,IAAI,YAAA,EAAc,OAAO,KAAK;AAE7E,SAAO,OAAO,KAAK,KAAK,EAAE,SAAS,MAAM;AAC3C;AAGO,SAAS,aAAa,GAAuB;AAClD,MAAI,IAAI,EAAG,OAAM,IAAI,MAAM,uCAAuC;AAClE,QAAM,MAAgB,CAAA;AACtB,KAAG;AACD,QAAI,OAAO,IAAI;AACf,WAAO;AACP,QAAI,MAAM,EAAG,SAAQ;AACrB,QAAI,KAAK,IAAI;AAAA,EACf,SAAS,MAAM;AACf,SAAO,IAAI,WAAW,GAAG;AAC3B;AAEO,SAAS,aAAa,OAAmB,QAAsD;AACpG,MAAI,SAAS;AACb,MAAI,QAAQ;AACZ,MAAI,MAAM;AACV,SAAO,MAAM,MAAM,QAAQ;AACzB,UAAM,OAAO,MAAM,KAAK;AACxB,eAAW,OAAO,QAAS;AAC3B,SAAK,OAAO,SAAU,GAAG;AACvB,aAAO,EAAE,OAAO,QAAQ,WAAW,MAAM,OAAA;AAAA,IAC3C;AACA,aAAS;AACT,QAAI,QAAQ,GAAI,OAAM,IAAI,MAAM,kBAAkB;AAAA,EACpD;AACA,QAAM,IAAI,MAAM,kBAAkB;AACpC;AClDA,eAAsB,gBACpB,OACA,UAAkC,IACN;AAC5B,QAAM,EAAE,UAAU,GAAG,cAAc,uBAAuB,OAAO,OAC/D;AAEF,QAAM,MAAgC,CAAA;AAEtC,QAAM,kBAAkB,MACrB,IAAI,CAAC,MAAM,EAAE,KAAA,EAAO,YAAA,CAAa,EACjC,OAAO,CAAC,MAAM,CAAC;AAElB,QAAM,QAAQ;AAAA,IACZ,gBAAgB,IAAI,OAAO,SAAS;AAClC,UAAI,UAAU;AACd,UAAI;AAEJ,aAAO,MAAM;AACX,cAAM,KAAK,MAAM;AAAA,UACf,YAAY,IAAI,OAAO,GAAG,IAAI,IAAI,OAAO;AAAA,QAAA;AAG3C,cAAM,MAAM,EAAE;AAGd,YAAI,CAAC,IAAI,GAAG,GAAG;AACb,cAAI,GAAG,IAAI,CAAC,IAAI;AAChB;AAAA,QACF;AAGA;AAAA,MACF;AAAA,IACF,CAAC;AAAA,EAAA;AAGH,SAAO,OAAO,GAAG,EAAE,QAAQ,CAAC,eAAe;AACzC,eAAW,KAAK,CAAC,GAAG,MAAM,EAAE,cAAc,CAAC,CAAC;AAAA,EAC9C,CAAC;AAED,SAAO;AAAA,IACL;AAAA,IACA;AAAA,IACA,OAAO;AAAA,EAAA;AAEX;"}
package/dist/cli.mjs CHANGED
@@ -4,7 +4,7 @@ import { resolve } from "node:path";
4
4
  import { createInterface } from "node:readline";
5
5
  import { stdout, stdin } from "node:process";
6
6
  import { wordlists } from "bip39";
7
- import { b as buildDictionary } from "./dictionary-D3gr2Ala.js";
7
+ import { b as buildDictionary } from "./builder-vFphFQMU.js";
8
8
  const rl = createInterface({ input: stdin, output: stdout });
9
9
  const help = `
10
10
  WordBin CLI – Dictionary Builder
@@ -0,0 +1,6 @@
1
+ export declare function toHexString(bytes: Uint8Array): string;
2
+ export declare function fromHexString(hex: string): Uint8Array | null;
3
+ export declare function generateRandomData(length?: number): Uint8Array;
4
+ export declare function copyToClipboard(text: string): Promise<void>;
5
+ export declare function modernToHex(bytes: Uint8Array): string;
6
+ export declare function modernFromHex(hex: string): Uint8Array | null;
@@ -0,0 +1,9 @@
1
+ export declare class SimpleLatinShortener {
2
+ private manualMap;
3
+ private reverseMap;
4
+ constructor();
5
+ shorten(str: string): string;
6
+ restore(str: string): string;
7
+ utf8Bytes(str: string): number;
8
+ test(input: string): void;
9
+ }
@@ -0,0 +1,2 @@
1
+ export declare function fullEncode(input: string): Uint8Array;
2
+ export declare function fullDecode(data: Uint8Array): string;