bpe-lite 0.4.0 → 0.4.2

Files changed (2)
  1. package/README.md +14 -14
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -64,29 +64,29 @@ Vocab files are bundled in the package — no network required at runtime or ins
 
 ## Performance
 
-Benchmarked on Node v24, win32/x64, against [js-tiktoken](https://github.com/dqbd/tiktoken) and [ai-tokenizer](https://github.com/nicepkg/ai-tokenizer) (`node --expose-gc scripts/bench.js`).
+Benchmarked on Node v24 (win32/x64). Benchmark command: `node --expose-gc scripts/bench.js`.
 
 **OpenAI cl100k — large text (~54 KB)**
 
-| impl | ops/s | MB/s |
-|------|------:|-----:|
-| bpe-lite | **240** | **12.7** |
-| ai-tokenizer | 234 | 12.4 |
-| js-tiktoken | 26 | 1.4 |
+| impl | ops/s | tokens/s | MB/s |
+|------|------:|---------:|-----:|
+| bpe-lite | **257** | **3.15M** | **13.6** |
+| ai-tokenizer | 201 | 2.46M | 10.7 |
+| js-tiktoken | 23 | 282k | 1.2 |
 
 **Anthropic — large text (~54 KB)**
 
-| impl | ops/s | MB/s |
-|------|------:|-----:|
-| bpe-lite | **276** | **14.6** |
-| ai-tokenizer | 258 | 13.7 |
+| impl | ops/s | tokens/s | MB/s |
+|------|------:|---------:|-----:|
+| bpe-lite | **257** | 3.15M | **13.6** |
+| ai-tokenizer | 253 | **4.62M** | 13.4 |
 
 **Gemini — large text (8 KB)**
 
-| impl | ops/s | MB/s | note |
-|------|------:|-----:|------|
-| bpe-lite | **6,640** | **51.9** | actual Gemma3 SPM |
-| ai-tokenizer | 1,300 | 10.2 | o200k BPE — different algorithm, different results |
+| impl | ops/s | tokens/s | MB/s | note |
+|------|------:|---------:|-----:|------|
+| bpe-lite | **3,800** | **6.23M** | **29.7** | actual Gemma3 SPM |
+| ai-tokenizer | 1,220 | 2.01M | 9.6 | o200k BPE — different algorithm, different results |
 
 ai-tokenizer does not implement Gemini tokenization. The row above uses their o200k encoding on the same input string; it produces different token ids and counts than the Gemini tokenizer, so it is not a real comparison.
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "bpe-lite",
-  "version": "0.4.0",
+  "version": "0.4.2",
   "description": "Offline BPE tokenizer for OpenAI, Anthropic, and Gemini — zero dependencies",
   "main": "src/index.js",
   "types": "index.d.ts",