@nxtedition/shared 3.0.1 → 3.0.2
- package/README.md +69 -0
- package/lib/bench-worker.d.mts +1 -0
- package/lib/bench.d.mts +1 -0
- package/package.json +3 -2
package/README.md
CHANGED
@@ -157,6 +157,75 @@ Non-blocking write attempt. Returns `false` if the buffer is full. The `fn` call
 
 Batches multiple writes within the callback. The write pointer is only published to the reader when `cork` returns, reducing atomic operation overhead.
 
+## Benchmarks
+
+Measured on Apple M3 Pro (3.51 GHz), Node.js 25.6.1, 8 MiB ring buffer.
+
+Each benchmark writes batches of fixed-size messages from the main thread and
+reads them in a worker thread. The shared ring buffer is compared against
+Node.js `postMessage` (structured clone). Hardware performance counters were
+collected with [`@mitata/counters`](https://github.com/evanwashere/mitata).
+
+### Throughput
+
+| Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+| 64 B | **1.07 GiB/s** | 793 MiB/s | 93 MiB/s | 117 MiB/s |
+| 256 B | **2.98 GiB/s** | 2.56 GiB/s | 259 MiB/s | 391 MiB/s |
+| 1 KiB | 4.65 GiB/s | **7.52 GiB/s** | 1.24 GiB/s | 1.68 GiB/s |
+| 4 KiB | 4.94 GiB/s | **16.38 GiB/s** | 3.77 GiB/s | 4.84 GiB/s |
+| 16 KiB | 5.25 GiB/s | **22.33 GiB/s** | 8.54 GiB/s | 9.65 GiB/s |
+| 64 KiB | 5.53 GiB/s | **19.86 GiB/s** | 10.94 GiB/s | 12.25 GiB/s |
+
+### Message rate
+
+| Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+| 64 B | **17.99 M/s** | 12.99 M/s | 1.53 M/s | 1.92 M/s |
+| 256 B | **12.50 M/s** | 10.73 M/s | 1.06 M/s | 1.60 M/s |
+| 1 KiB | 4.87 M/s | **7.88 M/s** | 1.30 M/s | 1.76 M/s |
+| 4 KiB | 1.30 M/s | **4.29 M/s** | 989 K/s | 1.27 M/s |
+| 16 KiB | 344 K/s | **1.46 M/s** | 560 K/s | 632 K/s |
+| 64 KiB | 91 K/s | **325 K/s** | 179 K/s | 201 K/s |
+
+### CPU efficiency (instructions per cycle)
+
+| Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+| 64 B | 4.80 | 5.79 | 3.91 | 3.37 |
+| 256 B | 4.46 | 5.98 | 3.48 | 3.06 |
+| 1 KiB | 4.17 | **6.29** | 3.63 | 3.15 |
+| 4 KiB | 3.75 | **6.72** | 3.38 | 2.83 |
+| 16 KiB | 3.80 | **6.03** | 2.74 | 2.86 |
+| 64 KiB | 3.96 | **4.57** | 2.43 | 2.93 |
+
+### Key findings
+
+- **Small messages (64-256 B):** The shared ring buffer with `Buffer.copy` delivers
+up to **12x higher message rate** and **9x higher throughput** than `postMessage`.
+Per-message overhead dominates at these sizes, and avoiding structured cloning makes
+the biggest difference.
+
+- **Large messages (1-64 KiB):** The shared ring buffer with string encoding
+(`Buffer.write`) reaches up to **22 GiB/s** — roughly **2-4x faster** than
+`postMessage`. V8's ASCII fast path for UTF-8 encoding is heavily vectorized
+(6-7 IPC on Apple M3 Pro), which explains why string writes outperform raw
+`Buffer.copy` at larger sizes.
+
+- **CPU efficiency:** The shared ring buffer consistently achieves higher IPC
+(4-7) compared to `postMessage` (2-4), indicating less time spent stalled on
+memory or synchronization.
+
+- **Caveat:** The string benchmark uses ASCII-only content. Multi-byte UTF-8
+strings will not hit V8's vectorized fast path and will be significantly slower.
+
+### Running the benchmark
+
+```sh
+# Hardware counters require elevated privileges on macOS
+sudo node --allow-natives-syntax packages/shared/src/bench.mjs
+```
+
 ## License
 
 MIT
package/lib/bench-worker.d.mts
ADDED
@@ -0,0 +1 @@
+export {};
package/lib/bench.d.mts
ADDED
@@ -0,0 +1 @@
+export {};
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@nxtedition/shared",
-"version": "3.0.1",
+"version": "3.0.2",
 "type": "module",
 "main": "lib/index.js",
 "types": "lib/index.d.ts",
@@ -26,5 +26,6 @@
 "oxlint-tsgolint": "^0.13.0",
 "rimraf": "^6.1.3",
 "typescript": "^5.9.3"
-}
+},
+"gitHead": "3648df9e97a19a6ebdf497afb1845a01b5301460"
 }
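Review note: as a quick consistency check on the benchmark tables added to the README in this diff, throughput should equal message rate times message size, and the speedup figures quoted under "Key findings" can be recomputed from the 64 B rows. The sketch below does that arithmetic in plain JavaScript. The numbers are copied from the tables; the helper names `toGiBps` and `speedup` are illustrative assumptions and are not part of `@nxtedition/shared`.

```javascript
// Values copied from the 64 B rows of the README tables in this diff.
const SIZE_BYTES = 64
const rate = {
  // messages per second
  sharedBuffer: 17.99e6,
  sharedString: 12.99e6,
  postMessageBuffer: 1.53e6,
  postMessageString: 1.92e6,
}

// Hypothetical helper: convert a message rate to GiB/s for a fixed message size.
function toGiBps(messagesPerSecond, sizeBytes) {
  return (messagesPerSecond * sizeBytes) / 2 ** 30
}

// Hypothetical helper: ratio of two rates, rounded to one decimal place.
function speedup(fast, slow) {
  return Math.round((fast / slow) * 10) / 10
}

const sharedGiBps = toGiBps(rate.sharedBuffer, SIZE_BYTES)
console.log(sharedGiBps.toFixed(2)) // throughput implied by the message rate
console.log(speedup(rate.sharedBuffer, rate.postMessageBuffer)) // vs postMessage (buffer)
console.log(speedup(rate.sharedBuffer, rate.postMessageString)) // vs postMessage (string)
```

With these inputs the 64 B shared (buffer) rate works out to about 1.07 GiB/s, which agrees with the throughput table, and the ratio against `postMessage` comes to roughly 11.8x for the buffer baseline and 9.4x for the string baseline. Because the message size is fixed, the message-rate and throughput ratios are identical for a given baseline, so the "12x" and "9x" figures in Key findings may simply correspond to these two baselines.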