@nxtedition/shared 3.0.1 → 3.0.2 (npm)

Files changed (4)

package/README.md CHANGED

@@ -157,6 +157,75 @@ Non-blocking write attempt. Returns `false` if the buffer is full. The `fn` call
 Batches multiple writes within the callback. The write pointer is only published to the reader when `cork` returns, reducing atomic operation overhead.
+## Benchmarks
+Measured on Apple M3 Pro (3.51 GHz), Node.js 25.6.1, 8 MiB ring buffer.
+Each benchmark writes batches of fixed-size messages from the main thread and
+reads them in a worker thread. The shared ring buffer is compared against
+Node.js `postMessage` (structured clone). Hardware performance counters were
+collected with [`@mitata/counters`](https://github.com/evanwashere/mitata).
+### Throughput
+|   Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+|   64 B |  **1.07 GiB/s** |       793 MiB/s |             93 MiB/s |            117 MiB/s |
+|  256 B |  **2.98 GiB/s** |      2.56 GiB/s |            259 MiB/s |            391 MiB/s |
+|  1 KiB |      4.65 GiB/s |  **7.52 GiB/s** |           1.24 GiB/s |           1.68 GiB/s |
+|  4 KiB |      4.94 GiB/s | **16.38 GiB/s** |           3.77 GiB/s |           4.84 GiB/s |
+| 16 KiB |      5.25 GiB/s | **22.33 GiB/s** |           8.54 GiB/s |           9.65 GiB/s |
+| 64 KiB |      5.53 GiB/s | **19.86 GiB/s** |          10.94 GiB/s |          12.25 GiB/s |
+### Message rate
+|   Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+|   64 B |   **17.99 M/s** |       12.99 M/s |             1.53 M/s |             1.92 M/s |
+|  256 B |   **12.50 M/s** |       10.73 M/s |             1.06 M/s |             1.60 M/s |
+|  1 KiB |        4.87 M/s |    **7.88 M/s** |             1.30 M/s |             1.76 M/s |
+|  4 KiB |        1.30 M/s |    **4.29 M/s** |              989 K/s |             1.27 M/s |
+| 16 KiB |         344 K/s |    **1.46 M/s** |              560 K/s |              632 K/s |
+| 64 KiB |          91 K/s |     **325 K/s** |              179 K/s |              201 K/s |
+### CPU efficiency (instructions per cycle)
+|   Size | shared (buffer) | shared (string) | postMessage (buffer) | postMessage (string) |
+| -----: | --------------: | --------------: | -------------------: | -------------------: |
+|   64 B |            4.80 |            5.79 |                 3.91 |                 3.37 |
+|  256 B |            4.46 |            5.98 |                 3.48 |                 3.06 |
+|  1 KiB |            4.17 |        **6.29** |                 3.63 |                 3.15 |
+|  4 KiB |            3.75 |        **6.72** |                 3.38 |                 2.83 |
+| 16 KiB |            3.80 |        **6.03** |                 2.74 |                 2.86 |
+| 64 KiB |            3.96 |        **4.57** |                 2.43 |                 2.93 |
+### Key findings
+- **Small messages (64-256 B):** The shared ring buffer with `Buffer.copy` delivers
+  up to **12x higher message rate** and **9x higher throughput** than `postMessage`.
+  Per-message overhead dominates at these sizes, and avoiding structured cloning makes
+  the biggest difference.
+- **Large messages (1-64 KiB):** The shared ring buffer with string encoding
+  (`Buffer.write`) reaches up to **22 GiB/s** — roughly **2-4x faster** than
+  `postMessage`. V8's ASCII fast path for UTF-8 encoding is heavily vectorized
+  (6-7 IPC on Apple M3 Pro), which explains why string writes outperform raw
+  `Buffer.copy` at larger sizes.
+- **CPU efficiency:** The shared ring buffer consistently achieves higher IPC
+  (4-7) compared to `postMessage` (2-4), indicating less time spent stalled on
+  memory or synchronization.
+- **Caveat:** The string benchmark uses ASCII-only content. Multi-byte UTF-8
+  strings will not hit V8's vectorized fast path and will be significantly slower.
+### Running the benchmark
+```sh
+# Hardware counters require elevated privileges on macOS
+sudo node --allow-natives-syntax packages/shared/src/bench.mjs
+```
 ## License
 MIT
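
Reviewer note: the Throughput and Message rate tables added above describe the same runs, so throughput should equal message rate times message size. The sketch below cross-checks a few cells and recomputes the small-message speedups. The figures are copied from the tables; the helper name `throughputGiB` and the choice of rows are illustrative assumptions, not part of the package. Under those assumptions, the "12x" figure in Key findings appears to correspond to the `postMessage (buffer)` column and the "9x" figure to the `postMessage (string)` column.

```javascript
// Cross-check: throughput (GiB/s) should equal message rate * message size.
// All figures are copied from the README tables in this diff.
const GiB = 2 ** 30;

// Convert a message rate (messages/s) and a size (bytes) to GiB/s.
function throughputGiB(messagesPerSecond, sizeBytes) {
  return (messagesPerSecond * sizeBytes) / GiB;
}

const rows = [
  { label: '64 B shared (buffer)', size: 64, rate: 17.99e6, reported: 1.07 },
  { label: '256 B shared (buffer)', size: 256, rate: 12.5e6, reported: 2.98 },
  { label: '16 KiB shared (string)', size: 16 * 1024, rate: 1.46e6, reported: 22.33 },
];

for (const { label, size, rate, reported } of rows) {
  const derived = throughputGiB(rate, size);
  // Small differences are expected because the table values are rounded.
  console.log(`${label}: derived ${derived.toFixed(2)} GiB/s, reported ${reported} GiB/s`);
}

// Message-rate speedup of shared (buffer) at 64 B against each postMessage column.
const speedupVsPostMessageBuffer = 17.99 / 1.53; // about 11.8
const speedupVsPostMessageString = 17.99 / 1.92; // about 9.4
console.log(speedupVsPostMessageBuffer.toFixed(1), speedupVsPostMessageString.toFixed(1));
```

Because throughput and message rate differ only by the constant message size, their ratios against a given `postMessage` column should match at each size.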

package/lib/bench-worker.d.mts ADDED

@@ -0,0 +1 @@
+export {};

package/lib/bench.d.mts ADDED

@@ -0,0 +1 @@
+export {};

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@nxtedition/shared",
-  "version": "3.0.1",
+  "version": "3.0.2",
   "type": "module",
   "main": "lib/index.js",
   "types": "lib/index.d.ts",
@@ -26,5 +26,6 @@
     "oxlint-tsgolint": "^0.13.0",
     "rimraf": "^6.1.3",
     "typescript": "^5.9.3"
-  }
+  },
+  "gitHead": "3648df9e97a19a6ebdf497afb1845a01b5301460"
 }
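
Reviewer note on the README caveat about ASCII-only strings: the sketch below does not measure speed. It only illustrates, with plain Node.js `Buffer` calls, how encoded byte length diverges from string length once a string contains multi-byte characters. The `slot` buffer, the sample strings, and the `isAsciiOnly` helper are hypothetical stand-ins, not part of `@nxtedition/shared`.

```javascript
// Assumption: a plain Buffer stands in for a region of the shared ring buffer.
const slot = Buffer.alloc(64);

const ascii = 'hello world';
const multi = 'h\u00e9llo w\u00f6rld \u2713'; // contains 2-byte and 3-byte characters

// Hypothetical helper: true when every code unit is in the 7-bit ASCII range.
function isAsciiOnly(s) {
  return /^[\x00-\x7f]*$/.test(s);
}

// Buffer.write returns the number of bytes actually written.
const asciiBytes = slot.write(ascii, 0, 'utf8');
const multiBytes = slot.write(multi, 0, 'utf8');

// ASCII: one byte per character, so byte length equals string length.
console.log(isAsciiOnly(ascii), ascii.length, asciiBytes); // true 11 11
// Multi-byte: the two accented letters take 2 bytes each, the check mark takes 3.
console.log(isAsciiOnly(multi), multi.length, multiBytes); // false 13 17
```

A benchmark payload that fails a check like `isAsciiOnly` would fall outside the ASCII-only conditions the README string numbers were measured under.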