omq-lz4 0.2.0 → 0.3.1 (RubyGems)

This diff shows the content changes between two publicly available package versions as released to a supported registry. It is provided for informational purposes only.

Files changed (6)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: ade7fc716054707c2e05bade867b01333708de30df98e3acb524f8da1a7d98d4
-  data.tar.gz: 7ae5fade7f0d88eed7f6f7f8875373a044127098d596b9f6bae5554d585b1e90
+  metadata.gz: b14469227b64cb647b42dbb8495754752ea81654449eab28cae78a0778aa36dd
+  data.tar.gz: 344189a2d0c29ec2d46b310f5607d133bdba87a69fb974be6bb582101f76aacf
 SHA512:
-  metadata.gz: 59afc48b58c8c1efac0973bffa702cac3958b0150b4eb51fe9dec82dacc1c9735446d83e58ac6963f0302e13b26a0890b1eaeff5c0d7d846830c561775a848f4
-  data.tar.gz: 17c66cce9f79f1a375a8614e3aa3cc45071277432521ec96d446daede230b1e6e1a71003b88062345c48550c128e524aa10160d6a06c41e6c7597309e138d5f0
+  metadata.gz: c3ef67d35613434faf125317a61dded584c8d268032eb01830e22a809415ab20b2cf198daa8ee7803d04e88b97d4b39bb6042071a7b18bbf8cff03cb6db06cee
+  data.tar.gz: 49ed574d2b93d2e039160514602cf4f90e125edd9c2b3e05033fbbd5ee1dc1eb098373acaed27c54553e9e6977d6488d3f751b85544ad87a6f0498e5e061cad5
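
The digests above cover the `metadata.gz` and `data.tar.gz` members of the `.gem` archive. A minimal sketch of checking one of them by hand follows; `sha256_matches?` is a hypothetical helper name, and the bytes are assumed to have been read from the extracted archive member already.

```ruby
require "digest"

# Hypothetical helper (not part of RubyGems or omq-lz4): returns true
# when the SHA-256 of `bytes` equals the hex digest listed in
# checksums.yaml.
def sha256_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end
```

For example, `sha256_matches?(File.binread("data.tar.gz"), "344189a2...")` would be called with the full 64-character digest from the listing; the path is an assumption about where the member was extracted.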

data/CHANGELOG.md CHANGED

@@ -1,5 +1,33 @@
 # Changelog
+## [Unreleased]
+## 0.3.1 (2026-05-28)
+### Changed
+- `MIN_COMPRESS_WITH_DICT`: raised from 32 to 128. The previous value
+  was too aggressive; 128 leaves a safer margin above the measured
+  crossover.
+## 0.3.0 (2026-05-11)
+### Added
+- **LZ4M multi-block encoding/decoding** (RFC §5.3a, §5.4a, §5.5 rule 4).
+  Parts larger than `LZ4M_BLOCK_SIZE` (1 GiB) are split into independently
+  decodable blocks, each compressed against the installed dict (if any).
+  `encode_part` / `decode_part` accept a `block_size:` keyword for testing
+  with smaller-than-protocol block sizes.
+- `LZ4M_SENTINEL` (`"LZ4M"`) and `LZ4M_BLOCK_SIZE` (1,073,741,824) constants
+  in `OMQ::LZ4::Codec`.
+- LZ4B `decompressed_size` cap: the decoder now rejects single-block parts
+  whose declared `decompressed_size` exceeds `LZ4M_BLOCK_SIZE` (RFC §5.5
+  rule 3).
+- Codec tests for LZ4M round-trips (with and without dict, partial last
+  block, random bytes), malformed LZ4M inputs (truncated, leftover bytes,
+  corrupt block data, budget overrun), and the LZ4B block size limit.
 ## 0.2.0 (2026-05-04)
 ### Changed
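
The practical effect of the 0.3.1 threshold change can be sketched as below. The two constants carry the values stated in the changelog; the module and the `attempt_compression?` helper are hypothetical names used only for illustration, mirroring the `bytesize < min_size` passthrough check in `encode_part`.

```ruby
module ThresholdSketch
  # Values as stated in the changelog and README for 0.3.1.
  MIN_COMPRESS_NO_DICT   = 512
  MIN_COMPRESS_WITH_DICT = 128 # was 32 in 0.2.0

  # Hypothetical helper: true when a part of `bytesize` bytes is large
  # enough for the sender to attempt compression; false means the part
  # would be sent as uncompressed passthrough.
  def self.attempt_compression?(bytesize, has_dict:)
    threshold = has_dict ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT
    bytesize >= threshold
  end
end
```

Under these assumptions, a 64-byte part sent with a dictionary was handed to the compressor in 0.2.0 but goes out as passthrough in 0.3.1.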

data/README.md CHANGED

@@ -4,20 +4,19 @@
 [![License: ISC](https://img.shields.io/badge/License-ISC-blue.svg)](LICENSE)
 [![Ruby](https://img.shields.io/badge/Ruby-%3E%3D%203.3-CC342D?logo=ruby&logoColor=white)](https://www.ruby-lang.org)
-> **Status:** 0.1.0 — first landable release. See
-> [RFC.md](RFC.md) for the wire-format spec and
-> [CHANGELOG.md](CHANGELOG.md) for what's in.
 LZ4-compressed TCP transport for [OMQ](https://github.com/paddor/omq),
 complementary to [`omq-zstd`](https://github.com/paddor/omq-zstd).
 Pick `lz4+tcp://` instead of `tcp://` or `zstd+tcp://` when you want
 cheap per-message compression with a small per-connection footprint.
+See [RFC.md](RFC.md) for the wire-format specification and
+[CHANGELOG.md](CHANGELOG.md) for release history.
 ## When to pick `lz4+tcp://` over `zstd+tcp://`
 LZ4 has no entropy stage (no Huffman, no FSE), ~16 KiB of encoder state
 per connection, and trades a **worse compression ratio** for
-**~4–8× faster encode** and **~3× less memory per connection**.
+**~4-8x faster encode** and **~3x less memory per connection**.
 | | `zstd+tcp://` | `lz4+tcp://` |
 |---|---|---|
@@ -26,7 +25,7 @@ per connection, and trades a **worse compression ratio** for
 | Memory per connection | ~256 KiB | ~16 KiB + dict |
 | Ratio, 1 KiB JSON no dict | ~45% | ~65% |
 | Ratio, 1 KiB JSON with dict | ~20% | ~35% |
-| Auto-trained dictionaries | ✓ | — (user-supplied only) |
+| Auto-trained dictionaries | yes | no (user-supplied only) |
 Pick `omq-lz4` for CPU- or memory-scarce deployments (edge gateways,
 IoT concentrators, high-fanout scenarios where per-connection state
@@ -61,13 +60,13 @@ pull.receive  # => ["hello, compressed world"]
 ```
 Both peers must use `lz4+tcp://`. A `tcp://` peer cannot talk to an
-`lz4+tcp://` peer — they speak different transports.
+`lz4+tcp://` peer. They speak different transports.
 ### Dictionary compression
 Small messages don't compress well on their own. A shared dictionary
-gives 2–5× better ratios on payloads with a common prefix. Supply a
-user-trained dictionary (LZ4 has no auto-training — use `omq-zstd`
+gives 2-5x better ratios on payloads with a common prefix. Supply a
+user-trained dictionary (LZ4 has no auto-training; use `omq-zstd`
 for that):
 ```ruby
@@ -79,9 +78,7 @@ The sender ships the dictionary to the receiver in-band, prefixed
 with the dictionary sentinel (`4C 5A 34 44`, "LZ4D" in ASCII), on
 the first outgoing message. The receiver installs the dictionary
 and decompresses subsequent messages against it. Dictionary size
-is capped at **8 KiB** — tighter than `omq-zstd`'s 64 KiB cap, to
-let constrained peers accept shipments without allocating tens of
-KB of scratch.
+is capped at **8 KiB** (same cap as `omq-zstd`).
 ### Compression thresholds
@@ -90,7 +87,7 @@ To avoid pessimizing tiny frames, the sender skips compression below:
 | Mode            | Threshold |
 |-----------------|-----------|
 | No dictionary   | 512 B     |
-| With dictionary | 32 B      |
+| With dictionary | 128 B     |
 Below the threshold the part is sent uncompressed (4-byte zero
 sentinel + plaintext).
@@ -100,14 +97,54 @@ sentinel + plaintext).
 The receiver bounds decompression by the socket's `max_message_size`
 (the same knob you'd use on a plain `tcp://` socket). It caps the
 **total decompressed size of all parts in a single message**. A peer
-attempting to send an over-budget message drops the connection —
+attempting to send an over-budget message drops the connection.
 `OMQ::SocketDeadError` surfaces on the next `receive`.
 Independent of that, the dictionary itself is capped at 8 KiB; a
 larger shipment drops the connection.
-See the plan roadmap ([../OMQ-LZ4.plan](../OMQ-LZ4.plan)) for
-history and open questions.
+## Wire format
+Every post-handshake ZMTP message part starts with a 4-byte sentinel:
+| Sentinel (hex) | ASCII | Meaning |
+|---|---|---|
+| `00 00 00 00` | (none) | Uncompressed plaintext |
+| `4C 5A 34 42` | `LZ4B` | LZ4-compressed single block |
+| `4C 5A 34 4D` | `LZ4M` | LZ4-compressed multi-block |
+| `4C 5A 34 44` | `LZ4D` | Dictionary shipment |
+**Single-block** (`LZ4B`): `sentinel (4) || decompressed_size u64 LE (8) || LZ4 block bytes`.
+12-byte envelope. Raw LZ4 block format (no magic, no descriptor, no
+checksum). `decompressed_size` is required because LZ4 block format
+carries no length prefix; the receiver pre-sizes its output buffer.
+**Multi-block** (`LZ4M`): same header, followed by a sequence of
+`u32 LE compressed_block_len || LZ4 block bytes` pairs. Each block
+decompresses independently at up to 1 GiB. Used for parts exceeding
+the single-block size cap.
+**Dictionary shipment** (`LZ4D`): `sentinel (4) || dict bytes (1..8192)`.
+Single-part ZMTP message consumed by the transport, not delivered
+to the application. At most one per direction per connection.
+Any other leading 4 bytes close the connection.
+## Constants
+| Constant | Value |
+|---|---|
+| Scheme | `lz4+tcp` |
+| Uncompressed sentinel | `00 00 00 00` |
+| Single-block sentinel | `4C 5A 34 42` (`LZ4B`) |
+| Multi-block sentinel | `4C 5A 34 4D` (`LZ4M`) |
+| Dictionary sentinel | `4C 5A 34 44` (`LZ4D`) |
+| LZ4M block size | 1 GiB (`0x40000000`) |
+| Max dictionary size | 8 KiB |
+| Min compress, no dict | 512 B |
+| Min compress, with dict | 128 B |
+| LZ4B envelope | 12 B (4 sentinel + 8 size) |
+| Uncompressed envelope | 4 B (sentinel only) |
 ## Performance
@@ -124,7 +161,7 @@ Lorem ipsum prefix) input.
 |   16 KiB   |       ~3.2 µs  |     ~2.4 µs |       ~3.9 µs  |     ~3.0 µs |
 |    1 MiB   |      ~89 µs    |    ~87 µs   |     ~173 µs    |   ~303 µs   |
-**End-to-end PUSH → PULL over `lz4+tcp://` (loopback):**
+**End-to-end PUSH -> PULL over `lz4+tcp://` (loopback):**
 | Message size | Throughput |
 |--------------|-----------:|
@@ -142,8 +179,8 @@ OMQ_DEV=1 bundle exec ruby --yjit bench/head_to_head.rb   # lz4 vs zstd
 ### Head-to-head vs `omq-zstd` and plain `tcp`
-End-to-end PUSH → PULL throughput, Ruby 4.0 + YJIT. Input:
-UUID-sprinkled Lorem ipsum — a fresh UUID between each Lorem
+End-to-end PUSH -> PULL throughput, Ruby 4.0 + YJIT. Input:
+UUID-sprinkled Lorem ipsum, a fresh UUID between each Lorem
 paragraph. Approximates realistic workloads where a schema
 repeats but values vary (event logs, protobuf records, JSON
 events), so a fraction of every message is mandatorily
@@ -161,27 +198,27 @@ across three bandwidth regimes.
 | Link                | Metric   |   tcp | lz4+tcp | zstd -3 | zstd 3 |
 |---------------------|----------|------:|--------:|--------:|-------:|
 | **100 Mbit**        | plain    |  11.8 |     105 |     114 | **197**|
-| (cap ≈ 12 MiB/s)    | wire     |  11.8 |      12 |      12 |    12  |
-|                     | speedup  | 1.00× |   8.89× |   9.70× |**16.74×**|
+| (cap ~12 MiB/s)     | wire     |  11.8 |      12 |      12 |    12  |
+|                     | speedup  | 1.00x |   8.89x |   9.70x |**16.74x**|
 | **1 Gbit**          | plain    | 117   |     794 | **900** |    603 |
-| (cap ≈ 125 MiB/s)   | wire     | 117   |      93 |      94 |     36 |
-|                     | speedup  | 1.00× |   6.81× |**7.73×**|  5.17× |
+| (cap ~125 MiB/s)    | wire     | 117   |      93 |      94 |     36 |
+|                     | speedup  | 1.00x |   6.81x |**7.73x**|  5.17x |
 | **Unlimited loopback** | plain | **1 064** |  869 |    972  |    626 |
 | (kernel-copy-bound) | wire     | 1 064 |      99 |     101 |     37 |
-|                     | speedup  | 1.00× |   0.82× |   0.91× |  0.59× |
+|                     | speedup  | 1.00x |   0.82x |   0.91x |  0.59x |
 Three regimes visible:
-- **100 Mbit** — all compressed transports saturate wire at
-  ~12 MiB/s. Plaintext = wire-cap × (1 / compression-ratio). The
+- **100 Mbit**: all compressed transports saturate wire at
+  ~12 MiB/s. Plaintext = wire-cap x (1 / compression-ratio). The
   tighter the ratio, the bigger the win: `zstd 3`'s 3% wire ratio
-  translates to a **~17× throughput multiplier** over plain tcp.
-- **1 Gbit** — compressed transports shift from wire-saturated to
+  translates to a **~17x throughput multiplier** over plain tcp.
+- **1 Gbit**: compressed transports shift from wire-saturated to
   CPU-limited. `zstd -3` reaches ~75% of wire cap; `zstd 3` only
   29% (deep CPU-bound). Both beat plain tcp (which is pinned at
-  the wire cap) by **6–8×**. `zstd 3`'s tighter wire no longer
-  helps — there's no wire saturation to trade CPU for.
-- **Unlimited loopback** — no wire cap. All three are
+  the wire cap) by **6-8x**. `zstd 3`'s tighter wire no longer
+  helps; there's no wire saturation to trade CPU for.
+- **Unlimited loopback**: no wire cap. All three are
   CPU-limited. Plain tcp doesn't pay compression CPU, so **skip
   compression on loopback**.
@@ -197,25 +234,25 @@ Or use a `veth` pair in a network namespace so shaping doesn't
 touch your host's real loopback (see `tc-netem(8)`, `ip-netns(8)`).
 Full sweeps (8 sizes from 256 B to 512 KiB) for each regime live
-in `bench/head_to_head.rb` output — run it yourself; the
+in `bench/head_to_head.rb` output. Run it yourself; the
 headline numbers above are stable across repeats but small sizes
 and very large sizes vary a bit run-to-run.
 **Takeaway:**
 - Pick **`lz4+tcp://`** for bandwidth-limited links (any real
-  network — even 1 Gbit LAN). 6–9× throughput multiplier over
+  network, even 1 Gbit LAN). 6-9x throughput multiplier over
   plain `tcp`, minimal memory (~16 KiB/connection), modest CPU.
   Ties or beats `zstd -3` at 1 Gbit; loses the ratio race to
   `zstd 3` at 100 Mbit and below.
-- Pick **`zstd+tcp://` (level ≥ 3)** when the wire is the
-  precious resource (≤ 100 Mbit links, WAN, or you're paying for
-  egress). **~17× throughput multiplier at 100 Mbit** for 128 KiB
-  messages is hard to argue with.
+- Pick **`zstd+tcp://` (level >= 3)** when the wire is the
+  precious resource (100 Mbit links or slower, WAN, or you're
+  paying for egress). **~17x throughput multiplier at 100 Mbit**
+  for 128 KiB messages is hard to argue with.
 - Pick **plain `tcp://`** when the link is *not* the bottleneck
   (localhost IPC, loopback, datacenter-fast inter-host
   connections where the bandwidth ceiling is above the CPU's
-  compress/decompress speed — typically 10+ Gbit), or when the
+  compress/decompress speed, typically 10+ Gbit), or when the
   payload is already high-entropy (encrypted, already compressed,
   random binary) and compression only adds overhead.
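
The sentinel table added to the README can be read as a small dispatch step on the first 4 bytes of each part. The sketch below uses the sentinel values from that table; `classify_part` and `declared_size` are hypothetical helpers, not part of the gem's API.

```ruby
# Sentinel values taken from the README's wire-format table.
SENTINELS = {
  "\x00\x00\x00\x00".b => :uncompressed,
  "LZ4B".b             => :single_block,
  "LZ4M".b             => :multi_block,
  "LZ4D".b             => :dictionary,
}.freeze

# Hypothetical helper: classify a wire part by its leading 4 bytes.
# Returns nil for a short part or an unknown sentinel (per the README,
# any other leading 4 bytes close the connection).
def classify_part(wire_bytes)
  return nil if wire_bytes.bytesize < 4
  SENTINELS[wire_bytes.byteslice(0, 4).b]
end

# Hypothetical helper: read the declared decompressed_size (u64 LE)
# that follows the sentinel in an LZ4B or LZ4M part.
def declared_size(wire_bytes)
  raise ArgumentError, "header needs 12 bytes" if wire_bytes.bytesize < 12
  wire_bytes.byteslice(4, 8).unpack1("Q<")
end
```

A real transport would route `:dictionary` parts to the dictionary handler before any payload decoding, as the codec comments in the next file describe.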

data/lib/omq/lz4/codec.rb CHANGED

@@ -14,18 +14,22 @@ module OMQ
     # Each wire part begins with a 4-byte sentinel:
     #
     #   00 00 00 00   uncompressed plaintext
-    #   4C 5A 34 42   LZ4-compressed block ("LZ4B" in ASCII)
+    #   4C 5A 34 42   LZ4-compressed single block ("LZ4B" in ASCII)
+    #   4C 5A 34 4D   LZ4-compressed multi-block ("LZ4M" in ASCII)
     #   4C 5A 34 44   dictionary shipment ("LZ4D" in ASCII)
     #
-    # `decode_part` handles UNCOMPRESSED and LZ4B only. Dictionary
+    # `decode_part` handles UNCOMPRESSED, LZ4B, and LZ4M. Dictionary
     # shipments are a transport-layer concern: the transport peeks the
     # first 4 bytes of each incoming wire part, routes LZ4D to
     # `decode_dict_shipment`, and never hands a shipment to `decode_part`.
     module Codec
       UNCOMPRESSED_SENTINEL = "\x00\x00\x00\x00".b.freeze
       LZ4B_SENTINEL         = "LZ4B".b.freeze
+      LZ4M_SENTINEL         = "LZ4M".b.freeze
       LZ4D_SENTINEL         = "LZ4D".b.freeze
+      LZ4M_BLOCK_SIZE = 1_073_741_824
       # Size thresholds below which compression isn't worth attempting.
       # Empirically tuned on Lorem-ipsum-like input via
       # bench/min_compress_size_sweep.rb: for block-format LZ4 the
@@ -37,7 +41,7 @@ module OMQ
       # passthrough anyway. Below the threshold, `encode_part` emits
       # UNCOMPRESSED directly without touching the compressor.
       MIN_COMPRESS_NO_DICT   = 512
-      MIN_COMPRESS_WITH_DICT = 32
+      MIN_COMPRESS_WITH_DICT = 128
       # Maximum dictionary size on the wire. A policy choice, not a
       # protocol limit; tight enough that constrained peers can accept
@@ -65,10 +69,11 @@ module OMQ
       # `min_size` overrides the default threshold. Nil (the default)
       # picks `MIN_COMPRESS_NO_DICT` for a no-dict codec and
       # `MIN_COMPRESS_WITH_DICT` for a dict codec.
-      def encode_part(plaintext, block_codec:, min_size: nil)
+      def encode_part(plaintext, block_codec:, min_size: nil, block_size: LZ4M_BLOCK_SIZE)
         min_size ||= block_codec.has_dict? ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT
         return encode_passthrough(plaintext) if plaintext.bytesize < min_size
+        return encode_multi_block(plaintext, block_codec, block_size) if plaintext.bytesize > block_size
         compressed = block_codec.compress(plaintext)
@@ -91,7 +96,7 @@ module OMQ
       #
       # Does not handle LZ4D shipments; transport must route those to
       # `decode_dict_shipment` before calling here.
-      def decode_part(wire_bytes, block_codec:, max_size: nil)
+      def decode_part(wire_bytes, block_codec:, max_size: nil, block_size: LZ4M_BLOCK_SIZE)
         if wire_bytes.bytesize < 4
           raise ProtocolError, "wire part too short (< 4 bytes)"
         end
@@ -107,6 +112,10 @@ module OMQ
             raise ProtocolError, "LZ4B part too short (< 12 bytes, no room for size field)"
           end
           decompressed_size = wire_bytes.byteslice(4, 8).unpack1("Q<")
+          if decompressed_size > block_size
+            raise ProtocolError,
+              "LZ4B decompressed_size #{decompressed_size} exceeds block size limit #{block_size}"
+          end
           check_size!(decompressed_size, max_size)
           block = wire_bytes.byteslice(12, wire_bytes.bytesize - 12)
           begin
@@ -114,8 +123,9 @@ module OMQ
           rescue RLZ4::DecompressError => e
             raise ProtocolError, "LZ4B decode failed: #{e.message}"
           end
+        when LZ4M_SENTINEL
+          decode_multi_block(wire_bytes, block_codec, max_size, block_size)
         when LZ4D_SENTINEL
-          # Should not reach decode_part; transport should have routed this.
           raise ProtocolError,
             "LZ4D dictionary shipment seen at decode_part (transport should route to decode_dict_shipment)"
         else
@@ -154,6 +164,70 @@ module OMQ
       class << self
         private
+        def encode_multi_block(plaintext, block_codec, block_size)
+          buf = String.new(encoding: Encoding::BINARY)
+          buf << LZ4M_SENTINEL
+          buf << [plaintext.bytesize].pack("Q<")
+          offset = 0
+          while offset < plaintext.bytesize
+            chunk_size = [block_size, plaintext.bytesize - offset].min
+            chunk = plaintext.byteslice(offset, chunk_size)
+            compressed = block_codec.compress(chunk)
+            buf << [compressed.bytesize].pack("V")
+            buf << compressed
+            offset += chunk_size
+          end
+          buf
+        end
+        def decode_multi_block(wire_bytes, block_codec, max_size, block_size)
+          if wire_bytes.bytesize < 12
+            raise ProtocolError, "LZ4M part too short (< 12 bytes, no room for size field)"
+          end
+          decompressed_size = wire_bytes.byteslice(4, 8).unpack1("Q<")
+          check_size!(decompressed_size, max_size)
+          output = String.new(capacity: decompressed_size, encoding: Encoding::BINARY)
+          offset = 12
+          remaining = decompressed_size
+          while remaining > 0
+            if offset + 4 > wire_bytes.bytesize
+              raise ProtocolError, "LZ4M truncated: no room for block length at offset #{offset}"
+            end
+            compressed_len = wire_bytes.byteslice(offset, 4).unpack1("V")
+            offset += 4
+            if offset + compressed_len > wire_bytes.bytesize
+              raise ProtocolError, "LZ4M truncated: block at offset #{offset} extends past wire end"
+            end
+            block_data = wire_bytes.byteslice(offset, compressed_len)
+            offset += compressed_len
+            block_decompressed_size = [block_size, remaining].min
+            begin
+              output << block_codec.decompress(block_data, decompressed_size: block_decompressed_size)
+            rescue RLZ4::DecompressError => e
+              raise ProtocolError, "LZ4M block decode failed: #{e.message}"
+            end
+            remaining -= block_decompressed_size
+          end
+          if offset != wire_bytes.bytesize
+            raise ProtocolError, "LZ4M: #{wire_bytes.bytesize - offset} leftover bytes after last block"
+          end
+          output
+        end
         def encode_passthrough(plaintext)
           UNCOMPRESSED_SENTINEL + plaintext
         end
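
The LZ4M layout written by `encode_multi_block` (sentinel, u64 LE total size, then `u32 LE length || block` pairs) can be walked without a decompressor. The following is a minimal sketch for inspecting such a part; `lz4m_block_lengths` is a hypothetical helper, and unlike `decode_multi_block` it does not track the remaining decompressed budget, so it simply walks until the wire bytes run out.

```ruby
# Hypothetical inspector (not part of omq-lz4): returns the declared
# decompressed size and the compressed length of each block in an
# LZ4M part. Nothing is decompressed.
def lz4m_block_lengths(wire_bytes)
  raise ArgumentError, "not an LZ4M part" unless wire_bytes.byteslice(0, 4) == "LZ4M".b
  raise ArgumentError, "LZ4M part too short" if wire_bytes.bytesize < 12

  declared = wire_bytes.byteslice(4, 8).unpack1("Q<")
  lengths  = []
  offset   = 12
  while offset < wire_bytes.bytesize
    # Each block is prefixed with its compressed length as u32 LE.
    raise ArgumentError, "truncated block length" if offset + 4 > wire_bytes.bytesize
    len = wire_bytes.byteslice(offset, 4).unpack1("V")
    offset += 4
    raise ArgumentError, "block extends past wire end" if offset + len > wire_bytes.bytesize
    lengths << len
    offset += len
  end
  [declared, lengths]
end
```

The `block_size:` keyword on `encode_part` exists so tests can produce multi-block parts from small inputs; a walker like this one could then confirm how many blocks were emitted.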

data/lib/omq/lz4/version.rb CHANGED

@@ -2,6 +2,6 @@
 module OMQ
   module LZ4
-    VERSION = "0.2.0"
+    VERSION = "0.3.1"
   end
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: omq-lz4
 version: !ruby/object:Gem::Version
-  version: 0.2.0
+  version: 0.3.1
 platform: ruby
 authors:
 - Patrik Wenger
@@ -74,7 +74,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 4.0.6
+rubygems_version: 4.0.10
 specification_version: 4
 summary: LZ4+TCP transport for OMQ
 test_files: []