omq-lz4 0.2.0 → 0.3.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +28 -0
- data/README.md +75 -38
- data/lib/omq/lz4/codec.rb +80 -6
- data/lib/omq/lz4/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: b14469227b64cb647b42dbb8495754752ea81654449eab28cae78a0778aa36dd
+  data.tar.gz: 344189a2d0c29ec2d46b310f5607d133bdba87a69fb974be6bb582101f76aacf
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c3ef67d35613434faf125317a61dded584c8d268032eb01830e22a809415ab20b2cf198daa8ee7803d04e88b97d4b39bb6042071a7b18bbf8cff03cb6db06cee
+  data.tar.gz: 49ed574d2b93d2e039160514602cf4f90e125edd9c2b3e05033fbbd5ee1dc1eb098373acaed27c54553e9e6977d6488d3f751b85544ad87a6f0498e5e061cad5
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,33 @@
 # Changelog
 
+## [Unreleased]
+
+## 0.3.1 (2026-05-28)
+
+### Changed
+
+- `MIN_COMPRESS_WITH_DICT`: raised from 32 to 128. The previous value
+  was too aggressive; 128 leaves a safer margin above the measured
+  crossover.
+
+## 0.3.0 (2026-05-11)
+
+### Added
+
+- **LZ4M multi-block encoding/decoding** (RFC §5.3a, §5.4a, §5.5 rule 4).
+  Parts larger than `LZ4M_BLOCK_SIZE` (1 GiB) are split into independently
+  decodable blocks, each compressed against the installed dict (if any).
+  `encode_part` / `decode_part` accept a `block_size:` keyword for testing
+  with smaller-than-protocol block sizes.
+- `LZ4M_SENTINEL` (`"LZ4M"`) and `LZ4M_BLOCK_SIZE` (1,073,741,824) constants
+  in `OMQ::LZ4::Codec`.
+- LZ4B `decompressed_size` cap: the decoder now rejects single-block parts
+  whose declared `decompressed_size` exceeds `LZ4M_BLOCK_SIZE` (RFC §5.5
+  rule 3).
+- Codec tests for LZ4M round-trips (with and without dict, partial last
+  block, random bytes), malformed LZ4M inputs (truncated, leftover bytes,
+  corrupt block data, budget overrun), and the LZ4B block size limit.
+
 ## 0.2.0 (2026-05-04)
 
 ### Changed
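The 0.3.1 entry above only changes a constant, so its effect is easiest to see in how the sender picks its threshold. The following is a minimal sketch, not the gem's code: the two constant values come from the changelog, while `StubCodec`, `compression_threshold`, and `compress?` are hypothetical names introduced for illustration.

```ruby
# Threshold values as stated in the changelog (0.3.1 raises the dict one to 128).
MIN_COMPRESS_NO_DICT   = 512
MIN_COMPRESS_WITH_DICT = 128

# Hypothetical stand-in for the real block codec; only `has_dict?` matters here.
StubCodec = Struct.new(:dict) do
  def has_dict?
    !dict.nil?
  end
end

# An explicit min_size wins; otherwise the default depends on whether a
# dictionary is installed.
def compression_threshold(block_codec, min_size: nil)
  min_size || (block_codec.has_dict? ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT)
end

# Parts strictly below the threshold are sent as uncompressed passthrough.
def compress?(plaintext, block_codec, min_size: nil)
  plaintext.bytesize >= compression_threshold(block_codec, min_size: min_size)
end
```

Under these assumptions, a 100-byte part with a dictionary installed would have been compressed with the old threshold of 32 but is now passed through, since 100 is below 128.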
data/README.md
CHANGED
@@ -4,20 +4,19 @@
 [](LICENSE)
 [](https://www.ruby-lang.org)
 
-> **Status:** 0.1.0 — first landable release. See
-> [RFC.md](RFC.md) for the wire-format spec and
-> [CHANGELOG.md](CHANGELOG.md) for what's in.
-
 LZ4-compressed TCP transport for [OMQ](https://github.com/paddor/omq),
 complementary to [`omq-zstd`](https://github.com/paddor/omq-zstd).
 Pick `lz4+tcp://` instead of `tcp://` or `zstd+tcp://` when you want
 cheap per-message compression with a small per-connection footprint.
 
+See [RFC.md](RFC.md) for the wire-format specification and
+[CHANGELOG.md](CHANGELOG.md) for release history.
+
 ## When to pick `lz4+tcp://` over `zstd+tcp://`
 
 LZ4 has no entropy stage (no Huffman, no FSE), ~16 KiB of encoder state
 per connection, and trades a **worse compression ratio** for
-**~4
+**~4-8x faster encode** and **~3x less memory per connection**.
 
 | | `zstd+tcp://` | `lz4+tcp://` |
 |---|---|---|
@@ -26,7 +25,7 @@ per connection, and trades a **worse compression ratio** for
 | Memory per connection | ~256 KiB | ~16 KiB + dict |
 | Ratio, 1 KiB JSON no dict | ~45% | ~65% |
 | Ratio, 1 KiB JSON with dict | ~20% | ~35% |
-| Auto-trained dictionaries |
+| Auto-trained dictionaries | yes | no (user-supplied only) |
 
 Pick `omq-lz4` for CPU- or memory-scarce deployments (edge gateways,
 IoT concentrators, high-fanout scenarios where per-connection state
@@ -61,13 +60,13 @@ pull.receive # => ["hello, compressed world"]
 ```
 
 Both peers must use `lz4+tcp://`. A `tcp://` peer cannot talk to an
-`lz4+tcp://` peer
+`lz4+tcp://` peer. They speak different transports.
 
 ### Dictionary compression
 
 Small messages don't compress well on their own. A shared dictionary
-gives 2
-user-trained dictionary (LZ4 has no auto-training
+gives 2-5x better ratios on payloads with a common prefix. Supply a
+user-trained dictionary (LZ4 has no auto-training; use `omq-zstd`
 for that):
 
 ```ruby
@@ -79,9 +78,7 @@ The sender ships the dictionary to the receiver in-band, prefixed
 with the dictionary sentinel (`4C 5A 34 44`, "LZ4D" in ASCII), on
 the first outgoing message. The receiver installs the dictionary
 and decompresses subsequent messages against it. Dictionary size
-is capped at **8 KiB**
-let constrained peers accept shipments without allocating tens of
-KB of scratch.
+is capped at **8 KiB** (same cap as `omq-zstd`).
 
 ### Compression thresholds
 
@@ -90,7 +87,7 @@ To avoid pessimizing tiny frames, the sender skips compression below:
 | Mode | Threshold |
 |-----------------|-----------|
 | No dictionary | 512 B |
-| With dictionary |
+| With dictionary | 128 B |
 
 Below the threshold the part is sent uncompressed (4-byte zero
 sentinel + plaintext).
@@ -100,14 +97,54 @@ sentinel + plaintext).
 The receiver bounds decompression by the socket's `max_message_size`
 (the same knob you'd use on a plain `tcp://` socket). It caps the
 **total decompressed size of all parts in a single message**. A peer
-attempting to send an over-budget message drops the connection
+attempting to send an over-budget message drops the connection.
 `OMQ::SocketDeadError` surfaces on the next `receive`.
 
 Independent of that, the dictionary itself is capped at 8 KiB; a
 larger shipment drops the connection.
 
-
-
+## Wire format
+
+Every post-handshake ZMTP message part starts with a 4-byte sentinel:
+
+| Sentinel (hex) | ASCII | Meaning |
+|---|---|---|
+| `00 00 00 00` | (none) | Uncompressed plaintext |
+| `4C 5A 34 42` | `LZ4B` | LZ4-compressed single block |
+| `4C 5A 34 4D` | `LZ4M` | LZ4-compressed multi-block |
+| `4C 5A 34 44` | `LZ4D` | Dictionary shipment |
+
+**Single-block** (`LZ4B`): `sentinel (4) || decompressed_size u64 LE (8) || LZ4 block bytes`.
+12-byte envelope. Raw LZ4 block format (no magic, no descriptor, no
+checksum). `decompressed_size` is required because LZ4 block format
+carries no length prefix; the receiver pre-sizes its output buffer.
+
+**Multi-block** (`LZ4M`): same header, followed by a sequence of
+`u32 LE compressed_block_len || LZ4 block bytes` pairs. Each block
+decompresses independently at up to 1 GiB. Used for parts exceeding
+the single-block size cap.
+
+**Dictionary shipment** (`LZ4D`): `sentinel (4) || dict bytes (1..8192)`.
+Single-part ZMTP message consumed by the transport, not delivered
+to the application. At most one per direction per connection.
+
+Any other leading 4 bytes close the connection.
+
+## Constants
+
+| Constant | Value |
+|---|---|
+| Scheme | `lz4+tcp` |
+| Uncompressed sentinel | `00 00 00 00` |
+| Single-block sentinel | `4C 5A 34 42` (`LZ4B`) |
+| Multi-block sentinel | `4C 5A 34 4D` (`LZ4M`) |
+| Dictionary sentinel | `4C 5A 34 44` (`LZ4D`) |
+| LZ4M block size | 1 GiB (`0x40000000`) |
+| Max dictionary size | 8 KiB |
+| Min compress, no dict | 512 B |
+| Min compress, with dict | 128 B |
+| LZ4B envelope | 12 B (4 sentinel + 8 size) |
+| Uncompressed envelope | 4 B (sentinel only) |
 
 ## Performance
 
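The wire-format section added in this hunk can be read as a small dispatch on the first four bytes, followed by a fixed-offset parse. Below is a minimal sketch of that reading, assuming only what the README text states. `SENTINELS`, `classify_part`, and `parse_lz4b_envelope` are hypothetical names, and `ArgumentError` stands in for whatever error the real transport raises before closing the connection.

```ruby
# Sentinel bytes as listed in the README's wire-format table.
SENTINELS = {
  "\x00\x00\x00\x00".b => :uncompressed,
  "LZ4B".b             => :single_block,
  "LZ4M".b             => :multi_block,
  "LZ4D".b             => :dict_shipment,
}.freeze

# Map the leading 4 bytes of a wire part to its kind. Any other prefix is
# rejected, matching "any other leading 4 bytes close the connection".
def classify_part(wire_bytes)
  raise ArgumentError, "wire part too short (< 4 bytes)" if wire_bytes.bytesize < 4

  SENTINELS.fetch(wire_bytes.byteslice(0, 4).b) do
    raise ArgumentError, "unknown sentinel"
  end
end

# Split an LZ4B part into [decompressed_size, raw LZ4 block bytes].
# Layout: sentinel (4) || decompressed_size u64 LE (8) || block bytes.
def parse_lz4b_envelope(wire_bytes)
  raise ArgumentError, "LZ4B part too short (< 12 bytes)" if wire_bytes.bytesize < 12

  size  = wire_bytes.byteslice(4, 8).unpack1("Q<")
  block = wire_bytes.byteslice(12, wire_bytes.bytesize - 12)
  [size, block]
end
```

The sketch stops at the envelope. Decompressing the block needs the declared size up front, which is why the README says the field is required.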
@@ -124,7 +161,7 @@ Lorem ipsum prefix) input.
 | 16 KiB | ~3.2 µs | ~2.4 µs | ~3.9 µs | ~3.0 µs |
 | 1 MiB | ~89 µs | ~87 µs | ~173 µs | ~303 µs |
 
-**End-to-end PUSH
+**End-to-end PUSH -> PULL over `lz4+tcp://` (loopback):**
 
 | Message size | Throughput |
 |--------------|-----------:|
@@ -142,8 +179,8 @@ OMQ_DEV=1 bundle exec ruby --yjit bench/head_to_head.rb # lz4 vs zstd
 
 ### Head-to-head vs `omq-zstd` and plain `tcp`
 
-End-to-end PUSH
-UUID-sprinkled Lorem ipsum
+End-to-end PUSH -> PULL throughput, Ruby 4.0 + YJIT. Input:
+UUID-sprinkled Lorem ipsum, a fresh UUID between each Lorem
 paragraph. Approximates realistic workloads where a schema
 repeats but values vary (event logs, protobuf records, JSON
 events), so a fraction of every message is mandatorily
@@ -161,27 +198,27 @@ across three bandwidth regimes.
 | Link | Metric | tcp | lz4+tcp | zstd -3 | zstd 3 |
 |---------------------|----------|------:|--------:|--------:|-------:|
 | **100 Mbit** | plain | 11.8 | 105 | 114 | **197**|
-| (cap
-| | speedup | 1.
+| (cap ~12 MiB/s) | wire | 11.8 | 12 | 12 | 12 |
+| | speedup | 1.00x | 8.89x | 9.70x |**16.74x**|
 | **1 Gbit** | plain | 117 | 794 | **900** | 603 |
-| (cap
-| | speedup | 1.
+| (cap ~125 MiB/s) | wire | 117 | 93 | 94 | 36 |
+| | speedup | 1.00x | 6.81x |**7.73x**| 5.17x |
 | **Unlimited loopback** | plain | **1 064** | 869 | 972 | 626 |
 | (kernel-copy-bound) | wire | 1 064 | 99 | 101 | 37 |
-| | speedup | 1.
+| | speedup | 1.00x | 0.82x | 0.91x | 0.59x |
 
 Three regimes visible:
 
-- **100 Mbit
-  ~12 MiB/s. Plaintext = wire-cap
+- **100 Mbit**: all compressed transports saturate wire at
+  ~12 MiB/s. Plaintext = wire-cap x (1 / compression-ratio). The
   tighter the ratio, the bigger the win: `zstd 3`'s 3% wire ratio
-  translates to a **~
-- **1 Gbit
+  translates to a **~17x throughput multiplier** over plain tcp.
+- **1 Gbit**: compressed transports shift from wire-saturated to
   CPU-limited. `zstd -3` reaches ~75% of wire cap; `zstd 3` only
   29% (deep CPU-bound). Both beat plain tcp (which is pinned at
-  the wire cap) by **6
-  helps
-- **Unlimited loopback
+  the wire cap) by **6-8x**. `zstd 3`'s tighter wire no longer
+  helps; there's no wire saturation to trade CPU for.
+- **Unlimited loopback**: no wire cap. All three are
   CPU-limited. Plain tcp doesn't pay compression CPU, so **skip
   compression on loopback**.
 
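The relation quoted in this hunk, plaintext = wire-cap x (1 / compression-ratio), can be checked against the table's own rows. The sketch below recomputes the speedup (plain throughput over plain-tcp throughput) and the effective wire/plain ratio from the printed figures. The unit is assumed to be MiB/s, and `TABLE`, `speedup`, and `wire_ratio` are names introduced here for illustration. The table entries are rounded, so recomputed values may differ slightly from the printed speedup row, and the implied wire/plain ratios need not match ratios quoted in the prose.

```ruby
# Throughput figures copied from the "plain" and "wire" rows of the table above.
TABLE = {
  "100 Mbit" => {
    plain: { "tcp" => 11.8, "lz4+tcp" => 105, "zstd -3" => 114, "zstd 3" => 197 },
    wire:  { "tcp" => 11.8, "lz4+tcp" => 12,  "zstd -3" => 12,  "zstd 3" => 12 },
  },
  "1 Gbit" => {
    plain: { "tcp" => 117, "lz4+tcp" => 794, "zstd -3" => 900, "zstd 3" => 603 },
    wire:  { "tcp" => 117, "lz4+tcp" => 93,  "zstd -3" => 94,  "zstd 3" => 36 },
  },
  "Unlimited loopback" => {
    plain: { "tcp" => 1064, "lz4+tcp" => 869, "zstd -3" => 972, "zstd 3" => 626 },
    wire:  { "tcp" => 1064, "lz4+tcp" => 99,  "zstd -3" => 101, "zstd 3" => 37 },
  },
}.freeze

# Plaintext throughput of a transport relative to plain tcp on the same link.
def speedup(plain, tcp_plain)
  plain.to_f / tcp_plain
end

# Bytes on the wire per plaintext byte, as implied by the two throughput rows.
def wire_ratio(wire, plain)
  wire.to_f / plain
end

TABLE.each do |link, rows|
  tcp_plain = rows[:plain]["tcp"]
  rows[:plain].each do |transport, plain|
    wire = rows[:wire][transport]
    puts format("%-20s %-8s speedup %5.2fx  wire/plain %5.1f%%",
                link, transport, speedup(plain, tcp_plain), 100 * wire_ratio(wire, plain))
  end
end
```

On the wire-saturated 100 Mbit link the speedup is close to the reciprocal of the wire/plain ratio, which is the relation the first bullet states. On loopback the speedups fall below 1, consistent with the advice to skip compression there.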
@@ -197,25 +234,25 @@ Or use a `veth` pair in a network namespace so shaping doesn't
 touch your host's real loopback (see `tc-netem(8)`, `ip-netns(8)`).
 
 Full sweeps (8 sizes from 256 B to 512 KiB) for each regime live
-in `bench/head_to_head.rb` output
+in `bench/head_to_head.rb` output. Run it yourself; the
 headline numbers above are stable across repeats but small sizes
 and very large sizes vary a bit run-to-run.
 
 **Takeaway:**
 
 - Pick **`lz4+tcp://`** for bandwidth-limited links (any real
-  network
+  network, even 1 Gbit LAN). 6-9x throughput multiplier over
   plain `tcp`, minimal memory (~16 KiB/connection), modest CPU.
   Ties or beats `zstd -3` at 1 Gbit; loses the ratio race to
   `zstd 3` at 100 Mbit and below.
-- Pick **`zstd+tcp://` (level
-  precious resource (
-  egress). **~
-  messages is hard to argue with.
+- Pick **`zstd+tcp://` (level >= 3)** when the wire is the
+  precious resource (100 Mbit links or slower, WAN, or you're
+  paying for egress). **~17x throughput multiplier at 100 Mbit**
+  for 128 KiB messages is hard to argue with.
 - Pick **plain `tcp://`** when the link is *not* the bottleneck
   (localhost IPC, loopback, datacenter-fast inter-host
   connections where the bandwidth ceiling is above the CPU's
-  compress/decompress speed
+  compress/decompress speed, typically 10+ Gbit), or when the
   payload is already high-entropy (encrypted, already compressed,
   random binary) and compression only adds overhead.
 
data/lib/omq/lz4/codec.rb
CHANGED
@@ -14,18 +14,22 @@ module OMQ
     # Each wire part begins with a 4-byte sentinel:
     #
     # 00 00 00 00 uncompressed plaintext
-    # 4C 5A 34 42 LZ4-compressed block ("LZ4B" in ASCII)
+    # 4C 5A 34 42 LZ4-compressed single block ("LZ4B" in ASCII)
+    # 4C 5A 34 4D LZ4-compressed multi-block ("LZ4M" in ASCII)
     # 4C 5A 34 44 dictionary shipment ("LZ4D" in ASCII)
     #
-    # `decode_part` handles UNCOMPRESSED and
+    # `decode_part` handles UNCOMPRESSED, LZ4B, and LZ4M. Dictionary
     # shipments are a transport-layer concern: the transport peeks the
     # first 4 bytes of each incoming wire part, routes LZ4D to
     # `decode_dict_shipment`, and never hands a shipment to `decode_part`.
     module Codec
       UNCOMPRESSED_SENTINEL = "\x00\x00\x00\x00".b.freeze
       LZ4B_SENTINEL = "LZ4B".b.freeze
+      LZ4M_SENTINEL = "LZ4M".b.freeze
       LZ4D_SENTINEL = "LZ4D".b.freeze
 
+      LZ4M_BLOCK_SIZE = 1_073_741_824
+
       # Size thresholds below which compression isn't worth attempting.
       # Empirically tuned on Lorem-ipsum-like input via
       # bench/min_compress_size_sweep.rb: for block-format LZ4 the
@@ -37,7 +41,7 @@ module OMQ
       # passthrough anyway. Below the threshold, `encode_part` emits
       # UNCOMPRESSED directly without touching the compressor.
       MIN_COMPRESS_NO_DICT = 512
-      MIN_COMPRESS_WITH_DICT =
+      MIN_COMPRESS_WITH_DICT = 128
 
       # Maximum dictionary size on the wire. A policy choice, not a
       # protocol limit; tight enough that constrained peers can accept
@@ -65,10 +69,11 @@ module OMQ
       # `min_size` overrides the default threshold. Nil (the default)
       # picks `MIN_COMPRESS_NO_DICT` for a no-dict codec and
       # `MIN_COMPRESS_WITH_DICT` for a dict codec.
-      def encode_part(plaintext, block_codec:, min_size: nil)
+      def encode_part(plaintext, block_codec:, min_size: nil, block_size: LZ4M_BLOCK_SIZE)
         min_size ||= block_codec.has_dict? ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT
 
         return encode_passthrough(plaintext) if plaintext.bytesize < min_size
+        return encode_multi_block(plaintext, block_codec, block_size) if plaintext.bytesize > block_size
 
         compressed = block_codec.compress(plaintext)
 
@@ -91,7 +96,7 @@ module OMQ
       #
       # Does not handle LZ4D shipments; transport must route those to
       # `decode_dict_shipment` before calling here.
-      def decode_part(wire_bytes, block_codec:, max_size: nil)
+      def decode_part(wire_bytes, block_codec:, max_size: nil, block_size: LZ4M_BLOCK_SIZE)
         if wire_bytes.bytesize < 4
           raise ProtocolError, "wire part too short (< 4 bytes)"
         end
@@ -107,6 +112,10 @@ module OMQ
             raise ProtocolError, "LZ4B part too short (< 12 bytes, no room for size field)"
           end
           decompressed_size = wire_bytes.byteslice(4, 8).unpack1("Q<")
+          if decompressed_size > block_size
+            raise ProtocolError,
+              "LZ4B decompressed_size #{decompressed_size} exceeds block size limit #{block_size}"
+          end
           check_size!(decompressed_size, max_size)
           block = wire_bytes.byteslice(12, wire_bytes.bytesize - 12)
           begin
@@ -114,8 +123,9 @@ module OMQ
           rescue RLZ4::DecompressError => e
             raise ProtocolError, "LZ4B decode failed: #{e.message}"
           end
+        when LZ4M_SENTINEL
+          decode_multi_block(wire_bytes, block_codec, max_size, block_size)
         when LZ4D_SENTINEL
-          # Should not reach decode_part; transport should have routed this.
           raise ProtocolError,
             "LZ4D dictionary shipment seen at decode_part (transport should route to decode_dict_shipment)"
         else
@@ -154,6 +164,70 @@ module OMQ
       class << self
         private
 
+        def encode_multi_block(plaintext, block_codec, block_size)
+          buf = String.new(encoding: Encoding::BINARY)
+          buf << LZ4M_SENTINEL
+          buf << [plaintext.bytesize].pack("Q<")
+
+          offset = 0
+          while offset < plaintext.bytesize
+            chunk_size = [block_size, plaintext.bytesize - offset].min
+            chunk = plaintext.byteslice(offset, chunk_size)
+            compressed = block_codec.compress(chunk)
+            buf << [compressed.bytesize].pack("V")
+            buf << compressed
+            offset += chunk_size
+          end
+
+          buf
+        end
+
+
+        def decode_multi_block(wire_bytes, block_codec, max_size, block_size)
+          if wire_bytes.bytesize < 12
+            raise ProtocolError, "LZ4M part too short (< 12 bytes, no room for size field)"
+          end
+
+          decompressed_size = wire_bytes.byteslice(4, 8).unpack1("Q<")
+          check_size!(decompressed_size, max_size)
+
+          output = String.new(capacity: decompressed_size, encoding: Encoding::BINARY)
+          offset = 12
+          remaining = decompressed_size
+
+          while remaining > 0
+            if offset + 4 > wire_bytes.bytesize
+              raise ProtocolError, "LZ4M truncated: no room for block length at offset #{offset}"
+            end
+
+            compressed_len = wire_bytes.byteslice(offset, 4).unpack1("V")
+            offset += 4
+
+            if offset + compressed_len > wire_bytes.bytesize
+              raise ProtocolError, "LZ4M truncated: block at offset #{offset} extends past wire end"
+            end
+
+            block_data = wire_bytes.byteslice(offset, compressed_len)
+            offset += compressed_len
+
+            block_decompressed_size = [block_size, remaining].min
+            begin
+              output << block_codec.decompress(block_data, decompressed_size: block_decompressed_size)
+            rescue RLZ4::DecompressError => e
+              raise ProtocolError, "LZ4M block decode failed: #{e.message}"
+            end
+
+            remaining -= block_decompressed_size
+          end
+
+          if offset != wire_bytes.bytesize
+            raise ProtocolError, "LZ4M: #{wire_bytes.bytesize - offset} leftover bytes after last block"
+          end
+
+          output
+        end
+
+
         def encode_passthrough(plaintext)
           UNCOMPRESSED_SENTINEL + plaintext
         end
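The LZ4M framing added above can be exercised without the real compressor. The following is a minimal sketch of the same layout (sentinel, u64 LE total size, then `u32 LE length || block` pairs). `IdentityCodec` is a hypothetical stand-in that copies bytes instead of compressing them, `frame_multi_block` / `unframe_multi_block` are illustrative names, and `ArgumentError` replaces the gem's `ProtocolError`. It illustrates the framing rules only and is not the gem's implementation.

```ruby
LZ4M_SENTINEL = "LZ4M".b.freeze

# Hypothetical stand-in for the real block codec: "compresses" by copying.
class IdentityCodec
  def compress(bytes)
    bytes.b
  end

  def decompress(bytes, decompressed_size:)
    raise ArgumentError, "block size mismatch" unless bytes.bytesize == decompressed_size

    bytes.b
  end
end

# sentinel (4) || total size u64 LE (8) || repeated (u32 LE len || block)
def frame_multi_block(plaintext, codec, block_size)
  buf = String.new(encoding: Encoding::BINARY)
  buf << LZ4M_SENTINEL << [plaintext.bytesize].pack("Q<")
  offset = 0
  while offset < plaintext.bytesize
    chunk = plaintext.byteslice(offset, block_size) # last chunk may be shorter
    compressed = codec.compress(chunk)
    buf << [compressed.bytesize].pack("V") << compressed
    offset += chunk.bytesize
  end
  buf
end

def unframe_multi_block(wire, codec, block_size)
  raise ArgumentError, "part too short" if wire.bytesize < 12
  raise ArgumentError, "not an LZ4M part" unless wire.byteslice(0, 4).b == LZ4M_SENTINEL

  remaining = wire.byteslice(4, 8).unpack1("Q<")
  offset = 12
  out = String.new(encoding: Encoding::BINARY)
  while remaining > 0
    raise ArgumentError, "truncated block length" if offset + 4 > wire.bytesize
    len = wire.byteslice(offset, 4).unpack1("V")
    offset += 4
    raise ArgumentError, "truncated block" if offset + len > wire.bytesize
    # Every block but the last decompresses to exactly block_size bytes.
    expected = [block_size, remaining].min
    out << codec.decompress(wire.byteslice(offset, len), decompressed_size: expected)
    offset += len
    remaining -= expected
  end
  raise ArgumentError, "leftover bytes after last block" unless offset == wire.bytesize

  out
end
```

Passing a small `block_size` (for example 4) mirrors how the changelog describes the `block_size:` keyword being used in tests: the same code paths run without allocating gigabyte-sized inputs.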
data/lib/omq/lz4/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: omq-lz4
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.1
 platform: ruby
 authors:
 - Patrik Wenger
@@ -74,7 +74,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 4.0.
+rubygems_version: 4.0.10
 specification_version: 4
 summary: LZ4+TCP transport for OMQ
 test_files: []