omq-zstd 0.4.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/DESIGN.md +162 -0
- data/LICENSE +15 -0
- data/README.md +163 -0
- data/RFC.md +453 -0
- data/lib/omq/transport/zstd_tcp/codec.rb +163 -0
- data/lib/omq/transport/zstd_tcp/connection.rb +162 -0
- data/lib/omq/transport/zstd_tcp/transport.rb +253 -0
- data/lib/omq/transport/zstd_tcp.rb +26 -0
- data/lib/omq/zstd/version.rb +7 -0
- data/lib/omq/zstd.rb +4 -0
- metadata +79 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 9e725966f3527f9ad15396b9dc6b4a182a93a3cdf5b1f6916b6ccb25ba39b146
  data.tar.gz: 4c6bc2ed5ba87c77868ff93781dc9269a8929b78df8e190097bc943e84c2c46c
SHA512:
  metadata.gz: fa50d84a5d8a53a0a0bad2c5df4cee8d9435f04612f260082d02a226740f438d91f33702e29c3b974fda0ef6302080ab96d61a4e804e0abb409b2892a7f24e92
  data.tar.gz: 74119b06fa29d1dc339803c153fe2571c9d401d638a6fdc36cf1b5c64e28f2ba951e935ebbe7203475332d58c1e3a4be2622fb2acd5ac5da9ff713122f10d53b
data/DESIGN.md
ADDED
@@ -0,0 +1,162 @@
# omq-transport-zstd — Implementation Notes

These are implementation details for the `zstd+tcp://` transport plugin.
The wire protocol is specified in [RFC.md](RFC.md); this document covers
the OMQ-specific architecture that doesn't belong in the RFC.

## Why a transport plugin

Compression could live at three layers. Each has a fatal flaw except the
transport layer.

**Socket-level wrapper** (too high). A wrapper sits above routing and
knows nothing about transports. It can't distinguish TCP from IPC or
inproc, so it compresses local connections (pure overhead). It also
can't act on new connections naturally — dict shipping requires
per-connection state, but the wrapper only sees messages after routing
has dispatched them. Reconnect handling requires hooking into
connection lifecycle events, which is awkward from outside.

**ZMTP connection layer** (too low). Embedding compression into each
ZMTP connection means PUB fan-out compresses the same message N times
(once per subscriber connection). The connection layer has no
socket-wide view, so there's no way to share compression work across
connections.

**Transport layer** (right). `zstd+tcp://` makes transport selection
explicit in the endpoint URI. Only TCP connections get compressed. IPC
and inproc are unaffected even on the same socket. Dict lifetime
matches connection lifetime naturally (new connection = new wrapper =
re-ship dict). No negotiation needed — both peers use `zstd+tcp://`.
The Codec is socket-wide (shared across connections via the Dialer or
Listener), so PUB compresses once and reuses the result.
## Architecture

```
Socket      bind("zstd+tcp://...")  /  connect("zstd+tcp://...")
                    │                           │
                    ▼                           ▼
Engine      transport.listener(...)     transport.dialer(...)
  │           → Listener                  → Dialer
  │             holds Codec                 holds Codec
  │             #wrap_connection            #wrap_connection
  ▼                 │                           │
ConnectionLifecycle ▼                           ▼
  ready! ──────► ZstdConnection(conn, codec)
                   │ #write_message   → compress → ship_dict! → delegate
                   │ #receive_message → decode (per-connection recv_dict)
                   │ #respond_to?(:write_wire) → false (forces fan-out path)
```

### Dialer / Listener

Both are stateful transport objects created by the transport module's
`.dialer` / `.listener` factory methods. They hold the Codec and
implement `#wrap_connection(conn)`, which wraps raw ZMTP connections
in a `ZstdConnection`. The Engine stores them in `@dialers` / `@listeners`
(Hashes keyed by endpoint) and calls `#wrap_connection` during
`ConnectionLifecycle#ready!`.

Reconnect calls `dialer.connect` directly — no transport lookup or
opts replay needed. The Dialer holds everything.

### Codec (socket-wide)

One Codec per Dialer or Listener, shared across all connections of
that endpoint. Owns:

- **Compression**: `#compress_parts(parts)` with identity cache
- **Training**: sample collection, `RZstd::Dictionary.train`, dict ID patching
- **Send dict**: `#send_dict_bytes` — the trained or user-supplied dict bytes

### ZstdConnection (per-connection)

A `SimpleDelegator` wrapping a `Protocol::ZMTP::Connection`. Per-connection
state:

- `@dict_shipped` — whether the dict has been sent on this connection
- `@recv_dict` — the peer's dictionary for decompression

Intercepts `#send_message`, `#write_message`, `#write_messages`, and
`#receive_message`. Returns `false` for `respond_to?(:write_wire)` to
force fan-out through `#write_message` (which hits the compression
cache) instead of pre-encoded wire bytes.
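The delegation pattern can be sketched with Ruby's stdlib `delegate` alone. This is a minimal illustration, not the gem's actual class: `FakeConnection` and the `"Z:"` prefix stand in for a real ZMTP connection and real compression.

```ruby
require "delegate"

# Stand-in for Protocol::ZMTP::Connection (hypothetical, for illustration).
class FakeConnection
  def write_message(parts)
    parts # pretend to write; return what went out
  end

  def write_wire(bytes)
    bytes # pre-encoded fast path the wrapper wants to hide
  end
end

# Sketch of the SimpleDelegator pattern: intercept writes, hide #write_wire.
class WrappedConnection < SimpleDelegator
  def write_message(parts)
    compressed = parts.map { |p| "Z:#{p}" } # placeholder for real compression
    __getobj__.write_message(compressed)
  end

  # Claim not to support the pre-encoded wire path, so callers fall back
  # to #write_message (which would go through the compression cache).
  def respond_to?(name, include_private = false)
    return false if name == :write_wire
    super
  end
end

conn = WrappedConnection.new(FakeConnection.new)
conn.respond_to?(:write_wire)  # => false, although the wrapped object supports it
conn.write_message(["hello"])  # => ["Z:hello"]
```

The `respond_to?` override is the whole trick: a caller that probes for the fast path sees `false` and takes the interceptable `#write_message` route instead.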
## Identity-based compression cache

PUB fan-out sends the same frozen message parts Array to every
subscriber's `#write_message`. The Codec exploits this with an
`Object#equal?` check:

```ruby
def compress_parts(parts)
  return @cached_compressed if parts.equal?(@cached_parts)
  # ... compress ...
  @cached_parts = parts
  @cached_compressed = compressed.freeze
end
```

`.equal?` is O(1) — it sees the same frozen Array object produced by
`freeze_message`. The first subscriber pays the compression cost;
subsequent subscribers get the cached result. Net: one compression per
message, N wire writes.
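The skeleton above can be made runnable with a stand-in compressor. This toy uses `Zlib::Deflate` purely so the example is self-contained (the gem uses Zstd; no zstd binding is assumed here):

```ruby
require "zlib"

# Runnable toy version of the identity cache. Zlib stands in for Zstd.
class ToyCodec
  def compress_parts(parts)
    # O(1) object-identity check: same frozen Array => reuse cached result.
    return @cached_compressed if parts.equal?(@cached_parts)

    compressed = parts.map { |p| Zlib::Deflate.deflate(p) }
    @cached_parts = parts
    @cached_compressed = compressed.freeze
  end
end

codec = ToyCodec.new
message = ["tick", "payload-1"].freeze # one frozen Array per published message

a = codec.compress_parts(message) # first subscriber pays the compression cost
b = codec.compress_parts(message) # later subscribers hit the cache
a.equal?(b)                       # same frozen result object, no recompression

codec.compress_parts(["tick", "payload-1"]) # equal content, different object:
                                            # identity check misses, recompressed
```

Note the cache keys on object identity, not content equality: a structurally equal but distinct Array misses, which is exactly the cheap trade this design wants.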
## Dict shipping order

Training can trigger DURING `#compress_parts` (when the sample
threshold is reached mid-compression). The dict must be shipped
AFTER compression but BEFORE the wire write, so the receiver
has the dict before seeing frames that use it:

```ruby
def write_message(parts)
  compressed = @codec.compress_parts(parts) # may trigger training
  ship_dict!                                # ships if newly trained
  __getobj__.write_message(compressed)
end
```
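The ordering can be exercised end to end with a toy codec that "trains" after a fixed number of samples. Every name and the string-tag "compression" here are illustrative, not the gem's code; the point is only that the dict frame lands on the wire before the first frame compressed with it:

```ruby
# Toy codec: training fires inside compress_parts once enough samples arrive.
class ToyCodec
  attr_reader :dict

  def initialize(train_after:)
    @train_after = train_after
    @seen = 0
    @dict = nil
  end

  def compress_parts(parts)
    @seen += parts.size
    @dict = "DICT" if @dict.nil? && @seen >= @train_after # mid-compress training
    parts.map { |p| @dict ? "zstd+dict(#{p})" : "zstd(#{p})" }
  end
end

# Toy connection: compress first, ship dict second, write third.
class ToyConnection
  attr_reader :wire

  def initialize(codec)
    @codec = codec
    @wire = []
    @dict_shipped = false
  end

  def write_message(parts)
    compressed = @codec.compress_parts(parts) # may trigger training
    ship_dict!                                # ships BEFORE the wire write
    @wire << compressed
  end

  def ship_dict!
    return if @dict_shipped || @codec.dict.nil?
    @wire << ["dict-frame(#{@codec.dict})"]
    @dict_shipped = true
  end
end

conn = ToyConnection.new(ToyCodec.new(train_after: 2))
conn.write_message(["a"]) # no dict yet
conn.write_message(["b"]) # training fires; dict frame precedes this message
conn.wire
# => [["zstd(a)"], ["dict-frame(DICT)"], ["zstd+dict(b)"]]
```

Swapping the `ship_dict!` and write lines would put `zstd+dict(b)` on the wire before the dict frame, which is precisely the receiver-side failure the ordering rule prevents.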
## Training heuristics

- **Sample threshold**: 1000 messages OR 100 KiB of plaintext, whichever comes first
- **Sample size cap**: frames > 1024 bytes are skipped (dictionaries primarily benefit small frames)
- **Dict capacity**: 8 KiB (conservative; Zstd recommends a ~100:1 sample-to-dict ratio)
- **Dict ID patching**: auto-trained dicts get a random ID in the user range (32768..2^31-1) to avoid collisions with Zstd's reserved dict IDs
- **Training failure**: if `RZstd::Dictionary.train` raises, training is disabled permanently for the socket. No retry.
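The first three bullets boil down to a small sample-collection gate. A sketch using the constants from this document (the class and method names are illustrative, not the gem's API):

```ruby
TRAIN_MAX_SAMPLES    = 1000       # messages
TRAIN_MAX_BYTES      = 100 * 1024 # 100 KiB of plaintext
TRAIN_MAX_SAMPLE_LEN = 1024       # skip large frames

class SampleCollector
  def initialize
    @samples = []
    @bytes = 0
  end

  # Frames over 1 KiB are skipped: dictionaries primarily benefit small frames.
  def add(frame)
    return if frame.bytesize > TRAIN_MAX_SAMPLE_LEN
    @samples << frame
    @bytes += frame.bytesize
  end

  # Train once 1000 samples OR 100 KiB of plaintext is reached, whichever first.
  def ready?
    @samples.size >= TRAIN_MAX_SAMPLES || @bytes >= TRAIN_MAX_BYTES
  end
end

c = SampleCollector.new
c.add("x" * 2048)               # skipped: over TRAIN_MAX_SAMPLE_LEN
c.ready?                        # => false
1000.times { c.add("small frame") }
c.ready?                        # => true (sample-count threshold hit first)
```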
## Frame dispatch

Three sentinels for per-part decoding:

| Preamble (4 bytes, hex) | Meaning |
|---|---|
| `00 00 00 00` | Uncompressed plaintext (part too small or incompressible) |
| `28 B5 2F FD` | Zstd compressed frame (the standard Zstd magic number) |
| `37 A4 30 EC` | Zstd dictionary — install into the per-connection recv slot |

Dict frames are single-part ZMTP messages. When all parts in a message
are dict frames, `#decode_parts` returns `nil` and `#receive_message`
loops to get the next real message.
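The receiver-side dispatch is a four-byte prefix match. A simplified illustration using the preambles from the table above (the classification helper is hypothetical; the real codec also decompresses, installs dicts, and treats an unknown preamble as a protocol error):

```ruby
# The three 4-byte sentinels, as binary strings.
PLAINTEXT  = "\x00\x00\x00\x00".b
ZSTD_MAGIC = "\x28\xB5\x2F\xFD".b # also the Zstd frame's own magic number
DICT_MAGIC = "\x37\xA4\x30\xEC".b

def classify_part(part)
  case part.byteslice(0, 4)
  when PLAINTEXT  then :plain # strip sentinel, pass body through unchanged
  when ZSTD_MAGIC then :zstd  # decompress (sentinel doubles as frame magic)
  when DICT_MAGIC then :dict  # install bytes after sentinel as the recv dict
  else :unknown               # protocol error in the real codec
  end
end

classify_part("\x00\x00\x00\x00hi".b)          # => :plain
classify_part("\x28\xB5\x2F\xFD".b + "body")   # => :zstd
classify_part("\x37\xA4\x30\xEC".b + "dict")   # => :dict
```

Reusing Zstd's own frame magic as the "compressed" sentinel is what lets the compressed case carry zero per-part overhead: the sentinel bytes are already the first four bytes of a valid Zstd frame.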
## Budget enforcement

The receiver tracks a per-message decompressed byte budget derived from
`max_message_size`. Each part's declared `Frame_Content_Size` is checked
BEFORE decompression. The budget decreases across the parts of a multipart
message, so the total decompressed size can't exceed the limit even if
each individual part is within bounds.
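The decreasing-budget check can be shown in a few lines. This sketch takes the parts' declared sizes as plain integers (the real codec reads each part's `Frame_Content_Size` from the Zstd frame header before decompressing; the helper name is illustrative):

```ruby
# Enforce a shrinking per-message budget over the declared sizes of its parts.
def check_budget!(declared_sizes, max_message_size)
  budget = max_message_size
  declared_sizes.each do |size|
    raise "message exceeds max_message_size" if size > budget
    budget -= size # remaining allowance for the message's later parts
  end
  budget
end

check_budget!([400, 300], 1000)   # accepted; 300 bytes of budget remain
begin
  check_budget!([600, 600], 1000) # each part fits the cap alone; the sum doesn't
rescue RuntimeError => e
  e.message                       # rejected on the second (offending) part
end
```

Because the check runs against declared sizes before any decompression happens, an attacker cannot force the receiver to allocate past the cap even momentarily.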
## Constants

```
MAX_DECOMPRESSED_SIZE  = 16 MiB   (absolute cap per frame)
MAX_DICT_SIZE          = 64 KiB   (reject oversized dicts)
DICT_CAPACITY          = 8 KiB    (training target size)
TRAIN_MAX_SAMPLES      = 1000
TRAIN_MAX_BYTES        = 100 KiB
TRAIN_MAX_SAMPLE_LEN   = 1024     (skip large frames for training)
MIN_COMPRESS_NO_DICT   = 512 B
MIN_COMPRESS_WITH_DICT = 64 B
```
data/LICENSE
ADDED
@@ -0,0 +1,15 @@
ISC License

Copyright (c) 2026, Patrik Wenger

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,163 @@
# omq-zstd

[Gem](https://rubygems.org/gems/omq-zstd)
[License](LICENSE)
[Ruby](https://www.ruby-lang.org)

> **Status:** Draft. Wire format may change before the first tagged release.

Zstandard-compressed TCP transport for [OMQ](https://github.com/paddor/omq).
Pick `zstd+tcp://` instead of `tcp://` and every message part on the wire is
compressed with [Zstandard](https://github.com/facebook/zstd).
Compression is intrinsic to the transport — no negotiation, no socket option,
no payload changes. The ZMTP handshake itself runs over plain TCP; only
post-handshake message parts are compressed.

See [RFC.md](RFC.md) for the wire-format specification and
[DESIGN.md](DESIGN.md) for the implementation rationale.
## Install

```ruby
# Gemfile
gem "omq-zstd"
```

```sh
gem install omq-zstd
```

## Usage

```ruby
require "omq"
require "omq/zstd"

pull = OMQ::PULL.new
push = OMQ::PUSH.new

uri = pull.bind("zstd+tcp://127.0.0.1:0")
push.connect(uri.to_s)

push << ["hello, compressed world"]
pull.receive # => ["hello, compressed world"]
```

Both peers must use the `zstd+tcp://` scheme. A `tcp://` peer cannot talk to
a `zstd+tcp://` peer — they speak different transports.

### Compression level

The default is **`-3`** (negative = Zstd's fast strategy). Override at
bind/connect:

```ruby
pull.bind("zstd+tcp://127.0.0.1:0", level: 3)
push.connect("zstd+tcp://127.0.0.1:5555", level: 9)
```

Levels are per-direction and per-side: each side picks its own send level,
and receiving works regardless of the level the peer chose.

### Dictionaries

Small messages don't compress well on their own. A shared Zstd dictionary
trained on representative payloads gives 2–10× ratios on payloads in the
dozens-to-hundreds-of-bytes range.

**User-supplied dictionary** (out-of-band agreement):

```ruby
dict = File.binread("schema.dict") # produced by `zstd --train`
push.connect("zstd+tcp://127.0.0.1:5555", dict: dict)
```

The sender ships the dictionary to the receiver in-band as a one-shot
single-part message prefixed with the dictionary sentinel
(`37 A4 30 EC`), so the receiver does not need a copy on disk.

**Auto-trained dictionary** (zero config — the default when no `dict:` is
passed): the sender collects up to 1000 samples or 100 KiB (whichever hits
first), trains a dictionary, ships it inline, and switches to dictionary
mode. Until then, payloads are compressed without a dictionary, or sent as
plaintext when below the threshold.

### Compression thresholds

To avoid pessimizing tiny frames, the sender skips compression below:

| Mode | Threshold |
|------|-----------|
| No dictionary | 512 B |
| With dictionary | 64 B |

Below the threshold the part is sent uncompressed (4-byte zero sentinel +
plaintext bytes).
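The threshold rule is a one-line decision on the send side. A sketch with the values from the table (the helper name is illustrative, not part of the gem's API; exact boundary handling at the threshold is an assumption here):

```ruby
MIN_COMPRESS_NO_DICT   = 512 # bytes
MIN_COMPRESS_WITH_DICT = 64  # bytes: dictionaries pay off on much smaller parts

# Should this part be compressed, or sent as sentinel + plaintext?
def compress?(part, dict_available:)
  threshold = dict_available ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT
  part.bytesize >= threshold
end

compress?("x" * 100, dict_available: false) # => false: plaintext sentinel path
compress?("x" * 100, dict_available: true)  # => true: dict makes it worthwhile
```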
### Security limits

The receiver bounds decompression by the socket's own `max_message_size`
— the same knob you'd use on a plain `tcp://` socket. It caps the
**total decompressed size of all parts in a single message**, not each
part individually: the budget starts at `max_message_size` and shrinks
as each part is decoded, so a message whose parts sum to more than the
cap is rejected on the offending part.

```ruby
pull.max_message_size = 1_048_576 # 1 MiB cap on the total message
```

If `max_message_size` is `nil` (OMQ's default, unlimited), there is no
ceiling on decompressed message size. Set a value that matches what
your application would tolerate over plain `tcp://`.

Independent of the message-size knob, the dictionary itself is capped at
64 KiB (within Zstd's recommended dictionary size range). A peer that
attempts to ship a larger dictionary, or to send a message whose
decompressed parts exceed `max_message_size`, gets its connection
dropped — `OMQ::SocketDeadError` surfaces on the next `receive`.

## When to use it

`zstd+tcp://` is worth picking when:

- You're network-bound (cross-region, IoT links, congested LAN).
- Your payloads have repetitive structure (JSON, log lines, protobuf with
  string fields, similar binary records).
- You want compression without touching the message format on either side.

It is **not** worth it for:

- `inproc://` or `ipc://` — there is no wire to shrink. Use
  `zstd+tcp://` only on the connections that actually need it; other
  transports on the same socket are unaffected.
- Already-compressed payloads (gzip, video, encrypted blobs) — the Zstd
  pass adds CPU for no gain.
- Latency-critical sub-microsecond paths — compression adds only
  single-digit microseconds per kilobyte at low levels, but it is not free.

## How it works (in one paragraph)

`require "omq/zstd"` registers the `zstd+tcp` scheme on
`OMQ::Engine.transports`. A `zstd+tcp` socket builds a per-engine
`Codec` (one Zstd dictionary instance shared across all the socket's
connections — fan-out compresses each part exactly once). Each accepted
or dialed TCP connection is wrapped in `ZstdConnection`, a
`SimpleDelegator` over the underlying ZMTP connection that intercepts
`#send_message` / `#write_message` / `#receive_message`. Message parts
go out as a 4-byte sentinel + payload: `00 00 00 00` for plaintext,
`28 B5 2F FD` (the Zstandard frame magic) for a compressed part, or
`37 A4 30 EC` for a one-shot single-part dictionary shipment. The
receiver dispatches on the sentinel, decompresses with bounded
buffers, and hands plaintext parts up to ZMTP unchanged.

## Development

```sh
OMQ_DEV=1 bundle install
OMQ_DEV=1 bundle exec rake test
OMQ_DEV=1 bundle exec ruby --yjit bench/level_sweep.rb
```

## License

[ISC](LICENSE)
|