omq-zstd 0.4.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 9e725966f3527f9ad15396b9dc6b4a182a93a3cdf5b1f6916b6ccb25ba39b146
4
+ data.tar.gz: 4c6bc2ed5ba87c77868ff93781dc9269a8929b78df8e190097bc943e84c2c46c
5
+ SHA512:
6
+ metadata.gz: fa50d84a5d8a53a0a0bad2c5df4cee8d9435f04612f260082d02a226740f438d91f33702e29c3b974fda0ef6302080ab96d61a4e804e0abb409b2892a7f24e92
7
+ data.tar.gz: 74119b06fa29d1dc339803c153fe2571c9d401d638a6fdc36cf1b5c64e28f2ba951e935ebbe7203475332d58c1e3a4be2622fb2acd5ac5da9ff713122f10d53b
data/DESIGN.md ADDED
@@ -0,0 +1,162 @@
1
+ # omq-transport-zstd — Implementation Notes
2
+
3
+ These are implementation details for the `zstd+tcp://` transport plugin.
4
+ The wire protocol is specified in [RFC.md](RFC.md); this document covers
5
+ the OMQ-specific architecture that doesn't belong in the RFC.
6
+
7
+ ## Why a transport plugin
8
+
9
+ Compression could live at three layers. Two of them have a fatal flaw;
10
+ only the transport layer avoids both.
11
+
12
+ **Socket-level wrapper** (too high). A wrapper sits above routing and
13
+ knows nothing about transports. It can't distinguish TCP from IPC or
14
+ inproc, so it compresses local connections (pure overhead). It also
15
+ can't act on new connections naturally — dict shipping requires
16
+ per-connection state, but the wrapper only sees messages after routing
17
+ has dispatched them. Reconnect handling requires hooking into
18
+ connection lifecycle events, which is awkward from outside.
19
+
20
+ **ZMTP connection layer** (too low). Embedding compression into each
21
+ ZMTP connection means PUB fan-out compresses the same message N times
22
+ (once per subscriber connection). The connection layer has no
23
+ socket-wide view, so there's no way to share compression work across
24
+ connections.
25
+
26
+ **Transport layer** (right). `zstd+tcp://` makes transport selection
27
+ explicit in the endpoint URI. Only TCP connections get compressed. IPC
28
+ and inproc are unaffected even on the same socket. Dict lifetime
29
+ matches connection lifetime naturally (new connection = new wrapper =
30
+ re-ship dict). No negotiation needed — both peers use `zstd+tcp://`.
31
+ The Codec is socket-wide (shared across connections via the Dialer or
32
+ Listener), so PUB compresses once and reuses the result.
33
+
34
+ ## Architecture
35
+
36
+ ```
37
+ Socket bind("zstd+tcp://...") / connect("zstd+tcp://...")
38
+ │ │
39
+ ▼ ▼
40
+ Engine transport.listener(...) transport.dialer(...)
41
+ │ → Listener → Dialer
42
+ │ holds Codec holds Codec
43
+ │ #wrap_connection #wrap_connection
44
+ ▼ │ │
45
+ ConnectionLifecycle ▼ ▼
46
+ ready! ──────► ZstdConnection(conn, codec)
47
+ │ #write_message → compress → ship_dict! → delegate
48
+ │ #receive_message → decode (per-connection recv_dict)
49
+ │ #respond_to?(:write_wire) → false (forces fan-out path)
50
+ ```
51
+
52
+ ### Dialer / Listener
53
+
54
+ Both are stateful transport objects created by the transport module's
55
+ `.dialer` / `.listener` factory methods. They hold the Codec and
56
+ implement `#wrap_connection(conn)` which wraps raw ZMTP connections
57
+ in a `ZstdConnection`. The Engine stores them in `@dialers` / `@listeners`
58
+ (Hash keyed by endpoint) and calls `#wrap_connection` during
59
+ `ConnectionLifecycle#ready!`.
60
+
61
+ Reconnect calls `dialer.connect` directly — no transport lookup or
62
+ opts replay needed. The Dialer holds everything.
63
+
64
+ ### Codec (socket-wide)
65
+
66
+ One Codec per Dialer or Listener. Shared across all connections of
67
+ that endpoint. Owns:
68
+
69
+ - **Compression**: `#compress_parts(parts)` with identity cache
70
+ - **Training**: sample collection, `RZstd::Dictionary.train`, dict ID patching
71
+ - **Send dict**: `#send_dict_bytes` — the trained or user-supplied dict bytes
72
+
73
+ ### ZstdConnection (per-connection)
74
+
75
+ `SimpleDelegator` wrapping a `Protocol::ZMTP::Connection`. Per-connection
76
+ state:
77
+
78
+ - `@dict_shipped` — whether the dict has been sent on this connection
79
+ - `@recv_dict` — the peer's dictionary for decompression
80
+
81
+ Intercepts `#send_message`, `#write_message`, `#write_messages`,
82
+ `#receive_message`. Returns `false` for `respond_to?(:write_wire)` to
83
+ force fan-out through `#write_message` (which hits the compression
84
+ cache) instead of pre-encoded wire bytes.
85
+
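The `respond_to?` trick can be illustrated in isolation. The following is a minimal sketch, not the gem's code: `RawConnection`, `HidingWrapper`, and `fan_out` are hypothetical names, and the fan-out loop is assumed to probe for `#write_wire` before using it.

```ruby
require "delegate"

# Hypothetical stand-in for a raw connection that offers a pre-encoded fast path.
class RawConnection
  def write_wire(_bytes)
    :wire
  end

  def write_message(_parts)
    :message
  end
end

# Wrapper that hides the fast path from capability checks while still
# delegating everything else to the wrapped connection.
class HidingWrapper < SimpleDelegator
  def respond_to?(name, include_all = false)
    return false if name.to_sym == :write_wire

    super
  end
end

# Stand-in for a fan-out loop that probes for the fast path (an assumption
# about how the caller chooses between the two write methods).
def fan_out(conn)
  conn.respond_to?(:write_wire) ? conn.write_wire("bytes") : conn.write_message(["part"])
end

raw     = RawConnection.new
wrapped = HidingWrapper.new(raw)

puts fan_out(raw)     # raw connection advertises the fast path
puts fan_out(wrapped) # wrapper forces the write_message path
```

Note that `SimpleDelegator` would still forward a direct `write_wire` call through `method_missing`; the override only changes what callers see when they ask first.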
86
+ ## Identity-based compression cache
87
+
88
+ PUB fan-out sends the same frozen message parts Array to every
89
+ subscriber's `#write_message`. The Codec exploits this with an
90
+ `Object#equal?` check:
91
+
92
+ ```ruby
93
+ def compress_parts(parts)
94
+ return @cached_compressed if parts.equal?(@cached_parts)
95
+ # ... compress ...
96
+ @cached_parts = parts
97
+ @cached_compressed = compressed.freeze
98
+ end
99
+ ```
100
+
101
+ `#equal?` is an O(1) identity check, and fan-out hands every connection the same frozen Array object from `freeze_message`.
102
+ First subscriber pays the compression cost; subsequent subscribers
103
+ get the cached result. Net: one compression per message, N wire
104
+ writes.
105
+
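A self-contained version of the cache makes the hit and miss behaviour visible. This is a sketch under assumptions: `CachingCodec` and `compress_calls` are hypothetical names, and `Zlib` stands in for Zstd only so the example runs on the standard library.

```ruby
require "zlib"

# Single-entry identity cache. Zlib is a stand-in compressor, not what the
# transport uses.
class CachingCodec
  attr_reader :compress_calls

  def initialize
    @cached_parts      = nil
    @cached_compressed = nil
    @compress_calls    = 0
  end

  def compress_parts(parts)
    # Identity check, not equality: only the very same Array object hits.
    return @cached_compressed if parts.equal?(@cached_parts)

    @compress_calls += 1
    compressed = parts.map { |part| Zlib::Deflate.deflate(part).freeze }
    @cached_parts      = parts
    @cached_compressed = compressed.freeze
  end
end

codec   = CachingCodec.new
message = ["tick " * 200].freeze

3.times { codec.compress_parts(message) }    # same object: compressed once
codec.compress_parts(["tick " * 200].freeze) # equal content, new object: a miss

puts codec.compress_calls # prints 2
```

Because the cache holds a single entry, it only pays off when consecutive calls carry the same object, which is the fan-out pattern described above.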
106
+ ## Dict shipping order
107
+
108
+ Training can trigger DURING `#compress_parts` (when the sample
109
+ threshold is reached mid-compression). The dict must be shipped
110
+ AFTER compression but BEFORE the wire write, so the receiver
111
+ has the dict before seeing frames that use it:
112
+
113
+ ```ruby
114
+ def write_message(parts)
115
+ compressed = @codec.compress_parts(parts) # may trigger training
116
+ ship_dict! # ships if newly trained
117
+ __getobj__.write_message(compressed)
118
+ end
119
+ ```
120
+
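The ordering constraint can be checked with stand-ins that record what reaches the wire. This is a sketch with hypothetical classes (`FakeCodec`, `RecordingConnection`, `OrderedWriter`); the fake codec "trains" on its third message purely to trigger the interesting case.

```ruby
# Hypothetical codec: starts using a dictionary from the third message on.
class FakeCodec
  attr_reader :send_dict_bytes

  def initialize
    @seen            = 0
    @send_dict_bytes = nil
  end

  def compress_parts(parts)
    @seen += 1
    @send_dict_bytes = "DICT" if @seen == 3 # training fires mid-call
    tag = @send_dict_bytes ? "dict" : "plain"
    parts.map { |part| "#{tag}:#{part}" }
  end
end

# Hypothetical connection that records every message written to it.
class RecordingConnection
  attr_reader :wire

  def initialize
    @wire = []
  end

  def write_message(parts)
    @wire << parts
  end
end

class OrderedWriter
  def initialize(conn, codec)
    @conn         = conn
    @codec        = codec
    @dict_shipped = false
  end

  def write_message(parts)
    compressed = @codec.compress_parts(parts) # may trigger training
    ship_dict!                                # must precede frames that need it
    @conn.write_message(compressed)
  end

  private

  def ship_dict!
    return if @dict_shipped

    dict = @codec.send_dict_bytes
    return unless dict

    @conn.write_message([dict])
    @dict_shipped = true
  end
end

conn   = RecordingConnection.new
writer = OrderedWriter.new(conn, FakeCodec.new)
%w[a b c d].each { |payload| writer.write_message([payload]) }
p conn.wire
```

The recorded wire shows the dictionary message immediately before the first frame that depends on it, and only once per connection.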
121
+ ## Training heuristics
122
+
123
+ - **Sample threshold**: 1000 messages OR 100 KiB of plaintext, whichever first
124
+ - **Sample size cap**: frames > 1024 bytes are skipped (dictionaries primarily benefit small frames)
125
+ - **Dict capacity**: 8 KiB (conservative; Zstd recommends a sample-to-dict size ratio of roughly 100:1, and the 100 KiB sample cap gives about 12.5:1 at this capacity)
126
+ - **Dict ID patching**: auto-trained dicts get a random ID in the user range (32768..2^31-1), which stays clear of the low (< 32768) and high (>= 2^31) ranges that the Zstd format reserves
127
+ - **Training failure**: if `RZstd::Dictionary.train` raises, training is disabled permanently for the socket. No retry.
128
+
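The thresholds and the ID patch can be sketched as follows. `SampleCollector` and `patch_dict_id` are hypothetical names, not the gem's API. The patch assumes the standard Zstd dictionary layout, where a 4-byte little-endian `Dictionary_ID` follows the 4-byte dictionary magic; how the gem itself performs the patch is not shown in this document.

```ruby
TRAIN_MAX_SAMPLES    = 1000
TRAIN_MAX_BYTES      = 100 * 1024
TRAIN_MAX_SAMPLE_LEN = 1024
USER_DICT_ID_RANGE   = (32_768..(2**31 - 1))

# Hypothetical collector: keeps small frames until either threshold is hit.
class SampleCollector
  attr_reader :samples, :bytes

  def initialize
    @samples = []
    @bytes   = 0
  end

  # Returns true once enough samples are available for training.
  def add(frame)
    unless frame.bytesize > TRAIN_MAX_SAMPLE_LEN
      @samples << frame
      @bytes += frame.bytesize
    end
    ready?
  end

  def ready?
    @samples.size >= TRAIN_MAX_SAMPLES || @bytes >= TRAIN_MAX_BYTES
  end
end

# Overwrites bytes 4..7 (Dictionary_ID, little-endian) in a copy of the dict.
def patch_dict_id(dict_bytes, id = rand(USER_DICT_ID_RANGE))
  patched = dict_bytes.b # binary copy, leaves the input untouched
  patched[4, 4] = [id].pack("V")
  patched
end

collector = SampleCollector.new
collector.add("x" * 2000)                       # over 1024 bytes: skipped
collector.add("y" * 512) until collector.ready? # byte cap hits before sample cap
puts collector.samples.size # prints 200
```

With 512-byte frames the 100 KiB byte cap is reached after 200 samples, well before the 1000-sample cap.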
129
+ ## Frame dispatch
130
+
131
+ Three sentinels for per-part decoding:
132
+
133
+ | Preamble (4 bytes hex) | Meaning |
134
+ |---|---|
135
+ | `00 00 00 00` | Uncompressed plaintext (part too small or incompressible) |
136
+ | `28 B5 2F FD` | Zstd compressed frame (the standard Zstd magic number) |
137
+ | `37 A4 30 EC` | Zstd dictionary — install into per-connection recv slot |
138
+
139
+ Dict frames are single-part ZMTP messages. When all parts in a message
140
+ are dict frames, `#decode_parts` returns `nil` and `#receive_message`
141
+ loops to get the next real message.
142
+
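Dispatching on the preamble amounts to comparing the first four bytes of a part. A minimal sketch, with hypothetical constant and method names (`classify_part` is not part of the gem):

```ruby
PLAINTEXT_SENTINEL = [0x00, 0x00, 0x00, 0x00].pack("C*").freeze
ZSTD_FRAME_MAGIC   = [0x28, 0xB5, 0x2F, 0xFD].pack("C*").freeze
ZSTD_DICT_MAGIC    = [0x37, 0xA4, 0x30, 0xEC].pack("C*").freeze

# Classifies one message part by its 4-byte preamble.
def classify_part(part)
  # Force binary encoding so the comparison is byte-wise.
  case part.byteslice(0, 4).b
  when PLAINTEXT_SENTINEL then :plaintext
  when ZSTD_FRAME_MAGIC   then :compressed
  when ZSTD_DICT_MAGIC    then :dictionary
  else raise ArgumentError, "unknown preamble"
  end
end

parts = [
  PLAINTEXT_SENTINEL + "tiny",
  ZSTD_FRAME_MAGIC + "...frame body...",
  ZSTD_DICT_MAGIC + "...dict body..."
]
parts.each { |part| puts classify_part(part) }
```

Anything that matches none of the three preambles is treated as a protocol error here; what the transport does in that case is defined by the RFC, not by this sketch.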
143
+ ## Budget enforcement
144
+
145
+ The receiver tracks a per-message decompressed byte budget derived from
146
+ `max_message_size`. Each part's declared `Frame_Content_Size` is checked
147
+ BEFORE decompression. The budget decreases across parts of a multipart
148
+ message, so the total decompressed size can't exceed the limit even if
149
+ individual parts are within bounds.
150
+
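The shrinking budget is plain arithmetic over the declared sizes. The sketch below takes those sizes as input instead of parsing frame headers; `check_budget!` and `BudgetExceeded` are hypothetical names, and the separate absolute per-frame cap is left out.

```ruby
class BudgetExceeded < StandardError; end

# declared_sizes: the decompressed size each part claims, checked before any
# decompression happens. Returns the budget left over, or nil when unlimited.
def check_budget!(declared_sizes, max_message_size)
  return nil if max_message_size.nil? # no limit configured

  remaining = max_message_size
  declared_sizes.each_with_index do |size, index|
    if size > remaining
      raise BudgetExceeded, "part #{index} declares #{size} B, #{remaining} B left"
    end

    remaining -= size
  end
  remaining
end

limit = 1_048_576 # 1 MiB

puts check_budget!([400_000, 400_000], limit) # prints 248576

begin
  check_budget!([600_000, 600_000], limit) # each part fits, the sum does not
rescue BudgetExceeded => e
  puts "rejected: #{e.message}"
end
```

The second message is rejected on its second part even though each part alone is under the limit, which is the case a per-part check would miss.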
151
+ ## Constants
152
+
153
+ ```
154
+ MAX_DECOMPRESSED_SIZE = 16 MiB (absolute cap per frame)
155
+ MAX_DICT_SIZE = 64 KiB (reject oversized dicts)
156
+ DICT_CAPACITY = 8 KiB (training target size)
157
+ TRAIN_MAX_SAMPLES = 1000
158
+ TRAIN_MAX_BYTES = 100 KiB
159
+ TRAIN_MAX_SAMPLE_LEN = 1024 (skip large frames for training)
160
+ MIN_COMPRESS_NO_DICT = 512 B
161
+ MIN_COMPRESS_WITH_DICT = 64 B
162
+ ```
data/LICENSE ADDED
@@ -0,0 +1,15 @@
1
+ ISC License
2
+
3
+ Copyright (c) 2026, Patrik Wenger
4
+
5
+ Permission to use, copy, modify, and/or distribute this software for any
6
+ purpose with or without fee is hereby granted, provided that the above
7
+ copyright notice and this permission notice appear in all copies.
8
+
9
+ THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
10
+ WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
11
+ MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
12
+ ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
13
+ WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
14
+ ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
15
+ OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,163 @@
1
+ # omq-zstd
2
+
3
+ [![Gem Version](https://img.shields.io/gem/v/omq-zstd?color=e9573f)](https://rubygems.org/gems/omq-zstd)
4
+ [![License: ISC](https://img.shields.io/badge/License-ISC-blue.svg)](LICENSE)
5
+ [![Ruby](https://img.shields.io/badge/Ruby-%3E%3D%203.3-CC342D?logo=ruby&logoColor=white)](https://www.ruby-lang.org)
6
+
7
+ > **Status:** Draft. Wire format may change before the first tagged release.
8
+
9
+ Zstandard-compressed TCP transport for [OMQ](https://github.com/paddor/omq).
10
+ Pick `zstd+tcp://` instead of `tcp://` and each message part on the wire is
11
+ compressed individually with [Zstandard](https://github.com/facebook/zstd).
12
+ Compression is intrinsic to the transport — no negotiation, no socket option,
13
+ no payload changes. The ZMTP handshake itself runs over plain TCP; only
14
+ post-handshake message parts are compressed.
15
+
16
+ See [RFC.md](RFC.md) for the wire-format specification and
17
+ [DESIGN.md](DESIGN.md) for the implementation rationale.
18
+
19
+ ## Install
20
+
21
+ ```ruby
22
+ # Gemfile
23
+ gem "omq-zstd"
24
+ ```
25
+
26
+ ```sh
27
+ gem install omq-zstd
28
+ ```
29
+
30
+ ## Usage
31
+
32
+ ```ruby
33
+ require "omq"
34
+ require "omq/zstd"
35
+
36
+ pull = OMQ::PULL.new
37
+ push = OMQ::PUSH.new
38
+
39
+ uri = pull.bind("zstd+tcp://127.0.0.1:0")
40
+ push.connect(uri.to_s)
41
+
42
+ push << ["hello, compressed world"]
43
+ pull.receive # => ["hello, compressed world"]
44
+ ```
45
+
46
+ Both peers must use the `zstd+tcp://` scheme. A `tcp://` peer cannot talk to
47
+ a `zstd+tcp://` peer — they speak different transports.
48
+
49
+ ### Compression level
50
+
51
+ Default is **`-3`** (negative = Zstd's fast strategy). Override at bind/connect:
52
+
53
+ ```ruby
54
+ pull.bind("zstd+tcp://127.0.0.1:0", level: 3)
55
+ push.connect("zstd+tcp://127.0.0.1:5555", level: 9)
56
+ ```
57
+
58
+ Per-direction, per-side: each side picks its own send level. Receiving works
59
+ at any level the peer chose.
60
+
61
+ ### Dictionaries
62
+
63
+ Small messages don't compress well on their own. A shared Zstd dictionary
64
+ trained on representative payloads gives 2–10× ratios on payloads in the
65
+ dozens-to-hundreds-of-bytes range.
66
+
67
+ **User-supplied dictionary** (trained offline, loaded by the sender):
68
+
69
+ ```ruby
70
+ dict = File.binread("schema.dict") # produced by `zstd --train`
71
+ push.connect("zstd+tcp://127.0.0.1:5555", dict: dict)
72
+ ```
73
+
74
+ The sender ships the dictionary to the receiver in-band as a one-shot
75
+ single-part message prefixed with the dictionary sentinel
76
+ (`37 A4 30 EC`), so the receiver does not need a copy on disk.
77
+
78
+ **Auto-trained dictionary** (zero config — the default when no `dict:` is
79
+ passed): the sender collects up to 1000 samples or 100 KiB (whichever hits
80
+ first), trains a dictionary, ships it inline, and switches to dictionary
81
+ mode. Until then, payloads are compressed without a dictionary or sent
82
+ plaintext when below the threshold.
83
+
84
+ ### Compression thresholds
85
+
86
+ To avoid pessimizing tiny frames, the sender skips compression below:
87
+
88
+ | Mode | Threshold |
89
+ |------|-----------|
90
+ | No dictionary | 512 B |
91
+ | With dictionary | 64 B |
92
+
93
+ Below the threshold the part is sent uncompressed (4-byte zero sentinel +
94
+ plaintext bytes).
95
+
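The threshold decision and the uncompressed framing can be sketched in a few lines. The helper names are hypothetical, and treating a part of exactly the threshold size as compressible is an assumption of this sketch.

```ruby
MIN_COMPRESS_NO_DICT   = 512
MIN_COMPRESS_WITH_DICT = 64
ZERO_SENTINEL          = [0, 0, 0, 0].pack("C*").freeze

# True when the part is large enough to be worth compressing.
# Assumption: a part exactly at the threshold is compressed.
def compress?(part, dict_active:)
  threshold = dict_active ? MIN_COMPRESS_WITH_DICT : MIN_COMPRESS_NO_DICT
  part.bytesize >= threshold
end

# Frames a part that is sent uncompressed: 4 zero bytes, then the plaintext.
def frame_plaintext(part)
  ZERO_SENTINEL + part.b
end

small = "x" * 100

puts compress?(small, dict_active: false) # below 512 B without a dictionary
puts compress?(small, dict_active: true)  # above 64 B with a dictionary
puts frame_plaintext("hi").bytesize       # 4-byte sentinel + 2 payload bytes
```

The same 100-byte part is skipped before a dictionary is available and compressed once one is active, which is why the switch to dictionary mode matters for small messages.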
96
+ ### Security limits
97
+
98
+ The receiver bounds decompression by the socket's own `max_message_size`
99
+ — the same knob you'd use on a plain `tcp://` socket. It caps the
100
+ **total decompressed size of all parts in a single message**, not each
101
+ part individually: the budget starts at `max_message_size` and shrinks
102
+ as each part is decoded, so a message whose parts sum to more than the
103
+ cap is rejected on the offending part.
104
+
105
+ ```ruby
106
+ pull.max_message_size = 1_048_576 # 1 MiB cap on the total message
107
+ ```
108
+
109
+ If `max_message_size` is `nil` (OMQ's default, unlimited), there is no
110
+ ceiling on decompressed message size. Set a value that matches what
111
+ your application would tolerate over plain `tcp://`.
112
+
113
+ Independent of the message-size knob, the dictionary itself is capped at
114
+ 64 KiB (Zstd's recommended dictionary size range). If a peer ships a
115
+ larger dictionary, or sends a message whose decompressed parts exceed
116
+ `max_message_size`, the receiver drops the connection and
117
+ `OMQ::SocketDeadError` surfaces on the next `receive`.
118
+
119
+ ## When to use it
120
+
121
+ `zstd+tcp://` is worth picking when:
122
+
123
+ - You're network-bound (cross-region, IoT links, congested LAN).
124
+ - Your payloads have repetitive structure (JSON, log lines, protobuf with
125
+ string fields, similar binary records).
126
+ - You want compression without touching the message format on either side.
127
+
128
+ It is **not** worth it for:
129
+
130
+ - `inproc://` or `ipc://` — irrelevant; there is no wire to shrink. Use
131
+ `zstd+tcp://` only on the connections that actually need it. Other
132
+ transports on the same socket are unaffected.
133
+ - Already-compressed payloads (gzip, video, encrypted blobs) — the Zstd
134
+ pass adds CPU for no gain.
135
+ - Latency-critical sub-microsecond paths — compression adds only single-digit
136
+ microseconds per kilobyte at low levels, but it is not free.
137
+
138
+ ## How it works (in one paragraph)
139
+
140
+ `require "omq/zstd"` registers the `zstd+tcp` scheme on
141
+ `OMQ::Engine.transports`. A `zstd+tcp` socket builds a per-engine
142
+ `Codec` (one Zstd dictionary instance shared across all the socket's
143
+ connections — fan-out compresses each part exactly once). Each accepted
144
+ or dialed TCP connection is wrapped in `ZstdConnection`, a
145
+ `SimpleDelegator` over the underlying ZMTP connection that intercepts
146
+ `#send_message` / `#write_message` / `#receive_message`. Message parts
147
+ go out as a 4-byte sentinel + payload: `00 00 00 00` for plaintext,
148
+ `28 B5 2F FD` (Zstandard frame magic) for a compressed part, or
149
+ `37 A4 30 EC` for a one-shot single-part dictionary shipment. The
150
+ receiver dispatches on the sentinel, decompresses with bounded
151
+ buffers, and hands plaintext parts up to ZMTP unchanged.
152
+
153
+ ## Development
154
+
155
+ ```sh
156
+ OMQ_DEV=1 bundle install
157
+ OMQ_DEV=1 bundle exec rake test
158
+ OMQ_DEV=1 bundle exec ruby --yjit bench/level_sweep.rb
159
+ ```
160
+
161
+ ## License
162
+
163
+ [ISC](LICENSE)