hyperion-rb 1.4.0 → 1.4.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +19 -2
- data/lib/hyperion/metrics.rb +54 -18
- data/lib/hyperion/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 64221d424c994e0757262dca6984cd2b65bf9ec3da6c85094daa717c716da906
+  data.tar.gz: 236188ff1777b49178bb4b2bfe75fbea9309ec6f5d56ec01329943dfa1afbb36
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9b196f8d046c828546f8f5fbac91e0300e8704a70f11b47096a9b0fb05caf2ec34f63e2f33768ab53b41413aaa4e957ac499aa4779ecd54b6a987deef7fd464e
+  data.tar.gz: cb3d64f757736bcbc2632492e24b0794a67c98bfdcae7a1e53d89f583cf2eecc404e060fd3ec10e441ddfdcd11273c41e0d46fa9b80be904b29a7718fe0258ba
data/CHANGELOG.md
CHANGED

@@ -1,5 +1,10 @@
 # Changelog
 
+## [1.4.1] - 2026-04-27
+
+### Fixed
+- **`Hyperion::Metrics` fiber-key bug** — pre-1.4.1 the metrics module stored counters via `Thread.current[:key]`, which is FIBER-local in Ruby 1.9+. Under an `Async::Scheduler` (TLS / h2 / `--async-io` plain HTTP/1.1) every handler fiber got its own private counters Hash that `Hyperion.stats` could never see — increments were stranded, and the dispatch counters, `:bytes_written`, etc. read as zero from any non-handler-fiber observer (including the Prometheus `/-/metrics` exporter when scraped from a different fiber). Switched to `Thread#thread_variable_*` (truly thread-local across fibers) plus direct counter-Hash list storage so snapshots also survive thread death. Verified via 4 new specs: cross-fiber on same thread, cross-thread, cross-fiber-on-different-thread, many-fibers-on-same-thread (210 increments aggregated correctly). Surfaced by hyperion-async-pg 0.4.0's bench round, which couldn't read `:requests_async_dispatched` from spec assertions even though the increments were firing.
+
 ## [1.4.0] - 2026-04-27
 
 Default-behaviour change for TLS users: HTTP/1.1-over-TLS now dispatches inline on the calling fiber instead of hopping through the worker thread pool. Fiber-cooperative libraries (`hyperion-async-pg`, `async-redis`) work on the TLS h1 path without `--async-io`. No code-path changes for plain HTTP/1.1 default behaviour.
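The fiber-key bug has a minimal repro in plain Ruby, no Async scheduler required. Despite the receiver, `Thread#[]`/`Thread#[]=` are fiber-local storage; `Thread#thread_variable_get`/`set` are the true thread-locals. A sketch (names like `:counter` are illustrative, not from the gem):

```ruby
# Thread#[] is fiber-local despite its name; thread_variable_* is the
# true thread-local. A fresh fiber on the same thread sees nil through
# Thread#[], but sees the shared value through thread_variable_get.
Thread.current[:counter] = 1                      # fiber-local slot
Thread.current.thread_variable_set(:counter, 1)   # genuinely thread-local

fiber_local_seen  = nil
thread_local_seen = nil

Fiber.new do
  fiber_local_seen  = Thread.current[:counter]                      # new fiber, new slot
  thread_local_seen = Thread.current.thread_variable_get(:counter)  # shared across fibers
end.resume

p fiber_local_seen   # => nil
p thread_local_seen  # => 1
```

Under an `Async::Scheduler` every handler runs in its own fiber, so the first pattern strands each handler's counters exactly as the changelog entry describes.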
data/README.md
CHANGED

@@ -95,7 +95,7 @@ Ubuntu 24.04 / 16 vCPU / Ruby 3.3.3, Postgres 17 over WAN, `wrk -t4 -c200 -d20s`
 1. **Linear scaling with pool size** under `--async-io` — `r/s ≈ pool × 12` on this WAN bench. Single-worker pool=200 hits 2381 r/s, **42× Puma `-t 5`** and **5.9× Puma's best** (`-t 30`).
 2. **Mixed workload doesn't kill the win** — Hyperion `--async-io` pool=128 actually goes *up* on mixed (1740 vs 1344 r/s) because CPU work overlaps other fibers' PG-wait windows. This is the honest "what happens to a real Rails handler" answer.
 3. **Hyperion ≈ Falcon within 3-7%** across pool sizes; both fiber-native architectures extract similar value from `hyperion-async-pg`.
-4. **RSS at single-worker scale isn't the architectural moat** — Linux thread stacks are demand-paged; PG connection buffers dominate RSS at pool sizes ≤ 200. The
+4. **RSS at single-worker scale isn't the architectural moat** — Linux thread stacks are demand-paged; PG connection buffers dominate RSS at pool sizes ≤ 200. The architectural win is **handler concurrency under load**, not idle memory: Hyperion's fiber path runs thousands of in-flight handler invocations per OS thread, so wait-bound handlers don't queue at `max_threads`. See [Concurrency at scale](#concurrency-at-scale-architectural-advantages) for both the throughput-under-load row and a measured 10k-idle-keepalive RSS sweep against Puma and Falcon.
 5. **`-w 4` cold-start caveat** — multi-worker p99 inflates because the bench rackup uses lazy per-process pool init (each worker pays full pool fill on its first request). Production apps avoid this with `on_worker_boot { Hyperion::AsyncPg::FiberPool.new(...).fill }`.
 
 Three things must all be true to get this win:

@@ -176,7 +176,21 @@ These workloads demonstrate structural differences between Hyperion's fiber-per-
 | Hyperion `-w 1 -t 10` | 93,090 | 6,910 | 3,446 | 27.01 s |
 | Puma `-w 1 -t 10:10` | 77,340 | 22,660 | 706 | 109.59 s |
 
-
+At 10k concurrent connections under load Hyperion serves **~5× the throughput** of Puma with **~20% fewer dropped requests**. The per-connection bookkeeping cost is bounded by fiber size, not by `max_threads` — workers don't get pinned to long-lived sockets, so a slow handler doesn't starve other connections.
+
+**Memory at idle keep-alive scale — 10,000 idle HTTP/1.1 keep-alive connections:**
+
+Each client opens a TCP connection, sends one keep-alive GET, drains the response, then holds the socket open without sending a follow-up request. RSS is sampled once a second across a 30s idle hold. Same hello-world rackup, single worker, no TLS. Hyperion runs with `async_io true` (fiber-per-connection on the plain HTTP/1.1 path).
+
+| | held | dropped | peak RSS | RSS after drain |
+|---|---:|---:|---:|---:|
+| Hyperion `-w 1 -t 5 --async-io` | 10,000 / 10,000 | 0 | 173 MB | 155 MB |
+| Puma `-w 0 -t 100` | 10,000 / 10,000 | 0 | 101 MB | 104 MB |
+| Falcon `--count 1` | 10,000 / 10,000 | 0 | 429 MB | 440 MB |
+
+All three hold 10k idle conns without OOMing or dropping — the "MB-per-thread" intuition that thread-based servers can't reach this scale doesn't survive contact with Linux's demand-paged thread stacks plus Puma's reactor-based keep-alive handling. Per-conn RSS lands at ~14 KB (Hyperion fiber + parser state), ~7 KB (Puma reactor entry + tiny thread share), ~36 KB (Falcon Async::Task + protocol-http stack). Bounded, not unbounded — for all three.
+
+The architectural difference shows up under **load**, not at idle: Puma can only run `max_threads` handler invocations concurrently, so wait-bound handlers (DB, HTTP, Redis) starve at request concurrency higher than `max_threads`. Hyperion's fiber-per-connection model + `--async-io` gives one OS thread thousands of in-flight handler executions, paired with [hyperion-async-pg](https://github.com/exodusgaming-io/hyperion-async-pg) for non-blocking DB. The 10k-conn throughput row above (5× Puma) is the consequence — same idle RSS shape, very different behaviour once the handlers actually do work.
 
 **HTTP/2 multiplexing — 1 connection × 100 concurrent streams (handler sleeps 50 ms):**
 
@@ -194,6 +208,9 @@ Hyperion fans 100 in-flight streams across separate fibers within a single TCP c
 bundle exec ruby bench/compare.rb
 HYPERION_WORKERS=4 PUMA_WORKERS=4 FALCON_COUNT=4 bundle exec ruby bench/compare.rb
 
+# Idle keep-alive RSS sweep (1k / 5k / 10k conns, 30s hold per server)
+./bench/keepalive_memory.sh
+
 # Real Rails / Grape: see bench/db.ru for the schema
 ```
 
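The keep-alive client behaviour described in the README addition (open, send one GET, drain the response, then hold the socket) can be sketched in plain Ruby. This is not the bench script itself; the in-process `TCPServer` stands in for the real rackup and the ephemeral port is arbitrary:

```ruby
require "socket"

# Throwaway loopback server standing in for the real HTTP server.
server = TCPServer.new("127.0.0.1", 0)
port   = server.addr[1]

Thread.new do
  conn = server.accept
  conn.gets("\r\n\r\n")  # read the request head up to the blank line
  payload = "ok"
  conn.write("HTTP/1.1 200 OK\r\nContent-Length: #{payload.bytesize}\r\n" \
             "Connection: keep-alive\r\n\r\n#{payload}")
  # Deliberately never closed: the connection idles, as in the bench.
end

# One idle keep-alive client: connect, one GET, drain, then hold.
sock = TCPSocket.new("127.0.0.1", port)
sock.write("GET / HTTP/1.1\r\nHost: localhost\r\nConnection: keep-alive\r\n\r\n")
head = sock.gets("\r\n\r\n")                     # response head
len  = head[/Content-Length: (\d+)/i, 1].to_i    # how much body to drain
body = sock.read(len)                            # drain exactly the body...
p body  # => "ok"
# ...then simply keep `sock` open without sending another request.
```

Draining exactly `Content-Length` bytes matters: a client that leaves the body unread would keep server buffers hot and skew the idle-RSS numbers.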
data/lib/hyperion/metrics.rb
CHANGED

@@ -7,6 +7,22 @@ module Hyperion
   # all threads that have ever incremented (one short mutex section, only
   # taken when the operator asks for stats).
   #
+  # Storage: counters live behind `Thread#thread_variable_*`, which is the
+  # only TRUE thread-local in Ruby 1.9+ — `Thread.current[:key]` is in fact
+  # FIBER-local, so under an `Async::Scheduler` (TLS path, h2 streams, the
+  # 1.3.0+ `--async-io` plain HTTP/1.1 path) every handler fiber would get
+  # its own private counters Hash that `snapshot` could never find.
+  # Verified with hyperion-async-pg 0.4.0's bench round; before the fix
+  # the dispatch counters dropped requests entirely under `--async-io` and
+  # an external scrape (Prometheus exporter on a different fiber than the
+  # handler) saw the dispatch buckets at zero.
+  #
+  # Cross-fiber races on the same OS thread: the `+=` is technically read-
+  # modify-write, but Ruby's fiber scheduler only preempts at IO boundaries
+  # (Fiber.scheduler-aware system calls), and `Hash#[]=` is purely Ruby —
+  # no preemption mid-increment, no torn writes. Two fibers cannot
+  # interleave a single `+=` on the same OS thread.
+  #
   # Reset semantics: counters monotonically increase. Operators that want
   # rate-of-change should snapshot, sleep, snapshot, diff.
   #
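The no-torn-writes claim in the comment above can be exercised directly. In this sketch, explicit `Fiber.yield` stands in for the IO-boundary switches a real `Fiber.scheduler` would perform, and the 21 × 10 shape mirrors the 210-increment spec mentioned in the changelog:

```ruby
counters = Hash.new(0)

# 21 fibers, 10 increments each. Every fiber yields between increments,
# forcing a switch at the only points a fiber scheduler could switch —
# never in the middle of the `+=` read-modify-write itself.
fibers = Array.new(21) do
  Fiber.new do
    10.times do
      counters[:hits] += 1  # completes atomically w.r.t. other fibers
      Fiber.yield           # cooperative switch point (stand-in for IO)
    end
  end
end

# Round-robin resume until every fiber has finished.
until fibers.none?(&:alive?)
  fibers.each { |f| f.resume if f.alive? }
end

p counters[:hits]  # => 210
```

The same code with OS threads instead of fibers would need the mutex, which is exactly why the class keeps one for registration and snapshot while leaving the per-thread increment path lock-free.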
@@ -14,16 +30,40 @@ module Hyperion
   # Hyperion.stats -> Hash with all current values across all threads.
   class Metrics
     def initialize
-
-
-      #
-      #
+      # Direct list of every per-thread counters Hash ever allocated through
+      # this Metrics instance. We hold the Hash refs ourselves (instead of
+      # holding Thread refs and looking the Hash up via thread-local
+      # storage) so snapshot survives thread death — counters from a
+      # short-lived worker that already exited still aggregate. Tiny per-
+      # thread footprint (one Hash + one slot in this Array).
+      @thread_counters = []
+      @counters_mutex = Mutex.new
+      # Per-instance thread-local key so spec runs that build fresh Metrics
+      # objects don't share state across examples.
       @thread_key = :"__hyperion_metrics_#{object_id}__"
     end
 
-    # Hot path: one
+    # Hot path: one thread-variable lookup + one hash op. No mutex on the
+    # increment fast path; the mutex is taken only on first allocation per
+    # OS thread (very rare) and on snapshot.
+    #
+    # Storage uses Thread#thread_variable_*, which is the only TRUE thread-
+    # local in Ruby 1.9+ — Thread.current[:key] is in fact FIBER-local, so
+    # under an Async::Scheduler (TLS path, h2 streams, the 1.3.0+ --async-io
+    # plain HTTP/1.1 path) every handler fiber would get its own private
+    # counters Hash that snapshot could never aggregate. Verified with
+    # hyperion-async-pg 0.4.0's bench round; before the fix the dispatch
+    # counters dropped requests under --async-io.
+    #
+    # Cross-fiber races on the same OS thread: the `+=` is read-modify-write,
+    # but Ruby's fiber scheduler only preempts at IO boundaries (Fiber-
+    # scheduler-aware system calls). Hash#[]= is purely Ruby — no
+    # preemption mid-increment, no torn writes. Two fibers cannot
+    # interleave a single `+=` on the same OS thread.
     def increment(key, by = 1)
-
+      thread = Thread.current
+      counters = thread.thread_variable_get(@thread_key)
+      counters = register_thread_counters(thread) if counters.nil?
       counters[key] += by
     end
 

@@ -37,14 +77,9 @@ module Hyperion
 
     def snapshot
       result = Hash.new(0)
-      @
-
-
-        counters = t[@thread_key]
-        next unless counters
-
-        counters.each { |k, v| result[k] += v }
-      end
+      counters_snapshot = @counters_mutex.synchronize { @thread_counters.dup }
+      counters_snapshot.each do |counters|
+        counters.each { |k, v| result[k] += v }
       end
       result.default = nil
       result

@@ -52,16 +87,17 @@ module Hyperion
 
     # Tests can call .reset! between examples to avoid cross-spec leakage.
     def reset!
-      @
-      @
+      @counters_mutex.synchronize do
+        @thread_counters.each(&:clear)
       end
     end
 
     private
 
-    def register_thread_counters
+    def register_thread_counters(thread)
       counters = Hash.new(0)
-
+      thread.thread_variable_set(@thread_key, counters)
+      @counters_mutex.synchronize { @thread_counters << counters }
       counters
     end
   end
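Condensing the hunks above into a runnable sketch shows why the direct-list design makes snapshots survive thread death. `MiniMetrics` is an illustrative name, not part of the gem, but the storage scheme mirrors the 1.4.1 code:

```ruby
class MiniMetrics
  def initialize
    @key   = :"__mini_metrics_#{object_id}__"  # per-instance thread-variable key
    @mutex = Mutex.new
    @all   = []  # direct refs to every per-thread counters Hash
  end

  # Hot path: thread-variable lookup + hash bump, no mutex.
  def increment(key, by = 1)
    t = Thread.current
    counters = t.thread_variable_get(@key)
    unless counters
      counters = Hash.new(0)
      t.thread_variable_set(@key, counters)
      @mutex.synchronize { @all << counters }  # rare, locked registration
    end
    counters[key] += by
  end

  # Aggregates the held Hash refs directly, so counters from threads that
  # have already exited still count toward the total.
  def snapshot
    out = Hash.new(0)
    @mutex.synchronize { @all.dup }.each { |c| c.each { |k, v| out[k] += v } }
    out
  end
end

m = MiniMetrics.new
Thread.new { m.increment(:reqs, 3) }.join  # worker thread exits before snapshot
m.increment(:reqs, 2)
p m.snapshot[:reqs]  # => 5
```

Had the class held Thread objects and read each thread's variable at snapshot time, the dead worker's 3 increments would be unreachable; holding the counters Hashes themselves sidesteps that.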
data/lib/hyperion/version.rb
CHANGED