hyperion-rb 1.6.0 → 1.6.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: eecb708980287e22968f1405778e91632ecd26eaae33845891ee00e94adf6839
- data.tar.gz: e433cd1bff6039cd8c30bbc9fdce6cd4176ca1e03d33362f9e67ceddd7d89880
+ metadata.gz: a134460ea9e2ed0f20a2c063b643593f9198d5206f382f8669e332b68510c16d
+ data.tar.gz: c64216c7532fc4f3f2476459abc0e9df41473ae2774c9806a33a6c8186925ed6
  SHA512:
- metadata.gz: c02db1008beea0246601193a4ca18e8ac931f9a4952a2571a88ecc24028a838707a55e0ecf2159ba53f76b4a3c32c94c635afeb957304ae16ca2e0c2951ef7a3
- data.tar.gz: 8c3bb4674315a4c41cd4bf38296ab2c991b113f3bcb0ced20ae3a143572f289c3d9a00ce258ba317045a5427b4b7a7986bf6de9f499c84d0ec61b505d8563518
+ metadata.gz: d5dcaf718cc108753c757e8e6b44cd2072e018e3c6133fc38f562285c45ab7b90f5be4819d64e5a9ea32de451622cb7b9597a16b053a0cee0c7f1a3b1fb85355
+ data.tar.gz: 0136f9605c114a90faae7169bef1027dc06f1d7ca520a8a38e4cf4d7ed5b8b9b9eda87ca86611c74e8033d9ede6ae74b0ab7ca768e8f2c6ebf042d30c1f0e783
data/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
  # Changelog
 
+ ## [1.6.2] - 2026-04-27
+
+ Doc release. No code changes.
+
+ ### Added
+ - **README "Production tuning (real Rails apps)" section** — distilled from a real-app bench against the Exodus platform (Rails 8.1, on-LAN PG + Redis at ~0.3 ms RTT, `-w 4 -t 10`, `wrk -t8 -c200 -d30s`). Headline: the simplest drop-in (`hyperion -t N -w M` matching Puma's existing `-t/-w`) is the right answer; `+9%` rps and `28×` lower p99 on health endpoints over the same Puma config, no other knobs needed. Documents which synthetic-bench knobs (`-t 30`, `--yjit`, larger `RAILS_POOL`, `--async-io`) DON'T help on real Rails and why (GVL contention past `-t 10`, dev-mode YJIT instability, pool rarely the bottleneck, sync Redis blocking ahead of async-pg yields). Saves operators from the trap of "tune harder = faster" — the simple drop-in IS the answer on real workloads.
+
+ ## [1.6.1] - 2026-04-27
+
+ Audit follow-up from the [BENCH_2026_04_27.md](docs/BENCH_2026_04_27.md) sweep. No code-path changes; doc surface and operator-UX polish.
+
+ ### Added
+ - **`## Operator guidance` README section** — concrete "when do I pick which config?" tables. Translates the bench numbers into decisions: `-w 1 + larger pool` vs `-w N + smaller pool` for I/O-bound (multi-worker is 2.6× memory for 0.77× rps if you pick wrong on PG-wait); the `--async-io` decision tree (default OFF unless you're paired with a fiber-cooperative library); how to read p50 vs p99 (tail wins are 5-200× larger than the rps story suggests — size capacity by p99).
+ - **Boot-time advisory warn for orphan `--async-io`** — if `async_io: true` is set but no fiber-cooperative library is loaded (`hyperion-async-pg`, `async-redis`, `async-http`), Hyperion logs a single advisory warn at boot pointing at the operator-guidance docs. The setting is still honoured; the warn just helps operators who flipped the flag expecting a free perf bump (bench showed `--async-io` on hello-world = 47% rps regression + 3.65 s p99 spike).
+ - **4 new specs in `spec/hyperion/cli_async_io_warn_spec.rb`** covering the four warn decision paths (true + no library, false, nil, true + library detected via `stub_const`).
+
  ## [1.6.0] - 2026-04-27
 
  Two parallel improvements landing in 1.6.0:
data/README.md CHANGED
@@ -25,7 +25,9 @@ bundle exec hyperion config.ru
 
  ## Benchmarks
 
- All numbers are real wrk runs against published Hyperion configs. Hyperion ships **with default-ON structured access logs**; Puma comparisons use Puma defaults (no per-request log emission). Each section is stamped with the Hyperion version it was measured against — newer versions (1.3.0+ `--async-io`, 1.4.0+ TLS h1 inline, 1.4.1+ Metrics fiber-key fix) preserve or improve these numbers; we re-run the headline configs each release and have not seen regressions on these workloads.
+ All numbers are real wrk runs against published Hyperion configs. Hyperion ships **with default-ON structured access logs**; Puma comparisons use Puma defaults (no per-request log emission). Each section is stamped with the Hyperion version it was measured against — newer versions (1.3.0+ `--async-io`, 1.4.0+ TLS h1 inline, 1.4.1+ Metrics fiber-key fix, 1.6.0+ HTTP/2 writer fiber + 3 C-ext additions) preserve or improve these numbers; we re-run the headline configs each release and have not seen regressions on these workloads.
+
+ > **Comprehensive matrix for 1.6.0 + hyperion-async-pg 0.5.0 (16-vCPU Linux, 9 workloads × 25+ configs)**: see [`docs/BENCH_2026_04_27.md`](docs/BENCH_2026_04_27.md). Headline: 98,818 r/s on hello `-w 16`, 21,215 r/s `-w 4` at p99 < 2 ms, 2,180 r/s on a 50 ms-waiting PG workload (4.1× the best Puma), 1,667 req/s HTTP/2 multiplexed at 0 errors, 155 MB RSS for 10k idle keep-alive connections.
 
  ### Hello-world Rack app
 
@@ -320,6 +322,85 @@ Strict DSL: unknown methods raise `NoMethodError` at boot — typos surface imme
 
  A documented sample lives at [`config/hyperion.example.rb`](config/hyperion.example.rb).
 
+ ## Operator guidance
+
+ Concrete tradeoffs distilled from [`docs/BENCH_2026_04_27.md`](docs/BENCH_2026_04_27.md). If the bench numbers cited below feel surprising, check that doc for the full matrix + caveats.
+
+ ### When to use `-w N`
+
+ | Workload shape | Recommended | Why |
+ |---|---|---|
+ | **Pure I/O-bound** (PG / Redis / external HTTP, no significant CPU) | `-w 1` + larger pool | Bench: `-w 1 pool=200` = 87 MB / 2,180 r/s vs `-w 4 pool=64` = 224 MB / 1,680 r/s. **2.6× more memory, 0.77× rps** if you pick multi-worker on a wait-bound workload. |
+ | **Pure CPU-bound** (heavy JSON / template render / image processing) | `-w N` matching CPU count | Each worker's accept loop is single-threaded under `--async-io`; multi-worker gives CPU-parallelism. Bench: `-w 16 -t 5` hits 98,818 r/s on a 16-vCPU box, 4.7× a `-w 1` ceiling on the same hardware. |
+ | **Mixed** (Rails-shaped: ~5 ms CPU + 50 ms PG wait per request) | `-w N/2` (half cores) + medium pool | Lets CPU work parallelise while keeping per-worker memory tractable. Bench `pg_mixed.ru` at `-w 4 -t 5 pool=128` = 1,740 r/s with no cold-start spike (ForkSafe `prefill_in_child: true`). |
+
+ Multi-worker on PG-wait workloads is the **wrong** default for most apps — the headline rps doesn't justify the memory and PG-connection cost. Verify your shape with the bench before scaling out.
+
+ ### When to use `--async-io`
+
+ ```
+ Are you using a fiber-cooperative I/O library?
+ (hyperion-async-pg, async-redis, async-http)
+
+                ┌─────────────┴─────────────┐
+               yes                          no
+                │                           │
+  Pair with a fiber-aware        Leave --async-io OFF.
+  connection pool                Default thread-pool dispatch
+  (FiberPool, async-pool —       is faster for synchronous
+  NOT connection_pool gem,       Rails apps. Bench: --async-io
+  which uses non-fiber Mutex).   on hello-world = 47% rps
+                │                regression + p99 spike to
+  Set --async-io.                3.65 s under no-yield workloads.
+  Pool size is the real          No reason to flip the flag.
+  concurrency knob; -t is
+  decorative for wait-bound.
+ ```
+
+ Hyperion warns at boot if you set `--async-io` without any fiber-cooperative library loaded. The setting is still honoured; the warn just nudges operators who flipped it expecting a free perf bump.
+
+ ### Tuning `-t` and pool sizes
+
+ - **Without `--async-io`** (sync server, default): `-t` is the concurrency knob. Each in-flight request holds an OS thread; pool size should match `-t`. Bench shows Puma-style behaviour — at 200 wrk conns hitting a 5-thread server, queue depth dominates p99 (Hyperion `-t 5 -w 1` p50 = 0.95 ms vs Puma's same shape at 59.5 ms — Hyperion's queueing is cheaper but the model still serializes at `-t`).
+ - **With `--async-io` + a fiber-aware pool**: pool size is the concurrency knob. `-t` is decorative for wait-bound workloads; one accept-loop fiber serves all in-flight queries via the pool. Near-linear scaling: pool=64 → ~780 r/s, pool=128 → ~1,344 r/s, pool=200 → ~2,180 r/s on 50 ms PG queries.
+ - **Pool over WAN**: if `PG.connect` round-trip is >50 ms, expect pool fill at startup to take `pool_size / parallel_fill_threads × RTT`. `hyperion-async-pg 0.5.1+` auto-scales `parallel_fill_threads` so pool=200 fills in ~1-2 s.
+
+ ### How to read p50 vs p99
+
+ Tail latency tells the queueing story; rps tells the throughput story. Hyperion's tail wins are **always** bigger than its rps wins — sometimes the rps numbers look close to a competitor while p99 is 5-200× lower:
+
+ | Workload | Hyperion rps / p99 | Closest competitor | rps ratio | p99 ratio |
+ |---|---|---|---:|---:|
+ | Hello `-w 4` | 21,215 r/s / 1.87 ms | Falcon 24,061 / 9.78 ms | 0.88× | **5.2× lower** |
+ | CPU JSON `-w 4` | 15,582 r/s / 2.47 ms | Falcon 18,643 / 13.51 ms | 0.84× | **5.5× lower** |
+ | Static 1 MiB | 1,919 r/s / 4.22 ms | Puma 2,074 / 55 ms | 0.93× | **13× lower** |
+ | PG-wait `-w 1` pool=200 | 2,180 r/s / 668 ms | Puma 530 r/s + 200 timeouts | **4.1×** | qualitative crush |
+
+ **Size capacity by p99, not by mean.** Throughput peaks are easy to fake under controlled bench conditions; tail latency reflects what your slowest user actually experiences when the load balancer fans them onto a busy worker.
+
+ ### Production tuning (real Rails apps)
+
+ Distilled from a real-app bench against the [Exodus platform](https://github.com/andrew-woblavobla/hyperion/blob/master/docs/BENCH_2026_04_27.md) (Rails 8.1, on-LAN PG + Redis at ~0.3 ms RTT, `-w 4 -t 10`, `wrk -t8 -c200 -d30s`). The headline finding: the **simplest drop-in is the right answer**, and the additional knobs operators reach for first don't help on real Rails.
+
+ **Recommended for migrating from Puma**: `hyperion -t N -w M` matching your current Puma `-t N:N -w M`. No other flags. That gives you (vs Puma at the same `-t/-w`):
+
+ - **+9% rps on lightweight endpoints** (matches the 5-10% per-request CPU savings the rest of the bench section documents).
+ - **28× lower p99 on health-style endpoints** — the queue-of-doom shape Puma exhibits under sustained 200-conn load doesn't reproduce on Hyperion's worker-owns-connection model.
+ - **3.8× lower p99 on PG-touching endpoints**.
+ - **Same RSS, same operator surface** — you keep all your existing config, monitoring, and deploy scripts.
+
+ **Knobs that help on synthetic benches but NOT on real Rails — leave them off:**
+
+ | Knob | Synthetic bench result | Real Rails result | Recommendation |
+ |---|---|---|---|
+ | `-t 30` (more threads/worker) | Helped Hyperion 5-10% on hello-world | **Hurt** p99 vs `-t 10` on real Rails (3.51 s vs 148 ms on /up) — GVL + middleware Mutex contention dominates past `-t 10` | Stay at `-t 10`. Match Puma's recommended `RAILS_MAX_THREADS`. |
+ | `--yjit` | 5-10% on synthetic CPU-bound | Wash on dev-mode Rails (312 vs 328 rps, p99 worse with YJIT) | Skip for now. Production-mode Rails may behave differently — verify with your own bench before flipping. |
+ | `RAILS_POOL` > 25 | n/a | No improvement at pool=50 or pool=100 on real Rails (rps within 3%, p99 within noise). Pool starvation is rarely the bottleneck on a `-w 4 -t 10` config | Keep your existing AR pool size. |
+ | `--async-io` | 33-42× rps on PG-bound (with `hyperion-async-pg`) | **Worse** than drop-in on real Rails (4.14 s p99 on /up vs 148 ms drop-in) | **Don't enable** until your full I/O stack is fiber-cooperative. The synchronous Redis client (`redis-rb`) blocks the OS thread before async-pg can yield, so fibers can't compound. Migrate to `async-redis` *first*, then revisit. |
+ | `--async-io` + `hyperion-async-pg` AR adapter | Verified 48× rps lift on a single-PG-query bench | Marginal-or-negative on real Rails (similar reason: Redis-first handlers don't yield) | Same — wait for a full-async I/O stack. |
+
+ **Why the simple drop-in wins on real Rails:** the per-request budget on a real handler is dominated by the Rails middleware chain (rack-attack, locale redirect, tagger, etc.) + handler logic + DB + cache I/O. Hyperion's per-request CPU optimizations (C-ext header parser, response builder, lock-free metrics, fiber-cooperative TLS dispatch in 1.4.0+) shave ~5-10% off the *non-I/O* portion of the budget consistently — and the [worker-owns-connection model](#concurrency-at-scale-architectural-advantages) prevents the queue-amplification that Puma's thread-pool dispatch shows under sustained load. You don't need to "tune" anything to get those.
+
  ## Logging
 
  Default behaviour (rc16+):
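
The library-detection approach the operator-guidance section describes (and that `cli.rb` below implements) is a name → lambda table checked with `defined?`, so probing has no `require` side effects and never loads a gem the app didn't ask for. A minimal standalone sketch — the library names here (`json`, `bogus-lib`) are illustrative stand-ins, not Hyperion's real probe table:

```ruby
# Constant-probe pattern: each candidate library maps to a lambda that
# checks whether its top-level constant is defined. `defined?` returns
# "constant" (truthy) if loaded, nil otherwise — no require, no Gem query.
require 'json'

PROBES = {
  'json'      => -> { defined?(::JSON) },     # required above => detected
  'bogus-lib' => -> { defined?(::BogusLib) }  # never loaded   => skipped
}.freeze

detected = PROBES.select { |_name, probe| probe.call }.keys
orphan   = detected.empty?   # true would trigger the advisory warn

puts detected.inspect  # => ["json"]
puts orphan            # => false
```

Because the lambdas are evaluated lazily at boot (not at table definition), a library required anywhere before the check is picked up regardless of load order.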
data/lib/hyperion/cli.rb CHANGED
@@ -26,6 +26,14 @@ module Hyperion
  Hyperion.logger = Hyperion::Logger.new(level: config.log_level, format: config.log_format)
  end
 
+ # Advisory: operators frequently flip --async-io expecting "fast mode"
+ # without installing a fiber-cooperative I/O library. That costs 5-47%
+ # rps depending on workload (47% on the hello-world bench). The flag
+ # only pays off when paired with `hyperion-async-pg` / `async-redis` /
+ # `async-http`. We log once at boot pointing at the operator-guidance
+ # docs; the operator's setting is still honoured.
+ warn_orphan_async_io(config)
+
  # Propagate log_requests so every Connection picks it up via
  # `Hyperion.log_requests?` without needing to thread it through
  # Server/ThreadPool/Master plumbing. Default is ON; nil means "don't
@@ -261,6 +269,35 @@ WARNING: argv is visible via `ps`; prefer --admin-token-file PATH for production
  end
  private_class_method :maybe_enable_yjit
 
+ # Probe table for fiber-cooperative I/O libraries. If `async_io: true` is
+ # set but none of these are loaded, the operator has likely flipped the
+ # flag without reading the bench numbers — `--async-io` adds Async-loop
+ # overhead and only pays off when paired with a library whose I/O calls
+ # yield to the scheduler. Hello-world bench (BENCH_2026_04_27.md) showed
+ # a 47% rps regression + 3.65 s p99 spike on this shape.
+ ASYNC_IO_PROBE_LIBS = {
+   'hyperion-async-pg' => -> { defined?(::Hyperion::AsyncPg) },
+   'async-redis'       => -> { defined?(::Async::Redis) },
+   'async-http'        => -> { defined?(::Async::HTTP) }
+ }.freeze
+
+ def self.warn_orphan_async_io(config)
+   return unless config.async_io == true # nil and false are both no-ops here
+
+   detected = ASYNC_IO_PROBE_LIBS.select { |_name, probe| probe.call }.keys
+   return unless detected.empty?
+
+   Hyperion.logger.warn do
+     {
+       message: 'async_io enabled but no fiber-cooperative I/O library detected',
+       libraries_checked: ASYNC_IO_PROBE_LIBS.keys,
+       impact: 'async_io adds Async-loop overhead (~5-47% rps depending on workload) and only pays off when paired with a library that yields to the Async scheduler on socket waits.',
+       docs: 'https://github.com/andrew-woblavobla/hyperion#operator-guidance'
+     }
+   end
+ end
+ private_class_method :warn_orphan_async_io
+
  # When admin_token is configured, wrap the app in AdminMiddleware so
  # POST /-/quit and GET /-/metrics become token-protected admin endpoints.
  # Skipped when the token is unset — those paths fall through to the app,
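
One detail of the warn implementation above worth noting: `Hyperion.logger.warn do ... end` passes the payload as a block. Under Ruby's stdlib `Logger` semantics (which block-accepting loggers conventionally follow), the block runs only when the severity is enabled, so the payload hash is never allocated at stricter log levels. A minimal stdlib sketch, assuming nothing about Hyperion's own logger:

```ruby
require 'logger'
require 'stringio'

out = StringIO.new
log = Logger.new(out)
log.level = Logger::ERROR  # WARN is now below the threshold

built = 0
log.warn do
  built += 1               # payload construction, skipped when WARN is off
  'expensive payload'
end

puts built             # => 0: the block was never invoked
puts out.string.empty? # => true: nothing was written
```

Lowering `log.level` to `Logger::WARN` flips both results: the block runs once and the line is emitted. This is why the advisory warn is safe to leave in the boot path even for operators who silence warnings.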
data/lib/hyperion/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Hyperion
-   VERSION = '1.6.0'
+   VERSION = '1.6.2'
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: hyperion-rb
  version: !ruby/object:Gem::Version
-   version: 1.6.0
+   version: 1.6.2
  platform: ruby
  authors:
  - Andrey Lobanov
  - Andrey Lobanov