hyperion-rb 1.6.0 → 1.6.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: eecb708980287e22968f1405778e91632ecd26eaae33845891ee00e94adf6839
- data.tar.gz: e433cd1bff6039cd8c30bbc9fdce6cd4176ca1e03d33362f9e67ceddd7d89880
+ metadata.gz: a134460ea9e2ed0f20a2c063b643593f9198d5206f382f8669e332b68510c16d
+ data.tar.gz: c64216c7532fc4f3f2476459abc0e9df41473ae2774c9806a33a6c8186925ed6
  SHA512:
- metadata.gz: c02db1008beea0246601193a4ca18e8ac931f9a4952a2571a88ecc24028a838707a55e0ecf2159ba53f76b4a3c32c94c635afeb957304ae16ca2e0c2951ef7a3
- data.tar.gz: 8c3bb4674315a4c41cd4bf38296ab2c991b113f3bcb0ced20ae3a143572f289c3d9a00ce258ba317045a5427b4b7a7986bf6de9f499c84d0ec61b505d8563518
+ metadata.gz: d5dcaf718cc108753c757e8e6b44cd2072e018e3c6133fc38f562285c45ab7b90f5be4819d64e5a9ea32de451622cb7b9597a16b053a0cee0c7f1a3b1fb85355
+ data.tar.gz: 0136f9605c114a90faae7169bef1027dc06f1d7ca520a8a38e4cf4d7ed5b8b9b9eda87ca86611c74e8033d9ede6ae74b0ab7ca768e8f2c6ebf042d30c1f0e783
data/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
  # Changelog
 
+ ## [1.6.2] - 2026-04-27
+
+ Doc release. No code changes.
+
+ ### Added
+ - **README "Production tuning (real Rails apps)" section** — distilled from a real-app bench against the Exodus platform (Rails 8.1, on-LAN PG + Redis at ~0.3 ms RTT, `-w 4 -t 10`, `wrk -t8 -c200 -d30s`). Headline: the simplest drop-in (`hyperion -t N -w M` matching Puma's existing `-t/-w`) is the right answer; `+9%` rps and `28×` lower p99 on health endpoints over the same Puma config, no other knobs needed. Documents which synthetic-bench knobs (`-t 30`, `--yjit`, larger `RAILS_POOL`, `--async-io`) DON'T help on real Rails and why (GVL contention past `-t 10`, dev-mode YJIT instability, pool rarely the bottleneck, sync Redis blocking ahead of async-pg yields). Saves operators from the trap of "tune harder = faster" — the simple drop-in IS the answer on real workloads.
+
+ ## [1.6.1] - 2026-04-27
+
+ Audit follow-up from the [BENCH_2026_04_27.md](docs/BENCH_2026_04_27.md) sweep. No code-path changes; doc surface and operator-UX polish.
+
+ ### Added
+ - **`## Operator guidance` README section** — concrete "when do I pick which config?" tables. Translates the bench numbers into decisions: `-w 1 + larger pool` vs `-w N + smaller pool` for I/O-bound (multi-worker is 2.6× memory for 0.77× rps if you pick wrong on PG-wait); the `--async-io` decision tree (default OFF unless you're paired with a fiber-cooperative library); how to read p50 vs p99 (tail wins are 5-200× larger than the rps story suggests — size capacity by p99).
+ - **Boot-time advisory warn for orphan `--async-io`** — if `async_io: true` is set but no fiber-cooperative library is loaded (`hyperion-async-pg`, `async-redis`, `async-http`), Hyperion logs a single advisory warn at boot pointing at the operator-guidance docs. The setting is still honoured; the warn just helps operators who flipped the flag expecting a free perf bump (bench showed `--async-io` on hello-world = 47% rps regression + 3.65 s p99 spike).
+ - **4 new specs in `spec/hyperion/cli_async_io_warn_spec.rb`** covering the four warn decision paths (true + no library, false, nil, true + library detected via `stub_const`).
+
  ## [1.6.0] - 2026-04-27
 
  Two parallel improvements landing in 1.6.0:
data/README.md CHANGED
@@ -25,7 +25,9 @@ bundle exec hyperion config.ru
 
  ## Benchmarks
 
- All numbers are real wrk runs against published Hyperion configs. Hyperion ships **with default-ON structured access logs**; Puma comparisons use Puma defaults (no per-request log emission). Each section is stamped with the Hyperion version it was measured against — newer versions (1.3.0+ `--async-io`, 1.4.0+ TLS h1 inline, 1.4.1+ Metrics fiber-key fix) preserve or improve these numbers; we re-run the headline configs each release and have not seen regressions on these workloads.
+ All numbers are real wrk runs against published Hyperion configs. Hyperion ships **with default-ON structured access logs**; Puma comparisons use Puma defaults (no per-request log emission). Each section is stamped with the Hyperion version it was measured against — newer versions (1.3.0+ `--async-io`, 1.4.0+ TLS h1 inline, 1.4.1+ Metrics fiber-key fix, 1.6.0+ HTTP/2 writer fiber + 3 C-ext additions) preserve or improve these numbers; we re-run the headline configs each release and have not seen regressions on these workloads.
+
+ > **Comprehensive matrix for 1.6.0 + hyperion-async-pg 0.5.0 (16-vCPU Linux, 9 workloads × 25+ configs)**: see [`docs/BENCH_2026_04_27.md`](docs/BENCH_2026_04_27.md). Headline: 98,818 r/s on hello `-w 16`, 21,215 r/s `-w 4` at p99 < 2 ms, 2,180 r/s on a 50 ms-waiting PG workload (4.1× the best Puma), 1,667 req/s HTTP/2 multiplexed at 0 errors, 155 MB RSS for 10k idle keep-alive connections.
 
  ### Hello-world Rack app
 
@@ -320,6 +322,85 @@ Strict DSL: unknown methods raise `NoMethodError` at boot — typos surface imme
 
  A documented sample lives at [`config/hyperion.example.rb`](config/hyperion.example.rb).
 
+ ## Operator guidance
+
+ Concrete tradeoffs distilled from [`docs/BENCH_2026_04_27.md`](docs/BENCH_2026_04_27.md). If the bench numbers cited below feel surprising, check that doc for the full matrix + caveats.
+
+ ### When to use `-w N`
+
+ | Workload shape | Recommended | Why |
+ |---|---|---|
+ | **Pure I/O-bound** (PG / Redis / external HTTP, no significant CPU) | `-w 1` + larger pool | Bench: `-w 1 pool=200` = 87 MB / 2,180 r/s vs `-w 4 pool=64` = 224 MB / 1,680 r/s. **2.6× more memory, 0.77× rps** if you pick multi-worker on a wait-bound workload. |
+ | **Pure CPU-bound** (heavy JSON / template render / image processing) | `-w N` matching CPU count | Each worker's accept loop is single-threaded under `--async-io`; multi-worker gives CPU-parallelism. Bench: `-w 16 -t 5` hits 98,818 r/s on a 16-vCPU box, 4.7× a `-w 1` ceiling on the same hardware. |
+ | **Mixed** (Rails-shaped: ~5 ms CPU + 50 ms PG wait per request) | `-w N/2` (half cores) + medium pool | Lets CPU work parallelise while keeping per-worker memory tractable. Bench `pg_mixed.ru` at `-w 4 -t 5 pool=128` = 1,740 r/s with no cold-start spike (ForkSafe `prefill_in_child: true`). |
+
+ Multi-worker on PG-wait workloads is the **wrong** default for most apps — the headline rps doesn't justify the memory and PG-connection cost. Verify your shape with the bench before scaling out.
+
+ ### When to use `--async-io`
+
+ ```
+ Are you using a fiber-cooperative I/O library?
+ (hyperion-async-pg, async-redis, async-http)
+
+                ┌─────────────┴─────────────┐
+               yes                          no
+                │                           │
+  Pair with a fiber-aware        Leave --async-io OFF.
+  connection pool                Default thread-pool dispatch
+  (FiberPool, async-pool —       is faster for synchronous
+  NOT connection_pool gem,       Rails apps. Bench: --async-io
+  which uses non-fiber Mutex).   on hello-world = 47% rps
+                │                regression + p99 spike to
+  Set --async-io.                3.65 s under no-yield workloads.
+  Pool size is the real          No reason to flip the flag.
+  concurrency knob; -t is
+  decorative for wait-bound.
+ ```
+
+ Hyperion warns at boot if you set `--async-io` without any fiber-cooperative library loaded. The setting is still honoured; the warn just nudges operators who flipped it expecting a free perf bump.
+
+ ### Tuning `-t` and pool sizes
+
+ - **Without `--async-io`** (sync server, default): `-t` is the concurrency knob. Each in-flight request holds an OS thread; pool size should match `-t`. Bench shows Puma-style behaviour — at 200 wrk conns hitting a 5-thread server, queue depth dominates p99 (Hyperion `-t 5 -w 1` p50 = 0.95 ms vs Puma's same shape at 59.5 ms — Hyperion's queueing is cheaper but the model still serializes at `-t`).
+ - **With `--async-io` + a fiber-aware pool**: pool size is the concurrency knob. `-t` is decorative for wait-bound workloads; one accept-loop fiber serves all in-flight queries via the pool. Near-linear scaling: pool=64 → ~780 r/s, pool=128 → ~1,344 r/s, pool=200 → ~2,180 r/s on 50 ms PG queries.
+ - **Pool over WAN**: if `PG.connect` round-trip is >50 ms, expect pool fill at startup to take `pool_size / parallel_fill_threads × RTT`. `hyperion-async-pg 0.5.1+` auto-scales `parallel_fill_threads` so pool=200 fills in ~1-2 s.
+
+ ### How to read p50 vs p99
+
+ Tail latency tells the queueing story; rps tells the throughput story. Hyperion's tail wins are **always** bigger than its rps wins — sometimes the rps numbers look close to a competitor while p99 is 5-200× lower:
+
+ | Workload | Hyperion rps / p99 | Closest competitor | rps ratio | p99 ratio |
+ |---|---|---|---:|---:|
+ | Hello `-w 4` | 21,215 r/s / 1.87 ms | Falcon 24,061 / 9.78 ms | 0.88× | **5.2× lower** |
+ | CPU JSON `-w 4` | 15,582 r/s / 2.47 ms | Falcon 18,643 / 13.51 ms | 0.84× | **5.5× lower** |
+ | Static 1 MiB | 1,919 r/s / 4.22 ms | Puma 2,074 / 55 ms | 0.93× | **13× lower** |
+ | PG-wait `-w 1` pool=200 | 2,180 r/s / 668 ms | Puma 530 r/s + 200 timeouts | **4.1×** | qualitative crush |
+
+ **Size capacity by p99, not by mean.** Throughput peaks are easy to fake under controlled bench conditions; tail latency reflects what your slowest user actually experiences when the load balancer fans them onto a busy worker.
+
+ ### Production tuning (real Rails apps)
+
+ Distilled from a real-app bench against the [Exodus platform](https://github.com/andrew-woblavobla/hyperion/blob/master/docs/BENCH_2026_04_27.md) (Rails 8.1, on-LAN PG + Redis at ~0.3 ms RTT, `-w 4 -t 10`, `wrk -t8 -c200 -d30s`). The headline finding: the **simplest drop-in is the right answer**, and the additional knobs operators reach for first don't help on real Rails.
+
+ **Recommended for migrating from Puma**: `hyperion -t N -w M` matching your current Puma `-t N:N -w M`. No other flags. That gives you (vs Puma at the same `-t/-w`):
+
+ - **+9% rps on lightweight endpoints** (matches the 5-10% per-request CPU savings the rest of the bench section documents).
+ - **28× lower p99 on health-style endpoints** — the queue-of-doom shape Puma exhibits under sustained 200-conn load doesn't reproduce on Hyperion's worker-owns-connection model.
+ - **3.8× lower p99 on PG-touching endpoints**.
+ - **Same RSS, same operator surface** — you keep all your existing config, monitoring, and deploy scripts.
+
+ **Knobs that help on synthetic benches but NOT on real Rails — leave them off:**
+
+ | Knob | Synthetic bench result | Real Rails result | Recommendation |
+ |---|---|---|---|
+ | `-t 30` (more threads/worker) | Helped Hyperion 5-10% on hello-world | **Hurt** p99 vs `-t 10` on real Rails (3.51 s vs 148 ms on /up) — GVL + middleware Mutex contention dominates past `-t 10` | Stay at `-t 10`. Match Puma's recommended `RAILS_MAX_THREADS`. |
+ | `--yjit` | 5-10% on synthetic CPU-bound | Wash on dev-mode Rails (312 vs 328 rps, p99 worse with YJIT) | Skip for now. Production-mode Rails may behave differently — verify with your own bench before flipping. |
+ | `RAILS_POOL` > 25 | n/a | No improvement at pool=50 or pool=100 on real Rails (rps within 3%, p99 within noise). Pool starvation is rarely the bottleneck on a `-w 4 -t 10` config | Keep your existing AR pool size. |
+ | `--async-io` | 33-42× rps on PG-bound (with `hyperion-async-pg`) | **Worse** than drop-in on real Rails (4.14 s p99 on /up vs 148 ms drop-in) | **Don't enable** until your full I/O stack is fiber-cooperative. The synchronous Redis client (`redis-rb`) blocks the OS thread before async-pg can yield, so fibers can't compound. Migrate to `async-redis` *first*, then revisit. |
+ | `--async-io` + `hyperion-async-pg` AR adapter | Verified 48× rps lift on a single-PG-query bench | Marginal-or-negative on real Rails (similar reason: Redis-first handlers don't yield) | Same — wait for a full-async I/O stack. |
+
+ **Why the simple drop-in wins on real Rails:** the per-request budget on a real handler is dominated by the Rails middleware chain (rack-attack, locale redirect, tagger, etc.) + handler logic + DB + cache I/O. Hyperion's per-request CPU optimizations (C-ext header parser, response builder, lock-free metrics, fiber-cooperative TLS dispatch in 1.4.0+) shave ~5-10% off the *non-I/O* portion of the budget consistently — and the [worker-owns-connection model](#concurrency-at-scale-architectural-advantages) prevents the queue-amplification that Puma's thread-pool dispatch shows under sustained load. You don't need to "tune" anything to get those.
+
  ## Logging
 
  Default behaviour (rc16+):
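
The library-detection approach the operator-guidance section describes (and that `cli.rb` below implements) is a name → lambda table checked with `defined?`, so probing has no `require` side effects and never loads a gem the app didn't ask for. A minimal standalone sketch — the library names here (`json`, `bogus-lib`) are illustrative stand-ins, not Hyperion's real probe table:

```ruby
# Constant-probe pattern: each candidate library maps to a lambda that
# checks whether its top-level constant is defined. `defined?` returns
# "constant" (truthy) if loaded, nil otherwise — no require, no Gem query.
require 'json'

PROBES = {
  'json'      => -> { defined?(::JSON) },     # required above => detected
  'bogus-lib' => -> { defined?(::BogusLib) }  # never loaded   => skipped
}.freeze

detected = PROBES.select { |_name, probe| probe.call }.keys
orphan   = detected.empty?   # true would trigger the advisory warn

puts detected.inspect  # => ["json"]
puts orphan            # => false
```

Because the lambdas are evaluated lazily at boot (not at table definition), a library required anywhere before the check is picked up regardless of load order.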
data/lib/hyperion/cli.rb CHANGED
@@ -26,6 +26,14 @@ module Hyperion
  Hyperion.logger = Hyperion::Logger.new(level: config.log_level, format: config.log_format)
  end
 
+ # Advisory: operators frequently flip --async-io expecting "fast mode"
+ # without installing a fiber-cooperative I/O library. That costs 5-47%
+ # rps depending on workload (47% on the hello-world bench). The flag
+ # only pays off when paired with `hyperion-async-pg` / `async-redis` /
+ # `async-http`. We log once at boot pointing at the operator-guidance
+ # docs; the operator's setting is still honoured.
+ warn_orphan_async_io(config)
+
  # Propagate log_requests so every Connection picks it up via
  # `Hyperion.log_requests?` without needing to thread it through
  # Server/ThreadPool/Master plumbing. Default is ON; nil means "don't
@@ -261,6 +269,35 @@ WARNING: argv is visible via `ps`; prefer --admin-token-file PATH for production
  end
  private_class_method :maybe_enable_yjit
 
+ # Probe table for fiber-cooperative I/O libraries. If `async_io: true` is
+ # set but none of these are loaded, the operator has likely flipped the
+ # flag without reading the bench numbers — `--async-io` adds Async-loop
+ # overhead and only pays off when paired with a library whose I/O calls
+ # yield to the scheduler. Hello-world bench (BENCH_2026_04_27.md) showed
+ # a 47% rps regression + 3.65 s p99 spike on this shape.
+ ASYNC_IO_PROBE_LIBS = {
+   'hyperion-async-pg' => -> { defined?(::Hyperion::AsyncPg) },
+   'async-redis'       => -> { defined?(::Async::Redis) },
+   'async-http'        => -> { defined?(::Async::HTTP) }
+ }.freeze
+
+ def self.warn_orphan_async_io(config)
+   return unless config.async_io == true # nil and false are both no-ops here
+
+   detected = ASYNC_IO_PROBE_LIBS.select { |_name, probe| probe.call }.keys
+   return unless detected.empty?
+
+   Hyperion.logger.warn do
+     {
+       message: 'async_io enabled but no fiber-cooperative I/O library detected',
+       libraries_checked: ASYNC_IO_PROBE_LIBS.keys,
+       impact: 'async_io adds Async-loop overhead (~5-47% rps depending on workload) and only pays off when paired with a library that yields to the Async scheduler on socket waits.',
+       docs: 'https://github.com/andrew-woblavobla/hyperion#operator-guidance'
+     }
+   end
+ end
+ private_class_method :warn_orphan_async_io
+
  # When admin_token is configured, wrap the app in AdminMiddleware so
  # POST /-/quit and GET /-/metrics become token-protected admin endpoints.
  # Skipped when the token is unset — those paths fall through to the app,
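
One detail of the warn implementation above worth noting: `Hyperion.logger.warn do ... end` passes the payload as a block. Under Ruby's stdlib `Logger` semantics (which block-accepting loggers conventionally follow), the block runs only when the severity is enabled, so the payload hash is never allocated at stricter log levels. A minimal stdlib sketch, assuming nothing about Hyperion's own logger:

```ruby
require 'logger'
require 'stringio'

out = StringIO.new
log = Logger.new(out)
log.level = Logger::ERROR  # WARN is now below the threshold

built = 0
log.warn do
  built += 1               # payload construction, skipped when WARN is off
  'expensive payload'
end

puts built             # => 0: the block was never invoked
puts out.string.empty? # => true: nothing was written
```

Lowering `log.level` to `Logger::WARN` flips both results: the block runs once and the line is emitted. This is why the advisory warn is safe to leave in the boot path even for operators who silence warnings.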
data/lib/hyperion/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Hyperion
-   VERSION = '1.6.0'
+   VERSION = '1.6.2'
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: hyperion-rb
  version: !ruby/object:Gem::Version
-   version: 1.6.0
+   version: 1.6.2
  platform: ruby
  authors:
  - Andrey Lobanov
  - Andrey Lobanov