hyperion-rb 1.0.1 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

checksums.yaml +4 -4
data/CHANGELOG.md +44 -0
data/README.md +32 -1
data/ext/hyperion_http/parser.c +141 -0
data/lib/hyperion/adapter/rack.rb +14 -0
data/lib/hyperion/admin_middleware.rb +110 -0
data/lib/hyperion/cli.rb +82 -1
data/lib/hyperion/config.rb +11 -1
data/lib/hyperion/connection.rb +56 -4
data/lib/hyperion/http2_handler.rb +243 -6
data/lib/hyperion/logger.rb +94 -3
data/lib/hyperion/master.rb +69 -1
data/lib/hyperion/prometheus_exporter.rb +96 -0
data/lib/hyperion/response_writer.rb +87 -10
data/lib/hyperion/server.rb +106 -32
data/lib/hyperion/thread_pool.rb +24 -8
data/lib/hyperion/version.rb +1 -1
data/lib/hyperion/worker.rb +19 -11
data/lib/hyperion/worker_health.rb +33 -0
data/lib/hyperion.rb +58 -0
metadata +4 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 65702226f151c3ba8314ac5927575595aeda57a522d045cc61feaf04f87bae53
-  data.tar.gz: 704454f1d7b5e484b70bc09baf7047d9a6760f9a1d824402092455c4af63a8d8
+  metadata.gz: 4174d7143559b6bd05bdc78acf4377add8aca32f885e933786c50f31c956e9ba
+  data.tar.gz: f163a7f5bd2b363f37205e1f1ba845fb0324c329cc15b4c1144e6d519a1bc60a
 SHA512:
-  metadata.gz: e210752143a969e8070de69bcf81bbc40dd509d827cb8f70dbcf44bdc1fb3692c9110000a732e2bdeb6185ea52fd07416f3d1fdc56005c5fd4dcca1e99e60fd3
-  data.tar.gz: 0fa86fc58fa590087cb84948569be3968a4e7515a9fd8f9de28cc191bb6c8ddb8b5d016fa4b66ed142aaa01555357320ba62bbc7bc71abf3a96cbbfd4c3be457
+  metadata.gz: ea61b5e3298ae50b9b6530d51e1f9a5299b0ccfea3b99248230a601a96ebaf764b5d7978215e09a7d73ed7e85ee3f8b5f7d13d40a830ca5c4482a9d192b2919a
+  data.tar.gz: ed8e125b2ff0c9aab53f3178d0f31d1b0db028f8ebf3a40d09ba11e86c3a62756a3c15c7eb3b288faf8dee6f1062159d372cb0c08997a76fedcb97d485d87283

data/CHANGELOG.md CHANGED

@@ -1,5 +1,49 @@
 # Changelog
+## [1.2.0] - 2026-04-27
+Production hardening + perf round 2. No breaking changes.
+### Added
+- **Zero-copy sendfile path** — when a Rack body responds to `#to_path` (e.g. `Rack::Files`, asset uploads), `ResponseWriter` uses `IO.copy_stream(file, socket)` which triggers `sendfile(2)` on Linux for plain TCP. Eliminates the ~MB-sized String allocation per static-asset response. Falls back to userspace copy on TLS / non-Linux but still avoids the userspace String build. New metrics: `:sendfile_responses`, `:tls_zerobuf_responses`.
+- **Hot fork warmup (`Hyperion.warmup!`)** — master pre-allocates the Rack env Hash pool, primes the C extension's lazy state, and touches commonly-resolved constants before `before_fork`. Workers inherit the warm pools via Copy-on-Write. Removes first-N-requests-after-fork allocation tax.
+- **Backpressure (`max_pending`)** — when the thread pool's inbox queue exceeds the configured depth, new accepts get HTTP 503 + `Retry-After: 1` and the socket is closed immediately (no Rack dispatch, no access-log line). Default off (nil); opt in by setting an Integer. New metric: `:rejected_connections`.
+- **Prometheus exporter** — `AdminMiddleware` now serves `GET /-/metrics` in addition to `POST /-/quit` (same token). Renders `Hyperion.stats` as Prometheus text exposition v0.0.4. Counter names follow the `hyperion_<key>_total` convention; `:responses_<code>` keys are grouped under `hyperion_responses_status_total{status="<code>"}`.
+- **Slow-client total-deadline (`max_request_read_seconds`)** — per-request wallclock cap on the request-line + headers read phase (default 60s). Defense-in-depth against slowloris: a malicious client can no longer dribble 1 byte per `read_timeout` window indefinitely. On overrun, Hyperion writes 408 + closes. Resets per request on keep-alive sessions. New metric: `:slow_request_aborts`.
+- **HTTP/2 SETTINGS tuning** — Falcon-class defaults shipped: `MAX_CONCURRENT_STREAMS=128`, `INITIAL_WINDOW_SIZE=1MiB`, `MAX_FRAME_SIZE=1MiB`, `MAX_HEADER_LIST_SIZE=64KiB`. All four overridable via Config DSL (`h2_max_concurrent_streams` etc). Out-of-spec values are clamped + warned, not crashed.
+- **`docs/REVERSE_PROXY.md`** — nginx + AWS ALB samples, X-Forwarded-* semantics, admin-endpoint hardening at the edge. Includes the documented gotcha that ALB-to-target HTTP/2 strips WebSocket upgrade headers (use HTTP/1.1 upstream).
+### Changed
+- **`ResponseWriter` Date header now uses `cached_date`** — the per-thread, per-second cache landed in 1.1.0 was never wired into the hot path. It is now. Eliminates ~3 String allocations per response (`Time.now.httpdate` → cached String reuse).
+- **`AdminMiddleware`** refactored: shared `authorize` helper between `/-/quit` and `/-/metrics`; `PATH` constant split into `PATH_QUIT` + `PATH_METRICS`.
+- **`Hyperion::Logger` per-thread access buffer key** is now namespaced per Logger instance (already shipped as a 1.1.0 follow-up fix; documented here for completeness).
+### Fixed
+- N/A — no regressions discovered between 1.1.0 and 1.2.0.
+## [1.1.0] - 2026-04-27
+First minor release after 1.0.0. Production hardening + perf wins, no breaking changes.
+### Added
+- **HTTP/2 §8.1.2 semantic validation** — Hyperion now rejects malformed `:method` / `:path` / `:scheme` pseudo-headers, connection-specific headers (`connection`, `te`, `transfer-encoding`, `keep-alive`, `upgrade`, `proxy-connection`), and inconsistent `content-length` framing with `RST_STREAM PROTOCOL_ERROR`. h2spec conformance pass rate is now 100% on the §8.1.2 suite (was 76.7% in 1.0.x).
+- **Worker recycling (`worker_max_rss_mb`)** — master polls each child's RSS via `/proc/<pid>/statm` (Linux) or `ps -o rss=` (macOS/BSD) every `worker_check_interval` seconds (default 30s). Workers exceeding the configured RSS ceiling are gracefully cycled (SIGTERM, drain, respawn). Disabled when `worker_max_rss_mb` is nil.
+- **Admin drain endpoint (`POST /-/quit`)** — token-protected Rack middleware that triggers the same SIGTERM-driven graceful shutdown as the signal path. Disabled by default; mount by setting `admin_token` in the Hyperion config DSL. Auth via `X-Hyperion-Admin-Token` header (constant-time comparison). Returns 202 + `{"status":"draining"}` on success, 401 on missing/wrong token.
+- **YJIT auto-enable** — Hyperion enables YJIT automatically in production/staging environments (`RAILS_ENV` / `RACK_ENV` / `HYPERION_ENV`). Override with the `yjit` config setting (true/false) or `--[no-]yjit` CLI flag. No-op on Rubies built without YJIT.
+- **C-extension access-log line builder** (`Hyperion::CParser.build_access_line`) — single-allocation line construction in C, ~10× faster than the Ruby interpolation path. Auto-selected on non-TTY destinations (production); colored TTY runs keep the Ruby fallback.
+- **Date-header cache** — per-thread, per-second cache of `Time.now.httpdate` in `ResponseWriter`. Eliminates ~3 String allocations per response.
+- **`bytes_read` / `bytes_written` metrics** — counters exposed via `Hyperion.stats` for connection-level bandwidth monitoring.
+- **`Hyperion.c_parser_available?`** module accessor + boot-time warn line if the llhttp C extension didn't load (so operators running production with the slower pure-Ruby fallback notice immediately).
+- **`MIGRATING_FROM_PUMA.md`** — operator guide covering config translation, lifecycle hook mapping, signal differences, and observability gaps.
+- **Concurrency-at-scale benchmarks** — README now documents 10 000-connection keep-alive throughput and h2 multiplexing numbers vs Puma/Falcon.
+### Changed
+- **Plain HTTP/1.1 accept loop bypasses Async** — when no TLS is configured, Hyperion uses a raw `IO.select` + `accept_nonblock` loop instead of wrapping the loop in an Async task. Worker-owns-connection semantics are unchanged. Removes ~2 µs of fiber-scheduler overhead from the hot accept path.
+### Fixed
+- **Lost shutdown log lines under SIGTERM** — `Master#shutdown_children` and `CLI.run_single` now call `Logger#flush_all`, which walks every per-thread access-log buffer registered through the Logger and `IO#flush`es both stdout and stderr before the process exits. Operators no longer have to chase missing `master draining` / `master exiting` lines after a graceful shutdown.
+- **Cross-instance Logger buffer leak** — per-thread access-log buffers are now namespaced per Logger instance (`:"__hyperion_access_buf_<oid>__"`). Previously a globally-shared key meant a buffer registered against an early Logger could be written to by a later Logger whose `flush_all` couldn't see it. The hot path remains a single `Thread.current` read.
 ## [1.0.1] - 2026-04-26
 ### Fixed
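
The Prometheus naming convention described in the 1.2.0 entry (`hyperion_<key>_total`, with `:responses_<code>` keys grouped under one labelled metric) can be sketched as follows. This is a minimal illustration, not the gem's `PrometheusExporter`; the method name `render_prometheus` and the sample stats hash are assumptions.

```ruby
# Hypothetical renderer: turns a flat stats Hash into Prometheus text
# exposition lines, following the naming rules quoted in the changelog.
def render_prometheus(stats)
  plain = []
  by_status = []
  stats.each do |key, value|
    name = key.to_s
    if (m = name.match(/\Aresponses_(\d{3})\z/))
      # :responses_200 => hyperion_responses_status_total{status="200"}
      by_status << %(hyperion_responses_status_total{status="#{m[1]}"} #{value})
    else
      plain << "# TYPE hyperion_#{name}_total counter"
      plain << "hyperion_#{name}_total #{value}"
    end
  end
  by_status.unshift('# TYPE hyperion_responses_status_total counter') unless by_status.empty?
  (plain + by_status).join("\n") + "\n"
end

puts render_prometheus({ requests: 5, responses_200: 4, responses_503: 1 })
```

Grouping the per-status counters under one metric name with a `status` label lets a dashboard sum or filter by status code without knowing each key in advance.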

data/README.md CHANGED

@@ -77,6 +77,35 @@ Health endpoint that traverses the full middleware chain (rack-attack, locale re
 On Grape and Rails-controller workloads Puma hits wrk's 2 s timeout cap on ~⅔ of requests — its real p99 is censored above 2 s. Hyperion serves all of its requests under 1.2 s with 0 to 16 timeouts. **1.14–1.48× Puma throughput** depending on endpoint.
+### Concurrency at scale (architectural advantages)
+These workloads demonstrate structural differences between Hyperion's fiber-per-connection / fiber-per-stream model and Puma's thread-pool model. Numbers are illustrative; the architecture is what matters. Run on Ubuntu 24.04 / Ruby 3.3.3, single worker, h2load `-c <conns> -n 100000 --rps 1000 --h1`.
+**5,000 concurrent keep-alive connections (50,000 requests):**
+| | succeeded | r/s | wall | master RSS |
+|---|---:|---:|---:|---:|
+| Hyperion `-w 1 -t 10` | 50,000 / 50,000 | 3,460 | 14.45 s | 53.5 MB |
+| Puma `-w 1 -t 10:10`  | 50,000 / 50,000 | 1,762 | 28.37 s | 36.9 MB |
+**10,000 concurrent keep-alive connections (100,000 requests):**
+| | succeeded | failed | r/s | wall |
+|---|---:|---:|---:|---:|
+| Hyperion `-w 1 -t 10` | 93,090 | 6,910 | 3,446 | 27.01 s |
+| Puma `-w 1 -t 10:10`  | 77,340 | 22,660 | 706 | 109.59 s |
+Hyperion holds each connection in a ~1 KB fiber stack; Puma needs an OS thread (~1–8 MB each, capped at `max_threads`). At 10k concurrent connections Hyperion serves **~5× the throughput** of Puma with **~70% fewer dropped requests** (6,910 vs 22,660 failed), while the per-connection bookkeeping cost is bounded by fiber size, not by `max_threads`.
+**HTTP/2 multiplexing — 1 connection × 100 concurrent streams (handler sleeps 50 ms):**
+| | wall time |
+|---|---:|
+| Hyperion (per-stream fiber dispatch) | **1.04 s** |
+| Serial baseline (100 × 50 ms) | 5.00 s |
+Hyperion fans 100 in-flight streams across separate fibers within a single TCP connection. A serial server would take 5 s; the fiber-multiplexed result (1.04 s, ~96 req/s on one socket) is bounded by single-handler sleep time plus framing overhead. Puma has no native HTTP/2 path — production deployments terminate h2 at nginx and forward h1 to the worker pool, which serializes again.
 ### Reproduce
 ```sh
@@ -107,6 +136,8 @@ curl --http2 -k https://127.0.0.1:9443/
 `bundle exec rake spec` (and the `default` task) auto-invoke `compile`, so a fresh checkout just needs `bundle install && bundle exec rake` to get a green run.
+**Migrating from Puma?** See [docs/MIGRATING_FROM_PUMA.md](docs/MIGRATING_FROM_PUMA.md).
 ## Configuration
 Three layers, in precedence order: explicit CLI flag > environment variable > `config/hyperion.rb` > built-in default.
@@ -244,7 +275,7 @@ Smuggling defenses for HTTP/1.1: `Content-Length` + `Transfer-Encoding` together
 ## Compatibility
-- **Ruby 3.2+** required.
+- **Ruby 3.3+** required (the `protocol-http2 ~> 0.26` transitive dep imposes this floor; older Ruby installs error at `bundle install`).
 - **Rack 3** (auto-sets `SERVER_SOFTWARE`, `rack.version`, `REMOTE_ADDR`, IPv6-safe `Host` parsing, CRLF guard).
 - **`Hyperion::FiberLocal.install!`** opt-in shim for older Rails apps that store request-scoped data via `Thread.current.thread_variable_*` (modern Rails 7.1+ already uses Fiber storage natively; the shim handles the residual footgun).
 - **`Hyperion::FiberLocal.verify_environment!`** runtime check that `Thread.current[:k]` is fiber-local on the current Ruby (it is on 3.2+).
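
The headline ratios for the 10,000-connection table in the README hunk above can be rechecked from its raw figures. The sketch below only redoes the arithmetic; the constant and method names are made up for illustration, and the numbers are copied from the table.

```ruby
# Figures copied from the 10,000-connection table (100,000 requests each).
HYPERION_10K = { succeeded: 93_090, failed: 6_910, rps: 3_446 }.freeze
PUMA_10K     = { succeeded: 77_340, failed: 22_660, rps: 706 }.freeze

# How many times faster a is than b, by requests per second.
def throughput_ratio(a, b)
  (a[:rps].to_f / b[:rps]).round(2)
end

# Percentage by which a's failure count is lower than b's.
def failure_reduction_pct(a, b)
  ((1 - a[:failed].to_f / b[:failed]) * 100).round(1)
end

# Percentage by which a's success count is higher than b's.
def success_gain_pct(a, b)
  ((a[:succeeded].to_f / b[:succeeded] - 1) * 100).round(1)
end

puts throughput_ratio(HYPERION_10K, PUMA_10K)      # 4.88
puts failure_reduction_pct(HYPERION_10K, PUMA_10K) # 69.5
puts success_gain_pct(HYPERION_10K, PUMA_10K)      # 20.4
```

Note that "fewer dropped requests" and "more successful requests" are different ratios: the failure count falls by roughly 70%, while the success count rises by roughly 20%.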

data/ext/hyperion_http/parser.c CHANGED

@@ -404,6 +404,145 @@ static VALUE cbuild_response_head(VALUE self, VALUE rb_status, VALUE rb_reason,
     return buf;
 }
+/* Hyperion::CParser.build_access_line(format, ts, method, path, query,
+ *                                     status, duration_ms, remote_addr,
+ *                                     http_version) -> String
+ *
+ * Hand-rolled access-log line builder used by Hyperion::Logger#access on the
+ * hot path. The Ruby version allocates 1-2 throwaway Strings per line; this
+ * builds the line into a stack scratch buffer (with rb_str_buf overflow for
+ * extreme cases) and returns a single Ruby String. ~10× faster on the
+ * common case, which closes the perf gap between log_requests on/off.
+ *
+ * `format` is :text or :json (Symbol). The format strings here mirror
+ * Logger#build_access_text / #build_access_json byte-for-byte (no colour —
+ * the C builder is only used when @colorize is false, i.e. non-TTY production
+ * deployments where access logs are the highest-volume log line).
+ *
+ * String inputs are passed through verbatim. Access logs are best-effort
+ * structured output, not a security boundary; CRLF in path/remote_addr would
+ * be a log-injection nuisance but cannot escalate. Status (int) and
+ * duration_ms (double/Numeric) go through snprintf, which is type-safe.
+ */
+static VALUE cbuild_access_line(VALUE self,
+                                VALUE format_sym, VALUE rb_ts, VALUE rb_method,
+                                VALUE rb_path, VALUE rb_query, VALUE rb_status,
+                                VALUE rb_duration, VALUE rb_remote,
+                                VALUE rb_http_version) {
+    (void)self;
+    Check_Type(rb_ts, T_STRING);
+    Check_Type(rb_method, T_STRING);
+    Check_Type(rb_path, T_STRING);
+    Check_Type(rb_http_version, T_STRING);
+    int is_json = (TYPE(format_sym) == T_SYMBOL) &&
+                  (SYM2ID(format_sym) == rb_intern("json"));
+    int status     = NUM2INT(rb_status);
+    double dur_ms  = NUM2DBL(rb_duration);
+    int has_query  = !NIL_P(rb_query) && RSTRING_LEN(rb_query) > 0;
+    int has_remote = !NIL_P(rb_remote) && RSTRING_LEN(rb_remote) > 0;
+    /* 512-byte initial buffer covers the vast majority of access-log lines
+     * (timestamp + level + path + status + addr ~= 200 bytes). rb_str_cat
+     * grows on overflow.
+     *
+     * We use a CAT_LIT macro for literal-string appends so the compiler
+     * computes length via sizeof — manual byte counts on hand-rolled
+     * literal lengths are an off-by-one waiting to happen. */
+#define CAT_LIT(b, s) rb_str_cat((b), (s), (long)(sizeof(s) - 1))
+    VALUE buf = rb_str_buf_new(512);
+    if (is_json) {
+        /* Prefix: {"ts":"...","level":"info","source":"hyperion","message":"request", */
+        CAT_LIT(buf, "{\"ts\":\"");
+        rb_str_cat(buf, RSTRING_PTR(rb_ts), RSTRING_LEN(rb_ts));
+        CAT_LIT(buf, "\",\"level\":\"info\",\"source\":\"hyperion\",\"message\":\"request\",");
+        CAT_LIT(buf, "\"method\":\"");
+        rb_str_cat(buf, RSTRING_PTR(rb_method), RSTRING_LEN(rb_method));
+        CAT_LIT(buf, "\",\"path\":\"");
+        rb_str_cat(buf, RSTRING_PTR(rb_path), RSTRING_LEN(rb_path));
+        CAT_LIT(buf, "\"");
+        if (has_query) {
+            CAT_LIT(buf, ",\"query\":\"");
+            rb_str_cat(buf, RSTRING_PTR(rb_query), RSTRING_LEN(rb_query));
+            CAT_LIT(buf, "\"");
+        }
+        char num[64];
+        int n = snprintf(num, sizeof(num), ",\"status\":%d,\"duration_ms\":%g,",
+                         status, dur_ms);
+        rb_str_cat(buf, num, n);
+        if (has_remote) {
+            CAT_LIT(buf, "\"remote_addr\":\"");
+            rb_str_cat(buf, RSTRING_PTR(rb_remote), RSTRING_LEN(rb_remote));
+            CAT_LIT(buf, "\",");
+        } else {
+            CAT_LIT(buf, "\"remote_addr\":null,");
+        }
+        CAT_LIT(buf, "\"http_version\":\"");
+        rb_str_cat(buf, RSTRING_PTR(rb_http_version), RSTRING_LEN(rb_http_version));
+        CAT_LIT(buf, "\"}\n");
+    } else {
+        /* text: "<ts> INFO  [hyperion] message=request method=... path=... [query=...] status=... duration_ms=... remote_addr=... http_version=...\n" */
+        rb_str_cat(buf, RSTRING_PTR(rb_ts), RSTRING_LEN(rb_ts));
+        CAT_LIT(buf, " INFO  [hyperion] message=request method=");
+        rb_str_cat(buf, RSTRING_PTR(rb_method), RSTRING_LEN(rb_method));
+        CAT_LIT(buf, " path=");
+        rb_str_cat(buf, RSTRING_PTR(rb_path), RSTRING_LEN(rb_path));
+        if (has_query) {
+            /* Mirror Logger#quote_if_needed: quote if value contains
+             * whitespace, '"', or '='. Hot path skips quoting. */
+            const char *q_ptr = RSTRING_PTR(rb_query);
+            long q_len = RSTRING_LEN(rb_query);
+            int need_quote = 0;
+            for (long j = 0; j < q_len; j++) {
+                char c = q_ptr[j];
+                if (c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
+                    c == '"' || c == '=') {
+                    need_quote = 1;
+                    break;
+                }
+            }
+            if (need_quote) {
+                /* Defer to Ruby's String#inspect for correct quoting. */
+                VALUE quoted = rb_funcall(rb_query, rb_intern("inspect"), 0);
+                CAT_LIT(buf, " query=");
+                rb_str_cat(buf, RSTRING_PTR(quoted), RSTRING_LEN(quoted));
+            } else {
+                CAT_LIT(buf, " query=");
+                rb_str_cat(buf, q_ptr, q_len);
+            }
+        }
+        char num[80];
+        /* Use %g to match the existing Ruby format which interpolates
+         * Float#to_s (no fixed precision). Status is an int. */
+        int n = snprintf(num, sizeof(num), " status=%d duration_ms=%g remote_addr=",
+                         status, dur_ms);
+        rb_str_cat(buf, num, n);
+        if (has_remote) {
+            rb_str_cat(buf, RSTRING_PTR(rb_remote), RSTRING_LEN(rb_remote));
+        } else {
+            CAT_LIT(buf, "nil");
+        }
+        CAT_LIT(buf, " http_version=");
+        rb_str_cat(buf, RSTRING_PTR(rb_http_version), RSTRING_LEN(rb_http_version));
+        CAT_LIT(buf, "\n");
+    }
+    return buf;
+}
+#undef CAT_LIT
 void Init_hyperion_http(void) {
     install_settings();
@@ -416,6 +555,8 @@ void Init_hyperion_http(void) {
     rb_define_method(rb_cCParser, "parse", cparser_parse, 1);
     rb_define_singleton_method(rb_cCParser, "build_response_head",
                                cbuild_response_head, 6);
+    rb_define_singleton_method(rb_cCParser, "build_access_line",
+                               cbuild_access_line, 9);
     id_new             = rb_intern("new");
     id_downcase        = rb_intern("downcase");
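
For readers who prefer not to trace the C, the text branch of `cbuild_access_line` above can be approximated in Ruby. This is a hedged sketch of the same steps, not the gem's `Logger#build_access_text`; the method names `needs_quote?` and `access_text_line` and the keyword-argument shape are assumptions.

```ruby
# Mirrors the C scan: quote when the value contains whitespace, '"', or '='.
def needs_quote?(value)
  value.match?(/[ \t\n\r"=]/)
end

# Hypothetical Ruby rendition of the text-format access line.
def access_text_line(ts:, method:, path:, query:, status:, duration_ms:,
                     remote_addr:, http_version:)
  line = +"#{ts} INFO  [hyperion] message=request method=#{method} path=#{path}"
  unless query.nil? || query.empty?
    # Like the C code, defer to String#inspect when quoting is needed.
    line << ' query=' << (needs_quote?(query) ? query.inspect : query)
  end
  # %d / %g match the snprintf format used by the C builder.
  line << format(' status=%d duration_ms=%g remote_addr=', status, duration_ms)
  line << (remote_addr.nil? || remote_addr.empty? ? 'nil' : remote_addr)
  line << " http_version=#{http_version}\n"
end

print access_text_line(ts: '2026-04-27T00:00:00Z', method: 'GET', path: '/x',
                       query: 'a=1', status: 200, duration_ms: 1.5,
                       remote_addr: nil, http_version: 'HTTP/1.1')
```

Because `=` triggers quoting, any ordinary `key=value` query string takes the `inspect` path; only queries without `=`, quotes, or whitespace are appended verbatim.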

data/lib/hyperion/adapter/rack.rb CHANGED

@@ -49,6 +49,20 @@ module Hyperion
       )
       class << self
+        # Pre-allocate `count` env-hash and rack-input objects in master before
+        # fork. Children inherit the populated free-list via copy-on-write —
+        # the hash slots stay shared until a request mutates them. Eliminates
+        # the first-N-requests allocation tax that every fresh worker would
+        # otherwise pay on cold start. Idempotent: safe to call multiple
+        # times; the pool simply caps at its configured `max_size`.
+        def warmup_pool(count = 8)
+          warmed_envs = Array.new(count) { ENV_POOL.acquire }
+          warmed_inputs = Array.new(count) { INPUT_POOL.acquire }
+          warmed_envs.each { |e| ENV_POOL.release(e) }
+          warmed_inputs.each { |i| INPUT_POOL.release(i) }
+          nil
+        end
         def call(app, request)
           env, input = build_env(request)
           status, headers, body = app.call(env)
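
The acquire-then-release warmup used by `warmup_pool` above can be shown against a toy pool. The internals of `ENV_POOL` and `INPUT_POOL` are not visible in this diff, so `SimplePool` below is a hypothetical stand-in with an assumed `max_size` cap.

```ruby
# Hypothetical free-list pool: hands out pooled objects when available,
# builds new ones otherwise, and keeps at most max_size spares.
class SimplePool
  def initialize(max_size:, &factory)
    @max_size = max_size
    @factory  = factory
    @free     = []
  end

  def acquire
    @free.pop || @factory.call
  end

  def release(obj)
    @free.push(obj) if @free.size < @max_size
    nil
  end

  def free_count
    @free.size
  end
end

# Same shape as the diff's warmup: take `count` objects, then hand them back
# so the free list is populated before any fork happens.
def warmup_pool(pool, count = 8)
  warmed = Array.new(count) { pool.acquire }
  warmed.each { |obj| pool.release(obj) }
  nil
end

pool = SimplePool.new(max_size: 4) { {} }
warmup_pool(pool, 8)
puts pool.free_count # 4: the pool caps at max_size
```

Acquiring all objects before releasing any is what forces `count` distinct allocations; releasing inside the same loop would keep recycling a single object.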

data/lib/hyperion/admin_middleware.rb ADDED

@@ -0,0 +1,110 @@
+# frozen_string_literal: true
+require 'rack/utils'
+module Hyperion
+  # Rack middleware that exposes administrative endpoints on the same
+  # listener as the application. Disabled by default — only mounted when
+  # `admin_token` is configured. Currently provides:
+  #
+  #   POST /-/quit     →  triggers graceful master drain (SIGTERM to ppid)
+  #   GET  /-/metrics  →  returns Hyperion.stats in Prometheus text format
+  #
+  # Auth: the request must include `X-Hyperion-Admin-Token: <token>`.
+  # Mismatch → 401. Path/method mismatch → falls through to the app
+  # (so the app can still own /-/anything if Hyperion's admin is off).
+  # When the token is unset, the constructor refuses to wrap — callers
+  # must skip mounting this middleware at all.
+  #
+  # SECURITY: the bearer token is defense-in-depth, not a substitute for
+  # network isolation. Operators MUST keep the listener on a private
+  # network or behind TLS + an authenticating reverse proxy. Anyone who
+  # can reach the listener AND knows the token can drain the server or
+  # scrape its metrics. See docs/REVERSE_PROXY.md for nginx/ALB recipes
+  # that block /-/* at the edge.
+  class AdminMiddleware
+    PATH_QUIT    = '/-/quit'
+    PATH_METRICS = '/-/metrics'
+    METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8'
+    JSON_CONTENT_TYPE    = 'application/json'
+    UNAUTHORIZED_BODY = %({"error":"unauthorized"}\n)
+    def initialize(app, token:, signal_target: nil)
+      raise ArgumentError, 'admin_token must be a non-empty String' if token.nil? || token.to_s.empty?
+      @app           = app
+      @token         = token.to_s
+      # Override hook for tests. Defaults to ppid in worker context, pid
+      # for single-worker context (caller decides).
+      @signal_target = signal_target
+    end
+    def call(env)
+      path   = env['PATH_INFO']
+      method = env['REQUEST_METHOD']
+      if path == PATH_QUIT && method == 'POST'
+        authorize(env) { handle_quit(env) }
+      elsif path == PATH_METRICS && method == 'GET'
+        authorize(env) { handle_metrics }
+      else
+        @app.call(env)
+      end
+    end
+    private
+    # Wrap a handler in the shared bearer-token check. Yields only when the
+    # token matches; returns the canonical 401 response otherwise.
+    def authorize(env)
+      provided = env['HTTP_X_HYPERION_ADMIN_TOKEN'].to_s
+      return unauthorized unless secure_match?(provided)
+      yield
+    end
+    def unauthorized
+      [401, { 'content-type' => JSON_CONTENT_TYPE }, [UNAUTHORIZED_BODY]]
+    end
+    def handle_quit(env)
+      target = resolve_signal_target
+      Hyperion.logger.info do
+        { message: 'admin drain requested', remote_addr: env['REMOTE_ADDR'], target_pid: target }
+      end
+      begin
+        Process.kill('TERM', target)
+      rescue StandardError => e
+        Hyperion.logger.warn { { message: 'admin drain signal failed', error: e.message } }
+        return [500, { 'content-type' => JSON_CONTENT_TYPE }, [%({"error":"signal_failed"}\n)]]
+      end
+      [202, { 'content-type' => JSON_CONTENT_TYPE }, [%({"status":"draining"}\n)]]
+    end
+    def handle_metrics
+      body = PrometheusExporter.render(Hyperion.stats)
+      [200, { 'content-type' => METRICS_CONTENT_TYPE }, [body]]
+    end
+    def secure_match?(provided)
+      return false if provided.empty?
+      return false unless provided.bytesize == @token.bytesize
+      # Constant-time comparison. Rack::Utils.secure_compare requires same
+      # length, so we length-check first (this reveals only the token length).
+      Rack::Utils.secure_compare(provided, @token)
+    end
+    def resolve_signal_target
+      return @signal_target if @signal_target
+      # In a forked worker, ppid IS the master; in single-worker mode,
+      # the master + worker are the same process — signal self.
+      ppid = Process.ppid
+      ppid > 1 ? ppid : Process.pid
+    end
+  end
+end
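
`secure_match?` above returns early when the byte sizes differ, which is cheap but lets timing reveal the token's length. One common alternative, shown here only as a sketch and not as what the gem does, is to compare fixed-size digests so both inputs always have equal length. The helper names `constant_time_eq?` and `token_match?` are hypothetical.

```ruby
require 'digest'

# XOR-accumulate every byte so the loop runs the same number of steps
# whether or not an early byte differs.
def constant_time_eq?(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# Hashing first makes both operands 32 bytes, so the length guard above
# never depends on the secret's real length.
def token_match?(provided, expected)
  return false if provided.empty?

  constant_time_eq?(Digest::SHA256.digest(provided),
                    Digest::SHA256.digest(expected))
end

puts token_match?('s3cret', 's3cret') # true
puts token_match?('guess', 's3cret')  # false
```

The cost is two SHA-256 computations per request to an admin path, which is negligible for endpoints that are called rarely.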

data/lib/hyperion/cli.rb CHANGED

@@ -53,6 +53,10 @@ module Hyperion
         o.on('--fiber-local-shim', 'Patch Thread.current[] to be fiber-local (Rails-compat for older gems)') do
           cli_opts[:fiber_local_shim] = true
         end
+        o.on('--[no-]yjit',
+             'Enable Ruby YJIT (default: auto on RAILS_ENV/RACK_ENV=production/staging)') do |v|
+          cli_opts[:yjit] = v
+        end
         o.on('-h', '--help', 'show help') do
           puts o
           exit 0
@@ -79,6 +83,11 @@ module Hyperion
       # touch — fall through to the env/default chain in Hyperion.log_requests?".
       Hyperion.log_requests = config.log_requests unless config.log_requests.nil?
+      # Enable YJIT before workers fork / connections start. Auto-on in
+      # production/staging gives operators the perf bump for free; explicit
+      # config.yjit (true/false) overrides the env-based default.
+      maybe_enable_yjit(config)
       rackup = argv.first || 'config.ru'
       abort("[hyperion] no such rackup file: #{rackup}") unless File.exist?(rackup)
@@ -88,6 +97,7 @@ module Hyperion
       end
       app = load_rack_app(rackup)
+      app = wrap_admin_middleware(app, config)
       workers = config.workers.zero? ? Etc.nprocessors : config.workers
       if workers <= 1
@@ -101,10 +111,20 @@ module Hyperion
       tls = build_tls_from_config(config)
       server = Server.new(host: config.host, port: config.port, app: app,
                           tls: tls, thread_count: config.thread_count,
-                          read_timeout: config.read_timeout)
+                          read_timeout: config.read_timeout,
+                          max_pending: config.max_pending,
+                          max_request_read_seconds: config.max_request_read_seconds,
+                          h2_settings: Master.build_h2_settings(config))
       server.listen
       scheme = tls ? 'https' : 'http'
       Hyperion.logger.info { { message: 'listening', url: "#{scheme}://#{server.host}:#{server.port}" } }
+      warn_c_parser_unavailable
+      # Pre-allocate Rack env-pool entries and eager-touch lazy constants.
+      # In single-mode there's no fork, but the warmup still pays for itself
+      # by frontloading the first-N-request allocation cost off the first
+      # real client. Idempotent — safe to call once per process.
+      Hyperion.warmup!
       # Single-worker mode reuses the lifecycle hooks: before_fork is a no-op
       # here (no fork happens), and on_worker_boot/on_worker_shutdown fire
@@ -130,6 +150,11 @@ module Hyperion
       server.start
       shutdown_thread.join
       config.on_worker_shutdown.each { |h| h.call(0) }
+      # Drain per-thread access buffers + sync stdio. Single-worker mode
+      # doesn't go through Master#shutdown_children, so without this call
+      # buffered access lines + final shutdown messages can be lost on
+      # SIGTERM. See Hyperion::Logger#flush_all.
+      Hyperion.logger.flush_all
     end
     def self.run_cluster(config, app, workers)
@@ -155,5 +180,61 @@ module Hyperion
       { cert: config.tls_cert, key: config.tls_key }
     end
     private_class_method :build_tls_from_config
+    # Decide whether to enable YJIT and flip the switch once at boot.
+    # Precedence:
+    #   1. config.yjit explicitly true/false  → honour exactly.
+    #   2. config.yjit nil (default)          → auto: on for production/staging.
+    # No-op on Rubies without YJIT (e.g. JRuby/TruffleRuby) and idempotent if
+    # the operator already passed `ruby --yjit` upstream.
+    def self.maybe_enable_yjit(config)
+      return unless defined?(::RubyVM::YJIT)
+      return if ::RubyVM::YJIT.enabled?
+      enable = if config.yjit.nil?
+                 env_name = ENV['HYPERION_ENV'] || ENV['RAILS_ENV'] || ENV['RACK_ENV']
+                 %w[production staging].include?(env_name)
+               else
+                 config.yjit
+               end
+      return unless enable
+      ::RubyVM::YJIT.enable
+      Hyperion.logger.info do
+        { message: 'YJIT enabled', mode: config.yjit.nil? ? 'auto' : 'explicit' }
+      end
+    end
+    private_class_method :maybe_enable_yjit
+    # When admin_token is configured, wrap the app in AdminMiddleware so
+    # POST /-/quit and GET /-/metrics become token-protected admin endpoints.
+    # Skipped when the token is unset — those paths fall through to the app,
+    # so apps may still own /-/anything if Hyperion's admin is off.
+    def self.wrap_admin_middleware(app, config)
+      return app if config.admin_token.nil? || config.admin_token.to_s.empty?
+      Hyperion.logger.info do
+        { message: 'admin endpoint enabled',
+          paths: [AdminMiddleware::PATH_QUIT, AdminMiddleware::PATH_METRICS] }
+      end
+      AdminMiddleware.new(app, token: config.admin_token)
+    end
+    private_class_method :wrap_admin_middleware
+    # Warn loudly at boot if the C parser didn't load — operators running
+    # production with the pure-Ruby fallback are paying ~2× CPU on parse-heavy
+    # workloads and probably don't know it.
+    def self.warn_c_parser_unavailable
+      return if Hyperion.c_parser_available?
+      Hyperion.logger.warn do
+        {
+          message: 'llhttp C parser not loaded — using pure-Ruby fallback (slower)',
+          remediation: 'rebuild the gem with `bundle exec rake compile` or check your OpenSSL/build-essential install'
+        }
+      end
+    end
+    private_class_method :warn_c_parser_unavailable
   end
 end
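
The precedence rule inside `maybe_enable_yjit` above (explicit `config.yjit` wins, otherwise auto-enable for production/staging) can be isolated as a pure function for testing. The helper name `yjit_wanted?` and the injectable `env` argument are assumptions for illustration; the environment-variable order is taken from the diff.

```ruby
# Returns true when YJIT should be switched on.
# config_yjit: true/false forces the answer, nil means "decide from env".
def yjit_wanted?(config_yjit, env = ENV)
  return config_yjit unless config_yjit.nil?

  # Same lookup order as the diff: HYPERION_ENV, then RAILS_ENV, then RACK_ENV.
  env_name = env['HYPERION_ENV'] || env['RAILS_ENV'] || env['RACK_ENV']
  %w[production staging].include?(env_name)
end

puts yjit_wanted?(nil, { 'RAILS_ENV' => 'production' })   # true
puts yjit_wanted?(false, { 'RAILS_ENV' => 'production' }) # false
puts yjit_wanted?(nil, { 'HYPERION_ENV' => 'development',
                         'RAILS_ENV' => 'production' })   # false
```

The third call shows a consequence of the `||` chain: a non-production `HYPERION_ENV` shadows a production `RAILS_ENV`.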

data/lib/hyperion/config.rb CHANGED

@@ -24,7 +24,17 @@ module Hyperion
       log_level: nil, # nil → Logger picks from env / default
       log_format: nil, # nil → Logger picks via auto rule
       log_requests: nil, # nil → Hyperion.log_requests? (default true)
-      fiber_local_shim: false
+      fiber_local_shim: false,
+      yjit: nil, # nil → auto: enable on production/staging; true/false to force.
+      worker_max_rss_mb: nil, # Integer, e.g. 1024. When a worker exceeds this RSS in MB, master gracefully cycles it. nil disables.
+      worker_check_interval: 30, # Seconds between RSS polls. Tradeoff: tighter = faster recycle, more ps calls. 30s matches Puma WorkerKiller.
+      admin_token: nil, # String. When set, exposes admin endpoints (POST /-/quit triggers graceful drain; GET /-/metrics returns Prometheus-format Hyperion.stats). Same token guards both. nil disables admin entirely (paths fall through to the app).
+      max_pending: nil, # Integer, e.g. 256. When the per-worker accept inbox has this many queued connections, additional accepts are rejected with HTTP 503 + Retry-After:1 instead of being queued. nil disables (current behaviour: unbounded queue).
+      max_request_read_seconds: 60, # Numeric. Total wallclock budget (seconds) for reading the request line + headers + body for ONE request. Defends against slowloris-style drips that satisfy the per-recv read_timeout but never finish the request. Resets between requests on a keep-alive connection. nil disables.
+      h2_max_concurrent_streams: 128, # HTTP/2 SETTINGS_MAX_CONCURRENT_STREAMS — cap on simultaneously-open streams per connection. Falcon: 64. nil leaves protocol-http2 default (0xFFFFFFFF).
+      h2_initial_window_size: 1_048_576, # HTTP/2 SETTINGS_INITIAL_WINDOW_SIZE (octets) — flow-control window per stream at open. Bigger = fewer WINDOW_UPDATE round-trips on large bodies. Spec default is 65535. nil → leave protocol default.
+      h2_max_frame_size: 1_048_576, # HTTP/2 SETTINGS_MAX_FRAME_SIZE (octets) — biggest DATA/HEADERS frame we'll accept. Spec floor 16384, ceiling 16777215. We pick 1 MiB to match common CDNs without unbounded buffer growth. nil → leave protocol default (16384).
+      h2_max_header_list_size: 65_536 # HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE (octets) — advisory cap on the decompressed header block. Bounds memory of pathological client headers. nil → leave protocol default (unbounded).
     }.freeze
     HOOKS = %i[before_fork on_worker_boot on_worker_shutdown].freeze
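
The changelog says out-of-spec HTTP/2 SETTINGS values are "clamped + warned, not crashed". A minimal sketch of that behaviour follows, assuming a hypothetical helper `clamp_h2_setting` and warning text; the frame-size bounds come from the config comment above, and the window-size ceiling of 2^31 - 1 is the HTTP/2 specification limit.

```ruby
require 'stringio'

# Allowed ranges per setting (octets).
H2_LIMITS = {
  max_frame_size: 16_384..16_777_215,
  initial_window_size: 0..2_147_483_647
}.freeze

# Returns the value forced into the legal range, writing a warning to
# warn_io when the input had to be changed. The message format is assumed.
def clamp_h2_setting(name, value, warn_io: $stderr)
  range   = H2_LIMITS.fetch(name)
  clamped = value.clamp(range)
  warn_io.puts("h2 #{name}=#{value} out of range, using #{clamped}") unless clamped == value
  clamped
end

log = StringIO.new
puts clamp_h2_setting(:max_frame_size, 1_048_576, warn_io: log) # 1048576
puts clamp_h2_setting(:max_frame_size, 1024, warn_io: log)      # 16384
puts log.string
```

Clamping at boot keeps a typo in the config file from taking the server down, while the warning still makes the misconfiguration visible.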