RubyGems - hyperion-rb - Versions diffs - 1.1.0 → 1.2.0 - Mend

hyperion-rb 1.1.0 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (17)

checksums.yaml +4 -4
data/CHANGELOG.md +21 -0
data/lib/hyperion/adapter/rack.rb +14 -0
data/lib/hyperion/admin_middleware.rb +47 -17
data/lib/hyperion/cli.rb +17 -5
data/lib/hyperion/config.rb +7 -1
data/lib/hyperion/connection.rb +55 -4
data/lib/hyperion/http2_handler.rb +90 -2
data/lib/hyperion/master.rb +24 -1
data/lib/hyperion/prometheus_exporter.rb +96 -0
data/lib/hyperion/response_writer.rb +62 -2
data/lib/hyperion/server.rb +64 -16
data/lib/hyperion/thread_pool.rb +24 -8
data/lib/hyperion/version.rb +1 -1
data/lib/hyperion/worker.rb +19 -11
data/lib/hyperion.rb +39 -0
metadata +2 -1

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 5670da7700c48436d0e3ded790cf5df090deeebd38bc0b7024b9b6e95c20b5c8
-  data.tar.gz: 10bedef6e02717511eb83bea0044e71978d7e13b9d93d0ac37310a89f6581e9a
+  metadata.gz: 4174d7143559b6bd05bdc78acf4377add8aca32f885e933786c50f31c956e9ba
+  data.tar.gz: f163a7f5bd2b363f37205e1f1ba845fb0324c329cc15b4c1144e6d519a1bc60a
 SHA512:
-  metadata.gz: e791cdd9271cb954ddc11ee037ced8c182fffa4c8b27ded1d0c5672cada1d62fb4095d9e4c440136ce8eeed746eca6e4d99ebb3b1e42a2bc9bbd7bce5c1d9615
-  data.tar.gz: 4728b4bf159583fc6f46bd8c33dbcf916b74dddd49dd685159d39950112f5716cdc8108903d0ca312b31eef397d2237fab9d2f34d51e90822a7d3cab9c1b6691
+  metadata.gz: ea61b5e3298ae50b9b6530d51e1f9a5299b0ccfea3b99248230a601a96ebaf764b5d7978215e09a7d73ed7e85ee3f8b5f7d13d40a830ca5c4482a9d192b2919a
+  data.tar.gz: ed8e125b2ff0c9aab53f3178d0f31d1b0db028f8ebf3a40d09ba11e86c3a62756a3c15c7eb3b288faf8dee6f1062159d372cb0c08997a76fedcb97d485d87283

data/CHANGELOG.md CHANGED

@@ -1,5 +1,26 @@
 # Changelog
+## [1.2.0] - 2026-04-27
+Production hardening + perf round 2. No breaking changes.
+### Added
+- **Zero-copy sendfile path** — when a Rack body responds to `#to_path` (e.g. `Rack::Files`, asset uploads), `ResponseWriter` uses `IO.copy_stream(file, socket)` which triggers `sendfile(2)` on Linux for plain TCP. Eliminates the ~MB-sized String allocation per static-asset response. Falls back to userspace copy on TLS / non-Linux but still avoids the userspace String build. New metrics: `:sendfile_responses`, `:tls_zerobuf_responses`.
+- **Hot fork warmup (`Hyperion.warmup!`)** — master pre-allocates the Rack env Hash pool, primes the C extension's lazy state, and touches commonly-resolved constants before `before_fork`. Workers inherit the warm pools via Copy-on-Write. Removes first-N-requests-after-fork allocation tax.
+- **Backpressure (`max_pending`)** — when the thread pool's inbox queue exceeds the configured depth, new accepts get HTTP 503 + `Retry-After: 1` and the socket is closed immediately (no Rack dispatch, no access-log line). Default off (nil); opt in by setting an Integer. New metric: `:rejected_connections`.
+- **Prometheus exporter** — `AdminMiddleware` now serves `GET /-/metrics` in addition to `POST /-/quit` (same token). Renders `Hyperion.stats` as Prometheus text exposition v0.0.4. Counter names follow the `hyperion_<key>_total` convention; `:responses_<code>` keys are grouped under `hyperion_responses_status_total{status="<code>"}`.
+- **Slow-client total-deadline (`max_request_read_seconds`)** — per-request wallclock cap on the request-line + headers read phase (default 60s). Defense-in-depth against slowloris: a malicious client can no longer dribble 1 byte per `read_timeout` window indefinitely. On overrun, Hyperion writes 408 + closes. Resets per request on keep-alive sessions. New metric: `:slow_request_aborts`.
+- **HTTP/2 SETTINGS tuning** — Falcon-class defaults shipped: `MAX_CONCURRENT_STREAMS=128`, `INITIAL_WINDOW_SIZE=1MiB`, `MAX_FRAME_SIZE=1MiB`, `MAX_HEADER_LIST_SIZE=64KiB`. All four overridable via Config DSL (`h2_max_concurrent_streams` etc). Out-of-spec values are clamped + warned, not crashed.
+- **`docs/REVERSE_PROXY.md`** — nginx + AWS ALB samples, X-Forwarded-* semantics, admin-endpoint hardening at the edge. Includes the documented gotcha that ALB-to-target HTTP/2 strips WebSocket upgrade headers (use HTTP/1.1 upstream).
+### Changed
+- **`ResponseWriter` Date header now uses `cached_date`** — the per-thread, per-second cache landed in 1.1.0 was never wired into the hot path. It is now. Eliminates ~3 String allocations per response (`Time.now.httpdate` → cached String reuse).
+- **`AdminMiddleware`** refactored: shared `authorize` helper between `/-/quit` and `/-/metrics`; `PATH` constant split into `PATH_QUIT` + `PATH_METRICS`.
+- **`Hyperion::Logger` per-thread access buffer key** is now namespaced per Logger instance (already shipped as a 1.1.0 follow-up fix; documented here for completeness).
+### Fixed
+- N/A — no regressions discovered between 1.1.0 and 1.2.0.
 ## [1.1.0] - 2026-04-27
 First minor release after 1.0.0. Production hardening + perf wins, no breaking changes.
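The backpressure entry in the changelog can be pictured with a small bounded inbox. This is only a sketch under assumptions: `BoundedInbox`, its `submit` method, and the exact bytes of the 503 response are hypothetical and are not taken from Hyperion's thread pool, whose diff is not shown here. It rejects once the queue already holds `max_pending` connections, which matches the wording of the `max_pending` config comment.

```ruby
require 'stringio'

# Hypothetical stand-in for a thread pool inbox with a max_pending cap.
class BoundedInbox
  # Assumed response bytes; the real server's 503 may differ in detail.
  REJECT_RESPONSE = "HTTP/1.1 503 Service Unavailable\r\n" \
                    "retry-after: 1\r\nconnection: close\r\n" \
                    "content-length: 0\r\n\r\n"

  attr_reader :rejected

  def initialize(max_pending: nil)
    @queue = Queue.new
    @max_pending = max_pending # nil means unbounded (the default)
    @rejected = 0
  end

  # Queue the connection, or answer 503 and close it when the inbox is full.
  def submit(conn)
    if @max_pending && @queue.size >= @max_pending
      @rejected += 1
      conn.write(REJECT_RESPONSE)
      conn.close
      return false
    end
    @queue << conn
    true
  end

  def pending
    @queue.size
  end
end

inbox = BoundedInbox.new(max_pending: 2)
conns = Array.new(3) { StringIO.new(+'') }
results = conns.map { |c| inbox.submit(c) }
```

With a cap of two, the third connection is answered and closed without ever reaching the queue, so no application code runs for it.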

data/lib/hyperion/adapter/rack.rb CHANGED

@@ -49,6 +49,20 @@ module Hyperion
       )
       class << self
+        # Pre-allocate `n` env-hash and rack-input objects in master before
+        # fork. Children inherit the populated free-list via copy-on-write —
+        # the hash slots stay shared until a request mutates them. Eliminates
+        # the first-N-requests allocation tax that every fresh worker would
+        # otherwise pay on cold start. Idempotent: safe to call multiple
+        # times; the pool simply caps at its configured `max_size`.
+        def warmup_pool(count = 8)
+          warmed_envs = Array.new(count) { ENV_POOL.acquire }
+          warmed_inputs = Array.new(count) { INPUT_POOL.acquire }
+          warmed_envs.each { |e| ENV_POOL.release(e) }
+          warmed_inputs.each { |i| INPUT_POOL.release(i) }
+          nil
+        end
         def call(app, request)
           env, input = build_env(request)
           status, headers, body = app.call(env)
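To see why `warmup_pool` acquires every object before releasing any, consider the following sketch. `SketchPool` is a hypothetical free-list pool written for illustration; the real `ENV_POOL` and `INPUT_POOL` classes are not part of this diff, so their behavior is assumed from the comment above (a free-list capped at `max_size`).

```ruby
# Hypothetical free-list pool; not Hyperion's actual pool class.
class SketchPool
  attr_reader :allocations

  def initialize(max_size:, &factory)
    @max_size = max_size
    @factory = factory
    @free = []
    @allocations = 0
  end

  # Reuse a pooled object when one is free; otherwise allocate a new one.
  def acquire
    return @free.pop unless @free.empty?

    @allocations += 1
    @factory.call
  end

  # Put the object back on the free-list, dropping it when the pool is full.
  def release(obj)
    @free << obj if @free.size < @max_size
    nil
  end

  def free_count
    @free.size
  end
end

# Same shape as the warmup in the diff: acquire all, then release all.
# Acquiring and releasing one at a time would recycle a single object.
def warmup_pool(pool, count = 8)
  warmed = Array.new(count) { pool.acquire }
  warmed.each { |obj| pool.release(obj) }
  nil
end

pool = SketchPool.new(max_size: 8) { {} }
warmup_pool(pool)
warmup_pool(pool) # second call reuses the pooled objects
first_env = pool.acquire
```

After both warmup calls the sketch has allocated eight hashes in total, and the first real `acquire` is served from the free-list rather than allocating.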

data/lib/hyperion/admin_middleware.rb CHANGED

@@ -7,7 +7,8 @@ module Hyperion
   # listener as the application. Disabled by default — only mounted when
   # `admin_token` is configured. Currently provides:
   #
-  #   POST /-/quit  →  triggers graceful master drain (SIGTERM to ppid)
+  #   POST /-/quit     →  triggers graceful master drain (SIGTERM to ppid)
+  #   GET  /-/metrics  →  returns Hyperion.stats in Prometheus text format
   #
   # Auth: the request must include `X-Hyperion-Admin-Token: <token>`.
   # Mismatch → 401. Path/method mismatch → falls through to the app
@@ -18,9 +19,17 @@ module Hyperion
   # SECURITY: the bearer token is defense-in-depth, not a substitute for
   # network isolation. Operators MUST keep the listener on a private
   # network or behind TLS + an authenticating reverse proxy. Anyone who
-  # can reach the listener AND knows the token can drain the server.
+  # can reach the listener AND knows the token can drain the server or
+  # scrape its metrics. See docs/REVERSE_PROXY.md for nginx/ALB recipes
+  # that block /-/* at the edge.
   class AdminMiddleware
-    PATH = '/-/quit'
+    PATH_QUIT    = '/-/quit'
+    PATH_METRICS = '/-/metrics'
+    METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8'
+    JSON_CONTENT_TYPE    = 'application/json'
+    UNAUTHORIZED_BODY = %({"error":"unauthorized"}\n)
     def initialize(app, token:, signal_target: nil)
       raise ArgumentError, 'admin_token must be a non-empty String' if token.nil? || token.to_s.empty?
@@ -33,38 +42,59 @@ module Hyperion
     end
     def call(env)
-      return @app.call(env) unless admin_request?(env)
+      path   = env['PATH_INFO']
+      method = env['REQUEST_METHOD']
-      provided = env['HTTP_X_HYPERION_ADMIN_TOKEN'].to_s
-      # Constant-time comparison. Rack::Utils.secure_compare requires same
-      # length, so prefix-pad first to avoid a length-leak side channel.
-      unless secure_match?(provided)
-        return [401, { 'content-type' => 'application/json' },
-                [%({"error":"unauthorized"}\n)]]
+      if path == PATH_QUIT && method == 'POST'
+        authorize(env) { handle_quit(env) }
+      elsif path == PATH_METRICS && method == 'GET'
+        authorize(env) { handle_metrics }
+      else
+        @app.call(env)
       end
+    end
+    private
+    # Wrap a handler in the shared bearer-token check. Yields only when the
+    # token matches; returns the canonical 401 response otherwise.
+    def authorize(env)
+      provided = env['HTTP_X_HYPERION_ADMIN_TOKEN'].to_s
+      return unauthorized unless secure_match?(provided)
+      yield
+    end
+    def unauthorized
+      [401, { 'content-type' => JSON_CONTENT_TYPE }, [UNAUTHORIZED_BODY]]
+    end
+    def handle_quit(env)
       target = resolve_signal_target
-      Hyperion.logger.info { { message: 'admin drain requested', remote_addr: env['REMOTE_ADDR'], target_pid: target } }
+      Hyperion.logger.info do
+        { message: 'admin drain requested', remote_addr: env['REMOTE_ADDR'], target_pid: target }
+      end
       begin
         Process.kill('TERM', target)
       rescue StandardError => e
         Hyperion.logger.warn { { message: 'admin drain signal failed', error: e.message } }
-        return [500, { 'content-type' => 'application/json' }, [%({"error":"signal_failed"}\n)]]
+        return [500, { 'content-type' => JSON_CONTENT_TYPE }, [%({"error":"signal_failed"}\n)]]
       end
-      [202, { 'content-type' => 'application/json' }, [%({"status":"draining"}\n)]]
+      [202, { 'content-type' => JSON_CONTENT_TYPE }, [%({"status":"draining"}\n)]]
     end
-    private
-    def admin_request?(env)
-      env['PATH_INFO'] == PATH && env['REQUEST_METHOD'] == 'POST'
+    def handle_metrics
+      body = PrometheusExporter.render(Hyperion.stats)
+      [200, { 'content-type' => METRICS_CONTENT_TYPE }, [body]]
     end
     def secure_match?(provided)
       return false if provided.empty?
       return false unless provided.bytesize == @token.bytesize
+      # Constant-time comparison. Rack::Utils.secure_compare requires same
+      # length, so the bytesize guard above rejects length mismatches first.
       Rack::Utils.secure_compare(provided, @token)
     end
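The token check in `secure_match?` does an early length comparison and then a constant-time byte comparison. A dependency-free sketch of that shape follows. `token_match?` is a hypothetical helper, and the XOR loop is a stand-in for `Rack::Utils.secure_compare`, not its implementation.

```ruby
# Hypothetical helper mirroring the shape of AdminMiddleware#secure_match?.
def token_match?(provided, expected)
  return false if provided.empty?
  # Early exit on length; this reveals only the token length, not its bytes.
  return false unless provided.bytesize == expected.bytesize

  # XOR each byte pair and OR the results together so the loop always walks
  # the whole token, wherever the first mismatch happens to be.
  diff = 0
  provided.bytes.zip(expected.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

ok        = token_match?('s3cret', 's3cret')
wrong     = token_match?('s3creT', 's3cret')
too_short = token_match?('s3c', 's3cret')
blank     = token_match?('', 's3cret')
```

In production code a vetted primitive such as `Rack::Utils.secure_compare` is preferable to a hand-rolled loop; the sketch only shows what "constant-time" means here.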

data/lib/hyperion/cli.rb CHANGED

@@ -111,12 +111,21 @@ module Hyperion
       tls = build_tls_from_config(config)
       server = Server.new(host: config.host, port: config.port, app: app,
                           tls: tls, thread_count: config.thread_count,
-                          read_timeout: config.read_timeout)
+                          read_timeout: config.read_timeout,
+                          max_pending: config.max_pending,
+                          max_request_read_seconds: config.max_request_read_seconds,
+                          h2_settings: Master.build_h2_settings(config))
       server.listen
       scheme = tls ? 'https' : 'http'
       Hyperion.logger.info { { message: 'listening', url: "#{scheme}://#{server.host}:#{server.port}" } }
       warn_c_parser_unavailable
+      # Pre-allocate Rack env-pool entries and eager-touch lazy constants.
+      # In single-mode there's no fork, but the warmup still pays for itself
+      # by frontloading the first-N-request allocation cost off the first
+      # real client. Idempotent: safe to call more than once per process.
+      Hyperion.warmup!
       # Single-worker mode reuses the lifecycle hooks: before_fork is a no-op
       # here (no fork happens), and on_worker_boot/on_worker_shutdown fire
       # for the lone in-process "worker" so app code that opens DB pools etc.
@@ -199,13 +208,16 @@ module Hyperion
     private_class_method :maybe_enable_yjit
     # When admin_token is configured, wrap the app in AdminMiddleware so
-    # POST /-/quit becomes a token-protected drain endpoint. Skipped when
-    # the token is unset — the path falls through to the app, so apps may
-    # still own /-/anything if Hyperion's admin is off.
+    # POST /-/quit and GET /-/metrics become token-protected admin endpoints.
+    # Skipped when the token is unset — those paths fall through to the app,
+    # so apps may still own /-/anything if Hyperion's admin is off.
     def self.wrap_admin_middleware(app, config)
       return app if config.admin_token.nil? || config.admin_token.to_s.empty?
-      Hyperion.logger.info { { message: 'admin endpoint enabled', path: AdminMiddleware::PATH } }
+      Hyperion.logger.info do
+        { message: 'admin endpoint enabled',
+          paths: [AdminMiddleware::PATH_QUIT, AdminMiddleware::PATH_METRICS] }
+      end
       AdminMiddleware.new(app, token: config.admin_token)
     end
     private_class_method :wrap_admin_middleware

data/lib/hyperion/config.rb CHANGED

@@ -28,7 +28,13 @@ module Hyperion
       yjit: nil, # nil → auto: enable on production/staging; true/false to force.
       worker_max_rss_mb: nil, # Integer, e.g. 1024. When a worker exceeds this RSS in MB, master gracefully cycles it. nil disables.
       worker_check_interval: 30, # Seconds between RSS polls. Tradeoff: tighter = faster recycle, more ps calls. 30s matches Puma WorkerKiller.
-      admin_token: nil # String. When set, POST /-/quit triggers graceful drain. nil disables endpoint entirely (returns 404).
+      admin_token: nil, # String. When set, exposes admin endpoints (POST /-/quit triggers graceful drain; GET /-/metrics returns Prometheus-format Hyperion.stats). Same token guards both. nil disables admin entirely (paths fall through to the app).
+      max_pending: nil, # Integer, e.g. 256. When the per-worker accept inbox has this many queued connections, additional accepts are rejected with HTTP 503 + Retry-After:1 instead of being queued. nil disables (current behaviour: unbounded queue).
+      max_request_read_seconds: 60, # Numeric. Total wallclock budget (seconds) for reading the request line + headers + body for ONE request. Defends against slowloris-style drips that satisfy the per-recv read_timeout but never finish the request. Resets between requests on a keep-alive connection. nil disables.
+      h2_max_concurrent_streams: 128, # HTTP/2 SETTINGS_MAX_CONCURRENT_STREAMS — cap on simultaneously-open streams per connection. Falcon: 64. nil leaves protocol-http2 default (0xFFFFFFFF).
+      h2_initial_window_size: 1_048_576, # HTTP/2 SETTINGS_INITIAL_WINDOW_SIZE (octets) — flow-control window per stream at open. Bigger = fewer WINDOW_UPDATE round-trips on large bodies. Spec default is 65535. nil → leave protocol default.
+      h2_max_frame_size: 1_048_576, # HTTP/2 SETTINGS_MAX_FRAME_SIZE (octets) — biggest DATA/HEADERS frame we'll accept. Spec floor 16384, ceiling 16777215. We pick 1 MiB to match common CDNs without unbounded buffer growth. nil → leave protocol default (16384).
+      h2_max_header_list_size: 65_536 # HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE (octets) — advisory cap on the decompressed header block. Bounds memory of pathological client headers. nil → leave protocol default (unbounded).
     }.freeze
     HOOKS = %i[before_fork on_worker_boot on_worker_shutdown].freeze

data/lib/hyperion/connection.rb CHANGED

@@ -17,6 +17,7 @@ module Hyperion
     MAX_BODY_BYTES                  = 16 * 1024 * 1024 # 16 MB cap. Phase 5 introduces streaming bodies.
     HEADER_TERM                     = "\r\n\r\n"
     TIMEOUT_SENTINEL                = :__hyperion_read_timeout__
+    DEADLINE_SENTINEL               = :__hyperion_request_deadline__
     IDLE_KEEPALIVE_TIMEOUT_SECONDS  = 5
     # Default parser is the C-extension `CParser` when the extension built;
@@ -44,14 +45,20 @@ module Hyperion
       @log_requests = log_requests.nil? ? Hyperion.log_requests? : log_requests
     end
-    def serve(socket, app)
+    def serve(socket, app, max_request_read_seconds: 60)
       request_count = 0
       carry = +'' # bytes already pulled off the socket but past the prev request boundary
       peer_addr = peer_address(socket)
       @metrics.increment(:connections_accepted)
       @metrics.increment(:connections_active)
       loop do
-        buffer = read_request(socket, carry)
+        # Per-request wallclock deadline. Captured fresh for every request so
+        # long-lived keep-alive sessions with many small requests don't
+        # falsely trip after the cumulative budget elapses.
+        request_started_clock = Process.clock_gettime(Process::CLOCK_MONOTONIC) if max_request_read_seconds
+        buffer = read_request(socket, carry, deadline_started_at: request_started_clock,
+                                             max_request_read_seconds: max_request_read_seconds,
+                                             peer_addr: peer_addr)
         return unless buffer
         if buffer == TIMEOUT_SENTINEL
@@ -65,6 +72,10 @@ module Hyperion
           return
         end
+        # Slowloris-style abort: deadline tripped during read. We've already
+        # written the 408 (best-effort) inside read_request; close out here.
+        return if buffer == DEADLINE_SENTINEL
         request, body_end = @parser.parse(buffer)
         carry = +(buffer.byteslice(body_end, buffer.bytesize - body_end) || '')
         request = enrich_with_peer(request, peer_addr) if peer_addr && request.peer_address.nil?
@@ -193,10 +204,16 @@ module Hyperion
     # pipelining). Returns the full buffer (with any trailing pipelined
     # bytes intact); the parser's returned end_offset tells the caller
     # where this request ends. On EOF returns nil; on read timeout returns
-    # TIMEOUT_SENTINEL.
-    def read_request(socket, carry = +'')
+    # TIMEOUT_SENTINEL; on per-request wallclock deadline trip returns
+    # DEADLINE_SENTINEL (and emits a best-effort 408 + close).
+    def read_request(socket, carry = +'', deadline_started_at: nil, max_request_read_seconds: nil,
+                     peer_addr: nil)
       buffer = carry
       until buffer.include?(HEADER_TERM)
+        if deadline_exceeded?(deadline_started_at, max_request_read_seconds)
+          return abort_for_deadline(socket, deadline_started_at, peer_addr)
+        end
         chunk = read_chunk(socket)
         return chunk if chunk.nil? || chunk == TIMEOUT_SENTINEL
         return nil if chunk.empty?
@@ -211,6 +228,9 @@ module Hyperion
       if chunked?(headers_part)
         until chunked_body_complete?(buffer, header_end)
           raise ParseError, 'chunked body exceeds limit' if buffer.bytesize - header_end > MAX_BODY_BYTES
+          if deadline_exceeded?(deadline_started_at, max_request_read_seconds)
+            return abort_for_deadline(socket, deadline_started_at, peer_addr)
+          end
           chunk = read_chunk(socket)
           break if chunk.nil? || chunk.empty? || chunk == TIMEOUT_SENTINEL
@@ -220,6 +240,10 @@ module Hyperion
       else
         content_length = headers_part[/^content-length:\s*(\d+)/i, 1].to_i
         while buffer.bytesize < header_end + content_length
+          if deadline_exceeded?(deadline_started_at, max_request_read_seconds)
+            return abort_for_deadline(socket, deadline_started_at, peer_addr)
+          end
           chunk = read_chunk(socket)
           break if chunk.nil? || chunk.empty? || chunk == TIMEOUT_SENTINEL
@@ -230,6 +254,33 @@ module Hyperion
       buffer
     end
+    # nil-disabled or budget-untripped → false. Otherwise the wallclock cap
+    # has been exceeded and the caller should abort.
+    def deadline_exceeded?(started_at, max_seconds)
+      return false unless started_at && max_seconds
+      (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at) > max_seconds
+    end
+    # Slowloris fallback: log a structured warn, bump :slow_request_aborts,
+    # write a best-effort 408, and let the caller close the socket. We don't
+    # wait on the 408 write — a dribbling client may never read it, and
+    # that's the failure mode we're protecting against anyway.
+    def abort_for_deadline(socket, started_at, peer_addr)
+      elapsed = started_at ? (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at).round(3) : nil
+      @metrics.increment(:slow_request_aborts)
+      @logger.warn do
+        { message: 'request read deadline exceeded', remote_addr: peer_addr, elapsed_seconds: elapsed }
+      end
+      begin
+        socket.write("HTTP/1.1 408 Request Timeout\r\nconnection: close\r\ncontent-length: 0\r\n\r\n")
+      rescue StandardError
+        # Peer may have already gone — nothing to do.
+      end
+      @metrics.increment_status(408)
+      DEADLINE_SENTINEL
+    end
     def chunked?(headers_part)
       headers_part.match?(/^transfer-encoding:[ \t]*[^\r\n]*chunked\b/i)
     end
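The difference between the per-read `read_timeout` and the new total deadline can be shown with a simulated clock. The sketch below is illustrative only: `ReadDeadline` and the injected `clock:` lambda are hypothetical names, and the 25-second drip interval is an assumed value chosen so that each read would pass a per-read timeout longer than that.

```ruby
# Hypothetical helper; Hyperion itself passes started_at/max_seconds around.
class ReadDeadline
  MONOTONIC = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }

  def initialize(max_seconds, clock: MONOTONIC)
    @max_seconds = max_seconds
    @clock = clock
    # A nil budget disables the check, as in deadline_exceeded?.
    @started_at = max_seconds ? clock.call : nil
  end

  def exceeded?
    return false unless @started_at && @max_seconds

    (@clock.call - @started_at) > @max_seconds
  end
end

# Simulated slow client: one chunk arrives every 25 "seconds".
now = 0.0
fake_clock = -> { now }
deadline = ReadDeadline.new(60, clock: fake_clock)

chunks_read = 0
until deadline.exceeded?
  now += 25.0 # pretend read_chunk blocked this long, then returned data
  chunks_read += 1
end

disabled = ReadDeadline.new(nil, clock: fake_clock)
```

No single read is slow enough to look suspicious on its own, yet the loop stops after the third chunk because the cumulative budget of 60 seconds has been spent.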

data/lib/hyperion/http2_handler.rb CHANGED

@@ -212,9 +212,34 @@ module Hyperion
       end
     end
-    def initialize(app:, thread_pool: nil)
+    # Maps Hyperion-friendly setting names to the integer SETTINGS_* identifiers
+    # protocol-http2 uses on the wire. See RFC 7540 §6.5.2 — these are the
+    # only four parameters Hyperion exposes; the rest of the SETTINGS frame
+    # (HEADER_TABLE_SIZE, ENABLE_PUSH, etc.) keeps protocol-http2's default.
+    SETTINGS_KEY_MAP = {
+      max_concurrent_streams: ::Protocol::HTTP2::Settings::MAXIMUM_CONCURRENT_STREAMS,
+      initial_window_size: ::Protocol::HTTP2::Settings::INITIAL_WINDOW_SIZE,
+      max_frame_size: ::Protocol::HTTP2::Settings::MAXIMUM_FRAME_SIZE,
+      max_header_list_size: ::Protocol::HTTP2::Settings::MAXIMUM_HEADER_LIST_SIZE
+    }.freeze
+    # RFC 7540 §6.5.2 floor for SETTINGS_MAX_FRAME_SIZE. protocol-http2 raises
+    # ProtocolError on values below this; we clamp + warn instead so a
+    # misconfigured operator gets a working server, not a boot-time crash.
+    H2_MIN_FRAME_SIZE = 0x4000 # 16384
+    # RFC 7540 §6.5.2 ceiling for SETTINGS_MAX_FRAME_SIZE.
+    H2_MAX_FRAME_SIZE = 0xFFFFFF # 16777215
+    # RFC 7540 §6.9.2 — INITIAL_WINDOW_SIZE has the same 31-bit max as the
+    # WINDOW_UPDATE frame's Window Size Increment (see protocol-http2's
+    # MAXIMUM_ALLOWED_WINDOW_SIZE).
+    H2_MAX_WINDOW_SIZE = 0x7FFFFFFF
+    def initialize(app:, thread_pool: nil, h2_settings: nil)
       @app         = app
       @thread_pool = thread_pool
+      @h2_settings = h2_settings
       @metrics     = Hyperion.metrics
       @logger      = Hyperion.logger
     end
@@ -224,7 +249,7 @@ module Hyperion
       @metrics.increment(:connections_active)
       framer = ::Protocol::HTTP2::Framer.new(socket)
       server = build_server(framer)
-      server.read_connection_preface
+      server.read_connection_preface(initial_settings_payload)
       # Extract once — the same TCP peer drives every stream on this conn.
       peer_addr = peer_address(socket)
@@ -290,6 +315,69 @@ module Hyperion
     private
+    # Build the [setting_id, value] pairs that go in the connection-preface
+    # SETTINGS frame. protocol-http2's Server#read_connection_preface accepts
+    # this array and does the wire encoding for us. Empty array (no overrides
+    # configured) → SETTINGS frame still goes out, just with no entries
+    # (effectively an ack), which is what the spec allows.
+    #
+    # We clamp out-of-range values (max_frame_size below the spec floor or
+    # above its ceiling, initial_window_size above 31-bit max) instead of
+    # letting protocol-http2 raise ProtocolError at handshake time — a
+    # crashing handshake leaks the connection. Operator gets a warn so the
+    # misconfiguration surfaces in logs.
+    def initial_settings_payload
+      return [] unless @h2_settings
+      payload = []
+      @h2_settings.each do |key, value|
+        next if value.nil?
+        setting_id = SETTINGS_KEY_MAP[key]
+        unless setting_id
+          @logger.warn { { message: 'unknown h2 setting; skipping', setting: key } }
+          next
+        end
+        clamped = clamp_h2_setting(key, value)
+        payload << [setting_id, clamped]
+      end
+      payload
+    end
+    def clamp_h2_setting(key, value)
+      case key
+      when :max_frame_size
+        if value < H2_MIN_FRAME_SIZE
+          @logger.warn do
+            { message: 'h2 max_frame_size below spec minimum; clamping',
+              configured: value, clamped_to: H2_MIN_FRAME_SIZE }
+          end
+          H2_MIN_FRAME_SIZE
+        elsif value > H2_MAX_FRAME_SIZE
+          @logger.warn do
+            { message: 'h2 max_frame_size above spec maximum; clamping',
+              configured: value, clamped_to: H2_MAX_FRAME_SIZE }
+          end
+          H2_MAX_FRAME_SIZE
+        else
+          value
+        end
+      when :initial_window_size
+        if value > H2_MAX_WINDOW_SIZE
+          @logger.warn do
+            { message: 'h2 initial_window_size above spec maximum; clamping',
+              configured: value, clamped_to: H2_MAX_WINDOW_SIZE }
+          end
+          H2_MAX_WINDOW_SIZE
+        else
+          value
+        end
+      else
+        value
+      end
+    end
     def build_server(framer)
       server = ::Protocol::HTTP2::Server.new(framer)
       server.define_singleton_method(:accept_stream) do |stream_id, &block|
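The clamp-and-warn behavior of `clamp_h2_setting` can be expressed compactly with `Comparable#clamp` and a table of ranges. This is a sketch, not the shipped code: `H2_LIMITS` and `clamp_setting` are hypothetical names, it returns the warning text instead of logging it, and it assumes Ruby 2.7 or later for `clamp(range)`. It also adds a lower bound of 0 for `initial_window_size`, which the code in the diff does not check.

```ruby
# Bounds taken from the constants in the diff (RFC 7540 sections 6.5.2, 6.9.2).
H2_LIMITS = {
  max_frame_size: 0x4000..0xFFFFFF,
  initial_window_size: 0..0x7FFFFFFF # lower bound of 0 is an assumption
}.freeze

# Returns [effective_value, warning_or_nil] for one setting.
def clamp_setting(key, value)
  range = H2_LIMITS[key]
  return [value, nil] unless range # settings without bounds pass through

  clamped = value.clamp(range)
  return [value, nil] if clamped == value

  [clamped, "h2 #{key} out of range; clamped #{value} to #{clamped}"]
end

too_small = clamp_setting(:max_frame_size, 1024)
in_range  = clamp_setting(:max_frame_size, 1_048_576)
too_big   = clamp_setting(:initial_window_size, 2**31)
unbounded = clamp_setting(:max_concurrent_streams, 128)
```

A table-driven version keeps each bound in one place, at the cost of the more specific "below minimum" and "above maximum" log messages that the explicit `case` in the diff produces.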

data/lib/hyperion/master.rb CHANGED

@@ -47,6 +47,20 @@ module Hyperion
       end
     end
+    # Pulls the four configurable HTTP/2 SETTINGS values out of the Config
+    # and returns them as a Hash. Nils are stripped so an operator who
+    # explicitly sets one to `nil` (meaning "leave protocol-http2 default in
+    # place") doesn't accidentally send a SETTINGS entry with a nil value.
+    # Empty hash → no override → Http2Handler skips the SETTINGS push.
+    def self.build_h2_settings(config)
+      {
+        max_concurrent_streams: config.h2_max_concurrent_streams,
+        initial_window_size: config.h2_initial_window_size,
+        max_frame_size: config.h2_max_frame_size,
+        max_header_list_size: config.h2_max_header_list_size
+      }.compact
+    end
     def initialize(host:, port:, app:, workers: DEFAULT_WORKER_COUNT,
                    read_timeout: Server::DEFAULT_READ_TIMEOUT_SECONDS, tls: nil,
                    thread_count: Server::DEFAULT_THREAD_COUNT, config: nil)
@@ -84,6 +98,12 @@ module Hyperion
         }
       end
+      # Pre-allocate Rack env-pool entries and eager-touch lazy constants
+      # BEFORE we fork. Children inherit the warm memory via copy-on-write
+      # so the first batch of requests on each fresh worker doesn't pay
+      # the allocation/autoload tax.
+      Hyperion.warmup!
       # `before_fork` runs ONCE in the master before any worker is forked.
       # Operators use it to close shared resources (DB pools, Redis sockets)
       # so each child gets fresh connections rather than inheriting the
@@ -143,7 +163,10 @@ module Hyperion
           host: @host, port: @port, app: @app,
           read_timeout: @read_timeout, tls: @tls,
           thread_count: @thread_count, config: @config,
-          worker_index: worker_index
+          worker_index: worker_index,
+          max_pending: @config.max_pending,
+          max_request_read_seconds: @config.max_request_read_seconds,
+          h2_settings: Master.build_h2_settings(@config)
         }
         # Hand the inherited socket to the worker in :share mode. In
         # :reuseport mode the worker binds its own with SO_REUSEPORT.
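The nil-stripping in `build_h2_settings` relies on `Hash#compact`. A small sketch of the effect, using a hypothetical `SketchConfig` struct in place of Hyperion's real `Config` object:

```ruby
# Hypothetical config stand-in exposing only the four h2_* readers.
SketchConfig = Struct.new(:h2_max_concurrent_streams, :h2_initial_window_size,
                          :h2_max_frame_size, :h2_max_header_list_size,
                          keyword_init: true)

# Same shape as Master.build_h2_settings in the diff.
def build_h2_settings(config)
  {
    max_concurrent_streams: config.h2_max_concurrent_streams,
    initial_window_size: config.h2_initial_window_size,
    max_frame_size: config.h2_max_frame_size,
    max_header_list_size: config.h2_max_header_list_size
  }.compact # drop entries the operator set to nil
end

config = SketchConfig.new(h2_max_concurrent_streams: 128,
                          h2_initial_window_size: nil,
                          h2_max_frame_size: 1_048_576,
                          h2_max_header_list_size: nil)
settings = build_h2_settings(config)
all_nil = build_h2_settings(SketchConfig.new)
```

Only the two non-nil entries survive, so the handler never has to encode a nil value into a SETTINGS frame.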

data/lib/hyperion/prometheus_exporter.rb ADDED

@@ -0,0 +1,96 @@
+# frozen_string_literal: true
+module Hyperion
+  # Renders Hyperion.stats as Prometheus text exposition format (v0.0.4).
+  # Mounted by AdminMiddleware on GET /-/metrics; the returned content-type
+  # is `text/plain; version=0.0.4; charset=utf-8`.
+  #
+  # Mapping rules:
+  # - keys listed in KNOWN_METRICS get their canonical name + curated HELP/TYPE
+  # - keys matching `responses_<3-digit>` are grouped under a single
+  #   `hyperion_responses_status_total` family with a `status` label
+  # - any other key is auto-exported as `hyperion_<key>` with a generic HELP
+  #   line, so newly-added counters surface in Prometheus without code changes
+  #   here (the curated-name path is just nicer presentation, not gating)
+  #
+  # Output ordering is deterministic for stable scrape diffs:
+  # - known metrics in KNOWN_METRICS declaration order
+  # - status codes ascending
+  # - other keys alphabetically
+  module PrometheusExporter
+    module_function
+    KNOWN_METRICS = {
+      requests: { name: 'hyperion_requests_total',
+                  help: 'Total HTTP requests handled',
+                  type: 'counter' },
+      bytes_read: { name: 'hyperion_bytes_read_total',
+                    help: 'Total bytes read from request sockets',
+                    type: 'counter' },
+      bytes_written: { name: 'hyperion_bytes_written_total',
+                       help: 'Total bytes written to response sockets',
+                       type: 'counter' },
+      rejected_connections: { name: 'hyperion_rejected_connections_total',
+                              help: 'Connections rejected due to backpressure (max_pending)',
+                              type: 'counter' },
+      sendfile_responses: { name: 'hyperion_sendfile_responses_total',
+                            help: 'Responses sent via plain-TCP sendfile(2) zero-copy path',
+                            type: 'counter' },
+      tls_zerobuf_responses: { name: 'hyperion_tls_zerobuf_responses_total',
+                               help: 'Responses sent via TLS IO.copy_stream (avoids userspace String build, but TLS encryption forces copy)',
+                               type: 'counter' }
+    }.freeze
+    STATUS_KEY_PATTERN = /\Aresponses_(\d{3})\z/
+    STATUS_FAMILY_NAME = 'hyperion_responses_status_total'
+    STATUS_FAMILY_HELP = 'Responses by HTTP status code'
+    def render(stats)
+      buf = +''
+      grouped_status = {}
+      other = {}
+      known = {}
+      stats.each do |key, value|
+        if (match = key.to_s.match(STATUS_KEY_PATTERN))
+          grouped_status[match[1]] = value
+        elsif KNOWN_METRICS.key?(key)
+          known[key] = value
+        else
+          other[key] = value
+        end
+      end
+      # Known metrics first, in declaration order — gives the scrape a stable,
+      # human-friendly preamble regardless of hash insertion order.
+      KNOWN_METRICS.each do |key, meta|
+        next unless known.key?(key)
+        append_metric(buf, meta[:name], meta[:help], meta[:type], known[key])
+      end
+      unless grouped_status.empty?
+        buf << "# HELP #{STATUS_FAMILY_NAME} #{STATUS_FAMILY_HELP}\n"
+        buf << "# TYPE #{STATUS_FAMILY_NAME} counter\n"
+        grouped_status.sort.each do |status, value|
+          buf << %(#{STATUS_FAMILY_NAME}{status="#{status}"} #{value}\n)
+        end
+      end
+      other.sort_by { |k, _| k.to_s }.each do |key, value|
+        name = "hyperion_#{key}"
+        append_metric(buf, name, 'Hyperion internal counter (auto-exported)', 'counter', value)
+      end
+      buf
+    end
+    def append_metric(buf, name, help, type, value)
+      buf << "# HELP #{name} #{help}\n"
+      buf << "# TYPE #{name} #{type}\n"
+      buf << "#{name} #{value}\n"
+    end
+    private_class_method :append_metric
+  end
+end
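The status-code grouping rule is the least obvious part of the exporter, so here it is in isolation. The sketch reuses the pattern and family name from the file above, but `render_status_family` and the sample stats hash are hypothetical. It covers only the `responses_<code>` keys and ignores everything else.

```ruby
STATUS_KEY = /\Aresponses_(\d{3})\z/
FAMILY = 'hyperion_responses_status_total'

# Collapse responses_<code> counters into one labelled metric family.
def render_status_family(stats)
  by_status = stats.each_with_object({}) do |(key, value), acc|
    match = key.to_s.match(STATUS_KEY)
    acc[match[1]] = value if match
  end
  return '' if by_status.empty?

  lines = ["# HELP #{FAMILY} Responses by HTTP status code",
           "# TYPE #{FAMILY} counter"]
  # Sort by status code so repeated scrapes produce stable output.
  by_status.sort.each do |status, value|
    lines << %(#{FAMILY}{status="#{status}"} #{value})
  end
  lines.join("\n") + "\n"
end

sample_stats = { requests: 10, responses_404: 2, responses_200: 8 }
output = render_status_family(sample_stats)
puts output
```

One HELP/TYPE pair followed by one sample per status code is what lets a Prometheus query aggregate across the `status` label instead of matching on metric names.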

data/lib/hyperion/response_writer.rb CHANGED

@@ -36,6 +36,21 @@ module Hyperion
     CRLF_HEADER_VALUE = /[\r\n]/
     def write(io, status, headers, body, keep_alive: false)
+      # Zero-copy fast path: bodies that point at an on-disk file (Rack::Files,
+      # asset servers, signed-download responders) get streamed via
+      # IO.copy_stream which delegates to sendfile(2) on Linux for plain TCP
+      # sockets — bytes go from the file's page cache straight to the socket
+      # buffer with no userspace allocation. For TLS sockets we still avoid the
+      # multi-MB String build, but encryption forces a userspace round-trip so
+      # we count that path separately.
+      return write_sendfile(io, status, headers, body, keep_alive: keep_alive) if body.respond_to?(:to_path)
+      write_buffered(io, status, headers, body, keep_alive: keep_alive)
+    end
+    private
+    def write_buffered(io, status, headers, body, keep_alive:)
       # Phase 1 buffers the full body so Content-Length is exact.
       # Phase 2 introduces chunked transfer-encoding for streaming bodies;
       # Phase 5 batches via IO::Buffer to avoid this intermediate String.
@@ -43,7 +58,7 @@ module Hyperion
       body.each { |chunk| buffered << chunk }
       reason = REASONS[status] || 'Unknown'
-      date_str = Time.now.httpdate
+      date_str = cached_date
       head = build_head(status, reason, headers, buffered.bytesize, keep_alive, date_str)
@@ -67,7 +82,52 @@ module Hyperion
       body.close if body.respond_to?(:close)
     end
-    private
+    def write_sendfile(io, status, headers, body, keep_alive:)
+      path = body.to_path
+      file = File.open(path, 'rb')
+      file_size = file.size
+      # If the app explicitly set content-length, respect it; otherwise use the
+      # real file size. Rack::Files does not pre-set content-length, so the
+      # common case is the File.size branch.
+      content_length = explicit_content_length(headers) || file_size
+      reason = REASONS[status] || 'Unknown'
+      date_str = cached_date
+      head = build_head(status, reason, headers, content_length, keep_alive, date_str)
+      io.write(head)
+      # IO.copy_stream copies up to file_size bytes from the file to the socket.
+      # On Linux + plain TCPSocket this triggers sendfile(2) — kernel-level
+      # zero-copy. On TLS sockets and non-Linux platforms it falls back to
+      # internal read+write loops, but we still avoid building a String the
+      # size of the file in Ruby.
+      copied = IO.copy_stream(file, io, file_size)
+      record_zero_copy_metric(io)
+      Hyperion.metrics.increment(:bytes_written, head.bytesize + copied)
+    ensure
+      file&.close
+      body.close if body.respond_to?(:close)
+    end
+    def explicit_content_length(headers)
+      headers.each do |k, v|
+        return v.to_i if k.to_s.casecmp('content-length').zero?
+      end
+      nil
+    end
+    # Plain TCPSocket → real sendfile(2). TLS-wrapped sockets cannot use
+    # sendfile (kernel can't encrypt) but still avoid the per-response String
+    # allocation, so we track them under a separate counter.
+    def record_zero_copy_metric(io)
+      if defined?(::OpenSSL::SSL::SSLSocket) && io.is_a?(::OpenSSL::SSL::SSLSocket)
+        Hyperion.metrics.increment(:tls_zerobuf_responses)
+      else
+        Hyperion.metrics.increment(:sendfile_responses)
+      end
+    end
     # rc17: prefer the C extension when available — eliminates the per-response
     # status-line interpolation, normalized hash, and per-header String#<<

data/lib/hyperion/server.rb CHANGED

@@ -20,18 +20,40 @@ module Hyperion
     DEFAULT_READ_TIMEOUT_SECONDS = 30
     DEFAULT_THREAD_COUNT         = 5
+    # Pre-built minimal 503 response for the backpressure path. We bypass
+    # ResponseWriter / Rack entirely — no env build, no app dispatch, no
+    # access-log line. The bytes are frozen and reused across every
+    # rejection so the overload path stays allocation-free. Body is JSON
+    # so JSON-only API consumers don't have to special-case the format.
+    REJECT_503 = lambda {
+      body = +%({"error":"server_busy","retry_after_seconds":1}\n)
+      body.force_encoding(Encoding::ASCII_8BIT)
+      head = +"HTTP/1.1 503 Service Unavailable\r\n" \
+              "content-type: application/json\r\n" \
+              "content-length: #{body.bytesize}\r\n" \
+              "retry-after: 1\r\n" \
+              "connection: close\r\n" \
+              "\r\n"
+      head.force_encoding(Encoding::ASCII_8BIT)
+      (head + body).freeze
+    }.call
     attr_reader :host, :port
     def initialize(app:, host: '127.0.0.1', port: 9292, read_timeout: DEFAULT_READ_TIMEOUT_SECONDS,
-                   tls: nil, thread_count: DEFAULT_THREAD_COUNT)
-      @host         = host
-      @port         = port
-      @app          = app
-      @read_timeout = read_timeout
-      @tls          = tls
-      @thread_count = thread_count
-      @thread_pool  = nil
-      @stopped      = false
+                   tls: nil, thread_count: DEFAULT_THREAD_COUNT, max_pending: nil,
+                   max_request_read_seconds: 60, h2_settings: nil)
+      @host                     = host
+      @port                     = port
+      @app                      = app
+      @read_timeout             = read_timeout
+      @tls                      = tls
+      @thread_count             = thread_count
+      @max_pending              = max_pending
+      @max_request_read_seconds = max_request_read_seconds
+      @h2_settings              = h2_settings
+      @thread_pool              = nil
+      @stopped                  = false
     end
     def listen
@@ -83,7 +105,7 @@ module Hyperion
     def start
       listen unless @server
-      @thread_pool = ThreadPool.new(size: @thread_count) if @thread_count.positive?
+      @thread_pool = ThreadPool.new(size: @thread_count, max_pending: @max_pending) if @thread_count.positive?
       if @tls
         # TLS path: ALPN may pick `h2`, and h2 spawns one fiber per stream
@@ -121,9 +143,12 @@ module Hyperion
         apply_timeout(socket)
         if @thread_pool
-          @thread_pool.submit_connection(socket, @app)
+          unless @thread_pool.submit_connection(socket, @app,
+                                                max_request_read_seconds: @max_request_read_seconds)
+            reject_connection(socket)
+          end
         else
-          Connection.new.serve(socket, @app)
+          Connection.new.serve(socket, @app, max_request_read_seconds: @max_request_read_seconds)
         end
       end
     end
@@ -148,15 +173,38 @@ module Hyperion
         # HTTP/2: each stream runs on a fiber inside Http2Handler. The
         # handler still uses the pool's `#call` for app.call hops on each
         # stream (one per stream, not one per connection).
-        Http2Handler.new(app: @app, thread_pool: @thread_pool).serve(socket)
+        Http2Handler.new(app: @app, thread_pool: @thread_pool, h2_settings: @h2_settings).serve(socket)
       elsif @thread_pool
         # HTTP/1.1 (e.g. TLS-wrapped after ALPN picked http/1.1): hand the
         # connection to a worker thread. The fiber that called dispatch
-        # returns immediately.
-        @thread_pool.submit_connection(socket, @app)
+        # returns immediately. On overflow, reject with 503 + close.
+        unless @thread_pool.submit_connection(socket, @app,
+                                              max_request_read_seconds: @max_request_read_seconds)
+          reject_connection(socket)
+        end
       else
         # No pool (thread_count: 0): inline on the calling fiber.
-        Connection.new.serve(socket, @app)
+        Connection.new.serve(socket, @app, max_request_read_seconds: @max_request_read_seconds)
+      end
+    end
+    # Backpressure rejection. Emits a pre-built 503 + closes the socket.
+    # No Rack env, no app dispatch, no access-log line — the overload
+    # path must stay cheap so we don't pile rejection cost on top of the
+    # already-saturated workers. Bumps :rejected_connections so operators
+    # can alert on sustained overload.
+    def reject_connection(socket)
+      socket.write(REJECT_503)
+      Hyperion.metrics.increment(:rejected_connections)
+    rescue StandardError
+      # Client may have hung up between accept and our 503 write — that's
+      # the failure mode we're protecting them from anyway, so swallow.
+      nil
+    ensure
+      begin
+        socket.close
+      rescue StandardError
+        nil
       end
     end
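
Note on the backpressure hunks above: the rejection path writes one pre-built, frozen byte string and closes the socket, swallowing errors from clients that have already gone away. A minimal sketch of that pattern follows. `RejectSketch` and the plain `counters` hash are illustrative stand-ins for the gem's `Server` and `Hyperion.metrics`.

```ruby
require 'stringio'

module RejectSketch
  BODY = %({"error":"server_busy","retry_after_seconds":1}\n).b.freeze
  HEAD = "HTTP/1.1 503 Service Unavailable\r\n" \
         "content-type: application/json\r\n" \
         "content-length: #{BODY.bytesize}\r\n" \
         "retry-after: 1\r\n" \
         "connection: close\r\n" \
         "\r\n"
  # Built once at load time and frozen, so every rejection reuses the same bytes.
  RESPONSE = (HEAD + BODY).b.freeze

  # Writes the canned 503, bumps a counter, and always closes the socket.
  def self.reject(socket, counters)
    socket.write(RESPONSE)
    counters[:rejected_connections] += 1
  rescue StandardError
    # The peer may have hung up before the write; nothing useful to do.
    nil
  ensure
    begin
      socket.close
    rescue StandardError
      nil
    end
  end
end
```

As in the diff, the counter is only incremented when the write succeeds, so a client that disconnects first is not counted as a rejection.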

data/lib/hyperion/thread_pool.rb CHANGED

@@ -26,11 +26,12 @@ module Hyperion
   class ThreadPool
     SHUTDOWN = :__hyperion_thread_pool_shutdown__
-    attr_reader :size
+    attr_reader :size, :max_pending
-    def initialize(size:)
-      @size       = size
-      @inbox      = Queue.new # multiplexes both kinds of jobs
+    def initialize(size:, max_pending: nil)
+      @size        = size
+      @max_pending = max_pending
+      @inbox       = Queue.new # multiplexes both kinds of jobs
       # Pre-allocate one reply queue per in-flight slot for the legacy `#call`
       # path. Bounded by `size`: if all workers are busy, all reply queues are
       # checked out, and the next caller blocks on `@reply_pool.pop` until a
@@ -43,8 +44,23 @@ module Hyperion
     # HTTP/1.1 path: hand the whole socket to a worker thread. The worker
     # runs `Connection#serve(socket, app)` directly. No per-request hop.
     # Returns immediately — caller does not wait.
-    def submit_connection(socket, app)
-      @inbox << [:connection, socket, app]
+    #
+    # Returns true on enqueue, false on rejection. When `max_pending` is set
+    # and the inbox already has at least that many entries, the connection
+    # is rejected up to the caller (Server emits a 503 and closes the
+    # socket). Without `max_pending` (default nil) the queue is unbounded
+    # and we always return true — preserves pre-1.2 behaviour.
+    #
+    # The check is inherently racy with worker drain — workers may pop
+    # between our `size` read and the `<<`. Backpressure is statistical,
+    # not strict. Off-by-one over the configured cap during a thundering
+    # accept burst is acceptable; the cost of stricter sync would be a
+    # mutex on every enqueue, which we won't pay on the hot path.
+    def submit_connection(socket, app, max_request_read_seconds: 60)
+      return false if @max_pending && @inbox.size >= @max_pending
+      @inbox << [:connection, socket, app, max_request_read_seconds]
+      true
     end
     # HTTP/2 + sub-call path: hop one `app.call` from the calling fiber to a
@@ -78,12 +94,12 @@ module Hyperion
           case job[0]
           when :connection
-            _, socket, app = job
+            _, socket, app, max_request_read_seconds = job
             # Worker thread owns the connection for its full lifetime. Pass
             # thread_pool: nil so Connection#call_app inlines Adapter::Rack.call
             # — the worker IS the pool, no further hop required.
             begin
-              Hyperion::Connection.new.serve(socket, app)
+              Hyperion::Connection.new.serve(socket, app, max_request_read_seconds: max_request_read_seconds)
             rescue StandardError => e
               Hyperion.logger.error do
                 {
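
Note on the `submit_connection` hunk above: the cap is a non-atomic check of the queue size followed by a push, which is why the comment calls the backpressure statistical rather than strict. The sketch below isolates that check. `BoundedInbox` is an illustrative name; the real pool also runs worker threads and handles a second job kind, both omitted here.

```ruby
module BackpressureSketch
  class BoundedInbox
    def initialize(max_pending: nil)
      @max_pending = max_pending
      @inbox = Queue.new
    end

    # Returns true when the job was enqueued and false when it was rejected.
    # The size check and the push are not atomic, so concurrent submitters
    # can briefly exceed the cap; no mutex is taken on this path.
    def submit(job)
      return false if @max_pending && @inbox.size >= @max_pending

      @inbox << job
      true
    end

    # Stand-in for a worker thread taking the next job.
    def pop
      @inbox.pop
    end
  end
end
```

With `max_pending: nil` the queue is unbounded and `submit` always returns true, matching the pre-1.2 behaviour described in the diff.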

data/lib/hyperion/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Hyperion
-  VERSION = '1.1.0'
+  VERSION = '1.2.0'
 end

data/lib/hyperion/worker.rb CHANGED

@@ -18,16 +18,21 @@ module Hyperion
   class Worker
     def initialize(host:, port:, app:, read_timeout:, tls: nil,
                    thread_count: Server::DEFAULT_THREAD_COUNT,
-                   config: nil, worker_index: 0, listener: nil)
-      @host         = host
-      @port         = port
-      @app          = app
-      @read_timeout = read_timeout
-      @tls          = tls
-      @thread_count = thread_count
-      @config       = config || Hyperion::Config.new
-      @worker_index = worker_index
-      @listener     = listener
+                   config: nil, worker_index: 0, listener: nil,
+                   max_pending: nil, max_request_read_seconds: 60,
+                   h2_settings: nil)
+      @host                     = host
+      @port                     = port
+      @app                      = app
+      @read_timeout             = read_timeout
+      @tls                      = tls
+      @thread_count             = thread_count
+      @config                   = config || Hyperion::Config.new
+      @worker_index             = worker_index
+      @listener                 = listener
+      @max_pending              = max_pending
+      @max_request_read_seconds = max_request_read_seconds
+      @h2_settings              = h2_settings
     end
     def run
@@ -43,7 +48,10 @@ module Hyperion
       server = Server.new(host: @host, port: @port, app: @app,
                           read_timeout: @read_timeout, tls: @tls,
-                          thread_count: @thread_count)
+                          thread_count: @thread_count,
+                          max_pending: @max_pending,
+                          max_request_read_seconds: @max_request_read_seconds,
+                          h2_settings: @h2_settings)
       tcp_server = @listener || build_reuseport_listener
       server.adopt_listener(tcp_server)

data/lib/hyperion.rb CHANGED

@@ -63,6 +63,44 @@ module Hyperion
         else true # default ON
         end
     end
+    # Pre-fork warmup. Run by Master and CLI single-mode BEFORE children are
+    # forked (or before the lone worker starts accepting). Pre-allocates the
+    # Rack adapter's object pools and eager-touches lazily-resolved constants
+    # so each forked child inherits warm memory via copy-on-write — the first
+    # N requests on a fresh worker no longer pay the allocation / autoload
+    # tax that would otherwise serialize behind the GVL on cold start.
+    #
+    # Idempotent — second and later calls are no-ops. Failures are swallowed
+    # with a warn log: warmup is an optimization, not a correctness gate.
+    # If, for instance, OpenSSL can't be required in some odd environment,
+    # we'd rather start cold than refuse to boot.
+    def warmup!
+      return if @warmed
+      @warmed = true
+      if defined?(::Hyperion::Adapter::Rack) && ::Hyperion::Adapter::Rack.respond_to?(:warmup_pool)
+        ::Hyperion::Adapter::Rack.warmup_pool(8)
+      end
+      # Touch the C extension's response-head builder so its lazily-initialized
+      # internal state runs in the master, not in every child after fork.
+      ::Hyperion::CParser.respond_to?(:build_response_head) if defined?(::Hyperion::CParser)
+      # Eager-load TLS / SSLSocket. The sendfile path's `is_a?` check would
+      # otherwise trigger autoload in the worker on the first TLS response.
+      require 'openssl'
+      defined?(::OpenSSL::SSL::SSLSocket) && ::OpenSSL::SSL::SSLSocket.name
+      # Force Ruby's tzinfo / strftime-cache load by emitting one httpdate.
+      # Subsequent calls hit the per-thread `cached_date` slot in response_writer.
+      Time.now.httpdate
+      nil
+    rescue StandardError => e
+      Hyperion.logger.warn { { message: 'warmup failed (non-fatal)', error: e.message } }
+      nil
+    end
   end
 end
@@ -89,6 +127,7 @@ require_relative 'hyperion/request'
 require_relative 'hyperion/parser'
 require_relative 'hyperion/c_parser'
 require_relative 'hyperion/adapter/rack'
+require_relative 'hyperion/prometheus_exporter'
 require_relative 'hyperion/admin_middleware'
 require_relative 'hyperion/response_writer'
 require_relative 'hyperion/thread_pool'
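
Note on the `warmup!` hunk above: the method is guarded by a flag so repeat calls are no-ops, and failures are logged rather than raised. The sketch below shows that shape in isolation. `WarmupSketch` and its `runs` counter are illustrative, and `Kernel#warn` stands in for the gem's logger. One difference is deliberate: a failed `require` raises `LoadError`, which descends from `ScriptError` rather than `StandardError`, so the sketch rescues it explicitly.

```ruby
require 'time'

module WarmupSketch
  class << self
    # Number of times the warmup body actually ran (for demonstration).
    attr_reader :runs

    def warmup!
      return if @warmed

      @warmed = true
      @runs = (@runs || 0) + 1
      # Load optional dependencies up front so a forked child does not pay
      # for them on its first request.
      require 'openssl'
      # Exercise the date formatting path once before serving traffic.
      Time.now.httpdate
      nil
    rescue StandardError, LoadError => e
      # Warmup is an optimization; report the failure and carry on.
      warn "warmup failed (non-fatal): #{e.message}"
      nil
    end
  end
end
```

Setting the flag before doing the work means a failed warmup is not retried, which matches the order of operations in the diff.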

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: hyperion-rb
 version: !ruby/object:Gem::Version
-  version: 1.1.0
+  version: 1.2.0
 platform: ruby
 authors:
 - Andrey Lobanov
@@ -160,6 +160,7 @@ files:
 - lib/hyperion/metrics.rb
 - lib/hyperion/parser.rb
 - lib/hyperion/pool.rb
+- lib/hyperion/prometheus_exporter.rb
 - lib/hyperion/request.rb
 - lib/hyperion/response_writer.rb
 - lib/hyperion/server.rb