hyperion-rb 1.0.1 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 65702226f151c3ba8314ac5927575595aeda57a522d045cc61feaf04f87bae53
4
- data.tar.gz: 704454f1d7b5e484b70bc09baf7047d9a6760f9a1d824402092455c4af63a8d8
3
+ metadata.gz: 4174d7143559b6bd05bdc78acf4377add8aca32f885e933786c50f31c956e9ba
4
+ data.tar.gz: f163a7f5bd2b363f37205e1f1ba845fb0324c329cc15b4c1144e6d519a1bc60a
5
5
  SHA512:
6
- metadata.gz: e210752143a969e8070de69bcf81bbc40dd509d827cb8f70dbcf44bdc1fb3692c9110000a732e2bdeb6185ea52fd07416f3d1fdc56005c5fd4dcca1e99e60fd3
7
- data.tar.gz: 0fa86fc58fa590087cb84948569be3968a4e7515a9fd8f9de28cc191bb6c8ddb8b5d016fa4b66ed142aaa01555357320ba62bbc7bc71abf3a96cbbfd4c3be457
6
+ metadata.gz: ea61b5e3298ae50b9b6530d51e1f9a5299b0ccfea3b99248230a601a96ebaf764b5d7978215e09a7d73ed7e85ee3f8b5f7d13d40a830ca5c4482a9d192b2919a
7
+ data.tar.gz: ed8e125b2ff0c9aab53f3178d0f31d1b0db028f8ebf3a40d09ba11e86c3a62756a3c15c7eb3b288faf8dee6f1062159d372cb0c08997a76fedcb97d485d87283
data/CHANGELOG.md CHANGED
@@ -1,5 +1,49 @@
1
1
  # Changelog
2
2
 
3
+ ## [1.2.0] - 2026-04-27
4
+
5
+ Production hardening + perf round 2. No breaking changes.
6
+
7
+ ### Added
8
+ - **Zero-copy sendfile path** — when a Rack body responds to `#to_path` (e.g. `Rack::Files`, asset uploads), `ResponseWriter` uses `IO.copy_stream(file, socket)` which triggers `sendfile(2)` on Linux for plain TCP. Eliminates the ~MB-sized String allocation per static-asset response. Falls back to userspace copy on TLS / non-Linux but still avoids the userspace String build. New metrics: `:sendfile_responses`, `:tls_zerobuf_responses`.
9
+ - **Hot fork warmup (`Hyperion.warmup!`)** — master pre-allocates the Rack env Hash pool, primes the C extension's lazy state, and touches commonly-resolved constants before `before_fork`. Workers inherit the warm pools via Copy-on-Write. Removes first-N-requests-after-fork allocation tax.
10
+ - **Backpressure (`max_pending`)** — when the thread pool's inbox queue exceeds the configured depth, new accepts get HTTP 503 + `Retry-After: 1` and the socket is closed immediately (no Rack dispatch, no access-log line). Default off (nil); opt in by setting an Integer. New metric: `:rejected_connections`.
11
+ - **Prometheus exporter** — `AdminMiddleware` now serves `GET /-/metrics` in addition to `POST /-/quit` (same token). Renders `Hyperion.stats` as Prometheus text exposition v0.0.4. Counter names follow the `hyperion_<key>_total` convention; `:responses_<code>` keys are grouped under `hyperion_responses_status_total{status="<code>"}`.
12
+ - **Slow-client total-deadline (`max_request_read_seconds`)** — per-request wallclock cap on the request-line + headers read phase (default 60s). Defense-in-depth against slowloris: a malicious client can no longer dribble 1 byte per `read_timeout` window indefinitely. On overrun, Hyperion writes 408 + closes. Resets per request on keep-alive sessions. New metric: `:slow_request_aborts`.
13
+ - **HTTP/2 SETTINGS tuning** — Falcon-class defaults shipped: `MAX_CONCURRENT_STREAMS=128`, `INITIAL_WINDOW_SIZE=1MiB`, `MAX_FRAME_SIZE=1MiB`, `MAX_HEADER_LIST_SIZE=64KiB`. All four overridable via Config DSL (`h2_max_concurrent_streams` etc). Out-of-spec values are clamped + warned, not crashed.
14
+ - **`docs/REVERSE_PROXY.md`** — nginx + AWS ALB samples, X-Forwarded-* semantics, admin-endpoint hardening at the edge. Includes the documented gotcha that ALB-to-target HTTP/2 strips WebSocket upgrade headers (use HTTP/1.1 upstream).
15
+
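The naming convention in the Prometheus exporter entry can be sketched as follows. This is a minimal illustration, assuming `Hyperion.stats` is a flat Hash of Symbol keys to Integer counters. The method name `render_prometheus` is hypothetical and is not the gem's `PrometheusExporter`; the `# TYPE` lines and the output ordering are illustrative choices.

```ruby
# Hypothetical stand-in for the exporter: turns a flat counter Hash into
# Prometheus text exposition lines.
def render_prometheus(stats)
  plain = []
  by_status = []

  stats.each do |key, value|
    name = key.to_s
    if (m = name.match(/\Aresponses_(\d{3})\z/))
      # :responses_<code> keys share one metric, distinguished by a label.
      by_status << %(hyperion_responses_status_total{status="#{m[1]}"} #{value})
    else
      metric = "hyperion_#{name}_total"
      plain << "# TYPE #{metric} counter"
      plain << "#{metric} #{value}"
    end
  end

  # Emit the TYPE header for the grouped metric once, only if it is used.
  by_status.unshift('# TYPE hyperion_responses_status_total counter') unless by_status.empty?

  (plain + by_status).join("\n") + "\n"
end

sample = { rejected_connections: 2, responses_200: 40, responses_503: 2 }
puts render_prometheus(sample)
```

Grouping the per-status counters under one labelled metric lets a dashboard sum or filter by `status` without knowing every code in advance.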
16
+ ### Changed
17
+ - **`ResponseWriter` Date header now uses `cached_date`** — the per-thread, per-second cache that landed in 1.1.0 was never wired into the hot path. It is now. Eliminates ~3 String allocations per response (`Time.now.httpdate` → cached String reuse).
18
+ - **`AdminMiddleware`** refactored: shared `authorize` helper between `/-/quit` and `/-/metrics`; `PATH` constant split into `PATH_QUIT` + `PATH_METRICS`.
19
+ - **`Hyperion::Logger` per-thread access buffer key** is now namespaced per Logger instance (already shipped as a 1.1.0 follow-up fix; documented here for completeness).
20
+
21
+ ### Fixed
22
+ - N/A — no regressions discovered between 1.1.0 and 1.2.0.
23
+
24
+ ## [1.1.0] - 2026-04-27
25
+
26
+ First minor release after 1.0.0. Production hardening + perf wins, no breaking changes.
27
+
28
+ ### Added
29
+ - **HTTP/2 §8.1.2 semantic validation** — Hyperion now rejects malformed `:method` / `:path` / `:scheme` pseudo-headers, connection-specific headers (`connection`, `te`, `transfer-encoding`, `keep-alive`, `upgrade`, `proxy-connection`), and inconsistent `content-length` framing with `RST_STREAM PROTOCOL_ERROR`. h2spec conformance pass rate is now 100% on the §8.1.2 suite (was 76.7% in 1.0.x).
30
+ - **Worker recycling (`worker_max_rss_mb`)** — master polls each child's RSS via `/proc/<pid>/statm` (Linux) or `ps -o rss=` (macOS/BSD) every `worker_check_interval` seconds (default 30s). Workers exceeding the configured RSS ceiling are gracefully cycled (SIGTERM, drain, respawn). Disabled when `worker_max_rss_mb` is nil.
31
+ - **Admin drain endpoint (`POST /-/quit`)** — token-protected Rack middleware that triggers the same SIGTERM-driven graceful shutdown as the signal path. Disabled by default; mount by setting `admin_token` in the Hyperion config DSL. Auth via `X-Hyperion-Admin-Token` header (constant-time comparison). Returns 202 + `{"status":"draining"}` on success, 401 on missing/wrong token.
32
+ - **YJIT auto-enable** — Hyperion enables YJIT automatically in production/staging environments (`RAILS_ENV` / `RACK_ENV` / `HYPERION_ENV`). Override with the `yjit` config setting (true/false) or `--[no-]yjit` CLI flag. No-op on Rubies built without YJIT.
33
+ - **C-extension access-log line builder** (`Hyperion::CParser.build_access_line`) — single-allocation line construction in C, ~10× faster than the Ruby interpolation path. Auto-selected on non-TTY destinations (production); colored TTY runs keep the Ruby fallback.
34
+ - **Date-header cache** — per-thread, per-second cache of `Time.now.httpdate` in `ResponseWriter`. Eliminates ~3 String allocations per response.
35
+ - **`bytes_read` / `bytes_written` metrics** — counters exposed via `Hyperion.stats` for connection-level bandwidth monitoring.
36
+ - **`Hyperion.c_parser_available?`** module accessor + boot-time warn line if the llhttp C extension didn't load (so operators running production with the slower pure-Ruby fallback notice immediately).
37
+ - **`MIGRATING_FROM_PUMA.md`** — operator guide covering config translation, lifecycle hook mapping, signal differences, and observability gaps.
38
+ - **Concurrency-at-scale benchmarks** — README now documents 10 000-connection keep-alive throughput and h2 multiplexing numbers vs Puma/Falcon.
39
+
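The Date-header cache entry describes a per-thread, per-second cache of `Time.now.httpdate`. A minimal sketch of that idea follows; the module name `DateCache` and the storage key are hypothetical and are not taken from Hyperion's `ResponseWriter`.

```ruby
require 'time'

# Hypothetical per-thread, per-second cache of the HTTP Date header value.
module DateCache
  KEY = :__example_cached_date__ # illustrative key name, not Hyperion's

  def self.cached_date(now = Time.now)
    # Slot layout: [epoch_second, formatted_string].
    slot = (Thread.current[KEY] ||= [nil, nil])
    sec = now.to_i
    unless slot[0] == sec
      # First call in this second: format once and keep the String.
      slot[0] = sec
      slot[1] = now.httpdate.freeze
    end
    slot[1]
  end
end

t = Time.at(1_700_000_000)
first = DateCache.cached_date(t)
again = DateCache.cached_date(t + 0.5)
puts first
puts first.equal?(again) # same String object within one second
```

Every response written within the same second reuses one frozen String, so the formatting cost is paid at most once per second per thread.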
40
+ ### Changed
41
+ - **Plain HTTP/1.1 accept loop bypasses Async** — when no TLS is configured, Hyperion uses a raw `IO.select` + `accept_nonblock` loop instead of wrapping the loop in an Async task. Worker-owns-connection semantics are unchanged. Removes ~2 µs of fiber-scheduler overhead from the hot accept path.
42
+
43
+ ### Fixed
44
+ - **Lost shutdown log lines under SIGTERM** — `Master#shutdown_children` and `CLI.run_single` now call `Logger#flush_all`, which walks every per-thread access-log buffer registered through the Logger and `IO#flush`es both stdout and stderr before the process exits. Operators no longer have to chase missing `master draining` / `master exiting` lines after a graceful shutdown.
45
+ - **Cross-instance Logger buffer leak** — per-thread access-log buffers are now namespaced per Logger instance (`:"__hyperion_access_buf_<oid>__"`). Previously a globally-shared key meant a buffer registered against an early Logger could be written to by a later Logger whose `flush_all` couldn't see it. The hot path remains a single `Thread.current` read.
46
+
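The cross-instance buffer fix can be illustrated with a small sketch. It assumes each logger keeps a registry of the buffers created through it; the class `ExampleLogger` and its key format are hypothetical and only mirror the `:"__hyperion_access_buf_<oid>__"` idea from the entry above.

```ruby
# Hypothetical logger showing a per-instance thread-local buffer key.
class ExampleLogger
  def initialize
    # One Symbol per instance, built once so the hot path is a single lookup.
    @buf_key = :"__example_access_buf_#{object_id}__"
    @buffers = []
    @lock = Mutex.new
  end

  def access(line)
    buf = Thread.current[@buf_key] ||= register_buffer
    buf << line
  end

  # Drains every buffer registered through this instance and returns the text.
  def flush_all
    @lock.synchronize { @buffers.map { |b| b.slice!(0, b.length) }.join }
  end

  private

  def register_buffer
    buf = +''
    @lock.synchronize { @buffers << buf }
    buf
  end
end

first = ExampleLogger.new
second = ExampleLogger.new
first.access("a\n")
second.access("b\n")
p first.flush_all  # only lines written through `first`
p second.flush_all # only lines written through `second`
```

With a globally shared key, `second.access` would have appended to the buffer that only `first` knows how to flush, which is the leak the changelog describes.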
3
47
  ## [1.0.1] - 2026-04-26
4
48
 
5
49
  ### Fixed
data/README.md CHANGED
@@ -77,6 +77,35 @@ Health endpoint that traverses the full middleware chain (rack-attack, locale re
77
77
 
78
78
  On Grape and Rails-controller workloads Puma hits wrk's 2 s timeout cap on ~⅔ of requests — its real p99 is censored above 2 s. Hyperion serves all of its requests under 1.2 s with 0 to 16 timeouts. **1.14–1.48× Puma throughput** depending on endpoint.
79
79
 
80
+ ### Concurrency at scale (architectural advantages)
81
+
82
+ These workloads demonstrate structural differences between Hyperion's fiber-per-connection / fiber-per-stream model and Puma's thread-pool model. Numbers are illustrative; the architecture is what matters. Run on Ubuntu 24.04 / Ruby 3.3.3, single worker, h2load `-c <conns> -n 100000 --rps 1000 --h1`.
83
+
84
+ **5,000 concurrent keep-alive connections (50,000 requests):**
85
+
86
+ | | succeeded | r/s | wall | master RSS |
87
+ |---|---:|---:|---:|---:|
88
+ | Hyperion `-w 1 -t 10` | 50,000 / 50,000 | 3,460 | 14.45 s | 53.5 MB |
89
+ | Puma `-w 1 -t 10:10` | 50,000 / 50,000 | 1,762 | 28.37 s | 36.9 MB |
90
+
91
+ **10,000 concurrent keep-alive connections (100,000 requests):**
92
+
93
+ | | succeeded | failed | r/s | wall |
94
+ |---|---:|---:|---:|---:|
95
+ | Hyperion `-w 1 -t 10` | 93,090 | 6,910 | 3,446 | 27.01 s |
96
+ | Puma `-w 1 -t 10:10` | 77,340 | 22,660 | 706 | 109.59 s |
97
+
98
+ Hyperion holds each connection in a ~1 KB fiber stack; Puma needs an OS thread (~1–8 MB each, capped at `max_threads`). At 10k concurrent connections Hyperion serves **~5× the throughput** of Puma (3,446 vs 706 r/s) with **~70% fewer failed requests** (6,910 vs 22,660), while the per-connection bookkeeping cost is bounded by fiber size, not by `max_threads`.
99
+
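The ratios quoted for the 10,000-connection run can be recomputed from the table. The snippet below is only a worked check of that arithmetic; the variable names are illustrative.

```ruby
# Figures copied from the 10,000-connection table above.
hyperion = { ok: 93_090, failed: 6_910, rps: 3_446 }
puma     = { ok: 77_340, failed: 22_660, rps: 706 }

throughput_ratio = hyperion[:rps].to_f / puma[:rps]
fewer_failures   = 1.0 - hyperion[:failed].to_f / puma[:failed]
more_successes   = hyperion[:ok].to_f / puma[:ok] - 1.0

puts format('throughput ratio: %.2fx', throughput_ratio)
puts format('fewer failed requests: %.1f%%', fewer_failures * 100)
puts format('more successful requests: %.1f%%', more_successes * 100)
```

Measured this way, Hyperion has roughly 4.9 times Puma's throughput, about 70% fewer failed requests, and about 20% more successful requests.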
100
+ **HTTP/2 multiplexing — 1 connection × 100 concurrent streams (handler sleeps 50 ms):**
101
+
102
+ | | wall time |
103
+ |---|---:|
104
+ | Hyperion (per-stream fiber dispatch) | **1.04 s** |
105
+ | Serial baseline (100 × 50 ms) | 5.00 s |
106
+
107
+ Hyperion fans 100 in-flight streams across separate fibers within a single TCP connection. A serial server would take 5 s; the fiber-multiplexed result (1.04 s, ~96 req/s on one socket) is bounded by single-handler sleep time plus framing overhead. Puma has no native HTTP/2 path — production deployments terminate h2 at nginx and forward h1 to the worker pool, which serializes again.
108
+
80
109
  ### Reproduce
81
110
 
82
111
  ```sh
@@ -107,6 +136,8 @@ curl --http2 -k https://127.0.0.1:9443/
107
136
 
108
137
  `bundle exec rake spec` (and the `default` task) auto-invoke `compile`, so a fresh checkout just needs `bundle install && bundle exec rake` to get a green run.
109
138
 
139
+ **Migrating from Puma?** See [docs/MIGRATING_FROM_PUMA.md](docs/MIGRATING_FROM_PUMA.md).
140
+
110
141
  ## Configuration
111
142
 
112
143
  Three layers, in precedence order: explicit CLI flag > environment variable > `config/hyperion.rb` > built-in default.
@@ -244,7 +275,7 @@ Smuggling defenses for HTTP/1.1: `Content-Length` + `Transfer-Encoding` together
244
275
 
245
276
  ## Compatibility
246
277
 
247
- - **Ruby 3.2+** required.
278
+ - **Ruby 3.3+** required (the `protocol-http2 ~> 0.26` transitive dep imposes this floor; older Ruby installs error at `bundle install`).
248
279
  - **Rack 3** (auto-sets `SERVER_SOFTWARE`, `rack.version`, `REMOTE_ADDR`, IPv6-safe `Host` parsing, CRLF guard).
249
280
  - **`Hyperion::FiberLocal.install!`** opt-in shim for older Rails apps that store request-scoped data via `Thread.current.thread_variable_*` (modern Rails 7.1+ already uses Fiber storage natively; the shim handles the residual footgun).
250
281
  - **`Hyperion::FiberLocal.verify_environment!`** runtime check that `Thread.current[:k]` is fiber-local on the current Ruby (it is on 3.2+).
@@ -404,6 +404,145 @@ static VALUE cbuild_response_head(VALUE self, VALUE rb_status, VALUE rb_reason,
404
404
  return buf;
405
405
  }
406
406
 
407
+ /* Hyperion::CParser.build_access_line(format, ts, method, path, query,
408
+ * status, duration_ms, remote_addr,
409
+ * http_version) -> String
410
+ *
411
+ * Hand-rolled access-log line builder used by Hyperion::Logger#access on the
412
+ * hot path. The Ruby version allocates 1-2 throwaway Strings per line; this
413
+ * builds the line directly into one growable Ruby String (rb_str_cat grows it
414
+ * when needed) and returns that single String. ~10× faster on the
415
+ * common case, which closes the perf gap between log_requests on/off.
416
+ *
417
+ * `format` is :text or :json (Symbol). The format strings here mirror
418
+ * Logger#build_access_text / #build_access_json byte-for-byte (no colour —
419
+ * the C builder is only used when @colorize is false, i.e. non-TTY production
420
+ * deployments where access logs are the highest-volume log line).
421
+ *
422
+ * String inputs are passed through verbatim. Access logs are best-effort
423
+ * structured output, not a security boundary; CRLF in path/remote_addr would
424
+ * be a log-injection nuisance but cannot escalate. Status (int) and
425
+ * duration_ms (double/Numeric) go through snprintf, which is type-safe.
426
+ */
427
+ static VALUE cbuild_access_line(VALUE self,
428
+ VALUE format_sym, VALUE rb_ts, VALUE rb_method,
429
+ VALUE rb_path, VALUE rb_query, VALUE rb_status,
430
+ VALUE rb_duration, VALUE rb_remote,
431
+ VALUE rb_http_version) {
432
+ (void)self;
433
+ Check_Type(rb_ts, T_STRING);
434
+ Check_Type(rb_method, T_STRING);
435
+ Check_Type(rb_path, T_STRING);
436
+ Check_Type(rb_http_version, T_STRING);
437
+
438
+ int is_json = (TYPE(format_sym) == T_SYMBOL) &&
439
+ (SYM2ID(format_sym) == rb_intern("json"));
440
+
441
+ int status = NUM2INT(rb_status);
442
+ double dur_ms = NUM2DBL(rb_duration);
443
+
444
+ int has_query = !NIL_P(rb_query) && RSTRING_LEN(rb_query) > 0;
445
+ int has_remote = !NIL_P(rb_remote) && RSTRING_LEN(rb_remote) > 0;
446
+
447
+ /* 512-byte initial buffer covers the vast majority of access-log lines
448
+ * (timestamp + level + path + status + addr ~= 200 bytes). rb_str_cat
449
+ * grows on overflow.
450
+ *
451
+ * We use a CAT_LIT macro for literal-string appends so the compiler
452
+ * computes length via sizeof — manual byte counts on hand-rolled
453
+ * literal lengths are an off-by-one waiting to happen. */
454
+ #define CAT_LIT(b, s) rb_str_cat((b), (s), (long)(sizeof(s) - 1))
455
+
456
+ VALUE buf = rb_str_buf_new(512);
457
+
458
+ if (is_json) {
459
+ /* Prefix: {"ts":"...","level":"info","source":"hyperion","message":"request", */
460
+ CAT_LIT(buf, "{\"ts\":\"");
461
+ rb_str_cat(buf, RSTRING_PTR(rb_ts), RSTRING_LEN(rb_ts));
462
+ CAT_LIT(buf, "\",\"level\":\"info\",\"source\":\"hyperion\",\"message\":\"request\",");
463
+ CAT_LIT(buf, "\"method\":\"");
464
+ rb_str_cat(buf, RSTRING_PTR(rb_method), RSTRING_LEN(rb_method));
465
+ CAT_LIT(buf, "\",\"path\":\"");
466
+ rb_str_cat(buf, RSTRING_PTR(rb_path), RSTRING_LEN(rb_path));
467
+ CAT_LIT(buf, "\"");
468
+
469
+ if (has_query) {
470
+ CAT_LIT(buf, ",\"query\":\"");
471
+ rb_str_cat(buf, RSTRING_PTR(rb_query), RSTRING_LEN(rb_query));
472
+ CAT_LIT(buf, "\"");
473
+ }
474
+
475
+ char num[64];
476
+ int n = snprintf(num, sizeof(num), ",\"status\":%d,\"duration_ms\":%g,",
477
+ status, dur_ms);
478
+ rb_str_cat(buf, num, n);
479
+
480
+ if (has_remote) {
481
+ CAT_LIT(buf, "\"remote_addr\":\"");
482
+ rb_str_cat(buf, RSTRING_PTR(rb_remote), RSTRING_LEN(rb_remote));
483
+ CAT_LIT(buf, "\",");
484
+ } else {
485
+ CAT_LIT(buf, "\"remote_addr\":null,");
486
+ }
487
+
488
+ CAT_LIT(buf, "\"http_version\":\"");
489
+ rb_str_cat(buf, RSTRING_PTR(rb_http_version), RSTRING_LEN(rb_http_version));
490
+ CAT_LIT(buf, "\"}\n");
491
+ } else {
492
+ /* text: "<ts> INFO [hyperion] message=request method=... path=... [query=...] status=... duration_ms=... remote_addr=... http_version=...\n" */
493
+ rb_str_cat(buf, RSTRING_PTR(rb_ts), RSTRING_LEN(rb_ts));
494
+ CAT_LIT(buf, " INFO [hyperion] message=request method=");
495
+ rb_str_cat(buf, RSTRING_PTR(rb_method), RSTRING_LEN(rb_method));
496
+ CAT_LIT(buf, " path=");
497
+ rb_str_cat(buf, RSTRING_PTR(rb_path), RSTRING_LEN(rb_path));
498
+
499
+ if (has_query) {
500
+ /* Mirror Logger#quote_if_needed: quote if value contains
501
+ * whitespace, '"', or '='. Hot path skips quoting. */
502
+ const char *q_ptr = RSTRING_PTR(rb_query);
503
+ long q_len = RSTRING_LEN(rb_query);
504
+ int need_quote = 0;
505
+ for (long j = 0; j < q_len; j++) {
506
+ char c = q_ptr[j];
507
+ if (c == ' ' || c == '\t' || c == '\n' || c == '\r' ||
508
+ c == '"' || c == '=') {
509
+ need_quote = 1;
510
+ break;
511
+ }
512
+ }
513
+ if (need_quote) {
514
+ /* Defer to Ruby's String#inspect for correct quoting. */
515
+ VALUE quoted = rb_funcall(rb_query, rb_intern("inspect"), 0);
516
+ CAT_LIT(buf, " query=");
517
+ rb_str_cat(buf, RSTRING_PTR(quoted), RSTRING_LEN(quoted));
518
+ } else {
519
+ CAT_LIT(buf, " query=");
520
+ rb_str_cat(buf, q_ptr, q_len);
521
+ }
522
+ }
523
+
524
+ char num[80];
525
+ /* Use %g to match the existing Ruby format which interpolates
526
+ * Float#to_s (no fixed precision). Status is an int. */
527
+ int n = snprintf(num, sizeof(num), " status=%d duration_ms=%g remote_addr=",
528
+ status, dur_ms);
529
+ rb_str_cat(buf, num, n);
530
+
531
+ if (has_remote) {
532
+ rb_str_cat(buf, RSTRING_PTR(rb_remote), RSTRING_LEN(rb_remote));
533
+ } else {
534
+ CAT_LIT(buf, "nil");
535
+ }
536
+
537
+ CAT_LIT(buf, " http_version=");
538
+ rb_str_cat(buf, RSTRING_PTR(rb_http_version), RSTRING_LEN(rb_http_version));
539
+ CAT_LIT(buf, "\n");
540
+ }
541
+
542
+ return buf;
543
+ }
544
+ #undef CAT_LIT
545
+
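For readers who do not want to trace the C, the text branch above can be approximated in Ruby. This is a sketch under stated assumptions, not the gem's `Logger#build_access_text`: the method names are hypothetical, the timestamp format in the usage line is invented, and `%g` is used for the duration to match the C path.

```ruby
# Hypothetical Ruby mirror of the text branch of the C builder.
def quote_if_needed(value)
  # Quote only when the value contains whitespace, '"' or '='.
  value.match?(/[ \t\n\r"=]/) ? value.inspect : value
end

def build_access_text(ts, method, path, query, status, duration_ms, remote_addr, http_version)
  line = +"#{ts} INFO [hyperion] message=request method=#{method} path=#{path}"
  # The query field is omitted entirely when nil or empty.
  line << " query=#{quote_if_needed(query)}" unless query.nil? || query.empty?
  line << format(' status=%d duration_ms=%g remote_addr=', status, duration_ms)
  # A missing remote address is written as the literal text "nil".
  line << (remote_addr.nil? || remote_addr.empty? ? 'nil' : remote_addr)
  line << " http_version=#{http_version}\n"
end

print build_access_text('2026-04-27T00:00:00Z', 'GET', '/search', 'q=a b', 200, 1.5,
                        '10.0.0.1', 'HTTP/1.1')
```

As in the C version, only the query value is checked for quoting; the other string fields are appended verbatim.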
407
546
  void Init_hyperion_http(void) {
408
547
  install_settings();
409
548
 
@@ -416,6 +555,8 @@ void Init_hyperion_http(void) {
416
555
  rb_define_method(rb_cCParser, "parse", cparser_parse, 1);
417
556
  rb_define_singleton_method(rb_cCParser, "build_response_head",
418
557
  cbuild_response_head, 6);
558
+ rb_define_singleton_method(rb_cCParser, "build_access_line",
559
+ cbuild_access_line, 9);
419
560
 
420
561
  id_new = rb_intern("new");
421
562
  id_downcase = rb_intern("downcase");
@@ -49,6 +49,20 @@ module Hyperion
49
49
  )
50
50
 
51
51
  class << self
52
+ # Pre-allocate `count` env-hash and rack-input objects in master before
53
+ # fork. Children inherit the populated free-list via copy-on-write —
54
+ # the hash slots stay shared until a request mutates them. Eliminates
55
+ # the first-N-requests allocation tax that every fresh worker would
56
+ # otherwise pay on cold start. Idempotent: safe to call multiple
57
+ # times; the pool simply caps at its configured `max_size`.
58
+ def warmup_pool(count = 8)
59
+ warmed_envs = Array.new(count) { ENV_POOL.acquire }
60
+ warmed_inputs = Array.new(count) { INPUT_POOL.acquire }
61
+ warmed_envs.each { |e| ENV_POOL.release(e) }
62
+ warmed_inputs.each { |i| INPUT_POOL.release(i) }
63
+ nil
64
+ end
65
+
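The acquire-then-release pattern in `warmup_pool` can be sketched with a toy pool. `SimplePool` and `warmup` are hypothetical names; the real `ENV_POOL` and `INPUT_POOL` classes are not shown in this diff, and this sketch is not thread-safe.

```ruby
# Hypothetical bounded free-list pool; not Hyperion's ENV_POOL implementation.
class SimplePool
  attr_reader :allocations

  def initialize(max_size:, &factory)
    @max_size = max_size
    @factory = factory
    @free = []
    @allocations = 0
  end

  # Reuse a pooled object when one is free, otherwise allocate a new one.
  def acquire
    @free.pop || begin
      @allocations += 1
      @factory.call
    end
  end

  # Return an object to the free-list, dropping it once the pool is full.
  def release(obj)
    obj.clear
    @free << obj if @free.size < @max_size
    nil
  end

  def size
    @free.size
  end
end

def warmup(pool, count = 8)
  # Acquire everything first so the pool is forced to allocate `count` objects,
  # then release them all so they sit on the free-list.
  Array.new(count) { pool.acquire }.each { |obj| pool.release(obj) }
  nil
end

pool = SimplePool.new(max_size: 16) { {} }
warmup(pool)
warmup(pool) # second call reuses the pooled hashes
puts "pooled=#{pool.size} allocations=#{pool.allocations}"
```

Acquiring all objects before releasing any is what forces distinct allocations; an acquire/release pair in a loop would keep recycling the same object.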
52
66
  def call(app, request)
53
67
  env, input = build_env(request)
54
68
  status, headers, body = app.call(env)
@@ -0,0 +1,110 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'rack/utils'
4
+
5
+ module Hyperion
6
+ # Rack middleware that exposes administrative endpoints on the same
7
+ # listener as the application. Disabled by default — only mounted when
8
+ # `admin_token` is configured. Currently provides:
9
+ #
10
+ # POST /-/quit → triggers graceful master drain (SIGTERM to ppid)
11
+ # GET /-/metrics → returns Hyperion.stats in Prometheus text format
12
+ #
13
+ # Auth: the request must include `X-Hyperion-Admin-Token: <token>`.
14
+ # Mismatch → 401. Path/method mismatch → falls through to the app
15
+ # (so the app can still own /-/anything if Hyperion's admin is off).
16
+ # When the token is unset, the constructor refuses to wrap — callers
17
+ # must skip mounting this middleware at all.
18
+ #
19
+ # SECURITY: the bearer token is defense-in-depth, not a substitute for
20
+ # network isolation. Operators MUST keep the listener on a private
21
+ # network or behind TLS + an authenticating reverse proxy. Anyone who
22
+ # can reach the listener AND knows the token can drain the server or
23
+ # scrape its metrics. See docs/REVERSE_PROXY.md for nginx/ALB recipes
24
+ # that block /-/* at the edge.
25
+ class AdminMiddleware
26
+ PATH_QUIT = '/-/quit'
27
+ PATH_METRICS = '/-/metrics'
28
+
29
+ METRICS_CONTENT_TYPE = 'text/plain; version=0.0.4; charset=utf-8'
30
+ JSON_CONTENT_TYPE = 'application/json'
31
+
32
+ UNAUTHORIZED_BODY = %({"error":"unauthorized"}\n)
33
+
34
+ def initialize(app, token:, signal_target: nil)
35
+ raise ArgumentError, 'admin_token must be a non-empty String' if token.nil? || token.to_s.empty?
36
+
37
+ @app = app
38
+ @token = token.to_s
39
+ # Override hook for tests. Defaults to ppid in worker context, pid
40
+ # for single-worker context (caller decides).
41
+ @signal_target = signal_target
42
+ end
43
+
44
+ def call(env)
45
+ path = env['PATH_INFO']
46
+ method = env['REQUEST_METHOD']
47
+
48
+ if path == PATH_QUIT && method == 'POST'
49
+ authorize(env) { handle_quit(env) }
50
+ elsif path == PATH_METRICS && method == 'GET'
51
+ authorize(env) { handle_metrics }
52
+ else
53
+ @app.call(env)
54
+ end
55
+ end
56
+
57
+ private
58
+
59
+ # Wrap a handler in the shared bearer-token check. Yields only when the
60
+ # token matches; returns the canonical 401 response otherwise.
61
+ def authorize(env)
62
+ provided = env['HTTP_X_HYPERION_ADMIN_TOKEN'].to_s
63
+ return unauthorized unless secure_match?(provided)
64
+
65
+ yield
66
+ end
67
+
68
+ def unauthorized
69
+ [401, { 'content-type' => JSON_CONTENT_TYPE }, [UNAUTHORIZED_BODY]]
70
+ end
71
+
72
+ def handle_quit(env)
73
+ target = resolve_signal_target
74
+ Hyperion.logger.info do
75
+ { message: 'admin drain requested', remote_addr: env['REMOTE_ADDR'], target_pid: target }
76
+ end
77
+ begin
78
+ Process.kill('TERM', target)
79
+ rescue StandardError => e
80
+ Hyperion.logger.warn { { message: 'admin drain signal failed', error: e.message } }
81
+ return [500, { 'content-type' => JSON_CONTENT_TYPE }, [%({"error":"signal_failed"}\n)]]
82
+ end
83
+
84
+ [202, { 'content-type' => JSON_CONTENT_TYPE }, [%({"status":"draining"}\n)]]
85
+ end
86
+
87
+ def handle_metrics
88
+ body = PrometheusExporter.render(Hyperion.stats)
89
+ [200, { 'content-type' => METRICS_CONTENT_TYPE }, [body]]
90
+ end
91
+
92
+ def secure_match?(provided)
93
+ return false if provided.empty?
94
+ return false unless provided.bytesize == @token.bytesize
95
+
96
+ # Constant-time comparison. Rack::Utils.secure_compare requires same
97
+ # length, so mismatched lengths are rejected above before comparing.
98
+ Rack::Utils.secure_compare(provided, @token)
99
+ end
100
+
101
+ def resolve_signal_target
102
+ return @signal_target if @signal_target
103
+
104
+ # In a forked worker, ppid IS the master; in single-worker mode,
105
+ # the master + worker are the same process — signal self.
106
+ ppid = Process.ppid
107
+ ppid > 1 ? ppid : Process.pid
108
+ end
109
+ end
110
+ end
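The token check in `secure_match?` relies on `Rack::Utils.secure_compare`. The sketch below shows the same shape in plain Ruby so it runs without the rack gem; the two-argument helper is hypothetical and is not how the middleware is written.

```ruby
# Hypothetical helper: length check first, then a comparison whose running
# time does not depend on where the first differing byte is.
def secure_match?(provided, token)
  return false if provided.empty?
  return false unless provided.bytesize == token.bytesize

  diff = 0
  # OR together the XOR of every byte pair; never exit the loop early.
  provided.bytes.zip(token.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end

puts secure_match?('s3cret', 's3cret')
puts secure_match?('s3creT', 's3cret')
puts secure_match?('', 's3cret')
```

The early length check reveals only whether the lengths match. The byte loop always walks the full token, so timing does not reveal how many leading bytes were correct.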
data/lib/hyperion/cli.rb CHANGED
@@ -53,6 +53,10 @@ module Hyperion
53
53
  o.on('--fiber-local-shim', 'Patch Thread.current[] to be fiber-local (Rails-compat for older gems)') do
54
54
  cli_opts[:fiber_local_shim] = true
55
55
  end
56
+ o.on('--[no-]yjit',
57
+ 'Enable Ruby YJIT (default: auto on RAILS_ENV/RACK_ENV=production/staging)') do |v|
58
+ cli_opts[:yjit] = v
59
+ end
56
60
  o.on('-h', '--help', 'show help') do
57
61
  puts o
58
62
  exit 0
@@ -79,6 +83,11 @@ module Hyperion
79
83
  # touch — fall through to the env/default chain in Hyperion.log_requests?".
80
84
  Hyperion.log_requests = config.log_requests unless config.log_requests.nil?
81
85
 
86
+ # Enable YJIT before workers fork / connections start. Auto-on in
87
+ # production/staging gives operators the perf bump for free; explicit
88
+ # config.yjit (true/false) overrides the env-based default.
89
+ maybe_enable_yjit(config)
90
+
82
91
  rackup = argv.first || 'config.ru'
83
92
  abort("[hyperion] no such rackup file: #{rackup}") unless File.exist?(rackup)
84
93
 
@@ -88,6 +97,7 @@ module Hyperion
88
97
  end
89
98
 
90
99
  app = load_rack_app(rackup)
100
+ app = wrap_admin_middleware(app, config)
91
101
  workers = config.workers.zero? ? Etc.nprocessors : config.workers
92
102
 
93
103
  if workers <= 1
@@ -101,10 +111,20 @@ module Hyperion
101
111
  tls = build_tls_from_config(config)
102
112
  server = Server.new(host: config.host, port: config.port, app: app,
103
113
  tls: tls, thread_count: config.thread_count,
104
- read_timeout: config.read_timeout)
114
+ read_timeout: config.read_timeout,
115
+ max_pending: config.max_pending,
116
+ max_request_read_seconds: config.max_request_read_seconds,
117
+ h2_settings: Master.build_h2_settings(config))
105
118
  server.listen
106
119
  scheme = tls ? 'https' : 'http'
107
120
  Hyperion.logger.info { { message: 'listening', url: "#{scheme}://#{server.host}:#{server.port}" } }
121
+ warn_c_parser_unavailable
122
+
123
+ # Pre-allocate Rack env-pool entries and eager-touch lazy constants.
124
+ # In single-mode there's no fork, but the warmup still pays for itself
125
+ # by frontloading the first-N-request allocation cost off the first
126
+ # real client. Idempotent — safe to call once per process.
127
+ Hyperion.warmup!
108
128
 
109
129
  # Single-worker mode reuses the lifecycle hooks: before_fork is a no-op
110
130
  # here (no fork happens), and on_worker_boot/on_worker_shutdown fire
@@ -130,6 +150,11 @@ module Hyperion
130
150
  server.start
131
151
  shutdown_thread.join
132
152
  config.on_worker_shutdown.each { |h| h.call(0) }
153
+ # Drain per-thread access buffers + sync stdio. Single-worker mode
154
+ # doesn't go through Master#shutdown_children, so without this call
155
+ # buffered access lines + final shutdown messages can be lost on
156
+ # SIGTERM. See Hyperion::Logger#flush_all.
157
+ Hyperion.logger.flush_all
133
158
  end
134
159
 
135
160
  def self.run_cluster(config, app, workers)
@@ -155,5 +180,61 @@ module Hyperion
155
180
  { cert: config.tls_cert, key: config.tls_key }
156
181
  end
157
182
  private_class_method :build_tls_from_config
183
+
184
+ # Decide whether to enable YJIT and flip the switch once at boot.
185
+ # Precedence:
186
+ # 1. config.yjit explicitly true/false → honour exactly.
187
+ # 2. config.yjit nil (default) → auto: on for production/staging.
188
+ # No-op on Rubies without YJIT (e.g. JRuby/TruffleRuby) and idempotent if
189
+ # the operator already passed `ruby --yjit` upstream.
190
+ def self.maybe_enable_yjit(config)
191
+ return unless defined?(::RubyVM::YJIT)
192
+ return if ::RubyVM::YJIT.enabled?
193
+
194
+ enable = if config.yjit.nil?
195
+ env_name = ENV['HYPERION_ENV'] || ENV['RAILS_ENV'] || ENV['RACK_ENV']
196
+ %w[production staging].include?(env_name)
197
+ else
198
+ config.yjit
199
+ end
200
+
201
+ return unless enable
202
+
203
+ ::RubyVM::YJIT.enable
204
+ Hyperion.logger.info do
205
+ { message: 'YJIT enabled', mode: config.yjit.nil? ? 'auto' : 'explicit' }
206
+ end
207
+ end
208
+ private_class_method :maybe_enable_yjit
209
+
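The precedence rule in `maybe_enable_yjit` can be isolated as a pure function for testing. `yjit_wanted?` is a hypothetical helper, not part of the CLI; it takes the config value and an environment Hash and makes the same decision without touching `RubyVM::YJIT`.

```ruby
# Hypothetical pure function mirroring the precedence described above.
def yjit_wanted?(config_value, env = ENV)
  # An explicit true/false from config wins over any environment name.
  return config_value unless config_value.nil?

  # HYPERION_ENV is consulted first, then RAILS_ENV, then RACK_ENV.
  env_name = env['HYPERION_ENV'] || env['RAILS_ENV'] || env['RACK_ENV']
  %w[production staging].include?(env_name)
end

p yjit_wanted?(nil, { 'RAILS_ENV' => 'production' })
p yjit_wanted?(false, { 'RAILS_ENV' => 'production' })
p yjit_wanted?(nil, { 'HYPERION_ENV' => 'development', 'RAILS_ENV' => 'production' })
```

The third call shows that `HYPERION_ENV` shadows `RAILS_ENV`, so setting it to a non-production name turns the automatic default off.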
210
+ # When admin_token is configured, wrap the app in AdminMiddleware so
211
+ # POST /-/quit and GET /-/metrics become token-protected admin endpoints.
212
+ # Skipped when the token is unset — those paths fall through to the app,
213
+ # so apps may still own /-/anything if Hyperion's admin is off.
214
+ def self.wrap_admin_middleware(app, config)
215
+ return app if config.admin_token.nil? || config.admin_token.to_s.empty?
216
+
217
+ Hyperion.logger.info do
218
+ { message: 'admin endpoint enabled',
219
+ paths: [AdminMiddleware::PATH_QUIT, AdminMiddleware::PATH_METRICS] }
220
+ end
221
+ AdminMiddleware.new(app, token: config.admin_token)
222
+ end
223
+ private_class_method :wrap_admin_middleware
224
+
225
+ # Warn loudly at boot if the C parser didn't load — operators running
226
+ # production with the pure-Ruby fallback are paying ~2× CPU on parse-heavy
227
+ # workloads and probably don't know it.
228
+ def self.warn_c_parser_unavailable
229
+ return if Hyperion.c_parser_available?
230
+
231
+ Hyperion.logger.warn do
232
+ {
233
+ message: 'llhttp C parser not loaded — using pure-Ruby fallback (slower)',
234
+ remediation: 'rebuild the gem with `bundle exec rake compile` or check your OpenSSL/build-essential install'
235
+ }
236
+ end
237
+ end
238
+ private_class_method :warn_c_parser_unavailable
158
239
  end
159
240
  end
@@ -24,7 +24,17 @@ module Hyperion
24
24
  log_level: nil, # nil → Logger picks from env / default
25
25
  log_format: nil, # nil → Logger picks via auto rule
26
26
  log_requests: nil, # nil → Hyperion.log_requests? (default true)
27
- fiber_local_shim: false
27
+ fiber_local_shim: false,
28
+ yjit: nil, # nil → auto: enable on production/staging; true/false to force.
29
+ worker_max_rss_mb: nil, # Integer, e.g. 1024. When a worker exceeds this RSS in MB, master gracefully cycles it. nil disables.
30
+ worker_check_interval: 30, # Seconds between RSS polls. Tradeoff: tighter = faster recycle, more ps calls. 30s matches Puma WorkerKiller.
31
+ admin_token: nil, # String. When set, exposes admin endpoints (POST /-/quit triggers graceful drain; GET /-/metrics returns Prometheus-format Hyperion.stats). Same token guards both. nil disables admin entirely (paths fall through to the app).
32
+ max_pending: nil, # Integer, e.g. 256. When the per-worker accept inbox has this many queued connections, additional accepts are rejected with HTTP 503 + Retry-After:1 instead of being queued. nil disables (current behaviour: unbounded queue).
33
+ max_request_read_seconds: 60, # Numeric. Total wallclock budget (seconds) for reading the request line + headers + body for ONE request. Defends against slowloris-style drips that satisfy the per-recv read_timeout but never finish the request. Resets between requests on a keep-alive connection. nil disables.
34
+ h2_max_concurrent_streams: 128, # HTTP/2 SETTINGS_MAX_CONCURRENT_STREAMS — cap on simultaneously-open streams per connection. Falcon: 64. nil leaves protocol-http2 default (0xFFFFFFFF).
35
+ h2_initial_window_size: 1_048_576, # HTTP/2 SETTINGS_INITIAL_WINDOW_SIZE (octets) — flow-control window per stream at open. Bigger = fewer WINDOW_UPDATE round-trips on large bodies. Spec default is 65535. nil → leave protocol default.
36
+ h2_max_frame_size: 1_048_576, # HTTP/2 SETTINGS_MAX_FRAME_SIZE (octets) — biggest DATA/HEADERS frame we'll accept. Spec floor 16384, ceiling 16777215. We pick 1 MiB to match common CDNs without unbounded buffer growth. nil → leave protocol default (16384).
37
+ h2_max_header_list_size: 65_536 # HTTP/2 SETTINGS_MAX_HEADER_LIST_SIZE (octets) — advisory cap on the decompressed header block. Bounds memory of pathological client headers. nil → leave protocol default (unbounded).
28
38
  }.freeze
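The changelog says out-of-spec HTTP/2 values are clamped and warned rather than rejected. A minimal sketch of that behaviour follows, assuming the legal ranges from RFC 9113 section 6.5.2. `H2_LIMITS` and `clamp_h2_setting` are hypothetical names; the gem's `Master.build_h2_settings` is not shown in this diff and may work differently.

```ruby
# Legal ranges from RFC 9113 section 6.5.2; names are illustrative.
H2_LIMITS = {
  max_frame_size: (16_384..16_777_215),   # 2^14 .. 2^24 - 1
  initial_window_size: (0..2_147_483_647) # 0 .. 2^31 - 1
}.freeze

def clamp_h2_setting(name, value, warnings = [])
  range = H2_LIMITS[name]
  # nil means "leave the protocol default"; unknown names pass through.
  return value if value.nil? || range.nil? || range.cover?(value)

  clamped = value.clamp(range)
  warnings << "#{name}=#{value} is out of spec, using #{clamped}"
  clamped
end

warnings = []
p clamp_h2_setting(:max_frame_size, 1_048_576, warnings)
p clamp_h2_setting(:max_frame_size, 1024, warnings)
p warnings
```

The 1 MiB default passes unchanged, while a value below the spec floor is raised to 16384 and recorded as a warning instead of aborting boot.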
29
39
 
30
40
  HOOKS = %i[before_fork on_worker_boot on_worker_shutdown].freeze