film 0.1.0-x86_64-linux

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: cb6d478c6fa39f2c2ab9aa4a202a4c34b97e5894eddea2e50c65433a77b34add
4
+ data.tar.gz: a9be66c87875d91350367605c27ecfb0a7998c27c7d3287616f79533363dbd59
5
+ SHA512:
6
+ metadata.gz: 88e705c18d293ffd8ef5077b4584be93c00489ce0358cd0cbacf0baa729f809aee6eddfeb33e6a931d01e512b3cc951a2278271a1fe4c41dd5ddb6d9f1dd7bd5
7
+ data.tar.gz: 426f1081e2eeeabd77789ee1c052de03dd95c159a8515483ae65b0008a8a744da0373ac4e4f39702d1c5903852eca2ce8b95c199c05d88c99dcebfa14eda93a2
data/.yardopts ADDED
@@ -0,0 +1,13 @@
1
+ --readme README.md
2
+ --markup markdown
3
+ --title "Film"
4
+ --output-dir doc/api
5
+ --no-private
6
+ --embed-mixins
7
+ lib/**/*.rb
8
+ -
9
+ doc/architecture.md
10
+ doc/benchmarks.md
11
+ doc/rails-on-ractors.md
12
+ CHANGELOG.md
13
+ LICENSE.txt
data/CHANGELOG.md ADDED
@@ -0,0 +1,54 @@
1
+ ## [0.1.0] - 2026-06-11
2
+
3
+ Initial release.
4
+
5
+ - HTTP/1.1 server with all network I/O in Rust on tokio + hyper.
6
+ - Worker **Ractors** for true parallel request handling (`mode: :ractor`,
7
+ requires a Ractor-shareable app), with a threaded fallback (`mode: :threaded`)
8
+ that runs any Rack app, Rails included. Puma-style `workers × threads`
9
+ topology in both modes.
10
+ - Rack 3 spec compliance verified by Rack::Lint over real sockets: streaming
11
+ request bodies (forward-only `rack.input`), enumerable and callable
12
+ (full-duplex stream) response bodies, lowercase/multi-value headers.
13
+ - Supervised crash recovery: a dying ractor 500s its in-flight requests
14
+ immediately and is respawned.
15
+ - Graceful shutdown: drain to deadline, then abort in-flight clients and
16
+ reap workers; second signal force-exits.
17
+ - Bounded request queue with 503 backpressure; bounded body channels give
18
+ per-request backpressure in both directions with the GVL released.
19
+ - TLS via rustls (file paths or inline PEM).
20
+ - Near-zero-allocation env construction: frozen LRU caches for
21
+ Host/peer-address values, shared frozen `rack.errors`, and a shared
22
+ frozen null `rack.input` for bodyless requests.
23
+ - The Rust side allocates through mimalloc (chosen over jemalloc in a three-way benchmark).
24
+ - Fused worker loop: the env arrives with the request handle embedded
25
+ (`env["film.request"]`) and the common complete-body response rides a
26
+ single respond-and-take native call—~one FFI crossing per request,
27
+ no per-request arrays. Opt-in `batch` directive for grabbing several
28
+ queued requests per visit (default 1; >1 trades fairness for
29
+ throughput).
30
+ - Experimental `lanes` mode: per-worker queues with awake-preferring
31
+ dispatch and work stealing (+20% ractor-mode plaintext on Linux,
32
+ making ractor the fastest Film configuration on real hardware). Off
33
+ by default.
34
+ - Live stats: `server.stats` (queued, in-flight, served, rejected,
35
+ respawns, lane depths) and a `SIGUSR1` one-line dump for CLI servers.
36
+ - Native async logging: a `log_requests` access log (status-colored on
37
+ color terminals—2xx green, 3xx yellow, 4xx maroon, 5xx bright red—503
38
+ rejections included) and `Film::Logger` / `Film::Logger::Device`—lines
39
+ flow through a lock-free channel to a Rust flusher thread, so
40
+ request threads never take a log mutex or issue a write syscall. The
41
+ device is Ractor-shareable.
42
+ - `film --check` (and `Film::Check.report`): explains why an app is not
43
+ Ractor-shareable—captured variables with definition sites, ivar
44
+ paths, and the class-ivar trap—without freezing anything.
45
+ - Puma-style Ruby DSL config file (`film.rb`) and a `film` CLI
46
+ (`film -C film.rb config.ru`); precedence kwargs > file > defaults.
47
+ Directives include `environment`, `pidfile`, and `rackup`.
48
+ - Hot-path performance work: a GVL-free queue fast path, zero-copy response
49
+ headers, a frozen Ractor-shareable env-string cache, and no per-request
50
+ task spawns. Single-process throughput is on par with (or modestly above)
51
+ a same-topology process cluster in our benchmarks; see README.
52
+ - Request timeouts: `request_timeout: seconds` (off by default) returns an
53
+ immediate 504 when the app misses the deadline; the late response is
54
+ dropped and the handler is never killed. Counted as `stats[:timeouts]`.
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2026 Yaroslav Markin
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,390 @@
1
+ # Film
2
+
3
+ **Film** is a high-performance **Ractor** web server for Ruby 4.0+.
4
+
5
+ [![GitHub Release](https://img.shields.io/github/v/release/yaroslav/film)](https://github.com/yaroslav/film/releases)
6
+ [![Docs](https://img.shields.io/badge/yard-docs-blue.svg)](https://rubydoc.info/gems/film)
7
+
8
+ Ruby threads cannot run Ruby code in parallel, so production setups fork
9
+ a process per core and pay for each copy in memory. Film runs your code
10
+ on every core in **one small process**. A **Rust** (tokio + hyper)
11
+ front-end owns the network, parallel **Ractors** run your Rack 3 app,
12
+ and a threaded fallback mode runs everything else, Rails included.
13
+
14
+ * **Fast.** On a real 8-core server, every Film mode is **1.4-2×** ahead
15
+ of a same-topology Puma cluster on I/O-light endpoints. Ractor mode
16
+ also wins on pure CPU. [Benchmarks](#benchmarks) below.
17
+ * **A fraction of the memory.** One process instead of a fork per core:
18
+ about **1/19th of the Puma cluster's memory** under the same load, and
19
+ about 1/8th when serving the Rails hello-world.
20
+ * **Parallel without forking.** Ractor mode runs CPU work **5×** faster
21
+ than Film's own GVL-bound threaded mode, in the same small process.
22
+ * **Production plumbing included.** Graceful drain, crash supervision
23
+ and respawn, bounded queues with 503 backpressure, request timeouts,
24
+ TLS (rustls), live stats, async access and app logging.
25
+ * **Tells you why.** `film --check` lists exactly what blocks your app
26
+ from ractor mode, finding by finding, so you do not have to decode
27
+ `Ractor::IsolationError` yourself.
28
+ * **Puma-shaped.** The same `workers × threads` topology, a familiar
29
+ config DSL, a `film` CLI. If you can run Puma, you can run Film.
30
+
31
+ ```sh
32
+ bundle add film # or: gem install film
33
+ echo 'run ->(env) { [200, {"content-type" => "text/plain"}, ["Action!"]] }' > config.ru
34
+ bundle exec film
35
+ ```
36
+
37
+ **N.B.:** Ractors are officially **experimental** in Ruby 4.0, and so is this server. The threaded mode is solid. Still, Film aims to be the best way to experiment with Ractors today—and the best Ractor server when they become stable.
38
+
39
+ ---
40
+
41
+ ## Table of Contents
42
+
43
+ - [Why](#why)
44
+ - [Benchmarks](#benchmarks)
45
+ - [Install](#install)
46
+ - [Usage](#usage)
47
+ - [Config file and CLI](#config-file-and-cli)
48
+ - [`film --check`](#film---check)
49
+ - [Request timeouts](#request-timeouts)
50
+ - [Stats](#stats)
51
+ - [Logging](#logging)
52
+ - [Timer waits](#timer-waits)
53
+ - [Rack 3 compliance](#rack-3-compliance)
54
+ - [Rails](#rails)
55
+ - [Development](#development)
56
+ - [Contributing](#contributing)
57
+ - [License](#license)
58
+
59
+ ## Why
60
+
61
+ The GVL allows only one Ruby thread to run at a time. To use all cores,
62
+ Ruby servers fork processes, and every fork costs a full copy of the
63
+ app. Ractors do not have this limit: each one has its own lock, so one
64
+ process can run Ruby in parallel. What was missing is a server that
65
+ dispatches requests to them. Ruby 4.0 reworked Ractors (`Ractor::Port`,
66
+ `shareable_proc`, less lock contention) and made this worth building.
67
+ The design notes live in [doc/architecture.md](doc/architecture.md).
68
+
69
+ ## Benchmarks
70
+
71
+ Measured on a real server: AWS **c7a.2xlarge** (8-core AMD EPYC 9R14,
72
+ 16 GB, Amazon Linux 2023). This is a realistic app-server size. The same
73
+ Ractor-shareable app runs on every server, Ruby 4.0.5 with YJIT, equal
74
+ topology (8 workers × 3 threads; Puma forks, Film stays in one process).
75
+ Numbers are req/s by wrk (8-second windows, 64 connections, same host).
76
+ Methodology and the analysis behind every column:
77
+ [doc/benchmarks.md](doc/benchmarks.md).
78
+
79
+ | endpoint | Film :ractor | + lanes | Film :threaded | Puma (cluster) |
80
+ |-------------|-------------:|--------:|---------------:|---------------:|
81
+ | /plaintext | 201,472 | **241,501** | 218,348 | 117,838 |
82
+ | /10k | 156,635 | **183,564** | 153,442 | 106,666 |
83
+ | /cpu (fib) | 66,735¹| **70,373** | 13,298 | 58,207 |
84
+ | /io (5 ms) | 4,527²| 4,530 | **4,715** | 4,691 |
85
+ | /io_native | 4,714 | **4,717** | 4,709 | 4,692 |
86
+
87
+ Memory on the same box, RSS under load:
88
+
89
+ | serving | Film (one process) | Puma cluster (8 workers) |
90
+ |-----------------------|-------------------:|-------------------------:|
91
+ | bench app, :ractor | **57 MB** | 1,078 MB |
92
+ | bench app, :threaded | **50 MB** | 1,078 MB |
93
+ | Rails hello-world | **97 MB** | 797 MB |
94
+
95
+ "+ lanes" is the experimental per-worker-queue dispatcher (`lanes true`).
96
+ It adds 20% over the shared queue on this hardware and makes ractor
97
+ mode the fastest Film configuration. Details:
98
+ [doc/benchmarks.md](doc/benchmarks.md#lane-dispatch-experimental-lanes-true).
99
+
100
+ ¹ Stock settings, no tuning. Ractor mode beats the fork cluster on pure
101
+ CPU by +15% (+21% with lanes). Threaded mode shows the GVL ceiling that
102
+ every single-process Ruby server hits. The CPU-tuning recipe that our
103
+ earlier Docker measurements needed makes no difference on real hardware
104
+ (+0.5%); see [doc/benchmarks.md](doc/benchmarks.md#cpu-bound-tuning).
105
+
106
+ ² The ractor timer tax is small on real hardware: −4% against threaded
107
+ mode (it was −18% in Docker). Wait-bound throughput is slots ÷ wait, and
108
+ Film slots are threads, not processes. `workers 32, threads 1` measured
109
+ **5,922 /io (+27% over the cluster) and 6,254 /io_native (+34%)**, still
110
+ one small process. See
111
+ [doc/benchmarks.md](doc/benchmarks.md#why-io-lags-in-ractor-mode-on-linux).
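The slots ÷ wait bound can be checked against the table with a few lines of arithmetic. This is a sketch of the ceiling only, assuming every slot spends its whole time in the 5 ms wait; `wait_bound_ceiling` is a name made up for illustration and is not part of Film.

```ruby
# Upper bound on req/s for a wait-bound endpoint: each slot completes
# at most 1 / wait_seconds requests per second.
def wait_bound_ceiling(workers:, threads:, wait_seconds:)
  slots = workers * threads
  (slots / wait_seconds).round
end

wait_bound_ceiling(workers: 8,  threads: 3, wait_seconds: 0.005) # => 4800
wait_bound_ceiling(workers: 32, threads: 1, wait_seconds: 0.005) # => 6400
```

The measured /io and /io_native rows sit just under these ceilings (roughly 4,500-4,700 of 4,800, and 5,922-6,254 of 6,400), which is consistent with a workload limited by the number of slots rather than by CPU.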
112
+
113
+ A common first idea is to keep your current server and wrap the app in
114
+ a ractor pool. We measured that too (same box; the analysis is in the
115
+ doc):
116
+
117
+ | endpoint | Film :ractor | Puma + ractor wrapper | Falcon + ractor wrapper |
118
+ |------------|-------------:|----------------------:|------------------------:|
119
+ | /plaintext | **201,472** | 19,425 | 100,624 |
120
+ | /cpu (fib) | **66,735** | 17,106 | 49,083 |
121
+ | /io (5 ms) | **4,527** | 1,447 | 1,549 |
122
+
123
+ In short: ractor mode reaches fork-level CPU parallelism (**5×** Film's
124
+ own GVL-bound threaded mode) in one process, at about 1/19th of the
125
+ cluster's memory. Every Film mode is 1.4-2× ahead of the cluster on
126
+ I/O-light endpoints. The macOS numbers (secondary; everything there hits
127
+ the loopback ceiling) and the YJIT × Ractors gotcha are in
128
+ [doc/benchmarks.md](doc/benchmarks.md).
129
+
130
+ Reproduce: `bench/run.sh [seconds] [concurrency]` for the main table,
131
+ `bench/studies.sh` for the follow-ups (CPU recipe, topology, scaling,
132
+ logging, memory).
133
+
134
+ ## Install
135
+
136
+ You need Ruby >= 4.0. Add Film to your application's bundle:
137
+
138
+ ```sh
139
+ bundle add film # or: gem install film (outside a bundle)
140
+ ```
141
+
142
+ or put it in the `Gemfile` yourself:
143
+
144
+ ```ruby
145
+ gem "film", "~> 0.1"
146
+ ```
147
+
148
+ Then generate a config and serve:
149
+
150
+ ```sh
151
+ bundle exec film --init # writes film.rb; every directive documented in place
152
+ bundle exec film # picks up config.ru + film.rb, serves on :9292
153
+ ```
154
+
155
+ (After a standalone `gem install`, the `film` command works without
156
+ `bundle exec`.)
157
+
158
+ No Rust compiler needed: released versions ship precompiled native gems
159
+ for Linux (x86_64/aarch64, glibc and musl) and macOS (arm64). On other
160
+ platforms the gem compiles at install time; that needs a Rust toolchain,
161
+ plus clang/libclang on Linux.
162
+
163
+ ## Usage
164
+
165
+ ```ruby
166
+ require "film"
167
+
168
+ # Ractor mode needs a Ractor-shareable app: capture nothing, freeze config.
169
+ app = Ractor.shareable_proc do |env|
170
+ [200, { "content-type" => "text/plain" }, ["Hello from #{Ractor.current}"]]
171
+ end
172
+
173
+ Film::Server.run(app, port: 9292) # traps INT/TERM; Ctrl-C drains gracefully
174
+ ```
175
+
176
+ Or embedded, with everything spelled out:
177
+
178
+ ```ruby
179
+ server = Film::Server.new(app,
180
+ bind: "127.0.0.1",
181
+ port: 9292, # 0 = ephemeral; read back via server.port
182
+ workers: Etc.nprocessors, # ractors (parallelism)
183
+ threads: 3, # threads per ractor (I/O concurrency, Puma-style)
184
+ mode: :auto, # :auto | :ractor | :threaded
185
+ queue_depth: 1024, # bounded queue; overflow → 503
186
+ queue_timeout: 1.0, # seconds before 503 on a full queue
187
+ request_timeout: nil, # seconds before a slow response becomes a 504 (nil = off)
188
+ shutdown_timeout: 30, # drain deadline
189
+ tls: { cert: "cert.pem", key: "key.pem" }, # file paths or inline PEM
190
+ )
191
+ server.start
192
+ server.shutdown # graceful: drain → deadline → abort stragglers
193
+ ```
194
+
195
+ ### Modes
196
+
197
+ - **`:ractor`**: `workers` Ractors × `threads` Threads each. The app must
198
+ be `Ractor.shareable?` (frozen middleware, `shareable_proc` endpoints).
199
+ Forcing `:ractor` with an unshareable app raises
200
+ `Film::UnshareableAppError`. A crashed ractor returns 500 to its
201
+ in-flight requests right away, then respawns.
202
+ - **`:threaded`**: the same machinery on `workers × threads` plain
203
+ Threads. Runs **any** Rack app, including Rails, today. Parallel for
204
+ I/O, serialized by the GVL for CPU.
205
+ - **`:auto`** (default): `:ractor` when the app is shareable, otherwise
206
+ a warning and `:threaded`. One caveat: a *class* used as a Rack app
207
+ always counts as "shareable" (classes are), even if calling it touches
208
+ unshareable state. Force `:threaded` for those.
209
+
210
+ ## Config file and CLI
211
+
212
+ Settings can live in a Puma-style Ruby DSL file. Precedence: explicit
213
+ kwargs and CLI flags > config file > defaults.
214
+
215
+ ```ruby
216
+ # film.rb
217
+ port 9292
218
+ workers 8
219
+ threads 3
220
+ mode :ractor
221
+ ```
222
+
223
+ ```sh
224
+ film --init # write a fully commented sample film.rb
225
+ film # config.ru + film.rb, port 9292
226
+ film --check # explain whether the app can run in :ractor mode
227
+ film -C config/film.rb -p 3000 -w 4 -m ractor my_app.ru
228
+ ```
229
+
230
+ The generated sample documents every directive, including the Rails
231
+ settings and the performance notes.
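The precedence rule can be pictured as two hash merges over a tiny DSL evaluator. The sketch below is not Film's loader: `ConfigDSL`, `DEFAULTS`, and `resolve_config` are hypothetical names, the default values are placeholders, and only four directives are modeled.

```ruby
# Placeholder defaults for the sketch; the real list is longer.
DEFAULTS = { port: 9292, workers: 1, threads: 3, mode: :auto }.freeze

# Evaluates `port 9292`-style lines by turning each known directive
# into a method that records its argument.
class ConfigDSL
  attr_reader :settings

  def initialize
    @settings = {}
  end

  DEFAULTS.each_key do |name|
    define_method(name) { |value| @settings[name] = value }
  end
end

# Later merges win, which gives: kwargs > file > defaults.
def resolve_config(file_source, **kwargs)
  dsl = ConfigDSL.new
  dsl.instance_eval(file_source)
  DEFAULTS.merge(dsl.settings).merge(kwargs)
end

resolve_config("port 3000\nworkers 8", workers: 4)
# => {port: 3000, workers: 4, threads: 3, mode: :auto}
```

Here `port` comes from the file, `workers` from the explicit keyword (overriding the file's 8), and the rest from the defaults.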
232
+
233
+ ## `film --check`
234
+
235
+ When an app cannot run in `:ractor` mode, Film can tell you why, instead
236
+ of leaving you with a bare `Ractor::IsolationError`. The check changes
237
+ nothing (it does not freeze your objects) and names each blocker:
238
+ captured variables with the place they were defined, instance variables
239
+ by path, and the class-level instance variable trap that catches
240
+ class-style apps:
241
+
242
+ ```
243
+ $ film --check
244
+ check: app is NOT Ractor-shareable
245
+ - app (Proc at app.rb:12)—captures `cache` = {} (Hash) (unshareable)
246
+ - app (HelloApp).@instance—class-level ivar holds #<HelloApp…>—classes
247
+ pass Ractor.shareable?, but reading this from a worker ractor raises
248
+ Ractor::IsolationError on the first request
249
+ hints: freeze config at boot; build endpoints with Ractor.shareable_proc;
250
+ keep per-worker resources in Ractor.store_if_absent; or run mode :threaded.
251
+ ```
252
+
253
+ Exit status is 0/1, so it works in CI. The programmatic form is
254
+ `Film::Check.report(app)`.
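The class-level trap can be seen with plain `Ractor.shareable?`, with no Film involved. A small sketch: `HelloApp` mirrors the name in the sample output above, and everything else is illustrative.

```ruby
# A class-style Rack app that keeps a singleton in a class-level ivar.
class HelloApp
  @instance = new # an unshareable object held by the class itself

  def self.call(env)
    @instance.call(env) # this read is what fails inside a worker ractor
  end

  def call(_env)
    [200, { "content-type" => "text/plain" }, ["hi"]]
  end
end

cache = {}
capturing = ->(env) { cache[env["PATH_INFO"]] ||= [200, {}, ["hi"]] }

Ractor.shareable?(HelloApp)   # => true  (classes always pass the check)
Ractor.shareable?(capturing)  # => false (an ordinary lambda is not shareable)

# Deep-freezing config at boot makes it shareable.
config = Ractor.make_shareable({ "greeting" => "hi" })
Ractor.shareable?(config)     # => true
```

This is why a shareability test alone cannot catch the class case: the check passes, and the problem only appears when a worker ractor reads the ivar.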
255
+
256
+ ## Request timeouts
257
+
258
+ `request_timeout: seconds` (or `request_timeout 30` in `film.rb`) limits
259
+ how long the app may take to produce a response. Past the deadline the
260
+ client gets an immediate **504** while the handler keeps running; its
261
+ late response is dropped without harm. Off by default. The handler is
262
+ deliberately *not* killed, because interrupting arbitrary Ruby mid-flight
263
+ is unsafe. A stuck handler still occupies its worker slot until it
264
+ returns, so set the deadline above your slowest legitimate endpoint and
265
+ watch `stats[:timeouts]`.
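The "answer 504, leave the handler running" behaviour can be modeled in plain Ruby with `Thread#join(limit)`. Film enforces the deadline in its Rust front-end rather than with Ruby threads, so treat this as a picture of the semantics only; `call_with_deadline` and `GATEWAY_TIMEOUT` are hypothetical names.

```ruby
GATEWAY_TIMEOUT = [504, { "content-type" => "text/plain" }, ["Gateway Timeout"]].freeze

# Runs the handler on its own thread and waits at most `timeout` seconds.
# On a miss the caller gets a 504 at once; the handler thread is left
# alone, and whatever it returns later is never read.
def call_with_deadline(app, env, timeout)
  handler = Thread.new { app.call(env) }
  handler.join(timeout) ? handler.value : GATEWAY_TIMEOUT
end

fast = ->(_env) { [200, {}, ["ok"]] }
slow = ->(_env) { sleep 0.5; [200, {}, ["late"]] }

call_with_deadline(fast, {}, 1.0).first  # => 200
call_with_deadline(slow, {}, 0.05).first # => 504
```

The slow handler still holds its thread for the full half second, which is the sketch's version of a stuck handler occupying its worker slot.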
266
+
267
+ ## Stats
268
+
269
+ `server.stats` returns a live snapshot: the configuration plus counters
270
+ from the native layer (one relaxed atomic per request, no measurable
271
+ cost):
272
+
273
+ ```ruby
274
+ server.stats
275
+ # => {mode: :ractor, lanes: false, workers: 8, threads: 3, batch: 1,
276
+ # respawns: 0, queued: 0, in_flight: 2, served: 1041, rejected: 0,
277
+ # timeouts: 0}
278
+ # plus lane_depths: [...] when lane dispatch is on
279
+ ```
280
+
281
+ From the outside, `kill -USR1 <pid>` prints the same snapshot as one line
282
+ (pair it with `pidfile` to find the pid):
283
+
284
+ ```
285
+ Film stats: mode=:ractor lanes=false workers=8 threads=3 batch=1 respawns=0 queued=0 in_flight=2 served=1041 rejected=0 timeouts=0
286
+ ```
287
+
288
+ ## Logging
289
+
290
+ With one log line per request, `Film::Logger` sustained **2.4× the
291
+ throughput of a shared `::Logger`** (151k vs 63k req/s on the benchmark
292
+ box). There are two native pieces. Both write through a lock-free
293
+ channel to a Rust flusher thread, so request threads never take a log
294
+ mutex and never make a write syscall:
295
+
296
+ - **Access log** (`log_requests true`): one line per request to stdout,
297
+ including the 503s that never reach your app. On color terminals the
298
+ lines are tinted by status class: 2xx green, 3xx yellow, 4xx maroon,
299
+ 5xx bright red:
300
+
301
+ ```
302
+ 127.0.0.1 [Wed, 10 Jun 2026 13:39:56 GMT] "GET / HTTP/1.1" 200 0.1ms
303
+ ```
304
+
305
+ - **`Film::Logger`**: a `::Logger` over the same async sink, for your
306
+ app's own logging (`Film::Logger.new("log/production.log")`, or no
307
+ argument for stdout). The raw IO-like device is `Film::Logger::Device`,
308
+ for integrations that want bytes without `::Logger` formatting. The
309
+ device is frozen and Ractor-shareable, so one device serves every
310
+ worker.
311
+
312
+ `Film::Logger` in a **Rails** app: it is a real `::Logger` subclass, so
313
+ it fits anywhere Rails expects a logger:
314
+
315
+ ```ruby
316
+ # config/environments/production.rb, simplest forms:
317
+ config.logger = Film::Logger.new # stdout
318
+ config.logger = Film::Logger.new("log/production.log") # file
319
+ # both file and stdout:
320
+ config.logger = ActiveSupport::BroadcastLogger.new(
321
+ Film::Logger.new("log/production.log"), Film::Logger.new
322
+ )
323
+ # tagged logging wraps it like any ::Logger:
324
+ config.logger = ActiveSupport::TaggedLogging.new(Film::Logger.new)
325
+ ```
326
+
327
+ From a plain **Rack** app, give middleware the logger, or hand
328
+ `Rack::CommonLogger` the raw device (it just calls `write`):
329
+
330
+ ```ruby
331
+ # config.ru
332
+ use Rack::CommonLogger, Film::Logger::Device.new # access-style app log
333
+ run MyApp
334
+ ```
335
+
336
+ (If you only want request lines, prefer Film's own `log_requests true`.
337
+ It is free for your Ruby threads, and it also sees the 503s that never
338
+ reach Rack.)
339
+
340
+ Graceful shutdown drains both logs fully. A hard crash can lose the tail
341
+ of the buffer, and when you log faster than the disk can take (over 100k
342
+ lines/s), the sink drops lines instead of blocking request threads.
343
+ These trade-offs are measured in
344
+ [doc/benchmarks.md](doc/benchmarks.md#logging-costs).
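The shape of the sink (producers enqueue, one flusher writes, overflow is dropped rather than blocking) can be modeled in Ruby. Film's real channel is lock-free and lives in Rust; the sketch below uses a `SizedQueue`, which does take a lock, purely to show the control flow. `AsyncSink` and its `dropped` counter are made-up names.

```ruby
# Toy model of an async log sink: callers never wait on the device;
# a single flusher thread owns all writes.
class AsyncSink
  attr_reader :dropped

  def initialize(io, capacity: 1024)
    @io = io
    @queue = SizedQueue.new(capacity)
    @dropped = 0 # approximate under contention; fine for a sketch
    @flusher = Thread.new do
      # pop returns nil once the queue is closed and empty
      while (line = @queue.pop)
        @io.write(line)
      end
    end
  end

  # Non-blocking enqueue: when the queue is full the line is dropped.
  def write(line)
    @queue.push(line, true)
  rescue ThreadError
    @dropped += 1
  end

  # Graceful shutdown: stop accepting lines, then drain what is queued.
  def close
    @queue.close
    @flusher.join
  end
end
```

Because the only method a caller needs is `write`, an object of this shape can stand wherever an IO-like log device is expected, which is the same property the text relies on for `Rack::CommonLogger`.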
345
+
346
+ ## Timer waits
347
+
348
+ `Film.sleep(seconds)` is a high-resolution sleep on the OS clock with
349
+ the GVL released. MRI's own `sleep` wakes up late inside non-main
350
+ ractors (details and numbers in [doc/benchmarks.md](doc/benchmarks.md)).
351
+ Use `Film.sleep` for explicit timer waits in handlers. Ordinary blocking
352
+ I/O does not need it.
353
+
354
+ ## Rack 3 compliance
355
+
356
+ The spec suite runs every test app under `Rack::Lint` over real sockets:
357
+ streaming request bodies (forward-only `rack.input`), enumerable and
358
+ callable (full-duplex stream) response bodies, lowercase and multi-value
359
+ headers, HEAD/204 semantics. Full hijack is left out on purpose; it is
360
+ optional in Rack 3.
361
+
362
+ ## Rails
363
+
364
+ Rails (edge) runs on Film today in `:threaded` mode; see
365
+ `examples/rails-hello`. Ractor-mode Rails is blocked upstream. The exact
366
+ blockers, the `Ruby::Box` findings, and what would unlock it are written
367
+ up in [doc/rails-on-ractors.md](doc/rails-on-ractors.md). The example
368
+ ships a probe script that re-tests against whatever Rails you bundle.
369
+
370
+ ## Development
371
+
372
+ ```sh
373
+ bin/setup
374
+ bundle exec rake # compile, Rust tests, specs, RBS, lint
375
+ RB_SYS_CARGO_PROFILE=dev bundle exec rake compile # fast dev rebuilds
376
+ ```
377
+
378
+ ## Assisted by
379
+
380
+ Claude Code (Mythos, Opus).
381
+
382
+ ## Contributing
383
+
384
+ Bug reports and pull requests are welcome on GitHub at
385
+ https://github.com/yaroslav/film.
386
+
387
+ ## License
388
+
389
+ The gem is available as open source under the terms of the
390
+ [MIT License](https://opensource.org/licenses/MIT).
data/doc/README.md ADDED
@@ -0,0 +1,6 @@
1
+ # Extra documentation
2
+
3
+ This folder is dedicated to architectural decisions, discussions, and
4
+ benchmark results.
5
+
6
+ Almost all content here is written by agents (Claude Code or Codex).
data/doc/architecture.md ADDED
@@ -0,0 +1,161 @@
1
+ # Architecture
2
+
3
+ ```
4
+ tokio (Rust threads) Ruby
5
+ ┌──────────────────────────┐
6
+ │ accept loop (hyper) │ bounded MPMC ┌─ worker: Ractor × threads ─┐
7
+ │ per request: │ ──── queue ───────► │ loop { │
8
+ │ parse → RequestCtx │ │ env = take_one │ ← blocks with the
9
+ │ queue full → 503 │ ◄─── response ───── │ status,h,b = app.(env) │ per-ractor lock
10
+ │ TLS (rustls) │ ◄─── body chunks ── │ respond / stream │ RELEASED
11
+ └──────────────────────────┘ └────────────────────────────┘
12
+ ```
13
+
14
+ All network I/O lives in Rust on a tokio multi-threaded runtime; hyper
15
+ parses HTTP/1.1 and handles keep-alive; rustls terminates TLS. Ruby never
16
+ touches a socket. Each request becomes a Rust-side `RequestCtx` pushed to a
17
+ bounded flume MPMC queue; Ruby workers pull from it.
18
+
19
+ ## Topology
20
+
21
+ Puma-style two-level: `workers × threads`.
22
+
23
+ - `:ractor` mode—`workers` Ractors, each running `threads` Ruby Threads
24
+ over the same worker loop. Parallel across ractors (each has its own VM
25
+ lock); concurrent within one only for I/O-bound handlers.
26
+ - `:threaded` mode—the same total capacity as plain Threads on the main
27
+ ractor. Runs any Rack app; the GVL serializes CPU work.
28
+ - Identical machinery either way: the flume queue is MPMC, a "worker slot"
29
+ is per-thread, and the worker loop (`lib/film/worker.rb`) is shared
30
+ verbatim.
31
+ - Experimental `lanes true` replaces the one shared queue with a small
32
+ private queue per worker slot (awake-preferring dispatch, work
33
+ stealing); see [benchmarks](benchmarks.md#lane-dispatch-experimental-lanes-true).
34
+
35
+ ## The Rust ↔ Ruby boundary
36
+
37
+ - **No native (TypedData) handle crosses a ractor boundary.** Worker
38
+ ractors receive plain integers (server id, worker ids) plus the
39
+ Ractor-shareable app; native state lives in a global Rust-side registry
40
+ keyed by those ids. The per-request handle
41
+ (`Film::Native::Request`, a TypedData object) is created *inside* the
42
+ worker ractor by the take calls (`take_one`/`take_batch`), so its
43
+ ownership is correct by construction.
44
+ - **Blocking discipline:** every blocking native call goes through
45
+ `rb_thread_call_without_gvl` (rb-sys; magnus doesn't wrap it) so a
46
+ blocked worker holds no VM lock. Waits poll an atomic interrupt flag
47
+ between bounded `recv_timeout` ticks; the unblock function (UBF) just
48
+ sets the flag. `flume::Selector` lost wakeups under sustained load
49
+ (workers went permanently deaf to a non-empty queue after ~100k
50
+ requests) and is not used anywhere.
51
+ - **Fast path:** when a request is already queued, `take_one` takes it
52
+ with `try_recv` while still holding the GVL—the release/reacquire pair
53
+ (two scheduler round-trips) is skipped entirely. Under load this is the
54
+ common case.
55
+ - **Fused crossing:** the common complete-body response rides
56
+ `respond_and_take_one`: answer the previous request and take the next in
57
+ one FFI call, ~one crossing per request once the loop is warm. The env
58
+ Hash carries the request handle under `env["film.request"]`, so no
59
+ per-request pair array exists either.
60
+ - **Env construction:** one FFI call builds the full CGI side of the Rack
61
+ env as a real Hash. Static keys, common methods/protocols and 44 common
62
+ `HTTP_*` header names come from a frozen (and therefore Ractor-shareable)
63
+ string cache built once at init on the main ractor. Frozen keys also
64
+ skip the dup that `Hash#[]=` performs on unfrozen string keys. Only
65
+ `rack.input` is lazy/streaming.
66
+ - **Response path:** the Rack headers Hash is passed through as-is and
67
+ iterated on the Rust side (`RHash#foreach`); header bytes are borrowed
68
+ in place from rooted Ruby strings (safe: GVL held, hyper copies
69
+ immediately). Single-chunk bodies skip the join copy.
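The remark about frozen keys describes standard MRI behaviour and is easy to observe: `Hash#[]=` stores a frozen copy of an unfrozen String key, but stores an already-frozen key as the same object. A small demonstration; `stored_key` and the header name are only examples.

```ruby
# Returns the key object a Hash actually stores for `key`.
def stored_key(key)
  env = {}
  env[key] = "abc"
  env.keys.first
end

unfrozen = +"HTTP_X_REQUEST_ID"        # mutable String
frozen   = "HTTP_X_REQUEST_ID".freeze  # what a frozen cache would hand out

stored_key(unfrozen).equal?(unfrozen) # => false (a frozen copy was stored)
stored_key(frozen).equal?(frozen)     # => true  (same object, nothing copied)
```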
70
+
71
+ ## Backpressure, in both directions
72
+
73
+ - Bounded request queue between tokio and Ruby. When it stays full past
74
+ `queue_timeout`, the client gets an immediate 503 rather than waiting.
75
+ - Request bodies stream through a bounded(8) channel: hyper is only polled
76
+ as fast as Ruby consumes (inbound backpressure costs nothing extra).
77
+ Bodyless requests (most GETs) spawn no forwarder task at all.
78
+ - Response bodies stream through a bounded(8) channel the other way: a
79
+ slow client makes `write_chunk` block—with the GVL released.
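The request-queue rule (wait up to `queue_timeout` for room, then answer 503) can be modeled with a `SizedQueue`. This is only a Ruby-side picture of logic that Film implements in Rust on a flume channel; `enqueue_or_reject`, `SERVICE_UNAVAILABLE`, and the 1 ms retry tick are assumptions made for the sketch.

```ruby
SERVICE_UNAVAILABLE = [503, { "content-type" => "text/plain" }, ["Service Unavailable"]].freeze

# Tries to enqueue `request`; if the queue stays full past `queue_timeout`
# seconds, returns a 503 triple instead of waiting any longer.
def enqueue_or_reject(queue, request, queue_timeout)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + queue_timeout
  begin
    queue.push(request, true) # non-blocking; raises ThreadError when full
    :queued
  rescue ThreadError
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
      sleep 0.001
      retry
    end
    SERVICE_UNAVAILABLE
  end
end

queue = SizedQueue.new(1)
enqueue_or_reject(queue, :first, 0.05)  # => :queued
enqueue_or_reject(queue, :second, 0.05) # => the 503 triple (nothing drained the queue)
```

The bound is what turns overload into a fast, explicit rejection: without it the queue would grow and every client would see rising latency instead.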
80
+
81
+ ## Failure handling
82
+
83
+ Three parties can answer a client, coordinated by an atomic
84
+ first-claimant-wins flag on the per-request `Responder`:
85
+
86
+ 1. The app, via the worker loop (normal path; `StandardError` is rescued
87
+ in Ruby and becomes a clean 500).
88
+ 2. The supervisor: each worker ractor has a supervisor thread blocked in
89
+ `Ractor#value`. A hard crash (any `Exception`) wakes it; it immediately
90
+ 500s the crashed ractor's in-flight requests via a `Weak<Responder>`
91
+ side table—not when GC eventually notices—and respawns the ractor
92
+ with fresh slots.
93
+ 3. A `Drop` guard on `RequestCtx` as the universal backstop (GC of an
94
+ abandoned handle, teardown races). The Drop path never touches the Ruby
95
+ API, so it is safe from any thread.
96
+
97
+ With `request_timeout` configured, the tokio front-end can additionally
98
+ answer with a 504 on its own when the response head misses the deadline;
99
+ the worker keeps running, and its late response goes nowhere harmlessly:
100
+ the front-end has stopped listening (the oneshot receiver is dropped),
101
+ and the worker's claim makes the Drop backstop a no-op.
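The first-claimant-wins rule can be sketched in a few lines. Film uses an atomic flag in Rust; Ruby's standard library has no atomic boolean, so a `Mutex` stands in here. The class below is a model written for this note and shares only its name with Film's `Responder`.

```ruby
# Model of a per-request responder that several parties may try to answer.
# Only the first claim succeeds; every later attempt is a no-op.
class Responder
  attr_reader :response

  def initialize
    @lock = Mutex.new
    @claimed = false
    @response = nil
  end

  # Returns true only for the caller that wins the right to answer.
  def claim
    @lock.synchronize do
      return false if @claimed
      @claimed = true
    end
  end

  # Records the response if, and only if, this caller won the claim.
  def respond(by, status)
    return false unless claim
    @response = [status, by]
    true
  end
end

responder = Responder.new
responder.respond(:app, 200)        # => true
responder.respond(:supervisor, 500) # => false
responder.respond(:drop_guard, 500) # => false
responder.response                  # => [200, :app]
```

Whichever party gets there first decides what the client sees, and the others can call `respond` unconditionally without risking a double answer.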
102
+
103
+ Client aborts are handled the same way in reverse: hyper drops the request
104
+ future, and a Rust `Drop` guard keeps the in-flight counter honest (a
105
+ plain decrement after an `.await` would never run).
106
+
107
+ ## Graceful shutdown
108
+
109
+ `stop_accepting` → drain until queue + in-flight reach zero or the
110
+ deadline passes → `close_queue` (idle workers see Disconnected and exit) →
111
+ join workers → past deadline: abort remaining clients (a 500, or a
112
+ connection abort mid-stream), interrupt blocked workers, reap
113
+ stragglers → tear down the tokio runtime. Idempotent;
114
+ a second INT/TERM force-exits.
115
+
116
+ ## Timer waits: `Film.sleep`
117
+
118
+ MRI's `sleep` parks the thread on the VM timer, whose wakeups inside
119
+ non-main ractors are coarse (how coarse is environment-dependent; see
120
+ [benchmarks](benchmarks.md#why-io-lags-in-ractor-mode-on-linux)).
121
+ `Film.sleep` releases the GVL and waits on the OS clock directly, chunked
122
+ at the interrupt tick so `Thread#kill` and shutdown stay responsive.
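The chunking idea can be shown in Ruby: wait toward a deadline on the monotonic clock in short slices and look at an interrupt flag between slices. The real `Film.sleep` is native and avoids Ruby's `sleep`; this sketch uses `sleep` for the slices anyway, so it illustrates the loop structure and not the timer accuracy. `chunked_sleep`, the 10 ms tick, and the `interrupted` callable are assumptions.

```ruby
# Waits up to `seconds`, in slices of at most `tick`, stopping early when
# `interrupted.call` turns true. Returns the time actually spent waiting.
def chunked_sleep(seconds, tick: 0.01, interrupted: -> { false })
  clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) }
  started = clock.call
  deadline = started + seconds
  until interrupted.call
    remaining = deadline - clock.call
    break if remaining <= 0
    sleep [remaining, tick].min
  end
  clock.call - started
end

chunked_sleep(0.05)                            # waits the full 50 ms
chunked_sleep(5.0, interrupted: -> { true })   # returns almost immediately
```

The slice length bounds how long an interrupt can go unnoticed, which is the trade-off behind keeping `Thread#kill` and shutdown responsive.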
123
+
124
+ ## Why tokio (researched June 2026)
125
+
126
+ - **tokio + hyper**: the bottleneck is the Ruby dispatch boundary, not raw
127
+ I/O throughput; what matters is HTTP correctness, keep-alive, TLS, and
128
+ h2-later—hyper's territory. Cross-platform out of the box.
129
+ - **monoio**: thread-per-core io_uring looks great in echo-server
130
+ benchmarks, but hyper only works through its poll-io compat layer
131
+ (forfeiting io_uring on the hot path), and the share-nothing advantage
132
+ is spent the moment requests fan into an MPMC queue toward Ruby.
133
+ - **compio**: completion-based, cross-platform, production-proven—but no
134
+ first-class HTTP server story yet, and completion-model owned-buffer
135
+ semantics would leak into the request lifecycle design.
136
+ - **ntex**: the strongest alternative—unlike monoio/compio it has a
137
+ first-class HTTP/1.1 + HTTP/2 server stack (TechEmpower top tier) plus
138
+ an io_uring runtime ("neon") on Linux today. Rejected as the default
139
+ for now: its thread-per-core, `Rc`-based `!Send` worker model is
140
+ exactly what our Send-ctx-into-MPMC dispatch opts out of; its own
141
+ request/response/body types would force a conversion seam through
142
+ `Responder` and the streaming path; neon is Linux-only (ntex-on-tokio
143
+ elsewhere forfeits the io_uring win and just trades hyper's
144
+ battle-tested h1 for a less-deployed one); and the realistic gain is
145
+ confined to syscall-bound /plaintext-class traffic—the Ruby boundary,
146
+ not the front-end, is where Film's time goes. Worth a contained
147
+ feature-flag spike if the Linux plaintext ceiling ever matters
148
+ competitively.
149
+ - **io_uring path**: tokio ships in-tree io_uring as an unstable feature
150
+ (file ops as of 1.52; network expected to follow). `server.rs` isolates
151
+ the runtime, so adopting it later is a contained change—and would
152
+ deliver most of ntex/neon's win without the type seam.
153
+
154
+ ## Versioning of risky dependencies
155
+
156
+ magnus is used for everything except the GVL-release primitives and the
157
+ `rb_ext_ractor_safe` flag, which go straight to rb-sys (magnus wraps
158
+ neither). magnus's lazy TypedData class cache is force-resolved at init
159
+ on the main ractor, so no worker ractor ever races its first resolution;
160
+ the only symbols the crate creates are made during `server_start`, also
161
+ on the main ractor.