puma 6.4.1 → 7.2.1
- checksums.yaml +4 -4
- data/History.md +407 -8
- data/README.md +109 -49
- data/docs/deployment.md +58 -23
- data/docs/fork_worker.md +11 -1
- data/docs/java_options.md +54 -0
- data/docs/jungle/README.md +1 -1
- data/docs/kubernetes.md +11 -16
- data/docs/plugins.md +6 -2
- data/docs/restart.md +2 -2
- data/docs/signals.md +21 -21
- data/docs/stats.md +11 -5
- data/docs/systemd.md +14 -5
- data/ext/puma_http11/extconf.rb +20 -32
- data/ext/puma_http11/mini_ssl.c +29 -9
- data/ext/puma_http11/org/jruby/puma/Http11.java +40 -9
- data/ext/puma_http11/puma_http11.c +125 -118
- data/lib/puma/app/status.rb +11 -3
- data/lib/puma/binder.rb +21 -11
- data/lib/puma/cli.rb +10 -8
- data/lib/puma/client.rb +183 -83
- data/lib/puma/cluster/worker.rb +24 -21
- data/lib/puma/cluster/worker_handle.rb +38 -8
- data/lib/puma/cluster.rb +73 -47
- data/lib/puma/cluster_accept_loop_delay.rb +91 -0
- data/lib/puma/commonlogger.rb +3 -3
- data/lib/puma/configuration.rb +131 -60
- data/lib/puma/const.rb +31 -12
- data/lib/puma/control_cli.rb +10 -6
- data/lib/puma/detect.rb +2 -0
- data/lib/puma/dsl.rb +411 -121
- data/lib/puma/error_logger.rb +7 -5
- data/lib/puma/events.rb +25 -10
- data/lib/puma/io_buffer.rb +8 -4
- data/lib/puma/jruby_restart.rb +0 -16
- data/lib/puma/launcher/bundle_pruner.rb +1 -1
- data/lib/puma/launcher.rb +73 -55
- data/lib/puma/log_writer.rb +9 -9
- data/lib/puma/minissl/context_builder.rb +1 -0
- data/lib/puma/minissl.rb +1 -1
- data/lib/puma/null_io.rb +26 -0
- data/lib/puma/plugin/systemd.rb +3 -3
- data/lib/puma/rack/urlmap.rb +1 -1
- data/lib/puma/reactor.rb +19 -13
- data/lib/puma/request.rb +71 -39
- data/lib/puma/runner.rb +15 -17
- data/lib/puma/sd_notify.rb +1 -4
- data/lib/puma/server.rb +134 -73
- data/lib/puma/single.rb +7 -4
- data/lib/puma/state_file.rb +3 -2
- data/lib/puma/thread_pool.rb +57 -80
- data/lib/puma/util.rb +0 -7
- data/lib/puma.rb +10 -0
- data/lib/rack/handler/puma.rb +10 -7
- data/tools/Dockerfile +15 -5
- metadata +14 -15
- data/ext/puma_http11/ext_help.h +0 -15
data/README.md
CHANGED
|
@@ -4,15 +4,14 @@
|
|
|
4
4
|
|
|
5
5
|
# Puma: A Ruby Web Server Built For Parallelism
|
|
6
6
|
|
|
7
|
-
[](https://codeclimate.com/github/puma/puma)
|
|
7
|
+
[](https://github.com/puma/puma/actions/workflows/tests.yml?query=branch%3Amain)
|
|
9
8
|
[]( https://stackoverflow.com/questions/tagged/puma )
|
|
10
9
|
|
|
11
10
|
Puma is a **simple, fast, multi-threaded, and highly parallel HTTP 1.1 server for Ruby/Rack applications**.
|
|
12
11
|
|
|
13
12
|
## Built For Speed & Parallelism
|
|
14
13
|
|
|
15
|
-
Puma is a server for [Rack](https://github.com/rack/rack)-powered HTTP applications written in Ruby. It is:
|
|
14
|
+
Puma is a server for [Rack](https://github.com/rack/rack)-powered HTTP applications written in Ruby. It is:
|
|
16
15
|
* **Multi-threaded**. Each request is served in a separate thread. This helps you serve more requests per second with less memory use.
|
|
17
16
|
* **Multi-process**. "Pre-forks" in cluster mode, using less memory per-process thanks to copy-on-write memory.
|
|
18
17
|
* **Standalone**. With SSL support, zero-downtime rolling restarts and a built-in request bufferer, you can deploy Puma without any reverse proxy.
|
|
@@ -82,10 +81,10 @@ $ bundle exec puma
|
|
|
82
81
|
|
|
83
82
|
## Configuration
|
|
84
83
|
|
|
85
|
-
Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/
|
|
84
|
+
Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb).
|
|
86
85
|
|
|
87
86
|
You can also find several configuration examples as part of the
|
|
88
|
-
[test](https://github.com/puma/puma/tree/
|
|
87
|
+
[test](https://github.com/puma/puma/tree/main/test/config) suite.
|
|
89
88
|
|
|
90
89
|
For debugging purposes, you can set the environment variable `PUMA_LOG_CONFIG` with a value
|
|
91
90
|
and the loaded configuration will be printed as part of the boot process.
|
|
@@ -102,9 +101,9 @@ Puma will automatically scale the number of threads, from the minimum until it c
|
|
|
102
101
|
|
|
103
102
|
Be aware that additionally Puma creates threads on its own for internal purposes (e.g. handling slow clients). So, even if you specify -t 1:1, expect around 7 threads created in your application.
|
|
104
103
|
|
|
105
|
-
###
|
|
104
|
+
### Cluster mode
|
|
106
105
|
|
|
107
|
-
Puma also offers "
|
|
106
|
+
Puma also offers "cluster mode". Cluster mode `fork`s workers from a master process. Each child process still has its own thread pool. You can tune the number of workers with the `-w` (or `--workers`) flag:
|
|
108
107
|
|
|
109
108
|
```
|
|
110
109
|
$ puma -t 8:32 -w 3
|
|
@@ -116,13 +115,24 @@ Or with the `WEB_CONCURRENCY` environment variable:
|
|
|
116
115
|
$ WEB_CONCURRENCY=3 puma -t 8:32
|
|
117
116
|
```
|
|
118
117
|
|
|
119
|
-
|
|
118
|
+
When using a config file, most applications can simply set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of worker processes to the available processors:
|
|
120
119
|
|
|
121
|
-
|
|
120
|
+
```ruby
|
|
121
|
+
# config/puma.rb
|
|
122
|
+
workers :auto
|
|
123
|
+
```
|
|
124
|
+
|
|
125
|
+
See [`workers :auto` gotchas](lib/puma/dsl.rb).
|
|
126
|
+
|
|
127
|
+
Note that threads are still used in cluster mode, and the `-t` thread flag setting is per worker, so `-w 2 -t 16:16` will spawn 32 threads in total, with 16 in each worker process.
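As a rough illustration of that arithmetic, the sketch below computes the upper bound on request-handling threads for a given `-w`/`-t` pair. `parse_thread_flag` and `max_request_threads` are hypothetical helpers, not part of Puma, and the result ignores the extra threads Puma starts for internal purposes.

```ruby
# Hypothetical helper: splits a "min:max" thread setting such as "16:16".
def parse_thread_flag(flag)
  min, max = flag.split(":").map { |n| Integer(n) }
  raise ArgumentError, "expected min:max, got #{flag.inspect}" if max.nil?
  [min, max]
end

# Upper bound on request-handling threads across the whole cluster:
# every worker process has its own pool of up to `max` threads.
def max_request_threads(workers:, thread_flag:)
  _min, max = parse_thread_flag(thread_flag)
  workers * max
end

puts max_request_threads(workers: 2, thread_flag: "16:16") # prints 32
puts max_request_threads(workers: 3, thread_flag: "8:32")  # prints 96
```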
|
|
128
|
+
|
|
129
|
+
If `workers` is set to `:auto`, or the `WEB_CONCURRENCY` environment variable is set to `"auto"`, and the `concurrent-ruby` gem is available in your application, Puma will set the worker process count to the result of [available processors](https://msp-greg.github.io/concurrent-ruby/Concurrent.html#available_processor_count-class_method).
|
|
122
130
|
|
|
123
|
-
|
|
131
|
+
For an in-depth discussion of the tradeoffs of thread and process count settings, [see our docs](docs/deployment.md).
|
|
124
132
|
|
|
125
|
-
|
|
133
|
+
In cluster mode, Puma can "preload" your application. This loads all the application code *prior* to forking. Preloading reduces total memory usage of your application via an operating system feature called [copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write).
|
|
134
|
+
|
|
135
|
+
If the number of workers is greater than 1 (and `--prune-bundler` has not been specified), preloading will be enabled by default. Otherwise, you can use the `--preload` flag from the command line:
|
|
126
136
|
|
|
127
137
|
```
|
|
128
138
|
$ puma -w 3 --preload
|
|
@@ -138,51 +148,106 @@ preload_app!
|
|
|
138
148
|
|
|
139
149
|
Preloading can’t be used with phased restart, since phased restart kills and restarts workers one-by-one, and preloading copies the code of master into the workers.
|
|
140
150
|
|
|
141
|
-
|
|
151
|
+
#### Cluster mode hooks
|
|
152
|
+
|
|
153
|
+
When using clustered mode, Puma's configuration DSL provides `before_fork`, `before_worker_boot`, and `after_worker_shutdown`
|
|
154
|
+
hooks to run code when the master process forks, the child workers are booted, and after each child worker exits respectively.
|
|
155
|
+
|
|
156
|
+
It is recommended to use these hooks with `preload_app!`, otherwise constants loaded by your
|
|
157
|
+
application (such as `Rails`) will not be available inside the hooks.
|
|
142
158
|
|
|
143
159
|
```ruby
|
|
144
160
|
# config/puma.rb
|
|
145
|
-
|
|
146
|
-
#
|
|
161
|
+
before_fork do
|
|
162
|
+
# Add code to run inside the Puma master process before it forks a worker child.
|
|
147
163
|
end
|
|
148
|
-
```
|
|
149
164
|
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
165
|
+
before_worker_boot do
|
|
166
|
+
# Add code to run inside the Puma worker process after forking.
|
|
167
|
+
end
|
|
153
168
|
|
|
154
|
-
|
|
155
|
-
|
|
169
|
+
after_worker_shutdown do |worker_handle|
|
|
170
|
+
# Add code to run inside the Puma master process after a worker exits. `worker_handle.process_status` can be used to get the
|
|
171
|
+
# `Process::Status` of the exited worker.
|
|
172
|
+
end
|
|
173
|
+
```
|
|
156
174
|
|
|
157
|
-
|
|
175
|
+
In addition, there are `before_refork` and `after_refork` hooks, which are used only in [`fork_worker` mode](docs/fork_worker.md),
|
|
176
|
+
when the worker 0 child process forks a grandchild worker:
|
|
158
177
|
|
|
159
178
|
```ruby
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
#
|
|
179
|
+
before_refork do
|
|
180
|
+
# Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
|
|
181
|
+
# child process before it forks a grandchild worker.
|
|
163
182
|
end
|
|
164
183
|
```
|
|
165
184
|
|
|
166
|
-
|
|
185
|
+
```ruby
|
|
186
|
+
after_refork do
|
|
187
|
+
# Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
|
|
188
|
+
# child process after it forks a grandchild worker.
|
|
189
|
+
end
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
Importantly, note the following considerations when Ruby forks a child process:
|
|
193
|
+
|
|
194
|
+
1. File descriptors such as network sockets **are** copied from the parent to the forked
|
|
195
|
+
child process. Dual-use of the same sockets by parent and child will result in I/O conflicts
|
|
196
|
+
such as `SocketError`, `Errno::EPIPE`, and `EOFError`.
|
|
197
|
+
2. Background Ruby threads, including threads used by various third-party gems for connection
|
|
198
|
+
monitoring, etc., are **not** copied to the child process. Often this does not cause
|
|
199
|
+
immediate problems until a third-party connection goes down, at which point there will
|
|
200
|
+
be no supervisor to reconnect it.
|
|
201
|
+
|
|
202
|
+
Therefore, we recommend the following:
|
|
203
|
+
|
|
204
|
+
1. If possible, do not establish any socket connections (HTTP, database connections, etc.)
|
|
205
|
+
inside Puma's master process when booting.
|
|
206
|
+
2. If (1) is not possible, use `before_fork` and `before_refork` to disconnect the parent's socket
|
|
207
|
+
connections when forking, so that they are not accidentally copied to the child process.
|
|
208
|
+
3. Use `before_worker_boot` to restart any background threads on the forked child.
|
|
209
|
+
4. Use `after_refork` to restart any background threads on the parent.
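
The second consideration above (background threads are not copied) can be observed with plain Ruby, independent of Puma. This is a minimal sketch, assuming a platform that supports `fork` (so not Windows or JRuby). `thread_count_in_forked_child` is a name made up for the illustration, and the sleeping thread stands in for a gem's connection monitor.

```ruby
# Returns how many Ruby threads exist in a freshly forked child process.
def thread_count_in_forked_child
  # Stand-in for a background thread started by a third-party gem.
  monitor = Thread.new { sleep }
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Only the thread that called fork is carried over into the child.
    writer.puts Thread.list.size
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  count = reader.gets.to_i
  reader.close
  Process.wait(pid)
  count
ensure
  monitor&.kill
end

puts "threads alive in child: #{thread_count_in_forked_child}"
```

On MRI the child should report a single thread even though the parent had the monitor running, which is why a hook such as `before_worker_boot` is the place to start such threads again.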
|
|
210
|
+
|
|
211
|
+
#### Master process lifecycle hooks
|
|
212
|
+
|
|
213
|
+
Puma's configuration DSL provides master process lifecycle hooks `after_booted`, `before_restart`, and `after_stopped`
|
|
214
|
+
which may be used to specify code blocks to run on each event:
|
|
167
215
|
|
|
168
216
|
```ruby
|
|
169
217
|
# config/puma.rb
|
|
170
|
-
|
|
171
|
-
#
|
|
218
|
+
after_booted do
|
|
219
|
+
# Add code to run in the Puma master process after it boots,
|
|
220
|
+
# and also after a phased restart completes.
|
|
221
|
+
end
|
|
222
|
+
|
|
223
|
+
before_restart do
|
|
224
|
+
# Add code to run in the Puma master process when it receives
|
|
225
|
+
# a restart command but before it restarts.
|
|
226
|
+
end
|
|
227
|
+
|
|
228
|
+
after_stopped do
|
|
229
|
+
# Add code to run in the Puma master process when it receives
|
|
230
|
+
# a stop command but before it shuts down.
|
|
172
231
|
end
|
|
173
232
|
```
|
|
174
233
|
|
|
175
234
|
### Error handling
|
|
176
235
|
|
|
177
|
-
If
|
|
178
|
-
textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/
|
|
236
|
+
If Puma encounters an error outside of the context of your application, it will respond with a 400/500 and a simple
|
|
237
|
+
textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/main/lib/puma/server.rb)).
|
|
179
238
|
You can specify custom behavior for this scenario. For example, you can report the error to your third-party
|
|
180
239
|
error-tracking service (in this example, [rollbar](https://rollbar.com)):
|
|
181
240
|
|
|
182
241
|
```ruby
|
|
183
|
-
lowlevel_error_handler do |e|
|
|
184
|
-
|
|
185
|
-
|
|
242
|
+
lowlevel_error_handler do |e, env, status|
|
|
243
|
+
if status == 400
|
|
244
|
+
message = "The server could not process the request due to an error, such as an incorrectly typed URL, malformed syntax, or a URL that contains illegal characters.\n"
|
|
245
|
+
else
|
|
246
|
+
message = "An error has occurred, and engineers have been informed. Please reload the page. If you continue to have problems, contact support@example.com\n"
|
|
247
|
+
Rollbar.critical(e)
|
|
248
|
+
end
|
|
249
|
+
|
|
250
|
+
[status, {}, [message]]
|
|
186
251
|
end
|
|
187
252
|
```
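
The shape of such a handler can be exercised outside Puma. The sketch below assumes, as in the example above, that the block receives the error, the Rack env, and the status, and returns a Rack-style response triple. `RecordingReporter` and `build_lowlevel_response` are hypothetical names, with the reporter standing in for an error-tracking client.

```ruby
# Hypothetical stand-in for an error tracker; it only records what it is given.
class RecordingReporter
  attr_reader :errors

  def initialize
    @errors = []
  end

  def critical(error)
    @errors << error
  end
end

# Same logic as the handler block: client errors (400) get a plain message,
# everything else is reported before responding.
def build_lowlevel_response(error, _env, status, reporter:)
  if status == 400
    message = "Bad request\n"
  else
    message = "Something went wrong\n"
    reporter.critical(error)
  end
  [status, {}, [message]]
end

reporter = RecordingReporter.new
p build_lowlevel_response(ArgumentError.new("bad header"), {}, 400, reporter: reporter)
p build_lowlevel_response(RuntimeError.new("boom"), {}, 500, reporter: reporter)
puts reporter.errors.size # only the 500 was reported
```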
|
|
188
253
|
|
|
@@ -249,7 +314,7 @@ $ puma -b ssl://localhost:9292 -b tcp://localhost:9393 -C config/use_local_host.
|
|
|
249
314
|
|
|
250
315
|
#### Controlling SSL Cipher Suites
|
|
251
316
|
|
|
252
|
-
To use or avoid specific SSL
|
|
317
|
+
To use or avoid specific SSL ciphers for TLSv1.2 and below, use the `ssl_cipher_filter` or `ssl_cipher_list` options.
|
|
253
318
|
|
|
254
319
|
##### Ruby:
|
|
255
320
|
|
|
@@ -263,6 +328,14 @@ $ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&ssl_cipher_fil
|
|
|
263
328
|
$ puma -b 'ssl://127.0.0.1:9292?keystore=path_to_keystore&keystore-pass=keystore_password&ssl_cipher_list=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA'
|
|
264
329
|
```
|
|
265
330
|
|
|
331
|
+
To configure the available TLSv1.3 ciphersuites, use the `ssl_ciphersuites` option (not available for JRuby).
|
|
332
|
+
|
|
333
|
+
##### Ruby:
|
|
334
|
+
|
|
335
|
+
```
|
|
336
|
+
$ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&ssl_ciphersuites=TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256'
|
|
337
|
+
```
|
|
338
|
+
|
|
266
339
|
See https://www.openssl.org/docs/man1.1.1/man1/ciphers.html for cipher filter format and full list of cipher suites.
|
|
267
340
|
|
|
268
341
|
Disable TLS v1 with the `no_tlsv1` option:
|
|
@@ -279,7 +352,7 @@ To enable verification flags offered by OpenSSL, use `verification_flags` (not a
|
|
|
279
352
|
$ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&verification_flags=PARTIAL_CHAIN'
|
|
280
353
|
```
|
|
281
354
|
|
|
282
|
-
You can also set multiple verification flags (by separating them with
|
|
355
|
+
You can also set multiple verification flags (by separating them with a comma):
|
|
283
356
|
|
|
284
357
|
```
|
|
285
358
|
$ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&verification_flags=PARTIAL_CHAIN,CRL_CHECK'
|
|
@@ -320,7 +393,7 @@ Puma has a built-in status and control app that can be used to query and control
|
|
|
320
393
|
$ puma --control-url tcp://127.0.0.1:9293 --control-token foo
|
|
321
394
|
```
|
|
322
395
|
|
|
323
|
-
Puma will start the control server on localhost port 9293. All requests to the control server will need to include control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/
|
|
396
|
+
Puma will start the control server on localhost port 9293. All requests to the control server will need to include the control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/main/lib/puma/app/status.rb) to see what the status app has available.
|
|
324
397
|
|
|
325
398
|
You can also interact with the control server via `pumactl`. This command will restart Puma:
|
|
326
399
|
|
|
@@ -352,7 +425,7 @@ $ puma -C "-"
|
|
|
352
425
|
|
|
353
426
|
The other side-effects of setting the environment are whether to show stack traces (in `development` or `test`), and setting RACK_ENV may potentially affect middleware looking for this value to change their behavior. The default puma RACK_ENV value is `development`. You can see all config default values in `Puma::Configuration#puma_default_options` or [configuration.rb](https://github.com/puma/puma/blob/61c6213fbab/lib/puma/configuration.rb#L182-L204).
|
|
354
427
|
|
|
355
|
-
Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/
|
|
428
|
+
Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb) to see all available options.
|
|
356
429
|
|
|
357
430
|
## Restart
|
|
358
431
|
|
|
@@ -372,19 +445,6 @@ Some platforms do not support all Puma features.
|
|
|
372
445
|
* **Windows**: Cluster mode is not supported due to a lack of fork(2).
|
|
373
446
|
* **Kubernetes**: The way Kubernetes handles pod shutdowns interacts poorly with server processes implementing graceful shutdown, like Puma. See the [kubernetes section of the documentation](docs/kubernetes.md) for more details.
|
|
374
447
|
|
|
375
|
-
## Known Bugs
|
|
376
|
-
|
|
377
|
-
For MRI versions 2.2.7, 2.2.8, 2.2.9, 2.2.10, 2.3.4 and 2.4.1, you may see ```stream closed in another thread (IOError)```. It may be caused by a [Ruby bug](https://bugs.ruby-lang.org/issues/13632). It can be fixed with the gem https://rubygems.org/gems/stopgap_13632:
|
|
378
|
-
|
|
379
|
-
```ruby
|
|
380
|
-
if %w(2.2.7 2.2.8 2.2.9 2.2.10 2.3.4 2.4.1).include? RUBY_VERSION
|
|
381
|
-
begin
|
|
382
|
-
require 'stopgap_13632'
|
|
383
|
-
rescue LoadError
|
|
384
|
-
end
|
|
385
|
-
end
|
|
386
|
-
```
|
|
387
|
-
|
|
388
448
|
## Deployment
|
|
389
449
|
|
|
390
450
|
* Puma has support for Capistrano with an [external gem](https://github.com/seuros/capistrano-puma).
|
data/docs/deployment.md
CHANGED
|
@@ -16,32 +16,34 @@ assume this is how you're using Puma.
|
|
|
16
16
|
Initially, Puma was conceived as a thread-only web server, but support for
|
|
17
17
|
processes was added in version 2.
|
|
18
18
|
|
|
19
|
+
In general, use single mode only if:
|
|
20
|
+
|
|
21
|
+
* You are using JRuby, TruffleRuby or another fully-multithreaded implementation of Ruby
|
|
22
|
+
* You are using MRI but in an environment where only 1 CPU core is available.
|
|
23
|
+
|
|
24
|
+
Otherwise, you'll want to use cluster mode to utilize all available CPU resources.
|
|
25
|
+
|
|
19
26
|
To run `puma` in single mode (i.e., as a development environment), set the
|
|
20
27
|
number of workers to 0; anything higher will run in cluster mode.
|
|
21
28
|
|
|
22
|
-
|
|
29
|
+
## Cluster Mode Tips
|
|
23
30
|
|
|
24
|
-
|
|
31
|
+
For the purposes of Puma provisioning, "CPU cores" means:
|
|
25
32
|
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
* Set the number of threads to desired concurrent requests/number of workers.
|
|
29
|
-
Puma defaults to 5, and that's a decent number.
|
|
33
|
+
1. On ARM, the number of physical cores.
|
|
34
|
+
2. On x86, the number of logical cores, hyperthreads, or vCPUs (these words all mean the same thing).
|
|
30
35
|
|
|
31
|
-
|
|
36
|
+
Set your config with the following process:
|
|
32
37
|
|
|
33
|
-
* If you'
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
* Enjoy 50% memory savings
|
|
37
|
-
* As you grow more confident in the thread-safety of your app, you can tune the
|
|
38
|
-
workers down and the threads up.
|
|
38
|
+
* Use cluster mode and set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of CPU cores on the machine (minimum 2, otherwise use single mode!). If you can't add the gem, set the worker count manually to the available CPU cores.
|
|
39
|
+
* Set the number of threads to desired concurrent requests/number of workers.
|
|
40
|
+
Puma defaults to 5, and that's a decent number.
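
The thread rule above is a simple division. As a sketch, assuming you round up so the target concurrency is met or slightly exceeded (the rounding choice and the helper name `threads_per_worker` are assumptions, not part of Puma):

```ruby
# Threads per worker needed to reach a target number of concurrent requests.
def threads_per_worker(desired_concurrency:, workers:)
  raise ArgumentError, "workers must be positive" unless workers.positive?
  (desired_concurrency.to_f / workers).ceil
end

puts threads_per_worker(desired_concurrency: 20, workers: 4) # prints 5
puts threads_per_worker(desired_concurrency: 30, workers: 4) # prints 8
```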
|
|
39
41
|
|
|
40
|
-
|
|
42
|
+
For most deployments, adding `concurrent-ruby` and using `workers :auto` is the right starting point.
|
|
41
43
|
|
|
42
|
-
See [
|
|
44
|
+
See [`workers :auto` gotchas](../lib/puma/dsl.rb).
|
|
43
45
|
|
|
44
|
-
|
|
46
|
+
## Worker utilization
|
|
45
47
|
|
|
46
48
|
**How do you know if you've got enough (or too many) workers?**
|
|
47
49
|
|
|
@@ -50,14 +52,34 @@ a time. But since so many apps are waiting on IO from DBs, etc., they can
|
|
|
50
52
|
utilize threads to use the process more efficiently.
|
|
51
53
|
|
|
52
54
|
Generally, you never want processes that are pegged all the time. That can mean
|
|
53
|
-
there is more work to do than the process can get through. On the other hand, if
|
|
54
|
-
you have processes that sit around doing nothing, then
|
|
55
|
-
|
|
55
|
+
there is more work to do than the process can get through, and requests will end up with additional latency. On the other hand, if
|
|
56
|
+
you have processes that sit around doing nothing, then you're wasting resources and money.
|
|
57
|
+
|
|
58
|
+
In general, you are making a tradeoff between:
|
|
59
|
+
|
|
60
|
+
1. CPU and memory utilization.
|
|
61
|
+
2. Time spent queueing for a Puma worker to `accept` requests and additional latency caused by CPU contention.
|
|
62
|
+
|
|
63
|
+
If latency is important to you, you will have to accept lower utilization, and vice versa.
|
|
56
64
|
|
|
57
|
-
|
|
58
|
-
utilization means you've got capacity still but aren't starving threads.
|
|
65
|
+
## Container/VPS sizing
|
|
59
66
|
|
|
60
|
-
|
|
67
|
+
You will have to make a decision about how "big" to make each pod/VPS/server/dyno.
|
|
68
|
+
|
|
69
|
+
**TL;DR**: 80% of Puma apps will end up deploying "pods" of 4 workers, 5 threads each, 4 vCPU and 8GB of RAM.
|
|
70
|
+
|
|
71
|
+
For the rest of this discussion, we'll adopt the Kubernetes term of "pods".
|
|
72
|
+
|
|
73
|
+
Should you run 2 pods with 50 workers each? 25 pods, each with 4 workers? 100 pods, with each Puma running in single mode? Each scenario represents the same total amount of capacity (100 Puma processes that can respond to requests), but there are tradeoffs to make:
|
|
74
|
+
|
|
75
|
+
* **Increasing worker counts decreases latency, but means you scale in bigger "chunks"**. Worker counts should be somewhere between 4 and 32 in most cases. You want at least 4 in order to minimize time spent in request queueing for a free Puma worker, but probably fewer than ~32, because otherwise autoscaling works in increments that are too large or the pods won't fit well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
|
|
76
|
+
* **Increasing thread counts will increase throughput, but also latency and memory use**. Unless you have a very I/O-heavy application (50%+ time spent waiting on I/O), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50% of wall clock time) will lead to additional request latency and additional memory usage.
|
|
77
|
+
* **Increasing worker counts decreases memory per worker on average**. More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
|
|
78
|
+
* **Low worker counts (<4) have exceptionally poor throughput**. Don't run fewer than 4 processes per pod if you can avoid it. Low numbers of processes per pod will lead to high request queueing (see discussion above), which means you will have to run more pods and use more resources.
|
|
79
|
+
* **CPU-core-to-worker ratios should be around 1**. If running Puma with `threads > 1`, allocate 1 CPU core (see definition above!) per worker. If single threaded, allocate ~0.75 CPU cores per worker. Most web applications spend about 25% of their time in I/O, but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core. Using `workers :auto` will size workers to this guidance on most platforms.
|
|
80
|
+
* **Don't set memory limits unless necessary**. Most Puma processes will use about 512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8GB (1GB for the master, 7GB for the 4 children).
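
The 1/n rule of thumb from the first tradeoff can be made concrete with a few lines of Ruby. This is only a sketch of the proportionality stated above, not a queueing model, and `relative_queue_wait` is a hypothetical helper name.

```ruby
# Relative queue wait under the rule of thumb above: wait is proportional
# to 1/n, where n is the number of workers pulling from one request queue.
def relative_queue_wait(workers_per_pod)
  raise ArgumentError, "need at least one worker" unless workers_per_pod.positive?
  1.0 / workers_per_pod
end

# Each layout provides the same total capacity of 100 Puma processes.
layouts = {
  "2 pods x 50 workers" => 50,
  "25 pods x 4 workers" => 4,
  "100 pods x 1 worker" => 1
}
layouts.each do |name, workers_per_pod|
  puts format("%-20s relative wait %.2f", name, relative_queue_wait(workers_per_pod))
end

# One worker per pod versus four workers per pod:
puts relative_queue_wait(1) / relative_queue_wait(4) # prints 4.0
```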
|
|
81
|
+
|
|
82
|
+
**Measuring utilization and queue time**
|
|
61
83
|
|
|
62
84
|
Using a timestamp header from an upstream proxy server (e.g., `nginx` or
|
|
63
85
|
`haproxy`) makes it possible to indicate how long requests have been waiting for
|
|
@@ -75,7 +97,7 @@ a Puma thread to become available.
|
|
|
75
97
|
* `env['puma.request_body_wait']` contains the number of milliseconds Puma
|
|
76
98
|
spent waiting for the client to send the request body.
|
|
77
99
|
* haproxy: `%Th` (TLS handshake time) and `%Ti` (idle time before request)
|
|
78
|
-
can
|
|
100
|
+
can also be added as headers.
|
|
79
101
|
|
|
80
102
|
## Should I daemonize?
|
|
81
103
|
|
|
@@ -100,3 +122,16 @@ or hell, even `monit`.
|
|
|
100
122
|
You probably will want to deploy some new code at some point, and you'd like
|
|
101
123
|
Puma to start running that new code. There are a few options for restarting
|
|
102
124
|
Puma, described separately in our [restart documentation](restart.md).
|
|
125
|
+
|
|
126
|
+
## Migrating from Unicorn
|
|
127
|
+
|
|
128
|
+
* If you're migrating from Unicorn, here are some settings to start with:
|
|
129
|
+
* Set workers to half the number of unicorn workers you're using
|
|
130
|
+
* Set threads to 2
|
|
131
|
+
* Enjoy 50% memory savings
|
|
132
|
+
* As you grow more confident in the thread-safety of your app, you can tune the
|
|
133
|
+
workers down and the threads up.
|
|
134
|
+
|
|
135
|
+
## Ubuntu / Systemd (Systemctl) Installation
|
|
136
|
+
|
|
137
|
+
See [systemd.md](systemd.md)
|
data/docs/fork_worker.md
CHANGED
|
@@ -22,10 +22,20 @@ The `fork_worker` option allows your application to be initialized only once for
|
|
|
22
22
|
|
|
23
23
|
You can trigger a refork by sending the cluster the `SIGURG` signal or running the `pumactl refork` command at any time. A refork will also automatically trigger once, after a certain number of requests have been processed by worker 0 (default 1000). To configure the number of requests before the auto-refork, pass a positive integer argument to `fork_worker` (e.g., `fork_worker 1000`), or `0` to disable.
|
|
24
24
|
|
|
25
|
+
### Usage Considerations
|
|
26
|
+
|
|
27
|
+
- `fork_worker` introduces new `before_refork` and `after_refork` configuration hooks. Note the following:
|
|
28
|
+
- When initially forking the parent process to the worker 0 child, `before_fork` will trigger on the parent process and `before_worker_boot` will trigger on the worker 0 child as normal.
|
|
29
|
+
- When forking the worker 0 child to grandchild workers, `before_refork` and `after_refork` will trigger on the worker 0 child, and `before_worker_boot` will trigger on each grandchild worker.
|
|
30
|
+
- For clarity, `before_fork` does not trigger on worker 0, and `after_refork` does not trigger on the grandchild.
|
|
31
|
+
- As a general migration guide:
|
|
32
|
+
- Copy any logic within your existing `before_fork` hook to the `before_refork` hook.
|
|
33
|
+
- Consider copying logic from your `before_worker_boot` hook to the `after_refork` hook, if it is needed to reset the state of worker 0 after it forks.
|
|
34
|
+
|
|
25
35
|
### Limitations
|
|
26
36
|
|
|
27
37
|
- This mode is still very experimental so there may be bugs or edge-cases, particularly around expected behavior of existing hooks. Please open a [bug report](https://github.com/puma/puma/issues/new?template=bug_report.md) if you encounter any issues.
|
|
28
38
|
|
|
29
39
|
- In order to fork new workers cleanly, worker 0 shuts down its server and stops serving requests so there are no open file descriptors or other kinds of shared global state between processes, and to maximize copy-on-write efficiency across the newly-forked workers. This may temporarily reduce total capacity of the cluster during a phased restart / refork.
|
|
30
40
|
|
|
31
|
-
|
|
41
|
+
- In a cluster with `n` workers, a normal phased restart stops and restarts workers one by one while the application is loaded in each process, so `n-1` workers are available serving requests during the restart. In a phased restart in fork-worker mode, the application is first loaded in worker 0 while `n-1` workers are available, then worker 0 remains stopped while the rest of the workers are reloaded one by one, leaving only `n-2` workers to be available for a brief period of time. Reloading the rest of the workers should be quick because the application is preloaded at that point, but there may be situations where it can take longer (slow clients, long-running application code, slow worker-fork hooks, etc).
|
|
@@ -0,0 +1,54 @@
|
|
|
1
|
+
# Java Options
|
|
2
|
+
|
|
3
|
+
`System Properties` or `Environment Variables` can be used to change Puma's
|
|
4
|
+
default configuration for its Java extension. The provided values are evaluated
|
|
5
|
+
during initialization, and changes while running the app have no effect.
|
|
6
|
+
Moreover, default values may be used in case of invalid inputs.
|
|
7
|
+
|
|
8
|
+
## Supported Options
|
|
9
|
+
|
|
10
|
+
| ENV Name                     | Default Value | Validation              |
|------------------------------|:-------------:|:-----------------------:|
| PUMA_QUERY_STRING_MAX_LENGTH | 1024 * 10     | Positive natural number |
| PUMA_REQUEST_PATH_MAX_LENGTH | 8192          | Positive natural number |
| PUMA_REQUEST_URI_MAX_LENGTH  | 1024 * 12     | Positive natural number |
| PUMA_SKIP_SIGUSR2            | nil           | n/a                     |
|
|
16
|
+
|
|
17
|
+
## Examples
|
|
18
|
+
|
|
19
|
+
### Invalid inputs
|
|
20
|
+
|
|
21
|
+
An empty string will be handled as missing, and the default value will be used instead.
|
|
22
|
+
Puma will print an error message for other invalid values.
|
|
23
|
+
|
|
24
|
+
```
|
|
25
|
+
foo@bar:~/puma$ PUMA_QUERY_STRING_MAX_LENGTH=abc PUMA_REQUEST_PATH_MAX_LENGTH='' PUMA_REQUEST_URI_MAX_LENGTH=0 bundle exec bin/puma test/rackup/hello.ru
|
|
26
|
+
|
|
27
|
+
The value 0 for PUMA_REQUEST_URI_MAX_LENGTH is invalid. Using default value 12288 instead.
|
|
28
|
+
The value abc for PUMA_QUERY_STRING_MAX_LENGTH is invalid. Using default value 10240 instead.
|
|
29
|
+
Puma starting in single mode...
|
|
30
|
+
```
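
The fallback rules described above can be restated as a short Ruby sketch. Puma's actual check lives in its Java extension, so `resolve_limit` is a hypothetical helper written only to illustrate the documented behavior, and writing the message to `$stderr` is an assumption.

```ruby
# Resolves a length limit from a raw environment value.
# - nil or empty string: treated as missing, default used silently
# - anything that is not a positive integer: message printed, default used
def resolve_limit(name, raw, default, out: $stderr)
  return default if raw.nil? || raw.empty?

  value = Integer(raw, 10, exception: false)
  return value if value && value.positive?

  out.puts "The value #{raw} for #{name} is invalid. Using default value #{default} instead."
  default
end

p resolve_limit("PUMA_QUERY_STRING_MAX_LENGTH", "abc", 1024 * 10) # falls back to 10240
p resolve_limit("PUMA_REQUEST_PATH_MAX_LENGTH", "", 8192)         # falls back to 8192, no message
p resolve_limit("PUMA_REQUEST_URI_MAX_LENGTH", "0", 1024 * 12)    # falls back to 12288
p resolve_limit("PUMA_REQUEST_PATH_MAX_LENGTH", "9", 8192)        # accepted as 9
```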
|
|
31
|
+
|
|
32
|
+
### Valid inputs
|
|
33
|
+
|
|
34
|
+
```
|
|
35
|
+
foo@bar:~/puma$ PUMA_REQUEST_PATH_MAX_LENGTH=9 bundle exec bin/puma test/rackup/hello.ru
|
|
36
|
+
|
|
37
|
+
Puma starting in single mode...
|
|
38
|
+
```
|
|
39
|
+
```
|
|
40
|
+
foo@bar:~ export path=/123456789 # 10 chars
|
|
41
|
+
foo@bar:~ curl "http://localhost:9292${path}"
|
|
42
|
+
|
|
43
|
+
Puma caught this error: HTTP element REQUEST_PATH is longer than the 9 allowed length. (Puma::HttpParserError)
|
|
44
|
+
|
|
45
|
+
foo@bar:~ export path=/12345678 # 9 chars
|
|
46
|
+
foo@bar:~ curl "http://localhost:9292${path}"
|
|
47
|
+
Hello World
|
|
48
|
+
```
|
|
49
|
+
|
|
50
|
+
### Java Flight Recorder Compatibility
|
|
51
|
+
|
|
52
|
+
Unfortunately Java Flight Recorder uses `SIGUSR2` internally. If you wish to
|
|
53
|
+
use JFR, turn off Puma's trapping of `SIGUSR2` by setting the environment variable
|
|
54
|
+
`PUMA_SKIP_SIGUSR2` to any value.
|
data/docs/jungle/README.md
CHANGED
data/docs/kubernetes.md
CHANGED
|
@@ -2,16 +2,17 @@
|
|
|
2
2
|
|
|
3
3
|
## Running Puma in Kubernetes
|
|
4
4
|
|
|
5
|
-
In general running Puma in Kubernetes works as-is, no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some
|
|
5
|
+
In general, running Puma in Kubernetes works as-is: no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some requests might be sent to Puma after it has already entered graceful shutdown mode and is no longer accepting requests. This can lead to dropped requests during rolling deploys. A workaround for this is listed at the end of this article.
|
|
6
6
|
|
|
7
7
|
## Basic setup
|
|
8
8
|
|
|
9
9
|
Assuming you already have a running cluster and docker image repository, you can run a simple Puma app with the following example Dockerfile and Deployment specification. These are meant as examples only and are deliberately very minimal to the point of skipping many options that are recommended for running in production, like healthchecks and envvar configuration with ConfigMaps. In general you should check the [Kubernetes documentation](https://kubernetes.io/docs/home/) and [Docker documentation](https://docs.docker.com/) for a more comprehensive overview of the available options.
|
|
10
10
|
|
|
11
|
-
A basic Dockerfile example:
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
11
|
+
A basic Dockerfile example:
|
|
12
|
+
|
|
13
|
+
```Dockerfile
|
|
14
|
+
FROM ruby:3.4.5-alpine # can be updated to newer ruby versions
|
|
15
|
+
RUN apk update && apk add build-base # and any other packages you need
|
|
15
16
|
|
|
16
17
|
# Only rebuild gem bundle if Gemfile changes
|
|
17
18
|
COPY Gemfile Gemfile.lock ./
|
|
@@ -26,7 +27,8 @@ CMD bundle exec rackup -o 0.0.0.0
|
|
|
26
27
|
```
|
|
27
28
|
|
|
28
29
|
A sample `deployment.yaml`:
|
|
29
|
-
|
|
30
|
+
|
|
31
|
+
```yaml
|
|
30
32
|
---
|
|
31
33
|
apiVersion: apps/v1
|
|
32
34
|
kind: Deployment
|
|
@@ -47,7 +49,7 @@ spec:
|
|
|
47
49
|
image: <your image here>
|
|
48
50
|
ports:
|
|
49
51
|
- containerPort: 9292
|
|
50
|
-
```
|
|
52
|
+
```
|
|
51
53
|
|
|
52
54
|
## Graceful shutdown and pod termination
|
|
53
55
|
|
|
@@ -59,7 +61,7 @@ For some high-throughput systems, it is possible that some HTTP requests will re
|
|
|
59
61
|
4. The pod has up to `terminationGracePeriodSeconds` (default: 30 seconds) to gracefully shut down. Puma will do this (after it receives SIGTERM) by closing down the socket that accepts new requests and finishing any requests already running before exiting the Puma process.
|
|
60
62
|
5. If the pod is still running after `terminationGracePeriodSeconds` has elapsed, the pod receives `SIGKILL` to make sure the process inside it stops. After that, the container exits and all other Kubernetes objects associated with it are cleaned up.
|
|
61
63
|
|
|
62
|
-
There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives
|
|
64
|
+
There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives and handles this request "too" late (due to internal lag or network latency between the replication and Service controllers) then it is possible that the Service controller will send one or more requests to a Puma process which has already shut down its listening socket. These requests will then fail with 5XX error codes.
|
|
63
65
|
|
|
64
66
|
The reason Kubernetes works this way, rather than handling step 2 synchronously, is the CAP theorem: in a distributed system there is no way to guarantee that any message will arrive promptly. In particular, waiting for all Service controllers to report back might get stuck for an indefinite time if one of them has already been terminated or if there has been a net split. A way to work around this is to add a sleep to the pre-stop hook that lasts as long as `terminationGracePeriodSeconds`. This will allow the Puma process to keep serving new requests during the entire grace period, although it will no longer receive new requests after all Service controllers have propagated the removal of the pod from their endpoint lists. Then, after `terminationGracePeriodSeconds`, the pod receives `SIGKILL` and closes down. If your process can't handle `SIGKILL` properly, for example because it needs to release locks in different services, you can also sleep for a shorter period (and/or increase `terminationGracePeriodSeconds`) as long as the time slept is longer than the time that your Service controllers take to propagate the pod removal. The downside of this workaround is that all pods will take at minimum the amount of time slept to shut down, and this will increase the time required for your rolling deploy.
|
|
65
67
|
|
|
@@ -67,12 +69,5 @@ More discussions and links to relevant articles can be found in https://github.c
|
|
|
67
69
|
|
|
68
70
|
## Workers Per Pod, and Other Config Issues
|
|
69
71
|
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
* Worker counts should be somewhere between 4 and 32 in most cases. You want more than 4 in order to minimize time spent in request queueing for a free Puma worker, but probably less than ~32 because otherwise autoscaling is working in too large of an increment or they probably won't fit very well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
|
|
73
|
-
* Unless you have a very I/O-heavy application (50%+ time spent waiting on IO), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50%) will lead to additional request queueing time (latency!) and additional memory usage.
|
|
74
|
-
* More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
|
|
75
|
-
* Don't run less than 4 processes per pod if you can. Low numbers of processes per pod will lead to high request queueing, which means you will have to run more pods.
|
|
76
|
-
* If multithreaded, allocate 1 CPU per worker. If single threaded, allocate 0.75 cpus per worker. Most web applications spend about 25% of their time in I/O - but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core.
|
|
77
|
-
* Most Puma processes will use about ~512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8 GB (1 GB for the master, 7GB for the 4 children).
|
|
72
|
+
See our [deployment docs](./deployment.md) for more information about how to correctly size your pods and choose the right number of workers and threads.
|
|
78
73
|
|
data/docs/plugins.md
CHANGED
|
@@ -5,13 +5,13 @@ operations.
|
|
|
5
5
|
|
|
6
6
|
There are two canonical plugins to aid in the development of new plugins:
|
|
7
7
|
|
|
8
|
-
* [tmp\_restart](https://github.com/puma/puma/blob/
|
|
8
|
+
* [tmp\_restart](https://github.com/puma/puma/blob/main/lib/puma/plugin/tmp_restart.rb):
|
|
9
9
|
Restarts the server if the file `tmp/restart.txt` is touched
|
|
10
10
|
* [heroku](https://github.com/puma/puma-heroku/blob/master/lib/puma/plugin/heroku.rb):
|
|
11
11
|
Packages up the default configuration used by Puma on Heroku (being sunset
|
|
12
12
|
with the release of Puma 5.0)
|
|
13
13
|
|
|
14
|
-
Plugins are activated in a Puma configuration file (such as `config/puma.rb
|
|
14
|
+
Plugins are activated in a Puma configuration file (such as `config/puma.rb`)
|
|
15
15
|
by adding `plugin "name"`, such as `plugin "heroku"`.
|
|
16
16
|
|
|
17
17
|
Plugins are activated based on path requirements, so activating the `heroku`
|
|
@@ -36,3 +36,7 @@ object that is useful for additional configuration.
|
|
|
36
36
|
|
|
37
37
|
Public methods in [`Puma::Plugin`](../lib/puma/plugin.rb) are treated as a
|
|
38
38
|
public API for plugins.
|
|
39
|
+
|
|
40
|
+
## Binder hooks
|
|
41
|
+
|
|
42
|
+
The `Puma::Binder#before_parse` method lets you add a proc that runs before the body of `Puma::Binder#parse`. An example of its usage can be found in [the puma-acme repository](https://github.com/anchordotdev/puma-acme/blob/v0.1.3/lib/puma/acme/plugin.rb#L97-L118) (`before_parse_hook` could be renamed `before_parse`, making the monkey patching of [binder.rb](https://github.com/anchordotdev/puma-acme/blob/v0.1.3/lib/puma/acme/binder.rb) unnecessary).
|
data/docs/restart.md
CHANGED
|
@@ -29,7 +29,7 @@ Any of the following will cause a Puma server to perform a hot restart:
|
|
|
29
29
|
|
|
30
30
|
* The newly started Puma process changes its current working directory to the directory specified by the `directory` option. If `directory` is set to a symlink, this is automatically re-evaluated, so this mechanism can be used to upgrade the application.
|
|
31
31
|
* Only one version of the application is running at a time.
|
|
32
|
-
* `
|
|
32
|
+
* `before_restart` is invoked just before the server shuts down. This can be used to clean up resources (like long-lived database connections) gracefully. Since Ruby 2.0, it is not typically necessary to explicitly close file descriptors on restart. This is because any file descriptor opened by Ruby will have the `FD_CLOEXEC` flag set, meaning that file descriptors are closed on `exec`. `before_restart` is useful, though, if your application needs to perform any more graceful protocol-specific shutdown procedures before closing connections.
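
As a minimal sketch of the `before_restart` hook described above, the following fragment could go in a Puma configuration file such as `config/puma.rb`. It assumes an application that uses ActiveRecord, which is an illustrative choice and not something the hook requires; replace the body with whatever protocol-specific shutdown your application needs.

```ruby
# config/puma.rb (fragment, evaluated by Puma's configuration DSL)

before_restart do
  # Assumption for illustration: the app uses ActiveRecord. Disconnect the
  # pooled database connections cleanly before the server shuts down for
  # the hot restart, rather than leaving the database to notice dropped
  # sockets on its own.
  if defined?(ActiveRecord::Base)
    ActiveRecord::Base.connection_pool.disconnect!
  end
end
```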
|
|
33
33
|
|
|
34
34
|
## Phased restart
|
|
35
35
|
|
|
@@ -59,7 +59,7 @@ Any of the following will cause a Puma server to perform a phased restart:
|
|
|
59
59
|
|
|
60
60
|
* When a phased restart begins, the Puma master process changes its current working directory to the directory specified by the `directory` option. If `directory` is set to a symlink, this is automatically re-evaluated, so this mechanism can be used to upgrade the application.
|
|
61
61
|
* On a single server, it's possible that two versions of the application are running concurrently during a phased restart.
|
|
62
|
-
* `
|
|
62
|
+
* `before_restart` is not invoked.
|
|
63
63
|
* Phased restarts can be slow for Puma clusters with many workers. Hot restarts often complete more quickly, but at the cost of increased latency during the restart.
|
|
64
64
|
* Phased restarts cannot be used to upgrade any gems loaded by the Puma master process, including `puma` itself, anything in `extra_runtime_dependencies`, or dependencies thereof. Upgrading other gems is safe.
|
|
65
65
|
* If you remove the gems from old releases as part of your deployment strategy, there are additional considerations. Do not put any gems into `extra_runtime_dependencies` that have native extensions or have dependencies that have native extensions (one common example is `puma_worker_killer` and its dependency on `ffi`). Workers will fail on boot during a phased restart. The underlying issue is recorded in [an issue on the rubygems project](https://github.com/rubygems/rubygems/issues/4004). Hot restarts are your only option here if you need these dependencies.
|