puma 6.4.1 → 7.2.1

Files changed (57)
  1. checksums.yaml +4 -4
  2. data/History.md +407 -8
  3. data/README.md +109 -49
  4. data/docs/deployment.md +58 -23
  5. data/docs/fork_worker.md +11 -1
  6. data/docs/java_options.md +54 -0
  7. data/docs/jungle/README.md +1 -1
  8. data/docs/kubernetes.md +11 -16
  9. data/docs/plugins.md +6 -2
  10. data/docs/restart.md +2 -2
  11. data/docs/signals.md +21 -21
  12. data/docs/stats.md +11 -5
  13. data/docs/systemd.md +14 -5
  14. data/ext/puma_http11/extconf.rb +20 -32
  15. data/ext/puma_http11/mini_ssl.c +29 -9
  16. data/ext/puma_http11/org/jruby/puma/Http11.java +40 -9
  17. data/ext/puma_http11/puma_http11.c +125 -118
  18. data/lib/puma/app/status.rb +11 -3
  19. data/lib/puma/binder.rb +21 -11
  20. data/lib/puma/cli.rb +10 -8
  21. data/lib/puma/client.rb +183 -83
  22. data/lib/puma/cluster/worker.rb +24 -21
  23. data/lib/puma/cluster/worker_handle.rb +38 -8
  24. data/lib/puma/cluster.rb +73 -47
  25. data/lib/puma/cluster_accept_loop_delay.rb +91 -0
  26. data/lib/puma/commonlogger.rb +3 -3
  27. data/lib/puma/configuration.rb +131 -60
  28. data/lib/puma/const.rb +31 -12
  29. data/lib/puma/control_cli.rb +10 -6
  30. data/lib/puma/detect.rb +2 -0
  31. data/lib/puma/dsl.rb +411 -121
  32. data/lib/puma/error_logger.rb +7 -5
  33. data/lib/puma/events.rb +25 -10
  34. data/lib/puma/io_buffer.rb +8 -4
  35. data/lib/puma/jruby_restart.rb +0 -16
  36. data/lib/puma/launcher/bundle_pruner.rb +1 -1
  37. data/lib/puma/launcher.rb +73 -55
  38. data/lib/puma/log_writer.rb +9 -9
  39. data/lib/puma/minissl/context_builder.rb +1 -0
  40. data/lib/puma/minissl.rb +1 -1
  41. data/lib/puma/null_io.rb +26 -0
  42. data/lib/puma/plugin/systemd.rb +3 -3
  43. data/lib/puma/rack/urlmap.rb +1 -1
  44. data/lib/puma/reactor.rb +19 -13
  45. data/lib/puma/request.rb +71 -39
  46. data/lib/puma/runner.rb +15 -17
  47. data/lib/puma/sd_notify.rb +1 -4
  48. data/lib/puma/server.rb +134 -73
  49. data/lib/puma/single.rb +7 -4
  50. data/lib/puma/state_file.rb +3 -2
  51. data/lib/puma/thread_pool.rb +57 -80
  52. data/lib/puma/util.rb +0 -7
  53. data/lib/puma.rb +10 -0
  54. data/lib/rack/handler/puma.rb +10 -7
  55. data/tools/Dockerfile +15 -5
  56. metadata +14 -15
  57. data/ext/puma_http11/ext_help.h +0 -15
data/README.md CHANGED
@@ -4,15 +4,14 @@
4
4
 
5
5
  # Puma: A Ruby Web Server Built For Parallelism
6
6
 
7
- [![Actions](https://github.com/puma/puma/workflows/Tests/badge.svg?branch=master)](https://github.com/puma/puma/actions?query=workflow%3ATests)
8
- [![Code Climate](https://codeclimate.com/github/puma/puma.svg)](https://codeclimate.com/github/puma/puma)
7
+ [![Actions](https://github.com/puma/puma/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/puma/puma/actions/workflows/tests.yml?query=branch%3Amain)
9
8
  [![StackOverflow](https://img.shields.io/badge/stackoverflow-Puma-blue.svg)]( https://stackoverflow.com/questions/tagged/puma )
10
9
 
11
10
  Puma is a **simple, fast, multi-threaded, and highly parallel HTTP 1.1 server for Ruby/Rack applications**.
12
11
 
13
12
  ## Built For Speed & Parallelism
14
13
 
15
- Puma is a server for [Rack](https://github.com/rack/rack)-powered HTTP applications written in Ruby. It is:
14
+ Puma is a server for [Rack](https://github.com/rack/rack)-powered HTTP applications written in Ruby. It is:
16
15
  * **Multi-threaded**. Each request is served in a separate thread. This helps you serve more requests per second with less memory use.
17
16
  * **Multi-process**. "Pre-forks" in cluster mode, using less memory per-process thanks to copy-on-write memory.
18
17
  * **Standalone**. With SSL support, zero-downtime rolling restarts and a built-in request bufferer, you can deploy Puma without any reverse proxy.
@@ -82,10 +81,10 @@ $ bundle exec puma
82
81
 
83
82
  ## Configuration
84
83
 
85
- Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/master/lib/puma/dsl.rb).
84
+ Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb).
86
85
 
87
86
  You can also find several configuration examples as part of the
88
- [test](https://github.com/puma/puma/tree/master/test/config) suite.
87
+ [test](https://github.com/puma/puma/tree/main/test/config) suite.
89
88
 
90
89
  For debugging purposes, you can set the environment variable `PUMA_LOG_CONFIG` with a value
91
90
  and the loaded configuration will be printed as part of the boot process.
@@ -102,9 +101,9 @@ Puma will automatically scale the number of threads, from the minimum until it c
102
101
 
103
102
  Be aware that additionally Puma creates threads on its own for internal purposes (e.g. handling slow clients). So, even if you specify -t 1:1, expect around 7 threads created in your application.
104
103
 
105
- ### Clustered mode
104
+ ### Cluster mode
106
105
 
107
- Puma also offers "clustered mode". Clustered mode `fork`s workers from a master process. Each child process still has its own thread pool. You can tune the number of workers with the `-w` (or `--workers`) flag:
106
+ Puma also offers "cluster mode". Cluster mode `fork`s workers from a master process. Each child process still has its own thread pool. You can tune the number of workers with the `-w` (or `--workers`) flag:
108
107
 
109
108
  ```
110
109
  $ puma -t 8:32 -w 3
@@ -116,13 +115,24 @@ Or with the `WEB_CONCURRENCY` environment variable:
116
115
  $ WEB_CONCURRENCY=3 puma -t 8:32
117
116
  ```
118
117
 
119
- Note that threads are still used in clustered mode, and the `-t` thread flag setting is per worker, so `-w 2 -t 16:16` will spawn 32 threads in total, with 16 in each worker process.
118
+ When using a config file, most applications can simply set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of worker processes to the available processors:
120
119
 
121
- For an in-depth discussion of the tradeoffs of thread and process count settings, [see our docs](https://github.com/puma/puma/blob/9282a8efa5a0c48e39c60d22ca70051a25df9f55/docs/kubernetes.md#workers-per-pod-and-other-config-issues).
120
+ ```ruby
121
+ # config/puma.rb
122
+ workers :auto
123
+ ```
124
+
125
+ See [`workers :auto` gotchas](lib/puma/dsl.rb).
126
+
127
+ Note that threads are still used in cluster mode, and the `-t` thread flag setting is per worker, so `-w 2 -t 16:16` will spawn 32 threads in total, with 16 in each worker process.
128
+
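The per-worker thread arithmetic in the note above can be checked with a short sketch. `total_request_threads` is a hypothetical helper, not part of Puma, and it counts only request threads, not the extra threads Puma creates for internal purposes.

```ruby
# Hypothetical helper (not a Puma API): upper bound on request threads
# across the whole cluster, given `-w workers` and the max of `-t min:max`.
def total_request_threads(workers:, max_threads:)
  # With 0 workers Puma runs in single mode: one process, one thread pool.
  processes = workers.zero? ? 1 : workers
  processes * max_threads
end

puts total_request_threads(workers: 2, max_threads: 16) # -w 2 -t 16:16 => 32
puts total_request_threads(workers: 3, max_threads: 32) # -w 3 -t 8:32  => 96
```

Since the thread setting is per worker, raising `-w` multiplies the upper bound rather than adding to it.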
129
+ If `workers` is set to `:auto`, or the `WEB_CONCURRENCY` environment variable is set to `"auto"`, and the `concurrent-ruby` gem is available in your application, Puma will set the worker process count to the result of [available processors](https://msp-greg.github.io/concurrent-ruby/Concurrent.html#available_processor_count-class_method).
122
130
 
123
- In clustered mode, Puma can "preload" your application. This loads all the application code *prior* to forking. Preloading reduces total memory usage of your application via an operating system feature called [copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write).
131
+ For an in-depth discussion of the tradeoffs of thread and process count settings, [see our docs](docs/deployment.md).
124
132
 
125
- If the `WEB_CONCURRENCY` environment variable is set to a value > 1 (and `--prune-bundler` has not been specified), preloading will be enabled by default. Otherwise, you can use the `--preload` flag from the command line:
133
+ In cluster mode, Puma can "preload" your application. This loads all the application code *prior* to forking. Preloading reduces total memory usage of your application via an operating system feature called [copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write).
134
+
135
+ If the number of workers is greater than 1 (and `--prune-bundler` has not been specified), preloading will be enabled by default. Otherwise, you can use the `--preload` flag from the command line:
126
136
 
127
137
  ```
128
138
  $ puma -w 3 --preload
@@ -138,51 +148,106 @@ preload_app!
138
148
 
139
149
  Preloading can’t be used with phased restart, since phased restart kills and restarts workers one-by-one, and preloading copies the code of master into the workers.
140
150
 
141
- When using clustered mode, you can specify a block in your configuration file that will be run on boot of each worker:
151
+ #### Cluster mode hooks
152
+
153
+ When using cluster mode, Puma's configuration DSL provides `before_fork`, `before_worker_boot`, and `after_worker_shutdown`
154
+ hooks to run code when the master process forks, the child workers are booted, and after each child worker exits respectively.
155
+
156
+ It is recommended to use these hooks with `preload_app!`, otherwise constants loaded by your
157
+ application (such as `Rails`) will not be available inside the hooks.
142
158
 
143
159
  ```ruby
144
160
  # config/puma.rb
145
- on_worker_boot do
146
- # configuration here
161
+ before_fork do
162
+ # Add code to run inside the Puma master process before it forks a worker child.
147
163
  end
148
- ```
149
164
 
150
- This code can be used to setup the process before booting the application, allowing
151
- you to do some Puma-specific things that you don't want to embed in your application.
152
- For instance, you could fire a log notification that a worker booted or send something to statsd. This can be called multiple times.
165
+ before_worker_boot do
166
+ # Add code to run inside the Puma worker process after forking.
167
+ end
153
168
 
154
- Constants loaded by your application (such as `Rails`) will not be available in `on_worker_boot`
155
- unless preloading is enabled.
169
+ after_worker_shutdown do |worker_handle|
170
+ # Add code to run inside the Puma master process after a worker exits. `worker_handle.process_status` can be used to get the
171
+ # `Process::Status` of the exited worker.
172
+ end
173
+ ```
156
174
 
157
- You can also specify a block to be run before workers are forked, using `before_fork`:
175
+ In addition, there are `before_refork` and `after_refork` hooks, which are used only in [`fork_worker` mode](docs/fork_worker.md),
176
+ when the worker 0 child process forks a grandchild worker:
158
177
 
159
178
  ```ruby
160
- # config/puma.rb
161
- before_fork do
162
- # configuration here
179
+ before_refork do
180
+ # Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
181
+ # child process before it forks a grandchild worker.
163
182
  end
164
183
  ```
165
184
 
166
- You can also specify a block to be run after puma is booted using `on_booted`:
185
+ ```ruby
186
+ after_refork do
187
+ # Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
188
+ # child process after it forks a grandchild worker.
189
+ end
190
+ ```
191
+
192
+ Importantly, note the following considerations when Ruby forks a child process:
193
+
194
+ 1. File descriptors such as network sockets **are** copied from the parent to the forked
195
+ child process. Dual-use of the same sockets by parent and child will result in I/O conflicts
196
+ such as `SocketError`, `Errno::EPIPE`, and `EOFError`.
197
+ 2. Background Ruby threads, including threads used by various third-party gems for connection
198
+ monitoring, etc., are **not** copied to the child process. Often this does not cause
199
+ immediate problems until a third-party connection goes down, at which point there will
200
+ be no supervisor to reconnect it.
201
+
202
+ Therefore, we recommend the following:
203
+
204
+ 1. If possible, do not establish any socket connections (HTTP, database connections, etc.)
205
+ inside Puma's master process when booting.
206
+ 2. If (1) is not possible, use `before_fork` and `before_refork` to disconnect the parent's socket
207
+ connections when forking, so that they are not accidentally copied to the child process.
208
+ 3. Use `before_worker_boot` to restart any background threads on the forked child.
209
+ 4. Use `after_refork` to restart any background threads on the parent.
210
+
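The two fork considerations above can be observed outside Puma with plain Ruby. The following is a minimal sketch, assuming a Unix platform with `fork(2)`. The method name `fork_inheritance_demo` and the sleeping `monitor` thread are illustrative stand-ins, not anything Puma or a third-party gem defines.

```ruby
# Sketch: what a forked child inherits from its parent (Unix-only).
def fork_inheritance_demo
  # Stands in for a gem's background monitoring thread.
  monitor = Thread.new { sleep }
  reader, writer = IO.pipe # file descriptors opened before the fork

  pid = fork do
    # The child can write to `writer` because the descriptor was copied,
    # but `monitor` is not running here: only the forking thread survives.
    reader.close
    writer.write(monitor.alive? ? "alive" : "dead")
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  child_report = reader.read # returns once the child closes its copy
  reader.close
  Process.wait(pid)

  { parent_thread_alive: monitor.alive?, child_thread_state: child_report }
ensure
  monitor&.kill
end

p fork_inheritance_demo
```

On MRI this should report the thread as alive in the parent and dead in the child, while the child still reaches the parent through the inherited pipe. That combination is why the recommendations pair disconnecting sockets before a fork with restarting background threads after it.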
211
+ #### Master process lifecycle hooks
212
+
213
+ Puma's configuration DSL provides master process lifecycle hooks `after_booted`, `before_restart`, and `after_stopped`
214
+ which may be used to specify code blocks to run on each event:
167
215
 
168
216
  ```ruby
169
217
  # config/puma.rb
170
- on_booted do
171
- # configuration here
218
+ after_booted do
219
+ # Add code to run in the Puma master process after it boots,
220
+ # and also after a phased restart completes.
221
+ end
222
+
223
+ before_restart do
224
+ # Add code to run in the Puma master process when it receives
225
+ # a restart command but before it restarts.
226
+ end
227
+
228
+ after_stopped do
229
+ # Add code to run in the Puma master process when it receives
230
+ # a stop command but before it shuts down.
172
231
  end
173
232
  ```
174
233
 
175
234
  ### Error handling
176
235
 
177
- If puma encounters an error outside of the context of your application, it will respond with a 500 and a simple
178
- textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/master/lib/puma/server.rb)).
236
+ If Puma encounters an error outside of the context of your application, it will respond with a 400/500 and a simple
237
+ textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/main/lib/puma/server.rb)).
179
238
  You can specify custom behavior for this scenario. For example, you can report the error to your third-party
180
239
  error-tracking service (in this example, [rollbar](https://rollbar.com)):
181
240
 
182
241
  ```ruby
183
- lowlevel_error_handler do |e|
184
- Rollbar.critical(e)
185
- [500, {}, ["An error has occurred, and engineers have been informed. Please reload the page. If you continue to have problems, contact support@example.com\n"]]
242
+ lowlevel_error_handler do |e, env, status|
243
+ if status == 400
244
+ message = "The server could not process the request due to an error, such as an incorrectly typed URL, malformed syntax, or a URL that contains illegal characters.\n"
245
+ else
246
+ message = "An error has occurred, and engineers have been informed. Please reload the page. If you continue to have problems, contact support@example.com\n"
247
+ Rollbar.critical(e)
248
+ end
249
+
250
+ [status, {}, [message]]
186
251
  end
187
252
  ```
188
253
 
@@ -249,7 +314,7 @@ $ puma -b ssl://localhost:9292 -b tcp://localhost:9393 -C config/use_local_host.
249
314
 
250
315
  #### Controlling SSL Cipher Suites
251
316
 
252
- To use or avoid specific SSL cipher suites, use `ssl_cipher_filter` or `ssl_cipher_list` options.
317
+ To use or avoid specific SSL ciphers for TLSv1.2 and below, use `ssl_cipher_filter` or `ssl_cipher_list` options.
253
318
 
254
319
  ##### Ruby:
255
320
 
@@ -263,6 +328,14 @@ $ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&ssl_cipher_fil
263
328
  $ puma -b 'ssl://127.0.0.1:9292?keystore=path_to_keystore&keystore-pass=keystore_password&ssl_cipher_list=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA'
264
329
  ```
265
330
 
331
+ To configure the available TLSv1.3 ciphersuites, use the `ssl_ciphersuites` option (not available for JRuby).
332
+
333
+ ##### Ruby:
334
+
335
+ ```
336
+ $ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&ssl_ciphersuites=TLS_AES_256_GCM_SHA384:TLS_AES_128_GCM_SHA256'
337
+ ```
338
+
266
339
  See https://www.openssl.org/docs/man1.1.1/man1/ciphers.html for cipher filter format and full list of cipher suites.
267
340
 
268
341
  Disable TLS v1 with the `no_tlsv1` option:
@@ -279,7 +352,7 @@ To enable verification flags offered by OpenSSL, use `verification_flags` (not a
279
352
  $ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&verification_flags=PARTIAL_CHAIN'
280
353
  ```
281
354
 
282
- You can also set multiple verification flags (by separating them with coma):
355
+ You can also set multiple verification flags (by separating them with a comma):
283
356
 
284
357
  ```
285
358
  $ puma -b 'ssl://127.0.0.1:9292?key=path_to_key&cert=path_to_cert&verification_flags=PARTIAL_CHAIN,CRL_CHECK'
@@ -320,7 +393,7 @@ Puma has a built-in status and control app that can be used to query and control
320
393
  $ puma --control-url tcp://127.0.0.1:9293 --control-token foo
321
394
  ```
322
395
 
323
- Puma will start the control server on localhost port 9293. All requests to the control server will need to include control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/master/lib/puma/app/status.rb) to see what the status app has available.
396
+ Puma will start the control server on localhost port 9293. All requests to the control server will need to include control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/main/lib/puma/app/status.rb) to see what the status app has available.
324
397
 
325
398
  You can also interact with the control server via `pumactl`. This command will restart Puma:
326
399
 
@@ -352,7 +425,7 @@ $ puma -C "-"
352
425
 
353
426
  The other side-effects of setting the environment are whether to show stack traces (in `development` or `test`), and setting RACK_ENV may potentially affect middleware looking for this value to change their behavior. The default puma RACK_ENV value is `development`. You can see all config default values in `Puma::Configuration#puma_default_options` or [configuration.rb](https://github.com/puma/puma/blob/61c6213fbab/lib/puma/configuration.rb#L182-L204).
354
427
 
355
- Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/master/lib/puma/dsl.rb) to see all available options.
428
+ Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb) to see all available options.
356
429
 
357
430
  ## Restart
358
431
 
@@ -372,19 +445,6 @@ Some platforms do not support all Puma features.
372
445
  * **Windows**: Cluster mode is not supported due to a lack of fork(2).
373
446
  * **Kubernetes**: The way Kubernetes handles pod shutdowns interacts poorly with server processes implementing graceful shutdown, like Puma. See the [kubernetes section of the documentation](docs/kubernetes.md) for more details.
374
447
 
375
- ## Known Bugs
376
-
377
- For MRI versions 2.2.7, 2.2.8, 2.2.9, 2.2.10, 2.3.4 and 2.4.1, you may see ```stream closed in another thread (IOError)```. It may be caused by a [Ruby bug](https://bugs.ruby-lang.org/issues/13632). It can be fixed with the gem https://rubygems.org/gems/stopgap_13632:
378
-
379
- ```ruby
380
- if %w(2.2.7 2.2.8 2.2.9 2.2.10 2.3.4 2.4.1).include? RUBY_VERSION
381
- begin
382
- require 'stopgap_13632'
383
- rescue LoadError
384
- end
385
- end
386
- ```
387
-
388
448
  ## Deployment
389
449
 
390
450
  * Puma has support for Capistrano with an [external gem](https://github.com/seuros/capistrano-puma).
data/docs/deployment.md CHANGED
@@ -16,32 +16,34 @@ assume this is how you're using Puma.
16
16
  Initially, Puma was conceived as a thread-only web server, but support for
17
17
  processes was added in version 2.
18
18
 
19
+ In general, use single mode only if:
20
+
21
+ * You are using JRuby, TruffleRuby or another fully-multithreaded implementation of Ruby
22
+ * You are using MRI but in an environment where only 1 CPU core is available.
23
+
24
+ Otherwise, you'll want to use cluster mode to utilize all available CPU resources.
25
+
19
26
  To run `puma` in single mode (i.e., as a development environment), set the
20
27
  number of workers to 0; anything higher will run in cluster mode.
21
28
 
22
- Here are some tips for cluster mode:
29
+ ## Cluster Mode Tips
23
30
 
24
- ### MRI
31
+ For the purposes of Puma provisioning, "CPU cores" means:
25
32
 
26
- * Use cluster mode and set the number of workers to 1.5x the number of CPU cores
27
- in the machine, starting from a minimum of 2.
28
- * Set the number of threads to desired concurrent requests/number of workers.
29
- Puma defaults to 5, and that's a decent number.
33
+ 1. On ARM, the number of physical cores.
34
+ 2. On x86, the number of logical cores, hyperthreads, or vCPUs (these words all mean the same thing).
30
35
 
31
- #### Migrating from Unicorn
36
+ Set your config with the following process:
32
37
 
33
- * If you're migrating from unicorn though, here are some settings to start with:
34
- * Set workers to half the number of unicorn workers you're using
35
- * Set threads to 2
36
- * Enjoy 50% memory savings
37
- * As you grow more confident in the thread-safety of your app, you can tune the
38
- workers down and the threads up.
38
+ * Use cluster mode and set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of CPU cores on the machine (minimum 2, otherwise use single mode!). If you can't add the gem, set the worker count manually to the available CPU cores.
39
+ * Set the number of threads to desired concurrent requests/number of workers.
40
+ Puma defaults to 5, and that's a decent number.
39
41
 
40
- #### Ubuntu / Systemd (Systemctl) Installation
42
+ For most deployments, adding `concurrent-ruby` and using `workers :auto` is the right starting point.
41
43
 
42
- See [systemd.md](systemd.md)
44
+ See [`workers :auto` gotchas](../lib/puma/dsl.rb).
43
45
 
44
- #### Worker utilization
46
+ ## Worker utilization
45
47
 
46
48
  **How do you know if you've got enough (or too many workers)?**
47
49
 
@@ -50,14 +52,34 @@ a time. But since so many apps are waiting on IO from DBs, etc., they can
50
52
  utilize threads to use the process more efficiently.
51
53
 
52
54
  Generally, you never want processes that are pegged all the time. That can mean
53
- there is more work to do than the process can get through. On the other hand, if
54
- you have processes that sit around doing nothing, then they're just eating up
55
- resources.
55
+ there is more work to do than the process can get through, and requests will end up with additional latency. On the other hand, if
56
+ you have processes that sit around doing nothing, then you're wasting resources and money.
57
+
58
+ In general, you are making a tradeoff between:
59
+
60
+ 1. CPU and memory utilization.
61
+ 2. Time spent queueing for a Puma worker to `accept` requests and additional latency caused by CPU contention.
62
+
63
+ If latency is important to you, you will have to accept lower utilization, and vice versa.
56
64
 
57
- Watch your CPU utilization over time and aim for about 70% on average. 70%
58
- utilization means you've got capacity still but aren't starving threads.
65
+ ## Container/VPS sizing
59
66
 
60
- **Measuring utilization**
67
+ You will have to make a decision about how "big" to make each pod/VPS/server/dyno.
68
+
69
+ **TL;DR**: 80% of Puma apps will end up deploying "pods" of 4 workers, 5 threads each, 4 vCPU and 8GB of RAM.
70
+
71
+ For the rest of this discussion, we'll adopt the Kubernetes term of "pods".
72
+
73
+ Should you run 2 pods with 50 workers each? 25 pods, each with 4 workers? 100 pods, with each Puma running in single mode? Each scenario represents the same total amount of capacity (100 Puma processes that can respond to requests), but there are tradeoffs to make:
74
+
75
+ * **Increasing worker counts decreases latency, but means you scale in bigger "chunks"**. Worker counts should be somewhere between 4 and 32 in most cases. You want more than 4 in order to minimize time spent in request queueing for a free Puma worker, but probably less than ~32 because otherwise autoscaling is working in too large of an increment or they probably won't fit very well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
76
+ * **Increasing thread counts will increase throughput, but also latency and memory use**. Unless you have a very I/O-heavy application (50%+ time spent waiting on IO), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50% of wall clock time) will lead to additional request latency and additional memory usage.
77
+ * **Increasing worker counts decreases memory per worker on average**. More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
78
+ * **Low worker counts (<4) have exceptionally poor throughput**. Don't run fewer than 4 processes per pod if you can. Low numbers of processes per pod will lead to high request queueing (see discussion above), which means you will have to run more pods and resources.
79
+ * **CPU-core-to-worker ratios should be around 1**. If running Puma with `threads > 1`, allocate 1 CPU core (see definition above!) per worker. If single threaded, allocate ~0.75 cpus per worker. Most web applications spend about 25% of their time in I/O - but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core. Using `workers :auto` will size workers to this guidance on most platforms.
80
+ * **Don't set memory limits unless necessary**. Most Puma processes will use about ~512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8 GB (1 GB for the master, 7GB for the 4 children).
81
+
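The pod layouts discussed above can be compared with a little arithmetic. This is a rough sketch that takes the text's 1/n rule of thumb at face value. Both helpers are hypothetical and not part of Puma, and single mode is counted as one process per pod.

```ruby
# Hypothetical helpers (not Puma APIs) for comparing pod layouts.

# Total Puma processes able to respond to requests.
def total_capacity(pods:, workers_per_pod:)
  pods * workers_per_pod
end

# Queue time relative to a single-worker pod, using the 1/n rule of thumb
# from the text (n = workers pulling from one pod's socket backlog).
def relative_queue_time(workers_per_pod:)
  1.0 / workers_per_pod
end

layouts = [
  { pods: 2,   workers_per_pod: 50 },
  { pods: 25,  workers_per_pod: 4 },
  { pods: 100, workers_per_pod: 1 } # single mode: one process per pod
]

layouts.each do |layout|
  capacity = total_capacity(**layout)
  wait = relative_queue_time(workers_per_pod: layout[:workers_per_pod])
  puts format("%3d pods x %2d workers: capacity %d, relative queue time %.2f",
              layout[:pods], layout[:workers_per_pod], capacity, wait)
end
```

All three layouts have the same capacity of 100 processes, but under this simplified model the single-worker layout queues about 4 times longer than the 4-worker layout.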
82
+ **Measuring utilization and queue time**
61
83
 
62
84
  Using a timestamp header from an upstream proxy server (e.g., `nginx` or
63
85
  `haproxy`) makes it possible to indicate how long requests have been waiting for
@@ -75,7 +97,7 @@ a Puma thread to become available.
75
97
  * `env['puma.request_body_wait']` contains the number of milliseconds Puma
76
98
  spent waiting for the client to send the request body.
77
99
  * haproxy: `%Th` (TLS handshake time) and `%Ti` (idle time before request)
78
- can can also be added as headers.
100
+ can also be added as headers.
79
101
 
80
102
  ## Should I daemonize?
81
103
 
@@ -100,3 +122,16 @@ or hell, even `monit`.
100
122
  You probably will want to deploy some new code at some point, and you'd like
101
123
  Puma to start running that new code. There are a few options for restarting
102
124
  Puma, described separately in our [restart documentation](restart.md).
125
+
126
+ ## Migrating from Unicorn
127
+
128
+ * If you're migrating from unicorn though, here are some settings to start with:
129
+ * Set workers to half the number of unicorn workers you're using
130
+ * Set threads to 2
131
+ * Enjoy 50% memory savings
132
+ * As you grow more confident in the thread-safety of your app, you can tune the
133
+ workers down and the threads up.
134
+
135
+ ## Ubuntu / Systemd (Systemctl) Installation
136
+
137
+ See [systemd.md](systemd.md)
data/docs/fork_worker.md CHANGED
@@ -22,10 +22,20 @@ The `fork_worker` option allows your application to be initialized only once for
22
22
 
23
23
  You can trigger a refork by sending the cluster the `SIGURG` signal or running the `pumactl refork` command at any time. A refork will also automatically trigger once, after a certain number of requests have been processed by worker 0 (default 1000). To configure the number of requests before the auto-refork, pass a positive integer argument to `fork_worker` (e.g., `fork_worker 1000`), or `0` to disable.
24
24
 
25
+ ### Usage Considerations
26
+
27
+ - `fork_worker` introduces new `before_refork` and `after_refork` configuration hooks. Note the following:
28
+ - When initially forking the parent process to the worker 0 child, `before_fork` will trigger on the parent process and `before_worker_boot` will trigger on the worker 0 child as normal.
29
+ - When forking the worker 0 child to grandchild workers, `before_refork` and `after_refork` will trigger on the worker 0 child, and `before_worker_boot` will trigger on each grandchild worker.
30
+ - For clarity, `before_fork` does not trigger on worker 0, and `after_refork` does not trigger on the grandchild.
31
+ - As a general migration guide:
32
+ - Copy any logic within your existing `before_fork` hook to the `before_refork` hook.
33
+ - Consider copying logic from your `before_worker_boot` hook to the `after_refork` hook, if it is needed to reset the state of worker 0 after it forks.
34
+
25
35
  ### Limitations
26
36
 
27
37
  - This mode is still very experimental so there may be bugs or edge-cases, particularly around expected behavior of existing hooks. Please open a [bug report](https://github.com/puma/puma/issues/new?template=bug_report.md) if you encounter any issues.
28
38
 
29
39
  - In order to fork new workers cleanly, worker 0 shuts down its server and stops serving requests so there are no open file descriptors or other kinds of shared global state between processes, and to maximize copy-on-write efficiency across the newly-forked workers. This may temporarily reduce total capacity of the cluster during a phased restart / refork.
30
40
 
31
- In a cluster with `n` workers, a normal phased restart stops and restarts workers one by one while the application is loaded in each process, so `n-1` workers are available serving requests during the restart. In a phased restart in fork-worker mode, the application is first loaded in worker 0 while `n-1` workers are available, then worker 0 remains stopped while the rest of the workers are reloaded one by one, leaving only `n-2` workers to be available for a brief period of time. Reloading the rest of the workers should be quick because the application is preloaded at that point, but there may be situations where it can take longer (slow clients, long-running application code, slow worker-fork hooks, etc).
41
+ - In a cluster with `n` workers, a normal phased restart stops and restarts workers one by one while the application is loaded in each process, so `n-1` workers are available serving requests during the restart. In a phased restart in fork-worker mode, the application is first loaded in worker 0 while `n-1` workers are available, then worker 0 remains stopped while the rest of the workers are reloaded one by one, leaving only `n-2` workers to be available for a brief period of time. Reloading the rest of the workers should be quick because the application is preloaded at that point, but there may be situations where it can take longer (slow clients, long-running application code, slow worker-fork hooks, etc).
data/docs/java_options.md ADDED
@@ -0,0 +1,54 @@
1
+ # Java Options
2
+
3
+ `System Properties` or `Environment Variables` can be used to change Puma's
4
+ default configuration for its Java extension. The provided values are evaluated
5
+ during initialization, and changes while running the app have no effect.
6
+ Moreover, default values may be used in case of invalid inputs.
7
+
8
+ ## Supported Options
9
+
10
+ | ENV Name | Default Value | Validation |
11
+ |------------------------------|:-------------:|:------------------------:|
12
+ | PUMA_QUERY_STRING_MAX_LENGTH | 1024 * 10 | Positive natural number |
13
+ | PUMA_REQUEST_PATH_MAX_LENGTH | 8192 | Positive natural number |
14
+ | PUMA_REQUEST_URI_MAX_LENGTH | 1024 * 12 | Positive natural number |
15
+ | PUMA_SKIP_SIGUSR2 | nil | n/a |
16
+
17
+ ## Examples
18
+
19
+ ### Invalid inputs
20
+
21
+ An empty string will be handled as missing, and the default value will be used instead.
22
+ Puma will print an error message for other invalid values.
23
+
24
+ ```
25
+ foo@bar:~/puma$ PUMA_QUERY_STRING_MAX_LENGTH=abc PUMA_REQUEST_PATH_MAX_LENGTH='' PUMA_REQUEST_URI_MAX_LENGTH=0 bundle exec bin/puma test/rackup/hello.ru
26
+
27
+ The value 0 for PUMA_REQUEST_URI_MAX_LENGTH is invalid. Using default value 12288 instead.
28
+ The value abc for PUMA_QUERY_STRING_MAX_LENGTH is invalid. Using default value 10240 instead.
29
+ Puma starting in single mode...
30
+ ```
31
+
32
+ ### Valid inputs
33
+
34
+ ```
35
+ foo@bar:~/puma$ PUMA_REQUEST_PATH_MAX_LENGTH=9 bundle exec bin/puma test/rackup/hello.ru
36
+
37
+ Puma starting in single mode...
38
+ ```
39
+ ```
40
+ foo@bar:~$ export path=/123456789 # 10 chars
41
+ foo@bar:~$ curl "http://localhost:9292${path}"
42
+
43
+ Puma caught this error: HTTP element REQUEST_PATH is longer than the 9 allowed length. (Puma::HttpParserError)
44
+
45
+ foo@bar:~$ export path=/12345678 # 9 chars
46
+ foo@bar:~$ curl "http://localhost:9292${path}"
47
+ Hello World
48
+ ```
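The fallback behaviour shown in the two examples above can be sketched in Ruby. This is only an illustration under stated assumptions: `max_length_option` is a hypothetical helper, not part of Puma, and the real checks are performed inside the Java extension. It mirrors the documented rules (an empty value counts as missing, anything that is not a positive whole number triggers a message and the default) and reuses the message wording from the sample output.

```ruby
# Hypothetical helper, not Puma's actual code (the real checks live in the Java extension).
def max_length_option(env, name, default)
  raw = env[name]
  # Missing or empty values are treated the same: silently use the default.
  return default if raw.nil? || raw.empty?

  # Accept only plain digit strings that denote a number greater than zero.
  return raw.to_i if raw.match?(/\A\d+\z/) && raw.to_i.positive?

  warn "The value #{raw} for #{name} is invalid. Using default value #{default} instead."
  default
end

limit = max_length_option(ENV, "PUMA_REQUEST_PATH_MAX_LENGTH", 8192)
puts "REQUEST_PATH limit: #{limit}"
```

Passing the environment in as an argument keeps the sketch easy to try with a plain `Hash` instead of the real `ENV`.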
49
+
50
+ ### Java Flight Recorder Compatibility
51
+
52
+ Unfortunately Java Flight Recorder uses `SIGUSR2` internally. If you wish to
53
+ use JFR, turn off Puma's trapping of `SIGUSR2` by setting the environment variable
54
+ `PUMA_SKIP_SIGUSR2` to any value.
data/docs/jungle/README.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  ## Systemd
4
4
 
5
- See [/docs/systemd](https://github.com/puma/puma/blob/master/docs/systemd.md).
5
+ See [/docs/systemd](https://github.com/puma/puma/blob/main/docs/systemd.md).
6
6
 
7
7
  ## rc.d
8
8
 
data/docs/kubernetes.md CHANGED
@@ -2,16 +2,17 @@
2
2
 
3
3
  ## Running Puma in Kubernetes
4
4
 
5
- In general running Puma in Kubernetes works as-is, no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some request might be sent to Puma after it has already entered graceful shutdown mode and is no longer accepting requests. This can lead to dropped requests during rolling deploys. A workaround for this is listed at the end of this article.
5
+ In general running Puma in Kubernetes works as-is, no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some requests might be sent to Puma after it has already entered graceful shutdown mode and is no longer accepting requests. This can lead to dropped requests during rolling deploys. A workaround for this is listed at the end of this article.
6
6
 
7
7
  ## Basic setup
8
8
 
9
9
  Assuming you already have a running cluster and docker image repository, you can run a simple Puma app with the following example Dockerfile and Deployment specification. These are meant as examples only and are deliberately very minimal to the point of skipping many options that are recommended for running in production, like healthchecks and envvar configuration with ConfigMaps. In general you should check the [Kubernetes documentation](https://kubernetes.io/docs/home/) and [Docker documentation](https://docs.docker.com/) for a more comprehensive overview of the available options.
10
10
 
11
- A basic Dockerfile example:
12
- ```
13
- FROM ruby:2.5.1-alpine # can be updated to newer ruby versions
14
- RUN apk update && apk add build-base # and any other packages you need
11
+ A basic Dockerfile example:
12
+
13
+ ```Dockerfile
14
+ FROM ruby:3.4.5-alpine # can be updated to newer ruby versions
15
+ RUN apk update && apk add build-base # and any other packages you need
15
16
 
16
17
  # Only rebuild gem bundle if Gemfile changes
17
18
  COPY Gemfile Gemfile.lock ./
@@ -26,7 +27,8 @@ CMD bundle exec rackup -o 0.0.0.0
26
27
  ```
27
28
 
28
29
  A sample `deployment.yaml`:
29
- ```
30
+
31
+ ```yaml
30
32
  ---
31
33
  apiVersion: apps/v1
32
34
  kind: Deployment
@@ -47,7 +49,7 @@ spec:
47
49
  image: <your image here>
48
50
  ports:
49
51
  - containerPort: 9292
50
- ```
52
+ ```
51
53
 
52
54
  ## Graceful shutdown and pod termination
53
55
 
@@ -59,7 +61,7 @@ For some high-throughput systems, it is possible that some HTTP requests will re
59
61
  4. The pod has up to `terminationGracePeriodSeconds` (default: 30 seconds) to gracefully shut down. Puma will do this (after it receives SIGTERM) by closing down the socket that accepts new requests and finishing any requests already running before exiting the Puma process.
60
62
  5. If the pod is still running after `terminationGracePeriodSeconds` has elapsed, the pod receives `SIGKILL` to make sure the process inside it stops. After that, the container exits and all other Kubernetes objects associated with it are cleaned up.
61
63
 
62
- There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives this request handles this request "too" late (due to internal lag or network latency between the replication and Service controllers) then it is possible that the Service controller will send one or more requests to a Puma process which has already shut down its listening socket. These requests will then fail with 5XX error codes.
64
+ There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives and handles this request "too" late (due to internal lag or network latency between the replication and Service controllers) then it is possible that the Service controller will send one or more requests to a Puma process which has already shut down its listening socket. These requests will then fail with 5XX error codes.
63
65
 
64
66
  The reason Kubernetes works this way, rather than handling step 2 synchronously, is the CAP theorem: in a distributed system there is no way to guarantee that any message will arrive promptly. In particular, waiting for all Service controllers to report back might get stuck for an indefinite time if one of them has already been terminated or if there has been a net split. A way to work around this is to add a sleep to the pre-stop hook that lasts as long as `terminationGracePeriodSeconds`. This will allow the Puma process to keep serving new requests during the entire grace period, although it will no longer receive new requests after all Service controllers have propagated the removal of the pod from their endpoint lists. Then, after `terminationGracePeriodSeconds`, the pod receives `SIGKILL` and closes down. If your process can't handle `SIGKILL` properly, for example because it needs to release locks in different services, you can also sleep for a shorter period (and/or increase `terminationGracePeriodSeconds`) as long as the time slept is longer than the time that your Service controllers take to propagate the pod removal. The downside of this workaround is that every pod will take at least the time slept to shut down, which increases the time required for your rolling deploy.
65
67
 
@@ -67,12 +69,5 @@ More discussions and links to relevant articles can be found in https://github.c
67
69
 
68
70
  ## Workers Per Pod, and Other Config Issues
69
71
 
70
- With containerization, you will have to make a decision about how "big" to make each pod. Should you run 2 pods with 50 workers each? 25 pods, each with 4 workers? 100 pods, with each Puma running in single mode? Each scenario represents the same total amount of capacity (100 Puma processes that can respond to requests), but there are tradeoffs to make.
71
-
72
- * Worker counts should be somewhere between 4 and 32 in most cases. You want more than 4 in order to minimize time spent in request queueing for a free Puma worker, but probably less than ~32 because otherwise autoscaling is working in too large of an increment or they probably won't fit very well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
73
- * Unless you have a very I/O-heavy application (50%+ time spent waiting on IO), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50%) will lead to additional request queueing time (latency!) and additional memory usage.
74
- * More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
75
- * Don't run less than 4 processes per pod if you can. Low numbers of processes per pod will lead to high request queueing, which means you will have to run more pods.
76
- * If multithreaded, allocate 1 CPU per worker. If single threaded, allocate 0.75 cpus per worker. Most web applications spend about 25% of their time in I/O - but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core.
77
- * Most Puma processes will use about ~512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8 GB (1 GB for the master, 7GB for the 4 children).
72
+ See our [deployment docs](./deployment.md) for more information about how to correctly size your pods and choose the right number of workers and threads.
78
73
 
data/docs/plugins.md CHANGED
@@ -5,13 +5,13 @@ operations.
5
5
 
6
6
  There are two canonical plugins to aid in the development of new plugins:
7
7
 
8
- * [tmp\_restart](https://github.com/puma/puma/blob/master/lib/puma/plugin/tmp_restart.rb):
8
+ * [tmp\_restart](https://github.com/puma/puma/blob/main/lib/puma/plugin/tmp_restart.rb):
9
9
  Restarts the server if the file `tmp/restart.txt` is touched
10
10
  * [heroku](https://github.com/puma/puma-heroku/blob/master/lib/puma/plugin/heroku.rb):
11
11
  Packages up the default configuration used by Puma on Heroku (being sunset
12
12
  with the release of Puma 5.0)
13
13
 
14
- Plugins are activated in a Puma configuration file (such as `config/puma.rb'`)
14
+ Plugins are activated in a Puma configuration file (such as `config/puma.rb`)
15
15
  by adding `plugin "name"`, such as `plugin "heroku"`.
16
16
 
17
17
  Plugins are activated based on path requirements so, activating the `heroku`
@@ -36,3 +36,7 @@ object that is useful for additional configuration.
36
36
 
37
37
  Public methods in [`Puma::Plugin`](../lib/puma/plugin.rb) are treated as a
38
38
  public API for plugins.
39
+
40
+ ## Binder hooks
41
+
42
+ The `Puma::Binder#before_parse` method lets you add a proc that runs before the body of `Puma::Binder#parse`. An example of usage can be found in [the puma-acme repository](https://github.com/anchordotdev/puma-acme/blob/v0.1.3/lib/puma/acme/plugin.rb#L97-L118) (there, `before_parse_hook` could be renamed to `before_parse`, making the monkey patching of [binder.rb](https://github.com/anchordotdev/puma-acme/blob/v0.1.3/lib/puma/acme/binder.rb) unnecessary).
data/docs/restart.md CHANGED
@@ -29,7 +29,7 @@ Any of the following will cause a Puma server to perform a hot restart:
29
29
 
30
30
  * The newly started Puma process changes its current working directory to the directory specified by the `directory` option. If `directory` is set to a symlink, this is automatically re-evaluated, so this mechanism can be used to upgrade the application.
31
31
  * Only one version of the application is running at a time.
32
- * `on_restart` is invoked just before the server shuts down. This can be used to clean up resources (like long-lived database connections) gracefully. Since Ruby 2.0, it is not typically necessary to explicitly close file descriptors on restart. This is because any file descriptor opened by Ruby will have the `FD_CLOEXEC` flag set, meaning that file descriptors are closed on `exec`. `on_restart` is useful, though, if your application needs to perform any more graceful protocol-specific shutdown procedures before closing connections.
32
+ * `before_restart` is invoked just before the server shuts down. This can be used to clean up resources (like long-lived database connections) gracefully. Since Ruby 2.0, it is not typically necessary to explicitly close file descriptors on restart. This is because any file descriptor opened by Ruby will have the `FD_CLOEXEC` flag set, meaning that file descriptors are closed on `exec`. `before_restart` is useful, though, if your application needs to perform any more graceful protocol-specific shutdown procedures before closing connections.
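As a minimal sketch of the hook described in the last item, a configuration file might register a `before_restart` block like the one below. The file path and the cleanup are assumptions for illustration: the ActiveRecord call stands in for whatever long-lived resource your application holds, and the fragment is meant to be evaluated by Puma as part of its configuration rather than run on its own.

```ruby
# config/puma.rb (illustrative fragment, evaluated by Puma's configuration DSL)

before_restart do
  # Assumed cleanup: release pooled database connections before the hot restart.
  # Replace with whatever protocol-specific shutdown your application needs.
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord::Base)
end
```

The `defined?` guard keeps the hook harmless in applications that do not load ActiveRecord.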
33
33
 
34
34
  ## Phased restart
35
35
 
@@ -59,7 +59,7 @@ Any of the following will cause a Puma server to perform a phased restart:
59
59
 
60
60
  * When a phased restart begins, the Puma master process changes its current working directory to the directory specified by the `directory` option. If `directory` is set to a symlink, this is automatically re-evaluated, so this mechanism can be used to upgrade the application.
61
61
  * On a single server, it's possible that two versions of the application are running concurrently during a phased restart.
62
- * `on_restart` is not invoked
62
+ * `before_restart` is not invoked
63
63
  * Phased restarts can be slow for Puma clusters with many workers. Hot restarts often complete more quickly, but at the cost of increased latency during the restart.
64
64
  * Phased restarts cannot be used to upgrade any gems loaded by the Puma master process, including `puma` itself, anything in `extra_runtime_dependencies`, or dependencies thereof. Upgrading other gems is safe.
65
65
  * If you remove the gems from old releases as part of your deployment strategy, there are additional considerations. Do not put any gems into `extra_runtime_dependencies` that have native extensions or have dependencies that have native extensions (one common example is `puma_worker_killer` and its dependency on `ffi`). Workers will fail on boot during a phased restart. The underlying issue is recorded in [an issue on the rubygems project](https://github.com/rubygems/rubygems/issues/4004). Hot restarts are your only option here if you need these dependencies.