puma 7.1.0-java → 7.2.0-java
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/History.md +71 -0
- data/README.md +17 -9
- data/docs/deployment.md +58 -23
- data/docs/jungle/README.md +1 -1
- data/docs/kubernetes.md +3 -10
- data/docs/plugins.md +2 -2
- data/docs/signals.md +10 -10
- data/docs/stats.md +1 -1
- data/docs/systemd.md +3 -3
- data/ext/puma_http11/puma_http11.c +101 -109
- data/lib/puma/app/status.rb +10 -2
- data/lib/puma/cluster/worker.rb +10 -9
- data/lib/puma/cluster.rb +2 -3
- data/lib/puma/configuration.rb +16 -9
- data/lib/puma/const.rb +2 -2
- data/lib/puma/dsl.rb +16 -6
- data/lib/puma/launcher.rb +4 -3
- data/lib/puma/puma_http11.jar +0 -0
- data/lib/puma/reactor.rb +3 -12
- data/lib/puma/request.rb +10 -8
- data/lib/puma/runner.rb +1 -1
- data/lib/puma/server.rb +3 -3
- data/lib/puma/single.rb +2 -2
- data/tools/Dockerfile +13 -5
- metadata +5 -6
- data/ext/puma_http11/ext_help.h +0 -15
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 9a3dac630c4a0e901a7fc0fd84f9e9f9a4d26f97c7053a80295a420028c83990
+  data.tar.gz: 672c76739cd3502a72f9cc9f69a78b0d9c4e17287981dbe64390800f0685c625
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f5983d6b00d74e220658943eb2d4a08103af3d034cd3fd32f9693a1dfbbef34e4c11f857d200c120267aad3069a93e8b4ed8681c897db9e6fa46c782ee441871
+  data.tar.gz: 247d27471aa4c0957792c3a5c4609f3e2ef712db02a13c3bff158e5e90ad79ee660b0502b73ac49a307ad2f5b2f6975cbc1f033fbcade3df26e30f49fdc1aa42
data/History.md
CHANGED
@@ -1,3 +1,38 @@
+## 7.2.0 / 2026-01-20
+
+* Features
+  * Add workers `:auto` ([#3827])
+  * Make it possible to restrict control server commands to stats ([#3787])
+
+* Bugfixes
+  * Don't break if `WEB_CONCURRENCY` is set to a blank string ([#3837])
+  * Don't share server between worker 0 and descendants on refork ([#3602])
+  * Fix phase check race condition in `Puma::Cluster#check_workers` ([#3690])
+  * Fix advertising of CLI config before config files are loaded ([#3823])
+
+* Performance
+  * 17% faster HTTP parsing through pre-interning env keys ([#3825])
+  * Implement `dsize` and `dcompact` functions for `Puma::HttpParser`, which makes Puma's C-extension GC-compactible ([#3828])
+
+* Refactor
+  * Remove `NoMethodError` rescue in `Reactor#select_loop` ([#3831])
+  * Various cleanups in the C extension ([#3814])
+  * Monomorphize `handle_request` return ([#3802])
+
+* Docs
+  * Change link to `docs/deployment.md` in `README.md` ([#3848])
+  * Fix formatting for each signal description in signals.md ([#3813])
+  * Update deployment and Kubernetes docs with Puma configuration tips ([#3807])
+  * Rename master to main ([#3809], [#3808], [#3800])
+  * Fix some minor typos in the docs ([#3804])
+  * Add `GOVERNANCE.md`, `MAINTAINERS` ([#3826])
+  * Remove Code Climate badge ([#3820])
+  * Add @joshuay03 to the maintainer list
+
+* CI
+  * Use Minitest 6 where applicable ([#3859])
+  * Many test suite improvements and flake fixes ([#3861], [#3863], [#3860], [#3852], [#3857], [#3856], [#3845], [#3843], [#3842], [#3841], [#3822], [#3817], [#3764])
+
 ## 7.1.0 / 2025-10-16
 
 * Features
@@ -2259,6 +2294,42 @@ be added back in a future date when a java Puma::MiniSSL is added.
 * Bugfixes
   * Your bugfix goes here <Most recent on the top, like GitHub> (#Github Number)
 
+[#3863]:https://github.com/puma/puma/pull/3863 "PR by Nate Berkopec, merged 2026-01-20"
+[#3861]:https://github.com/puma/puma/pull/3861 "PR by MSP-Greg, merged 2026-01-20"
+[#3860]:https://github.com/puma/puma/pull/3860 "PR by MSP-Greg, merged 2026-01-16"
+[#3859]:https://github.com/puma/puma/pull/3859 "PR by MSP-Greg, merged 2026-01-16"
+[#3857]:https://github.com/puma/puma/pull/3857 "PR by Aaron Patterson, merged 2026-01-12"
+[#3856]:https://github.com/puma/puma/pull/3856 "PR by MSP-Greg, merged 2026-01-12"
+[#3852]:https://github.com/puma/puma/pull/3852 "PR by Miłosz Bieniek, merged 2026-01-14"
+[#3848]:https://github.com/puma/puma/pull/3848 "PR by Miłosz Bieniek, merged 2025-12-27"
+[#3845]:https://github.com/puma/puma/pull/3845 "PR by MSP-Greg, merged 2025-12-19"
+[#3843]:https://github.com/puma/puma/pull/3843 "PR by MSP-Greg, merged 2025-12-18"
+[#3842]:https://github.com/puma/puma/pull/3842 "PR by MSP-Greg, merged 2025-12-18"
+[#3841]:https://github.com/puma/puma/pull/3841 "PR by MSP-Greg, merged 2025-12-18"
+[#3837]:https://github.com/puma/puma/pull/3837 "PR by John Bachir, merged 2026-01-09"
+[#3833]:https://github.com/puma/puma/pull/3833 "PR by Patrik Ragnarsson, merged 2025-11-25"
+[#3831]:https://github.com/puma/puma/pull/3831 "PR by Joshua Young, merged 2025-11-25"
+[#3828]:https://github.com/puma/puma/pull/3828 "PR by Jean Boussier, merged 2025-11-21"
+[#3827]:https://github.com/puma/puma/pull/3827 "PR by Nate Berkopec, merged 2026-01-20"
+[#3826]:https://github.com/puma/puma/pull/3826 "PR by Nate Berkopec, merged 2026-01-20"
+[#3825]:https://github.com/puma/puma/pull/3825 "PR by Jean Boussier, merged 2025-11-19"
+[#3823]:https://github.com/puma/puma/pull/3823 "PR by Joshua Young, merged 2025-11-18"
+[#3822]:https://github.com/puma/puma/pull/3822 "PR by Nate Berkopec, merged 2025-11-17"
+[#3820]:https://github.com/puma/puma/pull/3820 "PR by Nate Berkopec, merged 2025-11-19"
+[#3817]:https://github.com/puma/puma/pull/3817 "PR by Nate Berkopec, merged 2025-11-17"
+[#3814]:https://github.com/puma/puma/pull/3814 "PR by Jean Boussier, merged 2025-11-17"
+[#3813]:https://github.com/puma/puma/pull/3813 "PR by Masafumi Koba, merged 2025-11-17"
+[#3809]:https://github.com/puma/puma/pull/3809 "PR by Patrik Ragnarsson, merged 2025-10-26"
+[#3808]:https://github.com/puma/puma/pull/3808 "PR by Nymuxyzo, merged 2025-10-26"
+[#3807]:https://github.com/puma/puma/pull/3807 "PR by Nate Berkopec, merged 2025-10-28"
+[#3804]:https://github.com/puma/puma/pull/3804 "PR by Joe Rafaniello, merged 2025-10-21"
+[#3802]:https://github.com/puma/puma/pull/3802 "PR by Richard Schneeman, merged 2025-10-20"
+[#3800]:https://github.com/puma/puma/pull/3800 "PR by MSP-Greg, merged 2025-10-19"
+[#3787]:https://github.com/puma/puma/pull/3787 "PR by Stan Hu, merged 2025-10-17"
+[#3764]:https://github.com/puma/puma/pull/3764 "PR by MSP-Greg, merged 2025-10-17"
+[#3690]:https://github.com/puma/puma/pull/3690 "PR by Joshua Young, merged 2025-11-18"
+[#3602]:https://github.com/puma/puma/pull/3602 "PR by Joshua Young, merged 2025-11-28"
+
 [#3707]:https://github.com/puma/puma/pull/3707 "PR by @nerdrew, merged 2025-10-02"
 [#3794]:https://github.com/puma/puma/pull/3794 "PR by @schneems, merged 2025-10-16"
 [#3795]:https://github.com/puma/puma/pull/3795 "PR by @MSP-Greg, merged 2025-10-16"
data/README.md
CHANGED
@@ -4,8 +4,7 @@
 
 # Puma: A Ruby Web Server Built For Parallelism
 
-[](https://codeclimate.com/github/puma/puma)
+[](https://github.com/puma/puma/actions/workflows/tests.yml?query=branch%3Amain)
 []( https://stackoverflow.com/questions/tagged/puma )
 
 Puma is a **simple, fast, multi-threaded, and highly parallel HTTP 1.1 server for Ruby/Rack applications**.
@@ -82,10 +81,10 @@ $ bundle exec puma
 
 ## Configuration
 
-Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/
+Puma provides numerous options. Consult `puma -h` (or `puma --help`) for a full list of CLI options, or see `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb).
 
 You can also find several configuration examples as part of the
-[test](https://github.com/puma/puma/tree/
+[test](https://github.com/puma/puma/tree/main/test/config) suite.
 
 For debugging purposes, you can set the environment variable `PUMA_LOG_CONFIG` with a value
 and the loaded configuration will be printed as part of the boot process.
@@ -116,11 +115,20 @@ Or with the `WEB_CONCURRENCY` environment variable:
 $ WEB_CONCURRENCY=3 puma -t 8:32
 ```
 
+When using a config file, most applications can simply set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of worker processes to the available processors:
+
+```ruby
+# config/puma.rb
+workers :auto
+```
+
+See [`workers :auto` gotchas](lib/puma/dsl.rb).
+
 Note that threads are still used in cluster mode, and the `-t` thread flag setting is per worker, so `-w 2 -t 16:16` will spawn 32 threads in total, with 16 in each worker process.
 
-If the `WEB_CONCURRENCY` environment variable is set to `"auto"
+If `workers` is set to `:auto`, or the `WEB_CONCURRENCY` environment variable is set to `"auto"`, and the `concurrent-ruby` gem is available in your application, Puma will set the worker process count to the result of [available processors](https://msp-greg.github.io/concurrent-ruby/Concurrent.html#available_processor_count-class_method).
 
-For an in-depth discussion of the tradeoffs of thread and process count settings, [see our docs](
+For an in-depth discussion of the tradeoffs of thread and process count settings, [see our docs](docs/deployment.md).
 
 In cluster mode, Puma can "preload" your application. This loads all the application code *prior* to forking. Preloading reduces total memory usage of your application via an operating system feature called [copy-on-write](https://en.wikipedia.org/wiki/Copy-on-write).
 
@@ -226,7 +234,7 @@ end
 ### Error handling
 
 If Puma encounters an error outside of the context of your application, it will respond with a 400/500 and a simple
-textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/
+textual error message (see `Puma::Server#lowlevel_error` or [server.rb](https://github.com/puma/puma/blob/main/lib/puma/server.rb)).
 You can specify custom behavior for this scenario. For example, you can report the error to your third-party
 error-tracking service (in this example, [rollbar](https://rollbar.com)):
 
@@ -385,7 +393,7 @@ Puma has a built-in status and control app that can be used to query and control
 $ puma --control-url tcp://127.0.0.1:9293 --control-token foo
 ```
 
-Puma will start the control server on localhost port 9293. All requests to the control server will need to include control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/
+Puma will start the control server on localhost port 9293. All requests to the control server will need to include control token (in this case, `token=foo`) as a query parameter. This allows for simple authentication. Check out `Puma::App::Status` or [status.rb](https://github.com/puma/puma/blob/main/lib/puma/app/status.rb) to see what the status app has available.
 
 You can also interact with the control server via `pumactl`. This command will restart Puma:
 
@@ -417,7 +425,7 @@ $ puma -C "-"
 
 The other side-effects of setting the environment are whether to show stack traces (in `development` or `test`), and setting RACK_ENV may potentially affect middleware looking for this value to change their behavior. The default puma RACK_ENV value is `development`. You can see all config default values in `Puma::Configuration#puma_default_options` or [configuration.rb](https://github.com/puma/puma/blob/61c6213fbab/lib/puma/configuration.rb#L182-L204).
 
-Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/
+Check out `Puma::DSL` or [dsl.rb](https://github.com/puma/puma/blob/main/lib/puma/dsl.rb) to see all available options.
 
 ## Restart
 
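Taken together, the configuration settings discussed in this README diff amount to a very small config file. The sketch below is illustrative only: `workers :auto` assumes the `concurrent-ruby` gem is in the bundle, and the thread count simply restates Puma's default.

```ruby
# config/puma.rb -- minimal cluster-mode sketch, values illustrative
workers :auto   # size the worker count to the available processors (needs concurrent-ruby)
threads 5, 5    # per-worker thread pool; Puma's default of 5 is a decent starting point
preload_app!    # load the app before forking so workers share memory via copy-on-write
```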
data/docs/deployment.md
CHANGED
@@ -16,32 +16,34 @@ assume this is how you're using Puma.
 Initially, Puma was conceived as a thread-only web server, but support for
 processes was added in version 2.
 
+In general, use single mode only if:
+
+* You are using JRuby, TruffleRuby or another fully-multithreaded implementation of Ruby
+* You are using MRI but in an environment where only 1 CPU core is available.
+
+Otherwise, you'll want to use cluster mode to utilize all available CPU resources.
+
 To run `puma` in single mode (i.e., as a development environment), set the
 number of workers to 0; anything higher will run in cluster mode.
 
-
+## Cluster Mode Tips
 
-
+For the purposes of Puma provisioning, "CPU cores" means:
 
-
-
-* Set the number of threads to desired concurrent requests/number of workers.
-  Puma defaults to 5, and that's a decent number.
+1. On ARM, the number of physical cores.
+2. On x86, the number of logical cores, hyperthreads, or vCPUs (these words all mean the same thing).
 
-
+Set your config with the following process:
 
-* If you'
-
-
-* Enjoy 50% memory savings
-* As you grow more confident in the thread-safety of your app, you can tune the
-  workers down and the threads up.
+* Use cluster mode and set `workers :auto` (requires the `concurrent-ruby` gem) to match the number of CPU cores on the machine (minimum 2, otherwise use single mode!). If you can't add the gem, set the worker count manually to the available CPU cores.
+* Set the number of threads to desired concurrent requests/number of workers.
+  Puma defaults to 5, and that's a decent number.
 
-
+For most deployments, adding `concurrent-ruby` and using `workers :auto` is the right starting point.
 
-See [
+See [`workers :auto` gotchas](../lib/puma/dsl.rb).
 
-
+## Worker utilization
 
 **How do you know if you've got enough (or too many workers)?**
 
@@ -50,14 +52,34 @@ a time. But since so many apps are waiting on IO from DBs, etc., they can
 utilize threads to use the process more efficiently.
 
 Generally, you never want processes that are pegged all the time. That can mean
-there is more work to do than the process can get through. On the other hand, if
-you have processes that sit around doing nothing, then
-
+there is more work to do than the process can get through, and requests will end up with additional latency. On the other hand, if
+you have processes that sit around doing nothing, then you're wasting resources and money.
+
+In general, you are making a tradeoff between:
+
+1. CPU and memory utilization.
+2. Time spent queueing for a Puma worker to `accept` requests and additional latency caused by CPU contention.
+
+If latency is important to you, you will have to accept lower utilization, and vice versa.
 
-
-utilization means you've got capacity still but aren't starving threads.
+## Container/VPS sizing
 
-
+You will have to make a decision about how "big" to make each pod/VPS/server/dyno.
+
+**TL:DR;**: 80% of Puma apps will end up deploying "pods" of 4 workers, 5 threads each, 4 vCPU and 8GB of RAM.
+
+For the rest of this discussion, we'll adopt the Kubernetes term of "pods".
+
+Should you run 2 pods with 50 workers each? 25 pods, each with 4 workers? 100 pods, with each Puma running in single mode? Each scenario represents the same total amount of capacity (100 Puma processes that can respond to requests), but there are tradeoffs to make:
+
+* **Increasing worker counts decreases latency, but means you scale in bigger "chunks"**. Worker counts should be somewhere between 4 and 32 in most cases. You want more than 4 in order to minimize time spent in request queueing for a free Puma worker, but probably less than ~32 because otherwise autoscaling is working in too large of an increment or they probably won't fit very well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
+* **Increasing thread counts will increase throughput, but also latency and memory use** Unless you have a very I/O-heavy application (50%+ time spent waiting on IO), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50% of wall clock time) will lead to additional request latency and additional memory usage.
+* **Increasing worker counts decreases memory per worker on average**. More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
+* **Low worker counts (<4) have exceptionally poor throughput**. Don't run less than 4 processes per pod if you can. Low numbers of processes per pod will lead to high request queueing (see discussion above), which means you will have to run more pods and resources.
+* **CPU-core-to-worker ratios should be around 1**. If running Puma with `threads > 1`, allocate 1 CPU core (see definition above!) per worker. If single threaded, allocate ~0.75 cpus per worker. Most web applications spend about 25% of their time in I/O - but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core. Using `workers :auto` will size workers to this guidance on most platforms.
+* **Don't set memory limits unless necessary**. Most Puma processes will use about ~512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8 GB (1 GB for the master, 7GB for the 4 children).
+
+**Measuring utilization and queue time**
 
 Using a timestamp header from an upstream proxy server (e.g., `nginx` or
 `haproxy`) makes it possible to indicate how long requests have been waiting for
@@ -75,7 +97,7 @@ a Puma thread to become available.
 * `env['puma.request_body_wait']` contains the number of milliseconds Puma
   spent waiting for the client to send the request body.
 * haproxy: `%Th` (TLS handshake time) and `%Ti` (idle time before request)
-  can
+  can also be added as headers.
 
 ## Should I daemonize?
 
@@ -100,3 +122,16 @@ or hell, even `monit`.
 You probably will want to deploy some new code at some point, and you'd like
 Puma to start running that new code. There are a few options for restarting
 Puma, described separately in our [restart documentation](restart.md).
+
+## Migrating from Unicorn
+
+* If you're migrating from unicorn though, here are some settings to start with:
+  * Set workers to half the number of unicorn workers you're using
+  * Set threads to 2
+  * Enjoy 50% memory savings
+  * As you grow more confident in the thread-safety of your app, you can tune the
+    workers down and the threads up.
+
+## Ubuntu / Systemd (Systemctl) Installation
+
+See [systemd.md](systemd.md)
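The "Measuring utilization and queue time" advice added above can be wired up with a small Rack middleware. In the sketch below, the `X-Request-Start: t=<milliseconds since epoch>` header name and format are assumptions, so match them to whatever your nginx/haproxy actually sets, and the `app.request_queue_ms` env key is arbitrary.

```ruby
# Sketch: derive request queue time from a proxy-set timestamp header.
# Assumes the upstream proxy writes "X-Request-Start: t=<unix time in milliseconds>".
class QueueTimeMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    if (stamp = env["HTTP_X_REQUEST_START"])
      sent_at_ms = stamp.delete_prefix("t=").to_f
      queue_ms   = (Time.now.to_f * 1000.0) - sent_at_ms
      # Expose the wait so logging/APM can track time spent queueing for a Puma worker.
      env["app.request_queue_ms"] = queue_ms if queue_ms.positive?
    end
    @app.call(env)
  end
end
```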
data/docs/jungle/README.md
CHANGED
data/docs/kubernetes.md
CHANGED
@@ -2,7 +2,7 @@
 
 ## Running Puma in Kubernetes
 
-In general running Puma in Kubernetes works as-is, no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some
+In general running Puma in Kubernetes works as-is, no special configuration is needed beyond what you would write anyway to get a new Kubernetes Deployment going. There is one known interaction between the way Kubernetes handles pod termination and how Puma handles `SIGINT`, where some requests might be sent to Puma after it has already entered graceful shutdown mode and is no longer accepting requests. This can lead to dropped requests during rolling deploys. A workaround for this is listed at the end of this article.
 
 ## Basic setup
 
@@ -61,7 +61,7 @@ For some high-throughput systems, it is possible that some HTTP requests will re
 4. The pod has up to `terminationGracePeriodSeconds` (default: 30 seconds) to gracefully shut down. Puma will do this (after it receives SIGTERM) by closing down the socket that accepts new requests and finishing any requests already running before exiting the Puma process.
 5. If the pod is still running after `terminationGracePeriodSeconds` has elapsed, the pod receives `SIGKILL` to make sure the process inside it stops. After that, the container exits and all other Kubernetes objects associated with it are cleaned up.
 
-There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives
+There is a subtle race condition between step 2 and 3: The replication controller does not synchronously remove the pod from the Services AND THEN call the pre-stop hook of the pod, but rather it asynchronously sends "remove this pod from your endpoints" requests to the Services and then immediately proceeds to invoke the pods' pre-stop hook. If the Service controller (typically something like nginx or haproxy) receives and handles this request "too" late (due to internal lag or network latency between the replication and Service controllers) then it is possible that the Service controller will send one or more requests to a Puma process which has already shut down its listening socket. These requests will then fail with 5XX error codes.
 
 The way Kubernetes works this way, rather than handling step 2 synchronously, is due to the CAP theorem: in a distributed system there is no way to guarantee that any message will arrive promptly. In particular, waiting for all Service controllers to report back might get stuck for an indefinite time if one of them has already been terminated or if there has been a net split. A way to work around this is to add a sleep to the pre-stop hook of the same time as the `terminationGracePeriodSeconds` time. This will allow the Puma process to keep serving new requests during the entire grace period, although it will no longer receive new requests after all Service controllers have propagated the removal of the pod from their endpoint lists. Then, after `terminationGracePeriodSeconds`, the pod receives `SIGKILL` and closes down. If your process can't handle SIGKILL properly, for example because it needs to release locks in different services, you can also sleep for a shorter period (and/or increase `terminationGracePeriodSeconds`) as long as the time slept is longer than the time that your Service controllers take to propagate the pod removal. The downside of this workaround is that all pods will take at minimum the amount of time slept to shut down and this will increase the time required for your rolling deploy.
 
@@ -69,12 +69,5 @@ More discussions and links to relevant articles can be found in https://github.c
 
 ## Workers Per Pod, and Other Config Issues
 
-
-
-* Worker counts should be somewhere between 4 and 32 in most cases. You want more than 4 in order to minimize time spent in request queueing for a free Puma worker, but probably less than ~32 because otherwise autoscaling is working in too large of an increment or they probably won't fit very well into your nodes. In any queueing system, queue time is proportional to 1/n, where n is the number of things pulling from the queue. Each pod will have its own request queue (i.e., the socket backlog). If you have 4 pods with 1 worker each (4 request queues), wait times are, proportionally, about 4 times higher than if you had 1 pod with 4 workers (1 request queue).
-* Unless you have a very I/O-heavy application (50%+ time spent waiting on IO), use the default thread count (5 for MRI). Using higher numbers of threads with low I/O wait (<50%) will lead to additional request queueing time (latency!) and additional memory usage.
-* More processes per pod reduces memory usage per process, because of copy-on-write memory and because the cost of the single master process is "amortized" over more child processes.
-* Don't run less than 4 processes per pod if you can. Low numbers of processes per pod will lead to high request queueing, which means you will have to run more pods.
-* If multithreaded, allocate 1 CPU per worker. If single threaded, allocate 0.75 cpus per worker. Most web applications spend about 25% of their time in I/O - but when you're running multi-threaded, your Puma process will have higher CPU usage and should be able to fully saturate a CPU core.
-* Most Puma processes will use about ~512MB-1GB per worker, and about 1GB for the master process. However, you probably shouldn't bother with setting memory limits lower than around 2GB per process, because most places you are deploying will have 2GB of RAM per CPU. A sensible memory limit for a Puma configuration of 4 child workers might be something like 8 GB (1 GB for the master, 7GB for the 4 children).
+See our [deployment docs](./deployment.md) for more information about how to correctly size your pods and choose the right number of workers and threads.
 
data/docs/plugins.md
CHANGED
@@ -5,13 +5,13 @@ operations.
 
 There are two canonical plugins to aid in the development of new plugins:
 
-* [tmp\_restart](https://github.com/puma/puma/blob/
+* [tmp\_restart](https://github.com/puma/puma/blob/main/lib/puma/plugin/tmp_restart.rb):
   Restarts the server if the file `tmp/restart.txt` is touched
 * [heroku](https://github.com/puma/puma-heroku/blob/master/lib/puma/plugin/heroku.rb):
   Packages up the default configuration used by Puma on Heroku (being sunset
   with the release of Puma 5.0)
 
-Plugins are activated in a Puma configuration file (such as `config/puma.rb
+Plugins are activated in a Puma configuration file (such as `config/puma.rb`)
 by adding `plugin "name"`, such as `plugin "heroku"`.
 
 Plugins are activated based on path requirements so, activating the `heroku`
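As the corrected sentence above says, plugin activation is a one-liner in the configuration file; for example, the bundled `tmp_restart` plugin:

```ruby
# config/puma.rb
plugin "tmp_restart"  # restart the server whenever tmp/restart.txt is touched
```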
data/docs/signals.md
CHANGED
@@ -33,16 +33,16 @@ Now you will see via `ps` that there is no more `tail` process. Sometimes when r
 
 Puma cluster responds to these signals:
 
-- `TTIN
-- `TTOU
-- `TERM
-- `USR2
-- `USR1
-- `HUP
-- `INT
-- `CHLD`
-- `URG
-- `INFO
+- `TTIN`: Increment the worker count by 1.
+- `TTOU`: Decrement the worker count by 1.
+- `TERM`: Send `TERM` to worker. The worker will attempt to finish then exit.
+- `USR2`: Restart workers. This also reloads the Puma configuration file, if there is one.
+- `USR1`: Restart workers in phases, a rolling restart. This will not reload the configuration file.
+- `HUP`: Reopen log files defined in `stdout_redirect` configuration parameter. If there is no `stdout_redirect` option provided, it will behave like `INT`.
+- `INT`: Equivalent of sending Ctrl-C to cluster. Puma will attempt to finish then exit.
+- `CHLD`: Reap zombie child processes and wake event loop in `fork_worker` mode.
+- `URG`: Refork workers in phases from worker 0 if `fork_worker` option is enabled.
+- `INFO`: Print backtraces of all Puma threads.
 
 ## Callbacks order in case of different signals
 
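The signal descriptions above map directly onto `Process.kill` calls. A small sketch, assuming the master PID was written out via the `pidfile` option (the path is only an example):

```ruby
# Sketch: driving a Puma cluster with the signals documented above.
# Assumes config/puma.rb contains something like `pidfile "tmp/pids/puma.pid"`.
pid = Integer(File.read("tmp/pids/puma.pid"))

Process.kill(:TTIN, pid)  # add one worker
Process.kill(:TTOU, pid)  # remove one worker
Process.kill(:USR1, pid)  # phased (rolling) restart; does not reload the config file
Process.kill(:TERM, pid)  # finish in-flight requests, then exit
```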
data/docs/stats.md
CHANGED
@@ -70,7 +70,7 @@ When Puma runs in single mode, these stats are available at the top level. When
 
 ### cluster mode
 
-* phase: which phase of restart the process is in, during [phased restart](https://github.com/puma/puma/blob/
+* phase: which phase of restart the process is in, during [phased restart](https://github.com/puma/puma/blob/main/docs/restart.md)
 * workers: ??
 * booted_workers: how many workers currently running?
 * old_workers: ??
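The cluster-mode fields listed above can be fetched from the control server described in the README. A sketch, assuming Puma was started with `--control-url tcp://127.0.0.1:9293 --control-token foo` (these values mirror the README example; they are not defaults):

```ruby
# Sketch: read cluster stats (phase, workers, booted_workers, old_workers) via the control server.
require "json"
require "net/http"

body  = Net::HTTP.get(URI("http://127.0.0.1:9293/stats?token=foo"))
stats = JSON.parse(body)
puts "phase=#{stats["phase"]} booted=#{stats["booted_workers"]}/#{stats["workers"]}"
```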
data/docs/systemd.md
CHANGED
@@ -119,8 +119,8 @@ or cluster mode.
 ### Sockets and symlinks
 
 When using releases folders, you should set the socket path using the shared
-folder path (ex. `/srv/
-path (`/srv/
+folder path (ex. `/srv/project/shared/tmp/puma.sock`), not the release folder
+path (`/srv/project/releases/1234/tmp/puma.sock`).
 
 Puma will detect the release path socket as different than the one provided by
 systemd and attempt to bind it again, resulting in the exception `There is
@@ -139,7 +139,7 @@ automatically for any activated socket. When systemd socket activation is not
 enabled, this option does nothing.
 
 This also accepts an optional argument `only` (DSL: `'only'`) to discard any
-binds that
+binds that are not socket activated.
 
 ## Usage
 
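On the Puma side, the shared-folder socket discussed above is just a `bind` in the config file; a sketch using the doc's example path (adjust to your own deploy layout):

```ruby
# config/puma.rb -- bind to the shared (not per-release) socket path
bind "unix:///srv/project/shared/tmp/puma.sock"
```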