pitchfork 0.8.0 → 0.10.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 9c9d47a7cd3604f0807a41882322b4461edf2a990439e152dfa2d94bb731eff7
4
- data.tar.gz: 1a38267df9cff0493452fa88c65cb4c9a502643aaf05c8e0b60544f93e9f6e4f
3
+ metadata.gz: 0f23c677bd64eb8ca7a040cb2d41dab4047e05ae849da16af8e537748bfb6dbe
4
+ data.tar.gz: 890bac5b67c23d6f0aacf0d9130999cf69ac0bc5a2f1ea06bdd142d67c0606ac
5
5
  SHA512:
6
- metadata.gz: e8318fc2ae118a7e4a89f65e76e0634d306abd1838f0e4957f638bc870ddd202bc72c006fe3b93b6eec52f1765daea43b24b68eb7b8249b5b21b28a8b9abd51b
7
- data.tar.gz: 5f23586cf49e29649496e15577ce7344c477b99878bfd6756f685f2a96ad40fc202d4f3cf97dad8183dd50b2489423c1fcfd48e20303d4eafae6b7f1db586e9b
6
+ metadata.gz: 39c33bbbe865ba22bebc8b6b1e823f71d0ba6b2787f06af26f0ffe3d4251c781850826fcbcdf1dc13b1734919770e35dc57815d2de057f18d97ba354f247a3e7
7
+ data.tar.gz: 94e5ae39e129af1fd950c9a77e094bac5bda586f7358115fd8cf25bb3c6ee2079bb4654ebd4e6ed5e9ff498b461559da07d7b44acc0147fcac5c9a4695b7c62c
data/CHANGELOG.md CHANGED
@@ -1,5 +1,13 @@
1
1
  # Unreleased
2
2
 
3
+ # 0.10.0
4
+
5
+ - Include requests count in workers proctitle.
6
+
7
+ # 0.9.0
8
+
9
+ - Implement `spawn_timeout` to protect against bugs causing workers to get stuck before they reach ready state.
10
+
3
11
  # 0.8.0
4
12
 
5
13
  - Add an `after_monitor_ready` callback, called in the monitor process at end of boot.
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- pitchfork (0.8.0)
4
+ pitchfork (0.10.0)
5
5
  rack (>= 2.0)
6
6
  raindrops (~> 0.7)
7
7
 
data/README.md CHANGED
@@ -9,13 +9,6 @@ advantage of features in Unix/Unix-like kernels. Slow clients should
9
9
  only be served by placing a reverse proxy capable of fully buffering
10
10
  both the request and response in between `pitchfork` and slow clients.
11
11
 
12
- ## Disclaimer
13
-
14
- Until this notice is removed from the README, `pitchfork` should be
15
- considered experimental. As such it is not encouraged to run it in
16
- production just yet unless you feel capable of debugging yourself
17
- any issue that may arise.
18
-
19
12
  ## Features
20
13
 
21
14
  * Designed for Rack, Linux, fast clients, and ease-of-debugging. We
@@ -40,9 +33,23 @@ any issue that may arise.
40
33
  or ports yourself. `pitchfork` can spawn and manage any number of
41
34
  worker processes you choose to scale to your backend.
42
35
 
36
+ * Adaptive timeout: the request timeout can be extended dynamically on a
37
+ per-request basis, which makes it possible to keep a strict overall timeout
38
+ for most endpoints while allowing a few endpoints to take longer.
39
+
43
40
  * Load balancing is done entirely by the operating system kernel.
44
41
  Requests never pile up behind a busy worker process.
45
42
 
43
+ ## When to Use
44
+
45
+ Pitchfork isn't inherently better than other Ruby application servers; it mostly
46
+ focuses on different tradeoffs.
47
+
48
+ If you are fine with your current server, it's best to stick with it.
49
+
50
+ If there is a problem you are trying to solve, please read the
51
+ [migration guide](docs/WHY_MIGRATE.md) first.
52
+
46
53
  ## Requirements
47
54
 
48
55
  Ruby(MRI) Version 2.5 and above.
@@ -238,6 +238,19 @@ exit or be SIGKILL-ed due to timeouts.
238
238
  See https://nginx.org/en/docs/http/ngx_http_upstream_module.html
239
239
  for more details on nginx upstream configuration.
240
240
 
241
+ ### `spawn_timeout`
242
+
243
+ ```ruby
244
+ spawn_timeout 5
245
+ ```
246
+
247
+ Sets the timeout for a newly spawned worker to be ready after being spawned.
248
+
249
+ This timeout is a safeguard against various low-level fork safety bugs that could cause
250
+ a process to dead-lock.
251
+
252
+ The default of `10` seconds is quite generous and likely doesn't need to be adjusted.
253
+
241
254
  ### `logger`
242
255
 
243
256
  ```ruby
data/docs/REFORKING.md CHANGED
@@ -66,7 +66,10 @@ PID COMMAND
66
66
  105 \_ pitchfork (gen:0) worker[3]
67
67
  ```
68
68
 
69
- When a reforking is triggered, one of the workers is selected to fork a new `mold`.
69
+ As the diagram shows, while workers are forked from the mold, they become children of the master process.
70
+ We'll see how that works [later](#forking-sibling-processes).
71
+
72
+ When a reforking is triggered, one of the workers is selected to fork a new `mold`:
70
73
 
71
74
  ```
72
75
  PID COMMAND
@@ -79,6 +82,9 @@ PID COMMAND
79
82
  105 \_ pitchfork (gen:1) mold
80
83
  ```
81
84
 
85
+ Again, while the mold was forked from a worker, it becomes a child of the master process.
86
+ We'll see how that works [later](#forking-sibling-processes).
87
+
82
88
  When that new mold is ready, `pitchfork` terminates the old mold and starts a slow rollout, replacing older workers with fresh workers
83
89
  forked from the mold:
84
90
 
@@ -104,7 +110,7 @@ PID COMMAND
104
110
 
105
111
  etc.
106
112
 
107
- ### Forking Sibling Processes
113
+ ### Forking Sibling Processes
108
114
 
109
115
  Normally on unix systems, when calling `fork(2)`, the newly created process is a child of the original one, so forking from the mold should create
110
116
  a process tree such as:
@@ -119,5 +125,8 @@ PID COMMAND
119
125
  However the `pitchfork` master process registers itself as a "child subreaper" via [`PR_SET_CHILD_SUBREAPER`](https://man7.org/linux/man-pages/man2/prctl.2.html).
120
126
  This means any descendant process that is orphaned will be re-parented as a child of the master rather than a child of the init process (pid 1).
121
127
 
122
- With this in mind, the mold fork twice to create an orphaned process that will get re-attached to the master, effectively forking a sibling rather than a child.
123
- The need for `PR_SET_CHILD_SUBREAPER` is the main reason why reforking is only available on Linux.
128
+ With this in mind, the mold forks twice to create an orphaned process that will get re-attached to the master,
129
+ effectively forking a sibling rather than a child. Similarly, workers do the same when forking new molds.
130
+ This technique eases killing previous generations of molds and workers.
131
+
132
+ The need for `PR_SET_CHILD_SUBREAPER` is the main reason why reforking is only available on Linux.
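The double fork described in this section can be sketched in plain Ruby. This is a minimal illustration and not pitchfork's implementation: it does not call `prctl(PR_SET_CHILD_SUBREAPER)`, so the orphaned process is adopted by whichever process the kernel picks (init, or an existing subreaper) rather than by a pitchfork master. All variable names are hypothetical.

```ruby
# Minimal double-fork sketch (Unix only). Names are illustrative.
reader, writer = IO.pipe

middle_pid = fork do
  reader.close
  middle = Process.pid

  fork do
    # Grandchild: wait (up to ~2s) for the middle process to exit,
    # at which point the kernel re-parents this process.
    200.times do
      break if Process.ppid != middle
      sleep 0.01
    end
    writer.puts "#{middle} #{Process.ppid}"
    writer.close
    exit!(0)
  end

  exit!(0) # the middle process exits right away, orphaning the grandchild
end

writer.close
Process.wait(middle_pid) # reap the middle process so it is not left a zombie
forked_from, adopted_by = reader.read.split.map(&:to_i)
reader.close

puts "forked from #{forked_from}, now a child of #{adopted_by}"
```

With a subreaper registered, `adopted_by` would be the subreaper's pid, which is how a process forked from the mold ends up as a sibling of the mold.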
data/docs/WHY_MIGRATE.md ADDED
@@ -0,0 +1,93 @@
1
+ # Why migrate to Pitchfork?
2
+
3
+ First and foremost, if you don't have any specific problem with your current server, then don't.
4
+
5
+ Pitchfork isn't a silver bullet. It is very opinionated software that focuses on very specific tradeoffs,
6
+ which differ from those of other servers.
7
+
8
+ ## Coming from Unicorn
9
+
10
+ ### Why Migrate?
11
+
12
+ #### Adaptive timeout
13
+
14
+ Pitchfork allows extending the request timeout on a per-request basis.
15
+ This can be helpful when trying to reduce the global request timeout
16
+ to a saner value: you can enforce a stricter value globally, and extend it
17
+ only in the minority of offending endpoints.
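As a rough sketch of the per-request extension: the `process_client` hunk in this diff stores a timeout handler under `env["pitchfork.timeout"]`, so an application can reach it from the Rack environment. The handler class below is a stand-in written for this example, and its `extend_deadline` method name is an assumption; check the pitchfork documentation for the actual interface.

```ruby
# Stand-in for the object stored in env["pitchfork.timeout"].
# The class and its `extend_deadline` method are assumptions for this sketch.
class FakeTimeoutHandler
  attr_reader :deadline

  def initialize(timeout)
    @deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  end

  def extend_deadline(extra_seconds)
    @deadline += extra_seconds
  end
end

slow_paths = ["/reports/export"] # hypothetical slow endpoint

# A Rack-style app that asks for more time only on known slow endpoints.
app = lambda do |env|
  env["pitchfork.timeout"].extend_deadline(30) if slow_paths.include?(env["PATH_INFO"])
  [200, { "content-type" => "text/plain" }, ["ok"]]
end

fast = FakeTimeoutHandler.new(5)
slow = FakeTimeoutHandler.new(5)
fast_before = fast.deadline
slow_before = slow.deadline

app.call("PATH_INFO" => "/", "pitchfork.timeout" => fast)
app.call("PATH_INFO" => "/reports/export", "pitchfork.timeout" => slow)

puts "fast endpoint extended by #{fast.deadline - fast_before}s"
puts "slow endpoint extended by #{slow.deadline - slow_before}s"
```

The global timeout stays strict for every other path; only the listed endpoint pays for the longer deadline.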
18
+
19
+ #### Memory Usage - Reforking
20
+
21
+ If you are unsatisfied with Unicorn memory usage, but threaded Puma isn't an option
22
+ for you, then Pitchfork may be an option if you are able to enable reforking.
23
+
24
+ However be warned that making an application fork safe can be non-trivial,
25
+ and mistakes can lead to critical bugs.
26
+
27
+ #### Rack 3
28
+
29
+ As of Unicorn `6.1.0`, Rack 3 isn't yet supported by Unicorn.
30
+
31
+ Pitchfork is compatible with Rack 3.
32
+
33
+ ### Why Not Migrate?
34
+
35
+ #### Reduced Features
36
+
37
+ While Pitchfork started as a fork of Unicorn, many features such as daemonization,
38
+ pid file management, hot reload have been stripped.
39
+
40
+ Pitchfork only kept features that make sense in a containerized world.
41
+
42
+ ## Coming from Puma
43
+
44
+ Generally speaking, compared to (threaded) Puma, Pitchfork *may* offer better latency and isolation at the expense of throughput.
45
+
46
+ ### Why Migrate?
47
+
48
+ #### Latency
49
+
50
+ If you suspect your application is subject to contention on the GVL or some other in-process shared resources,
51
+ then Pitchfork may offer improved latency.
52
+
53
+ It is however heavily recommended to first confirm this suspicion with profiling
54
+ tools such as [gvltools](https://github.com/Shopify/gvltools).
55
+
56
+ If your application isn't subject to in-process contention, Pitchfork is unlikely to improve latency.
57
+
58
+ #### Out of Band Garbage Collection
59
+
60
+ Another advantage of only processing a single request per process is that
61
+ [garbage collection can be triggered periodically while the worker isn't processing any request](https://shopify.engineering/adventures-in-garbage-collection).
62
+
63
+ This can significantly improve tail latency at the expense of throughput.
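A minimal sketch of the idea, assuming a hook that runs after each completed request. The `(server, worker, env)` arguments mirror the `@after_request_complete&.call(self, worker, request_env)` call site visible later in this diff; how the hook is registered in a config file, and the threshold of 3, are assumptions made for illustration.

```ruby
# Hypothetical out-of-band GC hook: run a full GC every N completed requests,
# at a point where the worker is not serving anyone.
oob_gc_every = 3 # illustrative value, not a recommendation
completed = 0

oob_gc = lambda do |_server, _worker, _env|
  completed += 1
  next false if completed < oob_gc_every

  completed = 0
  GC.start
  true
end

# Simulate six finished requests and record when a collection ran.
ran_gc = Array.new(6) { oob_gc.call(nil, nil, {}) }
p ran_gc
```

Because the collection runs between requests, its pause is paid while no client is waiting on this worker, which is where the throughput cost comes from.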
64
+
65
+ #### Resiliency and Isolation
66
+
67
+ Since Pitchfork workers have their own address space and only process one request at a time,
68
+ it is much harder for one faulty request to impact another.
69
+
70
+ Even if a bug causes Ruby to crash, only the request that triggered the bug will be impacted.
71
+
72
+ If a bug causes Ruby to hang, the monitor process will SIGKILL the worker and the capacity will be
73
+ reclaimed.
74
+
75
+ This makes Pitchfork more resilient to some classes of bugs.
76
+
77
+ #### Thread Safety
78
+
79
+ Pitchfork doesn't require applications to be thread-safe. That is probably the worst reason
80
+ to migrate though.
81
+
82
+ ### Why Not Migrate?
83
+
84
+ #### Memory Usage
85
+
86
+ Without reforking enabled Pitchfork will without a doubt use more memory than threaded Puma.
87
+
88
+ With reforking enabled, results will vary based on the application profile and the number of Puma threads.
89
+ Memory usage should be in the same ballpark: sometimes better, but likely worse. This depends on many
90
+ variables and can't really be predicted.
91
+
92
+ However be warned that [making an application fork safe](FORK_SAFETY.md) can be non-trivial,
93
+ and mistakes can lead to critical bugs.
@@ -107,6 +107,33 @@ module Pitchfork
       @workers.each_value(&block)
     end
 
+    def soft_kill_all(sig)
+      each do |child|
+        child.soft_kill(sig)
+      end
+    end
+
+    def hard_kill(sig, child)
+      child.hard_kill(sig)
+    rescue Errno::ESRCH
+      reap(child.pid)
+      child.close
+    end
+
+    def hard_kill_all(sig)
+      each do |child|
+        hard_kill(sig, child)
+      end
+    end
+
+    def hard_timeout(child)
+      child.hard_timeout!
+    rescue Errno::ESRCH
+      reap(child.pid)
+      child.close
+      true
+    end
+
     def workers
       @workers.values
     end
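The `rescue Errno::ESRCH` branches in this hunk handle the case where the target process no longer exists by the time the signal is sent. A small standalone sketch of that situation follows; the helper name is hypothetical and not part of pitchfork.

```ruby
# Hypothetical helper (not pitchfork API): signal a pid, and report whether
# the process was still there to receive it.
def signal_or_note_gone(sig, pid)
  Process.kill(sig, pid)
  :signaled
rescue Errno::ESRCH
  # No such process: it already exited and was reaped.
  :already_gone
end

pid = fork { sleep } # child that would otherwise sleep forever
first = signal_or_note_gone(:KILL, pid)
Process.wait(pid)    # reap it; the pid no longer refers to a live process
second = signal_or_note_gone(:KILL, pid)

p [first, second]
```

In the hunk, the same condition is treated as a cue to drop the stale bookkeeping for that child (`reap` and `close`) instead of letting the error propagate.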
@@ -33,6 +33,7 @@ module Pitchfork
33
33
  DEFAULTS = {
34
34
  :soft_timeout => 20,
35
35
  :cleanup_timeout => 2,
36
+ :spawn_timeout => 10,
36
37
  :timeout => 22,
37
38
  :logger => default_logger,
38
39
  :worker_processes => 1,
@@ -174,6 +175,10 @@ module Pitchfork
174
175
  set_int(:timeout, soft_timeout + cleanup_timeout, 5)
175
176
  end
176
177
 
178
+ def spawn_timeout(seconds)
179
+ set_int(:spawn_timeout, seconds, 1)
180
+ end
181
+
177
182
  def worker_processes(nr)
178
183
  set_int(:worker_processes, nr, 1)
179
184
  end
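The body of `set_int` is not part of this diff. The sketch below assumes it rejects non-integers and values below the minimum passed as the third argument, which would mean `spawn_timeout` accepts whole seconds of at least 1. `MiniConfig` is a hypothetical miniature written for this example, not pitchfork's configurator.

```ruby
# Hypothetical miniature of a config DSL with integer validation.
class MiniConfig
  DEFAULTS = { spawn_timeout: 10 }.freeze

  def initialize
    @set = DEFAULTS.dup
  end

  def [](key)
    @set.fetch(key)
  end

  def spawn_timeout(seconds)
    set_int(:spawn_timeout, seconds, 1)
  end

  private

  # Assumed behavior: reject non-integers and values below `min`.
  def set_int(key, value, min)
    raise ArgumentError, "not an integer: #{key}=#{value.inspect}" unless value.is_a?(Integer)
    raise ArgumentError, "too low (< #{min}): #{key}=#{value.inspect}" if value < min

    @set[key] = value
  end
end

config = MiniConfig.new
config.spawn_timeout(5)
puts config[:spawn_timeout]

begin
  config.spawn_timeout(0)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```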
@@ -74,7 +74,7 @@ module Pitchfork
74
74
  end
75
75
 
76
76
  # :stopdoc:
77
- attr_accessor :app, :timeout, :soft_timeout, :cleanup_timeout, :worker_processes,
77
+ attr_accessor :app, :timeout, :soft_timeout, :cleanup_timeout, :spawn_timeout, :worker_processes,
78
78
  :after_worker_fork, :after_mold_fork,
79
79
  :listener_opts, :children,
80
80
  :orig_app, :config, :ready_pipe,
@@ -396,15 +396,15 @@ module Pitchfork
396
396
  limit = Pitchfork.time_now + timeout
397
397
  until @children.workers.empty? || Pitchfork.time_now > limit
398
398
  if graceful
399
- soft_kill_each_child(:TERM)
399
+ @children.soft_kill_all(:TERM)
400
400
  else
401
- kill_each_child(:INT)
401
+ @children.hard_kill_all(:INT)
402
402
  end
403
403
  if monitor_loop(false) == StopIteration
404
404
  return StopIteration
405
405
  end
406
406
  end
407
- kill_each_child(:KILL)
407
+ @children.hard_kill_all(:KILL)
408
408
  @promotion_lock.unlink
409
409
  end
410
410
 
@@ -506,25 +506,28 @@ module Pitchfork
506
506
  next
507
507
  else # worker is out of time
508
508
  next_sleep = 0
509
- if worker.mold?
510
- logger.error "mold pid=#{worker.pid} timed out, killing"
511
- else
512
- logger.error "worker=#{worker.nr} pid=#{worker.pid} timed out, killing"
513
- end
509
+ hard_timeout(worker)
510
+ end
511
+ end
514
512
 
515
- if @after_worker_hard_timeout
516
- begin
517
- @after_worker_hard_timeout.call(self, worker)
518
- rescue => error
519
- Pitchfork.log_error(@logger, "after_worker_hard_timeout callback", error)
520
- end
521
- end
513
+ next_sleep <= 0 ? 1 : next_sleep
514
+ end
522
515
 
523
- kill_worker(:KILL, worker.pid) # take no prisoners for hard timeout violations
516
+ def hard_timeout(worker)
517
+ if @after_worker_hard_timeout
518
+ begin
519
+ @after_worker_hard_timeout.call(self, worker)
520
+ rescue => error
521
+ Pitchfork.log_error(@logger, "after_worker_hard_timeout callback", error)
524
522
  end
525
523
  end
526
524
 
527
- next_sleep <= 0 ? 1 : next_sleep
525
+ if worker.mold?
526
+ logger.error "mold pid=#{worker.pid} timed out, killing"
527
+ else
528
+ logger.error "worker=#{worker.nr} pid=#{worker.pid} timed out, killing"
529
+ end
530
+ @children.hard_timeout(worker) # take no prisoners for hard timeout violations
528
531
  end
529
532
 
530
533
  def trigger_refork
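The timeout handling in this hunk relies on each child carrying a deadline that the monitor compares against the clock, and `spawn_timeout` reuses that mechanism by setting a deadline before the child is forked. The sketch below is a simplified model of such a scan under assumed names (`TrackedChild`, `scan_deadlines`); it is not the code from this file and it uses a fake clock value.

```ruby
# Hypothetical stand-in for a tracked child process.
TrackedChild = Struct.new(:name, :deadline) do
  def update_deadline(timeout, now)
    self.deadline = now + timeout
  end
end

# One pass of a monitor loop: returns the children that are out of time and
# how long the monitor may sleep before the next deadline can expire.
def scan_deadlines(children, now, max_sleep)
  next_sleep = max_sleep
  expired = []

  children.each do |child|
    remaining = child.deadline - now
    if remaining > 0
      next_sleep = remaining if remaining < next_sleep
    else
      next_sleep = 0
      expired << child
    end
  end

  [expired, next_sleep <= 0 ? 1 : next_sleep]
end

now = 100 # fake monotonic clock reading, in seconds
booting = TrackedChild.new("booting worker")
booting.update_deadline(10, now) # deadline set before the fork, as with spawn_timeout
serving = TrackedChild.new("serving worker", now + 30)

expired_early, sleep_early = scan_deadlines([booting, serving], now + 1, 21)
expired_late, sleep_late = scan_deadlines([booting, serving], now + 11, 21)

puts "early: #{expired_early.map(&:name).inspect}, sleep #{sleep_early}s"
puts "late: #{expired_late.map(&:name).inspect}, sleep #{sleep_late}s"
```

Setting the deadline before forking means a child that hangs during boot is caught by the same scan that catches a worker stuck on a request.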
@@ -556,6 +559,10 @@ module Pitchfork
556
559
  def spawn_worker(worker, detach:)
557
560
  logger.info("worker=#{worker.nr} gen=#{worker.generation} spawning...")
558
561
 
562
+ # We set the deadline before spawning the child so that if for some
563
+ # reason it gets stuck before reaching the worker loop,
564
+ # the monitor process will kill it.
565
+ worker.update_deadline(@spawn_timeout)
559
566
  Pitchfork.fork_sibling do
560
567
  worker.pid = Process.pid
561
568
 
@@ -693,12 +700,12 @@ module Pitchfork
693
700
 
694
701
  # once a client is accepted, it is processed in its entirety here
695
702
  # in 3 easy steps: read request, call app, write app response
696
- def process_client(client, timeout_handler)
703
+ def process_client(client, worker, timeout_handler)
697
704
  env = nil
698
705
  @request = Pitchfork::HttpParser.new
699
706
  env = @request.read(client)
700
707
 
701
- proc_name status: "processing: #{env["PATH_INFO"]}"
708
+ proc_name status: "requests: #{worker.requests_count}, processing: #{env["PATH_INFO"]}"
702
709
 
703
710
  timeout_handler.rack_env = env
704
711
  env["pitchfork.timeout"] = timeout_handler
@@ -831,7 +838,7 @@ module Pitchfork
831
838
  when Message
832
839
  worker.update(client)
833
840
  else
834
- request_env = process_client(client, prepare_timeout(worker))
841
+ request_env = process_client(client, worker, prepare_timeout(worker))
835
842
  @after_request_complete&.call(self, worker, request_env)
836
843
  worker.increment_requests_count
837
844
  end
@@ -844,6 +851,7 @@ module Pitchfork
844
851
 
845
852
  if @refork_condition && Info.fork_safe? && !worker.outdated?
846
853
  if @refork_condition.met?(worker, logger)
854
+ proc_name status: "requests: #{worker.requests_count}, spawning mold"
847
855
  if spawn_mold(worker.generation)
848
856
  logger.info("Refork condition met, promoting ourselves")
849
857
  end
@@ -851,7 +859,7 @@ module Pitchfork
851
859
  end
852
860
  end
853
861
 
854
- proc_name status: "waiting"
862
+ proc_name status: "requests: #{worker.requests_count}, waiting"
855
863
  waiter.get_readers(ready, readers, @timeout * 500) # to milliseconds, but halved
856
864
  rescue => e
857
865
  Pitchfork.log_error(@logger, "listen loop error", e) if readers[0]
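To illustrate what the new status strings look like in a process listing, here is a small sketch. The `requests: N, <status>` part comes from this hunk; the title prefix and the ` - ` separator are assumptions for the example, and `worker_title` is a hypothetical helper. `Process.setproctitle` is the standard Ruby call for changing the title shown by `ps`.

```ruby
# Hypothetical title builder; the exact format pitchfork uses is not shown here.
def worker_title(generation:, nr:, requests_count:, status:)
  "pitchfork (gen:#{generation}) worker[#{nr}] - requests: #{requests_count}, #{status}"
end

title = worker_title(generation: 0, nr: 3, requests_count: 42, status: "waiting")
Process.setproctitle(title) # changes what `ps` shows for this process
puts title
```

Having the count in the title makes it possible to see how much traffic each worker has served without attaching a debugger or reading logs.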
@@ -926,15 +934,6 @@ module Pitchfork
926
934
  worker = @children.reap(wpid) and worker.close rescue nil
927
935
  end
928
936
 
929
- # delivers a signal to each worker
930
- def kill_each_child(signal)
931
- @children.each { |w| kill_worker(signal, w.pid) }
932
- end
933
-
934
- def soft_kill_each_child(signal)
935
- @children.each { |worker| worker.soft_kill(signal) }
936
- end
937
-
938
937
  # returns an array of string names for the given listener array
939
938
  def listener_names(listeners = LISTENERS)
940
939
  listeners.map { |io| sock_name(io) }
@@ -6,14 +6,35 @@ module Pitchfork
   module Info
     @workers_count = 0
     @fork_safe = true
-    @kept_ios = ObjectSpace::WeakMap.new
+
+    class WeakSet # :nodoc
+      def initialize
+        @map = ObjectSpace::WeakMap.new
+      end
+
+      if RUBY_VERSION < "2.7"
+        def <<(object)
+          @map[object] = object
+        end
+      else
+        def <<(object)
+          @map[object] = true
+        end
+      end
+
+      def each(&block)
+        @map.each_key(&block)
+      end
+    end
+
+    @kept_ios = WeakSet.new
 
     class << self
       attr_accessor :workers_count
 
       def keep_io(io)
         raise ArgumentError, "#{io.inspect} doesn't respond to :to_io" unless io.respond_to?(:to_io)
-        @kept_ios[io] = io
+        @kept_ios << io
         io
       end
 
@@ -22,9 +43,9 @@ module Pitchfork
22
43
  end
23
44
 
24
45
  def close_all_ios!
25
- ignored_ios = [$stdin, $stdout, $stderr]
46
+ ignored_ios = [$stdin, $stdout, $stderr, STDIN, STDOUT, STDERR].uniq.compact
26
47
 
27
- @kept_ios.each_value do |io_like|
48
+ @kept_ios.each do |io_like|
28
49
  ignored_ios << (io_like.is_a?(IO) ? io_like : io_like.to_io)
29
50
  end
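The `WeakSet` introduced in this file can be exercised on its own. The sketch below re-creates the same idea under a different name (`SketchWeakSet`, hypothetical) and assumes Ruby 2.7 or newer, where `ObjectSpace::WeakMap` accepts `true` as a value.

```ruby
# Sketch of a weak set on top of ObjectSpace::WeakMap (Ruby 2.7+ assumed).
class SketchWeakSet
  def initialize
    @map = ObjectSpace::WeakMap.new
  end

  def <<(object)
    @map[object] = true
    self
  end

  def each(&block)
    @map.each_key(&block)
  end
end

set = SketchWeakSet.new
log = File.open(File::NULL, "w")
sock_a, sock_b = IO.pipe

set << log << sock_a << log # adding the same object twice keeps one entry

members = []
set.each { |io| members << io }
puts members.size

[log, sock_a, sock_b].each(&:close)
```

Since the map only holds weak references, an IO that the application drops can still be garbage collected; registering it does not keep it alive.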
30
51
 
@@ -1,7 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Pitchfork
4
- VERSION = "0.8.0"
4
+ VERSION = "0.10.0"
5
5
  module Const
6
6
  UNICORN_VERSION = '6.1.0'
7
7
  end
@@ -139,6 +139,14 @@ module Pitchfork
139
139
  success
140
140
  end
141
141
 
142
+ def hard_kill(sig)
143
+ Process.kill(sig, pid)
144
+ end
145
+
146
+ def hard_timeout!
147
+ hard_kill(:KILL)
148
+ end
149
+
142
150
  # this only runs when the Rack app.call is not running
143
151
  # act like a listener
144
152
  def accept_nonblock(exception: nil) # :nodoc:
data/lib/pitchfork.rb CHANGED
@@ -204,8 +204,13 @@ module Pitchfork
204
204
  # or the master to be PID 1.
205
205
  if middle_pid = FORK_LOCK.synchronize { Process.fork } # parent
206
206
  # We need to wait(2) so that the middle process doesn't end up a zombie.
207
- Process.wait(middle_pid)
207
+ # The process only calls fork again and exits, so it should be pretty fast.
208
+ # However it might need to execute some `Process._fork` or `at_exit` callbacks,
209
+ # so in case it takes more than 5 seconds to exit, we kill it with SIGBUS
210
+ # to produce a crash report, as this is indicative of a nasty bug.
211
+ process_wait_with_timeout(middle_pid, 5, :BUS)
208
212
  else # first child
213
+ Process.setproctitle("<pitchfork fork_sibling>")
209
214
  clean_fork(&block) # detach into a grand child
210
215
  exit
211
216
  end
@@ -216,6 +221,18 @@ module Pitchfork
216
221
  nil # it's tricky to return the PID
217
222
  end
218
223
 
224
+ def process_wait_with_timeout(pid, timeout, timeout_signal = :KILL)
225
+ (timeout * 200).times do
226
+ status = Process.wait(pid, Process::WNOHANG)
227
+ return status if status
228
+ sleep 0.005 # 200 * 5ms => 1s
229
+ end
230
+
231
+ # The process didn't exit in the allotted time, so we kill it.
232
+ Process.kill(timeout_signal, pid)
233
+ Process.wait(pid)
234
+ end
235
+
219
236
  def time_now(int = false)
220
237
  Process.clock_gettime(Process::CLOCK_MONOTONIC, int ? :second : :float_second)
221
238
  end
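The polling wait added in this hunk can be tried in isolation. The sketch below is a variant written for illustration, not the method from the diff: `wait_or_kill` is a hypothetical name, it uses a monotonic deadline instead of a fixed iteration count, and it calls `Process.wait2` so the exit status is available to the caller.

```ruby
# Hypothetical helper modeled on the polling loop in this hunk.
def wait_or_kill(pid, timeout_seconds, signal = :KILL, poll_interval: 0.005)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_seconds

  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    # WNOHANG makes wait2 return nil instead of blocking while the child runs.
    reaped = Process.wait2(pid, Process::WNOHANG)
    return [:exited, reaped.last] if reaped

    sleep poll_interval
  end

  # The child did not exit in the allotted time, so signal it and reap it.
  Process.kill(signal, pid)
  [:killed, Process.wait2(pid).last]
end

quick = wait_or_kill(fork { exit!(0) }, 1)
stuck = wait_or_kill(fork { sleep }, 0.2)

puts "quick child: #{quick.first}"
puts "stuck child: #{stuck.first}"
```

The diff passes `:BUS` rather than `:KILL` for the middle process so that a stuck child produces a crash report; the sketch keeps `:KILL` as its default only to make the outcome easy to check.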
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pitchfork
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.8.0
4
+ version: 0.10.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Jean Boussier
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-09-08 00:00:00.000000000 Z
11
+ date: 2023-11-02 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: raindrops
@@ -72,6 +72,7 @@ files:
72
72
  - docs/REFORKING.md
73
73
  - docs/SIGNALS.md
74
74
  - docs/TUNING.md
75
+ - docs/WHY_MIGRATE.md
75
76
  - examples/constant_caches.ru
76
77
  - examples/echo.ru
77
78
  - examples/hello.ru