pitchfork 0.1.0

Files changed (63)
  1. checksums.yaml +7 -0
  2. data/.git-blame-ignore-revs +3 -0
  3. data/.gitattributes +5 -0
  4. data/.github/workflows/ci.yml +30 -0
  5. data/.gitignore +23 -0
  6. data/COPYING +674 -0
  7. data/Dockerfile +4 -0
  8. data/Gemfile +9 -0
  9. data/Gemfile.lock +30 -0
  10. data/LICENSE +67 -0
  11. data/README.md +123 -0
  12. data/Rakefile +72 -0
  13. data/docs/Application_Timeouts.md +74 -0
  14. data/docs/CONFIGURATION.md +388 -0
  15. data/docs/DESIGN.md +86 -0
  16. data/docs/FORK_SAFETY.md +80 -0
  17. data/docs/PHILOSOPHY.md +90 -0
  18. data/docs/REFORKING.md +113 -0
  19. data/docs/SIGNALS.md +38 -0
  20. data/docs/TUNING.md +106 -0
  21. data/examples/constant_caches.ru +43 -0
  22. data/examples/echo.ru +25 -0
  23. data/examples/hello.ru +5 -0
  24. data/examples/nginx.conf +156 -0
  25. data/examples/pitchfork.conf.minimal.rb +5 -0
  26. data/examples/pitchfork.conf.rb +77 -0
  27. data/examples/unicorn.socket +11 -0
  28. data/exe/pitchfork +116 -0
  29. data/ext/pitchfork_http/CFLAGS +13 -0
  30. data/ext/pitchfork_http/c_util.h +116 -0
  31. data/ext/pitchfork_http/child_subreaper.h +25 -0
  32. data/ext/pitchfork_http/common_field_optimization.h +130 -0
  33. data/ext/pitchfork_http/epollexclusive.h +124 -0
  34. data/ext/pitchfork_http/ext_help.h +38 -0
  35. data/ext/pitchfork_http/extconf.rb +14 -0
  36. data/ext/pitchfork_http/global_variables.h +97 -0
  37. data/ext/pitchfork_http/httpdate.c +79 -0
  38. data/ext/pitchfork_http/pitchfork_http.c +4318 -0
  39. data/ext/pitchfork_http/pitchfork_http.rl +1024 -0
  40. data/ext/pitchfork_http/pitchfork_http_common.rl +76 -0
  41. data/lib/pitchfork/app/old_rails/static.rb +59 -0
  42. data/lib/pitchfork/children.rb +124 -0
  43. data/lib/pitchfork/configurator.rb +314 -0
  44. data/lib/pitchfork/const.rb +23 -0
  45. data/lib/pitchfork/http_parser.rb +206 -0
  46. data/lib/pitchfork/http_response.rb +63 -0
  47. data/lib/pitchfork/http_server.rb +822 -0
  48. data/lib/pitchfork/launcher.rb +9 -0
  49. data/lib/pitchfork/mem_info.rb +36 -0
  50. data/lib/pitchfork/message.rb +130 -0
  51. data/lib/pitchfork/mold_selector.rb +29 -0
  52. data/lib/pitchfork/preread_input.rb +33 -0
  53. data/lib/pitchfork/refork_condition.rb +21 -0
  54. data/lib/pitchfork/select_waiter.rb +9 -0
  55. data/lib/pitchfork/socket_helper.rb +199 -0
  56. data/lib/pitchfork/stream_input.rb +152 -0
  57. data/lib/pitchfork/tee_input.rb +133 -0
  58. data/lib/pitchfork/tmpio.rb +35 -0
  59. data/lib/pitchfork/version.rb +8 -0
  60. data/lib/pitchfork/worker.rb +244 -0
  61. data/lib/pitchfork.rb +158 -0
  62. data/pitchfork.gemspec +30 -0
  63. metadata +137 -0
data/docs/CONFIGURATION.md ADDED
@@ -0,0 +1,388 @@
1
+ # Configuration
2
+
3
+ Most of Pitchfork's configuration is directly inherited from Unicorn; however, several
4
+ options have been removed, and a few added.
5
+
6
+ ## Basic configurations
7
+
8
+ ### `worker_processes`
9
+
10
+ ```ruby
11
+ worker_processes 16
12
+ ```
13
+
14
+ Sets the number of desired worker processes.
15
+ Each worker process will serve exactly one client at a time.
16
+
17
+ ### `listen`
18
+
19
+ By default, pitchfork listens on port 8080.
20
+
21
+ ```ruby
22
+ listen 2007
23
+ listen "/path/to/.pitchfork.sock", backlog: 64
24
+ listen 8080, tcp_nopush: true
25
+ ```
26
+
27
+ Adds an address to the existing listener set. May be specified more
28
+ than once. The address may be an Integer port number for a TCP port, an
29
+ `IP_ADDRESS:PORT` for TCP listeners or a pathname for UNIX domain sockets.
30
+
31
+ ```ruby
32
+ listen 3000 # listen to port 3000 on all TCP interfaces
33
+ listen "127.0.0.1:3000" # listen to port 3000 on the loopback interface
34
+ listen "/path/to/.pitchfork.sock" # listen on the given Unix domain socket
35
+ listen "[::1]:3000" # listen to port 3000 on the IPv6 loopback interface
36
+ ```
37
+
38
+ When using Unix domain sockets, be sure:
39
+ 1) the path matches the one used by nginx
40
+ 2) pitchfork uses the same filesystem namespace as the nginx process
41
+ For systemd users using PrivateTmp=true (for either nginx or pitchfork),
42
+ this means Unix domain sockets must not be placed in /tmp
43
+
44
+ The following options may be specified (but are generally not needed):
45
+
46
+ - `backlog: number of clients`
47
+
48
+ This is the backlog of the listen() syscall.
49
+
50
+ Some operating systems allow negative values here to specify the
51
+ maximum allowable value. In most cases, this number is only
52
+ a recommendation and there are other OS-specific tunables and
53
+ variables that can affect this number. See the listen(2)
54
+ syscall documentation of your OS for the exact semantics of
55
+ this.
56
+
57
+ If you are running pitchfork on multiple machines, lowering this number
58
+ can help your load balancer detect when a machine is overloaded
59
+ and give requests to a different machine.
60
+
61
+ Default: `1024`
62
+
63
+ Note: with the Linux kernel, the net.core.somaxconn sysctl defaults
64
+ to 128, capping this value to 128. Raising the sysctl allows a
65
+ larger backlog (which may not be desirable with multiple,
66
+ load-balanced machines).
67
+
68
+ - `rcvbuf: bytes, sndbuf: bytes`
69
+
70
+ Maximum receive and send buffer sizes (in bytes) of sockets.
71
+
72
+ These correspond to the SO_RCVBUF and SO_SNDBUF settings which
73
+ can be set via the setsockopt(2) syscall. Some kernels
74
+ (e.g. Linux 2.4+) have intelligent auto-tuning mechanisms and
75
+ there is no need (and it is sometimes detrimental) to specify them.
76
+
77
+ See the socket API documentation of your operating system
78
+ to determine the exact semantics of these settings and
79
+ other operating system-specific knobs where they can be
80
+ specified.
81
+
82
+ Defaults: operating system defaults
83
+
84
+ - `tcp_nodelay: false`
85
+
86
+ Enables Nagle's algorithm on TCP sockets if `false`.
87
+
88
+ Default: `true` (Nagle's algorithm disabled)
89
+
90
+ Setting this to `false` can help in situations where the network between
91
+ pitchfork and the reverse proxy may be congested. In most cases it's not
92
+ necessary.
93
+
94
+ This has no effect on UNIX sockets.
95
+
96
+ - `tcp_nopush: true or false`
97
+
98
+ Enables/disables TCP_CORK in Linux or TCP_NOPUSH in FreeBSD
99
+
100
+ This prevents partial TCP frames from being sent out and reduces
101
+ wakeups in nginx if it is on a different machine.
102
+ Since pitchfork is only designed for applications that send the response body
103
+ quickly without keepalive, sockets will always be flushed on close to prevent delays.
104
+
105
+ This has no effect on UNIX sockets.
106
+
107
+ Default: `false` (disabled)
108
+
109
+ - `ipv6only: true or false`
110
+
111
+ This option makes IPv6-capable TCP listeners IPv6-only and unable
112
+ to receive IPv4 queries on dual-stack systems.
113
+ A separate IPv4-only listener is required if this is true.
114
+
115
+ Enabling this option for the IPv6-only listener and having a
116
+ separate IPv4 listener is recommended if you wish to support IPv6
117
+ on the same TCP port. Otherwise, the value of `env["REMOTE_ADDR"]`
118
+ will appear as an ugly IPv4-mapped-IPv6 address for IPv4 clients
119
+ (e.g. `::ffff:10.0.0.1` instead of just `10.0.0.1`).
120
+
121
+ Default: Operating-system dependent
122
+
123
+ - `reuseport: true or false`
124
+
125
+ This enables multiple, independently-started pitchfork instances to
126
+ bind to the same port (as long as all the processes enable this).
127
+
128
+ This option must be used when pitchfork first binds the listen socket.
129
+
130
+ Note: there is a chance of connections being dropped if
131
+ one of the pitchfork instances is stopped while using this.
132
+
133
+ This is supported on *BSD systems and Linux 3.9 or later.
134
+
135
+ ref: https://lwn.net/Articles/542629/
136
+
137
+ Default: `false` (unset)
138
+
139
+ - `umask: mode`
140
+
141
+ Sets the file mode creation mask for UNIX sockets.
142
+ If specified, this is usually in octal notation.
143
+
144
+ Typically UNIX domain sockets are created with more liberal
145
+ file permissions than the rest of the application.
146
+ By default, we create UNIX domain sockets to be readable and writable by
147
+ all local users to give them the same accessibility as locally-bound TCP listeners.
148
+
149
+ This has no effect on TCP listeners.
150
+
151
+ Default: `0000` (world-read/writable)
152
+
153
+ - `tcp_defer_accept: Integer`
154
+
155
+ Defer `accept()` until data is ready (Linux-only)
156
+
157
+ For Linux 2.6.32 and later, this is the number of retransmits to
158
+ defer an `accept()` for if no data arrives, but the client will
159
+ eventually be accepted after the specified number of retransmits
160
+ regardless of whether data is ready.
161
+
162
+ For Linux before 2.6.32, this is a boolean option, and
163
+ accepts are _always_ deferred indefinitely if no data arrives.
164
+
165
+ Specifying `true` is synonymous for the default value(s) below,
166
+ and `false` or `nil` is synonymous for a value of zero.
167
+
168
+ A value of `1` is a good optimization for local networks and trusted clients.
169
+ There is no good reason to ever disable this with a zero value with pitchfork.
170
+
171
+ Default: `1`
172
+
173
+ ### `timeout`
174
+
175
+ ```ruby
176
+ timeout 10
177
+ ```
178
+
179
+ Sets the timeout of worker processes to a number of seconds.
180
+ Workers handling the request/app.call/response cycle taking longer than
181
+ this time period will be forcibly killed (via `SIGKILL`).
182
+
183
+ This timeout mechanism shouldn't be routinely relied on, and should
183
+ instead be considered a last line of defense in case your application
184
+ is impacted by bugs causing unexpectedly slow response times, or fully stuck
186
+ processes.
187
+
188
+ Make sure to read the guide on [application timeouts](Application_Timeouts.md).
189
+
190
+ This configuration defaults to a (too) generous 20 seconds; it is
191
+ highly recommended to set a stricter one based on your application
192
+ profile.
193
+
194
+ This timeout is enforced by the master process itself and not subject
195
+ to the scheduling limitations of the worker process.
195
+ Due to the low-complexity, low-overhead implementation, timeouts of less
197
+ than 3.0 seconds can be considered inaccurate and unsafe.
198
+
199
+ When running Pitchfork behind nginx, it is recommended to set
200
+ "fail_timeout=0" in your nginx configuration like this
201
+ to have nginx always retry backends that may have had workers
202
+ SIGKILL-ed due to timeouts.
203
+
204
+ ```
205
+ upstream pitchfork_backend {
206
+ # for UNIX domain socket setups:
207
+ server unix:/path/to/.pitchfork.sock fail_timeout=0;
208
+
209
+ # for TCP setups
210
+ server 192.168.0.7:8080 fail_timeout=0;
211
+ server 192.168.0.8:8080 fail_timeout=0;
212
+ server 192.168.0.9:8080 fail_timeout=0;
213
+ }
214
+ ```
215
+
216
+ See https://nginx.org/en/docs/http/ngx_http_upstream_module.html
217
+ for more details on nginx upstream configuration.
218
+
219
+ ### `logger`
220
+
221
+ ```ruby
222
+ logger Logger.new("path/to/logs")
223
+ ```
224
+
225
+ Replaces the default logger with the provided one.
226
+ The passed logger must respond to the standard Ruby Logger interface.
227
+ The default Logger will log its output to STDERR.
228
+
229
+ ## Callbacks
230
+
231
+ pitchfork provides several callbacks around the lifecycle of workers.
231
+ It is often necessary to use these callbacks to close inherited connections after fork.
233
+
234
+ Note that when reforking is available, the `pitchfork` master process won't load your application
235
+ at all. As such for hooks executed in the master, you may need to explicitly load the parts of your
236
+ application that are used in hooks.
237
+
238
+ `pitchfork` also doesn't attempt to rescue hook errors. Raising from a worker hook will crash the worker,
239
+ and raising from a master hook will bring the whole cluster down.
240
+
241
+ ### `before_fork`
242
+
243
+ ```ruby
244
+ before_fork do |server, worker|
245
+ Database.disconnect!
246
+ end
247
+ ```
248
+
249
+ Called in the context of the parent or mold.
250
+ For most protocols connections can be closed after fork, but some
251
+ stateful protocols require closing connections before fork.
252
+
253
+ That is the case, for instance, for many SQL database protocols.
254
+
255
+ ### `after_fork`
256
+
257
+ ```ruby
258
+ after_fork do |server, worker|
259
+ NetworkClient.disconnect!
260
+ end
261
+ ```
262
+
263
+ Called in the worker after forking. Generally used to close inherited connections
264
+ or to restart background threads for libraries that don't do it automatically.
265
+
266
+
267
+ ### `after_worker_ready`
268
+
269
+ Called by a worker process after it has been fully loaded, directly before it
270
+ starts responding to requests:
271
+
272
+ ```ruby
273
+ after_worker_ready do |server, worker|
274
+ server.logger.info("worker #{worker.nr} ready")
275
+ end
276
+ ```
277
+
278
+ ### `after_worker_exit`
279
+
280
+ Called in the master process after a worker exits.
281
+
282
+ ```ruby
283
+ after_worker_exit do |server, worker, status|
284
+ # status is a Process::Status instance for the exited worker process
285
+ unless status.success?
286
+ server.logger.error("worker process failure: #{status.inspect}")
287
+ end
288
+ end
289
+ ```
290
+
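The `status` argument described above is a plain `Process::Status`. The sketch below is standalone Ruby that runs outside pitchfork, and `exit_status_of_child` is a hypothetical helper name. It shows what such an object exposes when a forked child exits with a non-zero code:

```ruby
# Hypothetical helper (not part of pitchfork): fork a child that exits with
# the given code, reap it, and return its Process::Status.
def exit_status_of_child(code)
  pid = fork { exit!(code) }        # the child exits immediately with `code`
  _, status = Process.wait2(pid)    # wait2 returns [pid, Process::Status]
  status
end

failed = exit_status_of_child(3)
puts "success?=#{failed.success?} exitstatus=#{failed.exitstatus}"
```

A hook would typically branch on `status.success?`, as in the example above, and log `status.inspect` for anything else.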
291
+ ## Reforking
292
+
293
+ ### `refork_after`
294
+
295
+ ```ruby
296
+ refork_after [50, 100, 1000]
297
+ ```
298
+
299
+ Sets the request-count thresholds for triggering an automatic refork.
300
+ The limit is per-worker, for instance with `refork_after [50]` a refork is triggered
301
+ once at least one worker has processed `50` requests.
302
+
303
+ Each element is a limit for the next generation. In the example above, a new generation
304
+ is triggered when a worker has processed 50 requests, then the second generation when
305
+ a worker from the new generation has processed an additional 100 requests, and finally a third after another
306
+ 1000 requests.
307
+
308
+ Generally speaking, Copy-on-Write efficiency tends to degrade quickly during the early requests,
309
+ and then more and more slowly.
310
+
311
+ As such you likely want to refork exponentially less and less over time.
312
+
313
+ By default automatic reforking isn't enabled.
314
+
315
+ Make sure to read the [fork safety guide](FORK_SAFETY.md) before enabling reforking.
316
+
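One way to read the `refork_after` thresholds above is as a running total of requests served along a lineage of workers. The sketch below is plain Ruby and does not use pitchfork's API. `refork_points` is a hypothetical helper, and the running-total interpretation is an assumption drawn from the description above:

```ruby
# Hypothetical helper (not part of pitchfork): turn per-generation limits
# into the running total of requests at which each refork would trigger.
def refork_points(limits)
  total = 0
  limits.map { |limit| total += limit }
end

refork_points([50, 100, 1000]).each_with_index do |count, index|
  puts "generation #{index + 1} triggered after #{count} requests in total"
end
```

Under that reading, the gaps between reforks grow with each generation, which matches the advice to refork less and less often over time.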
317
+ ### `mold_selector`
318
+
319
+ Sets the mold selector implementation.
320
+
321
+ ```ruby
322
+ mold_selector do |server|
323
+ candidate = server.children.workers.sample # return a random worker
323
+ server.logger.info("worker=#{candidate.nr} pid=#{candidate.pid} selected as new mold")
325
+ candidate
326
+ end
327
+ ```
328
+
329
+ The block has access to `server.children`, a `Pitchfork::Children` instance.
330
+ This object can be used to introspect the state of the cluster and select the most
331
+ appropriate worker to be used as the new mold from which workers will be reforked.
332
+
333
+ The default implementation selects the worker with the least
334
+ amount of shared memory. This heuristic aims to select the most
335
+ warmed up worker.
336
+
337
+ This should be considered a very advanced API and it is discouraged
338
+ to use it unless you are confident you have a clear understanding
339
+ of pitchfork's architecture.
340
+
341
+ ## Rack Features
342
+
343
+ ### `default_middleware`
344
+
345
+ Sets whether to add Pitchfork's default middlewares. Defaults to `true`.
346
+
347
+ ### `early_hints`
348
+
349
+ Sets whether to enable the proposed early hints Rack API. Defaults to `false`.
350
+
351
+ If enabled, Rails 5.2+ will automatically send a 103 Early Hint for all the `javascript_include_tag` and `stylesheet_link_tag`
352
+ in your response. See: https://api.rubyonrails.org/v5.2/classes/ActionDispatch/Request.html#method-i-send_early_hints
353
+ See also https://tools.ietf.org/html/rfc8297
354
+
355
+ ## Advanced Tuning Configurations
356
+
357
+ Make sure to read the tuning guide before tweaking any of these.
358
+ Also note that most of these options are inherited from Unicorn, so
359
+ most guides on how to tune Unicorn likely apply here.
360
+
361
+ ### `rewindable_input`
362
+
363
+ Toggles making `env["rack.input"]` rewindable.
364
+ Disabling rewindability can improve performance by lowering I/O and memory usage for applications that accept uploads.
365
+ Keep in mind that the Rack 1.x spec requires `env["rack.input"]` to be rewindable, but the Rack 2.x spec does not.
366
+
367
+ `rewindable_input` defaults to `true` for compatibility.
368
+ Setting it to `false` may be safe for applications and frameworks developed for Rack 2.x and later.
369
+
370
+ ### `client_body_buffer_size`
371
+
372
+ The maximum size in bytes to buffer in memory before resorting to a temporary file.
373
+ Default is `112` kilobytes.
374
+ This option has no effect if `rewindable_input` is set to `false`.
375
+
376
+ ### `check_client_connection`
377
+
378
+ When enabled, pitchfork will check the client connection by writing
379
+ the beginning of the HTTP headers before calling the application.
380
+
381
+ This will prevent calling the application for clients who have
382
+ disconnected while their connection was queued.
383
+
384
+ This only affects clients connecting over Unix domain sockets
385
+ and TCP via loopback (`127.*.*.*`).
386
+ It is unlikely to detect disconnects if the client is on a remote host (even on a fast LAN).
387
+
388
+ This option cannot be used in conjunction with `tcp_nopush`.
data/docs/DESIGN.md ADDED
@@ -0,0 +1,86 @@
1
+ ## Design
2
+
3
+ * Simplicity: Pitchfork is a traditional UNIX prefork web server.
4
+ No threads are used at all, this makes applications easier to debug
5
+ and fix.
6
+
7
+ * Resiliency: If something goes catastrophically wrong and your application
8
+ is deadlocked or somehow stuck, once the request timeout is reached the master
9
+ process will take care of sending `kill -9` to the affected worker and
10
+ spawn a new one to replace it.
11
+
12
+ * Leverage Copy-on-Write: The only real disadvantage of prefork servers is
13
+ their increased memory usage. But thanks to reforking, `pitchfork` is able
14
+ to drastically improve Copy-on-Write performance, hence reduce memory usage
15
+ enough that it's no longer a concern.
16
+
17
+ * The Ragel+C HTTP parser is taken from Mongrel.
18
+
19
+ * All HTTP parsing and I/O is done much like Mongrel:
20
+ 1. read/parse HTTP request headers in full
21
+ 2. call Rack application
22
+ 3. write HTTP response back to the client
23
+
24
+ * Like Mongrel, neither keepalive nor pipelining are supported.
25
+ These aren't needed since Pitchfork is only designed to serve
26
+ fast, low-latency clients directly. Do one thing, do it well;
27
+ let nginx handle slow clients.
28
+
29
+ * Configuration is purely in Ruby. Ruby is less
30
+ ambiguous than YAML and lets lambdas for
31
+ before_fork/after_fork hooks be defined inline. An
32
+ optional, separate config_file may be used to modify supported
33
+ configuration changes.
34
+
35
+ * One master process spawns and reaps worker processes.
36
+
37
+ * The number of worker processes should be scaled to the number of
38
+ CPUs or memory you have. If you have an existing
39
+ Unicorn cluster on a single-threaded app, using the same amount of
40
+ processes should work. Let a full-HTTP-request-buffering reverse
41
+ proxy like nginx manage concurrency to thousands of slow clients for
42
+ you. Pitchfork scaling should only be concerned about limits of your
43
+ backend system(s).
44
+
45
+ * Load balancing between worker processes is done by the OS kernel.
46
+ All workers share a common set of listener sockets and do
47
+ non-blocking accept() on them. The kernel will decide which worker
48
+ process to give a socket to and workers will sleep if there is
49
+ nothing to accept().
50
+
51
+ * Since non-blocking accept() is used, there can be a thundering
52
+ herd when an occasional client connects when the application
53
+ *is not busy*. The thundering herd problem should not affect
54
+ applications that are running all the time since worker processes
55
+ will only select()/accept() outside of the application dispatch.
56
+
57
+ * Additionally, thundering herds are much smaller than with
58
+ configurations using existing prefork servers. Process counts should
59
+ only be scaled to backend resources, _never_ to the number of expected
60
+ clients like is typical with blocking prefork servers. So while we've
61
+ seen instances of popular prefork servers configured to run many
62
+ hundreds of worker processes, Pitchfork deployments are typically between
63
+ 1 and 2 processes per-core.
64
+
65
+ * Blocking I/O is used for clients. This allows a simpler code path
66
+ to be followed within the Ruby interpreter and fewer syscalls.
67
+
68
+ * `SIGKILL` is used to terminate the timed-out workers from misbehaving apps
69
+ as reliably as possible on a UNIX system. The default timeout is a
70
+ generous 20 seconds.
71
+
72
+ * The poor performance of select() on large FD sets is avoided
73
+ as few file descriptors are used in each worker.
74
+ There should be no gain from moving to highly scalable but
75
+ unportable event notification solutions for watching few
76
+ file descriptors.
77
+
78
+ * If the master process dies unexpectedly for any reason,
79
+ workers will notice within :timeout/2 seconds and follow
80
+ the master to its death.
81
+
82
+ * There is never any explicit real-time dependency or communication
83
+ between the worker processes nor to the master process.
84
+ Synchronization is handled entirely by the OS kernel and shared
85
+ resources are never accessed by the worker when it is servicing
86
+ a client.
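The kernel-level load balancing described in the list above can be illustrated with a few lines of standalone Ruby. This is only a sketch of the general prefork pattern, not pitchfork's implementation. `serve_with_prefork` is a hypothetical name, and it uses a blocking `accept` for brevity where the design notes describe a non-blocking one:

```ruby
require "socket"

# Hypothetical sketch: fork workers that all inherit the same listener
# socket, then send a few requests and collect the replies.
def serve_with_prefork(worker_count:, requests:)
  listener = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick a free port
  port = listener.addr[1]

  pids = worker_count.times.map do
    fork do
      loop do
        client = listener.accept             # the kernel decides which worker gets it
        client.write("handled by #{Process.pid}\n")
        client.close
      end
    end
  end

  # Each request is a fresh connection: no keepalive, one reply per connection.
  requests.times.map do
    TCPSocket.open("127.0.0.1", port) { |socket| socket.gets.chomp }
  end
ensure
  pids&.each { |pid| Process.kill("TERM", pid) }
  pids&.each { |pid| Process.wait(pid) }
  listener&.close
end

puts serve_with_prefork(worker_count: 2, requests: 4)
```

Which worker answers each request is up to the kernel, so the PIDs in the output may vary from run to run.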
data/docs/FORK_SAFETY.md ADDED
@@ -0,0 +1,80 @@
1
+ # Fork Safety
2
+
3
+ Because `pitchfork` is a preforking server, your application code and libraries
4
+ must be fork safe.
5
+
6
+ Generally, code might be fork-unsafe for one of two reasons:
7
+
8
+ ## Inherited Connection
9
+
10
+ When a process is forked, any open file descriptors (sockets, files, pipes, etc.)
11
+ end up shared between the parent and child process. This is never what you
12
+ want, so any code keeping persistent connections should close them either
13
+ before or after the fork happens.
14
+
15
+ `pitchfork` provides two callbacks in its configuration file to do so:
16
+
17
+ ```ruby
18
+ # pitchfork.conf.rb
19
+
20
+ before_fork do
21
+ Sequel::DATABASES.each(&:disconnect)
22
+ end
23
+
24
+ after_fork do
25
+ SomeLibrary.connection.close
26
+ end
27
+ ```
28
+
29
+ The documentation of any database client or network library you use should be
30
+ read with care to figure out how to disconnect it, and whether it is best to
31
+ do it before or after fork.
32
+
33
+ Since the most common Ruby application servers `Puma`, `Unicorn` and `Passenger`
34
+ have forking at least as an option, the requirements are generally well documented.
35
+
36
+ However what is novel with `Pitchfork`, is that processes can be forked more than once.
37
+ So just because an application works fine with existing pre-fork servers doesn't necessarily
38
+ mean it will work fine with `Pitchfork`.
39
+
40
+ It's not uncommon for applications to not close connections after fork, but for it to go
41
+ unnoticed because these connections are lazily created when the first request is handled.
42
+
43
+ So if you enable reforking for the first time, you may discover some issues.
44
+
45
+ Also note that rather than exposing a callback, some libraries take it upon themselves to detect
46
+ that a fork happened, and automatically close inherited connections.
47
+
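A common way for a library to detect a fork on its own is to remember the PID that opened the connection and compare it with the current PID on each use. The sketch below assumes that technique. `ForkAwareConnection` is a hypothetical class, not something provided by pitchfork or any particular gem, and it simply discards the inherited handle without deciding how to release it:

```ruby
# Hypothetical wrapper, for illustration only: reconnects whenever it is
# used from a process other than the one that opened the connection.
class ForkAwareConnection
  def initialize(&connect)
    @connect = connect     # block that opens a fresh connection
    @owner_pid = nil
    @connection = nil
  end

  def connection
    if @owner_pid != Process.pid
      # First use, or we are in a forked child: do not reuse the inherited handle.
      @connection = @connect.call
      @owner_pid = Process.pid
    end
    @connection
  end
end

client = ForkAwareConnection.new { Object.new }  # Object.new stands in for a real socket
first = client.connection
puts client.connection.equal?(first)             # same process: the connection is reused
```

With a check like this, a child forked from a warmed-up process opens its own connection on first use instead of sharing the parent's socket.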
48
+ ## Background Threads
49
+
50
+ When a process is forked, only the thread that called `fork` (usually the main thread) stays alive in the child process.
51
+ So any library that spawns a background thread for periodic work may need to be notified
52
+ that a fork happened and that it should restart its thread.
53
+
54
+ Just like with connections, some libraries take it upon themselves to automatically restart their background
55
+ thread when they detect a fork happened.
56
+
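The behaviour described above is easy to observe in plain Ruby. In this minimal sketch, `background_thread_alive_in_child?` is a hypothetical helper that starts a thread in the parent, forks, and reports whether the child still sees that thread as alive:

```ruby
# Hypothetical helper: returns true if a thread started in the parent is
# still alive inside a forked child process.
def background_thread_alive_in_child?
  worker = Thread.new { sleep }   # stand-in for a periodic background thread
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    writer.write(worker.alive? ? "1" : "0")  # evaluated in the child
    writer.close
    exit!(0)
  end

  writer.close
  answer = reader.read            # the child's verdict, sent over the pipe
  Process.wait(pid)
  worker.kill                     # clean up the parent's thread
  answer == "1"
end

puts "background thread alive in child: #{background_thread_alive_in_child?}"
```

A library that depends on such a thread has to start a new one in the child, either from an `after_fork` callback or by detecting the fork itself.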
57
+ # Refork Safety
58
+
59
+ Some code might happen to work without issue in other forking servers such as Unicorn or Puma,
60
+ but not work in Pitchfork when reforking is enabled.
61
+
62
+ This is because it is not uncommon for network connections or background threads to only be
63
+ initialized upon the first request. As such they're not inherited on the first fork.
64
+
65
+ However, when reforking is enabled, new processes are forked out of a warmed-up process, so
66
+ any lazily created connection is much more likely to have been created.
67
+
68
+ As such, if you enable reforking for the first time, it is heavily recommended to first do it
69
+ in some sort of staging environment, or on a small subset of production servers, so as to limit the
70
+ impact of discovering such a bug.
71
+
72
+ ## Known Incompatible Gems
73
+
74
+ - [The `grpc` gem isn't fork safe](https://github.com/grpc/grpc/issues/8798) and doesn't provide any before or after fork callback to re-establish connections.
75
+ It can only be used in a forking environment if the client is never used in the parent before fork.
76
+ If your application uses `grpc`, you shouldn't enable reforking.
77
+ But frankly, that gem is such a tire fire, you shouldn't use it regardless.
78
+ If you really have to consume a gRPC API, you can consider `grpc_kit` as a replacement.
79
+
80
+ No other gem is known to be incompatible, but if you find one please open an issue to add it to the list.
data/docs/PHILOSOPHY.md ADDED
@@ -0,0 +1,90 @@
1
+ # The Philosophy Behind pitchfork
2
+
3
+ ## Avoid Complexity
4
+
5
+ Instead of attempting to be efficient at serving slow clients, pitchfork
6
+ relies on a buffering reverse proxy to efficiently deal with slow
7
+ clients.
8
+
9
+ ## Threads and Events Are Not Well Suited For Transactional Web Applications
10
+
11
+ `pitchfork` uses a preforking worker model with blocking I/O.
12
+ Our processing model is the antithesis of processing models using threads or
13
+ non-blocking I/O with events or fibers.
14
+
15
+ It is only meant to serve fast, transactional HTTP/1.1 applications.
16
+ These applications rarely if ever spend more than half their time on I/O, and
17
+ any remote API call made by the application should either have a strict SLA
18
+ and timeout, or be deferred to a background job processing system.
19
+
20
+ As such, when they are run in a threaded or fiber processing model, they suffer
21
+ from GVL contention and GC pauses, which hurts latency.
22
+
23
+ `pitchfork` is not suited for all applications. `pitchfork` is optimized for
24
+ applications that are CPU/memory/disk intensive and spend little time
25
+ waiting on external resources (e.g. a database server or external API).
26
+
27
+ WebSocket, Server-Sent Events and applications that mainly act as a light proxy
28
+ to another service have radically different performance profiles and requirements,
29
+ and shouldn't be handled by the same process. It's preferable to host these with
30
+ a threaded or evented processing model (`falcon`, `puma`, etc).
31
+
32
+ No processing model can efficiently handle both types of workload. Use
33
+ the right tool with the right configuration for the right job.
34
+
35
+ ## Improved Performance Through Reverse Proxying
36
+
37
+ By acting as a buffer to shield `pitchfork` from slow I/O, a reverse proxy
38
+ will inevitably incur overhead in the form of extra data copies.
39
+ However, as I/O within a local network is fast (and faster still
40
+ with local sockets), this overhead is negligible for the vast majority
41
+ of HTTP requests and responses.
42
+
43
+ The ideal reverse proxy complements the weaknesses of `pitchfork`.
44
+ A reverse proxy for `pitchfork` should meet the following requirements:
45
+
46
+ 1. It should fully buffer all HTTP requests (and large responses).
47
+ Each request should be "corked" in the reverse proxy and sent
48
+ as fast as possible to the backend `pitchfork` processes. This is
49
+ the most important feature to look for when choosing a
50
+ reverse proxy for `pitchfork`.
51
+
52
+ 2. It should handle SSL/TLS termination. Requests should arrive
53
+ decrypted to `pitchfork`. A reverse proxy can do this much more
54
+ efficiently. If you don't trust your local network enough to
55
+ make unencrypted traffic go through it, you can have a reverse
56
+ proxy on the same server as `pitchfork` to handle decryption.
57
+
58
+ 3. It should handle HTTP/2 or HTTP/3 termination. Newer HTTP protocols
59
+ do not provide any feature or improvements that are useful or even desirable
60
+ for transactional HTTP applications. Your reverse proxy or load balancer
61
+ should handle the HTTP/2 or HTTP/3 protocol with the client, but forward
62
+ requests to `pitchfork` as HTTP/1.1.
63
+
64
+ 4. It should efficiently manage persistent connections (and
65
+ pipelining) to slow clients.
66
+
67
+ 5. It should not be "sticky". Even if the client has a persistent
68
+ connection, every request made as part of that persistent connection
69
+ should be load balanced individually.
70
+
71
+ 6. It should (optionally) serve static files. If you have static
72
+ files on your site (especially large ones), they are far more
73
+ efficiently served with as few data copies as possible (e.g. with
74
+ sendfile() to completely avoid copying the data to userspace).
75
+
76
+ Suitable options include `nginx`, `caddy` and likely several others.
77
+
78
+ ## Leverage Copy-on-Write to Reduce Memory Usage
79
+
80
+ One of the main advantages of threaded servers over preforking servers is their
81
+ lower memory usage.
82
+
83
+ However `pitchfork` solves this with its reforking feature. If enabled and properly configured
84
+ it very significantly increases Copy-on-Write performance, closing the gap with threaded servers.
85
+
86
+ ## Assume Modern Deployment Methods
87
+
88
+ Pitchfork assumes it is deployed using modern tools such as either containers or
89
+ advanced init systems such as systemd. As such it doesn't provide classic daemon
90
+ functionality like pidfile management, log redirection and reopening, config reloading, etc.