passenger 4.0.44 → 4.0.45

Potentially problematic release: this version of passenger might be problematic.

Files changed (110)
  1. checksums.yaml +8 -8
  2. checksums.yaml.gz.asc +7 -7
  3. data.tar.gz.asc +7 -7
  4. data/.travis.yml +3 -0
  5. data/CHANGELOG +31 -0
  6. data/CONTRIBUTING.md +70 -10
  7. data/CONTRIBUTORS +4 -0
  8. data/README.md +1 -1
  9. data/Vagrantfile +50 -0
  10. data/bin/passenger-install-nginx-module +7 -2
  11. data/build/basics.rb +4 -1
  12. data/build/documentation.rb +6 -0
  13. data/build/node_tests.rb +7 -1
  14. data/build/packaging.rb +5 -0
  15. data/build/test_basics.rb +3 -3
  16. data/debian.template/copyright +1 -1
  17. data/debian.template/passenger.manpages +0 -1
  18. data/dev/rack.test/config.ru +5 -0
  19. data/dev/rack.test/public/asset.txt +1 -0
  20. data/dev/vagrant/apache_default_site.conf +35 -0
  21. data/dev/vagrant/apache_passenger.conf +5 -0
  22. data/dev/vagrant/apache_passenger.load +1 -0
  23. data/dev/vagrant/apache_ports.conf +24 -0
  24. data/dev/vagrant/apache_rack_test.conf +9 -0
  25. data/dev/vagrant/bashrc +21 -0
  26. data/dev/vagrant/nginx.conf +39 -0
  27. data/dev/vagrant/nginx_rakefile +34 -0
  28. data/dev/vagrant/nginx_start +32 -0
  29. data/dev/vagrant/provision.sh +115 -0
  30. data/dev/vagrant/sudoers.conf +5 -0
  31. data/doc/Design and Architecture.txt +515 -0
  32. data/doc/DeveloperQuickstart.md +70 -0
  33. data/doc/Users guide Apache.idmap.txt +24 -18
  34. data/doc/Users guide Apache.txt +200 -62
  35. data/doc/Users guide Nginx.idmap.txt +53 -45
  36. data/doc/Users guide Nginx.txt +501 -360
  37. data/doc/Users guide Standalone.txt +8 -0
  38. data/doc/images/direct_spawning.png +0 -0
  39. data/doc/images/direct_spawning.svg +16 -13
  40. data/doc/images/helper_agent_core_architecture.png +0 -0
  41. data/doc/images/passenger_architecture_overview.png +0 -0
  42. data/doc/images/smart_spawning.png +0 -0
  43. data/doc/images/{smart.svg → smart_spawning.svg} +23 -20
  44. data/doc/images/spawning_preparation_work.png +0 -0
  45. data/doc/images/startup_sequence.png +0 -0
  46. data/doc/users_guide_snippets/appendix_c_spawning_methods.txt +82 -121
  47. data/doc/users_guide_snippets/environment_variables.txt +1 -1
  48. data/doc/users_guide_snippets/support_information.txt +2 -0
  49. data/doc/users_guide_snippets/tips.txt +117 -9
  50. data/ext/apache2/Configuration.hpp +4 -2
  51. data/ext/apache2/ConfigurationCommands.cpp +14 -0
  52. data/ext/apache2/ConfigurationFields.hpp +4 -0
  53. data/ext/apache2/ConfigurationSetters.cpp +22 -0
  54. data/ext/apache2/CreateDirConfig.cpp +2 -0
  55. data/ext/apache2/Hooks.cpp +30 -14
  56. data/ext/apache2/MergeDirConfig.cpp +14 -0
  57. data/ext/apache2/SetHeaders.cpp +8 -0
  58. data/ext/common/ApplicationPool2/AppTypes.cpp +6 -1
  59. data/ext/common/ApplicationPool2/Implementation.cpp +1 -1
  60. data/ext/common/ApplicationPool2/Session.h +1 -1
  61. data/ext/common/Constants.h +9 -7
  62. data/ext/common/Utils/HttpHeaderBufferer.h +23 -4
  63. data/ext/common/Utils/StrIntUtils.h +35 -0
  64. data/ext/common/Utils/StringScanning.h +4 -10
  65. data/ext/common/agents/HelperAgent/RequestHandler.h +90 -49
  66. data/ext/nginx/CacheLocationConfig.c +40 -0
  67. data/ext/nginx/ConfigurationCommands.c +20 -0
  68. data/ext/nginx/ConfigurationFields.h +4 -0
  69. data/ext/nginx/ContentHandler.c +1 -1
  70. data/ext/nginx/CreateLocationConfig.c +9 -0
  71. data/ext/nginx/MergeLocationConfig.c +12 -0
  72. data/ext/nginx/config +2 -2
  73. data/ext/nginx/ngx_http_passenger_module.c +4 -4
  74. data/helper-scripts/node-loader.js +40 -27
  75. data/lib/phusion_passenger.rb +1 -1
  76. data/lib/phusion_passenger/apache2/config_options.rb +14 -2
  77. data/lib/phusion_passenger/constants.rb +7 -6
  78. data/lib/phusion_passenger/loader_shared_helpers.rb +11 -1
  79. data/lib/phusion_passenger/nginx/config_options.rb +8 -0
  80. data/lib/phusion_passenger/packaging.rb +8 -3
  81. data/lib/phusion_passenger/platform_info/apache.rb +3 -0
  82. data/lib/phusion_passenger/platform_info/ruby.rb +4 -1
  83. data/lib/phusion_passenger/standalone/command.rb +0 -1
  84. data/lib/phusion_passenger/standalone/package_runtime_command.rb +1 -0
  85. data/lib/phusion_passenger/standalone/start_command.rb +80 -62
  86. data/lib/phusion_passenger/standalone/status_command.rb +1 -0
  87. data/lib/phusion_passenger/standalone/stop_command.rb +1 -0
  88. data/man/passenger-config.1 +1 -1
  89. data/man/passenger-memory-stats.8 +1 -1
  90. data/man/passenger-status.8 +1 -1
  91. data/npm-shrinkwrap.json +229 -0
  92. data/package.json +28 -0
  93. data/resources/templates/standalone/config.erb +2 -0
  94. data/rpm/Vagrantfile +0 -3
  95. data/test/config.json.vagrant +30 -0
  96. data/test/cxx/HttpHeaderBuffererTest.cpp +64 -10
  97. data/test/cxx/RequestHandlerTest.cpp +35 -13
  98. data/test/integration_tests/apache2_tests.rb +1 -0
  99. data/test/stub/node/app.js +26 -18
  100. metadata +28 -13
  101. metadata.gz.asc +7 -7
  102. data/doc/Architectural overview.idmap.txt +0 -36
  103. data/doc/Architectural overview.txt +0 -410
  104. data/doc/images/smart.png +0 -0
  105. data/ext/common/ApplicationPool2/README.md +0 -56
  106. data/man/passenger-stress-test.1 +0 -43
  107. data/node_lib/phusion_passenger/httplib_emulation.js +0 -215
  108. data/node_lib/phusion_passenger/request_handler.js +0 -73
  109. data/node_lib/phusion_passenger/session_protocol_parser.js +0 -113
  110. data/test/node/httplib_emulation_spec.js +0 -623
@@ -6,7 +6,7 @@ It incorporates packaging work done by Neil Wilson <neil@brightbox.co.uk>
  Some further refined packaging work was done by David Moreno <david@axiombox.com> and
  Micah Anderson <micah@riseup.net>.
 
- It was downloaded from http://www.modrails.com/install.html
+ It was downloaded from https://www.phusionpassenger.com/
 
  Upstream Authors: Hongli Lai <hongli@plan99.net>
  Ninh Bui <ninh.bui@gmail.com>
@@ -1,4 +1,3 @@
  man/passenger-memory-stats.8
  man/passenger-status.8
  man/passenger-config.1
- man/passenger-stress-test.1
@@ -0,0 +1,5 @@
+ app = lambda do |env|
+   [200, { "Content-Type" => "text/plain" }, ["ok\n"]]
+ end
+
+ run app
@@ -0,0 +1 @@
+ This is a static asset.
@@ -0,0 +1,35 @@
1
+ # This file is overwritten by 'vagrant provision'. For the source,
2
+ # see dev/vagrant/apache_default_site.conf in the Phusion Passenger source
3
+ # tree.
4
+
5
+ <VirtualHost *:8000>
6
+ # The ServerName directive sets the request scheme, hostname and port that
7
+ # the server uses to identify itself. This is used when creating
8
+ # redirection URLs. In the context of virtual hosts, the ServerName
9
+ # specifies what hostname must appear in the request's Host: header to
10
+ # match this virtual host. For the default virtual host (this file) this
11
+ # value is not decisive as it is used as a last resort host regardless.
12
+ # However, you must set it for any further virtual host explicitly.
13
+ #ServerName www.example.com
14
+
15
+ ServerAdmin webmaster@localhost
16
+ DocumentRoot /var/www/html
17
+
18
+ # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
19
+ # error, crit, alert, emerg.
20
+ # It is also possible to configure the loglevel for particular
21
+ # modules, e.g.
22
+ #LogLevel info ssl:warn
23
+
24
+ ErrorLog ${APACHE_LOG_DIR}/error.log
25
+ CustomLog ${APACHE_LOG_DIR}/access.log combined
26
+
27
+ # For most configuration files from conf-available/, which are
28
+ # enabled or disabled at a global level, it is possible to
29
+ # include a line for only one particular virtual host. For example the
30
+ # following line enables the CGI configuration for this host only
31
+ # after it has been globally disabled with "a2disconf".
32
+ #Include conf-available/serve-cgi-bin.conf
33
+ </VirtualHost>
34
+
35
+ # vim: syntax=apache ts=4 sw=4 sts=4 sr noet
@@ -0,0 +1,5 @@
+ <IfModule mod_passenger.c>
+   PassengerRoot /vagrant
+   PassengerDefaultRuby /usr/bin/ruby
+   PassengerLogLevel 1
+ </IfModule>
@@ -0,0 +1 @@
+ LoadModule passenger_module /vagrant/buildout/apache2/mod_passenger.so
@@ -0,0 +1,24 @@
1
+ # This file is overwritten by 'vagrant provision'. For the source,
2
+ # see dev/vagrant/apache_ports.conf in the Phusion Passenger source
3
+ # tree.
4
+
5
+ # If you just change the port or add more ports here, you will likely also
6
+ # have to change the VirtualHost statement in
7
+ # /etc/apache2/sites-enabled/000-default.conf
8
+
9
+ Listen 8000
10
+ Listen 8001
11
+ Listen 8002
12
+ Listen 8003
13
+ Listen 8004
14
+ Listen 8005
15
+
16
+ <IfModule ssl_module>
17
+ Listen 8010
18
+ </IfModule>
19
+
20
+ <IfModule mod_gnutls.c>
21
+ Listen 8010
22
+ </IfModule>
23
+
24
+ # vim: syntax=apache ts=4 sw=4 sts=4 sr noet
@@ -0,0 +1,9 @@
+ <VirtualHost *:8001>
+   ServerName rack.test
+   DocumentRoot /vagrant/dev/rack.test/public
+   <Directory /vagrant/dev/rack.test/public>
+     Allow from all
+     Options -MultiViews
+     Require all granted
+   </Directory>
+ </VirtualHost>
@@ -0,0 +1,21 @@
1
+ # This file is overwritten by 'vagrant provision'. For the source,
2
+ # see dev/vagrant/bashrc in the Phusion Passenger source
3
+ # tree.
4
+
5
+ # Display git branch in bash prompt.
6
+ export PS1='\[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\u@\h:\w$(__git_ps1 " (%s)")]\$ '
7
+
8
+ # Add Phusion Passenger command line tools to PATH.
9
+ export PATH=/vagrant/bin:$PATH
10
+
11
+ # Tell Phusion Passenger's build system to use ccache.
12
+ export USE_CCACHE=1
13
+ export CCACHE_COMPRESS=1
14
+
15
+ # Tell Phusion Passenger Standalone to run in debug mode.
16
+ export PASSENGER_DEBUG=1
17
+
18
+ alias ls='ls -Fh --color'
19
+ alias dir='ls -l'
20
+ alias free='free -m'
21
+ alias df='df -h'
@@ -0,0 +1,39 @@
1
+ daemon off;
2
+ worker_processes 1;
3
+ #user nobody;
4
+ #error_log logs/error.log debug;
5
+ #pid logs/nginx.pid;
6
+
7
+
8
+ events {
9
+ worker_connections 1024;
10
+ }
11
+
12
+
13
+ http {
14
+ include mime.types;
15
+ default_type application/octet-stream;
16
+ sendfile on;
17
+ keepalive_timeout 65;
18
+ #gzip on;
19
+
20
+ passenger_root /vagrant;
21
+ passenger_ruby /usr/bin/ruby;
22
+ passenger_log_level 1;
23
+
24
+ server {
25
+ listen 8100;
26
+ server_name localhost;
27
+ location / {
28
+ root html;
29
+ index index.html index.htm;
30
+ }
31
+ }
32
+
33
+ server {
34
+ listen 8101;
35
+ server_name rack.test;
36
+ root /vagrant/dev/rack.test/public;
37
+ passenger_enabled on;
38
+ }
39
+ }
@@ -0,0 +1,34 @@
1
+ ENV['CC'] ||= 'ccache cc'
2
+ ENV['CXX'] ||= 'ccache c++'
3
+
4
+ desc "Bootstrap Nginx for the first time"
5
+ task :bootstrap => :configure do
6
+ sh "make -j2"
7
+ sh "make install"
8
+ sh "rm -f inst/sbin/nginx"
9
+ puts
10
+ puts "--------------------------"
11
+ puts "You're all set! You can start Nginx by running:"
12
+ puts
13
+ puts " ./start"
14
+ puts
15
+ puts "Nginx can be reached from the host machine on http://127.0.0.1:8100/"
16
+ puts
17
+ puts "You never have to run `rake bootstrap` again. You also never have to " +
18
+ "run `make install`. If you've made changes to the Passenger Nginx module, " +
19
+ "simply run `make && ./start` in /home/vagrant/nginx. The `start` script " +
20
+ "will start the newly compiled Nginx binary directly."
21
+ end
22
+
23
+ desc "Configure Nginx source tree"
24
+ task :configure do
25
+ sh "./configure --prefix=/home/vagrant/nginx/inst" +
26
+ " --add-module=/vagrant/ext/nginx" +
27
+ " --with-http_ssl_module" +
28
+ " --with-http_gzip_static_module" +
29
+ " --with-http_stub_status_module" +
30
+ " --with-http_spdy_module" +
31
+ " --with-ipv6" +
32
+ " --with-debug"
33
+ sh "sed", "-E", "-i", 's/ -O[0-9]? / -ggdb /g', "objs/Makefile"
34
+ end
@@ -0,0 +1,32 @@
1
+ #!/usr/bin/ruby
2
+ # This file is overwritten by 'vagrant provision'. For the source,
3
+ # see dev/vagrant/nginx_start in the Phusion Passenger source
4
+ # tree.
5
+
6
+ ENV['PASSENGER_BEEP_ON_ABORT'] = '1'
7
+
8
+ def run_in_bg(*command)
9
+ return fork do
10
+ Process.setsid
11
+ exec(*command)
12
+ end
13
+ end
14
+
15
+ File.open('inst/logs/error.log', 'a') do |f|
16
+ f.write("\n\n\n\n-------------- NGINX START #{Time.now} --------------\n\n\n\n")
17
+ end
18
+
19
+ tail_pid = run_in_bg("tail -n 0 -f inst/logs/error.log")
20
+ nginx_pid = run_in_bg("./objs/nginx", *ARGV)
21
+ begin
22
+ Process.waitpid(nginx_pid)
23
+ nginx_pid = nil
24
+ rescue Interrupt
25
+ ensure
26
+ Process.kill('INT', tail_pid)
27
+ Process.waitpid(tail_pid)
28
+ if nginx_pid
29
+ Process.kill('INT', nginx_pid)
30
+ Process.waitpid(nginx_pid)
31
+ end
32
+ end
@@ -0,0 +1,115 @@
1
+ #!/bin/bash
2
+ set -ex
3
+ set -o pipefail
4
+
5
+
6
+ ### Update /etc/hosts
7
+
8
+ if ! grep -q passenger.test /etc/hosts; then
9
+ cat >>/etc/hosts <<-EOF
10
+
11
+ 127.0.0.1 passenger.test
12
+ 127.0.0.1 mycook.passenger.test
13
+ 127.0.0.1 zsfa.passenger.test
14
+ 127.0.0.1 norails.passenger.test
15
+ 127.0.0.1 1.passenger.test 2.passenger.test 3.passenger.test
16
+ 127.0.0.1 4.passenger.test 5.passenger.test 6.passenger.test
17
+ 127.0.0.1 7.passenger.test 8.passenger.test 9.passenger.test
18
+ 127.0.0.1 rack.test foobar.test
19
+ EOF
20
+ fi
21
+
22
+
23
+ ### Update bashrc and bash profile
24
+
25
+ if ! grep -q bashrc.mine /etc/bash.bashrc; then
26
+ echo ". /etc/bash.bashrc.mine" >> /etc/bash.bashrc
27
+ fi
28
+ if ! grep -q bashrc.mine /home/vagrant/.bashrc; then
29
+ echo ". /etc/bash.bashrc.mine" >> /home/vagrant/.bashrc
30
+ fi
31
+ if ! grep -q /vagrant /home/vagrant/.profile; then
32
+ echo "if tty -s; then cd /vagrant; fi" >> /home/vagrant/.profile
33
+ fi
34
+ cp /vagrant/dev/vagrant/bashrc /etc/bash.bashrc.mine
35
+ cp /vagrant/dev/vagrant/sudoers.conf /etc/sudoers.d/passenger
36
+ chmod 440 /etc/sudoers.d/passenger
37
+
38
+
39
+ ### Install native dependencies
40
+
41
+ apt-get update
42
+ apt-get install -y build-essential git bash-completion ccache wget \
43
+ libxml2-dev libxslt1-dev libsqlite3-dev libcurl4-openssl-dev libpcre3-dev \
44
+ ruby ruby-dev nodejs npm \
45
+ apache2-mpm-worker apache2-threaded-dev
46
+
47
+
48
+ ### Install basic gems
49
+
50
+ if [[ ! -e /usr/local/bin/rake ]]; then
51
+ gem install rake --no-rdoc --no-ri
52
+ fi
53
+ if [[ ! -e /usr/local/bin/drake ]]; then
54
+ gem install drake --no-rdoc --no-ri
55
+ fi
56
+ if [[ ! -e /usr/local/bin/bundler ]]; then
57
+ gem install bundler --no-rdoc --no-ri
58
+ fi
59
+
60
+
61
+ ### Install Phusion Passenger development dependencies
62
+
63
+ pushd /vagrant
64
+ if [[ ! -e ~/.test_deps_installed ]]; then
65
+ rake test:install_deps SUDO=1
66
+ touch ~/.test_deps_installed
67
+ fi
68
+ popd
69
+
70
+
71
+ ### Install Nginx source code
72
+
73
+ pushd /home/vagrant
74
+ if [[ ! -e nginx ]]; then
75
+ sudo -u vagrant -H git clone -b nginx-1.6 https://github.com/nginx/nginx.git
76
+ fi
77
+ sudo -u vagrant -H mkdir -p nginx/inst/conf
78
+ sudo -u vagrant -H cp /vagrant/dev/vagrant/nginx_start nginx/start
79
+ if [[ ! -e nginx/Rakefile ]]; then
80
+ sudo -u vagrant -H cp /vagrant/dev/vagrant/nginx_rakefile nginx/Rakefile
81
+ fi
82
+ if [[ ! -e nginx/inst/conf/nginx.conf ]]; then
83
+ sudo -u vagrant -H cp /vagrant/dev/vagrant/nginx.conf nginx/inst/conf/
84
+ fi
85
+ if [[ ! -e nginx/nginx.conf && ! -h nginx/nginx.conf ]]; then
86
+ sudo -u vagrant -H ln -s inst/conf/nginx.conf nginx/nginx.conf
87
+ fi
88
+ if [[ ! -e nginx/access.log && ! -h nginx/access.log ]]; then
89
+ sudo -u vagrant -H ln -s inst/logs/access.log nginx/access.log
90
+ fi
91
+ if [[ ! -e nginx/error.log && ! -h nginx/error.log ]]; then
92
+ sudo -u vagrant -H ln -s inst/logs/error.log nginx/error.log
93
+ fi
94
+ popd
95
+
96
+
97
+ ### Set up Apache
98
+
99
+ should_restart_apache=false
100
+ cp /vagrant/dev/vagrant/apache_ports.conf /etc/apache2/ports.conf
101
+ cp /vagrant/dev/vagrant/apache_default_site.conf /etc/apache2/sites-available/000-default.conf
102
+ if [[ ! -e /etc/apache2/mods-available/passenger.conf ]]; then
103
+ cp /vagrant/dev/vagrant/apache_passenger.conf /etc/apache2/mods-available/passenger.conf
104
+ fi
105
+ if [[ ! -e /etc/apache2/mods-available/passenger.load ]]; then
106
+ cp /vagrant/dev/vagrant/apache_passenger.load /etc/apache2/mods-available/passenger.load
107
+ fi
108
+ if [[ ! -e /etc/apache2/sites-available/010-rack.test.conf ]]; then
109
+ cp /vagrant/dev/vagrant/apache_rack_test.conf /etc/apache2/sites-available/010-rack.test.conf
110
+ a2ensite 010-rack.test
111
+ should_restart_apache=true
112
+ fi
113
+ if $should_restart_apache; then
114
+ service apache2 restart
115
+ fi
@@ -0,0 +1,5 @@
+ # This file is overwritten by 'vagrant provision'. For the source,
+ # see dev/vagrant/sudoers.conf in the Phusion Passenger source
+ # tree.
+
+ Defaults secure_path="/vagrant/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
@@ -0,0 +1,515 @@
1
+ = Phusion Passenger Design and Architecture =
2
+
3
+ image:images/phusion_banner.png[link="http://www.phusion.nl/"]
4
+
5
+ This guide describes Phusion Passenger's design and architecture in detail. With this guide, we hope that contributors can quickly find their way around the Phusion Passenger codebase.
6
+
7
+ The guide assumes that you're familiar with using Phusion Passenger and with Nginx or Apache, and that you've read the link:https://github.com/phusion/passenger/blob/master/CONTRIBUTING.md[Contributors Guide] and the link:https://github.com/phusion/passenger/blob/master/doc/DeveloperQuickstart.md[Developer QuickStart].
8
+
9
+
10
+ == Introduction
11
+
12
+ [[web_app_models]]
13
+ === Web application models and the role of the application server
14
+
15
+ Before we describe Phusion Passenger, it is important to understand how typical web applications work from the viewpoint of someone who wants to connect a web application to a web server.
16
+
17
+ A typical, isolated, web application accepts an HTTP request from some I/O channel, processes it internally, and outputs an HTTP response, which is sent back to the client. This is done in a loop, until the application is commanded to exit. This does not necessarily mean that the web application speaks HTTP directly: it just means that the web application accepts some kind of representation of an HTTP request.
18
+
19
+ image:images/typical_isolated_web_application.png[Architecture of a typical web application in isolation]
20
+
21
+ Some web applications are directly accessible through the HTTP protocol, while others are not. It depends on the language and framework that the web application is built on. For example, Ruby (Rack/Rails) and Python (WSGI) web applications are typically not directly accessible through the HTTP protocol. On the other hand, Node.js web applications *do* tend to be accessible through the HTTP protocol. The reasons for this are historical, but they're outside the scope of this guide.
22
+
23
+ ==== Common models
24
+
25
+ Here are some common models that are in use:
26
+
27
+ 1. The web application is contained in an application server. This application server may or may not be able to contain multiple web applications. The administrator then connects the application server to a web server through some kind of protocol. This protocol may be HTTP, FastCGI, SCGI, AJP or whatever. The web server dispatches (forwards) requests to the application server, which in turn dispatches requests to the correct web application, in a format that the web application understands. Conversely, HTTP responses outputted by the web application are sent to the application server, which in turn sends them to the web server, and eventually to the HTTP client.
28
+ +
29
+ Typical examples of such a model:
30
+ +
31
+ * A J2EE application, contained in the Tomcat application server, reverse proxied behind the Apache web server. Tomcat can contain multiple web applications in a single Tomcat instance.
32
+ * Most Ruby application servers besides Phusion Passenger (Thin, Unicorn, Goliath, etc). These application servers can only contain a single Ruby web application per instance. They load the web application into their own process and are put behind a web server (Apache, Nginx) in a reverse proxy setup.
33
+ * Green Unicorn, the Python (WSGI) application server, behind a reverse proxy setup.
34
+ * PHP web applications spawned by the FastCGI Process Manager (FPM), behind an Nginx reverse proxy setup.
35
+
36
+ 2. The web application is contained directly in a web server. In this case, the web server acts like an application server. Typical examples include:
37
+ +
38
+ --
39
+ * PHP web applications running on Apache through mod_php.
40
+ * Python (WSGI) web applications running on Apache through mod_uwsgi or mod_python.
41
+ --
42
+ +
43
+ Note that this does not necessarily mean that the web application is run inside the same process as the web server: it just means that the web server manages applications. In case of mod_php, PHP runs directly inside the Apache worker processes, but in case of mod_uwsgi the Python processes can be configured to run out-of-process.
44
+ +
45
+ Phusion Passenger for Apache and Phusion Passenger for Nginx implement this model, and run applications outside the web server process.
46
+
47
+ 3. The web application *is* a web server, and can accept HTTP requests directly. Examples of this model:
48
+ +
49
+ --
50
+ * Almost all Node.js and Meteor JS web applications.
51
+ * The Trac bug tracking software, running in its standalone server.
52
+ --
53
+ +
54
+ In most setups, the administrator puts them in a reverse proxy configuration, behind a real web server such as Apache or Nginx, instead of letting them accept HTTP requests directly.
55
+ +
56
+ Phusion Passenger Standalone implements this model. However, you can expose Phusion Passenger Standalone directly to the Internet because it uses Nginx internally.
57
+
58
+ 4. The web application does not speak HTTP directly, but is connected directly to the web server through some communication adapter. CGI, FastCGI and SCGI are good examples of this.
59
+
60
+ The above models cover how nearly all web applications work, whether they're based on PHP, Django, J2EE, ASP.NET, Ruby on Rails, or whatever. Note that all of these models provide the same functionality, i.e. no model can do something that a different model can't. The critical reader will notice that all of these models are identical to the one described in the first diagram, if the combination of web servers, application servers, web applications etc. are considered to be a single entity; a black box if you will.
61
+
62
+ It should also be noted that these models do not enforce any particular I/O processing implementation. The web servers, application servers, web applications, etc. could process I/O serially (i.e. one request at a time),
63
+ could multiplex I/O with a single thread (e.g. by using `select(2)` or `poll(2)`), or could process I/O with multiple threads and/or multiple processes. It depends on the implementation.
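To make the single-threaded multiplexing option concrete, here is a minimal Ruby sketch built on `IO.select`, which wraps the `select(2)` family of calls. It is not taken from the Phusion Passenger source; `multiplex_read` is a hypothetical helper name, and plain pipes stand in for client sockets.

```ruby
# Reads from several IO objects using a single thread. IO.select blocks
# until at least one of them is readable, so no IO can stall the others.
# Returns a hash that maps each IO to everything it delivered before EOF.
def multiplex_read(ios)
  buffers = ios.to_h { |io| [io, +""] }
  pending = ios.dup
  until pending.empty?
    readable, = IO.select(pending)
    readable.each do |io|
      begin
        buffers[io] << io.read_nonblock(4096)
      rescue EOFError
        # The writer closed its end: stop watching this IO.
        pending.delete(io)
      rescue IO::WaitReadable
        # Readiness was spurious; try again on the next select round.
      end
    end
  end
  buffers
end
```

A server built this way handles many connections in one thread, at the cost of having to keep per-connection state (here, the `buffers` hash) explicitly.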
64
+
65
+ Of course, there are many variations possible. For example, load balancers could be used. But that is outside the scope of this document.
66
+
67
+ ==== The rationale behind reverse proxying
68
+
69
+ As you've seen, administrators often put the web application or its application server behind a real web server in a reverse proxy setup, even when the web app/app server already speaks HTTP. This is because implementing HTTP in a proper, secure way involves more than just speaking the protocol. The public Internet is a hostile environment where clients can send any arbitrary data and can exhibit any arbitrary I/O patterns. If you don't properly implement I/O handling, then you could open yourself either to parser vulnerabilities, or denial-of-service attacks.
70
+
71
+ Web servers like Apache and Nginx have already implemented world-class I/O and connection handling code and it would be a waste to reinvent their wheel. In the end, putting the application in a reverse proxying setup often makes the whole system more robust and more secure. This is the reason why it's considered good practice.
72
+
73
+ A typical problem involves dealing with *slow clients*. These clients may send HTTP requests slowly and read HTTP responses slowly, perhaps taking many seconds to complete their work. A naive single-threaded HTTP server implementation that reads an HTTP request, processes it, and sends the HTTP response in a loop may end up spending so much time waiting for I/O that it spends very little time doing actual work. Worse: if the client is malicious and just leaves the socket open without ever reading the HTTP response, then the server will wait for that client forever and will not be able to handle any more requests. A real-world attack based on this principle is link:http://en.wikipedia.org/wiki/Slowloris[Slowloris].
74
+
75
+ .An example of a naive HTTP server implementation
76
+ -------------------
77
+ while true
78
+ client = accept_next_client()
79
+ request = read_http_request(client)
80
+ response = process_request(request)
81
+ send_http_response(client, response)
82
+ end
83
+ -------------------
84
+
85
+ There are many ways to solve this problem. One could use one thread per client, one could implement I/O timeouts, one could use an evented I/O architecture, one could have a dedicated I/O thread or process buffer requests and responses. The point is, implementing all this properly is non-trivial. Instead of reimplementing these over and over in each application server, it's better to let a real web server deal with all the details and let the application server and the web application do what they're best at: their own core business logic.
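As an illustration of two of the mitigations listed above (one thread per client, combined with an I/O timeout), here is a minimal Ruby sketch. It is not Phusion Passenger code: `serve_with_timeouts`, `handle_client`, `READ_TIMEOUT` and the canned response are hypothetical, and a production server would need far more care than this.

```ruby
require 'socket'
require 'timeout'

READ_TIMEOUT = 0.5 # seconds; an arbitrary value chosen for illustration

# Reads header lines until the blank line that ends the request head.
def read_request_head(client)
  lines = []
  while (line = client.gets) && line != "\r\n"
    lines << line
  end
  lines
end

# Handles one client. The read is wrapped in a deadline so that a slow or
# stalled client cannot occupy this thread forever.
def handle_client(client)
  Timeout.timeout(READ_TIMEOUT) { read_request_head(client) }
  body = "ok\n"
  client.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n#{body}")
rescue Timeout::Error, SystemCallError
  # Give up on clients that are too slow or that disconnected.
ensure
  client.close
end

# Accepts connections and hands each one to its own thread, so a slow
# client only blocks its own thread and never the accept loop.
def serve_with_timeouts(server)
  loop do
    client = server.accept
    Thread.new(client) { |c| handle_client(c) }
  end
end
```

With this structure a client that connects and then sends nothing is dropped after `READ_TIMEOUT`, while other clients keep being served in the meantime.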
86
+
87
+ === Phusion Passenger architecture overview
88
+
89
+ image:images/passenger_architecture_overview.png[An overview of Phusion Passenger's architecture]
90
+
91
+ Phusion Passenger is not a single, monolithic entity. Instead, it consists of multiple components and processes that work together. Part of the reason why Phusion Passenger is split like this is that it's technically necessary (there is no other way to implement it). But another part of the reason is stability and robustness. Individual components can crash and can be restarted independently from each other. If we were to put everything inside a single process, then a crash would take down all of Phusion Passenger.
92
+
93
+ Thus, if the HelperAgent crashes, or if an application process crashes, they can both be restarted without affecting the web server's stability.
94
+
95
+ ==== Web server module
96
+
97
+ When an HTTP client sends a request, it is received by the web server (Nginx or Apache). Both Apache and Nginx can be extended with **modules**. Phusion Passenger provides such a module. The module is loaded into Nginx/Apache. It checks whether the request should be handled by a Phusion Passenger-served web application, and if so, forwards the request to the HelperAgent. The internal wire protocol used during this forwarding, is a modified version of link:http://en.wikipedia.org/wiki/SCGI[SCGI].
98
+
99
+ The Nginx module and Apache module have an entirely different code base. Their code bases are in `ext/nginx` and `ext/apache2`, respectively. Both modules are relatively small because they outsource most logic to the HelperAgent, and because they utilize a common library (`ext/common`). This allows us to support both Nginx and Apache without having to write a lot of things twice.
100
+
101
+ ==== HelperAgent
102
+
103
+ The **HelperAgent** is Phusion Passenger's core, where most of the processing is done. The HelperAgent keeps track of which application processes currently exist, and using load balancing rules, determines which process a request should be forwarded to. The HelperAgent also takes care of **application spawning**: if it determines that having more application processes is necessary or beneficial, then it will make that happen. Process spawning is subject to user-configured limits: the HelperAgent will never spawn more processes than a user-configured maximum.
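The routing and spawning behaviour described above can be sketched roughly as follows. This is not Passenger's ApplicationPool code: `TinyPool`, `AppProcess`, `checkout` and `checkin` are hypothetical names, and the "least busy process" rule is an assumption, since the text does not spell out the actual load balancing rules.

```ruby
# Hypothetical stand-in for an application process and its current load.
AppProcess = Struct.new(:pid, :active_requests)

class TinyPool
  attr_reader :processes

  # max_processes plays the role of the user-configured limit (assumed >= 1).
  def initialize(max_processes)
    @max_processes = max_processes
    @processes = []
    @next_pid = 1
  end

  # Picks a process for one request: reuse an idle process if there is one,
  # otherwise spawn a new one if the limit allows, otherwise fall back to
  # the least busy existing process.
  def checkout
    process = @processes.min_by(&:active_requests)
    if process.nil? || (process.active_requests > 0 && @processes.size < @max_processes)
      process = spawn_process
    end
    process.active_requests += 1
    process
  end

  # Marks one request on the given process as finished.
  def checkin(process)
    process.active_requests -= 1
  end

  private

  # Pretends to spawn a process; a real pool would start an application here.
  def spawn_process
    process = AppProcess.new(@next_pid, 0)
    @next_pid += 1
    @processes << process
    process
  end
end
```

The key property is that `checkout` never grows the pool past `max_processes`; once the limit is reached, additional requests share the existing processes.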
104
+
105
+ The HelperAgent also has monitoring and statistics gathering capabilities. It constantly keeps track of applications' memory usage, how many requests they've handled, etc. This information can later be queried from administration tools. And if an application process crashes, the HelperAgent restarts it.
106
+
107
+ The HelperAgent is by far the largest and most complex part of the system, but it is itself composed of several smaller subsystems. Most of the <<helper_agent_architecture,HelperAgent architecture>> chapter is devoted to describing the HelperAgent.
108
+
109
+ ==== LoggingAgent
110
+
111
+ The HelperAgent cooperates with the **LoggingAgent**. This latter is responsible for sending data to link:https://www.unionstationapp.com[Union Station], a monitoring web service. If you didn't explicitly tell Phusion Passenger to send data to Union Station, then the LoggingAgent sits idle and does not consume resources.
112
+
113
+ ==== Watchdog
114
+
115
+ The HelperAgent and the LoggingAgent contain complex logic, so they could contain bugs which could crash them.
116
+ So as a safety measure, they are both monitored by the **Watchdog**. If either of them crashes, it is restarted by the Watchdog. This setup seeks to ensure that the system stays up, no matter what.
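The monitor-and-restart idea can be illustrated with a few lines of Ruby. This is only a sketch under simplifying assumptions, not the Watchdog's implementation (which is written in C++): `supervise` is a hypothetical helper, it watches a single child, and it stops after `max_restarts` restarts so that the example terminates.

```ruby
# Runs the given block in a child process and restarts it whenever the
# child exits with a failure status, up to max_restarts times. The block
# receives the number of restarts so far. Returns the restart count.
def supervise(max_restarts)
  restarts = 0
  loop do
    pid = fork do
      yield(restarts)
      exit!(0) # normal completion; skip at_exit handlers in the child
    end
    _, status = Process.waitpid2(pid)
    return restarts if status.success? || restarts >= max_restarts
    restarts += 1
  end
end
```

For example, `supervise(5) { |attempt| exit!(1) if attempt < 2 }` simulates a child that crashes twice before it runs cleanly.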
117
+
118
+ You might now wonder: what happens if the Watchdog crashes? Shouldn't the Watchdog be monitored by another Watchdog? We've contemplated this possibility, but the Watchdog is very simple, and since 2012 we haven't seen a single report of the Watchdog crashing, nor have we been able to make it crash since that time. So, for the sake of keeping the codebase as simple as possible, we've chosen not to introduce multiple Watchdogs.
119
+
120
+ ==== Command line tools
121
+
122
+ Finally, there is an array of **command line tools** which support Phusion Passenger. The installers -- `passenger-install-*-module` -- are responsible for installing Phusion Passenger. There are administrative tools such as `passenger-status` and `passenger-memory-stats`. And many more. Some of these tools may communicate with one of the agents. For example, `passenger-status` queries the HelperAgent for information that the HelperAgent has collected. How this communication is done is described in <<instance_state_and_communication,Instance state and communication>>.
123
+
124
+ ==== Passenger Standalone
125
+
126
+ You might have noticed that Phusion Passenger Standalone is not part of the diagram. So how does it fit into the architecture? Well, Phusion Passenger Standalone is actually just Phusion Passenger for Nginx. The `passenger start` command simply sets up a modified and stripped-down Nginx web server (which we call the `WebHelper`) with the Phusion Passenger Nginx module loaded.
127
+
128
+ === Build system and source tree
129
+
130
+ Phusion Passenger is written mostly in C\++ and Ruby. The web server modules, HelperAgent, LoggingAgent and Watchdog are written in C++. Most command line tools are written in Ruby. You can find each component here:
131
+
132
+ * The web server modules can be found in `ext/apache2` and `ext/nginx`.
133
+ * The HelperAgent, LoggingAgent and Watchdog can be found in `ext/common/agents`.
134
+ * The command line tools can be found in `bin`, with some parts of their code in `lib`.
135
+
136
+ More information can be found in the link:https://github.com/phusion/passenger/blob/master/CONTRIBUTING.md[Contributors Guide]. This guide also teaches you how to compile Phusion Passenger.
137
+
138
+
139
+ == Initialization
140
+
141
+ image:images/startup_sequence.png[Startup sequence]
142
+
143
+ Phusion Passenger initializes as follows.
144
+
145
+ 1. First, the user starts the web server. This can for example be done by running `sudo service apache2 start` or `sudo service nginx start`. Or perhaps the web server is configured to be automatically started by the OS, in which case the user doesn't have to do anything. In case of Phusion Passenger Standalone, the user runs `passenger start` which in turn starts Nginx.
146
+
147
+ 2. The Phusion Passenger module inside Nginx/Apache proceeds with starting the Watchdog. This is implemented in:
148
+ +
149
+ * `ext/nginx/ngx_http_passenger_module.c`, function `start_watchdog()`.
150
+ * `ext/apache2/Hooks.cpp`, in the constructor for the `Hooks` class.
151
+ * `ext/common/AgentsStarter.h` and `AgentsStarter.cpp`. Most of the logic pertaining to starting the Watchdog is in these files.
152
+
153
+ 3. The Watchdog first initializes a <<instance_state_and_communication,"server instance directory">>, which is a temporary directory containing files that will be used during the lifetime of this Phusion Passenger instance. For example, the directory contains Unix domain socket files, so that the different Phusion Passenger processes can communicate with each other. The Watchdog is implemented in `ext/common/agents/Watchdog/Main.cpp`.
154
+
155
+ 4. The Watchdog starts the HelperAgent and the LoggingAgent simultaneously. Each performs its own initialization.
156
+
157
+ 5. When the HelperAgent is done initializing, it will send a message back to the Watchdog saying that it's done. The LoggingAgent does something similar. When the Watchdog has received both acknowledgment messages, it finishes initialization. If the Watchdog notices that one of the agents has exited without sending an acknowledgment message, then it enters an error state.
158
+
159
+ 6. The Watchdog reports successful startup back to the Phusion Passenger module that's running inside Nginx/Apache. Or, if initialization didn't succeed, the Watchdog reports back an error. The Phusion Passenger module inside Nginx/Apache then logs the error.
160
+
161
+ After initialization, Phusion Passenger is ready to receive and to process requests.
162
+
163
+
164
+ [[helper_agent_architecture]]
165
+ == HelperAgent architecture
166
+
167
+ image:images/helper_agent_core_architecture.png[HelperAgent architecture]
168
+
169
+ The HelperAgent consists of two subsystems. One is the *request handling subsystem*. The other is *the ApplicationPool subsystem*, which performs the bulk of process management. The HelperAgent also uses a number of support libraries. The largest third-party support libraries are shown in the diagram. Many more -- internal -- support libraries are used, but they're omitted from the diagram. You can find these internal support libraries in the directory `ext/common/Utils`.
170
+
171
+ === Request handling
172
+
173
+ Recall that requests are first received from the web server. The web server serializes the request into a slightly modified version of link:http://en.wikipedia.org/wiki/SCGI[the SCGI format], and sends it to the HelperAgent's RequestHandler. The RequestHandler performs some work, and eventually sends back a regular HTTP response. The web server parses the RequestHandler response, and sends a response to the original HTTP client.
174
+
175
+ The RequestHandler listens on a Unix domain socket file. This Unix domain socket file is called `request`, and is located in <<instance_state_and_communication,the server instance directory>>.
176
+
177
+ ==== One client per request
178
+
179
+ The web server creates a new connection to the HelperAgent on every request. Thus, from the viewpoint of the RequestHandler, its client is the web server. Every time a client connects (i.e. a new request is forwarded), the RequestHandler creates a new Client object which represents that request. All request-specific state is stored inside the Client. After the RequestHandler is done processing a request, it closes the client socket.
180
+
181
+ Note that in the diagram, a Client has a 0..1 association with RequestHandler. That's because when a Client is disconnected, the pointer to the associated RequestHandler is set to NULL. There might be background operations left which still have a pointer to the Client. As soon as those background operations finish, they check whether the Client has a valid pointer to the RequestHandler. If so, they commit their work; if not, they discard their work. The Client is destroyed when all its associated background operations have finished.
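The 0..1 association can be illustrated with a minimal Python sketch. This is not the HelperAgent's C++ code; `ToyClient` and `finish_background_operation` are hypothetical names, and the sketch only shows the "commit if still attached, otherwise discard" check described above.

```python
class ToyClient:
    """Holds all state for one request; handler is None once disconnected."""

    def __init__(self, handler):
        self.handler = handler
        self.committed = []

    def disconnect(self):
        # Mirrors setting the RequestHandler pointer to NULL.
        self.handler = None


def finish_background_operation(client, result):
    """Commit the result only if the client is still attached to a handler."""
    if client.handler is None:
        return False  # the request is gone, so the work is discarded
    client.committed.append(result)
    return True
```

A background operation that finishes after `disconnect()` therefore leaves the client's state untouched.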
182
+
183
+ [[request_handler_forwarding_to_app]]
184
+ ==== Forwarding to the application
185
+
186
+ The RequestHandler asynchronously asks the ApplicationPool subsystem to select an appropriate application process to handle this request. The ApplicationPool checks whether there is an appropriate process, and if not, tries to spawn one. Maybe spawning is not possible right now because of configured resource limits, and we have to wait. In any case, the ApplicationPool takes care of all the nasty details and bookkeeping, and eventually replies to the RequestHandler with either a Session object, or an exception.
187
+
188
+ A Session object represents a single request/response cycle with a particular application process. The RequestHandler uses the information in this Session object to establish a connection with that process and forwards the request, using <<loader_setting_up_server,a protocol that the application prefers>> and that the RequestHandler supports. The process performs work, and replies back with an HTTP response. The RequestHandler parses and postprocesses the response, and sends a response back to the web server.
189
+
190
+ If the ApplicationPool replied with an exception, the RequestHandler sends back an error response.
191
+
192
+ ==== I/O model
193
+
194
+ The RequestHandler uses the **evented I/O model**. This means that the RequestHandler handles many clients (requests) at the same time, using a single thread, inside a single process. This is possible through the use of I/O event multiplexing mechanisms, which are provided by the OS. Examples of such mechanisms include the `select()`, `poll()`, `epoll()` and `kqueue()` system calls. But those mechanisms are very low-level and OS-specific, so the RequestHandler uses two libraries which abstract away the differences and provide a higher-level API, link:http://software.schmorp.de/pkg/libev.html[libev] and link:http://software.schmorp.de/pkg/libeio.html[libeio].
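To make the idea concrete, here is a small sketch of evented I/O in Python. It does not use libev or libeio; it assumes Python's standard `selectors` module as a stand-in, which likewise wraps `select()`, `poll()`, `epoll()` or `kqueue()` depending on the OS. The function names are made up for this illustration. One thread serves several connections by only touching the sockets that the OS reports as readable.

```python
import selectors
import socket


def serve_once(server_ends):
    """Answer one message on each socket, using a single thread."""
    sel = selectors.DefaultSelector()  # picks epoll, kqueue, poll or select
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    pending = len(server_ends)
    while pending > 0:
        # Block until at least one registered socket is readable.
        for key, _events in sel.select():
            sock = key.fileobj
            data = sock.recv(4096)
            sock.sendall(data.upper())
            sel.unregister(sock)
            pending -= 1
    sel.close()


def demo():
    """Send three small requests and collect the three replies."""
    pairs = [socket.socketpair() for _ in range(3)]
    for i, (client, _server) in enumerate(pairs):
        client.sendall(f"request {i}".encode())
    serve_once([server for _client, server in pairs])
    replies = [client.recv(4096).decode() for client, _server in pairs]
    for client, server in pairs:
        client.close()
        server.close()
    return replies
```

The sketch keeps messages tiny so that a single `recv()` and `sendall()` suffice; a real evented server also has to buffer partial reads and writes.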
195
+
196
+ The evented I/O model is also used in Nginx. It is in contrast to the single-threaded multi-process model which handles 1 client per process (used by Apache with the prefork MPM), or the multi-threaded model which handles 1 client per thread (used by Apache with the worker MPM). You can learn more about evented I/O and the different I/O models using these resources:
197
+
198
+ * link:http://www.slideshare.net/marc.seeger/seeger-aysnc-io[Event-Driven I/O: A hands-on introduction] -- Marc Seeger, 2010
199
+ * link:http://stackoverflow.com/questions/5807246/event-driven-io-and-blocking-vs-nonblocking[Event Driven IO And Blocking vs NonBlocking] -- Stack Overflow
200
+ * link:http://stackoverflow.com/questions/3231018/how-does-event-driven-i-o-allow-multiprocessing[How does event driven I/O allow multiprocessing?] -- Stack Overflow
201
+ * link:http://www.kegel.com/c10k.html[The C10K problem] -- an overview of the different I/O models used in different servers; Dan Kegel
202
+
203
+ === The ApplicationPool subsystem
204
+
205
+ The ApplicationPool subsystem is responsible for:
206
+
207
+ * Keeping track of which application processes exist.
208
+ * Spawning processes.
209
+ * Routing requests to an appropriate process. This also implies that it load balances requests between processes.
210
+ * Monitoring processes (CPU usage, memory usage, etc).
211
+ * Enforcing resource limits. Ensuring that not too many processes are spawned, ensuring that processes that use too much memory are shut down, etc.
212
+ * Restarting processes on demand (e.g. when the timestamp of `restart.txt` has changed).
213
+ * Restarting processes that have crashed.
214
+ * Queuing requests and limiting concurrency. Each process tells the ApplicationPool how many concurrent requests it can handle. If more concurrent requests come in than the processes say they can handle, then the excess requests are queued within the ApplicationPool subsystem. Similarly, if requests come in while a process is being spawned, then those requests are queued until the process is done spawning.
215
+
216
+ The main interface into the subsystem is the Pool class, with its `asyncGet()` method. The RequestHandler calls something like `pool->asyncGet(options, callback)` inside its `checkoutSession()` method. `asyncGet()` replies with a Session, or an exception.
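The callback style and the queuing behavior can be sketched as follows. This is a toy model in Python, not the C++ Pool class: all class names are hypothetical, spawning is left out, and picking the process with the fewest open sessions is an assumption made for the example rather than a description of the real routing algorithm.

```python
from collections import deque


class ToyProcess:
    def __init__(self, name, concurrency):
        self.name = name
        self.concurrency = concurrency  # 0 means unlimited
        self.sessions = 0

    def has_capacity(self):
        return self.concurrency == 0 or self.sessions < self.concurrency


class ToySession:
    def __init__(self, pool, process):
        self.pool = pool
        self.process = process

    def close(self):
        # Closing a session frees a slot, so queued requests may proceed.
        self.process.sessions -= 1
        self.pool._drain_queue()


class ToyPool:
    def __init__(self, processes):
        self.processes = processes
        self.waiting = deque()  # callbacks that could not be served yet

    def async_get(self, callback):
        """Call callback(session, error), either now or once capacity frees up."""
        if not self.processes:
            callback(None, RuntimeError("no process available"))
            return
        candidates = [p for p in self.processes if p.has_capacity()]
        if not candidates:
            self.waiting.append(callback)
            return
        process = min(candidates, key=lambda p: p.sessions)
        process.sessions += 1
        callback(ToySession(self, process), None)

    def _drain_queue(self):
        while self.waiting and any(p.has_capacity() for p in self.processes):
            self.async_get(self.waiting.popleft())
```

With two processes that each accept one concurrent request, a third `async_get()` call is queued and its callback only fires after one of the first two sessions is closed.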
217
+
218
+ Pool is the core of the subsystem. It contains high-level process management logic but not the low-level details of spawning processes. The code is further divided into the following classes, each of which contains the core code managing its respective domain:
219
+
220
+ **SuperGroup**::
221
+ A logical collection of different applications. It's designed to be able to contain one or more Groups, but currently it always contains exactly 1 Group.
222
+ +
223
+ SuperGroup was originally introduced as a building block for a future feature: polyglot, multi-language applications. The idea was that, as more and more programming languages are introduced and become popular, there would be more and more demand to write applications in multiple languages. This would be done by splitting applications into multiple parts, with each part implemented in a different language. We wanted to introduce a feature that makes it super-easy to make such polyglot applications as a single whole. However, as time went on, we realized that we were mistaken and that most developers actually don't want to bother with multiple programming languages: they'd rather stick with a single one. So nowadays, SuperGroup is actually obsolete, but it's still kept in the codebase because it's not harmful, and removing it is too much work.
224
+
225
+ **Group**::
226
+ Represents an application. It can contain multiple processes, all belonging to the same application.
227
+
228
+ **Process**::
229
+ Represents an OS process; an instance of a certain application. A process may have multiple server sockets on which it listens for requests. The Process class contains various bookkeeping information, such as the number of sessions that are currently open. It also contains the communication channel with the underlying OS process. Process objects are created through <<spawner_subsystem,the ApplicationPool Spawner sub-subsystem>>.
230
+
231
+ **Socket**::
232
+ Represents a single server socket, on which a process listens for requests. Session objects are created through Socket. Socket maintains bookkeeping information about how many sessions are currently open for that particular socket.
233
+
234
+ **Session**::
235
+ Represents a single request/response cycle with a particular process. Upon creation and destruction, various bookkeeping information is updated.
236
+
237
+ **Options (not shown in diagram)**::
238
+ A configuration object for the `Pool::asyncGet()` method.
239
+
240
+ If you look at the diagram, then you see that SuperGroup, Group and Process all have 0..1 associations with their containing classes. An object that has a NULL association with its containing object, is considered invalid and should not be used. The fact that the association can be NULL is a detail of the memory management scheme that we employ.
241
+
242
+ [[spawner_subsystem]]
243
+ === The Spawner subsystem
244
+
245
+ The Spawner subsystem is a sub-subsystem within ApplicationPool. It is responsible for actually spawning application processes, and then creating Process objects with the correct information in them.
246
+
247
+ The `Spawner` interface encapsulates all low-level process spawning logic. Pool calls Spawner whenever it needs to spawn another application process.
248
+
249
+ Recall that Phusion Passenger supports multiple spawn methods. For example, the `smart` spawn method spawns processes through an intermediate preloader process, and can utilize copy-on-write. This is explained in detail in link:Users%20guide%20Nginx.html#spawning_methods_explained[Spawn methods explained] in the Phusion Passenger manual. Each spawn method corresponds to a different implementation of the Spawner interface. The following implementations are available:
250
+
251
+ * DirectSpawner -- implements the `direct` spawn method.
252
+ * SmartSpawner -- implements the `smart` spawn method.
253
+ * DummySpawner (not shown in diagram) -- only used in unit tests.
254
+
255
+ The spawn method is user-configurable through the `spawnMethod` field in the `Options` object. To avoid convoluting the Pool code with spawner implementation selection logic, we also have a SpawnerFactory class, which the Pool uses.
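The role of the factory can be sketched in a few lines of Python. The real classes are C++; the `Toy*` names, the dictionary-based options and the `preloader_available` key are assumptions made for this illustration. The fallback to the direct implementation when no preloader exists reflects the behavior described for the `smart` spawn method elsewhere in this document.

```python
class ToyDirectSpawner:
    def spawn(self, app_root):
        return f"process for {app_root}, started directly"


class ToySmartSpawner:
    def spawn(self, app_root):
        return f"process for {app_root}, forked from a preloader"


class ToySpawnerFactory:
    """Keeps the implementation choice out of the pool's own code."""

    def create(self, options):
        method = options["spawn_method"]
        if method == "smart" and options.get("preloader_available", False):
            return ToySmartSpawner()
        if method in ("smart", "direct"):
            # "smart" without a preloader falls back to direct spawning.
            return ToyDirectSpawner()
        raise ValueError(f"unknown spawn method: {method!r}")
```

The pool only ever calls `create(options)` and then `spawn()`, so adding a new spawn method does not require touching the pool.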
256
+
257
+ The details of the spawning process are described in <<app_spawning_and_loading,Application spawning and loading>>.
258
+
259
+
260
+ [[app_spawning_and_loading]]
261
+ == Application spawning and loading
262
+
263
+ Application processes are spawned from the HelperAgent process. Spawning a process involves a lot of **preparation work**, such as setting up communication channels, setting up the current working directory, environment variables, etc. This preparation work is done by <<spawner_subsystem,a Spawner object>>, together with various support executables.
264
+
265
+ When preparation is done, your application's entry point has to be loaded somehow. That loading is done through a language-specific **loader program**. The loader program communicates with the Spawner through the communication channel that was set up earlier, initializes the language-specific environment, sets up a server, and reports back to the Spawner. This communication is done through a certain **protocol**.
266
+
267
+ === Preparation work
268
+
269
+ image:images/spawning_preparation_work.png[Spawning preparation work]
270
+
271
+ [[basic_setup_and_forking]]
272
+ ==== Basic setup and forking
273
+
274
+ Spawning begins when the `spawn()` method is called on a <<spawner_subsystem,Spawner object>>. The Spawner determines link:Users%20guide%20Nginx.html#user_switching[which user the process should run as], and sets up some communication channels (anonymous Unix domain socket pairs), and forks a process. The parent waits until the child exits, or replies with something over the communication channel.
275
+
276
+ The communication channel in question is -- from the viewpoint of the (pre)loader -- actually just stdin, stdout and stderr! The anonymous Unix domain socket pairs that the Spawner creates are mapped to the child process's stdin, stdout and stderr file descriptors. Thus, the Spawner sends data to the (pre)loader by writing to its stdin, and the (pre)loader sends data back to the Spawner by writing to stdout or stderr.
277
+
278
+ ==== Loading SpawnPreparer, possibly through bash
279
+
280
+ Because the HelperAgent is heavily multi-threaded, the child process that has been forked by the Spawner link:http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html[may only perform async-signal-safe operations]:
281
+
282
+ [quote, The Open Group's POSIX specification, fork() man page]
283
+ "A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls."
284
+
285
+ Don't worry if you don't know what this means. The point is, there's almost nothing the forked process can safely do at that stage. So it outsources most of the remaining preparation work to an external executable, the SpawnPreparer. The SpawnPreparer starts with a clean environment where it can safely execute code.
286
+
287
+ To execute the SpawnPreparer, the child process executes one of the following commands:
288
+
289
+ * If the target user's shell is bash, and the `passenger_load_shell_envvars` option is turned on:
290
+ +
291
+ `bash -l -c '/path-to/SpawnPreparer /path-to-loader-or-preloader'`
292
+ +
293
+ This causes bash to load its startup files, e.g. bashrc, profile, etc, after which it executes the SpawnPreparer with the given parameters. The reason why we do this is because a lot of users try to set environment variables in their bashrc, and they expect these environment variables to be picked up by applications spawned by Phusion Passenger. Unfortunately environment variables link:Users%20guide%20Nginx.html#about_environment_variables[don't work that way], but we support it anyway because it is good for usability.
294
+
295
+ * Otherwise, the SpawnPreparer is executed directly, without bash:
296
+ +
297
+ `/path-to/SpawnPreparer /path-to-loader-or-preloader`
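The choice between the two commands can be sketched as a small Python function. The real logic lives in the C++ Spawner; the function name and its parameters are hypothetical, and detecting bash by the basename of the shell path is an assumption made for the example.

```python
import os
import shlex


def spawn_preparer_argv(user_shell, load_shell_envvars, preparer_path, loader_path):
    """Return the argv that the forked child would pass to exec()."""
    direct = [preparer_path, loader_path]
    if load_shell_envvars and os.path.basename(user_shell) == "bash":
        # A login shell (-l) reads the user's startup files before it runs
        # the command string given after -c.
        return ["bash", "-l", "-c", shlex.join(direct)]
    return direct
```

For example, `spawn_preparer_argv("/bin/zsh", True, "/path-to/SpawnPreparer", "/path-to-loader-or-preloader")` returns the direct form, because the shell is not bash.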
298
+
299
+ How `path-to-loader-or-preloader` is determined, is described in <<app_types_registry,The AppTypes registry>>.
300
+
301
+ ==== SpawnPreparer further sets up the environment
302
+
303
+ The SpawnPreparer is responsible for setting up certain environment variables, current working directory, and other process environmental conditions. When SpawnPreparer is done, it executes the loader or the preloader.
304
+
305
+ ==== Executing the loader or preloader
306
+
307
+ If `passenger_spawn_method` is set to `smart` (the default), and there is a preloader available for the application's programming language, then this step executes the language-specific **preloader**. If either of these conditions is not met (and thus the `passenger_spawn_method` is automatically forced to `direct`), then this step executes the language-specific **loader**.
308
+
309
+ All (pre)loaders are located in the `helper-scripts` directory in the source tree. Here are some of the (pre)loaders that are used:
310
+
311
+ [options="header"]
312
+ |================================================================================
313
+ | Language/Framework | Loader | Preloader
314
+ | Ruby Rack and Rails >= 3.x | rack-loader.rb | rack-preloader.rb
315
+ | Ruby on Rails 1.x and 2.x | classic-rails-loader.rb | classic-rails-preloader.rb
316
+ | Python | wsgi-loader.py | -
317
+ | Node.js and bundled Meteor | node-loader.js | -
318
+ | Unbundled Meteor | meteor-loader.rb | -
319
+ |================================================================================
320
+
321
+ <<app_types_registry,The AppTypes registry>> keeps a list of available (pre)loaders, and which languages they belong to.
322
+
323
+ What the loader does is described in <<loaders,Loaders>>. Likewise, preloaders are described in <<preloaders,Preloaders>>.
324
+
325
+ [[loaders]]
326
+ === Loaders
327
+
328
+ A loader initializes in 4 stages:
329
+
330
+ 1. It first goes through a <<loader_handshake,handshake>>, where it reads the parameters that the Spawner has sent over the communication channel.
331
+ 2. It <<application_loading,loads the application>>. The behavior of this stage may be customized by the received parameters.
332
+ 3. It <<loader_setting_up_server,sets up a server>>, on which this application process listens for requests.
333
+ 4. It <<loader_report_readiness,sends a response back>> to the Spawner, in which it tells Spawner whether initialization was successful, and if so, where the socket is on which this application process listens for requests.
334
+
335
+ Once initialized, the loader enters a main loop, in which it keeps handling requests until a signal has been received that says it should terminate.
336
+
337
+ As explained in <<basic_setup_and_forking,Basic setup and forking>>, the communication channels that the loader uses are just plain old stdin and stdout. Every programming language supports reading and writing from these channels. This also means that you can easily test a loader by simply executing it and entering messages in the terminal.
338
+
339
+ But stdout can also be used for printing normal output. How does the Spawner distinguish between control messages, and normal messages that should be displayed? The answer is that control messages must start with `!> ` (including the trailing whitespace), and must end with a newline. The Spawner reads messages line-by-line, processes lines that start with `!> `, and prints lines that don't start with that marker.
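The line-by-line rule can be sketched in Python as follows. This is not the Spawner's C++ code, and the function name is made up; it only demonstrates that the marker is exactly `!> ` including the trailing space, so a bare `!>` line counts as normal output.

```python
CONTROL_PREFIX = "!> "  # the trailing space is part of the marker


def split_loader_output(lines):
    """Separate control messages from output that should just be printed."""
    control, passthrough = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(CONTROL_PREFIX):
            control.append(line[len(CONTROL_PREFIX):])
        else:
            passthrough.append(line)
    return control, passthrough
```

An empty control message (a line consisting only of the marker) shows up as an empty string in the first list.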
340
+
341
+ [[loader_handshake]]
342
+ ==== Handshake
343
+
344
+ The handshaking process begins with a protocol version handshake. The loader prints the line `!> I have control 1.0`. The Spawner then sends `You have control 1.0`, which the loader checks. If the loader observes that the version handshake does not match the expectation, then it aborts with an error.
345
+
346
+ The Spawner also sends a list of key-value pairs, which is terminated by an empty line. Upon receiving the empty line, the loader proceeds with <<application_loading,loading the application>>.
347
+
348
+ Example:
349
+
350
+ --------------------------------------------------------
351
+ Loader Spawner
352
+
353
+ !> I have control 1.0
354
+ You have control 1.0
355
+ passenger_root: ...
356
+ passenger_version: 4.0.45
357
+ ruby_libdir: /Users/hongli/Projects/passenger/lib
358
+ generation_dir: /tmp/passenger.1.0.2082/generation-0
359
+ gupid: 1647ad4-ovJJMiPkAAt
360
+ connect_password: jXGaSzo8vRX5oGe2uuSv5tJsf1uX7ZgIeEH2x0nfOEa
361
+ app_root: /Users/hongli/Sites/rack.test
362
+ startup_file: config.ru
363
+ process_title: Passenger RackApp
364
+ log_level: 3
365
+ environment: development
366
+ base_uri: /
367
+ ...
368
+ (empty newline)
369
+ --------------------------------------------------------
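The loader's side of this exchange can be sketched in Python. The function name is hypothetical, and the parsing assumes the `key: value` layout shown in the example above. Passing the streams as arguments makes it easy to try the function with in-memory streams instead of the real stdin and stdout.

```python
import io


def loader_handshake(stdin, stdout):
    """Perform the version handshake and return the received options."""
    stdout.write("!> I have control 1.0\n")
    stdout.flush()
    reply = stdin.readline().rstrip("\n")
    if reply != "You have control 1.0":
        raise RuntimeError(f"unexpected handshake reply: {reply!r}")
    options = {}
    while True:
        line = stdin.readline()
        if line in ("", "\n"):  # end of input, or the terminating empty line
            break
        key, _sep, value = line.rstrip("\n").partition(": ")
        options[key] = value
    return options


def demo():
    """Feed the function a canned Spawner message."""
    fake_stdin = io.StringIO("You have control 1.0\nstartup_file: config.ru\n\n")
    return loader_handshake(fake_stdin, io.StringIO())
```

In a real loader the two arguments would be `sys.stdin` and `sys.stdout`.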
370
+
371
+ [[application_loading]]
372
+ ==== Application loading
373
+
374
+ How the application is loaded, depends on the programming language. Here are some examples:
375
+
376
+ * The Ruby Rack loader does it by `load()`-ing the startup file, which by default is `config.ru`.
377
+ * The Python loader does it by calling `imp.load_source('passenger_wsgi', 'passenger_wsgi.py')`.
378
+ * The Node.js (and bundled Meteor) loader does it by `require()`-ing the startup file, which by default is `app.js`.
379
+ * The unbundled Meteor loader does it by executing the `meteor run` command.
380
+
381
+ If no errors occur, the loader proceeds with <<loader_setting_up_server,setting up a server>>. Otherwise, it <<loader_error_reporting,reports an error>>.
382
+
383
+ [[loader_setting_up_server]]
384
+ ==== Setting up a server
385
+
386
+ The loader sets up a server on which the application listens for requests. The Spawner doesn't care how this is done, how this server works, or even what its concurrency is. It only cares about how it can contact the server. So the loader has full freedom in this step.
387
+
388
+ As explained in <<request_handler_forwarding_to_app,section 'Request handling' and subsection 'Forwarding to the application'>>, the RequestHandler can talk with the application process in a protocol that the application prefers. The RequestHandler supports two protocols:
389
+
390
+ * A Phusion Passenger internal protocol which we call the 'session' protocol. This protocol is used by the Ruby loaders and the Python loader. A description of this protocol is outside the scope of this document, but if you're interested in what it looks like and how it behaves, you can study the source code of the Ruby and Python loaders, as well as `ext/common/Utils/MessageIO.h`.
391
+ * The HTTP protocol. This protocol is used by the Node.js and Meteor loaders. If you're writing a new loader, then it's probably easiest to use this protocol, together with whatever HTTP library is available for the loader's target language.
392
+
393
+ Typically, the server is set up to listen on a Unix domain socket file, inside the `backends` subdirectory of the 'generation directory'. The path to the generation directory was passed during the handshake. However, the server may also listen on a TCP socket.
394
+
395
+ Once a server has been set up, the loader can <<loader_report_readiness,report readiness>>.
396
+
397
+ [[loader_report_readiness]]
398
+ ==== Reporting readiness
399
+
400
+ Once the server is set up, the loader sends back a `!> Ready` response, followed by information about where the server socket listens, and what protocol it expects. The response is terminated with a `!> ` line (notice the trailing whitespace, which is required).
401
+
402
+ The information about where the server socket listens is a 4-tuple:
403
+
404
+ * The **name**. This must always be `main`.
405
+ * The **address**. For Unix domain sockets, it has the form `unix:/path-to-socket`. For TCP sockets, it has the form `tcp://127.0.0.1:PORT`.
406
+ * The **protocol**. This must be either `session` or `http_session`.
407
+ * The maximum number of **concurrent connections** the server supports. The ApplicationPool will ensure that the process never receives more concurrent requests than this number. A value of 0 means that the concurrency is unlimited.
408
+
409
+ Here's an example of what the Node.js loader sends as response:
410
+
411
+ --------------------------------------------------------
412
+ !> Ready
413
+ !> socket: main;unix:/path-to-generation-dir/backends/node.1234-5677;session;0
414
+ !>
415
+ --------------------------------------------------------
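Producing and reading such a response can be sketched in Python. Both function names are hypothetical, and the parser assumes that fields are separated by semicolons exactly as in the example above.

```python
def format_ready_response(address, protocol, concurrency, name="main"):
    """Build the three-line readiness response that a loader would print."""
    if protocol not in ("session", "http_session"):
        raise ValueError(f"unsupported protocol: {protocol!r}")
    return (
        "!> Ready\n"
        f"!> socket: {name};{address};{protocol};{concurrency}\n"
        "!> \n"  # terminator; the trailing space is required
    )


def parse_socket_line(line):
    """Split a '!> socket: ...' line into its 4-tuple."""
    prefix = "!> socket: "
    if not line.startswith(prefix):
        raise ValueError("not a socket line")
    name, address, protocol, concurrency = line[len(prefix):].rstrip("\n").split(";")
    return name, address, protocol, int(concurrency)
```

A concurrency of `0` in the parsed tuple stands for unlimited concurrency, as described in the list above.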
416
+
417
+ After reporting readiness, the loader can <<loader_main_loop,enter a main loop and wait for termination>>.
418
+
419
+ [[loader_error_reporting]]
420
+ ==== Error reporting
421
+
422
+ If something goes wrong in any of the stages, the loader can report an error in two ways:
423
+
424
+ 1. Just write the error message to stdout as you normally do, and abort without printing the `!> Ready` message. The HelperAgent will read everything that the loader has written to stdout, and use it as the error message. This error message is considered to be plain text.
425
+ 2. Abort after printing a special `!> Error` message. The loader can signal that the message is HTML. The RequestHandler will format the error message as HTML.
426
+
427
+ [[loader_main_loop]]
428
+ ==== Main loop and termination
429
+
430
+ The loader's main loop's job is to wait until a single byte has been received on stdin. As long as the byte has not been received, the loader should not exit, and should keep processing requests. When the byte has been received, the following conditions are guaranteed to be true:
431
+
432
+ * All clients for this particular process have disconnected.
433
+ * No more clients will be routed to this particular process.
434
+
435
+ This guarantee is enforced by the RequestHandler and the ApplicationPool. Thus, the loader doesn't have to perform any kind of complicated shutdown. It can just exit the process.
436
+
437
+ If the server was listening on a Unix domain socket file, then the loader doesn't even have to remove the file. The ApplicationPool already takes care of that.
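A main loop of this kind can be sketched in Python. The function name and the idea of serving requests on a background thread are assumptions made for the example; a real loader is free to structure its server differently.

```python
import os
import threading


def run_until_shutdown_byte(stdin_fd, handle_requests):
    """Serve requests in the background until one byte arrives on stdin_fd."""
    stop = threading.Event()
    worker = threading.Thread(target=handle_requests, args=(stop,), daemon=True)
    worker.start()
    # Blocks until a byte is available, or until the other end is closed.
    os.read(stdin_fd, 1)
    # At this point no clients are connected and none will be routed here,
    # so there is nothing to drain before stopping.
    stop.set()
    worker.join()
```

In a real loader `stdin_fd` would be `0`, and the process would simply exit after the function returns.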
438
+
439
+ ==== Stdout and stderr forwarding
440
+
441
+ All lines that the loader writes to stdout, and that are not prefixed with `!> `, are forwarded by the HelperAgent to its own stdout. Similarly, everything that the loader writes to stderr, whether prefixed with `!> ` or not, is forwarded by the HelperAgent to its own stdout.
442
+
443
+ While the Spawner is still doing its work, it takes care of this forwarding by itself. Once the Spawner is done, it outsources this work to two PipeWatcher objects, each of which spawns a background thread for this purpose.
444
+
445
+ [[preloaders]]
446
+ === Preloaders
447
+
448
+ Preloaders are a special kind of loader, used for reducing spawn time and leveraging copy-on-write. You can learn more about this at link:Users%20guide%20Nginx.html#spawning_methods_explained[Spawning methods explained].
449
+
450
+ Preloaders look a lot like loaders, but behave slightly differently. They also use stdin, stderr and stdout to communicate with the Spawner. The protocols are very similar.
451
+
452
+ A preloader initializes in 4 stages:
453
+
454
+ 1. It first goes through a handshake, which is the same as <<loader_handshake,the loader handshake>>.
455
+ 2. It <<application_loading,loads the application>> just like the loader does.
456
+ 3. It sets up a server on which it listens for spawn commands.
457
+ 4. It sends a response back to the Spawner. This is similar to how the loader does it, but instead of telling the Spawner where the application listens for requests, it tells the Spawner where the preloader process listens for spawn commands.
458
+
459
+ Once initialized, the preloader enters a main loop, in which it keeps handling spawn commands until a signal has been received that says it should terminate.
460
+
461
+ When a spawn command is received, the preloader forks off a child process (which already has the application loaded) and reports the child process's PID to the Spawner. It also sets up a communication channel between the Spawner and the child process.
462
+
463
+ [[app_types_registry]]
464
+ === The AppTypes registry
465
+
466
+ When the web server receives a request, the Phusion Passenger module inside it autodetects the type of application that the request belongs to. It does that by examining the filesystem and checking which one of the startup files exists. For example, if `config.ru` exists, then it assumes that it's a Ruby app. Or if `app.js` exists, then it assumes that it's a Node.js app. The Phusion Passenger module forwards the inferred application type to the HelperAgent.
467
+
468
+ Given an application type, the associated loader and preloader can be looked up.
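The detection step can be sketched in Python. The table below is an illustrative subset built from startup file names mentioned in this document; the type names (`rack`, `node`, `wsgi`), the ordering and the function names are assumptions, and the authoritative list is the one in the source files listed in this section.

```python
import os
import tempfile

# Illustrative subset only: (application type, default startup file).
STARTUP_FILES = [
    ("rack", "config.ru"),
    ("node", "app.js"),
    ("wsgi", "passenger_wsgi.py"),
]


def detect_app_type(app_root):
    """Return the first application type whose startup file exists."""
    for app_type, startup_file in STARTUP_FILES:
        if os.path.exists(os.path.join(app_root, startup_file)):
            return app_type
    return None


def demo():
    """Create a throwaway directory containing app.js and detect its type."""
    with tempfile.TemporaryDirectory() as app_root:
        open(os.path.join(app_root, "app.js"), "w").close()
        return detect_app_type(app_root)
```

A directory without any known startup file yields `None`, which a caller would treat as "not a supported application".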
469
+
470
+ Information about the supported application types, startup files, loaders and preloaders is defined in the following places:
471
+
472
+ * The constant `appTypeDefinitions` in the file `ext/common/ApplicationPool2/AppTypes.cpp` keeps a list of supported languages. It also specifies the default startup file name belonging to each language.
473
+ * The method `getStartCommand()` in the file `ext/common/ApplicationPool2/Options.h` defines the loaders that should be used for each language.
474
+ * The method `tryCreateSmartSpawner()` in the file `ext/common/ApplicationPool2/SpawnerFactory.h` defines the preloaders that should be used for each language.
475
+ * The method `looks_like_app_directory?` in the file `lib/phusion_passenger/standalone/app_finder.rb` keeps a list of supported startup files. This is only used within Passenger Standalone.
476
+
477
+
478
+ [[instance_state_and_communication]]
479
+ == Instance state and communication
480
+
481
+ Every time you start Phusion Passenger, you create a new *instance*. Every instance consists of multiple processes that work together (Watchdog, HelperAgent, LoggingAgent, application processes). All those processes have to be able to communicate with each other. Those processes must also *not* communicate with the processes belonging to other instances. For example, if you start Apache+Passenger *and* Nginx+Passenger, then we don't want the HelperAgent that's started from Apache to use the LoggingAgent that's started from Nginx.
482
+
483
+ Clearly, the processes can't listen on a specific TCP port for communication. Nor can they listen on a fixed Unix domain socket filename.
484
+
485
+ That is where the 'server instance directory' comes in. Every Phusion Passenger instance has its own, unique temporary directory. That directory is removed when the instance halts. The directory contains Unix domain socket files that the processes listen on. Every Phusion Passenger related process knows where its own server instance directory is, and thus, knows how to communicate with other processes belonging to the same instance. The server instance directory is implemented in `ext/common/ServerInstanceDir.h`.
486
+
487
+ Administration tools such as `passenger-status` query information using server instance directories. First, they check which server instance directories exist on the system. If they find only one, then they query the sockets inside that sole server instance directory. Otherwise, they abort with an error and ask the user to specifically select the instance to query.
488
+
489
+
490
+ [appendix]
491
+ == About Rack
492
+
493
+ The de-facto standard interface for Ruby web applications is link:http://rack.rubyforge.org/[Rack]. Rack specifies a programming interface for web application developers to implement. This interface covers HTTP request and response handling, and is not dependent on any particular application server. The idea is that any application server that implements the Rack specification works with all Rack-compliant web applications.
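
As a minimal illustration of the interface, a Rack application is any Ruby object that responds to `call(env)` and returns a status, a headers hash and a body that responds to `each`. The name `HelloApp` below is made up for this example.

```ruby
# A Rack application: takes the request environment hash and returns
# [status, headers, body]. The body must respond to `each`; an Array does.
HelloApp = lambda do |env|
  path = env['PATH_INFO'] || '/'
  body = "Hello from #{path}\n"
  headers = {
    'content-type'   => 'text/plain',
    'content-length' => body.bytesize.to_s
  }
  [200, headers, [body]]
end

# In a config.ru file, the application would be started with:
#   run HelloApp
```

Since the application only depends on the `env` hash and the return value, the same object can be served by any Rack-compliant application server without modification.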
494
+
495
+ image:images/rack.png[]
496
+
497
+ In the distant past, each Ruby web framework had its own interface, so application servers needed to explicitly add support for each web framework. Nowadays application servers just support Rack.
498
+
499
+ image:images/many_web_framework_protocols.png[]
500
+
501
+ Ruby on Rails has been fully Rack compliant since version 3.0. Rails 2.3 was partially Rack-compliant while earlier versions were not Rack-compliant at all. Phusion Passenger supports Rack as well as all Rails 1.x and 2.x versions.
502
+
503
+ [appendix]
504
+ == About Apache
505
+
506
+ The Apache web server has a dynamic module system and a pluggable I/O multiprocessing architecture (multiprocessing being the ability to handle more than one HTTP client concurrently). An Apache module that implements a particular multiprocessing strategy is called a Multi-Processing Module (MPM). The single-threaded multi-process link:http://httpd.apache.org/docs/2.4/mod/prefork.html[prefork MPM] was the default and the most popular one for a long time, but the hybrid multi-threaded/multi-process link:http://httpd.apache.org/docs/2.4/mod/worker.html[worker MPM] is becoming increasingly popular because of its better performance and scalability. Furthermore, Apache 2.4 introduced the link:http://httpd.apache.org/docs/2.4/mod/event.html[event MPM], which is a hybrid evented/multi-threaded/multi-process MPM and offers even more scalability benefits.
507
+
508
+ The prefork MPM remains in wide use today because it's the only MPM that works well with mod_php.
509
+
510
+ The prefork MPM spawns multiple worker child processes. HTTP requests are first accepted by a so-called control process, and then forwarded to one of the worker processes. The next section contains a diagram which shows the prefork MPM's architecture.
511
+
512
+ [appendix]
513
+ == About Nginx
514
+
515
+ Nginx is a lightweight web server that is becoming increasingly popular. It is known to be smaller, lighter and more scalable than Apache thanks to its evented I/O architecture. That said, Nginx is less flexible than Apache. For example, it has no dynamic module system: all modules must be statically compiled into Nginx.