nerve 0.0.1 → 0.2.1
- data/.gitignore +5 -0
- data/.nerve.rc +2 -0
- data/Gemfile +0 -2
- data/LICENSE.txt +2 -2
- data/README.md +50 -2
- data/Vagrantfile +121 -0
- data/bin/nerve +56 -0
- data/example/nerve.conf.json +50 -0
- data/lib/nerve/log.rb +24 -0
- data/lib/nerve/reporter.rb +64 -0
- data/lib/nerve/ring_buffer.rb +30 -0
- data/lib/nerve/service_watcher/base.rb +51 -0
- data/lib/nerve/service_watcher/http.rb +62 -0
- data/lib/nerve/service_watcher/rabbitmq.rb +68 -0
- data/lib/nerve/service_watcher/tcp.rb +56 -0
- data/lib/nerve/service_watcher.rb +96 -0
- data/lib/nerve/utils.rb +17 -0
- data/lib/nerve/version.rb +1 -1
- data/lib/nerve.rb +66 -2
- data/nerve.gemspec +5 -2
- metadata +53 -4
data/.gitignore
CHANGED
data/.nerve.rc
ADDED
data/Gemfile
CHANGED
data/LICENSE.txt
CHANGED
@@ -1,4 +1,4 @@
-Copyright (c)
+Copyright (c) 2013 Airbnb, Inc.
 
 MIT License
 
@@ -19,4 +19,4 @@ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
 LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
-WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
CHANGED
@@ -1,7 +1,21 @@
 # Nerve
 
+Nerve is a utility for tracking the status of machines and services.
+It runs locally on the boxes which make up a distributed system, and reports state information to a distributed key-value store.
+At Airbnb, we use Zookeeper as our key-value store.
+The combination of Nerve and [Synapse](https://github.com/airbnb/synapse) makes service discovery in the cloud easy!
 
-##
+## Motivation ##
+
+We already use [Synapse](https://github.com/airbnb/synapse) to discover remote services.
+However, those services needed boilerplate code to register themselves in [Zookeeper](https://zookeeper.apache.org/).
+Nerve simplifies underlying services, enables code reuse, and allows us to create a more composable system.
+It does so by factoring out the boilerplate into its own application, which independently handles monitoring and reporting.
+
+Beyond those benefits, nerve also acts as a general watchdog on systems.
+The information it reports can be used to take action from a centralized automation center: actions like scaling distributed systems up or down, or alerting ops or engineering about downtime.
+
+## Installation ##
 
 Add this line to your application's Gemfile:
 
@@ -15,8 +29,42 @@ Or install it yourself as:
 
 $ gem install nerve
 
-##
+## Configuration ##
+
+Nerve depends on a single configuration file, in JSON format.
+It is usually called `nerve.conf.json`.
+An example config file is available in `example/nerve.conf.json`.
+The config file is composed of two main sections:
+
+* `instance_id`: the name under which your services will be registered in zookeeper
+* `services`: the hash (from service name to config) of the services nerve will be monitoring
+
+### Services Config ###
+
+Each service that nerve will be monitoring is specified in the `services` hash.
+The key is the name of the service, and the value is a configuration hash telling nerve how to monitor the service.
+The configuration contains the following options:
+
+* `port`: the default port for service checks; nerve will submit the address `instance_id:port` to Zookeeper
+* `host`: the default host on which to make service checks; you should make this your *public* IP if you want to make sure your service is publicly accessible
+* `zk_hosts`: a list of the zookeeper hosts comprising the [ensemble](https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkMulitServerSetup) that nerve will submit registration to
+* `zk_path`: the path (or [znode](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_zkDataModel_znodes)) where the registration will be created; nerve will create the [ephemeral node](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#Ephemeral+Nodes) that is the registration as a child of this path
+* `check_interval`: the frequency with which service checks will be initiated; defaults to `500ms`
+* `checks`: a list of checks that nerve will perform; if all of them pass, the service will be registered; otherwise, it will be un-registered
+
+### Checks ###
+
+The core of nerve is a set of service checks.
+Each service can define a number of checks, and all of them must pass for the service to be registered.
+Although the exact parameters passed to each check are different, all take a number of common arguments:
 
+* `type`: (required) the kind of check; you can see available check types in the `lib/nerve/service_watcher` dir of this repo
+* `name`: (optional) a descriptive, human-readable name for the check; it will be auto-generated based on the other parameters if not specified
+* `host`: (optional) the host on which the check will be performed; defaults to the `host` of the service to which the check belongs
+* `port`: (optional) the port on which the check will be performed; like `host`, it defaults to the `port` of the service
+* `timeout`: (optional) maximum time the check can take; defaults to `100ms`
+* `rise`: (optional) how many consecutive checks must pass before the check is considered passing; defaults to 1
+* `fall`: (optional) how many consecutive checks must fail before the check is considered failing; defaults to 1
 
 ## Contributing
 
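The `rise` and `fall` options described in the README hunk above amount to a small hysteresis rule: the reported state only flips after enough consecutive results agree. The sketch below illustrates that idea in isolation. The `Hysteresis` class and its `record` method are hypothetical names, not part of nerve, and the bookkeeping (a streak counter) differs from the ring buffer that nerve's own `BaseServiceCheck` uses.

```ruby
# Hypothetical helper, not part of nerve: flips state only after `rise`
# consecutive passes or `fall` consecutive failures.
class Hysteresis
  attr_reader :up

  def initialize(rise: 1, fall: 1, initially_up: false)
    @rise = rise
    @fall = fall
    @up = initially_up
    @streak = 0   # length of the current run of identical raw results
    @last = nil   # most recent raw result
  end

  # Record one raw check result (true/false) and return the smoothed state.
  def record(result)
    @streak = (result == @last) ? @streak + 1 : 1
    @last = result
    if result && @streak >= @rise
      @up = true
    elsif !result && @streak >= @fall
      @up = false
    end
    @up
  end
end

h = Hysteresis.new(rise: 3, fall: 2)
states = [true, true, true, false, true, false, false].map { |r| h.record(r) }
p states  # => [false, false, true, true, true, true, false]
```

With `rise: 3` the service is only reported up on the third consecutive pass, and with `fall: 2` a single failed check does not take it down.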
data/Vagrantfile
ADDED
@@ -0,0 +1,121 @@
+# -*- mode: ruby -*-
+# vi: set ft=ruby :
+
+unless ENV['COOKBOOK_DIR'] and ENV['DATA_BAG_DIR']
+  STDERR.puts "you need to set COOKBOOK_DIR and DATA_BAG_DIR as environment variables"
+  Kernel.exit 1
+end
+
+this_dir = File.dirname(File.expand_path __FILE__)
+
+STDERR.puts "mounting #{this_dir} as /root/this_dir"
+
+Vagrant::Config.run do |config|
+  # All Vagrant configuration is done here. The most common configuration
+  # options are documented and commented below. For a complete reference,
+  # please see the online documentation at vagrantup.com.
+
+  # Every Vagrant virtual environment requires a box to build off of.
+  config.vm.box = "precise64"
+
+  # The url from where the 'config.vm.box' box will be fetched if it
+  # doesn't already exist on the user's system.
+  # config.vm.box_url = "http://domain.com/path/to/above.box"
+
+  config.vm.box_url = 'http://files.vagrantup.com/precise64.box'
+
+  # Boot with a GUI so you can see the screen. (Default is headless)
+  # config.vm.boot_mode = :gui
+
+  # Assign this VM to a host-only network IP, allowing you to access it
+  # via the IP. Host-only networks can talk to the host machine as well as
+  # any other machines on the same network, but cannot be accessed (through this
+  # network interface) by any external networks.
+  # config.vm.network :hostonly, "192.168.33.10"
+
+  # Assign this VM to a bridged network, allowing you to connect directly to a
+  # network using the host's network device. This makes the VM appear as another
+  # physical device on your network.
+  # config.vm.network :bridged
+
+  # Forward a port from the guest to the host, which allows for outside
+  # computers to access the VM, whereas host only networking does not.
+  # config.vm.forward_port 80, 8080
+
+  # Share an additional folder to the guest VM. The first argument is
+  # an identifier, the second is the path on the guest to mount the
+  # folder, and the third is the path on the host to the actual folder.
+  # config.vm.share_folder "v-data", "/vagrant_data", "../data"
+
+  config.vm.share_folder 'this_dir', '/this_dir', this_dir
+
+  # Enable provisioning with Puppet stand alone. Puppet manifests
+  # are contained in a directory path relative to this Vagrantfile.
+  # You will need to create the manifests directory and a manifest in
+  # the file base.pp in the manifests_path directory.
+  #
+  # An example Puppet manifest to provision the message of the day:
+  #
+  # # group { "puppet":
+  # #   ensure => "present",
+  # # }
+  # #
+  # # File { owner => 0, group => 0, mode => 0644 }
+  # #
+  # # file { '/etc/motd':
+  # #   content => "Welcome to your Vagrant-built virtual machine!
+  # #               Managed by Puppet.\n"
+  # # }
+  #
+  # config.vm.provision :puppet do |puppet|
+  #   puppet.manifests_path = "manifests"
+  #   puppet.manifest_file = "base.pp"
+  # end
+
+  # Enable provisioning with chef solo, specifying a cookbooks path, roles
+  # path, and data_bags path (all relative to this Vagrantfile), and adding
+  # some recipes and/or roles.
+  #
+  config.vm.provision :chef_solo do |chef|
+    # chef.cookbooks_path = "../my-recipes/cookbooks"
+    # chef.roles_path = "../my-recipes/roles"
+    # chef.data_bags_path = '/tmp/foo'
+    # chef.add_recipe "mysql"
+    # chef.add_role "web"
+
+    # You may also specify custom JSON attributes:
+    # chef.json = { :mysql_password => "foo" }
+
+    chef.data_bags_path = ENV['DATA_BAG_DIR']
+    chef.cookbooks_path = ENV['COOKBOOK_DIR']
+    chef.add_recipe 'vagrant'
+    chef.add_recipe 'nerve'
+  end
+
+  # Enable provisioning with chef server, specifying the chef server URL,
+  # and the path to the validation key (relative to this Vagrantfile).
+  #
+  # The Opscode Platform uses HTTPS. Substitute your organization for
+  # ORGNAME in the URL and validation key.
+  #
+  # If you have your own Chef Server, use the appropriate URL, which may be
+  # HTTP instead of HTTPS depending on your configuration. Also change the
+  # validation key to validation.pem.
+  #
+  # config.vm.provision :chef_client do |chef|
+  #   chef.chef_server_url = "https://api.opscode.com/organizations/ORGNAME"
+  #   chef.validation_key_path = "ORGNAME-validator.pem"
+  # end
+  #
+  # If you're using the Opscode platform, your validator client is
+  # ORGNAME-validator, replacing ORGNAME with your organization name.
+  #
+  # IF you have your own Chef Server, the default validation client name is
+  # chef-validator, unless you changed the configuration.
+  #
+  # chef.validation_client_name = "ORGNAME-validator"
+
+  # Vagrant::Config.run do |config|
+  #   config.vm.provision :shell, :path => "vagrant/init.sh"
+  # end
+end
data/bin/nerve
ADDED
@@ -0,0 +1,56 @@
+#!/usr/bin/env ruby
+
+require 'json'
+require 'optparse'
+
+require 'nerve'
+
+options={}
+
+# set command line options
+optparse = OptionParser.new do |opts|
+  opts.banner =<<EOB
+Welcome to nerve
+
+Usage: nerve --config /path/to/nerve/config
+EOB
+
+  options[:config] = ENV['NERVE_CONFIG']
+  opts.on('-c config','--config config', String, 'path to nerve config') do |key,value|
+    options[:config] = key
+  end
+
+  opts.on( '-h', '--help', 'Display this screen' ) do
+    puts opts
+    exit
+  end
+
+end
+
+
+# parse command line arguments
+optparse.parse!
+
+
+# parse nerve config file
+begin
+  config = JSON::parse(File.read(options[:config]))
+rescue TypeError => e
+  raise ArgumentError, "you must pass in a '--config' option"
+rescue Errno::ENOENT => e
+  raise ArgumentError, "config file does not exist:\n#{e.inspect}"
+rescue Errno::EACCES => e
+  raise ArgumentError, "could not open config file:\n#{e.inspect}"
+rescue JSON::ParserError => e
+  raise "config file #{options[:config]} is not json:\n#{e.inspect}"
+end
+
+
+# create nerve object
+s = Nerve::Nerve.new config
+
+# start nerve
+s.run
+
+
+puts "exiting nerve"
data/example/nerve.conf.json
ADDED
@@ -0,0 +1,50 @@
+{
+  "instance_id": "mymachine",
+  "services": {
+    "your_http_service": {
+      "port": 3000,
+      "host": "127.0.0.1",
+      "zk_hosts": ["localhost:2181"],
+      "zk_path": "/nerve/services/your_http_service/services",
+      "check_interval": 2,
+      "checks": [
+        {
+          "type": "http",
+          "uri": "/health",
+          "timeout": 0.2,
+          "rise": 3,
+          "fall": 2
+        }
+      ]
+    },
+    "your_tcp_service": {
+      "port": 6379,
+      "host": "127.0.0.1",
+      "zk_hosts": ["localhost:2181"],
+      "zk_path": "/nerve/services/your_tcp_service/services",
+      "check_interval": 2,
+      "checks": [
+        {
+          "type": "tcp",
+          "timeout": 0.2,
+          "rise": 3,
+          "fall": 2
+        }
+      ]
+    },
+    "rabbitmq_service": {
+      "port": 5672,
+      "host": "127.0.0.1",
+      "zk_hosts": ["localhost:2181"],
+      "zk_path": "/nerve/services/your_rabbitmq_service/services",
+      "check_interval": 2,
+      "checks": [
+        {
+          "type": "rabbitmq",
+          "username": "guest",
+          "password": "guest"
+        }
+      ]
+    }
+  }
+}
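A config file like the example above can be sanity-checked before nerve is started. The sketch below is one possible way to do that, assuming only the required keys that this diff itself enforces (`instance_id` and `services` in `lib/nerve.rb`; `host`, `port`, `zk_hosts` and `zk_path` in `lib/nerve/service_watcher.rb`). The `missing_config_keys` helper and the two constants are hypothetical and not part of nerve.

```ruby
require 'json'

# Hypothetical validator, not part of nerve. The key lists mirror the
# required arguments checked in lib/nerve.rb and lib/nerve/service_watcher.rb.
REQUIRED_TOP_LEVEL   = %w[instance_id services]
REQUIRED_PER_SERVICE = %w[host port zk_hosts zk_path]

# Return a list of missing keys; an empty list means the config looks complete.
def missing_config_keys(config)
  problems = REQUIRED_TOP_LEVEL.reject { |key| config.key?(key) }
  (config['services'] || {}).each do |name, service|
    REQUIRED_PER_SERVICE.each do |key|
      problems << "services.#{name}.#{key}" unless service.key?(key)
    end
  end
  problems
end

# A deliberately incomplete config: zk_path is missing.
raw = <<-CONFIG
{
  "instance_id": "mymachine",
  "services": {
    "your_tcp_service": {
      "port": 6379,
      "host": "127.0.0.1",
      "zk_hosts": ["localhost:2181"],
      "checks": [{ "type": "tcp", "timeout": 0.2 }]
    }
  }
}
CONFIG

config = JSON.parse(raw)
p missing_config_keys(config)  # => ["services.your_tcp_service.zk_path"]
```

Collecting every problem before reporting, rather than raising on the first one as nerve does, makes it easier to fix a config file in one pass.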
data/lib/nerve/log.rb
ADDED
@@ -0,0 +1,24 @@
+module Nerve
+  module Logging
+
+    def log
+      @logger ||= Logging.logger_for(self.class.name)
+    end
+
+    # Use a hash class-ivar to cache a unique Logger per class:
+    @loggers = {}
+
+    class << self
+      def logger_for(classname)
+        @loggers[classname] ||= configure_logger_for(classname)
+      end
+
+      def configure_logger_for(classname)
+        logger = Logger.new(STDERR)
+        logger.level = Logger::INFO unless ENV['DEBUG']
+        logger.progname = classname
+        return logger
+      end
+    end
+  end
+end
data/lib/nerve/reporter.rb
ADDED
@@ -0,0 +1,64 @@
+require 'zk'
+
+module Nerve
+  class Reporter
+    include Utils
+    include Logging
+
+    def initialize(opts)
+      %w{hosts path key}.each do |required|
+        raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
+      end
+
+      @path = opts['hosts'].shuffle.join(',') + opts['path']
+      @data = parse_data(opts['data'] || '')
+      @key = opts['key']
+      @key.insert(0,'/') unless @key[0] == '/'
+    end
+
+    def start()
+      log.info "nerve: waiting to connect to zookeeper at #{@path}"
+      @zk = ZK.new(@path)
+
+      log.info "nerve: successfully created zk connection to #{@path}"
+    end
+
+    def report_up()
+      zk_save
+    end
+
+    def report_down
+      zk_delete
+    end
+
+    def update_data(new_data='')
+      @data = parse_data(new_data)
+      zk_save
+    end
+
+    def ping?
+      return @zk.ping?
+    end
+
+    private
+
+    def zk_delete
+      @zk.delete(@key, :ignore => :no_node)
+    end
+
+    def zk_save
+      log.debug "nerve: writing data #{@data.class} to zk at #{@key} with #{@data.inspect}"
+      begin
+        @zk.set(@key,@data)
+      rescue ZK::Exceptions::NoNode => e
+        @zk.create(@key,:data => @data, :mode => :ephemeral)
+      end
+    end
+
+    def parse_data(data)
+      return data if data.class == String
+      return data.to_json
+    end
+
+  end
+end
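The reporter above builds three plain values before it ever talks to ZooKeeper: a connection string (shuffled hosts joined by commas, followed by the path as a chroot suffix), a node key with a leading slash, and a string payload. The sketch below restates those three steps as pure functions so they can be tried without a ZooKeeper server. The helper names `zk_connection_string`, `normalize_key` and `serialize` are hypothetical and not part of nerve; unlike `@key.insert(0,'/')` in the reporter, `normalize_key` returns a new string and leaves its argument unmodified.

```ruby
require 'json'

# Hypothetical helpers, not part of nerve; they mirror Reporter#initialize.

# Join the hosts in random order and append the path as a chroot suffix.
def zk_connection_string(hosts, path)
  hosts.shuffle.join(',') + path
end

# Ensure the node key starts with a slash, without mutating the argument.
def normalize_key(key)
  key.start_with?('/') ? key : "/#{key}"
end

# Strings are stored as-is; anything else is stored as JSON.
def serialize(data)
  data.is_a?(String) ? data : data.to_json
end

puts zk_connection_string(['localhost:2181'], '/nerve/services/your_http_service/services')
puts normalize_key('mymachine_your_http_service')
puts serialize('host' => '127.0.0.1', 'port' => 3000)
```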
data/lib/nerve/ring_buffer.rb
ADDED
@@ -0,0 +1,30 @@
+module Nerve
+  class RingBuffer < Array
+    alias_method :array_push, :push
+    alias_method :array_element, :[]
+
+    def initialize( size )
+      @ring_size = size.to_i
+      super( @ring_size )
+    end
+
+    def average
+      self.inject(0.0) { |sum, el| sum + el } / self.size
+    end
+
+    def push( element )
+      if length == @ring_size
+        shift # loose element
+      end
+      array_push element
+    end
+
+    # Access elements in the RingBuffer
+    #
+    # offset will be typically negative!
+    #
+    def []( offset = 0 )
+      return self.array_element( - 1 + offset )
+    end
+  end
+end
data/lib/nerve/service_watcher/base.rb
ADDED
@@ -0,0 +1,51 @@
+module Nerve
+  module ServiceCheck
+    class BaseServiceCheck
+      include Utils
+      include Logging
+
+      def initialize(opts={})
+        @timeout = opts['timeout'] ? opts['timeout'].to_f : 0.1
+        @rise = opts['rise'] ? opts['rise'].to_i : 1
+        @fall = opts['fall'] ? opts['fall'].to_i : 1
+        @name = opts['name'] ? opts['name'] : "undefined"
+
+        @check_buffer = RingBuffer.new([@rise, @fall].max)
+        @last_result = nil
+      end
+
+      def up?
+        # do the check
+        check_result = !!ignore_errors do
+          check
+        end
+
+        # this is the first check -- initialize buffer
+        if @last_result == nil
+          @last_result = check_result
+          @check_buffer.size.times {@check_buffer.push check_result}
+          log.info "nerve: service check #{@name} initial check returned #{check_result}"
+        end
+
+        log.debug "nerve: service check #{@name} returned #{check_result}"
+        @check_buffer.push(check_result)
+
+        # we've failed if the last @fall times are false
+        unless @check_buffer.last(@fall).reduce(:|)
+          log.info "nerve: service check #{@name} transitions to down after #{@fall} failures" if @last_result
+          @last_result = false
+        end
+
+        # we've succeeded if the last @rise times is true
+        if @check_buffer.last(@rise).reduce(:&)
+          log.info "nerve: service check #{@name} transitions to up after #{@rise} successes" unless @last_result
+          @last_result = true
+        end
+
+        # otherwise return the last result
+        return @last_result
+      end
+    end
+  end
+end
+
data/lib/nerve/service_watcher/http.rb
ADDED
@@ -0,0 +1,62 @@
+require 'nerve/service_watcher/base'
+
+module Nerve
+  module ServiceCheck
+    require 'net/http'
+
+    class HttpServiceCheck < BaseServiceCheck
+      def initialize(opts={})
+        super
+
+        %w{port uri}.each do |required|
+          raise ArgumentError, "missing required argument #{required} in http check" unless
+            opts[required]
+          instance_variable_set("@#{required}",opts[required])
+        end
+
+        @host = opts['host'] || '127.0.0.1'
+        @ssl = opts['ssl'] || false
+
+        @read_timeout = opts['read_timeout'] || @timeout
+        @open_timeout = opts['open_timeout'] || 0.2
+        @ssl_timeout = opts['ssl_timeout'] || 0.2
+
+        @name = "http-#{@host}:#{@port}#{@uri}"
+      end
+
+      def check
+        log.debug "running health check #{@name}"
+
+        connection = get_connection
+        response = connection.get(@uri)
+        code = response.code.to_i
+
+        log.debug "nerve: check #{@name} got response code #{code}"
+        if code >= 200 and code < 300
+          return true
+        else
+          return false
+        end
+      end
+
+      private
+      def get_connection
+        con = Net::HTTP.new(@host, @port)
+        con.read_timeout = @read_timeout
+        con.open_timeout = @open_timeout
+
+        if @ssl
+          con.use_ssl = true
+          con.ssl_timeout = @ssl_timeout
+          con.verify_mode = OpenSSL::SSL::VERIFY_NONE
+        end
+
+        return con
+      end
+
+    end
+
+    CHECKS ||= {}
+    CHECKS['http'] = HttpServiceCheck
+  end
+end
data/lib/nerve/service_watcher/rabbitmq.rb
ADDED
@@ -0,0 +1,68 @@
+require 'nerve/service_watcher/base'
+require 'bunny'
+
+module Nerve
+  module ServiceCheck
+    class RabbitMQServiceCheck < BaseServiceCheck
+      require 'socket'
+      include Socket::Constants
+
+      def initialize(opts={})
+        super
+
+        raise ArgumentError, "missing required argument 'port' in rabbitmq check" unless opts['port']
+
+        @port = opts['port']
+        @host = opts['host'] || '127.0.0.1'
+        @user = opts['username'] || 'guest'
+        @pass = opts['password'] || 'guest'
+      end
+
+      def check
+        # the idea for this check was taken from the one in rabbitmq management
+        # -- the aliveness_test:
+        # https://github.com/rabbitmq/rabbitmq-management/blob/9a8e3d1ab5144e3f6a1cb9a4639eb738713b926d/src/rabbit_mgmt_wm_aliveness_test.erl
+        log.debug "nerve: running rabbitmq health check #{@name}"
+
+        conn = Bunny.new(
+          :host => @host,
+          :port => @port,
+          :user => @user,
+          :pass => @pass,
+          :log_file => STDERR,
+          :continuation_timeout => @timeout,
+          :automatically_recover => false,
+          :heartbeat => false,
+          :threaded => false
+        )
+
+        begin
+          conn.start
+          ch = conn.create_channel
+
+          # create a queue, publish to it
+          log.debug "nerve: publishing to rabbitmq"
+          ch.queue('nerve')
+          ch.basic_publish('nerve test message', '', 'nerve', :mandatory => true, :expiration => 2 * 1000)
+
+          # read and ack the message
+          log.debug "nerve: consuming from rabbitmq"
+          delivery_info, properties, payload = ch.basic_get('nerve', :ack => true)
+
+          if payload
+            ch.acknowledge(delivery_info.delivery_tag)
+            return true
+          else
+            log.debug "nerve: rabbitmq consumption returned no payload"
+            return false
+          end
+        ensure
+          conn.close
+        end
+      end
+    end
+
+    CHECKS ||= {}
+    CHECKS['rabbitmq'] = RabbitMQServiceCheck
+  end
+end
data/lib/nerve/service_watcher/tcp.rb
ADDED
@@ -0,0 +1,56 @@
+require 'nerve/service_watcher/base'
+
+module Nerve
+  module ServiceCheck
+    class TcpServiceCheck < BaseServiceCheck
+      require 'socket'
+      include Socket::Constants
+
+      def initialize(opts={})
+        super
+
+        raise ArgumentError, "missing required argument 'port' in tcp check" unless opts['port']
+
+        @port = opts['port']
+        @host = opts['host'] || '127.0.0.1'
+
+        @address = Socket.sockaddr_in(@port, @host)
+      end
+
+      def check
+        log.debug "nerve: running TCP health check #{@name}"
+
+        # create a TCP socket
+        socket = Socket.new(AF_INET, SOCK_STREAM, 0)
+
+        begin
+          # open a non-blocking connection
+          socket.connect_nonblock(@address)
+        rescue Errno::EINPROGRESS
+          # opening a non-blocking socket will usually raise
+          # this exception. it's just connect returning immediately,
+          # so it's not really an exception, but ruby makes it into
+          # one. if we got here, we are now free to wait until the timeout
+          # expires for the socket to be writeable
+          IO.select(nil, [socket], nil, @timeout)
+
+          # we should be connected now; allow any other exception through
+          begin
+            socket.connect_nonblock(@address)
+          rescue Errno::EISCONN
+            return true
+          end
+        else
+          # we managed to connect REALLY REALLY FAST
+          log.debug "nerve: connected to non-blocking socket without an exception"
+          return true
+        ensure
+          socket.close
+        end
+      end
+    end
+
+    CHECKS ||= {}
+    CHECKS['tcp'] = TcpServiceCheck
+  end
+end
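The TCP check above implements a connect-with-timeout by hand using `connect_nonblock` and `IO.select`. For comparison, the same idea can be sketched with the standard library's `Socket.tcp`, which accepts a `connect_timeout` option on Ruby 2.0 and later. This is an alternative technique and not what nerve 0.2.1 does; the `tcp_port_open?` helper is a hypothetical name.

```ruby
require 'socket'

# Hypothetical helper, not part of nerve: returns true if a TCP connection
# to host:port can be established within `timeout` seconds, false otherwise.
def tcp_port_open?(host, port, timeout = 0.2)
  # Socket.tcp closes the socket when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { |_socket| true }
rescue SystemCallError, SocketError, IOError
  # Refused connections, timeouts and name-resolution failures all count as down.
  false
end

# Start a throwaway listener on an OS-assigned local port to demonstrate.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
puts tcp_port_open?('127.0.0.1', port)
server.close
puts tcp_port_open?('127.0.0.1', port)
```

The hand-rolled version in the diff gives finer control over the individual steps; the sketch trades that for brevity.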
data/lib/nerve/service_watcher.rb
ADDED
@@ -0,0 +1,96 @@
+require 'nerve/service_watcher/tcp'
+require 'nerve/service_watcher/http'
+require 'nerve/service_watcher/rabbitmq'
+
+module Nerve
+  class ServiceWatcher
+    include Utils
+    include Logging
+
+    def initialize(service={})
+      log.debug "nerve: creating service watcher object"
+
+      # check that we have all of the required arguments
+      %w{name instance_id host port zk_hosts zk_path}.each do |required|
+        raise ArgumentError, "missing required argument #{required} for new service watcher" unless service[required]
+      end
+
+      @name = service['name']
+
+      # configure the reporter, which we use for talking to zookeeper
+      @reporter = Reporter.new({
+          'hosts' => service['zk_hosts'],
+          'path' => service['zk_path'],
+          'key' => "#{service['instance_id']}_#{@name}",
+          'data' => {'host' => service['host'], 'port' => service['port']},
+        })
+
+      # instantiate the checks for this service
+      @service_checks = []
+      service['checks'] ||= []
+      service['checks'].each do |check|
+        check['type'] ||= "undefined"
+        begin
+          service_check_class = ServiceCheck::CHECKS[check['type']]
+        rescue
+          raise ArgumentError,
+            "invalid service check type #{check['type']}; valid types: #{ServiceCheck::CHECKS.keys.join(',')}"
+        end
+
+        check['host'] ||= service['host']
+        check['port'] ||= service['port']
+        check['name'] ||= "#{@name} #{check['type']}-#{check['host']}:#{check['port']}"
+        @service_checks << service_check_class.new(check)
+      end
+
+      # how often do we initiate service checks?
+      @check_interval = service['check_interval'] || 0.5
+
+      log.debug "nerve: created service watcher for #{@name} with #{@service_checks.size} checks"
+    end
+
+    def run()
+      log.info "nerve: starting service watch #{@name}"
+
+      # begin by reporting down
+      @reporter.start()
+      @reporter.report_down
+      was_up = false
+
+      until $EXIT
+        @reporter.ping?
+
+        # what is the status of the service?
+        is_up = check?
+        log.debug "nerve: current service status for #{@name} is #{is_up.inspect}"
+
+        if is_up != was_up
+          if is_up
+            @reporter.report_up
+            log.info "nerve: service #{@name} is now up"
+          else
+            @reporter.report_down
+            log.warn "nerve: service #{@name} is now down"
+          end
+          was_up = is_up
+        end
+
+        # wait to run more checks
+        sleep @check_interval
+      end
+    rescue StandardError => e
+      log.error "nerve: error in service watcher #{@name}: #{e}"
+      raise e
+    ensure
+      log.info "nerve: ending service watch #{@name}"
+      $EXIT = true
+    end
+
+    def check?
+      @service_checks.each do |check|
+        return false unless check.up?
+      end
+      return true
+    end
+  end
+end
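The watcher above resolves each check's `type` string through the `ServiceCheck::CHECKS` hash. One detail worth noting: `Hash#[]` returns `nil` for an unknown key instead of raising, so a `rescue` wrapped around a plain `[]` lookup does not fire for a misspelled type. The sketch below shows a lookup that fails loudly using `Hash#fetch`. The `DummyCheck` class, the `CHECK_REGISTRY` constant and the `lookup_check` helper are hypothetical stand-ins, not nerve's own names.

```ruby
# Hypothetical stand-in for one of nerve's check classes.
class DummyCheck
  def initialize(opts = {})
    @opts = opts
  end
end

# Hypothetical registry mapping a type string to a check class.
CHECK_REGISTRY = { 'dummy' => DummyCheck }

# Return the class registered for `type`, or raise a descriptive ArgumentError.
def lookup_check(type)
  CHECK_REGISTRY.fetch(type) do
    raise ArgumentError,
          "invalid service check type #{type.inspect}; valid types: #{CHECK_REGISTRY.keys.join(',')}"
  end
end

p lookup_check('dummy')  # => DummyCheck

begin
  lookup_check('udp')
rescue ArgumentError => e
  puts e.message  # invalid service check type "udp"; valid types: dummy
end
```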
data/lib/nerve/utils.rb
ADDED
@@ -0,0 +1,17 @@
+module Nerve
+  module Utils
+    def safe_run(command)
+      res = `#{command}`.chomp
+      raise "command '#{command}' failed to run:\n#{res}" unless $?.success?
+    end
+
+    def ignore_errors(&block)
+      begin
+        return yield
+      rescue Object => error
+        log.debug "ignoring error #{error.inspect}"
+        return false
+      end
+    end
+  end
+end
data/lib/nerve/version.rb
CHANGED
data/lib/nerve.rb
CHANGED
@@ -1,5 +1,69 @@
-require
+require 'logger'
+require 'json'
+require 'timeout'
+
+require 'nerve/version'
+require 'nerve/utils'
+require 'nerve/log'
+require 'nerve/ring_buffer'
+require 'nerve/reporter'
+require 'nerve/service_watcher'
 
 module Nerve
-
+  class Nerve
+
+    include Logging
+
+    def initialize(opts={})
+      # set global variable for exit signal
+      $EXIT = false
+
+      # trap int signal and set exit to true
+      %w{INT TERM}.each do |signal|
+        trap(signal) do
+          $EXIT = true
+        end
+      end
+
+      log.info 'nerve: starting up!'
+
+      # required options
+      log.debug 'nerve: checking for required inputs'
+      %w{instance_id services}.each do |required|
+        raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
+      end
+
+      @instance_id = opts['instance_id']
+
+      # create service watcher objects
+      log.debug 'nerve: creating service watchers'
+      @service_watchers=[]
+      opts['services'].each do |name, config|
+        @service_watchers << ServiceWatcher.new(config.merge({'instance_id' => @instance_id, 'name' => name}))
+      end
+
+      log.debug 'nerve: completed init'
+    end
+
+    def run
+      log.info 'nerve: starting run'
+      begin
+        children = []
+
+        log.debug 'nerve: launching service check threads'
+        @service_watchers.each do |watcher|
+          children << Thread.new{watcher.run}
+        end
+
+        log.debug 'nerve: main thread done, waiting for children'
+        children.each do |child|
+          child.join
+        end
+      ensure
+        $EXIT = true
+      end
+      log.info 'nerve: exiting'
+    end
+
+  end
 end
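The `run` method above starts one thread per service watcher, and every watcher loop polls a shared exit flag so that a single signal stops them all. The sketch below shows that pattern on its own, with a local variable in place of the global `$EXIT` and a counter in place of real service checks. All names here (`stop`, `counts`, `threads`) are illustrative and not taken from nerve.

```ruby
# Shared stop flag; the worker blocks close over this local variable.
stop = false

# One counter per worker, standing in for "run the checks once".
counts = Array.new(3, 0)

threads = counts.each_index.map do |i|
  Thread.new do
    until stop
      counts[i] += 1   # each worker only touches its own slot
      sleep 0.01       # stands in for check_interval
    end
  end
end

sleep 0.05             # let the workers run for a moment
stop = true            # what a signal handler would do
threads.each(&:join)   # wait for every worker to notice the flag and finish

puts "all workers stopped: #{threads.none?(&:alive?)}"
```

Because each loop only checks the flag between iterations, shutdown latency is bounded by the sleep interval plus the duration of one iteration.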
data/nerve.gemspec
CHANGED
@@ -6,8 +6,8 @@ require 'nerve/version'
 Gem::Specification.new do |gem|
   gem.name = "nerve"
   gem.version = Nerve::VERSION
-  gem.authors = ["Martin Rhoads"]
-  gem.email = ["martin.rhoads@airbnb.com"]
+  gem.authors = ["Martin Rhoads", "Igor Serebryany", "Pierre Carrier"]
+  gem.email = ["martin.rhoads@airbnb.com", "igor.serebryany@airbnb.com"]
   gem.description = %q{description}
   gem.summary = %q{summary}
   gem.homepage = ""
@@ -16,4 +16,7 @@ Gem::Specification.new do |gem|
   gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
   gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
   gem.require_paths = ["lib"]
+
+  gem.add_runtime_dependency "zk", "~> 1.9.2"
+  gem.add_runtime_dependency "bunny", "= 1.0.0.rc2"
 end
metadata
CHANGED
@@ -1,29 +1,78 @@
 --- !ruby/object:Gem::Specification
 name: nerve
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.1
 prerelease:
 platform: ruby
 authors:
 - Martin Rhoads
+- Igor Serebryany
+- Pierre Carrier
 autorequire:
 bindir: bin
 cert_chain: []
-date:
-dependencies:
+date: 2013-10-24 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: zk
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 1.9.2
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 1.9.2
+- !ruby/object:Gem::Dependency
+  name: bunny
+  requirement: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - '='
+      - !ruby/object:Gem::Version
+        version: 1.0.0.rc2
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - '='
+      - !ruby/object:Gem::Version
+        version: 1.0.0.rc2
 description: description
 email:
 - martin.rhoads@airbnb.com
-
+- igor.serebryany@airbnb.com
+executables:
+- nerve
 extensions: []
 extra_rdoc_files: []
 files:
 - .gitignore
+- .nerve.rc
 - Gemfile
 - LICENSE.txt
 - README.md
 - Rakefile
+- Vagrantfile
+- bin/nerve
+- example/nerve.conf.json
 - lib/nerve.rb
+- lib/nerve/log.rb
+- lib/nerve/reporter.rb
+- lib/nerve/ring_buffer.rb
+- lib/nerve/service_watcher.rb
+- lib/nerve/service_watcher/base.rb
+- lib/nerve/service_watcher/http.rb
+- lib/nerve/service_watcher/rabbitmq.rb
+- lib/nerve/service_watcher/tcp.rb
+- lib/nerve/utils.rb
 - lib/nerve/version.rb
 - nerve.gemspec
 homepage: ''