nerve 0.0.1 → 0.2.1
- data/.gitignore +5 -0
- data/.nerve.rc +2 -0
- data/Gemfile +0 -2
- data/LICENSE.txt +2 -2
- data/README.md +50 -2
- data/Vagrantfile +121 -0
- data/bin/nerve +56 -0
- data/example/nerve.conf.json +50 -0
- data/lib/nerve/log.rb +24 -0
- data/lib/nerve/reporter.rb +64 -0
- data/lib/nerve/ring_buffer.rb +30 -0
- data/lib/nerve/service_watcher/base.rb +51 -0
- data/lib/nerve/service_watcher/http.rb +62 -0
- data/lib/nerve/service_watcher/rabbitmq.rb +68 -0
- data/lib/nerve/service_watcher/tcp.rb +56 -0
- data/lib/nerve/service_watcher.rb +96 -0
- data/lib/nerve/utils.rb +17 -0
- data/lib/nerve/version.rb +1 -1
- data/lib/nerve.rb +66 -2
- data/nerve.gemspec +5 -2
- metadata +53 -4
data/.gitignore
CHANGED
data/.nerve.rc
ADDED
data/Gemfile
CHANGED
data/LICENSE.txt
CHANGED
@@ -1,4 +1,4 @@
|
|
1
|
-
Copyright (c)
|
1
|
+
Copyright (c) 2013 Airbnb, Inc.
|
2
2
|
|
3
3
|
MIT License
|
4
4
|
|
@@ -19,4 +19,4 @@ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
|
|
19
19
|
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
|
20
20
|
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
21
21
|
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
|
22
|
-
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
22
|
+
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
data/README.md
CHANGED
@@ -1,7 +1,21 @@
|
|
1
1
|
# Nerve
|
2
2
|
|
3
|
+
Nerve is a utility for tracking the status of machines and services.
|
4
|
+
It runs locally on the boxes which make up a distributed system, and reports state information to a distributed key-value store.
|
5
|
+
At Airbnb, we use Zookeeper as our key-value store.
|
6
|
+
The combination of Nerve and [Synapse](https://github.com/airbnb/synapse) makes service discovery in the cloud easy!
|
3
7
|
|
4
|
-
##
|
8
|
+
## Motivation ##
|
9
|
+
|
10
|
+
We already use [Synapse](https://github.com/airbnb/synapse) to discover remote services.
|
11
|
+
However, those services needed boilerplate code to register themselves in [Zookeeper](https://zookeeper.apache.org/).
|
12
|
+
Nerve simplifies underlying services, enables code reuse, and allows us to create a more composable system.
|
13
|
+
It does so by factoring out the boilerplate into its own application, which independently handles monitoring and reporting.
|
14
|
+
|
15
|
+
Beyond those benefits, nerve also acts as a general watchdog on systems.
|
16
|
+
The information it reports can be used to take action from a centralized automation center: actions like scaling distributed systems up or down, or alerting ops or engineering about downtime.
|
17
|
+
|
18
|
+
## Installation ##
|
5
19
|
|
6
20
|
Add this line to your application's Gemfile:
|
7
21
|
|
@@ -15,8 +29,42 @@ Or install it yourself as:
|
|
15
29
|
|
16
30
|
$ gem install nerve
|
17
31
|
|
18
|
-
##
|
32
|
+
## Configuration ##
|
33
|
+
|
34
|
+
Nerve depends on a single configuration file, in json format.
|
35
|
+
It is usually called `nerve.conf.json`.
|
36
|
+
An example config file is available in `example/nerve.conf.json`.
|
37
|
+
The config file is composed of two main sections:
|
38
|
+
|
39
|
+
* `instance_id`: the name under which your services will be registered in zookeeper
|
40
|
+
* `services`: the hash (from service name to config) of the services nerve will be monitoring
|
41
|
+
|
42
|
+
### Services Config ###
|
43
|
+
|
44
|
+
Each service that nerve will be monitoring is specified in the `services` hash.
|
45
|
+
The key is the name of the service, and the value is a configuration hash telling nerve how to monitor the service.
|
46
|
+
The configuration contains the following options:
|
47
|
+
|
48
|
+
* `port`: the default port for service checks; nerve will submit the address `instance_id:port` to Zookeeper
|
49
|
+
* `host`: the default host on which to make service checks; you should make this your *public* ip if you want to make sure your service is publicly accessible
|
50
|
+
* `zk_hosts`: a list of the zookeeper hosts comprising the [ensemble](https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkMulitServerSetup) that nerve will submit registration to
|
51
|
+
* `zk_path`: the path (or [znode](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_zkDataModel_znodes)) where the registration will be created; nerve will create the [ephemeral node](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#Ephemeral+Nodes) that is the registration as a child of this path
|
52
|
+
* `check_interval`: the frequency with which service checks will be initiated; defaults to `500ms`
|
53
|
+
* `checks`: a list of checks that nerve will perform; if all of them pass, the service will be registered; otherwise, it will be un-registered
|
54
|
+
|
55
|
+
### Checks ###
|
56
|
+
|
57
|
+
The core of nerve is a set of service checks.
|
58
|
+
Each service can define a number of checks, and all of them must pass for the service to be registered.
|
59
|
+
Although the exact parameters passed to each check are different, all take a number of common arguments:
|
19
60
|
|
61
|
+
* `type`: (required) the kind of check; you can see available check types in the `lib/nerve/service_watcher` dir of this repo
|
62
|
+
* `name`: (optional) a descriptive, human-readable name for the check; it will be auto-generated based on the other parameters if not specified
|
63
|
+
* `host`: (optional) the host on which the check will be performed; defaults to the `host` of the service to which the check belongs
|
64
|
+
* `port`: (optional) the port on which the check will be performed; like `host`, it defaults to the `port` of the service
|
65
|
+
* `timeout`: (optional) maximum time the check can take; defaults to `100ms`
|
66
|
+
* `rise`: (optional) how many consecutive checks must pass before the check is considered passing; defaults to 1
|
67
|
+
* `fall`: (optional) how many consecutive checks must fail before the check is considered failing; defaults to 1
|
20
68
|
|
21
69
|
## Contributing
|
22
70
|
|
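To make the configuration options described in the README concrete, here is a minimal Ruby sketch that parses a small config and reports any of the listed keys that are missing. The helper `missing_config_keys`, the constant names, the service name `web`, and the sample values are illustrative assumptions and are not part of nerve itself.

```ruby
require 'json'

# Keys the README and the watcher code describe as required; treating exactly
# this set as mandatory is an assumption made for the sketch.
REQUIRED_TOP_LEVEL = %w[instance_id services].freeze
REQUIRED_PER_SERVICE = %w[host port zk_hosts zk_path].freeze

# Hypothetical helper: returns the names of any missing settings.
def missing_config_keys(config)
  missing = REQUIRED_TOP_LEVEL.reject { |key| config[key] }
  (config['services'] || {}).each do |name, service|
    REQUIRED_PER_SERVICE.each do |key|
      missing << "services.#{name}.#{key}" unless service[key]
    end
  end
  missing
end

# Sample values only; adjust hosts, ports, and paths for a real deployment.
SAMPLE_CONFIG = JSON.parse(<<~CONFIG)
  {
    "instance_id": "mymachine",
    "services": {
      "web": {
        "host": "127.0.0.1",
        "port": 3000,
        "zk_hosts": ["localhost:2181"],
        "zk_path": "/nerve/services/web/services",
        "check_interval": 2,
        "checks": [
          { "type": "http", "uri": "/health", "timeout": 0.2, "rise": 3, "fall": 2 }
        ]
      }
    }
  }
CONFIG

puts missing_config_keys(SAMPLE_CONFIG).inspect
```

Running a check like this before starting nerve may surface a missing `zk_path` or `zk_hosts` earlier than the `ArgumentError` raised at startup.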
data/Vagrantfile
ADDED
@@ -0,0 +1,121 @@
|
|
1
|
+
# -*- mode: ruby -*-
|
2
|
+
# vi: set ft=ruby :
|
3
|
+
|
4
|
+
unless ENV['COOKBOOK_DIR'] and ENV['DATA_BAG_DIR']
|
5
|
+
STDERR.puts "you need to set COOKBOOK_DIR and DATA_BAG_DIR as environment variables"
|
6
|
+
Kernel.exit 1
|
7
|
+
end
|
8
|
+
|
9
|
+
this_dir = File.dirname(File.expand_path __FILE__)
|
10
|
+
|
11
|
+
STDERR.puts "mounting #{this_dir} as /root/this_dir"
|
12
|
+
|
13
|
+
Vagrant::Config.run do |config|
|
14
|
+
# All Vagrant configuration is done here. The most common configuration
|
15
|
+
# options are documented and commented below. For a complete reference,
|
16
|
+
# please see the online documentation at vagrantup.com.
|
17
|
+
|
18
|
+
# Every Vagrant virtual environment requires a box to build off of.
|
19
|
+
config.vm.box = "precise64"
|
20
|
+
|
21
|
+
# The url from where the 'config.vm.box' box will be fetched if it
|
22
|
+
# doesn't already exist on the user's system.
|
23
|
+
# config.vm.box_url = "http://domain.com/path/to/above.box"
|
24
|
+
|
25
|
+
config.vm.box_url = 'http://files.vagrantup.com/precise64.box'
|
26
|
+
|
27
|
+
# Boot with a GUI so you can see the screen. (Default is headless)
|
28
|
+
# config.vm.boot_mode = :gui
|
29
|
+
|
30
|
+
# Assign this VM to a host-only network IP, allowing you to access it
|
31
|
+
# via the IP. Host-only networks can talk to the host machine as well as
|
32
|
+
# any other machines on the same network, but cannot be accessed (through this
|
33
|
+
# network interface) by any external networks.
|
34
|
+
# config.vm.network :hostonly, "192.168.33.10"
|
35
|
+
|
36
|
+
# Assign this VM to a bridged network, allowing you to connect directly to a
|
37
|
+
# network using the host's network device. This makes the VM appear as another
|
38
|
+
# physical device on your network.
|
39
|
+
# config.vm.network :bridged
|
40
|
+
|
41
|
+
# Forward a port from the guest to the host, which allows for outside
|
42
|
+
# computers to access the VM, whereas host only networking does not.
|
43
|
+
# config.vm.forward_port 80, 8080
|
44
|
+
|
45
|
+
# Share an additional folder to the guest VM. The first argument is
|
46
|
+
# an identifier, the second is the path on the guest to mount the
|
47
|
+
# folder, and the third is the path on the host to the actual folder.
|
48
|
+
# config.vm.share_folder "v-data", "/vagrant_data", "../data"
|
49
|
+
|
50
|
+
config.vm.share_folder 'this_dir', '/this_dir', this_dir
|
51
|
+
|
52
|
+
# Enable provisioning with Puppet stand alone. Puppet manifests
|
53
|
+
# are contained in a directory path relative to this Vagrantfile.
|
54
|
+
# You will need to create the manifests directory and a manifest in
|
55
|
+
# the file base.pp in the manifests_path directory.
|
56
|
+
#
|
57
|
+
# An example Puppet manifest to provision the message of the day:
|
58
|
+
#
|
59
|
+
# # group { "puppet":
|
60
|
+
# # ensure => "present",
|
61
|
+
# # }
|
62
|
+
# #
|
63
|
+
# # File { owner => 0, group => 0, mode => 0644 }
|
64
|
+
# #
|
65
|
+
# # file { '/etc/motd':
|
66
|
+
# # content => "Welcome to your Vagrant-built virtual machine!
|
67
|
+
# # Managed by Puppet.\n"
|
68
|
+
# # }
|
69
|
+
#
|
70
|
+
# config.vm.provision :puppet do |puppet|
|
71
|
+
# puppet.manifests_path = "manifests"
|
72
|
+
# puppet.manifest_file = "base.pp"
|
73
|
+
# end
|
74
|
+
|
75
|
+
# Enable provisioning with chef solo, specifying a cookbooks path, roles
|
76
|
+
# path, and data_bags path (all relative to this Vagrantfile), and adding
|
77
|
+
# some recipes and/or roles.
|
78
|
+
#
|
79
|
+
config.vm.provision :chef_solo do |chef|
|
80
|
+
# chef.cookbooks_path = "../my-recipes/cookbooks"
|
81
|
+
# chef.roles_path = "../my-recipes/roles"
|
82
|
+
# chef.data_bags_path = '/tmp/foo'
|
83
|
+
# chef.add_recipe "mysql"
|
84
|
+
# chef.add_role "web"
|
85
|
+
|
86
|
+
# You may also specify custom JSON attributes:
|
87
|
+
# chef.json = { :mysql_password => "foo" }
|
88
|
+
|
89
|
+
chef.data_bags_path = ENV['DATA_BAG_DIR']
|
90
|
+
chef.cookbooks_path = ENV['COOKBOOK_DIR']
|
91
|
+
chef.add_recipe 'vagrant'
|
92
|
+
chef.add_recipe 'nerve'
|
93
|
+
end
|
94
|
+
|
95
|
+
# Enable provisioning with chef server, specifying the chef server URL,
|
96
|
+
# and the path to the validation key (relative to this Vagrantfile).
|
97
|
+
#
|
98
|
+
# The Opscode Platform uses HTTPS. Substitute your organization for
|
99
|
+
# ORGNAME in the URL and validation key.
|
100
|
+
#
|
101
|
+
# If you have your own Chef Server, use the appropriate URL, which may be
|
102
|
+
# HTTP instead of HTTPS depending on your configuration. Also change the
|
103
|
+
# validation key to validation.pem.
|
104
|
+
#
|
105
|
+
# config.vm.provision :chef_client do |chef|
|
106
|
+
# chef.chef_server_url = "https://api.opscode.com/organizations/ORGNAME"
|
107
|
+
# chef.validation_key_path = "ORGNAME-validator.pem"
|
108
|
+
# end
|
109
|
+
#
|
110
|
+
# If you're using the Opscode platform, your validator client is
|
111
|
+
# ORGNAME-validator, replacing ORGNAME with your organization name.
|
112
|
+
#
|
113
|
+
# IF you have your own Chef Server, the default validation client name is
|
114
|
+
# chef-validator, unless you changed the configuration.
|
115
|
+
#
|
116
|
+
# chef.validation_client_name = "ORGNAME-validator"
|
117
|
+
|
118
|
+
# Vagrant::Config.run do |config|
|
119
|
+
# config.vm.provision :shell, :path => "vagrant/init.sh"
|
120
|
+
# end
|
121
|
+
end
|
data/bin/nerve
ADDED
@@ -0,0 +1,56 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
|
3
|
+
require 'json'
|
4
|
+
require 'optparse'
|
5
|
+
|
6
|
+
require 'nerve'
|
7
|
+
|
8
|
+
options={}
|
9
|
+
|
10
|
+
# set command line options
|
11
|
+
optparse = OptionParser.new do |opts|
|
12
|
+
opts.banner =<<EOB
|
13
|
+
Welcome to nerve
|
14
|
+
|
15
|
+
Usage: nerve --config /path/to/nerve/config
|
16
|
+
EOB
|
17
|
+
|
18
|
+
options[:config] = ENV['NERVE_CONFIG']
|
19
|
+
opts.on('-c config','--config config', String, 'path to nerve config') do |key,value|
|
20
|
+
options[:config] = key
|
21
|
+
end
|
22
|
+
|
23
|
+
opts.on( '-h', '--help', 'Display this screen' ) do
|
24
|
+
puts opts
|
25
|
+
exit
|
26
|
+
end
|
27
|
+
|
28
|
+
end
|
29
|
+
|
30
|
+
|
31
|
+
# parse command line arguments
|
32
|
+
optparse.parse!
|
33
|
+
|
34
|
+
|
35
|
+
# parse nerve config file
|
36
|
+
begin
|
37
|
+
config = JSON::parse(File.read(options[:config]))
|
38
|
+
rescue TypeError => e
|
39
|
+
raise ArgumentError, "you must pass in a '--config' option"
|
40
|
+
rescue Errno::ENOENT => e
|
41
|
+
raise ArgumentError, "config file does not exist:\n#{e.inspect}"
|
42
|
+
rescue Errno::EACCES => e
|
43
|
+
raise ArgumentError, "could not open config file:\n#{e.inspect}"
|
44
|
+
rescue JSON::ParserError => e
|
45
|
+
raise "config file #{options[:config]} is not json:\n#{e.inspect}"
|
46
|
+
end
|
47
|
+
|
48
|
+
|
49
|
+
# create nerve object
|
50
|
+
s = Nerve::Nerve.new config
|
51
|
+
|
52
|
+
# start nerve
|
53
|
+
s.run
|
54
|
+
|
55
|
+
|
56
|
+
puts "exiting nerve"
|
data/example/nerve.conf.json
ADDED
@@ -0,0 +1,50 @@
|
|
1
|
+
{
|
2
|
+
"instance_id": "mymachine",
|
3
|
+
"services": {
|
4
|
+
"your_http_service": {
|
5
|
+
"port": 3000,
|
6
|
+
"host": "127.0.0.1",
|
7
|
+
"zk_hosts": ["localhost:2181"],
|
8
|
+
"zk_path": "/nerve/services/your_http_service/services",
|
9
|
+
"check_interval": 2,
|
10
|
+
"checks": [
|
11
|
+
{
|
12
|
+
"type": "http",
|
13
|
+
"uri": "/health",
|
14
|
+
"timeout": 0.2,
|
15
|
+
"rise": 3,
|
16
|
+
"fall": 2
|
17
|
+
}
|
18
|
+
]
|
19
|
+
},
|
20
|
+
"your_tcp_service": {
|
21
|
+
"port": 6379,
|
22
|
+
"host": "127.0.0.1",
|
23
|
+
"zk_hosts": ["localhost:2181"],
|
24
|
+
"zk_path": "/nerve/services/your_tcp_service/services",
|
25
|
+
"check_interval": 2,
|
26
|
+
"checks": [
|
27
|
+
{
|
28
|
+
"type": "tcp",
|
29
|
+
"timeout": 0.2,
|
30
|
+
"rise": 3,
|
31
|
+
"fall": 2
|
32
|
+
}
|
33
|
+
]
|
34
|
+
},
|
35
|
+
"rabbitmq_service": {
|
36
|
+
"port": 5672,
|
37
|
+
"host": "127.0.0.1",
|
38
|
+
"zk_hosts": ["localhost:2181"],
|
39
|
+
"zk_path": "/nerve/services/your_rabbitmq_service/services",
|
40
|
+
"check_interval": 2,
|
41
|
+
"checks": [
|
42
|
+
{
|
43
|
+
"type": "rabbitmq",
|
44
|
+
"username": "guest",
|
45
|
+
"password": "guest"
|
46
|
+
}
|
47
|
+
]
|
48
|
+
}
|
49
|
+
}
|
50
|
+
}
|
data/lib/nerve/log.rb
ADDED
@@ -0,0 +1,24 @@
|
|
1
|
+
module Nerve
|
2
|
+
module Logging
|
3
|
+
|
4
|
+
def log
|
5
|
+
@logger ||= Logging.logger_for(self.class.name)
|
6
|
+
end
|
7
|
+
|
8
|
+
# Use a hash class-ivar to cache a unique Logger per class:
|
9
|
+
@loggers = {}
|
10
|
+
|
11
|
+
class << self
|
12
|
+
def logger_for(classname)
|
13
|
+
@loggers[classname] ||= configure_logger_for(classname)
|
14
|
+
end
|
15
|
+
|
16
|
+
def configure_logger_for(classname)
|
17
|
+
logger = Logger.new(STDERR)
|
18
|
+
logger.level = Logger::INFO unless ENV['DEBUG']
|
19
|
+
logger.progname = classname
|
20
|
+
return logger
|
21
|
+
end
|
22
|
+
end
|
23
|
+
end
|
24
|
+
end
|
data/lib/nerve/reporter.rb
ADDED
@@ -0,0 +1,64 @@
|
|
1
|
+
require 'zk'
|
2
|
+
|
3
|
+
module Nerve
|
4
|
+
class Reporter
|
5
|
+
include Utils
|
6
|
+
include Logging
|
7
|
+
|
8
|
+
def initialize(opts)
|
9
|
+
%w{hosts path key}.each do |required|
|
10
|
+
raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
|
11
|
+
end
|
12
|
+
|
13
|
+
@path = opts['hosts'].shuffle.join(',') + opts['path']
|
14
|
+
@data = parse_data(opts['data'] || '')
|
15
|
+
@key = opts['key']
|
16
|
+
@key.insert(0,'/') unless @key[0] == '/'
|
17
|
+
end
|
18
|
+
|
19
|
+
def start()
|
20
|
+
log.info "nerve: waiting to connect to zookeeper at #{@path}"
|
21
|
+
@zk = ZK.new(@path)
|
22
|
+
|
23
|
+
log.info "nerve: successfully created zk connection to #{@path}"
|
24
|
+
end
|
25
|
+
|
26
|
+
def report_up()
|
27
|
+
zk_save
|
28
|
+
end
|
29
|
+
|
30
|
+
def report_down
|
31
|
+
zk_delete
|
32
|
+
end
|
33
|
+
|
34
|
+
def update_data(new_data='')
|
35
|
+
@data = parse_data(new_data)
|
36
|
+
zk_save
|
37
|
+
end
|
38
|
+
|
39
|
+
def ping?
|
40
|
+
return @zk.ping?
|
41
|
+
end
|
42
|
+
|
43
|
+
private
|
44
|
+
|
45
|
+
def zk_delete
|
46
|
+
@zk.delete(@key, :ignore => :no_node)
|
47
|
+
end
|
48
|
+
|
49
|
+
def zk_save
|
50
|
+
log.debug "nerve: writing data #{@data.class} to zk at #{@key} with #{@data.inspect}"
|
51
|
+
begin
|
52
|
+
@zk.set(@key,@data)
|
53
|
+
rescue ZK::Exceptions::NoNode => e
|
54
|
+
@zk.create(@key,:data => @data, :mode => :ephemeral)
|
55
|
+
end
|
56
|
+
end
|
57
|
+
|
58
|
+
def parse_data(data)
|
59
|
+
return data if data.class == String
|
60
|
+
return data.to_json
|
61
|
+
end
|
62
|
+
|
63
|
+
end
|
64
|
+
end
|
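The reporter code above assembles three values before it talks to ZooKeeper: a connection string, a node key, and a JSON payload. The following sketch reproduces that assembly in isolation. The helper names are hypothetical, and the host shuffle performed by the gem is left out so the output stays deterministic.

```ruby
require 'json'

# Hypothetical helpers mirroring how Reporter#initialize and ServiceWatcher
# assemble their inputs. They do not open a ZooKeeper connection.
def zk_connection_string(hosts, path)
  # The gem shuffles the hosts first; omitted here to keep the result stable.
  hosts.join(',') + path
end

def registration_key(instance_id, service_name)
  key = "#{instance_id}_#{service_name}"
  # Reporter prepends a slash when the key does not already start with one.
  key.start_with?('/') ? key : "/#{key}"
end

def registration_data(host, port)
  # Non-string data is serialized to JSON before being written.
  { 'host' => host, 'port' => port }.to_json
end

puts zk_connection_string(['zk1:2181', 'zk2:2181'], '/nerve/services/web/services')
puts registration_key('mymachine', 'web')
puts registration_data('127.0.0.1', 3000)
```

Because the path is appended to the host list, the connection appears to be chrooted to `zk_path`, and the key is created as a child of that path.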
data/lib/nerve/ring_buffer.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
module Nerve
|
2
|
+
class RingBuffer < Array
|
3
|
+
alias_method :array_push, :push
|
4
|
+
alias_method :array_element, :[]
|
5
|
+
|
6
|
+
def initialize( size )
|
7
|
+
@ring_size = size.to_i
|
8
|
+
super( @ring_size )
|
9
|
+
end
|
10
|
+
|
11
|
+
def average
|
12
|
+
self.inject(0.0) { |sum, el| sum + el } / self.size
|
13
|
+
end
|
14
|
+
|
15
|
+
def push( element )
|
16
|
+
if length == @ring_size
|
17
|
+
shift # lose (drop) the oldest element
|
18
|
+
end
|
19
|
+
array_push element
|
20
|
+
end
|
21
|
+
|
22
|
+
# Access elements in the RingBuffer
|
23
|
+
#
|
24
|
+
# offset will be typically negative!
|
25
|
+
#
|
26
|
+
def []( offset = 0 )
|
27
|
+
return self.array_element( - 1 + offset )
|
28
|
+
end
|
29
|
+
end
|
30
|
+
end
|
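A short usage sketch may help with the indexing convention of `RingBuffer`. The class body is condensed from the diff above with no intended change in behavior; only the usage lines at the bottom are new, and the pushed values are arbitrary.

```ruby
module Nerve
  # Fixed-size buffer: pushing onto a full buffer drops the oldest element.
  class RingBuffer < Array
    alias_method :array_push, :push
    alias_method :array_element, :[]

    def initialize(size)
      @ring_size = size.to_i
      super(@ring_size) # starts out filled with nil placeholders
    end

    def average
      inject(0.0) { |sum, el| sum + el } / size
    end

    def push(element)
      shift if length == @ring_size
      array_push element
    end

    # Offset 0 is the newest element; negative offsets walk back in time.
    def [](offset = 0)
      array_element(-1 + offset)
    end
  end
end

buffer = Nerve::RingBuffer.new(3)
[1, 2, 3, 4].each { |n| buffer.push(n) }

puts buffer.to_a.inspect # the three most recent values
puts buffer[0]           # newest value
puts buffer[-1]          # value pushed just before the newest
puts buffer.average
```

Note that a fresh buffer holds `nil` placeholders, so `average` is only meaningful once the buffer has been filled with numbers.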
data/lib/nerve/service_watcher/base.rb
ADDED
@@ -0,0 +1,51 @@
|
|
1
|
+
module Nerve
|
2
|
+
module ServiceCheck
|
3
|
+
class BaseServiceCheck
|
4
|
+
include Utils
|
5
|
+
include Logging
|
6
|
+
|
7
|
+
def initialize(opts={})
|
8
|
+
@timeout = opts['timeout'] ? opts['timeout'].to_f : 0.1
|
9
|
+
@rise = opts['rise'] ? opts['rise'].to_i : 1
|
10
|
+
@fall = opts['fall'] ? opts['fall'].to_i : 1
|
11
|
+
@name = opts['name'] ? opts['name'] : "undefined"
|
12
|
+
|
13
|
+
@check_buffer = RingBuffer.new([@rise, @fall].max)
|
14
|
+
@last_result = nil
|
15
|
+
end
|
16
|
+
|
17
|
+
def up?
|
18
|
+
# do the check
|
19
|
+
check_result = !!ignore_errors do
|
20
|
+
check
|
21
|
+
end
|
22
|
+
|
23
|
+
# this is the first check -- initialize buffer
|
24
|
+
if @last_result == nil
|
25
|
+
@last_result = check_result
|
26
|
+
@check_buffer.size.times {@check_buffer.push check_result}
|
27
|
+
log.info "nerve: service check #{@name} initial check returned #{check_result}"
|
28
|
+
end
|
29
|
+
|
30
|
+
log.debug "nerve: service check #{@name} returned #{check_result}"
|
31
|
+
@check_buffer.push(check_result)
|
32
|
+
|
33
|
+
# we've failed if the last @fall times are false
|
34
|
+
unless @check_buffer.last(@fall).reduce(:|)
|
35
|
+
log.info "nerve: service check #{@name} transitions to down after #{@fall} failures" if @last_result
|
36
|
+
@last_result = false
|
37
|
+
end
|
38
|
+
|
39
|
+
# we've succeeded if the last @rise results are all true
|
40
|
+
if @check_buffer.last(@rise).reduce(:&)
|
41
|
+
log.info "nerve: service check #{@name} transitions to up after #{@rise} successes" unless @last_result
|
42
|
+
@last_result = true
|
43
|
+
end
|
44
|
+
|
45
|
+
# otherwise return the last result
|
46
|
+
return @last_result
|
47
|
+
end
|
48
|
+
end
|
49
|
+
end
|
50
|
+
end
|
51
|
+
|
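The `rise`/`fall` logic in `up?` is easier to follow with a worked sequence. The class below is a hypothetical stand-alone model of that logic, not the gem's code: it uses a plain `Array` instead of `RingBuffer` and omits logging and the actual check.

```ruby
# Hypothetical model of the rise/fall hysteresis in BaseServiceCheck#up?.
class HysteresisModel
  def initialize(rise, fall)
    @rise = rise
    @fall = fall
    @size = [rise, fall].max
    @history = []
    @state = nil
  end

  # Feed one raw check result (true/false) and get the reported state back.
  def observe(result)
    if @state.nil?
      # First observation: seed the whole history with this result.
      @state = result
      @history = Array.new(@size, result)
    end

    @history.push(result)
    @history.shift while @history.size > @size

    # Down only after the last `fall` results all failed.
    @state = false if @history.last(@fall).none?
    # Up only after the last `rise` results all passed.
    @state = true if @history.last(@rise).all?
    @state
  end
end

model = HysteresisModel.new(3, 2)
raw = [true, false, true, false, false, true, true, true]
puts raw.map { |result| model.observe(result) }.inspect
```

With `rise` of 3 and `fall` of 2, a single failed check does not change the reported state; it takes two consecutive failures to go down and three consecutive passes to come back up.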
data/lib/nerve/service_watcher/http.rb
ADDED
@@ -0,0 +1,62 @@
|
|
1
|
+
require 'nerve/service_watcher/base'
|
2
|
+
|
3
|
+
module Nerve
|
4
|
+
module ServiceCheck
|
5
|
+
require 'net/http'
|
6
|
+
|
7
|
+
class HttpServiceCheck < BaseServiceCheck
|
8
|
+
def initialize(opts={})
|
9
|
+
super
|
10
|
+
|
11
|
+
%w{port uri}.each do |required|
|
12
|
+
raise ArgumentError, "missing required argument #{required} in http check" unless
|
13
|
+
opts[required]
|
14
|
+
instance_variable_set("@#{required}",opts[required])
|
15
|
+
end
|
16
|
+
|
17
|
+
@host = opts['host'] || '127.0.0.1'
|
18
|
+
@ssl = opts['ssl'] || false
|
19
|
+
|
20
|
+
@read_timeout = opts['read_timeout'] || @timeout
|
21
|
+
@open_timeout = opts['open_timeout'] || 0.2
|
22
|
+
@ssl_timeout = opts['ssl_timeout'] || 0.2
|
23
|
+
|
24
|
+
@name = "http-#{@host}:#{@port}#{@uri}"
|
25
|
+
end
|
26
|
+
|
27
|
+
def check
|
28
|
+
log.debug "running health check #{@name}"
|
29
|
+
|
30
|
+
connection = get_connection
|
31
|
+
response = connection.get(@uri)
|
32
|
+
code = response.code.to_i
|
33
|
+
|
34
|
+
log.debug "nerve: check #{@name} got response code #{code}"
|
35
|
+
if code >= 200 and code < 300
|
36
|
+
return true
|
37
|
+
else
|
38
|
+
return false
|
39
|
+
end
|
40
|
+
end
|
41
|
+
|
42
|
+
private
|
43
|
+
def get_connection
|
44
|
+
con = Net::HTTP.new(@host, @port)
|
45
|
+
con.read_timeout = @read_timeout
|
46
|
+
con.open_timeout = @open_timeout
|
47
|
+
|
48
|
+
if @ssl
|
49
|
+
con.use_ssl = true
|
50
|
+
con.ssl_timeout = @ssl_timeout
|
51
|
+
con.verify_mode = OpenSSL::SSL::VERIFY_NONE
|
52
|
+
end
|
53
|
+
|
54
|
+
return con
|
55
|
+
end
|
56
|
+
|
57
|
+
end
|
58
|
+
|
59
|
+
CHECKS ||= {}
|
60
|
+
CHECKS['http'] = HttpServiceCheck
|
61
|
+
end
|
62
|
+
end
|
data/lib/nerve/service_watcher/rabbitmq.rb
ADDED
@@ -0,0 +1,68 @@
|
|
1
|
+
require 'nerve/service_watcher/base'
|
2
|
+
require 'bunny'
|
3
|
+
|
4
|
+
module Nerve
|
5
|
+
module ServiceCheck
|
6
|
+
class RabbitMQServiceCheck < BaseServiceCheck
|
7
|
+
require 'socket'
|
8
|
+
include Socket::Constants
|
9
|
+
|
10
|
+
def initialize(opts={})
|
11
|
+
super
|
12
|
+
|
13
|
+
raise ArgumentError, "missing required argument 'port' in rabbitmq check" unless opts['port']
|
14
|
+
|
15
|
+
@port = opts['port']
|
16
|
+
@host = opts['host'] || '127.0.0.1'
|
17
|
+
@user = opts['username'] || 'guest'
|
18
|
+
@pass = opts['password'] || 'guest'
|
19
|
+
end
|
20
|
+
|
21
|
+
def check
|
22
|
+
# the idea for this check was taken from the one in rabbitmq management
|
23
|
+
# -- the aliveness_test:
|
24
|
+
# https://github.com/rabbitmq/rabbitmq-management/blob/9a8e3d1ab5144e3f6a1cb9a4639eb738713b926d/src/rabbit_mgmt_wm_aliveness_test.erl
|
25
|
+
log.debug "nerve: running rabbitmq health check #{@name}"
|
26
|
+
|
27
|
+
conn = Bunny.new(
|
28
|
+
:host => @host,
|
29
|
+
:port => @port,
|
30
|
+
:user => @user,
|
31
|
+
:pass => @pass,
|
32
|
+
:log_file => STDERR,
|
33
|
+
:continuation_timeout => @timeout,
|
34
|
+
:automatically_recover => false,
|
35
|
+
:heartbeat => false,
|
36
|
+
:threaded => false
|
37
|
+
)
|
38
|
+
|
39
|
+
begin
|
40
|
+
conn.start
|
41
|
+
ch = conn.create_channel
|
42
|
+
|
43
|
+
# create a queue, publish to it
|
44
|
+
log.debug "nerve: publishing to rabbitmq"
|
45
|
+
ch.queue('nerve')
|
46
|
+
ch.basic_publish('nerve test message', '', 'nerve', :mandatory => true, :expiration => 2 * 1000)
|
47
|
+
|
48
|
+
# read and ack the message
|
49
|
+
log.debug "nerve: consuming from rabbitmq"
|
50
|
+
delivery_info, properties, payload = ch.basic_get('nerve', :ack => true)
|
51
|
+
|
52
|
+
if payload
|
53
|
+
ch.acknowledge(delivery_info.delivery_tag)
|
54
|
+
return true
|
55
|
+
else
|
56
|
+
log.debug "nerve: rabbitmq consumption returned no payload"
|
57
|
+
return false
|
58
|
+
end
|
59
|
+
ensure
|
60
|
+
conn.close
|
61
|
+
end
|
62
|
+
end
|
63
|
+
end
|
64
|
+
|
65
|
+
CHECKS ||= {}
|
66
|
+
CHECKS['rabbitmq'] = RabbitMQServiceCheck
|
67
|
+
end
|
68
|
+
end
|
data/lib/nerve/service_watcher/tcp.rb
ADDED
@@ -0,0 +1,56 @@
|
|
1
|
+
require 'nerve/service_watcher/base'
|
2
|
+
|
3
|
+
module Nerve
|
4
|
+
module ServiceCheck
|
5
|
+
class TcpServiceCheck < BaseServiceCheck
|
6
|
+
require 'socket'
|
7
|
+
include Socket::Constants
|
8
|
+
|
9
|
+
def initialize(opts={})
|
10
|
+
super
|
11
|
+
|
12
|
+
raise ArgumentError, "missing required argument 'port' in tcp check" unless opts['port']
|
13
|
+
|
14
|
+
@port = opts['port']
|
15
|
+
@host = opts['host'] || '127.0.0.1'
|
16
|
+
|
17
|
+
@address = Socket.sockaddr_in(@port, @host)
|
18
|
+
end
|
19
|
+
|
20
|
+
def check
|
21
|
+
log.debug "nerve: running TCP health check #{@name}"
|
22
|
+
|
23
|
+
# create a TCP socket
|
24
|
+
socket = Socket.new(AF_INET, SOCK_STREAM, 0)
|
25
|
+
|
26
|
+
begin
|
27
|
+
# open a non-blocking connection
|
28
|
+
socket.connect_nonblock(@address)
|
29
|
+
rescue Errno::EINPROGRESS
|
30
|
+
# opening a non-blocking socket will usually raise
|
31
|
+
# this exception. it's just connect returning immediately,
|
32
|
+
# so it's not really an exception, but ruby makes it into
|
33
|
+
# one. if we got here, we are now free to wait until the timeout
|
34
|
+
# expires for the socket to be writeable
|
35
|
+
IO.select(nil, [socket], nil, @timeout)
|
36
|
+
|
37
|
+
# we should be connected now; allow any other exception through
|
38
|
+
begin
|
39
|
+
socket.connect_nonblock(@address)
|
40
|
+
rescue Errno::EISCONN
|
41
|
+
return true
|
42
|
+
end
|
43
|
+
else
|
44
|
+
# we managed to connect REALLY REALLY FAST
|
45
|
+
log.debug "nerve: connected to non-blocking socket without an exception"
|
46
|
+
return true
|
47
|
+
ensure
|
48
|
+
socket.close
|
49
|
+
end
|
50
|
+
end
|
51
|
+
end
|
52
|
+
|
53
|
+
CHECKS ||= {}
|
54
|
+
CHECKS['tcp'] = TcpServiceCheck
|
55
|
+
end
|
56
|
+
end
|
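For comparison with the non-blocking connect above, here is a shorter sketch of a TCP liveness probe. It swaps in the standard library's `Socket.tcp` with its `connect_timeout` option rather than reproducing the gem's `connect_nonblock` and `IO.select` sequence. The method name `tcp_up?` and the local listener used for the demonstration are assumptions for illustration.

```ruby
require 'socket'

# Hypothetical helper: true if a TCP connection can be opened within `timeout`
# seconds, false on refusal, timeout, or a name resolution failure.
def tcp_up?(host, port, timeout = 0.1)
  Socket.tcp(host, port, connect_timeout: timeout) { |_socket| true }
rescue SystemCallError, SocketError, IOError
  false
end

# Demonstration against a throwaway local listener.
listener = TCPServer.new('127.0.0.1', 0) # port 0 asks the OS for a free port
port = listener.addr[1]
while_listening = tcp_up?('127.0.0.1', port, 1.0)
listener.close
after_close = tcp_up?('127.0.0.1', port, 1.0)

puts "while listening: #{while_listening}, after close: #{after_close}"
```

Like the gem's check, this only proves that something accepts connections on the port; it says nothing about whether the service behind it is healthy.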
data/lib/nerve/service_watcher.rb
ADDED
@@ -0,0 +1,96 @@
|
|
1
|
+
require 'nerve/service_watcher/tcp'
|
2
|
+
require 'nerve/service_watcher/http'
|
3
|
+
require 'nerve/service_watcher/rabbitmq'
|
4
|
+
|
5
|
+
module Nerve
|
6
|
+
class ServiceWatcher
|
7
|
+
include Utils
|
8
|
+
include Logging
|
9
|
+
|
10
|
+
def initialize(service={})
|
11
|
+
log.debug "nerve: creating service watcher object"
|
12
|
+
|
13
|
+
# check that we have all of the required arguments
|
14
|
+
%w{name instance_id host port zk_hosts zk_path}.each do |required|
|
15
|
+
raise ArgumentError, "missing required argument #{required} for new service watcher" unless service[required]
|
16
|
+
end
|
17
|
+
|
18
|
+
@name = service['name']
|
19
|
+
|
20
|
+
# configure the reporter, which we use for talking to zookeeper
|
21
|
+
@reporter = Reporter.new({
|
22
|
+
'hosts' => service['zk_hosts'],
|
23
|
+
'path' => service['zk_path'],
|
24
|
+
'key' => "#{service['instance_id']}_#{@name}",
|
25
|
+
'data' => {'host' => service['host'], 'port' => service['port']},
|
26
|
+
})
|
27
|
+
|
28
|
+
# instantiate the checks for this service
|
29
|
+
@service_checks = []
|
30
|
+
service['checks'] ||= []
|
31
|
+
service['checks'].each do |check|
|
32
|
+
check['type'] ||= "undefined"
|
33
|
+
begin
|
34
|
+
service_check_class = ServiceCheck::CHECKS[check['type']]
|
35
|
+
rescue
|
36
|
+
raise ArgumentError,
|
37
|
+
"invalid service check type #{check['type']}; valid types: #{ServiceCheck::CHECKS.keys.join(',')}"
|
38
|
+
end
|
39
|
+
|
40
|
+
check['host'] ||= service['host']
|
41
|
+
check['port'] ||= service['port']
|
42
|
+
check['name'] ||= "#{@name} #{check['type']}-#{check['host']}:#{check['port']}"
|
43
|
+
@service_checks << service_check_class.new(check)
|
44
|
+
end
|
45
|
+
|
46
|
+
# how often do we initiate service checks?
|
47
|
+
@check_interval = service['check_interval'] || 0.5
|
48
|
+
|
49
|
+
log.debug "nerve: created service watcher for #{@name} with #{@service_checks.size} checks"
|
50
|
+
end
|
51
|
+
|
52
|
+
def run()
|
53
|
+
log.info "nerve: starting service watch #{@name}"
|
54
|
+
|
55
|
+
# begin by reporting down
|
56
|
+
@reporter.start()
|
57
|
+
@reporter.report_down
|
58
|
+
was_up = false
|
59
|
+
|
60
|
+
until $EXIT
|
61
|
+
@reporter.ping?
|
62
|
+
|
63
|
+
# what is the status of the service?
|
64
|
+
is_up = check?
|
65
|
+
log.debug "nerve: current service status for #{@name} is #{is_up.inspect}"
|
66
|
+
|
67
|
+
if is_up != was_up
|
68
|
+
if is_up
|
69
|
+
@reporter.report_up
|
70
|
+
log.info "nerve: service #{@name} is now up"
|
71
|
+
else
|
72
|
+
@reporter.report_down
|
73
|
+
log.warn "nerve: service #{@name} is now down"
|
74
|
+
end
|
75
|
+
was_up = is_up
|
76
|
+
end
|
77
|
+
|
78
|
+
# wait to run more checks
|
79
|
+
sleep @check_interval
|
80
|
+
end
|
81
|
+
rescue StandardError => e
|
82
|
+
log.error "nerve: error in service watcher #{@name}: #{e}"
|
83
|
+
raise e
|
84
|
+
ensure
|
85
|
+
log.info "nerve: ending service watch #{@name}"
|
86
|
+
$EXIT = true
|
87
|
+
end
|
88
|
+
|
89
|
+
def check?
|
90
|
+
@service_checks.each do |check|
|
91
|
+
return false unless check.up?
|
92
|
+
end
|
93
|
+
return true
|
94
|
+
end
|
95
|
+
end
|
96
|
+
end
|
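One observation on the check-type lookup in the watcher above: `Hash#[]` returns `nil` for a missing key instead of raising, so the `rescue` around `ServiceCheck::CHECKS[check['type']]` likely never runs, and an unknown type would surface later as a `NoMethodError` on `nil`. A possible alternative, sketched here with a hypothetical stand-in registry and method name, is `Hash#fetch` with a block.

```ruby
# Stand-in registry for illustration; the gem maps type names to check classes.
CHECK_REGISTRY = { 'tcp' => :tcp_check, 'http' => :http_check }.freeze

# Hypothetical helper: look up a check type or fail with a descriptive error.
def lookup_check(type)
  CHECK_REGISTRY.fetch(type) do
    raise ArgumentError,
          "invalid service check type #{type}; valid types: #{CHECK_REGISTRY.keys.join(',')}"
  end
end

puts lookup_check('tcp').inspect

begin
  lookup_check('bogus')
rescue ArgumentError => e
  puts e.message
end
```

This keeps the error message the diff intends while making sure it is actually raised for an unknown type.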
data/lib/nerve/utils.rb
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
module Nerve
|
2
|
+
module Utils
|
3
|
+
def safe_run(command)
|
4
|
+
res = `#{command}`.chomp
|
5
|
+
raise "command '#{command}' failed to run:\n#{res}" unless $?.success?
|
6
|
+
end
|
7
|
+
|
8
|
+
def ignore_errors(&block)
|
9
|
+
begin
|
10
|
+
return yield
|
11
|
+
rescue Object => error
|
12
|
+
log.debug "ignoring error #{error.inspect}"
|
13
|
+
return false
|
14
|
+
end
|
15
|
+
end
|
16
|
+
end
|
17
|
+
end
|
data/lib/nerve/version.rb
CHANGED
data/lib/nerve.rb
CHANGED
@@ -1,5 +1,69 @@
|
|
1
|
-
require
|
1
|
+
require 'logger'
|
2
|
+
require 'json'
|
3
|
+
require 'timeout'
|
4
|
+
|
5
|
+
require 'nerve/version'
|
6
|
+
require 'nerve/utils'
|
7
|
+
require 'nerve/log'
|
8
|
+
require 'nerve/ring_buffer'
|
9
|
+
require 'nerve/reporter'
|
10
|
+
require 'nerve/service_watcher'
|
2
11
|
|
3
12
|
module Nerve
|
4
|
-
|
13
|
+
class Nerve
|
14
|
+
|
15
|
+
include Logging
|
16
|
+
|
17
|
+
def initialize(opts={})
|
18
|
+
# set global variable for exit signal
|
19
|
+
$EXIT = false
|
20
|
+
|
21
|
+
# trap int signal and set exit to true
|
22
|
+
%w{INT TERM}.each do |signal|
|
23
|
+
trap(signal) do
|
24
|
+
$EXIT = true
|
25
|
+
end
|
26
|
+
end
|
27
|
+
|
28
|
+
log.info 'nerve: starting up!'
|
29
|
+
|
30
|
+
# required options
|
31
|
+
log.debug 'nerve: checking for required inputs'
|
32
|
+
%w{instance_id services}.each do |required|
|
33
|
+
raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
|
34
|
+
end
|
35
|
+
|
36
|
+
@instance_id = opts['instance_id']
|
37
|
+
|
38
|
+
# create service watcher objects
|
39
|
+
log.debug 'nerve: creating service watchers'
|
40
|
+
@service_watchers=[]
|
41
|
+
opts['services'].each do |name, config|
|
42
|
+
@service_watchers << ServiceWatcher.new(config.merge({'instance_id' => @instance_id, 'name' => name}))
|
43
|
+
end
|
44
|
+
|
45
|
+
log.debug 'nerve: completed init'
|
46
|
+
end
|
47
|
+
|
48
|
+
def run
|
49
|
+
log.info 'nerve: starting run'
|
50
|
+
begin
|
51
|
+
children = []
|
52
|
+
|
53
|
+
log.debug 'nerve: launching service check threads'
|
54
|
+
@service_watchers.each do |watcher|
|
55
|
+
children << Thread.new{watcher.run}
|
56
|
+
end
|
57
|
+
|
58
|
+
log.debug 'nerve: main thread done, waiting for children'
|
59
|
+
children.each do |child|
|
60
|
+
child.join
|
61
|
+
end
|
62
|
+
ensure
|
63
|
+
$EXIT = true
|
64
|
+
end
|
65
|
+
log.info 'nerve: exiting'
|
66
|
+
end
|
67
|
+
|
68
|
+
end
|
5
69
|
end
|
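The `run` method above starts one thread per service watcher and relies on a shared flag to stop them. The sketch below shows that pattern in isolation. It is a simplified model under stated assumptions: a local variable stands in for the `$EXIT` global, a counter stands in for the real checks, and the method name `run_workers` is hypothetical.

```ruby
# Hypothetical model of the thread-per-watcher loop with a shared stop flag.
def run_workers(names, run_for)
  stop_requested = false # stands in for the gem's $EXIT global
  iterations = Hash.new(0)
  lock = Mutex.new

  workers = names.map do |name|
    Thread.new do
      until stop_requested
        # A real watcher would run its service checks here.
        lock.synchronize { iterations[name] += 1 }
        sleep 0.01 # plays the role of check_interval
      end
    end
  end

  sleep run_for         # the gem instead waits for INT or TERM
  stop_requested = true # every loop exits after its current sleep
  workers.each(&:join)
  iterations
end

puts run_workers(%w[web cache], 0.1).inspect
```

Because each loop only looks at the flag between sleeps, shutdown can take up to one `check_interval` per watcher.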
data/nerve.gemspec
CHANGED
@@ -6,8 +6,8 @@ require 'nerve/version'
|
|
6
6
|
Gem::Specification.new do |gem|
|
7
7
|
gem.name = "nerve"
|
8
8
|
gem.version = Nerve::VERSION
|
9
|
-
gem.authors = ["Martin Rhoads"]
|
10
|
-
gem.email = ["martin.rhoads@airbnb.com"]
|
9
|
+
gem.authors = ["Martin Rhoads", "Igor Serebryany", "Pierre Carrier"]
|
10
|
+
gem.email = ["martin.rhoads@airbnb.com", "igor.serebryany@airbnb.com"]
|
11
11
|
gem.description = %q{description}
|
12
12
|
gem.summary = %q{summary}
|
13
13
|
gem.homepage = ""
|
@@ -16,4 +16,7 @@ Gem::Specification.new do |gem|
|
|
16
16
|
gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
|
17
17
|
gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
|
18
18
|
gem.require_paths = ["lib"]
|
19
|
+
|
20
|
+
gem.add_runtime_dependency "zk", "~> 1.9.2"
|
21
|
+
gem.add_runtime_dependency "bunny", "= 1.0.0.rc2"
|
19
22
|
end
|
metadata
CHANGED
@@ -1,29 +1,78 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: nerve
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.1
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
8
8
|
- Martin Rhoads
|
9
|
+
- Igor Serebryany
|
10
|
+
- Pierre Carrier
|
9
11
|
autorequire:
|
10
12
|
bindir: bin
|
11
13
|
cert_chain: []
|
12
|
-
date:
|
13
|
-
dependencies:
|
14
|
+
date: 2013-10-24 00:00:00.000000000 Z
|
15
|
+
dependencies:
|
16
|
+
- !ruby/object:Gem::Dependency
|
17
|
+
name: zk
|
18
|
+
requirement: !ruby/object:Gem::Requirement
|
19
|
+
none: false
|
20
|
+
requirements:
|
21
|
+
- - ~>
|
22
|
+
- !ruby/object:Gem::Version
|
23
|
+
version: 1.9.2
|
24
|
+
type: :runtime
|
25
|
+
prerelease: false
|
26
|
+
version_requirements: !ruby/object:Gem::Requirement
|
27
|
+
none: false
|
28
|
+
requirements:
|
29
|
+
- - ~>
|
30
|
+
- !ruby/object:Gem::Version
|
31
|
+
version: 1.9.2
|
32
|
+
- !ruby/object:Gem::Dependency
|
33
|
+
name: bunny
|
34
|
+
requirement: !ruby/object:Gem::Requirement
|
35
|
+
none: false
|
36
|
+
requirements:
|
37
|
+
- - '='
|
38
|
+
- !ruby/object:Gem::Version
|
39
|
+
version: 1.0.0.rc2
|
40
|
+
type: :runtime
|
41
|
+
prerelease: false
|
42
|
+
version_requirements: !ruby/object:Gem::Requirement
|
43
|
+
none: false
|
44
|
+
requirements:
|
45
|
+
- - '='
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: 1.0.0.rc2
|
14
48
|
description: description
|
15
49
|
email:
|
16
50
|
- martin.rhoads@airbnb.com
|
17
|
-
|
51
|
+
- igor.serebryany@airbnb.com
|
52
|
+
executables:
|
53
|
+
- nerve
|
18
54
|
extensions: []
|
19
55
|
extra_rdoc_files: []
|
20
56
|
files:
|
21
57
|
- .gitignore
|
58
|
+
- .nerve.rc
|
22
59
|
- Gemfile
|
23
60
|
- LICENSE.txt
|
24
61
|
- README.md
|
25
62
|
- Rakefile
|
63
|
+
- Vagrantfile
|
64
|
+
- bin/nerve
|
65
|
+
- example/nerve.conf.json
|
26
66
|
- lib/nerve.rb
|
67
|
+
- lib/nerve/log.rb
|
68
|
+
- lib/nerve/reporter.rb
|
69
|
+
- lib/nerve/ring_buffer.rb
|
70
|
+
- lib/nerve/service_watcher.rb
|
71
|
+
- lib/nerve/service_watcher/base.rb
|
72
|
+
- lib/nerve/service_watcher/http.rb
|
73
|
+
- lib/nerve/service_watcher/rabbitmq.rb
|
74
|
+
- lib/nerve/service_watcher/tcp.rb
|
75
|
+
- lib/nerve/utils.rb
|
27
76
|
- lib/nerve/version.rb
|
28
77
|
- nerve.gemspec
|
29
78
|
homepage: ''
|