nerve 0.0.1 → 0.2.1

data/.gitignore CHANGED
@@ -15,3 +15,8 @@ spec/reports
15
15
  test/tmp
16
16
  test/version_tmp
17
17
  tmp
18
+ *.sw?
19
+ \#*\#
20
+ .\#*
21
+ vendor
22
+ .vagrant
data/.nerve.rc ADDED
@@ -0,0 +1,2 @@
1
+ export DATA_BAG_DIR=/Users/martin/Dropbox/airbnb/src/chef/data_bags
2
+ export COOKBOOK_DIR=/Users/martin/Dropbox/airbnb/src/chef/cookbooks
data/Gemfile CHANGED
@@ -1,4 +1,2 @@
1
1
  source 'https://rubygems.org'
2
-
3
- # Specify your gem's dependencies in nerve.gemspec
4
2
  gemspec
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2012 Martin Rhoads
1
+ Copyright (c) 2013 Airbnb, Inc.
2
2
 
3
3
  MIT License
4
4
 
@@ -19,4 +19,4 @@ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
19
  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
20
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
21
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
- WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md CHANGED
@@ -1,7 +1,21 @@
1
1
  # Nerve
2
2
 
3
+ Nerve is a utility for tracking the status of machines and services.
4
+ It runs locally on the boxes which make up a distributed system, and reports state information to a distributed key-value store.
5
+ At Airbnb, we use Zookeeper as our key-value store.
6
+ The combination of Nerve and [Synapse](https://github.com/airbnb/synapse) makes service discovery in the cloud easy!
3
7
 
4
- ## Installation
8
+ ## Motivation ##
9
+
10
+ We already use [Synapse](https://github.com/airbnb/synapse) to discover remote services.
11
+ However, those services needed boilerplate code to register themselves in [Zookeeper](https://zookeeper.apache.org/).
12
+ Nerve simplifies underlying services, enables code reuse, and allows us to create a more composable system.
13
+ It does so by factoring out the boilerplate into its own application, which independently handles monitoring and reporting.
14
+
15
+ Beyond those benefits, nerve also acts as a general watchdog on systems.
16
+ The information it reports can be used to take action from a centralized automation center: actions like scaling distributed systems up or down, or alerting ops or engineering about downtime.
17
+
18
+ ## Installation ##
5
19
 
6
20
  Add this line to your application's Gemfile:
7
21
 
@@ -15,8 +29,42 @@ Or install it yourself as:
15
29
 
16
30
  $ gem install nerve
17
31
 
18
- ## Usage
32
+ ## Configuration ##
33
+
34
+ Nerve depends on a single configuration file, in json format.
35
+ It is usually called `nerve.conf.json`.
36
+ An example config file is available in `example/nerve.conf.json`.
37
+ The config file is composed of two main sections:
38
+
39
+ * `instance_id`: the name under which your services will be registered in zookeeper
40
+ * `services`: the hash (from service name to config) of the services nerve will be monitoring
41
+
42
+ ### Services Config ###
43
+
44
+ Each service that nerve will be monitoring is specified in the `services` hash.
45
+ The key is the name of the service, and the value is a configuration hash telling nerve how to monitor the service.
46
+ The configuration contains the following options:
47
+
48
+ * `port`: the default port for service checks; nerve will submit this as the address `instance_id:port` to Zookeeper
49
+ * `host`: the default host on which to make service checks; you should make this your *public* ip if you want to make sure your service is publicly accessible
50
+ * `zk_hosts`: a list of the zookeeper hosts comprising the [ensemble](https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#sc_zkMulitServerSetup) that nerve will submit registration to
51
+ * `zk_path`: the path (or [znode](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#sc_zkDataModel_znodes)) where the registration will be created; nerve will create the [ephemeral node](https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#Ephemeral+Nodes) that is the registration as a child of this path
52
+ * `check_interval`: the frequency with which service checks will be initiated; defaults to `500ms`
53
+ * `checks`: a list of checks that nerve will perform; if all of them pass, the service will be registered; otherwise, it will be un-registered
54
+
55
+ ### Checks ###
56
+
57
+ The core of nerve is a set of service checks.
58
+ Each service can define a number of checks, and all of them must pass for the service to be registered.
59
+ Although the exact parameters passed to each check are different, all take a number of common arguments:
19
60
 
61
+ * `type`: (required) the kind of check; you can see available check types in the `lib/nerve/service_watcher` dir of this repo
62
+ * `name`: (optional) a descriptive, human-readable name for the check; it will be auto-generated based on the other parameters if not specified
63
+ * `host`: (optional) the host on which the check will be performed; defaults to the `host` of the service to which the check belongs
64
+ * `port`: (optional) the port on which the check will be performed; like `host`, it defaults to the `port` of the service
65
+ * `timeout`: (optional) maximum time the check can take; defaults to `100ms`
66
+ * `rise`: (optional) how many consecutive checks must pass before the check is considered passing; defaults to 1
67
+ * `fall`: (optional) how many consecutive checks must fail before the check is considered failing; defaults to 1
20
68
 
21
69
  ## Contributing
22
70
 
data/Vagrantfile ADDED
@@ -0,0 +1,121 @@
1
+ # -*- mode: ruby -*-
2
+ # vi: set ft=ruby :
3
+
4
+ unless ENV['COOKBOOK_DIR'] and ENV['DATA_BAG_DIR']
5
+ STDERR.puts "you need to set COOKBOOK_DIR and DATA_BAG_DIR as environment variables"
6
+ Kernel.exit 1
7
+ end
8
+
9
+ this_dir = File.dirname(File.expand_path __FILE__)
10
+
11
+ STDERR.puts "mounting #{this_dir} as /root/this_dir"
12
+
13
+ Vagrant::Config.run do |config|
14
+ # All Vagrant configuration is done here. The most common configuration
15
+ # options are documented and commented below. For a complete reference,
16
+ # please see the online documentation at vagrantup.com.
17
+
18
+ # Every Vagrant virtual environment requires a box to build off of.
19
+ config.vm.box = "precise64"
20
+
21
+ # The url from where the 'config.vm.box' box will be fetched if it
22
+ # doesn't already exist on the user's system.
23
+ # config.vm.box_url = "http://domain.com/path/to/above.box"
24
+
25
+ config.vm.box_url = 'http://files.vagrantup.com/precise64.box'
26
+
27
+ # Boot with a GUI so you can see the screen. (Default is headless)
28
+ # config.vm.boot_mode = :gui
29
+
30
+ # Assign this VM to a host-only network IP, allowing you to access it
31
+ # via the IP. Host-only networks can talk to the host machine as well as
32
+ # any other machines on the same network, but cannot be accessed (through this
33
+ # network interface) by any external networks.
34
+ # config.vm.network :hostonly, "192.168.33.10"
35
+
36
+ # Assign this VM to a bridged network, allowing you to connect directly to a
37
+ # network using the host's network device. This makes the VM appear as another
38
+ # physical device on your network.
39
+ # config.vm.network :bridged
40
+
41
+ # Forward a port from the guest to the host, which allows for outside
42
+ # computers to access the VM, whereas host only networking does not.
43
+ # config.vm.forward_port 80, 8080
44
+
45
+ # Share an additional folder to the guest VM. The first argument is
46
+ # an identifier, the second is the path on the guest to mount the
47
+ # folder, and the third is the path on the host to the actual folder.
48
+ # config.vm.share_folder "v-data", "/vagrant_data", "../data"
49
+
50
+ config.vm.share_folder 'this_dir', '/this_dir', this_dir
51
+
52
+ # Enable provisioning with Puppet stand alone. Puppet manifests
53
+ # are contained in a directory path relative to this Vagrantfile.
54
+ # You will need to create the manifests directory and a manifest in
55
+ # the file base.pp in the manifests_path directory.
56
+ #
57
+ # An example Puppet manifest to provision the message of the day:
58
+ #
59
+ # # group { "puppet":
60
+ # # ensure => "present",
61
+ # # }
62
+ # #
63
+ # # File { owner => 0, group => 0, mode => 0644 }
64
+ # #
65
+ # # file { '/etc/motd':
66
+ # # content => "Welcome to your Vagrant-built virtual machine!
67
+ # # Managed by Puppet.\n"
68
+ # # }
69
+ #
70
+ # config.vm.provision :puppet do |puppet|
71
+ # puppet.manifests_path = "manifests"
72
+ # puppet.manifest_file = "base.pp"
73
+ # end
74
+
75
+ # Enable provisioning with chef solo, specifying a cookbooks path, roles
76
+ # path, and data_bags path (all relative to this Vagrantfile), and adding
77
+ # some recipes and/or roles.
78
+ #
79
+ config.vm.provision :chef_solo do |chef|
80
+ # chef.cookbooks_path = "../my-recipes/cookbooks"
81
+ # chef.roles_path = "../my-recipes/roles"
82
+ # chef.data_bags_path = '/tmp/foo'
83
+ # chef.add_recipe "mysql"
84
+ # chef.add_role "web"
85
+
86
+ # You may also specify custom JSON attributes:
87
+ # chef.json = { :mysql_password => "foo" }
88
+
89
+ chef.data_bags_path = ENV['DATA_BAG_DIR']
90
+ chef.cookbooks_path = ENV['COOKBOOK_DIR']
91
+ chef.add_recipe 'vagrant'
92
+ chef.add_recipe 'nerve'
93
+ end
94
+
95
+ # Enable provisioning with chef server, specifying the chef server URL,
96
+ # and the path to the validation key (relative to this Vagrantfile).
97
+ #
98
+ # The Opscode Platform uses HTTPS. Substitute your organization for
99
+ # ORGNAME in the URL and validation key.
100
+ #
101
+ # If you have your own Chef Server, use the appropriate URL, which may be
102
+ # HTTP instead of HTTPS depending on your configuration. Also change the
103
+ # validation key to validation.pem.
104
+ #
105
+ # config.vm.provision :chef_client do |chef|
106
+ # chef.chef_server_url = "https://api.opscode.com/organizations/ORGNAME"
107
+ # chef.validation_key_path = "ORGNAME-validator.pem"
108
+ # end
109
+ #
110
+ # If you're using the Opscode platform, your validator client is
111
+ # ORGNAME-validator, replacing ORGNAME with your organization name.
112
+ #
113
+ # If you have your own Chef Server, the default validation client name is
114
+ # chef-validator, unless you changed the configuration.
115
+ #
116
+ # chef.validation_client_name = "ORGNAME-validator"
117
+
118
+ # Vagrant::Config.run do |config|
119
+ # config.vm.provision :shell, :path => "vagrant/init.sh"
120
+ # end
121
+ end
data/bin/nerve ADDED
@@ -0,0 +1,56 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'json'
4
+ require 'optparse'
5
+
6
+ require 'nerve'
7
+
8
+ options={}
9
+
10
+ # set command line options
11
+ optparse = OptionParser.new do |opts|
12
+ opts.banner =<<EOB
13
+ Welcome to nerve
14
+
15
+ Usage: nerve --config /path/to/nerve/config
16
+ EOB
17
+
18
+ options[:config] = ENV['NERVE_CONFIG']
19
+ opts.on('-c config','--config config', String, 'path to nerve config') do |key,value|
20
+ options[:config] = key
21
+ end
22
+
23
+ opts.on( '-h', '--help', 'Display this screen' ) do
24
+ puts opts
25
+ exit
26
+ end
27
+
28
+ end
29
+
30
+
31
+ # parse command line arguments
32
+ optparse.parse!
33
+
34
+
35
+ # parse nerve config file
36
+ begin
37
+ config = JSON::parse(File.read(options[:config]))
38
+ rescue TypeError => e
39
+ raise ArgumentError, "you must pass in a '--config' option"
40
+ rescue Errno::ENOENT => e
41
+ raise ArgumentError, "config file does not exist:\n#{e.inspect}"
42
+ rescue Errno::EACCES => e
43
+ raise ArgumentError, "could not open config file:\n#{e.inspect}"
44
+ rescue JSON::ParserError => e
45
+ raise "config file #{options[:config]} is not json:\n#{e.inspect}"
46
+ end
47
+
48
+
49
+ # create nerve object
50
+ s = Nerve::Nerve.new config
51
+
52
+ # start nerve
53
+ s.run
54
+
55
+
56
+ puts "exiting nerve"
data/example/nerve.conf.json ADDED
@@ -0,0 +1,50 @@
1
+ {
2
+ "instance_id": "mymachine",
3
+ "services": {
4
+ "your_http_service": {
5
+ "port": 3000,
6
+ "host": "127.0.0.1",
7
+ "zk_hosts": ["localhost:2181"],
8
+ "zk_path": "/nerve/services/your_http_service/services",
9
+ "check_interval": 2,
10
+ "checks": [
11
+ {
12
+ "type": "http",
13
+ "uri": "/health",
14
+ "timeout": 0.2,
15
+ "rise": 3,
16
+ "fall": 2
17
+ }
18
+ ]
19
+ },
20
+ "your_tcp_service": {
21
+ "port": 6379,
22
+ "host": "127.0.0.1",
23
+ "zk_hosts": ["localhost:2181"],
24
+ "zk_path": "/nerve/services/your_tcp_service/services",
25
+ "check_interval": 2,
26
+ "checks": [
27
+ {
28
+ "type": "tcp",
29
+ "timeout": 0.2,
30
+ "rise": 3,
31
+ "fall": 2
32
+ }
33
+ ]
34
+ },
35
+ "rabbitmq_service": {
36
+ "port": 5672,
37
+ "host": "127.0.0.1",
38
+ "zk_hosts": ["localhost:2181"],
39
+ "zk_path": "/nerve/services/your_rabbitmq_service/services",
40
+ "check_interval": 2,
41
+ "checks": [
42
+ {
43
+ "type": "rabbitmq",
44
+ "username": "guest",
45
+ "password": "guest"
46
+ }
47
+ ]
48
+ }
49
+ }
50
+ }
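The example config above can be checked for the keys that the code in this diff treats as required (`instance_id` and `services` at the top level; `host`, `port`, `zk_hosts` and `zk_path` per service). The following is a minimal sketch of such a check, not something the gem ships: `missing_config_keys` and the two constants are hypothetical names introduced here for illustration.

```ruby
require 'json'

# Hypothetical helper (not part of nerve): list the required keys that are
# absent from a nerve-style config given as a JSON string.
REQUIRED_TOP_LEVEL   = %w{instance_id services}
REQUIRED_PER_SERVICE = %w{host port zk_hosts zk_path}

def missing_config_keys(json_text)
  config = JSON.parse(json_text)

  # top-level keys first
  missing = REQUIRED_TOP_LEVEL.reject { |key| config[key] }

  # then the per-service keys, reported as services.<name>.<key>
  (config['services'] || {}).each do |name, service|
    REQUIRED_PER_SERVICE.each do |key|
      missing << "services.#{name}.#{key}" unless service[key]
    end
  end

  missing
end

sample = <<-JSON
{
  "instance_id": "mymachine",
  "services": {
    "web": {
      "host": "127.0.0.1",
      "port": 3000,
      "zk_hosts": ["localhost:2181"],
      "zk_path": "/nerve/services/web/services",
      "checks": [{ "type": "http", "uri": "/health" }]
    }
  }
}
JSON

p missing_config_keys(sample)
```

Running a check like this before starting the daemon would surface a missing key as a readable list instead of an `ArgumentError` raised from inside a constructor.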
data/lib/nerve/log.rb ADDED
@@ -0,0 +1,24 @@
1
+ module Nerve
2
+ module Logging
3
+
4
+ def log
5
+ @logger ||= Logging.logger_for(self.class.name)
6
+ end
7
+
8
+ # Use a hash class-ivar to cache a unique Logger per class:
9
+ @loggers = {}
10
+
11
+ class << self
12
+ def logger_for(classname)
13
+ @loggers[classname] ||= configure_logger_for(classname)
14
+ end
15
+
16
+ def configure_logger_for(classname)
17
+ logger = Logger.new(STDERR)
18
+ logger.level = Logger::INFO unless ENV['DEBUG']
19
+ logger.progname = classname
20
+ return logger
21
+ end
22
+ end
23
+ end
24
+ end
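The logging module above caches one `Logger` per class name in a module-level hash. Below is a self-contained sketch of the same pattern, assuming nothing beyond Ruby's standard `logger` library; `SketchLogging`, `Worker` and `Other` are hypothetical names, and the `require 'logger'` is included here because this snippet stands alone.

```ruby
require 'logger'

# Hypothetical stand-in for the Logging mixin, for illustration only.
module SketchLogging
  # module-level instance variable: one cached Logger per class name
  @loggers = {}

  class << self
    def logger_for(classname)
      @loggers[classname] ||= build_logger(classname)
    end

    def build_logger(classname)
      logger = Logger.new($stderr)
      logger.level = Logger::INFO unless ENV['DEBUG']
      logger.progname = classname
      logger
    end
  end

  # instance-level accessor mixed into including classes
  def log
    @logger ||= SketchLogging.logger_for(self.class.name)
  end
end

class Worker
  include SketchLogging
end

class Other
  include SketchLogging
end

Worker.new.log.info 'hello from a worker'
```

Two instances of the same class share a logger, while different classes get different ones, so `progname` identifies the source of each line.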
data/lib/nerve/reporter.rb ADDED
@@ -0,0 +1,64 @@
1
+ require 'zk'
2
+
3
+ module Nerve
4
+ class Reporter
5
+ include Utils
6
+ include Logging
7
+
8
+ def initialize(opts)
9
+ %w{hosts path key}.each do |required|
10
+ raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
11
+ end
12
+
13
+ @path = opts['hosts'].shuffle.join(',') + opts['path']
14
+ @data = parse_data(opts['data'] || '')
15
+ @key = opts['key']
16
+ @key.insert(0,'/') unless @key[0] == '/'
17
+ end
18
+
19
+ def start()
20
+ log.info "nerve: waiting to connect to zookeeper at #{@path}"
21
+ @zk = ZK.new(@path)
22
+
23
+ log.info "nerve: successfully created zk connection to #{@path}"
24
+ end
25
+
26
+ def report_up()
27
+ zk_save
28
+ end
29
+
30
+ def report_down
31
+ zk_delete
32
+ end
33
+
34
+ def update_data(new_data='')
35
+ @data = parse_data(new_data)
36
+ zk_save
37
+ end
38
+
39
+ def ping?
40
+ return @zk.ping?
41
+ end
42
+
43
+ private
44
+
45
+ def zk_delete
46
+ @zk.delete(@key, :ignore => :no_node)
47
+ end
48
+
49
+ def zk_save
50
+ log.debug "nerve: writing data #{@data.class} to zk at #{@key} with #{@data.inspect}"
51
+ begin
52
+ @zk.set(@key,@data)
53
+ rescue ZK::Exceptions::NoNode => e
54
+ @zk.create(@key,:data => @data, :mode => :ephemeral)
55
+ end
56
+ end
57
+
58
+ def parse_data(data)
59
+ return data if data.class == String
60
+ return data.to_json
61
+ end
62
+
63
+ end
64
+ end
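The `zk_save` method above follows a "set, and create on missing node" pattern. The sketch below shows that control flow against an in-memory stand-in so it runs without a ZooKeeper server. `FakeStore`, `NoNodeError` and `save` are hypothetical names invented for this illustration and are not the zk gem's API.

```ruby
# Hypothetical error and store, standing in for a ZooKeeper client.
class NoNodeError < StandardError; end

class FakeStore
  attr_reader :nodes

  def initialize
    @nodes = {}
  end

  # like a ZooKeeper set: fails when the node does not exist yet
  def set(key, data)
    raise NoNodeError, key unless @nodes.key?(key)
    @nodes[key] = data
  end

  def create(key, data)
    @nodes[key] = data
  end
end

# Try to update the node; fall back to creating it when it is missing.
def save(store, key, data)
  store.set(key, data)
rescue NoNodeError
  store.create(key, data)
end

store = FakeStore.new
save(store, '/mymachine_web', '{"host":"127.0.0.1","port":3000}')  # creates
save(store, '/mymachine_web', '{"host":"127.0.0.1","port":3001}')  # updates
p store.nodes
```

Trying the update first keeps the common case (node already registered) to a single call.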
data/lib/nerve/ring_buffer.rb ADDED
@@ -0,0 +1,30 @@
1
+ module Nerve
2
+ class RingBuffer < Array
3
+ alias_method :array_push, :push
4
+ alias_method :array_element, :[]
5
+
6
+ def initialize( size )
7
+ @ring_size = size.to_i
8
+ super( @ring_size )
9
+ end
10
+
11
+ def average
12
+ self.inject(0.0) { |sum, el| sum + el } / self.size
13
+ end
14
+
15
+ def push( element )
16
+ if length == @ring_size
17
+ shift # lose element
18
+ end
19
+ array_push element
20
+ end
21
+
22
+ # Access elements in the RingBuffer
23
+ #
24
+ # offset will be typically negative!
25
+ #
26
+ def []( offset = 0 )
27
+ return self.array_element( - 1 + offset )
28
+ end
29
+ end
30
+ end
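The class above subclasses `Array`, and calling `super` with the size pre-fills the buffer with `nil` entries. As a point of comparison, here is a minimal sketch of a fixed-size window built by composition instead; `FixedWindow` is a hypothetical name and this is not the gem's `RingBuffer`.

```ruby
# Hypothetical fixed-size window for illustration; starts empty and
# discards the oldest item once it is full.
class FixedWindow
  def initialize(size)
    @size = Integer(size)
    @items = []
  end

  def push(item)
    @items.shift if @items.length == @size  # drop the oldest item
    @items.push(item)
    self
  end

  # the n most recent items, oldest first
  def last(n)
    @items.last(n)
  end

  def to_a
    @items.dup
  end
end

window = FixedWindow.new(3)
[1, 2, 3, 4].each { |n| window.push(n) }
p window.to_a
```

Wrapping an array rather than inheriting from it keeps the public surface small, so callers cannot bypass the size limit through other `Array` methods.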
data/lib/nerve/service_watcher/base.rb ADDED
@@ -0,0 +1,51 @@
1
+ module Nerve
2
+ module ServiceCheck
3
+ class BaseServiceCheck
4
+ include Utils
5
+ include Logging
6
+
7
+ def initialize(opts={})
8
+ @timeout = opts['timeout'] ? opts['timeout'].to_f : 0.1
9
+ @rise = opts['rise'] ? opts['rise'].to_i : 1
10
+ @fall = opts['fall'] ? opts['fall'].to_i : 1
11
+ @name = opts['name'] ? opts['name'] : "undefined"
12
+
13
+ @check_buffer = RingBuffer.new([@rise, @fall].max)
14
+ @last_result = nil
15
+ end
16
+
17
+ def up?
18
+ # do the check
19
+ check_result = !!ignore_errors do
20
+ check
21
+ end
22
+
23
+ # this is the first check -- initialize buffer
24
+ if @last_result == nil
25
+ @last_result = check_result
26
+ @check_buffer.size.times {@check_buffer.push check_result}
27
+ log.info "nerve: service check #{@name} initial check returned #{check_result}"
28
+ end
29
+
30
+ log.debug "nerve: service check #{@name} returned #{check_result}"
31
+ @check_buffer.push(check_result)
32
+
33
+ # we've failed if the last @fall times are false
34
+ unless @check_buffer.last(@fall).reduce(:|)
35
+ log.info "nerve: service check #{@name} transitions to down after #{@fall} failures" if @last_result
36
+ @last_result = false
37
+ end
38
+
39
+ # we've succeeded if the last @rise results are all true
40
+ if @check_buffer.last(@rise).reduce(:&)
41
+ log.info "nerve: service check #{@name} transitions to up after #{@rise} successes" unless @last_result
42
+ @last_result = true
43
+ end
44
+
45
+ # otherwise return the last result
46
+ return @last_result
47
+ end
48
+ end
49
+ end
50
+ end
51
+
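The `up?` method above applies rise/fall hysteresis: the reported state flips to down only after `fall` consecutive failures and back to up only after `rise` consecutive successes. The sketch below isolates that logic from the actual checks. `RiseFallTracker` is a hypothetical name, and a plain array is used in place of the ring buffer.

```ruby
# Hypothetical tracker illustrating rise/fall hysteresis.
class RiseFallTracker
  def initialize(rise, fall)
    @rise = rise
    @fall = fall
    @window = [rise, fall].max
    @history = []
    @state = nil
  end

  # Record one raw check result and return the debounced state.
  def record(result)
    result = !!result

    # first result: seed the history so the state starts at that value
    if @state.nil?
      @state = result
      @history = Array.new(@window, result)
    end

    @history.push(result)
    @history.shift while @history.length > @window

    @state = false if @history.last(@fall).none?  # fall consecutive failures
    @state = true  if @history.last(@rise).all?   # rise consecutive successes
    @state
  end
end

tracker = RiseFallTracker.new(3, 2)
raw = [true, false, true, false, false, true, true, true]
p raw.map { |r| tracker.record(r) }
```

With `rise` 3 and `fall` 2, a single failed check does not change the state; it takes two failures in a row to go down and three successes in a row to come back up.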
data/lib/nerve/service_watcher/http.rb ADDED
@@ -0,0 +1,62 @@
1
+ require 'nerve/service_watcher/base'
2
+
3
+ module Nerve
4
+ module ServiceCheck
5
+ require 'net/http'
6
+
7
+ class HttpServiceCheck < BaseServiceCheck
8
+ def initialize(opts={})
9
+ super
10
+
11
+ %w{port uri}.each do |required|
12
+ raise ArgumentError, "missing required argument #{required} in http check" unless
13
+ opts[required]
14
+ instance_variable_set("@#{required}",opts[required])
15
+ end
16
+
17
+ @host = opts['host'] || '127.0.0.1'
18
+ @ssl = opts['ssl'] || false
19
+
20
+ @read_timeout = opts['read_timeout'] || @timeout
21
+ @open_timeout = opts['open_timeout'] || 0.2
22
+ @ssl_timeout = opts['ssl_timeout'] || 0.2
23
+
24
+ @name = "http-#{@host}:#{@port}#{@uri}"
25
+ end
26
+
27
+ def check
28
+ log.debug "running health check #{@name}"
29
+
30
+ connection = get_connection
31
+ response = connection.get(@uri)
32
+ code = response.code.to_i
33
+
34
+ log.debug "nerve: check #{@name} got response code #{code}"
35
+ if code >= 200 and code < 300
36
+ return true
37
+ else
38
+ return false
39
+ end
40
+ end
41
+
42
+ private
43
+ def get_connection
44
+ con = Net::HTTP.new(@host, @port)
45
+ con.read_timeout = @read_timeout
46
+ con.open_timeout = @open_timeout
47
+
48
+ if @ssl
49
+ con.use_ssl = true
50
+ con.ssl_timeout = @ssl_timeout
51
+ con.verify_mode = OpenSSL::SSL::VERIFY_NONE
52
+ end
53
+
54
+ return con
55
+ end
56
+
57
+ end
58
+
59
+ CHECKS ||= {}
60
+ CHECKS['http'] = HttpServiceCheck
61
+ end
62
+ end
data/lib/nerve/service_watcher/rabbitmq.rb ADDED
@@ -0,0 +1,68 @@
1
+ require 'nerve/service_watcher/base'
2
+ require 'bunny'
3
+
4
+ module Nerve
5
+ module ServiceCheck
6
+ class RabbitMQServiceCheck < BaseServiceCheck
7
+ require 'socket'
8
+ include Socket::Constants
9
+
10
+ def initialize(opts={})
11
+ super
12
+
13
+ raise ArgumentError, "missing required argument 'port' in rabbitmq check" unless opts['port']
14
+
15
+ @port = opts['port']
16
+ @host = opts['host'] || '127.0.0.1'
17
+ @user = opts['username'] || 'guest'
18
+ @pass = opts['password'] || 'guest'
19
+ end
20
+
21
+ def check
22
+ # the idea for this check was taken from the one in rabbitmq management
23
+ # -- the aliveness_test:
24
+ # https://github.com/rabbitmq/rabbitmq-management/blob/9a8e3d1ab5144e3f6a1cb9a4639eb738713b926d/src/rabbit_mgmt_wm_aliveness_test.erl
25
+ log.debug "nerve: running rabbitmq health check #{@name}"
26
+
27
+ conn = Bunny.new(
28
+ :host => @host,
29
+ :port => @port,
30
+ :user => @user,
31
+ :pass => @pass,
32
+ :log_file => STDERR,
33
+ :continuation_timeout => @timeout,
34
+ :automatically_recover => false,
35
+ :heartbeat => false,
36
+ :threaded => false
37
+ )
38
+
39
+ begin
40
+ conn.start
41
+ ch = conn.create_channel
42
+
43
+ # create a queue, publish to it
44
+ log.debug "nerve: publishing to rabbitmq"
45
+ ch.queue('nerve')
46
+ ch.basic_publish('nerve test message', '', 'nerve', :mandatory => true, :expiration => 2 * 1000)
47
+
48
+ # read and ack the message
49
+ log.debug "nerve: consuming from rabbitmq"
50
+ delivery_info, properties, payload = ch.basic_get('nerve', :ack => true)
51
+
52
+ if payload
53
+ ch.acknowledge(delivery_info.delivery_tag)
54
+ return true
55
+ else
56
+ log.debug "nerve: rabbitmq consumption returned no payload"
57
+ return false
58
+ end
59
+ ensure
60
+ conn.close
61
+ end
62
+ end
63
+ end
64
+
65
+ CHECKS ||= {}
66
+ CHECKS['rabbitmq'] = RabbitMQServiceCheck
67
+ end
68
+ end
data/lib/nerve/service_watcher/tcp.rb ADDED
@@ -0,0 +1,56 @@
1
+ require 'nerve/service_watcher/base'
2
+
3
+ module Nerve
4
+ module ServiceCheck
5
+ class TcpServiceCheck < BaseServiceCheck
6
+ require 'socket'
7
+ include Socket::Constants
8
+
9
+ def initialize(opts={})
10
+ super
11
+
12
+ raise ArgumentError, "missing required argument 'port' in tcp check" unless opts['port']
13
+
14
+ @port = opts['port']
15
+ @host = opts['host'] || '127.0.0.1'
16
+
17
+ @address = Socket.sockaddr_in(@port, @host)
18
+ end
19
+
20
+ def check
21
+ log.debug "nerve: running TCP health check #{@name}"
22
+
23
+ # create a TCP socket
24
+ socket = Socket.new(AF_INET, SOCK_STREAM, 0)
25
+
26
+ begin
27
+ # open a non-blocking connection
28
+ socket.connect_nonblock(@address)
29
+ rescue Errno::EINPROGRESS
30
+ # opening a non-blocking socket will usually raise
31
+ # this exception. it's just connect returning immediately,
32
+ # so it's not really an exception, but ruby makes it into
33
+ # one. if we got here, we are now free to wait until the timeout
34
+ # expires for the socket to be writeable
35
+ IO.select(nil, [socket], nil, @timeout)
36
+
37
+ # we should be connected now; allow any other exception through
38
+ begin
39
+ socket.connect_nonblock(@address)
40
+ rescue Errno::EISCONN
41
+ return true
42
+ end
43
+ else
44
+ # we managed to connect REALLY REALLY FAST
45
+ log.debug "nerve: connected to non-blocking socket without an exception"
46
+ return true
47
+ ensure
48
+ socket.close
49
+ end
50
+ end
51
+ end
52
+
53
+ CHECKS ||= {}
54
+ CHECKS['tcp'] = TcpServiceCheck
55
+ end
56
+ end
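The TCP check above relies on a non-blocking connect followed by `IO.select` with a timeout. Here is a standalone sketch of that technique as a single function that returns `false` rather than raising. `tcp_port_open?` is a hypothetical helper, and it assumes an IPv4 address, as the check above does.

```ruby
require 'socket'

# Hypothetical helper: true if a TCP connection to host:port completes
# within `timeout` seconds, false on timeout or connection error.
def tcp_port_open?(host, port, timeout = 0.1)
  address = Socket.sockaddr_in(port, host)
  socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)

  begin
    socket.connect_nonblock(address)
    true  # connected immediately
  rescue IO::WaitWritable
    # connect is in progress; wait until the socket becomes writable
    return false unless IO.select(nil, [socket], nil, timeout)

    begin
      # a second connect reports the outcome of the first
      socket.connect_nonblock(address)
      true
    rescue Errno::EISCONN
      true   # already connected: the handshake succeeded
    rescue SystemCallError
      false  # e.g. connection refused
    end
  rescue SystemCallError
    false
  ensure
    socket.close
  end
end
```

For example, `tcp_port_open?('127.0.0.1', 6379, 0.2)` would report whether something is accepting connections on the port used by `your_tcp_service` in the example config.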
data/lib/nerve/service_watcher.rb ADDED
@@ -0,0 +1,96 @@
1
+ require 'nerve/service_watcher/tcp'
2
+ require 'nerve/service_watcher/http'
3
+ require 'nerve/service_watcher/rabbitmq'
4
+
5
+ module Nerve
6
+ class ServiceWatcher
7
+ include Utils
8
+ include Logging
9
+
10
+ def initialize(service={})
11
+ log.debug "nerve: creating service watcher object"
12
+
13
+ # check that we have all of the required arguments
14
+ %w{name instance_id host port zk_hosts zk_path}.each do |required|
15
+ raise ArgumentError, "missing required argument #{required} for new service watcher" unless service[required]
16
+ end
17
+
18
+ @name = service['name']
19
+
20
+ # configure the reporter, which we use for talking to zookeeper
21
+ @reporter = Reporter.new({
22
+ 'hosts' => service['zk_hosts'],
23
+ 'path' => service['zk_path'],
24
+ 'key' => "#{service['instance_id']}_#{@name}",
25
+ 'data' => {'host' => service['host'], 'port' => service['port']},
26
+ })
27
+
28
+ # instantiate the checks for this service
29
+ @service_checks = []
30
+ service['checks'] ||= []
31
+ service['checks'].each do |check|
32
+ check['type'] ||= "undefined"
33
+ begin
34
+ service_check_class = ServiceCheck::CHECKS[check['type']]
35
+ rescue
36
+ raise ArgumentError,
37
+ "invalid service check type #{check['type']}; valid types: #{ServiceCheck::CHECKS.keys.join(',')}"
38
+ end
39
+
40
+ check['host'] ||= service['host']
41
+ check['port'] ||= service['port']
42
+ check['name'] ||= "#{@name} #{check['type']}-#{check['host']}:#{check['port']}"
43
+ @service_checks << service_check_class.new(check)
44
+ end
45
+
46
+ # how often do we initiate service checks?
47
+ @check_interval = service['check_interval'] || 0.5
48
+
49
+ log.debug "nerve: created service watcher for #{@name} with #{@service_checks.size} checks"
50
+ end
51
+
52
+ def run()
53
+ log.info "nerve: starting service watch #{@name}"
54
+
55
+ # begin by reporting down
56
+ @reporter.start()
57
+ @reporter.report_down
58
+ was_up = false
59
+
60
+ until $EXIT
61
+ @reporter.ping?
62
+
63
+ # what is the status of the service?
64
+ is_up = check?
65
+ log.debug "nerve: current service status for #{@name} is #{is_up.inspect}"
66
+
67
+ if is_up != was_up
68
+ if is_up
69
+ @reporter.report_up
70
+ log.info "nerve: service #{@name} is now up"
71
+ else
72
+ @reporter.report_down
73
+ log.warn "nerve: service #{@name} is now down"
74
+ end
75
+ was_up = is_up
76
+ end
77
+
78
+ # wait to run more checks
79
+ sleep @check_interval
80
+ end
81
+ rescue StandardError => e
82
+ log.error "nerve: error in service watcher #{@name}: #{e}"
83
+ raise e
84
+ ensure
85
+ log.info "nerve: ending service watch #{@name}"
86
+ $EXIT = true
87
+ end
88
+
89
+ def check?
90
+ @service_checks.each do |check|
91
+ return false unless check.up?
92
+ end
93
+ return true
94
+ end
95
+ end
96
+ end
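One detail in the watcher above: `ServiceCheck::CHECKS[check['type']]` is a plain `Hash` lookup, which returns `nil` for an unknown key instead of raising, so the surrounding `rescue` would not be reached for a mistyped check type. A possible alternative is `Hash#fetch` with a block, sketched below. `CHECK_TYPES` and `lookup_check` are hypothetical names, and symbols stand in for the check classes.

```ruby
# Hypothetical registry for illustration; the real one maps type names
# to check classes.
CHECK_TYPES = { 'tcp' => :tcp_check, 'http' => :http_check }

# Look up a check type, raising a descriptive error for unknown types.
def lookup_check(type)
  CHECK_TYPES.fetch(type) do
    raise ArgumentError,
      "invalid service check type #{type}; valid types: #{CHECK_TYPES.keys.join(',')}"
  end
end

p lookup_check('tcp')
```

With this shape, a config typo such as `"type": "htpp"` fails at startup with the list of valid types in the message.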
data/lib/nerve/utils.rb ADDED
@@ -0,0 +1,17 @@
1
+ module Nerve
2
+ module Utils
3
+ def safe_run(command)
4
+ res = `#{command}`.chomp
5
+ raise "command '#{command}' failed to run:\n#{res}" unless $?.success?
6
+ end
7
+
8
+ def ignore_errors(&block)
9
+ begin
10
+ return yield
11
+ rescue Object => error
12
+ log.debug "ignoring error #{error.inspect}"
13
+ return false
14
+ end
15
+ end
16
+ end
17
+ end
data/lib/nerve/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Nerve
2
- VERSION = "0.0.1"
2
+ VERSION = "0.2.1"
3
3
  end
data/lib/nerve.rb CHANGED
@@ -1,5 +1,69 @@
1
- require "nerve/version"
1
+ require 'logger'
2
+ require 'json'
3
+ require 'timeout'
4
+
5
+ require 'nerve/version'
6
+ require 'nerve/utils'
7
+ require 'nerve/log'
8
+ require 'nerve/ring_buffer'
9
+ require 'nerve/reporter'
10
+ require 'nerve/service_watcher'
2
11
 
3
12
  module Nerve
4
- # Your code goes here...
13
+ class Nerve
14
+
15
+ include Logging
16
+
17
+ def initialize(opts={})
18
+ # set global variable for exit signal
19
+ $EXIT = false
20
+
21
+ # trap int signal and set exit to true
22
+ %w{INT TERM}.each do |signal|
23
+ trap(signal) do
24
+ $EXIT = true
25
+ end
26
+ end
27
+
28
+ log.info 'nerve: starting up!'
29
+
30
+ # required options
31
+ log.debug 'nerve: checking for required inputs'
32
+ %w{instance_id services}.each do |required|
33
+ raise ArgumentError, "you need to specify required argument #{required}" unless opts[required]
34
+ end
35
+
36
+ @instance_id = opts['instance_id']
37
+
38
+ # create service watcher objects
39
+ log.debug 'nerve: creating service watchers'
40
+ @service_watchers=[]
41
+ opts['services'].each do |name, config|
42
+ @service_watchers << ServiceWatcher.new(config.merge({'instance_id' => @instance_id, 'name' => name}))
43
+ end
44
+
45
+ log.debug 'nerve: completed init'
46
+ end
47
+
48
+ def run
49
+ log.info 'nerve: starting run'
50
+ begin
51
+ children = []
52
+
53
+ log.debug 'nerve: launching service check threads'
54
+ @service_watchers.each do |watcher|
55
+ children << Thread.new{watcher.run}
56
+ end
57
+
58
+ log.debug 'nerve: main thread done, waiting for children'
59
+ children.each do |child|
60
+ child.join
61
+ end
62
+ ensure
63
+ $EXIT = true
64
+ end
65
+ log.info 'nerve: exiting'
66
+ end
67
+
68
+ end
5
69
  end
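The `run` method above starts one thread per service watcher and joins them, with a global `$EXIT` flag telling every loop to stop. The sketch below shows the same shape with the flag passed in as an object rather than a global. `StopFlag` and `run_workers` are hypothetical names, and the worker body only counts iterations.

```ruby
# Hypothetical shared stop flag, replacing the global for illustration.
class StopFlag
  def initialize
    @stopped = false
  end

  def stop!
    @stopped = true
  end

  def stopped?
    @stopped
  end
end

# Start one thread per name; each loops until the flag is set and
# returns how many iterations it completed.
def run_workers(names, flag, interval = 0.01)
  threads = names.map do |name|
    Thread.new do
      iterations = 0
      until flag.stopped?
        iterations += 1   # a real worker would run its checks here
        sleep interval
      end
      [name, iterations]
    end
  end

  # Thread#value joins the thread and returns the block's result
  Hash[threads.map(&:value)]
end

flag = StopFlag.new
Thread.new { sleep 0.05; flag.stop! }
p run_workers(%w{web cache}, flag).keys
```

Passing the flag explicitly makes the stop condition visible in the method signature, which can make the loop easier to test than one that reads a global.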
data/nerve.gemspec CHANGED
@@ -6,8 +6,8 @@ require 'nerve/version'
6
6
  Gem::Specification.new do |gem|
7
7
  gem.name = "nerve"
8
8
  gem.version = Nerve::VERSION
9
- gem.authors = ["Martin Rhoads"]
10
- gem.email = ["martin.rhoads@airbnb.com"]
9
+ gem.authors = ["Martin Rhoads", "Igor Serebryany", "Pierre Carrier"]
10
+ gem.email = ["martin.rhoads@airbnb.com", "igor.serebryany@airbnb.com"]
11
11
  gem.description = %q{description}
12
12
  gem.summary = %q{summary}
13
13
  gem.homepage = ""
@@ -16,4 +16,7 @@ Gem::Specification.new do |gem|
16
16
  gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
17
17
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
18
18
  gem.require_paths = ["lib"]
19
+
20
+ gem.add_runtime_dependency "zk", "~> 1.9.2"
21
+ gem.add_runtime_dependency "bunny", "= 1.0.0.rc2"
19
22
  end
metadata CHANGED
@@ -1,29 +1,78 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: nerve
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.2.1
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
8
8
  - Martin Rhoads
9
+ - Igor Serebryany
10
+ - Pierre Carrier
9
11
  autorequire:
10
12
  bindir: bin
11
13
  cert_chain: []
12
- date: 2012-12-22 00:00:00.000000000 Z
13
- dependencies: []
14
+ date: 2013-10-24 00:00:00.000000000 Z
15
+ dependencies:
16
+ - !ruby/object:Gem::Dependency
17
+ name: zk
18
+ requirement: !ruby/object:Gem::Requirement
19
+ none: false
20
+ requirements:
21
+ - - ~>
22
+ - !ruby/object:Gem::Version
23
+ version: 1.9.2
24
+ type: :runtime
25
+ prerelease: false
26
+ version_requirements: !ruby/object:Gem::Requirement
27
+ none: false
28
+ requirements:
29
+ - - ~>
30
+ - !ruby/object:Gem::Version
31
+ version: 1.9.2
32
+ - !ruby/object:Gem::Dependency
33
+ name: bunny
34
+ requirement: !ruby/object:Gem::Requirement
35
+ none: false
36
+ requirements:
37
+ - - '='
38
+ - !ruby/object:Gem::Version
39
+ version: 1.0.0.rc2
40
+ type: :runtime
41
+ prerelease: false
42
+ version_requirements: !ruby/object:Gem::Requirement
43
+ none: false
44
+ requirements:
45
+ - - '='
46
+ - !ruby/object:Gem::Version
47
+ version: 1.0.0.rc2
14
48
  description: description
15
49
  email:
16
50
  - martin.rhoads@airbnb.com
17
- executables: []
51
+ - igor.serebryany@airbnb.com
52
+ executables:
53
+ - nerve
18
54
  extensions: []
19
55
  extra_rdoc_files: []
20
56
  files:
21
57
  - .gitignore
58
+ - .nerve.rc
22
59
  - Gemfile
23
60
  - LICENSE.txt
24
61
  - README.md
25
62
  - Rakefile
63
+ - Vagrantfile
64
+ - bin/nerve
65
+ - example/nerve.conf.json
26
66
  - lib/nerve.rb
67
+ - lib/nerve/log.rb
68
+ - lib/nerve/reporter.rb
69
+ - lib/nerve/ring_buffer.rb
70
+ - lib/nerve/service_watcher.rb
71
+ - lib/nerve/service_watcher/base.rb
72
+ - lib/nerve/service_watcher/http.rb
73
+ - lib/nerve/service_watcher/rabbitmq.rb
74
+ - lib/nerve/service_watcher/tcp.rb
75
+ - lib/nerve/utils.rb
27
76
  - lib/nerve/version.rb
28
77
  - nerve.gemspec
29
78
  homepage: ''