auto-consul 0.0.1

data/Gemfile ADDED
@@ -0,0 +1,12 @@
1
+ source 'https://rubygems.org'
2
+
3
+ gem 'aws-sdk'
4
+
5
+ group :development do
6
+ gem 'rake', '~> 10.1.0'
7
+ gem 'pry', '~> 0.9.0'
8
+ end
9
+
10
+ group :test do
11
+ gem 'rspec'
12
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,36 @@
1
+ GEM
2
+ remote: https://rubygems.org/
3
+ specs:
4
+ aws-sdk (1.40.0)
5
+ json (~> 1.4)
6
+ nokogiri (>= 1.4.4)
7
+ coderay (1.1.0)
8
+ diff-lcs (1.2.5)
9
+ json (1.8.1)
10
+ method_source (0.8.2)
11
+ mini_portile (0.5.3)
12
+ nokogiri (1.6.1)
13
+ mini_portile (~> 0.5.0)
14
+ pry (0.9.12.6)
15
+ coderay (~> 1.0)
16
+ method_source (~> 0.8)
17
+ slop (~> 3.4)
18
+ rake (10.1.1)
19
+ rspec (2.14.1)
20
+ rspec-core (~> 2.14.0)
21
+ rspec-expectations (~> 2.14.0)
22
+ rspec-mocks (~> 2.14.0)
23
+ rspec-core (2.14.8)
24
+ rspec-expectations (2.14.5)
25
+ diff-lcs (>= 1.1.3, < 2.0)
26
+ rspec-mocks (2.14.6)
27
+ slop (3.5.0)
28
+
29
+ PLATFORMS
30
+ ruby
31
+
32
+ DEPENDENCIES
33
+ aws-sdk
34
+ pry (~> 0.9.0)
35
+ rake (~> 10.1.0)
36
+ rspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2014 Ethan Rowe
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,132 @@
1
+ auto-consul
2
+ ===========
3
+
4
+ Ruby gem for bootstrapping consul cluster members
5
+
6
+ # Example usage
7
+
8
+ Assume two vagrant boxes, each with consul and auto-consul installed.
9
+
10
+ Export your AWS keys into the environment in each:
11
+
12
+ ```
13
+ export AWS_ACCESS_KEY_ID=...
14
+ export AWS_SECRET_ACCESS_KEY=...
15
+ ```
16
+
17
+ This will allow the AWS SDK to pick them up.
18
+
19
+ The server, screen A:
20
+
21
+ auto-consul -r s3://my-bucket/consul/test-cluster \
22
+ -a 192.168.50.100 \
23
+ -n server1 \
24
+ run
25
+
26
+ Then, server screen B:
27
+
28
+ while true; do
29
+ auto-consul -r s3://my-bucket/consul/test-cluster \
30
+ -a 192.168.50.100 \
31
+ -n server1 \
32
+ heartbeat
33
+ sleep 60
34
+ done
35
+
36
+ The first command launches the agent; the second checks the agent's run status and
37
+ issues a heartbeat to the specified S3 bucket.
38
+
39
+ Because this is the first server, there will be no heartbeats in the
40
+ bucket (assuming a fresh bucket/key combination). Therefore, the agent
41
+ will be launched in server mode, along with the bootstrap option to
42
+ initialize the raft cluster for state management.
43
+
44
+ Look in the S3 bucket above, under "servers", and you should see
45
+ a timestamped entry like "20140516092731-server1". This is produced
46
+ by the "heartbeat" command and allows new agents to discover active
47
+ members of the cluster for joining.
48
+
49
+ Having seen the server heartbeat, go to the agent vagrant box, and
50
+ do something similar. Screen A:
51
+
52
+ auto-consul -r s3://my-bucket/consul/test-cluster \
53
+ -a 192.168.50.101 \
54
+ -n agent1 \
55
+ run
56
+
57
+ In this case, the agent will discover the server via its heartbeat. It
58
+ will know that we have enough servers (it defaults to only wanting one;
59
+ that's fine for dev/testing but not good for availability) and thus
60
+ simply join as a normal agent.
61
+
62
+ Screen B:
63
+
64
+ while true; do
65
+ auto-consul -r s3://my-bucket/consul/test-cluster \
66
+ -a 192.168.50.101 \
67
+ -n agent1 \
68
+ heartbeat
69
+ sleep 60
70
+ done
71
+
72
+ This generates heartbeats like the server did, but while the server
73
+ sends heartbeats both to "servers" and "agents" in the bucket, the
74
+ normal agent sends heartbeats only to "agents".
75
+
76
+ # Mode determination
77
+
78
+ Given a desired number of servers (defaulting to 1) and a registry
79
+ (for now, an S3-style URL), the basic algorithm is:
80
+
81
+ - Are there enough servers?
82
+ - Yes: Be an agent. Done.
83
+ - No: are there no servers?
84
+ - Yes: Be a server with bootstrap mode. Done.
85
+ - No: Be a server without bootstrap mode, joining with others. Done.
86
+
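The decision list above can be sketched as a small pure function. This is a minimal illustration, not the gem's own code: the name `determine_mode` and the returned symbols are hypothetical, and the count of active servers is assumed to have been read from the registry already.

```ruby
# Hypothetical helper illustrating the mode decision described above.
# active_servers:  number of unexpired server heartbeats in the registry
# desired_servers: how many servers the cluster should have (default 1)
def determine_mode(active_servers, desired_servers = 1)
  # Enough servers already: join as a plain agent.
  return :agent if active_servers >= desired_servers
  # No servers at all: become the first server and bootstrap the cluster.
  return :server_bootstrap if active_servers.zero?
  # Some servers, but not enough: become a server and join the others.
  :server_join
end
```

For example, `determine_mode(0)` selects the bootstrap server, while `determine_mode(1, 3)` selects a server that joins the existing ones.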
87
+ There is very obviously a race condition in the determination of node
88
+ mode. In practice, it should be easy enough to coordinate things such
89
+ that the race doesn't cause problems. Longer term, we'll need to revise
90
+ the mode determination logic to use a backend supporting optimistic
91
+ locking or some such. (A compare-and-swap pattern would work fine; consul
92
+ itself would allow for this given one existing server).
93
+
94
+ ## Heartbeats and membership
95
+
96
+ The heartbeats give us a rough indication of cluster membership. The
97
+ tool uses an expiry time (in seconds) to determine which heartbeats are
98
+ still active, and will purge any expired heartbeats from the registry
99
+ whenever it encounters them.
100
+
101
+ Each heartbeat tells us:
102
+ - The node's name within the consul cluster
103
+ - The timestamp of the heartbeat (the freshness)
104
+ - The IP at which the node can be reached for cluster join operations.
105
+
106
+ For now, it is necessary to run the heartbeat utility in parallel to the
107
+ run utility. In subsequent work we may want to have these things coordinated
108
+ by one daemon, but given the experimental nature of this project it's not
109
+ worth caring about just yet.
110
+
111
+ The heartbeat asks consul for its status and from that determines if it
112
+ is running as a server or regular agent (or if it is running at all). If
113
+ consul is not running at all, no heartbeat will be emitted.
114
+
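A rough sketch of how the run status might be derived from the text of `consul info` follows. It assumes the output contains a `consul:` block with an indented `server = true` or `server = false` line; the sample text used with it is illustrative rather than captured from a real agent, and the method takes a string instead of a stream for simplicity.

```ruby
AGENT_MASK  = 0b01 # consul is running at all
SERVER_MASK = 0b10 # consul is running in server mode

# Returns a bitmask describing the run state found in the given text.
def flags_from_output(text)
  flags = 0
  in_consul_block = false
  text.each_line do |line|
    if !in_consul_block
      # Wait for the start of the "consul:" block.
      if line =~ /^consul:\s/
        in_consul_block = true
        flags |= AGENT_MASK
      end
    elsif line !~ /^\s+/
      # An unindented line ends the consul block.
      break
    elsif line =~ /^\s+server\s+=\s+true/
      flags |= SERVER_MASK
      break
    end
  end
  flags
end
```

A result of `0` means no `consul:` block was seen, so no heartbeat would be sent. A result with only `AGENT_MASK` set indicates a regular agent, and both bits set indicate a server.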
115
+ The default expiry is 120 seconds. It is recommended that heartbeats fire
116
+ at half that duration (60 seconds).
117
+
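To make the expiry rule concrete, here is a minimal sketch of how a timestamped heartbeat key such as "20140516092731-server1" can be built and tested for freshness by plain string comparison. The helper names `heartbeat_key` and `active?` are hypothetical; the timestamp format matches the example key shown earlier.

```ruby
KEY_TIMESTAMP_FORMAT = '%Y%m%d%H%M%S'

# Builds a key like "20140516092731-server1" from a time and a node name.
def heartbeat_key(time, node)
  "#{time.getutc.strftime(KEY_TIMESTAMP_FORMAT)}-#{node}"
end

# A heartbeat is active if it is strictly younger than `expiry` seconds.
# Keys sort chronologically, so comparing strings is sufficient.
def active?(key, now, expiry)
  min_key = (now.getutc - expiry + 1).strftime("#{KEY_TIMESTAMP_FORMAT}-")
  key >= min_key
end
```

With the default expiry of 120 seconds and a heartbeat every 60 seconds, a healthy node always has at least one active key, and one missed heartbeat is tolerated before the node drops out of the membership list.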
118
+ # Cluster join
119
+
120
+ After the node mode is determined, it's necessary (except in the case of
121
+ a bootstrap-mode server) to join a cluster by contacting an extant member.
122
+
123
+ This is the primary purpose of the heartbeat registry; a server-mode node
124
+ will find the IP (from the active heartbeats) of a *server*, and use that
125
+ IP to join the cluster. An agent-mode node will find the IP of an *agent*
126
+ for the join operation.
127
+
128
+ In a production-ready tool, we would have a monitor on the registry and
129
+ keep trying new hosts until a join succeeds. But in this experimental
130
+ phase, it just picks the first member in the relevant list and uses that.
131
+ If that member is actually down, then the join simply won't work.
132
+
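The join behaviour described above, along with the retrying variant that a production-ready tool might use, can be sketched as follows. Everything here is illustrative: `Member` stands in for a registry entry whose `data` is the node's IP, and `join_first_reachable` is a hypothetical helper that takes the actual join attempt as a block.

```ruby
# Stand-in for a registry entry; `data` holds the IP used for joining.
Member = Struct.new(:identifier, :data)

# Current behaviour: take the first active member, or nil if there is none.
def pick_joining_host(members)
  first = members.first
  first && first.data
end

# Possible refinement: try each member in turn until the block reports success.
# Returns the IP that was joined, or nil if every attempt failed.
def join_first_reachable(members)
  members.each do |member|
    return member.data if yield(member.data)
  end
  nil
end
```

In real use the block could wrap something like `system('consul', 'join', ip)`, which returns true when the command succeeds.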
data/bin/auto-consul ADDED
@@ -0,0 +1,110 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ # vim: set filetype=ruby;
4
+
5
+ require 'auto-consul'
6
+ require 'optparse'
7
+ require 'socket'
8
+ require 'ostruct'
9
+
10
+ class UnknownCommandException < StandardError
11
+ end
12
+
13
+ class Command < OpenStruct
14
+ def local
15
+ @local ||= AutoConsul::Local.bind_to_path(data)
16
+ end
17
+
18
+ def cluster
19
+ @cluster ||= AutoConsul::Cluster.new registry
20
+ end
21
+
22
+ def state
23
+ @state ||= AutoConsul::RunState::CLIProvider.new
24
+ end
25
+
26
+ def do_set_mode
27
+ cluster.set_mode! local, expiry, servers
28
+ end
29
+
30
+ def do_run
31
+ if local.mode.nil?
32
+ do_set_mode
33
+ end
34
+ do_direct_run
35
+ end
36
+
37
+ def do_direct_run
38
+ runner = :no_op
39
+ runner = :run_agent! if local.agent?
40
+ runner = :run_server! if local.server?
41
+ runner = AutoConsul::Runner.method(runner)
42
+ runner.call node, addr, expiry, local, cluster
43
+ end
44
+
45
+ def do_heartbeat
46
+ if state.running?
47
+ cluster.servers.heartbeat! node, addr, expiry if state.server?
48
+ cluster.agents.heartbeat! node, addr, expiry if state.agent?
49
+ end
50
+ end
51
+
52
+ def execute cmd
53
+ command = "do_#{cmd}".to_sym
54
+ if respond_to? command
55
+ send command
56
+ else
57
+ raise UnknownCommandException.new("Unknown command: #{cmd}")
58
+ end
59
+ end
60
+ end
61
+
62
+ runner = Command.new(:data => '/tmp/consul/state',
63
+ :dc => 'dc1',
64
+ :expiry => 120,
65
+ :servers => 1,
66
+ :node => Socket.gethostname.split('.', 2)[0])
67
+
68
+ parser = OptionParser.new do |opts|
69
+ opts.banner = "Usage: auto-consul [options] COMMAND"
70
+
71
+ opts.on("-r", "--registry URL", String, "The cluster registry URL") do |u|
72
+ runner.registry = u
73
+ end
74
+
75
+ opts.on("-d", "--data-dir PATH", String, "The path where local state will be preserved.") do |d|
76
+ runner.data = d
77
+ end
78
+
79
+ opts.on("-a", "--address IPADDR", String, "The IP address to bind to and announce for cluster communication.") do |a|
80
+ runner.addr = a
81
+ end
82
+
83
+ opts.on("-n", "--node NAME", String, "The unique name by which the node identifies itself within the cluster.") do |n|
84
+ runner.node = n
85
+ end
86
+
87
+ opts.on("-e", "--expiry SECONDS", Integer, "The expiration time (in seconds) for registry heartbeats") do |e|
88
+ runner.expiry = e.to_i
89
+ end
90
+
91
+ opts.on("-s", "--servers NUMBER", Integer, "The desired number of consul servers.") do |s|
92
+ runner.servers = s.to_i
93
+ end
94
+
95
+ opts.on_tail('-h', '--help', "Show this help message.") do
96
+ puts opts
97
+ exit
98
+ end
99
+ end
100
+
101
+ parser.parse!
102
+
103
+ begin
104
+ runner.execute(ARGV.shift)
105
+ rescue UnknownCommandException => e
106
+ puts e.message
107
+ puts parser
108
+ exit 2
109
+ end
110
+
@@ -0,0 +1,53 @@
1
+ require 'uri'
2
+
3
+ module AutoConsul
4
+ class Cluster
5
+ def self.get_provider_for_uri uri_string
6
+ uri = URI(uri_string)
7
+ Registry.supported_schemes[uri.scheme.downcase].new uri
8
+ end
9
+
10
+ attr_reader :uri_string
11
+
12
+ def initialize uri
13
+ @uri_string = uri
14
+ end
15
+
16
+ def servers
17
+ @servers ||= self.class.get_provider_for_uri File.join(uri_string, 'servers')
18
+ end
19
+
20
+ def agents
21
+ @agents ||= self.class.get_provider_for_uri File.join(uri_string, 'agents')
22
+ end
23
+
24
+ def set_mode! local_state, expiry, desired_servers=1
25
+ if servers.members(expiry).size < desired_servers
26
+ local_state.set_server!
27
+ else
28
+ local_state.set_agent!
29
+ end
30
+ end
31
+
32
+ module Registry
33
+ def self.supported_schemes
34
+ constants.inject({}) do |a, const|
35
+ if const.to_s =~ /^(.+?)Provider$/
36
+ a[$1.downcase] = const_get(const)
37
+ end
38
+ a
39
+ end
40
+ end
41
+
42
+ class Provider
43
+ attr_reader :uri
44
+
45
+ def initialize uri
46
+ @uri = uri
47
+ end
48
+ end
49
+ end
50
+ end
51
+ end
52
+
53
+ require_relative 'providers/s3'
@@ -0,0 +1,78 @@
1
+ require 'fileutils'
2
+
3
+ module AutoConsul
4
+ module Local
5
+ class FileSystemState
6
+ def initialize path
7
+ unless File.directory? path
8
+ FileUtils.mkdir_p path
9
+ end
10
+
11
+ @path = path
12
+ end
13
+
14
+ def path
15
+ @path
16
+ end
17
+
18
+ def mode_path
19
+ File.join(path, 'mode')
20
+ end
21
+
22
+ def set_server!
23
+ set_mode 'server'
24
+ end
25
+
26
+ def set_agent!
27
+ set_mode 'agent'
28
+ end
29
+
30
+ def set_mode mode
31
+ File.open(mode_path, 'w') do |f|
32
+ f.write mode
33
+ end
34
+ end
35
+
36
+ VALID_MODES = {
37
+ 'agent' => 'agent',
38
+ 'server' => 'server',
39
+ }
40
+
41
+ def self.determine_mode mode_file
42
+ if File.file? mode_file
43
+ value = File.open(mode_file, 'r') {|f| f.read }
44
+ VALID_MODES[value]
45
+ else
46
+ nil
47
+ end
48
+ end
49
+
50
+ def mode
51
+ if @mode.nil?
52
+ @mode = self.class.determine_mode mode_path
53
+ end
54
+ @mode
55
+ end
56
+
57
+ def server?
58
+ mode == 'server'
59
+ end
60
+
61
+ def agent?
62
+ mode == 'agent'
63
+ end
64
+
65
+ def data_path
66
+ if not (m = mode).nil?
67
+ File.join(path, m)
68
+ else
69
+ nil
70
+ end
71
+ end
72
+ end
73
+
74
+ def self.bind_to_path path
75
+ FileSystemState.new path
76
+ end
77
+ end
78
+ end
@@ -0,0 +1,108 @@
1
+ require 'aws-sdk'; require 'time' # 'time' provides Time.strptime, used in read_stamp
2
+
3
+ module AutoConsul::Cluster::Registry
4
+ class S3Provider < Provider
5
+ class S3Member
6
+ attr_reader :s3_object, :identifier, :time
7
+
8
+ def initialize s3obj
9
+ @s3_object = s3obj
10
+ @time, @identifier = S3Provider.from_key_base(File.basename(s3obj.key))
11
+ @data_read = false
12
+ end
13
+
14
+ def data
15
+ if not @data_read
16
+ @data = s3_object.read
17
+ @data_read = true
18
+ end
19
+ @data
20
+ end
21
+ end
22
+
23
+ def s3
24
+ @s3 ||= self.class.get_s3
25
+ end
26
+
27
+ def self.get_s3
28
+ AWS::S3.new
29
+ end
30
+
31
+ def bucket_name
32
+ uri.host
33
+ end
34
+
35
+ def bucket
36
+ @bucket ||= s3.buckets[bucket_name]
37
+ end
38
+
39
+ def key_prefix
40
+ uri.path[1..-1]
41
+ end
42
+
43
+ def now
44
+ Time.now
45
+ end
46
+
47
+ KEY_TIMESTAMP_FORMAT = '%Y%m%d%H%M%S'
48
+
49
+ def self.write_stamp time
50
+ time.dup.utc.strftime KEY_TIMESTAMP_FORMAT
51
+ end
52
+
53
+ def self.read_stamp stamp
54
+ t = Time.strptime(stamp, KEY_TIMESTAMP_FORMAT)
55
+ Time.utc(t.year, t.month, t.day, t.hour, t.min, t.sec, 0)
56
+ end
57
+
58
+ def self.to_key_base time, identifier
59
+ "#{write_stamp time}-#{identifier.to_s}"
60
+ end
61
+
62
+ def self.from_key_base key_base
63
+ stamp, identifier = key_base.split('-', 2)
64
+ [read_stamp(stamp), identifier]
65
+ end
66
+
67
+ def write_key time, identity
68
+ File.join(key_prefix, self.class.to_key_base(time, identity))
69
+ end
70
+
71
+ def heartbeat! identity, data, expiry=nil
72
+ result = bucket.objects[write_key now, identity].write data
73
+ purge!(expiry) unless expiry.nil?
74
+ result
75
+ end
76
+
77
+ def purge! expiry
78
+ min_key = File.join(key_prefix, "#{self.class.write_stamp(Time.now - expiry + 1)}-")
79
+ bucket.objects.with_prefix(key_prefix).delete_if do |s3obj|
80
+ s3obj.key < min_key
81
+ end
82
+ end
83
+
84
+ def members expiry
85
+ deletes, actives = [], {}
86
+ # The expiry gives an exclusive boundary, not an inclusive,
87
+ # so the minimal allowable key must begin one second after the
88
+ # specified expiry (given a resolution of seconds).
89
+ min_time = Time.now.utc - expiry + 1
90
+ min_key = File.join(key_prefix, min_time.strftime('%Y%m%d%H%M%S-'))
91
+ bucket.objects.with_prefix(key_prefix).each do |obj|
92
+ if obj.key < min_key
93
+ deletes << obj
94
+ else
95
+ o = S3Member.new(obj)
96
+ actives[o.identifier] = o
97
+ end
98
+ end
99
+ deletes! deletes
100
+ actives.values.sort_by {|m| [m.time, m.identifier]}
101
+ end
102
+
103
+ def deletes! deletes
104
+ bucket.objects.delete deletes if deletes.size > 0
105
+ end
106
+ end
107
+ end
108
+
@@ -0,0 +1,61 @@
1
+ module AutoConsul::RunState
2
+ class CLIProvider
3
+ AGENT_MASK = 0b1
4
+ SERVER_MASK = 0b10
5
+
6
+ def check_run_state
7
+ result = 0
8
+ r, w = IO.pipe
9
+ if system("consul info", :out => w)
10
+ w.close
11
+ result = flags_from_output r
12
+ r.close
13
+ end
14
+ result
15
+ end
16
+
17
+ def flags_from_output stream
18
+ consul_block = false
19
+ result = 0
20
+ stream.each do |line|
21
+ if line =~ /^consul:\s/
22
+ consul_block = true
23
+ result = AGENT_MASK
24
+ break
25
+ end
26
+ end
27
+
28
+ if consul_block
29
+ stream.each do |line|
30
+ # Exit condition from consul block
31
+ break if line !~ /^\s+/
32
+
33
+ if line =~ /^\s+server\s+=\s+true/
34
+ result |= SERVER_MASK
35
+ break
36
+ end
37
+ end
38
+ end
39
+ result
40
+ end
41
+
42
+ def run_state
43
+ if @run_state.nil?
44
+ @run_state = check_run_state
45
+ end
46
+ @run_state
47
+ end
48
+
49
+ def agent?
50
+ (run_state & AGENT_MASK) > 0
51
+ end
52
+
53
+ def server?
54
+ (run_state & SERVER_MASK) > 0
55
+ end
56
+
57
+ def running?
58
+ run_state > 0
59
+ end
60
+ end
61
+ end
@@ -0,0 +1,5 @@
1
+ module AutoConsul::RunState
2
+ end
3
+
4
+ require_relative 'run_state/cli'
5
+
@@ -0,0 +1,56 @@
1
+ module AutoConsul
2
+ module Runner
3
+ SLEEP_INTERVAL = 2
4
+ RETRIES = 5
5
+
6
+ def self.launch_and_join(agent_args, remote_ip=nil)
7
+ pid = spawn(*(['consul', 'agent'] + agent_args))
8
+
9
+ # We really need to check that it is running, but later.
10
+ return nil unless verify_running(pid)
11
+
12
+ if not remote_ip.nil?
13
+ join remote_ip
14
+ end
15
+
16
+ pid
17
+ end
18
+
19
+ def self.verify_running pid
20
+ RETRIES.times do |i|
21
+ sleep SLEEP_INTERVAL + (SLEEP_INTERVAL * i)
22
+ return true if system('consul', 'info')
23
+ end
24
+ return false
25
+ end
26
+
27
+ def self.join remote_ip
28
+ system('consul', 'join', remote_ip)
29
+ end
30
+
31
+ def self.pick_joining_host hosts
32
+ # Let's randomize this later.
33
+ hosts.empty? ? nil : hosts[0].data
34
+ end
35
+
36
+ def self.run_agent! identity, bind_ip, expiry, local_state, registry
37
+ remote_ip = pick_joining_host(registry.agents.members(expiry))
38
+ pid = launch_and_join(['-bind', bind_ip,
39
+ '-data-dir', local_state.data_path,
40
+ '-node', identity], remote_ip)
41
+ Process.wait pid unless pid.nil?
42
+ end
43
+
44
+ def self.run_server! identity, bind_ip, expiry, local_state, registry
45
+ members = registry.servers.members(expiry)
46
+ remote_ip = members.size > 0 ? pick_joining_host(members) : nil
47
+
48
+ args = ['-bind', bind_ip, '-data-dir', local_state.data_path, '-node', identity, '-server']
49
+ args << '-bootstrap' if members.size < 1
50
+
51
+ pid = launch_and_join(args, remote_ip)
52
+
53
+ Process.wait pid unless pid.nil?
54
+ end
55
+ end
56
+ end
@@ -0,0 +1,8 @@
1
+ module AutoConsul
2
+ end
3
+
4
+ require 'auto-consul/local'
5
+ require 'auto-consul/cluster'
6
+ require 'auto-consul/run_state'
7
+ require 'auto-consul/runner'
8
+