auto-consul 0.0.1 → 0.0.2
- data/README.md +44 -32
- data/bin/auto-consul +37 -6
- data/lib/auto-consul/runner.rb +137 -16
- data/spec/runner_spec.rb +402 -18
- metadata +1 -1
data/README.md
CHANGED
@@ -14,32 +14,29 @@ Export your AWS keys into the environment in each:
 export AWS_SECRET_ACCESS_KEY=...
 ```

-This will allow the AWS SDK to pick them up.
+This will allow the AWS SDK to pick them up. (Note that on an EC2
+instance, the AWS SDK should seamlessly pick up any IAM roles associated
+with the instance, so these environment variables should not be
+necessary.)

-
+Then, run the agent via auto-consul:

     auto-consul -r s3://my-bucket/consul/test-cluster \
         -a 192.168.50.100 \
-        -n
+        -n agent1 \
+        -t 60 \
         run

-
+This will launch the consul agent, and emit heartbeats roughly every
+minute. The `-r` indicates the bucket to use as the heartbeat registry.
+The `-t` specifies the interval in seconds between heartbeats. The
+`-a` and `-n` options map to consul's native `-bind` and `-node` options,
+respectively.

-
-
-
-
-        heartbeat
-        sleep 60
-    done
-
-The first launches the agent, the latter checks its run status and
-issues a heartbeat to the specified S3 bucket.
-
-Because this is the first server, there will be no heartbeats in the
-bucket (assuming a fresh bucket/key combination). Therefore, the agent
-will be launched in server mode, along with the bootstrap option to
-initialize the raft cluster for state management.
+Because this is the first server, there will be no extant heartbeats in the
+bucket (assuming a fresh bucket/key combination) at startup. Therefore,
+the agent will be launched in server mode, along with the bootstrap option
+to initialize the raft cluster for state management.

 Look in the S3 bucket above, under "servers", and you should see
 a timestamped entry like "20140516092731-server1". This is produced
@@ -47,11 +44,12 @@ by the "heartbeat" command and allows new agents to discover active
 members of the cluster for joining.

 Having seen the server heartbeat, go to the agent vagrant box, and
-do something similar.
+do something similar.

     auto-consul -r s3://my-bucket/consul/test-cluster \
         -a 192.168.50.101 \
         -n agent1 \
+        -t 60 \
         run

 In this case, the agent will discover the server via its heartbeat. It
@@ -59,19 +57,38 @@ will know that we have enough servers (it defaults to only wanting one;
 that's fine for dev/testing but not good for availability) and thus
 simply join as a normal agent.

-
+While the server sends heartbeats both to "servers" and "agents" in the
+bucket, the normal agent sends heartbeats only to "agents".
+
+## Alternative - separate heartbeat
+
+The auto-consul runner does not have to issue heartbeats itself; those
+can be left out entirely (but please understand this means automatic
+mode determination won't work), or run in a separate process with timing
+under more precise control.
+
+For example, in our vagrant both, the following commands are equivalent
+to the original server run.
+
+Server screen A:
+
+    auto-consul -r s3://my-bucket/consul/test-cluster \
+        -a 192.168.50.100 \
+        -n server1 \
+        run
+
+Then, server screen B:

     while true; do
         auto-consul -r s3://my-bucket/consul/test-cluster \
-            -a 192.168.50.
-            -n
+            -a 192.168.50.100 \
+            -n server1 \
             heartbeat
         sleep 60
     done

-
-
-normal agent sends heartbeats only to "agents".
+The first launches the agent, the latter checks its run status and
+issues a heartbeat to the specified S3 bucket.

 # Mode determination

@@ -103,17 +120,12 @@ Each heartbeat tells us:
 - The timestamp of the heartbeat (the freshness)
 - The IP at which the node can be reached for cluster join operations.

-For now, it is necessary to run the heartbeat utility in parallel to the
-run utility. In subsequent work we may want to have these things coordinated
-by one daemon, but given the experimental nature of this project it's not
-worth caring about just yet.
-
 The heartbeat asks consul for its status and from that determines if it
 is running as a server or regular agent (or if it is running at all). If
 consul is not running at all, no heartbeat will be emitted.

 The default expiry is 120 seconds. It is recommended that heartbeats fire
-at half that duration (60 seconds).
+at half that duration (60 seconds) or less.

 # Cluster join

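The README changes above describe heartbeat entries named like "20140516092731-server1" and a default expiry of 120 seconds. As a rough illustration of how such entries could be checked for freshness, here is a minimal sketch. The names `Heartbeat`, `parse_heartbeat`, `fresh?` and `fresh_members` are hypothetical and are not part of auto-consul; the sketch also assumes the timestamp prefix is UTC in `YYYYMMDDHHMMSS` form, which the diff does not state.

```ruby
# Hypothetical helpers for illustration only; not auto-consul's API.
Heartbeat = Struct.new(:time, :node)

# Assumed entry layout: 14-digit UTC timestamp, a dash, then the node name.
ENTRY_PATTERN = /\A(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})-(.+)\z/

# Turn "20140516092731-server1" into a Heartbeat, or nil if it does not match.
def parse_heartbeat(entry)
  m = ENTRY_PATTERN.match(entry)
  return nil if m.nil?
  Heartbeat.new(Time.utc(*m[1..6].map(&:to_i)), m[7])
end

# A heartbeat counts as fresh when it is no older than `expiry` seconds.
def fresh?(heartbeat, now, expiry = 120)
  (now - heartbeat.time) <= expiry
end

# Keep only the parseable, unexpired entries.
def fresh_members(entries, now, expiry = 120)
  entries.map { |e| parse_heartbeat(e) }.compact.select { |hb| fresh?(hb, now, expiry) }
end
```

With a 120 second expiry and heartbeats every 60 seconds, a node can miss one heartbeat and still be treated as a live member, which is presumably why the README recommends firing at half the expiry or less.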
data/bin/auto-consul
CHANGED
@@ -12,7 +12,7 @@ end

 class Command < OpenStruct
   def local
-    @local ||= AutoConsul::Local.bind_to_path(
+    @local ||= AutoConsul::Local.bind_to_path(data_dir)
   end

   def cluster
@@ -25,6 +25,8 @@ class Command < OpenStruct

   def do_set_mode
     cluster.set_mode! local, expiry, servers
+    # Healthy exit
+    0
   end

   def do_run
@@ -36,16 +38,37 @@ class Command < OpenStruct

   def do_direct_run
     runner = :no_op
-    runner = :
-    runner = :
+    runner = :agent_runner if local.agent?
+    runner = :server_runner if local.server?
     runner = AutoConsul::Runner.method(runner)
-    runner.call
+    monitor = runner.call(node, addr, expiry, local, cluster)
+    if with_heartbeat and ticks > 0
+      monitor.while_up do |ap|
+        # There's data exchange with the agent and with the registries
+        # per heartbeat, which with Ruby's memory management could cause
+        # process bloat over a long time. So fork per heartbeat so we
+        # can avoid the bloat in the long-running process.
+        while true
+          sleep ticks
+          kid = fork { do_heartbeat }
+          Process.waitpid kid
+        end
+      end
+    end
+    monitor.run!
+    # Returns the exit status of the "consul agent" run.
+    monitor.wait
   end

   def do_heartbeat
     if state.running?
       cluster.servers.heartbeat! node, addr, expiry if state.server?
       cluster.agents.heartbeat! node, addr, expiry if state.agent?
+      # Healthy exit.
+      0
+    else
+      # Not running; can't heartbeat.
+      1
     end
   end

@@ -59,10 +82,12 @@ class Command < OpenStruct
   end
 end

-runner = Command.new(:
+runner = Command.new(:data_dir => '/tmp/consul/state',
                      :dc => 'dc1',
+                     :with_heartbeat => false,
                      :expiry => 120,
                      :servers => 1,
+                     :ticks => 60,
                      :node => Socket.gethostname.split('.', 2)[0])

 parser = OptionParser.new do |opts|
@@ -88,6 +113,11 @@ parser = OptionParser.new do |opts|
     runner.expiry = e.to_i
   end

+  opts.on("-t", "--ticks SECONDS", Integer, "The time between heartbeats (in seconds) for registry heartbeats; use of this activates a concurrent heartbeat thread for the 'run' command.") do |t|
+    runner.ticks = t.to_i
+    runner.with_heartbeat = true
+  end
+
   opts.on("-s", "--servers NUMBER", Integer, "The desired number of consul servers.") do |s|
     runner.servers = s.to_i
   end
@@ -101,7 +131,8 @@ end
 parser.parse!

 begin
-  runner.execute(ARGV.shift)
+  status = runner.execute(ARGV.shift)
+  exit status
 rescue UnknownCommandException => e
   puts e.message
   puts parser
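The comment in `do_direct_run` above explains the choice to fork once per heartbeat so that per-heartbeat allocations die with the child instead of accumulating in the long-running process. A minimal, bounded sketch of that pattern follows. The helper name `forked_heartbeats` and the `rounds` limit are hypothetical and exist only to make the example terminate; passing the block's integer result back as the child's exit status is also an assumption of this sketch and is not something the diff above does. It relies on `fork`, so it needs a Unix-like platform.

```ruby
# Hypothetical helper, not auto-consul's API: run `rounds` heartbeats,
# each in its own short-lived child process, and collect exit statuses.
def forked_heartbeats(ticks, rounds)
  statuses = []
  rounds.times do
    sleep ticks
    # The child runs the heartbeat block and exits immediately with the
    # block's integer result; its memory is released when it exits.
    kid = fork { exit!(yield.to_i) }
    # The parent only waits and records the status, so it stays small.
    _, status = Process.waitpid2(kid)
    statuses << status.exitstatus
  end
  statuses
end
```

For example, `forked_heartbeats(60, 3) { 0 }` would wait 60 seconds before each of three child runs. Waiting on each child before sleeping again also means heartbeats never overlap.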
data/lib/auto-consul/runner.rb
CHANGED
@@ -1,19 +1,143 @@
+require 'thread'
+
 module AutoConsul
   module Runner
+    INITIAL_VERIFY_SLEEP = 0.1
     SLEEP_INTERVAL = 2
     RETRIES = 5

-
-
+    class AgentProcess
+      attr_reader :args
+      attr_reader :exit_code
+      attr_reader :pid
+      attr_reader :status
+      attr_reader :thread
+      attr_reader :stop_queue
+      attr_reader :stop_thread

-
-
+      def initialize args
+        @args = args.dup.freeze
+        @callbacks = {}
+      end

-
-
+      def on_up &action
+        register_callback :up, &action
+      end
+
+      def on_down &action
+        register_callback :down, &action
+      end
+
+      def launch!
+        set_status :starting
+        @thread = Thread.new do
+          Thread.current.abort_on_exception = true
+          run_agent
+        end
+      end
+
+      def run_agent
+        handle_signals!
+        @pid = spawn(*(['consul', 'agent'] + args), :pgroup => true)
+        result = Process.waitpid2(@pid)
+        @exit_code = result[1].exitstatus
+        set_status :down
       end

-
+      def handle_signals!
+        if @stop_queue.nil?
+          @stop_queue = Queue.new
+          @stop_thread = Thread.new do
+            while true
+              @stop_queue.pop
+              stop!
+            end
+          end
+          ['INT', 'TERM'].each do |sig|
+            Signal.trap(sig) do
+              @stop_queue << sig
+            end
+          end
+        end
+        nil
+      end
+
+      VALID_VERIFY_STATUSES = [nil, :starting]
+
+      def verify_up!
+        sleep INITIAL_VERIFY_SLEEP
+        tries = 0
+        while VALID_VERIFY_STATUSES.include?(status) and tries < RETRIES
+          sleep SLEEP_INTERVAL ** tries if tries > 0
+          if system('consul', 'info')
+            set_status :up
+          else
+            tries += 1
+          end
+        end
+      end
+
+      def on_stopping &action
+        register_callback :stopping, &action
+      end
+
+      VALID_STOP_STATUSES = [nil, :starting, :up, :stopping]
+      STOP_SIGNAL = "SIGINT"
+
+      def stop!
+        raise "The consul agent is not running (no pid)" if pid.nil?
+        raise "The consul agent is not running (status #{status.to_s})." unless VALID_STOP_STATUSES.include? status
+        set_status :stopping
+        Process.kill STOP_SIGNAL, pid
+      end
+
+      def register_callback on_status, &action
+        (@callbacks[on_status] ||= []) << action
+      end
+
+      def run!
+        launch!
+        verify_up!
+        status
+      end
+
+      def wait
+        if (t = thread).nil?
+          raise "The consul agent has not started within this runner."
+        end
+        t.join
+        exit_code
+      end
+
+      def while_up &action
+        on_up do |obj|
+          thread = Thread.new { action.call obj }
+          obj.on_stopping {|x| thread.kill }
+          obj.on_down {|x| thread.kill }
+        end
+      end
+
+      def run_callbacks on_status
+        if callbacks = @callbacks[on_status]
+          callbacks.each do |callback|
+            callback.call self
+          end
+        end
+      end
+
+      def set_status new_status
+        @status = new_status
+        run_callbacks new_status
+        new_status
+      end
+    end
+
+    def self.joining_runner(agent_args, remote_ip=nil)
+      runner = AgentProcess.new(agent_args)
+      if not remote_ip.nil?
+        runner.on_up {|a| join remote_ip}
+      end
+      runner
     end

     def self.verify_running pid
@@ -33,24 +157,21 @@ module AutoConsul
       hosts[0].data
     end

-    def self.
+    def self.agent_runner identity, bind_ip, expiry, local_state, registry
       remote_ip = pick_joining_host(registry.agents.members(expiry))
-
-
-
-      Process.wait pid
+      joining_runner(['-bind', bind_ip,
+                      '-data-dir', local_state.data_path,
+                      '-node', identity], remote_ip)
     end

-    def self.
+    def self.server_runner identity, bind_ip, expiry, local_state, registry
       members = registry.servers.members(expiry)
       remote_ip = members.size > 0 ? pick_joining_host(members) : nil

       args = ['-bind', bind_ip, '-data-dir', local_state.data_path, '-node', identity, '-server']
       args << '-bootstrap' if members.size < 1

-
-
-      Process.wait pid unless pid.nil?
+      joining_runner(args, remote_ip)
     end
   end
 end
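The `verify_up!` method in the runner diff above polls `consul info` with an exponentially growing pause (`SLEEP_INTERVAL ** tries`) and gives up after `RETRIES` failed probes. The sketch below isolates that retry schedule so it can be run without consul. `wait_until_up`, `probe` and `sleeper` are hypothetical names introduced here, and injecting the probe and the sleep as callables is a choice made for this illustration rather than how the gem is structured.

```ruby
# Hypothetical stand-alone version of the backoff loop; not auto-consul's API.
# probe:   callable returning true once the agent answers
# sleeper: callable that receives the number of seconds to pause
def wait_until_up(probe, sleeper, interval: 2, retries: 5)
  tries = 0
  while tries < retries
    # No pause before the first probe; afterwards interval**tries seconds.
    sleeper.call(interval**tries) if tries > 0
    return true if probe.call
    tries += 1
  end
  false
end
```

With the defaults the pauses are 2, 4, 8 and 16 seconds, which matches the `sleep` expectations in the `verify_up!` specs, so a never-responding agent is abandoned after roughly 30 seconds of waiting.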
data/spec/runner_spec.rb
CHANGED
@@ -1,6 +1,63 @@
 require 'spec-helper'

-shared_examples_for '
+shared_examples_for 'an unstoppable' do
+  before do
+    subject.set_status initial_status
+  end
+
+  before do
+    Process.should_not_receive(:kill)
+    subject.should_not_receive(:set_status)
+  end
+
+  it 'throws an exception' do
+    expect { subject.stop! }.to raise_error(/consul agent is not running/)
+  end
+end
+
+shared_examples_for 'stop signaler' do
+  before do
+    subject.set_status initial_status
+  end
+
+  before do
+    Process.should_receive(:kill).with("SIGINT", pid)
+  end
+
+  describe 'with no stopping callbacks' do
+    before do
+      subject.should_not_receive(:stopping_a!)
+      subject.should_not_receive(:stopping_b!)
+    end
+
+    it 'signals the agent process to stop' do
+      subject.stop!
+    end
+  end
+
+  describe 'with stopping callbacks' do
+    before do
+      subject.on_stopping do |o|
+        subject.stopping_a! o
+        expect(subject.status).to eq(:stopping)
+      end
+
+      subject.on_stopping do |o|
+        subject.stopping_b! o
+        expect(subject.status).to eq(:stopping)
+      end
+
+      subject.should_receive(:stopping_a!).with(subject)
+      subject.should_receive(:stopping_b!).with(subject)
+    end
+
+    it 'invokes the callbacks and signals the agent process to stop' do
+      subject.stop!
+    end
+  end
+end
+
+shared_examples_for 'a consul agent process runner' do |method_name, registry_name, join_flag, args|
   it 'properly launches consul agent' do
     members = []
     members << member if join_flag
@@ -8,7 +65,7 @@ shared_examples_for 'a consul agent run' do |method_name, registry_name, join_fl
     registry.should_receive(registry_name).with.and_return(reg = double)
     reg.should_receive(:members).with(expiry).and_return(members)

-    expected_args = (['
+    expected_args = (['-bind', ip, '-data-dir', data_dir, '-node', identity] + args).collect do |e|
       if e.instance_of? Symbol
         send e
       else
@@ -16,23 +73,350 @@ shared_examples_for 'a consul agent run' do |method_name, registry_name, join_fl
       end
     end

-
+    runner = double("AgentProcess")
+    expect(AutoConsul::Runner::AgentProcess).to receive(:new).with(expected_args).and_return { runner }

-
-
+    if join_flag
+      expect(AutoConsul::Runner).to receive(:system).with('consul', 'join', remote_ip).and_return(true)
+      expect(runner).to receive(:on_up) do |&action|
+        # The callback mechanism is how we join the cluster.
+        action.call
+      end
+    else
+      expect(AutoConsul::Runner).to_not receive(:system)
+      expect(runner).to_not receive(:on_up)
+    end

-    AutoConsul::Runner.
-
-
+    callable = AutoConsul::Runner.method(method_name)
+    expect(callable.call(identity, ip, expiry, local_state, registry)).to be(runner)
+  end
+end

-
-
+describe AutoConsul::Runner::AgentProcess do
+  let(:args) do
+    (1..3).collect do |i|
+      double("MockParam#{i.to_s}").to_s
     end
+  end

-
+  subject { AutoConsul::Runner::AgentProcess.new args }

-
-
+  describe "#handle_signals!" do
+    let(:stop_thread) { double('StopThread', :block => Proc.new { raise 'Loop breakout bogus' }) }
+    let(:stop_queue) { [] }
+
+    before do
+      count = 0
+      # This is necessary to capture the block given to the thread,
+      # and to ensure it only happens once.
+      expect(Thread).to receive(:new) do |&block|
+        expect(count).to eq(0)
+        count += 1
+        stop_thread.stub(:block).and_return block
+      end
+      expect(Queue).to receive(:new).with.once.and_return(stop_queue)
+    end
+
+    it 'should register a queuing signal handler for SIGINT, SIGTERM' do
+      seen = {}
+      expect(Signal).to receive(:trap).exactly(2).times do |sig, &block|
+        seen[sig] = true
+        len = stop_queue.size
+        block.call
+        expect(stop_queue.size).to eq(len + 1)
+        expect(stop_queue[-1]).to eq(sig)
+      end
+      subject.handle_signals!
+      expect(seen).to eq({"INT" => true, "TERM" => true})
+    end
+
+    it 'should issue stop per queue member' do
+      Signal.stub(:trap)
+      subject.handle_signals!
+      sigs = ['INT', 'TERM', 'INT']
+      expect(stop_queue).to receive(:pop).exactly(sigs.size + 1).times do
+        raise 'Loop breakout' unless sigs.size > 0
+        sigs.pop
+      end
+      expect(subject).to receive(:stop!).with.exactly(3).times
+      expect { stop_thread.block.call }.to raise_exception(/Loop breakout/)
+    end
+
+    it 'should only configure signals once' do
+      expect(Signal).to receive(:trap).exactly(2).times
+      subject.handle_signals!
+      subject.handle_signals!
+    end
+  end
+
+  describe "launch! method" do
+    let(:thread) { double('AgentThread') }
+    let(:pid) { double('AgentPid') }
+    let(:exit_code) { 2.to_i }
+
+    before do
+      # We'll have the AgentProcess take over signal handling, since it's
+      # bound up in the life cycle of the process.
+      subject.should_receive(:handle_signals!).with.once
+
+      # This sucks, but what are you gonna do? It needs to be in a separate
+      # thread so the waitpid2 call doesn't block the main process.
+      subject.should_receive(:spawn).with(*(['consul', 'agent'] + args), :pgroup => true).and_return(pid)
+      process_status = double('ProcessStatus', :pid => pid,
+                              :exitstatus => exit_code)
+      Process.should_receive(:waitpid2).with(pid).and_return([pid, process_status])
+      Thread.should_receive(:new) do |&block|
+        # The status should be :starting before invoking the thread.
+        expect(subject.status).to eq(:starting)
+        # Make sure we set abort_on_exception on the thread.
+        Thread.should_receive(:current).with.and_return(thread)
+        thread.should_receive(:abort_on_exception=).with(true)
+        block.call
+        # The block is what moves it to a status of :down.
+        # And sets the pid and exit code.
+        expect(subject.status).to eq(:down)
+        expect(subject.pid).to eq(pid)
+        expect(subject.exit_code).to eq(exit_code)
+        thread
+      end
+    end
+
+    it 'should invoke "agent consul" with given args and wait on the result' do
+      expect(subject.thread).to be_nil
+      expect(subject.pid).to be_nil
+      expect(subject.status).to be_nil
+      expect(subject.exit_code).to be_nil
+      subject.launch!
+      expect(subject.thread).to be(thread)
+    end
+
+    it 'should invoke "agent consul" and run callbacks after going down.' do
+      subject.on_down do |x|
+        x.down_a!
+      end
+
+      subject.on_down do |x|
+        x.down_b!
+      end
+
+      subject.should_receive(:down_a!).with
+      subject.should_receive(:down_b!).with
+
+      subject.launch!
+    end
+
+    # it should blow up if called with status other than nil, :down.
+  end
+
+  describe "verify_up! method" do
+    describe 'when check succeeds' do
+      before do
+        subject.should_receive(:sleep).with(0.1)
+        subject.should_receive(:system).with('consul', 'info').and_return(false, false, false, true)
+        subject.should_receive(:sleep).with(2)
+        subject.should_receive(:sleep).with(4)
+        subject.should_receive(:sleep).with(8)
+      end
+
+      it 'sets status to :up' do
+        subject.verify_up!
+        expect(subject.status).to eq(:up)
+      end
+
+      describe 'with callbacks' do
+        before do
+          subject.on_up do |obj|
+            subject.callback_a! obj
+            expect(subject.status).to eq(:up)
+          end
+
+          subject.on_up do |obj|
+            subject.callback_b! obj
+            expect(subject.status).to eq(:up)
+          end
+
+          subject.should_receive(:callback_a!).with(subject)
+          subject.should_receive(:callback_b!).with(subject)
+        end
+
+        it 'invokes up callbacks with itself as parameter' do
+          subject.verify_up!
+        end
+      end
+    end
+
+    describe 'when check fails' do
+      before do
+        subject.should_receive(:sleep).with(0.1)
+        subject.should_receive(:system).with('consul', 'info').and_return(false, false, false, false, false)
+        subject.should_receive(:sleep).with(2)
+        subject.should_receive(:sleep).with(4)
+        subject.should_receive(:sleep).with(8)
+        subject.should_receive(:sleep).with(16)
+      end
+
+      it 'leaves the status alone' do
+        subject.should_not_receive(:set_status)
+        subject.verify_up!
+        expect(subject.status).to be_nil
+      end
+
+      describe 'with callbacks' do
+        before do
+          subject.on_up do |obj|
+            subject.callback_a! obj
+          end
+
+          subject.on_up do |obj|
+            subject.callback_b! obj
+          end
+
+          subject.should_not_receive(:callback_a!)
+          subject.should_not_receive(:callback_b!)
+        end
+
+        it 'does not invoke callbacks at all' do
+          subject.verify_up!
+        end
+      end
+
+      # it should blow up if called with status other than :starting, :up
+    end
+  end
+
+  describe 'stop! method' do
+    describe 'with a pid' do
+      let(:pid) { double("AgentPid") }
+      before { subject.stub(:pid).and_return(pid) }
+
+      describe 'in nil status' do
+        # Need an expression so compiler doesn't ignore the block
+        let(:initial_status) { nil && true }
+        it_behaves_like 'stop signaler'
+      end
+
+      describe 'in :starting status' do
+        let(:initial_status) { :starting.to_sym }
+        it_behaves_like 'stop signaler'
+      end
+
+      describe 'in :up status' do
+        let(:initial_status) { :up.to_sym }
+        it_behaves_like 'stop signaler'
+      end
+
+      describe 'in :stopping status' do
+        let(:initial_status) { :stopping.to_sym }
+        it_behaves_like 'stop signaler'
+      end
+
+      describe 'in :down status' do
+        let(:initial_status) { :down.to_sym }
+        it_behaves_like 'an unstoppable'
+      end
+    end
+
+    describe 'with no pid' do
+      before do
+        subject.stub(:pid).and_return(nil)
+      end
+
+      describe 'in nil status' do
+        # Need an expression so compiler doesn't ignore the block
+        let(:initial_status) { nil && true }
+        it_behaves_like 'an unstoppable'
+      end
+
+      describe 'in :starting status' do
+        let(:initial_status) { :starting.to_sym }
+        it_behaves_like 'an unstoppable'
+      end
+
+      describe 'in :up status' do
+        let(:initial_status) { :up.to_sym }
+        it_behaves_like 'an unstoppable'
+      end
+
+      describe 'in :stopping status' do
+        let(:initial_status) { :stopping.to_sym }
+        it_behaves_like 'an unstoppable'
+      end
+
+      describe 'in :down status' do
+        let(:initial_status) { :down.to_sym }
+        it_behaves_like 'an unstoppable'
+      end
+    end
+  end
+
+  describe ':run! method' do
+    it 'launches, then verifies up, and returns status' do
+      status = double('Status')
+      expect(subject).to receive(:launch!).with.ordered
+      expect(subject).to receive(:verify_up!).with.ordered
+      expect(subject).to receive(:status).with.ordered { status }
+      expect(subject.run!).to be(status)
+    end
+  end
+
+  describe ':wait method' do
+    it 'waits on the agent runner thread and returns the exit code' do
+      thread = double('Thread')
+      exit_code = double('ExitCode')
+      expect(subject).to receive(:thread).with.and_return { thread }
+      expect(thread).to receive(:join).with.and_return { thread }
+      expect(subject).to receive(:exit_code).with.and_return { exit_code }
+      expect(subject.wait).to be(exit_code)
+    end
+
+    it 'blows up if no thread is present' do
+      expect(subject).to receive(:thread).with.and_return { nil }
+      expect { subject.wait }.to raise_exception(/consul agent has not started/)
+    end
+  end
+
+  describe '#while_up method' do
+    let(:thread) { double('Thread') }
+    # Use this as the while_up block; it will verify that the block is invoked with
+    # the AgentProcess instance as sole parameter.
+    let(:action) { Proc.new {|o| expect(o).to be(subject)} }
+
+    describe 'when brought up' do
+      before do
+        # This "brings it up."
+        expect(subject).to receive(:on_up).and_yield(subject)
+
+        # And this happens in the on_up callback
+        expect(Thread).to receive(:new) do |&blk|
+          blk.call
+          thread
+        end
+      end
+
+      it 'registers an on_stopping that kills the thread for the given block' do
+        expect(thread).to receive(:kill).with
+        expect(subject).to receive(:on_stopping).and_yield(subject)
+        subject.while_up &action
+      end
+
+      it 'registers an on_down that kills the thread for the given block' do
+        expect(thread).to receive(:kill).with
+        expect(subject).to receive(:on_down).and_yield(subject)
+        subject.while_up &action
+      end
+    end
+
+    describe 'when never brought up' do
+      it 'never registers on_stopping or on_down handlers' do
+        expect(Thread).to_not receive(:new)
+        expect(thread).to_not receive(:kill)
+        expect(subject).to_not receive(:on_stopping)
+        expect(subject).to_not receive(:on_down)
+        # We receive it but we're not running anything.
+        expect(subject).to receive(:on_up)
+        subject.while_up &action
+      end
+    end
   end
 end

@@ -55,21 +439,21 @@ describe AutoConsul::Runner do
     registry.servers.stub(:servers).with(expiry).and_return(servers_list)
   end

-  describe :
-    it_behaves_like 'a consul agent
+  describe :agent_runner do
+    it_behaves_like 'a consul agent process runner', :agent_runner, :agents, true, []
   end

-  describe :
+  describe :server_runner do
     describe 'with empty server registry' do
       # consul agent -bind 192.168.50.100 -data-dir /opt/consul/server/data -node vagrant-server -server -bootstrap
-      it_behaves_like 'a consul agent
+      it_behaves_like 'a consul agent process runner', :server_runner, :servers, false, ['-server', '-bootstrap']
     end

     describe 'with other servers in registry' do
       # consul agent -bind 192.168.50.100 -data-dir /opt/consul/server/data -node vagrant-server -server
       # consul join some_ip

-      it_behaves_like 'a consul agent
+      it_behaves_like 'a consul agent process runner', :server_runner, :servers, true, ['-server']
     end
   end
 end
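The specs above rely on one contract from `AgentProcess`: every status change runs the callbacks registered for that status, in registration order, passing the object itself. A stripped-down sketch of that contract is shown below. `StatusTracker` and its `on` method are hypothetical names for this illustration; the gem itself exposes `on_up`, `on_down` and `on_stopping` on `AgentProcess`.

```ruby
# Hypothetical miniature of the status/callback contract; not auto-consul's API.
class StatusTracker
  attr_reader :status

  def initialize
    @callbacks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a block to run whenever the given status is entered.
  def on(status, &action)
    @callbacks[status] << action
    self
  end

  # Record the new status first, then run its callbacks with self,
  # so callbacks observe the status they were registered for.
  def set_status(new_status)
    @status = new_status
    @callbacks.fetch(new_status, []).each { |callback| callback.call(self) }
    new_status
  end
end
```

Setting the status before invoking callbacks is what lets the specs assert `expect(subject.status).to eq(:up)` from inside an `on_up` block.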