auto-consul 0.0.1 → 0.0.2
- data/README.md +44 -32
- data/bin/auto-consul +37 -6
- data/lib/auto-consul/runner.rb +137 -16
- data/spec/runner_spec.rb +402 -18
- metadata +1 -1
data/README.md
CHANGED
@@ -14,32 +14,29 @@ Export your AWS keys into the environment in each:
|
|
14
14
|
export AWS_SECRET_ACCESS_KEY=...
|
15
15
|
```
|
16
16
|
|
17
|
-
This will allow the AWS SDK to pick them up.
|
17
|
+
This will allow the AWS SDK to pick them up. (Note that on an EC2
|
18
|
+
instance, the AWS SDK should seamlessly pick up any IAM roles associated
|
19
|
+
with the instance, so these environment variables should not be
|
20
|
+
necessary.)
|
18
21
|
|
19
|
-
|
22
|
+
Then, run the agent via auto-consul:
|
20
23
|
|
21
24
|
auto-consul -r s3://my-bucket/consul/test-cluster \
|
22
25
|
-a 192.168.50.100 \
|
23
|
-
-n
|
26
|
+
-n agent1 \
|
27
|
+
-t 60 \
|
24
28
|
run
|
25
29
|
|
26
|
-
|
30
|
+
This will launch the consul agent, and emit heartbeats roughly every
|
31
|
+
minute. The `-r` indicates the bucket to use as the heartbeat registry.
|
32
|
+
The `-t` specifies the interval in seconds between heartbeats. The
|
33
|
+
`-a` and `-n` options map to consul's native `-bind` and `-node` options,
|
34
|
+
respectively.
|
27
35
|
|
28
|
-
|
29
|
-
|
30
|
-
|
31
|
-
|
32
|
-
heartbeat
|
33
|
-
sleep 60
|
34
|
-
done
|
35
|
-
|
36
|
-
The first launches the agent, the latter checks its run status and
|
37
|
-
issues a heartbeat to the specified S3 bucket.
|
38
|
-
|
39
|
-
Because this is the first server, there will be no heartbeats in the
|
40
|
-
bucket (assuming a fresh bucket/key combination). Therefore, the agent
|
41
|
-
will be launched in server mode, along with the bootstrap option to
|
42
|
-
initialize the raft cluster for state management.
|
36
|
+
Because this is the first server, there will be no extant heartbeats in the
|
37
|
+
bucket (assuming a fresh bucket/key combination) at startup. Therefore,
|
38
|
+
the agent will be launched in server mode, along with the bootstrap option
|
39
|
+
to initialize the raft cluster for state management.
|
43
40
|
|
44
41
|
Look in the S3 bucket above, under "servers", and you should see
|
45
42
|
a timestamped entry like "20140516092731-server1". This is produced
|
@@ -47,11 +44,12 @@ by the "heartbeat" command and allows new agents to discover active
|
|
47
44
|
members of the cluster for joining.
|
48
45
|
|
49
46
|
Having seen the server heartbeat, go to the agent vagrant box, and
|
50
|
-
do something similar.
|
47
|
+
do something similar.
|
51
48
|
|
52
49
|
auto-consul -r s3://my-bucket/consul/test-cluster \
|
53
50
|
-a 192.168.50.101 \
|
54
51
|
-n agent1 \
|
52
|
+
-t 60 \
|
55
53
|
run
|
56
54
|
|
57
55
|
In this case, the agent will discover the server via its heartbeat. It
|
@@ -59,19 +57,38 @@ will know that we have enough servers (it defaults to only wanting one;
|
|
59
57
|
that's fine for dev/testing but not good for availability) and thus
|
60
58
|
simply join as a normal agent.
|
61
59
|
|
62
|
-
|
60
|
+
While the server sends heartbeats both to "servers" and "agents" in the
|
61
|
+
bucket, the normal agent sends heartbeats only to "agents".
|
62
|
+
|
63
|
+
## Alternative - separate heartbeat
|
64
|
+
|
65
|
+
The auto-consul runner does not have to issue heartbeats itself; those
|
66
|
+
can be left out entirely (but please understand this means automatic
|
67
|
+
mode determination won't work), or run in a separate process with timing
|
68
|
+
under more precise control.
|
69
|
+
|
70
|
+
For example, in our vagrant both, the following commands are equivalent
|
71
|
+
to the original server run.
|
72
|
+
|
73
|
+
Server screen A:
|
74
|
+
|
75
|
+
auto-consul -r s3://my-bucket/consul/test-cluster \
|
76
|
+
-a 192.168.50.100 \
|
77
|
+
-n server1 \
|
78
|
+
run
|
79
|
+
|
80
|
+
Then, server screen B:
|
63
81
|
|
64
82
|
while true; do
|
65
83
|
auto-consul -r s3://my-bucket/consul/test-cluster \
|
66
|
-
-a 192.168.50.
|
67
|
-
-n
|
84
|
+
-a 192.168.50.100 \
|
85
|
+
-n server1 \
|
68
86
|
heartbeat
|
69
87
|
sleep 60
|
70
88
|
done
|
71
89
|
|
72
|
-
|
73
|
-
|
74
|
-
normal agent sends heartbeats only to "agents".
|
90
|
+
The first launches the agent, the latter checks its run status and
|
91
|
+
issues a heartbeat to the specified S3 bucket.
|
75
92
|
|
76
93
|
# Mode determination
|
77
94
|
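Note on mode determination: the README text in this diff says a node joins as a normal agent once there are enough servers (the desired count defaults to one), and `server_runner` adds `-bootstrap` only when no server heartbeats are found. A minimal sketch of that decision follows. `choose_mode` and `bootstrap?` are hypothetical names for illustration, and the "fewer active servers than desired means server" comparison is an assumption. The gem's actual logic lives in `cluster.set_mode!`, which this diff does not show.

```ruby
# Hypothetical helpers; not part of auto-consul.

# Assumption: a node becomes a server while the cluster has fewer active
# servers than desired, and a normal agent otherwise.
def choose_mode(active_servers, desired_servers = 1)
  active_servers < desired_servers ? :server : :agent
end

# Mirrors `args << '-bootstrap' if members.size < 1` in server_runner.
def bootstrap?(active_servers)
  active_servers < 1
end
```

With the default of one desired server, the first node would pick `:server` with bootstrap, and every later node would pick `:agent`.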
|
@@ -103,17 +120,12 @@ Each heartbeat tells us:
|
|
103
120
|
- The timestamp of the heartbeat (the freshness)
|
104
121
|
- The IP at which the node can be reached for cluster join operations.
|
105
122
|
|
106
|
-
For now, it is necessary to run the heartbeat utility in parallel to the
|
107
|
-
run utility. In subsequent work we may want to have these things coordinated
|
108
|
-
by one daemon, but given the experimental nature of this project it's not
|
109
|
-
worth caring about just yet.
|
110
|
-
|
111
123
|
The heartbeat asks consul for its status and from that determines if it
|
112
124
|
is running as a server or regular agent (or if it is running at all). If
|
113
125
|
consul is not running at all, no heartbeat will be emitted.
|
114
126
|
|
115
127
|
The default expiry is 120 seconds. It is recommended that heartbeats fire
|
116
|
-
at half that duration (60 seconds).
|
128
|
+
at half that duration (60 seconds) or less.
|
117
129
|
|
118
130
|
# Cluster join
|
119
131
|
|
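Note on heartbeat freshness: the README describes registry entries named like "20140516092731-server1" and a default expiry of 120 seconds. The sketch below shows how such an entry could be checked for freshness. It assumes the prefix is a UTC timestamp in year-month-day-hour-minute-second order; `parse_heartbeat` and `fresh?` are hypothetical names, not functions from the gem.

```ruby
# Hypothetical helpers; not part of auto-consul.

# Split "20140516092731-server1" into a UTC Time and the node name.
# Only the first hyphen separates the two, so node names may contain hyphens.
def parse_heartbeat(entry)
  stamp, node = entry.split('-', 2)
  time = Time.utc(stamp[0, 4].to_i, stamp[4, 2].to_i, stamp[6, 2].to_i,
                  stamp[8, 2].to_i, stamp[10, 2].to_i, stamp[12, 2].to_i)
  [time, node]
end

# A heartbeat counts as fresh while its age is within the expiry (seconds).
def fresh?(entry, now, expiry = 120)
  time, _node = parse_heartbeat(entry)
  (now - time) <= expiry
end
```

Emitting heartbeats every 60 seconds against a 120 second expiry leaves room for one missed beat before a node is treated as gone, which is presumably why the README recommends half the expiry or less.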
data/bin/auto-consul
CHANGED
@@ -12,7 +12,7 @@ end
|
|
12
12
|
|
13
13
|
class Command < OpenStruct
|
14
14
|
def local
|
15
|
-
@local ||= AutoConsul::Local.bind_to_path(
|
15
|
+
@local ||= AutoConsul::Local.bind_to_path(data_dir)
|
16
16
|
end
|
17
17
|
|
18
18
|
def cluster
|
@@ -25,6 +25,8 @@ class Command < OpenStruct
|
|
25
25
|
|
26
26
|
def do_set_mode
|
27
27
|
cluster.set_mode! local, expiry, servers
|
28
|
+
# Healthy exit
|
29
|
+
0
|
28
30
|
end
|
29
31
|
|
30
32
|
def do_run
|
@@ -36,16 +38,37 @@ class Command < OpenStruct
|
|
36
38
|
|
37
39
|
def do_direct_run
|
38
40
|
runner = :no_op
|
39
|
-
runner = :
|
40
|
-
runner = :
|
41
|
+
runner = :agent_runner if local.agent?
|
42
|
+
runner = :server_runner if local.server?
|
41
43
|
runner = AutoConsul::Runner.method(runner)
|
42
|
-
runner.call
|
44
|
+
monitor = runner.call(node, addr, expiry, local, cluster)
|
45
|
+
if with_heartbeat and ticks > 0
|
46
|
+
monitor.while_up do |ap|
|
47
|
+
# There's data exchange with the agent and with the registries
|
48
|
+
# per heartbeat, which with Ruby's memory management could cause
|
49
|
+
# process bloat over a long time. So fork per heartbeat so we
|
50
|
+
# can avoid the bloat in the long-running process.
|
51
|
+
while true
|
52
|
+
sleep ticks
|
53
|
+
kid = fork { do_heartbeat }
|
54
|
+
Process.waitpid kid
|
55
|
+
end
|
56
|
+
end
|
57
|
+
end
|
58
|
+
monitor.run!
|
59
|
+
# Returns the exit status of the "consul agent" run.
|
60
|
+
monitor.wait
|
43
61
|
end
|
44
62
|
|
45
63
|
def do_heartbeat
|
46
64
|
if state.running?
|
47
65
|
cluster.servers.heartbeat! node, addr, expiry if state.server?
|
48
66
|
cluster.agents.heartbeat! node, addr, expiry if state.agent?
|
67
|
+
# Healthy exit.
|
68
|
+
0
|
69
|
+
else
|
70
|
+
# Not running; can't heartbeat.
|
71
|
+
1
|
49
72
|
end
|
50
73
|
end
|
51
74
|
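Note on the fork-per-heartbeat loop: the comment in `do_direct_run` explains that each heartbeat runs in a forked child so that any memory growth is discarded when the child exits. The sketch below isolates that pattern. `forked_heartbeats` is a hypothetical name, and it runs a bounded number of beats rather than looping forever. Unlike the code in the diff, it also passes the block's return value back as the child's exit status, which is an addition for illustration only.

```ruby
# Hypothetical helper; not part of auto-consul. Requires a platform with fork.

# Run `beats` heartbeats, each in its own short-lived child process,
# and return the children's exit statuses in order.
def forked_heartbeats(ticks, beats, &heartbeat)
  statuses = []
  beats.times do
    sleep ticks
    # exit! ends the child immediately, without running at_exit handlers.
    kid = fork { exit!(heartbeat.call.to_i) }
    _pid, status = Process.waitpid2(kid)
    statuses << status.exitstatus
  end
  statuses
end
```

Because the parent waits on each child before sleeping again, the effective interval is `ticks` plus the time one heartbeat takes, which fits the README's description of heartbeats arriving "roughly every minute".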
|
@@ -59,10 +82,12 @@ class Command < OpenStruct
|
|
59
82
|
end
|
60
83
|
end
|
61
84
|
|
62
|
-
runner = Command.new(:
|
85
|
+
runner = Command.new(:data_dir => '/tmp/consul/state',
|
63
86
|
:dc => 'dc1',
|
87
|
+
:with_heartbeat => false,
|
64
88
|
:expiry => 120,
|
65
89
|
:servers => 1,
|
90
|
+
:ticks => 60,
|
66
91
|
:node => Socket.gethostname.split('.', 2)[0])
|
67
92
|
|
68
93
|
parser = OptionParser.new do |opts|
|
@@ -88,6 +113,11 @@ parser = OptionParser.new do |opts|
|
|
88
113
|
runner.expiry = e.to_i
|
89
114
|
end
|
90
115
|
|
116
|
+
opts.on("-t", "--ticks SECONDS", Integer, "The time between heartbeats (in seconds) for registry heartbeats; use of this activates a concurrent heartbeat thread for the 'run' command.") do |t|
|
117
|
+
runner.ticks = t.to_i
|
118
|
+
runner.with_heartbeat = true
|
119
|
+
end
|
120
|
+
|
91
121
|
opts.on("-s", "--servers NUMBER", Integer, "The desired number of consul servers.") do |s|
|
92
122
|
runner.servers = s.to_i
|
93
123
|
end
|
@@ -101,7 +131,8 @@ end
|
|
101
131
|
parser.parse!
|
102
132
|
|
103
133
|
begin
|
104
|
-
runner.execute(ARGV.shift)
|
134
|
+
status = runner.execute(ARGV.shift)
|
135
|
+
exit status
|
105
136
|
rescue UnknownCommandException => e
|
106
137
|
puts e.message
|
107
138
|
puts parser
|
data/lib/auto-consul/runner.rb
CHANGED
@@ -1,19 +1,143 @@
|
|
1
|
+
require 'thread'
|
2
|
+
|
1
3
|
module AutoConsul
|
2
4
|
module Runner
|
5
|
+
INITIAL_VERIFY_SLEEP = 0.1
|
3
6
|
SLEEP_INTERVAL = 2
|
4
7
|
RETRIES = 5
|
5
8
|
|
6
|
-
|
7
|
-
|
9
|
+
class AgentProcess
|
10
|
+
attr_reader :args
|
11
|
+
attr_reader :exit_code
|
12
|
+
attr_reader :pid
|
13
|
+
attr_reader :status
|
14
|
+
attr_reader :thread
|
15
|
+
attr_reader :stop_queue
|
16
|
+
attr_reader :stop_thread
|
8
17
|
|
9
|
-
|
10
|
-
|
18
|
+
def initialize args
|
19
|
+
@args = args.dup.freeze
|
20
|
+
@callbacks = {}
|
21
|
+
end
|
11
22
|
|
12
|
-
|
13
|
-
|
23
|
+
def on_up &action
|
24
|
+
register_callback :up, &action
|
25
|
+
end
|
26
|
+
|
27
|
+
def on_down &action
|
28
|
+
register_callback :down, &action
|
29
|
+
end
|
30
|
+
|
31
|
+
def launch!
|
32
|
+
set_status :starting
|
33
|
+
@thread = Thread.new do
|
34
|
+
Thread.current.abort_on_exception = true
|
35
|
+
run_agent
|
36
|
+
end
|
37
|
+
end
|
38
|
+
|
39
|
+
def run_agent
|
40
|
+
handle_signals!
|
41
|
+
@pid = spawn(*(['consul', 'agent'] + args), :pgroup => true)
|
42
|
+
result = Process.waitpid2(@pid)
|
43
|
+
@exit_code = result[1].exitstatus
|
44
|
+
set_status :down
|
14
45
|
end
|
15
46
|
|
16
|
-
|
47
|
+
def handle_signals!
|
48
|
+
if @stop_queue.nil?
|
49
|
+
@stop_queue = Queue.new
|
50
|
+
@stop_thread = Thread.new do
|
51
|
+
while true
|
52
|
+
@stop_queue.pop
|
53
|
+
stop!
|
54
|
+
end
|
55
|
+
end
|
56
|
+
['INT', 'TERM'].each do |sig|
|
57
|
+
Signal.trap(sig) do
|
58
|
+
@stop_queue << sig
|
59
|
+
end
|
60
|
+
end
|
61
|
+
end
|
62
|
+
nil
|
63
|
+
end
|
64
|
+
|
65
|
+
VALID_VERIFY_STATUSES = [nil, :starting]
|
66
|
+
|
67
|
+
def verify_up!
|
68
|
+
sleep INITIAL_VERIFY_SLEEP
|
69
|
+
tries = 0
|
70
|
+
while VALID_VERIFY_STATUSES.include?(status) and tries < RETRIES
|
71
|
+
sleep SLEEP_INTERVAL ** tries if tries > 0
|
72
|
+
if system('consul', 'info')
|
73
|
+
set_status :up
|
74
|
+
else
|
75
|
+
tries += 1
|
76
|
+
end
|
77
|
+
end
|
78
|
+
end
|
79
|
+
|
80
|
+
def on_stopping &action
|
81
|
+
register_callback :stopping, &action
|
82
|
+
end
|
83
|
+
|
84
|
+
VALID_STOP_STATUSES = [nil, :starting, :up, :stopping]
|
85
|
+
STOP_SIGNAL = "SIGINT"
|
86
|
+
|
87
|
+
def stop!
|
88
|
+
raise "The consul agent is not running (no pid)" if pid.nil?
|
89
|
+
raise "The consul agent is not running (status #{status.to_s})." unless VALID_STOP_STATUSES.include? status
|
90
|
+
set_status :stopping
|
91
|
+
Process.kill STOP_SIGNAL, pid
|
92
|
+
end
|
93
|
+
|
94
|
+
def register_callback on_status, &action
|
95
|
+
(@callbacks[on_status] ||= []) << action
|
96
|
+
end
|
97
|
+
|
98
|
+
def run!
|
99
|
+
launch!
|
100
|
+
verify_up!
|
101
|
+
status
|
102
|
+
end
|
103
|
+
|
104
|
+
def wait
|
105
|
+
if (t = thread).nil?
|
106
|
+
raise "The consul agent has not started within this runner."
|
107
|
+
end
|
108
|
+
t.join
|
109
|
+
exit_code
|
110
|
+
end
|
111
|
+
|
112
|
+
def while_up &action
|
113
|
+
on_up do |obj|
|
114
|
+
thread = Thread.new { action.call obj }
|
115
|
+
obj.on_stopping {|x| thread.kill }
|
116
|
+
obj.on_down {|x| thread.kill }
|
117
|
+
end
|
118
|
+
end
|
119
|
+
|
120
|
+
def run_callbacks on_status
|
121
|
+
if callbacks = @callbacks[on_status]
|
122
|
+
callbacks.each do |callback|
|
123
|
+
callback.call self
|
124
|
+
end
|
125
|
+
end
|
126
|
+
end
|
127
|
+
|
128
|
+
def set_status new_status
|
129
|
+
@status = new_status
|
130
|
+
run_callbacks new_status
|
131
|
+
new_status
|
132
|
+
end
|
133
|
+
end
|
134
|
+
|
135
|
+
def self.joining_runner(agent_args, remote_ip=nil)
|
136
|
+
runner = AgentProcess.new(agent_args)
|
137
|
+
if not remote_ip.nil?
|
138
|
+
runner.on_up {|a| join remote_ip}
|
139
|
+
end
|
140
|
+
runner
|
17
141
|
end
|
18
142
|
|
19
143
|
def self.verify_running pid
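
Note on status callbacks: `AgentProcess` stores blocks per status (`on_up`, `on_down`, `on_stopping`) and runs them from `set_status`, passing itself to each block. The sketch below reduces that mechanism to a standalone class. `StatusTracker` and its generic `on` method are hypothetical names for illustration; the gem exposes one registration method per status instead.

```ruby
# Hypothetical class; a reduced illustration of the callback mechanism.
class StatusTracker
  attr_reader :status

  def initialize
    # Each status maps to the list of blocks registered for it.
    @callbacks = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a block to run whenever the tracker enters `status`.
  def on(status, &action)
    @callbacks[status] << action
    self
  end

  # Record the new status first, then run its callbacks in registration
  # order, so each callback observes the status it was registered for.
  def set_status(new_status)
    @status = new_status
    @callbacks[new_status].each { |callback| callback.call(self) }
    new_status
  end
end
```

Setting the status before running the callbacks matches what the specs in this diff check: inside an `on_up` block, `status` is already `:up`.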
|
@@ -33,24 +157,21 @@ module AutoConsul
|
|
33
157
|
hosts[0].data
|
34
158
|
end
|
35
159
|
|
36
|
-
def self.
|
160
|
+
def self.agent_runner identity, bind_ip, expiry, local_state, registry
|
37
161
|
remote_ip = pick_joining_host(registry.agents.members(expiry))
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
Process.wait pid
|
162
|
+
joining_runner(['-bind', bind_ip,
|
163
|
+
'-data-dir', local_state.data_path,
|
164
|
+
'-node', identity], remote_ip)
|
42
165
|
end
|
43
166
|
|
44
|
-
def self.
|
167
|
+
def self.server_runner identity, bind_ip, expiry, local_state, registry
|
45
168
|
members = registry.servers.members(expiry)
|
46
169
|
remote_ip = members.size > 0 ? pick_joining_host(members) : nil
|
47
170
|
|
48
171
|
args = ['-bind', bind_ip, '-data-dir', local_state.data_path, '-node', identity, '-server']
|
49
172
|
args << '-bootstrap' if members.size < 1
|
50
173
|
|
51
|
-
|
52
|
-
|
53
|
-
Process.wait pid unless pid.nil?
|
174
|
+
joining_runner(args, remote_ip)
|
54
175
|
end
|
55
176
|
end
|
56
177
|
end
|
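Note on the retry schedule: `verify_up!` checks immediately, then sleeps `SLEEP_INTERVAL ** tries` before each later attempt, giving waits of 2, 4, 8 and 16 seconds across five checks. The sketch below reproduces that schedule in isolation. `wait_until_up` is a hypothetical name, the injected `sleeper` exists only so the schedule can be observed without waiting, and the initial 0.1 second pause from the diff is left out.

```ruby
# Hypothetical helper; not part of auto-consul.

# Call the block until it returns a truthy value or `retries` checks have
# failed. The first check is immediate; later checks wait base**tries seconds.
def wait_until_up(retries: 5, base: 2, sleeper: method(:sleep))
  tries = 0
  while tries < retries
    sleeper.call(base**tries) if tries > 0
    return true if yield
    tries += 1
  end
  false
end
```

In the gem the check is `system('consul', 'info')`, so a call shaped like `wait_until_up { system('consul', 'info') }` would follow the same timing.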
data/spec/runner_spec.rb
CHANGED
@@ -1,6 +1,63 @@
|
|
1
1
|
require 'spec-helper'
|
2
2
|
|
3
|
-
shared_examples_for '
|
3
|
+
shared_examples_for 'an unstoppable' do
|
4
|
+
before do
|
5
|
+
subject.set_status initial_status
|
6
|
+
end
|
7
|
+
|
8
|
+
before do
|
9
|
+
Process.should_not_receive(:kill)
|
10
|
+
subject.should_not_receive(:set_status)
|
11
|
+
end
|
12
|
+
|
13
|
+
it 'throws an exception' do
|
14
|
+
expect { subject.stop! }.to raise_error(/consul agent is not running/)
|
15
|
+
end
|
16
|
+
end
|
17
|
+
|
18
|
+
shared_examples_for 'stop signaler' do
|
19
|
+
before do
|
20
|
+
subject.set_status initial_status
|
21
|
+
end
|
22
|
+
|
23
|
+
before do
|
24
|
+
Process.should_receive(:kill).with("SIGINT", pid)
|
25
|
+
end
|
26
|
+
|
27
|
+
describe 'with no stopping callbacks' do
|
28
|
+
before do
|
29
|
+
subject.should_not_receive(:stopping_a!)
|
30
|
+
subject.should_not_receive(:stopping_b!)
|
31
|
+
end
|
32
|
+
|
33
|
+
it 'signals the agent process to stop' do
|
34
|
+
subject.stop!
|
35
|
+
end
|
36
|
+
end
|
37
|
+
|
38
|
+
describe 'with stopping callbacks' do
|
39
|
+
before do
|
40
|
+
subject.on_stopping do |o|
|
41
|
+
subject.stopping_a! o
|
42
|
+
expect(subject.status).to eq(:stopping)
|
43
|
+
end
|
44
|
+
|
45
|
+
subject.on_stopping do |o|
|
46
|
+
subject.stopping_b! o
|
47
|
+
expect(subject.status).to eq(:stopping)
|
48
|
+
end
|
49
|
+
|
50
|
+
subject.should_receive(:stopping_a!).with(subject)
|
51
|
+
subject.should_receive(:stopping_b!).with(subject)
|
52
|
+
end
|
53
|
+
|
54
|
+
it 'invokes the callbacks and signals the agent process to stop' do
|
55
|
+
subject.stop!
|
56
|
+
end
|
57
|
+
end
|
58
|
+
end
|
59
|
+
|
60
|
+
shared_examples_for 'a consul agent process runner' do |method_name, registry_name, join_flag, args|
|
4
61
|
it 'properly launches consul agent' do
|
5
62
|
members = []
|
6
63
|
members << member if join_flag
|
@@ -8,7 +65,7 @@ shared_examples_for 'a consul agent run' do |method_name, registry_name, join_fl
|
|
8
65
|
registry.should_receive(registry_name).with.and_return(reg = double)
|
9
66
|
reg.should_receive(:members).with(expiry).and_return(members)
|
10
67
|
|
11
|
-
expected_args = (['
|
68
|
+
expected_args = (['-bind', ip, '-data-dir', data_dir, '-node', identity] + args).collect do |e|
|
12
69
|
if e.instance_of? Symbol
|
13
70
|
send e
|
14
71
|
else
|
@@ -16,23 +73,350 @@ shared_examples_for 'a consul agent run' do |method_name, registry_name, join_fl
|
|
16
73
|
end
|
17
74
|
end
|
18
75
|
|
19
|
-
|
76
|
+
runner = double("AgentProcess")
|
77
|
+
expect(AutoConsul::Runner::AgentProcess).to receive(:new).with(expected_args).and_return { runner }
|
20
78
|
|
21
|
-
|
22
|
-
|
79
|
+
if join_flag
|
80
|
+
expect(AutoConsul::Runner).to receive(:system).with('consul', 'join', remote_ip).and_return(true)
|
81
|
+
expect(runner).to receive(:on_up) do |&action|
|
82
|
+
# The callback mechanism is how we join the cluster.
|
83
|
+
action.call
|
84
|
+
end
|
85
|
+
else
|
86
|
+
expect(AutoConsul::Runner).to_not receive(:system)
|
87
|
+
expect(runner).to_not receive(:on_up)
|
88
|
+
end
|
23
89
|
|
24
|
-
AutoConsul::Runner.
|
25
|
-
|
26
|
-
|
90
|
+
callable = AutoConsul::Runner.method(method_name)
|
91
|
+
expect(callable.call(identity, ip, expiry, local_state, registry)).to be(runner)
|
92
|
+
end
|
93
|
+
end
|
27
94
|
|
28
|
-
|
29
|
-
|
95
|
+
describe AutoConsul::Runner::AgentProcess do
|
96
|
+
let(:args) do
|
97
|
+
(1..3).collect do |i|
|
98
|
+
double("MockParam#{i.to_s}").to_s
|
30
99
|
end
|
100
|
+
end
|
31
101
|
|
32
|
-
|
102
|
+
subject { AutoConsul::Runner::AgentProcess.new args }
|
33
103
|
|
34
|
-
|
35
|
-
|
104
|
+
describe "#handle_signals!" do
|
105
|
+
let(:stop_thread) { double('StopThread', :block => Proc.new { raise 'Loop breakout bogus' }) }
|
106
|
+
let(:stop_queue) { [] }
|
107
|
+
|
108
|
+
before do
|
109
|
+
count = 0
|
110
|
+
# This is necessary to capture the block given to the thread,
|
111
|
+
# and to ensure it only happens once.
|
112
|
+
expect(Thread).to receive(:new) do |&block|
|
113
|
+
expect(count).to eq(0)
|
114
|
+
count += 1
|
115
|
+
stop_thread.stub(:block).and_return block
|
116
|
+
end
|
117
|
+
expect(Queue).to receive(:new).with.once.and_return(stop_queue)
|
118
|
+
end
|
119
|
+
|
120
|
+
it 'should register a queuing signal handler for SIGINT, SIGTERM' do
|
121
|
+
seen = {}
|
122
|
+
expect(Signal).to receive(:trap).exactly(2).times do |sig, &block|
|
123
|
+
seen[sig] = true
|
124
|
+
len = stop_queue.size
|
125
|
+
block.call
|
126
|
+
expect(stop_queue.size).to eq(len + 1)
|
127
|
+
expect(stop_queue[-1]).to eq(sig)
|
128
|
+
end
|
129
|
+
subject.handle_signals!
|
130
|
+
expect(seen).to eq({"INT" => true, "TERM" => true})
|
131
|
+
end
|
132
|
+
|
133
|
+
it 'should issue stop per queue member' do
|
134
|
+
Signal.stub(:trap)
|
135
|
+
subject.handle_signals!
|
136
|
+
sigs = ['INT', 'TERM', 'INT']
|
137
|
+
expect(stop_queue).to receive(:pop).exactly(sigs.size + 1).times do
|
138
|
+
raise 'Loop breakout' unless sigs.size > 0
|
139
|
+
sigs.pop
|
140
|
+
end
|
141
|
+
expect(subject).to receive(:stop!).with.exactly(3).times
|
142
|
+
expect { stop_thread.block.call }.to raise_exception(/Loop breakout/)
|
143
|
+
end
|
144
|
+
|
145
|
+
it 'should only configure signals once' do
|
146
|
+
expect(Signal).to receive(:trap).exactly(2).times
|
147
|
+
subject.handle_signals!
|
148
|
+
subject.handle_signals!
|
149
|
+
end
|
150
|
+
end
|
151
|
+
|
152
|
+
describe "launch! method" do
|
153
|
+
let(:thread) { double('AgentThread') }
|
154
|
+
let(:pid) { double('AgentPid') }
|
155
|
+
let(:exit_code) { 2.to_i }
|
156
|
+
|
157
|
+
before do
|
158
|
+
# We'll have the AgentProcess take over signal handling, since it's
|
159
|
+
# bound up in the life cycle of the process.
|
160
|
+
subject.should_receive(:handle_signals!).with.once
|
161
|
+
|
162
|
+
# This sucks, but what are you gonna do? It needs to be in a separate
|
163
|
+
# thread so the waitpid2 call doesn't block the main process.
|
164
|
+
subject.should_receive(:spawn).with(*(['consul', 'agent'] + args), :pgroup => true).and_return(pid)
|
165
|
+
process_status = double('ProcessStatus', :pid => pid,
|
166
|
+
:exitstatus => exit_code)
|
167
|
+
Process.should_receive(:waitpid2).with(pid).and_return([pid, process_status])
|
168
|
+
Thread.should_receive(:new) do |&block|
|
169
|
+
# The status should be :starting before invoking the thread.
|
170
|
+
expect(subject.status).to eq(:starting)
|
171
|
+
# Make sure we set abort_on_exception on the thread.
|
172
|
+
Thread.should_receive(:current).with.and_return(thread)
|
173
|
+
thread.should_receive(:abort_on_exception=).with(true)
|
174
|
+
block.call
|
175
|
+
# The block is what moves it to a status of :down.
|
176
|
+
# And sets the pid and exit code.
|
177
|
+
expect(subject.status).to eq(:down)
|
178
|
+
expect(subject.pid).to eq(pid)
|
179
|
+
expect(subject.exit_code).to eq(exit_code)
|
180
|
+
thread
|
181
|
+
end
|
182
|
+
end
|
183
|
+
|
184
|
+
it 'should invoke "agent consul" with given args and wait on the result' do
|
185
|
+
expect(subject.thread).to be_nil
|
186
|
+
expect(subject.pid).to be_nil
|
187
|
+
expect(subject.status).to be_nil
|
188
|
+
expect(subject.exit_code).to be_nil
|
189
|
+
subject.launch!
|
190
|
+
expect(subject.thread).to be(thread)
|
191
|
+
end
|
192
|
+
|
193
|
+
it 'should invoke "agent consul" and run callbacks after going down.' do
|
194
|
+
subject.on_down do |x|
|
195
|
+
x.down_a!
|
196
|
+
end
|
197
|
+
|
198
|
+
subject.on_down do |x|
|
199
|
+
x.down_b!
|
200
|
+
end
|
201
|
+
|
202
|
+
subject.should_receive(:down_a!).with
|
203
|
+
subject.should_receive(:down_b!).with
|
204
|
+
|
205
|
+
subject.launch!
|
206
|
+
end
|
207
|
+
|
208
|
+
# it should blow up if called with status other than nil, :down.
|
209
|
+
end
|
210
|
+
|
211
|
+
describe "verify_up! method" do
|
212
|
+
describe 'when check succeeds' do
|
213
|
+
before do
|
214
|
+
subject.should_receive(:sleep).with(0.1)
|
215
|
+
subject.should_receive(:system).with('consul', 'info').and_return(false, false, false, true)
|
216
|
+
subject.should_receive(:sleep).with(2)
|
217
|
+
subject.should_receive(:sleep).with(4)
|
218
|
+
subject.should_receive(:sleep).with(8)
|
219
|
+
end
|
220
|
+
|
221
|
+
it 'sets status to :up' do
|
222
|
+
subject.verify_up!
|
223
|
+
expect(subject.status).to eq(:up)
|
224
|
+
end
|
225
|
+
|
226
|
+
describe 'with callbacks' do
|
227
|
+
before do
|
228
|
+
subject.on_up do |obj|
|
229
|
+
subject.callback_a! obj
|
230
|
+
expect(subject.status).to eq(:up)
|
231
|
+
end
|
232
|
+
|
233
|
+
subject.on_up do |obj|
|
234
|
+
subject.callback_b! obj
|
235
|
+
expect(subject.status).to eq(:up)
|
236
|
+
end
|
237
|
+
|
238
|
+
subject.should_receive(:callback_a!).with(subject)
|
239
|
+
subject.should_receive(:callback_b!).with(subject)
|
240
|
+
end
|
241
|
+
|
242
|
+
it 'invokes up callbacks with itself as parameter' do
|
243
|
+
subject.verify_up!
|
244
|
+
end
|
245
|
+
end
|
246
|
+
end
|
247
|
+
|
248
|
+
describe 'when check fails' do
|
249
|
+
before do
|
250
|
+
subject.should_receive(:sleep).with(0.1)
|
251
|
+
subject.should_receive(:system).with('consul', 'info').and_return(false, false, false, false, false)
|
252
|
+
subject.should_receive(:sleep).with(2)
|
253
|
+
subject.should_receive(:sleep).with(4)
|
254
|
+
subject.should_receive(:sleep).with(8)
|
255
|
+
subject.should_receive(:sleep).with(16)
|
256
|
+
end
|
257
|
+
|
258
|
+
it 'leaves the status alone' do
|
259
|
+
subject.should_not_receive(:set_status)
|
260
|
+
subject.verify_up!
|
261
|
+
expect(subject.status).to be_nil
|
262
|
+
end
|
263
|
+
|
264
|
+
describe 'with callbacks' do
|
265
|
+
before do
|
266
|
+
subject.on_up do |obj|
|
267
|
+
subject.callback_a! obj
|
268
|
+
end
|
269
|
+
|
270
|
+
subject.on_up do |obj|
|
271
|
+
subject.callback_b! obj
|
272
|
+
end
|
273
|
+
|
274
|
+
subject.should_not_receive(:callback_a!)
|
275
|
+
subject.should_not_receive(:callback_b!)
|
276
|
+
end
|
277
|
+
|
278
|
+
it 'does not invoke callbacks at all' do
|
279
|
+
subject.verify_up!
|
280
|
+
end
|
281
|
+
end
|
282
|
+
|
283
|
+
# it should blow up if called with status other than :starting, :up
|
284
|
+
end
|
285
|
+
end
|
286
|
+
|
287
|
+
describe 'stop! method' do
|
288
|
+
describe 'with a pid' do
|
289
|
+
let(:pid) { double("AgentPid") }
|
290
|
+
before { subject.stub(:pid).and_return(pid) }
|
291
|
+
|
292
|
+
describe 'in nil status' do
|
293
|
+
# Need an expression so compiler doesn't ignore the block
|
294
|
+
let(:initial_status) { nil && true }
|
295
|
+
it_behaves_like 'stop signaler'
|
296
|
+
end
|
297
|
+
|
298
|
+
describe 'in :starting status' do
|
299
|
+
let(:initial_status) { :starting.to_sym }
|
300
|
+
it_behaves_like 'stop signaler'
|
301
|
+
end
|
302
|
+
|
303
|
+
describe 'in :up status' do
|
304
|
+
let(:initial_status) { :up.to_sym }
|
305
|
+
it_behaves_like 'stop signaler'
|
306
|
+
end
|
307
|
+
|
308
|
+
describe 'in :stopping status' do
|
309
|
+
let(:initial_status) { :stopping.to_sym }
|
310
|
+
it_behaves_like 'stop signaler'
|
311
|
+
end
|
312
|
+
|
313
|
+
describe 'in :down status' do
|
314
|
+
let(:initial_status) { :down.to_sym }
|
315
|
+
it_behaves_like 'an unstoppable'
|
316
|
+
end
|
317
|
+
end
|
318
|
+
|
319
|
+
describe 'with no pid' do
|
320
|
+
before do
|
321
|
+
subject.stub(:pid).and_return(nil)
|
322
|
+
end
|
323
|
+
|
324
|
+
describe 'in nil status' do
|
325
|
+
# Need an expression so compiler doesn't ignore the block
|
326
|
+
let(:initial_status) { nil && true }
|
327
|
+
it_behaves_like 'an unstoppable'
|
328
|
+
end
|
329
|
+
|
330
|
+
describe 'in :starting status' do
|
331
|
+
let(:initial_status) { :starting.to_sym }
|
332
|
+
it_behaves_like 'an unstoppable'
|
333
|
+
end
|
334
|
+
|
335
|
+
describe 'in :up status' do
|
336
|
+
let(:initial_status) { :up.to_sym }
|
337
|
+
it_behaves_like 'an unstoppable'
|
338
|
+
end
|
339
|
+
|
340
|
+
describe 'in :stopping status' do
|
341
|
+
let(:initial_status) { :stopping.to_sym }
|
342
|
+
it_behaves_like 'an unstoppable'
|
343
|
+
end
|
344
|
+
|
345
|
+
describe 'in :down status' do
|
346
|
+
let(:initial_status) { :down.to_sym }
|
347
|
+
it_behaves_like 'an unstoppable'
|
348
|
+
end
|
349
|
+
end
|
350
|
+
end
|
351
|
+
|
352
|
+
describe ':run! method' do
|
353
|
+
it 'launches, then verifies up, and returns status' do
|
354
|
+
status = double('Status')
|
355
|
+
expect(subject).to receive(:launch!).with.ordered
|
356
|
+
expect(subject).to receive(:verify_up!).with.ordered
|
357
|
+
expect(subject).to receive(:status).with.ordered { status }
|
358
|
+
expect(subject.run!).to be(status)
|
359
|
+
end
|
360
|
+
end
|
361
|
+
|
362
|
+
describe ':wait method' do
|
363
|
+
it 'waits on the agent runner thread and returns the exit code' do
|
364
|
+
thread = double('Thread')
|
365
|
+
exit_code = double('ExitCode')
|
366
|
+
expect(subject).to receive(:thread).with.and_return { thread }
|
367
|
+
expect(thread).to receive(:join).with.and_return { thread }
|
368
|
+
expect(subject).to receive(:exit_code).with.and_return { exit_code }
|
369
|
+
expect(subject.wait).to be(exit_code)
|
370
|
+
end
|
371
|
+
|
372
|
+
it 'blows up if no thread is present' do
|
373
|
+
expect(subject).to receive(:thread).with.and_return { nil }
|
374
|
+
expect { subject.wait }.to raise_exception(/consul agent has not started/)
|
375
|
+
end
|
376
|
+
end
|
377
|
+
|
378
|
+
describe '#while_up method' do
|
379
|
+
let(:thread) { double('Thread') }
|
380
|
+
# Use this as the while_up block; it will verify that the block is invoked with
|
381
|
+
# the AgentProcess instance as sole parameter.
|
382
|
+
let(:action) { Proc.new {|o| expect(o).to be(subject)} }
|
383
|
+
|
384
|
+
describe 'when brought up' do
|
385
|
+
before do
|
386
|
+
# This "brings it up."
|
387
|
+
expect(subject).to receive(:on_up).and_yield(subject)
|
388
|
+
|
389
|
+
# And this happens in the on_up callback
|
390
|
+
expect(Thread).to receive(:new) do |&blk|
|
391
|
+
blk.call
|
392
|
+
thread
|
393
|
+
end
|
394
|
+
end
|
395
|
+
|
396
|
+
it 'registers an on_stopping that kills the thread for the given block' do
|
397
|
+
expect(thread).to receive(:kill).with
|
398
|
+
expect(subject).to receive(:on_stopping).and_yield(subject)
|
399
|
+
subject.while_up &action
|
400
|
+
end
|
401
|
+
|
402
|
+
it 'registers an on_down that kills the thread for the given block' do
|
403
|
+
expect(thread).to receive(:kill).with
|
404
|
+
expect(subject).to receive(:on_down).and_yield(subject)
|
405
|
+
subject.while_up &action
|
406
|
+
end
|
407
|
+
end
|
408
|
+
|
409
|
+
describe 'when never brought up' do
|
410
|
+
it 'never registers on_stopping or on_down handlers' do
|
411
|
+
expect(Thread).to_not receive(:new)
|
412
|
+
expect(thread).to_not receive(:kill)
|
413
|
+
expect(subject).to_not receive(:on_stopping)
|
414
|
+
expect(subject).to_not receive(:on_down)
|
415
|
+
# We receive it but we're not running anything.
|
416
|
+
expect(subject).to receive(:on_up)
|
417
|
+
subject.while_up &action
|
418
|
+
end
|
419
|
+
end
|
36
420
|
end
|
37
421
|
end
|
38
422
|
|
@@ -55,21 +439,21 @@ describe AutoConsul::Runner do
|
|
55
439
|
registry.servers.stub(:servers).with(expiry).and_return(servers_list)
|
56
440
|
end
|
57
441
|
|
58
|
-
describe :
|
59
|
-
it_behaves_like 'a consul agent
|
442
|
+
describe :agent_runner do
|
443
|
+
it_behaves_like 'a consul agent process runner', :agent_runner, :agents, true, []
|
60
444
|
end
|
61
445
|
|
62
|
-
describe :
|
446
|
+
describe :server_runner do
|
63
447
|
describe 'with empty server registry' do
|
64
448
|
# consul agent -bind 192.168.50.100 -data-dir /opt/consul/server/data -node vagrant-server -server -bootstrap
|
65
|
-
it_behaves_like 'a consul agent
|
449
|
+
it_behaves_like 'a consul agent process runner', :server_runner, :servers, false, ['-server', '-bootstrap']
|
66
450
|
end
|
67
451
|
|
68
452
|
describe 'with other servers in registry' do
|
69
453
|
# consul agent -bind 192.168.50.100 -data-dir /opt/consul/server/data -node vagrant-server -server
|
70
454
|
# consul join some_ip
|
71
455
|
|
72
|
-
it_behaves_like 'a consul agent
|
456
|
+
it_behaves_like 'a consul agent process runner', :server_runner, :servers, true, ['-server']
|
73
457
|
end
|
74
458
|
end
|
75
459
|
end
|