deadman_check 0.2.1 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +40 -13
- data/Rakefile +1 -0
- data/bin/deadman-check +22 -6
- data/deadman_check.gemspec +2 -1
- data/lib/deadman_check/version.rb +1 -1
- data/lib/deadman_check_switch.rb +56 -26
- metadata +26 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 59091e4397024d50031cf4368adfea22a4057207
+  data.tar.gz: dcc1a25d10b623e2b4e13c928ba19cbef1ccc1e9
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 394b714ccbb820e3fc3e526baadc3bb75ef51309ba3a08cd058548cbe8c6b4fe5da7219182023a523d9db19438319a42f30ff7e39d92098bf93bbe8f1c9f1ae0
+  data.tar.gz: bfc91fcadb9bb51475fd7382516dc687adce7d3c5917e5fa8b6bb26989432301e17ae9d06ee7fed3e74332a6eaa0ca1a6d6b984e6371a2439dd480c861f66e2b
data/README.md
CHANGED
@@ -18,8 +18,11 @@ is expected for that job.
 
 
 ### Requirements
-* [Consul](https://www.consul.io/) instance
-
+* [Consul](https://www.consul.io/) instance or cluster to report to
+
+### Alerting Options
+* [Slack](https://slack.com/)
+* [AWS SNS](https://aws.amazon.com/documentation/sns/)
 
 ## Example Usage
 
@@ -127,7 +130,7 @@ job "DeadmanMonitoring" {
 "8500",
 "--key",
 "deadman/SilverBulletPeriodicProcess",
-"--alert-to",
+"--alert-to-slack",
 "slackroom",
 "--daemon",
 "--daemon-sleep",
@@ -153,7 +156,7 @@ If you have multiple periodic jobs that need to be monitored then use the ```--k
 
 <img width="658" alt="screen shot 2017-04-23 at 11 17 29 pm" src="https://cloud.githubusercontent.com/assets/538171/25324510/14d6e7f0-287b-11e7-9c0d-733d69e1cc94.png">
 
-To monitor the above you would just use the ```--key-path``` argument instead of ```--key```
+To monitor the above you would just use the ```--key-path``` argument instead of ```--key``` and AWS SNS for alerting endpoint
 
 ```hcl
 job "DeadmanMonitoring" {
@@ -173,8 +176,10 @@ job "DeadmanMonitoring" {
 "8500",
 "--key-path",
 "deadman/",
-"--alert-to",
-"
+"--alert-to-sns",
+"arn:aws:sns:us-east-1:123412345678:deadman-check",
+"--alert-to-sns-region",
+"us-east-1",
 "--daemon",
 "--daemon-sleep",
 "900"]
@@ -184,13 +189,16 @@ job "DeadmanMonitoring" {
 memory = 256
 }
 env {
-
+AWS_ACCESS_KEY_ID = "YourAWSKEY"
+AWS_SECRET_ACCESS_KEY = "YourAWSSecret"
 }
 }
 }
 }
 ```
 
+<img width="903" alt="screen shot 2017-08-04 at 11 39 12 am" src="https://user-images.githubusercontent.com/538171/28982223-e576743c-7909-11e7-8e65-ebb0b4a76762.png">
+
 # Non-Nomad Use:
 
 ## Local system installation
@@ -216,6 +224,13 @@ $ alias deadman-check='\
 
 If you don't do the docker pull, the first time you run deadman-check, the docker run command will automatically pull the sepulworld/deadman-check image on the Docker Hub. Subsequent runs will use a locally cached copy of the image and will not have to download anything.
 
+### Alerting Setup
+* Slack alerting requires a SLACK_API_TOKEN environment variable to be set (use [Slack Bot integration](https://my.slack.com/services/new/bot)) (optional)
+* [AWS SNS](https://aws.amazon.com/documentation/sns/) alerting requires appropreiate AWS IAM access to target SNS topic. One of the following can be used for authentication. IAM policy access to publish to the topic will be required
+- ENV['AWS_ACCESS_KEY_ID'] and ENV['AWS_SECRET_ACCESS_KEY']
+- The shared credentials ini file at ~/.aws/credentials (more information)
+- From an [instance profile](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html) when running on EC2
+
 ## Usage via Local System Install
 
 ```bash
@@ -298,15 +313,21 @@ $ deadman-check switch_monitor -h
 
 DESCRIPTION:
 
-switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys
+switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys
+under a given key-path in Consul
 
 EXAMPLES:
 
 # Target a Consul key deadman/myservice, and this key has an EPOCH value to check looking to alert
-deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to-slack my-slack-monitor-channel
+
+# Target a Consul key path deadman/, which contains 2 or more service keys to monitor, i.e. deadman/myservice1, deadman/myservice2,
+deadmman/myservice3 all fall under the path deadman/
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-slack my-slack-monitor-channel
 
-# Target a Consul key path deadman/,
-
+# Target a Consul key path deadman/, alert to Amazon SNS, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path
+deadman/
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-sns arn:aws:sns:*:123456789012:my_corporate_topic
 
 OPTIONS:
 
@@ -322,8 +343,14 @@ $ deadman-check switch_monitor -h
 --key KEY
 Consul key to monitor, provide this or --key-path if you have multiple keys in a given path.
 
---alert-to
-
+--alert-to-slack SLACKCHANNEL
+Slack channel to send alert, don't include the # tag in name
+
+--alert-to-sns SNSARN
+Amazon Web Services SNS arn to send alert, example arn arn:aws:sns:*:123456789012:my_corporate_topic
+
+--alert-to-sns-region AWSREGION
+Amazon Web Services region the SNS topic is in, defaults to us-west-2
 
 --daemon
 Run as a daemon, otherwise will run check just once
data/Rakefile
CHANGED
data/bin/deadman-check
CHANGED
@@ -14,18 +14,26 @@ command :switch_monitor do |c|
   c.summary = 'Target a Consul key to monitor'
   c.description = 'switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys under a given key-path in Consul'
   c.example %q{Target a Consul key deadman/myservice, and this key has an EPOCH value to check looking to alert},
-    %q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to
+    %q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to-slack my-slack-monitor-channel}
   c.example %q{Target a Consul key path deadman/, which contains 2 or more service keys to monitor, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path deadman/},
-    %q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to
+    %q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-slack my-slack-monitor-channel}
+  c.example %q{Target a Consul key path deadman/, alert to Amazon SNS, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path deadman/},
+    %q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-sns arn:aws:sns:*:123456789012:my_corporate_topic}
   c.option '--host HOST', String, 'IP address or hostname of Consul system'
   c.option '--port PORT', String, 'port Consul is listening on'
   c.option '--key-path KEYPATH', String, 'Consul key path to monitor, performs a recursive key lookup at given path.'
   c.option '--key KEY', String, 'Consul key to monitor, provide this or --key-path if you have multiple keys in a given path.'
-  c.option '--alert-to
+  c.option '--alert-to-slack SLACKCHANNEL', String, 'Slack channel to send alert, don\'t include the # tag in name'
+  c.option '--alert-to-sns SNSARN', String, 'Amazon Web Services SNS arn to send alert, example arn arn:aws:sns:*:123456789012:my_corporate_topic'
+  c.option '--alert-to-sns-region AWSREGION', String, 'Amazon Web Services region the SNS topic is in, defaults to us-west-2'
   c.option '--daemon', 'Run as a daemon, otherwise will run check just once'
   c.option '--daemon-sleep SECONDS', String, 'Set the number of seconds to sleep in between switch checks, default 300'
   c.action do |args, options|
-    options.default :daemon_sleep => 300
+    options.default :daemon_sleep => 300,
+      :alert_to_sns_region => 'us-west-2',
+      :alert_to_sns => nil,
+      :alert_to_slack => nil
+
     if options.key_path && options.key
       abort("Specify --key-path or --key, don't specify both")
     end
@@ -36,8 +44,10 @@ command :switch_monitor do |c|
       target = options.key_path
       recurse = true
     end
-    switch_monitor = DeadmanCheck::SwitchMonitor.new(
-
+    switch_monitor = DeadmanCheck::SwitchMonitor.new(
+      options.host, options.port,
+      target, options.alert_to_slack, options.alert_to_sns,
+      options.alert_to_sns_region, recurse, options.daemon_sleep)
     if options.daemon
       Daemons.run(switch_monitor.run_check_daemon)
     else
@@ -57,6 +67,12 @@ command :key_set do |c|
   c.option '--key KEY', String, 'Consul key to report EPOCH time and frequency for service'
   c.option '--frequency FREQUENCY', String, 'Frequency at which this key should be updated in seconds'
   c.action do |args, options|
+    if options.frequency.nil?
+      abort("Specify --frequency at which this key should be updated by the service")
+    end
+    if options.key.nil?
+      abort("Must specify a --key")
+    end
     key_set = DeadmanCheck::KeySet.new(options.host, options.port, options.key,
       options.frequency)
     key_set.run_consul_key_update
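The argument checks in data/bin/deadman-check (reject --key together with --key-path, and require both --frequency and --key for key_set) can be sketched as plain functions. This is an illustrative sketch only, not code from the gem: `switch_monitor_error` and `key_set_error` are hypothetical helper names, and a plain Hash stands in for Commander's options object.

```ruby
# Hypothetical helpers (not part of deadman_check). Each returns an error
# message String when the options are invalid, or nil when they are fine.

# switch_monitor: --key and --key-path are mutually exclusive.
def switch_monitor_error(opts)
  return "Specify --key-path or --key, don't specify both" if opts[:key_path] && opts[:key]
  nil
end

# key_set: both --frequency and --key must be given.
def key_set_error(opts)
  return 'Specify --frequency at which this key should be updated by the service' if opts[:frequency].nil?
  return 'Must specify a --key' if opts[:key].nil?
  nil
end
```

In the gem itself the equivalent checks call `abort` directly inside the `c.action` block; returning the message instead simply makes the rules easy to exercise without exiting the process.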
data/deadman_check.gemspec
CHANGED
@@ -36,7 +36,8 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "webmock", "~> 3.0"
 
   spec.add_dependency 'commander', '~> 4.4', '>= 4.4.3'
-  spec.add_dependency 'diplomat', '~>
+  spec.add_dependency 'diplomat', '~> 2.0.0', '>= 2.0.0'
   spec.add_dependency 'slack-ruby-client', '~> 0.8.0'
   spec.add_dependency 'daemons', '~> 1.2.4', '>=1.2.4'
+  spec.add_dependency 'aws-sdk', '~> 2.10.21', '>=2.10.21'
 end
data/lib/deadman_check_switch.rb
CHANGED
@@ -3,23 +3,36 @@ require 'deadman_check_global'
 require 'diplomat'
 require 'slack-ruby-client'
 require 'json'
+require 'aws-sdk'
 
 module DeadmanCheck
   # Switch class
   class SwitchMonitor
-    attr_accessor :host, :port, :target, :
+    attr_accessor :host, :port, :target, :alert_to_slack,
+      :alert_to_sns, :alert_to_sns_region, :recurse, :daemon_sleep
 
-    def initialize(host, port, target,
+    def initialize(host, port, target, alert_to_slack, alert_to_sns,
+      alert_to_sns_region, recurse, daemon_sleep)
       @host = host
       @port = port
       @target = target
-      @
+      @alert_to_slack = alert_to_slack
+      @alert_to_sns = alert_to_sns
+      @alert_to_sns_region = alert_to_sns_region
       @recurse = recurse
       @daemon_sleep = daemon_sleep.to_i
-    end
 
-
-
+      unless @alert_to_slack.nil?
+        Slack.configure do |config|
+          config.token = ENV['SLACK_API_TOKEN']
+        end
+      end
+
+      unless @alert_to_sns.nil?
+        @sns = Aws::SNS::Client.new(
+          region: @alert_to_sns_region
+        )
+      end
     end
 
     def run_check_once
@@ -52,20 +65,26 @@ module DeadmanCheck
       return recorded_epochs
     end
 
+    def check_recursive_recorded_epochs(recorded_epochs, current_epoch)
+      recorded_epochs.each do |recorded_service|
+        value_json = JSON.parse(recorded_service[:value])
+        frequency = value_json["frequency"].to_i
+        epoch = value_json["epoch"].to_i
+        epoch_diff = diff_epoch(current_epoch, epoch)
+        alert_if_epoch_greater_than_frequency(epoch_diff,
+          recorded_service[:key],
+          frequency)
+      end
+    end
+
     def parse_recorded_epoch(recorded_epochs)
       # {"epoch":1493000501,"frequency":"300"}
-      value_json = JSON.parse(recorded_epochs)
-      frequency = value_json["frequency"]
-      epoch = value_json["epoch"]
+      value_json = JSON.parse(recorded_epochs[0][:value])
+      frequency = value_json["frequency"]
+      epoch = value_json["epoch"]
       return epoch, frequency
     end
 
-    def alert_if_epoch_greater_than_frequency(epoch_diff, target, frequency)
-      if epoch_diff > frequency
-        slack_alert(@alert_to, target, epoch_diff)
-      end
-    end
-
     def check_recorded_epoch(parse_recorded_epoch, current_epoch)
       recorded_epoch = parse_recorded_epoch[0].to_i
       frequency = parse_recorded_epoch[1].to_i
@@ -73,23 +92,34 @@ module DeadmanCheck
       alert_if_epoch_greater_than_frequency(epoch_diff, @target, frequency)
     end
 
-    def
-
-
-
-
-
-        alert_if_epoch_greater_than_frequency(epoch_diff,
-          recorded_service[:key],
-          frequency)
+    def alert_if_epoch_greater_than_frequency(epoch_diff, target, frequency)
+      if epoch_diff > frequency
+        slack_alert(
+          @alert_to_slack, target, epoch_diff) unless @alert_to_slack.nil?
+        sns_alert(
+          @alert_to_sns, target, epoch_diff) unless @alert_to_sns.nil?
       end
     end
 
-    def slack_alert(
+    def slack_alert(alert_to_slack, target, epoch_diff)
       client = Slack::Web::Client.new
-      client.chat_postMessage(channel: "\##{
+      client.chat_postMessage(channel: "\##{alert_to_slack}",
+        text: "Alert: Deadman Switch
         Triggered for #{target}, with #{epoch_diff} seconds since last run",
         username: 'deadman')
     end
+
+    def sns_alert(alert_to_sns, target, epoch_diff)
+      @sns.publish(
+        target_arn: @alert_to_sns,
+        message_structure: 'json',
+        message: {
+          :default => "Alert: Deadman Switch triggered for #{target}",
+          :email => "Alert: Deadman Switch triggered for #{target}, with
+          #{epoch_diff} seconds since last run",
+          :sms => "Alert: Deadman Switch for #{target}"
+        }.to_json
+      )
+    end
   end
 end
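The core of the change to data/lib/deadman_check_switch.rb is a staleness test followed by a per-protocol SNS message. A minimal standalone sketch of both steps follows. It is not the gem's code: `stale?` and `sns_message` are hypothetical helper names, the subtraction assumes that `diff_epoch` (not shown in this diff) returns current epoch minus recorded epoch, and nothing here talks to Consul, Slack, or AWS.

```ruby
require 'json'

# Hypothetical helper (not part of deadman_check): true when the recorded
# check-in is older than its allowed frequency. `value` is the JSON string
# stored under the Consul key, e.g. {"epoch":1493000501,"frequency":"300"}.
# Assumption: the elapsed time is current_epoch minus the recorded epoch.
def stale?(value, current_epoch)
  parsed = JSON.parse(value)
  epoch_diff = current_epoch - parsed['epoch'].to_i
  epoch_diff > parsed['frequency'].to_i
end

# Hypothetical helper: builds the JSON document passed as `message` when
# publishing with message_structure: 'json'. SNS requires the "default"
# entry; "email" and "sms" override it for those protocols.
def sns_message(target, epoch_diff)
  {
    default: "Alert: Deadman Switch triggered for #{target}",
    email: "Alert: Deadman Switch triggered for #{target}, " \
           "with #{epoch_diff} seconds since last run",
    sms: "Alert: Deadman Switch for #{target}"
  }.to_json
end

value = '{"epoch":1493000501,"frequency":"300"}'
puts sns_message('deadman/myservice', 400) if stale?(value, 1493000501 + 400)
```

Because the comparison is a strict greater-than, a service that checks in exactly on its frequency boundary is not reported.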
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: deadman_check
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.0
 platform: ruby
 authors:
 - zane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-08-04 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -92,20 +92,20 @@ dependencies:
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 2.0.0
     - - ">="
       - !ruby/object:Gem::Version
-        version:
+        version: 2.0.0
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 2.0.0
     - - ">="
       - !ruby/object:Gem::Version
-        version:
+        version: 2.0.0
 - !ruby/object:Gem::Dependency
   name: slack-ruby-client
   requirement: !ruby/object:Gem::Requirement
@@ -140,6 +140,26 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 1.2.4
+- !ruby/object:Gem::Dependency
+  name: aws-sdk
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.10.21
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 2.10.21
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.10.21
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 2.10.21
 description: |-
   A script to check a given Consul key EPOCH for
   freshness. Good for monitoring cron jobs or batch jobs. Have the last step