deadman_check 0.2.1 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +40 -13
- data/Rakefile +1 -0
- data/bin/deadman-check +22 -6
- data/deadman_check.gemspec +2 -1
- data/lib/deadman_check/version.rb +1 -1
- data/lib/deadman_check_switch.rb +56 -26
- metadata +26 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 59091e4397024d50031cf4368adfea22a4057207
+data.tar.gz: dcc1a25d10b623e2b4e13c928ba19cbef1ccc1e9
 SHA512:
-metadata.gz:
-data.tar.gz:
+metadata.gz: 394b714ccbb820e3fc3e526baadc3bb75ef51309ba3a08cd058548cbe8c6b4fe5da7219182023a523d9db19438319a42f30ff7e39d92098bf93bbe8f1c9f1ae0
+data.tar.gz: bfc91fcadb9bb51475fd7382516dc687adce7d3c5917e5fa8b6bb26989432301e17ae9d06ee7fed3e74332a6eaa0ca1a6d6b984e6371a2439dd480c861f66e2b
data/README.md
CHANGED
@@ -18,8 +18,11 @@ is expected for that job.
 
 
 ### Requirements
-* [Consul](https://www.consul.io/) instance
-
+* [Consul](https://www.consul.io/) instance or cluster to report to
+
+### Alerting Options
+* [Slack](https://slack.com/)
+* [AWS SNS](https://aws.amazon.com/documentation/sns/)
 
 ## Example Usage
 
@@ -127,7 +130,7 @@ job "DeadmanMonitoring" {
 "8500",
 "--key",
 "deadman/SilverBulletPeriodicProcess",
-"--alert-to",
+"--alert-to-slack",
 "slackroom",
 "--daemon",
 "--daemon-sleep",
@@ -153,7 +156,7 @@ If you have multiple periodic jobs that need to be monitored then use the ```--k
 
 <img width="658" alt="screen shot 2017-04-23 at 11 17 29 pm" src="https://cloud.githubusercontent.com/assets/538171/25324510/14d6e7f0-287b-11e7-9c0d-733d69e1cc94.png">
 
-To monitor the above you would just use the ```--key-path``` argument instead of ```--key```
+To monitor the above you would just use the ```--key-path``` argument instead of ```--key``` and AWS SNS for alerting endpoint
 
 ```hcl
 job "DeadmanMonitoring" {
@@ -173,8 +176,10 @@ job "DeadmanMonitoring" {
 "8500",
 "--key-path",
 "deadman/",
-"--alert-to",
-"
+"--alert-to-sns",
+"arn:aws:sns:us-east-1:123412345678:deadman-check",
+"--alert-to-sns-region",
+"us-east-1",
 "--daemon",
 "--daemon-sleep",
 "900"]
@@ -184,13 +189,16 @@ job "DeadmanMonitoring" {
 memory = 256
 }
 env {
-
+AWS_ACCESS_KEY_ID = "YourAWSKEY"
+AWS_SECRET_ACCESS_KEY = "YourAWSSecret"
 }
 }
 }
 }
 ```
 
+<img width="903" alt="screen shot 2017-08-04 at 11 39 12 am" src="https://user-images.githubusercontent.com/538171/28982223-e576743c-7909-11e7-8e65-ebb0b4a76762.png">
+
 # Non-Nomad Use:
 
 ## Local system installation
@@ -216,6 +224,13 @@ $ alias deadman-check='\
 
 If you don't do the docker pull, the first time you run deadman-check, the docker run command will automatically pull the sepulworld/deadman-check image on the Docker Hub. Subsequent runs will use a locally cached copy of the image and will not have to download anything.
 
+### Alerting Setup
+* Slack alerting requires a SLACK_API_TOKEN environment variable to be set (use [Slack Bot integration](https://my.slack.com/services/new/bot)) (optional)
+* [AWS SNS](https://aws.amazon.com/documentation/sns/) alerting requires appropreiate AWS IAM access to target SNS topic. One of the following can be used for authentication. IAM policy access to publish to the topic will be required
+- ENV['AWS_ACCESS_KEY_ID'] and ENV['AWS_SECRET_ACCESS_KEY']
+- The shared credentials ini file at ~/.aws/credentials (more information)
+- From an [instance profile](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html) when running on EC2
+
 ## Usage via Local System Install
 
 ```bash
@@ -298,15 +313,21 @@ $ deadman-check switch_monitor -h
 
 DESCRIPTION:
 
-switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys
+switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys
+under a given key-path in Consul
 
 EXAMPLES:
 
 # Target a Consul key deadman/myservice, and this key has an EPOCH value to check looking to alert
-deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to-slack my-slack-monitor-channel
+
+# Target a Consul key path deadman/, which contains 2 or more service keys to monitor, i.e. deadman/myservice1, deadman/myservice2,
+deadmman/myservice3 all fall under the path deadman/
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-slack my-slack-monitor-channel
 
-# Target a Consul key path deadman/,
-
+# Target a Consul key path deadman/, alert to Amazon SNS, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path
+deadman/
+deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-sns arn:aws:sns:*:123456789012:my_corporate_topic
 
 OPTIONS:
 
@@ -322,8 +343,14 @@ $ deadman-check switch_monitor -h
 --key KEY
 Consul key to monitor, provide this or --key-path if you have multiple keys in a given path.
 
---alert-to
-
+--alert-to-slack SLACKCHANNEL
+Slack channel to send alert, don't include the # tag in name
+
+--alert-to-sns SNSARN
+Amazon Web Services SNS arn to send alert, example arn arn:aws:sns:*:123456789012:my_corporate_topic
+
+--alert-to-sns-region AWSREGION
+Amazon Web Services region the SNS topic is in, defaults to us-west-2
 
 --daemon
 Run as a daemon, otherwise will run check just once
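The README hunks above describe monitoring a single key or every key under a key path. As a rough illustration of what such a key holds, the sketch below builds a value in the format shown by the comment in `lib/deadman_check_switch.rb` (`{"epoch":1493000501,"frequency":"300"}`). The helper name `checkin_value` is hypothetical and not part of the gem, and writing the value to Consul is deliberately left out.

```ruby
require 'json'

# Hypothetical helper (not part of deadman_check): build the JSON value a
# service would record under its Consul key each time it checks in.
# The frequency is kept as a string to match the format in the source comment.
def checkin_value(frequency_seconds, now = Time.now)
  { 'epoch' => now.to_i, 'frequency' => frequency_seconds.to_s }.to_json
end

puts checkin_value(300, Time.at(1493000501))
# => {"epoch":1493000501,"frequency":"300"}
```

Passing the time as an argument keeps the helper deterministic, which makes it easy to test.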
data/Rakefile
CHANGED
data/bin/deadman-check
CHANGED
@@ -14,18 +14,26 @@ command :switch_monitor do |c|
 c.summary = 'Target a Consul key to monitor'
 c.description = 'switch_monitor will monitor either a given key which contains a services last epoch checkin and frequency, or a series of services that set keys under a given key-path in Consul'
 c.example %q{Target a Consul key deadman/myservice, and this key has an EPOCH value to check looking to alert},
-%q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to
+%q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key deadman/myservice --alert-to-slack my-slack-monitor-channel}
 c.example %q{Target a Consul key path deadman/, which contains 2 or more service keys to monitor, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path deadman/},
-%q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to
+%q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-slack my-slack-monitor-channel}
+c.example %q{Target a Consul key path deadman/, alert to Amazon SNS, i.e. deadman/myservice1, deadman/myservice2, deadmman/myservice3 all fall under the path deadman/},
+%q{deadman-check switch_monitor --host 127.0.0.1 --port 8500 --key-path deadman/ --alert-to-sns arn:aws:sns:*:123456789012:my_corporate_topic}
 c.option '--host HOST', String, 'IP address or hostname of Consul system'
 c.option '--port PORT', String, 'port Consul is listening on'
 c.option '--key-path KEYPATH', String, 'Consul key path to monitor, performs a recursive key lookup at given path.'
 c.option '--key KEY', String, 'Consul key to monitor, provide this or --key-path if you have multiple keys in a given path.'
-c.option '--alert-to
+c.option '--alert-to-slack SLACKCHANNEL', String, 'Slack channel to send alert, don\'t include the # tag in name'
+c.option '--alert-to-sns SNSARN', String, 'Amazon Web Services SNS arn to send alert, example arn arn:aws:sns:*:123456789012:my_corporate_topic'
+c.option '--alert-to-sns-region AWSREGION', String, 'Amazon Web Services region the SNS topic is in, defaults to us-west-2'
 c.option '--daemon', 'Run as a daemon, otherwise will run check just once'
 c.option '--daemon-sleep SECONDS', String, 'Set the number of seconds to sleep in between switch checks, default 300'
 c.action do |args, options|
-options.default :daemon_sleep => 300
+options.default :daemon_sleep => 300,
+:alert_to_sns_region => 'us-west-2',
+:alert_to_sns => nil,
+:alert_to_slack => nil
+
 if options.key_path && options.key
 abort("Specify --key-path or --key, don't specify both")
 end
@@ -36,8 +44,10 @@ command :switch_monitor do |c|
 target = options.key_path
 recurse = true
 end
-switch_monitor = DeadmanCheck::SwitchMonitor.new(
-
+switch_monitor = DeadmanCheck::SwitchMonitor.new(
+options.host, options.port,
+target, options.alert_to_slack, options.alert_to_sns,
+options.alert_to_sns_region, recurse, options.daemon_sleep)
 if options.daemon
 Daemons.run(switch_monitor.run_check_daemon)
 else
@@ -57,6 +67,12 @@ command :key_set do |c|
 c.option '--key KEY', String, 'Consul key to report EPOCH time and frequency for service'
 c.option '--frequency FREQUENCY', String, 'Frequency at which this key should be updated in seconds'
 c.action do |args, options|
+if options.frequency.nil?
+abort("Specify --frequency at which this key should be updated by the service")
+end
+if options.key.nil?
+abort("Must specify a --key")
+end
 key_set = DeadmanCheck::KeySet.new(options.host, options.port, options.key,
 options.frequency)
 key_set.run_consul_key_update
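The `bin/deadman-check` changes above default both alert targets to `nil` and reject `--key` combined with `--key-path`. A minimal sketch of that style of validation follows, pulled into a function so it can be tested without aborting the process. `Options` is a stand-in for Commander's options object, `switch_monitor_option_error` is a hypothetical name, and the second check (requiring at least one alert target) is an assumption that does not appear in the diff.

```ruby
# Stand-in for Commander's options object; only the fields used here.
Options = Struct.new(:key, :key_path, :alert_to_slack, :alert_to_sns,
                     keyword_init: true)

# Hypothetical helper: returns an error message, or nil when the options
# are usable. The caller decides whether to abort.
def switch_monitor_option_error(options)
  # Mirrors the check in the diff: the two key options are mutually exclusive.
  if options.key && options.key_path
    return "Specify --key-path or --key, don't specify both"
  end

  # Assumption, not in the diff: with both alert targets left at nil, a
  # triggered switch would alert nobody, so flag that up front.
  if options.alert_to_slack.nil? && options.alert_to_sns.nil?
    return 'Specify --alert-to-slack or --alert-to-sns'
  end

  nil
end

error = switch_monitor_option_error(Options.new(key: 'deadman/myservice'))
puts error if error
```

Returning the message instead of calling `abort` directly keeps the command action thin and lets the rule be exercised in a unit test.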
data/deadman_check.gemspec
CHANGED
@@ -36,7 +36,8 @@ Gem::Specification.new do |spec|
 spec.add_development_dependency "webmock", "~> 3.0"
 
 spec.add_dependency 'commander', '~> 4.4', '>= 4.4.3'
-spec.add_dependency 'diplomat', '~>
+spec.add_dependency 'diplomat', '~> 2.0.0', '>= 2.0.0'
 spec.add_dependency 'slack-ruby-client', '~> 0.8.0'
 spec.add_dependency 'daemons', '~> 1.2.4', '>=1.2.4'
+spec.add_dependency 'aws-sdk', '~> 2.10.21', '>=2.10.21'
 end
data/lib/deadman_check_switch.rb
CHANGED
@@ -3,23 +3,36 @@ require 'deadman_check_global'
 require 'diplomat'
 require 'slack-ruby-client'
 require 'json'
+require 'aws-sdk'
 
 module DeadmanCheck
 # Switch class
 class SwitchMonitor
-attr_accessor :host, :port, :target, :
+attr_accessor :host, :port, :target, :alert_to_slack,
+:alert_to_sns, :alert_to_sns_region, :recurse, :daemon_sleep
 
-def initialize(host, port, target,
+def initialize(host, port, target, alert_to_slack, alert_to_sns,
+alert_to_sns_region, recurse, daemon_sleep)
 @host = host
 @port = port
 @target = target
-@
+@alert_to_slack = alert_to_slack
+@alert_to_sns = alert_to_sns
+@alert_to_sns_region = alert_to_sns_region
 @recurse = recurse
 @daemon_sleep = daemon_sleep.to_i
-end
 
-
-
+unless @alert_to_slack.nil?
+Slack.configure do |config|
+config.token = ENV['SLACK_API_TOKEN']
+end
+end
+
+unless @alert_to_sns.nil?
+@sns = Aws::SNS::Client.new(
+region: @alert_to_sns_region
+)
+end
 end
 
 def run_check_once
@@ -52,20 +65,26 @@ module DeadmanCheck
 return recorded_epochs
 end
 
+def check_recursive_recorded_epochs(recorded_epochs, current_epoch)
+recorded_epochs.each do |recorded_service|
+value_json = JSON.parse(recorded_service[:value])
+frequency = value_json["frequency"].to_i
+epoch = value_json["epoch"].to_i
+epoch_diff = diff_epoch(current_epoch, epoch)
+alert_if_epoch_greater_than_frequency(epoch_diff,
+recorded_service[:key],
+frequency)
+end
+end
+
 def parse_recorded_epoch(recorded_epochs)
 # {"epoch":1493000501,"frequency":"300"}
-value_json = JSON.parse(recorded_epochs)
-frequency = value_json["frequency"]
-epoch = value_json["epoch"]
+value_json = JSON.parse(recorded_epochs[0][:value])
+frequency = value_json["frequency"]
+epoch = value_json["epoch"]
 return epoch, frequency
 end
 
-def alert_if_epoch_greater_than_frequency(epoch_diff, target, frequency)
-if epoch_diff > frequency
-slack_alert(@alert_to, target, epoch_diff)
-end
-end
-
 def check_recorded_epoch(parse_recorded_epoch, current_epoch)
 recorded_epoch = parse_recorded_epoch[0].to_i
 frequency = parse_recorded_epoch[1].to_i
@@ -73,23 +92,34 @@ module DeadmanCheck
 alert_if_epoch_greater_than_frequency(epoch_diff, @target, frequency)
 end
 
-def
-
-
-
-
-
-alert_if_epoch_greater_than_frequency(epoch_diff,
-recorded_service[:key],
-frequency)
+def alert_if_epoch_greater_than_frequency(epoch_diff, target, frequency)
+if epoch_diff > frequency
+slack_alert(
+@alert_to_slack, target, epoch_diff) unless @alert_to_slack.nil?
+sns_alert(
+@alert_to_sns, target, epoch_diff) unless @alert_to_sns.nil?
 end
 end
 
-def slack_alert(
+def slack_alert(alert_to_slack, target, epoch_diff)
 client = Slack::Web::Client.new
-client.chat_postMessage(channel: "\##{
+client.chat_postMessage(channel: "\##{alert_to_slack}",
+text: "Alert: Deadman Switch
 Triggered for #{target}, with #{epoch_diff} seconds since last run",
 username: 'deadman')
 end
+
+def sns_alert(alert_to_sns, target, epoch_diff)
+@sns.publish(
+target_arn: @alert_to_sns,
+message_structure: 'json',
+message: {
+:default => "Alert: Deadman Switch triggered for #{target}",
+:email => "Alert: Deadman Switch triggered for #{target}, with
+#{epoch_diff} seconds since last run",
+:sms => "Alert: Deadman Switch for #{target}"
+}.to_json
+)
+end
 end
 end
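The `lib/deadman_check_switch.rb` changes above parse each recorded value, compare the elapsed time against the service's frequency, and publish a per-protocol JSON message to SNS. The sketch below isolates those two steps with the standard library only, so it runs without Consul or AWS. The names `stale?` and `sns_message` are hypothetical. The body of `diff_epoch` is not shown in this diff, so plain subtraction of the recorded epoch from the current one is an assumption. The email text is also kept on one line here, whereas the string in the diff spans two.

```ruby
require 'json'

# Hypothetical helper: true when the recorded check-in is older than the
# frequency stored alongside it. Assumes the elapsed time is
# current_epoch - recorded epoch (diff_epoch's body is not in the diff).
def stale?(value, current_epoch)
  record = JSON.parse(value)
  elapsed = current_epoch - record['epoch'].to_i
  elapsed > record['frequency'].to_i
end

# Hypothetical helper: build the message body for an SNS publish call that
# uses message_structure: 'json', with one entry per delivery protocol.
def sns_message(target, epoch_diff)
  {
    default: "Alert: Deadman Switch triggered for #{target}",
    email: "Alert: Deadman Switch triggered for #{target}, " \
           "with #{epoch_diff} seconds since last run",
    sms: "Alert: Deadman Switch for #{target}"
  }.to_json
end

value = '{"epoch":1493000501,"frequency":"300"}'
now = 1493000501 + 301
puts sns_message('deadman/myservice', 301) if stale?(value, now)
```

The comparison is strict, as in the diff: a check-in exactly `frequency` seconds old does not yet trigger an alert.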
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: deadman_check
 version: !ruby/object:Gem::Version
-version: 0.
+version: 0.3.0
 platform: ruby
 authors:
 - zane
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-
+date: 2017-08-04 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
 name: bundler
@@ -92,20 +92,20 @@ dependencies:
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version:
+version: 2.0.0
 - - ">="
 - !ruby/object:Gem::Version
-version:
+version: 2.0.0
 type: :runtime
 prerelease: false
 version_requirements: !ruby/object:Gem::Requirement
 requirements:
 - - "~>"
 - !ruby/object:Gem::Version
-version:
+version: 2.0.0
 - - ">="
 - !ruby/object:Gem::Version
-version:
+version: 2.0.0
 - !ruby/object:Gem::Dependency
 name: slack-ruby-client
 requirement: !ruby/object:Gem::Requirement
@@ -140,6 +140,26 @@ dependencies:
 - - ">="
 - !ruby/object:Gem::Version
 version: 1.2.4
+- !ruby/object:Gem::Dependency
+name: aws-sdk
+requirement: !ruby/object:Gem::Requirement
+requirements:
+- - "~>"
+- !ruby/object:Gem::Version
+version: 2.10.21
+- - ">="
+- !ruby/object:Gem::Version
+version: 2.10.21
+type: :runtime
+prerelease: false
+version_requirements: !ruby/object:Gem::Requirement
+requirements:
+- - "~>"
+- !ruby/object:Gem::Version
+version: 2.10.21
+- - ">="
+- !ruby/object:Gem::Version
+version: 2.10.21
 description: |-
 A script to check a given Consul key EPOCH for
 freshness. Good for monitoring cron jobs or batch jobs. Have the last step