bosh-monitor 1.5.0.pre.1113

Files changed (37)
  1. data/README +80 -0
  2. data/bin/bosh-monitor +30 -0
  3. data/bin/bosh-monitor-console +51 -0
  4. data/bin/listener +58 -0
  5. data/lib/bosh/monitor.rb +72 -0
  6. data/lib/bosh/monitor/agent.rb +51 -0
  7. data/lib/bosh/monitor/agent_manager.rb +295 -0
  8. data/lib/bosh/monitor/api_controller.rb +18 -0
  9. data/lib/bosh/monitor/config.rb +71 -0
  10. data/lib/bosh/monitor/core_ext.rb +8 -0
  11. data/lib/bosh/monitor/director.rb +76 -0
  12. data/lib/bosh/monitor/director_monitor.rb +33 -0
  13. data/lib/bosh/monitor/errors.rb +19 -0
  14. data/lib/bosh/monitor/event_processor.rb +109 -0
  15. data/lib/bosh/monitor/events/alert.rb +92 -0
  16. data/lib/bosh/monitor/events/base.rb +70 -0
  17. data/lib/bosh/monitor/events/heartbeat.rb +139 -0
  18. data/lib/bosh/monitor/metric.rb +16 -0
  19. data/lib/bosh/monitor/plugins/base.rb +27 -0
  20. data/lib/bosh/monitor/plugins/cloud_watch.rb +56 -0
  21. data/lib/bosh/monitor/plugins/datadog.rb +78 -0
  22. data/lib/bosh/monitor/plugins/dummy.rb +20 -0
  23. data/lib/bosh/monitor/plugins/email.rb +135 -0
  24. data/lib/bosh/monitor/plugins/http_request_helper.rb +25 -0
  25. data/lib/bosh/monitor/plugins/logger.rb +13 -0
  26. data/lib/bosh/monitor/plugins/nats.rb +43 -0
  27. data/lib/bosh/monitor/plugins/pagerduty.rb +48 -0
  28. data/lib/bosh/monitor/plugins/paging_datadog_client.rb +24 -0
  29. data/lib/bosh/monitor/plugins/resurrector.rb +82 -0
  30. data/lib/bosh/monitor/plugins/resurrector_helper.rb +84 -0
  31. data/lib/bosh/monitor/plugins/tsdb.rb +43 -0
  32. data/lib/bosh/monitor/plugins/varz.rb +17 -0
  33. data/lib/bosh/monitor/protocols/tsdb.rb +68 -0
  34. data/lib/bosh/monitor/runner.rb +162 -0
  35. data/lib/bosh/monitor/version.rb +5 -0
  36. data/lib/bosh/monitor/yaml_helper.rb +18 -0
  37. metadata +246 -0
data/README ADDED
@@ -0,0 +1,80 @@
1
+ h4. Synopsis
2
+
3
+ BOSH Health Monitor (BHM) is a component that monitors the health of one or more BOSH deployments. It processes heartbeats and alerts from BOSH agents and notifies interested parties if something goes wrong.
4
+
5
+ h4. Heartbeats
6
+
7
+ Each agent sends periodic heartbeats to HM. Heartbeats are sent over the message bus and have the following format:
8
+
9
+ | *Subject* | hm.agent.heartbeat.<agent_id> |
10
+ | *Payload* | none |
11
+
12
+ h6. Heartbeat processing
13
+
14
+ # If the agent is known to HM, its last heartbeat timestamp is updated. No analysis is attempted at this point; the agent analysis routine runs asynchronously to heartbeat processing.
15
+ # If the agent is unknown, it is registered with HM with a warning flag set (we call these rogue agents). The next director poll may add this agent to the list of managed agents and clear the flag. We might generate an alert if the flag hasn't been cleared after some (configurable) time.
16
+
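The two heartbeat rules above can be sketched roughly as follows. This is a minimal illustration, not the gem's code: @HeartbeatTracker@, @AgentRecord@ and @mark_managed@ are hypothetical names introduced only for this example.

```ruby
# Hypothetical sketch of the heartbeat rules; not the gem's AgentManager.
class HeartbeatTracker
  # One record per agent id; rogue is the "warning flag" from the README.
  AgentRecord = Struct.new(:id, :last_heartbeat_at, :rogue)

  attr_reader :agents

  def initialize
    @agents = {} # agent id => AgentRecord
  end

  # Known agent: only refresh the timestamp (analysis runs separately).
  # Unknown agent: register it with the rogue flag set.
  def on_heartbeat(agent_id, now = Time.now)
    record = @agents[agent_id]
    if record
      record.last_heartbeat_at = now
    else
      record = AgentRecord.new(agent_id, now, true)
      @agents[agent_id] = record
    end
    record
  end

  # Called when a director poll lists the agent as managed.
  def mark_managed(agent_id)
    record = @agents[agent_id]
    record.rogue = false if record
    record
  end
end

tracker = HeartbeatTracker.new
tracker.on_heartbeat("agent-1") # first heartbeat: registered as rogue
tracker.mark_managed("agent-1") # director poll clears the flag
tracker.on_heartbeat("agent-1") # later heartbeats only refresh the timestamp
```

Keeping the flag on the record, rather than in a separate list, means the analysis pass can decide later whether a long-lived rogue agent deserves an alert.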
17
+ h4. Agents discovery
18
+
19
+ HM polls the director periodically to get the list of managed VMs:
20
+
21
+ | *Endpoint* | GET /deployments/<deployment_name>/vms |
22
+ | *Response* | JSON including agent ids, job names and indices for all managed VMs |
23
+
24
+ When a new agent is discovered, it is registered and added to a managed deployment. HM does not actively contact or query the agent; it relies only on heartbeats and agent alerts.
25
+
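As a rough sketch of the discovery step, the director response can be reconciled against the agents HM already tracks with two set differences. The response body below is made up for illustration; the README only says it contains agent ids, job names and indices. The @agent_id@, @job@ and @index@ keys mirror those read in @agent_manager.rb@, while @reconcile@ is a hypothetical helper.

```ruby
require "json"
require "set"

# Made-up response body, for illustration only.
response_body = <<~JSON
  [
    {"agent_id": "agent-1", "job": "web", "index": 0},
    {"agent_id": "agent-2", "job": "db",  "index": 0}
  ]
JSON

# Compare the agent ids HM already manages with the ids the director reports.
def reconcile(managed_ids, vms)
  active_ids = vms.map { |vm| vm["agent_id"] }.compact.to_set
  { added: active_ids - managed_ids, removed: managed_ids - active_ids }
end

managed = Set.new(["agent-1", "agent-9"])
changes = reconcile(managed, JSON.parse(response_body))
# changes[:added] holds agents to register, changes[:removed] agents to drop.
```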
26
+ h4. Agents analysis
27
+
28
+ This is a periodic operation that goes through all known agents. It first walks all managed deployments, then analyzes rogue agents. The following procedure is used:
29
+
30
+ # If an agent has missed more than N heartbeats, an "Agent Missing" alert is generated.
31
+
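One way to read the "missed more than N heartbeats" rule is sketched below. The interval and N values are assumptions, since the README leaves both unspecified. Note that @agent.rb@ in this same diff uses a single configured @agent_timeout@ rather than counting heartbeats.

```ruby
# Assumed values; the README does not specify either of them.
HEARTBEAT_INTERVAL = 60 # seconds between expected heartbeats (assumption)
MAX_MISSED = 3          # the README's "N" (assumption)

# Number of whole heartbeat intervals elapsed since the last heartbeat.
def missed_heartbeats(last_heartbeat_at, now)
  ((now - last_heartbeat_at) / HEARTBEAT_INTERVAL).floor
end

# True when the agent has missed more than N heartbeats.
def agent_missing?(last_heartbeat_at, now = Time.now)
  missed_heartbeats(last_heartbeat_at, now) > MAX_MISSED
end
```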
32
+ h4. Alerts
33
+
34
+ An alert is the concept HM uses to flag and deliver information about important events. It includes the following data:
35
+
36
+ # Id
37
+ # Severity
38
+ # Source (usually deployment/job/index tuple)
39
+ # Timestamp
40
+ # Description
41
+ # Long description (optional)
42
+ # Tags (optional)
43
+
44
+ h6. Alert Processor
45
+
46
+ Alert Processor is a module that registers incoming alerts and routes them to interested parties via the appropriate delivery agent. It should conform to the following interface:
47
+
48
+ | *Method* | *Arguments* | *Description* |
49
+ | *register_alert* | alert (object responding to :id, :severity, :timestamp, :description, :long_description, :source and :tags) | Registers an alert and invokes a delivery agent. A delivery agent may or may not deliver the alert immediately, depending on the implementation, so Alert Processor shouldn't make any assumptions about delivery (e.g. an agent might queue up several alerts and send them asynchronously). |
50
+ | *add_delivery_agent* | delivery_agent, options | Adds a delivery agent to a processor |
51
+
52
+ An alert id can be an arbitrary string; however, Alert Processor might use it to keep track of registered alerts so that it doesn't process the same alert twice. This way other HM modules can blindly register any incoming alerts and leave the dedup step to the Alert Processor.
53
+
54
+ Alerts are only persisted in HM memory (at least in the initial version), so losing HM means losing any undelivered alerts that might have been queued by a delivery agent or the Alert Processor.
55
+
56
+ If the Alert Processor has more than one delivery agent associated with it, it notifies all of them in order (e.g. we may want to notify both Zabbix and Pager Duty).
57
+
58
+ h6. Delivery Agent
59
+
60
+ Delivery Agent is a module that handles one alert delivery mechanism (such as sending an email, raising a Pager Duty alert, writing to a journal or even silently discarding the alert). It should conform to the following interface:
61
+
62
+ | *Method* | *Arguments* | *Description* |
63
+ | *deliver* | alert | Delivers alert or queues it for delivery. |
64
+
65
+ The initial implementation will have email and Pager Duty delivery agents.
66
+
67
+ Alert Processor is not pluggable; it is just one of the HM classes. Delivery agents are pluggable, but they are generally not changed at runtime; they are initialized from the HM configuration file on HM startup.
68
+
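The Alert Processor and Delivery Agent interfaces described above could look roughly like this. It is a sketch under assumptions, not the shipped implementation: the @Alert@ struct and @JournalDeliveryAgent@ are hypothetical, and the @options@ argument is accepted but ignored because the README does not say what it carries.

```ruby
require "set"

# Hypothetical alert value object with the fields listed in the README.
Alert = Struct.new(:id, :severity, :source, :timestamp,
                   :description, :long_description, :tags)

class AlertProcessor
  def initialize
    @delivery_agents = []
    @seen_ids = Set.new # alert ids already registered (dedup step)
  end

  # options is ignored in this sketch; its contents are not specified.
  def add_delivery_agent(delivery_agent, options = {})
    @delivery_agents << delivery_agent
  end

  # Returns true if the alert was new and was handed to every delivery
  # agent in registration order, false if the id had been seen before.
  def register_alert(alert)
    return false unless @seen_ids.add?(alert.id)
    @delivery_agents.each { |agent| agent.deliver(alert) }
    true
  end
end

# Hypothetical delivery agent that writes to an in-memory journal.
class JournalDeliveryAgent
  attr_reader :entries

  def initialize
    @entries = []
  end

  def deliver(alert)
    @entries << "[#{alert.severity}] #{alert.source}: #{alert.description}"
  end
end
```

Because dedup lives in @register_alert@, callers can register every incoming alert without checking whether it has been seen before.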
69
+ h4. Alerts from agent
70
+
71
+ HM subscribes to agent alerts on a message bus:
72
+
73
+ | *Subject* | hm.agent.alert.<agent_id> |
74
+ | *Payload* | JSON containing the following keys: id, service, event, action, description, timestamp, tags |
75
+
76
+ BOSH Agent is responsible for mapping any underlying supervisor alert format to the expected JSON payload and sending it to HM.
77
+
78
+ HM is responsible for interpreting the JSON payload, mapping it to a sequence of HM actions, and possibly creating an HM alert compatible with the Alert Processor module. HM never dedups incoming alerts outside of Alert Processor (this adds some overhead to the incoming alert parser, but it should be acceptable). Malformed payloads are ignored.
79
+
80
+ Job name and index are not included in the incoming agent alert; they are looked up in the director. If a heartbeat came from a rogue agent and we have no job name and/or index, we note that fact in the alert description but otherwise don't worry about it (service name and agent id should be enough). We might consider including the agent IP address in the heartbeat so we can track down rogue agents.
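A minimal sketch of the "malformed payloads are ignored" rule for agent alerts follows. It uses the standard @json@ library for self-containment, whereas the gem itself parses with Yajl. @parse_agent_alert@ is a hypothetical helper, and treating a missing @id@ as malformed is an assumption of this example.

```ruby
require "json"

# Returns the parsed alert hash tagged with the agent id taken from the
# message subject, or nil when the payload is malformed.
def parse_agent_alert(subject, payload)
  # "hm.agent.alert.<agent_id>" => "<agent_id>"
  agent_id = subject.split(".", 4).last
  data = JSON.parse(payload)
  # Assumption: a payload that is not an object, or lacks "id", is malformed.
  return nil unless data.is_a?(Hash) && data.key?("id")
  data.merge("agent_id" => agent_id)
rescue JSON::ParserError
  nil # malformed JSON is ignored rather than raised
end

good = parse_agent_alert("hm.agent.alert.agent-1",
                         '{"id": "x1", "service": "nginx", "event": "pid failed"}')
bad = parse_agent_alert("hm.agent.alert.agent-1", "{oops")
```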
data/bin/bosh-monitor ADDED
@@ -0,0 +1,30 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bosh/monitor"
4
+ require "optparse"
5
+
6
+ config_file = nil
7
+
8
+ opts = OptionParser.new do |parser|
9
+ parser.on("-c", "--config FILE", "configuration file") do |file|
10
+ config_file = file
11
+ end
12
+ end
13
+
14
+ opts.parse!(ARGV.dup)
15
+
16
+ if config_file.nil?
17
+ puts opts
18
+ exit 1
19
+ end
20
+
21
+ runner = Bosh::Monitor::Runner.new(config_file)
22
+
23
+ Signal.trap("INT") do
24
+ runner.stop
25
+ exit(1)
26
+ end
27
+
28
+ runner.run
29
+
30
+
data/bin/bosh-monitor-console ADDED
@@ -0,0 +1,51 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require 'bosh/monitor'
4
+ require 'irb'
5
+ require 'irb/completion'
6
+
7
+ module Bosh
8
+ module Monitor
9
+
10
+ class Console
11
+ include YamlHelper
12
+
13
+ def self.start(context)
14
+ new.start(context)
15
+ end
16
+
17
+ def start(context)
18
+ config_file = nil
19
+
20
+ opts = OptionParser.new do |opt|
21
+ opt.on("-c", "--config [ARG]", "configuration file") { |c| config_file = c }
22
+ end
23
+
24
+ opts.parse!(ARGV)
25
+
26
+ if config_file.nil?
27
+ puts opts
28
+ exit 1
29
+ end
30
+
31
+ puts "=> Loading #{config_file}"
32
+ Bhm.config = load_yaml_file(config_file)
33
+
34
+ begin
35
+ require 'ruby-debug'
36
+ puts "=> Debugger enabled"
37
+ rescue LoadError
38
+ puts "=> ruby-debug not found, debugger disabled"
39
+ end
40
+
41
+ puts "=> Welcome to BOSH Health Monitor console"
42
+
43
+ IRB.start
44
+ end
45
+
46
+ end
47
+ end
48
+ end
49
+
50
+ Bhm::Console.start(self)
51
+
data/bin/listener ADDED
@@ -0,0 +1,58 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "eventmachine"
4
+ require "nats/client"
5
+ require "optparse"
6
+ class Listener
7
+
8
+ def self.start
9
+ new.start
10
+ end
11
+
12
+ def start
13
+ filter = nil
14
+ nats_uri = nil
15
+ nats_subject = nil
16
+
17
+ opts = OptionParser.new do |opt|
18
+ opt.on("-f", "--filter ARG") { |f| filter = f }
19
+ opt.on("-n", "--nats URI") { |n| nats_uri = n }
20
+ opt.on("-s", "--subject ARG") { |s| nats_subject = s }
21
+ end
22
+
23
+ opts.parse!(ARGV)
24
+
25
+ if nats_uri.nil?
26
+ abort "Usage: listener -n <nats_uri> [-s <subject>] [-f <filter>]"
27
+ end
28
+
29
+ nats_client_options = {
30
+ :uri => nats_uri,
31
+ :autostart => false
32
+ }
33
+
34
+ @nats = NATS.connect(nats_client_options)
35
+
36
+ if nats_subject
37
+ puts "> NATS subject is set to `#{nats_subject}'"
38
+ else
39
+ nats_subject = "bosh.hm.events"
40
+ end
41
+
42
+ if filter
43
+ puts "> Filter is set to `#{filter}'"
44
+ end
45
+
46
+ puts "> Subscribing to events"
47
+ @nats.subscribe(nats_subject) do |msg|
48
+ if filter.nil? || msg =~ Regexp.new(Regexp.quote(filter))
49
+ puts "#{Time.now.strftime("%Y-%m-%d %H:%M:%S")} >> " + msg
50
+ end
51
+ end
52
+ end
53
+ end
54
+
55
+ EM.run do
56
+ Listener.start
57
+ end
58
+
data/lib/bosh/monitor.rb ADDED
@@ -0,0 +1,72 @@
1
+ module Bosh
2
+ module Monitor
3
+ end
4
+ end
5
+
6
+ Bhm = Bosh::Monitor
7
+
8
+ begin
9
+ require 'fiber'
10
+ rescue LoadError
11
+ unless defined? Fiber
12
+ $stderr.puts 'FATAL: HealthMonitor requires Ruby implementation that supports fibers'
13
+ exit 1
14
+ end
15
+ end
16
+
17
+ require 'ostruct'
18
+ require 'set'
19
+
20
+ require 'em-http-request'
21
+ require 'eventmachine'
22
+ require 'logging'
23
+ require 'nats/client'
24
+ require 'sinatra'
25
+ require 'thin'
26
+ require 'securerandom'
27
+ require 'yajl'
28
+
29
+ # Helpers
30
+ require 'bosh/monitor/yaml_helper'
31
+
32
+ # Basic blocks
33
+ require 'bosh/monitor/agent'
34
+ require 'bosh/monitor/config'
35
+ require 'bosh/monitor/core_ext'
36
+ require 'bosh/monitor/director'
37
+ require 'bosh/monitor/director_monitor'
38
+ require 'bosh/monitor/errors'
39
+ require 'bosh/monitor/metric'
40
+ require 'bosh/monitor/runner'
41
+ require 'bosh/monitor/version'
42
+
43
+ # Processing
44
+ require 'bosh/monitor/agent_manager'
45
+ require 'bosh/monitor/event_processor'
46
+
47
+ # HTTP endpoints
48
+ require 'bosh/monitor/api_controller'
49
+
50
+ # Protocols
51
+ require 'bosh/monitor/protocols/tsdb'
52
+
53
+ # Events
54
+ require 'bosh/monitor/events/base'
55
+ require 'bosh/monitor/events/alert'
56
+ require 'bosh/monitor/events/heartbeat'
57
+
58
+ # Plugins
59
+ require 'bosh/monitor/plugins/base'
60
+ require 'bosh/monitor/plugins/dummy'
61
+ require 'bosh/monitor/plugins/http_request_helper'
62
+ require 'bosh/monitor/plugins/resurrector_helper'
63
+ require 'bosh/monitor/plugins/cloud_watch'
64
+ require 'bosh/monitor/plugins/datadog'
65
+ require 'bosh/monitor/plugins/paging_datadog_client'
66
+ require 'bosh/monitor/plugins/email'
67
+ require 'bosh/monitor/plugins/logger'
68
+ require 'bosh/monitor/plugins/nats'
69
+ require 'bosh/monitor/plugins/pagerduty'
70
+ require 'bosh/monitor/plugins/resurrector'
71
+ require 'bosh/monitor/plugins/tsdb'
72
+ require 'bosh/monitor/plugins/varz'
data/lib/bosh/monitor/agent.rb ADDED
@@ -0,0 +1,51 @@
1
+ module Bosh::Monitor
2
+ class Agent
3
+
4
+ attr_reader :id
5
+ attr_reader :discovered_at
6
+ attr_accessor :updated_at
7
+
8
+ ATTRIBUTES = [ :deployment, :job, :index, :cid ]
9
+
10
+ ATTRIBUTES.each do |attribute|
11
+ attr_accessor attribute
12
+ end
13
+
14
+ def initialize(id, opts={})
15
+ raise ArgumentError, "Agent must have an id" if id.nil?
16
+
17
+ @id = id
18
+ @discovered_at = Time.now
19
+ @updated_at = Time.now
20
+ @logger = Bhm.logger
21
+ @intervals = Bhm.intervals
22
+
23
+ @deployment = opts[:deployment]
24
+ @job = opts[:job]
25
+ @index = opts[:index]
26
+ @cid = opts[:cid]
27
+ end
28
+
29
+ def name
30
+ if @deployment && @job && @index
31
+ "#{@deployment}: #{@job}(#{@index}) [id=#{@id}, cid=#{@cid}]"
32
+ else
33
+ state = ATTRIBUTES.inject([]) do |acc, attribute|
34
+ value = send(attribute)
35
+ acc << "#{attribute}=#{value}" if value
36
+ acc
37
+ end
38
+
39
+ "agent #{@id} [#{state.join(", ")}]"
40
+ end
41
+ end
42
+
43
+ def timed_out?
44
+ (Time.now - @updated_at) > @intervals.agent_timeout
45
+ end
46
+
47
+ def rogue?
48
+ (Time.now - @discovered_at) > @intervals.rogue_agent_alert && @deployment.nil?
49
+ end
50
+ end
51
+ end
data/lib/bosh/monitor/agent_manager.rb ADDED
@@ -0,0 +1,295 @@
1
+ module Bosh::Monitor
2
+
3
+ class AgentManager
4
+ attr_reader :heartbeats_received
5
+ attr_reader :alerts_received
6
+ attr_reader :alerts_processed
7
+
8
+ attr_accessor :processor
9
+
10
+ def initialize(event_processor)
11
+ # hash of agent id to agent structure (see add_agent())
12
+ @agents = { }
13
+
14
+ # hash of deployment name to set of agent ids
15
+ @deployments = { }
16
+
17
+ @logger = Bhm.logger
18
+ @heartbeats_received = 0
19
+ @alerts_received = 0
20
+ @alerts_processed = 0
21
+
22
+ @processor = event_processor
23
+ end
24
+
25
+ # Get a hash of agent id -> agent object for all agents associated with the deployment
26
+ def get_agents_for_deployment(deployment_name)
27
+ agent_ids = @deployments[deployment_name] || Set.new # unknown deployment: no agents
28
+ @agents.select { |key, value| agent_ids.include?(key) }
29
+ end
30
+
31
+ def lookup_plugin(name, options = {})
32
+ plugin_class = nil
33
+ begin
34
+ class_name = name.to_s.split("_").map(&:capitalize).join
35
+ plugin_class = Bosh::Monitor::Plugins.const_get(class_name)
36
+ rescue NameError => e
37
+ raise PluginError, "Cannot find `#{name}' plugin"
38
+ end
39
+
40
+ plugin_class.new(options)
41
+ end
42
+
43
+ def setup_events
44
+ Bhm.set_varz("heartbeats_received", 0)
45
+
46
+ @processor.enable_pruning(Bhm.intervals.prune_events)
47
+ Bhm.plugins.each do |plugin|
48
+ @processor.add_plugin(lookup_plugin(plugin["name"], plugin["options"]), plugin["events"])
49
+ end
50
+
51
+ Bhm.nats.subscribe("hm.agent.heartbeat.*") do |message, reply, subject|
52
+ process_event(:heartbeat, subject, message)
53
+ end
54
+
55
+ Bhm.nats.subscribe("hm.agent.alert.*") do |message, reply, subject|
56
+ process_event(:alert, subject, message)
57
+ end
58
+
59
+ Bhm.nats.subscribe("hm.agent.shutdown.*") do |message, reply, subject|
60
+ process_event(:shutdown, subject, message)
61
+ end
62
+ end
63
+
64
+ def agents_count
65
+ @agents.size
66
+ end
67
+
68
+ def deployments_count
69
+ @deployments.size
70
+ end
71
+
72
+ # Syncs deployments list received from director
73
+ # with HM deployments.
74
+ # @param deployments Array list of deployments returned by director
75
+ def sync_deployments(deployments)
76
+ managed = Set.new(deployments.map { |d| d["name"] })
77
+ all = Set.new(@deployments.keys)
78
+
79
+ (all - managed).each do |stale_deployment|
80
+ @logger.warn("Found stale deployment #{stale_deployment}, removing...")
81
+ remove_deployment(stale_deployment)
82
+ end
83
+ end
84
+
85
+ def sync_agents(deployment, vms)
86
+ managed_agent_ids = @deployments[deployment] || Set.new
87
+ active_agent_ids = Set.new
88
+
89
+ vms.each do |vm|
90
+ if add_agent(deployment, vm)
91
+ active_agent_ids << vm["agent_id"]
92
+ end
93
+ end
94
+
95
+ (managed_agent_ids - active_agent_ids).each do |agent_id|
96
+ remove_agent(agent_id)
97
+ end
98
+ end
99
+
100
+ def remove_deployment(name)
101
+ agent_ids = @deployments[name]
102
+
103
+ agent_ids.to_a.each do |agent_id|
104
+ @agents.delete(agent_id)
105
+ end
106
+
107
+ @deployments.delete(name)
108
+ end
109
+
110
+ def remove_agent(agent_id)
111
+ @agents.delete(agent_id)
112
+ @deployments.each_pair do |deployment, agents|
113
+ agents.delete(agent_id)
114
+ end
115
+ end
116
+
117
+ # Processes VM data from BOSH Director,
118
+ # extracts relevant agent data, wraps it into Agent object
119
+ # and adds it to a list of managed agents.
120
+ def add_agent(deployment_name, vm_data)
121
+ unless vm_data.kind_of?(Hash)
122
+ @logger.error("Invalid format for VM data: expected Hash, got #{vm_data.class}: #{vm_data}")
123
+ return false
124
+ end
125
+
126
+ agent_id = vm_data["agent_id"]
127
+ agent_cid = vm_data["cid"]
128
+
129
+ if agent_id.nil?
130
+ @logger.warn("No agent id for VM: #{vm_data}")
131
+ return false
132
+ end
133
+
134
+ # Idle VMs, we don't care about them, but we still want to track them
135
+ if vm_data["job"].nil?
136
+ @logger.debug("VM with no job found: #{agent_id}")
137
+ end
138
+
139
+ agent = @agents[agent_id]
140
+
141
+ if agent.nil?
142
+ @logger.debug("Discovered agent #{agent_id}")
143
+ agent = Agent.new(agent_id)
144
+ @agents[agent_id] = agent
145
+ end
146
+
147
+ agent.deployment = deployment_name
148
+ agent.job = vm_data["job"]
149
+ agent.index = vm_data["index"]
150
+ agent.cid = vm_data["cid"]
151
+
152
+ @deployments[deployment_name] ||= Set.new
153
+ @deployments[deployment_name] << agent_id
154
+ true
155
+ end
156
+
157
+ def analyze_agents
158
+ @logger.info "Analyzing agents..."
159
+ started = Time.now
160
+
161
+ processed = Set.new
162
+ count = 0
163
+
164
+ # Agents from managed deployments
165
+ @deployments.each_pair do |deployment_name, agent_ids|
166
+ agent_ids.each do |agent_id|
167
+ analyze_agent(agent_id)
168
+ processed << agent_id
169
+ count += 1
170
+ end
171
+ end
172
+
173
+ # Rogue agents (hey there Solid Snake)
174
+ (@agents.keys.to_set - processed).each do |agent_id|
175
+ @logger.warn("Agent #{agent_id} is not a part of any deployment")
176
+ analyze_agent(agent_id)
177
+ count += 1
178
+ end
179
+
180
+ @logger.info("Analyzed %s, took %s seconds" % [ pluralize(count, "agent"), Time.now - started ])
181
+ count
182
+ end
183
+
184
+ def analyze_agent(agent_id)
185
+ agent = @agents[agent_id]
186
+ ts = Time.now.to_i
187
+
188
+ if agent.nil?
189
+ @logger.error("Can't analyze agent #{agent_id} as it is missing from agents index, skipping...")
190
+ return false
191
+ end
192
+
193
+ if agent.timed_out? && agent.rogue?
194
+ # Agent has timed out but it was never
195
+ # actually a proper member of the deployment,
196
+ # so we don't really care about it
197
+ remove_agent(agent.id)
198
+ return
199
+ end
200
+
201
+ if agent.timed_out?
202
+ @processor.process(:alert,
203
+ severity: 2,
204
+ source: agent.name,
205
+ title: "#{agent.id} has timed out",
206
+ created_at: ts,
207
+ deployment: agent.deployment,
208
+ job: agent.job,
209
+ index: agent.index)
210
+ end
211
+
212
+ if agent.rogue?
213
+ @processor.process(:alert,
214
+ :severity => 2,
215
+ :source => agent.name,
216
+ :title => "#{agent.id} is not a part of any deployment",
217
+ :created_at => ts)
218
+ end
219
+
220
+ true
221
+ end
222
+
223
+ def process_event(kind, subject, payload = {})
224
+ kind = kind.to_s
225
+ agent_id = subject.split('.', 4).last
226
+ agent = @agents[agent_id]
227
+
228
+ if agent.nil?
229
+ # There might be more than a single shutdown event,
230
+ # we are only interested in processing it if agent
231
+ # is still managed
232
+ return if kind == "shutdown"
233
+
234
+ @logger.warn("Received #{kind} from unmanaged agent: #{agent_id}")
235
+ agent = Agent.new(agent_id)
236
+ @agents[agent_id] = agent
237
+ else
238
+ @logger.debug("Received #{kind} from #{agent_id}: #{payload}")
239
+ end
240
+
241
+ case payload
242
+ when String
243
+ message = Yajl::Parser.parse(payload)
244
+ when Hash
245
+ message = payload
246
+ end
247
+
248
+ case kind.to_s
249
+ when "alert"
250
+ on_alert(agent, message)
251
+ when "heartbeat"
252
+ on_heartbeat(agent, message)
253
+ when "shutdown"
254
+ on_shutdown(agent, message)
255
+ else
256
+ @logger.warn("No handler found for `#{kind}' event")
257
+ end
258
+
259
+ rescue Yajl::ParseError => e
260
+ @logger.error("Cannot parse incoming event: #{e}")
261
+ rescue Bhm::InvalidEvent => e
262
+ @logger.error("Invalid event: #{e}")
263
+ end
264
+
265
+ def on_alert(agent, message)
266
+ if message.is_a?(Hash) && !message.has_key?("source")
267
+ message["source"] = agent.name
268
+ end
269
+
270
+ @processor.process(:alert, message)
271
+ @alerts_processed += 1
272
+ Bhm.set_varz("alerts_processed", @alerts_processed)
273
+ end
274
+
275
+ def on_heartbeat(agent, message)
276
+ agent.updated_at = Time.now
277
+
278
+ if message.is_a?(Hash)
279
+ message["timestamp"] = Time.now.to_i if message["timestamp"].nil?
280
+ message["agent_id"] = agent.id
281
+ message["deployment"] = agent.deployment
282
+ end
283
+
284
+ @processor.process(:heartbeat, message)
285
+ @heartbeats_received += 1
286
+ Bhm.set_varz("heartbeats_received", @heartbeats_received)
287
+ end
288
+
289
+ def on_shutdown(agent, message)
290
+ @logger.info("Agent `#{agent.id}' shutting down...")
291
+ remove_agent(agent.id)
292
+ end
293
+
294
+ end
295
+ end
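The name-to-class convention used by @lookup_plugin@ above can be tried in isolation with the sketch below. @DemoPlugins@, @plugin_class_name@ and @lookup_demo_plugin@ are hypothetical stand-ins, not part of the gem. Unlike the original, this version passes @false@ to @const_get@ so that only constants defined directly in the plugin namespace are found.

```ruby
# Hypothetical plugin namespace standing in for Bosh::Monitor::Plugins.
module DemoPlugins
  class CloudWatch
    attr_reader :options

    def initialize(options = {})
      @options = options
    end
  end
end

# "cloud_watch" => "CloudWatch", the same transformation lookup_plugin uses.
def plugin_class_name(name)
  name.to_s.split("_").map(&:capitalize).join
end

def lookup_demo_plugin(name, options = {})
  klass = begin
    # inherit = false: do not fall back to top-level constants.
    DemoPlugins.const_get(plugin_class_name(name), false)
  rescue NameError
    raise ArgumentError, "Cannot find `#{name}' plugin"
  end
  klass.new(options)
end

plugin = lookup_demo_plugin("cloud_watch", "region" => "example")
```

Constructing the plugin outside the @rescue@ matters: @NoMethodError@ is a subclass of @NameError@, so an error raised inside a plugin's constructor would otherwise be misreported as a missing plugin.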