scout_agent 3.0.7 → 3.1.0

data/CHANGELOG CHANGED
@@ -1,3 +1,16 @@
1
+ == 3.1.0
2
+
3
+ * Fixed a bug where the monitor might not properly signal all subprocesses
4
+ * Made the master agent forward shutdown requests to running missions
5
+ * Enhanced the stop command to ensure all processes exit and to skip the waits
6
+ on KILL signals since they cannot be responded to
7
+ * Improved the error message for forced stops to give next steps
8
+ * Fixed a bug that would prevent the agent from daemonizing itself at start-up
9
+ if it had to clean out a stale PID file first
10
+ * Added status database and log file maintenance to the snapshot and queue
11
+ commands
12
+ * Completed full code documentation and clean-up pass
13
+
1
14
  == 3.0.7
2
15
 
3
16
  * Added a log() method (with a logger() alias) plugins can access to add
data/README CHANGED
@@ -1,3 +1,45 @@
1
- = ReadMe
1
+ = The Scout Agent
2
2
 
3
- Scout, Version 3
3
+ This is the agent software installed on servers to work with {the Scout monitoring application}[http://scoutapp.com/]. See the sections below for details on how to install and use the agent, how to build your own plugins for it to track the data you care about, and how to add features to the agent itself.
4
+
5
+ == How do I use the Scout agent?
6
+
7
+ Installing the agent is just a simple gem install:
8
+
9
+ $ sudo gem install scout_agent
10
+
11
+ Note that the gem requires Ruby 1.8.6 or higher and Rubygems 1.3.1 or higher. It also doesn't run on Windows due to Ruby not supporting fork() there.
12
+
13
+ Once the gem is installed, you need to identify yourself with the agent key (which looks like a7349498-bec3-4ddf-963c-149a666433a4) that you get from the Web application. Just issue this command and have your key ready when it asks for it:
14
+
15
+ $ sudo scout_agent id
16
+
17
+ At this point, you should be all set to run the agent. You start it up with this command:
18
+
19
+ $ sudo scout_agent start
20
+
21
+ The agent is a daemon, so it should return your prompt after it moves into the land of background processes. It will be running though. You can issue the following command if you want to check up on it:
22
+
23
+ $ scout_agent status
24
+
25
+ With the agent running, you should be able to log into your account on the {Web application}[http://scoutapp.com/] to set up your list of plugins and see the agent delivering data.
26
+
27
+ == How do I build my own plugins?
28
+
29
+ Scout makes it very easy to build your own plugins for anything you need to monitor. In a matter of minutes you could be tracking the user sign-ups in your application or anything else that's important to you. Once you pipe some data into Scout you can take advantage of all the graphing and trend analysis we use for more traditional monitoring, like Rails applications.
30
+
31
+ We have {a tutorial}[http://scoutapp.com/plugin_urls/static/creating_a_plugin] on the Web site that walks you through building a Scout plugin.
32
+
33
+ == How do I hack on the agent?
34
+
35
+ We try to keep the agent code fairly clean and documented, so it's hopefully not too tough to poke around in. However, it is a big code base. Let me give you the dime tour of where to look for things. All paths below are relative to <tt>lib/scout_agent</tt>.
36
+
37
+ <tt>dispatcher.rb</tt>, <tt>assignment.rb</tt>, and <tt>assignment/*</tt>:: This is the code the agent uses to interpret commands users give on the command-line. You'll find configuration file loading (<tt>plan.rb</tt> is a configuration), switch parsing, command selection, and invocation in here.
38
+ <tt>api.rb</tt>:: The ScoutAgent::API is the external interface for the queue and snapshot commands. This can be used to push data into Scout, without even building a Plugin, or just to request an updated snapshot of the environment.
39
+ <tt>lifeline.rb</tt> and <tt>agent.rb</tt>:: The ScoutAgent::Lifeline object monitors a ScoutAgent::Agent class, which is a major function of Scout, namely the plugin runner and the XMPP communication module. This is a pretty typical multi-process heartbeat setup where the Agent is fork()ed into a separate process and then monitored for regular check-ins written to a shared pipe.
40
+ <tt>agent/master_agent.rb</tt> and <tt>mission.rb</tt>:: Together these two pieces make up the heart of the agent. The ScoutAgent::Agent::MasterAgent is the main event loop and ScoutAgent::Mission (aliased Plugin) are the pieces of code that get run in that loop.
41
+ <tt>agent/communication_agent.rb</tt>, <tt>order.rb</tt>, and <tt>order/*</tt>:: This code is used to listen for supported commands over an XMPP connection. The ScoutAgent::Agent::CommunicationAgent manages all the XMPP talking and ScoutAgent::Order and subclasses are the commands.
42
+ <tt>database.rb</tt> and <tt>database/*</tt>:: This is a thin wrapper over Amalgalite[http://copiousfreetime.rubyforge.org/amalgalite/] (and SQLite databases by extension). These are the memory of the agent and, with locking, the primary IPC used by the agent.
43
+ <tt>core_extensions.rb</tt>:: This file holds a handful of extensions that make sense in the context of the agent. This is not an ActiveSupport size library, but just some simple niceties. These extensions can be used in your own Plugins.
44
+
45
+ We welcome additions to the agent and will incorporate patches if we feel they add to the platform as a whole. Obviously, the easier we can understand what you did the easier it is to judge that, so tests and documentation are plusses to us.
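
The README's description of the Lifeline (a fork()ed process monitored for regular check-ins written to a shared pipe) can be illustrated with a minimal sketch. This is not the agent's code: the method name +monitor_child+, the "alive" message, and the +patience+ parameter are all assumptions made up for the example.

```ruby
# Hypothetical sketch of a fork()-and-pipe heartbeat; not the agent's code.
# The child writes one line per check-in, and the parent gives up if no
# check-in arrives within +patience+ seconds.
def monitor_child(check_ins, patience = 2)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    check_ins.times do
      writer.puts "alive"  # one check-in per line
      writer.flush
      sleep 0.05
    end
    writer.close
    exit!(0)               # skip at_exit handlers inherited from the parent
  end
  writer.close             # the parent only reads

  received = 0
  loop do
    if IO.select([reader], nil, nil, patience).nil?
      Process.kill("TERM", pid)  # the child went quiet: ask it to stop
      break
    end
    break if reader.gets.nil?    # EOF: the child closed its end of the pipe
    received += 1
  end
  Process.wait(pid)
  reader.close
  received
end
```

Calling <tt>monitor_child(3)</tt> returns the number of check-ins the parent saw before the child closed the pipe. A real monitor would likely restart the child instead of just counting, but that policy is outside what the README describes.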
data/Rakefile CHANGED
@@ -118,7 +118,7 @@ task :upload_docs => :rdoc do
118
118
  )
119
119
  host = "#{config['username']}@rubyforge.org"
120
120
  remote_dir = "/var/www/gforge-projects/#{SA_SPEC.rubyforge_project}"
121
- local_dir = 'doc/html'
121
+ local_dir = "doc"
122
122
 
123
123
  sh "rsync -av --delete #{local_dir}/ #{host}:#{remote_dir}"
124
124
  end
data/TODO CHANGED
@@ -1,3 +1,9 @@
1
1
  = To Do List
2
2
 
3
- Coming soon...
3
+ * Build a `scout_agent help` command that provides general and command specific
4
+ help, perhaps by parsing the comments on assignments
5
+ * Add SSL certificate verification so we ensure we are always talking with the
6
+ official Scout server
7
+ * Add something like a SHA signature of the code to reports to help developers
8
+ see which version they are looking at data from
9
+ * Improve test coverage all over the agent
data/lib/scout_agent.rb CHANGED
@@ -64,9 +64,34 @@ module ScoutAgent
64
64
  wire_tap.tap = $stdout unless skip_stdout or Plan.run_as_daemon?
65
65
  wire_tap
66
66
  end
67
+
68
+ #
69
+ # A maintenance method used to remove log files written more than seven days
70
+ # ago. This prevents the hard drive from slowly filling with log files and
71
+ # thus is called as part of the main event loop as well as after snapshot and
72
+ # queue commands. A recent +log+ must be provided to notify of rotation
73
+ # errors.
74
+ #
75
+ def self.remove_old_log_files(log)
76
+ Plan.log_dir.each_entry do |log_file|
77
+ if log_file.to_s =~ /\.(\d{4})(\d{2})(\d{2})\z/
78
+ log_day = Time.local(*$~.captures.map { |n| n.to_i })
79
+ if Time.now - log_day > 60 * 60 * 24 * 7
80
+ begin
81
+ (Plan.log_dir + log_file).unlink
82
+ rescue Exception => error # file cannot be unlinked
83
+ log.error( "Failed to unlink old log file '#{log_file}': " +
84
+ "#{error.message} (#{error.class})." )
85
+ next
86
+ end
87
+ log.debug("Successfully unlinked old log file '#{log_file}'.")
88
+ end
89
+ end
90
+ end
91
+ end
67
92
 
68
93
  # The version of this agent.
69
- VERSION = "3.0.7".freeze
94
+ VERSION = "3.1.0".freeze
70
95
  # A Pathname reference to the agent code directory, used in dynamic loading.
71
96
  LIB_DIR = Pathname.new(File.dirname(__FILE__)) + agent_name
72
97
  end
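
The age test inside remove_old_log_files() can be tried on its own. The sketch below reuses the same date-suffix pattern and seven-day threshold from the hunk above, but the method name +old_log_file?+, the +SEVEN_DAYS+ constant, and the sample file names are assumptions introduced for illustration.

```ruby
# Hypothetical helper that isolates the age check used when cleaning logs.
SEVEN_DAYS = 60 * 60 * 24 * 7

# Returns +true+ if +file_name+ ends in a .YYYYMMDD suffix that is more than
# seven days before +now+.  Names without such a suffix are never "old".
def old_log_file?(file_name, now = Time.now)
  return false unless file_name.to_s =~ /\.(\d{4})(\d{2})(\d{2})\z/
  log_day = Time.local(*$~.captures.map { |n| n.to_i })
  now - log_day > SEVEN_DAYS
end
```

Passing +now+ explicitly keeps the check easy to exercise: with <tt>now = Time.local(2009, 6, 15)</tt>, a name ending in <tt>.20090601</tt> counts as old while one ending in <tt>.20090610</tt> does not.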
@@ -6,9 +6,20 @@ require "scout_agent/order"
6
6
 
7
7
  module ScoutAgent
8
8
  class Agent
9
+ #
10
+ # This agent manages the XMPP connection with the server. It mainly just
11
+ # listens for messages from the server and passes them on to matching Order
12
+ # instances.
13
+ #
9
14
  class CommunicationAgent < Agent
15
+ # The number of seconds to wait before attempting another connection.
10
16
  RECONNECT_WAIT = 60
11
17
 
18
+ #
19
+ # Prepares a log() and passes it down to the Orders, which are also
20
+ # loaded here. A list of trusted XMPP users is also prepared as part of
21
+ # this start-up.
22
+ #
12
23
  def initialize
13
24
  super # setup our log and status
14
25
 
@@ -26,6 +37,11 @@ module ScoutAgent
26
37
  }
27
38
  end
28
39
 
40
+ #
41
+ # This method encapsulates the process of the XMPP listener, which is
42
+ # pretty much: login, setup, listen for commands until told to stop, and
43
+ # exit.
44
+ #
29
45
  def run
30
46
  login
31
47
  update_status("Online since #{Time.now.utc.to_db_s}")
@@ -35,7 +51,8 @@ module ScoutAgent
35
51
  listen
36
52
  close_connection
37
53
  end
38
-
54
+
55
+ # Triggers the shutdown process.
39
56
  def finish
40
57
  log.info("Shutting down.")
41
58
  if @shutdown_thread
@@ -45,8 +62,18 @@ module ScoutAgent
45
62
  end
46
63
  end
47
64
 
65
+ #######
48
66
  private
67
+ #######
49
68
 
69
+ #######################
70
+ ### XMPP Operations ###
71
+ #######################
72
+
73
+ #
74
+ # Prepares a Jabber::Client with identification and Exception handling,
75
+ # then hands off to try_connection().
76
+ #
50
77
  def login
51
78
  Thread.abort_on_exception = true # make XMPP4R fail fast
52
79
  @agent_jid = Jabber::JID.new("#{agent_key}@#{jabber_server}/agent")
@@ -58,6 +85,10 @@ module ScoutAgent
58
85
  try_connection
59
86
  end
60
87
 
88
+ #
89
+ # Loops over connection attempts until successfully reaching the server.
90
+ # There's a pause of <tt>RECONNECT_WAIT</tt> seconds between each attempt.
91
+ #
61
92
  def try_connection
62
93
  @connecting = true
63
94
  until connect_and_authenticate?
@@ -68,13 +99,18 @@ module ScoutAgent
68
99
  @connecting = false
69
100
  end
70
101
 
102
+ #
103
+ # Attempts an XMPP connection and login. The process will be cancelled if
104
+ # either part fails. Returns +true+ if the entire process completes as
105
+ # expected, or +false+ otherwise.
106
+ #
71
107
  def connect_and_authenticate?
72
108
  status("Connecting")
73
109
  close_connection
74
110
  begin
75
111
  no_warnings { @jabber.connect }
76
112
  rescue Exception => error # connection failure
77
- log.error("Failed to connect (#{error.class}: #{error.message}).")
113
+ log.error("Failed to connect to XMPP server.")
78
114
  return false
79
115
  end
80
116
  begin
@@ -87,6 +123,10 @@ module ScoutAgent
87
123
  true
88
124
  end
89
125
 
126
+ #
127
+ # Builds an XMPP status with +message+ and arranges for it to be sent to
128
+ # the server in a separate Thread.
129
+ #
90
130
  def update_status(message, status = nil)
91
131
  status("Queuing status change")
92
132
  presence = Jabber::Presence.new
@@ -104,11 +144,19 @@ module ScoutAgent
104
144
  end
105
145
  end
106
146
 
147
+ #
148
+ # Grabs a _roster_ that can be used to manage the subscription requests of
149
+ # this agent's XMPP user.
150
+ #
107
151
  def fetch_roster
108
152
  status("Preparing connection")
109
153
  @roster = Jabber::Roster::Helper.new(@jabber)
110
154
  end
111
155
 
156
+ #
157
+ # Installs a callback that will accept all subscription requests from
158
+ # trusted XMPP identities.
159
+ #
112
160
  def install_subscriptions_callback
113
161
  @roster.add_subscription_request_callback do |_, presence|
114
162
  log.info("Subscription request: #{presence.from}")
@@ -121,6 +169,15 @@ module ScoutAgent
121
169
  end
122
170
  end
123
171
 
172
+ #
173
+ # Installs a callback that reads XMPP messages looking for supported
174
+ # commands. The identified commands are handed off to an Order subclass
175
+ # for execution.
176
+ #
177
+ # Before processing a command, this process will strip a leading message
178
+ # ID, if provided. A response is sent to the server acknowledging the
179
+ # receipt of the message in such cases.
180
+ #
124
181
  def install_messages_callback
125
182
  @jabber.add_message_callback do |message|
126
183
  log.info("Received message from #{message.from}: #{message.body}")
@@ -144,6 +201,10 @@ module ScoutAgent
144
201
  end
145
202
  end
146
203
 
204
+ #
205
+ # Sends a chat message with +body+ to the XMPP identity named by +who+.
206
+ # Messages are sent in a separate Thread.
207
+ #
147
208
  def send_chat_message(who, body)
148
209
  status("Queuing message")
149
210
  message = Jabber::Message.new(who, body)
@@ -160,6 +221,7 @@ module ScoutAgent
160
221
  end
161
222
  end
162
223
 
224
+ # Stops this main Thread to allow the listening Thread to take over.
163
225
  def listen
164
226
  log.info("Listening for commands.")
165
227
  status("Listening for commands")
@@ -167,6 +229,7 @@ module ScoutAgent
167
229
  Thread.stop
168
230
  end
169
231
 
232
+ # Closes our XMPP connection, if it's still open.
170
233
  def close_connection
171
234
  @jabber.close! if @jabber.is_connected?
172
235
  rescue Exception # connection already closed
@@ -174,17 +237,27 @@ module ScoutAgent
174
237
  log.warn("Failed to close connection.")
175
238
  end
176
239
 
240
+ ###############
241
+ ### Helpers ###
242
+ ###############
243
+
244
+ # Returns an appropriate key for the environment.
177
245
  def agent_key
178
246
  @agent_key ||= Plan.test_mode? ?
179
247
  "a7349498-bec3-4ddf-963c-149a666433a4" :
180
248
  Plan.agent_key
181
249
  end
182
250
 
251
+ #
252
+ # Returns an appropriate server for the environment. In non-test mode
253
+ # this will be the host() parsed out of the check-in URL.
254
+ #
183
255
  def jabber_server
184
256
  @jabber_server ||= Plan.test_mode? ? "jabber.org" :
185
257
  URI.parse(Plan.server_url).host
186
258
  end
187
259
 
260
+ # Returns +true+ if +user+ matches any of our trusted XMPP identities.
188
261
  def trusted?(user)
189
262
  id = user.to_s
190
263
  @trusted.any? { |trusted| id =~ trusted }
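
The trusted?() check above matches an XMPP identity against a list of patterns. A minimal standalone sketch follows; the patterns and addresses are invented for the example, since the agent's real trusted list is not part of this diff.

```ruby
# Hypothetical patterns; the agent's real trusted list is not shown here.
TRUSTED_PATTERNS = [
  /\Ascout@example\.com\b/,
  /@trusted\.example\.org(?:\/|\z)/
]

# Returns +true+ if +user+ matches any trusted pattern.
def trusted?(user, patterns = TRUSTED_PATTERNS)
  id = user.to_s
  patterns.any? { |pattern| id =~ pattern }
end
```

Anchoring the patterns (here with <tt>\A</tt>, <tt>\b</tt>, and <tt>\z</tt>) matters in a check like this, because an unanchored pattern would also accept identities that merely contain a trusted name.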
@@ -6,16 +6,31 @@ require "scout_agent/mission"
6
6
 
7
7
  module ScoutAgent
8
8
  class Agent
9
+ #
10
+ # This agent is the main event loop for the platform. Its primary function
11
+ # is to run each Mission downloaded from the server at the correct time. As
12
+ # part of that, it regularly updates the Mission list from the server and
13
+ # also prepares and sends check-ins to the server after a set of mission
14
+ # runs. Snapshots are also periodically run here and added to check-ins.
15
+ #
16
+ # This loop also manages regular maintenance like <tt>VACUUM</tt>ing SQLite
17
+ # databases and cleaning out old log files.
18
+ #
9
19
  class MasterAgent < Agent
20
+ #
21
+ # Prepares the primary event loop for execution. The main function at
22
+ # this point is to ensure that we can load all of the needed databases.
23
+ #
10
24
  def initialize
11
25
  super # setup our log and status
12
26
 
13
- @running = true
14
- @main_loop = nil
15
- @server = Server.new(log)
16
- @db = Database.load(:mission_log, log)
17
- @queue = Database.load(:queue, log)
18
- @snapshots = Database.load(:snapshots, log)
27
+ @running = true
28
+ @main_loop = nil
29
+ @mission_pid = nil
30
+ @server = Server.new(log)
31
+ @db = Database.load(:mission_log, log)
32
+ @queue = Database.load(:queue, log)
33
+ @snapshots = Database.load(:snapshots, log)
19
34
 
20
35
  if [@db, @queue, @snapshots].any? { |db| db.nil? }
21
36
  log.fatal("Could not load all required databases.")
@@ -23,6 +38,11 @@ module ScoutAgent
23
38
  end
24
39
  end
25
40
 
41
+ #
42
+ # This method outlines the steps of the event loop: update our plan from
43
+ # the server, run Missions and snapshots, check-in, handle maintenance,
44
+ # and rest.
45
+ #
26
46
  def run
27
47
  log.info("Running.")
28
48
  @main_loop = Thread.new do
@@ -42,31 +62,64 @@ module ScoutAgent
42
62
  @main_loop.join
43
63
  end
44
64
 
65
+ #
66
+ # This method is called automatically when this Agent process receives an
67
+ # +ALRM+ signal and all it does is to wake up the event loop Thread, if it
68
+ # is not already running. This allows the Agent to notice external
69
+ # changes quicker.
70
+ #
45
71
  def notice_changes
46
72
  @main_loop.run if @main_loop
47
73
  rescue ThreadError # Thread was already killed
48
74
  # do nothing: we're shutting down and can't notice new things
49
75
  end
50
76
 
77
+ #
78
+ # This method is called automatically when this Agent process receives a
79
+ # stop request, like a +TERM+ signal. It prepares the Agent to shutdown,
80
+ # but doesn't trigger an immediate stop. Instead, the Agent will check to
81
+ # see if this has been called after each phase of the main event loop.
82
+ # This allows it to avoid repeating work when it relaunches.
83
+ #
84
+ # This request is also forwarded to a currently running Mission, if there
85
+ # is one.
86
+ #
51
87
  def finish
52
88
  if @running
53
89
  log.info("Shutting down.")
90
+ if @mission_pid
91
+ log.info("Forwarding shutdown request to the running mission.")
92
+ begin
93
+ Process.kill("TERM", @mission_pid)
94
+ rescue Exception # unable to signal mission
95
+ log.warn("Mission could not be signaled.")
96
+ # do nothing: mission already ended
97
+ end
98
+ end
54
99
  else
55
100
  log.warn("Received multiple shutdown signals.")
56
101
  end
57
102
  @running = false
58
103
  notice_changes
59
104
  end
60
-
105
+
106
+ #######
61
107
  private
108
+ #######
62
109
 
63
110
  #############
64
111
  ### Agent ###
65
112
  #############
66
113
 
114
+ #
115
+ # Updates our plan from the server, if it has changed. This includes
116
+ # both the list of plugins to run as well as our list of commands involved
117
+ # in a snapshot.
118
+ #
67
119
  def fetch_plan
68
120
  log.info("Fetching plan from server.")
69
121
  status("Fetching plan from server")
122
+ # read the plan
70
123
  headers = {}
71
124
  if not Plan.test_mode? and (old_plan = @db.current_plan)
72
125
  log.debug( "Adding If-Modified-Since for plan fetch: " +
@@ -74,6 +127,7 @@ module ScoutAgent
74
127
  headers[:if_modified_since] = old_plan[:last_modified]
75
128
  end
76
129
  json_plan = @server.get_plan(headers)
130
+ # skip missing or empty plans
77
131
  if json_plan.nil? # failed to retrieve plan
78
132
  log.warn("Could not retrieve plan from server.")
79
133
  return
@@ -83,24 +137,38 @@ module ScoutAgent
83
137
  else
84
138
  log.info("Received plan (#{json_plan.to_s.size} bytes).")
85
139
  end
140
+ # parse the plan
86
141
  begin
87
142
  ruby_plan = JSON.parse(json_plan.to_s)
88
143
  rescue JSON::ParserError # bad JSON
89
144
  log.error("Plan from server was malformed JSON.")
90
145
  return # skip plan update
91
146
  end
147
+ # update the local databases with the changes
92
148
  @db.update_plan( json_plan.headers[:last_modified],
93
149
  Array(ruby_plan["plugins"]) )
94
150
  @snapshots.update_commands(Array(ruby_plan["commands"]))
95
151
  end
96
152
 
153
+ #
154
+ # This loop runs all Missions that have exceeded their run time wait, one
155
+ # at a time. Missions are run in a child process to make it easy to time
156
+ # them out and clean up after them, as well as to ensure the agent isn't
157
+ # affected by their code.
158
+ #
159
+ # The execution process outlined for a Mission here is: create a child
160
+ # process, compile the Mission code in that process, run the code in that
161
+ # process, and have that process record the results in the database so
162
+ # this process can access them.
163
+ #
97
164
  def execute_missions
98
165
  status("Running missions")
99
166
  ran_a_mission = false
100
- while mission = @db.current_mission
167
+ while mission = @db.current_mission # loop over pending Missions
101
168
  log.info("Running #{mission[:name]} mission.")
102
169
  ran_a_mission = true
103
- pid = fork do
170
+ # run a Mission
171
+ @mission_pid = fork do
104
172
  reset_environment
105
173
  compile_mission(mission)
106
174
  run_mission(mission)
@@ -108,10 +176,12 @@ module ScoutAgent
108
176
  end
109
177
 
110
178
  begin
179
+ # wait for the Mission to complete, or the timeout to expire
111
180
  Timeout.timeout(mission[:timeout]) do
112
- Process.wait(pid)
181
+ Process.wait(@mission_pid)
113
182
  end
114
- unless $?.success?
183
+ @mission_pid = nil
184
+ unless $?.success? # record that the Mission exited with an error
115
185
  log.warn( "#{mission[:name]} exited with an error: " +
116
186
  "#{$?.exitstatus}." )
117
187
  @db.write_report(
@@ -121,8 +191,9 @@ module ScoutAgent
121
191
  :body => "Exit status: #{$?.exitstatus}"
122
192
  )
123
193
  end
124
- rescue Timeout::Error # mission exceeded allowed execution
125
- status = Process.term_or_kill(pid)
194
+ rescue Timeout::Error # Mission exceeded allowed execution
195
+ status = Process.term_or_kill(@mission_pid)
196
+ @mission_pid = nil
126
197
  log.error( "#{mission[:name]} took too long to run: " +
127
198
  "#{status && status.exitstatus}." )
128
199
  @db.write_report(
@@ -138,9 +209,16 @@ module ScoutAgent
138
209
  break
139
210
  end
140
211
  end
212
+ # we shouldn't wake up with nothing to do, so check for that
141
213
  log.warn("No missions to run.") unless ran_a_mission
142
214
  end
143
215
 
216
+ #
217
+ # Request a snapshot via the API. This is a normal (non-forced) request,
218
+ # so only commands that have passed their interval will be run and it's
219
+ # likely that nothing at all will be done (because the last trip through
220
+ # the event loop ran a full snapshot, for example).
221
+ #
144
222
  def prepare_snapshot
145
223
  if Plan.periodic_snapshots?
146
224
  status("Preparing a system snapshot")
@@ -150,10 +228,18 @@ module ScoutAgent
150
228
  end
151
229
  end
152
230
 
231
+ #
232
+ # Sends all generated data up to the Scout server. This is not mission
233
+ # critical data so it is removed from the databases as it is sent. This
234
+ # prevents something like a slow send that times out on our end but does
235
+ # eventually complete from causing us to later send duplicated data.
236
+ #
153
237
  def checkin
238
+ # get the data from the databases
154
239
  reports = @db.current_reports
155
240
  queued = @queue.queued_reports
156
241
  snapshots = @snapshots.current_runs
242
+ # ensure we have something to send
157
243
  if reports.empty? and queued.empty? and snapshots.empty?
158
244
  log.warn("No data to report to the server.")
159
245
  return
@@ -161,16 +247,18 @@ module ScoutAgent
161
247
 
162
248
  log.info("Checking in with server.")
163
249
  status("Checking in with server")
250
+ # prepare the data for transport to the server
164
251
  checkin = { :reports => Array.new,
165
252
  :hints => Array.new,
166
253
  :alerts => Array.new,
167
254
  :errors => Array.new }
168
255
  (reports + queued).each do |report|
169
- type = report.delete_at(:type)
256
+ type = report.delete_at(:type)
170
257
  checkin["#{type}s".to_sym] << report.to_hash
171
258
  end
172
259
  checkin[:snapshots] = snapshots.map { |run| run.to_hash }
173
260
 
261
+ # log some details about what we are sending
174
262
  report_dates = String.new
175
263
  if reports.first or queued.first
176
264
  dates = [ [reports.first, queued.first],
@@ -191,6 +279,7 @@ module ScoutAgent
191
279
  "#{checkin[:alerts].size} alerts, " +
192
280
  "and #{checkin[:errors].size} errors)#{report_dates} " +
193
281
  "and #{snapshots.size} snapshot runs#{snapshot_dates}." )
282
+ # transmit the data and record the results
194
283
  if @server.post_checkin(checkin)
195
284
  log.info("Server received data.")
196
285
  else
@@ -198,6 +287,11 @@ module ScoutAgent
198
287
  end
199
288
  end
200
289
 
290
+ #
291
+ # Performs the regular maintenance needed to keep the agent from slowly
292
+ # filling the hard drive with data. It <tt>VACUUM</tt>s databases to
293
+ # reclaim space and removes old log files.
294
+ #
201
295
  def perform_maintenance
202
296
  log.info("Running maintenance tasks.")
203
297
  status("Running maintenance tasks")
@@ -213,23 +307,10 @@ module ScoutAgent
213
307
  end
214
308
 
215
309
  # clean out old logs
216
- Plan.log_dir.each_entry do |log_file|
217
- if log_file.to_s =~ /\.(\d{4})(\d{2})(\d{2})\z/
218
- log_day = Time.local(*$~.captures.map { |n| n.to_i })
219
- if Time.now - log_day > 60 * 60 * 24 * 7
220
- begin
221
- (Plan.log_dir + log_file).unlink
222
- rescue Exception => error # file cannot be unlinked
223
- log.error( "Failed to unlink old log file '#{log_file}': " +
224
- "#{error.message} (#{error.class})." )
225
- next
226
- end
227
- log.debug("Successfully unlinked old log file '#{log_file}'.")
228
- end
229
- end
230
- end
310
+ ScoutAgent.remove_old_log_files(log)
231
311
  end
232
312
 
313
+ # Rest for however much time we have before more work is needed.
233
314
  def wait_for_orders
234
315
  pause = @db.seconds_to_next_mission
235
316
  log.info("Waiting #{pause} seconds for next mission run.")
@@ -237,6 +318,10 @@ module ScoutAgent
237
318
  sleep pause
238
319
  end
239
320
 
321
+ #
322
+ # Finish the shutdown process, if it was started while we were doing the
323
+ # last step of the main event loop.
324
+ #
240
325
  def check_running_status
241
326
  exit unless @running
242
327
  end
@@ -245,9 +330,18 @@ module ScoutAgent
245
330
  ### Mission ###
246
331
  ###############
247
332
 
333
+ #
334
+ # Reset our parent's signal handlers, authorize the Mission identity,
335
+ # prepare a log, and reset our status.
336
+ #
248
337
  def reset_environment
249
338
  # swap out our parent's signal handlers
250
- install_shutdown_handler { exit }
339
+ install_shutdown_handler do
340
+ Thread.new do
341
+ log.info("Shutting down.")
342
+ exit
343
+ end
344
+ end
251
345
 
252
346
  # clear the parent's identity and assume mine
253
347
  IDCard.me = nil
@@ -263,12 +357,13 @@ module ScoutAgent
263
357
  end
264
358
  end
265
359
 
360
+ # Build the +mission+ code or exit() with an error if it cannot be built.
266
361
  def compile_mission(mission)
267
362
  log.info("Compiling #{mission[:name]} mission.")
268
363
  status("Compiling")
269
364
  begin
270
365
  eval(mission[:code], TOPLEVEL_BINDING, mission[:name])
271
- rescue Exception => error # any compile error
366
+ rescue Exception => error # any compile error
272
367
  raise if $!.is_a? SystemExit # don't catch exit() calls
273
368
  log.error( "#{mission[:name]} could not be compiled: " +
274
369
  "#{error.message} (#{error.class})." )
@@ -282,6 +377,11 @@ module ScoutAgent
282
377
  end
283
378
  end
284
379
 
380
+ #
381
+ # Create a Mission object from the code previously prepared, passing in
382
+ # details like <tt>:memory</tt> and <tt>:options</tt> from +mission+.
383
+ # Once this Mission is created, it is run().
384
+ #
285
385
  def run_mission(mission)
286
386
  log.info("Preparing #{mission[:name]} mission.")
287
387
  if prepared = Mission.prepared
@@ -304,6 +404,7 @@ module ScoutAgent
304
404
  end
305
405
  end
306
406
 
407
+ # Report that +mission+ is now complete.
307
408
  def complete_mission(mission)
308
409
  log.info("#{mission[:name]} mission complete.")
309
410
  end