averell23-watchdogger 0.1.0
- data/CHANGES +0 -0
- data/LICENSE +676 -0
- data/README.rdoc +86 -0
- data/bin/watchdogger +54 -0
- data/lib/dog_log.rb +52 -0
- data/lib/watchdogger.rb +177 -0
- data/lib/watcher/base.rb +96 -0
- data/lib/watcher/http_watcher.rb +52 -0
- data/lib/watcher/log_watcher.rb +97 -0
- data/lib/watcher.rb +51 -0
- data/lib/watcher_action/htpost.rb +42 -0
- data/lib/watcher_action/kill_process.rb +24 -0
- data/lib/watcher_action/log_action.rb +27 -0
- data/lib/watcher_action/send_mail.rb +67 -0
- data/lib/watcher_action.rb +41 -0
- data/lib/watcher_event.rb +23 -0
- data/sample_config.yml +16 -0
- data/test/http_watcher_test.rb +63 -0
- data/test/kill_process_test.rb +29 -0
- data/test/log_watcher_test.rb +38 -0
- data/test/test_helper.rb +6 -0
- data/test/watchdogger_test.rb +82 -0
- metadata +119 -0
data/README.rdoc
ADDED
@@ -0,0 +1,86 @@
|
|
1
|
+
= Watchdogger
|
2
|
+
|
3
|
+
Watchdogger is a simple ruby program that will monitor your log file or web
|
4
|
+
page (and potentially other stuff, too) and take action if something is amiss.
|
5
|
+
It is intentionally simple, so there is less chance of having things go wrong
|
6
|
+
with the watchdog itself.
|
7
|
+
|
8
|
+
If you want to build more complicated reporting or such, you could do so
|
9
|
+
separately and act on notifications from the watchdog.
|
10
|
+
|
11
|
+
= Installation
|
12
|
+
|
13
|
+
Should be a matter of a simple
|
14
|
+
|
15
|
+
gem install averell23-watchdogger
|
16
|
+
|
17
|
+
= Quick start
|
18
|
+
|
19
|
+
You will have to make a configuration file like this:
|
20
|
+
|
21
|
+
# Actions you want to execute
|
22
|
+
actions:
|
23
|
+
log_event:
|
24
|
+
type: log_action
|
25
|
+
severity: info
|
26
|
+
restart_server:
|
27
|
+
type: kill_process
|
28
|
+
pidfile: /var/pid/server.pid
|
29
|
+
# Watchers: Things you want to monitor
|
30
|
+
watchers:
|
31
|
+
check_my_site:
|
32
|
+
type: http_watcher
|
33
|
+
url: http://www.mysite.some/
|
34
|
+
content_match: 'some.*other?string'
|
35
|
+
actions:
|
36
|
+
- log_event
|
37
|
+
- restart_server
|
38
|
+
# Run each minute
|
39
|
+
interval: 60
|
40
|
+
logfile: /mystuff/dogger.log
|
41
|
+
log_level: info
|
42
|
+
|
43
|
+
If your configuration file is called watcher.yml, you could call the watchdog script with
|
44
|
+
|
45
|
+
watchdogger -c watcher.yml
|
46
|
+
|
47
|
+
(If you omit the log file from the configuration, you'll also see the log
|
48
|
+
output on your console).
|
49
|
+
|
50
|
+
The configuration above will check each minute if the web site responds and if the body
|
51
|
+
matches the given regular expression. If not, it will log a message and restart
|
52
|
+
the server.
|
53
|
+
|
54
|
+
There are other options too, just check the API docs for more.
|
55
|
+
|
56
|
+
To find out the options of the program, call
|
57
|
+
|
58
|
+
watchdogger --help
|
59
|
+
|
60
|
+
= Why Watchdogger?
|
61
|
+
|
62
|
+
I wrote this to monitor our instance of Tomcat. While there are other scripts
|
63
|
+
around, they seem to be either "quick" shell scripts that require a standard
|
64
|
+
linux layout, or potentially "Enterprise" solutions with a lot of overhead.
|
65
|
+
|
66
|
+
This script is just plain Ruby and does not depend on external libraries. So
|
67
|
+
you should be able to install the gem, set up your configuration and be
|
68
|
+
good to go. It doesn't even assume a particular install location.
|
69
|
+
|
70
|
+
(In the future there may be some actions or things that require jruby, to
|
71
|
+
monitor java servers)
|
72
|
+
|
73
|
+
= What if Watchdogger dies?
|
74
|
+
|
75
|
+
Watchdogger has been designed to run as a daemon, instead of being a cron job
|
76
|
+
that needs to be called every few minutes. This has several advantages:
|
77
|
+
We can have state information throughout the session, we can spawn helper
|
78
|
+
processes if needed, etc.
|
79
|
+
|
80
|
+
However, the Watchdogger process may die for some reason, which would not be
|
81
|
+
a good thing.
|
82
|
+
|
83
|
+
There is an easy solution, though: Just *do* install Watchdogger
|
84
|
+
as a cron job. It will check if the old daemon process is still running and
|
85
|
+
exit with code 1 if that's the case. If the old process is stale, it will
|
86
|
+
restart itself.
|
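The quick-start configuration can be sanity-checked before it is handed to the daemon. The sketch below is not part of the gem. It assumes the nesting shown here (the indentation of the README sample is an assumption) and a hypothetical helper named `undefined_actions`. It uses only Ruby's standard YAML library to confirm that every action a watcher refers to is defined in the `actions` section.

```ruby
require 'yaml'

# Quick-start configuration from the README; the nesting is assumed.
SAMPLE_CONFIG = <<~CONFIG
  actions:
    log_event:
      type: log_action
      severity: info
    restart_server:
      type: kill_process
      pidfile: /var/pid/server.pid
  watchers:
    check_my_site:
      type: http_watcher
      url: http://www.mysite.some/
      content_match: 'some.*other?string'
      actions:
        - log_event
        - restart_server
  interval: 60
  logfile: /mystuff/dogger.log
  log_level: info
CONFIG

# Hypothetical helper: names of actions that watchers reference
# but that are missing from the top-level "actions" section.
def undefined_actions(config)
  defined = config.fetch('actions').keys
  config.fetch('watchers').flat_map do |_name, options|
    Array(options['actions']) - defined
  end
end

config = YAML.load(SAMPLE_CONFIG)
puts "interval: #{config['interval']}s"
puts "undefined actions: #{undefined_actions(config).inspect}"
```

An empty list means that every watcher only refers to actions that exist.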
data/bin/watchdogger
ADDED
@@ -0,0 +1,54 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
$: << File.expand_path(File.join(File.dirname(__FILE__), '..', 'lib'))
|
3
|
+
require 'watchdogger'
|
4
|
+
|
5
|
+
module DoggerFlags extend OptiFlagSet
|
6
|
+
optional_flag 'config' do
|
7
|
+
alternate_forms 'c'
|
8
|
+
long_form 'configuration'
|
9
|
+
description "Configuration file from which to read the servers to guard. Defaults to .watchdogger.yml in the user's home directory."
|
10
|
+
end
|
11
|
+
|
12
|
+
optional_switch_flag 'daemon' do
|
13
|
+
alternate_forms 'd'
|
14
|
+
long_form 'daemonize'
|
15
|
+
description 'Run as a daemon, using the configured pidfile or .watchdogger.pid in your home directory'
|
16
|
+
end
|
17
|
+
|
18
|
+
optional_switch_flag 'shutdown' do
|
19
|
+
alternate_forms 's'
|
20
|
+
long_form 'shutdown'
|
21
|
+
description 'Shut down the daemon. You must give the -daemon flag in the same way as on startup'
|
22
|
+
end
|
23
|
+
|
24
|
+
optional_switch_flag 'status' do
|
25
|
+
description 'Get the daemon status'
|
26
|
+
end
|
27
|
+
|
28
|
+
and_process!
|
29
|
+
end
|
30
|
+
|
31
|
+
flags = DoggerFlags.flags
|
32
|
+
config_file_name = flags.config || File.join(Etc.getpwuid.dir, '.watchdogger.yml')
|
33
|
+
|
34
|
+
begin
|
35
|
+
config = YAML.load_file(config_file_name)
|
36
|
+
WatchDogger.init_system(config)
|
37
|
+
if(flags.status)
|
38
|
+
puts WatchDogger.check_daemon ? "The daemon is running." : "The daemon seems to be dead."
|
39
|
+
elsif(flags.shutdown)
|
40
|
+
puts "Shutting down the old daemon"
|
41
|
+
WatchDogger.shutdown_daemon
|
42
|
+
elsif(flags.daemon)
|
43
|
+
puts "Starting daemon."
|
44
|
+
WatchDogger.daemon
|
45
|
+
else
|
46
|
+
puts "Starting in foreground."
|
47
|
+
WatchDogger.watch_loop
|
48
|
+
end
|
49
|
+
rescue SystemExit
|
50
|
+
# nothing
|
51
|
+
rescue Exception => e
|
52
|
+
puts "Problem with this command: #{e.message}. Exiting."
|
53
|
+
exit(1)
|
54
|
+
end
|
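The script picks its mode in a fixed order: status first, then shutdown, then daemon, and foreground as the fallback. The sketch below shows that precedence in isolation. It swaps the OptiFlag DSL for Ruby's standard `OptionParser`, so it is not how the gem parses its flags, and the names `parse_dogger_flags` and `select_mode` are made up for this illustration.

```ruby
require 'optparse'

# Parse the same four options with the standard library.
def parse_dogger_flags(argv)
  flags = { config: nil, daemon: false, shutdown: false, status: false }
  parser = OptionParser.new do |opts|
    opts.on('-c', '--config FILE', 'Configuration file') { |file| flags[:config] = file }
    opts.on('-d', '--daemon', 'Run as a daemon') { flags[:daemon] = true }
    opts.on('-s', '--shutdown', 'Shut down the daemon') { flags[:shutdown] = true }
    opts.on('--status', 'Get the daemon status') { flags[:status] = true }
  end
  parser.parse(argv) # non-destructive: argv itself is left untouched
  flags
end

# Same precedence as the if/elsif chain in the script.
def select_mode(flags)
  return :status if flags[:status]
  return :shutdown if flags[:shutdown]
  return :daemon if flags[:daemon]
  :foreground
end

flags = parse_dogger_flags(%w[-c watcher.yml -d])
puts "#{select_mode(flags)} using #{flags[:config]}"
```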
data/lib/dog_log.rb
ADDED
@@ -0,0 +1,52 @@
|
|
1
|
+
require 'logger'
|
2
|
+
|
3
|
+
# Logging facility for the watchdog
|
4
|
+
class DogLog # :nodoc:
|
5
|
+
|
6
|
+
class << self
|
7
|
+
|
8
|
+
# Set the log file and severity. This will reset the current logger,
|
9
|
+
# but should not usually be called on an active log.
|
10
|
+
def setup(logfile, severity)
|
11
|
+
@logfile = logfile
|
12
|
+
@severity = severity
|
13
|
+
if(@logger)
|
14
|
+
assit_fail('Resetting logfile')
|
15
|
+
@logger.close if(@logger.respond_to?(:close))
|
16
|
+
@logger = nil
|
17
|
+
end
|
18
|
+
end
|
19
|
+
|
20
|
+
# If nothing is configured, we log to STDERR by default
|
21
|
+
def logger
|
22
|
+
@logger ||= begin
|
23
|
+
@logfile ||= STDERR
|
24
|
+
severity = @severity || Logger::DEBUG
|
25
|
+
logger = Logger.new(get_log_io, 3)
|
26
|
+
logger.level = severity
|
27
|
+
logger
|
28
|
+
end
|
29
|
+
end
|
30
|
+
|
31
|
+
def get_log_io
|
32
|
+
return @logfile if(@logfile.kind_of?(IO))
|
33
|
+
if(%w(STDOUT STDERR).include?(@logfile.to_s.upcase))
|
34
|
+
const = Module.const_get(@logfile.upcase)
|
35
|
+
return const if(const.kind_of?(IO))
|
36
|
+
end
|
37
|
+
File.open(@logfile, 'a')
|
38
|
+
end
|
39
|
+
|
40
|
+
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
class Object # :nodoc:
|
45
|
+
def self.dog_log
|
46
|
+
DogLog.logger
|
47
|
+
end
|
48
|
+
|
49
|
+
def dog_log
|
50
|
+
DogLog.logger
|
51
|
+
end
|
52
|
+
end
|
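The log target can be an IO object, the name of a standard stream, or a file path. The following sketch shows one way to resolve those three cases without looking up arbitrary constants. The helper names `resolve_log_io` and `build_logger` are hypothetical and not part of the gem.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: turn a configured log target into an IO object.
def resolve_log_io(target)
  return target if target.respond_to?(:write) # already an IO-like object
  case target.to_s.upcase
  when '', 'STDERR' then $stderr              # nothing configured: use STDERR
  when 'STDOUT' then $stdout
  else File.open(target, 'a')                 # anything else is treated as a file path
  end
end

# Build a logger on the resolved target with the given level.
def build_logger(target, level = Logger::DEBUG)
  logger = Logger.new(resolve_log_io(target))
  logger.level = level
  logger
end

buffer = StringIO.new
build_logger(buffer, Logger::INFO).info('Watchdogger') { 'System Initialized' }
puts buffer.string
```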
data/lib/watchdogger.rb
ADDED
@@ -0,0 +1,177 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'optiflag'
|
3
|
+
require 'etc'
|
4
|
+
require 'yaml'
|
5
|
+
require 'assit'
|
6
|
+
|
7
|
+
# require our own stuff
|
8
|
+
lib_dir = File.expand_path(File.dirname(__FILE__))
|
9
|
+
$: << lib_dir
|
10
|
+
require 'dog_log'
|
11
|
+
require 'watcher'
|
12
|
+
require 'watcher_action'
|
13
|
+
require 'watcher_event'
|
14
|
+
|
15
|
+
# Require the Watcher
|
16
|
+
Dir[File.join(lib_dir, 'watcher', '*.rb')].each { |f| require 'watcher/' + File.basename(f, '.rb') }
|
17
|
+
# Require the actions
|
18
|
+
Dir[File.join(lib_dir, 'watcher_action', '*.rb')].each { |f| require 'watcher_action/' + File.basename(f, '.rb') }
|
19
|
+
|
20
|
+
module WatchDogger # :nodoc:
|
21
|
+
|
22
|
+
class << self
|
23
|
+
|
24
|
+
# Initializes the watchdog system, sets up the log. In addition to the configured
|
25
|
+
# Watchers and WatcherActions, the system can take the following arguments:
|
26
|
+
#
|
27
|
+
# log_level - The log level for the system log. This will apply to all log messages
|
28
|
+
# logfile - The log file for the system. Defaults to STDERR
|
29
|
+
# interval - The watch interval in seconds. Defaults to 60
|
30
|
+
def init_system(options)
|
31
|
+
# First setup the logging options
|
32
|
+
@log_level = options.get_value(:log_level)
|
33
|
+
@logfile = options.get_value(:logfile)
|
34
|
+
DogLog.setup(@logfile, @log_level)
|
35
|
+
|
36
|
+
# Now setup the actions
|
37
|
+
actions = options.get_value(:actions, false)
|
38
|
+
raise(ArgumentError, "Actions not configured correctly.") unless(actions.is_a?(Hash))
|
39
|
+
actions.each do |act_name, act_options|
|
40
|
+
WatcherAction.register(act_name, act_options)
|
41
|
+
end
|
42
|
+
|
43
|
+
# Set up the watchers
|
44
|
+
watchers = options.get_value(:watchers, false)
|
45
|
+
raise(ArgumentError, "Watchers not configured correctly.") unless(watchers.is_a?(Hash))
|
46
|
+
watchers.each do |watch_name, watch_options|
|
47
|
+
Watcher.register(watch_name, watch_options)
|
48
|
+
end
|
49
|
+
|
50
|
+
dog_log.info('Watchdogger') { 'System Initialized' }
|
51
|
+
@watch_interval = options.get_value(:interval, 60).to_i
|
52
|
+
@pidfile = options.get_value(:pidfile) || File.join(Etc.getpwuid.dir, '.watchdogger.pid')
|
53
|
+
@pidfile = File.expand_path(@pidfile)
|
54
|
+
end
|
55
|
+
|
56
|
+
|
57
|
+
# This is the main loop of the watcher
|
58
|
+
def watch_loop
|
59
|
+
signal_traps
|
60
|
+
dog_log.info('Watchdogger') { "Starting watch loop with interval #{@watch_interval}"}
|
61
|
+
loop do
|
62
|
+
Watcher.watch_all!
|
63
|
+
sleep(@watch_interval)
|
64
|
+
end
|
65
|
+
end
|
66
|
+
|
67
|
+
# Run as a daemon
|
68
|
+
def daemon
|
69
|
+
raise(RuntimeError, "Daemon still running") if(check_daemon)
|
70
|
+
raise(ArgumentError, "Not daemonizing without a logfile") unless(@logfile && @logfile.upcase != 'STDOUT' && @logfile.upcase != 'STDERR')
|
71
|
+
# Test the file opening before going daemon, so we know that
|
72
|
+
# it should usually work
|
73
|
+
File.open(@pidfile, 'w') { |io| io << 'starting' }
|
74
|
+
daemonize
|
75
|
+
# Now write the pid for real
|
76
|
+
File.open(@pidfile, 'w') { |io| io << Process.pid.to_s }
|
77
|
+
dog_log.info('Watchdogger') { "Running as daemon with pid #{Process.pid}" }
|
78
|
+
watch_loop
|
79
|
+
end
|
80
|
+
|
81
|
+
# By default, camelize converts strings to UpperCamelCase. If the argument to camelize
|
82
|
+
# is set to ":lower" then camelize produces lowerCamelCase.
|
83
|
+
#
|
84
|
+
# camelize will also convert '/' to '::' which is useful for converting paths to namespaces
|
85
|
+
#
|
86
|
+
# Examples
|
87
|
+
# "active_record".camelize #=> "ActiveRecord"
|
88
|
+
# "active_record".camelize(:lower) #=> "activeRecord"
|
89
|
+
# "active_record/errors".camelize #=> "ActiveRecord::Errors"
|
90
|
+
# "active_record/errors".camelize(:lower) #=> "activeRecord::Errors"
|
91
|
+
def camelize(lower_case_and_underscored_word, first_letter_in_uppercase = true) # :nodoc:
|
92
|
+
if first_letter_in_uppercase
|
93
|
+
lower_case_and_underscored_word.to_s.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
|
94
|
+
else
|
95
|
+
lower_case_and_underscored_word.to_s[0, 1] + camelize(lower_case_and_underscored_word)[1..-1]
|
96
|
+
end
|
97
|
+
end
|
98
|
+
|
99
|
+
# Shutdown the given daemon
|
100
|
+
def shutdown_daemon
|
101
|
+
Process.kill('TERM', get_pid)
|
102
|
+
end
|
103
|
+
|
104
|
+
# Check for running daemon. Returns true if the system thinks that the
|
105
|
+
# daemon is still running.
|
106
|
+
def check_daemon
|
107
|
+
return false unless(File.exist?(@pidfile))
|
108
|
+
pid = get_pid
|
109
|
+
begin
|
110
|
+
Process.kill(0, pid)
|
111
|
+
true
|
112
|
+
rescue Errno::EPERM
|
113
|
+
true
|
114
|
+
rescue Errno::ESRCH
|
115
|
+
dog_log.info('Watchdogger') { "Old process #{pid} is stale, good to go." }
|
116
|
+
false
|
117
|
+
rescue Exception => e
|
118
|
+
dog_log.error('Watchdogger') { "Could not find out if process #{pid} still runs (#{e.message}). Hoping for the best..." }
|
119
|
+
false
|
120
|
+
end
|
121
|
+
end
|
122
|
+
|
123
|
+
private
|
124
|
+
|
125
|
+
def get_pid
|
126
|
+
pid = File.open(@pidfile, 'r') { |io| io.read }
|
127
|
+
pid.to_i
|
128
|
+
end
|
129
|
+
|
130
|
+
# Clean shutdown
|
131
|
+
def shutdown
|
132
|
+
dog_log.info('Watchdogger') { "Cleaning watchers..." }
|
133
|
+
Watcher.cleanup_watchers
|
134
|
+
if(File.exist?(@pidfile) && get_pid == Process.pid)
|
135
|
+
dog_log.info('Watchdogger') { "Removing pidfile at #{@pidfile}" }
|
136
|
+
File.delete(@pidfile)
|
137
|
+
end
|
138
|
+
dog_log.info('Watchdogger') { "Shutting down."}
|
139
|
+
exit(0)
|
140
|
+
end
|
141
|
+
|
142
|
+
# Set up handlers for the signals that should be handled by the script
|
143
|
+
def signal_traps
|
144
|
+
Signal.trap('INT') { shutdown }
|
145
|
+
Signal.trap('TERM') { shutdown }
|
146
|
+
Signal.trap('HUP', 'IGNORE')
|
147
|
+
end
|
148
|
+
|
149
|
+
# File active_support/core_ext/kernel/daemonizing.rb, line 4
|
150
|
+
def daemonize
|
151
|
+
exit if fork # Parent exits, child continues.
|
152
|
+
Process.setsid # Become session leader.
|
153
|
+
exit if fork # Zap session leader. See [1].
|
154
|
+
Dir.chdir "/" # Release old working directory.
|
155
|
+
File.umask 0000 # Ensure sensible umask. Adjust as needed.
|
156
|
+
STDIN.reopen "/dev/null" # Free file descriptors and
|
157
|
+
STDOUT.reopen "/dev/null", "a" # point them somewhere sensible.
|
158
|
+
STDERR.reopen STDOUT # STDOUT/ERR should better go to a logfile.
|
159
|
+
end
|
160
|
+
|
161
|
+
end
|
162
|
+
|
163
|
+
end
|
164
|
+
|
165
|
+
class Hash # :nodoc:
|
166
|
+
|
167
|
+
# Gets the value from the Hash, regardless if it's stored with a symbol
|
168
|
+
# or a key as a string. You may supply a default value - if the default
|
169
|
+
# is set to false, the method will raise an argument error. By default, it
|
170
|
+
# will return nil if the element isn't found.
|
171
|
+
def get_value(sym_or_string, default = nil)
|
172
|
+
value = self[sym_or_string.to_s] || self[sym_or_string.to_sym] || default
|
173
|
+
raise(ArgumentError, "No value set for #{sym_or_string}") if((default == false) && !value)
|
174
|
+
value
|
175
|
+
end
|
176
|
+
|
177
|
+
end
|
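The daemon check relies on sending signal 0 to the recorded pid. That probes for the process without delivering anything. The sketch below isolates the technique. `process_alive?` and `stale_pidfile?` are hypothetical names, and treating a non-positive pid as stale is an assumption of this sketch, not the gem's behaviour.

```ruby
# Returns true if a process with the given pid exists.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0 only checks for existence, nothing is delivered
  true
rescue Errno::EPERM    # the process exists but belongs to another user
  true
rescue Errno::ESRCH    # no such process
  false
end

# Hypothetical helper: does the pidfile point to a process that is gone?
# A placeholder such as "starting" converts to 0 and is treated as stale,
# because signalling pid 0 would address the caller's own process group.
def stale_pidfile?(path)
  return false unless File.exist?(path)
  pid = File.read(path).to_i
  pid <= 0 || !process_alive?(pid)
end

puts process_alive?(Process.pid)
```

A cron job can use such a check to decide whether to start a fresh daemon or to exit because the old one is still running.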
data/lib/watcher/base.rb
ADDED
@@ -0,0 +1,96 @@
|
|
1
|
+
module Watcher
|
2
|
+
|
3
|
+
# Base class for all Watcher.
|
4
|
+
#
|
5
|
+
# A watcher checks for a condition (e.g., if a web site responds, if a log file shows signs of
|
6
|
+
# trouble). Each watcher will have one or more actions attached that will be called if the
|
7
|
+
# watched condition is triggered.
|
8
|
+
#
|
9
|
+
# Each watcher will accept the following options, which are handled by the superclass:
|
10
|
+
#
|
11
|
+
# severity - Severity of the event. Each time the event is triggered, the watcher will
|
12
|
+
# add this value to the internal "severity". If the internal severity reaches
|
13
|
+
# 100, the action is triggered. This means that with a severity of 100 the
|
14
|
+
# action is run each time the watcher triggers. With a severity of 1, it is
|
15
|
+
# only executed every 100th time. The global mechanism will reset the
|
16
|
+
# severity once the action is triggered. The watcher class may decide
|
17
|
+
# to reset the severity also on other occasions. Default: 100
|
18
|
+
#
|
19
|
+
# actions - The actions that should be executed when the watcher triggers. These
|
20
|
+
# are names of actions that have been set up previously. (Required)
|
21
|
+
#
|
22
|
+
# warn_actions - Additional actions that are executed if the watcher triggers, but the
|
23
|
+
# severity for a real action is not yet reached.
|
24
|
+
#
|
25
|
+
# Each watcher object must respond to the #watch_it! method. It must check the watched condition
|
26
|
+
# and return nil or false if the condition is not met. If the condition is met, it may return
|
27
|
+
# true or an error message.
|
28
|
+
#
|
29
|
+
# The Watcher _may_ also respond to the #cleanup method - which will be used to clean
|
30
|
+
# up all existing Watcher on a clean shutdown.
|
31
|
+
class Base
|
32
|
+
attr_accessor :severity
|
33
|
+
attr_accessor :name
|
34
|
+
|
35
|
+
# Sets up all actions for this watcher
|
36
|
+
def setup_actions(configuration)
|
37
|
+
action_config = configuration.get_value(:actions)
|
38
|
+
raise(ArgumentError, "No actions passed to watcher.") unless(action_config)
|
39
|
+
if(action_config.is_a?(Array))
|
40
|
+
action_config.each { |ac| actions << ac }
|
41
|
+
else
|
42
|
+
assit(action_config.is_a?(String) || action_config.is_a?(Symbol))
|
43
|
+
actions << action_config
|
44
|
+
end
|
45
|
+
warn_config = configuration.get_value(:warn_actions)
|
46
|
+
if(warn_config.is_a?(Array))
|
47
|
+
warn_config.each { |ac| warn_actions << ac }
|
48
|
+
elsif(warn_config)
|
49
|
+
assit_kind_of(String, warn_config)
|
50
|
+
warn_actions << warn_config
|
51
|
+
end
|
52
|
+
end
|
53
|
+
|
54
|
+
private
|
55
|
+
|
56
|
+
# Checks the trigger and does everything to call the actions connected to this watcher
|
57
|
+
def do_watch!
|
58
|
+
@current_severity ||= 0
|
59
|
+
result = watch_it!
|
60
|
+
return unless(result)
|
61
|
+
|
62
|
+
event = WatcherEvent.new
|
63
|
+
event.watcher = self
|
64
|
+
event.message = result unless(result.is_a?(TrueClass))
|
65
|
+
event.timestamp = Time.now
|
66
|
+
|
67
|
+
@current_severity = @current_severity + severity
|
68
|
+
if(@current_severity >= 100)
|
69
|
+
run_actions(event)
|
70
|
+
@current_severity = 0
|
71
|
+
else
|
72
|
+
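warn_actions.each { |ac| WatcherAction.run_action(ac, event) } # the zero-argument warn_actions defined below replaces warn_actions(event), so iterate directly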
|
73
|
+
end
|
74
|
+
|
75
|
+
@last_run = Time.now
|
76
|
+
end
|
77
|
+
|
78
|
+
# Executes all actions for this watcher
|
79
|
+
def run_actions(event)
|
80
|
+
actions.each { |ac| WatcherAction.run_action(ac, event) }
|
81
|
+
end
|
82
|
+
|
83
|
+
def warn_actions(event)
|
84
|
+
warn_actions.each { |ac| WatcherAction.run_action(ac, event) }
|
85
|
+
end
|
86
|
+
|
87
|
+
def actions
|
88
|
+
@actions ||= []
|
89
|
+
end
|
90
|
+
|
91
|
+
def warn_actions
|
92
|
+
@warn_actions ||= []
|
93
|
+
end
|
94
|
+
end
|
95
|
+
|
96
|
+
end
|
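The severity mechanism described in the class comment is a simple accumulator: each triggered check adds the watcher's severity, and the real actions run once the total reaches 100. The following sketch models only that counter. `SeverityCounter` is a made-up class for illustration, not part of the gem.

```ruby
# Minimal model of the severity accumulator.
class SeverityCounter
  THRESHOLD = 100

  attr_reader :current

  def initialize(severity)
    @severity = severity
    @current = 0
  end

  # Record one triggered check. Returns :action when the threshold is
  # reached (and resets the counter), :warn otherwise.
  def trigger!
    @current += @severity
    if @current >= THRESHOLD
      @current = 0
      :action
    else
      :warn
    end
  end

  # A watcher may reset the counter itself, e.g. after a successful check.
  def reset!
    @current = 0
  end
end

counter = SeverityCounter.new(50)
puts counter.trigger! # first trigger stays below the threshold
puts counter.trigger! # second trigger reaches 100
```

With a severity of 100 every trigger runs the actions. With a severity of 1 only every 100th trigger does, and the other 99 run the warn actions.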
data/lib/watcher/http_watcher.rb
ADDED
@@ -0,0 +1,52 @@
|
|
1
|
+
require 'net/http'
|
2
|
+
|
3
|
+
module Watcher
|
4
|
+
|
5
|
+
# Checks whether an HTTP connection is active and returns the expected results.
|
6
|
+
# Options for this watcher:
|
7
|
+
#
|
8
|
+
# url - The URL to query (required)
|
9
|
+
# response - The response code that is expected from the operation
|
10
|
+
# content_match - A regular expression that is matched against the result.
|
11
|
+
# The watcher fails if the expression doesn't match
|
12
|
+
# timeout - The timeout for the connection attempt. Defaults to 10 sec
|
13
|
+
#
|
14
|
+
# If neither response nor content_match are given, the watcher will expect a
|
15
|
+
# 200 OK response from the server.
|
16
|
+
#
|
17
|
+
# This watcher resets the current severity on each successful connect, so that
|
18
|
+
# only continuous failures count against the trigger condition.
|
19
|
+
class HttpWatcher < Watcher::Base
|
20
|
+
|
21
|
+
def initialize(config)
|
22
|
+
@url = config.get_value(:url, false)
|
23
|
+
match = config.get_value(:content_match)
|
24
|
+
@content_match = Regexp.new(match) if(match)
|
25
|
+
response = config.get_value(:response)
|
26
|
+
@response = ((!response && !match) ? "200" : response)
|
27
|
+
@timeout = config.get_value(:timeout, 10).to_i
|
28
|
+
end
|
29
|
+
|
30
|
+
def watch_it!
|
31
|
+
url = URI.parse(@url)
|
32
|
+
res = Net::HTTP.start(url.host, url.port) do |http|
|
33
|
+
http.read_timeout = @timeout
|
34
|
+
http.get(url.request_uri)
|
35
|
+
end
|
36
|
+
test_failed = false
|
37
|
+
if(@response && (@response.to_s != res.code))
|
38
|
+
test_failed = "Unexpected HTTP response: #{res.code} for #{@url} - expected #{@response}"
|
39
|
+
elsif(@content_match && !@content_match.match(res.body))
|
40
|
+
test_failed = "Did not find #{@content_match.to_s} at #{@url}"
|
41
|
+
end
|
42
|
+
@current_severity = 0 unless(test_failed)
|
43
|
+
dog_log.debug('HttpWatcher') { "Watch of #{@url} resulted in #{test_failed}" }
|
44
|
+
test_failed
|
45
|
+
rescue Exception => e
|
46
|
+
dog_log.debug('HttpWatcher') { "Watch of #{@url} had exception #{e.message}"}
|
47
|
+
"Exception on connect to #{@url} - #{e.message}"
|
48
|
+
end
|
49
|
+
|
50
|
+
end
|
51
|
+
|
52
|
+
end
|
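The decision logic of the HTTP watcher can be separated from the network call, which makes it easy to try out. The sketch below is an illustration under stated assumptions, not the gem's code: `evaluate_response` and `check_url` are hypothetical names, and the keyword arguments are chosen for this example only.

```ruby
require 'net/http'
require 'uri'

# Pure check: returns false when everything is fine, an error message otherwise.
def evaluate_response(code, body, expected_code: nil, content_match: nil)
  # With no expectations at all, a plain 200 is required.
  expected_code = '200' if expected_code.nil? && content_match.nil?
  if expected_code && expected_code.to_s != code.to_s
    "Unexpected HTTP response: #{code} - expected #{expected_code}"
  elsif content_match && !content_match.match(body.to_s)
    "Did not find #{content_match.source}"
  else
    false
  end
end

# Fetches the URL and applies the check; any connection problem counts as a failure.
def check_url(url_string, timeout: 10, **expectations)
  uri = URI.parse(url_string)
  options = { use_ssl: uri.scheme == 'https', open_timeout: timeout, read_timeout: timeout }
  response = Net::HTTP.start(uri.host, uri.port, options) do |http|
    http.get(uri.request_uri) # request_uri includes the query and is never empty
  end
  evaluate_response(response.code, response.body, **expectations)
rescue StandardError => e
  "Exception on connect to #{url_string} - #{e.message}"
end

puts evaluate_response('500', '').inspect
```

Comparing the codes as strings avoids a false alarm when the expected code is written as a number in the YAML file.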
data/lib/watcher/log_watcher.rb
ADDED
@@ -0,0 +1,97 @@
|
|
1
|
+
require 'file/tail'
|
2
|
+
|
3
|
+
module Watcher
|
4
|
+
|
5
|
+
# Watches a log file for a given regular expression. This watcher will start a background thread that
|
6
|
+
# tails the log and matches against the regexp.
|
7
|
+
#
|
8
|
+
# The current implementation is not tested against logs with very high load.
|
9
|
+
#
|
10
|
+
# = Options
|
11
|
+
#
|
12
|
+
# logfile - The log file to watch (required)
|
13
|
+
# match - A regular expression against which the log file will be matched (required)
|
14
|
+
# reopen_after - A number indicating after how many "unchanged" checks the file will
|
15
|
+
# be reopened. Defaults to 10.
|
16
|
+
# interval_first, interval_max are the start and the max value for waiting on an unchanged
|
17
|
+
# log file. They default to 60 (1 minute) and 300 (5 minutes). The log file will
|
18
|
+
# be considered stale and reopened after interval_max * 3.
|
19
|
+
#
|
20
|
+
# = Warning
|
21
|
+
#
|
22
|
+
# Depending on your Ruby implementation and platform, the background thread may not be
|
23
|
+
# taken down if Watchdogger explodes during runtime.
|
24
|
+
#
|
25
|
+
# On a clean exit, all threads will be cleanly shut down, but if you kill -9 it,
|
26
|
+
# you may want to check for any rogue processes.
|
27
|
+
class LogWatcher < Watcher::Base
|
28
|
+
|
29
|
+
def initialize(options)
|
30
|
+
@file_name = options.get_value(:logfile, false)
|
31
|
+
@match_str = options.get_value(:match, false)
|
32
|
+
|
33
|
+
@interval_first = options.get_value(:interval_first, 60).to_i
|
34
|
+
@interval_max = options.get_value(:interval_max, 300).to_i
|
35
|
+
|
36
|
+
watch_log # Start the watcher
|
37
|
+
end
|
38
|
+
|
39
|
+
def watch_it!
|
40
|
+
unless(@log_watcher.status) # Restart the watcher if killed for some reason
|
41
|
+
dog_log.warn { "Log watcher on #{@file_name} died? Restarting..." }
|
42
|
+
watch_log
|
43
|
+
end
|
44
|
+
is_triggered = false
|
45
|
+
if(triggered?)
|
46
|
+
is_triggered = "Found #{@match_str} in #{@file_name}"
reset_trigger # clear the flag so that each match is reported only once
|
47
|
+
end
|
48
|
+
is_triggered
|
49
|
+
rescue Exception => e
|
50
|
+
"Exception running the log watcher: #{e}"
|
51
|
+
end
|
52
|
+
|
53
|
+
def cleanup
|
54
|
+
@log_watcher.kill if(@log_watcher && @log_watcher.alive?)
|
55
|
+
end
|
56
|
+
|
57
|
+
private
|
58
|
+
|
59
|
+
def triggered?
|
60
|
+
@triggered
|
61
|
+
end
|
62
|
+
|
63
|
+
def triggered!
|
64
|
+
@my_mutex.synchronize { @triggered = true }
|
65
|
+
end
|
66
|
+
|
67
|
+
def reset_trigger
|
68
|
+
@my_mutex.synchronize { @triggered = false }
|
69
|
+
end
|
70
|
+
|
71
|
+
def watch_log
|
72
|
+
@my_mutex = Mutex.new
|
73
|
+
if(@log_watcher)
|
74
|
+
dog_log.warn { "Existing log watcher on #{@file_name}, killing."}
|
75
|
+
@log_watcher.kill
|
76
|
+
end
|
77
|
+
@log_watcher = Thread.new(@match_str, @file_name) do |match_str, file_name|
|
78
|
+
matcher = Regexp.new(match_str)
|
79
|
+
logfile = File::Tail::Logfile.open(file_name,
|
80
|
+
:backward => 1,
|
81
|
+
:reopen_deleted => true,
|
82
|
+
:interval => @interval_first,
|
83
|
+
:max_interval => @interval_max,
|
84
|
+
:reopen_suspicious => true,
|
85
|
+
:suspicious_interval => (@interval_max * 3)
|
86
|
+
)
|
87
|
+
logfile.tail do |line|
|
88
|
+
if(matcher.match(line))
|
89
|
+
triggered!
|
90
|
+
end
|
91
|
+
end
|
92
|
+
end
|
93
|
+
dog_log.debug { "Log watcher thread started for #{@file_name}" }
|
94
|
+
end
|
95
|
+
|
96
|
+
end
|
97
|
+
end
|
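The log watcher hands a "match seen" flag from a background thread to the main loop. The sketch below shows that hand-over with a mutex and a read-and-clear operation, so that a match is reported once. It leaves out the file tailing: lines are fed in directly, and `LineTrigger` is a hypothetical class, not part of the gem.

```ruby
# Thread-safe flag that is set when a line matches the pattern.
class LineTrigger
  def initialize(pattern)
    @pattern = Regexp.new(pattern)
    @mutex = Mutex.new
    @triggered = false
  end

  # Called from the reader thread for every new line.
  def feed(line)
    @mutex.synchronize { @triggered = true } if @pattern.match(line)
  end

  # Called from the main loop: returns whether a match was seen since
  # the last call and clears the flag in the same critical section.
  def take!
    @mutex.synchronize do
      seen = @triggered
      @triggered = false
      seen
    end
  end
end

trigger = LineTrigger.new('ERROR|OutOfMemory')
reader = Thread.new do
  ['INFO started', 'ERROR boom'].each { |line| trigger.feed(line) }
end
reader.join
puts trigger.take!
puts trigger.take!
```

Reading and clearing under the same lock means a match that arrives between two checks is neither lost nor reported twice.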
data/lib/watcher.rb
ADDED
@@ -0,0 +1,51 @@
|
|
1
|
+
# Handling for the Watcher objects in the system. The Watcher objects are not accessed directly,
|
2
|
+
# but handled through the static methods of this module. The module will also keep track
|
3
|
+
# of the watcher state and wrap the invocation procedure.
|
4
|
+
module Watcher
|
5
|
+
|
6
|
+
# Number of times the Watcher were called
|
7
|
+
@@watch_runs = 0
|
8
|
+
|
9
|
+
class << self
|
10
|
+
|
11
|
+
# Create a new watcher with the given configuration. The type identifies the watcher class
|
12
|
+
# that should be used.
|
13
|
+
#
|
14
|
+
# This will *not* return the watcher object, as it is not to be used externally. The watcher
|
15
|
+
# will be registered internally, and called when the #watch_all! method is called
|
16
|
+
def register(name, config_options)
|
17
|
+
assit_kind_of(String, name)
|
18
|
+
raise(ArgumentError, "Illegal options") unless(config_options.is_a?(Hash))
|
19
|
+
type = config_options.get_value(:type, false)
|
20
|
+
type = WatchDogger.camelize(type)
|
21
|
+
watcher = Watcher.const_get(type).new(config_options)
|
22
|
+
watcher.setup_actions(config_options)
|
23
|
+
severity = config_options.get_value(:severity, 100).to_i
|
24
|
+
watcher.severity = severity
|
25
|
+
watcher.name = name
|
26
|
+
registered_watchers << watcher
|
27
|
+
dog_log.debug('Watcher') { "Registered Watcher of type #{type}" }
|
28
|
+
end
|
29
|
+
|
30
|
+
# This will execute all registered Watcher, which, in turn, will execute their
|
31
|
+
# actions if necessary. Normally, this will run all Watcher each time this method is
|
32
|
+
# called. However, the watcher may implement conditions on which the check is skipped.
|
33
|
+
def watch_all!
|
34
|
+
@last_check = Time.now
|
35
|
+
registered_watchers.each { |w| w.send(:do_watch!) }
|
36
|
+
end
|
37
|
+
|
38
|
+
# Cleans up all Watcher
|
39
|
+
def cleanup_watchers
|
40
|
+
registered_watchers.each { |w| w.cleanup if(w.respond_to?(:cleanup)) }
|
41
|
+
end
|
42
|
+
|
43
|
+
private
|
44
|
+
|
45
|
+
def registered_watchers
|
46
|
+
@registered_watchers ||= []
|
47
|
+
end
|
48
|
+
|
49
|
+
end # end class methods
|
50
|
+
|
51
|
+
end
|
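Registration maps the `type` string from the configuration to a class inside a namespace. A minimal sketch of that lookup follows. The `SketchWatchers` module, its empty `HttpWatcher` class, and the helpers `camelize_type` and `build_watcher` are all invented for the example and stand in for the real watcher classes.

```ruby
# Stand-in namespace with one watcher class (hypothetical).
module SketchWatchers
  class HttpWatcher
    attr_reader :options

    def initialize(options)
      @options = options
    end
  end
end

# "http_watcher" becomes "HttpWatcher".
def camelize_type(name)
  name.to_s.split('_').map(&:capitalize).join
end

# Look up the class named by the "type" option and instantiate it.
def build_watcher(namespace, config)
  type = config.fetch('type') # raises KeyError if the type is missing
  namespace.const_get(camelize_type(type)).new(config)
end

watcher = build_watcher(SketchWatchers, 'type' => 'http_watcher', 'url' => 'http://www.mysite.some/')
puts watcher.class
```

An unknown type surfaces as a `NameError` from `const_get`, which is a reasonable place to report a configuration mistake.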