averell23-watchdogger 0.1.0
- data/CHANGES +0 -0
- data/LICENSE +676 -0
- data/README.rdoc +86 -0
- data/bin/watchdogger +54 -0
- data/lib/dog_log.rb +52 -0
- data/lib/watchdogger.rb +177 -0
- data/lib/watcher/base.rb +96 -0
- data/lib/watcher/http_watcher.rb +52 -0
- data/lib/watcher/log_watcher.rb +97 -0
- data/lib/watcher.rb +51 -0
- data/lib/watcher_action/htpost.rb +42 -0
- data/lib/watcher_action/kill_process.rb +24 -0
- data/lib/watcher_action/log_action.rb +27 -0
- data/lib/watcher_action/send_mail.rb +67 -0
- data/lib/watcher_action.rb +41 -0
- data/lib/watcher_event.rb +23 -0
- data/sample_config.yml +16 -0
- data/test/http_watcher_test.rb +63 -0
- data/test/kill_process_test.rb +29 -0
- data/test/log_watcher_test.rb +38 -0
- data/test/test_helper.rb +6 -0
- data/test/watchdogger_test.rb +82 -0
- metadata +119 -0
data/README.rdoc
ADDED
@@ -0,0 +1,86 @@
+= Watchdogger
+
+Watchdogger is a simple Ruby program that will monitor your log file or web
+page (and potentially other stuff, too) and take action if something is amiss.
+It is intentionally simple, so there is less chance of having things go wrong
+with the watchdog itself.
+
+If you want to build more complicated reporting or such, you could do so
+separately and act on notifications from the watchdog.
+
+= Installation
+
+Should be a matter of a simple
+
+  gem install averell23-watchdogger
+
+= Quick start
+
+You will have to make a configuration file like this:
+
+  # Actions you want to execute
+  actions:
+    log_event:
+      type: log_action
+      severity: info
+    restart_server:
+      type: kill_process
+      pidfile: /var/pid/server.pid
+  # Watchers: Things you want to monitor
+  watchers:
+    check_my_site:
+      type: http_watcher
+      url: http://www.mysite.some/
+      content_match: 'some.*other?string'
+      actions:
+        - log_event
+        - restart_server
+  # Run each minute
+  interval: 60
+  logfile: /mystuff/dogger.log
+  log_level: info
+
+If your configuration file is called watcher.yml, you could call the watchdog script with
+
+  watchdogger -c watcher.yml
+
+(If you omit the log file from the configuration, you'll also see the log
+output on your console).
+
+The configuration above will check each minute if the web site responds and if the body
+matches the given regular expression. If not, it will log a message and restart
+the server.
+
+There are other options too, just check the API docs for more.
+
+To find out the options of the program, call
+
+  watchdogger --help
+
+= Why Watchdogger?
+
+I wrote this to monitor our instance of Tomcat. While there are other scripts
+around, they seem to be either "quick" shell scripts that require a standard
+Linux layout, or potentially "Enterprise" solutions with a lot of overhead.
+
+This script is just plain Ruby and does not depend on external libraries. So
+you should be able to install the gem, set up your configuration and be
+good to go. It doesn't even assume a particular install location.
+
+(In the future there may be some actions or things that require JRuby, to
+monitor Java servers)
+
+= What if Watchdogger dies?
+
+Watchdogger has been designed to run as a daemon, instead of being a cron job
+that needs to be called every few minutes. This has several advantages:
+We can have state information throughout the session, we can spawn helper
+processes if needed, etc.
+
+However, the Watchdogger process may die for some reason, which would not be
+a good thing.
+
+There is an easy solution, though: Just *do* install Watchdogger
+as a cron job. It will check if the old daemon process is still running and
+exit with code 1 if that's the case. If the old process is stale, it will
+restart itself.
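The configuration layout from the README above can be sanity-checked before it is handed to the watchdog. The following is only a sketch and not part of the gem: `undefined_actions` is a hypothetical helper, and the check assumes that every name listed under a watcher's `actions` (or `warn_actions`) should also appear as a key under the top-level `actions`.

```ruby
require 'yaml'

# Sample configuration, copied from the README above.
SAMPLE_CONFIG = <<~CONFIG
  actions:
    log_event:
      type: log_action
      severity: info
    restart_server:
      type: kill_process
      pidfile: /var/pid/server.pid
  watchers:
    check_my_site:
      type: http_watcher
      url: http://www.mysite.some/
      content_match: 'some.*other?string'
      actions:
        - log_event
        - restart_server
  interval: 60
  logfile: /mystuff/dogger.log
  log_level: info
CONFIG

# Hypothetical helper (not part of Watchdogger): maps each watcher name to
# the action names it refers to that are not defined under "actions".
def undefined_actions(config)
  defined = (config['actions'] || {}).keys
  problems = {}
  (config['watchers'] || {}).each do |name, options|
    used = Array(options['actions']) + Array(options['warn_actions'])
    missing = used.map(&:to_s) - defined
    problems[name] = missing unless missing.empty?
  end
  problems
end

config = YAML.safe_load(SAMPLE_CONFIG)
puts undefined_actions(config).inspect
```

An empty hash means every watcher only refers to actions that exist. Running a check like this before starting the daemon would surface a misspelled action name early, rather than at the moment a watcher triggers.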
data/bin/watchdogger
ADDED
@@ -0,0 +1,54 @@
+#!/usr/bin/env ruby
+$: << File.expand_path(File.join(File.dirname(__FILE__), '..', 'lib'))
+require 'watchdogger'
+
+module DoggerFlags extend OptiFlagSet
+  optional_flag 'config' do
+    alternate_forms 'c'
+    long_form 'configuration'
+    description 'Configuration file from which to read the servers to guard. Defaults to .watchdogger.yml in your home directory.'
+  end
+
+  optional_switch_flag 'daemon' do
+    alternate_forms 'd'
+    long_form 'daemonize'
+    description 'Run as a daemon, using the configured pidfile or .watchdogger.pid in your home directory'
+  end
+
+  optional_switch_flag 'shutdown' do
+    alternate_forms 's'
+    long_form 'shutdown'
+    description 'Shutdown the daemon. You must give the -daemon flag in the same way as on startup'
+  end
+
+  optional_switch_flag 'status' do
+    description 'Get the daemon status'
+  end
+
+  and_process!
+end
+
+flags = DoggerFlags.flags
+config_file_name = flags.config || File.join(Etc.getpwuid.dir, '.watchdogger.yml')
+
+begin
+  config = YAML.load_file(config_file_name)
+  WatchDogger.init_system(config)
+  if(flags.status)
+    puts WatchDogger.check_daemon ? "The daemon is running." : "The daemon seems to be dead."
+  elsif(flags.shutdown)
+    puts "Shutting down the old daemon"
+    WatchDogger.shutdown_daemon
+  elsif(flags.daemon)
+    puts "Starting daemon."
+    WatchDogger.daemon
+  else
+    puts "Starting in foreground."
+    WatchDogger.watch_loop
+  end
+rescue SystemExit
+  # nothing
+rescue Exception => e
+  puts "Problem with this command: #{e.message}. Exiting."
+  exit(1)
+end
data/lib/dog_log.rb
ADDED
@@ -0,0 +1,52 @@
+require 'logger'
+
+# Logging facility for the watchdog
+class DogLog # :nodoc:
+
+  class << self
+
+    # Set the log file and severity. This will reset the current logger,
+    # but should not usually be called on an active log.
+    def setup(logfile, severity)
+      @logfile = logfile
+      @severity = severity
+      if(@logger)
+        assit_fail('Resetting logfile')
+        @logger.close if(@logger.respond_to?(:close))
+        @logger = nil
+      end
+    end
+
+    # If nothing is configured, we log to STDERR by default
+    def logger
+      @logger ||= begin
+        @logfile ||= STDERR
+        severity = @severity || Logger::DEBUG
+        logger = Logger.new(get_log_io, 3)
+        logger.level = severity
+        logger
+      end
+    end
+
+    def get_log_io
+      return @logfile if(@logfile.kind_of?(IO))
+      if(%w(STDOUT STDERR).include?(@logfile.to_s.upcase)) # Standard streams may be given by name
+        const = Object.const_get(@logfile.to_s.upcase)
+        return const if(const.kind_of?(IO))
+      end
+      File.open(@logfile, 'a')
+    end
+
+
+  end
+end
+
+class Object # :nodoc:
+  def self.dog_log
+    DogLog.logger
+  end
+
+  def dog_log
+    DogLog.logger
+  end
+end
data/lib/watchdogger.rb
ADDED
@@ -0,0 +1,177 @@
+require 'rubygems'
+require 'optiflag'
+require 'etc'
+require 'yaml'
+require 'assit'
+
+# require our own stuff
+lib_dir = File.expand_path(File.dirname(__FILE__))
+$: << lib_dir
+require 'dog_log'
+require 'watcher'
+require 'watcher_action'
+require 'watcher_event'
+
+# Require the watchers
+Dir[File.join(lib_dir, 'watcher', '*.rb')].each { |f| require 'watcher/' + File.basename(f, '.rb') }
+# Require the actions
+Dir[File.join(lib_dir, 'watcher_action', '*.rb')].each { |f| require 'watcher_action/' + File.basename(f, '.rb') }
+
+module WatchDogger # :nodoc:
+
+  class << self
+
+    # Initializes the watchdog system, sets up the log. In addition to the configured
+    # Watchers and WatcherActions, the system can take the following arguments:
+    #
+    # log_level - The log level for the system log. This will apply to all log messages
+    # logfile - The log file for the system. Defaults to STDERR
+    # interval - The watch interval in seconds. Defaults to 60
+    def init_system(options)
+      # First setup the logging options
+      @log_level = options.get_value(:log_level)
+      @logfile = options.get_value(:logfile)
+      DogLog.setup(@logfile, @log_level)
+
+      # Now setup the actions
+      actions = options.get_value(:actions, false)
+      raise(ArgumentError, "Actions not configured correctly.") unless(actions.is_a?(Hash))
+      actions.each do |act_name, act_options|
+        WatcherAction.register(act_name, act_options)
+      end
+
+      # Set up the watchers
+      watchers = options.get_value(:watchers, false)
+      raise(ArgumentError, "Watchers not configured correctly.") unless(watchers.is_a?(Hash))
+      watchers.each do |watch_name, watch_options|
+        Watcher.register(watch_name, watch_options)
+      end
+
+      dog_log.info('Watchdogger') { 'System Initialized' }
+      @watch_interval = options.get_value(:interval, 60).to_i
+      @pidfile = options.get_value(:pidfile) || File.join(Etc.getpwuid.dir, '.watchdogger.pid')
+      @pidfile = File.expand_path(@pidfile)
+    end
+
+
+    # This is the main loop of the watcher
+    def watch_loop
+      signal_traps
+      dog_log.info('Watchdogger') { "Starting watch loop with interval #{@watch_interval}"}
+      loop do
+        Watcher.watch_all!
+        sleep(@watch_interval)
+      end
+    end
+
+    # Run as a daemon
+    def daemon
+      raise(RuntimeError, "Daemon still running") if(check_daemon)
+      raise(ArgumentError, "Not daemonizing without a logfile") unless(@logfile && @logfile.upcase != 'STDOUT' && @logfile.upcase != 'STDERR')
+      # Test the file opening before going daemon, so we know that
+      # it should usually work
+      File.open(@pidfile, 'w') { |io| io << 'starting' }
+      daemonize
+      # Now write the pid for real
+      File.open(@pidfile, 'w') { |io| io << Process.pid.to_s }
+      dog_log.info('Watchdogger') { "Running as daemon with pid #{Process.pid}" }
+      watch_loop
+    end
+
+    # By default, camelize converts strings to UpperCamelCase. If the second argument
+    # is set to false then camelize produces lowerCamelCase.
+    #
+    # camelize will also convert '/' to '::' which is useful for converting paths to namespaces
+    #
+    # Examples
+    #   camelize("active_record")                #=> "ActiveRecord"
+    #   camelize("active_record", false)         #=> "activeRecord"
+    #   camelize("active_record/errors")         #=> "ActiveRecord::Errors"
+    #   camelize("active_record/errors", false)  #=> "activeRecord::Errors"
+    def camelize(lower_case_and_underscored_word, first_letter_in_uppercase = true) # :nodoc:
+      if first_letter_in_uppercase
+        lower_case_and_underscored_word.to_s.gsub(/\/(.?)/) { "::" + $1.upcase }.gsub(/(^|_)(.)/) { $2.upcase }
+      else
+        lower_case_and_underscored_word.to_s[0, 1] + camelize(lower_case_and_underscored_word)[1..-1]
+      end
+    end
+
+    # Shutdown the given daemon
+    def shutdown_daemon
+      Process.kill('TERM', get_pid)
+    end
+
+    # Check for running daemon. Returns true if the system thinks that the
+    # daemon is still running.
+    def check_daemon
+      return false unless(File.exist?(@pidfile))
+      pid = get_pid
+      begin
+        Process.kill(0, pid)
+        true
+      rescue Errno::EPERM
+        true
+      rescue Errno::ESRCH
+        dog_log.info('Watchdogger') { "Old process #{pid} is stale, good to go." }
+        false
+      rescue Exception => e
+        dog_log.error('Watchdogger') { "Could not find out if process #{pid} still runs (#{e.message}). Hoping for the best..." }
+        false
+      end
+    end
+
+    private
+
+    def get_pid
+      pid = File.open(@pidfile, 'r') { |io| io.read }
+      pid.to_i
+    end
+
+    # Clean shutdown
+    def shutdown
+      dog_log.info('Watchdogger') { "Cleaning watchers..." }
+      Watcher.cleanup_watchers
+      if(File.exist?(@pidfile) && get_pid == Process.pid) # Only remove our own pidfile
+        dog_log.info('Watchdogger') { "Removing pidfile at #{@pidfile}" }
+        File.delete(@pidfile)
+      end
+      dog_log.info('Watchdogger') { "Shutting down."}
+      exit(0)
+    end
+
+    # Setup handler for the signals that should be handled by the script
+    def signal_traps
+      Signal.trap('INT') { shutdown }
+      Signal.trap('TERM') { shutdown }
+      Signal.trap('HUP', 'IGNORE')
+    end
+
+    # File active_support/core_ext/kernel/daemonizing.rb, line 4
+    def daemonize
+      exit if fork                   # Parent exits, child continues.
+      Process.setsid                 # Become session leader.
+      exit if fork                   # Zap session leader. See [1].
+      Dir.chdir "/"                  # Release old working directory.
+      File.umask 0000                # Ensure sensible umask. Adjust as needed.
+      STDIN.reopen "/dev/null"       # Free file descriptors and
+      STDOUT.reopen "/dev/null", "a" # point them somewhere sensible.
+      STDERR.reopen STDOUT           # STDOUT/ERR should better go to a logfile.
+    end
+
+  end
+
+end
+
+class Hash # :nodoc:
+
+  # Gets the value from the Hash, regardless if it's stored with a symbol
+  # or a key as a string. You may supply a default value - if the default
+  # is set to false, the method will raise an argument error. By default, it
+  # will return nil if the element isn't found.
+  def get_value(sym_or_string, default = nil)
+    value = self[sym_or_string.to_s] || self[sym_or_string.to_sym] || default
+    raise(ArgumentError, "No value set for #{sym_or_string}") if((default == false) && !value)
+    value
+  end
+
+end
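The stale-daemon check in `check_daemon` relies on sending signal 0, which delivers nothing and only reports whether the target process exists. The sketch below isolates that idea. `process_alive?` is a hypothetical helper name, and the example assumes a POSIX platform where `fork` is available (the daemon code above makes the same assumption).

```ruby
# Hypothetical helper: true if a process with the given pid appears to exist.
# Signal 0 is never delivered; the kernel only runs its existence and
# permission checks.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::EPERM
  true   # The process exists, but we may not signal it
rescue Errno::ESRCH
  false  # No such process: a pidfile naming this pid would be stale
end

# The current process is certainly running.
own_status = process_alive?(Process.pid)

# Start a child that exits immediately and wait until it has been reaped.
child = fork { exit!(0) }
Process.wait(child)
child_status = process_alive?(child)

puts "own process alive: #{own_status}"
puts "finished child alive: #{child_status}"
```

A pid read from an old pidfile can in principle be reused by an unrelated process, so a check like this can report a false positive. For a cron-based restart that errs on the safe side: the new instance refuses to start rather than running twice.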
data/lib/watcher/base.rb
ADDED
@@ -0,0 +1,96 @@
+module Watcher
+
+  # Base class for all watchers.
+  #
+  # A watcher checks for a condition (e.g., if a web site responds, if a log file shows signs of
+  # trouble). Each watcher will have one or more actions attached that will be called if the
+  # watched condition is triggered.
+  #
+  # Each watcher will accept the following options, which are handled by the superclass:
+  #
+  # severity - Severity of the event. Each time the event is triggered, the watcher will
+  #            add this value to the internal "severity". If the internal severity reaches
+  #            100, the action is triggered. This means that with a severity of 100 the
+  #            action is run each time the watcher triggers. With a severity of 1, it is
+  #            only executed every 100th time. The global mechanism will reset the
+  #            severity once the action is triggered. The watcher class may decide
+  #            to reset the severity also on other occasions. Default: 100
+  #
+  # actions - The actions that should be executed when the watcher triggers. These
+  #           are names of actions that have been set up previously. (Required)
+  #
+  # warn_actions - Additional actions that are executed if the watcher triggers, but the
+  #                severity for a real action is not yet reached.
+  #
+  # Each watcher object must respond to the #watch_it! method. It must check the watched condition
+  # and return nil or false if the condition is not met. If the condition is met, it may return
+  # true or an error message.
+  #
+  # The watcher _may_ also respond to the #cleanup method - which will be used to clean
+  # up all existing watchers on a clean shutdown.
+  class Base
+    attr_accessor :severity
+    attr_accessor :name
+
+    # Sets up all actions for this watcher
+    def setup_actions(configuration)
+      action_config = configuration.get_value(:actions)
+      raise(ArgumentError, "No actions passed to watcher.") unless(action_config)
+      if(action_config.is_a?(Array))
+        action_config.each { |ac| actions << ac }
+      else
+        assit(action_config.is_a?(String) || action_config.is_a?(Symbol))
+        actions << action_config
+      end
+      warn_config = configuration.get_value(:warn_actions)
+      if(warn_config.is_a?(Array))
+        warn_config.each { |ac| warn_actions << ac }
+      elsif(warn_config)
+        assit_kind_of(String, warn_config)
+        warn_actions << warn_config
+      end
+    end
+
+    private
+
+    # Checks the trigger and does everything to call the actions connected to this watcher
+    def do_watch!
+      @current_severity ||= 0
+      result = watch_it!
+      return unless(result)
+
+      event = WatcherEvent.new
+      event.watcher = self
+      event.message = result unless(result.is_a?(TrueClass))
+      event.timestamp = Time.now
+
+      @current_severity = @current_severity + severity
+      if(@current_severity >= 100)
+        run_actions(event)
+        @current_severity = 0
+      else
+        run_warn_actions(event)
+      end
+
+      @last_run = Time.now
+    end
+
+    # Executes all actions for this watcher
+    def run_actions(event)
+      actions.each { |ac| WatcherAction.run_action(ac, event) }
+    end
+
+    # Executes all warn actions; named differently from the #warn_actions list accessor below
+    def run_warn_actions(event)
+      warn_actions.each { |ac| WatcherAction.run_action(ac, event) }
+    end
+
+    def actions
+      @actions ||= []
+    end
+
+    def warn_actions
+      @warn_actions ||= []
+    end
+  end
+
+end
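The severity mechanism described in the comments of `Watcher::Base` amounts to a counter with a fixed threshold of 100. The following sketch reproduces only that bookkeeping; `SeverityCounter` is a hypothetical stand-in for illustration, not a class from the gem.

```ruby
# Hypothetical stand-in for the severity bookkeeping of a watcher.
class SeverityCounter
  THRESHOLD = 100

  def initialize(severity)
    @severity = severity
    @current = 0
  end

  # Call once per triggered check. Returns :action when the threshold is
  # reached (and resets the counter), :warn otherwise.
  def trigger!
    @current += @severity
    if @current >= THRESHOLD
      @current = 0
      :action
    else
      :warn
    end
  end
end

counter = SeverityCounter.new(25)
results = Array.new(8) { counter.trigger! }
puts results.inspect
```

With a severity of 25, three triggers only reach the warn stage and the fourth one runs the real actions, after which the cycle starts over. A watcher such as the HTTP watcher additionally resets the counter after a successful check, so only consecutive failures add up.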
data/lib/watcher/http_watcher.rb
ADDED
@@ -0,0 +1,52 @@
+require 'net/http'
+
+module Watcher
+
+  # Checks an http connection if it is active and returns the expected results.
+  # Options for this watcher:
+  #
+  # url - The URL to query (required)
+  # response - The response code that is expected from the operation
+  # content_match - A regular expression that is matched against the result.
+  #                 The watcher fails if the expression doesn't match
+  # timeout - The read timeout for the request, in seconds. Defaults to 10
+  #
+  # If neither response nor content_match are given, the watcher will expect a
+  # 200 OK response from the server.
+  #
+  # This watcher resets the current severity on each successful connect, so that
+  # only continuous failures count against the trigger condition.
+  class HttpWatcher < Watcher::Base
+
+    def initialize(config)
+      @url = config.get_value(:url, false)
+      match = config.get_value(:content_match)
+      @content_match = Regexp.new(match) if(match)
+      response = config.get_value(:response)
+      @response = ((!response && !match) ? "200" : response)
+      @timeout = config.get_value(:timeout, 10).to_i
+    end
+
+    def watch_it!
+      url = URI.parse(@url)
+      res = Net::HTTP.start(url.host, url.port) do |http|
+        http.read_timeout = @timeout
+        http.get(url.path.empty? ? '/' : url.path) # An URL without a path queries the root
+      end
+      test_failed = false
+      if(@response && (@response.to_s != res.code)) # res.code is a String, the configured value may be a number
+        test_failed = "Unexpected HTTP response: #{res.code} for #{@url} - expected #{@response}"
+      elsif(@content_match && !@content_match.match(res.body))
+        test_failed = "Did not find #{@content_match.to_s} at #{@url}"
+      end
+      @current_severity = 0 unless(test_failed)
+      dog_log.debug('HttpWatcher') { "Watch of #{@url} resulted in #{test_failed}" }
+      test_failed
+    rescue Exception => e
+      dog_log.debug('HttpWatcher') { "Watch of #{@url} had exception #{e.message}"}
+      "Exception on connect to #{@url} - #{e.message}"
+    end
+
+  end
+
+end
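The decision logic of the HTTP watcher can be looked at separately from the network call. The sketch below is a hypothetical helper (`check_response` is not part of the gem) that follows the rules stated in the comments: compare the status code if one is expected, otherwise match the body, and fall back to expecting a 200 when nothing is configured.

```ruby
# Hypothetical helper: returns false if the response is acceptable, or a
# message describing the failure (the same convention as #watch_it!).
def check_response(code, body, expected_code: nil, content_match: nil)
  # With no expectations at all, fall back to requiring a 200 response.
  expected_code = '200' if expected_code.nil? && content_match.nil?
  if expected_code && expected_code.to_s != code.to_s
    "Unexpected HTTP response: #{code} - expected #{expected_code}"
  elsif content_match && !content_match.match(body)
    "Did not find #{content_match.inspect} in the response body"
  else
    false
  end
end

puts check_response('200', 'hello world').inspect
puts check_response('500', 'oops').inspect
puts check_response('200', 'hello world', content_match: /wor.d/).inspect
puts check_response('200', 'hello world', content_match: /missing/).inspect
```

Keeping the evaluation in a function without side effects makes it possible to test the rules without a web server. Comparing the codes as strings avoids a mismatch when the expected code comes from YAML as an integer.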
data/lib/watcher/log_watcher.rb
ADDED
@@ -0,0 +1,97 @@
+require 'file/tail'
+
+module Watcher
+
+  # Watches a log file for a given regular expression. This watcher will start a background thread that
+  # tails the log and matches against the regexp.
+  #
+  # The current implementation is not tested against logs with very high load.
+  #
+  # = Options
+  #
+  # logfile - The log file to watch (required)
+  # match - A regular expression against which the log file will be matched (required)
+  # reopen_after - A number indicating after how many "unchanged" checks the file will
+  #                be reopened. Defaults to 10.
+  # interval_first, interval_max are the start and the max value for waiting on an unchanged
+  #                log file. They default to 60 (1 minute) and 300 (5 minutes). The log file will
+  #                be considered stale and reopened after interval_max * 3.
+  #
+  # = Warning
+  #
+  # Depending on your Ruby implementation and platform, the background thread may not be
+  # taken down if Watchdogger explodes during runtime.
+  #
+  # On a clean exit, all threads will be cleanly shut down, but if you kill -9 it,
+  # you may want to check for any rogue processes.
+  class LogWatcher < Watcher::Base
+
+    def initialize(options)
+      @file_name = options.get_value(:logfile, false)
+      @match_str = options.get_value(:match, false)
+
+      @interval_first = options.get_value(:interval_first, 60).to_i
+      @interval_max = options.get_value(:interval_max, 300).to_i
+
+      watch_log # Start the watcher
+    end
+
+    def watch_it!
+      unless(@log_watcher.status) # Restart the watcher if killed for some reason
+        dog_log.warn { "Log watcher on #{@file_name} died? Restarting..." }
+        watch_log
+      end
+      is_triggered = false
+      if(triggered?)
+        is_triggered = "Found #{@match_str} in #{@file_name}"
+        reset_trigger # Report each match only once
+      end
+      is_triggered
+    rescue Exception => e
+      "Exception running the log watcher: #{e}"
+    end
+
+    def cleanup
+      @log_watcher.kill if(@log_watcher && @log_watcher.alive?)
+    end
+
+    private
+
+    def triggered?
+      @triggered
+    end
+
+    def triggered!
+      @my_mutex.synchronize { @triggered = true }
+    end
+
+    def reset_trigger
+      @my_mutex.synchronize { @triggered = false }
+    end
+
+    def watch_log
+      @my_mutex = Mutex.new
+      if(@log_watcher)
+        dog_log.warn { "Existing log watcher on #{@file_name}, killing."}
+        @log_watcher.kill
+      end
+      @log_watcher = Thread.new(@match_str, @file_name) do |match_str, file_name|
+        matcher = Regexp.new(match_str)
+        logfile = File::Tail::Logfile.open(file_name,
+          :backward => 1,
+          :reopen_deleted => true,
+          :interval => @interval_first,
+          :max_interval => @interval_max,
+          :reopen_suspicious => true,
+          :suspicious_interval => (@interval_max * 3)
+        )
+        logfile.tail do |line|
+          if(matcher.match(line))
+            triggered!
+          end
+        end
+      end
+      dog_log.debug { "Log watcher thread started for #{@file_name}" }
+    end
+
+  end
+end
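The log watcher combines a background thread that scans lines with a flag that the main loop polls. A reduced sketch of that pattern is shown below. It is an illustration under stated assumptions and not the gem's code: `LineFlag` is a hypothetical class, and a `StringIO` stands in for the tailed log file so that the example terminates.

```ruby
require 'stringio'

# Hypothetical miniature of the log watcher: a thread scans lines and
# raises a flag; the caller polls the flag and clears it once it is read.
class LineFlag
  def initialize(io, pattern)
    @mutex = Mutex.new
    @triggered = false
    @thread = Thread.new do
      io.each_line { |line| mark! if pattern.match(line) }
    end
  end

  # Returns true if a match was seen since the last call, then resets the flag.
  def triggered_and_reset!
    @mutex.synchronize do
      was_triggered = @triggered
      @triggered = false
      was_triggered
    end
  end

  # Waits for the scanning thread (only sensible for a finite input).
  def join
    @thread.join
  end

  private

  def mark!
    @mutex.synchronize { @triggered = true }
  end
end

log = StringIO.new("all fine\nERROR: disk full\nall fine again\n")
flag = LineFlag.new(log, /ERROR/)
flag.join
first = flag.triggered_and_reset!
second = flag.triggered_and_reset!
puts first
puts second
```

Reading and clearing the flag inside one synchronized block means a match that arrives between the two steps cannot be lost, which is the reason the sketch merges them into a single method.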
data/lib/watcher.rb
ADDED
@@ -0,0 +1,51 @@
+# Handling for the Watcher objects in the system. The watchers are not accessed directly,
+# but handled through the static methods of this module. The module will also keep track
+# of the watcher state and wrap the invocation procedure.
+module Watcher
+
+  # Number of times the watchers were called
+  @@watch_runs = 0
+
+  class << self
+
+    # Create a new watcher with the given configuration. The type identifies the watcher class
+    # that should be used.
+    #
+    # This will *not* return the watcher object, as it is not to be used externally. The watcher
+    # will be registered internally, and called when the #watch_all! method is called
+    def register(name, config_options)
+      assit_kind_of(String, name)
+      raise(ArgumentError, "Illegal options") unless(config_options.is_a?(Hash))
+      type = config_options.get_value(:type, false)
+      type = WatchDogger.camelize(type)
+      watcher = Watcher.const_get(type).new(config_options)
+      watcher.setup_actions(config_options)
+      severity = config_options.get_value(:severity, 100).to_i
+      watcher.severity = severity
+      watcher.name = name
+      registered_watchers << watcher
+      dog_log.debug('Watcher') { "Registered Watcher of type #{type}" }
+    end
+
+    # This will execute all registered watchers, which, in turn, will execute their
+    # actions if necessary. Normally, this will run all watchers each time this method is
+    # called. However, a watcher may implement conditions on which the check is skipped.
+    def watch_all!
+      @last_check = Time.now
+      registered_watchers.each { |w| w.send(:do_watch!) }
+    end
+
+    # Cleans up all watchers
+    def cleanup_watchers
+      registered_watchers.each { |w| w.cleanup if(w.respond_to?(:cleanup)) }
+    end
+
+    private
+
+    def registered_watchers
+      @registered_watchers ||= []
+    end
+
+  end # end class methods
+
+end
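`Watcher.register` turns the configured `type` string into a class by camelizing it and looking up the constant inside the `Watcher` module. The sketch below shows that lookup in isolation. All names in it (`DemoWatchers`, `camelize_type`, `build_watcher`) are hypothetical, and the camelize step is a simplified version that only handles underscores.

```ruby
# Hypothetical namespace with two stand-in watcher classes.
module DemoWatchers
  class HttpWatcher
    attr_reader :options

    def initialize(options)
      @options = options
    end
  end

  class LogWatcher < HttpWatcher
  end
end

# Turns "http_watcher" into "HttpWatcher" (underscores only, no namespaces).
def camelize_type(name)
  name.to_s.split('_').map(&:capitalize).join
end

# Looks up the class for a configured type and instantiates it. Passing
# false to const_get restricts the lookup to DemoWatchers itself, so a type
# such as "string" cannot resolve to the top-level String class.
def build_watcher(options)
  type = options.fetch('type')
  DemoWatchers.const_get(camelize_type(type), false).new(options)
end

watcher = build_watcher('type' => 'http_watcher', 'url' => 'http://www.mysite.some/')
puts watcher.class
```

An unknown type raises a `NameError` from `const_get`, which a caller could rescue to report a configuration mistake with a friendlier message.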