crash-watch 1.1.0

@@ -0,0 +1,20 @@
1
+ Copyright (c) 2010 Phusion
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,55 @@
1
+ # Introduction
2
+
3
+ * Do you have (server) processes that sometimes crash for mysterious reasons?
4
+ * Can you not figure out why?
5
+ * Do they not print any error messages to their log files upon crashing?
6
+ * Are debuggers complicated, scary things that you'd rather avoid?
7
+
8
+ `crash-watch` to the rescue! This little program will monitor a specified process and wait until it crashes. It will then print useful information such as its exit status, what signal caused it to abort, and its backtrace.
9
+
10
+ ## Installation
11
+
12
+ gem install crash-watch
13
+
14
+ You must also have GDB installed. Mac OS X already has it by default. If you're on Linux, try one of these:
15
+
16
+ apt-get install gdb
17
+ yum install gdb
18
+
19
+ ## Sample usage
20
+
21
+ $ crash-watch <PID>
22
+ Monitoring PID <PID>...
23
+ (...some time later, <PID> exits...)
24
+ Process exited.
25
+ Exit code: 0
26
+ Backtrace:
27
+ Thread 1 (process 95205):
28
+ #0 0x00007fff87ea1db0 in _exit ()
29
+ No symbol table info available.
30
+ #1 0x000000010002a260 in ruby_stop ()
31
+ No symbol table info available.
32
+ #2 0x0000000100031a54 in ruby_run ()
33
+ No symbol table info available.
34
+ #3 0x00000001000009e4 in main ()
35
+ No symbol table info available.
36
+
37
+ While monitoring the process, you may interrupt `crash-watch` by pressing Ctrl-C. `crash-watch` will then detach from the process, which will then continue normally. You may re-attach `crash-watch` later.
38
+
39
+ ## Goodie: GDB controller
40
+
41
+ I've written a small library for controlling gdb, which `crash-watch` uses internally. With CrashWatch::GdbController you can send arbitrary commands to gdb and also get its response.
42
+
43
+ Instantiate with:
44
+
45
+ require 'crash_watch/gdb_controller'
46
+ gdb = CrashWatch::GdbController.new
47
+
48
+ This will spawn a new GDB process. Use `#execute` to execute arbitrary GDB commands. Whatever the command prints to stdout and stderr will be available in the result string.
49
+
50
+ gdb.execute("bt") # => backtrace string
51
+ gdb.execute("p 1 + 2") # => "$1 = 3\n"
52
+
53
+ Call `#close` when you no longer need it.
54
+
55
+ gdb.close
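Because `GdbController.new` spawns a GDB process, it can help to guarantee that `#close` runs even when a command raises, which is the same `begin`/`ensure` shape the `crash-watch` executable uses. The following is a minimal sketch under that assumption; `with_gdb` is a hypothetical helper name and is not part of crash-watch. It accepts any object that responds to `#close`.

```ruby
# Hypothetical helper (not part of crash-watch): yields the controller to
# the block and always calls #close afterwards, even if the block raises.
def with_gdb(gdb)
  yield gdb
ensure
  gdb.close
end

# Intended usage; this part needs the crash-watch gem and GDB installed,
# so it is left as a comment here:
#
#   require 'crash_watch/gdb_controller'
#   with_gdb(CrashWatch::GdbController.new) do |gdb|
#     puts gdb.execute("p 1 + 2")
#   end
```

The helper returns whatever the block returns, so the result of the last `execute` call can be passed straight back to the caller.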
@@ -0,0 +1,76 @@
1
+ #!/usr/bin/env ruby
2
+ require 'optparse'
3
+
4
+ options = {}
5
+ parser = OptionParser.new do |opts|
6
+ opts.banner = "Usage: crash-watch [options] PID"
7
+ opts.separator ""
8
+
9
+ opts.separator "Options:"
10
+ opts.on("-d", "--debug", "Show GDB commands that crash-watch sends.") do
11
+ options[:debug] = true
12
+ end
13
+ opts.on("--dump", "Dump current process backtrace and exit immediately.") do
14
+ options[:dump] = true
15
+ end
16
+ opts.on("-v", "--version", "Show version number.") do
17
+ options[:version] = true
18
+ end
19
+ opts.on("-h", "--help", "Show this help message.") do
20
+ options[:help] = true
21
+ end
22
+ end
23
+ begin
24
+ parser.parse!
25
+ rescue OptionParser::ParseError => e
26
+ puts e
27
+ puts
28
+ puts "Please see '--help' for valid options."
29
+ exit 1
30
+ end
31
+
32
+ if options[:help]
33
+ puts parser
34
+ exit
35
+ elsif options[:version]
36
+ require 'crash_watch/version'
37
+ puts "crash-watch version #{CrashWatch::VERSION_STRING}"
38
+ exit
39
+ elsif ARGV.size != 1
40
+ puts parser
41
+ exit 1
42
+ end
43
+
44
+ require 'crash_watch/gdb_controller'
45
+ gdb = CrashWatch::GdbController.new
46
+ begin
47
+ gdb.debug = options[:debug]
48
+
49
+ # Ruby sometimes uses SIGVTALRM for thread scheduling.
50
+ gdb.execute("handle SIGVTALRM noprint pass")
51
+
52
+ if gdb.attach(ARGV[0])
53
+ if options[:dump]
54
+ puts "Current thread (#{gdb.current_thread}) backtrace:"
55
+ puts " " << gdb.current_thread_backtrace.gsub(/\n/, "\n ")
56
+ puts
57
+ puts "All thread backtraces:"
58
+ puts " " << gdb.all_threads_backtraces.gsub(/\n/, "\n ")
59
+ else
60
+ puts "Monitoring PID #{ARGV[0]}..."
61
+ exit_info = gdb.wait_until_exit
62
+ puts "Process exited at #{Time.now}."
63
+ puts "Exit code: #{exit_info.exit_code}" if exit_info.exit_code
64
+ puts "Signal: #{exit_info.signal}" if exit_info.signaled?
65
+ if exit_info.backtrace
66
+ puts "Backtrace:"
67
+ puts " " << exit_info.backtrace.gsub(/\n/, "\n ")
68
+ end
69
+ end
70
+ else
71
+ puts "Cannot attach to process."
72
+ exit 2
73
+ end
74
+ ensure
75
+ gdb.close
76
+ end
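The reporting branch of the script above can be read as a small pure function over the exit information. Here is a sketch of that idea, assuming a stand-in struct (`ExitInfoSketch`) in place of `CrashWatch::GdbController::ExitInfo` and a hypothetical helper named `exit_report`; the indent width is a parameter rather than a value taken from the script.

```ruby
# Stand-in for CrashWatch::GdbController::ExitInfo so this sketch runs
# without the gem; it exposes the same readers the script relies on.
ExitInfoSketch = Struct.new(:exit_code, :signal, :backtrace) do
  def signaled?
    !!signal
  end
end

# Hypothetical helper: builds the report lines instead of printing them.
# The exit code line is skipped when the code is unknown (nil), and the
# backtrace is indented so it stands out under its heading.
def exit_report(info, indent = "    ")
  lines = []
  lines << "Exit code: #{info.exit_code}" if info.exit_code
  lines << "Signal: #{info.signal}" if info.signaled?
  if info.backtrace
    lines << "Backtrace:"
    lines << indent + info.backtrace.gsub(/\n/, "\n#{indent}")
  end
  lines
end

puts exit_report(ExitInfoSketch.new(nil, "SIGABRT", "#0 abort ()\n#1 main ()"))
```

Returning an array of lines keeps the formatting testable without capturing standard output.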
@@ -0,0 +1,26 @@
1
+ require File.expand_path('lib/crash_watch/version', File.dirname(__FILE__))
2
+
3
+ Gem::Specification.new do |s|
4
+ s.name = "crash-watch"
5
+ s.version = CrashWatch::VERSION_STRING
6
+ s.authors = ["Hongli Lai"]
7
+ s.date = "2010-04-16"
8
+ s.description = "Monitor processes and display useful information when they crash."
9
+ s.summary = "Monitor processes and display useful information when they crash"
10
+ s.email = "hongli@phusion.nl"
11
+ s.files = Dir[
12
+ "README.markdown",
13
+ "LICENSE.txt",
14
+ "crash-watch.gemspec",
15
+ "bin/**/*",
16
+ "lib/**/*",
17
+ "test/**/*"
18
+ ]
19
+ s.homepage = "http://github.com/FooBarWidget/crash-watch"
20
+ s.rdoc_options = ["--charset=UTF-8"]
21
+ s.executables = ["crash-watch"]
22
+ s.require_paths = ["lib"]
23
+ s.add_development_dependency("ffi")
24
+ s.add_development_dependency("rspec")
25
+ end
26
+
@@ -0,0 +1,256 @@
1
+ module CrashWatch
2
+
3
+ class GdbController
4
+ class ExitInfo
5
+ attr_reader :exit_code, :signal, :backtrace, :snapshot
6
+
7
+ def initialize(exit_code, signal, backtrace, snapshot)
8
+ @exit_code = exit_code
9
+ @signal = signal
10
+ @backtrace = backtrace
11
+ @snapshot = snapshot
12
+ end
13
+
14
+ def signaled?
15
+ !!@signal
16
+ end
17
+ end
18
+
19
+ END_OF_RESPONSE_MARKER = '--------END_OF_RESPONSE--------'
20
+
21
+ attr_accessor :debug
22
+
23
+ def initialize
24
+ @pid, @in, @out = popen_command("gdb", "-n", "-q")
25
+ execute("set prompt ")
26
+ end
27
+
28
+ def execute(command_string, timeout = nil)
29
+ raise "GDB session is already closed" if !@pid
30
+ puts "gdb write #{command_string.inspect}" if @debug
31
+ @in.puts(command_string)
32
+ @in.puts("echo \\n#{END_OF_RESPONSE_MARKER}\\n")
33
+ done = false
34
+ result = ""
35
+ while !done
36
+ begin
37
+ if select([@out], nil, nil, timeout)
38
+ line = @out.readline
39
+ puts "gdb read #{line.inspect}" if @debug
40
+ if line == "#{END_OF_RESPONSE_MARKER}\n"
41
+ done = true
42
+ else
43
+ result << line
44
+ end
45
+ else
46
+ close!
47
+ done = true
48
+ result = nil
49
+ end
50
+ rescue EOFError
51
+ done = true
52
+ end
53
+ end
54
+ return result
55
+ end
56
+
57
+ def closed?
58
+ return !@pid
59
+ end
60
+
61
+ def close
62
+ if !closed?
63
+ begin
64
+ execute("detach", 5)
65
+ execute("quit", 5) if !closed?
66
+ rescue Errno::EPIPE
67
+ end
68
+ if !closed?
69
+ @in.close
70
+ @out.close
71
+ Process.waitpid(@pid)
72
+ @pid = nil
73
+ end
74
+ end
75
+ end
76
+
77
+ def close!
78
+ if !closed?
79
+ @in.close
80
+ @out.close
81
+ Process.kill('KILL', @pid)
82
+ Process.waitpid(@pid)
83
+ @pid = nil
84
+ end
85
+ end
86
+
87
+ def attach(pid)
88
+ pid = pid.to_s.strip
89
+ raise ArgumentError if pid.empty?
90
+ result = execute("attach #{pid}")
91
+ return result !~ /(No such process|Unable to access task|Operation not permitted)/
92
+ end
93
+
94
+ def call(code)
95
+ result = execute("call #{code}")
96
+ result =~ /= (.*)$/
97
+ return $1
98
+ end
99
+
100
+ def program_counter
101
+ return execute("p/x $pc").gsub(/.* = /, '')
102
+ end
103
+
104
+ def current_thread
105
+ execute("thread") =~ /Current thread is (.+?) /
106
+ return $1
107
+ end
108
+
109
+ def current_thread_backtrace
110
+ return execute("bt full").strip
111
+ end
112
+
113
+ def all_threads_backtraces
114
+ return execute("thread apply all bt full").strip
115
+ end
116
+
117
+ def ruby_backtrace
118
+ filename = "/tmp/gdb-capture.#{@pid}.txt"
119
+
120
+ orig_stdout_fd_copy = call("(int) dup(1)")
121
+ new_stdout = call(%Q{(void *) fopen("#{filename}", "w")})
122
+ new_stdout_fd = call("(int) fileno(#{new_stdout})")
123
+ call("(int) dup2(#{new_stdout_fd}, 1)")
124
+
125
+ # Let's hope stdout is set to line buffered or unbuffered mode...
126
+ call("(void) rb_backtrace()")
127
+
128
+ call("(int) dup2(#{orig_stdout_fd_copy}, 1)")
129
+ call("(int) fclose(#{new_stdout})")
130
+ call("(int) close(#{orig_stdout_fd_copy})")
131
+
132
+ if File.exist?(filename)
133
+ result = File.read(filename)
134
+ result.strip!
135
+ if result.empty?
136
+ return nil
137
+ else
138
+ return result
139
+ end
140
+ else
141
+ return nil
142
+ end
143
+ ensure
144
+ if filename
145
+ File.unlink(filename) rescue nil
146
+ end
147
+ end
148
+
149
+ def wait_until_exit
150
+ execute("break _exit")
151
+
152
+ signal = nil
153
+ backtraces = nil
154
+ snapshot = nil
155
+
156
+ while true
157
+ result = execute("continue")
158
+ if result =~ /^Program received signal (.+?),/
159
+ signal = $1
160
+ backtraces = execute("thread apply all bt full").strip
161
+ if backtraces.empty?
162
+ backtraces = execute("bt full").strip
163
+ end
164
+ snapshot = yield(self) if block_given?
165
+
166
+ # This signal may or may not be immediately fatal; the
167
+ # signal might be ignored by the process, or the process
168
+ # has some clever signal handler that fixes the state,
169
+ # or maybe the signal handler must run some cleanup code
170
+ # before killing the process. Let's find out by running
171
+ # the next machine instruction.
172
+ old_program_counter = program_counter
173
+ result = execute("stepi")
174
+ if result =~ /^Program received signal .+?,/
175
+ # Yes, it was fatal. Here we don't care whether the
176
+ # instruction caused a different signal. The last
177
+ # one is probably what we're interested in.
178
+ return ExitInfo.new(nil, signal, backtraces, snapshot)
179
+ elsif result =~ /^Program (terminated|exited)/ || result =~ /^Breakpoint .*? _exit/
180
+ # Running the next instruction causes the program to terminate.
181
+ # Not sure what's going on but the previous signal and
182
+ # backtrace are probably what we're interested in.
183
+ return ExitInfo.new(nil, signal, backtraces, snapshot)
184
+ elsif old_program_counter == program_counter
185
+ # The process cannot continue but we're not sure what GDB
186
+ # is telling us.
187
+ raise "Unexpected GDB output: #{result}"
188
+ end
189
+ # else:
190
+ # The signal isn't immediately fatal, so save current
191
+ # status, continue, and check whether the process exits later.
192
+ elsif result =~ /^Program terminated with signal (.+?),/
193
+ if $1 == signal
194
+ # Looks like the signal we trapped earlier
195
+ # caused an exit.
196
+ return ExitInfo.new(nil, signal, backtraces, snapshot)
197
+ else
198
+ return ExitInfo.new(nil, signal, nil, snapshot)
199
+ end
200
+ elsif result =~ /^Breakpoint .*? _exit /
201
+ backtraces = execute("thread apply all bt full").strip
202
+ if backtraces.empty?
203
+ backtraces = execute("bt full").strip
204
+ end
205
+ snapshot = yield(self) if block_given?
206
+ # On OS X, gdb may fail to return from the 'continue' command
207
+ # even though the process exited. Kernel bug? In any case,
208
+ # we put a timeout here so that we don't wait indefinitely.
209
+ result = execute("continue", 10)
210
+ if result =~ /^Program exited with code (\d+)\.$/
211
+ return ExitInfo.new($1.to_i, nil, backtraces, snapshot)
212
+ elsif result =~ /^Program exited normally/
213
+ return ExitInfo.new(0, nil, backtraces, snapshot)
214
+ else
215
+ return ExitInfo.new(nil, nil, backtraces, snapshot)
216
+ end
217
+ elsif result =~ /^Program exited with code (\d+)\.$/
218
+ return ExitInfo.new($1.to_i, nil, nil, nil)
219
+ elsif result =~ /^Program exited normally/
220
+ return ExitInfo.new(0, nil, nil, nil)
221
+ else
222
+ return ExitInfo.new(nil, nil, nil, nil)
223
+ end
224
+ end
225
+ end
226
+
227
+ private
228
+ def popen_command(*command)
229
+ a, b = IO.pipe
230
+ c, d = IO.pipe
231
+ if Process.respond_to?(:spawn)
232
+ args = command.dup
233
+ args << {
234
+ STDIN => a,
235
+ STDOUT => d,
236
+ STDERR => d,
237
+ :close_others => true
238
+ }
239
+ pid = Process.spawn(*args)
240
+ else
241
+ pid = fork do
242
+ STDIN.reopen(a)
243
+ STDOUT.reopen(d)
244
+ STDERR.reopen(d)
245
+ b.close
246
+ c.close
247
+ exec(*command)
248
+ end
249
+ end
250
+ a.close
251
+ d.close
252
+ return [pid, b, c]
253
+ end
254
+ end
255
+
256
+ end
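The `execute` method above frames each response by asking the child process to echo a marker line after the command, then reading until that marker appears. The sketch below illustrates the same framing idea with `sh` standing in for GDB, which is an assumption made so the example runs without a debugger. `execute_with_marker` is a hypothetical name, and the `select` timeout used by the original is omitted.

```ruby
require 'open3'

MARKER = '--------END_OF_RESPONSE--------'

# Sends one command to a line-oriented child process, then asks the child
# to print a marker line. Everything read before the marker is treated as
# the response to the command.
def execute_with_marker(stdin, stdout, command)
  stdin.puts(command)
  stdin.puts("printf '%s\\n' '#{MARKER}'")
  stdin.flush
  result = String.new
  while (line = stdout.gets)
    break if line == "#{MARKER}\n"
    result << line
  end
  result
end

# Assumption: `sh` plays the role of GDB here.
child_in, child_out, waiter = Open3.popen2("sh")
begin
  puts execute_with_marker(child_in, child_out, "echo hello world")
ensure
  child_in.close
  child_out.close
  waiter.value # reap the child process
end
```

Because the marker is written on the same channel as the command, responses stay in order and no separate control pipe is needed. The trade-off is that a command whose own output contains the marker line would be cut short.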
@@ -0,0 +1,3 @@
1
+ module CrashWatch
2
+ VERSION_STRING = '1.1.0'
3
+ end
@@ -0,0 +1,116 @@
1
+ source_root = File.expand_path(File.dirname(__FILE__) + "/..")
2
+ $LOAD_PATH.unshift("#{source_root}/lib")
3
+ Thread.abort_on_exception = true
4
+
5
+ require 'crash_watch/gdb_controller'
6
+
7
+ describe CrashWatch::GdbController do
8
+ before :each do
9
+ @gdb = CrashWatch::GdbController.new
10
+ end
11
+
12
+ after :each do
13
+ @gdb.close
14
+ if @process
15
+ Process.kill('KILL', @process.pid)
16
+ @process.close
17
+ end
18
+ end
19
+
20
+ def run_script_and_wait(code, snapshot_callback = nil, &block)
21
+ @process = IO.popen(%Q{ruby -e '#{code}'}, 'w')
22
+ @gdb.attach(@process.pid)
23
+ thread = Thread.new do
24
+ sleep 0.1
25
+ if block
26
+ block.call
27
+ end
28
+ @process.write("\n")
29
+ end
30
+ exit_info = @gdb.wait_until_exit(&snapshot_callback)
31
+ thread.join
32
+ return exit_info
33
+ end
34
+
35
+ describe "#execute" do
36
+ it "executes the desired command and returns its output" do
37
+ @gdb.execute("echo hello world").should == "hello world\n"
38
+ end
39
+ end
40
+
41
+ describe "#attach" do
42
+ before :each do
43
+ @process = IO.popen("sleep 9999", "w")
44
+ end
45
+
46
+ it "returns true if attaching worked" do
47
+ @gdb.attach(@process.pid).should be_true
48
+ end
49
+
50
+ it "returns false if the PID doesn't exist" do
51
+ Process.kill('KILL', @process.pid)
52
+ sleep 0.25
53
+ @gdb.attach(@process.pid).should be_false
54
+ end
55
+ end
56
+
57
+ describe "#wait_until_exit" do
58
+ it "returns the expected information if the process exited normally" do
59
+ exit_info = run_script_and_wait('STDIN.readline')
60
+ exit_info.exit_code.should == 0
61
+ exit_info.should_not be_signaled
62
+ end
63
+
64
+ it "returns the expected information if the process exited with a non-zero exit code" do
65
+ exit_info = run_script_and_wait('STDIN.readline; exit 3')
66
+ exit_info.exit_code.should == 3
67
+ exit_info.should_not be_signaled
68
+ exit_info.backtrace.should_not be_nil
69
+ exit_info.backtrace.should_not be_empty
70
+ end
71
+
72
+ it "returns the expected information if the process exited because of a signal" do
73
+ exit_info = run_script_and_wait(
74
+ 'STDIN.readline;' +
75
+ 'require "rubygems";' +
76
+ 'require "ffi";' +
77
+ 'module MyLib;' +
78
+ 'extend FFI::Library;' +
79
+ 'ffi_lib "c";' +
80
+ 'attach_function :abort, [], :void;' +
81
+ 'end;' +
82
+ 'MyLib.abort')
83
+ exit_info.should be_signaled
84
+ exit_info.backtrace.should =~ /abort/
85
+ end
86
+
87
+ it "ignores non-fatal signals" do
88
+ exit_info = run_script_and_wait('trap("INT") { }; STDIN.readline; exit 2') do
89
+ Process.kill('INT', @process.pid)
90
+ end
91
+ exit_info.exit_code.should == 2
92
+ exit_info.should_not be_signaled
93
+ exit_info.backtrace.should_not be_nil
94
+ exit_info.backtrace.should_not be_empty
95
+ end
96
+
97
+ it "returns information of the signal that aborted the process, not information of ignored signals" do
98
+ exit_info = run_script_and_wait(
99
+ 'trap("INT") { };' +
100
+ 'STDIN.readline;' +
101
+ 'require "rubygems";' +
102
+ 'require "ffi";' +
103
+ 'module MyLib;' +
104
+ 'extend FFI::Library;' +
105
+ 'ffi_lib "c";' +
106
+ 'attach_function :abort, [], :void;' +
107
+ 'end;' +
108
+ 'MyLib.abort'
109
+ ) do
110
+ Process.kill('INT', @process.pid)
111
+ end
112
+ exit_info.should be_signaled
113
+ exit_info.backtrace.should =~ /abort/
114
+ end
115
+ end
116
+ end
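The outcomes these specs check (normal exit, non-zero exit code, fatal signal) all come from matching GDB's textual output against a handful of patterns in `wait_until_exit`. As a rough sketch of that matching step alone, the hypothetical helper `classify_gdb_output` below applies the same regular expressions in the same order and returns a symbol plus a detail value. The sample strings are illustrative and were not captured from a real GDB session.

```ruby
# Hypothetical helper (not part of crash-watch): classifies one chunk of
# GDB output using the patterns found in GdbController#wait_until_exit.
def classify_gdb_output(output)
  case output
  when /^Program received signal (.+?),/
    [:signal_received, $1]
  when /^Program terminated with signal (.+?),/
    [:terminated_by_signal, $1]
  when /^Breakpoint .*? _exit /
    [:exit_breakpoint, nil]
  when /^Program exited with code (\d+)\.$/
    [:exited, $1.to_i]
  when /^Program exited normally/
    [:exited, 0]
  else
    [:unknown, nil]
  end
end

p classify_gdb_output("Program received signal SIGABRT, Aborted.\n")
p classify_gdb_output("Program exited normally.\n")
```

Separating classification from the control loop makes the patterns easy to exercise with plain strings, without attaching to a live process.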
metadata ADDED
@@ -0,0 +1,100 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: crash-watch
3
+ version: !ruby/object:Gem::Version
4
+ hash: 19
5
+ prerelease: false
6
+ segments:
7
+ - 1
8
+ - 1
9
+ - 0
10
+ version: 1.1.0
11
+ platform: ruby
12
+ authors:
13
+ - Hongli Lai
14
+ autorequire:
15
+ bindir: bin
16
+ cert_chain: []
17
+
18
+ date: 2010-04-16 00:00:00 +02:00
19
+ default_executable:
20
+ dependencies:
21
+ - !ruby/object:Gem::Dependency
22
+ name: ffi
23
+ prerelease: false
24
+ requirement: &id001 !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ">="
28
+ - !ruby/object:Gem::Version
29
+ hash: 3
30
+ segments:
31
+ - 0
32
+ version: "0"
33
+ type: :development
34
+ version_requirements: *id001
35
+ - !ruby/object:Gem::Dependency
36
+ name: rspec
37
+ prerelease: false
38
+ requirement: &id002 !ruby/object:Gem::Requirement
39
+ none: false
40
+ requirements:
41
+ - - ">="
42
+ - !ruby/object:Gem::Version
43
+ hash: 3
44
+ segments:
45
+ - 0
46
+ version: "0"
47
+ type: :development
48
+ version_requirements: *id002
49
+ description: Monitor processes and display useful information when they crash.
50
+ email: hongli@phusion.nl
51
+ executables:
52
+ - crash-watch
53
+ extensions: []
54
+
55
+ extra_rdoc_files: []
56
+
57
+ files:
58
+ - README.markdown
59
+ - LICENSE.txt
60
+ - crash-watch.gemspec
61
+ - bin/crash-watch
62
+ - lib/crash_watch/gdb_controller.rb
63
+ - lib/crash_watch/version.rb
64
+ - test/gdb_controller_spec.rb
65
+ has_rdoc: true
66
+ homepage: http://github.com/FooBarWidget/crash-watch
67
+ licenses: []
68
+
69
+ post_install_message:
70
+ rdoc_options:
71
+ - --charset=UTF-8
72
+ require_paths:
73
+ - lib
74
+ required_ruby_version: !ruby/object:Gem::Requirement
75
+ none: false
76
+ requirements:
77
+ - - ">="
78
+ - !ruby/object:Gem::Version
79
+ hash: 3
80
+ segments:
81
+ - 0
82
+ version: "0"
83
+ required_rubygems_version: !ruby/object:Gem::Requirement
84
+ none: false
85
+ requirements:
86
+ - - ">="
87
+ - !ruby/object:Gem::Version
88
+ hash: 3
89
+ segments:
90
+ - 0
91
+ version: "0"
92
+ requirements: []
93
+
94
+ rubyforge_project:
95
+ rubygems_version: 1.3.7
96
+ signing_key:
97
+ specification_version: 3
98
+ summary: Monitor processes and display useful information when they crash
99
+ test_files: []
100
+