pqa 1.6

data/bin/pqa ADDED
@@ -0,0 +1,9 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ begin
4
+ require 'pqa'
5
+ rescue LoadError
6
+ require 'rubygems'
7
+ require 'pqa'
8
+ end
9
+ PQA.run
@@ -0,0 +1,78 @@
1
+ November 27, 2005 - 1.6:
2
+ Fixed the first line of the file - it was missing a character.
3
+ Made Gem runnable - i.e., now you can do "pqa -file [..etc, etc]" after doing a "gem install pqa"
4
+ Various minor speed improvements.
5
+ Wrote more unit tests.
6
+
7
+ January 31, 2005 - 1.5:
8
+ Fixed a bug that would sometimes associate a duration measurement with the incorrect query.
9
+ Fixed a bug which would cause the HTML report to error out with a nil-dereference exception.
10
+ Fixed a bug which caused PQA to choke on SQL statements that were issued as part of a stored procedure.
11
+ Restored the ability to run under Ruby 1.6.8.
12
+
13
+ July 27, 2004 - 1.4:
14
+ The syslog parser now checks for either LOG or DEBUG entries.
15
+ Fixed a bug in the syslog parser that would choke on entries like "postgres starting".
16
+
17
+ June 18, 2004 - 1.3:
18
+ Improved MySQL query log support.
19
+ Parse errors (if they occur) are now included in the report.
20
+ The default log format is now "pglog". Syslog still works, but you'll need to specify "-logtype syslog".
21
+
22
+ June 8, 2004 - 1.2:
23
+ Added MySQL query log file support.
24
+ Fixed bug - duration report links no longer appear in header if duration information is not available.
25
+
26
+ June 7, 2004 - 1.1:
27
+ Syslog parser now handles durations.
28
+ Added text versions of duration reports.
29
+ Removed some spurious entries from the reports (e.g., BEGIN;ROLLBACK).
30
+
31
+ May 17, 2004 - 1.0:
32
+ Numbers are now normalized, i.e., "select bar where foo=222" normalizes to the same query as "select bar where foo=123312321". This gives more accurate information on which queries occur most frequently and which take the most time.
33
+ Fixed bug - OverallStatsReport no longer displays "longest ran in 0.0 seconds" if the log does not include duration information.
34
+ Fixed bug - OverallStatsReport now displays the correct number of unique queries. v0.9 listed the same number for both total and unique queries.
35
+ Fixed various bugs in syslog parsing - now it should work better with both PG 7.3 and PG 7.4. It's still a better idea to use the Postgres log, but if you must use syslog for some reason, it's better now.
36
+
37
+ May 11, 2004 - 0.9:
38
+ Added ability to handle Postgres logs where log_pid/log_timestamp/log_connection have been enabled.
39
+ Modified to support both "query:" and "statement:" as log entry preambles - i.e., works with PG 7.4 logs now.
40
+ The SQL colorizing works a bit better.
41
+ Updated documentation to include better postgresql.conf configuration details.
42
+
43
+ May 7, 2004 - 0.8:
44
+ Added UPDATE queries to the "Queries by type" report
45
+ Added support for parsing query duration data from the Postgres log
46
+ Added a "Queries that took up the most time" report
47
+ Added a "Slowest queries" report
48
+ Added a table of contents to the HTML report
49
+
50
+ April 28, 2004 - 0.7:
51
+ Added support for using the Postgres log file. syslog is still supported, of course.
52
+ Fixed bug which resulted in errors if the number of valid queries in a log was less than -top. Thanks to Tom De Bruyne for reporting this bug.
53
+ The SQL colorizing works a bit better now.
54
+ Various tweaks to HTML reports.
55
+
56
+ April 23, 2004 - 0.6:
57
+ Added a 'rank' column to the 'MostFrequentQueries' report.
58
+ Colorized the SQL keywords in the HTML report.
59
+
60
+ April 7, 2004 - 0.5:
61
+ Added HTML reports.
62
+
63
+ April 6, 2004 - 0.4:
64
+ Fixed a bug which prevented single digit date logs from being parsed.
65
+
66
+ March 17, 2004 - v0.3:
67
+ Fixed an off-by-one bug in the number of reports returned.
68
+ More optimizations, should again be about 10% faster.
69
+ Added a "query frequency by type" report.
70
+
71
+ March 9, 2004 - v0.2:
72
+ Fixed a connection id bug.
73
+ Various optimizations, should be about 10% faster.
74
+ Improved packaging.
75
+
76
+ March 5, 2004 - v0.1:
77
+ Can display queries by frequency.
78
+ Performs query normalization.
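The number and string normalization described in the 1.0 entry can be sketched roughly as follows. This is a minimal illustration modelled on the regular expressions that appear in `Query#normalize` in this release; the helper name `normalize_sql` is hypothetical and is not part of PQA.

```ruby
# Sketch only: normalize_sql is a hypothetical helper, not a PQA method.
# The two patterns mirror REMOVE_TEXT and REMOVE_NUMBERS from pqa.rb.
def normalize_sql(sql)
  remove_text    = /'[^']*'/                          # quoted string literals
  remove_numbers = /([^a-zA-Z_\$])([0-9]{1,10})/      # numbers not glued to an identifier

  text = sql.dup
  text.gsub!(/\\'/, '')                # drop escaped quotes first
  text.gsub!(remove_text, "{ }")       # replace string literals with a placeholder
  text.gsub!(remove_numbers, '\1{ }')  # replace numeric literals, keeping the preceding character
  text.squeeze!(" ")                   # collapse runs of spaces
  text.strip!
  text
end

a = normalize_sql("select bar where foo=222")
b = normalize_sql("select bar where foo=123312321")
puts a        # select bar where foo={ }
puts a == b   # true
```

Because both statements collapse to the same text, frequency and duration reports can group them under a single entry.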
@@ -0,0 +1,31 @@
1
+ Copyright (c) 2003-2005, InfoEther, LLC
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without
5
+ modification, are permitted provided that the following conditions are
6
+ met:
7
+
8
+ * Redistributions of source code must retain the above copyright
9
+ notice, this list of conditions and the following disclaimer.
10
+ * Redistributions in binary form must reproduce the above copyright
11
+ notice, this list of conditions and the following disclaimer in the
12
+ documentation and/or other materials provided with the distribution.
13
+ * The end-user documentation included with the redistribution, if
14
+ any, must include the following acknowledgement:
15
+ "This product includes software developed in part by support from
16
+ InfoEther, LLC"
17
+ * Neither the name of InfoEther, LLC nor the names of its
18
+ contributors may be used to endorse or promote products derived from
19
+ this software without specific prior written permission.
20
+
21
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
22
+ IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
23
+ TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
24
+ PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
25
+ OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
26
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
27
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
28
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
29
+ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
30
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
31
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -0,0 +1,15 @@
1
+ Postgres Query Analyzer (PQA) is a little library to help you analyze PostgreSQL query logs.
2
+
3
+ To use it, you'll need:
4
+ Ruby - http://ruby-lang.org/
5
+ PQA - http://pgfoundry.org/projects/pqa/
6
+
7
+ and of course a PostgreSQL installation.
8
+
9
+ There's an article I've written about PQA on DatabaseJournal.com:
10
+ http://www.databasejournal.com/features/postgresql/article.php/3323561
11
+
12
+ There's more documentation and a sample run here:
13
+ http://pqa.projects.postgresql.org/
14
+
15
+ Like PQA? Buy Tom's completely unrelated book, "PMD Applied"! http://pmdapplied.com/
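As a rough illustration of the kind of log analysis the README refers to, the sketch below pulls a query duration out of a single log line and converts it to seconds. It borrows the `duration:` pattern and the ms-to-seconds conversion from pqa.rb in this release, but the helper name `duration_in_seconds` and the sample log lines are assumptions made for the example, and the pattern is left unanchored here for simplicity.

```ruby
# Sketch only: duration_in_seconds is a hypothetical helper, not a PQA method.
# Returns the duration in seconds, or nil if the line carries no duration.
def duration_in_seconds(log_line)
  match = /duration:([\s\d\.]*)(sec|ms)/.match(log_line)
  return nil if match.nil?
  value = match[1].strip.to_f
  # Milliseconds are converted so that all durations share one unit.
  match[2] == "ms" ? value / 1000.0 : value
end

# Hypothetical sample lines, for illustration only.
puts duration_in_seconds("LOG:  duration: 250.0 ms  statement: select 1")  # 0.25
puts duration_in_seconds("LOG:  duration: 1.5 sec")                        # 1.5
puts duration_in_seconds("LOG:  statement: select 1").inspect              # nil
```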
@@ -0,0 +1,1173 @@
1
+ #!/usr/local/bin/ruby -w
2
+
3
+ DEFAULT_TOP=10
4
+ BUG_URL_STRING="This is a <a href=\"http://pgfoundry.org/tracker/?atid=130&group_id=1000008&func=browse\">bug</a>."
5
+
6
+ class PQA
7
+ def PQA.pqa_usage
8
+ puts "=============================="
9
+ puts "Usage: " + $0 + " [-logtype syslog|pglog|mysql] [-top n] [-normalize] [-format text|html] [-reports rep1,rep2,...,repn] -file log_file_name"
10
+ puts "Report types : overall, bytype, mosttime, slowest, mostfrequent, errors"
11
+ puts "For example:"
12
+ puts "ruby pqa.rb -logtype pglog -top 10 -normalize -format text -reports overall,slowest -file ../sample/pglog_sample.log"
13
+ puts "=============================="
14
+ end
15
+ def PQA.run
16
+ (PQA.pqa_usage ; exit) if ARGV.empty?
17
+ if !ARGV.include?("-file")
18
+ puts "=============================="
19
+ puts "## No log file specified; use the '-file' parameter"
20
+ pqa_usage ; exit
21
+ end
22
+ log = nil
23
+ if ARGV.include?("-logtype") && ARGV[ARGV.index("-logtype")+1] == "syslog"
24
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "SyslogPGParser", "PostgreSQLAccumulator")
25
+ elsif ARGV.include?("-logtype") && ARGV[ARGV.index("-logtype")+1] == "mysql"
26
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "MySQLLogLine", "MySQLAccumulator")
27
+ else
28
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "PostgresLogParser", "PostgreSQLAccumulator")
29
+ end
30
+ log.parse
31
+ log.normalize if ARGV.include?("-normalize")
32
+ top = (ARGV.include?("-top") ? ARGV[ARGV.index("-top")+1] : DEFAULT_TOP).to_i
33
+ format = (ARGV.include?("-format") ? ARGV[ARGV.index("-format")+1] : "text")
34
+
35
+ rpts = []
36
+ if ARGV.include?("-reports")
37
+ reports_array = ARGV[ARGV.index("-reports")+1].split(',')
38
+ rpts.push(OverallStatsReport.new(log)) if reports_array.include?("overall")
39
+ rpts.push(QueriesByTypeReport.new(log)) if reports_array.include?("bytype")
40
+ rpts.push(QueriesThatTookUpTheMostTimeReport.new(log,top)) if reports_array.include?("mosttime")
41
+ rpts.push(SlowestQueriesReport.new(log, top)) if reports_array.include?("slowest")
42
+ rpts.push(MostFrequentQueriesReport.new(log, top)) if reports_array.include?("mostfrequent")
43
+ rpts.push(ErrorReport.new(log)) if reports_array.include?("errors")
44
+ #rpts.push() if reports_array.include?("")
45
+ rpts.push(ParseErrorReport.new(log))
46
+ else
47
+ rpts = [OverallStatsReport.new(log), QueriesByTypeReport.new(log), QueriesThatTookUpTheMostTimeReport.new(log, top), SlowestQueriesReport.new(log, top), MostFrequentQueriesReport.new(log, top), ParseErrorReport.new(log)]
48
+ end
49
+ report_aggregator = (format == "text") ? TextReportAggregator.new : HTMLReportAggregator.new
50
+ puts report_aggregator.create(rpts)
51
+ end
52
+ end
53
+
54
+ # Log file parsers
55
+ class ParseError
56
+ attr_reader :exception, :line
57
+ def initialize(e, line)
58
+ @exception = e
59
+ @line = line
60
+ end
61
+ end
62
+
63
+ class GenericLogReader
64
+ DEBUG = false
65
+ attr_accessor :includes_duration, :queries, :errors, :parse_errors
66
+ attr_reader :time_to_parse
67
+
68
+ def initialize(filename, line_parser_name, accumulator_name)
69
+ @filename = filename
70
+ @line_parser_name = line_parser_name
71
+ @accumulator_name = accumulator_name
72
+ @includes_duration = false
73
+ @queries, @errors , @parse_errors= [], [], []
74
+ end
75
+
76
+ def parse
77
+ start = Time.new
78
+ a = Object.const_get(@accumulator_name).new
79
+ puts "Using #{@accumulator_name}" if DEBUG
80
+ p = Object.const_get(@line_parser_name).new
81
+ puts "Using #{@line_parser_name}" if DEBUG
82
+ File.foreach(@filename) {|text|
83
+ begin
84
+ line = p.parse(text)
85
+ if line
86
+ a.append(line)
87
+ else
88
+ # text.gsub!(/\n/, '\n').gsub!(/\t/, '\t')
89
+ # $stderr.puts "Unrecognized text: '#{text}'"
90
+ end
91
+ rescue StandardError => e
92
+ @parse_errors << ParseError.new(e,line)
93
+ end
94
+ }
95
+ @time_to_parse = Time.new - start
96
+ a.close_out_all
97
+ @queries = a.queries
98
+ @errors = a.errors
99
+ @includes_duration = a.has_duration_info
100
+ end
101
+
102
+ def normalize
103
+ @queries.each {|q| q.normalize }
104
+ end
105
+
106
+ def unique_queries
107
+ uniq = []
108
+ @queries.each {|x| uniq << x.text if !uniq.include?(x.text) }
109
+ uniq.size
110
+ end
111
+ end
112
+
113
+ #
114
+ # MySQL Parsing is broken
115
+ #
116
+
117
+ class MySQLLogLine
118
+ DISCARD = Regexp.new("(^Time )|(^Tcp)|( Quit )|( USE )|(Connect)")
119
+ START_QUERY = Regexp.new('\d{1,5} Query')
120
+ attr_reader :text, :is_new_query, :recognized
121
+
122
+ def initialize(text)
123
+ @recognized = true
124
+ @is_new_query = false
125
+ if DISCARD.match(text) != nil
126
+ @recognized = false
127
+ return
128
+ end
129
+ @text = text
130
+ @is_new_query = START_QUERY.match(@text) != nil
131
+
132
+ end
133
+
134
+ def is_continuation
135
+ @recognized && !/^(\d{1,6})|(\s*)/.match(@text).nil?
136
+ end
137
+
138
+ def is_duration_line
139
+ false
140
+ end
141
+
142
+ def parse_query_segment
143
+ if @is_new_query
144
+ tmp = START_QUERY.match(@text.strip)
145
+ raise StandardError.new("PQA identified a line as the start of a new query, but then was unable to match it with the START_QUERY Regex. #{BUG_URL_STRING}") if tmp == nil
146
+ return tmp.post_match.strip
147
+ end
148
+ @text.strip.chomp
149
+ end
150
+
151
+ def to_s
152
+ @text
153
+ end
154
+ end
155
+
156
+ class MySQLAccumulator
157
+ attr_reader :queries
158
+
159
+ def initialize
160
+ @current = nil
161
+ @queries = []
162
+ end
163
+
164
+ def new_query_start(line)
165
+ @queries << @current if !@current.nil?
166
+ @current = Query.new(line.parse_query_segment)
167
+ end
168
+
169
+ def query_continuation(line)
170
+ @current.append(line.parse_query_segment) if !@current.nil?
171
+ end
172
+
173
+ def close_out_all ; end
174
+ end
175
+
176
+ #
177
+ # PostgreSQL lines Classes
178
+ #
179
+
180
+
181
+
182
+ class PGLogLine
183
+ DEBUG = false
184
+ attr_accessor :connection_id, :cmd_no, :line_no
185
+ attr_reader :text, :duration, :ignore
186
+
187
+ def initialize(text = "NO TEXT", duration = nil)
188
+ @text = text.chomp
189
+ @duration = duration
190
+
191
+ if text.nil?
192
+ $stderr.puts "Nil text for line text !" if DEBUG
193
+ end
194
+
195
+
196
+ # for tracking
197
+ @connection_id = nil
198
+ @cmd_no = nil
199
+ @line_no = nil
200
+ end
201
+
202
+ def to_s
203
+ @text
204
+ end
205
+
206
+ def parse_duration(time_str, unit)
207
+ unit == "ms" ? (time_str.to_f / 1000.0) : time_str.to_f
208
+ end
209
+
210
+ def dump
211
+ self.class.to_s + "(" + @connection_id.to_s + "): " + text
212
+ end
213
+ end
214
+
215
+ class PGQueryStarter < PGLogLine
216
+ attr_reader :ignore
217
+
218
+ def initialize(text, duration = nil)
219
+ super(filter_query(text), duration)
220
+ end
221
+
222
+ def filter_query(text)
223
+ @ignore = (text =~ /begin/i) || (text =~ /VACUUM/i) || (text =~ /^select 1$/i)
224
+ return text
225
+ end
226
+
227
+ def append_to(queries)
228
+ queries.push(Query.new(@text, @ignore))
229
+ return nil
230
+ end
231
+
232
+ end
233
+
234
+ class PGQueryStarterWithDuration < PGQueryStarter
235
+ ignore = false
236
+
237
+ def initialize(text, time_str, unit)
238
+ @time_str = time_str
239
+ @unit = unit
240
+ text_match = /[\s]*(query|statement):[\s]*/i.match(text)
241
+ if text_match
242
+ super(text_match.post_match, parse_duration(time_str, unit))
243
+ else
244
+ $stderr.puts "Found garbage after Duration line : #{text}"
245
+ super(text, parse_duration(time_str, unit))
246
+ end
247
+ end
248
+
249
+ def append_to(queries)
250
+ queries.got_duration!
251
+ closed_query = queries.pop
252
+ query = Query.new(@text, @ignore)
253
+ query.duration = @duration
254
+ queries.push(query)
255
+ return closed_query
256
+ end
257
+
258
+ end
259
+
260
+ class PGContinuationLine < PGLogLine
261
+ ignore = false
262
+
263
+ def initialize(text, duration = nil)
264
+ super(text.gsub(/\^I/, "\t"))
265
+ end
266
+
267
+
268
+ def append_to(queries)
269
+ if queries.last.nil?
270
+ $stderr.puts "Continuation for no previous query"
271
+ else
272
+ queries.last.append(@text)
273
+ end
274
+ return nil
275
+ end
276
+
277
+ end
278
+
279
+ # Durations
280
+
281
+ class PGDurationLine < PGLogLine
282
+ ignore = false
283
+
284
+ def initialize(time_str, unit)
285
+ @time_str = time_str
286
+ @unit = unit
287
+ super("NO TEXT", parse_duration(time_str, unit))
288
+ end
289
+
290
+ def append_to(queries)
291
+ if queries.last.nil?
292
+ $stderr.puts "Duration for no previous query"
293
+ return nil
294
+ else
295
+ queries.got_duration!
296
+ queries.last.duration = @duration
297
+ return queries.pop
298
+ end
299
+ end
300
+
301
+ end
302
+
303
+ # Error Management
304
+ # Those 4 classes are untested
305
+ # keep ignore = true for the moment
306
+
307
+ class PGErrorLine < PGLogLine
308
+ ignore = false
309
+
310
+ def append_to(errors)
311
+ closed_query = errors.pop
312
+ errors.push(ErrorQuery.new(@text))
313
+ return closed_query
314
+ end
315
+
316
+ end
317
+
318
+ class PGHintLine < PGLogLine
319
+ ignore = false
320
+
321
+ def append_to(errors)
322
+ if errors.last
323
+ errors.last.append_hint(@text)
324
+ else
325
+ $stderr.puts "Hint for no previous error"
326
+ end
327
+ return nil
328
+ end
329
+
330
+ end
331
+
332
+ class PGDetailLine < PGLogLine
333
+ ignore = false
334
+
335
+ def append_to(errors)
336
+ if errors.last
337
+ errors.last.append_detail(@text)
338
+ else
339
+ $stderr.puts "Detail for no previous error"
340
+ end
341
+ return nil
342
+ end
343
+
344
+ end
345
+
346
+ class PGStatementLine < PGLogLine
347
+ ignore = false
348
+
349
+ def append_to(errors)
350
+ if errors.last
351
+ errors.last.append_statement(@text)
352
+ else
353
+ $stderr.puts "Statement for no previous error"
354
+ end
355
+ return nil
356
+ end
357
+
358
+ end
359
+
360
+ # Contexts
361
+
362
+ class PGContextLine < PGLogLine
363
+ ignore = false
364
+
365
+ SQL_STATEMENT = /^SQL statement "/
366
+ SQL_FUNCTION = /([^\s]+)[\s]+function[\s]+"([^"]+)"(.*)$/
367
+
368
+ def initialize(text)
369
+ statement_match = SQL_STATEMENT.match(text)
370
+ if statement_match
371
+ super(statement_match.post_match[0..-1])
372
+ else
373
+ function_match = SQL_FUNCTION.match(text)
374
+ if function_match
375
+ super(function_match[2])
376
+ else
377
+ $stderr.puts "Unrecognized Context" if DEBUG
378
+ super(text)
379
+ end
380
+ @match_all = true
381
+ end
382
+ end
383
+
384
+ def append_to(queries)
385
+ sub_query = queries.pop
386
+ if sub_query.nil?
387
+ $stderr.puts "Missing Query for Context"
388
+ elsif queries.last
389
+ queries.last.set_subquery(sub_query.to_s)
390
+ else
391
+ $stderr.puts "Context for no previous Query"
392
+ end
393
+ return nil
394
+ end
395
+
396
+ end
397
+
398
+ # Statuses
399
+ # This class is untested
400
+ # please keep ignore = true for the moment
401
+
402
+ class PGStatusLine < PGLogLine
403
+ ignore = true
404
+ CONN_RECV = /connection received: host=([^\s]+) port=([\d]+)/
405
+ CONN_AUTH = /connection authorized: user=([^\s]+) database=([^\s]+)/
406
+
407
+ def append_to(stream)
408
+ conn_recv = CONN_RECV.match(@text)
409
+ if conn_recv
410
+ stream.set_host_conn!(conn_recv[1], conn_recv[2])
411
+ end
412
+
413
+ conn_auth = CONN_AUTH.match(@text)
414
+ if conn_auth
415
+ stream.set_user_db!(conn_auth[1], conn_auth[2])
416
+ end
417
+ return nil
418
+ end
419
+
420
+ end
421
+
422
+
423
+
424
+ class PostgreSQLParser
425
+ LOG_OR_DEBUG_LINE = /^(LOG|DEBUG):[\s]*/ # regex literal: in a double-quoted string "\s" is a plain space
426
+ QUERY_STARTER = /^(query|statement):[\s]*/
427
+ STATUS = Regexp.new("^(connection|received|unexpected EOF)")
428
+ DURATION = Regexp.new('^duration:([\s\d\.]*)(sec|ms)')
429
+ CONTINUATION_LINE = /^(\^I|\s|\t)/
430
+ CONTEXT_LINE = /^CONTEXT:[\s]*/
431
+ ERROR_LINE = /^(WARNING|ERROR|FATAL|PANIC):[\s]*/
432
+ HINT_LINE = /^HINT:[\s]*/
433
+ DETAIL_LINE = /^DETAIL:[\s]*/
434
+ STATEMENT_LINE = /^STATEMENT:[\s]*/
435
+
436
+ def parse(text)
437
+ logdebug_match = LOG_OR_DEBUG_LINE.match(text)
438
+ if logdebug_match
439
+
440
+ query_match = QUERY_STARTER.match(logdebug_match.post_match)
441
+ if query_match
442
+ return PGQueryStarter.new(query_match.post_match)
443
+ end
444
+
445
+ duration_match = DURATION.match(logdebug_match.post_match)
446
+ if duration_match
447
+ additional_info = duration_match.post_match.strip.chomp
448
+ if additional_info == ""
449
+ return PGDurationLine.new(duration_match[1].strip, duration_match[2])
450
+ else
451
+ return PGQueryStarterWithDuration.new(additional_info, duration_match[1].strip, duration_match[2])
452
+ end
453
+ end
454
+
455
+ status_match = STATUS.match(logdebug_match.post_match)
456
+ if status_match
457
+ return PGStatusLine.new(logdebug_match.post_match)
458
+ end
459
+
460
+ # $stderr.puts "Unrecognized LOG or DEBUG line: #{text}"
461
+ return nil
462
+ end
463
+
464
+ error_match = ERROR_LINE.match(text)
465
+ if error_match
466
+ return PGErrorLine.new(error_match.post_match)
467
+ end
468
+
469
+ context_match = CONTEXT_LINE.match(text)
470
+ if context_match
471
+ return PGContextLine.new(context_match.post_match)
472
+ end
473
+
474
+ continuation_match = CONTINUATION_LINE.match(text)
475
+ if continuation_match
476
+ return PGContinuationLine.new(continuation_match.post_match)
477
+ end
478
+
479
+ statement_match = STATEMENT_LINE.match(text)
480
+ if statement_match
481
+ return PGStatementLine.new(statement_match.post_match)
482
+ end
483
+
484
+ hint_match = HINT_LINE.match(text)
485
+ if hint_match
486
+ return PGHintLine.new(hint_match.post_match)
487
+ end
488
+
489
+ detail_match = DETAIL_LINE.match(text)
490
+ if detail_match
491
+ return PGDetailLine.new(detail_match.post_match)
492
+ end
493
+
494
+ if text.strip.chomp == ""
495
+ return PGContinuationLine.new("")
496
+ end
497
+
498
+ # $stderr.puts "Unrecognized PostgreSQL log line: #{text}"
499
+ return nil
500
+ end
501
+
502
+ end
503
+
504
+ class SyslogPGParser < PostgreSQLParser
505
+ CMD_LINE = Regexp.new('\[(\d{1,10})(\-\d{1,5}){0,1}\] ')
506
+
507
+
508
+ def initialize(syslog_str = 'postgres')
509
+ @postgres_pid = Regexp.new(" " + syslog_str + '\[(\d{1,5})\]: ')
510
+ end
511
+
512
+ def parse(data)
513
+ recognized = false
514
+
515
+ pid_match=@postgres_pid.match(data)
516
+ return if pid_match.nil?
517
+
518
+ connection_id = pid_match[1]
519
+ text = pid_match.post_match
520
+ return nil if text == nil
521
+
522
+ line_id_match = CMD_LINE.match(text)
523
+ return nil if line_id_match.nil?
524
+
525
+ text = line_id_match.post_match
526
+ cmd_no = line_id_match[1]
527
+ if line_id_match[2]
528
+ line_no = line_id_match[2][1..-1]
529
+ else
530
+ line_no = 1
531
+ end
532
+
533
+
534
+ result = super(text)
535
+ return nil if result.nil?
536
+
537
+ result.connection_id = connection_id
538
+ result.cmd_no = cmd_no
539
+ result.line_no = line_no
540
+
541
+ # $stderr.puts result.dump
542
+
543
+ return result
544
+ end
545
+
546
+ end
547
+
548
+
549
+ class PostgresLogParser < PostgreSQLParser
550
+ STARTS_WITH_DATE=Regexp.new("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ")
551
+ STARTS_WITH_PID=Regexp.new('\[(\d{1,5})\] ')
552
+
553
+ def initialize
554
+ @conn_id_found = false
555
+ @last_conn_id = nil
556
+ end
557
+
558
+ def parse(text)
559
+ connection_id = nil
560
+ text = STARTS_WITH_DATE.match(text) ? text.split(" ")[2..-1].join(" ").strip : text
561
+
562
+ pid_match = STARTS_WITH_PID.match(text)
563
+ if pid_match
564
+ @conn_id_found = true
565
+ connection_id = pid_match[1]
566
+ @last_conn_id = connection_id
567
+ text = pid_match.post_match.strip
568
+ end
569
+
570
+ result = super(text)
571
+ # Badly formatted continuations need this...
572
+ #if result.nil?
573
+ # result = PGContinuationLine.new(text)
574
+ #end
575
+ return nil if result.nil?
576
+
577
+ if pid_match
578
+ result.connection_id = connection_id
579
+ else
580
+ result.connection_id = @last_conn_id
581
+ end
582
+
583
+ # $stderr.puts result.dump
584
+
585
+ return result
586
+ end
587
+ end
588
+
589
+
590
+ class LogStream
591
+ attr_reader :has_duration_info, :queries
592
+ def initialize
593
+ @queries = []
594
+ @has_duration_info = false
595
+
596
+ @host = "UNKNOWN"
597
+ @port = "UNKNOWN"
598
+ @user = "UNKNOWN"
599
+ @db = "UNKNOWN"
600
+ end
601
+
602
+ def queries
603
+ @queries.reject {|q| q.ignored}
604
+ end
605
+
606
+ def append(line)
607
+ return line.append_to(self)
608
+ end
609
+
610
+ def push(query)
611
+ query.set_db(@db)
612
+ query.set_user(@user)
613
+ @queries.push(query)
614
+ end
615
+
616
+ def pop
617
+ @queries.pop
618
+ end
619
+
620
+ def last
621
+ @queries.last
622
+ end
623
+
624
+ def set_host_conn!(host, port)
625
+ @host = host
626
+ @port = port
627
+ end
628
+
629
+ def set_user_db!(user, db)
630
+ @user = user
631
+ @db = db
632
+ end
633
+
634
+ def got_duration!
635
+ @has_duration_info = true
636
+ end
637
+ end
638
+
639
+
640
+ class PostgreSQLAccumulator
641
+ attr_reader :queries, :errors, :has_duration_info
642
+
643
+ def initialize
644
+ @queries = []
645
+ @errors = []
646
+ @working = {}
647
+ @stream = LogStream.new
648
+ @has_duration_info = false
649
+ end
650
+
651
+ def append(line)
652
+ if line.connection_id
653
+ if !@working.has_key?(line.connection_id)
654
+ @working[line.connection_id] = LogStream.new
655
+ end
656
+ query = @working[line.connection_id].append(line)
657
+ else
658
+ # no pid mode :
659
+ query = @stream.append(line)
660
+ end
661
+ if query && !query.ignored
662
+ query.accumulate_to(self)
663
+ end
664
+ end
665
+
666
+ def append_query(query)
667
+ @queries.push(query)
668
+ end
669
+
670
+ def append_error(error)
671
+ @errors.push(error)
672
+ end
673
+
674
+ def close_out_all
675
+ @stream.queries.each { |q| q.accumulate_to(self) }
676
+ @has_duration_info = @stream.has_duration_info
677
+ @working.each {|k, stream|
678
+ stream.queries.each { |q| q.accumulate_to(self) }
679
+ @has_duration_info = @has_duration_info || stream.has_duration_info
680
+ }
681
+ end
682
+ end
683
+
684
+ class Query
685
+ DEBUG = false
686
+ REMOVE_TEXT = Regexp.new("'[^']*'")
687
+ REMOVE_NUMBERS = Regexp.new('([^a-zA-Z_\$])([0-9]{1,10})')
688
+
689
+ attr_reader :text, :db, :user
690
+ attr_accessor :duration, :ignored, :q_id
691
+
692
+ def initialize(text="", ignored=false)
693
+ $stderr.puts "NIL txt for Query" if text.nil? && DEBUG
694
+ @text = text
695
+ @duration = nil
696
+ @subqueries = []
697
+ @parsing_subs = false
698
+ @ignored = ignored
699
+ @normalized = false
700
+ end
701
+
702
+ def append(txt)
703
+ $stderr.puts "NIL txt for append" if txt.nil? && DEBUG
704
+ if @parsing_subs
705
+ @subqueries.last << " " << txt
706
+ else
707
+ @text << " " << txt
708
+ end
709
+ end
710
+
711
+ def set_subquery(text)
712
+ $stderr.puts "NIL txt for sub_q" if text.nil? && DEBUG
713
+ @parsing_subs = true
714
+ @subqueries << text
715
+ end
716
+
717
+ def set_db(db)
718
+ @db = db
719
+ end
720
+
721
+ def set_user(user)
722
+ @user = user
723
+ end
724
+
725
+ def normalize
726
+ if @text
727
+ @text.gsub!(/\\'/, '')
728
+ @text.gsub!(REMOVE_TEXT, "{ }")
729
+ @text.gsub!(REMOVE_NUMBERS, '\1{ }')
730
+ @text.squeeze!(" ")
731
+ @text.strip!
732
+ end
733
+ @normalized = true
734
+ @text
735
+ end
736
+
737
+ def accumulate_to(accumulator)
738
+ accumulator.append_query(self)
739
+ end
740
+ #
741
+ # Does not work for the moment
742
+ #
743
+ # def text
744
+ # if @normalized
745
+ # @text
746
+ # else
747
+ # "[" + @db + "," + @user + "] " + @text
748
+ # end
749
+ # end
750
+ # def to_s
751
+ # text
752
+ # end
753
+ def to_s
754
+ @text
755
+ end
756
+
757
+ def is_select
758
+ check(/^SELECT/i)
759
+ end
760
+
761
+ def is_delete
762
+ check(/^DELETE/i)
763
+ end
764
+
765
+ def is_insert
766
+ check(/^INSERT/i)
767
+ end
768
+
769
+ def is_update
770
+ check(/^UPDATE/i)
771
+ end
772
+
773
+ def check(regexp)
774
+ regexp.match(@text.strip) != nil
775
+ end
776
+ end
777
+
778
+ # Errors not used for the moment
779
+
780
+ class ErrorQuery < Query
781
+ attr_reader :text, :hint, :detail, :error
782
+
783
+ is_select = false
784
+ is_delete = false
785
+ is_insert = false
786
+ is_update = false
787
+
788
+ def initialize(text="NO ERROR MESSAGE")
789
+ @error = text
790
+ @hint = ''
791
+ @detail = ''
792
+ super("NO STATEMENT")
793
+ end
794
+
795
+ def append_statement(text)
796
+ $stderr.puts "NIL txt for error statement" if text.nil? && DEBUG
797
+ @text=text
798
+ end
799
+
800
+ def append_hint(text)
801
+ $stderr.puts "NIL txt for error hint" if text.nil? && DEBUG
802
+ @hint = text
803
+ end
804
+
805
+ def append_detail(text)
806
+ $stderr.puts "NIL txt for error detail" if text.nil? && DEBUG
807
+ @detail = text
808
+ end
809
+
810
+ def accumulate_to(accumulator)
811
+ accumulator.append_error(self)
812
+ end
813
+
814
+ end
815
+
816
+ # Reports
817
+ class TextReportAggregator
818
+ def create(reports)
819
+ rpt = ""
820
+ reports.each {|r|
821
+ next if !r.applicable
822
+ rpt << r.text
823
+ }
824
+ rpt
825
+ end
826
+ end
827
+
828
+ class HTMLReportAggregator
829
+ def create(reports)
830
+ rpt = "<html><head>"
831
+ rpt << <<EOS
832
+ <style type="text/css">
833
+ body { background-color:white; }
834
+ h2 { text-align:center; }
835
+ h3 { color:blue }
836
+ p, td, th { font-family:Courier, Arial, Helvetica, sans-serif; font-size:14px; }
837
+ th { color:white; background-color:#7B8CBE; }
838
+ span.keyword { color:blue; }
839
+ </style>
840
+ EOS
841
+ #tr { background-color:#E1E8FD; }
842
+ rpt << "<title>SQL Query Analysis (generated #{Time.now})</title></head><body>\n"
843
+ rpt << "<h2>SQL Query Analysis (generated #{Time.now})</h2><br>\n"
844
+ rpt << "<hr><center>"
845
+ rpt << "<table><th>Reports</th>"
846
+ reports.each_index {|x|
847
+ next if !reports[x].applicable
848
+ link = "<a href=\"#report#{x}\">#{reports[x].title}</a>"
849
+ rpt << "<tr><td>#{link}</td></tr>"
850
+ }
851
+ rpt << "</table>"
852
+ rpt << "<hr></center>"
853
+ reports.each_index {|x|
854
+ next if !reports[x].applicable
855
+ rpt << "<a name=\"report#{x}\"> </a>"
856
+ rpt << reports[x].html
857
+ }
858
+ rpt << "</body></html>\n"
859
+ end
860
+ end
861
+
862
+ class GenericReport
863
+
864
+ def initialize(log)
865
+ @log = log
866
+ end
867
+
868
+ def colorize(txt)
869
+ ["SELECT","UPDATE","INSERT INTO","DELETE","WHERE","VALUES","FROM","AND","ORDER BY","GROUP BY","LIMIT", "OFFSET", "DESC","ASC","AS","EXPLAIN","DROP","EXEC"].each {|w|
870
+ txt = txt.gsub(Regexp.new(w), "<span class='keyword'>#{w}</span>")
871
+ }
872
+ ["select","update","from","where","explain","drop"].each {|w|
873
+ txt = txt.gsub(Regexp.new(w), "<span class='keyword'>#{w}</span>")
874
+ }
875
+ txt
876
+ end
877
+
878
+ def title
879
+ "Unnamed report"
880
+ end
881
+
882
+ def pctg_of(a,b)
883
+ a > 0 ? (((a.to_f/b.to_f)*100.0).round)/100.0 : 0
884
+ end
885
+
886
+ def round(x, places)
887
+ (x * 10.0 ** places).round / (10.0 ** places)
888
+ end
889
+
890
+ def applicable
891
+ true
892
+ end
893
+ end
894
+
895
+ class OverallStatsReport < GenericReport
896
+
897
+ def html
898
+ rpt = "<h3>#{title}</h3>\n"
899
+ rpt << "#{@log.queries.size} queries\n"
900
+ rpt << "<br>#{@log.unique_queries} unique queries\n"
901
+ if @log.includes_duration
902
+ rpt << "<br>Total query duration was #{round(total_duration, 2)} seconds\n"
903
+ longest = find_longest
904
+ rpt << "<br>Longest query (#{colorize(longest.text)}) ran in #{"%2.3f" % longest.duration} seconds\n"
905
+ shortest = find_shortest
906
+ rpt << "<br>Shortest query (#{colorize(shortest.text)}) ran in #{"%2.3f" % shortest.duration} seconds\n"
907
+ end
908
+ rpt << "<br>Log file parsed in #{"%2.1f" % @log.time_to_parse} seconds\n"
909
+ end
910
+
911
+ def title
912
+ "Overall statistics"
913
+ end
914
+
915
+ def text
916
+ rpt = "######## #{title}\n"
917
+ rpt << "#{@log.queries.size} queries (#{@log.unique_queries} unique)"
918
+ rpt << ", longest ran in #{find_longest.duration} seconds," if @log.includes_duration
919
+ rpt << " parsed in #{@log.time_to_parse} seconds\n"
920
+ end
921
+
922
+ def total_duration
923
+ @log.queries.inject(0) {|sum, q| sum += (q.duration != nil) ? q.duration : 0 }
924
+ end
925
+
926
+ def find_shortest
927
+ q = Query.new("No queries found")
928
+ @log.queries.min {|a,b|
929
+ return b if a.duration.nil?
930
+ return a if b.duration.nil?
931
+ a.duration <=> b.duration
932
+ }
933
+ end
934
+
935
+ def find_longest
936
+ q = Query.new("No queries found")
937
+ @log.queries.max {|a,b|
938
+ return b if a.duration.nil?
939
+ return a if b.duration.nil?
940
+ a.duration <=> b.duration
941
+ }
942
+ end
943
+ end
944
+
945
+ class MostFrequentQueriesReport < GenericReport
946
+
947
+ def initialize(log, top=DEFAULT_TOP)
948
+ super(log)
949
+ @top = top
950
+ end
951
+
952
+ def title
953
+ "Most frequent queries"
954
+ end
955
+
956
+ def html
957
+ list = create_report
958
+ rpt = "<h3>#{title}</h3>\n"
959
+ rpt << "<table><tr><th>Rank</th><th>Times executed</th><th>Query text</th></tr>\n"
960
+ (list.size < @top ? list.size : @top).times {|x|
961
+ rpt << "<tr><td>#{x+1}</td><td>#{list[x][1]}</td><td>#{colorize(list[x][0])}</td></tr>\n"
962
+ }
963
+ rpt << "</table>\n"
964
+ end
965
+
966
+ def text
967
+ list = create_report
968
+ rpt = "######## #{title}\n"
969
+ (list.size < @top ? list.size : @top).times {|x|
970
+ rpt << list[x][1].to_s + " times: " + list[x][0].to_s + "\n"
971
+ }
972
+ rpt
973
+ end
974
+
975
+ def create_report
976
+ h = {}
977
+ @log.queries.each {|q|
978
+ h[q.text] = 0 if !h.has_key?(q.text)
979
+ h[q.text] += 1
980
+ }
981
+ h.sort {|a,b| b[1] <=> a[1] }
982
+ end
983
+ end
984
+
985
+ class LittleWrapper
986
+ attr_accessor :total_duration, :count, :q
987
+
988
+ def initialize(q)
989
+ @q = q
990
+ @total_duration = 0.0
991
+ @count = 0
992
+ end
993
+
994
+ def add(q)
995
+ return if q.duration.nil?
996
+ @total_duration += q.duration
997
+ @count += 1
998
+ end
999
+ end
1000
+
1001
+ class QueriesThatTookUpTheMostTimeReport < GenericReport
1002
+ def initialize(log, top=DEFAULT_TOP)
1003
+ super(log)
1004
+ @top = top
1005
+ end
1006
+
1007
+ def title
1008
+ "Queries that took up the most time"
1009
+ end
1010
+
1011
+ def applicable
1012
+ @log.includes_duration
1013
+ end
1014
+
1015
+ def html
1016
+ list = create_report
1017
+ rpt = "<h3>#{title}</h3>\n"
1018
+ rpt << "<table><tr><th>Rank</th><th>Total time (seconds)</th><th>Times executed</th><th>Query text</th></tr>\n"
1019
+ (list.size < @top ? list.size : @top).times {|x|
1020
+ rpt << "<tr><td>#{x+1}</td><td>#{"%2.3f" % list[x][1].total_duration}</td><td align=right>#{list[x][1].count}</td><td>#{colorize(list[x][0])}</td></tr>\n"
1021
+ }
1022
+ rpt << "</table>\n"
1023
+ end
1024
+
1025
+ def text
1026
+ list = create_report
1027
+ rpt = "######## #{title}\n"
1028
+ (list.size < @top ? list.size : @top).times {|x|
1029
+ rpt << "#{"%2.3f" % list[x][1].total_duration} seconds: #{list[x][0]}\n"
1030
+ }
1031
+ rpt
1032
+ end
1033
+
1034
+ def create_report
1035
+ h = {}
1036
+ @log.queries.each {|q|
1037
+ next if q.duration.nil?
1038
+ h[q.text] = LittleWrapper.new(q) if !h.has_key?(q.text)
1039
+ h[q.text].add(q)
1040
+ }
1041
+ h.sort {|a,b| b[1].total_duration <=> a[1].total_duration }
1042
+ end
1043
+ end
1044
+
1045
+ class SlowestQueriesReport < GenericReport
1046
+
1047
+ def initialize(log, top=DEFAULT_TOP)
1048
+ super(log)
1049
+ @top = top
1050
+ end
1051
+
1052
+ def applicable
1053
+ @log.includes_duration
1054
+ end
1055
+
1056
+ def title
1057
+ "Slowest queries"
1058
+ end
1059
+
1060
+ def text
1061
+ list = create_report
1062
+ rpt = "######## #{title}\n"
1063
+ (list.size < @top ? list.size : @top).times {|x|
1064
+ rpt << "#{"%2.3f" % list[x].duration} seconds: #{list[x].text}\n"
1065
+ }
1066
+ rpt
1067
+ end
1068
+
1069
+ def html
1070
+ list = create_report
1071
+ rpt = "<h3>#{title}</h3>\n"
1072
+ rpt << "<table><tr><th>Rank</th><th>Time</th><th>Query text</th></tr>\n"
1073
+ (list.size < @top ? list.size : @top).times {|x|
1074
+ rpt << "<tr><td>#{x+1}</td><td>#{"%2.3f" % list[x].duration}</td><td>#{colorize(list[x].text)}</td></tr>\n"
1075
+ }
1076
+ rpt << "</table>\n"
1077
+ end
1078
+
1079
+ def create_report
1080
+ (@log.queries.reject{|q| q.duration.nil?}).sort {|a,b| b.duration.to_f <=> a.duration.to_f }.slice(0,@top)
1081
+ end
1082
+ end
1083
+
1084
+ class ParseErrorReport < GenericReport
1085
+
1086
+ def title
1087
+ "Parse Errors"
1088
+ end
1089
+
1090
+ def applicable
1091
+ !@log.parse_errors.empty?
1092
+ end
1093
+
1094
+ def text
1095
+ rpt = "######## #{title}\n"
1096
+ @log.parse_errors.each {|x| rpt << "#{x.exception} : #{x.line}\n" }
1097
+ rpt
1098
+ end
1099
+
1100
+ def html
1101
+ rpt = "<h3>#{title}</h3>\n"
1102
+ rpt << "<table><tr><th>Explanation</th><th>Offending line</th></tr>\n"
1103
+ @log.parse_errors.each {|x|
1104
+ rpt << "<tr><td>#{x.exception.message}</td><td>#{x.line}</td></tr>\n"
1105
+ }
1106
+ rpt << "</table>\n"
1107
+ end
1108
+ end
1109
+
1110
+ class ErrorReport < GenericReport
1111
+
1112
+ def title
1113
+ "Errors"
1114
+ end
1115
+
1116
+ def applicable
1117
+ !@log.errors.empty?
1118
+ end
1119
+
1120
+ def text
1121
+ rpt = "######## #{title}\n"
1122
+ @log.errors.each {|x| rpt << "#{x.error} : #{x.text}\n" }
1123
+ rpt
1124
+ end
1125
+
1126
+ def html
1127
+ rpt = "<h3>#{title}</h3>\n"
1128
+ rpt << "<table><tr><th>Error</th><th>Offending query</th></tr>\n"
1129
+ @log.errors.each {|x|
1130
+ message = "<p>#{x.error}</p>" + (x.detail.size > 0 ? "<p>DETAIL : #{x.detail}</p>" : '') + \
1131
+ (x.hint.size > 0 ? "<p>HINT : #{x.hint}</p>" : '')
1132
+ rpt << "<tr><td>#{message}</td><td>#{colorize(x.text)}</td></tr>\n"
1133
+ }
1134
+ rpt << "</table>\n"
1135
+ end
1136
+ end
1137
+
1138
+ class QueriesByTypeReport < GenericReport
1139
+
1140
+ def title
1141
+ "Queries by type"
1142
+ end
1143
+
1144
+ def html
1145
+ sel,ins,upd,del=create_report
1146
+ rpt = "<h3>#{title}</h3>\n"
1147
+ rpt << "<table><tr><th>Type</th><th>Count</th><th>Percentage</th></tr>\n"
1148
+ rpt << "<tr><td>SELECT</td><td>#{sel}</td><td align=center>#{(pctg_of(sel, @log.queries.size)*100).to_i}</td></tr>\n" if sel > 0
1149
+ rpt << "<tr><td>INSERT</td><td>#{ins}</td><td align=center>#{(pctg_of(ins, @log.queries.size)*100).to_i}</td></tr>\n" if ins > 0
1150
+ rpt << "<tr><td>UPDATE</td><td>#{upd}</td><td align=center>#{(pctg_of(upd, @log.queries.size)*100).to_i}</td></tr>\n" if upd > 0
1151
+ rpt << "<tr><td>DELETE</td><td>#{del}</td><td align=center>#{(pctg_of(del, @log.queries.size)*100).to_i}</td></tr>\n" if del > 0
1152
+ rpt << "</table>\n"
1153
+ end
1154
+
1155
+ def text
1156
+ sel,ins,upd,del=create_report
1157
+ rpt = "######## #{title}\n"
1158
+ rpt << "SELECTs: #{sel.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(sel, @log.queries.size)*100).to_i}%)\n" if sel > 0
1159
+ rpt << "INSERTs: #{ins.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(ins, @log.queries.size)*100).to_i}%)\n" if ins > 0
1160
+ rpt << "UPDATEs: #{upd.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(upd, @log.queries.size)*100).to_i}%)\n" if upd > 0
1161
+ rpt << "DELETEs: #{del.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(del, @log.queries.size)*100).to_i}%)\n" if del > 0
1162
+ rpt
1163
+ end
1164
+
1165
+ def create_report
1166
+ [@log.queries.find_all {|q| q.is_select}.size,
1167
+ @log.queries.find_all {|q| q.is_insert}.size,
1168
+ @log.queries.find_all {|q| q.is_update}.size,
1169
+ @log.queries.find_all {|q| q.is_delete}.size]
1170
+ end
1171
+ end
1172
+
1173
+ PQA.run if __FILE__ == $0