pqa 1.6

data/bin/pqa ADDED
@@ -0,0 +1,9 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ begin
4
+ require 'pqa'
5
+ rescue LoadError
6
+ require 'rubygems'
7
+ require 'pqa'
8
+ end
9
+ PQA.run
@@ -0,0 +1,78 @@
1
+ November 27, 2005 - 1.6:
2
+ Fixed the first line of the file - it was missing a character.
3
+ Made Gem runnable - i.e., now you can do "pqa -file [..etc, etc]" after doing a "gem install pqa"
4
+ Various minor speed improvements.
5
+ Wrote more unit tests.
6
+
7
+ January 31, 2005 - 1.5:
8
+ Fixed a bug that would sometimes associate a duration measurement with the incorrect query.
9
+ Fixed a bug which would cause the HTML report to error out with a nil-dereference exception.
10
+ Fixed a bug which caused PQA to choke on SQL statements that were issued as part of a stored procedure.
11
+ Restored the ability to run under Ruby 1.6.8.
12
+
13
+ July 27, 2004 - 1.4:
14
+ The syslog parser now checks for either LOG or DEBUG entries.
15
+ Fixed a bug in the syslog parser that would choke on entries like "postgres starting".
16
+
17
+ June 18, 2004 - 1.3:
18
+ Improved MySQL query log support.
19
+ Parse errors (if they occur) are now included in the report.
20
+ The default log format is now "pglog". Syslog still works, but you'll need to specify "-logtype syslog".
21
+
22
+ June 8, 2004 - 1.2:
23
+ Added MySQL query log file support.
24
+ Fixed bug - duration report links no longer appear in header if duration information is not available.
25
+
26
+ June 7, 2004 - 1.1:
27
+ Syslog parser now handles durations.
28
+ Added text versions of duration reports.
29
+ Removed some spurious entries from the reports (i.e., BEGIN;ROLLBACK).
30
+
31
+ May 17, 2004 - 1.0:
32
+ Numbers are now normalized, i.e., "select bar where foo=222" normalizes to the same query as "select bar where foo=123312321". This gives more accurate information on which queries occur most frequently and which take the most time.
33
+ Fixed bug - OverallStatsReport no longer displays "longest ran in 0.0 seconds" if the log does not include duration information
34
+ Fixed bug - OverallStatsReport now displays the correct number of unique queries. v0.9 listed the same number for both total and unique queries.
35
+ Fixed various bugs in syslog parsing - now it should work better with both PG 7.3 and PG 7.4. It's still a better idea to use the Postgres log, but if you must use syslog for some reason, it's better now.
36
+
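The number normalization described in the 1.0 entry can be sketched roughly as follows. The two patterns mirror the REMOVE_TEXT and REMOVE_NUMBERS constants in the Query class; the constant names and the helper normalize_sql are hypothetical and exist only for this illustration, so treat it as a sketch rather than the shipped code path.

```ruby
# Patterns modeled on Query::REMOVE_TEXT and Query::REMOVE_NUMBERS.
# The names below are illustrative, not part of PQA.
SKETCH_REMOVE_TEXT = /'[^']*'/
SKETCH_REMOVE_NUMBERS = /([^a-zA-Z_\$])([0-9]{1,10})/

# Hypothetical helper: returns a normalized copy of the SQL text.
def normalize_sql(sql)
  text = sql.gsub(/\\'/, '')                     # drop escaped quotes first
  text = text.gsub(SKETCH_REMOVE_TEXT, '{ }')    # quoted literals become a placeholder
  text = text.gsub(SKETCH_REMOVE_NUMBERS, '\1{ }') # numeric literals become a placeholder
  text.squeeze(' ').strip                        # collapse repeated spaces
end

a = normalize_sql('select bar where foo=222')
b = normalize_sql('select bar where foo=123312321')
puts a        # select bar where foo={ }
puts a == b   # true
```

Because both statements collapse to the same text, a frequency report that groups by query text counts them as one query.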
37
+ May 11, 2004 - 0.9:
38
+ Added ability to handle Postgres logs where log_pid/log_timestamp/log_connection has been enabled
39
+ Modified to support both "query:" and "statement:" as log entry preambles - i.e., works with PG 7.4 logs now.
40
+ The SQL colorizing works a bit better.
41
+ Updated documentation to include better postgresql.conf configuration details.
42
+
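As a rough illustration of the "query:" / "statement:" change in 0.9, one pattern can accept either preamble. The pattern is modeled on the QUERY_STARTER constant in PostgreSQLParser; the names SKETCH_QUERY_PREAMBLE and strip_preamble are hypothetical, and the sample entries are made up for the example.

```ruby
# Accept either preamble; modeled on PostgreSQLParser::QUERY_STARTER.
SKETCH_QUERY_PREAMBLE = /^(query|statement):\s*/

# Hypothetical helper: returns the SQL following the preamble,
# or nil when the entry does not start a query.
def strip_preamble(entry)
  m = SKETCH_QUERY_PREAMBLE.match(entry)
  m ? m.post_match : nil
end

puts strip_preamble('query: select 1')       # select 1
puts strip_preamble('statement: select 1')   # select 1
puts strip_preamble('connection received').inspect  # nil
```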
43
+ May 7, 2004 - 0.8:
44
+ Added UPDATE queries to the "Queries by type" report
45
+ Added support for parsing query duration data from the Postgres log
46
+ Added a "Queries that took up the most time" report
47
+ Added a "Slowest queries" report
48
+ Added a table of contents to the HTML report
49
+
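The duration support added in 0.8 boils down to matching a "duration:" entry and converting it to seconds. The sketch below follows the DURATION pattern and the parse_duration method in the library source; the helper name duration_in_seconds and the sample entries are assumptions made for illustration.

```ruby
# Modeled on PostgreSQLParser::DURATION.
SKETCH_DURATION = /^duration:([\s\d\.]*)(sec|ms)/

# Hypothetical helper: returns the duration in seconds as a Float,
# or nil when the entry carries no duration.
def duration_in_seconds(entry)
  m = SKETCH_DURATION.match(entry)
  return nil if m.nil?
  value = m[1].strip.to_f
  m[2] == 'ms' ? value / 1000.0 : value   # milliseconds are scaled to seconds
end

puts duration_in_seconds('duration: 1500.000 ms')   # 1.5
puts duration_in_seconds('duration: 0.25 sec')      # 0.25
```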
50
+ April 28, 2004 - 0.7:
51
+ Added support for using the Postgres log file. Syslog is still supported, of course.
52
+ Fixed bug which resulted in errors if the number of valid queries in a log was less than -top. Thanks to Tom De Bruyne for reporting this bug.
53
+ The SQL colorizing works a bit better now.
54
+ Various tweaks to HTML reports.
55
+
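The -top bug fixed in 0.7 is the classic case of asking for more rows than exist. A minimal sketch of the guard, assuming queries are plain strings and using the hypothetical helper name most_frequent (the library's own reports work on Query objects):

```ruby
# Hypothetical helper: counts identical queries, ranks them by
# frequency, and returns at most `top` [text, count] pairs.
def most_frequent(queries, top)
  counts = Hash.new(0)
  queries.each { |q| counts[q] += 1 }
  ranked = counts.sort { |a, b| b[1] <=> a[1] }
  limit = [ranked.size, top].min   # never index past the end of the list
  ranked[0, limit]
end

p most_frequent(['select a', 'select b', 'select a'], 10)
# [["select a", 2], ["select b", 1]]
```

Clamping the limit to the list size means a log with three distinct queries and -top 10 simply yields three rows.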
56
+ April 23, 2004 - 0.6:
57
+ Added a 'rank' column to the 'MostFrequentQueries' report.
58
+ Colorized the SQL keywords in the HTML report.
59
+
60
+ April 7, 2004 - 0.5:
61
+ Added HTML reports.
62
+
63
+ April 6, 2004 - 0.4:
64
+ Fixed a bug which prevented single digit date logs from being parsed.
65
+
66
+ March 17, 2004 - v0.3:
67
+ Fixed an off-by-one bug in the number of reports returned.
68
+ More optimizations, should again be about 10% faster.
69
+ Added a "query frequency by type" report.
70
+
71
+ March 9, 2004 - v0.2:
72
+ Fixed a connection id bug.
73
+ Various optimizations, should be about 10% faster.
74
+ Improved packaging.
75
+
76
+ March 5, 2004 - v0.1:
77
+ Can display queries by frequency.
78
+ Performs query normalization.
@@ -0,0 +1,31 @@
1
+ Copyright (c) 2003-2005, InfoEther, LLC
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without
5
+ modification, are permitted provided that the following conditions are
6
+ met:
7
+
8
+ * Redistributions of source code must retain the above copyright
9
+ notice, this list of conditions and the following disclaimer.
10
+ * Redistributions in binary form must reproduce the above copyright
11
+ notice, this list of conditions and the following disclaimer in the
12
+ documentation and/or other materials provided with the distribution.
13
+ * The end-user documentation included with the redistribution, if
14
+ any, must include the following acknowledgement:
15
+ "This product includes software developed in part by support from
16
+ InfoEther, LLC"
17
+ * Neither the name of InfoEther, LLC nor the names of its
18
+ contributors may be used to endorse or promote products derived from
19
+ this software without specific prior written permission.
20
+
21
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
22
+ IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
23
+ TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
24
+ PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
25
+ OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
26
+ EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
27
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
28
+ PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
29
+ LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
30
+ NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
31
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -0,0 +1,15 @@
1
+ Postgres Query Analyzer (PQA) is a little library to help you analyze the PostgreSQL query logs.
2
+
3
+ To use it, you'll need:
4
+ Ruby - http://ruby-lang.org/
5
+ PQA - http://pgfoundry.org/projects/pqa/
6
+
7
+ and of course a PostgreSQL installation.
8
+
9
+ There's an article I've written about PQA on DatabaseJournal.com:
10
+ http://www.databasejournal.com/features/postgresql/article.php/3323561
11
+
12
+ There's more documentation and a sample run here:
13
+ http://pqa.projects.postgresql.org/
14
+
15
+ Like PQA? Buy Tom's completely unrelated book, "PMD Applied"! http://pmdapplied.com/
@@ -0,0 +1,1173 @@
1
+ #!/usr/local/bin/ruby -w
2
+
3
+ DEFAULT_TOP=10
4
+ BUG_URL_STRING="This is a <a href=\"http://pgfoundry.org/tracker/?atid=130&group_id=1000008&func=browse\">bug</a>."
5
+
6
+ class PQA
7
+ def PQA.pqa_usage
8
+ puts "=============================="
9
+ puts "Usage: " + $0 + " [-logtype syslog|pglog|mysql] [-top n] [-normalize] [-format text|html] [-reports rep1,rep2,...,repn] -file log_file_name"
10
+ puts "Report types : overall, bytype, mosttime, slowest, mostfrequent, errors"
11
+ puts "For example:"
12
+ puts "ruby pqa.rb -logtype pglog -top 10 -normalize -format text -reports overall,slowest -file ../sample/pglog_sample.log"
13
+ puts "=============================="
14
+ end
15
+ def PQA.run
16
+ (PQA.pqa_usage ; exit) if ARGV.empty?
17
+ if !ARGV.include?("-file")
18
+ puts "=============================="
19
+ puts "## No log file specified; use the '-file' parameter"
20
+ pqa_usage ; exit
21
+ end
22
+ log = nil
23
+ if ARGV.include?("-logtype") && ARGV[ARGV.index("-logtype")+1] == "syslog"
24
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "SyslogPGParser", "PostgreSQLAccumulator")
25
+ elsif ARGV.include?("-logtype") && ARGV[ARGV.index("-logtype")+1] == "mysql"
26
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "MySQLLogLine", "MySQLAccumulator")
27
+ else
28
+ log = GenericLogReader.new(ARGV[ARGV.index("-file")+1], "PostgresLogParser", "PostgreSQLAccumulator")
29
+ end
30
+ log.parse
31
+ log.normalize if ARGV.include?("-normalize")
32
+ top = (ARGV.include?("-top") ? ARGV[ARGV.index("-top")+1] : DEFAULT_TOP).to_i
33
+ format = (ARGV.include?("-format") ? ARGV[ARGV.index("-format")+1] : "text")
34
+
35
+ rpts = []
36
+ if ARGV.include?("-reports")
37
+ reports_array = ARGV[ARGV.index("-reports")+1].split(',')
38
+ rpts.push(OverallStatsReport.new(log)) if reports_array.include?("overall")
39
+ rpts.push(QueriesByTypeReport.new(log)) if reports_array.include?("bytype")
40
+ rpts.push(QueriesThatTookUpTheMostTimeReport.new(log,top)) if reports_array.include?("mosttime")
41
+ rpts.push(SlowestQueriesReport.new(log, top)) if reports_array.include?("slowest")
42
+ rpts.push(MostFrequentQueriesReport.new(log, top)) if reports_array.include?("mostfrequent")
43
+ rpts.push(ErrorReport.new(log)) if reports_array.include?("errors")
44
+ #rpts.push() if reports_array.include?("")
45
+ rpts.push(ParseErrorReport.new(log))
46
+ else
47
+ rpts = [OverallStatsReport.new(log), QueriesByTypeReport.new(log), QueriesThatTookUpTheMostTimeReport.new(log, top), SlowestQueriesReport.new(log, top), MostFrequentQueriesReport.new(log, top), ParseErrorReport.new(log)]
48
+ end
49
+ report_aggregator = (format == "text") ? TextReportAggregator.new : HTMLReportAggregator.new
50
+ puts report_aggregator.create(rpts)
51
+ end
52
+ end
53
+
54
+ # Log file parsers
55
+ class ParseError
56
+ attr_reader :exception, :line
57
+ def initialize(e, line)
58
+ @exception = e
59
+ @line = line
60
+ end
61
+ end
62
+
63
+ class GenericLogReader
64
+ DEBUG = false
65
+ attr_accessor :includes_duration, :queries, :errors, :parse_errors
66
+ attr_reader :time_to_parse
67
+
68
+ def initialize(filename, line_parser_name, accumulator_name)
69
+ @filename = filename
70
+ @line_parser_name = line_parser_name
71
+ @accumulator_name = accumulator_name
72
+ @includes_duration = false
73
+ @queries, @errors , @parse_errors= [], [], []
74
+ end
75
+
76
+ def parse
77
+ start = Time.new
78
+ a = Object.const_get(@accumulator_name).new
79
+ puts "Using #{@accumulator_name}" if DEBUG
80
+ p = Object.const_get(@line_parser_name).new
81
+ puts "Using #{@line_parser_name}" if DEBUG
82
+ File.foreach(@filename) {|text|
83
+ begin
84
+ line = p.parse(text)
85
+ if line
86
+ a.append(line)
87
+ else
88
+ # text.gsub!(/\n/, '\n').gsub!(/\t/, '\t')
89
+ # $stderr.puts "Unrecognized text: '#{text}'"
90
+ end
91
+ rescue StandardError => e
92
+ @parse_errors << ParseError.new(e,line)
93
+ end
94
+ }
95
+ @time_to_parse = Time.new - start
96
+ a.close_out_all
97
+ @queries = a.queries
98
+ @errors = a.errors
99
+ @includes_duration = a.has_duration_info
100
+ end
101
+
102
+ def normalize
103
+ @queries.each {|q| q.normalize }
104
+ end
105
+
106
+ def unique_queries
107
+ uniq = []
108
+ @queries.each {|x| uniq << x.text if !uniq.include?(x.text) }
109
+ uniq.size
110
+ end
111
+ end
112
+
113
+ #
114
+ # MySQL Parsing is broken
115
+ #
116
+
117
+ class MySQLLogLine
118
+ DISCARD = Regexp.new("(^Time )|(^Tcp)|( Quit )|( USE )|(Connect)")
119
+ START_QUERY = Regexp.new('\d{1,5} Query')
120
+ attr_reader :text, :is_new_query, :recognized
121
+
122
+ def initialize(text)
123
+ @recognized = true
124
+ @is_new_query = false
125
+ if DISCARD.match(text) != nil
126
+ @recognized = false
127
+ return
128
+ end
129
+ @text = text
130
+ @is_new_query = START_QUERY.match(@text) != nil
131
+
132
+ end
133
+
134
+ def is_continuation
135
+ @recognized && !/^(\d{1,6})|(\s*)/.match(@text).nil?
136
+ end
137
+
138
+ def is_duration_line
139
+ false
140
+ end
141
+
142
+ def parse_query_segment
143
+ if @is_new_query
144
+ tmp = START_QUERY.match(@text.strip)
145
+ raise StandardError.new("PQA identified a line as the start of a new query, but then was unable to match it with the START_QUERY Regex. #{BUG_URL_STRING}") if tmp == nil
146
+ return tmp.post_match.strip
147
+ end
148
+ @text.strip.chomp
149
+ end
150
+
151
+ def to_s
152
+ @text
153
+ end
154
+ end
155
+
156
+ class MySQLAccumulator
157
+ attr_reader :queries
158
+
159
+ def initialize
160
+ @current = nil
161
+ @queries = []
162
+ end
163
+
164
+ def new_query_start(line)
165
+ @queries << @current if !@current.nil?
166
+ @current = Query.new(line.parse_query_segment)
167
+ end
168
+
169
+ def query_continuation(line)
170
+ @current.append(line.parse_query_segment) if !@current.nil?
171
+ end
172
+
173
+ def close_out_all ; end
174
+ end
175
+
176
+ #
177
+ # PostgreSQL lines Classes
178
+ #
179
+
180
+
181
+
182
+ class PGLogLine
183
+ DEBUG = false
184
+ attr_accessor :connection_id, :cmd_no, :line_no
185
+ attr_reader :text, :duration, :ignore
186
+
187
+ def initialize(text = "NO TEXT", duration = nil)
188
+ @text = text.chomp
189
+ @duration = duration
190
+
191
+ if text.nil?
192
+ $stderr.puts "Nil text for line text !" if DEBUG
193
+ end
194
+
195
+
196
+ # for tracking
197
+ @connection_id = nil
198
+ @cmd_no = nil
199
+ @line_no = nil
200
+ end
201
+
202
+ def to_s
203
+ @text
204
+ end
205
+
206
+ def parse_duration(time_str, unit)
207
+ unit == "ms" ? (time_str.to_f / 1000.0) : time_str.to_f
208
+ end
209
+
210
+ def dump
211
+ self.class.to_s + "(" + @connection_id.to_s + "): " + text
212
+ end
213
+ end
214
+
215
+ class PGQueryStarter < PGLogLine
216
+ attr_reader :ignore
217
+
218
+ def initialize(text, duration = nil)
219
+ super(filter_query(text), duration)
220
+ end
221
+
222
+ def filter_query(text)
223
+ @ignore = (text =~ /begin/i) || (text =~ /VACUUM/i) || (text =~ /^select 1$/i)
224
+ return text
225
+ end
226
+
227
+ def append_to(queries)
228
+ queries.push(Query.new(@text, @ignore))
229
+ return nil
230
+ end
231
+
232
+ end
233
+
234
+ class PGQueryStarterWithDuration < PGQueryStarter
235
+ ignore = false
236
+
237
+ def initialize(text, time_str, unit)
238
+ @time_str = time_str
239
+ @unit = unit
240
+ text_match = /[\s]*(query|statement):[\s]*/i.match(text)
241
+ if text_match
242
+ super(text_match.post_match, parse_duration(time_str, unit))
243
+ else
244
+ $stderr.puts "Found garbage after Duration line : #{text}"
245
+ super(text, parse_duration(time_str, unit))
246
+ end
247
+ end
248
+
249
+ def append_to(queries)
250
+ queries.got_duration!
251
+ closed_query = queries.pop
252
+ query = Query.new(@text, @ignore)
253
+ query.duration = @duration
254
+ queries.push(query)
255
+ return closed_query
256
+ end
257
+
258
+ end
259
+
260
+ class PGContinuationLine < PGLogLine
261
+ ignore = false
262
+
263
+ def initialize(text, duration = nil)
264
+ super(text.gsub(/\^I/, "\t"))
265
+ end
266
+
267
+
268
+ def append_to(queries)
269
+ if queries.last.nil?
270
+ $stderr.puts "Continuation for no previous query"
271
+ else
272
+ queries.last.append(@text)
273
+ end
274
+ return nil
275
+ end
276
+
277
+ end
278
+
279
+ # Durations
280
+
281
+ class PGDurationLine < PGLogLine
282
+ ignore = false
283
+
284
+ def initialize(time_str, unit)
285
+ @time_str = time_str
286
+ @unit = unit
287
+ super("NO TEXT", parse_duration(time_str, unit))
288
+ end
289
+
290
+ def append_to(queries)
291
+ if queries.last.nil?
292
+ $stderr.puts "Duration for no previous query"
293
+ return nil
294
+ else
295
+ queries.got_duration!
296
+ queries.last.duration = @duration
297
+ return queries.pop
298
+ end
299
+ end
300
+
301
+ end
302
+
303
+ # Error Management
304
+ # Those 4 classes are untested
305
+ # keep ignore = true for the moment
306
+
307
+ class PGErrorLine < PGLogLine
308
+ ignore = false
309
+
310
+ def append_to(errors)
311
+ closed_query = errors.pop
312
+ errors.push(ErrorQuery.new(@text))
313
+ return closed_query
314
+ end
315
+
316
+ end
317
+
318
+ class PGHintLine < PGLogLine
319
+ ignore = false
320
+
321
+ def append_to(errors)
322
+ if errors.last
323
+ errors.last.append_hint(@text)
324
+ else
325
+ $stderr.puts "Hint for no previous error"
326
+ end
327
+ return nil
328
+ end
329
+
330
+ end
331
+
332
+ class PGDetailLine < PGLogLine
333
+ ignore = false
334
+
335
+ def append_to(errors)
336
+ if errors.last
337
+ errors.last.append_detail(@text)
338
+ else
339
+ $stderr.puts "Detail for no previous error"
340
+ end
341
+ return nil
342
+ end
343
+
344
+ end
345
+
346
+ class PGStatementLine < PGLogLine
347
+ ignore = false
348
+
349
+ def append_to(errors)
350
+ if errors.last
351
+ errors.last.append_statement(@text)
352
+ else
353
+ $stderr.puts "Statement for no previous error"
354
+ end
355
+ return nil
356
+ end
357
+
358
+ end
359
+
360
+ # Contexts
361
+
362
+ class PGContextLine < PGLogLine
363
+ ignore = false
364
+
365
+ SQL_STATEMENT = /^SQL statement "/
366
+ SQL_FUNCTION = /([^\s]+)[\s]+function[\s]+"([^"]+)"(.*)$/
367
+
368
+ def initialize(text)
369
+ statement_match = SQL_STATEMENT.match(text)
370
+ if statement_match
371
+ super(statement_match.post_match[0..-1])
372
+ else
373
+ function_match = SQL_FUNCTION.match(text)
374
+ if function_match
375
+ super(function_match[2])
376
+ else
377
+ $stderr.puts "Unrecognized Context" if DEBUG
378
+ super(text)
379
+ end
380
+ @match_all = true
381
+ end
382
+ end
383
+
384
+ def append_to(queries)
385
+ sub_query = queries.pop
386
+ if sub_query.nil?
387
+ $stderr.puts "Missing Query for Context"
388
+ elsif queries.last
389
+ queries.last.set_subquery(sub_query.to_s)
390
+ else
391
+ $stderr.puts "Context for no previous Query"
392
+ end
393
+ return nil
394
+ end
395
+
396
+ end
397
+
398
+ # Statuses
399
+ # This class is untested
400
+ # please keep ignore = true for the moment
401
+
402
+ class PGStatusLine < PGLogLine
403
+ ignore = true
404
+ CONN_RECV = /connection received: host=([^\s]+) port=([\d]+)/
405
+ CONN_AUTH = /connection authorized: user=([^\s]+) database=([^\s]+)/
406
+
407
+ def append_to(stream)
408
+ conn_recv = CONN_RECV.match(@text)
409
+ if conn_recv
410
+ stream.set_host_conn!(conn_recv[1], conn_recv[2])
411
+ end
412
+
413
+ conn_auth = CONN_AUTH.match(@text)
414
+ if conn_auth
415
+ stream.set_user_db!(conn_auth[1], conn_auth[2])
416
+ end
417
+ return nil
418
+ end
419
+
420
+ end
421
+
422
+
423
+
424
+ class PostgreSQLParser
425
+ LOG_OR_DEBUG_LINE = Regexp.new('^(LOG|DEBUG):[\s]*')
426
+ QUERY_STARTER = Regexp.new('^(query|statement):[\s]*')
427
+ STATUS = Regexp.new("^(connection|received|unexpected EOF)")
428
+ DURATION = Regexp.new('^duration:([\s\d\.]*)(sec|ms)')
429
+ CONTINUATION_LINE = /^(\^I|\s|\t)/
430
+ CONTEXT_LINE = /^CONTEXT:[\s]*/
431
+ ERROR_LINE = /^(WARNING|ERROR|FATAL|PANIC):[\s]*/
432
+ HINT_LINE = /^HINT:[\s]*/
433
+ DETAIL_LINE = /^DETAIL:[\s]*/
434
+ STATEMENT_LINE = /^STATEMENT:[\s]*/
435
+
436
+ def parse(text)
437
+ logdebug_match = LOG_OR_DEBUG_LINE.match(text)
438
+ if logdebug_match
439
+
440
+ query_match = QUERY_STARTER.match(logdebug_match.post_match)
441
+ if query_match
442
+ return PGQueryStarter.new(query_match.post_match)
443
+ end
444
+
445
+ duration_match = DURATION.match(logdebug_match.post_match)
446
+ if duration_match
447
+ additionnal_info = duration_match.post_match.strip.chomp
448
+ if additionnal_info == ""
449
+ return PGDurationLine.new(duration_match[1].strip, duration_match[2])
450
+ else
451
+ return PGQueryStarterWithDuration.new(additionnal_info, duration_match[1].strip, duration_match[2])
452
+ end
453
+ end
454
+
455
+ status_match = STATUS.match(logdebug_match.post_match)
456
+ if status_match
457
+ return PGStatusLine.new(logdebug_match.post_match)
458
+ end
459
+
460
+ # $stderr.puts "Unrecognized LOG or DEBUG line: #{text}"
461
+ return nil
462
+ end
463
+
464
+ error_match = ERROR_LINE.match(text)
465
+ if error_match
466
+ return PGErrorLine.new(error_match.post_match)
467
+ end
468
+
469
+ context_match = CONTEXT_LINE.match(text)
470
+ if context_match
471
+ return PGContextLine.new(context_match.post_match)
472
+ end
473
+
474
+ continuation_match = CONTINUATION_LINE.match(text)
475
+ if continuation_match
476
+ return PGContinuationLine.new(continuation_match.post_match)
477
+ end
478
+
479
+ statement_match = STATEMENT_LINE.match(text)
480
+ if statement_match
481
+ return PGStatementLine.new(statement_match.post_match)
482
+ end
483
+
484
+ hint_match = HINT_LINE.match(text)
485
+ if hint_match
486
+ return PGHintLine.new(hint_match.post_match)
487
+ end
488
+
489
+ detail_match = DETAIL_LINE.match(text)
490
+ if detail_match
491
+ return PGDetailLine.new(detail_match.post_match)
492
+ end
493
+
494
+ if text.strip.chomp == ""
495
+ return PGContinuationLine.new("")
496
+ end
497
+
498
+ # $stderr.puts "Unrecognized PostgreSQL log line: #{text}"
499
+ return nil
500
+ end
501
+
502
+ end
503
+
504
+ class SyslogPGParser < PostgreSQLParser
505
+ CMD_LINE = Regexp.new('\[(\d{1,10})(\-\d{1,5}){0,1}\] ')
506
+
507
+
508
+ def initialize(syslog_str = 'postgres')
509
+ @postgres_pid = Regexp.new(" " + syslog_str + '\[(\d{1,5})\]: ')
510
+ end
511
+
512
+ def parse(data)
513
+ recognized = false
514
+
515
+ pid_match=@postgres_pid.match(data)
516
+ return if pid_match.nil?
517
+
518
+ connection_id = pid_match[1]
519
+ text = pid_match.post_match
520
+ return nil if text == nil
521
+
522
+ line_id_match = CMD_LINE.match(text)
523
+ return nil if line_id_match.nil?
524
+
525
+ text = line_id_match.post_match
526
+ cmd_no = line_id_match[1]
527
+ if line_id_match[2]
528
+ line_no = line_id_match[2][1..-1]
529
+ else
530
+ line_no = 1
531
+ end
532
+
533
+
534
+ result = super(text)
535
+ return nil if result.nil?
536
+
537
+ result.connection_id = connection_id
538
+ result.cmd_no = cmd_no
539
+ result.line_no = line_no
540
+
541
+ # $stderr.puts result.dump
542
+
543
+ return result
544
+ end
545
+
546
+ end
547
+
548
+
549
+ class PostgresLogParser < PostgreSQLParser
550
+ STARTS_WITH_DATE=Regexp.new("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] ")
551
+ STARTS_WITH_PID=Regexp.new('\[(\d{1,5})\] ')
552
+
553
+ def initialize
554
+ @conn_id_found = false
555
+ @last_conn_id = nil
556
+ end
557
+
558
+ def parse(text)
559
+ connection_id = nil
560
+ text = STARTS_WITH_DATE.match(text) ? text.split(" ")[2..-1].join(" ").strip : text
561
+
562
+ pid_match = STARTS_WITH_PID.match(text)
563
+ if pid_match
564
+ @conn_id_found = true
565
+ connection_id = pid_match[1]
566
+ @last_conn_id = connection_id
567
+ text = pid_match.post_match.strip
568
+ end
569
+
570
+ result = super(text)
571
+ # Badly formatted continuations need this...
572
+ #if result.nil?
573
+ # result = PGContinuationLine.new(text)
574
+ #end
575
+ return nil if result.nil?
576
+
577
+ if pid_match
578
+ result.connection_id = connection_id
579
+ else
580
+ result.connection_id = @last_conn_id
581
+ end
582
+
583
+ # $stderr.puts result.dump
584
+
585
+ return result
586
+ end
587
+ end
588
+
589
+
590
+ class LogStream
591
+ attr_reader :has_duration_info, :queries
592
+ def initialize
593
+ @queries = []
594
+ @has_duration_info = false
595
+
596
+ @host = "UNKNOWN"
597
+ @port = "UNKNOWN"
598
+ @user = "UNKNOWN"
599
+ @db = "UNKNOWN"
600
+ end
601
+
602
+ def queries
603
+ @queries.reject {|q| q.ignored}
604
+ end
605
+
606
+ def append(line)
607
+ return line.append_to(self)
608
+ end
609
+
610
+ def push(query)
611
+ query.set_db(@db)
612
+ query.set_user(@user)
613
+ @queries.push(query)
614
+ end
615
+
616
+ def pop
617
+ @queries.pop
618
+ end
619
+
620
+ def last
621
+ @queries.last
622
+ end
623
+
624
+ def set_host_conn!(host, port)
625
+ @host = host
626
+ @port = port
627
+ end
628
+
629
+ def set_user_db!(user, db)
630
+ @user = user
631
+ @db = db
632
+ end
633
+
634
+ def got_duration!
635
+ @has_duration_info = true
636
+ end
637
+ end
638
+
639
+
640
+ class PostgreSQLAccumulator
641
+ attr_reader :queries, :errors, :has_duration_info
642
+
643
+ def initialize
644
+ @queries = []
645
+ @errors = []
646
+ @working = {}
647
+ @stream = LogStream.new
648
+ @has_duration_info = false
649
+ end
650
+
651
+ def append(line)
652
+ if line.connection_id
653
+ if !@working.has_key?(line.connection_id)
654
+ @working[line.connection_id] = LogStream.new
655
+ end
656
+ query = @working[line.connection_id].append(line)
657
+ else
658
+ # no pid mode :
659
+ query = @stream.append(line)
660
+ end
661
+ if query && !query.ignored
662
+ query.accumulate_to(self)
663
+ end
664
+ end
665
+
666
+ def append_query(query)
667
+ @queries.push(query)
668
+ end
669
+
670
+ def append_error(error)
671
+ @errors.push(error)
672
+ end
673
+
674
+ def close_out_all
675
+ @stream.queries.each { |q| q.accumulate_to(self) }
676
+ @has_duration_info = @stream.has_duration_info
677
+ @working.each {|k, stream|
678
+ stream.queries.each { |q| q.accumulate_to(self) }
679
+ @has_duration_info = @has_duration_info || stream.has_duration_info
680
+ }
681
+ end
682
+ end
683
+
684
+ class Query
685
+ DEBUG = false
686
+ REMOVE_TEXT = Regexp.new("'[^']*'")
687
+ REMOVE_NUMBERS = Regexp.new('([^a-zA-Z_\$])([0-9]{1,10})')
688
+
689
+ attr_reader :text, :db, :user
690
+ attr_accessor :duration, :ignored, :q_id
691
+
692
+ def initialize(text="", ignored=false)
693
+ $stderr.puts "NIL txt for Query" if text.nil? && DEBUG
694
+ @text = text
695
+ @duration = nil
696
+ @subqueries = []
697
+ @parsing_subs = false
698
+ @ignored = ignored
699
+ @normalized = false
700
+ end
701
+
702
+ def append(txt)
703
+ $stderr.puts "NIL txt for append" if txt.nil? && DEBUG
704
+ if @parsing_subs
705
+ @subqueries.last << " " << txt
706
+ else
707
+ @text << " " << txt
708
+ end
709
+ end
710
+
711
+ def set_subquery(text)
712
+ $stderr.puts "NIL txt for sub_q" if text.nil? && DEBUG
713
+ @parsing_subs = true
714
+ @subqueries << text
715
+ end
716
+
717
+ def set_db(db)
718
+ @db = db
719
+ end
720
+
721
+ def set_user(user)
722
+ @user = user
723
+ end
724
+
725
+ def normalize
726
+ if @text
727
+ @text.gsub!(/\\'/, '')
728
+ @text.gsub!(REMOVE_TEXT, "{ }")
729
+ @text.gsub!(REMOVE_NUMBERS, '\1{ }')
730
+ @text.squeeze!(" ")
731
+ @text.strip!
732
+ end
733
+ @normalized = true
734
+ @text
735
+ end
736
+
737
+ def accumulate_to(accumulator)
738
+ accumulator.append_query(self)
739
+ end
740
+ #
741
+ # Does not work for the moment
742
+ #
743
+ # def text
744
+ # if @normalized
745
+ # @text
746
+ # else
747
+ # "[" + @db + "," + @user + "] " + @text
748
+ # end
749
+ # end
750
+ # def to_s
751
+ # text
752
+ # end
753
+ def to_s
754
+ @text
755
+ end
756
+
757
+ def is_select
758
+ check(/^SELECT/i)
759
+ end
760
+
761
+ def is_delete
762
+ check(/^DELETE/i)
763
+ end
764
+
765
+ def is_insert
766
+ check(/^INSERT/i)
767
+ end
768
+
769
+ def is_update
770
+ check(/^UPDATE/i)
771
+ end
772
+
773
+ def check(regexp)
774
+ regexp.match(@text.strip) != nil
775
+ end
776
+ end
777
+
778
+ # Errors not used for the moment
779
+
780
+ class ErrorQuery < Query
781
+ attr_reader :text, :hint, :detail, :error
782
+
783
+ is_select = false
784
+ is_delete = false
785
+ is_insert = false
786
+ is_update = false
787
+
788
+ def initialize(text="NO ERROR MESSAGE")
789
+ @error = text
790
+ @hint = ''
791
+ @detail = ''
792
+ super("NO STATEMENT")
793
+ end
794
+
795
+ def append_statement(text)
796
+ $stderr.puts "NIL txt for error statement" if text.nil? && DEBUG
797
+ @text=text
798
+ end
799
+
800
+ def append_hint(text)
801
+ $stderr.puts "NIL txt for error hint" if text.nil? && DEBUG
802
+ @hint = text
803
+ end
804
+
805
+ def append_detail(text)
806
+ $stderr.puts "NIL txt for error detail" if text.nil? && DEBUG
807
+ @detail = text
808
+ end
809
+
810
+ def accumulate_to(accumulator)
811
+ accumulator.append_error(self)
812
+ end
813
+
814
+ end
815
+
816
+ # Reports
817
+ class TextReportAggregator
818
+ def create(reports)
819
+ rpt = ""
820
+ reports.each {|r|
821
+ next if !r.applicable
822
+ rpt << r.text
823
+ }
824
+ rpt
825
+ end
826
+ end
827
+
828
+ class HTMLReportAggregator
829
+ def create(reports)
830
+ rpt = "<html><head>"
831
+ rpt << <<EOS
832
+ <style type="text/css">
833
+ body { background-color:white; }
834
+ h2 { text-align:center; }
835
+ h3 { color:blue }
836
+ p, td, th { font-family:Courier, Arial, Helvetica, sans-serif; font-size:14px; }
837
+ th { color:white; background-color:#7B8CBE; }
838
+ span.keyword { color:blue; }
839
+ </style>
840
+ EOS
841
+ #tr { background-color:#E1E8FD; }
842
+ rpt << "<title>SQL Query Analysis (generated #{Time.now})</title></head><body>\n"
843
+ rpt << "<h2>SQL Query Analysis (generated #{Time.now})</h2><br>\n"
844
+ rpt << "<hr><center>"
845
+ rpt << "<table><tr><th>Reports</th></tr>"
846
+ reports.each_index {|x|
847
+ next if !reports[x].applicable
848
+ link = "<a href=\"#report#{x}\">#{reports[x].title}</a>"
849
+ rpt << "<tr><td>#{link}</td></tr>"
850
+ }
851
+ rpt << "</table>"
852
+ rpt << "<hr></center>"
853
+ reports.each_index {|x|
854
+ next if !reports[x].applicable
855
+ rpt << "<a name=\"report#{x}\"> </a>"
856
+ rpt << reports[x].html
857
+ }
858
+ rpt << "</body></html>\n"
859
+ end
860
+ end
861
+
862
+ class GenericReport
863
+
864
+ def initialize(log)
865
+ @log = log
866
+ end
867
+
868
+ def colorize(txt)
869
+ ["SELECT","UPDATE","INSERT INTO","DELETE","WHERE","VALUES","FROM","AND","ORDER BY","GROUP BY","LIMIT", "OFFSET", "DESC","ASC","AS","EXPLAIN","DROP","EXEC"].each {|w|
870
+ txt = txt.gsub(Regexp.new(w), "<span class='keyword'>#{w}</span>")
871
+ }
872
+ ["select","update","from","where","explain","drop"].each {|w|
873
+ txt = txt.gsub(Regexp.new(w), "<span class='keyword'>#{w}</span>")
874
+ }
875
+ txt
876
+ end
877
+
878
+ def title
879
+ "Unnamed report"
880
+ end
881
+
882
+ def pctg_of(a,b)
883
+ a > 0 ? (((a.to_f/b.to_f)*100.0).round)/100.0 : 0
884
+ end
885
+
886
+ def round(x, places)
887
+ (x * 10.0 ** places).round / (10.0 ** places)
888
+ end
889
+
890
+ def applicable
891
+ true
892
+ end
893
+ end
894
+
895
+ class OverallStatsReport < GenericReport
896
+
897
+ def html
898
+ rpt = "<h3>#{title}</h3>\n"
899
+ rpt << "#{@log.queries.size} queries\n"
900
+ rpt << "<br>#{@log.unique_queries} unique queries\n"
901
+ if @log.includes_duration
902
+ rpt << "<br>Total query duration was #{round(total_duration, 2)} seconds\n"
903
+ longest = find_longest
904
+ rpt << "<br>Longest query (#{colorize(longest.text)}) ran in #{"%2.3f" % longest.duration} seconds\n"
905
+ shortest = find_shortest
906
+ rpt << "<br>Shortest query (#{colorize(shortest.text)}) ran in #{"%2.3f" % shortest.duration} seconds\n"
907
+ end
908
+ rpt << "<br>Log file parsed in #{"%2.1f" % @log.time_to_parse} seconds\n"
909
+ end
910
+
911
+ def title
912
+ "Overall statistics"
913
+ end
914
+
915
+ def text
916
+ rpt = "######## #{title}\n"
917
+ rpt << "#{@log.queries.size} queries (#{@log.unique_queries} unique)"
918
+ rpt << ", longest ran in #{find_longest.duration} seconds," if @log.includes_duration
919
+ rpt << " parsed in #{@log.time_to_parse} seconds\n"
920
+ end
921
+
922
+ def total_duration
923
+ @log.queries.inject(0) {|sum, q| sum += (q.duration != nil) ? q.duration : 0 }
924
+ end
925
+
926
+ def find_shortest
927
+ q = Query.new("No queries found")
928
+ @log.queries.min {|a,b|
929
+ return b if a.duration.nil?
930
+ return a if b.duration.nil?
931
+ a.duration <=> b.duration
932
+ }
933
+ end
934
+
935
+ def find_longest
936
+ q = Query.new("No queries found")
937
+ @log.queries.max {|a,b|
938
+ return b if a.duration.nil?
939
+ return a if b.duration.nil?
940
+ a.duration <=> b.duration
941
+ }
942
+ end
943
+ end
944
+
945
+ class MostFrequentQueriesReport < GenericReport
946
+
947
+ def initialize(log, top=DEFAULT_TOP)
948
+ super(log)
949
+ @top = top
950
+ end
951
+
952
+ def title
953
+ "Most frequent queries"
954
+ end
955
+
956
+ def html
957
+ list = create_report
958
+ rpt = "<h3>#{title}</h3>\n"
959
+ rpt << "<table><tr><th>Rank</th><th>Times executed</th><th>Query text</th></tr>\n"
960
+ (list.size < @top ? list.size : @top).times {|x|
961
+ rpt << "<tr><td>#{x+1}</td><td>#{list[x][1]}</td><td>#{colorize(list[x][0])}</td></tr>\n"
962
+ }
963
+ rpt << "</table>\n"
964
+ end
965
+
966
+ def text
967
+ list = create_report
968
+ rpt = "######## #{title}\n"
969
+ (list.size < @top ? list.size : @top).times {|x|
970
+ rpt << list[x][1].to_s + " times: " + list[x][0].to_s + "\n"
971
+ }
972
+ rpt
973
+ end
974
+
975
+ def create_report
976
+ h = {}
977
+ @log.queries.each {|q|
978
+ h[q.text] = 0 if !h.has_key?(q.text)
979
+ h[q.text] += 1
980
+ }
981
+ h.sort {|a,b| b[1] <=> a[1] }
982
+ end
983
+ end
984
+
985
+ class LittleWrapper
986
+ attr_accessor :total_duration, :count, :q
987
+
988
+ def initialize(q)
989
+ @q = q
990
+ @total_duration = 0.0
991
+ @count = 0
992
+ end
993
+
994
+ def add(q)
995
+ return if q.duration.nil?
996
+ @total_duration += q.duration
997
+ @count += 1
998
+ end
999
+ end
1000
+
1001
+ class QueriesThatTookUpTheMostTimeReport < GenericReport
1002
+ def initialize(log, top=DEFAULT_TOP)
1003
+ super(log)
1004
+ @top = top
1005
+ end
1006
+
1007
+ def title
1008
+ "Queries that took up the most time"
1009
+ end
1010
+
1011
+ def applicable
1012
+ @log.includes_duration
1013
+ end
1014
+
1015
+ def html
1016
+ list = create_report
1017
+ rpt = "<h3>#{title}</h3>\n"
1018
+ rpt << "<table><tr><th>Rank</th><th>Total time (seconds)</th><th>Times executed</th><th>Query text</th></tr>\n"
1019
+ (list.size < @top ? list.size : @top).times {|x|
1020
+ rpt << "<tr><td>#{x+1}</td><td>#{"%2.3f" % list[x][1].total_duration}</td><td align=right>#{list[x][1].count}</td><td>#{colorize(list[x][0])}</td></tr>\n"
1021
+ }
1022
+ rpt << "</table>\n"
1023
+ end
1024
+
1025
+ def text
1026
+ list = create_report
1027
+ rpt = "######## #{title}\n"
1028
+ (list.size < @top ? list.size : @top).times {|x|
1029
+ rpt << "#{"%2.3f" % list[x][1].total_duration} seconds: #{list[x][0]}\n"
1030
+ }
1031
+ rpt
1032
+ end
1033
+
1034
+ def create_report
1035
+ h = {}
1036
+ @log.queries.each {|q|
1037
+ next if q.duration.nil?
1038
+ h[q.text] = LittleWrapper.new(q) if !h.has_key?(q.text)
1039
+ h[q.text].add(q)
1040
+ }
1041
+ h.sort {|a,b| b[1].total_duration <=> a[1].total_duration }
1042
+ end
1043
+ end
1044
+
1045
+ class SlowestQueriesReport < GenericReport
1046
+
1047
+ def initialize(log, top=DEFAULT_TOP)
1048
+ super(log)
1049
+ @top = top
1050
+ end
1051
+
1052
+ def applicable
1053
+ @log.includes_duration
1054
+ end
1055
+
1056
+ def title
1057
+ "Slowest queries"
1058
+ end
1059
+
1060
+ def text
1061
+ list = create_report
1062
+ rpt = "######## #{title}\n"
1063
+ (list.size < @top ? list.size : @top).times {|x|
1064
+ rpt << "#{"%2.3f" % list[x].duration} seconds: #{list[x].text}\n"
1065
+ }
1066
+ rpt
1067
+ end
1068
+
1069
+ def html
1070
+ list = create_report
1071
+ rpt = "<h3>#{title}</h3>\n"
1072
+ rpt << "<table><tr><th>Rank</th><th>Time</th><th>Query text</th></tr>\n"
1073
+ (list.size < @top ? list.size : @top).times {|x|
1074
+ rpt << "<tr><td>#{x+1}</td><td>#{"%2.3f" % list[x].duration}</td><td>#{colorize(list[x].text)}</td></tr>\n"
1075
+ }
1076
+ rpt << "</table>\n"
1077
+ end
1078
+
1079
+ def create_report
1080
+ (@log.queries.reject{|q| q.duration.nil?}).sort {|a,b| b.duration.to_f <=> a.duration.to_f }.slice(0,@top)
1081
+ end
1082
+ end
1083
+
1084
+ class ParseErrorReport < GenericReport
1085
+
1086
+ def title
1087
+ "Parse Errors"
1088
+ end
1089
+
1090
+ def applicable
1091
+ !@log.parse_errors.empty?
1092
+ end
1093
+
1094
+ def text
1095
+ rpt = "######## #{title}\n"
1096
+ @log.parse_errors.each {|x| rpt << "#{x.exception} : #{x.line}\n" }
1097
+ rpt
1098
+ end
1099
+
1100
+ def html
1101
+ rpt = "<h3>#{title}</h3>\n"
1102
+ rpt << "<table><tr><th>Explanation</th><th>Offending line</th></tr>\n"
1103
+ @log.parse_errors.each {|x|
1104
+ rpt << "<tr><td>#{x.exception.message}</td><td>#{x.line}</td></tr>\n"
1105
+ }
1106
+ rpt << "</table>\n"
1107
+ end
1108
+ end
1109
+
1110
+ class ErrorReport < GenericReport
1111
+
1112
+ def title
1113
+ "Errors"
1114
+ end
1115
+
1116
+ def applicable
1117
+ !@log.errors.empty?
1118
+ end
1119
+
1120
+ def text
1121
+ rpt = "######## #{title}\n"
1122
+ @log.errors.each {|x| rpt << "#{x.error} : #{x.text}\n" }
1123
+ rpt
1124
+ end
1125
+
1126
+ def html
1127
+ rpt = "<h3>#{title}</h3>\n"
1128
+ rpt << "<table><tr><th>Error</th><th>Offending query</th></tr>\n"
1129
+ @log.errors.each {|x|
1130
+ message = "<p>#{x.error}</p>" + (x.detail.size > 0 ? "<p>DETAIL : #{x.detail}</p>" : '') + \
1131
+ (x.hint.size > 0 ? "<p>HINT : #{x.hint}</p>" : '')
1132
+ rpt << "<tr><td>#{message}</td><td>#{colorize(x.text)}</td></tr>\n"
1133
+ }
1134
+ rpt << "</table>\n"
1135
+ end
1136
+ end
1137
+
1138
+ class QueriesByTypeReport < GenericReport
1139
+
1140
+ def title
1141
+ "Queries by type"
1142
+ end
1143
+
1144
+ def html
1145
+ sel,ins,upd,del=create_report
1146
+ rpt = "<h3>#{title}</h3>\n"
1147
+ rpt << "<table><tr><th>Type</th><th>Count</th><th>Percentage</th></tr>\n"
1148
+ rpt << "<tr><td>SELECT</td><td>#{sel}</td><td align=center>#{(pctg_of(sel, @log.queries.size)*100).to_i}</td></tr>\n" if sel > 0
1149
+ rpt << "<tr><td>INSERT</td><td>#{ins}</td><td align=center>#{(pctg_of(ins, @log.queries.size)*100).to_i}</td></tr>\n" if ins > 0
1150
+ rpt << "<tr><td>UPDATE</td><td>#{upd}</td><td align=center>#{(pctg_of(upd, @log.queries.size)*100).to_i}</td></tr>\n" if upd > 0
1151
+ rpt << "<tr><td>DELETE</td><td>#{del}</td><td align=center>#{(pctg_of(del, @log.queries.size)*100).to_i}</td></tr>\n" if del > 0
1152
+ rpt << "</table>\n"
1153
+ end
1154
+
1155
+ def text
1156
+ sel,ins,upd,del=create_report
1157
+ rpt = "######## #{title}\n"
1158
+ rpt << "SELECTs: #{sel.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(sel, @log.queries.size)*100).to_i}%)\n" if sel > 0
1159
+ rpt << "INSERTs: #{ins.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(ins, @log.queries.size)*100).to_i}%)\n" if ins > 0
1160
+ rpt << "UPDATEs: #{upd.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(upd, @log.queries.size)*100).to_i}%)\n" if upd > 0
1161
+ rpt << "DELETEs: #{del.to_s.ljust(sel.to_s.size + 1)} (#{(pctg_of(del, @log.queries.size)*100).to_i}%)\n" if del > 0
1162
+ rpt
1163
+ end
1164
+
1165
+ def create_report
1166
+ [@log.queries.find_all {|q| q.is_select}.size,
1167
+ @log.queries.find_all {|q| q.is_insert}.size,
1168
+ @log.queries.find_all {|q| q.is_update}.size,
1169
+ @log.queries.find_all {|q| q.is_delete}.size]
1170
+ end
1171
+ end
1172
+
1173
+ PQA.run if __FILE__ == $0