pgdexter 0.3.2 → 0.3.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
-   metadata.gz: 714767248afe28ad9e354ebb6485b34a8d6aadd0
-   data.tar.gz: 545eaaea1df0312049f4cecff3fe9b03b80a6cd9
+ SHA256:
+   metadata.gz: a4245b0c02667e411a915ef4c19ef472925048180d5882bc7b63b90037cb9d45
+   data.tar.gz: '0887ae99de87c186c1bfc52edbdbb59546df61de93bfbccd0974dff5c15b80f0'
  SHA512:
-   metadata.gz: '096684887c5d5a48a2df74a2dae6cf70bc2b5e98dd5dc6d7d6d8b583bfb76193ce86166972d9553f89452f16629a8576a30f2627fd8d0b16a9672ce76a316835'
-   data.tar.gz: d128734845414672a3f470f138c6a422dc2b24024a66866e50524fe3bb3e949bd4a899305a93f71fb14af3b56820f66e20a00d5ba5788cd0c88bd0afaa064e32
+   metadata.gz: 45ec7b462d925c2e5aa313734d1109b59e65e743c28e3471ef604d430037949675c615ab9004b98f8b055b0a575cdc169470d89e64694ef39cdd2bee60674548
+   data.tar.gz: 8d33afbd03ba250ae7dd4dfe6c1947e8860fece0925b58a098fb7a54f6f8fe76aaa3bc93f663b10282ef1a9b3bb2c059ac26014f9a3cff3798ece22dbb8d62be
data/CHANGELOG.md CHANGED
@@ -1,16 +1,42 @@
1
- ## 0.3.2
1
+ ## 0.3.7 (2020-07-10)
2
+
3
+ - Fixed help output
4
+
5
+ ## 0.3.6 (2020-03-30)
6
+
7
+ - Fixed warning with Ruby 2.7
8
+
9
+ ## 0.3.5 (2018-04-30)
10
+
11
+ - Added `sql` input format
12
+ - Fixed error for queries with double dash comments
13
+ - Fixed connection threading issue with `--pg-stat-activity` option
14
+
15
+ ## 0.3.4 (2018-04-09)
16
+
17
+ - Fixed `--username` option
18
+ - Fixed `JSON::NestingError`
19
+ - Added `--pg-stat-activity` option
20
+
21
+ ## 0.3.3 (2018-02-22)
22
+
23
+ - Added support for views and materialized views
24
+ - Better handle case when multiple indexes are found for a query
25
+ - Added `--min-cost-savings-pct` option
26
+
27
+ ## 0.3.2 (2018-01-04)
2
28
 
3
29
  - Fixed parsing issue with named prepared statements
4
30
  - Fixed parsing issue with multiline queries in csv format
5
31
  - Better explanations for indexing decisions
6
32
 
7
- ## 0.3.1
33
+ ## 0.3.1 (2017-12-28)
8
34
 
9
35
  - Added support for queries with bind variables
10
36
  - Fixed error with streaming logs as csv format
11
37
  - Handle malformed CSV gracefully
12
38
 
13
- ## 0.3.0
39
+ ## 0.3.0 (2017-12-22)
14
40
 
15
41
  - Added support for schemas
16
42
  - Added support for csv format
@@ -18,12 +44,12 @@
18
44
  - Added `--min-calls` option
19
45
  - Fixed debug output when indexes not found
20
46
 
21
- ## 0.2.1
47
+ ## 0.2.1 (2017-09-02)
22
48
 
23
49
  - Fixed bad suggestions
24
50
  - Improved debugging output
25
51
 
26
- ## 0.2.0
52
+ ## 0.2.0 (2017-08-27)
27
53
 
28
54
  - Added same connection options as `psql`
29
55
  - Added support for multiple files
@@ -34,38 +60,38 @@ Breaking
34
60
 
35
61
  - `-h` option changed to `--host` instead of `--help` for consistency with `psql`
36
62
 
37
- ## 0.1.6
63
+ ## 0.1.6 (2017-08-26)
38
64
 
39
65
  - Significant performance improvements
40
66
  - Added `--include` option
41
67
 
42
- ## 0.1.5
68
+ ## 0.1.5 (2017-08-14)
43
69
 
44
70
  - Added support for non-`SELECT` queries
45
71
  - Added `--pg-stat-statements` option
46
72
  - Added advisory locks
47
73
  - Added support for running as a non-superuser
48
74
 
49
- ## 0.1.4
75
+ ## 0.1.4 (2017-07-02)
50
76
 
51
77
  - Added support for multicolumn indexes
52
78
 
53
- ## 0.1.3
79
+ ## 0.1.3 (2017-06-30)
54
80
 
55
81
  - Fixed error with non-lowercase columns
56
82
  - Fixed error with `json` columns
57
83
 
58
- ## 0.1.2
84
+ ## 0.1.2 (2017-06-26)
59
85
 
60
86
  - Added `--exclude` option
61
87
  - Added `--log-sql` option
62
88
 
63
- ## 0.1.1
89
+ ## 0.1.1 (2017-06-25)
64
90
 
65
91
  - Added `--interval` option
66
92
  - Added `--min-time` option
67
93
  - Added `--log-level` option
68
94
 
69
- ## 0.1.0
95
+ ## 0.1.0 (2017-06-24)
70
96
 
71
97
  - Launched
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2017 Andrew Kane
+ Copyright (c) 2017-2020 Andrew Kane

  MIT License

data/README.md CHANGED
@@ -2,18 +2,18 @@
2
2
 
3
3
  The automatic indexer for Postgres
4
4
 
5
- [Read about how it works](https://medium.com/@ankane/introducing-dexter-the-automatic-indexer-for-postgres-5f8fa8b28f27)
5
+ [Read about how it works](https://ankane.org/introducing-dexter)
6
6
 
7
7
  [![Build Status](https://travis-ci.org/ankane/dexter.svg?branch=master)](https://travis-ci.org/ankane/dexter)
8
8
 
9
9
  ## Installation
10
10
 
11
- First, install [HypoPG](https://github.com/dalibo/hypopg) on your database server. This doesn’t require a restart.
11
+ First, install [HypoPG](https://github.com/HypoPG/hypopg) on your database server. This doesn’t require a restart.
12
12
 
13
13
  ```sh
14
14
  cd /tmp
15
- curl -L https://github.com/dalibo/hypopg/archive/1.1.0.tar.gz | tar xz
16
- cd hypopg-1.1.0
15
+ curl -L https://github.com/HypoPG/hypopg/archive/1.1.4.tar.gz | tar xz
16
+ cd hypopg-1.1.4
17
17
  make
18
18
  make install # may need sudo
19
19
  ```
@@ -104,12 +104,20 @@ or pass files:
104
104
  dexter <connection-options> <file1> <file2>
105
105
  ```
106
106
 
107
+ or collect running queries with:
108
+
109
+ ```sh
110
+ dexter <connection-options> --pg-stat-activity
111
+ ```
112
+
107
113
  or use the [pg_stat_statements](https://www.postgresql.org/docs/current/static/pgstatstatements.html) extension:
108
114
 
109
115
  ```sh
110
116
  dexter <connection-options> --pg-stat-statements
111
117
  ```
112
118
 
119
+ > Note: Logs or running queries are highly preferred over pg_stat_statements, as pg_stat_statements often doesn’t store enough information to optimize queries.
120
+
113
121
  ### Collection Options
114
122
 
115
123
  To prevent one-off queries from being indexed, specify a minimum number of calls before a query is considered for indexing
@@ -207,6 +215,10 @@ gem specific_install https://github.com/ankane/dexter.git
207
215
 
208
216
  This software wouldn’t be possible without [HypoPG](https://github.com/dalibo/hypopg), which allows you to create hypothetical indexes, and [pg_query](https://github.com/lfittl/pg_query), which allows you to parse and fingerprint queries. A big thanks to Dalibo and Lukas Fittl respectively.
209
217
 
218
+ ## Research
219
+
220
+ This is known as the Index Selection Problem (ISP).
221
+
210
222
  ## Contributing
211
223
 
212
224
  Everyone is encouraged to help improve this project. Here are a few ways you can help:
@@ -216,17 +228,18 @@ Everyone is encouraged to help improve this project. Here are a few ways you can
216
228
  - Write, clarify, or fix documentation
217
229
  - Suggest or add new features
218
230
 
219
- To get started, run:
231
+ To get started with development, run:
220
232
 
221
233
  ```sh
222
234
  git clone https://github.com/ankane/dexter.git
223
235
  cd dexter
224
- bundle
225
- rake install
236
+ bundle install
237
+ bundle exec rake install
226
238
  ```
227
239
 
228
240
  To run tests, use:
229
241
 
230
242
  ```sh
231
- rake test
243
+ createdb dexter_test
244
+ bundle exec rake test
232
245
  ```
data/exe/dexter CHANGED
@@ -1,10 +1,7 @@
  #!/usr/bin/env ruby

+ # handle interrupts
+ trap("SIGINT") { abort }
+
  require "dexter"
- begin
-   Dexter::Client.new(ARGV).perform
- rescue Dexter::Abort => e
-   abort e.message
- rescue Interrupt => e
-   # do nothing
- end
+ Dexter::Client.start
data/lib/dexter.rb CHANGED
@@ -1,16 +1,22 @@
- require "dexter/version"
- require "slop"
+ # dependencies
  require "pg"
  require "pg_query"
- require "time"
+ require "slop"
+
+ # stdlib
  require "set"
- require "thread"
+ require "time"
+
+ # modules
+ require "dexter/version"
  require "dexter/logging"
  require "dexter/client"
  require "dexter/collector"
  require "dexter/indexer"
  require "dexter/log_parser"
  require "dexter/csv_log_parser"
+ require "dexter/pg_stat_activity_parser"
+ require "dexter/sql_log_parser"
  require "dexter/processor"
  require "dexter/query"

data/lib/dexter/client.rb CHANGED
@@ -4,6 +4,12 @@ module Dexter
4
4
 
5
5
  attr_reader :arguments, :options
6
6
 
7
+ def self.start
8
+ Dexter::Client.new(ARGV).perform
9
+ rescue Dexter::Abort => e
10
+ abort e.message
11
+ end
12
+
7
13
  def initialize(args)
8
14
  @arguments, @options = parse_args(args)
9
15
  end
@@ -18,6 +24,8 @@ module Dexter
18
24
  elsif options[:pg_stat_statements]
19
25
  # TODO support streaming option
20
26
  Indexer.new(options).process_stat_statements
27
+ elsif options[:pg_stat_activity]
28
+ Processor.new(:pg_stat_activity, options).perform
21
29
  elsif arguments.any?
22
30
  ARGV.replace(arguments)
23
31
  Processor.new(ARGF, options).perform
@@ -29,9 +37,9 @@ module Dexter
29
37
  def parse_args(args)
30
38
  opts = Slop.parse(args) do |o|
31
39
  o.banner = %(Usage:
32
- dexter [options]
33
-
34
- Options:)
40
+ dexter [options])
41
+ o.separator ""
42
+ o.separator "Options:"
35
43
  o.boolean "--analyze", "analyze tables that haven't been analyzed in the past hour", default: false
36
44
  o.boolean "--create", "create indexes", default: false
37
45
  o.array "--exclude", "prevent specific tables from being indexed"
@@ -43,11 +51,10 @@ Options:)
43
51
  o.boolean "--log-sql", "log sql", default: false
44
52
  o.float "--min-calls", "only process queries that have been called a certain number of times", default: 0
45
53
  o.float "--min-time", "only process queries that have consumed a certain amount of DB time, in minutes", default: 0
54
+ o.integer "--min-cost-savings-pct", default: 50, help: false
55
+ o.boolean "--pg-stat-activity", "use pg_stat_activity", default: false, help: false
46
56
  o.boolean "--pg-stat-statements", "use pg_stat_statements", default: false, help: false
47
57
  o.string "-s", "--statement", "process a single statement"
48
- # separator must go here to show up correctly - slop bug?
49
- o.separator ""
50
- o.separator "Connection options:"
51
58
  o.on "-v", "--version", "print the version" do
52
59
  log Dexter::VERSION
53
60
  exit
@@ -56,10 +63,12 @@ Options:)
56
63
  log o
57
64
  exit
58
65
  end
59
- o.string "-U", "--username"
60
- o.string "-d", "--dbname"
61
- o.string "-h", "--host"
62
- o.integer "-p", "--port"
66
+ o.separator ""
67
+ o.separator "Connection options:"
68
+ o.string "-d", "--dbname", "database name"
69
+ o.string "-h", "--host", "database host"
70
+ o.integer "-p", "--port", "database port"
71
+ o.string "-U", "--username", "database user"
63
72
  end
64
73
 
65
74
  arguments = opts.arguments
data/lib/dexter/indexer.rb CHANGED
@@ -12,7 +12,9 @@ module Dexter
12
12
  @min_time = options[:min_time] || 0
13
13
  @min_calls = options[:min_calls] || 0
14
14
  @analyze = options[:analyze]
15
+ @min_cost_savings_pct = options[:min_cost_savings_pct].to_i
15
16
  @options = options
17
+ @mutex = Mutex.new
16
18
 
17
19
  create_extension unless extension_exists?
18
20
  execute("SET lock_timeout = '5s'")
@@ -24,11 +26,28 @@ module Dexter
24
26
  process_queries(queries)
25
27
  end
26
28
 
29
+ def stat_activity
30
+ execute <<-SQL
31
+ SELECT
32
+ pid || ':' || COALESCE(query_start, xact_start) AS id,
33
+ query,
34
+ EXTRACT(EPOCH FROM NOW() - COALESCE(query_start, xact_start)) * 1000.0 AS duration_ms
35
+ FROM
36
+ pg_stat_activity
37
+ WHERE
38
+ datname = current_database()
39
+ AND state = 'active'
40
+ AND pid != pg_backend_pid()
41
+ ORDER BY
42
+ 1
43
+ SQL
44
+ end
45
+
27
46
  def process_queries(queries)
28
47
  # reset hypothetical indexes
29
48
  reset_hypothetical_indexes
30
49
 
31
- tables = Set.new(database_tables)
50
+ tables = Set.new(database_tables + materialized_views)
32
51
 
33
52
  # map tables without schema to schema
34
53
  no_schema_tables = {}
@@ -37,11 +56,28 @@ module Dexter
37
56
  no_schema_tables[group] = t2.sort_by { |t| [search_path_index[t.split(".")[0]] || 1000000, t] }[0]
38
57
  end
39
58
 
59
+ # add tables from views
60
+ view_tables = database_view_tables
61
+ view_tables.each do |v, vt|
62
+ view_tables[v] = vt.map { |t| no_schema_tables[t] || t }
63
+ end
64
+
65
+ # fully resolve tables
66
+ # make sure no views in result
67
+ view_tables.each do |v, vt|
68
+ view_tables[v] = vt.flat_map { |t| view_tables[t] || [t] }.uniq
69
+ end
70
+
40
71
  # filter queries from other databases and system tables
41
72
  queries.each do |query|
42
73
  # add schema to table if needed
43
74
  query.tables = query.tables.map { |t| no_schema_tables[t] || t }
44
75
 
76
+ # substitute view tables
77
+ new_tables = query.tables.flat_map { |t| view_tables[t] || [t] }.uniq
78
+ query.tables_from_views = new_tables - query.tables
79
+ query.tables = new_tables
80
+
45
81
  # check for missing tables
46
82
  query.missing_tables = !query.tables.all? { |t| tables.include?(t) }
47
83
  end
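Note on the view handling in this hunk: a minimal sketch of how views get replaced by their underlying tables. The view and table names are hypothetical, and the sketch works on a copy of the map rather than updating it in place as the hunk does, so it resolves only one level of view-on-view nesting.

```ruby
# Hypothetical map of view name => tables referenced in its definition.
view_tables = {
  "public.active_users" => ["public.users"],
  "public.active_orders" => ["public.active_users", "public.orders"]
}

# One expansion pass: replace any view inside a definition with the
# tables that view reads from.
resolved = view_tables.dup
view_tables.each do |view, tables|
  resolved[view] = tables.flat_map { |t| view_tables[t] || [t] }.uniq
end

# Substitute views referenced by a query with their underlying tables,
# and remember which tables were only reached through a view.
query_tables = ["public.active_orders"]
new_tables = query_tables.flat_map { |t| resolved[t] || [t] }.uniq
tables_from_views = new_tables - query_tables
```

Tracking `tables_from_views` separately matters later in the diff, where every column of such a table is considered for hypothetical indexes.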
@@ -71,13 +107,13 @@ module Dexter
71
107
  analyze_tables(tables) if tables.any? && (@analyze || @log_level == "debug2")
72
108
 
73
109
  # create hypothetical indexes and explain queries
74
- candidates = tables.any? ? create_hypothetical_indexes(queries.select(&:candidate_tables), tables) : {}
110
+ candidates = tables.any? ? create_hypothetical_indexes(queries.select(&:candidate_tables)) : {}
75
111
 
76
112
  # see if new indexes were used and meet bar
77
113
  new_indexes = determine_indexes(queries, candidates, tables)
78
114
 
79
115
  # display and create new indexes
80
- show_and_create_indexes(new_indexes, queries, tables)
116
+ show_and_create_indexes(new_indexes, queries)
81
117
  end
82
118
 
83
119
  private
@@ -146,9 +182,9 @@ module Dexter
146
182
  query.plans << plan(query.statement)
147
183
  if @log_explain
148
184
  # Pass format to prevent ANALYZE
149
- puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query.statement)}").map { |r| r["QUERY PLAN"] }.join("\n")
185
+ puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query.statement)}", pretty: false).map { |r| r["QUERY PLAN"] }.join("\n")
150
186
  end
151
- rescue PG::Error => e
187
+ rescue PG::Error, JSON::NestingError => e
152
188
  if @log_explain
153
189
  log e.message
154
190
  end
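Note on the new `JSON::NestingError` rescue: a minimal sketch of when Ruby's JSON parser raises it. The nested array is a made-up stand-in for a deeply nested plan, not real EXPLAIN output.

```ruby
require "json"

# Build a document nested deeper than the parser's default limit of 100.
deep = "[" * 150 + "]" * 150

default_result =
  begin
    JSON.parse(deep)
    :parsed
  rescue JSON::NestingError
    :nesting_error
  end

# Raising the limit lets the same document parse.
raised_limit = JSON.parse(deep, max_nesting: 1000)
```

This is consistent with the `max_nesting: 1000` argument added to `plan` further down in the same file.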
@@ -157,7 +193,7 @@ module Dexter
157
193
  end
158
194
  end
159
195
 
160
- def create_hypothetical_indexes(queries, tables)
196
+ def create_hypothetical_indexes(queries)
161
197
  candidates = {}
162
198
 
163
199
  # get initial costs for queries
@@ -166,6 +202,7 @@ module Dexter
166
202
 
167
203
  # filter tables for performance
168
204
  tables = Set.new(explainable_queries.flat_map(&:tables))
205
+ tables_from_views = Set.new(explainable_queries.flat_map(&:tables_from_views))
169
206
 
170
207
  if tables.any?
171
208
  # since every set of multi-column indexes are expensive
@@ -182,7 +219,8 @@ module Dexter
182
219
  end
183
220
 
184
221
  # create hypothetical indexes
185
- columns_by_table = columns(tables).select { |c| possible_columns.include?(c[:column]) }.group_by { |c| c[:table] }
222
+ # use all columns in tables from views
223
+ columns_by_table = columns(tables).select { |c| possible_columns.include?(c[:column]) || tables_from_views.include?(c[:table]) }.group_by { |c| c[:table] }
186
224
 
187
225
  # create single column indexes
188
226
  create_hypothetical_indexes_helper(columns_by_table, 1, candidates)
@@ -265,14 +303,16 @@ module Dexter
265
303
  end
266
304
  end
267
305
 
306
+ savings_ratio = (1 - @min_cost_savings_pct / 100.0)
307
+
268
308
  queries.each do |query|
269
309
  if query.explainable? && query.high_cost?
270
310
  new_cost, new_cost2 = query.costs[1..2]
271
311
 
272
- cost_savings = new_cost < query.initial_cost * 0.5
312
+ cost_savings = new_cost < query.initial_cost * savings_ratio
273
313
 
274
314
  # set high bar for multicolumn indexes
275
- cost_savings2 = new_cost > 100 && new_cost2 < new_cost * 0.5
315
+ cost_savings2 = new_cost > 100 && new_cost2 < new_cost * savings_ratio
276
316
 
277
317
  key = cost_savings2 ? 2 : 1
278
318
  query_indexes = hypo_indexes_from_plan(index_name_to_columns, query.plans[key], index_set)
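Note on `savings_ratio`: the hard-coded `0.5` becomes `1 - pct / 100.0`, so the default of 50 keeps the old behavior. A small sketch with made-up costs; `meets_savings_bar?` is a hypothetical helper, since the hunk inlines the comparison.

```ruby
# Hypothetical helper mirroring the comparison in the hunk.
def meets_savings_bar?(initial_cost, new_cost, min_cost_savings_pct)
  savings_ratio = 1 - min_cost_savings_pct / 100.0
  new_cost < initial_cost * savings_ratio
end

# With the default 50%, a drop from 1000 to 450 clears the bar (450 < 500).
default_bar = meets_savings_bar?(1000.0, 450.0, 50)

# With 60%, the same drop does not (450 is not below 400).
stricter_bar = meets_savings_bar?(1000.0, 450.0, 60)
```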
@@ -283,10 +323,56 @@ module Dexter
283
323
  cost_savings2 = false
284
324
  end
285
325
 
286
- # TODO if multiple indexes are found (for either single or multicolumn)
326
+ suggest_index = cost_savings || cost_savings2
327
+
328
+ cost_savings3 = false
329
+ new_cost3 = nil
330
+
331
+ # if multiple indexes are found (for either single or multicolumn)
287
332
  # determine the impact of each individually
288
- # for now, be conservative and don't suggest if more than one index
289
- suggest_index = (cost_savings || cost_savings2) && query_indexes.size == 1
333
+ # there may be a better single index that we're not considering
334
+ # that didn't get picked up by pass1 or pass2
335
+ # TODO clean this up
336
+ # TODO suggest more than one index from this if savings are there
337
+ if suggest_index && query_indexes.size > 1
338
+ winning_index = nil
339
+ winning_cost = nil
340
+ winning_plan = nil
341
+
342
+ query_indexes.each do |query_index|
343
+ reset_hypothetical_indexes
344
+ create_hypothetical_index(query_index[:table], query_index[:columns].map { |v| {column: v} })
345
+ plan3 = plan(query.statement)
346
+ cost3 = plan3["Total Cost"]
347
+
348
+ if !winning_cost || cost3 < winning_cost
349
+ winning_cost = cost3
350
+ winning_index = query_index
351
+ winning_plan = plan3
352
+ end
353
+ end
354
+
355
+ query.plans << winning_plan
356
+
357
+ # duplicated from above
358
+ # TODO DRY
359
+ use_winning =
360
+ if cost_savings2
361
+ new_cost > 100 && winning_cost < new_cost * savings_ratio
362
+ else
363
+ winning_cost < query.initial_cost * savings_ratio
364
+ end
365
+
366
+ query_indexes = [winning_index]
367
+ new_cost3 = winning_cost
368
+ query.pass3_indexes = query_indexes
369
+
370
+ if use_winning
371
+ cost_savings3 = true
372
+ else
373
+ suggest_index = false
374
+ end
375
+ end
290
376
 
291
377
  if suggest_index
292
378
  query_indexes.each do |index|
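Note on the third pass: when several hypothetical indexes show up in a plan, each is planned on its own and the cheapest wins, then the winner is checked against the savings bar again. A sketch of that selection with made-up costs; the real code obtains each cost by resetting hypothetical indexes and re-running EXPLAIN.

```ruby
# Hypothetical candidates, each paired with its cost when planned alone.
candidates = [
  [{ table: "public.orders", columns: ["user_id"] }, 320.0],
  [{ table: "public.orders", columns: ["status"] }, 780.0]
]

winning_index = nil
winning_cost = nil

candidates.each do |index, cost|
  if !winning_cost || cost < winning_cost
    winning_cost = cost
    winning_index = index
  end
end

# Re-check the winner against the savings bar (default 50%).
initial_cost = 1000.0
savings_ratio = 1 - 50 / 100.0
use_winning = winning_cost < initial_cost * savings_ratio
```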
@@ -299,7 +385,7 @@ module Dexter
299
385
  query.suggest_index = suggest_index
300
386
  query.new_cost =
301
387
  if suggest_index
302
- cost_savings2 ? new_cost2 : new_cost
388
+ cost_savings3 ? new_cost3 : (cost_savings2 ? new_cost2 : new_cost)
303
389
  else
304
390
  query.initial_cost
305
391
  end
@@ -331,7 +417,7 @@ module Dexter
331
417
  end
332
418
  end
333
419
 
334
- def show_and_create_indexes(new_indexes, queries, tables)
420
+ def show_and_create_indexes(new_indexes, queries)
335
421
  # print summary
336
422
  if new_indexes.any?
337
423
  new_indexes.each do |index|
@@ -368,9 +454,12 @@ module Dexter
368
454
  log "Start: #{query.costs[0]}"
369
455
  log "Pass1: #{query.costs[1]} : #{log_indexes(query.pass1_indexes || [])}"
370
456
  log "Pass2: #{query.costs[2]} : #{log_indexes(query.pass2_indexes || [])}"
457
+ if query.costs[3]
458
+ log "Pass3: #{query.costs[3]} : #{log_indexes(query.pass3_indexes || [])}"
459
+ end
371
460
  log "Final: #{query.new_cost} : #{log_indexes(query.suggest_index ? query_indexes : [])}"
372
- if query_indexes.size == 1 && !query.suggest_index
373
- log "Need 50% cost savings to suggest index"
461
+ if (query.pass1_indexes.any? || query.pass2_indexes.any?) && !query.suggest_index
462
+ log "Need #{@min_cost_savings_pct}% cost savings to suggest index"
374
463
  end
375
464
  else
376
465
  log "Could not run explain"
@@ -409,6 +498,9 @@ module Dexter
409
498
 
410
499
  def conn
411
500
  @conn ||= begin
501
+ # set connect timeout if none set
502
+ ENV["PGCONNECT_TIMEOUT"] ||= "2"
503
+
412
504
  if @options[:dbname] =~ /\Apostgres(ql)?:\/\//
413
505
  config = @options[:dbname]
414
506
  else
@@ -416,7 +508,7 @@ module Dexter
416
508
  host: @options[:host],
417
509
  port: @options[:port],
418
510
  dbname: @options[:dbname],
419
- user: @options[:user]
511
+ user: @options[:username]
420
512
  }.reject { |_, value| value.to_s.empty? }
421
513
  config = config[:dbname] if config.keys == [:dbname] && config[:dbname].include?("=")
422
514
  end
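Note on the `--username` fix: a sketch of why reading `:user` dropped the value. It assumes the option parser stores `--username` under the key `:username`, which matches how the other options in this diff are read; the options hash and the `connection_config` helper are hypothetical.

```ruby
# Hypothetical parsed options; note there is no :user key.
options = { host: "localhost", port: nil, dbname: "app_production", username: "deploy" }

# Build the connection hash the way the hunk does, dropping empty values.
def connection_config(options, user_key)
  {
    host: options[:host],
    port: options[:port],
    dbname: options[:dbname],
    user: options[user_key]
  }.reject { |_, value| value.to_s.empty? }
end

before_fix = connection_config(options, :user)     # user silently dropped
after_fix = connection_config(options, :username)  # user passed through
```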
@@ -426,7 +518,7 @@ module Dexter
426
518
  abort e.message
427
519
  end
428
520
 
429
- def execute(query)
521
+ def execute(query, pretty: true)
430
522
  # use exec_params instead of exec for security
431
523
  #
432
524
  # Unlike PQexec, PQexecParams allows at most one SQL command in the given string.
@@ -434,14 +526,17 @@ module Dexter
434
526
  # This is a limitation of the underlying protocol, but has some usefulness
435
527
  # as an extra defense against SQL-injection attacks.
436
528
  # https://www.postgresql.org/docs/current/static/libpq-exec.html
437
- query = squish(query)
529
+ query = squish(query) if pretty
438
530
  log "SQL: #{query}" if @log_sql
439
- conn.exec_params(query, []).to_a
531
+
532
+ @mutex.synchronize do
533
+ conn.exec_params(query, []).to_a
534
+ end
440
535
  end
441
536
 
442
537
  def plan(query)
443
538
  # strip semi-colons as another measure of defense
444
- JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}").first["QUERY PLAN"]).first["Plan"]
539
+ JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false).first["QUERY PLAN"], max_nesting: 1000).first["Plan"]
445
540
  end
446
541
 
447
542
  # TODO for multicolumn indexes, use ordering
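Note on the mutex around `exec_params`: with `--pg-stat-activity`, a polling thread and the indexing thread can share one connection. A sketch of the pattern using a stand-in object; `FakeConnection` is hypothetical and only records whether two calls ever overlap.

```ruby
# Stand-in for a database connection that is not safe to use concurrently.
class FakeConnection
  attr_reader :overlaps

  def initialize
    @busy = false
    @overlaps = 0
  end

  def exec_params(query, _params)
    @overlaps += 1 if @busy
    @busy = true
    sleep(0.001) # give other threads a chance to interleave
    @busy = false
    [query]
  end
end

conn = FakeConnection.new
mutex = Mutex.new

# Several threads share the connection; the mutex serializes access.
threads = 4.times.map do |i|
  Thread.new do
    5.times { mutex.synchronize { conn.exec_params("SELECT #{i}", []) } }
  end
end
threads.each(&:join)
```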
@@ -449,11 +544,15 @@ module Dexter
449
544
  columns_by_table.each do |table, cols|
450
545
  # no reason to use btree index for json columns
451
546
  cols.reject { |c| ["json", "jsonb"].include?(c[:type]) }.permutation(n) do |col_set|
452
- candidates[col_set] = execute("SELECT * FROM hypopg_create_index('CREATE INDEX ON #{quote_ident(table)} (#{col_set.map { |c| quote_ident(c[:column]) }.join(", ")})')").first["indexname"]
547
+ candidates[col_set] = create_hypothetical_index(table, col_set)
453
548
  end
454
549
  end
455
550
  end
456
551
 
552
+ def create_hypothetical_index(table, col_set)
553
+ execute("SELECT * FROM hypopg_create_index('CREATE INDEX ON #{quote_ident(table)} (#{col_set.map { |c| quote_ident(c[:column]) }.join(", ")})')").first["indexname"]
554
+ end
555
+
457
556
  def database_tables
458
557
  result = execute <<-SQL
459
558
  SELECT
@@ -466,6 +565,43 @@ module Dexter
466
565
  result.map { |r| r["table_name"] }
467
566
  end
468
567
 
568
+ def materialized_views
569
+ if server_version_num >= 90300
570
+ result = execute <<-SQL
571
+ SELECT
572
+ schemaname || '.' || matviewname AS table_name
573
+ FROM
574
+ pg_matviews
575
+ SQL
576
+ result.map { |r| r["table_name"] }
577
+ else
578
+ []
579
+ end
580
+ end
581
+
582
+ def server_version_num
583
+ execute("SHOW server_version_num").first["server_version_num"].to_i
584
+ end
585
+
586
+ def database_view_tables
587
+ result = execute <<-SQL
588
+ SELECT
589
+ schemaname || '.' || viewname AS table_name,
590
+ definition
591
+ FROM
592
+ pg_views
593
+ WHERE
594
+ schemaname NOT IN ('information_schema', 'pg_catalog')
595
+ SQL
596
+
597
+ view_tables = {}
598
+ result.each do |row|
599
+ view_tables[row["table_name"]] = PgQuery.parse(row["definition"]).tables
600
+ end
601
+
602
+ view_tables
603
+ end
604
+
469
605
  def stat_statements
470
606
  result = execute <<-SQL
471
607
  SELECT
@@ -515,13 +651,15 @@ module Dexter
515
651
  def columns(tables)
516
652
  columns = execute <<-SQL
517
653
  SELECT
518
- table_schema || '.' || table_name AS table_name,
519
- column_name,
520
- data_type
521
- FROM
522
- information_schema.columns
523
- WHERE
524
- table_schema || '.' || table_name IN (#{tables.map { |t| quote(t) }.join(", ")})
654
+ s.nspname || '.' || t.relname AS table_name,
655
+ a.attname AS column_name,
656
+ pg_catalog.format_type(a.atttypid, a.atttypmod) AS data_type
657
+ FROM pg_attribute a
658
+ JOIN pg_class t on a.attrelid = t.oid
659
+ JOIN pg_namespace s on t.relnamespace = s.oid
660
+ WHERE a.attnum > 0
661
+ AND NOT a.attisdropped
662
+ AND s.nspname || '.' || t.relname IN (#{tables.map { |t| quote(t) }.join(", ")})
525
663
  ORDER BY
526
664
  1, 2
527
665
  SQL
data/lib/dexter/pg_stat_activity_parser.rb ADDED
@@ -0,0 +1,25 @@
+ module Dexter
+   class PgStatActivityParser < LogParser
+     def perform
+       queries = {}
+
+       loop do
+         new_queries = {}
+         @logfile.stat_activity.each do |row|
+           new_queries[row["id"]] = row
+         end
+
+         # store queries after they complete
+         queries.each do |id, row|
+           unless new_queries[id]
+             process_entry(row["query"], row["duration_ms"].to_f)
+           end
+         end
+
+         queries = new_queries
+
+         sleep(1)
+       end
+     end
+   end
+ end
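Note on the polling loop: a query counts as complete once it appears in one snapshot and is missing from the next. A sketch of that diffing with made-up snapshots in place of live `pg_stat_activity` rows and without the one-second sleep.

```ruby
# Hypothetical successive snapshots, keyed by "pid:query_start" style ids.
snapshots = [
  [{ "id" => "101:t1", "query" => "SELECT 1", "duration_ms" => "5.0" }],
  [{ "id" => "101:t1", "query" => "SELECT 1", "duration_ms" => "1005.0" },
   { "id" => "102:t2", "query" => "SELECT 2", "duration_ms" => "3.0" }],
  [{ "id" => "102:t2", "query" => "SELECT 2", "duration_ms" => "1003.0" }],
  []
]

completed = []
queries = {}

snapshots.each do |rows|
  new_queries = {}
  rows.each { |row| new_queries[row["id"]] = row }

  # a query seen in the previous poll but not in this one has finished
  queries.each do |id, row|
    completed << [row["query"], row["duration_ms"].to_f] unless new_queries[id]
  end

  queries = new_queries
end
```

The recorded duration is the last one observed, so it can undercount by up to one polling interval.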
data/lib/dexter/processor.rb CHANGED
@@ -6,15 +6,19 @@ module Dexter
6
6
  @logfile = logfile
7
7
 
8
8
  @collector = Collector.new(min_time: options[:min_time], min_calls: options[:min_calls])
9
+ @indexer = Indexer.new(options)
10
+
9
11
  @log_parser =
10
- if options[:input_format] == "csv"
12
+ if @logfile == :pg_stat_activity
13
+ PgStatActivityParser.new(@indexer, @collector)
14
+ elsif options[:input_format] == "csv"
11
15
  CsvLogParser.new(logfile, @collector)
16
+ elsif options[:input_format] == "sql"
17
+ SqlLogParser.new(logfile, @collector)
12
18
  else
13
19
  LogParser.new(logfile, @collector)
14
20
  end
15
21
 
16
- @indexer = Indexer.new(options)
17
-
18
22
  @starting_interval = 3
19
23
  @interval = options[:interval]
20
24
 
@@ -25,7 +29,7 @@ module Dexter
25
29
  end
26
30
 
27
31
  def perform
28
- if @logfile == STDIN
32
+ if [STDIN, :pg_stat_activity].include?(@logfile)
29
33
  Thread.abort_on_exception = true
30
34
  Thread.new do
31
35
  sleep(@starting_interval)
data/lib/dexter/query.rb CHANGED
@@ -2,7 +2,7 @@ module Dexter
2
2
  class Query
3
3
  attr_reader :statement, :fingerprint, :plans
4
4
  attr_writer :tables
5
- attr_accessor :missing_tables, :new_cost, :total_time, :calls, :indexes, :suggest_index, :pass1_indexes, :pass2_indexes, :candidate_tables
5
+ attr_accessor :missing_tables, :new_cost, :total_time, :calls, :indexes, :suggest_index, :pass1_indexes, :pass2_indexes, :pass3_indexes, :candidate_tables, :tables_from_views
6
6
 
7
7
  def initialize(statement, fingerprint = nil)
8
8
  @statement = statement
@@ -11,6 +11,7 @@ module Dexter
11
11
  end
12
12
  @fingerprint = fingerprint
13
13
  @plans = []
14
+ @tables_from_views = []
14
15
  end
15
16
 
16
17
  def tables
data/lib/dexter/sql_log_parser.rb ADDED
@@ -0,0 +1,10 @@
+ module Dexter
+   class SqlLogParser < LogParser
+     def perform
+       # TODO support streaming
+       @logfile.read.split(";").each do |statement|
+         process_entry(statement, 0)
+       end
+     end
+   end
+ end
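Note on the `sql` input format: statements are separated with a plain `split(";")`. A sketch of what that yields for a made-up input, using `StringIO` as a stand-in for the log file; it also shows that a semicolon inside a string literal splits a statement in two.

```ruby
require "stringio"

# Hypothetical input file contents.
logfile = StringIO.new(
  "SELECT * FROM users WHERE id = 1; SELECT * FROM posts WHERE title = 'a;b';"
)

# Same splitting as in the parser above; trailing empty pieces are dropped.
statements = logfile.read.split(";")
```

Here `statements` has three pieces rather than two, because the quoted `;` is treated as a separator.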
data/lib/dexter/version.rb CHANGED
@@ -1,3 +1,3 @@
  module Dexter
-   VERSION = "0.3.2"
+   VERSION = "0.3.7"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pgdexter
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.2
4
+ version: 0.3.7
5
5
  platform: ruby
6
6
  authors:
7
7
  - Andrew Kane
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2018-01-05 00:00:00.000000000 Z
11
+ date: 2020-07-10 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: slop
@@ -16,28 +16,28 @@ dependencies:
16
16
  requirements:
17
17
  - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: 4.2.0
19
+ version: 4.8.2
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: 4.2.0
26
+ version: 4.8.2
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: pg
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
31
  - - ">="
32
32
  - !ruby/object:Gem::Version
33
- version: '0'
33
+ version: 0.18.2
34
34
  type: :runtime
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
38
  - - ">="
39
39
  - !ruby/object:Gem::Version
40
- version: '0'
40
+ version: 0.18.2
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: pg_query
43
43
  requirement: !ruby/object:Gem::Requirement
@@ -95,23 +95,16 @@ dependencies:
95
95
  - !ruby/object:Gem::Version
96
96
  version: '0'
97
97
  description:
98
- email:
99
- - andrew@chartkick.com
98
+ email: andrew@chartkick.com
100
99
  executables:
101
100
  - dexter
102
101
  extensions: []
103
102
  extra_rdoc_files: []
104
103
  files:
105
- - ".gitignore"
106
- - ".travis.yml"
107
104
  - CHANGELOG.md
108
- - Gemfile
109
105
  - LICENSE.txt
110
106
  - README.md
111
- - Rakefile
112
107
  - exe/dexter
113
- - guides/Hosted-Postgres.md
114
- - guides/Linux.md
115
108
  - lib/dexter.rb
116
109
  - lib/dexter/client.rb
117
110
  - lib/dexter/collector.rb
@@ -119,12 +112,14 @@ files:
119
112
  - lib/dexter/indexer.rb
120
113
  - lib/dexter/log_parser.rb
121
114
  - lib/dexter/logging.rb
115
+ - lib/dexter/pg_stat_activity_parser.rb
122
116
  - lib/dexter/processor.rb
123
117
  - lib/dexter/query.rb
118
+ - lib/dexter/sql_log_parser.rb
124
119
  - lib/dexter/version.rb
125
- - pgdexter.gemspec
126
120
  homepage: https://github.com/ankane/dexter
127
- licenses: []
121
+ licenses:
122
+ - MIT
128
123
  metadata: {}
129
124
  post_install_message:
130
125
  rdoc_options: []
@@ -134,15 +129,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
134
129
  requirements:
135
130
  - - ">="
136
131
  - !ruby/object:Gem::Version
137
- version: '0'
132
+ version: '2.2'
138
133
  required_rubygems_version: !ruby/object:Gem::Requirement
139
134
  requirements:
140
135
  - - ">="
141
136
  - !ruby/object:Gem::Version
142
137
  version: '0'
143
138
  requirements: []
144
- rubyforge_project:
145
- rubygems_version: 2.6.13
139
+ rubygems_version: 3.1.2
146
140
  signing_key:
147
141
  specification_version: 4
148
142
  summary: The automatic indexer for Postgres
data/.gitignore DELETED
@@ -1,9 +0,0 @@
1
- /.bundle/
2
- /.yardoc
3
- /Gemfile.lock
4
- /_yardoc/
5
- /coverage/
6
- /doc/
7
- /pkg/
8
- /spec/reports/
9
- /tmp/
data/.travis.yml DELETED
@@ -1,18 +0,0 @@
1
- language: ruby
2
- rvm: 2.4.1
3
- cache: bundler
4
- script: bundle exec rake test
5
- addons:
6
- postgresql: "9.6"
7
- before_script:
8
- - sudo apt-get install postgresql-server-dev-9.6
9
- - wget https://github.com/dalibo/hypopg/archive/1.0.0.tar.gz
10
- - tar xf 1.0.0.tar.gz
11
- - cd hypopg-1.0.0
12
- - make
13
- - sudo make install
14
- - psql -c 'create database dexter_test;' -U postgres
15
- notifications:
16
- email:
17
- on_success: never
18
- on_failure: change
data/Gemfile DELETED
@@ -1,4 +0,0 @@
1
- source "https://rubygems.org"
2
-
3
- # Specify your gem's dependencies in dexter.gemspec
4
- gemspec
data/Rakefile DELETED
@@ -1,11 +0,0 @@
1
- require "bundler/gem_tasks"
2
- require "rake/testtask"
3
-
4
- Rake::TestTask.new(:test) do |t|
5
- t.libs << "test"
6
- t.libs << "lib"
7
- t.test_files = FileList["test/**/*_test.rb"]
8
- t.warning = false
9
- end
10
-
11
- task default: :test
data/guides/Hosted-Postgres.md DELETED
@@ -1,102 +0,0 @@
1
- # Hosted Postgres
2
-
3
- Some hosted providers like Amazon RDS and Heroku do not support the HypoPG extension, which Dexter needs to run. Hopefully this will change with time. For now, we can spin up a separate database instance to run Dexter. It’s not super convenient, but can be useful to do from time to time.
4
-
5
- ### Install Postgres and Ruby
6
-
7
- Linux
8
-
9
- ```sh
10
- sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
11
- sudo apt-get install -y wget ca-certificates
12
- wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
13
- sudo apt-get update
14
- sudo apt-get install -y postgresql-9.6 postgresql-server-dev-9.6
15
- sudo -u postgres createuser $(whoami) -s
16
- sudo apt-get install -y ruby2.2 ruby2.2-dev
17
- ```
18
-
19
- Mac
20
-
21
- ```sh
22
- brew install postgresql
23
- brew install ruby
24
- ```
25
-
26
- ### Install HypoPG and Dexter
27
-
28
- HypoPG
29
-
30
- ```sh
31
- cd /tmp
32
- curl -L https://github.com/dalibo/hypopg/archive/1.0.0.tar.gz | tar xz
33
- cd hypopg-1.0.0
34
- make
35
- make install # may need sudo
36
- ```
37
-
38
- Dexter
39
-
40
- ```sh
41
- gem install pgdexter # may need sudo
42
- ```
43
-
44
- ### Download logs
45
-
46
- #### Amazon RDS
47
-
48
- Create an IAM user with the policy below:
49
-
50
- ```
51
- {
52
- "Statement": [
53
- {
54
- "Action": [
55
- "rds:DescribeDBLogFiles",
56
- "rds:DownloadDBLogFilePortion"
57
- ],
58
- "Effect": "Allow",
59
- "Resource": "*"
60
- }
61
- ]
62
- }
63
- ```
64
-
65
- And run:
66
-
67
- ```sh
68
- aws configure
69
- gem install pghero_logs # may need sudo
70
- pghero_logs download <instance-id>
71
- ```
72
-
73
- #### Heroku
74
-
75
- Production-tier databases only
76
-
77
- ```sh
78
- heroku logs -p postgres > postgresql.log
79
- ```
80
-
81
- ### Dump and restore
82
-
83
- We recommend creating a new instance from a snapshot for the dump to avoid affecting customers.
84
-
85
- ```sh
86
- pg_dump -v -j 8 -Fd -f /tmp/newout.dir <connection-options>
87
- ```
88
-
89
- Then shutdown the dump instance. Restore with:
90
-
91
- ```sh
92
- createdb dexter_restore
93
- pg_restore -v -j 8 -x -O --format=d -d dexter_restore /tmp/newout.dir/
94
- ```
95
-
96
- ### Run Dexter
97
-
98
- ```sh
99
- dexter dexter_restore postgresql.log* --analyze
100
- ```
101
-
102
- :tada:
data/guides/Linux.md DELETED
@@ -1,59 +0,0 @@
1
- # Linux Packages
2
-
3
- Distributions
4
-
5
- - [Ubuntu 16.04 (Xenial)](#ubuntu-1604-xenial)
6
- - [Ubuntu 14.04 (Trusty)](#ubuntu-1404-trusty)
7
- - [Debian 8 (Jesse)](#debian-8-jesse)
8
- - [CentOS / RHEL 7](#centos--rhel-7)
9
- - [SUSE Linux Enterprise Server 12](#suse-linux-enterprise-server-12)
10
-
11
- ### Ubuntu 16.04 (Xenial)
12
-
13
- ```sh
14
- wget -qO- https://dl.packager.io/srv/pghero/dexter/key | sudo apt-key add -
15
- sudo wget -O /etc/apt/sources.list.d/dexter.list \
16
- https://dl.packager.io/srv/pghero/dexter/master/installer/ubuntu/16.04.repo
17
- sudo apt-get update
18
- sudo apt-get -y install dexter
19
- ```
20
-
21
- ### Ubuntu 14.04 (Trusty)
22
-
23
- ```sh
24
- wget -qO- https://dl.packager.io/srv/pghero/dexter/key | sudo apt-key add -
25
- sudo wget -O /etc/apt/sources.list.d/dexter.list \
26
- https://dl.packager.io/srv/pghero/dexter/master/installer/ubuntu/14.04.repo
27
- sudo apt-get update
28
- sudo apt-get install dexter
29
- ```
30
-
31
- ### Debian 8 (Jesse)
32
-
33
- ```sh
34
- wget -qO- https://dl.packager.io/srv/pghero/dexter/key | sudo apt-key add -
35
- sudo wget -O /etc/apt/sources.list.d/dexter.list \
36
- https://dl.packager.io/srv/pghero/dexter/master/installer/debian/8.repo
37
- sudo apt-get update
38
- sudo apt-get install dexter
39
- ```
40
-
41
- ### CentOS / RHEL 7
42
-
43
- ```sh
44
- sudo wget -O /etc/yum.repos.d/dexter.repo \
45
- https://dl.packager.io/srv/pghero/dexter/master/installer/el/7.repo
46
- sudo yum install dexter
47
- ```
48
-
49
- ### SUSE Linux Enterprise Server 12
50
-
51
- ```sh
52
- sudo wget -O /etc/zypp/repos.d/dexter.repo \
53
- https://dl.packager.io/srv/pghero/dexter/master/installer/sles/12.repo
54
- sudo zypper install dexter
55
- ```
56
-
57
- ## Credits
58
-
59
- :heart: Made possible by [Packager](https://packager.io/)
data/pgdexter.gemspec DELETED
@@ -1,30 +0,0 @@
1
- # coding: utf-8
2
-
3
- lib = File.expand_path("../lib", __FILE__)
4
- $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
5
- require "dexter/version"
6
-
7
- Gem::Specification.new do |spec|
8
- spec.name = "pgdexter"
9
- spec.version = Dexter::VERSION
10
- spec.authors = ["Andrew Kane"]
11
- spec.email = ["andrew@chartkick.com"]
12
-
13
- spec.summary = "The automatic indexer for Postgres"
14
- spec.homepage = "https://github.com/ankane/dexter"
15
-
16
- spec.files = `git ls-files -z`.split("\x0").reject do |f|
17
- f.match(%r{^(test|spec|features)/})
18
- end
19
- spec.bindir = "exe"
20
- spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
- spec.require_paths = ["lib"]
22
-
23
- spec.add_dependency "slop", ">= 4.2.0"
24
- spec.add_dependency "pg"
25
- spec.add_dependency "pg_query"
26
-
27
- spec.add_development_dependency "bundler"
28
- spec.add_development_dependency "rake"
29
- spec.add_development_dependency "minitest"
30
- end