pgdexter 0.4.2 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 8e8097505929a24e8c038c7fe69ac667faaa05008a3aa94971501fc4b9f15157
- data.tar.gz: a08bf061493ed984103ce768af1347d81250c51b4f068f69d3688a1e4fd25514
+ metadata.gz: 65822d0d98c9641efdc3146295e098e09e83348ed109b856670a03d74ed2d70b
+ data.tar.gz: 6d16c9019172e5e69df056ac358588151fa52ae9e8256159547e1842e2f9b97c
  SHA512:
- metadata.gz: 9c2d99ecbd5fd68e460f25982fe31d1149d9682a7065cfb674d24785f78138ab04d94bcdbf3b81f2d8065468f6531bbaea1e0593bd8fd7ab4dcdb2b061276c1e
- data.tar.gz: 0e55cc589470760d317e3ba0e895d7ce5e24cd79517cf02c59aaba1caf36eb1f9785725eedc61e7402d480a61870828aaf2b33683f8ca58223f380d518c7df9c
+ metadata.gz: 4991adea5ee65493ea99abe94c19360fc6cc718048784431409abc08fbaf1b1efe3b304dedbd0994d8b66b38294b41ea6400bf5de1f03f09694723a7b709e77c
+ data.tar.gz: 00f47a3efd2de6565dd5f5a3a64caa4b610e282d16620975376876501130a6d0cdbbef9412254b8f3d09db2df7a1f3f62d3a8e581c4fcfecd27ea3224cb6f20f
data/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ## 0.5.0 (2023-04-18)
+
+ - Added support for normalized queries
+ - Added `--stdin` option (now required to read from stdin)
+ - Added `--enable-hypopg` option (now required to enable HypoPG)
+ - Improved output when HypoPG not installed
+ - Changed `--pg-stat-activity` to sample 10 times and exit
+ - Detect input format based on file extension
+ - Dropped support for experimental `--log-table` option
+ - Dropped support for Linux packages for Ubuntu 18.04 and Debian 10
+ - Dropped support for Ruby < 2.7
+ - Dropped support for Postgres < 11
+
+ ## 0.4.3 (2023-03-26)
+
+ - Added experimental `--log-table` option
+ - Improved help
+ - Require pg_query < 4
+
  ## 0.4.2 (2023-01-29)
 
  - Fixed `--pg-stat-statements` option for Postgres 13+
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2017-2021 Andrew Kane
+ Copyright (c) 2017-2023 Andrew Kane
 
  MIT License
 
data/README.md CHANGED
@@ -18,15 +18,15 @@ make
18
18
  make install # may need sudo
19
19
  ```
20
20
 
21
- > Note: If you have issues, make sure `postgresql-server-dev-*` is installed.
21
+ And enable it in databases where you want to use Dexter:
22
22
 
23
- Enable logging for slow queries in your Postgres config file.
24
-
25
- ```ini
26
- log_min_duration_statement = 10 # ms
23
+ ```sql
24
+ CREATE EXTENSION hypopg;
27
25
  ```
28
26
 
29
- And install the command line tool with:
27
+ See the [installation notes](#hypopg-installation-notes) if you run into issues.
28
+
29
+ Then install the command line tool with:
30
30
 
31
31
  ```sh
32
32
  gem install pgdexter
@@ -36,10 +36,10 @@ The command line tool is also available with [Docker](#docker), [Homebrew](#home
36
36
 
37
37
  ## How to Use
38
38
 
39
- Dexter needs a connection to your database and a log file to process.
39
+ Dexter needs a connection to your database and a source of queries (like [pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html)) to process.
40
40
 
41
41
  ```sh
42
- tail -F -n +1 <log-file> | dexter <connection-options>
42
+ dexter -d dbname --pg-stat-statements
43
43
  ```
44
44
 
45
45
  This finds slow queries and generates output like:
@@ -53,7 +53,6 @@ Index found: public.movies (title)
53
53
  Index found: public.ratings (movie_id)
54
54
  Index found: public.ratings (rating)
55
55
  Index found: public.ratings (user_id)
56
- Processing 12 new query fingerprints
57
56
  ```
58
57
 
59
58
  To be safe, Dexter will not create indexes unless you pass the `--create` flag. In this case, you’ll see:
@@ -84,41 +83,78 @@ and connection strings:
84
83
  host=localhost port=5432 dbname=mydb
85
84
  ```
86
85
 
86
+ Always make sure your [connection is secure](https://ankane.org/postgres-sslmode-explained) when connecting to a database over a network you don’t fully trust.
87
+
87
88
  ## Collecting Queries
88
89
 
89
- There are many ways to collect queries. For real-time indexing, pipe your logfile:
90
+ Dexter can collect queries from a number of sources.
91
+
92
+ - [Query stats](#query-stats)
93
+ - [Live queries](#live-queries)
94
+ - [Log files](#log-file)
95
+ - [SQL files](#sql-files)
96
+
97
+ ### Query Stats
98
+
99
+ Enable [pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html) in your database.
100
+
101
+ ```psql
102
+ CREATE EXTENSION pg_stat_statements;
103
+ ```
104
+
105
+ And use:
90
106
 
91
107
  ```sh
92
- tail -F -n +1 <log-file> | dexter <connection-options>
108
+ dexter <connection-options> --pg-stat-statements
93
109
  ```
94
110
 
95
- Pass a single statement with:
111
+ ### Live Queries
112
+
113
+ Get live queries from the [pg_stat_activity](https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW) view with:
96
114
 
97
115
  ```sh
98
- dexter <connection-options> -s "SELECT * FROM ..."
116
+ dexter <connection-options> --pg-stat-activity
117
+ ```
118
+
119
+ ### Log Files
120
+
121
+ Enable logging for slow queries in your Postgres config file.
122
+
123
+ ```ini
124
+ log_min_duration_statement = 10 # ms
99
125
  ```
100
126
 
101
- or pass files:
127
+ And use:
102
128
 
103
129
  ```sh
104
- dexter <connection-options> <file1> <file2>
130
+ dexter <connection-options> postgresql.log
105
131
  ```
106
132
 
107
- or collect running queries with:
133
+ Supports `stderr`, `csvlog`, and `jsonlog` formats.
134
+
135
+ For real-time indexing, pipe your logfile:
108
136
 
109
137
  ```sh
110
- dexter <connection-options> --pg-stat-activity
138
+ tail -F -n +1 postgresql.log | dexter <connection-options> --stdin
111
139
  ```
112
140
 
113
- or use the [pg_stat_statements](https://www.postgresql.org/docs/current/static/pgstatstatements.html) extension:
141
+ And pass `--input-format csvlog` or `--input-format jsonlog` if needed.
142
+
143
+ ### SQL Files
144
+
145
+ Pass a SQL file with:
114
146
 
115
147
  ```sh
116
- dexter <connection-options> --pg-stat-statements
148
+ dexter <connection-options> queries.sql
117
149
  ```
118
150
 
119
- > Note: Logs or running queries are highly preferred over pg_stat_statements, as pg_stat_statements often doesn’t store enough information to optimize queries.
151
+ Pass a single query with:
152
+
153
+ ```sh
154
+ dexter <connection-options> -s "SELECT * FROM ..."
155
+ ```
120
156
 
121
- ### Collection Options
157
+ ## Collection Options
122
158
 
123
159
  To prevent one-off queries from being indexed, specify a minimum number of calls before a query is considered for indexing
124
160
 
@@ -132,12 +168,6 @@ You can do the same for total time a query has run
132
168
  dexter --min-time 10 # minutes
133
169
  ```
134
170
 
135
- Specify the format
136
-
137
- ```sh
138
- dexter --input-format csv
139
- ```
140
-
141
171
  When streaming logs, specify the time to wait between processing queries
142
172
 
143
173
  ```sh
@@ -146,16 +176,22 @@ dexter --interval 60 # seconds
146
176
 
147
177
  ## Examples
148
178
 
149
- Ubuntu with PostgreSQL 12
179
+ Postgres package on Ubuntu 22.04
150
180
 
151
181
  ```sh
152
- tail -F -n +1 /var/log/postgresql/postgresql-12-main.log | sudo -u postgres dexter dbname
182
+ sudo -u postgres dexter -d dbname /var/log/postgresql/postgresql-14-main.log
153
183
  ```
154
184
 
155
- Homebrew on Mac
185
+ Homebrew Postgres on Mac ARM
156
186
 
157
187
  ```sh
158
- tail -F -n +1 /usr/local/var/postgres/server.log | dexter dbname
188
+ dexter -d dbname /opt/homebrew/var/log/postgresql@14.log
189
+ ```
190
+
191
+ Homebrew Postgres on Mac x86-64
192
+
193
+ ```sh
194
+ dexter -d dbname /usr/local/var/log/postgresql@14.log
159
195
  ```
160
196
 
161
197
  ## Analyze
@@ -198,6 +234,30 @@ For other providers, see [this guide](guides/Hosted-Postgres.md). To request a n
198
234
  - Google Cloud SQL - vote or comment on [this page](https://issuetracker.google.com/issues/69250435)
199
235
  - DigitalOcean Managed Databases - vote or comment on [this page](https://ideas.digitalocean.com/app-framework-services/p/support-hypopg-for-postgres)
200
236
 
237
+ ## HypoPG Installation Notes
238
+
239
+ ### Postgres Location
240
+
241
+ If your machine has multiple Postgres installations, specify the path to [pg_config](https://www.postgresql.org/docs/current/app-pgconfig.html) with:
242
+
243
+ ```sh
244
+ export PG_CONFIG=/Applications/Postgres.app/Contents/Versions/latest/bin/pg_config
245
+ ```
246
+
247
+ Then re-run the installation instructions (run `make clean` before `make` if needed)
248
+
249
+ ### Missing Header
250
+
251
+ If compilation fails with `fatal error: postgres.h: No such file or directory`, make sure Postgres development files are installed on the server.
252
+
253
+ For Ubuntu and Debian, use:
254
+
255
+ ```sh
256
+ sudo apt-get install postgresql-server-dev-15
257
+ ```
258
+
259
+ Note: Replace `15` with your Postgres server version
260
+
201
261
  ## Additional Installation Methods
202
262
 
203
263
  ### Docker
@@ -214,19 +274,19 @@ And run it with:
214
274
  docker run -ti ankane/dexter <connection-options>
215
275
  ```
216
276
 
217
- For databases on the host machine, use `host.docker.internal` as the hostname (on Linux, this requires Docker 20.04 and `--add-host=host.docker.internal:host-gateway`).
277
+ For databases on the host machine, use `host.docker.internal` as the hostname (on Linux, this requires Docker 20.04+ and `--add-host=host.docker.internal:host-gateway`).
218
278
 
219
279
  ### Homebrew
220
280
 
221
281
  With Homebrew, you can use:
222
282
 
223
283
  ```sh
224
- brew install ankane/brew/dexter
284
+ brew install dexter
225
285
  ```
226
286
 
227
287
  ## Future Work
228
288
 
229
- [Here are some ideas](https://github.com/ankane/dexter/issues/1)
289
+ [Here are some ideas](https://github.com/ankane/dexter/issues/45)
230
290
 
231
291
  ## Upgrading
232
292
 
@@ -243,9 +303,19 @@ gem install specific_install
243
303
  gem specific_install https://github.com/ankane/dexter.git
244
304
  ```
245
305
 
306
+ ## Upgrade Notes
307
+
308
+ ### 0.5.0
309
+
310
+ The `--stdin` option is now required to read queries from stdin.
311
+
312
+ ```sh
313
+ tail -F -n +1 postgresql.log | dexter <connection-options> --stdin
314
+ ```
315
+
246
316
  ## Thanks
247
317
 
248
- This software wouldn’t be possible without [HypoPG](https://github.com/HypoPG/hypopg), which allows you to create hypothetical indexes, and [pg_query](https://github.com/lfittl/pg_query), which allows you to parse and fingerprint queries. A big thanks to Dalibo and Lukas Fittl respectively.
318
+ This software wouldn’t be possible without [HypoPG](https://github.com/HypoPG/hypopg), which allows you to create hypothetical indexes, and [pg_query](https://github.com/lfittl/pg_query), which allows you to parse and fingerprint queries. A big thanks to Dalibo and Lukas Fittl respectively. Also, thanks to YugabyteDB for [this article](https://dev.to/yugabyte/explain-from-pgstatstatements-normalized-queries-how-to-always-get-the-generic-plan-in--5cfi) on how to explain normalized queries.
249
319
 
250
320
  ## Research
251
321
 
data/lib/dexter/client.rb CHANGED
@@ -7,7 +7,7 @@ module Dexter
7
7
 
8
8
  def self.start
9
9
  Dexter::Client.new(ARGV).perform
10
- rescue Dexter::Abort, PG::UndefinedFile => e
10
+ rescue Dexter::Abort, PG::UndefinedFile, PG::FeatureNotSupported => e
11
11
  abort colorize(e.message.strip, :red)
12
12
  end
13
13
 
@@ -29,9 +29,15 @@ module Dexter
29
29
  Processor.new(:pg_stat_activity, options).perform
30
30
  elsif arguments.any?
31
31
  ARGV.replace(arguments)
32
+ if !options[:input_format]
33
+ ext = ARGV.map { |v| File.extname(v) }.uniq
34
+ options[:input_format] = ext.first[1..-1] if ext.size == 1
35
+ end
32
36
  Processor.new(ARGF, options).perform
33
- else
37
+ elsif options[:stdin]
34
38
  Processor.new(STDIN, options).perform
39
+ else
40
+ raise Dexter::Abort, "Specify a source of queries: --pg-stat-statements, --pg-stat-activity, --stdin, or a path"
35
41
  end
36
42
  end
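The extension-based format detection added in this hunk can be tried in isolation. The following is a minimal sketch, not the gem's code: `detect_input_format` is a hypothetical helper name, and it assumes the format is simply the shared file extension without the leading dot, left unset when the files disagree or have no extension.

```ruby
# Hypothetical helper mirroring the idea in the hunk above:
# use the file extension as the input format only when every path agrees.
def detect_input_format(paths)
  extensions = paths.map { |path| File.extname(path) }.uniq
  return nil unless extensions.size == 1

  format = extensions.first.delete_prefix(".")
  format.empty? ? nil : format
end

p detect_input_format(["queries.sql"])        # a single .sql file
p detect_input_format(["a.csv", "b.json"])    # mixed extensions, no format chosen
```

Leaving the format unset for mixed or missing extensions appears to match the processor hunk further down, where any unrecognized format falls through to the stderr parser.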
37
43
 
@@ -40,23 +46,45 @@ module Dexter
40
46
  o.banner = %(Usage:
41
47
  dexter [options])
42
48
  o.separator ""
43
- o.separator "Options:"
49
+
50
+ o.separator "Input options:"
51
+ o.string "--input-format", "input format"
52
+ o.boolean "--pg-stat-activity", "use pg_stat_activity", default: false
53
+ o.boolean "--pg-stat-statements", "use pg_stat_statements", default: false, help: false
54
+ o.boolean "--stdin", "use stdin", default: false
55
+ o.string "-s", "--statement", "process a single statement"
56
+ o.separator ""
57
+
58
+ o.separator "Connection options:"
59
+ o.string "-d", "--dbname", "database name"
60
+ o.string "-h", "--host", "database host"
61
+ o.integer "-p", "--port", "database port"
62
+ o.string "-U", "--username", "database user"
63
+ o.separator ""
64
+
65
+ o.separator "Processing options:"
66
+ o.integer "--interval", "time to wait between processing queries, in seconds", default: 60
67
+ o.float "--min-calls", "only process queries that have been called a certain number of times", default: 0
68
+ o.float "--min-time", "only process queries that have consumed a certain amount of DB time, in minutes", default: 0
69
+ o.separator ""
70
+
71
+ o.separator "Indexing options:"
44
72
  o.boolean "--analyze", "analyze tables that haven't been analyzed in the past hour", default: false
45
73
  o.boolean "--create", "create indexes", default: false
74
+ o.boolean "--enable-hypopg", "enable the HypoPG extension", default: false
46
75
  o.array "--exclude", "prevent specific tables from being indexed"
47
76
  o.string "--include", "only include specific tables"
48
- o.string "--input-format", "input format", default: "stderr"
49
- o.integer "--interval", "time to wait between processing queries, in seconds", default: 60
77
+ o.integer "--min-cost-savings-pct", default: 50, help: false
78
+ o.string "--tablespace", "tablespace to create indexes"
79
+ o.separator ""
80
+
81
+ o.separator "Logging options:"
50
82
  o.boolean "--log-explain", "log explain", default: false, help: false
51
83
  o.string "--log-level", "log level", default: "info"
52
84
  o.boolean "--log-sql", "log sql", default: false
53
- o.float "--min-calls", "only process queries that have been called a certain number of times", default: 0
54
- o.float "--min-time", "only process queries that have consumed a certain amount of DB time, in minutes", default: 0
55
- o.integer "--min-cost-savings-pct", default: 50, help: false
56
- o.boolean "--pg-stat-activity", "use pg_stat_activity", default: false, help: false
57
- o.boolean "--pg-stat-statements", "use pg_stat_statements", default: false, help: false
58
- o.string "-s", "--statement", "process a single statement"
59
- o.string "--tablespace", "tablespace to create indexes"
85
+ o.separator ""
86
+
87
+ o.separator "Other options:"
60
88
  o.on "-v", "--version", "print the version" do
61
89
  log Dexter::VERSION
62
90
  exit
@@ -65,12 +93,6 @@ module Dexter
65
93
  log o
66
94
  exit
67
95
  end
68
- o.separator ""
69
- o.separator "Connection options:"
70
- o.string "-d", "--dbname", "database name"
71
- o.string "-h", "--host", "database host"
72
- o.integer "-p", "--port", "database port"
73
- o.string "-U", "--username", "database user"
74
96
  end
75
97
 
76
98
  arguments = opts.arguments
@@ -8,7 +8,7 @@ module Dexter
8
8
  @min_calls = options[:min_calls]
9
9
  end
10
10
 
11
- def add(query, duration)
11
+ def add(query, total_time, calls = 1)
12
12
  fingerprint =
13
13
  begin
14
14
  PgQuery.fingerprint(query)
@@ -19,8 +19,8 @@ module Dexter
19
19
  return unless fingerprint
20
20
 
21
21
  @top_queries[fingerprint] ||= {calls: 0, total_time: 0}
22
- @top_queries[fingerprint][:calls] += 1
23
- @top_queries[fingerprint][:total_time] += duration
22
+ @top_queries[fingerprint][:calls] += calls
23
+ @top_queries[fingerprint][:total_time] += total_time
24
24
  @top_queries[fingerprint][:query] = query
25
25
  @mutex.synchronize do
26
26
  @new_queries << fingerprint
@@ -1,22 +1,24 @@
1
- require "csv"
2
-
3
1
  module Dexter
4
2
  class CsvLogParser < LogParser
5
3
  FIRST_LINE_REGEX = /\A.+/
6
4
 
7
5
  def perform
8
6
  CSV.new(@logfile.to_io).each do |row|
9
- if (m = REGEX.match(row[13]))
10
- # replace first line with match
11
- # needed for multiline queries
12
- active_line = row[13].sub(FIRST_LINE_REGEX, m[3])
13
-
14
- add_parameters(active_line, row[14]) if row[14]
15
- process_entry(active_line, m[1].to_f)
16
- end
7
+ process_csv_row(row[13], row[14])
17
8
  end
18
9
  rescue CSV::MalformedCSVError => e
19
10
  raise Dexter::Abort, "ERROR: #{e.message}"
20
11
  end
12
+
13
+ def process_csv_row(message, detail)
14
+ if (m = REGEX.match(message))
15
+ # replace first line with match
16
+ # needed for multiline queries
17
+ active_line = message.sub(FIRST_LINE_REGEX, m[3])
18
+
19
+ add_parameters(active_line, detail) if detail
20
+ process_entry(active_line, m[1].to_f)
21
+ end
22
+ end
21
23
  end
22
24
  end
@@ -17,7 +17,12 @@ module Dexter
17
17
  @options = options
18
18
  @mutex = Mutex.new
19
19
 
20
- create_extension unless extension_exists?
20
+ if server_version_num < 110000
21
+ raise Dexter::Abort, "This version of Dexter requires Postgres 11+"
22
+ end
23
+
24
+ check_extension
25
+
21
26
  execute("SET lock_timeout = '5s'")
22
27
  end
23
28
 
@@ -27,23 +32,6 @@ module Dexter
27
32
  process_queries(queries)
28
33
  end
29
34
 
30
- def stat_activity
31
- execute <<-SQL
32
- SELECT
33
- pid || ':' || COALESCE(query_start, xact_start) AS id,
34
- query,
35
- EXTRACT(EPOCH FROM NOW() - COALESCE(query_start, xact_start)) * 1000.0 AS duration_ms
36
- FROM
37
- pg_stat_activity
38
- WHERE
39
- datname = current_database()
40
- AND state = 'active'
41
- AND pid != pg_backend_pid()
42
- ORDER BY
43
- 1
44
- SQL
45
- end
46
-
47
35
  def process_queries(queries)
48
36
  # reset hypothetical indexes
49
37
  reset_hypothetical_indexes
@@ -119,19 +107,20 @@ module Dexter
119
107
 
120
108
  private
121
109
 
122
- def create_extension
123
- execute("SET client_min_messages = warning")
124
- begin
125
- execute("CREATE EXTENSION IF NOT EXISTS hypopg")
126
- rescue PG::UndefinedFile
110
+ def check_extension
111
+ extension = execute("SELECT installed_version FROM pg_available_extensions WHERE name = 'hypopg'").first
112
+
113
+ if extension.nil?
127
114
  raise Dexter::Abort, "Install HypoPG first: https://github.com/ankane/dexter#installation"
128
- rescue PG::InsufficientPrivilege
129
- raise Dexter::Abort, "Use a superuser to run: CREATE EXTENSION hypopg"
130
115
  end
131
- end
132
116
 
133
- def extension_exists?
134
- execute("SELECT * FROM pg_available_extensions WHERE name = 'hypopg' AND installed_version IS NOT NULL").any?
117
+ if extension["installed_version"].nil?
118
+ if @options[:enable_hypopg]
119
+ execute("CREATE EXTENSION hypopg")
120
+ else
121
+ raise Dexter::Abort, "Run `CREATE EXTENSION hypopg` or pass --enable-hypopg"
122
+ end
123
+ end
135
124
  end
136
125
 
137
126
  def reset_hypothetical_indexes
@@ -141,7 +130,7 @@ module Dexter
141
130
  def analyze_tables(tables)
142
131
  tables = tables.to_a.sort
143
132
 
144
- analyze_stats = execute <<-SQL
133
+ query = <<~SQL
145
134
  SELECT
146
135
  schemaname || '.' || relname AS table,
147
136
  last_analyze,
@@ -149,8 +138,9 @@ module Dexter
149
138
  FROM
150
139
  pg_stat_user_tables
151
140
  WHERE
152
- schemaname || '.' || relname IN (#{tables.map { |t| quote(t) }.join(", ")})
141
+ schemaname || '.' || relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")})
153
142
  SQL
143
+ analyze_stats = execute(query, params: tables.to_a)
154
144
 
155
145
  last_analyzed = {}
156
146
  analyze_stats.each do |stats|
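The change above swaps quoted string interpolation for numbered bind parameters. As a rough illustration of how the placeholder list is built, here is a stand-alone sketch; `placeholders` is a hypothetical helper name and the SQL string is only an example.

```ruby
# Hypothetical helper: builds "$1, $2, ..." for a list of values,
# in the same way as the IN (...) clause in the hunk above.
def placeholders(values)
  values.size.times.map { |i| "$#{i + 1}" }.join(", ")
end

tables = ["public.movies", "public.ratings"]
sql = "SELECT relname FROM pg_stat_user_tables " \
      "WHERE schemaname || '.' || relname IN (#{placeholders(tables)})"
puts sql
```

With the pg gem, a string like this would typically be sent together with the values, for example through `exec_params(sql, tables)`, so the table names never become part of the SQL text.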
@@ -181,10 +171,6 @@ module Dexter
181
171
  end
182
172
  begin
183
173
  query.plans << plan(query.statement)
184
- if @log_explain
185
- # Pass format to prevent ANALYZE
186
- puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query.statement)}", pretty: false).map { |r| r["QUERY PLAN"] }.join("\n")
187
- end
188
174
  rescue PG::Error, JSON::NestingError => e
189
175
  if @log_explain
190
176
  log e.message
@@ -214,7 +200,7 @@ module Dexter
214
200
  find_columns(query.tree).each do |col|
215
201
  last_col = col["fields"].last
216
202
  if last_col["String"]
217
- possible_columns << last_col["String"]["str"]
203
+ possible_columns << last_col["String"]["sval"]
218
204
  end
219
205
  end
220
206
  end
@@ -510,7 +496,7 @@ module Dexter
510
496
  def conn
511
497
  @conn ||= begin
512
498
  # set connect timeout if none set
513
- ENV["PGCONNECT_TIMEOUT"] ||= "2"
499
+ ENV["PGCONNECT_TIMEOUT"] ||= "3"
514
500
 
515
501
  if @options[:dbname] =~ /\Apostgres(ql)?:\/\//
516
502
  config = @options[:dbname]
@@ -529,7 +515,7 @@ module Dexter
529
515
  raise Dexter::Abort, e.message
530
516
  end
531
517
 
532
- def execute(query, pretty: true)
518
+ def execute(query, pretty: true, params: [])
533
519
  # use exec_params instead of exec for security
534
520
  #
535
521
  # Unlike PQexec, PQexecParams allows at most one SQL command in the given string.
@@ -538,16 +524,56 @@ module Dexter
538
524
  # as an extra defense against SQL-injection attacks.
539
525
  # https://www.postgresql.org/docs/current/static/libpq-exec.html
540
526
  query = squish(query) if pretty
541
- log colorize("[sql] #{query}", :cyan) if @log_sql
527
+ log colorize("[sql] #{query}#{params.any? ? " /*#{params.to_json}*/" : ""}", :cyan) if @log_sql
542
528
 
543
529
  @mutex.synchronize do
544
- conn.exec_params(query, []).to_a
530
+ conn.exec_params("#{query} /*dexter*/", params).to_a
545
531
  end
546
532
  end
547
533
 
548
534
  def plan(query)
535
+ prepared = false
536
+ transaction = false
537
+
538
+ # try to EXPLAIN normalized queries
539
+ # https://dev.to/yugabyte/explain-from-pgstatstatements-normalized-queries-how-to-always-get-the-generic-plan-in--5cfi
540
+ explain_normalized = query.include?("$1")
541
+ if explain_normalized
542
+ prepared_name = "dexter_prepared"
543
+ execute("PREPARE #{prepared_name} AS #{safe_statement(query)}", pretty: false)
544
+ prepared = true
545
+ params = execute("SELECT array_length(parameter_types, 1) AS params FROM pg_prepared_statements WHERE name = $1", params: [prepared_name]).first["params"].to_i
546
+ query = "EXECUTE #{prepared_name}(#{params.times.map { "NULL" }.join(", ")})"
547
+
548
+ execute("BEGIN")
549
+ transaction = true
550
+
551
+ if server_version_num >= 120000
552
+ execute("SET LOCAL plan_cache_mode = force_generic_plan")
553
+ else
554
+ execute("SET LOCAL cpu_operator_cost = 1e42")
555
+ 5.times do
556
+ execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false)
557
+ end
558
+ execute("ROLLBACK")
559
+ execute("BEGIN")
560
+ end
561
+ end
562
+
549
563
  # strip semi-colons as another measure of defense
550
- JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false).first["QUERY PLAN"], max_nesting: 1000).first["Plan"]
564
+ plan = JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false).first["QUERY PLAN"], max_nesting: 1000).first["Plan"]
565
+
566
+ if @log_explain
567
+ # Pass format to prevent ANALYZE
568
+ puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query)}", pretty: false).map { |r| r["QUERY PLAN"] }.join("\n")
569
+ end
570
+
571
+ plan
572
+ ensure
573
+ if explain_normalized
574
+ execute("ROLLBACK") if transaction
575
+ execute("DEALLOCATE #{prepared_name}") if prepared
576
+ end
551
577
  end
552
578
 
553
579
  # TODO for multicolumn indexes, use ordering
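The `plan` method above explains normalized queries by preparing them and forcing a generic plan. The sketch below only assembles the statement sequence for the Postgres 12+ branch so the order is easier to see; it runs nothing against a database, and `generic_plan_statements` is a hypothetical name that does not exist in the gem.

```ruby
# Hypothetical helper: lists the statements the Postgres 12+ path would issue,
# in order, for a normalized query with the given number of parameters.
def generic_plan_statements(prepared_name, normalized_query, param_count)
  args = Array.new(param_count, "NULL").join(", ")
  [
    "PREPARE #{prepared_name} AS #{normalized_query}",
    "BEGIN",
    "SET LOCAL plan_cache_mode = force_generic_plan",
    "EXPLAIN (FORMAT JSON) EXECUTE #{prepared_name}(#{args})",
    "ROLLBACK",
    "DEALLOCATE #{prepared_name}"
  ]
end

puts generic_plan_statements(
  "dexter_prepared",
  "SELECT * FROM ratings WHERE user_id = $1 AND rating > $2",
  2
)
```

Passing `NULL` for every parameter works here because a generic plan does not depend on the actual values. In the method above, the parameter count is read from `pg_prepared_statements` rather than supplied by the caller.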
@@ -565,7 +591,7 @@ module Dexter
565
591
  end
566
592
 
567
593
  def database_tables
568
- result = execute <<-SQL
594
+ result = execute <<~SQL
569
595
  SELECT
570
596
  table_schema || '.' || table_name AS table_name
571
597
  FROM
@@ -577,17 +603,13 @@ module Dexter
577
603
  end
578
604
 
579
605
  def materialized_views
580
- if server_version_num >= 90300
581
- result = execute <<-SQL
582
- SELECT
583
- schemaname || '.' || matviewname AS table_name
584
- FROM
585
- pg_matviews
586
- SQL
587
- result.map { |r| r["table_name"] }
588
- else
589
- []
590
- end
606
+ result = execute <<~SQL
607
+ SELECT
608
+ schemaname || '.' || matviewname AS table_name
609
+ FROM
610
+ pg_matviews
611
+ SQL
612
+ result.map { |r| r["table_name"] }
591
613
  end
592
614
 
593
615
  def server_version_num
@@ -595,7 +617,7 @@ module Dexter
595
617
  end
596
618
 
597
619
  def database_view_tables
598
- result = execute <<-SQL
620
+ result = execute <<~SQL
599
621
  SELECT
600
622
  schemaname || '.' || viewname AS table_name,
601
623
  definition
@@ -621,7 +643,7 @@ module Dexter
621
643
 
622
644
  def stat_statements
623
645
  total_time = server_version_num >= 130000 ? "(total_plan_time + total_exec_time)" : "total_time"
624
- result = execute <<-SQL
646
+ sql = <<~SQL
625
647
  SELECT
626
648
  DISTINCT query
627
649
  FROM
@@ -630,18 +652,18 @@ module Dexter
630
652
  pg_database ON pg_database.oid = pg_stat_statements.dbid
631
653
  WHERE
632
654
  datname = current_database()
633
- AND #{total_time} >= #{@min_time * 60000}
634
- AND calls >= #{@min_calls}
655
+ AND #{total_time} >= \$1
656
+ AND calls >= \$2
635
657
  ORDER BY
636
658
  1
637
659
  SQL
638
- result.map { |q| q["query"] }
660
+ execute(sql, params: [@min_time * 60000, @min_calls]).map { |q| q["query"] }
639
661
  end
640
662
 
641
663
  def with_advisory_lock
642
664
  lock_id = 123456
643
665
  first_time = true
644
- while execute("SELECT pg_try_advisory_lock(#{lock_id})").first["pg_try_advisory_lock"] != "t"
666
+ while execute("SELECT pg_try_advisory_lock($1)", params: [lock_id]).first["pg_try_advisory_lock"] != "t"
645
667
  if first_time
646
668
  log "Waiting for lock..."
647
669
  first_time = false
@@ -650,16 +672,19 @@ module Dexter
650
672
  end
651
673
  yield
652
674
  ensure
653
- with_min_messages("error") do
654
- execute("SELECT pg_advisory_unlock(#{lock_id})")
675
+ suppress_messages do
676
+ execute("SELECT pg_advisory_unlock($1)", params: [lock_id])
655
677
  end
656
678
  end
657
679
 
658
- def with_min_messages(value)
659
- execute("SET client_min_messages = #{quote(value)}")
680
+ def suppress_messages
681
+ conn.set_notice_processor do |message|
682
+ # do nothing
683
+ end
660
684
  yield
661
685
  ensure
662
- execute("SET client_min_messages = warning")
686
+ # clear notice processor
687
+ conn.set_notice_processor
663
688
  end
664
689
 
665
690
  def index_exists?(index)
@@ -667,7 +692,7 @@ module Dexter
667
692
  end
668
693
 
669
694
  def columns(tables)
670
- columns = execute <<-SQL
695
+ query = <<~SQL
671
696
  SELECT
672
697
  s.nspname || '.' || t.relname AS table_name,
673
698
  a.attname AS column_name,
@@ -677,16 +702,16 @@ module Dexter
677
702
  JOIN pg_namespace s on t.relnamespace = s.oid
678
703
  WHERE a.attnum > 0
679
704
  AND NOT a.attisdropped
680
- AND s.nspname || '.' || t.relname IN (#{tables.map { |t| quote(t) }.join(", ")})
705
+ AND s.nspname || '.' || t.relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")})
681
706
  ORDER BY
682
707
  1, 2
683
708
  SQL
684
-
709
+ columns = execute(query, params: tables.to_a)
685
710
  columns.map { |v| {table: v["table_name"], column: v["column_name"], type: v["data_type"]} }
686
711
  end
687
712
 
688
713
  def indexes(tables)
689
- execute(<<-SQL
714
+ query = <<~SQL
690
715
  SELECT
691
716
  schemaname || '.' || t.relname AS table,
692
717
  ix.relname AS name,
@@ -701,14 +726,14 @@ module Dexter
701
726
  LEFT JOIN
702
727
  pg_stat_user_indexes ui ON ui.indexrelid = i.indexrelid
703
728
  WHERE
704
- schemaname || '.' || t.relname IN (#{tables.map { |t| quote(t) }.join(", ")}) AND
729
+ schemaname || '.' || t.relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")}) AND
705
730
  indisvalid = 't' AND
706
731
  indexprs IS NULL AND
707
732
  indpred IS NULL
708
733
  ORDER BY
709
734
  1, 2
710
735
  SQL
711
- ).map { |v| v["columns"] = v["columns"].sub(") WHERE (", " WHERE ").split(", ").map { |c| unquote(c) }; v }
736
+ execute(query, params: tables.to_a).map { |v| v["columns"] = v["columns"].sub(") WHERE (", " WHERE ").split(", ").map { |c| unquote(c) }; v }
712
737
  end
713
738
 
714
739
  def search_path
@@ -727,19 +752,6 @@ module Dexter
727
752
  value.split(".").map { |v| conn.quote_ident(v) }.join(".")
728
753
  end
729
754
 
730
- def quote(value)
731
- if value.is_a?(String)
732
- "'#{quote_string(value)}'"
733
- else
734
- value
735
- end
736
- end
737
-
738
- # from activerecord
739
- def quote_string(s)
740
- s.gsub(/\\/, '\&\&').gsub(/'/, "''")
741
- end
742
-
743
755
  # from activesupport
744
756
  def squish(str)
745
757
  str.to_s.gsub(/\A[[:space:]]+/, "").gsub(/[[:space:]]+\z/, "").gsub(/[[:space:]]+/, " ")
@@ -1,5 +1,3 @@
1
- require "json"
2
-
3
1
  module Dexter
4
2
  class JsonLogParser < LogParser
5
3
  FIRST_LINE_REGEX = /\A.+/
@@ -3,38 +3,12 @@ module Dexter
3
3
  include Logging
4
4
 
5
5
  REGEX = /duration: (\d+\.\d+) ms (statement|execute [^:]+): (.+)/
6
- LINE_SEPERATOR = ": ".freeze
7
- DETAIL_LINE = "DETAIL: ".freeze
8
6
 
9
7
  def initialize(logfile, collector)
10
8
  @logfile = logfile
11
9
  @collector = collector
12
10
  end
13
11
 
14
- def perform
15
- active_line = nil
16
- duration = nil
17
-
18
- @logfile.each_line do |line|
19
- if active_line
20
- if line.include?(DETAIL_LINE)
21
- add_parameters(active_line, line.chomp.split(DETAIL_LINE)[1])
22
- elsif line.include?(LINE_SEPERATOR)
23
- process_entry(active_line, duration)
24
- active_line = nil
25
- else
26
- active_line << line
27
- end
28
- end
29
-
30
- if !active_line && (m = REGEX.match(line.chomp))
31
- duration = m[1].to_f
32
- active_line = m[3]
33
- end
34
- end
35
- process_entry(active_line, duration) if active_line
36
- end
37
-
38
12
  private
39
13
 
40
14
  def process_entry(query, duration)
@@ -1,25 +1,50 @@
1
1
  module Dexter
2
2
  class PgStatActivityParser < LogParser
3
3
  def perform
4
- queries = {}
4
+ previous_queries = {}
5
5
 
6
- loop do
7
- new_queries = {}
8
- @logfile.stat_activity.each do |row|
9
- new_queries[row["id"]] = row
6
+ 10.times do
7
+ active_queries = {}
8
+ processed_queries = {}
9
+
10
+ stat_activity.each do |row|
11
+ if row["state"] == "active"
12
+ active_queries[row["id"]] = row
13
+ else
14
+ process_entry(row["query"], row["duration_ms"].to_f)
15
+ processed_queries[row["id"]] = true
16
+ end
10
17
  end
11
18
 
12
19
  # store queries after they complete
13
- queries.each do |id, row|
14
- unless new_queries[id]
20
+ previous_queries.each do |id, row|
21
+ if !active_queries[id] && !processed_queries[id]
15
22
  process_entry(row["query"], row["duration_ms"].to_f)
16
23
  end
17
24
  end
18
25
 
19
- queries = new_queries
26
+ previous_queries = active_queries
20
27
 
21
- sleep(1)
28
+ sleep(0.1)
22
29
  end
23
30
  end
31
+
32
+ def stat_activity
33
+ sql = <<~SQL
34
+ SELECT
35
+ pid || ':' || COALESCE(query_start, xact_start) AS id,
36
+ query,
37
+ state,
38
+ EXTRACT(EPOCH FROM NOW() - COALESCE(query_start, xact_start)) * 1000.0 AS duration_ms
39
+ FROM
40
+ pg_stat_activity
41
+ WHERE
42
+ datname = current_database()
43
+ AND pid != pg_backend_pid()
44
+ ORDER BY
45
+ 1
46
+ SQL
47
+ @logfile.send(:execute, sql)
48
+ end
24
49
  end
25
50
  end
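The sampling loop above records a query either when its row is no longer `active` or when a previously active row disappears between samples. The following sketch replays that bookkeeping over in-memory snapshots instead of querying `pg_stat_activity`; `completed_queries` and the sample rows are made up for illustration.

```ruby
# Hypothetical replay of the sampling logic: each snapshot is an array of rows
# shaped like the pg_stat_activity query result (id, query, state, duration_ms).
def completed_queries(snapshots)
  completed = []
  previous = {}

  snapshots.each do |rows|
    active = {}
    processed = {}

    rows.each do |row|
      if row["state"] == "active"
        active[row["id"]] = row
      else
        completed << [row["query"], row["duration_ms"].to_f]
        processed[row["id"]] = true
      end
    end

    # queries that were active in the last sample and are now gone
    previous.each do |id, row|
      if !active[id] && !processed[id]
        completed << [row["query"], row["duration_ms"].to_f]
      end
    end

    previous = active
  end

  completed
end

snapshots = [
  [{"id" => "42:t1", "query" => "SELECT 1", "state" => "active", "duration_ms" => "5.0"}],
  []
]
p completed_queries(snapshots)
```

One consequence of this structure is that a non-active row is recorded again on every sample in which it appears, which is worth keeping in mind when reading call counts from this source.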
@@ -18,7 +18,7 @@ module Dexter
18
18
  elsif options[:input_format] == "sql"
19
19
  SqlLogParser.new(logfile, @collector)
20
20
  else
21
- LogParser.new(logfile, @collector)
21
+ StderrLogParser.new(logfile, @collector)
22
22
  end
23
23
 
24
24
  @starting_interval = 3
@@ -31,7 +31,7 @@ module Dexter
31
31
  end
32
32
 
33
33
  def perform
34
- if [STDIN, :pg_stat_activity].include?(@logfile)
34
+ if [STDIN].include?(@logfile)
35
35
  Thread.abort_on_exception = true
36
36
  Thread.new do
37
37
  sleep(@starting_interval)
@@ -0,0 +1,34 @@
+module Dexter
+  class StderrLogParser < LogParser
+    LINE_SEPERATOR = ": ".freeze
+    DETAIL_LINE = "DETAIL: ".freeze
+
+    def perform
+      process_stderr(@logfile.each_line)
+    end
+
+    def process_stderr(rows)
+      active_line = nil
+      duration = nil
+
+      rows.each do |line|
+        if active_line
+          if line.include?(DETAIL_LINE)
+            add_parameters(active_line, line.chomp.split(DETAIL_LINE)[1])
+          elsif line.include?(LINE_SEPERATOR)
+            process_entry(active_line, duration)
+            active_line = nil
+          else
+            active_line << line
+          end
+        end
+
+        if !active_line && (m = REGEX.match(line.chomp))
+          duration = m[1].to_f
+          active_line = m[3]
+        end
+      end
+      process_entry(active_line, duration) if active_line
+    end
+  end
+end
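The line-by-line state machine in `process_stderr` can be sketched in isolation. This is an illustration under stated assumptions, not the gem's implementation: `parse_stderr` and `DURATION_REGEX` are hypothetical names, the pattern is an assumed approximation of `LogParser::REGEX` (which this diff does not show), and the `DETAIL:` branch is omitted because `add_parameters` is not shown either.

```ruby
# Assumed pattern for duration log lines; the real REGEX is defined in
# LogParser and does not appear in this diff.
DURATION_REGEX = /duration: (\d+\.\d+) ms\s+(statement|execute [^:]+): (.+)/

# Hypothetical stand-in for StderrLogParser#process_stderr that returns
# [query, duration] pairs instead of calling process_entry.
def parse_stderr(lines)
  entries = []
  active_line = nil
  duration = nil

  lines.each do |line|
    if active_line
      if line.include?(": ")
        # a line carrying a "prefix: " marker ends the statement in progress
        entries << [active_line, duration]
        active_line = nil
      else
        # otherwise the line continues a multi-line statement
        active_line << line
      end
    end

    # start a new statement when no statement is in progress
    if !active_line && (m = DURATION_REGEX.match(line.chomp))
      duration = m[1].to_f
      active_line = m[3]
    end
  end

  # flush the last statement if the input ends while one is in progress
  entries << [active_line, duration] if active_line
  entries
end

log = [
  "2023-04-18 10:00:00 UTC LOG:  duration: 12.500 ms  statement: SELECT *\n",
  "\tFROM users\n",
  "2023-04-18 10:00:01 UTC LOG:  duration: 3.000 ms  statement: SELECT 1\n"
]
entries = parse_stderr(log)
p entries
```

The line that closes one statement is also checked against the pattern, so a new duration line both flushes the previous entry and opens the next one.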
@@ -1,3 +1,3 @@
 module Dexter
-  VERSION = "0.4.2"
+  VERSION = "0.5.0"
 end
data/lib/dexter.rb CHANGED
@@ -4,23 +4,27 @@ require "pg_query"
 require "slop"
 
 # stdlib
+require "csv"
 require "json"
 require "set"
 require "time"
 
 # modules
-require "dexter/version"
-require "dexter/logging"
-require "dexter/client"
-require "dexter/collector"
-require "dexter/indexer"
-require "dexter/log_parser"
-require "dexter/csv_log_parser"
-require "dexter/json_log_parser"
-require "dexter/pg_stat_activity_parser"
-require "dexter/sql_log_parser"
-require "dexter/processor"
-require "dexter/query"
+require_relative "dexter/logging"
+require_relative "dexter/client"
+require_relative "dexter/collector"
+require_relative "dexter/indexer"
+require_relative "dexter/processor"
+require_relative "dexter/query"
+require_relative "dexter/version"
+
+# parsers
+require_relative "dexter/log_parser"
+require_relative "dexter/csv_log_parser"
+require_relative "dexter/json_log_parser"
+require_relative "dexter/pg_stat_activity_parser"
+require_relative "dexter/sql_log_parser"
+require_relative "dexter/stderr_log_parser"
 
 module Dexter
   class Abort < StandardError; end
metadata CHANGED
@@ -1,57 +1,57 @@
 --- !ruby/object:Gem::Specification
 name: pgdexter
 version: !ruby/object:Gem::Version
-  version: 0.4.2
+  version: 0.5.0
 platform: ruby
 authors:
 - Andrew Kane
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2023-01-30 00:00:00.000000000 Z
+date: 2023-04-18 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: slop
+  name: pg
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 4.8.2
+        version: 0.18.2
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: 4.8.2
+        version: 0.18.2
 - !ruby/object:Gem::Dependency
-  name: pg
+  name: pg_query
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - ">="
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.18.2
+        version: '4'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - ">="
+    - - "~>"
       - !ruby/object:Gem::Version
-        version: 0.18.2
+        version: '4'
 - !ruby/object:Gem::Dependency
-  name: pg_query
+  name: slop
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '2.1'
+        version: 4.10.1
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version: '2.1'
+        version: 4.10.1
 description:
 email: andrew@ankane.org
 executables:
@@ -75,6 +75,7 @@ files:
 - lib/dexter/processor.rb
 - lib/dexter/query.rb
 - lib/dexter/sql_log_parser.rb
+- lib/dexter/stderr_log_parser.rb
 - lib/dexter/version.rb
 homepage: https://github.com/ankane/dexter
 licenses:
@@ -88,14 +89,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '2.5'
+      version: '2.7'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.4.1
+rubygems_version: 3.4.10
 signing_key:
 specification_version: 4
 summary: The automatic indexer for Postgres