pgdexter 0.4.2 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 8e8097505929a24e8c038c7fe69ac667faaa05008a3aa94971501fc4b9f15157
- data.tar.gz: a08bf061493ed984103ce768af1347d81250c51b4f068f69d3688a1e4fd25514
+ metadata.gz: 65822d0d98c9641efdc3146295e098e09e83348ed109b856670a03d74ed2d70b
+ data.tar.gz: 6d16c9019172e5e69df056ac358588151fa52ae9e8256159547e1842e2f9b97c
  SHA512:
- metadata.gz: 9c2d99ecbd5fd68e460f25982fe31d1149d9682a7065cfb674d24785f78138ab04d94bcdbf3b81f2d8065468f6531bbaea1e0593bd8fd7ab4dcdb2b061276c1e
- data.tar.gz: 0e55cc589470760d317e3ba0e895d7ce5e24cd79517cf02c59aaba1caf36eb1f9785725eedc61e7402d480a61870828aaf2b33683f8ca58223f380d518c7df9c
+ metadata.gz: 4991adea5ee65493ea99abe94c19360fc6cc718048784431409abc08fbaf1b1efe3b304dedbd0994d8b66b38294b41ea6400bf5de1f03f09694723a7b709e77c
+ data.tar.gz: 00f47a3efd2de6565dd5f5a3a64caa4b610e282d16620975376876501130a6d0cdbbef9412254b8f3d09db2df7a1f3f62d3a8e581c4fcfecd27ea3224cb6f20f
data/CHANGELOG.md CHANGED
@@ -1,3 +1,22 @@
+ ## 0.5.0 (2023-04-18)
+
+ - Added support for normalized queries
+ - Added `--stdin` option (now required to read from stdin)
+ - Added `--enable-hypopg` option (now required to enable HypoPG)
+ - Improved output when HypoPG not installed
+ - Changed `--pg-stat-activity` to sample 10 times and exit
+ - Detect input format based on file extension
+ - Dropped support for experimental `--log-table` option
+ - Dropped support for Linux packages for Ubuntu 18.04 and Debian 10
+ - Dropped support for Ruby < 2.7
+ - Dropped support for Postgres < 11
+
+ ## 0.4.3 (2023-03-26)
+
+ - Added experimental `--log-table` option
+ - Improved help
+ - Require pg_query < 4
+
  ## 0.4.2 (2023-01-29)
 
  - Fixed `--pg-stat-statements` option for Postgres 13+
data/LICENSE.txt CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2017-2021 Andrew Kane
+ Copyright (c) 2017-2023 Andrew Kane
 
  MIT License
 
data/README.md CHANGED
@@ -18,15 +18,15 @@ make
18
18
  make install # may need sudo
19
19
  ```
20
20
 
21
- > Note: If you have issues, make sure `postgresql-server-dev-*` is installed.
21
+ And enable it in databases where you want to use Dexter:
22
22
 
23
- Enable logging for slow queries in your Postgres config file.
24
-
25
- ```ini
26
- log_min_duration_statement = 10 # ms
23
+ ```sql
24
+ CREATE EXTENSION hypopg;
27
25
  ```
28
26
 
29
- And install the command line tool with:
27
+ See the [installation notes](#hypopg-installation-notes) if you run into issues.
28
+
29
+ Then install the command line tool with:
30
30
 
31
31
  ```sh
32
32
  gem install pgdexter
@@ -36,10 +36,10 @@ The command line tool is also available with [Docker](#docker), [Homebrew](#home
36
36
 
37
37
  ## How to Use
38
38
 
39
- Dexter needs a connection to your database and a log file to process.
39
+ Dexter needs a connection to your database and a source of queries (like [pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html)) to process.
40
40
 
41
41
  ```sh
42
- tail -F -n +1 <log-file> | dexter <connection-options>
42
+ dexter -d dbname --pg-stat-statements
43
43
  ```
44
44
 
45
45
  This finds slow queries and generates output like:
@@ -53,7 +53,6 @@ Index found: public.movies (title)
53
53
  Index found: public.ratings (movie_id)
54
54
  Index found: public.ratings (rating)
55
55
  Index found: public.ratings (user_id)
56
- Processing 12 new query fingerprints
57
56
  ```
58
57
 
59
58
  To be safe, Dexter will not create indexes unless you pass the `--create` flag. In this case, you’ll see:
@@ -84,41 +83,78 @@ and connection strings:
84
83
  host=localhost port=5432 dbname=mydb
85
84
  ```
86
85
 
86
+ Always make sure your [connection is secure](https://ankane.org/postgres-sslmode-explained) when connecting to a database over a network you don’t fully trust.
87
+
87
88
  ## Collecting Queries
88
89
 
89
- There are many ways to collect queries. For real-time indexing, pipe your logfile:
90
+ Dexter can collect queries from a number of sources.
91
+
92
+ - [Query stats](#query-stats)
93
+ - [Live queries](#live-queries)
94
+ - [Log files](#log-file)
95
+ - [SQL files](#sql-files)
96
+
97
+ ### Query Stats
98
+
99
+ Enable [pg_stat_statements](https://www.postgresql.org/docs/current/pgstatstatements.html) in your database.
100
+
101
+ ```psql
102
+ CREATE EXTENSION pg_stat_statements;
103
+ ```
104
+
105
+ And use:
90
106
 
91
107
  ```sh
92
- tail -F -n +1 <log-file> | dexter <connection-options>
108
+ dexter <connection-options> --pg-stat-statements
93
109
  ```
94
110
 
95
- Pass a single statement with:
111
+ ### Live Queries
112
+
113
+ Get live queries from the [pg_stat_activity](https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW) view with:
96
114
 
97
115
  ```sh
98
- dexter <connection-options> -s "SELECT * FROM ..."
116
+ dexter <connection-options> --pg-stat-activity
117
+ ```
118
+
119
+ ### Log Files
120
+
121
+ Enable logging for slow queries in your Postgres config file.
122
+
123
+ ```ini
124
+ log_min_duration_statement = 10 # ms
99
125
  ```
100
126
 
101
- or pass files:
127
+ And use:
102
128
 
103
129
  ```sh
104
- dexter <connection-options> <file1> <file2>
130
+ dexter <connection-options> postgresql.log
105
131
  ```
106
132
 
107
- or collect running queries with:
133
+ Supports `stderr`, `csvlog`, and `jsonlog` formats.
134
+
135
+ For real-time indexing, pipe your logfile:
108
136
 
109
137
  ```sh
110
- dexter <connection-options> --pg-stat-activity
138
+ tail -F -n +1 postgresql.log | dexter <connection-options> --stdin
111
139
  ```
112
140
 
113
- or use the [pg_stat_statements](https://www.postgresql.org/docs/current/static/pgstatstatements.html) extension:
141
+ And pass `--input-format csvlog` or `--input-format jsonlog` if needed.
142
+
143
+ ### SQL Files
144
+
145
+ Pass a SQL file with:
114
146
 
115
147
  ```sh
116
- dexter <connection-options> --pg-stat-statements
148
+ dexter <connection-options> queries.sql
117
149
  ```
118
150
 
119
- > Note: Logs or running queries are highly preferred over pg_stat_statements, as pg_stat_statements often doesn’t store enough information to optimize queries.
151
+ Pass a single query with:
152
+
153
+ ```sh
154
+ dexter <connection-options> -s "SELECT * FROM ..."
155
+ ```
120
156
 
121
- ### Collection Options
157
+ ## Collection Options
122
158
 
123
159
  To prevent one-off queries from being indexed, specify a minimum number of calls before a query is considered for indexing
124
160
 
@@ -132,12 +168,6 @@ You can do the same for total time a query has run
132
168
  dexter --min-time 10 # minutes
133
169
  ```
134
170
 
135
- Specify the format
136
-
137
- ```sh
138
- dexter --input-format csv
139
- ```
140
-
141
171
  When streaming logs, specify the time to wait between processing queries
142
172
 
143
173
  ```sh
@@ -146,16 +176,22 @@ dexter --interval 60 # seconds
146
176
 
147
177
  ## Examples
148
178
 
149
- Ubuntu with PostgreSQL 12
179
+ Postgres package on Ubuntu 22.04
150
180
 
151
181
  ```sh
152
- tail -F -n +1 /var/log/postgresql/postgresql-12-main.log | sudo -u postgres dexter dbname
182
+ sudo -u postgres dexter -d dbname /var/log/postgresql/postgresql-14-main.log
153
183
  ```
154
184
 
155
- Homebrew on Mac
185
+ Homebrew Postgres on Mac ARM
156
186
 
157
187
  ```sh
158
- tail -F -n +1 /usr/local/var/postgres/server.log | dexter dbname
188
+ dexter -d dbname /opt/homebrew/var/log/postgresql@14.log
189
+ ```
190
+
191
+ Homebrew Postgres on Mac x86-64
192
+
193
+ ```sh
194
+ dexter -d dbname /usr/local/var/log/postgresql@14.log
159
195
  ```
160
196
 
161
197
  ## Analyze
@@ -198,6 +234,30 @@ For other providers, see [this guide](guides/Hosted-Postgres.md). To request a n
198
234
  - Google Cloud SQL - vote or comment on [this page](https://issuetracker.google.com/issues/69250435)
199
235
  - DigitalOcean Managed Databases - vote or comment on [this page](https://ideas.digitalocean.com/app-framework-services/p/support-hypopg-for-postgres)
200
236
 
237
+ ## HypoPG Installation Notes
238
+
239
+ ### Postgres Location
240
+
241
+ If your machine has multiple Postgres installations, specify the path to [pg_config](https://www.postgresql.org/docs/current/app-pgconfig.html) with:
242
+
243
+ ```sh
244
+ export PG_CONFIG=/Applications/Postgres.app/Contents/Versions/latest/bin/pg_config
245
+ ```
246
+
247
+ Then re-run the installation instructions (run `make clean` before `make` if needed)
248
+
249
+ ### Missing Header
250
+
251
+ If compilation fails with `fatal error: postgres.h: No such file or directory`, make sure Postgres development files are installed on the server.
252
+
253
+ For Ubuntu and Debian, use:
254
+
255
+ ```sh
256
+ sudo apt-get install postgresql-server-dev-15
257
+ ```
258
+
259
+ Note: Replace `15` with your Postgres server version
260
+
201
261
  ## Additional Installation Methods
202
262
 
203
263
  ### Docker
@@ -214,19 +274,19 @@ And run it with:
214
274
  docker run -ti ankane/dexter <connection-options>
215
275
  ```
216
276
 
217
- For databases on the host machine, use `host.docker.internal` as the hostname (on Linux, this requires Docker 20.04 and `--add-host=host.docker.internal:host-gateway`).
277
+ For databases on the host machine, use `host.docker.internal` as the hostname (on Linux, this requires Docker 20.04+ and `--add-host=host.docker.internal:host-gateway`).
218
278
 
219
279
  ### Homebrew
220
280
 
221
281
  With Homebrew, you can use:
222
282
 
223
283
  ```sh
224
- brew install ankane/brew/dexter
284
+ brew install dexter
225
285
  ```
226
286
 
227
287
  ## Future Work
228
288
 
229
- [Here are some ideas](https://github.com/ankane/dexter/issues/1)
289
+ [Here are some ideas](https://github.com/ankane/dexter/issues/45)
230
290
 
231
291
  ## Upgrading
232
292
 
@@ -243,9 +303,19 @@ gem install specific_install
243
303
  gem specific_install https://github.com/ankane/dexter.git
244
304
  ```
245
305
 
306
+ ## Upgrade Notes
307
+
308
+ ### 0.5.0
309
+
310
+ The `--stdin` option is now required to read queries from stdin.
311
+
312
+ ```sh
313
+ tail -F -n +1 postgresql.log | dexter <connection-options> --stdin
314
+ ```
315
+
246
316
  ## Thanks
247
317
 
248
- This software wouldn’t be possible without [HypoPG](https://github.com/HypoPG/hypopg), which allows you to create hypothetical indexes, and [pg_query](https://github.com/lfittl/pg_query), which allows you to parse and fingerprint queries. A big thanks to Dalibo and Lukas Fittl respectively.
318
+ This software wouldn’t be possible without [HypoPG](https://github.com/HypoPG/hypopg), which allows you to create hypothetical indexes, and [pg_query](https://github.com/lfittl/pg_query), which allows you to parse and fingerprint queries. A big thanks to Dalibo and Lukas Fittl respectively. Also, thanks to YugabyteDB for [this article](https://dev.to/yugabyte/explain-from-pgstatstatements-normalized-queries-how-to-always-get-the-generic-plan-in--5cfi) on how to explain normalized queries.
249
319
 
250
320
  ## Research
251
321
 
data/lib/dexter/client.rb CHANGED
@@ -7,7 +7,7 @@ module Dexter
7
7
 
8
8
  def self.start
9
9
  Dexter::Client.new(ARGV).perform
10
- rescue Dexter::Abort, PG::UndefinedFile => e
10
+ rescue Dexter::Abort, PG::UndefinedFile, PG::FeatureNotSupported => e
11
11
  abort colorize(e.message.strip, :red)
12
12
  end
13
13
 
@@ -29,9 +29,15 @@ module Dexter
29
29
  Processor.new(:pg_stat_activity, options).perform
30
30
  elsif arguments.any?
31
31
  ARGV.replace(arguments)
32
+ if !options[:input_format]
33
+ ext = ARGV.map { |v| File.extname(v) }.uniq
34
+ options[:input_format] = ext.first[1..-1] if ext.size == 1
35
+ end
32
36
  Processor.new(ARGF, options).perform
33
- else
37
+ elsif options[:stdin]
34
38
  Processor.new(STDIN, options).perform
39
+ else
40
+ raise Dexter::Abort, "Specify a source of queries: --pg-stat-statements, --pg-stat-activity, --stdin, or a path"
35
41
  end
36
42
  end
37
43
 
@@ -40,23 +46,45 @@ module Dexter
40
46
  o.banner = %(Usage:
41
47
  dexter [options])
42
48
  o.separator ""
43
- o.separator "Options:"
49
+
50
+ o.separator "Input options:"
51
+ o.string "--input-format", "input format"
52
+ o.boolean "--pg-stat-activity", "use pg_stat_activity", default: false
53
+ o.boolean "--pg-stat-statements", "use pg_stat_statements", default: false, help: false
54
+ o.boolean "--stdin", "use stdin", default: false
55
+ o.string "-s", "--statement", "process a single statement"
56
+ o.separator ""
57
+
58
+ o.separator "Connection options:"
59
+ o.string "-d", "--dbname", "database name"
60
+ o.string "-h", "--host", "database host"
61
+ o.integer "-p", "--port", "database port"
62
+ o.string "-U", "--username", "database user"
63
+ o.separator ""
64
+
65
+ o.separator "Processing options:"
66
+ o.integer "--interval", "time to wait between processing queries, in seconds", default: 60
67
+ o.float "--min-calls", "only process queries that have been called a certain number of times", default: 0
68
+ o.float "--min-time", "only process queries that have consumed a certain amount of DB time, in minutes", default: 0
69
+ o.separator ""
70
+
71
+ o.separator "Indexing options:"
44
72
  o.boolean "--analyze", "analyze tables that haven't been analyzed in the past hour", default: false
45
73
  o.boolean "--create", "create indexes", default: false
74
+ o.boolean "--enable-hypopg", "enable the HypoPG extension", default: false
46
75
  o.array "--exclude", "prevent specific tables from being indexed"
47
76
  o.string "--include", "only include specific tables"
48
- o.string "--input-format", "input format", default: "stderr"
49
- o.integer "--interval", "time to wait between processing queries, in seconds", default: 60
77
+ o.integer "--min-cost-savings-pct", default: 50, help: false
78
+ o.string "--tablespace", "tablespace to create indexes"
79
+ o.separator ""
80
+
81
+ o.separator "Logging options:"
50
82
  o.boolean "--log-explain", "log explain", default: false, help: false
51
83
  o.string "--log-level", "log level", default: "info"
52
84
  o.boolean "--log-sql", "log sql", default: false
53
- o.float "--min-calls", "only process queries that have been called a certain number of times", default: 0
54
- o.float "--min-time", "only process queries that have consumed a certain amount of DB time, in minutes", default: 0
55
- o.integer "--min-cost-savings-pct", default: 50, help: false
56
- o.boolean "--pg-stat-activity", "use pg_stat_activity", default: false, help: false
57
- o.boolean "--pg-stat-statements", "use pg_stat_statements", default: false, help: false
58
- o.string "-s", "--statement", "process a single statement"
59
- o.string "--tablespace", "tablespace to create indexes"
85
+ o.separator ""
86
+
87
+ o.separator "Other options:"
60
88
  o.on "-v", "--version", "print the version" do
61
89
  log Dexter::VERSION
62
90
  exit
@@ -65,12 +93,6 @@ module Dexter
65
93
  log o
66
94
  exit
67
95
  end
68
- o.separator ""
69
- o.separator "Connection options:"
70
- o.string "-d", "--dbname", "database name"
71
- o.string "-h", "--host", "database host"
72
- o.integer "-p", "--port", "database port"
73
- o.string "-U", "--username", "database user"
74
96
  end
75
97
 
76
98
  arguments = opts.arguments
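The extension-based format detection added to client.rb above can be tried in isolation. This is a minimal sketch, not Dexter's code: `detect_input_format` is a hypothetical helper name, and it applies the same rule as the hunk, inferring a format only when every path shares a single extension.

```ruby
# Hypothetical helper (not part of Dexter): infer an input format from file paths.
def detect_input_format(paths)
  extensions = paths.map { |path| File.extname(path) }.uniq
  # Only infer a format when all paths agree on one extension.
  return nil unless extensions.size == 1

  extension = extensions.first
  # File.extname returns "" for a path without an extension.
  extension.empty? ? nil : extension[1..-1]
end

puts detect_input_format(["queries.sql"]).inspect
puts detect_input_format(["a.csv", "b.json"]).inspect
```

With mixed extensions nothing is inferred, so an explicit `--input-format` would still be needed in that case.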
data/lib/dexter/collector.rb CHANGED
@@ -8,7 +8,7 @@ module Dexter
8
8
  @min_calls = options[:min_calls]
9
9
  end
10
10
 
11
- def add(query, duration)
11
+ def add(query, total_time, calls = 1)
12
12
  fingerprint =
13
13
  begin
14
14
  PgQuery.fingerprint(query)
@@ -19,8 +19,8 @@ module Dexter
19
19
  return unless fingerprint
20
20
 
21
21
  @top_queries[fingerprint] ||= {calls: 0, total_time: 0}
22
- @top_queries[fingerprint][:calls] += 1
23
- @top_queries[fingerprint][:total_time] += duration
22
+ @top_queries[fingerprint][:calls] += calls
23
+ @top_queries[fingerprint][:total_time] += total_time
24
24
  @top_queries[fingerprint][:query] = query
25
25
  @mutex.synchronize do
26
26
  @new_queries << fingerprint
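The new `add(query, total_time, calls = 1)` signature lets a caller record several calls in one step, presumably so that sources reporting aggregated statistics can be fed in directly. A minimal sketch of that accumulation follows. `TinyCollector` is a hypothetical stand-in, and it uses a lowercased, whitespace-squished string in place of `PgQuery.fingerprint` so the example runs without the pg_query gem.

```ruby
# Hypothetical stand-in for the collector; the real code uses PgQuery.fingerprint.
class TinyCollector
  attr_reader :top_queries

  def initialize
    @top_queries = {}
  end

  # calls defaults to 1, so one-entry-per-statement callers keep working,
  # while a caller with aggregated numbers can pass its own call count.
  def add(query, total_time, calls = 1)
    # Assumption: a crude fingerprint that ignores case and whitespace only.
    fingerprint = query.strip.downcase.gsub(/\s+/, " ")
    entry = (@top_queries[fingerprint] ||= {calls: 0, total_time: 0})
    entry[:calls] += calls
    entry[:total_time] += total_time
    entry[:query] = query
  end
end

collector = TinyCollector.new
collector.add("SELECT * FROM ratings WHERE user_id = 1", 12.5)
collector.add("select * from ratings where user_id = 1", 300.0, 20)
p collector.top_queries
```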
data/lib/dexter/csv_log_parser.rb CHANGED
@@ -1,22 +1,24 @@
1
- require "csv"
2
-
3
1
  module Dexter
4
2
  class CsvLogParser < LogParser
5
3
  FIRST_LINE_REGEX = /\A.+/
6
4
 
7
5
  def perform
8
6
  CSV.new(@logfile.to_io).each do |row|
9
- if (m = REGEX.match(row[13]))
10
- # replace first line with match
11
- # needed for multiline queries
12
- active_line = row[13].sub(FIRST_LINE_REGEX, m[3])
13
-
14
- add_parameters(active_line, row[14]) if row[14]
15
- process_entry(active_line, m[1].to_f)
16
- end
7
+ process_csv_row(row[13], row[14])
17
8
  end
18
9
  rescue CSV::MalformedCSVError => e
19
10
  raise Dexter::Abort, "ERROR: #{e.message}"
20
11
  end
12
+
13
+ def process_csv_row(message, detail)
14
+ if (m = REGEX.match(message))
15
+ # replace first line with match
16
+ # needed for multiline queries
17
+ active_line = message.sub(FIRST_LINE_REGEX, m[3])
18
+
19
+ add_parameters(active_line, detail) if detail
20
+ process_entry(active_line, m[1].to_f)
21
+ end
22
+ end
21
23
  end
22
24
  end
data/lib/dexter/indexer.rb CHANGED
@@ -17,7 +17,12 @@ module Dexter
17
17
  @options = options
18
18
  @mutex = Mutex.new
19
19
 
20
- create_extension unless extension_exists?
20
+ if server_version_num < 110000
21
+ raise Dexter::Abort, "This version of Dexter requires Postgres 11+"
22
+ end
23
+
24
+ check_extension
25
+
21
26
  execute("SET lock_timeout = '5s'")
22
27
  end
23
28
 
@@ -27,23 +32,6 @@ module Dexter
27
32
  process_queries(queries)
28
33
  end
29
34
 
30
- def stat_activity
31
- execute <<-SQL
32
- SELECT
33
- pid || ':' || COALESCE(query_start, xact_start) AS id,
34
- query,
35
- EXTRACT(EPOCH FROM NOW() - COALESCE(query_start, xact_start)) * 1000.0 AS duration_ms
36
- FROM
37
- pg_stat_activity
38
- WHERE
39
- datname = current_database()
40
- AND state = 'active'
41
- AND pid != pg_backend_pid()
42
- ORDER BY
43
- 1
44
- SQL
45
- end
46
-
47
35
  def process_queries(queries)
48
36
  # reset hypothetical indexes
49
37
  reset_hypothetical_indexes
@@ -119,19 +107,20 @@ module Dexter
119
107
 
120
108
  private
121
109
 
122
- def create_extension
123
- execute("SET client_min_messages = warning")
124
- begin
125
- execute("CREATE EXTENSION IF NOT EXISTS hypopg")
126
- rescue PG::UndefinedFile
110
+ def check_extension
111
+ extension = execute("SELECT installed_version FROM pg_available_extensions WHERE name = 'hypopg'").first
112
+
113
+ if extension.nil?
127
114
  raise Dexter::Abort, "Install HypoPG first: https://github.com/ankane/dexter#installation"
128
- rescue PG::InsufficientPrivilege
129
- raise Dexter::Abort, "Use a superuser to run: CREATE EXTENSION hypopg"
130
115
  end
131
- end
132
116
 
133
- def extension_exists?
134
- execute("SELECT * FROM pg_available_extensions WHERE name = 'hypopg' AND installed_version IS NOT NULL").any?
117
+ if extension["installed_version"].nil?
118
+ if @options[:enable_hypopg]
119
+ execute("CREATE EXTENSION hypopg")
120
+ else
121
+ raise Dexter::Abort, "Run `CREATE EXTENSION hypopg` or pass --enable-hypopg"
122
+ end
123
+ end
135
124
  end
136
125
 
137
126
  def reset_hypothetical_indexes
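The branching in `check_extension` above can be read as a small decision table. The sketch below restates it as a pure function so the three outcomes are easy to see. `hypopg_action` and its return symbols are hypothetical names for illustration, not Dexter's API.

```ruby
# Hypothetical pure function mirroring the branches of check_extension.
# extension_row is the pg_available_extensions row for hypopg as a Hash,
# or nil when the extension files are not present on the server.
def hypopg_action(extension_row, enable_hypopg)
  return :abort_install_hypopg if extension_row.nil?
  return :ready if extension_row["installed_version"]

  # Available but not yet created in this database.
  enable_hypopg ? :create_extension : :abort_enable_hypopg
end

p hypopg_action(nil, false)
p hypopg_action({"installed_version" => nil}, true)
```

The practical effect is that Dexter no longer creates the extension on its own; that now happens only when `--enable-hypopg` is passed.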
@@ -141,7 +130,7 @@ module Dexter
141
130
  def analyze_tables(tables)
142
131
  tables = tables.to_a.sort
143
132
 
144
- analyze_stats = execute <<-SQL
133
+ query = <<~SQL
145
134
  SELECT
146
135
  schemaname || '.' || relname AS table,
147
136
  last_analyze,
@@ -149,8 +138,9 @@ module Dexter
149
138
  FROM
150
139
  pg_stat_user_tables
151
140
  WHERE
152
- schemaname || '.' || relname IN (#{tables.map { |t| quote(t) }.join(", ")})
141
+ schemaname || '.' || relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")})
153
142
  SQL
143
+ analyze_stats = execute(query, params: tables.to_a)
154
144
 
155
145
  last_analyzed = {}
156
146
  analyze_stats.each do |stats|
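The `IN (...)` list above is now built from numbered placeholders instead of quoted literals, with the values passed separately as bind parameters. A minimal sketch of the placeholder construction, where `placeholders` is a hypothetical helper name:

```ruby
# Hypothetical helper: build "$1, $2, ..." for a given number of bind parameters.
def placeholders(count)
  count.times.map { |i| "$#{i + 1}" }.join(", ")
end

tables = ["public.movies", "public.ratings"]
sql = "SELECT relname FROM pg_stat_user_tables " \
      "WHERE schemaname || '.' || relname IN (#{placeholders(tables.size)})"
puts sql
```

The resulting string would then be sent together with `tables` as the parameter array (for example through the pg gem's `exec_params`). A caller would need to handle an empty list separately, since `IN ()` is not valid SQL.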
@@ -181,10 +171,6 @@ module Dexter
181
171
  end
182
172
  begin
183
173
  query.plans << plan(query.statement)
184
- if @log_explain
185
- # Pass format to prevent ANALYZE
186
- puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query.statement)}", pretty: false).map { |r| r["QUERY PLAN"] }.join("\n")
187
- end
188
174
  rescue PG::Error, JSON::NestingError => e
189
175
  if @log_explain
190
176
  log e.message
@@ -214,7 +200,7 @@ module Dexter
214
200
  find_columns(query.tree).each do |col|
215
201
  last_col = col["fields"].last
216
202
  if last_col["String"]
217
- possible_columns << last_col["String"]["str"]
203
+ possible_columns << last_col["String"]["sval"]
218
204
  end
219
205
  end
220
206
  end
@@ -510,7 +496,7 @@ module Dexter
510
496
  def conn
511
497
  @conn ||= begin
512
498
  # set connect timeout if none set
513
- ENV["PGCONNECT_TIMEOUT"] ||= "2"
499
+ ENV["PGCONNECT_TIMEOUT"] ||= "3"
514
500
 
515
501
  if @options[:dbname] =~ /\Apostgres(ql)?:\/\//
516
502
  config = @options[:dbname]
@@ -529,7 +515,7 @@ module Dexter
529
515
  raise Dexter::Abort, e.message
530
516
  end
531
517
 
532
- def execute(query, pretty: true)
518
+ def execute(query, pretty: true, params: [])
533
519
  # use exec_params instead of exec for security
534
520
  #
535
521
  # Unlike PQexec, PQexecParams allows at most one SQL command in the given string.
@@ -538,16 +524,56 @@ module Dexter
538
524
  # as an extra defense against SQL-injection attacks.
539
525
  # https://www.postgresql.org/docs/current/static/libpq-exec.html
540
526
  query = squish(query) if pretty
541
- log colorize("[sql] #{query}", :cyan) if @log_sql
527
+ log colorize("[sql] #{query}#{params.any? ? " /*#{params.to_json}*/" : ""}", :cyan) if @log_sql
542
528
 
543
529
  @mutex.synchronize do
544
- conn.exec_params(query, []).to_a
530
+ conn.exec_params("#{query} /*dexter*/", params).to_a
545
531
  end
546
532
  end
547
533
 
548
534
  def plan(query)
535
+ prepared = false
536
+ transaction = false
537
+
538
+ # try to EXPLAIN normalized queries
539
+ # https://dev.to/yugabyte/explain-from-pgstatstatements-normalized-queries-how-to-always-get-the-generic-plan-in--5cfi
540
+ explain_normalized = query.include?("$1")
541
+ if explain_normalized
542
+ prepared_name = "dexter_prepared"
543
+ execute("PREPARE #{prepared_name} AS #{safe_statement(query)}", pretty: false)
544
+ prepared = true
545
+ params = execute("SELECT array_length(parameter_types, 1) AS params FROM pg_prepared_statements WHERE name = $1", params: [prepared_name]).first["params"].to_i
546
+ query = "EXECUTE #{prepared_name}(#{params.times.map { "NULL" }.join(", ")})"
547
+
548
+ execute("BEGIN")
549
+ transaction = true
550
+
551
+ if server_version_num >= 120000
552
+ execute("SET LOCAL plan_cache_mode = force_generic_plan")
553
+ else
554
+ execute("SET LOCAL cpu_operator_cost = 1e42")
555
+ 5.times do
556
+ execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false)
557
+ end
558
+ execute("ROLLBACK")
559
+ execute("BEGIN")
560
+ end
561
+ end
562
+
549
563
  # strip semi-colons as another measure of defense
550
- JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false).first["QUERY PLAN"], max_nesting: 1000).first["Plan"]
564
+ plan = JSON.parse(execute("EXPLAIN (FORMAT JSON) #{safe_statement(query)}", pretty: false).first["QUERY PLAN"], max_nesting: 1000).first["Plan"]
565
+
566
+ if @log_explain
567
+ # Pass format to prevent ANALYZE
568
+ puts execute("EXPLAIN (FORMAT TEXT) #{safe_statement(query)}", pretty: false).map { |r| r["QUERY PLAN"] }.join("\n")
569
+ end
570
+
571
+ plan
572
+ ensure
573
+ if explain_normalized
574
+ execute("ROLLBACK") if transaction
575
+ execute("DEALLOCATE #{prepared_name}") if prepared
576
+ end
551
577
  end
552
578
 
553
579
  # TODO for multicolumn indexes, use ordering
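For a normalized query (one containing `$1`), the new `plan` method prepares the statement, forces a generic plan, and explains an `EXECUTE` with `NULL` arguments. The sketch below lists the statements for the Postgres 12+ branch only. `generic_plan_statements` is a hypothetical helper, and the parameter count is passed in rather than read from `pg_prepared_statements` as the real code does.

```ruby
# Hypothetical helper: the statements issued to EXPLAIN a normalized query
# on Postgres 12+, where plan_cache_mode is available.
def generic_plan_statements(query, param_count, prepared_name: "dexter_prepared")
  # Every parameter is filled with NULL; the generic plan does not depend on values.
  args = Array.new(param_count, "NULL").join(", ")
  [
    "PREPARE #{prepared_name} AS #{query}",
    "BEGIN",
    "SET LOCAL plan_cache_mode = force_generic_plan",
    "EXPLAIN (FORMAT JSON) EXECUTE #{prepared_name}(#{args})",
    "ROLLBACK",
    "DEALLOCATE #{prepared_name}"
  ]
end

puts generic_plan_statements("SELECT * FROM ratings WHERE user_id = $1", 1)
```

Because `SET LOCAL` only lasts until the end of the transaction, the `ROLLBACK` restores the previous planner setting.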
@@ -565,7 +591,7 @@ module Dexter
565
591
  end
566
592
 
567
593
  def database_tables
568
- result = execute <<-SQL
594
+ result = execute <<~SQL
569
595
  SELECT
570
596
  table_schema || '.' || table_name AS table_name
571
597
  FROM
@@ -577,17 +603,13 @@ module Dexter
577
603
  end
578
604
 
579
605
  def materialized_views
580
- if server_version_num >= 90300
581
- result = execute <<-SQL
582
- SELECT
583
- schemaname || '.' || matviewname AS table_name
584
- FROM
585
- pg_matviews
586
- SQL
587
- result.map { |r| r["table_name"] }
588
- else
589
- []
590
- end
606
+ result = execute <<~SQL
607
+ SELECT
608
+ schemaname || '.' || matviewname AS table_name
609
+ FROM
610
+ pg_matviews
611
+ SQL
612
+ result.map { |r| r["table_name"] }
591
613
  end
592
614
 
593
615
  def server_version_num
@@ -595,7 +617,7 @@ module Dexter
595
617
  end
596
618
 
597
619
  def database_view_tables
598
- result = execute <<-SQL
620
+ result = execute <<~SQL
599
621
  SELECT
600
622
  schemaname || '.' || viewname AS table_name,
601
623
  definition
@@ -621,7 +643,7 @@ module Dexter
621
643
 
622
644
  def stat_statements
623
645
  total_time = server_version_num >= 130000 ? "(total_plan_time + total_exec_time)" : "total_time"
624
- result = execute <<-SQL
646
+ sql = <<~SQL
625
647
  SELECT
626
648
  DISTINCT query
627
649
  FROM
@@ -630,18 +652,18 @@ module Dexter
630
652
  pg_database ON pg_database.oid = pg_stat_statements.dbid
631
653
  WHERE
632
654
  datname = current_database()
633
- AND #{total_time} >= #{@min_time * 60000}
634
- AND calls >= #{@min_calls}
655
+ AND #{total_time} >= \$1
656
+ AND calls >= \$2
635
657
  ORDER BY
636
658
  1
637
659
  SQL
638
- result.map { |q| q["query"] }
660
+ execute(sql, params: [@min_time * 60000, @min_calls]).map { |q| q["query"] }
639
661
  end
640
662
 
641
663
  def with_advisory_lock
642
664
  lock_id = 123456
643
665
  first_time = true
644
- while execute("SELECT pg_try_advisory_lock(#{lock_id})").first["pg_try_advisory_lock"] != "t"
666
+ while execute("SELECT pg_try_advisory_lock($1)", params: [lock_id]).first["pg_try_advisory_lock"] != "t"
645
667
  if first_time
646
668
  log "Waiting for lock..."
647
669
  first_time = false
@@ -650,16 +672,19 @@ module Dexter
650
672
  end
651
673
  yield
652
674
  ensure
653
- with_min_messages("error") do
654
- execute("SELECT pg_advisory_unlock(#{lock_id})")
675
+ suppress_messages do
676
+ execute("SELECT pg_advisory_unlock($1)", params: [lock_id])
655
677
  end
656
678
  end
657
679
 
658
- def with_min_messages(value)
659
- execute("SET client_min_messages = #{quote(value)}")
680
+ def suppress_messages
681
+ conn.set_notice_processor do |message|
682
+ # do nothing
683
+ end
660
684
  yield
661
685
  ensure
662
- execute("SET client_min_messages = warning")
686
+ # clear notice processor
687
+ conn.set_notice_processor
663
688
  end
664
689
 
665
690
  def index_exists?(index)
@@ -667,7 +692,7 @@ module Dexter
667
692
  end
668
693
 
669
694
  def columns(tables)
670
- columns = execute <<-SQL
695
+ query = <<~SQL
671
696
  SELECT
672
697
  s.nspname || '.' || t.relname AS table_name,
673
698
  a.attname AS column_name,
@@ -677,16 +702,16 @@ module Dexter
677
702
  JOIN pg_namespace s on t.relnamespace = s.oid
678
703
  WHERE a.attnum > 0
679
704
  AND NOT a.attisdropped
680
- AND s.nspname || '.' || t.relname IN (#{tables.map { |t| quote(t) }.join(", ")})
705
+ AND s.nspname || '.' || t.relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")})
681
706
  ORDER BY
682
707
  1, 2
683
708
  SQL
684
-
709
+ columns = execute(query, params: tables.to_a)
685
710
  columns.map { |v| {table: v["table_name"], column: v["column_name"], type: v["data_type"]} }
686
711
  end
687
712
 
688
713
  def indexes(tables)
689
- execute(<<-SQL
714
+ query = <<~SQL
690
715
  SELECT
691
716
  schemaname || '.' || t.relname AS table,
692
717
  ix.relname AS name,
@@ -701,14 +726,14 @@ module Dexter
701
726
  LEFT JOIN
702
727
  pg_stat_user_indexes ui ON ui.indexrelid = i.indexrelid
703
728
  WHERE
704
- schemaname || '.' || t.relname IN (#{tables.map { |t| quote(t) }.join(", ")}) AND
729
+ schemaname || '.' || t.relname IN (#{tables.size.times.map { |i| "$#{i + 1}" }.join(", ")}) AND
705
730
  indisvalid = 't' AND
706
731
  indexprs IS NULL AND
707
732
  indpred IS NULL
708
733
  ORDER BY
709
734
  1, 2
710
735
  SQL
711
- ).map { |v| v["columns"] = v["columns"].sub(") WHERE (", " WHERE ").split(", ").map { |c| unquote(c) }; v }
736
+ execute(query, params: tables.to_a).map { |v| v["columns"] = v["columns"].sub(") WHERE (", " WHERE ").split(", ").map { |c| unquote(c) }; v }
712
737
  end
713
738
 
714
739
  def search_path
@@ -727,19 +752,6 @@ module Dexter
727
752
  value.split(".").map { |v| conn.quote_ident(v) }.join(".")
728
753
  end
729
754
 
730
- def quote(value)
731
- if value.is_a?(String)
732
- "'#{quote_string(value)}'"
733
- else
734
- value
735
- end
736
- end
737
-
738
- # from activerecord
739
- def quote_string(s)
740
- s.gsub(/\\/, '\&\&').gsub(/'/, "''")
741
- end
742
-
743
755
  # from activesupport
744
756
  def squish(str)
745
757
  str.to_s.gsub(/\A[[:space:]]+/, "").gsub(/[[:space:]]+\z/, "").gsub(/[[:space:]]+/, " ")
data/lib/dexter/json_log_parser.rb CHANGED
@@ -1,5 +1,3 @@
1
- require "json"
2
-
3
1
  module Dexter
4
2
  class JsonLogParser < LogParser
5
3
  FIRST_LINE_REGEX = /\A.+/
data/lib/dexter/log_parser.rb CHANGED
@@ -3,38 +3,12 @@ module Dexter
3
3
  include Logging
4
4
 
5
5
  REGEX = /duration: (\d+\.\d+) ms (statement|execute [^:]+): (.+)/
6
- LINE_SEPERATOR = ": ".freeze
7
- DETAIL_LINE = "DETAIL: ".freeze
8
6
 
9
7
  def initialize(logfile, collector)
10
8
  @logfile = logfile
11
9
  @collector = collector
12
10
  end
13
11
 
14
- def perform
15
- active_line = nil
16
- duration = nil
17
-
18
- @logfile.each_line do |line|
19
- if active_line
20
- if line.include?(DETAIL_LINE)
21
- add_parameters(active_line, line.chomp.split(DETAIL_LINE)[1])
22
- elsif line.include?(LINE_SEPERATOR)
23
- process_entry(active_line, duration)
24
- active_line = nil
25
- else
26
- active_line << line
27
- end
28
- end
29
-
30
- if !active_line && (m = REGEX.match(line.chomp))
31
- duration = m[1].to_f
32
- active_line = m[3]
33
- end
34
- end
35
- process_entry(active_line, duration) if active_line
36
- end
37
-
38
12
  private
39
13
 
40
14
  def process_entry(query, duration)
data/lib/dexter/pg_stat_activity_parser.rb CHANGED
@@ -1,25 +1,50 @@
1
1
  module Dexter
2
2
  class PgStatActivityParser < LogParser
3
3
  def perform
4
- queries = {}
4
+ previous_queries = {}
5
5
 
6
- loop do
7
- new_queries = {}
8
- @logfile.stat_activity.each do |row|
9
- new_queries[row["id"]] = row
6
+ 10.times do
7
+ active_queries = {}
8
+ processed_queries = {}
9
+
10
+ stat_activity.each do |row|
11
+ if row["state"] == "active"
12
+ active_queries[row["id"]] = row
13
+ else
14
+ process_entry(row["query"], row["duration_ms"].to_f)
15
+ processed_queries[row["id"]] = true
16
+ end
10
17
  end
11
18
 
12
19
  # store queries after they complete
13
- queries.each do |id, row|
14
- unless new_queries[id]
20
+ previous_queries.each do |id, row|
21
+ if !active_queries[id] && !processed_queries[id]
15
22
  process_entry(row["query"], row["duration_ms"].to_f)
16
23
  end
17
24
  end
18
25
 
19
- queries = new_queries
26
+ previous_queries = active_queries
20
27
 
21
- sleep(1)
28
+ sleep(0.1)
22
29
  end
23
30
  end
31
+
32
+ def stat_activity
33
+ sql = <<~SQL
34
+ SELECT
35
+ pid || ':' || COALESCE(query_start, xact_start) AS id,
36
+ query,
37
+ state,
38
+ EXTRACT(EPOCH FROM NOW() - COALESCE(query_start, xact_start)) * 1000.0 AS duration_ms
39
+ FROM
40
+ pg_stat_activity
41
+ WHERE
42
+ datname = current_database()
43
+ AND pid != pg_backend_pid()
44
+ ORDER BY
45
+ 1
46
+ SQL
47
+ @logfile.send(:execute, sql)
48
+ end
24
49
  end
25
50
  end
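The reworked sampling loop can be exercised without a database by feeding it canned snapshots. This is a sketch under that assumption: `completed_queries` is a hypothetical function, and each snapshot is an array of row hashes shaped like the `stat_activity` result above. It follows the same rule as the parser: non-active rows are recorded at once, and active rows are recorded once they disappear from a later sample.

```ruby
# Hypothetical, database-free version of the sampling loop.
# snapshots: an array of samples, each an array of pg_stat_activity-like rows.
def completed_queries(snapshots)
  completed = []
  previous_queries = {}

  snapshots.each do |rows|
    active_queries = {}
    processed_queries = {}

    rows.each do |row|
      if row["state"] == "active"
        active_queries[row["id"]] = row
      else
        # Not active in this sample, so record it right away.
        completed << row["query"]
        processed_queries[row["id"]] = true
      end
    end

    # Queries active in the previous sample that have since gone away.
    previous_queries.each do |id, row|
      completed << row["query"] if !active_queries[id] && !processed_queries[id]
    end

    previous_queries = active_queries
  end

  completed
end

samples = [
  [{"id" => "1:a", "query" => "SELECT 1", "state" => "active"}],
  []
]
p completed_queries(samples)
```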
data/lib/dexter/processor.rb CHANGED
@@ -18,7 +18,7 @@ module Dexter
18
18
  elsif options[:input_format] == "sql"
19
19
  SqlLogParser.new(logfile, @collector)
20
20
  else
21
- LogParser.new(logfile, @collector)
21
+ StderrLogParser.new(logfile, @collector)
22
22
  end
23
23
 
24
24
  @starting_interval = 3
@@ -31,7 +31,7 @@ module Dexter
31
31
  end
32
32
 
33
33
  def perform
34
- if [STDIN, :pg_stat_activity].include?(@logfile)
34
+ if [STDIN].include?(@logfile)
35
35
  Thread.abort_on_exception = true
36
36
  Thread.new do
37
37
  sleep(@starting_interval)
@@ -0,0 +1,34 @@
1
+ module Dexter
2
+ class StderrLogParser < LogParser
3
+ LINE_SEPERATOR = ": ".freeze
4
+ DETAIL_LINE = "DETAIL: ".freeze
5
+
6
+ def perform
7
+ process_stderr(@logfile.each_line)
8
+ end
9
+
10
+ def process_stderr(rows)
11
+ active_line = nil
12
+ duration = nil
13
+
14
+ rows.each do |line|
15
+ if active_line
16
+ if line.include?(DETAIL_LINE)
17
+ add_parameters(active_line, line.chomp.split(DETAIL_LINE)[1])
18
+ elsif line.include?(LINE_SEPERATOR)
19
+ process_entry(active_line, duration)
20
+ active_line = nil
21
+ else
22
+ active_line << line
23
+ end
24
+ end
25
+
26
+ if !active_line && (m = REGEX.match(line.chomp))
27
+ duration = m[1].to_f
28
+ active_line = m[3]
29
+ end
30
+ end
31
+ process_entry(active_line, duration) if active_line
32
+ end
33
+ end
34
+ end
@@ -1,3 +1,3 @@
1
1
  module Dexter
2
- VERSION = "0.4.2"
2
+ VERSION = "0.5.0"
3
3
  end
data/lib/dexter.rb CHANGED
@@ -4,23 +4,27 @@ require "pg_query"
4
4
  require "slop"
5
5
 
6
6
  # stdlib
7
+ require "csv"
7
8
  require "json"
8
9
  require "set"
9
10
  require "time"
10
11
 
11
12
  # modules
12
- require "dexter/version"
13
- require "dexter/logging"
14
- require "dexter/client"
15
- require "dexter/collector"
16
- require "dexter/indexer"
17
- require "dexter/log_parser"
18
- require "dexter/csv_log_parser"
19
- require "dexter/json_log_parser"
20
- require "dexter/pg_stat_activity_parser"
21
- require "dexter/sql_log_parser"
22
- require "dexter/processor"
23
- require "dexter/query"
13
+ require_relative "dexter/logging"
14
+ require_relative "dexter/client"
15
+ require_relative "dexter/collector"
16
+ require_relative "dexter/indexer"
17
+ require_relative "dexter/processor"
18
+ require_relative "dexter/query"
19
+ require_relative "dexter/version"
20
+
21
+ # parsers
22
+ require_relative "dexter/log_parser"
23
+ require_relative "dexter/csv_log_parser"
24
+ require_relative "dexter/json_log_parser"
25
+ require_relative "dexter/pg_stat_activity_parser"
26
+ require_relative "dexter/sql_log_parser"
27
+ require_relative "dexter/stderr_log_parser"
24
28
 
25
29
  module Dexter
26
30
  class Abort < StandardError; end
metadata CHANGED
@@ -1,57 +1,57 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pgdexter
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.2
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Andrew Kane
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-01-30 00:00:00.000000000 Z
11
+ date: 2023-04-18 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
- name: slop
14
+ name: pg
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
17
  - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: 4.8.2
19
+ version: 0.18.2
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: 4.8.2
26
+ version: 0.18.2
27
27
  - !ruby/object:Gem::Dependency
28
- name: pg
28
+ name: pg_query
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - ">="
31
+ - - "~>"
32
32
  - !ruby/object:Gem::Version
33
- version: 0.18.2
33
+ version: '4'
34
34
  type: :runtime
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - ">="
38
+ - - "~>"
39
39
  - !ruby/object:Gem::Version
40
- version: 0.18.2
40
+ version: '4'
41
41
  - !ruby/object:Gem::Dependency
42
- name: pg_query
42
+ name: slop
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
45
  - - ">="
46
46
  - !ruby/object:Gem::Version
47
- version: '2.1'
47
+ version: 4.10.1
48
48
  type: :runtime
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
52
  - - ">="
53
53
  - !ruby/object:Gem::Version
54
- version: '2.1'
54
+ version: 4.10.1
55
55
  description:
56
56
  email: andrew@ankane.org
57
57
  executables:
@@ -75,6 +75,7 @@ files:
75
75
  - lib/dexter/processor.rb
76
76
  - lib/dexter/query.rb
77
77
  - lib/dexter/sql_log_parser.rb
78
+ - lib/dexter/stderr_log_parser.rb
78
79
  - lib/dexter/version.rb
79
80
  homepage: https://github.com/ankane/dexter
80
81
  licenses:
@@ -88,14 +89,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
88
89
  requirements:
89
90
  - - ">="
90
91
  - !ruby/object:Gem::Version
91
- version: '2.5'
92
+ version: '2.7'
92
93
  required_rubygems_version: !ruby/object:Gem::Requirement
93
94
  requirements:
94
95
  - - ">="
95
96
  - !ruby/object:Gem::Version
96
97
  version: '0'
97
98
  requirements: []
98
- rubygems_version: 3.4.1
99
+ rubygems_version: 3.4.10
99
100
  signing_key:
100
101
  specification_version: 4
101
102
  summary: The automatic indexer for Postgres