pg-plan-alternatives 0.1.0__tar.gz
- pg_plan_alternatives-0.1.0/LICENSE +21 -0
- pg_plan_alternatives-0.1.0/MANIFEST.in +5 -0
- pg_plan_alternatives-0.1.0/PKG-INFO +438 -0
- pg_plan_alternatives-0.1.0/README.md +417 -0
- pg_plan_alternatives-0.1.0/pyproject.toml +56 -0
- pg_plan_alternatives-0.1.0/requirements_dev.txt +7 -0
- pg_plan_alternatives-0.1.0/setup.cfg +4 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/__init__.py +1 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/bpf/__init__.py +4 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/bpf/pg_plan_alternatives.c +331 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/helper.py +220 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/pg_plan_alternatives.py +377 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives/visualize_plan_graph.py +682 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/PKG-INFO +438 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/SOURCES.txt +19 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/dependency_links.txt +1 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/entry_points.txt +3 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/requires.txt +2 -0
- pg_plan_alternatives-0.1.0/src/pg_plan_alternatives.egg-info/top_level.txt +1 -0
- pg_plan_alternatives-0.1.0/tests/test_helper.py +86 -0
- pg_plan_alternatives-0.1.0/tests/test_oid_resolver.py +94 -0
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Jan Nidzwetzki
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
@@ -0,0 +1,438 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: pg_plan_alternatives
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: An eBPF-based tool to show all query plans considered by PostgreSQL during query planning
|
|
5
|
+
Author-email: Jan Nidzwetzki <jnidzwetzki@gmx.de>
|
|
6
|
+
License-Expression: MIT
|
|
7
|
+
Project-URL: Homepage, https://github.com/jnidzwetzki/pg_plan_alternatives
|
|
8
|
+
Project-URL: Bug Tracker, https://github.com/jnidzwetzki/pg_plan_alternatives/issues
|
|
9
|
+
Keywords: postgresql,postgres,ebpf,query-planning,optimizer
|
|
10
|
+
Classifier: Development Status :: 4 - Beta
|
|
11
|
+
Classifier: Intended Audience :: Developers
|
|
12
|
+
Classifier: Operating System :: POSIX :: Linux
|
|
13
|
+
Classifier: Programming Language :: Python
|
|
14
|
+
Classifier: Topic :: Software Development :: Debuggers
|
|
15
|
+
Requires-Python: >=3.10
|
|
16
|
+
Description-Content-Type: text/markdown
|
|
17
|
+
License-File: LICENSE
|
|
18
|
+
Requires-Dist: graphviz
|
|
19
|
+
Requires-Dist: psycopg2-binary>=2.9
|
|
20
|
+
Dynamic: license-file
|
|
21
|
+
|
|
22
|
+
# pg_plan_alternatives: A PostgreSQL Plan Alternatives Tracer
|
|
23
|
+
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
|
|
27
|
+
An eBPF-based tool designed to show **all query plans** that are considered by PostgreSQL during query planning, not just the final chosen plan as shown in `EXPLAIN` output.
|
|
28
|
+
|
|
29
|
+
## 🎯 Overview
|
|
30
|
+
|
|
31
|
+
PostgreSQL uses a cost-based optimizer to determine the most efficient way to execute a query. When PostgreSQL plans a query, it considers many different execution paths and chooses the one with the lowest estimated cost. The standard `EXPLAIN` command only shows the final chosen plan. `pg_plan_alternatives` reveals all the alternative plans that were considered, along with their costs, giving you a complete picture of the optimizer's reasoning.
|
|
32
|
+
|
|
33
|
+
Key features:
|
|
34
|
+
- **`pg_plan_alternatives`**: eBPF-based tracer that captures all query plans considered during planning
|
|
35
|
+
- **`visualize_plan_graph`**: Creates interactive graph visualizations from trace output
|
|
36
|
+
- Supports PostgreSQL 17 and 18
|
|
37
|
+
- JSON output format for easy processing
|
|
38
|
+
- Shows cost estimates (startup and total) for each alternative
|
|
39
|
+
- Highlights which plan was ultimately chosen
|
|
40
|
+
|
|
41
|
+
**Note:** This tool relies on [eBPF](https://ebpf.io/) (_Extended Berkeley Packet Filter_) technology and requires root privileges to run.
|
|
42
|
+
|
|
43
|
+
## ⚡ Quickstart
|
|
44
|
+
|
|
45
|
+
1. Install the tool:
|
|
46
|
+
|
|
47
|
+
```bash
|
|
48
|
+
pip install pg_plan_alternatives
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
2. Identify your PostgreSQL server binary (e.g., `/usr/lib/postgresql/17/bin/postgres`)
|
|
52
|
+
|
|
53
|
+
3. Start tracing (requires root privileges):
|
|
54
|
+
|
|
55
|
+
```bash
|
|
56
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p <PID> -n $(pg_config --includedir-server)/nodes/nodetags.h
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
4. Run your queries in PostgreSQL
|
|
60
|
+
|
|
61
|
+
5. View the trace output showing all considered plans
|
|
62
|
+
|
|
63
|
+
## 📊 Usage Examples
|
|
64
|
+
|
|
65
|
+
### Basic Tracing
|
|
66
|
+
|
|
67
|
+
```bash
|
|
68
|
+
# Trace all PostgreSQL processes using the binary
|
|
69
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -n /path/to/nodetags.h
|
|
70
|
+
|
|
71
|
+
# Trace a specific PostgreSQL backend process
|
|
72
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -n /path/to/nodetags.h
|
|
73
|
+
|
|
74
|
+
# Trace multiple processes
|
|
75
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -p 5678 -n /path/to/nodetags.h
|
|
76
|
+
|
|
77
|
+
# Output in JSON format
|
|
78
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -j -n /path/to/nodetags.h
|
|
79
|
+
|
|
80
|
+
# Save output to file
|
|
81
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -j -o plans.json -n /path/to/nodetags.h
|
|
82
|
+
|
|
83
|
+
# Verbose mode
|
|
84
|
+
sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -v -n /path/to/nodetags.h
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
*Note:* The path to `nodetags.h` is required to resolve the path type enums to human-readable names.
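
To illustrate why the header is needed, here is a minimal sketch of how numeric node tags could be mapped back to names. It assumes that `nodetags.h` contains entries of the form `T_Name = <number>,`; the function name `parse_nodetags` and the sample values are hypothetical and not taken from the tool.

```python
import re

# Matches enum entries such as "\tT_SeqScan = 101," (assumed header layout).
NODETAG_RE = re.compile(r"^\s*(T_\w+)\s*=\s*(\d+)\s*,", re.MULTILINE)


def parse_nodetags(header_text: str) -> dict:
    """Return a mapping from numeric node tag to its enum name."""
    return {int(value): name for name, value in NODETAG_RE.findall(header_text)}


# Illustrative excerpt only: these numbers are made up and differ
# between PostgreSQL versions.
SAMPLE_HEADER = """
	T_SeqScan = 101,
	T_IndexOnlyScan = 102,
	T_HashJoin = 103,
"""

tags = parse_nodetags(SAMPLE_HEADER)
print(tags[101])
```

Because the numbers are read from the header of the installed server, the same lookup keeps working when the enum values change between PostgreSQL releases.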
|
|
88
|
+
|
|
89
|
+
### Creating Visualizations
|
|
90
|
+
|
|
91
|
+
```bash
|
|
92
|
+
# Create a PNG graph from trace output
|
|
93
|
+
visualize_plan_graph -i plans.json -o plans.png
|
|
94
|
+
|
|
95
|
+
# Create an interactive HTML visualization
|
|
96
|
+
visualize_plan_graph -i plans.json -o plans.html
|
|
97
|
+
|
|
98
|
+
# Create an SVG graph
|
|
99
|
+
visualize_plan_graph -i plans.json -o plans.svg
|
|
100
|
+
|
|
101
|
+
# Create separate graphs for each PID
|
|
102
|
+
visualize_plan_graph -i plans.json -o plans.png --group-by-pid
|
|
103
|
+
|
|
104
|
+
# Resolve table OIDs by connecting to the database
|
|
105
|
+
visualize_plan_graph -i plans.json -o plans.png --db-url postgres://user:pass@host/db
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
## 📄 Example Usage
|
|
109
|
+
|
|
110
|
+
### Preparing the Environment
|
|
111
|
+
|
|
112
|
+
To see the tool in action, you can set up a simple PostgreSQL environment with some test data:
|
|
113
|
+
|
|
114
|
+
```sql
|
|
115
|
+
CREATE TABLE test1(id INTEGER PRIMARY KEY);
|
|
116
|
+
CREATE TABLE test2(id INTEGER PRIMARY KEY);
|
|
117
|
+
|
|
118
|
+
INSERT INTO test1 SELECT generate_series(1, 1000);
|
|
119
|
+
INSERT INTO test2 SELECT generate_series(1, 1000);
|
|
120
|
+
|
|
121
|
+
ANALYZE;
|
|
122
|
+
```
|
|
123
|
+
|
|
124
|
+
### SELECT
|
|
125
|
+
|
|
126
|
+
In the first example, we will run a simple `SELECT` query and trace the planning process to see all the alternatives considered by the optimizer. To capture the planning of this query, we can run the following command:
|
|
127
|
+
|
|
128
|
+
```
|
|
129
|
+
$ sudo pg_plan_alternatives -x /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres -n $(pg_config --includedir-server)/nodes/nodetags.h
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
In another terminal, we execute the query:
|
|
133
|
+
|
|
134
|
+
```sql
|
|
135
|
+
SELECT * FROM test1;
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
The output from `pg_plan_alternatives` shows all the paths that were considered for this query, including sequential scans, index-only scans, and bitmap heap scans, along with their estimated costs. The chosen plan is marked in the output with a `[CHOSEN]` tag.
|
|
139
|
+
|
|
140
|
+
```
|
|
141
|
+
================================================================================
|
|
142
|
+
PostgreSQL Plan Alternatives Tracer
|
|
143
|
+
Binary: /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres
|
|
144
|
+
Tracing all PostgreSQL processes
|
|
145
|
+
================================================================================
|
|
146
|
+
|
|
147
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
148
|
+
[20:14:54.116] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)
|
|
149
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
150
|
+
[20:14:54.118] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=1, parent_oid=26144)
|
|
151
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
|
|
152
|
+
[20:14:54.118] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=25.52, total=40.52, rows=1000, parent_rti=1, parent_oid=26144)
|
|
153
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
154
|
+
[20:14:54.118] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_oid=26144)
|
|
155
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
|
|
156
|
+
[20:14:54.118] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
|
|
157
|
+
```
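
The timestamped lines in the text output above follow a regular shape, so they can be post-processed with a small parser. The following is a sketch only: the regular expression is derived from the sample lines shown in this README, and `parse_trace_line` is a hypothetical helper, not part of the package.

```python
import re
from typing import Optional

LINE_RE = re.compile(
    r"^\[(?P<time>[\d:.]+)\] \[PID (?P<pid>\d+)\] "
    r"(?P<event>ADD_PATH|CREATE_PLAN): (?P<path_type>T_\w+) "
    r"\((?P<fields>[^)]*)\)(?P<chosen> \[CHOSEN\])?$"
)


def _convert(value: str):
    """Turn '1000' into an int, '15.00' into a float, keep anything else."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_trace_line(line: str) -> Optional[dict]:
    """Parse one timestamped trace line; return None for any other line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # banner lines and "Received event" lines end up here
    record = {
        "time": match["time"],
        "pid": int(match["pid"]),
        "event": match["event"],
        "path_type": match["path_type"],
        "chosen": match["chosen"] is not None,
    }
    # The parenthesised part is a comma-separated list of key=value pairs.
    for item in match["fields"].split(", "):
        key, _, value = item.partition("=")
        record[key] = _convert(value)
    return record


line = (
    "[20:14:54.116] [PID 3917080] ADD_PATH: T_SeqScan "
    "(startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)"
)
print(parse_trace_line(line))
```

For anything beyond quick inspection, the JSON output (`-j`) is likely the better input, since it does not depend on the text layout.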
|
|
158
|
+
|
|
159
|
+
When we run `EXPLAIN (VERBOSE, ANALYZE)` on the same query, we can see that the chosen plan was a sequential scan, which matches the output from our tracer. The costs and estimated rows also align with what was reported in the trace output.
|
|
160
|
+
|
|
161
|
+
```
|
|
162
|
+
jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1;
|
|
163
|
+
QUERY PLAN
|
|
164
|
+
-------------------------------------------------------------------------------------------------------------
|
|
165
|
+
Seq Scan on public.test1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.119..0.291 rows=1000 loops=1)
|
|
166
|
+
Output: id
|
|
167
|
+
Planning Time: 0.855 ms
|
|
168
|
+
Execution Time: 0.437 ms
|
|
169
|
+
(4 rows)
|
|
170
|
+
```
|
|
171
|
+
|
|
172
|
+
To visualize the alternatives, we can save the trace output to a JSON file and then create an SVG graph:
|
|
173
|
+
|
|
174
|
+
```
|
|
175
|
+
$ sudo pg_plan_alternatives -x /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres -n $(pg_config --includedir-server)/nodes/nodetags.h -j -o examples/select.json
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
From this JSON file, we can generate a graph visualization. The `--db-url` option allows the tool to connect to the database and resolve OIDs to human-readable table names, which makes the graph easier to understand.
|
|
179
|
+
|
|
180
|
+
```
|
|
181
|
+
$ visualize_plan_graph -i examples/select.json -o examples/select.svg --db-url psql://localhost/jan2 -v
|
|
182
|
+
```
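
Resolving OIDs comes down to a catalog lookup. The sketch below shows one way such a lookup could be written against any DB-API cursor (for example one from `psycopg2`); the function `resolve_oids` and the query are assumptions for illustration and may differ from what `visualize_plan_graph` does internally.

```python
def resolve_oids(cursor, oids):
    """Map table OIDs to schema-qualified names using a DB-API cursor."""
    unique = sorted(set(oids))
    if not unique:
        return {}
    # pg_class holds one row per relation; pg_namespace provides the schema.
    cursor.execute(
        "SELECT c.oid, n.nspname || '.' || c.relname "
        "FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace "
        "WHERE c.oid = ANY(%s::oid[])",
        (unique,),
    )
    return {int(oid): name for oid, name in cursor.fetchall()}


# Hypothetical usage with psycopg2 (connection string is an assumption):
#
#   import psycopg2
#   with psycopg2.connect("dbname=jan2") as conn, conn.cursor() as cur:
#       print(resolve_oids(cur, [26144, 26149]))
```

OIDs are local to one database, so the lookup has to run against the database in which the traced queries were executed.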
|
|
183
|
+
|
|
184
|
+

|
|
185
|
+
|
|
186
|
+
### SELECT with a simple WHERE clause
|
|
187
|
+
|
|
188
|
+
In the next example, we will run a `SELECT` query with a `WHERE` clause that filters for a specific ID.
|
|
189
|
+
|
|
190
|
+
```sql
|
|
191
|
+
SELECT * FROM test1 WHERE id = 5;
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
The trace output shows the following alternatives being considered by the optimizer:
|
|
195
|
+
|
|
196
|
+
```
|
|
197
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
198
|
+
[20:15:53.751] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=1, parent_oid=26144)
|
|
199
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
200
|
+
[20:15:53.751] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=1, parent_oid=26144)
|
|
201
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
|
|
202
|
+
[20:15:53.751] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=1, parent_oid=26144)
|
|
203
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
204
|
+
[20:15:53.751] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_oid=26144)
|
|
205
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
|
|
206
|
+
[20:15:53.751] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
This time the optimizer has chosen the `Index Only Scan` plan, which has the lowest estimated cost. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
|
|
210
|
+
|
|
211
|
+
```
|
|
212
|
+
jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 WHERE id = 5;
|
|
213
|
+
QUERY PLAN
|
|
214
|
+
------------------------------------------------------------------------------------------------------------------------------
|
|
215
|
+
Index Only Scan using test1_pkey on public.test1 (cost=0.28..8.29 rows=1 width=4) (actual time=0.153..0.160 rows=1 loops=1)
|
|
216
|
+
Output: id
|
|
217
|
+
Index Cond: (test1.id = 5)
|
|
218
|
+
Heap Fetches: 1
|
|
219
|
+
Planning Time: 1.166 ms
|
|
220
|
+
Execution Time: 0.284 ms
|
|
221
|
+
(6 rows)
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
The visualization of the alternatives for this query looks like this:
|
|
225
|
+
|
|
226
|
+

|
|
227
|
+
|
|
228
|
+
### JOIN
|
|
229
|
+
|
|
230
|
+
To give a more complex example, we can run a `JOIN` query that combines data from both `test1` and `test2`:
|
|
231
|
+
|
|
232
|
+
```sql
|
|
233
|
+
SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id);
|
|
234
|
+
```
|
|
235
|
+
|
|
236
|
+
Now far more alternatives are considered by the optimizer, including different join strategies (merge join, hash join, nested loop) and different scan methods for each table.
|
|
237
|
+
|
|
238
|
+
```
|
|
239
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
240
|
+
[20:22:42.381] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)
|
|
241
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
242
|
+
[20:22:42.381] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=1, parent_oid=26144)
|
|
243
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
244
|
+
[20:22:42.381] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=2, parent_oid=26149)
|
|
245
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
246
|
+
[20:22:42.382] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=2, parent_oid=26149)
|
|
247
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
248
|
+
[20:22:42.383] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=0.33, rows=1, parent_rti=2, parent_oid=26149)
|
|
249
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
|
|
250
|
+
[20:22:42.383] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=0.30, total=4.32, rows=1, parent_rti=2, parent_oid=26149)
|
|
251
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
|
|
252
|
+
[20:22:42.385] [PID 3917080] ADD_PATH: T_MergeJoin (startup=129.66, total=149.66, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
253
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
254
|
+
[20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.00, total=27487.55, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
255
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
256
|
+
[20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.00, total=15017.53, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
257
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
258
|
+
[20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.28, total=27515.83, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
259
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
260
|
+
[20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.28, total=15045.80, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
261
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
|
|
262
|
+
[20:22:42.386] [PID 3917080] ADD_PATH: T_MergeJoin (startup=65.10, total=125.60, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
263
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
|
|
264
|
+
[20:22:42.386] [PID 3917080] ADD_PATH: T_MergeJoin (startup=0.55, total=101.55, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
265
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
|
|
266
|
+
[20:22:42.386] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
267
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
|
|
268
|
+
[20:22:42.387] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_RIGHT, outer_rti=2, outer_oid=26149, inner_rti=1, inner_oid=26144)
|
|
269
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
|
|
270
|
+
[20:22:42.387] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
271
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_HashJoin
|
|
272
|
+
[20:22:42.387] [PID 3917080] CREATE_PLAN: T_HashJoin (startup=27.50, total=45.14) [CHOSEN]
|
|
273
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
|
|
274
|
+
[20:22:42.387] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
|
|
275
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
|
|
276
|
+
[20:22:42.387] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
|
|
277
|
+
```
|
|
278
|
+
|
|
279
|
+
According to the trace output, the optimizer has chosen a `Hash Join` strategy for the join operation, and sequential scans for both tables. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
|
|
280
|
+
|
|
281
|
+
```
|
|
282
|
+
jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id);
|
|
283
|
+
QUERY PLAN
|
|
284
|
+
-------------------------------------------------------------------------------------------------------------------------
|
|
285
|
+
Hash Left Join (cost=27.50..45.14 rows=1000 width=8) (actual time=0.625..1.422 rows=1000 loops=1)
|
|
286
|
+
Output: test1.id, test2.id
|
|
287
|
+
Inner Unique: true
|
|
288
|
+
Hash Cond: (test1.id = test2.id)
|
|
289
|
+
-> Seq Scan on public.test1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.038..0.220 rows=1000 loops=1)
|
|
290
|
+
Output: test1.id
|
|
291
|
+
-> Hash (cost=15.00..15.00 rows=1000 width=4) (actual time=0.571..0.572 rows=1000 loops=1)
|
|
292
|
+
Output: test2.id
|
|
293
|
+
Buckets: 1024 Batches: 1 Memory Usage: 44kB
|
|
294
|
+
-> Seq Scan on public.test2 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.019..0.191 rows=1000 loops=1)
|
|
295
|
+
Output: test2.id
|
|
296
|
+
Planning Time: 3.436 ms
|
|
297
|
+
Execution Time: 1.551 ms
|
|
298
|
+
(13 rows)
|
|
299
|
+
```
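
As a quick cross-check, the join paths from the trace above can be compared by total cost. This is a simplified sketch: the real planner also takes startup cost and sort order into account when it decides which paths to keep, so picking the minimum total cost is an approximation that happens to match this example.

```python
# (path type, startup cost, total cost) for the join paths in the trace above.
join_paths = [
    ("T_MergeJoin", 129.66, 149.66),
    ("T_NestLoop", 0.00, 27487.55),
    ("T_NestLoop", 0.00, 15017.53),
    ("T_NestLoop", 0.28, 27515.83),
    ("T_NestLoop", 0.28, 15045.80),
    ("T_MergeJoin", 65.10, 125.60),
    ("T_MergeJoin", 0.55, 101.55),
    ("T_HashJoin", 27.50, 45.14),
    ("T_HashJoin", 27.50, 45.14),
    ("T_HashJoin", 27.50, 45.14),
]

# Pick the path with the lowest total cost.
cheapest = min(join_paths, key=lambda path: path[2])
print(cheapest)
```

The result is the hash join with a total cost of 45.14, the same node that `EXPLAIN` reports at the top of the plan.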
|
|
300
|
+
|
|
301
|
+
The visualization of the alternatives for this query looks like this:
|
|
302
|
+
|
|
303
|
+

|
|
304
|
+
|
|
305
|
+
|
|
306
|
+
### JOIN with a WHERE clause
|
|
307
|
+
|
|
308
|
+
As the last example, we run a `JOIN` query with a `WHERE` clause that filters for a specific ID in the first table:
|
|
309
|
+
|
|
310
|
+
```sql
|
|
311
|
+
SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id) WHERE test1.id=123;
|
|
312
|
+
```
|
|
313
|
+
|
|
314
|
+
The trace output shows that, with this filter in place, the optimizer has only a few join alternatives to consider, all of them nested loop joins.
|
|
315
|
+
|
|
316
|
+
```
|
|
317
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
318
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=1, parent_oid=26144)
|
|
319
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
320
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=1, parent_oid=26144)
|
|
321
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
|
|
322
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=1, parent_oid=26144)
|
|
323
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
|
|
324
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=2, parent_oid=26149)
|
|
325
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
|
|
326
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=2, parent_oid=26149)
|
|
327
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
|
|
328
|
+
[20:29:41.396] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=2, parent_oid=26149)
|
|
329
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
330
|
+
[20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
331
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
332
|
+
[20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
333
|
+
Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
|
|
334
|
+
[20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
|
|
335
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_NestLoop
|
|
336
|
+
[20:29:41.397] [PID 3917080] CREATE_PLAN: T_NestLoop (startup=0.55, total=16.60) [CHOSEN]
|
|
337
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
|
|
338
|
+
[20:29:41.397] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
|
|
339
|
+
Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
|
|
340
|
+
[20:29:41.397] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
|
|
341
|
+
```
|
|
342
|
+
|
|
343
|
+
This time, the optimizer has chosen a `Nested Loop` join strategy for the join operation, and index-only scans for both tables. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
|
|
344
|
+
|
|
345
|
+
|
|
346
|
+
```
|
|
347
|
+
jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id) WHERE test1.id=123;
|
|
348
|
+
QUERY PLAN
|
|
349
|
+
------------------------------------------------------------------------------------------------------------------------------------
|
|
350
|
+
Nested Loop Left Join (cost=0.55..16.60 rows=1 width=8) (actual time=0.183..0.189 rows=1 loops=1)
|
|
351
|
+
Output: test1.id, test2.id
|
|
352
|
+
Inner Unique: true
|
|
353
|
+
-> Index Only Scan using test1_pkey on public.test1 (cost=0.28..8.29 rows=1 width=4) (actual time=0.139..0.143 rows=1 loops=1)
|
|
354
|
+
Output: test1.id
|
|
355
|
+
Index Cond: (test1.id = 123)
|
|
356
|
+
Heap Fetches: 1
|
|
357
|
+
-> Index Only Scan using test2_pkey on public.test2 (cost=0.28..8.29 rows=1 width=4) (actual time=0.032..0.032 rows=1 loops=1)
|
|
358
|
+
Output: test2.id
|
|
359
|
+
Index Cond: (test2.id = 123)
|
|
360
|
+
Heap Fetches: 1
|
|
361
|
+
Planning Time: 1.116 ms
|
|
362
|
+
Execution Time: 0.336 ms
|
|
363
|
+
(13 rows)
|
|
364
|
+
```
|
|
365
|
+
|
|
366
|
+
The visualization of the alternatives for this query looks like this:
|
|
367
|
+
|
|
368
|
+

|
|
369
|
+
|
|
370
|
+
## 🎨 Visualization
|
|
371
|
+
|
|
372
|
+
The `visualize_plan_graph` tool creates visual representations of the query plans:
|
|
373
|
+
|
|
374
|
+
- **Green nodes**: Plans that were chosen for execution
|
|
375
|
+
- **Blue nodes**: Alternative plans that were considered but not selected
|
|
376
|
+
- **Node labels**: Show path type, costs, and estimated rows
|
|
377
|
+
- **Statistics**: Summary showing total plans considered, cheapest and most expensive plans
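
To make the color scheme concrete, the following sketch renders a list of path records as Graphviz DOT text. It is not the tool's implementation: the record layout, the helper `to_dot`, and the exact color names are assumptions, and edges between nodes are left out to keep the example short.

```python
def to_dot(paths):
    """Render path records as Graphviz DOT text (nodes only, no edges)."""
    lines = ["digraph plans {", "  node [shape=box, style=filled];"]
    for index, path in enumerate(paths):
        # Chosen paths are drawn green, all other alternatives blue.
        color = "lightgreen" if path["chosen"] else "lightblue"
        label = (
            f"{path['path_type']}\\n"
            f"cost={path['startup']:.2f}..{path['total']:.2f}\\n"
            f"rows={path['rows']}"
        )
        lines.append(f'  p{index} [label="{label}", fillcolor={color}];')
    lines.append("}")
    return "\n".join(lines)


# Paths from the simple SELECT example in this README.
paths = [
    {"path_type": "T_SeqScan", "startup": 0.00, "total": 15.00, "rows": 1000, "chosen": True},
    {"path_type": "T_IndexOnlyScan", "startup": 0.28, "total": 43.27, "rows": 1000, "chosen": False},
    {"path_type": "T_BitmapHeapScan", "startup": 25.52, "total": 40.52, "rows": 1000, "chosen": False},
]

dot = to_dot(paths)
print(dot)
```

The resulting text can be passed to the `dot` command line tool from graphviz to produce a PNG or SVG file.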
|
|
378
|
+
|
|
379
|
+
|
|
380
|
+
## 🔧 How It Works
|
|
381
|
+
|
|
382
|
+
The tool uses eBPF (Extended Berkeley Packet Filter) to instrument the `add_path()` function in PostgreSQL's query planner. This function is called every time the optimizer considers a new execution path. By capturing these calls, we can see all the alternatives that were evaluated.
|
|
383
|
+
|
|
384
|
+
The tool also instruments the `create_plan()` function to identify which path was ultimately chosen for execution.
|
|
385
|
+
|
|
386
|
+
Key instrumented functions:
|
|
387
|
+
- **`add_path()`**: Called when a new query plan alternative is considered
|
|
388
|
+
- **`create_plan()`**: Called when the chosen plan is converted to an execution plan
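
The mechanism can be sketched with BCC as follows. This is a minimal illustration, not the program shipped with the tool: the names `BPF_PROGRAM`, `trace_add_path`, and `trace` are hypothetical, and the sketch only reports the pointer of each new path. Reading costs out of the `Path` struct would additionally require copying memory from the traced process, which is omitted here.

```python
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

struct event_t {
    u32 pid;
    u64 path_ptr;
};

BPF_PERF_OUTPUT(events);

// Runs on every call to add_path(RelOptInfo *parent_rel, Path *new_path).
int trace_add_path(struct pt_regs *ctx) {
    struct event_t event = {};
    event.pid = bpf_get_current_pid_tgid() >> 32;
    event.path_ptr = PT_REGS_PARM2(ctx);   // second argument: new_path
    events.perf_submit(ctx, &event, sizeof(event));
    return 0;
}
"""


def trace(binary_path):
    """Attach a uprobe to add_path() and print one line per call (needs root)."""
    from bcc import BPF  # imported lazily so this module loads without BCC

    bpf = BPF(text=BPF_PROGRAM)
    bpf.attach_uprobe(name=binary_path, sym="add_path", fn_name="trace_add_path")

    def on_event(cpu, data, size):
        event = bpf["events"].event(data)
        print(f"PID={event.pid} add_path(new_path=0x{event.path_ptr:x})")

    bpf["events"].open_perf_buffer(on_event)
    try:
        while True:
            bpf.perf_buffer_poll()
    except KeyboardInterrupt:
        pass  # stop tracing on Ctrl+C


# Example call (requires root and BCC; the path is the one used in the Quickstart):
#   trace("/usr/lib/postgresql/17/bin/postgres")
```

A probe on `create_plan()` would be attached in the same way with a second handler function.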
|
|
389
|
+
|
|
390
|
+
## 📋 Requirements
|
|
391
|
+
|
|
392
|
+
- Linux with eBPF support (kernel 4.9+)
|
|
393
|
+
- Python 3.10+
|
|
394
|
+
- Root privileges (required for eBPF)
|
|
395
|
+
- PostgreSQL 17 or 18 with debug symbols
|
|
396
|
+
- BCC (BPF Compiler Collection)
|
|
397
|
+
- graphviz (for visualization)
|
|
398
|
+
- psycopg2 (required for OID resolution)
|
|
399
|
+
|
|
400
|
+
### Installing Dependencies
|
|
401
|
+
|
|
402
|
+
#### Ubuntu/Debian
|
|
403
|
+
|
|
404
|
+
```bash
|
|
405
|
+
# Install BCC
|
|
406
|
+
sudo apt-get install bpfcc-tools python3-bpfcc
|
|
407
|
+
|
|
408
|
+
# Install graphviz
|
|
409
|
+
sudo apt-get install graphviz
|
|
410
|
+
|
|
411
|
+
# Install the tool
|
|
412
|
+
pip install pg_plan_alternatives
|
|
413
|
+
```
|
|
414
|
+
|
|
415
|
+
## PostgreSQL Build
|
|
416
|
+
The software is tested with PostgreSQL versions 17 and 18. To attach the _uprobes_, the instrumented functions must not be optimized away (e.g., inlined) when PostgreSQL is compiled. Otherwise, errors like `Unable to locate function XXX` will occur.
|
|
417
|
+
|
|
418
|
+
It is recommended to compile PostgreSQL with the following CFLAGS: `CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer"`.
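
Before tracing, it can help to check that the instrumented functions are present as symbols in the server binary. The sketch below inspects the text output of the standard `nm` utility; the helper `missing_symbols` is hypothetical, and a symbol being listed does not guarantee that every call site goes through it.

```python
def missing_symbols(nm_output, wanted=("add_path", "create_plan")):
    """Return the wanted function names that nm did not list as code symbols."""
    defined = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # nm prints "<address> <type> <name>"; T and t mark code (text) symbols.
        if len(parts) == 3 and parts[1] in ("T", "t"):
            defined.add(parts[2])
    return [name for name in wanted if name not in defined]


# Hypothetical usage against a real binary:
#
#   import subprocess
#   output = subprocess.run(
#       ["nm", "/usr/lib/postgresql/17/bin/postgres"],
#       capture_output=True, text=True, check=True,
#   ).stdout
#   print(missing_symbols(output))
```

An empty result suggests that both functions can be located; any name that is returned is a candidate for the `Unable to locate function` error.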
|
|
419
|
+
|
|
420
|
+
## Developer Notes
|
|
421
|
+
### Installation
|
|
422
|
+
|
|
423
|
+
The tool can be installed system-wide or in a dedicated [virtual environment](https://docs.python.org/3/library/venv.html). To create a virtual environment and install the tools into it, perform the following steps. For a system-wide installation, the steps that create and activate the virtual environment can be skipped.
|
|
424
|
+
|
|
425
|
+
```shell
|
|
426
|
+
cd <installation directory>
|
|
427
|
+
python3 -m venv .venv
|
|
428
|
+
source .venv/bin/activate
|
|
429
|
+
|
|
430
|
+
# Copy the distribution Python BCC packages into this environment
|
|
431
|
+
cp -av /usr/lib/python3/dist-packages/bcc* $(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
|
|
432
|
+
|
|
433
|
+
pip install -r requirements_dev.txt
|
|
434
|
+
```
|
|
435
|
+
|
|
436
|
+
## 📝 License
|
|
437
|
+
|
|
438
|
+
MIT License - see [LICENSE](LICENSE) file for details.
|