pg-plan-alternatives 0.1.0__tar.gz

@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Jan Nidzwetzki
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,5 @@
1
+ include README.md
2
+ include LICENSE
3
+ include requirements_dev.txt
4
+ recursive-include src/pg_plan_alternatives/bpf *.c
5
+ recursive-include examples *.md
@@ -0,0 +1,438 @@
1
+ Metadata-Version: 2.4
2
+ Name: pg_plan_alternatives
3
+ Version: 0.1.0
4
+ Summary: An eBPF-based tool to show all query plans considered by PostgreSQL during query planning
5
+ Author-email: Jan Nidzwetzki <jnidzwetzki@gmx.de>
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/jnidzwetzki/pg_plan_alternatives
8
+ Project-URL: Bug Tracker, https://github.com/jnidzwetzki/pg_plan_alternatives/issues
9
+ Keywords: postgresql,postgres,ebpf,query-planning,optimizer
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: Operating System :: POSIX :: Linux
13
+ Classifier: Programming Language :: Python
14
+ Classifier: Topic :: Software Development :: Debuggers
15
+ Requires-Python: >=3.10
16
+ Description-Content-Type: text/markdown
17
+ License-File: LICENSE
18
+ Requires-Dist: graphviz
19
+ Requires-Dist: psycopg2-binary>=2.9
20
+ Dynamic: license-file
21
+
22
+ # pg_plan_alternatives: A PostgreSQL Plan Alternatives Tracer
23
+ [![Basic Integration Tests](https://github.com/jnidzwetzki/pg_plan_alternatives/actions/workflows/integration_tests.yml/badge.svg)](https://github.com/jnidzwetzki/pg_plan_alternatives/actions/workflows/integration_tests.yml)
24
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
25
+ [![GitHub Repo stars](https://img.shields.io/github/stars/jnidzwetzki/pg_plan_alternatives?style=social)](https://github.com/jnidzwetzki/pg_plan_alternatives/)
26
+
27
+ An eBPF-based tool designed to show **all query plans** that are considered by PostgreSQL during query planning, not just the final chosen plan as shown in `EXPLAIN` output.
28
+
29
+ ## 🎯 Overview
30
+
31
+ PostgreSQL uses a cost-based optimizer to determine the most efficient way to execute a query. When PostgreSQL plans a query, it considers many different execution paths and chooses the one with the lowest estimated cost. The standard `EXPLAIN` command only shows the final chosen plan. `pg_plan_alternatives` reveals all the alternative plans that were considered, along with their costs, giving you a complete picture of the optimizer's reasoning.
32
+
33
+ Key features:
34
+ - **`pg_plan_alternatives`**: eBPF-based tracer that captures all query plans considered during planning
35
+ - **`visualize_plan_graph`**: Creates interactive graph visualizations from trace output
36
+ - Supports PostgreSQL 17 and 18
37
+ - JSON output format for easy processing
38
+ - Shows cost estimates (startup and total) for each alternative
39
+ - Highlights which plan was ultimately chosen
40
+
41
+ **Note:** This tool relies on [eBPF](https://ebpf.io/) (_Extended Berkeley Packet Filter_) technology and requires root privileges to run.
42
+
43
+ ## ⚡ Quickstart
44
+
45
+ 1. Install the tool:
46
+
47
+ ```bash
48
+ pip install pg_plan_alternatives
49
+ ```
50
+
51
+ 2. Identify your PostgreSQL server binary (e.g., `/usr/lib/postgresql/17/bin/postgres`)
52
+
53
+ 3. Start tracing (requires root privileges):
54
+
55
+ ```bash
56
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p <PID> -n $(pg_config --includedir-server)/nodes/nodetags.h
57
+ ```
58
+
59
+ 4. Run your queries in PostgreSQL
60
+
61
+ 5. View the trace output showing all considered plans
62
+
63
+ ## 📊 Usage Examples
64
+
65
+ ### Basic Tracing
66
+
67
+ ```bash
68
+ # Trace all PostgreSQL processes using the binary
69
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -n /path/to/nodetags.h
70
+
71
+ # Trace a specific PostgreSQL backend process
72
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -n /path/to/nodetags.h
73
+
74
+ # Trace multiple processes
75
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -p 5678 -n /path/to/nodetags.h
76
+
77
+ # Output in JSON format
78
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -j -n /path/to/nodetags.h
79
+
80
+ # Save output to file
81
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -j -o plans.json -n /path/to/nodetags.h
82
+
83
+ # Verbose mode
84
+ sudo pg_plan_alternatives -x /usr/lib/postgresql/17/bin/postgres -p 1234 -v -n /path/to/nodetags.h
85
+ ```
86
+
87
+ *Note:* The path to `nodetags.h` is required to resolve the path type enums to human-readable names.
88
+
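As a rough illustration of why `nodetags.h` is needed, the sketch below builds a lookup table from numeric node tags to names. It assumes the header contains enum entries of the form `T_Name = <number>,`; the function name `parse_nodetags` and the sample values are hypothetical and are not taken from the tool's source.

```python
import re

# Matches enum entries such as "T_SeqScan = 1," (assumed header layout).
NODE_TAG_RE = re.compile(r"^\s*(T_\w+)\s*=\s*(\d+)\s*,?\s*$")


def parse_nodetags(text):
    """Return a dict mapping numeric node tag values to their names."""
    tags = {}
    for line in text.splitlines():
        match = NODE_TAG_RE.match(line)
        if match:
            tags[int(match.group(2))] = match.group(1)
    return tags


# Hypothetical excerpt: the real numbers depend on the PostgreSQL version.
SAMPLE = """
    T_SeqScan = 1,
    T_IndexOnlyScan = 2,
"""

if __name__ == "__main__":
    print(parse_nodetags(SAMPLE))
```

Because the numeric values can differ between PostgreSQL versions, the header should come from the same installation as the traced binary.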
89
+ ### Creating Visualizations
90
+
91
+ ```bash
92
+ # Create a PNG graph from trace output
93
+ visualize_plan_graph -i plans.json -o plans.png
94
+
95
+ # Create an interactive HTML visualization
96
+ visualize_plan_graph -i plans.json -o plans.html
97
+
98
+ # Create an SVG graph
99
+ visualize_plan_graph -i plans.json -o plans.svg
100
+
101
+ # Create separate graphs for each PID
102
+ visualize_plan_graph -i plans.json -o plans.png --group-by-pid
103
+
104
+ # Resolve table OIDs by connecting to the database
105
+ visualize_plan_graph -i plans.json -o plans.png --db-url postgres://user:pass@host/db
106
+ ```
107
+
108
+ ## 📄 Example Usage
109
+
110
+ ### Preparing the Environment
111
+
112
+ To see the tool in action, you can set up a simple PostgreSQL environment with some test data:
113
+
114
+ ```sql
115
+ CREATE TABLE test1(id INTEGER PRIMARY KEY);
116
+ CREATE TABLE test2(id INTEGER PRIMARY KEY);
117
+
118
+ INSERT INTO test1 SELECT generate_series(1, 1000);
119
+ INSERT INTO test2 SELECT generate_series(1, 1000);
120
+
121
+ ANALYZE;
122
+ ```
123
+
124
+ ### SELECT
125
+
126
+ In the first example, we will run a simple `SELECT` query and trace the planning process to see all the alternatives considered by the optimizer. To capture the planning of this query, we can run the following command:
127
+
128
+ ```
129
+ $ sudo pg_plan_alternatives -x /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres -n $(pg_config --includedir-server)/nodes/nodetags.h
130
+ ```
131
+
132
+ In another terminal, we execute the query:
133
+
134
+ ```sql
135
+ SELECT * FROM test1;
136
+ ```
137
+
138
+ The output from `pg_plan_alternatives` will show all the paths that were considered for this query, including sequential scans, index scans, and bitmap heap scans, along with their estimated costs. The chosen plan will be highlighted in the output with a `[CHOSEN]` tag.
139
+
140
+ ```
141
+ ================================================================================
142
+ PostgreSQL Plan Alternatives Tracer
143
+ Binary: /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres
144
+ Tracing all PostgreSQL processes
145
+ ================================================================================
146
+
147
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
148
+ [20:14:54.116] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)
149
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
150
+ [20:14:54.118] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=1, parent_oid=26144)
151
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
152
+ [20:14:54.118] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=25.52, total=40.52, rows=1000, parent_rti=1, parent_oid=26144)
153
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
154
+ [20:14:54.118] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_oid=26144)
155
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
156
+ [20:14:54.118] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
157
+ ```
158
+
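The text output shown above is regular enough to be post-processed with a small script. The following is a minimal sketch that parses one trace line into a dictionary; the regular expression is derived only from the sample lines in this README, and `parse_event` is a hypothetical helper, not part of the tool.

```python
import re

# Pattern derived from the sample trace lines shown above.
EVENT_RE = re.compile(
    r"^\[(?P<time>[\d:.]+)\] \[PID (?P<pid>\d+)\] "
    r"(?P<event>ADD_PATH|CREATE_PLAN): (?P<path_type>T_\w+) "
    r"\((?P<fields>[^)]*)\)(?P<chosen> \[CHOSEN\])?$"
)


def parse_event(line):
    """Parse one trace line; return None for lines that do not match."""
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    # Split "startup=0.00, total=15.00, ..." into key/value pairs (kept as strings).
    fields = dict(item.split("=", 1) for item in match.group("fields").split(", "))
    return {
        "time": match.group("time"),
        "pid": int(match.group("pid")),
        "event": match.group("event"),
        "path_type": match.group("path_type"),
        "chosen": match.group("chosen") is not None,
        "fields": fields,
    }


if __name__ == "__main__":
    sample = (
        "[20:14:54.116] [PID 3917080] ADD_PATH: T_SeqScan "
        "(startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)"
    )
    print(parse_event(sample))
```

For anything beyond quick inspection, the JSON output (`-j`) is probably the better starting point, since it does not depend on the exact text layout.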
159
+ When we run `EXPLAIN (VERBOSE, ANALYZE)` on the same query, we can see that the chosen plan was a sequential scan, which matches the output from our tracer. The costs and estimated rows also match the values reported in the trace output.
160
+
161
+ ```
162
+ jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1;
163
+ QUERY PLAN
164
+ -------------------------------------------------------------------------------------------------------------
165
+ Seq Scan on public.test1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.119..0.291 rows=1000 loops=1)
166
+ Output: id
167
+ Planning Time: 0.855 ms
168
+ Execution Time: 0.437 ms
169
+ (4 rows)
170
+ ```
171
+
172
+ To visualize the alternatives, we can save the trace output to a JSON file and then create an SVG graph:
173
+
174
+ ```
175
+ $ sudo pg_plan_alternatives -x /home/jan/postgresql-sandbox/bin/REL_17_1_DEBUG/bin/postgres -n $(pg_config --includedir-server)/nodes/nodetags.h -j -o examples/select.json
176
+ ```
177
+
178
+ From this JSON file, we can generate a graph visualization. The `--db-url` option allows the tool to connect to the database and resolve OIDs to human-readable table names, which makes the graph easier to understand.
179
+
180
+ ```
181
+ $ visualize_plan_graph -i examples/select.json -o examples/select.svg --db-url psql://localhost/jan2 -v
182
+ ```
183
+
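To illustrate what resolving OIDs means here, the sketch below looks up relation names in the `pg_class` system catalog through a DB-API cursor such as the one provided by psycopg2. This is an assumption about how such a lookup could be done, not a description of what `visualize_plan_graph` does internally; `resolve_table_names` is a hypothetical helper.

```python
def resolve_table_names(cursor, oids):
    """Map relation OIDs to table names using the pg_class catalog.

    `cursor` is expected to behave like a psycopg2 cursor.
    """
    cursor.execute(
        "SELECT oid, relname FROM pg_class WHERE oid = ANY(%s::oid[])",
        (list(oids),),
    )
    return {int(oid): name for oid, name in cursor.fetchall()}


class FakeCursor:
    """Stand-in for a database cursor so the sketch runs without a server."""

    def execute(self, query, params):
        self.query = query
        self.params = params

    def fetchall(self):
        # Hard-coded demo rows that mimic the OIDs from the trace above.
        return [(26144, "test1"), (26149, "test2")]


if __name__ == "__main__":
    print(resolve_table_names(FakeCursor(), [26144, 26149]))
```

With a real connection, the cursor would come from `psycopg2.connect(...)`, and the OIDs would be the `parent_oid`, `outer_oid`, and `inner_oid` values found in the trace.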
184
+ ![Select plan alternatives](examples/select.svg)
185
+
186
+ ### SELECT with a simple WHERE clause
187
+
188
+ In the next example, we will run a `SELECT` query with a `WHERE` clause that filters for a specific ID.
189
+
190
+ ```sql
191
+ SELECT * FROM test1 WHERE id = 5;
192
+ ```
193
+
194
+ The trace output shows the following alternatives being considered by the optimizer:
195
+
196
+ ```
197
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
198
+ [20:15:53.751] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=1, parent_oid=26144)
199
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
200
+ [20:15:53.751] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=1, parent_oid=26144)
201
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
202
+ [20:15:53.751] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=1, parent_oid=26144)
203
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
204
+ [20:15:53.751] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_oid=26144)
205
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
206
+ [20:15:53.751] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
207
+ ```
208
+
209
+ This time the optimizer has chosen the `Index Only Scan` plan, which has the lowest estimated cost. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
210
+
211
+ ```
212
+ jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 WHERE id = 5;
213
+ QUERY PLAN
214
+ ------------------------------------------------------------------------------------------------------------------------------
215
+ Index Only Scan using test1_pkey on public.test1 (cost=0.28..8.29 rows=1 width=4) (actual time=0.153..0.160 rows=1 loops=1)
216
+ Output: id
217
+ Index Cond: (test1.id = 5)
218
+ Heap Fetches: 1
219
+ Planning Time: 1.166 ms
220
+ Execution Time: 0.284 ms
221
+ (6 rows)
222
+ ```
223
+
224
+ The visualization of the alternatives for this query looks like this:
225
+
226
+ ![Select plan alternatives](examples/select_where.svg)
227
+
228
+ ### JOIN
229
+
230
+ To give a more complex example, we can run a `JOIN` query that combines data from both `test1` and `test2`:
231
+
232
+ ```sql
233
+ SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id);
234
+ ```
235
+
236
+ Now far more alternatives are considered by the optimizer, including different join strategies (merge join, hash join, nested loop) and different scan methods for each table.
237
+
238
+ ```
239
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
240
+ [20:22:42.381] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=1, parent_oid=26144)
241
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
242
+ [20:22:42.381] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=1, parent_oid=26144)
243
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
244
+ [20:22:42.381] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=15.00, rows=1000, parent_rti=2, parent_oid=26149)
245
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
246
+ [20:22:42.382] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=43.27, rows=1000, parent_rti=2, parent_oid=26149)
247
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
248
+ [20:22:42.383] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=0.33, rows=1, parent_rti=2, parent_oid=26149)
249
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
250
+ [20:22:42.383] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=0.30, total=4.32, rows=1, parent_rti=2, parent_oid=26149)
251
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
252
+ [20:22:42.385] [PID 3917080] ADD_PATH: T_MergeJoin (startup=129.66, total=149.66, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
253
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
254
+ [20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.00, total=27487.55, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
255
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
256
+ [20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.00, total=15017.53, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
257
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
258
+ [20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.28, total=27515.83, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
259
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
260
+ [20:22:42.385] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.28, total=15045.80, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
261
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
262
+ [20:22:42.386] [PID 3917080] ADD_PATH: T_MergeJoin (startup=65.10, total=125.60, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
263
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_MergeJoin
264
+ [20:22:42.386] [PID 3917080] ADD_PATH: T_MergeJoin (startup=0.55, total=101.55, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
265
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
266
+ [20:22:42.386] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
267
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
268
+ [20:22:42.387] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_RIGHT, outer_rti=2, outer_oid=26149, inner_rti=1, inner_oid=26144)
269
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_HashJoin
270
+ [20:22:42.387] [PID 3917080] ADD_PATH: T_HashJoin (startup=27.50, total=45.14, rows=1000, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
271
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_HashJoin
272
+ [20:22:42.387] [PID 3917080] CREATE_PLAN: T_HashJoin (startup=27.50, total=45.14) [CHOSEN]
273
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
274
+ [20:22:42.387] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
275
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_SeqScan
276
+ [20:22:42.387] [PID 3917080] CREATE_PLAN: T_SeqScan (startup=0.00, total=15.00) [CHOSEN]
277
+ ```
278
+
279
+ According to the trace output, the optimizer has chosen a `Hash Join` strategy for the join operation, and sequential scans for both tables. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
280
+
281
+ ```
282
+ jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id);
283
+ QUERY PLAN
284
+ -------------------------------------------------------------------------------------------------------------------------
285
+ Hash Left Join (cost=27.50..45.14 rows=1000 width=8) (actual time=0.625..1.422 rows=1000 loops=1)
286
+ Output: test1.id, test2.id
287
+ Inner Unique: true
288
+ Hash Cond: (test1.id = test2.id)
289
+ -> Seq Scan on public.test1 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.038..0.220 rows=1000 loops=1)
290
+ Output: test1.id
291
+ -> Hash (cost=15.00..15.00 rows=1000 width=4) (actual time=0.571..0.572 rows=1000 loops=1)
292
+ Output: test2.id
293
+ Buckets: 1024 Batches: 1 Memory Usage: 44kB
294
+ -> Seq Scan on public.test2 (cost=0.00..15.00 rows=1000 width=4) (actual time=0.019..0.191 rows=1000 loops=1)
295
+ Output: test2.id
296
+ Planning Time: 3.436 ms
297
+ Execution Time: 1.551 ms
298
+ (13 rows)
299
+ ```
300
+
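The cost comparison for the join can be reproduced by hand. The sketch below copies the distinct join alternatives from the trace above and picks the one with the lowest total cost. Comparing only total cost is a simplification: PostgreSQL's `add_path()` also takes startup cost and sort order into account when deciding which paths to keep.

```python
# (path type, startup cost, total cost), copied from the ADD_PATH events above.
join_paths = [
    ("T_MergeJoin", 129.66, 149.66),
    ("T_NestLoop", 0.00, 27487.55),
    ("T_NestLoop", 0.00, 15017.53),
    ("T_NestLoop", 0.28, 27515.83),
    ("T_NestLoop", 0.28, 15045.80),
    ("T_MergeJoin", 65.10, 125.60),
    ("T_MergeJoin", 0.55, 101.55),
    ("T_HashJoin", 27.50, 45.14),
]

# Select the alternative with the lowest total cost.
cheapest = min(join_paths, key=lambda path: path[2])

if __name__ == "__main__":
    print(f"cheapest join path: {cheapest[0]} (total={cheapest[2]:.2f})")
```

The result is the `T_HashJoin` path with a total cost of 45.14, which is the plan reported by `CREATE_PLAN` and by `EXPLAIN`.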
301
+ The visualization of the alternatives for this query looks like this:
302
+
303
+ ![Join plan alternatives](examples/join.svg)
304
+
305
+
306
+ ### JOIN with a WHERE clause
307
+
308
+ As the last example, we run a `JOIN` query with a `WHERE` clause that filters for a specific ID in the first table:
309
+
310
+ ```sql
311
+ SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id) WHERE test1.id=123;
312
+ ```
313
+
314
+ The trace output shows that, for this query, the optimizer considers only nested loop paths for the join.
315
+
316
+ ```
317
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
318
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=1, parent_oid=26144)
319
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
320
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=1, parent_oid=26144)
321
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
322
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=1, parent_oid=26144)
323
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_SeqScan
324
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_SeqScan (startup=0.00, total=17.50, rows=1, parent_rti=2, parent_oid=26149)
325
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_IndexOnlyScan
326
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_IndexOnlyScan (startup=0.28, total=8.29, rows=1, parent_rti=2, parent_oid=26149)
327
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_BitmapHeapScan
328
+ [20:29:41.396] [PID 3917080] ADD_PATH: T_BitmapHeapScan (startup=4.28, total=8.30, rows=1, parent_rti=2, parent_oid=26149)
329
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
330
+ [20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
331
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
332
+ [20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
333
+ Received event: PID=3917080, Type=ADD_PATH, PathType=T_NestLoop
334
+ [20:29:41.397] [PID 3917080] ADD_PATH: T_NestLoop (startup=0.55, total=16.60, rows=1, join=JOIN_LEFT, outer_rti=1, outer_oid=26144, inner_rti=2, inner_oid=26149)
335
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_NestLoop
336
+ [20:29:41.397] [PID 3917080] CREATE_PLAN: T_NestLoop (startup=0.55, total=16.60) [CHOSEN]
337
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
338
+ [20:29:41.397] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
339
+ Received event: PID=3917080, Type=CREATE_PLAN, PathType=T_IndexOnlyScan
340
+ [20:29:41.397] [PID 3917080] CREATE_PLAN: T_IndexOnlyScan (startup=0.28, total=8.29) [CHOSEN]
341
+ ```
342
+
343
+ This time, the optimizer has chosen a `Nested Loop Join` strategy for the join operation, and index-only scans for both tables. When we run `EXPLAIN (VERBOSE, ANALYZE)` on this query, we can confirm that the chosen plan matches what was reported in the trace output.
344
+
345
+
346
+ ```
347
+ jan2=# EXPLAIN (VERBOSE, ANALYZE) SELECT * FROM test1 LEFT JOIN test2 ON (test1.id = test2.id) WHERE test1.id=123;
348
+ QUERY PLAN
349
+ ------------------------------------------------------------------------------------------------------------------------------------
350
+ Nested Loop Left Join (cost=0.55..16.60 rows=1 width=8) (actual time=0.183..0.189 rows=1 loops=1)
351
+ Output: test1.id, test2.id
352
+ Inner Unique: true
353
+ -> Index Only Scan using test1_pkey on public.test1 (cost=0.28..8.29 rows=1 width=4) (actual time=0.139..0.143 rows=1 loops=1)
354
+ Output: test1.id
355
+ Index Cond: (test1.id = 123)
356
+ Heap Fetches: 1
357
+ -> Index Only Scan using test2_pkey on public.test2 (cost=0.28..8.29 rows=1 width=4) (actual time=0.032..0.032 rows=1 loops=1)
358
+ Output: test2.id
359
+ Index Cond: (test2.id = 123)
360
+ Heap Fetches: 1
361
+ Planning Time: 1.116 ms
362
+ Execution Time: 0.336 ms
363
+ (13 rows)
364
+ ```
365
+
366
+ The visualization of the alternatives for this query looks like this:
367
+
368
+ ![Join with WHERE plan alternatives](examples/select_where.svg)
369
+
370
+ ## 🎨 Visualization
371
+
372
+ The `visualize_plan_graph` tool creates visual representations of the query plans:
373
+
374
+ - **Green nodes**: Plans that were chosen for execution
375
+ - **Blue nodes**: Alternative plans that were considered but not selected
376
+ - **Node labels**: Show path type, costs, and estimated rows
377
+ - **Statistics**: Summary showing total plans considered, cheapest and most expensive plans
378
+
379
+
380
+ ## 🔧 How It Works
381
+
382
+ The tool uses eBPF (Extended Berkeley Packet Filter) to instrument the `add_path()` function in PostgreSQL's query planner. This function is called every time the optimizer considers a new execution path. By capturing these calls, we can see all the alternatives that were evaluated.
383
+
384
+ The tool also instruments the `create_plan()` function to identify which path was ultimately chosen for execution.
385
+
386
+ Key instrumented functions:
387
+ - **`add_path()`**: Called when a new query plan alternative is considered
388
+ - **`create_plan()`**: Called when the chosen plan is converted to an execution plan
389
+
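The attachment mechanism described above can be sketched with BCC's Python API. The example below only counts `add_path()` calls per process. It does not read any path or cost information and is not the tool's actual BPF program. The names `BPF_PROGRAM`, `trace_add_path`, and `attach` are hypothetical, and running `attach()` requires BCC, root privileges, and a PostgreSQL binary in which `add_path` has not been inlined.

```python
# Minimal BPF program: count how often add_path() is entered, per PID.
BPF_PROGRAM = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(calls, u32, u64);

int trace_add_path(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    calls.increment(pid);
    return 0;
}
"""


def attach(binary_path):
    """Attach a uprobe to add_path() in the given PostgreSQL binary."""
    # Imported lazily so that this module can be loaded without BCC installed.
    from bcc import BPF

    bpf = BPF(text=BPF_PROGRAM)
    bpf.attach_uprobe(name=binary_path, sym="add_path", fn_name="trace_add_path")
    return bpf


if __name__ == "__main__":
    import sys
    import time

    bpf = attach(sys.argv[1])  # e.g. /usr/lib/postgresql/17/bin/postgres
    time.sleep(10)  # run some queries in another terminal meanwhile
    for pid, count in bpf["calls"].items():
        print(f"PID {pid.value}: {count.value} add_path() calls")
```

A probe on `create_plan()` could be attached in the same way with a second `attach_uprobe()` call.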
390
+ ## 📋 Requirements
391
+
392
+ - Linux with eBPF support (kernel 4.9+)
393
+ - Python 3.10+
394
+ - Root privileges (required for eBPF)
395
+ - PostgreSQL 17 or 18 with debug symbols
396
+ - BCC (BPF Compiler Collection)
397
+ - graphviz (for visualization)
398
+ - psycopg2 (required for OID resolution)
399
+
400
+ ### Installing Dependencies
401
+
402
+ #### Ubuntu/Debian
403
+
404
+ ```bash
405
+ # Install BCC
406
+ sudo apt-get install bpfcc-tools python3-bpfcc
407
+
408
+ # Install graphviz
409
+ sudo apt-get install graphviz
410
+
411
+ # Install the tool
412
+ pip install pg_plan_alternatives
413
+ ```
414
+
415
+ ## PostgreSQL Build
416
+ The software is tested with PostgreSQL versions 17 and 18. To attach the _uprobes_ to the functions, the functions must not be optimized away (e.g., inlined) during the compilation of PostgreSQL. Otherwise, errors like `Unable to locate function XXX` will occur.
417
+
418
+ It is recommended to compile PostgreSQL with the following CFLAGS: `CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer"`.
419
+
420
+ ## Developer Notes
421
+ ### Installation
422
+
423
+ The tool can be installed system-wide or in a dedicated [virtual environment](https://docs.python.org/3/library/venv.html). To create such a virtual environment and install the tool in it, perform the following steps. For a system-wide installation, skip the virtual environment steps.
424
+
425
+ ```shell
426
+ cd <installation directory>
427
+ python3 -m venv .venv
428
+ source .venv/bin/activate
429
+
430
+ # Copy the distribution Python BCC packages into this environment
431
+ cp -av /usr/lib/python3/dist-packages/bcc* $(python -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")
432
+
433
+ pip install -r requirements_dev.txt
434
+ ```
435
+
436
+ ## 📝 License
437
+
438
+ MIT License - see [LICENSE](LICENSE) file for details.