ydb-sqlglot-plugin 0.2.2__tar.gz → 0.2.4__tar.gz
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/PKG-INFO +77 -4
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/README.md +76 -3
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/pyproject.toml +1 -1
- ydb_sqlglot_plugin-0.2.4/ydb_sqlglot/version.py +1 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot/ydb.py +1244 -51
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/PKG-INFO +77 -4
- ydb_sqlglot_plugin-0.2.2/ydb_sqlglot/version.py +0 -1
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/LICENSE +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/setup.cfg +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot/__init__.py +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/SOURCES.txt +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/dependency_links.txt +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/entry_points.txt +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/requires.txt +0 -0
- {ydb_sqlglot_plugin-0.2.2 → ydb_sqlglot_plugin-0.2.4}/ydb_sqlglot_plugin.egg-info/top_level.txt +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: ydb-sqlglot-plugin
|
|
3
|
-
Version: 0.2.
|
|
3
|
+
Version: 0.2.4
|
|
4
4
|
Summary: YDB dialect plugin for sqlglot
|
|
5
5
|
Author: YDB Team
|
|
6
6
|
License: Apache-2.0
|
|
@@ -102,6 +102,31 @@ LEFT JOIN (
|
|
|
102
102
|
|
|
103
103
|
The same rewriting applies to `EXISTS`, `IN (subquery)`, and `ANY/ALL` subqueries.
|
|
104
104
|
|
|
105
|
+
#### GROUP BY aliases
|
|
106
|
+
|
|
107
|
+
YDB accepts aliases directly inside `GROUP BY` items. The generator uses this
|
|
108
|
+
form for grouped columns so later clauses and decorrelated subqueries can refer
|
|
109
|
+
to a stable grouping name:
|
|
110
|
+
|
|
111
|
+
```sql
|
|
112
|
+
-- input
|
|
113
|
+
SELECT user_id, COUNT(*) FROM events GROUP BY user_id
|
|
114
|
+
|
|
115
|
+
-- output
|
|
116
|
+
SELECT user_id, COUNT(*) FROM `events` GROUP BY user_id AS user_id
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
If a grouped column is selected under a generated alias, the `GROUP BY` item uses
|
|
120
|
+
that alias as well:
|
|
121
|
+
|
|
122
|
+
```sql
|
|
123
|
+
SELECT user_id AS _u_1, COUNT(*) FROM `events` GROUP BY user_id AS _u_1
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
Positional `GROUP BY` references are expanded before generation. When a
|
|
127
|
+
positional reference points to a constant expression, the grouping item is
|
|
128
|
+
removed because YDB rejects grouping by constants.
|
|
129
|
+
|
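The positional-reference rule described above can be sketched in plain Python. This is an illustration only, not the plugin's implementation: the helper names `is_constant` and `expand_group_by` are hypothetical, expressions are modelled as plain strings rather than sqlglot AST nodes, and "constant" is simplified to numeric or single-quoted string literals.

```python
import re
from typing import List

# Hypothetical helper: treat numeric and single-quoted string literals as constants.
_NUMBER = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def is_constant(expr: str) -> bool:
    text = expr.strip()
    if len(text) >= 2 and text.startswith("'") and text.endswith("'"):
        return True
    return _NUMBER.fullmatch(text) is not None


def expand_group_by(select_items: List[str], group_items: List[str]) -> List[str]:
    """Replace 1-based positional GROUP BY items with the selected expression.

    Items that resolve to a constant are dropped, mirroring the rule that
    YDB rejects grouping by constants.
    """
    expanded = []
    for item in group_items:
        text = item.strip()
        if re.fullmatch(r"[0-9]+", text):
            position = int(text)
            if not 1 <= position <= len(select_items):
                raise ValueError(f"GROUP BY position {position} is out of range")
            target = select_items[position - 1].strip()
            if is_constant(target):
                continue  # grouping by a constant is removed
            expanded.append(target)
        else:
            expanded.append(text)
    return expanded


# SELECT user_id, 1, COUNT(*) ... GROUP BY 1, 2  ->  GROUP BY user_id
print(expand_group_by(["user_id", "1", "COUNT(*)"], ["1", "2"]))
```

The real generator works on sqlglot expressions and also attaches the `AS alias` form shown earlier; the sketch covers only the expansion and constant-removal steps.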
|
105
130
|
---
|
|
106
131
|
|
|
107
132
|
### YDB → any SQL
|
|
@@ -115,15 +140,27 @@ The plugin parses YDB/YQL back into sqlglot's AST, enabling round-trips, YDB-to-
|
|
|
115
140
|
| `$variable` references | `SELECT * FROM $t AS t` |
|
|
116
141
|
| `Module::Function()` | `DateTime::GetYear(ts)` |
|
|
117
142
|
| `DECLARE $p AS Type` | `DECLARE $p AS Int32` |
|
|
118
|
-
| `FLATTEN [LIST\|DICT] BY
|
|
143
|
+
| `FLATTEN [LIST\|DICT\|OPTIONAL] BY ...` / `FLATTEN COLUMNS` | `FROM t FLATTEN LIST BY col AS item`, `FROM t FLATTEN BY (a, b)`, `FROM t FLATTEN COLUMNS` |
|
|
119
144
|
| `Optional<T>` / `T?` | `CAST(x AS Optional<Utf8>)` |
|
|
120
145
|
| Container types | `CAST(x AS List<Int32>)`, `Dict<Utf8, Int64>`, `Set<Utf8>`, `Tuple<Int32, Utf8>` |
|
|
121
146
|
| `ASSUME ORDER BY` | `SELECT * FROM t ASSUME ORDER BY id` |
|
|
147
|
+
| `GROUP BY expr AS alias` / `GROUP COMPACT BY` | `SELECT v, COUNT(*) FROM t GROUP BY v AS v` |
|
|
148
|
+
| `LEFT ONLY JOIN` | `SELECT * FROM a LEFT ONLY JOIN b USING (id)` |
|
|
149
|
+
| `* WITHOUT (...)` projections | `SELECT b.* WITHOUT (b.id) FROM t AS b` |
|
|
122
150
|
| Named expressions | `$t = (SELECT 1 AS x)` |
|
|
151
|
+
| Lambda expressions | `($x, $y?) -> ($x + COALESCE($y, 0))`, `($y) -> { $p = "x"; RETURN $p \|\| $y }` |
|
|
152
|
+
| YQL struct literals | `AsList(<|user_id: "u1", description: NULL|>)` |
|
|
153
|
+
| `IN COMPACT` | `WHERE key IN COMPACT $values` |
|
|
123
154
|
| `PRAGMA` | `PRAGMA AnsiImplicitCrossJoin` |
|
|
155
|
+
| Table-valued functions | `SELECT * FROM AS_TABLE($Input) AS k` |
|
|
156
|
+
| Table source options and index views | ``FROM `t` WITH TabletId='...'``, ``FROM `t` VIEW PRIMARY KEY v`` |
|
|
157
|
+
| Function-valued expressions | `$grep(x)`, `DateTime::Format("%Y-%m-%d")(ts)`, `Interval("P7D")` |
|
|
124
158
|
|
|
125
159
|
Table names without backticks are accepted on input; the generator always produces backtick-quoted output.
|
|
126
160
|
|
|
161
|
+
The parser also tolerates case variants that appear in real YQL dumps, such as
|
|
162
|
+
`set<Utf8>`, `Tuple<Int32, Utf8>?`, and lowercase `return` in lambda blocks.
|
|
163
|
+
|
|
127
164
|
#### CTEs reassembly
|
|
128
165
|
|
|
129
166
|
YDB-style named expressions are automatically reassembled into standard `WITH` CTEs when targeting other dialects:
|
|
@@ -179,6 +216,7 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
179
216
|
| `INTERVAL n HOUR` (literal) | `DateTime::IntervalFromHours(n)` |
|
|
180
217
|
| `INTERVAL n MINUTE` (literal) | `DateTime::IntervalFromMinutes(n)` |
|
|
181
218
|
| `INTERVAL n SECOND` (literal) | `DateTime::IntervalFromSeconds(n)` |
|
|
219
|
+
| `Interval("P7D")` (YQL input) | passed through unchanged |
|
|
182
220
|
| `dateDiff('minute', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 60000000` |
|
|
183
221
|
| `dateDiff('hour', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000` |
|
|
184
222
|
| `dateDiff('day', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 86400000000` |
|
|
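The divisors in the `dateDiff` rows above are consistent with timestamps cast to a count of microseconds. The sketch below reproduces them under that assumption; the names `date_diff_divisor` and `date_diff` are hypothetical and exist only for this illustration.

```python
MICROSECONDS_PER_SECOND = 1_000_000

# Seconds per unit for the three units listed in the table.
UNIT_SECONDS = {"minute": 60, "hour": 3_600, "day": 86_400}


def date_diff_divisor(unit: str) -> int:
    """Microseconds per unit, assuming the Int64 cast yields microseconds."""
    return UNIT_SECONDS[unit] * MICROSECONDS_PER_SECOND


def date_diff(unit: str, a_us: int, b_us: int) -> int:
    """Whole units between two microsecond timestamps (non-negative differences).

    Python's // floors, so results for negative differences may not match
    what the database produces.
    """
    return (b_us - a_us) // date_diff_divisor(unit)


for unit in ("minute", "hour", "day"):
    print(unit, date_diff_divisor(unit))
```

The printed divisors can be compared against the constants in the table rows.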
@@ -204,11 +242,34 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
204
242
|
|---|---|
|
|
205
243
|
| `ARRAY(v1, v2, ...)` | `AsList(v1, v2, ...)` |
|
|
206
244
|
| `ARRAY_LENGTH(x)` / `ARRAY_SIZE(x)` | `ListLength(x)` |
|
|
207
|
-
| `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) ->
|
|
208
|
-
| `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) ->
|
|
245
|
+
| `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> (cond))` |
|
|
246
|
+
| `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> (cond)))` |
|
|
209
247
|
| `ARRAY_AGG(x)` | `AGGREGATE_LIST(x)` |
|
|
210
248
|
| `UNNEST(x)` | `FLATTEN BY x` |
|
|
211
249
|
|
|
250
|
+
Lambda expressions are represented with sqlglot's standard `exp.Lambda` AST node.
|
|
251
|
+
When a source dialect parses lambdas, the YDB generator emits YQL lambda syntax:
|
|
252
|
+
|
|
253
|
+
```sql
|
|
254
|
+
-- DuckDB input
|
|
255
|
+
SELECT list_filter(arr, x -> x > 0) FROM t
|
|
256
|
+
|
|
257
|
+
-- YDB output
|
|
258
|
+
SELECT ListFilter(arr, ($x) -> ($x > 0)) FROM `t`
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
YDB input also supports documented YQL lambda forms, including optional
|
|
262
|
+
arguments and block bodies with local named expressions:
|
|
263
|
+
|
|
264
|
+
```sql
|
|
265
|
+
($x, $y?) -> ($x + COALESCE($y, 0));
|
|
266
|
+
($y) -> { $prefix = "x"; RETURN $prefix || $y; };
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
ClickHouse `ARRAY JOIN` and simple `arrayJoin(...)` projections, and PostgreSQL
|
|
270
|
+
`LATERAL unnest(...)`, are converted to YDB `FLATTEN BY` when the operation is
|
|
271
|
+
directly tied to the source table.
|
|
272
|
+
|
|
212
273
|
### Conditional / math
|
|
213
274
|
|
|
214
275
|
| Input | YQL output |
|
|
@@ -223,6 +284,18 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
223
284
|
|---|---|
|
|
224
285
|
| `jsonb_col @> value` (PostgreSQL) | `Yson::Contains(jsonb_col, value)` |
|
|
225
286
|
|
|
287
|
+
YDB JSON functions are parsed and round-tripped, including `PASSING`,
|
|
288
|
+
`RETURNING`, wrapper modes, and `ON EMPTY` / `ON ERROR` clauses:
|
|
289
|
+
|
|
290
|
+
```sql
|
|
291
|
+
JSON_VALUE(payload, '$.value + $delta' PASSING 1 AS delta RETURNING Int64 DEFAULT 0 ON EMPTY ERROR ON ERROR)
|
|
292
|
+
JSON_QUERY(payload, '$.items' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY ERROR ON ERROR)
|
|
293
|
+
JSON_EXISTS(payload, '$.items[$Index]' PASSING 0 AS "Index" FALSE ON ERROR)
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
JSON paths can contain quoted keys, for example
|
|
297
|
+
`JSON_EXISTS(item_result, "$.'P_008 device playback test'")`.
|
|
298
|
+
|
|
226
299
|
---
|
|
227
300
|
|
|
228
301
|
## Type mapping
|
|
@@ -75,6 +75,31 @@ LEFT JOIN (
|
|
|
75
75
|
|
|
76
76
|
The same rewriting applies to `EXISTS`, `IN (subquery)`, and `ANY/ALL` subqueries.
|
|
77
77
|
|
|
78
|
+
#### GROUP BY aliases
|
|
79
|
+
|
|
80
|
+
YDB accepts aliases directly inside `GROUP BY` items. The generator uses this
|
|
81
|
+
form for grouped columns so later clauses and decorrelated subqueries can refer
|
|
82
|
+
to a stable grouping name:
|
|
83
|
+
|
|
84
|
+
```sql
|
|
85
|
+
-- input
|
|
86
|
+
SELECT user_id, COUNT(*) FROM events GROUP BY user_id
|
|
87
|
+
|
|
88
|
+
-- output
|
|
89
|
+
SELECT user_id, COUNT(*) FROM `events` GROUP BY user_id AS user_id
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
If a grouped column is selected under a generated alias, the `GROUP BY` item uses
|
|
93
|
+
that alias as well:
|
|
94
|
+
|
|
95
|
+
```sql
|
|
96
|
+
SELECT user_id AS _u_1, COUNT(*) FROM `events` GROUP BY user_id AS _u_1
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
Positional `GROUP BY` references are expanded before generation. When a
|
|
100
|
+
positional reference points to a constant expression, the grouping item is
|
|
101
|
+
removed because YDB rejects grouping by constants.
|
|
102
|
+
|
|
78
103
|
---
|
|
79
104
|
|
|
80
105
|
### YDB → any SQL
|
|
@@ -88,15 +113,27 @@ The plugin parses YDB/YQL back into sqlglot's AST, enabling round-trips, YDB-to-
|
|
|
88
113
|
| `$variable` references | `SELECT * FROM $t AS t` |
|
|
89
114
|
| `Module::Function()` | `DateTime::GetYear(ts)` |
|
|
90
115
|
| `DECLARE $p AS Type` | `DECLARE $p AS Int32` |
|
|
91
|
-
| `FLATTEN [LIST\|DICT] BY
|
|
116
|
+
| `FLATTEN [LIST\|DICT\|OPTIONAL] BY ...` / `FLATTEN COLUMNS` | `FROM t FLATTEN LIST BY col AS item`, `FROM t FLATTEN BY (a, b)`, `FROM t FLATTEN COLUMNS` |
|
|
92
117
|
| `Optional<T>` / `T?` | `CAST(x AS Optional<Utf8>)` |
|
|
93
118
|
| Container types | `CAST(x AS List<Int32>)`, `Dict<Utf8, Int64>`, `Set<Utf8>`, `Tuple<Int32, Utf8>` |
|
|
94
119
|
| `ASSUME ORDER BY` | `SELECT * FROM t ASSUME ORDER BY id` |
|
|
120
|
+
| `GROUP BY expr AS alias` / `GROUP COMPACT BY` | `SELECT v, COUNT(*) FROM t GROUP BY v AS v` |
|
|
121
|
+
| `LEFT ONLY JOIN` | `SELECT * FROM a LEFT ONLY JOIN b USING (id)` |
|
|
122
|
+
| `* WITHOUT (...)` projections | `SELECT b.* WITHOUT (b.id) FROM t AS b` |
|
|
95
123
|
| Named expressions | `$t = (SELECT 1 AS x)` |
|
|
124
|
+
| Lambda expressions | `($x, $y?) -> ($x + COALESCE($y, 0))`, `($y) -> { $p = "x"; RETURN $p \|\| $y }` |
|
|
125
|
+
| YQL struct literals | `AsList(<|user_id: "u1", description: NULL|>)` |
|
|
126
|
+
| `IN COMPACT` | `WHERE key IN COMPACT $values` |
|
|
96
127
|
| `PRAGMA` | `PRAGMA AnsiImplicitCrossJoin` |
|
|
128
|
+
| Table-valued functions | `SELECT * FROM AS_TABLE($Input) AS k` |
|
|
129
|
+
| Table source options and index views | ``FROM `t` WITH TabletId='...'``, ``FROM `t` VIEW PRIMARY KEY v`` |
|
|
130
|
+
| Function-valued expressions | `$grep(x)`, `DateTime::Format("%Y-%m-%d")(ts)`, `Interval("P7D")` |
|
|
97
131
|
|
|
98
132
|
Table names without backticks are accepted on input; the generator always produces backtick-quoted output.
|
|
99
133
|
|
|
134
|
+
The parser also tolerates case variants that appear in real YQL dumps, such as
|
|
135
|
+
`set<Utf8>`, `Tuple<Int32, Utf8>?`, and lowercase `return` in lambda blocks.
|
|
136
|
+
|
|
100
137
|
#### CTEs reassembly
|
|
101
138
|
|
|
102
139
|
YDB-style named expressions are automatically reassembled into standard `WITH` CTEs when targeting other dialects:
|
|
@@ -152,6 +189,7 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
152
189
|
| `INTERVAL n HOUR` (literal) | `DateTime::IntervalFromHours(n)` |
|
|
153
190
|
| `INTERVAL n MINUTE` (literal) | `DateTime::IntervalFromMinutes(n)` |
|
|
154
191
|
| `INTERVAL n SECOND` (literal) | `DateTime::IntervalFromSeconds(n)` |
|
|
192
|
+
| `Interval("P7D")` (YQL input) | passed through unchanged |
|
|
155
193
|
| `dateDiff('minute', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 60000000` |
|
|
156
194
|
| `dateDiff('hour', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000` |
|
|
157
195
|
| `dateDiff('day', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 86400000000` |
|
|
@@ -177,11 +215,34 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
177
215
|
|---|---|
|
|
178
216
|
| `ARRAY(v1, v2, ...)` | `AsList(v1, v2, ...)` |
|
|
179
217
|
| `ARRAY_LENGTH(x)` / `ARRAY_SIZE(x)` | `ListLength(x)` |
|
|
180
|
-
| `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) ->
|
|
181
|
-
| `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) ->
|
|
218
|
+
| `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> (cond))` |
|
|
219
|
+
| `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> (cond)))` |
|
|
182
220
|
| `ARRAY_AGG(x)` | `AGGREGATE_LIST(x)` |
|
|
183
221
|
| `UNNEST(x)` | `FLATTEN BY x` |
|
|
184
222
|
|
|
223
|
+
Lambda expressions are represented with sqlglot's standard `exp.Lambda` AST node.
|
|
224
|
+
When a source dialect parses lambdas, the YDB generator emits YQL lambda syntax:
|
|
225
|
+
|
|
226
|
+
```sql
|
|
227
|
+
-- DuckDB input
|
|
228
|
+
SELECT list_filter(arr, x -> x > 0) FROM t
|
|
229
|
+
|
|
230
|
+
-- YDB output
|
|
231
|
+
SELECT ListFilter(arr, ($x) -> ($x > 0)) FROM `t`
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
YDB input also supports documented YQL lambda forms, including optional
|
|
235
|
+
arguments and block bodies with local named expressions:
|
|
236
|
+
|
|
237
|
+
```sql
|
|
238
|
+
($x, $y?) -> ($x + COALESCE($y, 0));
|
|
239
|
+
($y) -> { $prefix = "x"; RETURN $prefix || $y; };
|
|
240
|
+
```
|
|
241
|
+
|
|
242
|
+
ClickHouse `ARRAY JOIN` and simple `arrayJoin(...)` projections, and PostgreSQL
|
|
243
|
+
`LATERAL unnest(...)`, are converted to YDB `FLATTEN BY` when the operation is
|
|
244
|
+
directly tied to the source table.
|
|
245
|
+
|
|
185
246
|
### Conditional / math
|
|
186
247
|
|
|
187
248
|
| Input | YQL output |
|
|
@@ -196,6 +257,18 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
|
|
|
196
257
|
|---|---|
|
|
197
258
|
| `jsonb_col @> value` (PostgreSQL) | `Yson::Contains(jsonb_col, value)` |
|
|
198
259
|
|
|
260
|
+
YDB JSON functions are parsed and round-tripped, including `PASSING`,
|
|
261
|
+
`RETURNING`, wrapper modes, and `ON EMPTY` / `ON ERROR` clauses:
|
|
262
|
+
|
|
263
|
+
```sql
|
|
264
|
+
JSON_VALUE(payload, '$.value + $delta' PASSING 1 AS delta RETURNING Int64 DEFAULT 0 ON EMPTY ERROR ON ERROR)
|
|
265
|
+
JSON_QUERY(payload, '$.items' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY ERROR ON ERROR)
|
|
266
|
+
JSON_EXISTS(payload, '$.items[$Index]' PASSING 0 AS "Index" FALSE ON ERROR)
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
JSON paths can contain quoted keys, for example
|
|
270
|
+
`JSON_EXISTS(item_result, "$.'P_008 device playback test'")`.
|
|
271
|
+
|
|
199
272
|
---
|
|
200
273
|
|
|
201
274
|
## Type mapping
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
VERSION = "0.2.4"
|