ydb-sqlglot-plugin 0.2.2__tar.gz → 0.2.4__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: ydb-sqlglot-plugin
- Version: 0.2.2
+ Version: 0.2.4
  Summary: YDB dialect plugin for sqlglot
  Author: YDB Team
  License: Apache-2.0
@@ -102,6 +102,31 @@ LEFT JOIN (
 
  The same rewriting applies to `EXISTS`, `IN (subquery)`, and `ANY/ALL` subqueries.
 
+ #### GROUP BY aliases
+
+ YDB accepts aliases directly inside `GROUP BY` items. The generator uses this
+ form for grouped columns so later clauses and decorrelated subqueries can refer
+ to a stable grouping name:
+
+ ```sql
+ -- input
+ SELECT user_id, COUNT(*) FROM events GROUP BY user_id
+
+ -- output
+ SELECT user_id, COUNT(*) FROM `events` GROUP BY user_id AS user_id
+ ```
+
+ If a grouped column is selected under a generated alias, the `GROUP BY` item uses
+ that alias as well:
+
+ ```sql
+ SELECT user_id AS _u_1, COUNT(*) FROM `events` GROUP BY user_id AS _u_1
+ ```
+
+ Positional `GROUP BY` references are expanded before generation. When a
+ positional reference points to a constant expression, the grouping item is
+ removed because YDB rejects grouping by constants.
+
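The three GROUP BY rules added in this hunk (alias every grouped column, reuse the select alias, expand positional references and drop constants) can be sketched in plain Python. This is only an illustration under stated assumptions: `render_group_by`, its tuple-based input, and the regex-based constant check are hypothetical and are not taken from the plugin, which works on sqlglot's AST rather than on strings.

```python
import re

# Hypothetical helpers for illustration; not the plugin's implementation.
_CONSTANT = re.compile(r"^(?:\d+(?:\.\d+)?|'[^']*'|NULL|TRUE|FALSE)$", re.IGNORECASE)
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_group_by(select_items, group_items):
    """Render a YDB-style GROUP BY clause.

    select_items: list of (expression, alias_or_None) from the SELECT list.
    group_items:  list of expressions (str) or 1-based positions (int).
    """
    rendered = []
    for item in group_items:
        if isinstance(item, int):
            # Expand a positional reference to the selected expression.
            expr, alias = select_items[item - 1]
            if _CONSTANT.match(expr):
                # Grouping by a constant is dropped entirely.
                continue
        else:
            expr = item
            # Reuse the alias the column was selected under, if any.
            alias = next((a for e, a in select_items if e == expr), None)
        if alias is None and _IDENTIFIER.match(expr):
            # A plain column is aliased to its own name.
            alias = expr
        rendered.append(f"{expr} AS {alias}" if alias else expr)
    return "GROUP BY " + ", ".join(rendered) if rendered else ""


print(render_group_by([("user_id", None), ("COUNT(*)", None)], ["user_id"]))
print(render_group_by([("user_id", "_u_1"), ("COUNT(*)", None)], ["user_id"]))
```

The sketch leaves non-identifier expressions without an alias because the README text only documents the column case.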
  ---
 
  ### YDB → any SQL
@@ -115,15 +140,27 @@ The plugin parses YDB/YQL back into sqlglot's AST, enabling round-trips, YDB-to-
  | `$variable` references | `SELECT * FROM $t AS t` |
  | `Module::Function()` | `DateTime::GetYear(ts)` |
  | `DECLARE $p AS Type` | `DECLARE $p AS Int32` |
- | `FLATTEN [LIST\|DICT] BY col` | `FROM t FLATTEN LIST BY col` |
+ | `FLATTEN [LIST\|DICT\|OPTIONAL] BY ...` / `FLATTEN COLUMNS` | `FROM t FLATTEN LIST BY col AS item`, `FROM t FLATTEN BY (a, b)`, `FROM t FLATTEN COLUMNS` |
  | `Optional<T>` / `T?` | `CAST(x AS Optional<Utf8>)` |
  | Container types | `CAST(x AS List<Int32>)`, `Dict<Utf8, Int64>`, `Set<Utf8>`, `Tuple<Int32, Utf8>` |
  | `ASSUME ORDER BY` | `SELECT * FROM t ASSUME ORDER BY id` |
+ | `GROUP BY expr AS alias` / `GROUP COMPACT BY` | `SELECT v, COUNT(*) FROM t GROUP BY v AS v` |
+ | `LEFT ONLY JOIN` | `SELECT * FROM a LEFT ONLY JOIN b USING (id)` |
+ | `* WITHOUT (...)` projections | `SELECT b.* WITHOUT (b.id) FROM t AS b` |
  | Named expressions | `$t = (SELECT 1 AS x)` |
+ | Lambda expressions | `($x, $y?) -> ($x + COALESCE($y, 0))`, `($y) -> { $p = "x"; RETURN $p \|\| $y }` |
+ | YQL struct literals | `AsList(<|user_id: "u1", description: NULL|>)` |
+ | `IN COMPACT` | `WHERE key IN COMPACT $values` |
  | `PRAGMA` | `PRAGMA AnsiImplicitCrossJoin` |
+ | Table-valued functions | `SELECT * FROM AS_TABLE($Input) AS k` |
+ | Table source options and index views | ``FROM `t` WITH TabletId='...'``, ``FROM `t` VIEW PRIMARY KEY v`` |
+ | Function-valued expressions | `$grep(x)`, `DateTime::Format("%Y-%m-%d")(ts)`, `Interval("P7D")` |
 
  Table names without backticks are accepted on input; the generator always produces backtick-quoted output.
 
+ The parser also tolerates case variants that appear in real YQL dumps, such as
+ `set<Utf8>`, `Tuple<Int32, Utf8>?`, and lowercase `return` in lambda blocks.
+
  #### CTEs reassembly
 
  YDB-style named expressions are automatically reassembled into standard `WITH` CTEs when targeting other dialects:
@@ -179,6 +216,7 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  | `INTERVAL n HOUR` (literal) | `DateTime::IntervalFromHours(n)` |
  | `INTERVAL n MINUTE` (literal) | `DateTime::IntervalFromMinutes(n)` |
  | `INTERVAL n SECOND` (literal) | `DateTime::IntervalFromSeconds(n)` |
+ | `Interval("P7D")` (YQL input) | passed through unchanged |
  | `dateDiff('minute', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 60000000` |
  | `dateDiff('hour', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000` |
  | `dateDiff('day', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 86400000000` |
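The divisors in the `dateDiff` rows above are consistent with timestamps that become microsecond counts when cast to `Int64`: each divisor is the number of seconds in the unit times 1,000,000. The following sketch reproduces that arithmetic; the function names are hypothetical and the microsecond interpretation is an assumption inferred from the divisors, not something the diff states.

```python
# Assumption: CAST(ts AS Int64) yields microseconds, which matches the divisors shown.
MICROSECONDS_PER_SECOND = 1_000_000
SECONDS_PER_UNIT = {"minute": 60, "hour": 3600, "day": 86400}


def datediff_divisor(unit):
    """Return the number of microseconds in one unit."""
    return SECONDS_PER_UNIT[unit] * MICROSECONDS_PER_SECOND


def datediff_yql(unit, a, b):
    """Build the YQL expression shown in the table for dateDiff(unit, a, b)."""
    return f"(CAST({b} AS Int64) - CAST({a} AS Int64)) / {datediff_divisor(unit)}"


for unit in SECONDS_PER_UNIT:
    print(unit, datediff_yql(unit, "a", "b"))
```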
@@ -204,11 +242,34 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  |---|---|
  | `ARRAY(v1, v2, ...)` | `AsList(v1, v2, ...)` |
  | `ARRAY_LENGTH(x)` / `ARRAY_SIZE(x)` | `ListLength(x)` |
- | `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> {RETURN cond})` |
- | `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> {RETURN cond}))` |
+ | `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> (cond))` |
+ | `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> (cond)))` |
  | `ARRAY_AGG(x)` | `AGGREGATE_LIST(x)` |
  | `UNNEST(x)` | `FLATTEN BY x` |
 
+ Lambda expressions are represented with sqlglot's standard `exp.Lambda` AST node.
+ When a source dialect parses lambdas, the YDB generator emits YQL lambda syntax:
+
+ ```sql
+ -- DuckDB input
+ SELECT list_filter(arr, x -> x > 0) FROM t
+
+ -- YDB output
+ SELECT ListFilter(arr, ($x) -> ($x > 0)) FROM `t`
+ ```
+
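The change from `($x) -> {RETURN cond}` to `($x) -> (cond)` amounts to prefixing each lambda parameter with `$` and wrapping the body in parentheses. A minimal string-level sketch of that shape follows; `to_yql_lambda` is a hypothetical name, and the real generator works on the `exp.Lambda` node rather than with regular expressions, so this naive version would also rewrite parameter names that appear inside string literals.

```python
import re


def to_yql_lambda(params, body):
    """Rewrite a lambda such as `x -> x > 0` into the YQL form `($x) -> ($x > 0)`.

    params: parameter names without the `$` prefix.
    body:   the lambda body as written in the source dialect.
    """
    for name in params:
        # Match the bare parameter only: not inside a longer identifier,
        # and not already prefixed with `$`.
        pattern = rf"(?<![\w$]){re.escape(name)}(?!\w)"
        body = re.sub(pattern, f"${name}", body)
    args = ", ".join(f"${name}" for name in params)
    return f"({args}) -> ({body})"


print(to_yql_lambda(["x"], "x > 0"))
```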
+ YDB input also supports documented YQL lambda forms, including optional
+ arguments and block bodies with local named expressions:
+
+ ```sql
+ ($x, $y?) -> ($x + COALESCE($y, 0));
+ ($y) -> { $prefix = "x"; RETURN $prefix || $y; };
+ ```
+
+ ClickHouse `ARRAY JOIN` and simple `arrayJoin(...)` projections, and PostgreSQL
+ `LATERAL unnest(...)`, are converted to YDB `FLATTEN BY` when the operation is
+ directly tied to the source table.
+
  ### Conditional / math
 
  | Input | YQL output |
@@ -223,6 +284,18 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  |---|---|
  | `jsonb_col @> value` (PostgreSQL) | `Yson::Contains(jsonb_col, value)` |
 
+ YDB JSON functions are parsed and round-tripped, including `PASSING`,
+ `RETURNING`, wrapper modes, and `ON EMPTY` / `ON ERROR` clauses:
+
+ ```sql
+ JSON_VALUE(payload, '$.value + $delta' PASSING 1 AS delta RETURNING Int64 DEFAULT 0 ON EMPTY ERROR ON ERROR)
+ JSON_QUERY(payload, '$.items' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY ERROR ON ERROR)
+ JSON_EXISTS(payload, '$.items[$Index]' PASSING 0 AS "Index" FALSE ON ERROR)
+ ```
+
+ JSON paths can contain quoted keys, for example
+ `JSON_EXISTS(item_result, "$.'P_008 device playback test'")`.
+
  ---
 
  ## Type mapping
@@ -75,6 +75,31 @@ LEFT JOIN (
 
  The same rewriting applies to `EXISTS`, `IN (subquery)`, and `ANY/ALL` subqueries.
 
+ #### GROUP BY aliases
+
+ YDB accepts aliases directly inside `GROUP BY` items. The generator uses this
+ form for grouped columns so later clauses and decorrelated subqueries can refer
+ to a stable grouping name:
+
+ ```sql
+ -- input
+ SELECT user_id, COUNT(*) FROM events GROUP BY user_id
+
+ -- output
+ SELECT user_id, COUNT(*) FROM `events` GROUP BY user_id AS user_id
+ ```
+
+ If a grouped column is selected under a generated alias, the `GROUP BY` item uses
+ that alias as well:
+
+ ```sql
+ SELECT user_id AS _u_1, COUNT(*) FROM `events` GROUP BY user_id AS _u_1
+ ```
+
+ Positional `GROUP BY` references are expanded before generation. When a
+ positional reference points to a constant expression, the grouping item is
+ removed because YDB rejects grouping by constants.
+
  ---
 
  ### YDB → any SQL
@@ -88,15 +113,27 @@ The plugin parses YDB/YQL back into sqlglot's AST, enabling round-trips, YDB-to-
  | `$variable` references | `SELECT * FROM $t AS t` |
  | `Module::Function()` | `DateTime::GetYear(ts)` |
  | `DECLARE $p AS Type` | `DECLARE $p AS Int32` |
- | `FLATTEN [LIST\|DICT] BY col` | `FROM t FLATTEN LIST BY col` |
+ | `FLATTEN [LIST\|DICT\|OPTIONAL] BY ...` / `FLATTEN COLUMNS` | `FROM t FLATTEN LIST BY col AS item`, `FROM t FLATTEN BY (a, b)`, `FROM t FLATTEN COLUMNS` |
  | `Optional<T>` / `T?` | `CAST(x AS Optional<Utf8>)` |
  | Container types | `CAST(x AS List<Int32>)`, `Dict<Utf8, Int64>`, `Set<Utf8>`, `Tuple<Int32, Utf8>` |
  | `ASSUME ORDER BY` | `SELECT * FROM t ASSUME ORDER BY id` |
+ | `GROUP BY expr AS alias` / `GROUP COMPACT BY` | `SELECT v, COUNT(*) FROM t GROUP BY v AS v` |
+ | `LEFT ONLY JOIN` | `SELECT * FROM a LEFT ONLY JOIN b USING (id)` |
+ | `* WITHOUT (...)` projections | `SELECT b.* WITHOUT (b.id) FROM t AS b` |
  | Named expressions | `$t = (SELECT 1 AS x)` |
+ | Lambda expressions | `($x, $y?) -> ($x + COALESCE($y, 0))`, `($y) -> { $p = "x"; RETURN $p \|\| $y }` |
+ | YQL struct literals | `AsList(<|user_id: "u1", description: NULL|>)` |
+ | `IN COMPACT` | `WHERE key IN COMPACT $values` |
  | `PRAGMA` | `PRAGMA AnsiImplicitCrossJoin` |
+ | Table-valued functions | `SELECT * FROM AS_TABLE($Input) AS k` |
+ | Table source options and index views | ``FROM `t` WITH TabletId='...'``, ``FROM `t` VIEW PRIMARY KEY v`` |
+ | Function-valued expressions | `$grep(x)`, `DateTime::Format("%Y-%m-%d")(ts)`, `Interval("P7D")` |
 
  Table names without backticks are accepted on input; the generator always produces backtick-quoted output.
 
+ The parser also tolerates case variants that appear in real YQL dumps, such as
+ `set<Utf8>`, `Tuple<Int32, Utf8>?`, and lowercase `return` in lambda blocks.
+
  #### CTEs reassembly
 
  YDB-style named expressions are automatically reassembled into standard `WITH` CTEs when targeting other dialects:
@@ -152,6 +189,7 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  | `INTERVAL n HOUR` (literal) | `DateTime::IntervalFromHours(n)` |
  | `INTERVAL n MINUTE` (literal) | `DateTime::IntervalFromMinutes(n)` |
  | `INTERVAL n SECOND` (literal) | `DateTime::IntervalFromSeconds(n)` |
+ | `Interval("P7D")` (YQL input) | passed through unchanged |
  | `dateDiff('minute', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 60000000` |
  | `dateDiff('hour', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000` |
  | `dateDiff('day', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 86400000000` |
@@ -177,11 +215,34 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  |---|---|
  | `ARRAY(v1, v2, ...)` | `AsList(v1, v2, ...)` |
  | `ARRAY_LENGTH(x)` / `ARRAY_SIZE(x)` | `ListLength(x)` |
- | `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> {RETURN cond})` |
- | `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> {RETURN cond}))` |
+ | `ARRAY_FILTER(arr, x -> cond)` | `ListFilter(arr, ($x) -> (cond))` |
+ | `ARRAY_ANY(arr, x -> cond)` | `ListHasItems(ListFilter(arr, ($x) -> (cond)))` |
  | `ARRAY_AGG(x)` | `AGGREGATE_LIST(x)` |
  | `UNNEST(x)` | `FLATTEN BY x` |
 
+ Lambda expressions are represented with sqlglot's standard `exp.Lambda` AST node.
+ When a source dialect parses lambdas, the YDB generator emits YQL lambda syntax:
+
+ ```sql
+ -- DuckDB input
+ SELECT list_filter(arr, x -> x > 0) FROM t
+
+ -- YDB output
+ SELECT ListFilter(arr, ($x) -> ($x > 0)) FROM `t`
+ ```
+
+ YDB input also supports documented YQL lambda forms, including optional
+ arguments and block bodies with local named expressions:
+
+ ```sql
+ ($x, $y?) -> ($x + COALESCE($y, 0));
+ ($y) -> { $prefix = "x"; RETURN $prefix || $y; };
+ ```
+
+ ClickHouse `ARRAY JOIN` and simple `arrayJoin(...)` projections, and PostgreSQL
+ `LATERAL unnest(...)`, are converted to YDB `FLATTEN BY` when the operation is
+ directly tied to the source table.
+
  ### Conditional / math
 
  | Input | YQL output |
@@ -196,6 +257,18 @@ Functions below are recognized by sqlglot as standard SQL expressions and transl
  |---|---|
  | `jsonb_col @> value` (PostgreSQL) | `Yson::Contains(jsonb_col, value)` |
 
+ YDB JSON functions are parsed and round-tripped, including `PASSING`,
+ `RETURNING`, wrapper modes, and `ON EMPTY` / `ON ERROR` clauses:
+
+ ```sql
+ JSON_VALUE(payload, '$.value + $delta' PASSING 1 AS delta RETURNING Int64 DEFAULT 0 ON EMPTY ERROR ON ERROR)
+ JSON_QUERY(payload, '$.items' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY ERROR ON ERROR)
+ JSON_EXISTS(payload, '$.items[$Index]' PASSING 0 AS "Index" FALSE ON ERROR)
+ ```
+
+ JSON paths can contain quoted keys, for example
+ `JSON_EXISTS(item_result, "$.'P_008 device playback test'")`.
+
  ---
 
  ## Type mapping
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
  [project]
  name = "ydb-sqlglot-plugin"
- version = "0.2.2" # AUTOVERSION
+ version = "0.2.4" # AUTOVERSION
  description = "YDB dialect plugin for sqlglot"
  readme = "README.md"
  license = {text = "Apache-2.0"}
@@ -0,0 +1 @@
+ VERSION = "0.2.4"