@clickzetta/cz-cli-darwin-x64 0.3.89 → 0.3.91

Files changed (27)
  1. package/bin/cz-cli +0 -0
  2. package/bin/skills/clickzetta-dynamic-table/SKILL.md +169 -169
  3. package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +126 -126
  4. package/bin/skills/clickzetta-dynamic-table/best-practices/medallion-and-stream-patterns.md +25 -25
  5. package/bin/skills/clickzetta-dynamic-table/best-practices/non-partitioned-merge-into-warning.md +48 -48
  6. package/bin/skills/clickzetta-dynamic-table/best-practices/performance-optimization.md +51 -51
  7. package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +59 -59
  8. package/bin/skills/clickzetta-dynamic-table/dt-creator/SKILL.md +8 -7
  9. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +99 -99
  10. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +188 -188
  11. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +117 -117
  12. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/sql-limitations.md +29 -29
  13. package/bin/skills/clickzetta-dynamic-table/dynamic-table-alter/SKILL.md +80 -79
  14. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +15 -15
  15. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +61 -61
  16. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +100 -100
  17. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +64 -64
  18. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +32 -32
  19. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +21 -21
  20. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +71 -71
  21. package/bin/skills/clickzetta-sql-pipeline-manager/SKILL.md +203 -202
  22. package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md +62 -62
  23. package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md +34 -34
  24. package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md +61 -61
  25. package/bin/skills/clickzetta-sql-pipeline-manager/references/table-stream.md +41 -41
  26. package/bin/skills/clickzetta-table-stream-pipeline/SKILL.md +103 -101
  27. package/package.json +1 -1
@@ -1,160 +1,161 @@
1
1
  ---
2
2
  name: dynamic-table-alter
3
3
  description: |
4
- 修改 ClickZetta 动态表(Dynamic Table)的结构和属性。支持直接 ALTER 操作(suspend、resume、
5
- rename_column、set_comment、set_column_comment、set/unset properties)以及 CREATE OR REPLACE
6
- 重建操作(修改调度周期、计算集群、加列、减列、改列类型、改 SQL 定义)。当用户说"修改动态表"、
7
- "动态表加列"、"改刷新间隔"、"暂停动态表"时触发。
4
+ Modify the structure and properties of ClickZetta Dynamic Tables. Supports direct ALTER operations
5
+ (suspend, resume, rename_column, set_comment, set_column_comment, set/unset properties) as well as
6
+ CREATE OR REPLACE rebuild operations (modify refresh interval, compute cluster, add column, drop column,
7
+ change column type, change SQL definition). Triggers when the user says "modify dynamic table",
8
+ "add column to dynamic table", "change refresh interval", or "suspend dynamic table".
8
9
  ---
9
10
 
10
- # 动态表修改工作流
11
+ # Dynamic Table Modification Workflow
11
12
 
12
- ## 指令
13
+ ## Instructions
13
14
 
14
- ### 步骤 1:确认动态表存在并获取当前定义
15
- 执行 `SHOW CREATE TABLE schema_name.table_name` 获取动态表当前定义。
16
- 如果不确定是否为动态表,先用 `SHOW TABLES WHERE is_dynamic` 查看列表。
15
+ ### Step 1: Confirm the Dynamic Table Exists and Retrieve Its Current Definition
16
+ Execute `SHOW CREATE TABLE schema_name.table_name` to get the current definition of the Dynamic Table.
17
+ If unsure whether it is a Dynamic Table, first use `SHOW TABLES WHERE is_dynamic` to view the list.
17
18
 
18
- ### 步骤 2:判断操作类型并选择执行方式
19
+ ### Step 2: Determine the Operation Type and Choose the Execution Method
19
20
 
20
- ClickZetta 动态表的修改操作分为两类:
21
+ Dynamic Table modification operations fall into two categories:
21
22
 
22
- **A. 直接 ALTER 操作**(6种,可直接执行):
23
+ **A. Direct ALTER operations** (6 types, can be executed directly):
23
24
 
24
- 1. **suspend** — 暂停调度任务:
25
+ 1. **suspend** — Pause the scheduled refresh:
25
26
  ```sql
26
27
  ALTER DYNAMIC TABLE dt_name SUSPEND;
27
28
  ```
28
29
 
29
- 2. **resume** — 启动调度任务:
30
+ 2. **resume** — Start the scheduled refresh:
30
31
  ```sql
31
32
  ALTER DYNAMIC TABLE dt_name RESUME;
32
33
  ```
33
34
 
34
- 3. **set_comment** — 修改表注释:
35
+ 3. **set_comment** — Modify the table comment:
35
36
  ```sql
36
37
  ALTER DYNAMIC TABLE dt_name SET COMMENT 'comment';
37
38
  ```
38
39
 
39
- 4. **rename_column** — 修改列名:
40
+ 4. **rename_column** — Rename a column:
40
41
  ```sql
41
42
  ALTER DYNAMIC TABLE dt_name RENAME COLUMN old_name TO new_name;
42
43
  ```
43
44
 
44
- 5. **set_column_comment** — 修改列注释(注意用 CHANGE COLUMN):
45
+ 5. **set_column_comment** — Modify a column comment (note: use CHANGE COLUMN):
45
46
  ```sql
46
47
  ALTER DYNAMIC TABLE dt_name CHANGE COLUMN column_name COMMENT 'comment';
47
48
  ```
48
49
 
49
- 6. **set/unset properties** — 修改表属性(目前为保留参数):
50
+ 6. **set/unset properties** — Modify table properties (currently reserved parameters):
50
51
  ```sql
51
- -- 设置属性
52
+ -- Set a property
52
53
  ALTER DYNAMIC TABLE dt_name SET PROPERTIES('key' = 'value');
53
- -- 删除属性
54
+ -- Remove a property
54
55
  ALTER DYNAMIC TABLE dt_name UNSET PROPERTIES('key');
55
56
  ```
56
57
 
57
- **B. CREATE OR REPLACE 操作**(6种,需要重建动态表):
58
+ **B. CREATE OR REPLACE operations** (6 types, require rebuilding the Dynamic Table):
58
59
 
59
- > ⚠️ **以下操作不支持 ALTER 语法**。`ALTER DYNAMIC TABLE ... SET REFRESH INTERVAL` 等语法不存在,会报语法错误。必须使用 `CREATE OR REPLACE DYNAMIC TABLE` 重建。
60
+ > ⚠️ **The following operations do not support ALTER syntax.** Syntax like `ALTER DYNAMIC TABLE ... SET REFRESH INTERVAL` does not exist and will cause a syntax error. You must use `CREATE OR REPLACE DYNAMIC TABLE` to rebuild.
60
61
 
61
- 这些操作涉及 SQL 查询逻辑变化,无法通过 ALTER 直接完成:
62
+ These operations involve changes to SQL query logic and cannot be completed via ALTER directly:
62
63
 
63
- 7. **修改调度周期** — ❌ 不支持 `ALTER ... SET REFRESH INTERVAL`
64
- 8. **修改计算集群** — ❌ 不支持 `ALTER ... SET VCLUSTER`
65
- 9. **增加列**
66
- 10. **减列**
67
- 11. **修改列类型**
68
- 12. **修改 SQL 定义**
64
+ 7. **Modify refresh interval** — ❌ `ALTER ... SET REFRESH INTERVAL` is not supported
65
+ 8. **Modify compute cluster** — ❌ `ALTER ... SET VCLUSTER` is not supported
66
+ 9. **Add column**
67
+ 10. **Drop column**
68
+ 11. **Modify column type**
69
+ 12. **Modify SQL definition**
69
70
 
70
- ### 步骤 3:执行 CREATE OR REPLACE 重建(仅 B 类操作)
71
+ ### Step 3: Execute CREATE OR REPLACE Rebuild (Type B operations only)
71
72
 
72
- 1. 执行 `SHOW CREATE TABLE schema_name.table_name` 获取原始 DDL
73
- > ⚠️ `SHOW CREATE TABLE` 不支持 LIMIT/WHERE 子句,直接执行即可
74
- 2. 解析出:列定义、REFRESH 子句、AS SELECT 子句、COMMENT
75
- 3. 根据操作修改对应部分
76
- 4. 执行重建 SQL
73
+ 1. Execute `SHOW CREATE TABLE schema_name.table_name` to get the original DDL
74
+ > ⚠️ `SHOW CREATE TABLE` does not support LIMIT/WHERE clauses; execute it directly
75
+ 2. Parse out: column definitions, REFRESH clause, AS SELECT clause, COMMENT, etc.
76
+ 3. Modify the relevant parts according to the operation
77
+ 4. Execute the rebuild SQL
77
78
 
78
- **关于全量刷新的触发**:
79
- - 简单的删除列 / 添加列(添加的列只是从源表 SELECT 透传,不参与 JOIN key、GROUP key 等计算)→ **增量刷新**
80
- - 涉及计算逻辑变化(修改 WHERE 条件、修改聚合逻辑、新增列参与计算等)→ **全量刷新**
81
- - 兼容类型变更(如 INT → BIGINT)→ **增量刷新**
79
+ **About full refresh triggers:**
80
+ - Simple drop column / add column (where the added column is simply passed through from the source table without participating in JOIN keys, GROUP keys, or other computations) → **incremental refresh**
81
+ - Changes to computation logic (modifying WHERE conditions, modifying aggregation logic, adding a column that participates in computation, etc.) → **full refresh**
82
+ - Compatible type changes (e.g., INT → BIGINT) → **incremental refresh**
82
83
 
83
- ### 步骤 4:验证修改结果
84
- 使用 `DESC TABLE dt_name` 确认修改生效。
84
+ ### Step 4: Verify the Modification
85
+ Use `DESC TABLE dt_name` to confirm the modification took effect.
85
86
 
86
87
  ---
87
88
 
88
- ## 示例
89
+ ## Examples
89
90
 
90
- ### 示例 1:修改调度周期
91
+ ### Example 1: Modify Refresh Interval
91
92
 
92
93
  ```sql
93
- -- 原表
94
+ -- Original table
94
95
  CREATE DYNAMIC TABLE dt_name
95
96
  REFRESH INTERVAL 10 MINUTE vcluster DEFAULT
96
97
  AS SELECT * FROM student02;
97
98
 
98
- -- 修改后(改为 20 分钟)
99
+ -- After modification (changed to 20 minutes)
99
100
  CREATE OR REPLACE DYNAMIC TABLE dt_name
100
101
  REFRESH INTERVAL 20 MINUTE vcluster DEFAULT
101
102
  AS SELECT * FROM student02;
102
103
  ```
103
104
 
104
- ### 示例 2:修改计算集群
105
+ ### Example 2: Modify Compute Cluster
105
106
 
106
107
  ```sql
107
- -- 原表
108
+ -- Original table
108
109
  CREATE DYNAMIC TABLE dt_name
109
110
  REFRESH INTERVAL 10 MINUTE vcluster DEFAULT
110
111
  AS SELECT * FROM student02;
111
112
 
112
- -- 修改后(改为 alter_vc 集群)
113
+ -- After modification (changed to alter_vc cluster)
113
114
  CREATE OR REPLACE DYNAMIC TABLE dt_name
114
115
  REFRESH INTERVAL 10 MINUTE vcluster alter_vc
115
116
  AS SELECT * FROM student02;
116
117
  ```
117
118
 
118
- ### 示例 3:增加列
119
+ ### Example 3: Add Column
119
120
 
120
121
  ```sql
121
- -- 原表
122
+ -- Original table
122
123
  CREATE DYNAMIC TABLE change_table (i, j)
123
124
  AS SELECT * FROM dy_base_a;
124
125
 
125
- -- 添加一列 col(涉及计算逻辑,下次刷新会全量刷新)
126
+ -- Add column col (involves computation logic; next refresh will be a full refresh)
126
127
  CREATE OR REPLACE DYNAMIC TABLE change_table (i, j, col)
127
128
  AS SELECT i, j, j * 1 FROM dy_base_a;
128
129
 
129
130
  REFRESH DYNAMIC TABLE change_table;
130
131
  ```
131
132
 
132
- ### 示例 4:减列
133
+ ### Example 4: Drop Column
133
134
 
134
135
  ```sql
135
- -- 原表有 i, j 两列
136
+ -- Original table has columns i, j
136
137
  CREATE DYNAMIC TABLE change_table (i, j)
137
138
  AS SELECT * FROM dy_base_a;
138
139
 
139
- -- 减列(简单透传,增量刷新)
140
+ -- Drop column (simple pass-through; incremental refresh)
140
141
  CREATE OR REPLACE DYNAMIC TABLE change_table (i)
141
142
  AS SELECT i FROM dy_base_a;
142
143
  ```
143
144
 
144
- ### 示例 5:修改 SQL 定义
145
+ ### Example 5: Modify SQL Definition
145
146
 
146
147
  ```sql
147
- -- 修改 WHERE 过滤条件(全量刷新)
148
+ -- Modify WHERE filter condition (full refresh)
148
149
  CREATE OR REPLACE DYNAMIC TABLE change_table (i, j)
149
150
  AS SELECT * FROM dy_base_a WHERE i > 3;
150
151
 
151
152
  REFRESH DYNAMIC TABLE change_table;
152
153
  ```
153
154
 
154
- ### 示例 6:修改列类型
155
+ ### Example 6: Modify Column Type
155
156
 
156
157
  ```sql
157
- -- INT → BIGINT(兼容类型,增量刷新)
158
+ -- INT → BIGINT (compatible type; incremental refresh)
158
159
  CREATE OR REPLACE DYNAMIC TABLE change_table (i, j)
159
160
  AS SELECT CAST(i AS BIGINT), j FROM dy_base_a;
160
161
 
@@ -163,28 +164,28 @@ REFRESH DYNAMIC TABLE change_table;
163
164
 
164
165
  ---
165
166
 
166
- ## 平台特有知识
167
+ ## Platform-Specific Knowledge
167
168
 
168
- - **CHANGE COLUMN 语法**:设置列注释用 `CHANGE COLUMN col COMMENT 'xxx'`,不是 `ALTER COLUMN`
169
- - **RENAME COLUMN 语法**:`RENAME COLUMN old TO new`
170
- - **DML 限制**:动态表默认不支持 UPDATE/DELETE/MERGE(因隐藏列 MV__KEY),如需 DML 须先执行 `SET cz.sql.dt.allow.dml = true;`
171
- - **REFRESH 格式**:`REFRESH INTERVAL <N> MINUTE vcluster <name>`,支持 SECOND/MINUTE/HOUR/DAY
172
- - **CREATE OR REPLACE 风险**:涉及计算逻辑变化时会触发全量刷新,大表可能耗时较长
173
- - **schema 前缀**:所有 ALTER/CREATE 语句中表名应包含 schema 前缀
174
- - **列定义可省略类型**:`CREATE DYNAMIC TABLE dt (i, j) AS SELECT ...` 类型由 SELECT 推断
175
- - **DROP 语法**:必须用 `DROP DYNAMIC TABLE dt_name`,不能用 `DROP TABLE dt_name`(会报错)
176
- - **UNDROP 语法**:必须用 `UNDROP TABLE dt_name`,不能用 `UNDROP DYNAMIC TABLE dt_name`
177
- - **DESC 语法**:动态表用 `DESC TABLE dt_name`,不要写 `DESC DYNAMIC TABLE dt_name EXTENDED`(EXTENDED 不支持)
169
+ - **CHANGE COLUMN syntax**: set a column comment with `CHANGE COLUMN col COMMENT 'xxx'`, not `ALTER COLUMN`
170
+ - **RENAME COLUMN syntax**: `RENAME COLUMN old TO new`
171
+ - **DML restrictions**: Dynamic Tables do not support UPDATE/DELETE/MERGE by default (due to hidden column MV__KEY); to use DML, first execute `SET cz.sql.dt.allow.dml = true;`
172
+ - **REFRESH format**: `REFRESH INTERVAL <N> MINUTE vcluster <name>`, supports SECOND/MINUTE/HOUR/DAY
173
+ - **CREATE OR REPLACE risk**: changes involving computation logic will trigger a full refresh, which may take a long time for large tables
174
+ - **Schema prefix**: all ALTER/CREATE statements should include the schema prefix in the table name
175
+ - **Column definitions can omit types**: `CREATE DYNAMIC TABLE dt (i, j) AS SELECT ...` types are inferred from SELECT
176
+ - **DROP syntax**: must use `DROP DYNAMIC TABLE dt_name`; `DROP TABLE dt_name` will cause an error
177
+ - **UNDROP syntax**: must use `UNDROP TABLE dt_name`; `UNDROP DYNAMIC TABLE dt_name` is not supported
178
+ - **DESC syntax**: use `DESC TABLE dt_name` for Dynamic Tables; do not write `DESC DYNAMIC TABLE dt_name EXTENDED` (EXTENDED is not supported)
178
179
 
179
- ## 故障排除
180
+ ## Troubleshooting
180
181
 
181
- | 错误 | 原因 | 解决方案 |
182
+ | Error | Cause | Solution |
182
183
  |---|---|---|
183
- | ALTER "Syntax error at or near 'REFRESH'" | `ALTER ... SET REFRESH INTERVAL` 语法不存在 | 使用 `CREATE OR REPLACE DYNAMIC TABLE ... REFRESH INTERVAL ...` 重建 |
184
- | ALTER "unsupported operation" | 尝试对动态表执行 B 类操作的 ALTER 语法 | 使用 CREATE OR REPLACE 重建 |
185
- | `DROP TABLE dt_name` 报错 | 动态表必须用 `DROP DYNAMIC TABLE` | 改为 `DROP DYNAMIC TABLE dt_name` |
186
- | `UNDROP DYNAMIC TABLE` 报错 | UNDROP 不支持 DYNAMIC TABLE 关键字 | 改为 `UNDROP TABLE dt_name` |
187
- | `DESC DYNAMIC TABLE ... EXTENDED` 报错 | 不支持 EXTENDED 参数 | 改为 `DESC TABLE dt_name`(不加 EXTENDED) |
188
- | UPDATE/DELETE "MV__KEY" 相关错误 | 动态表有隐藏列 MV__KEY,默认禁止 DML | 先执行 `SET cz.sql.dt.allow.dml = true;` |
189
- | CREATE OR REPLACE 后数据为空 | AS SELECT 子句引用的源表或列不正确 | 先验证 SELECT 子句是否返回数据 |
190
- | CREATE OR REPLACE 后全量刷新 | 新增列参与了计算逻辑(JOIN key、GROUP key 等) | 预期行为,等待全量刷新完成 |
184
+ | ALTER reports "Syntax error at or near 'REFRESH'" | `ALTER ... SET REFRESH INTERVAL` syntax does not exist | Use `CREATE OR REPLACE DYNAMIC TABLE ... REFRESH INTERVAL ...` to rebuild |
185
+ | ALTER reports "unsupported operation" | Attempted to run a Type B operation with ALTER syntax on a Dynamic Table | Use CREATE OR REPLACE to rebuild |
186
+ | `DROP TABLE dt_name` fails | Dynamic Tables must use `DROP DYNAMIC TABLE` | Change to `DROP DYNAMIC TABLE dt_name` |
187
+ | `UNDROP DYNAMIC TABLE` fails | UNDROP does not support the DYNAMIC TABLE keyword | Change to `UNDROP TABLE dt_name` |
188
+ | `DESC DYNAMIC TABLE ... EXTENDED` fails | EXTENDED parameter is not supported | Change to `DESC TABLE dt_name` (without EXTENDED) |
189
+ | UPDATE/DELETE reports "MV__KEY" related error | Dynamic Tables have a hidden column MV__KEY; DML is disabled by default | First execute `SET cz.sql.dt.allow.dml = true;` |
190
+ | Data is empty after CREATE OR REPLACE | The AS SELECT clause references an incorrect source table or column | Verify that the SELECT clause returns data first |
191
+ | Full refresh triggered after CREATE OR REPLACE | The new column participates in computation logic (JOIN key, GROUP key, etc.) | Expected behavior; wait for the full refresh to complete |
@@ -1,27 +1,27 @@
1
1
  ---
2
2
  name: sql-to-dt
3
- description: Hive/Spark 等任意批处理系统的 CREATE TABLE DDL + INSERT OVERWRITE SQL 自动转换为 Dynamic Table DDL 及配套文件(refresh、prev_refresh、backfill)。当用户提供 DDL INSERT OVERWRITE 要求转换为 DT 时触发,或用户说"创建动态表"时主动引导提供输入。Triggers on: "转换DT", "sql to dt", "convert to dynamic table", "INSERT OVERWRITE DT", "DDL 转换", "创建动态表"
3
+ description: Automatically converts CREATE TABLE DDL + INSERT OVERWRITE SQL from Hive/Spark or any batch processing system into Dynamic Table DDL and companion files (refresh, prev_refresh, backfill). Triggers when the user provides a DDL and an INSERT OVERWRITE statement and asks to convert them to a DT; when the user says "create dynamic table", proactively guide them to provide these inputs. Triggers on: "convert to DT", "sql to dt", "convert to dynamic table", "INSERT OVERWRITE to DT", "DDL conversion", "create dynamic table"
4
4
  ---
5
5
 
6
- # SQL → Dynamic Table 自动转换
6
+ # SQL → Dynamic Table Automatic Conversion
7
7
 
8
- Hive/Spark 等任意批处理系统的 ETL SQL(CREATE TABLE + INSERT OVERWRITE)转换为 Dynamic Table DDL 及配套运维文件。
8
+ Converts ETL SQL (CREATE TABLE + INSERT OVERWRITE) from Hive/Spark or any batch processing system into Dynamic Table DDL and companion operation files.
9
9
 
10
- ## 使用方式
10
+ ## Usage
11
11
 
12
- 提供以下输入:
13
- 1. CREATE TABLE DDL(表结构定义)
14
- 2. INSERT OVERWRITE SQL(ETL 查询逻辑)
12
+ Provide the following inputs:
13
+ 1. CREATE TABLE DDL (table structure definition)
14
+ 2. INSERT OVERWRITE SQL (ETL query logic)
15
15
 
16
- 转换工具会自动完成:占位符替换、自引用检测、核心转换、列校验、配套文件生成、转换后改进建议。
16
+ The conversion tool will automatically handle: placeholder replacement, self-reference detection, core conversion, column validation, companion file generation, and post-conversion improvement suggestions.
17
17
 
18
- 详细工作流参见 #[[file:references/sql2dt-workflow.md]]
18
+ For the detailed workflow, see #[[file:references/sql2dt-workflow.md]]
19
19
 
20
20
  ## references/
21
21
 
22
- - **sql2dt-workflow.md** — 完整转换工作流(6 步:预处理、占位符替换、自引用检测、核心转换、列校验、配套文件生成)
23
- - **sql2dt-conversion-rules.md** — 核心 DDL 转换规则(解析 DDL、解析 INSERT、组装 DT DDL、静态分区注入)
24
- - **sql2dt-placeholder-rules.md** — 占位符替换规则(${var} → SESSION_CONFIGS())
25
- - **sql2dt-self-reference-rules.md** — 自引用表转换规则
26
- - **sql2dt-column-validation-rules.md** — 列校验规则(schema 列数 = SELECT 列数)
27
- - **sql2dt-refresh-rules.md** — Refresh 与调度文件生成规则
22
+ - **sql2dt-workflow.md** — Complete conversion workflow (6 steps: pre-processing, placeholder replacement, self-reference detection, core conversion, column validation, companion file generation)
23
+ - **sql2dt-conversion-rules.md** — Core DDL conversion rules (parse DDL, parse INSERT, assemble DT DDL, static partition injection)
24
+ - **sql2dt-placeholder-rules.md** — Placeholder replacement rules (${var} → SESSION_CONFIGS())
25
+ - **sql2dt-self-reference-rules.md** — Self-referencing table conversion rules
26
+ - **sql2dt-column-validation-rules.md** — Column validation rules (schema column count = SELECT column count)
27
+ - **sql2dt-refresh-rules.md** — Refresh and scheduling file generation rules
@@ -1,118 +1,118 @@
1
- # Dynamic Table 列校验与一致性规则
1
+ # Dynamic Table Column Validation and Consistency Rules
2
2
 
3
- 你是一个 SQL 转换专家。在生成 Dynamic Table DDL 后,需要校验 schema 定义的列与 SELECT 查询产出的列是否一致。
3
+ You are a SQL conversion expert. After generating the Dynamic Table DDL, you need to validate whether the columns defined in the schema match the columns produced by the SELECT query.
4
4
 
5
- ## 列数校验(必须通过)
5
+ ## Column Count Validation (Must Pass)
6
6
 
7
- ### 规则
7
+ ### Rule
8
8
 
9
- 生成的 DDL 中,括号内定义的列数必须等于 AS 后面 SELECT 查询产出的列数。
9
+ The number of columns defined in the parentheses of the generated DDL must equal the number of columns produced by the SELECT query after AS.
10
10
 
11
11
  ```sql
12
12
  CREATE OR REPLACE DYNAMIC TABLE t (
13
13
  col1 BIGINT, -- 1
14
14
  col2 STRING, -- 2
15
- dt STRING -- 3 → schema 列数 = 3
15
+ dt STRING -- 3 → schema column count = 3
16
16
  )
17
17
  AS
18
- SELECT col1, col2, '2024-01-01' AS dt -- → SELECT 列数 = 3 ✓
18
+ SELECT col1, col2, '2024-01-01' AS dt -- → SELECT column count = 3 ✓
19
19
  FROM source;
20
20
  ```
21
21
 
22
- ### SELECT 列数计算
22
+ ### Counting SELECT Columns
23
23
 
24
- 1. 找到 AS 后面的 SELECT 子句
25
- 2. 找到顶层 FROM(不在子查询/括号内的 FROM)
26
- 3. 计算 SELECT FROM 之间的顶层逗号数 + 1 = 列数
27
- 4. 顶层逗号:不在括号 `()`、方括号 `[]`、引号 `''`/`""` 内的逗号
24
+ 1. Find the SELECT clause after AS
25
+ 2. Find the top-level FROM (not inside a subquery/parentheses)
26
+ 3. Column count = number of top-level commas between SELECT and FROM, plus 1
27
+ 4. Top-level commas: commas not inside parentheses `()`, square brackets `[]`, or quotes `''`/`""`
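
The counting steps above can be sketched in Python as follows. This is a minimal illustration rather than the converter's actual implementation: the names `split_top_level`, `select_list`, and `count_select_columns` are hypothetical, and the sketch assumes the input is the query text that follows `AS`, with no SQL comments and no escaped quotes inside string literals.

```python
import re


def _is_word_char(ch):
    return ch.isalnum() or ch == "_"


def split_top_level(text):
    """Split on commas that are not inside (), [], or quotes."""
    parts, buf = [], []
    depth = 0      # nesting depth of () and []
    quote = None   # currently open quote character, if any
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append("".join(buf).strip())
            buf = []
            continue
        buf.append(ch)
    parts.append("".join(buf).strip())
    return parts


def select_list(query):
    """Return the text between the first SELECT and its top-level FROM."""
    start = re.search(r"\bselect\b", query, re.IGNORECASE).end()
    depth, quote = 0, None
    for i in range(start, len(query)):
        ch = query[i]
        if quote:
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif (depth == 0
              and query[i:i + 4].lower() == "from"
              and not _is_word_char(query[i - 1])
              and (i + 4 >= len(query) or not _is_word_char(query[i + 4]))):
            return query[start:i]
    return query[start:]


def count_select_columns(query):
    """Column count = top-level commas in the SELECT list + 1."""
    return len(split_top_level(select_list(query)))
```

For a query such as `SELECT a, func(b, c), (SELECT max(x) FROM t) AS m FROM src`, the commas inside `func(...)` and the `FROM` inside the subquery are ignored because they sit at a nesting depth greater than zero.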
28
28
 
29
- ### UNION ALL 的列数
29
+ ### Column Count for UNION ALL
30
30
 
31
- 取第一个分支的列数(所有分支列数应一致)。
31
+ Use the column count of the first branch (all branches should have the same column count).
32
32
 
33
- ### 校验失败
33
+ ### Validation Failure
34
34
 
35
- 如果 schema 列数 ≠ SELECT 列数,转换失败,报错:
35
+ If schema column count ≠ SELECT column count, the conversion fails with the error:
36
36
  ```
37
- Schema列数(N) != SELECT列数(M)
37
+ Schema column count (N) != SELECT column count (M)
38
38
  ```
39
39
 
40
- ## 列名校验(可选)
40
+ ## Column Name Validation (Optional)
41
41
 
42
- ### 规则
42
+ ### Rule
43
43
 
44
- 逐位对比 schema 中的列名与 SELECT 中推断出的别名。建议在列数校验通过后,如果 SELECT 中大部分列都有明确别名(AS 或裸标识符),开启列名校验做二次确认。
44
+ Compare schema column names against the aliases inferred from SELECT, position by position. Once column count validation passes, enabling column name validation as a second check is recommended when most SELECT columns have an explicit alias (AS or a bare identifier).
45
45
 
46
- ### SELECT 列别名推断
46
+ ### Inferring SELECT Column Aliases
47
47
 
48
- 按优先级从高到低:
48
+ In order of priority from high to low:
49
49
 
50
- 1. **AS 别名**:`expression AS alias` → 别名为 `alias`
51
- 2. **末尾标识符**:`table.column` → 别名为 `column`
52
- 3. **裸标识符**:`column_name` → 别名为 `column_name`
53
- 4. **无法推断**:`func(a, b)` 没有 AS → 标记为 `<expr>`,跳过校验
50
+ 1. **AS alias**: `expression AS alias` → alias is `alias`
51
+ 2. **Trailing identifier**: `table.column` → alias is `column`
52
+ 3. **Bare identifier**: `column_name` → alias is `column_name`
53
+ 4. **Cannot infer**: `func(a, b)` without AS → mark as `<expr>`, skip validation
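
The priority order above could be implemented roughly as below. This is a hedged sketch: `infer_alias` is a hypothetical name, the identifier pattern is an assumption (ASCII letters, digits, underscore, optional backticks around an `AS` alias), and real SQL dialects allow more forms than this handles.

```python
import re

# Assumed identifier shape; quoted or non-ASCII identifiers are not covered.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"


def infer_alias(expr):
    """Infer the output column name of one SELECT expression."""
    expr = expr.strip()
    # 1. Explicit alias at the end of the expression: "<expression> AS alias"
    m = re.search(rf"\bAS\s+`?({IDENT})`?$", expr, re.IGNORECASE)
    if m:
        return m.group(1)
    # 2. Qualified name: "table.column" -> "column"
    m = re.fullmatch(rf"(?:{IDENT}\.)+({IDENT})", expr)
    if m:
        return m.group(1)
    # 3. Bare identifier: "column_name" -> "column_name"
    if re.fullmatch(IDENT, expr):
        return expr
    # 4. Anything else (function call, arithmetic, literal) cannot be inferred
    return "<expr>"
```

Anchoring the `AS` match at the end of the expression keeps `CAST(i AS BIGINT)` from being misread as an alias named `BIGINT`; it falls through to `<expr>` instead.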
54
54
 
55
- ### 对比规则
55
+ ### Comparison Rules
56
56
 
57
- - 逐位对比(第1列对第1列,第2列对第2列...)
58
- - 如果某位是 `<expr>`(无法推断),跳过该位
59
- - 对比不区分大小写
60
- - 不匹配时报错并列出具体不对齐的列
57
+ - Compare position by position (1st column vs 1st column, 2nd vs 2nd, ...)
58
+ - If a position is `<expr>` (cannot infer), skip that position
59
+ - Comparison is case-insensitive
60
+ - On mismatch, report error and list the specific misaligned columns
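
A minimal sketch of these comparison rules, assuming the schema column names and the inferred SELECT aliases are already available as two lists of equal length (the function name and the tuple layout of the result are illustrative choices, not part of the original rules):

```python
def compare_column_names(schema_cols, select_aliases):
    """Return (position, schema_name, select_alias) for every mismatch."""
    mismatches = []
    pairs = zip(schema_cols, select_aliases)
    for pos, (schema_name, alias) in enumerate(pairs, start=1):
        if alias == "<expr>":
            # The alias could not be inferred, so this position is skipped.
            continue
        if schema_name.lower() != alias.lower():
            mismatches.append((pos, schema_name, alias))
    return mismatches
```

An empty result means the name check passed; a non-empty result lists exactly which positions are misaligned, which is what the error report needs.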
61
61
 
62
- ## 静态分区注入后的列数
62
+ ## Column Count After Static Partition Injection
63
63
 
64
- 注入静态分区列后,SELECT 列数会增加。校验应在注入后进行。
64
+ After injecting static partition columns, the SELECT column count increases. Validation should be performed after injection.
65
65
 
66
- ### 避免重复注入
66
+ ### Avoid Duplicate Injection
67
67
 
68
- 在注入前检查 SELECT 中是否已包含该分区列:
68
+ Before injection, check whether SELECT already contains the partition column:
69
69
 
70
- 1. 解析 SELECT 中每个表达式的最终别名
71
- 2. 如果别名(不区分大小写)与分区列名匹配 → 该列已存在,跳过注入
72
- 3. 只注入 SELECT 中不存在的分区列
70
+ 1. Parse the final alias of each expression in SELECT
71
+ 2. If the alias (case-insensitive) matches the partition column name → the column already exists; skip injection
72
+ 3. Only inject partition columns not already present in SELECT
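
The duplicate check above might look like this in Python. The sketch is illustrative only: `inject_static_partitions` is a hypothetical helper, the partition values are assumed to arrive as a mapping from column name to an already-quoted SQL literal, and appending `literal AS name` to the end of the SELECT list is an assumption about how injection is done.

```python
def inject_static_partitions(select_exprs, aliases, static_partitions):
    """Append static partition columns that SELECT does not already produce.

    select_exprs      -- SELECT expressions as strings
    aliases           -- final alias of each expression, same order
    static_partitions -- {partition column name: SQL literal} (assumed shape)
    """
    present = {alias.lower() for alias in aliases}
    result = list(select_exprs)
    for name, literal in static_partitions.items():
        if name.lower() in present:
            # Column already exists in SELECT: skip to avoid a duplicate.
            continue
        result.append(f"{literal} AS {name}")
        present.add(name.lower())
    return result
```

Because the input list is copied before appending, the caller's original SELECT list is left unchanged, and column count validation can then run on the returned list.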
73
73
 
74
- ## UNION ALL 一致性
74
+ ## UNION ALL Consistency
75
75
 
76
- ### 分支列数一致性
76
+ ### Branch Column Count Consistency
77
77
 
78
- 所有 UNION ALL 分支的列数必须相同。如果不一致,记录警告:
78
+ All UNION ALL branches must have the same column count. If inconsistent, record a warning:
79
79
  ```
80
- UNION分支列数不一致: [12, 13, 12]
80
+ UNION branch column counts are inconsistent: [12, 13, 12]
81
81
  ```
82
82
 
83
- ### 注入后复核
83
+ ### Post-injection Recheck
84
84
 
85
- 静态分区注入后,再次检查各分支列数是否一致:
85
+ After static partition injection, recheck whether all branch column counts are consistent:
86
86
  ```
87
- 注入后UNION各分支列数: [13, 13, 13]
87
+ UNION branch column counts after injection: [13, 13, 13]
88
88
  ```
89
89
 
90
- ## 重复别名检测
90
+ ## Duplicate Alias Detection
91
91
 
92
- 如果 SELECT 中有重复的列别名,记录警告:
92
+ If SELECT contains duplicate column aliases, record a warning:
93
93
  ```
94
- 检测到重复列别名: ['dt']
94
+ Duplicate column aliases detected: ['dt']
95
95
  ```
96
96
 
97
- 重复别名可能导致:
98
- - 列数看起来正确但实际语义错误
99
- - 下游查询引用歧义
97
+ Duplicate aliases may cause:
98
+ - Column count appears correct but actual semantics are wrong
99
+ - Ambiguous references in downstream queries
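
Duplicate detection can be sketched with a counter over the inferred aliases. Treating aliases case-insensitively and ignoring `<expr>` placeholders are assumptions made here for consistency with the comparison rules; the source does not state them for this check, and `find_duplicate_aliases` is a hypothetical name.

```python
from collections import Counter


def find_duplicate_aliases(aliases):
    """Return aliases that appear more than once, in first-seen order."""
    counts = Counter(
        alias.lower() for alias in aliases if alias != "<expr>"
    )
    return [name for name, n in counts.items() if n > 1]
```

A non-empty result would be logged as a warning rather than failing the conversion, matching the behavior described above.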
100
100
 
101
- ## 缺失分区列检测
101
+ ## Missing Partition Column Detection
102
102
 
103
- 如果 SELECT 中缺少某些分区列(注入前),记录信息:
103
+ If SELECT is missing some partition columns (before injection), log an informational message:
104
104
  ```
105
- 检测到缺失分区列: ['dt', 'ds']
105
+ Missing partition columns detected: ['dt', 'ds']
106
106
  ```
107
107
 
108
- 这些列会在注入步骤中被自动添加。
108
+ These columns will be automatically added in the injection step.
109
109
 
110
- ## 完整校验流程
110
+ ## Complete Validation Flow
111
111
 
112
112
  ```
113
- 1. 生成 DDL(含静态分区注入)
114
- 2. 提取 schema 列数
115
- 3. 提取 SELECT 列数
116
- 4. 比较列数 → 不等则失败
117
- 5. (可选)逐位对比列名 → 不匹配则失败
113
+ 1. Generate DDL (including static partition injection)
114
+ 2. Extract schema column count
115
+ 3. Extract SELECT column count
116
+ 4. Compare column counts → fail if unequal
117
+ 5. (Optional) Compare column names position by position → fail if mismatched
118
118
  ```