@clickzetta/cz-cli-darwin-arm64 0.3.81 → 0.3.83

Files changed (201)
  1. package/bin/cz-cli +0 -0
  2. package/bin/skills/clickzetta-access-control/LICENSE +16 -0
  3. package/bin/skills/clickzetta-access-control/SKILL.md +243 -0
  4. package/bin/skills/clickzetta-access-control/eval_cases.jsonl +3 -0
  5. package/bin/skills/clickzetta-access-control/references/dynamic-masking.md +86 -0
  6. package/bin/skills/clickzetta-access-control/references/grant-revoke.md +103 -0
  7. package/bin/skills/clickzetta-access-control/references/role-management.md +66 -0
  8. package/bin/skills/clickzetta-access-control/references/user-management.md +61 -0
  9. package/bin/skills/clickzetta-app-python-sdk/LICENSE +16 -0
  10. package/bin/skills/clickzetta-app-python-sdk/SKILL.md +153 -0
  11. package/bin/skills/clickzetta-app-python-sdk/eval_cases.jsonl +12 -0
  12. package/bin/skills/clickzetta-app-python-sdk/references/bulkload.md +196 -0
  13. package/bin/skills/clickzetta-app-python-sdk/references/connector.md +143 -0
  14. package/bin/skills/clickzetta-app-python-sdk/references/realtime.md +122 -0
  15. package/bin/skills/clickzetta-batch-sync-pipeline/LICENSE +16 -0
  16. package/bin/skills/clickzetta-batch-sync-pipeline/SKILL.md +227 -0
  17. package/bin/skills/clickzetta-batch-sync-pipeline/eval_cases.jsonl +5 -0
  18. package/bin/skills/clickzetta-bi-connect/LICENSE +16 -0
  19. package/bin/skills/clickzetta-bi-connect/SKILL.md +176 -0
  20. package/bin/skills/clickzetta-bi-connect/eval_cases.jsonl +5 -0
  21. package/bin/skills/clickzetta-bi-connect/references/bi-tools.md +170 -0
  22. package/bin/skills/clickzetta-cdc-sync-pipeline/LICENSE +16 -0
  23. package/bin/skills/clickzetta-cdc-sync-pipeline/SKILL.md +633 -0
  24. package/bin/skills/clickzetta-cdc-sync-pipeline/eval_cases.jsonl +5 -0
  25. package/bin/skills/clickzetta-data-ingest-pipeline/LICENSE +16 -0
  26. package/bin/skills/clickzetta-data-ingest-pipeline/SKILL.md +237 -0
  27. package/bin/skills/clickzetta-data-ingest-pipeline/eval_cases.jsonl +5 -0
  28. package/bin/skills/clickzetta-data-retention/LICENSE +16 -0
  29. package/bin/skills/clickzetta-data-retention/SKILL.md +160 -0
  30. package/bin/skills/clickzetta-data-retention/eval_cases.jsonl +5 -0
  31. package/bin/skills/clickzetta-data-retention/references/lifecycle-reference.md +175 -0
  32. package/bin/skills/clickzetta-data-science/LICENSE +16 -0
  33. package/bin/skills/clickzetta-data-science/SKILL.md +125 -0
  34. package/bin/skills/clickzetta-data-science/eval_cases.jsonl +12 -0
  35. package/bin/skills/clickzetta-data-science/references/bitmap-profile.md +146 -0
  36. package/bin/skills/clickzetta-data-science/references/data-patterns.md +110 -0
  37. package/bin/skills/clickzetta-data-science/references/setup.md +160 -0
  38. package/bin/skills/clickzetta-data-science/references/stats-functions.md +195 -0
  39. package/bin/skills/clickzetta-data-science/references/write-and-infer.md +122 -0
  40. package/bin/skills/clickzetta-data-science/references/zettapark-api.md +156 -0
  41. package/bin/skills/clickzetta-data-sharing/LICENSE +16 -0
  42. package/bin/skills/clickzetta-data-sharing/SKILL.md +160 -0
  43. package/bin/skills/clickzetta-data-sharing/eval_cases.jsonl +3 -0
  44. package/bin/skills/clickzetta-data-sharing/references/share-ddl.md +134 -0
  45. package/bin/skills/clickzetta-dba-guide/LICENSE +16 -0
  46. package/bin/skills/clickzetta-dba-guide/SKILL.md +542 -0
  47. package/bin/skills/clickzetta-dba-guide/eval_cases.jsonl +3 -0
  48. package/bin/skills/clickzetta-dw-modeling/LICENSE +16 -0
  49. package/bin/skills/clickzetta-dw-modeling/SKILL.md +351 -0
  50. package/bin/skills/clickzetta-dw-modeling/eval_cases.jsonl +4 -0
  51. package/bin/skills/clickzetta-dw-modeling/references/modeling-patterns.md +100 -0
  52. package/bin/skills/clickzetta-dynamic-table/LICENSE +16 -0
  53. package/bin/skills/clickzetta-dynamic-table/SKILL.md +230 -0
  54. package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +253 -0
  55. package/bin/skills/clickzetta-dynamic-table/best-practices/medallion-and-stream-patterns.md +124 -0
  56. package/bin/skills/clickzetta-dynamic-table/best-practices/non-partitioned-merge-into-warning.md +96 -0
  57. package/bin/skills/clickzetta-dynamic-table/best-practices/performance-optimization.md +109 -0
  58. package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +135 -0
  59. package/bin/skills/clickzetta-dynamic-table/dt-creator/SKILL.md +15 -0
  60. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +185 -0
  61. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +427 -0
  62. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +260 -0
  63. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/sql-limitations.md +80 -0
  64. package/bin/skills/clickzetta-dynamic-table/dynamic-table-alter/SKILL.md +190 -0
  65. package/bin/skills/clickzetta-dynamic-table/eval_cases.jsonl +5 -0
  66. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +27 -0
  67. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +118 -0
  68. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +225 -0
  69. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +182 -0
  70. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +98 -0
  71. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +76 -0
  72. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +109 -0
  73. package/bin/skills/clickzetta-external-catalog/LICENSE +16 -0
  74. package/bin/skills/clickzetta-external-catalog/SKILL.md +123 -0
  75. package/bin/skills/clickzetta-external-catalog/eval_cases.jsonl +5 -0
  76. package/bin/skills/clickzetta-external-catalog/references/external-catalog-ddl.md +130 -0
  77. package/bin/skills/clickzetta-external-function/LICENSE +16 -0
  78. package/bin/skills/clickzetta-external-function/SKILL.md +203 -0
  79. package/bin/skills/clickzetta-external-function/eval_cases.jsonl +4 -0
  80. package/bin/skills/clickzetta-external-function/references/external-function-ddl.md +171 -0
  81. package/bin/skills/clickzetta-file-import-pipeline/LICENSE +16 -0
  82. package/bin/skills/clickzetta-file-import-pipeline/SKILL.md +190 -0
  83. package/bin/skills/clickzetta-file-import-pipeline/eval_cases.jsonl +5 -0
  84. package/bin/skills/clickzetta-index-manager/LICENSE +16 -0
  85. package/bin/skills/clickzetta-index-manager/SKILL.md +140 -0
  86. package/bin/skills/clickzetta-index-manager/eval_cases.jsonl +5 -0
  87. package/bin/skills/clickzetta-index-manager/references/bloomfilter-index.md +67 -0
  88. package/bin/skills/clickzetta-index-manager/references/index-management.md +73 -0
  89. package/bin/skills/clickzetta-index-manager/references/inverted-index.md +80 -0
  90. package/bin/skills/clickzetta-index-manager/references/vector-index.md +81 -0
  91. package/bin/skills/clickzetta-java-sdk/LICENSE +16 -0
  92. package/bin/skills/clickzetta-java-sdk/SKILL.md +186 -0
  93. package/bin/skills/clickzetta-java-sdk/eval_cases.jsonl +12 -0
  94. package/bin/skills/clickzetta-java-sdk/references/bulkload.md +163 -0
  95. package/bin/skills/clickzetta-java-sdk/references/realtime.md +212 -0
  96. package/bin/skills/clickzetta-kafka-ingest-pipeline/LICENSE +16 -0
  97. package/bin/skills/clickzetta-kafka-ingest-pipeline/SKILL.md +769 -0
  98. package/bin/skills/clickzetta-kafka-ingest-pipeline/eval_cases.jsonl +5 -0
  99. package/bin/skills/clickzetta-kafka-ingest-pipeline/references/kafka-pipe-syntax.md +324 -0
  100. package/bin/skills/clickzetta-lakehouse-connect/LICENSE +16 -0
  101. package/bin/skills/clickzetta-lakehouse-connect/SKILL.md +218 -0
  102. package/bin/skills/clickzetta-lakehouse-connect/eval_cases.jsonl +3 -0
  103. package/bin/skills/clickzetta-lakehouse-connect/evals/evals.json +35 -0
  104. package/bin/skills/clickzetta-lakehouse-connect/references/config-file.md +435 -0
  105. package/bin/skills/clickzetta-lakehouse-connect/references/jdbc.md +478 -0
  106. package/bin/skills/clickzetta-lakehouse-connect/references/python-sdk.md +225 -0
  107. package/bin/skills/clickzetta-lakehouse-connect/references/sqlalchemy.md +468 -0
  108. package/bin/skills/clickzetta-lakehouse-connect/references/zettapark-session.md +445 -0
  109. package/bin/skills/clickzetta-manage-comments/LICENSE +16 -0
  110. package/bin/skills/clickzetta-manage-comments/SKILL.md +219 -0
  111. package/bin/skills/clickzetta-manage-comments/eval_cases.jsonl +3 -0
  112. package/bin/skills/clickzetta-metadata/LICENSE +16 -0
  113. package/bin/skills/clickzetta-metadata/SKILL.md +502 -0
  114. package/bin/skills/clickzetta-metadata/eval_cases.jsonl +5 -0
  115. package/bin/skills/clickzetta-metadata/references/instance-views-reference.md +276 -0
  116. package/bin/skills/clickzetta-metadata/references/metering-views-reference.md +137 -0
  117. package/bin/skills/clickzetta-metadata/references/show-desc-reference.md +326 -0
  118. package/bin/skills/clickzetta-metadata/references/views-reference.md +271 -0
  119. package/bin/skills/clickzetta-monitoring/LICENSE +16 -0
  120. package/bin/skills/clickzetta-monitoring/SKILL.md +215 -0
  121. package/bin/skills/clickzetta-monitoring/eval_cases.jsonl +5 -0
  122. package/bin/skills/clickzetta-monitoring/references/job-history-analysis.md +97 -0
  123. package/bin/skills/clickzetta-monitoring/references/show-jobs.md +48 -0
  124. package/bin/skills/clickzetta-oss-ingest-pipeline/LICENSE +16 -0
  125. package/bin/skills/clickzetta-oss-ingest-pipeline/SKILL.md +562 -0
  126. package/bin/skills/clickzetta-oss-ingest-pipeline/eval_cases.jsonl +5 -0
  127. package/bin/skills/clickzetta-overview/LICENSE +16 -0
  128. package/bin/skills/clickzetta-overview/SKILL.md +102 -0
  129. package/bin/skills/clickzetta-overview/eval_cases.jsonl +5 -0
  130. package/bin/skills/clickzetta-overview/references/brands-and-endpoints.md +79 -0
  131. package/bin/skills/clickzetta-overview/references/object-model.md +311 -0
  132. package/bin/skills/clickzetta-overview/references/studio-modules.md +173 -0
  133. package/bin/skills/clickzetta-pipeline-review/LICENSE +16 -0
  134. package/bin/skills/clickzetta-pipeline-review/SKILL.md +377 -0
  135. package/bin/skills/clickzetta-query-optimizer/LICENSE +16 -0
  136. package/bin/skills/clickzetta-query-optimizer/SKILL.md +156 -0
  137. package/bin/skills/clickzetta-query-optimizer/eval_cases.jsonl +5 -0
  138. package/bin/skills/clickzetta-query-optimizer/references/explain.md +56 -0
  139. package/bin/skills/clickzetta-query-optimizer/references/hints-and-sortkey.md +78 -0
  140. package/bin/skills/clickzetta-query-optimizer/references/optimize.md +65 -0
  141. package/bin/skills/clickzetta-query-optimizer/references/result-cache.md +49 -0
  142. package/bin/skills/clickzetta-query-optimizer/references/show-jobs.md +42 -0
  143. package/bin/skills/clickzetta-realtime-sync-pipeline/LICENSE +16 -0
  144. package/bin/skills/clickzetta-realtime-sync-pipeline/SKILL.md +323 -0
  145. package/bin/skills/clickzetta-realtime-sync-pipeline/eval_cases.jsonl +5 -0
  146. package/bin/skills/clickzetta-semantic-view/LICENSE +16 -0
  147. package/bin/skills/clickzetta-semantic-view/SKILL.md +207 -0
  148. package/bin/skills/clickzetta-semantic-view/eval_cases.jsonl +12 -0
  149. package/bin/skills/clickzetta-semantic-view/references/semantic-view-reference.md +167 -0
  150. package/bin/skills/clickzetta-spark-flink-connector/LICENSE +16 -0
  151. package/bin/skills/clickzetta-spark-flink-connector/SKILL.md +92 -0
  152. package/bin/skills/clickzetta-spark-flink-connector/eval_cases.jsonl +5 -0
  153. package/bin/skills/clickzetta-spark-flink-connector/references/flink.md +147 -0
  154. package/bin/skills/clickzetta-spark-flink-connector/references/spark.md +132 -0
  155. package/bin/skills/clickzetta-sql-pipeline-manager/LICENSE +16 -0
  156. package/bin/skills/clickzetta-sql-pipeline-manager/SKILL.md +485 -0
  157. package/bin/skills/clickzetta-sql-pipeline-manager/eval_cases.jsonl +12 -0
  158. package/bin/skills/clickzetta-sql-pipeline-manager/evals/evals.json +166 -0
  159. package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md +185 -0
  160. package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md +129 -0
  161. package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md +222 -0
  162. package/bin/skills/clickzetta-sql-pipeline-manager/references/table-stream.md +125 -0
  163. package/bin/skills/clickzetta-sql-syntax-guide/LICENSE +16 -0
  164. package/bin/skills/clickzetta-sql-syntax-guide/SKILL.md +249 -0
  165. package/bin/skills/clickzetta-sql-syntax-guide/eval_cases.jsonl +3 -0
  166. package/bin/skills/clickzetta-sql-syntax-guide/references/ddl-reference.md +350 -0
  167. package/bin/skills/clickzetta-sql-syntax-guide/references/dml-reference.md +279 -0
  168. package/bin/skills/clickzetta-sql-syntax-guide/references/dql-reference.md +504 -0
  169. package/bin/skills/clickzetta-sql-syntax-guide/references/functions-reference.md +372 -0
  170. package/bin/skills/clickzetta-sql-syntax-guide/references/migration-databricks.md +260 -0
  171. package/bin/skills/clickzetta-sql-syntax-guide/references/migration-snowflake.md +382 -0
  172. package/bin/skills/clickzetta-sql-syntax-guide/references/vs-snowflake.md +346 -0
  173. package/bin/skills/clickzetta-sql-syntax-guide/references/vs-spark.md +229 -0
  174. package/bin/skills/clickzetta-studio-task-manager/LICENSE +16 -0
  175. package/bin/skills/clickzetta-studio-task-manager/SKILL.md +652 -0
  176. package/bin/skills/clickzetta-table-lineage/LICENSE +16 -0
  177. package/bin/skills/clickzetta-table-lineage/SKILL.md +90 -0
  178. package/bin/skills/clickzetta-table-lineage/eval_cases.jsonl +1 -0
  179. package/bin/skills/clickzetta-table-lineage/references/normalize_func.sql +14 -0
  180. package/bin/skills/clickzetta-table-lineage/references/table_cost.sql +38 -0
  181. package/bin/skills/clickzetta-table-lineage/references/table_lineage_standalone.html +562 -0
  182. package/bin/skills/clickzetta-table-lineage/references/table_relation.sql +25 -0
  183. package/bin/skills/clickzetta-table-stream-pipeline/LICENSE +16 -0
  184. package/bin/skills/clickzetta-table-stream-pipeline/SKILL.md +206 -0
  185. package/bin/skills/clickzetta-table-stream-pipeline/eval_cases.jsonl +5 -0
  186. package/bin/skills/clickzetta-vcluster-manager/LICENSE +16 -0
  187. package/bin/skills/clickzetta-vcluster-manager/SKILL.md +212 -0
  188. package/bin/skills/clickzetta-vcluster-manager/eval_cases.jsonl +5 -0
  189. package/bin/skills/clickzetta-vcluster-manager/references/vc-cache.md +54 -0
  190. package/bin/skills/clickzetta-vcluster-manager/references/vcluster-ddl.md +150 -0
  191. package/bin/skills/clickzetta-volume-manager/LICENSE +16 -0
  192. package/bin/skills/clickzetta-volume-manager/SKILL.md +292 -0
  193. package/bin/skills/clickzetta-volume-manager/eval_cases.jsonl +5 -0
  194. package/bin/skills/clickzetta-volume-manager/references/volume-ddl.md +199 -0
  195. package/bin/skills/clickzetta-zettapark/LICENSE +16 -0
  196. package/bin/skills/clickzetta-zettapark/SKILL.md +248 -0
  197. package/bin/skills/clickzetta-zettapark/eval_cases.jsonl +12 -0
  198. package/bin/skills/clickzetta-zettapark/references/zettapark-api.md +283 -0
  199. package/bin/skills/cz-cli/SKILL.md +313 -0
  200. package/bin/skills/cz-cli/references/profile-setup.md +120 -0
  201. package/package.json +1 -1
@@ -0,0 +1,260 @@
+ # Dynamic Table Incremental Refresh History Query Guide
+
+ There are three ways to view the incremental refresh history of a DT/MV, each suited to a different scenario.
+
+ ---
+
+ ## Method 1: SHOW DYNAMIC TABLE REFRESH HISTORY
+
+ Shows job-level information for each DT refresh, including its state, duration, trigger, and refresh mode.
+
+ ### Syntax
+
+ ```sql
+ -- Filter with WHERE (the name column matches the table name)
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt';
+
+ -- Combine WHERE and LIMIT
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt' AND state = 'SUCCEED' LIMIT 20;
+
+ -- MVs support the same syntax
+ SHOW MATERIALIZED VIEW REFRESH HISTORY WHERE name = 'my_mv' LIMIT 10;
+ ```
+
+ ### Output columns
+
+ | Column | Type | Description |
+ |------|------|------|
+ | workspace_name | STRING | Owning workspace |
+ | schema_name | STRING | Owning schema |
+ | name | STRING | DT/MV name |
+ | virtual_cluster | STRING | Virtual cluster that ran the refresh |
+ | start_time | TIMESTAMP | Refresh start time |
+ | end_time | TIMESTAMP | Refresh end time (NULL while running) |
+ | duration | INTERVAL | Refresh duration (elapsed time so far while running) |
+ | state | STRING | Refresh state (SUCCEED / FAILED / RUNNING, etc.) |
+ | refresh_trigger | STRING | Trigger: `SYSTEM_SCHEDULED` (triggered automatically by the system scheduler) or `MANUAL` (user-issued REFRESH) |
+ | refresh_mode | STRING | Refresh mode; see the refresh_mode details section |
+ | error_message | STRING | Error message on failure (NULL on success) |
+ | source_tables | ARRAY<MAP<STRING,STRING>> | List of source tables; each element is a MAP with three keys: `workspace`, `schema`, `table_name` |
+ | stats | MAP<STRING,STRING> | Refresh statistics: `rows_inserted` (rows inserted) and `rows_deleted` (rows deleted) |
+ | job_id | STRING | The corresponding job ID; join on it with `information_schema.job_history` for more detail |
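+
+ ### refresh_mode details
+
+ `refresh_mode` is the key field for judging whether incremental computation took effect:
+
+ | Value | Meaning | Description |
+ |----|------|------|
+ | `INCREMENTAL` | Incremental refresh | The incremental engine generated an incremental plan and processed only the changed data in the source tables |
+ | `FULL` | Full refresh | Fell back to a full recomputation. Possible causes: first refresh, dimension table changes, failure to generate an incremental plan, a user-forced full refresh, etc. |
+ | `NO_DATA` | No data changes | The source tables had no new changes since the last refresh, so this refresh skipped computation |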
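
To make the `refresh_mode` check concrete, here is a minimal Python sketch that tallies the three documented modes over rows already fetched from `SHOW DYNAMIC TABLE REFRESH HISTORY`. The sample rows are hypothetical, and `summarize_refresh_modes` is an illustrative helper, not part of any ClickZetta SDK. Excluding `NO_DATA` from the denominator is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical rows, shaped like the output columns documented above.
SAMPLE_ROWS = [
    {"name": "my_dt", "state": "SUCCEED", "refresh_mode": "FULL"},
    {"name": "my_dt", "state": "SUCCEED", "refresh_mode": "INCREMENTAL"},
    {"name": "my_dt", "state": "SUCCEED", "refresh_mode": "INCREMENTAL"},
    {"name": "my_dt", "state": "SUCCEED", "refresh_mode": "NO_DATA"},
]


def summarize_refresh_modes(rows):
    """Count refreshes per refresh_mode and compute the incremental share.

    NO_DATA refreshes skip computation, so this sketch leaves them out of
    the denominator when judging whether incremental computation works.
    """
    counts = Counter(row["refresh_mode"] for row in rows)
    computed = counts["INCREMENTAL"] + counts["FULL"]
    share = counts["INCREMENTAL"] / computed if computed else None
    return {"counts": dict(counts), "incremental_share": share}


summary = summarize_refresh_modes(SAMPLE_ROWS)
print(summary)
```

A share that stays low after the first refresh suggests the DT keeps falling back to `FULL`, which is the case the table above says to investigate.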
+
+ ### source_tables details
+
+ The `source_tables` column returns every input table involved in the refresh; each element is a MAP:
+
+ ```
+ [
+ {"workspace": "my_ws", "schema": "public", "table_name": "orders"},
+ {"workspace": "my_ws", "schema": "public", "table_name": "dim_product"}
+ ]
+ ```
+
+ ### stats details
+
+ The `stats` column returns the write statistics of the refresh against the target table:
+
+ ```
+ {"rows_inserted": "1000", "rows_deleted": "50"}
+ ```
+
+ - `rows_inserted`: number of rows this refresh inserted into the target table
+ - `rows_deleted`: number of rows this refresh deleted from the target table (in incremental mode, an update produces a delete + insert)
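
Because `stats` is a `MAP<STRING,STRING>`, its values arrive as strings. The sketch below, which assumes the map has already been loaded into a Python dict, casts them to integers and derives the net row change. `net_row_change` is a hypothetical helper, and treating a missing key as 0 is an assumption.

```python
def net_row_change(stats):
    """Return inserted, deleted and net row counts from a stats map.

    Values are strings in the documented output, so they are cast to int.
    Missing keys are treated as 0 (an assumption of this sketch).
    """
    inserted = int(stats.get("rows_inserted", "0"))
    deleted = int(stats.get("rows_deleted", "0"))
    return {"inserted": inserted, "deleted": deleted, "net": inserted - deleted}


# Sample value copied from the stats example above.
result = net_row_change({"rows_inserted": "1000", "rows_deleted": "50"})
print(result)
```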
+
+ ### Typical usage
+
+ ```sql
+ -- List failed refreshes
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt' AND state = 'FAILED';
+
+ -- Check for fallbacks to full refresh (to verify that incremental refresh is working)
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt' AND refresh_mode = 'FULL';
+
+ -- List refreshes with no data changes (these appear when the source tables have no new data)
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt' AND refresh_mode = 'NO_DATA';
+
+ -- List refreshes triggered by the system scheduler
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'my_dt' AND refresh_trigger = 'SYSTEM_SCHEDULED';
+ ```
+
+ ---
+
+ ## Method 2: DESC HISTORY
+
+ Shows a table's version-level history, including each version's row count, byte count, and operation type. Useful for understanding data changes at version granularity.
+
+ ### Syntax
+
+ ```sql
+ -- View the version history of a DT
+ DESC HISTORY my_dt;
+
+ -- View the version history of a source table
+ DESC HISTORY source_table;
+
+ -- WHERE filtering is supported
+ DESC HISTORY my_dt WHERE version > 10;
+
+ -- LIMIT is supported
+ DESC HISTORY my_dt LIMIT 20;
+ ```
+
+ ### Output columns
+
+ For regular tables (DESC_TABLE_HISTORY):
+
+ | Column | Type | Description |
+ |------|------|------|
+ | sequence | BIGINT | Sequence number |
+ | version | BIGINT | Version number |
+ | time | TIMESTAMP | Version creation time |
+ | total_rows | BIGINT | Total rows in this version |
+ | total_bytes | BIGINT | Total bytes in this version |
+ | user | STRING | User who performed the operation |
+ | operation | STRING | Operation type (INSERT / COMPACTION / REFRESH, etc.) |
+ | job_id | STRING | The corresponding job ID |
+
+ For DT/MV (DESC_MV_HISTORY), the output additionally includes:
+
+ | Column | Type | Description |
+ |------|------|------|
+ | source_tables | ARRAY<MAP<STRING,STRING>> | Source tables and their corresponding version information |
+
+ For DT/MV, the `source_tables` value from DESC HISTORY is more detailed than the one from SHOW REFRESH HISTORY: it includes the snapshot each source table was at for that version:
+
+ ```
+ [
+ {"table_name": "orders", "workspace": "my_ws", "schema": "public", "version": "123", "sequence": "5", "commit_time": "2025-01-15 10:30:00"},
+ {"table_name": "dim_product", "workspace": "my_ws", "schema": "public", "version": "456", "sequence": "2", "commit_time": "2025-01-15 08:00:00"}
+ ]
+ ```
+
+ - `version`: the source table's snapshot_id
+ - `sequence`: the source table's sequence number
+ - `commit_time`: the commit time of that source table version
+
+ This information lets you trace which version of each source table a given refresh read.
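
As an illustration of that tracing step, the following Python sketch looks up the snapshot of one source table inside a `source_tables` value. It assumes the column value has been serialized to JSON text on the client side, which may not match how a given driver returns `ARRAY<MAP<STRING,STRING>>`. `source_snapshot` is a hypothetical helper.

```python
import json

# Sample source_tables value, copied from the DESC HISTORY example above.
SOURCE_TABLES_JSON = """
[
  {"table_name": "orders", "workspace": "my_ws", "schema": "public",
   "version": "123", "sequence": "5", "commit_time": "2025-01-15 10:30:00"},
  {"table_name": "dim_product", "workspace": "my_ws", "schema": "public",
   "version": "456", "sequence": "2", "commit_time": "2025-01-15 08:00:00"}
]
"""


def source_snapshot(source_tables, table_name):
    """Return the snapshot a refresh read for table_name, or None if absent."""
    for entry in source_tables:
        if entry["table_name"] == table_name:
            return {
                "version": int(entry["version"]),    # the source snapshot_id
                "sequence": int(entry["sequence"]),
                "commit_time": entry["commit_time"],
            }
    return None


snapshot = source_snapshot(json.loads(SOURCE_TABLES_JSON), "orders")
print(snapshot)
```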
+
+ ### Typical usage
+
+ ```sql
+ -- View recent version changes of a DT to confirm compaction is running normally
+ DESC HISTORY my_dt LIMIT 10;
+
+ -- View a source table's version history to gauge how often data is written
+ DESC HISTORY source_table LIMIT 20;
+
+ -- View a DT's compaction records
+ DESC HISTORY my_dt WHERE operation = 'COMPACTION';
+ ```
+
+ ---
+
+ ## Method 3: information_schema.materialized_view_refresh_history
+
+ Queries refresh history from information_schema. Suited to batch analysis across tables, integration with other systems, and long-term trend monitoring. Data is partitioned by day (pt_date), and the retention period is determined by system configuration.
+
+ ### Syntax
+
+ ```sql
+ -- View the refresh history of a specific DT
+ SELECT *
+ FROM information_schema.materialized_view_refresh_history
+ WHERE materialized_view_name = 'my_dt'
+ ORDER BY start_time DESC
+ LIMIT 10;
+
+ -- View the refreshes of all DTs on a given day
+ SELECT materialized_view_name, status, start_time, end_time, error_message
+ FROM information_schema.materialized_view_refresh_history
+ WHERE pt_date = '2025-01-15'
+ ORDER BY start_time DESC;
+
+ -- View failed refreshes
+ SELECT materialized_view_name, error_code, error_message, start_time
+ FROM information_schema.materialized_view_refresh_history
+ WHERE status = 'FAILED' AND pt_date >= '2025-01-01'
+ ORDER BY start_time DESC;
+ ```
+
+ ### Output columns
+
+ | Column | Type | Description |
+ |------|------|------|
+ | workspace_name | STRING | Owning workspace |
+ | schema_name | STRING | Owning schema |
+ | materialized_view_name | STRING | DT/MV name |
+ | cru | DOUBLE | Compute resource units consumed |
+ | virtual_cluster_name | STRING | Virtual cluster that ran the refresh |
+ | status | STRING | Refresh status |
+ | scheduled_start_time | TIMESTAMP | Scheduled start time |
+ | start_time | TIMESTAMP | Actual start time |
+ | end_time | TIMESTAMP | End time |
+ | error_code | STRING | Error code |
+ | error_message | STRING | Error message |
+ | pt_date | STRING | Partition date |
+
+ ### Typical usage
+
+ ```sql
+ -- Compute the refresh success rate of a DT over the last 7 days
+ SELECT
+   pt_date,
+   COUNT(*) AS total,
+   SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) AS success,
+   SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) AS failed
+ FROM information_schema.materialized_view_refresh_history
+ WHERE materialized_view_name = 'my_dt'
+   AND pt_date >= DATE_FORMAT(DATEADD(DAY, -7, CURRENT_DATE()), '%Y-%m-%d')
+ GROUP BY pt_date
+ ORDER BY pt_date;
+
+ -- Find the refreshes that consumed the most CRU
+ SELECT materialized_view_name, cru, start_time, end_time
+ FROM information_schema.materialized_view_refresh_history
+ WHERE pt_date >= '2025-01-01'
+ ORDER BY cru DESC
+ LIMIT 10;
+ ```
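
The first query above returns counts only. If the raw rows are exported instead, the same per-day aggregation plus a success-rate column can be sketched in Python as below. The rows are hypothetical sample data and `success_rate_by_day` is an illustrative helper. The status values `SUCCEED` and `FAILED` follow the queries above.

```python
# Hypothetical rows exported from materialized_view_refresh_history.
SAMPLE_HISTORY = [
    {"pt_date": "2025-01-14", "status": "SUCCEED"},
    {"pt_date": "2025-01-14", "status": "FAILED"},
    {"pt_date": "2025-01-15", "status": "SUCCEED"},
    {"pt_date": "2025-01-15", "status": "SUCCEED"},
]


def success_rate_by_day(rows):
    """Aggregate total/success/failed per pt_date and add a success rate."""
    totals = {}
    for row in rows:
        day = totals.setdefault(
            row["pt_date"], {"total": 0, "success": 0, "failed": 0}
        )
        day["total"] += 1
        if row["status"] == "SUCCEED":
            day["success"] += 1
        elif row["status"] == "FAILED":
            day["failed"] += 1
    # Every day present has at least one row, so total is never zero here.
    return {
        date: {**counts, "success_rate": counts["success"] / counts["total"]}
        for date, counts in sorted(totals.items())
    }


rates = success_rate_by_day(SAMPLE_HISTORY)
print(rates)
```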
+
+ ### Difference from information_schema.job_history
+
+ `information_schema.job_history` records jobs of every type (SQL queries, DML, DDL, etc.), whereas `materialized_view_refresh_history` records only DT/MV refresh history, so its fields are more targeted.
+
+ To see the full details of a refresh job (such as job_text and input_bytes), join on job_id:
+
+ ```sql
+ -- Get the job_id from SHOW DYNAMIC TABLE REFRESH HISTORY, then look up the details in job_history
+ SELECT *
+ FROM information_schema.job_history
+ WHERE job_id = '<job_id obtained from SHOW REFRESH HISTORY>'
+ AND pt_date = '2025-01-15';
+ ```
+
+ ---
+
+ ## Comparison of the three methods
+
+ | Feature | SHOW REFRESH HISTORY | DESC HISTORY | information_schema |
+ |------|---------------------|--------------|-------------------|
+ | Granularity | Refresh job level | Table version level | Refresh job level |
+ | Refresh mode (incremental/full/no data) | ✅ refresh_mode | ❌ | ❌ |
+ | Trigger (scheduled/manual) | ✅ refresh_trigger | ❌ | ❌ |
+ | Write statistics (inserted/deleted) | ✅ stats | ❌ | ❌ |
+ | Source table list | ✅ Table-name level | ✅ With version/sequence/commit_time | ❌ |
+ | Version / total rows / total bytes | ❌ | ✅ version/total_rows/total_bytes | ❌ |
+ | CRU consumption | ❌ | ❌ | ✅ cru |
+ | Cross-table batch queries | ❌ (single table) | ❌ (single table) | ✅ (batch capable) |
+ | Compaction records | ❌ | ✅ | ❌ |
+ | Use cases | Checking whether incremental refresh works, refresh state | Viewing data version changes, tracing source table versions | Batch analysis / monitoring / CRU statistics |
@@ -0,0 +1,80 @@
+ # Dynamic Table SQL Limitations and Support Matrix
+
+ This document lists the SQL patterns that Dynamic Table incremental computation does and does not support.
+
+ ## JOIN type support
+
+ | JOIN type | Incremental support | Notes |
+ |-----------|---------|------|
+ | INNER JOIN | ✅ | Fully supported |
+ | LEFT JOIN (LEFT OUTER) | ✅ | Fully supported |
+ | RIGHT JOIN (RIGHT OUTER) | ✅ | Fully supported |
+ | FULL OUTER JOIN | ✅ | Fully supported |
+ | LEFT SEMI JOIN | ✅ | Fully supported |
+ | LEFT ANTI JOIN | ✅ | Fully supported |
+
+ ## Aggregate function support
+
+ ### Aggregate functions that support incremental computation
+
+ - `SUM`, `SUM0`, `COUNT`, `COUNT_IF`, `MIN`, `MAX`, `MIN_BY`, `MAX_BY`
+ - `AVG`, `STDDEV_SAMP`, `STDDEV_POP`, `VAR_SAMP`, `VAR_POP`
+ - `Percentile`, `Median`, `COUNT_DISTINCT`
+ - `BIT_OR`, `BIT_AND`, `BIT_XOR`, `BOOL_OR`, `BOOL_AND`
+ - The `GROUP_BITMAP` family
+ - `COLLECT_SET`, `COLLECT_LIST`, `COLLECT_SET_ON_ARRAY`, `COLLECT_LIST_ON_ARRAY`
+ - `MAP_AGG`, `WM_CONCAT`
+
+ ### Aggregate functions with unstable results (incremental results may differ from full results)
+
+ - `ANY_VALUE`, `FIRST_VALUE`, `LAST_VALUE`
+ - `APPROX_COUNT_DISTINCT`, `APPROX_HISTOGRAM`, `APPROX_TOP_K`, `APPROX_PERCENTILE`
+ - `JSON_MERGE_AGG`
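
A quick way to apply the unstable-aggregate list during review is a rough text scan of the DT's SELECT statement. The Python sketch below is only that: `find_unstable_aggregates` is a hypothetical helper that matches identifiers followed by `(` with a regular expression. It does not parse SQL, so string literals or comments can cause false positives, and it cannot tell whether `FIRST_VALUE` / `LAST_VALUE` are used as aggregates or as window functions.

```python
import re

# Names copied from the unstable-result list above.
UNSTABLE_AGGREGATES = {
    "ANY_VALUE", "FIRST_VALUE", "LAST_VALUE",
    "APPROX_COUNT_DISTINCT", "APPROX_HISTOGRAM", "APPROX_TOP_K",
    "APPROX_PERCENTILE", "JSON_MERGE_AGG",
}


def find_unstable_aggregates(sql):
    """Return the unstable aggregate names called in sql, sorted by name."""
    # Collect every identifier that is immediately followed by "(".
    called = {
        name.upper()
        for name in re.findall(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(", sql)
    }
    return sorted(called & UNSTABLE_AGGREGATES)


hits = find_unstable_aggregates(
    "SELECT user_id, ANY_VALUE(city), approx_count_distinct(sku), SUM(amount) "
    "FROM orders GROUP BY user_id"
)
print(hits)
```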
+
+ ## Window function support
+
+ ### Supported window functions
+
+ - `ROW_NUMBER`, `RANK`, `DENSE_RANK`, `PERCENT_RANK`
+ - `FIRST_VALUE`, `LAST_VALUE`, `NTH_VALUE`
+ - `COUNT`, `SUM`, `SUM0`, `MIN`, `MAX`, `AVG`
+ - `LEAD`, `LAG`, `CUME_DIST`, `NTILE`
+ - `COLLECT_LIST`, `COLLECT_SET`, `COLLECT_SET_ON_ARRAY`, `COLLECT_LIST_ON_ARRAY`
+
+ ## ORDER BY / LIMIT / OFFSET
+
+ The `ORDER BY`, `LIMIT`, and `OFFSET` clauses are supported.
+
+ ⚠️ A global `ORDER BY` in a DT is not recommended. Global sorting is very expensive on every incremental refresh; put the sorting logic in downstream queries rather than in the ETL modeling stage.
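+
+ ## Non-deterministic functions
+
+ Non-deterministic functions (such as `NOW()`, `CURRENT_TIMESTAMP`, `CURRENT_DATE`, `random()`) are supported by default as long as they do not take part in the computation logic. Specifically, they work normally provided they do not appear in any of the following positions:
+ - The `PARTITION BY` key of a window function
+ - A `JOIN` key
+ - A `GROUP BY` key
+ - An argument to another function
+
+ Typical scenario: output the processing time directly in the SELECT list to record when each row was processed by a DT refresh:
+
+ ```sql
+ CREATE DYNAMIC TABLE order_with_process_time AS
+ SELECT
+   id,
+   amount,
+   status,
+   CURRENT_TIMESTAMP AS process_time -- Records the processing time at refresh and writes it straight to the target table
+ FROM orders
+ WHERE status = 'completed';
+ ```
+
+ On every REFRESH, time functions are constant-folded to the timestamp of that refresh.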
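+
+ ## UDF / UDAF / UDTF
+
+ A user-defined function must be declared deterministic when it is created before a DT can use it with incremental computation. A user-defined function that is not declared deterministic disables incremental computation.
+
+ ## Source table type restrictions
+
+ - **Virtual views (VIEW)**: cannot be used as a DT input table; doing so disables incremental computation
+ - **External tables (External Table)**: incremental computation is not supported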
@@ -0,0 +1,190 @@
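+ ---
+ name: dynamic-table-alter
+ description: |
+   Modifies the structure and properties of a ClickZetta Dynamic Table. Supports direct ALTER operations (suspend, resume,
+   rename_column, set_comment, set_column_comment, set/unset properties) as well as CREATE OR REPLACE
+   rebuild operations (changing the schedule interval, changing the compute cluster, adding columns, dropping columns, changing column types, changing the SQL definition). Triggers when the user says "修改动态表" (modify a dynamic table),
+   "动态表加列" (add a column to a dynamic table), "改刷新间隔" (change the refresh interval), or "暂停动态表" (pause a dynamic table).
+ ---
+
+ # Dynamic Table Modification Workflow
+
+ ## Instructions
+
+ ### Step 1: Confirm the dynamic table exists and get its current definition
+ Run `SHOW CREATE TABLE schema_name.table_name` to get the current definition of the dynamic table.
+ If you are not sure whether it is a dynamic table, list the dynamic tables first with `SHOW TABLES WHERE is_dynamic`.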
+
+ ### Step 2: Determine the operation type and choose how to execute it
+
+ Modifications to a ClickZetta dynamic table fall into two categories:
+
+ **A. Direct ALTER operations** (6 kinds, can be executed directly):
+
+ 1. **suspend**: pause the scheduled task:
+ ```sql
+ ALTER DYNAMIC TABLE dt_name SUSPEND;
+ ```
+
+ 2. **resume**: start the scheduled task:
+ ```sql
+ ALTER DYNAMIC TABLE dt_name RESUME;
+ ```
+
+ 3. **set_comment**: change the table comment:
+ ```sql
+ ALTER DYNAMIC TABLE dt_name SET COMMENT 'comment';
+ ```
+
+ 4. **rename_column**: rename a column:
+ ```sql
+ ALTER DYNAMIC TABLE dt_name RENAME COLUMN old_name TO new_name;
+ ```
+
+ 5. **set_column_comment**: change a column comment (note that this uses CHANGE COLUMN):
+ ```sql
+ ALTER DYNAMIC TABLE dt_name CHANGE COLUMN column_name COMMENT 'comment';
+ ```
+
+ 6. **set/unset properties**: change table properties (currently reserved parameters):
+ ```sql
+ -- Set a property
+ ALTER DYNAMIC TABLE dt_name SET PROPERTIES('key' = 'value');
+ -- Remove a property
+ ALTER DYNAMIC TABLE dt_name UNSET PROPERTIES('key');
+ ```
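+
+ **B. CREATE OR REPLACE operations** (6 kinds, require rebuilding the dynamic table):
+
+ > ⚠️ **The following operations are not supported by ALTER syntax.** Statements such as `ALTER DYNAMIC TABLE ... SET REFRESH INTERVAL` do not exist and raise a syntax error. You must rebuild with `CREATE OR REPLACE DYNAMIC TABLE`.
+
+ These operations change the dynamic table's definition or SQL query logic and cannot be completed with a direct ALTER:
+
+ 7. **Change the schedule interval**: ❌ `ALTER ... SET REFRESH INTERVAL` is not supported
+ 8. **Change the compute cluster**: ❌ `ALTER ... SET VCLUSTER` is not supported
+ 9. **Add a column**
+ 10. **Drop a column**
+ 11. **Change a column type**
+ 12. **Change the SQL definition**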
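
The A/B split in Step 2 amounts to a lookup from operation to execution path. A minimal Python sketch of that decision follows. The category A keys mirror the operation names listed above, while the category B keys (`change_refresh_interval` and so on) are hypothetical labels invented for this example.

```python
# Category A: operation names as listed in Step 2.
DIRECT_ALTER_OPS = {
    "suspend", "resume", "set_comment", "rename_column",
    "set_column_comment", "set_properties", "unset_properties",
}

# Category B: hypothetical labels for the six rebuild operations.
REBUILD_OPS = {
    "change_refresh_interval", "change_vcluster", "add_column",
    "drop_column", "change_column_type", "change_sql_definition",
}


def execution_path(operation):
    """Return which statement family handles the requested operation."""
    if operation in DIRECT_ALTER_OPS:
        return "ALTER DYNAMIC TABLE"
    if operation in REBUILD_OPS:
        return "CREATE OR REPLACE DYNAMIC TABLE"
    raise ValueError(f"unknown operation: {operation!r}")


print(execution_path("suspend"))
print(execution_path("change_refresh_interval"))
```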
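+
+ ### Step 3: Rebuild with CREATE OR REPLACE (category B operations only)
+
+ 1. Run `SHOW CREATE TABLE schema_name.table_name` to get the original DDL
+ > ⚠️ `SHOW CREATE TABLE` does not support LIMIT/WHERE clauses; run it as-is
+ 2. Parse out the column definitions, the REFRESH clause, the AS SELECT clause, the COMMENT, etc.
+ 3. Modify the relevant part for the requested operation
+ 4. Execute the rebuild SQL
+
+ **When a full refresh is triggered**:
+ - Simply dropping or adding a column (where the added column is passed through from the source table's SELECT and does not take part in a JOIN key, GROUP key, or other computation) → **incremental refresh**
+ - Changes to computation logic (changing the WHERE condition, changing aggregation logic, adding a column that takes part in computation, etc.) → **full refresh**
+ - Compatible type changes (such as INT → BIGINT) → **incremental refresh**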
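
The three rules above can be read as a small decision table. The Python sketch below encodes them. The change labels are hypothetical, and the rule that a single logic change forces a full refresh when several changes are combined in one rebuild is an assumption of this sketch rather than something the source states.

```python
# Hypothetical change labels mapped to the refresh mode the rules above predict.
EXPECTED_REFRESH = {
    "drop_column": "INCREMENTAL",
    "add_passthrough_column": "INCREMENTAL",
    "compatible_type_change": "INCREMENTAL",  # e.g. INT -> BIGINT
    "logic_change": "FULL",  # WHERE, aggregation, or computed-column changes
}


def expected_refresh(changes):
    """Predict the next refresh mode for a rebuild made of several changes.

    Assumption: one change that maps to FULL makes the whole rebuild FULL.
    """
    if not changes:
        raise ValueError("at least one change is required")
    modes = {EXPECTED_REFRESH[change] for change in changes}
    return "FULL" if "FULL" in modes else "INCREMENTAL"


print(expected_refresh(["add_passthrough_column", "compatible_type_change"]))
print(expected_refresh(["drop_column", "logic_change"]))
```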
+
+ ### Step 4: Verify the result
+ Use `DESC TABLE dt_name` to confirm the change took effect.
+
+ ---
+
+ ## Examples
+
+ ### Example 1: Change the schedule interval
+
+ ```sql
+ -- Original table
+ CREATE DYNAMIC TABLE dt_name
+ REFRESH INTERVAL 10 MINUTE vcluster DEFAULT
+ AS SELECT * FROM student02;
+
+ -- After the change (now 20 minutes)
+ CREATE OR REPLACE DYNAMIC TABLE dt_name
+ REFRESH INTERVAL 20 MINUTE vcluster DEFAULT
+ AS SELECT * FROM student02;
+ ```
+
+ ### Example 2: Change the compute cluster
+
+ ```sql
+ -- Original table
+ CREATE DYNAMIC TABLE dt_name
+ REFRESH INTERVAL 10 MINUTE vcluster DEFAULT
+ AS SELECT * FROM student02;
+
+ -- After the change (now on the alter_vc cluster)
+ CREATE OR REPLACE DYNAMIC TABLE dt_name
+ REFRESH INTERVAL 10 MINUTE vcluster alter_vc
+ AS SELECT * FROM student02;
+ ```
+
+ ### Example 3: Add a column
+
+ ```sql
+ -- Original table
+ CREATE DYNAMIC TABLE change_table (i, j)
+ AS SELECT * FROM dy_base_a;
+
+ -- Add a column col (it involves computation logic, so the next refresh is a full refresh)
+ CREATE OR REPLACE DYNAMIC TABLE change_table (i, j, col)
+ AS SELECT i, j, j * 1 FROM dy_base_a;
+
+ REFRESH DYNAMIC TABLE change_table;
+ ```
+
+ ### Example 4: Drop a column
+
+ ```sql
+ -- The original table has two columns, i and j
+ CREATE DYNAMIC TABLE change_table (i, j)
+ AS SELECT * FROM dy_base_a;
+
+ -- Drop a column (simple pass-through, incremental refresh)
+ CREATE OR REPLACE DYNAMIC TABLE change_table (i)
+ AS SELECT i FROM dy_base_a;
+ ```
+
+ ### Example 5: Change the SQL definition
+
+ ```sql
+ -- Change the WHERE filter condition (full refresh)
+ CREATE OR REPLACE DYNAMIC TABLE change_table (i, j)
+ AS SELECT * FROM dy_base_a WHERE i > 3;
+
+ REFRESH DYNAMIC TABLE change_table;
+ ```
+
+ ### Example 6: Change a column type
+
+ ```sql
+ -- INT → BIGINT (compatible type, incremental refresh)
+ CREATE OR REPLACE DYNAMIC TABLE change_table (i, j)
+ AS SELECT CAST(i AS BIGINT), j FROM dy_base_a;
+
+ REFRESH DYNAMIC TABLE change_table;
+ ```
+
+ ---
+
+ ## Platform-specific knowledge
+
+ - **CHANGE COLUMN syntax**: set a column comment with `CHANGE COLUMN col COMMENT 'xxx'`, not `ALTER COLUMN`
+ - **RENAME COLUMN syntax**: `RENAME COLUMN old TO new`
+ - **DML restriction**: dynamic tables do not support UPDATE/DELETE/MERGE by default (because of the hidden column MV__KEY); to run DML, first execute `SET cz.sql.dt.allow.dml = true;`
+ - **REFRESH format**: `REFRESH INTERVAL <N> MINUTE vcluster <name>`; SECOND/MINUTE/HOUR/DAY are supported
+ - **CREATE OR REPLACE risk**: changes to computation logic trigger a full refresh, which can take a long time on large tables
+ - **Schema prefix**: table names in all ALTER/CREATE statements should include the schema prefix
+ - **Column definitions may omit types**: in `CREATE DYNAMIC TABLE dt (i, j) AS SELECT ...`, the types are inferred from the SELECT
+ - **DROP syntax**: you must use `DROP DYNAMIC TABLE dt_name`; `DROP TABLE dt_name` raises an error
+ - **UNDROP syntax**: you must use `UNDROP TABLE dt_name`, not `UNDROP DYNAMIC TABLE dt_name`
+ - **DESC syntax**: use `DESC TABLE dt_name` for dynamic tables; do not write `DESC DYNAMIC TABLE dt_name EXTENDED` (EXTENDED is not supported)
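
Several of these notes (the REFRESH format, the supported units, and the schema prefix) come together when generating a rebuild statement. The Python sketch below assembles one as a string, following the shape of Examples 1 and 2. `build_rebuild_sql` is a hypothetical helper, it performs no quoting or escaping, and the generated text should be reviewed before it is run.

```python
VALID_UNITS = {"SECOND", "MINUTE", "HOUR", "DAY"}


def build_rebuild_sql(table, interval, unit, vcluster, select_sql):
    """Build a CREATE OR REPLACE DYNAMIC TABLE statement as text.

    Only basic checks are done: schema prefix, positive integer interval,
    and a unit from the documented list. Inputs are not escaped.
    """
    unit = unit.upper()
    if unit not in VALID_UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    if not isinstance(interval, int) or interval <= 0:
        raise ValueError("interval must be a positive integer")
    if "." not in table:
        raise ValueError("table name should include a schema prefix")
    return (
        f"CREATE OR REPLACE DYNAMIC TABLE {table}\n"
        f"REFRESH INTERVAL {interval} {unit} vcluster {vcluster}\n"
        f"AS {select_sql};"
    )


sql = build_rebuild_sql(
    "public.dt_name", 20, "minute", "DEFAULT", "SELECT * FROM student02"
)
print(sql)
```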
+
+ ## Troubleshooting
+
+ | Error | Cause | Fix |
+ |---|---|---|
+ | ALTER reports "Syntax error at or near 'REFRESH'" | The `ALTER ... SET REFRESH INTERVAL` syntax does not exist | Rebuild with `CREATE OR REPLACE DYNAMIC TABLE ... REFRESH INTERVAL ...` |
+ | ALTER reports "unsupported operation" | A category B operation was attempted on the dynamic table with ALTER syntax | Rebuild with CREATE OR REPLACE |
+ | `DROP TABLE dt_name` fails | Dynamic tables must be dropped with `DROP DYNAMIC TABLE` | Use `DROP DYNAMIC TABLE dt_name` |
+ | `UNDROP DYNAMIC TABLE` fails | UNDROP does not support the DYNAMIC TABLE keyword | Use `UNDROP TABLE dt_name` |
+ | `DESC DYNAMIC TABLE ... EXTENDED` fails | The EXTENDED parameter is not supported | Use `DESC TABLE dt_name` (without EXTENDED) |
+ | UPDATE/DELETE reports an "MV__KEY" related error | Dynamic tables have the hidden column MV__KEY, and DML is disabled by default | First execute `SET cz.sql.dt.allow.dml = true;` |
+ | Table is empty after CREATE OR REPLACE | The source table or columns referenced in the AS SELECT clause are wrong | First verify that the SELECT clause returns data |
+ | Full refresh after CREATE OR REPLACE | The added column takes part in computation logic (JOIN key, GROUP key, etc.) | Expected behavior; wait for the full refresh to finish |
@@ -0,0 +1,5 @@
+ {"case_id":"001","type":"should_call","user_input":"帮我创建一个 Dynamic Table,从 public.dim_studio_user_dmin_f 聚合按租户+日期统计用户数,每 60 分钟自动刷新","expected_skill":"clickzetta-dynamic-table","expected_output_contains":["DYNAMIC TABLE","REFRESH"]}
+ {"case_id":"002","type":"should_call","user_input":"怎么查看动态表的刷新历史和状态","expected_skill":"clickzetta-dynamic-table","expected_output_contains":["REFRESH HISTORY"]}
+ {"case_id":"003","type":"should_call","user_input":"动态表的增量刷新怎么配置?SESSION_CONFIGS 怎么用?","expected_skill":"clickzetta-dynamic-table","expected_output_contains":["SESSION_CONFIGS","增量"]}
+ {"case_id":"004","type":"should_call","user_input":"静态分区 DT 和动态分区 DT 有什么区别?该怎么选?","expected_skill":"clickzetta-dynamic-table","expected_output_contains":["静态分区","动态分区"]}
+ {"case_id":"005","type":"should_call","user_input":"动态表怎么修改刷新间隔和 vcluster?","expected_skill":"clickzetta-dynamic-table","expected_output_contains":["ALTER","DYNAMIC TABLE"]}
@@ -0,0 +1,27 @@
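+ ---
+ name: sql-to-dt
+ description: Automatically converts CREATE TABLE DDL + INSERT OVERWRITE SQL from Hive/Spark or any other batch-processing system into Dynamic Table DDL and its companion files (refresh, prev_refresh, backfill). Triggered when the user provides DDL and INSERT OVERWRITE and asks for conversion to a DT; when the user says "创建动态表" (create a dynamic table), proactively prompt them to provide the inputs. Triggers on: "转换DT", "sql to dt", "convert to dynamic table", "INSERT OVERWRITE 转 DT", "DDL 转换", "创建动态表"
+ ---
+
+ # SQL → Dynamic Table automatic conversion
+
+ Converts ETL SQL (CREATE TABLE + INSERT OVERWRITE) from Hive/Spark or any other batch-processing system into Dynamic Table DDL and companion operations files.
+
+ ## Usage
+
+ Provide the following inputs:
+ 1. CREATE TABLE DDL (table structure definition)
+ 2. INSERT OVERWRITE SQL (ETL query logic)
+
+ The conversion tool then automatically performs placeholder replacement, self-reference detection, core conversion, column validation, companion file generation, and post-conversion improvement suggestions.
+
+ For the detailed workflow, see #[[file:references/sql2dt-workflow.md]]
+
+ ## references/
+
+ - **sql2dt-workflow.md**: the complete conversion workflow (6 steps: preprocessing, placeholder replacement, self-reference detection, core conversion, column validation, companion file generation)
+ - **sql2dt-conversion-rules.md**: core DDL conversion rules (parsing the DDL, parsing the INSERT, assembling the DT DDL, static partition injection)
+ - **sql2dt-placeholder-rules.md**: placeholder replacement rules (${var} → SESSION_CONFIGS())
+ - **sql2dt-self-reference-rules.md**: conversion rules for self-referencing tables
+ - **sql2dt-column-validation-rules.md**: column validation rules (schema column count = SELECT column count)
+ - **sql2dt-refresh-rules.md**: rules for generating refresh and scheduling files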
@@ -0,0 +1,118 @@
+ # Dynamic Table column validation and consistency rules
+
+ You are a SQL conversion expert. After generating the Dynamic Table DDL, verify that the columns defined in the schema match the columns produced by the SELECT query.
+
+ ## Column count validation (must pass)
+
+ ### Rule
+
+ In the generated DDL, the number of columns defined inside the parentheses must equal the number of columns produced by the SELECT query after AS.
+
+ ```sql
+ CREATE OR REPLACE DYNAMIC TABLE t (
+     col1 BIGINT, -- 1
+     col2 STRING, -- 2
+     dt STRING -- 3 → schema column count = 3
+ )
+ AS
+ SELECT col1, col2, '2024-01-01' AS dt -- → SELECT column count = 3 ✓
+ FROM source;
+ ```
+
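+ ### Counting SELECT columns
+
+ 1. Locate the SELECT clause after AS
+ 2. Locate the top-level FROM (a FROM that is not inside a subquery or parentheses)
+ 3. Column count = number of top-level commas between SELECT and FROM, plus 1
+ 4. A top-level comma is one that is not inside parentheses `()`, square brackets `[]`, or quotes `''`/`""`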
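The counting steps above can be sketched in Python as follows. This is a minimal illustration, not the converter's actual implementation: the helper names (`top_level_indices`, `find_top_level_keyword`, `count_select_columns`) are hypothetical, and the scanner assumes the query contains no SQL comments or backslash-escaped quotes.

```python
def top_level_indices(sql: str):
    """Yield indices of characters that sit outside (), [] and quotes."""
    depth, quote = 0, None
    for i, ch in enumerate(sql):
        if quote:                      # inside a quoted string
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif depth == 0:
            yield i


def _is_word_char(ch: str) -> bool:
    return ch.isalnum() or ch == "_"


def find_top_level_keyword(sql: str, keyword: str, start: int = 0) -> int:
    """Index of the first whole-word `keyword` at top level, or -1."""
    n = len(keyword)
    for i in top_level_indices(sql):
        if i < start or sql[i:i + n].upper() != keyword:
            continue
        before = sql[i - 1] if i > 0 else " "
        after = sql[i + n] if i + n < len(sql) else " "
        if not _is_word_char(before) and not _is_word_char(after):
            return i
    return -1


def count_select_columns(query: str) -> int:
    """Top-level commas between SELECT and the top-level FROM, plus 1."""
    sel = find_top_level_keyword(query, "SELECT")
    if sel == -1:
        raise ValueError("no top-level SELECT found")
    body_start = sel + len("SELECT")
    frm = find_top_level_keyword(query, "FROM", body_start)
    end = frm if frm != -1 else len(query)
    commas = sum(
        1 for i in top_level_indices(query)
        if body_start <= i < end and query[i] == ","
    )
    return commas + 1
```

Commas inside function calls, subqueries, and string literals are ignored because the scanner only yields positions at nesting depth 0 outside quotes; a `FROM` inside `EXTRACT(YEAR FROM d)` is skipped for the same reason.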
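+
+ ### Column count with UNION ALL
+
+ Use the column count of the first branch (all branches should have the same count).
+
+ ### Validation failure
+
+ If the schema column count ≠ the SELECT column count, the conversion fails with the following error (N is the schema column count, M the SELECT column count):
+ ```
+ Schema列数(N) != SELECT列数(M)
+ ```
+
+ ## Column name validation (optional)
+
+ ### Rule
+
+ Compare the column names in the schema with the aliases inferred from the SELECT, position by position. Once the column count check has passed, enabling column name validation as a second confirmation is recommended when most SELECT columns have an explicit alias (AS or a bare identifier).
+
+ ### Inferring SELECT column aliases
+
+ In descending order of priority:
+
+ 1. **AS alias**: `expression AS alias` → the alias is `alias`
+ 2. **Trailing identifier**: `table.column` → the alias is `column`
+ 3. **Bare identifier**: `column_name` → the alias is `column_name`
+ 4. **Cannot be inferred**: `func(a, b)` without AS → mark as `<expr>` and skip validation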
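The priority order above could be expressed with a few regular expressions. The sketch below is an assumption about one workable approach, with a hypothetical function name `infer_alias`; it takes a single SELECT expression that has already been split out, and it treats only plain or backtick-quoted identifiers as names.

```python
import re

_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"


def infer_alias(expr: str) -> str:
    """Infer the output column name of one SELECT expression."""
    expr = expr.strip()

    # 1. Explicit alias: the expression ends with "AS alias".
    m = re.search(rf"\bAS\s+`?({_IDENT})`?\s*$", expr, re.IGNORECASE)
    if m:
        return m.group(1)

    # 2. Qualified name such as table.column: keep the trailing identifier.
    m = re.fullmatch(rf"(?:`?{_IDENT}`?\.)+`?({_IDENT})`?", expr)
    if m:
        return m.group(1)

    # 3. Bare identifier: the name is the alias.
    if re.fullmatch(_IDENT, expr):
        return expr

    # 4. Anything else (function call, arithmetic, ...) cannot be inferred.
    return "<expr>"
```

Anchoring rule 1 at the end of the expression keeps an `AS` inside `CAST(x AS BIGINT)` from being mistaken for a column alias, since the closing parenthesis follows the type name.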
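+
+ ### Comparison rules
+
+ - Compare position by position (column 1 against column 1, column 2 against column 2, and so on)
+ - If a position is `<expr>` (cannot be inferred), skip it
+ - The comparison is case-insensitive
+ - On a mismatch, report an error listing the specific misaligned columns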
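As a rough illustration of these comparison rules, the following sketch returns the misaligned positions instead of raising, so the caller decides how to report them. The function name `compare_column_names` and the tuple layout are hypothetical.

```python
def compare_column_names(schema_cols, select_aliases):
    """Return (position, schema_name, select_alias) for every misaligned column.

    Positions are 1-based. Aliases equal to "<expr>" are skipped because
    they could not be inferred; the comparison ignores case.
    """
    mismatches = []
    for pos, (schema_name, alias) in enumerate(
        zip(schema_cols, select_aliases), start=1
    ):
        if alias == "<expr>":
            continue
        if schema_name.lower() != alias.lower():
            mismatches.append((pos, schema_name, alias))
    return mismatches
```

For example, comparing `["col1", "col2", "dt"]` with `["COL1", "<expr>", "ds"]` reports only position 3, because position 1 matches case-insensitively and position 2 is skipped.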
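+
+ ## Column count after static partition injection
+
+ Injecting static partition columns increases the SELECT column count, so validation should run after injection.
+
+ ### Avoiding duplicate injection
+
+ Before injecting, check whether the SELECT already contains the partition column:
+
+ 1. Resolve the final alias of each expression in the SELECT
+ 2. If an alias matches the partition column name (case-insensitive) → the column already exists; skip injection
+ 3. Inject only the partition columns that are absent from the SELECT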
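A small sketch of the duplicate-injection check, assuming the SELECT aliases have already been resolved into a list of strings. The function name `partitions_to_inject` is hypothetical.

```python
def partitions_to_inject(select_aliases, partition_cols):
    """Return the partition columns that the SELECT does not already produce.

    Matching is case-insensitive; the order of partition_cols is preserved.
    """
    existing = {alias.lower() for alias in select_aliases}
    return [col for col in partition_cols if col.lower() not in existing]
```

With aliases `["col1", "DT"]` and partition columns `["dt", "ds"]`, only `ds` is returned, so `dt` is not injected a second time.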
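+
+ ## UNION ALL consistency
+
+ ### Branch column count consistency
+
+ All UNION ALL branches must have the same column count. If they differ, log a warning ("UNION branch column counts are inconsistent"):
+ ```
+ UNION分支列数不一致: [12, 13, 12]
+ ```
+
+ ### Re-check after injection
+
+ After static partition injection, check again that the branch column counts agree (the message reports the per-branch counts after injection):
+ ```
+ 注入后UNION各分支列数: [13, 13, 13]
+ ```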
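One way to obtain the per-branch counts is to split the query on `UNION ALL` keywords that sit at the top level, then count columns in each branch. The sketch below covers the splitting and the warning only; the names `bracket_depths`, `split_union_branches`, and `union_count_warning` are hypothetical, and SQL comments are assumed to be absent.

```python
import re


def bracket_depths(sql: str) -> list[int]:
    """Nesting depth of every character; -1 marks characters inside quotes."""
    depths, depth, quote = [], 0, None
    for ch in sql:
        if quote:
            depths.append(-1)
            if ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
            depths.append(-1)
        else:
            if ch in ")]":
                depth -= 1          # a closing bracket belongs to the outer level
            depths.append(depth)
            if ch in "([":
                depth += 1          # characters after an opener are one level deeper
    return depths


def split_union_branches(sql: str) -> list[str]:
    """Split a query on UNION ALL keywords that are not nested or quoted."""
    depths = bracket_depths(sql)
    branches, start = [], 0
    for m in re.finditer(r"\bUNION\s+ALL\b", sql, re.IGNORECASE):
        if depths[m.start()] == 0:
            branches.append(sql[start:m.start()].strip())
            start = m.end()
    branches.append(sql[start:].strip())
    return branches


def union_count_warning(counts: list[int]):
    """Return the warning text when branch counts differ, else None."""
    if len(set(counts)) > 1:
        return f"UNION分支列数不一致: {counts}"
    return None
```

A `UNION ALL` inside a parenthesised subquery stays within its branch, because its depth is greater than 0.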
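+
+ ## Duplicate alias detection
+
+ If the SELECT contains duplicate column aliases, log a warning ("duplicate column aliases detected"):
+ ```
+ 检测到重复列别名: ['dt']
+ ```
+
+ Duplicate aliases can cause:
+ - A column count that looks correct while the semantics are wrong
+ - Ambiguous references in downstream queries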
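Duplicate detection can be sketched with `collections.Counter`. Two choices here are assumptions rather than documented behavior: aliases are compared case-insensitively, in line with the other rules in this file, and `<expr>` placeholders are ignored because they are not real names. The function name is hypothetical.

```python
from collections import Counter


def duplicate_aliases(aliases):
    """Return lower-cased aliases that occur more than once, in first-seen order."""
    counts = Counter(a.lower() for a in aliases if a != "<expr>")
    return [name for name, n in counts.items() if n > 1]


def duplicate_alias_warning(aliases):
    """Return the warning text when duplicates exist, else None."""
    dups = duplicate_aliases(aliases)
    return f"检测到重复列别名: {dups}" if dups else None
```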
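+
+ ## Missing partition column detection
+
+ If the SELECT is missing some partition columns (before injection), log an informational message ("missing partition columns detected"):
+ ```
+ 检测到缺失分区列: ['dt', 'ds']
+ ```
+
+ These columns are added automatically during the injection step.
+
+ ## Complete validation flow
+
+ ```
+ 1. Generate the DDL (including static partition injection)
+ 2. Extract the schema column count
+ 3. Extract the SELECT column count
+ 4. Compare the column counts → fail if they differ
+ 5. (Optional) Compare column names position by position → fail on mismatch
+ ```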
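Steps 2 to 5 of this flow might be wired together as in the sketch below, which assumes the schema column names and the SELECT aliases have already been extracted into lists. The function name `run_validation` and the pluggable `name_check` callable are hypothetical; the count-failure message follows the format given earlier in this file, while the wording of the name-mismatch message is an assumption.

```python
def run_validation(schema_cols, select_aliases, name_check=None):
    """Run the count check, then the optional name check.

    name_check, when given, is a callable taking (schema_cols, select_aliases)
    and returning a list of mismatches; an empty list means the names agree.
    Raises ValueError on the first failing step.
    """
    n, m = len(schema_cols), len(select_aliases)
    if n != m:                                   # step 4: mandatory count check
        raise ValueError(f"Schema列数({n}) != SELECT列数({m})")

    if name_check is not None:                   # step 5: optional name check
        mismatches = name_check(schema_cols, select_aliases)
        if mismatches:
            raise ValueError(f"column name mismatch: {mismatches}")
```

Running the count check first matters because a positional name comparison is only meaningful once both lists have the same length.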