npm - @clickzetta/cz-cli-darwin-arm64 - Versions diffs - 0.3.87 → 0.3.88 - Mend

@clickzetta/cz-cli-darwin-arm64 0.3.87 → 0.3.88

Files changed (27)

package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md CHANGED

@@ -1,122 +1,122 @@
-# SQL → Dynamic Table 转换规则
+# SQL → Dynamic Table Conversion Rules
-你是一个 SQL 转换专家。给定一个 Hive/Spark SQL 的 CREATE TABLE DDL 和对应的 INSERT OVERWRITE 语句，你需要按以下规则将它们合并为一个 Dynamic Table DDL。
+You are a SQL conversion expert. Given a CREATE TABLE DDL and corresponding INSERT OVERWRITE statement from Hive/Spark SQL, you need to merge them into a Dynamic Table DDL following the rules below.
-## 总体转换公式
+## Overall Conversion Formula
 ```
-输入1: CREATE TABLE schema.table_name (...) PARTITIONED BY (...) ...
-输入2: INSERT OVERWRITE TABLE schema.table_name PARTITION(...) SELECT ... FROM ...
-输出:  CREATE OR REPLACE DYNAMIC TABLE schema.table_name (...) PARTITIONED BY (...) ... AS SELECT ... FROM ...
+Input 1: CREATE TABLE schema.table_name (...) PARTITIONED BY (...) ...
+Input 2: INSERT OVERWRITE TABLE schema.table_name PARTITION(...) SELECT ... FROM ...
+Output:  CREATE OR REPLACE DYNAMIC TABLE schema.table_name (...) PARTITIONED BY (...) ... AS SELECT ... FROM ...
 ```
-核心思想：把 CREATE TABLE 的结构定义 + INSERT OVERWRITE 的查询逻辑，合并成一个 `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...` 语句。
+Core idea: merge the structure definition from CREATE TABLE with the query logic from INSERT OVERWRITE into a single `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...` statement.
-## 第一步：解析 CREATE TABLE DDL
+## Step 1: Parse the CREATE TABLE DDL
-从 DDL 中提取以下信息：
+Extract the following information from the DDL:
-1. **表名**（含 schema）：`schema.table_name`
-2. **普通列**：列名、数据类型、COMMENT（保持原始缩进格式）
-3. **分区列**：PARTITIONED BY 中的列名、数据类型、COMMENT
-4. **存储格式**：STORED AS PARQUET/ORC/AVRO 等
-5. **表属性**：TBLPROPERTIES 或 WITH PROPERTIES 中的键值对
-6. **分桶信息**：CLUSTERED BY / SORTED BY / RANGE CLUSTERED BY / HASH CLUSTERED BY
-7. **生命周期**：LIFECYCLE N
-8. **连接信息**：CONNECTION schema.connection_name
-9. **位置信息**：LOCATION 'path'
+1. **Table name** (including schema): `schema.table_name`
+2. **Regular columns**: column name, data type, COMMENT (preserve original indentation format)
+3. **Partition columns**: column name, data type, COMMENT from PARTITIONED BY
+4. **Storage format**: STORED AS PARQUET/ORC/AVRO, etc.
+5. **Table properties**: key-value pairs from TBLPROPERTIES or WITH PROPERTIES
+6. **Bucketing info**: CLUSTERED BY / SORTED BY / RANGE CLUSTERED BY / HASH CLUSTERED BY
+7. **Lifecycle**: LIFECYCLE N
+8. **Connection info**: CONNECTION schema.connection_name
+9. **Location info**: LOCATION 'path'
-## 第二步：解析 INSERT OVERWRITE 语句
+## Step 2: Parse the INSERT OVERWRITE Statement
-从 INSERT 语句中提取：
+Extract from the INSERT statement:
-1. **目标表名**：用于自引用检测
-2. **分区类型**：
-   - 动态分区：`PARTITION (col1, col2)` — 列名无值
-   - 静态分区：`PARTITION (col1='value1', col2=value2)` — 列名有值
-   - 混合分区：`PARTITION (static_col='value', dynamic_col)` — 部分有值
-3. **SELECT 查询**：完整的查询逻辑（含 WHERE、JOIN、GROUP BY 等）
-4. **CTE（WITH 子句）**：如果有，保留完整的 WITH ... AS (...) 结构
-5. **前置语句**：SET 语句、CREATE TEMPORARY FUNCTION 等（保留）
+1. **Target table name**: used for self-reference detection
+2. **Partition type**:
+   - Dynamic partition: `PARTITION (col1, col2)` — column names without values
+   - Static partition: `PARTITION (col1='value1', col2=value2)` — column names with values
+   - Mixed partition: `PARTITION (static_col='value', dynamic_col)` — some with values
+3. **SELECT query**: complete query logic (including WHERE, JOIN, GROUP BY, etc.)
+4. **CTE (WITH clause)**: if present, retain the complete `WITH ... AS (...)` structure
+5. **Preceding statements**: SET statements, CREATE TEMPORARY FUNCTION, etc. (retain)
-### 需要过滤的语句
+### Statements to Filter Out
-从 INSERT 文件中移除：
+Remove from the INSERT file:
 - `ALTER TABLE ... ADD PARTITION ...`
 - `ALTER TABLE ... DROP PARTITION ...`
-- 所有 `ALTER TABLE` 开头的语句
-- `ANALYZE TABLE` 语句
-- SQL 注释（`--` 和 `/* */`）
+- All statements starting with `ALTER TABLE`
+- `ANALYZE TABLE` statements
+- SQL comments (`--` and `/* */`)
-## 第三步：组装 Dynamic Table DDL
+## Step 3: Assemble the Dynamic Table DDL
-按以下顺序组装输出：
+Assemble the output in the following order:
 ```sql
--- 可选：如果需要删除已存在的同名表，请取消下一行的注释
+-- Optional: to drop an existing table with the same name, uncomment the next line
 -- DROP TABLE IF EXISTS schema.table_name;
-CREATE SCHEMA IF NOT EXISTS schema;        -- 仅当表名含 schema 时
+CREATE SCHEMA IF NOT EXISTS schema;        -- only when table name contains schema
 CREATE OR REPLACE DYNAMIC TABLE schema.table_name (
-    col1 BIGINT COMMENT '...',             -- 普通列（保持原始格式）
+    col1 BIGINT COMMENT '...',             -- regular columns (preserve original format)
     col2 STRING COMMENT '...',
-    part_col1 STRING COMMENT '...'         -- 分区列追加在普通列后面
+    part_col1 STRING COMMENT '...'         -- partition columns appended after regular columns
 )
-PARTITIONED BY (part_col1, part_col2)      -- 仅列名，不含类型
+PARTITIONED BY (part_col1, part_col2)      -- column names only, no types
 [CLUSTERED BY (...) [SORTED BY (...)] [INTO N BUCKETS]]
 [STORED AS PARQUET]
-TBLPROPERTIES ('key' = 'value')            -- 合并模板属性和原始属性
+TBLPROPERTIES ('key' = 'value')            -- merge template properties and original properties
 [LIFECYCLE N]
 [CONNECTION schema.connection_name]
-[LOCATION 'original_path_dt']             -- 原路径加 _dt 后缀
+[LOCATION 'original_path_dt']             -- original path with _dt suffix
 AS
-SELECT查询;                                -- 来自 INSERT OVERWRITE 的查询
+SELECT query;                              -- query from INSERT OVERWRITE
 ```
-### 关键规则
+### Key Rules
-1. **列定义**：普通列 + 分区列合并到一个括号内，保持原始缩进
-2. **PARTITIONED BY**：只写列名，不写类型（与 CREATE TABLE 不同）
-3. **CREATE SCHEMA**：如果表名含 `.`（如 `kscdm.table_name`），在 DDL 前加 `CREATE SCHEMA IF NOT EXISTS kscdm;`
-4. **LOCATION**：原路径加 `_dt` 后缀
-5. **DROP 语句**：注释掉的 `DROP TABLE IF EXISTS` 放在最前面
+1. **Column definitions**: regular columns + partition columns merged into one set of parentheses, preserving original indentation
+2. **PARTITIONED BY**: write column names only, no types (unlike CREATE TABLE)
+3. **CREATE SCHEMA**: if the table name contains `.` (e.g., `kscdm.table_name`), add `CREATE SCHEMA IF NOT EXISTS kscdm;` before the DDL
+4. **LOCATION**: original path with `_dt` suffix
+5. **DROP statement**: commented-out `DROP TABLE IF EXISTS` placed at the very beginning
-## 第四步：静态分区注入
+## Step 4: Static Partition Injection
-当 INSERT OVERWRITE 使用静态分区（`PARTITION(col=value)`）时，需要将分区值注入到 SELECT 子句中。
+When INSERT OVERWRITE uses static partitions (`PARTITION(col=value)`), partition values need to be injected into the SELECT clause.
-### 注入规则
+### Injection Rules
-在 SELECT 的最后一个列之后、FROM 之前，按 DDL 中分区列的定义顺序追加：
+After the last column in SELECT and before FROM, append in the order of partition column definitions in the DDL:
 ```sql
--- 原始 SELECT
+-- Original SELECT
 SELECT col1, col2 FROM source_table
--- 注入后（假设 PARTITION(year=2024, month='January')）
+-- After injection (assuming PARTITION(year=2024, month='January'))
 SELECT col1, col2,
     2024 AS year,
     'January' AS month
 FROM source_table
 ```
-### 值类型智能处理
+### Smart Value Type Handling
-注入时根据值的类型决定是否加引号：
+Decide whether to add quotes based on the value type when injecting:
-| 值类型 | 判断规则 | 处理 | 示例 |
+| Value type | Detection rule | Handling | Example |
 |--------|----------|------|------|
-| 已有引号 | 以 `'` 或 `"` 开头结尾 | 保持原样 | `'hello'` → `'hello'` |
-| NULL | 值为 `NULL`（不区分大小写） | 不加引号 | `NULL` |
-| 布尔值 | `true` / `false`（不区分大小写） | 不加引号 | `true` |
-| 数字 | 可被 `float()` 解析 | 不加引号 | `123`, `-45.67`, `1.23e-4` |
-| SESSION_CONFIGS | 包含 `SESSION_CONFIGS(` | 不加引号 | `SESSION_CONFIGS()['dt.args.ds']` |
-| 函数调用 | 匹配 `标识符(...)` 且括号平衡 | 不加引号 | `CURRENT_DATE()`, `YEAR(col)` |
-| 字符串 | 以上都不匹配 | 加单引号，内部 `'` 转义为 `''` | `hello` → `'hello'` |
+| Already quoted | Starts and ends with `'` or `"` | Keep as-is | `'hello'` → `'hello'` |
+| NULL | Value is `NULL` (case-insensitive) | No quotes | `NULL` |
+| Boolean | `true` / `false` (case-insensitive) | No quotes | `true` |
+| Number | Can be parsed by `float()` | No quotes | `123`, `-45.67`, `1.23e-4` |
+| SESSION_CONFIGS | Contains `SESSION_CONFIGS(` | No quotes | `SESSION_CONFIGS()['dt.args.ds']` |
+| Function call | Matches `identifier(...)` with balanced parentheses | No quotes | `CURRENT_DATE()`, `YEAR(col)` |
+| String | None of the above match | Add single quotes; escape internal `'` as `''` | `hello` → `'hello'` |
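
Review note: the value-type table above can be read as an ordered chain of checks. The following is a minimal Python sketch of that reading, not the package's actual implementation. The function name `format_partition_value` and its helpers are hypothetical, and requiring the opening and closing quote to be the same character is an assumption that is slightly stricter than the table.

```python
import re

# Hypothetical pattern for "identifier(...)"; the table does not define "identifier".
_FUNC_CALL = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*\(.*\)$", re.DOTALL)


def _balanced(text: str) -> bool:
    """Return True when every ')' closes an earlier '(' and none stay open."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def _is_number(text: str) -> bool:
    """The table's rule: anything float() can parse counts as a number."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def format_partition_value(raw: str) -> str:
    """Apply the table's checks top to bottom; quote only as a last resort."""
    value = raw.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value  # already quoted
    if value.upper() == "NULL" or value.lower() in ("true", "false"):
        return value
    if _is_number(value):
        return value
    if "SESSION_CONFIGS(" in value:
        return value
    if _FUNC_CALL.match(value) and _balanced(value):
        return value
    # Plain string: wrap in single quotes and double any embedded quote.
    return "'" + value.replace("'", "''") + "'"
```

One consequence of the `float()` rule worth knowing: Python's `float()` also accepts inputs such as `nan` and `inf`, so a partition value spelled that way would be emitted unquoted under this reading.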
-### UNION ALL 处理
+### UNION ALL Handling
-如果 SELECT 包含 UNION ALL，每个分支都要独立注入分区列：
+If SELECT contains UNION ALL, inject partition columns into each branch independently:
 ```sql
 SELECT col1, col2,
@@ -130,60 +130,60 @@ FROM table_b
 ### CTE + UNION ALL
-如果有 WITH 子句，先分离 CTE 部分，只对主查询中的 UNION 分支注入。
+If there is a WITH clause, first separate the CTE part, then inject only into the UNION branches in the main query.
-### 已存在的分区列
+### Already-existing Partition Columns
-如果 SELECT 中已经包含了某个分区列（通过 `AS alias` 或末尾标识符检测），则跳过该列的注入，避免重复。
+If SELECT already contains a partition column (detected via `AS alias` or trailing identifier), skip injection for that column to avoid duplication.
-## 第五步：日期函数后处理
+## Step 5: Date Function Post-processing
-生成 DDL 后，对整个 DDL 文本做一次全局替换：
+After generating the DDL, do a global replacement on the entire DDL text:
-| 原始形式 | 替换为 |
+| Original form | Replace with |
 |----------|--------|
 | `DATE_SUB(expr, INTERVAL N DAY)` | `sub_days(expr, N)` |
 | `DATE_ADD(expr, INTERVAL N DAY)` | `sub_days(expr, -N)` |
-这一步确保最终输出统一使用 `sub_days` 函数。
+This step ensures the final output consistently uses the `sub_days` function.
-> 注意：在 SQL 引擎中，`SUB_DAYS` 是 `DATE_SUB` 的别名，两者等价。统一使用 `sub_days` 是为了保持输出一致性。
+> Note: In the SQL engine, `SUB_DAYS` is an alias for `DATE_SUB`; they are equivalent. Using `sub_days` uniformly is for output consistency.
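
Review note: the global replacement in Step 5 could be approximated with a single regular expression. This is a rough Python sketch under stated assumptions: the name `to_sub_days` is hypothetical, only the `INTERVAL N DAY` form from the table is handled, and nested `DATE_SUB`/`DATE_ADD` calls are not supported because a regex cannot track balanced parentheses.

```python
import re

# Matches DATE_SUB(expr, INTERVAL N DAY) and DATE_ADD(expr, INTERVAL N DAY).
# The lazy (.+?) stops at the first ", INTERVAL N DAY)" it can reach.
_DATE_ARITH = re.compile(
    r"\bDATE_(SUB|ADD)\s*\(\s*(.+?)\s*,\s*INTERVAL\s+(\d+)\s+DAY\s*\)",
    re.IGNORECASE | re.DOTALL,
)


def to_sub_days(ddl: str) -> str:
    """Rewrite DATE_SUB/DATE_ADD day arithmetic into sub_days calls."""

    def _replace(match: re.Match) -> str:
        op = match.group(1).upper()
        expr = match.group(2)
        days = match.group(3)
        # DATE_ADD becomes sub_days with a negated day count.
        n = days if op == "SUB" else f"-{days}"
        return f"sub_days({expr}, {n})"

    return _DATE_ARITH.sub(_replace, ddl)
```

A production version would more likely walk a parsed SQL tree, since inputs that mix `DAY` with other interval units or nest date functions can defeat the lazy match.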
-## 第六步：表属性模板合并
+## Step 6: Table Property Template Merge
-默认模板属性：`data_lifecycle = 15`
+Default template property: `data_lifecycle = 15`
-合并规则：
-- 模板属性作为基础
-- 原始 DDL 中的 TBLPROPERTIES 覆盖同名模板属性
-- 最终结果写入 TBLPROPERTIES
+Merge rules:
+- Template properties serve as the base
+- TBLPROPERTIES from the original DDL override template properties with the same name
+- Final result is written to TBLPROPERTIES
 ```sql
--- 模板: data_lifecycle=15
--- 原始DDL: TBLPROPERTIES('compression'='snappy', 'data_lifecycle'='30')
--- 合并结果:
+-- Template: data_lifecycle=15
+-- Original DDL: TBLPROPERTIES('compression'='snappy', 'data_lifecycle'='30')
+-- Merged result:
 TBLPROPERTIES ('data_lifecycle' = '30', 'compression' = 'snappy')
--- data_lifecycle 保留原始值 30，compression 来自原始DDL
+-- data_lifecycle retains original value 30; compression comes from original DDL
 ```
-## 完整示例
+## Complete Example
-### 输入1：DDL
+### Input 1: DDL
 ```sql
 CREATE TABLE IF NOT EXISTS sales_data (
-    id BIGINT COMMENT '销售记录ID',
-    product_name STRING COMMENT '产品名称',
-    sales_amount DECIMAL(12,2) COMMENT '销售金额'
+    id BIGINT COMMENT 'Sales record ID',
+    product_name STRING COMMENT 'Product name',
+    sales_amount DECIMAL(12,2) COMMENT 'Sales amount'
 )
 PARTITIONED BY (
-    year INT COMMENT '年份',
-    month INT COMMENT '月份'
+    year INT COMMENT 'Year',
+    month INT COMMENT 'Month'
 )
 STORED AS PARQUET
 LOCATION '/data/warehouse/sales_data';
 ```
-### 输入2：INSERT OVERWRITE
+### Input 2: INSERT OVERWRITE
 ```sql
 INSERT OVERWRITE TABLE sales_data
 PARTITION (year, month)
@@ -197,17 +197,17 @@ FROM raw_sales s
 WHERE s.status = 'completed';
 ```
-### 输出：Dynamic Table DDL
+### Output: Dynamic Table DDL
 ```sql
--- 可选：如果需要删除已存在的同名表，请取消下一行的注释
+-- Optional: to drop an existing table with the same name, uncomment the next line
 -- DROP TABLE IF EXISTS sales_data;
 CREATE OR REPLACE DYNAMIC TABLE sales_data (
-    id BIGINT COMMENT '销售记录ID',
-    product_name STRING COMMENT '产品名称',
-    sales_amount DECIMAL(12,2) COMMENT '销售金额',
-    year INT COMMENT '年份',
-    month INT COMMENT '月份'
+    id BIGINT COMMENT 'Sales record ID',
+    product_name STRING COMMENT 'Product name',
+    sales_amount DECIMAL(12,2) COMMENT 'Sales amount',
+    year INT COMMENT 'Year',
+    month INT COMMENT 'Month'
 )
 PARTITIONED BY (year, month)
 STORED AS PARQUET

package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md CHANGED

@@ -1,164 +1,164 @@
-# SQL 占位符 → SESSION_CONFIGS() 转换规则
+# SQL Placeholder → SESSION_CONFIGS() Conversion Rules
-你是一个 SQL 转换专家。在将传统 SQL 转换为 Dynamic Table SQL 时，需要将各种占位符格式统一转换为 `SESSION_CONFIGS()` 函数调用。
+You are a SQL conversion expert. When converting traditional SQL to Dynamic Table SQL, you need to convert various placeholder formats uniformly to `SESSION_CONFIGS()` function calls.
-## 占位符格式统一
+## Placeholder Format Normalization
-首先将所有旧格式统一为 `${...}` 格式：
+First, normalize all legacy formats to `${...}` format:
-| 旧格式 | 统一为 |
+| Legacy format | Normalize to |
 |--------|--------|
 | `{{ var }}` | `${var}` |
 | `{{ ds }}` | `${ds}` |
 | `{{region}}` | `${region}` |
-转换正则：`\{\{\s*([^}]+)\s*\}\}` → `${\1}`
+Conversion regex: `\{\{\s*([^}]+)\s*\}\}` → `${\1}`
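
Review note: applied literally in Python's `re`, the greedy `[^}]+` in the regex above also captures the whitespace before `}}`, so `{{ ds }}` would become `${ds }` rather than the `${ds}` shown in the table. The sketch below uses a lazy quantifier to get the table's output; `normalize_placeholders` is a hypothetical name and the lazy variant is a suggestion, not the package's code.

```python
import re

# Same shape as the documented regex, but with a lazy capture so that
# trailing whitespace is consumed by the following \s* instead.
_LEGACY = re.compile(r"\{\{\s*([^}]+?)\s*\}\}")


def normalize_placeholders(sql: str) -> str:
    """Rewrite {{ var }} style placeholders to ${var}."""
    return _LEGACY.sub(lambda m: "${" + m.group(1) + "}", sql)
```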
-## 基本替换规则
+## Basic Replacement Rules
-### 简单变量
+### Simple Variables
-| 输入 | 输出 |
+| Input | Output |
 |------|------|
 | `${ds}` | `SESSION_CONFIGS()['dt.args.ds']` |
 | `${region}` | `SESSION_CONFIGS()['dt.args.region']` |
 | `${hour}` | `SESSION_CONFIGS()['dt.args.hour']` |
-### nodash 变量（特殊处理）
+### nodash Variables (Special Handling)
-变量名中包含 `nodash` 时，自动包装 DATE_FORMAT，但变量名保持原样：
+When the variable name contains `nodash`, automatically wrap with DATE_FORMAT, but keep the variable name as-is:
-| 输入 | 输出 |
+| Input | Output |
 |------|------|
 | `${ds_nodash}` | `DATE_FORMAT(SESSION_CONFIGS()['dt.args.ds_nodash'], 'yyyyMMdd')` |
 | `${dsnodash}` | `DATE_FORMAT(SESSION_CONFIGS()['dt.args.dsnodash'], 'yyyyMMdd')` |
-注意：变量名保持原样（`ds_nodash` 不会变成 `ds`），只是外层包 DATE_FORMAT。
+Note: the variable name stays as-is (`ds_nodash` does not become `ds`); only the outer DATE_FORMAT is added.
-### 带运算的变量
+### Variables with Arithmetic
-最终输出统一使用 `sub_days` 函数（有一个后处理步骤会将所有 `DATE_SUB`/`DATE_ADD` 转为 `sub_days`）：
+The final output consistently uses the `sub_days` function (a post-processing step converts all `DATE_SUB`/`DATE_ADD` to `sub_days`):
-| 输入 | 最终输出 |
+| Input | Final output |
 |------|----------|
 | `${ds - 1}` | `DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], 1), 'yyyy-MM-dd')` |
 | `${ds + 7}` | `DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], -7), 'yyyy-MM-dd')` |
 | `${ds_nodash - 1}` | `DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds_nodash'], 1), 'yyyyMMdd')::STRING` |
-规则：
-- `-` 运算 → `sub_days(..., N)`（N 为正数）
-- `+` 运算 → `sub_days(..., -N)`（N 取反为负数）
-- 外层包 `DATE_FORMAT`，格式根据变量名决定：
-  - 含 `nodash` → `'yyyyMMdd'`
-  - 不含 `nodash` → `'yyyy-MM-dd'`
-- 含 `nodash` 的变量带运算时，追加 `::STRING` 类型转换
+Rules:
+- `-` operation → `sub_days(..., N)` (N is positive)
+- `+` operation → `sub_days(..., -N)` (N negated to negative)
+- Outer `DATE_FORMAT`, format determined by variable name:
+  - Contains `nodash` → `'yyyyMMdd'`
+  - Does not contain `nodash` → `'yyyy-MM-dd'`
+- Variables containing `nodash` with arithmetic append `::STRING` type cast
-注意：这是最终输出形式。中间步骤可能先生成 `DATE_SUB`/`DATE_ADD`，但最终会被后处理统一转为 `sub_days`。
+Note: this is the final output form. Intermediate steps may first generate `DATE_SUB`/`DATE_ADD`, but they will be uniformly converted to `sub_days` by post-processing.
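
Review note: the simple, `nodash`, and arithmetic tables can be covered by one small converter. This Python sketch handles a single unquoted `${...}` token and produces the final `sub_days` form directly, skipping the intermediate `DATE_SUB`/`DATE_ADD` step. The name `convert_placeholder` is hypothetical, and the quote-context cases are out of scope here.

```python
import re

# ${name}, ${name - N}, or ${name + N}; anything else is rejected.
_PLACEHOLDER = re.compile(r"^\$\{\s*(\w+)\s*(?:([+-])\s*(\d+))?\s*\}$")


def convert_placeholder(token: str) -> str:
    """Convert one ${...} placeholder into a SESSION_CONFIGS() expression."""
    match = _PLACEHOLDER.match(token)
    if match is None:
        raise ValueError(f"unsupported placeholder: {token!r}")
    name, op, days = match.groups()
    ref = f"SESSION_CONFIGS()['dt.args.{name}']"
    nodash = "nodash" in name
    fmt = "yyyyMMdd" if nodash else "yyyy-MM-dd"
    if op is None:
        # No arithmetic: only nodash variables get the DATE_FORMAT wrapper.
        return f"DATE_FORMAT({ref}, '{fmt}')" if nodash else ref
    # '-' keeps N positive, '+' negates it, per the rules above.
    n = days if op == "-" else f"-{days}"
    expr = f"DATE_FORMAT(sub_days({ref}, {n}), '{fmt}')"
    # nodash variables with arithmetic get the ::STRING cast appended.
    return expr + "::STRING" if nodash else expr
```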
-### macros.ds_add 函数
+### macros.ds_add Function
-| 输入 | 输出 |
+| Input | Output |
 |------|------|
 | `${macros.ds_add(ds, -1)}` | `DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], 1), 'yyyy-MM-dd')` |
 | `${macros.ds_add(ds, 7)}` | `DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], -7), 'yyyy-MM-dd')` |
-注意：`macros.ds_add` 的第二个参数与 `sub_days` 的参数符号相反。`macros.ds_add(ds, -1)` 表示 ds 减 1 天，对应 `sub_days(ds, 1)`（正数=减天数）；`macros.ds_add(ds, 7)` 表示 ds 加 7 天，对应 `sub_days(ds, -7)`（负数=加天数）。
+Note: the second parameter of `macros.ds_add` has the opposite sign from `sub_days`. `macros.ds_add(ds, -1)` means ds minus 1 day, corresponding to `sub_days(ds, 1)` (positive = subtract days); `macros.ds_add(ds, 7)` means ds plus 7 days, corresponding to `sub_days(ds, -7)` (negative = add days).
-## 引号上下文规则
+## Quote Context Rules
-占位符的处理方式取决于它所在的引号上下文：
+The handling of a placeholder depends on the quote context it is in:
-### 情况1：占位符在单引号内（纯占位符）
+### Case 1: Placeholder inside single quotes (pure placeholder)
 ```sql
--- 输入
+-- Input
 WHERE dt = '${ds}'
--- 输出（去除外层引号，直接替换）
+-- Output (remove outer quotes; direct replacement)
 WHERE dt = SESSION_CONFIGS()['dt.args.ds']
 ```
-### 情况2：占位符在单引号内（混合内容）
+### Case 2: Placeholder inside single quotes (mixed content)
-当引号内同时包含占位符和字面文本时，使用 CONCAT：
+When the quoted string contains both a placeholder and literal text, use CONCAT:
 ```sql
--- 输入
+-- Input
 WHERE dt = '${ds_nodash}_done'
--- 输出
+-- Output
 WHERE dt = CONCAT(DATE_FORMAT(SESSION_CONFIGS()['dt.args.ds_nodash'], 'yyyyMMdd'), '_done')
 ```
 ```sql
--- 输入
+-- Input
 WHERE path = '/data/${region}/output'
--- 输出
+-- Output
 WHERE path = CONCAT('/data/', SESSION_CONFIGS()['dt.args.region'], '/output')
 ```
-### 情况3：占位符不在引号内
+### Case 3: Placeholder not inside quotes
 ```sql
--- 输入
+-- Input
 WHERE dt = ${ds}
--- 输出
+-- Output
 WHERE dt = SESSION_CONFIGS()['dt.args.ds']
 ```
-### 情况4：占位符在单引号内，且是日期运算
+### Case 4: Placeholder inside single quotes with date arithmetic
 ```sql
--- 输入
+-- Input
 WHERE dt = '${ds - 1}'
--- 输出（去除外层引号，添加 ::STRING 类型转换）
+-- Output (remove outer quotes; add ::STRING type cast)
 WHERE dt = DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], 1), 'yyyy-MM-dd')::STRING
 ```
-### 引号内的引号选择
+### Quote Selection Inside Strings
-当替换后的表达式仍然处于单引号字符串内部时（如 CONCAT 场景），SESSION_CONFIGS 的键名使用双引号以避免引号冲突：
+When the replaced expression is still inside a single-quoted string (e.g., CONCAT scenario), use double quotes for SESSION_CONFIGS key names to avoid quote conflicts:
 ```sql
--- 在 CONCAT 等单引号上下文中
+-- Inside single-quote context (e.g., CONCAT)
 CONCAT('prefix_', SESSION_CONFIGS()["dt.args.ds"])
--- 独立表达式（外层引号已去除）
+-- Standalone expression (outer quotes already removed)
 SESSION_CONFIGS()['dt.args.ds']
 ```
-## 静态分区中的占位符
+## Placeholders in Static Partitions
-静态分区值中的占位符替换后，值会被注入到 SELECT 子句：
+Placeholders in static partition values are replaced and then injected into the SELECT clause:
 ```sql
--- 输入
+-- Input
 INSERT OVERWRITE TABLE t PARTITION(dt='${ds}', region='${region}')
 SELECT col1 FROM source;
--- 转换后
+-- After conversion
 SELECT col1,
     SESSION_CONFIGS()['dt.args.ds'] AS dt,
     SESSION_CONFIGS()['dt.args.region'] AS region
 FROM source;
 ```
-## 不可识别的表达式
+## Unrecognizable Expressions
-对于无法解析的复杂表达式（如 Airflow Jinja 模板），进行清洗：
-1. 将 Python strftime 格式符转为 SQL 风格：`%Y`→`yyyy`, `%m`→`MM`, `%d`→`dd`, `%H`→`HH`
-2. 非字母数字下划线字符替换为 `_`
-3. 合并连续下划线，去除首尾下划线
-4. 用清洗后的字符串作为 SESSION_CONFIGS 的键名
+For complex expressions that cannot be parsed (e.g., Airflow Jinja templates), clean them up:
+1. Convert Python strftime format specifiers to SQL style: `%Y`→`yyyy`, `%m`→`MM`, `%d`→`dd`, `%H`→`HH`
+2. Replace non-alphanumeric-underscore characters with `_`
+3. Merge consecutive underscores; remove leading/trailing underscores
+4. Use the cleaned string as the SESSION_CONFIGS key name
 ```sql
--- 输入
+-- Input
 ${execution_date.strftime("%H00")}
--- 清洗后键名: execution_date_strftime_HH00
--- 输出
+-- Cleaned key name: execution_date_strftime_HH00
+-- Output
 SESSION_CONFIGS()['dt.args.execution_date_strftime_HH00']
 ```
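
Review note: the four cleanup steps for unrecognizable expressions translate almost directly into string operations. A minimal Python sketch, with hypothetical names `clean_key` and `to_session_config`, assuming the input is the text between `${` and `}`:

```python
import re

# Step 1 mapping from Python strftime specifiers to SQL-style patterns.
_STRFTIME_TO_SQL = {"%Y": "yyyy", "%m": "MM", "%d": "dd", "%H": "HH"}


def clean_key(expression: str) -> str:
    """Turn an arbitrary template expression into a SESSION_CONFIGS key name."""
    cleaned = expression
    for specifier, sql_pattern in _STRFTIME_TO_SQL.items():
        cleaned = cleaned.replace(specifier, sql_pattern)
    # Step 2: anything outside [A-Za-z0-9_] becomes an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", cleaned)
    # Step 3: collapse underscore runs, then trim both ends.
    return re.sub(r"_+", "_", cleaned).strip("_")


def to_session_config(expression: str) -> str:
    """Step 4: use the cleaned string as the dt.args key."""
    return f"SESSION_CONFIGS()['dt.args.{clean_key(expression)}']"


print(to_session_config('execution_date.strftime("%H00")'))
# SESSION_CONFIGS()['dt.args.execution_date_strftime_HH00']
```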
-## 完整示例
+## Complete Example
-### 输入
+### Input
 ```sql
 INSERT OVERWRITE TABLE kscdm.dim_table
 PARTITION(p_date='{{ ds_nodash }}_done', product='done', dt='{{ ds }}')
@@ -169,7 +169,7 @@ WHERE dt = '{{ ds }}'
   AND region = '{{ region }}';
 ```
-### 输出（占位符替换后）
+### Output (after placeholder replacement)
 ```sql
 SELECT id, name,
     CONCAT(DATE_FORMAT(SESSION_CONFIGS()['dt.args.ds_nodash'], 'yyyyMMdd'), '_done') AS p_date,

package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md CHANGED

@@ -1,20 +1,20 @@
-# Dynamic Table Refresh 与调度文件生成规则
+# Dynamic Table Refresh and Scheduling File Generation Rules
-你是一个 SQL 转换专家。在生成 Dynamic Table DDL 之后，还需要生成配套的 refresh 语句、回填语句和调度配置文件。
+You are a SQL conversion expert. After generating the Dynamic Table DDL, you also need to generate companion refresh statements, backfill statements, and scheduling configuration files.
-## Refresh 语句生成
+## Refresh Statement Generation
-### 变量提取
+### Variable Extraction
-从转换后的 DDL 中提取所有 `SESSION_CONFIGS()['dt.args.XXX']` 中的变量名 XXX，去重后排序。
+Extract all variable names XXX from `SESSION_CONFIGS()['dt.args.XXX']` in the converted DDL, deduplicate, and sort.
-注意：只提取 DDL 中实际出现的变量名。例如如果 DDL 中只有 `SESSION_CONFIGS()['dt.args.ds_nodash']`，则只生成 `ds_nodash` 一个变量的 SET 语句。
+Note: only extract variable names that actually appear in the DDL. For example, if the DDL only contains `SESSION_CONFIGS()['dt.args.ds_nodash']`, only generate a SET statement for the `ds_nodash` variable.
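
Review note: variable extraction amounts to a regex scan followed by deduplication and sorting. The Python sketch below uses hypothetical names `extract_variables` and `set_statements`. Accepting double-quoted keys in addition to single-quoted ones is an assumption based on the double-quoted form shown in the placeholder rules.

```python
import re

# Captures XXX from SESSION_CONFIGS()['dt.args.XXX'] (single or double quotes).
_ARG = re.compile(r"SESSION_CONFIGS\(\)\[['\"]dt\.args\.([A-Za-z0-9_]+)['\"]\]")


def extract_variables(ddl: str) -> list[str]:
    """Return the distinct dt.args variable names in the DDL, sorted alphabetically."""
    return sorted(set(_ARG.findall(ddl)))


def set_statements(ddl: str) -> list[str]:
    """Build one 'set dt.args.name = ${name};' line per extracted variable."""
    return [f"set dt.args.{name} = ${{{name}}};" for name in extract_variables(ddl)]
```

Only names that actually occur in the DDL text are returned, so a DDL that references just `ds_nodash` yields a single SET line.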
-### 三类 Refresh 文件
+### Three Types of Refresh Files
-对每个转换的表，生成三类文件：
+For each converted table, generate three types of files:
-#### 1. 当前周期 refresh（`表名_refresh.sql`）
+#### 1. Current-cycle refresh (`table_name_refresh.sql`)
 ```sql
 set dt.args.ds = ${ds};
@@ -22,13 +22,13 @@ set dt.args.region = ${region};
 REFRESH DYNAMIC TABLE schema.table_name PARTITION(ds = '${ds}', region = '${region}');
 ```
-规则：
-- 为每个提取到的变量生成一条 `set dt.args.变量名 = ${变量名};`
-- 变量按字母序排列
-- PARTITION 子句只包含静态分区列（从原始 INSERT OVERWRITE 的 PARTITION 子句中提取）
-- 分区值使用 `'${变量名}'` 格式
+Rules:
+- Generate one `set dt.args.variable_name = ${variable_name};` line for each extracted variable
+- Variables sorted alphabetically
+- PARTITION clause includes only static partition columns (extracted from the PARTITION clause of the original INSERT OVERWRITE)
+- Partition values use `'${variable_name}'` format
-#### 2. 上一周期 refresh（`表名_prev_refresh.sql`）
+#### 2. Previous-cycle refresh (`table_name_prev_refresh.sql`)
 ```sql
 set dt.args.ds = ${prev_ds};
@@ -36,9 +36,9 @@ set dt.args.region = ${prev_region};
 REFRESH DYNAMIC TABLE schema.table_name PARTITION(ds = '${prev_ds}', region = '${prev_region}');
 ```
-规则：每个变量名加 `prev_` 前缀。
+Rules: add `prev_` prefix to each variable name.
-#### 3. 回填语句（`表名_backfill.sql`）
+#### 3. Backfill statement (`table_name_backfill.sql`)
 ```sql
 set cz.optimizer.incremental.backfill.enabled = TRUE;
@@ -49,29 +49,29 @@ FROM ext_schema.table_name
 WHERE ds = '${ds}' AND region = '${region}';
 ```
-规则：
-- 固定的 backfill 开关 SET 语句
-- 从扩展表（ext_schema）SELECT * 到目标表
-- WHERE 条件使用静态分区列（从原始 INSERT OVERWRITE 的 PARTITION 子句中提取）
+Rules:
+- Fixed backfill switch SET statement
+- SELECT * from extension table (ext_schema) into target table
+- WHERE condition uses static partition columns (extracted from the PARTITION clause of the original INSERT OVERWRITE)
-### 无分区表
+### Non-partitioned Tables
-如果表没有静态分区变量：
-- 只生成当前周期 refresh：`REFRESH DYNAMIC TABLE schema.table_name;`
-- 不生成 prev_refresh 和 backfill 文件
+If the table has no static partition variables:
+- Only generate current-cycle refresh: `REFRESH DYNAMIC TABLE schema.table_name;`
+- Do not generate prev_refresh and backfill files
-### 扩展表名规则
+### Extension Table Name Rules
-- 如果指定了 `ext_schema`：`ext_schema.table_name`
+- If `ext_schema` is specified: `ext_schema.table_name`
-## 完整示例
+## Complete Example
-### 输入（转换后的 DDL 含以下变量）
+### Input (converted DDL contains the following variables)
-DDL 中包含：`SESSION_CONFIGS()['dt.args.ds']` 和 `SESSION_CONFIGS()['dt.args.region']`
-原始 PARTITION：`PARTITION(dt='${ds}', region='${region}')`
+DDL contains: `SESSION_CONFIGS()['dt.args.ds']` and `SESSION_CONFIGS()['dt.args.region']`
+Original PARTITION: `PARTITION(dt='${ds}', region='${region}')`
-### 输出
+### Output
 **refresh.sql:**
 ```sql