@clickzetta/cz-cli-darwin-x64 0.3.87-dev.20260528223948 → 0.3.88

Files changed (27)
  1. package/bin/cz-cli +0 -0
  2. package/bin/skills/clickzetta-dynamic-table/SKILL.md +169 -169
  3. package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +126 -126
  4. package/bin/skills/clickzetta-dynamic-table/best-practices/medallion-and-stream-patterns.md +25 -25
  5. package/bin/skills/clickzetta-dynamic-table/best-practices/non-partitioned-merge-into-warning.md +48 -48
  6. package/bin/skills/clickzetta-dynamic-table/best-practices/performance-optimization.md +51 -51
  7. package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +59 -59
  8. package/bin/skills/clickzetta-dynamic-table/dt-creator/SKILL.md +8 -7
  9. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +99 -99
  10. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +188 -188
  11. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +117 -117
  12. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/sql-limitations.md +29 -29
  13. package/bin/skills/clickzetta-dynamic-table/dynamic-table-alter/SKILL.md +80 -79
  14. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +15 -15
  15. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +61 -61
  16. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +100 -100
  17. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +64 -64
  18. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +32 -32
  19. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +21 -21
  20. package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +71 -71
  21. package/bin/skills/clickzetta-sql-pipeline-manager/SKILL.md +203 -202
  22. package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md +62 -62
  23. package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md +34 -34
  24. package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md +61 -61
  25. package/bin/skills/clickzetta-sql-pipeline-manager/references/table-stream.md +41 -41
  26. package/bin/skills/clickzetta-table-stream-pipeline/SKILL.md +103 -101
  27. package/package.json +1 -1
@@ -1,12 +1,12 @@
1
- # Table Stream(表流)SQL 参考
1
+ # Table Stream SQL Reference
2
2
 
3
- > **⚠️ ClickZetta 特有语法**
4
- > - 创建语法是 `CREATE TABLE STREAM`,参数放在 `WITH PROPERTIES (...)`
5
- > - 元数据字段是 `__change_type`(双下划线),值:`INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`
6
- > - UPDATE 产生两条记录:`UPDATE_BEFORE`(更新前)和 `UPDATE_AFTER`(更新后)
7
- > - 通常只需要 `UPDATE_AFTER` `INSERT`,忽略 `UPDATE_BEFORE`
3
+ > **⚠️ ClickZetta-specific syntax**
4
+ > - Creation syntax is `CREATE TABLE STREAM`, with parameters inside `WITH PROPERTIES (...)`
5
+ > - Metadata field is `__change_type` (double underscore), values: `INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`
6
+ > - UPDATE produces two records: `UPDATE_BEFORE` (before update) and `UPDATE_AFTER` (after update)
7
+ > - Typically only `UPDATE_AFTER` and `INSERT` are needed; `UPDATE_BEFORE` can be ignored
8
8
 
9
- Table Stream 捕获源表的变更数据(INSERT / UPDATE / DELETE),是构建 CDC 管道的核心对象。通常与 Dynamic Table SQL 任务配合消费变更数据。
9
+ Table Stream captures change data (INSERT / UPDATE / DELETE) from a source table and is the core object for building CDC pipelines. It is typically consumed by Dynamic Tables or SQL tasks.
10
10
 
11
11
  ## CREATE TABLE STREAM
12
12
 
@@ -21,48 +21,48 @@ CREATE [ OR REPLACE ] TABLE STREAM [ IF NOT EXISTS ] <stream_name>
21
21
  );
22
22
  ```
23
23
 
24
- **关键参数:**
25
- - `TABLE_STREAM_MODE = STANDARD`(默认):捕获 INSERTUPDATEDELETE 所有变更,每行附带 `__change_type` 字段(`INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`)
26
- - `TABLE_STREAM_MODE = APPEND_ONLY`:只捕获 INSERT,性能更好,适合仅追加写入的源表
27
- - `SHOW_INITIAL_ROWS = TRUE`:首次消费返回建 Stream 时表中已有行;`FALSE`(默认)仅返回建 Stream 后的新变更
28
- - `TIMESTAMP AS OF`:指定 Stream 从哪个时间点开始捕获变更
24
+ **Key parameters:**
25
+ - `TABLE_STREAM_MODE = STANDARD` (default): captures all changes (INSERT, UPDATE, DELETE); each row includes a `__change_type` field (`INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`)
26
+ - `TABLE_STREAM_MODE = APPEND_ONLY`: captures INSERT only; better performance, suitable for append-only source tables
27
+ - `SHOW_INITIAL_ROWS = TRUE`: first consumption returns rows already in the table when the Stream was created; `FALSE` (default) returns only new changes after Stream creation
28
+ - `TIMESTAMP AS OF`: specifies the point in time from which the Stream starts capturing changes
29
29
 
30
- **示例:**
30
+ **Examples:**
31
31
  ```sql
32
- -- 在普通表上创建标准流(捕获所有变更,需先开启 change_tracking
32
+ -- Create a standard stream on a regular table (captures all changes; change_tracking must be enabled first)
33
33
  ALTER TABLE ods.orders SET PROPERTIES ('change_tracking' = 'true');
34
34
 
35
35
  CREATE TABLE STREAM orders_stream
36
36
  ON TABLE ods.orders
37
37
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'STANDARD');
38
38
 
39
- -- 仅追加流
39
+ -- Append-only stream
40
40
  CREATE TABLE STREAM events_stream
41
41
  ON TABLE dw.events
42
- COMMENT '事件流,仅追加'
42
+ COMMENT 'Event stream, append only'
43
43
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'APPEND_ONLY');
44
44
 
45
- -- 从指定时间点开始捕获
45
+ -- Start capturing from a specific timestamp
46
46
  CREATE TABLE STREAM orders_stream_from_ts
47
47
  ON TABLE ods.orders
48
48
  TIMESTAMP AS OF '2024-01-01 00:00:00'
49
49
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'STANDARD', 'SHOW_INITIAL_ROWS' = 'TRUE');
50
50
  ```
51
51
 
52
- ## 消费 Table Stream
52
+ ## Consuming a Table Stream
53
53
 
54
- Table Stream offset 通过 DML 操作移动。**仅 SELECT 不会移动 offset**,可以反复查询预览。执行 DMLINSERT INTO / MERGE INTO / UPDATE / DELETE)消费数据后,offset 前进。
54
+ The Table Stream offset advances through DML operations. **SELECT alone does not advance the offset** — you can query repeatedly for preview. Executing DML (INSERT INTO / MERGE INTO / UPDATE / DELETE) consumes the data and advances the offset.
55
55
 
56
56
  ```sql
57
- -- 查看当前未消费的变更数据(不移动 offset
57
+ -- View current unconsumed change data (does not advance offset)
58
58
  SELECT * FROM orders_stream;
59
59
 
60
- -- 变更数据包含的系统字段
60
+ -- System fields included in change data:
61
61
  -- __change_type: INSERT | UPDATE_BEFORE | UPDATE_AFTER | DELETE
62
- -- __commit_version: 变更版本号
63
- -- __commit_timestamp: 变更发生时间
62
+ -- __commit_version: change version number
63
+ -- __commit_timestamp: time the change occurred
64
64
 
65
- -- 典型用法:将变更数据 MERGE 到目标表(过滤掉 UPDATE_BEFORE
65
+ -- Typical usage: MERGE change data into target table (filter out UPDATE_BEFORE)
66
66
  MERGE INTO dw.orders_dim AS target
67
67
  USING (
68
68
  SELECT * FROM orders_stream
@@ -73,7 +73,7 @@ WHEN MATCHED AND src.__change_type = 'UPDATE_AFTER' THEN UPDATE SET target.statu
73
73
  WHEN MATCHED AND src.__change_type = 'DELETE' THEN DELETE
74
74
  WHEN NOT MATCHED AND src.__change_type IN ('INSERT', 'UPDATE_AFTER') THEN INSERT (order_id, status, amount) VALUES (src.order_id, src.status, src.amount);
75
75
 
76
- -- 配合 Dynamic Table 自动消费(推荐)
76
+ -- Consume automatically with a Dynamic Table (recommended)
77
77
  CREATE OR REPLACE DYNAMIC TABLE dw.orders_processed
78
78
  REFRESH INTERVAL 1 MINUTE vcluster default
79
79
  AS
@@ -91,35 +91,35 @@ DROP TABLE STREAM [ IF EXISTS ] <stream_name>;
91
91
  ## SHOW / DESC
92
92
 
93
93
  ```sql
94
- -- 列出当前 schema 下所有 Table Stream
94
+ -- List all Table Streams in the current schema
95
95
  SHOW TABLE STREAMS;
96
96
 
97
- -- 列出指定 schema 下的 Table Stream
97
+ -- List Table Streams in a specific schema
98
98
  SHOW TABLE STREAMS IN <schema_name>;
99
99
 
100
- -- 按名称过滤
100
+ -- Filter by name
101
101
  SHOW TABLE STREAMS LIKE 'orders%';
102
102
 
103
- -- 查看 Table Stream 详情(源表、模式、创建时间)
103
+ -- View Table Stream details (source table, mode, creation time)
104
104
  DESC TABLE STREAM <stream_name>;
105
105
  ```
106
106
 
107
- ## 注意事项
107
+ ## Notes
108
108
 
109
- - SELECT 不会移动 offset,可反复查询预览
110
- - DML 操作(INSERT INTO / MERGE INTO / UPDATE / DELETE)会移动 offset
111
- - ⚠️ 即使 DML WHERE 条件过滤了部分行,**所有行的 offset 都会移动**
112
- - 若长时间不消费,超出源表的 `data_retention_days` 后数据会丢失
113
- - `STANDARD` 模式下 UPDATE 会产生两条记录:`UPDATE_BEFORE`(更新前)和 `UPDATE_AFTER`(更新后)
114
- - 消费时通常过滤 `__change_type != 'UPDATE_BEFORE'`,忽略旧值
115
- - 源表需先开启 `change_tracking`:`ALTER TABLE name SET PROPERTIES ('change_tracking' = 'true')`
109
+ - SELECT alone does not advance the offset; you can query repeatedly for preview
110
+ - DML operations (INSERT INTO / MERGE INTO / UPDATE / DELETE) advance the offset
111
+ - ⚠️ Even if a DML has a WHERE clause that filters some rows, **the offset advances for all rows**
112
+ - If the Stream is not consumed for longer than the source table's `data_retention_days`, the unconsumed change data is lost
113
+ - In `STANDARD` mode, UPDATE produces two records: `UPDATE_BEFORE` (before update) and `UPDATE_AFTER` (after update)
114
+ - When consuming, typically filter `__change_type != 'UPDATE_BEFORE'` to ignore old values
115
+ - The source table must have `change_tracking` enabled first: `ALTER TABLE name SET PROPERTIES ('change_tracking' = 'true')`
116
116
 
117
- ## 参考文档
117
+ ## Reference Documentation
118
118
 
119
119
  - [CREATE TABLE STREAM](https://www.yunqi.tech/documents/create-table-stream)
120
120
  - [DESC TABLE STREAM](https://www.yunqi.tech/documents/desc-table-stream)
121
121
  - [SHOW TABLE STREAMS](https://www.yunqi.tech/documents/show-table-streams)
122
122
  - [DROP TABLE STREAM](https://www.yunqi.tech/documents/drop-table-stream)
123
- - [TABLE STREAM 简介](https://www.yunqi.tech/documents/tablestream_summary)
124
- - [Table Stream 变化数据捕获](https://www.yunqi.tech/documents/table_stream)
125
- - [Table Stream 最佳实践](https://www.yunqi.tech/documents/lakehouse-table-stream-best-practices)
123
+ - [Table Stream Overview](https://www.yunqi.tech/documents/tablestream_summary)
124
+ - [Table Stream Change Data Capture](https://www.yunqi.tech/documents/table_stream)
125
+ - [Table Stream Best Practices](https://www.yunqi.tech/documents/lakehouse-table-stream-best-practices)
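
The offset rules in the notes above can be illustrated with a small in-memory model. This is a sketch for reasoning only: `ToyStream`, `capture`, `select`, and `consume` are hypothetical names, not part of any ClickZetta API, and the failure case follows the documented rule that a failed DML leaves the offset where it was.

```python
class ToyStream:
    """In-memory stand-in for a Table Stream (hypothetical, not the ClickZetta API)."""

    def __init__(self):
        self._log = []     # every change record captured so far
        self._offset = 0   # index of the first unconsumed record

    def capture(self, record):
        self._log.append(record)

    def select(self):
        # Preview: return the unconsumed records and leave the offset alone.
        return list(self._log[self._offset:])

    def consume(self, sink, where=lambda r: True):
        # DML-style consumption: write the rows that pass the filter, then
        # advance past ALL pending rows. If the sink raises, the offset stays put.
        pending = self.select()
        sink([r for r in pending if where(r)])
        self._offset += len(pending)


stream = ToyStream()
stream.capture({"order_id": 1, "__change_type": "INSERT"})
stream.capture({"order_id": 1, "__change_type": "UPDATE_BEFORE"})
stream.capture({"order_id": 1, "__change_type": "UPDATE_AFTER"})

preview_1 = stream.select()
preview_2 = stream.select()      # same rows again; nothing was consumed


def failing_sink(rows):
    raise RuntimeError("target table does not exist")


try:
    stream.consume(failing_sink)
except RuntimeError:
    pass
after_failure = stream.select()  # the failed DML left all rows pending

target = []
stream.consume(target.extend, where=lambda r: r["__change_type"] != "UPDATE_BEFORE")
after_success = stream.select()  # the filtered-out row has been consumed as well
```

The last call is the case the warning describes: the `UPDATE_BEFORE` row never reaches `target`, yet it is no longer available from the stream afterwards.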
@@ -1,89 +1,91 @@
1
1
  ---
2
2
  name: clickzetta-table-stream-pipeline
3
3
  description: |
4
- 搭建和管理 ClickZetta Table Stream 变更数据捕获管道,覆盖从源表配置、Stream 创建、
5
- 数据消费到增量 ETL 的端到端工作流。当用户说"创建 Table Stream"、"Table Stream CDC"、
6
- "Table Stream 管道"、"Table Stream 增量消费"、"Stream 消费"时触发。
7
- 包含变更跟踪开启、模式选择、offset 管理、元数据字段使用、幂等消费等 ClickZetta 特有逻辑。
4
+ Build and manage ClickZetta Table Stream change data capture pipelines, covering the
5
+ end-to-end workflow from source table configuration, Stream creation, and data consumption
6
+ to incremental ETL. Trigger when the user says "create Table Stream", "Table Stream CDC",
7
+ "Table Stream pipeline", "Table Stream incremental consumption", or "Stream consumption".
8
+ Includes change tracking enablement, mode selection, offset management, metadata field usage,
9
+ and idempotent consumption — all ClickZetta-specific logic.
8
10
  Keywords: table stream, CDC, change capture, incremental ETL, stream
9
11
  ---
10
12
 
11
- # Table Stream 变更数据捕获工作流
13
+ # Table Stream Change Data Capture Workflow
12
14
 
13
- ## 指令
15
+ ## Instructions
14
16
 
15
- ### 步骤 1:开启源表变更跟踪(必需前置)
16
- 执行 SQL 开启源表的 change_tracking
17
+ ### Step 1: Enable Change Tracking on the Source Table (Required Prerequisite)
18
+ Execute SQL to enable `change_tracking` on the source table:
17
19
  ```sql
18
20
  ALTER TABLE <source_table> SET PROPERTIES ('change_tracking' = 'true');
19
21
  ```
20
- - 这是强制性前置步骤,不执行则 Stream 无法正确捕获变更
21
- - 验证属性是否生效(两种方法):
22
+ - This is a mandatory prerequisite. Without it, the Stream cannot correctly capture changes.
23
+ - Verify the property took effect (two methods):
22
24
  ```sql
23
- -- 方法 1DESC EXTENDED 查看 properties
25
+ -- Method 1: DESC EXTENDED to view properties
24
26
  DESC EXTENDED <source_table>;
25
27
 
26
- -- 方法 2:查询 information_schema
28
+ -- Method 2: Query information_schema
27
29
  SELECT table_name, properties FROM information_schema.tables WHERE table_name = '<source_table>';
28
30
  ```
29
31
 
30
- ### 步骤 2:创建 Table Stream
31
- 执行 SQL 创建 Stream
32
+ ### Step 2: Create a Table Stream
33
+ Execute SQL to create the Stream:
32
34
  ```sql
33
35
  CREATE [ OR REPLACE ] TABLE STREAM <stream_name>
34
36
  ON TABLE <source_table>
35
37
  [ TIMESTAMP AS OF '<timestamp>' ]
36
- [ COMMENT '<描述>' ]
38
+ [ COMMENT '<description>' ]
37
39
  WITH PROPERTIES (
38
40
  'TABLE_STREAM_MODE' = 'STANDARD | APPEND_ONLY',
39
41
  'SHOW_INITIAL_ROWS' = 'TRUE | FALSE'
40
42
  );
41
43
  ```
42
- 关键参数选择:
43
- - **STANDARD 模式**:捕获 INSERT/UPDATE/DELETE,反映表当前状态(delta 变化)适用于数据同步、增量 ETL
44
- - delta 变化指两个事务时间点之间的净变化。例如:先 INSERT DELETE 同一行 → delta 为空;先 INSERT UPDATE → delta 为一条新行(最终状态)
45
- - **APPEND_ONLY 模式**:仅捕获 INSERT,保留所有历史插入记录适用于审计、历史记录保留
46
- - 即使后续 DELETE 了某行,APPEND_ONLY 模式仍保留该行的 INSERT 记录
47
- - **SHOW_INITIAL_ROWS = TRUE**:首次消费返回建 Stream 时表中已有行
48
- - **SHOW_INITIAL_ROWS = FALSE**(默认):首次消费仅返回建 Stream 后的新变更
49
- - 可选:指定起始时间点
44
+ Key parameter selection:
45
+ - **STANDARD mode**: captures INSERT/UPDATE/DELETE, reflecting the current state of the table (delta changes); suitable for data sync and incremental ETL
46
+ - Delta changes refer to the net change between two transaction timestamps. For example: INSERT then DELETE the same row → delta is empty; INSERT then UPDATE → delta is one new row (final state)
47
+ - **APPEND_ONLY mode**: captures INSERT only, retaining all historical insert records; suitable for auditing and historical record retention
48
+ - Even if a row is later DELETEd, APPEND_ONLY mode retains the INSERT record for that row
49
+ - **SHOW_INITIAL_ROWS = TRUE**: first consumption returns rows already in the table when the Stream was created
50
+ - **SHOW_INITIAL_ROWS = FALSE** (default): first consumption returns only new changes after Stream creation
51
+ - Optional: specify a starting timestamp
50
52
  ```sql
51
- -- TIMESTAMP AS OF 用于指定 Stream 的起始读取位点
52
- -- 注意:此功能在某些场景下可能不稳定,建议优先使用默认行为(从创建时刻开始)
53
+ -- TIMESTAMP AS OF specifies the starting read offset for the Stream
54
+ -- Note: this feature may be unstable in some scenarios; prefer the default behavior (start from creation time)
53
55
  CREATE TABLE STREAM <stream_name>
54
56
  ON TABLE <source_table>
55
57
  TIMESTAMP AS OF '<timestamp>'
56
58
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'STANDARD');
57
59
  ```
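
The "delta" behaviour described for STANDARD mode can be pictured as a comparison of two table snapshots. The sketch below is an illustration only: `net_delta` is a hypothetical helper, and modelling the delta as a snapshot comparison is an assumption made for clarity, not a description of how ClickZetta computes it internally.

```python
def net_delta(before, after):
    """Compare two snapshots (dicts keyed by primary key) and list the net changes."""
    changes = []
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old is None:
            changes.append(("INSERT", new))
        elif new is None:
            changes.append(("DELETE", old))
        elif old != new:
            changes.append(("UPDATE_BEFORE", old))
            changes.append(("UPDATE_AFTER", new))
    return changes


snapshot_t0 = {}

# INSERT then DELETE of the same row between t0 and t1: the table is unchanged.
insert_then_delete = net_delta(snapshot_t0, {})

# INSERT then UPDATE: one row remains, holding the final values.
insert_then_update = net_delta(snapshot_t0, {2: {"order_id": 2, "status": "paid"}})

# UPDATE of a row that already existed at t0: a before/after pair.
update_existing = net_delta(
    {3: {"order_id": 3, "status": "new"}},
    {3: {"order_id": 3, "status": "paid"}},
)
```

Under this model the intermediate states never appear in the result, which is why STANDARD mode suits synchronization while APPEND_ONLY suits auditing.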
58
60
 
59
- ### 步骤 3:准备目标表
60
- 创建与源表结构兼容的目标表:
61
- - 目标表列定义需包含源表的业务列
62
- - 建议额外添加元数据列(如 sync_versionsync_timestamp)用于追踪
61
+ ### Step 3: Prepare the Target Table
62
+ Create a target table with a structure compatible with the source table:
63
+ - The target table column definitions must include the business columns from the source table
64
+ - Recommended: add extra metadata columns (e.g., sync_version, sync_timestamp) for tracking
63
65
 
64
- ### 步骤 4:查询 Stream 数据(预览,不移动 offset)
65
- 执行 SELECT 预览 Stream 中的变更数据:
66
+ ### Step 4: Query Stream Data (Preview — Does Not Advance Offset)
67
+ Execute SELECT to preview change data in the Stream:
66
68
  ```sql
67
69
  SELECT *, __change_type, __commit_version, __commit_timestamp
68
70
  FROM <stream_name>;
69
71
  ```
70
- - SELECT 不会移动 offset
71
- - 元数据字段:`__change_type`(值:`INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`)、`__commit_version`、`__commit_timestamp`
72
- - **UPDATE 处理要点**:UPDATE 操作产生两条记录:
73
- - `UPDATE_BEFORE`:更新前的旧值(通常在消费时忽略)
74
- - `UPDATE_AFTER`:更新后的新值(用于写入目标表)
75
- - 消费时务必过滤 `__change_type`,避免将 `UPDATE_BEFORE` 旧值误写入目标表
72
+ - SELECT alone does not advance the offset
73
+ - Metadata fields: `__change_type` (values: `INSERT` / `UPDATE_BEFORE` / `UPDATE_AFTER` / `DELETE`), `__commit_version`, `__commit_timestamp`
74
+ - **UPDATE handling**: an UPDATE operation produces two records:
75
+ - `UPDATE_BEFORE`: the old value before the update (typically ignored during consumption)
76
+ - `UPDATE_AFTER`: the new value after the update (used when writing to the target table)
77
+ - Always filter on `__change_type` during consumption to avoid writing `UPDATE_BEFORE` old values into the target table
76
78
 
77
- ### 步骤 5:消费 Stream 数据(移动 offset)
78
- 执行 DML 操作消费数据:
79
+ ### Step 5: Consume Stream Data (Advances Offset)
80
+ Execute a DML operation to consume data:
79
81
 
80
- #### 方式 A:全量消费(INSERT INTO
82
+ #### Method A: Full Consumption (INSERT INTO)
81
83
  ```sql
82
84
  INSERT INTO <target_table>
83
85
  SELECT <columns> FROM <stream_name>;
84
86
  ```
85
87
 
86
- #### 方式 B:幂等消费(MERGE,推荐)
88
+ #### Method B: Idempotent Consumption (MERGE — recommended)
87
89
  ```sql
88
90
  MERGE INTO <target_table> t
89
91
  USING (SELECT * FROM <stream_name> WHERE __change_type != 'UPDATE_BEFORE') s
@@ -92,68 +94,68 @@ WHEN MATCHED AND s.__change_type IN ('INSERT', 'UPDATE_AFTER') THEN UPDATE SET t
92
94
  WHEN MATCHED AND s.__change_type = 'DELETE' THEN DELETE
93
95
  WHEN NOT MATCHED AND s.__change_type = 'INSERT' THEN INSERT (<columns>) VALUES (s.<columns>);
94
96
  ```
95
- - DML 操作(INSERT/UPDATE/MERGE)会移动 offset
96
- - ⚠️ 即使使用 WHERE 条件过滤,**所有数据的 offset 仍会移动**(不仅是匹配的行)
97
- - 推荐使用 MERGE 实现幂等性,避免重复消费导致数据重复
98
- - USING 子查询中过滤掉 `UPDATE_BEFORE`,避免旧值干扰 MERGE 逻辑
99
- - ⚠️ **MERGE 语法顺序要求**:多个 `WHEN MATCHED` 子句时,**UPDATE 必须在 DELETE 之前**,否则报错(错误信息:`update statement must be before delete statement`)
100
-
101
- ### 步骤 6:验证消费状态
102
- 执行查询确认消费完成:
97
+ - DML operations (INSERT / MERGE / UPDATE / DELETE) advance the offset
98
+ - ⚠️ Even with a WHERE clause that filters some rows, **the offset advances for all rows** (not just the matched ones)
99
+ - Use MERGE for idempotency to avoid duplicate data from repeated consumption
100
+ - Filter out `UPDATE_BEFORE` in the USING subquery to prevent old values from interfering with MERGE logic
101
+ - ⚠️ **MERGE clause ordering requirement**: when multiple `WHEN MATCHED` clauses are present, **UPDATE must come before DELETE**, otherwise an error occurs (error message: `update statement must be before delete statement`)
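
To see why the MERGE form is idempotent, the clause logic of Method B can be mirrored in a few lines of Python. This is a sketch under stated assumptions: `apply_changes` is a hypothetical helper, the target is a plain dict keyed by `order_id`, and records are applied one at a time, whereas a real MERGE is set-based and may behave differently when a batch holds several records for the same key.

```python
def apply_changes(target, changes, key="order_id"):
    """Apply change records to `target` (dict keyed by primary key), MERGE-style."""
    for row in changes:
        kind = row["__change_type"]
        if kind == "UPDATE_BEFORE":
            continue  # old values, filtered out in the USING subquery
        pk = row[key]
        data = {k: v for k, v in row.items() if not k.startswith("__")}
        if pk in target:
            if kind in ("INSERT", "UPDATE_AFTER"):
                target[pk] = data   # WHEN MATCHED ... THEN UPDATE
            elif kind == "DELETE":
                del target[pk]      # WHEN MATCHED ... THEN DELETE
        elif kind == "INSERT":
            target[pk] = data       # WHEN NOT MATCHED ... THEN INSERT
    return target


initial = {
    2: {"order_id": 2, "status": "new", "amount": 20.0},
    3: {"order_id": 3, "status": "new", "amount": 30.0},
}
batch = [
    {"order_id": 1, "status": "new", "amount": 10.0, "__change_type": "INSERT"},
    {"order_id": 2, "status": "new", "amount": 20.0, "__change_type": "UPDATE_BEFORE"},
    {"order_id": 2, "status": "paid", "amount": 20.0, "__change_type": "UPDATE_AFTER"},
    {"order_id": 3, "status": "new", "amount": 30.0, "__change_type": "DELETE"},
]

once = apply_changes(dict(initial), batch)
twice = apply_changes(dict(once), batch)  # replaying the batch changes nothing
```

A plain `INSERT INTO` has no matching step, so replaying the same batch would append the rows a second time.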
102
+
103
+ ### Step 6: Verify Consumption Status
104
+ Execute a query to confirm consumption is complete:
103
105
  ```sql
104
106
  SELECT COUNT(*) FROM <stream_name>;
105
107
  ```
106
- - 消费成功后 COUNT 应为 0 或仅包含新变更
107
- - 记录最后消费的 `__commit_version` 用于故障恢复
108
+ - After successful consumption, COUNT should be 0, or reflect only changes that arrived after the consuming DML ran
109
+ - Record the last consumed `__commit_version` for failure recovery
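
One way to use a recorded `__commit_version` is as a replay guard. The following is a rough sketch, not a documented ClickZetta mechanism: `consume_with_checkpoint` is a hypothetical helper, where the checkpoint is stored is left open, and the comparison assumes `__commit_version` only ever increases.

```python
def consume_with_checkpoint(batch, last_version, sink):
    """Write records newer than `last_version`; return the new checkpoint."""
    # Assumption: __commit_version increases monotonically across commits.
    fresh = [r for r in batch if r["__commit_version"] > last_version]
    if not fresh:
        return last_version
    sink(fresh)
    return max(r["__commit_version"] for r in fresh)


written = []
batch = [
    {"order_id": 1, "__commit_version": 7},
    {"order_id": 2, "__commit_version": 8},
]

checkpoint = consume_with_checkpoint(batch, 0, written.extend)
# Replaying the same batch after a restart writes nothing new.
checkpoint_after_replay = consume_with_checkpoint(batch, checkpoint, written.extend)
```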
108
110
 
109
- ## Offset 移动规则
111
+ ## Offset Advancement Rules
110
112
 
111
- | 操作 | 是否移动 offset | 说明 |
113
+ | Operation | Advances offset? | Notes |
112
114
  |------|----------------|------|
113
- | `SELECT * FROM stream` | 不移动 | 仅预览,可反复查询 |
114
- | `INSERT INTO target SELECT ... FROM stream` | 移动 | 消费数据 |
115
- | `MERGE INTO target USING stream ...` | 移动 | 消费数据(推荐) |
116
- | `UPDATE target SET ... FROM stream` | 移动 | 消费数据 |
117
- | `DELETE FROM target USING stream` | 移动 | 消费数据 |
118
- | WHERE DML | 全部移动 | 即使 WHERE 过滤了部分行,所有行的 offset 都会移动 |
115
+ | `SELECT * FROM stream` | No | Preview only; can be queried repeatedly |
116
+ | `INSERT INTO target SELECT ... FROM stream` | Yes | Consumes data |
117
+ | `MERGE INTO target USING stream ...` | Yes | Consumes data (recommended) |
118
+ | `UPDATE target SET ... FROM stream` | Yes | Consumes data |
119
+ | `DELETE FROM target USING stream` | Yes | Consumes data |
120
+ | DML with WHERE clause | Yes (all rows) | Even if WHERE filters some rows, offset advances for all rows |
119
121
 
120
- > ⚠️ **关键注意**:offset 移动是全量的。一旦执行 DML 消费 Stream,所有变更记录的 offset 都会前进,无法部分消费。如果 DML 执行失败(如目标表不存在),offset 不会移动。
122
+ > ⚠️ **Key note**: offset advancement is all-or-nothing. Once a DML consumes the Stream, the offset advances for all change records — partial consumption is not possible. If the DML fails (e.g., target table does not exist), the offset does not advance.
121
123
 
122
- ## 模式选择速查
124
+ ## Mode Selection Quick Reference
123
125
 
124
- | 需求 | 推荐模式 |
126
+ | Requirement | Recommended mode |
125
127
  |------|---------|
126
- | 数据同步(保持目标与源一致) | STANDARD |
127
- | 增量 ETL 流程 | STANDARD |
128
- | 审计所有插入记录 | APPEND_ONLY |
129
- | 历史记录保留 | APPEND_ONLY |
128
+ | Data sync (keep target consistent with source) | STANDARD |
129
+ | Incremental ETL pipeline | STANDARD |
130
+ | Audit all insert records | APPEND_ONLY |
131
+ | Historical record retention | APPEND_ONLY |
130
132
 
131
- ## 性能优化要点
133
+ ## Performance Optimization Tips
132
134
 
133
- - SELECT 必要列,避免 `SELECT *`
134
- - 定期消费 Stream,避免数据累积
135
- - 高变更率表:更频繁消费;低变更率表:降低频率
136
- - 大型 Stream 可按主键范围拆分并行处理
137
- - 在源表上设置适当的数据保留期
135
+ - Select only necessary columns; avoid `SELECT *`
136
+ - Consume the Stream regularly to prevent data accumulation
137
+ - High-change-rate tables: consume more frequently; low-change-rate tables: reduce frequency
138
+ - Large Streams can be split by primary key range for parallel processing
139
+ - Set an appropriate data retention period on the source table
138
140
 
139
- ## 示例
141
+ ## Examples
140
142
 
141
- ### 示例 1:订单表实时同步
143
+ ### Example 1: Real-time Order Table Sync
142
144
  ```sql
143
- -- 1. 开启源表变更跟踪
145
+ -- 1. Enable change tracking on source table
144
146
  ALTER TABLE orders SET PROPERTIES ('change_tracking' = 'true');
145
147
 
146
- -- 2. 创建 Table Stream
148
+ -- 2. Create Table Stream
147
149
  CREATE TABLE STREAM orders_stream ON TABLE orders
148
150
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'STANDARD', 'SHOW_INITIAL_ROWS' = 'FALSE');
149
151
 
150
- -- 3. 创建目标表(与源表结构兼容)
152
+ -- 3. Create target table (compatible structure with source)
151
153
  CREATE TABLE orders_sync (order_id INT, status STRING, amount DOUBLE);
152
154
 
153
- -- 4. 预览 Stream 数据(不移动 offset
155
+ -- 4. Preview Stream data (does not advance offset)
154
156
  SELECT *, __commit_version, __commit_timestamp FROM orders_stream;
155
157
 
156
- -- 5. 消费 Stream 数据(移动 offset
158
+ -- 5. Consume Stream data (advances offset)
157
159
  MERGE INTO orders_sync t
158
160
  USING (SELECT * FROM orders_stream WHERE __change_type != 'UPDATE_BEFORE') s
159
161
  ON t.order_id = s.order_id
@@ -161,46 +163,46 @@ WHEN MATCHED AND s.__change_type IN ('INSERT', 'UPDATE_AFTER') THEN UPDATE SET t
161
163
  WHEN MATCHED AND s.__change_type = 'DELETE' THEN DELETE
162
164
  WHEN NOT MATCHED AND s.__change_type = 'INSERT' THEN INSERT (order_id, status, amount) VALUES (s.order_id, s.status, s.amount);
163
165
 
164
- -- 6. 验证消费完成
166
+ -- 6. Verify consumption is complete
165
167
  SELECT COUNT(*) FROM orders_stream;
166
168
  ```
167
169
 
168
- ### 示例 2:用户行为审计(保留全部插入历史)
170
+ ### Example 2: User Behavior Audit (Retain Full Insert History)
169
171
  ```sql
170
- -- 1. 开启源表变更跟踪
172
+ -- 1. Enable change tracking on source table
171
173
  ALTER TABLE user_actions SET PROPERTIES ('change_tracking' = 'true');
172
174
 
173
- -- 2. 创建 Table StreamAPPEND_ONLY 模式)
175
+ -- 2. Create Table Stream (APPEND_ONLY mode)
174
176
  CREATE TABLE STREAM user_actions_audit_stream ON TABLE user_actions
175
177
  WITH PROPERTIES ('TABLE_STREAM_MODE' = 'APPEND_ONLY', 'SHOW_INITIAL_ROWS' = 'TRUE');
176
178
 
177
- -- 3. 预览 Stream 数据
179
+ -- 3. Preview Stream data
178
180
  SELECT *, __commit_version, __commit_timestamp FROM user_actions_audit_stream;
179
181
 
180
- -- 4. 消费 Stream 数据(INSERT INTO 移动 offset
182
+ -- 4. Consume Stream data (INSERT INTO advances offset)
181
183
  INSERT INTO user_actions_audit
182
184
  SELECT *, __commit_version AS audit_version, __commit_timestamp AS audit_time
183
185
  FROM user_actions_audit_stream;
184
186
  ```
185
187
 
186
- ## 故障排除
188
+ ## Troubleshooting
187
189
 
188
- Stream 不捕获变更:
189
- 原因:源表未开启 change_tracking
190
- 解决方案:执行 `ALTER TABLE <table> SET PROPERTIES ('change_tracking' = 'true')`,确认 DML Stream 创建后执行
190
+ Stream not capturing changes:
191
+ Cause: `change_tracking` not enabled on the source table
192
+ Solution: Execute `ALTER TABLE <table> SET PROPERTIES ('change_tracking' = 'true')`, then confirm that the DML on the source table was executed after the Stream was created
191
193
 
192
- 无法区分变更类型:
193
- 原因:未在 MERGE/INSERT 中过滤 `__change_type`,导致 `UPDATE_BEFORE` 旧值也被写入目标表
194
- 解决方案:MERGE 时过滤 `__change_type IN ('UPDATE_AFTER', 'DELETE')`,忽略 `UPDATE_BEFORE` 记录
194
+ Cannot distinguish change types:
195
+ Cause: `__change_type` not filtered in MERGE/INSERT, causing `UPDATE_BEFORE` old values to be written to the target table
196
+ Solution: Filter `__change_type IN ('INSERT', 'UPDATE_AFTER', 'DELETE')` in MERGE (equivalently, `__change_type != 'UPDATE_BEFORE'`) so that `UPDATE_BEFORE` records are ignored
195
197
 
196
- 消费后 offset 未移动:
197
- 原因:仅使用 SELECT 查询,未执行 DML
198
- 解决方案:必须通过 INSERT INTO / MERGE INTO / UPDATE 等 DML 操作消费数据
198
+ Offset not advancing after consumption:
199
+ Cause: Only SELECT was used; no DML was executed
200
+ Solution: Data must be consumed via DML operations such as INSERT INTO / MERGE INTO / UPDATE
199
201
 
200
- 重复消费导致目标表数据重复:
201
- 原因:使用 INSERT INTO 而非 MERGE,或消费逻辑非幂等
202
- 解决方案:改用 MERGE 语句;记录最后消费的 `__commit_version` `__commit_timestamp` 用于断点恢复
202
+ Duplicate data in target table from repeated consumption:
203
+ Cause: Using INSERT INTO instead of MERGE, or non-idempotent consumption logic
204
+ Solution: Switch to MERGE statements; record the last consumed `__commit_version` and `__commit_timestamp` for checkpoint recovery
203
205
 
204
- COMMENT 语法错误:
205
- 原因:使用了 `COMMENT = '...'`(带等号)而非 `COMMENT '...'`
206
- 解决方案:正确语法为 `COMMENT '注释内容'`,不带等号
206
+ COMMENT syntax error:
207
+ Cause: Used `COMMENT = '...'` (with equals sign) instead of `COMMENT '...'`
208
+ Solution: Correct syntax is `COMMENT 'description'` — no equals sign
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@clickzetta/cz-cli-darwin-x64",
3
- "version": "0.3.87-dev.20260528223948",
3
+ "version": "0.3.88",
4
4
  "description": "cz-cli binary for macOS x64 (Intel)",
5
5
  "os": [
6
6
  "darwin"