npm - @clickzetta/cz-cli-darwin-arm64 - Versions diffs - 0.3.87-dev.20260528223948 → 0.3.88 - Mend

@clickzetta/cz-cli-darwin-arm64 0.3.87-dev.20260528223948 → 0.3.88

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md CHANGED

@@ -1,14 +1,14 @@
-# Dynamic Table（动态表）SQL 参考
+# Dynamic Table SQL Reference
-> **⚠️ ClickZetta 特有语法**
-> - 刷新调度写法：`REFRESH INTERVAL 5 MINUTE vcluster default`（不是 `TARGET_LAG`）
-> - 修改调度周期或计算集群必须用 `CREATE OR REPLACE`，`ALTER` 不支持
-> - `ALTER DYNAMIC TABLE` 只支持：SUSPEND / RESUME / SET COMMENT / RENAME COLUMN / CHANGE COLUMN COMMENT / SET/UNSET PROPERTIES
-> - 删除用 `DROP DYNAMIC TABLE`（不是 `DROP TABLE`）
-> - 恢复用 `UNDROP TABLE`（不是 `UNDROP DYNAMIC TABLE`）
-> - DESC 用 `DESC TABLE name`（不支持 `DESC DYNAMIC TABLE name EXTENDED`）
+> **⚠️ ClickZetta-specific syntax**
+> - Refresh schedule syntax: `REFRESH INTERVAL 5 MINUTE vcluster default` (not `TARGET_LAG`)
+> - Modifying the schedule interval or compute cluster requires `CREATE OR REPLACE`; `ALTER` does not support this
+> - `ALTER DYNAMIC TABLE` only supports: SUSPEND / RESUME / SET COMMENT / RENAME COLUMN / CHANGE COLUMN COMMENT / SET/UNSET PROPERTIES
+> - Drop with `DROP DYNAMIC TABLE` (not `DROP TABLE`)
+> - Restore with `UNDROP TABLE` (not `UNDROP DYNAMIC TABLE`)
+> - Describe with `DESC TABLE name` (does not support `DESC DYNAMIC TABLE name EXTENDED`)
-动态表是 ClickZetta Lakehouse 的核心增量计算对象。通过 SQL 查询定义，自动增量刷新，无需手动调度。
+Dynamic Tables are the core incremental computation objects in ClickZetta Lakehouse. Defined by a SQL query, they refresh automatically and incrementally without manual scheduling.
 ## CREATE DYNAMIC TABLE
@@ -25,15 +25,15 @@ AS
   <query>;
 ```
-**关键参数：**
-- `REFRESH INTERVAL <n> MINUTE`：刷新间隔，最小 1 分钟
-- `vcluster`：运行刷新任务的计算集群名称（直接跟名称，不带等号和引号）
-- `OR REPLACE`：若同名动态表已存在则替换（修改 SQL 逻辑或调度配置必须用此方式）
-- 建议使用 GP 型集群（如 `default`），AP 型集群不支持小文件合并
+**Key parameters:**
+- `REFRESH INTERVAL <n> MINUTE`: refresh interval, minimum 1 minute
+- `vcluster`: name of the compute cluster to run refresh jobs (name directly, no equals sign or quotes)
+- `OR REPLACE`: replaces an existing Dynamic Table with the same name (required when modifying SQL logic or scheduling config)
+- Recommended: use a GP-type cluster (e.g., `default`); AP-type clusters do not support small file compaction
-**示例：**
+**Examples:**
 ```sql
--- 基础示例：每 5 分钟刷新一次订单汇总
+-- Basic example: refresh order summary every 5 minutes
 CREATE OR REPLACE DYNAMIC TABLE dw.order_summary
   REFRESH INTERVAL 5 MINUTE vcluster default
 AS
@@ -45,7 +45,7 @@ SELECT
 FROM ods.orders
 GROUP BY 1, 2;
--- 修改调度周期（必须用 CREATE OR REPLACE）
+-- Modify refresh interval (must use CREATE OR REPLACE)
 CREATE OR REPLACE DYNAMIC TABLE dw.order_summary
   REFRESH INTERVAL 10 MINUTE vcluster default
 AS
@@ -61,82 +61,82 @@ GROUP BY 1, 2;
 ## ALTER DYNAMIC TABLE
 ```sql
--- 暂停刷新
+-- Suspend refresh
 ALTER DYNAMIC TABLE <name> SUSPEND;
--- 恢复刷新
+-- Resume refresh
 ALTER DYNAMIC TABLE <name> RESUME;
--- 修改注释
+-- Modify comment
 ALTER DYNAMIC TABLE <name> SET COMMENT '<comment>';
--- 修改列名
+-- Rename column
 ALTER DYNAMIC TABLE <name> RENAME COLUMN <old_col> TO <new_col>;
--- 修改列注释（注意用 CHANGE COLUMN）
+-- Modify column comment (note: use CHANGE COLUMN)
 ALTER DYNAMIC TABLE <name> CHANGE COLUMN <col_name> COMMENT '<comment>';
--- 修改属性
+-- Modify properties
 ALTER DYNAMIC TABLE <name> SET PROPERTIES ('key' = 'value');
 ALTER DYNAMIC TABLE <name> UNSET PROPERTIES ('key');
 ```
-> 注意：修改调度周期、计算集群、SQL 查询逻辑，必须用 `CREATE OR REPLACE DYNAMIC TABLE`，ALTER 不支持这些操作。
+> Note: To modify the refresh interval, compute cluster, or SQL query logic, use `CREATE OR REPLACE DYNAMIC TABLE`. ALTER does not support these operations.
-## REFRESH DYNAMIC TABLE（手动触发）
+## REFRESH DYNAMIC TABLE (Manual Trigger)
 ```sql
--- 手动触发一次刷新
+-- Manually trigger a single refresh
 REFRESH DYNAMIC TABLE <name>;
 ```
 ## DROP DYNAMIC TABLE
 ```sql
--- ⚠️ 必须用 DROP DYNAMIC TABLE，不能用 DROP TABLE
+-- ⚠️ Must use DROP DYNAMIC TABLE, not DROP TABLE
 DROP DYNAMIC TABLE [ IF EXISTS ] <name>;
--- 恢复已删除的动态表（⚠️ 用 UNDROP TABLE，不是 UNDROP DYNAMIC TABLE）
+-- Restore a dropped Dynamic Table (⚠️ use UNDROP TABLE, not UNDROP DYNAMIC TABLE)
 UNDROP TABLE <name>;
 ```
 ## SHOW / DESC
 ```sql
--- 列出当前 schema 下所有动态表
+-- List all Dynamic Tables in the current schema
 SHOW TABLES WHERE is_dynamic = true;
--- 列出指定 schema 下的动态表
+-- List Dynamic Tables in a specific schema
 SHOW TABLES IN <schema_name> WHERE is_dynamic = true;
--- 查看动态表结构
+-- View Dynamic Table structure
 DESC TABLE <name>;
--- 查看完整建表语句
+-- View full CREATE statement
 SHOW CREATE TABLE <name>;
--- 查看刷新历史（状态、耗时、触发方式、增量行数）
+-- View refresh history (status, duration, trigger type, incremental row count)
 SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = '<dt_name>' LIMIT 20;
 ```
-> ⚠️ **DESC 注意**：动态表用 `DESC TABLE name`，不支持 `DESC DYNAMIC TABLE name EXTENDED`（EXTENDED 会报错）。
+> ⚠️ **DESC note**: Use `DESC TABLE name` for Dynamic Tables. `DESC DYNAMIC TABLE name EXTENDED` is not supported (EXTENDED will cause an error).
-## 注意事项
+## Notes
-- 修改 SQL 逻辑、调度周期、计算集群 → 用 `CREATE OR REPLACE`，不能用 `ALTER`
-- 刷新间隔最小 1 分钟
-- 删除用 `DROP DYNAMIC TABLE`（不是 `DROP TABLE`）
-- 恢复用 `UNDROP TABLE`（不是 `UNDROP DYNAMIC TABLE`）
-- 刷新失败不影响表的可查询性（返回上次成功版本的数据）
-- 非简单加列/减列的 `CREATE OR REPLACE` 会触发一次全量刷新
-- 建议使用 GP 型集群（如 `default`），AP 型集群不支持小文件合并
+- To modify SQL logic, refresh interval, or compute cluster → use `CREATE OR REPLACE`; `ALTER` is not supported for these
+- Minimum refresh interval is 1 minute
+- Drop with `DROP DYNAMIC TABLE` (not `DROP TABLE`)
+- Restore with `UNDROP TABLE` (not `UNDROP DYNAMIC TABLE`)
+- Refresh failures do not affect queryability (returns data from the last successful version)
+- A `CREATE OR REPLACE` that is not a simple add/drop column will trigger a full refresh
+- Recommended: use a GP-type cluster (e.g., `default`); AP-type clusters do not support small file compaction
-## 参数化动态表（SESSION_CONFIGS）
+## Parameterized Dynamic Table (SESSION_CONFIGS)
-通过 `SESSION_CONFIGS()` 函数定义参数化查询，在刷新时传入分区值控制刷新范围：
+Use the `SESSION_CONFIGS()` function to define parameterized queries, passing partition values at refresh time to control the refresh scope:
 ```sql
--- 创建参数化动态表
+-- Create a parameterized Dynamic Table
 CREATE OR REPLACE DYNAMIC TABLE dwd.orders_partitioned
   REFRESH INTERVAL 30 MINUTE vcluster default
 AS
@@ -144,42 +144,42 @@ SELECT order_id, user_id, amount, dt
 FROM ods.orders
 WHERE dt = SESSION_CONFIGS('target_date', CAST(CURRENT_DATE() AS STRING));
--- 手动触发刷新并传入参数
+-- Manually trigger refresh with parameters
 REFRESH DYNAMIC TABLE dwd.orders_partitioned
   WITH PROPERTIES ('target_date' = '2024-06-15');
 ```
-适用场景：传统按天全量 ETL 改造为增量任务，用 SESSION_CONFIGS 替换调度变量。
+Use case: migrating traditional daily full ETL jobs to incremental jobs, replacing scheduling variables with SESSION_CONFIGS.
-## 动态表 DML 操作
+## Dynamic Table DML Operations
-动态表默认不支持 DML，需先开启参数（每次 DML 前都需要 SET）：
+Dynamic Tables do not support DML by default. You must enable the parameter first (must be set before each DML operation):
 ```sql
--- ⚠️ 必须在同一会话/批次中先执行 SET，再执行 DML
+-- ⚠️ Must execute SET in the same session/batch before the DML
 SET cz.sql.dt.allow.dml = true;
 INSERT INTO <name> VALUES (...);
--- 删除
+-- Delete
 SET cz.sql.dt.allow.dml = true;
 DELETE FROM <name> WHERE ...;
 ```
-> ⚠️ **DML 注意事项**：
-> - `SET cz.sql.dt.allow.dml = true` 必须与 DML 语句在同一执行批次中
-> - 执行 DML 后，下一次自动刷新会触发**全量刷新**（而非增量），可能耗时较长
-> - UPDATE 可能因内部隐藏列（`MV__KEY`）报错，建议改用 DELETE + INSERT
-> - 仅在数据修正等特殊场景使用 DML
+> ⚠️ **DML notes**:
+> - `SET cz.sql.dt.allow.dml = true` must be in the same execution batch as the DML statement
+> - After a DML operation, the next automatic refresh will trigger a **full refresh** (not incremental), which may take longer
+> - UPDATE may fail due to internal hidden columns (`MV__KEY`); use DELETE + INSERT instead
+> - Use DML only for special cases such as data correction
-## 参考文档
+## Reference Documentation
 - [CREATE DYNAMIC TABLE](https://www.yunqi.tech/documents/create-dynamic-table)
 - [ALTER DYNAMIC TABLE](https://www.yunqi.tech/documents/alter-dynamic-table)
 - [DROP DYNAMIC TABLE](https://www.yunqi.tech/documents/drop-dynamic-table)
 - [SHOW DYNAMIC TABLES](https://www.yunqi.tech/documents/show-dynamic-table)
 - [SHOW DYNAMIC TABLE REFRESH HISTORY](https://www.yunqi.tech/documents/refresh-history)
-- [动态表简介](https://www.yunqi.tech/documents/dynamic_table_summary)
-- [查看动态表刷新模式](https://www.yunqi.tech/documents/dynamic-table-incre)
-- [传统离线任务转增量实践](https://www.yunqi.tech/documents/transformt-dt)
-- [动态表支持参数化定义](https://www.yunqi.tech/documents/dynamicTable-parmaters)
-- [动态表支持DML语句修改](https://www.yunqi.tech/documents/dynamicTable-dml)
+- [Dynamic Table Overview](https://www.yunqi.tech/documents/dynamic_table_summary)
+- [View Dynamic Table Refresh Mode](https://www.yunqi.tech/documents/dynamic-table-incre)
+- [Migrating Traditional Offline Jobs to Incremental](https://www.yunqi.tech/documents/transformt-dt)
+- [Parameterized Dynamic Table](https://www.yunqi.tech/documents/dynamicTable-parmaters)
+- [Dynamic Table DML Support](https://www.yunqi.tech/documents/dynamicTable-dml)
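
Reviewer sketch (not part of the package diff): the DML notes in dynamic-table.md advise replacing a failing UPDATE with DELETE + INSERT, and repeating the SET before each DML statement. A minimal illustration, reusing the `dwd.orders_partitioned` table and its `order_id, user_id, amount, dt` columns from the parameterized example. The literal values, the column types they imply, and the unqualified name in the history filter are assumptions made for illustration only.

```sql
-- Hypothetical correction: one order row carries a wrong amount.
-- UPDATE may fail on the hidden MV__KEY column, so delete and re-insert instead.

-- Step 1: remove the bad row. The SET must run in the same batch as the DML.
SET cz.sql.dt.allow.dml = true;
DELETE FROM dwd.orders_partitioned WHERE order_id = 1001;

-- Step 2: re-insert the corrected row. The SET is repeated before this DML too.
-- Values and types are assumed; match them to the real column definitions.
SET cz.sql.dt.allow.dml = true;
INSERT INTO dwd.orders_partitioned VALUES (1001, 42, 99.50, '2024-06-15');

-- Step 3: inspect the refresh history after the next scheduled run.
-- Per the notes, that refresh is expected to be a full refresh, not incremental.
-- Whether the filter takes the bare or the schema-qualified name is an assumption.
SHOW DYNAMIC TABLE REFRESH HISTORY WHERE name = 'orders_partitioned' LIMIT 20;
```

Because the follow-up refresh is full rather than incremental, it may be worth batching several corrections together before the next scheduled run.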

package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md CHANGED

@@ -1,11 +1,11 @@
-# Materialized View（物化视图）SQL 参考
+# Materialized View SQL Reference
-> **⚠️ ClickZetta 特有语法**
-> - 定时刷新：`REFRESH INTERVAL 10 MINUTE vcluster default`（与动态表语法相同）
-> - 手动刷新：`REFRESH MATERIALIZED VIEW <name>;`
-> - 修改注释用 `ALTER TABLE`，不是 `ALTER MATERIALIZED VIEW`
+> **⚠️ ClickZetta-specific syntax**
+> - Scheduled refresh: `REFRESH INTERVAL 10 MINUTE vcluster default` (same syntax as Dynamic Table)
+> - Manual refresh: `REFRESH MATERIALIZED VIEW <name>;`
+> - Modify comments with `ALTER TABLE`, not `ALTER MATERIALIZED VIEW`
-物化视图将查询结果预计算并物理存储，适合固定维度的聚合加速场景。与动态表的区别：物化视图支持手动或定时刷新，不支持增量刷新。
+Materialized Views pre-compute and physically store query results, making them ideal for fixed-dimension aggregation acceleration. Unlike Dynamic Tables, Materialized Views support manual or scheduled refresh but do not support incremental refresh.
 ## CREATE MATERIALIZED VIEW
@@ -19,15 +19,15 @@ AS
   <query>;
 ```
-**关键参数：**
-- `REFRESH INTERVAL 10 MINUTE vcluster default`：定时自动刷新（与动态表语法相同）
-- 不写 REFRESH 子句：只能手动触发 `REFRESH MATERIALIZED VIEW <name>;`
-- `BUILD DEFERRED`：延迟构建，创建时不立即计算结果
-- `DISABLE QUERY REWRITE`：禁用查询改写（不自动用 MV 加速查询）
+**Key parameters:**
+- `REFRESH INTERVAL 10 MINUTE vcluster default`: scheduled automatic refresh (same syntax as Dynamic Table)
+- Omitting the REFRESH clause: only manual refresh via `REFRESH MATERIALIZED VIEW <name>;`
+- `BUILD DEFERRED`: deferred build — does not compute results immediately at creation time
+- `DISABLE QUERY REWRITE`: disables query rewrite (MV will not automatically accelerate queries)
-**示例：**
+**Examples:**
 ```sql
--- 定时自动刷新的物化视图（每 10 分钟）
+-- Materialized View with scheduled auto-refresh (every 10 minutes)
 CREATE MATERIALIZED VIEW mv_dept_stats
 REFRESH INTERVAL 10 MINUTE vcluster default
 AS
@@ -40,7 +40,7 @@ FROM departments d
 JOIN employees e ON d.dept_id = e.dept_id
 GROUP BY d.dept_id, d.dept_name;
--- 修改刷新周期（需要 CREATE OR REPLACE）
+-- Modify refresh interval (requires CREATE OR REPLACE)
 CREATE OR REPLACE MATERIALIZED VIEW mv_dept_stats
 BUILD DEFERRED
 REFRESH INTERVAL 20 MINUTE vcluster default
@@ -57,32 +57,32 @@ FROM departments d
 JOIN employees e ON d.dept_id = e.dept_id
 GROUP BY d.dept_id, d.dept_name, d.location;
--- 手动刷新
+-- Manual refresh
 REFRESH MATERIALIZED VIEW mv_dept_stats;
 ```
 ## ALTER MATERIALIZED VIEW
 ```sql
--- 暂停自动刷新
+-- Suspend automatic refresh
 ALTER MATERIALIZED VIEW <name> SUSPEND;
--- 恢复自动刷新
+-- Resume automatic refresh
 ALTER MATERIALIZED VIEW <name> RESUME;
--- 修改注释
+-- Modify comment
 ALTER TABLE <mv_name> SET COMMENT '<comment>';
--- 修改列注释（物化视图用 ALTER TABLE 语法）
+-- Modify column comment (Materialized Views use ALTER TABLE syntax)
 ALTER TABLE <mv_name> CHANGE COLUMN <col_name> COMMENT '<comment>';
 ```
-> 注意：物化视图的注释修改使用 `ALTER TABLE`，不是 `ALTER MATERIALIZED VIEW`。
+> Note: Use `ALTER TABLE` (not `ALTER MATERIALIZED VIEW`) to modify comments on a Materialized View.
 ## REFRESH MATERIALIZED VIEW
 ```sql
--- 手动触发全量刷新
+-- Manually trigger a full refresh
 REFRESH MATERIALIZED VIEW <name>;
 ```
@@ -95,35 +95,35 @@ DROP MATERIALIZED VIEW [ IF EXISTS ] <name>;
 ## SHOW / DESC
 ```sql
--- 列出当前 schema 下所有物化视图
+-- List all Materialized Views in the current schema
 SHOW TABLES WHERE is_materialized_view = true;
--- 按名称过滤
+-- Filter by name
 SHOW TABLES LIKE 'mv_%' WHERE is_materialized_view = true;
--- 查看物化视图结构
+-- View Materialized View structure
 DESC MATERIALIZED VIEW <name>;
 DESCRIBE MATERIALIZED VIEW <name> EXTENDED;
--- 查看完整建表语句
+-- View full CREATE statement
 SHOW CREATE TABLE <name>;
 ```
-## 动态表 vs 物化视图 选择指南
+## Dynamic Table vs Materialized View — Selection Guide
-| 场景 | 推荐 |
+| Scenario | Recommended |
 |---|---|
-| 需要秒/分钟级自动增量刷新 | Dynamic Table |
-| 固定聚合，手动或低频刷新 | Materialized View |
-| 需要 CDC 变更感知 | Dynamic Table + Table Stream |
-| 加速 BI 查询，数据不要求实时 | Materialized View |
+| Need second/minute-level automatic incremental refresh | Dynamic Table |
+| Fixed aggregation, manual or low-frequency refresh | Materialized View |
+| Need CDC change detection | Dynamic Table + Table Stream |
+| Accelerate BI queries, real-time data not required | Materialized View |
-## 参考文档
+## Reference Documentation
 - [CREATE MATERIALIZED VIEW](https://www.yunqi.tech/documents/CREATEMATERIALIZEDVIEW)
 - [ALTER MATERIALIZED VIEW](https://www.yunqi.tech/documents/alter-materialzied-view)
 - [REFRESH MATERIALIZED VIEW](https://www.yunqi.tech/documents/REFRESH)
 - [DROP MATERIALIZED VIEW](https://www.yunqi.tech/documents/DROPMATERIALIZEDVIEW)
 - [SHOW MATERIALIZED VIEWS](https://www.yunqi.tech/documents/show-materialized-view)
-- [物化视图概念与场景](https://www.yunqi.tech/documents/MATERIALIZEDVIEW)
-- [物化视图 DDL 汇总](https://www.yunqi.tech/documents/materialized_ddl)
+- [Materialized View Concepts and Use Cases](https://www.yunqi.tech/documents/MATERIALIZEDVIEW)
+- [Materialized View DDL Summary](https://www.yunqi.tech/documents/materialized_ddl)
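
Reviewer sketch (not part of the package diff): the key parameters in materialized-view.md state that omitting the REFRESH clause leaves only manual refresh, and that comments are changed through `ALTER TABLE`. A minimal illustration under those statements, reusing the `departments` and `employees` tables from the example. The view name `mv_dept_headcount` and the comment text are hypothetical, and the sketch assumes every optional clause of `CREATE MATERIALIZED VIEW` can simply be left out.

```sql
-- Hypothetical manual-only Materialized View: no REFRESH clause is given,
-- so it is recomputed only when REFRESH MATERIALIZED VIEW is run.
CREATE MATERIALIZED VIEW mv_dept_headcount
AS
SELECT d.dept_id, d.dept_name, COUNT(*) AS headcount
FROM departments d
JOIN employees e ON d.dept_id = e.dept_id
GROUP BY d.dept_id, d.dept_name;

-- Trigger a full refresh by hand.
REFRESH MATERIALIZED VIEW mv_dept_headcount;

-- Comments go through ALTER TABLE, not ALTER MATERIALIZED VIEW.
ALTER TABLE mv_dept_headcount SET COMMENT 'Headcount per department, refreshed manually';

-- Confirm the view is listed.
SHOW TABLES WHERE is_materialized_view = true;
```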

package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md CHANGED

@@ -1,14 +1,14 @@
-# Pipe SQL 参考
+# Pipe SQL Reference
-> **⚠️ ClickZetta 特有语法**
-> - Kafka 读取函数是 `read_kafka(...)`，使用**位置参数**（不是命名参数 `=>`）
-> - JSON 字段提取用 `parse_json(value::string)['field']::TYPE` 语法
-> - Pipe 创建后默认自动启动，无需手动 RESUME
-> - OSS Pipe 的 `PURGE=true` 紧跟在 `USING <format>` 之后（如 `USING CSV PURGE=true`）
+> **⚠️ ClickZetta-specific syntax**
+> - The Kafka read function is `read_kafka(...)`, using **positional parameters** (not named parameters with `=>`)
+> - JSON field extraction uses `parse_json(value::string)['field']::TYPE` syntax
+> - A Pipe starts automatically after creation; no manual RESUME is needed
+> - For OSS Pipes, `PURGE=true` follows immediately after `USING <format>` (e.g., `USING CSV PURGE=true`)
-Pipe 是 ClickZetta Lakehouse 的持续数据导入对象，通过 SQL 定义从 Kafka 或对象存储（OSS/S3/COS）自动、持续地将数据导入目标表，无需外部调度。
+Pipe is the continuous data ingestion object in ClickZetta Lakehouse. Defined by SQL, it automatically and continuously imports data from Kafka or object storage (OSS/S3/COS) into a target table without external scheduling.
-## CREATE PIPE — 从 Kafka 导入
+## CREATE PIPE — Ingest from Kafka
 ```sql
 CREATE [ OR REPLACE ] PIPE <pipe_name>
@@ -21,22 +21,22 @@ AS
 COPY INTO <target_table> FROM (
   SELECT <expr> [, ...]
   FROM read_kafka(
-    '<bootstrap_servers>',   -- 必填：Kafka 集群地址
-    '<topic>',               -- 必填：Topic 名称
-    '',                      -- 保留（填空字符串）
-    '<group_id>',            -- 必填：持久消费者组 ID
-    '', '', '', '',          -- 位置参数留空，由 Pipe 自动管理
-    'raw',                   -- key 格式（目前只支持 raw）
-    'raw',                   -- value 格式（目前只支持 raw）
+    '<bootstrap_servers>',   -- required: Kafka cluster address
+    '<topic>',               -- required: topic name
+    '',                      -- reserved (leave empty string)
+    '<group_id>',            -- required: persistent consumer group ID
+    '', '', '', '',          -- positional params left empty, managed by Pipe automatically
+    'raw',                   -- key format (only 'raw' supported currently)
+    'raw',                   -- value format (only 'raw' supported currently)
     0,                       -- max_errors
-    MAP(<kafka_config>)      -- Kafka 配置参数
+    MAP(<kafka_config>)      -- Kafka configuration parameters
   )
 );
 ```
-**示例：**
+**Examples:**
 ```sql
--- 从 Kafka 持续导入 JSON 数据
+-- Continuously ingest JSON data from Kafka
 CREATE OR REPLACE PIPE kafka_orders_pipe
   VIRTUAL_CLUSTER = 'default'
   BATCH_INTERVAL_IN_SECONDS = '60'
@@ -62,7 +62,7 @@ COPY INTO ods.orders FROM (
   )
 );
--- SASL 认证
+-- SASL authentication
 CREATE PIPE kafka_secure_pipe
   VIRTUAL_CLUSTER = 'pipe_vc'
   BATCH_INTERVAL_IN_SECONDS = '60'
@@ -83,12 +83,12 @@ COPY INTO ods.secure_events FROM (
 );
 ```
-## 验证 Kafka 连接（创建 Pipe 前）
+## Verify Kafka Connection (Before Creating a Pipe)
-独立使用 `read_kafka` 探查数据时，可以在 MAP 中设置 `kafka.auto.offset.reset`：
+When using `read_kafka` standalone to explore data, you can set `kafka.auto.offset.reset` in the MAP:
 ```sql
--- 验证连接和数据格式
+-- Verify connection and data format
 SELECT value::string
 FROM read_kafka(
   'kafka.example.com:9092',
@@ -102,11 +102,11 @@ FROM read_kafka(
 LIMIT 10;
 ```
-> ⚠️ **独立探查 vs Pipe 中的区别**：
-> - 独立探查：可在 MAP 中设置 `kafka.auto.offset.reset` 为 `earliest` 读取历史数据
-> - Pipe 中：位置参数必须留空，消费位点由 Pipe 的 `RESET_KAFKA_GROUP_OFFSETS` 参数控制
+> ⚠️ **Standalone exploration vs inside a Pipe**:
+> - Standalone exploration: you can set `kafka.auto.offset.reset` to `earliest` in the MAP to read historical data
+> - Inside a Pipe: positional parameters must be left empty; the consumer offset is controlled by the Pipe's `RESET_KAFKA_GROUP_OFFSETS` parameter
-## CREATE PIPE — 从对象存储导入
+## CREATE PIPE — Ingest from Object Storage
 ```sql
 CREATE [ OR REPLACE ] PIPE [ IF NOT EXISTS ] <pipe_name>
@@ -120,17 +120,17 @@ FROM VOLUME <volume_name>
 USING <csv | parquet | orc | json> [OPTIONS ('<key>' = '<value>', ...)] PURGE=true;
 ```
-**关键参数：**
-- `VIRTUAL_CLUSTER`：指定虚拟集群名称（OSS Pipe 必填）
-- `INGEST_MODE = 'LIST_PURGE'`：通用模式，定期扫描文件列表，必须设置 `PURGE=true`
-- `INGEST_MODE = 'EVENT_NOTIFICATION'`：事件通知模式，低延迟（仅阿里云 OSS + AWS S3），不需要 `PURGE=true`
-- `COMMENT 'text'`：不带等号（`COMMENT = 'text'` 会报错）
-- `PURGE=true`：放在最后，OPTIONS 在其之前：`USING CSV OPTIONS (...) PURGE=true`
-- PIPE 中的 COPY 语句不支持 `files`、`regexp`、`subdirectory` 参数
+**Key parameters:**
+- `VIRTUAL_CLUSTER`: specifies the virtual cluster name (required for OSS Pipes)
+- `INGEST_MODE = 'LIST_PURGE'`: general mode, periodically scans the file list; `PURGE=true` must be set
+- `INGEST_MODE = 'EVENT_NOTIFICATION'`: event notification mode, low latency (Alibaba Cloud OSS + AWS S3 only); `PURGE=true` is not required
+- `COMMENT 'text'`: no equals sign (`COMMENT = 'text'` will cause an error)
+- `PURGE=true`: placed at the end, after OPTIONS: `USING CSV OPTIONS (...) PURGE=true`
+- COPY statements inside a PIPE do not support `files`, `regexp`, or `subdirectory` parameters
-**示例：**
+**Examples:**
 ```sql
--- LIST_PURGE 模式（带 OPTIONS）
+-- LIST_PURGE mode (with OPTIONS)
 CREATE OR REPLACE PIPE oss_events_pipe
   VIRTUAL_CLUSTER = 'default'
   INGEST_MODE = 'LIST_PURGE'
@@ -140,7 +140,7 @@ COPY INTO ods.events
 FROM VOLUME my_oss_volume
 USING PARQUET PURGE=true;
--- CSV 格式带 OPTIONS（OPTIONS 在 PURGE 之前）
+-- CSV format with OPTIONS (OPTIONS before PURGE)
 CREATE PIPE oss_csv_pipe
   VIRTUAL_CLUSTER = 'default'
   INGEST_MODE = 'LIST_PURGE'
@@ -149,7 +149,7 @@ COPY INTO ods.csv_data
 FROM VOLUME my_csv_volume
 USING CSV OPTIONS ('header' = 'true', 'sep' = ',') PURGE=true;
--- EVENT_NOTIFICATION 模式（不需要 PURGE）
+-- EVENT_NOTIFICATION mode (PURGE not required)
 CREATE PIPE oss_event_pipe
   VIRTUAL_CLUSTER = 'default'
   INGEST_MODE = 'EVENT_NOTIFICATION'
@@ -160,33 +160,33 @@ FROM VOLUME my_oss_event_volume
 USING PARQUET;
 ```
-## 启停 Pipe
+## Start / Stop a Pipe
 ```sql
--- 暂停 Pipe
+-- Pause Pipe
 ALTER PIPE <pipe_name> SET PIPE_EXECUTION_PAUSED = true;
--- 恢复 Pipe
+-- Resume Pipe
 ALTER PIPE <pipe_name> SET PIPE_EXECUTION_PAUSED = false;
 ```
-## 修改 Pipe 属性
+## Modify Pipe Properties
 ```sql
--- 每次只能修改一个属性
+-- Only one property can be modified at a time
 ALTER PIPE <pipe_name> SET VIRTUAL_CLUSTER = 'new_vc';
 ALTER PIPE <pipe_name> SET COPY_JOB_HINT = '{"cz.sql.split.kafka.strategy":"size","cz.mapper.kafka.message.size":"200000"}';
 ```
-> ⚠️ **ALTER PIPE 支持的属性**：
+> ⚠️ **Supported ALTER PIPE properties**:
 > - ✅ `PIPE_EXECUTION_PAUSED`
 > - ✅ `VIRTUAL_CLUSTER`
 > - ✅ `COPY_JOB_HINT`
-> - ❌ `BATCH_INTERVAL_IN_SECONDS`（不支持修改，需删除重建）
-> - ❌ `BATCH_SIZE_PER_KAFKA_PARTITION`（不支持修改，需删除重建）
+> - ❌ `BATCH_INTERVAL_IN_SECONDS` (not supported; must drop and recreate)
+> - ❌ `BATCH_SIZE_PER_KAFKA_PARTITION` (not supported; must drop and recreate)
 >
-> 不支持修改 COPY/INSERT 语句逻辑，需删除 Pipe 后重建。
-> `COPY_JOB_HINT` 修改会覆盖所有已有 hints，需一次性设置全部参数。
+> Modifying the COPY/INSERT statement logic is not supported; drop the Pipe and recreate it.
+> Modifying `COPY_JOB_HINT` overwrites all existing hints; all parameters must be set at once.
 ## DROP PIPE
@@ -197,26 +197,26 @@ DROP PIPE [ IF EXISTS ] <pipe_name>;
 ## SHOW PIPE
 ```sql
--- 列出当前 schema 下所有 Pipe
+-- List all Pipes in the current schema
 SHOW PIPES;
--- 查看 Pipe 详情（状态、延迟、定义）
+-- View Pipe details (status, latency, definition)
 DESC PIPE <pipe_name>;
 DESC PIPE EXTENDED <pipe_name>;
 ```
-## 注意事项
+## Notes
-- Pipe 创建后默认自动启动，无需手动 RESUME
-- Kafka Pipe 使用 consumer group 管理 offset，重建 Pipe 时保持相同 group_id 可从上次位点继续
-- 对象存储 Pipe 通过文件列表或事件通知检测新文件，`load_history` 去重记录保留 7 天
-- Pipe 不支持修改 AS 子句，需要删除后重建（不是 `CREATE OR REPLACE`）
-- Kafka Pipe 仅支持 PLAINTEXT 和 SASL_PLAINTEXT 安全协议，不支持 SSL
+- A Pipe starts automatically after creation; no manual RESUME is needed
+- Kafka Pipes use a consumer group to manage offsets; keeping the same group_id when recreating a Pipe allows resuming from the last offset
+- Object storage Pipes detect new files via file list scanning or event notifications; `load_history` deduplication records are retained for 7 days
+- Pipes do not support modifying the AS clause; drop and recreate (not `CREATE OR REPLACE`)
+- Kafka Pipes only support PLAINTEXT and SASL_PLAINTEXT security protocols; SSL is not supported
-## 参考文档
+## Reference Documentation
-- [Pipe 简介](https://www.yunqi.tech/documents/pipe-summary)
-- [借助 read_kafka 函数持续导入](https://www.yunqi.tech/documents/pipe-kafka)
-- [借助 Kafka 外表 Table Stream 持续导入](https://www.yunqi.tech/documents/pipe-kafka-table-stream)
-- [最佳实践：使用 Pipe 高效接入 Kafka 数据](https://www.yunqi.tech/documents/pipe-kafka-bestpractice-1)
-- [使用 Pipe 持续导入对象存储数据](https://www.yunqi.tech/documents/pipe-storage-object)
+- [Pipe Overview](https://www.yunqi.tech/documents/pipe-summary)
+- [Continuous Ingestion with read_kafka](https://www.yunqi.tech/documents/pipe-kafka)
+- [Continuous Ingestion with Kafka External Table Stream](https://www.yunqi.tech/documents/pipe-kafka-table-stream)
+- [Best Practices: Efficient Kafka Ingestion with Pipe](https://www.yunqi.tech/documents/pipe-kafka-bestpractice-1)
+- [Continuous Ingestion from Object Storage with Pipe](https://www.yunqi.tech/documents/pipe-storage-object)
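
Reviewer sketch (not part of the package diff): pipe.md notes that `ALTER PIPE` changes one property per statement and that setting `COPY_JOB_HINT` overwrites all existing hints. A minimal illustration of moving a Pipe to another cluster, reusing the `kafka_orders_pipe` and `pipe_vc` names and the hint values that appear in the file. Pausing before the change is a cautious assumption of this sketch, not a documented requirement.

```sql
-- Pause ingestion first (assumed precaution, not required by the reference).
ALTER PIPE kafka_orders_pipe SET PIPE_EXECUTION_PAUSED = true;

-- One property per ALTER PIPE statement.
ALTER PIPE kafka_orders_pipe SET VIRTUAL_CLUSTER = 'pipe_vc';

-- COPY_JOB_HINT replaces every existing hint, so pass the complete set each time.
ALTER PIPE kafka_orders_pipe SET COPY_JOB_HINT = '{"cz.sql.split.kafka.strategy":"size","cz.mapper.kafka.message.size":"200000"}';

-- Resume ingestion and check the resulting state.
ALTER PIPE kafka_orders_pipe SET PIPE_EXECUTION_PAUSED = false;
DESC PIPE kafka_orders_pipe;
```

Changing `BATCH_INTERVAL_IN_SECONDS`, `BATCH_SIZE_PER_KAFKA_PARTITION`, or the COPY statement itself is outside what this sketch covers; the reference says those require dropping and recreating the Pipe, ideally with the same `group_id` so a Kafka Pipe resumes from its last offset.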