npm - @clickzetta/cz-cli-darwin-arm64 - Versions diffs - 0.5.16 → 0.5.18 - Mend

@clickzetta/cz-cli-darwin-arm64 0.5.16 → 0.5.18

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (243)

package/bin/skills/lakehouse-doc-en/references/pipe-kafka-table-stream.md CHANGED

@@ -1,14 +1,14 @@
 # Using Table Stream and Pipe to Import Kafka Data into Lakehouse
-## 1. Background Introduction
+## 1. Background
-In the field of big data processing, efficiently importing streaming data from Kafka into a Lakehouse (data lake warehouse) is a common requirement. CloudTech provides powerful Table Stream and Pipe functionalities that simplify and enhance this process. This article will detail how to use Table Stream and Pipe to import Kafka data into a Lakehouse, including the complete process of creating a Kafka external table and a Kafka Table Stream.
+In big data processing, efficiently ingesting streaming data from Kafka into a Lakehouse is a common requirement. Singdata Lakehouse provides powerful Table Stream and Pipe functionality that makes this process simpler and more efficient. This article describes how to use Table Stream and Pipe to import Kafka data into the Lakehouse, covering the complete process of creating a Kafka external table and a Kafka Table Stream.
-## 2. Operational Steps
+## 2. Steps
-### Creating a Kafka External Table
+### Create a Kafka External Table
-Before using Table Stream and Pipe, we need to create an external table integrated with Kafka to access data in Kafka.
+Before using Table Stream and Pipe, create an [external table integrated with Kafka](create-kafka-external.md) to access data in Kafka.
 ```sql
 CREATE STORAGE CONNECTION pipe_kafka
@@ -24,23 +24,24 @@ OPTIONS (   'group_id' = 'external_table_lh',    'topics' = 'my_topic')
 CONNECTION pipe_kafka;
 ```
-### Creating a Table Stream
+### Create a Table Stream
-Create a Table Stream on the Kafka external table to capture real-time data changes in Kafka.
+[Create a Table Stream](create-table-stream.md) on the Kafka external table to capture real-time data changes from Kafka.
 ```sql
 CREATE TABLE STREAM kafka_table_stream_pipe1
 ON TABLE external_table_kafka
 WITH PROPERTIES (
     'table_stream_mode' = 'append_only'
 );
 ```
-- `kafka_table_stream_pipe1`: Name of the Table Stream.
-- `ON TABLE external_table_kafka`: Specifies that the Table Stream is created based on the previously created Kafka external table.
-- `table_stream_mode='append_only'`: Sets the mode of the Table Stream to append-only, meaning it will only capture newly added data rows.
+* `kafka_table_stream_pipe1`: Name of the Table Stream.
+* `ON TABLE external_table_kafka`: Specifies that the Table Stream is created based on the previously created Kafka external table.
+* `table_stream_mode='append_only'`: Sets the mode to append-only, meaning only newly added data rows are captured.
-After creation, you can verify the data in the Table Stream with the following query:
+After creation, verify the data in the Table Stream with the following query:
 ```sql
 SELECT CAST(value AS STRING) FROM kafka_table_stream_pipe1;
@@ -48,61 +49,61 @@ SELECT CAST(value AS STRING) FROM kafka_table_stream_pipe1;
 This query converts the `value` field in the Table Stream to a string type and returns it for subsequent processing.
-### Creating a Target Table
+### Create a Target Table
-Next, create a target table to store data imported from Kafka.
+Create a target table to store data imported from Kafka.
 ```sql
-CREATE TABLE kafak_sink_table_1 (
+CREATE TABLE kafka_sink_table_1 (
     a TIMESTAMP,
     b STRING
 );
 ```
-- `kafak_sink_table_1`: Name of the target table.
-- `a TIMESTAMP`: First field for storing timestamp data.
-- `b STRING`: Second field for storing string data.
+* `kafka_sink_table_1`: Name of the target table.
+* `a TIMESTAMP`: First field for storing timestamp data.
+* `b STRING`: Second field for storing string data.
-### Creating a Pipe
+### Create a Pipe
-Finally, use a Pipe to continuously import data from the Table Stream into the target table.
+Use a Pipe to continuously import data from the Table Stream into the target table.
 ```sql
 CREATE PIPE kafka_pipe_stream
 VIRTUAL_CLUSTER = 'test_alter'
 AS
-COPY INTO kafak_sink_table_1
+COPY INTO kafka_sink_table_1
 FROM (
     SELECT CURRENT_TIMESTAMP(), CAST(value AS STRING) FROM kafka_table_stream_pipe1
 );
 ```
-- `kafka_pipe_stream`: Name of the Pipe.
-- `VIRTUAL_CLUSTER = 'test_alter'`: Specifies the virtual cluster to use.
-- `COPY INTO kafak_sink_table_1`: Copies data into the target table `kafak_sink_table_1`.
-- `SELECT CURRENT_TIMESTAMP(), CAST(value AS STRING) FROM kafka_table_stream_pipe1`: Selects data from the Table Stream, using the current timestamp and the converted `value` field as the two columns for the target table.
+* `kafka_pipe_stream`: Name of the Pipe.
+* `VIRTUAL_CLUSTER = 'test_alter'`: Specifies the Virtual Cluster to use.
+* `COPY INTO kafka_sink_table_1`: Copies data into the target table `kafka_sink_table_1`.
+* `SELECT CURRENT_TIMESTAMP(), CAST(value AS STRING) FROM kafka_table_stream_pipe1`: Selects data from the Table Stream, using the current timestamp and the converted `value` field as the two columns for the target table.
-Other Configurable Properties:
+Other configurable properties:
 - `INITIAL_DELAY_IN_SECONDS`: Initial job scheduling delay (optional, default 0 seconds)
-- `BATCH_INTERVAL_IN_SECONDS`: (Optional) Sets the batch processing interval, default 60 seconds.
-- `BATCH_SIZE_PER_KAFKA_PARTITION`: (Optional) Sets the batch size per Kafka partition, default 500,000 records.
-- `MAX_SKIP_BATCH_COUNT_ON_ERROR`: (Optional) Sets the maximum number of batches to skip on error, default 30.
-- `RESET_KAFKA_GROUP_OFFSETS`: (Optional) Sets the initial offset for Kafka when starting the pipe. Cannot be modified. Possible values: `latest`, `earliest`, `none`, `valid`, `${TIMESTAMP_MILLISECONDS}`
-    - `none`: Default, no action.
-    - `valid`: Checks if the current offset in the group is expired and resets expired partitions to the current earliest.
-    - `earliest`: Resets to the current earliest.
-    - `latest`: Resets to the current latest.
-    - `${TIMESTAMP_MILLISECONDS}`: Resets to the offset corresponding to the millisecond timestamp, e.g., '1737789688000' (2025-01-25 15:21:28).
+- `BATCH_INTERVAL_IN_SECONDS`: (Optional) Batch processing interval, default 60 seconds.
+- `BATCH_SIZE_PER_KAFKA_PARTITION`: (Optional) Batch size per Kafka partition, default 500,000 records.
+- `MAX_SKIP_BATCH_COUNT_ON_ERROR`: (Optional) Maximum number of batches to skip on error, default 30.
+- `RESET_KAFKA_GROUP_OFFSETS`: (Optional) Initial Kafka offset when starting the Pipe. Cannot be modified after creation. Possible values: `latest`, `earliest`, `none`, `valid`, `${TIMESTAMP_MILLISECONDS}`
+    - `none`: No action (default)
+    - `valid`: Checks if the current group offset is expired and resets expired partition offsets to the current earliest
+    - `earliest`: Resets to the current earliest
+    - `latest`: Resets to the current latest
+    - `${TIMESTAMP_MILLISECONDS}`: Resets to the offset corresponding to the millisecond timestamp, e.g., `1737789688000` (2025-01-25 15:21:28)
-## 3. Verifying Results
+## 3. Verify Results
-You can verify whether the data has been successfully imported by querying the target table:
+Verify whether data has been successfully imported by querying the target table:
 ```sql
-SELECT * FROM kafak_sink_table_1;
+SELECT * FROM kafka_sink_table_1;
 ```
-Additionally, check the running status of the Pipe to ensure it is functioning properly:
+Check the running status of the Pipe to ensure it is working properly:
 ```sql
 SHOW PIPES;
@@ -112,14 +113,15 @@ This command lists all created Pipes and their status information, including whe
 ## 4. Status Monitoring and Management
-### Checking Kafka Consumption Latency
+### Check Kafka Consumption Latency
-Use the `DESC PIPE` command. For example, the JSON string in `pipe_latency` below:
-- `lastConsumeTimestamp`: The last consumed offset.
-- `offsetLag`: The backlog of Kafka data.
-- `timeLag`: Consumption latency, calculated as the current time minus the last consumed offset. If Kafka consumption is abnormal, the value is -1.
+Use the `DESC PIPE` command. The JSON string in `pipe_latency` contains the following fields:
+- `lastConsumeTimestamp`: The last consumed offset timestamp
+- `offsetLag`: The backlog of Kafka data
+- `timeLag`: Consumption latency, calculated as the current time minus the last consumed offset timestamp. When Kafka consumption is abnormal, the value is -1
-```sql
+````
 DESC PIPE EXTENDED kafka_pipe_stream
 +--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 |     info_name      |                                                                                                               info_value                                                            |
@@ -130,53 +132,53 @@ DESC PIPE EXTENDED kafka_pipe_stream
 | last_modified_time | 2025-03-05 10:40:55.405                                                                                                                                                             |
 | comment            |                                                                                                                                                                                     |
 | properties         | ((virtual_cluster,test_alter))                                                                                                                                                      |
-| copy_statement     | COPY INTO TABLE qingyun.pipe_schema.kafak_sink_table_1 FROM (SELECT `current_timestamp`() AS ```current_timestamp``()`, CAST(kafka_table_stream_pipe1.`value` AS string) AS `value` |
+| copy_statement     | COPY INTO TABLE qingyun.pipe_schema.kafka_sink_table_1 FROM (SELECT `current_timestamp`() AS ```current_timestamp``()`, CAST(kafka_table_stream_pipe1.`value` AS string) AS `value` |
 | pipe_status        | RUNNING                                                                                                                                                                             |
-| output_name        | xxxxxxx.pipe_schema.kafak_sink_table_1                                                                                                                                              |
+| output_name        | xxxxxxx.pipe_schema.kafka_sink_table_1                                                                                                                                              |
 | input_name         | kafka_table_stream:xxxxxxx.pipe_schema.kafka_table_stream_pipe1                                                                                                                     |
 | invalid_reason     |                                                                                                                                                                                     |
 | pipe_latency       | {"kafka":{"lags":{"0":0,"1":0,"2":0,"3":0},"lastConsumeTimestamp":-1,"offsetLag":0,"timeLag":-1}}                                                                                   |
 +--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-```
-### Viewing Pipe Execution History
+````
-Since each Pipe execution is a copy operation, you can view all operations in the job history. Use the `query_tag` in the [Job History](web-job-history.md) to filter, as all Pipe copy jobs are tagged in the format `pipe.``workspace_name``.schema_name.pipe_name`, making it easy to track and manage.
+### View Pipe Execution History
-### Stopping and Starting a Pipe
+Since each Pipe execution is a COPY operation, you can view all operations in the job history. Filter by `query_tag` in [Job History](<web-job-history.md>). All Pipe COPY jobs are tagged in the format `pipe.``workspace_name``.schema_name.pipe_name` for easy tracking.
-- Pause a Pipe:
-  ```sql
-  ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = true;
-  ```
+### Stop and Start a Pipe
+- Pause a Pipe:
+```
+ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = true;
+```
 - Resume a Pipe:
-  ```sql
-  ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = false;
-  ```
+```
+ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = false;
+```
-### Modifying Pipe Properties
+### Modify Pipe Properties
-You can modify Pipe properties, but only one property at a time. If multiple properties need to be modified, execute the `ALTER` command multiple times. Below are the modifiable properties and their syntax:
+You can modify Pipe properties one at a time. If multiple properties need to be changed, run the `ALTER` command multiple times. Below are the modifiable properties and their syntax:
-```sql
+```SQL
 ALTER PIPE pipe_name SET
-   [VIRTUAL_CLUSTER = 'virtual_cluster_name'],
+    [VIRTUAL_CLUSTER = 'virtual_cluster_name'],
     [BATCH_INTERVAL_IN_SECONDS=''],
-   [ BATCH_SIZE_PER_KAFKA_PARTITION=''],
+    [BATCH_SIZE_PER_KAFKA_PARTITION=''],
     [MAX_SKIP_BATCH_COUNT_ON_ERROR=''],
     [COPY_JOB_HINT='']
 ```
 Examples:
-```sql
--- Modify compute cluster
-ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = 'default';
+```
+-- Modify the Virtual Cluster
+ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = 'DEFAULT'
 -- Set COPY_JOB_HINT
-ALTER PIPE pipe_name SET copy_hints='{"cz.mapper.kafka.message.size": "2000000"}';
+ALTER PIPE pipe_name SET COPY_JOB_HINT='{"cz.mapper.kafka.message.size": "2000000"}'
 ```
-**Note**
-- Modifying the COPY statement logic is not supported. If needed, delete the Pipe and recreate it.
-- When modifying the `COPY_JOB_HINT` of a Pipe, the new settings will overwrite existing hints. If your Pipe already has hints (e.g., `{"cz.sql.split.kafka.strategy":"size"}`), you must set all required hints together when adding new ones; otherwise, existing hints will be overwritten. Separate multiple parameters with commas.
+**Notes**
+- Modifying the COPY statement logic is not supported. If you need to modify it, delete the Pipe and recreate it.
+- When modifying the `COPY_JOB_HINT` of a Pipe, the new settings will overwrite all existing hints. If your Pipe already has hints such as `{"cz.sql.split.kafka.strategy":"size"}`, you must include all required hints together when setting new ones; otherwise existing hints will be overwritten. Separate multiple parameters with commas.
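
Because setting `COPY_JOB_HINT` replaces the whole hint map, one way to avoid dropping an existing hint is to build the complete JSON value before issuing the `ALTER PIPE`. The Python sketch below is illustrative only: the helper name `build_copy_job_hint_statement` is hypothetical, the existing hints are assumed to have been copied by hand from the Pipe's current definition, and the statement is only printed, never executed against a Lakehouse.

```python
import json


def build_copy_job_hint_statement(pipe_name, existing_hints, new_hints):
    """Return ALTER PIPE text that carries both the old and the new hints.

    Hypothetical helper: it only assembles the SQL form shown in the
    documentation; it does not connect to or modify any Pipe.
    """
    merged = dict(existing_hints)  # keep every hint already on the Pipe
    merged.update(new_hints)       # new values win if a key appears in both
    # Compact separators match the JSON style used in the documentation.
    hint_json = json.dumps(merged, separators=(",", ":"))
    return f"ALTER PIPE {pipe_name} SET COPY_JOB_HINT='{hint_json}';"


statement = build_copy_job_hint_statement(
    "pipe_name",
    existing_hints={"cz.sql.split.kafka.strategy": "size"},
    new_hints={"cz.mapper.kafka.message.size": "2000000"},
)
print(statement)
```

The sketch assumes hint keys and values contain no single quotes; a value with one would need escaping before being placed inside the SQL string literal.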

package/bin/skills/lakehouse-doc-en/references/pipe-kafka.md CHANGED

@@ -1,4 +1,10 @@
-# Continuous Data Import from Kafka Using Pipe
+# Continuous Data Collection from Kafka Using Pipe
+## Overview
+Pipe is the **continuous data ingestion** solution provided by the Lakehouse, designed to automatically and continuously import data from Kafka into Lakehouse tables. Pipe creates a persistent consumer group, maintains the consumption position, and runs continuously according to the configured scheduling strategy.
+A Kafka Pipe is like a continuously running consumer group. You only need to define the consumption logic, and it automatically pulls data from the Topic and writes it to a table — no manual triggering or Cron configuration required.
 ## Kafka Pipe Syntax
@@ -6,34 +12,73 @@
 -- Syntax for creating a Pipe from Kafka
 CREATE PIPE [ IF NOT EXISTS ] <pipe_name>
     VIRTUAL_CLUSTER = 'virtual_cluster_name'
+    [INITIAL_DELAY_IN_SECONDS='']
     [BATCH_INTERVAL_IN_SECONDS='']
-   [ BATCH_SIZE_PER_KAFKA_PARTITION='']
+    [BATCH_SIZE_PER_KAFKA_PARTITION='']
     [MAX_SKIP_BATCH_COUNT_ON_ERROR='']
     [RESET_KAFKA_GROUP_OFFSETS='']
     [COPY_JOB_HINT='']
 AS <copy_statement>;
 ```
-* `<pipe_name>`: The name of the Pipe object you want to create.
-* `VIRTUAL_CLUSTER`: Specify the name of the virtual cluster.
-* `BATCH_INTERVAL_IN_SECONDS`: (Optional) Set the batch interval time, default is 60 seconds.
-* `BATCH_SIZE_PER_KAFKA_PARTITION`: (Optional) Set the batch size per Kafka partition, default is 500,000 records.
-* `MAX_SKIP_BATCH_COUNT_ON_ERROR`: (Optional) Set the maximum retry count for skipped batches on error, default is 30.
-- `RESET_KAFKA_GROUP_OFFSETS`: (Optional) Sets the initial offset for Kafka when starting the pipe. This property cannot be modified after the pipe is created. Possible values are `latest`, `earliest`, `none`, `valid`, and `${TIMESTAMP_MILLISECONDS}`.
-    - `none`: No action by default.
-    - `valid`: Checks if the current offset in the group is expired and resets expired partitions to the current earliest offset.
-    - `earliest`: Resets to the current earliest offset.
-    - `latest`: Resets to the current latest offset.
-    - `${TIMESTAMP_MILLISECONDS}`: Resets to the offset corresponding to the millisecond timestamp, for example, `'1737789688000'` (which corresponds to January 25, 2025, 15:21:28).
+* `<pipe_name>`: The name of the Pipe object, used for management and monitoring.
+* `VIRTUAL_CLUSTER`: Specifies the name of the Virtual Cluster to execute Pipe tasks.
+* `INITIAL_DELAY_IN_SECONDS`: Initial job scheduling delay (optional, default 0 seconds).
+* `BATCH_INTERVAL_IN_SECONDS`: (Optional) Controls how long to accumulate data per batch before writing — a shorter interval means fresher data, a longer interval means more efficient single writes. Default of 60 seconds works for most scenarios.
+* `BATCH_SIZE_PER_KAFKA_PARTITION`: (Optional) Batch size per Kafka partition, default 500,000 records.
+* `MAX_SKIP_BATCH_COUNT_ON_ERROR`: (Optional) Maximum number of batches to skip on error, default 30.
+* `RESET_KAFKA_GROUP_OFFSETS`: (Optional) Controls where the Pipe starts consuming Kafka data when it starts. Only settable at startup. If not set and the consumer group has no historical position, Kafka's [auto.offset.reset](https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset) configuration is used (default `latest`). Supported values:
+  * `none`: No action; uses [auto.offset.reset](https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset)
+  * `valid`: Checks if the current group offset is expired and resets expired partition offsets to the current earliest
+  * `earliest`: Resets to the current earliest
+  * `latest`: Resets to the current latest
+  * `${TIMESTAMP_MILLISECONDS}`: Resets to the offset corresponding to the millisecond timestamp, e.g., `1737789688000` (2025-01-25 15:21:28)
+## Using READ\_KAFKA in a Pipe
+For temporary exploration, you can use the READ_KAFKA function directly (see [READ_KAFKA Function](<sql_functions/table_functions/read_kafka.md>)). When using `READ_KAFKA` in a Pipe's COPY statement, the following **important differences** apply:
+### Parameter Passing Rules
+```sql
+-- READ_KAFKA syntax in a Pipe
+read_kafka (
+    'bootstrap_servers',     -- Required: Kafka cluster address in host:port format, multiple brokers separated by commas — 2-3 broker addresses are sufficient, no need to list all nodes
+    'topic',                 -- Required: Topic name — one Pipe corresponds to one Topic; create multiple Pipes for multiple Topics
+    '',                      -- Required: Topic pattern (not yet supported, leave empty string)
+    'group_id',              -- Required: Persistent consumer group ID — use a meaningful name (e.g., pipe_orders_group); different Pipes for the same Topic must use different group_ids
+    '',                      -- Leave empty: start position is managed automatically by Pipe (when using READ_KAFKA standalone, fill starting_offsets here)
+    '',                      -- Leave empty: end position managed automatically by Pipe
+    '',                      -- Leave empty: start timestamp managed automatically by Pipe
+    '',                      -- Leave empty: end timestamp managed automatically by Pipe
+    'raw',                   -- Key format
+    'raw',                   -- Value format
+    0,                       -- Max error count
+    map()                    -- Kafka config parameters — fill in SSL, SASL and other auth params here when needed, e.g., map('security.protocol','SASL_SSL',...)
+)
+```
+### Key Differences
+| Feature | READ\_KAFKA Function (standalone) | READ\_KAFKA (in a Pipe) |
+| ------ | ------------------------ | --------------------- |
+| Consumer group | Temporary, destroyed after execution | Persistent, maintains consumption position |
+| Position management | Manually specify starting\_offsets etc. | Managed automatically by Pipe; position parameters must be left empty |
+| Execution mode | One-time query | Continuously scheduled |
+| Default start position | earliest (explore historical data) | latest (process new data) |
+### Best Practices
+See [Efficiently Ingesting Kafka Data with Pipe](<pipe-kafka-bestpractice-1.md>)
 ## Usage Example
 ```SQL
-/*Use Lakehouse Pipe task object to continuously import Kafka data into the target table*/
+/*Use a Lakehouse Pipe task object to continuously import Kafka data into a target table*/
 ---Step01: Create the target table for Kafka writes
 create table kafka_raw(value string);
----Step02: Create PIPE task to read from Kafka and write to the target table
+---Step02: Create a PIPE task to read from Kafka and write to the target table
 CREATE PIPE  load_kafka01
 VIRTUAL_CLUSTER = 'DEFAULT'
 BATCH_INTERVAL_IN_SECONDS = '10'
@@ -48,12 +93,12 @@ FROM (
         'test',-- topic name
         '', -- topic prefix not supported yet
         'pipe_kafka_group',-- group id
-        '',-- Point-related parameters, leave blank in pipe ddl
-        '',-- Point-related parameters, leave blank in pipe ddl
-        '',-- Point-related parameters, leave blank in pipe ddl
-        '',-- Point-related parameters, leave blank in pipe ddl
-        'raw',-- format of key, currently only supports binary
-        'raw',-- format of value, currently only supports binary
+        '',-- offset-related parameter, leave empty in pipe ddl
+        '',-- offset-related parameter, leave empty in pipe ddl
+        '',-- offset-related parameter, leave empty in pipe ddl
+        '',-- offset-related parameter, leave empty in pipe ddl
+        'raw',-- key format, currently only supports binary
+        'raw',-- value format, currently only supports binary
         0,
         map()
         )
@@ -88,72 +133,18 @@ SELECT * FROM kafka_raw LIMIT 100;
 DROP PIPE load_kafka01;
 ```
-## Function: read\_kafka
-> Note: This function is currently in preview release
-## Function Description
-Read data from an Apache Kafka cluster and return the data in tabular form.
-## Function Syntax
-```SQL
-read_kafka (
-    <bootstrapServers>,
-    <topic>,
-    <topic_prefix>,
-    <group_id>,
-    <STARTING_OFFSETS>,
-    <ENDING_OFFSETS>,
-    <STARTING_OFFSETS_TIMESTAMP>,
-    <ENDING_OFFSETS_TIMESTAMP>,
-    <KEY_FORMAT>,
-    <VALUE_FORMAT>,
-    <MAX_ERROR_NUMBER>,
-    <kafka_parameters>
-)
-```
-## Parameter Description
-* bootstrap: Comma-separated Kafka broker server addresses, such as `1.2.3.1:9092,1.2.3.2:9092`.
-* topic: Kafka topic name, multiple topics separated by commas, such as `topicA,topicB`.
-* topic\_pattern: Topic regex, not supported yet, leave it empty by default. For example: ''.
-* group\_id: Kafka consumer group ID.
-* STARTING\_OFFSETS: Specifies the starting offset to read from, default is `latest`. This parameter does not need to be passed in the pipe.
-* ENDING\_OFFSETS: Specifies the ending offset, default is `latest`. This parameter does not need to be passed in the pipe.
-* STARTING\_OFFSETS\_TIMESTAMP: Specifies the timestamp for the starting offset. This parameter does not need to be passed in the pipe.
-* ENDING\_OFFSETS\_TIMESTAMP: Specifies the timestamp for the ending offset. This parameter does not need to be passed in the pipe.
-* KEY\_FORMAT: Specifies the format of the key to read, case-insensitive STRING type. Currently, only raw format is supported.
-* VALUE\_FORMAT: Specifies the format of the value to read, case-insensitive STRING type. Currently, only raw format is supported.
-* MAX\_ERROR\_NUMBER: The maximum number of allowed error rows within the reading window. Must be greater than or equal to 0. The default is 0, which means no error rows are allowed, with a range of 0-100000.
-* kafka\_parameters: Parameters to be passed to Kafka, prefixed with kafka., directly using Kafka's parameters. These options can be found in Kafka. The format is like MAP('kafka.security.protocol', 'PLAINTEXT', 'kafka.auto.offset.reset', 'latest'). For values, refer to the [Kafka documentation](https://kafka.apache.org/documentation/#consumerconfigs).
-## Return Values
-| Field           | Meaning                      | Type                 |
-| --------------- | ---------------------------- | -------------------- |
-| topic           | Kafka topic name             | STRING               |
-| partition       | Data partition ID            | INT                  |
-| offset          | Offset in Kafka partition    | BIGINT               |
-| timestamp       | Kafka message timestamp      | TIMESTAMP\_LTZ       |
-| timestamp\_type | Kafka message timestamp type | STRING               |
-| headers         | Kafka message headers        | MAP\<STRING, BINARY> |
-| key             | Kafka key value              | BINARY               |
-| value           | Kafka value                  | BINARY               |
 ## Status Monitoring and Management
-### Viewing Kafka Consumption Latency
+### Check Kafka Consumption Latency
-Use the `DESC PIPE` command. For example, the JSON string in `pipe_latency` below:
-- `lastConsumeTimestamp`: The last consumed offset.
-- `offsetLag`: The backlog of Kafka data.
-- `timeLag`: Consumption latency, calculated as the current time minus the last consumed offset. If Kafka consumption is abnormal, the value is -1.
+Use the `DESC PIPE` command. The JSON string in `pipe_latency` contains the following fields:
+- `lastConsumeTimestamp`: The last consumed offset timestamp
+- `offsetLag`: The backlog of Kafka data
+- `timeLag`: Consumption latency, calculated as the current time minus the last consumed offset timestamp. When Kafka consumption is abnormal, the value is -1
-```sql
+````
 DESC PIPE EXTENDED kafka_pipe_stream
 +--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
 |     info_name      |                                                                                                               info_value                                                            |
@@ -164,54 +155,58 @@ DESC PIPE EXTENDED kafka_pipe_stream
 | last_modified_time | 2025-03-05 10:40:55.405                                                                                                                                                             |
 | comment            |                                                                                                                                                                                     |
 | properties         | ((virtual_cluster,test_alter))                                                                                                                                                      |
-| copy_statement     | COPY INTO TABLE qingyun.pipe_schema.kafak_sink_table_1 FROM (SELECT `current_timestamp`() AS ```current_timestamp``()`, CAST(kafka_table_stream_pipe1.`value` AS string) AS `value` |
+| copy_statement     | COPY INTO TABLE qingyun.pipe_schema.kafka_sink_table_1 FROM (SELECT `current_timestamp`() AS ```current_timestamp``()`, CAST(kafka_table_stream_pipe1.`value` AS string) AS `value` |
 | pipe_status        | RUNNING                                                                                                                                                                             |
-| output_name        | xxxxxxx.pipe_schema.kafak_sink_table_1                                                                                                                                              |
+| output_name        | xxxxxxx.pipe_schema.kafka_sink_table_1                                                                                                                                              |
 | input_name         | kafka_table_stream:xxxxxxx.pipe_schema.kafka_table_stream_pipe1                                                                                                                     |
 | invalid_reason     |                                                                                                                                                                                     |
 | pipe_latency       | {"kafka":{"lags":{"0":0,"1":0,"2":0,"3":0},"lastConsumeTimestamp":-1,"offsetLag":0,"timeLag":-1}}                                                                                   |
 +--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
-```
-### Viewing Pipe Execution History
+````
-Since each Pipe execution triggers a copy operation, you can view all operations in the job history. Use the `query_tag` in the [Job History](web-job-history.md) to filter. All Pipe copy jobs are tagged in the format `pipe.``workspace_name``.schema_name.pipe_name`, making it easy to track and manage.
+### View Pipe Execution History
-### Stopping and Starting a Pipe
+Since each Pipe execution is a COPY operation, you can view all operations in the job history. Filter by `query_tag` in the [Job History](web-job-history.md). All Pipe COPY jobs are tagged in the format `pipe.``workspace_name``.schema_name.pipe_name` for easy tracking.
-- Pause a Pipe:
-  ```sql
-  ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = true;
-  ```
+### Stop and Start a Pipe
-- Resume a Pipe:
-  ```sql
-  ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = false;
-  ```
+* Pause a Pipe:
-### Modifying Pipe Properties
+```
+ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = true;
+```
-You can modify the properties of a Pipe, but only one property at a time. If multiple properties need to be modified, execute the `ALTER` command multiple times. Below are the modifiable properties and their syntax:
+* Resume a Pipe:
-```sql
+```
+ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = false;
+```
+### Modify Pipe Properties
+You can modify Pipe properties one at a time. If multiple properties need to be changed, run the `ALTER` command multiple times. Below are the modifiable properties and their syntax:
+```SQL
 ALTER PIPE pipe_name SET
    [VIRTUAL_CLUSTER = 'virtual_cluster_name'],
-    [BATCH_INTERVAL_IN_SECONDS=''],
-   [ BATCH_SIZE_PER_KAFKA_PARTITION=''],
-    [MAX_SKIP_BATCH_COUNT_ON_ERROR=''],
-    [RESET_KAFKA_GROUP_OFFSETS=''],
-    [COPY_JOB_HINT='']
+   [BATCH_INTERVAL_IN_SECONDS=''],
+   [BATCH_SIZE_PER_KAFKA_PARTITION=''],
+   [MAX_SKIP_BATCH_COUNT_ON_ERROR=''],
+   [COPY_JOB_HINT='']
 ```
 Examples:
-```sql
--- Modify the compute cluster
-ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = 'default';
+```
+-- Modify the Virtual Cluster
+ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = 'DEFAULT'
 -- Set COPY_JOB_HINT
-ALTER PIPE pipe_name SET copy_hints='{"cz.mapper.kafka.message.size": "2000000"}';
+ALTER PIPE pipe_name SET COPY_JOB_HINT='{"cz.mapper.kafka.message.size": "2000000"}'
 ```
-**Note**
-- Modifying the logic of the COPY statement is not supported. If you need to modify it, delete the Pipe and recreate it.
-- When modifying the `COPY_JOB_HINT` of a Pipe, the new settings will overwrite existing hints. Therefore, if your Pipe already has hints (e.g., `{"cz.sql.split.kafka.strategy":"size"}`), you must set all required hints together when adding new ones; otherwise, the existing hints will be overwritten by the new settings. Separate multiple parameters with commas.
+**Notes**
+* Modifying the COPY statement logic is not supported. If you need to modify it, delete the Pipe and recreate it.
+* When modifying the `COPY_JOB_HINT` of a Pipe, the new settings will overwrite all existing hints. If your Pipe already has hints such as `{"cz.sql.split.kafka.strategy":"size"}`, you must include all required hints together when setting new ones; otherwise existing hints will be overwritten. Separate multiple parameters with commas.
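
The `${TIMESTAMP_MILLISECONDS}` value accepted by `RESET_KAFKA_GROUP_OFFSETS` is an epoch timestamp in milliseconds. The sample value `1737789688000` matches the quoted wall-clock time 2025-01-25 15:21:28 only when read in UTC+8 (in UTC it is 07:21:28), so it can help to compute the value explicitly. This is a minimal Python sketch: the helper names are hypothetical, and the UTC+8 offset is inferred from the documentation's example rather than stated in it.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the documentation quotes wall-clock time in UTC+8.
UTC_PLUS_8 = timezone(timedelta(hours=8))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(wall_clock, tz=UTC_PLUS_8):
    """Convert a naive wall-clock datetime, interpreted in tz, to epoch milliseconds."""
    aware = wall_clock.replace(tzinfo=tz)
    # Integer division of timedeltas avoids floating-point rounding.
    return (aware - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(millis, tz=UTC_PLUS_8):
    """Convert epoch milliseconds back to an aware datetime in tz."""
    return datetime.fromtimestamp(millis / 1000, tz=tz)


millis = to_epoch_millis(datetime(2025, 1, 25, 15, 21, 28))
print(millis)
print(from_epoch_millis(1737789688000).strftime("%Y-%m-%d %H:%M:%S"))
print(from_epoch_millis(1737789688000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S"))
```

Passing a different `tz` shows how the same millisecond value maps to another wall-clock time, which is worth checking before resetting a consumer group's offsets.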