@clickzetta/cz-cli-darwin-arm64 0.5.15 → 0.5.17
- package/bin/cz-cli +0 -0
- package/bin/skills/lakehouse-doc-en/SKILL.md +6 -11
- package/bin/skills/lakehouse-doc-en/references/AIGateway.md +58 -13
- package/bin/skills/lakehouse-doc-en/references/Computation.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/DataSource_Amazon_DocumentDB.md +3 -1
- package/bin/skills/lakehouse-doc-en/references/Foreach.md +14 -14
- package/bin/skills/lakehouse-doc-en/references/JDBC-Driver.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/LakehouseAI-overview.md +21 -8
- package/bin/skills/lakehouse-doc-en/references/LakehouseDataGPT-tour.md +4 -9
- package/bin/skills/lakehouse-doc-en/references/LakehouseStudio-tour.md +14 -19
- package/bin/skills/lakehouse-doc-en/references/Lakehouse_Zilliz_MakeDataReadyforBIandAI.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/Logstash.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/Migrate_Spark_DataEngineeringBestPractices_Project_to_Lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/Notebook.md +17 -17
- package/bin/skills/lakehouse-doc-en/references/RemoteFunction-as-udf.md +14 -14
- package/bin/skills/lakehouse-doc-en/references/SQL_External_Catalog_Guide.md +1 -9
- package/bin/skills/lakehouse-doc-en/references/SUMMARY.md +59 -29
- package/bin/skills/lakehouse-doc-en/references/WINDOWFUNCTION.md +99 -57
- package/bin/skills/lakehouse-doc-en/references/Zettapark_Data_Engineering_Demo.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/access-control-configuration.md +1 -8
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-2-5-1.0.md +16 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-29-1.0.2.md +14 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-8-1.0.1.md +16 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-4-28-1.1.md +29 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-12-1.1.1.md +18 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-15-1.2.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-21-1.3.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-28-1.4.md +10 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-6-3-1.5.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/alicloud-arn-externalid.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/answer-accuracy-improve.md +120 -103
- package/bin/skills/lakehouse-doc-en/references/application-list.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/approval-list.md +16 -17
- package/bin/skills/lakehouse-doc-en/references/batch-load-parquet-file-into-lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/batch_sync.md +9 -9
- package/bin/skills/lakehouse-doc-en/references/batch_sync_Sop.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/batchloadparquetfileintoLakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/bulkloadv1-python-sdk.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/chart-auto-refresh-guide.md +12 -6
- package/bin/skills/lakehouse-doc-en/references/clickzetta-sample-data.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/code_approval.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/composite_task.md +31 -42
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_environment_and_data_generate.md +6 -9
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_javasdk_bulkload_realtime.md +4 -10
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_kafka_realtime_sync.md +1 -10
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_local_file_into_table_by_studio.md +0 -6
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_batchload_public_network.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_python_node.md +2 -7
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_realtime_cdc_public_network.md +13 -18
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_sql_insert.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/concepts.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/config-datasource.md +5 -7
- package/bin/skills/lakehouse-doc-en/references/connect-with-cli.md +116 -72
- package/bin/skills/lakehouse-doc-en/references/connect-with-cz-cli.md +151 -0
- package/bin/skills/lakehouse-doc-en/references/continue-job.md +9 -17
- package/bin/skills/lakehouse-doc-en/references/create-api-connection.md +315 -286
- package/bin/skills/lakehouse-doc-en/references/create-catalog-connection.md +1 -0
- package/bin/skills/lakehouse-doc-en/references/create-dynamic-table.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/create-external-catalog.md +85 -22
- package/bin/skills/lakehouse-doc-en/references/create-table-ddl.md +45 -0
- package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkendpoint.md +4 -6
- package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkservice.md +4 -7
- package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkendpoint.md +2 -7
- package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkservice.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/cz-cli-agent.md +15 -10
- package/bin/skills/lakehouse-doc-en/references/cz-cli-datasource.md +0 -8
- package/bin/skills/lakehouse-doc-en/references/cz-cli-sql.md +2 -45
- package/bin/skills/lakehouse-doc-en/references/cz-cli.md +53 -42
- package/bin/skills/lakehouse-doc-en/references/dashboard-version-management-guide.md +12 -4
- package/bin/skills/lakehouse-doc-en/references/data-integration-intro.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/data-integration.md +29 -27
- package/bin/skills/lakehouse-doc-en/references/data-load-summary.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/data-quality.md +25 -25
- package/bin/skills/lakehouse-doc-en/references/data-sharing.md +31 -54
- package/bin/skills/lakehouse-doc-en/references/data-sources.md +45 -45
- package/bin/skills/lakehouse-doc-en/references/data_catalog.md +23 -25
- package/bin/skills/lakehouse-doc-en/references/data_privacy.md +5 -2
- package/bin/skills/lakehouse-doc-en/references/data_sharing_between_accounts_guide.md +0 -4
- package/bin/skills/lakehouse-doc-en/references/data_visualization.md +4 -15
- package/bin/skills/lakehouse-doc-en/references/dataagent.md +39 -7
- package/bin/skills/lakehouse-doc-en/references/databricks-delta-to-lakehouse-migration.md +168 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-dlt-to-lakehouse-migration.md +331 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-external-catalog-practice.md +367 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-jobs-to-studio-migration.md +199 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-notebook-to-studio-migration.md +350 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-uc-governance-to-lakehouse-migration.md +327 -0
- package/bin/skills/lakehouse-doc-en/references/datagpt-model-config.md +34 -0
- package/bin/skills/lakehouse-doc-en/references/datagpt_data_source.md +50 -37
- package/bin/skills/lakehouse-doc-en/references/datagpt_introduction.md +55 -79
- package/bin/skills/lakehouse-doc-en/references/datagpt_quickstart.md +50 -64
- package/bin/skills/lakehouse-doc-en/references/datalake-acceleration.md +75 -2
- package/bin/skills/lakehouse-doc-en/references/dbt-databricks-to-clickzetta-migration.md +242 -0
- package/bin/skills/lakehouse-doc-en/references/dynamic-mask.md +30 -30
- package/bin/skills/lakehouse-doc-en/references/dynamic-table-bestpractice.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/dynamic-table-introduce.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/dynamic_table_summary.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/eco_integration/streamlit.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/eco_integration/superset.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/ecosystem-all.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/ecosystem.md +145 -0
- package/bin/skills/lakehouse-doc-en/references/external-catalog-summary.md +33 -38
- package/bin/skills/lakehouse-doc-en/references/external-function-combo-practice.md +466 -0
- package/bin/skills/lakehouse-doc-en/references/f6fc6447ee.md +7 -9
- package/bin/skills/lakehouse-doc-en/references/federation-query.md +56 -6
- package/bin/skills/lakehouse-doc-en/references/finebi-mysql.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/get-started-with-sample-data.md +10 -11
- package/bin/skills/lakehouse-doc-en/references/gitfolder.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/grant-privileges.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/iceberg-rest-catalog-databricks.md +166 -0
- package/bin/skills/lakehouse-doc-en/references/ide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/if_else_task.md +59 -57
- package/bin/skills/lakehouse-doc-en/references/input_output.md +10 -7
- package/bin/skills/lakehouse-doc-en/references/jobprofile-bestpractices.md +60 -64
- package/bin/skills/lakehouse-doc-en/references/kafka-connection.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/key-concepts.md +146 -117
- package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-gateway-cz-cli.md +317 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-sql-analysis.md +345 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-dqc-guide.md +300 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-medallion-sql-dt-guide.md +543 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-multi-cloud-acceleration.md +274 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-multimodal-ai-pipeline.md +198 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-quick-experience_guide.md +49 -52
- package/bin/skills/lakehouse-doc-en/references/lakehouse-volume-pipe-acceleration-guide.md +380 -0
- package/bin/skills/lakehouse-doc-en/references/langchain-plug-installation.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/management.md +4 -9
- package/bin/skills/lakehouse-doc-en/references/medallion-lakehouse-from-scratch.md +2 -1
- package/bin/skills/lakehouse-doc-en/references/metrics_answer_build.md +58 -21
- package/bin/skills/lakehouse-doc-en/references/migrate-spark-data-engineering-best-practices-to-lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/mindsdb.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/monitoring_and_alerting.md +65 -60
- package/bin/skills/lakehouse-doc-en/references/monitoring_item_specification.md +33 -33
- package/bin/skills/lakehouse-doc-en/references/multitable_batch_sync.md +16 -16
- package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync.md +65 -72
- package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync_sop.md +54 -52
- package/bin/skills/lakehouse-doc-en/references/navicat-mysql.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/om-dynamic-table.md +71 -66
- package/bin/skills/lakehouse-doc-en/references/om-vcluster.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-create-session.md +79 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-generate-auth-token.md +63 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-overview.md +96 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-quick-start.md +286 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-response-guide.md +264 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-safe-question-poll.md +201 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-query.md +99 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-stop.md +74 -0
- package/bin/skills/lakehouse-doc-en/references/overview.md +6 -7
- package/bin/skills/lakehouse-doc-en/references/permission-application.md +5 -5
- package/bin/skills/lakehouse-doc-en/references/pipe-introduction.md +1 -0
- package/bin/skills/lakehouse-doc-en/references/pipe-kafka-table-stream.md +72 -70
- package/bin/skills/lakehouse-doc-en/references/pipe-kafka.md +105 -110
- package/bin/skills/lakehouse-doc-en/references/pipe-overview.md +40 -40
- package/bin/skills/lakehouse-doc-en/references/pipe-storage-object.md +43 -48
- package/bin/skills/lakehouse-doc-en/references/pipe-summary.md +14 -4
- package/bin/skills/lakehouse-doc-en/references/pipe-syntax.md +58 -151
- package/bin/skills/lakehouse-doc-en/references/practice_python_task.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/pricing-ai-gateway.md +181 -0
- package/bin/skills/lakehouse-doc-en/references/pricing-lakehouse.md +316 -0
- package/bin/skills/lakehouse-doc-en/references/pricing.md +44 -288
- package/bin/skills/lakehouse-doc-en/references/private-link-general.md +0 -2
- package/bin/skills/lakehouse-doc-en/references/pyspark-to-zettapark-migration-f1.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python-igs.md +7 -3
- package/bin/skills/lakehouse-doc-en/references/python-sample-put-github-rt-events.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python-task.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector_advanced.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector_examples.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/python_sdk_guide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python_shell_datasource.md +11 -9
- package/bin/skills/lakehouse-doc-en/references/quick_start_batch_sync_data.md +9 -18
- package/bin/skills/lakehouse-doc-en/references/quick_start_bi_analysis.md +8 -25
- package/bin/skills/lakehouse-doc-en/references/quick_start_create_workspace.md +4 -6
- package/bin/skills/lakehouse-doc-en/references/quick_start_data_quality.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quick_start_etl.md +16 -20
- package/bin/skills/lakehouse-doc-en/references/quick_start_monitoring_and_alerting.md +10 -18
- package/bin/skills/lakehouse-doc-en/references/quick_start_sql_query.md +7 -10
- package/bin/skills/lakehouse-doc-en/references/quick_start_upload_data.md +5 -7
- package/bin/skills/lakehouse-doc-en/references/quick_start_user_management.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quick_start_workspace.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/quick_start_workspace_user.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quickstart.md +69 -56
- package/bin/skills/lakehouse-doc-en/references/quickstart_datashare_between_companies.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/quickstart_envirment_for_team.md +0 -24
- package/bin/skills/lakehouse-doc-en/references/realtime-pipeline-selection-guide.md +1 -2
- package/bin/skills/lakehouse-doc-en/references/realtime-sales-dashboard-with-dynamic-table.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/realtime_sync.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/release-note-2026-05-19.md +5 -3
- package/bin/skills/lakehouse-doc-en/references/revoke-privileges.md +3 -1
- package/bin/skills/lakehouse-doc-en/references/roles.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/row-filter.md +165 -0
- package/bin/skills/lakehouse-doc-en/references/row_level_permission.md +30 -19
- package/bin/skills/lakehouse-doc-en/references/scheduled_task.md +28 -21
- package/bin/skills/lakehouse-doc-en/references/security_overview.md +99 -21
- package/bin/skills/lakehouse-doc-en/references/set-command.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/setup.md +13 -15
- package/bin/skills/lakehouse-doc-en/references/show-grants.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/snowflake-dynamic-tables-to-lakehouse.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/spark-connector-summary.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/sql_functions/context_functions/current_vcluster.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/sso-configuration.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/streaming_pipeline_with_dynamic_table.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/studio-incremental-sync-practice.md +27 -23
- package/bin/skills/lakehouse-doc-en/references/studio-shell-task.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/supported-cloud-platforms.md +32 -0
- package/bin/skills/lakehouse-doc-en/references/table_rendering.md +18 -12
- package/bin/skills/lakehouse-doc-en/references/task-develop.md +89 -91
- package/bin/skills/lakehouse-doc-en/references/task_development.md +19 -17
- package/bin/skills/lakehouse-doc-en/references/task_group.md +16 -14
- package/bin/skills/lakehouse-doc-en/references/task_instance.md +21 -21
- package/bin/skills/lakehouse-doc-en/references/task_param.md +38 -35
- package/bin/skills/lakehouse-doc-en/references/task_param_reference.md +81 -79
- package/bin/skills/lakehouse-doc-en/references/task_scheduling_dependency.md +20 -21
- package/bin/skills/lakehouse-doc-en/references/tencentcloud_arn_and_externalid.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/trial-account-quotas-and-limits.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/tutorial_connect_to_lakehouse.md +69 -0
- package/bin/skills/lakehouse-doc-en/references/tutorials.md +4 -1
- package/bin/skills/lakehouse-doc-en/references/unique-key.md +167 -0
- package/bin/skills/lakehouse-doc-en/references/usageandbillingview.md +138 -0
- package/bin/skills/lakehouse-doc-en/references/use-dbt-dev.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/use-java-sdk-realtime-uploaddata.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/use-java-sdk-upload-data-local.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/use-models.md +128 -0
- package/bin/skills/lakehouse-doc-en/references/use-mysql-client.md +81 -81
- package/bin/skills/lakehouse-doc-en/references/use-python-sdk-upload-data.md +10 -12
- package/bin/skills/lakehouse-doc-en/references/user-identification.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/user_permission_grand_guide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/using-udf-in-dynamic-table.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/vc_cache.md +18 -22
- package/bin/skills/lakehouse-doc-en/references/vcluster_size_description.md +33 -31
- package/bin/skills/lakehouse-doc-en/references/virtual-cluster.md +43 -45
- package/bin/skills/lakehouse-doc-en/references/web-job-history.md +94 -108
- package/bin/skills/lakehouse-doc-en/references/web_search.md +16 -7
- package/bin/skills/lakehouse-doc-en/references/zettapark-data-engineering-demo.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/zettapark-dataframe-guide.md +144 -70
- package/bin/skills/lakehouse-doc-en/references/zettapark-dynamic-table-guide.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-etl-guide.md +73 -33
- package/bin/skills/lakehouse-doc-en/references/zettapark-feature-engineering.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-functions-guide.md +75 -46
- package/bin/skills/lakehouse-doc-en/references/zettapark-quick-start.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-stream-guide.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/zettapark-volume-guide.md +93 -29
- package/package.json +1 -1
- package/bin/skills/lakehouse-doc-en/references/CLAUDE.md +0 -606
- package/bin/skills/lakehouse-doc-en/references/modelprice.md +0 -155
|
@@ -1,10 +1,10 @@
|
|
|
1
|
-
# Data Pipelines and Change Capture
|
|
1
|
+
# Data Pipelines and Change Data Capture
|
|
2
2
|
|
|
3
|
-
This chapter covers two types of objects: **Pipe** (continuous data ingestion) and **Table Stream** (change data capture).
|
|
3
|
+
This chapter covers two types of objects: **Pipe** (continuous data ingestion) and **Table Stream** (change data capture). Pipe automatically writes data from external files or message streams into tables; Table Stream records incremental changes on a table for downstream incremental computation.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
Pipe is the **continuous data ingestion object** in Singdata Lakehouse. Once created, it runs automatically, continuously reading data from object storage (OSS/COS/S3) or Kafka and writing it to a target table — no manual triggering required.
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
Analogy: Pipe is like a continuously running conveyor belt. Once files are uploaded to an OSS subdirectory, Pipe automatically detects and loads them within about 30 seconds. When Kafka messages are written, Pipe continuously consumes and writes them in batch intervals. Unlike scheduled tasks, Pipe runs persistently and processes new data as it arrives.
|
|
8
8
|
|
|
9
9
|

|
|
10
10
|
|
|
@@ -13,22 +13,22 @@ Think of a Pipe as a continuously running conveyor belt. After a file is uploade
|
|
|
13
13
|
## Pipe Types
|
|
14
14
|
|
|
15
15
|
| Type | Data Source | Detection Latency | Typical Use Case |
|
|
16
|
-
|
|
17
|
-
| **Object Storage Pipe** | OSS / COS / S3 | ~30 seconds |
|
|
16
|
+
|------|-------------|-------------------|-----------------|
|
|
17
|
+
| **Object Storage Pipe** | OSS / COS / S3 | ~30 seconds | Automatic ingestion of periodically uploaded CSV/Parquet/JSON files |
|
|
18
18
|
| **Kafka Pipe** | Kafka Topic | Per batch interval (default 60 seconds) | Real-time ingestion of logs and business events |
|
|
19
19
|
|
|
20
|
-
|
|
20
|
+
Pipe and Studio sync tasks are functionally equivalent. The difference lies in how they are managed: a Pipe is created and managed via SQL DDL, which suits code-based pipeline management, while Studio sync tasks are configured through a visual interface and support more data sources (including relational databases).
|
|
21
21
|
|
|
22
22
|
---
|
|
23
23
|
|
|
24
24
|
## Continuous Ingestion from Object Storage
|
|
25
25
|
|
|
26
|
-
**Prerequisites**: Create a Storage Connection → Create an External Volume (must point to a specific subdirectory, not the bucket root
|
|
26
|
+
**Prerequisites**: Create a Storage Connection → Create an External Volume (must point to a specific subdirectory, not the bucket root) → Create the target table → Create the Pipe.
|
|
27
27
|
|
|
28
|
-
> ⚠️
|
|
28
|
+
> ⚠️ The Volume's `LOCATION` must point to an OSS subdirectory (e.g., `oss://bucket/pipe_data/`), not the bucket root (`oss://bucket/`). Otherwise, Pipe creation will fail.
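
The subdirectory rule can be checked on the client side before any DDL is run. The following is a minimal sketch, not part of Singdata Lakehouse: `is_valid_pipe_location` is a hypothetical helper that only inspects the URI string, and treating any non-empty path as a subdirectory is an assumption drawn from the two example URIs in the warning.

```python
from urllib.parse import urlsplit


def is_valid_pipe_location(location: str) -> bool:
    """Return True if the URI names a bucket plus a non-empty subdirectory.

    Hypothetical pre-flight check; the authoritative validation still
    happens when the Pipe is created.
    """
    parts = urlsplit(location)
    has_bucket = bool(parts.scheme) and bool(parts.netloc)
    # Strip slashes so "oss://bucket/" and "oss://bucket" both yield "".
    has_subdirectory = bool(parts.path.strip("/"))
    return has_bucket and has_subdirectory


# The two URIs used as examples in the warning.
print(is_valid_pipe_location("oss://bucket/pipe_data/"))
print(is_valid_pipe_location("oss://bucket/"))
```

A check like this only catches the obvious bucket-root mistake early; it does not confirm that the path exists or that the connection has access to it.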
|
|
29
29
|
|
|
30
30
|
```sql
|
|
31
|
-
-- Step 1: Create a
|
|
31
|
+
-- Step 1: Create a storage connection
|
|
32
32
|
CREATE STORAGE CONNECTION my_oss_conn
|
|
33
33
|
TYPE OSS
|
|
34
34
|
ENDPOINT = 'oss-cn-hangzhou.aliyuncs.com'
|
|
@@ -51,7 +51,7 @@ CREATE TABLE IF NOT EXISTS orders (
|
|
|
51
51
|
|
|
52
52
|
-- Step 4: Create the Pipe
|
|
53
53
|
CREATE PIPE orders_oss_pipe
|
|
54
|
-
VIRTUAL_CLUSTER = '
|
|
54
|
+
VIRTUAL_CLUSTER = 'DEFAULT'
|
|
55
55
|
INGEST_MODE = 'LIST_PURGE'
|
|
56
56
|
AS
|
|
57
57
|
COPY INTO orders
|
|
@@ -65,31 +65,31 @@ Once created, the Pipe starts running immediately and checks the Volume for new
|
|
|
65
65
|
### Two Ingestion Modes
|
|
66
66
|
|
|
67
67
|
| Mode | Trigger | Source File Handling | Use Case |
|
|
68
|
-
|
|
69
|
-
| `LIST_PURGE` | Periodic polling scan (~30 seconds) | **
|
|
70
|
-
| `EVENT_NOTIFICATION` | Object storage event notification (near real-time) |
|
|
68
|
+
|------|---------|---------------------|---------|
|
|
69
|
+
| `LIST_PURGE` | Periodic polling scan (~30 seconds) | **Deletes source files after ingestion** | Simple configuration, suitable for most scenarios |
|
|
70
|
+
| `EVENT_NOTIFICATION` | Object storage event notification (near real-time) | Keeps source files | When source files need to be retained; OSS and S3 only |
|
|
71
71
|
|
|
72
|
-
> ⚠️
|
|
72
|
+
> ⚠️ `LIST_PURGE` mode **permanently deletes** source files from OSS after a successful import. This is irreversible. Use `EVENT_NOTIFICATION` mode if you need to retain the files.
|
|
73
73
|
|
|
74
|
-
### Deduplication
|
|
74
|
+
### Deduplication Mechanism
|
|
75
75
|
|
|
76
|
-
|
|
76
|
+
Pipe records ingested file paths in `load_history`. The same file path is only imported once — re-uploading the same file will not trigger a duplicate import. `load_history` records are retained for 7 days.
|
|
77
77
|
|
|
78
78
|
```sql
|
|
79
79
|
-- View ingested file records
|
|
80
80
|
SELECT * FROM load_history('orders');
|
|
81
|
-
--
|
|
81
|
+
-- Results include: file_path, last_copy_time, file_size, status, first_error_message
|
|
82
82
|
```
|
|
83
83
|
|
|
84
84
|
---
|
|
85
85
|
|
|
86
86
|
## Continuous Consumption from Kafka
|
|
87
87
|
|
|
88
|
-
|
|
88
|
+
Pipe creates a persistent consumer group, pulling data from a Kafka Topic in batches and writing it to a table.
|
|
89
89
|
|
|
90
90
|
```sql
|
|
91
91
|
CREATE PIPE kafka_orders_pipe
|
|
92
|
-
VIRTUAL_CLUSTER = '
|
|
92
|
+
VIRTUAL_CLUSTER = 'DEFAULT'
|
|
93
93
|
BATCH_INTERVAL_IN_SECONDS = '60'
|
|
94
94
|
AS
|
|
95
95
|
COPY INTO orders_raw
|
|
@@ -98,23 +98,23 @@ FROM (
|
|
|
98
98
|
FROM TABLE(READ_KAFKA(
|
|
99
99
|
'kafka-host:9092', -- bootstrap.servers
|
|
100
100
|
'orders_topic', -- topic
|
|
101
|
-
'', -- topic pattern (not supported
|
|
102
|
-
'pipe_orders_group', -- group_id (must be unique per
|
|
103
|
-
'', '', '', '', -- start/end offsets and timestamps, managed by Pipe
|
|
101
|
+
'', -- topic pattern (not yet supported, leave empty)
|
|
102
|
+
'pipe_orders_group', -- group_id (must be unique per topic per Pipe)
|
|
103
|
+
'', '', '', '', -- start/end offsets and timestamps, managed by Pipe
|
|
104
104
|
'raw', 'raw', -- key/value format
|
|
105
|
-
0, map() -- max
|
|
105
|
+
0, map() -- max errors, extra Kafka config
|
|
106
106
|
))
|
|
107
107
|
);
|
|
108
108
|
```
|
|
109
109
|
|
|
110
|
-
> ⚠️
|
|
110
|
+
> ⚠️ A single Pipe can only consume one Kafka Topic. Multiple Topics require multiple Pipes, and each Pipe must have a unique `group_id`.
|
|
111
111
|
|
|
112
112
|
---
|
|
113
113
|
|
|
114
114
|
## Monitoring and Management
|
|
115
115
|
|
|
116
116
|
```sql
|
|
117
|
-
-- View all Pipes (
|
|
117
|
+
-- View all Pipes (with status, type, VCluster)
|
|
118
118
|
SHOW PIPES;
|
|
119
119
|
|
|
120
120
|
-- View Pipe details
|
|
@@ -136,32 +136,32 @@ DROP PIPE orders_oss_pipe;
|
|
|
136
136
|
Key fields in `DESC PIPE` output:
|
|
137
137
|
|
|
138
138
|
| Field | Description |
|
|
139
|
-
|
|
139
|
+
|-------|-------------|
|
|
140
140
|
| `pipe_status` | `RUNNING` / `PAUSED` |
|
|
141
141
|
| `pipe_kind` | `VOLUME` (object storage) or `KAFKA` |
|
|
142
|
-
| `properties` | Configuration such as ingest_mode
|
|
142
|
+
| `properties` | Configuration such as ingest_mode, vcluster |
|
|
143
143
|
| `input_name` | Data source (Volume or Kafka Topic) |
|
|
144
144
|
| `output_name` | Full path of the target table |
|
|
145
|
-
| `invalid_reason` | Error reason when
|
|
145
|
+
| `invalid_reason` | Error reason when Pipe is in an error state |
|
|
146
146
|
|
|
147
|
-
To view Pipe execution history
|
|
147
|
+
To view Pipe execution history: filter job history by `query_tag`, in the format `pipe.workspace_name.schema_name.pipe_name`.
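
As an illustration of the tag format, the sketch below assembles the `query_tag` value to filter on. `pipe_query_tag` is a hypothetical helper, and the workspace and schema names in the usage line are placeholders; only the `pipe.workspace_name.schema_name.pipe_name` layout comes from the text.

```python
def pipe_query_tag(workspace_name: str, schema_name: str, pipe_name: str) -> str:
    """Build the query_tag string that Pipe-issued jobs are described as carrying."""
    for part in (workspace_name, schema_name, pipe_name):
        if not part:
            raise ValueError("workspace, schema and pipe names must be non-empty")
    return f"pipe.{workspace_name}.{schema_name}.{pipe_name}"


# "my_workspace" and "public" are placeholder names for illustration.
print(pipe_query_tag("my_workspace", "public", "orders_oss_pipe"))
```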
|
|
148
148
|
|
|
149
149
|
---
|
|
150
150
|
|
|
151
|
-
##
|
|
151
|
+
## Notes
|
|
152
152
|
|
|
153
|
-
- **Volume must point to a subdirectory**: `LOCATION` cannot be the bucket root
|
|
154
|
-
- **Each Pipe requires its own Volume**: Different Pipes cannot share the same Volume
|
|
155
|
-
- **
|
|
156
|
-
- **Data loading order is not
|
|
157
|
-
- **Recommended file sizes**: gzip
|
|
158
|
-
-
|
|
153
|
+
- **Volume must point to a subdirectory**: `LOCATION` cannot be the bucket root, otherwise Pipe creation will fail
|
|
154
|
+
- **Each Pipe requires its own Volume**: Different Pipes cannot share the same Volume
|
|
155
|
+
- **COPY statement cannot be modified**: To change the ingestion logic, drop and recreate the Pipe
|
|
156
|
+
- **Data loading order is not guaranteed**
|
|
157
|
+
- **Recommended file sizes**: gzip compressed files should be under 50MB; uncompressed CSV/Parquet files should be 128MB–256MB
|
|
158
|
+
- **EVENT_NOTIFICATION mode** requires additional MNS message queue configuration, supports only Alibaba Cloud OSS and AWS S3, and must use RoleARN authorization
|
|
159
159
|
|
|
160
160
|
---
|
|
161
161
|
|
|
162
162
|
## Related Documentation
|
|
163
163
|
|
|
164
|
-
- [Object Storage Pipe](pipe-storage-object.md) —
|
|
165
|
-
- [Kafka Pipe](pipe-kafka.md) — READ_KAFKA parameter reference
|
|
164
|
+
- [Object Storage Pipe](pipe-storage-object.md) — Full configuration for LIST_PURGE and EVENT_NOTIFICATION
|
|
165
|
+
- [Kafka Pipe](pipe-kafka.md) — READ_KAFKA parameter reference, consumer offset management
|
|
166
166
|
- [Pipe Syntax Reference](pipe-syntax.md) — Complete DDL syntax
|
|
167
|
-
- [Table Stream](om-table-stream.md) — Change data capture
|
|
167
|
+
- [Table Stream](om-table-stream.md) — Change data capture, CDC-driven incremental computation
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Continuous Data Import from Object Storage Using Pipe
|
|
2
2
|
|
|
3
|
-
Pipe is a powerful data import feature in the Singdata Lakehouse platform. It allows
|
|
3
|
+
Pipe is a powerful data import feature in the Singdata Lakehouse platform. It allows users to read data directly from object storage at a fixed frequency and import it into the Lakehouse. By implementing a file detection mechanism, Pipe supports micro-batch file loading, enabling quick access to the latest data. It is particularly suitable for scenarios requiring real-time or near-real-time data processing.
|
|
4
4
|
|
|
5
5
|
Think of an object storage Pipe as a program that continuously scans an inbox — you simply upload files to OSS/S3/COS, and it automatically detects and imports them without any manual trigger.
|
|
6
6
|
|
|
@@ -8,34 +8,35 @@ Think of an object storage Pipe as a program that continuously scans an inbox
|
|
|
8
8
|
|
|
9
9
|
1. **File Detection**:
|
|
10
10
|
|
|
11
|
-
1. **EVENT\_NOTIFICATION\_MODE**: Requires enabling a message service.
|
|
12
|
-
2. **LIST\_PURGE mode**: Periodically scans directories, synchronizes unrecorded files, and deletes
|
|
11
|
+
1. **EVENT\_NOTIFICATION\_MODE**: Requires enabling a message service. Uses Alibaba Cloud Message Service to notify the Lakehouse of new file uploads. Currently only Alibaba Cloud OSS and AWS S3 are supported.
|
|
12
|
+
2. **LIST\_PURGE mode**: Periodically scans directories, synchronizes unrecorded files, and deletes source files after synchronization.
|
|
13
13
|
|
|
14
14
|
2. **COPY Statement**: Defines the source location of data files and the target table, supporting multiple file formats.
|
|
15
15
|
|
|
16
16
|
3. **Automated Loading**: Automatically detects new files and executes the COPY statement.
|
|
17
17
|
|
|
18
|
-
4. **Duplicate Import Prevention**: To avoid duplicate imports, the `load_history` function records the
|
|
18
|
+
4. **Duplicate Import Prevention**: The `load_history` function records the COPY import history for the current table. Each time a Pipe runs, it checks the table name and file name against `load_history` and skips files that have already been imported. To re-import a recorded file, run a COPY command manually. Records in `load_history` are retained for 7 days.
|
|
19
19
|
|
|
20
|
-
5. **Pipe Import Job History**: Since each execution is a Pipe-issued
|
|
20
|
+
5. **Pipe Import Job History**: Each Pipe execution issues a COPY job, so every operation appears in the job history. Filter the job history by `query_tag`: all COPY jobs executed by a Pipe are tagged with the format `pipe.workspace_name.schema_name.pipe_name`, which makes them easy to track and manage.
|
|
21
21
|
|
|
22
22
|
## Use Cases
|
|
23
23
|
|
|
24
|
-
* **Real-time Data Synchronization**: When
|
|
25
|
-
* **Cost Optimization**: Importing and exporting data via object storage avoids public network traffic costs. Within the same region, you can specify intranet transfer for object storage
|
|
24
|
+
* **Real-time Data Synchronization**: When data is stored in object storage and needs frequent synchronization to access the latest data.
|
|
25
|
+
* **Cost Optimization**: Importing and exporting data via object storage avoids public network traffic costs. Within the same region, you can specify intranet transfer for object storage to further reduce costs.
|
|
26
26
|
|
|
27
27
|
## Notes
|
|
28
28
|
|
|
29
29
|
* When using EVENT\_NOTIFICATION\_MODE, you must use the role ARN authorization method to create the storage connection.
|
|
30
|
-
* LIST\_PURGE mode supports both access key and role ARN authorization
|
|
30
|
+
* LIST\_PURGE mode supports both access key and role ARN authorization.
|
|
31
31
|
* **Recommended File Size**: gzip compressed files should be around 50 MB. Uncompressed CSV and Parquet files should be between 128 MB and 256 MB.
|
|
32
32
|
* **Data Loading Order**: **Strict ordering of loaded data is not guaranteed.**
|
|
33
33
|
* **Pipe Latency**: Pipe loading time is affected by various factors, including file format, size, and the complexity of the COPY statement.
|
|
34
|
-
* **Pipe and Volume Mapping**: Each Pipe
|
|
34
|
+
* **Pipe and Volume Mapping**: Each Pipe requires a dedicated Volume and cannot be shared.
|
|
35
35
|
* Modifying the COPY statement logic is not supported. If you need to change it, delete the Pipe and recreate it.
|
|
36
|
-
* When
|
|
36
|
+
* When you modify a Pipe's `COPY_JOB_HINT`, the new value replaces the existing hints rather than merging with them. If the Pipe already has hints such as `{"cz.sql.split.kafka.strategy":"size"}`, include every hint you still need in the new value, separated by commas; any hint you omit is dropped.
|
|
37
37
|
* The COPY statement inside a PIPE does not support the `files`, `regexp`, or `subdirectory` parameters.
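
The `COPY_JOB_HINT` note above can be illustrated with a single `ALTER` statement. This is a minimal sketch, assuming a Pipe named `pipe_name` that already carries the `cz.sql.split.kafka.strategy` hint and now also needs `cz.mapper.kafka.message.size`. Both keys and values are taken from examples elsewhere in this document and serve only as placeholders, not as recommended settings.

```SQL
-- Illustrative only: pipe_name and both hint values are placeholders.
-- Setting COPY_JOB_HINT replaces the whole hint set, so the existing hint
-- is repeated alongside the new one, separated by a comma.
ALTER PIPE pipe_name SET COPY_JOB_HINT='{"cz.sql.split.kafka.strategy":"size","cz.mapper.kafka.message.size":"2000000"}';
```

Setting only the new key would leave the Pipe without the `cz.sql.split.kafka.strategy` hint.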
|
|
38
38
|
|
|
39
|
+
|
|
39
40
|
## Cost
|
|
40
41
|
|
|
41
42
|
Charged based on the computing resources used when loading files.
|
|
@@ -51,27 +52,24 @@ CREATE PIPE [ IF NOT EXISTS ] <pipe_name>
|
|
|
51
52
|
AS <copy_statement>;
|
|
52
53
|
```
|
|
53
54
|
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
59
|
-
* `LIST_PURGE`: Periodically polls and scans the directory — simple to configure, suitable for most scenarios; **deletes source files after a successful import (irreversible**).
|
|
55
|
+
- `<pipe_name>`: The name of the Pipe object to create.
|
|
56
|
+
- `VIRTUAL_CLUSTER`: Specifies the Virtual Cluster name.
|
|
57
|
+
- `INGEST_MODE`: Determines the data ingestion mode — choose one:
|
|
58
|
+
- `LIST_PURGE`: Periodically polls and scans the directory — simple to configure, suitable for most scenarios; **deletes source files after a successful import (irreversible)**
|
|
59
|
+
- `EVENT_NOTIFICATION`: Triggers immediately upon receiving an object storage event notification — suitable for near-real-time ingestion where source files must be retained; requires additional MNS queue configuration
|
|
60
60
|
|
|
61
|
-
|
|
62
|
-
|
|
61
|
+
> ⚠️ **Warning**: `LIST_PURGE` mode **permanently deletes** source files from object storage after a successful import. If you need to retain the original files, use `EVENT_NOTIFICATION` mode.
|
|
62
|
+
- `COPY_JOB_HINT`: Optional, reserved parameter for Lakehouse.
|
|
63
|
+
- `copy_statement`: `<copy_statement>` supports all file parameters. When the `ON_ERROR=CONTINUE|ABORT` parameter is set, it controls error handling during data loading, and the list of imported files is returned:
|
|
64
|
+
* `CONTINUE`: Skips error rows and continues loading subsequent data. Suitable for tolerating partial errors while maximizing data loading completion. Currently, ignorable errors are limited to file format mismatches — for example, the command specifies zip compression but the file uses zstd.
|
|
65
|
+
* `ABORT`: Immediately terminates the entire `COPY` operation. Suitable for strict data quality requirements where any error requires manual inspection.
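
As a rough sketch of how `ON_ERROR` might be combined with a Pipe definition, the statement below adapts the LIST_PURGE example that appears later in this document. The Pipe name `volume_pipe_on_error` is hypothetical, and the position of `ON_ERROR` after `purge=true` is an assumption that this document does not confirm; check the COPY INTO reference for the exact syntax before relying on it.

```SQL
-- Illustrative sketch: volume_pipe_on_error is a hypothetical Pipe name;
-- the table, volume, and options are copied from the LIST_PURGE example.
create pipe volume_pipe_on_error
  VIRTUAL_CLUSTER = 'DEFAULT'
  INGEST_MODE = 'LIST_PURGE'
as
copy into pipe_purge_mode from volume pipe_volume(id int,col string)
using csv OPTIONS(
  'header'='false'
)
purge=true
-- Assumption: ON_ERROR is written after the other COPY parameters.
on_error=continue
;
```

Choosing `CONTINUE` here favors completing the load over strict validation; `ABORT` would be the stricter alternative described above.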
|
|
63
66
|
|
|
64
|
-
* `COPY_JOB_HINT`: Optional, reserved parameter for Lakehouse.
|
|
65
|
-
|
|
66
|
-
* `copy_statement`: `<copy_statement>` supports all file parameters. When the `ON_ERROR=CONTINUE|ABORT` parameter is set, it controls how errors are handled during data loading, and the list of imported files is returned:
|
|
67
|
-
* `CONTINUE`: Skips error rows and continues loading subsequent data. Suitable for scenarios that tolerate partial errors and require maximum data loading completion. Currently, ignorable errors are limited to file format mismatches — for example, the command specifies zip compression but the file uses zstd compression.
|
|
68
|
-
* `ABORT`: Immediately terminates the entire `COPY` operation. Suitable for scenarios with strict data quality requirements where any error requires manual inspection.
|
|
69
67
|
|
|
70
68
|
## Supported File Formats
|
|
71
69
|
|
|
72
70
|
Refer to [COPY INTO import](copy-into-table.md).
|
|
73
71
|
|
|
74
|
-
##
|
|
72
|
+
## PIPE Load Examples
|
|
75
73
|
|
|
76
74
|
### Using Scan File Mode (LIST\_PURGE)
|
|
77
75
|
|
|
@@ -100,7 +98,7 @@ CREATE EXTERNAL VOLUME pipe_volume
|
|
|
100
98
|
|
|
101
99
|
```
|
|
102
100
|
|
|
103
|
-
Step 2: Run the
|
|
101
|
+
Step 2: Run the COPY command standalone to verify it imports successfully
|
|
104
102
|
|
|
105
103
|
```SQL
|
|
106
104
|
copy into pipe_purge_mode from volume pipe_volume(id int,col string)
|
|
@@ -113,26 +111,26 @@ Step 3: Use the above statement to build the Pipe object
|
|
|
113
111
|
|
|
114
112
|
```SQL
|
|
115
113
|
create pipe volume_pipe_list_purge
|
|
116
|
-
VIRTUAL_CLUSTER = '
|
|
117
|
-
--
|
|
114
|
+
VIRTUAL_CLUSTER = 'DEFAULT'
|
|
115
|
+
-- Use scan file mode to get the latest files
|
|
118
116
|
INGEST_MODE = 'LIST_PURGE'
|
|
119
117
|
as
|
|
120
118
|
copy into pipe_purge_mode from volume pipe_volume(id int,col string)
|
|
121
119
|
using csv OPTIONS(
|
|
122
120
|
'header'='false'
|
|
123
121
|
)
|
|
124
|
-
--Must add purge parameter to delete data after successful import
|
|
122
|
+
-- Must add purge parameter to delete data after successful import
|
|
125
123
|
purge=true
|
|
126
124
|
;
|
|
127
125
|
```
|
|
128
126
|
|
|
129
127
|
Step 4: View Pipe execution history and imported files
|
|
130
128
|
|
|
131
|
-
* View the execution status of Pipe
|
|
129
|
+
* View the execution status of Pipe COPY jobs
|
|
132
130
|
|
|
133
|
-
Filter by `query_tag` in the job history. All
|
|
131
|
+
Filter by `query_tag` in the job history. All COPY jobs executed by the Pipe are tagged with the format: `pipe.workspace_name.schema_name.pipe_name`
|
|
134
132
|
|
|
135
|
-
* View the history of files imported by
|
|
133
|
+
* View the history of files imported by COPY jobs
|
|
136
134
|
|
|
137
135
|
```SQL
|
|
138
136
|
select * from load_history('schema_name.table_name');
|
|
@@ -143,7 +141,7 @@ select * from load_history('schema_name.table_name');
|
|
|
143
141
|
Step 1: Enable Alibaba Cloud Message Service (MNS)
|
|
144
142
|
|
|
145
143
|
1. Enable Message Service MNS in the Alibaba Cloud console.
|
|
146
|
-
2. Configure MNS to listen to the OSS folder you want to synchronize. [See the documentation
|
|
144
|
+
2. Configure MNS to listen to the OSS folder you want to synchronize. [See the documentation](https://help.aliyun.com/zh/mns/user-guide/create-a-rule-to-generate-oss-event-notifications?spm=a2c4g.11186623.0.i14)
|
|
147
145
|
|
|
148
146
|
Step 2: Authorize Lakehouse to read OSS
|
|
149
147
|
|
|
@@ -164,14 +162,14 @@ CREATE STORAGE CONNECTION my_connection_exnet_role
|
|
|
164
162
|
TYPE oss
|
|
165
163
|
REGION = 'cn-hangzhou' -- Select according to the region where OSS is located
|
|
166
164
|
ROLE_ARN = 'acs:ram::...:role/czudfrole' -- Replace with your Role ARN
|
|
167
|
-
ENDPOINT = 'oss-cn-hangzhou.aliyuncs.com'; -- Select Endpoint
|
|
165
|
+
ENDPOINT = 'oss-cn-hangzhou.aliyuncs.com'; -- Select Endpoint based on the OSS region
|
|
168
166
|
```
|
|
169
167
|
|
|
170
168
|
Step 5: Create a Volume
|
|
171
169
|
|
|
172
170
|
```SQL
|
|
173
171
|
CREATE EXTERNAL VOLUME my_volume_exnet_role
|
|
174
|
-
LOCATION 'oss://function-compute-my1/autoloader' -- Replace with
|
|
172
|
+
LOCATION 'oss://function-compute-my1/autoloader' -- Replace with your OSS Bucket path
|
|
175
173
|
USING connection my_connection_exnet_role
|
|
176
174
|
DIRECTORY = (
|
|
177
175
|
enable = TRUE,
|
|
@@ -197,7 +195,8 @@ COPY INTO pipe_log_json FROM (
|
|
|
197
195
|
|
|
198
196
|
## Status Monitoring and Management
|
|
199
197
|
|
|
200
|
-
###
|
|
198
|
+
### View Pipe Status
|
|
199
|
+
|
|
201
200
|
|
|
202
201
|
````
|
|
203
202
|
DESC PIPE EXTENDED kafka_pipe_stream
|
|
@@ -217,29 +216,27 @@ DESC PIPE EXTENDED kafka_pipe_stream
|
|
|
217
216
|
| invalid_reason | |
|
|
218
217
|
| pipe_latency | {"kafka":{"lags":{"0":0,"1":0,"2":0,"3":0},"lastConsumeTimestamp":-1,"offsetLag":0,"timeLag":-1}} |
|
|
219
218
|
+--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|
|
220
|
-
````
|
|
221
219
|
|
|
222
|
-
|
|
220
|
+
````
|
|
223
221
|
|
|
224
|
-
|
|
222
|
+
### View Pipe Execution History
|
|
225
223
|
|
|
226
|
-
|
|
224
|
+
Each Pipe execution issues a COPY job, so every operation appears in the job history. Filter by `query_tag` in the [job history](<web-job-history.md>). All COPY jobs executed by a Pipe are tagged with the format `pipe.workspace_name.schema_name.pipe_name` for easy tracking.
|
|
227
225
|
|
|
228
|
-
|
|
226
|
+
### Stop and Start a Pipe
|
|
229
227
|
|
|
228
|
+
- Pause a Pipe:
|
|
230
229
|
```
|
|
231
230
|
ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = true;
|
|
232
231
|
```
|
|
233
|
-
|
|
234
|
-
* Resume a Pipe
|
|
235
|
-
|
|
232
|
+
- Resume a Pipe:
|
|
236
233
|
```
|
|
237
234
|
ALTER PIPE pipe_name SET PIPE_EXECUTION_PAUSED = false;
|
|
238
235
|
```
|
|
239
236
|
|
|
240
|
-
###
|
|
237
|
+
### Modify Pipe Properties
|
|
241
238
|
|
|
242
|
-
You can modify
|
|
239
|
+
You can modify Pipe properties one at a time. If multiple properties need to be changed, run the `ALTER` command multiple times. Below are the modifiable properties and their syntax:
|
|
243
240
|
|
|
244
241
|
```SQL
|
|
245
242
|
ALTER PIPE pipe_name SET
|
|
@@ -255,9 +252,7 @@ Examples:
|
|
|
255
252
|
|
|
256
253
|
```
|
|
257
254
|
-- Change the compute cluster
|
|
258
|
-
ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = '
|
|
255
|
+
ALTER PIPE pipe_name SET VIRTUAL_CLUSTER = 'DEFAULT'
|
|
259
256
|
-- Set COPY_JOB_HINT
|
|
260
257
|
ALTER PIPE pipe_name SET COPY_JOB_HINT='{"cz.mapper.kafka.message.size": "2000000"}'
|
|
261
258
|
```
|
|
262
|
-
|
|
263
|
-
^
|
|
@@ -1,6 +1,5 @@
|
|
|
1
1
|
# Using Pipe for Continuous Data Import
|
|
2
2
|
|
|
3
|
-
> \[Preview Release] This feature is currently in public preview.
|
|
4
3
|
|
|
5
4
|
## Overview
|
|
6
5
|
|
|
@@ -79,13 +78,24 @@ Parameter Description:
|
|
|
79
78
|
|
|
80
79
|
Currently, you can use SQL commands to view the Pipe list and object details.
|
|
81
80
|
|
|
82
|
-
* Use the SHOW PIPES command to view the
|
|
81
|
+
* Use the `SHOW PIPES` command to view the Pipe object list
|
|
83
82
|
|
|
84
|
-
```
|
|
83
|
+
```
|
|
84
|
+
-- List all Pipes in the current Schema
|
|
85
85
|
SHOW PIPES;
|
|
86
|
+
-- List all Pipes in a specified Schema
|
|
87
|
+
SHOW PIPES IN SCHEMA schema_name;
|
|
88
|
+
-- List all Pipes in a specified Workspace
|
|
89
|
+
SHOW PIPES IN WORKSPACE workspace_name;
|
|
86
90
|
```
|
|
87
91
|
|
|
88
|
-
|
|
92
|
+
**Notes**
|
|
93
|
+
|
|
94
|
+
* `SHOW PIPES`: Lists Pipe objects in the current Schema by default.
|
|
95
|
+
* `SHOW PIPES IN SCHEMA schema_name`: Lists all Pipe objects in the specified Schema.
|
|
96
|
+
* `SHOW PIPES IN WORKSPACE workspace_name`: Lists all Pipe objects in the specified Workspace.
|
|
97
|
+
|
|
98
|
+
* Use the `DESC PIPE` command to view detailed information about the specified Pipe object
|
|
89
99
|
|
|
90
100
|
```SQL
|
|
91
101
|
DESC PIPE <name>;
|