@clickzetta/cz-cli-darwin-x64 0.5.16 → 0.5.17
- package/bin/cz-cli +0 -0
- package/bin/skills/lakehouse-doc-en/SKILL.md +6 -11
- package/bin/skills/lakehouse-doc-en/references/AIGateway.md +58 -13
- package/bin/skills/lakehouse-doc-en/references/Computation.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/DataSource_Amazon_DocumentDB.md +3 -1
- package/bin/skills/lakehouse-doc-en/references/Foreach.md +14 -14
- package/bin/skills/lakehouse-doc-en/references/JDBC-Driver.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/LakehouseAI-overview.md +21 -8
- package/bin/skills/lakehouse-doc-en/references/LakehouseDataGPT-tour.md +4 -9
- package/bin/skills/lakehouse-doc-en/references/LakehouseStudio-tour.md +14 -19
- package/bin/skills/lakehouse-doc-en/references/Lakehouse_Zilliz_MakeDataReadyforBIandAI.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/Logstash.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/Migrate_Spark_DataEngineeringBestPractices_Project_to_Lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/Notebook.md +17 -17
- package/bin/skills/lakehouse-doc-en/references/RemoteFunction-as-udf.md +14 -14
- package/bin/skills/lakehouse-doc-en/references/SQL_External_Catalog_Guide.md +1 -9
- package/bin/skills/lakehouse-doc-en/references/SUMMARY.md +59 -29
- package/bin/skills/lakehouse-doc-en/references/WINDOWFUNCTION.md +99 -57
- package/bin/skills/lakehouse-doc-en/references/Zettapark_Data_Engineering_Demo.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/access-control-configuration.md +1 -8
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-2-5-1.0.md +16 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-29-1.0.2.md +14 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-8-1.0.1.md +16 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-4-28-1.1.md +29 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-12-1.1.1.md +18 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-15-1.2.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-21-1.3.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-28-1.4.md +10 -0
- package/bin/skills/lakehouse-doc-en/references/aigw-2026-6-3-1.5.md +9 -0
- package/bin/skills/lakehouse-doc-en/references/alicloud-arn-externalid.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/answer-accuracy-improve.md +120 -103
- package/bin/skills/lakehouse-doc-en/references/application-list.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/approval-list.md +16 -17
- package/bin/skills/lakehouse-doc-en/references/batch-load-parquet-file-into-lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/batch_sync.md +9 -9
- package/bin/skills/lakehouse-doc-en/references/batch_sync_Sop.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/batchloadparquetfileintoLakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/bulkloadv1-python-sdk.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/chart-auto-refresh-guide.md +12 -6
- package/bin/skills/lakehouse-doc-en/references/clickzetta-sample-data.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/code_approval.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/composite_task.md +31 -42
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_environment_and_data_generate.md +6 -9
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_javasdk_bulkload_realtime.md +4 -10
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_kafka_realtime_sync.md +1 -10
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_local_file_into_table_by_studio.md +0 -6
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_batchload_public_network.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_python_node.md +2 -7
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_realtime_cdc_public_network.md +13 -18
- package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_sql_insert.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/concepts.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/config-datasource.md +5 -7
- package/bin/skills/lakehouse-doc-en/references/connect-with-cli.md +116 -72
- package/bin/skills/lakehouse-doc-en/references/connect-with-cz-cli.md +151 -0
- package/bin/skills/lakehouse-doc-en/references/continue-job.md +9 -17
- package/bin/skills/lakehouse-doc-en/references/create-api-connection.md +315 -286
- package/bin/skills/lakehouse-doc-en/references/create-catalog-connection.md +1 -0
- package/bin/skills/lakehouse-doc-en/references/create-dynamic-table.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/create-external-catalog.md +85 -22
- package/bin/skills/lakehouse-doc-en/references/create-table-ddl.md +45 -0
- package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkendpoint.md +4 -6
- package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkservice.md +4 -7
- package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkendpoint.md +2 -7
- package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkservice.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/cz-cli-agent.md +15 -10
- package/bin/skills/lakehouse-doc-en/references/cz-cli-datasource.md +0 -8
- package/bin/skills/lakehouse-doc-en/references/cz-cli-sql.md +2 -45
- package/bin/skills/lakehouse-doc-en/references/cz-cli.md +53 -42
- package/bin/skills/lakehouse-doc-en/references/dashboard-version-management-guide.md +12 -4
- package/bin/skills/lakehouse-doc-en/references/data-integration-intro.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/data-integration.md +29 -27
- package/bin/skills/lakehouse-doc-en/references/data-load-summary.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/data-quality.md +25 -25
- package/bin/skills/lakehouse-doc-en/references/data-sharing.md +31 -54
- package/bin/skills/lakehouse-doc-en/references/data-sources.md +45 -45
- package/bin/skills/lakehouse-doc-en/references/data_catalog.md +23 -25
- package/bin/skills/lakehouse-doc-en/references/data_privacy.md +5 -2
- package/bin/skills/lakehouse-doc-en/references/data_sharing_between_accounts_guide.md +0 -4
- package/bin/skills/lakehouse-doc-en/references/data_visualization.md +4 -15
- package/bin/skills/lakehouse-doc-en/references/dataagent.md +39 -7
- package/bin/skills/lakehouse-doc-en/references/databricks-delta-to-lakehouse-migration.md +168 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-dlt-to-lakehouse-migration.md +331 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-external-catalog-practice.md +367 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-jobs-to-studio-migration.md +199 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-notebook-to-studio-migration.md +350 -0
- package/bin/skills/lakehouse-doc-en/references/databricks-uc-governance-to-lakehouse-migration.md +327 -0
- package/bin/skills/lakehouse-doc-en/references/datagpt-model-config.md +34 -0
- package/bin/skills/lakehouse-doc-en/references/datagpt_data_source.md +50 -37
- package/bin/skills/lakehouse-doc-en/references/datagpt_introduction.md +55 -79
- package/bin/skills/lakehouse-doc-en/references/datagpt_quickstart.md +50 -64
- package/bin/skills/lakehouse-doc-en/references/datalake-acceleration.md +75 -2
- package/bin/skills/lakehouse-doc-en/references/dbt-databricks-to-clickzetta-migration.md +242 -0
- package/bin/skills/lakehouse-doc-en/references/dynamic-mask.md +30 -30
- package/bin/skills/lakehouse-doc-en/references/dynamic-table-bestpractice.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/dynamic-table-introduce.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/dynamic_table_summary.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/eco_integration/streamlit.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/eco_integration/superset.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/ecosystem-all.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/ecosystem.md +145 -0
- package/bin/skills/lakehouse-doc-en/references/external-catalog-summary.md +33 -38
- package/bin/skills/lakehouse-doc-en/references/external-function-combo-practice.md +466 -0
- package/bin/skills/lakehouse-doc-en/references/f6fc6447ee.md +7 -9
- package/bin/skills/lakehouse-doc-en/references/federation-query.md +56 -6
- package/bin/skills/lakehouse-doc-en/references/finebi-mysql.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/get-started-with-sample-data.md +10 -11
- package/bin/skills/lakehouse-doc-en/references/gitfolder.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/grant-privileges.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/iceberg-rest-catalog-databricks.md +166 -0
- package/bin/skills/lakehouse-doc-en/references/ide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/if_else_task.md +59 -57
- package/bin/skills/lakehouse-doc-en/references/input_output.md +10 -7
- package/bin/skills/lakehouse-doc-en/references/jobprofile-bestpractices.md +60 -64
- package/bin/skills/lakehouse-doc-en/references/kafka-connection.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/key-concepts.md +146 -117
- package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-gateway-cz-cli.md +317 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-sql-analysis.md +345 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-dqc-guide.md +300 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-medallion-sql-dt-guide.md +543 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-multi-cloud-acceleration.md +274 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-multimodal-ai-pipeline.md +198 -0
- package/bin/skills/lakehouse-doc-en/references/lakehouse-quick-experience_guide.md +49 -52
- package/bin/skills/lakehouse-doc-en/references/lakehouse-volume-pipe-acceleration-guide.md +380 -0
- package/bin/skills/lakehouse-doc-en/references/langchain-plug-installation.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/management.md +4 -9
- package/bin/skills/lakehouse-doc-en/references/medallion-lakehouse-from-scratch.md +2 -1
- package/bin/skills/lakehouse-doc-en/references/metrics_answer_build.md +58 -21
- package/bin/skills/lakehouse-doc-en/references/migrate-spark-data-engineering-best-practices-to-lakehouse.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/mindsdb.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/monitoring_and_alerting.md +65 -60
- package/bin/skills/lakehouse-doc-en/references/monitoring_item_specification.md +33 -33
- package/bin/skills/lakehouse-doc-en/references/multitable_batch_sync.md +16 -16
- package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync.md +65 -72
- package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync_sop.md +54 -52
- package/bin/skills/lakehouse-doc-en/references/navicat-mysql.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/om-dynamic-table.md +71 -66
- package/bin/skills/lakehouse-doc-en/references/om-vcluster.md +2 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-create-session.md +79 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-generate-auth-token.md +63 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-overview.md +96 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-quick-start.md +286 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-response-guide.md +264 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-safe-question-poll.md +201 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-query.md +99 -0
- package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-stop.md +74 -0
- package/bin/skills/lakehouse-doc-en/references/overview.md +6 -7
- package/bin/skills/lakehouse-doc-en/references/permission-application.md +5 -5
- package/bin/skills/lakehouse-doc-en/references/pipe-introduction.md +1 -0
- package/bin/skills/lakehouse-doc-en/references/pipe-kafka-table-stream.md +72 -70
- package/bin/skills/lakehouse-doc-en/references/pipe-kafka.md +105 -110
- package/bin/skills/lakehouse-doc-en/references/pipe-overview.md +40 -40
- package/bin/skills/lakehouse-doc-en/references/pipe-storage-object.md +43 -48
- package/bin/skills/lakehouse-doc-en/references/pipe-summary.md +14 -4
- package/bin/skills/lakehouse-doc-en/references/pipe-syntax.md +58 -151
- package/bin/skills/lakehouse-doc-en/references/practice_python_task.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/pricing-ai-gateway.md +181 -0
- package/bin/skills/lakehouse-doc-en/references/pricing-lakehouse.md +316 -0
- package/bin/skills/lakehouse-doc-en/references/pricing.md +44 -288
- package/bin/skills/lakehouse-doc-en/references/private-link-general.md +0 -2
- package/bin/skills/lakehouse-doc-en/references/pyspark-to-zettapark-migration-f1.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python-igs.md +7 -3
- package/bin/skills/lakehouse-doc-en/references/python-sample-put-github-rt-events.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python-task.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector_advanced.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/python_reference/connector_examples.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/python_sdk_guide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/python_shell_datasource.md +11 -9
- package/bin/skills/lakehouse-doc-en/references/quick_start_batch_sync_data.md +9 -18
- package/bin/skills/lakehouse-doc-en/references/quick_start_bi_analysis.md +8 -25
- package/bin/skills/lakehouse-doc-en/references/quick_start_create_workspace.md +4 -6
- package/bin/skills/lakehouse-doc-en/references/quick_start_data_quality.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quick_start_etl.md +16 -20
- package/bin/skills/lakehouse-doc-en/references/quick_start_monitoring_and_alerting.md +10 -18
- package/bin/skills/lakehouse-doc-en/references/quick_start_sql_query.md +7 -10
- package/bin/skills/lakehouse-doc-en/references/quick_start_upload_data.md +5 -7
- package/bin/skills/lakehouse-doc-en/references/quick_start_user_management.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quick_start_workspace.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/quick_start_workspace_user.md +8 -8
- package/bin/skills/lakehouse-doc-en/references/quickstart.md +69 -56
- package/bin/skills/lakehouse-doc-en/references/quickstart_datashare_between_companies.md +0 -5
- package/bin/skills/lakehouse-doc-en/references/quickstart_envirment_for_team.md +0 -24
- package/bin/skills/lakehouse-doc-en/references/realtime-pipeline-selection-guide.md +1 -2
- package/bin/skills/lakehouse-doc-en/references/realtime-sales-dashboard-with-dynamic-table.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/realtime_sync.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/release-note-2026-05-19.md +5 -3
- package/bin/skills/lakehouse-doc-en/references/revoke-privileges.md +3 -1
- package/bin/skills/lakehouse-doc-en/references/roles.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/row-filter.md +165 -0
- package/bin/skills/lakehouse-doc-en/references/row_level_permission.md +30 -19
- package/bin/skills/lakehouse-doc-en/references/scheduled_task.md +28 -21
- package/bin/skills/lakehouse-doc-en/references/security_overview.md +99 -21
- package/bin/skills/lakehouse-doc-en/references/set-command.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/setup.md +13 -15
- package/bin/skills/lakehouse-doc-en/references/show-grants.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/snowflake-dynamic-tables-to-lakehouse.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/spark-connector-summary.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/sql_functions/context_functions/current_vcluster.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/sso-configuration.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/streaming_pipeline_with_dynamic_table.md +0 -1
- package/bin/skills/lakehouse-doc-en/references/studio-incremental-sync-practice.md +27 -23
- package/bin/skills/lakehouse-doc-en/references/studio-shell-task.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/supported-cloud-platforms.md +32 -0
- package/bin/skills/lakehouse-doc-en/references/table_rendering.md +18 -12
- package/bin/skills/lakehouse-doc-en/references/task-develop.md +89 -91
- package/bin/skills/lakehouse-doc-en/references/task_development.md +19 -17
- package/bin/skills/lakehouse-doc-en/references/task_group.md +16 -14
- package/bin/skills/lakehouse-doc-en/references/task_instance.md +21 -21
- package/bin/skills/lakehouse-doc-en/references/task_param.md +38 -35
- package/bin/skills/lakehouse-doc-en/references/task_param_reference.md +81 -79
- package/bin/skills/lakehouse-doc-en/references/task_scheduling_dependency.md +20 -21
- package/bin/skills/lakehouse-doc-en/references/tencentcloud_arn_and_externalid.md +1 -5
- package/bin/skills/lakehouse-doc-en/references/trial-account-quotas-and-limits.md +1 -3
- package/bin/skills/lakehouse-doc-en/references/tutorial_connect_to_lakehouse.md +69 -0
- package/bin/skills/lakehouse-doc-en/references/tutorials.md +4 -1
- package/bin/skills/lakehouse-doc-en/references/unique-key.md +167 -0
- package/bin/skills/lakehouse-doc-en/references/usageandbillingview.md +138 -0
- package/bin/skills/lakehouse-doc-en/references/use-dbt-dev.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/use-java-sdk-realtime-uploaddata.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/use-java-sdk-upload-data-local.md +3 -3
- package/bin/skills/lakehouse-doc-en/references/use-models.md +128 -0
- package/bin/skills/lakehouse-doc-en/references/use-mysql-client.md +81 -81
- package/bin/skills/lakehouse-doc-en/references/use-python-sdk-upload-data.md +10 -12
- package/bin/skills/lakehouse-doc-en/references/user-identification.md +2 -3
- package/bin/skills/lakehouse-doc-en/references/user_permission_grand_guide.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/using-udf-in-dynamic-table.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/vc_cache.md +18 -22
- package/bin/skills/lakehouse-doc-en/references/vcluster_size_description.md +33 -31
- package/bin/skills/lakehouse-doc-en/references/virtual-cluster.md +43 -45
- package/bin/skills/lakehouse-doc-en/references/web-job-history.md +94 -108
- package/bin/skills/lakehouse-doc-en/references/web_search.md +16 -7
- package/bin/skills/lakehouse-doc-en/references/zettapark-data-engineering-demo.md +1 -1
- package/bin/skills/lakehouse-doc-en/references/zettapark-dataframe-guide.md +144 -70
- package/bin/skills/lakehouse-doc-en/references/zettapark-dynamic-table-guide.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-etl-guide.md +73 -33
- package/bin/skills/lakehouse-doc-en/references/zettapark-feature-engineering.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-functions-guide.md +75 -46
- package/bin/skills/lakehouse-doc-en/references/zettapark-quick-start.md +2 -2
- package/bin/skills/lakehouse-doc-en/references/zettapark-stream-guide.md +4 -4
- package/bin/skills/lakehouse-doc-en/references/zettapark-volume-guide.md +93 -29
- package/package.json +1 -1
- package/bin/skills/lakehouse-doc-en/references/CLAUDE.md +0 -606
- package/bin/skills/lakehouse-doc-en/references/modelprice.md +0 -155
@@ -9,30 +9,30 @@ You can create a new sync task from both the workspace and the development entry
 * Create from Workspace
 Through the "Workspace" entry in the console, select "New Batch Sync" or "Real-time Sync Task" under the "New" button on the right.
 
-![](.topwrite/assets/image_1736230803470.png)
+:-: ![](.topwrite/assets/image_1736230803470.png)
 
 * Create from Development Entry
 You can enter the "Development" page and select to create a new sync task in the specified directory in the task area.
 
-![](.topwrite/assets/image_1736231094977.png)
+:-: ![](.topwrite/assets/image_1736231094977.png)
 
 ## Batch Sync Task Development
 
 **Step 1: Create a New Batch Data** **Sync** **Task**
 Create an Batch sync task with a specified name in the specified task save location.
 
-![](.topwrite/assets/image_1736231135915.png)
+:-: ![](.topwrite/assets/image_1736231135915.png)
 
 The system will generate the sync task and open the data sync task editor in the right area for user editing:
 
-![](.topwrite/assets/image_1736231158846.png)
+:-: ![](.topwrite/assets/image_1736231158846.png)
 
 **Step 2: Define the Sync Task**
 
 * Select Source and Target Data Sources and Data Objects
 On the data source side, select an existing data source or create a new data source as the data source and specify the data object to be synchronized. On the data target side, select an existing data source or create a new data source as the target data. The write object of the target data supports specifying the data object or quickly creating it based on the source object.
 
-![](.topwrite/assets/image_1736231270912.png)
+:-: ![](.topwrite/assets/image_1736231270912.png)
 
 After determining the source object and target object, the data sync task will generate a field mapping between the source object and the target object. By default, the same row mapping rule is used, and the mapping between fields can be adjusted by dragging. It supports adding constant fields as source table fields for mapping and writing.
 
@@ -40,7 +40,7 @@ The system will generate the sync task and open the data sync task editor in the
 * Task concurrency, can be set to a minimum of 1 and a maximum of 10
 * Task sync rate, can be set to a minimum of 1MB/S, with no maximum limit
 
-![](.topwrite/assets/image_1736232201040.png)
+:-: ![](.topwrite/assets/image_1736232201040.png)
 
 * Advanced Configuration of the Task
 The advanced configuration area usually does not need to be configured and can be left blank. You can also expand to set advanced parameters for the task, such as adjusting the memory specifications used by the task. The supported parameters are as follows. For specific settings, please contact technical support.
@@ -50,13 +50,13 @@ The system will generate the sync task and open the data sync task editor in the
 **Step 3: Test the Sync Task**
 Click "Run" on the development task interface to test the sync task. Observe the task execution status and logs, query the data changes in the target table, and verify whether the sync task is executed correctly.
 
-![](.topwrite/assets/image_1736232427656.png)
+:-: ![](.topwrite/assets/image_1736232427656.png)
 
 **Step 4: Set Scheduling and Deploy to Production**
 After the scheduling configuration is successful, you can click the "Submit" button of the task to deploy it to the scheduling system for periodic execution.
 
-![](.topwrite/assets/image_1736232471942.png)
+:-: ![](.topwrite/assets/image_1736232471942.png)
 
 View and maintain the published sync tasks in the Operations Center
 
-![](.topwrite/assets/image_1736232530893.png)
+:-: ![](.topwrite/assets/image_1736232530893.png)
@@ -12,7 +12,7 @@ Answer: The cost of data sync tasks is generally composed of two categories: har
 
 ### Question: What data sources are currently supported for offline sync?
 
-Answer: On the task configuration page, when selecting the source and target, all supported data source types are listed in full. If no data source is available, click the
+Answer: On the task configuration page, when selecting the source and target, all supported data source types are listed in full. If no data source is available, click the + button to create a new one first, then use it. Offline sync data sources can be freely combined in pairs to build a rich variety of sync links. See: [Data Source Management](config-datasource.md)
 
 ^
 
@@ -82,7 +82,7 @@ Solution: Typically, the following solutions are available:
 
 * Refer to the heap memory overflow solution and adjust the `taskmanager.memory.process.size` parameter.
 * Separately adjust the task off-heap memory size by adjusting the `taskmanager.memory.task.off-heap.size` parameter, e.g., 256m or 512m.
-* If the data source supports setting batch size, reduce the configured value appropriately. **However, note that this may lead to reduced sync efficiency
+* If the data source supports setting batch size, reduce the configured value appropriately. **However, note that this may lead to reduced sync efficiency**.
 
 ### Question: How to resolve the error `CZLH-67000:Out of Memory undefined: could not allocate block of size 262KB (1.0GB/1.0GB used)`?
 
@@ -12,7 +12,7 @@ This guide will help you import large amounts of data from public URL Parquet fi
 
 Script download address: <https://github.com/yunqiqiliang/nyc-taxi-data-clickzetta>
 
-## 1. Install [Singdata SQLLine](
+## 1. Install [Singdata SQLLine](connect-with-cli.md)
 
 ## 2. Install [R](https://www.r-project.org/)
 
@@ -20,7 +20,7 @@ pip uninstall clickzetta-connector clickzetta-connector-python clickzetta-sqlalc
 pip show clickzetta-connector clickzetta-sqlalchemy clickzetta-ingestion-python clickzetta-ingestion-python-v2 clickzetta-connector-python
 ```
 
-Install the latest version (requires Python >= 3.
+Install the latest version (requires Python >= 3.10):
 
 ```bash
 pip install clickzetta-connector -U -i https://pypi.org/simple/
@@ -68,7 +68,7 @@ conn = connect(
 instance='your_instance',
 workspace='your_workspace',
 schema='public',
-vcluster='
+vcluster='DEFAULT'
 )
 
 bulkload_stream = conn.create_bulkload_stream(schema='public', table='bulkload_test')
@@ -96,7 +96,7 @@ bulkload_stream.commit()
 instance='your_instance',
 workspace='your_workspace',
 schema='public',
-vcluster='
+vcluster='DEFAULT'
 )
 ```
 
@@ -1,5 +1,3 @@
-^
-
 # Chart Auto-Refresh Settings
 
 ## Feature Overview
@@ -16,10 +14,10 @@ Charts in dashboards support setting an auto-refresh interval. The system will a
 
 ## How to Use
 
-| Step
-|
-| Select chart settings | Click "Dashboard" and select the chart you want to configure
-| Set refresh interval
+| Step | Description | Screenshot |
+| --------------------- | ------------------------------------------------------------------ | --------------------------------------------------- |
+| Select chart settings | Click "Dashboard" and select the chart you want to configure | ![](.topwrite/assets/img-mmgotkt0-rrjtpwvc76a.png) |
+| Set refresh interval | Default refresh every 24 hours; adjustable based on business needs | ![](.topwrite/assets/img-mmgotkt0-k9u6pzjaj7m.png) |
 
 ## **Notes**
 
@@ -28,3 +26,11 @@ Charts in dashboards support setting an auto-refresh interval. The system will a
 2\. During refresh, the system re-executes the query corresponding to that chart to retrieve the latest data
 
 3\. It is recommended to set refresh intervals reasonably based on data update frequency, avoiding excessively frequent refreshes that cause unnecessary resource consumption
+
+## Related Documentation
+
+* [Scheduled Tasks](scheduled_task.md) — Automatically execute analysis on a schedule and push results
+* [Dashboard Version Management](dashboard-version-management-guide.md) — Manage multi-version history of dashboards
+* [Conversational Data Analytics (Analytics Agent)](datagpt_introduction.md) — Return to feature overview
+
+^
@@ -213,7 +213,7 @@ ORDER BY trips DESC;
 | `date_processed` | timestamp_ltz | Vector processing timestamp |
 
 **Use cases**:
-- Experience vector similarity retrieval (the
+- Experience vector similarity retrieval (the `cosine_distance` function)
 - Build a RAG (Retrieval-Augmented Generation) Q&A system based on product documentation
 - Learn how to use the `AI_EMBEDDING` function together with vector indexes
 
@@ -224,7 +224,7 @@ SELECT
 filename,
 type,
 text,
-embeddings
+cosine_distance(embeddings, AI_EMBEDDING('ai_gateway_conn:text-embedding-v4', 'What is a dynamic table')) AS distance
 FROM clickzetta_sample_data.clickzetta_doc_kb.dashscope_clickzetta_elements
 ORDER BY distance ASC
 LIMIT 5;
@@ -236,4 +236,4 @@ LIMIT 5;
 - [TPC-H Performance Benchmark](tpch-benchmark.md)
 - [Table Stream](om-table-stream.md)
 - [Vector Index](om-inverted-index.md)
-- [AI_EMBEDDING Function](ai_embedding.md)
+- [AI_EMBEDDING Function](sql_functions/ai_embedding.md)
@@ -2,7 +2,7 @@ When configuring a workspace, you can enable the "Mandatory Code Review" feature
 
 ## Prerequisites
 
-Only users with the workspace administrator (
+Only users with the workspace administrator (workspace\_admin) role can enable or disable the mandatory code review process.
 
 ## Steps
 
@@ -10,7 +10,6 @@ Only users with the workspace administrator (workspace_admin) role can enable or
 
 * Enable the code review process when creating a new workspace.
 * For existing workspaces -> Click to enter the details page -> Edit, and enable the code review feature.
-![Image](.topwrite/assets/image_1774337806211.png)
 
 **2. Configure the Code Review Approval Flow**
 
@@ -19,16 +18,13 @@ Click Approval -> Approval Flow -> Select the code review flow for the target wo
 Under "Approval", configure the approval roles/users for the code review flow. These roles/users will be the personnel who need to participate in code review when code is submitted in the current workspace.
 
 > Only roles with development permissions can perform approvals
-![Image](.topwrite/assets/image_1774337834195.png)
 
 **3. Submit Code for Review**
 
 After enabling mandatory code review, any submission action will trigger a "Code Review" approval ticket. Once the approver approves, the code will be published to the production environment. If the approver rejects it, you need to modify and resubmit.
-![Image](.topwrite/assets/image_1774337865226.png)
 
 **4. Approval**
 
 After code submission, the approver needs to review it.
 
 Click Approval -> Under the Approval tab, find the target task and perform the relevant actions.
-![Image](.topwrite/assets/image_1774337892273.png)
@@ -18,15 +18,9 @@ You can create a composite task through any of the following paths:
 
 * Task Development: Task tab → New Task → Select "Composite Task"
 
-![](.topwrite/assets/image_1778640792905.png)
-
 * Task Group Details page: Inside a task group → Add New Task → Select "Composite Task"
-
-![](.topwrite/assets/image_1778640810865.png)
 
 * Workspace: Top navigation bar → New Task → Select "Composite Task"
-
-![](.topwrite/assets/image_1778640824965.png)
 
 ### Basic Information
 
@@ -36,20 +30,19 @@ In the composite task creation dialog, fill in the required fields — same as f
|
|
|
36
30
|
* Folder: The folder to place the task in. Required.
|
|
37
31
|
* Task Group: Optionally assign the task to a task group by selecting a specific task group name.
|
|
38
32
|
|
|
39
|
-
|
|
33
|
+
^
|
|
40
34
|
|
|
41
35
|
## Core Feature Guide
|
|
42
36
|
|
|
43
37
|
### Subtask Management (Canvas Mode)
|
|
44
38
|
|
|
45
39
|
Composite tasks use a **canvas** (DAG diagram) to display subtask nodes and support the following operations:
|
|
46
|
-

|
|
47
40
|
|
|
48
41
|
#### Adding Subtasks
|
|
49
42
|
|
|
50
43
|
* **Entry point**: Canvas toolbar → "New Subtask" → Select a task type (only periodic tasks are supported: offline synchronization, SQL, etc.; **real-time tasks are not supported**). You can add a subtask by clicking the task type or dragging it onto the canvas.
|
|
51
44
|
|
|
52
|
-
|
|
45
|
+
^
|
|
53
46
|
|
|
54
47
|
* **Default state**: Newly added subtasks have no dependencies by default. You need to configure dependencies manually — either by drawing connections on the canvas or through the subtask detail configuration.
|
|
55
48
|
|
|
@@ -75,18 +68,16 @@ Composite tasks use a **canvas** (DAG diagram) to display subtask nodes and supp
|
|
|
75
68
|
|
|
76
69
|
The scheduling strategy for a composite task is managed through a combination of **global configuration** (at the composite task level) and **local configuration** (at the subtask level). Subtasks inherit the composite task's global configuration by default, but can also be configured individually. When a local configuration exists, it takes precedence. Key configuration items:
|
|
77
70
|
|
|
78
|
-
| Configuration Item | Global Configuration (Composite Task)
|
|
79
|
-
| ------------------ |
|
|
80
|
-
| Scheduling Time | Required; supports Cron expressions. Subtasks cannot set this independently. | None. Subtasks follow the composite task's global scheduling time and run at the same frequency.
|
|
81
|
-
| Instance Rerun | Global setting (rerun count, interval). Subtasks inherit by default.
|
|
82
|
-
| Task Priority | Global setting (default value).
|
|
83
|
-
| Self-dependency | Global setting (controls composite task cycle dependency).
|
|
71
|
+
| Configuration Item | Global Configuration (Composite Task) | Local Configuration (Subtask) |
|
|
72
|
+
| ------------------ | ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
|
|
73
|
+
| Scheduling Time | Required; supports Cron expressions. Subtasks cannot set this independently. | None. Subtasks follow the composite task's global scheduling time and run at the same frequency. |
|
|
74
|
+
| Instance Rerun | Global setting (rerun count, interval). Subtasks inherit by default. | Optional override of global settings (only supported for specific task types such as SQL). |
|
|
75
|
+
| Task Priority | Global setting (default value). | Optional individual setting (overrides global; only supported for SQL nodes). |
|
|
76
|
+
| Self-dependency | Global setting (controls composite task cycle dependency). | None. Subtask self-dependency is indirectly achieved through the composite task's global self-dependency. |
|
|
84
77
|
|
|
85
78
|
**Example**: If the composite task global setting is "3 reruns with a 5-minute interval," and a subtask is individually configured with "5 reruns," that subtask will use "5 reruns with a 5-minute interval."
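The precedence rule in this example can be sketched as a per-field merge: any field the subtask sets wins, and any field it leaves unset falls back to the composite task's global value. This is a minimal illustration only; `RerunPolicy` and `effective_policy` are hypothetical names, not part of the product's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RerunPolicy:
    """Hypothetical container for the "Instance Rerun" settings."""
    count: Optional[int] = None             # number of reruns; None means "not set"
    interval_minutes: Optional[int] = None  # wait between reruns; None means "not set"


def effective_policy(global_policy: RerunPolicy,
                     local_policy: Optional[RerunPolicy] = None) -> RerunPolicy:
    """Merge field by field: a local value takes precedence, otherwise inherit."""
    if local_policy is None:
        return global_policy
    return RerunPolicy(
        count=(local_policy.count
               if local_policy.count is not None else global_policy.count),
        interval_minutes=(local_policy.interval_minutes
                          if local_policy.interval_minutes is not None
                          else global_policy.interval_minutes),
    )


# Global: 3 reruns, 5-minute interval. Subtask overrides only the count.
print(effective_policy(RerunPolicy(3, 5), RerunPolicy(count=5)))
# RerunPolicy(count=5, interval_minutes=5)
```

Merging per field rather than replacing the whole policy is what lets the subtask keep the global 5-minute interval while changing only the rerun count.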
|
|
86
79
|
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-

|
|
80
|
+
^
|
|
90
81
|
|
|
91
82
|
### Parameter Management (Composite Task → Subtask Propagation)
|
|
92
83
|
|
|
@@ -94,9 +85,7 @@ The scheduling strategy for a composite task is managed through a combination of
|
|
|
94
85
|
|
|
95
86
|
* **Entry point**: Composite task detail page → Click "Parameters" → Add parameters in the dialog (e.g., `composite_task_param`):
|
|
96
87
|
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-

|
|
88
|
+
^
|
|
100
89
|
|
|
101
90
|
* **Scope**: Parameters are only valid within the current composite task. Parameters are isolated between different composite tasks.
|
|
102
91
|
|
|
@@ -104,7 +93,7 @@ The scheduling strategy for a composite task is managed through a combination of
|
|
|
104
93
|
|
|
105
94
|
In a subtask's parameter configuration, if the parameter name matches a parameter defined in the composite task, the system automatically recognizes it and prompts you to select the value source. To use the globally defined composite task parameter, select "Composite Task" as the value source; otherwise, select "Task".
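The value-source choice described above can be sketched as follows. This is an illustration under assumptions, not the platform's implementation: `resolve_parameters` and `render` are hypothetical helpers, the parameter values are made up, and placeholders are assumed to use the `${name}` form, which Python's `string.Template` happens to understand.

```python
from string import Template


def resolve_parameters(composite_params, task_params, value_sources):
    """Pick each parameter's value according to its selected value source.

    value_sources maps a parameter name to "Composite Task" or "Task".
    Names without an entry keep the subtask's own value.
    """
    resolved = dict(task_params)
    for name, source in value_sources.items():
        if source == "Composite Task" and name in composite_params:
            resolved[name] = composite_params[name]
    return resolved


def render(code, params):
    """Replace ${name} placeholders; raises KeyError if a name is missing."""
    return Template(code).substitute(params)


composite_params = {"composite_task_param": "2024-01-01"}   # defined on the composite task
task_params = {"composite_task_param": "1999-12-31"}        # same name defined on the subtask
value_sources = {"composite_task_param": "Composite Task"}  # user's selection

sql = "SELECT * FROM user_log WHERE dt = '${composite_task_param}'"
print(render(sql, resolve_parameters(composite_params, task_params, value_sources)))
# SELECT * FROM user_log WHERE dt = '2024-01-01'
```

Selecting "Task" instead would leave the subtask's own value in place.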
|
|
106
95
|
|
|
107
|
-
|
|
96
|
+
^
|
|
108
97
|
|
|
109
98
|
In subtask code, you can reference parameters using `${parameter_name}`. **Example** (SQL subtask):
|
|
110
99
|
|
|
@@ -124,21 +113,19 @@ SELECT * FROM user_log WHERE dt = '${composite_task_param}'
|
|
|
124
113
|
|
|
125
114
|
* **Entry point**: Composite task detail page → "Submit."
|
|
126
115
|
|
|
127
|
-

|
|
128
|
-
|
|
129
116
|
## Operations and Monitoring
|
|
130
117
|
|
|
131
118
|
### Composite Task Operations
|
|
132
119
|
|
|
133
120
|
Composite tasks appear in the Operations Center alongside other periodic tasks and support similar operations such as pause, data backfill, and offline.
|
|
134
121
|
|
|
135
|
-
|
|
122
|
+
^
|
|
136
123
|
|
|
137
|
-
| Operation
|
|
138
|
-
|
|
|
139
|
-
| Pause/Resume
|
|
140
|
-
| Data Backfill
|
|
141
|
-
| Offline
|
|
124
|
+
| Operation | Description |
|
|
125
|
+
| ------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
126
|
+
| Pause/Resume | After pausing, the composite task and its subtasks stop scheduling. After resuming, execution continues according to the current configuration. |
|
|
127
|
+
| Data Backfill | Select a time range → The composite task generates backfill instances as a whole according to scheduling rules (including all subtask instances). |
|
|
128
|
+
| Offline | After going offline, the task no longer schedules. Check for downstream dependencies — if any exist, you cannot go offline directly and must use the "Offline (including downstream)" feature. |
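The downstream check for the Offline operation can be pictured with a small dependency-graph sketch. It assumes that "Offline (including downstream)" covers transitive downstream tasks, which the table does not state explicitly; the function name and the task names are hypothetical.

```python
from collections import deque


def tasks_to_take_offline(task, downstream, include_downstream=False):
    """Return the tasks that go offline, or raise if downstream tasks block it.

    downstream maps a task name to the list of tasks that depend on it directly.
    """
    if not downstream.get(task):
        return [task]  # nothing depends on this task, so it can go offline alone
    if not include_downstream:
        raise ValueError(
            f"{task} has downstream dependencies; use 'Offline (including downstream)'"
        )
    # Breadth-first walk collecting the task and everything downstream of it.
    ordered, seen, queue = [], {task}, deque([task])
    while queue:
        current = queue.popleft()
        ordered.append(current)
        for child in downstream.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return ordered


downstream = {"composite_a": ["report_b"], "report_b": ["export_c"]}
print(tasks_to_take_offline("export_c", downstream))
# ['export_c']
print(tasks_to_take_offline("composite_a", downstream, include_downstream=True))
# ['composite_a', 'report_b', 'export_c']
```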
|
|
142
129
|
|
|
143
130
|
### Composite Task Instance Operations
|
|
144
131
|
|
|
@@ -157,23 +144,23 @@ In the Instance Operations tab, composite task instances are listed alongside ot
|
|
|
157
144
|
|
|
158
145
|
#### Instance Operations
|
|
159
146
|
|
|
160
|
-
| Operation
|
|
161
|
-
|
|
|
162
|
-
| Rerun
|
|
163
|
-
| Mark Success/Failure
|
|
164
|
-
| Terminate
|
|
147
|
+
| Operation | Supported States | Behavior |
|
|
148
|
+
| -------------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
149
|
+
| Rerun | Any terminal state (failed/succeeded) | Reruns the entire composite task. Subtasks generate new instances according to their respective rerun rules. Partial subtask selection is not supported. |
|
|
150
|
+
| Mark Success/Failure | Not Started / Terminal state | Forces the composite task and all subtask instance statuses to change to succeeded or failed. |
|
|
151
|
+
| Terminate | Running | Terminates all running or not-started subtask instances and sets their status to failed. |
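The "Supported States" column above amounts to a lookup from operation to permitted instance states. The sketch below encodes that table directly; the enum, the operation keys, and `is_allowed` are illustrative names chosen here, not identifiers from the product.

```python
from enum import Enum


class State(Enum):
    NOT_STARTED = "not_started"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Terminal states as listed in the table: failed or succeeded.
TERMINAL = frozenset({State.SUCCEEDED, State.FAILED})

# Operation -> instance states in which the operation is offered.
ALLOWED_STATES = {
    "rerun": TERMINAL,
    "mark_success": TERMINAL | {State.NOT_STARTED},
    "mark_failure": TERMINAL | {State.NOT_STARTED},
    "terminate": frozenset({State.RUNNING}),
}


def is_allowed(operation: str, state: State) -> bool:
    """Return True if the operation is supported for an instance in this state."""
    return state in ALLOWED_STATES[operation]


print(is_allowed("rerun", State.RUNNING))         # False
print(is_allowed("terminate", State.RUNNING))     # True
print(is_allowed("mark_success", State.NOT_STARTED))  # True
```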
|
|
165
152
|
|
|
166
153
|
### Monitoring and Alerts
|
|
167
154
|
|
|
168
155
|
In monitoring and alerts, composite tasks are categorized under periodic scheduling tasks. Existing monitoring items such as "Task Instance Execution Failure" and "Periodic Task Instance Completion Time" apply to composite tasks as well. You can configure monitoring rules for composite tasks the same way you would for regular SQL periodic tasks.
|
|
169
156
|
|
|
170
|
-
|
|
157
|
+
^
|
|
171
158
|
|
|
172
159
|
### Triggering Data Quality Rule Checks
|
|
173
160
|
|
|
174
161
|
When configuring data quality rules, you can select a composite task as the scheduling trigger in the rule execution trigger settings, as shown below. You can select a specific subtask to trigger the check; if "Subtask" is left blank, the check is triggered after the entire composite task completes.
|
|
175
162
|
|
|
176
|
-
|
|
163
|
+
^
|
|
177
164
|
|
|
178
165
|
## Notes
|
|
179
166
|
|
|
@@ -182,11 +169,13 @@ When configuring data quality rules, you can select a composite task as the sche
|
|
|
182
169
|
3. **Version management**: Composite tasks only support "submission versions" (no saved versions) and do not currently support version rollback.
|
|
183
170
|
4. **Task dependencies**: Within a task group, a composite task exists as a whole unit. Internal subtasks cannot be added individually to the task group, and dependencies are between the composite task as a whole and other tasks. Internal subtasks of a composite task cannot depend on other tasks in the task group, nor can they be depended upon by other tasks in the task group.
|
|
184
171
|
|
|
185
|
-
|
|
172
|
+
***
|
|
186
173
|
|
|
187
174
|
## Related Documentation
|
|
188
175
|
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
176
|
+
* [Task Parameters](task_param.md) — Concepts and configuration for composite task parameters
|
|
177
|
+
* [Task Parameter Syntax Reference](task_param_reference.md) — Full syntax for time expressions and built-in parameters
|
|
178
|
+
* [Task Group](task_group.md) — Using task group parameters to share parameters across composite tasks
|
|
179
|
+
* [Task Development and Scheduling](task-develop.md) — Development and scheduling configuration for SQL subtasks
|
|
180
|
+
|
|
181
|
+
^
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## Python Environment Setup
|
|
4
4
|
|
|
5
|
-
This guide includes a data generator and several examples, requiring Python 3.
|
|
5
|
+
This guide includes a data generator and several examples, requiring Python 3.10, Java, and some other libraries and utilities.
|
|
6
6
|
|
|
7
7
|
To set up these dependencies, we will use conda.
|
|
8
8
|
|
|
@@ -22,7 +22,7 @@ dependencies:
|
|
|
22
22
|
- pandas=1.5.3
|
|
23
23
|
- pip=23.0.1
|
|
24
24
|
- pyarrow=10.0.1
|
|
25
|
-
- python=3.
|
|
25
|
+
- python=3.10
|
|
26
26
|
- python-confluent-kafka
|
|
27
27
|
- python-dotenv=0.21.0
|
|
28
28
|
- python-rapidjson=1.5
|
|
@@ -554,20 +554,20 @@ You will use the [Singdata Lakehouse Studio](https://accounts.clickzetta.com/) w
|
|
|
554
554
|
|
|
555
555
|
Navigate to Development -> Tasks, click `+` to create a new workspace and worksheet task, then select SQL Worksheet.
|
|
556
556
|
|
|
557
|
-
|
|
557
|
+
|
|
558
558
|
|
|
559
559
|
Create a workspace to store all tasks and code for this project. Workspace name: 01\_Demo\_Data\_Ingest
|
|
560
560
|
|
|
561
561
|
^
|
|
562
562
|
|
|
563
|
-
|
|
563
|
+
|
|
564
564
|
|
|
565
565
|
|
|
566
566
|
Create the first task, select SQL as the type. Workspace task name: 01\_Setup\_Environment
|
|
567
567
|
|
|
568
568
|
^
|
|
569
569
|
|
|
570
|
-
|
|
570
|
+
|
|
571
571
|
|
|
572
572
|
###
|
|
573
573
|
|
|
@@ -639,7 +639,7 @@ The config-ingest.json file contains your account login information for Singdata
|
|
|
639
639
|
"instance": "Please enter your instance ID",
|
|
640
640
|
"workspace": "Please enter your workspace, e.g., gharchive",
|
|
641
641
|
"schema": "Please enter your schema, e.g., public",
|
|
642
|
-
"vcluster": "Please enter your virtual cluster, e.g.,
|
|
642
|
+
"vcluster": "Please enter your virtual cluster, e.g., DEFAULT_AP",
|
|
643
643
|
"sdk_job_timeout": 10,
|
|
644
644
|
"hints": {
|
|
645
645
|
"sdk.job.timeout": 3,
|
|
@@ -654,16 +654,13 @@ The config-ingest.json file contains your account login information for Singdata
|
|
|
654
654
|
|
|
655
655
|
Navigate to Management -> Data Sources, click "New Data Source" and select Postgres to create a Postgres data source, so that Postgres can be accessed by Singdata Lakehouse.
|
|
656
656
|
|
|
657
|
-
:-: 
|
|
658
657
|
|
|
659
658
|
* Data Source Name: ingest\_demo\_from\_pg
|
|
660
659
|
* Connection Parameters: Same as the environment connection parameters in the database environment settings.
|
|
661
660
|
* Please make sure to configure the correct time zone of the database to avoid data synchronization failure.
|
|
662
661
|
|
|
663
|
-
:-: 
|
|
664
662
|
|
|
665
663
|
Once the environment is created, it can be used.
|
|
666
664
|
|
|
667
|
-
:-: 
|
|
668
665
|
|
|
669
666
|
Test the connection. A success prompt confirms that the data source is configured correctly.
|
|
@@ -18,11 +18,10 @@ Download the code for this guide from the [GitHub repository](https://github.com
|
|
|
18
18
|
|
|
19
19
|
Add the project directory to your VS Code workspace.
|
|
20
20
|
|
|
21
|
-
:-: 
|
|
22
21
|
|
|
23
22
|
##### Modify Parameters
|
|
24
23
|
|
|
25
|
-
Rename the file `config/config-ingest-sample.json` to `config/config-ingest.json`, and modify the [parameter values](
|
|
24
|
+
Rename the file `config/config-ingest-sample.json` to `config/config-ingest.json`, and modify the [parameter values](jdbc-driver.md) in `config-ingest.json`.
|
|
26
25
|
|
|
27
26
|
```JSON
|
|
28
27
|
{
|
|
@@ -43,9 +42,8 @@ Rename the file `config/config-ingest-sample.json` to `config/config-ingest.json
|
|
|
43
42
|
|
|
44
43
|
##### Bulkload
|
|
45
44
|
|
|
46
|
-
Run `BulkLoadFile.java` in VS Code
|
|
45
|
+
Run `BulkLoadFile.java` in VS Code.
|
|
47
46
|
|
|
48
|
-
:-: 
|
|
49
47
|
|
|
50
48
|
```JAVA
|
|
51
49
|
import com.clickzetta.client.BulkloadStream;
|
|
@@ -400,17 +398,13 @@ throw new ArithmeticException(bulkloadStream.getErrorMessage());
|
|
|
400
398
|
|
|
401
399
|
^
|
|
402
400
|
|
|
403
|
-
View execution results
|
|
401
|
+
View execution results.
|
|
404
402
|
|
|
405
|
-
:-:
|
|
406
|
-

|
|
407
403
|
|
|
408
404
|
##### Realtime Ingestion
|
|
409
405
|
|
|
410
|
-
Run `StreamingInsert.java` in VS Code
|
|
406
|
+
Run `StreamingInsert.java` in VS Code.
|
|
411
407
|
|
|
412
|
-
:-:
|
|
413
|
-

|
|
414
408
|
|
|
415
409
|
```JAVA
|
|
416
410
|
import com.clickzetta.client.ClickZettaClient;
|
|
@@ -12,11 +12,9 @@ Existing Kafka data source with high real-time requirements for data synchroniza
|
|
|
12
12
|
|
|
13
13
|
Navigate to Development -> Tasks, click "+", select "Real-time Sync", and create a new "Real-time Sync" job.
|
|
14
14
|
|
|
15
|
-
:-: 
|
|
16
15
|
|
|
17
|
-
Main configuration as follows
|
|
16
|
+
Main configuration as follows.
|
|
18
17
|
|
|
19
|
-
:-: 
|
|
20
18
|
|
|
21
19
|
^
|
|
22
20
|
|
|
@@ -28,13 +26,11 @@ Then select the Lakehouse target on the right, choose an existing data table, or
|
|
|
28
26
|
|
|
29
27
|
In the "Create Data Table" SQL code, change the table name to "target\_table\_from\_kafka".
|
|
30
28
|
|
|
31
|
-
:-: 
|
|
32
29
|
|
|
33
30
|
^
|
|
34
31
|
|
|
35
32
|
In the "Field Mapping Configuration" area, the built-in Kafka Topic fields are used for field mapping by default. If the messages in the Topic are JSON, you can also add a calculated column that parses the content of the value field using JSONPath rules. For example, extract the accountId field from the \_\_value\_\_ of the source topic and write it into the target \_\_value\_\_ field, as shown in the figure below.
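To make the JSONPath step concrete, here is a small sketch of what extracting `accountId` from a JSON message value amounts to. The sample payload and the path `$.accountId` are assumptions for illustration, and `extract_field` is a hypothetical helper that handles only simple dotted paths, not the full JSONPath syntax that Studio may support.

```python
import json


def extract_field(value_json: str, path: str):
    """Resolve a simple dotted JSONPath such as "$.accountId" against a JSON string."""
    if not path.startswith("$."):
        raise ValueError("only paths of the form $.a.b are handled in this sketch")
    node = json.loads(value_json)
    for key in path[2:].split("."):
        node = node[key]  # raises KeyError if the field is absent
    return node


# Hypothetical Kafka message value (the __value__ built-in field).
message_value = '{"accountId": "A-1001", "amount": 42.5}'
print(extract_field(message_value, "$.accountId"))
# A-1001
```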
|
|
36
33
|
|
|
37
|
-
:-: 
|
|
38
34
|
|
|
39
35
|
^
|
|
40
36
|
|
|
@@ -44,25 +40,20 @@ In the "Sync Rule Configuration", set the maximum concurrency for synchronizatio
|
|
|
44
40
|
|
|
45
41
|
After checking that the field mapping meets expectations, set the required information such as "Cluster" in the configuration, click "OK", and then click "Save" to save the task configuration.
|
|
46
42
|
|
|
47
|
-
:-: 
|
|
48
43
|
|
|
49
44
|
Real-time sync tasks currently do not support direct test runs. You need to submit and publish them, then check if the results are normal.
|
|
50
45
|
|
|
51
|
-
:-: 
|
|
52
46
|
|
|
53
47
|
#### Next Steps
|
|
54
48
|
|
|
55
49
|
* In the Operations Center, start the real-time sync task, observe the task running metrics, and verify if the data synchronization results are normal.
|
|
56
50
|
|
|
57
|
-
:-: 
|
|
58
51
|
|
|
59
52
|
* For the first start, select the "Stateless Start" method.
|
|
60
53
|
|
|
61
|
-
:-: 
|
|
62
54
|
|
|
63
55
|
* After a normal start, you can see the following monitoring metrics, indicating that the sync task is running normally.
|
|
64
56
|
|
|
65
|
-
:-: 
|
|
66
57
|
|
|
67
58
|
* Spot check the data in the target table and verify it against the source to see if it meets expectations.
|
|
68
59
|
|
|
@@ -16,7 +16,6 @@ Suitable for directly uploading smaller local files (not larger than 2GB) such a
|
|
|
16
16
|
|
|
17
17
|
Navigate to Data -> Data Directory, click "Upload Data" to import local files (CSV files generated in the test data generation section) into the table.
|
|
18
18
|
|
|
19
|
-
:-: 
|
|
20
19
|
|
|
21
20
|
##### Import Data
|
|
22
21
|
|
|
@@ -27,11 +26,9 @@ Click "Upload Data":
|
|
|
27
26
|
* Select "Create New Table", table name: lift\_tuckets\_import\_by\_studio\_web
|
|
28
27
|
* Virtual compute cluster created in the Singdata Lakehouse setup section
|
|
29
28
|
|
|
30
|
-
:-: 
|
|
31
29
|
|
|
32
30
|
After clicking "Next", check if the automatic settings for the uploaded data are correct. If the data preview meets expectations, the automatic settings are correct. Click "Confirm" to complete the data upload.
|
|
33
31
|
|
|
34
|
-
:-: 
|
|
35
32
|
|
|
36
33
|
##### Result Verification
|
|
37
34
|
|
|
@@ -39,15 +36,12 @@ Go to "Data" to check the import status and data:
|
|
|
39
36
|
|
|
40
37
|
You can see that the number of rows written in the import result is "100,000", which is consistent with the number generated in the "Test Data Generation" step.
|
|
41
38
|
|
|
42
|
-
:-: 
|
|
43
39
|
|
|
44
40
|
You can further "Preview Data" to confirm the data was loaded successfully:
|
|
45
41
|
|
|
46
|
-
:-: 
|
|
47
42
|
|
|
48
43
|
At this point, we have loaded local files into the table via Singdata Lakehouse Studio.
|
|
49
44
|
|
|
50
|
-
:-: 
|
|
51
45
|
|
|
52
46
|
#### Next Steps
|
|
53
47
|
|
|
@@ -12,27 +12,22 @@ When the existing data source (including databases, data warehouses) has a publi
|
|
|
12
12
|
|
|
13
13
|
Navigate to Development -> Tasks, click "+", select "Offline Sync", and create a new "Offline Sync" job.
|
|
14
14
|
|
|
15
|
-
:-: 
|
|
16
15
|
|
|
17
16
|
Other parameter configurations are as follows:
|
|
18
17
|
|
|
19
|
-
:-: 
|
|
20
18
|
|
|
21
19
|
Then choose to create a new data table: lift\_tickets\_data\_from\_pg\_batch.
|
|
22
20
|
|
|
23
21
|
In the "Create New Data Table" SQL code, change the table name to "lift\_tickets\_data\_from\_pg\_batch".
|
|
24
22
|
|
|
25
|
-
:-: 
|
|
26
23
|
|
|
27
24
|
Check if the field mapping meets expectations, then test run the sync task:
|
|
28
25
|
|
|
29
|
-
:-: 
|
|
30
26
|
|
|
31
27
|
Check the test results:
|
|
32
28
|
|
|
33
29
|
View the test task logs and check whether the nubWrite value matches the number of rows in the source table.
|
|
34
30
|
|
|
35
|
-
:-: 
|
|
36
31
|
|
|
37
32
|
#### Recommended Next Steps
|
|
38
33
|
|
|
@@ -18,7 +18,6 @@ Navigate to Development -> Tasks, click "+", and create a new Python task.
|
|
|
18
18
|
|
|
19
19
|
Task Name: 05_Loading Files from the Web into the Lakehouse via Studio's Built-in Python Node.
|
|
20
20
|
|
|
21
|
-
:-: 
|
|
22
21
|
|
|
23
22
|
##### Develop Python Task Code
|
|
24
23
|
|
|
@@ -82,25 +81,21 @@ There are two parameters:
|
|
|
82
81
|
|
|
83
82
|
ACCESS\_KEY\_ID = '${ak}' and ACCESS\_KEY\_SECRET = '${sk}'
|
|
84
83
|
|
|
85
|
-
Click on the schedule to fill in the default values for the parameters
|
|
84
|
+
Click on the schedule to fill in the default values for the parameters.
|
|
86
85
|
|
|
87
|
-
:-: 
|
|
88
86
|
|
|
89
|
-
Click "Load Parameters from Code" and fill in the corresponding values
|
|
87
|
+
Click "Load Parameters from Code" and fill in the corresponding values.
|
|
90
88
|
|
|
91
|
-
:-: 
|
|
92
89
|
|
|
93
90
|
##### Run the Test
|
|
94
91
|
|
|
95
92
|
Click "Run" to execute the Python code.
|
|
96
93
|
|
|
97
|
-
:-: 
|
|
98
94
|
|
|
99
95
|
##### Check the Upload Results
|
|
100
96
|
|
|
101
97
|
Log in to Alibaba Cloud Object Storage to view the uploaded files.
|
|
102
98
|
|
|
103
|
-
:-: 
|
|
104
99
|
|
|
105
100
|
#### Next Steps
|
|
106
101
|
|