@clickzetta/cz-cli-darwin-x64 0.5.16 → 0.5.17

Files changed (243)
  1. package/bin/cz-cli +0 -0
  2. package/bin/skills/lakehouse-doc-en/SKILL.md +6 -11
  3. package/bin/skills/lakehouse-doc-en/references/AIGateway.md +58 -13
  4. package/bin/skills/lakehouse-doc-en/references/Computation.md +1 -1
  5. package/bin/skills/lakehouse-doc-en/references/DataSource_Amazon_DocumentDB.md +3 -1
  6. package/bin/skills/lakehouse-doc-en/references/Foreach.md +14 -14
  7. package/bin/skills/lakehouse-doc-en/references/JDBC-Driver.md +0 -1
  8. package/bin/skills/lakehouse-doc-en/references/LakehouseAI-overview.md +21 -8
  9. package/bin/skills/lakehouse-doc-en/references/LakehouseDataGPT-tour.md +4 -9
  10. package/bin/skills/lakehouse-doc-en/references/LakehouseStudio-tour.md +14 -19
  11. package/bin/skills/lakehouse-doc-en/references/Lakehouse_Zilliz_MakeDataReadyforBIandAI.md +1 -1
  12. package/bin/skills/lakehouse-doc-en/references/Logstash.md +3 -3
  13. package/bin/skills/lakehouse-doc-en/references/Migrate_Spark_DataEngineeringBestPractices_Project_to_Lakehouse.md +1 -1
  14. package/bin/skills/lakehouse-doc-en/references/Notebook.md +17 -17
  15. package/bin/skills/lakehouse-doc-en/references/RemoteFunction-as-udf.md +14 -14
  16. package/bin/skills/lakehouse-doc-en/references/SQL_External_Catalog_Guide.md +1 -9
  17. package/bin/skills/lakehouse-doc-en/references/SUMMARY.md +59 -29
  18. package/bin/skills/lakehouse-doc-en/references/WINDOWFUNCTION.md +99 -57
  19. package/bin/skills/lakehouse-doc-en/references/Zettapark_Data_Engineering_Demo.md +1 -1
  20. package/bin/skills/lakehouse-doc-en/references/access-control-configuration.md +1 -8
  21. package/bin/skills/lakehouse-doc-en/references/aigw-2026-2-5-1.0.md +16 -0
  22. package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-29-1.0.2.md +14 -0
  23. package/bin/skills/lakehouse-doc-en/references/aigw-2026-3-8-1.0.1.md +16 -0
  24. package/bin/skills/lakehouse-doc-en/references/aigw-2026-4-28-1.1.md +29 -0
  25. package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-12-1.1.1.md +18 -0
  26. package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-15-1.2.md +9 -0
  27. package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-21-1.3.md +9 -0
  28. package/bin/skills/lakehouse-doc-en/references/aigw-2026-5-28-1.4.md +10 -0
  29. package/bin/skills/lakehouse-doc-en/references/aigw-2026-6-3-1.5.md +9 -0
  30. package/bin/skills/lakehouse-doc-en/references/alicloud-arn-externalid.md +0 -5
  31. package/bin/skills/lakehouse-doc-en/references/answer-accuracy-improve.md +120 -103
  32. package/bin/skills/lakehouse-doc-en/references/application-list.md +1 -3
  33. package/bin/skills/lakehouse-doc-en/references/approval-list.md +16 -17
  34. package/bin/skills/lakehouse-doc-en/references/batch-load-parquet-file-into-lakehouse.md +1 -1
  35. package/bin/skills/lakehouse-doc-en/references/batch_sync.md +9 -9
  36. package/bin/skills/lakehouse-doc-en/references/batch_sync_Sop.md +2 -2
  37. package/bin/skills/lakehouse-doc-en/references/batchloadparquetfileintoLakehouse.md +1 -1
  38. package/bin/skills/lakehouse-doc-en/references/bulkloadv1-python-sdk.md +3 -3
  39. package/bin/skills/lakehouse-doc-en/references/chart-auto-refresh-guide.md +12 -6
  40. package/bin/skills/lakehouse-doc-en/references/clickzetta-sample-data.md +3 -3
  41. package/bin/skills/lakehouse-doc-en/references/code_approval.md +1 -5
  42. package/bin/skills/lakehouse-doc-en/references/composite_task.md +31 -42
  43. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_environment_and_data_generate.md +6 -9
  44. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_javasdk_bulkload_realtime.md +4 -10
  45. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_kafka_realtime_sync.md +1 -10
  46. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_local_file_into_table_by_studio.md +0 -6
  47. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_batchload_public_network.md +0 -5
  48. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_python_node.md +2 -7
  49. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_realtime_cdc_public_network.md +13 -18
  50. package/bin/skills/lakehouse-doc-en/references/comprehensive_guide_to_ingesting_studio_sql_insert.md +0 -1
  51. package/bin/skills/lakehouse-doc-en/references/concepts.md +1 -1
  52. package/bin/skills/lakehouse-doc-en/references/config-datasource.md +5 -7
  53. package/bin/skills/lakehouse-doc-en/references/connect-with-cli.md +116 -72
  54. package/bin/skills/lakehouse-doc-en/references/connect-with-cz-cli.md +151 -0
  55. package/bin/skills/lakehouse-doc-en/references/continue-job.md +9 -17
  56. package/bin/skills/lakehouse-doc-en/references/create-api-connection.md +315 -286
  57. package/bin/skills/lakehouse-doc-en/references/create-catalog-connection.md +1 -0
  58. package/bin/skills/lakehouse-doc-en/references/create-dynamic-table.md +4 -4
  59. package/bin/skills/lakehouse-doc-en/references/create-external-catalog.md +85 -22
  60. package/bin/skills/lakehouse-doc-en/references/create-table-ddl.md +45 -0
  61. package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkendpoint.md +4 -6
  62. package/bin/skills/lakehouse-doc-en/references/creating_alicloud_privatelinkservice.md +4 -7
  63. package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkendpoint.md +2 -7
  64. package/bin/skills/lakehouse-doc-en/references/creating_tencentcloud_privatelinkservice.md +1 -5
  65. package/bin/skills/lakehouse-doc-en/references/cz-cli-agent.md +15 -10
  66. package/bin/skills/lakehouse-doc-en/references/cz-cli-datasource.md +0 -8
  67. package/bin/skills/lakehouse-doc-en/references/cz-cli-sql.md +2 -45
  68. package/bin/skills/lakehouse-doc-en/references/cz-cli.md +53 -42
  69. package/bin/skills/lakehouse-doc-en/references/dashboard-version-management-guide.md +12 -4
  70. package/bin/skills/lakehouse-doc-en/references/data-integration-intro.md +1 -1
  71. package/bin/skills/lakehouse-doc-en/references/data-integration.md +29 -27
  72. package/bin/skills/lakehouse-doc-en/references/data-load-summary.md +3 -3
  73. package/bin/skills/lakehouse-doc-en/references/data-quality.md +25 -25
  74. package/bin/skills/lakehouse-doc-en/references/data-sharing.md +31 -54
  75. package/bin/skills/lakehouse-doc-en/references/data-sources.md +45 -45
  76. package/bin/skills/lakehouse-doc-en/references/data_catalog.md +23 -25
  77. package/bin/skills/lakehouse-doc-en/references/data_privacy.md +5 -2
  78. package/bin/skills/lakehouse-doc-en/references/data_sharing_between_accounts_guide.md +0 -4
  79. package/bin/skills/lakehouse-doc-en/references/data_visualization.md +4 -15
  80. package/bin/skills/lakehouse-doc-en/references/dataagent.md +39 -7
  81. package/bin/skills/lakehouse-doc-en/references/databricks-delta-to-lakehouse-migration.md +168 -0
  82. package/bin/skills/lakehouse-doc-en/references/databricks-dlt-to-lakehouse-migration.md +331 -0
  83. package/bin/skills/lakehouse-doc-en/references/databricks-external-catalog-practice.md +367 -0
  84. package/bin/skills/lakehouse-doc-en/references/databricks-jobs-to-studio-migration.md +199 -0
  85. package/bin/skills/lakehouse-doc-en/references/databricks-notebook-to-studio-migration.md +350 -0
  86. package/bin/skills/lakehouse-doc-en/references/databricks-uc-governance-to-lakehouse-migration.md +327 -0
  87. package/bin/skills/lakehouse-doc-en/references/datagpt-model-config.md +34 -0
  88. package/bin/skills/lakehouse-doc-en/references/datagpt_data_source.md +50 -37
  89. package/bin/skills/lakehouse-doc-en/references/datagpt_introduction.md +55 -79
  90. package/bin/skills/lakehouse-doc-en/references/datagpt_quickstart.md +50 -64
  91. package/bin/skills/lakehouse-doc-en/references/datalake-acceleration.md +75 -2
  92. package/bin/skills/lakehouse-doc-en/references/dbt-databricks-to-clickzetta-migration.md +242 -0
  93. package/bin/skills/lakehouse-doc-en/references/dynamic-mask.md +30 -30
  94. package/bin/skills/lakehouse-doc-en/references/dynamic-table-bestpractice.md +1 -1
  95. package/bin/skills/lakehouse-doc-en/references/dynamic-table-introduce.md +1 -1
  96. package/bin/skills/lakehouse-doc-en/references/dynamic_table_summary.md +1 -1
  97. package/bin/skills/lakehouse-doc-en/references/eco_integration/streamlit.md +1 -1
  98. package/bin/skills/lakehouse-doc-en/references/eco_integration/superset.md +1 -1
  99. package/bin/skills/lakehouse-doc-en/references/ecosystem-all.md +1 -3
  100. package/bin/skills/lakehouse-doc-en/references/ecosystem.md +145 -0
  101. package/bin/skills/lakehouse-doc-en/references/external-catalog-summary.md +33 -38
  102. package/bin/skills/lakehouse-doc-en/references/external-function-combo-practice.md +466 -0
  103. package/bin/skills/lakehouse-doc-en/references/f6fc6447ee.md +7 -9
  104. package/bin/skills/lakehouse-doc-en/references/federation-query.md +56 -6
  105. package/bin/skills/lakehouse-doc-en/references/finebi-mysql.md +2 -0
  106. package/bin/skills/lakehouse-doc-en/references/get-started-with-sample-data.md +10 -11
  107. package/bin/skills/lakehouse-doc-en/references/gitfolder.md +2 -3
  108. package/bin/skills/lakehouse-doc-en/references/grant-privileges.md +2 -0
  109. package/bin/skills/lakehouse-doc-en/references/iceberg-rest-catalog-databricks.md +166 -0
  110. package/bin/skills/lakehouse-doc-en/references/ide.md +1 -1
  111. package/bin/skills/lakehouse-doc-en/references/if_else_task.md +59 -57
  112. package/bin/skills/lakehouse-doc-en/references/input_output.md +10 -7
  113. package/bin/skills/lakehouse-doc-en/references/jobprofile-bestpractices.md +60 -64
  114. package/bin/skills/lakehouse-doc-en/references/kafka-connection.md +0 -1
  115. package/bin/skills/lakehouse-doc-en/references/key-concepts.md +146 -117
  116. package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-gateway-cz-cli.md +317 -0
  117. package/bin/skills/lakehouse-doc-en/references/lakehouse-ai-sql-analysis.md +345 -0
  118. package/bin/skills/lakehouse-doc-en/references/lakehouse-dqc-guide.md +300 -0
  119. package/bin/skills/lakehouse-doc-en/references/lakehouse-medallion-sql-dt-guide.md +543 -0
  120. package/bin/skills/lakehouse-doc-en/references/lakehouse-multi-cloud-acceleration.md +274 -0
  121. package/bin/skills/lakehouse-doc-en/references/lakehouse-multimodal-ai-pipeline.md +198 -0
  122. package/bin/skills/lakehouse-doc-en/references/lakehouse-quick-experience_guide.md +49 -52
  123. package/bin/skills/lakehouse-doc-en/references/lakehouse-volume-pipe-acceleration-guide.md +380 -0
  124. package/bin/skills/lakehouse-doc-en/references/langchain-plug-installation.md +1 -1
  125. package/bin/skills/lakehouse-doc-en/references/management.md +4 -9
  126. package/bin/skills/lakehouse-doc-en/references/medallion-lakehouse-from-scratch.md +2 -1
  127. package/bin/skills/lakehouse-doc-en/references/metrics_answer_build.md +58 -21
  128. package/bin/skills/lakehouse-doc-en/references/migrate-spark-data-engineering-best-practices-to-lakehouse.md +1 -1
  129. package/bin/skills/lakehouse-doc-en/references/mindsdb.md +1 -1
  130. package/bin/skills/lakehouse-doc-en/references/monitoring_and_alerting.md +65 -60
  131. package/bin/skills/lakehouse-doc-en/references/monitoring_item_specification.md +33 -33
  132. package/bin/skills/lakehouse-doc-en/references/multitable_batch_sync.md +16 -16
  133. package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync.md +65 -72
  134. package/bin/skills/lakehouse-doc-en/references/multitable_realtime_sync_sop.md +54 -52
  135. package/bin/skills/lakehouse-doc-en/references/navicat-mysql.md +2 -0
  136. package/bin/skills/lakehouse-doc-en/references/om-dynamic-table.md +71 -66
  137. package/bin/skills/lakehouse-doc-en/references/om-vcluster.md +2 -0
  138. package/bin/skills/lakehouse-doc-en/references/open-api-create-session.md +79 -0
  139. package/bin/skills/lakehouse-doc-en/references/open-api-generate-auth-token.md +63 -0
  140. package/bin/skills/lakehouse-doc-en/references/open-api-overview.md +96 -0
  141. package/bin/skills/lakehouse-doc-en/references/open-api-quick-start.md +286 -0
  142. package/bin/skills/lakehouse-doc-en/references/open-api-response-guide.md +264 -0
  143. package/bin/skills/lakehouse-doc-en/references/open-api-safe-question-poll.md +201 -0
  144. package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-query.md +99 -0
  145. package/bin/skills/lakehouse-doc-en/references/open-api-text2insight-stop.md +74 -0
  146. package/bin/skills/lakehouse-doc-en/references/overview.md +6 -7
  147. package/bin/skills/lakehouse-doc-en/references/permission-application.md +5 -5
  148. package/bin/skills/lakehouse-doc-en/references/pipe-introduction.md +1 -0
  149. package/bin/skills/lakehouse-doc-en/references/pipe-kafka-table-stream.md +72 -70
  150. package/bin/skills/lakehouse-doc-en/references/pipe-kafka.md +105 -110
  151. package/bin/skills/lakehouse-doc-en/references/pipe-overview.md +40 -40
  152. package/bin/skills/lakehouse-doc-en/references/pipe-storage-object.md +43 -48
  153. package/bin/skills/lakehouse-doc-en/references/pipe-summary.md +14 -4
  154. package/bin/skills/lakehouse-doc-en/references/pipe-syntax.md +58 -151
  155. package/bin/skills/lakehouse-doc-en/references/practice_python_task.md +4 -4
  156. package/bin/skills/lakehouse-doc-en/references/pricing-ai-gateway.md +181 -0
  157. package/bin/skills/lakehouse-doc-en/references/pricing-lakehouse.md +316 -0
  158. package/bin/skills/lakehouse-doc-en/references/pricing.md +44 -288
  159. package/bin/skills/lakehouse-doc-en/references/private-link-general.md +0 -2
  160. package/bin/skills/lakehouse-doc-en/references/pyspark-to-zettapark-migration-f1.md +1 -1
  161. package/bin/skills/lakehouse-doc-en/references/python-igs.md +7 -3
  162. package/bin/skills/lakehouse-doc-en/references/python-sample-put-github-rt-events.md +1 -1
  163. package/bin/skills/lakehouse-doc-en/references/python-task.md +1 -1
  164. package/bin/skills/lakehouse-doc-en/references/python_reference/connector.md +3 -3
  165. package/bin/skills/lakehouse-doc-en/references/python_reference/connector_advanced.md +2 -2
  166. package/bin/skills/lakehouse-doc-en/references/python_reference/connector_examples.md +2 -2
  167. package/bin/skills/lakehouse-doc-en/references/python_sdk_guide.md +1 -1
  168. package/bin/skills/lakehouse-doc-en/references/python_shell_datasource.md +11 -9
  169. package/bin/skills/lakehouse-doc-en/references/quick_start_batch_sync_data.md +9 -18
  170. package/bin/skills/lakehouse-doc-en/references/quick_start_bi_analysis.md +8 -25
  171. package/bin/skills/lakehouse-doc-en/references/quick_start_create_workspace.md +4 -6
  172. package/bin/skills/lakehouse-doc-en/references/quick_start_data_quality.md +8 -8
  173. package/bin/skills/lakehouse-doc-en/references/quick_start_etl.md +16 -20
  174. package/bin/skills/lakehouse-doc-en/references/quick_start_monitoring_and_alerting.md +10 -18
  175. package/bin/skills/lakehouse-doc-en/references/quick_start_sql_query.md +7 -10
  176. package/bin/skills/lakehouse-doc-en/references/quick_start_upload_data.md +5 -7
  177. package/bin/skills/lakehouse-doc-en/references/quick_start_user_management.md +8 -8
  178. package/bin/skills/lakehouse-doc-en/references/quick_start_workspace.md +0 -5
  179. package/bin/skills/lakehouse-doc-en/references/quick_start_workspace_user.md +8 -8
  180. package/bin/skills/lakehouse-doc-en/references/quickstart.md +69 -56
  181. package/bin/skills/lakehouse-doc-en/references/quickstart_datashare_between_companies.md +0 -5
  182. package/bin/skills/lakehouse-doc-en/references/quickstart_envirment_for_team.md +0 -24
  183. package/bin/skills/lakehouse-doc-en/references/realtime-pipeline-selection-guide.md +1 -2
  184. package/bin/skills/lakehouse-doc-en/references/realtime-sales-dashboard-with-dynamic-table.md +3 -3
  185. package/bin/skills/lakehouse-doc-en/references/realtime_sync.md +0 -1
  186. package/bin/skills/lakehouse-doc-en/references/release-note-2026-05-19.md +5 -3
  187. package/bin/skills/lakehouse-doc-en/references/revoke-privileges.md +3 -1
  188. package/bin/skills/lakehouse-doc-en/references/roles.md +2 -3
  189. package/bin/skills/lakehouse-doc-en/references/row-filter.md +165 -0
  190. package/bin/skills/lakehouse-doc-en/references/row_level_permission.md +30 -19
  191. package/bin/skills/lakehouse-doc-en/references/scheduled_task.md +28 -21
  192. package/bin/skills/lakehouse-doc-en/references/security_overview.md +99 -21
  193. package/bin/skills/lakehouse-doc-en/references/set-command.md +1 -1
  194. package/bin/skills/lakehouse-doc-en/references/setup.md +13 -15
  195. package/bin/skills/lakehouse-doc-en/references/show-grants.md +1 -1
  196. package/bin/skills/lakehouse-doc-en/references/snowflake-dynamic-tables-to-lakehouse.md +2 -2
  197. package/bin/skills/lakehouse-doc-en/references/spark-connector-summary.md +1 -1
  198. package/bin/skills/lakehouse-doc-en/references/sql_functions/context_functions/current_vcluster.md +1 -1
  199. package/bin/skills/lakehouse-doc-en/references/sso-configuration.md +2 -2
  200. package/bin/skills/lakehouse-doc-en/references/streaming_pipeline_with_dynamic_table.md +0 -1
  201. package/bin/skills/lakehouse-doc-en/references/studio-incremental-sync-practice.md +27 -23
  202. package/bin/skills/lakehouse-doc-en/references/studio-shell-task.md +1 -1
  203. package/bin/skills/lakehouse-doc-en/references/supported-cloud-platforms.md +32 -0
  204. package/bin/skills/lakehouse-doc-en/references/table_rendering.md +18 -12
  205. package/bin/skills/lakehouse-doc-en/references/task-develop.md +89 -91
  206. package/bin/skills/lakehouse-doc-en/references/task_development.md +19 -17
  207. package/bin/skills/lakehouse-doc-en/references/task_group.md +16 -14
  208. package/bin/skills/lakehouse-doc-en/references/task_instance.md +21 -21
  209. package/bin/skills/lakehouse-doc-en/references/task_param.md +38 -35
  210. package/bin/skills/lakehouse-doc-en/references/task_param_reference.md +81 -79
  211. package/bin/skills/lakehouse-doc-en/references/task_scheduling_dependency.md +20 -21
  212. package/bin/skills/lakehouse-doc-en/references/tencentcloud_arn_and_externalid.md +1 -5
  213. package/bin/skills/lakehouse-doc-en/references/trial-account-quotas-and-limits.md +1 -3
  214. package/bin/skills/lakehouse-doc-en/references/tutorial_connect_to_lakehouse.md +69 -0
  215. package/bin/skills/lakehouse-doc-en/references/tutorials.md +4 -1
  216. package/bin/skills/lakehouse-doc-en/references/unique-key.md +167 -0
  217. package/bin/skills/lakehouse-doc-en/references/usageandbillingview.md +138 -0
  218. package/bin/skills/lakehouse-doc-en/references/use-dbt-dev.md +3 -3
  219. package/bin/skills/lakehouse-doc-en/references/use-java-sdk-realtime-uploaddata.md +1 -1
  220. package/bin/skills/lakehouse-doc-en/references/use-java-sdk-upload-data-local.md +3 -3
  221. package/bin/skills/lakehouse-doc-en/references/use-models.md +128 -0
  222. package/bin/skills/lakehouse-doc-en/references/use-mysql-client.md +81 -81
  223. package/bin/skills/lakehouse-doc-en/references/use-python-sdk-upload-data.md +10 -12
  224. package/bin/skills/lakehouse-doc-en/references/user-identification.md +2 -3
  225. package/bin/skills/lakehouse-doc-en/references/user_permission_grand_guide.md +1 -1
  226. package/bin/skills/lakehouse-doc-en/references/using-udf-in-dynamic-table.md +1 -1
  227. package/bin/skills/lakehouse-doc-en/references/vc_cache.md +18 -22
  228. package/bin/skills/lakehouse-doc-en/references/vcluster_size_description.md +33 -31
  229. package/bin/skills/lakehouse-doc-en/references/virtual-cluster.md +43 -45
  230. package/bin/skills/lakehouse-doc-en/references/web-job-history.md +94 -108
  231. package/bin/skills/lakehouse-doc-en/references/web_search.md +16 -7
  232. package/bin/skills/lakehouse-doc-en/references/zettapark-data-engineering-demo.md +1 -1
  233. package/bin/skills/lakehouse-doc-en/references/zettapark-dataframe-guide.md +144 -70
  234. package/bin/skills/lakehouse-doc-en/references/zettapark-dynamic-table-guide.md +2 -2
  235. package/bin/skills/lakehouse-doc-en/references/zettapark-etl-guide.md +73 -33
  236. package/bin/skills/lakehouse-doc-en/references/zettapark-feature-engineering.md +2 -2
  237. package/bin/skills/lakehouse-doc-en/references/zettapark-functions-guide.md +75 -46
  238. package/bin/skills/lakehouse-doc-en/references/zettapark-quick-start.md +2 -2
  239. package/bin/skills/lakehouse-doc-en/references/zettapark-stream-guide.md +4 -4
  240. package/bin/skills/lakehouse-doc-en/references/zettapark-volume-guide.md +93 -29
  241. package/package.json +1 -1
  242. package/bin/skills/lakehouse-doc-en/references/CLAUDE.md +0 -606
  243. package/bin/skills/lakehouse-doc-en/references/modelprice.md +0 -155
@@ -0,0 +1,242 @@
1
+ # dbt-databricks → dbt-clickzetta Migration in Practice: Financial Payment Data Pipeline
2
+
3
+ If your data pipeline is built with dbt + Databricks SQL, migrating to Singdata Lakehouse is far less work than you might expect — as little as changing one line in `profiles.yml`. All core dbt capabilities — CTE models, window functions, SCD Type 2 logic, data quality tests, unit tests, and data contracts — are fully compatible with Lakehouse and require zero changes.
4
+
5
+ This article validates that claim with a real project: a financial payment data pipeline built on dbt-databricks (staging → intermediate → marts → semantic, 30 models, 36 tests) was fully migrated to dbt-clickzetta and verified with `dbt seed 9/9 + dbt run 30/30 + dbt test 36/36` — all passing.
6
+
7
+ Full code on GitHub: [dbt-databricks2lakehouse-blueprint](https://github.com/clickzetta/dbt-databricks2lakehouse-blueprint)
8
+
9
+ ---
10
+
11
+ ## Original Project
12
+
13
+ [dbt-databricks2lakehouse-blueprint](https://github.com/clickzetta/dbt-databricks2lakehouse-blueprint) is forked from [Alex-Teodosiu/dbt-blueprint](https://github.com/Alex-Teodosiu/dbt-blueprint) (⭐72). The original stack is dbt-databricks + Databricks SQL Warehouse. The project simulates a P2P/P2B (peer-to-peer / peer-to-business) payment platform data engineering pipeline, covering five data domains — users, merchants, accounts, bank cards, and transactions — and implements the complete chain from raw events → cleansing → SCD Type 2 dimension tables → fact tables → semantic layer.
14
+
15
+ The migrated code is in the `03_lakehouse/` directory and can be compared file-by-file with `01_source/dbt_blueprint/`.
16
+
17
+ ## Conclusion First
18
+
19
+ **The adapter is dbt's database dialect layer** — switching adapters means switching the database; the dbt models themselves do not need to change. Changes are concentrated in `profiles.yml` (connection configuration) and a handful of Databricks-specific syntax items.
20
+
21
+ | Change | Files | Effort | Notes |
22
+ |--------|-------|--------|-------|
23
+ | Adapter switch | 1 (`profiles.yml`) | Minimal | `type: databricks` → `type: clickzetta`, different connection params |
24
+ | Project profile name | 1 (`dbt_project.yml`) | Minimal | `connection_databricks` → `dbt_blueprint` |
25
+ | `getdate()` | 1 | Minimal | → `current_date()` |
26
+ | Macro target name | 1 | Minimal | `databricks_cluster` → `clickzetta_prod` |
27
+
28
+ **Parts that need no changes**: all CTE model logic, window functions (LEAD/LAG/ROW_NUMBER), SCD Type 2 logic, `dbt_utils.generate_surrogate_key()`, data quality tests, unit tests, data contracts, `SELECT * EXCEPT (col)`, `col :: type` cast syntax, `DATEDIFF(year, start, end)` — all natively supported by Lakehouse.
29
+
30
+ ---
31
+
32
+ ## Technology Stack Comparison
33
+
34
+ | | dbt-databricks (original) | dbt-clickzetta (migrated) |
35
+ |---|---|---|
36
+ | dbt adapter | `dbt-databricks` | `dbt-clickzetta` |
37
+ | Compute engine | Databricks SQL Warehouse | Singdata Lakehouse |
38
+ | Connection method | `host` + `http_path` + `token` | `instance` + `workspace` + `service` + `username` + `password` |
39
+ | Target catalog | `catalog: dbt_blueprint` | Not needed (workspace is the catalog) |
40
+ | SQL dialect | Databricks SQL | Lakehouse SQL (ANSI-compatible) |
41
+ | `:: cast` | Supported | **Supported** (same on both sides, no change needed) |
42
+ | `SELECT * EXCEPT` | Supported | **Supported** (same on both sides, no change needed) |
43
+ | `DATEDIFF(year, s, e)` | Supported | **Supported** (three-argument form compatible) |
44
+ | `getdate()` | Supported | Not supported → use `current_date()` |
45
+ | Model logic (CTE/Window/JOIN) | — | **Fully identical** |
46
+
47
+ ---
48
+
49
+ ![](.topwrite/assets/anim-29-dbt-databricks-migration.svg)
50
+
51
+ ---
52
+
53
+ ## Project Background
54
+
55
+ The data comes from a simulated P2P/P2B payment platform and contains 9 raw event tables:
56
+
57
+ | Domain | Table | Rows | Description |
58
+ |--------|-------|------|-------------|
59
+ | Users | `user_events` | 196 | User registration/status change events (SCD2 source) |
60
+ | Merchants | `merchant_events` | 30 | Merchant registration/status changes |
61
+ | Merchants | `industry_codes` | 30 | Industry code reference table |
62
+ | Accounts | `account_events` | 80 | Bank account opening/change events |
63
+ | Bank cards | `card_events` | 60 | Card activation/status changes |
64
+ | Transactions | `raw_p2p_captured_events` | 150 | Successful P2P transactions |
65
+ | Transactions | `raw_p2p_failed_events` | 50 | Failed P2P transactions |
66
+ | Transactions | `raw_p2b_captured_events` | 80 | Successful P2B transactions |
67
+ | Transactions | `raw_p2b_failed_events` | 30 | Failed P2B transactions |
68
+
69
+ dbt four-layer architecture:
70
+
71
+ ```
72
+ staging → intermediate → marts → semantic
73
+ (view) (table) (table) (table)
74
+ raw events SCD2 logic dim + fact enriched
75
+ ```
76
+
77
+ ---
78
+
79
+ ## Migration Steps
80
+
81
+ ### Step 1: Switch the Adapter (Core Change)
82
+
83
+ The original `profiles.yml` uses `type: databricks`. The migration switches it to `type: clickzetta` and replaces the Databricks connection params `host`/`http_path`/`token` with the Lakehouse params `instance`/`workspace`/`service`/`username`/`password`:
84
+
85
+ ```yaml
86
+ # Original (dbt-databricks)
87
+ connection_databricks:
88
+ target: dev_local
89
+ outputs:
90
+ dev_local:
91
+ type: databricks
92
+ catalog: dbt_blueprint
93
+ schema: default
94
+ host: dbc-a505ff10-0af5.cloud.databricks.com
95
+ http_path: /sql/1.0/warehouses/4fa4ca06332da87f
96
+ token: "{{ env_var('BLUEPRINT_DATABRICKS_TOKEN') }}"
97
+ threads: 1
98
+ ```
99
+
100
+ ```yaml
101
+ # Migrated (dbt-clickzetta)
102
+ dbt_blueprint:
103
+ target: dev
104
+ outputs:
105
+ dev:
106
+ type: clickzetta
107
+ instance: "{{ env_var('CZ_INSTANCE') }}"
108
+ workspace: "{{ env_var('CZ_WORKSPACE') }}"
109
+ schema: "{{ env_var('CZ_SCHEMA', 'dbt_blueprint_dev') }}"
110
+ vcluster: "{{ env_var('CZ_VCLUSTER', 'DEFAULT') }}"
111
+ username: "{{ env_var('CZ_USERNAME') }}"
112
+ password: "{{ env_var('CZ_PASSWORD') }}"
113
+ service: "{{ env_var('CZ_SERVICE') }}"
114
+ threads: 4
115
+ ```
116
+
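The migrated profile reads all of its connection settings from environment variables, so a missing variable only shows up when dbt tries to connect. The following is a minimal pre-flight sketch, assuming the variable names shown in the profile above; the helper name `missing_profile_vars` is hypothetical and is not part of dbt or dbt-clickzetta.

```python
import os
from typing import List, Mapping

# Variables the migrated profile reads with no fallback value.
REQUIRED_VARS = ["CZ_INSTANCE", "CZ_WORKSPACE", "CZ_USERNAME", "CZ_PASSWORD", "CZ_SERVICE"]

# Variables that have a default in the profile (values copied from profiles.yml).
OPTIONAL_DEFAULTS = {"CZ_SCHEMA": "dbt_blueprint_dev", "CZ_VCLUSTER": "DEFAULT"}


def missing_profile_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Hypothetical helper usage: report what is missing before invoking dbt.
missing = missing_profile_vars(os.environ)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

`CZ_SCHEMA` and `CZ_VCLUSTER` are left out of the required list because the profile supplies defaults for them.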
117
+ Also update the `profile` name in `dbt_project.yml`:
118
+
119
+ ```yaml
120
+ # Original
121
+ profile: 'connection_databricks'
122
+
123
+ # Migrated
124
+ profile: 'dbt_blueprint'
125
+ ```
126
+
127
+ ### Step 2: Replace `getdate()`
128
+
129
+ `getdate()` is a SQL Server/Databricks function. Lakehouse uses `current_date()`:
130
+
131
+ ```sql
132
+ -- Original (int_users.sql)
133
+ {{ calculate_age('date_of_birth', 'getdate()') }} as age
134
+
135
+ -- Migrated
136
+ {{ calculate_age('date_of_birth', 'current_date()') }} as age
137
+ ```
138
+
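For projects with more than one occurrence, the replacement can be scripted. The sketch below is an illustration only, not part of the blueprint repository: it rewrites `getdate()` calls in a string of SQL or Jinja text, and the function name `replace_getdate` is hypothetical.

```python
import re
from typing import Tuple

# Matches getdate() case-insensitively, allowing whitespace around the parentheses.
# The leading \b avoids touching identifiers such as my_getdate().
GETDATE_CALL = re.compile(r"\bgetdate\s*\(\s*\)", re.IGNORECASE)


def replace_getdate(sql_text: str, replacement: str = "current_date()") -> Tuple[str, int]:
    """Return the rewritten text and the number of replacements made."""
    return GETDATE_CALL.subn(replacement, sql_text)


original = "{{ calculate_age('date_of_birth', 'getdate()') }} as age"
migrated, count = replace_getdate(original)
print(migrated)
print("replacements:", count)
```

Pass `replacement="current_timestamp()"` where the model relies on the time component rather than only the date.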
139
+ > 💡 **Tip**: The `:: cast` syntax and the three-argument `DATEDIFF(year, start, end)` form are both **natively supported** in Lakehouse — no changes needed.
140
+
141
+ ### Step 3: Update the Target Name in the `generate_schema_name` Macro
142
+
143
+ The original macro uses `databricks_cluster` as the prod environment name. Change it to `clickzetta_prod`:
144
+
145
+ ```sql
146
+ -- Original
147
+ {%- if target.name in ["prod", "databricks_cluster"] -%}
148
+
149
+ -- Migrated
150
+ {%- if target.name in ["prod", "clickzetta_prod"] -%}
151
+ ```
152
+
153
+ ---
154
+
155
+ ## Fully Compatible Parts (No Changes Needed)
156
+
157
+ The following patterns are written identically on both sides and have been tested end-to-end:
158
+
159
+ ```sql
160
+ -- :: cast type conversion — supported on both sides
161
+ userId :: string as user_id,
162
+ eventTime :: timestamp as event_time,
163
+ amount :: double as transaction_amount
164
+
165
+ -- SCD Type 2 window logic
166
+ , users_scd2 as (
167
+ select
168
+ user_id, status, event_time as from_event_timestamp,
169
+ lead(event_time) over w as to_event_timestamp
170
+ from users
171
+ window w as (partition by user_id order by event_time)
172
+ )
173
+
174
+ -- DATEDIFF three-argument form — supported on both sides
175
+ DATEDIFF(year, date_of_birth, current_date())
176
+
177
+ -- Surrogate key (dbt_utils)
178
+ {{ dbt_utils.generate_surrogate_key(['user_id', 'from_event_timestamp', 'status']) }}
179
+
180
+ -- SELECT * EXCEPT — supported on both sides
181
+ select * except (age)
182
+ from users
183
+ where age >= 18
184
+
185
+ -- Data contracts (model contracts)
186
+ config:
187
+ contract:
188
+ enforced: true
189
+ columns:
190
+ - name: transaction_uid
191
+ data_type: string
192
+ data_tests:
193
+ - not_null
194
+ - unique
195
+
196
+ -- Unit tests
197
+ unit_tests:
198
+ - name: test__int_users__scd_logic
199
+ model: int_users
200
+ given:
201
+ - input: ref('stg_user_events')
202
+ rows: [...]
203
+ expect:
204
+ rows: [...]
205
+ ```
206
+
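To make the SCD Type 2 window logic above concrete, here is a small Python sketch of what `lead(event_time) over (partition by user_id order by event_time)` computes: each row's validity interval ends where the same user's next event begins. The sample events and the function name `build_scd2` are made up for illustration and do not come from the project.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample events; ISO timestamps sort correctly as strings.
events = [
    {"user_id": "u1", "status": "active", "event_time": "2024-01-01T00:00:00"},
    {"user_id": "u2", "status": "active", "event_time": "2024-01-05T00:00:00"},
    {"user_id": "u1", "status": "suspended", "event_time": "2024-02-01T00:00:00"},
]


def build_scd2(rows):
    """Pair each event with the next event time for the same user."""
    ordered = sorted(rows, key=itemgetter("user_id", "event_time"))
    result = []
    for user_id, group in groupby(ordered, key=itemgetter("user_id")):
        group = list(group)
        # zip each event with its successor; the last event has no successor.
        for current, following in zip(group, group[1:] + [None]):
            result.append({
                "user_id": user_id,
                "status": current["status"],
                "from_event_timestamp": current["event_time"],
                "to_event_timestamp": following["event_time"] if following else None,
            })
    return result


scd2_rows = build_scd2(events)
for row in scd2_rows:
    print(row)
```

A `to_event_timestamp` of `None` marks the currently valid version of a user, matching the `NULL` that `lead()` returns for the last row in each partition.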
207
+ ---
208
+
209
+ ## dbt Test Results
210
+
211
+ Tested on the AWS Singapore instance (`aws_singapore_prod`); all 36 tests pass:
212
+
213
+ ```
214
+ dbt seed: 9/9 PASS (196 users, 310 transactions, 80 accounts, 60 cards...)
215
+ dbt run: 30/30 PASS (9 views + 21 tables)
216
+ dbt test: 36/36 PASS (21 data tests + 15 unit tests)
217
+ ```
218
+
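The totals in this summary can be cross-checked against the row counts listed under Project Background. The short Python sketch below does only that arithmetic, using numbers copied from this article; it does not query Lakehouse or run dbt.

```python
# Row counts of the four transaction seed tables, from the Project Background table.
transaction_rows = {
    "raw_p2p_captured_events": 150,
    "raw_p2p_failed_events": 50,
    "raw_p2b_captured_events": 80,
    "raw_p2b_failed_events": 30,
}

total_transactions = sum(transaction_rows.values())
total_models = 9 + 21   # views + tables reported by dbt run
total_tests = 21 + 15   # data tests + unit tests reported by dbt test

print("transactions:", total_transactions)
print("models:", total_models)
print("tests:", total_tests)
```

The sums match the 310 transactions, 30 models, and 36 tests reported in the summary.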
219
+ Data quality tests include: not_null/unique constraints on source tables, marts-layer data contracts (`contract: enforced: true`), and column type validation on fact_transaction. Unit tests cover SCD2 logic (`int_users__scd_logic`), transaction aggregation (`int_transactions__union_all`), and the age calculation macro (`calculate_age__valid_ages`).
220
+
221
+ ---
222
+
223
+ ## Notes
224
+
225
+ - **`getdate()` vs `current_date()`**: Databricks' `getdate()` returns the current timestamp. Lakehouse uses `current_date()` (date) or `current_timestamp()` (timestamp) as replacements. The `col :: type` cast syntax and the three-argument `DATEDIFF(year, s, e)` form are both **natively supported** in Lakehouse — no changes needed.
226
+ - **Catalog hierarchy**: Databricks uses three-part naming (`catalog.schema.table`). In dbt with Lakehouse, only `schema.table` is used; the adapter handles the rest automatically. No manual changes to table references in models are needed.
227
+ - **Seeds replacing the ingestion layer**: The original project relies on ingestion tables in the Databricks workspace. After migrating to Lakehouse, seed CSVs are used as replacements. In production, these can be replaced with Studio data integration or CDC tasks.
228
+
229
+ ## Related Documentation
230
+
231
+ ### dbt Development Guides
232
+
233
+ - [dbt Quick Start](use-dbt-dev.md): dbt-clickzetta adapter installation and configuration
234
+ - [dbt Incremental Models](dbt-incremental.md): Incremental strategy configuration and unique_key options
235
+ - [dbt Data Quality Checks](dbt-data-quality.md): Data tests, contract validation, and custom tests
236
+ - [dbt Advanced Features](dbt-advanced-features.md): Macros, snapshots, semantic layer, and more
237
+
238
+ ### Other Migration Case Studies
239
+
240
+ - [Databricks Notebook → Lakehouse Migration (Retail Medallion Pipeline)](databricks-notebook-to-studio-migration.md): PySpark Notebook → ZettaPark/Studio tasks
241
+ - [dbt BigQuery Migration: Retail Data Warehouse Pipeline](dbt-bigquery-to-clickzetta-migration.md): BigQuery → Lakehouse adapter migration
242
+ - [dbt Snowflake Migration: TPC-H Data Warehouse Pipeline](dbt-snowflake-to-clickzetta-migration.md): Snowflake → Lakehouse adapter migration
@@ -1,12 +1,12 @@
1
1
  # Lakehouse Column-Level Security (Dynamic Masking) User Guide
2
2
 
3
- ## 1. Overview
3
+ ## Overview
4
4
 
5
- Column-level Security provides fine-grained data protection capabilities through Dynamic Data Masking, which dynamically modifies the display of sensitive data (such as partial hiding or character replacement) based on user identity or role. We only store the original data and execute the masking function at runtime during data access. This document introduces how to implement this functionality through SQL interfaces.
5
+ Column-level Security provides fine-grained data protection capabilities through Dynamic Data Masking, which dynamically modifies the display of sensitive data (such as partial hiding or character replacement) based on user identity or role. The system only stores the original data and executes the masking function at runtime during data access. This document introduces how to implement this functionality through the SQL interface.
6
6
 
7
- ## 2. Core Syntax
7
+ ## Core Syntax
8
8
 
9
- ### 2.1 Creating Masking Policy Functions
9
+ ### 1 Creating Masking Policy Functions
10
10
 
11
11
  Refer to the [CREATE FUNCTION (SQL)](create-sql-function.md) syntax.
12
12
 
@@ -19,12 +19,12 @@ expression_with_conditional_logic;
19
19
 
20
20
  **Key Elements**:
21
21
 
22
- - Must return the same data type as the original column.
23
- - Use security context functions:
24
- - `current_user()` to get the current user (note case sensitivity).
25
- - `current_roles()` to get an array of user roles.
22
+ * Must return the same data type as the original column.
23
+ * Use security context functions:
24
+ * `current_user()` to get the current user (note case sensitivity).
25
+ * `current_roles()` to get an array of user roles.
26
26
 
27
- ### 2.2 Binding Policies to Columns
27
+ ### 2 Binding Policies to Columns
28
28
 
29
29
  **When Creating a Table**:
30
30
 
@@ -45,13 +45,13 @@ SET MASK schema_name.masking_function;
45
45
 
46
46
  **Adding a Column with Masking**:
47
47
 
48
- ```sql
48
+ ```SQL
49
49
  ALTER TABLE table_name ADD COLUMN (column_name column_type MASK schema_name.masking_function);
50
50
  ```
51
51
 
52
- ### 2.3 Removing Policy Binding
52
+ ### 3 Removing Policy Binding
53
53
 
54
- ```sql
54
+ ```SQL
55
55
  ALTER TABLE table_name
56
56
  CHANGE COLUMN column_name
57
57
  UNSET MASK;
@@ -59,9 +59,9 @@ UNSET MASK;
59
59
 
60
60
  ***
61
61
 
62
- ## 3. Use Case Examples
62
+ ## Use Case Examples
63
63
 
64
- ### 3.1 Basic Masking
64
+ ### 1 Basic Masking
65
65
 
66
66
  **Requirement**: Display the first 6 characters of an ID card number, followed by 4 asterisks, and then the last 4 characters.
67
67
 
@@ -79,7 +79,7 @@ ALTER TABLE data CHANGE COLUMN idcard SET MASK public.idcard_masking;
79
79
  Original Value: 130183199901011234 → Masked: 130183****1234
80
80
  ```
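
The masking rule stated in the requirement (first 6 characters, four asterisks, last 4 characters) can be modelled outside the database to check expected values. The following is a minimal Python sketch of that rule only; the function name `mask_id_card` and the handling of short inputs are assumptions for illustration, and it is not the Lakehouse masking function itself.

```python
def mask_id_card(value: str) -> str:
    """Keep the first 6 and last 4 characters, with four asterisks in between."""
    if len(value) <= 10:
        # Assumption: inputs too short to expose both ends are fully masked.
        return "*" * len(value)
    return value[:6] + "****" + value[-4:]


print(mask_id_card("130183199901011234"))  # → 130183****1234
```

A helper like this is useful for generating expected values when verifying a policy's query results.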
81
81
 
82
- ### 3.2 Dynamic Masking Based on User
82
+ ### 2 Dynamic Masking Based on User
83
83
 
84
84
  **Requirement**: Only the `UAT_TEST` user should see masked data.
85
85
 
@@ -104,7 +104,7 @@ CASE
104
104
  END;
105
105
  ```
106
106
 
107
- ### 3.3 Dynamic Masking Based on Role
107
+ ### 3 Dynamic Masking Based on Role
108
108
 
109
109
  **Requirement**: Users with the `user_admin` role can view the full information.
110
110
 
@@ -121,9 +121,9 @@ END;
121
121
 
122
122
  ***
123
123
 
124
- ## 4. Complete Operation Example
124
+ ## Complete Operation Example
125
125
 
126
- ### 4.1 Initializing the Environment
126
+ ### 1 Initializing the Environment
127
127
 
128
128
  ```sql
129
129
  CREATE SCHEMA IF NOT EXISTS security_demo;
@@ -144,7 +144,7 @@ INSERT INTO security_demo.user_data VALUES ('James', '123-45-6789', '123456789')
144
144
  SELECT * FROM security_demo.user_data;
145
145
  ```
146
146
 
147
- ### 4.2 Creating Policy Functions
147
+ ### 2 Creating Policy Functions
148
148
 
149
149
  ```sql
150
150
  -- Exemption for privileged roles
@@ -157,7 +157,7 @@ CASE
157
157
  END;
158
158
  ```
159
159
 
160
- ### 4.3 Modifying Masking Policies
160
+ ### 3 Modifying Masking Policies
161
161
 
162
162
  ```sql
163
163
  -- Removing the previous policy
@@ -167,7 +167,7 @@ ALTER TABLE security_demo.user_data CHANGE COLUMN ssn UNSET MASK;
167
167
  ALTER TABLE security_demo.user_data CHANGE COLUMN ssn SET MASK security_demo.admin_ssn_mask;
168
168
  ```
169
169
 
170
- ### 4.4 Verifying the Effect
170
+ ### 4 Verifying the Effect
171
171
 
172
172
  **Query by a Regular User**:
173
173
 
@@ -185,18 +185,18 @@ SELECT * FROM user_data;
185
185
 
186
186
  ***
187
187
 
188
- ## 5. Management Notes
188
+ ## Management Notes
189
189
 
190
- ### 5.1 Permission Control
190
+ ### 1 Permission Control
191
191
 
192
- - Only roles with `ALTER TABLE` permissions are allowed to modify masking policies.
193
- - Function creation requires `CREATE FUNCTION` permissions.
192
+ * Only roles with `ALTER TABLE` permissions are allowed to modify masking policies.
193
+ * Function creation requires `CREATE FUNCTION` permissions.
194
194
 
195
- ### 5.2 Performance Recommendations
195
+ ### 2 Performance Recommendations
196
196
 
197
- - Avoid using complex calculations in masking functions.
198
- - Use conditional logic cautiously for columns with high query frequency.
197
+ * Avoid using complex calculations in masking functions.
198
+ * Use conditional logic cautiously for columns with high query frequency.
199
199
 
200
- ## 6. Limitations
200
+ ## Limitations
201
201
 
202
- - Only one masking policy can be bound to a single column. If you want to define multiple masking rules, you can use conditional logic within a single function to apply different policies.
202
+ * Only one masking policy can be bound to a single column. If you want to define multiple masking rules, you can use conditional logic within a single function to apply different policies.
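
As a rough illustration of combining several rules in one function, the Python sketch below models the branching that a single masking function would carry. The role name `user_admin` and user name `UAT_TEST` come from the examples in this guide; the function name `mask_value`, the rule order, and the specific masking formats are hypothetical and only show the shape of the conditional logic.

```python
from typing import Sequence


def mask_value(value: str, user: str, roles: Sequence[str]) -> str:
    """Apply several masking rules in priority order inside one function."""
    # Rule 1: a privileged role sees the original value.
    if "user_admin" in roles:
        return value
    # Rule 2 (hypothetical): a named test user sees a fully masked value.
    if user == "UAT_TEST":
        return "*" * len(value)
    # Default rule (hypothetical): everyone else sees only the last 4 characters.
    if len(value) <= 4:
        return "*" * len(value)
    return "*" * (len(value) - 4) + value[-4:]


print(mask_value("123-45-6789", "alice", ["user_admin"]))
print(mask_value("123-45-6789", "bob", []))
```

Because only one policy can be bound per column, the order of the branches decides which rule wins when a caller matches more than one.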
@@ -77,7 +77,7 @@ Lakehouse currently uses a scheduling mechanism to update Dynamic Table. The fol
77
77
 
78
78
  | | Usage | Advantages | Disadvantages |
79
79
  | ------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
80
- | Define scheduling attributes in DDL statements | Define the refresh interval in refreshOption, refer to the documentation <https://dev-doc.clickzetta.com/zh-CN/create-dynamic-table> for specific usage. Currently, the refresh interval is limited to one minute. | Simple and easy to use, can quickly set refresh options. Does not rely on any third-party tools. | Currently, Lakehouse does not support defining strict upstream and downstream dependencies on Dynamic Table. In DDL definition, it relies on time scheduling. You can ensure upstream refresh completion before scheduling downstream through time intervals. |
80
+ | Define scheduling attributes in DDL statements | Define the refresh interval in refreshOption, refer to the documentation <create-dynamic-table.md> for specific usage. Currently, the refresh interval is limited to one minute. | Simple and easy to use, can quickly set refresh options. Does not rely on any third-party tools. | Currently, Lakehouse does not support defining strict upstream and downstream dependencies on Dynamic Table. In DDL definition, it relies on time scheduling. You can ensure upstream refresh completion before scheduling downstream through time intervals. |
81
81
  | Define scheduling in Lakehouse Studio | You can configure scheduling through a visual interface in Lakehouse Studio, refer to [Task Development Scheduling Documentation](taskdevelop.md) for specific configuration methods. Currently, the refresh interval is limited to one minute. | Visual configuration, user-friendly. Supports scheduling dependency configuration to ensure downstream refresh after upstream refresh completion. Supports single-node operation monitoring, such as failure alerts, timeout alerts, etc. | |
82
82
  | Submitting Refresh Jobs Using Third-Party Scheduling Engines | Download the Lakehouse command-line client and use cron expressions to schedule Refresh tasks. Alternatively, use the Java JDBC interface to customize the submission of Refresh commands. | More flexible control over job submission and scheduling configuration, with no time interval restrictions | Depends on a third-party scheduling system |
83
83
 
@@ -75,7 +75,7 @@ Lakehouse currently uses a scheduling mechanism to update Dynamic Tables. The fo
75
75
 
76
76
  | | Usage | Advantages | Disadvantages |
77
77
  | ------------------------ | ------------------------------------------------------------------------------------------------------- | --------------------------------------------------------- | ------------------------------------------------------------------------------------- |
78
- | Define scheduling properties in DDL statements | Define the refresh interval in refreshOption, refer to the documentation https\://dev-doc.clickzetta.com/zh-CN/create-dynamic-table for specific usage. Currently, the refresh interval is limited to one minute. | Simple and easy to use, can quickly set refresh options. Does not rely on any third-party tools. | Currently, Lakehouse does not support defining strict upstream and downstream dependencies on Dynamic Tables. In DDL definitions, it relies on time scheduling. You can ensure the upstream refresh is completed before scheduling the downstream through time intervals. |
78
+ | Define scheduling properties in DDL statements | Define the refresh interval in refreshOption, refer to the documentation create-dynamic-table.md for specific usage. Currently, the refresh interval is limited to one minute. | Simple and easy to use, can quickly set refresh options. Does not rely on any third-party tools. | Currently, Lakehouse does not support defining strict upstream and downstream dependencies on Dynamic Tables. In DDL definitions, it relies on time scheduling. You can ensure the upstream refresh is completed before scheduling the downstream through time intervals. |
79
79
  | Define scheduling in Lakehouse Studio | You can configure scheduling through a visual interface in Lakehouse Studio, refer to the [task development scheduling documentation](taskdevelop.md) for specific configuration methods. Currently, the refresh interval is limited to one minute. | Visual configuration, user-friendly. Supports scheduling dependency configuration to ensure the upstream refresh is completed before refreshing the downstream. Supports single-node operation monitoring, such as failure alerts, timeout alerts, etc. | |
80
80
  | Submitting Refresh Jobs Using Third-Party Scheduling Engines | Download the Lakehouse command-line client and use cron expressions to schedule Refresh tasks. Alternatively, use the Java JDBC interface to customize the submission of Refresh commands. | More flexible control over job submission and scheduling configuration, with no time interval restrictions | Depends on a third-party scheduling system |
81
81
 
@@ -1,6 +1,6 @@
1
1
  # Dynamic Table Overview
2
2
 
3
- > [Preview Release] This feature is currently in an invited preview stage. If you need to use it, please contact our technical support team for assistance.
3
+
4
4
 
5
5
  ## What is a Dynamic Table
6
6
 
@@ -99,6 +99,6 @@ if st.button('Generate Chart') and sql:
99
99
  ## Reference Resources
100
100
 
101
101
  * [Streamlit Official Documentation](https://docs.streamlit.io/library/get-started)
102
- * [Singdata Lakehouse Official Documentation](https://docs.singdata.com/lakehouse/)
102
+ * [Singdata Lakehouse Official Documentation](https://singdata.com/documents)
103
103
 
104
104
  ^
@@ -22,7 +22,7 @@ Open the browser and go to http://localhost:8088.
22
22
  ### Local Installation
23
23
 
24
24
  1. Install `clickzetta-sqlalchemy`:
25
- `clickzetta-sqlalchemy` needs to be installed in an environment with Python version 3.7 or above.
25
+ `clickzetta-sqlalchemy` needs to be installed in an environment with Python version 3.10 or above.
26
26
  Installation command (if `clickzetta-sqlalchemy` or `clickzetta-connector` is already installed in the current environment, the command below uninstalls both first to avoid dependency conflicts):
27
27
  ```
28
28
  pip uninstall -y clickzetta-sqlalchemy clickzetta-connector && pip install clickzetta-connector -U
@@ -76,7 +76,6 @@ See [JDBC Driver](jdbc-driver.md) for details.
76
76
  |--------|-------------|-----------|
77
77
  | Apache Spark | Read and write Singdata Lakehouse tables via the Spark Connector; supports the DataFrame API and spark-sql | [Spark Connector](spark-connector-summary.md) |
78
78
  | Apache Flink | Write to Singdata Lakehouse via the Flink Connector; supports CDC scenarios and append-only mode; sink tables only (write) | [Flink Connector](flink-write-connector.md) |
79
- | Trino | Query Singdata Lakehouse data via Trino federated queries | [Trino Integration Guide](eco_integration/trino.md) |
80
79
 
81
80
  **Two Flink Connector modes**:
82
81
  - `igs-dynamic-table`: supports CDC (insert / update / delete); the target table must have a primary key
@@ -116,6 +115,5 @@ What is your use case?
116
115
  │ └── Python applications → Python SDK
117
116
  └── Compute engine
118
117
  ├── Batch processing / ML → Spark Connector
119
- ├── Stream processing / CDC → Flink Connector
120
- └── Federated queries → Trino
118
+ └── Stream processing / CDC → Flink Connector
121
119
  ```
@@ -0,0 +1,145 @@
1
+ # Ecosystem
2
+
3
+ Singdata Lakehouse is compatible with mainstream data integration, BI, AI, and development tools, and is deployed on seven public clouds including Alibaba Cloud, Tencent Cloud, and AWS. This document summarizes verified third-party tools and connection solutions organized by category.
4
+
5
+ If the tool you need is not on the list, that does not mean it is unsupported — Lakehouse provides standard access via JDBC, MySQL protocol, and Python/Java SDKs, and any tool compatible with these protocols can connect directly. If you want to develop a new connector or integration solution based on Lakehouse, feel free to contact our partner team.
6
+
7
+ ## Cloud Platforms (CSP)
8
+
9
+ Lakehouse is deployed on seven clouds: Alibaba Cloud, Tencent Cloud, AWS, GCP, Huawei Cloud, Baidu AI Cloud, and Volcengine. Alibaba Cloud, Tencent Cloud, and AWS provide complete dedicated documentation (including storage connections, private network connections, and permission configuration); the configuration approach is consistent across all other cloud platforms. BYOS (Bring Your Own Storage) deployment is also supported — data is stored under the user's own cloud account and does not pass through the Singdata platform. See [Supported Cloud Platforms](supported-cloud-platforms.md) and [Private Storage Overview](byos_general.md) for details.
10
+
11
+ ***
12
+
13
+ ## Data Integration
14
+
15
+ The following data integration tools are compatible with Lakehouse, covering offline batch, real-time CDC, message streaming, and log collection scenarios. Lakehouse also supports [50+ data sources](data-sources.md) (MySQL, Oracle, PostgreSQL, MongoDB, Hive, MaxCompute, etc.) via Studio Data Sync for direct access without third-party tools:
16
+
17
+ | Tool | Connection | Description | Reference |
18
+ | ------------ | ------------------ | ------------------------------------------------------------- | ------------------------------------------------------------ |
19
+ | Apache Kafka | Kafka Connector | Real-time message stream writing to Lakehouse | [Kafka Data Source](DataSource_Kafka.md) |
20
+ | AutoMQ | Kafka Protocol | Next-generation message queue, compatible with Kafka protocol | [AutoMQ Data Source](DataSource_AutoMQ.md) |
21
+ | Airbyte | JDBC | Open-source ELT platform with a rich connector ecosystem | [Airbyte Integration Guide](airbyte.md) |
22
+ | DataX | Plugin-based | Alibaba open-source tool, suitable for batch data sync | [DataX Integration Guide](eco_integration/datax.md) |
23
+ | Apache Flink | Flink Connector | Stream processing engine for real-time writes to Lakehouse | [Flink Connector](flink-write-connector.md) |
24
+ | Apache Spark | Spark Connector | Large-scale data reads and writes for Lakehouse tables | [Spark Connector](spark-connector-summary.md) |
25
+ | Logstash | Logstash Connector | Import log data into Lakehouse | [Logstash Integration Guide](Logstash.md) |
26
+ | Bluepipe | Native integration | Real-time CDC sync from Oracle to Lakehouse | [Bluepipe Sync Guide](bluepipe-oracle-lakehouse-datasync.md) |
27
+
28
+ ***
29
+
30
+ ## BI and Visualization
31
+
32
+ The following BI tools are compatible with Lakehouse. Any BI tool supporting JDBC, ODBC, or MySQL protocol can connect directly and is not limited to the list below:
33
+
34
+ | Tool | Connection | Description | Reference |
35
+ | --------------- | -------------- | ---------------------------------------------------------------- | ---------------------------------------------------------------- |
36
+ | FineBI | JDBC / MySQL | Leading domestic BI tool | [JDBC Connection](FineBI.md) · [MySQL Protocol](finebi-mysql.md) |
37
+ | Tableau | JDBC | Suitable for complex visualizations and exploratory analysis | [Tableau Connection Guide](tableau-connect-to-lakehouse.md) |
38
+ | Power BI | MySQL Protocol | Connect via MySQL protocol | [Power BI Connection Guide](PowerBI.md) |
39
+ | Apache Superset | SQLAlchemy | Open-source, suitable for self-service analytics | [Superset Connection Guide](eco_integration/superset.md) |
40
+ | Metabase | JDBC | Open-source, easy to deploy, suitable for small and medium teams | [Metabase Connection Guide](metabase.md) |
41
+ | Apache Zeppelin | JDBC | Notebook-style data exploration | [Zeppelin Connection Guide](eco_integration/Zeppelin.md) |
42
+ | Rath | JDBC | Open-source intelligent analytics with automatic insight support | [Rath Connection Guide](eco_integration/rath.md) |
43
+ | Streamlit | Python SDK | Rapidly build data apps for data science teams | [Streamlit Connection Guide](eco_integration/streamlit.md) |
44
+
45
+ ***
46
+
47
+ ## Transformation and Compute Engines
48
+
49
+ The following data transformation tools and compute engines are compatible with Lakehouse:
50
+
51
+ | Tool | Connection | Description | Reference |
52
+ | ------------ | ---------------------- | ------------------------------------------------------------------------ | ----------------------------------------------- |
53
+ | dbt | dbt-clickzetta adapter | Data modeling and transformation, supports Dynamic Table materialization | [dbt Integration Guide](eco_integration/dbt.md) |
54
+ | Apache Spark | Spark Connector | Large-scale batch processing and machine learning | [Spark Connector](spark-connector-summary.md) |
55
+ | Apache Flink | Flink Connector | Real-time stream processing | [Flink Connector](flink-write-connector.md) |
56
+
57
+ The **dbt documentation series** covers all scenarios from quick start to migration practice: jaffle-shop experience, Snowflake/BigQuery migration, incremental processing, real-time pipelines, and data quality testing. See [DBT Practice Series](dbt-practice-series.md).
58
+
59
+ ***
60
+
61
+ ## AI and Machine Learning
62
+
63
+ The following AI frameworks and platforms are compatible with Lakehouse, supporting vector storage, RAG applications, and AI workflow scenarios:
64
+
65
+ | Tool | Integration | Description | Reference |
66
+ | --------------- | ---------------- | ------------------------------------------------ | -------------------------------------------------------------------------- |
67
+ | LangChain | Python SDK | Vector storage and RAG application development | [LangChain Integration](langchain_integration.md) |
68
+ | LlamaIndex | Python SDK | Data indexing and retrieval | [LlamaIndex Integration](llama-index.md) |
69
+ | Dify | MCP Server / SDK | Vector database + file storage | [Dify Integration Overview](dify_yunqilakehouse_integration_overview.md) |
70
+ | N8N | MCP Server | Unified AI workflows | [N8N Integration](N8N_AI_Workflow_Integration.md) |
71
+ | MindsDB | JDBC | ML/LLM modeling and prediction on Lakehouse data | [MindsDB Integration](JDBC_MindsDB_ML_LLM.md) |
72
+ | Datus | MCP Server | Data engineering agent | [Datus Integration](Datus_Lakehouse_Integrated_Guide.md) |
73
+ | Zilliz | Joint solution | Vector database joint solution | [Zilliz Joint Solution](lakehouse-zilliz-make-data-ready-for-bi-and-ai.md) |
74
+ | Unstructured.io | SDK | Unstructured document parsing and vectorization | [Unstructured.io Integration](unstructured-io.md) |
75
+
76
+ Lakehouse also provides an [MCP Server](LakehouseMCPServer.md) that can be called by any AI Agent supporting the MCP protocol.
77
+
78
+ ***
79
+
80
+ ## Programmatic Interfaces
81
+
82
+ Lakehouse provides the following native programming interfaces and SDKs:
83
+
84
+ | Interface | Language | Description | Reference |
85
+ | -------------- | ---------- | ----------------------------------------------------------- | ---------------------------------------------------------- |
86
+ | JDBC Driver | Java / JVM | Standard JDBC interface, compatible with all JVM ecosystems | [JDBC Driver](JDBC-Driver.md) |
87
+ | MySQL Protocol | All | No client dependency, compatible with MySQL ecosystem | [MySQL Protocol Connection](use-mysql-client.md) |
88
+ | Python SDK | Python | PEP 249 compatible, supports batch/real-time writes | [Python SDK](python_reference/python-sdk-summary.md) |
89
+ | Java SDK | Java | Supports BulkLoad and real-time stream writes | [Java SDK Batch Upload](use-java-sdk-upload-data-local.md) |
90
+ | SQLAlchemy | Python | Standard Python ORM / SQL toolkit | [SQLAlchemy Connection](sqlalchemy.md) |
91
+ | cz-cli | Shell | Command-line client: SQL + Studio Tasks + AI Agent | [cz-cli Guide](cz-cli.md) |
92
+
93
+ ***
94
+
95
+ ## SQL Clients and Database Management Tools
96
+
97
+ These tools connect via JDBC or MySQL protocol, compatible with standard SQL operations:
98
+
99
+ | Tool | Connection | Description | Reference |
100
+ | --------------- | -------------- | ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------- |
101
+ | DBeaver | JDBC | Open-source and free, community edition is sufficient, suitable for daily queries and data exploration | [DBeaver Connection Guide](eco_integration/dbeaver-lakehouse.md) |
102
+ | DataGrip | JDBC | JetBrains product with strong code completion and SQL analysis | [DataGrip Connection Guide](eco_integration/datagrip-lakehouse.md) |
103
+ | SQL Workbench/J | JDBC | Lightweight, basic SQL execution | [SQL Workbench/J Connection Guide](eco_integration/sqlworkbench-j-lakehouse.md) |
104
+ | Navicat | MySQL Protocol | Visual management with intuitive operations | [Navicat Connection Guide](navicat-mysql.md) |
105
+
106
+ ***
107
+
108
+ ## Data Lake Formats
109
+
110
+ Lakehouse is **natively based on Apache Iceberg** — tables are stored in Iceberg format, supporting time travel, partition evolution, schema evolution, and cross-engine access. Delta Lake and Hudi formats are also supported via external tables:
111
+
112
+ | Format | Relationship | Description | Reference |
113
+ | -------------- | -------------- | --------------------------------------------------------------- | ------------------------------------------------------------ |
114
+ | Apache Iceberg | Native format | Underlying format for all Lakehouse tables, cross-engine access | [Spark + Iceberg Analytics](spark-lakehouse-iceberg-rest.md) |
115
+ | Delta Lake | External table | Open table format from the Databricks ecosystem | [Delta Lake External Table](delta-lake.md) |
116
+ | Apache Hudi | External table | Open table format optimized for streaming writes | [Hudi External Table](external-hudi-table.md) |
117
+
118
+ **Federated Queries**: Query Iceberg tables in Hive, Databricks, and Snowflake OpenCatalog directly via External Catalog, without data migration. See [Federated Query](federation-query.md).
119
+
120
+ ***
121
+
122
+ ## Modern Data Stack
123
+
124
+ The following solution combinations show how to build a complete data platform using Lakehouse and ecosystem tools:
125
+
126
+ | Solution | Toolchain | Reference |
127
+ | ------------------ | ------------------------------------ | --------------------------------------------------------------------------- |
128
+ | ELT-oriented | Airbyte → Lakehouse → dbt → Metabase | [ELT Modern Data Stack](ELTModernDataStack.md) |
129
+ | Analytics-oriented | Lakehouse ← dbt → Superset | [Analytics Modern Data Stack](analytics-modern-data-stack.md) |
130
+ | BI + AI | Lakehouse + Zilliz | [BI + AI Joint Solution](lakehouse-zilliz-make-data-ready-for-bi-and-ai.md) |
131
+
132
+ ***
133
+
134
+ > 💡 **Tip**: The list above contains verified and compatible third-party tools. Lakehouse provides standard access via JDBC, MySQL protocol, and Python/Java SDKs — any tool compatible with these protocols can be used directly. If the tool you need is not on the list, it can still connect normally.
135
+
136
+ ## Quick Navigation
137
+
138
+ * **Understand product concepts**: [Key Concepts](key-concepts.md) · [Incremental Computing](incremental-computing.md)
139
+ * **Start ingesting data**: [Data Integration](#data-integration) · [50+ Data Source Support](data-sources.md)
140
+ * **Connect BI tools**: [BI and Visualization](#bi-and-visualization)
141
+ * **Data modeling**: [dbt Integration Guide](eco_integration/dbt.md) · [DBT Practice Series](dbt-practice-series.md)
142
+ * **Programmatic access**: [Programmatic Interfaces](#programmatic-interfaces)
143
+ * **AI application development**: [AI and Machine Learning](#ai-and-machine-learning)
144
+
145
+ ^