@clickzetta/cz-cli-linux-x64 0.3.4 → 0.3.5

Files changed (118)
  1. package/bin/cz-cli +0 -0
  2. package/package.json +1 -1
  3. package/bin/skills/clickzetta-access-control/SKILL.md +0 -243
  4. package/bin/skills/clickzetta-access-control/references/dynamic-masking.md +0 -86
  5. package/bin/skills/clickzetta-access-control/references/grant-revoke.md +0 -103
  6. package/bin/skills/clickzetta-access-control/references/role-management.md +0 -66
  7. package/bin/skills/clickzetta-access-control/references/user-management.md +0 -61
  8. package/bin/skills/clickzetta-ai-vector-search/SKILL.md +0 -160
  9. package/bin/skills/clickzetta-ai-vector-search/references/vector-search.md +0 -155
  10. package/bin/skills/clickzetta-app-python-sdk/SKILL.md +0 -153
  11. package/bin/skills/clickzetta-app-python-sdk/references/bulkload.md +0 -196
  12. package/bin/skills/clickzetta-app-python-sdk/references/connector.md +0 -143
  13. package/bin/skills/clickzetta-app-python-sdk/references/realtime.md +0 -122
  14. package/bin/skills/clickzetta-batch-sync-pipeline/SKILL.md +0 -293
  15. package/bin/skills/clickzetta-bi-connect/SKILL.md +0 -176
  16. package/bin/skills/clickzetta-bi-connect/references/bi-tools.md +0 -170
  17. package/bin/skills/clickzetta-cdc-sync-pipeline/SKILL.md +0 -457
  18. package/bin/skills/clickzetta-concepts/SKILL.md +0 -282
  19. package/bin/skills/clickzetta-concepts/references/brands-and-endpoints.md +0 -79
  20. package/bin/skills/clickzetta-concepts/references/object-model.md +0 -311
  21. package/bin/skills/clickzetta-data-ingest-pipeline/SKILL.md +0 -165
  22. package/bin/skills/clickzetta-data-lifecycle/SKILL.md +0 -211
  23. package/bin/skills/clickzetta-data-lifecycle/references/lifecycle-reference.md +0 -175
  24. package/bin/skills/clickzetta-data-recovery/SKILL.md +0 -215
  25. package/bin/skills/clickzetta-data-recovery/evals/evals.json +0 -35
  26. package/bin/skills/clickzetta-data-science/SKILL.md +0 -125
  27. package/bin/skills/clickzetta-data-science/references/bitmap-profile.md +0 -146
  28. package/bin/skills/clickzetta-data-science/references/data-patterns.md +0 -110
  29. package/bin/skills/clickzetta-data-science/references/setup.md +0 -160
  30. package/bin/skills/clickzetta-data-science/references/stats-functions.md +0 -195
  31. package/bin/skills/clickzetta-data-science/references/write-and-infer.md +0 -122
  32. package/bin/skills/clickzetta-data-science/references/zettapark-api.md +0 -156
  33. package/bin/skills/clickzetta-data-sharing/SKILL.md +0 -160
  34. package/bin/skills/clickzetta-data-sharing/references/share-ddl.md +0 -134
  35. package/bin/skills/clickzetta-dba-guide/SKILL.md +0 -540
  36. package/bin/skills/clickzetta-dw-modeling/SKILL.md +0 -259
  37. package/bin/skills/clickzetta-dw-modeling/references/modeling-patterns.md +0 -100
  38. package/bin/skills/clickzetta-dynamic-table/SKILL.md +0 -112
  39. package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +0 -257
  40. package/bin/skills/clickzetta-dynamic-table/best-practices/medallion-and-stream-patterns.md +0 -124
  41. package/bin/skills/clickzetta-dynamic-table/best-practices/non-partitioned-merge-into-warning.md +0 -96
  42. package/bin/skills/clickzetta-dynamic-table/best-practices/performance-optimization.md +0 -109
  43. package/bin/skills/clickzetta-dynamic-table/dt-creator/SKILL.md +0 -15
  44. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +0 -185
  45. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +0 -429
  46. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +0 -268
  47. package/bin/skills/clickzetta-dynamic-table/dt-creator/references/sql-limitations.md +0 -80
  48. package/bin/skills/clickzetta-dynamic-table/dynamic-table-alter/SKILL.md +0 -190
  49. package/bin/skills/clickzetta-external-catalog/SKILL.md +0 -120
  50. package/bin/skills/clickzetta-external-catalog/references/external-catalog-ddl.md +0 -130
  51. package/bin/skills/clickzetta-external-function/SKILL.md +0 -203
  52. package/bin/skills/clickzetta-external-function/references/external-function-ddl.md +0 -171
  53. package/bin/skills/clickzetta-file-import-pipeline/SKILL.md +0 -156
  54. package/bin/skills/clickzetta-index-manager/SKILL.md +0 -140
  55. package/bin/skills/clickzetta-index-manager/references/bloomfilter-index.md +0 -67
  56. package/bin/skills/clickzetta-index-manager/references/index-management.md +0 -73
  57. package/bin/skills/clickzetta-index-manager/references/inverted-index.md +0 -80
  58. package/bin/skills/clickzetta-index-manager/references/vector-index.md +0 -81
  59. package/bin/skills/clickzetta-information-schema/SKILL.md +0 -367
  60. package/bin/skills/clickzetta-information-schema/references/instance-views-reference.md +0 -276
  61. package/bin/skills/clickzetta-information-schema/references/metering-views-reference.md +0 -137
  62. package/bin/skills/clickzetta-information-schema/references/views-reference.md +0 -271
  63. package/bin/skills/clickzetta-java-sdk/SKILL.md +0 -186
  64. package/bin/skills/clickzetta-java-sdk/references/bulkload.md +0 -163
  65. package/bin/skills/clickzetta-java-sdk/references/realtime.md +0 -212
  66. package/bin/skills/clickzetta-kafka-ingest-pipeline/SKILL.md +0 -639
  67. package/bin/skills/clickzetta-kafka-ingest-pipeline/references/kafka-pipe-syntax.md +0 -324
  68. package/bin/skills/clickzetta-lakehouse-connect/SKILL.md +0 -218
  69. package/bin/skills/clickzetta-lakehouse-connect/evals/evals.json +0 -35
  70. package/bin/skills/clickzetta-lakehouse-connect/references/config-file.md +0 -435
  71. package/bin/skills/clickzetta-lakehouse-connect/references/jdbc.md +0 -478
  72. package/bin/skills/clickzetta-lakehouse-connect/references/python-sdk.md +0 -225
  73. package/bin/skills/clickzetta-lakehouse-connect/references/sqlalchemy.md +0 -468
  74. package/bin/skills/clickzetta-lakehouse-connect/references/zettapark-session.md +0 -445
  75. package/bin/skills/clickzetta-manage-comments/SKILL.md +0 -219
  76. package/bin/skills/clickzetta-metadata-query/SKILL.md +0 -298
  77. package/bin/skills/clickzetta-metadata-query/references/show-desc-reference.md +0 -326
  78. package/bin/skills/clickzetta-monitoring/SKILL.md +0 -199
  79. package/bin/skills/clickzetta-monitoring/references/job-history-analysis.md +0 -97
  80. package/bin/skills/clickzetta-monitoring/references/show-jobs.md +0 -48
  81. package/bin/skills/clickzetta-oss-ingest-pipeline/SKILL.md +0 -427
  82. package/bin/skills/clickzetta-query-optimizer/SKILL.md +0 -156
  83. package/bin/skills/clickzetta-query-optimizer/references/explain.md +0 -56
  84. package/bin/skills/clickzetta-query-optimizer/references/hints-and-sortkey.md +0 -78
  85. package/bin/skills/clickzetta-query-optimizer/references/optimize.md +0 -65
  86. package/bin/skills/clickzetta-query-optimizer/references/result-cache.md +0 -49
  87. package/bin/skills/clickzetta-query-optimizer/references/show-jobs.md +0 -42
  88. package/bin/skills/clickzetta-realtime-sync-pipeline/SKILL.md +0 -197
  89. package/bin/skills/clickzetta-semantic-view/SKILL.md +0 -207
  90. package/bin/skills/clickzetta-semantic-view/references/semantic-view-reference.md +0 -167
  91. package/bin/skills/clickzetta-spark-flink-connector/SKILL.md +0 -92
  92. package/bin/skills/clickzetta-spark-flink-connector/references/flink.md +0 -147
  93. package/bin/skills/clickzetta-spark-flink-connector/references/spark.md +0 -132
  94. package/bin/skills/clickzetta-sql-pipeline-manager/SKILL.md +0 -379
  95. package/bin/skills/clickzetta-sql-pipeline-manager/evals/evals.json +0 -166
  96. package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md +0 -185
  97. package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md +0 -129
  98. package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md +0 -222
  99. package/bin/skills/clickzetta-sql-pipeline-manager/references/table-stream.md +0 -125
  100. package/bin/skills/clickzetta-sql-syntax-guide/SKILL.md +0 -172
  101. package/bin/skills/clickzetta-sql-syntax-guide/references/ddl-reference.md +0 -350
  102. package/bin/skills/clickzetta-sql-syntax-guide/references/dml-reference.md +0 -279
  103. package/bin/skills/clickzetta-sql-syntax-guide/references/dql-reference.md +0 -504
  104. package/bin/skills/clickzetta-sql-syntax-guide/references/functions-reference.md +0 -372
  105. package/bin/skills/clickzetta-sql-syntax-guide/references/migration-databricks.md +0 -260
  106. package/bin/skills/clickzetta-sql-syntax-guide/references/migration-snowflake.md +0 -382
  107. package/bin/skills/clickzetta-sql-syntax-guide/references/vs-snowflake.md +0 -346
  108. package/bin/skills/clickzetta-sql-syntax-guide/references/vs-spark.md +0 -229
  109. package/bin/skills/clickzetta-studio-overview/SKILL.md +0 -170
  110. package/bin/skills/clickzetta-studio-overview/references/studio-modules.md +0 -173
  111. package/bin/skills/clickzetta-table-stream-pipeline/SKILL.md +0 -206
  112. package/bin/skills/clickzetta-vcluster-manager/SKILL.md +0 -212
  113. package/bin/skills/clickzetta-vcluster-manager/references/vc-cache.md +0 -54
  114. package/bin/skills/clickzetta-vcluster-manager/references/vcluster-ddl.md +0 -150
  115. package/bin/skills/clickzetta-volume-manager/SKILL.md +0 -292
  116. package/bin/skills/clickzetta-volume-manager/references/volume-ddl.md +0 -199
  117. package/bin/skills/clickzetta-zettapark/SKILL.md +0 -248
  118. package/bin/skills/clickzetta-zettapark/references/zettapark-api.md +0 -283
@@ -1,212 +0,0 @@
1
- ---
2
- name: clickzetta-vcluster-manager
3
- description: |
4
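- Manages the full lifecycle of ClickZetta Lakehouse compute clusters (VCluster).
5
- Covers creation (general-purpose / analytics / integration), start/stop, resizing, elastic scaling,
6
- cache configuration (PRELOAD_TABLES), and viewing cluster status.
7
- Triggered when the user says "create a cluster", "compute cluster", "VCluster", "start the cluster", "stop the cluster",
8
- "resize the cluster", "scale out the cluster", "scale in the cluster", "auto-suspend", "auto-resume",
9
- "preload cache", "PRELOAD", "cluster type", "GP cluster", "AP cluster", "analytics cluster",
10
- "general-purpose cluster", or "integration cluster".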
11
- Keywords: VCluster, compute cluster, create, suspend, resume, resize, auto-scale
12
- ---
13
-
14
- # ClickZetta Compute Cluster Management
15
-
16
- See [references/vcluster-ddl.md](references/vcluster-ddl.md) for the full syntax.
17
-
18
- ## Choosing a Cluster Type
19
-
20
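- | Type | Keyword | Use cases | Scaling mode |
21
- |---|---|---|---|
22
- | General-purpose (GP) | `GENERAL` | Offline ETL, data ingestion, mixed workloads | Vertical (cluster size) |
23
- | Analytics (AP) | `ANALYTICS` | High-concurrency online queries, BI reports, ad hoc | Horizontal (replica count) |
24
- | Integration | `INTEGRATION` | Data integration and sync jobs | Vertical (cluster size) |
25
-
26
- **Size unit**: CRU (Compute Resource Unit)
27
- - General-purpose / integration: 1-256 CRU in steps of 1 (integration additionally supports 0.25 and 0.5)
28
- - Analytics: 1-256 CRU, must be a power of 2 (1, 2, 4, 8, 16...)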
29
-
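The sizing rules above can be checked before a `CREATE VCLUSTER` statement is issued. The following is a minimal Python sketch, assuming the limits are exactly as listed; `is_valid_vcluster_size` is a hypothetical helper for illustration, not part of any ClickZetta SDK.

```python
from fractions import Fraction

# Limits transcribed from the sizing rules above. The helper name and its
# error handling are hypothetical, not part of any ClickZetta SDK.
MIN_CRU, MAX_CRU = 1, 256
INTEGRATION_FRACTIONS = {Fraction(1, 4), Fraction(1, 2)}
KNOWN_TYPES = ("GENERAL", "ANALYTICS", "INTEGRATION")


def is_valid_vcluster_size(vcluster_type: str, size) -> bool:
    """Return True if `size` (in CRU) is allowed for the given cluster type."""
    if vcluster_type not in KNOWN_TYPES:
        raise ValueError(f"unknown VCLUSTER_TYPE: {vcluster_type}")
    cru = Fraction(str(size))
    # Integration clusters additionally accept the two fractional sizes.
    if vcluster_type == "INTEGRATION" and cru in INTEGRATION_FRACTIONS:
        return True
    # Every other accepted size is a whole number between 1 and 256.
    if cru.denominator != 1 or not MIN_CRU <= cru <= MAX_CRU:
        return False
    if vcluster_type == "ANALYTICS":
        n = int(cru)
        return n & (n - 1) == 0  # power of two: exactly one bit set
    return True
```

For example, `is_valid_vcluster_size("ANALYTICS", 6)` is false because 6 is not a power of 2, while `is_valid_vcluster_size("INTEGRATION", 0.5)` is true.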
30
- ---
31
-
32
- ## Creating a Cluster
33
-
34
- ```sql
35
- -- General-purpose: offline ETL, 8 CRU, auto-suspend 60 seconds after jobs finish
36
- CREATE VCLUSTER IF NOT EXISTS etl_vc
37
- VCLUSTER_TYPE = GENERAL
38
- VCLUSTER_SIZE = 8
39
- AUTO_SUSPEND_IN_SECOND = 60
40
- AUTO_RESUME = TRUE
41
- COMMENT 'Dedicated offline ETL cluster';
42
-
43
- -- General-purpose: elastic size (1-16 CRU)
44
- CREATE VCLUSTER IF NOT EXISTS etl_elastic_vc
45
- VCLUSTER_TYPE = GENERAL
46
- MIN_VCLUSTER_SIZE = 1
47
- MAX_VCLUSTER_SIZE = 16
48
- AUTO_SUSPEND_IN_SECOND = 300;
49
-
50
- -- Analytics: BI reports, 4 CRU, 1-10 replicas, up to 80 concurrent queries (10 replicas x 8 per replica)
51
- CREATE VCLUSTER IF NOT EXISTS bi_vc
52
- VCLUSTER_TYPE = ANALYTICS
53
- VCLUSTER_SIZE = 4
54
- MIN_REPLICAS = 1
55
- MAX_REPLICAS = 10
56
- MAX_CONCURRENCY = 8
57
- AUTO_SUSPEND_IN_SECOND = 1800
58
- AUTO_RESUME = TRUE
59
- COMMENT 'Online query cluster for BI reports';
60
-
61
- -- Integration: data integration jobs
62
- CREATE VCLUSTER IF NOT EXISTS sync_vc
63
- VCLUSTER_TYPE = INTEGRATION
64
- VCLUSTER_SIZE = 1
65
- AUTO_RESUME = TRUE;
66
- ```
67
-
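The 80-concurrency figure in the analytics example follows from multiplying the replica ceiling by the concurrency limit. Below is a small Python sketch of that arithmetic, assuming `MAX_CONCURRENCY` applies per replica (as the DDL reference in this package describes it); the function name is hypothetical.

```python
def concurrency_range(min_replicas: int, max_replicas: int, max_concurrency: int) -> tuple[int, int]:
    """Return (floor, ceiling) of concurrent queries for an analytics cluster.

    Assumes MAX_CONCURRENCY is a per-replica limit, so total capacity
    scales linearly with the number of running replicas.
    """
    if not 1 <= min_replicas <= max_replicas:
        raise ValueError("expected 1 <= MIN_REPLICAS <= MAX_REPLICAS")
    return min_replicas * max_concurrency, max_replicas * max_concurrency


# bi_vc above: MIN_REPLICAS = 1, MAX_REPLICAS = 10, MAX_CONCURRENCY = 8
low, high = concurrency_range(1, 10, 8)
print(low, high)
```

With the `bi_vc` settings this gives 8 concurrent queries at the minimum replica count and 80 at the maximum.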
68
- ---
69
-
70
- ## Start / Stop
71
-
72
- ```sql
73
- -- Start the cluster
74
- ALTER VCLUSTER IF EXISTS etl_vc RESUME;
75
-
76
- -- Graceful stop (waits for running jobs to finish)
77
- ALTER VCLUSTER IF EXISTS etl_vc SUSPEND;
78
-
79
- -- Forced stop (interrupts all jobs immediately)
80
- ALTER VCLUSTER IF EXISTS etl_vc SUSPEND FORCE;
81
-
82
- -- Cancel all jobs in the cluster
83
- ALTER VCLUSTER IF EXISTS etl_vc CANCEL ALL JOBS;
84
- ```
85
-
86
- ---
87
-
88
- ## Modifying Cluster Properties
89
-
90
- ```sql
91
- -- Resize the cluster
92
- ALTER VCLUSTER IF EXISTS etl_vc SET VCLUSTER_SIZE = 16;
93
-
94
- -- Change the auto-suspend timeout
95
- ALTER VCLUSTER IF EXISTS etl_vc SET AUTO_SUSPEND_IN_SECOND = 300;
96
-
97
- -- Analytics: adjust replica count and concurrency
98
- ALTER VCLUSTER IF EXISTS bi_vc SET
99
- MIN_REPLICAS = 2
100
- MAX_REPLICAS = 5
101
- MAX_CONCURRENCY = 16;
102
-
103
- -- Change the comment
104
- ALTER VCLUSTER IF EXISTS etl_vc SET COMMENT 'New comment';
105
- ```
106
-
107
- ---
108
-
109
- ## Cache Configuration (Analytics Only)
110
-
111
- See [references/vc-cache.md](references/vc-cache.md) for cache details.
112
-
113
- ```sql
114
- -- Set preload tables (overwrites the list, so include every existing table)
115
- ALTER VCLUSTER bi_vc SET PRELOAD_TABLES = "public.orders,public.customers";
116
-
117
- -- Show the cache status of the current cluster
118
- SHOW PRELOAD CACHED STATUS;
119
-
120
- -- Show the cache status of a specific cluster
121
- SHOW VCLUSTER bi_vc PRELOAD CACHED STATUS;
122
- ```
123
-
124
- ---
125
-
126
- ## Viewing Cluster Information
127
-
128
- ```sql
129
- -- List all clusters
130
- SHOW VCLUSTERS;
131
-
132
- -- Filter by type or state
133
- SHOW VCLUSTERS WHERE vcluster_type = 'ANALYTICS';
134
- SHOW VCLUSTERS WHERE state = 'SUSPENDED';
135
-
136
- -- Match by name pattern
137
- SHOW VCLUSTERS LIKE 'etl%';
138
-
139
- -- Show cluster details
140
- DESC VCLUSTER etl_vc;
141
- DESC VCLUSTER EXTENDED bi_vc;
142
- ```
143
-
144
- ---
145
-
146
- ## Deleting a Cluster
147
-
148
- ```sql
149
- -- Drop after running jobs finish
150
- DROP VCLUSTER IF EXISTS etl_vc;
151
-
152
- -- Force drop immediately (interrupts running jobs)
153
- DROP VCLUSTER IF EXISTS etl_vc FORCE;
154
- ```
155
-
156
- ---
157
-
158
- ## Switching the Current Session's Cluster
159
-
160
- ```sql
161
- USE VCLUSTER bi_vc;
162
- ```
163
-
164
- ---
165
-
166
- ## Typical Scenarios
167
-
168
- ### Scenario 1: Offline ETL Cluster
169
-
170
- ```sql
171
- CREATE VCLUSTER IF NOT EXISTS etl_daily
172
- VCLUSTER_TYPE = GENERAL
173
- VCLUSTER_SIZE = 8
174
- AUTO_SUSPEND_IN_SECOND = 60
175
- AUTO_RESUME = TRUE
176
- COMMENT 'Daily ETL jobs; auto-suspends 1 minute after completion';
177
- ```
178
-
179
- ### Scenario 2: Online BI Reporting Cluster (High Concurrency)
180
-
181
- ```sql
182
- CREATE VCLUSTER IF NOT EXISTS bi_online
183
- VCLUSTER_TYPE = ANALYTICS
184
- VCLUSTER_SIZE = 4
185
- MIN_REPLICAS = 1
186
- MAX_REPLICAS = 10
187
- MAX_CONCURRENCY = 8
188
- AUTO_SUSPEND_IN_SECOND = 1800
189
- AUTO_RESUME = TRUE
190
- COMMENT 'Online BI queries, up to 80 concurrent queries';
191
- ```
192
-
193
- ### Scenario 3: Data Integration Sync Cluster
194
-
195
- ```sql
196
- CREATE VCLUSTER IF NOT EXISTS cdc_sync
197
- VCLUSTER_TYPE = INTEGRATION
198
- VCLUSTER_SIZE = 0.5
199
- AUTO_RESUME = TRUE
200
- COMMENT 'Lightweight CDC sync jobs';
201
- ```
202
-
203
- ---
204
-
205
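- ## FAQ
206
-
207
- | Problem | Cause | Solution |
208
- |---|---|---|
209
- | Analytics cluster size is rejected | Size must be a power of 2 | Use 1, 2, 4, 8, 16, 32... |
210
- | PRELOAD_TABLES has no effect | Only AP clusters support it | Confirm the cluster type is ANALYTICS |
211
- | Existing tables disappear after adding a preload table | PRELOAD_TABLES overwrites the list | Include every existing table when setting it |
212
- | Cache is lost after the cluster stops | The local cache is released when the cluster stops | PRELOAD tables are reloaded automatically after restart |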
@@ -1,54 +0,0 @@
1
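- # Compute Cluster Cache Reference
2
-
3
- > Source: https://www.yunqi.tech/documents/vc_cache
4
-
5
- ## Cache Types
6
-
7
- Lakehouse provides three kinds of cache:
8
- 1. **Query result cache (ResultCache)** - service layer, shared within the workspace
9
- 2. **Metadata cache (MetadataCache)** - service layer, shared within the workspace
10
- 3. **Compute cluster local cache (Local Disk Cache)** - stored on the cluster's local nodes, available only when that cluster is used
11
-
12
- ## Active Caching (PRELOAD_TABLES)
13
-
14
- Applies only to **analytics (AP)** clusters. Each time the cluster starts, it automatically loads the latest data/partitions of the preloaded tables.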
15
-
16
- ```sql
17
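- -- Set preload tables (overwrites the list, so include every existing table)
18
- ALTER VCLUSTER default SET PRELOAD_TABLES = "schema1.table1,schema2.table2";
19
-
20
- -- When adding a table, include the existing ones; otherwise they are overwritten
21
- ALTER VCLUSTER default SET PRELOAD_TABLES = "schema1.table1,schema2.table2,schema3.table3";
22
-
23
- -- Wildcards are supported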
24
- ALTER VCLUSTER bi_vc SET PRELOAD_TABLES = "sales.*,public.dim_date";
25
- ```
26
-
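Because `PRELOAD_TABLES` replaces the whole list, adding a table safely means rebuilding the full value first. The following Python sketch shows one way to do that; `build_preload_statement` is a hypothetical helper, and the current table list is assumed to be known to the caller (for example, copied from the previous `ALTER VCLUSTER` statement).

```python
def build_preload_statement(vcluster: str, existing: list[str], to_add: list[str]) -> str:
    """Build an ALTER VCLUSTER statement that keeps existing preload tables.

    PRELOAD_TABLES overwrites the previous value, so the new value must
    contain the existing entries as well as the additions.
    """
    # dict.fromkeys keeps first-seen order and drops duplicate entries.
    merged = list(dict.fromkeys([*existing, *to_add]))
    value = ",".join(merged)
    return f'ALTER VCLUSTER {vcluster} SET PRELOAD_TABLES = "{value}";'


print(build_preload_statement(
    "default",
    existing=["schema1.table1", "schema2.table2"],
    to_add=["schema3.table3"],
))
```

The printed statement matches the second example in the block above, with all three tables in the list.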
27
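- ⚠️ Note: after the cache command runs, only newly written data is cached.
28
-
29
- ## Passive Caching
30
-
31
- Files read by a query are cached automatically on first access, and later identical queries hit the cache directly. Supported on both GP and AP clusters.
32
-
33
- ## Viewing Cache Status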
34
-
35
- ```sql
36
- -- Show preload table/partition status for the current cluster
37
- SHOW PRELOAD CACHED STATUS;
38
-
39
- -- Show preload status for a specific cluster
40
- SHOW VCLUSTER <vc_name> PRELOAD CACHED STATUS;
41
-
42
- -- Filter by table name
43
- SHOW VCLUSTER <vc_name> PRELOAD CACHED STATUS WHERE table LIKE '%table_name%';
44
-
45
- -- Show a summary of the preload cache
46
- SHOW EXTENDED PRELOAD CACHED STATUS;
47
- ```
48
-
49
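- ## Notes
50
-
51
- - The local cache is released automatically when the cluster stops
52
- - On restart, an AP cluster caches only the most recently written data or partitions
53
- - `SHOW PRELOAD` status may lag by about 10 minutes, although the cache is already in effect
54
- - PRELOAD_TABLES overwrites the list; include every existing table when adding a new one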
@@ -1,150 +0,0 @@
1
- # CREATE / ALTER / DROP VCLUSTER Reference
2
-
3
- > Source: https://www.yunqi.tech/documents/create_cluster, alter-vcluster, and drop-vcluster
4
-
5
- ---
6
-
7
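- ## Choosing a Cluster Type
8
-
9
- | Type | Keyword | Use cases | Scaling mode |
10
- |---|---|---|---|
11
- | General-purpose (GP) | `GENERAL` | Offline ETL, data ingestion, mixed workloads | Vertical (cluster size) |
12
- | Analytics (AP) | `ANALYTICS` | High-concurrency online queries, BI reports, ad hoc | Horizontal (replica count) |
13
- | Integration | `INTEGRATION` | Data integration and sync jobs | Vertical (cluster size) |
14
-
15
- **Size unit**: CRU (Compute Resource Unit)
16
- - General-purpose / integration: 1-256 CRU in steps of 1 (integration additionally supports 0.25 and 0.5)
17
- - Analytics: 1-256 CRU, must be a power of 2 (1, 2, 4, 8, 16...)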
18
-
19
- ---
20
-
21
- ## CREATE VCLUSTER
22
-
23
- ```sql
24
- CREATE VCLUSTER [IF NOT EXISTS] <name>
25
- VCLUSTER_TYPE = GENERAL | ANALYTICS | INTEGRATION
26
- VCLUSTER_SIZE = num -- fixed size
27
- -- or elastic size (general-purpose / integration)
28
- MIN_VCLUSTER_SIZE = num
29
- MAX_VCLUSTER_SIZE = num
30
- AUTO_SUSPEND_IN_SECOND = num -- idle seconds before auto-suspend; -1 disables it; default 600
31
- AUTO_RESUME = TRUE | FALSE -- whether to auto-resume; default TRUE
32
- QUERY_RUNTIME_LIMIT_IN_SECOND = num -- maximum runtime per job in seconds; default 86400
33
- [COMMENT '']
34
- ```
35
-
36
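- ### Analytics-Only Parameters
37
-
38
- ```sql
39
- MIN_REPLICAS = num -- minimum number of instances (1-10), default 1
40
- MAX_REPLICAS = num -- maximum number of instances (1-10), default 1
41
- MAX_CONCURRENCY = num -- maximum concurrency per instance (1-32), default 8
42
- PRELOAD_TABLES = "schema.table1,schema.table2" -- tables to preload into the cache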
43
- ```
44
-
45
- ### Examples
46
-
47
- ```sql
48
- -- General-purpose: offline ETL, 8 CRU, auto-suspend 60 seconds after jobs finish
49
- CREATE VCLUSTER IF NOT EXISTS etl_vc
50
- VCLUSTER_TYPE = GENERAL
51
- VCLUSTER_SIZE = 8
52
- AUTO_SUSPEND_IN_SECOND = 60
53
- AUTO_RESUME = TRUE
54
- COMMENT 'Dedicated offline ETL cluster';
55
-
56
- -- General-purpose: elastic size (1-16 CRU)
57
- CREATE VCLUSTER IF NOT EXISTS etl_elastic_vc
58
- VCLUSTER_TYPE = GENERAL
59
- MIN_VCLUSTER_SIZE = 1
60
- MAX_VCLUSTER_SIZE = 16
61
- AUTO_SUSPEND_IN_SECOND = 300;
62
-
63
- -- Analytics: BI reports, 4 CRU, 1-10 replicas, up to 80 concurrent queries (10 replicas x 8 per replica)
64
- CREATE VCLUSTER IF NOT EXISTS bi_vc
65
- VCLUSTER_TYPE = ANALYTICS
66
- VCLUSTER_SIZE = 4
67
- MIN_REPLICAS = 1
68
- MAX_REPLICAS = 10
69
- MAX_CONCURRENCY = 8
70
- AUTO_SUSPEND_IN_SECOND = 1800
71
- AUTO_RESUME = TRUE
72
- COMMENT 'Online query cluster for BI reports';
73
-
74
- -- Integration: data integration jobs
75
- CREATE VCLUSTER IF NOT EXISTS sync_vc
76
- VCLUSTER_TYPE = INTEGRATION
77
- VCLUSTER_SIZE = 1
78
- AUTO_RESUME = TRUE;
79
- ```
80
-
81
- ---
82
-
83
- ## ALTER VCLUSTER
84
-
85
- ```sql
86
- -- Start the cluster
87
- ALTER VCLUSTER [IF EXISTS] <name> RESUME;
88
-
89
- -- Stop the cluster
90
- ALTER VCLUSTER [IF EXISTS] <name> SUSPEND [FORCE];
91
-
92
- -- Cancel all jobs in the cluster
93
- ALTER VCLUSTER [IF EXISTS] <name> CANCEL ALL JOBS;
94
-
95
- -- Modify properties
96
- ALTER VCLUSTER [IF EXISTS] <name> SET
97
- VCLUSTER_SIZE = num
98
- AUTO_SUSPEND_IN_SECOND = num
99
- AUTO_RESUME = TRUE | FALSE
100
- MAX_CONCURRENCY = num -- analytics only
101
- MIN_REPLICAS = num -- analytics only
102
- MAX_REPLICAS = num -- analytics only
103
- PRELOAD_TABLES = "schema.table";
104
-
105
- -- Change the comment
106
- ALTER VCLUSTER [IF EXISTS] <name> SET COMMENT 'New comment';
107
- ```
108
-
109
- ---
110
-
111
- ## DROP VCLUSTER
112
-
113
- ```sql
114
- -- Drop after running jobs finish
115
- DROP VCLUSTER [IF EXISTS] <name>;
116
-
117
- -- Force drop immediately (interrupts running jobs)
118
- DROP VCLUSTER [IF EXISTS] <name> FORCE;
119
- ```
120
-
121
- ---
122
-
123
- ## DESC / SHOW VCLUSTER
124
-
125
- ```sql
126
- -- Show basic cluster information
127
- DESC VCLUSTER <name>;
128
-
129
- -- Show extended information
130
- DESC VCLUSTER EXTENDED <name>;
131
-
132
- -- List all clusters
133
- SHOW VCLUSTERS;
134
-
135
- -- Filter by type or state
136
- SHOW VCLUSTERS WHERE vcluster_type = 'GENERAL';
137
- SHOW VCLUSTERS WHERE state = 'SUSPENDED';
138
- SHOW VCLUSTERS WHERE vcluster_type = 'ANALYTICS';
139
-
140
- -- Match by name pattern
141
- SHOW VCLUSTERS LIKE 'etl%';
142
- ```
143
-
144
- ---
145
-
146
- ## USE VCLUSTER (Switch the Current Session's Cluster)
147
-
148
- ```sql
149
- USE VCLUSTER <name>;
150
- ```
@@ -1,292 +0,0 @@
1
- ---
2
- name: clickzetta-volume-manager
3
- description: |
4
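- Manages ClickZetta Lakehouse Volume objects to mount object storage (OSS/COS/S3),
5
- query files, and import/export data. Covers external Volume creation (OSS/COS/S3), internal User Volume
6
- file operations (PUT/GET/REMOVE), querying files directly with SELECT FROM VOLUME,
7
- and the complete workflow of importing with COPY INTO TABLE and exporting with COPY INTO VOLUME.
8
- Triggered when the user says "create a Volume", "mount OSS", "mount S3", "mount COS", "Volume management",
9
- "query OSS files", "query S3 files", "upload a file to a Volume", "PUT a file", "GET a file",
10
- "import data from a Volume", "export to a Volume", "COPY INTO VOLUME", "SELECT FROM VOLUME",
11
- "User Volume", "data lake files", "data export", "export data", "export CSV", "export Parquet",
12
- or "COPY OVERWRITE INTO".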
13
- Keywords: Volume, OSS, COS, S3, mount, file query, COPY INTO, external storage
14
- ---
15
-
16
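- # ClickZetta Volume Management
17
-
18
- See [references/volume-ddl.md](references/volume-ddl.md) for the full syntax.
19
-
20
- ## Volume Types
21
-
22
- | Type | Description | Typical use |
23
- |---|---|---|
24
- | External Volume | Mounts an OSS/COS/S3 path | Accessing data already in object storage |
25
- | User Volume | Per-user internal storage | Temporary file uploads, importing local files |
26
- | Table Volume | Internal storage associated with a table | Managing table data files |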
27
-
28
- ---
29
-
30
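- ## Creating an External Volume
31
-
32
- Prerequisite: create a STORAGE CONNECTION first (the object storage credentials).
33
-
34
- > ⚠️ **Cross-cloud restriction**: a Storage Connection must be on the same cloud provider as the Lakehouse instance. An Alibaba Cloud instance cannot create a COS/S3 Connection, and a Tencent Cloud instance cannot create an OSS Connection.
35
-
36
- > ⚠️ **Alibaba Cloud OSS parameter names**:
37
- > - Lowercase form: `access_id` / `access_key` (recommended)
38
- > - Uppercase form: `ACCESS_KEY_ID` / `ACCESS_KEY_SECRET` (also accepted)
39
- > - ⚠️ `ACCESS_KEY` / `SECRET_KEY` raise an error (the `_ID` / `_SECRET` suffixes are missing)
40
-
41
- ```sql
42
- -- Alibaba Cloud OSS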
43
- CREATE STORAGE CONNECTION IF NOT EXISTS my_oss_conn
44
- TYPE OSS
45
- access_id = 'LTAIxxxxxxxxxxxx'
46
- access_key = 'T8Gexxxxxxmtxxxxxx'
47
- ENDPOINT = 'oss-cn-hangzhou-internal.aliyuncs.com';
48
-
49
- -- Tencent Cloud COS
50
- CREATE STORAGE CONNECTION IF NOT EXISTS my_cos_conn
51
- TYPE COS
52
- ACCESS_KEY = '<access_key>'
53
- SECRET_KEY = '<secret_key>'
54
- REGION = 'ap-shanghai'
55
- APP_ID = '1310000503';
56
-
57
- -- AWS S3
58
- CREATE STORAGE CONNECTION IF NOT EXISTS my_s3_conn
59
- TYPE S3
60
- ACCESS_KEY = '<access_key>'
61
- SECRET_KEY = '<secret_key>'
62
- REGION = 'us-east-1';
63
- ```
64
-
65
- ```sql
66
- -- Mount Alibaba Cloud OSS
67
- CREATE EXTERNAL VOLUME my_oss_volume
68
- LOCATION 'oss://my-bucket/data-path/'
69
- USING CONNECTION my_oss_conn
70
- DIRECTORY = (ENABLE = TRUE, AUTO_REFRESH = TRUE)
71
- RECURSIVE = TRUE;
72
-
73
- -- Mount Tencent Cloud COS
74
- CREATE EXTERNAL VOLUME my_cos_volume
75
- LOCATION 'cos://my-bucket/data-path/'
76
- USING CONNECTION my_cos_conn
77
- DIRECTORY = (ENABLE = TRUE)
78
- RECURSIVE = TRUE;
79
-
80
- -- Mount AWS S3
81
- CREATE EXTERNAL VOLUME my_s3_volume
82
- LOCATION 's3://my-bucket/data-path/'
83
- USING CONNECTION my_s3_conn
84
- DIRECTORY = (ENABLE = TRUE)
85
- RECURSIVE = TRUE;
86
- ```
87
-
88
- ---
89
-
90
- ## Viewing Volumes
91
-
92
- ```sql
93
- -- List all Volumes
94
- SHOW VOLUMES;
95
-
96
- -- Filter external Volumes (SHOW VOLUMES does not support WHERE; use information_schema)
97
- SELECT volume_name, volume_type, volume_region, volume_creator
98
- FROM information_schema.volumes
99
- WHERE volume_type = 'EXTERNAL';
100
-
101
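- -- Show details
102
- DESC VOLUME my_oss_volume;
103
-
104
- -- List files in the directory
105
- SHOW VOLUME DIRECTORY my_oss_volume;
106
-
107
- -- Refresh the directory metadata, then query (a manual refresh may be needed after uploading new files)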
108
- ALTER VOLUME my_oss_volume REFRESH;
109
- SELECT * FROM DIRECTORY(VOLUME my_oss_volume);
110
- ```
111
-
112
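- > ⚠️ **Directory refresh**: after files are uploaded to object storage, `SHOW VOLUME DIRECTORY` may not show them immediately.
113
- > With `AUTO_REFRESH = TRUE`, the system refreshes periodically; otherwise run `ALTER VOLUME name REFRESH` manually.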
114
-
115
- ---
116
-
117
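- ## Querying Files in a Volume Directly
118
-
119
- > ⚠️ **Syntax restriction**: ClickZetta does not support the `@volume_name` shorthand (Snowflake Stage syntax); use the full `FROM VOLUME name USING format` syntax.
120
- > ⚠️ **Mixed file formats**: if a Volume contains files in several formats (for example .csv and .json together), a query without `FILES()` or `SUBDIRECTORY` tries to read every file and may fail on a format mismatch. Use `FILES('xxx.csv')` to name the file or `SUBDIRECTORY 'csv_data/'` to name a subdirectory.
121
- > ⚠️ **Accessing nested JSON fields**: use the `data['key']` syntax (not Snowflake's `data:key` syntax).
122
-
123
- ```sql
124
- -- Query CSV files (schema is inferred automatically)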
125
- SELECT * FROM VOLUME my_oss_volume
126
- USING CSV
127
- OPTIONS('header' = 'true', 'sep' = ',')
128
- SUBDIRECTORY 'orders/2024/'
129
- LIMIT 100;
130
-
131
- -- Query Parquet files
132
- SELECT * FROM VOLUME my_oss_volume
133
- USING PARQUET
134
- REGEXP '.*2024-0[1-6].parquet';
135
-
136
- -- Query a specific file (recommended; avoids mixed-format conflicts)
137
- SELECT * FROM VOLUME my_oss_volume
138
- USING JSON
139
- FILES('user_events.json');
140
-
141
- -- Query nested JSON fields
142
- SELECT
143
- data['event_id'] AS event_id,
144
- data['properties']['device'] AS device
145
- FROM VOLUME my_oss_volume
146
- USING JSON
147
- FILES('events.json');
148
-
149
- -- Query a User Volume file
150
- SELECT * FROM USER VOLUME
151
- USING CSV
152
- OPTIONS('header' = 'true')
153
- FILES('upload.csv');
154
- ```
155
-
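The `REGEXP` pattern in the Parquet example leaves the dot before `parquet` unescaped, so it accepts any character in that position. The Python sketch below illustrates the difference, assuming the engine matches the pattern against the whole file path with semantics comparable to Python's `re.fullmatch`; that assumption, the `STRICT` variant, and the sample paths are illustrative and not taken from ClickZetta documentation.

```python
import re

# The pattern exactly as written in the Parquet example above.
PATTERN = r".*2024-0[1-6].parquet"
# A stricter variant with the dot escaped (an illustrative alternative).
STRICT = r".*2024-0[1-6]\.parquet"


def matching_files(pattern: str, paths: list[str]) -> list[str]:
    """Return the paths that the pattern matches in full."""
    return [p for p in paths if re.fullmatch(pattern, p)]


# Hypothetical file paths inside a Volume.
paths = [
    "orders/2024-03.parquet",   # January to June: expected to match
    "orders/2024-07.parquet",   # July: outside the [1-6] range
    "orders/2024-03_parquet",   # no real extension, but '.' matches '_'
]
print(matching_files(PATTERN, paths))
print(matching_files(STRICT, paths))
```

Under these assumptions the original pattern also selects `orders/2024-03_parquet`, while the escaped variant selects only the real `.parquet` file.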
156
- ---
157
-
158
- ## User Volume File Operations
159
-
160
- ```sql
161
- -- List files
162
- SHOW USER VOLUME DIRECTORY;
163
-
164
- -- Upload a local file
165
- PUT '/local/path/data.csv' TO USER VOLUME;
166
- PUT '/local/path/data.csv' TO USER VOLUME FILE 'subdir/data.csv';
167
-
168
- -- Download a file
169
- GET USER VOLUME FILE 'subdir/data.csv' TO '/local/output/';
170
-
171
- -- Delete a file
172
- REMOVE USER VOLUME FILE 'subdir/data.csv';
173
- ```
174
-
175
- ---
176
-
177
- ## Data Import and Export
178
-
179
- ### Importing from a Volume into a Table
180
-
181
- ```sql
182
- -- CSV import
183
- COPY INTO my_table
184
- FROM VOLUME my_oss_volume
185
- USING CSV
186
- OPTIONS('header' = 'true')
187
- SUBDIRECTORY 'data/';
188
-
189
- -- Import a specific file
190
- COPY INTO my_table
191
- FROM VOLUME my_oss_volume
192
- USING PARQUET
193
- FILES('data_2024.parquet');
194
-
195
- -- Import files matching a regular expression
196
- COPY INTO my_table
197
- FROM VOLUME my_oss_volume
198
- USING PARQUET
199
- REGEXP '.*2024-0[1-6].parquet';
200
-
201
- -- Overwrite (clears the table before importing)
202
- COPY OVERWRITE INTO my_table
203
- FROM VOLUME my_oss_volume
204
- USING CSV
205
- OPTIONS('header' = 'true');
206
- ```
207
-
208
- ### Exporting a Table to a Volume
209
-
210
- ```sql
211
- -- Export an entire table as Parquet (to an External Volume)
212
- COPY INTO VOLUME my_oss_volume
213
- SUBDIRECTORY 'export/'
214
- FROM TABLE my_table
215
- FILE_FORMAT = (TYPE = PARQUET);
216
-
217
- -- Export query results as CSV (compressed)
218
- COPY INTO VOLUME my_oss_volume
219
- SUBDIRECTORY 'export/2024/'
220
- FROM (SELECT * FROM orders WHERE year = 2024)
221
- FILE_FORMAT = (TYPE = CSV COMPRESSION = 'GZIP');
222
-
223
- -- Export to the User Volume
224
- COPY INTO USER VOLUME
225
- SUBDIRECTORY 'my_export/'
226
- FROM TABLE my_table
227
- FILE_FORMAT = (TYPE = CSV);
228
-
229
- -- Export to a Table Volume
230
- COPY INTO TABLE VOLUME my_table
231
- SUBDIRECTORY 'backup/'
232
- FROM TABLE my_table
233
- FILE_FORMAT = (TYPE = PARQUET);
234
- ```
235
-
236
- > ⚠️ `COPY INTO VOLUME` exports use `FILE_FORMAT = (TYPE = CSV/PARQUET)`, not `USING CSV`.
237
- > The `USING` keyword applies only when reading files from a Volume: `SELECT FROM VOLUME` queries and `COPY INTO <table>` imports.
238
-
239
- ### Exporting to Local Disk (GET Command)
240
-
241
- ```sql
242
- -- Download a file from a Volume to local disk
243
- GET VOLUME my_oss_volume FILE 'export/data.csv' TO '/local/output/';
244
-
245
- -- Download from the User Volume
246
- GET USER VOLUME FILE 'my_export/data.csv' TO '/local/output/';
247
- ```
248
-
249
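- ### Exporting from Studio
250
-
251
- In Lakehouse Studio:
252
- - After running a SQL query, click the "Export" button in the results area to export CSV or Excel files
253
- - Up to 100,000 rows of query results can be exported
254
-
255
- ---
256
-
257
- ## Deleting a Volume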
258
-
259
- ```sql
260
- DROP VOLUME IF EXISTS my_oss_volume;
261
- ```
262
-
263
- ---
264
-
265
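- ## FAQ
266
-
267
- | Problem | Cause | Solution |
268
- |---|---|---|
269
- | SHOW VOLUME DIRECTORY lists no files | Directory not refreshed | Run `ALTER VOLUME name REFRESH` |
270
- | SELECT FROM VOLUME fails | Format mismatch | Confirm the format after USING matches the actual file format; use `FILES()` to name the file |
271
- | COPY INTO fails reading mixed-format files | The Volume contains files in mixed formats | Use `FILES('xxx.csv')` to name the file or `SUBDIRECTORY` to name a subdirectory |
272
- | PUT command fails | Local path does not exist | Confirm the local file path is correct |
273
- | COPY INTO fails | Insufficient permissions | Check the permissions of the STORAGE CONNECTION access key |
274
- | `@volume` syntax error | Not supported by ClickZetta | Use the full `FROM VOLUME name USING format` syntax |
275
- | `data:key` syntax error | Snowflake JSON syntax does not apply | Use `data['key']` to access nested JSON fields |
276
- | `METADATA$FILENAME` error | ClickZetta does not support this metadata field | Use a string literal, or add a file-path column manually at INSERT time |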
277
-
278
- ---
279
-
280
- ## Snowflake Migration Reference
281
-
282
- | Snowflake syntax | ClickZetta equivalent | Notes |
283
- |---|---|---|
284
- | `@my_stage` | `VOLUME my_volume` | Stage → Volume |
285
- | `SELECT * FROM @stage/path` | `SELECT * FROM VOLUME vol USING CSV SUBDIRECTORY 'path/'` | A USING format is required |
286
- | `data:key::STRING` | `data['key']` | JSON field access |
287
- | `data:nested.key` | `data['nested']['key']` | Nested JSON access |
288
- | `METADATA$FILENAME` | Not supported | Add a file-path column manually |
289
- | `METADATA$FILE_ROW_NUMBER` | Not supported | No equivalent |
290
- | `FILE_FORMAT = (TYPE = CSV)` | `USING CSV OPTIONS(...)` | USING for import, FILE_FORMAT for export |
291
- | `COPY INTO table FROM @stage` | `COPY INTO table FROM VOLUME vol USING format` | Import syntax |
292
- | `COPY INTO @stage FROM table` | `COPY INTO VOLUME vol SUBDIRECTORY '/' FROM TABLE t FILE_FORMAT=(...)` | Export syntax |
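The two JSON rows of the table above describe a mechanical rewrite from Snowflake's colon paths to bracket access. Here is a rough Python sketch of that rewrite, assuming simple identifier-only paths and dropping the trailing `::TYPE` cast as the table row does; `snowflake_path_to_clickzetta` is a hypothetical helper, not a ClickZetta or Snowflake tool.

```python
import re

# column ':' dotted.path, optionally followed by a '::TYPE' cast.
_PATH = re.compile(r"^(?P<col>\w+):(?P<path>\w+(?:\.\w+)*)(?:::\w+)?$")


def snowflake_path_to_clickzetta(expr: str) -> str:
    """Rewrite `col:a.b::TYPE` as `col['a']['b']`.

    Only plain identifier paths are handled; quoted keys, array indexes,
    and other expressions are rejected rather than guessed at.
    """
    match = _PATH.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported path expression: {expr!r}")
    keys = match.group("path").split(".")
    return match.group("col") + "".join(f"['{key}']" for key in keys)


print(snowflake_path_to_clickzetta("data:key::STRING"))
print(snowflake_path_to_clickzetta("data:nested.key"))
```

The two calls reproduce the right-hand column of the table: `data['key']` and `data['nested']['key']`.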