@clickzetta/cz-cli-darwin-arm64 0.3.19 → 0.3.21

Files changed (40)
  1. package/bin/cz-cli +0 -0
  2. package/bin/skills/clickzetta-access-control/eval_cases.jsonl +1 -1
  3. package/bin/skills/clickzetta-batch-sync-pipeline/eval_cases.jsonl +5 -0
  4. package/bin/skills/clickzetta-cdc-sync-pipeline/eval_cases.jsonl +5 -0
  5. package/bin/skills/clickzetta-dba-guide/SKILL.md +542 -0
  6. package/bin/skills/clickzetta-dba-guide/eval_cases.jsonl +3 -0
  7. package/bin/skills/clickzetta-dw-modeling/eval_cases.jsonl +1 -1
  8. package/bin/skills/clickzetta-dynamic-table/eval_cases.jsonl +5 -0
  9. package/bin/skills/clickzetta-file-import-pipeline/eval_cases.jsonl +5 -0
  10. package/bin/skills/clickzetta-lakehouse-connect/SKILL.md +218 -0
  11. package/bin/skills/clickzetta-lakehouse-connect/eval_cases.jsonl +3 -0
  12. package/bin/skills/clickzetta-lakehouse-connect/evals/evals.json +35 -0
  13. package/bin/skills/clickzetta-lakehouse-connect/references/config-file.md +435 -0
  14. package/bin/skills/clickzetta-lakehouse-connect/references/jdbc.md +478 -0
  15. package/bin/skills/clickzetta-lakehouse-connect/references/python-sdk.md +225 -0
  16. package/bin/skills/clickzetta-lakehouse-connect/references/sqlalchemy.md +468 -0
  17. package/bin/skills/clickzetta-lakehouse-connect/references/zettapark-session.md +445 -0
  18. package/bin/skills/clickzetta-manage-comments/SKILL.md +219 -0
  19. package/bin/skills/clickzetta-manage-comments/eval_cases.jsonl +3 -0
  20. package/bin/skills/clickzetta-metadata/SKILL.md +483 -0
  21. package/bin/skills/clickzetta-metadata/eval_cases.jsonl +5 -0
  22. package/bin/skills/clickzetta-metadata/references/instance-views-reference.md +276 -0
  23. package/bin/skills/clickzetta-metadata/references/metering-views-reference.md +137 -0
  24. package/bin/skills/clickzetta-metadata/references/show-desc-reference.md +326 -0
  25. package/bin/skills/clickzetta-metadata/references/views-reference.md +271 -0
  26. package/bin/skills/clickzetta-oss-ingest-pipeline/eval_cases.jsonl +5 -0
  27. package/bin/skills/clickzetta-overview/SKILL.md +102 -0
  28. package/bin/skills/clickzetta-overview/eval_cases.jsonl +5 -0
  29. package/bin/skills/clickzetta-overview/references/brands-and-endpoints.md +79 -0
  30. package/bin/skills/clickzetta-overview/references/object-model.md +311 -0
  31. package/bin/skills/clickzetta-overview/references/studio-modules.md +173 -0
  32. package/bin/skills/clickzetta-realtime-sync-pipeline/eval_cases.jsonl +5 -0
  33. package/bin/skills/clickzetta-sql-pipeline-manager/eval_cases.jsonl +12 -0
  34. package/bin/skills/clickzetta-table-stream-pipeline/eval_cases.jsonl +5 -0
  35. package/bin/skills/clickzetta-vcluster-manager/eval_cases.jsonl +5 -0
  36. package/bin/skills/clickzetta-volume-manager/eval_cases.jsonl +5 -0
  37. package/bin/skills/cz-cli-inner/SKILL.md +5 -4
  38. package/package.json +1 -1
  39. package/bin/skills/clickzetta-data-ingest-pipeline/SKILL.md +0 -220
  40. package/bin/skills/clickzetta-data-ingest-pipeline/eval_cases.jsonl +0 -5
@@ -0,0 +1,483 @@
+ ---
+ name: clickzetta-metadata
+ description: |
+   Query ClickZetta Lakehouse metadata. Two query methods are covered:
+   1. The SHOW/DESC command family (real-time, no latency): a quick look at the current state of an object, suited to ad hoc queries on a single object
+   2. INFORMATION_SCHEMA views (about 15 minutes of latency, support complex SQL analysis): suited to aggregate statistics, cost attribution, and cross-object analysis
+
+   How to choose:
+   - Quick look at a single object (table structure, columns, cluster state, privileges) → SHOW/DESC
+   - Complex SQL analysis, cost attribution, cross-workspace statistics, historical trends → information_schema
+
+   Covers all SHOW commands (TABLES/SCHEMAS/CATALOGS/COLUMNS/VOLUMES/CONNECTIONS/JOBS/VCLUSTERS/
+   PIPES/SHARES/USERS/ROLES/GRANTS/FUNCTIONS/TABLE STREAMS/PARTITIONS/SYNONYMS/INDEX/
+   DYNAMIC TABLE REFRESH HISTORY/TABLES HISTORY), all DESC commands (TABLE/SCHEMA/HISTORY/
+   VCLUSTER/VOLUME/CONNECTION/FUNCTION/VIEW/DYNAMIC TABLE/SHARE/INDEX/TABLE STREAM),
+   SHOW CREATE TABLE, load_history(), FROM (SHOW ...) subqueries, context functions,
+   and the workspace-level and instance-level INFORMATION_SCHEMA views (TABLES/COLUMNS/JOB_HISTORY/USERS/ROLES/
+   VOLUMES/CONNECTIONS/MATERIALIZED_VIEW_REFRESH_HISTORY/STORAGE_METERING/INSTANCE_USAGE, etc.).
+
+   Trigger when the user says "list tables", "view columns", "view column info", "view jobs", "view job history",
+   "view JOB history", "SHOW TABLES", "DESC TABLE", "view partitions", "view version history",
+   "view dropped tables", "view load history", "load_history", "SHOW JOBS", "view cluster status",
+   "view connections", "view privileges", "SHOW GRANTS", "view functions", "view Volume",
+   "list Volumes", "view Share", "view Catalog", "view slow queries",
+   "view CRU consumption", "billing analysis", "cost analysis", "compute cost", "storage cost",
+   "usage statistics", "cost attribution", "which user consumes the most", "storage usage ranking",
+   "list users", "view roles", "view Connection", "view materialized view refresh history",
+   "metadata query", "information_schema", "view all tables", "list Schemas",
+   "summarize storage usage", or "which is faster, SHOW/DESC or information_schema".
+
+   Note: this skill covers read-only metadata queries only (SHOW/DESC/information_schema).
+   For privilege changes (GRANT/REVOKE/creating users/role management/data masking), use the clickzetta-access-control skill.
+   Keywords: SHOW, DESC, DESCRIBE, metadata, load_history, information_schema, table info, column info, job history, system view, cost analysis, CRU
+ ---
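+
+ # ClickZetta Metadata Query Guide
+
+ ## Choosing a query method
+
+ | Scenario | Recommended method | Reason |
+ |---|---|---|
+ | Quick look at current state (tables, columns, clusters) | `SHOW` / `DESC` | Real-time, no latency |
+ | Complex SQL analysis, aggregate statistics | `information_schema` | Supports JOIN/GROUP BY/WHERE |
+ | View dropped objects | `SHOW TABLES HISTORY` | Dedicated command, real-time |
+ | Cost analysis (with monetary amounts) | `SYS.information_schema.INSTANCE_USAGE` / `STORAGE_METERING` | Contains actual amount columns |
+ | CRU consumption statistics (no amounts) | `information_schema.JOB_HISTORY` | Supports aggregation by user/time |
+ | Deduplicating loaded files | `load_history()` | Dedicated function |
+ | Cross-workspace queries | `SYS.information_schema.*` | Requires INSTANCE ADMIN |
+
+ **Latency**: SHOW/DESC return results in real time; information_schema views lag by about 15 minutes.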
+
+ ---
+
+ ## Reference documents
+
+ - [Full SHOW/DESC syntax](references/show-desc-reference.md)
+ - [Workspace-level INFORMATION_SCHEMA views](references/views-reference.md)
+ - [Instance-level views (require INSTANCE ADMIN)](references/instance-views-reference.md)
+ - [Cost metering views (STORAGE_METERING / INSTANCE_USAGE)](references/metering-views-reference.md)
+
+ ---
+
+ ## SHOW / DESC quick reference
+
+ ### Current context
+
+ ```sql
+ SELECT current_workspace(), current_schema(), current_user(), current_vcluster();
+ ```
+
+ ### Data objects
+
+ ```sql
+ -- Schemas
+ SHOW SCHEMAS;
+ SHOW SCHEMAS EXTENDED;
+ SHOW SCHEMAS LIKE 'ods%';
+
+ -- Tables (including views / materialized views / dynamic tables / external tables)
+ SHOW TABLES;
+ SHOW TABLES IN my_schema;
+ SHOW TABLES LIKE '%order%';
+ SHOW TABLES WHERE is_view = false AND is_materialized_view = false; -- regular tables
+ SHOW TABLES WHERE is_view = true; -- views
+ SHOW TABLES WHERE is_materialized_view = true; -- materialized views
+ SHOW TABLES WHERE is_dynamic = true; -- dynamic tables
+ SHOW TABLES WHERE is_external = true; -- external tables
+ -- ⚠️ The SHOW VIEWS IN schema syntax is not supported; use SHOW TABLES WHERE is_view=true
+
+ -- Columns
+ SHOW COLUMNS IN my_schema.my_table;
+ SHOW COLUMNS FROM my_table IN my_schema;
+
+ -- Full CREATE TABLE statement
+ SHOW CREATE TABLE my_table;
+
+ -- Partitions
+ SHOW PARTITIONS my_table;
+ SHOW PARTITIONS EXTENDED my_table;
+ SHOW PARTITIONS my_table PARTITION (dt = '2024-01');
+ -- ⚠️ SHOW PARTITIONS WHERE col='x' is not supported; use the PARTITION() clause
+
+ -- Volumes (IN schema is not supported; filter with WHERE)
+ SHOW VOLUMES;
+ SHOW VOLUMES WHERE schema_name = 'my_schema';
+
+ -- Table Streams
+ SHOW TABLE STREAMS;
+ SHOW TABLE STREAMS IN my_schema;
+
+ -- Indexes
+ SHOW INDEX IN my_schema.my_table;
+
+ -- Functions
+ SHOW FUNCTIONS LIKE '%date%';
+ SHOW EXTERNAL FUNCTIONS;
+ -- ⚠️ The IN schema clause is not supported
+
+ -- History (including dropped tables)
+ SHOW TABLES HISTORY;
+ SHOW TABLES HISTORY IN my_schema;
+ ```
+
+ ### Catalog (federated queries)
+
+ ```sql
+ SHOW CATALOGS;
+ SHOW SCHEMAS IN catalog_name;
+ SHOW TABLES IN catalog_name.schema_name;
+ ```
+
+ ### Compute and connections
+
+ ```sql
+ -- Compute clusters
+ SHOW VCLUSTERS;
+ SHOW VCLUSTERS WHERE state = 'RUNNING';
+
+ -- Jobs (last 7 days, at most 10000 rows, ORDER BY not supported)
+ SHOW JOBS LIMIT 20;
+ SHOW JOBS IN VCLUSTER default_ap LIMIT 20;
+
+ -- Dynamic table refresh history (last 7 days)
+ SHOW DYNAMIC TABLE REFRESH HISTORY LIMIT 20;
+ SHOW DYNAMIC TABLE REFRESH HISTORY WHERE state = 'FAILED';
+
+ -- Connection objects
+ SHOW CONNECTIONS;
+ SHOW CONNECTIONS WHERE category = 'STORAGE';
+
+ -- Pipes
+ SHOW PIPES;
+ SHOW PIPES IN my_schema;
+ ```
+
+ ### Users, privileges, and shares
+
+ ```sql
+ SHOW USERS;
+ SHOW ROLES;
+ SHOW GRANTS TO USER alice;
+ SHOW GRANTS TO ROLE analyst_role;
+ SHOW SHARES;
+ ```
+
+ ### DESC commands
+
+ ```sql
+ DESC my_table;
+ DESC EXTENDED my_table; -- includes last_modified_time/properties/statistics
+ DESC SCHEMA my_schema;
+ DESC VCLUSTER default_ap;
+ DESC VOLUME my_volume;
+ DESC CONNECTION my_oss_conn;
+ DESC FUNCTION my_schema.my_function; -- external functions only
+ DESC SHARE my_share_name;
+ DESC CATALOG my_catalog;
+
+ -- Version history (depends on data_retention_days)
+ DESC HISTORY my_table;
+ -- Returns: version, time, total_rows, total_bytes, user, operation, job_id
+ ```
+
+ ### load_history(): file load history
+
+ ```sql
+ -- The argument must be a quoted string
+ SELECT file_path, last_copy_time, file_size, status, first_error_message
+ FROM load_history('my_schema.my_table')
+ ORDER BY last_copy_time DESC
+ LIMIT 20;
+ -- ⚠️ load_history(schema.table) without quotes raises an error
+ ```
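Because `load_history()` only accepts a quoted string, a client that assembles this query in code has to add the quotes itself. The following is a minimal Python sketch under stated assumptions: the helper name `load_history_sql` and its identifier check are illustrative and are not part of cz-cli or any ClickZetta SDK.

```python
import re

# Hypothetical helper (not a ClickZetta API): builds the load_history() query
# from the SKILL.md example. Only plain `table` or `schema.table` names are
# accepted, so the value can be placed inside the single-quoted argument.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def load_history_sql(table: str, limit: int = 20) -> str:
    """Return the load_history() query text with the table name quoted."""
    if not _IDENT.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT file_path, last_copy_time, file_size, status, first_error_message "
        f"FROM load_history('{table}') "
        "ORDER BY last_copy_time DESC "
        f"LIMIT {int(limit)}"
    )
```

Rejecting anything that is not a simple identifier is a deliberate choice here: it keeps quote characters out of the string argument instead of trying to escape them.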
+
+ ### FROM (SHOW ...) subqueries
+
+ ```sql
+ -- Filter SHOW results
+ SELECT schema_name, table_name
+ FROM (SHOW TABLES IN my_schema)
+ WHERE is_view = false;
+
+ -- Count tables by type
+ SELECT
+   CASE WHEN is_view THEN 'VIEW'
+        WHEN is_materialized_view THEN 'MV'
+        WHEN is_dynamic THEN 'DT'
+        WHEN is_external THEN 'EXTERNAL'
+        ELSE 'TABLE' END AS type,
+   COUNT(*) AS cnt
+ FROM (SHOW TABLES IN my_schema)
+ GROUP BY 1;
+
+ -- List suspended clusters
+ SELECT name, state FROM (SHOW VCLUSTERS) WHERE state = 'SUSPENDED';
+ ```
+
+ **Note**: creating a view that contains a SHOW command is not supported.
+
+ ### SHOW/DESC caveats
+
+ 1. `SHOW SCHEMAS WHERE`: not supported; run `SHOW SCHEMAS EXTENDED` and filter in the application layer
+ 2. `SHOW VIEWS IN schema`: syntax error; use `SHOW TABLES WHERE is_view=true`
+ 3. `SHOW VOLUMES IN schema`: syntax error; use `SHOW VOLUMES WHERE schema_name='x'`
+ 4. `SHOW PARTITIONS WHERE col='x'`: not supported; use the `PARTITION(col='x')` clause
+ 5. `SHOW JOBS`: shows only the last 7 days, at most 10000 rows; ORDER BY is not supported
+ 6. `LIKE` and `WHERE` cannot be combined: use `FROM (SHOW TABLES) WHERE table_name LIKE 'x%'` instead
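Items 1 and 5 above push filtering and sorting to the application layer. A minimal Python sketch of that step follows. It assumes the client has already fetched the SHOW output as a list of dicts; the function names and the `schema_name` / `start_time` column names are illustrative assumptions, not documented output columns.

```python
from fnmatch import fnmatchcase


def filter_rows(rows, column, pattern):
    """Keep rows whose `column` matches a shell-style pattern.

    The SQL pattern 'ods%' corresponds to the shell-style pattern 'ods*'.
    """
    return [r for r in rows if fnmatchcase(str(r[column]), pattern)]


def sort_rows(rows, column, descending=True):
    """Sort fetched rows client-side, e.g. SHOW JOBS output without ORDER BY."""
    return sorted(rows, key=lambda r: r[column], reverse=descending)


# Hypothetical rows standing in for SHOW SCHEMAS EXTENDED output.
schemas = [{"schema_name": "ods_sales"}, {"schema_name": "dwd_sales"}]
ods_only = filter_rows(schemas, "schema_name", "ods*")
```

Where the engine accepts it, the `FROM (SHOW ...)` subquery form shown earlier avoids the round trip; the client-side version is only the fallback the caveats describe.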
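+
+ ---
+
+ ## INFORMATION_SCHEMA quick reference
+
+ ### Levels
+
+ | Level | Access path | Required privilege | Scope |
+ |---|---|---|---|
+ | Workspace-level | `information_schema.<view_name>` | workspace_admin | Current workspace |
+ | Instance-level | `SYS.information_schema.<view_name>` | INSTANCE ADMIN | All workspaces |
+
+ **Important limitations**: all views are read-only, and their data lags by about 15 minutes. Workspace-level views show only objects that currently exist; instance-level views include dropped objects, which can be filtered out with `WHERE delete_time IS NULL`.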
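The two access paths and the `delete_time` filter can be captured in a small query builder. This Python sketch is illustrative only: `view_path` and `live_objects_sql` are hypothetical names, and the builder assumes the chosen instance-level view has a `delete_time` column, which the source shows for views such as TABLES and WORKSPACES.

```python
def view_path(view: str, instance_level: bool = False) -> str:
    """Return the qualified view name for the workspace or instance level."""
    prefix = "SYS.information_schema" if instance_level else "information_schema"
    return f"{prefix}.{view}"


def live_objects_sql(view: str, instance_level: bool = False) -> str:
    """Select only objects that still exist.

    Workspace-level views already exclude dropped objects, so the
    delete_time filter is added for instance-level views only.
    """
    sql = f"SELECT * FROM {view_path(view, instance_level)}"
    if instance_level:
        sql += " WHERE delete_time IS NULL"
    return sql
```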
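+
+ ### Workspace-level views (`information_schema.*`)
+
+ | View | Description |
+ |---|---|
+ | SCHEMAS | All schemas in the current workspace |
+ | TABLES | All tables in the current workspace (including views and materialized views) |
+ | COLUMNS | Column information for all tables |
+ | VIEWS | All view definitions |
+ | USERS | Users in the workspace and their roles |
+ | ROLES | Roles in the workspace and their members |
+ | JOB_HISTORY | Job execution history (retained for 60 days; includes the PT_DATE partition column) |
+ | MATERIALIZED_VIEW_REFRESH_HISTORY | Materialized view refresh history (includes the PT_DATE partition column) |
+ | AUTOMV_REFRESH_HISTORY | Automatic materialized view refresh history |
+ | VOLUMES | Volume object information |
+ | CONNECTIONS | Storage connection object information |
+ | SORTKEY_CANDIDATES | Recommended sort columns |
+
+ ### Instance-level views (`SYS.information_schema.*`)
+
+ | View | Description |
+ |---|---|
+ | WORKSPACES | Information on all workspaces (including storage usage) |
+ | SCHEMAS | Schemas in all workspaces (including deletion records) |
+ | TABLES | Tables in all workspaces (including deletion records) |
+ | COLUMNS | Columns in all workspaces (including deletion records) |
+ | VIEWS | Views in all workspaces |
+ | USERS | Users in all workspaces |
+ | ROLES | Roles in all workspaces |
+ | JOB_HISTORY | Job history for all workspaces |
+ | MATERIALIZED_VIEW_REFRESH_HISTORY | Materialized view refresh history for all workspaces |
+ | VOLUMES | Volumes in all workspaces |
+ | CONNECTIONS | Connection objects in all workspaces |
+ | OBJECT_PRIVILEGES | Privilege grant records |
+ | SORTKEY_CANDIDATES | Sort column recommendations for all workspaces |
+ | **STORAGE_METERING** ⭐ | **Storage cost details (managed storage / multi-version storage / network transfer), per day and per workspace** |
+ | **INSTANCE_USAGE** ⭐ | **Compute cost details (AP/GP clusters / task scheduling / data integration), per day and per workspace** |
+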
+ ---
+
+ ## Common query examples
+
+ ### Inspecting table structure
+
+ ```sql
+ -- List all managed tables in the current workspace
+ SELECT table_schema, table_name, table_type, row_count, bytes, create_time
+ FROM information_schema.tables
+ WHERE table_type = 'MANAGED_TABLE'
+ ORDER BY table_schema, table_name;
+
+ -- List the columns of one table
+ SELECT column_name, data_type, is_nullable, is_primary_key, is_clustering_column, comment
+ FROM information_schema.columns
+ WHERE table_schema = 'my_schema'
+   AND table_name = 'my_table'
+ ORDER BY column_name;
+
+ -- Find tables that contain a given column name
+ SELECT table_schema, table_name, column_name, data_type
+ FROM information_schema.columns
+ WHERE column_name ILIKE '%user_id%';
+ ```
+
+ ### Viewing job history
+
+ ```sql
+ -- Jobs from roughly the last 24 hours (pt_date partitions from yesterday onward)
+ SELECT job_id, job_creator, status, execution_time, cru,
+        input_bytes, output_bytes, start_time
+ FROM information_schema.job_history
+ WHERE pt_date >= CAST(CURRENT_DATE - INTERVAL 1 DAY AS DATE)
+ ORDER BY start_time DESC;
+
+ -- Failed jobs
+ SELECT job_id, job_creator, job_text, error_message, start_time
+ FROM information_schema.job_history
+ WHERE status = 'FAILED'
+   AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
+ ORDER BY start_time DESC;
+
+ -- CRU consumption by user (last 30 days)
+ -- Note: the success value of status is 'SUCCEED' (not 'SUCCEEDED')
+ SELECT job_creator,
+        COUNT(*) AS job_count,
+        SUM(cru) AS total_cru,
+        AVG(execution_time) AS avg_exec_sec
+ FROM information_schema.job_history
+ WHERE pt_date >= CAST(CURRENT_DATE - INTERVAL 30 DAY AS DATE)
+   AND status = 'SUCCEED'
+ GROUP BY job_creator
+ ORDER BY total_cru DESC;
+
+ -- Slow queries (longer than 60 seconds)
+ SELECT job_id, job_creator, execution_time, input_bytes, job_text
+ FROM information_schema.job_history
+ WHERE execution_time > 60
+   AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
+ ORDER BY execution_time DESC
+ LIMIT 20;
+ ```
+
+ ### Monitoring materialized view refreshes
+
+ ```sql
+ -- Materialized views whose recent refreshes failed
+ SELECT schema_name, materialized_view_name, status,
+        start_time, end_time, error_message
+ FROM information_schema.materialized_view_refresh_history
+ WHERE status = 'FAILED'
+   AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
+ ORDER BY start_time DESC;
+ ```
+
+ ### Cost analysis (requires INSTANCE ADMIN)
+
+ Cost analysis relies on two views that exist only at the instance level; **JOB_HISTORY.CRU cannot replace them**:
+ - `STORAGE_METERING`: storage costs (managed storage / multi-version storage / network transfer), with actual monetary amounts
+ - `INSTANCE_USAGE`: compute costs (AP/GP clusters / task scheduling / data integration / streaming integration), with actual monetary amounts
+
+ ```sql
+ -- Compute cost for the current month, by workspace
+ SELECT workspace_name,
+        sku_name,
+        ROUND(SUM(measurements_consumption), 2) AS total_cru,
+        ROUND(SUM(amount), 2) AS total_amount_yuan
+ FROM SYS.information_schema.instance_usage
+ WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
+   AND sku_category = 'compute'
+ GROUP BY workspace_name, sku_name
+ ORDER BY total_amount_yuan DESC;
+
+ -- Storage cost for the current month, by workspace
+ SELECT workspace_name,
+        sku_name,
+        ROUND(SUM(measurements_consumption), 4) AS consumption,
+        measurements_unit,
+        ROUND(SUM(amount), 4) AS total_amount_yuan
+ FROM SYS.information_schema.storage_metering
+ WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
+ GROUP BY workspace_name, sku_name, measurements_unit
+ ORDER BY workspace_name, total_amount_yuan DESC;
+
+ -- Combined storage + compute cost summary (current month)
+ SELECT cost_type, workspace_name,
+        ROUND(SUM(total_amount), 2) AS total_yuan
+ FROM (
+   SELECT 'compute' AS cost_type, workspace_name, amount AS total_amount
+   FROM SYS.information_schema.instance_usage
+   WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
+   UNION ALL
+   SELECT 'storage' AS cost_type, workspace_name, amount AS total_amount
+   FROM SYS.information_schema.storage_metering
+   WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
+ ) t
+ GROUP BY cost_type, workspace_name
+ ORDER BY cost_type, total_yuan DESC;
+
+ -- Daily compute cost trend (last 30 days)
+ SELECT DATE(measurement_start) AS dt,
+        sku_name,
+        ROUND(SUM(amount), 2) AS daily_amount_yuan
+ FROM SYS.information_schema.instance_usage
+ WHERE measurement_start >= CURRENT_DATE - INTERVAL 30 DAY
+   AND sku_category = 'compute'
+ GROUP BY DATE(measurement_start), sku_name
+ ORDER BY dt, daily_amount_yuan DESC;
+ ```
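+
+ **INSTANCE_USAGE SKU values (sku_category = 'compute'):**
+
+ | sku_name | Description |
+ |---|---|
+ | AP类型计算集群 | AP-type compute cluster: analytical VCluster charges |
+ | GP类型计算集群 | GP-type compute cluster: general-purpose VCluster charges |
+ | 任务调度 | Task scheduling: Studio task scheduling charges |
+ | 数据集成 | Data integration: offline/real-time sync task charges |
+ | 流式集成 | Streaming integration: streaming data integration charges |
+
+ **STORAGE_METERING SKU values:**
+
+ | sku_category | sku_name | Description |
+ |---|---|---|
+ | storage | 托管存储容量 | Managed storage capacity: data storage for internal tables |
+ | storage | 多版本未删除存储 | Multi-version retained storage: Time Travel historical versions |
+ | network | 数据查询Internet数据传输 | Internet data transfer for data queries: public network transfer charges |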
+
+ ### Storage usage analysis
+
+ ```sql
+ -- Storage usage ranking (current workspace, by table)
+ SELECT table_schema, table_name,
+        ROUND(bytes / 1024.0 / 1024 / 1024, 3) AS size_gb,
+        row_count
+ FROM information_schema.tables
+ WHERE table_type = 'MANAGED_TABLE'
+ ORDER BY bytes DESC
+ LIMIT 20;
+
+ -- Cross-workspace storage summary (requires INSTANCE ADMIN)
+ SELECT workspace_name,
+        ROUND(workspace_storage / 1024.0 / 1024 / 1024, 2) AS storage_gb
+ FROM SYS.information_schema.workspaces
+ WHERE delete_time IS NULL
+ ORDER BY workspace_storage DESC;
+
+ -- Find large tables across workspaces (larger than 10 GB)
+ SELECT table_catalog, table_schema, table_name,
+        row_count,
+        ROUND(bytes / 1024.0 / 1024 / 1024, 2) AS size_gb
+ FROM SYS.information_schema.tables
+ WHERE delete_time IS NULL
+   AND bytes > 10 * 1024 * 1024 * 1024
+ ORDER BY bytes DESC;
+ ```
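The `size_gb` columns above divide by 1024 three times, so "GB" here means binary gigabytes (GiB). When the raw `bytes` values are post-processed on the client instead, the same arithmetic looks like the Python sketch below; the names `bytes_to_gb` and `LARGE_TABLE_THRESHOLD` are illustrative, not part of any ClickZetta tooling.

```python
def bytes_to_gb(num_bytes: int, digits: int = 2) -> float:
    """Convert a byte count to 1024-based gigabytes, as the SQL above does."""
    return round(num_bytes / 1024 ** 3, digits)


# Same threshold as `bytes > 10 * 1024 * 1024 * 1024` in the large-table query.
LARGE_TABLE_THRESHOLD = 10 * 1024 ** 3


def is_large_table(num_bytes: int) -> bool:
    """True when a table exceeds the 10 GB threshold used above."""
    return num_bytes > LARGE_TABLE_THRESHOLD
```

Keeping the threshold as a named constant makes it clear that the cutoff is expressed in 1024-based units.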
+
+ ### Users and privileges
+
+ ```sql
+ -- List all users in the workspace with their roles
+ SELECT user_name, role_names, email, create_time
+ FROM information_schema.users
+ ORDER BY create_time DESC;
+
+ -- View privilege grant records (requires INSTANCE ADMIN)
+ SELECT grantor, grantee, granted_to, object_type,
+        object_schema, object_name, privilege_type, authorization_time
+ FROM SYS.information_schema.object_privileges
+ WHERE grantee = 'some_user'
+ ORDER BY authorization_time DESC;
+ ```
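+
+ ---
+
+ ## INFORMATION_SCHEMA caveats
+
+ 1. **ROW_COUNT / BYTES are estimates**: they may be inaccurate for PRIMARY KEY tables, for tables receiving real-time writes, and after partition operations
+ 2. **JOB_HISTORY is retained for 60 days**: records older than 60 days are purged automatically
+ 3. **Workspace-level views have no DELETE_TIME**: instance-level views include dropped objects; filter them out with `WHERE delete_time IS NULL`
+ 4. **JOB_HISTORY has a PT_DATE partition column**: filter with `pt_date >= CAST(CURRENT_DATE - INTERVAL N DAY AS DATE)`, which performs better than filtering on `start_time`
+ 5. **Watch the STATUS values**: the JOB_HISTORY success status is `'SUCCEED'` (not `'SUCCEEDED'`); a successful MV refresh is also `'SUCCEED'` (not `'FINISHED'`)
+ 6. **SYS.information_schema contains data for every workspace**: without a `table_catalog` filter, results from all workspaces are returned. The column name is `create_time` (not `created_time`)
+ 7. **STORAGE_METERING / INSTANCE_USAGE are instance-level only**: they require the INSTANCE ADMIN privilege and are accessed through `SYS.information_schema.*`; they contain actual amount columns and are the authoritative source for cost analysis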
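Caveats 2, 4, and 5 are easy to get wrong when JOB_HISTORY queries are built in code. The Python sketch below bundles them into one place. The helper names are hypothetical, and rejecting windows longer than 60 days is a design choice of this sketch based on the stated retention period, not engine behavior.

```python
JOB_SUCCESS_STATUS = "SUCCEED"   # per caveat 5: not 'SUCCEEDED'
JOB_HISTORY_RETENTION_DAYS = 60  # per caveat 2


def pt_date_filter(days: int) -> str:
    """Build the partition filter recommended in caveat 4."""
    if not 1 <= days <= JOB_HISTORY_RETENTION_DAYS:
        raise ValueError("days must be between 1 and 60")
    return f"pt_date >= CAST(CURRENT_DATE - INTERVAL {int(days)} DAY AS DATE)"


def cru_by_user_sql(days: int = 30) -> str:
    """CRU consumption per user, mirroring the job-history example query."""
    return (
        "SELECT job_creator, COUNT(*) AS job_count, SUM(cru) AS total_cru "
        "FROM information_schema.job_history "
        f"WHERE {pt_date_filter(days)} AND status = '{JOB_SUCCESS_STATUS}' "
        "GROUP BY job_creator ORDER BY total_cru DESC"
    )
```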
@@ -0,0 +1,5 @@
+ {"case_id":"001","type":"should_call","user_input":"How do I see which tables are in the current schema? How is SHOW TABLES used?","expected_skill":"clickzetta-metadata","expected_output_contains":["SHOW TABLES"]}
+ {"case_id":"002","type":"should_call","user_input":"How do I view a table's column information and partitions?","expected_skill":"clickzetta-metadata","expected_output_contains":["DESC","TABLE"]}
+ {"case_id":"003","type":"should_call","user_input":"How do I rank tables by storage usage through information_schema?","expected_skill":"clickzetta-metadata","expected_output_contains":["information_schema","tables"]}
+ {"case_id":"004","type":"should_call","user_input":"How do I view CRU consumption and cost attribution for the past 7 days? Which user consumed the most?","expected_skill":"clickzetta-metadata","expected_output_contains":["information_schema","job_history"]}
+ {"case_id":"005","type":"should_call","user_input":"What is the difference between SHOW/DESC and information_schema? Which one should I use?","expected_skill":"clickzetta-metadata","expected_output_contains":["SHOW","information_schema"]}