@clickzetta/cz-cli-darwin-arm64 0.3.19 → 0.3.21
- package/bin/cz-cli +0 -0
- package/bin/skills/clickzetta-access-control/eval_cases.jsonl +1 -1
- package/bin/skills/clickzetta-batch-sync-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-cdc-sync-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-dba-guide/SKILL.md +542 -0
- package/bin/skills/clickzetta-dba-guide/eval_cases.jsonl +3 -0
- package/bin/skills/clickzetta-dw-modeling/eval_cases.jsonl +1 -1
- package/bin/skills/clickzetta-dynamic-table/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-file-import-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-lakehouse-connect/SKILL.md +218 -0
- package/bin/skills/clickzetta-lakehouse-connect/eval_cases.jsonl +3 -0
- package/bin/skills/clickzetta-lakehouse-connect/evals/evals.json +35 -0
- package/bin/skills/clickzetta-lakehouse-connect/references/config-file.md +435 -0
- package/bin/skills/clickzetta-lakehouse-connect/references/jdbc.md +478 -0
- package/bin/skills/clickzetta-lakehouse-connect/references/python-sdk.md +225 -0
- package/bin/skills/clickzetta-lakehouse-connect/references/sqlalchemy.md +468 -0
- package/bin/skills/clickzetta-lakehouse-connect/references/zettapark-session.md +445 -0
- package/bin/skills/clickzetta-manage-comments/SKILL.md +219 -0
- package/bin/skills/clickzetta-manage-comments/eval_cases.jsonl +3 -0
- package/bin/skills/clickzetta-metadata/SKILL.md +483 -0
- package/bin/skills/clickzetta-metadata/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-metadata/references/instance-views-reference.md +276 -0
- package/bin/skills/clickzetta-metadata/references/metering-views-reference.md +137 -0
- package/bin/skills/clickzetta-metadata/references/show-desc-reference.md +326 -0
- package/bin/skills/clickzetta-metadata/references/views-reference.md +271 -0
- package/bin/skills/clickzetta-oss-ingest-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-overview/SKILL.md +102 -0
- package/bin/skills/clickzetta-overview/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-overview/references/brands-and-endpoints.md +79 -0
- package/bin/skills/clickzetta-overview/references/object-model.md +311 -0
- package/bin/skills/clickzetta-overview/references/studio-modules.md +173 -0
- package/bin/skills/clickzetta-realtime-sync-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-sql-pipeline-manager/eval_cases.jsonl +12 -0
- package/bin/skills/clickzetta-table-stream-pipeline/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-vcluster-manager/eval_cases.jsonl +5 -0
- package/bin/skills/clickzetta-volume-manager/eval_cases.jsonl +5 -0
- package/bin/skills/cz-cli-inner/SKILL.md +5 -4
- package/package.json +1 -1
- package/bin/skills/clickzetta-data-ingest-pipeline/SKILL.md +0 -220
- package/bin/skills/clickzetta-data-ingest-pipeline/eval_cases.jsonl +0 -5
@@ -0,0 +1,483 @@
---
name: clickzetta-metadata
description: |
  Query ClickZetta Lakehouse metadata. Two query methods are covered:
  1. The SHOW/DESC command family (real-time, no latency): a quick look at the current state of objects, suited to ad hoc queries on a single object
  2. INFORMATION_SCHEMA views (about 15 minutes of latency, support complex SQL analysis): suited to aggregate statistics, cost attribution, and cross-object analysis

  How to choose:
  - Quick look at a single object (table structure, columns, cluster state, privileges) → SHOW/DESC
  - Complex SQL analysis, cost attribution, cross-workspace statistics, historical trends → information_schema

  Covers all SHOW commands (TABLES/SCHEMAS/CATALOGS/COLUMNS/VOLUMES/CONNECTIONS/JOBS/VCLUSTERS/
  PIPES/SHARES/USERS/ROLES/GRANTS/FUNCTIONS/TABLE STREAMS/PARTITIONS/SYNONYMS/INDEX/
  DYNAMIC TABLE REFRESH HISTORY/TABLES HISTORY), all DESC commands (TABLE/SCHEMA/HISTORY/
  VCLUSTER/VOLUME/CONNECTION/FUNCTION/VIEW/DYNAMIC TABLE/SHARE/INDEX/TABLE STREAM),
  SHOW CREATE TABLE, load_history(), FROM (SHOW ...) subqueries, context functions,
  and the workspace-level and instance-level INFORMATION_SCHEMA views (TABLES/COLUMNS/JOB_HISTORY/USERS/ROLES/
  VOLUMES/CONNECTIONS/MATERIALIZED_VIEW_REFRESH_HISTORY/STORAGE_METERING/INSTANCE_USAGE, etc.).

  Trigger when the user says "list tables", "view columns", "view column info", "view jobs", "view job history",
  "view JOB history", "SHOW TABLES", "DESC TABLE", "view partitions", "view version history",
  "view dropped tables", "view load history", "load_history", "SHOW JOBS", "view cluster state",
  "view connections", "view privileges", "SHOW GRANTS", "view functions", "view Volume",
  "list Volumes", "view Share", "view Catalog", "view slow queries",
  "view CRU consumption", "billing analysis", "cost analysis", "compute charges", "storage charges",
  "usage statistics", "cost attribution", "which user consumes the most", "storage usage ranking",
  "list users", "view roles", "view Connection", "view materialized view refresh history",
  "metadata query", "information_schema", "view all tables", "list Schemas",
  "summarize storage usage", or "which is faster, SHOW/DESC or information_schema".

  Note: this skill covers only read-only metadata queries (SHOW/DESC/information_schema).
  For privilege changes (GRANT/REVOKE/creating users/role management/data masking), use the clickzetta-access-control skill.
  Keywords: SHOW, DESC, DESCRIBE, metadata, load_history, information_schema, table info, column info, job history, system view, cost analysis, CRU
---

# ClickZetta Metadata Query Guide

## Choosing a Query Method

| Scenario | Recommended method | Reason |
|---|---|---|
| Quick look at current state (tables, columns, clusters) | `SHOW` / `DESC` | Real-time, no latency |
| Complex SQL analysis, aggregate statistics | `information_schema` | Supports JOIN/GROUP BY/WHERE |
| View dropped objects | `SHOW TABLES HISTORY` | Dedicated command, real-time |
| Cost analysis (with monetary amounts) | `SYS.information_schema.INSTANCE_USAGE` / `STORAGE_METERING` | Includes actual amount columns |
| CRU consumption statistics (no amounts) | `information_schema.JOB_HISTORY` | Supports aggregation by user/time |
| Deduplicating imported files | `load_history()` | Dedicated function |
| Cross-workspace queries | `SYS.information_schema.*` | Requires INSTANCE ADMIN |

**Latency**: SHOW/DESC return results in real time; information_schema views lag by about 15 minutes.
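
The selection table above can be encoded as a small lookup. The Python sketch below is illustrative only: the scenario keys and the `recommend_source` helper are hypothetical names, not part of cz-cli or ClickZetta, and the latency for `load_history()` is marked "not stated" because this guide does not give one.

```python
# Hypothetical lookup mirroring the selection table; the keys are illustrative names.
RECOMMENDATIONS = {
    "current_state": ("SHOW / DESC", "real-time"),
    "complex_analysis": ("information_schema", "~15 min"),
    "dropped_objects": ("SHOW TABLES HISTORY", "real-time"),
    "cost_with_amounts": (
        "SYS.information_schema.INSTANCE_USAGE / STORAGE_METERING",
        "~15 min",
    ),
    "cru_consumption": ("information_schema.JOB_HISTORY", "~15 min"),
    "file_dedup": ("load_history()", "not stated"),
    "cross_workspace": ("SYS.information_schema.*", "~15 min"),
}


def recommend_source(scenario):
    """Return (recommended method, latency) for a scenario key."""
    try:
        return RECOMMENDATIONS[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None


if __name__ == "__main__":
    for key in RECOMMENDATIONS:
        method, latency = recommend_source(key)
        print(f"{key}: {method} ({latency})")
```

A caller that cannot tolerate roughly 15 minutes of staleness would restrict itself to the entries marked real-time.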

---

## Reference Documents

- [Full SHOW/DESC syntax](references/show-desc-reference.md)
- [Workspace-level INFORMATION_SCHEMA views](references/views-reference.md)
- [Instance-level views (requires INSTANCE ADMIN)](references/instance-views-reference.md)
- [Cost metering views (STORAGE_METERING / INSTANCE_USAGE)](references/metering-views-reference.md)

---

## SHOW / DESC Quick Reference

### Current Context

```sql
SELECT current_workspace(), current_schema(), current_user(), current_vcluster();
```

### Data Objects

```sql
-- Schemas
SHOW SCHEMAS;
SHOW SCHEMAS EXTENDED;
SHOW SCHEMAS LIKE 'ods%';

-- Tables (including views, materialized views, dynamic tables, external tables)
SHOW TABLES;
SHOW TABLES IN my_schema;
SHOW TABLES LIKE '%order%';
SHOW TABLES WHERE is_view = false AND is_materialized_view = false;  -- regular tables
SHOW TABLES WHERE is_view = true;                                     -- views
SHOW TABLES WHERE is_materialized_view = true;                        -- materialized views
SHOW TABLES WHERE is_dynamic = true;                                  -- dynamic tables
SHOW TABLES WHERE is_external = true;                                 -- external tables
-- ⚠️ SHOW VIEWS IN schema is not supported; use SHOW TABLES WHERE is_view=true

-- Columns
SHOW COLUMNS IN my_schema.my_table;
SHOW COLUMNS FROM my_table IN my_schema;

-- Full CREATE TABLE statement
SHOW CREATE TABLE my_table;

-- Partitions
SHOW PARTITIONS my_table;
SHOW PARTITIONS EXTENDED my_table;
SHOW PARTITIONS my_table PARTITION (dt = '2024-01');
-- ⚠️ SHOW PARTITIONS WHERE col='x' is not supported; use the PARTITION() clause

-- Volumes (IN schema is not supported; filter with WHERE)
SHOW VOLUMES;
SHOW VOLUMES WHERE schema_name = 'my_schema';

-- Table Streams
SHOW TABLE STREAMS;
SHOW TABLE STREAMS IN my_schema;

-- Indexes
SHOW INDEX IN my_schema.my_table;

-- Functions
SHOW FUNCTIONS LIKE '%date%';
SHOW EXTERNAL FUNCTIONS;
-- ⚠️ The IN schema clause is not supported

-- History (including dropped tables)
SHOW TABLES HISTORY;
SHOW TABLES HISTORY IN my_schema;
```

### Catalog (Federated Queries)

```sql
SHOW CATALOGS;
SHOW SCHEMAS IN catalog_name;
SHOW TABLES IN catalog_name.schema_name;
```

### Compute and Connections

```sql
-- Virtual clusters
SHOW VCLUSTERS;
SHOW VCLUSTERS WHERE state = 'RUNNING';

-- Jobs (last 7 days, at most 10000 rows, ORDER BY not supported)
SHOW JOBS LIMIT 20;
SHOW JOBS IN VCLUSTER default_ap LIMIT 20;

-- Dynamic table refresh history (last 7 days)
SHOW DYNAMIC TABLE REFRESH HISTORY LIMIT 20;
SHOW DYNAMIC TABLE REFRESH HISTORY WHERE state = 'FAILED';

-- Connection objects
SHOW CONNECTIONS;
SHOW CONNECTIONS WHERE category = 'STORAGE';

-- Pipes
SHOW PIPES;
SHOW PIPES IN my_schema;
```

### Users, Privileges, and Sharing

```sql
SHOW USERS;
SHOW ROLES;
SHOW GRANTS TO USER alice;
SHOW GRANTS TO ROLE analyst_role;
SHOW SHARES;
```

### DESC Commands

```sql
DESC my_table;
DESC EXTENDED my_table;                    -- includes last_modified_time/properties/statistics
DESC SCHEMA my_schema;
DESC VCLUSTER default_ap;
DESC VOLUME my_volume;
DESC CONNECTION my_oss_conn;
DESC FUNCTION my_schema.my_function;       -- external functions only
DESC SHARE my_share_name;
DESC CATALOG my_catalog;

-- Version history (depends on data_retention_days)
DESC HISTORY my_table;
-- Returns: version, time, total_rows, total_bytes, user, operation, job_id
```

### load_history(): File Import History

```sql
-- The argument must be a quoted string
SELECT file_path, last_copy_time, file_size, status, first_error_message
FROM load_history('my_schema.my_table')
ORDER BY last_copy_time DESC
LIMIT 20;
-- ⚠️ load_history(schema.table) without quotes raises an error
```
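
When the query above is generated from application code, the quoting requirement is easy to get wrong. The following Python sketch builds the same statement and always passes the table name as a quoted string. The `load_history_query` helper and the identifier check are assumptions made for illustration; the check is deliberately conservative and does not describe ClickZetta's actual identifier rules.

```python
import re

# Assumption: a conservative identifier check chosen for this sketch,
# not ClickZetta's actual identifier rules.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def load_history_query(schema, table, limit=20):
    """Build a load_history() query with the table name passed as a quoted string."""
    for part in (schema, table):
        if not _IDENT.fullmatch(part):
            raise ValueError(f"unsupported identifier: {part!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT file_path, last_copy_time, file_size, status, first_error_message\n"
        f"FROM load_history('{schema}.{table}')\n"
        "ORDER BY last_copy_time DESC\n"
        f"LIMIT {limit};"
    )


if __name__ == "__main__":
    print(load_history_query("my_schema", "my_table"))
```

Rejecting names that contain quotes keeps the generated string literal intact, at the cost of refusing some names the engine might accept.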

### FROM (SHOW ...) Subqueries

```sql
-- Filter SHOW results
SELECT schema_name, table_name
FROM (SHOW TABLES IN my_schema)
WHERE is_view = false;

-- Count tables by type
SELECT
  CASE WHEN is_view THEN 'VIEW'
       WHEN is_materialized_view THEN 'MV'
       WHEN is_dynamic THEN 'DT'
       WHEN is_external THEN 'EXTERNAL'
       ELSE 'TABLE' END AS type,
  COUNT(*) AS cnt
FROM (SHOW TABLES IN my_schema)
GROUP BY 1;

-- Find suspended clusters
SELECT name, state FROM (SHOW VCLUSTERS) WHERE state = 'SUSPENDED';
```

**Note**: Creating a view that contains a SHOW command is not supported.

### SHOW/DESC Caveats

1. `SHOW SCHEMAS WHERE`: not supported; run `SHOW SCHEMAS EXTENDED` and filter in the application layer
2. `SHOW VIEWS IN schema`: syntax error; use `SHOW TABLES WHERE is_view=true`
3. `SHOW VOLUMES IN schema`: syntax error; use `SHOW VOLUMES WHERE schema_name='x'`
4. `SHOW PARTITIONS WHERE col='x'`: not supported; use the `PARTITION(col='x')` clause
5. `SHOW JOBS`: shows only the last 7 days, at most 10000 rows; ORDER BY is not supported
6. `LIKE` and `WHERE` cannot be combined: use `FROM (SHOW TABLES) WHERE table_name LIKE 'x%'` instead
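
The workaround in item 6 can be sketched as a small query builder that wraps `SHOW TABLES` in a subquery and puts both the name pattern and the boolean filters in the outer `WHERE`. The helper name `show_tables_filtered` and the identifier check are assumptions for illustration; the column names (`schema_name`, `table_name`, `is_view`, and so on) are the ones used in the examples in this guide.

```python
import re

# Assumption: a conservative identifier check chosen for this sketch.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Boolean columns of SHOW TABLES that appear in this guide.
_FLAGS = ("is_view", "is_materialized_view", "is_dynamic", "is_external")


def show_tables_filtered(schema, name_pattern, **flags):
    """Combine a LIKE pattern with boolean filters via FROM (SHOW TABLES ...)."""
    if not _IDENT.fullmatch(schema):
        raise ValueError(f"unsupported schema name: {schema!r}")
    if "'" in name_pattern:
        raise ValueError("pattern must not contain a single quote")
    conditions = [f"table_name LIKE '{name_pattern}'"]
    for flag, value in flags.items():
        if flag not in _FLAGS:
            raise ValueError(f"unknown filter column: {flag}")
        conditions.append(f"{flag} = {'true' if value else 'false'}")
    where = "\n  AND ".join(conditions)
    return (
        "SELECT schema_name, table_name\n"
        f"FROM (SHOW TABLES IN {schema})\n"
        f"WHERE {where};"
    )


if __name__ == "__main__":
    print(show_tables_filtered("my_schema", "ods%", is_view=False))
```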

---

## INFORMATION_SCHEMA Quick Reference

### Levels

| Level | Access path | Required privilege | Scope |
|---|---|---|---|
| Workspace level | `information_schema.<view_name>` | workspace_admin | Current workspace |
| Instance level | `SYS.information_schema.<view_name>` | INSTANCE ADMIN | All workspaces |

**Important limitations**: All views are read-only, and their data lags by about 15 minutes. Workspace-level views show only objects that currently exist; instance-level views include dropped objects, which you can filter out with `WHERE delete_time IS NULL`.
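
The difference between the two levels shows up in the filters a query needs. The Python sketch below builds a table listing for either level: the instance-level variant adds `delete_time IS NULL` and, optionally, a `table_catalog` filter to narrow the result to one workspace. The helper name `list_tables_query` and the workspace name `my_ws` are hypothetical.

```python
def list_tables_query(instance_level=False, workspace=None):
    """Build a table listing for the workspace-level or instance-level view."""
    if not instance_level:
        # Workspace-level views list only existing objects, so no filter is needed.
        return "SELECT table_schema, table_name\nFROM information_schema.tables;"
    # Instance-level views include dropped objects; exclude them.
    conditions = ["delete_time IS NULL"]
    if workspace is not None:
        if "'" in workspace:
            raise ValueError("workspace name must not contain a single quote")
        conditions.append(f"table_catalog = '{workspace}'")
    where = "\n  AND ".join(conditions)
    return (
        "SELECT table_catalog, table_schema, table_name\n"
        "FROM SYS.information_schema.tables\n"
        f"WHERE {where};"
    )


if __name__ == "__main__":
    print(list_tables_query())
    print(list_tables_query(instance_level=True, workspace="my_ws"))
```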

### Workspace-Level Views (`information_schema.*`)

| View | Description |
|---|---|
| SCHEMAS | All schemas in the current workspace |
| TABLES | All tables in the current workspace (including views and materialized views) |
| COLUMNS | Column information for all tables |
| VIEWS | All view definitions |
| USERS | Users in the workspace and their roles |
| ROLES | Roles in the workspace and their members |
| JOB_HISTORY | Job execution history (retained for 60 days, has a PT_DATE partition column) |
| MATERIALIZED_VIEW_REFRESH_HISTORY | Materialized view refresh history (has a PT_DATE partition column) |
| AUTOMV_REFRESH_HISTORY | Automatic materialized view refresh history |
| VOLUMES | Volume object information |
| CONNECTIONS | Storage connection object information |
| SORTKEY_CANDIDATES | Recommended sort columns |

### Instance-Level Views (`SYS.information_schema.*`)

| View | Description |
|---|---|
| WORKSPACES | Information on all workspaces (including storage usage) |
| SCHEMAS | Schemas in all workspaces (including dropped records) |
| TABLES | Tables in all workspaces (including dropped records) |
| COLUMNS | Columns in all workspaces (including dropped records) |
| VIEWS | Views in all workspaces |
| USERS | Users in all workspaces |
| ROLES | Roles in all workspaces |
| JOB_HISTORY | Job history for all workspaces |
| MATERIALIZED_VIEW_REFRESH_HISTORY | Materialized view refresh history for all workspaces |
| VOLUMES | Volumes in all workspaces |
| CONNECTIONS | Connection objects in all workspaces |
| OBJECT_PRIVILEGES | Privilege grant records |
| SORTKEY_CANDIDATES | Sort column recommendations for all workspaces |
| **STORAGE_METERING** ⭐ | **Storage charge details (managed storage / multi-version storage / network transfer), per day and per workspace** |
| **INSTANCE_USAGE** ⭐ | **Compute charge details (AP/GP clusters / task scheduling / data integration), per day and per workspace** |

---

## Common Query Examples

### Inspecting Table Structure

```sql
-- List all tables in the current workspace
SELECT table_schema, table_name, table_type, row_count, bytes, create_time
FROM information_schema.tables
WHERE table_type = 'MANAGED_TABLE'
ORDER BY table_schema, table_name;

-- List the columns of a table
SELECT column_name, data_type, is_nullable, is_primary_key, is_clustering_column, comment
FROM information_schema.columns
WHERE table_schema = 'my_schema'
  AND table_name = 'my_table'
ORDER BY column_name;

-- Find tables that contain a given column name
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE column_name ILIKE '%user_id%';
```

### Job History

```sql
-- Recent jobs (pt_date is a date partition column, so this returns every job
-- since the start of yesterday, a superset of the last 24 hours)
SELECT job_id, job_creator, status, execution_time, cru,
       input_bytes, output_bytes, start_time
FROM information_schema.job_history
WHERE pt_date >= CAST(CURRENT_DATE - INTERVAL 1 DAY AS DATE)
ORDER BY start_time DESC;

-- Failed jobs
SELECT job_id, job_creator, job_text, error_message, start_time
FROM information_schema.job_history
WHERE status = 'FAILED'
  AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
ORDER BY start_time DESC;

-- CRU consumption by user (last 30 days)
-- Note: the success status value is 'SUCCEED' (not 'SUCCEEDED')
SELECT job_creator,
       COUNT(*) AS job_count,
       SUM(cru) AS total_cru,
       AVG(execution_time) AS avg_exec_sec
FROM information_schema.job_history
WHERE pt_date >= CAST(CURRENT_DATE - INTERVAL 30 DAY AS DATE)
  AND status = 'SUCCEED'
GROUP BY job_creator
ORDER BY total_cru DESC;

-- Slow queries (over 60 seconds)
SELECT job_id, job_creator, execution_time, input_bytes, job_text
FROM information_schema.job_history
WHERE execution_time > 60
  AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
ORDER BY execution_time DESC
LIMIT 20;
```

### Monitoring Materialized View Refreshes

```sql
-- Materialized views whose recent refreshes failed
SELECT schema_name, materialized_view_name, status,
       start_time, end_time, error_message
FROM information_schema.materialized_view_refresh_history
WHERE status = 'FAILED'
  AND pt_date >= CAST(CURRENT_DATE - INTERVAL 7 DAY AS DATE)
ORDER BY start_time DESC;
```

### Cost Analysis (requires INSTANCE ADMIN)

Cost analysis relies on two views that exist only at the instance level, **which JOB_HISTORY.CRU cannot replace**:
- `STORAGE_METERING`: storage charges (managed storage / multi-version storage / network transfer), with actual amounts
- `INSTANCE_USAGE`: compute charges (AP/GP clusters / task scheduling / data integration / streaming integration), with actual amounts

```sql
-- Compute charges this month, by workspace
SELECT workspace_name,
       sku_name,
       ROUND(SUM(measurements_consumption), 2) AS total_cru,
       ROUND(SUM(amount), 2) AS total_amount_yuan
FROM SYS.information_schema.instance_usage
WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
  AND sku_category = 'compute'
GROUP BY workspace_name, sku_name
ORDER BY total_amount_yuan DESC;

-- Storage charges this month, by workspace
SELECT workspace_name,
       sku_name,
       ROUND(SUM(measurements_consumption), 4) AS consumption,
       measurements_unit,
       ROUND(SUM(amount), 4) AS total_amount_yuan
FROM SYS.information_schema.storage_metering
WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
GROUP BY workspace_name, sku_name, measurements_unit
ORDER BY workspace_name, total_amount_yuan DESC;

-- Combined storage + compute charges (this month)
SELECT cost_type, workspace_name,
       ROUND(SUM(total_amount), 2) AS total_yuan
FROM (
  SELECT 'compute' AS cost_type, workspace_name, amount AS total_amount
  FROM SYS.information_schema.instance_usage
  WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
  UNION ALL
  SELECT 'storage' AS cost_type, workspace_name, amount AS total_amount
  FROM SYS.information_schema.storage_metering
  WHERE measurement_start >= DATE_TRUNC('month', CURRENT_DATE)
) t
GROUP BY cost_type, workspace_name
ORDER BY cost_type, total_yuan DESC;

-- Daily compute charge trend (last 30 days)
SELECT DATE(measurement_start) AS dt,
       sku_name,
       ROUND(SUM(amount), 2) AS daily_amount_yuan
FROM SYS.information_schema.instance_usage
WHERE measurement_start >= CURRENT_DATE - INTERVAL 30 DAY
  AND sku_category = 'compute'
GROUP BY DATE(measurement_start), sku_name
ORDER BY dt, daily_amount_yuan DESC;
```

**INSTANCE_USAGE SKU values (sku_category = 'compute'):**

| sku_name | Description |
|---|---|
| AP类型计算集群 | AP-type compute cluster: analytical VCluster charges |
| GP类型计算集群 | GP-type compute cluster: general-purpose VCluster charges |
| 任务调度 | Task scheduling: Studio task scheduling charges |
| 数据集成 | Data integration: offline/real-time sync task charges |
| 流式集成 | Streaming integration: streaming data integration charges |

**STORAGE_METERING SKU values:**

| sku_category | sku_name | Description |
|---|---|---|
| storage | 托管存储容量 | Managed storage capacity: data stored in internal tables |
| storage | 多版本未删除存储 | Multi-version undeleted storage: Time Travel historical versions |
| network | 数据查询Internet数据传输 | Internet data transfer for data queries: public network transfer charges |

### Storage Usage Analysis

```sql
-- Storage usage ranking (current workspace, by table)
SELECT table_schema, table_name,
       ROUND(bytes / 1024.0 / 1024 / 1024, 3) AS size_gb,
       row_count
FROM information_schema.tables
WHERE table_type = 'MANAGED_TABLE'
ORDER BY bytes DESC
LIMIT 20;

-- Storage summary across workspaces (requires INSTANCE ADMIN)
SELECT workspace_name,
       ROUND(workspace_storage / 1024.0 / 1024 / 1024, 2) AS storage_gb
FROM SYS.information_schema.workspaces
WHERE delete_time IS NULL
ORDER BY workspace_storage DESC;

-- Find large tables across workspaces (over 10 GB)
SELECT table_catalog, table_schema, table_name,
       row_count,
       ROUND(bytes / 1024.0 / 1024 / 1024, 2) AS size_gb
FROM SYS.information_schema.tables
WHERE delete_time IS NULL
  AND bytes > 10 * 1024 * 1024 * 1024
ORDER BY bytes DESC;
```

### Users and Privileges

```sql
-- List all users in the workspace and their roles
SELECT user_name, role_names, email, create_time
FROM information_schema.users
ORDER BY create_time DESC;

-- View privilege grant records (requires INSTANCE ADMIN)
SELECT grantor, grantee, granted_to, object_type,
       object_schema, object_name, privilege_type, authorization_time
FROM SYS.information_schema.object_privileges
WHERE grantee = 'some_user'
ORDER BY authorization_time DESC;
```

---

## INFORMATION_SCHEMA Caveats

1. **ROW_COUNT / BYTES are estimates**: they may be inaccurate for PRIMARY KEY tables, tables receiving real-time writes, and after partition operations
2. **JOB_HISTORY is retained for 60 days**: records older than 60 days are purged automatically
3. **Workspace-level views have no DELETE_TIME**: instance-level views include dropped objects; filter with `WHERE delete_time IS NULL`
4. **JOB_HISTORY has a PT_DATE partition column**: filter with `pt_date >= CAST(CURRENT_DATE - INTERVAL N DAY AS DATE)`, which performs better than filtering on `start_time`
5. **Mind the STATUS values**: the JOB_HISTORY success status is `'SUCCEED'` (not `'SUCCEEDED'`); a successful MV refresh is also `'SUCCEED'` (not `'FINISHED'`)
6. **SYS.information_schema contains data for all workspaces**: without a `table_catalog` filter, results from every workspace are returned. The column name is `create_time` (not `created_time`)
7. **STORAGE_METERING / INSTANCE_USAGE are instance-level only**: they require the INSTANCE ADMIN privilege and are accessed through `SYS.information_schema.*`; they include actual amount columns and are the authoritative source for cost analysis
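
Caveats 2, 4, and 5 can be captured once in code so that generated JOB_HISTORY queries always filter on the partition column, stay within the retention window, and use the correct status literal. The Python sketch below is one way to do that; `pt_date_filter` and `cru_by_user_query` are hypothetical helper names, not part of cz-cli.

```python
JOB_HISTORY_RETENTION_DAYS = 60  # caveat 2: older records are purged
SUCCESS_STATUS = "SUCCEED"       # caveat 5: not 'SUCCEEDED'


def pt_date_filter(days):
    """Return the partition filter from caveat 4 for the last `days` days."""
    if not isinstance(days, int) or not 1 <= days <= JOB_HISTORY_RETENTION_DAYS:
        raise ValueError(
            f"days must be an integer between 1 and {JOB_HISTORY_RETENTION_DAYS}"
        )
    return f"pt_date >= CAST(CURRENT_DATE - INTERVAL {days} DAY AS DATE)"


def cru_by_user_query(days=30):
    """Build the per-user CRU aggregation over successful jobs."""
    return (
        "SELECT job_creator, COUNT(*) AS job_count, SUM(cru) AS total_cru\n"
        "FROM information_schema.job_history\n"
        f"WHERE {pt_date_filter(days)}\n"
        f"  AND status = '{SUCCESS_STATUS}'\n"
        "GROUP BY job_creator\n"
        "ORDER BY total_cru DESC;"
    )


if __name__ == "__main__":
    print(cru_by_user_query(7))
```

Rejecting windows longer than 60 days is a design choice of this sketch: such a query would run, but it could silently return less history than the caller asked for.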
@@ -0,0 +1,5 @@
{"case_id":"001","type":"should_call","user_input":"怎么查看当前 schema 下有哪些表?SHOW TABLES 的用法","expected_skill":"clickzetta-metadata","expected_output_contains":["SHOW TABLES"]}
{"case_id":"002","type":"should_call","user_input":"怎么查看一张表的字段信息和分区?","expected_skill":"clickzetta-metadata","expected_output_contains":["DESC","TABLE"]}
{"case_id":"003","type":"should_call","user_input":"怎么通过 information_schema 统计各表的存储用量排行?","expected_skill":"clickzetta-metadata","expected_output_contains":["information_schema","tables"]}
{"case_id":"004","type":"should_call","user_input":"怎么查看过去 7 天的 CRU 消耗和费用归因?哪个用户消耗最多?","expected_skill":"clickzetta-metadata","expected_output_contains":["information_schema","job_history"]}
{"case_id":"005","type":"should_call","user_input":"SHOW/DESC 和 information_schema 有什么区别?该用哪个?","expected_skill":"clickzetta-metadata","expected_output_contains":["SHOW","information_schema"]}