@clickzetta/cz-cli-darwin-x64 0.3.74 → 0.3.76
- package/bin/cz-cli +0 -0
- package/bin/skills/clickzetta-dynamic-table/SKILL.md +188 -126
- package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +7 -11
- package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +135 -0
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +8 -8
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +2 -4
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +2 -10
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +27 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +118 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +225 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +182 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +98 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +76 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +109 -0
- package/bin/skills/cz-cli/SKILL.md +262 -224
- package/package.json +1 -1
@@ -0,0 +1,109 @@
+# SQL → Dynamic Table: Complete Conversion Workflow
+
+When the user gives you a set of CREATE TABLE DDL and INSERT OVERWRITE SQL and asks you to convert them to a Dynamic Table, run the following steps in order.
+
+The detailed rules for each step live in the corresponding skill file; consult those files alongside this workflow.
+
+## Workflow steps
+
+### Step 1: Preprocess the input
+
+Remove from the INSERT OVERWRITE file:
+- All `ALTER TABLE` statements
+- `ANALYZE TABLE` statements
+- SQL comments (`--` and `/* */`)
+
+Keep: CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION.
+
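The Step 1 preprocessing could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the skill's actual implementation: `preprocess_sql` is a hypothetical helper name, statements are assumed to be separated by `;`, and comment markers or semicolons inside string literals are not handled.

```python
import re


def preprocess_sql(sql: str) -> str:
    """Drop comments and ALTER TABLE / ANALYZE TABLE statements; keep the rest."""
    # Remove block comments first, then line comments.
    # Simplification: "--" or "/*" inside a string literal is removed as well.
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", "", sql)

    kept = []
    # Simplification: split on every ";" (a ";" inside a literal would break this).
    for statement in sql.split(";"):
        stripped = statement.strip()
        if not stripped:
            continue
        if re.match(r"(?i)(ALTER|ANALYZE)\s+TABLE\b", stripped):
            continue  # statements the workflow says to remove
        kept.append(stripped)
    return ";\n".join(kept) + (";" if kept else "")
```

Everything that is not explicitly removed (CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION) passes through unchanged.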
+### Step 2: Placeholder replacement
+
+Following the rules in #[[file:sql2dt-placeholder-rules.md]]:
+1. Normalize the placeholder format (`{{ }}` → `${ }`)
+2. Replace every placeholder with a `SESSION_CONFIGS()` call
+3. Handle nodash variables, date arithmetic, and macros functions
+4. Choose the handling by quote context (strip quotes / CONCAT / direct replacement)
+
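Only the first sub-step (format normalization) is specified well enough here to sketch; the `SESSION_CONFIGS()` call syntax is defined in the placeholder rules file and is deliberately not guessed at. The helper names and the variable names `ds` and `ds_nodash` used when trying it out are hypothetical, and writing `${name}` without inner spaces is an assumption.

```python
import re


def normalize_placeholders(sql: str) -> str:
    """Rewrite `{{ name }}` placeholders as `${name}`."""
    return re.sub(r"\{\{\s*(.*?)\s*\}\}", r"${\1}", sql)


def list_placeholders(sql: str) -> list[str]:
    """Return the distinct `${...}` placeholder names in first-seen order."""
    seen: list[str] = []
    for name in re.findall(r"\$\{([^}]*)\}", sql):
        if name not in seen:
            seen.append(name)
    return seen
```

Listing the placeholders after normalization gives the set of variables that the later replacement and refresh-generation steps have to cover.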
+### Step 3: Self-reference detection
+
+Following the rules in #[[file:sql2dt-self-reference-rules.md]]:
+1. Check whether the INSERT OVERWRITE target table appears in FROM/JOIN
+2. If the table is self-referencing, flag it; later steps add a comment and use an explicit schema
+
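A rough Python sketch of the self-reference check, assuming a regex scan is good enough for illustration. Both function names are hypothetical; a real implementation would need a SQL parser to handle subqueries, CTE names that shadow tables, and qualified versus unqualified names for the same table.

```python
import re


def find_insert_target(sql: str):
    """Return the lower-cased INSERT OVERWRITE target table, or None."""
    match = re.search(r"(?i)\bINSERT\s+OVERWRITE\s+(?:TABLE\s+)?([\w.]+)", sql)
    return match.group(1).lower() if match else None


def is_self_referencing(sql: str) -> bool:
    """True when the target table also appears after FROM or JOIN."""
    target = find_insert_target(sql)
    if target is None:
        return False
    sources = re.findall(r"(?i)\b(?:FROM|JOIN)\s+([\w.]+)", sql)
    return target in {source.lower() for source in sources}
```

The comparison is a plain string match, so `dw.orders` and `orders` are treated as different tables.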
+### Step 4: Core conversion
+
+Following the rules in #[[file:sql2dt-conversion-rules.md]]:
+1. Parse the CREATE TABLE DDL (extract columns, partitions, properties, etc.)
+2. Parse the INSERT OVERWRITE (extract the query and the partition type)
+3. Assemble `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...`
+4. Inject static partition values into the SELECT (smart quote handling)
+5. Merge the table-property template (default `data_lifecycle=15`)
+6. Handle UNION ALL (inject into each branch independently)
+7. Date-function post-processing: convert every `DATE_SUB/DATE_ADD` to `sub_days`
+
+### Step 5: Column validation
+
+Following the rules in #[[file:sql2dt-column-validation-rules.md]]:
+1. Count the schema columns and the SELECT columns
+2. Verify that the two counts are equal
+3. Check for duplicate aliases and missing partition columns
+4. Check that all UNION ALL branches have the same column count
+
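The Step 5 checks can be illustrated with a small Python function that works on column lists that have already been extracted. This is a sketch, not the validation rules themselves: `validate_columns` is a hypothetical name, extracting aliases from a SELECT needs a real SQL parser, and comparing names case-insensitively is an assumption.

```python
from collections import Counter


def validate_columns(schema_columns, select_aliases, partition_columns=()):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(schema_columns) != len(select_aliases):
        problems.append(
            f"column count mismatch: schema has {len(schema_columns)}, "
            f"SELECT has {len(select_aliases)}"
        )
    counts = Counter(alias.lower() for alias in select_aliases)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        problems.append("duplicate aliases: " + ", ".join(duplicates))
    present = set(counts)
    missing = [p for p in partition_columns if p.lower() not in present]
    if missing:
        problems.append("missing partition columns: " + ", ".join(missing))
    return problems
```

For a UNION ALL query, the same function could be called once per branch so that every branch is compared against the same schema.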
+### Step 6: Generate companion files
+
+Following the rules in #[[file:sql2dt-refresh-rules.md]]:
+1. Extract all SESSION_CONFIGS variables from the DDL
+2. Generate the current-period refresh statement
+3. Generate the previous-period prev_refresh statement
+4. Generate the backfill statement
+
+### Step 7: Post-conversion improvement suggestions
+
+After the DDL is generated, run the following checks on the result and proactively suggest improvements to the user:
+
+**Check 1: Non-partitioned table + continuous-write risk**
+
+Following the decision logic in #[[file:../best-practices/non-partitioned-merge-into-warning.md]]:
+- The generated DT is a non-partitioned table (neither `PARTITIONED BY` nor `SESSION_CONFIGS()`)
+- And the SQL contains the `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) WHERE rn = 1` deduplication pattern
+
+→ When both conditions hold, warn the user with the alert wording template from that document and recommend switching to MERGE INTO + Table Stream.
+
+**Check 2: SQL performance optimization opportunities**
+
+Following the rules in #[[file:../best-practices/performance-optimization.md]], scan the generated DT SQL:
+- `LEFT/RIGHT/FULL OUTER JOIN` present → note that switching to INNER JOIN, if the business logic allows it, improves incremental efficiency
+- Window function without `PARTITION BY` → suggest adding PARTITION BY; otherwise every incremental refresh recomputes everything
+- `GROUP BY` on a complex expression (such as `DATE_TRUNC` or `SUBSTR`) → suggest precomputing upstream or splitting into multi-level DTs
+
+**Check 3: Dimension tables in JOINs**
+
+Following the recommended scenarios in #[[file:../best-practices/dimension-table-join-guide.md]]:
+- The SQL contains a JOIN → ask the user whether the right-hand table is a rarely changing dimension table (code table, dictionary table, configuration table, etc.)
+- If so → recommend adding the `mv_const_tables` setting to TBLPROPERTIES, and explain its behavior and the data-consistency trade-off
+
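Two of the Check 2 scans (outer joins and window functions without `PARTITION BY`) lend themselves to a quick regex sketch in Python. `performance_hints` is a hypothetical name; the sketch ignores named windows (`OVER w`) and does not attempt the `GROUP BY` expression check, which needs real parsing.

```python
import re


def performance_hints(sql: str) -> list[str]:
    """Return human-readable hints for two of the Check 2 patterns."""
    hints = []
    if re.search(r"(?i)\b(LEFT|RIGHT|FULL)\s+(OUTER\s+)?JOIN\b", sql):
        hints.append("outer join: consider INNER JOIN if the business logic allows it")
    # An OVER ( ... ) clause whose body does not start with PARTITION BY.
    if re.search(r"(?i)\bOVER\s*\((?!\s*PARTITION\s+BY\b)", sql):
        hints.append("window function without PARTITION BY: add one to avoid full recomputation")
    return hints
```

An empty list means neither pattern was found, not that the SQL is free of performance issues.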
+## Output checklist
+
+For each table, the final output is:
+
+| File | Content | Condition |
+|------|------|------|
+| `<table_name>.sql` | Dynamic Table DDL | Always generated |
+| `<table_name>_refresh.sql` | REFRESH statement for the current period | Always generated |
+| `<table_name>_prev_refresh.sql` | REFRESH statement for the previous period | Only when partition variables exist |
+| `<table_name>_backfill.sql` | Backfill statement | Only when partition variables exist |
+
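The table's conditions reduce to a single boolean, which a short Python sketch makes explicit. `companion_files` is a hypothetical helper, and how "partition variables exist" is decided is left to the refresh rules file.

```python
def companion_files(table_name: str, has_partition_variables: bool) -> list[str]:
    """List the files to emit for one converted table."""
    files = [f"{table_name}.sql", f"{table_name}_refresh.sql"]  # always generated
    if has_partition_variables:
        files += [f"{table_name}_prev_refresh.sql", f"{table_name}_backfill.sql"]
    return files
```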
+## Quick decision path
+
+```
+Input DDL + INSERT OVERWRITE
+│
+├─ Placeholders present? → Step 2 placeholder replacement
+│
+├─ Self-referencing? → Step 3 special handling
+│
+├─ Static partitions? → Step 4 inject partition values into the SELECT
+│
+├─ UNION ALL present? → Step 4 inject into each branch independently
+│
+└─ Generate DDL → Step 5 validation → Step 6 companion files → Step 7 improvement suggestions
+```
@@ -1,274 +1,312 @@
 ---
 name: cz-cli
-description: "
-generated_cli_version: "0.0.0-dev+202605250308"
-generated_with: "cz-cli-skill-generator/1.0.0"
-command_inventory_source: "click-command-tree"
-command_count: 51
+description: "Route ALL ClickZetta Lakehouse operations to cz-cli: SQL, Studio tasks, tables, pipelines, profiles. Use when user mentions ClickZetta, Lakehouse, cz-cli, or needs profile/connection configuration."
 ---
 
-# cz-cli
+# cz-cli — ClickZetta Lakehouse Subagent
 
-
-> **Fallback**: If `cz-cli` command not present, check for a bundled binary at `scripts/<platform>/cz-cli` (e.g. `scripts/macos-arm64/cz-cli`, `scripts/linux-x86_64/cz-cli`, `scripts/windows-x86_64/cz-cli.exe`). If found, use its absolute path directly. Otherwise run `pip3 install cz-cli -U` then use `cz-cli` from PATH.
+You have no direct Lakehouse access. Always delegate via cz-cli.
 
-##
+## Capabilities
 
-
-
+### SQL & Data Operations
+- Execute any SQL against Lakehouse: SELECT, DDL (CREATE/ALTER/DROP TABLE, SCHEMA, VIEW), DML (INSERT, UPDATE, DELETE, MERGE INTO)
+- Run async jobs and fetch results
+- Preview table data and row counts
 
-###
+### Table & Schema Management
+- List, describe, create, and drop tables and schemas
+- View table history, indexes, partitions, and statistics
+- Add or update column/table comments
+- Create Dynamic Tables for auto-incremental ETL (ODS→DWD→DWS pipelines)
+- Create Materialized Views for pre-computed aggregations
+- Create Table Streams to capture INSERT/UPDATE/DELETE changes for CDC UPSERT
 
-
+### Studio Task Management
+- Create, configure, deploy, and delete Studio tasks (SQL, Shell, Python, integration, flow)
+- Save task content and cron schedule
+- Deploy, undeploy, and execute tasks ad-hoc
+- Monitor run instances: list, detail, wait, logs, stop, rerun, backfill
+- View run statistics and dependencies
 
-###
+### Data Sync Pipelines
+- Create single-table realtime CDC sync tasks (MySQL/PostgreSQL/SQL Server → Lakehouse, task_type=28)
+- Create multi-table or whole-database CDC sync tasks — mirror, merge, or sharded-table consolidation (task_type=281)
+- Create offline batch sync tasks with Cron scheduling — single-table (task_type=10) or multi-table (task_type=291)
+- Manage sync task lifecycle: start, stop, offline, backfill, add tables, re-sync individual tables
 
-
+### Data Ingestion Pipelines
+- Create continuous OSS/S3/COS ingest PIPE (LIST_PURGE scan mode or EVENT_NOTIFICATION mode)
+- Create continuous Kafka ingest PIPE using READ_KAFKA function
+- One-shot file import from URL, local path, or Volume (COPY INTO)
+- Manage PIPE lifecycle: pause, resume, adjust batch interval, view load history
 
-
-
-
-
+### Data Recovery
+- Query data at a historical point in time (Time Travel: TIMESTAMP AS OF)
+- Roll back a table to a previous version (RESTORE TABLE)
+- Recover accidentally dropped tables, dynamic tables, or materialized views (UNDROP TABLE)
+- Inspect change history (DESC HISTORY) and deleted objects (SHOW TABLES HISTORY)
+- Configure retention: data_retention_days, data_lifecycle
 
-
+### Connection & Environment Management
+- Create, update, delete, and switch connection profiles (multi-cloud, multi-region)
+- Check connection status and switch workspace
+- List and test external data sources, browse catalogs and objects
+- Generate Python SDK / ZettaPark / SQLAlchemy / JDBC connection code from profile parameters
 
-
+## Route to cz-cli
 
+- Any mention of cz-cli, ClickZetta, or Lakehouse
+- SQL against Lakehouse (SELECT, DDL, DML)
+- Studio task development, scheduling, monitoring
+- Data pipelines (sync, import, CDC, streaming)
+- Profile and connection configuration
+- Performance analysis and optimization
 
-
+## Keep in host agent
 
-
-
-
+- General programming unrelated to ClickZetta
+- Local file operations, git
+- Non-ClickZetta databases (PostgreSQL, MySQL standalone)
 
-
+## Execution strategy
 
-
-- **lakehouse-doc** — ClickZetta Lakehouse official documentation. Covers SQL syntax, data types, functions, DDL/DML commands, dynamic tables, materialized views, access control, VCluster, data lake, AI functions, etc. When the user writes or asks about SQL syntax or Lakehouse dialect specifics, **MUST** consult lakehouse-doc skill references first to ensure accuracy.
+**MANDATORY: Follow this order every time. No exceptions.**
 
-**
+**Step 1 — MUST check LLM first:**
 
-
+```bash
+cz-cli agent llm show
+```
+
+**Step 2 — If active LLM is configured (kind != "none"), you MUST use agent run:**
+
+```bash
+cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions
+```
+
+Do NOT use direct cz-cli subcommands when an active LLM is available. Always delegate the full request to `cz-cli agent run`.
+
+The output includes a `session_id`. Use it to inspect the run:
+- `cz-cli agent session status <session_id>` — current state (`busy`/`retry` with `progress`, `idle` with `result`, or `error`)
+- `cz-cli agent export <session_id>` — full conversation (messages, tool calls, reasoning, text). Wait until `status` is `idle` before exporting; otherwise the conversation may still be in progress.
+
+**Step 3 — ONLY if no active LLM (kind: "none" or empty list), fall back to direct commands:**
+
+Decompose the request into concrete `cz-cli` subcommands (`sql`, `schema`, `table`, `task`, `runs`, `job`, `datasource`, `profile`, etc.), execute them, and synthesize the result.
+
+Use direct commands for local setup and diagnostics even when agent path is available: `cz-cli profile ...`, `cz-cli -p <profile> status`, `cz-cli agent llm ...`, `cz-cli --help`.
+
+With session continuity:
+
+```bash
+cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions --session <session_id>
+```
+
+Reuse `session_id` for follow-ups on the same topic. Omit `--session` to start fresh.
+
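For a host agent written in Python, the `agent run` invocation could be assembled as an argument list rather than a shell string, which avoids quoting problems in the request text. A minimal sketch: `agent_run_command` is a hypothetical helper, it only uses flags that appear in this skill file (`--format a2a`, `--dangerously-skip-permissions`, `--session`, `--profile`, `--async`), and the flag order is an assumption.

```python
def agent_run_command(request, session_id=None, profile=None, run_async=False):
    """Build the argv list for `cz-cli agent run`."""
    argv = ["cz-cli", "agent", "run", request]
    if run_async:
        argv.append("--async")
    if profile:
        argv += ["--profile", profile]
    argv += ["--format", "a2a", "--dangerously-skip-permissions"]
    if session_id:
        argv += ["--session", session_id]  # omit to start a fresh session
    return argv
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)` without going through a shell.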
+## Async mode (non-TTY / long-running tasks)
+
+When running in non-TTY environments (e.g. as a subagent from Claude Code) or for long-running tasks, use async mode to avoid blocking:
+
+### Submit asynchronously
+
+```bash
+cz-cli agent run "<request>" --async --format a2a --dangerously-skip-permissions
+```
+
+Returns immediately with a session ID:
+```json
+{"session_id": "01JXF3K...", "status": "running", "message": "Session submitted asynchronously"}
+```
+
+Note: In non-TTY with `--format a2a` or `--format json`, async mode activates automatically (no `--async` flag needed).
+
+### Poll status
+
+```bash
+cz-cli agent session status <session_id>
+```
+
+While running:
+```json
+{"session_id": "01JXF3K...", "status": "busy", "progress": "$ cz-cli table list -o table"}
+```
+
+Other progress examples you may see during polling:
+- `"💭 Thinking..."` — LLM is reasoning
+- `"✏ Generating response..."` — LLM is writing the reply
+- `"✱ Grep \"error\" · 3 matches"` — running a search tool
+- `"↻ Retry (attempt 2)"` — retrying a failed LLM call (paired with a `retry` field describing the reason)
+
+When complete:
+```json
+{"session_id": "01JXF3K...", "status": "idle", "result": "Here are the results:\n..."}
+```
+
+The `result` field is the final text reply. For full conversation details (thinking, tool calls, intermediate text), use `cz-cli agent export <session_id>`.
+
+If the session does not exist:
+```json
+{"session_id": "ses_invalid", "error": "Session not found"}
+```
+(exits with code 1)
+
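A polling client has to tell the three payload shapes apart, including the error case that carries no `status` field. The Python sketch below maps a status payload to a simple `(state, detail)` pair; `interpret_status` and the state labels are hypothetical, and treating a `status` of `"error"` the same as an `error` key is an assumption.

```python
import json


def interpret_status(raw: str):
    """Map a `cz-cli agent session status` JSON payload to (state, detail)."""
    payload = json.loads(raw)
    status = payload.get("status")
    if "error" in payload or status == "error":
        return ("error", payload.get("error", ""))
    if status == "idle":
        return ("done", payload.get("result", ""))
    if status in ("busy", "retry"):
        return ("running", payload.get("progress", ""))
    return ("unknown", status)
```

Checking for the error case first matters in a polling loop: a loop that only waits for `idle` would never terminate on a "Session not found" response.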
+### Retrieve full conversation (thinking + tool calls + text)
+
+```bash
+cz-cli agent export <session_id>
+```
+
+Returns complete session with all message parts:
+```json
+{
+  "info": { "id": "...", "title": "...", "time": {...} },
+  "messages": [
+    {
+      "info": { "role": "user" },
+      "parts": [{ "type": "text", "text": "original prompt" }]
+    },
+    {
+      "info": { "role": "assistant" },
+      "parts": [
+        { "type": "reasoning", "text": "thinking content..." },
+        { "type": "tool", "tool": "bash", "state": { "status": "completed", "input": {...}, "output": "..." } },
+        { "type": "text", "text": "final answer..." }
+      ]
+    }
+  ]
+}
+```
+
+Part types in export:
+- `reasoning` — LLM thinking/reasoning blocks
+- `tool` — tool calls with full input/output (bash, read, write, edit, glob, grep, etc.)
+- `text` — final text response
+- `step-start` / `step-finish` — step boundaries
+- `patch` — code diffs
+- `subtask` — delegated sub-tasks
+
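Given an export shaped like the sample above, the tool calls and the final answer can be pulled out with a short traversal. This Python sketch assumes the export has already been parsed into a dict and that only the fields shown in the sample are present; `summarize_export` is a hypothetical helper.

```python
def summarize_export(export: dict) -> dict:
    """Collect (tool, status) pairs and the last assistant text part."""
    tool_calls, texts = [], []
    for message in export.get("messages", []):
        if message.get("info", {}).get("role") != "assistant":
            continue
        for part in message.get("parts", []):
            if part.get("type") == "tool":
                status = part.get("state", {}).get("status")
                tool_calls.append((part.get("tool"), status))
            elif part.get("type") == "text":
                texts.append(part.get("text", ""))
    return {"tool_calls": tool_calls, "final_text": texts[-1] if texts else None}
```

Part types the function does not recognize (`reasoning`, `step-start`, `patch`, and so on) are skipped rather than treated as errors.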
+### Async workflow pattern
 
-
+```bash
+# 1. Submit
+SESSION=$(cz-cli agent run "complex analysis" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
+
+# 2. Poll until done, printing progress along the way
+while true; do
+  STATUS=$(cz-cli agent session status $SESSION)
+  STATE=$(echo "$STATUS" | jq -r '.status')
+  if [ "$STATE" = "idle" ]; then
+    echo "$STATUS" | jq -r '.result'
+    break
+  fi
+  echo "$STATUS" | jq -r '.progress // empty'
+  sleep 5
+done
+
+# Need full conversation (thinking + tool calls)?
+cz-cli agent export $SESSION
+```
 
-
-- Asking about ClickZetta Lakehouse SQL dialect syntax, keywords, or function usage
-- Using data types, type casting, or datetime formats
-- Creating or altering tables, views, materialized views, dynamic tables, external tables
-- Data import/export (COPY INTO, PUT/GET, BulkLoad, Pipe)
-- Access control (GRANT / REVOKE), roles, permissions
-- VCluster configuration and management
-- Index creation and usage (inverted index, BloomFilter, vector index)
-- AI functions, vector search, semantic views
-- Information Schema system view queries
+### With session continuity (async)
 
-
+```bash
+# First turn
+SESSION=$(cz-cli agent run "describe sales table" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
+# ... wait for completion ...
 
-
+# Follow-up turn on same session
+cz-cli agent run "now show row counts" --async --format a2a --dangerously-skip-permissions --session $SESSION
+```
 
-
-After saving, tell the user: "The task has been saved as a draft. Scheduling is not yet active. Let me know explicitly if you want to publish or execute it."
+### Important notes for async mode
 
-
+- **Permissions:** Always use `--dangerously-skip-permissions` — async mode cannot handle interactive permission prompts
+- **Server requirement:** An agent runtime server must be running (or will be started automatically)
+- **Error handling:** If session is already busy, returns `{"error": "session busy"}`
 
-
+## Multi-environment (profiles)
 
-
+When the user specifies an environment or profile (e.g. "use uat_test", "on the test instance"):
 
-
-
-
-- Requires the task to already be published (online); draft tasks cannot be backfilled
+```bash
+cz-cli agent run "<request>" --profile uat_test --format a2a --dangerously-skip-permissions
+```
 
-
+Available profiles: read `~/.clickzetta/profiles.toml` or run `cz-cli profile list`.
 
-
+## Adding a new profile
 
-**
-- If the user only needs "latest / recent N items": stop after page 1, do not paginate.
-- If the user needs the full dataset: paginate until `ai_message` no longer contains a next-page hint.
-- **Safety limit**: auto-paginate at most **3 pages** (≈ 30 items). If more exist, surface the next-page command to the user and ask whether to continue.
+**Trigger conditions:** User says "configure new environment", "add profile", "can't connect", mentions an unknown profile name, or provides connection credentials.
 
-###
+### Step 1 — Collect information (guided Q&A)
 
-
-The Python SDK(connector/igs/bulkload) is using from clickzetta import connect instead of Zettapark's session.
+If all required fields are already provided, skip directly to Step 2.
 
-
-- If user intent says “Python task”, command must include `--type PYTHON`.
-- If `--type FLOW`, immediately switch to Rule 6 — use Flow-specific commands exclusively.
+Otherwise, ask for missing ones. Accept all at once or prompt one by one.
 
-
+**Required fields:**
 
-
+| Field | Question to ask | Example |
+|-------|----------------|---------|
+| `service` | Which cloud region? (see table below, or provide the service endpoint directly) | `cn-shanghai-alicloud.api.clickzetta.com` |
+| `instance` | What is the instance name? | `billingsh` |
+| `workspace` | What is the workspace name? | `meter_n_bill` |
+| `username` | What is the username? | `billing_admin` |
+| `password` | What is the password? | — |
+| `name` | What should this profile be named? (suggested format below) | `billingsh` |
 
-
-- **MUST NOT** use `task save`, `task save-config`, `task detail`, or `task online` on Flow child nodes — these tools are for regular (non-Flow) tasks only and will produce incorrect results or errors.
-- **Decision rule**: if `task_type == 500` OR the user mentions flow/workflow context → use `task flow *` commands unconditionally.
+**Common service endpoints (offer as options):**
 
-
+| Region | service | Suggested profile prefix |
+|--------|---------|--------------------------|
+| Alibaba Cloud East China 2 (Shanghai) | `cn-shanghai-alicloud.api.clickzetta.com` | `cn-shanghai` |
+| Tencent Cloud East China (Shanghai) | `ap-shanghai-tencentcloud.api.clickzetta.com` | `ap-shanghai` |
+| Tencent Cloud North China (Beijing) | `ap-beijing-tencentcloud.api.clickzetta.com` | `ap-beijing` |
+| Tencent Cloud South China (Guangzhou) | `ap-guangzhou-tencentcloud.api.clickzetta.com` | `ap-guangzhou` |
+| AWS China (Beijing) | `cn-north-1-aws.api.clickzetta.com` | `cn-north-1` |
 
-
+**Inference rules (reduce unnecessary questions):**
+- If the user describes a cloud region in natural language (e.g. "Alibaba Cloud Shanghai", "Tencent Cloud Beijing", "阿里云上海", "腾讯云北京"), look up the service endpoint from the table above — do NOT ask the user to provide it again.
+- If the user hasn't provided a profile name, suggest `<prefix>-<instance>` using the prefix from the table (e.g. `cn-shanghai-billingsh`). Confirm with the user or proceed if they don't object.
 
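The profile-name inference rule is a table lookup plus string formatting, sketched here in Python. The endpoint-to-prefix mapping is copied from the table above; `suggest_profile_name` is a hypothetical helper, and falling back to the bare instance name for an endpoint that is not in the table is an assumption.

```python
# Service endpoint -> suggested profile prefix, from the endpoint table.
SERVICE_PREFIXES = {
    "cn-shanghai-alicloud.api.clickzetta.com": "cn-shanghai",
    "ap-shanghai-tencentcloud.api.clickzetta.com": "ap-shanghai",
    "ap-beijing-tencentcloud.api.clickzetta.com": "ap-beijing",
    "ap-guangzhou-tencentcloud.api.clickzetta.com": "ap-guangzhou",
    "cn-north-1-aws.api.clickzetta.com": "cn-north-1",
}


def suggest_profile_name(service: str, instance: str) -> str:
    """Suggest `<prefix>-<instance>`; fall back to the instance name alone."""
    prefix = SERVICE_PREFIXES.get(service)
    return f"{prefix}-{instance}" if prefix else instance
```

With the example values from the required-fields table, the function returns `cn-shanghai-billingsh`, matching the example in the inference rules.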
-
+### Step 2 — Create profile
 
-
+Run `cz-cli profile create` with all collected fields:
 
-
+```bash
+cz-cli profile create <name> \
+  --username <username> \
+  --password <password> \
+  --instance <instance> \
+  --workspace <workspace> \
+  --service <service> \
+  --schema public \
+  --vcluster default
+```
+
+### Step 3 — Verify connection
+
+After creating, run:
 
 ```bash
-
-
+cz-cli status --profile <name>
+```
+
+A successful response looks like:
+```json
+{"data": {"connected": true, "workspace": "...", "time_ms": ...}}
+```
+
+If it fails, report the error and ask the user to double-check credentials or service endpoint.
+
+## Error handling
 
-
-cz-cli task content my_task --format toon & cz-cli runs list --task my_task --format toon & wait
+All errors in non-TTY mode output JSON to stdout:
 
-
-
+```json
+{"ok": false, "error": "NO_PROFILE", "next_steps": ["cz-cli setup --credential <base64>"]}
 ```
 
-
-- `--field <name>` — extract one value as raw text: `cz-cli sql "..." --field job_id` → `2026042818122957849079780`
-- `--format toon` — line-per-field output, works with `grep`/`head`
-- `--format json` — single-line JSON, parse in code only (do NOT pipe to grep/head)
-
-**Shortcut**: Use `cz-cli sql --sync "..." > /tmp/res.json` to force synchronous execution (waits for results inline). Prefer `--sync` for simple/fast queries.
-
-## Command Quick Reference
-
-### Connection & Profile
-- `cz-cli profile create <name> --username U --password P --instance I --workspace W` — Create profile
-- `cz-cli profile create <name> --pat TOKEN --instance I --workspace W` — Create profile with PAT
-- `cz-cli profile list` — List all profiles
-- `cz-cli profile status` — Test connection, show workspace/schema
-- `cz-cli profile use <name>` — Set default profile
-- `cz-cli profile delete <name>` — Delete a profile
-- `cz-cli profile discover --studio-url URL --username U --password P` — Discover regions/instances
-
-### SQL Execution
-- `cz-cli sql "SELECT * FROM t LIMIT 10"` — Execute SQL query
-- `cz-cli sql -f query.sql` — Execute SQL from file
-- `cz-cli sql --write "INSERT INTO t VALUES (...)"` — Write operation (requires --write)
-- `cz-cli sql --async "SELECT count(*) FROM big_table"` — Async execution, returns job_id
-- `cz-cli sql --sync "SELECT 1"` — Force synchronous execution
-- `cz-cli sql -e "SELECT * FROM t WHERE id = %(id)s" --variable id=123` — Variable substitution
-
-### Job Management (async SQL results)
-- `cz-cli job status <job_id>` — Check job status
-- `cz-cli job result <job_id>` — Fetch result set (waits if running, limited to 100 rows)
-- `cz-cli job result --no-limit <job_id>` — Fetch full result set
-
-### Schema & Table
-- `cz-cli schema list` / `cz-cli schema describe <name>` / `cz-cli schema create <name>` / `cz-cli schema drop <name>`
-- `cz-cli table list [--schema S] [--like 'pattern%']` / `cz-cli table describe <name>` / `cz-cli table preview <name> [--limit N]`
-- `cz-cli table stats <name>` / `cz-cli table history [name]` / `cz-cli table create "DDL"` / `cz-cli table drop <name>`
-
-### Workspace
-- `cz-cli workspace current` — Show current workspace
-- `cz-cli workspace use <name> [--persist]` — Switch workspace
-
-### Task Scheduling (Studio)
-- `cz-cli task list [--page N --page-size N]` — List tasks
-- `cz-cli task create <name> --type SQL|PYTHON|SHELL|SPARK|FLOW --folder F` — Create task
-- `cz-cli task content <name_or_id>` — Get task script, config and params (draft); response includes `params` array with `paramType=system` (built-in params / time expressions) or `paramType=manual` (constants)
-- `cz-cli task save-content <name_or_id> --file script.sql [--params '{"key":"val","dt":"bizdate","yd":"$[yyyy-MM-dd,-1d]"}']` — Save script and optionally set params; system params (bizdate, sys_biz_day, sys_biz_datetime, sys_plan_day, sys_plan_datetime, sys_plan_timestamp, sys_task_id, sys_task_name, sys_task_owner) and time expressions starting with `$[` are auto-detected as `paramType=system`
-- `cz-cli task save-config <name_or_id> --cron "0 0 8 * * ? *"` — Save schedule (sec min hour day month week year)
-- `cz-cli task online <name_or_id> -y` — Publish task
-- `cz-cli task offline <name_or_id> -y` — Take offline (IRREVERSIBLE)
-- `cz-cli task execute <name_or_id> [--param KEY=VAL ...]` — Ad-hoc execution; auto-loads saved `manual` params as defaults (system params like `bizdate` are NOT auto-injected in adhoc mode — pass them explicitly via `--param`); warns if unresolved `${placeholders}` remain after merge (SQL tasks will fail, Python/Shell silently keep literal strings)
-- `cz-cli task flow dag <flow>` / `task flow create-node` / `task flow bind` / `task flow submit` — Flow operations
-
-### Runs & Attempts
-- `cz-cli runs list [--task T --status S --run-type SCHEDULE|REFILL --from D --to D]`
-- `cz-cli runs detail <run_id_or_task>` / `cz-cli runs logs <run_id_or_task>` / `cz-cli runs wait <id> --timeout N`
-- `cz-cli runs refill <task> --from D --to D [-y]` — **Backfill**: re-run scheduled instances for a historical date range (task must be online). `D` accepts `YYYY-MM-DD` (day boundary: `--from` = start of day, `--to` = 23:59:59) or `YYYY-MM-DDTHH:MM:SS` for exact datetime — **use ISO datetime for hourly/minutely tasks** to avoid missing instances
-- `cz-cli attempts list <run_id_or_task>` / `cz-cli attempts log <run_id_or_task> [--attempt-id N]`
-
-### Agent (AI Agent)
-- `cz-cli agent run "<prompt>" [--session ID]` — Run AI agent with a natural-language prompt
-- `cz-cli agent llm` — Manage LLM providers
-
-## Command Inventory (Generated)
-
-### `ai-guide`
-- `ai-guide` - Generate AI-friendly command reference
-
-### `attempts`
-- `attempts` - Manage attempt records
-- `attempts list` - List attempts for a run
-- `attempts log` - Get attempt log
-
-### `job`
-- `job` - Job performance tools
-- `job result` - Fetch result set of a SQL job
-- `job status` - Check status/summary of a SQL job
-
-### `profile`
-- `profile` - Manage connection profiles
-- `profile create` - Create a profile
-- `profile delete` - Delete a profile
-- `profile detail` - Show profile details
-- `profile list` - List profiles
-- `profile update` - Update a profile
-- `profile use` - Set default profile
-
-### `runs`
-- `runs` - Manage task run instances
-- `runs deps` - View run dependencies
-- `runs detail` - Get run detail
-- `runs list` - List run instances
-- `runs logs` - Get run logs
-- `runs refill` - Submit backfill job
-- `runs rerun` - Rerun a failed instance
-- `runs stats` - Get run statistics summary
-- `runs stop` - Stop a running instance
-- `runs wait` - Poll until run completes
-
-### `schema`
-- `schema` - Manage schemas
-- `schema create` - Create a schema
-- `schema describe` - Describe a schema
-- `schema drop` - Drop a schema
-- `schema list` - List schemas
-
-### `sql`
-- `sql` - Execute SQL against ClickZetta
-- `cz-cli sql "SELECT 1"` — Run a simple query
-- `cz-cli sql -e "SELECT * FROM t LIMIT 10" --sync` — Synchronous query
-- `cz-cli sql -f query.sql --write` — Execute write SQL from file
-
-### `table`
-- `table` - Manage tables
-- `table create` - Create a table from DDL
-- `table describe` - Describe a table
-- `table drop` - Drop a table
-- `table history` - Show table history
-- `table list` - List tables
-- `table preview` - Preview table data
-- `table stats` - Get table row count
-
-### `task`
-- `task` - Manage Studio tasks
-- `task content` - Get task content
-- `task create` - Create a new task
-- `task deps` - Show task dependencies
-- `task execute` - Execute a task ad-hoc
-- `task list` - List tasks
-- `task offline` - Take a task offline
-- `task online` - Publish a task
-- `task save-config` - Save task schedule config
-- `task save-content` - Save task script
-
-### `workspace`
-- `workspace` - Manage workspaces
-- `workspace current` - Show current workspace
-- `workspace list` - List workspaces
-
-## Command Risk Reference
-
-| Risk Level | Commands | Key Concern |
-|-----------------|--------------------------------------------------|-----------------------------------------------------------------|
-| 🔴 Irreversible | `task offline`, `schema drop`, `table drop` | Cannot be undone; clears history or deletes objects permanently |
-| 🟠 High Impact | `task online`, `runs refill`, `task flow submit` | Affects live schedule or re-runs historical data |
-| 🟢 Safe | All others | No side effects |
+On `NO_PROFILE` error: check if a profile can be configured via username/password (see "Adding a new profile" above). If the user has a base64 credential instead, guide them to run `cz-cli setup --credential <base64>`. See `references/profile-setup.md`.
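A caller can branch on the JSON error payload instead of matching on message text. The Python sketch below assumes the payload has the three fields shown in the sample (`ok`, `error`, `next_steps`) and that a payload without an `ok` key is not an error; `parse_error` is a hypothetical helper.

```python
import json


def parse_error(raw: str):
    """Return (error_code, next_steps) for an error payload, else None."""
    payload = json.loads(raw)
    if payload.get("ok", True):  # assumption: a missing "ok" means no error
        return None
    return (payload.get("error"), list(payload.get("next_steps", [])))
```

For the `NO_PROFILE` sample this yields the error code together with the suggested `cz-cli setup --credential <base64>` command, which the host agent can surface to the user.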