@clickzetta/cz-cli-darwin-arm64 0.3.74 → 0.3.76

@@ -0,0 +1,109 @@
1
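+ # SQL → Dynamic Table: Complete Conversion Workflow
2
+
3
+ When the user gives you a set of CREATE TABLE DDL and INSERT OVERWRITE SQL statements and asks you to convert them to a Dynamic Table, execute the following steps in order.
4
+
5
+ The detailed rules for each step live in the corresponding skill file; consult those files alongside this workflow.
6
+
7
+ ## Workflow steps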
8
+
9
+ ### Step 1: Preprocess the input
10
+
11
+ Remove from the INSERT OVERWRITE file:
12
+ - All `ALTER TABLE` statements
13
+ - `ANALYZE TABLE` statements
14
+ - SQL comments (`--` and `/* */`)
15
+
16
+ Keep: CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION.
17
+
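A minimal sketch of the preprocessing in Step 1, written in Python for illustration. The function name `preprocess` is hypothetical and not part of cz-cli, and the regex-based approach is an assumption: it does not protect comment markers or semicolons that appear inside string literals.

```python
import re

# Statement prefixes that Step 1 says to remove.
DROP_PREFIXES = ("ALTER TABLE", "ANALYZE TABLE")


def preprocess(sql: str) -> list[str]:
    """Strip SQL comments, then drop ALTER TABLE / ANALYZE TABLE statements."""
    # Remove /* ... */ block comments first, then -- line comments.
    # Caveat: this naive pass also removes markers inside string literals.
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", "", sql)

    kept = []
    # Caveat: splitting on ';' assumes no semicolons inside string literals.
    for stmt in sql.split(";"):
        stmt = stmt.strip()
        if not stmt:
            continue
        # Compare the first two keywords, case-insensitively.
        head = " ".join(stmt.upper().split()[:2])
        if head in DROP_PREFIXES:
            continue
        kept.append(stmt)
    return kept
```

Everything that is not explicitly dropped is kept, which matches the "keep" list above without having to enumerate it.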
18
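+ ### Step 2: Placeholder replacement
19
+
20
+ Following the rules in #[[file:sql2dt-placeholder-rules.md]]:
21
+ 1. Normalize the placeholder format (`{{ }}` → `${ }`)
22
+ 2. Replace every placeholder with a `SESSION_CONFIGS()` call
23
+ 3. Handle nodash variables, date arithmetic, and macros functions
24
+ 4. Choose the handling based on the quoting context (strip quotes / CONCAT / direct replacement)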
25
+
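The first two items of Step 2 can be sketched roughly as follows. The function names are hypothetical, and the `dt.args.` key prefix inside the `SESSION_CONFIGS()` lookup is an assumption: the authoritative format is defined in the placeholder rules file, which is not shown here. Quote-context handling, nodash variables, and date arithmetic are deliberately left out.

```python
import re


def normalize_placeholders(sql: str) -> str:
    """Rewrite Jinja-style {{ name }} placeholders as ${name}."""
    return re.sub(r"\{\{\s*(.*?)\s*\}\}", r"${\1}", sql)


def replace_placeholders(sql: str, key_prefix: str = "dt.args.") -> str:
    """Replace ${name} with a SESSION_CONFIGS() lookup.

    The key format (key_prefix + name) is an assumption for illustration.
    """
    sql = normalize_placeholders(sql)
    return re.sub(
        r"\$\{\s*(\w+)\s*\}",
        lambda m: f"SESSION_CONFIGS()['{key_prefix}{m.group(1)}']",
        sql,
    )
```

A placeholder wrapped in quotes would need the quote-context rules from item 4 before this replacement is safe to apply.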
26
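+ ### Step 3: Self-reference detection
27
+
28
+ Following the rules in #[[file:sql2dt-self-reference-rules.md]]:
29
+ 1. Check whether the INSERT OVERWRITE target table appears in FROM/JOIN
30
+ 2. If the table is self-referencing, flag it, then add a comment and use an explicit schema in the later steps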
31
+
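One way to approximate the check in Step 3, assuming plain `schema.table` identifiers. `is_self_referencing` is a hypothetical helper; it compares names literally, so a target written as `dw.t` will not match a source written as just `t`.

```python
import re


def is_self_referencing(insert_sql: str) -> bool:
    """Return True if the INSERT OVERWRITE target also appears after FROM/JOIN."""
    target_match = re.search(
        r"INSERT\s+OVERWRITE\s+(?:TABLE\s+)?([\w.`]+)", insert_sql, re.IGNORECASE
    )
    if not target_match:
        return False
    target = target_match.group(1).replace("`", "").lower()

    # Collect every identifier that directly follows FROM or JOIN.
    sources = re.findall(r"\b(?:FROM|JOIN)\s+([\w.`]+)", insert_sql, re.IGNORECASE)
    return any(src.replace("`", "").lower() == target for src in sources)
```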
32
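+ ### Step 4: Core conversion
33
+
34
+ Following the rules in #[[file:sql2dt-conversion-rules.md]]:
35
+ 1. Parse the CREATE TABLE DDL (extract columns, partitions, properties, etc.)
36
+ 2. Parse the INSERT OVERWRITE (extract the query and the partition type)
37
+ 3. Assemble `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...`
38
+ 4. Inject static partition values into the SELECT (with smart quote handling)
39
+ 5. Merge the table-properties template (default `data_lifecycle=15`)
40
+ 6. Handle UNION ALL (inject into each branch independently)
41
+ 7. Date-function post-processing: convert every `DATE_SUB/DATE_ADD` to `sub_days`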
42
+
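As an illustration of item 4, the sketch below separates static from dynamic partition keys and appends the static values to a SELECT list. Both helper names are hypothetical, appending at the end of the list is an assumption about column order, and the parser breaks on values that contain commas or parentheses. The "smart quote handling" from the conversion rules is not reproduced: values are emitted exactly as written.

```python
import re


def split_partition_spec(insert_sql: str):
    """Split PARTITION (...) into static {key: value} pairs and dynamic keys."""
    static, dynamic = {}, []
    match = re.search(r"PARTITION\s*\(([^)]*)\)", insert_sql, re.IGNORECASE)
    if not match:
        return static, dynamic
    for item in match.group(1).split(","):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            key, value = item.split("=", 1)
            static[key.strip()] = value.strip()  # static partition: key=value
        else:
            dynamic.append(item)  # dynamic partition: bare key
    return static, dynamic


def inject_static_partitions(select_list: str, static: dict) -> str:
    """Append each static partition value to the SELECT list as `value AS key`."""
    extra = [f"{value} AS {key}" for key, value in static.items()]
    return ", ".join([select_list.strip()] + extra)
```

For UNION ALL queries, the same injection would be applied to each branch's SELECT list separately, as item 6 requires.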
43
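+ ### Step 5: Column validation
44
+
45
+ Following the rules in #[[file:sql2dt-column-validation-rules.md]]:
46
+ 1. Count the schema columns and the SELECT columns
47
+ 2. Verify that the two counts are equal
48
+ 3. Check for duplicate aliases and missing partition columns
49
+ 4. Check that all UNION ALL branches have the same column count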
50
+
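The count and duplicate-alias checks in Step 5 could look roughly like this. The helper names are hypothetical. The splitter tracks parenthesis depth only, so commas inside string literals are an unhandled case, and the alias is approximated as the last whitespace-separated token of each item.

```python
def split_top_level(expr_list: str) -> list[str]:
    """Split a SELECT list on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in expr_list:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def validate_columns(schema_columns: list[str], select_list: str) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    items = [item for item in split_top_level(select_list) if item]
    if len(items) != len(schema_columns):
        problems.append(
            f"column count mismatch: schema={len(schema_columns)} select={len(items)}"
        )
    # Approximate each alias as the last token ("expr AS alias" or a bare name).
    aliases = [item.split()[-1].lower() for item in items]
    duplicates = sorted({a for a in aliases if aliases.count(a) > 1})
    if duplicates:
        problems.append("duplicate aliases: " + ", ".join(duplicates))
    return problems
```

Running `validate_columns` once per UNION ALL branch against the same schema list also covers the branch-consistency check in item 4.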
51
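+ ### Step 6: Generate companion files
52
+
53
+ Following the rules in #[[file:sql2dt-refresh-rules.md]]:
54
+ 1. Extract all SESSION_CONFIGS variables from the DDL
55
+ 2. Generate the current-period refresh statement
56
+ 3. Generate the previous-period prev_refresh statement
57
+ 4. Generate the backfill statement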
58
+
59
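+ ### Step 7: Post-conversion improvement suggestions
60
+
61
+ After the DDL is generated, run the following checks on the conversion result and proactively offer improvement suggestions to the user:
62
+
63
+ **Check 1: Non-partitioned table + continuous-write risk**
64
+
65
+ Following the decision logic in #[[file:../best-practices/non-partitioned-merge-into-warning.md]]:
66
+ - The generated DT is a non-partitioned table (no `PARTITIONED BY` and no `SESSION_CONFIGS()`)
67
+ - And the SQL contains the deduplication pattern `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) WHERE rn = 1`
68
+
69
+ → When both conditions hold, use the warning template from that document to alert the user to the risk, and recommend switching to the MERGE INTO + Table Stream approach.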
70
+
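A rough heuristic for Check 1, assuming regex matching is good enough for a first pass. `needs_merge_into_warning` is a hypothetical name, and the authoritative decision logic lives in the referenced best-practices file.

```python
import re

# ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC)
DEDUP_WINDOW = re.compile(
    r"ROW_NUMBER\s*\(\s*\)\s*OVER\s*\(\s*PARTITION\s+BY\b.*?\bORDER\s+BY\b.*?\bDESC\b",
    re.IGNORECASE | re.DOTALL,
)
# A filter such as "WHERE rn = 1" (the column name is not fixed).
RANK_FILTER = re.compile(r"\bWHERE\s+[\w.]+\s*=\s*1\b", re.IGNORECASE)


def needs_merge_into_warning(dt_sql: str) -> bool:
    """True when the DT is non-partitioned and uses the ROW_NUMBER dedup pattern."""
    non_partitioned = (
        not re.search(r"\bPARTITIONED\s+BY\b", dt_sql, re.IGNORECASE)
        and "SESSION_CONFIGS()" not in dt_sql.upper()
    )
    dedup = bool(DEDUP_WINDOW.search(dt_sql)) and bool(RANK_FILTER.search(dt_sql))
    return non_partitioned and dedup
```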
71
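+ **Check 2: SQL performance optimization opportunities**
72
+
73
+ Following the rules in #[[file:../best-practices/performance-optimization.md]], scan the generated DT SQL:
74
+ - A `LEFT/RIGHT/FULL OUTER JOIN` is present → suggest switching to INNER JOIN if the business logic allows, to improve incremental efficiency
75
+ - A window function without `PARTITION BY` is present → suggest adding PARTITION BY; otherwise every incremental refresh recomputes the full data set
76
+ - `GROUP BY` uses a complex expression (such as `DATE_TRUNC` or `SUBSTR`) → suggest precomputing upstream or splitting into multi-level DTs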
77
+
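The three scans in Check 2 can be approximated with pattern matching, as in the sketch below. `performance_hints` is a hypothetical helper and the patterns are heuristics: the GROUP BY check, for example, flags any `DATE_TRUNC(` or `SUBSTR(` that appears after a GROUP BY in the same statement.

```python
import re


def performance_hints(dt_sql: str) -> list[str]:
    """Return a list of optimization hints for the generated DT SQL."""
    hints = []
    if re.search(r"\b(LEFT|RIGHT|FULL)\s+(OUTER\s+)?JOIN\b", dt_sql, re.IGNORECASE):
        hints.append("outer join: consider INNER JOIN if the business logic allows")
    # OVER ( ... ) whose first keyword is not PARTITION BY.
    if re.search(r"\bOVER\s*\((?!\s*PARTITION\s+BY\b)", dt_sql, re.IGNORECASE):
        hints.append("window without PARTITION BY: each refresh recomputes everything")
    if re.search(
        r"\bGROUP\s+BY\b[^;]*?\b(DATE_TRUNC|SUBSTR)\s*\(",
        dt_sql,
        re.IGNORECASE | re.DOTALL,
    ):
        hints.append("complex GROUP BY expression: consider precomputing upstream")
    return hints
```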
78
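+ **Check 3: Whether the JOIN involves a dimension table**
79
+
80
+ Following the recommended scenarios in #[[file:../best-practices/dimension-table-join-guide.md]]:
81
+ - The SQL contains a JOIN → ask the user whether the right-hand table is a rarely changing dimension table (code table, dictionary table, configuration table, etc.)
82
+ - If so → suggest adding the `mv_const_tables` setting to TBLPROPERTIES, and explain its behavior and the data-consistency trade-off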
83
+
84
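+ ## Output checklist
85
+
86
+ For each table, the final output is:
87
+
88
+ | File | Content | Condition |
89
+ |------|---------|-----------|
90
+ | `<table_name>.sql` | Dynamic Table DDL | Always generated |
91
+ | `<table_name>_refresh.sql` | REFRESH statement for the current period | Always generated |
92
+ | `<table_name>_prev_refresh.sql` | REFRESH statement for the previous period | Only when partition variables exist |
93
+ | `<table_name>_backfill.sql` | Backfill statement | Only when partition variables exist |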
94
+
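The table above maps directly onto a small function. `companion_files` is a hypothetical name used only to make the conditions explicit.

```python
def companion_files(table_name: str, has_partition_variables: bool) -> list[str]:
    """List the files to emit for one table, per the output table above."""
    # The DDL and the current-period refresh are always generated.
    files = [f"{table_name}.sql", f"{table_name}_refresh.sql"]
    # The other two are generated only when partition variables exist.
    if has_partition_variables:
        files += [f"{table_name}_prev_refresh.sql", f"{table_name}_backfill.sql"]
    return files
```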
95
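+ ## Quick decision path
96
+
97
+ ```
98
+ Input DDL + INSERT OVERWRITE
99
+
100
+ ├─ Placeholders present? → Step 2 placeholder replacement
101
+
102
+ ├─ Self-reference? → Step 3 special handling
103
+
104
+ ├─ Static partitions present? → Step 4 inject partition values into SELECT
105
+
106
+ ├─ UNION ALL present? → Step 4 inject into each branch independently
107
+
108
+ └─ Generate DDL → Step 5 validation → Step 6 generate companion files → Step 7 improvement suggestions
109
+ ```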
@@ -1,274 +1,312 @@
1
1
  ---
2
2
  name: cz-cli
3
- description: "CLI tool for ClickZetta Lakehouse. Use this skill when the user needs to run SQL queries, develop or schedule Studio tasks, check run status and logs, or manage tables and schemas."
4
- generated_cli_version: "0.0.0-dev+202605250308"
5
- generated_with: "cz-cli-skill-generator/1.0.0"
6
- command_inventory_source: "click-command-tree"
7
- command_count: 51
3
+ description: "Route ALL ClickZetta Lakehouse operations to cz-cli: SQL, Studio tasks, tables, pipelines, profiles. Use when user mentions ClickZetta, Lakehouse, cz-cli, or needs profile/connection configuration."
8
4
  ---
9
5
 
10
- # cz-cli Skill
6
+ # cz-cli — ClickZetta Lakehouse Subagent
11
7
 
12
- > [!IMPORTANT]
13
- > **Fallback**: If `cz-cli` command not present, check for a bundled binary at `scripts/<platform>/cz-cli` (e.g. `scripts/macos-arm64/cz-cli`, `scripts/linux-x86_64/cz-cli`, `scripts/windows-x86_64/cz-cli.exe`). If found, use its absolute path directly. Otherwise run `pip3 install cz-cli -U` then use `cz-cli` from PATH.
8
+ You have no direct Lakehouse access. Always delegate via cz-cli.
14
9
 
15
- ## AI Agent Behavior Rules
10
+ ## Capabilities
16
11
 
17
- > [!IMPORTANT]
18
- > Read these rules before invoking any command.
12
+ ### SQL & Data Operations
13
+ - Execute any SQL against Lakehouse: SELECT, DDL (CREATE/ALTER/DROP TABLE, SCHEMA, VIEW), DML (INSERT, UPDATE, DELETE, MERGE INTO)
14
+ - Run async jobs and fetch results
15
+ - Preview table data and row counts
19
16
 
20
- ### Rule 0: Initialize connection profile before first use
17
+ ### Table & Schema Management
18
+ - List, describe, create, and drop tables and schemas
19
+ - View table history, indexes, partitions, and statistics
20
+ - Add or update column/table comments
21
+ - Create Dynamic Tables for auto-incremental ETL (ODS→DWD→DWS pipelines)
22
+ - Create Materialized Views for pre-computed aggregations
23
+ - Create Table Streams to capture INSERT/UPDATE/DELETE changes for CDC UPSERT
21
24
 
22
- When any command returns error code `NO_PROFILE`, run `cz-cli setup --credential <base64>` if the credential string is available, or read `references/profile-setup.md` and follow the onboarding steps there. Always use the **AskUserQuestion tool** — never ask via plain text.
25
+ ### Studio Task Management
26
+ - Create, configure, deploy, and delete Studio tasks (SQL, Shell, Python, integration, flow)
27
+ - Save task content and cron schedule
28
+ - Deploy, undeploy, and execute tasks ad-hoc
29
+ - Monitor run instances: list, detail, wait, logs, stop, rerun, backfill
30
+ - View run statistics and dependencies
23
31
 
24
- ### Rule 1 — Clarify intent before any state-changing operation
32
+ ### Data Sync Pipelines
33
+ - Create single-table realtime CDC sync tasks (MySQL/PostgreSQL/SQL Server → Lakehouse, task_type=28)
34
+ - Create multi-table or whole-database CDC sync tasks — mirror, merge, or sharded-table consolidation (task_type=281)
35
+ - Create offline batch sync tasks with Cron scheduling — single-table (task_type=10) or multi-table (task_type=291)
36
+ - Manage sync task lifecycle: start, stop, offline, backfill, add tables, re-sync individual tables
25
37
 
26
- When a task involves an operation that changes system state (publishing a schedule, taking a task offline, triggering a run, deleting an object, etc.):
38
+ ### Data Ingestion Pipelines
39
+ - Create continuous OSS/S3/COS ingest PIPE (LIST_PURGE scan mode or EVENT_NOTIFICATION mode)
40
+ - Create continuous Kafka ingest PIPE using READ_KAFKA function
41
+ - One-shot file import from URL, local path, or Volume (COPY INTO)
42
+ - Manage PIPE lifecycle: pause, resume, adjust batch interval, view load history
27
43
 
28
- **Do NOT proceed without** explicitly understanding the user's intent on all three points:
29
- 1. **Target**: Which task / table / run instance?
30
- 2. **Purpose**: Why does this need to happen now?
31
- 3. **Side-effect acknowledgment**: Does the user understand the impact (historical re-runs, schedule interruption, permanent deletion, etc.)?
44
+ ### Data Recovery
45
+ - Query data at a historical point in time (Time Travel: TIMESTAMP AS OF)
46
+ - Roll back a table to a previous version (RESTORE TABLE)
47
+ - Recover accidentally dropped tables, dynamic tables, or materialized views (UNDROP TABLE)
48
+ - Inspect change history (DESC HISTORY) and deleted objects (SHOW TABLES HISTORY)
49
+ - Configure retention: data_retention_days, data_lifecycle
32
50
 
33
- Use the **AskUserQuestion tool** (not plain text) to confirm with the user, then wait for an explicit reply before executing.
51
+ ### Connection & Environment Management
52
+ - Create, update, delete, and switch connection profiles (multi-cloud, multi-region)
53
+ - Check connection status and switch workspace
54
+ - List and test external data sources, browse catalogs and objects
55
+ - Generate Python SDK / ZettaPark / SQLAlchemy / JDBC connection code from profile parameters
34
56
 
35
- Consult **Command Risk Reference** below to determine whether a command qualifies as state-changing.
57
+ ## Route to cz-cli
36
58
 
59
+ - Any mention of cz-cli, ClickZetta, or Lakehouse
60
+ - SQL against Lakehouse (SELECT, DDL, DML)
61
+ - Studio task development, scheduling, monitoring
62
+ - Data pipelines (sync, import, CDC, streaming)
63
+ - Profile and connection configuration
64
+ - Performance analysis and optimization
37
65
 
38
- ### Rule 2: Route to companion skill BEFORE acting (MUST)
66
+ ## Keep in host agent
39
67
 
40
- > [!IMPORTANT]
41
- > Check this routing table **before** executing any command or writing any code.
42
- > If the user intent matches a trigger below, **MUST** invoke the listed skill immediately and stop processing in cz-cli.
68
+ - General programming unrelated to ClickZetta
69
+ - Local file operations, git
70
+ - Non-ClickZetta databases (PostgreSQL, MySQL standalone)
43
71
 
44
- The following companion skills are bundled with `cz-cli`:
72
+ ## Execution strategy
45
73
 
46
- - **lakehouse-python-sdk**: Lakehouse Python/Shell task engineering: develop, rewrite, optimize, troubleshoot. Covers connector, BulkLoad, IGS, Studio params, datasource, and CREATE TABLE DDL; MUST trigger on: develop/write/create/modify/rewrite/optimize Python or Shell task; BulkLoad batch upload; IGS realtime ingest; connector query/write; Python task error/diagnosis; CREATE TABLE / partition / bucket / index DDL.
47
- - **lakehouse-doc** — ClickZetta Lakehouse official documentation. Covers SQL syntax, data types, functions, DDL/DML commands, dynamic tables, materialized views, access control, VCluster, data lake, AI functions, etc. When the user writes or asks about SQL syntax or Lakehouse dialect specifics, **MUST** consult lakehouse-doc skill references first to ensure accuracy.
74
+ **MANDATORY: Follow this order every time. No exceptions.**
48
75
 
49
- **If no row matches**: proceed with cz-cli commands directly.
76
+ **Step 1 - MUST check LLM first:**
50
77
 
51
- ### Rule 2.1 — MUST consult lakehouse-doc skill for SQL commands
78
+ ```bash
79
+ cz-cli agent llm show
80
+ ```
81
+
82
+ **Step 2 — If active LLM is configured (kind != "none"), you MUST use agent run:**
83
+
84
+ ```bash
85
+ cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions
86
+ ```
87
+
88
+ Do NOT use direct cz-cli subcommands when an active LLM is available. Always delegate the full request to `cz-cli agent run`.
89
+
90
+ The output includes a `session_id`. Use it to inspect the run:
91
+ - `cz-cli agent session status <session_id>` — current state (`busy`/`retry` with `progress`, `idle` with `result`, or `error`)
92
+ - `cz-cli agent export <session_id>` — full conversation (messages, tool calls, reasoning, text). Wait until `status` is `idle` before exporting; otherwise the conversation may still be in progress.
93
+
94
+ **Step 3 — ONLY if no active LLM (kind: "none" or empty list), fall back to direct commands:**
95
+
96
+ Decompose the request into concrete `cz-cli` subcommands (`sql`, `schema`, `table`, `task`, `runs`, `job`, `datasource`, `profile`, etc.), execute them, and synthesize the result.
97
+
98
+ Use direct commands for local setup and diagnostics even when agent path is available: `cz-cli profile ...`, `cz-cli -p <profile> status`, `cz-cli agent llm ...`, `cz-cli --help`.
99
+
100
+ With session continuity:
101
+
102
+ ```bash
103
+ cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions --session <session_id>
104
+ ```
105
+
106
+ Reuse `session_id` for follow-ups on the same topic. Omit `--session` to start fresh.
107
+
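When these commands are driven from a script, building the argument list in one place keeps the mandatory flags consistent. A minimal sketch, assuming the flags shown above; `build_agent_run_argv` is a hypothetical helper, not part of cz-cli.

```python
from typing import Optional


def build_agent_run_argv(request: str, session_id: Optional[str] = None) -> list[str]:
    """Build the argv for `cz-cli agent run`, optionally continuing a session."""
    argv = [
        "cz-cli", "agent", "run", request,
        "--format", "a2a",
        "--dangerously-skip-permissions",
    ]
    if session_id:
        # Reuse an existing session for follow-ups on the same topic.
        argv += ["--session", session_id]
    return argv
```

Passing the result as a list to `subprocess.run` avoids shell quoting problems when the request text contains quotes or spaces.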
108
+ ## Async mode (non-TTY / long-running tasks)
109
+
110
+ When running in non-TTY environments (e.g. as a subagent from Claude Code) or for long-running tasks, use async mode to avoid blocking:
111
+
112
+ ### Submit asynchronously
113
+
114
+ ```bash
115
+ cz-cli agent run "<request>" --async --format a2a --dangerously-skip-permissions
116
+ ```
117
+
118
+ Returns immediately with a session ID:
119
+ ```json
120
+ {"session_id": "01JXF3K...", "status": "running", "message": "Session submitted asynchronously"}
121
+ ```
122
+
123
+ Note: In non-TTY with `--format a2a` or `--format json`, async mode activates automatically (no `--async` flag needed).
124
+
125
+ ### Poll status
126
+
127
+ ```bash
128
+ cz-cli agent session status <session_id>
129
+ ```
130
+
131
+ While running:
132
+ ```json
133
+ {"session_id": "01JXF3K...", "status": "busy", "progress": "$ cz-cli table list -o table"}
134
+ ```
135
+
136
+ Other progress examples you may see during polling:
137
+ - `"💭 Thinking..."` — LLM is reasoning
138
+ - `"✏ Generating response..."` — LLM is writing the reply
139
+ - `"✱ Grep \"error\" · 3 matches"` — running a search tool
140
+ - `"↻ Retry (attempt 2)"` — retrying a failed LLM call (paired with a `retry` field describing the reason)
141
+
142
+ When complete:
143
+ ```json
144
+ {"session_id": "01JXF3K...", "status": "idle", "result": "Here are the results:\n..."}
145
+ ```
146
+
147
+ The `result` field is the final text reply. For full conversation details (thinking, tool calls, intermediate text), use `cz-cli agent export <session_id>`.
148
+
149
+ If the session does not exist:
150
+ ```json
151
+ {"session_id": "ses_invalid", "error": "Session not found"}
152
+ ```
153
+ (exits with code 1)
154
+
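The status payloads above fall into three cases, which a caller might classify as follows. This is a sketch based only on the sample payloads in this section; `interpret_status` is a hypothetical name, and the set of non-terminal states (`busy`, `retry`, `running`) is taken from the examples rather than from a full specification.

```python
import json


def interpret_status(raw: str) -> tuple:
    """Classify a `cz-cli agent session status` JSON payload."""
    payload = json.loads(raw)
    if "error" in payload:
        return ("error", payload["error"])
    status = payload.get("status")
    if status == "idle":
        # Finished: `result` holds the final text reply.
        return ("done", payload.get("result", ""))
    if status in ("busy", "retry", "running"):
        # Still working: `progress` describes the current activity.
        return ("running", payload.get("progress", ""))
    return ("unknown", status)
```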
155
+ ### Retrieve full conversation (thinking + tool calls + text)
156
+
157
+ ```bash
158
+ cz-cli agent export <session_id>
159
+ ```
160
+
161
+ Returns complete session with all message parts:
162
+ ```json
163
+ {
164
+ "info": { "id": "...", "title": "...", "time": {...} },
165
+ "messages": [
166
+ {
167
+ "info": { "role": "user" },
168
+ "parts": [{ "type": "text", "text": "original prompt" }]
169
+ },
170
+ {
171
+ "info": { "role": "assistant" },
172
+ "parts": [
173
+ { "type": "reasoning", "text": "thinking content..." },
174
+ { "type": "tool", "tool": "bash", "state": { "status": "completed", "input": {...}, "output": "..." } },
175
+ { "type": "text", "text": "final answer..." }
176
+ ]
177
+ }
178
+ ]
179
+ }
180
+ ```
181
+
182
+ Part types in export:
183
+ - `reasoning` — LLM thinking/reasoning blocks
184
+ - `tool` — tool calls with full input/output (bash, read, write, edit, glob, grep, etc.)
185
+ - `text` — final text response
186
+ - `step-start` / `step-finish` — step boundaries
187
+ - `patch` — code diffs
188
+ - `subtask` — delegated sub-tasks
189
+
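Given the export structure shown above, the final answer and the tool calls can be pulled out with a few lines of Python. The helper names are hypothetical, and the field names are taken from the sample payload only.

```python
def final_text(export: dict) -> str:
    """Return the last `text` part written by the assistant, or an empty string."""
    texts = [
        part.get("text", "")
        for message in export.get("messages", [])
        if message.get("info", {}).get("role") == "assistant"
        for part in message.get("parts", [])
        if part.get("type") == "text"
    ]
    return texts[-1] if texts else ""


def tool_calls(export: dict) -> list:
    """Return (tool name, status) pairs for every `tool` part in the export."""
    return [
        (part.get("tool"), part.get("state", {}).get("status"))
        for message in export.get("messages", [])
        for part in message.get("parts", [])
        if part.get("type") == "tool"
    ]
```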
190
+ ### Async workflow pattern
52
191
 
53
- When the user request involves any of the following, **MUST** consult **lakehouse-doc** skill references before answering or generating SQL:
192
+ ```bash
193
+ # 1. Submit
194
+ SESSION=$(cz-cli agent run "complex analysis" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
195
+
196
+ # 2. Poll until done, printing progress along the way
197
+ while true; do
198
+ STATUS=$(cz-cli agent session status $SESSION)
199
+ STATE=$(echo "$STATUS" | jq -r '.status')
200
+ if [ "$STATE" = "idle" ]; then
201
+ echo "$STATUS" | jq -r '.result'
202
+ break
203
+ fi
204
+ echo "$STATUS" | jq -r '.progress // empty'
205
+ sleep 5
206
+ done
207
+
208
+ # Need full conversation (thinking + tool calls)?
209
+ cz-cli agent export $SESSION
210
+ ```
54
211
 
55
- - Writing, modifying, or optimizing SQL statements (DDL / DML / DQL)
56
- - Asking about ClickZetta Lakehouse SQL dialect syntax, keywords, or function usage
57
- - Using data types, type casting, or datetime formats
58
- - Creating or altering tables, views, materialized views, dynamic tables, external tables
59
- - Data import/export (COPY INTO, PUT/GET, BulkLoad, Pipe)
60
- - Access control (GRANT / REVOKE), roles, permissions
61
- - VCluster configuration and management
62
- - Index creation and usage (inverted index, BloomFilter, vector index)
63
- - AI functions, vector search, semantic views
64
- - Information Schema system view queries
212
+ ### With session continuity (async)
65
213
 
66
- **Rationale**: ClickZetta Lakehouse SQL dialect differs from standard SQL and other databases. Relying on general knowledge may produce incorrect syntax. Consulting lakehouse-doc significantly improves answer accuracy and confidence.
214
+ ```bash
215
+ # First turn
216
+ SESSION=$(cz-cli agent run "describe sales table" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
217
+ # ... wait for completion ...
67
218
 
68
- ### Rule 3: Development ends at save; execution requires separate authorization
219
+ # Follow-up turn on same session
220
+ cz-cli agent run "now show row counts" --async --format a2a --dangerously-skip-permissions --session $SESSION
221
+ ```
69
222
 
70
- When the user says "develop", "write", "create", or "modify" a task, the work is **complete once the script is saved successfully**.
71
- After saving, tell the user: "The task has been saved as a draft. Scheduling is not yet active. Let me know explicitly if you want to publish or execute it."
223
+ ### Important notes for async mode
72
224
 
73
- **Do NOT** auto-publish, auto-trigger execution, or auto-submit a backfill after saving, even if those steps seem like the logical next thing. Every phase that produces a real side effect requires the user to express intent and grant authorization separately.
225
+ - **Permissions:** Always use `--dangerously-skip-permissions`; async mode cannot handle interactive permission prompts
226
+ - **Server requirement:** An agent runtime server must be running (or will be started automatically)
227
+ - **Error handling:** If session is already busy, returns `{"error": "session busy"}`
74
228
 
75
- **Exception**: If the user's original request *explicitly* authorized all subsequent steps (e.g. "create and immediately go live", "develop and run it now"), AND Rule 1 already confirmed that intent at the start, the Agent may proceed through the authorized steps without stopping again at each phase. Do not re-ask for confirmation that was already given.
229
+ ## Multi-environment (profiles)
76
230
 
77
- ### Rule 3.1: Backfill / re-run historical data (补数/回填/重跑历史数据) → `runs refill` (NOT `task`)
231
+ When the user specifies an environment or profile (e.g. "use uat_test", "on the test instance"):
78
232
 
79
- When the user says "补数", "回填", "重跑历史", "backfill", "re-run historical", or "re-process date range":
80
- - **MUST** use `cz-cli runs refill <task> --from YYYY-MM-DD --to YYYY-MM-DD [-y]`
81
- - This command is under `runs`, **NOT** under `task` — do not look for it in task subcommands
82
- - Requires the task to already be published (online); draft tasks cannot be backfilled
233
+ ```bash
234
+ cz-cli agent run "<request>" --profile uat_test --format a2a --dangerously-skip-permissions
235
+ ```
83
236
 
84
- ### Rule 4: Paginated results are not complete data
237
+ Available profiles: read `~/.clickzetta/profiles.toml` or run `cz-cli profile list`.
85
238
 
86
- All `list` commands return only page 1 by default (typically 10 items). The `ai_message` field in the response contains the total count and the command to fetch the next page — treat it as the authoritative next-step hint and follow it. Never treat a first-page result as the full dataset.
239
+ ## Adding a new profile
87
240
 
88
- **Pagination termination rules**:
89
- - If the user only needs "latest / recent N items": stop after page 1, do not paginate.
90
- - If the user needs the full dataset: paginate until `ai_message` no longer contains a next-page hint.
91
- - **Safety limit**: auto-paginate at most **3 pages** (≈ 30 items). If more exist, surface the next-page command to the user and ask whether to continue.
241
+ **Trigger conditions:** User says "configure new environment", "add profile", "can't connect", mentions an unknown profile name, or provides connection credentials.
92
242
 
93
- ### Rule 5: Always pass explicit task type when creating tasks
243
+ ### Step 1: Collect information (guided Q&A)
94
244
 
95
- When creating a task with `cz-cli task create`, **always** pass `--type` explicitly (`SQL` / `PYTHON` / `SHELL` / `SPARK` / `FLOW`).
96
- The Python SDK (connector/igs/bulkload) uses `from clickzetta import connect` rather than ZettaPark's session.
245
+ If all required fields are already provided, skip directly to Step 2.
97
246
 
98
- - **MUST NOT** rely on default task type.
99
- - If user intent says “Python task”, command must include `--type PYTHON`.
100
- - If `--type FLOW`, immediately switch to Rule 6 — use Flow-specific commands exclusively.
247
+ Otherwise, ask for missing ones. Accept all at once or prompt one by one.
101
248
 
102
- ### Rule 6 — Flow nodes use Flow-specific tools exclusively
249
+ **Required fields:**
103
250
 
104
- When the operation target is a Flow task or any of its child nodes (`task_type=500`, or user mentions "composite task / flow / workflow"):
251
+ | Field | Question to ask | Example |
252
+ |-------|----------------|---------|
253
+ | `service` | Which cloud region? (see table below, or provide the service endpoint directly) | `cn-shanghai-alicloud.api.clickzetta.com` |
254
+ | `instance` | What is the instance name? | `billingsh` |
255
+ | `workspace` | What is the workspace name? | `meter_n_bill` |
256
+ | `username` | What is the username? | `billing_admin` |
257
+ | `password` | What is the password? | — |
258
+ | `name` | What should this profile be named? (suggested format below) | `billingsh` |
105
259
 
106
- - **MUST** use Flow-specific commands: `task flow node-detail`, `task flow node-save`, `task flow node-save-config`, `task flow bind`, `task flow submit`, etc.
107
- - **MUST NOT** use `task save`, `task save-config`, `task detail`, or `task online` on Flow child nodes — these tools are for regular (non-Flow) tasks only and will produce incorrect results or errors.
108
- - **Decision rule**: if `task_type == 500` OR the user mentions flow/workflow context → use `task flow *` commands unconditionally.
260
+ **Common service endpoints (offer as options):**
109
261
 
110
- ### Rule 7: Always display studio_url in final report
262
+ | Region | service | Suggested profile prefix |
263
+ |--------|---------|--------------------------|
264
+ | Alibaba Cloud East China 2 (Shanghai) | `cn-shanghai-alicloud.api.clickzetta.com` | `cn-shanghai` |
265
+ | Tencent Cloud East China (Shanghai) | `ap-shanghai-tencentcloud.api.clickzetta.com` | `ap-shanghai` |
266
+ | Tencent Cloud North China (Beijing) | `ap-beijing-tencentcloud.api.clickzetta.com` | `ap-beijing` |
267
+ | Tencent Cloud South China (Guangzhou) | `ap-guangzhou-tencentcloud.api.clickzetta.com` | `ap-guangzhou` |
268
+ | AWS China (Beijing) | `cn-north-1-aws.api.clickzetta.com` | `cn-north-1` |
111
269
 
112
- Responses from `task`, `runs`, and `executions` commands may include a `studio_url` field. When present, surface it to the user at the end of the report so they can open the resource directly in Studio.
270
+ **Inference rules (reduce unnecessary questions):**
271
+ - If the user describes a cloud region in natural language (e.g. "Alibaba Cloud Shanghai", "Tencent Cloud Beijing", "阿里云上海", "腾讯云北京"), look up the service endpoint from the table above — do NOT ask the user to provide it again.
272
+ - If the user hasn't provided a profile name, suggest `<prefix>-<instance>` using the prefix from the table (e.g. `cn-shanghai-billingsh`). Confirm with the user or proceed if they don't object.
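The naming rule can be sketched as a lookup over the endpoint table above. `suggest_profile_name` is a hypothetical helper, and falling back to the bare instance name for an endpoint that is not in the table is an assumption.

```python
# Service endpoint -> suggested profile prefix, copied from the table above.
SERVICE_PREFIXES = {
    "cn-shanghai-alicloud.api.clickzetta.com": "cn-shanghai",
    "ap-shanghai-tencentcloud.api.clickzetta.com": "ap-shanghai",
    "ap-beijing-tencentcloud.api.clickzetta.com": "ap-beijing",
    "ap-guangzhou-tencentcloud.api.clickzetta.com": "ap-guangzhou",
    "cn-north-1-aws.api.clickzetta.com": "cn-north-1",
}


def suggest_profile_name(service: str, instance: str) -> str:
    """Suggest `<prefix>-<instance>`; use the instance alone for unknown endpoints."""
    prefix = SERVICE_PREFIXES.get(service)
    return f"{prefix}-{instance}" if prefix else instance
```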
113
273
 
114
- Display as a Markdown hyperlink: `[View in Studio](https://...)`. Show all studio_url values returned across all commands in the same response; do not deduplicate.
274
+ ### Step 2: Create profile
115
275
 
116
- ### Rule 8: Maximize execution efficiency: parallel and chained commands
276
+ Run `cz-cli profile create` with all collected fields:
117
277
 
118
- **Independent commands → parallel. Dependent commands → chain with `--field`.**
278
+ ```bash
279
+ cz-cli profile create <name> \
280
+ --username <username> \
281
+ --password <password> \
282
+ --instance <instance> \
283
+ --workspace <workspace> \
284
+ --service <service> \
285
+ --schema public \
286
+ --vcluster default
287
+ ```
288
+
289
+ ### Step 3 — Verify connection
290
+
291
+ After creating, run:
119
292
 
120
293
  ```bash
121
- # Async SQL: submit → poll → fetch result → preview first rows
122
- JOB=$(cz-cli sql "SELECT * FROM orders LIMIT 100" --field job_id) && cz-cli job status $JOB --format toon && cz-cli job result $JOB --format toon | head -12
294
+ cz-cli status --profile <name>
295
+ ```
296
+
297
+ A successful response looks like:
298
+ ```json
299
+ {"data": {"connected": true, "workspace": "...", "time_ms": ...}}
300
+ ```
301
+
302
+ If it fails, report the error and ask the user to double-check credentials or service endpoint.
303
+
304
+ ## Error handling
123
305
 
124
- # Parallel: two independent lookups at the same time
125
- cz-cli task content my_task --format toon & cz-cli runs list --task my_task --format toon & wait
306
+ All errors in non-TTY mode output JSON to stdout:
126
307
 
127
- # Chain: save then verify
128
- cz-cli task save-content my_task --file script.sql && cz-cli task content my_task --format toon
308
+ ```json
309
+ {"ok": false, "error": "NO_PROFILE", "next_steps": ["cz-cli setup --credential <base64>"]}
129
310
  ```
130
311
 
131
- Key tools:
132
- - `--field <name>` — extract one value as raw text: `cz-cli sql "..." --field job_id` → `2026042818122957849079780`
133
- - `--format toon` — line-per-field output, works with `grep`/`head`
134
- - `--format json` — single-line JSON, parse in code only (do NOT pipe to grep/head)
135
-
136
- **Shortcut**: Use `cz-cli sql --sync "..." > /tmp/res.json` to force synchronous execution (waits for results inline). Prefer `--sync` for simple/fast queries.
137
-
138
- ## Command Quick Reference
139
-
140
- ### Connection & Profile
141
- - `cz-cli profile create <name> --username U --password P --instance I --workspace W` — Create profile
142
- - `cz-cli profile create <name> --pat TOKEN --instance I --workspace W` — Create profile with PAT
143
- - `cz-cli profile list` — List all profiles
144
- - `cz-cli profile status` — Test connection, show workspace/schema
145
- - `cz-cli profile use <name>` — Set default profile
146
- - `cz-cli profile delete <name>` — Delete a profile
147
- - `cz-cli profile discover --studio-url URL --username U --password P` — Discover regions/instances
148
-
149
- ### SQL Execution
150
- - `cz-cli sql "SELECT * FROM t LIMIT 10"` — Execute SQL query
151
- - `cz-cli sql -f query.sql` — Execute SQL from file
152
- - `cz-cli sql --write "INSERT INTO t VALUES (...)"` — Write operation (requires --write)
153
- - `cz-cli sql --async "SELECT count(*) FROM big_table"` — Async execution, returns job_id
154
- - `cz-cli sql --sync "SELECT 1"` — Force synchronous execution
155
- - `cz-cli sql -e "SELECT * FROM t WHERE id = %(id)s" --variable id=123` — Variable substitution
156
-
157
- ### Job Management (async SQL results)
158
- - `cz-cli job status <job_id>` — Check job status
159
- - `cz-cli job result <job_id>` — Fetch result set (waits if running, limited to 100 rows)
160
- - `cz-cli job result --no-limit <job_id>` — Fetch full result set
161
-
162
- ### Schema & Table
163
- - `cz-cli schema list` / `cz-cli schema describe <name>` / `cz-cli schema create <name>` / `cz-cli schema drop <name>`
164
- - `cz-cli table list [--schema S] [--like 'pattern%']` / `cz-cli table describe <name>` / `cz-cli table preview <name> [--limit N]`
165
- - `cz-cli table stats <name>` / `cz-cli table history [name]` / `cz-cli table create "DDL"` / `cz-cli table drop <name>`
166
-
167
- ### Workspace
168
- - `cz-cli workspace current` — Show current workspace
169
- - `cz-cli workspace use <name> [--persist]` — Switch workspace
170
-
171
- ### Task Scheduling (Studio)
172
- - `cz-cli task list [--page N --page-size N]` — List tasks
173
- - `cz-cli task create <name> --type SQL|PYTHON|SHELL|SPARK|FLOW --folder F` — Create task
174
- - `cz-cli task content <name_or_id>` — Get task script, config and params (draft); response includes `params` array with `paramType=system` (built-in params / time expressions) or `paramType=manual` (constants)
175
- - `cz-cli task save-content <name_or_id> --file script.sql [--params '{"key":"val","dt":"bizdate","yd":"$[yyyy-MM-dd,-1d]"}']` — Save script and optionally set params; system params (bizdate, sys_biz_day, sys_biz_datetime, sys_plan_day, sys_plan_datetime, sys_plan_timestamp, sys_task_id, sys_task_name, sys_task_owner) and time expressions starting with `$[` are auto-detected as `paramType=system`
176
- - `cz-cli task save-config <name_or_id> --cron "0 0 8 * * ? *"` — Save schedule (sec min hour day month week year)
177
- - `cz-cli task online <name_or_id> -y` — Publish task
178
- - `cz-cli task offline <name_or_id> -y` — Take offline (IRREVERSIBLE)
179
- - `cz-cli task execute <name_or_id> [--param KEY=VAL ...]` — Ad-hoc execution; auto-loads saved `manual` params as defaults (system params like `bizdate` are NOT auto-injected in adhoc mode — pass them explicitly via `--param`); warns if unresolved `${placeholders}` remain after merge (SQL tasks will fail, Python/Shell silently keep literal strings)
180
- - `cz-cli task flow dag <flow>` / `task flow create-node` / `task flow bind` / `task flow submit` — Flow operations
181
-
182
- ### Runs & Attempts
183
- - `cz-cli runs list [--task T --status S --run-type SCHEDULE|REFILL --from D --to D]`
184
- - `cz-cli runs detail <run_id_or_task>` / `cz-cli runs logs <run_id_or_task>` / `cz-cli runs wait <id> --timeout N`
185
- - `cz-cli runs refill <task> --from D --to D [-y]` — **补数/回填**: re-run scheduled instances for a historical date range (task must be online). `D` accepts `YYYY-MM-DD` (day boundary: `--from` = start of day, `--to` = 23:59:59) or `YYYY-MM-DDTHH:MM:SS` for exact datetime — **use ISO datetime for hourly/minutely tasks** to avoid missing instances
186
- - `cz-cli attempts list <run_id_or_task>` / `cz-cli attempts log <run_id_or_task> [--attempt-id N]`
187
-
188
- ### Agent (AI Agent)
189
- - `cz-cli agent run "<prompt>" [--session ID]` — Run AI agent with a natural-language prompt
190
- - `cz-cli agent llm` — Manage LLM providers
191
-
192
- ## Command Inventory (Generated)
193
-
194
- ### `ai-guide`
195
- - `ai-guide` - Generate AI-friendly command reference
196
-
197
- ### `attempts`
198
- - `attempts` - Manage attempt records
199
- - `attempts list` - List attempts for a run
200
- - `attempts log` - Get attempt log
201
-
202
- ### `job`
203
- - `job` - Job performance tools
204
- - `job result` - Fetch result set of a SQL job
205
- - `job status` - Check status/summary of a SQL job
206
-
207
- ### `profile`
208
- - `profile` - Manage connection profiles
209
- - `profile create` - Create a profile
210
- - `profile delete` - Delete a profile
211
- - `profile detail` - Show profile details
212
- - `profile list` - List profiles
213
- - `profile update` - Update a profile
214
- - `profile use` - Set default profile
215
-
216
- ### `runs`
217
- - `runs` - Manage task run instances
218
- - `runs deps` - View run dependencies
219
- - `runs detail` - Get run detail
220
- - `runs list` - List run instances
221
- - `runs logs` - Get run logs
222
- - `runs refill` - Submit backfill job
223
- - `runs rerun` - Rerun a failed instance
224
- - `runs stats` - Get run statistics summary
225
- - `runs stop` - Stop a running instance
226
- - `runs wait` - Poll until run completes
227
-
228
- ### `schema`
229
- - `schema` - Manage schemas
230
- - `schema create` - Create a schema
231
- - `schema describe` - Describe a schema
232
- - `schema drop` - Drop a schema
233
- - `schema list` - List schemas
234
-
235
- ### `sql`
236
- - `sql` - Execute SQL against ClickZetta
237
- - `cz-cli sql "SELECT 1"` — Run a simple query
238
- - `cz-cli sql -e "SELECT * FROM t LIMIT 10" --sync` — Synchronous query
239
- - `cz-cli sql -f query.sql --write` — Execute write SQL from file
240
-
241
- ### `table`
242
- - `table` - Manage tables
243
- - `table create` - Create a table from DDL
244
- - `table describe` - Describe a table
245
- - `table drop` - Drop a table
246
- - `table history` - Show table history
247
- - `table list` - List tables
248
- - `table preview` - Preview table data
249
- - `table stats` - Get table row count
250
-
251
- ### `task`
252
- - `task` - Manage Studio tasks
253
- - `task content` - Get task content
254
- - `task create` - Create a new task
255
- - `task deps` - Show task dependencies
256
- - `task execute` - Execute a task ad-hoc
257
- - `task list` - List tasks
258
- - `task offline` - Take a task offline
259
- - `task online` - Publish a task
260
- - `task save-config` - Save task schedule config
261
- - `task save-content` - Save task script
262
-
263
- ### `workspace`
264
- - `workspace` - Manage workspaces
265
- - `workspace current` - Show current workspace
266
- - `workspace list` - List workspaces
267
-
268
- ## Command Risk Reference
269
-
270
- | Risk Level | Commands | Key Concern |
271
- |-----------------|--------------------------------------------------|-----------------------------------------------------------------|
272
- | 🔴 Irreversible | `task offline`, `schema drop`, `table drop` | Cannot be undone; clears history or deletes objects permanently |
273
- | 🟠 High Impact | `task online`, `runs refill`, `task flow submit` | Affects live schedule or re-runs historical data |
274
- | 🟢 Safe | All others | No side effects |
312
+ On `NO_PROFILE` error: check if a profile can be configured via username/password (see "Adding a new profile" above). If the user has a base64 credential instead, guide them to run `cz-cli setup --credential <base64>`. See `references/profile-setup.md`.
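A caller can branch on the error payload shown above. The sketch below assumes the `ok`, `error`, and `next_steps` fields from that sample; `parse_cli_error` is a hypothetical name, and treating a payload without an `ok` field as a success is an assumption.

```python
import json


def parse_cli_error(raw: str) -> tuple:
    """Return (error_code, next_steps); error_code is None when there is no error."""
    payload = json.loads(raw)
    if payload.get("ok", True):
        # No explicit failure flag: treat as success (assumption).
        return (None, [])
    return (payload.get("error"), list(payload.get("next_steps", [])))
```

On `NO_PROFILE`, the returned `next_steps` list carries the suggested `cz-cli setup` command, which can be shown to the user as-is.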
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@clickzetta/cz-cli-darwin-arm64",
3
- "version": "0.3.74",
3
+ "version": "0.3.76",
4
4
  "description": "cz-cli binary for macOS ARM64 (Apple Silicon)",
5
5
  "os": [
6
6
  "darwin"