@clickzetta/cz-cli-darwin-arm64 0.3.75 → 0.3.76
- package/bin/cz-cli +0 -0
- package/bin/skills/clickzetta-dynamic-table/SKILL.md +188 -126
- package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +7 -11
- package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +135 -0
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +8 -8
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +2 -4
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +2 -10
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +27 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +118 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +225 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +182 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +98 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +76 -0
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +109 -0
- package/bin/skills/cz-cli/SKILL.md +125 -0
- package/package.json +1 -1
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# SQL → Dynamic Table: Complete Conversion Workflow
|
|
2
|
+
|
|
3
|
+
When the user gives you a set of CREATE TABLE DDL and INSERT OVERWRITE SQL statements and asks you to convert them to a Dynamic Table, perform the following steps in order.
|
|
4
|
+
|
|
5
|
+
The detailed rules for each step live in the corresponding skill file; consult those files alongside this workflow.
|
|
6
|
+
|
|
7
|
+
## Workflow steps
|
|
8
|
+
|
|
9
|
+
### Step 1: Preprocess the input
|
|
10
|
+
|
|
11
|
+
Remove the following from the INSERT OVERWRITE file:
|
|
12
|
+
- All `ALTER TABLE` statements
|
|
13
|
+
- `ANALYZE TABLE` statements
|
|
14
|
+
- SQL comments (`--` and `/* */`)
|
|
15
|
+
|
|
16
|
+
Keep: CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION.
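
The removals above can be sketched as a small shell filter. This is a minimal sketch rather than the skill's actual implementation: the function name `preprocess_sql` is hypothetical, and it assumes that `--`, `/*`, and `;` never occur inside string literals.

```shell
# Hypothetical helper: strip SQL comments, then drop ALTER TABLE and
# ANALYZE TABLE statements. Reads SQL on stdin, writes cleaned SQL to stdout.
# Assumption: "--", "/*" and ";" never appear inside string literals.
preprocess_sql() {
  awk '
    { buf = buf $0 "\n" }
    END {
      # 1) Remove /* ... */ block comments, which may span several lines.
      while ((s = index(buf, "/*")) > 0) {
        rest = substr(buf, s + 2)
        e = index(rest, "*/")
        if (e == 0) { buf = substr(buf, 1, s - 1); break }
        buf = substr(buf, 1, s - 1) substr(rest, e + 2)
      }
      # 2) Remove -- line comments.
      n = split(buf, lines, "\n")
      out = ""
      for (i = 1; i <= n; i++) {
        line = lines[i]
        p = index(line, "--")
        if (p > 0) line = substr(line, 1, p - 1)
        out = out line "\n"
      }
      # 3) Split into statements and keep everything except ALTER/ANALYZE TABLE.
      m = split(out, stmts, ";")
      for (i = 1; i <= m; i++) {
        stmt = stmts[i]
        gsub(/^[ \t\n]+|[ \t\n]+$/, "", stmt)
        if (stmt == "") continue
        if (toupper(stmt) ~ /^(ALTER|ANALYZE)[ \t\n]+TABLE/) continue
        print stmt ";"
      }
    }
  '
}
```

A typical call would be `preprocess_sql < job.sql > job.clean.sql`, where both file names are placeholders.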
|
|
17
|
+
|
|
18
|
+
### Step 2: Placeholder substitution
|
|
19
|
+
|
|
20
|
+
Follow the rules in #[[file:sql2dt-placeholder-rules.md]]:
|
|
21
|
+
1. Normalize the placeholder format (`{{ }}` → `${ }`)
|
|
22
|
+
2. Replace every placeholder with a `SESSION_CONFIGS()` call
|
|
23
|
+
3. Handle nodash variables, date arithmetic, and macros functions
|
|
24
|
+
4. Choose the handling based on the quoting context (strip quotes / CONCAT / direct substitution)
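
Sub-step 1 of the list above (format normalization) can be illustrated with a one-line `sed` filter. This is a sketch only: the function name `normalize_placeholders` is hypothetical, the variable names `ds` and `ds_nodash` in the usage note are illustrative, and dropping the padding inside the braces is an assumption. The later sub-steps depend on the rules file and are not reproduced here.

```shell
# Hypothetical helper: rewrite {{ name }} placeholders as ${name}.
# Assumption: whitespace padding inside the braces is not significant.
normalize_placeholders() {
  sed -E 's/\{\{[[:space:]]*([^}]*[^}[:space:]])[[:space:]]*\}\}/${\1}/g'
}
```

For example, piping `WHERE dt = '{{ ds }}'` through `normalize_placeholders` yields `WHERE dt = '${ds}'`.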
|
|
25
|
+
|
|
26
|
+
### Step 3: Self-reference detection
|
|
27
|
+
|
|
28
|
+
Follow the rules in #[[file:sql2dt-self-reference-rules.md]]:
|
|
29
|
+
1. Check whether the INSERT OVERWRITE target table appears in FROM/JOIN
|
|
30
|
+
2. If the table is self-referencing, flag it; in later steps, add a comment and use an explicit schema
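
The check in item 1 can be approximated with `grep`. The sketch below is a coarse text match, not a SQL parser: the function name `is_self_reference` is hypothetical, and it assumes unquoted table names made of letters, digits, `_`, and `.`. A CTE that happens to share the target's name would be reported as a false positive.

```shell
# Hypothetical helper: succeed (exit 0) when the INSERT OVERWRITE target
# also appears right after a FROM or JOIN keyword.
# Assumption: table names are unquoted and use only [A-Za-z0-9_.].
is_self_reference() {
  # Flatten newlines so the keyword and the table name may sit on different lines.
  flat=$(printf '%s\n' "$1" | tr '\n' ' ')
  # Target = last word of the first "INSERT OVERWRITE [TABLE] name" match.
  target=$(printf '%s\n' "$flat" \
    | grep -Eio 'INSERT[[:space:]]+OVERWRITE[[:space:]]+(TABLE[[:space:]]+)?[A-Za-z0-9_.]+' \
    | head -n 1 | awk '{ print $NF }')
  [ -n "$target" ] || return 1
  # Escape dots so a qualified name such as db.t is matched literally.
  target_re=$(printf '%s' "$target" | sed 's/\./\\./g')
  printf '%s\n' "$flat" \
    | grep -Eiq "(FROM|JOIN)[[:space:]]+${target_re}([^A-Za-z0-9_.]|\$)"
}
```

A caller could branch on the exit status, for example `if is_self_reference "$sql"; then echo "self-referencing"; fi`.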
|
|
31
|
+
|
|
32
|
+
### Step 4: Core conversion
|
|
33
|
+
|
|
34
|
+
Follow the rules in #[[file:sql2dt-conversion-rules.md]]:
|
|
35
|
+
1. Parse the CREATE TABLE DDL (extract columns, partitions, properties, etc.)
|
|
36
|
+
2. Parse the INSERT OVERWRITE (extract the query and the partition type)
|
|
37
|
+
3. Assemble `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...`
|
|
38
|
+
4. Inject static partition values into the SELECT (smart quote handling)
|
|
39
|
+
5. Merge the table-properties template (default `data_lifecycle=15`)
|
|
40
|
+
6. Handle UNION ALL (inject into each branch independently)
|
|
41
|
+
7. Date-function post-processing: convert every `DATE_SUB/DATE_ADD` to `sub_days`
|
|
42
|
+
|
|
43
|
+
### Step 5: Column validation
|
|
44
|
+
|
|
45
|
+
Follow the rules in #[[file:sql2dt-column-validation-rules.md]]:
|
|
46
|
+
1. Count the schema columns and the SELECT columns
|
|
47
|
+
2. Verify that the two counts are equal
|
|
48
|
+
3. Check for duplicate aliases and missing partition columns
|
|
49
|
+
4. Check that all UNION ALL branches have the same column count
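
Counting columns reliably means ignoring commas nested inside function calls. The sketch below counts only top-level commas in a list that has already been isolated (the text between `SELECT` and `FROM`, or the body of the column definition list). The function name `count_top_level_items` is hypothetical, and the sketch assumes no commas or parentheses inside string literals.

```shell
# Hypothetical helper: count comma-separated items at parenthesis depth 0.
# Assumption: the argument is a non-empty, already isolated column list
# with no commas or parentheses inside string literals.
count_top_level_items() {
  printf '%s\n' "$1" | awk '
    { s = s $0 " " }
    END {
      depth = 0; n = 1
      for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1)
        if (c == "(") depth++
        else if (c == ")") depth--
        else if (c == "," && depth == 0) n++
      }
      print n
    }'
}
```

Running it on the schema list and on the select list, then comparing the two numbers, covers items 1 and 2; running it on each UNION ALL branch covers item 4.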
|
|
50
|
+
|
|
51
|
+
### Step 6: Generate companion files
|
|
52
|
+
|
|
53
|
+
Follow the rules in #[[file:sql2dt-refresh-rules.md]]:
|
|
54
|
+
1. Extract all SESSION_CONFIGS variables from the DDL
|
|
55
|
+
2. Generate the current-period refresh statement
|
|
56
|
+
3. Generate the previous-period prev_refresh statement
|
|
57
|
+
4. Generate the backfill statement
|
|
58
|
+
|
|
59
|
+
### Step 7: Post-conversion improvement suggestions
|
|
60
|
+
|
|
61
|
+
After the DDL is generated, run the following checks on the result and proactively suggest improvements to the user:
|
|
62
|
+
|
|
63
|
+
**Check 1: Non-partitioned table + continuous-write risk**
|
|
64
|
+
|
|
65
|
+
Apply the decision logic in #[[file:../best-practices/non-partitioned-merge-into-warning.md]]:
|
|
66
|
+
- The generated DT is a non-partitioned table (no `PARTITIONED BY` and no `SESSION_CONFIGS()`)
|
|
67
|
+
- And the SQL contains the `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) WHERE rn = 1` deduplication pattern
|
|
68
|
+
|
|
69
|
+
→ When both conditions hold, warn the user using the alert template from that document and recommend switching to the MERGE INTO + Table Stream approach.
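
The two conditions of Check 1 can be pre-screened with `grep` before deciding whether to show the warning. This is a coarse sketch: the function name `needs_merge_into_warning` is hypothetical, and it only looks for `ROW_NUMBER() OVER` without verifying the `rn = 1` filter, so a match still deserves a manual look.

```shell
# Hypothetical helper: succeed (exit 0) when the generated DT SQL looks like
# a non-partitioned table that uses the ROW_NUMBER() deduplication pattern.
# Assumption: a plain text match is good enough as a first filter.
needs_merge_into_warning() {
  flat=$(printf '%s\n' "$1" | tr '\n' ' ')
  # Partitioned table or SESSION_CONFIGS() present: the check does not apply.
  if printf '%s\n' "$flat" \
    | grep -Eiq 'PARTITIONED[[:space:]]+BY|SESSION_CONFIGS[[:space:]]*\('; then
    return 1
  fi
  # Window-style PARTITION BY does not match PARTITIONED BY above.
  printf '%s\n' "$flat" \
    | grep -Eiq 'ROW_NUMBER[[:space:]]*\([[:space:]]*\)[[:space:]]+OVER'
}
```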
|
|
70
|
+
|
|
71
|
+
**Check 2: SQL performance optimization opportunities**
|
|
72
|
+
|
|
73
|
+
Scan the generated DT SQL against the rules in #[[file:../best-practices/performance-optimization.md]]:
|
|
74
|
+
- `LEFT/RIGHT/FULL OUTER JOIN` present → point out that, if the business logic allows it, switching to INNER JOIN can improve incremental efficiency
|
|
75
|
+
- Window function without `PARTITION BY` → suggest adding PARTITION BY; otherwise every incremental refresh recomputes the full result
|
|
76
|
+
- `GROUP BY` uses a complex expression (e.g. `DATE_TRUNC`, `SUBSTR`) → suggest precomputing it upstream or splitting the query into multi-level DTs
|
|
77
|
+
|
|
78
|
+
**Check 3: Dimension tables in JOINs**
|
|
79
|
+
|
|
80
|
+
Based on the recommended scenarios in #[[file:../best-practices/dimension-table-join-guide.md]]:
|
|
81
|
+
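- The SQL contains a JOIN → ask the user whether the right-hand table is a rarely changing dimension table (code table, dictionary table, configuration table, etc.)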
|
|
82
|
+
- If so → recommend adding the `mv_const_tables` setting to TBLPROPERTIES, and explain its behavior and the data-consistency trade-off
|
|
83
|
+
|
|
84
|
+
## Output checklist
|
|
85
|
+
|
|
86
|
+
For each table, the final output is:
|
|
87
|
+
|
|
88
|
+
| File | Content | Condition |
|
|
89
|
+
|------|------|------|
|
|
90
|
+
| `<table_name>.sql` | Dynamic Table DDL | Always generated |
|
|
91
|
+
| `<table_name>_refresh.sql` | REFRESH statement for the current period | Always generated |
|
|
92
|
+
| `<table_name>_prev_refresh.sql` | REFRESH statement for the previous period | Only when partition variables exist |
|
|
93
|
+
| `<table_name>_backfill.sql` | Backfill statement | Only when partition variables exist |
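
The conditions in the table above translate directly into a file list per table. A minimal sketch, with a hypothetical function name `list_output_files` and a `yes`/`no` flag standing in for "partition variables exist":

```shell
# Hypothetical helper: print the companion files expected for one table.
#   $1 = table name, $2 = "yes" when the DDL uses partition variables.
list_output_files() {
  table=$1
  has_partition_vars=$2
  echo "${table}.sql"
  echo "${table}_refresh.sql"
  if [ "$has_partition_vars" = "yes" ]; then
    echo "${table}_prev_refresh.sql"
    echo "${table}_backfill.sql"
  fi
}
```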
|
|
94
|
+
|
|
95
|
+
## Quick decision path
|
|
96
|
+
|
|
97
|
+
```
|
|
98
|
+
Input DDL + INSERT OVERWRITE
|
|
99
|
+
│
|
|
100
|
+
├─ Placeholders present? → Step 2: placeholder substitution
|
|
101
|
+
│
|
|
102
|
+
├─ Self-referencing? → Step 3: special handling
|
|
103
|
+
│
|
|
104
|
+
├─ Static partitions present? → Step 4: inject partition values into the SELECT
|
|
105
|
+
│
|
|
106
|
+
├─ UNION ALL present? → Step 4: inject into each branch independently
|
|
107
|
+
│
|
|
108
|
+
└─ Generate DDL → Step 5: validate → Step 6: generate companion files → Step 7: improvement suggestions
|
|
109
|
+
```
|
|
@@ -87,6 +87,10 @@ cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions
|
|
|
87
87
|
|
|
88
88
|
Do NOT use direct cz-cli subcommands when an active LLM is available. Always delegate the full request to `cz-cli agent run`.
|
|
89
89
|
|
|
90
|
+
The output includes a `session_id`. Use it to inspect the run:
|
|
91
|
+
- `cz-cli agent session status <session_id>` — current state (`busy`/`retry` with `progress`, `idle` with `result`, or `error`)
|
|
92
|
+
- `cz-cli agent export <session_id>` — full conversation (messages, tool calls, reasoning, text). Wait until `status` is `idle` before exporting; otherwise the conversation may still be in progress.
|
|
93
|
+
|
|
90
94
|
**Step 3 — ONLY if no active LLM (kind: "none" or empty list), fall back to direct commands:**
|
|
91
95
|
|
|
92
96
|
Decompose the request into concrete `cz-cli` subcommands (`sql`, `schema`, `table`, `task`, `runs`, `job`, `datasource`, `profile`, etc.), execute them, and synthesize the result.
|
|
@@ -101,6 +105,127 @@ cz-cli agent run "<request>" --format a2a --dangerously-skip-permissions --sessi
|
|
|
101
105
|
|
|
102
106
|
Reuse `session_id` for follow-ups on the same topic. Omit `--session` to start fresh.
|
|
103
107
|
|
|
108
|
+
## Async mode (non-TTY / long-running tasks)
|
|
109
|
+
|
|
110
|
+
When running in non-TTY environments (e.g. as a subagent from Claude Code) or for long-running tasks, use async mode to avoid blocking:
|
|
111
|
+
|
|
112
|
+
### Submit asynchronously
|
|
113
|
+
|
|
114
|
+
```bash
|
|
115
|
+
cz-cli agent run "<request>" --async --format a2a --dangerously-skip-permissions
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
Returns immediately with a session ID:
|
|
119
|
+
```json
|
|
120
|
+
{"session_id": "01JXF3K...", "status": "running", "message": "Session submitted asynchronously"}
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
Note: In a non-TTY environment with `--format a2a` or `--format json`, async mode activates automatically (no `--async` flag needed).
|
|
124
|
+
|
|
125
|
+
### Poll status
|
|
126
|
+
|
|
127
|
+
```bash
|
|
128
|
+
cz-cli agent session status <session_id>
|
|
129
|
+
```
|
|
130
|
+
|
|
131
|
+
While running:
|
|
132
|
+
```json
|
|
133
|
+
{"session_id": "01JXF3K...", "status": "busy", "progress": "$ cz-cli table list -o table"}
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
Other progress examples you may see during polling:
|
|
137
|
+
- `"💭 Thinking..."` — LLM is reasoning
|
|
138
|
+
- `"✏ Generating response..."` — LLM is writing the reply
|
|
139
|
+
- `"✱ Grep \"error\" · 3 matches"` — running a search tool
|
|
140
|
+
- `"↻ Retry (attempt 2)"` — retrying a failed LLM call (paired with a `retry` field describing the reason)
|
|
141
|
+
|
|
142
|
+
When complete:
|
|
143
|
+
```json
|
|
144
|
+
{"session_id": "01JXF3K...", "status": "idle", "result": "Here are the results:\n..."}
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
The `result` field is the final text reply. For full conversation details (thinking, tool calls, intermediate text), use `cz-cli agent export <session_id>`.
|
|
148
|
+
|
|
149
|
+
If the session does not exist:
|
|
150
|
+
```json
|
|
151
|
+
{"session_id": "ses_invalid", "error": "Session not found"}
|
|
152
|
+
```
|
|
153
|
+
(exits with code 1)
|
|
154
|
+
|
|
155
|
+
### Retrieve full conversation (thinking + tool calls + text)
|
|
156
|
+
|
|
157
|
+
```bash
|
|
158
|
+
cz-cli agent export <session_id>
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
Returns the complete session with all message parts:
|
|
162
|
+
```json
|
|
163
|
+
{
|
|
164
|
+
"info": { "id": "...", "title": "...", "time": {...} },
|
|
165
|
+
"messages": [
|
|
166
|
+
{
|
|
167
|
+
"info": { "role": "user" },
|
|
168
|
+
"parts": [{ "type": "text", "text": "original prompt" }]
|
|
169
|
+
},
|
|
170
|
+
{
|
|
171
|
+
"info": { "role": "assistant" },
|
|
172
|
+
"parts": [
|
|
173
|
+
{ "type": "reasoning", "text": "thinking content..." },
|
|
174
|
+
{ "type": "tool", "tool": "bash", "state": { "status": "completed", "input": {...}, "output": "..." } },
|
|
175
|
+
{ "type": "text", "text": "final answer..." }
|
|
176
|
+
]
|
|
177
|
+
}
|
|
178
|
+
]
|
|
179
|
+
}
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
Part types in export:
|
|
183
|
+
- `reasoning` — LLM thinking/reasoning blocks
|
|
184
|
+
- `tool` — tool calls with full input/output (bash, read, write, edit, glob, grep, etc.)
|
|
185
|
+
- `text` — final text response
|
|
186
|
+
- `step-start` / `step-finish` — step boundaries
|
|
187
|
+
- `patch` — code diffs
|
|
188
|
+
- `subtask` — delegated sub-tasks
|
|
189
|
+
|
|
190
|
+
### Async workflow pattern
|
|
191
|
+
|
|
192
|
+
```bash
|
|
193
|
+
# 1. Submit
|
|
194
|
+
SESSION=$(cz-cli agent run "complex analysis" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
|
|
195
|
+
|
|
196
|
+
# 2. Poll until done, printing progress along the way
|
|
197
|
+
while true; do
|
|
198
|
+
STATUS=$(cz-cli agent session status "$SESSION")
|
|
199
|
+
STATE=$(echo "$STATUS" | jq -r '.status')
|
|
200
|
+
if [ "$STATE" = "idle" ]; then
|
|
201
|
+
echo "$STATUS" | jq -r '.result'
|
|
202
|
+
break
|
|
203
|
+
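fi
# Stop on an error state or an unknown session (no status field) instead of polling forever
if [ "$STATE" = "error" ] || [ "$STATE" = "null" ]; then
echo "$STATUS" >&2
break
fi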
|
|
204
|
+
echo "$STATUS" | jq -r '.progress // empty'
|
|
205
|
+
sleep 5
|
|
206
|
+
done
|
|
207
|
+
|
|
208
|
+
# Need full conversation (thinking + tool calls)?
|
|
209
|
+
cz-cli agent export "$SESSION"
|
|
210
|
+
```
|
|
211
|
+
|
|
212
|
+
### With session continuity (async)
|
|
213
|
+
|
|
214
|
+
```bash
|
|
215
|
+
# First turn
|
|
216
|
+
SESSION=$(cz-cli agent run "describe sales table" --async --format a2a --dangerously-skip-permissions | jq -r '.session_id')
|
|
217
|
+
# ... wait for completion ...
|
|
218
|
+
|
|
219
|
+
# Follow-up turn on same session
|
|
220
|
+
cz-cli agent run "now show row counts" --async --format a2a --dangerously-skip-permissions --session "$SESSION"
|
|
221
|
+
```
|
|
222
|
+
|
|
223
|
+
### Important notes for async mode
|
|
224
|
+
|
|
225
|
+
- **Permissions:** Always use `--dangerously-skip-permissions` — async mode cannot handle interactive permission prompts
|
|
226
|
+
- **Server requirement:** An agent runtime server must be running (or will be started automatically)
|
|
227
|
+
- **Error handling:** If the session is already busy, the command returns `{"error": "session busy"}`
|
|
228
|
+
|
|
104
229
|
## Multi-environment (profiles)
|
|
105
230
|
|
|
106
231
|
When the user specifies an environment or profile (e.g. "use uat_test", "on the test instance"):
|