@clickzetta/cz-cli-darwin-x64 1.0.11 → 1.0.13
package/bin/cz-cli CHANGED (binary file)
|
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: clickzetta-cmt-api
|
|
3
|
+
description: Use when working in this repository to discover CMT sources, plan migrations, start runs, monitor progress, verify migration outcomes, inspect run logs, or clean up failed runs through the official CMT v2 surface.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# CMT v2 API
|
|
7
|
+
|
|
8
|
+
This repository exposes a mixed OpenAPI surface. The live `/doc` page currently renders the full `/v3/api-docs`, which still includes many legacy `/api/*` routes alongside the new `/api/v2/*` routes.
|
|
9
|
+
|
|
10
|
+
The skill exists to stop agents from getting lost in that mixed surface.
|
|
11
|
+
|
|
12
|
+
## Core Rule
|
|
13
|
+
|
|
14
|
+
Use `/api/v2` first.
|
|
15
|
+
|
|
16
|
+
Do not call legacy `/api/sources`, `/api/tables`, `/api/jobs`, or task-log polling routes in agent workflows. These routes are deprecated for agent use because each one has a v2 replacement:
|
|
17
|
+
|
|
18
|
+
- `/api/sources` -> `/api/v2/catalog/sources`
|
|
19
|
+
- `/api/tables/{tableId}` -> `/api/v2/catalog/tables/{tableId}`
|
|
20
|
+
- `/api/jobs/check-conflict` -> `/api/v2/migrations/plans`
|
|
21
|
+
- `/api/jobs` -> `/api/v2/migrations/runs`
|
|
22
|
+
- `/api/jobs/{jobId}` -> `/api/v2/runs/{runId}`
|
|
23
|
+
- `/api/tasks/{taskId}` -> `/api/v2/runs/{runId}/events` or `/api/v2/logs/runs/{runId}`
|
|
24
|
+
|
|
25
|
+
Only fall back to legacy `/api/*` when a v2 response explicitly requires fallback or the user explicitly asks for a legacy-only capability.
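
The mapping above can be kept as a lookup table so that a legacy route template is rewritten before any call is made. This is a minimal sketch: the helper names `v2_replacements` and `is_deprecated_for_agents` are hypothetical and not part of CMT; only the route templates come from the list above.

```python
# Legacy route templates and their v2 replacements, copied from the list above.
LEGACY_TO_V2 = {
    "/api/sources": ["/api/v2/catalog/sources"],
    "/api/tables/{tableId}": ["/api/v2/catalog/tables/{tableId}"],
    "/api/jobs/check-conflict": ["/api/v2/migrations/plans"],
    "/api/jobs": ["/api/v2/migrations/runs"],
    "/api/jobs/{jobId}": ["/api/v2/runs/{runId}"],
    "/api/tasks/{taskId}": [
        "/api/v2/runs/{runId}/events",
        "/api/v2/logs/runs/{runId}",
    ],
}


def v2_replacements(route_template: str) -> list:
    """Return the v2 route templates listed for a legacy route template.

    Hypothetical helper. An empty list only means this table lists no
    replacement; it does not by itself authorize a legacy call.
    """
    return list(LEGACY_TO_V2.get(route_template, []))


def is_deprecated_for_agents(route_template: str) -> bool:
    """True when the legacy route template has a listed v2 replacement."""
    return route_template in LEGACY_TO_V2
```

Note that `{jobId}` and `{runId}` are different identifiers, so the table maps templates only and does not try to convert ids.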
|
|
26
|
+
|
|
27
|
+
Do not use the `cz-agent` or `cz-cli agent` commands. They are provided only as a reference for external customers and are not used in coding, development, or testing.
|
|
28
|
+
|
|
29
|
+
## Anti-Misfire: External Tool Exclusion
|
|
30
|
+
|
|
31
|
+
When working in this repository (clickzetta-lakehouse-tool-migrate), **never** delegate to external tools for operations CMT already supports:
|
|
32
|
+
|
|
33
|
+
- **Do NOT use `cz-agent run`** — CMT has `/api/ops/sql/execute` for querying any datasource (source or target).
|
|
34
|
+
- **Do NOT use `cz-cli sql`** — same reason; use `/api/ops/sql/execute` with `sourceName` or `workspace`.
|
|
35
|
+
- **Do NOT invoke the global `cz-cli` skill** — it is for standalone Lakehouse operations outside CMT context.
|
|
36
|
+
|
|
37
|
+
If any global skill (e.g. `cz-cli`, `cz-agent`) tries to activate for "query data", "describe table", or "run SQL" while cwd is this repo, override it and use CMT's own endpoints instead:
|
|
38
|
+
|
|
39
|
+
| Need | CMT endpoint | Do NOT use |
|------|-------------|------------|
| Query source data | `POST /api/ops/sql/execute` with `sourceName` | `cz-agent run`, `cz-cli sql` |
| Query target (Lakehouse) data | `POST /api/ops/sql/execute` with `workspace` | `cz-agent run`, `cz-cli sql` |
| Describe table schema | `POST /api/ops/sql/execute` with DESC SQL | `cz-cli table describe` |
| Check connection | `GET /api/ops/sql/info` | `cz-cli status` |
|
|
45
|
+
|
|
46
|
+
If the user did not specify which CMT environment to use, stop and clarify whether they want:
|
|
47
|
+
- the local CMT service
|
|
48
|
+
- an online CMT service, with its base URL provided explicitly
|
|
49
|
+
|
|
50
|
+
|
|
51
|
+
## Reality Check
|
|
52
|
+
|
|
53
|
+
- `/doc` is useful for confirming route existence and schema names.
|
|
54
|
+
- `/doc` is not curated enough to be the decision source for agent workflow selection.
|
|
55
|
+
- Prefer `/llms.txt` for routing guidance and use the workflow reference below for exact sequence.
|
|
56
|
+
- Current v2 tags in OpenAPI are machine-generated names such as `catalog-v-2-api` and `run-v-2-api`. Ignore the tag names and reason from the path shape.
|
|
57
|
+
- The expected local server is `MMAv3.jar` on port `6060`. If `localhost:6060` is not listening, start it before using `/doc` or `/api/v2`.
|
|
58
|
+
|
|
59
|
+
## Workflow
|
|
60
|
+
|
|
61
|
+
Follow [references/workflows.md](references/workflows.md) for the standard call path.
|
|
62
|
+
|
|
63
|
+
## Guardrails
|
|
64
|
+
|
|
65
|
+
- Do not assume `localhost:6060` unless the user explicitly chose the local CMT environment.
|
|
66
|
+
- If the user did not specify local vs online CMT, ask a short clarification before calling `/doc` or `/api/v2`.
|
|
67
|
+
- If the user chose an online CMT service, require its base URL and surface that URL back in the user-facing update before using it.
|
|
68
|
+
- Before browsing `http://localhost:6060/doc`, check whether port `6060` is listening.
|
|
69
|
+
- If port `6060` is not listening, start the local server from repo root with `mkdir -p .codex/tmp && nohup java -jar MMAv3.jar -c conf/config_prod.ini > .codex/tmp/mmav3-6060.log 2>&1 & echo $! > .codex/tmp/mmav3-6060.pid`.
|
|
70
|
+
- After starting the server, record the PID from `.codex/tmp/mmav3-6060.pid` in the user-facing update and surface the local jump link `http://localhost:6060/doc`.
|
|
71
|
+
- Treat `plan -> start -> wait -> verify` as the default mutation chain.
|
|
72
|
+
- Never execute admin mutations unless the user has explicitly asked for them.
|
|
73
|
+
- Never infer that a run is stuck just because `/doc` also shows legacy job/task endpoints.
|
|
74
|
+
- On `attention`, read `next_action.recommended_action` before deciding what to do.
|
|
75
|
+
- On verification, always report the final target identity, not just a run status.
|
|
76
|
+
- Prefer `/api/v2/runs/{id}/wait` over ad hoc polling loops.
|
|
77
|
+
- If the user asks for “progress”, answer from `/api/v2/runs/{id}`, `/wait`, `/events`, or `/logs/runs/{id}`. Do not improvise by stitching legacy task APIs first.
|
|
78
|
+
- If any returned child has `status=skipped`, report the skipped objects and surface `reason_code` / `reason_summary` instead of collapsing them into generic success or silence.
|
|
79
|
+
- If `children` are present and every child has `status=skipped`, do not report generic migration success. Explain that no execution task was generated and surface the skipped objects plus `reason_code` / `reason_summary`.
|
|
80
|
+
- If the user asks why there are no execution tasks or why a child set looks empty, answer from `children.status=skipped` and `reason_*` before falling back to legacy inference.
|
|
81
|
+
|
|
82
|
+
## What This Skill Solves
|
|
83
|
+
|
|
84
|
+
- Choosing the right v2 endpoint from a noisy `/doc`
|
|
85
|
+
- Avoiding accidental fallback to legacy mutation endpoints
|
|
86
|
+
- Standardizing how runs are started, monitored, and verified
|
|
87
|
+
- Making cleanup and retry decisions from structured run state instead of guesswork
|
|
88
|
+
|
|
89
|
+
## Recovery
|
|
90
|
+
|
|
91
|
+
Follow [references/error-recovery.md](references/error-recovery.md) for error classes and next steps.
|
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
# Error Recovery
|
|
2
|
+
|
|
3
|
+
## Status Handling
|
|
4
|
+
|
|
5
|
+
- `accepted`: the run exists but has not progressed yet; wait rather than branching to legacy state.
|
|
6
|
+
- `preparing`: the start path is still translating intent into runnable work; wait unless the user asked to interrupt.
|
|
7
|
+
- `ready`: tasks are prepared; if the user asked for status, report it as runnable state, not as success.
|
|
8
|
+
- `running`: keep waiting and summarize progress from structured events.
|
|
9
|
+
- `succeeded`: verify targets and checks before claiming success. If all run children are `skipped`, report that no execution task was generated rather than implying data moved.
|
|
10
|
+
- `failed`: inspect run events or logs before proposing cleanup or retry.
|
|
11
|
+
- `attention`: do not guess. Read `next_action.recommended_action` and surface it.
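
The status rules above can be read as a small dispatch function. This is a sketch under assumptions: the run is taken to be a parsed JSON dict with top-level `status`, `children`, and `next_action` keys, and the returned action labels are hypothetical names chosen for illustration.

```python
# Statuses for which the documented guidance is simply to keep waiting.
WAIT_STATUSES = {"accepted", "preparing", "running"}


def next_step(run: dict) -> str:
    """Map a run summary to a coarse next step (hypothetical labels)."""
    status = run.get("status")
    if status in WAIT_STATUSES:
        return "wait"
    if status == "ready":
        # Runnable state, not success.
        return "report_runnable"
    if status == "succeeded":
        children = run.get("children") or []
        if children and all(c.get("status") == "skipped" for c in children):
            # No execution task was generated; do not imply data moved.
            return "report_no_execution_tasks"
        return "verify"
    if status == "failed":
        return "inspect_events_or_logs"
    if status == "attention":
        action = (run.get("next_action") or {}).get("recommended_action")
        if action:
            return "surface:" + str(action)
        return "surface_missing_recommended_action"
    return "unknown_status"
```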
|
|
12
|
+
|
|
13
|
+
## Error Categories
|
|
14
|
+
|
|
15
|
+
- `environment_unspecified`: the user did not say whether to use local CMT or an online CMT service. Ask which environment to use before touching `/doc` or `/api/v2`.
|
|
16
|
+
- `service_url_missing`: the user chose an online CMT service but did not provide its base URL. Ask for the URL before proceeding.
|
|
17
|
+
- `service_unavailable_local`: `localhost:6060` is not listening yet. Start `MMAv3.jar` with the documented `nohup` command, record the PID, and surface the local links after the port is ready.
|
|
18
|
+
- `validation`: fix request shape or missing identifiers before retrying.
|
|
19
|
+
- `conflict`: inspect existing targets or conflicting runs, then clean up only with explicit user intent.
|
|
20
|
+
- `upstream_timeout`: prefer the retry path suggested by the response or `next_action`.
|
|
21
|
+
- `transient`: safe to retry when the response marks it retryable.
|
|
22
|
+
- `dangerous_operation_requires_confirmation`: stop and get explicit user intent.
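
One way to keep these categories from being handled ad hoc is a single lookup. A minimal sketch: the category names come from the list above, while the `retryable` argument and the returned labels are assumptions made for illustration, since the response field that marks an error retryable is not named here.

```python
# Category names are from the list above; the action labels are hypothetical.
CATEGORY_ACTIONS = {
    "environment_unspecified": "ask_environment",
    "service_url_missing": "ask_base_url",
    "service_unavailable_local": "start_local_server",
    "validation": "fix_request",
    "conflict": "inspect_conflict",
    "upstream_timeout": "follow_suggested_retry",
    "dangerous_operation_requires_confirmation": "ask_confirmation",
}


def handle_error(category: str, retryable: bool = False) -> str:
    """Return a coarse action for an error category.

    `retryable` stands in for whatever flag the response uses to mark a
    transient error as safe to retry (assumed, not documented here).
    """
    if category == "transient":
        return "retry" if retryable else "do_not_retry"
    return CATEGORY_ACTIONS.get(category, "unknown_category")
```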
|
|
23
|
+
|
|
24
|
+
## Decision Rules
|
|
25
|
+
|
|
26
|
+
- If the environment is unspecified, clarify local vs online before assuming `localhost:6060`.
|
|
27
|
+
- If the user wants an online CMT service, require and restate the base URL before calling `/doc`, `/llms.txt`, or `/api/v2/*`.
|
|
28
|
+
- If the response already carries a next step, follow that instead of inventing one.
|
|
29
|
+
- If the run has failed but the target identity is still needed, read verification before discussing cleanup.
|
|
30
|
+
- If the user asks “what should I do now”, answer from `recommended_action`, conflict context, or verification state.
|
|
31
|
+
- If there is no v2 signal for the recovery path, say that explicitly instead of silently dropping to legacy routes.
|
|
32
|
+
- If the user asks why a succeeded run has no execution tasks, answer from `children.status=skipped` and `children.reason_*` before falling back to legacy inference.
|
|
33
|
+
- If `/doc` is unreachable because the local service is down, start it first, record the PID file path, and return `http://localhost:6060/doc` as the next link to open.
|
|
34
|
+
|
|
35
|
+
## Cleanup and Retry
|
|
36
|
+
|
|
37
|
+
- Retry only after identifying why the previous run stopped.
|
|
38
|
+
- Cleanup is an admin mutation and must be user-authorized.
|
|
39
|
+
- Do not fall back to legacy mutation endpoints on your own.
|
|
40
|
+
|
|
41
|
+
## Current Surface Limits
|
|
42
|
+
|
|
43
|
+
- The current v2 verification surface is still derived from existing run/task state and is not yet a full independent target probe.
|
|
44
|
+
- The current `/doc` surface still exposes many legacy endpoints; treat that as documentation noise, not as permission to use them by default.
|
|
45
|
+
- A succeeded run with skipped-only children is not evidence that data was copied; explain the outcome from `children.reason_code` / `children.reason_summary`.
|
|
@@ -0,0 +1,167 @@
|
|
|
1
|
+
# Workflows
|
|
2
|
+
|
|
3
|
+
## Live `/doc` Interpretation
|
|
4
|
+
|
|
5
|
+
The current `/doc` page shows:
|
|
6
|
+
|
|
7
|
+
- v2 routes
|
|
8
|
+
- legacy routes
|
|
9
|
+
- raw OpenAPI tags such as `migration-v-2-api`
|
|
10
|
+
|
|
11
|
+
Read it as a schema browser, not as a workflow guide.
|
|
12
|
+
|
|
13
|
+
## Environment Selection
|
|
14
|
+
|
|
15
|
+
1. If the user did not specify which CMT environment to use, ask whether they want:
   - the local CMT service
   - an online CMT service, with its base URL
|
|
18
|
+
2. Do not assume `localhost:6060` until the user explicitly chose the local environment.
|
|
19
|
+
3. If the user chose an online CMT service, repeat the chosen base URL in the user-facing update before using `/doc`, `/llms.txt`, or `/api/v2/*`.
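
The selection rules above can be sketched as a function that refuses to guess. Assumptions: the `environment` values `"local"` and `"online"`, the exception type `NeedsClarification`, and the helper name are all hypothetical; only the `localhost:6060` default and the two clarification cases come from this document.

```python
from typing import Optional

LOCAL_BASE_URL = "http://localhost:6060"


class NeedsClarification(Exception):
    """Raised when the user must be asked before any CMT call is made."""


def resolve_base_url(environment: Optional[str], online_url: Optional[str] = None) -> str:
    """Return the CMT base URL, or raise instead of assuming localhost."""
    if environment == "local":
        return LOCAL_BASE_URL
    if environment == "online":
        if not online_url:
            raise NeedsClarification("service_url_missing: ask for the online CMT base URL")
        # Restate this URL to the user before using it.
        return online_url.rstrip("/")
    raise NeedsClarification("environment_unspecified: ask local vs online CMT")
```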
|
|
20
|
+
|
|
21
|
+
## Local Service Bootstrap
|
|
22
|
+
|
|
23
|
+
1. Only use this section after the user explicitly chose the local CMT environment.
|
|
24
|
+
2. Before calling `/doc`, `/llms.txt`, or `/api/v2/*`, check whether port `6060` is listening.
|
|
25
|
+
3. If `6060` is not listening, start the local server from repo root with:
|
|
26
|
+
|
|
27
|
+
```bash
mkdir -p .codex/tmp && nohup java -jar MMAv3.jar -c conf/config_prod.ini > .codex/tmp/mmav3-6060.log 2>&1 & echo $! > .codex/tmp/mmav3-6060.pid
```
|
|
30
|
+
|
|
31
|
+
4. Record the PID written to `.codex/tmp/mmav3-6060.pid`.
|
|
32
|
+
5. Wait for the port to accept connections, then surface the jump links:
   - `http://localhost:6060/doc`
   - `http://localhost:6060/llms.txt`
|
|
35
|
+
6. If `6060` is already listening, reuse the running service instead of starting a duplicate process.
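
The "is the port listening" and "wait for the port" steps above can be done with a plain TCP connect. A minimal sketch using only the Python standard library; the function names, the 60-second deadline, and the 1-second interval are assumptions, not values taken from CMT.

```python
import socket
import time


def is_port_listening(host: str = "localhost", port: int = 6060, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def wait_for_port(host: str = "localhost", port: int = 6060,
                  deadline_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll until the port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if is_port_listening(host, port):
            return True
        time.sleep(interval_s)
    return is_port_listening(host, port)
```

If `is_port_listening()` already returns `True`, reuse the running service; only run the start command when it returns `False`.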
|
|
36
|
+
|
|
37
|
+
## Route Matrix
|
|
38
|
+
|
|
39
|
+
| Goal | Route | Notes |
| --- | --- | --- |
| list sources | `/api/v2/catalog/sources` | first read step |
| inspect one table | `/api/v2/catalog/tables/{tableId}` | use when table id is already known |
| explain intent | `/api/v2/migrations/plans` | use before any start |
| start run | `/api/v2/migrations/runs` | returns `run_id` and run URLs |
| inspect run | `/api/v2/runs/{runId}` | current summary |
| wait on run | `/api/v2/runs/{runId}/wait` | preferred progress path |
| structured events | `/api/v2/runs/{runId}/events` | progress timeline |
| run logs facade | `/api/v2/logs/runs/{runId}` | log-oriented read path |
| verify result | `/api/v2/verifications/runs/{runId}` | final target identity and checks |
| cleanup run | `/api/v2/admin/runs/{runId}/cleanup` | explicit user intent only |
| execute SQL (any datasource) | `POST /api/ops/sql/execute` | unified query: pass `sourceName` for source, or `workspace`+`schema` for Lakehouse target |
| sql endpoint info | `GET /api/ops/sql/info` | returns default workspace, schema, vcluster, limits |
|
|
53
|
+
|
|
54
|
+
## Deprecated Legacy Mappings
|
|
55
|
+
|
|
56
|
+
| Deprecated legacy route | Use instead |
| --- | --- |
| `/api/sources` | `/api/v2/catalog/sources` |
| `/api/tables/{tableId}` | `/api/v2/catalog/tables/{tableId}` |
| `/api/jobs/check-conflict` | `/api/v2/migrations/plans` |
| `/api/jobs` | `/api/v2/migrations/runs` |
| `/api/jobs/{jobId}` | `/api/v2/runs/{runId}` |
| `/api/tasks/{taskId}` | `/api/v2/runs/{runId}/events` or `/api/v2/logs/runs/{runId}` |
|
|
64
|
+
|
|
65
|
+
## Default Migration Sequence
|
|
66
|
+
|
|
67
|
+
1. Read `/llms.txt` when routing is unclear.
|
|
68
|
+
2. Discover the source with `/api/v2/catalog/sources`.
|
|
69
|
+
3. If the user already gave a table id, inspect `/api/v2/catalog/tables/{tableId}`.
|
|
70
|
+
4. Create a plan with `/api/v2/migrations/plans`.
|
|
71
|
+
5. Review `precheck`, `resolved_targets`, and `execution_strategy`.
|
|
72
|
+
6. Start with `/api/v2/migrations/runs`.
|
|
73
|
+
7. Monitor with `/api/v2/runs/{runId}/wait`.
|
|
74
|
+
8. If more detail is needed, read `/api/v2/runs/{runId}/events` or `/api/v2/logs/runs/{runId}`.
|
|
75
|
+
9. Finish with `/api/v2/verifications/runs/{runId}`.
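
The plan, start, wait, and verify steps of this sequence can be sketched as one driver with an injected transport, so the call order is testable without a live service. Assumptions, all hypothetical: the `call(method, path, body)` transport signature, the `review` callback used to approve the plan, and the use of `GET` for the wait and verification routes (this document only states `POST` for plans and runs). The start body wraps the plan intent together with `refresh_before_submit`, as documented for the start request in this file.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical transport: call(method, path, body) -> parsed JSON response.
Call = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_default_sequence(call: Call, intent: Dict[str, Any],
                         review: Callable[[Dict[str, Any]], bool]) -> Dict[str, Any]:
    """Drive plan -> start -> wait -> verify, stopping if the plan is rejected."""
    # Steps 4-5: materialize the plan and let the caller review it.
    plan = call("POST", "/api/v2/migrations/plans", intent)
    if not review(plan):
        return {"plan": plan, "run": None, "verification": None}

    # Step 6: the start body wraps the same intent.
    started = call("POST", "/api/v2/migrations/runs",
                   {"intent": intent, "refresh_before_submit": False})
    run_id = started["run_id"]

    # Step 7: one wait call; GET is an assumption, and the caller may need
    # to call wait again if the run has not reached a terminal state.
    run = call("GET", "/api/v2/runs/%s/wait" % run_id, None)

    # Step 9: always read verification; run status alone is not success.
    verification = call("GET", "/api/v2/verifications/runs/%s" % run_id, None)
    return {"plan": plan, "run": run, "verification": verification}
```

Passing a `review` callback keeps the "review `precheck`, `resolved_targets`, and `execution_strategy`" step explicit instead of starting a run unconditionally.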
|
|
76
|
+
|
|
77
|
+
## Request Shapes
|
|
78
|
+
|
|
79
|
+
### Plan
|
|
80
|
+
|
|
81
|
+
`POST /api/v2/migrations/plans`
|
|
82
|
+
|
|
83
|
+
```json
{
  "source": {
    "source_id": "src_databricks_az",
    "schema": "demo_table_type_examples",
    "tables": ["basic_scalar_types"]
  },
  "target": {
    "workspace": "sample_workspace",
    "schema": "public"
  },
  "options": {
    "refresh_mode": "none",
    "schema_evolution": true,
    "verification_mode": "row_count"
  }
}
```
|
|
101
|
+
|
|
102
|
+
### Start
|
|
103
|
+
|
|
104
|
+
`POST /api/v2/migrations/runs`
|
|
105
|
+
|
|
106
|
+
```json
{
  "intent": {
    "source": {
      "source_id": "src_databricks_az",
      "schema": "demo_table_type_examples",
      "tables": ["basic_scalar_types"]
    },
    "target": {
      "workspace": "sample_workspace",
      "schema": "public"
    },
    "options": {
      "refresh_mode": "none",
      "schema_evolution": true,
      "verification_mode": "row_count"
    }
  },
  "refresh_before_submit": false
}
```
|
|
127
|
+
|
|
128
|
+
### Execute SQL (source or target)
|
|
129
|
+
|
|
130
|
+
`POST /api/ops/sql/execute`
|
|
131
|
+
|
|
132
|
+
Query a **source** (Databricks, MC, Doris, etc.):
|
|
133
|
+
```json
{
  "sql": "SELECT * FROM demo_table_type_examples.basic_scalar_types ORDER BY id",
  "sourceName": "databricks_az",
  "limit": 100
}
```
|
|
140
|
+
|
|
141
|
+
Query the **Lakehouse target**:
|
|
142
|
+
```json
{
  "sql": "SELECT * FROM demo_table_type_examples.basic_scalar_types ORDER BY id",
  "workspace": "wanxin-test-ws-03",
  "schema": "demo_table_type_examples",
  "limit": 100
}
```
|
|
150
|
+
|
|
151
|
+
Response includes `columns` (name + type), `rows`, `rowCount`, `elapsedMs`, and `jobId`.
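
The two request shapes above differ only in how the datasource is addressed, so a single builder can produce both. A minimal sketch: the field names `sql`, `sourceName`, `workspace`, `schema`, and `limit` are taken from the examples above, while the helper name and the rule that exactly one of `source_name` or `workspace` must be given are assumptions, not documented server behavior.

```python
from typing import Any, Dict, Optional


def build_sql_request(sql: str, source_name: Optional[str] = None,
                      workspace: Optional[str] = None,
                      schema: Optional[str] = None,
                      limit: int = 100) -> Dict[str, Any]:
    """Build a body for POST /api/ops/sql/execute (hypothetical helper)."""
    # Assumed rule: address either a source or the Lakehouse target, not both.
    if bool(source_name) == bool(workspace):
        raise ValueError("pass exactly one of source_name or workspace")
    body: Dict[str, Any] = {"sql": sql, "limit": limit}
    if source_name:
        body["sourceName"] = source_name
    else:
        body["workspace"] = workspace
        if schema:
            body["schema"] = schema
    return body
```

For example, `build_sql_request("SELECT 1", source_name="databricks_az")` yields a source query, and passing `workspace` and `schema` instead yields a target query.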
|
|
152
|
+
|
|
153
|
+
## Progress Policy
|
|
154
|
+
|
|
155
|
+
- Prefer `wait` for “monitor until something changes”.
|
|
156
|
+
- Prefer `getRun` for “tell me the current state now”.
|
|
157
|
+
- Prefer `events` or `logs` for “why is it slow”, “what failed”, or “summarize what happened”.
|
|
158
|
+
- Treat `children.status=skipped` as a first-class child outcome, not as missing task data, and surface `reason_code` / `reason_summary` when reporting those children.
|
|
159
|
+
- If a run is `succeeded` and all returned children are `skipped`, report that the run reached terminal state without generating executable tasks.
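
The skipped-children rules above can be sketched as a small report builder. Assumptions: the run is a parsed JSON dict with a `children` list whose items carry `status`, `reason_code`, and `reason_summary`; the helper name and the shape of the returned report are hypothetical.

```python
def skipped_children_report(run: dict) -> dict:
    """Summarize skipped children so they are reported, not silently dropped."""
    children = run.get("children") or []
    skipped = [c for c in children if c.get("status") == "skipped"]
    return {
        # False here is inconclusive: the response may simply have omitted
        # children, so re-read the run with children included first.
        "children_present": bool(children),
        "all_skipped": bool(children) and len(skipped) == len(children),
        "skipped": [
            {
                "child": c,
                "reason_code": c.get("reason_code"),
                "reason_summary": c.get("reason_summary"),
            }
            for c in skipped
        ],
    }
```

When `all_skipped` is true for a `succeeded` run, report that the run reached a terminal state without generating executable tasks, and list each `reason_code` / `reason_summary`.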
|
|
160
|
+
|
|
161
|
+
## Anti-Patterns
|
|
162
|
+
|
|
163
|
+
- Do not jump from `/doc` to legacy `/api/jobs` just because it is more detailed.
|
|
164
|
+
- Do not use deprecated legacy source/table/job/task-log routes when the mapped `/api/v2` route exists.
|
|
165
|
+
- Do not start a run without first materializing a plan unless the user explicitly wants a direct fire-and-monitor flow.
|
|
166
|
+
- Do not claim migration success from `succeeded` alone. Always read verification.
|
|
167
|
+
- Do not interpret `children=[]` from `wait` with default params as evidence that there were no child outcomes; call `getRun` or `wait?include_children=true` before concluding that.
|
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: cz-cli
|
|
3
|
-
description: "Delegate ClickZetta Lakehouse OPERATIONS (run SQL, manage tables/schemas, create Studio tasks, set up sync/ingest pipelines, configure profiles) to the cz-cli agent. TRIGGER when user wants to EXECUTE
|
|
3
|
+
description: "Delegate ClickZetta Lakehouse OPERATIONS (run SQL, manage tables/schemas, create Studio tasks, set up sync/ingest pipelines, configure profiles) to the cz-cli agent. TRIGGER when user explicitly mentions ClickZetta, Lakehouse, cz-cli, or a known profile/workspace name AND wants to EXECUTE an operation (query data, create/alter tables, deploy tasks, build pipelines, set up a new connection). SKIP when (1) user is developing the cz-cli tool itself (cwd is the cz-cli source repo, editing CLI source/tests, debugging build/install/unlink/permission issues), (2) only discussing cz-cli design/code without wanting to run anything on Lakehouse, or (3) the current project already has its own datasource query capabilities (e.g. project has AGENTS.md or skills that provide SQL execution endpoints) — do not intercept generic 'query data' or 'run SQL' requests that belong to the host project's own toolchain."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# cz-cli — ClickZetta Lakehouse Subagent
|