@clickzetta/cz-cli-darwin-x64 0.3.89 → 0.3.91
- package/bin/cz-cli +0 -0
- package/bin/skills/clickzetta-dynamic-table/SKILL.md +169 -169
- package/bin/skills/clickzetta-dynamic-table/best-practices/dimension-table-join-guide.md +126 -126
- package/bin/skills/clickzetta-dynamic-table/best-practices/medallion-and-stream-patterns.md +25 -25
- package/bin/skills/clickzetta-dynamic-table/best-practices/non-partitioned-merge-into-warning.md +48 -48
- package/bin/skills/clickzetta-dynamic-table/best-practices/performance-optimization.md +51 -51
- package/bin/skills/clickzetta-dynamic-table/best-practices/scheduling-guide.md +59 -59
- package/bin/skills/clickzetta-dynamic-table/dt-creator/SKILL.md +8 -7
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/dt-declaration-strategy.md +99 -99
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/incremental-config-reference.md +188 -188
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/refresh-history-guide.md +117 -117
- package/bin/skills/clickzetta-dynamic-table/dt-creator/references/sql-limitations.md +29 -29
- package/bin/skills/clickzetta-dynamic-table/dynamic-table-alter/SKILL.md +80 -79
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/SKILL.md +15 -15
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-column-validation-rules.md +61 -61
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-conversion-rules.md +100 -100
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-placeholder-rules.md +64 -64
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-refresh-rules.md +32 -32
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md +21 -21
- package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md +71 -71
- package/bin/skills/clickzetta-sql-pipeline-manager/SKILL.md +203 -202
- package/bin/skills/clickzetta-sql-pipeline-manager/references/dynamic-table.md +62 -62
- package/bin/skills/clickzetta-sql-pipeline-manager/references/materialized-view.md +34 -34
- package/bin/skills/clickzetta-sql-pipeline-manager/references/pipe.md +61 -61
- package/bin/skills/clickzetta-sql-pipeline-manager/references/table-stream.md +41 -41
- package/bin/skills/clickzetta-table-stream-pipeline/SKILL.md +103 -101
- package/package.json +1 -1
package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-self-reference-rules.md
CHANGED
|
@@ -1,34 +1,34 @@
|
|
|
1
|
-
# Dynamic Table
|
|
1
|
+
# Dynamic Table Self-referencing Table Conversion Rules
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
You are a SQL conversion expert. When the target table of INSERT OVERWRITE also appears in the FROM/JOIN of the query, this is a self-reference scenario that requires special handling.
|
|
4
4
|
|
|
5
|
-
##
|
|
5
|
+
## Self-reference Detection
|
|
6
6
|
|
|
7
|
-
###
|
|
7
|
+
### Detection Criteria
|
|
8
8
|
|
|
9
|
-
1.
|
|
10
|
-
2.
|
|
11
|
-
3.
|
|
12
|
-
4.
|
|
9
|
+
1. Extract the target table name (including schema) from the INSERT OVERWRITE statement
|
|
10
|
+
2. Search for that table name in the FROM and JOIN clauses of the SELECT query
|
|
11
|
+
3. Exclude table name references in the PARTITION clause (these do not count as self-references)
|
|
12
|
+
4. If the target table name is found in FROM/JOIN → classify as self-reference
|
|
13
13
|
|
|
14
|
-
###
|
|
14
|
+
### Example
|
|
15
15
|
|
|
16
16
|
```sql
|
|
17
|
-
--
|
|
17
|
+
-- Target table: kscdm.daily_sales
|
|
18
18
|
INSERT OVERWRITE TABLE kscdm.daily_sales PARTITION(ds='${ds}')
|
|
19
19
|
SELECT current.id, current.amount
|
|
20
20
|
FROM source_sales current
|
|
21
|
-
LEFT JOIN kscdm.daily_sales prev ON current.id = prev.id -- ←
|
|
21
|
+
LEFT JOIN kscdm.daily_sales prev ON current.id = prev.id -- ← self-reference
|
|
22
22
|
WHERE prev.ds = '${ds - 1}';
|
|
23
23
|
```
|
|
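The detection criteria can be sketched in Python as follows. This is a minimal regex-based illustration, not the skill's actual implementation: the function name `is_self_reference` is hypothetical, and the sketch assumes table names directly follow `FROM` or `JOIN` (subqueries, comma-separated table lists, comments, and string literals are not handled). Because only identifiers after `FROM`/`JOIN` are collected, the `PARTITION(...)` clause is never inspected.

```python
import re

# Captures the target table name of "INSERT OVERWRITE [TABLE] <name>".
_TARGET_RE = re.compile(r"INSERT\s+OVERWRITE\s+(?:TABLE\s+)?([\w.]+)", re.IGNORECASE)
# Captures the identifier that directly follows a FROM or JOIN keyword.
_SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def is_self_reference(sql):
    """Return True if the INSERT OVERWRITE target also appears after FROM/JOIN."""
    target = _TARGET_RE.search(sql)
    if target is None:
        return False  # not an INSERT OVERWRITE statement
    target_name = target.group(1).lower()
    # Table names are compared case-insensitively (an assumption of this sketch).
    sources = {name.lower() for name in _SOURCE_RE.findall(sql)}
    return target_name in sources
```

For the `kscdm.daily_sales` example statement, the `LEFT JOIN kscdm.daily_sales` reference makes the function return `True`.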
24
24
|
|
|
25
|
-
##
|
|
25
|
+
## Conversion Rules
|
|
26
26
|
|
|
27
|
-
|
|
27
|
+
Converting a self-referencing table is essentially the same as converting a regular table, with the following differences:
|
|
28
28
|
|
|
29
|
-
### 1.
|
|
29
|
+
### 1. Explicit Schema Declaration
|
|
30
30
|
|
|
31
|
-
|
|
31
|
+
Self-referencing tables must explicitly declare complete column definitions (including types) in CREATE DYNAMIC TABLE, because the SQL engine needs this information to infer the types of self-dependent columns:
|
|
32
32
|
|
|
33
33
|
```sql
|
|
34
34
|
CREATE OR REPLACE DYNAMIC TABLE kscdm.daily_sales (
|
|
@@ -45,23 +45,23 @@ LEFT JOIN kscdm.daily_sales prev ON current.id = prev.id
|
|
|
45
45
|
WHERE prev.ds = DATE_FORMAT(sub_days(SESSION_CONFIGS()['dt.args.ds'], 1), 'yyyy-MM-dd')::STRING;
|
|
46
46
|
```
|
|
47
47
|
|
|
48
|
-
### 2.
|
|
48
|
+
### 2. Retain Self-reference in Query
|
|
49
49
|
|
|
50
|
-
|
|
50
|
+
In the converted AS clause, the self-referencing table name remains unchanged without any substitution. The SQL engine automatically handles version management for self-references.
|
|
51
51
|
|
|
52
|
-
##
|
|
52
|
+
## Common Self-reference Scenarios
|
|
53
53
|
|
|
54
|
-
###
|
|
54
|
+
### Day-over-day Comparison
|
|
55
55
|
|
|
56
56
|
```sql
|
|
57
|
-
--
|
|
57
|
+
-- Input
|
|
58
58
|
INSERT OVERWRITE TABLE metrics PARTITION(ds='${ds}')
|
|
59
59
|
SELECT t.id, t.value,
|
|
60
60
|
t.value - prev.value AS daily_change
|
|
61
61
|
FROM source t
|
|
62
62
|
LEFT JOIN metrics prev ON t.id = prev.id AND prev.ds = '${ds - 1}';
|
|
63
63
|
|
|
64
|
-
--
|
|
64
|
+
-- Output
|
|
65
65
|
CREATE OR REPLACE DYNAMIC TABLE metrics (
|
|
66
66
|
id BIGINT, value DECIMAL(10,2), daily_change DECIMAL(10,2), ds STRING
|
|
67
67
|
)
|
|
package/bin/skills/clickzetta-dynamic-table/sql-to-dt/references/sql2dt-workflow.md
CHANGED
|
@@ -1,109 +1,109 @@
|
|
|
1
|
-
# SQL → Dynamic Table
|
|
1
|
+
# SQL → Dynamic Table Complete Conversion Workflow
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
When the user gives you a set of CREATE TABLE DDL and INSERT OVERWRITE SQL statements and asks you to convert them to a Dynamic Table, execute the following steps in order.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
The detailed rules for each step are in the corresponding skill files; consult them alongside this workflow.
|
|
6
6
|
|
|
7
|
-
##
|
|
7
|
+
## Workflow Steps
|
|
8
8
|
|
|
9
|
-
### Step 1:
|
|
9
|
+
### Step 1: Pre-process Input
|
|
10
10
|
|
|
11
|
-
|
|
12
|
-
-
|
|
13
|
-
- `ANALYZE TABLE`
|
|
14
|
-
- SQL
|
|
11
|
+
Remove from the INSERT OVERWRITE file:
|
|
12
|
+
- All `ALTER TABLE` statements
|
|
13
|
+
- `ANALYZE TABLE` statements
|
|
14
|
+
- SQL comments (`--` and `/* */`)
|
|
15
15
|
|
|
16
|
-
|
|
16
|
+
Retain: CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION.
|
|
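A minimal Python sketch of the Step 1 pre-processing, under simplifying assumptions: statements are separated by `;`, and neither `;` nor comment markers occur inside string literals. The function name `preprocess` is hypothetical and only illustrates the remove/retain rules.

```python
import re


def preprocess(sql):
    """Drop comments plus ALTER TABLE / ANALYZE TABLE statements; keep the rest."""
    # Remove /* ... */ block comments first, then -- line comments.
    sql = re.sub(r"/\*.*?\*/", "", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", "", sql)
    kept = []
    for statement in sql.split(";"):
        statement = statement.strip()
        if not statement:
            continue  # skip empty fragments between semicolons
        if re.match(r"(?:ALTER|ANALYZE)\s+TABLE\b", statement, re.IGNORECASE):
            continue  # removed by Step 1
        kept.append(statement + ";")
    return "\n".join(kept)
```

Everything that is not explicitly removed (CREATE TABLE, INSERT OVERWRITE, WITH, SET, CREATE TEMPORARY FUNCTION) passes through unchanged.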
17
17
|
|
|
18
|
-
### Step 2:
|
|
18
|
+
### Step 2: Placeholder Replacement
|
|
19
19
|
|
|
20
|
-
|
|
21
|
-
1.
|
|
22
|
-
2.
|
|
23
|
-
3.
|
|
24
|
-
4.
|
|
20
|
+
Follow the rules in #[[file:sql2dt-placeholder-rules.md]]:
|
|
21
|
+
1. Normalize placeholder format (`{{ }}` → `${ }`)
|
|
22
|
+
2. Replace all placeholders with `SESSION_CONFIGS()` calls
|
|
23
|
+
3. Handle nodash variables, date arithmetic, and macro functions
|
|
24
|
+
4. Decide handling based on quote context (remove quotes / CONCAT / direct replacement)
|
|
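The first two sub-steps can be illustrated with a small Python sketch. The full rules live in `sql2dt-placeholder-rules.md`, which is not shown in this diff, so this covers only the simplest case and rests on assumptions: the function names are hypothetical, the `dt.args.` key prefix is taken from the converted `SESSION_CONFIGS()['dt.args.ds']` example in the self-reference rules, and a fully quoted `'${name}'` is assumed to lose its quotes. Date arithmetic such as `${ds - 1}` is deliberately left untouched.

```python
import re


def normalize_placeholders(sql):
    """Rewrite {{ name }} placeholders into the ${name} form."""
    return re.sub(r"\{\{\s*(.*?)\s*\}\}", r"${\1}", sql)


def replace_quoted_variables(sql):
    """Replace a fully quoted '${name}' with a SESSION_CONFIGS() lookup.

    Only plain variable names are handled; the 'dt.args.' prefix is an
    assumption based on the converted example in the self-reference rules.
    """
    return re.sub(r"'\$\{(\w+)\}'", r"SESSION_CONFIGS()['dt.args.\1']", sql)
```

Running the two functions in sequence turns `ds='{{ ds }}'` into `ds=SESSION_CONFIGS()['dt.args.ds']`.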
25
25
|
|
|
26
|
-
### Step 3:
|
|
26
|
+
### Step 3: Self-reference Detection
|
|
27
27
|
|
|
28
|
-
|
|
29
|
-
1.
|
|
30
|
-
2.
|
|
28
|
+
Follow the rules in #[[file:sql2dt-self-reference-rules.md]]:
|
|
29
|
+
1. Check whether the INSERT OVERWRITE target table appears in FROM/JOIN
|
|
30
|
+
2. If it is a self-referencing table, mark it, then add comments and use an explicit schema in subsequent steps
|
|
31
31
|
|
|
32
|
-
### Step 4:
|
|
32
|
+
### Step 4: Core Conversion
|
|
33
33
|
|
|
34
|
-
|
|
35
|
-
1.
|
|
36
|
-
2.
|
|
37
|
-
3.
|
|
38
|
-
4.
|
|
39
|
-
5.
|
|
40
|
-
6.
|
|
41
|
-
7.
|
|
34
|
+
Follow the rules in #[[file:sql2dt-conversion-rules.md]]:
|
|
35
|
+
1. Parse CREATE TABLE DDL (extract columns, partitions, properties, etc.)
|
|
36
|
+
2. Parse INSERT OVERWRITE (extract query, partition type)
|
|
37
|
+
3. Assemble `CREATE OR REPLACE DYNAMIC TABLE ... AS SELECT ...`
|
|
38
|
+
4. Inject static partition values into SELECT (smart quote handling)
|
|
39
|
+
5. Merge table property template (default `data_lifecycle=15`)
|
|
40
|
+
6. Handle UNION ALL (inject into each branch independently)
|
|
41
|
+
7. Date function post-processing: convert all `DATE_SUB/DATE_ADD` to `sub_days`
|
|
42
42
|
|
|
43
|
-
### Step 5:
|
|
43
|
+
### Step 5: Column Validation
|
|
44
44
|
|
|
45
|
-
|
|
46
|
-
1.
|
|
47
|
-
2.
|
|
48
|
-
3.
|
|
49
|
-
4. UNION ALL
|
|
45
|
+
Follow the rules in #[[file:sql2dt-column-validation-rules.md]]:
|
|
46
|
+
1. Count schema columns and SELECT columns
|
|
47
|
+
2. Verify they are equal
|
|
48
|
+
3. Check for duplicate aliases and missing partition columns
|
|
49
|
+
4. UNION ALL branch column count consistency check
|
|
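The first three checks can be sketched in Python once the column names have been extracted. This is an illustration only: `validate_columns` is a hypothetical helper that takes already-parsed name lists, and case-insensitive name comparison is an assumption. The per-branch UNION ALL check would amount to calling it once per branch.

```python
from collections import Counter


def validate_columns(schema_cols, select_cols, partition_cols=()):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    # Checks 1 and 2: the schema and the SELECT list must have the same length.
    if len(schema_cols) != len(select_cols):
        errors.append(
            f"column count mismatch: schema has {len(schema_cols)}, "
            f"SELECT has {len(select_cols)}"
        )
    # Check 3a: no alias may appear twice in the SELECT list.
    counts = Counter(name.lower() for name in select_cols)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    if duplicates:
        errors.append("duplicate aliases: " + ", ".join(duplicates))
    # Check 3b: every partition column must be produced by the SELECT list.
    missing = [p for p in partition_cols if p.lower() not in counts]
    if missing:
        errors.append("missing partition columns: " + ", ".join(missing))
    return errors
```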
50
50
|
|
|
51
|
-
### Step 6:
|
|
51
|
+
### Step 6: Generate Companion Files
|
|
52
52
|
|
|
53
|
-
|
|
54
|
-
1.
|
|
55
|
-
2.
|
|
56
|
-
3.
|
|
57
|
-
4.
|
|
53
|
+
Follow the rules in #[[file:sql2dt-refresh-rules.md]]:
|
|
54
|
+
1. Extract all SESSION_CONFIGS variables from the DDL
|
|
55
|
+
2. Generate current-cycle refresh statement
|
|
56
|
+
3. Generate previous-cycle prev_refresh statement
|
|
57
|
+
4. Generate backfill statement
|
|
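Extracting the variables in sub-step 1 can be approximated with a regular expression over the generated DDL. The sketch below assumes the `SESSION_CONFIGS()['key']` call form seen in the converted examples; the helper name `extract_config_variables` is hypothetical, and generating the refresh statements themselves is governed by `sql2dt-refresh-rules.md`, which is not reproduced here.

```python
import re

# Matches SESSION_CONFIGS()['some.key'] and captures the key.
_CONFIG_RE = re.compile(r"SESSION_CONFIGS\(\)\s*\[\s*'([^']+)'\s*\]")


def extract_config_variables(ddl):
    """Return the distinct SESSION_CONFIGS keys in order of first appearance."""
    keys = []
    for key in _CONFIG_RE.findall(ddl):
        if key not in keys:
            keys.append(key)
    return keys
```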
58
58
|
|
|
59
|
-
### Step 7:
|
|
59
|
+
### Step 7: Post-conversion Improvement Suggestions
|
|
60
60
|
|
|
61
|
-
DDL
|
|
61
|
+
After DDL generation is complete, check the conversion result and proactively offer improvement suggestions to the user:
|
|
62
62
|
|
|
63
|
-
|
|
63
|
+
**Check 1: Non-partitioned table + continuous write risk**
|
|
64
64
|
|
|
65
|
-
|
|
66
|
-
-
|
|
67
|
-
-
|
|
65
|
+
Follow the judgment logic in #[[file:../best-practices/non-partitioned-merge-into-warning.md]]:
|
|
66
|
+
- The generated DT is a non-partitioned table (no `PARTITIONED BY` and no `SESSION_CONFIGS()`)
|
|
67
|
+
- And the SQL contains the `ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC) WHERE rn = 1` deduplication pattern
|
|
68
68
|
|
|
69
|
-
→
|
|
69
|
+
→ When conditions are met, use the alert message template from that document to warn the user of the risk, and suggest switching to the MERGE INTO + Table Stream approach.
|
|
70
70
|
|
|
71
|
-
|
|
71
|
+
**Check 2: SQL performance optimization opportunities**
|
|
72
72
|
|
|
73
|
-
|
|
74
|
-
-
|
|
75
|
-
-
|
|
76
|
-
- `GROUP BY`
|
|
73
|
+
Following the rules in #[[file:../best-practices/performance-optimization.md]], scan the generated DT SQL:
|
|
74
|
+
- Contains `LEFT/RIGHT/FULL OUTER JOIN` → suggest switching to INNER JOIN if the business logic allows, to improve incremental efficiency
|
|
75
|
+
- Contains window functions without `PARTITION BY` → suggest adding PARTITION BY; otherwise every incremental refresh will do a full recomputation
|
|
76
|
+
- `GROUP BY` uses complex expressions (e.g., `DATE_TRUNC`, `SUBSTR`) → suggest pre-computing upstream or splitting into multi-level DTs
|
|
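The first two scans can be sketched with regular expressions in Python. This is a rough heuristic rather than a SQL parser: the name `performance_suggestions` is hypothetical, named windows (`OVER w`) and keywords inside comments or string literals are not handled, and the GROUP BY expression check is omitted because it needs real expression parsing.

```python
import re

# LEFT/RIGHT/FULL JOIN, with or without the OUTER keyword.
_OUTER_JOIN_RE = re.compile(r"\b(?:LEFT|RIGHT|FULL)(?:\s+OUTER)?\s+JOIN\b", re.IGNORECASE)
# An inline window "OVER (" that is not immediately followed by PARTITION BY.
_UNPARTITIONED_WINDOW_RE = re.compile(r"\bOVER\s*\((?!\s*PARTITION\s+BY\b)", re.IGNORECASE)


def performance_suggestions(sql):
    """Return suggestion strings for the patterns found in the DT SQL."""
    suggestions = []
    if _OUTER_JOIN_RE.search(sql):
        suggestions.append("outer join: consider INNER JOIN if the business logic allows")
    if _UNPARTITIONED_WINDOW_RE.search(sql):
        suggestions.append("window function without PARTITION BY: consider adding PARTITION BY")
    return suggestions
```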
77
77
|
|
|
78
|
-
|
|
78
|
+
**Check 3: Whether there are dimension tables in JOINs**
|
|
79
79
|
|
|
80
|
-
|
|
81
|
-
- SQL
|
|
82
|
-
-
|
|
80
|
+
Follow the recommended scenarios in #[[file:../best-practices/dimension-table-join-guide.md]]:
|
|
81
|
+
- SQL contains JOIN → ask the user whether the right-side table is a low-frequency-change dimension table (lookup table, dictionary table, config table, etc.)
|
|
82
|
+
- If yes → suggest adding `mv_const_tables` configuration in TBLPROPERTIES, and explain its behavior and data consistency tradeoffs
|
|
83
83
|
|
|
84
|
-
##
|
|
84
|
+
## Output Checklist
|
|
85
85
|
|
|
86
|
-
|
|
86
|
+
For each table, the final output is:
|
|
87
87
|
|
|
88
|
-
|
|
|
88
|
+
| File | Content | Condition |
|
|
89
89
|
|------|------|------|
|
|
90
|
-
|
|
|
91
|
-
|
|
|
92
|
-
|
|
|
93
|
-
|
|
|
90
|
+
| `table_name.sql` | Dynamic Table DDL | Always generated |
|
|
91
|
+
| `table_name_refresh.sql` | Current-cycle REFRESH statement | Always generated |
|
|
92
|
+
| `table_name_prev_refresh.sql` | Previous-cycle REFRESH statement | Only when partition variables exist |
|
|
93
|
+
| `table_name_backfill.sql` | Backfill statement | Only when partition variables exist |
|
|
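The checklist maps directly onto a small helper. The Python sketch below only restates the table; the function name `companion_files` and its boolean parameter are hypothetical.

```python
def companion_files(table_name, has_partition_variables):
    """List the output files for one table according to the checklist."""
    # The DDL and the current-cycle refresh file are always generated.
    files = [f"{table_name}.sql", f"{table_name}_refresh.sql"]
    # The previous-cycle and backfill files exist only with partition variables.
    if has_partition_variables:
        files.append(f"{table_name}_prev_refresh.sql")
        files.append(f"{table_name}_backfill.sql")
    return files
```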
94
94
|
|
|
95
|
-
##
|
|
95
|
+
## Quick Decision Path
|
|
96
96
|
|
|
97
97
|
```
|
|
98
|
-
|
|
98
|
+
Input DDL + INSERT OVERWRITE
|
|
99
99
|
│
|
|
100
|
-
├─
|
|
100
|
+
├─ Has placeholders? → Step 2 placeholder replacement
|
|
101
101
|
│
|
|
102
|
-
├─
|
|
102
|
+
├─ Self-reference? → Step 3 special handling
|
|
103
103
|
│
|
|
104
|
-
├─
|
|
104
|
+
├─ Has static partitions? → Step 4 inject partition values into SELECT
|
|
105
105
|
│
|
|
106
|
-
├─
|
|
106
|
+
├─ Has UNION ALL? → Step 4 inject into each branch independently
|
|
107
107
|
│
|
|
108
|
-
└─
|
|
108
|
+
└─ Generate DDL → Step 5 validate → Step 6 generate companion files → Step 7 improvement suggestions
|
|
109
109
|
```
|