npm - @clickzetta/cz-cli-darwin-x64 - Versions diffs - 0.3.91 → 0.3.93 - Mend

@clickzetta/cz-cli-darwin-x64 0.3.91 → 0.3.93

Files changed (69)

package/bin/skills/clickzetta-sql-migration/references/function-mapping.md ADDED

@@ -0,0 +1,194 @@
+# Function Mapping: Snowflake / Spark / Databricks → ClickZetta
+> Comprehensive mapping table for functions that **differ** between systems, plus a list of unsupported functions and their workarounds.
+> For the full ClickZetta function reference, refer to the official ClickZetta Lakehouse documentation.
+---
+## Conditional Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `IFF(cond, a, b)` | `IF(cond, a, b)` | `IF(cond, a, b)` | ClickZetta does not support `IFF` |
+| `ZEROIFNULL(x)` | — | `COALESCE(x, 0)` or `NVL(x, 0)` | |
+| `NULLIFZERO(x)` | — | `NULLIF(x, 0)` | |
+| `BOOLAND(a, b)` | — | `a AND b` | use boolean operator |
+| `BOOLOR(a, b)` | — | `a OR b` | |
+| `DECODE(...)` | `DECODE(...)` | `DECODE(...)` | ✅ all supported |
+| `NULLIF` / `COALESCE` / `NVL` | same | same | ✅ all supported |
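The wrapper rewrites in this table (for example `ZEROIFNULL(x)` → `COALESCE(x, 0)`) need the whole argument text, which a plain search-and-replace cannot isolate when the argument contains nested calls. A minimal Python sketch of one way to do it, assuming well-formed SQL with no parentheses inside string literals; `find_call_end`, `rewrite_wrapper`, and `migrate_conditionals` are hypothetical helper names, not part of cz-cli:

```python
import re

def find_call_end(sql, open_idx):
    """Return the index of the ')' that closes the '(' at open_idx."""
    depth = 0
    for i in range(open_idx, len(sql)):
        if sql[i] == "(":
            depth += 1
        elif sql[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")

def rewrite_wrapper(sql, name, prefix, suffix):
    """Replace every name(<arg>) with prefix + <arg> + suffix, nested calls included."""
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*\(", re.IGNORECASE)
    out, pos = [], 0
    while True:
        m = pattern.search(sql, pos)
        if m is None:
            out.append(sql[pos:])
            return "".join(out)
        open_idx = m.end() - 1
        close_idx = find_call_end(sql, open_idx)
        # Rewrite the argument first so same-name nested calls are handled too.
        arg = rewrite_wrapper(sql[open_idx + 1:close_idx], name, prefix, suffix)
        out.append(sql[pos:m.start()] + prefix + arg + suffix)
        pos = close_idx + 1

def migrate_conditionals(sql):
    """Apply the three single-call rewrites listed in the table."""
    sql = rewrite_wrapper(sql, "IFF", "IF(", ")")
    sql = rewrite_wrapper(sql, "ZEROIFNULL", "COALESCE(", ", 0)")
    sql = rewrite_wrapper(sql, "NULLIFZERO", "NULLIF(", ", 0)")
    return sql
```

The `\b` word boundary keeps `IFF` from matching inside names such as `DATEDIFF`. A production migration would more likely use a real SQL parser; this sketch only illustrates the shape of the rewrite.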
+---
+## Date / Time Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `DATEADD(day, n, dt)` | `DATE_ADD(dt, n)` | `DATEADD(day, n, dt)` ✅ or `DATE_ADD(dt, n)` ✅ | both syntaxes work |
+| `DATEDIFF(day, start, end)` | `DATEDIFF(end, start)` | `DATEDIFF(day, start, end)` ✅ or `DATEDIFF(end, start)` ✅ | both supported, but **2-arg form has reversed order from Snowflake** |
+| `DATE_TRUNC('month', dt)` | `DATE_TRUNC('month', dt)` | same | ✅ identical |
+| `TO_DATE(s)` / `TO_TIMESTAMP(s)` | same | same | ✅ identical |
+| `CONVERT_TIMEZONE(tz, dt)` | `from_utc_timestamp(dt, tz)` | `FROM_UTC_TIMESTAMP(dt, tz)` / `TO_UTC_TIMESTAMP(dt, tz)` | |
+| `SYSDATE()` / `GETDATE()` | `current_timestamp()` | `CURRENT_TIMESTAMP()` or `NOW()` | both supported |
+| `TIMESTAMPADD(unit, n, dt)` | — | `dt + INTERVAL n unit` | |
+| `LAST_DAY(dt)` | `last_day(dt)` | `LAST_DAY(dt)` | ✅ identical |
+| `MONTHS_BETWEEN(d1, d2)` | `months_between(d1, d2)` | `MONTHS_BETWEEN(d1, d2)` | ✅ identical |
+| `YEAR(dt)` / `MONTH(dt)` / `DAY(dt)` | same | same | ✅ identical |
+| `DATE_PART('year', dt)` | `date_part('year', dt)` | ❌ not supported | use `EXTRACT(YEAR FROM dt)` or `YEAR(dt)` |
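+| `MAKEDATE(year, dayofyear)` | — | ❌ not supported | no day-of-year constructor; use `DATE_ADD(MAKE_DATE(year, 1, 1), dayofyear - 1)`, or `MAKE_DATE(year, month, day)` when month and day are known |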
+| `CONVERT_TZ(dt, from, to)` | — | ❌ not supported | use `FROM_UTC_TIMESTAMP` / `TO_UTC_TIMESTAMP` |
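The `DATEDIFF` row is the easiest one in this table to get wrong: dropping the unit from the three-argument form without also swapping the operands flips the sign of the result. A small Python model of the two argument orders as the table describes them (the function names are illustrative and only the `day` unit is modelled):

```python
from datetime import date

def datediff_3arg(unit, start, end):
    """Models DATEDIFF(day, start, end): start first, end last."""
    if unit != "day":
        raise ValueError("this sketch only models the 'day' unit")
    return (end - start).days

def datediff_2arg(end, start):
    """Models DATEDIFF(end, start): end first, start last."""
    return (end - start).days

start, end = date(2024, 1, 1), date(2024, 1, 31)
correct = datediff_2arg(end, start)   # operands swapped when the unit is dropped
flipped = datediff_2arg(start, end)   # unit dropped but order kept: wrong sign
```

Keeping the three-argument form unchanged avoids the issue entirely, since the table lists it as supported.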
+---
+## String Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `CHARINDEX(sub, s)` | `instr(s, sub)` | `INSTR(s, sub)` | ⚠️ **parameter order is reversed from Snowflake** |
+| `EDITDISTANCE(s1, s2)` | `levenshtein(s1, s2)` | ❌ `LEVENSHTEIN` not supported | use Python UDF / ZettaPark |
+| `SOUNDEX(s)` | `soundex(s)` | ❌ not supported | no alternative |
+| `STRTOK(s, delim, n)` | `split(s, delim)[n-1]` | `SPLIT_PART(s, delim, n)` | |
+| `ILIKE` | `ilike` | `ILIKE` | ✅ all supported |
+| `RLIKE` / `REGEXP_LIKE` | `rlike` | `RLIKE` / `REGEXP_LIKE` | ✅ all supported |
+| `CONTAINS(s, sub)` | `contains(s, sub)` | `INSTR(s, sub) > 0` | |
+| `STARTSWITH(s, p)` | `startswith(s, p)` | `STARTSWITH(s, p)` ✅ or `s LIKE 'p%'` | both supported |
+| `ENDSWITH(s, p)` | `endswith(s, p)` | `ENDSWITH(s, p)` ✅ or `s LIKE '%p'` | both supported |
+| `INITCAP(s)` | `initcap(s)` | `INITCAP(s)` | ✅ identical |
+| `REGEXP_SUBSTR(s, p)` | `regexp_extract(s, p, 0)` | ❌ `REGEXP_SUBSTR` not supported | use `REGEXP_EXTRACT(s, '(p)')` |
+| `OVERLAY(s PLACING new FROM pos)` | `overlay(...)` | ❌ not supported | use `CONCAT(LEFT(s, pos-1), new, SUBSTR(s, pos+len))`, where `len` is the `FOR` length (defaults to `LENGTH(new)`) |
+| `FORMAT(num, decimals)` | — | ❌ thousand-separator format not supported | use `ROUND` + `CAST` |
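For the reversed-parameter rows (`CHARINDEX` → `INSTR` here, and `ARRAY_CONTAINS` in the Array / Object table), the arguments have to be split on top-level commas only, because an argument may itself contain commas inside quotes or nested calls. A hedged Python sketch; `split_args` and `swap_first_two` are hypothetical helpers and handle only the two-argument form:

```python
def split_args(arg_text):
    """Split a call's argument text on top-level commas (quotes and parens respected)."""
    args, depth, quote, start = [], 0, None, 0
    for i, ch in enumerate(arg_text):
        if quote:
            if ch == quote:          # closing quote ('' escapes close and reopen)
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(arg_text[start:i].strip())
            start = i + 1
    args.append(arg_text[start:].strip())
    return args

def swap_first_two(new_name, arg_text):
    """Emit new_name(second, first); raises ValueError unless there are exactly two args."""
    first, second = split_args(arg_text)
    return f"{new_name}({second}, {first})"
```

For example, `swap_first_two("INSTR", "',', CONCAT(a, b)")` yields `INSTR(CONCAT(a, b), ',')`. Calls with an optional third argument, such as a start position, are deliberately rejected rather than guessed at.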
+---
+## Aggregate Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `LISTAGG(col, ',') WITHIN GROUP (ORDER BY col)` | `concat_ws(',', collect_list(col))` | `GROUP_CONCAT(col ORDER BY col SEPARATOR ',')` | |
+| `ARRAY_AGG(col) WITHIN GROUP (ORDER BY col)` | `array_agg(col)` (no ordering) | `ARRAY_AGG(col)` | ⚠️ `WITHIN GROUP` not supported |
+| `OBJECT_AGG(key, value)` | `map_from_entries(...)` | `MAP_AGG(key, value)` | |
+| `APPROX_COUNT_DISTINCT(col)` | `approx_count_distinct(col)` | `APPROX_COUNT_DISTINCT(col)` | ✅ identical |
+| `MEDIAN(col)` | — | `MEDIAN(col)` | ✅ identical |
+| `BITAND_AGG / BITOR_AGG / BITXOR_AGG` | — | `BIT_AND / BIT_OR / BIT_XOR` | |
+| `REGR_SLOPE / REGR_INTERCEPT` | — | ❌ not supported | compute manually: slope = `CORR(y, x) * STDDEV(y) / STDDEV(x)`, intercept = `AVG(y) - slope * AVG(x)` |
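The `CORR` + `STDDEV` workaround for `REGR_SLOPE` / `REGR_INTERCEPT` rests on the identity slope = r · s_y / s_x and intercept = mean(y) − slope · mean(x). The Python sketch below checks that identity on a small made-up data set; it models the arithmetic only and makes no claim about ClickZetta's internal implementation:

```python
import math
import statistics

def corr(xs, ys):
    """Pearson correlation coefficient, the quantity CORR(y, x) returns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def regr_slope(ys, xs):
    """Slope of the least-squares line y = slope * x + intercept."""
    return corr(xs, ys) * statistics.stdev(ys) / statistics.stdev(xs)

def regr_intercept(ys, xs):
    """Intercept of the same line."""
    return statistics.fmean(ys) - regr_slope(ys, xs) * statistics.fmean(xs)

xs = [1, 2, 3, 4, 5]      # illustrative data, not from the source
ys = [2, 4, 5, 4, 5]
```

Because only the ratio of the two standard deviations matters, sample and population standard deviation give the same slope as long as the same variant is used for both columns.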
+---
+## Array / Object Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `ARRAY_CONSTRUCT(...)` | `array(...)` | `ARRAY(...)` | |
+| `OBJECT_CONSTRUCT('k', v, ...)` | `named_struct('k', v, ...)` or `map(...)` | `named_struct('k', v, ...)` ✅ or `MAP(...)` | |
+| `ARRAY_SIZE(arr)` | `size(arr)` | `SIZE(arr)` ✅ or `ARRAY_SIZE(arr)` ✅ | both supported |
+| `ARRAY_CONTAINS(val, arr)` | `array_contains(arr, val)` | `ARRAY_CONTAINS(arr, val)` | ⚠️ **Snowflake parameter order reversed** |
+| `OBJECT_KEYS(obj)` | `map_keys(map)` | `MAP_KEYS(map)` | |
+| `FLATTEN(arr)` | `flatten(arr)` | `FLATTEN(arr)` | ✅ for array of arrays |
+| `LATERAL FLATTEN(input => arr)` | `LATERAL VIEW EXPLODE(arr)` | `LATERAL VIEW EXPLODE(arr)` | ⚠️ Snowflake → Hive-style syntax change |
+| `STRUCT(1 AS id, 'a' AS name)` (Spark) | same | `named_struct('id', 1, 'name', 'a')` | ⚠️ ClickZetta `STRUCT` does not accept `AS` for named fields |
+| `TO_ARRAY(expr)` | — | ❌ not supported | use `ARRAY(expr)` or `CAST(... AS ARRAY<T>)` |
+| `MAP_FROM_ZIP(keys, values)` | — | ❌ not supported | use `MAP_FROM_ARRAYS(keys, values)` |
+ClickZetta supports higher-order functions (Spark style) which Snowflake does not:
+```sql
+SELECT TRANSFORM(skills, x -> UPPER(x)) FROM emp;
+SELECT FILTER(scores, x -> x > 90) FROM students;
+SELECT EXISTS(scores, x -> x > 100) FROM students;
+SELECT FORALL(scores, x -> x >= 0) FROM students;
+SELECT ZIP_WITH(a, b, (x, y) -> x + y) FROM t;
+```
+`AGGREGATE` / `REDUCE` (Spark names) are not supported — use `ARRAY_AGG` + aggregate functions instead.
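For readers more familiar with general-purpose languages, the higher-order functions above correspond roughly to the following Python constructs. This is a semantic model under simplifying assumptions: SQL NULL handling is ignored, and Python's `zip` truncates to the shorter input, which may not match how `ZIP_WITH` treats arrays of unequal length. The sample data is made up.

```python
from functools import reduce

skills = ["sql", "python"]
scores = [95, 88, 72]

transformed = [x.upper() for x in skills]            # TRANSFORM(skills, x -> UPPER(x))
filtered = [x for x in scores if x > 90]             # FILTER(scores, x -> x > 90)
any_over_100 = any(x > 100 for x in scores)          # EXISTS(scores, x -> x > 100)
all_non_negative = all(x >= 0 for x in scores)       # FORALL(scores, x -> x >= 0)
zipped = [x + y for x, y in zip([1, 2], [10, 20])]   # ZIP_WITH(a, b, (x, y) -> x + y)

# What Spark's AGGREGATE / REDUCE would compute: a left fold over the array.
total = reduce(lambda acc, x: acc + x, scores, 0)
```

The last line shows the fold that the unsupported `AGGREGATE` / `REDUCE` would express; on ClickZetta the same result has to come from ordinary aggregate functions.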
+---
+## JSON / Semi-structured Access
+```sql
+-- Snowflake (colon syntax + double-colon cast)
+SELECT data:address:city AS city FROM users;
+SELECT data:age::INT AS age FROM users;
+SELECT data:phoneNumbers[0]:number FROM users;
+-- ClickZetta (bracket syntax)
+SELECT data['address']['city'] AS city FROM users;
+SELECT CAST(data['age'] AS INT) AS age FROM users;
+SELECT data['phoneNumbers'][0]['number'] FROM users;
+-- ClickZetta also accepts :: cast operator
+SELECT data['amount']::DOUBLE AS amount FROM orders;
+```
+| Snowflake | ClickZetta |
+|---|---|
+| `data:key` | `data['key']` |
+| `data[0]:key` | `data[0]['key']` |
+| `data:key::TYPE` | `CAST(data['key'] AS TYPE)` or `data['key']::TYPE` |
+| `PARSE_JSON(s)` | `PARSE_JSON(s)` ✅ identical |
+| `TO_VARIANT(x)` | `PARSE_JSON(TO_JSON(x))` |
+| `TO_JSON(x)` | `TO_JSON(x)` ✅ identical |
+| `IS_NULL_VALUE(data:key)` | `data['key'] IS NULL` |
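The colon-to-bracket mapping is mechanical enough to sketch with a regular expression. The Python snippet below is a minimal illustration that operates on a single path expression, not on a whole statement (a colon followed by a word inside a string literal would be rewritten too); `colon_to_bracket` is a hypothetical helper name:

```python
import re

# A single ':' followed by an identifier. The lookarounds skip the '::' cast operator.
_SEGMENT = re.compile(r"(?<!:):(?!:)([A-Za-z_][A-Za-z0-9_]*)")

def colon_to_bracket(expr):
    """Rewrite Snowflake colon path segments as bracket lookups."""
    return _SEGMENT.sub(lambda m: f"['{m.group(1)}']", expr)
```

For example, `colon_to_bracket("data:phoneNumbers[0]:number")` returns `data['phoneNumbers'][0]['number']`, and a trailing `::INT` cast is left untouched because the table lists the `::` operator as accepted.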
+---
+## System / Context Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `CURRENT_DATABASE()` | `current_database()` | `CURRENT_WORKSPACE()` | concept rename |
+| `CURRENT_WAREHOUSE()` | — | `CURRENT_VCLUSTER()` | concept rename |
+| `CURRENT_ROLE()` | `current_user()` | `CURRENT_USER()` | no role function |
+| `CURRENT_SCHEMA()` | `current_database()` | `CURRENT_SCHEMA()` | ✅ |
+| — | — | `CURRENT_INSTANCE_ID()` | ClickZetta-specific |
+---
+## Type Conversion Functions
+| Snowflake | Spark / Databricks | ClickZetta | Notes |
+|---|---|---|---|
+| `TRY_TO_NUMBER(s)` / `TRY_TO_DATE(s)` | `try_cast(s AS ...)` | `TRY_CAST(s AS ...)` | |
+| `TO_VARIANT(x)` | — | `PARSE_JSON(TO_JSON(x))` | |
+| `CAST(...)` / `::TYPE` | `CAST(...)` / `::TYPE` | `CAST(...)` / `::TYPE` | ✅ all supported |
+---
+## Functions with No Direct ClickZetta Equivalent
+| Function | Source | Workaround |
+|---|---|---|
+| `SOUNDEX(s)` | Snowflake | None |
+| `EDITDISTANCE` / `LEVENSHTEIN` | Snowflake / Spark | Python UDF |
+| `JSON_ARRAY_LENGTH` | various | `SIZE(CAST(json AS ARRAY<STRING>))` |
+| `JSON_OBJECT_KEYS` | various | manually parse |
+| `REGEXP_SUBSTR` | Snowflake | `REGEXP_EXTRACT(s, '(p)')` |
+| `GENERATE_SERIES(s, e)` / `RANGE(n)` | various | `EXPLODE(SEQUENCE(s, e))` |
+| `TABLESAMPLE (n PERCENT)` | various | `ORDER BY RAND() LIMIT k`, where `k` is a row count (not a percentage) that you derive from the table size |
+| `ST_*` geospatial functions | various | None — geospatial not supported |
+| `TO_IPV4` / IP address functions | various | None |
+| `HLL_APPROX` | various | `APPROX_COUNT_DISTINCT(col)` |
+| `BITAND(a, b)` / `BITOR(a, b)` / `BITXOR(a, b)` | various | bitwise operators `&` / `\|` / `^` |
+| `INITCAP(s)` (in versions that miss it) | — | `CONCAT(UPPER(SUBSTR(s,1,1)), LOWER(SUBSTR(s,2)))` |
+| `SQUARE(x)` | Snowflake | `POWER(x, 2)` |
+| `HAVERSINE(...)` | Snowflake | None |
+| `WIDTH_BUCKET(...)` | Snowflake | None |
+| `FACTORIAL(n)` | various | `EXP(SUM(LN(i)))` over `i` in `SEQUENCE(1, n)`; the result is floating-point, so `ROUND` it |
+| `BIN(x)` | various | `CONV(x, 10, 2)` |
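The `FACTORIAL` workaround relies on n! = exp(ln 1 + ln 2 + … + ln n), and the `BIN` workaround on plain base conversion. A short Python check of both identities; it models the arithmetic only, and the function names are illustrative:

```python
import math

def factorial_via_logs(n):
    """Models ROUND(EXP(SUM(LN(i)))) for i in 1..n; the sum is empty for n = 0."""
    return round(math.exp(sum(math.log(i) for i in range(1, n + 1))))

def bin_via_conv(x):
    """Models CONV(x, 10, 2) for a non-negative integer x."""
    return format(x, "b")
```

The rounding step matters: the exponential of a floating-point sum is only approximately an integer, and the approximation degrades as `n` grows, so the trick is best kept to small inputs.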
+---
+## Vector Functions (ClickZetta-Specific)
+ClickZetta has native vector functions for similarity search, which Snowflake/Spark do not provide:
+```sql
+L2_DISTANCE(v1, v2)             -- Euclidean distance
+COSINE_DISTANCE(v1, v2)         -- Cosine distance
+DOT_PRODUCT(v1, v2)             -- Dot product
+HAMMING_DISTANCE(v1, v2)        -- Hamming distance (binary)
+JACCARD_DISTANCE(v1, v2)        -- Jaccard distance
+BINARY_QUANTIZE(v)              -- float vector → binary
+VECTOR(v1, v2, ...)             -- construct vector
+```
+If migrating from Snowflake Cortex Search or Databricks Vector Search, redesign around these primitives + the `VECTOR INDEX` (see ClickZetta Lakehouse documentation).
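When validating migrated similarity queries, it can help to have a reference implementation to compare results against. The Python sketch below uses the conventional textbook definitions. Whether ClickZetta follows exactly these conventions is an assumption to verify against the Lakehouse documentation, in particular cosine distance as 1 minus cosine similarity and a "greater than zero" threshold for binary quantization.

```python
import math

def l2_distance(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Assumed convention: 1 - cosine similarity."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / norm

def binary_quantize(v):
    """Assumed convention: 1 for positive components, 0 otherwise."""
    return [1 if x > 0 else 0 for x in v]

def hamming_distance(a, b):
    """Number of positions where two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the set bits of two binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0
```

Comparing a handful of rows against such a reference is a cheap way to catch a distance-versus-similarity mix-up after migration.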

package/bin/skills/clickzetta-sql-migration/references/functions-reference.md ADDED

@@ -0,0 +1,372 @@
+# Functions Complete Reference
+> With Snowflake / Spark SQL difference annotations
+---
+## Numeric Functions
+```sql
+ABS(x)                          -- absolute value
+CEIL(x) / CEILING(x)            -- round up
+FLOOR(x)                        -- round down
+ROUND(x, d)                     -- round to d decimal places
+TRUNCATE(x, d)                  -- truncate to d decimal places
+MOD(x, y) / x % y               -- modulo
+POWER(x, y) / POW(x, y)         -- exponentiation
+SQRT(x)                         -- square root
+EXP(x)                          -- e^x
+LN(x) / LOG(x)                  -- natural logarithm
+LOG(base, x)                    -- logarithm with specified base
+LOG2(x) / LOG10(x)              -- base-2/base-10 logarithm
+SIGN(x)                         -- sign (-1/0/1)
+GREATEST(a, b, c, ...)          -- maximum value
+LEAST(a, b, c, ...)             -- minimum value
+RANDOM() / RAND()               -- random number 0-1
+PI()                            -- π
+SIN(x) / COS(x) / TAN(x)       -- trigonometric functions
+ASIN(x) / ACOS(x) / ATAN(x)    -- inverse trigonometric functions
+ATAN2(y, x)                     -- arctangent
+DEGREES(x) / RADIANS(x)        -- degree/radian conversion
+-- ⚠️ FACTORIAL not supported, use ROUND(EXP(SUM(LN(i)))) over i in SEQUENCE(1, n) instead
+-- ⚠️ BIN(x) not supported, use CONV(x, 10, 2) instead
+HEX(x)                          -- convert to hexadecimal string
+UNHEX(s)                        -- hexadecimal to string
+CONV(x, from_base, to_base)     -- base conversion (e.g., CONV(10,10,2) gives '1010')
+```
+**Differences from Snowflake:**
+- Snowflake `SQUARE(x)` → ClickZetta `POWER(x, 2)`
+- Snowflake `HAVERSINE(lat1, lon1, lat2, lon2)` → ClickZetta not supported
+- Snowflake `WIDTH_BUCKET` → ClickZetta not supported
+---
+## String Functions
+```sql
+-- Basic operations
+LENGTH(s) / CHAR_LENGTH(s)      -- character length
+OCTET_LENGTH(s)                 -- byte length
+UPPER(s) / LOWER(s)             -- case conversion
+INITCAP(s)                      -- capitalize first letter of each word
+TRIM(s) / LTRIM(s) / RTRIM(s)  -- trim whitespace
+TRIM(BOTH 'x' FROM s)           -- trim specified character
+LPAD(s, n, pad) / RPAD(s, n, pad)  -- padding
+REPEAT(s, n)                    -- repeat
+REVERSE(s)                      -- reverse
+SPACE(n)                        -- n spaces
+-- Concatenation
+CONCAT(s1, s2, ...)             -- concatenate (NULL propagates)
+CONCAT_WS(sep, s1, s2, ...)     -- concatenate with separator (skips NULL)
+s1 || s2                        -- concatenation operator
+-- Substring
+SUBSTR(s, pos) / SUBSTRING(s, pos)
+SUBSTR(s, pos, len) / SUBSTRING(s, pos, len)
+LEFT(s, n) / RIGHT(s, n)
+MID(s, pos, len)                -- same as SUBSTR
+-- Search
+INSTR(s, substr)                -- find position (1-based, 0 means not found)
+LOCATE(substr, s)               -- same as INSTR, different parameter order
+LOCATE(substr, s, pos)          -- search from pos
+POSITION(substr IN s)           -- ✅ supported, returns substring position (1-based)
+FIND_IN_SET(s, list)            -- find in comma-separated list
+-- Replace
+REPLACE(s, old, new)            -- replace all occurrences
+TRANSLATE(s, from_chars, to_chars)  -- character-level replacement
+-- ⚠️ OVERLAY syntax not supported, use CONCAT(LEFT(s,pos-1), new, SUBSTR(s,pos+len)) instead (len = FOR length, default LENGTH(new))
+-- Regex
+REGEXP_EXTRACT(s, pattern, group)   -- extract matching group
+REGEXP_EXTRACT_ALL(s, pattern)      -- extract all matches
+REGEXP_REPLACE(s, pattern, repl)    -- regex replace
+REGEXP_LIKE(s, pattern)             -- regex match (returns boolean)
+RLIKE(s, pattern)                   -- same as REGEXP_LIKE
+s RLIKE pattern                     -- operator form
+REGEXP_COUNT(s, pattern)            -- match count
+-- ⚠️ REGEXP_SUBSTR not supported, use REGEXP_EXTRACT(s, '(pattern)') instead
+-- Split
+SPLIT(s, delimiter)             -- split by delimiter, returns ARRAY
+SPLIT_PART(s, delimiter, n)     -- get nth split part (1-based)
+-- Formatting
+FORMAT_STRING(fmt, args...)     -- printf style (e.g., FORMAT_STRING('%d items', 5) → '5 items')
+-- ⚠️ FORMAT(number, decimals) thousand-separator formatting not supported, use ROUND + CAST instead
+-- Encoding
+BASE64(s) / UNBASE64(s)         -- Base64 encode/decode
+MD5(s)                          -- MD5 hash
+SHA1(s) / SHA2(s, bits)         -- SHA hash
+CRC32(s)                        -- CRC32
+ENCODE(s, charset) / DECODE(s, charset)  -- charset encode/decode
+-- Other
+ASCII(s)                        -- ASCII code of first character
+CHAR(n)                         -- ASCII code to character
+-- ⚠️ SOUNDEX not supported
+-- ⚠️ LEVENSHTEIN not supported, use Python UDF or ZettaPark instead
+HAMMING_DISTANCE(s1, s2)        -- Hamming distance (strings)
+```
+**Differences from Snowflake:**
+- Snowflake `CHARINDEX(substr, s)` → ClickZetta `INSTR(s, substr)` or `LOCATE(substr, s)` (different parameter order!)
+- Snowflake `EDITDISTANCE(s1, s2)` → ClickZetta does not support LEVENSHTEIN, use Python UDF
+- Snowflake `STRTOK(s, delim, n)` → ClickZetta `SPLIT_PART(s, delim, n)`
+- Snowflake `ILIKE(s, pattern)` → ClickZetta `ILIKE` ✅ also supported!
+- Snowflake `CONTAINS(s, substr)` → ClickZetta `INSTR(s, substr) > 0`
+- Snowflake `STARTSWITH(s, prefix)` → ClickZetta `s LIKE 'prefix%'` or `STARTSWITH(s, prefix)`
+- Snowflake `ENDSWITH(s, suffix)` → ClickZetta `s LIKE '%suffix'` or `ENDSWITH(s, suffix)`
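The `OVERLAY` replacement `CONCAT(LEFT(s,pos-1), new, SUBSTR(s,pos+len))` depends on 1-based positions and on `len` defaulting to the length of the replacement string. A Python model of that expression, with 0-based slicing adjusted accordingly, makes the index arithmetic easy to check (the function name is illustrative):

```python
def overlay(s, new, pos, length=None):
    """Models CONCAT(LEFT(s, pos-1), new, SUBSTR(s, pos+len)) with a 1-based pos."""
    if length is None:
        length = len(new)            # OVERLAY without FOR replaces LENGTH(new) characters
    left = s[:pos - 1]               # LEFT(s, pos-1)
    right = s[pos - 1 + length:]     # SUBSTR(s, pos+len), converted to a 0-based slice
    return left + new + right
```

With `length=0` the expression becomes a pure insertion, which corresponds to `OVERLAY(... FOR 0)` in dialects that support it.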
+---
+## Date/Time Functions
+```sql
+-- Get current time
+CURRENT_DATE()                  -- current date
+CURRENT_TIMESTAMP() / NOW()     -- current timestamp (with timezone)
+CURRENT_TIME()                  -- current time
+LOCALTIMESTAMP()                -- local timestamp
+-- Extract parts
+YEAR(dt) / MONTH(dt) / DAY(dt)
+HOUR(dt) / MINUTE(dt) / SECOND(dt)
+DAYOFWEEK(dt)                   -- 1=Sunday, 7=Saturday
+DAYOFMONTH(dt)                  -- same as DAY
+DAYOFYEAR(dt)                   -- day of year
+WEEKOFYEAR(dt)                  -- week of year
+QUARTER(dt)                     -- quarter (1-4)
+EXTRACT(YEAR FROM dt)           -- standard SQL extraction
+-- ⚠️ DATE_PART('year', dt) not supported, use EXTRACT or YEAR(dt) instead
+-- Date arithmetic
+DATE_ADD(dt, n)                 -- add n days
+DATE_SUB(dt, n)                 -- subtract n days
+dt + INTERVAL n DAY             -- add n days (standard SQL)
+dt - INTERVAL n DAY             -- subtract n days
+dt + INTERVAL '1-2' YEAR TO MONTH  -- add 1 year 2 months
+ADDDATE(dt, n)                  -- same as DATE_ADD
+SUBDATE(dt, n)                  -- same as DATE_SUB
+ADD_MONTHS(dt, n)               -- add n months
+MONTHS_BETWEEN(dt1, dt2)        -- month difference
+-- Date difference
+DATEDIFF(end_dt, start_dt)      -- two-parameter form: returns day difference (end first)
+DATEDIFF(unit, start_dt, end_dt) -- three-parameter form: specify unit (day/hour/month etc.), Snowflake-compatible
+TIMESTAMPDIFF(unit, dt1, dt2)   -- difference in specified unit
+-- Truncation
+DATE_TRUNC('year', dt)          -- truncate to year
+DATE_TRUNC('month', dt)         -- truncate to month
+DATE_TRUNC('day', dt)           -- truncate to day
+DATE_TRUNC('hour', dt)          -- truncate to hour
+DATE_TRUNC('week', dt)          -- truncate to week (Monday)
+TRUNC(dt, 'MM')                 -- Oracle-style truncation
+-- Formatting
+DATE_FORMAT(dt, 'yyyy-MM-dd')   -- format to string
+DATE_FORMAT(dt, 'yyyy-MM-dd HH:mm:ss')
+TO_CHAR(dt, 'YYYY-MM-DD')       -- same as DATE_FORMAT
+-- Conversion
+TO_DATE('2024-01-01')           -- string to date
+TO_DATE('2024-01-01', 'yyyy-MM-dd')
+TO_TIMESTAMP('2024-01-01 12:00:00')
+TO_TIMESTAMP('2024-01-01', 'yyyy-MM-dd')
+CAST('2024-01-01' AS DATE)
+CAST('2024-01-01 12:00:00' AS TIMESTAMP)
+FROM_UNIXTIME(unix_ts)          -- Unix timestamp to timestamp
+FROM_UNIXTIME(unix_ts, fmt)     -- to formatted string
+UNIX_TIMESTAMP()                -- current Unix timestamp
+UNIX_TIMESTAMP(dt)              -- date to Unix timestamp
+UNIX_TIMESTAMP(s, fmt)          -- string to Unix timestamp
+-- Other
+LAST_DAY(dt)                    -- last day of month
+NEXT_DAY(dt, 'Monday')          -- next specified day of week
+MAKE_DATE(year, month, day)     -- construct date (note: MAKE_DATE not MAKEDATE)
+TIMESTAMPDIFF(unit, dt1, dt2)   -- difference in specified unit (e.g., TIMESTAMPDIFF(MONTH, ...))
+FROM_UTC_TIMESTAMP(ts, tz)      -- UTC to specified timezone
+TO_UTC_TIMESTAMP(ts, tz)        -- specified timezone to UTC
+-- ⚠️ CONVERT_TZ(dt, from_tz, to_tz) not supported, use FROM_UTC_TIMESTAMP/TO_UTC_TIMESTAMP instead
+-- ⚠️ MAKEDATE(year, dayofyear) not supported, use DATE_ADD(MAKE_DATE(year, 1, 1), dayofyear - 1) instead
+-- ⚠️ MAKETIME / PERIOD_ADD / PERIOD_DIFF not supported
+```
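Because `MAKE_DATE` takes a month and a day rather than a day of year, a `MAKEDATE(year, dayofyear)` call cannot be ported by renaming alone. One way to express the same value is "January 1 plus `dayofyear - 1` days"; the Python model below checks that arithmetic, including the leap-year case (the function name is illustrative):

```python
from datetime import date, timedelta

def makedate(year, day_of_year):
    """Models DATE_ADD(MAKE_DATE(year, 1, 1), day_of_year - 1)."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)
```

Day 60 lands on February 29 in 2024 but on March 1 in 2023, which is exactly the behaviour a fixed month/day lookup table would get wrong.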
+**Differences from Snowflake:**
+- Snowflake `DATEADD(day, n, dt)` → ClickZetta `DATEADD(day, n, dt)` ✅ also supported; or use `DATE_ADD(dt, n)` / `dt + INTERVAL n DAY`
+- Snowflake `DATEDIFF(day, start, end)` → ClickZetta `DATEDIFF(day, start, end)` ✅ three-parameter form also supported; or use `DATEDIFF(end, start)` two-parameter form (returns days)
+- Snowflake `DATE_TRUNC('day', dt)` → ClickZetta same
+- Snowflake `TO_DATE(s)` → ClickZetta same
+- Snowflake `CONVERT_TIMEZONE(from, to, ts)` → ClickZetta `FROM_UTC_TIMESTAMP(TO_UTC_TIMESTAMP(ts, from), to)`
+- Snowflake `CONVERT_TIMEZONE(tz, dt)` → ClickZetta `FROM_UTC_TIMESTAMP(dt, tz)` (`CONVERT_TZ` is not supported)
+- Snowflake `SYSDATE()` / `GETDATE()` → ClickZetta `CURRENT_TIMESTAMP()` / `NOW()`
+- Snowflake `TIMESTAMPADD(unit, n, dt)` → ClickZetta `dt + INTERVAL n unit`
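The three-argument `CONVERT_TIMEZONE(from, to, ts)` has no single-function counterpart in this list, but it can be modelled as a round trip through UTC using the two functions that are available. The Python sketch below uses fixed UTC offsets as stand-ins for named time zones, so daylight-saving transitions are deliberately out of scope; all names are illustrative:

```python
from datetime import datetime, timedelta

def to_utc_timestamp(ts, utc_offset):
    """Models TO_UTC_TIMESTAMP(ts, tz): treat ts as local time in tz, return UTC."""
    return ts - utc_offset

def from_utc_timestamp(ts, utc_offset):
    """Models FROM_UTC_TIMESTAMP(ts, tz): treat ts as UTC, return local time in tz."""
    return ts + utc_offset

def convert_timezone(src_offset, dst_offset, ts):
    """Models CONVERT_TIMEZONE(from, to, ts) as source -> UTC -> target."""
    return from_utc_timestamp(to_utc_timestamp(ts, src_offset), dst_offset)

utc_plus_8 = timedelta(hours=8)     # fixed offset standing in for a UTC+8 zone
utc_minus_5 = timedelta(hours=-5)   # fixed offset standing in for a UTC-5 zone
noon = datetime(2024, 1, 15, 12, 0, 0)
converted = convert_timezone(utc_plus_8, utc_minus_5, noon)
```

Noon at UTC+8 is 04:00 UTC, which is 23:00 on the previous day at UTC-5.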
+**Differences from Spark SQL:**
+- Most functions are the same; ClickZetta is compatible with Spark date functions, except that `DATE_PART` is not supported (use `EXTRACT(YEAR FROM dt)` or `YEAR(dt)`)
+---
+## Conditional Functions
+```sql
+-- IF
+IF(condition, true_val, false_val)
+-- CASE WHEN
+CASE WHEN cond1 THEN val1
+     WHEN cond2 THEN val2
+     ELSE default_val
+END
+-- Simple CASE
+CASE status
+    WHEN 'A' THEN 'Active'
+    WHEN 'I' THEN 'Inactive'
+    ELSE 'Unknown'
+END
+-- NULL handling
+COALESCE(a, b, c)               -- first non-NULL value
+NVL(a, b)                       -- return b if a is NULL (same as IFNULL)
+IFNULL(a, b)                    -- same as NVL
+NULLIF(a, b)                    -- return NULL if a=b, otherwise return a
+NVL2(a, b, c)                   -- return b if a is not NULL, otherwise c
+ISNULL(a)                       -- is NULL (returns boolean)
+ISNOTNULL(a)                    -- is not NULL
+-- DECODE (Oracle/Hive style)
+DECODE(expr, val1, res1, val2, res2, ..., default)
+-- Type checking
+TYPEOF(expr)                    -- returns type name as string
+```
+**Differences from Snowflake:**
+- Snowflake `IFF(cond, a, b)` → ClickZetta `IF(cond, a, b)`
+- Snowflake `ZEROIFNULL(x)` → ClickZetta `COALESCE(x, 0)` or `NVL(x, 0)`
+- Snowflake `NULLIFZERO(x)` → ClickZetta `NULLIF(x, 0)`
+- Snowflake `BOOLAND(a, b)` / `BOOLOR(a, b)` → ClickZetta `a AND b` / `a OR b`
+---
+## Aggregate Functions
+```sql
+-- Basic aggregation
+COUNT(*) / COUNT(col) / COUNT(DISTINCT col)
+SUM(col) / AVG(col) / MAX(col) / MIN(col)
+STDDEV(col) / STDDEV_POP(col) / STDDEV_SAMP(col)
+VARIANCE(col) / VAR_POP(col) / VAR_SAMP(col)
+-- Boolean aggregation
+BOOL_OR(cond)                   -- any one is true
+BOOL_AND(cond)                  -- all are true
+EVERY(cond)                     -- same as BOOL_AND
+-- String aggregation
+GROUP_CONCAT(col ORDER BY col SEPARATOR ',')   -- replaces Snowflake LISTAGG
+GROUP_CONCAT(DISTINCT col SEPARATOR ',')
+-- Array aggregation
+ARRAY_AGG(col)                  -- collect into array (includes NULL)
+COLLECT_LIST(col)               -- same as ARRAY_AGG
+COLLECT_SET(col)                -- collect deduplicated
+-- Approximate aggregation
+APPROX_COUNT_DISTINCT(col)      -- approximate distinct count (HyperLogLog)
+APPROX_PERCENTILE(col, p)       -- approximate percentile
+-- Statistical aggregation
+CORR(x, y)                      -- correlation coefficient
+COVAR_POP(x, y) / COVAR_SAMP(x, y)  -- covariance
+-- ⚠️ REGR_SLOPE / REGR_INTERCEPT not supported
+-- Alternative: slope = CORR(y,x) * STDDEV(y) / STDDEV(x); intercept = AVG(y) - slope * AVG(x)
+-- Ordered set aggregation
+PERCENTILE(col, p)              -- exact percentile
+PERCENTILE_APPROX(col, p)       -- approximate percentile
+MEDIAN(col)                     -- median
+```
+**Differences from Snowflake:**
+- Snowflake `LISTAGG(col, ',') WITHIN GROUP (ORDER BY col)` → ClickZetta `GROUP_CONCAT(col ORDER BY col SEPARATOR ',')`
+- Snowflake `ARRAY_AGG(col) WITHIN GROUP (ORDER BY col)` → ClickZetta `ARRAY_AGG(col)`; `WITHIN GROUP` is not supported, so the ordering clause must be dropped
+- Snowflake `OBJECT_AGG(key, value)` → ClickZetta `MAP_AGG(key, value)`
+- Snowflake `BITAND_AGG / BITOR_AGG / BITXOR_AGG` → ClickZetta `BIT_AND / BIT_OR / BIT_XOR`
+---
+## Type Conversion Functions
+```sql
+-- Explicit conversion
+CAST(expr AS target_type)
+expr::target_type               -- shorthand syntax
+-- Safe conversion (returns NULL on failure instead of error)
+TRY_CAST(expr AS target_type)
+-- String conversion
+TO_NUMBER(s) / TO_DECIMAL(s)
+TO_DOUBLE(s)
+TO_BOOLEAN(s)                   -- 'true'/'false'/'1'/'0'
+-- Examples
+CAST('123' AS INT)
+CAST(123 AS STRING)
+CAST('2024-01-01' AS DATE)
+CAST('[1,2,3]' AS VECTOR(3))    -- string to vector
+TRY_CAST('abc' AS INT)          -- returns NULL
+```
+**Differences from Snowflake:**
+- Snowflake `TRY_TO_NUMBER / TRY_TO_DATE` → ClickZetta `TRY_CAST`
+- Snowflake `TO_VARIANT(x)` → ClickZetta `PARSE_JSON(TO_JSON(x))`
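The key property of `TRY_CAST`, and of the Snowflake `TRY_TO_*` functions it replaces, is that a failed conversion yields NULL instead of aborting the query. A minimal Python model of that contract for integers is shown below. Python's `int()` accepts some inputs that a SQL engine may reject (surrounding whitespace, for instance), so this illustrates the error-handling shape rather than the exact parsing rules.

```python
def try_cast_int(value):
    """Models TRY_CAST(value AS INT): return None (NULL) instead of raising on failure."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None
```

Downstream logic should therefore treat a NULL result as "could not convert", for example by counting NULLs after the cast to measure how dirty a column is.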
+---
+## System/Context Functions
+```sql
+CURRENT_USER()                  -- current username
+CURRENT_WORKSPACE()             -- current workspace
+CURRENT_SCHEMA()                -- current schema
+CURRENT_VCLUSTER()              -- current compute cluster
+CURRENT_INSTANCE_ID()           -- current instance ID
+VERSION()                       -- version information
+```
+**Differences from Snowflake:**
+- Snowflake `CURRENT_DATABASE()` → ClickZetta `CURRENT_WORKSPACE()`
+- Snowflake `CURRENT_WAREHOUSE()` → ClickZetta `CURRENT_VCLUSTER()`
+- Snowflake `CURRENT_ROLE()` → ClickZetta has no role function; `CURRENT_USER()` is the closest substitute
+---
+## Vector Functions
+```sql
+-- Distance calculation
+L2_DISTANCE(v1, v2)             -- Euclidean distance (smaller = more similar)
+COSINE_DISTANCE(v1, v2)         -- Cosine distance (smaller = more similar)
+DOT_PRODUCT(v1, v2)             -- Dot product (larger = more similar, requires normalization)
+HAMMING_DISTANCE(v1, v2)        -- Hamming distance (binary vectors)
+JACCARD_DISTANCE(v1, v2)        -- Jaccard distance
+-- Vector operations
+BINARY_QUANTIZE(v)              -- binarize float vector
+VECTOR(v1, v2, ...)             -- build vector
+-- Build vector
+SELECT VECTOR(0.1, 0.2, 0.3, 0.4);
+SELECT CAST('[0.1, 0.2, 0.3]' AS VECTOR(3));
+```