npm - @kinetica/admin-agent - Versions diffs - 0.2.3 → 0.2.4 - Mend

@kinetica/admin-agent 0.2.3 → 0.2.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/README.md +3 -2
package/knowledge/references/sql-dialect.md +100 -0
package/package.json +1 -1

package/README.md CHANGED

@@ -417,7 +417,7 @@ References provide domain knowledge (not diagnostic runbooks). Create a `.md` fi
 **Playbooks** (6): memory-pressure, gpu-out-of-memory, query-contention, resource-group-exhaustion, stale-rank, config-drift
-**References** (9):
+**References** (10):
 - `gpudb-conf` — master config file structure, section index, tiered storage semantics
 - `tiered-objects` — `ki_tiered_objects` schema, ID format, diagnostic queries
@@ -427,11 +427,12 @@ References provide domain knowledge (not diagnostic runbooks). Create a `.md` fi
 - `mutation-safety` — pre-execution checklist for rebalance, alter-config, and DDL paths
 - `sql-alter-table` — Kinetica 7.2 ALTER TABLE grammar, column property flags, shard-key immutability
 - `sql-create-index` — column index syntax, chunk skip index, when to use which
+- `sql-dialect` — PostgreSQL-baseline mental model + a "false friends" table of cross-dialect SQL that looks valid but fails in Kinetica (e.g. `TRY_CAST`/`SAFE_CAST`, backtick quoting, `NUMERIC` vs `DECIMAL`); steers remediation SQL away from SQL Server/Snowflake/Oracle idioms
 - `version-quirks-7.2` — endpoint/property differences between 7.2.x and earlier releases
 Plus a **bundle-scoped reference** (`support-bundle` — bundle layout, the two per-rank log families, raw + Loki-JSONL log-line formats, severity ordering, file parsing, crash-SQL forensics, and how to work an off-shape bundle via the `layout_match`/confidence signals) that lives in `knowledge/references/bundle/`. It loads in **every** session — even a pure live one — so that a bundle attached mid-session via `kinetica_load_bundle` has its parsing knowledge ready in the (build-once) prompt; the corpus is cached, so the cost to a session that never attaches a bundle is negligible.
-> **Heads up — prompt budget:** all playbooks and references are front-loaded into a single system prompt at startup, so its token cost grows with the knowledge corpus. A startup tripwire (`agent/prompt-budget.ts`) prints the assembled prompt size under `DEBUG` and warns on stderr once it exceeds ~20,000 estimated tokens. Current baseline is ~13.4k tokens (6 playbooks + 9 references). If you add substantial knowledge and trip that warning, treat it as the cue to switch from "load everything" to keyword-based playbook selection.
+> **Heads up — prompt budget:** all playbooks and references are front-loaded into a single system prompt at startup, so its token cost grows with the knowledge corpus. A startup tripwire (`agent/prompt-budget.ts`) prints the assembled prompt size under `DEBUG` and warns on stderr once it exceeds ~20,000 estimated tokens. Current baseline is ~15.5k tokens (6 playbooks + 10 references). If you add substantial knowledge and trip that warning, treat it as the cue to switch from "load everything" to keyword-based playbook selection.
 ## Development
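
The prompt-budget tripwire described in the README hunk above can be pictured with a small sketch. This is not the package's actual `agent/prompt-budget.ts`, which is not part of this diff. The names `estimateTokens` and `checkPromptBudget` and the 4-characters-per-token heuristic are assumptions made for illustration; only the ~20,000-token warning threshold comes from the README text.

```typescript
// Hypothetical sketch of a startup prompt-budget check.
// Only the 20,000-token threshold is taken from the README; everything else is assumed.
const WARN_THRESHOLD_TOKENS = 20_000;

// Assumption: roughly 4 characters per token, a coarse rule of thumb.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface BudgetReport {
  estimatedTokens: number;
  overBudget: boolean;
}

// Estimates the assembled prompt size and warns on stderr once it exceeds the threshold.
function checkPromptBudget(
  prompt: string,
  threshold: number = WARN_THRESHOLD_TOKENS,
): BudgetReport {
  const estimatedTokens = estimateTokens(prompt);
  const overBudget = estimatedTokens > threshold;
  if (overBudget) {
    console.error(
      `[prompt-budget] ~${estimatedTokens} estimated tokens exceeds ${threshold}`,
    );
  }
  return { estimatedTokens, overBudget };
}
```

Returning a report object rather than only logging keeps the check easy to call from a test or a startup hook. Under this heuristic, the ~15.5k-token baseline quoted above would leave roughly 4.5k tokens of headroom before the warning fires.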

package/knowledge/references/sql-dialect.md ADDED

@@ -0,0 +1,100 @@
+---
+title: Kinetica SQL Dialect — PostgreSQL Baseline & False Friends
+category: sql-syntax
+keywords:
+  [
+    sql-dialect,
+    postgresql,
+    false-friends,
+    try-cast,
+    safe-cast,
+    cast,
+    convert,
+    remediation-sql,
+    datediff,
+    timestamp,
+    nested-aggregate,
+    identifiers,
+    backticks,
+    decimal,
+    numeric,
+  ]
+---
+## Mental Model — Start from PostgreSQL
+Kinetica SQL is **PostgreSQL-compatible**: treat standard PostgreSQL syntax,
+functions, and behavior as the baseline. The deviations documented here (and in
+`version-quirks-7.2.md`) **override** that baseline. When no Kinetica-specific
+rule applies, the PostgreSQL form is the safe default.
+**The common failure when recommending remediation SQL is importing idioms from
+OTHER dialects** — SQL Server, Snowflake, Oracle, MySQL. Those are not the
+baseline; PostgreSQL is. The table below lists imports that look valid but fail
+in Kinetica.
+> Dialect facts adapted from the official `kineticadb/agent-skills` knowledge
+> base (Apache-2.0).
+## False Friends — Looks Valid, FAILS in Kinetica
+Do NOT put any of these in a remediation suggestion. Use the Kinetica form.
+| Looks valid (other dialect)            | Why it fails                                                | Use instead                                                                     |
+| -------------------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------------------------- |
+| `TRY_CAST(x AS t)` / `SAFE_CAST(x, t)` | No error-tolerant cast exists in Kinetica                   | `CAST(x AS t)` or `CONVERT(x, t)`; shorthand `INT(x)`, `DOUBLE(x)`, `STRING(x)` |
+| `` `ident` `` (backtick quoting)       | Backticks are not a valid identifier quote                  | ANSI double quotes: `"ident"`                                                   |
+| `ts1 - ts2` (timestamp subtraction)    | Timestamp arithmetic with `-` is not supported              | `DATEDIFF('unit', ts2, ts1)`                                                    |
+| `NUMERIC(p, s)`                        | The type is named `DECIMAL`, not `NUMERIC`                  | `DECIMAL(p, s)` (max precision 27, max scale 18)                                |
+| `SUM(COUNT(*))` (nested aggregates)    | Fails with "Aggregate expressions cannot be nested"         | Separate into CTEs — window/aggregate in different stages                       |
+| `ANALYZE TABLE t`                      | No cost-based optimizer stats (see `version-quirks-7.2.md`) | No equivalent — do NOT suggest a "refresh table stats" step                     |
+| `SELECT ... ;` (trailing semicolon)    | A trailing `;` is rejected                                  | Omit the trailing semicolon                                                     |
+| `ORDER BY <array_col>`                 | Cannot sort by an `array<...>` column                       | Index an element (`ORDER BY "col"[1]`) or sort by a scalar column               |
+`TRY_CAST` / `SAFE_CAST` warrant special note: they come from SQL Server,
+Snowflake, and BigQuery, and Kinetica has no cast variant that returns NULL on
+conversion failure. If a value might not convert cleanly, filter the source rows
+(`WHERE` / `CASE`) before casting rather than reaching for a non-existent
+`TRY_*` function.
+## Type Conversion — the Valid Forms
+- Standard `CAST(expr AS type)` and `CONVERT(expr, type)` both work.
+- Shorthand cast functions: `INT(expr)`, `LONG(expr)`, `DOUBLE(expr)`,
+  `FLOAT(expr)`, `DECIMAL(expr)`, `STRING(expr)`, `ULONG(expr)`.
+- `JSON_EXTRACT_VALUE` always returns TEXT — you MUST cast for numeric use:
+  `CAST(JSON_EXTRACT_VALUE("payload", '$.count') AS INTEGER) > 100`.
+## Date / Time — Use Functions, Not Arithmetic
+| Kinetica form                        | Replaces (PostgreSQL / other)     |
+| ------------------------------------ | --------------------------------- |
+| `DATEDIFF('unit', start, end)`       | `EXTRACT(EPOCH FROM end - start)` |
+| `DATEADD('unit', amount, ts)`        | `ts + INTERVAL '...'`             |
+| `TIME_BUCKET(INTERVAL 'n' UNIT, ts)` | `date_bin()`                      |
+Units: `SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR` (also
+`MICROSECOND` / `MILLISECOND`). INTERVAL syntax: `INTERVAL '30' MINUTE`.
+## Identifier & Statement Hygiene
+- **Double quotes only** for identifiers (`"my_col"`) — never backticks.
+- **Identifiers are case-sensitive** — `"UserID"` ≠ `"userid"`. Verify column
+  names against the discovered schema before recommending SQL.
+- **Fully-qualify table names** — `"schema"."table"`.
+- **No trailing semicolons.**
+## Kinetica Conveniences (valid, non-obvious)
+- `SELECT * EXCLUDE (col1, col2)` — wildcard minus specific columns.
+- `IF(cond, a, b)` — ternary (PostgreSQL has only `CASE`).
+- `NVL(x, default)` / `NVL2(x, not_null, null_val)` — null handling.
+- `DECODE(expr, m1, v1, ..., default)` — pattern matching.
+## When Unsure — Verify Empirically
+The live database is the source of truth. Before recommending any remediation
+SQL whose syntax you are not certain Kinetica supports, validate it with
+`kinetica_explain_query` against the live instance. If it cannot be validated
+(or there is no live connection), label the suggestion as unverified rather than
+asserting it is correct.
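
As an illustration of how a few rows of the "false friends" table in the added file could be applied mechanically, here is a minimal sketch of a pre-flight scan over candidate remediation SQL. It is not code from the package: `FalseFriendRule`, `RULES`, and `findFalseFriends` are hypothetical names, and the regular expressions are naive (they do not skip string literals or comments), so a hit is a hint, not a verdict.

```typescript
// Hypothetical pre-flight scan for cross-dialect idioms listed in sql-dialect.md.
// Naive by design: patterns are matched against the raw text, including string literals.
interface FalseFriendRule {
  name: string;
  pattern: RegExp;
  useInstead: string;
}

const RULES: FalseFriendRule[] = [
  {
    name: "error-tolerant cast",
    pattern: /\b(TRY_CAST|SAFE_CAST)\s*\(/i,
    useInstead: "CAST(x AS t) or CONVERT(x, t)",
  },
  {
    name: "backtick quoting",
    pattern: /`/,
    useInstead: 'ANSI double quotes: "ident"',
  },
  {
    name: "NUMERIC type",
    pattern: /\bNUMERIC\s*\(/i,
    useInstead: "DECIMAL(p, s)",
  },
  {
    name: "ANALYZE TABLE",
    pattern: /\bANALYZE\s+TABLE\b/i,
    useInstead: "no equivalent; drop the step",
  },
  {
    name: "trailing semicolon",
    pattern: /;\s*$/,
    useInstead: "omit the trailing semicolon",
  },
];

// Returns every rule whose pattern appears in the given SQL text.
function findFalseFriends(sql: string): FalseFriendRule[] {
  return RULES.filter((rule) => rule.pattern.test(sql));
}
```

A scan like this only catches the textual patterns it knows about. Validation with `kinetica_explain_query` against a live instance, as the reference file itself recommends, remains the authoritative check.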

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@kinetica/admin-agent",
-  "version": "0.2.3",
+  "version": "0.2.4",
   "description": "Autonomous diagnostic agent for Kinetica databases",
   "license": "Apache-2.0",
   "author": "Kinetica",