@kinetica/admin-agent 0.2.3 → 0.2.4

package/README.md CHANGED
@@ -417,7 +417,7 @@ References provide domain knowledge (not diagnostic runbooks). Create a `.md` fi
 
  **Playbooks** (6): memory-pressure, gpu-out-of-memory, query-contention, resource-group-exhaustion, stale-rank, config-drift
 
- **References** (9):
+ **References** (10):
 
  - `gpudb-conf` — master config file structure, section index, tiered storage semantics
  - `tiered-objects` — `ki_tiered_objects` schema, ID format, diagnostic queries
@@ -427,11 +427,12 @@ References provide domain knowledge (not diagnostic runbooks). Create a `.md` fi
  - `mutation-safety` — pre-execution checklist for rebalance, alter-config, and DDL paths
  - `sql-alter-table` — Kinetica 7.2 ALTER TABLE grammar, column property flags, shard-key immutability
  - `sql-create-index` — column index syntax, chunk skip index, when to use which
+ - `sql-dialect` — PostgreSQL-baseline mental model + a "false friends" table of cross-dialect SQL that looks valid but fails in Kinetica (e.g. `TRY_CAST`/`SAFE_CAST`, backtick quoting, `NUMERIC` vs `DECIMAL`); steers remediation SQL away from SQL Server/Snowflake/Oracle idioms
  - `version-quirks-7.2` — endpoint/property differences between 7.2.x and earlier releases
 
  Plus a **bundle-scoped reference** (`support-bundle` — bundle layout, the two per-rank log families, raw + Loki-JSONL log-line formats, severity ordering, file parsing, crash-SQL forensics, and how to work an off-shape bundle via the `layout_match`/confidence signals) that lives in `knowledge/references/bundle/`. It loads in **every** session — even a pure live one — so that a bundle attached mid-session via `kinetica_load_bundle` has its parsing knowledge ready in the (build-once) prompt; the corpus is cached, so the cost to a session that never attaches a bundle is negligible.
 
- > **Heads up — prompt budget:** all playbooks and references are front-loaded into a single system prompt at startup, so its token cost grows with the knowledge corpus. A startup tripwire (`agent/prompt-budget.ts`) prints the assembled prompt size under `DEBUG` and warns on stderr once it exceeds ~20,000 estimated tokens. Current baseline is ~13.4k tokens (6 playbooks + 9 references). If you add substantial knowledge and trip that warning, treat it as the cue to switch from "load everything" to keyword-based playbook selection.
+ > **Heads up — prompt budget:** all playbooks and references are front-loaded into a single system prompt at startup, so its token cost grows with the knowledge corpus. A startup tripwire (`agent/prompt-budget.ts`) prints the assembled prompt size under `DEBUG` and warns on stderr once it exceeds ~20,000 estimated tokens. Current baseline is ~15.5k tokens (6 playbooks + 10 references). If you add substantial knowledge and trip that warning, treat it as the cue to switch from "load everything" to keyword-based playbook selection.
 
  ## Development
 
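The prompt-budget note in the hunk above describes a startup tripwire that estimates the assembled prompt size and warns past roughly 20,000 tokens. The source of `agent/prompt-budget.ts` is not part of this diff, so the following is only a minimal sketch of how such a check could look. The names `estimateTokens` and `checkPromptBudget`, and the 4-characters-per-token heuristic, are assumptions made for illustration.

```typescript
// Hypothetical sketch of a prompt-size tripwire. This is NOT the package's
// actual agent/prompt-budget.ts, whose source is not shown in this diff.

// Threshold taken from the README note (~20,000 estimated tokens).
const WARN_THRESHOLD_TOKENS = 20000;

// Assumption: roughly 4 characters per token, a common rough heuristic.
// The real estimator may work differently.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface BudgetReport {
  estimatedTokens: number;
  overBudget: boolean;
}

// Estimates the prompt size and calls `warn` once the estimate exceeds the
// threshold. By default the warning goes through console.warn.
function checkPromptBudget(
  prompt: string,
  warn: (msg: string) => void = (msg) => console.warn(msg),
): BudgetReport {
  const estimatedTokens = estimateTokens(prompt);
  const overBudget = estimatedTokens > WARN_THRESHOLD_TOKENS;
  if (overBudget) {
    warn(
      `system prompt is ~${estimatedTokens} estimated tokens ` +
        `(threshold ${WARN_THRESHOLD_TOKENS}); consider keyword-based playbook selection`,
    );
  }
  return { estimatedTokens, overBudget };
}
```

Passing `warn` as a parameter keeps the check easy to exercise in tests without capturing stderr.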
@@ -0,0 +1,100 @@
+ ---
+ title: Kinetica SQL Dialect — PostgreSQL Baseline & False Friends
+ category: sql-syntax
+ keywords:
+ [
+ sql-dialect,
+ postgresql,
+ false-friends,
+ try-cast,
+ safe-cast,
+ cast,
+ convert,
+ remediation-sql,
+ datediff,
+ timestamp,
+ nested-aggregate,
+ identifiers,
+ backticks,
+ decimal,
+ numeric,
+ ]
+ ---
+
+ ## Mental Model — Start from PostgreSQL
+
+ Kinetica SQL is **PostgreSQL-compatible**: treat standard PostgreSQL syntax,
+ functions, and behavior as the baseline. The deviations documented here (and in
+ `version-quirks-7.2.md`) **override** that baseline. When no Kinetica-specific
+ rule applies, the PostgreSQL form is the safe default.
+
+ **The common failure when recommending remediation SQL is importing idioms from
+ OTHER dialects** — SQL Server, Snowflake, Oracle, MySQL. Those are not the
+ baseline; PostgreSQL is. The table below lists imports that look valid but fail
+ in Kinetica.
+
+ > Dialect facts adapted from the official `kineticadb/agent-skills` knowledge
+ > base (Apache-2.0).
+
+ ## False Friends — Looks Valid, FAILS in Kinetica
+
+ Do NOT put any of these in a remediation suggestion. Use the Kinetica form.
+
+ | Looks valid (other dialect) | Why it fails | Use instead |
+ | -------------------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------------------------- |
+ | `TRY_CAST(x AS t)` / `SAFE_CAST(x, t)` | No error-tolerant cast exists in Kinetica | `CAST(x AS t)` or `CONVERT(x, t)`; shorthand `INT(x)`, `DOUBLE(x)`, `STRING(x)` |
+ | `` `ident` `` (backtick quoting) | Backticks are not a valid identifier quote | ANSI double quotes: `"ident"` |
+ | `ts1 - ts2` (timestamp subtraction) | Timestamp arithmetic with `-` is not supported | `DATEDIFF('unit', ts1, ts2)` |
+ | `NUMERIC(p, s)` | The type is named `DECIMAL`, not `NUMERIC` | `DECIMAL(p, s)` (max precision 27, max scale 18) |
+ | `SUM(COUNT(*))` (nested aggregates) | Fails with "Aggregate expressions cannot be nested" | Separate into CTEs — window/aggregate in different stages |
+ | `ANALYZE TABLE t` | No cost-based optimizer stats (see `version-quirks-7.2.md`) | No equivalent — do NOT suggest a "refresh table stats" step |
+ | `SELECT ... ;` (trailing semicolon) | A trailing `;` is rejected | Omit the trailing semicolon |
+ | `ORDER BY <array_col>` | Cannot sort by an `array<...>` column | Index an element (`ORDER BY "col"[1]`) or sort by a scalar column |
+
+ `TRY_CAST` / `SAFE_CAST` warrant special note: they come from SQL Server,
+ Snowflake, and BigQuery, and Kinetica has no cast variant that returns NULL on
+ conversion failure. If a value might not convert cleanly, filter the source rows
+ (`WHERE` / `CASE`) before casting rather than reaching for a non-existent
+ `TRY_*` function.
+
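Several rows of the false-friends table are mechanical enough to check before a remediation suggestion is shown. The sketch below is one hypothetical way to do that; it is not code from the package. `FALSE_FRIEND_RULES` and `findFalseFriends` are illustrative names, and the patterns are regex heuristics covering only a subset of the table, not a SQL parser.

```typescript
// Hypothetical lint pass over candidate remediation SQL. The rules mirror a
// subset of the false-friends table; nested aggregates, timestamp subtraction
// and array ORDER BY need real parsing and are deliberately left out.

interface FalseFriendRule {
  name: string;
  pattern: RegExp; // no "g" flag, so RegExp.test() stays stateless
  useInstead: string;
}

const FALSE_FRIEND_RULES: FalseFriendRule[] = [
  {
    name: "error-tolerant cast",
    pattern: /\b(TRY_CAST|SAFE_CAST)\s*\(/i,
    useInstead: "CAST(x AS t) or CONVERT(x, t)",
  },
  {
    name: "backtick quoting",
    pattern: /`/,
    useInstead: 'ANSI double quotes: "ident"',
  },
  {
    name: "NUMERIC type",
    pattern: /\bNUMERIC\s*\(/i,
    useInstead: "DECIMAL(p, s)",
  },
  {
    name: "ANALYZE TABLE",
    pattern: /\bANALYZE\s+TABLE\b/i,
    useInstead: "no equivalent; drop the step",
  },
  {
    name: "trailing semicolon",
    pattern: /;\s*$/,
    useInstead: "omit the trailing semicolon",
  },
];

// Returns every rule whose pattern appears in the given SQL text.
function findFalseFriends(sql: string): FalseFriendRule[] {
  return FALSE_FRIEND_RULES.filter((rule) => rule.pattern.test(sql));
}
```

Because the patterns match raw text, a backtick or `NUMERIC(` inside a string literal would also be flagged. Treat a hit as a prompt to review the statement, not as proof that it fails.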
+ ## Type Conversion — the Valid Forms
+
+ - Standard `CAST(expr AS type)` and `CONVERT(expr, type)` both work.
+ - Shorthand cast functions: `INT(expr)`, `LONG(expr)`, `DOUBLE(expr)`,
+ `FLOAT(expr)`, `DECIMAL(expr)`, `STRING(expr)`, `ULONG(expr)`.
+ - `JSON_EXTRACT_VALUE` always returns TEXT — you MUST cast for numeric use:
+ `CAST(JSON_EXTRACT_VALUE("payload", '$.count') AS INTEGER) > 100`.
+
+ ## Date / Time — Use Functions, Not Arithmetic
+
+ | Kinetica form | Replaces (PostgreSQL / other) |
+ | ------------------------------------ | --------------------------------- |
+ | `DATEDIFF('unit', start, end)` | `EXTRACT(EPOCH FROM end - start)` |
+ | `DATEADD('unit', amount, ts)` | `ts + INTERVAL '...'` |
+ | `TIME_BUCKET(INTERVAL 'n' UNIT, ts)` | `date_bin()` |
+
+ Units: `SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, YEAR` (also
+ `MICROSECOND` / `MILLISECOND`). INTERVAL syntax: `INTERVAL '30' MINUTE`.
+
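One way to keep generated SQL on the function forms in the date/time table is to render them through small typed helpers, so an unsupported unit is caught at compile time. This is a sketch only: `dateDiff`, `dateAdd`, and `timeBucket` are hypothetical names, the output copies the table's notation verbatim (including the quoted unit), and the integer guard is a conservative assumption that is not stated in the reference.

```typescript
// Hypothetical string builders for the date/time forms listed in the table.
// They only format text; nothing here talks to a database.

const UNITS = [
  "MICROSECOND", "MILLISECOND", "SECOND", "MINUTE", "HOUR",
  "DAY", "WEEK", "MONTH", "QUARTER", "YEAR",
] as const;
type Unit = (typeof UNITS)[number];

// Assumption: amounts and interval sizes are whole numbers.
function requireInteger(n: number, label: string): void {
  if (!Number.isInteger(n)) {
    throw new RangeError(`${label} must be an integer, got ${n}`);
  }
}

// Renders DATEDIFF('unit', start, end) as written in the table.
function dateDiff(unit: Unit, startExpr: string, endExpr: string): string {
  return `DATEDIFF('${unit}', ${startExpr}, ${endExpr})`;
}

// Renders DATEADD('unit', amount, ts) as written in the table.
function dateAdd(unit: Unit, amount: number, tsExpr: string): string {
  requireInteger(amount, "amount");
  return `DATEADD('${unit}', ${amount}, ${tsExpr})`;
}

// Renders TIME_BUCKET(INTERVAL 'n' UNIT, ts) as written in the table.
function timeBucket(n: number, unit: Unit, tsExpr: string): string {
  requireInteger(n, "interval size");
  return `TIME_BUCKET(INTERVAL '${n}' ${unit}, ${tsExpr})`;
}
```

For example, `timeBucket(30, "MINUTE", '"ts"')` yields `TIME_BUCKET(INTERVAL '30' MINUTE, "ts")`, matching the `INTERVAL '30' MINUTE` syntax quoted above.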
+ ## Identifier & Statement Hygiene
+
+ - **Double quotes only** for identifiers (`"my_col"`) — never backticks.
+ - **Identifiers are case-sensitive** — `"UserID"` ≠ `"userid"`. Verify column
+ names against the discovered schema before recommending SQL.
+ - **Fully-qualify table names** — `"schema"."table"`.
+ - **No trailing semicolons.**
+
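The hygiene rules above lend themselves to a few small helpers applied when a suggestion is assembled. The following is a minimal sketch with hypothetical names (`quoteIdent`, `qualifiedTable`, `stripTrailingSemicolons`). It refuses identifiers that contain a double quote instead of guessing at an escaping rule the reference does not describe.

```typescript
// Hypothetical helpers for the identifier and statement rules listed above.

// Wraps an identifier in ANSI double quotes, preserving its case exactly.
// Assumption: names containing a double quote are rejected, because the
// reference does not say how (or whether) they can be escaped.
function quoteIdent(name: string): string {
  if (name.length === 0 || name.indexOf('"') >= 0) {
    throw new Error(`unsupported identifier: ${JSON.stringify(name)}`);
  }
  return `"${name}"`;
}

// Builds the fully-qualified "schema"."table" form.
function qualifiedTable(schema: string, table: string): string {
  return `${quoteIdent(schema)}.${quoteIdent(table)}`;
}

// Removes any run of trailing semicolons (and whitespace before or after
// them) from the end of a statement; other text is left untouched.
function stripTrailingSemicolons(sql: string): string {
  return sql.replace(/(\s*;)+\s*$/, "");
}
```

Since the helpers never change case, `quoteIdent("UserID")` and `quoteIdent("userid")` stay distinct; checking the name against the discovered schema remains the caller's job.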
+ ## Kinetica Conveniences (valid, non-obvious)
+
+ - `SELECT * EXCLUDE (col1, col2)` — wildcard minus specific columns.
+ - `IF(cond, a, b)` — ternary (PostgreSQL has only `CASE`).
+ - `NVL(x, default)` / `NVL2(x, not_null, null_val)` — null handling.
+ - `DECODE(expr, m1, v1, ..., default)` — pattern matching.
+
+ ## When Unsure — Verify Empirically
+
+ The live database is the source of truth. Before recommending any remediation
+ SQL whose syntax you are not certain Kinetica supports, validate it with
+ `kinetica_explain_query` against the live instance. If it cannot be validated
+ (or there is no live connection), label the suggestion as unverified rather than
+ asserting it is correct.
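The verify-or-label rule at the end of the reference can be expressed as a small pure function that takes the result of an explain attempt and decides how the suggestion is presented. This is a sketch under assumptions: `ExplainOutcome` is a hypothetical summary of whatever `kinetica_explain_query` returned (its real result shape is not shown in this diff), and the separate `rejected` status for a failed explain is an addition for illustration, not something the reference prescribes.

```typescript
// Hypothetical labeling step for remediation SQL. The tool call itself (which
// would normally be asynchronous) happens elsewhere; this function only maps
// its outcome to a label.

// Assumed summary of an explain attempt; undefined means no attempt was
// possible, for example because there is no live connection.
type ExplainOutcome = { ok: true } | { ok: false; error: string };

interface LabeledSuggestion {
  sql: string;
  status: "verified" | "unverified" | "rejected";
  note?: string;
}

function labelSuggestion(sql: string, outcome?: ExplainOutcome): LabeledSuggestion {
  if (outcome === undefined) {
    // Nothing was validated, so the suggestion must say so.
    return { sql, status: "unverified", note: "not validated against a live instance" };
  }
  if (outcome.ok) {
    return { sql, status: "verified" };
  }
  // The live database is the source of truth: keep its error with the label.
  return { sql, status: "rejected", note: outcome.error };
}
```

Keeping the decision separate from the tool call makes the labeling testable without a database, and keeps the "unverified" path explicit instead of silently defaulting to "correct".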
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@kinetica/admin-agent",
- "version": "0.2.3",
+ "version": "0.2.4",
  "description": "Autonomous diagnostic agent for Kinetica databases",
  "license": "Apache-2.0",
  "author": "Kinetica",