@kinetica/admin-agent 0.1.0

@@ -0,0 +1,26 @@
1
+ ---
2
+ title: Config Drift / Configuration Pitfalls
3
+ category: configuration
4
+ severity: warning
5
+ keywords: [config, drift, configuration, regression, upgrade]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - Unexpected behavior after upgrade or config change
11
+ - Performance regression with no workload change
12
+
13
+ ## Detection
14
+
15
+ - `kinetica_get_system_properties` → compare against known-good values
16
+ - `kinetica_get_config` → snapshot shows non-default values (on 7.2.x, use `kinetica_get_system_properties` instead)
17
+
18
+ ## Root Cause
19
+
20
+ Configuration values drift from the known-good baseline through manual edits, upgrade migration, or environment-specific settings.
21
+
22
+ ## Remediation
23
+
24
+ 1. Use `kinetica_alter_system_properties` to restore known-good config values
25
+ 2. Review Kinetica changelog for breaking config changes between versions
26
+ 3. Document baseline configuration for future comparison
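The comparison against known-good values described under Detection can be sketched as a plain dictionary diff. This is a minimal illustration, not part of the package: `diff_properties` is a hypothetical helper, and the property names and values are made up for the example. It assumes the current properties and the documented baseline are both available as flat key/value mappings.

```python
from typing import Any


def diff_properties(
    baseline: dict[str, Any], current: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {key: (baseline_value, current_value)} for every key that differs.

    A key that is missing on one side is reported with None on that side.
    """
    drift: dict[str, tuple[Any, Any]] = {}
    for key in sorted(set(baseline) | set(current)):
        before = baseline.get(key)
        after = current.get(key)
        if before != after:
            drift[key] = (before, after)
    return drift


# Illustrative values only; real names and values come from the cluster.
baseline = {"conf.request_timeout": "20", "conf.chunk_size": "8000000"}
current = {"conf.request_timeout": "60", "conf.chunk_size": "8000000"}
drift = diff_properties(baseline, current)
```

Keeping the baseline as a stored snapshot makes step 3 of the remediation concrete: the same function can be rerun after every upgrade or config change.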
@@ -0,0 +1,27 @@
1
+ ---
2
+ title: GPU Out-of-Memory
3
+ category: performance
4
+ severity: critical
5
+ keywords: [VRAM, GPU, OOM, memory, timeout]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - ERROR log entries containing "out_of_memory" or other GPU OOM messages, along with query failures
11
+ - `kinetica_get_metrics` shows GPU memory near 100%
12
+
13
+ ## Detection
14
+
15
+ - `kinetica_get_metrics` → check `vram_used` on worker ranks
16
+ - `ki_catalog.ki_tiered_objects` → find large VRAM-tier objects
17
+
18
+ ## Root Cause
19
+
20
+ Queries materializing too much data in VRAM; oversized objects loaded into GPU memory.
21
+
22
+ ## Remediation
23
+
24
+ 1. Identify largest GPU objects via `kinetica_resource_objects`
25
+ 2. Add query limits to constrain result set sizes
26
+ 3. Review GPU memory allocation config via `kinetica_get_system_properties` (conf.tier.\*)
27
+ 4. Consider tier eviction policy changes
@@ -0,0 +1,29 @@
1
+ ---
2
+ title: Memory Pressure
3
+ category: performance
4
+ severity: warning
5
+ keywords: [memory, pressure, eviction, RAM, slow, disk]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - Slow queries with no obvious cause
11
+ - Eviction warnings in logs
12
+ - `ki_catalog.ki_tiered_objects` shows data moved to the PERSIST or DISK tier
13
+
14
+ ## Detection
15
+
16
+ - `kinetica_get_metrics` → high RAM usage percentage (above 80%)
17
+ - `ki_catalog.ki_tiered_objects` → objects in PERSIST/DISK tier that should be in RAM
18
+ - `kinetica_resource_objects` → non-zero eviction counts
19
+
20
+ ## Root Cause
21
+
22
+ Total working set exceeds available RAM; large objects not fitting in configured tier limits.
23
+
24
+ ## Remediation
25
+
26
+ 1. Increase tier memory allocation via `kinetica_alter_system_properties` (conf.tier.\*)
27
+ 2. Identify and evict cold objects via `kinetica_resource_objects`
28
+ 3. Archive unused tables to free tier capacity
29
+ 4. Review resource group memory limits in `kinetica_resource_groups`
@@ -0,0 +1,28 @@
1
+ ---
2
+ title: Query Contention
3
+ category: performance
4
+ severity: warning
5
+ keywords: [query, contention, slow, blocking, lock, concurrent]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - Long-running queries in `ki_query_history` (large elapsed time between start and completion)
11
+ - Active queries blocking each other
12
+
13
+ ## Detection
14
+
15
+ - `ki_catalog.ki_query_active_all` → multiple long-running queries
16
+ - `ki_catalog.ki_query_history` → queries with large elapsed time
17
+ - `ki_catalog.ki_query_workers` → blocked worker threads
18
+
19
+ ## Root Cause
20
+
21
+ Concurrent large queries competing for GPU resources; lock contention on shared tables.
22
+
23
+ ## Remediation
24
+
25
+ 1. Stagger large queries to reduce concurrent GPU pressure
26
+ 2. Review query priority settings in resource groups
27
+ 3. Consider query queue configuration via `kinetica_alter_system_properties`
28
+ 4. Check `kinetica_resource_groups` for CPU concurrency limits
@@ -0,0 +1,27 @@
1
+ ---
2
+ title: Resource Group Exhaustion
3
+ category: resources
4
+ severity: warning
5
+ keywords: [resource, group, limit, exhaustion, tier, capacity]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - Queries failing with resource limit errors
11
+ - Tier capacity warnings
12
+
13
+ ## Detection
14
+
15
+ - `kinetica_resource_groups` → tier usage near limits
16
+ - `kinetica_resource_objects` → uneven object distribution across ranks
17
+
18
+ ## Root Cause
19
+
20
+ Resource group limits too low for workload; uneven data placement across tiers.
21
+
22
+ ## Remediation
23
+
24
+ 1. Increase resource group limits via `kinetica_alter_system_properties`
25
+ 2. Use `kinetica_admin_rebalance` to redistribute data evenly across ranks
26
+ 3. Review resource group assignments in `kinetica_show_security`
27
+ 4. Consider adding new resource groups for workload isolation
@@ -0,0 +1,26 @@
1
+ ---
2
+ title: Stale Rank (Rank Not Responding)
3
+ category: cluster
4
+ severity: critical
5
+ keywords: [rank, stale, offline, crash, partition]
6
+ ---
7
+
8
+ ## Symptoms
9
+
10
+ - Health check shows unhealthy rank
11
+ - Cluster status shows rank offline
12
+
13
+ ## Detection
14
+
15
+ - `kinetica_health_check` → non-OK rank status
16
+ - `kinetica_cluster_status` → rank alerts, shard mapping gaps
17
+
18
+ ## Root Cause
19
+
20
+ Stale rank process after crash or network partition; rank failed to rejoin cluster.
21
+
22
+ ## Remediation
23
+
24
+ 1. Tell the operator to run `gadmin restart rank <N>` manually (Kinetica 7.2 has no REST API for worker restart)
25
+ 2. After rank recovers, use `kinetica_admin_rebalance` to redistribute shards
26
+ 3. Verify recovery with `kinetica_health_check` and `kinetica_cluster_status`
@@ -0,0 +1,82 @@
1
+ ---
2
+ title: ki_catalog Enum Values
3
+ category: catalog-schema
4
+ keywords:
5
+ [
6
+ ki_catalog,
7
+ enums,
8
+ obj_kind,
9
+ shard_kind,
10
+ persistence,
11
+ partition_type,
12
+ tier,
13
+ priority,
14
+ tiered_objects,
15
+ ]
16
+ ---
17
+
18
+ ## Overview
19
+
20
+ Many `ki_catalog` columns encode state as single-character codes or
21
+ small string constants. These are the canonical values — decode them
22
+ explicitly when interpreting query results or building WHERE clauses.
23
+
24
+ ## ki_objects
25
+
26
+ | Column | Value | Meaning |
27
+ | ------------- | ----- | ----------------------------- |
28
+ | `obj_kind` | `R` | table / relation |
29
+ | `obj_kind` | `V` | view |
30
+ | `shard_kind` | `S` | sharded |
31
+ | `shard_kind` | `N` | not sharded |
32
+ | `persistence` | `P` | persistent (survives restart) |
33
+ | `persistence` | `T` | temporary |
34
+
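One way to apply the `ki_objects` codes above is a small lookup table keyed by column name. This is a sketch under the assumption that query results arrive as dictionaries keyed by column; `decode_row` is a hypothetical helper, and only the codes listed in the table are covered.

```python
# Codes copied from the ki_objects table above.
DECODERS = {
    "obj_kind": {"R": "table / relation", "V": "view"},
    "shard_kind": {"S": "sharded", "N": "not sharded"},
    "persistence": {"P": "persistent", "T": "temporary"},
}


def decode_row(row: dict) -> dict:
    """Return a copy of row with known single-character codes spelled out.

    Unrecognized codes are kept visible instead of being silently dropped.
    """
    decoded = dict(row)
    for column, table in DECODERS.items():
        if column in decoded:
            code = decoded[column]
            decoded[column] = table.get(code, f"unknown ({code!r})")
    return decoded
```

Surfacing unknown codes explicitly is a deliberate choice here: a value outside the documented set is more useful to an investigator than a guessed meaning.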
35
+ ## ki_partitions
36
+
37
+ | Column | Value | Meaning |
38
+ | ---------------- | ---------- | -------------------------------- |
39
+ | `partition_type` | `NONE` | unpartitioned |
40
+ | `partition_type` | `INTERVAL` | time-based interval partitioning |
41
+
42
+ ## ki_tiered_objects
43
+
44
+ ### `id` format
45
+
46
+ String identifier (`char256`), format like `@schema@oid[col][chunk]`
47
+ (e.g., `@nyctaxi@365[col][0]`). NOT a numeric OID — **cannot** be
48
+ joined to `ki_objects.oid`. See `catalog-joins.md` for the correct
49
+ lookup path.
50
+
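To make the `id` shape concrete, the sketch below splits an identifier into its visible parts. It assumes every `id` follows exactly the `@schema@oid[col][chunk]` pattern shown above, which the source only describes as "format like", so other shapes simply return `None`. `parse_tiered_id` and the field name `token` are hypothetical; the middle part is deliberately not called an OID because it cannot be joined to `ki_objects.oid`.

```python
import re
from typing import Optional

# Assumed shape: @<schema>@<token>[<col>][<chunk>]
_TIERED_ID = re.compile(
    r"^@(?P<schema>[^@]+)@(?P<token>[^\[\]]+)"
    r"\[(?P<col>[^\]]*)\]\[(?P<chunk>[^\]]*)\]$"
)


def parse_tiered_id(value: str) -> Optional[dict]:
    """Split a ki_tiered_objects.id string; return None if the shape differs."""
    match = _TIERED_ID.match(value)
    return match.groupdict() if match else None
```

All parts come back as strings, which keeps the helper from implying any numeric relationship with other catalog tables.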
51
+ ### `tier`
52
+
53
+ Storage tier placement. One of:
54
+
55
+ - `RAM` — host memory
56
+ - `PERSIST` — persistent SSD/disk tier
57
+ - `DISK0` — primary disk tier
58
+ - `VRAM` — GPU memory
59
+
60
+ Same values appear in `ki_partitions.tier`.
61
+
62
+ ### `priority`
63
+
64
+ Tier manager priority — determines eviction order when a tier fills:
65
+
66
+ | Value | Meaning |
67
+ | ----- | ----------------------------------- |
68
+ | 1 | system / `ki_catalog` (never evict) |
69
+ | 5 | regular user tables |
70
+ | 9 | temporary / ephemeral |
71
+
72
+ Higher `priority` = more expendable = evicted first.
73
+
74
+ ### `locked` and `evictable`
75
+
76
+ - `locked = 1` — pinned in its current tier; tier manager cannot move it.
77
+ - `evictable = 1` — tier manager may move this object to a lower tier
78
+ when space is needed.
79
+
80
+ An object can be both unlocked and non-evictable (rare; means the tier
81
+ manager will not proactively move it but nothing prevents an
82
+ administrator from doing so).
@@ -0,0 +1,105 @@
1
+ ---
2
+ title: ki_catalog Cross-Table Correlation Paths
3
+ category: catalog-schema
4
+ keywords:
5
+ [
6
+ ki_catalog,
7
+ joins,
8
+ correlation,
9
+ ki_objects,
10
+ ki_columns,
11
+ ki_partitions,
12
+ ki_query_history,
13
+ ki_tiered_objects,
14
+ oid,
15
+ ]
16
+ ---
17
+
18
+ ## Overview
19
+
20
+ When investigating issues, evidence usually has to be joined across
21
+ multiple `ki_catalog` tables. These are the standard correlation paths
22
+ — prefer them over ad-hoc joins.
23
+
24
+ ## Table Metadata Chain
25
+
26
+ Walk this chain to go from an object name to its on-disk footprint and
27
+ schema:
28
+
29
+ ```
30
+ ki_objects.oid
31
+ → ki_obj_stat.oid (row counts, total sizes)
32
+ → ki_partitions.oid (tier placement, compression)
33
+ → ki_columns.table_oid (column schema)
34
+ ```
35
+
36
+ ## Column Type Resolution
37
+
38
+ `ki_columns.column_type_oid` is a numeric OID, not a type name. Join
39
+ it to `ki_datatypes.oid` to get the human-readable type:
40
+
41
+ | OID | Type |
42
+ | ---- | ---------- |
43
+ | 20 | `long` |
44
+ | 1043 | `char256` |
45
+ | 1114 | `datetime` |
46
+ | 2950 | `uuid` |
47
+ | 25 | `string` |
48
+
49
+ ## Query Drill-Down
50
+
51
+ To reconstruct a slow query's execution tree:
52
+
53
+ ```
54
+ ki_query_history.query_id
55
+ → ki_query_span_metrics_all.query_id
56
+ → span tree via span_id / parent_span_id
57
+ ```
58
+
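The span tree step can be sketched client-side once the span rows have been fetched. This assumes each row is a dictionary carrying `span_id` and `parent_span_id`, and that a root span has a null parent or a parent that is absent from the result set; `build_span_tree` and `render` are hypothetical helpers.

```python
from collections import defaultdict


def build_span_tree(rows):
    """Group span ids by parent; return (root_ids, children_by_parent)."""
    known = {row["span_id"] for row in rows}
    children = defaultdict(list)
    roots = []
    for row in rows:
        parent = row.get("parent_span_id")
        if parent is None or parent not in known:
            roots.append(row["span_id"])
        else:
            children[parent].append(row["span_id"])
    return roots, dict(children)


def render(span_ids, children, depth=0):
    """Return indented lines, one per span, in depth-first order."""
    lines = []
    for span_id in span_ids:
        lines.append("  " * depth + str(span_id))
        lines.extend(render(children.get(span_id, []), children, depth + 1))
    return lines
```

Treating spans with a missing parent as roots keeps a partially retained trace readable instead of dropping its orphaned branches.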
59
+ ## Active Query Workers
60
+
61
+ For queries currently running:
62
+
63
+ ```
64
+ ki_query_active_all.job_id
65
+ → ki_query_workers.job_id (worker threads, elapsed time, blockers)
66
+ ```
67
+
68
+ Use `ki_query_active_all.is_cancellable` to check whether a running
69
+ query can be cancelled before suggesting that remediation.
70
+
71
+ ## Permission Audit
72
+
73
+ ```
74
+ ki_object_permissions.role_oid → ki_users_and_roles.oid
75
+ ki_object_permissions.object_oid → ki_objects.oid
76
+ ```
77
+
78
+ ## Dependency Graph
79
+
80
+ For impact analysis before proposing a DROP:
81
+
82
+ ```
83
+ ki_depend.src_obj_oid → ki_objects.oid
84
+ ki_depend.dep_obj_oid → ki_objects.oid
85
+ ```
86
+
87
+ ## Tier Object Lookup (WARNING — no OID join)
88
+
89
+ `ki_tiered_objects.id` is a **string identifier** (e.g.,
90
+ `@nyctaxi@365[col][0]`), NOT a numeric OID. Do NOT try to join it
91
+ with `ki_objects.oid` — the types don't match and the values don't
92
+ correspond.
93
+
94
+ For per-table tier placement, prefer the dedicated tool:
95
+
96
+ ```
97
+ kinetica_resource_objects (with table_names filter)
98
+ ```
99
+
100
+ For SQL-based analysis, filter with a string match:
101
+
102
+ ```sql
103
+ SELECT * FROM ki_catalog.ki_tiered_objects
104
+ WHERE id LIKE '%table_name%'
105
+ ```
@@ -0,0 +1,93 @@
1
+ ---
2
+ title: gpudb.conf Configuration Reference
3
+ category: configuration
4
+ keywords: [gpudb.conf, config, configuration, parameters, tuning, tiers, alerts]
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ `gpudb.conf` is the master Kinetica configuration file (INI format, all under `[gaia]` section).
10
+ Default on-disk location: `/opt/gpudb/core/etc/gpudb.conf`.
11
+ Retrieved via `kinetica_show_configuration` (host manager port 9300), modified via `kinetica_alter_configuration`.
12
+ Runtime properties are a subset available via `kinetica_get_system_properties` / `kinetica_alter_system_properties`.
13
+
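Because the file is INI-formatted with everything under `[gaia]`, a retrieved config string can in principle be read with Python's standard `configparser`. The sketch below is an assumption-laden illustration: the sample text is invented (only `head_port`, `default_ttl`, and `chunk_size` are names from this reference), and a real `gpudb.conf` may contain constructs that the strict parser rejects.

```python
import configparser

# Invented sample; a real file is far longer.
SAMPLE = """\
[gaia]
head_port = 9191
default_ttl = 20
chunk_size = 8000000
"""


def load_gaia(text: str) -> dict:
    """Parse an INI-style config string and return the [gaia] section."""
    # interpolation=None keeps literal '%' characters in values intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser["gaia"])


settings = load_gaia(SAMPLE)
```

All values come back as strings, so numeric comparisons need an explicit conversion.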
14
+ ## Section Index
15
+
16
+ | Section | Key Parameters | Diagnostic Relevance |
17
+ | ------------------- | ------------------------------------------------------------------------------------ | --------------------------- |
18
+ | Identification | `ring_name`, `cluster_name` | Cluster identity |
19
+ | Hosts | `host<#>.address`, `host<#>.ram_limit`, `host<#>.gpus` | Host topology, RAM caps |
20
+ | Ranks | `rank<#>.host` | Rank-to-host mapping |
21
+ | Network | `head_port` (9191), `host_manager_http_port` (9300), `enable_worker_http_servers` | Connectivity issues |
22
+ | Security | `require_authentication`, `enable_authorization` | Auth troubleshooting |
23
+ | Auditing | `enable_audit`, `audit_body`, `lock_audit` | Audit trail |
24
+ | Licensing | `license_key` | License issues |
25
+ | Processes & Threads | `worker_endpoint_threads`, `tcs_per_tom`, `tps_per_tom`, `subtask_concurrency_limit` | Performance tuning |
26
+ | Hardware | `rank<#>.taskcalc_gpu`, `rank<#>.numa_node` | GPU/NUMA assignment |
27
+ | General | `default_ttl`, `chunk_size`, `execution_mode`, `request_timeout` | Performance, data lifecycle |
28
+ | Visualization | `max_heatmap_size`, `enable_opengl_renderer`, `enable_vectortile_service` | WMS/VTS issues |
29
+ | Text Search | `enable_text_search`, `text_indices_per_tom` | Text search issues |
30
+ | Persistence | `persist_directory`, `wal.*`, `compression_codec`, `load_vectors_on_start` | Data durability, startup |
31
+ | Monitoring | `enable_stats_server`, `telm.persist_query_metrics` | Observability |
32
+ | Graph Servers | `enable_graph_server`, `graph.server<#>.host` | Graph analytics |
33
+ | HA | `enable_ha`, `enable_ha_replay` | High availability |
34
+ | Alerts | `alert_memory_percentage`, `alert_disk_percentage`, `heartbeat_*` | Alert config |
35
+ | Failover | `np1.enable_worker_failover`, `np1.rank_restart_attempts` | Failover behavior |
36
+ | Postgres Proxy | `enable_postgres_proxy`, `postgres_proxy.port` (5432) | Client connectivity |
37
+ | SQL Engine | `sql.enable_planner`, `sql.planner.timeout`, `sql.plan_cache_size` | Query planning |
38
+ | Tiered Storage | `tier.{vram,ram,disk,persist,cold}.*` | Memory/storage management |
39
+ | Tier Strategy | `tier_strategy.default` | Data placement policy |
40
+ | Resource Groups | `resource_group.default.*` | Resource allocation |
41
+
42
+ ## Performance-Critical Parameters
43
+
44
+ **Thread Pools** (all accept `-1` for auto):
45
+
46
+ - `worker_endpoint_threads` — HTTP request handling threads per worker rank
47
+ - `tps_per_tom` — data processing threads (inserts, updates, deletes); multi-head ingest not affected
48
+ - `tcs_per_tom` — calculation threads (aggregates, record retrieval)
49
+ - `subtask_concurrency_limit` — query-level scheduler concurrency; lower = depth-first (fewer queries, faster completion), higher = breadth-first (more concurrency)
50
+
51
+ **Chunk Settings:**
52
+
53
+ - `chunk_size` — records per chunk (default 8M; 0 disables chunking)
54
+ - `chunk_max_memory` — max total chunk data per table in bytes
55
+ - `chunk_column_max_memory` — max per-column chunk data in memory (512MB)
56
+
57
+ **Execution Mode:** `execution_mode` = `default` | `host` | `device` | `<rows>` — controls CPU vs GPU kernel execution. When set to `device` but no GPUs are available, falls back to CPU.
58
+
59
+ ## Tiered Storage Quick Reference
60
+
61
+ Five tier types (data flows down when evicted):
62
+
63
+ 1. **VRAM** — GPU memory; limit/watermarks per rank per GPU
64
+ 2. **RAM** — main memory; rank0 gets ~10% of system RAM, workers split the rest
65
+ 3. **Disk** — temporary swap cache (fast SSD recommended); multiple disk tiers supported
66
+ 4. **Persist** — permanent storage; data survives restarts
67
+ 5. **Cold** — extended storage (disk, HDFS, S3, Azure, GCS); for infrequently accessed data
68
+
69
+ **Watermark semantics:** `high_watermark` triggers background eviction; eviction continues until usage drops below `low_watermark`. Both are percentages (1-100). Set both to 100 to disable eviction. Watermarks are ignored when limit is -1.
70
+
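The watermark rule behaves like a hysteresis switch, which the following sketch models. It is an interpretation of the sentence above, not the tier manager's actual logic: `next_eviction_state` is a hypothetical function, and treating "at or above" the high watermark as the trigger is an assumption.

```python
def next_eviction_state(
    usage_pct: float, high: int, low: int, evicting: bool, limit: int
) -> bool:
    """Return True if background eviction should be active for this sample."""
    if limit == -1:
        return False  # no limit configured: watermarks are ignored
    if high >= 100 and low >= 100:
        return False  # both watermarks at 100 disables eviction
    if not evicting:
        return usage_pct >= high  # start only once the high watermark is hit
    return usage_pct >= low  # keep going until usage drops below low
```

The gap between the two watermarks is what prevents eviction from toggling on and off when usage hovers near a single threshold.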
71
+ **Default tier strategy format:** `VRAM <priority>, RAM <priority>, DISK0 <priority>, PERSIST <priority>` — priority 1 (lowest, first evicted) to 9 (highest, last evicted), 10 = unevictable.
72
+
73
+ ## WAL (Write-Ahead Log)
74
+
75
+ - `wal.sync_policy`: `none` (disabled) | `background` (periodic) | `flush` (per-operation, survives DB crash) | `fsync` (per-operation, survives OS crash)
76
+ - `wal.checksum`: integrity protection on WAL entries
77
+ - `wal.truncate_corrupt_tables_on_start`: auto-truncate corrupt tables on replay (vs. manual REPAIR TABLE)
78
+
79
+ ## Alert Thresholds
80
+
81
+ - `alert_memory_percentage` — comma-separated thresholds (e.g., `1, 5, 10, 20`) for low-memory alerts
82
+ - `alert_disk_percentage` — same for low-disk alerts
83
+ - `heartbeat_interval` / `heartbeat_timeout` / `heartbeat_missed_limit` — host failure detection timing
84
+
85
+ ## Key Gotchas
86
+
87
+ - **`-1` means different things:** For thread counts = auto-detect; for tier limits = no limit (ignore watermarks); for `default_ttl` = disabled
88
+ - **`default_ttl`** is in MINUTES — non-protected tables are auto-deleted after this time. A value of 20 means tables without explicit TTL override vanish after 20 minutes.
89
+ - **`load_vectors_on_start = on_demand`** means data loads lazily — first queries on cold data will be slower
90
+ - **Rank 0** is the head/coordinator node with minimal RAM allocation (~10%); it does NOT hold data. Worker ranks (1+) hold all data.
91
+ - **`execution_mode = device`** silently falls back to CPU when no GPUs are present — no error is raised
92
+ - **7.2.x missing parameters:** `sm_omp_threads`, `kernel_omp_threads` do NOT exist — use `worker_endpoint_threads`, `subtask_concurrency_limit`, `tcs_per_tom` instead
93
+ - **Config changes require restart** unless the parameter is also a runtime system property (check via `kinetica_get_system_properties`)
@@ -0,0 +1,89 @@
1
+ ---
2
+ title: Mutation Safety Rules
3
+ category: mutation-policy
4
+ keywords:
5
+ [
6
+ mutation,
7
+ safety,
8
+ admin-rebalance,
9
+ alter-system-properties,
10
+ alter-configuration,
11
+ never-propose,
12
+ ai_api_key,
13
+ cache-clearing,
14
+ worker-restart,
15
+ aggressiveness,
16
+ ]
17
+ ---
18
+
19
+ ## Overview
20
+
21
+ Safety contract the agent must follow before and during Round 4
22
+ (Mutation Proposal) of the investigation protocol. These rules combine
23
+ version-specific Kinetica 7.2.x facts with operational policy — every
24
+ mutation tool call is subject to them.
25
+
26
+ ## Pre-Mutation Checklist
27
+
28
+ BEFORE proposing any mutation:
29
+
30
+ 1. Always run `kinetica_health_check` first — do not mutate an unhealthy
31
+ cluster.
32
+ 2. For `kinetica_admin_rebalance`: check `kinetica_cluster_status` for
33
+ active rebalance/add/remove operations — never propose rebalance
34
+ when one is already running.
35
+ 3. For config changes: use `kinetica_get_system_properties` to read the
36
+ current value BEFORE proposing a change (so the report can show a
37
+ meaningful before/after diff).
38
+
39
+ ## NEVER Propose
40
+
41
+ - `/clear/table` or `/clear/tablemonitor` as cache-clearing operations —
42
+ these DELETE DATA permanently in Kinetica. They are not caches.
43
+ - Setting `ai_api_key` via `kinetica_alter_system_properties` — this is
44
+ a credential that would appear in audit logs.
45
+ - Setting `external_files_directory` — filesystem path; potential path
46
+ traversal concern.
47
+ - Setting `flush_to_disk` — can trigger an expensive I/O storm.
48
+ - Worker restart — no REST API exists in Kinetica 7.2. Tell the
49
+ operator to run `gadmin restart rank <N>` manually instead.
50
+ - Cache clearing — no safe API exists in Kinetica 7.2. Recommend
51
+ query-side solutions (rewriting the query, adding an index, bumping
52
+ resource group limits) instead of trying to clear caches.
53
+
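The property-related entries in the list above can be expressed as a deny-list screen that runs before any proposal is drafted. This is a sketch only: `screen_property_change` is a hypothetical helper, it covers just the three property names named above, and it does not replace the tool's own allow-list.

```python
# Reasons paraphrased from the NEVER Propose list.
NEVER_SET = {
    "ai_api_key": "credential that would appear in audit logs",
    "external_files_directory": "filesystem path; potential path traversal concern",
    "flush_to_disk": "can trigger an expensive I/O storm",
}


def screen_property_change(name: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed system property change."""
    reason = NEVER_SET.get(name.strip().lower())
    if reason is not None:
        return False, f"refusing to propose {name!r}: {reason}"
    return True, "not on the never-propose list (allow-list still applies)"
```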
54
+ ## For `kinetica_admin_rebalance`
55
+
56
+ - Recommend aggressiveness 1–3 during production hours (reduces query
57
+ latency impact).
58
+ - Recommend aggressiveness 4–5 during maintenance windows only.
59
+ - Warn the operator: rebalance causes "delayed query responses" while
60
+ running.
61
+ - Check `kinetica_cluster_status` for active jobs before proposing.
62
+ - On single-worker-rank clusters (rank 0 + 1 worker), rebalance
63
+ returns "Database must be offline" — rebalance is only meaningful
64
+ with 2+ worker ranks.
65
+
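The rebalance rules above, together with the pre-mutation checklist, can be condensed into a preflight function. This is a hypothetical sketch: the inputs are assumed to have been gathered already from `kinetica_health_check` and `kinetica_cluster_status`, and the function only encodes the policy, it does not call any tool.

```python
def rebalance_preflight(
    healthy: bool,
    active_operations: list,
    worker_ranks: int,
    maintenance_window: bool,
) -> tuple[list, list]:
    """Return (blockers, recommended_aggressiveness_levels)."""
    blockers = []
    if not healthy:
        blockers.append("cluster is unhealthy")
    if active_operations:
        blockers.append("rebalance/add/remove operation already running")
    if worker_ranks < 2:
        blockers.append("rebalance needs 2+ worker ranks")
    # 1-3 during production hours, 4-5 only inside a maintenance window.
    levels = [4, 5] if maintenance_window else [1, 2, 3]
    return blockers, levels
```

A proposal should go ahead only when the blocker list is empty, and the report should still warn the operator about delayed query responses.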
66
+ ## For `kinetica_alter_system_properties`
67
+
68
+ - The tool enforces an allow-list of 43 documented properties —
69
+ unsupported names are rejected before the API call.
70
+ - Prefer changing `subtask_concurrency_limit`, `tcs_per_tom`, or
71
+ `tps_per_tom` for concurrency tuning.
72
+ - NOTE: `sm_omp_threads` and `kernel_omp_threads` do NOT exist in
73
+ Kinetica 7.2.x (not in the allow-list).
74
+ - Avoid `chunk_size` changes without DBA review — affects all query
75
+ performance.
76
+ - `request_timeout` changes affect ALL endpoints system-wide.
77
+
78
+ ## For `kinetica_alter_configuration`
79
+
80
+ - ALWAYS read the current config via `kinetica_show_configuration`
81
+ first.
82
+ - Make targeted edits to specific lines — never compose a config from
83
+ scratch.
84
+ - Submit the full modified `config_string` (the entire file is
85
+ replaced).
86
+ - Changes require a service restart to take effect — inform the
87
+ operator.
88
+ - This tool contacts the host manager (port 9300), not the DB engine
89
+ (port 9191).
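The "targeted edit, then submit the full file" rule can be sketched as a single-line substitution on the config string that was read first. `set_config_value` is a hypothetical helper and assumes plain `key = value` lines; it refuses to proceed unless exactly one uncommented line matches, so a typo in the key never produces a silently unchanged or doubly edited file.

```python
import re


def set_config_value(config_string: str, key: str, value: str) -> str:
    """Replace the value on the single 'key = value' line; keep all else."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", re.MULTILINE
    )
    # A function replacement avoids backslash handling in the new value.
    updated, count = pattern.subn(lambda m: m.group(1) + value, config_string)
    if count != 1:
        raise ValueError(f"expected exactly one line for {key!r}, found {count}")
    return updated
```

The returned string is the entire file with one line changed, which matches the requirement that the full `config_string` is submitted.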
@@ -0,0 +1,54 @@
1
+ ---
2
+ title: Kinetica Rank Architecture
3
+ category: cluster-topology
4
+ keywords: [ranks, rank-0, head, coordinator, worker, shards, metrics-interpretation, asymmetry]
5
+ ---
6
+
7
+ ## Overview
8
+
9
+ Kinetica uses a rank-based distributed architecture. A single host
10
+ typically runs one head rank (rank 0) plus one or more worker ranks
11
+ (rank 1, rank 2, …). Understanding the asymmetry between rank 0 and
12
+ worker ranks is essential to correctly interpreting metrics and
13
+ resource-group reports.
14
+
15
+ ## Rank 0 — Head / Coordinator
16
+
17
+ - Stores only **metadata** (~4 MB RAM steady-state).
18
+ - Has **no** `PERSIST` / `DISK` / `VRAM` tiers configured — data
19
+ tiers live on worker ranks.
20
+ - Has **no** resource objects (nothing to place in tiers).
21
+ - Has **no** `rank_usage` entry in resource groups.
22
+ - Much lower RAM limit than worker ranks, typically ~750 MB.
23
+ - Responsible for coordinating queries, query planning, and routing
24
+ requests to worker ranks.
25
+
26
+ ## Rank 1+ — Workers / Data Nodes
27
+
28
+ - Hold the actual user data.
29
+ - Have full tier configuration (RAM / PERSIST / DISK / VRAM as
30
+ configured in `gpudb.conf`).
31
+ - All 16,384 shards map to worker ranks (rank 0 holds no shards).
32
+ - RAM limits typically 5+ GB per rank.
33
+
34
+ ## Interpreting Metrics — Key Rule
35
+
36
+ **Rank 0's low resource usage is normal — it is NOT a sign of
37
+ imbalance or a failing node.**
38
+
39
+ When reviewing `kinetica_get_metrics` or `kinetica_node_details`:
40
+
41
+ - Compare worker ranks against each other — rank 1 vs rank 2 vs …
42
+ - Do NOT compare rank 0 against worker ranks; the asymmetry will
43
+ always make rank 0 look "idle".
44
+ - Do NOT propose rebalance because rank 0 has less data than workers
45
+ — that is the expected topology.
46
+
47
+ Similarly, when a resource group report shows no `rank_usage` for
48
+ rank 0, that is correct — nothing runs in resource groups on the head
49
+ rank.
50
+
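The key rule can be sketched as a comparison that drops rank 0 before measuring spread. `worker_spread` is a hypothetical helper, and the input is assumed to be a mapping from rank number to one usage figure (for example RAM used), however that was collected.

```python
def worker_spread(usage_by_rank: dict[int, float]) -> float:
    """Return max minus min usage across worker ranks, ignoring rank 0."""
    workers = [usage for rank, usage in usage_by_rank.items() if rank != 0]
    if len(workers) < 2:
        return 0.0  # nothing to compare on a single-worker cluster
    return max(workers) - min(workers)
```

With rank 0 excluded, a large spread reflects real imbalance between data-holding ranks and not the expected head/worker asymmetry.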
51
+ ## Single-Worker Clusters
52
+
53
+ Rebalance requires 2+ worker ranks — see `version-quirks-7.2.md` for
54
+ the exact `/admin/rebalance` precondition and error message.
@@ -0,0 +1,78 @@
1
+ ---
2
+ title: Kinetica ALTER TABLE — Column Property Syntax
3
+ category: sql-syntax
4
+ keywords:
5
+ [
6
+ alter-table,
7
+ alter-column,
8
+ modify-column,
9
+ dict,
10
+ text-search,
11
+ compress,
12
+ column-properties,
13
+ shard-key,
14
+ kinetica-alter-table-columns,
15
+ ]
16
+ ---
17
+
18
+ ## Overview
19
+
20
+ Kinetica's `ALTER TABLE` syntax for column properties differs from
21
+ standard SQL in two non-obvious ways:
22
+
23
+ 1. Column properties live INSIDE the type parentheses, not as trailing
24
+ clauses.
25
+ 2. There is no `SET`/`ADD`/`DROP` for individual properties — every
26
+ change requires repeating the FULL column definition.
27
+
28
+ ## Single-Column Changes
29
+
30
+ ```sql
31
+ -- Add DICT encoding to an existing column (repeat full definition):
32
+ ALTER TABLE [schema.]table_name
33
+ ALTER COLUMN column_name VARCHAR(size, DICT) [NOT NULL]
34
+
35
+ -- Equivalent MODIFY syntax:
36
+ ALTER TABLE [schema.]table_name
37
+ MODIFY COLUMN column_name VARCHAR(size, DICT) [NOT NULL]
38
+
39
+ -- Remove DICT encoding (omit DICT from definition):
40
+ ALTER TABLE [schema.]table_name
41
+ ALTER COLUMN column_name VARCHAR(size) [NOT NULL]
42
+ ```
43
+
44
+ ## Multiple Column Changes
45
+
46
+ Multiple alterations on the same table can be bundled in a single
47
+ statement:
48
+
49
+ ```sql
50
+ ALTER TABLE [schema.]table_name
51
+ ALTER COLUMN col1 VARCHAR(50, DICT),
52
+ ALTER COLUMN col2 VARCHAR(100, TEXT_SEARCH) NOT NULL,
53
+ ALTER COLUMN col3 INT(DICT)
54
+ ```
55
+
56
+ **For agent use:** when recommending 2+ column changes on one table,
57
+ prefer the `kinetica_alter_table_columns` tool — it composes this
58
+ bundled statement automatically and surfaces an interactive checklist
59
+ for operator approval.
60
+
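For illustration, the bundled statement can be composed with a couple of string helpers. This is not the `kinetica_alter_table_columns` implementation, only a sketch of the syntax rules: `column_clause` and `alter_table` are hypothetical, and they perform no identifier quoting or validation.

```python
def column_clause(name, base_type, size=None, properties=(), not_null=False):
    """Build one ALTER COLUMN clause with properties inside the parentheses."""
    args = ([str(size)] if size is not None else []) + list(properties)
    type_sql = f"{base_type}({', '.join(args)})" if args else base_type
    clause = f"ALTER COLUMN {name} {type_sql}"
    return clause + " NOT NULL" if not_null else clause


def alter_table(table, clauses):
    """Bundle several column clauses into a single ALTER TABLE statement."""
    if not clauses:
        raise ValueError("at least one column clause is required")
    return f"ALTER TABLE {table}\n    " + ",\n    ".join(clauses)


statement = alter_table(
    "example_table",
    [
        column_clause("col1", "VARCHAR", 50, ["DICT"]),
        column_clause("col2", "VARCHAR", 100, ["TEXT_SEARCH"], not_null=True),
        column_clause("col3", "INT", properties=["DICT"]),
    ],
)
```

Because every clause is built from the full definition (type, size, properties, nullability), the helper cannot emit a partial `SET`-style change.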
61
+ ## Key Rules
62
+
63
+ - **Properties inside parentheses:** `VARCHAR(50, DICT)` —
64
+ NOT `VARCHAR(50) DICT`. Placing the property outside the parens is a
65
+ syntax error.
66
+ - **Full definition required:** type, size, properties, nullability
67
+ must all be repeated. There is no `ALTER COLUMN col SET DICT` syntax.
68
+ - **Available column properties:** `DICT`, `TEXT_SEARCH`,
69
+ `COMPRESS(type)`, `IPV4`, `NORMALIZE`, `INIT_WITH_NOW`,
70
+ `INIT_WITH_UUID`, `UPDATE_WITH_NOW`.
71
+ - **Cascade behavior:** Dependent views, materialized views, and SQL
72
+ procedures are DROPPED when a referenced column is altered. Warn
73
+ operators before proposing ALTER COLUMN on a column with known
74
+ dependencies — check `ki_catalog.ki_depend` first.
75
+ - **Shard keys are immutable** — check `is_shard_key` in
76
+ `ki_catalog.ki_columns` (or `properties` in `kinetica_show_table`)
77
+ before proposing ALTER COLUMN. See `version-quirks-7.2.md` for the
78
+ full rule.