@kinetica/admin-agent 0.2.1 → 0.2.2

package/README.md CHANGED
@@ -48,7 +48,7 @@ Built with the [Claude Agent SDK](https://docs.anthropic.com/en/docs/agents-and-
48
48
 
49
49
  - Autonomous multi-round investigation with parallel tool calls
50
50
  - 16 read-only diagnostic tools + 4 mutation tools with interactive approval + 2 self-managing tools (reporting, batch-column alter) = **22 live tools**, plus 6 offline bundle-analysis tools = **28 total**
51
- - **Offline support-bundle analysis** — diagnose from an extracted `gpudb_sysinfo` bundle (per-rank logs, `gpudb.conf`, host diagnostics) with no live connection, or attach a bundle alongside a live session to cross-check captured history against current state
51
+ - **Offline support-bundle analysis** — diagnose from an extracted `gpudb_sysinfo` bundle (per-rank logs, `gpudb.conf`, host diagnostics) with no live connection, or attach a bundle alongside a live session to cross-check captured history against current state — even bundles that don't match the standard layout, via file-name and content inference
52
52
  - Expert knowledge via pluggable playbooks (no code required to add new ones)
53
53
  - Schema-aware SQL — discovers actual column names at startup, never guesses
54
54
  - HTTPS-first URL resolution with explicit consent required before any HTTP fallback
@@ -243,6 +243,8 @@ A bundle and a live connection are **composable capabilities, not exclusive mode
243
243
 
244
244
  **Every rank, however its logs were captured.** A bundle can carry per-rank logs in two forms: full rolling logs for the ranks on the collector's own host (`logs-local/`, including rotated history like `….log.1`), and centralized Loki/promtail exports for the entire cluster (`logs/rank0.log` … `rankN.log`, plus `hostmanager.log` and per-component tails). The agent reads both transparently — it identifies each rank from either source, prefers the richer rolling log when a rank has both, and falls back to the centralized export for ranks that live on other hosts. So on a multi-node cluster you can investigate **all** ranks (and the host manager), not just the ones local to where the bundle was collected. The centralized exports are JSON-wrapped on disk; the tools unwrap them automatically, so severity filters and timelines behave identically across both formats. `kinetica_bundle_list_files` reports the true rank count under `ranks_present` — trust it rather than guessing from `logs-local/`.
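
The source preference described above (rolling log first, centralized export as the fallback) can be sketched roughly as follows. This is an illustrative sketch, not the package's code: `pickRankSources` and the entry shape `{ relPath, kind, rank }` are hypothetical, the file paths are made up for the example, and it keeps a single file per rank, so it ignores rotated history.

```javascript
// Hypothetical helper (not part of the package): choose one log source per rank.
// Entries loosely mimic classifier output: { relPath, kind, rank }.
function pickRankSources(files) {
  const byRank = new Map();
  for (const f of files) {
    if (f.rank === undefined) continue;
    if (f.kind !== "core-log" && f.kind !== "loki-tail") continue;
    const current = byRank.get(f.rank);
    // A rolling core log replaces a centralized export; never the reverse.
    if (!current || (current.kind === "loki-tail" && f.kind === "core-log")) {
      byRank.set(f.rank, f);
    }
  }
  return byRank;
}

// Example paths are assumptions chosen to resemble the layout described above.
const chosen = pickRankSources([
  { relPath: "logs/rank0.log", kind: "loki-tail", rank: "r0" },
  { relPath: "logs-local/core-gpudb-rolling-r0.log", kind: "core-log", rank: "r0" },
  { relPath: "logs/rank1.log", kind: "loki-tail", rank: "r1" },
]);

for (const [rank, file] of chosen) {
  console.log(rank, file.kind, file.relPath);
}
```

Here `r0` resolves to its rolling log because both sources exist, while `r1` (a rank on another host in this example) falls back to the centralized export.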
245
245
 
246
+ **Bundles that don't match the expected shape.** Not every bundle is a clean `gpudb_sysinfo` capture — a customer may hand over a flat logs-only dump, a differently-named collector's output, or a partial directory. The agent infers each file's type from its name, and for files whose names give nothing away it sniffs a bounded slice of their content against the same log/config/sysinfo parsers. So a rolling log shipped without the canonical `core-` prefix, or a host-manager `.out` capture, is still recognized, searchable, and rank-attributed rather than silently dropped. `kinetica_bundle_list_files` reports a `layout_match` verdict (`canonical` / `partial` / `unfamiliar`), a per-file confidence (`exact` / `inferred` / `weak`), and any files it couldn't place — and the operator gets a startup warning when a bundle is off-shape — so an inference is never passed off as certainty. Classification depends only on file names and contents, never on what the bundle directory itself is named.
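
The name-based part of that inference can be approximated with a first-match-wins list of checks. The sketch below is a simplification under stated assumptions: the three regexes are copied from the 0.2.2 bundle shown later in this diff, but `classifyByName` is a hypothetical reduction to three tiers and omits the config, Loki, directory, and content-sniffing steps.

```javascript
// Regexes as they appear in the 0.2.2 bundle in this diff.
const ROLLING_ID_RE = /(?:core-)?gpudb-rolling-(r\d+|hm)\.log(?:\.\d+)?$/;
const HM_TOKEN_RE = /host-?manager/i;
const LOGISH_RE = /\.(?:log|out|err)(?:\.\d+)?$/i;

// Hypothetical, simplified classifier: the first matching tier wins.
function classifyByName(base) {
  const rolling = ROLLING_ID_RE.exec(base);
  if (rolling) {
    // Canonical rolling log, with or without the core- prefix.
    return { kind: "core-log", confidence: "exact", id: rolling[1] };
  }
  if (HM_TOKEN_RE.test(base) && LOGISH_RE.test(base)) {
    // Name mentions the host manager and has a log-like extension.
    return { kind: "component-log", confidence: "inferred", service: "host-manager" };
  }
  if (LOGISH_RE.test(base)) {
    // Only the extension suggests a log.
    return { kind: "component-log", confidence: "weak" };
  }
  return { kind: "unknown", confidence: "weak" };
}

console.log(classifyByName("gpudb-rolling-r2.log.1")); // core-log, exact, id "r2"
console.log(classifyByName("hostmanager-node1.out"));  // component-log, inferred
console.log(classifyByName("server.err"));             // component-log, weak
console.log(classifyByName("notes.md"));               // unknown, weak
```

Ordering matters: the host-manager check has to run before the generic extension fallback, otherwise the `service` tag would be lost.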
247
+
246
248
  Anthropic authentication still runs in bundle mode; only the interactive Kinetica credential collection is skipped (there may be no live DB to connect to). See [Offline Bundle Analysis](#offline-bundle-analysis-read-only) for the tools, and [CLAUDE.md](CLAUDE.md) for the parser/architecture details.
247
249
 
248
250
  ## CLI Flags
@@ -330,14 +332,14 @@ The `--bundle` flag points the agent at an **extracted** support-bundle director
330
332
 
331
333
  Available against an extracted `gpudb_sysinfo` support bundle (see [Offline Bundle Mode](#offline-bundle-mode)). All read-only; the search/timeline tools stream and bound their output so a large rank log (tens of MB, hundreds of thousands of lines) never blows up the context.
332
334
 
333
- | Tool | Description |
334
- | ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------- |
335
- | `kinetica_load_bundle` | Attach an extracted bundle directory; without a path it opens a directory picker (a model-supplied path needs operator confirmation) |
336
- | `kinetica_bundle_list_files` | Inventory: detected version, ranks + services present, file counts/sizes by kind — call this first |
337
- | `kinetica_bundle_log_timeline` | Per-time-bucket severity counts across ranks (the incident shape) — call before searching |
338
- | `kinetica_bundle_search_logs` | Bounded log search by regex, min-severity, time window, and rank / host-manager / component (reads both rolling and Loki-export logs) |
339
- | `kinetica_bundle_read_config` | Read the bundle's real on-disk `gpudb.conf`, with optional section/key filter |
340
- | `kinetica_bundle_read_sysinfo` | OS/process/version diagnostic files (memory, CPU, disk, GPU, network, process args) |
335
+ | Tool | Description |
336
+ | ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
337
+ | `kinetica_load_bundle` | Attach an extracted bundle directory; without a path it opens a directory picker (a model-supplied path needs operator confirmation) |
338
+ | `kinetica_bundle_list_files` | Inventory: detected version, ranks + services present, file counts/sizes by kind, plus a layout-match verdict + per-file confidence for off-shape bundles — call this first |
339
+ | `kinetica_bundle_log_timeline` | Per-time-bucket severity counts across ranks (the incident shape) — call before searching |
340
+ | `kinetica_bundle_search_logs` | Bounded log search by regex, min-severity, time window, and rank / host-manager / component (reads both rolling and Loki-export logs) |
341
+ | `kinetica_bundle_read_config` | Read the bundle's real on-disk `gpudb.conf`, with optional section/key filter |
342
+ | `kinetica_bundle_read_sysinfo` | OS/process/version diagnostic files (memory, CPU, disk, GPU, network, process args) |
341
343
 
342
344
  ### Reporting
343
345
 
@@ -427,7 +429,7 @@ References provide domain knowledge (not diagnostic runbooks). Create a `.md` fi
427
429
  - `sql-create-index` — column index syntax, chunk skip index, when to use which
428
430
  - `version-quirks-7.2` — endpoint/property differences between 7.2.x and earlier releases
429
431
 
430
- Plus a **bundle-scoped reference** (`support-bundle` — bundle layout, the two per-rank log families, raw + Loki-JSONL log-line formats, severity ordering, file parsing, crash-SQL forensics) that lives in `knowledge/references/bundle/`. It loads in **every** session — even a pure live one — so that a bundle attached mid-session via `kinetica_load_bundle` has its parsing knowledge ready in the (build-once) prompt; the corpus is cached, so the cost to a session that never attaches a bundle is negligible.
432
+ Plus a **bundle-scoped reference** (`support-bundle` — bundle layout, the two per-rank log families, raw + Loki-JSONL log-line formats, severity ordering, file parsing, crash-SQL forensics, and how to work an off-shape bundle via the `layout_match`/confidence signals) that lives in `knowledge/references/bundle/`. It loads in **every** session — even a pure live one — so that a bundle attached mid-session via `kinetica_load_bundle` has its parsing knowledge ready in the (build-once) prompt; the corpus is cached, so the cost to a session that never attaches a bundle is negligible.
431
433
 
432
434
  > **Heads up — prompt budget:** all playbooks and references are front-loaded into a single system prompt at startup, so its token cost grows with the knowledge corpus. A startup tripwire (`agent/prompt-budget.ts`) prints the assembled prompt size under `DEBUG` and warns on stderr once it exceeds ~20,000 estimated tokens. Current baseline is ~13.4k tokens (6 playbooks + 9 references). If you add substantial knowledge and trip that warning, treat it as the cue to switch from "load everything" to keyword-based playbook selection.
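
A tripwire of this kind can be sketched in a few lines. This is not the contents of `agent/prompt-budget.ts`: `estimateTokens`, `checkPromptBudget`, and the four-characters-per-token heuristic are assumptions made for illustration; only the ~20,000-token threshold and the debug/stderr behaviour come from the note above.

```javascript
// Assumption: roughly 4 characters per token. The real estimator may differ.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Hypothetical tripwire: report the size when debugging, warn once over budget.
function checkPromptBudget(prompt, opts = {}) {
  const warnAt = opts.warnAt ?? 20000;
  const log = opts.log ?? console.error; // stderr by default
  const estimatedTokens = estimateTokens(prompt);
  const overBudget = estimatedTokens > warnAt;
  if (opts.debug) {
    log(`system prompt: ~${estimatedTokens} estimated tokens`);
  }
  if (overBudget) {
    log(`warning: system prompt exceeds ~${warnAt} estimated tokens`);
  }
  return { estimatedTokens, overBudget };
}

// Usage: a prompt of 80,004 characters estimates to 20,001 tokens and trips the warning.
const result = checkPromptBudget("a".repeat(80004), { debug: true });
console.log(result.overBudget);
```

Keeping the check as a pure function that returns its verdict makes it easy to test without capturing stderr.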
433
435
 
@@ -3875,248 +3875,8 @@ var import_claude_agent_sdk4 = require("@anthropic-ai/claude-agent-sdk");
3875
3875
  // src/tools/bundle/list-files.ts
3876
3876
  var import_zod18 = require("zod");
3877
3877
 
3878
- // src/bundle/known-files.ts
3879
- var KNOWN_BUNDLE_FILES = {
3880
- // Host resources
3881
- "cpu.txt": "CPU topology, NUMA, and interrupts (lscpu, numactl, /proc/cpuinfo, /proc/interrupts)",
3882
- "mem.txt": "Memory usage, /proc/meminfo, and transparent-hugepage setting (free -m -t)",
3883
- "disk.txt": "Filesystems, mounts, block devices, and disk stats (df, mount, lsblk, fdisk, /etc/fstab, /proc/diskstats)",
3884
- "gpu.txt": "NVIDIA GPU inventory and state (nvidia-smi -L/-q, modinfo nvidia)",
3885
- "net.txt": "Network interfaces, sockets, and DNS (hostname, ifconfig, netstat, /etc/resolv.conf)",
3886
- // Processes
3887
- "ps.txt": "Full process list (ps -auxww, ps -ejHlfww)",
3888
- "gpudb-exe.txt": "Running gpudb processes (ps auxfwww | grep gpudb)",
3889
- // Hardware / firmware
3890
- "dmidecode.txt": "BIOS / DMI hardware inventory (dmidecode)",
3891
- "lshw.txt": "Hardware listing (lshw -short -numeric)",
3892
- "pci.txt": "PCI devices and I/O resources (lspci, /proc/ioports, /proc/iomem)",
3893
- // Kernel / OS
3894
- "dmesg.txt": "Kernel ring buffer \u2014 boot and runtime kernel messages (dmesg -T)",
3895
- "dmesg-timestamp.txt": "Kernel ring buffer with human-readable timestamps",
3896
- "sysctl.txt": "Kernel tunables (sysctl -a)",
3897
- "sys.txt": "OS identity, uptime, ulimits, kernel cmdline, clocksource, and loaded modules (uname, ulimit, /proc/cmdline, lsmod)",
3898
- "lsof.txt": "Open files and network sockets (lsof -n -P)",
3899
- "lslocks.txt": "Held file locks (lslocks)",
3900
- // Packages / linker / accounts
3901
- "deb.txt": "Installed Debian packages and verification (dpkg -l, dpkg -V)",
3902
- "rpm.txt": "Installed RPM packages (rpm -qa)",
3903
- "ld.so.conf.txt": "Dynamic-linker library search paths (/etc/ld.so.conf)",
3904
- "user.txt": "Users, groups, and the gpudb service account (whoami, id, /etc/passwd, /etc/group)",
3905
- "sudoers.txt": "Sudo configuration (/etc/sudoers)",
3906
- "etc_profile.txt": "Login shell profile (/etc/profile)",
3907
- "etc_bashrc.txt": "System bashrc (/etc/bashrc)",
3908
- "etc_host.txt": "Static hostname resolution (/etc/hosts)",
3909
- // Kinetica-specific
3910
- "gpudb.txt": "GPUdb version/build, binary md5 + ldd, and the captured gpudb.conf / gpudb_logger.conf ($GPUDB_EXE -v)",
3911
- "gpudb_core_etc_gpudb.conf": "The live gpudb.conf at capture time (the database's main config)",
3912
- "gpudb_core_etc_gpudb_logger.conf": "The logging configuration (gpudb_logger.conf)",
3913
- "loki-info.txt": "Loki log-index stats: labels, series, and per-class volume (logcli)",
3914
- "sql-queries.txt": "SQL query log extracted from Loki (logcli)",
3915
- "tables.txt": "Table schemas and column types (gadmin --schema), when collected",
3916
- "logfiles.txt": "Manifest: the log directories/files the collector enumerated",
3917
- "errors.txt": "Collection commands that FAILED during capture (Evidence Gaps)",
3918
- "proc-logs-erros.txt": "Per-process log-collection failures during capture (Evidence Gaps)"
3919
- };
3920
- var KIND_DESCRIPTIONS = {
3921
- "core-log": "Per-rank rolling Kinetica core log (the primary incident narrative)",
3922
- "component-log": "Component service log (sql-engine, httpd, reveal, tomcat, stats, \u2026)",
3923
- "loki-tail": "Last-2h Loki tail for a service (small; searched only when no core logs exist)",
3924
- "process-info": "Per-rank process snapshot: command line, PID, and environment (/proc/<pid>/environ)",
3925
- config: "Kinetica configuration file",
3926
- "version-info": "GPUdb version/build information",
3927
- "collection-errors": "Collection commands that FAILED during capture (Evidence Gaps)",
3928
- manifest: "Manifest of log directories/files the collector enumerated"
3929
- };
3930
- function basename(relPath) {
3931
- const parts = relPath.split("/");
3932
- return parts[parts.length - 1] ?? relPath;
3933
- }
3934
- function describeBundleFile(entry) {
3935
- return KNOWN_BUNDLE_FILES[basename(entry.relPath)] ?? KIND_DESCRIPTIONS[entry.kind] ?? "";
3936
- }
3937
-
3938
- // src/tools/bundle/list-files.ts
3939
- var BundleListFilesSchema = import_zod18.z.object({
3940
- kind: import_zod18.z.string().optional()
3941
- });
3942
- async function bundleListFiles(source, args = {}) {
3943
- const all = source.listFiles();
3944
- const filtered = args.kind ? all.filter((e) => e.kind === args.kind) : all;
3945
- const { totalFiles, totalBytes, byKind, ranks, services } = source.inventory();
3946
- const version = await source.detectVersion();
3947
- const errors = await source.collectionErrors();
3948
- const files = filtered.map((e) => ({
3949
- file: e.relPath,
3950
- kind: e.kind,
3951
- rank: e.rank ?? "",
3952
- size_kb: Math.round(e.sizeBytes / 1024),
3953
- // What the file contains — so the agent can pick the right one without reading it.
3954
- description: describeBundleFile(e)
3955
- }));
3956
- return {
3957
- ok: true,
3958
- data: {
3959
- detected_version: version ?? "unknown",
3960
- ranks_present: ranks.join(", ") || "none",
3961
- services_present: services.join(", ") || "none",
3962
- total_files: totalFiles,
3963
- total_size_mb: Number((totalBytes / 1e6).toFixed(1)),
3964
- counts_by_kind: byKind,
3965
- failed_collections: errors.length,
3966
- files
3967
- }
3968
- };
3969
- }
3970
-
3971
- // src/tools/bundle/log-timeline.ts
3972
- var import_zod19 = require("zod");
3973
- var BundleLogTimelineSchema = import_zod19.z.object({
3974
- min_severity: import_zod19.z.enum(["INFO", "WARN", "UERR", "ERROR", "FATAL"]).optional(),
3975
- granularity: import_zod19.z.enum(["day", "hour", "minute"]).optional(),
3976
- rank: import_zod19.z.string().describe('Numeric rank only, e.g. "r0"/"r1". For the host manager use host_manager.').optional(),
3977
- host_manager: import_zod19.z.boolean().describe("Bucket the host-manager (hm) log \u2014 a singleton service, not a rank.").optional(),
3978
- component: import_zod19.z.string().optional(),
3979
- include_components: import_zod19.z.boolean().optional()
3980
- });
3981
- async function bundleLogTimeline(source, args = {}) {
3982
- const query3 = {
3983
- ...args.min_severity !== void 0 ? { minSeverity: args.min_severity } : {},
3984
- ...args.granularity !== void 0 ? { granularity: args.granularity } : {},
3985
- ...args.rank !== void 0 ? { rank: args.rank } : {},
3986
- ...args.host_manager !== void 0 ? { hostManager: args.host_manager } : {},
3987
- ...args.component !== void 0 ? { component: args.component } : {},
3988
- ...args.include_components !== void 0 ? { includeComponents: args.include_components } : {}
3989
- };
3990
- const result = await source.logTimeline(query3);
3991
- const severities = [...new Set(result.buckets.flatMap((b) => Object.keys(b.counts)))];
3992
- const order = ["FATAL", "ERROR", "UERR", "WARN", "INFO"];
3993
- severities.sort((a, b) => order.indexOf(a) - order.indexOf(b));
3994
- const rows = result.buckets.map((b) => {
3995
- const row = { time_bucket: b.bucket };
3996
- for (const sev of severities) row[sev] = b.counts[sev] ?? 0;
3997
- row.total = b.total;
3998
- return row;
3999
- });
4000
- return {
4001
- ok: true,
4002
- note: result.totalCounted === 0 ? "No lines at or above the severity threshold \u2014 try a lower min_severity." : `${result.totalCounted} event(s) across ${result.buckets.length} bucket(s), ${result.filesScanned.length} file(s).`,
4003
- data: {
4004
- lines_scanned: result.linesScanned,
4005
- files_scanned: result.filesScanned.join(", ") || "none",
4006
- buckets: rows
4007
- }
4008
- };
4009
- }
4010
-
4011
- // src/tools/bundle/search-logs.ts
4012
- var import_zod20 = require("zod");
4013
- var BundleSearchLogsSchema = import_zod20.z.object({
4014
- regex: import_zod20.z.string().optional(),
4015
- min_severity: import_zod20.z.enum(["INFO", "WARN", "UERR", "ERROR", "FATAL"]).optional(),
4016
- from_ts: import_zod20.z.string().optional(),
4017
- to_ts: import_zod20.z.string().optional(),
4018
- rank: import_zod20.z.string().describe('Numeric rank only, e.g. "r0"/"r1". For the host manager use host_manager.').optional(),
4019
- host_manager: import_zod20.z.boolean().describe("Search the host-manager (hm) log \u2014 a singleton service, not a rank.").optional(),
4020
- component: import_zod20.z.string().optional(),
4021
- include_components: import_zod20.z.boolean().optional(),
4022
- max_matches: import_zod20.z.number().int().min(1).max(1e3).optional()
4023
- });
4024
- async function bundleSearchLogs(source, args = {}) {
4025
- const query3 = {
4026
- ...args.regex !== void 0 ? { regex: args.regex } : {},
4027
- ...args.min_severity !== void 0 ? { minSeverity: args.min_severity } : {},
4028
- ...args.from_ts !== void 0 ? { fromTs: args.from_ts } : {},
4029
- ...args.to_ts !== void 0 ? { toTs: args.to_ts } : {},
4030
- ...args.rank !== void 0 ? { rank: args.rank } : {},
4031
- ...args.host_manager !== void 0 ? { hostManager: args.host_manager } : {},
4032
- ...args.component !== void 0 ? { component: args.component } : {},
4033
- ...args.include_components !== void 0 ? { includeComponents: args.include_components } : {},
4034
- ...args.max_matches !== void 0 ? { maxMatches: args.max_matches } : {}
4035
- };
4036
- const result = await source.searchLogs(query3);
4037
- const note = result.capped ? `Showing ${result.matches.length} of ${result.totalMatched} matches across ${result.filesScanned.length} file(s) (display capped). Narrow with a tighter regex, severity, or time window to surface the specific lines.` : `${result.totalMatched} match(es) across ${result.filesScanned.length} file(s).`;
4038
- return {
4039
- ok: true,
4040
- note,
4041
- data: {
4042
- total_matched: result.totalMatched,
4043
- lines_scanned: result.linesScanned,
4044
- files_scanned: result.filesScanned.join(", ") || "none",
4045
- capped: result.capped,
4046
- matches: result.matches.map((m) => ({
4047
- file: m.file,
4048
- line: m.lineNumber,
4049
- timestamp: m.timestamp ?? "",
4050
- severity: m.severity ?? "",
4051
- rank: m.rank ?? "",
4052
- message: m.message
4053
- }))
4054
- }
4055
- };
4056
- }
4057
-
4058
- // src/tools/bundle/read-config.ts
4059
- var import_zod21 = require("zod");
4060
- var BundleReadConfigSchema = import_zod21.z.object({
4061
- section: import_zod21.z.string().optional(),
4062
- key: import_zod21.z.string().optional()
4063
- });
4064
- async function bundleReadConfig(source, args = {}) {
4065
- const result = await source.readConfig({
4066
- ...args.section !== void 0 ? { section: args.section } : {},
4067
- ...args.key !== void 0 ? { key: args.key } : {}
4068
- });
4069
- if ("error" in result) {
4070
- return { ok: false, status: 0, error: result.error, raw: "" };
4071
- }
4072
- if (result.entries.length === 0 && args.section !== void 0) {
4073
- const all = await source.readConfig(args.key !== void 0 ? { key: args.key } : {});
4074
- const sections = "error" in all ? [] : [...new Set(all.entries.map((e) => e.section))].sort();
4075
- const sectionList = sections.map((s) => s === "" ? "(flat/top-level)" : s).join(", ");
4076
- return {
4077
- ok: true,
4078
- note: `No entries in section "${args.section}" of ${result.file}. gpudb.conf is largely flat \u2014 retry filtering by key only. Sections present: ${sectionList || "(none)"}.`,
4079
- data: { section_not_found: args.section, available_sections: sections }
4080
- };
4081
- }
4082
- return {
4083
- ok: true,
4084
- note: `${result.entries.length} entr(y/ies) from ${result.file}.`,
4085
- data: result.entries.map((e) => ({ section: e.section, key: e.key, value: e.value }))
4086
- };
4087
- }
4088
-
4089
- // src/tools/bundle/read-sysinfo.ts
4090
- var import_zod22 = require("zod");
4091
- var BundleReadSysinfoSchema = import_zod22.z.object({
4092
- name: import_zod22.z.string().min(1)
4093
- });
4094
- async function bundleReadSysinfo(source, args) {
4095
- const result = await source.readSysinfo(args.name);
4096
- if ("error" in result) {
4097
- return { ok: false, status: 0, error: result.error, raw: "" };
4098
- }
4099
- return {
4100
- ok: true,
4101
- data: {
4102
- ...result.header !== void 0 ? { source_file: result.header } : {},
4103
- blocks: result.blocks.map((b) => ({
4104
- command: b.command,
4105
- ...b.exitCode !== void 0 ? { exit_code: b.exitCode } : {},
4106
- output: b.output
4107
- }))
4108
- }
4109
- };
4110
- }
4111
-
4112
- // src/tools/bundle/load-bundle.ts
4113
- var import_zod23 = require("zod");
4114
-
4115
- // src/bundle/verify-bundle.ts
4116
- var import_promises6 = require("fs/promises");
4117
-
4118
3878
  // src/bundle/BundleSource.ts
4119
- var import_promises5 = require("fs/promises");
3879
+ var import_promises6 = require("fs/promises");
4120
3880
  var import_node_path6 = require("path");
4121
3881
 
4122
3882
  // src/bundle/sysinfo-block.ts
@@ -4409,20 +4169,26 @@ async function aggregateTimeline(filePath, query3 = {}) {
4409
4169
  }
4410
4170
 
4411
4171
  // src/bundle/bundle-index.ts
4412
- var import_promises4 = require("fs/promises");
4172
+ var import_promises5 = require("fs/promises");
4413
4173
  var import_node_path5 = require("path");
4414
4174
 
4415
4175
  // src/bundle/classify-file.ts
4416
- var ROLLING_ID_RE = /core-gpudb-rolling-(r\d+|hm)\.log(?:\.\d+)?$/;
4176
+ var ROLLING_ID_RE = /(?:core-)?gpudb-rolling-(r\d+|hm)\.log(?:\.\d+)?$/;
4417
4177
  var EXE_ID_RE = /gpudb-exe-(r\d+|hm)-/;
4418
4178
  var HOST_RE = /\b(node\w+)\b/;
4179
+ var CONF_RE = /\.conf$/i;
4180
+ var CONF_ALT_RE = /\.(cfg|ini)$/i;
4419
4181
  var LOG_RE = /\.log(?:\.\d+)?$/;
4182
+ var LOGISH_RE = /\.(?:log|out|err)(?:\.\d+)?$/i;
4420
4183
  var LOKI_RANK_RE = /^rank(\d+)\.log$/;
4421
4184
  var LOKI_HM_BASE = "hostmanager.log";
4185
+ var HM_TOKEN_RE = /host-?manager/i;
4186
+ var RANK_TOKEN_RE = /(?:\brank[-_]?|\br)(\d{1,2})\b/i;
4187
+ var LOG_DIR_RE = /(?:^|\/)(?:logs|logs-local|log)(?:\/|$)/;
4422
4188
  function rankOrService(id) {
4423
4189
  return id === "hm" ? { service: "host-manager" } : { rank: id };
4424
4190
  }
4425
- function basename2(relPath) {
4191
+ function basename(relPath) {
4426
4192
  const parts = relPath.split("/");
4427
4193
  return parts[parts.length - 1] ?? relPath;
4428
4194
  }
@@ -4434,70 +4200,247 @@ function inferHost(relPath) {
4434
4200
  return HOST_RE.exec(relPath)?.[1] ?? void 0;
4435
4201
  }
4436
4202
  function componentName(base) {
4437
- return base.replace(/\.\d+$/, "").replace(/(\.log)+$/, "").replace(/^core-gpudb-/, "").replace(/^gpudb-/, "").replace(/-node\w+$/, "");
4203
+ return base.replace(/\.\d+$/, "").replace(/(?:\.(?:log|out|err))+$/i, "").replace(/^core-gpudb-/, "").replace(/^gpudb-/, "").replace(/-node\w+$/, "");
4438
4204
  }
4205
+ function cls(kind, confidence, reason, parts = {}) {
4206
+ return {
4207
+ kind,
4208
+ confidence,
4209
+ reason,
4210
+ ...parts.rank !== void 0 ? { rank: parts.rank } : {},
4211
+ ...parts.inferredRank !== void 0 ? { inferredRank: parts.inferredRank } : {},
4212
+ ...parts.service !== void 0 ? { service: parts.service } : {},
4213
+ ...parts.component !== void 0 ? { component: parts.component } : {},
4214
+ ...parts.host !== void 0 ? { host: parts.host } : {}
4215
+ };
4216
+ }
4217
+ var MATCHERS = [
4218
+ // ── Tier A: canonical filenames / locations (exact) ──────────────────────────
4219
+ (c) => CONF_RE.test(c.base) ? cls("config", "exact", "config (.conf)", { host: c.host }) : null,
4220
+ (c) => CONF_ALT_RE.test(c.base) ? cls("config", "inferred", "config-like extension (.cfg/.ini)", { host: c.host }) : null,
4221
+ (c) => c.base === "logfiles.txt" ? cls("manifest", "exact", "collector manifest", { host: c.host }) : null,
4222
+ (c) => c.base === "errors.txt" || c.base.endsWith("erros.txt") ? cls("collection-errors", "exact", "collection-errors summary", { host: c.host }) : null,
4223
+ (c) => c.base === "gpudb.txt" ? cls("version-info", "exact", "gpudb.txt", { host: c.host }) : null,
4224
+ (c) => {
4225
+ const m = EXE_ID_RE.exec(c.base);
4226
+ return m ? cls("process-info", "exact", "gpudb-exe process capture", {
4227
+ ...rankOrService(m[1]),
4228
+ host: c.host
4229
+ }) : null;
4230
+ },
4231
+ (c) => {
4232
+ const m = ROLLING_ID_RE.exec(c.base);
4233
+ if (!m) return null;
4234
+ const reason = c.base.startsWith("core-") ? "core rolling-log pattern" : "rolling-log pattern (no core- prefix)";
4235
+ return cls("core-log", "exact", reason, { ...rankOrService(m[1]), host: c.host });
4236
+ },
4237
+ (c) => {
4238
+ if (c.dir !== "logs" || !LOG_RE.test(c.base)) return null;
4239
+ const lr = LOKI_RANK_RE.exec(c.base);
4240
+ const lokiId = lr ? `r${lr[1]}` : c.base === LOKI_HM_BASE ? "hm" : void 0;
4241
+ return lokiId !== void 0 ? cls("loki-tail", "exact", "Loki per-rank/host-manager export under logs/", {
4242
+ ...rankOrService(lokiId),
4243
+ host: c.host
4244
+ }) : cls("loki-tail", "exact", "Loki component tail under logs/", {
4245
+ component: componentName(c.base),
4246
+ host: c.host
4247
+ });
4248
+ },
4249
+ (c) => c.dir === "logs-local" && LOG_RE.test(c.base) ? cls("component-log", "exact", "component log under logs-local/", {
4250
+ component: componentName(c.base),
4251
+ host: c.host
4252
+ }) : null,
4253
+ // ── Tier B: off-shape name/extension heuristics (inferred) ───────────────────
4254
+ // Host-manager service logs in a flat layout: the rolling-hm log is already caught
4255
+ // above; this catches the service log and the process stdout (.out). This MUST come
4256
+ // before the generic gpudb-prefixed matcher below — both would classify a
4257
+ // `gpudb-host-manager-*.log` as a component-log, but only this one adds the
4258
+ // `service: "host-manager"` tag. Kept separate (not folded into the gpudb matcher) so
4259
+ // a host-manager log WITHOUT a gpudb prefix (e.g. a renamed `hostmanager-*.out`) still
4260
+ // gets the service tag rather than falling through to a plain component-log.
4261
+ (c) => HM_TOKEN_RE.test(c.base) && LOGISH_RE.test(c.base) ? cls("component-log", "inferred", "host-manager service log (name match)", {
4262
+ service: "host-manager",
4263
+ component: componentName(c.base),
4264
+ host: c.host
4265
+ }) : null,
4266
+ // Any other gpudb-prefixed log-ish file in a non-canonical location.
4267
+ (c) => (c.base.startsWith("gpudb") || c.base.startsWith("core-gpudb")) && LOGISH_RE.test(c.base) ? cls("component-log", "inferred", "gpudb log (name match, non-canonical location)", {
4268
+ component: componentName(c.base),
4269
+ host: c.host
4270
+ }) : null,
4271
+ // A log-ish file sitting in a log-named directory, or carrying a rank token.
4272
+ (c) => {
4273
+ if (!LOGISH_RE.test(c.base)) return null;
4274
+ const inLogDir = LOG_DIR_RE.test(c.relPath);
4275
+ const rm = RANK_TOKEN_RE.exec(c.base);
4276
+ if (!inLogDir && !rm) return null;
4277
+ const rank = rm ? `r${rm[1]}` : void 0;
4278
+ const reason = rank ? "log-like file with a rank token" : "log-like file in a log directory";
4279
+ return cls(c.dir === "logs" ? "loki-tail" : "component-log", "inferred", reason, {
4280
+ ...rank !== void 0 ? { rank, inferredRank: true } : { component: componentName(c.base) },
4281
+ host: c.host
4282
+ });
4283
+ },
4284
+ // ── Tier C: extension-only fallbacks (weak) ──────────────────────────────────
4285
+ (c) => c.base.endsWith(".txt") ? cls("os-diag", "weak", "fallback: .txt extension", { host: c.host }) : null,
4286
+ (c) => LOGISH_RE.test(c.base) ? cls("component-log", "weak", "fallback: log-like extension", {
4287
+ component: componentName(c.base),
4288
+ host: c.host
4289
+ }) : null
4290
+ ];
4439
4291
  function classifyFile(relPath) {
4440
- const base = basename2(relPath);
4292
+ const base = basename(relPath);
4441
4293
  const dir = dirOf(relPath);
4442
4294
  const host = inferHost(relPath);
4443
- if (base.endsWith(".conf")) {
4444
- return { kind: "config", ...host ? { host } : {} };
4295
+ const ctx = { relPath, base, dir, ...host !== void 0 ? { host } : {} };
4296
+ for (const matcher of MATCHERS) {
4297
+ const result = matcher(ctx);
4298
+ if (result) return result;
4299
+ }
4300
+ return cls("unknown", "weak", "unrecognized file", { host });
4301
+ }
4302
+
4303
+ // src/bundle/sniff-file.ts
4304
+ var import_promises4 = require("fs/promises");
4305
+ var SNIFF_HEAD_BYTES = 8192;
4306
+ var SNIFF_MAX_LINES = 20;
4307
+ async function readHead(absPath, headBytes) {
4308
+ let fh;
4309
+ try {
4310
+ fh = await (0, import_promises4.open)(absPath, "r");
4311
+ const buf = Buffer.alloc(headBytes);
4312
+ const { bytesRead } = await fh.read(buf, 0, headBytes, 0);
4313
+ return buf.subarray(0, bytesRead).toString("utf-8");
4314
+ } catch {
4315
+ return "";
4316
+ } finally {
4317
+ await fh?.close().catch(() => void 0);
4318
+ }
4319
+ }
4320
+ function refineSysinfoKind(command) {
4321
+ const cmd = command.toLowerCase();
4322
+ if (/-v\b|--version|\bgpudb_logger\b/.test(cmd) && cmd.includes("gpudb")) {
4323
+ return { kind: "version-info", detail: "version command" };
4324
+ }
4325
+ if (/\bps\b|\/proc\/|environ|grep .*gpudb/.test(cmd)) {
4326
+ return { kind: "process-info", detail: "process snapshot command" };
4445
4327
  }
4446
- if (base === "logfiles.txt") {
4447
- return { kind: "manifest", ...host ? { host } : {} };
4328
+ return { kind: "os-diag", detail: "host-diagnostic command" };
4329
+ }
4330
+ function logLineResult(rank, severity, isHm) {
4331
+ if (rank !== void 0) {
4332
+ return { kind: "core-log", reason: `log line parsed (${severity}, rank ${rank})`, rank };
4448
4333
  }
4449
- if (base === "errors.txt" || base.endsWith("erros.txt")) {
4450
- return { kind: "collection-errors", ...host ? { host } : {} };
4334
+ if (isHm) {
4335
+ return {
4336
+ kind: "component-log",
4337
+ reason: `log line parsed (${severity}, host-manager)`,
4338
+ service: "host-manager"
4339
+ };
4451
4340
  }
4452
- if (base === "gpudb.txt") {
4453
- return { kind: "version-info", ...host ? { host } : {} };
4341
+ return { kind: "component-log", reason: `log line parsed (${severity})` };
4342
+ }
4343
+ async function sniffFile(absPath, opts = {}) {
4344
+ const headBytes = opts.headBytes ?? SNIFF_HEAD_BYTES;
4345
+ const maxLines = opts.maxLines ?? SNIFF_MAX_LINES;
4346
+ const text2 = await readHead(absPath, headBytes);
4347
+ if (text2 === "") return void 0;
4348
+ const lines = [];
4349
+ for (const raw of text2.split("\n")) {
4350
+ const trimmed = raw.trim();
4351
+ if (trimmed === "") continue;
4352
+ lines.push(raw);
4353
+ if (lines.length >= maxLines) break;
4354
+ }
4355
+ if (lines.length === 0) return void 0;
4356
+ for (const line of lines) {
4357
+ const m = EXEC_CMD_RE.exec(line.trim());
4358
+ if (m) {
4359
+ const { kind, detail } = refineSysinfoKind(m[1]);
4360
+ return { kind, reason: `EXEC_CMD header (${detail})` };
4361
+ }
4454
4362
  }
4455
- const exeId = EXE_ID_RE.exec(base);
4456
- if (exeId) {
4457
- return { kind: "process-info", ...rankOrService(exeId[1]), ...host ? { host } : {} };
4363
+ const unwrapped = unwrapLokiJsonl(lines[0]);
4364
+ if (unwrapped !== void 0) {
4365
+ const p = parseLogLine(unwrapped);
4366
+ const rank = p.rank;
4367
+ return {
4368
+ kind: "loki-tail",
4369
+ reason: `Loki JSONL record${rank ? ` (rank ${rank})` : ""}`,
4370
+ ...rank !== void 0 ? { rank } : {}
4371
+ };
4458
4372
  }
4459
- if (LOG_RE.test(base)) {
4460
- const rolling = ROLLING_ID_RE.exec(base);
4461
- if (rolling) {
4462
- return { kind: "core-log", ...rankOrService(rolling[1]), ...host ? { host } : {} };
4463
- }
4464
- if (dir === "logs") {
4465
- const lokiRank = LOKI_RANK_RE.exec(base);
4466
- const lokiId = lokiRank ? `r${lokiRank[1]}` : base === LOKI_HM_BASE ? "hm" : void 0;
4467
- if (lokiId !== void 0) {
4468
- return { kind: "loki-tail", ...rankOrService(lokiId), ...host ? { host } : {} };
4469
- }
4470
- return { kind: "loki-tail", component: componentName(base), ...host ? { host } : {} };
4373
+ for (const line of lines) {
4374
+ const p = parseLogLine(line);
4375
+ if (p.severity !== void 0 && severityRank(p.severity) >= 0) {
4376
+ const isHm = p.context?.startsWith("hm/") ?? false;
4377
+ return logLineResult(p.rank, p.severity, isHm);
4471
4378
  }
4472
- return { kind: "component-log", component: componentName(base), ...host ? { host } : {} };
4473
4379
  }
4474
- if (base.endsWith(".txt")) {
4475
- return { kind: "os-diag", ...host ? { host } : {} };
4380
+ const hasSection = lines.some((l) => SECTION_RE.test(l.trim()));
4381
+ if (hasSection && parseIni(text2).length >= 2) {
4382
+ return { kind: "config", reason: "INI section + key/value entries" };
4476
4383
  }
4477
- return { kind: "unknown", ...host ? { host } : {} };
4384
+ return void 0;
4478
4385
  }
4479
4386
 
4480
4387
  // src/bundle/bundle-index.ts
4388
+ async function refineWithContent(c, absPath) {
4389
+ if (c.confidence !== "weak" || c.kind === "os-diag") return c;
4390
+ const sniff = await sniffFile(absPath);
4391
+ if (!sniff) return c;
4392
+ const addsKind = sniff.kind !== c.kind;
4393
+ const addsRank = sniff.rank !== void 0 && c.rank === void 0;
4394
+ const addsService = sniff.service !== void 0 && c.service === void 0;
4395
+ if (!addsKind && !addsRank && !addsService) return c;
4396
+ return {
4397
+ ...c,
4398
+ kind: sniff.kind,
4399
+ confidence: "inferred",
4400
+ reason: `content: ${sniff.reason}`,
4401
+ ...sniff.rank !== void 0 ? { rank: sniff.rank } : {},
4402
+ ...sniff.service !== void 0 ? { service: sniff.service } : {}
4403
+ };
4404
+ }
4481
4405
  async function buildIndex(rootDir) {
4482
4406
  let relPaths;
4407
+ let realRoot;
4483
4408
  try {
4484
- relPaths = await (0, import_promises4.readdir)(rootDir, { recursive: true });
4409
+ relPaths = await (0, import_promises5.readdir)(rootDir, { recursive: true });
4410
+ realRoot = await (0, import_promises5.realpath)(rootDir);
4485
4411
  } catch {
4486
4412
  return [];
4487
4413
  }
4414
+ const dirConfined = /* @__PURE__ */ new Map();
4415
+ const isDirConfined = (dir) => {
4416
+ let verdict = dirConfined.get(dir);
4417
+ if (verdict === void 0) {
4418
+ verdict = (0, import_promises5.realpath)(dir).then(
4419
+ (realDir) => realDir === realRoot || realDir.startsWith(realRoot + import_node_path5.sep),
4420
+ () => false
4421
+ // an unresolvable directory (broken/cyclic symlink) → drop its entries
4422
+ );
4423
+ dirConfined.set(dir, verdict);
4424
+ }
4425
+ return verdict;
4426
+ };
4488
4427
  const settled = await Promise.all(
4489
4428
  relPaths.map(async (rel) => {
4490
4429
  const relPath = rel.split("\\").join("/");
4491
4430
  const absPath = (0, import_node_path5.join)(rootDir, rel);
4492
4431
  try {
4493
- const s = await (0, import_promises4.lstat)(absPath);
4432
+ const s = await (0, import_promises5.lstat)(absPath);
4494
4433
  if (s.isSymbolicLink() || !s.isFile()) return null;
4495
- const c = classifyFile(relPath);
4434
+ if (!await isDirConfined((0, import_node_path5.dirname)(absPath))) return null;
4435
+ const c = await refineWithContent(classifyFile(relPath), absPath);
4496
4436
  return {
4497
4437
  relPath,
4498
4438
  absPath,
4499
4439
  kind: c.kind,
4440
+ confidence: c.confidence,
4441
+ ...c.reason !== void 0 ? { reason: c.reason } : {},
4500
4442
  ...c.rank !== void 0 ? { rank: c.rank } : {},
4443
+ ...c.inferredRank !== void 0 ? { inferredRank: c.inferredRank } : {},
4501
4444
  ...c.service !== void 0 ? { service: c.service } : {},
4502
4445
  ...c.host !== void 0 ? { host: c.host } : {},
4503
4446
  ...c.component !== void 0 ? { component: c.component } : {},
@@ -4513,6 +4456,25 @@ async function buildIndex(rootDir) {
4513
4456
 
4514
4457
  // src/bundle/BundleSource.ts
4515
4458
  var GPUDB_VERSION_RE = /GPUdb version\s*:\s*(\S+)/;
4459
+ var ANCHOR_KINDS = ["config", "version-info"];
4460
+ var MIN_ANCHORS_FOR_CANONICAL = 2;
4461
+ var PARTIAL_INFERRED_FRACTION = 0.25;
4462
+ function assessLayout(inventory) {
4463
+ const anchorsPresent = ANCHOR_KINDS.filter((k) => (inventory.byKind[k] ?? 0) > 0).length;
4464
+ const inferredFraction = inventory.totalFiles > 0 ? inventory.inferredFiles / inventory.totalFiles : 0;
4465
+ let layout;
4466
+ if (anchorsPresent === 0) layout = "unfamiliar";
4467
+ else if (anchorsPresent >= MIN_ANCHORS_FOR_CANONICAL && inferredFraction < PARTIAL_INFERRED_FRACTION)
4468
+ layout = "canonical";
4469
+ else layout = "partial";
4470
+ if (layout === "canonical") return { layout };
4471
+ const bits = [`${inventory.inferredFiles}/${inventory.totalFiles} files classified by inference`];
4472
+ if (inventory.unknownFiles > 0) bits.push(`${inventory.unknownFiles} unclassified`);
4473
+ if (inventory.inferredRanks.length > 0)
4474
+ bits.push(`inferred ranks ${inventory.inferredRanks.join(", ")} (unconfirmed)`);
4475
+ const layoutWarning = layout === "unfamiliar" ? `This bundle does not match the canonical gpudb_sysinfo layout \u2014 no config/version/host-diagnostic files were found. Working from inference: ${bits.join("; ")}.` : `This bundle only partially matches the canonical layout: ${bits.join("; ")}.`;
4476
+ return { layout, layoutWarning };
4477
+ }
4516
4478
  function selectLogFiles(index, opts) {
4517
4479
  if (opts.component !== void 0) {
4518
4480
  return index.filter(
@@ -4567,12 +4529,17 @@ async function createBundleSource(rootDir) {
4567
4529
  const inventoryValue = (() => {
4568
4530
  const byKind = {};
4569
4531
  const rankSet = /* @__PURE__ */ new Set();
4532
+ const inferredRankSet = /* @__PURE__ */ new Set();
4570
4533
  const serviceSet = /* @__PURE__ */ new Set();
4571
4534
  let totalBytes = 0;
4535
+ let inferredFiles = 0;
4536
+ let unknownFiles = 0;
4572
4537
  for (const e of index) {
4573
4538
  byKind[e.kind] = (byKind[e.kind] ?? 0) + 1;
4574
4539
  totalBytes += e.sizeBytes;
4575
- if (e.rank) rankSet.add(e.rank);
4540
+ if (e.confidence === "inferred") inferredFiles++;
4541
+ if (e.kind === "unknown") unknownFiles++;
4542
+ if (e.rank) (e.inferredRank ? inferredRankSet : rankSet).add(e.rank);
4576
4543
  if (e.service) serviceSet.add(e.service);
4577
4544
  }
4578
4545
  return {
@@ -4580,14 +4547,17 @@ async function createBundleSource(rootDir) {
4580
4547
  totalBytes,
4581
4548
  byKind,
4582
4549
  ranks: [...rankSet].sort(),
4583
- services: [...serviceSet].sort()
4550
+ inferredRanks: [...inferredRankSet].filter((r) => !rankSet.has(r)).sort(),
4551
+ services: [...serviceSet].sort(),
4552
+ inferredFiles,
4553
+ unknownFiles
4584
4554
  };
4585
4555
  })();
4586
4556
  const detectVersion = async () => {
4587
4557
  const versionFile = findByKind("version-info");
4588
4558
  if (versionFile) {
4589
4559
  try {
4590
- const parsed = parseSysinfo(await (0, import_promises5.readFile)(versionFile.absPath, "utf-8"));
4560
+ const parsed = parseSysinfo(await (0, import_promises6.readFile)(versionFile.absPath, "utf-8"));
4591
4561
  for (const block of parsed.blocks) {
4592
4562
  const m = GPUDB_VERSION_RE.exec(block.output);
4593
4563
  if (m) return m[1];
@@ -4598,7 +4568,7 @@ async function createBundleSource(rootDir) {
4598
4568
  const configFile = findByKind("config");
4599
4569
  if (configFile) {
4600
4570
  try {
4601
- const entries = parseIni(await (0, import_promises5.readFile)(configFile.absPath, "utf-8"));
4571
+ const entries = parseIni(await (0, import_promises6.readFile)(configFile.absPath, "utf-8"));
4602
4572
  return entries.find((e) => e.key === "file_version")?.value;
4603
4573
  } catch {
4604
4574
  return void 0;
@@ -4610,7 +4580,7 @@ async function createBundleSource(rootDir) {
4610
4580
  const configFile = index.find((e) => e.kind === "config" && e.relPath.endsWith("gpudb.conf")) ?? findByKind("config");
4611
4581
  if (!configFile) return { error: "no gpudb.conf found in bundle" };
4612
4582
  try {
4613
- const entries = parseIni(await (0, import_promises5.readFile)(configFile.absPath, "utf-8"));
4583
+ const entries = parseIni(await (0, import_promises6.readFile)(configFile.absPath, "utf-8"));
4614
4584
  return { entries: filterIni(entries, opts), file: configFile.relPath };
4615
4585
  } catch (err) {
4616
4586
  return { error: err instanceof Error ? err.message : String(err) };
@@ -4624,7 +4594,7 @@ async function createBundleSource(rootDir) {
4624
4594
  const abs = resolve3(entry.relPath);
4625
4595
  if (!abs) return { error: `path "${name}" escapes the bundle root` };
4626
4596
  try {
4627
- return parseSysinfo(await (0, import_promises5.readFile)(abs, "utf-8"));
4597
+ return parseSysinfo(await (0, import_promises6.readFile)(abs, "utf-8"));
4628
4598
  } catch (err) {
4629
4599
  return { error: err instanceof Error ? err.message : String(err) };
4630
4600
  }
@@ -4683,7 +4653,7 @@ async function createBundleSource(rootDir) {
4683
4653
  const lines = [];
4684
4654
  for (const file of files) {
4685
4655
  try {
4686
- const content = await (0, import_promises5.readFile)(file.absPath, "utf-8");
4656
+ const content = await (0, import_promises6.readFile)(file.absPath, "utf-8");
4687
4657
  for (const line of content.split("\n")) {
4688
4658
  const trimmed = line.trim();
4689
4659
  if (trimmed !== "" && !/^-{3,}$/.test(trimmed)) lines.push(trimmed);
@@ -4707,13 +4677,277 @@ async function createBundleSource(rootDir) {
4707
4677
  };
4708
4678
  }
4709
4679
 
4680
+ // src/bundle/known-files.ts
4681
+ var KNOWN_BUNDLE_FILES = {
4682
+ // Host resources
4683
+ "cpu.txt": "CPU topology, NUMA, and interrupts (lscpu, numactl, /proc/cpuinfo, /proc/interrupts)",
4684
+ "mem.txt": "Memory usage, /proc/meminfo, and transparent-hugepage setting (free -m -t)",
4685
+ "disk.txt": "Filesystems, mounts, block devices, and disk stats (df, mount, lsblk, fdisk, /etc/fstab, /proc/diskstats)",
4686
+ "gpu.txt": "NVIDIA GPU inventory and state (nvidia-smi -L/-q, modinfo nvidia)",
4687
+ "net.txt": "Network interfaces, sockets, and DNS (hostname, ifconfig, netstat, /etc/resolv.conf)",
4688
+ // Processes
4689
+ "ps.txt": "Full process list (ps -auxww, ps -ejHlfww)",
4690
+ "gpudb-exe.txt": "Running gpudb processes (ps auxfwww | grep gpudb)",
4691
+ // Hardware / firmware
4692
+ "dmidecode.txt": "BIOS / DMI hardware inventory (dmidecode)",
4693
+ "lshw.txt": "Hardware listing (lshw -short -numeric)",
4694
+ "pci.txt": "PCI devices and I/O resources (lspci, /proc/ioports, /proc/iomem)",
4695
+ // Kernel / OS
4696
+ "dmesg.txt": "Kernel ring buffer \u2014 boot and runtime kernel messages (dmesg -T)",
4697
+ "dmesg-timestamp.txt": "Kernel ring buffer with human-readable timestamps",
4698
+ "sysctl.txt": "Kernel tunables (sysctl -a)",
4699
+ "sys.txt": "OS identity, uptime, ulimits, kernel cmdline, clocksource, and loaded modules (uname, ulimit, /proc/cmdline, lsmod)",
4700
+ "lsof.txt": "Open files and network sockets (lsof -n -P)",
4701
+ "lslocks.txt": "Held file locks (lslocks)",
4702
+ // Packages / linker / accounts
4703
+ "deb.txt": "Installed Debian packages and verification (dpkg -l, dpkg -V)",
4704
+ "rpm.txt": "Installed RPM packages (rpm -qa)",
4705
+ "ld.so.conf.txt": "Dynamic-linker library search paths (/etc/ld.so.conf)",
4706
+ "user.txt": "Users, groups, and the gpudb service account (whoami, id, /etc/passwd, /etc/group)",
4707
+ "sudoers.txt": "Sudo configuration (/etc/sudoers)",
4708
+ "etc_profile.txt": "Login shell profile (/etc/profile)",
4709
+ "etc_bashrc.txt": "System bashrc (/etc/bashrc)",
4710
+ "etc_host.txt": "Static hostname resolution (/etc/hosts)",
4711
+ // Kinetica-specific
4712
+ "gpudb.txt": "GPUdb version/build, binary md5 + ldd, and the captured gpudb.conf / gpudb_logger.conf ($GPUDB_EXE -v)",
4713
+ "gpudb_core_etc_gpudb.conf": "The live gpudb.conf at capture time (the database's main config)",
4714
+ "gpudb_core_etc_gpudb_logger.conf": "The logging configuration (gpudb_logger.conf)",
4715
+ "loki-info.txt": "Loki log-index stats: labels, series, and per-class volume (logcli)",
4716
+ "sql-queries.txt": "SQL query log extracted from Loki (logcli)",
4717
+ "tables.txt": "Table schemas and column types (gadmin --schema), when collected",
4718
+ "logfiles.txt": "Manifest: the log directories/files the collector enumerated",
4719
+ "errors.txt": "Collection commands that FAILED during capture (Evidence Gaps)",
4720
+ "proc-logs-erros.txt": "Per-process log-collection failures during capture (Evidence Gaps)"
4721
+ };
4722
+ var KIND_DESCRIPTIONS = {
4723
+ "core-log": "Per-rank rolling Kinetica core log (the primary incident narrative)",
4724
+ "component-log": "Component service log (sql-engine, httpd, reveal, tomcat, stats, \u2026)",
4725
+ "loki-tail": "Last-2h Loki tail for a service (small; searched only when no core logs exist)",
4726
+ "process-info": "Per-rank process snapshot: command line, PID, and environment (/proc/<pid>/environ)",
4727
+ config: "Kinetica configuration file",
4728
+ "version-info": "GPUdb version/build information",
4729
+ "collection-errors": "Collection commands that FAILED during capture (Evidence Gaps)",
4730
+ manifest: "Manifest of log directories/files the collector enumerated"
4731
+ };
4732
+ function basename2(relPath) {
4733
+ const parts = relPath.split("/");
4734
+ return parts[parts.length - 1] ?? relPath;
4735
+ }
4736
+ function describeBundleFile(entry) {
4737
+ return KNOWN_BUNDLE_FILES[basename2(entry.relPath)] ?? KIND_DESCRIPTIONS[entry.kind] ?? "";
4738
+ }
4739
+
4740
+ // src/tools/bundle/list-files.ts
4741
+ var BundleListFilesSchema = import_zod18.z.object({
4742
+ kind: import_zod18.z.string().optional()
4743
+ });
4744
+ var MAX_UNKNOWN_LISTED = 40;
4745
+ async function bundleListFiles(source, args = {}) {
4746
+ const all = source.listFiles();
4747
+ const filtered = args.kind ? all.filter((e) => e.kind === args.kind) : all;
4748
+ const inventory = source.inventory();
4749
+ const {
4750
+ totalFiles,
4751
+ totalBytes,
4752
+ byKind,
4753
+ ranks,
4754
+ inferredRanks,
4755
+ services,
4756
+ inferredFiles,
4757
+ unknownFiles
4758
+ } = inventory;
4759
+ const { layout, layoutWarning } = assessLayout(inventory);
4760
+ const version = await source.detectVersion();
4761
+ const errors = await source.collectionErrors();
4762
+ const files = filtered.map((e) => ({
4763
+ file: e.relPath,
4764
+ kind: e.kind,
4765
+ // How sure the classification is: exact (canonical name) | inferred (heuristic) | weak.
4766
+ confidence: e.confidence,
4767
+ ...e.reason !== void 0 ? { why: e.reason } : {},
4768
+ rank: e.rank ?? "",
4769
+ size_kb: Math.round(e.sizeBytes / 1024),
4770
+ // What the file contains — so the agent can pick the right one without reading it.
4771
+ description: describeBundleFile(e)
4772
+ }));
4773
+ const unknownPaths = all.filter((e) => e.kind === "unknown").map((e) => e.relPath);
4774
+ return {
4775
+ ok: true,
4776
+ data: {
4777
+ detected_version: version ?? "unknown",
4778
+ // How well the bundle matches the canonical gpudb_sysinfo layout.
4779
+ layout_match: layout,
4780
+ ...layoutWarning !== void 0 ? { layout_note: layoutWarning } : {},
4781
+ ranks_present: ranks.join(", ") || "none",
4782
+ ...inferredRanks.length > 0 ? { inferred_ranks_unconfirmed: inferredRanks.join(", ") } : {},
4783
+ services_present: services.join(", ") || "none",
4784
+ total_files: totalFiles,
4785
+ total_size_mb: Number((totalBytes / 1e6).toFixed(1)),
4786
+ counts_by_kind: byKind,
4787
+ inferred_files: inferredFiles,
4788
+ unknown_files: unknownFiles,
4789
+ ...unknownPaths.length > 0 ? {
4790
+ unknown_file_paths: unknownPaths.slice(0, MAX_UNKNOWN_LISTED),
4791
+ ...unknownPaths.length > MAX_UNKNOWN_LISTED ? { unknown_file_paths_truncated: unknownPaths.length - MAX_UNKNOWN_LISTED } : {}
4792
+ } : {},
4793
+ failed_collections: errors.length,
4794
+ files
4795
+ }
4796
+ };
4797
+ }
4798
+
4799
+ // src/tools/bundle/log-timeline.ts
4800
+ var import_zod19 = require("zod");
4801
+ var BundleLogTimelineSchema = import_zod19.z.object({
4802
+ min_severity: import_zod19.z.enum(["INFO", "WARN", "UERR", "ERROR", "FATAL"]).optional(),
4803
+ granularity: import_zod19.z.enum(["day", "hour", "minute"]).optional(),
4804
+ rank: import_zod19.z.string().describe('Numeric rank only, e.g. "r0"/"r1". For the host manager use host_manager.').optional(),
4805
+ host_manager: import_zod19.z.boolean().describe("Bucket the host-manager (hm) log \u2014 a singleton service, not a rank.").optional(),
4806
+ component: import_zod19.z.string().optional(),
4807
+ include_components: import_zod19.z.boolean().optional()
4808
+ });
4809
+ async function bundleLogTimeline(source, args = {}) {
4810
+ const query3 = {
4811
+ ...args.min_severity !== void 0 ? { minSeverity: args.min_severity } : {},
4812
+ ...args.granularity !== void 0 ? { granularity: args.granularity } : {},
4813
+ ...args.rank !== void 0 ? { rank: args.rank } : {},
4814
+ ...args.host_manager !== void 0 ? { hostManager: args.host_manager } : {},
4815
+ ...args.component !== void 0 ? { component: args.component } : {},
4816
+ ...args.include_components !== void 0 ? { includeComponents: args.include_components } : {}
4817
+ };
4818
+ const result = await source.logTimeline(query3);
4819
+ const severities = [...new Set(result.buckets.flatMap((b) => Object.keys(b.counts)))];
4820
+ const order = ["FATAL", "ERROR", "UERR", "WARN", "INFO"];
4821
+ severities.sort((a, b) => order.indexOf(a) - order.indexOf(b));
4822
+ const rows = result.buckets.map((b) => {
4823
+ const row = { time_bucket: b.bucket };
4824
+ for (const sev of severities) row[sev] = b.counts[sev] ?? 0;
4825
+ row.total = b.total;
4826
+ return row;
4827
+ });
4828
+ return {
4829
+ ok: true,
4830
+ note: result.totalCounted === 0 ? "No lines at or above the severity threshold \u2014 try a lower min_severity." : `${result.totalCounted} event(s) across ${result.buckets.length} bucket(s), ${result.filesScanned.length} file(s).`,
4831
+ data: {
4832
+ lines_scanned: result.linesScanned,
4833
+ files_scanned: result.filesScanned.join(", ") || "none",
4834
+ buckets: rows
4835
+ }
4836
+ };
4837
+ }
4838
+
4839
+ // src/tools/bundle/search-logs.ts
4840
+ var import_zod20 = require("zod");
4841
+ var BundleSearchLogsSchema = import_zod20.z.object({
4842
+ regex: import_zod20.z.string().optional(),
4843
+ min_severity: import_zod20.z.enum(["INFO", "WARN", "UERR", "ERROR", "FATAL"]).optional(),
4844
+ from_ts: import_zod20.z.string().optional(),
4845
+ to_ts: import_zod20.z.string().optional(),
4846
+ rank: import_zod20.z.string().describe('Numeric rank only, e.g. "r0"/"r1". For the host manager use host_manager.').optional(),
4847
+ host_manager: import_zod20.z.boolean().describe("Search the host-manager (hm) log \u2014 a singleton service, not a rank.").optional(),
4848
+ component: import_zod20.z.string().optional(),
4849
+ include_components: import_zod20.z.boolean().optional(),
4850
+ max_matches: import_zod20.z.number().int().min(1).max(1e3).optional()
4851
+ });
4852
+ async function bundleSearchLogs(source, args = {}) {
4853
+ const query3 = {
4854
+ ...args.regex !== void 0 ? { regex: args.regex } : {},
4855
+ ...args.min_severity !== void 0 ? { minSeverity: args.min_severity } : {},
4856
+ ...args.from_ts !== void 0 ? { fromTs: args.from_ts } : {},
4857
+ ...args.to_ts !== void 0 ? { toTs: args.to_ts } : {},
4858
+ ...args.rank !== void 0 ? { rank: args.rank } : {},
4859
+ ...args.host_manager !== void 0 ? { hostManager: args.host_manager } : {},
4860
+ ...args.component !== void 0 ? { component: args.component } : {},
4861
+ ...args.include_components !== void 0 ? { includeComponents: args.include_components } : {},
4862
+ ...args.max_matches !== void 0 ? { maxMatches: args.max_matches } : {}
4863
+ };
4864
+ const result = await source.searchLogs(query3);
4865
+ const note = result.capped ? `Showing ${result.matches.length} of ${result.totalMatched} matches across ${result.filesScanned.length} file(s) (display capped). Narrow with a tighter regex, severity, or time window to surface the specific lines.` : `${result.totalMatched} match(es) across ${result.filesScanned.length} file(s).`;
4866
+ return {
4867
+ ok: true,
4868
+ note,
4869
+ data: {
4870
+ total_matched: result.totalMatched,
4871
+ lines_scanned: result.linesScanned,
4872
+ files_scanned: result.filesScanned.join(", ") || "none",
4873
+ capped: result.capped,
4874
+ matches: result.matches.map((m) => ({
4875
+ file: m.file,
4876
+ line: m.lineNumber,
4877
+ timestamp: m.timestamp ?? "",
4878
+ severity: m.severity ?? "",
4879
+ rank: m.rank ?? "",
4880
+ message: m.message
4881
+ }))
4882
+ }
4883
+ };
4884
+ }
4885
+
4886
+ // src/tools/bundle/read-config.ts
4887
+ var import_zod21 = require("zod");
4888
+ var BundleReadConfigSchema = import_zod21.z.object({
4889
+ section: import_zod21.z.string().optional(),
4890
+ key: import_zod21.z.string().optional()
4891
+ });
4892
+ async function bundleReadConfig(source, args = {}) {
4893
+ const result = await source.readConfig({
4894
+ ...args.section !== void 0 ? { section: args.section } : {},
4895
+ ...args.key !== void 0 ? { key: args.key } : {}
4896
+ });
4897
+ if ("error" in result) {
4898
+ return { ok: false, status: 0, error: result.error, raw: "" };
4899
+ }
4900
+ if (result.entries.length === 0 && args.section !== void 0) {
4901
+ const all = await source.readConfig(args.key !== void 0 ? { key: args.key } : {});
4902
+ const sections = "error" in all ? [] : [...new Set(all.entries.map((e) => e.section))].sort();
4903
+ const sectionList = sections.map((s) => s === "" ? "(flat/top-level)" : s).join(", ");
4904
+ return {
4905
+ ok: true,
4906
+ note: `No entries in section "${args.section}" of ${result.file}. gpudb.conf is largely flat \u2014 retry filtering by key only. Sections present: ${sectionList || "(none)"}.`,
4907
+ data: { section_not_found: args.section, available_sections: sections }
4908
+ };
4909
+ }
4910
+ return {
4911
+ ok: true,
4912
+ note: `${result.entries.length} entr(y/ies) from ${result.file}.`,
4913
+ data: result.entries.map((e) => ({ section: e.section, key: e.key, value: e.value }))
4914
+ };
4915
+ }
4916
+
4917
+ // src/tools/bundle/read-sysinfo.ts
4918
+ var import_zod22 = require("zod");
4919
+ var BundleReadSysinfoSchema = import_zod22.z.object({
4920
+ name: import_zod22.z.string().min(1)
4921
+ });
4922
+ async function bundleReadSysinfo(source, args) {
4923
+ const result = await source.readSysinfo(args.name);
4924
+ if ("error" in result) {
4925
+ return { ok: false, status: 0, error: result.error, raw: "" };
4926
+ }
4927
+ return {
4928
+ ok: true,
4929
+ data: {
4930
+ ...result.header !== void 0 ? { source_file: result.header } : {},
4931
+ blocks: result.blocks.map((b) => ({
4932
+ command: b.command,
4933
+ ...b.exitCode !== void 0 ? { exit_code: b.exitCode } : {},
4934
+ output: b.output
4935
+ }))
4936
+ }
4937
+ };
4938
+ }
4939
+
4940
+ // src/tools/bundle/load-bundle.ts
4941
+ var import_zod23 = require("zod");
4942
+
4710
4943
  // src/bundle/verify-bundle.ts
4944
+ var import_promises7 = require("fs/promises");
4711
4945
  var ARCHIVE_RE = /\.(tgz|tar\.gz|tar|gz|zip)$/i;
4712
4946
  var EXPECTED_KINDS = ["config", "core-log"];
4713
4947
  async function verifyBundle(bundlePath) {
4714
4948
  let info;
4715
4949
  try {
4716
- info = await (0, import_promises6.stat)(bundlePath);
4950
+ info = await (0, import_promises7.stat)(bundlePath);
4717
4951
  } catch {
4718
4952
  return { ok: false, error: `bundle path does not exist: ${bundlePath}` };
4719
4953
  }
@@ -4733,12 +4967,15 @@ async function verifyBundle(bundlePath) {
4733
4967
  }
4734
4968
  const missingExpected = EXPECTED_KINDS.filter((k) => (inventory.byKind[k] ?? 0) === 0);
4735
4969
  const kineticaVersion = await bundleSource.detectVersion();
4970
+ const { layout, layoutWarning } = assessLayout(inventory);
4736
4971
  return {
4737
4972
  ok: true,
4738
4973
  bundleSource,
4739
4974
  ...kineticaVersion !== void 0 ? { kineticaVersion } : {},
4740
4975
  inventory,
4741
- missingExpected
4976
+ missingExpected,
4977
+ layout,
4978
+ ...layoutWarning !== void 0 ? { layoutWarning } : {}
4742
4979
  };
4743
4980
  }
4744
4981
 
@@ -4957,7 +5194,7 @@ Before gathering evidence, announce a brief 2-3 line plan: restate the issue, li
4957
5194
 
4958
5195
  ### Round 1 \u2014 Orient
4959
5196
 
4960
- - ${t}kinetica_bundle_list_files${t} \u2014 **ALWAYS FIRST.** Learn the detected version, which ranks are present, what file kinds exist, and how many collections failed.
5197
+ - ${t}kinetica_bundle_list_files${t} \u2014 **ALWAYS FIRST.** Learn the detected version, which ranks are present, what file kinds exist, and how many collections failed. Check ${t}layout_match${t}: if it is not ${t}canonical${t}, this bundle is off-shape (e.g. a logs-only dump) \u2014 read the ${t}layout_note${t}, treat any ${t}unknown_file_paths${t} as evidence to inspect by hand (open one with ${t}kinetica_bundle_read_sysinfo${t}), and trust ${t}ranks_present${t} over ${t}inferred_ranks_unconfirmed${t}. See the support-bundle reference ("When the bundle doesn't match the expected layout").
4961
5198
  - ${t}kinetica_bundle_log_timeline${t} (min_severity: WARN) \u2014 get the incident shape: when did WARN/ERROR/FATAL spike, and on which rank?
4962
5199
 
4963
5200
  ### Round 2 \u2014 Drill Down
@@ -5077,7 +5314,7 @@ function createBundleHolder(initial) {
5077
5314
  }
5078
5315
 
5079
5316
  // src/cli/pick-bundle-path.ts
5080
- var import_promises7 = require("fs/promises");
5317
+ var import_promises8 = require("fs/promises");
5081
5318
  var import_node_path7 = require("path");
5082
5319
  function isPermissionError(err) {
5083
5320
  if (typeof err !== "object" || err === null || !("code" in err)) return false;
@@ -5092,7 +5329,7 @@ async function listDirectoryCandidates(term) {
5092
5329
  const resolved = (0, import_node_path7.resolve)(baseDir);
5093
5330
  let entries;
5094
5331
  try {
5095
- entries = await (0, import_promises7.readdir)(resolved, { withFileTypes: true });
5332
+ entries = await (0, import_promises8.readdir)(resolved, { withFileTypes: true });
5096
5333
  } catch (err) {
5097
5334
  if (isPermissionError(err)) return { kind: "denied", dir: resolved };
5098
5335
  return { kind: "ok", candidates: [] };
@@ -6087,7 +6324,7 @@ async function logout() {
6087
6324
 
6088
6325
  // src/session/env-file.ts
6089
6326
  var import_fs2 = require("fs");
6090
- var import_promises8 = require("fs/promises");
6327
+ var import_promises9 = require("fs/promises");
6091
6328
  var import_path2 = require("path");
6092
6329
  var import_picocolors11 = __toESM(require("picocolors"));
6093
6330
  function parseEnvContent(content) {
@@ -6179,11 +6416,11 @@ async function offerSaveCredentials(url, user, dir) {
6179
6416
  const filePath = (0, import_path2.join)(dir ?? process.cwd(), ".env");
6180
6417
  let existing;
6181
6418
  try {
6182
- existing = await (0, import_promises8.readFile)(filePath, "utf8");
6419
+ existing = await (0, import_promises9.readFile)(filePath, "utf8");
6183
6420
  } catch {
6184
6421
  }
6185
6422
  const content = buildEnvContent(url, user, existing);
6186
- await (0, import_promises8.writeFile)(filePath, content, "utf8");
6423
+ await (0, import_promises9.writeFile)(filePath, content, "utf8");
6187
6424
  console.error(import_picocolors11.default.dim("Saved to .env"));
6188
6425
  } catch (err) {
6189
6426
  const message = err instanceof Error ? err.message : String(err);
@@ -6624,7 +6861,12 @@ async function main() {
6624
6861
  process.exitCode = 1;
6625
6862
  return;
6626
6863
  }
6627
- if (result.missingExpected.length > 0) {
6864
+ if (result.layoutWarning !== void 0) {
6865
+ process.stderr.write(
6866
+ import_picocolors15.default.yellow(`Warning: ${result.layoutWarning} Diagnosing with what is present.
6867
+ `)
6868
+ );
6869
+ } else if (result.missingExpected.length > 0) {
6628
6870
  process.stderr.write(
6629
6871
  import_picocolors15.default.yellow(
6630
6872
  `Warning: bundle is missing expected artifact(s): ${result.missingExpected.join(", ")}. Diagnosing with what is present.
@@ -49,6 +49,17 @@ See `rank-architecture.md` (Where queries are logged) for why this locality hold
49
49
  - **Packages / accounts:** `deb.txt` / `rpm.txt` (installed packages), `user.txt` (users/groups, gpudb account), `ld.so.conf.txt`, `etc_*.txt` (system shell/host config).
50
50
  - **Evidence Gaps:** `errors.txt` / `proc-logs-erros.txt` — collection commands that FAILED. `logfiles.txt` — manifest of log dirs the collector enumerated.
51
51
 
52
+ ### When the bundle doesn't match the expected layout
53
+
54
+ Not every bundle is a full `gpudb_sysinfo` capture. A customer may hand over a bare logs-only dump, a differently-named collector's output, or a flat directory. `kinetica_bundle_list_files` tells you how well it matched, so you never reason blindly over an unfamiliar shape:
55
+
56
+ - **`layout_match`** — `canonical` (a normal gpudb_sysinfo bundle), `partial`, or `unfamiliar` (none of the expected config/version/host-diagnostic anchors were found, e.g. a logs-only dump). When it is not `canonical`, a `layout_note` summarizes what was inferred.
57
+ - **Per-file `confidence`** — `exact` (matched a canonical name/location), `inferred` (recognized by a name or content heuristic — e.g. a rolling log shipped WITHOUT the `core-` prefix, or a `.out` whose first lines parsed as log lines), or `weak`. The `why` field states how each file was classified.
58
+ - **`inferred_ranks_unconfirmed`** — ranks seen only via a loose name guess, never confirmed by a canonical pattern or by log content. Treat these as "possible — verify," distinct from `ranks_present`, which stays trustworthy.
59
+ - **`unknown_file_paths`** — files that could not be classified at all. Do NOT ignore them: they may be evidence under an unfamiliar name. Read one with `kinetica_bundle_read_sysinfo` (it returns the raw content / EXEC_CMD blocks) to decide what it holds.
60
+
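As a rough illustration of how `layout_match` is derived, the sketch below restates the thresholds visible in `assessLayout` in this release's bundled code: two anchor kinds (`config` and `version-info`), with `canonical` requiring both anchors and fewer than 25% of files classified by inference. The function name `classifyLayout` and the sample inventories are hypothetical, made up for this example.

```javascript
// Sketch only: restates the thresholds visible in assessLayout in this diff.
// classifyLayout and the sample inventories below are hypothetical.
const ANCHOR_KINDS = ["config", "version-info"];
const MIN_ANCHORS_FOR_CANONICAL = 2;
const PARTIAL_INFERRED_FRACTION = 0.25;

function classifyLayout(inventory) {
  // Count how many anchor kinds have at least one file in the bundle.
  const anchorsPresent = ANCHOR_KINDS.filter((k) => (inventory.byKind[k] ?? 0) > 0).length;
  // Share of files that were classified by a heuristic rather than a canonical name.
  const inferredFraction =
    inventory.totalFiles > 0 ? inventory.inferredFiles / inventory.totalFiles : 0;
  if (anchorsPresent === 0) return "unfamiliar";
  if (anchorsPresent >= MIN_ANCHORS_FOR_CANONICAL && inferredFraction < PARTIAL_INFERRED_FRACTION) {
    return "canonical";
  }
  return "partial";
}

// Hypothetical inventories covering each outcome.
const fullCapture = { byKind: { config: 1, "version-info": 1, "core-log": 4 }, totalFiles: 40, inferredFiles: 3 };
const configOnly = { byKind: { config: 1, "core-log": 2 }, totalFiles: 10, inferredFiles: 1 };
const mostlyInferred = { byKind: { config: 1, "version-info": 1 }, totalFiles: 8, inferredFiles: 4 };
const logsOnly = { byKind: { "core-log": 6 }, totalFiles: 6, inferredFiles: 6 };

console.log(classifyLayout(fullCapture));    // canonical
console.log(classifyLayout(configOnly));     // partial
console.log(classifyLayout(mostlyInferred)); // partial
console.log(classifyLayout(logsOnly));       // unfamiliar
```

Note that a bundle with both anchors can still land on `partial` when a quarter or more of its files were classified by inference, which is why `layout_note` reports the inferred-file count.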
61
+ Inference does not make a file second-class: a rolling log recognized without its `core-` prefix is treated exactly like a canonical core log — it appears in `ranks_present` and `kinetica_bundle_search_logs`/`log_timeline` search it normally. The parsers have already been applied for you. Your job: trust `ranks_present` / `services_present`, sanity-check anything marked `inferred` or `unknown`, and state plainly in the report when the evidence came from an off-shape bundle (note the `layout_match`).
62
+
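One way to apply this guidance mechanically is sketched below. It assumes a result object shaped like the `data` payload that `bundleListFiles` returns in this release (`layout_match`, `layout_note`, `inferred_ranks_unconfirmed`, `unknown_file_paths`, and per-file `confidence` and `why`). The helper `reviewChecklist`, the sample payload, and its file names are hypothetical and not part of the package.

```javascript
// Sketch only: reviewChecklist is a hypothetical helper, not part of the package.
// It turns a kinetica_bundle_list_files payload into items to verify by hand.
function reviewChecklist(data) {
  const items = [];
  if (data.layout_match !== "canonical") {
    items.push(`layout is ${data.layout_match}: ${data.layout_note ?? "no note"}`);
  }
  if (data.inferred_ranks_unconfirmed) {
    items.push(`verify possible ranks: ${data.inferred_ranks_unconfirmed}`);
  }
  for (const path of data.unknown_file_paths ?? []) {
    items.push(`inspect unclassified file: ${path}`);
  }
  for (const f of data.files ?? []) {
    // Unknown files are already covered above; flag anything else that is not an exact match.
    if (f.kind !== "unknown" && f.confidence !== "exact") {
      items.push(`sanity-check ${f.file} (${f.kind}, ${f.confidence}${f.why ? `: ${f.why}` : ""})`);
    }
  }
  return items;
}

// Hypothetical payload for an off-shape bundle.
const sample = {
  layout_match: "partial",
  layout_note: "This bundle only partially matches the canonical layout: 1/3 files classified by inference; 1 unclassified.",
  ranks_present: "r0",
  inferred_ranks_unconfirmed: "r3",
  unknown_file_paths: ["misc/notes.bin"],
  files: [
    { file: "gpudb.conf", kind: "config", confidence: "exact" },
    { file: "dump/rank0.out", kind: "core-log", confidence: "inferred", why: "content: log line parsed (INFO, rank r0)" },
    { file: "misc/notes.bin", kind: "unknown", confidence: "weak" },
  ],
};

for (const item of reviewChecklist(sample)) console.log(item);
```

The checklist deliberately leaves `ranks_present` and `services_present` alone, since the reference text treats those as trustworthy; only inferred, unconfirmed, or unclassified evidence is queued for a manual look.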
52
63
  ### Two log families — and why every rank is reachable
53
64
 
54
65
  A bundle carries per-rank logs in up to two places, and the collector host usually holds only a couple of the cluster's ranks:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@kinetica/admin-agent",
3
- "version": "0.2.1",
3
+ "version": "0.2.2",
4
4
  "description": "Autonomous diagnostic agent for Kinetica databases",
5
5
  "license": "Apache-2.0",
6
6
  "author": "Kinetica",