npm - @kinetica/admin-agent - Versions diffs - 0.2.2 → 0.2.3 - Mend

@kinetica/admin-agent 0.2.2 → 0.2.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/README.md +8 -8
package/dist/admin-agent.js +50 -9
package/knowledge/references/bundle/support-bundle.md +7 -4
package/package.json +1 -1

package/README.md CHANGED

@@ -332,14 +332,14 @@ The `--bundle` flag points the agent at an **extracted** support-bundle director
 Available against an extracted `gpudb_sysinfo` support bundle (see [Offline Bundle Mode](#offline-bundle-mode)). All read-only; the search/timeline tools stream and bound their output so a large rank log (tens of MB, hundreds of thousands of lines) never blows up the context.
-| Tool                           | Description                                                                                                                                                                 |
-| ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `kinetica_load_bundle`         | Attach an extracted bundle directory; without a path it opens a directory picker (a model-supplied path needs operator confirmation)                                        |
-| `kinetica_bundle_list_files`   | Inventory: detected version, ranks + services present, file counts/sizes by kind, plus a layout-match verdict + per-file confidence for off-shape bundles — call this first |
-| `kinetica_bundle_log_timeline` | Per-time-bucket severity counts across ranks (the incident shape) — call before searching                                                                                   |
-| `kinetica_bundle_search_logs`  | Bounded log search by regex, min-severity, time window, and rank / host-manager / component (reads both rolling and Loki-export logs)                                       |
-| `kinetica_bundle_read_config`  | Read the bundle's real on-disk `gpudb.conf`, with optional section/key filter                                                                                               |
-| `kinetica_bundle_read_sysinfo` | OS/process/version diagnostic files (memory, CPU, disk, GPU, network, process args)                                                                                         |
+| Tool                           | Description                                                                                                                                                                                                                                                                                 |
+| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `kinetica_load_bundle`         | Attach an extracted bundle directory; without a path it opens a directory picker (a model-supplied path needs operator confirmation)                                                                                                                                                        |
+| `kinetica_bundle_list_files`   | Inventory: detected version, ranks + services present, file counts/sizes by kind, plus a layout-match verdict + per-file confidence for off-shape bundles — call this first                                                                                                                 |
+| `kinetica_bundle_log_timeline` | Per-time-bucket severity counts across ranks (the incident shape) — call before searching                                                                                                                                                                                                   |
+| `kinetica_bundle_search_logs`  | Bounded log search by regex, min-severity, time window, and rank / host-manager / component (reads both rolling and Loki-export logs); `include_multiline` stitches a multi-line record — e.g. a full `Executing SQL:` query whose embedded newlines span many lines — back onto each match |
+| `kinetica_bundle_read_config`  | Read the bundle's real on-disk `gpudb.conf`, with optional section/key filter                                                                                                                                                                                                               |
+| `kinetica_bundle_read_sysinfo` | OS/process/version diagnostic files (memory, CPU, disk, GPU, network, process args)                                                                                                                                                                                                         |
 ### Reporting

package/dist/admin-agent.js CHANGED

@@ -4034,6 +4034,8 @@ function parseLogLine(line) {
 // src/bundle/log-search.ts
 var DEFAULT_MAX_MATCHES = 200;
+var MULTILINE_MAX_LINES = 300;
+var MULTILINE_MAX_CHARS = 2e4;
 var REGEX_SCAN_MAX = 8192;
 var GRANULARITY_LEN = {
   day: 10,
@@ -4075,6 +4077,23 @@ function matchesFilters(parsed, query3, regex, minRank) {
     return false;
   return true;
 }
+function buildMatch(lineNumber, parsed) {
+  return {
+    lineNumber,
+    ...parsed.timestamp !== void 0 ? { timestamp: parsed.timestamp } : {},
+    ...parsed.severity !== void 0 ? { severity: parsed.severity } : {},
+    ...parsed.rank !== void 0 ? { rank: parsed.rank } : {},
+    message: parsed.message,
+    raw: parsed.raw
+  };
+}
+function finalizeMultiline(pending) {
+  if (pending.extra.length === 0) return pending.base;
+  const joined = pending.extra.join("\n");
+  const suffix = pending.truncated ? "\n\u2026 [continuation truncated]" : "";
+  return { ...pending.base, message: `${pending.base.message}
+${joined}${suffix}` };
+}
 async function searchLogFile(filePath, query3) {
   const maxMatches = query3.maxMatches ?? DEFAULT_MAX_MATCHES;
   const minRank = query3.minSeverity !== void 0 ? severityRank(query3.minSeverity) : -Infinity;
@@ -4096,9 +4115,17 @@ async function searchLogFile(filePath, query3) {
     ...query3.fromTs !== void 0 ? { fromTs: floorTimestamp(query3.fromTs) } : {},
     ...query3.toTs !== void 0 ? { toTs: ceilTimestamp(query3.toTs) } : {}
   };
+  const coalesce = query3.coalesceMultiline === true;
   const matches = [];
   let totalMatched = 0;
   let linesScanned = 0;
+  let pending;
+  const flushPending = () => {
+    if (pending) {
+      matches.push(finalizeMultiline(pending));
+      pending = void 0;
+    }
+  };
   try {
     const rl = (0, import_node_readline.createInterface)({
       input: (0, import_node_fs4.createReadStream)(filePath, { encoding: "utf-8" }),
@@ -4107,20 +4134,29 @@ async function searchLogFile(filePath, query3) {
     for await (const line of rl) {
       linesScanned++;
       const parsed = parseLogLine(line);
+      if (pending) {
+        if (parsed.timestamp === void 0) {
+          if (!pending.truncated && pending.extra.length < MULTILINE_MAX_LINES && pending.chars + line.length + 1 <= MULTILINE_MAX_CHARS) {
+            pending.extra.push(line);
+            pending.chars += line.length + 1;
+          } else {
+            pending.truncated = true;
+          }
+          continue;
+        }
+        flushPending();
+      }
       if (!matchesFilters(parsed, boundedQuery, regex, minRank)) continue;
       totalMatched++;
       if (matches.length < maxMatches) {
-        matches.push({
-          lineNumber: linesScanned,
-          ...parsed.timestamp !== void 0 ? { timestamp: parsed.timestamp } : {},
-          ...parsed.severity !== void 0 ? { severity: parsed.severity } : {},
-          ...parsed.rank !== void 0 ? { rank: parsed.rank } : {},
-          message: parsed.message,
-          raw: parsed.raw
-        });
+        const base = buildMatch(linesScanned, parsed);
+        if (coalesce) pending = { base, extra: [], chars: 0, truncated: false };
+        else matches.push(base);
       }
     }
+    flushPending();
   } catch (err) {
+    flushPending();
     const message = err instanceof Error ? err.message : String(err);
     return {
       matches,
@@ -4508,7 +4544,8 @@ function toLineQuery(q) {
     ...q.minSeverity !== void 0 ? { minSeverity: q.minSeverity } : {},
     ...q.fromTs !== void 0 ? { fromTs: q.fromTs } : {},
     ...q.toTs !== void 0 ? { toTs: q.toTs } : {},
-    ...q.maxMatches !== void 0 ? { maxMatches: q.maxMatches } : {}
+    ...q.maxMatches !== void 0 ? { maxMatches: q.maxMatches } : {},
+    ...q.coalesceMultiline !== void 0 ? { coalesceMultiline: q.coalesceMultiline } : {}
   };
 }
 function toTimelineLineQuery(q) {
@@ -4847,6 +4884,9 @@ var BundleSearchLogsSchema = import_zod20.z.object({
   host_manager: import_zod20.z.boolean().describe("Search the host-manager (hm) log \u2014 a singleton service, not a rank.").optional(),
   component: import_zod20.z.string().optional(),
   include_components: import_zod20.z.boolean().optional(),
+  include_multiline: import_zod20.z.boolean().describe(
+    "Reconstruct multi-line log records: append continuation lines (those with no timestamp) to each match. Use this to capture a full SQL statement on an 'Executing SQL:' line \u2014 the query often spans many lines because the SQL has embedded newlines, and a plain match shows only its first line. Works on the rolling core logs (logs-local/); Loki per-rank tails (logs/rankN.log) keep only the statement's first line, so there are no continuation lines to stitch there."
+  ).optional(),
   max_matches: import_zod20.z.number().int().min(1).max(1e3).optional()
 });
 async function bundleSearchLogs(source, args = {}) {
@@ -4859,6 +4899,7 @@ async function bundleSearchLogs(source, args = {}) {
     ...args.host_manager !== void 0 ? { hostManager: args.host_manager } : {},
     ...args.component !== void 0 ? { component: args.component } : {},
     ...args.include_components !== void 0 ? { includeComponents: args.include_components } : {},
+    ...args.include_multiline !== void 0 ? { coalesceMultiline: args.include_multiline } : {},
     ...args.max_matches !== void 0 ? { maxMatches: args.max_matches } : {}
   };
   const result = await source.searchLogs(query3);
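
The coalescing added to `searchLogFile` in this release can be sketched in isolation as follows. This is a minimal illustration over an in-memory array of lines, not the bundled implementation: `parseLine`, `coalesce`, and the timestamp format are assumptions made for the example, and the regex is assumed to be non-global. Only the limits (300 lines, 20,000 characters) and the truncation marker are taken from the diff above.

```javascript
// Hypothetical stand-in for the package's parseLogLine: a line starts a new
// record only if it begins with a "YYYY-MM-DD HH:MM:SS" timestamp (assumed format).
const TS = /^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+(.*)$/;
function parseLine(line) {
  const m = TS.exec(line);
  return m ? { timestamp: m[1], message: m[2] } : { message: line };
}

// Append untimestamped continuation lines to the preceding match,
// bounded by a line count and a character budget.
function coalesce(lines, regex, maxLines = 300, maxChars = 20000) {
  const matches = [];
  let pending; // the match currently collecting continuation lines

  const flush = () => {
    if (!pending) return;
    const suffix = pending.truncated ? "\n\u2026 [continuation truncated]" : "";
    const tail = pending.extra.length > 0 ? "\n" + pending.extra.join("\n") + suffix : "";
    matches.push({ ...pending.base, message: pending.base.message + tail });
    pending = undefined;
  };

  lines.forEach((line, index) => {
    const parsed = parseLine(line);
    if (pending) {
      if (parsed.timestamp === undefined) {
        // Continuation line: keep it only while both budgets allow.
        const fits =
          !pending.truncated &&
          pending.extra.length < maxLines &&
          pending.chars + line.length + 1 <= maxChars;
        if (fits) {
          pending.extra.push(line);
          pending.chars += line.length + 1;
        } else {
          pending.truncated = true;
        }
        return;
      }
      flush(); // a timestamped line ends the pending record
    }
    if (!regex.test(parsed.message)) return;
    pending = {
      base: { lineNumber: index + 1, ...parsed },
      extra: [],
      chars: 0,
      truncated: false,
    };
  });

  flush(); // emit a record still pending at end of input
  return matches;
}

const sample = [
  "2026-06-11 15:02:01 Executing SQL: SELECT a, b",
  "FROM t",
  "WHERE a > 1",
  "2026-06-11 15:02:02 Job finished",
];
console.log(coalesce(sample, /Executing SQL:/)[0].message);
```

As in the released code, a record is flushed when the next timestamped line arrives or when the input ends, so the last match in a file still receives its continuation lines.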

package/knowledge/references/bundle/support-bundle.md CHANGED

@@ -20,6 +20,7 @@ Severity order for filtering is `WARN < UERR < ERROR < FATAL`, so `min_severity=
 - The logs are large (a rank log can exceed 100k lines). NEVER ask for a whole file. Use `kinetica_bundle_log_timeline` to localize, then `kinetica_bundle_search_logs` with a tight time window + severity to extract only relevant lines. The match cap is shared across files — if you see "capped", narrow the query rather than asking for more.
 - You can pass a timeline bucket label straight into `from_ts`/`to_ts` (e.g. `2026-06-11 15` searches that whole hour) — partial timestamps are widened to cover the full period.
+- A single log record can span multiple physical lines when a logged value (notably a SQL statement) contains embedded newlines — the continuation lines have no timestamp. A plain search returns only the first line. Pass `include_multiline: true` to stitch the continuation lines back onto each match and recover the whole record. See "Finding a crash's triggering SQL".
 - Timestamps are plain local strings without a timezone; compare them lexically and treat cross-rank timing cautiously.
 - **Ranks vs. the host manager:** `rank` selects a numeric rank (`r0`, `r1`, …) only. The host manager (`core-gpudb-rolling-hm.log`) is a singleton service, NOT a rank — search or timeline it with `host_manager: true`, never `rank: "hm"`. By default both `log_timeline` and `search_logs` already cover the host manager along with the numeric ranks; `kinetica_bundle_list_files` lists it under `services_present`.
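
One way to picture the partial-timestamp widening and lexical comparison described in this hunk is a prefix comparison, sketched below. This is not the package's `floorTimestamp`/`ceilTimestamp` logic, whose bodies are not part of this diff. `inWindow` is a hypothetical helper, and it assumes fixed-width, zero-padded timestamps of the form `YYYY-MM-DD HH:MM:SS`.

```javascript
// Hypothetical helper: treat a partial bound such as "2026-06-11 15" as covering
// its whole period by comparing only as many characters as the bound supplies.
// Plain string comparison is valid because the timestamps are assumed to be
// fixed-width and zero-padded.
function inWindow(ts, fromTs, toTs) {
  if (fromTs !== undefined && ts.slice(0, fromTs.length) < fromTs) return false;
  if (toTs !== undefined && ts.slice(0, toTs.length) > toTs) return false;
  return true;
}

// A bucket label used as both bounds selects that entire hour.
console.log(inWindow("2026-06-11 15:59:59", "2026-06-11 15", "2026-06-11 15"));
console.log(inWindow("2026-06-11 16:00:00", "2026-06-11 15", "2026-06-11 15"));
```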
@@ -29,11 +30,13 @@ When a worker rank segfaults mid-query, that rank's log holds the **backtrace**
 Workflow, given a `JobId` from a worker's crash stack:
-1. `kinetica_bundle_search_logs` with `rank: "r0"` and `regex` = the JobId. r0 logs the `/execute/sql` receipt (submitting user), the `Sql/SqlDriver.cpp … Executing SQL:` line, and per-operation endpoint lines.
-2. The per-operation lines (`Endpoint_aggregate_group_by.cpp`, filter/join endpoints) carry `table:`, `column_names:`/`aliases:` (the SELECT list), and `expr:` (the full WHERE predicate) — reconstruct the query from these.
-3. **Quirk:** if `Found plan for the SQL in cache` precedes it, the `Executing SQL:` line is truncated to just `SELECT`. Use the per-operation endpoint lines (step 2) — their predicate survives a cache hit. A `datetime()`/timestamp filter showing up here often _is_ the input that triggered a parser segfault.
+1. `kinetica_bundle_search_logs` with `rank: "r0"`, `regex` = the JobId, **and `include_multiline: true`**. r0 logs the `/execute/sql` receipt (submitting user), the `Sql/SqlDriver.cpp … Executing SQL:` line, and per-operation endpoint lines.
+2. **Read the full statement straight from the `Executing SQL:` line — do not reconstruct it.** Kinetica logs the SQL verbatim, so a real query spans MANY physical lines: `FROM …`, `JOIN …`, `WHERE …` each land on their own line with no timestamp prefix. Those continuation lines belong to the same log record; `include_multiline: true` stitches them back onto the match so you see the WHOLE query. WITHOUT it, a match is only the first physical line (e.g. `… Executing SQL: SELECT c."circuitId", c."circuitRef"`) and you would wrongly report the query as "truncated." Quote the statement verbatim from this single (now multi-line) match.
+3. **Cache-hit fallback only:** if `Found plan for the SQL in cache` precedes the job, Kinetica logs `Executing SQL:` as just the statement keyword (e.g. a bare `SELECT` or `EXECUTE PROCEDURE …`) with no continuation lines to stitch. ONLY then fall back to the per-operation endpoint lines (`Endpoint_aggregate_group_by.cpp`, filter/join endpoints): they carry `table:`, `column_names:`/`aliases:` (the SELECT list), and `expr:` (the full WHERE predicate), whose values survive a cache hit. A `datetime()`/timestamp filter showing up here often _is_ the input that triggered a parser segfault.
-See `rank-architecture.md` (Where queries are logged) for why this locality holds.
+**Where the full multi-line query actually lives:** `include_multiline` recovers the whole statement only from the **rolling core logs** (`logs-local/core-gpudb-rolling-r0.log`), where the SQL's embedded newlines are preserved as continuation lines. The **Loki per-rank tail** (`logs/rank0.log`) keeps only the statement's first physical line — promtail captures each line as its own record, so the `FROM`/`JOIN`/`WHERE` lines are simply not in that export, and nothing can stitch them. So this workflow depends on r0 being present under `logs-local/`. If r0 exists only as a Loki tail (rare for the coordinator, but possible for a Loki-only bundle), the complete query may not be in the bundle at all — say so rather than reporting the first line as the whole query, and fall back to step 3's endpoint lines.
+See `rank-architecture.md` (Where queries are logged) for why this locality holds, and "Two log families" below for rolling-vs-Loki precedence.
 ### Files of interest
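
The revised workflow in the hunk above can be illustrated with a short sketch of the step 1 arguments and the handling of the stitched match. The argument names `rank`, `regex`, and `include_multiline` come from the tool schema in this diff. The helpers `escapeRegex`, `crashSqlSearchArgs`, `extractSql`, and `isBareSelect` are hypothetical, the JobId value is a placeholder, and the bare-`SELECT` check is a rough heuristic that covers only one of the cache-hit shapes the document mentions.

```javascript
// Escape regex metacharacters so the JobId is matched literally
// (an assumption: the JobId format is not specified in the source).
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Step 1: search r0 for the JobId with multi-line stitching enabled.
function crashSqlSearchArgs(jobId) {
  return { rank: "r0", regex: escapeRegex(jobId), include_multiline: true };
}

// Step 2: read the statement verbatim from the stitched "Executing SQL:" match.
function extractSql(message) {
  const marker = "Executing SQL:";
  const at = message.indexOf(marker);
  return at === -1 ? undefined : message.slice(at + marker.length).trim();
}

// Step 3 heuristic: a bare keyword with nothing after it suggests a plan-cache
// hit, in which case the per-operation endpoint lines are the fallback.
function isBareSelect(sql) {
  return /^select$/i.test(sql);
}

const args = crashSqlSearchArgs("PLACEHOLDER_JOB_ID");
console.log(JSON.stringify(args));

const stitched = "Sql/SqlDriver.cpp Executing SQL: SELECT a\nFROM t\nWHERE a > 1";
console.log(extractSql(stitched));
```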

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@kinetica/admin-agent",
-  "version": "0.2.2",
+  "version": "0.2.3",
   "description": "Autonomous diagnostic agent for Kinetica databases",
   "license": "Apache-2.0",
   "author": "Kinetica",