agent-inspect 1.7.0 → 1.8.0

Files changed (44)

package/CHANGELOG.md +13 -1
package/README.md +11 -6
package/docs/ADAPTER-CONFORMANCE.md +7 -3
package/docs/ADAPTERS.md +120 -5
package/docs/API.md +123 -21
package/docs/CLI.md +154 -6
package/docs/KNOWN-ISSUES.md +7 -1
package/docs/LIMITATIONS.md +7 -1
package/docs/SCHEMA.md +1 -0
package/package.json +12 -2
package/packages/cli/dist/index.cjs +2057 -33
package/packages/cli/dist/index.cjs.map +1 -1
package/packages/cli/dist/index.mjs +2057 -33
package/packages/cli/dist/index.mjs.map +1 -1
package/packages/core/dist/advanced.d.cts +4 -4
package/packages/core/dist/advanced.d.ts +4 -4
package/packages/core/dist/checks.cjs +1535 -0
package/packages/core/dist/checks.cjs.map +1 -0
package/packages/core/dist/checks.d.cts +585 -0
package/packages/core/dist/checks.d.ts +585 -0
package/packages/core/dist/checks.mjs +1512 -0
package/packages/core/dist/checks.mjs.map +1 -0
package/packages/core/dist/diff.d.cts +3 -3
package/packages/core/dist/diff.d.ts +3 -3
package/packages/core/dist/exporters.d.cts +3 -3
package/packages/core/dist/exporters.d.ts +3 -3
package/packages/core/dist/index.d.cts +6 -6
package/packages/core/dist/index.d.ts +6 -6
package/packages/core/dist/{inspect-event-Des4JDHo.d.cts → inspect-event-CevRYp58.d.cts} +1 -1
package/packages/core/dist/{inspect-event-Des4JDHo.d.ts → inspect-event-CevRYp58.d.ts} +1 -1
package/packages/core/dist/{log-config-C1GcJPIM.d.ts → log-config-BPHS4Sds.d.ts} +1 -1
package/packages/core/dist/{log-config-BnH8Ykcb.d.cts → log-config-DanPV3P9.d.cts} +1 -1
package/packages/core/dist/logs.d.cts +3 -3
package/packages/core/dist/logs.d.ts +3 -3
package/packages/core/dist/{persisted-inspect-event-DiFto0K2.d.ts → persisted-inspect-event-Cw7TeYGr.d.ts} +1 -1
package/packages/core/dist/{persisted-inspect-event-0kaRADsp.d.cts → persisted-inspect-event-DHPfzUd8.d.cts} +1 -1
package/packages/core/dist/persisted.d.cts +5 -5
package/packages/core/dist/persisted.d.ts +5 -5
package/packages/core/dist/readers.d.cts +2 -2
package/packages/core/dist/readers.d.ts +2 -2
package/packages/core/dist/{types-tSix7tfv.d.ts → types-Ap9uMdx_.d.ts} +1 -1
package/packages/core/dist/{types-DB8jB6Jg.d.cts → types-B2-BU5CS.d.cts} +1 -1
package/packages/core/dist/writers.d.cts +2 -2
package/packages/core/dist/writers.d.ts +2 -2

package/docs/CLI.md CHANGED

@@ -26,6 +26,10 @@ Core commands:
 - `tail` — live-tail logs into updating local trees
 - `export` — export manual traces to Markdown/HTML/OpenInference/OTLP JSON (local only)
 - `open` — read supported local trace files, directories, or stdin through the canonical reader pipeline
+- `check` — run deterministic local trace checks with stable JSON and exit codes
+- `scan` — best-effort local safety scan for trace capture risks
+- `verify-safe` — best-effort local trace safety verification
+- `artifacts` — create safe local CI trace artifact bundles and optional step summaries
 - `diff` — compare two manual traces (local, read-only)
 - `timeline` — chronological view of one run (local JSONL)
 - `stats` — local aggregate stats over a trace directory
@@ -44,6 +48,20 @@ Core commands:
 - **0**: command succeeded (even if a diff reports “differences”)
 - **1**: command error (invalid args, missing files, missing runs, parse failures, validation failures, etc.)
+Exception: `check` uses CI-oriented semantic exit codes:
+- **0**: all selected checks passed
+- **1**: checks ran and at least one error-severity rule failed
+- **2**: invalid arguments or invalid config
+- **3**: trace input could not be read
+- **4**: unsupported or ambiguous trace format
+Exception: `scan` and `verify-safe` use local safety status exit codes:
+- **0**: status is SAFE or SAFE WITH WARNINGS
+- **1**: status is UNSAFE
+- **2**: status is UNKNOWN, including unreadable, unsupported, ambiguous, or invalid inputs
 AgentInspect favors **human-readable errors without stack traces** for expected user mistakes.
 ## 4. JSON output policy
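
Reviewer note: the exit-code tables added in this hunk can be transcribed into a small lookup for CI wrappers. The sketch below is illustrative only; `describeExit` and the two tables are hypothetical names, not part of agent-inspect, and the descriptions are copied from the added lines above.

```typescript
// Exit-code meanings transcribed from the CLI.md lines added in this hunk.
type Command = "check" | "scan" | "verify-safe";

const CHECK_CODES: Record<number, string> = {
  0: "all selected checks passed",
  1: "at least one error-severity rule failed",
  2: "invalid arguments or invalid config",
  3: "trace input could not be read",
  4: "unsupported or ambiguous trace format",
};

const SAFETY_CODES: Record<number, string> = {
  0: "SAFE or SAFE WITH WARNINGS",
  1: "UNSAFE",
  2: "UNKNOWN (unreadable, unsupported, ambiguous, or invalid input)",
};

// Hypothetical helper: maps a command and its exit code to the documented meaning.
function describeExit(command: Command, code: number): string {
  const table = command === "check" ? CHECK_CODES : SAFETY_CODES;
  return table[code] ?? `undocumented exit code ${code}`;
}

console.log(describeExit("check", 3));
console.log(describeExit("verify-safe", 2));
```

Note that exit code 1 means different things for the two families: a failed rule for `check`, and an UNSAFE status for `scan` and `verify-safe`, so a wrapper probably needs to know which command it ran.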
@@ -59,6 +77,8 @@ Many commands support `--json` for scripting. JSON output is intended to be:
 - Log-derived output includes **confidence** labels and avoids inventing parent-child relationships.
 - Redaction defaults are conservative (e.g. `authorization`, `cookie`, `token`, `apiKey`, `password`, `secret`, `email`).
 - Exported payloads are **redacted by default** unless explicitly configured otherwise.
+- `scan` and `verify-safe` are best-effort local checks, not compliance, privacy, security, or regulatory certifications.
+- `artifacts` renders structural summaries and check evidence only; it does not include raw prompt/output bodies, request/response bodies, headers, API keys, secrets, or full tool payloads.
 ## 6. Command reference
@@ -222,7 +242,135 @@ cat packages/core/test/fixtures/openinference-basic.json | npx agent-inspect ope
 When a directory or payload contains multiple runs, `open` lists the run ids and exits until you pass `--run <run-id>`.
-### 6.8 `diff`
+### 6.8 `check`
+Run deterministic checks against a local trace. This command is local and read-only: it does not rerun agents, call models, upload traces, or mutate input files.
+```bash
+agent-inspect check <trace-path-or-run-id> [options]
+```
+`<trace-path-or-run-id>` may be a trace file, directory, `-` for stdin, or a run id resolved with `--dir`.
+Options:
+- `--dir <path>`: trace directory for run-id lookup
+- `--format <agent-inspect-jsonl|openinference-json|otlp-json>`: explicit reader format override
+- `--run <run-id>`: select a run when input contains multiple runs
+- `--config <path>`: check config (`.json`, `.js`, `.mjs`, or `.cjs`)
+- `--json`: print deterministic `TraceCheckResult` JSON
+- `--rule <id>`: select a rule id; repeatable
+- `--max-duration-ms <number>`: add `run.duration`
+- `--required-tool <name>` / `--forbidden-tool <name>`: add `tool.usage`
+- `--allowed-model <model>` / `--max-total-tokens <number>`: add `llm.usage`
+By default, `check` runs `run.status`. Additional built-in rules can be selected with `--rule` or config when their options are available.
+Config files use this shape:
+```json
+{
+  "checks": {
+    "select": ["run.status", "run.duration"],
+    "run": { "maxDurationMs": 30000 },
+    "tool": { "required": ["search_docs"] },
+    "llm": { "allowedModels": ["gpt-4.1-mini"], "maxTotalTokens": 12000 }
+  }
+}
+```
+YAML is not supported. TypeScript config files (`.ts`, `.mts`, `.cts`) fail clearly unless a future explicit loader strategy is added; use precompiled JavaScript config instead.
+Examples:
+```bash
+npx agent-inspect check fixtures/traces-v0.2/manual-basic.jsonl --json
+npx agent-inspect check minimal-success --dir fixtures/traces --rule run.status
+npx agent-inspect check trace.jsonl --max-duration-ms 30000 --required-tool search_docs --json
+```
+Recipe: [examples/recipes/deterministic-ci-checks](../examples/recipes/deterministic-ci-checks/README.md)
+### 6.9 `scan` and `verify-safe`
+Run best-effort local safety verification for supported trace inputs. These commands are local and read-only: they do not rerun agents, call models, upload traces, mutate input files, or certify compliance.
+```bash
+agent-inspect scan <trace-path-or-run-id> [options]
+agent-inspect verify-safe <trace-path-or-run-id> [options]
+```
+`<trace-path-or-run-id>` may be a trace file, directory, `-` for stdin, or a run id resolved with `--dir`.
+Statuses:
+- `SAFE`: no safety findings and no reader warnings.
+- `SAFE WITH WARNINGS`: no safety findings, but the reader reported warnings or unsupported fields.
+- `UNSAFE`: safety findings were detected.
+- `UNKNOWN`: the input could not be read, normalized, or selected conservatively.
+Options:
+- `--dir <path>`: trace directory for run-id lookup
+- `--format <agent-inspect-jsonl|openinference-json|otlp-json>`: explicit reader format override
+- `--run <run-id>`: select a run when input contains multiple runs
+- `--json`: print deterministic JSON safety result
+- `--max-string-length <number>`: unsafe threshold for string values
+- `--max-array-length <number>`: unsafe threshold for array values
+- `--max-object-keys <number>`: unsafe threshold for object key counts
+- `--max-serialized-bytes <number>`: unsafe threshold for serialized values
+The scan looks for raw prompt/output-like capture paths, unredacted sensitive-looking keys, secret-like string patterns, and oversized values. It reports evidence paths rather than raw prompt, output, request/response, header, API key, secret, or full tool payload values. Secret detection is best-effort and should not be treated as exhaustive.
+Examples:
+```bash
+npx agent-inspect scan fixtures/traces-v0.2/manual-basic.jsonl --json
+npx agent-inspect verify-safe minimal-success --dir fixtures/traces
+npx agent-inspect verify-safe trace.jsonl --max-string-length 8192 --json
+```
+### 6.10 `artifacts`
+Create deterministic local CI artifacts for supported trace inputs. This command is local and read-only for trace inputs: it does not rerun agents, call models, upload files, use GitHub APIs, or mutate repository state. It writes only to `--output-dir` and, when requested, a local step-summary file.
+```bash
+agent-inspect artifacts <trace-path-or-run-id> --output-dir <path> [options]
+```
+Generated files:
+- `trace.json`: structural trace summary only
+- `check.json`: safety check result
+- `diff.json`: baseline diff result, or `not_requested`
+- `summary.md`: safe Markdown CI summary
+- `report.html`: safe HTML CI summary
+- `manifest.json`: deterministic file/status manifest
+Options:
+- `--output-dir <path>`: required local artifact directory
+- `--dir <path>`: trace directory for run-id lookup
+- `--format <agent-inspect-jsonl|openinference-json|otlp-json>`: explicit reader format override
+- `--run <run-id>`: select a run when input contains multiple runs
+- `--baseline <trace-path-or-run-id>`: optional baseline trace for diff artifacts
+- `--baseline-run <run-id>`: select a run from the baseline trace
+- `--github-summary <path>`: append the safe Markdown summary to this file, such as `$GITHUB_STEP_SUMMARY`
+- `--json`: print deterministic `manifest.json` content
+The artifact command runs safety checks before rendering and only includes structural counts, statuses, bounded check findings, diagnostics, and evidence paths. Baseline diff artifacts use normalized baseline checks and also avoid raw prompt/output/tool payload values. `--github-summary` is plain local file output; AgentInspect does not call GitHub APIs or upload artifacts.
+Examples:
+```bash
+npx agent-inspect artifacts fixtures/traces-v0.2/manual-basic.jsonl --output-dir ./artifacts --json
+npx agent-inspect artifacts minimal-success --dir fixtures/traces --output-dir ./artifacts --github-summary "$GITHUB_STEP_SUMMARY"
+npx agent-inspect artifacts candidate.jsonl --baseline baseline.jsonl --output-dir ./artifacts
+```
+Recipe and sample workflow: [examples/recipes/deterministic-ci-checks](../examples/recipes/deterministic-ci-checks/README.md)
+### 6.11 `diff`
 Compare two manual trace runs. Diff is **local** and **read-only** (does not rerun agents).
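
Reviewer note: the four statuses listed under `scan` and `verify-safe` in this hunk follow a simple precedence. A minimal sketch of that precedence, assuming a hypothetical `ScanOutcome` shape (the real result type is not shown in this diff and may differ):

```typescript
type SafetyStatus = "SAFE" | "SAFE WITH WARNINGS" | "UNSAFE" | "UNKNOWN";

// Hypothetical input shape for illustration only.
interface ScanOutcome {
  readable: boolean;          // false when input could not be read, normalized, or selected
  findingCount: number;       // number of safety findings
  readerWarningCount: number; // reader warnings or unsupported-field reports
}

// Applies the documented definitions: UNKNOWN first, then UNSAFE, then the two SAFE variants.
function classify(outcome: ScanOutcome): SafetyStatus {
  if (!outcome.readable) return "UNKNOWN";
  if (outcome.findingCount > 0) return "UNSAFE";
  return outcome.readerWarningCount > 0 ? "SAFE WITH WARNINGS" : "SAFE";
}

console.log(classify({ readable: true, findingCount: 0, readerWarningCount: 2 }));
```

Checking readability before findings is an assumption of this sketch; it matches the documented wording that `UNKNOWN` covers inputs that could not be read at all.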
@@ -286,7 +434,7 @@ Differences:
 More examples, including timing-only and structure-only diffs, are in `docs/DIFF.md`.
-### 6.9 `timeline`
+### 6.12 `timeline`
 Chronological step list for one manual trace. Read-only; does not mutate JSONL files.
@@ -302,7 +450,7 @@ Options:
 ![Timeline with slow-step focus](../assets/demos/timeline.gif)
-### 6.10 `stats`
+### 6.13 `stats`
 Local aggregate statistics over trace files in a directory. Read-only.
@@ -322,7 +470,7 @@ Options:
 Use `--correlation-id` or `--group-id` to filter runs by `run_started` metadata (see [API.md](./API.md)).
-### 6.11 `search`
+### 6.14 `search`
 Deterministic search over local traces (substring / exact filters). No semantic search.
@@ -352,7 +500,7 @@ npx agent-inspect search --duration ">100ms" --json
 ![Search traces by status error](../assets/demos/search.gif)
-### 6.12 `what`
+### 6.15 `what`
 Concise human-readable summary of one local trace run. Read-only; accepts v0.1 manual JSONL and v0.2 persisted-event JSONL through the shared dual-format normalization path. Vocabulary: [TRACE-VOCABULARY-V1.5.md](./proposals/TRACE-VOCABULARY-V1.5.md).
@@ -381,7 +529,7 @@ Outcome: Completed successfully.
 Slowest: plan (100ms, logic)
 ```
-### 6.13 `report`
+### 6.16 `report`
 Generate a local inspection report combining **what happened**, **timeline**, and **execution tree** sections. The command reads local v0.1 manual JSONL and v0.2 persisted-event JSONL through the shared dual-format normalization path without mutating them. Distinct from `export` (which targets shareable tree snapshots and standards formats).

package/docs/KNOWN-ISSUES.md CHANGED

@@ -23,7 +23,7 @@ AgentInspect is **local-first** and **CLI-first**. These behaviors are intention
 - **Vendor sinks** (hosted dashboards, Langfuse/Braintrust/New Relic/Datadog native uploads, OTLP gRPC streaming, etc.) are **not implemented** in the core packages described here.
 - **AI SDK adapter** (`@agent-inspect/ai-sdk`) is experimental and metadata-first. It depends on explicit AI SDK telemetry configuration and requires `recordInputs: false` / `recordOutputs: false` for the documented safe path.
-- **OpenAI Agents JS adapter** (`@agent-inspect/openai-agents`) is scaffold-only in the v1.7 train. Runtime span mapping is not implemented, and the safe future path is `setTraceProcessors()` rather than `addTraceProcessor()`.
+- **OpenAI Agents JS adapter** (`@agent-inspect/openai-agents`) is experimental and remains private/unpublished until the v1.8 first-publication gate. Runtime metadata mapping is local-only; the safe install path is `setTraceProcessors()` rather than `addTraceProcessor()`.
 - **LangGraph support** is currently a documented boundary through `@agent-inspect/langchain`, not a dedicated package.
 - **LangChain adapter** captures **metadata-oriented** signals by default; it does not replace full framework observability.
 - **LangChain `stream: true`** records chunk counts and timing only — not a full token replay. Per-token JSONL events are not emitted.
@@ -93,6 +93,12 @@ pnpm compat:smoke
 - Fixture pattern: [test/consumer-fixtures/jest-cjs/](../../test/consumer-fixtures/jest-cjs/).
 - Full Jest runner smoke in CI is a documented follow-up — root package does not ship Jest as a devDependency.
+## v1.8 pre-release adoption notes
+- `@agent-inspect/vitest` and `@agent-inspect/jest` are private/unpublished until the v1.8 release-readiness gate completes. The [test reporter artifact recipe](../examples/recipes/test-reporter-artifacts/README.md) documents the intended config shape without requiring those packages.
+- `agent-inspect artifacts --github-summary` writes a local step-summary file only. It does not call GitHub APIs, open PR comments, upload artifacts, or mutate repository state.
+- Baseline checks compare normalized structural facts from explicit candidate and baseline inputs. They are useful for CI regression evidence, not replay or semantic eval scoring.
 ### What to include in a bug report
 - Node.js version (`node -v`)

package/docs/LIMITATIONS.md CHANGED

@@ -31,7 +31,7 @@ This document states what AgentInspect **does not** provide today. It complement
 - **AI SDK integration is explicit telemetry wiring.** Use `@agent-inspect/ai-sdk` through AI SDK `experimental_telemetry.integrations`; AgentInspect does not wrap providers, patch fetch, or enable telemetry globally.
 - **AI SDK privacy settings are caller-owned.** Examples set `recordInputs: false` and `recordOutputs: false`; leaving those enabled in user code can cause the AI SDK telemetry layer to include richer data before AgentInspect receives events.
-- **OpenAI Agents JS support is scaffold-only.** `@agent-inspect/openai-agents` documents the safe `setTraceProcessors()` boundary but does not map runtime spans yet and is not part of the v1.7 published package set.
+- **OpenAI Agents JS support is experimental and not published yet.** `@agent-inspect/openai-agents` maps metadata-only runtime spans through the safe `setTraceProcessors()` boundary, remains private until the v1.8 first-publication gate, and does not capture raw payloads by default.
 - **LangGraph support is a boundary decision, not a separate package.** Initial support is expected through `@agent-inspect/langchain` callbacks unless no-network fixtures prove a separate package is needed.
 - **No root/core adapter dependencies.** AI SDK, OpenAI Agents, LangGraph, OpenTelemetry, and LangChain remain outside the root/core runtime dependency graph.
@@ -54,6 +54,12 @@ This document states what AgentInspect **does not** provide today. It complement
 - **Metadata truncation** applies to string values and nested structures; very large metadata may be replaced with a truncation marker when `maxEventBytes` is exceeded (default 64 KiB per JSONL line).
 - **Redaction is not encryption.** Local trace files remain readable on disk; treat `.agent-inspect-runs/` like any developer artifact that may contain operational data.
+## Checks, artifacts, and test reporters
+- **Checks are deterministic local rules, not compliance certification.** `check`, `scan`, and `verify-safe` surface bounded findings and diagnostics over supported local inputs; they do not prove a trace is safe for every sharing context.
+- **Safe CI artifacts are structural summaries.** They avoid raw prompt/output/request/response/header/tool payload content by default, but teams should still review generated files before sharing.
+- **Vitest/Jest reporters are optional and unpublished until release readiness in the v1.8 train.** The recipes document config patterns and explicit associations; consumers should install the packages only after publication.
 ## Execution semantics
 - **No replay / fork** of past runs from traces alone.

package/docs/SCHEMA.md CHANGED

@@ -250,6 +250,7 @@ v1.6 adds experimental writer and reader surfaces without changing the stable ma
 - `agent-inspect/readers` and `agent-inspect open` read local AgentInspect JSONL, OpenInference JSON, and OTLP JSON inputs through compatibility adapters.
 - OpenInference and OTLP JSON inputs are **not** a third AgentInspect persisted schema. They are local read formats normalized into inspection trees with warnings and unsupported-field reporting.
 - Reader and writer APIs perform no network upload and do not mutate source files.
+- v1.8 checks, safety verification, baseline comparison, safe CI artifacts, and reporter artifacts are report layers over existing trace inputs. They do not change manual trace writing, introduce a third persisted trace model, or embed raw prompt/output/request/response/header/tool payload content in their default structural outputs.
 ## 16. Migration notes
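
Reviewer note: the package/package.json diff in this changeset adds a `./checks` subpath export with separate `import` and `require` branches. The sketch below is a simplified model of how conditional exports pick a target, not Node's actual resolver; `resolveTarget` and `checksExport` are hypothetical names, and the paths are copied from that package.json hunk.

```typescript
// A target is either a path string or a nested map of condition -> target.
type ExportTarget = string | { [condition: string]: ExportTarget };

// The "./checks" entry as it appears in the 1.8.0 package.json.
const checksExport: ExportTarget = {
  import: {
    types: "./packages/core/dist/checks.d.ts",
    default: "./packages/core/dist/checks.mjs",
  },
  require: {
    types: "./packages/core/dist/checks.d.cts",
    default: "./packages/core/dist/checks.cjs",
  },
};

// Walks keys in declaration order; the first active condition (or "default") wins.
function resolveTarget(
  target: ExportTarget,
  conditions: ReadonlySet<string>,
): string | undefined {
  if (typeof target === "string") return target;
  for (const [condition, nested] of Object.entries(target)) {
    if (condition === "default" || conditions.has(condition)) {
      const resolved = resolveTarget(nested, conditions);
      if (resolved !== undefined) return resolved;
    }
  }
  return undefined;
}

console.log(resolveTarget(checksExport, new Set(["import", "node"])));
console.log(resolveTarget(checksExport, new Set(["require", "node"])));
console.log(resolveTarget(checksExport, new Set(["import", "types"])));
```

Under this model, ESM consumers of `agent-inspect/checks` land on `checks.mjs`, CommonJS consumers on `checks.cjs`, and a type-aware resolver that activates the `types` condition lands on the matching declaration file.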

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agent-inspect",
-  "version": "1.7.0",
+  "version": "1.8.0",
   "license": "MIT",
   "type": "module",
   "description": "Local-first execution-tree debugger for TypeScript AI agents",
@@ -95,6 +95,16 @@
         "types": "./packages/core/dist/readers.d.cts",
         "default": "./packages/core/dist/readers.cjs"
       }
+    },
+    "./checks": {
+      "import": {
+        "types": "./packages/core/dist/checks.d.ts",
+        "default": "./packages/core/dist/checks.mjs"
+      },
+      "require": {
+        "types": "./packages/core/dist/checks.d.cts",
+        "default": "./packages/core/dist/checks.cjs"
+      }
     }
   },
   "bin": {
@@ -158,7 +168,7 @@
   },
   "scripts": {
     "clean": "pnpm -r exec -- rm -rf dist",
-    "build": "pnpm exec tsup --config tsup.core.config.ts && pnpm exec tsup --config tsup.cli.config.ts && pnpm exec tsup --config tsup.langchain.config.ts && pnpm exec tsup --config tsup.tui.config.ts && pnpm exec tsup --config tsup.ai-sdk.config.ts && pnpm exec tsup --config tsup.openai-agents.config.ts",
+    "build": "pnpm exec tsup --config tsup.core.config.ts && pnpm exec tsup --config tsup.cli.config.ts && pnpm exec tsup --config tsup.langchain.config.ts && pnpm exec tsup --config tsup.tui.config.ts && pnpm exec tsup --config tsup.ai-sdk.config.ts && pnpm exec tsup --config tsup.vitest.config.ts && pnpm exec tsup --config tsup.jest.config.ts && pnpm exec tsup --config tsup.openai-agents.config.ts",
     "typecheck": "tsc --noEmit",
     "test": "vitest run",
     "test:watch": "vitest",