@botbotgo/agent-harness 0.0.232 → 0.0.233

package/README.md CHANGED
@@ -1028,9 +1028,9 @@ ACP transport notes:
  - `serveAgUiHttp(runtime)` exposes an AG-UI-compatible HTTP SSE bridge that projects runtime lifecycle, text output, upstream thinking, step progress, and tool calls onto `RUN_*`, `TEXT_MESSAGE_*`, `THINKING_TEXT_MESSAGE_*`, `STEP_*`, and `TOOL_CALL_*` events for UI clients.
  - `createRuntimeMcpServer(runtime)` and `serveRuntimeMcpOverStdio(runtime)` expose the persisted runtime control surface itself as MCP tools, including sessions, requests, approvals, artifacts, events, and package export helpers.
  - `listRequestEvents(...)` and `exportRequestPackage(...)` are the request-first inspection helpers.
- - `exportRequestPackage(...)` and `exportSessionPackage(...)` package stable runtime records, transcript, approvals, events, and artifacts for operator tooling without reaching into persistence internals.
+ - `exportRequestPackage(...)` and `exportSessionPackage(...)` package stable runtime records, transcript, approvals, events, artifacts, and governance evidence for operator tooling without reaching into persistence internals.
  - `runtime/default.governance.remoteMcp` can now deny or allow specific MCP servers, raise approval requirements by transport, and stamp transport-based risk tiers into runtime governance bundles. MCP server catalogs can also declare trust tier, access mode, tenant scope, approval policy, prompt-injection risk, and OAuth scope metadata so governance bundles capture why one remote tool is treated as high-risk.
  - Protocol responsibilities stay split on purpose: ACP is the primary editor/client runtime boundary, A2A is the streaming-capable agent-platform bridge with polling compatibility, AG-UI is the UI event surface, and runtime MCP is the operator-facing control plane exported as MCP tools.
  - `runtime/default.observability.tracing` can now describe exporter metadata such as OTLP endpoints and propagation mode, so frozen runtime snapshots keep trace-correlation plus operator-visible export context without exposing backend-private span internals.
- - `agent-harness runtime overview`, `agent-harness runtime health`, `agent-harness runtime approvals list|watch`, and `agent-harness runtime runs list|tail` provide a thin operator CLI over persisted runtime health, queue pressure, governance risk, approval queues, and active run state.
+ - `agent-harness runtime overview`, `agent-harness runtime health`, `agent-harness runtime approvals list|watch`, `agent-harness runtime runs list|tail`, and `agent-harness runtime export request|session` provide a thin operator CLI over persisted runtime health, queue pressure, governance risk, approval queues, active run state, and audit-ready evidence packages.
  - detailed A2A adapter guidance lives in [`docs/a2a-bridge.md`](docs/a2a-bridge.md)
package/README.zh.md CHANGED
@@ -986,9 +986,9 @@ ACP transport notes:
  - `serveAgUiHttp(runtime)` provides an AG-UI HTTP SSE bridge that projects the runtime lifecycle, text output, upstream thinking, step progress, and tool calls onto `RUN_*`, `TEXT_MESSAGE_*`, `THINKING_TEXT_MESSAGE_*`, `STEP_*`, and `TOOL_CALL_*` events so UI clients can connect directly.
  - `createRuntimeMcpServer(runtime)` and `serveRuntimeMcpOverStdio(runtime)` expose the persisted runtime control surface itself as MCP tools, including sessions, requests, approvals, artifacts, events, and package export helpers.
  - `listRequestEvents(...)` and `exportRequestPackage(...)` are the request-first inspection helpers.
- - `exportRequestPackage(...)` and `exportSessionPackage(...)` package stable runtime records, transcript, approvals, events, and artifacts for management tooling without directly accessing persistence internals.
+ - `exportRequestPackage(...)` and `exportSessionPackage(...)` package stable runtime records, transcript, approvals, events, artifacts, and governance evidence together for management tooling without directly accessing persistence internals.
  - `runtime/default.governance.remoteMcp` can now allow or deny by MCP server or transport, escalate approval requirements, and write transport risk tiers into runtime governance bundles. MCP server catalogs can also declare trust tier, access mode, tenant scope, approval policy, prompt-injection risk, and OAuth scope metadata, so governance snapshots can explain why a given remote tool is treated as high-risk.
  - Protocol responsibilities should stay clearly separated: ACP is the primary runtime boundary for editors / clients, A2A is the agent-platform bridge that supports streaming and stays compatible with polling, AG-UI is the UI event surface, and runtime MCP is the operator control plane exposed as MCP tools.
  - `runtime/default.observability.tracing` can now describe exporter metadata such as the OTLP endpoint and propagation mode, so a frozen runtime snapshot keeps trace correlation while also retaining useful export context, without exposing backend-private span details.
- - `agent-harness runtime overview`, `agent-harness runtime health`, `agent-harness runtime approvals list|watch`, and `agent-harness runtime runs list|tail` provide a lightweight CLI for directly inspecting runtime health, queue pressure, governance risk, approval queues, and run state.
+ - `agent-harness runtime overview`, `agent-harness runtime health`, `agent-harness runtime approvals list|watch`, `agent-harness runtime runs list|tail`, and `agent-harness runtime export request|session` provide a lightweight CLI for directly inspecting runtime health, queue pressure, governance risk, approval queues, run state, and auditable evidence packages.
  - More detailed A2A adapter development notes are in [`docs/a2a-bridge.md`](docs/a2a-bridge.md)
package/dist/api.d.ts CHANGED
@@ -1,4 +1,4 @@
- import type { ArtifactListing, CancelOptions, InvocationEnvelope, ListMemoriesInput, ListMemoriesResult, MemoryRecord, MemorizeInput, MemorizeResult, MessageContent, RecallInput, RecallResult, RemoveMemoryInput, RequestRecord, RequestSummary, ResumeOptions, RunDecisionOptions, RunListeners, RunResult, RunStartOptions, RuntimeHealthSnapshot, RuntimeGovernanceDiagnostics, RuntimeOperatorOverview, RuntimeQueueDiagnostics, RuntimeAdapterOptions, RuntimeEvaluationExport, RuntimeEvaluationExportInput, RuntimeEvaluationReplayInput, RuntimeEvaluationReplayResult as InternalRuntimeEvaluationReplayResult, RuntimeSessionPackage, RuntimeSessionPackageInput, SessionListSummary, SessionRecord, SessionSummary, TranscriptMessage, UpdateMemoryInput, WorkspaceLoadOptions } from "./contracts/types.js";
+ import type { ArtifactListing, CancelOptions, InvocationEnvelope, ListMemoriesInput, ListMemoriesResult, MemoryRecord, MemorizeInput, MemorizeResult, MessageContent, RecallInput, RecallResult, RemoveMemoryInput, RequestRecord, RequestSummary, ResumeOptions, RunDecisionOptions, RunListeners, RunResult, RunStartOptions, RuntimeHealthSnapshot, RuntimeGovernanceEvidence, RuntimeGovernanceDiagnostics, RuntimeOperatorOverview, RuntimeQueueDiagnostics, RuntimeAdapterOptions, RuntimeEvaluationExport, RuntimeEvaluationExportInput, RuntimeEvaluationReplayInput, RuntimeEvaluationReplayResult as InternalRuntimeEvaluationReplayResult, RuntimeSessionPackage, RuntimeSessionPackageInput, SessionListSummary, SessionRecord, SessionSummary, TranscriptMessage, UpdateMemoryInput, WorkspaceLoadOptions } from "./contracts/types.js";
  import { AgentHarnessRuntime } from "./runtime/harness.js";
  import type { InventoryAgentRecord, InventorySkillRecord } from "./runtime/harness/system/inventory.js";
  import type { RequirementAssessmentOptions } from "./runtime/harness/system/skill-requirements.js";
@@ -45,6 +45,7 @@ export type Approval = {
      sessionId: string;
      requestId: string;
      toolName: string;
+     approvalReason?: string;
      status: "pending" | "approved" | "edited" | "rejected" | "expired";
      requestedAt: string;
      resolvedAt: string | null;
@@ -69,6 +70,7 @@ export type RequestPackage = {
      transcript: TranscriptMessage[];
      events: RequestEvent[];
      artifacts: RequestArtifactListing["items"];
+     governance: RuntimeGovernanceEvidence;
      runtimeHealth?: RuntimeHealthSnapshot;
  };
  export type RuntimeEvaluationReplayResult = Omit<InternalRuntimeEvaluationReplayResult, "result"> & {
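
A minimal sketch of how a consumer might read the new `governance` field from a request package. The sample object only mirrors the field names visible in this diff; the values, the tool name, and the helper functions `hasPendingApprovals` and `describeGovernance` are hypothetical and not part of the package.

```javascript
// Hypothetical sample shaped like the RequestPackage type in this diff.
// All values are invented for illustration.
const samplePackage = {
  transcript: [],
  events: [],
  artifacts: [],
  governance: {
    bundles: [],
    approvalSummary: {
      total: 2,
      pending: 1,
      approved: 1,
      edited: 0,
      rejected: 0,
      expired: 0,
      toolNames: ["shell.exec"],
      approvalReasons: ["remote-mcp-high-risk"],
    },
    summary: "0 governance bundle(s) 2 approval record(s) reasons=remote-mcp-high-risk",
  },
};

// Returns true when the exported evidence still has unresolved approvals.
function hasPendingApprovals(pkg) {
  return pkg.governance.approvalSummary.pending > 0;
}

// Builds a short operator-facing line from the evidence block.
function describeGovernance(pkg) {
  const { approvalSummary, bundles } = pkg.governance;
  const tools = approvalSummary.toolNames.join(", ") || "none";
  return `bundles=${bundles.length} approvals=${approvalSummary.total} pending=${approvalSummary.pending} tools=${tools}`;
}

console.log(describeGovernance(samplePackage));
// prints "bundles=0 approvals=2 pending=1 tools=shell.exec"
```

Because `governance` is declared without `?`, a consumer of this version can read it without a presence check, unlike `runtimeHealth`.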
package/dist/api.js CHANGED
@@ -17,6 +17,7 @@ function toApprovalRecord(record) {
          sessionId: record.threadId,
          requestId: record.runId,
          toolName: record.toolName,
+         ...(record.approvalReason ? { approvalReason: record.approvalReason } : {}),
          status: record.status,
          requestedAt: record.requestedAt,
          resolvedAt: record.resolvedAt,
@@ -84,6 +85,7 @@ function toRequestPackage(pkg) {
          transcript: pkg.transcript,
          events: pkg.events.map(toPublicEvent),
          artifacts: pkg.artifacts,
+         governance: pkg.governance,
          ...(pkg.runtimeHealth ? { runtimeHealth: pkg.runtimeHealth } : {}),
      };
  }
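
The `approvalReason` and `runtimeHealth` lines above both use a conditional spread to leave a key out of the result when its value is falsy. A small sketch of that pattern follows; `toPublicApproval` is a simplified, hypothetical stand-in for `toApprovalRecord` that keeps only the fields needed to show the effect.

```javascript
// Simplified stand-in for the mapping in toApprovalRecord (hypothetical helper).
function toPublicApproval(record) {
  return {
    toolName: record.toolName,
    // Spread an object holding the key only when the value is truthy, so
    // undefined and "" both leave the key out of the result entirely.
    ...(record.approvalReason ? { approvalReason: record.approvalReason } : {}),
    status: record.status,
  };
}

const withReason = toPublicApproval({ toolName: "fetch", approvalReason: "policy", status: "pending" });
const withoutReason = toPublicApproval({ toolName: "fetch", approvalReason: "", status: "pending" });

console.log("approvalReason" in withReason);    // true
console.log("approvalReason" in withoutReason); // false
console.log(JSON.stringify(withoutReason));     // {"toolName":"fetch","status":"pending"}
```

One consequence of this choice is that serialized approvals never carry `"approvalReason": ""` or an explicit `undefined`, which matches the optional `approvalReason?: string` declaration.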
package/dist/cli.js CHANGED
@@ -20,6 +20,8 @@ function renderUsage() {
  agent-harness runtime approvals watch [--workspace <path>] [--status <pending|approved|edited|rejected|expired>] [--poll-ms <ms>] [--once] [--json]
  agent-harness runtime runs list [--workspace <path>] [--agent <agentId>] [--thread <threadId>] [--state <state>] [--json]
  agent-harness runtime runs tail [--workspace <path>] [--agent <agentId>] [--thread <threadId>] [--state <state>] [--poll-ms <ms>] [--once] [--json]
+ agent-harness runtime export request --workspace <path> --session <sessionId> --request <requestId> [--artifacts] [--artifact-contents] [--health] [--json]
+ agent-harness runtime export session --workspace <path> --session <sessionId> [--artifacts] [--artifact-contents] [--health] [--json]
  agent-harness runtime-mcp serve [--workspace <path>]
  `;
  }
@@ -218,6 +220,80 @@ function parseRuntimeInspectOptions(args) {
      }
      return { workspaceRoot, json, once, pollMs, limit, status, state, agentId, threadId };
  }
+ function parseRuntimeExportOptions(args) {
+     let workspaceRoot;
+     let sessionId;
+     let requestId;
+     let includeArtifacts = false;
+     let includeArtifactContents = false;
+     let includeRuntimeHealth = false;
+     let json = false;
+     for (let index = 0; index < args.length; index += 1) {
+         const arg = args[index];
+         if (arg === "--artifacts") {
+             includeArtifacts = true;
+             continue;
+         }
+         if (arg === "--artifact-contents") {
+             includeArtifacts = true;
+             includeArtifactContents = true;
+             continue;
+         }
+         if (arg === "--health") {
+             includeRuntimeHealth = true;
+             continue;
+         }
+         if (arg === "--json") {
+             json = true;
+             continue;
+         }
+         if (arg === "--workspace" || arg === "--session" || arg === "--request") {
+             const value = args[index + 1];
+             if (!value) {
+                 return {
+                     workspaceRoot,
+                     sessionId,
+                     requestId,
+                     includeArtifacts,
+                     includeArtifactContents,
+                     includeRuntimeHealth,
+                     json,
+                     error: `Missing value for ${arg}`,
+                 };
+             }
+             if (arg === "--workspace") {
+                 workspaceRoot = value;
+             }
+             else if (arg === "--session") {
+                 sessionId = value;
+             }
+             else {
+                 requestId = value;
+             }
+             index += 1;
+             continue;
+         }
+         return {
+             workspaceRoot,
+             sessionId,
+             requestId,
+             includeArtifacts,
+             includeArtifactContents,
+             includeRuntimeHealth,
+             json,
+             error: `Unknown option: ${arg}`,
+         };
+     }
+     return {
+         workspaceRoot,
+         sessionId,
+         requestId,
+         includeArtifacts,
+         includeArtifactContents,
+         includeRuntimeHealth,
+         json,
+     };
+ }
  function renderJson(value) {
      return `${JSON.stringify(value, null, 2)}\n`;
  }
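
The parser added above can be restated as a table-driven loop, which makes the flag semantics easier to see at a glance. This is only a sketch under that assumption: `parseExportArgs`, `BOOLEAN_FLAGS`, and `VALUE_FLAGS` are hypothetical names, and the package itself uses the explicit `if` chain shown in the hunk.

```javascript
// Flags that take no value, mapped to the option fields they switch on.
// Note that --artifact-contents also turns on includeArtifacts.
const BOOLEAN_FLAGS = new Map([
  ["--artifacts", ["includeArtifacts"]],
  ["--artifact-contents", ["includeArtifacts", "includeArtifactContents"]],
  ["--health", ["includeRuntimeHealth"]],
  ["--json", ["json"]],
]);

// Flags that consume the next argument as their value.
const VALUE_FLAGS = new Map([
  ["--workspace", "workspaceRoot"],
  ["--session", "sessionId"],
  ["--request", "requestId"],
]);

function parseExportArgs(args) {
  const options = {
    workspaceRoot: undefined,
    sessionId: undefined,
    requestId: undefined,
    includeArtifacts: false,
    includeArtifactContents: false,
    includeRuntimeHealth: false,
    json: false,
  };
  for (let index = 0; index < args.length; index += 1) {
    const arg = args[index];
    if (BOOLEAN_FLAGS.has(arg)) {
      for (const field of BOOLEAN_FLAGS.get(arg)) {
        options[field] = true;
      }
      continue;
    }
    if (VALUE_FLAGS.has(arg)) {
      const value = args[index + 1];
      if (!value) {
        return { ...options, error: `Missing value for ${arg}` };
      }
      options[VALUE_FLAGS.get(arg)] = value;
      index += 1; // skip the value that was just consumed
      continue;
    }
    // Anything else stops parsing with an error, as in the original.
    return { ...options, error: `Unknown option: ${arg}` };
  }
  return options;
}

console.log(parseExportArgs(["--session", "s1", "--artifact-contents"]));
console.log(parseExportArgs(["--request"]).error); // Missing value for --request
console.log(parseExportArgs(["--bogus"]).error);   // Unknown option: --bogus
```

As in the original, a value flag accepts whatever token follows it, so `--session --json` would record `--json` as the session id rather than report a missing value.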
@@ -517,15 +593,73 @@ export async function runCli(argv, io = {}, deps = {}) {
          }
      }
      if (command === "runtime") {
-         const [subcommand, possibleNestedCommand, ...remainingArgs] = [projectName, ...rest];
+         const [subcommand, possibleNestedCommand, possibleThirdCommand, ...remainingArgs] = [projectName, ...rest];
          if (!subcommand) {
              stderr(renderUsage());
              return 1;
          }
+         if (subcommand === "export") {
+             const exportTarget = possibleNestedCommand;
+             const parsed = parseRuntimeExportOptions([possibleThirdCommand, ...remainingArgs].filter((item) => typeof item === "string"));
+             if (parsed.error) {
+                 stderr(`${parsed.error}\n`);
+                 stderr(renderUsage());
+                 return 1;
+             }
+             if (!parsed.sessionId) {
+                 stderr("Missing value for --session\n");
+                 stderr(renderUsage());
+                 return 1;
+             }
+             if (exportTarget !== "request" && exportTarget !== "session") {
+                 stderr(renderUsage());
+                 return 1;
+             }
+             if (exportTarget === "request" && !parsed.requestId) {
+                 stderr("Missing value for --request\n");
+                 stderr(renderUsage());
+                 return 1;
+             }
+             try {
+                 const runtime = await createHarness(path.resolve(cwd, parsed.workspaceRoot ?? "."));
+                 try {
+                     if (exportTarget === "request") {
+                         const pkg = await runtime.exportRequestPackage({
+                             sessionId: parsed.sessionId,
+                             requestId: parsed.requestId,
+                             includeArtifacts: parsed.includeArtifacts,
+                             includeArtifactContents: parsed.includeArtifactContents,
+                             includeRuntimeHealth: parsed.includeRuntimeHealth,
+                         });
+                         stdout(renderJson(pkg));
+                     }
+                     else {
+                         const pkg = await runtime.exportSessionPackage({
+                             sessionId: parsed.sessionId,
+                             includeArtifacts: parsed.includeArtifacts,
+                             includeArtifactContents: parsed.includeArtifactContents,
+                             includeRuntimeHealth: parsed.includeRuntimeHealth,
+                         });
+                         stdout(renderJson(pkg));
+                     }
+                 }
+                 finally {
+                     await runtime.stop();
+                 }
+                 return 0;
+             }
+             catch (error) {
+                 const message = error instanceof Error ? error.message : String(error);
+                 stderr(`${message}\n`);
+                 return 1;
+             }
+         }
          const nestedCommand = (subcommand === "approvals" || subcommand === "runs") && possibleNestedCommand
              ? possibleNestedCommand
              : undefined;
-         const subcommandArgs = nestedCommand ? remainingArgs : [possibleNestedCommand, ...remainingArgs].filter((item) => typeof item === "string");
+         const subcommandArgs = nestedCommand
+             ? [possibleThirdCommand, ...remainingArgs].filter((item) => typeof item === "string")
+             : [possibleNestedCommand, possibleThirdCommand, ...remainingArgs].filter((item) => typeof item === "string");
          const parsed = parseRuntimeInspectOptions(subcommandArgs);
          if (parsed.error) {
              stderr(`${parsed.error}\n`);
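
The dispatch change above pulls a third positional token out of the argument list and then folds it back into the option array. The sketch below isolates that step; `splitRuntimeArgs` is a hypothetical helper, and it takes the tokens after `runtime` directly instead of the `[projectName, ...rest]` pair used in `runCli`.

```javascript
// Splits the tokens that follow `runtime` the way the destructuring in
// the hunk does (hypothetical helper for illustration).
function splitRuntimeArgs(tokens) {
  const [subcommand, nested, third, ...remaining] = tokens;
  // `third` is normally the first flag rather than a command, so it goes
  // back onto the option list. The filter drops it when it is undefined,
  // which happens when fewer than three tokens were given.
  const optionArgs = [third, ...remaining].filter((item) => typeof item === "string");
  return { subcommand, target: nested, optionArgs };
}

console.log(splitRuntimeArgs(["export", "request", "--session", "s1", "--request", "r1"]));
// optionArgs is ["--session", "s1", "--request", "r1"]

console.log(splitRuntimeArgs(["export", "session"]));
// optionArgs is []
```

The same re-assembly appears in the rewritten `subcommandArgs` expression, which is why the existing `approvals` and `runs` commands see the same option list as before even though one more positional is now destructured.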
@@ -572,6 +572,7 @@ export type ApprovalRecord = {
      threadId: string;
      runId: string;
      toolName: string;
+     approvalReason?: string;
      status: "pending" | "approved" | "edited" | "rejected" | "expired";
      requestedAt: string;
      resolvedAt: string | null;
@@ -726,6 +727,21 @@ export type RuntimeRunPackageInput = {
      includeArtifactContents?: boolean;
      includeRuntimeHealth?: boolean;
  };
+ export type RuntimeApprovalSummary = {
+     total: number;
+     pending: number;
+     approved: number;
+     edited: number;
+     rejected: number;
+     expired: number;
+     toolNames: string[];
+     approvalReasons: string[];
+ };
+ export type RuntimeGovernanceEvidence = {
+     bundles: RuntimeGovernanceBundle[];
+     approvalSummary: RuntimeApprovalSummary;
+     summary: string;
+ };
  export type RuntimeRunPackage = {
      session: SessionRecord | null;
      request: RequestRecord | null;
@@ -733,6 +749,7 @@ export type RuntimeRunPackage = {
      transcript: TranscriptMessage[];
      events: HarnessEvent[];
      artifacts: RuntimeEvaluationArtifact[];
+     governance: RuntimeGovernanceEvidence;
      runtimeHealth?: RuntimeHealthSnapshot;
  };
  export type RuntimeSessionPackageInput = {
@@ -747,6 +764,14 @@ export type RuntimeSessionPackage = {
      approvals: ApprovalRecord[];
      transcript: TranscriptMessage[];
      runs: RuntimeRunPackage[];
+     governance: {
+         runs: Array<{
+             requestId: string;
+             evidence: RuntimeGovernanceEvidence;
+         }>;
+         approvalSummary: RuntimeApprovalSummary;
+         summary: string;
+     };
      runtimeHealth?: RuntimeHealthSnapshot;
  };
  export type RuntimeInventoryContext = {
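
The session-level `governance` block nests one `RuntimeGovernanceEvidence` per run under `runs`, keyed by `requestId`. A sketch of walking that structure follows; the sample data, the request ids, and the helper `requestsWithPendingApprovals` are hypothetical, and only the field names come from the types above.

```javascript
// Builds a RuntimeApprovalSummary-shaped object for the sample (hypothetical helper).
function makeSummary(total, pending) {
  return {
    total,
    pending,
    approved: total - pending,
    edited: 0,
    rejected: 0,
    expired: 0,
    toolNames: [],
    approvalReasons: [],
  };
}

// Hypothetical sample shaped like RuntimeSessionPackage["governance"].
const sampleSessionGovernance = {
  runs: [
    { requestId: "req-1", evidence: { bundles: [], approvalSummary: makeSummary(1, 0), summary: "" } },
    { requestId: "req-2", evidence: { bundles: [], approvalSummary: makeSummary(2, 1), summary: "" } },
  ],
  approvalSummary: makeSummary(3, 1),
  summary: "2 run evidence package(s), 3 approval record(s)",
};

// Lists the request ids whose per-run evidence still has pending approvals.
function requestsWithPendingApprovals(governance) {
  return governance.runs
    .filter((run) => run.evidence.approvalSummary.pending > 0)
    .map((run) => run.requestId);
}

console.log(requestsWithPendingApprovals(sampleSessionGovernance)); // [ 'req-2' ]
```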
@@ -1 +1 @@
- export declare const AGENT_HARNESS_VERSION = "0.0.231";
+ export declare const AGENT_HARNESS_VERSION = "0.0.232";
@@ -1 +1 @@
- export const AGENT_HARNESS_VERSION = "0.0.231";
+ export const AGENT_HARNESS_VERSION = "0.0.232";
@@ -69,6 +69,24 @@ function toSessionListSummary(session) {
          snippet: normalizeSessionListText(session.lastMessage?.content, 160),
      };
  }
+ function summarizeApprovalEvidence(approvals) {
+     const toolNames = Array.from(new Set(approvals
+         .map((approval) => approval.toolName)
+         .filter((toolName) => typeof toolName === "string" && toolName.trim().length > 0)));
+     const approvalReasons = Array.from(new Set(approvals
+         .map((approval) => approval.approvalReason)
+         .filter((reason) => typeof reason === "string" && reason.trim().length > 0)));
+     return {
+         total: approvals.length,
+         pending: approvals.filter((approval) => approval.status === "pending").length,
+         approved: approvals.filter((approval) => approval.status === "approved").length,
+         edited: approvals.filter((approval) => approval.status === "edited").length,
+         rejected: approvals.filter((approval) => approval.status === "rejected").length,
+         expired: approvals.filter((approval) => approval.status === "expired").length,
+         toolNames,
+         approvalReasons,
+     };
+ }
  export class AgentHarnessRuntime {
      workspace;
      runtimeAdapterOptions;
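
To make the behavior of `summarizeApprovalEvidence` concrete, here is a condensed restatement run against sample data. `summarizeApprovals`, `uniqueNonBlank`, and the sample approvals are hypothetical; the counting and de-duplication rules are meant to match the function in the hunk, not replace it.

```javascript
const STATUSES = ["pending", "approved", "edited", "rejected", "expired"];

// Keeps non-blank strings, then removes duplicates in first-seen order.
// Values are tested with trim() but stored untrimmed, as in the original.
function uniqueNonBlank(values) {
  return Array.from(new Set(values.filter((value) => typeof value === "string" && value.trim().length > 0)));
}

function summarizeApprovals(approvals) {
  const summary = { total: approvals.length };
  for (const status of STATUSES) {
    summary[status] = approvals.filter((approval) => approval.status === status).length;
  }
  summary.toolNames = uniqueNonBlank(approvals.map((approval) => approval.toolName));
  summary.approvalReasons = uniqueNonBlank(approvals.map((approval) => approval.approvalReason));
  return summary;
}

// Invented sample records carrying only the fields the summary reads.
const sampleApprovals = [
  { toolName: "write_file", approvalReason: "high-risk transport", status: "pending" },
  { toolName: "write_file", status: "approved" },
  { toolName: "fetch", approvalReason: "  ", status: "pending" },
  { toolName: "fetch", approvalReason: "high-risk transport", status: "rejected" },
];

const sampleSummary = summarizeApprovals(sampleApprovals);
console.log(sampleSummary.total, sampleSummary.pending); // 4 2
console.log(sampleSummary.toolNames);                    // [ 'write_file', 'fetch' ]
console.log(sampleSummary.approvalReasons);              // [ 'high-risk transport' ]
```

Missing and whitespace-only reasons are dropped, so `approvalReasons` can be empty even when `total` is greater than zero.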
@@ -710,6 +728,15 @@ export class AgentHarnessRuntime {
                  ? { content: await this.persistence.readArtifact(input.sessionId, input.requestId, artifact.path) }
                  : {}),
          })));
+         const approvalSummary = summarizeApprovalEvidence(approvals);
+         const bundles = request?.runtimeSnapshot?.governance?.bundles ?? [];
+         const governanceSummaryParts = [
+             `${bundles.length} governance bundle(s)`,
+             `${approvalSummary.total} approval record(s)`,
+         ];
+         if (approvalSummary.approvalReasons.length > 0) {
+             governanceSummaryParts.push(`reasons=${approvalSummary.approvalReasons.join(",")}`);
+         }
          return {
              session,
              request,
@@ -717,6 +744,11 @@ export class AgentHarnessRuntime {
              transcript,
              events,
              artifacts,
+             governance: {
+                 bundles,
+                 approvalSummary,
+                 summary: governanceSummaryParts.join(" "),
+             },
              ...(input.includeRuntimeHealth === false ? {} : { runtimeHealth: await this.getHealth() }),
          };
      }
@@ -733,12 +765,25 @@ export class AgentHarnessRuntime {
              includeArtifactContents: input.includeArtifactContents,
              includeRuntimeHealth: false,
          })));
+         const approvals = await this.listApprovals({ threadId: input.sessionId });
+         const approvalSummary = summarizeApprovalEvidence(approvals);
+         const governanceRuns = runs
+             .filter((item) => item.request?.requestId)
+             .map((item) => ({
+                 requestId: item.request.requestId,
+                 evidence: item.governance,
+             }));
          return {
              session,
              requests: runs.map((item) => item.request).filter((item) => Boolean(item)),
-             approvals: await this.listApprovals({ threadId: input.sessionId }),
+             approvals,
              transcript: await this.persistence.listThreadMessages(input.sessionId, 500),
              runs,
+             governance: {
+                 runs: governanceRuns,
+                 approvalSummary,
+                 summary: `${governanceRuns.length} run evidence package(s), ${approvalSummary.total} approval record(s)`,
+             },
              ...(input.includeRuntimeHealth === false ? {} : { runtimeHealth: await this.getHealth() }),
          };
      }
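
Two details in the export methods above are easy to miss: the per-run summary string is joined with single spaces, and `runtimeHealth` is attached unless `includeRuntimeHealth` is exactly `false`. The sketch below isolates both; `buildRunSummary` and `includesHealth` are hypothetical helpers that mirror the expressions in the hunks.

```javascript
// Mirrors how the per-run governance summary string is assembled.
function buildRunSummary(bundles, approvalSummary) {
  const parts = [
    `${bundles.length} governance bundle(s)`,
    `${approvalSummary.total} approval record(s)`,
  ];
  if (approvalSummary.approvalReasons.length > 0) {
    parts.push(`reasons=${approvalSummary.approvalReasons.join(",")}`);
  }
  return parts.join(" ");
}

// Health is attached unless the caller passes exactly `false`.
function includesHealth(input) {
  return input.includeRuntimeHealth !== false;
}

console.log(buildRunSummary([], { total: 2, approvalReasons: ["a", "b"] }));
// prints "0 governance bundle(s) 2 approval record(s) reasons=a,b"

console.log(includesHealth({}));                              // true
console.log(includesHealth({ includeRuntimeHealth: false })); // false
```

The new CLI always passes a boolean for `includeRuntimeHealth`, so omitting `--health` sends `false` and leaves `runtimeHealth` out of the printed package, whereas a programmatic caller that omits the option gets it included.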
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@botbotgo/agent-harness",
-   "version": "0.0.232",
+   "version": "0.0.233",
    "description": "Workspace runtime for multi-agent applications",
    "license": "MIT",
    "type": "module",