@botbotgo/agent-harness 0.0.134 → 0.0.135

package/README.md CHANGED
@@ -17,6 +17,10 @@
17
17
  <strong>The application runtime for multi-agent products with approvals, recovery, and operator control built in.</strong>
18
18
  </p>
19
19
 
20
+ <p align="center">
21
+ <strong>Turn one agent workspace into one operable product runtime.</strong>
22
+ </p>
23
+
20
24
  <p align="center">
21
25
  <a href="https://botbotgo.github.io/agent-harness/">Product website</a>
22
26
  (static page in <code>docs/</code>, publish with GitHub Pages; EN / 中文 toggle)
@@ -31,6 +35,17 @@
31
35
 
32
36
  ## What Problem We Solve
33
37
 
38
+ In one line: `agent-harness` takes the runtime work that appears after the demo and makes it part of the product runtime from day one.
39
+
40
+ If your team already has agents, prompts, tools, and workflows, the missing layer is usually not more execution. It is the runtime that makes those pieces operable as software.
41
+
42
+ What you get on day one:
43
+
44
+ - a runtime that keeps `runs`, `threads`, `approvals`, and `events` as inspectable product records
45
+ - a recovery path that survives interruption, restart, and operator decisions
46
+ - one workspace-shaped assembly model instead of app-specific runtime glue
47
+ - one stable runtime contract even when execution backends change underneath
48
+
34
49
  AI makes it much easier to generate agent logic, tool calls, and workflow code. The hard part moves to operations.
35
50
 
36
51
  Once the demo works, the real software problem changes shape:
@@ -50,6 +65,12 @@ Teams still need answers to the runtime questions that appear after that shift:
50
65
 
51
66
  `agent-harness` solves that layer. It keeps agent execution upstream while making the application runtime operable, recoverable, and governable.
52
67
 
68
+ That means the product story becomes easier to explain:
69
+
70
+ - you bring the workspace, agents, tools, and prompts
71
+ - `agent-harness` brings persisted `runs`, `threads`, `approvals`, `events`, recovery, and operator visibility
72
+ - your application gets one stable runtime contract instead of backend-specific runtime plumbing
73
+
53
74
  Concretely, that means:
54
75
 
55
76
  - a product-facing approval and operator surface instead of backend-specific middleware state
@@ -152,6 +173,21 @@ Real products need a runtime that can answer harder questions:
152
173
  - It lets YAML own assembly and operating policy while code keeps a tiny surface
153
174
  - It goes deep on runtime concerns that upstream libraries do not fully productize
154
175
 
176
+ ## When To Use It
177
+
178
+ Use `agent-harness` when:
179
+
180
+ - you already know your product needs agents, tools, prompts, or MCP access, but the missing layer is runtime operations
181
+ - you need approvals, restart recovery, queueing, or inspectable run records as part of the shipped product
182
+ - you want one workspace-shaped assembly model instead of hand-written runtime bootstrapping in every app
183
+ - you want to keep backend execution semantics upstream while holding the product contract stable
184
+
185
+ Do not reach for it when:
186
+
187
+ - you only need a single short-lived agent call with no approvals, no persistence, and no operational control surface
188
+ - you are looking for a workflow builder or low-code automation canvas
189
+ - you want to replace LangChain v1 or DeepAgents execution semantics rather than operate around them
190
+
155
191
  ## Quick Start
156
192
 
157
193
  Install:
@@ -205,6 +241,17 @@ try {
205
241
  }
206
242
  ```
207
243
 
244
+ Three-minute mental model:
245
+
246
+ 1. Point `createAgentHarness(...)` at a workspace root.
247
+ 2. Call `run(runtime, { ... })` to execute one request.
248
+ 3. Inspect persisted runtime records instead of treating the final answer as the only product artifact.
249
+
250
+ This is the shortest product pitch:
251
+
252
+ - your team builds the agent app
253
+ - `agent-harness` makes that app operable
254
+
208
255
  If you want the shortest possible mental model:
209
256
 
210
257
  - one workspace becomes one runtime
@@ -429,6 +476,13 @@ Core workspace files:
429
476
  Workspace-local tool modules in `resources/tools/` should be exported with `tool({...})`.
430
477
  Any other local module shape is not supported, and unsupported shapes are rejected at load time.
431
478
 
479
+ Default wiring guidance:
480
+
481
+ - prefer agent-local wiring for workspace-owned function tools
482
+ - keep `config/catalogs/tools.yaml` for reusable shared tools
483
+ - keep `config/catalogs/mcp.yaml` for shared MCP server definitions
484
+ - let agents select MCP tools and apply per-usage MCP overrides where needed
485
+
432
486
  There are three main configuration layers:
433
487
 
434
488
  - runtime policy in `config/runtime/workspace.yaml`
@@ -581,10 +635,14 @@ Use this file for reusable tool objects.
581
635
 
582
636
  Built-in tool families include function tools, backend tools, MCP tools, bundles, and provider-native tools. Provider-native tools are declared in YAML and resolved directly to upstream factories.
583
637
 
638
+ For workspace-owned function tools, prefer agent-side wiring first. Keep `config/catalogs/tools.yaml` for reusable shared tool objects rather than making it the default path for every local tool.
639
+
584
640
  ### `config/catalogs/mcp.yaml`
585
641
 
586
642
  Use this file for named MCP server presets.
587
643
 
644
+ MCP servers are usually heavier shared resources than local function tools. Keep shared MCP connection details here, then let each agent choose the remote tools it wants and apply per-usage overrides at the agent usage point.
645
+
588
646
  Example:
589
647
 
590
648
  ```yaml
package/README.zh.md CHANGED
@@ -17,6 +17,10 @@
17
17
  <strong>The application runtime for multi-agent products: approvals, recovery, and operator control built in, not just execution.</strong>
18
18
  </p>
19
19
 
20
+ <p align="center">
21
+ <strong>Turn one agent workspace directly into a runnable, product-grade runtime.</strong>
22
+ </p>
23
+
20
24
  <p align="center">
21
25
  <a href="https://botbotgo.github.io/agent-harness/">Product website</a>
22
26
  (static page in <code>docs/</code>, published via GitHub Pages; supports a Chinese/English toggle)
@@ -31,6 +35,17 @@
31
35
 
32
36
  ## What Problem We Solve
33
37
 
38
+ In one sentence: `agent-harness` takes the runtime problems that only surface after the demo and pulls them into the product runtime itself ahead of time.
39
+
40
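+ If your team already has agents, prompts, tools, and workflows, what is usually missing is not another execution layer but the runtime layer that turns them into "operable software".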
41
+
42
+ What you get on day one:
43
+
44
+ - a runtime that persists `runs`, `threads`, `approvals`, and `events` as queryable product records
45
+ - a recovery path that keeps moving across interruptions, restarts, and human decisions
46
+ - one workspace-shaped assembly model, instead of each application writing its own runtime glue
47
+ - a runtime contract that stays as stable as possible even when the underlying execution backend changes
48
+
34
49
  AI makes agent logic, tool calls, and workflow code easier to generate; what actually gets harder is runtime operations.
35
50
 
36
51
  Once the demo is running, the real software problem shows up in a different shape:
@@ -50,6 +65,12 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what actually gets
50
65
 
51
66
  `agent-harness` solves exactly this layer. It leaves agent execution upstream while making the application runtime an operable, recoverable, and governable system.
52
67
 
68
+ In more direct product terms:
69
+
70
+ - you own the workspace, agents, tools, and prompts
71
+ - `agent-harness` owns persisted `runs`, `threads`, `approvals`, and `events`, plus recovery and operator visibility
72
+ - your application gets one stable runtime contract instead of a pile of backend-specific runtime glue code
73
+
53
74
  Concretely, that means pushing these capabilities down into the runtime:
54
75
 
55
76
  - a product-facing approval and operator control surface, rather than backend-specific middleware state
@@ -152,6 +173,21 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what actually gets
152
173
  - Complex assembly and operating policy are handed to YAML, while the code surface stays tiny
153
174
  - It goes deep on runtime problems that upstream libraries have not fully productized
154
175
 
176
+ ## When To Use It
177
+
178
+ `agent-harness` is a good fit in these situations:
179
+
180
+ - you have already decided your product needs agents, tools, prompts, or MCP, but what is really missing is the runtime operations layer
181
+ - you need to ship approvals, restart recovery, queue scheduling, or queryable run records together as product capabilities
182
+ - you want one workspace-shaped assembly model to replace each application writing its own startup and runtime glue
183
+ - you want to leave the backend's execution semantics upstream while keeping the product contract stable
184
+
185
+ It should not be your first choice in these situations:
186
+
187
+ - you only need a single short-lived agent call, with no approvals, persistence, or operator control surface
188
+ - what you want is a workflow builder or a low-code automation canvas
189
+ - you want to replace the execution semantics of LangChain v1 or DeepAgents rather than build a runtime around them
190
+
155
191
  ## Quick Start
156
192
 
157
193
  Install:
@@ -205,6 +241,17 @@ try {
205
241
  }
206
242
  ```
207
243
 
244
+ Three-minute mental model:
245
+
246
+ 1. Point `createAgentHarness(...)` at a workspace root.
247
+ 2. Use `run(runtime, { ... })` to execute one request.
248
+ 3. Treat the persisted runtime records as product assets instead of focusing only on the final answer.
249
+
250
+ Compressed further into the shortest product statement:
251
+
252
+ - your team builds the agent app
253
+ - `agent-harness` makes that app operable
254
+
208
255
  The shortest mental model:
209
256
 
210
257
  - one workspace maps to one runtime
@@ -8,6 +8,8 @@ export type ParsedAgentObject = {
8
8
  modelRef: string;
9
9
  runRoot?: string;
10
10
  toolRefs: string[];
11
+ toolBindings?: ParsedAgentToolBinding[];
12
+ inlineTools?: ParsedToolObject[];
11
13
  mcpServers?: Array<Record<string, unknown>>;
12
14
  skillPathRefs: string[];
13
15
  memorySources: string[];
@@ -17,6 +19,10 @@ export type ParsedAgentObject = {
17
19
  deepAgentConfig?: Record<string, unknown>;
18
20
  sourcePath: string;
19
21
  };
22
+ export type ParsedAgentToolBinding = {
23
+ ref: string;
24
+ overrides?: Record<string, unknown>;
25
+ };
20
26
  export type WorkspaceObject = {
21
27
  id: string;
22
28
  kind: string;
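The hunk above adds `toolBindings`, where each binding pairs a tool `ref` with optional `overrides`. The diff does not show how a binding is resolved, so the following is only a sketch under assumptions: `ToolSketch`, `resolveBinding`, and the shallow-merge rule are hypothetical names and behavior chosen for illustration, not the package's implementation.

```typescript
// Mirrors the type added in this diff.
type ParsedAgentToolBinding = {
  ref: string;
  overrides?: Record<string, unknown>;
};

// Hypothetical, trimmed-down tool shape used only for this sketch.
type ToolSketch = {
  id: string;
  description: string;
  config?: Record<string, unknown>;
};

// Hypothetical helper: look up the referenced tool and shallow-merge the
// binding's overrides over a copy of the tool's config. The real merge rule
// is not shown in the diff and may differ.
function resolveBinding(
  catalog: Map<string, ToolSketch>,
  binding: ParsedAgentToolBinding,
): ToolSketch {
  const tool = catalog.get(binding.ref);
  if (!tool) {
    throw new Error(`Unknown tool ref: ${binding.ref}`);
  }
  return {
    ...tool,
    config: { ...(tool.config ?? {}), ...(binding.overrides ?? {}) },
  };
}

// Illustrative data: one shared tool, overridden at one usage point.
const catalog = new Map<string, ToolSketch>([
  ["search", { id: "search", description: "Search docs", config: { limit: 10, lang: "en" } }],
]);
const resolved = resolveBinding(catalog, { ref: "search", overrides: { limit: 3 } });
```

Copying the config instead of mutating it keeps the shared catalog entry unchanged, which matters if several agents bind the same tool with different overrides.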
@@ -72,7 +78,9 @@ export type ParsedToolObject = {
72
78
  description: string;
73
79
  implementationName?: string;
74
80
  config?: Record<string, unknown>;
81
+ subprocess?: boolean;
75
82
  inputSchemaRef?: string;
83
+ embeddingModelRef?: string;
76
84
  backendOperation?: string;
77
85
  mcpRef?: string;
78
86
  bundleRefs: string[];
@@ -116,7 +124,9 @@ export type CompiledTool = {
116
124
  name: string;
117
125
  description: string;
118
126
  config?: Record<string, unknown>;
127
+ subprocess?: boolean;
119
128
  inputSchemaRef?: string;
129
+ embeddingModelRef?: string;
120
130
  backendOperation?: string;
121
131
  mcpRef?: string;
122
132
  bundleRefs: string[];
@@ -118,7 +118,9 @@ registerToolKind({
118
118
  name: tool.name,
119
119
  description: tool.description,
120
120
  config: tool.config,
121
+ subprocess: tool.subprocess,
121
122
  inputSchemaRef: tool.inputSchemaRef,
123
+ embeddingModelRef: tool.embeddingModelRef,
122
124
  bundleRefs: [],
123
125
  hitl: tool.hitl
124
126
  ? {
@@ -150,7 +152,9 @@ registerToolKind({
150
152
  name: tool.name,
151
153
  description: tool.description,
152
154
  config: tool.config,
155
+ subprocess: tool.subprocess,
153
156
  inputSchemaRef: tool.inputSchemaRef,
157
+ embeddingModelRef: tool.embeddingModelRef,
154
158
  backendOperation: tool.backendOperation,
155
159
  bundleRefs: [],
156
160
  hitl: tool.hitl
@@ -183,7 +187,9 @@ registerToolKind({
183
187
  name: tool.name,
184
188
  description: tool.description,
185
189
  config: tool.config,
190
+ subprocess: tool.subprocess,
186
191
  inputSchemaRef: tool.inputSchemaRef,
192
+ embeddingModelRef: tool.embeddingModelRef,
187
193
  mcpRef: tool.mcpRef,
188
194
  bundleRefs: [],
189
195
  hitl: tool.hitl
@@ -222,7 +228,9 @@ registerToolKind({
222
228
  name: tool.name,
223
229
  description: tool.description,
224
230
  config: tool.config,
231
+ subprocess: tool.subprocess,
225
232
  inputSchemaRef: tool.inputSchemaRef,
233
+ embeddingModelRef: tool.embeddingModelRef,
226
234
  bundleRefs: [],
227
235
  hitl: tool.hitl
228
236
  ? {
@@ -1 +1 @@
1
- export declare const AGENT_HARNESS_VERSION = "0.0.133";
1
+ export declare const AGENT_HARNESS_VERSION = "0.0.134";
@@ -1 +1 @@
1
- export const AGENT_HARNESS_VERSION = "0.0.133";
1
+ export const AGENT_HARNESS_VERSION = "0.0.134";
@@ -17,5 +17,9 @@ export type McpToolDescriptor = {
17
17
  };
18
18
  export declare function readMcpServerConfig(workspace: WorkspaceBundle, tool: WorkspaceBundle["tools"] extends Map<any, infer T> ? T : never): McpServerConfig | null;
19
19
  export declare function getOrCreateMcpClient(config: McpServerConfig): Promise<Client>;
20
+ export declare function closeMcpClientsForWorkspace(workspace: WorkspaceBundle): Promise<void>;
21
+ export declare function __resetMcpClientCacheForTests(): void;
22
+ export declare function __setMcpClientCacheEntryForTests(config: McpServerConfig, clientPromise: Promise<Client>): void;
23
+ export declare function __setMcpClientLoaderForTests(loader: (config: McpServerConfig) => Promise<Client>): void;
20
24
  export declare function listRemoteMcpTools(config: McpServerConfig): Promise<McpToolDescriptor[]>;
21
25
  export declare function createMcpToolResolver(workspace: WorkspaceBundle): NonNullable<RuntimeAdapterOptions["toolResolver"]>;
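The new `closeMcpClientsForWorkspace` export suggests that cached MCP clients are torn down per workspace. A minimal sketch of that idea follows, assuming a cache keyed by server configuration; `Closable`, `clients`, `closeCached`, and `closeAll` are stand-ins invented here, and the real function derives its keys from the workspace's MCP tools rather than taking a plain string array.

```typescript
// Stand-in for the SDK client; only `close` matters for this sketch.
type Closable = { close(): Promise<void> };

const clients = new Map<string, Promise<Closable>>();
let closedCount = 0;

// Remove the entry first so new callers cannot pick up a client that is
// being closed, then close it and swallow teardown errors.
async function closeCached(key: string): Promise<void> {
  const cached = clients.get(key);
  clients.delete(key);
  if (!cached) {
    return;
  }
  try {
    await (await cached).close();
  } catch {
    // Ignore failures from clients that never connected successfully.
  }
}

// Several tools may point at the same server, so collect unique keys
// before closing everything in parallel.
async function closeAll(keysPerTool: string[]): Promise<void> {
  const unique = new Set(keysPerTool);
  await Promise.all(Array.from(unique, (key) => closeCached(key)));
}

// Demo data: two hypothetical servers, each counting its own close call.
for (const key of ["server-a", "server-b"]) {
  clients.set(key, Promise.resolve({ close: async () => { closedCount += 1; } }));
}
```

Deduplicating keys first means a server shared by many tools is closed once, not once per tool.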
@@ -6,6 +6,7 @@ import { WebSocketClientTransport } from "@modelcontextprotocol/sdk/client/webso
6
6
  import { AGENT_HARNESS_VERSION } from "../package-version.js";
7
7
  import { createRuntimeEnv } from "../runtime/support/runtime-env.js";
8
8
  const mcpClientCache = new Map();
9
+ let mcpClientLoader = createConnectedMcpClient;
9
10
  function readStringRecord(value) {
10
11
  if (typeof value !== "object" || !value) {
11
12
  return undefined;
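The added `mcpClientLoader` variable makes the connect step replaceable, and the next hunk evicts a cache entry when its loader fails. The sketch below shows both ideas together in isolation. `FakeClient`, `getOrCreate`, `setLoaderForTests`, and `resetForTests` are illustrative names, and string keys stand in for the config-derived cache key.

```typescript
// Stand-in for a connected client.
type FakeClient = { id: number };
type Loader = (key: string) => Promise<FakeClient>;

const cache = new Map<string, Promise<FakeClient>>();
let created = 0;
const defaultLoader: Loader = async () => ({ id: ++created });
let loader: Loader = defaultLoader;

// Cache the in-flight promise so concurrent callers share one connection
// attempt, and evict it on failure so a later call can try again.
function getOrCreate(key: string): Promise<FakeClient> {
  const cached = cache.get(key);
  if (cached) {
    return cached;
  }
  const loading: Promise<FakeClient> = loader(key).catch((error: unknown) => {
    // Only evict our own entry; a newer attempt may already have replaced it.
    if (cache.get(key) === loading) {
      cache.delete(key);
    }
    throw error;
  });
  cache.set(key, loading);
  return loading;
}

// Test hooks in the spirit of the `__...ForTests` exports in this diff.
function setLoaderForTests(next: Loader): void {
  loader = next;
}
function resetForTests(): void {
  cache.clear();
  loader = defaultLoader;
}
```

Without the eviction, one failed connection attempt would stay cached and every later call for the same key would receive the same rejected promise.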
@@ -73,46 +74,124 @@ function createMcpCacheKey(config) {
73
74
  headers: config.headers ?? {},
74
75
  });
75
76
  }
77
+ async function createConnectedMcpClient(config) {
78
+ const client = new Client({
79
+ name: "agent-harness",
80
+ version: AGENT_HARNESS_VERSION,
81
+ });
82
+ const headers = {
83
+ ...(config.headers ?? {}),
84
+ ...(config.token ? { Authorization: `Bearer ${config.token}` } : {}),
85
+ };
86
+ const transport = config.transport === "http"
87
+ ? new StreamableHTTPClientTransport(new URL(config.url ?? ""), {
88
+ requestInit: Object.keys(headers).length > 0 ? { headers } : undefined,
89
+ })
90
+ : config.transport === "sse"
91
+ ? new SSEClientTransport(new URL(config.url ?? ""), {
92
+ requestInit: Object.keys(headers).length > 0 ? { headers } : undefined,
93
+ })
94
+ : config.transport === "websocket"
95
+ ? new WebSocketClientTransport(new URL(config.url ?? ""))
96
+ : new StdioClientTransport({
97
+ command: config.command ?? "",
98
+ args: config.args,
99
+ env: createRuntimeEnv(config.env),
100
+ cwd: config.cwd,
101
+ });
102
+ await client.connect(transport);
103
+ return client;
104
+ }
105
+ function isRecoverableMcpError(error) {
106
+ if (typeof error !== "object" || error === null) {
107
+ return false;
108
+ }
109
+ const message = typeof error.message === "string"
110
+ ? (error.message).toLowerCase()
111
+ : "";
112
+ const code = typeof error.code === "string"
113
+ ? (error.code).toLowerCase()
114
+ : "";
115
+ return [
116
+ "connection closed",
117
+ "transport closed",
118
+ "socket closed",
119
+ "stream closed",
120
+ "network socket disconnected",
121
+ ].some((pattern) => message.includes(pattern))
122
+ || ["econnreset", "epipe", "ehostunreach", "ecancelled"].includes(code);
123
+ }
124
+ async function closeCachedMcpClient(cacheKey) {
125
+ const cached = mcpClientCache.get(cacheKey);
126
+ mcpClientCache.delete(cacheKey);
127
+ if (!cached) {
128
+ return;
129
+ }
130
+ try {
131
+ const client = await cached;
132
+ await client.close();
133
+ }
134
+ catch {
135
+ // Ignore teardown failures for clients that never connected successfully.
136
+ }
137
+ }
138
+ async function invalidateMcpClient(config) {
139
+ await closeCachedMcpClient(createMcpCacheKey(config));
140
+ }
141
+ async function withRecoveredMcpClient(config, operation) {
142
+ const client = await getOrCreateMcpClient(config);
143
+ try {
144
+ return await operation(client);
145
+ }
146
+ catch (error) {
147
+ if (!isRecoverableMcpError(error)) {
148
+ throw error;
149
+ }
150
+ await invalidateMcpClient(config);
151
+ return operation(await getOrCreateMcpClient(config));
152
+ }
153
+ }
76
154
  export async function getOrCreateMcpClient(config) {
77
155
  const cacheKey = createMcpCacheKey(config);
78
156
  const cached = mcpClientCache.get(cacheKey);
79
157
  if (cached) {
80
158
  return cached;
81
159
  }
82
- const loading = (async () => {
83
- const client = new Client({
84
- name: "agent-harness",
85
- version: AGENT_HARNESS_VERSION,
86
- });
87
- const headers = {
88
- ...(config.headers ?? {}),
89
- ...(config.token ? { Authorization: `Bearer ${config.token}` } : {}),
90
- };
91
- const transport = config.transport === "http"
92
- ? new StreamableHTTPClientTransport(new URL(config.url ?? ""), {
93
- requestInit: Object.keys(headers).length > 0 ? { headers } : undefined,
94
- })
95
- : config.transport === "sse"
96
- ? new SSEClientTransport(new URL(config.url ?? ""), {
97
- requestInit: Object.keys(headers).length > 0 ? { headers } : undefined,
98
- })
99
- : config.transport === "websocket"
100
- ? new WebSocketClientTransport(new URL(config.url ?? ""))
101
- : new StdioClientTransport({
102
- command: config.command ?? "",
103
- args: config.args,
104
- env: createRuntimeEnv(config.env),
105
- cwd: config.cwd,
106
- });
107
- await client.connect(transport);
108
- return client;
109
- })();
160
+ const loading = mcpClientLoader(config).catch((error) => {
161
+ if (mcpClientCache.get(cacheKey) === loading) {
162
+ mcpClientCache.delete(cacheKey);
163
+ }
164
+ throw error;
165
+ });
110
166
  mcpClientCache.set(cacheKey, loading);
111
167
  return loading;
112
168
  }
169
+ export async function closeMcpClientsForWorkspace(workspace) {
170
+ const cacheKeys = new Set();
171
+ for (const tool of workspace.tools.values()) {
172
+ if (tool.type !== "mcp") {
173
+ continue;
174
+ }
175
+ const config = readMcpServerConfig(workspace, tool);
176
+ if (!config) {
177
+ continue;
178
+ }
179
+ cacheKeys.add(createMcpCacheKey(config));
180
+ }
181
+ await Promise.all(Array.from(cacheKeys, (cacheKey) => closeCachedMcpClient(cacheKey)));
182
+ }
183
+ export function __resetMcpClientCacheForTests() {
184
+ mcpClientCache.clear();
185
+ mcpClientLoader = createConnectedMcpClient;
186
+ }
187
+ export function __setMcpClientCacheEntryForTests(config, clientPromise) {
188
+ mcpClientCache.set(createMcpCacheKey(config), clientPromise);
189
+ }
190
+ export function __setMcpClientLoaderForTests(loader) {
191
+ mcpClientLoader = loader;
192
+ }
113
193
  async function getRemoteMcpToolDescriptor(config, remoteToolName) {
114
- const client = await getOrCreateMcpClient(config);
115
- const result = await client.listTools();
194
+ const result = await withRecoveredMcpClient(config, (client) => client.listTools());
116
195
  const tool = result.tools.find((item) => typeof item.name === "string" && item.name === remoteToolName);
117
196
  if (!tool || typeof tool.name !== "string") {
118
197
  return null;
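The hunk above routes MCP calls through `withRecoveredMcpClient`, which retries once after a connection-style failure. The pattern can be sketched in isolation as follows. This is a simplified illustration, not the package code: `Client`, `withRecovery`, `getClient`, and `invalidate` are stand-ins, and the pattern lists are a subset of those in the diff.

```typescript
// Stand-in for the SDK client.
type Client = { call(): Promise<string> };

// Treat an error as recoverable when its message or code looks like a
// dropped connection.
function isRecoverable(error: unknown): boolean {
  if (typeof error !== "object" || error === null) {
    return false;
  }
  const { message, code } = error as { message?: unknown; code?: unknown };
  const text = typeof message === "string" ? message.toLowerCase() : "";
  const errCode = typeof code === "string" ? code.toLowerCase() : "";
  return ["connection closed", "transport closed"].some((p) => text.includes(p))
    || ["econnreset", "epipe"].includes(errCode);
}

// Run the operation once; on a recoverable failure, drop the stale client,
// obtain a fresh one, and retry exactly once. A second failure propagates.
async function withRecovery<T>(
  getClient: () => Promise<Client>,
  invalidate: () => Promise<void>,
  operation: (client: Client) => Promise<T>,
): Promise<T> {
  const client = await getClient();
  try {
    return await operation(client);
  } catch (error) {
    if (!isRecoverable(error)) {
      throw error;
    }
    await invalidate();
    return operation(await getClient());
  }
}

// Demo wiring: the first client always fails as if the connection dropped,
// the second one works.
let current: Client | null = null;
let connects = 0;
const getClient = async (): Promise<Client> => {
  if (!current) {
    connects += 1;
    const n = connects;
    current = {
      call: async () => {
        if (n === 1) throw new Error("Connection closed");
        return `ok from client ${n}`;
      },
    };
  }
  return current;
};
const invalidate = async (): Promise<void> => {
  current = null;
};
```

Retrying only once keeps the behavior bounded: a server that is really down surfaces its error on the second attempt instead of looping.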
@@ -124,8 +203,7 @@ async function getRemoteMcpToolDescriptor(config, remoteToolName) {
124
203
  };
125
204
  }
126
205
  export async function listRemoteMcpTools(config) {
127
- const client = await getOrCreateMcpClient(config);
128
- const result = await client.listTools();
206
+ const result = await withRecoveredMcpClient(config, (client) => client.listTools());
129
207
  return result.tools
130
208
  .filter((tool) => typeof tool.name === "string")
131
209
  .map((tool) => ({
@@ -155,11 +233,10 @@ export function createMcpToolResolver(workspace) {
155
233
  description: tool.description,
156
234
  inputSchemaPromise: descriptorPromise.then((descriptor) => descriptor?.inputSchema),
157
235
  async invoke(input) {
158
- const client = await getOrCreateMcpClient(serverConfig);
159
- const result = await client.callTool({
236
+ const result = await withRecoveredMcpClient(serverConfig, (client) => client.callTool({
160
237
  name: remoteToolName,
161
238
  arguments: typeof input === "object" && input !== null ? input : {},
162
- });
239
+ }));
163
240
  const textParts = Array.isArray(result.content)
164
241
  ? result.content
165
242
  .filter((item) => typeof item === "object" && item !== null && "type" in item)