npm - @botbotgo/agent-harness - Versions diffs - 0.0.149 → 0.0.151 - Mend

@botbotgo/agent-harness 0.0.149 → 0.0.151

Files changed (5)

package/README.md CHANGED

@@ -33,20 +33,31 @@
   >
 </p>
+## Easy to start · Full runtime · Configure and extend
+**At a glance:** the onboarding path stays thin, the runtime capabilities ship as a complete layer, and most ongoing work moves into **YAML configuration** plus **extensions** (local tools, SKILL packages, MCP) instead of bespoke runtime infrastructure.
+- **Easy to start:** `createAgentHarness` → `run` → `stop`, plus inspection helpers such as `subscribe`, `listSessions`, `listApprovals`, and `resolveApproval`.
+- **Configure:** routing, models, tools, stores, backends, MCP, recovery, and maintenance in declarative workspace YAML (see [Quick Start](#quick-start) and [How To Configure](#how-to-configure)).
+- **Extend:** drop `tool({...})` modules and SKILL trees under `resources/`, wire shared tools and MCP in catalogs, and let agents whitelist what they use.
+- **Built into the runtime:** persisted `runs`, `threads`, `approvals`, and `events`; recovery and queueing; streaming listeners; MCP in/out; LangChain v1 and DeepAgents adapters—so you do not rebuild that layer per app.
 ## What Problem We Solve
-In one line: `agent-harness` takes the runtime work that appears after the demo and makes it part of the product runtime from day one.
+In one line: `agent-harness` productizes the runtime work that usually appears after the demo.
-If your team already has agents, prompts, tools, and workflows, the missing layer is usually not more execution. It is the runtime that makes those pieces operable as software.
+If your team already has agents, prompts, tools, and workflows, the missing layer is usually not more execution. The missing layer is the runtime that makes those pieces operable, inspectable, and recoverable in production.
 What you get on day one:
 - a runtime that keeps `runs`, `threads`, `approvals`, and `events` as inspectable product records
 - a recovery path that survives interruption, restart, and operator decisions
+- stable run correlation and continuity metadata so operators can join one persisted run to logs, traces, and fallback transitions
+- approval defaults for sensitive durable memory writes and write-like MCP calls instead of relying on each tool definition to remember governance
 - one workspace-shaped assembly model instead of app-specific runtime glue
 - one stable runtime contract even when execution backends change underneath
-AI makes it much easier to generate agent logic, tool calls, and workflow code. The hard part moves to operations.
+AI makes it much easier to generate agent logic, tool calls, and workflow code. The harder problem shifts to operations.
 Once the demo works, the real software problem changes shape:
@@ -55,7 +66,7 @@ Once the demo works, the real software problem changes shape:
 - backend, prompt, and tool changes happen faster, but product-facing behavior still needs one stable control surface
 - MCP and provider-native tooling expand what agents can reach, which raises the bar for governance
-Teams still need answers to the runtime questions that appear after that shift:
+Teams still need clear answers to the runtime questions that appear after that shift:
 - how approvals are resolved and audited
 - how runs, threads, and events stay inspectable
@@ -63,19 +74,21 @@ Teams still need answers to the runtime questions that appear after that shift:
 - how routing, concurrency, and maintenance policy stay consistent
 - how backend churn does not leak into the product model
-`agent-harness` solves that layer. It keeps agent execution upstream while making the application runtime operable, recoverable, and governable.
+`agent-harness` solves that layer. It keeps agent execution upstream while turning the application runtime into something teams can operate, recover, and govern.
-That means the product story becomes easier to explain:
+That makes the product story easier to explain:
 - you bring the workspace, agents, tools, and prompts
 - `agent-harness` brings persisted `runs`, `threads`, `approvals`, `events`, recovery, and operator visibility
 - your application gets one stable runtime contract instead of backend-specific runtime plumbing
-Concretely, that means:
+In concrete terms:
 - a product-facing approval and operator surface instead of backend-specific middleware state
 - persisted `runs`, `threads`, `approvals`, and `events` as stable runtime records
+- runtime-owned inspection fields such as tracing correlation ids and continuity metadata instead of provider-private observability handles
 - restart-safe recovery and continuation as system-managed behavior
+- default runtime governance for high-risk memory and MCP side effects
 - YAML-owned routing, concurrency, maintenance, and recovery policy
 - adapter isolation so backend replacement does not redefine the public runtime model
@@ -85,7 +98,7 @@ Concretely, that means:
 It is not a new agent framework. It is the runtime layer around LangChain v1 and DeepAgents that turns one workspace into one operable application runtime.
-The point is simple:
+The positioning is simple:
 - Codex, Claude Code, and Cursor are products for people using agents
 - LangChain v1 and DeepAgents are frameworks for defining agent execution semantics
@@ -111,15 +124,16 @@ The runtime provides:
 - local `resources/tools/` `tool({...})` modules and `resources/skills/` discovery
 - persisted threads, runs, approvals, events, queue state, and recovery metadata
-In practice, the harness exists for the parts that are painful to rebuild inside every agent app:
+In practice, the harness exists for the parts that are expensive and repetitive to rebuild inside every agent app:
 - approval inboxes and human decision flow
 - persisted runs, threads, and inspectable event history
+- run correlation, continuity, and recovery inspection that still works after a stream fallback or restart
 - runtime-managed recovery after interrupts, failures, or process restart
 - queueing, concurrency, maintenance, and operational policy
 - stable runtime records that stay usable even if the backend changes
-The repository-owned default config layer is intentionally full-shaped. The shipped YAML keeps explicit defaults for the important runtime and agent knobs so teams can start from concrete config instead of reverse-engineering adapter behavior from source.
+The repository-owned default config layer is intentionally full-shaped. The shipped YAML keeps explicit defaults for the important runtime and agent knobs so teams can start from concrete configuration instead of reverse-engineering adapter behavior from source.
 The default rule is:
@@ -157,7 +171,7 @@ Recommended orchestration shape for long-running flows:
 ## Why This Exists
-Most agent tooling stops at execution.
+Most agent tooling stops at execution. Production software does not.
 Real products need a runtime that can answer harder questions:
@@ -173,8 +187,9 @@ Real products need a runtime that can answer harder questions:
 - It treats `runs`, `threads`, `approvals`, `events`, and recovery as first-class product records
 - It gives operators a runtime control surface instead of exposing raw backend internals
+- It keeps observability and governance runtime-owned with trace correlation, continuity metadata, and approval defaults for sensitive side effects
 - It keeps checkpoint resume system-managed instead of promoting checkpoint internals into the primary API
-- It lets YAML own assembly and operating policy while code keeps a tiny surface
+- It lets YAML own assembly and operating policy while code keeps a small, stable surface
 - It goes deep on runtime concerns that upstream libraries do not fully productize
 ## When To Use It
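
The README hunks above name a thin onboarding path (`createAgentHarness` → `run` → `stop`) plus `listApprovals` and `resolveApproval`, and describe approval defaults for write-like calls. The diff shows no signatures, so the sketch below does not import the package. Every type, the `createSketchHarness` factory, and the `WRITE_LIKE` pattern are hypothetical stand-ins that only illustrate the described flow; the real API may differ.

```typescript
// Hypothetical in-memory model. None of these shapes are the real
// @botbotgo/agent-harness API; they only mirror the flow the README describes.
type RunState = "completed" | "waiting_for_approval" | "rejected";

interface RunRecord {
  id: string;
  input: string;
  state: RunState;
}

interface Approval {
  id: string;
  runId: string;
  toolName: string;
  status: "pending" | "approved" | "rejected";
}

interface SketchHarness {
  run(input: string, toolName?: string): RunRecord;
  listApprovals(): Approval[];
  resolveApproval(id: string, decision: "approved" | "rejected"): RunRecord;
  stop(): void;
}

// Assumed policy: tools whose names look write-like need approval by default.
const WRITE_LIKE = /^(write|update|delete|create)/i;

function createSketchHarness(): SketchHarness {
  const runs = new Map<string, RunRecord>();
  const approvals = new Map<string, Approval>();
  let stopped = false;
  let counter = 0;

  return {
    run(input, toolName) {
      if (stopped) throw new Error("harness is stopped");
      counter += 1;
      const run: RunRecord = { id: `run-${counter}`, input, state: "completed" };
      if (toolName !== undefined && WRITE_LIKE.test(toolName)) {
        // The run is persisted in a waiting state until an operator decides.
        run.state = "waiting_for_approval";
        const id = `approval-${counter}`;
        approvals.set(id, { id, runId: run.id, toolName, status: "pending" });
      }
      runs.set(run.id, run);
      return run;
    },
    listApprovals() {
      // Only pending approvals are surfaced to the operator.
      return Array.from(approvals.values()).filter((a) => a.status === "pending");
    },
    resolveApproval(id, decision) {
      const approval = approvals.get(id);
      if (approval === undefined) throw new Error(`unknown approval ${id}`);
      approval.status = decision;
      const run = runs.get(approval.runId);
      if (run === undefined) throw new Error(`unknown run ${approval.runId}`);
      run.state = decision === "approved" ? "completed" : "rejected";
      return run;
    },
    stop() {
      stopped = true;
    },
  };
}

// Usage: a read-like call completes, a write-like call waits for a decision.
const harness = createSketchHarness();
harness.run("summarize the notes", "read_file");
const gated = harness.run("remember this preference", "write_memory");
const [pending] = harness.listApprovals();
if (pending !== undefined) harness.resolveApproval(pending.id, "approved");
console.log(gated.state);
harness.stop();
```

The point of the sketch is the ordering: the governance decision sits in the runtime record, not in each tool definition, which is the behavior the added README bullets claim.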

package/README.zh.md CHANGED

@@ -33,20 +33,31 @@
   >
 </p>
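+## Easy to start · Full capabilities · Configure and extend
+**In one sentence:** the integration surface stays thin, the runtime capabilities ship as a complete layer, and day-to-day work centers on **YAML configuration** and **extensions** (local tools, SKILL packages, MCP) rather than repeatedly building your own runtime infrastructure.
+- **Easy to start:** `createAgentHarness` → `run` → `stop`, plus query and control capabilities such as `subscribe`, `listSessions`, `listApprovals`, and `resolveApproval`.
+- **Configure:** routing, models, tools, stores, backends, MCP, recovery, and maintenance are written in declarative workspace YAML (see [Quick Start](#快速开始) and [How To Configure](#如何配置)).
+- **Extend:** place `tool({...})` modules and SKILL directories under `resources/`, declare shared tools and MCP in catalogs, then let each agent enable them by name through a whitelist.
+- **Built-in runtime:** persisted `runs`, `threads`, `approvals`, and `events`; recovery and queueing; streaming listeners; MCP inbound and outbound exposure; LangChain v1 and DeepAgents adapters, so each application does not rebuild this layer.
 ## What Problem We Solve
-In one line: `agent-harness` takes the runtime problems that only surface after the demo and pulls them into the product runtime itself ahead of time.
+In one line: `agent-harness` takes the runtime work that usually surfaces only after the demo and builds it into the product runtime ahead of time.
-If your team already has agents, prompts, tools, and workflows, what is really missing is usually not another layer of execution, but the runtime layer that turns these things into "operable software".
+If your team already has agents, prompts, tools, and workflows, what is really missing is usually not one more layer of execution, but the runtime layer that turns these capabilities into "operable software".
 What you get on day one:
 - a runtime that persists `runs`, `threads`, `approvals`, and `events` as queryable product records
 - a recovery path that keeps moving across interruption, restart, and human decisions
+- stable run correlation and continuity metadata, so one persisted run can be aligned with logs, traces, and fallback transitions
+- approval by default for sensitive durable memory writes and write-like MCP calls, instead of leaving governance for each tool definition to remember
 - one workspace-shaped assembly model instead of each application writing its own runtime glue
 - a runtime contract that stays as stable as possible even when the underlying execution backend changes
-AI makes agent logic, tool calls, and workflow code easier to generate; what really becomes hard is runtime operations.
+AI makes agent logic, tool calls, and workflow code easier to generate; what is really harder is runtime operations.
 Once the demo runs, the real software problem shows up in a different shape: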
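@@ -55,7 +66,7 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 - backends, prompts, and tools change faster, but the product-facing control surface still has to stay stable
 - MCP and provider-native tools expand what agents can reach, and raise the governance bar along with it
-Teams still have to answer these runtime questions:
+Teams still have to answer these runtime questions head-on:
 - how approvals are decided and audited
 - how runs, threads, and events stay reliably queryable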
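@@ -63,7 +74,7 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 - how routing, concurrency, and maintenance policy stay consistent
 - how the product model avoids drifting when the backend changes frequently
-`agent-harness` solves exactly this layer. It keeps agent execution upstream while making the application runtime an operable, recoverable, and governable system.
+`agent-harness` solves exactly this layer. It keeps agent execution upstream while making the application runtime a truly operable, recoverable, and governable system.
 In more direct product language: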
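@@ -75,7 +86,9 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 - a product-facing approval and operations control surface instead of backend-specific middleware state
 - stably persisted `runs`, `threads`, `approvals`, and `events` records
+- runtime-owned tracing correlation ids and continuity metadata instead of provider-private observability handles
 - system-managed restart recovery and resumption after interruption
+- default governance policy for high-risk memory / MCP side effects
 - YAML-owned routing, concurrency, maintenance, and recovery policy
 - adapter isolation of backend changes so the public runtime model does not drift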
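@@ -85,7 +98,7 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 It is not yet another agent framework. It is the runtime layer around LangChain v1 and DeepAgents that turns one workspace into one operable application runtime.
-The relationship can be summarized as:
+The product positioning can be summarized as:
 - Codex, Claude Code, and Cursor are products for people using agents
 - LangChain v1 and DeepAgents are frameworks for defining agent execution semantics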
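@@ -111,15 +124,16 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 - discovery of local `tool({...})` modules in `resources/tools/` and of `resources/skills/`
 - persisted threads, runs, approvals, events, queue state, and recovery metadata
-In real systems, the harness mainly solves the runtime problems that no agent app wants to rebuild:
+In real systems, the harness mainly solves the runtime problems that no agent app should have to rebuild:
 - approval inboxes and human decision flow
 - persisted runs, threads, and queryable event history
+- run correlation, continuity, and recovery inspection that still hold after a stream fallback or process restart
 - runtime-managed recovery after interrupts, failures, or process restart
 - queueing, concurrency, maintenance, and operational policy
 - a runtime record model that stays stable even when the backend changes
-The repository's default configuration is intentionally "full-shaped". The YAML shipped with the repository gives explicit defaults for the important runtime and agent switches, so teams can start from concrete configuration instead of reverse-engineering adapter behavior from source.
+The repository's default configuration is intentionally "full-shaped". The YAML shipped with the repository gives explicit defaults for the key runtime and agent switches, so teams can start from concrete configuration instead of reverse-engineering adapter behavior from source.
 The default rule is: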
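@@ -154,7 +168,7 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 ## Why This Exists
-Most agent tooling stops at "it runs".
+Most agent tooling stops at "it runs", but production software cannot stop there.
 Real products need a runtime that can answer harder questions: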
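@@ -164,14 +178,15 @@ AI makes agent logic, tool calls, and workflow code easier to generate; what really
 - How do you inspect threads and events without exposing raw backend state?
 - How do you replace the backend implementation without heavily reworking the product model?
-`agent-harness` answers these questions without turning the application API into a copy of LangChain v1 / DeepAgents.
+`agent-harness` answers these questions without building the application API as a copy of LangChain v1 / DeepAgents.
 ## How It Differs
 - Treats `runs`, `threads`, `approvals`, `events`, and recovery as first-class product records
 - Gives operators a runtime control surface instead of exposing raw backend internals
+- Keeps observability and governance in the runtime, including trace correlation, continuity metadata, and default approval for high-risk side effects
 - Treats checkpoint resume as system-managed behavior instead of promoting checkpoint details into the primary API
-- Hands complex assembly and operating policy to YAML while the code surface stays tiny
+- Hands complex assembly and operating policy to YAML while the code surface stays small and stable
 - Goes deep on runtime problems that upstream libraries have not fully productized
 ## When To Use It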

package/dist/package-version.d.ts CHANGED

@@ -1 +1 @@
-export declare const AGENT_HARNESS_VERSION = "0.0.148";
+export declare const AGENT_HARNESS_VERSION = "0.0.150";

package/dist/package-version.js CHANGED

@@ -1 +1 @@
-export const AGENT_HARNESS_VERSION = "0.0.148";
+export const AGENT_HARNESS_VERSION = "0.0.150";

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@botbotgo/agent-harness",
-  "version": "0.0.149",
+  "version": "0.0.151",
   "description": "Workspace runtime for multi-agent applications",
   "type": "module",
   "packageManager": "npm@10.9.2",
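
One detail worth noting across the hunks above: `package.json` moves to `0.0.151`, while the exported `AGENT_HARNESS_VERSION` constant moves to `0.0.150`. Whether that lag is intentional is not stated in the diff. A minimal sketch of a consistency check follows, assuming plain `x.y.z` version strings; the helper names `parseVersion`, `compareVersions`, and `describeDrift` are illustrative and not part of the package.

```typescript
// Version strings copied from the diff; the helpers are illustrative only.
const manifestVersion = "0.0.151"; // "version" in package/package.json
const constantVersion = "0.0.150"; // AGENT_HARNESS_VERSION in dist/package-version.js

// Parse a plain x.y.z string; pre-release and build suffixes are out of scope.
function parseVersion(v: string): number[] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (m === null) throw new Error(`not a plain x.y.z version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Returns -1, 0, or 1, comparing major, then minor, then patch numerically.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = pa[i]! - pb[i]!;
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

function describeDrift(manifest: string, constant: string): string {
  const cmp = compareVersions(manifest, constant);
  if (cmp === 0) return "in sync";
  return cmp > 0
    ? `constant lags manifest (${constant} < ${manifest})`
    : `constant is ahead of manifest (${constant} > ${manifest})`;
}

console.log(describeDrift(manifestVersion, constantVersion));
// prints "constant lags manifest (0.0.150 < 0.0.151)"
```

A check like this could run in a release script to flag drift between the manifest and the generated constant before publishing.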