stable-harness 0.0.7 → 0.0.9
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +10 -0
- package/docs/0.1.0-p0-runtime-control-plane-plan.zh.md +171 -0
- package/docs/0.1.0-retry-policy.zh.md +87 -0
- package/docs/0.1.0-stable-runtime-development-roadmap.zh.md +393 -0
- package/docs/0.1.0-tool-guard-benchmark.zh.md +42 -0
- package/docs/adapter-contract.md +199 -0
- package/docs/architecture/backend-comparison.md +41 -0
- package/docs/architecture/runtime-events.md +263 -0
- package/docs/architecture/runtime-events.zh.md +248 -0
- package/docs/architecture/system-architecture.zh.md +435 -0
- package/docs/compatibility-matrix.md +139 -0
- package/docs/engineering-rules.md +111 -0
- package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +1632 -0
- package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1952 -0
- package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1427 -0
- package/docs/granite-tool-calling-comparison.zh.md +206 -0
- package/docs/guides/getting-started.md +126 -0
- package/docs/guides/index.md +40 -0
- package/docs/guides/integration-guide.md +126 -0
- package/docs/guides/operator-runbook.md +153 -0
- package/docs/guides/workspace-authoring.md +212 -0
- package/docs/implementation-blueprint.md +233 -0
- package/docs/memory/0.1.0-memory-design.zh.md +719 -0
- package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +146 -0
- package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +169 -0
- package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +123 -0
- package/docs/memory/0.1.0-step-09-memory-contract.zh.md +169 -0
- package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +143 -0
- package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +150 -0
- package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +118 -0
- package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +118 -0
- package/docs/product/adoption-playbook.md +145 -0
- package/docs/product/market-positioning.md +137 -0
- package/docs/product-boundary.md +258 -0
- package/docs/protocols/http-runtime.md +37 -0
- package/docs/protocols/langgraph-compatible.md +107 -0
- package/docs/protocols/openai-compatible.md +121 -0
- package/docs/tooling/0.1.0-bettercall-tool-quality.zh.md +231 -0
- package/package.json +3 -1
@@ -0,0 +1,231 @@

# BetterCall Tool Call Quality

This document explains how `stable-harness` improves tool call quality through `@botbotgo/better-call`.

Core principles:

- DeepAgents/LangGraph owns the agent loop, tool selection, and the decision whether to call a tool again.
- BetterCall owns validate + repair + block before a tool call reaches the real execution layer.
- The stable-harness tool gateway owns runtime governance: agent inventory, the context-aware semantic validator, error projection, and trace/event.

## Integration point

`stable-harness` prepares the BetterCall validation tools when the tool gateway is created:

```ts
// Build the registry once, at gateway creation, keyed by the stable tool id.
const registry = new Map(prepareBetterCallTools(tools, options.betterCall).map((tool) => [tool.id, tool]));
```

Internally this is a thin mapping:

| stable-harness tool | BetterCall tool |
| --- | --- |
| `id` | `name` |
| `description` | `description` |
| `schema` | `schema` |
| validation-only `invoke(input)` | returns `input` |

With this in place, the runtime no longer creates a single-tool wrapper on every tool execution; it reuses the validation tools that `betterTools(...)` generated when the gateway was created. BetterCall mode defaults to `repair`: when a `repair` function or `repairModel` is configured, tool selection and arguments/schema errors are repaired and then validated a second time, and the call is rejected only if it still fails.
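
The thin mapping in the table can be sketched roughly as follows. This is an illustration only, not the actual adapter: the `StableTool` and `BetterCallTool` shapes are assumptions inferred from the table, and while the name `toBetterCallTool` appears in the flow diagram, its real signature is not shown in this document.

```typescript
// Hypothetical shapes inferred from the mapping table; the real types may differ.
interface StableTool {
  id: string;
  description: string;
  schema: Record<string, unknown>;
  invoke(args: unknown, context: unknown): unknown;
}

interface BetterCallTool {
  name: string;
  description: string;
  schema: Record<string, unknown>;
  invoke(input: unknown): unknown;
}

// Map a stable tool to a validation-only BetterCall tool.
// The mapped invoke never runs the real tool; it only echoes its input.
function toBetterCallTool(tool: StableTool): BetterCallTool {
  return {
    name: tool.id,
    description: tool.description,
    schema: tool.schema,
    invoke: (input) => input,
  };
}

const search: StableTool = {
  id: "search",
  description: "Search the workspace",
  schema: { type: "object", properties: { query: { type: "string" } } },
  invoke: () => {
    throw new Error("the real tool must not run during validation");
  },
};

const mapped = toBetterCallTool(search);
console.log(mapped.name, mapped.invoke({ query: "hi" }));
```

Because the mapped `invoke` returns its input unchanged, the validation layer can be exercised without ever touching the original tool's side effects.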

## Overall flow

```mermaid
flowchart TD
  A["Workspace tools"] --> B["createInMemoryToolGateway(tools)"]
  B --> C["prepareBetterCallTools(tools)"]
  C --> D["betterTools(tools.map(toBetterCallTool))"]
  D --> E["Tool registry stores original tool + BetterCall validation tool"]

  F["DeepAgents/LangGraph selects tool + args"] --> G["stable-harness runtime checks agent inventory"]
  G --> H["tool gateway invoke"]
  H --> S{"registered tool?"}
  S -- "no" --> R["BetterCall repair tool selection within allowed inventory"]
  S -- "yes" --> I["stable validateArgs(context)"]
  R --> I
  I --> J["BetterCall validate + repair + block"]
  J --> K{"valid or repaired?"}
  K -- "yes" --> L["execute original tool.invoke(safe args, context)"]
  K -- "no" --> M["ToolArgumentValidationError with structured issues"]
  M --> N["DeepAgents receives tool error observation"]
  N --> O["model may issue corrected tool call"]
```

## Quality improvements

| Layer | What is checked | On failure |
| --- | --- | --- |
| Agent inventory | Whether the current agent is allowed to call the tool | Rejected directly by the runtime |
| stable semantic validator | Business/runtime rules that need `context` | Reject or repair args |
| BetterCall tool selection | Whether the tool name exists among the wrapped tools | Repair mode by default; the repaired tool must still be in the agent inventory |
| BetterCall arguments/schema | required, type, enum, extra arg | Repair mode by default; with a repairer configured, repair first and block only on failure |
| Error projection | BetterCall path is converted to a stable-harness argument path | Returns a readable structured error |
| DeepAgents repair loop | Whether the model retries after seeing the tool error | Decided by the upstream agent loop |
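
The error projection row can be illustrated with a small path-rewriting helper. The mapping `$.calls[0].args.x -> $.x` comes from the Validate + Repair + Block diagram in this document; everything else here is an assumption, including the helper name `projectIssuePath` and the premise that issue paths are plain strings with that prefix.

```typescript
// Hypothetical helper: strip the BetterCall call-envelope prefix from an issue path.
// Assumes paths look like "$.calls[<n>].args<rest>"; anything else is returned unchanged.
function projectIssuePath(path: string): string {
  // <rest> must be empty or start with "." or "[", so "$.calls[0].argsfoo" does not match.
  const match = /^\$\.calls\[\d+\]\.args((?:[.[].*)?)$/.exec(path);
  if (match === null) return path;
  const rest = match[1] ?? "";
  return rest === "" ? "$" : `$${rest}`;
}

console.log(projectIssuePath("$.calls[0].args.x"));        // $.x
console.log(projectIssuePath("$.calls[0].args.items[2]")); // $.items[2]
console.log(projectIssuePath("$.other"));                  // $.other
```

Returning unmatched paths unchanged keeps the projection safe to apply to every issue, whichever layer produced it.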

## Sequence: Gateway initialization

```mermaid
sequenceDiagram
  participant App as Workspace App
  participant Gateway as createInMemoryToolGateway
  participant Adapter as stable BetterCall adapter
  participant BC as betterTools
  participant Registry as Tool Registry

  App->>Gateway: tools array
  Gateway->>Adapter: prepareBetterCallTools(tools)
  Adapter->>Adapter: map stable tool id -> BetterCall tool name
  Adapter->>BC: betterTools(mappedTools, { mode: "repair", repair/repairModel })
  BC-->>Adapter: validation tools
  Adapter-->>Gateway: runtime tools with validationTool
  Gateway->>Registry: store by stable tool.id
```

Key points:

- `betterTools(...)` is called once, at gateway creation; the default `mode` is `repair`.
- The public `ToolGatewayTool` type does not expose `validationTool`.
- The original stable tool's `invoke(args, context)` is not replaced by BetterCall; BetterCall only validates, repairs, and blocks before execution.

## Sequence: Tool Selection Repair

```mermaid
sequenceDiagram
  participant Runtime as stable runtime
  participant Gateway as tool gateway
  participant BC as BetterCall reliableToolCalls
  participant Tool as Repaired tool

  Runtime->>Gateway: repairToolCall(wrongTool, args, allowedToolIds)
  Gateway->>BC: all registered tools filtered by allowedToolIds
  BC->>BC: repair tool name + args
  alt repaired to allowed registered tool
    BC-->>Gateway: safe toolId + args
    Gateway-->>Runtime: repaired tool call
    Runtime->>Gateway: invoke(repaired toolId, args)
    Gateway->>Tool: execute after argument validation
  else cannot repair or repaired tool not allowed
    BC-->>Gateway: no safe call
    Gateway-->>Runtime: block with inventory/registration error
  end
```

When repairing tool selection, stable-harness passes BetterCall only the tools the current agent is allowed to use. Even if the repair model returns a different tool, the runtime still refuses to execute it unless it is in `allowedToolIds`.
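
The two guards described here (filter the candidates before repair, then re-check the repaired result) might be sketched as follows. The names `repairToolCall` and `allowedToolIds` come from the diagram; the `Repairer` signature and the function body are assumptions made for illustration, not the gateway's actual code.

```typescript
type ToolCall = { toolId: string; args: Record<string, unknown> };
// Assumed repairer shape: receives the bad call and the candidate ids, returns a call or null.
type Repairer = (call: ToolCall, candidates: string[]) => ToolCall | null;

function repairToolCall(
  call: ToolCall,
  registered: ReadonlySet<string>,
  allowedToolIds: ReadonlySet<string>,
  repair: Repairer,
): ToolCall | null {
  // 1. Only tools that are both registered and allowed are offered to the repairer.
  const candidates = Array.from(registered).filter((id) => allowedToolIds.has(id));
  const repaired = repair(call, candidates);
  // 2. The repairer is not trusted: its answer is re-checked against the same inventory.
  if (repaired === null || !candidates.includes(repaired.toolId)) return null;
  return repaired;
}

const registered = new Set(["search", "calculator", "deploy"]);
const allowed = new Set(["search", "calculator"]);

// A misbehaving repairer that ignores the candidate list.
const rogue: Repairer = (call) => ({ ...call, toolId: "deploy" });
// A toy repairer that picks the first candidate sharing a four-character prefix.
const byPrefix: Repairer = (call, candidates) => {
  const hit = candidates.find((id) => id.startsWith(call.toolId.slice(0, 4)));
  return hit === undefined ? null : { ...call, toolId: hit };
};

const blocked = repairToolCall({ toolId: "serch", args: {} }, registered, allowed, rogue);
const fixed = repairToolCall({ toolId: "calcuator", args: {} }, registered, allowed, byPrefix);
console.log(blocked, fixed?.toolId);
```

The second check is what makes the inventory a hard boundary: a repair model can suggest anything, but only an allowed, registered tool id can come back out.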

## Sequence: Validate + Repair + Block

```mermaid
sequenceDiagram
  participant DA as DeepAgents/LangGraph
  participant Runtime as stable runtime
  participant Gateway as tool gateway
  participant Semantic as validateArgs(context)
  participant BC as BetterCall validation tool
  participant Tool as Original tool

  DA->>Runtime: selected tool + args
  Runtime->>Runtime: verify tool is assigned to agent
  Runtime->>Gateway: invoke(toolId, args, context)
  Gateway->>Semantic: validateArgs(args, context)
  alt semantic reject
    Semantic-->>Gateway: reject with issues
    Gateway-->>Runtime: ToolArgumentValidationError
  else semantic repair or allow
    Semantic-->>Gateway: repaired/allowed args
    Gateway->>BC: validationTool.invoke(args)
    alt BetterCall repairs
      BC-->>Gateway: safe args
      Gateway->>Tool: invoke(safe args, context)
      Tool-->>Gateway: output
      Gateway-->>Runtime: tool result
    else BetterCall rejects
      BC-->>Gateway: BetterToolValidationError(issues)
      Gateway->>Gateway: map $.calls[0].args.x -> $.x
      Gateway-->>Runtime: ToolArgumentValidationError
    else BetterCall accepts
      BC-->>Gateway: validated args
      Gateway->>Tool: invoke(args, context)
      Tool-->>Gateway: output
      Gateway-->>Runtime: tool result
    end
  end
```
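
BetterCall runs in repair mode by default. When a `repair` function or `repairModel` is configured, BetterCall first tries to repair the arguments, then validates them a second time, and allows execution only if that validation passes. When no repairer is available, it degrades to validate + block, so an invalid call never reaches the real tool.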
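
The contract in this paragraph (validate, optionally repair, validate again, otherwise block) can be sketched as below. This is a sketch under stated assumptions rather than BetterCall's API: `validateRepairBlock`, the `Validate` and `Repair` types, and the toy validator are all hypothetical.

```typescript
type Args = Record<string, unknown>;
type Issue = { path: string; message: string };
type Validate = (args: Args) => Issue[];
type Repair = (args: Args, issues: Issue[]) => Args;
type Outcome =
  | { status: "accepted" | "repaired"; args: Args }
  | { status: "blocked"; issues: Issue[] };

function validateRepairBlock(args: Args, validate: Validate, repair?: Repair): Outcome {
  const issues = validate(args);
  if (issues.length === 0) return { status: "accepted", args };
  // No repairer configured: degrade to validate + block.
  if (repair === undefined) return { status: "blocked", issues };
  const repairedArgs = repair(args, issues);
  // Repaired args are not trusted; they must pass the same validation again.
  const remaining = validate(repairedArgs);
  return remaining.length === 0
    ? { status: "repaired", args: repairedArgs }
    : { status: "blocked", issues: remaining };
}

// Toy validator: `count` must be a finite number.
const validate: Validate = (args) => {
  const value = args["count"];
  return typeof value === "number" && Number.isFinite(value)
    ? []
    : [{ path: "$.count", message: "expected a finite number" }];
};
// Toy repairer: coerce `count` with Number(); non-numeric strings become NaN and stay invalid.
const coerce: Repair = (args) => ({ ...args, count: Number(args["count"]) });

const noRepairer = validateRepairBlock({ count: "3" }, validate);
const withRepairer = validateRepairBlock({ count: "3" }, validate, coerce);
const unrepairable = validateRepairBlock({ count: "three" }, validate, coerce);
console.log(noRepairer.status, withRepairer.status, unrepairable.status);
```

In every branch the only way to reach `accepted` or `repaired` is a clean validation pass, which is the property that keeps invalid calls away from the real tool.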

## Sequence: DeepAgents self-repair

```mermaid
sequenceDiagram
  participant Model as Agent model
  participant DA as DeepAgents/LangGraph loop
  participant Gateway as stable tool gateway
  participant BC as BetterCall validation
  participant Tool as Real tool

  Model-->>DA: tool call with bad args
  DA->>Gateway: invoke tool
  Gateway->>BC: validate args
  BC-->>Gateway: reject structured issues
  Gateway-->>DA: ToolMessage(status="error")
  DA->>Model: feed tool error observation
  Model-->>DA: corrected tool call
  DA->>Gateway: invoke corrected tool
  Gateway->>BC: validate corrected args
  BC-->>Gateway: accept
  Gateway->>Tool: execute real tool
  Tool-->>DA: result
```

This step is neither stable-harness replaying the tool call locally nor TODO/text parsing. Whether the model calls the tool again in response to the error is decided by the DeepAgents/LangGraph agent loop.
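
To make the boundary concrete, the sketch below turns a validation failure into an error observation and stops there, with no local retry. It uses plain objects only: the `ToolArgumentValidationError` class here is a stand-in whose real shape is not shown in this document, and `ToolObservation` and `observe` are hypothetical names rather than DeepAgents or LangGraph APIs.

```typescript
type Issue = { path: string; message: string };

// Stand-in for the gateway's validation error; the real class shape is an assumption.
class ToolArgumentValidationError extends Error {
  readonly issues: Issue[];
  constructor(toolId: string, issues: Issue[]) {
    super(`Invalid arguments for tool "${toolId}"`);
    this.name = "ToolArgumentValidationError";
    this.issues = issues;
  }
}

// Plain-object stand-in for a tool message carrying a status.
type ToolObservation = { status: "success" | "error"; content: string };

// Run a tool call once and convert validation failures into an observation.
// Nothing is retried here; whether to call again is left to the agent loop.
function observe(run: () => unknown): ToolObservation {
  try {
    return { status: "success", content: JSON.stringify(run()) };
  } catch (error) {
    if (error instanceof ToolArgumentValidationError) {
      const lines = error.issues.map((issue) => `${issue.path}: ${issue.message}`);
      return { status: "error", content: [error.message, ...lines].join("\n") };
    }
    throw error; // Unrelated failures are not masked.
  }
}

const failed = observe(() => {
  throw new ToolArgumentValidationError("search", [{ path: "$.query", message: "required" }]);
});
const ok = observe(() => ({ hits: 2 }));
console.log(failed, ok);
```

Keeping the structured issues in the observation text is what gives the model enough information to issue a corrected call on its next turn.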

## Sequence: BetterCall repairModel mode

The `@botbotgo/better-call` package itself supports:

```ts
const tools = betterTools([searchTool, calculatorTool], { repairModel });
```

The stable-harness tool gateway currently uses BetterCall `repair` mode by default and accepts a repairer injected through `betterCall.repair` or `betterCall.repairModel`. Without a repairer, the safety semantics remain block.

With `repairModel` configured, the flow is:

```mermaid
sequenceDiagram
  participant Gateway as stable tool gateway
  participant BC as BetterCall
  participant Repair as repairModel
  participant Tool as Real tool

  Gateway->>BC: validationTool.invoke(args)
  BC-->>Gateway: reject
  BC->>Repair: repair prompt + issues
  Repair-->>BC: corrected calls JSON
  BC->>BC: validate repaired args again
  alt repaired valid
    BC-->>Gateway: repaired args
    Gateway->>Tool: execute with repaired args
  else repaired invalid
    BC-->>Gateway: structured rejection
    Gateway-->>Gateway: block before execution
  end
```

This lets the gateway attempt one repair before returning an error to DeepAgents. Only when the repaired call is still invalid is the structured error handed back to the DeepAgents/LangGraph agent loop.

## Error categories

| Category | Example | stable-harness behavior |
| --- | --- | --- |
| Tool selection | unknown tool, wrong tool name | Repair mode by default; the repaired tool must still be a registered tool allowed for the current agent, otherwise the call is rejected |
| Irrelevance | no tool should be called | BetterCall can detect this at the package level; at the gateway level a tool has usually already been selected |
| Arguments | missing required arg, wrong arg name, wrong type | Repair mode by default; with a repairer, repair first, then validate a second time |
| Schema | invalid enum, extra arg | Repair mode by default; repairable items are repaired first, and the call is rejected if it still fails the schema |
| Semantic | workspace-relative path, domain-specific ticker | stable `validateArgs(context)` rejects or repairs |

## Current verification

- `npm run check`
- `npm test`: 74/74 passing
- `npm run check:rules`
- The BetterCall package is published to npm: `@botbotgo/better-call@0.1.1`

package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "stable-harness",
-  "version": "0.0.
+  "version": "0.0.9",
   "type": "module",
   "description": "Stable application runtime and operator control plane for agent workspaces.",
   "license": "MIT",
@@ -18,6 +18,7 @@
   },
   "files": [
     "README.md",
+    "docs/**/*.md",
     "dist/*.js",
     "dist/*.d.ts",
     "dist/compat/**/*.js",
@@ -66,6 +67,7 @@
     "prepack": "npm run release:minify",
     "release:minify": "find dist packages -type f -name '*.js' \\( -path 'dist/*' -o -path '*/dist/*' \\) -exec sh -c 'for f do ./node_modules/.bin/terser \"$f\" --compress passes=2 --mangle keep_fnames=true --keep-fnames --keep-classnames --module --comments false --output \"$f\"; done' sh {} +",
     "release:check-package": "node scripts/release/check-npm-package.mjs",
+    "release:smoke": "node scripts/release/smoke-npm-install.mjs",
     "release:pack": "npm run build && npm run release:check-package && npm pack --dry-run",
     "release:publish": "npm publish --access public --registry https://registry.npmjs.org/",
     "example:minimal": "node dist/examples/minimal-deepagents/run.js"