stable-harness 0.0.139 → 0.0.143

Files changed (208)
  1. package/README.md +30 -17
  2. package/dist/index.js +1 -1
  3. package/dist/runtime/skills/skill-metadata.js +1 -1
  4. package/dist/workspace/compile.js +1 -1
  5. package/docs/0.1.0-tool-guard-benchmark.zh.md +5 -5
  6. package/docs/architecture/system-architecture.zh.md +3 -3
  7. package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +306 -306
  8. package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1 -1
  9. package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1 -1
  10. package/docs/granite-tool-calling-comparison.zh.md +8 -8
  11. package/docs/guides/getting-started.md +9 -6
  12. package/docs/guides/index.md +5 -3
  13. package/docs/guides/integration-guide.md +44 -43
  14. package/docs/guides/operator-runbook.md +51 -3
  15. package/docs/guides/runtime-governance-proof.md +4 -4
  16. package/docs/guides/workspace-authoring.md +2 -2
  17. package/docs/guides/workspace-docker-build.md +3 -3
  18. package/docs/memory/0.1.0-memory-design.zh.md +20 -0
  19. package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +1 -1
  20. package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +1 -1
  21. package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +3 -3
  22. package/docs/memory/0.1.0-step-09-memory-contract.zh.md +1 -1
  23. package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +1 -1
  24. package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +1 -1
  25. package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +1 -1
  26. package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +1 -1
  27. package/docs/protocols/coverage-matrix.md +114 -0
  28. package/docs/protocols/http-runtime.md +31 -8
  29. package/docs/protocols/langgraph-compatible.md +75 -17
  30. package/docs/protocols/openai-compatible.md +25 -7
  31. package/docs/protocols/{agent-protocols.md → protocol-facades.md} +76 -18
  32. package/node_modules/@stable-harness/adapter-deepagents/dist/src/adapter.js +1 -1
  33. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/builtin/task-inventory.js +1 -1
  34. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/builtin-call-repair.js +1 -1
  35. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/builtin-tool-policy.js +1 -1
  36. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/gateway/tool-failure-events.js +1 -1
  37. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/raw-tool-call-parser.js +1 -1
  38. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/stream-events.js +1 -1
  39. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/substrate/checkpoint.js +1 -1
  40. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/tool-repeat-visibility.js +1 -1
  41. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/trace-projection.js +1 -1
  42. package/node_modules/@stable-harness/adapter-deepagents/dist/src/internal/vfs-backend.js +1 -1
  43. package/node_modules/@stable-harness/adapter-deepagents/package.json +2 -2
  44. package/node_modules/@stable-harness/adapter-langgraph/dist/src/graph.js +1 -1
  45. package/node_modules/@stable-harness/adapter-langgraph/dist/src/index.js +1 -1
  46. package/node_modules/@stable-harness/adapter-langgraph/dist/src/runtime.js +1 -1
  47. package/node_modules/@stable-harness/adapter-langgraph/dist/src/skill-providers.js +1 -1
  48. package/node_modules/@stable-harness/adapter-langgraph/dist/src/types.d.ts +1 -0
  49. package/node_modules/@stable-harness/adapter-langgraph/package.json +2 -2
  50. package/node_modules/@stable-harness/core/dist/boundary-scan.js +1 -1
  51. package/node_modules/@stable-harness/core/dist/index.d.ts +1 -0
  52. package/node_modules/@stable-harness/core/dist/index.js +1 -1
  53. package/node_modules/@stable-harness/core/dist/memory-plugins/shared.js +1 -1
  54. package/node_modules/@stable-harness/core/dist/memory-plugins.js +1 -1
  55. package/node_modules/@stable-harness/core/dist/quality/event-evidence.js +1 -1
  56. package/node_modules/@stable-harness/core/dist/quality/execution-review.js +1 -1
  57. package/node_modules/@stable-harness/core/dist/quality/synthesis/fields.js +1 -1
  58. package/node_modules/@stable-harness/core/dist/recovery/execution-contract.js +1 -1
  59. package/node_modules/@stable-harness/core/dist/recovery/tool-call-structure.js +1 -1
  60. package/node_modules/@stable-harness/core/dist/recovery/tool-call.js +1 -1
  61. package/node_modules/@stable-harness/core/dist/runtime/direct-tool-call.js +1 -1
  62. package/node_modules/@stable-harness/core/dist/runtime/inspection/methods.js +1 -1
  63. package/node_modules/@stable-harness/core/dist/runtime/inspection/replay.js +1 -1
  64. package/node_modules/@stable-harness/core/dist/runtime/persistence/artifacts.d.ts +4 -0
  65. package/node_modules/@stable-harness/core/dist/runtime/persistence/artifacts.js +1 -1
  66. package/node_modules/@stable-harness/core/dist/runtime/persistence/queue.d.ts +1 -0
  67. package/node_modules/@stable-harness/core/dist/runtime/persistence/queue.js +1 -1
  68. package/node_modules/@stable-harness/core/dist/runtime/persistence/stores.d.ts +1 -0
  69. package/node_modules/@stable-harness/core/dist/runtime/persistence/stores.js +1 -1
  70. package/node_modules/@stable-harness/core/dist/runtime/persistence/system-data.d.ts +34 -0
  71. package/node_modules/@stable-harness/core/dist/runtime/persistence/system-data.js +1 -0
  72. package/node_modules/@stable-harness/core/dist/runtime/recovery/adapter-result.js +1 -1
  73. package/node_modules/@stable-harness/core/dist/runtime/recovery/non-focused-recovery.js +1 -1
  74. package/node_modules/@stable-harness/core/dist/runtime.js +1 -1
  75. package/node_modules/@stable-harness/core/dist/workflows/index.d.ts +20 -0
  76. package/node_modules/@stable-harness/core/dist/workflows/index.js +1 -1
  77. package/node_modules/@stable-harness/core/package.json +3 -3
  78. package/node_modules/@stable-harness/governance/dist/src/approval-queue.d.ts +1 -0
  79. package/node_modules/@stable-harness/governance/dist/src/approval-queue.js +1 -1
  80. package/node_modules/@stable-harness/governance/dist/src/index.d.ts +1 -1
  81. package/node_modules/@stable-harness/governance/dist/src/index.js +1 -1
  82. package/node_modules/@stable-harness/governance/package.json +1 -1
  83. package/node_modules/@stable-harness/memory/package.json +1 -1
  84. package/node_modules/@stable-harness/protocols/dist/src/http-events.d.ts +2 -0
  85. package/node_modules/@stable-harness/protocols/dist/src/http-events.js +1 -1
  86. package/node_modules/@stable-harness/protocols/dist/src/http-server.d.ts +5 -1
  87. package/node_modules/@stable-harness/protocols/dist/src/http-server.js +1 -1
  88. package/node_modules/@stable-harness/protocols/dist/src/index.d.ts +3 -3
  89. package/node_modules/@stable-harness/protocols/dist/src/index.js +1 -1
  90. package/node_modules/@stable-harness/protocols/dist/src/openai-compatible.js +1 -1
  91. package/node_modules/@stable-harness/protocols/dist/src/openai-payload.d.ts +89 -0
  92. package/node_modules/@stable-harness/protocols/dist/src/openai-payload.js +1 -1
  93. package/node_modules/@stable-harness/protocols/dist/src/{agent-protocols.d.ts → protocol-facades.d.ts} +4 -10
  94. package/node_modules/@stable-harness/protocols/dist/src/protocol-facades.js +1 -0
  95. package/node_modules/@stable-harness/protocols/dist/src/protocol-utils.d.ts +505 -0
  96. package/node_modules/@stable-harness/protocols/dist/src/protocol-utils.js +1 -0
  97. package/node_modules/@stable-harness/protocols/package.json +2 -2
  98. package/node_modules/@stable-harness/tool-gateway/dist/src/argument-guard.js +1 -1
  99. package/node_modules/@stable-harness/tool-gateway/dist/src/schema-validation.js +1 -1
  100. package/node_modules/@stable-harness/tool-gateway/package.json +1 -1
  101. package/node_modules/@stable-harness/workspace-yaml/dist/boundary-scan.js +1 -1
  102. package/node_modules/@stable-harness/workspace-yaml/dist/discovery.js +1 -1
  103. package/node_modules/@stable-harness/workspace-yaml/dist/documents.js +1 -1
  104. package/node_modules/@stable-harness/workspace-yaml/dist/loader.js +1 -1
  105. package/node_modules/@stable-harness/workspace-yaml/dist/workflows.js +1 -1
  106. package/node_modules/@stable-harness/workspace-yaml/package.json +2 -2
  107. package/package.json +13 -12
  108. package/packages/adapter-deepagents/dist/src/adapter.js +1 -1
  109. package/packages/adapter-deepagents/dist/src/internal/builtin/task-inventory.js +1 -1
  110. package/packages/adapter-deepagents/dist/src/internal/builtin-call-repair.js +1 -1
  111. package/packages/adapter-deepagents/dist/src/internal/builtin-tool-policy.js +1 -1
  112. package/packages/adapter-deepagents/dist/src/internal/gateway/tool-failure-events.js +1 -1
  113. package/packages/adapter-deepagents/dist/src/internal/raw-tool-call-parser.js +1 -1
  114. package/packages/adapter-deepagents/dist/src/internal/stream-events.js +1 -1
  115. package/packages/adapter-deepagents/dist/src/internal/substrate/checkpoint.js +1 -1
  116. package/packages/adapter-deepagents/dist/src/internal/tool-repeat-visibility.js +1 -1
  117. package/packages/adapter-deepagents/dist/src/internal/trace-projection.js +1 -1
  118. package/packages/adapter-deepagents/dist/src/internal/vfs-backend.js +1 -1
  119. package/packages/adapter-deepagents/package.json +2 -2
  120. package/packages/adapter-langgraph/dist/src/graph.js +1 -1
  121. package/packages/adapter-langgraph/dist/src/index.js +1 -1
  122. package/packages/adapter-langgraph/dist/src/runtime.js +1 -1
  123. package/packages/adapter-langgraph/dist/src/skill-providers.js +1 -1
  124. package/packages/adapter-langgraph/dist/src/types.d.ts +1 -0
  125. package/packages/adapter-langgraph/package.json +2 -2
  126. package/packages/cli/dist/src/args.d.ts +3 -2
  127. package/packages/cli/dist/src/args.js +1 -1
  128. package/packages/cli/dist/src/build.js +1 -1
  129. package/packages/cli/dist/src/cli.js +1 -1
  130. package/packages/cli/dist/src/console/session.js +1 -1
  131. package/packages/cli/dist/src/daemon/client.d.ts +4 -3
  132. package/packages/cli/dist/src/daemon/client.js +1 -1
  133. package/packages/cli/dist/src/init.js +1 -1
  134. package/packages/cli/dist/src/langgraph/agent-server-compat.d.ts +8 -0
  135. package/packages/cli/dist/src/langgraph/agent-server-compat.js +1 -0
  136. package/packages/cli/dist/src/langgraph/store-projection.d.ts +1 -0
  137. package/packages/cli/dist/src/langgraph/store-projection.js +1 -0
  138. package/packages/cli/dist/src/langgraph-official.js +1 -1
  139. package/packages/cli/dist/src/memory/providers.js +1 -1
  140. package/packages/cli/dist/src/server/gateway.d.ts +12 -0
  141. package/packages/cli/dist/src/server/gateway.js +1 -0
  142. package/packages/cli/dist/src/server/protocol-defaults.d.ts +7 -0
  143. package/packages/cli/dist/src/server/protocol-defaults.js +1 -0
  144. package/packages/cli/dist/src/server.js +1 -1
  145. package/packages/cli/package.json +8 -8
  146. package/packages/core/dist/boundary-scan.js +1 -1
  147. package/packages/core/dist/index.d.ts +1 -0
  148. package/packages/core/dist/index.js +1 -1
  149. package/packages/core/dist/memory-plugins/shared.js +1 -1
  150. package/packages/core/dist/memory-plugins.js +1 -1
  151. package/packages/core/dist/quality/event-evidence.js +1 -1
  152. package/packages/core/dist/quality/execution-review.js +1 -1
  153. package/packages/core/dist/quality/synthesis/fields.js +1 -1
  154. package/packages/core/dist/recovery/execution-contract.js +1 -1
  155. package/packages/core/dist/recovery/tool-call-structure.js +1 -1
  156. package/packages/core/dist/recovery/tool-call.js +1 -1
  157. package/packages/core/dist/runtime/direct-tool-call.js +1 -1
  158. package/packages/core/dist/runtime/inspection/methods.js +1 -1
  159. package/packages/core/dist/runtime/inspection/replay.js +1 -1
  160. package/packages/core/dist/runtime/persistence/artifacts.d.ts +4 -0
  161. package/packages/core/dist/runtime/persistence/artifacts.js +1 -1
  162. package/packages/core/dist/runtime/persistence/queue.d.ts +1 -0
  163. package/packages/core/dist/runtime/persistence/queue.js +1 -1
  164. package/packages/core/dist/runtime/persistence/stores.d.ts +1 -0
  165. package/packages/core/dist/runtime/persistence/stores.js +1 -1
  166. package/packages/core/dist/runtime/persistence/system-data.d.ts +34 -0
  167. package/packages/core/dist/runtime/persistence/system-data.js +1 -0
  168. package/packages/core/dist/runtime/recovery/adapter-result.js +1 -1
  169. package/packages/core/dist/runtime/recovery/non-focused-recovery.js +1 -1
  170. package/packages/core/dist/runtime.js +1 -1
  171. package/packages/core/dist/workflows/index.d.ts +20 -0
  172. package/packages/core/dist/workflows/index.js +1 -1
  173. package/packages/core/package.json +3 -3
  174. package/packages/evaluation/dist/src/benchmark.js +1 -1
  175. package/packages/evaluation/dist/src/run-record.js +1 -1
  176. package/packages/evaluation/dist/src/tool-call-metrics.js +1 -1
  177. package/packages/evaluation/package.json +2 -2
  178. package/packages/governance/dist/src/approval-queue.d.ts +1 -0
  179. package/packages/governance/dist/src/approval-queue.js +1 -1
  180. package/packages/governance/dist/src/index.d.ts +1 -1
  181. package/packages/governance/dist/src/index.js +1 -1
  182. package/packages/governance/package.json +1 -1
  183. package/packages/memory/package.json +1 -1
  184. package/packages/protocols/dist/src/http-events.d.ts +2 -0
  185. package/packages/protocols/dist/src/http-events.js +1 -1
  186. package/packages/protocols/dist/src/http-server.d.ts +5 -1
  187. package/packages/protocols/dist/src/http-server.js +1 -1
  188. package/packages/protocols/dist/src/index.d.ts +3 -3
  189. package/packages/protocols/dist/src/index.js +1 -1
  190. package/packages/protocols/dist/src/openai-compatible.js +1 -1
  191. package/packages/protocols/dist/src/openai-payload.d.ts +89 -0
  192. package/packages/protocols/dist/src/openai-payload.js +1 -1
  193. package/packages/protocols/dist/src/{agent-protocols.d.ts → protocol-facades.d.ts} +4 -10
  194. package/packages/protocols/dist/src/protocol-facades.js +1 -0
  195. package/packages/protocols/dist/src/protocol-utils.d.ts +505 -0
  196. package/packages/protocols/dist/src/protocol-utils.js +1 -0
  197. package/packages/protocols/package.json +2 -2
  198. package/packages/tool-gateway/dist/src/argument-guard.js +1 -1
  199. package/packages/tool-gateway/dist/src/schema-validation.js +1 -1
  200. package/packages/tool-gateway/package.json +1 -1
  201. package/packages/workspace-yaml/dist/boundary-scan.js +1 -1
  202. package/packages/workspace-yaml/dist/discovery.js +1 -1
  203. package/packages/workspace-yaml/dist/documents.js +1 -1
  204. package/packages/workspace-yaml/dist/loader.js +1 -1
  205. package/packages/workspace-yaml/dist/workflows.js +1 -1
  206. package/packages/workspace-yaml/package.json +2 -2
  207. package/node_modules/@stable-harness/protocols/dist/src/agent-protocols.js +0 -1
  208. package/packages/protocols/dist/src/agent-protocols.js +0 -1
package/README.md CHANGED
@@ -92,25 +92,34 @@ stable-harness console -w ./examples/minimal-deepagents
  Start the runtime daemon and protocol facades:
  
  ```bash
- stable-harness start -w ./examples/minimal-deepagents --port 8642
+ stable-harness start -w ./examples/minimal-deepagents
+ stable-harness serve -w ./examples/minimal-deepagents
  ```
  
- Then point first-party clients at the Stable Runtime control plane and
- OpenAI-compatible clients at `/v1`:
+ Then point clients at the single Stable Harness gateway. The gateway exposes
+ system metadata, first-party runtime APIs, and protocol-compatible surfaces under
+ stable path groups:
  
  ```text
- http://127.0.0.1:8641
- http://127.0.0.1:8642/v1
+ http://127.0.0.1:56789/manifest
+ http://127.0.0.1:56789/protocols
+ http://127.0.0.1:56789/runtime/v1
+ http://127.0.0.1:56789/protocols/openai/v1
+ http://127.0.0.1:56789/protocols/langgraph
+ http://127.0.0.1:56789/protocols/mcp
+ http://127.0.0.1:56789/protocols/a2a
+ http://127.0.0.1:56789/protocols/acp
+ http://127.0.0.1:56789/protocols/ag-ui
  ```
  
- ACP, A2A, and AG-UI facades are available when enabled in runtime YAML through
- the combined `agentProtocols` server.
+ Runtime YAML can disable protocol groups or bind the gateway to an
+ environment-managed port.
  
  For request commands and console mode, the CLI can choose how it connects:
  
  ```bash
  stable-harness -w ./examples/minimal-deepagents --runtime auto "Review this workspace."
- stable-harness -w ./examples/minimal-deepagents --runtime daemon --daemon-url http://127.0.0.1:8641 "Review this workspace."
+ stable-harness -w ./examples/minimal-deepagents --runtime daemon --daemon-url http://127.0.0.1:56789/runtime/v1 "Review this workspace."
  stable-harness -w ./examples/minimal-deepagents --runtime local "Review this workspace."
  stable-harness console -w ./examples/minimal-deepagents --runtime daemon
  ```
@@ -124,15 +133,19 @@ When a protocol facade is enabled, the same CLI can also choose the client
  protocol:
  
  ```bash
- stable-harness -w ./examples/minimal-deepagents --protocol stable-runtime --tool echo_tool --tool-args-json '{"value":"native"}'
- stable-harness -w ./examples/minimal-deepagents --protocol a2a --protocol-url http://127.0.0.1:8650 --tool echo_tool --tool-args-json '{"value":"a2a"}'
- stable-harness -w ./examples/minimal-deepagents --protocol agui --protocol-url http://127.0.0.1:8650 --tool echo_tool --tool-args-json '{"value":"agui"}'
- stable-harness -w ./examples/minimal-deepagents --protocol acp --protocol-url http://127.0.0.1:8650 --tool echo_tool --tool-args-json '{"value":"acp"}'
+ stable-harness -w ./examples/minimal-deepagents --protocol runtime --tool echo_tool --tool-args-json '{"value":"native"}'
+ stable-harness -w ./examples/minimal-deepagents --protocol a2a --protocol-url http://127.0.0.1:56789/protocols --tool echo_tool --tool-args-json '{"value":"a2a"}'
+ stable-harness -w ./examples/minimal-deepagents --protocol agui --protocol-url http://127.0.0.1:56789/protocols --tool echo_tool --tool-args-json '{"value":"agui"}'
+ stable-harness -w ./examples/minimal-deepagents --protocol acp --protocol-url http://127.0.0.1:56789/protocols --tool echo_tool --tool-args-json '{"value":"acp"}'
  ```
  
- `stable-runtime` exposes the full operator control plane. A2A, AG-UI, and ACP
- map request turns through their protocol facades; use Stable Runtime HTTP + SSE
- for traces, approvals, memory administration, and full session inspection.
+ `runtime` exposes the full operator control plane directly through
+ `/runtime/v1`. OpenAI, A2A, AG-UI, ACP, MCP, and LangGraph-compatible protocol
+ routes live under `/protocols/*`.
+
+ See the [protocol feature coverage matrix](docs/protocols/coverage-matrix.md)
+ for the side-by-side list of features, API families, protocol support, and
+ unsupported areas.
  
  Build a portable Docker runtime artifact for the workspace:
  
@@ -264,10 +277,10 @@ This is constrained repair, not silent magic:
  
  ## Protocols
  
- - Stable Runtime HTTP + SSE: [docs/protocols/http-runtime.md](docs/protocols/http-runtime.md)
+ - native runtime HTTP + SSE: [docs/protocols/http-runtime.md](docs/protocols/http-runtime.md)
  - OpenAI-compatible facade: [docs/protocols/openai-compatible.md](docs/protocols/openai-compatible.md)
  - LangGraph-compatible facade: [docs/protocols/langgraph-compatible.md](docs/protocols/langgraph-compatible.md)
- - ACP, A2A, and AG-UI facades: [docs/protocols/agent-protocols.md](docs/protocols/agent-protocols.md)
+ - ACP, A2A, and AG-UI facades: [docs/protocols/protocol-facades.md](docs/protocols/protocol-facades.md)
  
  ## Documentation
  
package/dist/index.js CHANGED
@@ -1 +1 @@
- import{createBackendModel as e,createDeepAgentsAdapter as r}from"@stable-harness/adapter-deepagents";import{createLangGraphRuntimeAdapter as a,createLangGraphWorkflowAdapter as t,createRegistrySkillResolverProvider as o}from"@stable-harness/adapter-langgraph";import{createStableHarnessRuntime as n}from"@stable-harness/core";import{createModuleToolGateway as i}from"@stable-harness/tool-gateway";import{loadWorkspaceFromYaml as s}from"@stable-harness/workspace-yaml";export{createDeepAgentsAdapter,createDeepAgentsMemoryMaintenanceTarget}from"@stable-harness/adapter-deepagents";export{createDeepAgentsMiddlewareSkillProvider,createLangGraphRuntimeAdapter,createLangGraphWorkflowAdapter,createRegistrySkillResolverProvider}from"@stable-harness/adapter-langgraph";export{createLangMemServiceProvider}from"@stable-harness/memory";export{createInMemoryRuntimeMemoryStore,createJsonFileRuntimeMemoryStore}from"@stable-harness/memory";export{applySpecDrivenPhaseTransition,containsRecoverableResultOutput,createSpecDrivenArtifact,createSpecDrivenArtifactEvent,createSpecDrivenPhaseEvent,createSpecDrivenWorkflowPolicy,createSpecDrivenWorkflowState,defaultExecutionEvaluatorRules,defaultToolGuardrails,evaluateExecutionRules,evaluateToolGuardrails,collectStableHarnessPrometheusSamples,projectRuntimeTrace,repeatToolGuardrail,renderStableHarnessPrometheusMetrics,resolveEnabledMemories,STABLE_HARNESS_PROMETHEUS_LABELS,requiredPlanToolGuardrail,reviewExecutionEvidence,toolDependencyGuardrail}from"@stable-harness/core";export{loadWorkspaceFromYaml}from"@stable-harness/workspace-yaml";export{createInMemoryToolGateway,createModuleToolGateway}from"@stable-harness/tool-gateway";export function createStableHarnessRuntime(e){return"string"==typeof e?createStableRuntime({workspaceRoot:e}):"workspaceRoot"in e?createStableRuntime(e):n(e)}export async function createStableRuntime(e){const r=await s(e.workspaceRoot),a=e.toolGateway??await i({tools:r.tools.values()});return 
n({workspace:r,toolGateway:a,memory:e.memory,qualityReviewModel:createQualityReviewModel(r),toolGuardrails:e.toolGuardrails,executionEvaluatorRules:e.executionEvaluatorRules,adapters:e.adapters??createRuntimeAdapters(r,e),workflowAdapters:e.workflowAdapters??createWorkflowAdapters(r,e)})}function createQualityReviewModel(r){const a=function readQualityModelRef(e){const r=isRecord(e)?e:{};return readString((isRecord(r.reviewer)?r.reviewer:r).modelRef)}(r.runtime.quality),t=a?r.models.get(a):void 0,o=t?e(t):void 0;return function isQualityReviewModel(e){return isRecord(e)&&"function"==typeof e.invoke}(o)?o:void 0}export async function requestStableRuntime(e,r){return e.request(r)}function createRuntimeAdapters(e,t){const o={deepagents:({policy:e})=>r(e.config?{config:e.config}:{}),langgraph:({policy:e})=>a({...readLangGraphOptions(e.config),name:e.name}),...t.adapterFactories},n=function runtimeAdapterPolicies(e){const r=e.runtime.adapters?.filter(e=>!1!==e.enabled);return r&&r.length>0?r:[...new Set([...e.agents.values()].map(e=>e.backend))].map(e=>({name:e}))}(e);return n.map(r=>{const a=o[r.name];if(a)return a({policy:r,workspace:e});throw new Error(`Unsupported runtime adapter: ${r.name}`)})}function createWorkflowAdapters(e,r){const a={langgraph:({name:e,options:r})=>t({...readLangGraphOptions(r),name:e}),...r.workflowAdapterFactories};return[...new Set([...e.workflows.values()].map(e=>e.adapter??"").filter(Boolean))].map(t=>{const o=a[t];return o?.({name:t,workspace:e,options:readWorkflowAdapterOptions(r,t)})}).filter(e=>Boolean(e))}function readWorkflowAdapterOptions(e,r){return e.workflowAdapterOptions?.[r]??{}}function readLangGraphOptions(e){return isRecord(e)?{...e,...void 0!==readLangGraphSkillProvider(e)?{skillProvider:readLangGraphSkillProvider(e)}:{}}:{}}function readLangGraphSkillProvider(e){if(!1===e.skillProvider)return!1;const r=function readSkillProviderConfig(e){return isRecord(e.skills)?e.skills:isRecord(e.skillProvider)?e.skillProvider:void 
0}(e);if(!r)return;const a=readString(r.provider)??readString(r.name)??"registry-resolver";if(["none","disabled","false"].includes(a))return!1;if("registry-resolver"!==a)throw new Error(`Unsupported LangGraph skill provider: ${a}`);return o({..."boolean"==typeof r.includeContent?{includeContent:r.includeContent}:{},..."number"==typeof r.maxBytes&&Number.isFinite(r.maxBytes)?{maxBytes:r.maxBytes}:{}})}function readString(e){return"string"==typeof e&&e.trim()?e.trim():void 0}function isRecord(e){return"object"==typeof e&&null!==e&&!Array.isArray(e)}
+ import{createBackendModel as e,createDeepAgentsAdapter as r}from"@stable-harness/adapter-deepagents";import{createLangGraphRuntimeAdapter as t,createLangGraphWorkflowAdapter as a,createRegistrySkillResolverProvider as o}from"@stable-harness/adapter-langgraph";import{createStableHarnessRuntime as n}from"@stable-harness/core";import{createModuleToolGateway as i}from"@stable-harness/tool-gateway";import{loadWorkspaceFromYaml as s}from"@stable-harness/workspace-yaml";export{createDeepAgentsAdapter,createDeepAgentsMemoryMaintenanceTarget}from"@stable-harness/adapter-deepagents";export{createDeepAgentsMiddlewareSkillProvider,createLangGraphRuntimeAdapter,createLangGraphWorkflowAdapter,createRegistrySkillResolverProvider}from"@stable-harness/adapter-langgraph";export{createLangMemServiceProvider}from"@stable-harness/memory";export{createInMemoryRuntimeMemoryStore,createJsonFileRuntimeMemoryStore}from"@stable-harness/memory";export{applySpecDrivenPhaseTransition,containsRecoverableResultOutput,createSpecDrivenArtifact,createSpecDrivenArtifactEvent,createSpecDrivenPhaseEvent,createSpecDrivenWorkflowPolicy,createSpecDrivenWorkflowState,defaultExecutionEvaluatorRules,defaultToolGuardrails,evaluateExecutionRules,evaluateToolGuardrails,collectStableHarnessPrometheusSamples,projectRuntimeTrace,repeatToolGuardrail,renderStableHarnessPrometheusMetrics,resolveEnabledMemories,STABLE_HARNESS_PROMETHEUS_LABELS,requiredPlanToolGuardrail,reviewExecutionEvidence,toolDependencyGuardrail}from"@stable-harness/core";export{loadWorkspaceFromYaml}from"@stable-harness/workspace-yaml";export{createInMemoryToolGateway,createModuleToolGateway}from"@stable-harness/tool-gateway";export function createStableHarnessRuntime(e){return"string"==typeof e?createStableRuntime({workspaceRoot:e}):"workspaceRoot"in e?createStableRuntime(e):n(e)}export async function createStableRuntime(e){const r=await s(e.workspaceRoot),t=e.toolGateway??await i({tools:r.tools.values()});return 
n({workspace:r,toolGateway:t,memory:e.memory,qualityReviewModel:createQualityReviewModel(r),toolGuardrails:e.toolGuardrails,executionEvaluatorRules:e.executionEvaluatorRules,adapters:e.adapters??createRuntimeAdapters(r,e),workflowAdapters:e.workflowAdapters??createWorkflowAdapters(r,e)})}function createQualityReviewModel(r){const t=function readQualityModelRef(e){const r=isRecord(e)?e:{};return readString((isRecord(r.reviewer)?r.reviewer:r).modelRef)}(r.runtime.quality),a=t?r.models.get(t):void 0,o=a?e(a):void 0;return function isQualityReviewModel(e){return isRecord(e)&&"function"==typeof e.invoke}(o)?o:void 0}export async function requestStableRuntime(e,r){return e.request(r)}function createRuntimeAdapters(e,a){const o={deepagents:({policy:e})=>r(e.config?{config:e.config}:{}),langgraph:({policy:e})=>t({...readLangGraphOptions(e.config),name:e.name}),...a.adapterFactories},n=function runtimeAdapterPolicies(e){const r=e.runtime.adapters?.filter(e=>!1!==e.enabled);return r&&r.length>0?r:[...new Set([...e.agents.values()].map(e=>e.backend))].map(e=>({name:e}))}(e);return n.map(r=>{const t=o[r.name];if(t)return t({policy:r,workspace:e});throw new Error(`Unsupported runtime adapter: ${r.name}`)})}function createWorkflowAdapters(e,r){const t={langgraph:({name:e,options:r})=>a({...readLangGraphOptions(r),name:e}),...r.workflowAdapterFactories};return[...new Set([...e.workflows.values()].map(e=>e.adapter??"").filter(Boolean))].map(a=>{const o=t[a];return o?.({name:a,workspace:e,options:readWorkflowAdapterOptions(r,a)})}).filter(e=>Boolean(e))}function readWorkflowAdapterOptions(e,r){return e.workflowAdapterOptions?.[r]??{}}function readLangGraphOptions(e){return isRecord(e)?{...e,...void 0!==readLangGraphSkillProvider(e)?{skillProvider:readLangGraphSkillProvider(e)}:{}}:{}}function readLangGraphSkillProvider(e){if(!1===e.skillProvider)return!1;const r=function readSkillProviderConfig(e){return isRecord(e.skills)?e.skills:isRecord(e.skillProvider)?e.skillProvider:void 
0}(e);if(!r)return;const t=readString(r.provider)??readString(r.name)??"registry-resolver";if(["none","disabled","false"].includes(t))return!1;if("registry-resolver"!==t)throw new Error(`Unsupported LangGraph skill provider: ${t}`);return o({..."boolean"==typeof r.includeContent?{includeContent:r.includeContent}:{},..."number"==typeof r.maxBytes&&Number.isFinite(r.maxBytes)?{maxBytes:r.maxBytes}:{}})}function readString(e){return"string"==typeof e&&e.trim()?e.trim():void 0}function isRecord(e){return"object"==typeof e&&null!==e&&!Array.isArray(e)}
package/dist/runtime/skills/skill-metadata.js CHANGED
@@ -1 +1 @@
- import{readFileSync as r,statSync as t}from"node:fs";import*as e from"node:path";import{parse as n}from"yaml";export function validateSkillMetadata(o){const i=t(o).isDirectory()?e.join(o,"SKILL.md"):o,a=r(i,"utf8").match(/^---\n([\s\S]*?)\n---/u);if(!a)throw new Error(`${i} is missing YAML front matter`);const s=n(a[1]),m=readString(s.name,"name",i),l=readString(s.description,"description",i),d=function readStringList(r){return Array.isArray(r)?r.filter(r=>"string"==typeof r&&r.trim().length>0):[]}(s["allowed-tools"]);if(0===d.length)throw new Error(`${e.basename(o)} must declare allowed-tools`);return{name:m,description:l,allowedTools:d,path:i}}function readString(r,t,e){if("string"!=typeof r||!r.trim())throw new Error(`${e} front matter requires ${t}`);return r.trim()}
+ import{readFileSync as r,statSync as t}from"node:fs";import*as e from"node:path";import{parse as o}from"yaml";export function validateSkillMetadata(n){const i=t(n).isDirectory()?e.join(n,"SKILL.md"):n,a=r(i,"utf8").match(/^---\n([\s\S]*?)\n---/u);if(!a)throw new Error(`${i} is missing YAML front matter`);const s=o(a[1]),m=readString(s.name,"name",i),l=readString(s.description,"description",i),d=function readStringList(r){return Array.isArray(r)?r.filter(r=>"string"==typeof r&&r.trim().length>0):[]}(s["allowed-tools"]);if(0===d.length)throw new Error(`${e.basename(n)} must declare allowed-tools`);return{name:m,description:l,allowedTools:d,path:i}}function readString(r,t,e){if("string"!=typeof r||!r.trim())throw new Error(`${e} front matter requires ${t}`);return r.trim()}
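The minified `validateSkillMetadata` above locates the YAML front matter of a `SKILL.md` file with a single regular expression before handing it to the `yaml` parser. A readable sketch of just that extraction step, assuming a hypothetical function name `extractFrontMatter` and made-up sample skill content (the regular expression itself is copied from the minified source):

```javascript
// Same pattern as in the minified source: the file must start with "---",
// and the capture group is everything up to the next line that starts with "---".
const FRONT_MATTER = /^---\n([\s\S]*?)\n---/u;

// Hypothetical helper: returns the raw YAML text, without parsing it.
function extractFrontMatter(text) {
  const match = text.match(FRONT_MATTER);
  if (!match) {
    throw new Error("missing YAML front matter");
  }
  return match[1];
}

// Sample content invented for this sketch.
const skill = [
  "---",
  "name: example-skill",
  "description: Hypothetical skill used only for this sketch.",
  "allowed-tools:",
  "  - echo_tool",
  "---",
  "# Body",
].join("\n");

console.log(extractFrontMatter(skill));
```

Because the pattern matches `\n` only, a file saved with CRLF line endings would not match it as written.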
package/dist/workspace/compile.js CHANGED
@@ -1 +1 @@
- import{readdir as e,readFile as t}from"node:fs/promises";import*as s from"node:path";import{parseAllDocuments as o}from"yaml";import{validateSkillMetadata as r}from"../runtime/skills/skill-metadata.js";export async function loadWorkspace(n){const i=s.resolve(n),a=await async function readConfigDocuments(e){const s=await listYamlFiles(e),r=[];for(const e of s){const s=await t(e,"utf8");for(const e of o(s)){const t=e.toJSON();t?.kind&&r.push({...t,metadata:t.metadata??{},spec:t.spec??{}})}}return r}(s.join(i,"config")),c=function compileModels(e){const t=new Map;for(const s of e){if("Models"===s.kind&&Array.isArray(s.spec))for(const e of s.spec){const s=compileModel(e);t.set(String(s.name),s)}if("Model"===s.kind&&isRecord(s.spec)){const e=compileModel({name:s.metadata?.name,...s.spec});t.set(String(e.name),e)}}return t}(a),l=function compileAgents(e,t){const o=new Map;for(const r of t.filter(e=>"Agent"===e.kind)){if(!isRecord(r.spec))continue;const t=String(r.metadata?.name),n=isRecord(r.spec.config)?r.spec.config:{},i=readSystemPrompt(r.spec,n);o.set(t,{id:t,name:t,description:r.metadata?.description??"",sourcePath:s.join(e,"config","agents",`${t}.yaml`),modelRef:normalizeRef(r.spec.modelRef),toolRefs:readStringList(r.spec.tools),skillPathRefs:readStringList(r.spec.skills).map(t=>s.join(e,"resources","skills",t)),subagentRefs:readStringList(r.spec.subagents),memorySources:readMemorySources(r.spec.memory),deepAgentConfig:{...n,responseFormat:normalizeResponseFormat(n.responseFormat),systemPrompt:i}})}return o}(i,a),m=await async function compileTools(e,t){const o=new Map;for(const e of t.filter(e=>"Tool"===e.kind)){const t=e.metadata?.name;t&&o.set(t,{name:t,id:t,spec:e.spec})}return await loadToolExports(s.join(e,"resources","tools"),o),o}(i,a),p=await async function compileSkills(t){const o=new Map,n=s.join(t,"resources","skills");for(const t of await e(n,{withFileTypes:!0})){if(!t.isDirectory())continue;const 
e=s.join(n,t.name),i=r(e);o.set(i.name,{name:i.name,path:e,description:i.description,allowedTools:i.allowedTools})}return o}(i);return function resolveAgentSkillNames(e,t){for(const o of e.values())o.skillPathRefs=o.skillPathRefs.map(e=>{const o=s.basename(e);return t.get(o)?.path??e})}(l,p),function ensureDirectAgent(e,t){e.has("direct")||e.set("direct",{id:"direct",name:"direct",description:"Direct execution agent.",sourcePath:s.join(t,"config","runtime","workspace.yaml"),toolRefs:[],skillPathRefs:[],subagentRefs:[],memorySources:[],deepAgentConfig:{}})}(l,i),{workspaceRoot:i,models:c,agents:l,tools:m,skills:p,bindings:compileBindings(l)}}async function listYamlFiles(t){const o=await e(t,{withFileTypes:!0});return(await Promise.all(o.map(async e=>{const o=s.join(t,e.name);return e.isDirectory()?listYamlFiles(o):e.isFile()&&/\.ya?ml$/iu.test(e.name)?[o]:[]}))).flat().sort()}function compileModel(e){const t=String(e.name),s={...e};return delete s.name,delete s.provider,delete s.model,{name:t,provider:String(resolveValue(e.provider)),model:String(resolveValue(e.model)),init:(o=s,Object.fromEntries(Object.entries(o).map(([e,t])=>[e,resolveValue(t)])))};var o}async function loadToolExports(o,r){const n=await e(o,{withFileTypes:!0});for(const e of n){if(e.name.startsWith("_"))continue;const n=s.join(o,e.name);if(e.isDirectory()){await loadToolExports(n,r);continue}if(!e.isFile()||!e.name.endsWith(".mjs"))continue;const i=(await t(n,"utf8")).match(/export\s+const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*tool/u),a=i?.[1]??s.basename(e.name,".mjs");r.set(a,{name:a,id:a,sourcePath:n})}}function compileBindings(e){const t=new Map;for(const[s,o]of e){const e=o.deepAgentConfig;t.set(s,{agentId:s,deepAgentParams:{responseFormat:e.responseFormat,systemPrompt:e.systemPrompt}})}return t}function readSystemPrompt(e,t){return"string"==typeof e.systemPrompt?e.systemPrompt:"string"==typeof t.systemPrompt?t.systemPrompt:""}function normalizeResponseFormat(e){const 
t={type:"object",properties:{status:{type:"string",enum:["completed","blocked","failed","refused"]},summary:{type:"array",items:{type:"string"}},findings:{type:"array",items:{type:"string"}},blockers:{type:"array",items:{type:"string"}},nextActions:{type:"array",items:{type:"string"}},report:{type:"string"}},required:["status","summary","findings","blockers","nextActions","report"]};if(!isRecord(e))return t;const s=isRecord(e.properties)?e.properties:{},o=Array.isArray(e.required)?e.required:[];return{...t,...e,properties:{...t.properties,...s},required:[...new Set([...t.required,...o].filter(e=>"string"==typeof e))]}}function readMemorySources(e){return Array.isArray(e)?e.flatMap(e=>isRecord(e)&&"string"==typeof e.path?[e.path]:[]):[]}function readStringList(e){return Array.isArray(e)?e.filter(e=>"string"==typeof e&&e.trim().length>0):[]}function normalizeRef(e){return"string"==typeof e?e.replace(/^[^/]+\//u,""):void 0}function resolveValue(e){if("string"!=typeof e)return e;const t=e.replace(/\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-(.*?))?\}/gu,(e,t,s)=>process.env[t]??s??"");return/^(true|false)$/iu.test(t)?"true"===t.toLowerCase():/^-?\d+$/u.test(t)?Number(t):t}function isRecord(e){return"object"==typeof e&&null!==e&&!Array.isArray(e)}
+ import{readdir as e,readFile as t}from"node:fs/promises";import*as o from"node:path";import{parseAllDocuments as s}from"yaml";import{validateSkillMetadata as r}from"../runtime/skills/skill-metadata.js";export async function loadWorkspace(n){const i=o.resolve(n),a=await async function readConfigDocuments(e){const o=await listYamlFiles(e),r=[];for(const e of o){const o=await t(e,"utf8");for(const e of s(o)){const t=e.toJSON();t?.kind&&r.push({...t,metadata:t.metadata??{},spec:t.spec??{}})}}return r}(o.join(i,"config")),c=function compileModels(e){const t=new Map;for(const o of e){if("Models"===o.kind&&Array.isArray(o.spec))for(const e of o.spec){const o=compileModel(e);t.set(String(o.name),o)}if("Model"===o.kind&&isRecord(o.spec)){const e=compileModel({name:o.metadata?.name,...o.spec});t.set(String(e.name),e)}}return t}(a),m=function compileAgents(e,t){const s=new Map;for(const r of t.filter(e=>"Agent"===e.kind)){if(!isRecord(r.spec))continue;const t=String(r.metadata?.name),n=isRecord(r.spec.config)?r.spec.config:{},i=readSystemPrompt(r.spec,n);s.set(t,{id:t,name:t,description:r.metadata?.description??"",sourcePath:o.join(e,"config","agents",`${t}.yaml`),modelRef:normalizeRef(r.spec.modelRef),toolRefs:readStringList(r.spec.tools),skillPathRefs:readStringList(r.spec.skills).map(t=>o.join(e,"resources","skills",t)),subagentRefs:readStringList(r.spec.subagents),memorySources:readMemorySources(r.spec.memory),deepAgentConfig:{...n,responseFormat:normalizeResponseFormat(n.responseFormat),systemPrompt:i}})}return s}(i,a),l=await async function compileTools(e,t){const s=new Map;for(const e of t.filter(e=>"Tool"===e.kind)){const t=e.metadata?.name;t&&s.set(t,{name:t,id:t,spec:e.spec})}return await loadToolExports(o.join(e,"resources","tools"),s),s}(i,a),p=await async function compileSkills(t){const s=new Map,n=o.join(t,"resources","skills");for(const t of await e(n,{withFileTypes:!0})){if(!t.isDirectory())continue;const 
e=o.join(n,t.name),i=r(e);s.set(i.name,{name:i.name,path:e,description:i.description,allowedTools:i.allowedTools})}return s}(i);return function resolveAgentSkillNames(e,t){for(const s of e.values())s.skillPathRefs=s.skillPathRefs.map(e=>{const s=o.basename(e);return t.get(s)?.path??e})}(m,p),function ensureDirectAgent(e,t){e.has("direct")||e.set("direct",{id:"direct",name:"direct",description:"Direct execution agent.",sourcePath:o.join(t,"config","runtime","workspace.yaml"),toolRefs:[],skillPathRefs:[],subagentRefs:[],memorySources:[],deepAgentConfig:{}})}(m,i),{workspaceRoot:i,models:c,agents:m,tools:l,skills:p,bindings:compileBindings(m)}}async function listYamlFiles(t){const s=await e(t,{withFileTypes:!0});return(await Promise.all(s.map(async e=>{const s=o.join(t,e.name);return e.isDirectory()?listYamlFiles(s):e.isFile()&&/\.ya?ml$/iu.test(e.name)?[s]:[]}))).flat().sort()}function compileModel(e){const t=String(e.name),o={...e};return delete o.name,delete o.provider,delete o.model,{name:t,provider:String(resolveValue(e.provider)),model:String(resolveValue(e.model)),init:(s=o,Object.fromEntries(Object.entries(s).map(([e,t])=>[e,resolveValue(t)])))};var s}async function loadToolExports(s,r){const n=await e(s,{withFileTypes:!0});for(const e of n){if(e.name.startsWith("_"))continue;const n=o.join(s,e.name);if(e.isDirectory()){await loadToolExports(n,r);continue}if(!e.isFile()||!e.name.endsWith(".mjs"))continue;const i=(await t(n,"utf8")).match(/export\s+const\s+([A-Za-z_][A-Za-z0-9_]*)\s*=\s*tool/u),a=i?.[1]??o.basename(e.name,".mjs");r.set(a,{name:a,id:a,sourcePath:n})}}function compileBindings(e){const t=new Map;for(const[o,s]of e){const e=s.deepAgentConfig;t.set(o,{agentId:o,deepAgentParams:{responseFormat:e.responseFormat,systemPrompt:e.systemPrompt}})}return t}function readSystemPrompt(e,t){return"string"==typeof e.systemPrompt?e.systemPrompt:"string"==typeof t.systemPrompt?t.systemPrompt:""}function normalizeResponseFormat(e){const 
t={type:"object",properties:{status:{type:"string",enum:["completed","blocked","failed","refused"]},summary:{type:"array",items:{type:"string"}},findings:{type:"array",items:{type:"string"}},blockers:{type:"array",items:{type:"string"}},nextActions:{type:"array",items:{type:"string"}},report:{type:"string"}},required:["status","summary","findings","blockers","nextActions","report"]};if(!isRecord(e))return t;const o=isRecord(e.properties)?e.properties:{},s=Array.isArray(e.required)?e.required:[];return{...t,...e,properties:{...t.properties,...o},required:[...new Set([...t.required,...s].filter(e=>"string"==typeof e))]}}function readMemorySources(e){return Array.isArray(e)?e.flatMap(e=>isRecord(e)&&"string"==typeof e.path?[e.path]:[]):[]}function readStringList(e){return Array.isArray(e)?e.filter(e=>"string"==typeof e&&e.trim().length>0):[]}function normalizeRef(e){return"string"==typeof e?e.replace(/^[^/]+\//u,""):void 0}function resolveValue(e){if("string"!=typeof e)return e;const t=e.replace(/\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-(.*?))?\}/gu,(e,t,o)=>process.env[t]??o??"");return/^(true|false)$/iu.test(t)?"true"===t.toLowerCase():/^-?\d+$/u.test(t)?Number(t):t}function isRecord(e){return"object"==typeof e&&null!==e&&!Array.isArray(e)}
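For readers of the minified bundle, the `resolveValue` helper above expands `${env:NAME:-default}` placeholders and then coerces pure boolean or integer strings. The following is a readable restatement of that one function, offered as a sketch rather than the package's source: the parameter names and the optional `env` argument are additions for illustration (the bundled code reads `process.env` directly).

```javascript
// Readable restatement of the minified `resolveValue` helper.
// The `env` parameter is an assumption added here so the function can be
// exercised without mutating process.env.
function resolveValue(value, env = process.env) {
  // Non-string values (numbers, booleans, objects) pass through untouched.
  if (typeof value !== "string") return value;

  // Expand ${env:NAME} and ${env:NAME:-fallback}; unset names without a
  // fallback become the empty string.
  const expanded = value.replace(
    /\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-(.*?))?\}/gu,
    (_match, name, fallback) => env[name] ?? fallback ?? "",
  );

  // Coerce whole-string booleans and integers after expansion.
  if (/^(true|false)$/iu.test(expanded)) return expanded.toLowerCase() === "true";
  if (/^-?\d+$/u.test(expanded)) return Number(expanded);
  return expanded;
}

// Usage with a sample environment object (hypothetical values).
const sampleEnv = { HOST: "example.test" };
console.log(resolveValue("http://${env:HOST:-localhost}/v1", sampleEnv)); // http://example.test/v1
console.log(resolveValue("${env:PORT:-8080}", sampleEnv)); // 8080 (a number)
```

Because coercion happens after expansion, a model option such as a port or a feature flag can be supplied through the environment and still reach the provider as a number or boolean.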
@@ -4,7 +4,7 @@
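 
  ## Test setup
 
- - Remote Ollama: `https://ollama-rtx-4070.easynet.world`
+ - Remote Ollama: `https://ollama-rtx-4070.easynet.world/v1`
  - Natural-case rounds per model: `10`; total natural cases: `50`
  - Injected-error matrix covers: unknown tool, wrong tool name, missing required argument, type error, enum error, extra arg, absolute path, semantic ticker error, unparseable arguments
  - This benchmark is a product-level fault-injection run plus a local BFCL-style subset; it is not an official BFCL score.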
@@ -19,8 +19,8 @@
  | qwen3.5:0.8b | on | 50 | 100% | 100% | 0% | 0% | 100% |
  | qwen3.5:2b | off | 50 | 100% | 100% | 0% | 0% | 100% |
  | qwen3.5:2b | on | 50 | 100% | 100% | 0% | 0% | 100% |
- | granite4.1:3b | off | 50 | 100% | 100% | 0% | 0% | 100% |
- | granite4.1:3b | on | 50 | 100% | 100% | 0% | 0% | 100% |
+ | qwen3.5:4b | off | 50 | 100% | 100% | 0% | 0% | 100% |
+ | qwen3.5:4b | on | 50 | 100% | 100% | 0% | 0% | 100% |
  | qwen3.5:4b | off | 50 | 100% | 100% | 0% | 0% | 100% |
  | qwen3.5:4b | on | 50 | 100% | 100% | 0% | 0% | 100% |
 
@@ -31,12 +31,12 @@
  | qwen3:0.6b | 100% | 66.7% | name, schema, type, semantic |
  | qwen3.5:0.8b | 100% | 66.7% | name, schema, type, semantic |
  | qwen3.5:2b | 100% | 100% | name, schema, type, semantic |
- | granite4.1:3b | 100% | 100% | name, schema, type, semantic |
+ | qwen3.5:4b | 100% | 100% | name, schema, type, semantic |
  | qwen3.5:4b | 100% | 100% | name, schema, type, semantic |
 
  ## Conclusions
 
  - The Guard's core benefit is keeping bad tool calls out of the real execution layer; in this test round, 100% of injected errors were intercepted.
  - In its natural output, `qwen3:0.6b` produced registered tool calls that would have executed incorrectly in 20% of cases; with the Guard enabled, bad execution dropped from 20% to 0%.
- - `qwen3.5:2b`, `granite4.1:3b`, and `qwen3.5:4b` reached a 100% single-round repair success rate on injected errors. This conclusion applies only to this benchmark's injected-error matrix.
+ - `qwen3.5:2b`, `qwen3.5:4b`, and `qwen3.5:4b` reached a 100% single-round repair success rate on injected errors. This conclusion applies only to this benchmark's injected-error matrix.
  - For `qwen3.5:0.8b` and larger, the baseline was already 100% on this round's natural cases, so the natural scenario shows no observable accepted-rate uplift.
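To make the idea of a guard sitting in front of the execution layer concrete, here is a minimal sketch of a pre-execution check covering several of the injected error classes listed in this document (unknown tool, unparseable arguments, missing required argument, type error, enum error, extra arg). It is not stable-harness's Guard: `checkToolCall`, the `registry` shape, and the `get_quote` tool are all hypothetical names invented for this example, and only primitive `typeof`-style schema types are handled.

```javascript
// Hypothetical guard sketch; none of these names come from stable-harness.
// Returns a list of problems; an empty list means the call may be executed.
function checkToolCall(registry, call) {
  const tool = registry[call.name];
  if (!tool) return [`unknown tool: ${call.name}`];

  // Arguments may arrive as a JSON string or as an already-parsed object.
  let args;
  try {
    args = typeof call.arguments === "string" ? JSON.parse(call.arguments) : call.arguments;
  } catch {
    return ["arguments are not parseable JSON"];
  }
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return ["arguments must be an object"];
  }

  const errors = [];
  const properties = tool.parameters.properties ?? {};
  for (const key of tool.parameters.required ?? []) {
    if (!(key in args)) errors.push(`missing required argument: ${key}`);
  }
  for (const [key, value] of Object.entries(args)) {
    const schema = properties[key];
    if (!schema) {
      errors.push(`unexpected argument: ${key}`);
    } else if (typeof value !== schema.type) {
      // Only "string", "number", and "boolean" schema types are supported here.
      errors.push(`wrong type for ${key}: expected ${schema.type}`);
    } else if (schema.enum && !schema.enum.includes(value)) {
      errors.push(`value for ${key} is not in enum`);
    }
  }
  return errors;
}

// Hypothetical registry with a single tool.
const registry = {
  get_quote: {
    parameters: {
      properties: {
        ticker: { type: "string" },
        interval: { type: "string", enum: ["1d", "1w"] },
      },
      required: ["ticker"],
    },
  },
};

console.log(checkToolCall(registry, { name: "get_quote", arguments: '{"ticker":"IBM"}' })); // []
```

A non-empty result would be fed back to the model for a repair round instead of reaching the tool, which is the behavior the repair-rate column measures.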
@@ -366,7 +366,7 @@ flowchart LR
  end
 
  subgraph Service["Standalone service"]
- CLI["stable-harness start -w workspace --port 8642"] --> Server["HTTP / OpenAI-compatible server"]
+ CLI["stable-harness start -w workspace"] --> Server["HTTP / protocol servers"]
  Server --> RuntimeB["Runtime instance"]
  end
 
@@ -383,8 +383,8 @@ flowchart LR
  Recommended startup paths:
 
  - Embedded application: `createStableHarnessRuntime(workspaceRoot)`; the workspace YAML determines the adapter, protocol, and policy defaults.
- - Local or service entry point: `stable-harness start -w "$PWD" --port 8642 --api-key ...`.
- - OpenAI-compatible clients: connect to `http://host:port/v1` and map `model` to a workspace agent id.
+ - Local or service entry point: `stable-harness start -w "$PWD"` or `stable-harness serve -w "$PWD"`.
+ - OpenAI-compatible clients: connect by default to `http://127.0.0.1:56789/protocols/openai/v1` and map `model` to a workspace agent id.
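As an illustration of the `model`-to-agent-id mapping described in the added line, the sketch below builds a chat request against the new default base URL. Several parts are assumptions rather than documented behavior: the `/chat/completions` path and the bearer-token header follow the general OpenAI convention, `buildChatRequest` is a hypothetical helper, and `direct` is used as the agent id only because the bundled `loadWorkspace` code registers an agent with that id when none is defined.

```javascript
// Base URL taken from the updated doc line. The "/chat/completions" suffix and
// the Authorization header follow the usual OpenAI convention and are
// assumptions here, not confirmed stable-harness behavior.
const DEFAULT_BASE_URL = "http://127.0.0.1:56789/protocols/openai/v1";

// Hypothetical helper: returns the URL and fetch() options for one user turn.
function buildChatRequest(agentId, userText, { baseUrl = DEFAULT_BASE_URL, apiKey } = {}) {
  const headers = { "content-type": "application/json" };
  if (apiKey) headers.authorization = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/u, "")}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({
        model: agentId, // `model` carries the workspace agent id
        messages: [{ role: "user", content: userText }],
      }),
    },
  };
}

// Usage (requires a running server, so it is left commented out):
// const { url, init } = buildChatRequest("direct", "ping");
// const response = await fetch(url, init);
```

Keeping request construction separate from the network call lets the mapping be checked without a server running.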
 
  ## Extension points