stable-harness 0.0.7 → 0.0.9

Files changed (39)
  1. package/README.md +10 -0
  2. package/docs/0.1.0-p0-runtime-control-plane-plan.zh.md +171 -0
  3. package/docs/0.1.0-retry-policy.zh.md +87 -0
  4. package/docs/0.1.0-stable-runtime-development-roadmap.zh.md +393 -0
  5. package/docs/0.1.0-tool-guard-benchmark.zh.md +42 -0
  6. package/docs/adapter-contract.md +199 -0
  7. package/docs/architecture/backend-comparison.md +41 -0
  8. package/docs/architecture/runtime-events.md +263 -0
  9. package/docs/architecture/runtime-events.zh.md +248 -0
  10. package/docs/architecture/system-architecture.zh.md +435 -0
  11. package/docs/compatibility-matrix.md +139 -0
  12. package/docs/engineering-rules.md +111 -0
  13. package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +1632 -0
  14. package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1952 -0
  15. package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1427 -0
  16. package/docs/granite-tool-calling-comparison.zh.md +206 -0
  17. package/docs/guides/getting-started.md +126 -0
  18. package/docs/guides/index.md +40 -0
  19. package/docs/guides/integration-guide.md +126 -0
  20. package/docs/guides/operator-runbook.md +153 -0
  21. package/docs/guides/workspace-authoring.md +212 -0
  22. package/docs/implementation-blueprint.md +233 -0
  23. package/docs/memory/0.1.0-memory-design.zh.md +719 -0
  24. package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +146 -0
  25. package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +169 -0
  26. package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +123 -0
  27. package/docs/memory/0.1.0-step-09-memory-contract.zh.md +169 -0
  28. package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +143 -0
  29. package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +150 -0
  30. package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +118 -0
  31. package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +118 -0
  32. package/docs/product/adoption-playbook.md +145 -0
  33. package/docs/product/market-positioning.md +137 -0
  34. package/docs/product-boundary.md +258 -0
  35. package/docs/protocols/http-runtime.md +37 -0
  36. package/docs/protocols/langgraph-compatible.md +107 -0
  37. package/docs/protocols/openai-compatible.md +121 -0
  38. package/docs/tooling/0.1.0-bettercall-tool-quality.zh.md +231 -0
  39. package/package.json +3 -1
package/docs/guides/workspace-authoring.md
@@ -0,0 +1,212 @@
1
+ # Workspace Authoring
2
+
3
+ Stable Harness workspaces are directories of Kubernetes-style YAML resources plus
4
+ local resources such as tools and skills.
5
+
6
+ The workspace is the product surface. Application behavior should be expressed
7
+ here, not hidden inside framework code.
8
+
9
+ ## Folder Shape
10
+
11
+ ```text
12
+ config/
13
+   runtime/workspace.yaml
14
+   agents/orchestra.yaml
15
+   catalogs/models.yaml
16
+   catalogs/tools.yaml
17
+   workflows/review-shell.yaml
18
+ resources/
19
+   tools/
20
+   skills/
21
+ ```
22
+
23
+ ## Runtime
24
+
25
+ `kind: Runtime` owns workspace-level routing, protocol exposure, memory policy,
26
+ approval policy, event display, and other runtime controls.
27
+
28
+ ```yaml
29
+ apiVersion: stable-harness.dev/v1
30
+ kind: Runtime
31
+ metadata:
32
+   name: app-runtime
33
+ spec:
34
+   routing:
35
+     defaultAgentId: orchestra
36
+   protocols:
37
+     inProcess: true
38
+     openaiCompatible:
39
+       host: 127.0.0.1
40
+       port: 8642
41
+ ```
42
+
43
+ Keep runtime config typed and explicit. Do not route by prompt keywords, tool
44
+ names, benchmark cases, or downstream domains.
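
To illustrate the routing rule above, here is a minimal TypeScript sketch (not code from the package). It assumes a hypothetical request shape with an optional `agentId` field. The names `RoutingSpec`, `RuntimeRequest`, and `resolveAgentId` are invented for this example. Routing reads typed fields only and never looks at the prompt text.

```typescript
// Hypothetical types for illustration; not the package's actual API.
interface RoutingSpec {
  defaultAgentId: string;
}

interface RuntimeRequest {
  input: string;     // user text; deliberately never inspected for routing
  agentId?: string;  // explicit, typed routing hint
}

// Choose the target agent from typed fields only, then check the inventory.
function resolveAgentId(
  routing: RoutingSpec,
  request: RuntimeRequest,
  knownAgents: ReadonlySet<string>,
): string {
  const agentId = request.agentId ?? routing.defaultAgentId;
  if (!knownAgents.has(agentId)) {
    throw new Error(`Unknown agent: ${agentId}`);
  }
  return agentId;
}

const agents = new Set(["orchestra", "reviewer"]);
const routing: RoutingSpec = { defaultAgentId: "orchestra" };
const target = resolveAgentId(routing, { input: "please review this" }, agents);
```

The word "review" in the input has no effect here: without an explicit `agentId`, the request goes to the configured default.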
45
+
46
+ ## Agents
47
+
48
+ `kind: Agent` declares the agent inventory and selects the backend adapter.
49
+
50
+ ```yaml
51
+ apiVersion: stable-harness.dev/v1
52
+ kind: Agent
53
+ metadata:
54
+   name: orchestra
55
+ spec:
56
+   backend: deepagents
57
+   modelRef: local-dev
58
+   systemPrompt: You are a concise workspace agent.
59
+   tools:
60
+     - echo_tool
61
+   subagents:
62
+     - reviewer
63
+   config:
64
+     deepagents:
65
+       taskDescription: Call a subagent only when the upstream backend chooses to.
66
+ ```
67
+
68
+ Backend-specific settings belong under explicit adapter config fields such as
69
+ `config.deepagents`. Stable Harness should pass upstream-native semantics
70
+ through instead of recreating them locally.
71
+
72
+ ## Models
73
+
74
+ `kind: Model` declares model provider configuration.
75
+
76
+ ```yaml
77
+ apiVersion: stable-harness.dev/v1
78
+ kind: Model
79
+ metadata:
80
+   name: local-dev
81
+ spec:
82
+   provider: openai-compatible
83
+   model: ${env:STABLE_HARNESS_MODEL:-gpt-4.1-mini}
84
+   baseUrl: ${env:STABLE_HARNESS_OPENAI_BASE_URL:-https://api.openai.com/v1}
85
+   apiKey: ${env:OPENAI_API_KEY}
86
+ ```
87
+
88
+ Use environment references for secrets and environment-specific endpoints.
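
The placeholder syntax in the YAML above (`${env:NAME}` and `${env:NAME:-default}`) could be resolved roughly as follows. This is a sketch under assumptions, not the package's loader. `resolveEnvRefs` is a hypothetical name, and treating an empty variable as unset mirrors the shell's `:-` behavior rather than anything the docs state.

```typescript
// Hypothetical resolver for ${env:NAME} and ${env:NAME:-default} placeholders.
type Env = Record<string, string | undefined>;

function resolveEnvRefs(value: string, env: Env): string {
  return value.replace(
    /\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g,
    (_match: string, name: string, fallback: string | undefined) => {
      const found = env[name];
      // Assumption: an empty value counts as unset, as with the shell's ":-".
      if (found !== undefined && found !== "") return found;
      if (fallback !== undefined) return fallback;
      // No value and no default: fail loudly instead of emitting an empty secret.
      throw new Error(`Missing required environment variable: ${name}`);
    },
  );
}

// With no variables set, the default after ":-" is used.
const model = resolveEnvRefs("${env:STABLE_HARNESS_MODEL:-gpt-4.1-mini}", {});
// model === "gpt-4.1-mini"
```

A reference without a default, such as `${env:OPENAI_API_KEY}`, throws in this sketch when the variable is missing, which keeps a misconfigured secret from passing through silently.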
89
+
90
+ ## Tools
91
+
92
+ Local module tools are auto-discovered from `resources/tools/*.mjs`. The file
93
+ name becomes the tool ID unless the file exports `tool(...)` declarations.
94
+
95
+ Implement a local module:
96
+
97
+ ```js
98
+ export const echo_tool = {
99
+   description: "Echo input through the Stable Harness tool gateway.",
100
+   schema: {
101
+     type: "object",
102
+     properties: { value: { type: "string" } },
103
+     required: ["value"],
104
+   },
105
+   async invoke(args) {
106
+     return JSON.stringify({ echoed: args.value });
107
+   },
108
+ };
109
+ ```
110
+
111
+ Reference the discovered tool from an agent:
112
+
113
+ ```yaml
114
+ apiVersion: stable-harness.dev/v1
115
+ kind: Agent
116
+ metadata:
117
+   name: orchestra
118
+ spec:
119
+   backend: deepagents
120
+   modelRef: local-dev
121
+   systemPrompt: You are a concise workspace agent.
122
+   tools:
123
+     - echo_tool
124
+ ```
125
+
126
+ Use `kind: Tool` YAML for non-module or gateway-owned tools such as built-ins:
127
+
128
+ ```yaml
129
+ apiVersion: stable-harness.dev/v1
130
+ kind: Tool
131
+ metadata:
132
+   name: shell
133
+ spec:
134
+   description: Run an approved shell command through the configured tool gateway.
135
+   implementation: local-tool-gateway:shell
136
+   schema:
137
+     type: object
138
+     properties:
139
+       command:
140
+         type: string
141
+     required:
142
+       - command
143
+ ```
144
+
145
+ Do not declare a YAML tool with the same ID as an auto-discovered module tool
146
+ unless that YAML declaration intentionally replaces the module-backed tool.
147
+
148
+ The CLI tool gateway uses BetterCall repair mode for registered tools. Repair is
149
+ bounded by inventory, schemas, semantic validators, and governance policy.
150
+ Unknown or unauthorized tools stay blocked.
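
The bounding described above can be pictured as a gate that every tool call, repaired or not, has to pass. The sketch below is illustrative only. It does not show BetterCall or the real gateway, whose APIs are not part of this document, and `ToolSpec`, `GateResult`, and `gateToolCall` are hypothetical names. It covers only the inventory, authorization, and schema checks, for primitive JSON types.

```typescript
// Hypothetical shapes; the real gateway and BetterCall interfaces are not shown in the docs.
interface ToolSpec {
  schema: { required: string[]; properties: Record<string, { type: string }> };
}

type GateResult = { allowed: true } | { allowed: false; reason: string };

function gateToolCall(
  inventory: ReadonlyMap<string, ToolSpec>,
  authorized: ReadonlySet<string>,
  toolId: string,
  args: Record<string, unknown>,
): GateResult {
  const spec = inventory.get(toolId);
  if (!spec) return { allowed: false, reason: "unknown tool" };
  if (!authorized.has(toolId)) return { allowed: false, reason: "unauthorized tool" };
  for (const key of spec.schema.required) {
    if (!(key in args)) return { allowed: false, reason: `missing argument: ${key}` };
  }
  for (const [key, value] of Object.entries(args)) {
    const prop = spec.schema.properties[key];
    if (!prop) return { allowed: false, reason: `unexpected argument: ${key}` };
    // Simplification: typeof covers "string", "number", and "boolean" only.
    if (typeof value !== prop.type) return { allowed: false, reason: `wrong type for ${key}` };
  }
  return { allowed: true };
}

const inventory = new Map<string, ToolSpec>([
  ["echo_tool", { schema: { required: ["value"], properties: { value: { type: "string" } } } }],
]);
const authorized = new Set(["echo_tool"]);
const verdict = gateToolCall(inventory, authorized, "echo_tool", { value: "hi" });
```

Under this design a repair step may rewrite arguments, but its output is still checked against the same inventory and schema before anything runs.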
151
+
152
+ ## Workflows
153
+
154
+ Workflows are explicit control-plane topology, not a second agent language.
155
+
156
+ ```yaml
157
+ apiVersion: stable-harness.dev/v1
158
+ kind: Workflow
159
+ metadata:
160
+   name: review-shell
161
+ spec:
162
+   adapter: langgraph
163
+   entry: inspect
164
+   nodes:
165
+     - id: inspect
166
+       use: agents.orchestra
167
+     - id: run_shell
168
+       use: tools.shell
169
+     - id: review
170
+       use: agents.reviewer
171
+   edges:
172
+     - from: inspect
173
+       to: run_shell
174
+     - from: run_shell
175
+       to: review
176
+ ```
177
+
178
+ Use workflows when the operator needs inspectable topology, migration support,
179
+ or deterministic orchestration around existing inventory.
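
Because the topology is plain data, it can be checked before anything runs. The following is a sketch, not the package's validator. `WorkflowSpec` mirrors only the fields visible in the YAML above, and `validateWorkflow` is a hypothetical helper.

```typescript
// Mirrors the fields shown in the Workflow YAML; other fields may exist in the real schema.
interface WorkflowSpec {
  entry: string;
  nodes: { id: string; use: string }[];
  edges: { from: string; to: string }[];
}

// Return a list of structural problems; an empty list means the topology is consistent.
function validateWorkflow(spec: WorkflowSpec): string[] {
  const errors: string[] = [];
  const ids = new Set<string>();
  for (const node of spec.nodes) {
    if (ids.has(node.id)) errors.push(`duplicate node id: ${node.id}`);
    ids.add(node.id);
  }
  if (!ids.has(spec.entry)) errors.push(`entry is not a declared node: ${spec.entry}`);
  for (const edge of spec.edges) {
    if (!ids.has(edge.from)) errors.push(`edge from unknown node: ${edge.from}`);
    if (!ids.has(edge.to)) errors.push(`edge to unknown node: ${edge.to}`);
  }
  return errors;
}

const reviewShell: WorkflowSpec = {
  entry: "inspect",
  nodes: [
    { id: "inspect", use: "agents.orchestra" },
    { id: "run_shell", use: "tools.shell" },
    { id: "review", use: "agents.reviewer" },
  ],
  edges: [
    { from: "inspect", to: "run_shell" },
    { from: "run_shell", to: "review" },
  ],
};
const problems = validateWorkflow(reviewShell);
```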
180
+
181
+ ## Memory
182
+
183
+ Runtime memory policy belongs on the runtime resource. Backend-native memory
184
+ should still be used through the adapter when the upstream backend owns it.
185
+
186
+ ```yaml
187
+ spec:
188
+   memory:
189
+     enabled: true
190
+     deepagentsMem:
191
+       read: true
192
+       write: true
193
+     LangMem:
194
+       read: true
195
+       write: true
196
+     maintenance:
197
+       enabled: true
198
+       schedule: periodic
199
+ ```
200
+
201
+ Memory is an operational substrate: namespace, persistence, lifecycle,
202
+ maintenance, import/export, and governance. It is not a fake uniform replacement
203
+ for every backend's native memory behavior.
204
+
205
+ ## Authoring Rules
206
+
207
+ - Put product behavior in workspace YAML and resources.
208
+ - Keep framework code generic and backend-neutral.
209
+ - Prefer upstream passthrough over local copies of upstream features.
210
+ - Add runtime capabilities only when they are lifecycle, governance,
211
+   observability, persistence, recovery, protocol, memory, or tool-gateway work.
212
+ - Make each capability independently replaceable.
package/docs/implementation-blueprint.md
@@ -0,0 +1,233 @@
1
+ # Stable Harness Implementation Blueprint
2
+
3
+ This document defines what to build first and how each package should stay inside the product boundary.
4
+
5
+ ## Repository Shape
6
+
7
+ The project is a TypeScript monorepo.
8
+
9
+ Top-level packages:
10
+
11
+ - `@stable-harness/core`
12
+ - `@stable-harness/workspace-yaml`
13
+ - `@stable-harness/adapter-deepagents`
14
+ - `@stable-harness/memory`
15
+ - `@stable-harness/governance`
16
+ - `@stable-harness/protocols`
17
+ - `@stable-harness/cli`
18
+
19
+ Each package should have one clear responsibility and should avoid cross-layer shortcuts.
20
+
21
+ ## Package Responsibilities
22
+
23
+ ### `@stable-harness/core`
24
+
25
+ Owns the stable runtime contract:
26
+
27
+ - compiled workspace model
28
+ - runtime request and response types
29
+ - runtime events
30
+ - adapter contract
31
+ - runtime lifecycle
32
+ - inspection surface
33
+
34
+ It must not import DeepAgents, LangChain, OpenAI SDK, Gemini SDK, or protocol-specific servers.
35
+
36
+ ### `@stable-harness/workspace-yaml`
37
+
38
+ Owns workspace loading and validation:
39
+
40
+ - discover YAML files
41
+ - parse Kubernetes-style resources
42
+ - validate resource shape
43
+ - compile resources into the core workspace model
44
+ - preserve backend-specific config under explicit adapter config fields
45
+
46
+ It must not execute agents or tools.
47
+
48
+ ### `@stable-harness/adapter-deepagents`
49
+
50
+ Owns DeepAgents integration:
51
+
52
+ - create upstream DeepAgents instances
53
+ - pass through DeepAgents-native config
54
+ - map runtime request metadata to upstream calls
55
+ - normalize upstream events into runtime events
56
+ - return final output and artifacts
57
+
58
+ It must stay thin. It must not add domain heuristics, TODO-to-tool inference, or local replay of DeepAgents custom tools.
59
+
60
+ Before adding behavior here, check whether the current DeepAgents version already provides it. Existing DeepAgents behavior must be passed through or configured with upstream-native options, not rebuilt locally.
61
+
62
+ ### `@stable-harness/memory`
63
+
64
+ Owns runtime-level memory lifecycle:
65
+
66
+ - memory store interfaces
67
+ - namespace conventions
68
+ - persistence adapters
69
+ - compaction hooks
70
+ - backup and export hooks
71
+
72
+ It may integrate with backend-native memory primitives through adapters, but it must not replace them with a fake uniform semantic model.
73
+
74
+ ### `@stable-harness/governance`
75
+
76
+ Owns runtime governance:
77
+
78
+ - tool approval policy
79
+ - sandbox policy
80
+ - resource limits
81
+ - audit decisions
82
+ - policy evaluation results
83
+
84
+ It must use structured policy and metadata, not free-form text matching.
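
One way to read "structured policy and metadata" is that rules match on typed fields of a tool call and never on its text. The sketch below assumes a hypothetical metadata shape (`toolId`, `risk`) and first-match-wins rule order. None of these names come from the package.

```typescript
// Hypothetical policy model for illustration only.
type Decision = "allow" | "deny" | "require_approval";

interface ToolCallMeta {
  toolId: string;
  risk: "low" | "high";
}

interface PolicyRule {
  match: Partial<ToolCallMeta>; // every listed field must equal the call's field
  decision: Decision;
}

// First matching rule wins; calls that match no rule fall back to "deny".
function evaluatePolicy(
  rules: PolicyRule[],
  call: ToolCallMeta,
  fallback: Decision = "deny",
): Decision {
  for (const rule of rules) {
    const keys = Object.keys(rule.match) as (keyof ToolCallMeta)[];
    if (keys.every((key) => rule.match[key] === call[key])) {
      return rule.decision;
    }
  }
  return fallback;
}

const rules: PolicyRule[] = [
  { match: { toolId: "shell" }, decision: "require_approval" },
  { match: { risk: "low" }, decision: "allow" },
];
const shellDecision = evaluatePolicy(rules, { toolId: "shell", risk: "low" });
```

The decision depends only on declared metadata, so the same call yields the same result whatever the prompt or the assistant's wording was.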
85
+
86
+ ### `@stable-harness/protocols`
87
+
88
+ Owns protocol adapters:
89
+
90
+ - in-process client
91
+ - HTTP server
92
+ - MCP surface
93
+ - ACP surface
94
+ - A2A surface
95
+ - AG-UI surface
96
+
97
+ Protocol adapters should expose stable runtime concepts and call the core runtime.
98
+
99
+ ### `@stable-harness/cli`
100
+
101
+ Owns local developer workflow:
102
+
103
+ - initialize workspaces
104
+ - run a request
105
+ - inspect runtime state
106
+ - list events
107
+ - approve or deny pending work
108
+
109
+ CLI commands should be thin wrappers around core runtime and protocol clients.
110
+
111
+ ## Build Order
112
+
113
+ ### Phase 1: Runtime spine
114
+
115
+ Deliver:
116
+
117
+ - core runtime types
118
+ - in-memory runtime implementation
119
+ - YAML loader
120
+ - minimal DeepAgents adapter contract
121
+ - minimal example workspace
122
+ - build and test scripts
123
+
124
+ Acceptance:
125
+
126
+ - `npm run check` passes
127
+ - example workspace can load
128
+ - runtime can route one request to the DeepAgents adapter scaffold
129
+ - no file violates project size limits
130
+
131
+ ### Phase 2: Real DeepAgents integration
132
+
133
+ Deliver:
134
+
135
+ - `createDeepAgent` based adapter path
136
+ - direct passthrough for DeepAgents-native middleware config
137
+ - support for DeepAgents task tool and subagents through upstream semantics
138
+ - event normalization
139
+ - integration tests with a real upstream DeepAgents dependency
140
+ - a DeepAgents capability audit for every adapter feature added in this phase
141
+
142
+ Acceptance:
143
+
144
+ - adapter does not manually recreate the DeepAgents middleware stack when `createDeepAgent` is sufficient
145
+ - DeepAgents custom tool calls stay inside upstream execution boundary
146
+ - any wrapper around DeepAgents is limited to runtime governance, lifecycle, events, traces, artifacts, protocol access, or operator inspection
147
+ - no runtime keyword heuristics
148
+
149
+ ### Capability Design Rule
150
+
151
+ Every runtime-owned feature should be a single-purpose optional capability.
152
+
153
+ Examples:
154
+
155
+ - approvals
156
+ - event recording
157
+ - artifact storage
158
+ - memory lifecycle
159
+ - tool gateway
160
+ - replay
161
+ - protocol serving
162
+ - sandbox policy
163
+ - evaluation runner
164
+
165
+ Each capability needs:
166
+
167
+ - a narrow interface
168
+ - typed config
169
+ - an enable or disable path
170
+ - a replacement point
171
+ - focused tests
172
+
173
+ Do not make one feature depend on unrelated runtime capabilities unless the dependency is explicit in typed config.
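
The requirements listed above (narrow interface, typed config, enable or disable path, replacement point) can be sketched as a small registry. This is an illustration under assumptions: `Capability`, `CapabilityRegistry`, and `InMemoryEventRecorder` are hypothetical names, not types from `@stable-harness/core`.

```typescript
// Narrow interface: a capability only knows how to start with its own typed config and stop.
interface Capability<Config> {
  readonly name: string;
  start(config: Config): void;
  stop(): void;
}

class CapabilityRegistry {
  private readonly active = new Map<string, { stop(): void }>();

  // Enabling under an existing name stops the old instance first (the replacement point).
  enable<C>(capability: Capability<C>, config: C): void {
    this.disable(capability.name);
    capability.start(config);
    this.active.set(capability.name, capability);
  }

  // Returns true when a running capability was stopped and removed.
  disable(name: string): boolean {
    const current = this.active.get(name);
    if (!current) return false;
    current.stop();
    return this.active.delete(name);
  }

  isEnabled(name: string): boolean {
    return this.active.has(name);
  }
}

// Example capability with its own typed config and no dependency on other capabilities.
class InMemoryEventRecorder implements Capability<{ maxEvents: number }> {
  readonly name = "event-recording";
  readonly events: string[] = [];
  private limit = 0;

  start(config: { maxEvents: number }): void {
    this.limit = config.maxEvents;
  }

  stop(): void {
    this.events.length = 0; // drop buffered events on shutdown
  }

  record(event: string): void {
    if (this.events.length < this.limit) this.events.push(event);
  }
}

const registry = new CapabilityRegistry();
const recorder = new InMemoryEventRecorder();
registry.enable(recorder, { maxEvents: 1 });
```

Keying the registry by capability name lets a test swap in a different recorder without touching any other capability.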
174
+
175
+ ### Phase 3: Operator control plane
176
+
177
+ Deliver:
178
+
179
+ - persisted run records
180
+ - event store
181
+ - approval queue
182
+ - cancellation and retry lifecycle
183
+ - run inspection API
184
+ - CLI inspection commands
185
+
186
+ Acceptance:
187
+
188
+ - operators can inspect request lifecycle without touching backend checkpoints
189
+ - approval flow is driven by structured policy
190
+
191
+ ### Phase 4: Protocol surfaces
192
+
193
+ Deliver:
194
+
195
+ - HTTP API
196
+ - MCP protocol adapter
197
+ - ACP adapter
198
+ - AG-UI event stream
199
+ - protocol conformance tests
200
+
201
+ Acceptance:
202
+
203
+ - all protocols call the same stable runtime contract
204
+ - protocol adapters do not duplicate backend logic
205
+
206
+ ### Phase 5: Memory and replay
207
+
208
+ Deliver:
209
+
210
+ - memory persistence adapter
211
+ - memory import/export
212
+ - trace export
213
+ - replay manifest
214
+ - evaluation fixture runner
215
+
216
+ Acceptance:
217
+
218
+ - replay is driven by structured traces
219
+ - memory governance is visible in events
220
+
221
+ ## Forbidden Implementation Patterns
222
+
223
+ Do not add:
224
+
225
+ - hardcoded finance, Kubernetes, GitHub Actions, source-analysis, or other downstream tool branches
226
+ - user-text regex routing
227
+ - assistant-text parsing that triggers tools
228
+ - TODO parsing that creates evidence tool calls
229
+ - local replay for upstream custom tools
230
+ - framework-specific public API naming unless intentionally accepted as stable product surface
231
+ - bundled runtime subsystems that cannot be independently enabled, disabled, replaced, and tested
232
+
233
+ If a feature seems useful but requires one of these patterns, it belongs in a downstream workspace or upstream adapter config, not in the generic runtime.