stable-harness 0.0.8 → 0.0.10

Files changed (40)
  1. package/LICENSE +21 -0
  2. package/README.md +10 -0
  3. package/docs/0.1.0-p0-runtime-control-plane-plan.zh.md +171 -0
  4. package/docs/0.1.0-retry-policy.zh.md +87 -0
  5. package/docs/0.1.0-stable-runtime-development-roadmap.zh.md +393 -0
  6. package/docs/0.1.0-tool-guard-benchmark.zh.md +42 -0
  7. package/docs/adapter-contract.md +199 -0
  8. package/docs/architecture/backend-comparison.md +41 -0
  9. package/docs/architecture/runtime-events.md +263 -0
  10. package/docs/architecture/runtime-events.zh.md +248 -0
  11. package/docs/architecture/system-architecture.zh.md +435 -0
  12. package/docs/compatibility-matrix.md +139 -0
  13. package/docs/engineering-rules.md +111 -0
  14. package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +1632 -0
  15. package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1952 -0
  16. package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1427 -0
  17. package/docs/granite-tool-calling-comparison.zh.md +206 -0
  18. package/docs/guides/getting-started.md +126 -0
  19. package/docs/guides/index.md +40 -0
  20. package/docs/guides/integration-guide.md +126 -0
  21. package/docs/guides/operator-runbook.md +153 -0
  22. package/docs/guides/workspace-authoring.md +212 -0
  23. package/docs/implementation-blueprint.md +233 -0
  24. package/docs/memory/0.1.0-memory-design.zh.md +719 -0
  25. package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +146 -0
  26. package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +169 -0
  27. package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +123 -0
  28. package/docs/memory/0.1.0-step-09-memory-contract.zh.md +169 -0
  29. package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +143 -0
  30. package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +150 -0
  31. package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +118 -0
  32. package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +118 -0
  33. package/docs/product/adoption-playbook.md +145 -0
  34. package/docs/product/market-positioning.md +137 -0
  35. package/docs/product-boundary.md +258 -0
  36. package/docs/protocols/http-runtime.md +37 -0
  37. package/docs/protocols/langgraph-compatible.md +107 -0
  38. package/docs/protocols/openai-compatible.md +121 -0
  39. package/docs/tooling/0.1.0-bettercall-tool-quality.zh.md +231 -0
  40. package/package.json +2 -1
@@ -0,0 +1,42 @@
1
+ # 0.1.0 Tool Guard Benchmark
2
+
3
+ Generated at: 2026-05-07T00:40:17.180Z
4
+
5
+ ## Test Setup
6
+
7
+ - Remote Ollama: `https://ollama-rtx-4070.easynet.world`
8
+ - Natural-case rounds per model: `10`; total natural cases: `50`
9
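+ - The injected-fault matrix covers: unknown tool, wrong tool name, missing required argument, wrong type, invalid enum value, extra argument, absolute path, semantically wrong ticker, and unparseable arguments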
10
+ - This benchmark is a product-level fault-injection run plus a local BFCL-style subset; it is not an official BFCL score.
11
+
12
+ ## Natural Tool Calls
13
+
14
+ | Model | Repair | Natural cases | Exact | Baseline Accepted | Bad Exec without Guard | Bad Exec with Guard | Final Accepted |
15
+ | --- | --- | --- | --- | --- | --- | --- | --- |
16
+ | qwen3:0.6b | off | 50 | 80% | 80% | 20% | 0% | 80% |
17
+ | qwen3:0.6b | on | 50 | 80% | 80% | 20% | 0% | 80% |
18
+ | qwen3.5:0.8b | off | 50 | 100% | 100% | 0% | 0% | 100% |
19
+ | qwen3.5:0.8b | on | 50 | 100% | 100% | 0% | 0% | 100% |
20
+ | qwen3.5:2b | off | 50 | 100% | 100% | 0% | 0% | 100% |
21
+ | qwen3.5:2b | on | 50 | 100% | 100% | 0% | 0% | 100% |
22
+ | granite4.1:3b | off | 50 | 100% | 100% | 0% | 0% | 100% |
23
+ | granite4.1:3b | on | 50 | 100% | 100% | 0% | 0% | 100% |
24
+ | qwen3.5:4b | off | 50 | 100% | 100% | 0% | 0% | 100% |
25
+ | qwen3.5:4b | on | 50 | 100% | 100% | 0% | 0% | 100% |
26
+
27
+ ## Injected-Fault Matrix
28
+
29
+ | Model | Injected faults blocked by Guard | Injected faults repaired | Fault types covered |
30
+ | --- | --- | --- | --- |
31
+ | qwen3:0.6b | 100% | 66.7% | name, schema, type, semantic |
32
+ | qwen3.5:0.8b | 100% | 66.7% | name, schema, type, semantic |
33
+ | qwen3.5:2b | 100% | 100% | name, schema, type, semantic |
34
+ | granite4.1:3b | 100% | 100% | name, schema, type, semantic |
35
+ | qwen3.5:4b | 100% | 100% | name, schema, type, semantic |
36
+
37
+ ## Conclusions
38
+
39
+ - Guard's core benefit is keeping faulty tool calls out of the real execution layer; in this round, 100% of injected faults were blocked.
40
+ - 20% of the natural outputs from `qwen3:0.6b` are registered tool calls that would otherwise have executed incorrectly; with Guard enabled, bad execution drops from 20% to 0%.
41
+ - `qwen3.5:2b`, `granite4.1:3b`, and `qwen3.5:4b` reach a 100% single-round repair success rate on injected faults. This conclusion applies only to this benchmark's injected-fault matrix.
42
+ - `qwen3.5:0.8b` and larger models already reach a 100% baseline on this round's natural cases, so the natural scenarios show no observable accepted-rate uplift.
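The fault classes in the injected-fault matrix can be pictured as a pre-execution check. The following is a minimal sketch only: `ParamSpec`, `ToolSpec`, and `guardToolCall` are hypothetical names for illustration, not the stable-harness guard, and semantic checks such as a wrong ticker are out of scope here.

```ts
// Hypothetical shapes for illustration; not the stable-harness API.
type ParamSpec = {
  type: "string" | "number" | "boolean";
  required?: boolean;
  enum?: readonly string[];
  relativePathOnly?: boolean; // reject absolute paths for this parameter
};
type ToolSpec = { params: Record<string, ParamSpec> };
type ToolCall = { tool: string; args: Record<string, unknown> };
type GuardResult = { ok: true } | { ok: false; errors: string[] };

function hasOwnKey(table: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

// Validate a tool call against the registry before anything executes.
function guardToolCall(
  registry: Record<string, ToolSpec>,
  call: ToolCall,
): GuardResult {
  const spec = hasOwnKey(registry, call.tool) ? registry[call.tool] : undefined;
  if (!spec) return { ok: false, errors: [`unknown tool: ${call.tool}`] };

  const errors: string[] = [];
  for (const [key, param] of Object.entries(spec.params)) {
    const value = call.args[key];
    if (value === undefined) {
      if (param.required) errors.push(`missing required arg: ${key}`);
      continue;
    }
    if (typeof value !== param.type) {
      errors.push(`wrong type for ${key}: expected ${param.type}`);
      continue;
    }
    if (typeof value === "string") {
      if (param.enum && !param.enum.includes(value)) {
        errors.push(`invalid enum value for ${key}: ${value}`);
      }
      if (param.relativePathOnly && value.startsWith("/")) {
        errors.push(`absolute path not allowed for ${key}`);
      }
    }
  }
  for (const key of Object.keys(call.args)) {
    if (!hasOwnKey(spec.params, key)) errors.push(`unexpected arg: ${key}`);
  }
  return errors.length === 0 ? { ok: true } : { ok: false, errors };
}
```

A rejected call returns its error list instead of reaching the tool, which is the behavior the "Bad Exec with Guard" column measures.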
@@ -0,0 +1,199 @@
1
+ # Backend Adapter Contract
2
+
3
+ Backend adapters connect `stable-harness` to an upstream agent framework.
4
+
5
+ They are internal integration layers. They are not the public product boundary.
6
+
7
+ Adapters are passthrough-first. They translate stable runtime requests into
8
+ upstream calls, preserve upstream execution semantics, and avoid creating
9
+ stable-owned replicas of backend concepts.
10
+
11
+ ## Adapter Inputs
12
+
13
+ Every adapter receives:
14
+
15
+ - compiled workspace
16
+ - selected agent
17
+ - runtime request
18
+ - request ID
19
+ - session ID
20
+ - event emitter
21
+
22
+ Adapters should use the selected agent's structured config to call the upstream framework.
23
+
24
+ Before adding adapter behavior, inspect the current upstream backend capability. If the upstream framework already owns the behavior, the adapter should expose it through passthrough config instead of rebuilding it.
25
+
26
+ Adapter config may include backend-native sections. Those sections are not a
27
+ license to leak backend concepts into core runtime; they are an escape hatch for
28
+ typed upstream passthrough.
29
+
30
+ ## Adapter Outputs
31
+
32
+ Adapters may return either a string or a structured runtime output:
33
+
34
+ ```ts
+ type RuntimeOutput = {
+   text: string;
+   metadata?: Record<string, unknown>;
+   artifacts?: RuntimeArtifact[];
+ };
+ ```
41
+
42
+ Artifacts must remain stable runtime records rather than backend-specific public types.
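Because an adapter may return either form, the runtime side presumably needs one shape to record. A minimal sketch, assuming a placeholder `RuntimeArtifact` shape (the contract references the type but does not define it here) and a hypothetical `normalizeAdapterOutput` helper:

```ts
// Placeholder shape; the real RuntimeArtifact is not defined in this document.
type RuntimeArtifact = { id: string; kind: string };

type RuntimeOutput = {
  text: string;
  metadata?: Record<string, unknown>;
  artifacts?: RuntimeArtifact[];
};

// Hypothetical helper: accept a bare string or a structured output and
// always hand back the structured form.
function normalizeAdapterOutput(output: string | RuntimeOutput): RuntimeOutput {
  return typeof output === "string" ? { text: output } : output;
}
```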
43
+
44
+ ## Runtime Interface
45
+
46
+ The core runtime interface is split into focused surfaces:
47
+
48
+ - `RuntimeClient`: submit a request and receive a response
49
+ - `RuntimeEventSource`: subscribe to normalized runtime events
50
+ - `RuntimeInspector`: inspect run records
51
+ - `RuntimeLifecycle`: cancel running work and stop the runtime
52
+
53
+ The runtime owns request IDs, session IDs, parent run links, metadata, artifact records, lifecycle state, and event recording. Backend adapters own only the upstream execution handoff.
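One way to picture the split is a single runtime object that implements all four surfaces while callers depend only on the one they need. This is a sketch under assumptions: every method name and record shape below is hypothetical, the real surfaces are likely asynchronous, and `RuntimeLifecycle` here omits stopping the runtime.

```ts
// Method names and record shapes are assumptions; the contract only names
// the four surfaces and what each is responsible for.
type RunStatus = "running" | "cancelled";
type RunRecord = { requestId: string; status: RunStatus };
type RuntimeEventRecord = { type: string; requestId: string };
type RuntimeListener = (event: RuntimeEventRecord) => void;

interface RuntimeClient { submit(input: string): RunRecord; }
interface RuntimeEventSource { subscribe(listener: RuntimeListener): () => void; }
interface RuntimeInspector { getRun(requestId: string): RunRecord | undefined; }
interface RuntimeLifecycle { cancel(requestId: string): boolean; }

class InMemoryRuntime
  implements RuntimeClient, RuntimeEventSource, RuntimeInspector, RuntimeLifecycle
{
  private readonly runs = new Map<string, RunRecord>();
  private readonly listeners = new Set<RuntimeListener>();
  private nextId = 1;

  // The runtime, not the adapter, assigns the request ID.
  submit(_input: string): RunRecord {
    const run: RunRecord = { requestId: `req-${this.nextId++}`, status: "running" };
    this.runs.set(run.requestId, run);
    this.emit("runtime.request.started", run.requestId);
    return run;
  }

  subscribe(listener: RuntimeListener): () => void {
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  getRun(requestId: string): RunRecord | undefined {
    return this.runs.get(requestId);
  }

  cancel(requestId: string): boolean {
    const run = this.runs.get(requestId);
    if (!run || run.status !== "running") return false;
    run.status = "cancelled";
    this.emit("runtime.request.cancelled", requestId);
    return true;
  }

  private emit(type: string, requestId: string): void {
    this.listeners.forEach((listener) => listener({ type, requestId }));
  }
}

// A caller that only inspects runs depends on RuntimeInspector alone.
function isRunning(inspector: RuntimeInspector, requestId: string): boolean {
  return inspector.getRun(requestId)?.status === "running";
}
```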
54
+
55
+ ## Workflow Adapter Interface
56
+
57
+ The native product surface is still an `Agent` definition. Graph-capable
58
+ backends such as LangGraph can run an Agent whose normal inventory fields are
59
+ connected by optional `edges`; DeepAgents agent definitions remain unchanged.
60
+
61
+ Workflow adapters are separate from agent backend adapters. They receive an
62
+ explicit workflow request plus a validated workflow definition and decide how to
63
+ compile or execute it with an upstream workflow system.
64
+
65
+ Core runtime owns:
66
+
67
+ - resolving `workflowId`, `routeId`, and default workflow routing from typed config
68
+ - validating that the workflow exists in workspace inventory
69
+ - recording request lifecycle and normalized events
70
+ - exposing workflow inspection and plan surfaces
71
+
72
+ Workflow adapters own:
73
+
74
+ - compiling the workflow to LangGraph, Microsoft Agent Framework, or another backend
75
+ - preserving that backend's execution semantics
76
+ - mapping backend progress into stable runtime events
77
+ - returning stable runtime output and artifacts
78
+
79
+ The core runtime must not execute workflow nodes itself unless a future native
80
+ workflow adapter is intentionally added as a replaceable plugin.
81
+
82
+ ## LangGraph Workflow Adapter
83
+
84
+ `@stable-harness/adapter-langgraph` is the first graph-capable adapter. It can
85
+ compile either an explicit workflow definition or a `backend: langgraph` Agent
86
+ with appended `edges` into an upstream LangGraph `StateGraph`.
87
+
88
+ For the Agent path, nodes are derived from existing Agent inventory:
89
+
90
+ - `tools` become `tools.<id>` nodes
91
+ - `skills` become `skills.<id>` nodes
92
+ - `subagents` become `agents.<id>` nodes
93
+ - `edges` connect those node IDs
94
+
95
+ No DeepAgents field changes are required for this path.
96
+
97
+ The adapter intentionally does not define stable-owned agent or tool execution
98
+ semantics. Each workflow node must resolve to an injected node handler, keyed by
99
+ node ID or inventory reference such as `agents.orchestra` or `tools.shell`.
100
+ For generic inventory integration, callers can inject node resolvers keyed by
101
+ inventory kind such as `agents`, `tools`, `skills`, or `workflows`.
102
+
103
+ This keeps responsibilities separated:
104
+
105
+ - workflow YAML defines topology and inventory references
106
+ - LangGraph owns graph execution semantics
107
+ - node handlers decide how a referenced agent, tool, skill, or sub-workflow runs
108
+ - node resolvers provide reusable handling for inventory reference families
109
+ - core runtime owns request lifecycle, events, metadata, artifacts, and protocol access
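The handler lookup described above can be sketched as follows. The lookup order (node ID, then exact inventory reference, then a resolver for the inventory kind) is an assumption, as are the `NodeHandler`, `NodeResolver`, and `resolveNodeHandler` names; the adapter's actual option names are not shown in this document.

```ts
// Hypothetical types; graph state is simplified to a plain record.
type NodeHandler = (state: Record<string, unknown>) => Record<string, unknown>;
type NodeResolver = (ref: string) => NodeHandler | undefined;
type WorkflowNode = { id: string; ref?: string }; // ref such as "tools.shell"

function ownEntry<T>(table: Record<string, T>, key: string): T | undefined {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

function resolveNodeHandler(
  node: WorkflowNode,
  handlers: Record<string, NodeHandler>,
  resolvers: Record<string, NodeResolver>,
): NodeHandler {
  // 1. A handler injected for this exact node ID.
  const byId = ownEntry(handlers, node.id);
  if (byId) return byId;
  if (node.ref !== undefined) {
    // 2. A handler injected for the exact inventory reference.
    const byRef = ownEntry(handlers, node.ref);
    if (byRef) return byRef;
    // 3. A resolver injected for the inventory kind ("tools", "skills", ...).
    const dot = node.ref.indexOf(".");
    const kind = dot === -1 ? node.ref : node.ref.slice(0, dot);
    const resolved = ownEntry(resolvers, kind)?.(node.ref);
    if (resolved) return resolved;
  }
  // No stable-owned fallback: an unresolved node is an error.
  throw new Error(`no handler injected for node ${node.id}`);
}
```

Throwing on an unresolved node, rather than guessing a default, matches the rule that the adapter defines no execution semantics of its own.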
110
+
111
+ Conditional edges and cyclic graphs require explicit adapter plugins or options.
112
+ They must not be inferred from prompt text.
113
+
114
+ Conditional LangGraph edges are enabled through injected `conditionalRouters`.
115
+ The YAML `condition` value is a route label, not business logic. A router keyed
116
+ by source node ID reads typed runtime state and returns one of those labels.
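To make the label-versus-logic split concrete, here is a small sketch in plain TypeScript. It does not call LangGraph; `GraphState`, `ConditionalEdge`, and `nextNode` are hypothetical names, and the state shape is invented for the example.

```ts
// Hypothetical typed runtime state for this example only.
type GraphState = { approved: boolean };
// A router holds the business logic and returns a route label.
type ConditionalRouter = (state: GraphState) => string;
// The edge, as it would appear in YAML, carries only the label.
type ConditionalEdge = { from: string; to: string; condition: string };

function nextNode(
  edges: ConditionalEdge[],
  routers: Map<string, ConditionalRouter>,
  from: string,
  state: GraphState,
): string {
  const router = routers.get(from);
  if (!router) throw new Error(`no conditional router injected for ${from}`);
  const label = router(state);
  const edge = edges.find((e) => e.from === from && e.condition === label);
  if (!edge) throw new Error(`router for ${from} returned unknown label ${label}`);
  return edge.to;
}
```

The topology stays declarative: changing the approval rule means replacing the injected router, not editing the workflow definition.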
117
+
118
+ Sub-workflow nodes are also opt-in adapter behavior. A node that references
119
+ `workflows.<id>` is executed only when the LangGraph adapter is configured with
120
+ sub-workflow support. The adapter re-enters LangGraph with the referenced
121
+ workflow and enforces a depth limit. Core runtime still treats this as one
122
+ pluggable workflow adapter call; it does not interpret the child workflow's node
123
+ semantics.
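The depth limit on re-entry can be illustrated with a toy recursive runner. This is only a sketch of the idea: `WorkflowDef`, `runWorkflow`, and the `maxDepth` parameter are assumptions, and a real adapter would re-enter LangGraph rather than walk a list.

```ts
// Hypothetical minimal workflow: an ordered list of inventory references.
type WorkflowDef = { id: string; nodes: string[] };

const SUB_WORKFLOW_PREFIX = "workflows.";

// Walk a workflow, re-entering for `workflows.<id>` nodes, and fail once the
// nesting depth exceeds maxDepth. Returns the order in which leaf nodes ran.
function runWorkflow(
  workflows: Map<string, WorkflowDef>,
  id: string,
  maxDepth: number,
  depth: number = 0,
  trace: string[] = [],
): string[] {
  if (depth > maxDepth) {
    throw new Error(`sub-workflow depth limit ${maxDepth} exceeded at ${id}`);
  }
  const workflow = workflows.get(id);
  if (!workflow) throw new Error(`unknown workflow: ${id}`);
  for (const ref of workflow.nodes) {
    if (ref.startsWith(SUB_WORKFLOW_PREFIX)) {
      const childId = ref.slice(SUB_WORKFLOW_PREFIX.length);
      runWorkflow(workflows, childId, maxDepth, depth + 1, trace);
    } else {
      trace.push(`${id}:${ref}`);
    }
  }
  return trace;
}
```

Without the limit, a workflow that references itself would recurse until the stack overflows; with it, the failure is explicit and attributable to a workflow ID.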
124
+
125
+ The root runtime factory may assemble known workflow adapters from workflow YAML.
126
+ For example, a workflow with `adapter: langgraph` can be paired with injected
127
+ LangGraph node handlers. Unknown workflow adapter names are left for explicit
128
+ injection and must not block ordinary runtime startup.
129
+
130
+ Embedded callers may provide adapter factories keyed by adapter name. This is
131
+ the native extension point for customer-owned runtime or workflow backends; it
132
+ keeps workspace config generic while avoiding hardcoded backend aliases.
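A factory table keyed by adapter name might be assembled as sketched below. `WorkflowAdapter`, `AdapterFactory`, and `assembleAdapters` are hypothetical names; the point illustrated is that an unknown adapter name is collected for later injection instead of failing startup.

```ts
// Hypothetical adapter surface, reduced to one method for illustration.
type WorkflowAdapter = { name: string; run(workflowId: string): string };
type AdapterFactory = () => WorkflowAdapter;

function assembleAdapters(
  requested: string[],
  factories: Map<string, AdapterFactory>,
): { adapters: Map<string, WorkflowAdapter>; unresolved: string[] } {
  const adapters = new Map<string, WorkflowAdapter>();
  const unresolved: string[] = [];
  for (const adapterName of requested) {
    const factory = factories.get(adapterName);
    if (factory) {
      adapters.set(adapterName, factory());
    } else {
      // Left for explicit injection; startup continues.
      unresolved.push(adapterName);
    }
  }
  return { adapters, unresolved };
}
```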
133
+
134
+ ## Required Behavior
135
+
136
+ Adapters should:
137
+
138
+ - preserve upstream execution semantics
139
+ - pass through upstream-native config when possible
140
+ - expose upstream primitives through typed config rather than harness-owned replicas
141
+ - normalize upstream events into stable runtime events
142
+ - put backend-specific recovery hints in typed runtime config, not core code
143
+ - keep backend-specific details behind the adapter boundary
144
+ - use typed config and metadata for runtime decisions
145
+ - keep each stable wrapper capability independently enableable, disableable, replaceable, and testable
146
+
147
+ ## Forbidden Behavior
148
+
149
+ Adapters must not:
150
+
151
+ - route by matching user prose
152
+ - hardcode downstream business domains
153
+ - infer tools from TODO text
154
+ - synthesize tool calls the upstream model did not select
155
+ - locally replay upstream custom tool calls
156
+ - recreate an upstream framework's default stack when a native constructor exists
157
+ - add bundled behavior that cannot be enabled, disabled, or replaced independently
158
+ - invent a stable-owned concept when upstream passthrough or typed backend config is sufficient
159
+ - shape generic runtime interfaces around one backend's internal model
160
+
161
+ ## DeepAgents Direction
162
+
163
+ The DeepAgents adapter should use the upstream `createDeepAgent` path as the primary integration point.
164
+
165
+ DeepAgents-native features such as subagents, task tool behavior, filesystem middleware, skills, memory middleware, and sandbox primitives should be passed through or configured through upstream-native options.
166
+
167
+ The adapter must not duplicate any current DeepAgents feature. If a DeepAgents feature needs stable product treatment, add only a narrow optional runtime capability around it, such as events, approvals, persistence, replay, artifact capture, protocol access, or operator inspection.
168
+
169
+ DeepAgents-native memory passthrough belongs in the DeepAgents adapter package.
170
+ Core memory lifecycle may expose generic provider and maintenance contracts, but
171
+ it must not export DeepAgents-named helpers.
172
+
173
+ `stable-harness` should expose only the stable runtime layer around that execution:
174
+
175
+ - workspace loading
176
+ - request lifecycle
177
+ - approvals
178
+ - events
179
+ - traces
180
+ - recovery
181
+ - memory lifecycle
182
+ - protocol access
183
+
184
+ Each item should remain an independent capability with its own interface, config, tests, and replacement point.
185
+
186
+ ## Future Adapters
187
+
188
+ OpenAI Agents SDK, Gemini SDK, LangGraph, and customer-owned frameworks should use the same adapter contract.
189
+
190
+ If a backend requires a capability that does not fit this contract, first decide whether it is:
191
+
192
+ - backend execution semantics, which should stay inside the adapter
193
+ - product runtime semantics, which may extend the core contract
194
+ - downstream application logic, which belongs in the workspace
195
+
196
+ Microsoft Agent Framework should follow the same rule if added later: its typed
197
+ workflows, checkpoints, middleware, sessions, and human-in-the-loop primitives
198
+ should be passed through or wrapped for runtime lifecycle and observability, not
199
+ rebuilt as a second stable-owned workflow engine.
@@ -0,0 +1,41 @@
1
+ # Backend Comparison
2
+
3
+ `stable-harness` compares backend adapters through the stable runtime boundary,
4
+ not by reimplementing backend execution semantics.
5
+
6
+ The comparison test in `test/adapter/backend-comparison.test.ts` runs the same
7
+ workspace inventory through:
8
+
9
+ - `deepagents`: upstream agent loop receives stable tool gateway tools and skill
10
+ source paths.
11
+ - `langgraph`: graph nodes receive the same stable tool gateway and resolve
12
+ `skills.*` through the registry resolver.
13
+
14
+ The test validates:
15
+
16
+ - both backends can run from the same `WorkspaceAgent` inventory shape;
17
+ - both backends invoke the same stable tool gateway with the same arguments;
18
+ - each backend keeps its own runtime context and agent id;
19
+ - DeepAgents receives skill paths as upstream passthrough;
20
+ - LangGraph resolves skill metadata and `SKILL.md` content through the stable
21
+ registry resolver;
22
+ - LangGraph graph trace preserves node order.
23
+
24
+ Observed adapter difference:
25
+
26
+ - DeepAgents tool output is normalized through the upstream tool wrapper and is
27
+ stringified before reaching the mocked agent result.
28
+ - LangGraph node resolver output remains structured inside graph state.
29
+
30
+ This difference is backend-specific and should remain visible in adapter tests
31
+ instead of being hidden by core runtime behavior.
32
+
33
+ Run the deterministic benchmark:
34
+
35
+ ```bash
+ npm run benchmark:backend-comparison
+ ```
38
+
39
+ The benchmark emits JSON with per-backend success rate, tool argument match rate,
40
+ skill resolution rate, average duration, output shape, and LangGraph trace nodes.
41
+ Use `BACKEND_COMPARE_REPEAT=20` to increase the repeat count.
@@ -0,0 +1,263 @@
1
+ # Runtime Event Model
2
+
3
+ This document defines the stable-harness event model. The names listed here are
4
+ the current physical event schema; old event names are not kept as a compatible
5
+ surface.
6
+
7
+ Chapter structure:
8
+
9
+ - Top level: owner, starting with `agent`.
10
+ - Middle level: category, such as Signal, Fact, Envelope, or View.
11
+ - Lower level: concrete event group, such as `agent.tool.*` or `runtime.request.*`.
12
+
13
+ The `runtime.*` and `agent.*` names in this document are stable namespaces.
14
+ `event type` means a top-level runtime event; `payload phase` means
15
+ `runtime.adapter.event.event.phase`.
16
+
17
+ Source of truth:
18
+
19
+ - Top-level runtime events: `packages/core/src/runtime/events.ts`
20
+ - Trace projection: `packages/core/src/trace.ts`
21
+ - OpenAI-compatible SSE projection: `packages/protocols/src/openai-stream.ts`
22
+
23
+ ## 1. Owner: Agent Runtime / Backend Adapter
24
+
25
+ Agent owner means the upstream agent runtime or backend adapter, such as
26
+ DeepAgents, the LangGraph workflow adapter, or future OpenAI Agents SDK and
27
+ Gemini SDK adapters.
28
+
29
+ stable-harness does not own this layer's execution semantics. It observes,
30
+ records, persists, and projects these signals through the
31
+ `runtime.adapter.event` envelope.
32
+
33
+ ### 1.1 Category: Agent Signals
34
+
35
+ Agent signals are observable signals from upstream/backend execution.
36
+
37
+ #### 1.1.1 Event Group: `agent.lifecycle.*`
38
+
39
+ | Event | Payload phase | Owner | Common fields | Meaning |
40
+ | --- | --- | --- | --- | --- |
41
+ | `agent.handoff` | `agent.handoff` | backend adapter | `adapter`, `phase`, `modelRef`, `tools`, `subagents` | Adapter took over execution. |
42
+
43
+ #### 1.1.2 Event Group: `agent.output.*`
44
+
45
+ | Event | Payload phase | Owner | Common fields | Meaning |
46
+ | --- | --- | --- | --- | --- |
47
+ | `agent.output.delta` | `agent.output.delta` | upstream agent runtime | `adapter`, `phase`, `text` | Assistant stream delta. |
48
+
49
+ #### 1.1.3 Event Group: `agent.tool.*`
50
+
51
+ | Event | Payload phase | Owner | Common fields | Meaning |
52
+ | --- | --- | --- | --- | --- |
53
+ | `agent.tool.start` | `agent.tool.start` | upstream / adapter | `adapter`, `phase`, `toolId`, `args` | Upstream/adapter tool call started. |
54
+ | `agent.tool.result` | `agent.tool.result` | upstream / adapter | `adapter`, `phase`, `toolId`, `output`, `error` | Upstream/adapter tool call completed or failed. |
55
+
56
+ #### 1.1.4 Event Group: `agent.workflow.*`
57
+
58
+ | Event | Payload phase | Owner | Common fields | Meaning |
59
+ | --- | --- | --- | --- | --- |
60
+ | `agent.langgraph.invoke` | `agent.langgraph.invoke` | upstream workflow runtime | `adapter`, `phase`, `workflowId` | LangGraph workflow invocation started. |
61
+ | `agent.node.completed` | `agent.node.completed` | upstream workflow runtime | `adapter`, `phase`, `workflowId`, `nodeId` | Workflow node completed. |
62
+
63
+ ## 2. Owner: Stable Harness Runtime / Control Plane
64
+
65
+ Stable Harness runtime/control-plane owner means facts, runtime contracts,
66
+ memory lifecycle, artifacts, tool gateway events, and adapter envelopes owned by
67
+ stable-harness itself.
68
+
69
+ These events are the source of truth for store, audit, replay, and tests.
70
+
71
+ ### 2.1 Category: Runtime Facts
72
+
73
+ Runtime facts are stable, typed, auditable facts stored on the run record.
74
+
75
+ #### 2.1.1 Event Group: `runtime.request.*`
76
+
77
+ | Event | Event type | Required fields | Optional fields | Meaning |
78
+ | --- | --- | --- | --- | --- |
79
+ | `runtime.request.started` | `runtime.request.started` | `requestId`, `sessionId`, `agentId` | | Request started. |
80
+ | `runtime.request.completed` | `runtime.request.completed` | `requestId`, `sessionId`, `agentId`, `output` | | Request completed successfully. |
81
+ | `runtime.request.failed` | `runtime.request.failed` | `requestId`, `sessionId`, `agentId`, `error` | | Request failed. |
82
+ | `runtime.request.cancelled` | `runtime.request.cancelled` | `requestId`, `sessionId`, `agentId` | `reason` | Request was cancelled. |
83
+
84
+ #### 2.1.2 Event Group: `runtime.execution.*`
85
+
86
+ | Event | Event type | Required fields | Optional fields | Meaning |
87
+ | --- | --- | --- | --- | --- |
88
+ | `runtime.execution.contract.failed` | `runtime.execution.contract.failed` | `requestId`, `sessionId`, `agentId`, `reason` | `missingEvidenceTools` | Execution evidence contract failed. |
89
+
90
+ #### 2.1.3 Event Group: `runtime.tool.direct.*`
91
+
92
+ These events only describe stable-harness direct tool requests executed through
93
+ the runtime tool gateway. They do not mean the agent/upstream selected a tool
94
+ during execution; agent-internal tool calls belong to `agent.tool.*` signals.
95
+
96
+ | Event | Event type | Required fields | Optional fields | Meaning |
97
+ | --- | --- | --- | --- | --- |
98
+ | `runtime.tool.direct.started` | `runtime.tool.direct.started` | `requestId`, `sessionId`, `agentId`, `toolId` | | Direct tool request started. |
99
+ | `runtime.tool.direct.completed` | `runtime.tool.direct.completed` | `requestId`, `sessionId`, `agentId`, `toolId`, `output` | | Direct tool request completed. |
100
+
101
+ #### 2.1.4 Event Group: `runtime.workflow.*`
102
+
103
+ These events are owned by the stable-harness workflow runtime. They are emitted
104
+ as top-level `runtime.workflow.*` facts, not adapter payloads; adapter payloads
105
+ only carry upstream/backend-owned workflow signals such as
106
+ `agent.langgraph.invoke`.
107
+
108
+ | Event | Event type | Required fields | Optional fields | Meaning |
109
+ | --- | --- | --- | --- | --- |
110
+ | `runtime.workflow.started` | `runtime.workflow.started` | `requestId`, `sessionId`, `agentId`, `workflowId`, `adapter` | | stable-harness workflow execution started. |
111
+ | `runtime.workflow.completed` | `runtime.workflow.completed` | `requestId`, `sessionId`, `agentId`, `workflowId`, `adapter` | | stable-harness workflow execution completed. |
112
+
113
+ #### 2.1.5 Event Group: `runtime.artifact.*`
114
+
115
+ | Event | Event type | Required fields | Optional fields | Meaning |
116
+ | --- | --- | --- | --- | --- |
117
+ | `runtime.artifact.created` | `runtime.artifact.created` | `requestId`, `sessionId`, `agentId`, `artifact` | | Artifact was created. |
118
+
119
+ #### 2.1.6 Event Group: `runtime.specDriven.*`
120
+
121
+ These events are owned by the stable-harness spec-driven workflow capability.
122
+ They record control-plane phase facts and artifacts; they do not replace backend
123
+ agent execution semantics.
124
+
125
+ | Event | Event type | Required fields | Optional fields | Meaning |
126
+ | --- | --- | --- | --- | --- |
127
+ | `runtime.specDriven.phase.started` | `runtime.specDriven.phase.started` | `requestId`, `sessionId`, `agentId`, `phaseId` | `workflowId` | Spec-driven phase started. |
128
+ | `runtime.specDriven.phase.blocked` | `runtime.specDriven.phase.blocked` | `requestId`, `sessionId`, `agentId`, `phaseId`, `reason` | `workflowId` | Spec-driven phase was blocked by a gate or policy. |
129
+ | `runtime.specDriven.phase.completed` | `runtime.specDriven.phase.completed` | `requestId`, `sessionId`, `agentId`, `phaseId` | `workflowId`, `artifact` | Spec-driven phase completed. |
130
+ | `runtime.specDriven.phase.verified` | `runtime.specDriven.phase.verified` | `requestId`, `sessionId`, `agentId`, `phaseId` | `workflowId`, `artifact` | Spec-driven phase was verified. |
131
+
132
+ #### 2.1.7 Event Group: `runtime.skill.*`
133
+
134
+ | Event | Event type | Required fields | Optional fields | Meaning |
135
+ | --- | --- | --- | --- | --- |
136
+ | `runtime.skill.candidate.created` | `runtime.skill.candidate.created` | `requestId`, `sessionId`, `agentId`, `candidateId`, `name`, `confidence`, `evidenceCount`, `status` | `proposedPath` | Skill candidate was discovered. |
137
+
138
+ ### 2.2 Category: Runtime Memory Facts
139
+
140
+ Memory is a stable-harness runtime/control-plane capability, so it belongs under
141
+ `runtime.memory.*`.
142
+
143
+ #### 2.2.1 Event Group: `runtime.memory.lifecycle`
144
+
145
+ | Event | Event type | Required fields | Optional fields | Meaning |
146
+ | --- | --- | --- | --- | --- |
147
+ | `runtime.memory.lifecycle` | `runtime.memory.lifecycle` | `requestId`, `sessionId`, `agentId`, `hook` | | Memory lifecycle hook ran. |
148
+
149
+ Current hooks:
150
+
151
+ | Hook | Meaning |
152
+ | --- | --- |
153
+ | `read-before-plan` | Read memory before planning. |
154
+ | `read-before-finalize` | Read memory before final output finalization. |
155
+ | `write-after-run` | Write memory after the run. |
156
+
157
+ #### 2.2.2 Event Group: `runtime.memory.recall.*`
158
+
159
+ | Event | Event type | Required fields | Optional fields | Meaning |
160
+ | --- | --- | --- | --- | --- |
161
+ | `runtime.memory.recall.completed` | `runtime.memory.recall.completed` | `requestId`, `sessionId`, `agentId`, `namespace`, `recordIds`, `context` | | Memory recall completed. |
162
+
163
+ #### 2.2.3 Event Group: `runtime.memory.write.*`
164
+
165
+ | Event | Event type | Required fields | Optional fields | Meaning |
166
+ | --- | --- | --- | --- | --- |
167
+ | `runtime.memory.candidate.submitted` | `runtime.memory.candidate.submitted` | `requestId`, `sessionId`, `agentId`, `candidate`, `decision` | `record` | Memory write candidate was submitted. |
168
+ | `runtime.memory.approval.requested` | `runtime.memory.approval.requested` | `requestId`, `sessionId`, `agentId`, `approval` | | Memory operation requested approval. |
169
+
170
+ #### 2.2.4 Event Group: `runtime.memory.plugin.*`
171
+
172
+ | Event | Event type | Required fields | Optional fields | Meaning |
173
+ | --- | --- | --- | --- | --- |
174
+ | `runtime.memory.plugin.started` | `runtime.memory.plugin.started` | `requestId`, `sessionId`, `agentId`, `memoryId`, `provider`, `namespace` | | Memory plugin started. |
175
+ | `runtime.memory.plugin.completed` | `runtime.memory.plugin.completed` | `requestId`, `sessionId`, `agentId`, `memoryId`, `provider`, `namespace`, `candidateCount` | | Memory plugin completed. |
176
+ | `runtime.memory.plugin.failed` | `runtime.memory.plugin.failed` | `requestId`, `sessionId`, `agentId`, `memoryId`, `provider`, `namespace`, `error` | | Memory plugin failed. |
177
+
178
+ #### 2.2.5 Event Group: `runtime.memory.maintenance.*`
179
+
180
+ | Event | Event type | Required fields | Optional fields | Meaning |
181
+ | --- | --- | --- | --- | --- |
182
+ | `runtime.memory.maintenance.started` | `runtime.memory.maintenance.started` | `requestId`, `sessionId`, `agentId`, `target` | | Memory maintenance started. |
183
+ | `runtime.memory.maintenance.completed` | `runtime.memory.maintenance.completed` | `requestId`, `sessionId`, `agentId`, `target`, `operationCount` | | Memory maintenance completed. |
184
+ | `runtime.memory.maintenance.failed` | `runtime.memory.maintenance.failed` | `requestId`, `sessionId`, `agentId`, `target`, `error` | | Memory maintenance failed. |
185
+
186
+ ### 2.3 Category: Runtime Envelope
187
+
188
+ Envelope is the stable container owned by stable-harness. The nested payload is
189
+ owned by the agent/backend.
190
+
191
+ #### 2.3.1 Event Group: `runtime.adapter.*`
192
+
193
+ | Event | Event type | Required fields | Optional fields | Meaning |
194
+ | --- | --- | --- | --- | --- |
195
+ | `runtime.adapter.event` | `runtime.adapter.event` | `requestId`, `sessionId`, `agentId`, `event` | | Stable envelope for backend/agent signals. |
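The relationship between the envelope and its nested payload can be sketched as types. Field names come from the tables in this document; `wrapAgentSignal` and `payloadPhase` are hypothetical helpers, and the source of truth remains `packages/core/src/runtime/events.ts`.

```ts
// Backend-owned payload: only `adapter` and `phase` are common to every signal.
type AgentSignal = { adapter: string; phase: string; [field: string]: unknown };

type RequestContext = { requestId: string; sessionId: string; agentId: string };

// Stable-owned container around the backend-owned payload.
type AdapterEnvelope = RequestContext & {
  type: "runtime.adapter.event";
  event: AgentSignal;
};

function wrapAgentSignal(ctx: RequestContext, signal: AgentSignal): AdapterEnvelope {
  return { type: "runtime.adapter.event", ...ctx, event: signal };
}

// "Payload phase" is the nested event's phase, not the top-level event type.
function payloadPhase(envelope: AdapterEnvelope): string {
  return envelope.event.phase;
}
```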
196
+
197
+ ## 3. Owner: Stable Harness Views / Protocol
198
+
199
+ Views/protocol owner means the presentation, projection, transport, and
200
+ narration layers stable-harness derives from facts and signals. These layers do
201
+ not replace original facts or create execution semantics.
202
+
203
+ ### 3.1 Category: Runtime Trace Views
204
+
205
+ #### 3.1.1 Event Group: `runtime.trace.*`
206
+
207
+ | Logical view | Current implementation | Source | Purpose |
208
+ | --- | --- | --- | --- |
209
+ | `runtime.trace.request` | Trace category `request` | `runtime.request.*`, `runtime.execution.contract.failed` | Request timeline. |
210
+ | `runtime.trace.tool.direct` | Trace category `tool` | `runtime.tool.direct.started`, `runtime.tool.direct.completed` | Direct/gateway tool timeline. |
211
+ | `runtime.trace.agent.tool` | Trace category `adapter` | `runtime.adapter.event` payload `agent.tool.start` / `agent.tool.result` | Agent/upstream tool timeline. |
212
+ | `runtime.trace.adapter` | Trace category `adapter` | `runtime.adapter.event` | Backend signal timeline. |
213
+ | `runtime.trace.memory` | Trace category `memory` | `runtime.memory.*` | Memory timeline. |
214
+ | `runtime.trace.artifact` | Trace category `artifact` | `runtime.artifact.created` | Artifact timeline. |
215
+ | `runtime.trace.spec` | Trace category `spec` | `runtime.specDriven.phase.*` | Spec-driven phase timeline. |
216
+ | `runtime.trace.plan` | Trace category `plan` | `runtime.adapter.event.traceType: "plan"` | Plan/TODO presentation. |
217
+ | `runtime.trace.delegation` | Trace category `delegation` | `runtime.adapter.event.traceType: "delegation"` | Delegation presentation. |
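The trace table reads as a projection from stored event types to trace categories. A minimal sketch of that mapping follows, assuming `traceType` sits on the `runtime.adapter.event` envelope (one reading of the table) and using a hypothetical `traceCategory` function; the actual projection lives in `packages/core/src/trace.ts`.

```ts
// Assumed minimal event shape for the projection.
type StoredEvent = { type: string; traceType?: "plan" | "delegation" };

type TraceCategory =
  | "request" | "tool" | "adapter" | "memory"
  | "artifact" | "spec" | "plan" | "delegation";

// Views are derived from facts and signals; the source event is never changed.
function traceCategory(stored: StoredEvent): TraceCategory | undefined {
  const t = stored.type;
  if (t === "runtime.adapter.event") return stored.traceType ?? "adapter";
  if (t.startsWith("runtime.request.") || t === "runtime.execution.contract.failed") {
    return "request";
  }
  if (t.startsWith("runtime.tool.direct.")) return "tool";
  if (t.startsWith("runtime.memory.")) return "memory";
  if (t === "runtime.artifact.created") return "artifact";
  if (t.startsWith("runtime.specDriven.phase.")) return "spec";
  return undefined; // not part of any trace view in the table
}
```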
218
+
219
+ ### 3.2 Category: Runtime Stream Views
220
+
221
+ #### 3.2.1 Event Group: `runtime.stream.*`
222
+
223
+ | Logical view | Current implementation | Source | Purpose |
224
+ | --- | --- | --- | --- |
225
+ | `runtime.stream.tool.progress` | SSE `stable_harness.tool.progress` | `runtime.tool.direct.*` or `agent.tool.*` signal | OpenAI-compatible tool progress stream. |
226
+ | `runtime.stream.progress.narration` | SSE `stable_harness.progress.narration` | `runtime.progress.narration` | OpenAI-compatible narration stream. |
227
+
228
+ ### 3.3 Category: Runtime Progress Views
229
+
230
+ #### 3.3.1 Event Group: `runtime.progress.*`
231
+
232
+ | Logical event | Current implementation | Required fields | Optional fields | Meaning |
233
+ | --- | --- | --- | --- | --- |
234
+ | `runtime.progress.narration` | Runtime event | `requestId`, `sessionId`, `agentId`, `message`, `provider`, `sourceEventTypes` | `sourceEventIds`, `model`, `style` | Human-readable progress generated from runtime facts and agent signals. |
235
+
236
+ Constraints:
237
+
238
+ - The narrator only consumes events. It must not mutate execution, runtime
239
+ state, tool results, approvals, or memory decisions.
240
+ - Narrator output must be traceable to source events.
241
+ - The narrator provider is a pluggable runtime view provider with sync or async
242
+ implementations; built-in `template` narration is available.
243
+ - Enable narration with `createStableHarnessRuntime({ progressNarration })` or
244
+ workspace `runtime.progress.narration.enabled`.
245
+ - CLI presentation policy is a separate runtime setting: `runtime.cli.events`.
246
+ It only controls CLI display, not whether runtime events are produced.
247
+ - The CLI default only shows `runtime.progress.narration`; use `include: ["*"]`
248
+ to show every event.
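The first two constraints suggest a narrator shaped like a pure function from a source event to a narration event. The sketch below assumes that shape: field names follow the `runtime.progress.narration` table, while `narrateEvent`, the message templates, and the `NarrationSource` type are illustrative and not the built-in `template` provider.

```ts
// Assumed read-only view of a source event.
type NarrationSource = {
  id: string;
  type: string;
  requestId: string;
  sessionId: string;
  agentId: string;
  toolId?: string;
};

type ProgressNarration = {
  type: "runtime.progress.narration";
  requestId: string;
  sessionId: string;
  agentId: string;
  message: string;
  provider: string;
  sourceEventTypes: string[];
  sourceEventIds?: string[];
};

// Reads the source event and returns a new event; nothing is mutated.
function narrateEvent(source: Readonly<NarrationSource>): ProgressNarration | undefined {
  let message: string | undefined;
  if (source.type === "runtime.request.started") {
    message = "Request started.";
  } else if (source.type === "runtime.tool.direct.started" && source.toolId) {
    message = `Running tool ${source.toolId}.`;
  } else if (source.type === "runtime.request.completed") {
    message = "Request completed.";
  }
  if (message === undefined) return undefined; // nothing to narrate
  return {
    type: "runtime.progress.narration",
    requestId: source.requestId,
    sessionId: source.sessionId,
    agentId: source.agentId,
    message,
    provider: "template",
    // Traceability: every narration names the events it was derived from.
    sourceEventTypes: [source.type],
    sourceEventIds: [source.id],
  };
}
```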
249
+
250
+ Workspace YAML:
251
+
252
+ ```yaml
+ spec:
+   progress:
+     narration:
+       enabled: true
+       style: concise
+   cli:
+     events:
+       include:
+         - runtime.progress.narration
+         - runtime.tool.direct.*
+ ```
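How `include` patterns select events for CLI display could look like the sketch below. The matching rules are an assumption inferred from the examples in this document (`"*"` for everything, a trailing `.*` for a namespace prefix, otherwise an exact match); `matchesPattern` and `shouldDisplay` are hypothetical names.

```ts
// Assumed pattern semantics: "*" matches all, "a.b.*" matches the "a.b."
// prefix, anything else must match exactly.
function matchesPattern(pattern: string, eventType: string): boolean {
  if (pattern === "*") return true;
  if (pattern.endsWith(".*")) {
    // Keep the trailing dot so "runtime.tool.direct.*" does not match
    // "runtime.tool.directory".
    return eventType.startsWith(pattern.slice(0, -1));
  }
  return pattern === eventType;
}

// Display filter only: events are produced and recorded regardless.
function shouldDisplay(include: string[], eventType: string): boolean {
  return include.some((pattern) => matchesPattern(pattern, eventType));
}
```

With the `include` list from the YAML above, narration and direct tool events would be shown, while `runtime.request.started` would still be recorded but not printed.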