stable-harness 0.0.7 → 0.0.9

Files changed (39)

package/README.md +10 -0
package/docs/0.1.0-p0-runtime-control-plane-plan.zh.md +171 -0
package/docs/0.1.0-retry-policy.zh.md +87 -0
package/docs/0.1.0-stable-runtime-development-roadmap.zh.md +393 -0
package/docs/0.1.0-tool-guard-benchmark.zh.md +42 -0
package/docs/adapter-contract.md +199 -0
package/docs/architecture/backend-comparison.md +41 -0
package/docs/architecture/runtime-events.md +263 -0
package/docs/architecture/runtime-events.zh.md +248 -0
package/docs/architecture/system-architecture.zh.md +435 -0
package/docs/compatibility-matrix.md +139 -0
package/docs/engineering-rules.md +111 -0
package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +1632 -0
package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1952 -0
package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1427 -0
package/docs/granite-tool-calling-comparison.zh.md +206 -0
package/docs/guides/getting-started.md +126 -0
package/docs/guides/index.md +40 -0
package/docs/guides/integration-guide.md +126 -0
package/docs/guides/operator-runbook.md +153 -0
package/docs/guides/workspace-authoring.md +212 -0
package/docs/implementation-blueprint.md +233 -0
package/docs/memory/0.1.0-memory-design.zh.md +719 -0
package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +146 -0
package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +169 -0
package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +123 -0
package/docs/memory/0.1.0-step-09-memory-contract.zh.md +169 -0
package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +143 -0
package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +150 -0
package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +118 -0
package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +118 -0
package/docs/product/adoption-playbook.md +145 -0
package/docs/product/market-positioning.md +137 -0
package/docs/product-boundary.md +258 -0
package/docs/protocols/http-runtime.md +37 -0
package/docs/protocols/langgraph-compatible.md +107 -0
package/docs/protocols/openai-compatible.md +121 -0
package/docs/tooling/0.1.0-bettercall-tool-quality.zh.md +231 -0
package/package.json +3 -1

package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md ADDED

@@ -0,0 +1,143 @@
+# 0.1.0 Step 09.4 Memory Governance And Approval
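+## Goal
+This step wires the memory policy's `review` decision into the governance approval queue. Models and upstream agents can still only propose memory candidates; the runtime policy decides whether a candidate is stored, rejected, or sent for approval.
+## New Interfaces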
+```text
+@stable-harness/governance
+├─ ApprovalRequest
+│  ├─ id
+│  ├─ kind: tool_invocation | memory_write
+│  ├─ reason
+│  ├─ status: pending | approved | rejected
+│  ├─ requestId / sessionId / agentId
+│  ├─ subject
+│  ├─ createdAt
+│  └─ resolvedAt
+│
+└─ ApprovalQueue
+   ├─ create(input)
+   ├─ list(status?)
+   └─ resolve(id, status)
+```
+The core runtime adds:
+```text
+createStableHarnessRuntime
+└─ approvals?: ApprovalQueue
+RuntimeEvent
+└─ runtime.memory.approval.requested
+```
+## Behavior
+When `RuntimeMemoryStore.submitCandidate()` returns:
+```text
+decision.action === "review"
+```
+and the runtime is configured with an `approvals` queue, the core runtime creates:
+```text
+ApprovalRequest.kind = memory_write
+ApprovalRequest.status = pending
+ApprovalRequest.subject = { candidate, decision }
+```
+It then emits:
+```text
+runtime.memory.approval.requested
+```
+## Sequence Diagram
+```mermaid
+sequenceDiagram
+  participant Runtime
+  participant Memory as RuntimeMemoryStore
+  participant Policy as MemoryPolicy
+  participant Approvals as ApprovalQueue
+  Runtime->>Memory: submitCandidate(candidate)
+  Memory->>Policy: decide(candidate)
+  Policy-->>Memory: decision(review)
+  Memory-->>Runtime: candidate + decision
+  Runtime->>Runtime: emit runtime.memory.candidate.submitted
+  Runtime->>Approvals: create(memory_write)
+  Approvals-->>Runtime: ApprovalRequest(pending)
+  Runtime->>Runtime: emit runtime.memory.approval.requested
+```
+## Flow Chart
+```mermaid
+flowchart TD
+  A["MemoryCandidate"] --> B["MemoryPolicy decision"]
+  B --> C{"decision.action"}
+  C -->|store| D["Create MemoryRecord"]
+  C -->|reject| E["Audit rejection"]
+  C -->|review| F{"ApprovalQueue configured?"}
+  F -->|no| G["Emit candidate submitted only"]
+  F -->|yes| H["Create pending memory_write approval"]
+  H --> I["Emit runtime.memory.approval.requested"]
+```
+## Boundaries
+Out of scope:
+- No automatic approval of memory.
+- No writes to sensitive/restricted durable memory before approval is granted.
+- The approval queue does not become a tool execution framework.
+- No downstream business rules are added.
+## Verification
+This step adds test coverage for:
+- A restricted memory candidate enters `review`.
+- The runtime creates a `memory_write` approval request.
+- The approval request status is `pending`.
+- The runtime emits the `runtime.memory.approval.requested` event.
+Verification commands:
+```text
+npm run check
+npm run check:rules
+npm test
+```
+## Downstream Verification
+This step has been verified against a real EasyNet run:
+```text
+cd /Users/boqiangliang/project/easynet
+npm test
+npm run test:botbotgo:full
+```
+Results of the real run:
+- `npm test` passed: 18 contract tests, 7 real integration tests.
+- `test:botbotgo:full` passed: 8/8 matrix cases.
+- The full matrix explicitly uses EasyNet's package-local `node_modules/.bin/botbotgo`.
+- The real model path is the remote Ollama `granite4.1:3b` configured in EasyNet.
+- Owners covered: orchestra, software, qa, ops, release, research, secretary, k8s.
+- Real tool/data paths covered: finance stock report, source analysis, disk investigation, Git/GitHub Actions, Kubernetes readonly investigation, CLI routing.
+## Next Step
+The next step is `Step 09.5 Memory Persistence Boundary`:
+- Define a durable store adapter interface.
+- Keep the in-memory store as the test implementation.
+- Do not treat the vector index as the source of truth.
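
The review path described in this file's Behavior section and flow chart can be sketched in TypeScript as follows. This is a minimal illustration, not the package's implementation: the type shapes are inferred from the interface tree above, and `createInMemoryApprovalQueue`, `handleMemoryDecision`, the synchronous signatures, the id format, and the rule that only pending requests can be resolved are all assumptions made for the example.

```typescript
// Hypothetical shapes modeled on the interface tree in this file.
type ApprovalKind = "tool_invocation" | "memory_write";
type ApprovalStatus = "pending" | "approved" | "rejected";

interface ApprovalRequest {
  id: string;
  kind: ApprovalKind;
  reason: string;
  status: ApprovalStatus;
  subject: unknown;
  createdAt: string;
  resolvedAt?: string;
}

interface ApprovalQueue {
  create(input: { kind: ApprovalKind; reason: string; subject: unknown }): ApprovalRequest;
  list(status?: ApprovalStatus): ApprovalRequest[];
  resolve(id: string, status: "approved" | "rejected"): ApprovalRequest | undefined;
}

// Assumed in-memory queue, for illustration only.
function createInMemoryApprovalQueue(): ApprovalQueue {
  const requests: ApprovalRequest[] = [];
  return {
    create(input) {
      const request: ApprovalRequest = {
        id: `approval-${requests.length + 1}`,
        kind: input.kind,
        reason: input.reason,
        subject: input.subject,
        status: "pending", // new requests always start as pending
        createdAt: new Date().toISOString(),
      };
      requests.push(request);
      return request;
    },
    list(status) {
      return status ? requests.filter((r) => r.status === status) : [...requests];
    },
    resolve(id, status) {
      const request = requests.find((r) => r.id === id);
      // Assumption: only pending requests can be resolved.
      if (!request || request.status !== "pending") return undefined;
      request.status = status;
      request.resolvedAt = new Date().toISOString();
      return request;
    },
  };
}

interface MemoryDecision {
  action: "store" | "reject" | "review";
  reason: string;
}

// Mirrors the flow chart: only a `review` decision combined with a
// configured queue creates an approval and emits the approval event.
function handleMemoryDecision(
  candidate: unknown,
  decision: MemoryDecision,
  approvals?: ApprovalQueue,
): string[] {
  const emitted = ["runtime.memory.candidate.submitted"];
  if (decision.action === "review" && approvals) {
    approvals.create({
      kind: "memory_write",
      reason: decision.reason,
      subject: { candidate, decision },
    });
    emitted.push("runtime.memory.approval.requested");
  }
  return emitted;
}

const approvalDemoQueue = createInMemoryApprovalQueue();
const approvalDemoEvents = handleMemoryDecision(
  { content: "restricted note" },
  { action: "review", reason: "restricted" },
  approvalDemoQueue,
);
const approvalDemoNoQueue = handleMemoryDecision(
  { content: "restricted note" },
  { action: "review", reason: "restricted" },
);
```

Without a queue, the sketch emits only the candidate-submitted event, matching the "ApprovalQueue configured? no" branch of the flow chart.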

package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md ADDED

@@ -0,0 +1,150 @@
+# 0.1.0 Step 09.2 Memory Lifecycle Hooks
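+## Goal
+This step connects long-term memory, previously a standalone store contract, to the core runtime lifecycle. The integration is optional, typed, and explicit, and it does not change the execution semantics of DeepAgents or any other upstream backend.
+Core principles:
+- When no `memory` store is passed in, runtime behavior and the event sequence stay unchanged.
+- When a `memory` store is present, the runtime can perform recall and candidate submission.
+- Recall results are emitted only as runtime events and are not automatically injected into the adapter prompt.
+- Durable memory writes must come from explicit `request.memory.candidates`; the runtime does not guess what to remember from model output.
+- Policy decisions remain the responsibility of `@stable-harness/memory`.
+## New Interfaces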
+```text
+createStableHarnessRuntime
+└─ memory?: RuntimeMemoryStore
+RuntimeRequest
+└─ memory?: RuntimeRequestMemory
+   ├─ namespace?
+   ├─ recall?: false | { query?, limit? }
+   └─ candidates?: RuntimeMemoryCandidateInput[]
+RuntimeEvent
+├─ runtime.memory.lifecycle
+├─ runtime.memory.recall.completed
+└─ runtime.memory.candidate.submitted
+```
+## Lifecycle Hooks
+Three hooks are currently implemented:
+```text
+read-before-plan
+read-before-finalize
+write-after-run
+```
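+Their semantics are:
+- `read-before-plan`: the runtime performs recall before it runs the adapter.
+- `read-before-finalize`: a finalize lifecycle event is emitted after the adapter returns and before the request completes.
+- `write-after-run`: candidates are submitted only when the request explicitly carries memory candidates.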
+## Sequence Diagram
+```mermaid
+sequenceDiagram
+  participant Client
+  participant Runtime
+  participant Memory as RuntimeMemoryStore
+  participant Adapter as Backend Adapter
+  Client->>Runtime: request(input, memory config)
+  Runtime->>Runtime: emit runtime.request.started
+  Runtime->>Runtime: emit runtime.memory.lifecycle(read-before-plan)
+  Runtime->>Memory: recall(namespace, query)
+  Memory-->>Runtime: records + compressed context
+  Runtime->>Runtime: emit runtime.memory.recall.completed
+  Runtime->>Adapter: run(request)
+  Adapter-->>Runtime: output
+  Runtime->>Runtime: emit runtime.memory.lifecycle(read-before-finalize)
+  alt explicit candidates provided
+    Runtime->>Runtime: emit runtime.memory.lifecycle(write-after-run)
+    Runtime->>Memory: submitCandidate(candidate)
+    Memory-->>Runtime: decision + optional record
+    Runtime->>Runtime: emit runtime.memory.candidate.submitted
+  end
+  Runtime->>Runtime: emit runtime.request.completed
+  Runtime-->>Client: response
+```
+## Flow Chart
+```mermaid
+flowchart TD
+  A["Runtime request"] --> B{"memory store configured?"}
+  B -->|no| C["Run adapter normally"]
+  B -->|yes| D["Emit read-before-plan"]
+  D --> E{"request.memory.recall is false?"}
+  E -->|yes| C
+  E -->|no| F["Recall by namespace and query"]
+  F --> G["Emit runtime.memory.recall.completed"]
+  G --> C
+  C --> H["Adapter output"]
+  H --> I["Emit read-before-finalize"]
+  I --> J{"explicit candidates?"}
+  J -->|no| K["runtime.request.completed"]
+  J -->|yes| L["Emit write-after-run"]
+  L --> M["submitCandidate through policy"]
+  M --> N["Emit runtime.memory.candidate.submitted"]
+  N --> K
+```
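+## Why Recall Is Not Auto-Injected Into the Adapter
+Automatically injecting recall context into the prompt would make `stable-harness` start participating in agent execution semantics. That responsibility belongs to the upstream backend adapter, such as DeepAgents, the OpenAI Agents SDK, or the Gemini SDK.
+The correct boundary is:
+- The core runtime owns lifecycle, events, policy, store, and audit.
+- The adapter translates the runtime memory projection into input the upstream framework accepts.
+- Workspace config or request metadata explicitly decides whether projection is enabled.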
+## Verification
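+This step adds test coverage for:
+- Lifecycle events are emitted when a memory store is present.
+- The recall event includes record ids and the compressed context.
+- Adapter input and output are not rewritten by memory recall.
+- A sensitive candidate enters `review` and does not directly produce a record.
+Verification commands for this step: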
+```text
+npm run check
+npm run check:rules
+npm test
+```
+Downstream EasyNet has been verified against a real run:
+```text
+cd /Users/boqiangliang/project/easynet
+npm test
+npm run test:botbotgo:full
+```
+Results of the real run:
+- `npm test` passed: 18 contract tests, 7 real integration tests.
+- `test:botbotgo:full` passed: 8/8 matrix cases.
+- The full matrix explicitly uses EasyNet's package-local `node_modules/.bin/botbotgo`.
+- The real model path is the remote Ollama `granite4.1:3b` configured in EasyNet.
+- Owners covered: orchestra, software, qa, ops, release, research, secretary, k8s.
+- Real tool/data paths covered: finance stock report, source analysis, disk investigation, Git/GitHub Actions, Kubernetes readonly investigation, CLI routing.
+## Next Step
+The next step is `Step 09.3 Memory Adapter Projection`:
+- The DeepAgents adapter reads the runtime memory projection.
+- Projection must be explicitly enabled through typed config.
+- It can be projected as DeepAgents `memory?: string[]` or as middleware input.
+- DeepAgents `/memories/` is not exposed as a stable-harness public API.
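
The hook ordering in this file's sequence diagram can be sketched in TypeScript as below. It is a simplified, synchronous illustration under stated assumptions: `runRequestSketch`, the `MemoryStoreSketch` and `RequestSketch` shapes, the `"default"` namespace fallback, falling back to the request input as the recall query, and one submitted event per candidate are all hypothetical. Where the flow chart and the core principles differ for the no-store case, the sketch follows the principle that the event sequence stays unchanged.

```typescript
// Hypothetical, simplified shapes; the real interfaces may differ.
interface MemoryStoreSketch {
  recall(namespace: string, query: string): string[];
  submitCandidate(candidate: string): "store" | "reject" | "review";
}

interface RequestSketch {
  input: string;
  memory?: {
    namespace?: string;
    recall?: false | { query?: string; limit?: number };
    candidates?: string[];
  };
}

function runRequestSketch(
  request: RequestSketch,
  adapter: (input: string) => string,
  memory?: MemoryStoreSketch,
): { output: string; events: string[] } {
  const events = ["runtime.request.started"];

  if (memory) {
    events.push("runtime.memory.lifecycle(read-before-plan)");
    const recall = request.memory?.recall;
    if (recall !== false) {
      // Assumed defaults: "default" namespace, request input as the query.
      memory.recall(request.memory?.namespace ?? "default", recall?.query ?? request.input);
      events.push("runtime.memory.recall.completed");
    }
  }

  // Recall results are never injected: the adapter sees the original input.
  const output = adapter(request.input);

  if (memory) {
    events.push("runtime.memory.lifecycle(read-before-finalize)");
    const candidates = request.memory?.candidates ?? [];
    if (candidates.length > 0) {
      events.push("runtime.memory.lifecycle(write-after-run)");
      for (const candidate of candidates) {
        memory.submitCandidate(candidate);
        events.push("runtime.memory.candidate.submitted");
      }
    }
  }

  events.push("runtime.request.completed");
  return { output, events };
}

// Stub store and adapter used only to exercise the sketch.
const lifecycleDemoStore: MemoryStoreSketch = {
  recall: () => ["record-1"],
  submitCandidate: () => "review",
};
const lifecycleDemoAdapter = (input: string): string => `echo:${input}`;

const lifecycleWithoutStore = runRequestSketch({ input: "hi" }, lifecycleDemoAdapter);
const lifecycleWithStore = runRequestSketch(
  { input: "hi", memory: { candidates: ["remember this"] } },
  lifecycleDemoAdapter,
  lifecycleDemoStore,
);
```

Keeping the adapter call identical in both branches is what makes the "adapter input and output are not rewritten" test in the Verification section straightforward to assert.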

package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md ADDED

@@ -0,0 +1,118 @@
+# 0.1.0 Step 09.6 Memory Maintenance Boundary
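+## Goal
+This step defines the memory maintenance boundary. Maintenance actions must be explicit typed operations; the runtime must not infer them from natural-language keywords, business domain, or downstream scenarios.
+## New Interfaces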
+```text
+@stable-harness/memory
+├─ MemoryMaintenanceAction
+│  ├─ mark_stale
+│  ├─ archive
+│  ├─ refresh
+│  └─ supersede
+│
+├─ MemoryMaintenanceOperation
+│  ├─ action
+│  ├─ recordId
+│  ├─ reason
+│  └─ replacementRecordId?
+│
+├─ MemoryMaintenanceResult
+│  ├─ operation
+│  ├─ record?
+│  ├─ applied
+│  └─ reason
+│
+└─ applyMemoryMaintenance(store, operations)
+```
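+## Behavior
+- `mark_stale`: marks the record as `stale`.
+- `refresh`: marks the record as `active` and updates its confirmation time.
+- `archive`: marks the record as `archived`.
+- `supersede`: marks the old record as `archived` and records `supersededBy`.
+## Boundaries
+In scope:
+- Only typed operations submitted by the caller are executed.
+- Each operation result returns `applied` and a reason.
+- Maintenance actions run through the store's existing `update/archive`.
+Out of scope:
+- No automatic judgment of which memories are expired.
+- No merging or deleting by business keywords.
+- No reading of downstream agent prompts to decide maintenance actions.
+- The maintenance logic does not become another planner.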
+## Sequence Diagram
+```mermaid
+sequenceDiagram
+  participant Operator
+  participant Maintenance as applyMemoryMaintenance
+  participant Store as RuntimeMemoryStore
+  Operator->>Maintenance: MemoryMaintenanceOperation[]
+  loop each operation
+    Maintenance->>Store: update/archive(recordId)
+    Store-->>Maintenance: MemoryRecord?
+  end
+  Maintenance-->>Operator: MemoryMaintenanceResult[]
+```
+## Flow Chart
+```mermaid
+flowchart TD
+  A["Typed maintenance operation"] --> B{"action"}
+  B -->|mark_stale| C["update status stale"]
+  B -->|refresh| D["update status active"]
+  B -->|archive| E["archive record"]
+  B -->|supersede| F["archive old record with supersededBy"]
+  C --> G["MemoryMaintenanceResult"]
+  D --> G
+  E --> G
+  F --> G
+```
+## Verification
+This step adds test coverage for:
+- `mark_stale` sets the record to `stale`.
+- `refresh` restores the record to `active`.
+- `archive` sets the record to `archived`.
+Verification commands:
+```text
+npm run check
+npm run check:rules
+npm test
+```
+## Downstream Verification
+This step has been verified against a real EasyNet run:
+```text
+cd /Users/boqiangliang/project/easynet
+npm test
+npm run test:botbotgo:full
+```
+Results of the real run:
+- `npm test` passed: 18 contract tests, 7 real integration tests.
+- `test:botbotgo:full` passed: 8/8 matrix cases.
+- The full matrix explicitly uses EasyNet's package-local `node_modules/.bin/botbotgo`.
+- The real model path is the remote Ollama `granite4.1:3b` configured in EasyNet.
+- Owners covered: orchestra, software, qa, ops, release, research, secretary, k8s.
+- Real tool/data paths covered: finance stock report, source analysis, disk investigation, Git/GitHub Actions, Kubernetes readonly investigation, CLI routing.
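
The four maintenance actions in this file's Behavior section can be sketched in TypeScript as a single typed dispatch. This is an illustration under assumptions, not the package's `applyMemoryMaintenance`: it operates on a plain `Map` instead of a `RuntimeMemoryStore`, and the `confirmedAt` field name, the "record not found" result, and the requirement that `supersede` carries a `replacementRecordId` are hypothetical.

```typescript
type MaintenanceActionSketch = "mark_stale" | "archive" | "refresh" | "supersede";

interface MaintenanceRecordSketch {
  id: string;
  status: "active" | "stale" | "archived";
  confirmedAt?: string; // assumed name for the "confirmation time"
  supersededBy?: string;
}

interface MaintenanceOperationSketch {
  action: MaintenanceActionSketch;
  recordId: string;
  reason: string;
  replacementRecordId?: string;
}

interface MaintenanceResultSketch {
  operation: MaintenanceOperationSketch;
  record?: MaintenanceRecordSketch;
  applied: boolean;
  reason: string;
}

// Applies only the operations the caller submits; nothing is inferred.
function applyMaintenanceSketch(
  records: Map<string, MaintenanceRecordSketch>,
  operations: MaintenanceOperationSketch[],
): MaintenanceResultSketch[] {
  return operations.map((operation): MaintenanceResultSketch => {
    const record = records.get(operation.recordId);
    if (!record) {
      return { operation, applied: false, reason: "record not found" };
    }
    switch (operation.action) {
      case "mark_stale":
        record.status = "stale";
        break;
      case "refresh":
        record.status = "active";
        record.confirmedAt = new Date().toISOString();
        break;
      case "archive":
        record.status = "archived";
        break;
      case "supersede":
        // Assumption: superseding without a replacement id is rejected.
        if (!operation.replacementRecordId) {
          return { operation, record, applied: false, reason: "replacementRecordId is required" };
        }
        record.status = "archived";
        record.supersededBy = operation.replacementRecordId;
        break;
    }
    return { operation, record, applied: true, reason: operation.reason };
  });
}

const maintenanceDemoRecords = new Map<string, MaintenanceRecordSketch>([
  ["r1", { id: "r1", status: "active" }],
  ["r2", { id: "r2", status: "active" }],
]);
const maintenanceDemoResults = applyMaintenanceSketch(maintenanceDemoRecords, [
  { action: "mark_stale", recordId: "r1", reason: "not confirmed recently" },
  { action: "supersede", recordId: "r2", reason: "replaced", replacementRecordId: "r1" },
  { action: "archive", recordId: "missing", reason: "cleanup" },
]);
```

Returning a result per operation, including the ones that were not applied, keeps the caller in control of retries and auditing, which fits the boundary that maintenance must not turn into another planner.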

package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md ADDED

@@ -0,0 +1,118 @@
+# 0.1.0 Step 09.5 Memory Persistence Boundary
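+## Goal
+This step defines the memory persistence boundary. It is responsible only for saving and restoring durable records; it does not choose a database and does not treat the vector index as the source of truth.
+## New Interfaces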
+```text
+@stable-harness/memory
+├─ MemoryStoreSnapshot
+│  ├─ namespace
+│  ├─ records
+│  └─ exportedAt
+│
+├─ MemoryPersistenceAdapter
+│  ├─ load(namespace)
+│  └─ save(snapshot)
+│
+├─ createMemorySnapshot(store, filter)
+└─ createInMemoryMemoryPersistenceAdapter()
+```
+`createInMemoryRuntimeMemoryStore` gains:
+```text
+records?: MemoryRecord[]
+```
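+It is used to hydrate a new runtime memory store from records loaded through a persistence adapter.
+## Boundaries
+In scope:
+- The structured `MemoryRecord` is the source of truth.
+- The persistence adapter only saves and restores records.
+- A snapshot carries `namespace` and `exportedAt`.
+- The in-memory persistence adapter is for tests and local development only.
+Out of scope:
+- No concrete SQLite/Qdrant/LibSQL backend is implemented.
+- The vector store is not used as primary storage.
+- The persistence adapter does not take part in recall ranking.
+- The persistence adapter does not rewrite memory policy.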
+## Sequence Diagram
+```mermaid
+sequenceDiagram
+  participant Store as RuntimeMemoryStore
+  participant Snapshot as createMemorySnapshot
+  participant Adapter as MemoryPersistenceAdapter
+  participant NewStore as Hydrated Store
+  Store->>Snapshot: list(namespace filter)
+  Snapshot-->>Adapter: MemoryStoreSnapshot
+  Adapter->>Adapter: save(snapshot)
+  Adapter-->>NewStore: load(namespace)
+  NewStore->>NewStore: hydrate records
+```
+## Flow Chart
+```mermaid
+flowchart TD
+  A["RuntimeMemoryStore records"] --> B["createMemorySnapshot"]
+  B --> C["MemoryStoreSnapshot"]
+  C --> D["MemoryPersistenceAdapter.save"]
+  D --> E["MemoryPersistenceAdapter.load"]
+  E --> F["createInMemoryRuntimeMemoryStore records"]
+  F --> G["recall/list/update/archive"]
+```
+## Verification
+This step adds test coverage for:
+- The store exports a snapshot.
+- The persistence adapter saves the snapshot.
+- A new in-memory store hydrates from the loaded records.
+- The hydrated store can recall the original records.
+Verification commands:
+```text
+npm run check
+npm run check:rules
+npm test
+```
+## Downstream Verification
+This step has been verified against a real EasyNet run:
+```text
+cd /Users/boqiangliang/project/easynet
+npm test
+npm run test:botbotgo:full
+```
+Results of the real run:
+- `npm test` passed: 18 contract tests, 7 real integration tests.
+- `test:botbotgo:full` passed: 8/8 matrix cases.
+- The full matrix explicitly uses EasyNet's package-local `node_modules/.bin/botbotgo`.
+- The real model path is the remote Ollama `granite4.1:3b` configured in EasyNet.
+- Owners covered: orchestra, software, qa, ops, release, research, secretary, k8s.
+- Real tool/data paths covered: finance stock report, source analysis, disk investigation, Git/GitHub Actions, Kubernetes readonly investigation, CLI routing.
+## Next Step
+The next step is `Step 09.6 Memory Maintenance Boundary`:
+- Define the maintenance interface for stale/archive/merge/supersede.
+- Maintenance logic must be based on records and policy, not on business keywords.
+- Continue to avoid any downstream domain heuristic.
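
The snapshot, save, load, and hydrate round trip from this file's flow chart can be sketched in TypeScript as below. Everything here is a hypothetical stand-in: the `*Sketch` names, the synchronous signatures, `load` returning a record array, and the substring-based `recall` are assumptions chosen to keep the example short, not the package's actual API or ranking.

```typescript
interface PersistedRecordSketch {
  id: string;
  namespace: string;
  content: string;
}

interface SnapshotSketch {
  namespace: string;
  records: PersistedRecordSketch[];
  exportedAt: string;
}

interface PersistenceAdapterSketch {
  load(namespace: string): PersistedRecordSketch[];
  save(snapshot: SnapshotSketch): void;
}

// Builds a snapshot from structured records, which remain the source of truth.
function createSnapshotSketch(records: PersistedRecordSketch[], namespace: string): SnapshotSketch {
  return {
    namespace,
    records: records.filter((r) => r.namespace === namespace).map((r) => ({ ...r })),
    exportedAt: new Date().toISOString(),
  };
}

// In-memory adapter: saves and restores records, nothing else.
function createInMemoryAdapterSketch(): PersistenceAdapterSketch {
  const saved = new Map<string, SnapshotSketch>();
  return {
    load(namespace) {
      return (saved.get(namespace)?.records ?? []).map((r) => ({ ...r }));
    },
    save(snapshot) {
      saved.set(snapshot.namespace, snapshot);
    },
  };
}

// Store that can be hydrated through an optional `records` option.
function createStoreSketch(options: { records?: PersistedRecordSketch[] } = {}) {
  const records = [...(options.records ?? [])];
  return {
    list: (namespace: string) => records.filter((r) => r.namespace === namespace),
    // Substring match stands in for real recall ranking.
    recall: (namespace: string, query: string) =>
      records.filter((r) => r.namespace === namespace && r.content.includes(query)),
  };
}

const persistenceDemoOriginal: PersistedRecordSketch[] = [
  { id: "m1", namespace: "team-a", content: "deploys happen on Tuesday" },
  { id: "m2", namespace: "team-b", content: "uses a private registry" },
];
const persistenceDemoAdapter = createInMemoryAdapterSketch();
persistenceDemoAdapter.save(createSnapshotSketch(persistenceDemoOriginal, "team-a"));

const persistenceDemoHydrated = createStoreSketch({
  records: persistenceDemoAdapter.load("team-a"),
});
const persistenceDemoRecalled = persistenceDemoHydrated.recall("team-a", "Tuesday");
```

The adapter never ranks or filters by query; recall stays on the store, which reflects the boundary that the persistence adapter does not take part in recall ranking.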

package/docs/product/adoption-playbook.md ADDED

@@ -0,0 +1,145 @@
+# Adoption Playbook
+Stable Harness adoption should start from a concrete runtime problem, not from a
+claim that teams need another agent framework.
+The message is simple: keep your chosen agent backend, add a stable runtime
+boundary around it.
+## Who Should Try It First
+Good first users:
+- teams already building with DeepAgents, LangGraph, or OpenAI-compatible tools
+- teams with a working prototype that lacks sessions, traces, approvals, memory
+  lifecycle, protocol access, or repeatable deployment
+- teams evaluating small or private models that need better tool-call reliability
+- platform teams that need a common runtime surface across multiple agent apps
+- product teams that need operator UI around requests, events, artifacts, and
+  approvals
+Poor first users:
+- teams still deciding whether they need agents at all
+- teams looking for a fully hosted end-user product
+- teams that want Stable Harness to replace their backend framework's planning
+  semantics
+## Entry Points
+### 1. Package Trial
+Goal: let a user see value in five minutes.
+```bash
+npx stable-harness init ./my-agent-app
+stable-harness -w ./my-agent-app
+stable-harness -w ./my-agent-app --agent orchestra --tool echo_tool --tool-args-json '{"value":"hello"}'
+```
+This proves installability, workspace shape, and gateway execution without
+requiring a full application migration.
+### 2. Existing Agent Wrapper
+Goal: wrap an existing agent app with runtime lifecycle and inspection.
+Steps:
+1. Create workspace YAML for the current agents, tools, and models.
+2. Keep backend semantics upstream-native.
+3. Add direct tool smoke tests through Stable Harness.
+4. Add event trace display or persistence.
+5. Expose the same runtime through CLI or OpenAI-compatible protocol.
+### 3. Tool Reliability Demo
+Goal: show why the tool gateway matters.
+Use a small model and a strict tool schema. Compare raw tool-call behavior with
+Stable Harness gateway behavior. The important evidence is not a single final
+answer; it is the runtime trace showing validation, repair, execution, and the
+bounded policy decision.
+### 4. Operator Control Plane Demo
+Goal: show production value beyond agent execution.
+Demonstrate:
+- request IDs and sessions
+- runtime event stream
+- tool traces
+- approvals or policy decisions
+- memory lifecycle hooks
+- OpenAI-compatible access to the same workspace
+### 5. Framework-Neutral Platform Pilot
+Goal: make Stable Harness the shared runtime layer across agent applications.
+Start with two workspaces that use different backend assumptions. Keep shared
+runtime controls in Stable Harness and keep backend-specific behavior behind
+adapter config.
+## Messaging
+Use this positioning:
+> Stable Harness is a stable runtime and operator control plane for agent
+> applications. It lets you keep DeepAgents, LangGraph, or another backend while
+> adding YAML inventory, sessions, events, governance, memory lifecycle, tool
+> repair, and protocol access.
+Avoid this positioning:
+- "a better agent framework"
+- "a replacement for LangGraph"
+- "automatic agent correctness"
+- "magic tool calling"
+## Documentation Funnel
+For new users:
+1. README
+2. Getting started
+3. Workspace authoring
+4. Integration guide
+5. Operator runbook
+For technical evaluators:
+1. Product boundary
+2. Compatibility matrix
+3. Adapter contract
+4. Protocol docs
+5. Engineering rules
+For buyers or internal champions:
+1. Adoption playbook
+2. Market positioning
+3. Tool reliability evidence
+4. Operator runbook
+5. Release evidence
+## Proof Points To Publish
+- npm package install smoke
+- minimal workspace init
+- direct tool-call run
+- OpenAI-compatible facade run
+- small-model tool-call repair examples
+- one real downstream workspace integration
+- runtime trace screenshots or logs
+- compatibility matrix by backend and protocol
+## What To Build Next For Adoption
+- copy-paste examples for common model providers
+- short videos or terminal GIFs for init and tool-call smoke
+- hosted documentation site generated from `docs/`
+- example workspaces for DeepAgents, LangGraph workflow, and OpenAI-compatible
+  facade
+- CI template that runs workspace validation and package smoke tests