npm - stable-harness - Versions diffs - 0.0.7 → 0.0.9 - Mend

stable-harness 0.0.7 → 0.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (39)

package/README.md +10 -0
package/docs/0.1.0-p0-runtime-control-plane-plan.zh.md +171 -0
package/docs/0.1.0-retry-policy.zh.md +87 -0
package/docs/0.1.0-stable-runtime-development-roadmap.zh.md +393 -0
package/docs/0.1.0-tool-guard-benchmark.zh.md +42 -0
package/docs/adapter-contract.md +199 -0
package/docs/architecture/backend-comparison.md +41 -0
package/docs/architecture/runtime-events.md +263 -0
package/docs/architecture/runtime-events.zh.md +248 -0
package/docs/architecture/system-architecture.zh.md +435 -0
package/docs/compatibility-matrix.md +139 -0
package/docs/engineering-rules.md +111 -0
package/docs/evaluation/0.1.0-bfcl-targeted-model-matrix.zh.md +1632 -0
package/docs/evaluation/0.1.0-bfcl-targeted-review-matrix.zh.md +1952 -0
package/docs/evaluation/0.1.0-bfcl-tool-guard.zh.md +1427 -0
package/docs/granite-tool-calling-comparison.zh.md +206 -0
package/docs/guides/getting-started.md +126 -0
package/docs/guides/index.md +40 -0
package/docs/guides/integration-guide.md +126 -0
package/docs/guides/operator-runbook.md +153 -0
package/docs/guides/workspace-authoring.md +212 -0
package/docs/implementation-blueprint.md +233 -0
package/docs/memory/0.1.0-memory-design.zh.md +719 -0
package/docs/memory/0.1.0-step-09-deepagents-native-memory.zh.md +146 -0
package/docs/memory/0.1.0-step-09-langmem-shaped-provider.zh.md +169 -0
package/docs/memory/0.1.0-step-09-memory-adapter-projection.zh.md +123 -0
package/docs/memory/0.1.0-step-09-memory-contract.zh.md +169 -0
package/docs/memory/0.1.0-step-09-memory-governance-approval.zh.md +143 -0
package/docs/memory/0.1.0-step-09-memory-lifecycle-hooks.zh.md +150 -0
package/docs/memory/0.1.0-step-09-memory-maintenance-boundary.zh.md +118 -0
package/docs/memory/0.1.0-step-09-memory-persistence-boundary.zh.md +118 -0
package/docs/product/adoption-playbook.md +145 -0
package/docs/product/market-positioning.md +137 -0
package/docs/product-boundary.md +258 -0
package/docs/protocols/http-runtime.md +37 -0
package/docs/protocols/langgraph-compatible.md +107 -0
package/docs/protocols/openai-compatible.md +121 -0
package/docs/tooling/0.1.0-bettercall-tool-quality.zh.md +231 -0
package/package.json +3 -1

package/docs/guides/workspace-authoring.md ADDED

@@ -0,0 +1,212 @@
+# Workspace Authoring
+Stable Harness workspaces are directories of Kubernetes-style YAML resources plus
+local resources such as tools and skills.
+The workspace is the product surface. Application behavior should be expressed
+here, not hidden inside framework code.
+## Folder Shape
+```text
+config/
+  runtime/workspace.yaml
+  agents/orchestra.yaml
+  catalogs/models.yaml
+  catalogs/tools.yaml
+  workflows/review-shell.yaml
+resources/
+  tools/
+  skills/
+```
+## Runtime
+`kind: Runtime` owns workspace-level routing, protocol exposure, memory policy,
+approval policy, event display, and other runtime controls.
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Runtime
+metadata:
+  name: app-runtime
+spec:
+  routing:
+    defaultAgentId: orchestra
+  protocols:
+    inProcess: true
+    openaiCompatible:
+      host: 127.0.0.1
+      port: 8642
+```
+Keep runtime config typed and explicit. Do not route by prompt keywords, tool
+names, benchmark cases, or downstream domains.
+## Agents
+`kind: Agent` declares the agent inventory and selects the backend adapter.
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Agent
+metadata:
+  name: orchestra
+spec:
+  backend: deepagents
+  modelRef: local-dev
+  systemPrompt: You are a concise workspace agent.
+  tools:
+    - echo_tool
+  subagents:
+    - reviewer
+  config:
+    deepagents:
+      taskDescription: Call a subagent only when the upstream backend chooses to.
+```
+Backend-specific settings belong under explicit adapter config fields such as
+`config.deepagents`. Stable Harness should pass upstream-native semantics
+through instead of recreating them locally.
+## Models
+`kind: Model` declares model provider configuration.
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Model
+metadata:
+  name: local-dev
+spec:
+  provider: openai-compatible
+  model: ${env:STABLE_HARNESS_MODEL:-gpt-4.1-mini}
+  baseUrl: ${env:STABLE_HARNESS_OPENAI_BASE_URL:-https://api.openai.com/v1}
+  apiKey: ${env:OPENAI_API_KEY}
+```
+Use environment references for secrets and environment-specific endpoints.
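The `${env:NAME:-default}` references in the model spec resemble shell parameter expansion. The following is a minimal sketch of how such references might be resolved, assuming shell-like semantics where an unset or empty variable falls back to the default. `resolveEnvRefs` and its error behavior are hypothetical and are not taken from the Stable Harness loader.

```typescript
// Hypothetical helper: expands ${env:NAME} and ${env:NAME:-default}.
// Assumption: an unset or empty variable uses the default, as with the shell
// form ${NAME:-default}; a reference with no value and no default throws.
type Env = Record<string, string | undefined>;

const ENV_REF = /\$\{env:([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;

function resolveEnvRefs(value: string, env: Env): string {
  return value.replace(ENV_REF, (_match: string, varName: string, fallback?: string) => {
    const current = env[varName];
    if (current !== undefined && current !== "") {
      return current;
    }
    if (fallback !== undefined) {
      return fallback;
    }
    throw new Error(`Missing environment variable: ${varName}`);
  });
}

// Usage with an explicit environment object instead of the process environment.
const env: Env = { OPENAI_API_KEY: "sk-test" };
const model = resolveEnvRefs("${env:STABLE_HARNESS_MODEL:-gpt-4.1-mini}", env);
const apiKey = resolveEnvRefs("${env:OPENAI_API_KEY}", env);
```

Passing the environment in as an argument keeps the helper testable and avoids reading secrets from global state.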
+## Tools
+Local module tools are auto-discovered from `resources/tools/*.mjs`. The file
+name becomes the tool ID unless the file exports `tool(...)` declarations.
+Implement a local module:
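+```js
+// Assumed file name for illustration: resources/tools/echo_tool.mjs
+export const echo_tool = {
+  description: "Echo input through the Stable Harness tool gateway.",
+  // JSON Schema describing the arguments the tool accepts.
+  schema: {
+    type: "object",
+    properties: { value: { type: "string" } },
+    required: ["value"],
+  },
+  // Receives the call arguments and returns the result as a JSON string.
+  async invoke(args) {
+    return JSON.stringify({ echoed: args.value });
+  },
+};
+```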
+Reference the discovered tool from an agent:
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Agent
+metadata:
+  name: orchestra
+spec:
+  backend: deepagents
+  modelRef: local-dev
+  systemPrompt: You are a concise workspace agent.
+  tools:
+    - echo_tool
+```
+Use `kind: Tool` YAML for non-module or gateway-owned tools such as built-ins:
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Tool
+metadata:
+  name: shell
+spec:
+  description: Run an approved shell command through the configured tool gateway.
+  implementation: local-tool-gateway:shell
+  schema:
+    type: object
+    properties:
+      command:
+        type: string
+    required:
+      - command
+```
+Do not declare a YAML tool with the same ID as an auto-discovered module tool
+unless that YAML declaration intentionally replaces the module-backed tool.
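The override rule above can be sketched as an inventory merge. This is an illustration only: `ToolEntry`, `mergeToolInventory`, and the idea of reporting replaced IDs are assumptions, not the actual workspace compiler.

```typescript
// Hypothetical inventory entry; `source` records where the declaration came from.
interface ToolEntry {
  id: string;
  source: "module" | "yaml";
  description: string;
}

// Own-property check, so IDs like "toString" never match inherited members.
function hasOwnTool(table: Record<string, ToolEntry>, id: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, id);
}

// Assumption: a YAML declaration with the same ID replaces the module-backed
// tool, and every replacement is reported so it is never silent.
function mergeToolInventory(
  moduleTools: ToolEntry[],
  yamlTools: ToolEntry[],
): { tools: Record<string, ToolEntry>; replaced: string[] } {
  const tools: Record<string, ToolEntry> = {};
  const replaced: string[] = [];
  for (const tool of moduleTools) {
    tools[tool.id] = tool;
  }
  for (const tool of yamlTools) {
    if (hasOwnTool(tools, tool.id)) {
      replaced.push(tool.id);
    }
    tools[tool.id] = tool;
  }
  return { tools, replaced };
}

const merged = mergeToolInventory(
  [
    { id: "echo_tool", source: "module", description: "Echo input." },
    { id: "shell", source: "module", description: "Module-backed shell." },
  ],
  [{ id: "shell", source: "yaml", description: "Gateway-owned shell." }],
);
```

Surfacing `replaced` gives an operator a way to confirm that each override was intentional.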
+The CLI tool gateway uses BetterCall repair mode for registered tools. Repair is
+bounded by inventory, schemas, semantic validators, and governance policy.
+Unknown or unauthorized tools stay blocked.
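A minimal sketch of the bounds described above, assuming checks run in the order inventory, authorization, then schema. It does not model BetterCall repair itself, and `gateToolCall`, `GateDecision`, and the reason codes are hypothetical names chosen for illustration.

```typescript
// Hypothetical types: a proposed call and a registered inventory entry.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface RegisteredTool {
  required: string[]; // required argument names taken from the tool schema
}

type GateDecision =
  | { allowed: true }
  | { allowed: false; reason: "unknown_tool" | "unauthorized_tool" | "invalid_args" };

// Own-property check, so names like "toString" never match by accident.
function hasOwn(target: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(target, key);
}

// Assumption: a failed check blocks the call outright.
function gateToolCall(
  call: ToolCall,
  inventory: Record<string, RegisteredTool>,
  agentTools: string[],
): GateDecision {
  if (!hasOwn(inventory, call.tool)) {
    return { allowed: false, reason: "unknown_tool" };
  }
  if (agentTools.indexOf(call.tool) === -1) {
    return { allowed: false, reason: "unauthorized_tool" };
  }
  const missing = inventory[call.tool].required.filter(
    (argName) => !hasOwn(call.args, argName),
  );
  if (missing.length > 0) {
    return { allowed: false, reason: "invalid_args" };
  }
  return { allowed: true };
}

const inventory: Record<string, RegisteredTool> = {
  echo_tool: { required: ["value"] },
  shell: { required: ["command"] },
};
const decision = gateToolCall({ tool: "shell", args: { command: "ls" } }, inventory, ["echo_tool"]);
```

Here `shell` is registered but not listed for the agent, so the sketch returns an `unauthorized_tool` decision instead of running it.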
+## Workflows
+Workflows are explicit control-plane topology, not a second agent language.
+```yaml
+apiVersion: stable-harness.dev/v1
+kind: Workflow
+metadata:
+  name: review-shell
+spec:
+  adapter: langgraph
+  entry: inspect
+  nodes:
+    - id: inspect
+      use: agents.orchestra
+    - id: run_shell
+      use: tools.shell
+    - id: review
+      use: agents.reviewer
+  edges:
+    - from: inspect
+      to: run_shell
+    - from: run_shell
+      to: review
+```
+Use workflows when the operator needs inspectable topology, migration support,
+or deterministic orchestration around existing inventory.
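Because the topology is explicit, it can be checked before anything runs. The sketch below assumes a few plausible consistency rules (unique node IDs, a declared entry node, edges between declared nodes, no unreachable nodes). `validateWorkflow` and these rules are illustrative and are not the loader's documented behavior.

```typescript
// Mirrors the fields of the Workflow spec shown above.
interface WorkflowSpec {
  entry: string;
  nodes: { id: string; use: string }[];
  edges: { from: string; to: string }[];
}

// Returns a list of problems; an empty list means the topology is consistent.
function validateWorkflow(spec: WorkflowSpec): string[] {
  const problems: string[] = [];
  const ids = spec.nodes.map((node) => node.id);

  ids.forEach((id, index) => {
    if (ids.indexOf(id) !== index) problems.push(`duplicate node id: ${id}`);
  });
  if (ids.indexOf(spec.entry) === -1) {
    problems.push(`entry is not a declared node: ${spec.entry}`);
  }
  for (const edge of spec.edges) {
    if (ids.indexOf(edge.from) === -1) problems.push(`edge starts at unknown node: ${edge.from}`);
    if (ids.indexOf(edge.to) === -1) problems.push(`edge ends at unknown node: ${edge.to}`);
  }

  // Breadth-first walk from the entry node to find unreachable nodes.
  const reachable: string[] = [spec.entry];
  for (let i = 0; i < reachable.length; i++) {
    for (const edge of spec.edges) {
      if (edge.from === reachable[i] && reachable.indexOf(edge.to) === -1) {
        reachable.push(edge.to);
      }
    }
  }
  for (const id of ids) {
    if (reachable.indexOf(id) === -1) problems.push(`unreachable node: ${id}`);
  }
  return problems;
}

const reviewShell: WorkflowSpec = {
  entry: "inspect",
  nodes: [
    { id: "inspect", use: "agents.orchestra" },
    { id: "run_shell", use: "tools.shell" },
    { id: "review", use: "agents.reviewer" },
  ],
  edges: [
    { from: "inspect", to: "run_shell" },
    { from: "run_shell", to: "review" },
  ],
};
const reviewShellProblems = validateWorkflow(reviewShell);
```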
+## Memory
+Runtime memory policy belongs on the runtime resource. Backend-native memory
+should still be used through the adapter when the upstream backend owns it.
+```yaml
+spec:
+  memory:
+    enabled: true
+    deepagentsMem:
+      read: true
+      write: true
+    LangMem:
+      read: true
+      write: true
+    maintenance:
+      enabled: true
+      schedule: periodic
+```
+Memory is an operational substrate: namespace, persistence, lifecycle,
+maintenance, import/export, and governance. It is not a fake uniform replacement
+for every backend's native memory behavior.
+## Authoring Rules
+- Put product behavior in workspace YAML and resources.
+- Keep framework code generic and backend-neutral.
+- Prefer upstream passthrough over local copies of upstream features.
+- Add runtime capabilities only when they are lifecycle, governance,
+  observability, persistence, recovery, protocol, memory, or tool-gateway work.
+- Make each capability independently replaceable.

package/docs/implementation-blueprint.md ADDED


@@ -0,0 +1,233 @@
+# Stable Harness Implementation Blueprint
+This document defines what to build first and how each package should stay inside the product boundary.
+## Repository Shape
+The project is a TypeScript monorepo.
+Top-level packages:
+- `@stable-harness/core`
+- `@stable-harness/workspace-yaml`
+- `@stable-harness/adapter-deepagents`
+- `@stable-harness/memory`
+- `@stable-harness/governance`
+- `@stable-harness/protocols`
+- `@stable-harness/cli`
+Each package should have one clear responsibility and should avoid cross-layer shortcuts.
+## Package Responsibilities
+### `@stable-harness/core`
+Owns the stable runtime contract:
+- compiled workspace model
+- runtime request and response types
+- runtime events
+- adapter contract
+- runtime lifecycle
+- inspection surface
+It must not import DeepAgents, LangChain, the OpenAI SDK, the Gemini SDK, or protocol-specific servers.
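One way to picture this boundary is a core that knows backends only through an interface and a registry. The sketch below is hypothetical throughout: `BackendAdapter`, `CoreRuntime`, the request and response shapes, and the `run.completed` event type are illustrative names, and the call is kept synchronous for brevity where a real adapter would presumably be asynchronous.

```typescript
// Hypothetical core-side types; nothing here imports a backend SDK.
interface RuntimeRequest {
  agentId: string;
  input: string;
}

interface RuntimeEvent {
  type: string;
  agentId: string;
}

interface RuntimeResponse {
  output: string;
  events: RuntimeEvent[];
}

// The only thing core knows about a backend: an ID and a run function.
interface BackendAdapter {
  readonly backend: string;
  run(request: RuntimeRequest): RuntimeResponse;
}

class CoreRuntime {
  private adapters: Record<string, BackendAdapter> = {};
  private agentBackends: Record<string, string>;

  // agentBackends maps an agent ID to the backend named in its Agent resource.
  constructor(agentBackends: Record<string, string>) {
    this.agentBackends = agentBackends;
  }

  register(adapter: BackendAdapter): void {
    this.adapters[adapter.backend] = adapter;
  }

  run(request: RuntimeRequest): RuntimeResponse {
    const backend = this.agentBackends[request.agentId];
    const adapter = backend === undefined ? undefined : this.adapters[backend];
    if (adapter === undefined) {
      throw new Error(`No adapter registered for agent "${request.agentId}"`);
    }
    return adapter.run(request);
  }
}

// A stub standing in for the adapter package, which lives outside core.
const stubDeepAgents: BackendAdapter = {
  backend: "deepagents",
  run(request: RuntimeRequest): RuntimeResponse {
    return {
      output: `stub reply to: ${request.input}`,
      events: [{ type: "run.completed", agentId: request.agentId }],
    };
  },
};

const runtime = new CoreRuntime({ orchestra: "deepagents" });
runtime.register(stubDeepAgents);
const response = runtime.run({ agentId: "orchestra", input: "ping" });
```

The dependency points one way: the adapter implements a core interface, so core never needs to import the backend.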
+### `@stable-harness/workspace-yaml`
+Owns workspace loading and validation:
+- discover YAML files
+- parse Kubernetes-style resources
+- validate resource shape
+- compile resources into the core workspace model
+- preserve backend-specific config under explicit adapter config fields
+It must not execute agents or tools.
+### `@stable-harness/adapter-deepagents`
+Owns DeepAgents integration:
+- create upstream DeepAgents instances
+- pass through DeepAgents-native config
+- map runtime request metadata to upstream calls
+- normalize upstream events into runtime events
+- return final output and artifacts
+It must stay thin. It must not add domain heuristics, TODO-to-tool inference, or local replay of DeepAgents custom tools.
+Before adding behavior here, check whether the current DeepAgents version already provides it. Existing DeepAgents behavior must be passed through or configured with upstream-native options, not rebuilt locally.
+### `@stable-harness/memory`
+Owns runtime-level memory lifecycle:
+- memory store interfaces
+- namespace conventions
+- persistence adapters
+- compaction hooks
+- backup and export hooks
+It may integrate with backend-native memory primitives through adapters, but it must not replace them with a fake uniform semantic model.
+### `@stable-harness/governance`
+Owns runtime governance:
+- tool approval policy
+- sandbox policy
+- resource limits
+- audit decisions
+- policy evaluation results
+It must use structured policy and metadata, not free-form text matching.
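As an illustration of structured policy, the sketch below matches on a tool ID and returns an auditable result, and it never inspects prompt or argument text. The policy shape, decision names, and `evaluateToolPolicy` are assumptions made for this example, not the package's actual schema.

```typescript
type PolicyDecision = "allow" | "deny" | "require_approval";

// Hypothetical policy shape: explicit per-tool rules plus a default.
interface ToolPolicyRule {
  tool: string;
  decision: PolicyDecision;
}

interface ToolPolicy {
  defaultDecision: PolicyDecision;
  rules: ToolPolicyRule[];
}

// The result records which rule matched, so the decision can be audited.
interface PolicyResult {
  tool: string;
  decision: PolicyDecision;
  matchedRule: number | null;
}

// First matching rule wins; tools with no rule get the default decision.
function evaluateToolPolicy(policy: ToolPolicy, tool: string): PolicyResult {
  for (let index = 0; index < policy.rules.length; index++) {
    const rule = policy.rules[index];
    if (rule.tool === tool) {
      return { tool, decision: rule.decision, matchedRule: index };
    }
  }
  return { tool, decision: policy.defaultDecision, matchedRule: null };
}

const policy: ToolPolicy = {
  defaultDecision: "deny",
  rules: [
    { tool: "echo_tool", decision: "allow" },
    { tool: "shell", decision: "require_approval" },
  ],
};
const shellResult = evaluateToolPolicy(policy, "shell");
```

A default of `deny` means a tool with no rule is refused rather than silently allowed.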
+### `@stable-harness/protocols`
+Owns protocol adapters:
+- in-process client
+- HTTP server
+- MCP surface
+- ACP surface
+- A2A surface
+- AG-UI surface
+Protocol adapters should expose stable runtime concepts and call the core runtime.
+### `@stable-harness/cli`
+Owns local developer workflow:
+- initialize workspaces
+- run a request
+- inspect runtime state
+- list events
+- approve or deny pending work
+CLI commands should be thin wrappers around core runtime and protocol clients.
+## Build Order
+### Phase 1: Runtime spine
+Deliver:
+- core runtime types
+- in-memory runtime implementation
+- YAML loader
+- minimal DeepAgents adapter contract
+- minimal example workspace
+- build and test scripts
+Acceptance:
+- `npm run check` passes
+- example workspace can load
+- runtime can route one request to the DeepAgents adapter scaffold
+- no file violates project size limits
+### Phase 2: Real DeepAgents integration
+Deliver:
+- `createDeepAgent`-based adapter path
+- direct passthrough for DeepAgents-native middleware config
+- support for DeepAgents task tool and subagents through upstream semantics
+- event normalization
+- integration tests with a real upstream DeepAgents dependency
+- a DeepAgents capability audit for every adapter feature added in this phase
+Acceptance:
+- adapter does not manually recreate the DeepAgents middleware stack when `createDeepAgent` is sufficient
+- DeepAgents custom tool calls stay inside the upstream execution boundary
+- any wrapper around DeepAgents is limited to runtime governance, lifecycle, events, traces, artifacts, protocol access, or operator inspection
+- no runtime keyword heuristics
+### Capability Design Rule
+Every runtime-owned feature should be a single-purpose optional capability.
+Examples:
+- approvals
+- event recording
+- artifact storage
+- memory lifecycle
+- tool gateway
+- replay
+- protocol serving
+- sandbox policy
+- evaluation runner
+Each capability needs:
+- a narrow interface
+- typed config
+- an enable or disable path
+- a replacement point
+- focused tests
+Do not make one feature depend on unrelated runtime capabilities unless the dependency is explicit in typed config.
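The requirements listed above could take a shape like the following. This is a sketch under assumptions: `Capability`, `EventRecorder`, `createInMemoryRecorder`, and `RuntimeCapabilities` are hypothetical names, with event recording used as the example capability.

```typescript
// Hypothetical base shape: typed config plus an explicit enabled flag.
interface Capability<Config> {
  readonly name: string;
  readonly enabled: boolean;
  readonly config: Config;
}

interface EventRecorderConfig {
  maxEvents: number;
}

// Narrow interface: recording and listing, nothing else.
interface EventRecorder extends Capability<EventRecorderConfig> {
  record(event: string): void;
  list(): string[];
}

// One implementation; any other object satisfying EventRecorder can replace it.
function createInMemoryRecorder(config: EventRecorderConfig, enabled: boolean): EventRecorder {
  const events: string[] = [];
  return {
    name: "event-recording",
    enabled,
    config,
    record(event: string): void {
      if (!enabled) return; // disabled capability is a no-op
      events.push(event);
      if (events.length > config.maxEvents) events.shift(); // keep the newest entries
    },
    list(): string[] {
      return events.slice(); // copy, so callers cannot mutate internal state
    },
  };
}

// Replacement point: the runtime depends on the interface, not the factory.
interface RuntimeCapabilities {
  eventRecorder: EventRecorder;
}

const capabilities: RuntimeCapabilities = {
  eventRecorder: createInMemoryRecorder({ maxEvents: 2 }, true),
};
capabilities.eventRecorder.record("run.started");
capabilities.eventRecorder.record("tool.called");
capabilities.eventRecorder.record("run.completed");
const recorded = capabilities.eventRecorder.list();
```

Because the recorder is a no-op when disabled, callers do not need their own feature checks, which keeps the enable or disable path in one place.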
+### Phase 3: Operator control plane
+Deliver:
+- persisted run records
+- event store
+- approval queue
+- cancellation and retry lifecycle
+- run inspection API
+- CLI inspection commands
+Acceptance:
+- operators can inspect request lifecycle without touching backend checkpoints
+- approval flow is driven by structured policy
+### Phase 4: Protocol surfaces
+Deliver:
+- HTTP API
+- MCP adapter
+- ACP adapter
+- AG-UI event stream
+- protocol conformance tests
+Acceptance:
+- all protocols call the same stable runtime contract
+- protocol adapters do not duplicate backend logic
+### Phase 5: Memory and replay
+Deliver:
+- memory persistence adapter
+- memory import/export
+- trace export
+- replay manifest
+- evaluation fixture runner
+Acceptance:
+- replay is driven by structured traces
+- memory governance is visible in events
+## Forbidden Implementation Patterns
+Do not add:
+- hardcoded finance, Kubernetes, GitHub Actions, source-analysis, or other downstream tool branches
+- user-text regex routing
+- assistant-text parsing that triggers tools
+- TODO parsing that creates evidence tool calls
+- local replay for upstream custom tools
+- framework-specific public API naming unless intentionally accepted as stable product surface
+- bundled runtime subsystems that cannot be independently enabled, disabled, replaced, and tested
+If a feature seems useful but requires one of these patterns, it belongs in a downstream workspace or upstream adapter config, not in the generic runtime.