@mono-agent/agent-runtime 0.1.0 (npm, version diff)

Files changed (60)

package/ARCHITECTURE.md +219 -0
package/LICENSE +674 -0
package/README.md +430 -0
package/package.json +46 -0
package/src/agent/allowlists.js +49 -0
package/src/agent/approval.js +211 -0
package/src/agent/compaction.js +752 -0
package/src/agent/index.js +40 -0
package/src/agent/prompt/skill-index.js +66 -0
package/src/agent/tool-bloat.js +164 -0
package/src/agent/tools/bash.js +156 -0
package/src/agent/tools/edit.js +15 -0
package/src/agent/tools/glob.js +71 -0
package/src/agent/tools/grep.js +84 -0
package/src/agent/tools/index.js +17 -0
package/src/agent/tools/pi-bridge.js +638 -0
package/src/agent/tools/read.js +39 -0
package/src/agent/tools/shared/constants.js +21 -0
package/src/agent/tools/shared/dedup.js +31 -0
package/src/agent/tools/shared/output-truncation.js +54 -0
package/src/agent/tools/shared/path-resolver.js +156 -0
package/src/agent/tools/shared/ripgrep.js +130 -0
package/src/agent/tools/shared/runtime-context.js +69 -0
package/src/agent/tools/web-fetch.js +59 -0
package/src/agent/tools/web-search.js +21 -0
package/src/agent/tools/write.js +14 -0
package/src/agent/transcript.js +227 -0
package/src/ai/backend.js +17 -0
package/src/ai/cost.js +164 -0
package/src/ai/failure.js +165 -0
package/src/ai/file-change-stats.js +234 -0
package/src/ai/index.js +16 -0
package/src/ai/live-input-prompt.js +15 -0
package/src/ai/observer.js +233 -0
package/src/ai/providers/claude-cli.js +694 -0
package/src/ai/providers/claude-sdk.js +864 -0
package/src/ai/providers/claude-subagents.js +67 -0
package/src/ai/providers/codex-app.js +1045 -0
package/src/ai/providers/opencode-app.js +356 -0
package/src/ai/providers/opencode-discovery.js +39 -0
package/src/ai/providers/pi-events.js +62 -0
package/src/ai/providers/pi-messages.js +68 -0
package/src/ai/providers/pi-models.js +111 -0
package/src/ai/providers/pi-sdk.js +1310 -0
package/src/ai/registry.js +5 -0
package/src/ai/runtime/capabilities-used.js +56 -0
package/src/ai/runtime/capabilities.js +44 -0
package/src/ai/runtime/context-windows.js +38 -0
package/src/ai/runtime/fast-mode.js +8 -0
package/src/ai/runtime/model-refs.js +144 -0
package/src/ai/runtime/registry.js +57 -0
package/src/ai/runtime/router.js +214 -0
package/src/ai/runtime/sessions.js +126 -0
package/src/ai/streaming/codex-events.js +139 -0
package/src/ai/streaming/opencode-events.js +54 -0
package/src/ai/types.js +70 -0
package/src/index.js +23 -0
package/src/pi-auth.js +80 -0
package/src/runtime-brand.js +32 -0
package/src/runtime.js +104 -0

package/ARCHITECTURE.md ADDED

@@ -0,0 +1,219 @@
+# Agent Runtime Architecture
+## What It Is
+`@mono-agent/agent-runtime` is a provider-agnostic agent execution
+kernel. It does not own tasks, database state, UI, scheduling, or a host's
+domain-specific result contract. It owns the lower-level act of running an
+agent turn:
+- pick the right backend from a model reference and execution mode
+- expose built-in tools, MCP tools, approvals, structured output, and live input
+- enforce optional sandbox policy for built-in tool execution and stdio MCP startup
+- normalize provider events into one runtime event stream
+- classify runtime failures and retryable provider errors
+- collect usage, cost, cache, capability, and warning telemetry
+- return raw text plus raw structured output to the host
+Hosts consume the package through `src/runtime.js`.
+## Package Boundary
+```mermaid
+flowchart TB
+  HostApp["Host app<br/>API / coordinator / worker / UI / DB"] --> CoreAI["host runtime composition"]
+  CoreAI --> Runtime["agent-runtime<br/>createRuntime() / createRouterRuntime()"]
+  Runtime --> Registry["Runtime bridge registry<br/>model ref + executionMode -> backend"]
+  Runtime --> AgentKernel["Agent kernel<br/>built-in tools, MCP, approvals,<br/>compaction, transcript snapshots"]
+  Runtime --> Observability["Observers + metrics<br/>usage, cost, events, warnings"]
+  Runtime --> Failure["Failure taxonomy<br/>retryable provider detection"]
+  Registry --> ClaudeSDK["Claude SDK bridge"]
+  Registry --> ClaudeCLI["Claude Code CLI bridge"]
+  Registry --> PiSDK["Pi SDK bridge<br/>OpenAI, Codex, Gemini, OpenRouter,<br/>Ollama, custom providers"]
+  Registry --> CodexApp["Codex app-server CLI bridge"]
+  AgentKernel --> Builtins["Read / Write / Edit / Glob / Grep / Bash<br/>WebFetch / WebSearch"]
+  AgentKernel --> MCP["MCP stdio / SSE / HTTP tools"]
+  AgentKernel --> Sandbox["Sandbox policy<br/>path/network checks + stdio command wrapping"]
+  AgentKernel --> Artifacts["Tool-output bloat guard<br/>host artifact persistence"]
+  ClaudeSDK --> Providers["External model/provider surfaces"]
+  ClaudeCLI --> Providers
+  PiSDK --> Providers
+  CodexApp --> Providers
+  Runtime --> Result["RuntimeResult<br/>text, structuredResult, events,<br/>usage, diagnostics, failureKind"]
+  Result --> CoreAI
+  CoreAI --> HostContract["Host parses domain contract<br/>assistant result / task effects"]
+```
+The runtime stays below host domain behavior. Provider code in this package
+must not import host DB, API, coordinator, or UI modules. Hosts pass callbacks
+and pre-resolved settings into the runtime instead.
+## Runtime Selection
+```mermaid
+flowchart LR
+  ModelRef["options.model<br/>claude:* / pi:*:* / codex:*"] --> Parse["parseRuntimeModelReference()"]
+  Parse --> Mode["options.executionMode<br/>sdk or cli"]
+  Mode --> Resolve["resolveRuntimeBridge()"]
+  Resolve -->|sdk=claude + sdk mode| ClaudeSDK["claude bridge<br/>@anthropic-ai/claude-agent-sdk"]
+  Resolve -->|sdk=claude + cli mode| ClaudeCLI["claude-code bridge<br/>claude binary"]
+  Resolve -->|sdk=pi| PiSDK["pi bridge<br/>@earendil-works/pi-agent-core"]
+  Resolve -->|sdk=codex + cli mode| CodexApp["codex-app bridge<br/>codex app-server"]
+  Resolve --> Caps["runtimeCapabilities()<br/>static backend features"]
+  Caps --> Used["capabilitiesUsed<br/>per-call observed features"]
+```
+Canonical active model references are:
+- `claude:<modelId>` for Claude SDK or Claude Code CLI, selected by
+  `executionMode`
+- `pi:<providerId>:<modelName>` for Pi SDK providers
+- `codex:<modelId>` for Codex app-server CLI
+Legacy aliases are canonicalized at host ingress when needed. The strict parser
+keeps the package boundary honest by rejecting reserved runtime IDs such as
+`openai:*`, `vercel:*`, and `claude-code:*`.
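The reference grammar and the bridge table described above can be sketched in plain JavaScript. This is an illustrative sketch, not the package's `parseRuntimeModelReference()` or `resolveRuntimeBridge()`: the names `parseModelRefSketch`, `resolveBridgeSketch`, and `splitAtFirstColon`, the returned object shapes, the error messages, and the handling of colons inside a Pi model name are all assumptions. Only the three reference forms, the reserved prefixes, and the bridge names come from the text.

```javascript
// Illustrative sketch only. The real parser and registry live in
// ai/runtime/model-refs.js and ai/runtime/registry.js and may differ.

// Reserved runtime IDs named in the architecture notes.
const RESERVED_PREFIXES = new Set(["openai", "vercel", "claude-code"]);

// Hypothetical helper: split "prefix:rest" at the first colon.
// Returns null when either side would be empty.
function splitAtFirstColon(value) {
  const index = value.indexOf(":");
  if (index <= 0 || index === value.length - 1) return null;
  return [value.slice(0, index), value.slice(index + 1)];
}

function parseModelRefSketch(ref) {
  if (typeof ref !== "string") {
    throw new TypeError("model reference must be a string");
  }
  const parts = splitAtFirstColon(ref);
  if (parts === null) throw new Error(`malformed model reference: ${ref}`);
  const [sdk, rest] = parts;
  if (RESERVED_PREFIXES.has(sdk)) {
    throw new Error(`reserved runtime ID: ${sdk}`);
  }
  if (sdk === "claude" || sdk === "codex") {
    return { sdk, modelId: rest };
  }
  if (sdk === "pi") {
    // Assumption: everything after the provider ID is the model name,
    // so model names that contain colons stay intact.
    const piParts = splitAtFirstColon(rest);
    if (piParts === null) throw new Error(`malformed pi reference: ${ref}`);
    return { sdk, providerId: piParts[0], modelName: piParts[1] };
  }
  throw new Error(`unknown runtime prefix: ${sdk}`);
}

// Bridge names follow the Runtime Selection diagram.
function resolveBridgeSketch(parsed, executionMode) {
  if (parsed.sdk === "pi") return "pi";
  if (parsed.sdk === "claude" && executionMode === "sdk") return "claude";
  if (parsed.sdk === "claude" && executionMode === "cli") return "claude-code";
  if (parsed.sdk === "codex" && executionMode === "cli") return "codex-app";
  // Assumption: combinations the diagram does not list are rejected.
  throw new Error(`no bridge for ${parsed.sdk} in ${executionMode} mode`);
}
```

Under these assumptions, `resolveBridgeSketch(parseModelRefSketch("claude:<modelId>"), "cli")` selects the `claude-code` bridge, while a reference such as `openai:<modelId>` is rejected before any bridge lookup happens.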
+## Run Lifecycle
+```mermaid
+sequenceDiagram
+  participant Host as Host app
+  participant Runtime as createRuntime()
+  participant Registry as Bridge registry
+  participant Bridge as Provider bridge
+  participant Kernel as Agent kernel
+  participant Provider as SDK / CLI / app-server
+  participant Observer as Observer hub
+  Host->>Runtime: run(systemPrompt, options)
+  Runtime->>Registry: resolveRuntimeBridge(model, executionMode)
+  Registry-->>Runtime: bridge.execute()
+  Runtime->>Observer: create hub from host + call observers
+  Runtime->>Bridge: execute(systemPrompt, normalized options)
+  Bridge->>Kernel: prepare tools, MCP, approvals, limits
+  Kernel-->>Bridge: provider-specific tool surface
+  Bridge->>Provider: send prompt, messages, tools, schema, settings
+  loop streaming events
+    Provider-->>Bridge: assistant/tool/result/provider events
+    Bridge->>Observer: normalized runtime events
+    Bridge->>Kernel: execute built-in/MCP tools as needed
+    Kernel-->>Bridge: tool results or tool errors
+  end
+  Bridge-->>Runtime: RuntimeResult
+  Runtime->>Observer: flush()
+  Runtime-->>Host: text, structuredResult, events, usage, diagnostics
+  Host->>Host: validate/parse host-specific contract
+```
+The package forwards provider structured output as `structuredResult`, but it
+does not validate that output against a host domain schema. Hosts own that
+validation and any state-machine side effects.
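Because the runtime forwards `structuredResult` unvalidated, a host needs its own check before acting on it. The sketch below shows one way that host-side step might look. The contract shape (`status`, `summary`), the allowed status values, and the function name `validateHostContract` are hypothetical; only the `structuredResult` field name comes from the text.

```javascript
// Illustrative host-side sketch; none of this is part of the package.
// Hypothetical contract: { status: "done" | "blocked", summary: string }.
const ALLOWED_STATUSES = new Set(["done", "blocked"]);

function validateHostContract(runtimeResult) {
  const value = runtimeResult ? runtimeResult.structuredResult : undefined;

  // Reject missing, null, primitive, and array values up front.
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return { ok: false, errors: ["structuredResult is not an object"] };
  }

  const errors = [];
  if (!ALLOWED_STATUSES.has(value.status)) {
    errors.push("status must be one of: done, blocked");
  }
  if (typeof value.summary !== "string" || value.summary.trim() === "") {
    errors.push("summary must be a non-empty string");
  }
  if (errors.length > 0) return { ok: false, errors };

  // Copy only the known fields so unexpected keys never reach host state.
  return { ok: true, value: { status: value.status, summary: value.summary } };
}
```

Returning a result object instead of throwing lets the host decide whether a contract violation should trigger a retry, a continuation, or a user prompt, which the document assigns to the host.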
+## Main Subsystems
+```mermaid
+flowchart TB
+  Public["Public API<br/>src/index.js"] --> RuntimeFactory["runtime.js<br/>createRuntime()"]
+  Public --> Router["ai/runtime/router.js<br/>createRouterRuntime()"]
+  Public --> AIExports["ai/index.js<br/>model refs, registry, observers"]
+  Public --> AgentExports["agent/index.js<br/>allowlists, compaction,<br/>approvals, transcript"]
+  RuntimeFactory --> Registry["ai/runtime/registry.js"]
+  Registry --> Providers["ai/providers/*"]
+  Providers --> Claude["claude-sdk.js"]
+  Providers --> ClaudeCode["claude-cli.js"]
+  Providers --> Pi["pi-sdk.js<br/>pi-models/messages/events"]
+  Providers --> Codex["codex-app.js"]
+  AgentExports --> Tools["agent/tools/*"]
+  Tools --> ToolRuntime["shared/runtime-context.js<br/>workspace, repoRoot, rg, brand"]
+  Tools --> PiBridge["tools/pi-bridge.js<br/>built-ins + MCP adaptation"]
+  AgentExports --> Compaction["agent/compaction.js"]
+  AgentExports --> Transcript["agent/transcript.js"]
+  AgentExports --> Approval["agent/approval.js"]
+  AgentExports --> Bloat["agent/tool-bloat.js"]
+  AIExports --> Failure["ai/failure.js"]
+  AIExports --> Cost["ai/cost.js"]
+  AIExports --> Observer["ai/observer.js"]
+  AIExports --> Capabilities["ai/runtime/capabilities*.js"]
+```
+Key responsibilities by subsystem:
+- `runtime.js`: binds host callbacks once, configures tool runtime context, and
+  routes each call to the resolved bridge.
+- `ai/runtime/registry.js`: maps model reference plus execution mode to one of
+  the built-in provider bridges.
+- `ai/runtime/router.js`: retries across an ordered fallback chain on retryable
+  provider failures, carrying a transcript-tail resume snapshot forward.
+- `ai/providers/*`: owns provider-specific request shapes, event conversion,
+  structured-output extraction, native subagent wiring, usage, and diagnostics.
+- `agent/tools/*`: implements built-in tools, path/workdir guards, sandbox
+  policy checks, MCP tool adaptation, Playwright artifact routing, and output
+  limits.
+- `agent/compaction.js`: estimates context pressure and compacts long agent
+  conversations for providers that support the package's compaction loop.
+- `agent/transcript.js`: builds bounded resume snapshots from prior provider
+  events so a fallback or continuation can keep context.
+- `agent/approval.js`: provides host-driven human-in-the-loop tool approval
+  gates where the backend supports runtime tool dispatch.
+- `ai/failure.js`: normalizes spawn, usage-limit, provider, cancellation, and
+  retryability decisions into stable failure kinds.
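The fallback behavior attributed to `ai/runtime/router.js` (retry across an ordered chain on retryable failures, carrying a transcript tail forward) can be sketched as a small loop. This is not the package's `createRouterRuntime()`: the names `runWithFallback`, `buildResumeSnapshot`, and `SNAPSHOT_EVENT_LIMIT`, the `runOnce(model, resumeSnapshot)` signature, the host-supplied `isRetryable` predicate, and the event-count bound are all assumptions made for illustration.

```javascript
// Illustrative sketch only; not the package's router implementation.
// Assumptions: runOnce(model, resumeSnapshot) resolves to an object shaped
// like { text, events, failureKind }, and a falsy failureKind means success.

const SNAPSHOT_EVENT_LIMIT = 20; // hypothetical bound on the transcript tail

// Keep only the most recent events so the snapshot stays bounded.
function buildResumeSnapshot(events, limit = SNAPSHOT_EVENT_LIMIT) {
  return Array.isArray(events) ? events.slice(-limit) : [];
}

async function runWithFallback(chain, runOnce, isRetryable) {
  if (!Array.isArray(chain) || chain.length === 0) {
    throw new Error("fallback chain must be a non-empty array");
  }
  let resumeSnapshot = [];
  let lastResult = null;
  const attempts = [];

  for (const model of chain) {
    lastResult = await runOnce(model, resumeSnapshot);
    attempts.push({ model, failureKind: lastResult.failureKind ?? null });

    // Success: stop at the first backend that completes the turn.
    if (!lastResult.failureKind) break;

    // Non-retryable failures are surfaced without trying later backends.
    if (!isRetryable(lastResult.failureKind)) break;

    // Carry the tail of everything seen so far into the next attempt.
    resumeSnapshot = buildResumeSnapshot([
      ...resumeSnapshot,
      ...(lastResult.events ?? []),
    ]);
  }
  return { result: lastResult, attempts };
}
```

Recording every attempt alongside the final result gives the host enough information to log which backends were tried and why each one stopped.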
+## Host Responsibilities
+```mermaid
+flowchart LR
+  Host["Host app"] --> Pricing["resolveCustomPricing"]
+  Host --> Auth["resolvePiApiKey"]
+  Host --> Persist["persistArtifact"]
+  Host --> Compact["onCompactionRecorded"]
+  Host --> Approval["onToolApprovalRequest"]
+  Host --> Brand["runtimeBrand"]
+  Host --> Roots["workspace / repoRoot / ripgrepPath"]
+  Pricing --> Runtime["agent-runtime host callbacks"]
+  Auth --> Runtime
+  Persist --> Runtime
+  Compact --> Runtime
+  Approval --> Runtime
+  Brand --> Runtime
+  Roots --> Runtime
+  Runtime --> Raw["Raw runtime result"]
+  Raw --> Domain["Host-owned domain validation<br/>result contract, state machine,<br/>DB writes, UI surfaces"]
+```
+The host is responsible for:
+- resolving credentials and custom provider/model rows before provider calls
+- choosing model references, execution mode, effort, fallback chains, and
+  runtime settings
+- persisting artifacts, compaction rows, raw logs, run rows, and UI-facing state
+- validating structured output against the host's domain contract
+- converting runtime failures into product workflow behavior
+- deciding when to retry, recover, continue, cancel, or ask for user input
+## Essential Takeaway
+Think of `@mono-agent/agent-runtime` as the portable agent process engine
+underneath a host app. The host decides what a task means, which agent should
+run, how state changes, and how results are persisted. The runtime decides how
+to talk to Claude, Pi, and Codex execution surfaces; how tools are exposed; how
+provider failures are normalized; and how enough telemetry is returned for a
+host to make reliable orchestration decisions.