npm - @cuylabs/agent-runtime-dapr - Versions diffs - 0.9.0 → 0.11.0 - Mend

@cuylabs/agent-runtime-dapr 0.9.0 → 0.11.0


Files changed (31)

package/README.md +154 -19
package/dist/{chunk-2CEICSJH.js → chunk-5CJIC4YB.js} +184 -38
package/dist/{chunk-A34CHK2E.js → chunk-MQJ4LZOX.js} +30 -4
package/dist/chunk-O7H3XGY2.js +11222 -0
package/dist/chunk-YQQTUE6B.js +993 -0
package/dist/chunk-YS2CWYBQ.js +1358 -0
package/dist/client-UsEIzDF6.d.ts +322 -0
package/dist/dispatch/index.d.ts +9 -0
package/dist/dispatch/index.js +17 -0
package/dist/execution/index.d.ts +5 -4
package/dist/execution/index.js +2 -2
package/dist/host/index.d.ts +8 -4
package/dist/host/index.js +28 -8
package/dist/index-BY0FipV1.d.ts +770 -0
package/dist/index-CFm5LORU.d.ts +63 -0
package/dist/index-UtePd9on.d.ts +101 -0
package/dist/index.d.ts +62 -14
package/dist/index.js +76 -6
package/dist/invoker-B6ikdYaz.d.ts +50 -0
package/dist/{store-pRLGfYhN.d.ts → store-BXBIDz40.d.ts} +24 -3
package/dist/team/index.d.ts +612 -0
package/dist/team/index.js +30 -0
package/dist/worker-CXq0IFGX.d.ts +42 -0
package/dist/workflow/index.d.ts +4 -225
package/dist/workflow/index.js +2 -2
package/dist/{workflow-bridge-C8Z1yr0Y.d.ts → workflow-bridge-BcicHH1Y.d.ts} +4 -2
package/dist/workflow-host-D6W6fXoL.d.ts +459 -0
package/package.json +16 -6
package/dist/chunk-DILON56B.js +0 -668
package/dist/chunk-R47X4FG2.js +0 -2009
package/dist/index-BCMkUMAf.d.ts +0 -564

package/README.md CHANGED

@@ -19,6 +19,37 @@ It builds on:
 - `agent-core` for task and turn execution semantics
 - `agent-runtime` for the outer workload runtime contract
+When paired with `@cuylabs/agent-server`, this package should sit behind the
+same session/turn surface rather than replacing it. Use
+`createDaprAgentServerAdapter(runner)` when you want `agent-server`
+transports like WebSocket to route turns, steering, follow-ups, and
+interactive requests through the same Dapr workflow runtime as the hosted
+HTTP routes.
+## Why This Package Is Bigger Than A Simple Driver
+`agent-runtime-dapr` has two roles:
+1. It implements shared runtime contracts from `@cuylabs/agent-runtime`
+2. It exposes Dapr-native helpers that should stay outside the shared contract
+The first category is the portability seam:
+- `DaprRuntimeDriver` implements `RuntimeDriver`
+- `DaprOrchestratorRunStore` implements `OrchestratorRunStore`
+- `createDaprWorkloadRuntime(...)` builds a `WorkloadRuntime` with those pieces
+The second category is intentionally Dapr-specific:
+- workflow clients and workflow activities
+- HTTP host/runners
+- sidecar job callbacks
+- execution checkpoint persistence
+- cross-service invocation helpers
+Those features are not drift in the base runtime contract. They are adapter
+surfaces that exist because Dapr offers more than a generic scheduler/store.
 ## Why Dapr?
 Dapr provides the durable infrastructure while your agent owns the intelligence:
@@ -53,6 +84,40 @@ Under the hood, this package now exposes two layers:
 - `createDaprAgentRuntime(...)` and `createDaprAgentRunner(...)` as the
   `agent-core`-specific adapters built on top of that
+The rule is:
+- if your code only needs portable scheduling/orchestration, target `agent-runtime`
+- if your code wants Dapr durability or Dapr host capabilities, opt into this package explicitly
+## Tool Hosts And Durable Turns
+`ToolHost` configuration still belongs on the agent, not on the Dapr runner.
+```ts
+import { WorkflowRuntime } from "@dapr/dapr";
+import { createAgent } from "@cuylabs/agent-core";
+import { dockerHost } from "@cuylabs/agent-sandbox-docker";
+import { createDaprAgentRunner } from "@cuylabs/agent-runtime-dapr";
+const agent = createAgent({
+  model,
+  host: dockerHost({ image: "node:22", workspaceDir: "/workspace" }),
+  tools,
+});
+const runner = createDaprAgentRunner({
+  agent,
+  name: "my-agent",
+  workflowRuntime: new WorkflowRuntime(),
+});
+```
+In direct mode and durable mode, host-backed tools use the same `agent-core`
+execution seam. Dapr persists workflow state around the tool call, but the
+tool still executes through `agent.getHost()`.
+For the full explanation, see [Tool Hosts In Durable Workflows](docs/tool-hosts.md).
 ## Quick Start
 ### Step 1: Define your agent
@@ -110,8 +175,8 @@ curl -s http://localhost:3000/agents/run \
   -H "Content-Type: application/json" \
   -d '{"message": "Greet Carlos"}' | jq
-# Durable workflow (async, crash-recoverable)
-curl -s http://localhost:3000/agents/workflow \
+# Durable run (async, crash-recoverable)
+curl -s http://localhost:3000/agents/run-durable \
   -H "Content-Type: application/json" \
   -d '{"message": "Greet Carlos"}' | jq
 ```
@@ -154,17 +219,38 @@ Every agent host exposes two ways to run a turn:
 | Mode | Endpoint | Behavior |
 |------|----------|----------|
 | **Direct** | `POST /agents/run` | Synchronous. Returns result in the HTTP response. State is persisted, but execution is not crash-recoverable. |
-| **Workflow** | `POST /agents/workflow` | Asynchronous. Returns `202` with an `instanceId` immediately. The turn runs as a Dapr workflow — crash-safe with activity-level checkpoints. |
+| **Durable** | `POST /agents/run-durable` | Asynchronous. Returns `202` with an `instanceId` immediately. The turn runs as a Dapr workflow — crash-safe with activity-level checkpoints. |
-The workflow decomposes each turn into four activities:
+The workflow decomposes each turn into five activities:
 ```
-model-step → tool-call → step-commit → output-commit
+input-commit → model-step → tool-call → step-commit → output-commit
 ```
 Each activity is a checkpoint. If the process crashes after `tool-call`, Dapr
 replays from that point — the model call and tool execution don't repeat.
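
To make the checkpoint behaviour concrete, here is a minimal in-memory simulation of the five-activity sequence. It is a sketch under assumptions, not the package's workflow code: `runTurn`, the `history` map, and the `crashAfter` parameter are hypothetical, and a real Dapr workflow replays from persisted workflow history rather than from a plain map.

```typescript
// Conceptual simulation only: the activity names come from the README,
// everything else (runTurn, history, crashAfter) is hypothetical.
const ACTIVITIES = [
  "input-commit",
  "model-step",
  "tool-call",
  "step-commit",
  "output-commit",
] as const;

type Activity = (typeof ACTIVITIES)[number];

// Stands in for durable workflow history: results of finished activities.
type History = Map<Activity, string>;

function runTurn(
  history: History,
  executed: Activity[],
  crashAfter?: Activity,
): boolean {
  for (const name of ACTIVITIES) {
    if (history.has(name)) continue; // already checkpointed: skipped on replay
    executed.push(name); // the side effect actually runs
    history.set(name, `${name}:done`); // checkpoint the result
    if (name === crashAfter) return false; // simulate a process crash
  }
  return true; // turn finished
}

const history: History = new Map();
const firstAttempt: Activity[] = [];
const secondAttempt: Activity[] = [];

// First run "crashes" right after tool-call has been checkpointed.
const finishedFirst = runTurn(history, firstAttempt, "tool-call");
// Second run resumes: only the remaining activities execute.
const finishedSecond = runTurn(history, secondAttempt);
```

In this simulation the second attempt executes only `step-commit` and `output-commit`, which mirrors the README's claim that the model call and tool execution are not repeated.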
+## Team Coordination
+`createDaprTeamRunner()` applies the same split to multi-agent coordination:
+- Vocabulary:
+  - `run()` = direct, in-process coordinator execution
+  - `runDurable()` = start the durable root coordinator workflow
+  - child workflow = one durable member task execution started by the root
+  - `waitForDurableRun()` = external polling helper for the root workflow
+- `run(prompt)` keeps the coordinator loop in-process while using Dapr-backed stores.
+- `runDurable(prompt, options?)` starts a durable coordinator workflow and returns `{ teamId, workflowName, coordinatorSessionId, instanceId }`.
+- `getDurableRun(instanceId)` reads workflow status and extracts the final coordinator result when present.
+- `waitForDurableRun(instanceId, options?)` is the explicit edge-level wait helper when you want to block for completion.
+The HTTP surface mirrors that programmatic contract:
+- `POST /team/run`
+- `POST /team/run-durable`
+- `GET /team/workflows/:instanceId`
 ## HTTP API Reference
 | Method | Path | Description |
@@ -175,12 +261,34 @@ replays from that point — the model call and tool execution don't repeat.
 | `GET` | `/readyz` | Readiness alias |
 | `GET` | `/agents` | List registered agents |
 | `POST` | `/agents/run` | Run agent turn (direct) |
-| `POST` | `/agents/workflow` | Run agent turn (durable workflow) |
+| `POST` | `/agents/run-durable` | Run agent turn (durable) |
 | `POST` | `/agents/:id/run` | Run specific agent (direct) |
-| `POST` | `/agents/:id/workflow` | Run specific agent (durable workflow) |
+| `POST` | `/agents/:id/run-durable` | Run specific agent (durable) |
+| `GET` | `/agents/inputs` | List durable human input requests for the single hosted agent |
+| `GET` | `/agents/inputs/:requestId` | Get durable human input request for the single hosted agent |
+| `POST` | `/agents/inputs/:requestId/respond` | Resolve durable human input request for the single hosted agent |
+| `GET` | `/agents/approvals` | List durable approval requests for the single hosted agent |
+| `GET` | `/agents/approvals/:requestId` | Get durable approval request for the single hosted agent |
+| `POST` | `/agents/approvals/:requestId/respond` | Resolve durable approval for the single hosted agent |
+| `GET` | `/agents/:id/inputs` | List durable human input requests |
+| `GET` | `/agents/:id/inputs/:requestId` | Get durable human input request |
+| `POST` | `/agents/:id/inputs/:requestId/respond` | Resolve durable human input request |
+| `GET` | `/agents/:id/approvals` | List durable approval requests |
+| `GET` | `/agents/:id/approvals/:requestId` | Get durable approval request |
+| `POST` | `/agents/:id/approvals/:requestId/respond` | Resolve durable approval with `allow`, `deny`, or `remember` |
 | `GET` | `/agents/:id/executions/:sessionId` | Get execution details |
 | `GET` | `/agents/:id/executions/:sessionId/checkpoints` | Get execution checkpoints |
 | `GET` | `/agents/:id/workflows/:instanceId` | Get workflow state |
+| `POST` | `/agents/:id/workflows/:instanceId/terminate` | Terminate a running workflow |
+| `POST` | `/agents/steer` | Inject steering message (single-agent host) |
+| `POST` | `/agents/:id/steer` | Inject steering message into running workflow |
+| `POST` | `/agents/follow-up` | Queue follow-up message (single-agent host) |
+| `POST` | `/agents/:id/follow-up` | Queue follow-up for after current turn |
+| `GET` | `/agents/follow-ups` | List follow-up requests (single-agent host) |
+| `GET` | `/agents/:id/follow-ups` | List follow-up requests |
+| `GET` | `/agents/:id/events/:sessionId` | SSE stream of agent events |
+| `GET` | `/dapr/subscribe` | Dapr pub/sub subscription declaration |
+| `POST` | `/dapr/:topic` | Dapr pub/sub event delivery callback |
 | `POST` | `/job/:name` | Handle Dapr scheduled job trigger |
 ## Runner Options
@@ -194,6 +302,7 @@ replays from that point — the model call and tool execution don't repeat.
 | `workflowRuntime` | Yes | — | `new WorkflowRuntime()` from `@dapr/dapr` |
 | `daprHttpEndpoint` | No | `http://$DAPR_HOST:$DAPR_HTTP_PORT` | Sidecar HTTP endpoint |
 | `stateStoreName` | No | `"statestore"` | Dapr state store component |
+| `workflowComponent` | No | `"dapr"` | Dapr workflow component name |
 | `driverOptions` | No | — | Advanced Dapr runtime driver options: API token, retries, timeouts, custom `fetch`, sidecar verification |
 | `observers` | No | `[]` | Extra execution lifecycle observers |
 | `logging` | No | `true` | Enable/disable console logging |
@@ -203,10 +312,19 @@ replays from that point — the model call and tool execution don't repeat.
 The runner returns an object with:
 - `start()` — start runtime and workflow worker
+- `createHttpHandler(options?)` — build the Dapr host HTTP handler for embedding in a custom server
+- `agentServerCapabilities()` — capabilities patch describing the Dapr-backed runtime
 - `serve(options?)` — start HTTP server, block on SIGINT/SIGTERM
 - `run(message, options?)` — run a task programmatically
+- `runDurable(message, options?)` — start a durable turn programmatically
 - `stop()` — graceful shutdown
+`serve(options?)` also accepts lightweight UI-hosting options:
+- `staticDir` — serve static files before the built-in agent routes
+- `indexFile` — file served for `/` when `staticDir` is configured
+- `extraRoutes` — exact-match custom routes layered ahead of static assets and agent APIs
 Runner startup is transactional: if the workflow worker fails to start, the
 runtime is stopped before the error is returned.
@@ -236,6 +354,7 @@ invocation), the package also exports the lower-level building blocks:
 | Helper | Purpose |
 |--------|---------|
 | `createDaprAgentWorkflowHost()` | Wrap an Agent into a workflow host |
+| `createDaprAgentServerAdapter()` | Bridge `@cuylabs/agent-server` to the Dapr workflow runtime |
 | `createDaprWorkflowWorker()` | Register workflow hosts in a WorkflowRuntime |
 | `createDaprWorkloadRuntime()` | Dapr-backed runtime bundle for generic workloads |
 | `createDaprAgentRuntime()` | Create runtime bundle (scheduling + runner + store) |
@@ -246,27 +365,40 @@ invocation), the package also exports the lower-level building blocks:
 | `createDaprExecutionObserver()` | Persist execution events to the store |
 | `createDaprLoggingObserver()` | Console logging for execution lifecycle |
 | `DaprServiceInvoker` | Call agents across Dapr service boundaries |
+| `invokeRemoteAgentRun()` | Convenience wrapper for cross-service agent calls |
+| `createRemoteAgentTool()` | Create a tool that invokes a remote Dapr agent |
+| `createDaprWorkflowApprovalRuntime()` | Durable approval runtime |
+| `createDaprWorkflowHumanInputRuntime()` | Durable human-input runtime |
+| `createDaprWorkflowSteerRuntime()` | Durable steering runtime |
+| `createDaprWorkflowFollowUpRuntime()` | Durable follow-up runtime |
+| `createDaprHostHttpHandler()` | Build `Request → Response` handler for custom servers |
+| `createEventBus()` | In-process event bus for SSE streaming |
+| `createDaprPubSubEventBridge()` | Multi-instance event fan-out via Dapr pub/sub |
+| `createDaprDispatchRuntime()` | Dapr-backed async dispatch runtime |
+| `createDaprTeamRunner()` | Multi-agent team runner with durable coordination |
 See the [docs/](docs/) folder for detailed guides:
 - [Architecture](docs/architecture.md) — how the three packages compose
-- [Workflow Internals](docs/workflow-internals.md) — the 4-activity decomposition
-- [API Reference](docs/api-reference.md) — all exported types and functions
+- [Workflow Internals](docs/durability/workflow-internals.md) — the 5-activity decomposition
+- [Durable Tool Approvals](docs/hitl/durable-tool-approvals.md) — how approval middleware pauses and resumes Dapr workflows
+- [Durable Human Input](docs/hitl/durable-human-input.md) — how the built-in `question` tool pauses and resumes Dapr workflows
+- [API Reference](docs/api-reference.md) — all exported types and functions, including event streaming
 - [Advanced Patterns](docs/advanced-patterns.md) — cross-service invocation, custom observers, etc.
 ## Runtime Boundary
 The package layering is:
-- `agent-core`: agent turn/task semantics
+- `agent-core`: agent turn/task semantics, EventBus interface, AgentSignal
 - `agent-runtime`: generic workload orchestration contract
-- `agent-runtime-dapr`: Dapr-backed implementation of that contract
+- `agent-runtime-dapr`: Dapr-backed implementation of that contract, plus `DaprPubSubEventBridge` for multi-instance event fan-out
 `agent-runtime-dapr` integrates with those lower layers in two different ways:
 - outer workload path: it uses `agent-runtime` to schedule, dispatch, retry,
   and observe jobs
-- inner durable turn path: it uses `agent-core` runtime primitives to split one
+- inner durable turn path: it uses `agent-core` execution primitives to split one
   agent turn into durable workflow activities such as `model-step`,
   `tool-call`, `step-commit`, and `output-commit`
@@ -284,12 +416,15 @@ higher-level `createDaprAgentRunner(...)`.
 The [`examples/`](examples/) directory has complete, runnable scripts:
-| Script | Lines | Description |
-|--------|-------|-------------|
-| [`simple-agent.ts`](examples/simple-agent.ts) | ~55 | Minimal agent with one tool |
-| [`coding-agent.ts`](examples/coding-agent.ts) | ~45 | File-system tools via `@cuylabs/agent-code` |
-| [`multi-agent.ts`](examples/multi-agent.ts) | ~85 | Two agents in one process |
-| [`maintenance-host.ts`](examples/maintenance-host.ts) | ~200 | Scheduled cleanup worker with `/metrics` and Dapr job callbacks |
+| Script | Description |
+|--------|-------------|
+| [`01-simple-agent.ts`](examples/01-simple-agent.ts) | Minimal agent with one tool |
+| [`02-coding-agent.ts`](examples/02-coding-agent.ts) | File-system tools via `@cuylabs/agent-code` |
+| [`03-multi-agent.ts`](examples/03-multi-agent.ts) | Two agents in one process |
+| [`04-crash-recovery.ts`](examples/04-crash-recovery.ts) | Process crash mid-turn, Dapr auto-resumes |
+| [`05-tracing-zipkin.ts`](examples/05-tracing-zipkin.ts) | OpenTelemetry tracing → Zipkin |
+| [`06-tracing-phoenix.ts`](examples/06-tracing-phoenix.ts) | OpenTelemetry tracing → Arize Phoenix |
+| [`07-maintenance-host.ts`](examples/07-maintenance-host.ts) | Retention jobs + Prometheus metrics |
 See the [examples README](examples/README.md) for step-by-step setup and usage.
@@ -302,7 +437,7 @@ See the [examples README](examples/README.md) for step-by-step setup and usage.
 - `GET /ready` and `GET /readyz` report runtime, worker, sidecar, and state-store readiness
 - Dapr Jobs API calls are isolated behind an internal adapter so scheduler changes stay local to the Dapr package
 - Use `DaprExecutionStore.cleanup(...)` and `DaprOrchestratorRunStore.cleanup(...)` to enforce retention budgets
-- For a concrete operational service, see [`examples/maintenance-host.ts`](examples/maintenance-host.ts)
+- For a concrete operational service, see [`examples/07-maintenance-host.ts`](examples/07-maintenance-host.ts)
 - For containers: run one sidecar per app process, point `daprHttpEndpoint` at the local sidecar
 ## License

package/dist/{chunk-2CEICSJH.js → chunk-5CJIC4YB.js} RENAMED

@@ -1,7 +1,7 @@
 import {
   DaprSidecarClient,
   isDaprConflictError
-} from "./chunk-A34CHK2E.js";
+} from "./chunk-MQJ4LZOX.js";
 // src/execution/store.ts
 var DEFAULT_KEY_PREFIX = "agent-runtime:execution:";
@@ -96,6 +96,57 @@ var DaprExecutionStore = class {
       "keyPrefix"
     );
   }
+  // ── ExecutionStore interface (generic) ───────────────────────────────
+  async get(sessionId) {
+    const record = await this.getExecution(sessionId);
+    return record ? toGenericRunRecord(record) : void 0;
+  }
+  async list(options) {
+    const records = await this.listExecutions();
+    let filtered = records;
+    if (options?.status) {
+      const statuses = Array.isArray(options.status) ? options.status : [options.status];
+      filtered = filtered.filter((r) => statuses.includes(r.status));
+    }
+    if (options?.limit !== void 0) {
+      filtered = filtered.slice(0, options.limit);
+    }
+    return filtered.map(toGenericRunRecord);
+  }
+  async listGenericCheckpoints(sessionId) {
+    const records = await this.listCheckpoints(sessionId);
+    return records.map(toGenericCheckpointRecord);
+  }
+  async remove(sessionId) {
+    const existing = await this.getExecution(sessionId);
+    if (!existing) return false;
+    await this.deleteExecutionRecord(sessionId);
+    return true;
+  }
+  // ── Resume helper ────────────────────────────────────────────────────
+  /**
+   * Build a resume snapshot from a persisted execution record.
+   *
+   * Returns `undefined` if no execution exists for the session or the
+   * execution is already in a terminal state (completed/failed).
+   *
+   * The returned snapshot can be passed as `context.restoreFrom` to
+   * `createAgentTaskRunner(...)` to resume the direct-path execution
+   * from where it left off.
+   */
+  async buildResumeSnapshot(sessionId) {
+    const record = await this.getExecution(sessionId);
+    if (!record || record.status !== "running") return void 0;
+    return {
+      response: record.snapshot.response,
+      usage: { ...record.snapshot.usage },
+      toolCalls: record.snapshot.toolCalls.map((tc) => ({ ...tc })),
+      step: record.snapshot.activeStep ?? 0,
+      eventCount: record.snapshot.eventCount,
+      startedAt: record.startedAt
+    };
+  }
+  // ── Dapr-specific methods (rich types) ───────────────────────────────
   async getExecution(sessionId) {
     const value = await this.client.getState(
       this.stateKeyForExecution(sessionId)
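
The new `buildResumeSnapshot` only returns a snapshot for records whose status is `"running"`, and it copies `usage` and each tool call before returning them. The typed sketch below restates that logic in isolation. The `ExecutionRecord` and `Usage` shapes are assumptions reduced from the fields the diff touches, not the package's exported types.

```typescript
// Assumed, reduced shapes; the real record types live in the package.
type Usage = { inputTokens: number; outputTokens: number; totalTokens: number };
type ToolCall = { name: string; result: unknown };

interface ExecutionRecord {
  status: "running" | "completed" | "failed";
  startedAt: string;
  snapshot: {
    response: string;
    usage: Usage;
    toolCalls: ToolCall[];
    activeStep?: number;
    eventCount: number;
  };
}

function buildResumeSnapshot(record: ExecutionRecord | undefined) {
  // No record, or a record that is not running, cannot be resumed.
  if (!record || record.status !== "running") return undefined;
  return {
    response: record.snapshot.response,
    usage: { ...record.snapshot.usage }, // shallow copy
    toolCalls: record.snapshot.toolCalls.map((tc) => ({ ...tc })), // shallow copy per call
    step: record.snapshot.activeStep ?? 0,
    eventCount: record.snapshot.eventCount,
    startedAt: record.startedAt,
  };
}

const running: ExecutionRecord = {
  status: "running",
  startedAt: "2024-01-01T00:00:00Z",
  snapshot: {
    response: "partial",
    usage: { inputTokens: 10, outputTokens: 5, totalTokens: 15 },
    toolCalls: [{ name: "greet", result: "hi" }],
    eventCount: 3,
  },
};

const resumed = buildResumeSnapshot(running);
const notResumable = buildResumeSnapshot({ ...running, status: "completed" });
// Mutating the returned copy leaves the stored record untouched.
if (resumed) resumed.toolCalls[0].name = "changed";
```

The copies are shallow, so a nested object inside a tool call's `result` would still be shared between the record and the snapshot.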
@@ -147,9 +198,10 @@ var DaprExecutionStore = class {
     do {
       const response = await this.client.queryState({
         filter: {
-          EQ: {
-            kind: STORED_EXECUTION_CHECKPOINT_KIND
-          }
+          AND: [
+            { EQ: { kind: STORED_EXECUTION_CHECKPOINT_KIND } },
+            { EQ: { "checkpoint.sessionId": sessionId } }
+          ]
         },
         page: {
           limit: 200,
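
The hunk above replaces a single `EQ` on `kind` with an `AND` that also matches `checkpoint.sessionId`, so the query is scoped to one session. As a rough illustration of what the combined filter expresses, the sketch below evaluates `AND`/`EQ` filters against in-memory documents. The evaluator (`matches`, `getPath`), the sample documents, and the `CHECKPOINT_KIND` value are hypothetical; in the package the sidecar's state store evaluates the real query.

```typescript
type EqFilter = { EQ: Record<string, unknown> };
type AndFilter = { AND: Filter[] };
type Filter = EqFilter | AndFilter;

// Resolve a dotted path such as "checkpoint.sessionId" on a nested object.
function getPath(doc: unknown, path: string): unknown {
  let current: unknown = doc;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Hypothetical in-memory evaluator for the two operators used in the diff.
function matches(doc: unknown, filter: Filter): boolean {
  if ("AND" in filter) return filter.AND.every((f) => matches(doc, f));
  return Object.entries(filter.EQ).every(
    ([path, expected]) => getPath(doc, path) === expected,
  );
}

const CHECKPOINT_KIND = "checkpoint-kind"; // placeholder, not the real constant

// Same shape as the filter built in the hunk above.
function checkpointFilter(sessionId: string): Filter {
  return {
    AND: [
      { EQ: { kind: CHECKPOINT_KIND } },
      { EQ: { "checkpoint.sessionId": sessionId } },
    ],
  };
}

const docs = [
  { kind: CHECKPOINT_KIND, checkpoint: { sessionId: "s1", id: "a" } },
  { kind: CHECKPOINT_KIND, checkpoint: { sessionId: "s2", id: "b" } },
  { kind: "other", checkpoint: { sessionId: "s1", id: "c" } },
];

const matchedIds = docs
  .filter((d) => matches(d, checkpointFilter("s1")))
  .map((d) => d.checkpoint.id);
```

Only the document that is both a checkpoint and belongs to session `s1` passes; with the old single `EQ` on `kind`, the `s2` checkpoint would have matched as well.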
@@ -193,13 +245,14 @@ var DaprExecutionStore = class {
     await this.writeCheckpoint(record);
     await this.addCheckpointToIndex(record.sessionId, record.id).catch(() => {
     });
-    const current = await this.getExecution(checkpoint.run.sessionId);
-    const next = current ?? toExecutionRecord(checkpoint.run, checkpoint.snapshot);
-    next.updatedAt = checkpoint.snapshot.updatedAt;
-    next.lastCheckpointReason = checkpoint.reason;
-    next.checkpointCount = (current?.checkpointCount ?? 0) + 1;
-    next.snapshot = toSerializableSnapshot(checkpoint.snapshot);
-    await this.writeExecution(next);
+    await this.updateExecution(checkpoint.run.sessionId, (current) => {
+      const next = current ?? toExecutionRecord(checkpoint.run, checkpoint.snapshot);
+      next.updatedAt = checkpoint.snapshot.updatedAt;
+      next.lastCheckpointReason = checkpoint.reason;
+      next.checkpointCount = (current?.checkpointCount ?? 0) + 1;
+      next.snapshot = toSerializableSnapshot(checkpoint.snapshot);
+      return next;
+    });
   }
   async recordCompletion(run, result, snapshot) {
     const current = await this.getExecution(run.sessionId);
@@ -403,7 +456,7 @@ var DaprExecutionStore = class {
     }
     return present;
   }
-  async writeExecution(record) {
+  async writeExecution(record, etag) {
     const envelope = {
       kind: STORED_EXECUTION_KIND,
       version: STORED_EXECUTION_VERSION,
@@ -411,9 +464,32 @@ var DaprExecutionStore = class {
     };
     await this.client.saveState(
       this.stateKeyForExecution(record.sessionId),
-      envelope
+      envelope,
+      etag ? { etag, concurrency: "first-write" } : {}
     );
   }
+  /**
+   * Read-modify-write the execution record with optimistic concurrency.
+   * Retries on etag conflict up to 4 times.
+   */
+  async updateExecution(sessionId, updater) {
+    for (let attempt = 0; attempt < DEFAULT_INDEX_UPDATE_RETRIES; attempt += 1) {
+      const entry = await this.client.getStateEntry(
+        this.stateKeyForExecution(sessionId)
+      );
+      const current = this.decodeExecution(entry.value);
+      const next = updater(current);
+      try {
+        await this.writeExecution(next, entry.etag);
+        return;
+      } catch (error) {
+        if (isDaprConflictError(error) && attempt + 1 < DEFAULT_INDEX_UPDATE_RETRIES) {
+          continue;
+        }
+        throw error;
+      }
+    }
+  }
   async writeCheckpoint(record) {
     const envelope = {
       kind: STORED_EXECUTION_CHECKPOINT_KIND,
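
The `updateExecution` helper added above is a read-modify-write loop guarded by an etag. The following sketch shows the same pattern against an in-memory store so the retry path can be exercised without a sidecar. It is synchronous for brevity, and `EtagStore`, `ConflictError`, `isConflict`, `update`, and the `beforeNextWrite` hook are all hypothetical stand-ins, not the `DaprSidecarClient` API.

```typescript
// In-memory stand-in for a state store with etags (hypothetical).
class ConflictError extends Error {
  readonly conflict = true;
}

function isConflict(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { conflict?: boolean }).conflict === true
  );
}

class EtagStore<T> {
  private value: T | undefined;
  private version = 0;
  // Test hook: runs once, just before the next write is validated.
  beforeNextWrite?: () => void;

  read(): { value: T | undefined; etag: string } {
    return { value: this.value, etag: String(this.version) };
  }

  // With an etag the write succeeds only if nobody wrote since the read.
  write(value: T, etag?: string): void {
    const hook = this.beforeNextWrite;
    this.beforeNextWrite = undefined;
    hook?.();
    if (etag !== undefined && etag !== String(this.version)) {
      throw new ConflictError("etag mismatch");
    }
    this.value = value;
    this.version += 1;
  }
}

// Returns the number of attempts it took to commit the update.
function update<T>(
  store: EtagStore<T>,
  updater: (current: T | undefined) => T,
  maxAttempts = 4,
): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const { value, etag } = store.read();
    try {
      store.write(updater(value), etag);
      return attempt;
    } catch (error) {
      if (isConflict(error) && attempt < maxAttempts) continue; // re-read, retry
      throw error;
    }
  }
  throw new Error("maxAttempts must be at least 1");
}

const store = new EtagStore<{ checkpointCount: number }>();
store.write({ checkpointCount: 1 });
// Simulate a concurrent writer landing between our read and our write.
store.beforeNextWrite = () => store.write({ checkpointCount: 2 });
const attempts = update(store, (current) => ({
  checkpointCount: (current?.checkpointCount ?? 0) + 1,
}));
const finalCount = store.read().value?.checkpointCount;
```

Because the updater runs again on the re-read value, the concurrent increment is preserved instead of being overwritten, which is the lost-update problem the previous unconditional `writeExecution` call in `recordCheckpoint` was exposed to.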
@@ -462,6 +538,54 @@ var DaprExecutionStore = class {
     return void 0;
   }
 };
+function toGenericRunRecord(record) {
+  return {
+    sessionId: record.sessionId,
+    status: record.status,
+    startedAt: record.startedAt,
+    updatedAt: record.updatedAt,
+    completedAt: record.completedAt,
+    checkpointCount: record.checkpointCount,
+    lastCheckpointReason: record.lastCheckpointReason,
+    snapshot: {
+      sessionId: record.snapshot.sessionId,
+      response: record.snapshot.response,
+      usage: { ...record.snapshot.usage },
+      toolCalls: record.snapshot.toolCalls.map((tc) => ({ ...tc })),
+      eventCount: record.snapshot.eventCount,
+      activeStep: record.snapshot.activeStep,
+      error: record.snapshot.error,
+      startedAt: record.snapshot.startedAt,
+      updatedAt: record.snapshot.updatedAt
+    },
+    result: record.result ? {
+      response: record.result.response,
+      sessionId: record.result.sessionId,
+      usage: { ...record.result.usage },
+      toolCalls: record.result.toolCalls.map((tc) => ({ ...tc }))
+    } : void 0,
+    error: record.error ? { ...record.error } : void 0
+  };
+}
+function toGenericCheckpointRecord(record) {
+  return {
+    id: record.id,
+    sessionId: record.sessionId,
+    reason: record.reason,
+    snapshot: {
+      sessionId: record.snapshot.sessionId,
+      response: record.snapshot.response,
+      usage: { ...record.snapshot.usage },
+      toolCalls: record.snapshot.toolCalls.map((tc) => ({ ...tc })),
+      eventCount: record.snapshot.eventCount,
+      activeStep: record.snapshot.activeStep,
+      error: record.snapshot.error,
+      startedAt: record.snapshot.startedAt,
+      updatedAt: record.snapshot.updatedAt
+    },
+    createdAt: record.createdAt
+  };
+}
 // src/execution/observer.ts
 var DaprExecutionObserver = class {
@@ -546,11 +670,12 @@ var EMPTY_USAGE = {
   outputTokens: 0,
   totalTokens: 0
 };
-function buildRun(state, payload, trigger) {
+function buildRun(state, payload, trigger, executionId) {
   return {
     payload,
     context: { trigger },
     sessionId: state.sessionId,
+    executionId: executionId ?? `${state.sessionId}:${state.startedAt}`,
     startedAt: state.startedAt
   };
 }
@@ -558,7 +683,7 @@ function collectToolCalls(state) {
   const toolCalls = [];
   for (const msg of state.messages) {
     if (msg.role === "tool") {
-      toolCalls.push({ name: msg.toolName, result: msg.content });
+      toolCalls.push({ name: msg.toolName, result: msg.result });
     }
   }
   return toolCalls;
@@ -569,16 +694,16 @@ function buildSnapshot(state) {
     response: state.finalResponse ?? state.lastModelStep?.text ?? "",
     usage: state.usage ?? EMPTY_USAGE,
     toolCalls: collectToolCalls(state),
-    eventCount: 0,
+    eventCount: state.turnState?.eventCount ?? state.messages.length,
     activeStep: state.step,
     startedAt: state.startedAt,
     updatedAt: state.updatedAt,
     turnState: state.turnState ?? {}
   };
 }
-function buildCheckpoint(reason, state, payload, trigger) {
+function buildCheckpoint(reason, state, payload, trigger, executionId) {
   return {
-    run: buildRun(state, payload, trigger),
+    run: buildRun(state, payload, trigger, executionId),
     reason,
     snapshot: buildSnapshot(state),
     createdAt: state.updatedAt
@@ -596,6 +721,7 @@ function createWorkflowObserverBridge(options) {
   const { observers } = options;
   let payload = options.payload;
   const trigger = options.trigger ?? "workflow";
+  const executionId = options.executionId;
   if (observers.length === 0) {
     return {
       async notifyTaskStart() {
@@ -618,16 +744,22 @@ function createWorkflowObserverBridge(options) {
     async notifyTaskStart(state) {
       if (taskStarted) return;
       taskStarted = true;
-      const run = buildRun(state, payload, trigger);
+      const run = buildRun(state, payload, trigger, executionId);
       const snapshot = buildSnapshot(state);
       await notifyAll(observers, (o) => o.onTaskStart?.(run, snapshot));
     },
     async notifyCheckpoint(reason, state) {
-      const checkpoint = buildCheckpoint(reason, state, payload, trigger);
+      const checkpoint = buildCheckpoint(
+        reason,
+        state,
+        payload,
+        trigger,
+        executionId
+      );
       await notifyAll(observers, (o) => o.onCheckpoint?.(checkpoint));
     },
     async notifyTaskComplete(state) {
-      const run = buildRun(state, payload, trigger);
+      const run = buildRun(state, payload, trigger, executionId);
       const snapshot = buildSnapshot(state);
       const result = {
         response: state.finalResponse ?? "",
@@ -641,16 +773,19 @@ function createWorkflowObserverBridge(options) {
       );
     },
     async notifyTaskError(state, error) {
-      const run = buildRun(state, payload, trigger);
+      const run = buildRun(state, payload, trigger, executionId);
       const snapshot = buildSnapshot(state);
       await notifyAll(observers, (o) => o.onTaskError?.(run, error, snapshot));
     },
     updatePayload(newPayload) {
       payload = newPayload;
     },
-    getOtelContext(sessionId) {
+    getOtelContext(sessionId, currentExecutionId) {
       for (const observer of observers) {
-        const ctx = observer.getOtelContext?.(sessionId);
+        const ctx = observer.getOtelContext?.(
+          sessionId,
+          currentExecutionId ?? executionId
+        );
         if (ctx !== void 0) return ctx;
       }
       return void 0;
@@ -659,6 +794,9 @@ function createWorkflowObserverBridge(options) {
 }
 // src/execution/telemetry.ts
+import {
+  DEFAULT_AGENT_NAME
+} from "@cuylabs/agent-core";
 var _otel = null;
 function oiMime(v) {
   const t = v.trimStart();
@@ -674,7 +812,7 @@ async function getOtel() {
   }
 }
 function createOtelObserver(config = {}) {
-  const agentName = config.agentName ?? "agent";
+  const agentName = config.agentName ?? DEFAULT_AGENT_NAME;
   const spanTimeoutMs = config.spanTimeoutMs ?? 5 * 60 * 1e3;
   const turnSpans = /* @__PURE__ */ new Map();
   let otel = null;
@@ -693,9 +831,13 @@ function createOtelObserver(config = {}) {
       "gen_ai.usage.output_tokens": usage.outputTokens ?? 0
     };
   }
+  function makeSpanKey(sessionId, executionId) {
+    return executionId ?? sessionId;
+  }
   return {
     async onTaskStart(run, _snapshot) {
-      const existing = turnSpans.get(run.sessionId);
+      const key = makeSpanKey(run.sessionId, run.executionId);
+      const existing = turnSpans.get(key);
       if (existing) {
         const inputVal2 = run.payload.message.slice(0, 4096);
         existing.span.setAttributes({
@@ -724,20 +866,22 @@ function createOtelObserver(config = {}) {
       });
       const ctx = otel.trace.setSpan(otel.context.active(), span);
       const timer = setTimeout(() => {
-        const entry = turnSpans.get(run.sessionId);
+        const entry = turnSpans.get(key);
         if (entry) {
           entry.span.setStatus({
             code: otel?.SpanStatusCode.ERROR ?? 2,
             message: "Span timed out (possible leak \u2014 task never completed)"
           });
           entry.span.end();
-          turnSpans.delete(run.sessionId);
+          turnSpans.delete(key);
         }
       }, spanTimeoutMs);
-      turnSpans.set(run.sessionId, { span, ctx, timer });
+      turnSpans.set(key, { span, ctx, timer });
     },
     onCheckpoint(checkpoint) {
-      const entry = turnSpans.get(checkpoint.run.sessionId);
+      const entry = turnSpans.get(
+        makeSpanKey(checkpoint.run.sessionId, checkpoint.run.executionId)
+      );
       if (!entry) return;
       const reason = checkpoint.reason;
       const attrs = {
@@ -758,7 +902,8 @@ function createOtelObserver(config = {}) {
       entry.span.addEvent(`agent.checkpoint.${reason}`, attrs);
     },
     onTaskComplete(run, result, _snapshot) {
-      const entry = turnSpans.get(run.sessionId);
+      const key = makeSpanKey(run.sessionId, run.executionId);
+      const entry = turnSpans.get(key);
       if (!entry) return;
       if (entry.timer) clearTimeout(entry.timer);
       entry.span.setAttributes({
@@ -773,10 +918,11 @@ function createOtelObserver(config = {}) {
       }
       entry.span.setStatus({ code: otel?.SpanStatusCode.OK ?? 1 });
       entry.span.end();
-      turnSpans.delete(run.sessionId);
+      turnSpans.delete(key);
     },
     onTaskError(run, error, snapshot) {
-      const entry = turnSpans.get(run.sessionId);
+      const key = makeSpanKey(run.sessionId, run.executionId);
+      const entry = turnSpans.get(key);
       if (!entry) return;
       if (entry.timer) clearTimeout(entry.timer);
       entry.span.setAttributes(getUsageAttrs(snapshot.usage));
@@ -786,13 +932,13 @@ function createOtelObserver(config = {}) {
       });
       entry.span.recordException(error);
       entry.span.end();
-      turnSpans.delete(run.sessionId);
+      turnSpans.delete(key);
     },
-    getOtelContext(sessionId) {
-      return turnSpans.get(sessionId)?.ctx;
+    getOtelContext(sessionId, executionId) {
+      return turnSpans.get(makeSpanKey(sessionId, executionId))?.ctx;
     },
-    activateContext(sessionId, fn) {
-      const entry = turnSpans.get(sessionId);
+    activateContext(sessionId, executionId, fn) {
+      const entry = turnSpans.get(makeSpanKey(sessionId, executionId));
       if (!entry?.ctx || !otel) return fn();
       return otel.context.with(entry.ctx, fn);
     }
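
The telemetry hunks above switch the `turnSpans` map from being keyed by `sessionId` to being keyed by `makeSpanKey(sessionId, executionId)`. The sketch below shows why that matters when one session has more than one turn in flight. Strings stand in for spans, and the sample turns are hypothetical; the execution ids follow the `${sessionId}:${startedAt}` fallback format that `buildRun` uses in this diff.

```typescript
// Same fallback rule as the diff: prefer executionId, else sessionId.
function makeSpanKey(sessionId: string, executionId?: string): string {
  return executionId ?? sessionId;
}

// Strings stand in for real OpenTelemetry spans in this sketch.
const bySession = new Map<string, string>();
const byExecution = new Map<string, string>();

const turns = [
  { sessionId: "s1", executionId: "s1:2024-01-01T00:00:00Z", span: "turn-1" },
  { sessionId: "s1", executionId: "s1:2024-01-01T00:05:00Z", span: "turn-2" },
];

for (const turn of turns) {
  // Old keying: the second turn replaces the first turn's entry.
  bySession.set(makeSpanKey(turn.sessionId), turn.span);
  // New keying: each execution keeps its own entry.
  byExecution.set(makeSpanKey(turn.sessionId, turn.executionId), turn.span);
}
```

With session-only keys the first turn's entry is lost as soon as the second turn starts, so completing the first turn would end the wrong span or find none. Callers that do not pass an `executionId` still get the old behaviour through the fallback.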